File sizes required for the execution of the database reorganization utility (pdrorg)

The following table shows the formulas for determining the file sizes required for the execution of the database reorganization utility (pdrorg):

File type	Formula (bytes)
Unload data file^#¹ (no options specified)	n (P_i + M) + L_i^#⁷₊1,200 + A + B + c 96 + D + I + F i=1
Unload data file^#¹ (-w option specified)	DAT or extended DAT format: { c (maximum length of converted character string for column i ^#² + 1) C + M i = 1 } n FIX table in binary format: { c (column data length i^#³) + M i = 1 } n Non-FIX table in binary format: { c (column data length i^#³ + G) + M + 4 (c + 1) i = 1 } n Fixed-length character format: { c (maximum length of converted character string for column i ^#⁴ + crlf) C + M i = 1 } n
Unload data file^#¹ (-j option specified or during reorganization in units of schemas)^#⁵	n (P_i + M) + L_i^#⁷ + i = 1 n { i = 1 m (O_ij + 44) j = 1 } + 1,200 + A + B + c 96 + D + I + F
LOB data unload file^#¹	n { i = 1 m (O_ij + 44) j = 1 } + 1,200 + A + B + c 96 + D + I + F
Index information file	B-tree index: (K + p) n + 512 Plug-in index: (12 + X) n + 1,024 These formulas are for the size of one index. If there are multiple indexes, determine the size of each index.
Process results file	1,700 + number of servers storing table 500 + number of tables in schema 1,000 + total number of storage RDAREAs in schema 100
Work file^#⁶	[8 + 2 S + 2 {n (value of past message output interval specified by the -m option)} + 3 R + 4 J R] 200
Work file for sorting	Condition 1: Size of index information file + 4 n Condition 2: {Size of index information file + 4 n} 2 Condition 1 When 1,024 (KB) E Condition 2 When 1,024 (KB) < E E: Buffer size The buffer size obtained according to 19.1.12 Buffer size used to determine the size of the work file for sorting.

A: For key range partitioning: 48 + number of partitioning conditions 284: For hash partitioning: 40 + a 60; For matrix partitioning (combination of key range partitioning of the boundary value specification and hash partitioning): 48 + (number of partitioning conditions 284) + (40 + a 60)

B: n 36 (for FIX table) or (44 + c 4) n (for non-FIX table)

C: If the output character encoding of the unload data file is not a default HiRDB character encoding: 2: Otherwise: 1

D: 16 + (number of LOB columns a 80): Add the value of D only if there are LOB columns.

F: Use the following value:

{(number of abstract-data-type attributes provided by plug-in in column i [Figure]

84) +
i = 1
(number of abstract-data-type LOB attributes provided by plug-in in column i [Figure]

72)}
+ 64 +
d
[Figure]

(84 + number of reverse generation functions i [Figure]

60)
i = 1

G: Number of attributes for which the return value of the reverse generation function on column I is BLOB 4

I: 136 + number of index partitions 60: Add this value when including the index.

J: Number of indexes

K: Index key length: See Table 16-5 List of index key lengths. For variable-length data, keep in mind when defining the maximum length that single columns are also handled as multicolumns.

L_i: Actual length of row

Obtain the actual row length (approximate or accurate value). If the row is BINARY type and has the compression specification, we recommend that you obtain an approximate value because obtaining an accurate value requires a complicated calculation.

Obtaining an approximate value:

Use the following formula to obtain from the data stored in the database an approximate value (bytes) for the sum of the actual lengths of all rows:

Number of pages used in the table storage RDAREA [Figure]

page length of the table storage RDAREA

You can obtain the number of pages used in the table storage RDAREA and the table storage RDAREA's page length from the results of a condition analysis by RDAREA (logical analysis) or table that are provided by pddbst.

Obtaining an accurate value:

Use the following values to obtain the actual row length for the data stored in each column:

Column's data type	Actual length (bytes)
BLOB	16
Abstract data type provided by plug-ins	2
BINARY^#	If pdrorg -k rorg is specified with the compression specification and no UOC is used Definition length + 8 MAX(number of times a concatenation operation was used in the UPDATE SQL statement, definition length split compression size) Otherwise Definition length of the BINARY type column
Other	Actual length of each column

#: For obtaining the maximum actual row length, this assumes BINARY type data whose length is the definition length and that the compression rate is 0%.

M: Total data length of the character string type column for which the character set is specified

The value is as follows:

k
(

column data length i)
i = 1

O_ij: LOB data length

P_i: Data length of the abstract data type provided by a plug-in

R: Number of partitioned table or index RDAREAs

S: Number of table-storing servers

X: Value is as follows

27 for the abstract data type stored in the LOB RDAREA
Key length + 2 for the abstract data type of a maximum of 255 bytes of definition length
2 for the abstract data type of 256 bytes or more of definition length

Typical abstract data type values are as follows.

27 for the SGMLTEXT type
2 for the FREEWORD, GEOMETRY, and XML types

a: Number of partitioned RDAREAs

c: Number of column definitions

d: Number of columns for which the abstract data type provided by a plug-in is defined

k: Number of character string type columns for which the character set is specified

m: Number of LOB columns

n: Number of rows (for a repetition column, number of rows number of elements)

p: If all key component columns are fixed length, the value is 10; if they include a variable length, the value is 12.

crlf: Length of linefeed characters added when cr or crlf is specified in the -W option

Determine the length of linefeed characters from the following table:

-W option value		Value to be added
-W dat or -W extdat	,cr	1
	,crlf	2
	Not specified	1
-W fixtext	,cr	1
	,crlf	2
	Not specified	0

Note: When calculating the size of index information files and sort work files, if the index configuration columns are repetition columns, the number of rows to reload and n do not refer to the number of rows but to (number of rows number of elements).

#1: If the file is larger than 2 GB, take one of the following actions:

Create multiple files, each of which is no larger than 2 GB.
Use large files. For details about how to create large files, see 4.3 Creating HiRDB file system areas.

#2: The following table lists maximum lengths of converted character strings for columns in DAT format (-W dat) or extended DAT format (-W extdat).

Table 19-1 Maximum lengths of converted character strings for columns (in DAT or extended DAT format)

Data type		Maximum length of converted character string (bytes)
Numeric data	INTEGER	11
	SMALLINT	11
	DECIMAL	40
	FLOAT	23
	SMALLFLT	23
Character string data#¹	CHARACTER	Defined length + 2#²
Character string data#¹	VARCHAR	Actual length + 2#²
Mixed character string data#¹	MCHAR	Defined length + 2#²
Mixed character string data#¹	MVARCHAR	Actual length + 2#²
National character data#¹	NCHAR	Defined length + 2#²
National character data#¹	NVARCHAR	Actual length + 2#²
Date data	DATE	10
Time data	TIME	8
Date interval data	INTERVAL YEAR TO DAY	9
Time interval data	INTERVAL HOUR TO SECOND	7
Time stamp data	TIMESTAMP	19 If the number of digits for fractions of a second is not 0, add the number of digits for fractions of a second + 1.
Binary data#¹	BINARY	Actual length + 2#²

#1: If data in extended DAT format contains a double quotation mark ("), the length of the converted character string becomes longer by the number of double quotation marks.

#2: Two bytes are added for the enclosing brackets.

If -W dat or -W extdat is specified and sup is specified in the operand, the maximum lengths of converted character strings take effect on the columns as shown below. Note that the actual length indicates the length without the trailing consecutive spaces. For details about the space-compressed output format, see the -W option of the database reorganization utility (pdrorg) in the manual HiRDB Version 9 Command Reference.

Data type		Maximum lengths of converted character string (bytes)
Character string data	CHARACTER	Actual length + 2
Mixed character string data	MCHAR	Actual length + 2
National character data	NCHAR	Actual length + 2

#3: For details about the data length, see the following tables:

#4: The following table lists maximum lengths of converted character strings for columns in fixed-length character format (-W fixtext).

Table 19-2 Maximum lengths of converted character strings for columns (fixed-length character format)

Data type		Maximum lengths of converted character string (bytes)
Numeric data	INTEGER	11
	SMALLINT	6
	DECIMAL	Number of digits + 2
	FLOAT	23
	SMALLFLT	23
Character string data	CHARACTER VARCHAR	Defined length	If fixtext_option is specified in the enclose operand, add 2 to the output length.
Mixed character string data	MCHAR MVARCHAR	Defined length
National character data	NCHAR NVARCHAR	Defined length 2
Date data	DATE	10
Time data	TIME	8
Date interval data	INTERVAL YEAR TO DAY	10
Time interval data	INTERVAL HOUR TO SECOUND	8
Time stamp data	TIMESTAMP	Decimal part 0:19 2:22 4:24 6:26
Large object data	BLOB	0
Binary data	BINARY	0
Abstract data type	ADT	0

#5: If you are reorganizing files in units of schemas (including unload files), use the sum of the values obtained for individual tables.

#6: Output if lvl2 is specified as the information message output suppression level in the -m option.

#7: To obtain an accurate value of Li, replace (Pi + M) + Li with (Li + Pi + M).

19.1.2 File sizes required for the execution of the database reorganization utility (pdrorg)