19.1.2 File sizes required for the execution of the database reorganization utility (pdrorg)

The following table shows the formulas for determining the file sizes required for the execution of the database reorganization utility (pdrorg):

File typeFormula (bytes)
Unload data file# (no options specified)n
[Figure] (Pi) + Li#6 + 1,200 + A + B + c x 96 + D + I + F
i=1
Unload data file# (-w option specified)
DAT or extended DAT format:
c
[Figure] (maximum length of converted character string in column I#1 + 1) x n
i = 1
FIX table in binary format:
c
[Figure] (column data length i#2) x n
i = 1
Non-FIX table in binary format:
{
c
[Figure] (column data length i#2 + G) + 4 x (c + 1)
i = 1
} x n
Fixed-length character format:
c
[Figure] (maximum length of converted character string for column i#3 + crlf) x n
i = 1
Unload data file# (-j option specified or during reorganization in units of schemas)#4n
[Figure] (Pi) + Li#6 +
i = 1
n
[Figure]{
i = 1
m
[Figure] (Oij + 44)
j = 1
} + 1,200 + A + B + c x 96 + D + I + F
LOB data unload file#n
[Figure]{
i = 1
m
[Figure] (Oij + 44)
j = 1
} + 1,200 + A + B + c x 96 + D + I + F
Index information file
B-tree index:
(K + p) x n + 512
Plug-in index:
(12 + X) x n + 1,024
These formulas are for the size of one index. If there are multiple indexes, determine the size of each index.
Process results file1,700 + number of servers storing table x 500 + number of tables in schema x 1,000 + total number of storage RDAREAs in schema x 100
Work file#5[8 + 2 x S + 2 x {n[Figure] (value of past message output interval specified by the -m option)} + 3 x R + 4 x J x R] x 200
Work file for sorting
Condition 1:
Size of index information file + 4 x n
Condition 2:
{Size of index information file + 4 x n} x 2
  • Condition 1
    When the work buffer size specified in the sort statement [Figure] E
  • Condition 2
    When the work buffer size specified in the sort statement < E
E: Buffer size
The buffer size obtained according to buffer-size-for-sorting for Database Reorganization Utility (pdrorg) in the manual HiRDB Version 9 Command Reference.
A: For key range partitioning: 48 + number of partitioning conditions x 284
For hash partitioning: 40 + a x 60
For matrix partitioning (combination of key range partitioning of the boundary value specification and hash partitioning): 48 + (number of partitioning conditions x 284) + (40 + a x 60)
B: n x 36 (for FIX table) or (44 + c x 4) x n (for non-FIX table)
D: 16 + (number of LOB columns x a x 80)
Add the value of D only if there are LOB columns.
F: Use the following value:
D
[Figure]{(number of abstract-data-type attributes provided by plug-in in column i x 84) +
i = 1
(number of abstract-data-type LOB attributes provided by plug-in in column i x a x 72)}
+ 64 +
d
[Figure] (84 + number of reverse generation functions i x 60)
i = 1
G: Number of attributes for which the return value of the reverse generation function on column I is BLOB x 4
I: 136 + number of index partitions x 60
Add this value when including the index.
J: Number of indexes
K: Index key length
See Table 16-5 List of index key lengths. For variable-length data, keep in mind when defining the maximum length that single columns are also handled as multicolumns.
Li: Actual length of row
Obtain the actual row length (approximate or accurate value). If the row is BINARY type and has the compression specification, we recommend that you obtain an approximate value because obtaining an accurate value requires a complicated calculation.
Obtaining an approximate value:
Use the following formula to obtain from the data stored in the database an approximate value (bytes) for the sum of the actual lengths of all rows:
Number of pages used in the table storage RDAREA x page length of the table storage RDAREA
You can obtain the number of pages used in the table storage RDAREA and the table storage RDAREA's page length from the results of a condition analysis by RDAREA (logical analysis) or table that are provided by pddbst.
Obtaining an accurate value:
Use the following values to obtain the actual row length for the data stored in each column:
Column's data typeActual length (bytes)
BLOB16
Abstract data type provided by plug-ins2
BINARY#
  • If pdrorg -k rorg is specified with the compression specification and no UOC is used
    Definition length + 8 x MAX(number of times a concatenation operation was used in the UPDATE SQL statement,
    [Figure]definition length[Figure] split compression size[Figure])
  • Otherwise
    Definition length of the BINARY type column
OtherActual length of each column
#
For obtaining the maximum actual row length, this assumes BINARY type data whose length is the definition length and that the compression rate is 0%.
Oij: LOB data length
Pi: Data length of the abstract data type provided by a plug-in
R: Number of partitioned table or index RDAREAs
S: Number of table-storing servers
X: Value is as follows
  • 27 for the abstract data type stored in the LOB RDAREA
  • Key length + 2 for the abstract data type of a maximum of 255 bytes of definition length
  • 2 for the abstract data type of 256 bytes or more of definition length
Typical abstract data type values are as follows.
  • 27 for the SGMLTEXT type
  • 2 for the FREEWORD, GEOMETRY, and XML types
a: Number of partitioned RDAREAs
c: Number of column definitions
d: Number of columns for which the abstract data type provided by a plug-in is defined
m: Number of LOB columns
n: Number of rows (for a repetition column, number of rows x number of elements)
p: If all key component columns are fixed length, the value is 10; if they include a variable length, the value is 12.
crlf: Length of linefeed characters added when cr or crlf is specified in the -W option
Determine the length of linefeed characters from the following table:
-W option valueValue to be added
-W dat or -W extdat,cr1
,crlf2
Not specified1
-W fixtext,cr1
,crlf2
Not specified0
Note
When calculating the size of index information files and sort work files, if the index configuration columns are repetition columns, the number of rows to reload and n do not refer to the number of rows but to (number of rows x number of elements).
#1: The following table lists maximum lengths of converted character strings for columns in DAT format (-W dat) or extended DAT format (-W extdat).

Table 19-1 Maximum lengths of converted character strings for columns (in DAT or extended DAT format)

Data typeMaximum length of converted character string (bytes)
Numeric dataINTEGER11
SMALLINT11
DECIMAL40
FLOAT23
SMALLFLT23
Character string data#1CHARACTERDefined length + 2#2
VARCHARActual length + 2#2
Mixed character string data#1MCHARDefined length + 2#2
MVARCHARActual length + 2#2
National character data#1NCHARDefined length + 2#2
NVARCHARActual length + 2#2
Date dataDATE10
Time dataTIME8
Date interval dataINTERVAL YEAR TO DAY9
Time interval dataINTERVAL HOUR TO SECOND7
Time stamp dataTIMESTAMP19
If the number of digits for fractions of a second is not 0, add the number of digits for fractions of a second + 1.
Binary data#1BINARYActual length + 2#2
#1: If data in extended DAT format contains a double quotation mark ("), the length of the converted character string becomes longer by the number of double quotation marks.
#2: Two bytes are added for the enclosing brackets.
If -W dat or -W extdat is specified and sup is specified in the operand, the maximum lengths of converted character strings take effect on the columns as shown below. Note that the actual length indicates the length without the trailing consecutive spaces. For details about the space-compressed output format, see the -W option of the database reorganization utility (pdrorg) in the manual HiRDB Version 9 Command Reference.
Data typeMaximum lengths of converted character string (bytes)
Character string dataCHARACTERActual length + 2
Mixed character string dataMCHARActual length + 2
National character dataNCHARActual length + 2
#2: For details about the data length, see the following tables:
#3: The following table lists maximum lengths of converted character strings for columns in fixed-length character format (-W fixtext).

Table 19-2 Maximum lengths of converted character strings for columns (fixed-length character format)

Data typeMaximum lengths of converted character string (bytes)
Numeric dataINTEGER11
SMALLINT6
DECIMALNumber of digits + 2
FLOAT23
SMALLFLT23
Character string dataCHARACTER
VARCHAR
Defined lengthIf fixtext_option is specified in the enclose operand, add 2 to the output length.
Mixed character string dataMCHAR
MVARCHAR
Defined length
National character dataNCHAR
NVARCHAR
Defined length x 2
Date dataDATE10
Time dataTIME8
Date interval dataINTERVAL YEAR TO DAY10
Time interval dataINTERVAL HOUR TO SECOUND8
Time stamp dataTIMESTAMPDecimal part
0:19 2:22 4:24 6:26
Large object dataBLOB0
Binary dataBINARY0
Abstract data typeADT0
#4: If you are reorganizing files in units of schemas (including unload files), use the sum of the values obtained for individual tables.
#5: Output if lvl2 is specified as the information message output suppression level in the -m option.
#6: To obtain an accurate value of Li, replace (Pi) + Li with (Li + Pi).