Nonstop Database, HiRDB Version 9 Installation and Design Guide

[Contents][Index][Back][Next]

19.1.2 File sizes required for the execution of the database reorganization utility (pdrorg)

The following table shows the formulas for determining the file sizes required for the execution of the database reorganization utility (pdrorg):

File type Formula (bytes)
Unload data file#1 (no options specified) n
[Figure] (Pi + M) + Li#7 + 1,200 + A + B + c [Figure] 96 + D + I + F
i=1
Unload data file#1 (-w option specified) DAT or extended DAT format:
{
c
[Figure](maximum length of converted character string for column i #2 + 1) [Figure] C + M
i = 1
} [Figure] n
FIX table in binary format:
{
c
[Figure](column data length i#3) + M
i = 1
} [Figure] n
Non-FIX table in binary format:
{
c
[Figure](column data length i#3 + G) + M + 4 [Figure] (c + 1)
i = 1
} [Figure] n
Fixed-length character format:
{
c
[Figure](maximum length of converted character string for column i #4 + crlf) [Figure] C + M
i = 1
} [Figure] n
Unload data file#1 (-j option specified or during reorganization in units of schemas)#5 n
[Figure] (Pi + M) + Li#7 +
i = 1
n
[Figure]{
i = 1
m
[Figure] (Oij + 44)
j = 1
} + 1,200 + A + B + c [Figure] 96 + D + I + F
LOB data unload file#1 n
[Figure]{
i = 1
m
[Figure] (Oij + 44)
j = 1
} + 1,200 + A + B + c [Figure] 96 + D + I + F
Index information file

B-tree index:
(K + p) [Figure] n + 512

Plug-in index:
(12 + X) [Figure] n + 1,024
These formulas are for the size of one index. If there are multiple indexes, determine the size of each index.
Process results file 1,700 + number of servers storing table [Figure] 500 + number of tables in schema [Figure] 1,000 + total number of storage RDAREAs in schema [Figure] 100
Work file#6 [8 + 2 [Figure] S + 2 [Figure] {n [Figure] (value of past message output interval specified by the -m option)} + 3 [Figure] R + 4 [Figure] J [Figure] R] [Figure] 200
Work file for sorting

Condition 1:
Size of index information file + 4 [Figure] n

Condition 2:
{Size of index information file + 4 [Figure] n} [Figure] 2
  • Condition 1
    When 1,024 (KB) [Figure] E
  • Condition 2
    When 1,024 (KB) < E

E: Buffer size
The buffer size obtained according to 19.1.12 Buffer size used to determine the size of the work file for sorting.

A: For key range partitioning: 48 + number of partitioning conditions [Figure] 284
For hash partitioning: 40 + a [Figure] 60
For matrix partitioning (combination of key range partitioning of the boundary value specification and hash partitioning): 48 + (number of partitioning conditions [Figure] 284) + (40 + a [Figure] 60)

B: n [Figure] 36 (for FIX table) or (44 + c [Figure] 4) [Figure] n (for non-FIX table)

C: If the output character encoding of the unload data file is not a default HiRDB character encoding: 2
Otherwise: 1

D: 16 + (number of LOB columns [Figure] a [Figure] 80)
Add the value of D only if there are LOB columns.

F: Use the following value:
D
[Figure]{(number of abstract-data-type attributes provided by plug-in in column i [Figure] 84) +
i = 1
(number of abstract-data-type LOB attributes provided by plug-in in column i [Figure] a [Figure] 72)}
+ 64 +
d
[Figure] (84 + number of reverse generation functions i [Figure] 60)
i = 1

G: Number of attributes for which the return value of the reverse generation function on column I is BLOB [Figure] 4

I: 136 + number of index partitions [Figure] 60
Add this value when including the index.

J: Number of indexes

K: Index key length
See Table 16-5 List of index key lengths. For variable-length data, keep in mind when defining the maximum length that single columns are also handled as multicolumns.

Li: Actual length of row
Obtain the actual row length (approximate or accurate value). If the row is BINARY type and has the compression specification, we recommend that you obtain an approximate value because obtaining an accurate value requires a complicated calculation.
Obtaining an approximate value:
Use the following formula to obtain from the data stored in the database an approximate value (bytes) for the sum of the actual lengths of all rows:

Number of pages used in the table storage RDAREA [Figure] page length of the table storage RDAREA
You can obtain the number of pages used in the table storage RDAREA and the table storage RDAREA's page length from the results of a condition analysis by RDAREA (logical analysis) or table that are provided by pddbst.
Obtaining an accurate value:
Use the following values to obtain the actual row length for the data stored in each column:
Column's data type Actual length (bytes)
BLOB 16
Abstract data type provided by plug-ins 2
BINARY#
  • If pdrorg -k rorg is specified with the compression specification and no UOC is used
    Definition length + 8 [Figure] MAX(number of times a concatenation operation was used in the UPDATE SQL statement,
    [Figure]definition length [Figure] split compression size[Figure])
  • Otherwise
    Definition length of the BINARY type column
Other Actual length of each column
#
For obtaining the maximum actual row length, this assumes BINARY type data whose length is the definition length and that the compression rate is 0%.

M: Total data length of the character string type column for which the character set is specified
The value is as follows:
k
([Figure]column data length i)
i = 1

Oij: LOB data length

Pi: Data length of the abstract data type provided by a plug-in

R: Number of partitioned table or index RDAREAs

S: Number of table-storing servers

X: Value is as follows
  • 27 for the abstract data type stored in the LOB RDAREA
  • Key length + 2 for the abstract data type of a maximum of 255 bytes of definition length
  • 2 for the abstract data type of 256 bytes or more of definition length
Typical abstract data type values are as follows.
  • 27 for the SGMLTEXT type
  • 2 for the FREEWORD, GEOMETRY, and XML types

a: Number of partitioned RDAREAs

c: Number of column definitions

d: Number of columns for which the abstract data type provided by a plug-in is defined

k: Number of character string type columns for which the character set is specified

m: Number of LOB columns

n: Number of rows (for a repetition column, number of rows [Figure] number of elements)

p: If all key component columns are fixed length, the value is 10; if they include a variable length, the value is 12.

crlf: Length of linefeed characters added when cr or crlf is specified in the -W option
Determine the length of linefeed characters from the following table:
-W option value Value to be added
-W dat or -W extdat ,cr 1
,crlf 2
Not specified 1
-W fixtext ,cr 1
,crlf 2
Not specified 0

Note
When calculating the size of index information files and sort work files, if the index configuration columns are repetition columns, the number of rows to reload and n do not refer to the number of rows but to (number of rows [Figure] number of elements).

#1: If the file is larger than 2 GB, take one of the following actions:

#2: The following table lists maximum lengths of converted character strings for columns in DAT format (-W dat) or extended DAT format (-W extdat).

Table 19-1 Maximum lengths of converted character strings for columns (in DAT or extended DAT format)

Data type Maximum length of converted character string (bytes)
Numeric data INTEGER 11
SMALLINT 11
DECIMAL 40
FLOAT 23
SMALLFLT 23
Character string data#1 CHARACTER Defined length + 2#2
VARCHAR Actual length + 2#2
Mixed character string data#1 MCHAR Defined length + 2#2
MVARCHAR Actual length + 2#2
National character data#1 NCHAR Defined length + 2#2
NVARCHAR Actual length + 2#2
Date data DATE 10
Time data TIME 8
Date interval data INTERVAL YEAR TO DAY 9
Time interval data INTERVAL HOUR TO SECOND 7
Time stamp data TIMESTAMP 19
If the number of digits for fractions of a second is not 0, add the number of digits for fractions of a second + 1.
Binary data#1 BINARY Actual length + 2#2
#1: If data in extended DAT format contains a double quotation mark ("), the length of the converted character string becomes longer by the number of double quotation marks.
#2: Two bytes are added for the enclosing brackets.
If -W dat or -W extdat is specified and sup is specified in the operand, the maximum lengths of converted character strings take effect on the columns as shown below. Note that the actual length indicates the length without the trailing consecutive spaces. For details about the space-compressed output format, see the -W option of the database reorganization utility (pdrorg) in the manual HiRDB Version 9 Command Reference.
Data type Maximum lengths of converted character string (bytes)
Character string data CHARACTER Actual length + 2
Mixed character string data MCHAR Actual length + 2
National character data NCHAR Actual length + 2

#3: For details about the data length, see the following tables:

#4: The following table lists maximum lengths of converted character strings for columns in fixed-length character format (-W fixtext).

Table 19-2 Maximum lengths of converted character strings for columns (fixed-length character format)

Data type Maximum lengths of converted character string (bytes)
Numeric data INTEGER 11
SMALLINT 6
DECIMAL Number of digits + 2
FLOAT 23
SMALLFLT 23
Character string data CHARACTER
VARCHAR
Defined length If fixtext_option is specified in the enclose operand, add 2 to the output length.
Mixed character string data MCHAR
MVARCHAR
Defined length
National character data NCHAR
NVARCHAR
Defined length [Figure] 2
Date data DATE 10
Time data TIME 8
Date interval data INTERVAL YEAR TO DAY 10
Time interval data INTERVAL HOUR TO SECOUND 8
Time stamp data TIMESTAMP Decimal part
0:19 2:22 4:24 6:26
Large object data BLOB 0
Binary data BINARY 0
Abstract data type ADT 0

#5: If you are reorganizing files in units of schemas (including unload files), use the sum of the values obtained for individual tables.

#6: Output if lvl2 is specified as the information message output suppression level in the -m option.

#7: To obtain an accurate value of Li, replace (Pi + M) + Li with (Li + Pi + M).