16.1.2 Calculating the number of table storage pages

The procedure used to calculate the number of pages required to store a table depends on whether FIX is specified for the table in CREATE TABLE. The procedures are explained in (1) and (2) as follows. The variables used in the formulas are explained in (3), and examples of calculating the number of pages needed to store a table are presented in (7). Estimating the RDAREA size when the rebalancing facility is used is explained in (6).

If a table is row-partitioned, the number of pages for the table in each storage RDAREA must be obtained.

Organization of this subsection
(1) FIX not specified
(2) FIX specified
(3) Variables used in formulas
(4) How to obtain the data lengths of abstract data type columns
(5) How to obtain the data lengths of repetition columns
(6) How to estimate the area when the rebalancing facility is used
(7) Examples of calculating the number of table storage pages

(1) FIX not specified

When FIX is not specified, the following formula is used to calculate the number of pages needed to store a table.

Formula
Number of table storage pages =
number of pages that store columns other than BINARY columns defined as branching
+ number of pages that store BINARY columns defined as branching (pages)
[Figure]Number of pages that store columns other than BINARY columns defined as branching
[Figure]
[Figure]Number of pages that store BINARY columns defined as branching
SPN2
(a) How to obtain the value of P

Use the following formula to obtain the value of P. The denominator enclosed in parentheses indicates the number of rows stored per page; its minimum and maximum values are 1 and 255, respectively.

Formula
[Figure]
(b) How to obtain the value of PSi

Use the following formula to obtain the value of each PSi and then obtain their sum, where n indicates the number of columns to which Table 16-2 Data lengths for the variable-length character string type (except abstract data type and repetition columns) is applicable.

PSi = a x[Figure] ei/(b - 62) [Figure]

(2) FIX specified

When FIX is specified, the following formula is used to calculate the number of pages needed to store a table.

Formula
[Figure]
(a) How to obtain the value of Q

Use the following formula to obtain the value of Q, in which the denominator enclosed in parentheses indicates the number of rows stored per page; its minimum and maximum values are 1 and 255, respectively.

[Figure]

(3) Variables used in formulas

a: Total number of rows stored in table
b: Page length of user RDAREA (bytes)
c: Percentage of unused area specified with CREATE TABLE (%)
If you omit the percentage of unused area, the system assumes 30%.
di: Data length of a column (bytes)
Obtain the values for all columns from Table 16-1 List of data lengths.
For a column with an abstract data type, see (4) How to obtain the data lengths of abstract data type columns.
For a repetition column, see (5) How to obtain the data lengths of repetition columns.
ei: Average column data length (bytes)
f: Total number of columns defined for table
g: Segment size of RDAREA for storing table (pages)
h: Percentage of free pages in segment specified with CREATE TABLE (%)
If you omit the percentage of free pages in segment, the system assumes 10%. Here, free pages refers to unused pages.
SPN1: Number of pages that store columns defined as branching (non-BINARY)
For details about branching conditions, see footnote #5 following Table 16-1 List of data lengths.
SPN1 =
f
[Figure]Value of branch di[Figure] (b - 61)[Figure] x a x SF
i=1
SPN2: Number of pages that store BINARY columns defined as branching
For details about branching conditions, see footnote #5 following Table 16-1 List of data lengths.
SPN2 = SPN2A + SPN2B + SPN2C
[Figure] Number of branch pages that use INSERT SQL
Calculate for the BINARY columns defined as branching.
SPN2A =
k
{[Figure] Li[Figure] (b - 59)[Figure] x a + A} x SF
i=1
[Figure]Number of branch pages that use pdload or pdorg
Li > (b - 2,853) [Figure] 255
SPN2B =
k
[Figure]{[Figure](Li + 11) x a} [Figure] (b - 48)[Figure] x SF
i=1
Li[Figure] (b - 2,853) [Figure] 255
SPN2C =
[Figure]a[Figure] 255[Figure] x SF
The formula for A is shown below.
[Figure]
k: Number of columns defined as branching
Li: Actual data length of each column (bytes)
For a compressed column, use the following formula:
Data length after compression + ([Figure]data length before compression [Figure]split compression size[Figure]) x 8
SF: 1.3
However, make this value larger than 1.3 when:
  • Large numbers of abstract data type columns will be updated
  • Executing, on repetition columns, large quantities of updates that increase element data length or updates that increase the number of elements
  • Executing large quantities of updates that significantly increase data length on VARCHAR, NVARCHAR, MVARCHAR, or BINARY type columns
  • Executing large quantities of updates that significantly increase data length on BINARY type columns
  • Executing, on columns on which data suppression has been executed on individual columns, large quantities of updates that significantly increase data length
  • Executing large quantities of updates from NULL value to non-NULL values with data type other than the above

    Table 16-1 List of data lengths

    ClassificationData typeData length (bytes)
    Numeric dataINTEGER4
    SMALLINT2
    LARGE DECIMAL(m,n)#1[Figure]m/2 [Figure] + 1#2
    FLOAT or DOUBLE PRECISION8
    SMALLFLT or REAL4
    Character dataCHARACTER(n)n#3
    VARCHAR(n) (variable-length character string)d[Figure] 255Element of repetition columnd + 2
    Otherd + 1
    d[Figure] 2566
    VARCHAR (n) with no-split option specifiedn[Figure] 255Attribute of abstract data typed + 3
    Element of repetition columnd + 2
    Otherd + 1
    n[Figure] 256Branching#56
    Not branching#5Attribute of abstract data typed + 3
    Element of repetition columnd + 2
    Otherd + 3
    National character dataNCHAR(n) or NATIONAL CHARACTER(n)2 x n#4
    NVARCHAR(n)d[Figure] 127Element of repetition column2 x d + 2
    Other2 x d + 1
    d[Figure] 1286
    NVARCHAR(n) with no-split option specifiedn[Figure] 127Attribute of abstract data type2 x d + 3
    Element of repetition column2 x d + 2
    Other2 x d + 1
    n[Figure] 128Branching#56
    Not branching#5Attribute of abstract data type2 x d + 3
    Element of repetition column2 x d + 2
    Other2 x d + 3
    Mixed character string dataMCHAR(n)n#3
    MVARCHAR(n)d[Figure] 255Element of repetition columnd + 2
    Otherd + 1
    d[Figure] 2566
    MVARCHAR(n) with no-split option specifiedn[Figure] 255Attribute of abstract data typed + 3
    Element of repetition columnd + 2
    Otherd + 1
    n[Figure] 256Branching#56
    Not branching#5Attribute of abstract data typed + 3
    Element of repetition columnd + 2
    Otherd + 3
    Date dataDATE4
    Time dataTIME3
    Date interval dataINTERVAL YEAR TO DAY5
    Time interval dataINTERVAL HOUR TO SECOND4
    Timestamp dataTIMESTAMP(n)7 + (n[Figure] 2)
    Large object dataBLOB9
    Binary dataBINARY(n)n[Figure] 255d + 3
    n[Figure] 256Branching#515
    Not branching#5d + 3
    Binary dataBINARY(n)n[Figure] 255d + 3
    n[Figure] 256Branching#515
    Not branching#5d + 3
    BINARY(n)
    with compression specified
    Branching#515
    Not branching#5[Figure]+ 9
d: Actual data length (in characters)
m, n: Positive integer
[Figure]: Data length after compression (number of characters)
#1: This is a fixed decimal number consisting of a total of m digits and n decimal places. If m is omitted, 15 is assumed.
#2: If the SUPPRESS DECIMAL table option is specified in the table definition, the data length will be " [Figure] k / 2 [Figure] + 2", where k is the number of significant digits during storage (excluding leading zeros). If the condition shown as follows is satisfied, SUPPRESS DECIMAL should not be used (a in the condition is the total data lengths of the columns in the table when SUPPRESS DECIMAL or column data suppression is not used):
32,717 < (a + number of columns in table x 2 + 8)
#3: If column data suppression is specified and data suppression actually occurs, the value of n is n - b + 4. Data suppression occurs only when column data suppression is specified, the column data ends with the blank character, and this blank character is immediately followed by at least four single-byte blank characters. b is the number of blank characters following the last character of the column data.
If column data suppression is specified but data suppression does not actually occur, one byte of information is added to each column. However, if the condition shown below is satisfied, column data suppression should not be specified (a in the condition is the total data lengths of the columns in the table when SUPPRESS DECIMAL or column data suppression is not used):
32,717 < (a + number of columns in table x 2 + 8)
#4: If column data suppression is specified and data suppression actually occurs, 2 x n becomes 2 x n - 2 x b + 5. Data suppression occurs only when column data suppression is specified, the column data ends with the blank character, and this blank character is immediately followed by at least three double-byte blank characters. b is the number of blank characters following the last character of the column data.
However, if the condition shown below is satisfied, column data suppression should not be specified (a in the condition is total data lengths of the columns in the table when SUPPRESS DECIMAL or column data suppression is not used):
32,717 < (a + number of columns in table x 2 + 8)
#5: In general, the calculation assumes that there is no branching. Branch only when the condition shown below is satisfied. For a compressed column, use the data length before compression in the calculation.
BL > page length - 50
f
BL (bytes) =[Figure]di + 2 x f + 6
i=1
If this branch condition is satisfied, recalculate the value of BL assuming that each column branches, starting with the column having the smallest column number until the column no longer satisfies the branch condition.

Table 16-2 Data lengths for the variable-length character string type (except abstract data type and repetition columns)

Data typeData length (bytes)
VARCHAR (n)d[Figure] 256d + 2
No-split option specified0
NVARCHAR (n)d[Figure] 1282 x d + 2
No-split option specified0
MVARCHAR (n)d[Figure] 256d + 2
No-split option specified0
d: Actual data length (in characters)

(4) How to obtain the data lengths of abstract data type columns

Use the following formula to obtain data length di of an abstract data type column.

Formula
h
di = [Figure] ADTk + 5
k=1
h: Inheritance count for the abstract data type
If there is no inheritance, this value is 1.
If you have specified the UNDER operand in the CREATE TYPE statement to inherit another abstract data type, the highest abstract data type is h and the lowest abstract data type is 1.
ADTk: Data length of the abstract data type (bytes)
Use the following formula to obtain this value:
m
ADTk= [Figure]attj + 10 + 2 x m
i=1
m: Total number of attributes of the abstract data type
attj: Data length for each attribute of the abstract data type (bytes)
If there is no inheritance, m = 1; therefore, calculate the value of ADT1.

For the data lengths for each attribute, see Table 16-1 List of data lengths. If the data type satisfies the condition shown in Table 16-3 Data lengths for the variable-length character string type (abstract data type), calculate the data according to Table 16-3.

Assign the value of the corresponding attej to the following formula and add branch row storage pages ADTLS to P:

h
ADTLS= [Figure]attej[Figure] (b - 62) [Figure] x a
I=1

When attributes are defined as an abstract data type, use the following formula to obtain their data length:

h
attj(bytes)= [Figure] ADTk + 5
k=1

Table 16-3 Data lengths for the variable-length character string type (abstract data type)

Data typeConditionData length attj (bytes)Data length of branch section attej (bytes)
VARCHAR (n)d[Figure] 2568d + 2
No-split option specifiedd + 30
NVARCHAR (n)d[Figure] 12882 x d + 2
No-split option specified2 x d + 30
MVARCHAR (n)d[Figure] 2568d + 2
No-split option specifiedd + 30
d: Actual data length (in characters)

(5) How to obtain the data lengths of repetition columns

Use the following formula to obtain the data length of a repetition column:

Formula
di = 4 + (eli + 1) x eni
eli: Data length of a repetition column
Obtain the data length from Table 16-1 List of data lengths.
For the variable-length character string type, obtain the data length from Table 16-4 Data lengths for the variable-length character string type (repetition columns).
eni: Average number of elements for a repetition column

Table 16-4 Data lengths for the variable-length character string type (repetition columns)

Data typeConditionData length eli (bytes)Data length of branch section esj (bytes)
VARCHAR (n)d[Figure] 2565d + 2
No-split option specifiedd + 20
NVARCHAR (n)d[Figure] 12852 x d + 2
No-split option specified2 x d + 30
MVARCHAR (n)d[Figure] 2565d + 2
No-split option specifiedd + 20
d: Actual data length (in characters)

If a repetition column with the variable-length character string type satisfies the value of eli shown in Table 16-4, add the value obtained from the following formula to P:

m
[Figure][Figure]{esi x eni + 14 x (eni - 1)} [Figure] (b - 62)[Figure] x a
i=1
m: Number of repetition columns with the variable-length character string type that satisfy the condition shown in Table 16-4.
esi: Average value of the actual data length per element
Apply the data length shown in Table 16-2 Data lengths for the variable-length character string type (except abstract data type and repetition columns).

(6) How to estimate the area when the rebalancing facility is used

If there are partitioning tables that use any one of the HASHA through HASHF hash functions, the data is divided into 1,024 hash element values, each of which is stored in a separate segment.

An average of (1,024 [Figure] number of partitions) hash elements of data is stored in each partitioned RDAREA. Therefore, at a minimum, enough segments must be allocated to each RDAREA to store the number of elements.

When the rebalancing facility is used, the RDAREA size can be estimated as follows:

  1. The total number of segments Sn that will be required is estimated from the number of items of data N, row length L, and page length P.
  2. Estimate the number of segments Ssn required per RDAREA.
    Ssn =[Figure]Sn[Figure] Srn[Figure] x Srn
    Srn: [Figure]1,024 [Figure] Dvn[Figure]
    Dvn: Number of RDAREA partitions
  3. Estimate the number of segments S used per RDAREA, making provision for a surplus.
    S =[Figure](Ssn x K) [Figure] Srn[Figure] x Srn
    K: Coefficient (Example: 20% surplus, 1.2)

(7) Examples of calculating the number of table storage pages

(a) Example

Obtain the number of table storage pages for the following STOCK table:

PCODEPNAMESTANDARDPRICEQUANTITYCOST
20180CLEANERC20200002615000
20190CLEANERC772800010523000
20130REFRIGERATORP10300007025000
20220TVK18350001230000
20200CLEANERC89350003030000
20140REFRIGERATORP23350006030000
20280AMPLIFIERL103800020033000
20150REFRIGERATORP32480005043000
20290AMPLIFIERL504980026045000
20230TVK20500001545000
20160REFRIGERATORP355580012050000
Conditions:
  1. Total number of rows stored in the table: 10,000
  2. Page length of user RDAREA: 8,192 bytes
  3. Percentage of unused space specified with CREATE TABLE: 30%
  4. Number of columns: 6
  5. Segment size for storing table: 100 pages
  6. Percentage of free pages in a segment specified in CREATE TABLE: 40%
  7. Columns' data types:
    ColumnData type
    PCODECHARACTER(5)
    PNAMECHARACTER(4)
    STANDARDCHARACTER(3)
    PRICEINTEGER
    QUANTITYINTEGER
    COSTINTEGER
FIX not specified
  1. Calculation of row length
    5(PCODE) + (2 x 4)(PNAME) + 3(STANDARD) + 4(PRICE) + 4(QUANTITY) + 4(COST) = 28 bytes
  2. Calculation of P

    [Figure]

  3. Calculation of the number of table storage pages

    [Figure]

FIX specified
  1. Calculation of row length
    5(PCODE) + (2 x 4)(PNAME) + 3(STANDARD) + 4(PRICE) + 4(QUANTITY) + 4(COST) = 28 bytes
  2. Calculation of Q

    [Figure]

  3. Calculation of the number of table storage pages

    [Figure]