17.1.3 Calculating the number of index storage pages

The procedure used to calculate the number of pages required to store an index is explained in (1) as follows. The variables used in the formulas are explained in (2), and examples of calculating the number of pages needed to store an index are presented in (3).

If cluster key is specified with CREATE TABLE, the number of cluster key storage pages should be obtained in the same manner as for the number of index storage pages.

If an index is row-partitioned, the number of pages for the index in each storage RDAREA must be obtained.

Note
When an index page split occurs, the storage ratio of the keys in the index is 50:50, and the index is divided into two indexes. Therefore, if there are many additions or updates to an index, make the estimate for the maximum number of index storage pages double the calculated value. Also, even if there are inserts from the UAP, the index page split of the leaf page that stores the largest key considers the value of the PCTFREE operand.
One method to reduce the frequency of index page splits is unbalanced index splits. For details about index page splits and unbalanced index splits, see the manual HiRDB Version 8 System Operation Guide.
Organization of this subsection
(1) Calculation procedure
(2) Variables used in formulas
(3) Examples of calculating the number of index storage pages

(1) Calculation procedure

The following formula is used to calculate the number of pages needed to store an index.

Formula

                       n
Number-of-index-storage-pages = [Figure] Pi + Pd
                       i=1

The recursive formula shown in Formula 1 as follows is used to obtain Pi. Pi + 1 must be calculated until Pn = 1, then the sum of the results must be obtained.

The value of Pd must be obtained if the number of duplicated key values exceeds 200. The formula for obtaining the value of Pd is shown in Formula 2 as follows:

Formula 1

[Figure]
Formula 2

[Figure]
Number of duplicated elements per row when the index contains repetition columns
If the index contains repetition columns, the number of duplicated elements per row must not exceed the following value:
Number of duplicated elements = [Figure] ([Figure] a [Figure] 0.95 [Figure] - 82) [Figure] 4 [Figure] - 1

(2) Variables used in formulas

a: Page length of user RDAREA (bytes)
b: Percentage of unused space specified with CREATE TABLE1(%)
c: Number of keys with up to 200 duplicated key values#2, #3, #4
d: Average number of duplicates for keys with up to 200 duplicated key values#3, #5
e: Number of keys with more than 200 duplicated key values,#3, #4
f: Average number of duplicates for keys with more than 200 duplicated key values#3, #5
g: DB storage key length#6 (bytes)
h: One of the following:
  • For a unique index: Number of keys not containing a null value
    For a multicolumn index, the total number of keys not containing a null value in its component columns.
  • For other than unique index: 0
#1
If no percentage of unused space is specified, 30% should be used in the calculation. If a cluster key is specified, the percentage of unused space specified with CREATE TABLE should be used.
#2
Non-duplicated keys in unique indexes must be included.
#3
Calculate so that the value of c[Figure] d + e[Figure] f is larger than the total number of index keys.
#4
Duplicate keys in unique indexes must be included (keys that are duplicate due to the fact that the key contains a null value).
#5
Round up fractions to an integer value.
#6
See Table 17-5 List of index key lengths. The length of the DB storage key can be obtained from the following formula:
  • For single column indexes and fixed-size multicolumn indexes
    [Figure]key-length[Figure] 4[Figure][Figure] 4
  • For variable-size multicolumn indexes with a key length of 255 bytes or less
    [Figure](key-length + 1) [Figure] 4[Figure][Figure] 4
  • For variable-size multicolumn indexes with a key length of 256 bytes or more
    [Figure](key-length + 2) [Figure] 4[Figure][Figure] 4
The key length of a multicolumn index is the sum of the key lengths of its component columns based on the key lengths shown in Table 17-5 List of index key lengths.

Table 17-5 List of index key lengths

Classifi-
cation
Data typeData length (bytes)
Key length less than 256Key length 256 or greater
Single-
column index
Multicolumn indexSingle-
column index
Multicolumn index
Fixed length1Variable length2Fixed length1Variable length2
Numeric dataINTEGER456N57
SMALLINT234N35
LARGE DECIMAL(m,n)3[Figure]m[Figure] 2 [Figure] + 1[Figure]m[Figure] 2 [Figure] + 2[Figure]m[Figure] 2 [Figure] + 3N[Figure]m[Figure] 2 [Figure] + 2[Figure]m[Figure] 2 [Figure] + 4
FLOAT or DOUBLE PRECISION8EE[Figure]EE
SMALLFLT or REAL4EE[Figure]EE
Character dataCHARACTER(n)nn + 1n + 2nn + 1n + 3
VARCHAR(n)a + 1Na + 2a + 2Na + 3
National character dataNCHAR(n) or NATIONAL CHARACTER(n)2 [Figure] n2 [Figure] n + 12 [Figure] n + 22 [Figure] n2 [Figure] n + 12 [Figure] n + 3
NVARCHAR(n)2 [Figure] b + 1N2 [Figure] b + 22 [Figure] b + 2N2 [Figure] b + 3
Mixed character dataMCHAR(n)nn + 1n + 2nn + 1n + 3
MVARCHAR(n)a + 1Na + 2a + 2Na + 3
Date dataDATE456[Figure]57
Time dataTIME345[Figure]46
Date interval dataINTERVAL YEAR TO DAY567[Figure]68
Time interval dataINTERVAL HOUR TO SECOND456[Figure]57
Timestamp dataTIMESTAMP(p)7 + (p[Figure]2)8 + (p[Figure]2)9 + (p[Figure]2)[Figure]8 + (p[Figure]2)10 + (p[Figure]2)
Legend:
a: Actual data length
b: Actual number of characters
m, n, p: Positive integer
E: Error occurs during index definition
[Figure]: Not applicable
Note
Begin calculation with a key length less than 255 bytes. If it turns out that the key length is 256 bytes or greater, recalculate at a key length of 256 bytes or greater.
1 Key length of an index that contains only fixed-length component columns.
2 Key length of an index that contains variable-length component columns.
3 This is a fixed decimal number consisting of a total of m digits and n decimal places. If m is omitted, 15 is assumed.

(3) Examples of calculating the number of index storage pages

(a) Example 1

Obtain the number of index storage pages for a unique index (no duplicated key values) of the PCODE column for the following STOCK table:

PNOPNAMESTANDARDPRICEQUANTITYCOST
20180CLEANERC20200002615000
20190CLEANERC772800010523000
20130REFRIGERATORP10300007025000
20220TVK18350001230000
20200CLEANERC89350003030000
20140REFRIGERATORP23350006030000
20280AMPLIFIERL103800020033000
20150REFRIGERATORP32480005043000
20290AMPLIFIERL504980026045000
20230TVK20500001545000
20160REFRIGERATORP355580012050000
Conditions:
  1. Total number of index keys: 10,000
  2. Page length of user RDAREA: 8,192 bytes
  3. Percentage of unused space specified with CREATE TABLE: 30%
  4. Data type of index: CHARACTER
  5. Index key length: 5 bytes
  6. Number of key duplicates: 1
Formula
[Figure]
(b) Example 2

Obtain the number of index storage pages for the STOCK table shown in Example 1 when the PNAME column is used as the index (with duplicated key values).

Conditions:
  1. Total number of index keys: 10,000
  2. Page length of user RDAREA: 8,192 bytes
  3. Percentage of unused space specified with CREATE TABLE: 30%
  4. Data type of index: NCHAR
  5. Index key length: 4 characters (Kanji characters)
  6. Number of keys with more than 200 duplicated key values: 1
    (Average number of duplicates: 250)
  7. Number of keys with up to 200 duplicated key values: (10000 - 250)/5 =
    1950 (Average number of duplicates: 5)
Formula
[Figure]

Obtain the number of index storage pages for the following MEMBERSHIP_TABLE using the SEX and YEAR_JOINED columns as a multicolumn index:

MNONAMEAGESEXYEAR_JOINED
0001
0002
0003
0004
[Figure]
[Figure]
[Figure]
1000
Lisa Roberts
John Anderson
Jane Wood
Mark Wood
[Figure]
[Figure]
[Figure]
Joe Young
18
25
24
25
[Figure]
[Figure]
[Figure]
30
F
M
F
M
[Figure]
[Figure]
[Figure]
M
1983
1967
1987
1964
[Figure]
[Figure]
[Figure]
1995
Conditions:
  1. Total number of index keys: 10,000
  2. Page length of user RDAREA: 8,192 bytes
  3. Percentage of unused space specified with CREATE INDEX: 30%
  4. Number of members joined in 1964: 1,000
  5. Number of members joined in any other year: 200 or fewer
  6. Period covered: 31 years from 1965 to 1995
  7. The same numbers of male and female members are assumed to have joined each year.
  8. Data types of columns:
    ColumnData type
    MNOCHARACTER(5)
    NAMENCHAR(4)
    AGEINTEGER
    SEXCHARACTER(4)
    YEAR_JOINEDINTEGER
Formula:
  1. The number of keys (c) for members who joined within 31 years after 1965 (no more than 200 per year including both male and female members): c = 31 [Figure] 2 = 62
  2. The average number of duplicates (d) is: d=(10000-1000) [Figure]62=146.
  3. Number of keys (e) for members who joined in 1964 (1,000 members including both male and female members): e = 2
  4. Average number of duplicates (f): f = 1000/2 = 500
  5. DB storage key length (g) of the SEX and YEAR_JOINED columns:
    g = [Figure] (4 + 1 + 5)/4 [Figure][Figure] 4 = 12

    [Figure]