Scalable Database Server, HiRDB Version 8 Installation and Design Guide

[Contents][Index][Back][Next]

16.1.3 Calculating the number of index storage pages

The procedure used to calculate the number of pages required to store an index is explained in (1) as follows. The variables used in the formulas are explained in (2), and examples of calculating the number of pages needed to store an index are presented in (3).

If cluster key is specified with CREATE TABLE, the number of cluster key storage pages should be obtained in the same manner as for the number of index storage pages.

If an index is row-partitioned, the number of pages for the index in each storage RDAREA must be obtained.

Note
When an index page split occurs, the storage ratio of the keys in the index is 50:50, and the index is divided into two indexes. Therefore, if there are many additions or updates to an index, make the estimate for the maximum number of index storage pages double the calculated value. Also, even if there are inserts from the UAP, the index page split of the leaf page that stores the largest key considers the value of the PCTFREE operand.
One method to reduce the frequency of index page splits is unbalanced index splits. For details about index page splits and unbalanced index splits, see the manual HiRDB Version 8 System Operation Guide.
Organization of this subsection
(1) Calculation procedure
(2) Variables used in formulas
(3) Examples of calculating the number of index storage pages

(1) Calculation procedure

The following formula is used to calculate the number of pages needed to store an index.

Formula
                        n
Number-of-index-storage-pages = [Figure] Pi + Pd
                        i=1

The recursive formula shown in Formula 1 as follows is used to obtain Pi. Pi + 1 must be calculated until Pn = 1, then the sum of the results must be obtained.

The value of Pd must be obtained if the number of duplicated key values exceeds 200. The formula for obtaining the value of Pd is shown in Formula 2 as follows:

Formula 1

[Figure]

Formula 2

[Figure]

Number of duplicated elements per row when the index contains repetition columns
If the index contains repetition columns, the number of duplicated elements per row must not exceed the following value:
Number of duplicated elements = [Figure] ([Figure] a [Figure] 0.95 [Figure] - 82) [Figure] 4 [Figure] - 1

(2) Variables used in formulas

a: Page length of user RDAREA (bytes)

b: Percentage of unused space specified with CREATE TABLE1(%)

c: Number of keys with up to 200 duplicated key values#2, #3, #4

d: Average number of duplicates for keys with up to 200 duplicated key values#3, #5

e: Number of keys with more than 200 duplicated key values,#3, #4

f: Average number of duplicates for keys with more than 200 duplicated key values#3, #5

g: DB storage key length#6 (bytes)

h: One of the following:
  • For a unique index: Number of keys not containing a null value
    For a multicolumn index, the total number of keys not containing a null value in its component columns.
  • For other than unique index: 0

#1
If no percentage of unused space is specified, 30% should be used in the calculation. If a cluster key is specified, the percentage of unused space specified with CREATE TABLE should be used.

#2
Non-duplicated keys in unique indexes must be included.

#3
Calculate so that the value of c [Figure] d + e [Figure] f is larger than the total number of index keys.

#4
Duplicate keys in unique indexes must be included (keys that are duplicate due to the fact that the key contains a null value).

#5
Round up fractions to an integer value.

#6
See Table 16-5 List of index key lengths. The length of the DB storage key can be obtained from the following formula:
  • For single column indexes and fixed-size multicolumn indexes
    [Figure]key-length [Figure] 4[Figure] [Figure] 4
  • For variable-size multicolumn indexes with a key length of 255 bytes or less
    [Figure](key-length + 1) [Figure] 4[Figure] [Figure] 4
  • For variable-size multicolumn indexes with a key length of 256 bytes or more
    [Figure](key-length + 2) [Figure] 4[Figure] [Figure] 4
The key length of a multicolumn index is the sum of the key lengths of its component columns based on the key lengths shown in Table 16-5 List of index key lengths.

Table 16-5 List of index key lengths

Classifi-
cation
Data type Data length (bytes)
Key length less than 256 Key length 256 or greater
Single-
column index
Multicolumn index Single-
column index
Multicolumn index
Fixed length1 Variable length2 Fixed length1 Variable length2
Numeric data INTEGER 4 5 6 N 5 7
SMALLINT 2 3 4 N 3 5
LARGE DECIMAL(m,n)3 [Figure] m [Figure] 2 [Figure] + 1 [Figure] m [Figure] 2 [Figure] + 2 [Figure] m [Figure] 2 [Figure] + 3 N [Figure] m [Figure] 2 [Figure] + 2 [Figure] m [Figure] 2 [Figure] + 4
FLOAT or DOUBLE PRECISION 8 E E [Figure] E E
SMALLFLT or REAL 4 E E [Figure] E E
Character data CHARACTER(n) n n + 1 n + 2 n n + 1 n + 3
VARCHAR(n) a + 1 N a + 2 a + 2 N a + 3
National character data NCHAR(n) or NATIONAL CHARACTER(n) 2 [Figure] n 2 [Figure] n + 1 2 [Figure] n + 2 2 [Figure] n 2 [Figure] n + 1 2 [Figure] n + 3
NVARCHAR(n) 2 [Figure] b + 1 N 2 [Figure] b + 2 2 [Figure] b + 2 N 2 [Figure] b + 3
Mixed character data MCHAR(n) n n + 1 n + 2 n n + 1 n + 3
MVARCHAR(n) a + 1 N a + 2 a + 2 N a + 3
Date data DATE 4 5 6 [Figure] 5 7
Time data TIME 3 4 5 [Figure] 4 6
Date interval data INTERVAL YEAR TO DAY 5 6 7 [Figure] 6 8
Time interval data INTERVAL HOUR TO SECOND 4 5 6 [Figure] 5 7
Timestamp data TIMESTAMP(p) 7 + (p[Figure]2) 8 + (p[Figure]2) 9 + (p[Figure]2) [Figure] 8 + (p[Figure]2) 10 + (p[Figure]2)

Legend:
a: Actual data length
b: Actual number of characters
m, n, p: Positive integer
E: Error occurs during index definition
[Figure]: Not applicable

Note
Begin calculation with a key length less than 255 bytes. If it turns out that the key length is 256 bytes or greater, recalculate at a key length of 256 bytes or greater.

1 Key length of an index that contains only fixed-length component columns.

2 Key length of an index that contains variable-length component columns.

3 This is a fixed decimal number consisting of a total of m digits and n decimal places. If m is omitted, 15 is assumed.

(3) Examples of calculating the number of index storage pages

(a) Example 1

Obtain the number of index storage pages for a unique index (no duplicated key values) of the PCODE column for the following STOCK table:

PNO PNAME STANDARD PRICE QUANTITY COST
20180 CLEANER C20 20000 26 15000
20190 CLEANER C77 28000 105 23000
20130 REFRIGERATOR P10 30000 70 25000
20220 TV K18 35000 12 30000
20200 CLEANER C89 35000 30 30000
20140 REFRIGERATOR P23 35000 60 30000
20280 AMPLIFIER L10 38000 200 33000
20150 REFRIGERATOR P32 48000 50 43000
20290 AMPLIFIER L50 49800 260 45000
20230 TV K20 50000 15 45000
20160 REFRIGERATOR P35 55800 120 50000

Conditions:
  1. Total number of index keys: 10,000
  2. Page length of user RDAREA: 8,192 bytes
  3. Percentage of unused space specified with CREATE TABLE: 30%
  4. Data type of index: CHARACTER
  5. Index key length: 5 bytes
  6. Number of key duplicates: 1

Formula
[Figure]
(b) Example 2

Obtain the number of index storage pages for the STOCK table shown in Example 1 when the PNAME column is used as the index (with duplicated key values).

Conditions:
  1. Total number of index keys: 10,000
  2. Page length of user RDAREA: 8,192 bytes
  3. Percentage of unused space specified with CREATE TABLE: 30%
  4. Data type of index: NCHAR
  5. Index key length: 4 characters (Kanji characters)
  6. Number of keys with more than 200 duplicated key values: 1
    (Average number of duplicates: 250)
  7. Number of keys with up to 200 duplicated key values: (10000 - 250)/5 =
    1950 (Average number of duplicates: 5)

Formula
[Figure]

Obtain the number of index storage pages for the following MEMBERSHIP_TABLE using the SEX and YEAR_JOINED columns as a multicolumn index:

MNO NAME AGE SEX YEAR_JOINED
0001
0002
0003
0004
[Figure]
[Figure]
[Figure]
1000
Lisa Roberts
John Anderson
Jane Wood
Mark Wood
[Figure]
[Figure]
[Figure]
Joe Young
18
25
24
25
[Figure]
[Figure]
[Figure]
30
F
M
F
M
[Figure]
[Figure]
[Figure]
M
1983
1967
1987
1964
[Figure]
[Figure]
[Figure]
1995

Conditions:
  1. Total number of index keys: 10,000
  2. Page length of user RDAREA: 8,192 bytes
  3. Percentage of unused space specified with CREATE INDEX: 30%
  4. Number of members joined in 1964: 1,000
  5. Number of members joined in any other year: 200 or fewer
  6. Period covered: 31 years from 1965 to 1995
  7. The same numbers of male and female members are assumed to have joined each year.
  8. Data types of columns:
    Column Data type
    MNO CHARACTER(5)
    NAME NCHAR(4)
    AGE INTEGER
    SEX CHARACTER(4)
    YEAR_JOINED INTEGER

Formula:
  1. The number of keys (c) for members who joined within 31 years after 1965 (no more than 200 per year including both male and female members): c = 31 [Figure] 2 = 62
  2. The average number of duplicates (d) is: d=(10000-1000) [Figure]62=146.
  3. Number of keys (e) for members who joined in 1964 (1,000 members including both male and female members): e = 2
  4. Average number of duplicates (f): f = 1000/2 = 500
  5. DB storage key length (g) of the SEX and YEAR_JOINED columns:
    g = [Figure] (4 + 1 + 5)/4 [Figure] [Figure] 4 = 12

    [Figure]