Calculating the number of index storage pages

The procedure used to calculate the number of pages required to store an index is explained in (1) as follows. The variables used in the formulas are explained in (2), and examples of calculating the number of pages needed to store an index are presented in (3).

If cluster key is specified with CREATE TABLE, the number of cluster key storage pages should be obtained in the same manner as for the number of index storage pages.

If an index is row-partitioned, the number of pages for the index in each storage RDAREA must be obtained.

Note: When an index page split occurs, the storage ratio of the keys in the index is 50:50, and the index is divided into two indexes. Therefore, if there are many additions or updates to an index, make the estimate for the maximum number of index storage pages double the calculated value. Also, even if there are inserts from the UAP, the index page split of the leaf page that stores the largest key considers the value of the PCTFREE operand.; One method to reduce the frequency of index page splits is unbalanced index splits. For details about index page splits and unbalanced index splits, see the manual HiRDB Version 8 System Operation Guide.

Organization of this subsection: (1) Calculation procedure; (2) Variables used in formulas; (3) Examples of calculating the number of index storage pages

(1) Calculation procedure

The following formula is used to calculate the number of pages needed to store an index.

Formula

                        n
Number-of-index-storage-pages =  Pi + P_d
                        i=1

The recursive formula shown in Formula 1 as follows is used to obtain P_i. P_i + 1 must be calculated until P_n = 1, then the sum of the results must be obtained.

The value of P_d must be obtained if the number of duplicated key values exceeds 200. The formula for obtaining the value of P_d is shown in Formula 2 as follows:

Formula 1

[Figure]

Formula 2

[Figure]

Number of duplicated elements per row when the index contains repetition columns

If the index contains repetition columns, the number of duplicated elements per row must not exceed the following value:

Number of duplicated elements = [Figure]

(

0.95

- 82)

- 1

(2) Variables used in formulas

a: Page length of user RDAREA (bytes)

b: Percentage of unused space specified with CREATE TABLE¹(%)

c: Number of keys with up to 200 duplicated key values^#^2,^#^3,^#⁴

d: Average number of duplicates for keys with up to 200 duplicated key values^#^3,^#⁵

e: Number of keys with more than 200 duplicated key values^,^#^3,^#⁴

f: Average number of duplicates for keys with more than 200 duplicated key values^#^3,^#⁵

g: DB storage key length^#⁶ (bytes)

h: One of the following:

For a unique index: Number of keys not containing a null value
For a multicolumn index, the total number of keys not containing a null value in its component columns.
For other than unique index: 0

#1: If no percentage of unused space is specified, 30% should be used in the calculation. If a cluster key is specified, the percentage of unused space specified with CREATE TABLE should be used.

#2: Non-duplicated keys in unique indexes must be included.

#3: Calculate so that the value of c d + e f is larger than the total number of index keys.

#4: Duplicate keys in unique indexes must be included (keys that are duplicate due to the fact that the key contains a null value).

#5: Round up fractions to an integer value.

#6

See Table 16-5 List of index key lengths. The length of the DB storage key can be obtained from the following formula:

For single column indexes and fixed-size multicolumn indexes
key-length 4 4
For variable-size multicolumn indexes with a key length of 255 bytes or less
(key-length + 1) 4 4
For variable-size multicolumn indexes with a key length of 256 bytes or more
(key-length + 2) 4 4

The key length of a multicolumn index is the sum of the key lengths of its component columns based on the key lengths shown in Table 16-5 List of index key lengths.

Table 16-5 List of index key lengths

Classifi- cation	Data type	Data length (bytes)
		Key length less than 256			Key length 256 or greater
		Single- column index	Multicolumn index		Single- column index	Multicolumn index
		Single- column index	Fixed length¹	Variable length²	Single- column index	Fixed length¹	Variable length²
Numeric data	INTEGER	4	5	6	N	5	7
	SMALLINT	2	3	4	N	3	5
	LARGE DECIMAL(m,n)³	m 2 + 1	m 2 + 2	m 2 + 3	N	m 2 + 2	m 2 + 4
	FLOAT or DOUBLE PRECISION	8	E	E		E	E
	SMALLFLT or REAL	4	E	E		E	E
Character data	CHARACTER(n)	n	n + 1	n + 2	n	n + 1	n + 3
Character data	VARCHAR(n)	a + 1	N	a + 2	a + 2	N	a + 3
National character data	NCHAR(n) or NATIONAL CHARACTER(n)	2 n	2 n + 1	2 n + 2	2 n	2 n + 1	2 n + 3
National character data	NVARCHAR(n)	2 b + 1	N	2 b + 2	2 b + 2	N	2 b + 3
Mixed character data	MCHAR(n)	n	n + 1	n + 2	n	n + 1	n + 3
Mixed character data	MVARCHAR(n)	a + 1	N	a + 2	a + 2	N	a + 3
Date data	DATE	4	5	6		5	7
Time data	TIME	3	4	5		4	6
Date interval data	INTERVAL YEAR TO DAY	5	6	7		6	8
Time interval data	INTERVAL HOUR TO SECOND	4	5	6		5	7
Timestamp data	TIMESTAMP(p)	7 + (p2)	8 + (p2)	9 + (p2)		8 + (p2)	10 + (p2)

Legend:: a: Actual data length; b: Actual number of characters; m, n, p: Positive integer; E: Error occurs during index definition; : Not applicable

Note: Begin calculation with a key length less than 255 bytes. If it turns out that the key length is 256 bytes or greater, recalculate at a key length of 256 bytes or greater.

¹ Key length of an index that contains only fixed-length component columns.

² Key length of an index that contains variable-length component columns.

³ This is a fixed decimal number consisting of a total of m digits and n decimal places. If m is omitted, 15 is assumed.

(3) Examples of calculating the number of index storage pages

(a) Example 1

Obtain the number of index storage pages for a unique index (no duplicated key values) of the PCODE column for the following STOCK table:

PNO	PNAME	STANDARD	PRICE	QUANTITY	COST
20180	CLEANER	C20	20000	26	15000
20190	CLEANER	C77	28000	105	23000
20130	REFRIGERATOR	P10	30000	70	25000
20220	TV	K18	35000	12	30000
20200	CLEANER	C89	35000	30	30000
20140	REFRIGERATOR	P23	35000	60	30000
20280	AMPLIFIER	L10	38000	200	33000
20150	REFRIGERATOR	P32	48000	50	43000
20290	AMPLIFIER	L50	49800	260	45000
20230	TV	K20	50000	15	45000
20160	REFRIGERATOR	P35	55800	120	50000

Conditions:

Total number of index keys: 10,000
Page length of user RDAREA: 8,192 bytes
Percentage of unused space specified with CREATE TABLE: 30%
Data type of index: CHARACTER
Index key length: 5 bytes
Number of key duplicates: 1

Formula

(b) Example 2

Obtain the number of index storage pages for the STOCK table shown in Example 1 when the PNAME column is used as the index (with duplicated key values).

Conditions:

Total number of index keys: 10,000
Page length of user RDAREA: 8,192 bytes
Percentage of unused space specified with CREATE TABLE: 30%
Data type of index: NCHAR
Index key length: 4 characters (Kanji characters)
Number of keys with more than 200 duplicated key values: 1
(Average number of duplicates: 250)
Number of keys with up to 200 duplicated key values: (10000 - 250)/5 =
1950 (Average number of duplicates: 5)

Formula

Obtain the number of index storage pages for the following MEMBERSHIP_TABLE using the SEX and YEAR_JOINED columns as a multicolumn index:

MNO	NAME	AGE	SEX	YEAR_JOINED
0001 0002 0003 0004 1000	Lisa Roberts John Anderson Jane Wood Mark Wood Joe Young	18 25 24 25 30	F M F M M	1983 1967 1987 1964 1995

Conditions:

Total number of index keys: 10,000
Page length of user RDAREA: 8,192 bytes
Percentage of unused space specified with CREATE INDEX: 30%
Number of members joined in 1964: 1,000
Number of members joined in any other year: 200 or fewer
Period covered: 31 years from 1965 to 1995
The same numbers of male and female members are assumed to have joined each year.
Data types of columns:

Column Data type

MNO CHARACTER(5)

NAME NCHAR(4)

AGE INTEGER

SEX CHARACTER(4)

YEAR_JOINED INTEGER

Column	Data type
MNO	CHARACTER(5)
NAME	NCHAR(4)
AGE	INTEGER
SEX	CHARACTER(4)
YEAR_JOINED	INTEGER

Formula:

The number of keys (c) for members who joined within 31 years after 1965 (no more than 200 per year including both male and female members): c = 31 2 = 62
The average number of duplicates (d) is: d=(10000-1000) 62=146.
Number of keys (e) for members who joined in 1964 (1,000 members including both male and female members): e = 2
Average number of duplicates (f): f = 1000/2 = 500
DB storage key length (g) of the SEX and YEAR_JOINED columns:
g = (4 + 1 + 5)/4 4 = 12

16.1.3 Calculating the number of index storage pages

(1) Calculation procedure

(2) Variables used in formulas

(3) Examples of calculating the number of index storage pages

(a) Example 1

(b) Example 2