16.1.3 Calculating the number of index storage pages
The procedure used to calculate the number of pages required to store an index is explained in (1) as follows. The variables used in the formulas are explained in (2), and examples of calculating the number of pages needed to store an index are presented in (3).
If cluster key is specified with CREATE TABLE, the number of cluster key storage pages should be obtained in the same manner as for the number of index storage pages.
If an index is row-partitioned, the number of pages for the index in each storage RDAREA must be obtained.
- Note
- When an index page split occurs, the storage ratio of the keys in the index is 50:50, and the index is divided into two indexes. Therefore, if there are many additions or updates to an index, make the estimate for the maximum number of index storage pages double the calculated value. Also, even if there are inserts from the UAP, the index page split of the leaf page that stores the largest key considers the value of the PCTFREE operand.
- One method to reduce the frequency of index page splits is unbalanced index splits. For details about index page splits and unbalanced index splits, see the HiRDB Version 9 System Operation Guide.
- Organization of this subsection
- (1) Calculation procedure
- (2) Variables used in formulas
- (3) Examples of calculating the number of index storage pages
(1) Calculation procedure
(2) Variables used in formulas
- a: Page length of user RDAREA (bytes)
- b: Percentage of unused space specified with CREATE TABLE#1(%)
- c: Number of keys with up to 200 duplicated key values#2, #3, #4
- d: Average number of duplicates for keys with up to 200 duplicated key values#3, #5
- e: Number of keys with more than 200 duplicated key values,#3, #4
- f: Average number of duplicates for keys with more than 200 duplicated key values#3, #5
- g: DB storage key length#6 (bytes)
- h: One of the following:
- For a unique index: Number of keys not containing a null value
For a multicolumn index, the total number of keys not containing a null value in its component columns.
- For other than unique index: 0
- #1
- If no percentage of unused space is specified, 30% should be used in the calculation. If a cluster key is specified, the percentage of unused space specified with CREATE TABLE should be used.
- #2
- Non-duplicated keys in unique indexes must be included.
- #3
- Calculate so that the value of c x d + e x f is larger than the total number of index keys.
- #4
- Duplicate keys in unique indexes must be included (keys that are duplicate due to the fact that the key contains a null value).
- #5
- Round up fractions to an integer value.
- #6
- See Table 16-5 List of index key lengths. The length of the DB storage key can be obtained from the following formula:
- For single column indexes and fixed-size multicolumn indexes
key-length
4
x 4
- For variable-size multicolumn indexes with a key length of 255 bytes or less
(key-length + 1)
4
x 4
- For variable-size multicolumn indexes with a key length of 256 bytes or more
(key-length + 2)
4
x 4
- The key length of a multicolumn index is the sum of the key lengths of its component columns based on the key lengths shown in Table 16-5 List of index key lengths.
Table 16-5 List of index key lengths
Classifi- cation | Data type | Data length (bytes) |
---|
Key length less than 256 | Key length 256 or greater |
---|
Single- column index | Multicolumn index | Single- column index | Multicolumn index |
---|
Fixed length#1 | Variable length#2 | Fixed length#1 | Variable length#2 |
---|
Numeric data | INTEGER | 4 | 5 | 6 | N | 5 | 7 |
SMALLINT | 2 | 3 | 4 | N | 3 | 5 |
LARGE DECIMAL(m,n)#3 | m 2 + 1 | m 2 + 2 | m 2 + 3 | N | m 2 + 2 | m 2 + 4 |
FLOAT or DOUBLE PRECISION | 8 | E | E | -- | E | E |
SMALLFLT or REAL | 4 | E | E | -- | E | E |
Character data | CHARACTER(n) | N | n + 1 | n + 2 | n | n + 1 | n + 3 |
VARCHAR(n) | a + 1 | N | a + 2 | a + 2 | N | a + 3 |
National character data | NCHAR(n) or NATIONAL CHARACTER(n) | 2 x n | 2 x n + 1 | 2 x n + 2 | 2 x n | 2 x n + 1 | 2 x n + 3 |
NVARCHAR(n) | 2 x b + 1 | N | 2 x b + 2 | 2 x b + 2 | N | 2 x b + 3 |
Mixed character data | MCHAR(n) | N | n + 1 | n + 2 | n | n + 1 | n + 3 |
MVARCHAR(n) | a + 1 | N | a + 2 | a + 2 | N | a + 3 |
Date data | DATE | 4 | 5 | 6 | -- | 5 | 7 |
Time data | TIME | 3 | 4 | 5 | -- | 4 | 6 |
Date interval data | INTERVAL YEAR TO DAY | 5 | 6 | 7 | -- | 6 | 8 |
Time interval data | INTERVAL HOUR TO SECOND | 4 | 5 | 6 | -- | 5 | 7 |
Timestamp data | TIMESTAMP(p) | 7 + (p 2) | 8 + (p 2) | 9 + (p 2) | -- | 8 + (p 2) | 10 + (p 2) |
- a: Actual data length
- b: Actual number of characters
- m, n, p: Positive integer
- E: Error occurs during index definition
- --: Not applicable
- Note
- Begin calculation with a key length less than 255 bytes. If it turns out that the key length is 256 bytes or greater, recalculate at a key length of 256 bytes or greater.
- #1: Key length of an index that contains only fixed-length component columns.
- #2: Key length of an index that contains variable-length component columns.
- #3: This is a fixed decimal number consisting of a total of m digits and n decimal places. If m is omitted, 15 is assumed.
- Reference note
- Non-unique indexes have areas that store index data multiple times in their index data storage areas, so they are that much larger. Unique indexes, on the other hand, have no areas that store duplicated instances. For this reason, unique indexes are smaller than non-unique indexes.
(3) Examples of calculating the number of index storage pages
(a) Example 1
Obtain the number of index storage pages for a unique index (no duplicated key values) of the PCODE column for the following STOCK table:
PNO | PNAME | STANDARD | PRICE | QUANTITY | COST |
---|
20180 | CLEANER | C20 | 20000 | 26 | 15000 |
20190 | CLEANER | C77 | 28000 | 105 | 23000 |
20130 | REFRIGERATOR | P10 | 30000 | 70 | 25000 |
20220 | TV | K18 | 35000 | 12 | 30000 |
20200 | CLEANER | C89 | 35000 | 30 | 30000 |
20140 | REFRIGERATOR | P23 | 35000 | 60 | 30000 |
20280 | AMPLIFIER | L10 | 38000 | 200 | 33000 |
20150 | REFRIGERATOR | P32 | 48000 | 50 | 43000 |
20290 | AMPLIFIER | L50 | 49800 | 260 | 45000 |
20230 | TV | K20 | 50000 | 15 | 45000 |
20160 | REFRIGERATOR | P35 | 55800 | 120 | 50000 |
- Conditions:
- Total number of index keys: 10,000
- Page length of user RDAREA: 8,192 bytes
- Percentage of unused space specified with CREATE TABLE: 30%
- Data type of index: CHARACTER
- Index key length: 5 bytes
- Number of key duplicates: 1
- Formula
![[Figure]](figure/zu161150.gif)
(b) Example 2
Obtain the number of index storage pages for the STOCK table shown in Example 1 when the PNAME column is used as the index (with duplicated key values).
- Conditions:
- Total number of index keys: 10,000
- Page length of user RDAREA: 8,192 bytes
- Percentage of unused space specified with CREATE TABLE: 30%
- Data type of index: NCHAR
- Index key length: 4 characters (kanji characters)
- Number of keys with more than 200 duplicated key values: 1
(Average number of duplicates: 250)
- Number of keys with up to 200 duplicated key values: (10,000 - 250)/5 =
1,950 (Average number of duplicates: 5)
- Formula
![[Figure]](figure/zu161160.gif)
Obtain the number of index storage pages for the following MEMBERSHIP_TABLE using the SEX and YEAR_JOINED columns as a multicolumn index:
MNO | NAME | AGE | SEX | YEAR_JOINED |
---|
0001 0002 0003 0004 ... ... ... 1000 | Lisa Roberts John Anderson Jane Wood Mark Wood ... ... ... Joe Young | 18 25 24 25 ... ... ... 30 | F M F M ... ... ... M | 1983 1967 1987 1964 ... ... ... 1995 |
- Conditions:
- Total number of index keys: 10,000
- Page length of user RDAREA: 8,192 bytes
- Percentage of unused space specified with CREATE INDEX: 30%
- Number of members joined in 1964: 1,000
- Number of members joined in any other year: 200 or fewer
- Period covered: 31 years from 1965 to 1995
- The same numbers of male and female members are assumed to have joined each year.
- Data types of columns:
Column | Data type |
---|
MNO | CHARACTER(5) |
NAME | NCHAR(4) |
AGE | INTEGER |
SEX | CHARACTER(4) |
YEAR_JOINED | INTEGER |
- Formula:
- The number of keys (c) for members who joined within 31 years after 1965 (no more than 200 per year including both male and female members): c = 31 x 2 = 62
- The average number of duplicates (d) is: d=(10,000 - 1,000)
62 = 146. - Number of keys (e) for members who joined in 1964 (1,000 members including both male and female members): e = 2
- Average number of duplicates (f): f = 1,000
2 = 500 - DB storage key length (g) of the SEX and YEAR_JOINED columns:
g =
(4 + 1 + 5)/4
x 4 = 12
![[Figure]](figure/zu161170.gif)