Scalable Database Server, HiRDB Version 8 UAP Development Guide
This section describes the items that must be set before hash join or hash execution of a subquery can be applied with the SQL extension optimizing option.
Before hash join or hash execution of a subquery can be applied, the following items must be set:
Use the pd_hash_table_size operand in the system definition or PDHASHTBLSIZE in the client environment definition to set the hash table size. Calculate the maximum hash table row length, and then set the hash table size to a value that is equal to or greater than the value obtained from the following formula:
hash-table-size (in kilobytes) (maximum-hash-table-row-length (in bytes) 2 + 32) 128 128 |
hash-table-size-for-batch-hash-join (in kilobytes) = (number-of-hash-table-data-pages + number-of-hash-table-management-table-pages) number-of-one-segment-pages 128 |
number-of-hash-table-data-pages = number-of-hash-table-rows MIN { (hash-table-page-length 48) hash-table-row-length , 255} + 63 |
number-of-hash-table-management-table-pages = (16 number-of-hash-table-rows + ( (number-of-hash-table-data-pages hash-table-page-length + 16 number-of-hash-table-rows) (number-of-one-segment-pages hash-table-page-length) 8) + 8) hash-table-page-length hash-table-page-length |
number-of-one-segment-pages = (128 1024) hash-table-page-length |
Determine the hash table page length from the hash table row length as shown in the following table.
Hash table row length | Hash table page length |
---|---|
0 to 1,012 | 4,096 |
1,013 to 2,036 | 8,192 |
2,037 to 4,084 | 16,384 |
4,085 to 16,360 | 32,768 |
16,361 to 32,720 | (hash-table-row-length + 48) 2048 2048
|
The method for allocating the work table buffer must be set to buffer batch allocation (pool) in server process units. Therefore specify pool in the pd_work_buff_mode operand of the system definition.
Hash tables are allocated in the work table buffer. If the work table buffer size or the upper limit size for expansion allocation of the work table buffer is smaller than the specified hash table size, an error occurs because of insufficient space in the work table buffer. Therefore, in the pd_work_buff_size or pd_work_buff_expand_limit operand of the system definition, set a value that is equal to or larger than the value calculated with the following formula:
work-table-buffer-size (in kilobytes) (hash-table-size (in kilobytes) 2 + 256) maximum-number-of-hash-joins-in-SELECT-statement + 128
number-of hash-joins-in-SELECT-statement = ((number-of-tables-joined-with-=) (number-of-query-specifications-joined-with-=)), + (number-of-=ANY-quantified-predicates) + (number-of-=SOME-quantified-predicates) + (number-of-IN-(subquery)-specifications) + (number-of-other-subqueries-that-specify-external-reference-columns-with-=-in-search-conditions)
SELECT A.A1,B.B2,C.C3 FROM A,B,C 3-1 WHERE A.A1=B.B1 AND A.A2=B.B2 AND B.B3=C.C3 AND A.A1=C.C1 AND A.A4=ANY(SELECT D.D4 FROM D) 1 AND A.A5=SOME(SELECT E.E5 FROM E) 1 AND A.A6 IN(SELECT F.F6 FROM F WHERE F.F1=A.A1) 1 AND EXISTS(SELECT G.G1 FROM G WHERE G.G1=B.B1) 1
work-table-buffer-size (in kilobytes) hash-table-size (in kilobytes) maximum-number-of-hash-joins-in-SELECT-statement + 384
A retrieval that accompanies a hash join or subquery execution is processed by hashing.
You can select the hashing mode with the pd_hashjoin_hashing_mode operand of the system definition or with PDHJHASHINGMODE client environment definition. The default is TYPE1.
TYPE1 is the hashing mode of HiRDB versions before 07-02. If you use hash join for the first time in HiRDB version 07-02 or a more recent version, specify TYPE2.
If you specify TYPE1 in HiRDB version 07-02 or a more recent version, you may not obtain the desired performance. If this happens, specify TYPE2 as the hashing mode, or see (3) Tuning the hashing mode, and tune the mode.
The hash table size can be tuned based on either of the following types of tuning information:
For details about the UAP statistical report, see 10.1.4 UAP statistical report facility. For details about the statistics analysis utility, see the HiRDB Version 8 Command Reference manual.
When tuning information about the hash table size is obtained, the following items can be determined:
Bucket repartitioning refers to the repetition of bucket partitioning recursively for up to three levels when the bucket size exceeds the hash table size. An example is shown as follows.
The number of partitions in one bucket partitioning is determined from the following formula:
number-of-bucket-partitions = MIN { (hash-table-size 2) hash-table-page-length , 64}
Even with batch hash join, level 1 bucket partitioning is executed for inner tables to be joined.
Table 4-14 describes the tuning methods for hash table size.
Table 4-14 Tuning methods for hash table size
Tuning information (unit: kilobytes) | Tuning method |
---|---|
Maximum batch hash table size | If a value that is equal to or greater than this value was set as the hash table size, batch hash join without bucket partitioning is set for all data.1 If this value exceeds the maximum hash table size that can be specified, batch hatch join cannot be used. If this value is 0, hash join or subquery hash execution has not been performed. |
Level 1 maximum bucket size | If a value that is equal to or greater than this value was set as the hash table size, bucket partitioning has terminated at level 1. If bucket partitioning is set to level 2 or higher, specifying this value as the hash table size terminates bucket partitioning at level 1.2 If batch hash join without bucket partitioning was executed on all data, 0 is displayed for this value. |
Level 2 maximum bucket size | If a value that is equal to or greater than this value was set as the hash table size, bucket partitioning has terminated at level 2. If bucket partitioning is set to level 3 or higher, specifying this value as the hash table size terminates bucket partitioning at level 2.2 If level 2 bucket partitioning was not executed even once, 0 is displayed for this value. |
Level 3 maximum bucket size | If a value that is equal to or greater than this value was set as the hash table size, bucket partitioning has terminated at level 3. If the hash table size is smaller than this value, each bucket is partially expanded into the hash table, and the processing efficiency becomes worse. In this case, set the hash table size to this value or larger.2 In some cases, not applying hash join or subquery hash execution improves the performance. If level 3 bucket partitioning was not executed even once, 0 is displayed for this value. |
1 When the hash table size is increased, the number of bucket partitions at each partitioning execution may increase according to the formula for calculating the number of bucket partitions. Consequently, a hash table size that is larger than the size that was used during tuning information acquisition may be necessary.
If you used tuning information and increased the hash table size, get the tuning information again. If the intended results are not realized, increase the hash table size again according the new tuning information that was acquired.
2 When the hash table size is increased, the number of bucket partitions at each partitioning execution may increase according to the formula for calculating the number of bucket partitions. Consequently, bucket partitioning may terminate at the intended level even if the hash table size is smaller than the size that was used during tuning information acquisition.
On the other hand, when the hash table size is decreased, the number of bucket partitions at each partitioning execution may decrease. Consequently, bucket partitioning may not terminate at the same level used during tuning information acquisition. You should therefore use this tuning information when the hash table size is increased.
The hash table size can be tuned based on either of the following types of tuning information:
For details about the UAP statistical report, see 10.1.4 UAP statistical report facility. For details about the statistics analysis utility, see the manual HiRDB Version 8 Command Reference.
By obtaining tuning information about the hash table size, you can determine the following item:
Table 4-15 shows tuning information for the hashing mode.
Table 4-15 Tuning information for the hashing mode
Tuning information (unit: number) | Description |
---|---|
Maximum number of comparisons | Maximum number of comparisons during hash searching |
Total number of comparisons | Total number of comparisons during hash searching |
Number of searches | Number of hash table searches |
Average number of comparisons | Total number of comparisons / number of searches |
All Rights Reserved. Copyright (C) 2007, Hitachi, Ltd.