6.25.3 Number of work tables created during retrieval using hash tables
This subsection explains how to determine the number of work tables created during retrieval using hash tables.
Retrieval using hash tables is performed when one of the following conditions is satisfied:
-
When hash join is used to join tables
-
When hash execution is used to process subqueries
-
When global hash grouping is used for processing of grouping or the DISTINCT set function
-
When hash execution is used as a method for processing SELECT DISTINCT
-
When hash execution is used as a method for processing the set operation
A hash table area is reserved and hash join processing is performed on the basis of the value specified for the adb_sql_exe_hashtbl_area_size operand in the server definition and client definition. A work table is created when a shortage occurs in the reserved hash table area during this processing.
Use the following formula to determine the number of work tables created during retrieval using hash tables for each SQL statement to be executed.
Formula
Explanation of variables
- A: Maximum row length in hash tables (bytes)
-
Determine the row length in hash tables for each of the elements described later in each query expression body. The maximum value among the determined values becomes the maximum row length in the hash tables.
-
Query specification that uses the equal sign (=) to join multiple tables
For each table joined by the equal sign (=), determine the sum total of the lengths of all columns specified in selection expressions and search conditions.
Note that if value expressions that include a column specification are specified on the left and right sides of the = join condition, add the data lengths of the results of those value expressions to the sum total.
The sum total of all values determined for the tables, excluding the minimum value, becomes the row length in the hash tables.
However, if only two tables are joined using the equal sign (=) and the results of the value expressions specified on the left and right sides have different data types or lengths, determine the total of all values. In this case, do not exclude the minimum value.
The hash table row length can be determined from the access path display result. For a tree query for which the table joining method is HASH JOIN, perform the above to determine the hash table row length.
-
Table subquery of a quantified predicate
The sum total of the following four values becomes the row length in the hash tables:
∙ Sum total of the data lengths of the results of subquery selection expressions
∙ Sum total of the data lengths of the results of the set functions specified in a subquery
∙ Sum total of the lengths of all columns specified in a predicate that includes an external reference column within the search condition
∙ If the result of the table subquery selection expression on the right side of the quantified predicate or IN predicate and the column specified on the left side of that predicate have different data types or data lengths, the length of the column specified on the left side of the predicate
The hash table row length can be determined from the access path display result. For a tree query for which the subquery processing is SUBQUERY HASH, the sum of the preceding four values determines the hash table row length.
-
Table subquery in the IN predicate
For details about how to determine the row length in hash tables, see Table subquery of a quantified predicate above.
-
Subquery that uses the equal sign (=) to specify an external reference column within the search condition, other than a table subquery of a quantified predicate or a table subquery in the IN predicate
For details about how to determine the row length in hash tables, see Table subquery of a quantified predicate above.
-
Query specification that includes the GROUP BY clause
Determine the column length of the grouping column and the data length of the result of the set function. The sum total of these results becomes the row length in the hash tables.
The hash table row length can be determined from the access path display result. Determine the column length of the grouping column and the data length of the set function result for a tree query for which the grouping method is GLOBAL HASH GROUPING. The sum total of these results becomes the row length in the hash tables.
-
Query specification that includes the DISTINCT set function
Determine the following values: The column length of the grouping column, the data length of the result of the value expression specified for the argument of the DISTINCT set function, and the data length of the results of set functions (excluding the DISTINCT set function). The sum total of these results becomes the row length in the hash tables.
-
Query specification that includes SELECT DISTINCT
The total data length of the selection expressions in SELECT DISTINCT becomes the row length of hash tables.
The hash table row length can be determined from the access path display result. Determine the data length of the selection expression in SELECT DISTINCT for each tree query for which hash execution is the method for processing SELECT DISTINCT. The sum total of these results becomes the row length in the hash tables.
-
Query expression body in which UNION or UNION DISTINCT is specified
Determine the column length of the table that is derived by the query expression body. The sum total of these results becomes the row length in the hash tables.
The hash table row length can be determined from the access path display result. Determine the column length of the table that is derived by the query expression body for each tree query for which hash execution is the method for processing the set operation. The sum total of these results becomes the row length in the hash tables.
-
- B: Maximum number of intermediate retrieval results
-
Use the following formula to determine its value:
Determine the maximum number of provisional intermediate retrieval results for each of the elements described later in each query expression body. The largest number of the determined maximum numbers is the maximum number of provisional intermediate retrieval results.
-
Query specification that uses the equal sign (=) to join multiple tables
For both sides of each equal sign (=), determine the number of retrieval results, and then select the smaller value. Of the values determined for all joins, use the largest as the maximum number of intermediate retrieval results.
The number of intermediate retrieval results can be determined from the access path display result. Determine the number of results for the outer table for a tree query for which the table joining method is HASH JOIN. From the determined numbers of results, the largest number of results is the number of intermediate retrieval results.
-
Table subquery of a quantified predicate
The number of rows of results for the subquery is the number of intermediate retrieval results.
The number of intermediate retrieval results can be determined from the access path display result. Determine the number of subquery results for a tree query for which subquery processing is SUBQUERY HASH. From the determined numbers of results, the largest number of results is the number of intermediate retrieval results.
-
Table subquery in the IN predicate
For details about how to determine the number of intermediate retrieval results, see Table subquery of a quantified predicate.
-
Subquery that uses the equal sign (=) to specify an external reference column inside the search condition, other than a table subquery of a quantified predicate or a table subquery in the IN predicate
For details about how to determine the number of intermediate retrieval results, see Table subquery of a quantified predicate.
-
Query specification that includes the GROUP BY clause
The number of results after grouping is used becomes the number of intermediate retrieval results.
The number of intermediate retrieval results can be determined from the access path display result. Determine the number of results after performing grouping for a tree query for which the grouping method is GLOBAL HASH GROUPING. From the determined numbers of results, the largest number of results is the number of intermediate retrieval results.
-
Query specification that includes the DISTINCT set function
The number of results after duplication is eliminated becomes the number of intermediate retrieval results. If you specify the GROUP BY clause in a query specification, assume that the argument of the DISTINCT set function is also a grouping column. Then, use the number of results obtained after grouping as the approximate number of intermediate retrieval results.
-
Query specification that includes SELECT DISTINCT
The number of results after duplication is eliminated becomes the number of intermediate retrieval results.
-
Query expression body in which UNION or UNION DISTINCT is specified
The number of results after the set operation is performed becomes the number of intermediate retrieval results.
-
- C: Number of hash tables
-
Use the following formula to determine its value:
The number of hash tables can also be determined from the access path display result. If you can display the access path, determine the number of hash tables from the access path display result and the following formula:
- #
-
Determine the number by using the information displayed in the <<Tree View>> of the access path display result. For example, if HASH JOIN appears twice in the <<Tree View>>, calculate the number of HASH JOINs (table joining method) displayed as 2.
- D: Hash table area
-
Use the following formula to determine its value:
If the value specified for the adb_sql_exe_hashtbl_area_size operand in the client definition is smaller than the value specified for the adb_sql_exe_hashtbl_area_size operand in the server definition, substitute the value specified in the client definition. Otherwise, substitute the value specified in the server definition.
If 0 is specified for the adb_sql_exe_hashtbl_area_size operand, retrieval using hash tables is not applied.