Table reorganization

Deleting data does not release the segments and pages in which the deleted data has been stored. Therefore, when deletion of data has been performed many times, the amount of dead space in the table becomes significant, resulting in a decrease in data storage efficiency. Eventually, this can mean a shortage of space for RDAREAs, even though there has not been an increase in the amount of data.

Similarly, if data is added repeatedly, problems arise with respect to data not being stored in pages near the cluster key and to proliferation in the number of data I/O operations. The result is degradation of data retrieval performance. Executing the database reorganization utility (pdrorg) to reorganize a table causes the system to re-store the table's data, which can prevent these problems from arising. Figure 7-9 illustrates table reorganization.

Figure 7-9 Table reorganization

[Figure]

Explanation

First, the table is stored temporarily in an unload file; this process is called table unloading. Subsequently, the data is re-stored in the table; this process is called table data reloading. The entire process is called table reorganization.
If an index is defined for the table, the index information is output to an index information file when the data is reloaded. Based upon this information, HiRDB re-creates the index in the batch mode, which reorganizes the index as well.

Organization of this subsection: (1) Execution units for table reorganization; (2) Reorganizing a table containing a large quantity of data; (3) Facility for predicting reorganization time

(1) Execution units for table reorganization

Table reorganization can be executed in the following units:

By table
By RDAREA
By schema

(a) Reorganization by table

Reorganization processing can be performed on an entire table; this is the method that is usually used. You should first execute the database condition analysis utility to determine whether or not the entire table needs to be reorganized. If so, you can then execute reorganization of the entire table. Figure 7-10 illustrates reorganization of an entire table.

Figure 7-10 Reorganization of an entire table

[Figure]

Note: The data indicated by shading is subject to reorganization.

Explanation: The table to be reorganized is specified in the -t option of the database reorganization utility.

(b) Reorganization by RDAREA

Reorganization processing is performed on a per-RDAREA basis. This method can be used only if the table is row-partitioned. Reorganization of RDAREAs is executed when the results of the database condition analysis utility indicate that it would suffice to reorganize only a portion of a row-partitioned table. This reduces the processing time compared with reorganization of the entire table. Figure 7-11 illustrates reorganization of an RDAREA.

Figure 7-11 Reorganization of an RDAREA

[Figure]

Explanation: The table to be reorganized is specified in the -t option and the RDAREAs to be reorganized are specified in the -r option of the database reorganization utility.

(c) Reorganization by schema

This processing reorganizes all tables in a schema in the batch mode. Reorganization by schema can be used when you wish to reorganize all the tables you own on a batch basis. Figure 7-12 illustrates reorganization of a schema.

Figure 7-12 Reorganization of a schema

[Figure]

Explanation: The authorization identifier of the schema to be reorganized is specified in the -t option of the database reorganization utility. The specification format is: -t authorization-identifier.all.

(2) Reorganizing a table containing a large quantity of data

When a table containing a large quantity of data is to be reorganized, you must consider whether or not reorganization with synchronization points set should be executed.

Normally, while a table is being reorganized, transactions cannot be reconciled until storage processing of all the data has been completed. This means that synchronization point dumps cannot be obtained during execution of the database reorganization utility. If HiRDB terminates abnormally during reorganization of a large quantity of data, it will take a long time to restart HiRDB. To resolve this problem, you can set synchronization points at intervals of any number of data items during storage of the data (reload processing) in order to reconcile transactions. This is called reorganization with synchronization points set.

To perform reorganization with synchronization points set, you must specify a synchronization point lines count, which is the number data items to be stored before a synchronization point is set. This value is specified in the option statement of the database reorganization utility.

Synchronization point setting can also be specified in the database load utility; this is called data loading with synchronization points set.

(3) Facility for predicting reorganization time

The decision on whether to reorganize tables or indexes or whether to extend an RDAREA must be made by the user based on using messages that are output and the execution results of the pddbst command to make a comprehensive evaluation of which tables to reorganize and when to reorganize them. It is possible that the user may reorganize a table that does not need to be reorganized or may neglect to reorganize a table that does need reorganizing because of an overlooked message that was output.

To simplify this operation, HiRDB is now able to predict when reorganization will be necessary. The function that performs this prediction is called the facility for predicting reorganization time. Figure 7-13 provides an overview of this facility.

Figure 7-13 Overview of facility for predicting reorganization time

[Figure]

^* A day on which RDAREA maintenance needs to be performed is called a scheduled database maintenance day.

Predicting when reorganization will be needed is divided into two phases:

Phase 1: Get reorganization prediction data
- The pddbst command is executed periodically, and analysis results from the database are accumulated in the database state analyzed table.
- Whenever SQL code or a command is executed, an operation log from the database is output to the database management table.
Phase 2: Analyze reorganization prediction data
Using the database state analyzed table and database management table as the input information, the pddbst command is used to analyze the reorganization prediction data. The user checks the execution results of the pddbst command and performs one of the following operations as needed:
- Uses the pdrorg command to reorganize tables or indexes
- Uses the pdreclaim command to release used free pages and free segments
- Uses the pdmod command to extend an RDAREA
- Uses the pdmod command to automatically extend an RDAREA
- Uses the pdmod command to reinitialize an RDAREA.

The facility for predicting reorganization time also provides the two levels described below. Prediction level 1, which predicts the time required for pddbst to accumulate database analysis results, can be run in a relatively short amount of time. However, Prediction level 2 may require a much longer time to run.

Prediction level 1
This level predicts when an RDAREA will no longer have sufficient free space. This level is designed for users who do not want to reorganize frequently while there is still sufficient free space for the RDAREA.
Prediction level 2
In addition to predicting when an RDAREA will no longer have sufficient free space, this level predicts degradation of storage efficiency in tables and indexes. This level is designed for users who want to reorganize as soon as table or index access performance is reduced, regardless of the amount of free space for the RDAREA.

For details about the facility for predicting reorganization time, see the HiRDB Version 8 System Operation Guide and the section about the database condition analysis utility in the manual HiRDB Version 8 Command Reference.

7.3.1 Table reorganization

(1) Execution units for table reorganization

(a) Reorganization by table

(b) Reorganization by RDAREA

(c) Reorganization by schema

(2) Reorganizing a table containing a large quantity of data

(3) Facility for predicting reorganization time