Hitachi

Hitachi Advanced Database Setup and Operation Guide


11.4.9 Merging chunks (to reduce the number of chunks)

If the number of chunks increases due to repeated background import, performance might decline during retrieval processing using B-tree indexes and text indexes. When indexes are split into chunks, the total amount of index data increases and the number of I/O operations increases proportionately to the number of chunks.

Therefore, as the number of chunks increases, consider using the adbmergechunk command to reduce the number of chunks by merging them.

Organization of this subsection

(1) Notes on chunk statuses

You can merge the following chunks. (You can merge multiple chunks of the same status.)

For a regular multi-chunk table:
  • You can merge multiple chunks that are in normal status into a single chunk that is in normal status.

  • You can merge multiple chunks that are in wait status into a single chunk that is in wait status.

For an archivable multi-chunk table:
  • You can merge multiple chunks that are in normal status and not in archived state into a single chunk that is in normal status and not in archived state.

  • You can merge multiple archived chunks that are in normal status into a single archived chunk that is in normal status.

  • You can merge multiple chunks that are in wait status and not in archived state into a single chunk that is in wait status and not in archived state.

  • You can merge multiple archived chunks that are in wait status into a single archived chunk that is in wait status.

Important
  • You cannot merge a chunk that is in normal status with a chunk that is in wait status.

  • You cannot merge an unarchived chunk with an archived chunk.

(2) Notes on merging chunks in a column store table

If all of the following conditions are met, reorganize the merge-source chunks before merging them:

If you merge chunks and then reorganize the merge-target chunk, the following problems will arise:

You can check whether update SQL statements have been executed for the merge-source chunks from information about the need for reorganization that can be obtained by running the adbdbstatus command. In the obtained information, if the value of either of the following items is 1 or larger, update SQL statements have been executed for the target chunks:

For details about the adbdbstatus command, see adbdbstatus (Analyze the Database Status) in the manual HADB Command Reference.

(3) Using the adbmergechunk command to merge chunks

The example below uses the adbmergechunk command to merge chunks.

For details about the adbmergechunk command, see adbmergechunk (Merge Chunks) in the manual HADB Command Reference.

■ Chunk merging execution example

The HADB user (ADBUSER01) merges two chunks (with chunk IDs 1 and 2) in the shop table (SHOPSLIST).

adbmergechunk -u ADBUSER01 -p '#HelloHADB_01' -g 2
              -w /home/adbmanager/tmp
              -z /home/adbmanager/merge_file/merge_opt_file.txt
              -m 'January 2014'
              -c 1,2
              SHOPSLIST
Figure 11‒5: Merge chunk processing example

[Figure]

Explanation

Executing the adbmergechunk command merges two merge-source chunks (with chunk IDs of 1 and 2) into a single chunk (with chunk ID 4). When merging is completed, the merge-source chunks are deleted.

Note
■ Delete-pending chunks

If the adbmergechunk command is interrupted after the merge-target chunk is ready but before the merge-source chunks are completely deleted, the merge-source chunks are placed in delete-pending status. At this time, the merge-source chunks will remain undeleted. For example, merge-source chunks might remain undeleted if the HADB server has insufficient memory or when the adbcancel command is executed.

If the adbmergechunk command is executed with NOWAIT specified in the --purge-chunk option, the merge-source chunks will be in delete-pending status from the time when the merge-target chunk is ready until deletion of the merge-source chunks begins. This remains true even when there is still processing that is referencing the merge-chunk target tables. This means that the merge-source chunks will remain undeleted.

You need to manually delete merge-source chunks that are in delete-pending status.

To manually delete merge-source chunks, determine their chunk IDs based on the explanation in (12) Checking for delete-pending chunks under C.9 Searching system tables, and use the PURGE CHUNK statement to delete those chunks.

You can specify a condition (IN predicate, quantified predicate) for the WHERE clause of the PURGE CHUNK statement to delete in a batch the deletion-pending chunks. For an example of specifying the PURGE CHUNK statement, see Example in Specification format and rules for the PURGE CHUNK statement of PURGE CHUNK (delete all rows in a chunk) in the manual HADB Command Reference.

■ Temporary increase in the amount of index data that is associated with chunk merging

While the adbmergechunk command is being executed, index data of both the merge-source chunks and merge-target chunks might exist temporarily. Consequently, the amount of index data might increase temporarily.

During this process, if the adbmergechunk command is interrupted such that there are chunks that remain in delete-pending status, the indexes that were temporarily created also remain. Therefore, you need to manually delete the chunks that remain in delete-pending status.

■ Relationship between merge-source chunk deletion and search/update processing

When the merge-target chunk is completed, the adbmergechunk command checks whether or not there is a search executed on any of the merge-chunk target tables (SQL statements, adbexport command, adbdbstatus command with the --shared-lock option specified, and others). If a search is being performed, the merge-source chunks are deleted after waiting for the completion of all detected searches. For this reason, depending on the search, it might take time for the deletion of merge-source chunks to start. Any searches executed after the merge-target chunk is completed do not affect the deletion of the merge-source chunks.

If, after the completion of the merge-target chunk, a merge-chunk target table needs to be updated without waiting for the completion of merge-source chunk deletion, cancel the adbmergechunk command by using the adbcancel command. After canceling, you can update a table that is to be merge-chunked.

However, the status of the merge-source chunks is delete-pending status; the merge-source chunks are still not deleted. You will have to manually delete merge-source chunks with this status.

■ System chunk

A data DB area contains system chunks that are separate from normal chunks. System chunks make it possible to execute the adbmergechunk command even when no chunks can be created in a base table or data DB area.

When the adbmergechunk command is executed but no chunks can be created in a base table or data DB area, a system chunk is used as the merge-target chunk. Then, when the merge-source chunks are deleted, one of the deleted chunks becomes a system chunk.

(4) Chunks that are retrieved during execution of the adbmergechunk command

The following table describes the chunks that can be retrieved by chunk-ID-based retrieval (the adbexport command with the -c option or the #GETDATA subcommand of the adbsql command) performed during execution of the adbmergechunk command.

Table 11‒3: Chunks that are retrieved during execution of the adbmergechunk command

No.

adbmergechunk command processing status

Chunks to be retrieved

Can it be retrieved?

1

After command start but before completion of the merge-target chunk

After the KFAA90000-I message (start of the adbmergechunk command processing) is output but before the KFAA51242-I message (creation of the merge-target chunk) is output

Merge-source chunks

Y

Merge-target chunk

N

Chunks other than those that are to be merged

Y

2

After completion of the merge-target chunk but before deletion of the merge-source chunks

After the KFAA51242-I message (creation of the merge-target chunk) but before the KFAA80243-I message (deletion of the merge-source chunk) is output

Merge-source chunks (in delete-pending status)

N

Merge-target chunk

Y

Chunks other than those that are to be merged

Y

3

From complete deletion of the merge-source chunks to command termination

After the KFAA80243-I message (deletion of the merge-source chunk) but before the KFAA90001-I message (end of the adbmergechunk command processing) is output

Merge-source chunks

N

Merge-target chunk

Y

Chunks other than those that are to be merged

Y

Legend:

Y: Can be retrieved.

N: Because these chunks cannot be retrieved, the retrieval result is 0.

(5) Notes on merging chunks in wait status

When multiple chunks in wait status (merge-source chunks) are merged, the merge-target chunk becomes a chunk in wait status. During this process, the current chunk is not changed.

Furthermore, when chunks in wait status are merged, depending on the combination of the statuses of the chunks to be merged, the information that is set for the chunk creation date and time in the system table STATUS_CHUNKS differs from the information that is set for the date and time when the current chunk was swapped. Details are provided below.

Note

For details about the system table STATUS_CHUNKS, see C.5 Content of STATUS_CHUNKS.

(a) When only chunks in wait status that were never in normal status in the past are merged

When only chunks in wait status that were never in normal status in the past are merged, the chunk creation date and time and the date and time when the current chunk was swapped are not set in the merge-target chunk.

Figure 11‒6: Example of merging only chunks in wait status that were never in normal status in the past

[Figure]

Explanation

When chunks in wait status (chunks 1 and 2) that were never in normal status in the past are merged, a chunk in wait status (chunk 4) is created as the result of the merge.

During this process, the current chunk is not changed. Further, the chunk creation date and time and the date and time when the current chunk was swapped are not set in chunk 4.

(b) When only chunks in wait status that were in normal status in the past are merged

When only chunks in wait status that were in normal status in the past are merged, the chunk creation date and time and the date and time when the current chunk was swapped are set in the merge-target chunk.

Figure 11‒7: Example of merging only chunks in wait status that were in normal status in the past

[Figure]

Explanation

When chunks in wait status (chunks 1 and 2) that were in normal status in the past are merged, a chunk in wait status (chunk 4) is created as the result of the merge.

During this process, the current chunk is not changed. Further, the chunk creation date and time and the date and time when the current chunk was swapped are set in chunk 4.

  • Information set for the chunk creation date and time

    The oldest date and time information associated with the merge-source chunk is set.

  • Information set for the date and time when the current chunk was swapped

    The most recent date and time information associated with the merge-source chunk is set.

(c) When a chunk in wait status that was never in normal status in the past is merged with a chunk in wait status that was in normal status in the past

When a chunk in wait status that was never in normal status in the past is merged with a chunk in wait status that was in normal status in the past, the chunk creation date and time and the date and time when the current chunk was swapped are set in the merge-target chunk.

Figure 11‒8: Example of merging a chunk in wait status that was never in normal status in the past with a chunk in wait status that was in normal status in the past

[Figure]

Explanation

When a chunk in wait status that was never in normal status in the past (chunk 1) is merged with a chunk in wait status that was in normal status in the past (chunk 2), a chunk in wait status (chunk 4) is created as the merge-target chunk.

During this process, the current chunk is not changed. Furthermore, the chunk creation date and time and the date and time when the current chunk was swapped are set in chunk 4.

  • Information set for the chunk creation date and time

    The oldest date and time information associated with the merge-source chunk is set.

  • Information set for the date and time when the current chunk was swapped

    The most recent date and time information associated with the merge-source chunk is set.

Important

When a chunk in wait status that was never in normal status in the past is merged with a chunk in wait status that was in normal status in the past, changing the merge-target chunk (chunk in wait status) into a chunk in normal status will not make it the current chunk.

Therefore, we do not recommend merging a chunk in wait status that was never in normal status in the past with a chunk in wait status that was in normal status in the past.

(6) Points to consider when specifying the --purge-chunk option of the adbmergechunk command

This subsection explains the points to consider when executing the adbmergechunk command with the --purge-chunk option specified.

For the --purge-chunk option, you can specify WAIT or NOWAIT. Under normal circumstances, specify WAIT (or omit the --purge-chunk option altogether).

If you execute the adbmergechunk command with WAIT specified for the --purge-chunk option, merge chunk processing takes place in the following sequence. This behavior is the same as if the --purge-chunk option were omitted.

■ Merge chunk processing when WAIT is specified for --purge-chunk option
  1. Multiple chunks (merge-source chunks) are merged into a single chunk (the merge-target chunk).

  2. After the merge is completed, if there are no more SQL statements or commands whose processing references the table processed by the merge chunk command, the merge-source chunks are deleted.

    If there are still SQL statements or commands whose processing references the table processed by the merge chunk command, the system will wait to delete the merge-source chunks. The merge-source chunks will be deleted as soon as there is no more processing that references the table processed by the merge chunk command.

  3. When deletion of the merge-source chunks has completed, the adbmergechunk command terminates.

If you specify WAIT for the --purge-chunk option, as described earlier, the system will sometimes wait for other processes before deleting the merge-source chunks. The adbmergechunk command does not terminate during this waiting period. Therefore, processing that updates the table processed by the merge chunk command cannot be executed. For example, you will be unable to perform background import using the adbimport command.

In some circumstances, having the adbmergechunk command wait to delete merge-source chunks when processing remains that references the table processed by the merge chunk command might be problematic. In these circumstances, consider specifying NOWAIT in the --purge-chunk option. If you execute the adbmergechunk command with NOWAIT specified for the --purge-chunk option, merge chunk processing takes place in the following sequence.

■ Merge chunk processing when NOWAIT is specified for --purge-chunk option
  1. Multiple chunks (merge-source chunks) are merged into a single chunk (the merge-target chunk).

    After the merge is completed, if there are no more SQL statements or commands whose processing references the table subject to merge chunk processing, the process advances to step 2.

    If there are still SQL statements or commands whose processing references the table subject to merge chunk processing, the process advances to step 3.

  2. If there are no more SQL statements or commands whose processing references the table processed by the merge chunk command, the merge-source chunks are deleted.

    When deletion of the merge-source chunks has completed, the adbmergechunk command terminates.

  3. If there are SQL statements or commands whose processing references the table processed by the merge chunk command, the adbmergechunk command terminates without deleting the merge-source chunks.

    The merge-source chunks that are not deleted remain as deletion-pending chunks.

If you specify NOWAIT for the --purge-chunk option, as explained earlier, the adbmergechunk command will sometimes terminate without deleting the merge-source chunks. Because the adbmergechunk command terminates, processing that updates the table processed by the merge chunk command can now be executed. For example, you will be able to perform tasks such as background import using the adbimport command. In this manner, specify NOWAIT for the --purge-chunk option lets you avoid situations in which the adbmergechunk command waits to delete merge-source chunks while processing remains that references the table being processed.

However, because the merge-source chunks that are not deleted remain as deletion-pending chunks, you then need to delete them by using the PURGE CHUNK statement. You will be unable to execute the adbmergechunk command until the merge-source chunks are deleted. The number of chunks you can create is also reduced by a number equivalent to the number of remaining merge-source chunks. After the adbmergechunk command terminates, use the PURGE CHUNK statement at an appropriate time to delete the merge-source chunks. To delete merge-source chunks, first determine their chunk IDs based on the explanation in (12) Checking for delete-pending chunks under C.9 Searching system tables. Then, use the PURGE CHUNK statement to delete the desired chunks based on the chunk IDs you determined. You can also obtain the chunk ID of the merge-source chunk from the KFAA96785-E message.

Important

Do not specify NOWAIT in the --purge-chunk option if there will be no opportunity for a PURGE CHUNK statement to delete the remaining merge-source chunks before the next execution of the adbmergechunk command. You cannot execute a PURGE CHUNK statement if processing is pending that references or updates a table processed by the merge chunk command for which merge-source chunks remain.