Hitachi

Hitachi Advanced Database Setup and Operation Guide


11.4.14 Reorganizing a multi-chunk table: Chunk-based reorganization

This section describes how to perform chunk-based reorganization for a multi-chunk table. You can use either of the following methods to perform chunk-based reorganization. Normally, use method 1 to perform chunk-based reorganization.

  1. Reorganizing a table while continuing retrieval

    With this method, you can retrieve (but cannot update) the data in the table that is being reorganized.

    Note that this method requires both the reorganization-target data and reorganization-result data to be stored temporarily. Therefore, the data DB area that stores the reorganization-target table must have free space for storing both types of data. If the free space is insufficient, you cannot perform this method. Therefore, before you perform this method, use the adbdbstatus command to compare the size of the reorganization-target chunk with the size of free space in the data DB area that stores the table. For details about the adbdbstatus command, see adbdbstatus (Analyze the Database Status) in the manual HADB Command Reference.

    For details about reorganization using this method, see (1) Reorganizing a table while continuing retrieval.

  2. Reorganizing a table after stopping retrieval

    If you cannot use method 1 because the data DB area that stores the reorganization-target table does not have sufficient free space, use this method to perform chunk-based reorganization. With this method, you cannot retrieve or update the data in the table that is being reorganized.

    For details about reorganization using this method, see (2) Reorganizing a table after stopping retrieval.

    If only the current chunk is to be reorganized, the data DB area must have as large a free space as the data stored in the current chunk. If the free space is insufficient in the data DB area, secure sufficient free space as explained in 11.10.6 Securing free space in a data DB area.

Note

HADB provides a sample shell script that performs chunk-based reorganization. You can use the sample shell script for only a multi-chunk table that is a column store table. For details, see 11.4.16 Reorganizing a multi-chunk table: Reorganization using a sample shell script.

Organization of this subsection

(1) Reorganizing a table while continuing retrieval

The procedure for chunk-based reorganization is as follows.

Procedure

  1. Make sure that no application programs and commands can update the reorganization-target table.

    Perform the following operations to prevent application programs and commands from updating the reorganization-target table:

    • If commands and jobs that update the reorganization-target table (such as the adbimport and adbmergechunk commands that are periodically executed) are executing on the HADB server machine, stop all of them.

    • Stop all application programs and commands that access the HADB server to update the reorganization-target table.

    We recommend that you perform both of the preceding operations.

  2. Obtain the information about the reorganization-target chunk.

    Perform data retrieval in the system table STATUS_CHUNKS to obtain the following information about the reorganization-target chunk. For details about how to retrieve data from the STATUS_CHUNKS table, see (3) Checking the information about all chunks in a table based on a table name in C.9 Searching system tables.

    • CHUNK_COMMENT (Comment set for a chunk)

    • CHUNK_STATUS (Chunk status)

  3. Obtain the information about the current chunk.

    Perform data retrieval in the system table STATUS_CHUNKS to obtain the following information about the current chunk. For details about how to retrieve data from the STATUS_CHUNKS table, see (16) Checking the information about the current chunk in C.9 Searching system tables.

    • CHUNK_ID (Chunk ID)

      Check whether the value of this item is the same as the chunk ID of the reorganization-target chunk (whether the reorganization-target chunk is the current chunk).

    • CHUNK_COMMENT (Comment set for a chunk)

  4. Export the data in a multi-chunk table in units of chunks.

    Use the adbexport command with the -c option specified to export the data in the multi-chunk table in units of chunks. For details about how to export data in units of chunks, see 11.4.5 Exporting data in units of chunks.

  5. Use the background-import facility to import the exported data as chunks in wait status.

    Perform background import of the data exported in step 4 by executing the adbimport command with the following options specified:

    • -b option

    • --status wait option

      If the --status wait option is specified, the command creates a chunk in wait status and stores data in the chunk.

    • -m option

      Specify the chunk comment that you checked in step 2.

    For details about background import that creates a chunk in wait status, see 11.4.4 Temporarily excluding data to be imported to a multi-chunk table from retrieval (creating a chunk in wait status).

    Important

    When you execute the adbimport command, make sure that the value you specify for the import option (adb_import_rthd_num) satisfies the formula shown in (3) Estimating the value to be specified for the import option. If you specify a value that does not satisfy the formula, the data storage efficiency might degrade.

  6. Change the chunk's status.

    If the reorganization-target chunk that you checked in step 2 is in wait status, you can skip this step. Proceed to step 7.

    If the reorganization-target chunk that you checked in step 2 is in normal status, execute the adbchgchunkstatus command with both the -w and -n options specified. Change the status of two types of chunks at the same time.

    • For the -w option, specify the chunk that manages the data exported in step 4 (the chunk subject to reorganization).

      Change the status of the chunk from normal status to wait status.

    • For the -n option, specify the chunk that was imported by background import processing in step 5 and is in wait status.

      Change the status of the chunk from wait status to normal status.

    For details about how to change the chunk status, see 11.4.12 Changing the chunk status.

  7. Delete the chunk subject to reorganization.

    Use the PURGE CHUNK statement to delete the chunk that you exported in step 4. Note that you cannot execute the PURGE CHUNK statement simultaneously with data retrieval from the multi-chunk table. Therefore, execute the PURGE CHUNK statement for the multi-chunk table when no operations are being performed for the table.

    For details about how to delete chunks by using the PURGE CHUNK statement, see 11.4.6 Deleting data in units of chunks.

    Notes on the current chunk
    • If the current chunk is reorganized, you can skip steps 8 and 9. Proceed to step 10.

    • When a chunk other than the current chunk is reorganized, the current chunk changes. If you want to change the current chunk back to the one before reorganization, perform steps 8 and 9.

    • If it is not necessary to change the current chunk to the one that was the current chunk before reorganization, skip steps 8 and 9. Proceed to step 10.

  8. Use the adbimport command to create an empty chunk (a chunk containing no data).

    Execute the adbimport command with the -b option to perform background import, specifying an empty file as the input data file.

  9. Merge chunks.

    Use the adbmergechunk command to merge the empty chunk that you created in step 8 and the chunk that was the current chunk before reorganization. At this time, for the -m option in the adbmergechunk command, specify the comment (that you checked in step 3) on the chunk that was the current chunk before reorganization.

    When merging of the chunks finishes, the chunk that was the current chunk before reorganization becomes the current chunk again.

  10. Make sure that the application programs and commands can update the reorganization-target table.

    Perform the following operation to enable application programs and commands to update the reorganization-target table:

    • If you stopped commands and jobs running on the HADB server machine in step 1

      Restart the commands and jobs that were stopped.

    • If you stopped application programs and commands that access the HADB server in step 1

      Restart the application programs and commands that you stopped.

Important
  • If you perform chunk-based reorganization in accordance with the procedure described here, the chunk ID of the reorganization-target chunk changes after reorganization. However, the configuration, status, and comment of the chunk do not change.

  • Do not execute an update SQL statement or a command that updates a table (such as the adbimport or adbmergechunk command) for a table that is being reorganized. If you do so, the changes made by the SQL statement or command might be lost after reorganization.

(2) Reorganizing a table after stopping retrieval

The procedure for chunk-based reorganization is as follows.

Procedure

  1. Make sure that no application programs and commands can access the reorganization-target table.

    Perform the following operations to prevent application programs and commands from accessing the reorganization-target table:

    • Execute the adbchgsrvmode command with the --offline option specified to change the HADB server operation mode to offline mode. Furthermore, if commands and jobs that update the target table (such as the adbimport and adbmergechunk commands that are periodically executed) are executing on the HADB server machine, stop all of them.

    • Stop all application programs and commands that access the HADB server.

    We recommend that you perform both of the preceding operations.

  2. Obtain the information about the reorganization-target chunk.

    Perform data retrieval in the system table STATUS_CHUNKS to obtain the following information about the reorganization-target chunk. For details about how to retrieve data from the STATUS_CHUNKS table, see (3) Checking the information about all chunks in a table based on a table name in C.9 Searching system tables.

    • CHUNK_COMMENT (Comment set for a chunk)

    • CHUNK_STATUS (Chunk status)

    When reorganizing multiple chunks, obtain the information about all the chunks to be reorganized.

  3. Obtain the information about the current chunk.

    Perform data retrieval in the system table STATUS_CHUNKS to obtain the following information about the current chunk. For details about how to retrieve data from the STATUS_CHUNKS table, see (16) Checking the information about the current chunk in C.9 Searching system tables.

    • CHUNK_ID (Chunk ID)

      Check whether the value of this item is the same as the chunk ID of the reorganization-target chunk (whether the reorganization-target chunks include the current chunk).

    • CHUNK_COMMENT (Comment set for a chunk)

    Important

    About the subsequent procedure

    • If the reorganization-target chunks do not include the current chunk:

      Perform steps 4 to 6 for each chunk. Then, proceed to step 7.

    • If the reorganization-target chunks include the current chunk and other chunks:

      • First, perform steps 4 to 6 for each of the chunks that are not the current chunk.

      • Then, perform steps 4 to 6 for the current chunk.

      • After you complete the preceding operations, proceed to step 8.

    • If the current chunk is the only reorganization-target chunk:

      Perform step 4, step 6, and step 5, in this order. Then, proceed to step 8.

  4. Export the data in a multi-chunk table in units of chunks.

    Use the adbexport command with the -c option specified to export the data in the multi-chunk table in units of chunks. For details about how to export data in units of chunks, see 11.4.5 Exporting data in units of chunks.

  5. Delete the chunk subject to reorganization.

    Use the PURGE CHUNK statement to delete the reorganization-target chunk from which data was exported in step 4.

    For details about how to delete chunks by using the PURGE CHUNK statement, see 11.4.6 Deleting data in units of chunks.

  6. Import the exported data into the multi-chunk table by background import.

    • If the reorganization-target chunk that you checked in step 2 is in normal status:

      Perform background import of the data exported in step 4 by executing the adbimport command with the following options specified:

      • -b option

      • -m option

        Specify the chunk comment that you checked in step 2.

      For details about background import, see 11.4.2 Storing data in a multi-chunk table (background import).

    • If the reorganization-target chunk that you checked in step 2 is in wait status:

      Perform background import of the data exported in step 4 by executing the adbimport command with the following options specified:

      • -b option

      • --status wait option

        If the --status wait option is specified, the command creates a chunk in wait status and stores data in the chunk.

      • -m option

        Specify the chunk comment that you checked in step 2.

      For details about background import that creates a chunk in wait status, see 11.4.4 Temporarily excluding data to be imported to a multi-chunk table from retrieval (creating a chunk in wait status).

    Important

    When you execute the adbimport command, make sure that the value you specify for the import option (adb_import_rthd_num) satisfies the formula shown in (3) Estimating the value to be specified for the import option. Specifying a value that does not satisfy the formula might adversely affect data storage efficiency.

  7. Change the current chunk.

    The status of the current chunk changes when reorganization is performed. If you want to change the current chunk back to the one before reorganization, perform the following operations. If it is not necessary to change the current chunk back to the one before reorganization, you can skip the following operations. Proceed to step 8.

    • Use the adbimport command to create an empty chunk (a chunk containing no data).

      Execute the adbimport command with the -b option to perform background import, specifying an empty file as the input data file.

    • Merge chunks.

      Use the adbmergechunk command to merge the empty chunk that you created by the preceding operations and the chunk that was the current chunk before reorganization. At this time, for the -m option in the adbmergechunk command, specify the comment (that you checked in step 3) on the chunk that was the current chunk before reorganization.

    When merging of the chunks finishes, the chunk that was the current chunk before reorganization becomes the current chunk again.

  8. Make sure that application programs and commands can access the reorganization-target table.

    Perform the following operation to enable application programs and commands to access the reorganization-target table:

    • If you changed the HADB server's operation mode to offline mode in step 1

      Execute the adbchgsrvmode command with the --normal option specified to change the HADB server operation mode to normal mode. If you stopped commands and jobs that were running on the HADB server machine, restart them.

    • If you stopped application programs and commands that access the HADB server in step 1

      Restart the application programs and commands that you stopped.

Important
  • If you perform chunk-based reorganization in accordance with the procedure described here, the chunk ID of the reorganization-target chunk changes after reorganization. However, the configuration, status, and comment of the chunk do not change.

  • Do not execute an update SQL statement or a command that updates a table (such as the adbimport or adbmergechunk command) for a table that is being reorganized. If you do so, the changes made by the SQL statement or command might be lost after reorganization.

(3) Estimating the value to be specified for the import option

The formula for estimating the value to be specified for the import option (adb_import_rthd_num) differs depending on the reason why reorganization was performed. Use the appropriate formula according to the reason.

■ If reorganization was performed because update and deletion of rows were performed repeatedly (when the multi-chunk table was a row store table)

When you execute the adbimport command, make sure that the value you specify for the import option (adb_import_rthd_num) satisfies the following formula. If you specify a value that does not satisfy the formula, the data storage efficiency might degrade.

Formula

value-specified-for-import-option-adb_import_rthd_num ≤
    ↑ number-of-segments-to-be-reorganized × data-storage-efficiency ↑ + 1
number-of-segments-to-be-reorganized

Substitute the number of segments used in the reorganization-target chunk.

Execute the adbdbstatus command with the -d used option specified to output the information about the usage of DB areas, tables, and indexes. Then, check the information output as Used_segments (number of used segments for each chunk) by using Chunk_ID (chunk ID) as a key.

For details about the adbdbstatus command, see adbdbstatus (Analyze the Database Status) in the manual HADB Command Reference.

If you reorganize the entire table, obtain the total number of segments used in all chunks in the table.

data-storage-efficiency

Substitute the data storage efficiency of the reorganization-target chunk. Substitute the value obtained in (a) In a case where the multi-chunk table is a row store table in (2) How to check whether reorganization is necessary in 11.4.13 Checking whether a multi-chunk table needs to be reorganized.

If you reorganize the entire table, calculate the data storage efficiency of all chunks in the table.

■ If reorganization was performed because addition, update, and deletion of rows were performed repeatedly (when the multi-chunk table was a column store table)

When you execute the adbimport command, make sure that the value you specify for the import option (adb_import_rthd_num) satisfies the following formula. If you specify a value that does not satisfy the formula, the data storage efficiency might degrade.

Formula

value-specified-for-import-option-adb_import_rthd_num ≤
    ↑ number-of-segments-to-be-reorganized × number-of-rows-to-be-reorganized
       ÷ number-of-rows-stored-in-column-data-segment ↑ + 1
number-of-segments-to-be-reorganized

Substitute the number of column-data segments used in the reorganization-target chunk. Execute the adbdbstatus command with the -d used option specified to output the information about the usage of DB areas, tables, and indexes. Then, check the information output as Used_segments (number of used segments) for the rows whose Segment_type (segment type) is Column_data, by using Chunk_ID (chunk ID) as a key.

For details about the adbdbstatus command, see adbdbstatus (Analyze the Database Status) in the manual HADB Command Reference.

If you reorganize the entire table, obtain the total number of column-data segments used in all chunks in the table.

number-of-rows-to-be-reorganized

Substitute the number of data items in the reorganization-target chunk.

If you reorganize the entire table, substitute the number of data items in the table.

You can obtain the number of data items in a chunk or table by executing the #GETCOUNT subcommand of the adbsql command.

number-of-rows-stored-in-column-data-segment

Substitute the number of rows that were imported into the reorganization-target chunk by using the adbimport command. Alternatively, substitute the number of rows converted from row store format to column store format by using the updated-row columnizing facility.

Execute the adbdbstatus command to obtain information about the need for reorganization. The value of Column_data_num for the reorganization-target chunk is the number of rows stored in the column-data segment.

To reorganize an entire table, sum the Column_data_num values for all chunks in the table.

■ If reorganization was performed because a small amount of data was repeatedly added by background import

When you execute the adbimport command, make sure that the value you specify for the import option (adb_import_rthd_num) satisfies the following formula. If you specify a value that does not satisfy the formula, the data storage efficiency might degrade.

Formula

value-specified-for-import-option-adb_import_rthd_num ≤
    ↑ number-of-segments-to-be-reorganized × data-storage-efficiency ↑ + 1
number-of-segments-to-be-reorganized
  • To reorganize a row store table:

    Substitute the number of segments used in the reorganization-target chunk.

    Execute the adbdbstatus command with the -d used option specified to output the information about the usage of DB areas, tables, and indexes. Then, check the information output as Used_segments (number of used segments for each chunk) by using Chunk_ID (chunk ID) as a key.

    For details about the adbdbstatus command, see adbdbstatus (Analyze the Database Status) in the manual HADB Command Reference.

    If you reorganize the entire table, obtain the total number of used segments of all chunks in the table.

  • To reorganize a column store table:

    Substitute the number of column-data segments used in the reorganization-target chunk. Execute the adbdbstatus command with the -d used option specified to output the information about the usage of DB areas, tables, and indexes. Then, check the information output as Used_segments (number of used segments) for the rows whose Segment_type (segment type) is Column_data, by using Chunk_ID (chunk ID) as a key.

    For details about the adbdbstatus command, see adbdbstatus (Analyze the Database Status) in the manual HADB Command Reference.

    If you reorganize the entire table, obtain the total number of used column-data segments in all chunks in the table.

data-storage-efficiency

Substitute the data storage efficiency of the reorganization-target chunk. Substitute the value obtained in (c) If a small amount of data is repeatedly added by background import in (2) How to check whether reorganization is necessary in 11.4.13 Checking whether a multi-chunk table needs to be reorganized.

If you reorganize the entire table, calculate the data storage efficiency of all chunks in the table.