Hitachi

Hitachi Advanced Database Setup and Operation Guide


2.2.1 Row store tables and column store tables

The two basic types of base tables are row store tables and column store tables, each of which stores data in a different format. You can expect to see an improvement in search performance by using the table type that is appropriate in terms of the intended use of the base table and the operation to which it relates.

Organization of this subsection

(1) What are row store tables and column store tables?

There are two table-data storage formats: row store format and column store format. You can specify the table-data storage format when defining the table. A table defined with row store format as the table-data storage format is called a row store table. A table defined with column store format as the table-data storage format is called a column store table.

Row store tables store data in row store format. Column store tables store data in column store format. However, if the INSERT or UPDATE statement is executed for a column store table, the added or updated data is stored in the column store table in row store format.

■ Relationship between the column store table and table-data storage format
  • If the adbimport command is used to import data into a column store table, the data is stored in the table in column store format.

  • If the INSERT or UPDATE statement is used to add or update data for a column store table, the data is stored in the table in row store format.

Note

All tables defined in versions of HADB server earlier than 04-01 are row store tables.

(2) What is row store format?

A format in which data is stored in the database at the row level is called row store format. In row store format, one row of data is stored in the database as one record. The following figure shows how data is stored in row store format:

Figure 2‒4: Data storage in row store format

[Figure]

Explanation

Note

For details about pages and segments, see 2.4.3 DB area structure (segments and pages).

■ Retrieval methods that benefit from row store tables

Because data is stored at the row level, row store tables are suited to the following manners of data retrieval:

  • Data retrieval that accesses data at the row level

    One example of this manner of data retrieval is executing a SELECT statement with * specified in the selection expression, as follows:

    SELECT * FROM "T1" WHERE "C1">=DATE'2017-09-06'

    Row store format remains suitable when the selection expression specifies almost every column.

  • When retrieving data in a manner that narrows down the search range using B-tree indexes

    When a table search is executed, the HADB server only reads the pages that store rows that are in the search range. Therefore, using a B-tree index to narrow down the search range can reduce the number of pages that need to be loaded.

(3) What is column store format?

A format in which data is stored in the database at the column level is called column store format. In column store format, the data in each column of a table is stored together in the database. The following figure shows how data is stored in column store format:

Figure 2‒5: Data storage in column store format

[Figure]

Explanation

Note

For details about pages and segments, see 2.4.3 DB area structure (segments and pages).

■ Retrieval methods that benefit from column store tables

Because data is stored at the column level, column store tables are suited to the following manners of data retrieval:

  • Searching the data in a specific column in its entirety without significantly narrowing the search range

  • Accessing the data in a specific column within a specific range (such as a specific month or year)

  • Performing operations that involve frequent calculations (such as computing averages and totals) in relation to the value data in a specific column

  • Grouping data in specific columns on a frequent basis

When a table search is executed, the HADB server first finds the segment that stores the column data targeted by the search. The HADB server then reads the pages within that segment that store the column data targeted by the search. When performing operations that involve calculations in relation to the value data in a specific column, the HADB server only needs to access the pages that store the target column data. This allows the number of pages that need to be read to be reduced.

■ Column-data compression types

When data is imported into a column store table, the data in each column is compressed as it is stored in the table. There are several column-data compression types. For details about each column-data compression type, see (4) Column-data compression types for column store tables in 5.2.2 Criteria for selecting row store tables and column store tables.

Note

When importing data into a column store table, HADB server automatically selects the compression type based on the data being imported. You can also specify the column-data compression type when defining the column store table.

■ Relationship between deletion of data in column store format and invalid row information pages

If the DELETE statement is executed for data that is stored in a column store table in column store format, the data is invalidated. The data is not deleted from the disk.

The HADB server uses invalid row information pages to manage the information that indicates specific data has been invalidated. Each time the DELETE statement is executed for a column store table, an invalid row information page is allocated and free space in the data DB area decreases by the size of that page.

Therefore, if the DELETE statement is executed for data in column store format when free space in the data DB area is insufficient, an error might occur.

■ If the INSERT or UPDATE statement is executed for a column store table

If the INSERT or UPDATE statement is executed for a column store table, the added or updated data is stored in the column store table in row store format. If data is stored in the column store table in row store format, characteristics of the column store table might be adversely affected. For example, the retrieval performance might be degraded at the time of access to specific column data, or the data compression rate might be lowered. Therefore, in situations where the INSERT or UPDATE statement might be executed for a column store table, we recommend that you enable the updated-row columnizing facility. If the updated-row columnizing facility is enabled and data is stored in row store format in a column store table, HADB automatically converts the data into column store format. For details about the updated-row columnizing facility, see 11.18 Using the updated-row columnizing facility (maintaining the retrieval performance for column store tables).