Hitachi

Hitachi Advanced Database Setup and Operation Guide


2.4.3 DB area structure (segments and pages)

A DB area consists of units called segments and pages. Each DB area consists of multiple segments. Furthermore, each segment consists of multiple pages. The following figure shows the structure of a DB area.

Figure 2‒14: DB area structure

[Figure]

Organization of this subsection

(1) Segment

A segment is the smallest unit that stores a table or index. Each segment can store a single table or index. The following figure shows the relationships that segments can have with tables and indexes.

Figure 2‒15: Relationships that segments can have with tables and indexes

[Figure]

Note that a single segment cannot store multiple tables or multiple indexes. The following figure shows examples of ways in which tables and indexes cannot be stored in segments.

Figure 2‒16: Examples of ways in which tables and indexes cannot be stored in segments

[Figure]

(a) Segment types

Segments are classified according to the page types assigned to them as shown in the table below. For details about page types, see (a) Page types in (2) Pages under 2.4.3 DB area structure (segments and pages).

Table 2‒5: Segment types

No.

Segment type

Description

1

Table (row store format)

Base row segment

Segment in which pages that store base rows (base row pages) are allocated

2

Branch row segment

Segment in which pages that store branch rows (branch row pages) are allocated

3

Table (column store format)

Column-data segment

Segment that stores data in column store format when importing data into a column store table.

The column-data segment is also used to store data that is converted from row store format to column store format by the updated-row columnizing facility. For details about the updated-row columnizing facility, see 11.18 Using the updated-row columnizing facility (maintaining the retrieval performance for column store tables).

4

Row-data segment

Segment that stores branched data (branch rows) when importing data into a column store table.

The row-data segment is also used to store the following types of data:

  • Data added to a column store table by using the INSERT statement

  • Data updated in a column store table by using the UPDATE statement

  • Data indicating that specific data was invalidated by running the UPDATE or DELETE statement

5

B-tree index

Upper page segment

Segment in which root pages and middle pages of a B-tree index are allocated

6

Lower page segment

Segment in which leaf pages, row ID directory pages, and row ID list pages of a B-tree index are allocated

7

Text index

String control segment

Segment in which string control pages of a text index are allocated

8

Position control segment

Segment in which position control pages of a text index are allocated

9

Range index

DB area file control segment

Segment in which DB area file control pages of a range index are allocated

10

Range control segment

Segment in which segment control pages and range control pages of a range index are allocated

(b) Timing at which segments are allocated or released

Timing at which segments are allocated

A segment is allocated when a new page is allocated. If there are no more free pages in a segment that has already been allocated, a new segment is allocated and assigned when a new page is allocated.

In the following cases, however, a new segment is allocated and assigned, even if there are free pages in a segment that has already been allocated:

  • When background import is executed

  • When multiple chunks are merged

  • When a chunk status change swaps the current chunk and creates a new chunk

  • When a system table is reorganized

  • When HADB uses the updated-row columnizing facility to convert data from row store format to column store format

Timing at which segments are released

Allocated segments are released at the following times:

  • When the base table and index are deleted

  • When all row data is deleted from the base table (when the TRUNCATE TABLE statement is executed)

  • When data is imported into the base table and deletion of all existing data is specified

  • When the chunk is deleted

  • When the index is rebuilt

  • When the index data for the merge-source chunks is deleted after the chunks are merged

  • When a chunk is archived

  • When an archivable multi-chunk table is changed to a regular multi-chunk table (when the ALTER TABLE statement is executed)

  • When a system table is reorganized

(2) Pages

A page is the smallest unit for disk I/O operations. Data is read from or loaded to a disk in page units.

(a) Page types

Pages are categorized into some types based on the type of data to be stored. The following table shows the page types.

Table 2‒6: Page types

No.

Page type

Description

1

Data page (row store format)

Base row page

Page for storing the base rows of a table

2

Branch row page

Page for storing the branch rows of a table

3

Data page (column store format)

Entry page

Page for storing the management information related to table data stored in column store format. This page is allocated in a column-data segment.

4

Dictionary page

Page for storing dictionary data when dictionary encoding (DICTIONARY) is selected as the compression type. This page is allocated in a column-data segment.

5

Column-data page

Page for storing column data (excluding branch row data) in column store format. This page is allocated in a column-data segment.

6

Basic row page

When the INSERT statement is used to add data to a column store table, the data is stored on this page as base rows.

Also, when the UPDATE statement is executed for data that is stored in column store format, the updated data is stored on this page as base rows.

This page is allocated in a row-data segment.

7

Branch row page

Page for storing the branch rows of row data stored in column store format. This page is allocated in a row-data segment.

Column data whose actual length is 128 bytes or more is stored in this page as branch row data.

8

Invalid row information page

This page is used to manage information about data invalidated when the UPDATE or DELETE statement was executed for data that is stored in column store format.

This page is allocated in a row-data segment.

9

Allocation control page

A page that is present at the beginning of a row-data segment and controls the allocation of pages within the segment.

10

B-tree index index page#1

Upper page

Root page

The root page is the highest level index page in the B-tree structure.

11

Middle page

Middle level B-tree index page

12

Leaf page

Lower level B-tree index page

13

Row ID directory page

Page for managing row ID lists

14

Row ID list page

Leaf page used exclusively for a key value if that key value is used 256 times or more

15

Text index index page

String control page

Page for managing the types of text-indexed character strings

16

Position control page

Page for managing the positions of text-indexed character strings in pages

17

Range index index page

DB area file control page

Page that stores pointers to relate DB area files to data ranges in the DB area files in a table for which a range index is defined

This page also stores the ranges of data stored in a chunk of a table for which a range index is defined.

18

Segment control page

Page that stores pointers to relate segments to data ranges in the segments in a table for which a range index is defined

19

Range control page

Page that stores the ranges of data stored in a segment of a table for which a range index is defined

20

Work table page#2

Page for work table DB areas The page size for work table DB areas can exceed 32 kilobytes. There are no pages for storing branch rows.

21

Directory page#3

Page for storing the management information related to DB areas

#1

The following figure shows the relationship between a B-tree index and index pages:

[Figure]

#2

A work table page is allocated when you execute an SQL statement that creates a work table.

#3

A predetermined number of directory pages are allocated whenever DB areas are created. After that, new pages are allocated any time the number of data pages or index pages reaches a preset value.

■ Base rows and branch rows

In row store format, an entire row of data is stored on the same page. However, if there is a variable-length column whose definition length exceeds 255 bytes, one row of data might be broken up and stored on separate pages. Rows that are broken up and stored in separate pages are called branch rows. The row containing the information on the branched destination is called the base row. This applies to both the data of a row store table and the data added to a column store table by using an update SQL statement.

When you define a row store table (by executing the CREATE TABLE statement), you can specify whether a variable-length column's data is to be broken up and stored on separate pages.

■ Existence of directory page groups and locations of directory pages

Each DB area file begins with a directory page group, which manages the following types of information:

  • DB area definition information

  • Information about the defined tables and indexes

  • Chunk information

In the first one of the DB area files that make up a DB area, the directory page group includes the directory page that manages the DB area information. Therefore, the locations of directory pages in the first DB area file are different from the locations of directory pages in other DB area files.

A directory page group is located before segments that store tables and indexes. A directory page is located at regular intervals to manage the information about each segment. Therefore, the entire space in the DB area file is not used for segments that store tables and indexes.

The following figure shows an example of the locations of directory pages in DB area files.

Figure 2‒17: Example of directory page placement in DB area files

[Figure]

(b) Timing at which pages are allocated or released

Timing at which pages are allocated

Data pages and index pages are allocated at the following times:

  • When a row is added to, or updated in, a row store table

    A base row page or a branch row page are allocated when a row is added to a table or when a row is updated. Page allocation does not take place when a row store table is defined by executing the CREATE TABLE statement. When an already allocated page is filled to its capacity as a result of repeated operations to add rows, a new page will be reserved and allocated.

    An index page of a range index is allocated when a row is added to the table or when a row is updated.

  • When a row is added to, or updated in, a column store table

    • An entry page, dictionary page, or column-data page is allocated when a row is added to a column store table by executing the adbimport command.

      A basic row page is allocated when a row is added to, or updated in, a column store table by executing an update SQL statement.

      A branch row page is allocated when a row is added to, or updated in, a column store table by executing the adbimport command or an update SQL statement.

      An entry page, dictionary page, column-data page, basic row page, or branch row page is not allocated when a column store table is defined by using the CREATE TABLE statement. A new page is reserved and allocated when an already allocated page is filled to its capacity as a result of repeated operations to add rows.

    • An invalid row information page is allocated when the data in column store format is deleted or updated. The invalid row information page is not allocated when a column store table is defined by using the CREATE TABLE statement. A new page is reserved and allocated when an already allocated page is filled to its capacity as a result of repeated operations to delete or update rows.

      Note that an invalid row information page is not allocated when the data added or updated in row store format by using an update SQL statement is deleted or updated.

    • An index page of a range index is allocated when a row is added to the table or when a row is updated.

  • When HADB uses the updated-row columnizing facility to convert data from row store format to column store format

    When data in a column store table is converted from row store format to column store format, a new page is reserved and allocated.

  • When a B-tree index is defined

    When a B-tree index is defined by executing the CREATE INDEX statement, one upper page and one leaf page are allocated.

  • When index page split occurs in a B-tree index

    A new index page is allocated and assigned every time a B-tree index page split occurs. For details about B-tree index page splits, see (2) B-tree index page splits in 5.3.4 Allocating an unused area inside a B-tree index page (PCTFREE).

  • When a text index is defined

    When a text index is defined by executing the CREATE INDEX statement, two string control pages and two position control pages are allocated.

  • When index page split occurs in a text index

    A new index page is allocated and assigned every time an index page split occurs in a text index. For details about index page splits of text indexes, see (2) Text index page splits in 5.4.2 Allocating an unused area inside a text index page (PCTFREE).

  • When a range index is defined

    When a range index is defined by executing the CREATE INDEX statement, one segment's worth of the DB area file control page is allocated.

Timing at which pages are released

When a segment is released, all pages inside that segment are released.

(c) Page group

When pages are allocated from the range control segment of a range index, HADB allocates multiple pages at once. The pages that are allocated together form a unit called a page group. The number of pages comprising a page group is called the page group size.

To determine the page group size, see the description of the variable PGGRPSIZE in (1) Determining the SGRI variable under 5.8.6 Determining the number of segments for storing each range index.

To check the page group size of a range index, see 10.9.5 Checking the status and usage of range indexes.