Nonstop Database, HiRDB Version 9 Installation and Design Guide


12.21.1 Data compression facility

You can compress the data that HiRDB stores in a table. This is called the data compression facility. Data compression is specified for individual columns. A column in which compressed data is stored is called a compressed column, and a table containing a compressed column is called a compressed table.

Compressing data provides the following advantages:

The following figure provides an overview of data compression.

Figure 12-44 Overview of data compression

[Figure]

Explanation:
There is no need for the user to provide instructions for data compression and expansion because HiRDB performs this processing.
Organization of this subsection
(1) Criteria
(2) Guidelines for data compression efficiency
(3) Files that are output by HiRDB when compressed tables are manipulated
(4) Error handling

(1) Criteria

We recommend compressing tables that contain large variable-length binary data, such as images and audio data. Because compression and expansion add processing overhead, use compressed tables in systems that prioritize storage efficiency over performance.

(2) Guidelines for data compression efficiency

Compression efficiency is the percentage of storage space that compression saves, relative to the pre-compression data length. Use the following formula to determine compression efficiency:


Compression efficiency (%) =
{(pre-compression data length - post-compression data length) ÷ pre-compression data length} × 100

The relationship between the data compression rate and the compression efficiency is as follows:


compression rate + compression efficiency = 100

For details about how to measure the compression rate, see 12.21.7 How to measure the data compression rate.
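The two formulas above can be sketched as a pair of small Python functions. This is an illustration only, not part of HiRDB: the function names and the sample lengths are hypothetical, and the compression rate function is simply derived from the stated relationship (compression rate + compression efficiency = 100).

```python
def compression_efficiency(pre_len: int, post_len: int) -> float:
    """Percentage of storage saved relative to the pre-compression length."""
    if pre_len <= 0:
        raise ValueError("pre-compression data length must be positive")
    return (pre_len - post_len) / pre_len * 100


def compression_rate(pre_len: int, post_len: int) -> float:
    """Derived from: compression rate + compression efficiency = 100."""
    return 100 - compression_efficiency(pre_len, post_len)


# Hypothetical lengths in bytes, chosen only for illustration.
pre_length, post_length = 1000, 250
print("efficiency (%):", compression_efficiency(pre_length, post_length))
print("rate (%):", compression_rate(pre_length, post_length))
```

A post-compression length larger than the pre-compression length yields a negative efficiency, which is the case described in the footnote to Table 12-25.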

The table below provides guidelines for evaluating data compression efficiency. Note that the compression efficiency values shown in the table are only guidelines. The actual data compression rate and compression efficiency depend on the specific data to be compressed.

Table 12-25 Guidelines for compression efficiency

| Data type | Compression efficiency (%) |
|---|---|
| BINARY data consisting of the same characters | 98.51 |
| Completely random BINARY data | -0.36# |
| Text data (.txt) | 58.50 |
| Image data (.bmp) | 75.42 |
| Audio data (.wav) | 9.46 |

#: This compression efficiency is a negative value because a header area is added during compression processing. For details about the compression processing, see 12.21.2 How data is compressed.
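The effect described in this footnote can be reproduced with any general-purpose compressor. The sketch below uses Python's standard zlib module purely as a stand-in; it makes no assumption about HiRDB's actual compression algorithm or header format, so the percentages it prints will not match Table 12-25. It only shows the same tendency: highly repetitive data compresses very well, while random data grows slightly because of the fixed overhead the compressor adds.

```python
import random
import zlib


def efficiency(pre_len: int, post_len: int) -> float:
    """Compression efficiency (%) as defined in this subsection."""
    return (pre_len - post_len) / pre_len * 100


def measure(data: bytes) -> float:
    """Efficiency obtained by compressing data with zlib (stand-in only)."""
    return efficiency(len(data), len(zlib.compress(data)))


SIZE = 100_000  # hypothetical data length in bytes

# Data consisting of the same character.
same_data = b"A" * SIZE

# Pseudo-random data; the seed is fixed so that the run is repeatable.
rng = random.Random(0)
random_data = bytes(rng.getrandbits(8) for _ in range(SIZE))

same_eff = measure(same_data)
random_eff = measure(random_data)

print(f"same characters: {same_eff:.2f}%")
print(f"random data:     {random_eff:.2f}%")
```

The efficiency for the random data is slightly below zero because the output contains the uncompressible input plus the compressor's own framing bytes.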

(3) Files that are output by HiRDB when compressed tables are manipulated

The following table shows the data status in files that are output by HiRDB when compressed tables are manipulated.

Table 12-26 Data status in files that are output by HiRDB

| Processing | File | Data status |
|---|---|---|
| Execution of SQL statements that require a work table file | Work table file | Expanded data |
| Database update processing | System log file | Compressed data |
| System log unload processing | Unload log file | Compressed data |
| Database backup processing (pdcopy command) | Backup file | Compressed data |
| Table reorganization processing (pdrorg -k rorg command) | Unload data file | Compressed data# |
| Table unload processing (pdrorg -k unld command) | Unload data file | Expanded data |

#: If reorganization is performed by using UOC, the expanded data is stored in a table.

(4) Error handling

If an error occurs in an RDAREA containing the data and indexes of a compressed table, you can recover the RDAREA by using the database recovery utility (pdrstr) in the same manner as for normal database recovery.