12.1 Items to be examined during table design

A HiRDB database is a relational database. The user must examine the design of a table, which is the logical structure of the database.

To begin with, tables must be normalized. Even among tables normalized in the same manner, table processing performance may vary depending on the method used to store the table in user RDAREAs. In some cases, operability may be more important than processing performance; therefore, tables must be designed to achieve expected results. The following table lists items to consider when you are designing a table.

Table 12-1 Items to consider when you are designing a table

Design task and items to be examinedAdvantagesDisadvantagesSection
Table normalizationTable storage efficiency and processing efficiency improve.Processing performance may be reduced if join retrieval of normalized tables is necessary during table retrieval.12.2
Table row partitioningSpecification of table row partitioning
  • Operations can be performed in units of RDAREAs.
  • For HiRDB/Parallel Server, high-speed table access and workload distribution can be achieved because table access processing can be executed concurrently on multiple RDAREAs.
  • The number of RDAREAs increases.
  • If an index for this table is not row-partitioned, the level of concurrent executions is reduced due to index locking.
12.3
Key range partitioning
  • RDAREAs that contain specific table data are known.
  • Data for each application can be stored in a separate RDAREA.
Data cannot be stored uniformly without knowing the key ranges.
Flexible hash partitioning#
  • Data can be stored uniformly without having to know the key ranges.
  • RDAREAs and hash function can be changed easily.
  • It is easy to cope with the addition of CPUs or disks.
  • It is difficult to know which table data is stored in which RDAREA.
  • If a specific key is heavily weighted or duplicated, data cannot be stored uniformly.
  • Uniqueness of the key cannot be checked.
FIX hash partitioning#
  • The RDAREAs to be used to store data are determined by the key values.
  • Data can be stored uniformly in RDAREAs without having to know the key ranges.
  • It is easy to add CPUs or disks.
  • Input data can be stored in the RDAREAs by creating a UAP that uses the hash function for table partitioning.
Once data is stored in a table, user RDAREAs cannot be added, nor can the hash function be changed.
Table matrix partitioning
  • Compared to normal key range partitioning, high-speed SQL processing and reduced operation time can be expected because data partitioned by key ranges can be partitioned further on the basis of the values in a different column.
  • This method is applicable to a wider range of applications because key range partitioning can be combined with hash partitioning.
Compared to normal row partitioning, operation and management become complex because this method allows more detailed partitioning of RDAREAs.12.4
Defining a triggerSQL can execute automatically in response to an operation on a table.None12.5
Creation of view table
  • If other users are given access privileges only for the view table but not the base table, the accessible range of the base table can be restricted on a row or column basis.
  • If a view table is created beforehand using data that can be retrieved by a complicated query, the table referencing operation is simplified.
  • A base table can be referenced or updated via its view table.
None12.6
Specification of FIX attribute
  • If row-by-row interface is used, access performance can be improved, even when there are many columns.
  • Null value can be prohibited as input data for a table with the FIX attribute.
  • If a table contains many columns, the disk space required can be reduced.
None12.7
Specification of primary keyUniqueness constraint and NOT NULL constraint apply to a column for which a primary key is defined.None12.8
Specification of cluster key
  • Input/output time can be reduced when retrieving, updating, or deleting a row with a range specified, or when retrieving or updating data on the basis of a cluster key value.
  • If UNIQUE is specified for the cluster key, all the values in the cluster key column must be unique.
  • The database load utility (pdload) can be used to determine whether input data for a table is sorted in ascending or descending order of the cluster key values.
  • When a table is being reorganized, the database reorganization utility (pdrorg) can be used to determine whether the cluster key for the unloaded row matches the cluster key to be reloaded.
  • Values in the column that constitutes the cluster key cannot be updated.
  • The null value cannot be inserted in the column that constitutes the cluster key.
  • When an attempt is made to add data to a table for which a cluster key is specified, there is overhead for retrieving the page that has the key value adjacent to the specified key value.
12.9
Specification of suppress option
  • Disk space required can be reduced.
  • Input/output time for retrieval processing can be reduced, such as retrieval of all entries.
None12.10
Specification of no-split optionData storage efficiency can be improved, thereby reducing the disk space required.None12.11
Specification of a binary data columnVariable-length large object data can be specified, such as document, image, and audio dataNone12.12
Specification of a character setCharacter string data can be stored in different character sets for each column in a table. This allows the following:
  • When migrating from the VOS3 system to HiRDB, character data stored in a database can be retrieved, substituted, and compared in the collating sequence of character string data in the VOS3 system.
  • Retrieval, substitution, and comparison of character data in UTF-16.
None12.13
Specification of WITHOUT ROLLBACK optionOccurrences of locking can be reduced because the rows subject to update processing are unlocked automatically when table update processing is completed (such as when defining a table used for numbering applications).None12.14
Specification of the falsification prevention facilityPrevents table data errors or invalid updates.There are restrictions on the facilities, SQL, utilities, and commands that can be executed on an RDAREA that contains a falsification prevented table.12.15
Table with repetition column
  • There is no need to join multiple tables.
  • The disk space required can be reduced because duplicated information is eliminated.
  • Better access performance can be achieved than when a separate table is used because related data items (repetition columns) are placed adjacent to each other.
None12.16
Table with abstract data typeData with a complicated structure can be stored in a table to process it as normal table data.None12.17
Shared table
  • Overhead caused by connection and data transfer between multiple back-end servers can be reduced.
  • Efficiency of parallel processing improves, such as when multiple transactions are executed concurrently.
When a shared table is updated, all back-end servers lock the RDAREA that contains the shared table. If another application accesses another table in the same shared RDAREA, deadlock may occur.12.18
Referential constraintsIntegrity checking and data manipulation on data in multiple tables can be automated.When referenced tables and referencing tables are updated, processing time increases because data integrity is checked.12.19
Check constraintsData checking can be automated when data is added or updated.When a table for which a check constraint has been defined is updated, processing time increases because data integrity is checked.12.20
Compressed table
  • Table data storage efficiency improves and the database size can be reduced.
  • Data compression processing need not be provided when UAP is developed because HiRDB compresses data.
When data in compressed columns is manipulated by using SQL statements and utilities, there is overhead for compression and expansion processing.12.21
Temporary table
  • Processing is not affected by other users because a dedicated table is created for each transaction or SQL session.
  • Complex processing can be performed, such as storing intermediate processing results in a temporary table and then obtaining the final results by processing the intermediate results.
  • No postprocessing is needed because temporary tables are deleted automatically.
There is overhead for initializing a temporary table RDAREA when HiRDB starts or when the first INSERT statement is executed on a temporary table.12.22
#: You should use the hash facility for hash row partitioning in the following cases:
  • Hash row partitioning is to be used with the table.
  • The amount of data is expected to increase.
If you add RDAREAs to handle an increase of data for a hash row partitioning table (if you increase the number of table row partitions), data may become uneven among the existing RDAREAs and newly added RDAREAs. The rebalancing facility for hash row partitioning can correct such unevenness of data when you increase the number of table row partitions. For details about the rebalancing facility for hash row partitioning, see the HiRDB Version 9 System Operation Guide.