It is important in terms of table storage efficiency and processing efficiency to normalize tables. The columns that constitute a table should be examined during table normalization.
This section describes the following normalization topics:
If a table has multiple columns that contain similar data, the table should be normalized so that all the columns contain unique data. Such normalization improves the efficiency of data storage for the table. Figure 13-1 shows an example.
Figure 13-1 Multiple columns in a table containing similar data
The PCODE and PNAME columns in the STOCK table have a one-to-one correspondence before normalization, which means that the data in these columns is redundant. For this case, another table called PRODUCT can be created that consists of the STOCK table's PCODE and PNAME columns. The PRODUCT table is created so that the PCODE and PNAME columns do not contain duplicative data.
If the same table is used by multiple applications, normalization can result in a separate table for each application. This can improve the level of concurrent execution for each table. Figure 13-2 shows an example.
Figure 13-2 Same table used by multiple applications
The PMANAGE (product management) table is used by the inventory management application and the orders management application. The PMANAGE table can be normalized to the STOCK table that is used only by the inventory management application and the ORDERS table that is used only by the orders management application.
If some of a table's columns are accessed frequently and some are not, normalization can result in a table consisting of the columns that are accessed frequently and a table consisting of the columns that are accessed infrequently. Figure 13-3 shows an example.
Figure 13-3 Columns with different access frequencies
Taking the STOCK table shown in Figure 13-3, if the retrieval frequency ratio of the PNO and SQUANTITY columns to the PNAME and PRICE columns is 9:1, normalization of the STOCK table should result in a table (STOCK2) consisting of the columns that are retrieved frequently and a table (PRODUCT) consisting of the columns that are retrieved infrequently.
Assume that 10,000 physical input/output operations are required in order to retrieve all entries in the STOCK table. When the STOCK table is divided into the two tables STOCK2 and PRODUCT, the numbers of physical input/output operations required in order to retrieve all entries in these two tables drops to 4,500 (5,000 0.9) and 500 (5,000
0.1), respectively. As a result, only 5,000 physical input/output operations are required to retrieve all entries, thereby improving the overall table processing efficiency.