You must design the following items as part of the data import procedure:
Two import methods are provided: the transaction-based import method and the table-based import method. For details about these import methods, see 3.3.3 Import methods.
You use the startmode or breakmode operand in the import environment definition to specify the import processing method.
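For illustration, an import environment definition entry that selects the import method might look like the following sketch. The operand names come from this section, but the values shown and the `#` comment notation are assumptions; check the operand reference for the exact value syntax.

```
# Import environment definition (sketch; values and comment syntax are assumptions)
startmode = transaction   # assumed value selecting the transaction-based import method
breakmode = table         # assumed value selecting the table-based import method
```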
The transaction-based import method imports update information into the target HiRDB database in the order in which the transactions updated the source database. If the update information is imported into a table with the same format as the source table (same table name, column names, and column attributes), you can omit the import definition when you use the transaction-based import method.
The table-based import method creates an import group of one or more tables subject to import processing and imports data one group at a time. You can define a maximum of 128 import groups per round of import processing. You use the import group definition to specify import groups. For details about the import group definition, see 5.10.6 Import group definition.
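A minimal sketch of an import group definition follows. The `group` entry format, the group names, and the table names are all hypothetical; the actual syntax is given in 5.10.6 Import group definition.

```
# Import group definition (sketch; format and names are hypothetical)
group = grp_orders: ORDERS, ORDER_DETAILS   # one import group covering two tables
group = grp_master: CUSTOMERS               # a second group; up to 128 groups per round
```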
The table-based import method is broken down into the following types:
The table-based partitioning method classifies the transactions updated in the source database into user-defined target groups, and then imports them in parallel.
If you use the table-based partitioning method for import processing and there is a referential constraint among tables, group together those tables that have the referential constraint.
The key range-based partitioning method imports the transactions updated in the source database in parallel on the basis of user-defined key ranges. Note the following when you use the key range-based partitioning method to perform import processing:
The hash partitioning method uses the hash method to import the transactions updated in the source database in parallel.
If the target HiRDB uses the multi-FES facility, the target Datareplicator can execute import processing supporting the multi-FES facility. To use the multi-FES facility, you must specify in the import definition the target front-end servers that correspond to the import groups. The target Datareplicator establishes the correspondence to an SQL process for each target front-end server according to the import definition. This enables you to issue SQL statements in parallel for the various target front-end servers, thereby distributing the workload among the front-end servers.
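As a sketch of this correspondence, an import definition might map each import group to a target front-end server roughly as follows. The format and the server names fes1 and fes2 are assumptions; only the idea that each import group is tied to a front-end server, and thus to its own SQL process, comes from this section.

```
# Import definition (sketch; format and server names are hypothetical)
grp_orders : fes1   # SQL for this group is issued through front-end server fes1
grp_master : fes2   # a separate SQL process serves each target front-end server
```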
The following provides an example of using the multi-FES facility for each partitioning type under the table-based import method and discusses considerations concerning use of the multi-FES facility.
If tables are grouped by server at the target HiRDB, using the table-based partitioning method enables you to issue the SQL statements in parallel for the various target front-end servers. You can expect more of an improvement in throughput than with the other methods. The following figure provides an example of using the multi-FES facility with the table-based partitioning method.
Figure 4-40 Example of using the multi-FES facility with the table-based partitioning method
If a table is stored in different servers by row-partitioning key ranges at the target HiRDB, using the key range-based partitioning method enables you to issue the SQL statements in parallel for the various target front-end servers. You can expect an improvement in throughput. The following figure provides an example of using the multi-FES facility with the key range-based partitioning method.
Figure 4-41 Example of using the multi-FES facility with the key range-based partitioning method
If a table is partitioned at the target HiRDB, using the hash partitioning method enables you to execute the SQL statements in parallel for the various front-end servers in accordance with the hash method. Compared to the key range-based partitioning method, this method can execute the processing for each front-end server faster when the multi-FES facility is used. The following figure provides an example of using the multi-FES facility with the hash partitioning method.
Figure 4-42 Example of using the multi-FES facility with the hash partitioning method
If you employ a multi-FES configuration, the hash partitioning method enables you to set the front-end server to be used to process import data for each partition. This helps to distribute the workload among the front-end servers, and if the front-end server and the back-end server containing the RDAREA are on the same machine, it can also reduce the overhead of communication between the front-end server and the back-end server.
In a multi-FES environment at the target HiRDB, if a target front-end server specified in the import definition and the back-end server that actually executes the import processing are located on different machines, processing is inefficient because communication is required between the front-end server and the back-end server each time import processing occurs. Therefore, when you specify a front-end server in the import definition, specify one that is located on the machine containing the table data.
You must design the interval at which Datareplicator is to issue DISCONNECT requests to the target HiRDB after it detects the end of the update information in the import information queue file. You use the disconnect operand in the import system definition to specify the DISCONNECT-issuance interval for import processing.
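A sketch of the corresponding import system definition entry follows, assuming the interval is given in seconds; the unit and the value are assumptions.

```
# Import system definition (sketch; unit and value are assumptions)
disconnect = 60   # issue DISCONNECT 60 seconds after the end of the update
                  # information in the import information queue file is detected
```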
Consider the following points in specifying the DISCONNECT-issuance interval:
You can use the event facility to implement import operations on the basis of events at the source system. To use the event facility, you define event codes that correspond to the actual events issued at the source system, and you specify these event codes in the target Datareplicator's import environment definition. The following discusses the import operations that can be implemented by the event facility.
The following table shows the import operations that can be implemented by the event facility and their relationship to the import environment definition.
Table 4-42 Import operations that can be implemented and their relationship to the import environment definition
| Type of event | Import operation | Operand to be specified |
|---|---|---|
| Import processing stop event | Stops import processing. | eventspd |
| Transaction-based import event | Switches the import method to the transaction-based import method during import processing. | eventtrn |
| Table-based import event | Switches the import method to the table-based import method during import processing. | eventtbl |
| Transaction-based import restart event | Restarts import processing using the transaction-based import method while import processing is stopped. | eventretrn |
| Table-based import restart event | Restarts import processing using the table-based import method while import processing is stopped. | eventretbl |
| Event to reset the import processing count | Resets the target Datareplicator's import processing count. | eventcntreset |
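Pulling the table together, an import environment definition that enables the event facility might pair each operand with a user-defined event code, as in the following sketch. The event codes and the `operand = value` form are assumptions; the codes must match the events actually issued at the source system.

```
# Import environment definition (sketch; event codes are hypothetical)
eventspd      = EVT_STOP    # stop import processing
eventtrn      = EVT_TRN     # switch to the transaction-based import method
eventtbl      = EVT_TBL     # switch to the table-based import method
eventretrn    = EVT_RE_TRN  # restart with the transaction-based import method
eventretbl    = EVT_RE_TBL  # restart with the table-based import method
eventcntreset = EVT_CNT     # reset the target Datareplicator's import processing count
```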
You use the cmtintvl, trncmtintvl, or tblcmtintvl operand in the import environment definition to specify the interval at which COMMITs are issued to the target HiRDB. This interval is specified in terms of a number of transactions at the source system.
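For illustration, the COMMIT-issuance interval operands might be specified as follows. The values are illustrative, and the assumption that trncmtintvl and tblcmtintvl apply to the transaction-based and table-based import methods respectively is inferred from the operand names, not stated in this section.

```
# Import environment definition (sketch; values are illustrative)
cmtintvl    = 100   # issue a COMMIT to the target HiRDB every 100 source transactions
trncmtintvl = 50    # assumed: interval used with the transaction-based import method
tblcmtintvl = 200   # assumed: interval used with the table-based import method
```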
Consider the following points in specifying the COMMIT-issuance interval:
For update information that is not defined in the import definition, you can design Datareplicator to use the corresponding extraction definition at the source system as the import definition and execute import processing. You use the defmerge operand in the import environment definition to specify the handling of update information that is not defined in the import definition.
Note the following point about specifying the handling of update information that is not defined in the import definition:
You can design the target Datareplicator so that, whenever it starts, it checks the target HiRDB for the tables subject to import processing. You use the tblcheck operand in the import environment definition to specify checking for tables subject to import processing.
When the target Datareplicator starts, it allocates a space in shared memory for storing definition information. You use the defshmsize operand in the import environment definition to specify the size of the shared memory to be used for storing definition information.
Consider the following point in determining the size of the shared memory for storing the definition information:
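Taken together, the three operands above might appear in the import environment definition as in the following sketch. All values, and the assumption that defmerge and tblcheck take y/n settings, are illustrative only.

```
# Import environment definition (sketch; all values are assumptions)
defmerge   = y      # assumed: use the source extraction definition for update
                    # information that has no import definition
tblcheck   = y      # assumed: check the target HiRDB at startup for tables
                    # subject to import processing
defshmsize = 2048   # assumed size (unit unverified) of shared memory for
                    # storing definition information
```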
Of the two locked-resource estimates described below, use whichever is larger as the estimate for the target HiRDB. If data linkage is applied to pdload executed on the source HiRDB, the data loaded by pdload is applied at the target entirely as INSERT statements, and locked resources are required for each INSERT statement. Therefore, if a large number of HiRDB's locked resources is used, import processing on the target Datareplicator might result in an SQL error due to a shortage of locked resources. If you apply data linkage to pdload, estimate the number of locked resources of the target database to be equal to or greater than the number of insert operations performed by pdload.
For details about estimating the number of locked resources, see the manual HiRDB Version 9 System Definition.
Add the number of data linkage identifiers that can be connected to HiRDB concurrently to the value of the pd_max_users operand in the HiRDB definition. The number of concurrent connections to HiRDB for each data linkage identifier defined in the import system definition is as follows:
Add the number of tables that can be accessed in HiRDB concurrently to the value of the pd_max_access_tables operand in the HiRDB definition. The number of tables that can be accessed concurrently is the number of load statements specified in the import definition.
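As a worked sketch of these two adjustments, suppose the import system definition defines 3 data linkage identifiers (assuming, for illustration, one concurrent connection per identifier; the actual per-identifier count follows the rule referenced above) and the import definition contains 20 load statements. The base values 100 and 500 and the `set` notation are hypothetical:

```
# HiRDB definition at the target (sketch; base values are hypothetical)
set pd_max_users         = 103   # 100 ordinary users + 3 concurrent data linkage connections
set pd_max_access_tables = 520   # 500 + 20 load statements in the import definition
```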