8.8 Monitoring UAP status (skipped effective synchronization point dump monitoring facility)

When a UAP updates a database continuously because of an infinite loop, many overwrite disabled system log files are created because the synchronization point dumps cannot be validated. Once all the system log files have been placed in overwrite disabled status, HiRDB terminates abnormally.

If HiRDB is terminated abnormally or forcibly while more than half of all system log files are in overwrite disabled status, HiRDB cannot be restarted because there are too few system log files for rollback processing. In this case, new system log files must be added in order to restart HiRDB. This also results in a longer restart processing time.

To avoid such problems, HiRDB provides the skipped effective synchronization point dumps monitoring facility.

Organization of this section
(1) Skipped effective synchronization point dumps monitoring facility
(2) Value to be specified in the pd_spd_syncpoint_skip_limit operand
(3) Calculation methods
(4) When the value of the pd_spd_syncpoint_skip_limit operand is not appropriate
(5) When not to use the skipped effective synchronization point dumps monitoring facility
(6) Transactions that are not rolled back
(7) Notes
(8) If HiRDB Datareplicator is being used

(1) Skipped effective synchronization point dumps monitoring facility

If a UAP enters an infinite loop, synchronization point dump validation processing may be skipped several times in a row. When the number of consecutively skipped validations reaches a set value, the offending transaction is stopped forcibly and rolled back. This capability is provided by the skipped effective synchronization point dumps monitoring facility. To use this facility, specify the pd_spd_syncpoint_skip_limit operand.

(2) Value to be specified in the pd_spd_syncpoint_skip_limit operand

Normally, the pd_spd_syncpoint_skip_limit operand is set to 0. When its value is 0, HiRDB automatically calculates the maximum number of times skipping is permitted. However, if the system becomes unstable with 0 specified, use one of the methods shown in (3) below to calculate an appropriate value for this operand.

When any one of the following conditions is satisfied, the value calculated by either of the methods described in (3) will provide a more precise result than the value HiRDB calculates:

Also in the following cases you should not use automatic calculation, but should calculate a value using one of the methods described in (3):

(3) Calculation methods

The following two calculation methods are provided; use one of them to calculate an appropriate value to use:

Specify in the pd_spd_syncpoint_skip_limit operand a value that is slightly smaller than the value obtained by these calculations. The value specified in this operand takes effect following the first synchronization point dump validation performed after the next HiRDB startup (or restart).

(a) Method based on the byte count of the output system logs

Use the following formula to obtain the value:

Formula
{(⌈a ÷ b⌉ ÷ c) ÷ d} - 1
a:
Sum (in bytes) of the system log information output by the transaction that updates the largest volume of database data and the system log information output by other transactions that are executing concurrently. For details about how to obtain the byte count of system logs, see the manual HiRDB Version 8 Installation and Design Guide.
When HiRDB Datareplicator is being used
When data replication transaction processing on the HiRDB being updated takes a long time, this monitoring facility may roll back the data replication transaction. Therefore, you must also add the byte count of system logs output by data replication transaction processing. Add the value obtained by the following formula:
byte-count-of-system-logs-output-by-data-replication-transaction-processing = [Figure] (byte-count-of-system-logs-output-by-the-transaction-that-updates-the-largest-amount-of-database-data)
[Figure] represents the log volume output by transactions as specified in the cmtintvl operands (trncmtintvl and tblcmtintvl) of the HiRDB Datareplicator replication environment definitions.
b:
System log file record length. You can obtain the record length with the pdlogls command.
c:
Average number of records in a system log block. Normally, this is roughly 3 × 4096 ÷ b. You can use the following formula to obtain a more precise value:
⌈average-length-of-system-log-output-block ÷ b⌉
You can obtain the average length of the system log output blocks from the system activity statistical information produced by the statistics analysis utility (OUTPUT BLOCK LENGTH).
d:
Value of the first parameter of the pd_log_sdinterval operand (which specifies the synchronization point dump acquisition interval in terms of the volume of system logs that are output).
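
As a rough illustration, method (a) can be sketched in Python. This is a minimal sketch under stated assumptions, not HiRDB tooling: the function and parameter names are hypothetical, the formula is read as {(⌈a ÷ b⌉ ÷ c) ÷ d} - 1, the bracketed divisions are assumed to round up, and d is assumed to be expressed in the same unit as the resulting block count. The input values are made up for the example.

```python
import math


def records_per_block(record_length, avg_block_length=None):
    """Term c: average number of records in a system log block."""
    if avg_block_length is None:
        # Rough default quoted in the text: 3 x 4096 / b.
        return 3 * 4096 / record_length
    # More precise value from OUTPUT BLOCK LENGTH (rounding up is an assumption).
    return math.ceil(avg_block_length / record_length)


def skip_limit_method_a(log_bytes, record_length, sd_interval, avg_block_length=None):
    """Evaluate {(ceil(a / b) / c) / d} - 1 under the assumptions above."""
    records = math.ceil(log_bytes / record_length)  # a / b, rounded up
    blocks = records / records_per_block(record_length, avg_block_length)  # / c
    return blocks / sd_interval - 1  # / d, then subtract 1


# Hypothetical inputs: 409,600,000 bytes of logs, 4096-byte records,
# and a pd_log_sdinterval first parameter of 1000.
estimate = skip_limit_method_a(409_600_000, 4096, 1000)
suggested = math.floor(estimate)  # round down to stay slightly below the estimate
print(suggested)
```

Rounding the estimate down is one way to follow the advice to specify a value slightly smaller than the calculated one; a larger safety margin may be appropriate for a given system.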
(b) Method based on the byte count of all system logs

Use the following formula to obtain the value:

Formula
{(a × b × c) ÷ d} ÷ e
a:
Number of system log files that can be placed in swappable target status while HiRDB is running.
b:
Number of records in a system log file. If the number of records differs between files, use the average number of records.
c:
Ratio of skipped synchronization point dump validations. Use the ratio of the number of files placed in overwrite disabled status to the total number of system log files.
  • For a HiRDB/Single Server, use a value of 0.333 or less. If the number of guaranteed-valid generations is 2, use a value of 0.167 or less.
  • For a back-end server, use a value of 0.333 or less. If the number of guaranteed-valid generations is 2, use a value of 0.167 or less.
  • For the dictionary server, use a value of approximately 0.5.
  • For a front-end server, use a value of approximately 0.7.
d:
Average number of records in a system log block. Normally, this is roughly 3 × 4096 ÷ f. You can use the following formula to obtain a more precise value:
⌈average-length-of-system-log-output-block ÷ f⌉
You can obtain the average length of the system log output blocks from the system activity statistical information produced by the statistics analysis utility (OUTPUT BLOCK LENGTH).
e:
Value of the first parameter of the pd_log_sdinterval operand (which specifies the synchronization point dump acquisition interval in terms of the volume of system logs that are output).
f:
System log file record length. You can obtain the record length with the pdlogls command.
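
Method (b) can be sketched the same way. Again, this is only an illustrative sketch: the function and parameter names are hypothetical, the formula is read as {(a × b × c) ÷ d} ÷ e, d is taken to be the average number of records in a system log block, the division inside d is assumed to round up, and the input values are made up.

```python
import math


def records_per_block(record_length, avg_block_length=None):
    """Term d: average number of records in a system log block."""
    if avg_block_length is None:
        # Rough default quoted in the text: 3 x 4096 / f.
        return 3 * 4096 / record_length
    # More precise value from OUTPUT BLOCK LENGTH (rounding up is an assumption).
    return math.ceil(avg_block_length / record_length)


def skip_limit_method_b(swappable_files, records_per_file, skip_ratio,
                        record_length, sd_interval, avg_block_length=None):
    """Evaluate {(a x b x c) / d} / e under the assumptions above."""
    records = swappable_files * records_per_file * skip_ratio  # a x b x c
    blocks = records / records_per_block(record_length, avg_block_length)  # / d
    return blocks / sd_interval  # / e


# Hypothetical single-server inputs: 12 swappable files of 50,000 records each,
# a skip ratio of 0.333, 4096-byte records, and an interval of 5000.
estimate = skip_limit_method_b(12, 50_000, 0.333, 4096, 5000)
suggested = math.floor(estimate)  # round down to stay slightly below the estimate
print(suggested)
```

The skip ratio is the only term that depends on the server type, so the same function can be reused for a back-end server, dictionary server, or front-end server by passing the corresponding ratio from the list above.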

(4) When the value of the pd_spd_syncpoint_skip_limit operand is not appropriate

If the specified value is too large, it may not be possible to overwrite any of the system log files. If this happens, HiRDB terminates abnormally and cannot be restarted unless new system log files are added.

If the specified value is too small, there may be an increase in the number of transactions that are rolled back forcibly.

(5) When not to use the skipped effective synchronization point dumps monitoring facility

  1. During batch processing that updates a large amount of data, when the amount of system log information output before the commit statement is issued exceeds one-third of the total size of all system log files.
  2. When the sum of the amount of system log information output by the transaction that updates the largest amount of database data and the amount of system log information output by transactions that execute concurrently is more than one-third of the total size of all system log files.

For details on obtaining the amount of system log information, see the manual HiRDB Version 8 Installation and Design Guide.
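
The two conditions above amount to a comparison against one-third of the total system log file size. The following Python sketch shows that check; the function and parameter names are hypothetical, the byte counts are assumed to have been obtained as described in the Installation and Design Guide, and the batch transaction in condition 1 is assumed to be the transaction that outputs the most log data.

```python
def should_avoid_facility(largest_txn_log_bytes, concurrent_txn_log_bytes,
                          total_log_file_bytes):
    """Return True when either condition in (5) holds."""
    # Condition 1: one transaction alone outputs more than one-third
    # of the total system log file size before it commits.
    batch_too_large = largest_txn_log_bytes * 3 > total_log_file_bytes
    # Condition 2: the largest transaction plus concurrently executing
    # transactions together exceed one-third of the total size.
    combined = largest_txn_log_bytes + concurrent_txn_log_bytes
    combined_too_large = combined * 3 > total_log_file_bytes
    return batch_too_large or combined_too_large


# Hypothetical sizes in bytes, with 3,000,000,000 bytes of system log files.
print(should_avoid_facility(800_000_000, 300_000_000, 3_000_000_000))
print(should_avoid_facility(500_000_000, 200_000_000, 3_000_000_000))
```

In the first call the combined log volume exceeds one-third of the total, so the facility should not be used; in the second call neither condition holds.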

(6) Transactions that are not rolled back

Even if the number of consecutively skipped synchronization point dumps exceeds the value specified in the pd_spd_syncpoint_skip_limit operand, the following transactions are not rolled back:

(7) Notes

If you execute the pdlogswap command several times consecutively while HiRDB is running, the number of system log files that can be used as primary files decreases. This makes it more likely that a unit will terminate abnormally because there are too few system log files.

(8) If HiRDB Datareplicator is being used

If data replication transaction processing on the HiRDB being updated (target HiRDB) takes a long time, the skipped effective synchronization point dumps monitoring facility may roll back the data replication transaction. If this happens, the target HiRDB issues the KFPS00993-I message (REQUEST= abnormal_tran_end), and the instance of the HiRDB Datareplicator on the target HiRDB side issues the KFRB03007-W and KFRB03013-I messages. The following shows how to handle this situation:

Procedure
  1. Use the pdstop command to terminate the target HiRDB normally.*
  2. Change the value of the pd_spd_syncpoint_skip_limit operand. For details about an appropriate value to specify, see (3)(a) Method based on the byte count of the output system logs.
  3. Determine whether the number of system log file generations satisfies the following condition; if it does not, add enough system log files to satisfy this condition:
    value-of-pd_spd_syncpoint_skip_limit-operand-after-change [Figure] number-of-system-log-file-generations [Figure] 3
  4. Use the pdstart command to start the target HiRDB normally.
  5. Use the hdsrfctl command of the HiRDB Datareplicator on the target HiRDB side to re-execute the data reflection transaction.
* When you use the system reconfiguration command (pdchgconf command), you do not need to restart HiRDB normally, because the pdchgconf command allows you to modify HiRDB system definitions while HiRDB is running. Note that HiRDB Advanced High Availability is required in order to use this command. For details about modifying HiRDB system definitions while HiRDB is running, see 9.2 Modifying HiRDB system definitions while HiRDB is running (system reconfiguration command).

Figure 8-13 shows the operational flow when a data replication transaction is rolled back forcibly by the skipped effective synchronization point dumps monitoring facility.

Figure 8-13 Operational flow when a data replication transaction is forcibly rolled back by the skipped effective synchronization point dumps monitoring facility

[Figure]