Nonstop Database, HiRDB Version 9 System Operation Guide


8.8 Monitoring UAP status (skipped effective synchronization point dump monitoring facility)

When a UAP updates a database continuously because of an infinite loop, many overwrite disabled system log files are created because the synchronization point dumps cannot be validated. Once all the system log files have been placed in overwrite disabled status, HiRDB terminates abnormally.

If HiRDB is terminated abnormally or forcibly while more than half of all system log files are in overwrite disabled status, HiRDB cannot be restarted because there are too few system log files for rollback processing. In this case, new system log files must be added in order to restart HiRDB. This also results in a longer restart processing time.

To avoid such problems, HiRDB provides the skipped effective synchronization point dump monitoring facility.

Organization of this section
(1) Skipped effective synchronization point dump monitoring facility
(2) Value to be specified in the pd_spd_syncpoint_skip_limit operand
(3) Calculation methods
(4) Method of checking the value of the pd_spd_syncpoint_skip_limit operand
(5) When the value of the pd_spd_syncpoint_skip_limit operand is not appropriate
(6) When not to use the skipped effective synchronization point dump monitoring facility
(7) Transactions that are not rolled back
(8) Cases in which skipping is not included in the skip count
(9) Notes
(10) If HiRDB Datareplicator is being used

(1) Skipped effective synchronization point dump monitoring facility

If a UAP enters an infinite loop, synchronization point dump validation processing might be skipped several times in succession. When the number of consecutively skipped validations reaches a set value, the offending transaction is stopped forcibly and rollback processing is performed. This capability is provided by the skipped effective synchronization point dump monitoring facility. To use this facility, specify the pd_spd_syncpoint_skip_limit operand.
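The counting behavior described above can be pictured with a small Python model. This is only a conceptual sketch, not HiRDB's implementation: the function name and the list of validation results are hypothetical, and it assumes that a successful validation resets the consecutive skip count and that the limit is triggered when the count reaches the set value.

```python
def first_forced_rollback(validated, skip_limit):
    """Return the index of the synchronization point at which the
    consecutive-skip count first reaches skip_limit, or None if it never does.

    validated:  sequence of booleans, one per synchronization point dump;
                True means the dump was validated, False means it was skipped.
    skip_limit: hypothetical stand-in for pd_spd_syncpoint_skip_limit.
    """
    consecutive_skips = 0
    for index, ok in enumerate(validated):
        if ok:
            consecutive_skips = 0          # a validated dump resets the count
        else:
            consecutive_skips += 1
            if consecutive_skips >= skip_limit:
                return index               # the transaction would be rolled back here
    return None


history = [True, False, False, True, False, False, False]
print(first_forced_rollback(history, skip_limit=3))  # prints 6
```

In this model the two skips at positions 1 and 2 do not trigger a rollback, because the validation at position 3 resets the count.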

(2) Value to be specified in the pd_spd_syncpoint_skip_limit operand

Normally, the pd_spd_syncpoint_skip_limit operand is set to 0. When its value is 0, HiRDB automatically calculates the maximum number of times skipping is permitted. However, if system instability occurs with 0 specified, use one of the methods shown in subsection (3) below to calculate an appropriate value for this operand.
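As an illustration, a definition line that selects automatic calculation might look like the following. This fragment assumes the usual set form of HiRDB system definition operands; check the HiRDB Version 9 System Definition manual for the exact syntax and for the definition files in which the operand can be specified.

```
set pd_spd_syncpoint_skip_limit = 0
```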

When any one of the following conditions is satisfied, the value calculated by either of the methods described in subsection (3) will provide a more precise result than the value HiRDB calculates:

If the KFPS02101-I message is output, stop the automatic calculation and use the calculation methods described in subsection (3) below.

In the cases described below, determine whether to continue operation using the maximum number of times skipping is permitted as calculated using automatic calculation. If you stop automatic calculation, use the method described in 8.8(3)(a) Method based on the byte count of the output system logs.

(3) Calculation methods

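The following two calculation methods are provided; use one of them to calculate an appropriate value:
  • Method based on the byte count of the output system logs
  • Method based on the byte count of all system logs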

Specify in the pd_spd_syncpoint_skip_limit operand a value that is slightly smaller than the value obtained by these calculations.

(a) Method based on the byte count of the output system logs

Use the following formula to obtain the value.

Formula
{(⌈a ÷ b⌉ ÷ c) ÷ d} - 1
a:
Sum (in bytes) of the system log information output by the transaction that updates the largest volume of database data and the system log information output by other transactions that are executing concurrently. For details about how to obtain the byte count of system logs, see the HiRDB Version 9 Installation and Design Guide.
  • When HiRDB Datareplicator is being used
When data replication transaction processing on the HiRDB being updated takes a long time, this monitoring facility might roll back the data replication transaction on the HiRDB being updated. Therefore, you must also add the byte count of system logs output by data replication transaction processing. Add the value obtained by the following formula:
byte-count-of-system-logs-output-by-data-replication-transaction-processing = Σ (byte-count-of-system-logs-output-by-the-transaction-that-updates-the-largest-amount-of-database-data)
Σ represents the log volume output by transactions as specified in the cmtintvl operands (trncmtintvl and tblcmtintvl) of the HiRDB Datareplicator replication environment definitions.
b:
System log file record length. You can obtain the record length with the pdlogls command.
c:
Average number of records in a system log block. Normally, this is roughly 3 × 4,096 ÷ b. You can use the following formula to obtain a precise value:
⌈average-length-of-system-log-output-block ÷ b⌉
You can obtain the average length of the system log output blocks from the system activity statistical information produced by the statistics analysis utility (OUTPUT BLOCK LENGTH).
d:
Value of the first parameter of the pd_log_sdinterval operand (which specifies the synchronization point dump acquisition interval in terms of the volume of system logs that are output).
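The method based on the byte count of the output system logs can be worked through in Python as follows. This is a sketch under stated assumptions, not a HiRDB tool: it reads the formula as {(⌈a ÷ b⌉ ÷ c) ÷ d} - 1, where ⌈ ⌉ means rounding up, the function names are hypothetical, and every sample value is invented for illustration. Obtain the real values from the sources named in the definitions of a through d.

```python
import math


def records_per_block(avg_block_len, record_len):
    """Value c: average number of records in a system log block (rounded up)."""
    return math.ceil(avg_block_len / record_len)


def skip_limit_method_a(a, b, c, d):
    """Evaluate {(ceil(a / b) / c) / d} - 1.

    a: byte count of system logs output by the largest updating transaction
       plus concurrently executing transactions
    b: system log file record length
    c: average number of records in a system log block
    d: first parameter of the pd_log_sdinterval operand
    """
    records = math.ceil(a / b)      # log records needed to hold a bytes
    blocks = records / c            # log blocks those records occupy
    return blocks / d - 1           # synchronization point intervals, minus one


# Hypothetical sample values, for illustration only.
a = 2_000_000_000
b = 4096
c = records_per_block(3 * 4096, b)  # the "roughly 3 x 4,096 / b" estimate
d = 5000

raw = skip_limit_method_a(a, b, c, d)
print(f"calculated value: {raw:.2f}")
# One way to pick "a value slightly smaller" is to round down; this is a choice
# made for the example, not a rule stated in the manual.
print("candidate setting:", math.floor(raw))
```

If HiRDB Datareplicator is in use, increase a by the additional byte count described under a before evaluating the formula.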
(b) Method based on the byte count of all system logs

Use the following formula to obtain the value.

Formula
{(a × b × c) ÷ d} ÷ e
a:
Number of system log files that can be placed in swappable target status while HiRDB is running.
b:
Number of records in a system log file. If the number of records differs between files, use the average number of records.
c:
Ratio of skipped synchronization point dump validations. Use the ratio of the number of files placed in overwrite disabled status to the total number of system log files.
  • For a HiRDB single server configuration, use a value of 0.333 or less. If the number of guaranteed-valid generations is 2, use a value of 0.167 or less.
  • For a back-end server, use a value of 0.333 or less. If the number of guaranteed-valid generations is 2, use a value of 0.167 or less.
  • For the dictionary server, use a value of approximately 0.5.
  • For a front-end server, use a value of approximately 0.7.
d:
Average number of records in a system log block. Normally, this is roughly 3 × 4,096 ÷ f. You can use the following formula to obtain a more precise value:
⌈average-length-of-system-log-output-block ÷ f⌉
You can obtain the average length of the system log output blocks from the system activity statistical information produced by the statistics analysis utility (OUTPUT BLOCK LENGTH).
e:
Value of the first parameter of the pd_log_sdinterval operand (which specifies the synchronization point dump acquisition interval in terms of the volume of system logs that are output).
f:
System log file record length. You can obtain the record length with the pdlogls command.
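The method based on the byte count of all system logs can be sketched the same way. This example assumes the formula reads {(a × b × c) ÷ d} ÷ e and that d, like c in the first method, is the average number of records in a system log block. The function name and all sample values are hypothetical.

```python
import math


def skip_limit_method_b(a, b, c, d, e):
    """Evaluate {(a * b * c) / d} / e.

    a: number of system log files that can be placed in swappable target status
    b: number of records in a system log file (average if files differ)
    c: ratio of skipped synchronization point dump validations
    d: average number of records in a system log block
    e: first parameter of the pd_log_sdinterval operand
    """
    records = a * b * c             # records allowed to become overwrite disabled
    blocks = records / d            # the same amount expressed in log blocks
    return blocks / e               # number of synchronization point intervals


# Hypothetical sample values for a back-end server, for illustration only.
f = 4096                            # system log file record length
d = math.ceil(3 * 4096 / f)         # the "roughly 3 x 4,096 / f" estimate
raw = skip_limit_method_b(a=12, b=500_000, c=0.333, d=d, e=5000)

print(f"calculated value: {raw:.1f}")
# Rounding down is one way to choose a slightly smaller value; it is an
# assumption of this example.
print("candidate setting:", math.floor(raw))
```

Because c is an upper bound for most server types (0.333 or less, or 0.167 or less with two guaranteed-valid generations), a smaller ratio yields a more conservative result.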

(4) Method of checking the value of the pd_spd_syncpoint_skip_limit operand

To check the upper limit for skipped effective synchronization point dumps being applied by the active HiRDB, use the pdlogls -d spd command. For details about the pdlogls -d spd command, see the manual HiRDB Version 9 Command Reference.

(5) When the value of the pd_spd_syncpoint_skip_limit operand is not appropriate

If the specified value is too large, it might not be possible to overwrite any of the system log files. If this happens, HiRDB terminates abnormally and cannot be restarted unless new system log files are added.

If the specified value is too small, there might be an increase in the number of transactions that are rolled back forcibly.

(6) When not to use the skipped effective synchronization point dump monitoring facility

In the following cases, do not use the skipped effective synchronization point dump monitoring facility:

For details on obtaining the amount of system log information, see the HiRDB Version 9 Installation and Design Guide.

(7) Transactions that are not rolled back

Even if the number of consecutively skipped synchronization point dumps exceeds the value specified in the pd_spd_syncpoint_skip_limit operand, the following transactions are not rolled back:

(8) Cases in which skipping is not included in the skip count

If the effective synchronization point dump is skipped in the following cases, these instances of skipping are not included in the consecutive skip count:

(9) Notes

If you execute the pdlogswap command several times consecutively while HiRDB is running, the number of system log files that can be used as primary files decreases. This tends to increase the probability that a unit will terminate abnormally, due to there being an insufficient number of system log files.

(10) If HiRDB Datareplicator is being used

If data replication transaction processing on the HiRDB being updated (target HiRDB) takes a long time, the skipped effective synchronization point dump monitoring facility might roll back the data replication transaction. If this happens, the target HiRDB issues the KFPS00993-I message (REQUEST= abnormal_tran_end), and the instance of the HiRDB Datareplicator on the target HiRDB side issues the KFRB03007-W and KFRB03013-I messages. The following shows how to handle this situation.

Procedure
  1. Use the pdstop command to terminate the target HiRDB normally.#
  2. Change the value of the pd_spd_syncpoint_skip_limit operand. For details about an appropriate value to specify, see (3)(a) Method based on the byte count of the output system logs.
  3. Determine whether the number of system log file generations satisfies the following condition; if it does not, add enough system log files to satisfy this condition:
    value-of-pd_spd_syncpoint_skip_limit-operand-after-change [Figure] number-of-system-log-file-generations [Figure] 3
  4. Use the pdstart command to start the target HiRDB normally.
  5. Use the hdsrfctl command of the HiRDB Datareplicator on the target HiRDB side to re-execute the data reflection transaction.
#: When you use the system reconfiguration command (pdchgconf command), you do not need to restart HiRDB normally, because the pdchgconf command allows you to modify HiRDB system definitions while HiRDB is running. Note that HiRDB Advanced High Availability is required in order to use this command. For details about modifying HiRDB system definitions while HiRDB is running, see 9.2 Modifying HiRDB system definitions while HiRDB is running (system reconfiguration command).

The following figure shows the operational flow when a data replication transaction is rolled back forcibly by the skipped effective synchronization point dump monitoring facility.

Figure 8-13 Operational flow when a data replication transaction is forcibly rolled back by the skipped effective synchronization point dump monitoring facility

[Figure]