Nonstop Database, HiRDB Version 9 System Operation Guide
(1) Skipped effective synchronization point dump monitoring facility
If a UAP enters an infinite loop, synchronization point dump validation processing might be skipped several times in a row. When the number of consecutively skipped validations reaches a set value, the offending transaction is stopped forcibly and rollback processing is performed. This capability is provided by the skipped effective synchronization point dump monitoring facility. To use this facility, specify the pd_spd_syncpoint_skip_limit operand.
Normally, the pd_spd_syncpoint_skip_limit operand is set to 0. When its value is 0, HiRDB automatically calculates the maximum number of times skipping is permitted. However, if the system becomes unstable with 0 specified, use one of the methods shown in subsection (3) below to calculate an appropriate value for this operand.
When any one of the following conditions is satisfied, the value calculated by either of the methods described in subsection (3) will provide a more precise result than the value HiRDB calculates:
- There are five or fewer generations of system log files that can be used as current files.
- Transactions that take a long time to process are being executed concurrently.
- Data replication transaction processing takes a long time to complete on the HiRDB being updated (when linked with HiRDB Datareplicator).
If the KFPS02101-I message is output, stop the automatic calculation and use the calculation methods described in subsection (3) below.
In the cases described below, determine whether to continue operation using the maximum number of times skipping is permitted that was obtained by automatic calculation. If you stop automatic calculation, use the method described in 8.8(3)(a) Method based on the byte count of the output system logs.
- If the system log size of a system log file that cannot be overwritten (the size of the system log that is written when HiRDB is restarted) increases during skipping, thus lengthening the time required to restart HiRDB:
In this case, to shorten the time required to restart HiRDB, make a correction, and then carry out automatic calculation (the KFPS02103-I message is output).
- If the automatically calculated maximum number of times skipping is permitted exceeds 100,000:
In this case, 100,000 is assumed (the KFPS02103-I message is output).
(3) Calculation methods
The following two calculation methods are provided; use either one to obtain an appropriate value:
- Method based on the byte count of the output system logs
- Method based on the byte count of all system logs
Specify in the pd_spd_syncpoint_skip_limit operand a value that is slightly smaller than the value obtained by these calculations.
(a) Method based on the byte count of the output system logs
Use the following formula to obtain the value.
- Formula
- {(a ÷ b ÷ c) ÷ d} - 1
- a:
- Sum (in bytes) of the system log information output by the transaction that updates the largest volume of database data and the system log information output by other transactions that are executing concurrently. For details about how to obtain the byte count of system logs, see the HiRDB Version 9 Installation and Design Guide.
- When HiRDB Datareplicator is being used
- When data replication transaction processing on the HiRDB being updated takes a long time, this monitoring facility might roll back the data replication transaction on the HiRDB being updated. Therefore, you must also add the byte count of system logs output by data replication transaction processing. Add the value obtained by the following formula:
- byte-count-of-system-logs-output-by-data-replication-transaction-processing = (byte-count-of-system-logs-output-by-the-transaction-that-updates-the-largest-amount-of-database-data)
- The right-hand side represents the log volume output by transactions as specified in the cmtintvl operands (trncmtintvl and tblcmtintvl) of the HiRDB Datareplicator replication environment definitions.
- b:
- System log file record length. You can obtain the record length with the pdlogls command.
- c:
- Average number of records in a system log block. Normally, this is roughly 3 × 4,096 ÷ b. You can use the following formula to obtain a precise value:
- average-length-of-system-log-output-block ÷ b
- You can obtain the average length of the system log output blocks from the system activity statistical information produced by the statistics analysis utility (OUTPUT BLOCK LENGTH).
- d:
- Value of the first parameter of the pd_log_sdinterval operand (which specifies the synchronization point dump acquisition interval in terms of the volume of system logs that are output).
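The following is a minimal Python sketch of method (a), offered as an illustration rather than a definitive implementation. It assumes the formula reads {(a ÷ b ÷ c) ÷ d} - 1, that d is expressed in the same unit as the a ÷ b ÷ c term (system log blocks), and that no rounding is applied; the function and parameter names are hypothetical.

```python
def records_per_block(record_length, avg_block_length=3 * 4096):
    """c: average number of records in a system log block.

    The default uses the rough estimate 3 x 4,096 bytes per block;
    pass the OUTPUT BLOCK LENGTH statistic for a more precise value.
    """
    return avg_block_length / record_length


def skip_limit_from_output_logs(log_bytes, record_length, sdinterval,
                                avg_block_length=3 * 4096):
    """Method (a), read as {(a / b / c) / d} - 1 (assumed operators).

    log_bytes     -- a: bytes of system log output by the largest
                     transaction plus concurrently running transactions
    record_length -- b: system log file record length (from pdlogls)
    sdinterval    -- d: first parameter of pd_log_sdinterval
    """
    c = records_per_block(record_length, avg_block_length)
    blocks = log_bytes / record_length / c   # log volume in blocks
    return blocks / sdinterval - 1           # unrounded result


# Illustrative numbers only, not taken from the manual.
print(skip_limit_from_output_logs(614_400_000, 4096, 1000))  # 49.0
```

Because the operand should be set slightly below the calculated value, rounding the result down before use is a conservative choice.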
(b) Method based on the byte count of all system logs
Use the following formula to obtain the value.
- Formula
- {(a × b × c) ÷ d} ÷ e
- a:
- Number of system log files that can be placed in swappable target status while HiRDB is running.
- b:
- Number of records in a system log file. If the number of records differs between files, use the average number of records.
- c:
- Ratio of skipped synchronization point dump validations. Use the ratio of the number of files placed in overwrite disabled status to the total number of system log files.
- For a HiRDB single server configuration, use a value of 0.333 or less. If the number of guaranteed-valid generations is 2, use a value of 0.167 or less.
- For a back-end server, use a value of 0.333 or less. If the number of guaranteed-valid generations is 2, use a value of 0.167 or less.
- For the dictionary server, use a value of approximately 0.5.
- For a front-end server, use a value of approximately 0.7.
- d:
- Average number of records in a system log block. Normally, this is roughly 3 × 4,096 ÷ f. You can use the following formula to obtain a more precise value:
- average-length-of-system-log-output-block ÷ f
- You can obtain the average length of the system log output blocks from the system activity statistical information produced by the statistics analysis utility (OUTPUT BLOCK LENGTH).
- e:
- Value of the first parameter of the pd_log_sdinterval operand (which specifies the synchronization point dump acquisition interval in terms of the volume of system logs that are output).
- f:
- System log file record length. You can obtain the record length with the pdlogls command.
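Method (b) can be sketched in the same way. This Python example assumes the formula reads {(a × b × c) ÷ d} ÷ e, with d computed as the average block length divided by the record length f, and applies no rounding. The function name, parameter names, and sample figures are hypothetical.

```python
def skip_limit_from_all_logs(swappable_files, records_per_file, skip_ratio,
                             record_length, sdinterval,
                             avg_block_length=3 * 4096):
    """Method (b), read as {(a * b * c) / d} / e (assumed operators).

    swappable_files  -- a: log files that can be in swappable target status
    records_per_file -- b: records per system log file (average if uneven)
    skip_ratio       -- c: ratio of skipped validations (e.g. 0.333 or less
                        for a single server or back-end server)
    record_length    -- f: system log file record length (from pdlogls)
    sdinterval       -- e: first parameter of pd_log_sdinterval
    """
    d = avg_block_length / record_length     # records per log block
    skippable_records = swappable_files * records_per_file * skip_ratio
    return skippable_records / d / sdinterval


# Illustrative numbers only: 6 files of 30,000 records, ratio 0.333.
limit = skip_limit_from_all_logs(6, 30_000, 0.333, 4096, 1000)
print(round(limit, 2))
```

With these sample figures the result is a little under 20, so a setting somewhat below that would follow the guidance to specify a slightly smaller value.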
To check the upper limit for skipped effective synchronization point dumps being applied by the active HiRDB, use the pdlogls -d spd command. For details about the pdlogls -d spd command, see the manual HiRDB Version 9 Command Reference.
(5) When the value of the pd_spd_syncpoint_skip_limit operand is not appropriate
If the specified value is too large, it might not be possible to overwrite any of the system log files. If this happens, HiRDB terminates abnormally and cannot be restarted unless new system log files are added.
If the specified value is too small, there might be an increase in the number of transactions that are rolled back forcibly.
(6) When not to use the skipped effective synchronization point dump monitoring facility
In the following cases, do not use the skipped effective synchronization point dump monitoring facility:
- During batch processing that updates a large amount of data, when the amount of system log information output before the commit statement is issued is more than one-third of the total size of all system log files
- When the sum of the amount of system log information output by the transaction that updates the largest amount of database data and the amount of system log information output by transactions that execute concurrently is more than one-third of the total size of all system log files
- When the skipped effective synchronization point dump is monitored more than 100,000 times
For details on obtaining the amount of system log information, see the HiRDB Version 9 Installation and Design Guide.
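The one-third conditions above can be checked with a small helper. The following Python sketch is illustrative only: the function name and arguments are hypothetical, the byte counts must be obtained as described in the Installation and Design Guide, and the 100,000-times condition is not modeled.

```python
def within_one_third_rule(peak_log_bytes, total_log_file_bytes):
    """Return True if the log volume stays within one-third of all log files.

    peak_log_bytes       -- log bytes output before commit by the largest
                            transaction, plus concurrent transactions
    total_log_file_bytes -- total size of all system log files

    The facility should not be used when the peak is MORE than one-third
    of the total, so exactly one-third still passes. Integer arithmetic
    avoids floating-point rounding at the boundary.
    """
    return peak_log_bytes * 3 <= total_log_file_bytes


# Illustrative numbers only: 300 MB of log files in total.
print(within_one_third_rule(100_000_000, 300_000_000))  # True
print(within_one_third_rule(100_000_001, 300_000_000))  # False
```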
(7) Transactions that are not rolled back
Even if the number of consecutively skipped synchronization point dumps exceeds the value specified in the pd_spd_syncpoint_skip_limit operand, the following transactions are not rolled back:
- Transactions that are already being rolled back.
- Transactions waiting for a phase 2 commit completion instruction from OpenTP1.
- Transactions generated by a utility.
(8) Cases in which skipping is not included in the skip count
If the effective synchronization point dump is skipped in the following cases, these instances of skipping are not included in the consecutive skip count:
- When the time specified in the pd_log_sdinterval operand has elapsed since the previous effective synchronization point dump
- When the pdlogswap command is executed to swap the system log files
- When the pdlogsync command is executed
- Reference note
- Although these cases are not included in the consecutive skip count, they are included in Total number of synchronization point dump opportunities that were ignored, displayed in the KFPS02179-I message.
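The difference between the two counts can be illustrated with a simplified model. This Python sketch is not HiRDB's implementation: the trigger labels are hypothetical, and it assumes that a successful validation resets the consecutive count.

```python
# Hypothetical labels for the three cases excluded from the consecutive count.
EXCLUDED_TRIGGERS = {"time_interval", "pdlogswap", "pdlogsync"}


def count_skips(events):
    """Return (consecutive_skips, total_skips) for a sequence of events.

    events -- iterable of (trigger, skipped) pairs, where trigger names the
              cause of the synchronization point dump opportunity and
              skipped is True when validation was skipped
    """
    consecutive = 0
    total = 0
    for trigger, skipped in events:
        if not skipped:
            consecutive = 0      # assumption: success ends the current run
            continue
        total += 1               # every skip counts toward the total
        if trigger not in EXCLUDED_TRIGGERS:
            consecutive += 1     # only non-excluded skips count here
    return consecutive, total


events = [
    ("log_volume", True),
    ("pdlogswap", True),       # excluded from the consecutive count
    ("log_volume", True),
    ("time_interval", True),   # excluded from the consecutive count
]
print(count_skips(events))  # (2, 4)
```

In this model only the consecutive count would be compared with the pd_spd_syncpoint_skip_limit value, while the total corresponds to the figure reported in the KFPS02179-I message.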
If you execute the pdlogswap command several times consecutively while HiRDB is running, the number of system log files that can be used as primary files decreases. This tends to increase the probability that a unit will terminate abnormally, due to there being an insufficient number of system log files.
(10) If HiRDB Datareplicator is being used
If data replication transaction processing on the HiRDB being updated (target HiRDB) takes a long time, the skipped effective synchronization point dump monitoring facility might roll back the data replication transaction. If this happens, the target HiRDB issues the KFPS00993-I message (REQUEST= abnormal_tran_end), and the instance of the HiRDB Datareplicator on the target HiRDB side issues the KFRB03007-W and KFRB03013-I messages. The following shows how to handle this situation.
- Procedure
- Use the pdstop command to terminate the target HiRDB normally.#
- Change the value of the pd_spd_syncpoint_skip_limit operand. For details about an appropriate value to specify, see (3)(a) Method based on the byte count of the output system logs.
- Determine whether the number of system log file generations satisfies the following condition; if it does not, add enough system log files to satisfy this condition:
value-of-pd_spd_syncpoint_skip_limit-operand-after-change number-of-system-log-file-generations 3
- Use the pdstart command to start the target HiRDB normally.
- Use the hdsrfctl command of the HiRDB Datareplicator on the target HiRDB side to re-execute the data reflection transaction.
- #: When you use the system reconfiguration command (pdchgconf command), you do not need to restart HiRDB normally, because the pdchgconf command allows you to modify HiRDB system definitions while HiRDB is running. Note that HiRDB Advanced High Availability is required in order to use this command. For details about modifying HiRDB system definitions while HiRDB is running, see 9.2 Modifying HiRDB system definitions while HiRDB is running (system reconfiguration command).
The following figure shows the operational flow when a data replication transaction is rolled back forcibly by the skipped effective synchronization point dump monitoring facility.
Figure 8-13 Operational flow when a data replication transaction is forcibly rolled back by the skipped effective synchronization point dump monitoring facility
All Rights Reserved. Copyright (C) 2011, 2015, Hitachi, Ltd.