8.13 Monitoring the number of times server processes terminate abnormally (abnormal termination monitoring facility)

If server processes terminate abnormally often, servers may be unable to accept new service requests. Because HiRDB itself does not usually terminate abnormally when a server process does, frequent abnormal terminations of server processes can effectively halt online operations even though HiRDB remains active. The abnormal termination monitoring facility is provided to prevent this situation.

Organization of this section
(1) Overview of the abnormal termination monitoring facility
(2) Application range of the abnormal termination monitoring facility
(3) Specifying the abnormal termination monitoring facility
(4) Notes

(1) Overview of the abnormal termination monitoring facility

If the number of times that a server process is terminated abnormally in a specified amount of time reaches the value specified in the pd_down_watch_proc operand, HiRDB (or the associated unit for a HiRDB/Parallel Server) also terminates abnormally. This capability is provided by the abnormal termination monitoring facility.

We recommend that you use this facility in conjunction with the system switchover facility. This way, if HiRDB terminates abnormally because server processes have terminated abnormally more than the specified number of times, the system will be switched over quickly. If this monitoring facility is not used, HiRDB does not terminate abnormally, and the system is not switched over.

Even if you do not use this facility, you can restart HiRDB, which will refresh memory and other resources, leading to improved processing efficiency.

If HiRDB terminates abnormally due to the abnormal termination monitoring facility, the KFPS01821-E and KFPS00729-E messages are issued.
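As a rough illustration of how an operator might check for these two messages, the following sketch filters lines of text for the message IDs. The function name, the sample lines, and the assumption that each message ID appears verbatim on its own log line are all hypothetical; the actual log location and line layout depend on the installation and are not described here.

```python
import re

# Matches the two message IDs named in the text. An optional hyphen after
# "KFPS" is tolerated so that either spelling of the ID is caught.
SHUTDOWN_MESSAGE = re.compile(r"KFPS-?(?:01821|00729)-E")


def shutdown_messages(lines):
    """Return the lines that contain either of the two message IDs."""
    return [line.rstrip("\n") for line in lines if SHUTDOWN_MESSAGE.search(line)]


# Hypothetical sample lines; real message text is not reproduced here.
sample = [
    "KFPS01821-E (message text omitted)",
    "KFPS00729-E (message text omitted)",
    "KFPS00000-I (hypothetical unrelated line)",
]
hits = shutdown_messages(sample)
```

In practice the same function could be given an open file object instead of a list, because it only iterates over lines.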

(2) Application range of the abnormal termination monitoring facility

This facility monitors processes that have terminated abnormally due to a PDCWAITTIME timeout or an abort. For a HiRDB/Single Server, it counts the number of times that single server processes terminate abnormally. For a HiRDB/Parallel Server, it counts the total number of times that front-end, back-end, and dictionary server processes in the unit terminate abnormally. Table 8-11 lists the factors that may cause server processes to terminate abnormally and indicates which of these are counted as an abnormal termination.

Table 8-11 Causes of abnormal termination of server processes and which are counted as an abnormal termination

Counted as an abnormal termination? (columns 2 to 5, by type of server process)

| Cause of abnormal termination of a server process | Single server process | Front-end server process | Dictionary server process | Back-end server process |
|---|---|---|---|---|
| The value of the PDCWAITTIME operand in the client environment definitions was exceeded. | Y | Y | N [1] | N [1] |
| The pdcancel command was executed. | N | N [2] | N | N |
| An internal forced termination occurred (HiRDB issued SIGKILL internally to stop the process). | Y [3] | Y [3] | N [1] | N [1] |
| An abort occurred. | Y | Y | Y | Y |
| Rollback occurred on an XA-connected UAP. | Y | Y | N | N |
| An abnormal termination other than the above occurred. | Y | Y | Y | Y |

Legend:
Y: Counted as an abnormally terminating process
N: Not counted as an abnormally terminating process
[1] If an error is detected during a transaction branch, any abnormal termination of a front-end server process generated from that transaction branch is counted.
[2] If the pdcancel command is used to forcibly terminate a back-end server process or a dictionary server process, front-end server processes are forcibly terminated internally. In such a case, abnormal termination of the front-end server processes may be counted.
[3] If an error is detected during a global transaction issued by an OLTP system, any abnormal termination of a single server process or a front-end server process generated from that global transaction is counted.
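The Y and N entries of Table 8-11 can be encoded as a small lookup, which may be convenient when reasoning about which failures contribute to the count. This is only a sketch: the cause keys and server type names are hypothetical identifiers chosen for illustration, and the conditional cases described in the table notes are not modeled (each cell is reduced to its base Y or N value).

```python
# Base Y/N values transcribed from Table 8-11 (True = counted).
# Keys are hypothetical labels, not HiRDB identifiers.
# The conditions in the table notes are deliberately ignored here.
SERVER_TYPES = ("single", "front_end", "dictionary", "back_end")

COUNTED = {
    "pdcwaittime_exceeded":      (True,  True,  False, False),
    "pdcancel_executed":         (False, False, False, False),
    "internal_forced_stop":      (True,  True,  False, False),
    "abort":                     (True,  True,  True,  True),
    "xa_uap_rollback":           (True,  True,  False, False),
    "other_abnormal_termination": (True,  True,  True,  True),
}


def is_counted(cause, server_type):
    """Return the base table value for a cause and a server process type."""
    return COUNTED[cause][SERVER_TYPES.index(server_type)]


timeout_on_back_end = is_counted("pdcwaittime_exceeded", "back_end")
abort_on_dictionary = is_counted("abort", "dictionary")
```

Because the notes can turn an N into a counted termination (for example, when pdcancel indirectly stops a front-end server process), a lookup like this gives only the default behavior.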

(3) Specifying the abnormal termination monitoring facility

Use the pd_down_watch_proc operand to specify the maximum number of times that server processes are allowed to terminate abnormally and the period over which these abnormal terminations are counted.

Example: pd_down_watch_proc = 1000, 60
In this case, abnormal terminations of server processes are counted over 60-second intervals. If the count in any 60-second interval exceeds 1000, HiRDB terminates abnormally.
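The counting behavior in this example can be pictured with a minimal sketch. It is not HiRDB's implementation: the class and method names are hypothetical, a sliding window is assumed (the text does not say whether the interval is sliding or fixed), and the limit is treated as tripped when the count exceeds the specified value, following the wording of the example above.

```python
from collections import deque


class AbnormalTerminationMonitor:
    """Illustrative sliding-window counter (hypothetical, not HiRDB code)."""

    def __init__(self, max_terminations, interval_seconds):
        self.max_terminations = max_terminations
        self.interval_seconds = interval_seconds
        self._timestamps = deque()

    def record(self, now):
        """Record one counted abnormal termination at time `now` (seconds).

        Returns True if the limit is tripped, that is, if HiRDB (or the
        unit) would itself be terminated abnormally under these assumptions.
        """
        self._timestamps.append(now)
        # Discard terminations that are older than the monitoring interval.
        while now - self._timestamps[0] >= self.interval_seconds:
            self._timestamps.popleft()
        # Assumption: the limit is tripped when the count exceeds the value.
        return len(self._timestamps) > self.max_terminations


# Small numbers for readability: at most 3 terminations per 60 seconds.
monitor = AbnormalTerminationMonitor(max_terminations=3, interval_seconds=60)
results = [monitor.record(t) for t in (0, 10, 20, 30, 100)]
# results is [False, False, False, True, False]
```

The fourth termination arrives within 60 seconds of the first three, so the limit is exceeded. By the time of the fifth, the earlier terminations have aged out of the window and the count starts again from one.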

(4) Notes