Monitoring the number of times server processes terminate abnormally (abnormal termination monitoring facility)

If server processes terminate abnormally often, servers might not be able to accept new services. Because HiRDB itself does not usually terminate abnormally when a server process does, frequent abnormal terminations of server processes can bring online operations to an effective halt while HiRDB itself keeps running. The abnormal termination monitoring facility is provided to prevent this.

Organization of this section: (1) Overview of the abnormal termination monitoring facility; (2) Application range of the abnormal termination monitoring facility; (3) Specifying the abnormal termination monitoring facility; (4) Notes

(1) Overview of the abnormal termination monitoring facility

If the number of times that server processes terminate abnormally within a specified period exceeds the value specified in the pd_down_watch_proc operand, HiRDB (or, for a HiRDB parallel server configuration, the affected unit) also terminates abnormally. This capability is provided by the abnormal termination monitoring facility.

We recommend that you use this facility in conjunction with the system switchover facility. This way, if HiRDB terminates abnormally because server processes have terminated abnormally more than the specified number of times, the system will be switched over quickly. If this monitoring facility is not used, HiRDB does not terminate abnormally, and the system is not switched over.

Even if you do not use the system switchover facility, restarting HiRDB after it terminates abnormally refreshes memory and other resources, which can improve processing efficiency.

If HiRDB terminates abnormally due to the abnormal termination monitoring facility, the KFPS01821-E and KFPS00729-E messages are issued.

(2) Application range of the abnormal termination monitoring facility

This facility monitors for server processes that terminate abnormally because of a PDCWAITTIME timeout or an abort. For a HiRDB single server configuration, it counts the number of times that single server processes terminate abnormally. For a HiRDB parallel server configuration, it counts the total number of times that front-end, back-end, and dictionary server processes in the unit terminate abnormally. The following table shows the factors that might cause server processes to terminate abnormally and indicates which of these are counted as abnormal terminations.

Table 8-13 Causes of abnormal termination of server processes and which are counted as an abnormal termination

Cause of abnormal termination of a server process	Counted: single server process	Counted: front-end server process	Counted: dictionary server process	Counted: back-end server process
The value of the PDCWAITTIME operand in the client environment definitions was exceeded.	Y	Y	N#1	N#1
The pdcancel command was executed.	N	N#2	N	N
An internal forced termination occurred (HiRDB issued SIGKILL internally to stop a server process).	Y#3	Y#3	N#1	N#1
An abort occurred.	Y	Y	Y	Y
Either of the following occurred: (a) abnormal termination of a server process by the OLTP system's transaction recovery processing; (b) abnormal termination of a server process by XDS' transaction recovery processing#4	Y	Y	N	N
An abnormal termination of a server process other than the above occurred.	Y	Y	Y	Y

Legend: Y: Counted as an abnormally terminating server process. N: Not counted as an abnormally terminating server process.

#1: If an error is detected during a transaction branch, any abnormal termination of a front-end server process generated from that transaction branch is counted.

#2: If the pdcancel command is used to forcibly terminate a back-end server process or a dictionary server process, front-end server processes are forcibly terminated internally. In such a case, abnormal termination of the front-end server processes might be counted.

#3: If an error is detected during a global transaction issued by an OLTP system, any abnormal termination of a single server process or a front-end server process generated from that global transaction is counted.

#4: If a transaction executed by XDS on the server that is providing the primary functions cannot be completed, rollback by XDS' transaction recovery processing occurs and might abnormally terminate the server process.
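
The table can also be read as a lookup keyed by cause and server type. The following Python sketch encodes it that way for illustration only; the cause keys, the server type names, and the is_counted function are hypothetical names chosen here and are not part of HiRDB. A footnote marker is returned alongside the Y/N value because the footnotes describe exceptions to the plain value.

```python
# Column order follows the table: single server, front-end, dictionary, back-end.
SERVER_TYPES = ("single", "front_end", "dictionary", "back_end")

# Hypothetical keys for the table rows; the cell values are copied from the table.
COUNTED = {
    "pdcwaittime_exceeded":        ("Y",   "Y",   "N#1", "N#1"),
    "pdcancel":                    ("N",   "N#2", "N",   "N"),
    "internal_forced_termination": ("Y#3", "Y#3", "N#1", "N#1"),
    "abort":                       ("Y",   "Y",   "Y",   "Y"),
    "transaction_recovery":        ("Y",   "Y",   "N",   "N"),
    "other":                       ("Y",   "Y",   "Y",   "Y"),
}


def is_counted(cause, server_type):
    """Return (counted, footnote) for one table cell.

    counted is True for a Y cell and False for an N cell.
    footnote is a marker such as "#1", or None if the cell has none.
    """
    cell = COUNTED[cause][SERVER_TYPES.index(server_type)]
    flag, _, note = cell.partition("#")
    return flag == "Y", ("#" + note) if note else None
```

For example, is_counted("pdcancel", "front_end") yields a not-counted result together with the marker "#2", which signals that the footnote's exception should be checked before relying on the plain value.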

(3) Specifying the abnormal termination monitoring facility

You use the pd_down_watch_proc operand to specify the period over which the number of server process abnormal terminations is to be monitored and the maximum number of times that server processes are to be allowed to terminate abnormally.

Example: pd_down_watch_proc = 1000, 60

In this case, the number of times server processes terminate abnormally is monitored in 60-second intervals. If the number of times server processes terminate abnormally in any 60-second interval exceeds 1,000, HiRDB terminates abnormally.
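
The threshold check described above can be sketched as a small counter. This Python example is only an illustration of the idea and is not how HiRDB implements the facility: it assumes a sliding window ending at the most recent abnormal termination, which is one possible reading of "any 60-second interval", and it handles only the two-value operand form shown in the example. The names parse_down_watch_proc and AbnormalTerminationWindow are hypothetical.

```python
from collections import deque


def parse_down_watch_proc(value):
    """Split an operand value such as "1000, 60" into (limit, interval_seconds)."""
    limit_text, interval_text = value.split(",")
    return int(limit_text), int(interval_text)


class AbnormalTerminationWindow:
    """Illustrative sliding-window counter (assumption, not HiRDB's internals)."""

    def __init__(self, limit, interval_seconds):
        self.limit = limit
        self.interval_seconds = interval_seconds
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one counted abnormal termination at the given time in seconds.

        Returns True if the number of terminations inside the window
        now exceeds the limit, and False otherwise.
        """
        self._timestamps.append(timestamp)
        # Discard terminations that fall outside the window ending at this one.
        cutoff = timestamp - self.interval_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.limit


limit, interval = parse_down_watch_proc("1000, 60")
window = AbnormalTerminationWindow(limit, interval)
```

With these values, record returns True only once more than 1,000 counted terminations fall within the same 60 seconds. Terminations that are not counted (for example, those caused by the pdcancel command) would simply never be passed to record.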

(4) Notes

The KFPS01820-E message is issued when a server process terminates abnormally. The KFPS01820-E message is also issued when the pdcancel command is used to terminate a server process abnormally; however, in this case, the abnormal termination is not counted.
Use of a mutual system switchover configuration might actually cause traffic to increase, negating any benefits from this facility. This is because, when a system switchover is executed, multiple instances of HiRDB become active on a single server machine. When you use the abnormal termination monitoring facility, we recommend that you restart HiRDB on the same system as the instance of HiRDB that terminated abnormally.
