3.2.1 Monitoring a server in the monitor mode (for the active server)
This subsection explains the function for monitoring servers in the monitor mode. This function enables the hot standby operation to be performed automatically in the event of a server failure on a server in the monitor mode. Such a hot standby operation is performed as a planned hot standby.
If you use the monitor-mode program management function to monitor UAPs, also see 3.6 Function for controlling programs (monitor mode).
(1) How to monitor servers
On a server in the monitor mode, the method to detect an application or task failure differs depending on the specifications of the application or task. Therefore, to notify HA Monitor that an application or task failure occurred, the user must create a server monitoring command.
A server monitoring command can be specified in either of the following ways:
- Specifying a monitoring command for the ptrlcmd_ex operand in the server environment definition
- Specifying a monitoring command for the patrolcommand operand in the server environment definition
In the following cases, we recommend that you specify a monitoring command by using the ptrlcmd_ex operand in the server environment definition:
- Hot-standby switchover needs to be performed when HA Monitor detects an application or task slowdown.
- Monitoring of servers needs to be temporarily stopped and restarted.
The following describes the flow of failure monitoring processing for each case: when the ptrlcmd_ex operand is specified in the server environment definition, and when the patrolcommand operand is specified.
(a) Specifying a monitoring command for the ptrlcmd_ex operand in the server environment definition
If a monitoring command is specified for the ptrlcmd_ex operand in the server environment definition, when startup of the server finishes, HA Monitor executes the command at regular intervals to determine the command execution result (EXIT code). The interval is specified in the ptrlcmd_ex_inter operand in the server environment definition. HA Monitor can detect a server failure from the EXIT code. If the ptrlcmd_ex_tmout operand is specified in the server environment definition and command execution does not end within the time specified for that operand, HA Monitor assumes that a slowdown occurred. In this case, HA Monitor determines that a server failure occurred on the server.
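The following is a minimal Python sketch of one monitoring cycle as described above. It is an illustrative model only, not HA Monitor's implementation. The function name `run_check` and the result labels are hypothetical. The only EXIT codes taken from this subsection are 0 (normal) and 20 to 29 (which section (2) lists as triggering the hot standby operation directly). Treating every other code as an ordinary failure is an assumption.

```python
import subprocess
import sys


def run_check(command, timeout_seconds):
    """Run the monitoring command once and classify the outcome.

    Returns one of "normal", "failure", "failure_switch_now", "slowdown".
    These labels are hypothetical and used only for this illustration.
    """
    try:
        # timeout_seconds plays the role of the ptrlcmd_ex_tmout operand.
        result = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The command did not end in time: treated as a slowdown,
        # which the text says is handled as a server failure.
        return "slowdown"

    code = result.returncode
    if code == 0:
        return "normal"
    if 20 <= code <= 29:
        # Range that section (2) associates with direct hot standby.
        return "failure_switch_now"
    # Assumption: any other EXIT code reports an ordinary failure.
    return "failure"
```

For example, `run_check([sys.executable, "-c", "import sys; sys.exit(0)"], 10)` classifies a command that ends immediately with an EXIT code of 0. In the model, a caller would invoke `run_check` once per interval (the role of the ptrlcmd_ex_inter operand).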
The following figure shows an overview of monitoring a failure on a server in the monitor mode (in the case where a monitoring command is specified for the ptrlcmd_ex operand in the server environment definition).
Important
- If a monitoring command is specified for the ptrlcmd_ex operand in the server environment definition, the following cannot be used together with it:
  - A monitoring command specified for the patrolcommand operand in the server environment definition
  - The monitor-mode program management function
- For details about using HA Monitor to control the server monitoring command, see 4.1.4 Controlling the server monitoring command.
The monchange command allows you to change the settings and operations while HA Monitor and servers are running. You can use this command to temporarily stop and restart server monitoring. For details, see 9.6 monchange (changes settings and operations while HA Monitor and servers are running).
(b) Specifying a monitoring command for the patrolcommand operand in the server environment definition
If a monitoring command is specified for the patrolcommand operand in the server environment definition, HA Monitor automatically starts the command when startup of the server is completed. HA Monitor then checks for the command's process at 1-second intervals. When HA Monitor detects termination of the server monitoring command, it determines that a failure has occurred on the server.
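Because HA Monitor treats termination of the command as the failure signal in this case, a monitoring command for the patrolcommand operand is typically a long-running process that exits only when it detects a problem. The following Python sketch illustrates that shape. The names `watch` and `is_healthy` and the default check interval are hypothetical, and the actual health check depends on the application being monitored.

```python
import time


def watch(is_healthy, interval_seconds=5.0, sleep=time.sleep):
    """Keep running while is_healthy() returns True.

    Returns the number of successful checks once a check fails.
    When the surrounding process then exits, that termination is
    what signals the failure.
    """
    healthy_checks = 0
    while is_healthy():
        healthy_checks += 1
        # Wait before the next check (interval is an illustrative default).
        sleep(interval_seconds)
    return healthy_checks
```

The `sleep` parameter exists only so that the loop can be exercised without real delays. A real command would pass its own health-check function and then exit after `watch` returns.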
The following figure shows an overview of monitoring a failure on a server in the monitor mode (in the case where a monitoring command is specified for the patrolcommand operand in the server environment definition).
For details about using HA Monitor to control the server monitoring command, see 4.1.4 Controlling the server monitoring command.
(2) HA Monitor processing after detection of a server failure
When HA Monitor detects a server failure, it performs one of the following operations:
- Restarts the active server.
  HA Monitor restarts the active server on the host on which the server failure occurred. You can use the servexec_retry operand in the server environment definition to specify the number of start retries.
  If the restarted server has been operating normally for a certain period of time, HA Monitor assumes that the server operation is stable, and resets the number of start retries to 0. The time required for HA Monitor to assume stable operation depends on the operand for which the monitoring command is specified, as follows:
  - If the monitoring command is specified for the ptrlcmd_ex operand in the server environment definition:
    (The interval specified for the ptrlcmd_ex_inter operand in the server environment definition) × (the count specified for the ptrlcmd_ex_retry operand in the server environment definition)
    Note, however, that the monitoring command must continue to return an EXIT code of 0 until this time period expires.
  - If the monitoring command is specified for the patrolcommand operand in the server environment definition:
    The time period specified for the retry_stable operand in the server environment definition
- Performs the hot standby operation.
  In the following cases, HA Monitor stops the active server and begins to switch to the standby server:
  - The value 0 is specified for the servexec_retry operand in the server environment definition.
  - The number of detected server failures has reached the value specified for the servexec_retry operand in the server environment definition.
  - The monitoring command is specified for the ptrlcmd_ex operand in the server environment definition, and the server monitoring command terminates with an EXIT code of 20 to 29.
  Note that if hot standby processing cannot be performed, for example because no standby server is available, the server status remains under execution processing in HA Monitor. Restart the active server or perform hot standby processing, as appropriate for your operation.
- Performs pair shutdown.
  If the monitoring command is specified for the ptrlcmd_ex operand in the server environment definition, and the pair shutdown function is used (use is set for the pairdown operand in the server environment definition), pair shutdown is performed.
For details about the processing flow for starting and terminating the server monitoring command, and for monitoring, see 4.1.4 Controlling the server monitoring command.
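The rules in this section for choosing between a restart and the hot standby operation, and for calculating the stable-operation time, can be summarized in a short Python sketch. This is a simplified model for illustration. The function names and return labels are hypothetical, pair shutdown is deliberately left out, and the units of the time values are whatever the corresponding operands use (this subsection does not state them).

```python
def stable_period(uses_ptrlcmd_ex, ptrlcmd_ex_inter=0,
                  ptrlcmd_ex_retry=0, retry_stable=0):
    """Time after which the start-retry count is reset to 0."""
    if uses_ptrlcmd_ex:
        # ptrlcmd_ex case: interval multiplied by retry count.
        return ptrlcmd_ex_inter * ptrlcmd_ex_retry
    # patrolcommand case: the retry_stable value itself.
    return retry_stable


def next_action(failures_detected, servexec_retry, exit_code=None):
    """Return "restart" or "hot_standby" (hypothetical labels).

    exit_code is the monitoring command's EXIT code when ptrlcmd_ex
    is used, or None when patrolcommand is used.
    """
    if servexec_retry == 0:
        return "hot_standby"
    if failures_detected >= servexec_retry:
        return "hot_standby"
    if exit_code is not None and 20 <= exit_code <= 29:
        return "hot_standby"
    return "restart"
```

For instance, with an interval of 10 and a retry count of 3, `stable_period(True, ptrlcmd_ex_inter=10, ptrlcmd_ex_retry=3)` gives 30. During that period the monitoring command must keep returning an EXIT code of 0 for the operation to be considered stable.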
(3) Required environment settings
The user must create a server monitoring command, and must also specify that command for the ptrlcmd_ex or patrolcommand operand in the server environment definition.
Some programs themselves provide a command for monitoring the program. For details about how to create a server monitoring command, see 6.13.3 Creating a server monitoring command.
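As a rough illustration of what a one-shot monitoring command for the ptrlcmd_ex operand might look like, the following Python sketch checks whether a TCP port accepts connections and maps the result to an EXIT code. Everything application-specific here is an assumption: the TCP check itself, the argument layout, and the failure code 1. Only the use of 0 for a normal result comes from this subsection. See 6.13.3 for the actual requirements.

```python
import socket
import sys

EXIT_OK = 0        # From the text: 0 is the normal result.
EXIT_FAILURE = 1   # Assumption: placeholder code for an ordinary failure.


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def main(argv):
    """argv layout (hypothetical): [program, host, port]."""
    host, port = argv[1], int(argv[2])
    return EXIT_OK if port_is_open(host, port) else EXIT_FAILURE

# When saved as an executable script, finish with:
#     sys.exit(main(sys.argv))
```

The check is kept short and bounded by a connection timeout so that a healthy server is never misreported as a slowdown when a time limit such as ptrlcmd_ex_tmout is in effect.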