Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


4.7.2 Program restart control

The monitor-mode program management function enables you to restart an individual program when a program failure is detected. The restart is performed by executing a program restart command, which you must create yourself. You then specify the created command in the restartcommand operand in the monitor-mode program environment definition.

For details about how to create a program restart command, see 6.15 Creating a program restart command (when the program management function is used).

(1) Unit of restart processing

You can select either the server or the program as the unit of restart processing. That is, if a UAP failure occurs, you can restart only the UAP that failed, or you can restart the server that contains that UAP.

The unit of restart processing is determined by whether the restartcommand operand is specified in the monitor-mode program environment definition. When this operand is specified, HA Monitor restarts the individual program; when this operand is omitted, HA Monitor restarts the server.
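
The selection rule above can be sketched in a few lines of Python. This is only an illustration of the rule, not HA Monitor code: the function name restart_unit and the representation of the definition as a dictionary are assumptions made for this example, and treating an empty value as "omitted" is likewise an assumption.

```python
from typing import Mapping


def restart_unit(definition: Mapping[str, str]) -> str:
    """Return the unit of restart processing implied by a definition.

    `definition` is a hypothetical dictionary view of the monitor-mode
    program environment definition (operand name -> value).
    """
    # restartcommand specified -> restart the individual program;
    # restartcommand omitted   -> restart the server.
    # Assumption: an empty value is treated the same as an omitted operand.
    if definition.get("restartcommand"):
        return "program"
    return "server"
```

For example, a definition that contains a restartcommand entry yields "program", and an empty definition yields "server".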

(2) Processing flow of program restart

This subsection explains the processing flow when an individual program is restarted.

If a program fails, HA Monitor attempts to restart the program up to the number of times specified as the program restart limit, which is the maximum number of restart attempts. You specify the program restart limit in the exec_retry operand in the monitor-mode program environment definition.

You can also specify a timeout value for program restart processing in the restart_timeout operand in the monitor-mode program environment definition. If the program restart command times out, HA Monitor re-executes the command.

In the event of a program failure, HA Monitor repeats the following processing until the program restart limit is reached:

  1. If the program restart command is already executing, forcibly terminates the program restart command.

  2. If the UAP being monitored consists of multiple processes, forcibly terminates all processes that issued the API.

  3. Executes the program restart command.
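
The repeated processing above can be sketched as a bounded retry loop. The following Python code is a simplified model under stated assumptions, not HA Monitor's implementation: the function restart_program and the kill_monitored_processes hook are hypothetical names, an exit status of 0 is assumed to mean that the restart succeeded, and a timed-out execution is assumed to count as one attempt toward the limit. The parameters exec_retry and restart_timeout merely mirror the operand names.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence


def restart_program(
    restart_command: Sequence[str],
    exec_retry: int,
    restart_timeout: Optional[float] = None,
    kill_monitored_processes: Callable[[], None] = lambda: None,
) -> bool:
    """Return True if a restart attempt succeeds before the limit is reached."""
    for _attempt in range(exec_retry):
        # Step 2: forcibly terminate leftover processes of the failed UAP.
        # (Hypothetical hook; does nothing by default.)
        kill_monitored_processes()
        try:
            # Step 3: execute the program restart command.
            result = subprocess.run(restart_command, timeout=restart_timeout)
        except subprocess.TimeoutExpired:
            # Step 1: subprocess.run has already killed the timed-out
            # command, so the next pass simply re-executes it.
            continue
        # Assumption: exit status 0 means the program was restarted.
        if result.returncode == 0:
            return True
    return False  # program restart limit reached


if __name__ == "__main__":
    # Stand-in restart command that always succeeds.
    command = [sys.executable, "-c", "raise SystemExit(0)"]
    print(restart_program(command, exec_retry=3, restart_timeout=10))
```

Bounding the loop by exec_retry keeps a command that repeatedly fails or hangs from being retried indefinitely.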

(3) Handling of restart errors

If HA Monitor detects that the program restart limit has been reached, leaving the program terminated, and hot standby processing can be performed, HA Monitor performs hot standby processing. If hot standby processing cannot be performed, the user must restart the program manually.
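
The decision described above reduces to a single branch. The sketch below only illustrates that rule; the function name and the returned labels are hypothetical and do not correspond to any HA Monitor interface.

```python
def action_after_restart_limit(hot_standby_possible: bool) -> str:
    """Return the follow-up action once the program restart limit is reached."""
    if hot_standby_possible:
        # HA Monitor performs hot standby processing.
        return "hot_standby"
    # Otherwise the user must restart the program manually.
    return "manual_restart"
```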