Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


7.4.3 Restarting the host where a failure occurred as the standby system

After hot standby processing completes, the host on which the failure occurred is reset and placed in the inactive status. However, if SCSI reservation of the shared disk is used, or if the function for controlling hot standby based on the availability of LAN communications is used, the host and HA Monitor return to a running state once hot standby processing caused by a monitoring path failure or host slowdown has completed.

To prepare for the possibility of a subsequent failure on the secondary system that is currently executing jobs, the operator must resolve the cause of the failure and then restart the host on which the failure occurred. The procedure for making that host available for hot standby processing again depends on whether the host terminated after hot standby processing finished.
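The conditions described above can be read as a small decision rule. The following Python sketch encodes one reading of those paragraphs. The names `Cause` and `host_keeps_running` are hypothetical and are not part of HA Monitor.

```python
from enum import Enum


class Cause(Enum):
    """Failure causes named in this section (OTHER covers everything else)."""
    MONITORING_PATH_FAILURE = "monitoring path failure"
    HOST_SLOWDOWN = "host slowdown"
    OTHER = "other"


def host_keeps_running(uses_scsi_reservation: bool,
                       uses_lan_based_control: bool,
                       cause: Cause) -> bool:
    """Return True if, under this reading, the failed host is not reset.

    Assumption: the host stays running only when one of the two functions
    is in use AND the cause is a monitoring path failure or host slowdown.
    """
    feature_in_use = uses_scsi_reservation or uses_lan_based_control
    qualifying_cause = cause in (Cause.MONITORING_PATH_FAILURE,
                                 Cause.HOST_SLOWDOWN)
    return feature_in_use and qualifying_cause
```

If the function returns `True`, the second procedure (host did not terminate) is the likely one. In practice, confirm the actual state of the host before choosing a procedure rather than relying on a prediction.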

If the host on which the failure occurred terminated after hot standby processing finished

To restart the host where a failure occurred:

  1. Resolve the cause of the failure on the host where the failure occurred.

    Implement the recommended action to eliminate the cause of the failure according to the displayed messages.

  2. Restart the host.

    Start the host where the failure occurred so that it is available if a subsequent failure occurs on the secondary system that is executing jobs. The restarted host becomes the standby system for the host currently executing jobs.

  3. Start HA Monitor.

    If you have set HA Monitor to be started manually, execute the HA Monitor start command (monstart command).

    You can skip this step if you have set HA Monitor to start automatically when the kernel starts.

  4. Verify that HA Monitor startup processing is complete, and that there is a connection to the host currently executing jobs.

    For details about how to check this, see 7.4.4 Checking the status of servers and hosts after handling an error.

    The KAMN002-I message is issued when the HA Monitor startup processing is completed.

  5. Start the servers.

    Start the servers on the host where the failure occurred.

    For servers in the server mode:

    Execute the start command provided by the program.

    For servers in the monitor mode:

    Execute the monitor-mode server start command (monbegin command).

    You can also set servers in the monitor mode to start automatically. For details about how to specify the settings, see 7.7.3 Automating an operation after hot standby processing.

  6. Verify that the server startup processing has been completed and that the system is ready for hot standby processing.

    The KAMN252-I message is issued when the system is ready for hot standby processing.
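The steps above branch on two settings: whether HA Monitor starts automatically, and whether the servers run in the server mode or the monitor mode. The following Python sketch builds the resulting checklist as data. It does not execute anything. `Step` and `plan_after_termination` are hypothetical names, and the command names are given without arguments because this section does not list any.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    action: str
    command: Optional[str]  # None when no HA Monitor command is involved


def plan_after_termination(ha_monitor_autostart: bool,
                           monitor_mode: bool,
                           servers_autostart: bool = False) -> List[Step]:
    """Build the operator checklist for a host that terminated."""
    steps = [
        Step("Resolve the cause of the failure", None),
        Step("Restart the host", None),
    ]
    # monstart is needed only when HA Monitor is set to start manually.
    if not ha_monitor_autostart:
        steps.append(Step("Start HA Monitor", "monstart"))
    steps.append(Step("Wait for KAMN002-I and check the connection", None))
    if monitor_mode:
        # Monitor-mode servers can also be set to start automatically.
        if not servers_autostart:
            steps.append(Step("Start the monitor-mode servers", "monbegin"))
    else:
        steps.append(Step("Run the start command provided by the program", None))
    steps.append(Step("Wait for KAMN252-I", None))
    return steps
```

For example, `plan_after_termination(ha_monitor_autostart=False, monitor_mode=True)` yields a checklist whose HA Monitor commands are `monstart` followed by `monbegin`.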

If the host where the failure occurred did not terminate after hot standby processing finished

  1. Resolve the cause of the failure on the host where the failure occurred.

    Implement the recommended action to eliminate the cause of the failure according to the displayed messages.

  2. Execute the HA Monitor manual connection command (monlink command).

  3. Verify that there is a connection to the host currently executing jobs.

    For details about how to check this, see 7.4.4 Checking the status of servers and hosts after handling an error.

  4. Start the servers.

    Start the servers on the host on which the failure occurred.

    For servers in the server mode:

    Execute the start command provided by the program.

    For servers in the monitor mode:

    Execute the monitor-mode server start command (monbegin command).

    You can also set servers in the monitor mode to start automatically. For details about how to specify the settings, see 7.7.3 Automating an operation after hot standby processing.

  5. Verify that the server startup processing has been completed and that the system is ready for hot standby processing.

    The KAMN252-I message is issued when the system is ready for hot standby processing.
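Both procedures end by waiting for a specific message ID. The following Python sketch shows one way to check lines of message text that the operator has already captured. Where HA Monitor writes its messages is not covered in this section, so the input is an assumption, and the ID pattern is inferred only from the two IDs quoted here (KAMN002-I and KAMN252-I). The function names are hypothetical.

```python
import re
from typing import Iterable, Set

# Pattern inferred from the two IDs quoted in this section; an assumption.
MESSAGE_ID = re.compile(r"\bKAMN\d{3}-[A-Z]\b")

# Message issued when the system is ready for hot standby processing.
READY_FOR_HOT_STANDBY = "KAMN252-I"


def message_ids(lines: Iterable[str]) -> Set[str]:
    """Collect every message ID that appears in the given lines."""
    ids: Set[str] = set()
    for line in lines:
        ids.update(MESSAGE_ID.findall(line))
    return ids


def ready_for_hot_standby(lines: Iterable[str]) -> bool:
    """Return True once KAMN252-I appears in the captured lines."""
    return READY_FOR_HOT_STANDBY in message_ids(lines)
```

A check of this kind only confirms that the message was issued. The connection to the host currently executing jobs still needs to be verified as described in 7.4.4.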