Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


4.2.3 Host reset when there are multiple standby systems

This subsection explains host reset when there are multiple standby systems, such as in a multi-standby configuration.

(1) How to determine reset priorities

If there are multiple standby systems, you must specify reset priorities so that all hosts belonging to the same hot-standby configuration follow a single, unambiguous reset issuance order.

When the local host is connected to the remote hosts at startup, HA Monitor determines the local host's reset priority. Among all the connected hosts, the one with the lowest host address has the highest priority. The reset priority ranges from 0 to 31, where 0 is the highest priority.

The figure below shows the relationship between host address and reset priority. In this example, if a failure occurs on host 3, the reset is issued by host 1 because it has the lowest host address.

Figure 4‒14: Relationship between host address and reset priority

[Figure]

If a host failure is detected, the host with the highest reset priority issues the reset. Every other host waits for a period of time that depends on its reset priority before it issues a reset. The wait time for each host is calculated as follows: wait time (seconds) = host's reset priority × 10.

Note that the value of 10 (seconds) in the formula might differ, depending on the system being used.
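The rule above can be sketched in a few lines of Python. This is only an illustration, not HA Monitor code: the function names are hypothetical, and the sketch assumes that priorities are assigned as consecutive ranks in ascending order of host address (the text states only that the lowest address gets the highest priority and that priorities range from 0 to 31). The multiplier is kept as a parameter because the value of 10 seconds might differ by system.

```python
BASE_WAIT_SECONDS = 10  # value from the formula above; might differ by system
MAX_PRIORITY = 31       # reset priorities range from 0 (highest) to 31


def reset_priorities(host_addresses):
    """Map each host address to a reset priority.

    Assumption: priorities are consecutive ranks in ascending address
    order, so the lowest address gets priority 0 (the highest priority).
    """
    ordered = sorted(set(host_addresses))
    if len(ordered) > MAX_PRIORITY + 1:
        raise ValueError("more hosts than available reset priorities")
    return {address: rank for rank, address in enumerate(ordered)}


def reset_wait_seconds(priority, base=BASE_WAIT_SECONDS):
    """Return the wait time before a host issues a reset: priority x base."""
    if not 0 <= priority <= MAX_PRIORITY:
        raise ValueError("reset priority must be between 0 and 31")
    return priority * base


priorities = reset_priorities([30, 10, 20])
waits = {address: reset_wait_seconds(p) for address, p in priorities.items()}
```

Under these assumptions, the host with address 10 issues its reset immediately, while the hosts with addresses 20 and 30 wait one and two multiples of the base value, respectively.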

(2) Host reset when reset priorities are specified

This subsection explains how each host functions during a host reset when reset priorities have been specified.

A host that has detected a host failure issues a reset to the failed host in accordance with the local host's reset priority. That host then sends a reset issuance request to all connected hosts. Each host that receives the reset issuance request issues a reset to the failed host in accordance with its own reset priority.

If a reset is successful:

If a reset is successful, the host that issued the reset notifies all the hosts of the success (reset success notification). If there is any standby server to be switched over, the host then performs the hot standby operation.

A host that has received the reset success notification stops waiting if it is in the reset issuance wait state for the failed host. If there is any standby server to be switched over, the host performs the hot standby operation.

If a reset fails:

If a reset fails, the host that issued the reset waits for a reset success notification from some other host.
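The per-host behavior described in this subsection can be pictured as a small state machine. The following Python sketch is a simplified model under stated assumptions, not the actual HA Monitor implementation: the class, state, and method names are all hypothetical, and actions such as sending notifications are recorded as strings instead of being performed.

```python
from enum import Enum, auto


class ResetState(Enum):
    IDLE = auto()                 # no failure being handled
    WAITING_TO_ISSUE = auto()     # reset issuance wait state
    WAITING_FOR_SUCCESS = auto()  # own reset failed; waiting for another host
    DONE = auto()                 # failed host has been reset


class HostResetHandler:
    """Hypothetical model of how one host reacts to reset events."""

    def __init__(self, has_standby_server):
        self.state = ResetState.IDLE
        self.has_standby_server = has_standby_server
        self.actions = []  # record of actions, for illustration only

    def on_failure_or_request(self):
        # A failure was detected locally or a reset issuance request arrived.
        if self.state is ResetState.IDLE:
            self.state = ResetState.WAITING_TO_ISSUE

    def on_wait_elapsed(self, reset_ok):
        # The priority-based wait time has elapsed, so the reset is issued.
        if self.state is not ResetState.WAITING_TO_ISSUE:
            return
        if reset_ok:
            self.actions.append("notify_reset_success")
            self._finish()
        else:
            self.state = ResetState.WAITING_FOR_SUCCESS

    def on_reset_success_notification(self):
        # Another host reset the failed host: stop waiting and switch over.
        if self.state is not ResetState.DONE:
            self._finish()

    def _finish(self):
        if self.has_standby_server:
            self.actions.append("hot_standby")
        self.state = ResetState.DONE


handler = HostResetHandler(has_standby_server=True)
handler.on_failure_or_request()
handler.on_wait_elapsed(reset_ok=False)   # this host's reset fails
state_after_failure = handler.state
handler.on_reset_success_notification()   # another host succeeds
```

In this model, a host whose reset failed takes no further action until a reset success notification arrives, at which point it performs the hot standby operation if it has a standby server to switch over.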

(3) Processing if a host reset from a standby system has failed

If a host reset from a standby system fails, the hot standby operation can still be performed automatically because there are multiple standby systems. This subsection uses an example to explain the reset processing when a host reset from a standby system has failed. In this example, a failure has occurred on host 2, a reset from host 1 has failed, and a reset is then issued from host 3 after the reset issuance wait time has elapsed. The following figure shows the host reset processing when a host reset has failed in a hot-standby configuration where there are multiple standby systems.

Figure 4‒15: Host reset processing when a host reset has failed in a hot-standby configuration where there are multiple standby systems

[Figure]

The following provides the details of reset issuance when reset priorities are specified, where the item numbers correspond to the numbers in the figure:

  1. Send a reset instruction and a reset request.

    Host 1, which has the highest reset priority, detects a host failure on host 2 and issues a reset instruction to host 2. Host 1 also sends a reset request to the other hosts.

    Host 3 detects a host failure on host 2 and waits for issuance of a host 2 reset.

  2. Reset has failed.

    If the reset instruction from host 1 has failed, host 1 waits for a reset success notification from another host.

  3. Send a reset instruction.

    If host 3 has received no reset success notification from host 1 by the time its reset wait time (based on its reset priority) elapses, host 3 sends a reset instruction to host 2.

  4. Send a reset success notification and perform the hot standby operation.

    If the reset is successful, host 3 sends a reset success notification to the other hosts and performs the hot standby operation (if there is a standby server available for the hot standby operation).
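The sequence in steps 1 to 4 can be replayed with a short Python simulation. This is a sketch under several assumptions rather than a description of HA Monitor internals: the function name and event strings are hypothetical, hosts 1, 2, and 3 are assumed to have reset priorities 0, 1, and 2, all surviving hosts are assumed to detect the failure at the same moment (time 0), and the multiplier of 10 seconds is taken from the formula in (1) and might differ by system.

```python
BASE_WAIT_SECONDS = 10  # assumed multiplier; might differ by system


def simulate_reset(priorities, failed_host, reset_succeeds):
    """Replay reset attempts in priority order until one succeeds.

    priorities:     host name -> reset priority (0 is highest)
    failed_host:    name of the host on which the failure occurred
    reset_succeeds: host name -> whether that host's reset attempt works
    Returns a list of (time in seconds, host, event) tuples.
    """
    events = []
    survivors = sorted(
        (priority, host)
        for host, priority in priorities.items()
        if host != failed_host
    )
    for priority, host in survivors:
        time = priority * BASE_WAIT_SECONDS  # wait time for this host
        events.append((time, host, "reset issued"))
        if reset_succeeds.get(host, False):
            events.append((time, host, "reset success notification sent"))
            break  # remaining hosts stop waiting once notified
        events.append((time, host, "reset failed; waiting for notification"))
    return events


timeline = simulate_reset(
    priorities={"host1": 0, "host2": 1, "host3": 2},
    failed_host="host2",
    reset_succeeds={"host1": False, "host3": True},
)
for time, host, event in timeline:
    print(f"t={time:>2}s {host}: {event}")
```

With these inputs, host 1 issues its reset at once and fails, and host 3 issues its reset only after its own priority-based wait time has elapsed, which matches the order of events in the figure.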