Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


2.3.7 Function for controlling hot standby based on the availability of LAN communications

The function for controlling hot standby based on the availability of LAN communications controls operation after a host failure so that the active server runs only on a host on which LAN communication is available.

You must decide whether to use this function. For details about how to decide, see 1.4 Hot-standby switchover methods and 3.1 List of functions supported by HA Monitor.

When this function is used, the business application LAN that is used to check the availability of communication must also be configured as a monitoring path.

For examples of configuring the environment in this way, see (11) Example of environment settings when using the function for controlling hot standby based on the availability of LAN communications.

(1) Hot standby operation when a host failure is detected

This subsection explains hot standby operation when the function for controlling hot standby based on the availability of LAN communications is used.

If the multi-standby function is not used

If a host failure is detected in the primary system or the secondary system, HA Monitor checks whether LAN communication is available. HA Monitor then runs the server as the active server on a host on which LAN communication is available and terminates the server on the host on which LAN communication is not available.

The following figure shows the general flow of operations when a host failure is detected due to a monitoring path failure, communication is not available via a business application LAN in the primary system, and communication is available via a business application LAN in the secondary system.

Figure 2‒17: Hot standby operation when communication is not available via a business application LAN in the primary system and communication is available via a business application LAN in the secondary system

[Figure]

The following figure shows the general flow of operations when a host failure is detected due to a monitoring path failure, communication is available via a business application LAN in the primary system, and communication is not available via a business application LAN in the secondary system.

Figure 2‒18: Hot standby operation when communication is available via a business application LAN in the primary system and communication is not available via a business application LAN in the secondary system

[Figure]
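
The decision described above for the two-host case can be sketched as follows. This is an illustrative Python model only, not HA Monitor code. The names Host, Decision, and decide_active_host are hypothetical. The text above does not state what happens when LAN communication is available on both hosts or on neither, so the sketch returns None in those cases rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Host:
    name: str            # hypothetical label, for example "primary"
    lan_available: bool  # result of the business application LAN check


@dataclass(frozen=True)
class Decision:
    run_active_on: str   # host that runs the server as the active server
    terminate_on: str    # host on which the server is terminated


def decide_active_host(primary: Host, secondary: Host) -> Optional[Decision]:
    """Choose the host that runs the active server after a host failure.

    Only the two cases described in the text are modeled: exactly one host
    has working LAN communication. Any other combination returns None
    because the outcome is not specified in the text.
    """
    if primary.lan_available and not secondary.lan_available:
        return Decision(run_active_on=primary.name, terminate_on=secondary.name)
    if secondary.lan_available and not primary.lan_available:
        return Decision(run_active_on=secondary.name, terminate_on=primary.name)
    return None


# Case of Figure 2-17: LAN unavailable in the primary system,
# available in the secondary system.
decision = decide_active_host(Host("primary", False), Host("secondary", True))
```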

If the multi-standby function is used

See (3) Function for controlling hot standby based on the availability of LAN communications.

(2) Operation when the primary system has recovered from a slowdown

Suppose that the primary system slows down, the secondary system detects the slowdown as a host failure, and hot standby occurs. When the primary system then recovers from the slowdown, HA Monitor in the primary system detects that the active server is running in the secondary system, and performs planned termination on the active server in the primary system.

In a configuration that inherits IP addresses, the IP addresses might overlap temporarily between the primary and secondary systems until planned termination is performed on the active server in the system that recovered from the slowdown.

(3) OS panic in the event of a process failure on HA Monitor

If a process failure is detected on HA Monitor, the KAMN066-E message is output and an OS panic is generated to avoid contention for shared resources.

(4) Hot standby operation or planned hot standby operation in the event of an active server failure

In the event of an active server failure, HA Monitor performs a hot standby operation regardless of the status of the business application LANs, in the same manner as when the function for controlling hot standby based on the availability of LAN communications is not used.

If the shared resource (IP address) disconnection process fails on the source host, HA Monitor generates an OS panic on the source host, and then continues with the hot standby operation. This prevents contention for shared resources.
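
The order of operations described above might be modeled as in the following minimal Python sketch. It is an illustration only and does not reflect HA Monitor internals. The function name, the disconnect_ip callback, and the action labels are all hypothetical.

```python
from typing import Callable, List


def handle_active_server_failure(disconnect_ip: Callable[[], bool]) -> List[str]:
    """Return the ordered actions taken after an active server failure.

    disconnect_ip is a hypothetical callback that returns True when the
    shared resource (IP address) was disconnected on the source host.
    """
    actions = []
    if disconnect_ip():
        actions.append("ip_disconnected_on_source")
    else:
        # A failed disconnection could leave the IP address in use on the
        # source host, so the source host is panicked to avoid contention.
        actions.append("os_panic_on_source")
    # The status of the business application LANs is not consulted here:
    # the hot standby operation continues in either case.
    actions.append("hot_standby_to_destination")
    return actions
```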

(5) Conditions of use

The function for controlling hot standby based on the availability of LAN communications can be used when both the following conditions are satisfied:

(6) Required environment settings

Specify the settings as shown in the following table, as appropriate for the type of business application LANs used for checking the availability of communication.

Table 2‒3: Environment settings depending on the type of business application LANs

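The four rightmost columns are the types of business application LANs#2.

| Required environment settings | Item | hbonding#1 (active-backup mode) | hbonding#1 (802.3ad mode) | bonding | ethernet |
|---|---|---|---|---|---|
| HA Monitor environment settings | fence_lan | Y | Y | Y | Y |
| HA Monitor environment settings | lanfailswitch | Y | Y | Y | Y |
| HA Monitor environment settings | lancheck_patrol | O | O | O | O |
| HA Monitor environment settings | hbond_lacp | N | O | N | N |
| HA Monitor environment settings | lancheck_mode | O | O | O | O |
| Server environment definition | switch_judge | Y | Y | Y | Y |
| Server environment definition | switchbyfail | Y | Y | Y | Y |
| Files used for monitoring LANs | LAN monitor definition file | Y | Y | Y | Y |
| Files used for monitoring LANs | LAN monitoring script | O | Y | Y | Y |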

Legend:

Y: Specify.

O: Specify if necessary.

N: Cannot be specified.

#1

The monitoring interval of a slave interface monitored by hbonding must be shorter than the host failure monitoring interval (specified in the patrol operand in the HA Monitor environment settings).

#2

A tagged VLAN can also be used by specifying an interface name that does not include a period (.) for the device name.
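
As a rough aid for reviewing a configuration against Table 2-3, the Y/O/N legend can be encoded and checked mechanically. The following Python sketch is hypothetical and is not an HA Monitor tool. It assumes that the active-backup and 802.3ad modes in the table are sub-columns of hbonding, and it checks only whether an item is specified, not the value given to it.

```python
# Requirement codes from the legend:
# Y = specify, O = specify if necessary, N = cannot be specified.
# Assumed column order: hbonding (active-backup mode),
# hbonding (802.3ad mode), bonding, ethernet.
LAN_TYPES = ("hbonding/active-backup", "hbonding/802.3ad", "bonding", "ethernet")

REQUIREMENTS = {
    "fence_lan": "YYYY",
    "lanfailswitch": "YYYY",
    "lancheck_patrol": "OOOO",
    "hbond_lacp": "NONN",
    "lancheck_mode": "OOOO",
    "switch_judge": "YYYY",
    "switchbyfail": "YYYY",
    "LAN monitor definition file": "YYYY",
    "LAN monitoring script": "OYYY",
}


def check_settings(lan_type, specified):
    """Return a list of problems for the items specified for one LAN type."""
    col = LAN_TYPES.index(lan_type)
    problems = []
    for item, codes in REQUIREMENTS.items():
        code = codes[col]
        if code == "Y" and item not in specified:
            problems.append(f"missing required item: {item}")
        elif code == "N" and item in specified:
            problems.append(f"item cannot be specified: {item}")
    return problems
```

For example, check_settings("ethernet", {"fence_lan"}) reports every other item marked Y for ethernet as missing.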

For details about the settings related to the monitoring of LANs, see 3.4.1 LAN monitoring and automatic hot standby in the event of a failure, (1) Specifying a LAN monitor definition file, and (2) Specifying a LAN monitoring script.

Note that when a LAN monitoring script (lanpatrol.sh) is used, the time required for a hot standby operation in the event of a host failure includes the time needed for LAN monitoring script execution. Adjust the LAN monitoring script execution time according to the requirements for the hot standby processing time.

If you monitor the number of packets received, specify the host failure monitoring time (patrol operand in the HA Monitor environment settings) as follows:

Host failure monitoring timelancheck_patrol × 2

(7) Correspondence between server configuration and business application LANs

You must choose one business application LAN to be used for checking the availability of communication for each ungrouped server and for each group.

If there is a server that does not use a business application LAN, either set up a business application LAN to be used for checking the availability of communication, or group that server with other servers that use a business application LAN.
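
The rule above can be checked with a short script. The following Python sketch is an assumption-laden illustration: the function name, the dictionary layout, and the sample server, group, and interface names are hypothetical, and group names are assumed not to collide with server names.

```python
from typing import Dict, List, Optional


def units_without_check_lan(
    server_group: Dict[str, Optional[str]],
    check_lan: Dict[str, str],
) -> List[str]:
    """List the groups and ungrouped servers that have no check LAN.

    server_group maps each server to its group name, or to None if the
    server is ungrouped. check_lan maps a unit (a group name or the name of
    an ungrouped server) to the one business application LAN chosen for
    checking the availability of communication.
    """
    units: List[str] = []
    for server, group in server_group.items():
        unit = group if group is not None else server
        if unit not in units:
            units.append(unit)
    return [unit for unit in units if unit not in check_lan]


# sv1 and sv2 share a group, so one LAN for grpA covers both of them.
servers = {"sv1": "grpA", "sv2": "grpA", "sv3": None}
missing = units_without_check_lan(servers, {"grpA": "eth0"})
```

Here missing contains only sv3, which must either be given its own business application LAN for the check or be grouped with servers that use one.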

(8) Notes about the configuration of LANs

Note the following about the configuration of LANs: