Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


8.3.1 HA Monitor environment settings (sysdef)

You create a definition file containing the HA Monitor environment settings under the file name sysdef in the directory for HA Monitor environment settings.

A sample sysdef file is provided under the directory for HA Monitor sample files. If you copy this file to the directory for HA Monitor environment settings and then edit it, you can save time over creating the file from scratch.

You must create this definition file for each host. Bold type indicates a value that must be the same on all hosts. Care must be taken in specifying the other items so that there will be no conflicts between the hosts.

The following shows the format of the definition file used to specify the HA Monitor environment settings:

/*    HA Monitor environment settings    */
  environment  name  host-name-of-local-host
              ,address
                     host-address-of-local-host
              ,patrol
                     host-failure-monitoring-time
              ,lan   TCP/IP-host-name[:TCP/IP-host-name...]
              ,lanport
                     TCP/IP-service-name[:TCP/IP-service-name...]
            [,fs_log_size
                   {size-of-file-system-switchover-log-file
                   |65536}]
            [,servmax
                   {16|64|128|129-512#}]
            [,hostmax#
                   {maximum-number-of-hosts-to-be-connected|32}]
            [,pgmmax
                   {maximum-number-of-programs-that-can-run-concurrently|0}];
[function  [ cpudown
                   {online|standby|system}{,|;}]
            [ clearwait
                   {reset-publication-host-wait-time|20}{,|;}]
            [ standbyreset
                   {use|nouse}{,|;}]
            [ pathpatrol
                     monitoring-path-health-check-interval{,|;}]
            [ pathpatrol_retry
                   {re-check-interval|30}:{re-check-count|5}{,|;}]
            [ message_retry
                   {retry-interval|3}{,|;}]
            [ connect_retry
                   {connection-interval|5}:{connection-count|200}{,|;}]
            [ monbegin_restart
                   {use|nouse}{,|;}]
            [ netmask
                   {byte|bit}{,|;}]
            [ usrcommand
                     absolute-path-name-of-user-command{,|;}]
            [ resetpatrol
                   {reset-path-health-check-interval|2}{,|;}]
            [ resetpath_retry
                    {use|nouse}{,|;}]
            [ resetpath_inter
                      reset-path-re-check-interval{,|;}]
            [ multistandby
                   {use|nouse}{,|;}]
            [ deviceoff_order
                   {order|reverse}{,|;}]
            [ reset_type
                   {server|host}{,|;}]
            [ partition_reset
                   {use|nouse}{,|;}]
            [ jp1_event
                   {use|nouse}{,|;}]
            [ ph_log_size
                     monitoring-history-file-size{,|;}]
            [ ph_threshold
                     time-when-host-monitoring-history-is-to-be-obtained{,|;}]
            [ termcmd_at_abort
                   {use|nouse}{,|;}]
            [ alive_interval
                     alive-message-transmission-interval{,|;}]
            [ alive_multicast
                   {use|nouse}{,|;}]
            [ multicast_lan
                     host-name-of-multicast-group{,|;}]
            [ lanfailswitch
                   {use|nouse}{,|;}]
            [ lancheck_patrol
                   {LAN-monitoring-interval|15}{,|;}]
            [ lancheck_mode
                   {packet|route}{,|;}]
            [ disk_ptrl
                   {use|nouse}{,|;}]
            [ disk_ptrl_inter
                   {system-disk-check-interval|120}{,|;}]
            [ disk_ptrl_retry
                   {system-disk-check-retry-interval|60}:
                   {system-disk-check-retry-count|1}{,|;}]
            [ disk_log_size
                   {disk-monitoring-log-size|1048576}{,|;}]
            [ patrol_100ms
                   {use|nouse}{,|;}]
            [ suppress_reset
                     minimum-number-of-active-hosts{,|;}]
            [ exitcode
                   {type1|type2}{,|;}]
            [ vg_off
                   {sequential|parallel}{,|;}]
            [ vmware_env
                    {use|nouse}{,|;}]
            [ patrol_type
                   {server|host}{,|;}]
            [ resetpatrol_mode
                   {mode1|mode2}{,|;}]
            [ fence_reset
                   {use|nouse}{,|;}]
            [ fence_scsi
                   {use|nouse}{,|;}]
            [ fence_lan
                   {use|nouse}{,|;}]
            [ scsi_check
                   {reserve-status-monitoring-time|5}{,|;}]
            [ scsi_pathcheck
                   {disk-path-status-monitoring-interval|1}{,|;}]
            [ scsi_timeout
                   {SCSI-command-reply-wait-time|30}{,|;}]
            [ scsi_retry
                   {SCSI-command-retry-count|2}{,|;}]
            [ hbond_lacp
                     hbonding-interface-name
                     [:hbonding-interface-name...];]
            [ vg_offskip
                   {use|nouse}{,|;}]
            [ notswitch_notify
                   {use|nouse}{,|;}]
            [ servcomplete_msg
                   {use|nouse}{,|;}]
            [ delay_kamn238
                   {KAMN238-D-message-output-delay-time|0}{,|;}]
[options   [ clearcheck
                   {host-reset-end-time|20};]

#: The operand can be specified only when HA Monitor Extension is used.

Note that if you use SCSI reservation for shared disk (use is specified in the fence_scsi operand), the following operands are ignored, if specified:

The following operands are ignored when you use the function for controlling hot standby based on the availability of LAN communications used by business applications (use is specified in the fence_lan operand):

(1) environment definition statement

The environment definition statement defines HA Monitor's operating environment. The following explains the operands of the environment definition statement.

(a) name

User-specified values

  Value type: 1 to 32 alphanumeric characters
  Range of numerical values: -- (Units: --)
  Can be omitted: No
  Assumed value when this value is omitted: --

Description

Specifies a name for the host. This name is used by HA Monitor to identify the host.

Each host's name must be unique. The name specified in the name operand is displayed as the host name when HA Monitor commands are executed.

(b) address

User-specified values

  Value type: Unsigned integer having 1 to 4 digits
  Range of numerical values: 0 to 9999 (or 1 to 9999 if use is specified for the fence_scsi operand in the function definition statement) (Units: --)
  Can be omitted: No
  Assumed value when this value is omitted: --

Description

Specifies the local host's host address. This value identifies the host when its CPU is reset by a host reset, and it also identifies the host when the host reserves a shared disk (when SCSI reservation for shared disk is used).

HA Monitor specifies the hardware settings.

You must specify a unique address for each host connected via monitoring paths and the reset path. If the system contains multiple hot-standby configurations that do not monitor each other but they share the same reset path, each of the hosts must have a unique address.

If two or more hosts have the same address, HA Monitor might malfunction. If you accidentally set the same address and the KAMN414-E message is output, take appropriate action as described in 7.5.14 Handling a duplicated address operand value.

Due to hardware limitations, specify a value in the range from 0 to 9999. If you specify use in the fence_scsi operand in the function definition statement, specify a value in the range from 1 to 9999.

If the multistandby operand is specified in the function definition statement, or if host is specified in the reset_type operand, the host with the lowest host address has the highest reset priority. When deciding on a host address, consider the following points:

  • If you want to prioritize resetting the active server over the standby server when a host failure is detected simultaneously on both the active system and the standby system, give the standby system a higher host address than the active system. For standby systems in a multi-standby configuration, give lower host addresses to systems with smaller values for the standbypri operand in the server environment definition.

  • If you want to prioritize hot standby processing over resetting the active server when a host failure is detected simultaneously on both the active system and the standby system, give the standby system a lower host address than the active system. For standby systems in a multi-standby configuration, give lower host addresses to hosts with larger values for the standbypri operand in the server environment definition. This setting is recommended when you need to obtain an accurate dump as soon as possible in order to investigate a problem such as a slowdown of the active system.

  • In a configuration without a clear separation between active and standby systems, such as a configuration in which all the hosts are active systems, decide on a host address by taking into account the reset priority of the host based on factors such as the importance of the job and the host's amount of resources.
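The ordering rule described above can be illustrated with a short Python sketch. It assumes that reset priority is decided only by the address operand (lowest address first), as described for the multistandby operand and for host in the reset_type operand. The host names and address values are hypothetical, and the helper functions are not part of HA Monitor.

```python
# Hypothetical hosts and their address operand values (not from the manual).
hosts = {"standby1": 20, "active1": 10, "standby2": 30}


def validate_address(address: int, fence_scsi: bool = False) -> None:
    """Raise ValueError if the address is outside the documented range."""
    lowest = 1 if fence_scsi else 0  # 1 to 9999 when fence_scsi is use
    if not lowest <= address <= 9999:
        raise ValueError(f"address {address} is outside {lowest} to 9999")


def reset_priority_order(addresses: dict) -> list:
    """Return host names from highest to lowest reset priority."""
    if len(set(addresses.values())) != len(addresses):
        raise ValueError("each host must have a unique address")
    for value in addresses.values():
        validate_address(value)
    # The host with the lowest address has the highest reset priority.
    return sorted(addresses, key=addresses.get)


print(reset_priority_order(hosts))  # ['active1', 'standby1', 'standby2']
```

In this hypothetical layout the active system has the lowest address, which corresponds to the second case above (hot standby processing is prioritized over resetting the active server).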

(c) patrol

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 3 to 600 (Units: seconds)
  Can be omitted: No
  Assumed value when this value is omitted: --

Description

Specifies the interval for monitoring for receipt of alive messages from remote hosts. This monitoring is used to detect host failures.

You must specify the same value for all hosts in the hot-standby configuration. If no alive message is sent from a remote host within the time specified in the patrol operand, HA Monitor determines that a host failure has occurred. To specify a time shorter than five seconds, specify the alive_interval operand.

(d) lan

User-specified values

  Value type: Path name having 1 to 32 characters
  Range of numerical values: -- (Units: --)
  Can be omitted: No
  Assumed value when this value is omitted: --

Description

Specifies the host names of the TCP/IP LANs that are to be used as monitoring paths.

Specify a unique name for each host.

A host name specified in the lan operand must be a host name specified in the /etc/hosts file. For details about the /etc/hosts file, see 6.10.1 Adding host names and service names.

Delimit the TCP/IP LAN host names with a colon (:). You can specify a maximum of six host names. If you use multiple monitoring paths and have a preferred monitoring path, specify a hash mark (#) in front of that path's host name. You can specify only one preferred path. The following shows a specification example:

lan  #path11:path12,

This example uses two monitoring paths, path11 and path12, and path11 is to be used as the preferred path. If the preferred monitoring path fails and transmission of alive messages stops for 70% of the host failure monitoring time, HA Monitor attempts to communicate via all monitoring paths and switches to a normal monitoring path over which a reply is received.
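As a rough illustration of the rules for the lan operand value, the following Python sketch splits a value into its host names, finds the preferred path, and computes the 70% threshold mentioned above. The function names are hypothetical and only model the rules stated in this section; they do not reproduce HA Monitor's own parsing.

```python
from typing import List, Optional, Tuple


def parse_lan(value: str) -> Tuple[List[str], Optional[str]]:
    """Split a lan operand value into host names and the preferred path."""
    names = value.split(":")
    if not 1 <= len(names) <= 6:
        raise ValueError("specify between one and six host names")
    preferred = [name[1:] for name in names if name.startswith("#")]
    if len(preferred) > 1:
        raise ValueError("only one preferred path can be specified")
    hosts = [name.lstrip("#") for name in names]
    return hosts, (preferred[0] if preferred else None)


def preferred_path_switch_delay(patrol_seconds: int) -> float:
    """Seconds without alive messages (70% of patrol) before all paths are tried."""
    return patrol_seconds * 70 / 100


print(parse_lan("#path11:path12"))      # (['path11', 'path12'], 'path11')
print(preferred_path_switch_delay(10))  # 7.0
```

With a patrol value of 10 seconds (an example value), HA Monitor would try the other monitoring paths after about 7 seconds without alive messages on the preferred path.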

(e) lanport

User-specified values

  Value type: 1 to 32 alphanumeric characters
  Range of numerical values: -- (Units: --)
  Can be omitted: No
  Assumed value when this value is omitted: --

Description

Specifies the service names of the TCP/IP LANs that are to be used as monitoring paths.

Specify the same values for all hosts in the hot-standby configuration.

A service name specified in the lanport operand must be a service name specified in the /etc/services file. For details about the /etc/services file, see 6.10.1 Adding host names and service names.

The service names in the lanport operand must be specified so that they correspond in order and number to the host names specified in the lan operand. You must specify as many service names as there are host names in the lan operand. The following shows a specification example:

      lan      path11 :path12,
                 ↓
      lanport  HAmon1 :HAmon2,

In this example, path11 in the lan operand corresponds to HAmon1 in the lanport operand, and path12 in the lan operand corresponds to HAmon2 in the lanport operand.
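The positional correspondence between the two operands can be checked mechanically. The following Python sketch pairs each host name with the service name in the same position and rejects values whose counts differ. The function name is hypothetical, and the check covers only the order-and-number rule, not the lookups in /etc/hosts or /etc/services.

```python
def pair_lan_with_lanport(lan: str, lanport: str) -> dict:
    """Map each lan host name to the lanport service name in the same position."""
    # A leading '#' marks the preferred path and is not part of the host name.
    hosts = [name.lstrip("#") for name in lan.split(":")]
    services = lanport.split(":")
    if len(hosts) != len(services):
        raise ValueError(
            f"lan has {len(hosts)} host names but lanport has "
            f"{len(services)} service names"
        )
    return dict(zip(hosts, services))


print(pair_lan_with_lanport("path11:path12", "HAmon1:HAmon2"))
# {'path11': 'HAmon1', 'path12': 'HAmon2'}
```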

(f) fs_log_size

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 0 to 2147483647 (Units: bytes)
  Can be omitted: Yes
  Assumed value when this value is omitted: 65536

Description

Specifies the maximum size (in bytes) for a log file that is created when file systems are switched.

Specify the same value for all hosts in the hot-standby configuration. When a log file reaches the specified maximum size, it is automatically backed up at the next log acquisition point and then the file is cleared.

(g) servmax

User-specified values

  Value type: 16|64|128|129 to 512
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: 16

Description

Specifies the maximum number of servers that can be run concurrently on a single host.

Specify the same value for all hosts in the hot-standby configuration. You can specify only one of the following numeric values:

  • 16: Sets 16 as the maximum number of servers that can be run concurrently.

  • 64: Sets 64 as the maximum number of servers that can be run concurrently.

  • 128: Sets 128 as the maximum number of servers that can be run concurrently.

  • 129 to 512: Sets a value in the range from 129 to 512 as the maximum number of servers that can be run concurrently. This is applicable only when HA Monitor Extension is used.

If you set the maximum number of servers that can be run concurrently to a value other than 16 (default), you must also change a kernel parameter setting. For details about the kernel parameter setting, see 6.3.6 Specifying the kernel parameters.

(h) hostmax

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 32 to 256 (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: 32

Description

Specifies the maximum number of hosts, including the local host, that can be connected.

You must specify the same value for all hosts in the hot-standby configuration.

This operand can be specified only when HA Monitor Extension is used.

For details about HA Monitor Extension, see 3.7 Functions of HA Monitor Extension.

If you have changed the maximum number of hosts, you must also change kernel parameter settings. For details about the kernel parameter settings, see 6.3.6 Specifying the kernel parameters.

(i) pgmmax

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 0 to 256 (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: 0

Description

When the monitor-mode program management function is used, this operand specifies the maximum number of processes that can issue the hamon_patrolstart function per host.

If you will be using the program management function, ensure that this operand value has been changed from its default value. You must specify the same value for all hosts in the hot-standby configuration.
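To show how the mandatory operands of the environment definition statement fit together, the following Python sketch renders a minimal statement as text. All values (host1, address 1, patrol 10, path11, HAmon1, and so on) are hypothetical. The trailing-comma placement follows the lan specification example in this section, and the column alignment is only an assumption made for readability. Check the result against the sample sysdef file before using it.

```python
def render_environment(name, address, patrol, lan, lanport):
    """Return a minimal environment definition statement as text."""
    if len(lan) != len(lanport):
        raise ValueError("lan and lanport need the same number of entries")
    operands = [
        ("name", name),
        ("address", str(address)),
        ("patrol", str(patrol)),
        ("lan", ":".join(lan)),
        ("lanport", ":".join(lanport)),
    ]
    lines = []
    for index, (operand, value) in enumerate(operands):
        keyword = "environment" if index == 0 else ""
        # Operands are separated by commas; the statement ends with a semicolon.
        terminator = ";" if index == len(operands) - 1 else ","
        lines.append(f"{keyword:<13}{operand:<9}{value}{terminator}")
    return "\n".join(lines)


print(render_environment("host1", 1, 10,
                         ["path11", "path12"], ["HAmon1", "HAmon2"]))
```

The optional operands (fs_log_size, servmax, hostmax, and pgmmax) are left out here, so their assumed values would apply.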

(2) function definition statement

The function definition statement defines operation options for HA Monitor. Specification of this definition statement is optional. If you omit the function definition statement, all the default values are assumed. The following explains the operands of the function definition statement.

(a) cpudown

User-specified values

  Value type: online|standby|system
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: online

Description

Specifies whether the active system host or the standby system host is to have reset priority.

Specify the same value for all hosts in the hot-standby configuration. If a host failure is detected on both the active system and the standby system at the same time, the host that has been given reset priority is the one that resets the remote hosts.

  • online: The active system is the host that has reset priority.

  • standby: The standby system is the host that has reset priority.

  • system: Prevents concurrent host resets without having to specify a reset priority. You can specify system only if you employ a mutual hot-standby configuration, BladeSymphony is being used, and the SVPs support the cluster management function.

    Note also that system cannot be specified if you use any of the following operand specifications:

    • reset_type operand for which host is specified

    • multistandby operand for which use is specified

    • standbyreset operand for which use is specified

    • vmware_env operand for which use is specified

(b) clearwait

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 1 to 600 (Units: seconds)
  Can be omitted: Yes
  Assumed value when this value is omitted: 20

Description

If all monitoring paths fail, all hosts might be reset at the same time. To prevent such a problem, this operand is used to specify the time (in seconds) during which the hosts that do not have reset priority wait until a host reset request is issued.

The following shows the points that you must note when specifying this operand. If all monitoring paths fail when an incorrect setting has been specified, all hosts might be reset at the same time, causing job processing to terminate.

  • Specify the same value that is set for the clearcheck operand. For details, see the description of the clearcheck operand.

  • Specify the same value for all hosts in the hot-standby configuration. Even if the models of servers in the hot-standby switchover configuration include ones that do not require setting of this operand, make sure that the operand is specified with the same value for all servers.

(c) standbyreset

User-specified values

  Value type: use|nouse
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: nouse

Description

Specifies whether the standby system is reset if the active system detects a host failure in the standby system. Specify the same value for all hosts in the hot-standby configuration. Do not specify use in a configuration other than the 1-to-1 switchover configuration.

  • use: Resets the standby system.

  • nouse: Does not reset the standby system.

(d) pathpatrol

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 1 to 240 (Units: minutes)
  Can be omitted: Yes
  Assumed value when this value is omitted: --#

#: If you omit this operand, HA Monitor does not perform health checks on the monitoring paths.

Description

Specifies the interval at which monitoring path health checks are performed.

Specify the same value for all hosts in the hot-standby configuration.

When you specify this operand, HA Monitor performs health checks of monitoring paths, which facilitates early detection of failures that might occur on any of the multiple monitoring paths. If you omit this operand, HA Monitor does not perform health checks on the monitoring paths. We recommend that you specify this operand.

(e) pathpatrol_retry

User-specified values

  re-check-interval
    Value type: Unsigned integer
    Range of numerical values: 3 to 600 (Units: seconds)
    Can be omitted: Yes
    Assumed value when this value is omitted: 30

  re-check-count
    Value type: Unsigned integer
    Range of numerical values: 0 to 20 (Units: number of times)
    Can be omitted: Yes
    Assumed value when this value is omitted: 5

Description

Specifies rechecking of a monitoring path's status after a monitoring path failure. You specify here the interval between the status checks, a colon (:), and the maximum number of times rechecking is performed.

Specify the same values for all hosts in the hot-standby configuration. If you omit the pathpatrol operand, HA Monitor does not recheck a failed monitoring path.

  • re-check-interval

    Specifies the interval at which a failed monitoring path is re-checked.

  • re-check-count

    Specifies the maximum number of times a failed monitoring path is re-checked. If you specify 0, HA Monitor does not recheck monitoring paths.

Specify the re-check interval and the re-check count so that the total time spent on re-checks does not exceed the monitoring path health check interval specified in the pathpatrol operand. Use the following formula to check whether the specified values satisfy this condition:

monitoring path health check interval × 60 ≥ re-check interval × (re-check count + 1) + a

Legend:
  a: 30 to 60 seconds

For example, if the value of the pathpatrol operand is 1 minute and the re-check interval is 5 seconds, the formula is satisfied if the re-check count is 5 or smaller.
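The formula can be evaluated with a few lines of Python. This is a sketch only: the function names are hypothetical, and the margin a is assumed to be 30 seconds (the low end of the 30 to 60 second range), which is the value that reproduces the example above.

```python
from typing import Optional


def pathpatrol_retry_fits(pathpatrol_minutes: int, recheck_interval: int,
                          recheck_count: int, a: int = 30) -> bool:
    """Evaluate: pathpatrol x 60 >= re-check interval x (re-check count + 1) + a."""
    return pathpatrol_minutes * 60 >= recheck_interval * (recheck_count + 1) + a


def max_recheck_count(pathpatrol_minutes: int, recheck_interval: int,
                      a: int = 30) -> Optional[int]:
    """Largest re-check count (capped at 20) that satisfies the formula."""
    budget = pathpatrol_minutes * 60 - a
    count = budget // recheck_interval - 1
    if count < 0:
        return None  # no re-check count satisfies the formula
    return min(count, 20)  # the operand accepts 0 to 20


print(max_recheck_count(1, 5))          # 5
print(pathpatrol_retry_fits(1, 30, 5))  # False
print(pathpatrol_retry_fits(4, 30, 5))  # True
```

Under the same assumption for a, the default pathpatrol_retry values (30:5) need 30 × 6 + 30 = 210 seconds, so they satisfy the formula only when the pathpatrol operand is 4 minutes or longer.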

(f) message_retry

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 3 to 600 (Units: seconds)
  Can be omitted: Yes
  Assumed value when this value is omitted: 3

Description

Specifies the interval at which message transmission is retried when transmission of a query response message that uses a monitoring path fails.

Specify the same value for all hosts in the hot-standby configuration.

If you omit the message_retry operand, HA Monitor retries the message transmission every three seconds.

HA Monitor continues resending the message until the transmission is successful. If the hosts are not monitoring each other (alive message transmission is not being monitored) and transmission of a query response message does not succeed after 60 seconds, HA Monitor determines that a host failure has occurred. If a value greater than 60 seconds is specified in the message_retry operand, HA Monitor waits for message reception until the specified amount of time elapses. If a message has still not been received, HA Monitor then determines that a host failure has occurred.

For details about retrying transmission of query response messages, see (4) Retrying transmission of query response messages.

(g) connect_retry

User-specified values

  connection-interval
    Value type: Unsigned integer
    Range of numerical values: 5 to 60 (Units: seconds)
    Can be omitted: Yes
    Assumed value when this value is omitted: 5

  connection-count
    Value type: Unsigned integer
    Range of numerical values: 1 to 9999 (Units: number of times)
    Can be omitted: Yes
    Assumed value when this value is omitted: 200

Description

Specifies how connection processing is performed between HA Monitors. You specify here the interval between attempts to establish connection, a colon (:), and the maximum number of times connection establishment is attempted.

Specify the same values for all hosts in the hot-standby configuration.

It is important that you specify this operand, because if many HA Monitors are connected, connection establishment between HA Monitors might fail. This operand becomes effective when an HA Monitor connection configuration file has been created.

  • connection-interval

    Specifies the interval between attempts to establish connection between HA Monitors. A short interval might adversely affect HA Monitor performance, while a long interval might result in delays in establishing connections between HA Monitors. As a guideline, if 10 or more HA Monitors are connected, specify at least 10 seconds.

  • connection-count

    Specifies the number of times connection processing between HA Monitors is retried. If 9999 is specified, connection processing will be retried until all HA Monitors are connected.

    If this value is too small, connection between HA Monitors might fail. A large value might adversely affect HA Monitor performance. As a guideline, specify the default values and consider increasing the values only if connection between HA Monitors fails.

    If connection has not been established after the specified number of connection attempts, the KAMN176-E message is issued.
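When choosing values, it can help to estimate how long connection attempts continue before they are abandoned. The following Python sketch multiplies the two values. It assumes that attempts are evenly spaced and that each attempt takes negligible time, so the result is only a rough figure. The function name is hypothetical.

```python
from typing import Optional


def retry_window_seconds(interval: int = 5, count: int = 200) -> Optional[int]:
    """Approximate time spent on connection attempts, or None if unlimited."""
    if not 5 <= interval <= 60:
        raise ValueError("connection-interval must be 5 to 60 seconds")
    if not 1 <= count <= 9999:
        raise ValueError("connection-count must be 1 to 9999")
    if count == 9999:
        return None  # retried until all HA Monitors are connected
    return interval * count


print(retry_window_seconds())        # 1000
print(retry_window_seconds(10, 200)) # 2000
```

With the default values (5:200), connection attempts would therefore continue for roughly 1,000 seconds under these assumptions.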

(h) monbegin_restart

User-specified values

  Value type: use|nouse
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: use

Description

This operand specifies whether to start a monitor-mode server by executing the monitor-mode server start command (monbegin command) automatically in the following cases:

  • After hot-standby switchover occurs due to a host failure, HA Monitor for the host on which the failure occurred is restarted.

  • HA Monitor for the host on which the standby server is running is restarted.

Specify the same value for all hosts in the hot-standby configuration.

  • use: Automatically starts the servers in the monitor mode. If you do not use an operations management software program (such as JP1) to start servers automatically, we recommend that you specify use.

  • nouse: Does not start the servers in the monitor mode automatically. If you use an operations management software program (such as JP1) to start servers automatically, we recommend that you specify nouse.

(i) netmask

User-specified values

  Value type: byte|bit
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: byte

Description

Specifies the netmask that was specified for the LAN interfaces used for the monitoring paths.

Specify the same value for all hosts in the hot-standby configuration. If a LAN is configured as multiple subnets with the same network number, specify the same netmask. For details about the netmask settings, see 5.4.2 Necessary IP addresses and port numbers.

If the netmask operand and netmask settings are invalid, HA Monitor will malfunction, disabling communication with the HA Monitors on the remote hosts.

  • byte: Specify this value only if the netmask consists only of 255 and 0 (in decimal). The following shows an example:

    255.255.0.0

  • bit: Specify this value if the netmask does not consist only of 255 and 0 (in decimal). The following shows an example:

    255.255.255.192

    When bit is specified in the netmask operand, the following limitations apply to the netmask settings:

    The network address part (network number and subnet number) that is recognized by the combination of IP address and netmask must include the entire network number part defined for each class. For example, 255.254.0.0 cannot be specified as the netmask for a network in class B.

You can use the ifconfig OS command to check the netmask value. For details about the netmask and the ifconfig command, see the OS documentation.
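The choice between byte and bit, and the class-based limitation on the netmask, can be sketched in Python with the standard ipaddress module. The function names are hypothetical. The class boundaries used here (first octet below 128 for class A, below 192 for class B, below 224 for class C) are the conventional classful ranges and are not taken from this manual.

```python
import ipaddress


def netmask_operand_value(netmask: str) -> str:
    """Return 'byte' if every octet is 255 or 0, otherwise 'bit'."""
    octets = ipaddress.IPv4Address(netmask).packed  # raises on malformed input
    return "byte" if all(octet in (0, 255) for octet in octets) else "bit"


def covers_class_network(ip_address: str, netmask: str) -> bool:
    """Check that the netmask includes the whole classful network number."""
    first_octet = ipaddress.IPv4Address(ip_address).packed[0]
    if first_octet < 128:
        class_mask = "255.0.0.0"      # class A
    elif first_octet < 192:
        class_mask = "255.255.0.0"    # class B
    elif first_octet < 224:
        class_mask = "255.255.255.0"  # class C
    else:
        raise ValueError("not a class A, B, or C address")
    mask = int(ipaddress.IPv4Address(netmask))
    required = int(ipaddress.IPv4Address(class_mask))
    return (mask & required) == required


print(netmask_operand_value("255.255.0.0"))               # byte
print(netmask_operand_value("255.255.255.192"))           # bit
print(covers_class_network("172.16.0.1", "255.254.0.0"))  # False
```

The address 172.16.0.1 is a hypothetical class B address used to reproduce the 255.254.0.0 example above.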

(j) usrcommand

User-specified values

  Value type: Path name having 1 to 1000 characters
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: --

Description

Specifies the absolute path name of the user commands that are issued automatically by HA Monitor.

Specify the same value for all hosts in the hot-standby configuration.

For details about how to create user commands and examples of user commands, see 6.19 Creating user commands.

(k) resetpatrol

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 1 to 60 (Units: minutes)
  Can be omitted: Yes
  Assumed value when this value is omitted: 2

Description

Specifies the interval at which reset path health checking is performed.

Specify the same value for all hosts in the hot-standby configuration.

An appropriate reset path health check interval depends on the number of HA Monitors to be connected. Specify a value that satisfies the following condition:

Reset path health check interval × 60 > number of hosts × 10
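The condition can be turned into a small calculation that returns the smallest resetpatrol value for a given number of hosts. The following Python sketch is one way to do this. The function name is hypothetical, and "number of hosts" is taken to mean the number of connected HA Monitors, as the preceding text suggests.

```python
def min_resetpatrol_minutes(number_of_hosts: int) -> int:
    """Smallest interval (minutes) with interval x 60 > number of hosts x 10."""
    required_seconds = number_of_hosts * 10
    # Strict inequality: the interval in seconds must exceed required_seconds.
    minutes = required_seconds // 60 + 1
    if minutes > 60:
        raise ValueError("no value in the range 1 to 60 satisfies the condition")
    return minutes


print(min_resetpatrol_minutes(2))   # 1
print(min_resetpatrol_minutes(12))  # 3
print(min_resetpatrol_minutes(32))  # 6
```

Under this condition, the default of 2 minutes is sufficient for up to 11 hosts, because 2 × 60 = 120 must be greater than the number of hosts × 10.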

(l) resetpath_retry

User-specified values

  Value type: use|nouse
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: use

Description

This operand specifies whether to re-check the reset path status so that the reset path can be restored after an error is detected by a reset path health check.

Set the same value for all hosts in the hot-standby configuration. If a temporary reset path failure occurs (for example, due to restart of the failure management processor) when this function is used, the status where host reset is possible is automatically restored after recovery from the failure. When the reset path is restored, the KAMN646-I message is output.

  • use: HA Monitor automatically restores the status of the reset path.

  • nouse: HA Monitor does not automatically restore the status of the reset path.

(m) resetpath_inter

User-specified values

  Value type: Unsigned integer
  Range of numerical values: 10 to 1440 (Units: minutes)
  Can be omitted: Yes
  Assumed value when this value is omitted: 15

Description

This operand specifies the interval at which the reset path status is to be re-checked so that the reset path can be restored after an error is detected by a reset path health check. The interval can be specified in minutes in the range from 10 to 1,440.

This operand takes effect only if the resetpath_retry operand is omitted or if use is specified for the resetpath_retry operand. If nouse is specified for the resetpath_retry operand, the specification of this operand is ignored.

(n) multistandby

User-specified values

  Value type: use|nouse
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: nouse

Description

Specifies whether the multi-standby function that enables multiple standby servers to be defined for a single active server is to be used.

Specify the same value for all hosts in the hot-standby configuration. For details about the multi-standby function, see 4.5 Managing servers and hosts when using the multi-standby function.

  • use: Uses the multi-standby function.

  • nouse: Does not use the multi-standby function.

(o) deviceoff_order

User-specified values

  Value type: order|reverse
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: order

Description

Specifies the order in which shared resources are disconnected.

Specify the same value for all hosts in the hot-standby configuration. This operand applies to all servers defined in the HA Monitor.

You can also specify a shared resource separation order for each server. For details about how to specify a shared resource separation order for each server, see 8.4.1 Server environment definition (servers). For details about the order in which shared resources are disconnected, see 4.8.7 Processing flow for disconnecting shared resources in the reverse order from which they were connected.

  • order: Disconnects shared resources in the same order in which they were connected.

  • reverse: Disconnects shared resources in the reverse order from which they were connected.

(p) reset_type

User-specified values

  Value type: server|host
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: server

Description

Specifies how the reset issuing host is determined.

Specify the same value for all hosts in the hot-standby configuration. If use is specified for the multistandby operand in the HA Monitor environment settings, host is assumed regardless of this operand's setting.

  • server: Determines the reset issuing host in advance according to a process that determines both the reset issuing host and the host that has reset priority.

  • host: Determines the reset issuing host at the time the host reset is performed, based on the host that has the highest reset priority. The smaller the host address specified in the address operand of the HA Monitor environment settings, the higher the reset priority of the host.

(q) partition_reset

User-specified values

  Value type: use|nouse
  Range of numerical values: -- (Units: --)
  Can be omitted: Yes
  Assumed value when this value is omitted: nouse

Description

Specifies whether the physical partition reset function is to be used.

This operand is applicable in virtualization environments in which Hitachi server virtualization (Virtage) or VMware ESXi is used. In a hot-standby configuration that includes virtualization and non-virtualization environments, this operand does not have to be specified for the hosts in non-virtualization environments. Specify this operand only in a virtualization environment.

  • use: Uses the physical partition reset function. Specify use if each processor is to be reset during hot standby processing in the event of a reset error on the host in a virtualization environment.

  • nouse: Does not use the physical partition reset function. Specify nouse if the server is to be placed in the hot-standby wait state in the event of a reset error on the host in the virtualization environment.

(r) jp1_event

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether the JP1 event notification function is to be used.

You must specify the same value for all hosts in the hot-standby configuration. To use the JP1 event notification function, you need JP1/Base for the physical host environment in your system.

When use is specified in the jp1_event operand, HA Monitor issues JP1 events. When the jp1_event operand is omitted, HA Monitor does not issue JP1 events.

  • use: Uses the JP1 event notification function.

  • nouse: Does not use the JP1 event notification function.

(s) ph_log_size

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1024 to 10485760

(Units: bytes)

Yes

--

Description

Specifies the size of a monitoring history file that is used to record monitoring history information about slowdowns of servers and hosts.

You must specify the same value for all hosts in the hot-standby configuration.

When this operand is specified, HA Monitor creates a monitoring history file of the specified size when HA Monitor starts, and then collects monitoring history for servers and hosts. When the monitoring history file becomes full, its contents are stored in a backup file and the monitoring history file is wrapped around for reuse. The following shows the formula for estimating the size of a monitoring history file:

Number of times the servers and hosts for which monitoring history is collected are started × 180 + number of slowdowns to be recorded in the monitoring history file × 90
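As a worked example of the formula above, the following Python sketch computes an estimate. It assumes that the result is in bytes (the unit of this operand); the function names and the sample counts are hypothetical, and rounding an estimate into the permitted range is an illustrative choice rather than documented behavior.

```python
PH_LOG_SIZE_MIN = 1024       # lower limit of the ph_log_size operand
PH_LOG_SIZE_MAX = 10485760   # upper limit of the ph_log_size operand


def estimate_ph_log_size(start_count, slowdown_count):
    """Apply the estimation formula: starts x 180 + slowdowns x 90."""
    if start_count < 0 or slowdown_count < 0:
        raise ValueError("counts must not be negative")
    return start_count * 180 + slowdown_count * 90


def fit_to_operand_range(size):
    """Move an estimate into the range accepted by ph_log_size (assumption)."""
    return max(PH_LOG_SIZE_MIN, min(PH_LOG_SIZE_MAX, size))


# Hypothetical workload: 20 starts and 100 recorded slowdowns.
estimate = estimate_ph_log_size(start_count=20, slowdown_count=100)
print(estimate)                        # 20 * 180 + 100 * 90 = 12600
print(fit_to_operand_range(estimate))  # 12600 (already within the range)
```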

(t) ph_threshold

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

5 to 559

(Units: seconds)

Yes

--#

#: If this operand is omitted, the host monitoring history is not collected.

Description

Specifies the time when host monitoring history is to be obtained, that is, the length of time from when transmission of alive messages stops until collection of monitoring history begins. This operand applies when information about slowdowns of local and remote hosts is to be collected as monitoring history.

You must specify the same value for all hosts in the hot-standby configuration. The value for the time when host monitoring history is to be obtained must be less than the value for the host failure monitoring time specified in the patrol operand in the HA Monitor environment settings.

When you specify this operand to collect a host monitoring history, you must also specify the ph_log_size operand.
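The three conditions stated for this operand can be checked together before the definition file is distributed. The following Python sketch is a hypothetical pre-check, not an HA Monitor tool. It assumes that the patrol operand is expressed in seconds, the same unit as ph_threshold.

```python
def check_ph_threshold(ph_threshold, patrol, ph_log_size=None):
    """Return a list of problems; an empty list means no problem was found.

    patrol is assumed to be in seconds (assumption for this sketch).
    ph_log_size is None when the operand is omitted.
    """
    problems = []
    if not 5 <= ph_threshold <= 559:
        problems.append("ph_threshold must be in the range 5 to 559 seconds")
    if ph_threshold >= patrol:
        problems.append("ph_threshold must be less than the patrol value")
    if ph_log_size is None:
        problems.append("ph_log_size must also be specified")
    return problems


print(check_ph_threshold(ph_threshold=30, patrol=60, ph_log_size=65536))  # []
print(len(check_ph_threshold(ph_threshold=60, patrol=60)))                # 2
```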

(u) termcmd_at_abort

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether postprocessing by the server termination command is to be enabled in either of the following cases: server startup processing is cancelled due to an error in the server start command, or hot standby or grouped-system switchover processing is performed in the event of a server failure.

Specify the same value for all hosts in the hot-standby configuration.

  • use: Enables postprocessing by the server termination command.

    If server startup processing is cancelled due to an error in the server start command or if hot-standby switchover or grouped-system switchover is performed, the server termination command is executed with the -c argument.

    You must create the server termination command in such a manner that the command performs postprocessing when it is executed with the -c argument.

  • nouse: Does not enable postprocessing by the server termination command.

    If server startup processing is cancelled due to an error in the server start command, the server termination command is not executed.

    If hot standby or grouped-system switchover processing is to be performed in the event of a server failure, the server termination command is executed with the -w or -e argument in the same manner as with normal termination of a server.

For details about the server termination command, see 6.13.2 Creating a server termination command.

(v) alive_interval

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1 to 10

(Units: 100 milliseconds)

Yes

10

Description

Specifies a new (changed) default value for the alive message transmission interval.

The normal default alive message transmission interval is 1,000 milliseconds. If you set the host failure monitoring time to a small value, we recommend that you change the alive message transmission interval. Specify the same value for all hosts in the hot-standby configuration.

If this operand is omitted, the normal default alive message transmission interval is assumed. If you use the normal default alive message transmission interval, we recommend that you omit this operand (there is no need to specify 10 in this operand).

When you specify this operand, a preferred monitoring path cannot be specified because alive messages are sent to all monitoring paths.

If this operand is not specified when the patrol value is small, HA Monitor is likely to detect a disruption of alive messages, which might lead to false detection of a failure.
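The effect of this operand on a small patrol value can be seen with simple arithmetic. The following Python sketch converts the operand value to milliseconds and counts how many alive messages fit into one host failure monitoring time. It assumes that the patrol operand is expressed in seconds; the function names and sample values are hypothetical.

```python
def alive_interval_ms(alive_interval=10):
    """Convert the operand value (units of 100 milliseconds) to milliseconds."""
    if not 1 <= alive_interval <= 10:
        raise ValueError("alive_interval must be in the range 1 to 10")
    return alive_interval * 100


def messages_per_patrol(patrol_seconds, alive_interval=10):
    """Approximate number of alive messages sent during one patrol period."""
    return (patrol_seconds * 1000) // alive_interval_ms(alive_interval)


print(alive_interval_ms())        # 1000 (operand omitted)
print(messages_per_patrol(5))     # 5
print(messages_per_patrol(5, 2))  # 25
```

With the default interval, a patrol value of 5 leaves room for only about five alive messages, so a few delayed messages weigh heavily; a shorter interval gives HA Monitor more messages to judge from within the same time.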

(w) alive_multicast

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse#

#: The system assumes that use is specified only if 33 or a larger value is specified for the hostmax operand.

Description

Specifies whether alive messages are to be transmitted by multicasting.

Specify the same value for all hosts in the hot-standby configuration.

  • use: Multicasts alive messages.

  • nouse: Does not multicast alive messages (transmits by unicasting).

When you specify use in this operand, alive messages are transmitted to all monitoring paths that use the same communication method.

When alive messages are transmitted by multicasting, 239.0.0.1 is used as the multicast group ID. If necessary, you can use the multicast_lan operand to specify a different multicast group ID.
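The note on the assumed value can be read as the following decision rule. This Python sketch is one interpretation of that note, not HA Monitor code; the function name is hypothetical, and 32 is used as the hostmax value when hostmax is omitted, as shown in the definition file format.

```python
def effective_alive_multicast(alive_multicast=None, hostmax=32):
    """Return 'use' or 'nouse' according to one reading of the note.

    alive_multicast is None when the operand is omitted.
    """
    if alive_multicast is not None:
        if alive_multicast not in ("use", "nouse"):
            raise ValueError("alive_multicast must be use or nouse")
        return alive_multicast
    # Operand omitted: use is assumed only when hostmax is 33 or larger.
    return "use" if hostmax >= 33 else "nouse"


print(effective_alive_multicast())            # nouse
print(effective_alive_multicast(hostmax=64))  # use
```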

(x) multicast_lan

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

1 to 32 alphanumeric characters

--

(Units: --)

Yes

--#

#: If you omit this operand, 239.0.0.1 is assumed as the multicast group ID in the /etc/hosts file.

Description

Specifies the host name corresponding to the multicasting group ID that is to be used when alive messages are transmitted by multicasting.

Specify the same value for all hosts in the hot-standby configuration.

This host name must be a host name specified in the /etc/hosts file. For the IP address to be specified in the /etc/hosts file, specify a multicast address of class D.

Specification of this operand takes effect only if alive messages are to be transmitted by multicasting. In other cases, this operand is ignored even if it is specified.
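Before adding an entry to the /etc/hosts file, you might want to confirm that the address is a class D (multicast) address, that is, an address in 224.0.0.0/4. The following Python sketch uses the standard ipaddress module for that check; the function name is hypothetical and the sample addresses are illustrative only.

```python
import ipaddress


def is_class_d(address):
    """Return True if address is an IPv4 multicast (class D) address."""
    try:
        ip = ipaddress.IPv4Address(address)
    except ipaddress.AddressValueError:
        # Not a valid IPv4 address at all.
        return False
    return ip.is_multicast


print(is_class_d("239.0.0.1"))    # True
print(is_class_d("192.168.0.1"))  # False
```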

(y) lanfailswitch

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether LANs are to be monitored.

Make sure that the same value is specified on all hosts for which LANs are to be monitored. (The same value does not need to be specified on all hosts in the hot-standby switchover configuration.)

  • use: Monitors LANs.

  • nouse: Does not monitor LANs.

When you specify this operand, you must also specify in the switchbyfail operand in the server environment definition the name of the LAN interface to be monitored. If the server specified in the switchbyfail operand is grouped and a failure occurs on the specified LAN interface, hot standby processing will be performed on all servers in the group.

(z) lancheck_patrol

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1 to 180

(Units: seconds)

Yes

15

Description

Specifies the monitoring interval when LANs are to be monitored.

Make sure that the same value is specified on all hosts for which LANs are to be monitored. (The same value does not need to be specified on all hosts in the hot-standby switchover configuration.)

If the lanfailswitch operand is omitted, this operand is ignored even if it is specified.

Make sure that this operand's value is greater than the execution time for the LAN monitoring script. For details about the LAN monitoring script, see (2) Specifying a LAN monitoring script.

Note that if only the hbonding status is to be monitored, there is no need to specify this operand because the monitoring interval is fixed at one second.

(aa) lancheck_mode

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

packet|route

--

(Units: --)

Yes

packet

Description

Specifies the LAN monitoring method that HA Monitor uses when monitoring LANs.

Make sure that the same value is specified on all hosts for which LANs are to be monitored. (The same value does not need to be specified on all hosts in the hot-standby switchover configuration.)

This operand is ignored if nouse is specified for the lanfailswitch operand or if the lanfailswitch operand is omitted. If you use a LAN monitoring script (lanpatrol.sh) that was created for a version of HA Monitor earlier than 01-68, make sure that this operand is either omitted or set to packet.

  • packet: HA Monitor monitors the number of packets received by the monitoring-target LAN adapter.

  • route: HA Monitor monitors ping responses that arrive at the address specified in the LAN monitor definition file.

(ab) disk_ptrl

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether to monitor the system disk.

Specify the same values for all hosts in the hot-standby configuration.

  • use: HA Monitor monitors the system disk.

  • nouse: HA Monitor does not monitor the system disk.

If you specify use for this operand, you must set up the monitoring definition file for the system disk. For details about how to specify the settings in the monitoring definition file for the system disk, see (1) Settings in the files required for monitoring the system disk in 6.20.2 Settings in the files required for disk monitoring.

If necessary, also specify the following operands in the HA Monitor environment settings:

  • disk_ptrl_inter

  • disk_ptrl_retry

  • disk_log_size

(ac) disk_ptrl_inter

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

3 to 3600

(Units: seconds)

Yes

120

Description

Specifies the interval at which the status of the system disk is to be checked.

Specify the same values for all hosts in the hot-standby configuration.

Determine the value to be specified for this operand taking into account the following items specified for the disk_ptrl_retry operand in the HA Monitor environment settings:

  • Check retry interval for the system disk

  • Check retry count for the system disk

Specify a value that meets the following condition:

system-disk-check-interval ≥ system-disk-check-retry-interval × (system-disk-check-retry-count + 1)
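The condition can be checked with a few lines of Python. This sketch assumes that the comparison is "greater than or equal to", which is the reading that the values assumed on omission (120 for disk_ptrl_inter, and 60 and 1 for disk_ptrl_retry) satisfy exactly. The function name is hypothetical.

```python
def disk_ptrl_inter_is_consistent(check_interval=120, retry_interval=60,
                                  retry_count=1):
    """Check check_interval >= retry_interval x (retry_count + 1).

    The >= comparison is an assumption of this sketch.
    Defaults are the values assumed when the operands are omitted.
    """
    return check_interval >= retry_interval * (retry_count + 1)


print(disk_ptrl_inter_is_consistent())                # True  (120 >= 60 * 2)
print(disk_ptrl_inter_is_consistent(retry_count=2))   # False (120 <  60 * 3)
```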

(ad) disk_ptrl_retry

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

system-disk-check-retry-interval: Unsigned integer

1 to 3600

(Units: seconds)

Yes

60

system-disk-check-retry-count: Unsigned integer

0 to 60

(Units: number of times)

Yes

1

Description

Use this operand to specify the interval at which HA Monitor retries checking the system disk and the number of times HA Monitor can retry it if access to the system disk fails. Specify the retry interval and count with an intervening colon (:). If the number of times access to the system disk fails exceeds the specified count, HA Monitor judges that the system disk has failed.

Specify the same values for all hosts in the hot-standby configuration.

  • system-disk-check-retry-interval

    Specify the interval at which HA Monitor retries checking the system disk. Make sure that the value you specify is equal to or larger than the disk I/O timeout value. The disk I/O timeout value means the time for which an application that requests an I/O operation can wait before receiving a response. This value is determined from the driver settings, OS kernel parameters, and multipath software settings.

  • system-disk-check-retry-count

    Specify the number of times HA Monitor can retry checking the system disk. If you specify 0, HA Monitor does not retry checking the system disk.
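The colon-delimited value and the failure judgment described above can be sketched as follows. This Python fragment is illustrative only: the function names are hypothetical, and the parsing shown here is not HA Monitor's own parser.

```python
def parse_disk_ptrl_retry(value="60:1"):
    """Split 'retry-interval:retry-count' and check both ranges."""
    interval_text, count_text = value.split(":")
    interval, count = int(interval_text), int(count_text)
    if not 1 <= interval <= 3600:
        raise ValueError("retry interval must be in the range 1 to 3600")
    if not 0 <= count <= 60:
        raise ValueError("retry count must be in the range 0 to 60")
    return interval, count


def disk_judged_failed(access_failures, retry_count):
    """The disk is judged failed when failures exceed the retry count."""
    return access_failures > retry_count


print(parse_disk_ptrl_retry("60:1"))  # (60, 1)
print(disk_judged_failed(1, 1))       # False: one retry is still allowed
print(disk_judged_failed(2, 1))       # True
print(disk_judged_failed(1, 0))       # True: 0 means no retry
```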

(ae) disk_log_size

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

65535 to 10485760

(Units: bytes)

Yes

1048576

Description

Specifies, in bytes, the maximum size of the log files that are used for disk monitoring. This operand applies to the log files for both of the following types of monitoring:

  • Monitoring of the system disk

  • Monitoring of disks for business use

Specify the same values for all hosts in the hot-standby configuration.

Normally, you do not need to change the value of this operand.

(af) patrol_100ms

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies a new (changed) default value for the alive message check interval.

The normal default alive message check interval is one second. When you specify this operand, HA Monitor will check for alive messages every 100 milliseconds. If you set the host failure monitoring time to a small value, we recommend that you check for alive messages every 100 milliseconds. Specify the same value for all hosts in the hot-standby configuration.

  • use: Checks alive messages every 100 milliseconds.

  • nouse: Checks alive messages every second.

(ag) suppress_reset

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1 to 32

(Units: --)

Yes

--#

#: If this operand is omitted, host reset is not suppressed.

Description

If hosts are to be reset, this operand specifies the minimum number of active hosts as a reference for suppressing host resets.

As the minimum number of active hosts, specify a value that is greater than half the total number of hosts that constitute the system (minimum number of active hosts > total number of hosts in the entire system ÷ 2) and equal to or less than the value of the hostmax operand. You must specify the same value for all hosts in the hot-standby configuration.

For details about suppressing host resets, see 3.3.4 Suppressing host resets.
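The conditions on the minimum number of active hosts can be checked as in the following Python sketch. The function name is hypothetical, and 32 is used as the hostmax value when hostmax is omitted, as shown in the definition file format.

```python
def suppress_reset_is_valid(min_active_hosts, total_hosts, hostmax=32):
    """Check the operand range, the majority condition, and the hostmax limit."""
    in_operand_range = 1 <= min_active_hosts <= 32
    is_majority = min_active_hosts > total_hosts / 2
    within_hostmax = min_active_hosts <= hostmax
    return in_operand_range and is_majority and within_hostmax


print(suppress_reset_is_valid(3, total_hosts=4))  # True  (3 > 2)
print(suppress_reset_is_valid(2, total_hosts=4))  # False (2 is not > 2)
print(suppress_reset_is_valid(3, total_hosts=5))  # True  (3 > 2.5)
```

For an even number of hosts, exactly half is not sufficient, so a four-host system needs a value of at least 3.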

(ah) exitcode

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

type1|type2

--

(Units: --)

Yes

type1

Description

Specifies the version of the codes to be returned by HA Monitor's commands:

  • type1: Returns the codes described in 9. Commands.

  • type2: Returns the codes for P-9S2C-E111 HA Monitor version 01-41 or earlier.

(ai) vg_off

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

sequential|parallel

--

(Units: --)

Yes

parallel

Description

Specifies whether disconnection processing is to be performed for volume groups sequentially or in parallel, when volume group disconnection processing is to be performed more than once (such as when multiple servers are terminated).

Specify the same value for all hosts in the hot-standby configuration.

Note that if the OS is RHEL 6.2 or later or RHEL 7 or later, you do not need to specify this operand.

  • sequential: Performs volume group disconnection processing separately for each volume group (performs the processing sequentially).

    If there are multiple server and resource definition statements with the disk operand specified in the server environment definition, we recommend that you specify sequential in order to avoid disconnection errors that might occur when multiple volume groups are disconnected at the same time (in parallel). Note that if the OS is RHEL 6.2 or later or RHEL 7 or later, a disconnection error does not occur when multiple volume groups are disconnected simultaneously (in parallel).

  • parallel: Performs volume group disconnection processing in parallel.

(aj) vmware_env

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether hot standby processing is to be performed in a VMware ESXi-based virtualization environment.

Specify the same value for all hosts in the hot-standby configuration.

  • use: Performs hot standby processing in a VMware ESXi-based virtualization environment.

  • nouse: Does not perform hot standby processing in a VMware ESXi-based virtualization environment.

If you have specified that a VMware ESXi virtual machine is to be used in the reset path setting by using the monsetup -resetpath command, use is assumed.

If SCSI reservation for shared disk is performed, you do not need to specify this operand even in a VMware ESXi-based virtualization environment.

For examples of the values to be specified for the reset path settings, see 6.7.2 Examples of settings (BladeSymphony), 6.8.2 Examples of settings (model HA8000xM or earlier), or 6.9.2 Examples of settings (model HA8000xN or later).

(ak) patrol_type

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

server|host

--

(Units: --)

Yes

server

Description

Specifies when hosts are to be monitored.

You must specify the same value for all hosts in the hot-standby configuration.

  • server: Monitors hosts while hot standby processing can be performed.

  • host: Monitors hosts while HA Monitor's hosts are connected.

Specify host when either of the following conditions is satisfied:

  • The suppress_reset operand is specified.

  • The servers are not monitoring each other on all hosts in the hot-standby switchover configuration.

If neither of the above conditions is satisfied, specify server or omit this operand.

(al) resetpatrol_mode

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

mode1|mode2

--

(Units: --)

Yes

mode1

Description

When the system being used is BladeSymphony, specifies whether a reset path failure is to be detected when an SVP cannot respond temporarily to a reset path health check.

In BladeSymphony, specify mode2. In HA8000, either omit this operand or specify mode1. If this operand is omitted, mode1 is assumed.

This operand is applicable only to reset path health checking.

  • mode1: Detects as a reset path failure the status in which an SVP cannot respond temporarily to a reset path health check.#

  • mode2: Does not detect as a reset path failure the status in which an SVP cannot respond temporarily to a reset path health check.#

#

A BladeSymphony SVP might not be able to respond to a reset path health check temporarily (for up to two minutes) when SVPs are switched (for example, for maintenance purposes).

(am) fence_reset

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

use

Description

Specifies whether host reset is to be used as the method for protecting data on a shared disk.

Specify the same value for all hosts in the hot-standby configuration.

  • use: Resets hosts.

  • nouse: Does not reset hosts.

The value to be specified for this operand differs depending on the function to be used.

If hybrid fencing is used

Omit this operand or specify use. In addition, specify use for the fence_scsi operand.

If hybrid fencing is not used

Specify use or nouse for this operand. If this operand is omitted, use is assumed. If you omit this operand or specify use, you must specify nouse for both the fence_scsi and fence_lan operands. Conversely, if you specify nouse for this operand, you must specify use for either the fence_scsi operand or the fence_lan operand, because each of those operands requires nouse for the other.

If nouse is specified in this operand, any reset path settings specified by the HA Monitor environment setup command (monsetup command) are ignored.

For details about host reset, see 2.3.5 Host reset. For examples of the values to be specified for the reset path settings, see 6.7.2 Examples of settings (BladeSymphony), 6.8.2 Examples of settings (model HA8000xM or earlier), or 6.9.2 Examples of settings (model HA8000xN or later).

(an) fence_scsi

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether SCSI reservation for shared disk is to be used as the method for protecting data on a shared disk.

Specify the same value for all hosts in the hot-standby configuration.

  • use: Uses SCSI reservation for shared disk.

  • nouse: Does not use SCSI reservation for shared disk.

The value to be specified for this operand differs depending on the function to be used.

If hybrid fencing is used

Specify use for this operand. In addition, specify use for the fence_reset operand or omit the operand.

If hybrid fencing is not used

Specify use or nouse for this operand. If this operand is omitted, the system assumes that nouse is specified for this operand. When you specify use in this operand, the following settings are also required:

  • Specifying nouse in the fence_reset and fence_lan operands

  • Specifying the scsi_device or dmmp_device operand in the server environment definition on the servers that use a shared disk#

#: If you do not specify the scsi_device or dmmp_device operand, active servers might start on multiple hosts, causing the data on the shared disk to be corrupted.

For details about SCSI reservation for shared disk, see 2.3.6 SCSI reservation for shared disk.

(ao) fence_lan

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether the function for controlling hot standby based on the availability of LAN communications is to be used.

You must specify the same value for all hosts in the hot-standby configuration.

  • use: Uses the function for controlling hot standby based on the availability of LAN communications.

  • nouse: Does not use the function for controlling hot standby based on the availability of LAN communications.

If you specify use in this operand, specify nouse in the fence_reset and fence_scsi operands. Also specify use in the lanfailswitch operand and the name of a LAN interface whose availability is to be checked in the switch_judge operand in the server environment definition.

For details about the function for controlling hot standby based on the availability of LAN communications, see 2.3.7 Function for controlling hot standby based on the availability of LAN communications.
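The descriptions of the fence_reset, fence_scsi, and fence_lan operands together allow only a few combinations. The following Python sketch tabulates them as a hypothetical consistency check. It is not an HA Monitor tool, and it assumes that fence_lan is nouse when hybrid fencing is used, which the descriptions do not state explicitly.

```python
# Combinations derived from the operand descriptions.
# Key order: (fence_reset, fence_scsi, fence_lan).
VALID_COMBINATIONS = {
    ("use", "nouse", "nouse"): "host reset",
    # Assumption: fence_lan is nouse when hybrid fencing is used.
    ("use", "use", "nouse"): "hybrid fencing",
    ("nouse", "use", "nouse"): "SCSI reservation for shared disk",
    ("nouse", "nouse", "use"): "control based on LAN communications",
}


def fencing_method(fence_reset="use", fence_scsi="nouse", fence_lan="nouse"):
    """Name the method selected by the operands; reject other combinations.

    Defaults are the values assumed when the operands are omitted.
    """
    combination = (fence_reset, fence_scsi, fence_lan)
    if combination not in VALID_COMBINATIONS:
        raise ValueError("inconsistent fence operands: %s/%s/%s" % combination)
    return VALID_COMBINATIONS[combination]


print(fencing_method())                  # host reset
print(fencing_method(fence_scsi="use"))  # hybrid fencing
```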

(ap) scsi_check

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1 to 600

(Units: seconds)

Yes

5

Description

Specifies the interval at which the shared disk reservation status is to be monitored from the active system when SCSI reservation for shared disk is used.

Specify the same value for all hosts in the hot-standby configuration.

If this operand is omitted, five seconds is assumed. Normally, there is no need to change this operand's default value.

When nouse is specified in the fence_scsi operand or the fence_scsi operand is omitted, this operand is ignored.

(aq) scsi_pathcheck

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1 to 60

(Units: minutes)

Yes

1

Description

Specifies the interval at which a shared disk path status is to be monitored from the standby system when SCSI reservation for shared disk is used.

Specify the same value for all hosts in the hot-standby configuration.

When nouse is specified in the fence_scsi operand or the fence_scsi operand is omitted, this operand is ignored.

(ar) scsi_timeout

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1 to 3600

(Units: seconds)

Yes

30

Description

Specifies the response wait time for the SCSI command that is issued to a physical device when SCSI reservation for shared disk is used.

Specify the same value for all hosts in the hot-standby configuration.

I/O timeout values, such as the timeout value between the RAID controller and the Linux SCSI subsystem or the timeout value for a Linux block device, might be specified in the OS settings. In such cases, specify in the scsi_timeout operand a value that is larger than those I/O timeout values.

When nouse is specified in the fence_scsi operand or the fence_scsi operand is omitted, this operand is ignored.

(as) scsi_retry

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

Unsigned integer

1 to 60

(Units: number of times)

Yes

2

Description

Specifies the number of times SCSI command issuance is to be retried when SCSI reservation for shared disk is used and the SCSI command issued to a physical device fails or results in a timeout.

Specify the same value for all hosts in the hot-standby configuration.

If this operand is omitted, 2 (twice) is assumed. In this case, after an attempt is made to execute the SCSI command, retry is performed twice. Therefore, a maximum of three attempts are made to execute the SCSI command.

When nouse is specified in the fence_scsi operand or the fence_scsi operand is omitted, this operand is ignored.
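The relationship between the retry count and the number of attempts can be written out as follows. In this Python sketch, the function names are hypothetical. The worst-case wait is an assumption: it supposes that every attempt waits for the full scsi_timeout value and that no additional delay occurs between attempts.

```python
def max_scsi_attempts(scsi_retry=2):
    """One initial attempt plus the specified number of retries."""
    if not 1 <= scsi_retry <= 60:
        raise ValueError("scsi_retry must be in the range 1 to 60")
    return 1 + scsi_retry


def worst_case_wait_seconds(scsi_timeout=30, scsi_retry=2):
    """Upper bound on waiting time if every attempt times out (assumption)."""
    if not 1 <= scsi_timeout <= 3600:
        raise ValueError("scsi_timeout must be in the range 1 to 3600")
    return scsi_timeout * max_scsi_attempts(scsi_retry)


print(max_scsi_attempts())        # 3
print(worst_case_wait_seconds())  # 90
```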

(at) hbond_lacp

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

1 to 32 alphanumeric characters

--

(Units: --)

Yes

--#

#: If this operand is omitted, HA Monitor monitors only hbonding physical link status.

Description

Specifies the name of an interface for hbonding when hbonding with the LACP monitoring method is to be monitored in 802.3ad mode by using HA Monitor's LAN monitoring function.

You can specify a maximum of 32 interface names delimited by colons (:). Specifying this operand enables you to detect failures in network switches and physical links based on the hbonding status. When this operand is omitted, HA Monitor monitors only the hbonding physical link status.

For details about the settings required to monitor hbonding, see (4) Required environment settings.

(au) vg_offskip

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether volume group disconnection processing is to be skipped when resource disconnection processing that occurs before volume group disconnection processing fails on the active system that is the source system for hot standby processing.

  • use: Skips volume group disconnection processing.

  • nouse: Does not skip volume group disconnection processing.

When use is specified in this operand and shared resource disconnection processing fails on the active system that is the source system for hot standby processing, the time required for hot standby processing can be reduced because volume group disconnection processing will be skipped. Volume group disconnection processing is skipped for the resources listed below when shared resource disconnection processing fails.

When the deviceoff_order operand value is order

  • File system disconnection

  • Volume group disconnection

When the deviceoff_order operand value is reverse

  • File system disconnection

  • Volume group disconnection

If host reset is enabled for the active system, the active system is reset; if SCSI reservation for shared disk is used, a kernel panic is generated. In either case, the volume groups are disconnected as a result.

(av) notswitch_notify

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

use

Description

Specifies whether a warning message is to be issued every hour while hot standby processing is disabled.

  • use: Issues a warning message repeatedly.

  • nouse: Does not issue a warning message.

If you specify use in this operand, you can obtain early warning of hot standby-disabled status.

The following table shows the types of hot standby-disabled status that result in issuance of a warning message and the warning message IDs.

Table 8‒3: Hot standby-disabled statuses resulting in issuance of warning messages and the warning message IDs

| Hot standby-disabled status resulting in issuance of a warning message | Warning message ID |
|---|---|
| The active server has started but it is not ready for hot standby processing. | KAMN288-W |
| The number of active hosts is equal to or less than the minimum number of active hosts.# | KAMN447-W |
| The server monitoring is temporarily stopped. | KAMN521-W |
| The system disk monitoring is temporarily stopped. | KAMN540-W |
| An error was detected during reset path health check. | KAMN644-W |
| While SCSI reservation is being used, a disk path error was detected when the shared disk path was checked from the standby system. | KAMN729-W |

#

This includes HA Monitors whose servers are not active.

(aw) servcomplete_msg

User-specified values

Value type

Range of numerical values

Can be omitted

Assumed value when this value is omitted

use|nouse

--

(Units: --)

Yes

nouse

Description

Specifies whether the KAMN298-I and the KAMN496-I messages are to be issued. The KAMN298-I message is issued when server termination processing is completed, and the KAMN496-I message is issued when grouped-system switchover processing is completed.

  • use: Issues a message when server termination processing is completed and when grouped-system switchover processing is completed.

  • nouse: Does not issue a message when server termination processing is completed or when grouped-system switchover processing is completed.

If grouped-system switchover processing fails on even one of the target servers, the KAMN496-I message is not issued. For a server for which the switch_error operand is specified in the server environment definition to perform start retries, the message is not issued because the server is treated as having been terminated abnormally.

(ax) delay_kamn238

User-specified values
  Value type: Unsigned integer
  Range of numerical values: 1 to 600 (Units: seconds)
Can be omitted: Yes
Assumed value when this value is omitted: 0#

#: If this operand is omitted, output of the KAMN238-D message is not delayed.

Description

Specifies how long (in seconds) output of the KAMN238-D# message is delayed after the standby server begins waiting for the active server to start.

#: This operand affects only the first output of the message. After the first output, the message is output at regular intervals to notify users of the status, regardless of the value specified for this operand.

Specify the same value for all hosts in the hot-standby configuration.

Even if the system settings are specified to start the active and standby servers at the same time, the KAMN238-D message might be output due to a slight difference in start timing. By specifying this operand, you can prevent the KAMN238-D message from being output due to a slight timing difference. If the active and standby servers are not started at the same time, you do not need to specify this operand.

If the multi-standby function is used, output of the KAMN972-D message, which is output together with the KAMN238-D message, is also delayed. Note that this operand does not take effect for the KAMN238-D message that is output together with the following messages: KAMN242-D, KAMN243-D, KAMN244-D, and KAMN973-D.

If you are not sure of the appropriate value, specify 60 seconds as a trial value. If the KAMN238-D message is still output, specify a larger value.
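The effect of this operand on the first KAMN238-D message can be pictured with the following Python sketch. It is a simplified model, not HA Monitor's actual logic: the parameter active_start_lag is a hypothetical input, and the sketch assumes that the message is skipped when the active server starts before the delay elapses.

```python
def first_kamn238_delay(delay_kamn238=None, active_start_lag=None):
    """Model when the first KAMN238-D message would be output.

    delay_kamn238: operand value in seconds (1 to 600), or None if omitted.
    active_start_lag: hypothetical input; seconds between the standby server
        starting to wait and the active server starting, or None if the
        active server never starts.
    Returns the number of seconds until the first message is output,
    or None if the message is not output.
    """
    if delay_kamn238 is None:
        delay = 0  # operand omitted: output is not delayed
    elif isinstance(delay_kamn238, int) and 1 <= delay_kamn238 <= 600:
        delay = delay_kamn238
    else:
        raise ValueError("delay_kamn238 must be an integer from 1 to 600")

    # Assumption: the message is skipped if the active server starts
    # before the delay elapses.
    if active_start_lag is not None and active_start_lag < delay:
        return None
    return delay
```

With the trial value of 60 seconds, a start-timing difference of 5 seconds produces no message in this model, while a difference of 90 seconds produces the message after 60 seconds, which suggests trying a larger value.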

(3) options definition statement

This definition statement defines the options related to HA Monitor operation. Specification of this definition statement is optional. If you omit this definition statement, all the default values are assumed. The following explains the operands of the options definition statement.

(a) clearcheck

User-specified values
  Value type: Unsigned integer
  Range of numerical values: 1 to 600 (Units: seconds)
Can be omitted: Yes
Assumed value when this value is omitted: 20

Description

Specifies the maximum time (in seconds) to wait for a host reset to finish.

If the host reset does not finish within the specified time when hot-standby switchover occurs, the system outputs an error message and enters the hot-standby switchover wait state. When the host reset subsequently finishes, the system resumes hot-standby switchover.

For details about specification of this operand, see the following table:

| Model | Environment | Server configuration | Specification details |
|---|---|---|---|
| HA8000xN or later | VMware ESXi not used | Management server not used | Set a value for this operand according to the documentation for HA Monitor Connector. If the documentation for HA Monitor Connector does not include descriptions of this operand, do not specify this operand. |
| HA8000xN or later | VMware ESXi not used | Management server used | Do not specify this operand. |
| HA8000xN or later | VMware ESXi used | Management server not used | Specify a value that is larger by 20 (seconds) than the value indicated in the documentation for HA Monitor Connector. If the documentation for HA Monitor Connector does not include descriptions of this operand, do not specify this operand. |
| HA8000xN or later | VMware ESXi used | Management server used | Do not specify this operand. |
| Other models | -- | -- | Do not specify this operand. |

Legend:

--: Not applicable

Note the following points when specifying this operand:

  • Specify the same value for all hosts in the hot-standby configuration. Even if the hot-standby configuration includes server models that do not require this operand, specify the operand with the same value on all servers.

  • Specify the same value that is set for the clearwait operand.
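The table and notes for clearcheck can be condensed into the following Python sketch. The function names, parameter names, and the dictionary layout are hypothetical and exist only for illustration; connector_doc_value stands for whatever value the HA Monitor Connector documentation indicates, or None if that documentation does not describe this operand.

```python
def recommended_clearcheck(is_ha8000xn_or_later, uses_esxi,
                           uses_management_server, connector_doc_value=None):
    """Return the clearcheck value to specify, or None for "do not specify"."""
    if (not is_ha8000xn_or_later or uses_management_server
            or connector_doc_value is None):
        return None
    # With VMware ESXi, add 20 seconds to the documented value.
    value = connector_doc_value + 20 if uses_esxi else connector_doc_value
    if not 1 <= value <= 600:
        raise ValueError("clearcheck must be from 1 to 600 seconds")
    return value


def consistency_problems(hosts):
    """Check the two notes: same value on all hosts, and equal to clearwait.

    hosts: hypothetical mapping of host name to a dict with the keys
        "clearcheck" and "clearwait" (both in seconds).
    """
    problems = []
    if len({h["clearcheck"] for h in hosts.values()}) > 1:
        problems.append("clearcheck differs between hosts")
    for name, h in sorted(hosts.items()):
        if h["clearcheck"] != h["clearwait"]:
            problems.append(f"{name}: clearcheck != clearwait")
    return problems
```

For example, under these assumptions a documented value of 30 seconds yields 30 without VMware ESXi and 50 with VMware ESXi, provided that no management server is used.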