Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


8.5.1 Monitor-mode program environment definition (programs)

Create a definition file that contains the monitor-mode program environment definitions, and save it under the file name programs in the directory for HA Monitor environment settings.

A sample programs file is provided under the directory for HA Monitor sample files. Instead of creating a new file from scratch, you can save time by copying this sample file to the directory for HA Monitor environment settings and then editing it.

You must create this definition file on each host. Bold type indicates a value that must be the same on all hosts. Specify the other items carefully so that they do not conflict between hosts.

/*    monitor-mode program environment definition    */
program  name  monitor-mode-program-name
        ,alias monitor-mode-program-alias-name
        ,server_alias
               monitor-mode-server-alias-name
      [,restartcommand
               absolute-path-name-of-program-restart-command]
      [,patrol
               program-slowdown-monitoring-interval]
      [,restart_timeout
               program-restart-monitoring-time]
      [,exec_retry
             {program-restart-retry-count|2}]
      [,retry_stable
             {duration-of-stable-program-operation|60}];
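
As an illustration of the syntax above, the following Python sketch assembles one program definition statement and checks the operand values against the ranges documented for each operand in this subsection. The function name render_program_definition, the program path, the alias names, and the restart command path are all hypothetical; only the operand names and their ranges come from this manual, and the sketch is not an HA Monitor tool.

```python
# Illustrative sketch only; not part of HA Monitor.
# Inclusive numeric ranges as documented for each operand in this subsection.
RANGES = {
    "patrol": (60, 3600),
    "restart_timeout": (60, 3600),
    "exec_retry": (0, 9999),
    "retry_stable": (60, 9999),
}


def render_program_definition(name, alias, server_alias,
                              restartcommand=None, **numeric):
    """Return the text of one program definition statement."""
    if not 1 <= len(name) <= 1000:
        raise ValueError("name must have 1 to 1,000 characters")
    for label, value in (("alias", alias), ("server_alias", server_alias)):
        if not (1 <= len(value) <= 8 and value.isalnum() and value.isascii()):
            raise ValueError(f"{label} must be 1 to 8 alphanumeric characters")
    operands = [f"name {name}", f"alias {alias}",
                f"server_alias {server_alias}"]
    if restartcommand is not None:
        if not restartcommand.startswith("/"):
            raise ValueError("restartcommand must be an absolute path name")
        operands.append(f"restartcommand {restartcommand}")
    for key, value in numeric.items():
        low, high = RANGES[key]  # KeyError for an operand not listed above
        if not low <= value <= high:
            raise ValueError(f"{key} must be in the range {low} to {high}")
        operands.append(f"{key} {value}")
    # Continuation lines start with a comma, as in the syntax shown above.
    return "program  " + "\n        ,".join(operands) + ";"


# Hypothetical values, for illustration only.
example = render_program_definition(
    "/opt/sample/bin/sampled", "prog1", "serv1",
    restartcommand="/opt/sample/bin/restart.sh",
    patrol=60, exec_retry=2, retry_stable=60)
print(example)
```

Generating the statement from one set of values is one way to keep the operands that must match on all hosts identical across the per-host files.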
(1) program definition statement

You use the program definition statement to define an operating environment for programs that use the program management function. The following explains the operands of the program definition statement.

(a) name

User-specified values
    Value type: Path name having 1 to 1,000 characters
    Range of numerical values: -- (Units: --)
Can be omitted: No
Assumed value when this value is omitted: --

Description

Specifies a name used to identify the monitor-mode program.

You can also specify a path name. This name must be unique within each host. If the program is a UAP, set the HAMON_UAPNAME environment variable to the name specified in this operand.

Specify the same value among all hosts in a hot-standby configuration.
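
A launcher can pass the name to a UAP through the process environment. The following Python sketch shows one possible way to do that. The name value and the child command are hypothetical stand-ins (the child here is a Python one-liner that merely echoes the variable), and how a real UAP consumes HAMON_UAPNAME is outside the scope of this sketch.

```python
import os
import subprocess
import sys

# Hypothetical value; it must match the name operand in the programs file.
PROGRAM_NAME = "/opt/sample/bin/sampled"


def launch_with_uapname(command, program_name):
    """Run command with HAMON_UAPNAME set in the child's environment."""
    env = dict(os.environ, HAMON_UAPNAME=program_name)
    return subprocess.run(command, env=env, capture_output=True,
                          text=True, check=True)


# Stand-in for a real UAP: a child process that prints the variable it received.
child = [sys.executable, "-c",
         "import os; print(os.environ['HAMON_UAPNAME'])"]
result = launch_with_uapname(child, PROGRAM_NAME)
print(result.stdout.strip())
```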

(b) alias

User-specified values
    Value type: 1 to 8 alphanumeric characters
    Range of numerical values: -- (Units: --)
Can be omitted: No
Assumed value when this value is omitted: --

Description

Specifies the monitor-mode program alias name that is to be used in HA Monitor commands and in the messages that are issued.

This name must be unique within each host. Specify the same value among all hosts in a hot-standby configuration.

(c) server_alias

User-specified values
    Value type: 1 to 8 alphanumeric characters
    Range of numerical values: -- (Units: --)
Can be omitted: No
Assumed value when this value is omitted: --

Description

Specifies the alias name of the monitor-mode server associated with this program (the value of the alias operand in the server environment definition).

Specify the same value for all hosts in the hot-standby configuration.

(d) restartcommand

User-specified values
    Value type: Path name having 1 to 1,000 characters
    Range of numerical values: -- (Units: --)
Can be omitted: Yes
Assumed value when this value is omitted: --#

#: If this operand is omitted, HA Monitor performs hot-standby switchover on a server basis in the event of a program error.

Description

Specifies the absolute path name of the program restart command, which can be implemented as a shell script, for example.

You must specify the same value for all hosts in the hot-standby configuration.

You can specify tab and end-of-line codes in this operand by enclosing the operand value in double quotation marks ("). The double quotation marks are not included in the character count.

(e) patrol

User-specified values
    Value type: Unsigned integer
    Range of numerical values: 60 to 3600 (Units: seconds)
Can be omitted: Yes
Assumed value when this value is omitted: --#

#: If this operand is omitted, HA Monitor does not monitor for UAP operation reports.

Description

Specifies the amount of time HA Monitor is to monitor for an operation report from the UAP (program slowdown monitoring interval).

You must specify the same value for all hosts in the hot-standby configuration. If no operation report is received from the UAP within the amount of time specified in the patrol operand, HA Monitor detects a UAP error.
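
The detection rule in the preceding paragraph amounts to a deadline check. The Python sketch below models only that rule. The function name, the use of plain numbers for times, and the treatment of a report arriving exactly at the limit as still on time are assumptions; the sketch does not reflect how HA Monitor is implemented internally.

```python
def uap_error_detected(last_report_time, now, patrol):
    """Return True if no operation report arrived within patrol seconds.

    Times are seconds on any monotonic clock; patrol is the operand value.
    """
    if not 60 <= patrol <= 3600:
        raise ValueError("patrol must be in the range 60 to 3600 seconds")
    # Assumption: a report that arrives exactly at the limit is still on time.
    return (now - last_report_time) > patrol


# Last report at t=0 with patrol=60.
print(uap_error_detected(0, 45, 60))  # False: still inside the interval
print(uap_error_detected(0, 61, 60))  # True: the interval has elapsed
```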

(f) restart_timeout

User-specified values
    Value type: Unsigned integer
    Range of numerical values: 60 to 3600 (Units: seconds)
Can be omitted: Yes
Assumed value when this value is omitted: --

Description

Specifies a timeout value for program restart processing.

The timeout period runs from the start of the program restart command until either the command terminates or the hamon_patrolstart function is issued. You must specify the same value for all hosts in the hot-standby configuration.

When a timeout occurs, HA Monitor re-executes the program restart command.

If the restartcommand operand is omitted, this operand is ignored.
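
To make the timeout behavior concrete, the following Python sketch runs a command under a time limit and reports whether it finished in time. It is an analogy only: the function name is hypothetical, the hamon_patrolstart path is not modeled, and the demonstration uses a sub-second limit so that it finishes quickly, whereas the operand itself accepts 60 to 3600 seconds.

```python
import subprocess
import sys


def run_restart_command(command, restart_timeout):
    """Run command; return "completed" or "timeout".

    subprocess.run kills the child and raises TimeoutExpired when the
    limit (in seconds) is exceeded.
    """
    try:
        subprocess.run(command, timeout=restart_timeout, check=False)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "completed"


# Stand-ins for a restart command: one that exits at once, one that hangs.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]

quick_result = run_restart_command(quick, restart_timeout=30)
slow_result = run_restart_command(slow, restart_timeout=0.5)
print(quick_result, slow_result)
```

A caller that models the documented behavior would run the command again after a "timeout" result, subject to the retry limit.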

(g) exec_retry

User-specified values
    Value type: Unsigned integer
    Range of numerical values: 0 to 9999 (Units: number of times)
Can be omitted: Yes
Assumed value when this value is omitted: 2#

#: If this operand is omitted, HA Monitor restarts the server according to the setting of the servexec_retry operand in the server environment definition.

Description

Specifies the number of times the program restart command is to be re-executed when a program error has been detected on the active server.

You must specify the same value for all hosts in the hot-standby configuration.

When HA Monitor detects a program error on the active server, it retries execution of the program restart command specified in the restartcommand operand as many times as specified in this operand. If the program restarts successfully within the retry count, HA Monitor treats it as a normal operation and continues processing. If the program does not restart within the specified number of retry attempts, HA Monitor treats it as a program error, stops the program restart processing, and performs hot standby processing.

If 9999 is specified in this operand, HA Monitor retries until program restart processing is completed or the active server is terminated.

If 0 is specified for this operand, HA Monitor immediately performs hot-standby switchover without attempting to restart the program.

If the restartcommand operand is omitted, specification of this operand is ignored.
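
The retry rules described for this operand can be sketched as a small decision loop. The Python code below is a model for reasoning about the operand, not HA Monitor's implementation. It assumes that the operand value is the total number of times the restart command is executed (which is consistent with 0 meaning that no restart is attempted), and the callable run_restart_command is a hypothetical stand-in for the command named in the restartcommand operand.

```python
UNLIMITED = 9999  # Documented special value: keep retrying.


def handle_program_error(run_restart_command, exec_retry=2):
    """Model the retry decision; return (outcome, attempts_made).

    run_restart_command is a caller-supplied callable that returns True
    when the program restarted successfully.
    """
    if not 0 <= exec_retry <= 9999:
        raise ValueError("exec_retry must be in the range 0 to 9999")
    attempts = 0
    # With 9999 this loop ends only when a restart succeeds; termination of
    # the active server, the other documented exit, is not modeled here.
    while exec_retry == UNLIMITED or attempts < exec_retry:
        attempts += 1
        if run_restart_command():
            return "restarted", attempts
    return "hot_standby", attempts


outcomes = iter([False, True])  # first attempt fails, second succeeds
print(handle_program_error(lambda: next(outcomes), exec_retry=2))
print(handle_program_error(lambda: False, exec_retry=2))
print(handle_program_error(lambda: True, exec_retry=0))
```

The three calls illustrate a restart that succeeds on the second attempt, a switchover after two failed attempts, and an immediate switchover when 0 is specified.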

(h) retry_stable

User-specified values
    Value type: Unsigned integer
    Range of numerical values: 60 to 9999 (Units: seconds)
Can be omitted: Yes
Assumed value when this value is omitted: 60

Description

Specifies a monitoring interval (in seconds) for determining whether program operation has become stable after the program has restarted successfully.

You must specify the same value for all hosts in the hot-standby configuration.

If no failure is detected within the amount of time specified in this operand after the program restarts, HA Monitor determines that the program's operation is stable and resets the restart count to 0. After a successful restart, HA Monitor starts monitoring for stable operation beginning at second 0.

If 9999 is specified in this operand, HA Monitor does not reset the restart count to 0 until the program terminates normally or is terminated by hot standby processing.

If an error occurs within the amount of time specified in this operand after the program has restarted, the restart count is updated and the program is restarted again. If the updated restart count exceeds the exec_retry operand value specified in the monitor-mode program environment definition, HA Monitor performs hot standby processing without restarting the program.
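
The interaction between retry_stable and exec_retry can be traced with a short simulation. The Python sketch below is a simplified model under stated assumptions: every restart succeeds immediately, so the stability window starts at the time of the error that triggered the restart, and an error that occurs exactly when the window ends is treated as occurring after stable operation was established. The function name and the event times are hypothetical.

```python
NEVER_RESET = 9999  # Documented special value: no time-based reset.


def decide_actions(error_times, exec_retry=2, retry_stable=60):
    """Return the action taken for each error time (seconds, ascending)."""
    actions = []
    restart_count = 0
    last_restart = None
    for t in error_times:
        stable = (last_restart is not None
                  and retry_stable != NEVER_RESET
                  and t - last_restart >= retry_stable)
        if stable:
            restart_count = 0  # Operation was judged stable; start over.
        restart_count += 1
        if restart_count > exec_retry:
            actions.append("hot_standby")
            break
        actions.append("restart")
        last_restart = t
    return actions


# Three errors in quick succession exhaust exec_retry=2.
print(decide_actions([0, 10, 20]))
# Errors spaced wider than retry_stable never accumulate.
print(decide_actions([0, 100, 200]))
```

In the first call the third error pushes the restart count past exec_retry, so the model switches over; in the second call the count is reset before each new error, so every error leads to a restart.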