Health check definition file
Format
[JP1_EVENT]
OUTPUT={YES | NO}
RECOVER={YES | NO}
[SYSLOG]
OUTPUT={YES | NO}
RECOVER={YES | NO}
[OTHER_HOSTS]
INTERVAL=remote-host-monitoring-interval (seconds)
TIMEOUT=communication-timeout-value (seconds)
STOP_CHECK={YES | NO}
ERROR_DETAIL={YES | NO}
POLLENDMSG={YES | NO}
HOST=host-name1,host-name2,...
THRESHOLD={monitoring-threshold | host-name:monitoring-threshold},...
Parameters by type
- Required parameters:
  None
- Custom parameters:
  [OTHER_HOSTS] section
  - INTERVAL
  - TIMEOUT
  - ERROR_DETAIL
  - HOST
  - POLLENDMSG
  - THRESHOLD
Storage destination directory
- In Windows:
-
installation-folder\conf\jbshc\
shared-folder\jp1base\conf\jbshc\ (in a cluster system)
- In UNIX:
-
/etc/opt/jp1base/conf/jbshc/
shared-directory/jp1base/conf/jbshc/ (in a cluster system)
Description
Defines how the health check function behaves, including which hosts are monitored and the process-monitoring interval.
Application of settings
Restart JP1/Base or execute the jbs_spmd_reload command to apply the settings.
Definition details
The following conventions apply to entries in the health check definition file (jbshc.conf).
-
A hash mark (#) (code 0x23) at the start of a line indicates a comment.
-
Do not enter a space or tab before or after an equal sign (=) or comma (,), or at the beginning or end of a line. A line that contains such a space or tab is ignored.
-
Lines containing only a linefeed character are ignored.
-
The health check definition file is a text file in which each line has no more than 1,023 bytes.
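The conventions above can be expressed as a small pre-check to run before deploying jbshc.conf. The following is a minimal sketch, not the parser that JP1/Base itself uses. The function name classify_line and the labels it returns are hypothetical, and two points are assumptions: the 1,023-byte limit is taken to exclude the line terminator, and the source does not say how an over-long line is handled, so it is only flagged.

```python
MAX_LINE_BYTES = 1023  # limit stated in the conventions above


def classify_line(raw: bytes) -> str:
    """Classify one raw line of jbshc.conf (hypothetical helper).

    Returns "blank", "too_long", "comment", "ignored", or "ok".
    """
    line = raw.rstrip(b"\r\n")  # assumption: the limit excludes the terminator
    if line == b"":
        return "blank"  # lines containing only a linefeed are ignored
    if len(line) > MAX_LINE_BYTES:
        return "too_long"  # handling by JP1/Base is not documented here
    if line.startswith(b"#"):
        return "comment"  # hash mark (0x23) at the start of the line
    # A space or tab at the beginning or end of the line.
    if line[:1] in (b" ", b"\t") or line[-1:] in (b" ", b"\t"):
        return "ignored"
    # A space or tab directly before or after '=' or ','.
    for sep in (b"=", b","):
        for ws in (b" ", b"\t"):
            if ws + sep in line or sep + ws in line:
                return "ignored"
    return "ok"
```

For example, classify_line(b"INTERVAL = 300\n") reports the line as ignored because of the spaces around the equal sign, whereas b"INTERVAL=300\n" is accepted.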
- [JP1_EVENT]
-
This section is about issuing JP1 events.
- OUTPUT={YES | NO}
-
Specify whether to issue a JP1 event upon the detection of a problem during a health check. Specify YES or NO. The default is YES.
- RECOVER={YES | NO}
-
Specify whether to issue a JP1 event upon recovery from the problem detected during a health check. Specify YES or NO. The default is YES.
RECOVER=YES is invalid if you have specified OUTPUT=NO.
- [SYSLOG]
-
This section is about message output to the syslog or event log.
- OUTPUT={YES | NO}
-
Specify whether to output a message to either syslog or the event log upon the detection of a problem during a health check. Specify YES or NO. The default is YES.
- RECOVER={YES | NO}
-
Specify whether to output a message to either syslog or the event log upon recovery from the problem detected during a health check. Specify YES or NO. The default is YES.
RECOVER=YES is invalid if you have specified OUTPUT=NO.
- [OTHER_HOSTS]
-
This section is about remote host monitoring.
- INTERVAL=remote-host-monitoring-interval (seconds)
-
Specify the interval at which remote hosts are monitored. The specifiable range is 60 to 7200 (seconds).
Estimate the monitoring interval as follows:
(number-of-hosts-specified-in-the-HOST-parameter) x 3 (seconds)
Allow 3 seconds per host as the time required to monitor processes. The time might vary depending on the state of the network and the status of the monitored hosts.
If you set a monitoring interval that is shorter than this guideline, errors will be detected more quickly, but the health check function might not finish monitoring a remote host within the specified interval. In this case, the function waits until the previous monitoring round ends.
If you set a monitoring interval that is longer than this guideline, you can save network and OS resources, but error detection might be delayed.
The default is 300 (seconds).
- If the message KAVA7219-W is output to the integrated trace log during a health check
-
The specified monitoring interval might be too short. Estimate the required monitoring interval using the following equation:
(current-interval) + ((KAVA7227-I-output-time - KAVA7219-W-output-time) x 1.1)
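The two estimates for INTERVAL can be worked through as follows. This is a minimal sketch: the function names are hypothetical, the message output times are assumed to have already been read from the integrated trace log and converted to seconds, and clamping the guideline to the specifiable range of 60 to 7200 is an assumption rather than documented behavior.

```python
SECONDS_PER_HOST = 3  # guideline: 3 seconds per monitored host
INTERVAL_MIN, INTERVAL_MAX = 60, 7200  # specifiable range for INTERVAL


def guideline_interval(host_count: int) -> int:
    """Guideline interval: (number of hosts in HOST) x 3 seconds.

    Clamping to the specifiable range is an assumption of this sketch.
    """
    estimate = host_count * SECONDS_PER_HOST
    return max(INTERVAL_MIN, min(INTERVAL_MAX, estimate))


def revised_interval(current_interval: float,
                     kava7219_time: float,
                     kava7227_time: float) -> float:
    """Estimate after KAVA7219-W appears:
    current + (KAVA7227-I time - KAVA7219-W time) x 1.1, all in seconds.
    """
    return current_interval + (kava7227_time - kava7219_time) * 1.1
```

With 100 monitored hosts the guideline equals the default of 300 seconds. If KAVA7227-I is output 40 seconds after KAVA7219-W with the interval at 300, the revised estimate is about 344 seconds.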
- TIMEOUT=communication-timeout-value (seconds)
-
Specify how long the monitoring host (manager) waits for a response after polling a monitored host.
You can specify a value from 1 to 3,600 (seconds). The default is 60.
If the monitored host does not respond within the specified time, message KAVA7223-E or KAVA7229-W is output, and monitoring fails.
- STOP_CHECK={YES | NO}
-
Specify whether to monitor starting and stopping of monitored hosts. Specify YES or NO. If you do not specify a value, NO is assumed.
- ERROR_DETAIL={YES | NO}
-
Specify whether to include detailed information in the message (KAVA7223-E or KAVA7229-W) that is output if remote host monitoring fails. You can specify YES or NO. The default is NO.
- POLLENDMSG={YES | NO}
-
Specify whether the polling completion message (KAVA7239-I) is to be output. Specify YES or NO. If you do not specify a value, NO is assumed.
- HOST=host-name1,host-name2,...
-
Specify the target remote hosts to be monitored. There is no need to specify this keyword if you want to monitor the local host only.
Delimit the host names with commas. You can specify multiple values for the HOST parameter. A maximum of 2,500 remote hosts can be specified. Hosts in excess of this maximum are not monitored. Host names are case-sensitive.
When you specify multiple host names, you can use a multi-line specification, in addition to the single-line specification shown above. The following shows examples of single-line and multi-line specifications whose meanings are the same:
-
Example of a single-line specification
HOST=hostA,hostB,hostC
-
Example of a multi-line specification
HOST=hostA
HOST=hostB
HOST=hostC
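The equivalence of the two forms can be illustrated by collecting HOST values from a list of lines. This is a minimal sketch with a hypothetical function name, collect_hosts. It assumes that, when more than 2,500 hosts are specified, the hosts dropped are the ones that appear last in the file; the source states only that hosts in excess of the maximum are not monitored.

```python
MAX_HOSTS = 2500  # maximum number of remote hosts that can be specified


def collect_hosts(lines):
    """Gather host names from every HOST= line, in file order.

    Names are kept exactly as written because host names are case-sensitive.
    """
    hosts = []
    for line in lines:
        if line.startswith("HOST="):
            names = line[len("HOST="):].split(",")
            hosts.extend(name for name in names if name)
    # Assumption: hosts beyond the first 2,500 are the ones left unmonitored.
    return hosts[:MAX_HOSTS]


single = collect_hosts(["HOST=hostA,hostB,hostC"])
multi = collect_hosts(["HOST=hostA", "HOST=hostB", "HOST=hostC"])
print(single == multi)
```

Both calls produce the same three-element list, so the comparison prints True.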
- THRESHOLD={monitoring-threshold | host-name:monitoring-threshold},...
-
Specify the threshold for judging whether an error has occurred on the monitored remote host. The value you specify (as the threshold) determines how many successive monitoring failures will cause JP1/Base to assume an error on the monitored host and report the error to the monitoring host (manager).
You can specify a value in the range from 1 to 64. If you omit this parameter, 1 is assumed as the value of monitoring-threshold.
- monitoring-threshold
-
The value specified for monitoring-threshold is applied to all monitored hosts. The default is 1. If you specify multiple values, only the first specified value takes effect.
- host-name:monitoring-threshold
-
The value specified for monitoring-threshold is applied to the monitored host specified for host-name. For host-name, specify a character string of no more than 255 bytes. The monitoring threshold specified in host-name:monitoring-threshold format takes precedence over the monitoring threshold specified in monitoring-threshold format.
For host-name, specify a host name that completely matches the host name specified for the HOST parameter, including the case. If you specify different thresholds with the same host name, the first specified one takes effect.
You can specify multiple monitoring thresholds in single-line and multi-line specification formats. The following shows examples of single-line and multi-line specifications whose meanings are the same:
-
Example of a single-line specification
THRESHOLD=1,hostA:3,hostB:5
-
Example of a multi-line specification
THRESHOLD=1
THRESHOLD=hostA:3
THRESHOLD=hostB:5
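The precedence rules for THRESHOLD (the first value wins, and a host-specific value overrides the general one) can be sketched as below. The function names are hypothetical, and validation of the 1 to 64 range and of the 255-byte host name limit is omitted for brevity.

```python
def parse_thresholds(lines):
    """Return (default_threshold, per_host_thresholds) from THRESHOLD= lines."""
    default = None
    per_host = {}
    for line in lines:
        if not line.startswith("THRESHOLD="):
            continue
        for item in line[len("THRESHOLD="):].split(","):
            host, sep, value = item.rpartition(":")
            if sep:
                # host-name:monitoring-threshold; the first entry per host wins
                per_host.setdefault(host, int(value))
            elif default is None:
                # plain monitoring-threshold; only the first one takes effect
                default = int(item)
    # If no general threshold is given, 1 is assumed.
    return (1 if default is None else default), per_host


def threshold_for(host, default, per_host):
    """A host-specific threshold takes precedence over the general one."""
    return per_host.get(host, default)


default, per_host = parse_thresholds(["THRESHOLD=1,hostA:3,hostB:5"])
print(threshold_for("hostA", default, per_host))  # 3
print(threshold_for("hostC", default, per_host))  # 1
```

Passing the multi-line form (three separate THRESHOLD lines) to parse_thresholds yields the same result as the single-line form used here.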
Notes
-
If the TIMEOUT value (communication timeout) is greater than the INTERVAL value (remote-host monitoring interval), monitoring might not be completed within a monitoring interval. In normal operation, do not change the communication timeout from the initial value of 60 (seconds). If you change the communication timeout, set a value smaller than the remote-host monitoring interval. If you set a communication timeout that is longer than the remote-host monitoring interval, message KAVA7237-W is output.
-
If YES is set for the ERROR_DETAIL parameter, see JP1/Base Messages for details about the errors reported by message KAVA7223-E or KAVA7229-W, including the actions to take.
-
For host-name in the THRESHOLD parameter, specify one of the monitored hosts specified for the HOST parameter. If you specify a host that is not a monitored host, message KAVA7238-W is output.
-
If you set the THRESHOLD parameter, an error on the monitored host is not reported until monitoring has failed the number of successive times specified as the monitoring threshold, so detection is delayed accordingly. In normal operation, do not change the monitoring threshold from the initial value of 1. Consider whether to adjust the monitoring threshold value if you encounter a monitoring error such as that shown in 2.7.5(5) Operation when a monitoring error occurs due to a temporary failure.
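Two of the notes above describe conditions that can be checked before the file is applied. The following is a minimal offline sketch, assuming the values have already been read from jbshc.conf. The function check_settings and its return format are hypothetical; the message IDs serve only as labels for the condition each note describes, and JP1/Base itself decides when and how the messages are actually output.

```python
def check_settings(interval, timeout, hosts, threshold_hosts):
    """Return (message_id, detail) pairs for conditions named in the notes.

    interval, timeout: INTERVAL and TIMEOUT values in seconds.
    hosts: host names from the HOST parameter.
    threshold_hosts: host names used in host-name:monitoring-threshold entries.
    """
    findings = []
    # A timeout longer than the monitoring interval leads to KAVA7237-W.
    if timeout > interval:
        findings.append(
            ("KAVA7237-W", f"TIMEOUT={timeout} exceeds INTERVAL={interval}"))
    # A THRESHOLD host that is not a monitored host leads to KAVA7238-W.
    monitored = set(hosts)  # exact, case-sensitive comparison
    for name in threshold_hosts:
        if name not in monitored:
            findings.append(("KAVA7238-W", f"{name} is not listed in HOST"))
    return findings
```

For instance, check_settings(60, 120, ["hostA"], ["hostB"]) returns one finding for each of the two conditions, while the defaults of INTERVAL=300 and TIMEOUT=60 with matching host names return an empty list.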