Hitachi

Job Management Partner 1 Version 10 Job Management Partner 1/Integrated Management - Manager Overview and System Design Guide


7.2.3 How the health check function works

The JP1/IM - Manager health check function works by having processes monitor one another.

The following table shows which processes perform monitoring in the JP1/IM - Manager health check function, and which processes each of them monitors.

Table 7‒3: Correspondence between monitoring processes and monitored processes

Monitoring processes             Monitored processes
-------------------------------  ------------------------------------
Event base service (evflow)      Event console service (evtcon)
                                 Automatic action service (jcamain)
                                 Event generation service (evgen)#1
                                 Event service (jevservice)#2
Event console service (evtcon)   Event base service (evflow)

#1: Applicable when not using the integrated monitoring database.

#2: A JP1/Base service that runs on the manager.
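
The relationships in Table 7‒3 can be sketched as a small lookup structure. This is an illustration only, not part of JP1/IM - Manager: the names MONITORS, monitored_processes, monitored_by, and the use_integrated_db parameter are hypothetical, and the process names are used simply as dictionary keys.

```python
# Illustrative model of Table 7-3 (not a JP1/IM API).
# Key: monitoring process. Value: the processes it monitors.
MONITORS = {
    "evflow": ["evtcon", "jcamain", "evgen", "jevservice"],
    "evtcon": ["evflow"],
}


def monitored_processes(monitor, use_integrated_db=False):
    """Return the processes that `monitor` checks.

    Per note #1, evgen applies only when the integrated monitoring
    database is not used, so it is dropped when use_integrated_db is True.
    """
    targets = MONITORS.get(monitor, [])
    if use_integrated_db:
        targets = [t for t in targets if t != "evgen"]
    return list(targets)


def monitored_by(process):
    """Return the processes that monitor `process`, in sorted order."""
    return sorted(m for m, targets in MONITORS.items() if process in targets)
```

In this model, monitored_by("evflow") and monitored_by("evtcon") each return the other process, which reflects the mutual monitoring between the event base service and the event console service.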

(1) Detecting process errors

In the JP1/IM - Manager health check function, a process that performs monitoring communicates over the network with the processes it monitors, to check whether the processes are working normally.

To detect process errors, the health check function sends polling signals to the monitored processes at regular intervals. If a process has not responded to the signal within a set time, the health check function regards the process as being in an abnormal state.

The interval at which processes are polled, and the number of non-responses for a process to be judged abnormal, differ according to the monitored process, as follows:

Table 7‒4: Differences in non-response count

Monitored process                     Polling interval                            Non-response count
------------------------------------  ------------------------------------------  --------------------
Event service (jevservice)            60 to 3,600 seconds (default: 300 seconds)  1 to 60 (default: 2)
Process other than the event service  60 to 3,600 seconds (default: 60 seconds)   1 to 60 (default: 3)
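
The detection rule described above can be sketched as a simple non-response counter. This is a minimal illustration under stated assumptions, not the JP1/IM - Manager implementation: the class and constant names are hypothetical, and the sketch assumes that non-responses are counted consecutively, that any response resets the count, and that a process is judged abnormal as soon as the count reaches the configured value. The value ranges and defaults are taken from Table 7‒4.

```python
from dataclasses import dataclass


@dataclass
class PollingPolicy:
    """Polling settings for one monitored process (ranges from Table 7-4)."""
    interval_sec: int
    non_response_count: int

    def __post_init__(self):
        if not 60 <= self.interval_sec <= 3600:
            raise ValueError("polling interval must be 60 to 3,600 seconds")
        if not 1 <= self.non_response_count <= 60:
            raise ValueError("non-response count must be 1 to 60")


# Defaults from Table 7-4.
EVENT_SERVICE_DEFAULT = PollingPolicy(interval_sec=300, non_response_count=2)
OTHER_PROCESS_DEFAULT = PollingPolicy(interval_sec=60, non_response_count=3)


class NonResponseCounter:
    """Tracks poll results for one monitored process.

    Assumption: misses are counted consecutively and reset on a response.
    """

    def __init__(self, policy):
        self.policy = policy
        self.misses = 0
        self.abnormal = False

    def record(self, responded):
        """Record one poll result; return 'error', 'recovered', or 'none'."""
        if responded:
            self.misses = 0
            if self.abnormal:
                self.abnormal = False
                return "recovered"
            return "none"
        self.misses += 1
        if not self.abnormal and self.misses >= self.policy.non_response_count:
            self.abnormal = True
            return "error"
        return "none"
```

With the default settings for a process other than the event service, the third consecutive call to record(False) returns "error", and a later record(True) returns "recovered". The caller would be responsible for invoking record once per polling interval.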

The figure below shows in diagrammatic form how process errors are detected.

Figure 7‒3: Communication between processes

[Figure]

(2) Reporting process errors

When the JP1/IM - Manager health check function is enabled, on detection of a process error, JP1/IM - Manager executes the following processing to report that an error has occurred:

When the failed process has been restored to normal status, message KAVB8061-I is output to the integrated trace log and to the Windows event log or UNIX syslog. If JP1 event issuance is enabled, a JP1 event (event ID: 00002014) is issued.

Reference note
  • The JP1 event with event ID 00002013 is a dummy event (an event not registered in the event database) issued to JP1/IM - View. A dummy event is issued when an error occurs in the event service in which JP1 events are registered.

  • We recommend that you set up the functionality for executing a notification command when using the JP1/IM - Manager health check function.

    Execution of a notification command is recommended because, if errors are reported only by issuing JP1 events, the user might not be made aware that JP1/IM has detected an error, and so might fail to respond promptly. This can happen when services are not being monitored in JP1/IM - View, or when a problem occurs in the event console service.