Hitachi

JP1 Version 12 JP1/Automatic Job Management System 3 Configuration Guide


6.3.14 Environment setting parameters related to communication for event/action control

When an event job or a jobnet with start conditions is executed, the event/action control manager and the event/action control agent communicate with each other. To initiate communication, the event/action control manager and agent establish a connection over which an execution or kill request for the event job or jobnet with start conditions and an event occurrence report can be exchanged.

The following figure shows the communication that occurs when an event job or a jobnet with start conditions is executed.

Figure 6‒2: Communication when an event job or a jobnet with start conditions is executed

[Figure]

If an error occurs during communication, the information that could not be sent is saved in a file to prepare for a retry. This information is called unreported information.

If a communication error occurs, communication is retried as defined in the environment setting parameters.

The following table describes the environment setting parameters related to communication retries for event/action control.

Table 6‒63: Environment setting parameters related to communication retries for event/action control

Definition key:

  For the manager host:

    [{JP1_DEFAULT|logical-host}\JP1AJSMANAGER\scheduler-service\NETWORK\EVMANAGER]#

  For the agent host:

    [{JP1_DEFAULT|logical-host}\JP1AOMAGENT]#

Environment setting parameter: "ClientConnectTimeout"=

Explanation: Connection timeout period


Definition key:

  • For all scheduler services

    [{JP1_DEFAULT|logical-host}\JP1AJS2\SCHEDULER\EV\MANAGER]#

  • For a specific scheduler service

    [{JP1_DEFAULT|logical-host}\JP1AJSMANAGER\scheduler-service\EV\MANAGER]#

Environment setting parameter: "NotificationConstantRetry"=

Explanation: Option for resending unreported information at regular intervals


Definition key:

  For the manager host:

  • For all scheduler services

    [{JP1_DEFAULT|logical-host}\JP1AJS2\SCHEDULER\EV\MANAGER]#

  • For a specific scheduler service

    [{JP1_DEFAULT|logical-host}\JP1AJSMANAGER\scheduler-service\EV\MANAGER]#

  For the agent host:

    [{JP1_DEFAULT|logical-host}\JP1AOMAGENT]#

Environment setting parameters and explanations:

  • "NotificationRetryInterval"=
    Interval for retrying to send unreported information

  • "NotificationRetryCount"=
    Maximum number of retries for sending unreported information

#:

The specification of the {JP1_DEFAULT|logical-host} part depends on whether the host is a physical host or a logical host. For a physical host, specify JP1_DEFAULT. For a logical host, specify the logical host name.
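
As an illustration of the note above, the following Python sketch builds a definition key for a physical host or a logical host. The function name definition_key and the logical host name node0 are hypothetical and are not part of JP1/AJS3.

```python
from typing import Optional


def definition_key(subkey: str, logical_host: Optional[str] = None) -> str:
    """Build a definition key string for the given subkey.

    subkey is the part after the host name, for example "JP1AOMAGENT".
    logical_host is None for a physical host (a convention of this sketch).
    """
    # A physical host uses JP1_DEFAULT; a logical host uses its own name.
    host = logical_host if logical_host else "JP1_DEFAULT"
    return "[" + host + "\\" + subkey + "]"


# Physical host: the host part is JP1_DEFAULT.
print(definition_key("JP1AOMAGENT"))
# Logical host: the host part is the logical host name ("node0" is made up).
print(definition_key("JP1AOMAGENT", logical_host="node0"))
```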

The following describes the relationship between the environment setting parameters, and provides examples of setting these parameters.

(1) About ClientConnectTimeout

When the event/action control manager sends a connection request to the event/action control agent, or when the event/action control agent sends a connection request to the event/action control manager, the sender waits for a response. If no response is returned within a predefined time, the wait times out so that other processing can be performed. The time during which the manager or agent waits for a response to a connection request is called the connection timeout period.

Use the ClientConnectTimeout environment setting parameter to set the connection timeout period.
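
The idea of a connection timeout period can be sketched in Python as follows. This is a generic illustration that uses the standard socket module, not the JP1/AJS3 implementation. The function name try_connect is hypothetical, and the sketch assumes that the timeout is given in milliseconds, the unit listed for ClientConnectTimeout in Table 6‒64.

```python
import socket


def try_connect(host: str, port: int, timeout_ms: int) -> bool:
    """Return True if a TCP connection is established within timeout_ms."""
    try:
        # create_connection takes its timeout in seconds, so convert from ms.
        with socket.create_connection((host, port), timeout=timeout_ms / 1000):
            return True
    except OSError:
        # Covers a timeout as well as a refused or unreachable connection.
        return False
```

In this sketch, a larger timeout_ms tolerates slow responses, but the caller stays blocked longer when the peer cannot be reached.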

The following figure shows the connection timeout period set by using the ClientConnectTimeout environment setting parameter.

Figure 6‒3: Connection timeout period set by using the ClientConnectTimeout environment setting parameter

[Figure]

Increasing the value of the ClientConnectTimeout environment setting parameter lengthens the connection timeout period. Accordingly, connection timeouts occur less often even when a long time is required to receive a response because of communication load.

However, if no response to a connection request is returned for a long time because of a network device failure or a similar problem, the time that elapses before the timeout also increases. During that time, neither execution registration or kill requests for event jobs or jobnets with start conditions nor event occurrence reports are processed. While the manager or agent is waiting for a timeout, a request to register or kill an event job or a jobnet with start conditions on another agent that is available for communication cannot be processed immediately, so changing the job status takes a long time. Therefore, when a connection timeout occurs, the manager or agent with default settings does not retry at a regular interval. Instead, it gradually increases the interval for each retry in order to reduce the retry frequency. For details, see (2) About NotificationConstantRetry.

(2) About NotificationConstantRetry

Depending on the value of the ClientConnectTimeout environment setting parameter, a long time might elapse before a connection request times out if a network device failure or other problem occurs. In such cases, there is a long delay before an event job or a jobnet with start conditions is registered for execution or killed. To reduce the frequency of such processing delays, the communication retry interval used when a connection timeout occurs is, by default, not a regular interval but one that gradually increases. Specifically, retries are performed at successive intervals of 300 seconds, 600 seconds, 900 seconds, 1,800 seconds, and 3,600 seconds (3,600 seconds is the interval for every retry thereafter), until a total of 27 retries have been performed (the retry intervals total 24 hours).
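
The default retry schedule for connection timeouts can be checked with a short Python sketch. The function name default_timeout_retry_intervals is hypothetical. The sketch only reproduces the interval values stated above and does not model how JP1/AJS3 schedules retries internally.

```python
from itertools import chain, islice, repeat
from typing import List


def default_timeout_retry_intervals(max_retries: int = 27) -> List[int]:
    """Return the retry intervals (in seconds) used after connection timeouts."""
    ramp = [300, 600, 900, 1800]  # the first four intervals
    # After the ramp, every interval is 3,600 seconds.
    return list(islice(chain(ramp, repeat(3600)), max_retries))


intervals = default_timeout_retry_intervals()
print(len(intervals))         # 27
print(sum(intervals) / 3600)  # 24.0 (hours)
```

The four ramp intervals add up to 3,600 seconds, and the remaining 23 intervals of 3,600 seconds bring the total to 86,400 seconds.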

The following figure shows the communication between the event/action control manager and the event/action control agent when a connection timeout occurs.

Figure 6‒4: Communication between the event/action control manager and the event/action control agent when a connection timeout occurs

[Figure]

However, if a connection timeout is due to a temporary cause such as a high communication load, the retry process described above takes more time, delaying the execution of an event job or jobnet with start conditions on the execution agent. For such cases, you can also use a regular interval for retries.

Set Y for the NotificationConstantRetry environment setting parameter to use a regular interval for retries, irrespective of whether retries are due to connection timeouts or other types of errors. For details about the retry interval, see (3) About NotificationRetryInterval and NotificationRetryCount.
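
The effect of the NotificationConstantRetry environment setting parameter can be summarized with the following Python sketch. The function name retry_intervals and its arguments are hypothetical. The sketch assumes that, when Y is set, retries after a connection timeout use the values of NotificationRetryInterval and NotificationRetryCount in the same way as retries after other errors.

```python
from typing import List

ESCALATING = [300, 600, 900, 1800]  # seconds; 3,600 seconds thereafter
TIMEOUT_RETRY_COUNT = 27            # default number of retries after timeouts


def retry_intervals(constant_retry: str, is_timeout: bool,
                    interval: int = 30, count: int = 2880) -> List[int]:
    """Return the list of retry intervals in seconds (illustrative model)."""
    if constant_retry == "N" and is_timeout:
        # Default behavior for connection timeouts: gradually increasing intervals.
        schedule = ESCALATING + [3600] * TIMEOUT_RETRY_COUNT
        return schedule[:TIMEOUT_RETRY_COUNT]
    # Regular interval: other errors, and timeouts when Y is set (assumption).
    return [interval] * count


print(retry_intervals("N", True)[:5])                # [300, 600, 900, 1800, 3600]
print(retry_intervals("Y", True, interval=5)[:3])    # [5, 5, 5]
```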

(3) About NotificationRetryInterval and NotificationRetryCount

In addition to a connection timeout, a communication error might also be caused by the following problems:

For retries performed for an error other than a connection timeout that occurs during communication between the event/action control manager and event/action control agent, you can set the retry interval by using the NotificationRetryInterval environment setting parameter (the default is 30 seconds). Similarly, you can set the maximum number of retries by using the NotificationRetryCount environment setting parameter (the default is 2,880).

The following figure shows an example of an error that is not a timeout error.

Figure 6‒5: When an error other than a connection timeout occurs

[Figure]

Note that if you change only the retry interval or only the number of retries, the retry period (the period during which retries can be performed) also changes. To keep the retry period unchanged, you need to adjust the values of both environment setting parameters. For example, if you change the retry interval to 15 seconds, which is half the default value, set the number of retries to 5,760 (twice the default value) to preserve the retry period.
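
The arithmetic behind this example can be sketched in Python. The function names retry_period and count_for_period are hypothetical. The sketch treats the retry period as the retry interval multiplied by the number of retries and ignores the time taken by each attempt.

```python
def retry_period(interval_s: int, count: int) -> int:
    """Return the retry period in seconds: interval multiplied by retry count."""
    return interval_s * count


def count_for_period(period_s: int, interval_s: int) -> int:
    """Return the number of retries needed to cover period_s at interval_s."""
    # Ceiling division, so that the period is never shortened.
    return -(-period_s // interval_s)


default_period = retry_period(30, 2880)
print(default_period)                        # 86400 seconds (24 hours)
print(count_for_period(default_period, 15))  # 5760
```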

(4) Guideline for environment setting parameter settings

The following table provides the guidelines for environment setting parameter settings based on what is most important for communication.

Table 6‒64: Guidelines for environment setting parameter settings

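Each entry lists the environment setting parameters requiring adjustment, followed by a cautionary note.

Default value

  • "ClientConnectTimeout" (in milliseconds): Windows: 30,000; UNIX: 1,000#1
  • "NotificationConstantRetry": N
  • "NotificationRetryInterval" (in seconds): 30
  • "NotificationRetryCount": 2,880 or 27#2
  • Cautionary note: N/A

Most important consideration: Suppress processing delays for other agents when a timeout occurs for an agent during communication

  • "ClientConnectTimeout" (in milliseconds): 3,000 to 10,000
  • "NotificationConstantRetry": N
  • "NotificationRetryInterval" (in seconds): N/A
  • "NotificationRetryCount": N/A
  • Cautionary note: Because the retry interval gradually increases by 300, 600, 900, 1,800, and 3,600 seconds even when communication with the agent no longer times out, a long time is still required before sending is retried.

Most important consideration: Prevent timeouts for agents during communication

  • "ClientConnectTimeout" (in milliseconds): 10,000 to 60,000
  • "NotificationConstantRetry": N
  • "NotificationRetryInterval" (in seconds): N/A
  • "NotificationRetryCount": N/A
  • Cautionary note: If a timeout occurs for an agent during communication, processing for other agents might be delayed and event detection might be disabled.

Most important consideration: Suppress processing delays with a quick recovery response for the communication environment if a temporary communication error occurs in an otherwise stable communication environment

  • "ClientConnectTimeout" (in milliseconds): 3,000 to 10,000
  • "NotificationConstantRetry": Y
  • "NotificationRetryInterval" (in seconds): 3 to 10
  • "NotificationRetryCount": 2,880#2
  • Cautionary note: If timeouts occur for an agent in rapid succession or continue over a long time during communication, processing for other agents might be delayed and event detection might be disabled.

Most important consideration: Ensure communication in an unstable communication environment even if communication is delayed

  • "ClientConnectTimeout" (in milliseconds): 10,000 to 60,000
  • "NotificationConstantRetry": Y
  • "NotificationRetryInterval" (in seconds): 3 to 10
  • "NotificationRetryCount": 8,640 to 28,800
  • Cautionary note: If timeouts occur for an agent in rapid succession or continue over a long time during communication, processing for other agents might be delayed and event detection might be disabled.

Most important consideration: Detect errors at an early stage

  • "ClientConnectTimeout" (in milliseconds): 3,000 to 10,000
  • "NotificationConstantRetry": Y
  • "NotificationRetryInterval" (in seconds): 3 to 10
  • "NotificationRetryCount": 30 to 100
  • Cautionary note: An event job might end abnormally, in which case the KAVT0103-E message is output to the integrated trace log. Monitoring for this message allows environment errors to be detected.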

Legend:

N/A: Not applicable.

#1

The default values are very different for Windows and UNIX because the default values in UNIX have backward compatibility with the settings of JP1/AJS2 version 8.

In version 8, the ClientConnectTimeout environment setting parameter does not exist, but the operation is the same as when the environment setting parameter is set to 1,000. The UNIX default value is based on this value.

#2

Use 2,880 for errors that are not timeout errors. Use 27 for timeout errors that continue to occur.
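
The guideline values in Table 6‒64 can also be expressed as data, for example to check a planned combination of settings. The following Python sketch is one way to do so. The goal names, the GUIDELINES structure, and the function follows_guideline are hypothetical. Only the numeric ranges and the Y/N values are taken from the table.

```python
from typing import Dict, Optional, Tuple

Range = Tuple[int, int]

# Per goal: (ClientConnectTimeout in ms, NotificationConstantRetry,
#            NotificationRetryInterval in s, NotificationRetryCount).
# None stands for N/A in the table.
GUIDELINES: Dict[str, Tuple[Range, str, Optional[Range], Optional[Range]]] = {
    "suppress_delays_for_other_agents": ((3000, 10000), "N", None, None),
    "prevent_timeouts": ((10000, 60000), "N", None, None),
    "quick_recovery_when_stable": ((3000, 10000), "Y", (3, 10), (2880, 2880)),
    "ensure_communication_when_unstable": ((10000, 60000), "Y", (3, 10), (8640, 28800)),
    "detect_errors_early": ((3000, 10000), "Y", (3, 10), (30, 100)),
}


def _in_range(value: Optional[int], rng: Optional[Range]) -> bool:
    # An N/A entry does not constrain the value.
    if rng is None:
        return True
    return value is not None and rng[0] <= value <= rng[1]


def follows_guideline(goal: str, timeout_ms: int, constant_retry: str,
                      interval_s: Optional[int] = None,
                      count: Optional[int] = None) -> bool:
    """Return True if the settings fall within the guideline for the goal."""
    t_range, flag, i_range, c_range = GUIDELINES[goal]
    return (_in_range(timeout_ms, t_range) and constant_retry == flag
            and _in_range(interval_s, i_range) and _in_range(count, c_range))


print(follows_guideline("detect_errors_early", 5000, "Y", 5, 50))  # True
print(follows_guideline("prevent_timeouts", 5000, "N"))            # False
```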

For details about the definition of each environment setting parameter, see the following documentation:

For details about the definition of the environment setting parameters related to communication between the event/action control manager and the event/action control agent, see the following documentation:

For details about the definition of the environment setting parameter related to communication from the event/action control agent to the event/action control manager, see the following documentation: