Hitachi

Job Management Partner 1 Version 10 Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 1


6.2.8 Changing the interval and number of retry attempts when a TCP/IP connection error occurs

In job execution control, TCP/IP is used to pass information between processes when registering and delivering jobs, reporting and checking the job status, and checking the agent host status. However, if the destination host is not running or a network error has occurred, the TCP/IP connection fails.

If the other host does not respond to a TCP/IP connection request, job execution control first waits a maximum of 90 seconds for a response, and then makes two retry attempts at 20-second intervals (under the default settings). If both retry attempts also fail, four or five minutes might pass before the connection finally results in an error.
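The figure of four or five minutes can be checked with a little arithmetic. The following is a minimal sketch, assuming that every attempt waits for the full connection timeout and that each retry is preceded by one retry interval. The function name and this timing model are illustrative assumptions, not a description of the product's internal logic.

```python
def worst_case_detection_seconds(connect_timeout_s: float,
                                 retry_count: int,
                                 retry_interval_s: float) -> float:
    """Estimate the longest time before a connection error is reported.

    Assumed model: one initial attempt plus `retry_count` retries, each
    attempt waiting the full timeout, with one interval before each retry.
    """
    attempts = 1 + retry_count
    return attempts * connect_timeout_s + retry_count * retry_interval_s


# Default settings quoted in the text: 90 s timeout, 2 retries, 20 s interval.
default_s = worst_case_detection_seconds(90, 2, 20)
print(f"{default_s:.0f} s (about {default_s / 60:.1f} min)")
# prints: 310 s (about 5.2 min)
```

Under this model the default settings give an upper bound of 310 seconds, which is consistent with the delay described above. Reducing any of the three values shortens the bound accordingly.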

If a communication error occurs during processing that registers or delivers a job, reports or checks the job status, or checks the agent host status, detection of the error might be delayed. This delay might result in a further delay in changing the job status.

If TCP/IP connection errors are frequent, you can set smaller values for the connection timeout value, the number of retry attempts, and the retry interval to speed up the detection of an error.

For details about changing the settings for delivering jobs to agent hosts, checking the job status, and checking the agent host status on the manager side, see 2.6 Setting up the communication control environment in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2.

Notes on communication errors caused by insufficient socket ports

When a system has a large number of jobs to execute per unit of time, the number of socket ports used for TCP/IP communication increases. This can result in insufficient socket ports being available. For communication errors that result from insufficient socket ports, the system retries communication at regular intervals. However, failure to resolve the situation by the time communication is retried may cause delays in job execution, or result in the abnormal termination of jobs, scheduler services, and commands.

If you encounter an error related to insufficient socket ports, refer to 3.1.1(5) OS tuning in the Job Management Partner 1/Automatic Job Management System 3 System Design (Configuration) Guide and adjust the operating system parameters as needed.

The retry behaviour of JP1/AJS3 in the event of a communication error related to insufficient socket ports depends on your operating system.

  • In Windows Server 2003

    The environment setting parameters (for communication control) listed in Table 6-21 and 2.6 Setting up the communication control environment in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2 do not apply in the event of a communication error related to insufficient socket ports. Consequently, you cannot change the retry interval or the number of retry attempts for such an error. The system attempts 48 retries at 20-second intervals (a total of 960 seconds) to check whether socket ports have become available.

  • In operating systems other than Windows Server 2003

    The environment setting parameters (for communication control) listed in Table 6-21 and 2.6 Setting up the communication control environment in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2 apply when a communication error related to insufficient socket ports occurs.
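The fixed-interval retry described for Windows Server 2003 can be pictured with a short simulation. This is only a sketch: the function and parameter names are hypothetical, the `attempt` callable stands in for "check whether a socket port is available", and the sleep function is injected so that the example runs instantly.

```python
from typing import Callable, Tuple


def retry_at_fixed_interval(attempt: Callable[[], bool],
                            retries: int,
                            interval_s: float,
                            sleep: Callable[[float], None]) -> Tuple[bool, float]:
    """Try once, then retry up to `retries` times at a fixed interval.

    Returns (succeeded, total seconds spent waiting between attempts).
    """
    waited_s = 0.0
    if attempt():
        return True, waited_s
    for _ in range(retries):
        sleep(interval_s)          # wait before the next retry
        waited_s += interval_s
        if attempt():
            return True, waited_s
    return False, waited_s


# Ports never become available: 48 retries at 20-second intervals.
succeeded, waited_s = retry_at_fixed_interval(
    attempt=lambda: False, retries=48, interval_s=20, sleep=lambda s: None)
print(succeeded, waited_s)
# prints: False 960.0
```

The simulated waiting time of 960 seconds matches the total stated for Windows Server 2003. On other operating systems the retry count and interval come from the environment setting parameters instead of being fixed.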

The following table lists the definition keys for which values are to be changed, and their purpose.

Table 6‒20: Definition keys for which values are to be changed

Definition key:
  • For all scheduler services
    JP1AJS2\SCHEDULER\QUEUE\MANAGER\Network
  • For a specific scheduler service
    JP1AJSMANAGER\scheduler-service\QUEUE\MANAGER\Network
  • For submit jobs and a compatible ISAM configuration
    JP1NBQMANAGER\Network
Purpose: Reporting the job status

Definition key: JP1NBQAGENT\Network
Purpose: Reporting the job status

Definition key: JP1NBQCLIENT\Network
Purpose: Registering a job from the scheduler and executing a job from a command

Definition key:
  • For all scheduler services
    JP1AJS2\SCHEDULER\QUEUE\NOTIFY\Network
  • For a specific scheduler service
    JP1AJSMANAGER\scheduler-service\QUEUE\NOTIFY\Network
  • For submit jobs and a compatible ISAM configuration
    JP1NBQNOTIFY\Network
Purpose: Checking the job status on another system (such as JP1/NQSEXEC or JP1/OJE) and reporting the status

The following describes how to set the connection timeout value, retry interval, and number of retry attempts in job execution control.

Note that the procedure described below is not necessary if the queueless job execution facility is used.

(1) Definition procedure

  1. In Windows Control Panel, open the Services administrative tool, and stop the following service:

    • JP1/AJS3 service

  2. Execute the following command to set the environment setting parameters described in (2) below:

    jajs_config -k definition-key "parameter-name-1"=value-1
    ["parameter-name-2"=value-2] 
    ["parameter-name-3"=value-3]

    You can specify only one definition key. If you want to set environment setting parameters for different definition keys, you must execute the jajs_config command for each definition key.

  3. Restart JP1/AJS3.

    The new settings are applied.
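Because jajs_config accepts only one definition key per execution, changing parameters under several keys means issuing one command per key. The following is a minimal sketch that assembles such command lines from the syntax shown in step 2. The helper name is hypothetical, the values are placeholders (the value format is defined in Configuration Guide 2 and is not reproduced here), and the definition keys are passed through verbatim because this section does not state how they must be quoted.

```python
from typing import Dict, List


def build_jajs_config_commands(settings: Dict[str, Dict[str, str]]) -> List[str]:
    """Return one jajs_config command line per definition key.

    `settings` maps a definition key to {parameter name: value}.
    Keys and values are used exactly as supplied.
    """
    commands = []
    for definition_key, params in settings.items():
        if not params:
            continue  # nothing to set for this key
        pairs = " ".join(f'"{name}"={value}' for name, value in params.items())
        commands.append(f"jajs_config -k {definition_key} {pairs}")
    return commands


# Placeholder values only; substitute values in the documented format.
example = {
    r"[JP1_DEFAULT\JP1NBQAGENT\Network]": {
        "ConnectTimeout": "value-1",
        "CommunicateRetryCount": "value-2",
        "CommunicateRetryInterval": "value-3",
    },
    r"[JP1_DEFAULT\JP1NBQCLIENT\Network]": {"ConnectTimeout": "value-1"},
}
for command in build_jajs_config_commands(example):
    print(command)
```

The sketch only prints the command lines. Running them belongs between stopping the JP1/AJS3 service and restarting it, as in the procedure above.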

(2) Environment setting parameters

Table 6‒21: Environment setting parameters for job execution control

No. 1
  Definition key:
    • For all scheduler services
      [{JP1_DEFAULT|logical-host}\JP1AJS2\SCHEDULER\QUEUE\MANAGER\Network]#
    • For a specific scheduler service
      [{JP1_DEFAULT|logical-host}\JP1AJSMANAGER\scheduler-service\QUEUE\MANAGER\Network]#
    • For submit jobs and a compatible ISAM configuration
      [{JP1_DEFAULT|logical-host}\JP1NBQMANAGER\Network]#
  Environment setting parameter: "ConnectTimeout"=
  Explanation: Defines the timeout value (in milliseconds) for a TCP/IP connection attempted by the job execution control manager.

No. 2
  Definition key: Same as No. 1
  Environment setting parameter: "CommunicateRetryCount"=
  Explanation: Defines the maximum number of retry attempts for a TCP/IP connection attempted by the job execution control manager.

No. 3
  Definition key: Same as No. 1
  Environment setting parameter: "CommunicateRetryInterval"=
  Explanation: Defines the retry interval (in seconds) for a TCP/IP connection attempted by the job execution control manager.

No. 4
  Definition key: [{JP1_DEFAULT|logical-host}\JP1NBQAGENT\Network]#
  Environment setting parameter: "ConnectTimeout"=
  Explanation: Defines the timeout value (in milliseconds) for a TCP/IP connection attempted by the job execution control agent.

No. 5
  Definition key: Same as No. 4
  Environment setting parameter: "CommunicateRetryCount"=
  Explanation: Defines the maximum number of retry attempts for a TCP/IP connection attempted by the job execution control agent.

No. 6
  Definition key: Same as No. 4
  Environment setting parameter: "CommunicateRetryInterval"=
  Explanation: Defines the retry interval (in seconds) for a TCP/IP connection attempted by the job execution control agent.

No. 7
  Definition key: [{JP1_DEFAULT|logical-host}\JP1NBQCLIENT\Network]#
  Environment setting parameter: "ConnectTimeout"=
  Explanation: Defines the timeout value (in milliseconds) for a TCP/IP connection attempted by job execution commands and the scheduler.

No. 8
  Definition key: Same as No. 7
  Environment setting parameter: "CommunicateRetryCount"=
  Explanation: Defines the maximum number of retry attempts for a TCP/IP connection attempted by job execution commands and the scheduler.

No. 9
  Definition key: Same as No. 7
  Environment setting parameter: "CommunicateRetryInterval"=
  Explanation: Defines the retry interval (in seconds) for a TCP/IP connection attempted by job execution commands and the scheduler.

No. 10
  Definition key:
    • For all scheduler services
      [{JP1_DEFAULT|logical-host}\JP1AJS2\SCHEDULER\QUEUE\NOTIFY\Network]#
    • For a specific scheduler service
      [{JP1_DEFAULT|logical-host}\JP1AJSMANAGER\scheduler-service\QUEUE\NOTIFY\Network]#
    • For submit jobs and a compatible ISAM configuration
      [{JP1_DEFAULT|logical-host}\JP1NBQNOTIFY\Network]#
  Environment setting parameter: "ConnectTimeout"=
  Explanation: Defines the timeout value (in milliseconds) for a TCP/IP connection attempted by the process that reports the job execution control status.

No. 11
  Definition key: Same as No. 10
  Environment setting parameter: "CommunicateRetryCount"=
  Explanation: Defines the maximum number of retry attempts for a TCP/IP connection attempted by the process that reports the job execution control status.

No. 12
  Definition key: Same as No. 10
  Environment setting parameter: "CommunicateRetryInterval"=
  Explanation: Defines the retry interval (in seconds) for a TCP/IP connection attempted by the process that reports the job execution control status.

#: The {JP1_DEFAULT|logical-host} part of the definition key depends on whether the host is a physical host or a logical host. For a physical host, specify JP1_DEFAULT. For a logical host, specify the logical host name.
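The substitution described in this note can be sketched as a small helper that fills in the host part, and the scheduler-service placeholder where one is present. The function name, the logical host name LHOST1, and the scheduler service name AJSROOT1 are illustrative assumptions and do not come from this section.

```python
from typing import Optional


def qualify_definition_key(key_suffix: str,
                           logical_host: Optional[str] = None,
                           scheduler_service: Optional[str] = None) -> str:
    """Build a bracketed definition key in the form shown in Table 6-21.

    Uses JP1_DEFAULT for a physical host (no logical host given) and
    replaces the literal placeholder 'scheduler-service' when present.
    """
    host_part = logical_host if logical_host else "JP1_DEFAULT"
    if "scheduler-service" in key_suffix:
        if not scheduler_service:
            raise ValueError("this key needs a scheduler service name")
        key_suffix = key_suffix.replace("scheduler-service", scheduler_service)
    return f"[{host_part}\\{key_suffix}]"


# Physical host
print(qualify_definition_key(r"JP1NBQAGENT\Network"))
# Logical host with a specific scheduler service (hypothetical names)
print(qualify_definition_key(
    r"JP1AJSMANAGER\scheduler-service\QUEUE\MANAGER\Network",
    logical_host="LHOST1", scheduler_service="AJSROOT1"))
```

Raising an error when the scheduler service name is missing keeps the placeholder text from ending up in a definition key by accident.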

For details about the definition of these environment setting parameters, see the following:

  1. 2.3.2(24) ConnectTimeout in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  2. 2.3.2(25) CommunicateRetryCount in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  3. 2.3.2(26) CommunicateRetryInterval in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  4. 2.3.2(67) ConnectTimeout in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  5. 2.3.2(68) CommunicateRetryCount in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  6. 2.3.2(69) CommunicateRetryInterval in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  7. 2.3.2(75) ConnectTimeout in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  8. 2.3.2(76) CommunicateRetryCount in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  9. 2.3.2(77) CommunicateRetryInterval in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  10. 2.3.2(82) ConnectTimeout in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  11. 2.3.2(83) CommunicateRetryCount in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2

  12. 2.3.2(84) CommunicateRetryInterval in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 2