Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 1


14.2.12 Changing the wait time for recovery when an agent has failed

This subsection describes how JP1/AJS3 behaves when an agent host executing a job (a PC or Unix job other than a queueless job, or a queue, action, or custom job running on JP1/AJS3) fails or a communication error occurs. In such situations, JP1/AJS3 does not immediately assume a failure. Instead, it waits a specified time for recovery and then retries communication. Waiting prevents operation from stopping because of a temporary, recoverable failure. The default wait time is 10 minutes. However, depending on the operation, you might want to determine the failure location and take corrective action immediately rather than waiting for recovery. You can do this by reducing the wait time for recovery.

The following describes how to change the wait time for recovery when an agent host has failed.

Organization of this subsection
(1) Definition procedure
(2) Environment setting parameters

(1) Definition procedure

  1. Stop the JP1/AJS3 service.
    Execute the following commands to stop the service and to confirm that all processes have stopped:
    # /etc/opt/jp1ajs2/jajs_stop  #1
    # /opt/jp1ajs2/bin/jajs_spmd_status

    #1: Confirm that automatic termination has been set.
    In a cluster system, also stop the JP1/AJS3 service on each logical host.
  2. Execute the following command to set the environment setting parameters described in (2) below:
    jajs_config -k definition-key "parameter-name-1"=value-1 ["parameter-name-2"=value-2]

    Cautionary note:
    In a cluster system, perform this step on both the primary and secondary nodes.
  3. Restart JP1/AJS3.
    The new settings are applied.
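
    Step 2 can be sketched as a small shell helper that only prints the jajs_config command line, so that the command can be reviewed before it is run. This is a minimal sketch under stated assumptions: the function name build_recovery_command is hypothetical, and definition-key, value-1, and value-2 are placeholders for the definition key from Table 14-20 and for values written in the notation given in the parameter definitions. The helper does not call JP1/AJS3.

```shell
#!/bin/sh
# Hypothetical helper (not part of JP1/AJS3): prints the jajs_config command
# line for step 2 without executing it.
#   $1 = definition key (see Table 14-20)
#   $2 = value for QueuingJobRecoveryTime
#   $3 = value for ExecutingJobRecoveryTime
build_recovery_command() {
    if [ "$#" -ne 3 ]; then
        echo "usage: build_recovery_command definition-key queuing-value executing-value" >&2
        return 1
    fi
    printf 'jajs_config -k %s "QueuingJobRecoveryTime"=%s "ExecutingJobRecoveryTime"=%s\n' \
        "$1" "$2" "$3"
}

# Placeholders only: substitute the real definition key and values.
build_recovery_command 'definition-key' 'value-1' 'value-2'
```

    Printing the command instead of running it keeps the sketch safe to try on a host where the JP1/AJS3 service is still running. In a cluster system, the reviewed command would then be run on both the primary and secondary nodes, as the cautionary note requires.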

(2) Environment setting parameters

Table 14-20 Environment setting parameters used to set the amount of time to wait for recovery when an agent has failed

Definition key:
  • For all scheduler services
  • For a specific scheduler service
  • For submit jobs and a compatible ISAM configuration

Environment setting parameter    Explanation
"QueuingJobRecoveryTime"=        Specifies in seconds how long to wait for recovery from an agent failure related to a queued job.
"ExecutingJobRecoveryTime"=      Specifies in seconds how long to wait for recovery from an agent failure related to a job being executed.

How you specify the {JP1_DEFAULT|logical-host} part of a definition key depends on whether the host is a physical host or a logical host. For a physical host, specify JP1_DEFAULT. For a logical host, specify the logical host name.
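
Because both parameters are specified in seconds, a wait time planned in minutes must be converted before it is set. The following is a minimal sketch of that arithmetic, using the default of 10 minutes as the example. The function name minutes_to_seconds is hypothetical. The value notation expected by jajs_config is not shown in this subsection, so the hexadecimal form is printed only on the assumption that the parameter definition might call for it. Check the definition of each parameter for the required notation and the permitted range.

```shell
#!/bin/sh
# Hypothetical helper: converts a wait time in minutes to seconds.
minutes_to_seconds() {
    echo $(( $1 * 60 ))
}

# The default wait time of 10 minutes, expressed in seconds.
default_seconds=$(minutes_to_seconds 10)

# Prints: default: 600 seconds (hexadecimal 00000258)
printf 'default: %d seconds (hexadecimal %08X)\n' "$default_seconds" "$default_seconds"
```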

For details about the definition of these environment setting parameters, see the following:



Copyright (C) 2009, 2010, Hitachi, Ltd.
Copyright (C) 2009, 2010, Hitachi Solutions, Ltd.