Hitachi

JP1 Version 12 JP1/Automatic Job Management System 3 System Design (Configuration) Guide


3.1.1 Job throughput

When calculating the throughput of a JP1/AJS3 system, you need to consider job execution time as distinct from the JP1/AJS3 processing performance. The execution time of a job runs from the time JP1/AJS3 starts a job execution process until the job has actually ended. Thus, when estimating how many jobs can be executed per unit time, bear in mind both the JP1/AJS3 processing performance and job execution times. You will also need to consider transfer times if large quantities of data are output to the standard error output, or if a large number of files are transferred.


(1) Job processing performance

JP1/AJS3 processing capacity is estimated as the number of job executions per unit of time. You can consider light load conditions, where many "jobs that do nothing" are executed, as giving the peak performance for JP1/AJS3. Check that the number of jobs to be executed is comfortably below this peak performance.

Most jobs tend to be concentrated in particular time periods, so for a balanced performance estimate you have to ensure that the system remains within its peak performance during the period when the concentration of jobs is highest.

When estimating throughput, allow some latitude for the possibility of execution errors and recovery processing.

We recommend, for the sake of simplicity, that you base your estimates on the assumption that 10 times the average number of jobs will be executed during the most concentrated period, and that the daily volume of jobs will use about one tenth of the peak system performance. The following table describes performance estimates for a system that runs jobs 12 hours per day.

Note that the peak performance will vary depending on the hardware you use and other factors.

Table 3‒1: Example estimate for JP1/AJS3 job throughput

Peak performance (number of jobs executed per second) | Peak performance (number of jobs executed per hour) | Recommended number of jobs executed per day with 12 hours' operation
0.5 | 1,800 | 2,160
1.0 | 3,600 | 4,320
2.0 | 7,200# | 8,640
3.0 | 10,800# | 12,960
4.0 | 14,400# | 17,280
5.0 | 18,000# | 21,600

#

In practice, we recommend that you keep the number of jobs started to no more than 5,000.

For example, suppose that 0.5 jobs are executed per second at peak times.

The number of jobs executed per hour in this case will be:

(0.5 jobs) x (3,600 seconds) = 1,800 jobs

Operating 12 hours per day, and at peak performance throughout the day, the number of jobs executed per day would be:

(1,800 jobs) x (12 hours) = 21,600 jobs

Consequently, the recommended number of job executions would be:

(21,600 jobs) / 10 = 2,160 jobs
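The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation in this subsection; the function name and its parameters (hours_per_day, headroom) are hypothetical and are not part of JP1/AJS3.

```python
SECONDS_PER_HOUR = 3600


def recommended_daily_jobs(peak_jobs_per_second, hours_per_day=12, headroom=10):
    """Return (peak jobs per hour, recommended jobs per day).

    headroom=10 reflects the guideline that the daily volume of jobs
    should use about one tenth of the peak system performance.
    """
    peak_per_hour = peak_jobs_per_second * SECONDS_PER_HOUR
    peak_per_day = peak_per_hour * hours_per_day
    return peak_per_hour, peak_per_day / headroom


# Reproduce the rows of Table 3-1.
for rate in (0.5, 1.0, 2.0, 3.0, 4.0, 5.0):
    per_hour, per_day = recommended_daily_jobs(rate)
    print(f"{rate:.1f} jobs/s: {per_hour:,.0f} jobs/h peak, "
          f"{per_day:,.0f} jobs/day recommended")
```

For a system that runs jobs for a different number of hours per day, pass that value as hours_per_day. The peak rate itself has to be measured on your own hardware.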

(2) Event job processing performance

The discussion in (1) Job processing performance above does not apply to event jobs (including those within a start condition) because they have a different execution process from standard jobs.

Certain restrictions apply to event jobs, such as the maximum number of event jobs that can be registered for execution at the same time. Use event jobs judiciously, referring to B.8 Limits for the event/action control.

First estimate the number of events likely to be generated, and keep them to within the system's processing capacity. See 3.1.5 Event monitoring performance.

(3) Activating multiple scheduler services concurrently

When a manager host has two or more CPUs, there are limits to fully utilizing those resources if just one scheduler service is used. We recommend running multiple scheduler services in this situation. For details about using multiple scheduler services, see 4.3.2 Activating multiple scheduler services concurrently.

On a host with only one CPU, running multiple scheduler services within the limits of the CPU's processing capacity might still be an efficient use of resources. We recommend that you consider this option.

(4) Distributing job execution among multiple agent hosts

To utilize the manager host's processing capacity to its fullest extent, we recommend a system configuration in which jobs are executed on multiple agent hosts, thereby helping to reduce the load at the manager host.

For details about configurations that distribute the processing load among multiple agent hosts, see 2.5.2 Load distribution.

(5) OS tuning

(a) TCP/IP parameters

When a JP1/AJS3 system has a large number of jobs to execute, the number of socket ports used for communication between internal processes and between the manager and agent hosts might exceed the OS limit. This can result in job execution delays and abnormal termination of jobs, scheduler services, and commands.

For this reason, you must adjust the OS's TCP/IP parameters to avoid insufficient socket ports at peak loads. Adjust the following parameters:

  • Number of socket ports

  • Socket port TIME_WAIT interval

The following table lists the defaults for these parameters in each OS.

Table 3‒2: OS defaults for TCP/IP parameters

OS | Number of socket ports that can be used | TIME_WAIT interval (seconds)
Windows | 16,383 | 120
HP-UX (IPF) | 16,383 | 60
Solaris | 32,768 | 60
AIX | 32,768 | 15
Linux | 28,233 | 60

The number of socket ports and the value of the TIME_WAIT interval might vary according to the OS version and service pack. See the latest documentation for your OS.

In Windows, you can change the TCP/IP parameters by editing the following registry values and then restarting Windows. For the setting procedure, or if you are using an OS other than Windows, see your OS documentation.

  • Registry key

    \\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

  • Parameters

    • MaxUserPort

    • TcpTimedWaitDelay
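As a rough aid for checking the current values before you change them, the following Python sketch reads the two parameters by using the standard winreg module. The function name is hypothetical. The sketch assumes that a value missing from the registry has not been set explicitly, in which case the OS default would normally apply. On an OS other than Windows it returns an empty result.

```python
import sys

# Key path under HKEY_LOCAL_MACHINE, as given in this subsection.
TCPIP_KEY = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE_NAMES = ("MaxUserPort", "TcpTimedWaitDelay")


def read_tcpip_parameters():
    """Return {name: value or None}; an empty dict when not on Windows."""
    if sys.platform != "win32":
        return {}
    import winreg  # standard library, available on Windows only

    result = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP_KEY) as key:
        for name in VALUE_NAMES:
            try:
                value, _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                value = None  # assumption: value not set explicitly
            result[name] = value
    return result


print(read_tcpip_parameters())
```

The sketch only reads the values. To change them, follow the procedure in your OS documentation.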

Estimate the maximum number of jobs that can be executed per TIME_WAIT interval as described below, and adjust the TCP/IP parameters as needed.

Note that these adjustments might increase usage of OS resources. For details, see the documentation for your OS.

Estimation

Use the following formula:

Maximum number of jobs that can be executed per TIME_WAIT interval (#1) = number-of-available-socket-ports (#1, #2, #3) / number-of-ports-used-by-JP1/AJS3 (#4)

#1

Check the actual value set in the OS's TCP/IP parameter.

#2

The value of the OS's TCP/IP parameter is the total number of socket ports available to the system. Therefore, subtract the number used by OS services and by software products other than JP1/AJS3.

#3

If a failed agent is detected while the job distribution delay reduction function is enabled, JP1/AJS3 uses a maximum of 1,024 socket ports while conducting a communication status check or communication recovery check.

#4

This is the number of ports that are placed in the TIME_WAIT status when one job is executed by JP1/AJS3.

This number varies depending on whether the reduction of the ports to be used is enabled and whether the communication encryption function is enabled.

To enable or disable the reduction of the ports to be used, set the REDUCEUSEPORT environment setting parameter. For details about this parameter, see 20.8.2(5) REDUCEUSEPORT in the JP1/Automatic Job Management System 3 Configuration Guide.

The following table lists the numbers of ports that are placed in the TIME_WAIT status when one job is executed by JP1/AJS3.

Table 3‒3: Number of ports that are placed in the TIME_WAIT status when one job is executed by JP1/AJS3

Setting of the reduction of ports to be used#1 | Communication encryption function: Disabled | Communication encryption function: Enabled
Enabled | Maximum of 4#2 | Maximum of 1#2
Disabled | Maximum of 11#2 | Maximum of 5#2

#1

If any of the following types of job is used, estimate the maximum number of jobs on the assumption that the reduction of the ports to be used is disabled.

  • Queue Job

  • Submit Job

  • Flexible Job

  • Event Job

#2

If the agent-monitoring interval (5 minutes by default) is shorter than the TIME_WAIT interval, add 1 to number-of-ports-used-by-JP1/AJS3 when estimating the maximum number of jobs.

When estimating the maximum number of jobs, you need to take into consideration the number of ports that are placed in the TIME_WAIT status by an operation on the execution agent. For details about the number of ports that are used by an operation on the execution agent, see 3.2.6 Estimating the number of ports to be used.
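The lookup in Table 3-3, together with its two notes, can be expressed as a small Python function. This is a minimal sketch of the rules stated above, not a JP1/AJS3 API. The function name, the parameter names, and the lowercase job type labels are assumptions made for illustration. Ports placed in the TIME_WAIT status by operations on the execution agent are not included.

```python
# Table 3-3: (port reduction enabled, encryption enabled) -> maximum number
# of ports placed in the TIME_WAIT status when one job is executed.
PORTS_PER_JOB = {
    (True, False): 4,
    (True, True): 1,
    (False, False): 11,
    (False, True): 5,
}

# Note #1: if any of these job types is used, estimate as if the
# reduction of the ports to be used were disabled.
NO_REDUCTION_JOB_TYPES = {"queue", "submit", "flexible", "event"}


def ports_per_job(reduce_ports, encryption, job_types=(),
                  agent_monitoring_interval=300, time_wait=120):
    """Return number-of-ports-used-by-JP1/AJS3 for the estimation formula.

    Both intervals are in seconds; 300 is the 5-minute default for the
    agent-monitoring interval.
    """
    if NO_REDUCTION_JOB_TYPES.intersection(job_types):
        reduce_ports = False
    ports = PORTS_PER_JOB[(bool(reduce_ports), bool(encryption))]
    # Note #2: add 1 if agent monitoring runs more often than TIME_WAIT expires.
    if agent_monitoring_interval < time_wait:
        ports += 1
    return ports


print(ports_per_job(True, False))                       # 4
print(ports_per_job(True, False, job_types={"event"}))  # 11
```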

Estimation example

This example is for Windows Server 2019, using the OS defaults.

  • Number of socket ports available to the OS

    16,383#

  • TIME_WAIT interval

    120 seconds

  • If the reduction of the ports to be used is enabled and the communication encryption function is disabled, the number of ports that are placed in the TIME_WAIT status is as follows:

    Maximum of 4

  • Maximum number of jobs that can be executed per TIME_WAIT interval

    16,383 / 4 = approx. 4,095

#

This assumes that other software products are not using socket ports. In practice, the number used by OS services and by software products other than JP1/AJS3 needs to be subtracted from this figure.

In this example, if more than 4,095 jobs are executed within 120 seconds, the socket ports will run out, causing delays and other potential problems. If this could occur in your system, you will need to change the TCP/IP parameter settings to adjust the maximum number of jobs that can be executed.
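The estimation example can be reproduced with the following Python sketch, which also shows the effect of subtracting ports that are used by other software. The function name and the ports_used_by_others parameter are hypothetical, and the figure of 1,024 is used here only as an illustrative deduction (it is the maximum mentioned in note #3 of the formula).

```python
def max_jobs_per_time_wait(total_ports, ports_per_job, ports_used_by_others=0):
    """Maximum number of jobs that can be executed per TIME_WAIT interval."""
    available = total_ports - ports_used_by_others
    if available < 0 or ports_per_job <= 0:
        raise ValueError("invalid port counts")
    # Round down: a job needs all of its ports to be available.
    return available // ports_per_job


# Windows defaults: 16,383 ports, at most 4 ports per job.
limit = max_jobs_per_time_wait(16_383, 4)
print(limit)  # 4095

# Same system with 1,024 ports set aside for other use.
print(max_jobs_per_time_wait(16_383, 4, ports_used_by_others=1_024))  # 3839

# Compare the busiest 120-second window of a planned workload with the limit.
jobs_in_busiest_window = 5_000  # hypothetical workload figure
print(jobs_in_busiest_window > limit)  # True: ports would run out
```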