Hitachi

JP1 Version 13 JP1/Automatic Job Management System 3 Administration Guide


6.3.1 Restarting an abnormally terminated JP1/AJS3 process

When JP1/AJS3 starts, multiple processes are generated. You can set up JP1/AJS3 - Manager and JP1/AJS3 - Agent to automatically restart a process that has terminated abnormally for whatever reason.

The restart setting described here applies to JP1/AJS3 that is not operating in a cluster system. If you want to automatically restart processes in a cluster system, use cluster software. For details, see 10.1 Overview of cluster systems.

Automatic restarting is configured in the extended startup process definition file. We recommend using the default values. In JP1/AJS3, processes that can recover through a restart are set to restart by default (in JP1/AJS2, no processes were restarted by default). For details about the default restart settings for each process, see Tables 6-15 to 6-18.

To change the settings, edit the extended startup process definition file, and then restart JP1/Base and JP1/AJS3.

The extended startup process definition file is in the following location.

In Windows:

JP1/AJS3-installation-folder\conf

In UNIX:

/etc/opt/jp1ajs2/conf

Whether an abnormally terminated process is restarted automatically depends on whether the host is a physical host or a logical host, as follows:

Physical host:

Automatic restarting is performed according to definitions in the extended startup process definition file.

Logical host:
In Windows:

Automatic restarting is disabled regardless of the setting of the -HA option.

In UNIX:

If JP1/AJS3 - Manager has been started by the jajs_start.cluster command or the jajs_spmd command with the -HA option specified, automatic restarting is disabled.

For JP1/AJS3 - Agent, automatic restarting is disabled regardless of the setting of the -HA option.
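The host-dependent rules above can be summarized as a small decision function. This is a minimal sketch, not product code: the function name and argument values are hypothetical, and the last case (JP1/AJS3 - Manager on a UNIX logical host started without the -HA option) is inferred from the text rather than stated in it.

```python
def auto_restart_enabled(host_type, os_family, product, started_with_ha=False):
    """Return True if the restart settings in the definition file take effect.

    host_type: "physical" or "logical"
    os_family: "windows" or "unix"
    product:   "manager" or "agent"
    started_with_ha: True if started by jajs_start.cluster or by
                     jajs_spmd with the -HA option (UNIX only)
    """
    if host_type == "physical":
        return True   # the definition file is applied as written
    if os_family == "windows":
        return False  # logical host on Windows: disabled regardless of -HA
    if product == "agent":
        return False  # logical host on UNIX, Agent: disabled regardless of -HA
    # Logical host on UNIX, Manager: disabled when started for HA operation.
    # Treating the non-HA start as enabled is an inference, not a stated rule.
    return not started_with_ha
```

A return value of True means only that the definition file is honored. Whether a given process actually restarts still depends on its whether-to-restart value.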

The tables below list the processes to which the restart setting applies. Only the child processes and detailed processes of the JP1/AJS3 - Manager and JP1/AJS3 - Agent services shown in these tables are applicable. You cannot specify the restart setting for any process other than those listed in Tables 6-10 to 6-13.

In Windows:
Table 6‒10: Processes applicable to the restart setting (JP1/AJS3 - Manager)

| No. | Child process name or detailed process name | Extended startup process definition file | Process that can be restarted |
|---|---|---|---|
| 1 | jajs_dbmd.exe | jp1ajs_service_0700.conf | jajs_dbmd.exe and detailed process# |
| 2 | ajsdbmgrd.exe | jp1ajs_dbmd_0700.conf | ajsdbmgrd.exe |
| 3 | jajs_hstd.exe | jp1ajs_service_0700.conf | jajs_hstd.exe and detailed process# |
| 4 | ajshlogd.exe | jp1ajs_hstd_0700.conf | ajshlogd.exe |
| 5 | ajsinetd.exe | jp1ajs_hstd_0700.conf | ajsinetd.exe |
| 6 | ajsnetwd.exe | jp1ajs_hstd_0700.conf | ajsnetwd.exe |
| 7 | ajsagtmd.exe | jp1ajs_hstd_0700.conf | ajsagtmd.exe |
| 8 | ajsovstatd.exe | jp1ajs_hstd_0700.conf | ajsovstatd.exe |
| 9 | ajsgwmasterd.exe | jp1ajs_hstd_0700.conf | ajsgwmasterd.exe |
| 10 | ajsqlcltd.exe | jp1ajs_hstd_0700.conf | ajsqlcltd.exe |
| 11 | jpqman.exe | jp1ajs_hstd_0700.conf | jpqman.exe |
| 12 | jpomanager.exe | jp1ajs_hstd_0700.conf | jpomanager.exe |
| 13 | ajscdinetd.exe | jp1ajs_hstd_0700.conf | ajscdinetd.exe |
| 14 | jajs_schd.exe | jp1ajs_service_0700.conf | jajs_schd.exe and detailed process# |
| 15 | ajslogd.exe | jp1ajs_schd_0700.conf | ajslogd.exe |
| 16 | jpqman.exe | jp1ajs_schd_0700.conf | jpqman.exe |
| 17 | jpomanager.exe | jp1ajs_schd_0700.conf | jpomanager.exe |
| 18 | ajsmasterd.exe | jp1ajs_schd_0700.conf | ajsmasterd.exe |
| 19 | jajs_agtd.exe | jp1ajs_service_0700.conf | jajs_agtd.exe and detailed process# |
| 20 | jpqmon.exe | jp1ajs_agtd_0700.conf | jpqmon.exe |
| 21 | jpoagent.exe | jp1ajs_agtd_0700.conf | jpoagent.exe |
| 22 | ajsagtmond.exe | jp1ajs_hstd_0700.conf | ajsagtmond.exe |

#

For details about JP1/AJS3 detailed processes, see B. List of Processes in the manual JP1/Automatic Job Management System 3 Troubleshooting.

Table 6‒11: Processes applicable to the restart setting (JP1/AJS3 - Agent)

| No. | Child process name or detailed process name | Extended startup process definition file | Process that can be restarted |
|---|---|---|---|
| 1 | ajsqlcltd.exe | jp1ajs_service_0700.conf | ajsqlcltd.exe |
| 2 | jpqmon.exe | jp1ajs_service_0700.conf | jpqmon.exe |
| 3 | jpoagent.exe | jp1ajs_service_0700.conf | jpoagent.exe |

In UNIX:
Table 6‒12: Processes applicable to the restart setting (JP1/AJS3 - Manager)

| No. | Child process name or detailed process name | Extended startup process definition file | Process that can be restarted |
|---|---|---|---|
| 1 | jajs_dbmd | jp1ajs_service_0700.conf | jajs_dbmd and detailed process#1 |
| 2 | ajsdbmgrd | jp1ajs_dbmd_0700.conf | ajsdbmgrd |
| 3 | jajs_hstd | jp1ajs_service_0700.conf | jajs_hstd and detailed process#1 |
| 4 | ajshlogd | jp1ajs_hstd_0700.conf | ajshlogd |
| 5 | ajsinetd | jp1ajs_hstd_0700.conf | ajsinetd |
| 6 | ajsnetwd | jp1ajs_hstd_0700.conf | ajsnetwd |
| 7 | ajsagtmd | jp1ajs_hstd_0700.conf | ajsagtmd |
| 8 | ajsovstatd | jp1ajs_hstd_0700.conf | ajsovstatd |
| 9 | ajsgwmasterd | jp1ajs_hstd_0700.conf | ajsgwmasterd |
| 10 | jpqman#2 | jp1ajs_hstd_0700.conf | jpqman#2 |
| 11 | jpomanager | jp1ajs_hstd_0700.conf | jpomanager |
| 12 | ajscdinetd | jp1ajs_hstd_0700.conf | ajscdinetd |
| 13 | jajs_schd | jp1ajs_service_0700.conf | jajs_schd and detailed process#1 |
| 14 | ajslogd | jp1ajs_schd_0700.conf | ajslogd |
| 15 | jpqman | jp1ajs_schd_0700.conf | jpqman |
| 16 | jpomanager | jp1ajs_schd_0700.conf | jpomanager |
| 17 | ajsmasterd | jp1ajs_schd_0700.conf | ajsmasterd |
| 18 | jajs_agtd | jp1ajs_service_0700.conf | jajs_agtd and detailed process#1 |
| 19 | jpqmon | jp1ajs_agtd_0700.conf | jpqmon |
| 20 | jpoagent | jp1ajs_agtd_0700.conf | jpoagent |
| 21 | ajsagtmond | jp1ajs_hstd_0700.conf | ajsagtmond |

#1

For details about JP1/AJS3 detailed processes, see B. List of Processes in the manual JP1/Automatic Job Management System 3 Troubleshooting.

#2

In the case of HP-UX, AIX, and Linux, the detailed process name is jpqman32.

Table 6‒13: Processes applicable to the restart setting (JP1/AJS3 - Agent)

| No. | Child process name or detailed process name | Extended startup process definition file | Process that can be restarted |
|---|---|---|---|
| 1 | jpqmon | jp1ajs_service_0700.conf | jpqmon |
| 2 | jpoagent | jp1ajs_service_0700.conf | jpoagent |

The following shows the definition file format.

In JP1/AJS3 - Manager:

process-name|path|startup-option|whether-to-restart|restart-count|retry-interval|retry-count-reset-time|type|scheduler-flag|start-sequence|auto-start|stop-path|stop-option|status-check-path|status-check-option|status-check-return-code|status-check-interval|

In JP1/AJS3 - Agent:

process-name|path|startup-option|whether-to-restart|restart-count|retry-interval|retry-count-reset-time|

The definition file contains pre-defined information. You can change the values of the whether-to-restart, restart-count, retry-interval, and retry-count-reset-time fields. Do not change any other fields, which are used by the system. You cannot omit the vertical bar (|) that delimits fields. If you want to insert a comment line, begin the line with a hash mark (#). The line up to the linefeed is assumed to be a comment line.
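As an illustration of the format rules above, the following minimal sketch splits one definition line into named fields. The field names come from the format shown above. The function name, the sample line, and its path are hypothetical, and the sketch assumes that every definition line ends with a vertical bar, as in the format lines.

```python
AGENT_FIELDS = [
    "process-name", "path", "startup-option", "whether-to-restart",
    "restart-count", "retry-interval", "retry-count-reset-time",
]
MANAGER_FIELDS = AGENT_FIELDS + [
    "type", "scheduler-flag", "start-sequence", "auto-start", "stop-path",
    "stop-option", "status-check-path", "status-check-option",
    "status-check-return-code", "status-check-interval",
]


def parse_line(line, field_names):
    """Return a dict of field values, or None for a comment or blank line."""
    text = line.rstrip("\r\n")
    if not text.strip() or text.startswith("#"):
        return None  # comment lines begin with a hash mark
    if not text.endswith("|"):
        raise ValueError("the delimiting vertical bar cannot be omitted")
    values = text[:-1].split("|")  # drop the final bar, then split
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} fields, found {len(values)}"
        )
    return dict(zip(field_names, values))


# Hypothetical Agent-style line; the path is made up for illustration.
sample = "jpqmon|/hypothetical/path/jpqmon||1|3|3|21600|"
record = parse_line(sample, AGENT_FIELDS)
```

Only the four restart-related keys of the resulting dictionary are meant to be edited. The remaining fields are used by the system and must be written back unchanged.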

The following table lists the values that can be specified for the variable fields.

Table 6‒14: Values that can be specified for the variable fields

Field name

Description

whether-to-restart

Specify whether to restart a process when it has terminated abnormally. Specify 0 if the process is not to be restarted. Specify 1 to restart the process. An appropriate value is set by default.

restart-count

Specify the number of times a restart of a process is attempted. You can specify a value in the range from 0 to 99. An appropriate value is initially set for each process. Customize this value according to the operating mode. If 0 is set for the whether-to-restart field, the restart-count field is disabled regardless of whether a value is specified.

retry-interval

Specify the interval in seconds at which a process restart is attempted. You can specify a value in the range from 0 to 3,600. An appropriate value is initially set for each process. Customize this value according to the operating mode. If 0 is set for the whether-to-restart field, the retry-interval field is disabled regardless of whether a value is specified.

retry-count-reset-time

Specify the period of time (hours converted to seconds) that must elapse after a process is restarted before the restart count is reset. If the process runs for the specified time after being restarted, the restart count is reset. If the process then terminates abnormally again, the restart count starts again from 1.

If a process is restarted and then terminates abnormally again before the specified time elapses, the previous restart count is carried over. You can specify a value in the range from 3,600 to 2,147,483,647 (seconds). An appropriate value is initially set for each process. Customize this value according to the operating mode. If 0 is set for the whether-to-restart field, the retry-count-reset-time field is disabled regardless of whether a value is specified.
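The ranges in Table 6-14 can be checked before the file is saved. The following is a minimal sketch of such a check. The function name is hypothetical, and the sketch validates all four fields even when whether-to-restart is 0, which is a deliberately conservative choice rather than documented product behavior.

```python
def validate_restart_fields(whether, count, interval, reset_time):
    """Return a list of problems in the four editable fields (given as strings)."""
    problems = []
    if whether not in ("0", "1"):
        problems.append("whether-to-restart must be 0 or 1")
    limits = {
        "restart-count": (count, 0, 99),
        "retry-interval": (interval, 0, 3600),
        "retry-count-reset-time": (reset_time, 3600, 2147483647),
    }
    for name, (value, low, high) in limits.items():
        # Accept plain ASCII digits only, then check the documented range.
        is_number = value.isascii() and value.isdigit()
        if not is_number or not low <= int(value) <= high:
            problems.append(f"{name} must be an integer from {low} to {high}")
    return problems
```

For example, `validate_restart_fields("1", "3", "3", "21600")` returns an empty list, whereas a restart-count of 100 produces one problem entry.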

Cautionary notes
  • If you attempt to start a process without a value specified or with an incorrect value specified, an error occurs and the process will not start.

  • When you start a process managed by a logical host in a cluster configuration, if the conf folder on the logical host does not contain the extended startup process definition file, the file is copied from the physical host.

  • When the ajsmasterd child process is restarted, the status of the jobnets and jobs after the scheduler service is restarted depends on the start mode of the JP1/AJS3 service. For details about the status of jobnets and jobs for the service start mode, see 6.2.1(3) Jobnet and job statuses for each start mode.

  • When a process is restarted, the following message might be output to the integrated trace log: KNAD3737-E The JP1/AJS3 management-target-process-name terminated abnormally. This can happen when a process is restarted too soon after it terminated abnormally. In such cases, the restart begins before the terminated process and its child processes have stopped completely, and a double startup is detected. Because restart attempts continue up to the specified number of times until the process restarts, output of the KNAD3737-E message in this situation does not indicate a problem. However, you can suppress output of this message by increasing the retry interval in the extended startup process definition file. This situation is more likely to occur on low-performance computers. If necessary, customize the retry interval.

    Note that increasing the retry interval might increase the time required to restart the JP1/AJS3 service. Therefore, do not specify too large a value for the retry interval. The recommended value is 10 seconds.

  • If JP1/AJS3 child processes restart, their detailed processes also restart. As a result, the restart counts of the detailed processes are reset.

The following tables describe the default values of the restart settings.

In Windows:
Table 6‒15: Default values of the restart settings (JP1/AJS3 - Manager)

| No. | Child process name or detailed process name | whether-to-restart | restart-count | retry-interval (seconds) | retry-count-reset-time (seconds) |
|---|---|---|---|---|---|
| 1 | jajs_dbmd.exe | Yes | 3 | 3 | 21600 |
| 2 | ajsdbmgrd.exe | No | 3 | 3 | 21600 |
| 3 | jajs_hstd.exe | Yes | 3 | 20 | 21600 |
| 4 | ajshlogd.exe | Yes | 3 | 3 | 21600 |
| 5 | ajsinetd.exe | Yes | 3 | 3 | 21600 |
| 6 | ajsnetwd.exe | Yes | 3 | 3 | 21600 |
| 7 | ajsagtmd.exe | Yes | 3 | 3 | 21600 |
| 8 | ajsovstatd.exe | Yes | 3 | 3 | 21600 |
| 9 | ajsgwmasterd.exe | Yes | 3 | 3 | 21600 |
| 10 | ajsqlcltd.exe | Yes | 3 | 3 | 21600 |
| 11 | jpqman.exe | Yes | 3 | 3 | 21600 |
| 12 | jpomanager.exe | Yes | 3 | 3 | 21600 |
| 13 | ajscdinetd.exe | Yes | 3 | 3 | 21600 |
| 14 | ajsagtmond.exe | Yes | 3 | 3 | 21600 |
| 15 | jajs_schd.exe | Yes | 3 | 10 | 21600 |
| 16 | ajslogd.exe | No | 3 | 3 | 21600 |
| 17 | jpqman.exe | No | 3 | 3 | 21600 |
| 18 | jpomanager.exe | No | 3 | 3 | 21600 |
| 19 | ajsmasterd.exe | No | 3 | 3 | 21600 |
| 20 | jajs_agtd.exe | Yes | 3 | 3 | 21600 |
| 21 | jpqmon.exe | Yes | 3 | 3 | 21600 |
| 22 | jpoagent.exe | Yes | 3 | 3 | 21600 |

Table 6‒16: Default values of the restart settings (JP1/AJS3 - Agent)

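| No. | Child process name or detailed process name | whether-to-restart | restart-count | retry-interval (seconds) | retry-count-reset-time (seconds) |
|---|---|---|---|---|---|
| 1 | ajsqlcltd.exe | Yes | 3 | 3 | 21600 |
| 2 | jpqmon.exe | Yes | 3 | 3 | 21600 |
| 3 | jpoagent.exe | Yes | 3 | 3 | 21600 |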

In UNIX:
Table 6‒17: Default values of the restart settings (JP1/AJS3 - Manager)

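| No. | Child process name or detailed process name | whether-to-restart | restart-count | retry-interval (seconds) | retry-count-reset-time (seconds) |
|---|---|---|---|---|---|
| 1 | jajs_dbmd#1 | Yes#2 | 3 | 3 | 21600 |
| 2 | ajsdbmgrd | No | 3 | 3 | 21600 |
| 3 | jajs_hstd | Yes#2 | 3 | 20 | 21600 |
| 4 | ajshlogd | Yes#2 | 3 | 3 | 21600 |
| 5 | ajsinetd | Yes#2 | 3 | 3 | 21600 |
| 6 | ajsnetwd | Yes#2 | 3 | 3 | 21600 |
| 7 | ajsagtmd | Yes#2 | 3 | 3 | 21600 |
| 8 | ajsovstatd | Yes#2 | 3 | 3 | 21600 |
| 9 | ajsgwmasterd | Yes#2 | 3 | 3 | 21600 |
| 10 | jpqman#3 | Yes | 3 | 3 | 21600 |
| 11 | jpomanager | Yes | 3 | 3 | 21600 |
| 12 | ajscdinetd | Yes#2 | 3 | 3 | 21600 |
| 13 | ajsagtmond | Yes | 3 | 3 | 21600 |
| 14 | jajs_schd#1 | Yes#2 | 3 | 10 | 21600 |
| 15 | ajslogd | No | 3 | 3 | 21600 |
| 16 | jpqman | No | 3 | 3 | 21600 |
| 17 | jpomanager | No | 3 | 3 | 21600 |
| 18 | ajsmasterd | No | 3 | 3 | 21600 |
| 19 | jajs_agtd#1 | Yes#2 | 3 | 3 | 21600 |
| 20 | jpqmon | Yes | 3 | 3 | 21600 |
| 21 | jpoagent | Yes | 3 | 3 | 21600 |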

#1

On a logical host, you do not need to specify the restart setting for individual detailed processes under jajs_dbmd, jajs_schd, and jajs_agtd. The detailed processes are restarted when their parent processes are restarted.

#2

On a logical host, automatic restarting is disabled by default.

#3

In the case of HP-UX, AIX, and Linux, the detailed process name is jpqman32.

Table 6‒18: Default values of the restart settings (JP1/AJS3 - Agent)

| No. | Child process name or detailed process name | whether-to-restart | restart-count | retry-interval (seconds) | retry-count-reset-time (seconds) |
|---|---|---|---|---|---|
| 1 | jpqmon | Yes | 3 | 3 | 21600 |
| 2 | jpoagent | Yes | 3 | 3 | 21600 |

The default values of the restart settings have been set to the most appropriate values after taking into account the characteristics of each process. The following describes the characteristics of the processes:

(1) Setting example

The following shows an example of settings in the extended startup process definition file, and the operation performed when a process terminates abnormally.

This example assumes that the following conditions have been set for JP1/AJS3 child processes:

whether-to-restart: 1 (Restart the process)
restart-count: 4
retry-interval: 3 (seconds)
retry-count-reset-time: 3,600 (seconds)

Figure 6‒1: Example of settings in the extended startup process definition file

[Figure]

The following shows an example of the operation performed when a process terminates abnormally.

Figure 6‒2: Example of the operation performed when a process terminates abnormally

[Figure]

In the above example, if the process does not terminate abnormally within 3,600 seconds (the value specified for retry-count-reset-time) after it was restarted, the restart count is reset. Therefore, if the process terminates abnormally again, the restart count starts from 1. However, if the process terminates abnormally within 3,600 seconds after it was restarted, the restart count is incremented. Once the restart count reaches the specified restart-count value (4 in this example), no further attempts are made to restart the process even if it terminates abnormally.
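The counting behavior in this example can be traced with a short simulation. This is a sketch of one reading of the rules, not the product's implementation: the function name and the failure times are hypothetical, each restart is assumed to succeed retry-interval seconds after the abnormal termination, and the process is assumed to be left stopped at the first abnormal termination after the restart count has reached the restart-count value.

```python
def simulate(failure_times, restart_count=4, retry_interval=3, reset_time=3600):
    """Return one (time, action, count) tuple per abnormal termination.

    failure_times: ascending times (seconds) of abnormal terminations,
                   each later than the preceding restart.
    """
    events = []
    count = 0            # restarts used since the last reset
    last_restart = None  # time of the most recent restart
    for t in failure_times:
        if last_restart is not None and t - last_restart >= reset_time:
            count = 0  # the process ran long enough: reset the count
        if count >= restart_count:
            events.append((t, "not restarted", count))
            break      # no further restart attempts are made
        count += 1
        last_restart = t + retry_interval  # assumed successful restart
        events.append((t, "restarted", count))
    return events


# Three quick failures, a long stable run, then five quick failures.
log = simulate([100, 200, 300, 5000, 5100, 5200, 5300, 5400])
```

In this trace, the failure at 5,000 seconds comes more than 3,600 seconds after the previous restart, so it is counted as restart 1 again. The failure at 5,400 seconds is not restarted, because four restarts have already been used within the reset period.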