Hitachi

JP1 Version 12 JP1/Integrated Management 2 - Manager Configuration Guide


8.4.1 Creating a script to be registered into the cluster software (for UNIX)

When you use UNIX cluster software, you normally use a method such as a script to create a tool to control applications, and then register the script into the cluster software. In general, such a script must provide the start, stop, operation monitoring, and forced termination functions.

This subsection describes the JP1/IM - Manager information that is needed to design a script. You use this information to create a script that controls JP1/IM - Manager according to the cluster software specifications, and then you register the script into the cluster software.
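As a rough illustration of the four functions, the following sketch maps each action to the command listed in Table 8-4. The function name jp1im_control, the *_CMD variables, and the return value 2 for a usage error are assumptions made for this example and are not part of JP1/IM - Manager. The sketch also assumes that the JP1 commands can be found through PATH. Adapt the interface to what your cluster software expects.

```shell
# Sketch only: command names come from Table 8-4. The variables let you
# substitute full paths (assumption: the commands are otherwise on PATH).
JCO_START_CMD=${JCO_START_CMD:-jco_start.cluster}
JCO_STOP_CMD=${JCO_STOP_CMD:-jco_stop.cluster}
JCO_STATUS_CMD=${JCO_STATUS_CMD:-jco_spmd_status}
JCO_KILL_CMD=${JCO_KILL_CMD:-jco_killall.cluster}

# jp1im_control ACTION LOGICAL_HOST   (hypothetical helper)
# Runs the command that corresponds to ACTION and returns its return value.
# Returns 2 for a usage error (a convention of this sketch only).
jp1im_control() {
    if [ $# -ne 2 ] || [ -z "$2" ]; then
        echo "usage: jp1im_control {start|stop|monitor|kill} logical-host-name" >&2
        return 2
    fi
    case "$1" in
        start)   "$JCO_START_CMD" "$2" ;;        # start
        stop)    "$JCO_STOP_CMD" "$2" ;;         # stop
        monitor) "$JCO_STATUS_CMD" -h "$2" ;;    # operation monitoring
        kill)    "$JCO_KILL_CMD" "$2" ;;         # forced termination
        *)
            echo "unknown action: $1" >&2
            return 2
            ;;
    esac
}
```

A wrapper script registered into the cluster software could then call, for example, jp1im_control monitor logical-host-name and pass the return value back to the cluster software.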

Table 8‒4: Detailed information for script design in cluster registration

Function to be registered

Description

Start

Starts JP1/IM - Manager.

  • Command to be used

    jco_start.cluster logical-host-name

  • Start command termination timing

    The start command waits for JP1/IM - Manager to start before it terminates. However, if the startup processing does not complete within the timeout period (the default is 300 seconds) because of a problem, the command terminates while the startup processing is still underway. The command does not cancel the startup processing.

    For details about how to set the timeout period, see 13.7.11 Considering the timeout period during startup or stop of JP1/IM - Manager (in UNIX) in the manual JP1/Integrated Management 2 - Manager Overview and System Design Guide.

  • Check the start command result

    The script should determine the result of starting JP1/IM - Manager by the operation monitoring method described below. Normally, the result is determined by the cluster software's operation monitoring. The return value of the start command is 0 (normal termination) or 1 (argument error). Therefore, the result cannot be determined from the return value.
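Where the start script itself must report success or failure (rather than leaving this to the cluster software's operation monitoring), one possible approach is to run the start command and then poll jco_spmd_status. The following is a minimal sketch under that assumption. The function name jp1im_start, the *_CMD variables, and the default check count and interval are hypothetical, and the commands are assumed to be on PATH.

```shell
JCO_START_CMD=${JCO_START_CMD:-jco_start.cluster}
JCO_STATUS_CMD=${JCO_STATUS_CMD:-jco_spmd_status}

# jp1im_start LOGICAL_HOST [MAX_CHECKS] [INTERVAL_SECONDS]   (hypothetical helper)
# Returns 0 once jco_spmd_status returns 0 (all running).
# Returns 1 if the start command fails or if no check succeeds.
jp1im_start() {
    jp1_host=$1
    jp1_checks=${2:-6}       # hypothetical default
    jp1_interval=${3:-10}    # hypothetical default, in seconds

    # The start command returns only 0 or 1 (argument error). A return
    # value of 0 does not prove that startup completed, so keep checking.
    "$JCO_START_CMD" "$jp1_host" || return 1

    jp1_n=0
    while [ "$jp1_n" -lt "$jp1_checks" ]; do
        if "$JCO_STATUS_CMD" -h "$jp1_host" >/dev/null 2>&1; then
            return 0
        fi
        jp1_n=$((jp1_n + 1))
        if [ "$jp1_n" -lt "$jp1_checks" ]; then
            sleep "$jp1_interval"
        fi
    done
    return 1
}
```

Choose the check count and interval so that the total wait fits within the timeout that your cluster software allows for a start script.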

Stop

Terminates JP1/IM - Manager.

  • Command to be used

    jco_stop.cluster logical-host-name

  • Stop command termination timing

    The stop command waits for JP1/IM - Manager to terminate before it terminates. However, if the stop processing does not complete within the timeout period (the default is 300 seconds) because of a problem, the command terminates while the stop processing is still underway. The command does not cancel the stop processing.

    For details about how to set the timeout period, see 13.7.11 Considering the timeout period during startup or stop of JP1/IM - Manager (in UNIX) in the manual JP1/Integrated Management 2 - Manager Overview and System Design Guide.

  • Check the stop command result

    The script should determine the result of terminating JP1/IM - Manager by the operation monitoring method described below. The return value of the stop command is 0 (normal termination) or 1 (argument error). Therefore, the result cannot be determined from the return value.

We recommend that you execute the forced termination command described below after the stop command has terminated. This enables you to terminate the process and prevent a failover error even in the event of a problem.
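The recommended sequence (stop command first, then forced termination) might be scripted as in the following sketch. The function name jp1im_stop and the *_CMD variables are hypothetical, and the commands are assumed to be on PATH. The return values of jco_killall.cluster are not described in this subsection, so the sketch simply passes that command's return value through.

```shell
JCO_STOP_CMD=${JCO_STOP_CMD:-jco_stop.cluster}
JCO_KILL_CMD=${JCO_KILL_CMD:-jco_killall.cluster}

# jp1im_stop LOGICAL_HOST   (hypothetical helper)
# Runs the stop command, then the forced termination command.
# Returns the return value of the forced termination command.
jp1im_stop() {
    jp1_host=$1

    # The stop command returns only 0 or 1 (argument error), so a non-zero
    # value is reported but does not change the sequence in this sketch.
    "$JCO_STOP_CMD" "$jp1_host" ||
        echo "jco_stop.cluster returned a non-zero value for $jp1_host" >&2

    # Forced termination after the stop command has terminated, so that
    # no process remains even if the stop processing did not complete.
    "$JCO_KILL_CMD" "$jp1_host"
}
```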

JP1/IM - Manager operation monitoring#1

Monitors normal operation of JP1/IM - Manager.

  • Command to be used

    jco_spmd_status -h logical-host-name

To determine whether JP1/IM - Manager is running normally, check the return value of the jco_spmd_status command. This command determines the status from the operating status of each process.

Some cluster software does not provide the operation monitoring function. If there is no need to perform failover in the event of a JP1/IM - Manager failure, do not register this function.

  • Check the operation monitoring result

    The following explains how to interpret the return value:

    Return value = 0 (all running):

    JP1/IM - Manager is running normally.

    Return value = 1 (error):

    An unrecoverable error occurred. Treat this as a failure.

    Note: If you execute the jco_spmd_status command on a secondary server whose shared disk is offline, the return value is 1 because the shared disk is not available.

    Return value = 4 (partially stopped):

    Some JP1/IM - Manager processes are stopped due to a problem. Treat this as a failure.

    Return value = 8 (all stopped):

    All JP1/IM - Manager processes are stopped due to a problem. Treat this as a failure.

    Return value = 12 (retriable error):

    While the jco_spmd_status command was checking the operating status, an error occurred that can be recovered from by retrying. Retry the status check up to the specified number of times.
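The return value handling above can be sketched as a monitoring function that retries only on return value 12. The function name jp1im_monitor, the JCO_STATUS_CMD variable, and the default retry count and interval are hypothetical. Treating exhausted retries and any unlisted return value as a failure is an assumption of this sketch.

```shell
JCO_STATUS_CMD=${JCO_STATUS_CMD:-jco_spmd_status}

# jp1im_monitor LOGICAL_HOST [MAX_RETRIES] [INTERVAL_SECONDS]   (hypothetical helper)
# Returns 0 if JP1/IM - Manager is running normally, 1 for a failure.
jp1im_monitor() {
    jp1_host=$1
    jp1_retries=${2:-3}     # hypothetical default
    jp1_interval=${3:-5}    # hypothetical default, in seconds
    jp1_try=0

    while :; do
        jp1_rc=0
        "$JCO_STATUS_CMD" -h "$jp1_host" >/dev/null 2>&1 || jp1_rc=$?
        case "$jp1_rc" in
            0)  return 0 ;;    # all running
            12)                # retriable error
                if [ "$jp1_try" -ge "$jp1_retries" ]; then
                    return 1   # assumption: give up after MAX_RETRIES retries
                fi
                jp1_try=$((jp1_try + 1))
                sleep "$jp1_interval"
                ;;
            *)  return 1 ;;    # 1, 4, 8, or an unlisted value: failure
        esac
    done
}
```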

IM database operation status checking#2

Checks to see if the IM databases are running normally.

  • Command to be used

    jimdbstatus -h logical-host-name

To determine the operating status, check the return value of the jimdbstatus command.

  • Check the operating status result

    The following explains how to interpret the return value:

    Return value = 0: Running

    Return value = 1: The jimdbstatus command terminated abnormally.

    Return value = 4: Start or stop processing is underway.

    Return value = 8: Stopped (IM database is in restart-interrupted status and is unstable)

    Return value = 12: Stopped (stopped normally)

    Return value = 20: Installed HiRDB has not been set up.

    Return values 1 and 4 are subject to retries. Return values 8 and above indicate an error and are subject to failover.
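As an illustration of the retry and failover rule above, the following sketch classifies a jimdbstatus return value and then applies the rule in a monitoring loop. The function names jimdb_classify and jimdb_monitor, the JIMDBSTATUS_CMD variable, and the default retry count and interval are hypothetical. Return values not listed above are treated as failover here, which is an assumption of this sketch.

```shell
JIMDBSTATUS_CMD=${JIMDBSTATUS_CMD:-jimdbstatus}

# jimdb_classify RETURN_VALUE   (hypothetical helper)
# Prints the action for a jimdbstatus return value: ok, retry, or failover.
jimdb_classify() {
    case "$1" in
        0)   echo ok ;;          # running
        1|4) echo retry ;;       # abnormal command end, or start/stop underway
        *)   echo failover ;;    # 8, 12, 20; unlisted values by assumption
    esac
}

# jimdb_monitor LOGICAL_HOST [MAX_RETRIES] [INTERVAL_SECONDS]   (hypothetical helper)
# Returns 0 if the IM databases are running, 1 if failover is called for.
jimdb_monitor() {
    jp1_host=$1
    jp1_retries=${2:-3}     # hypothetical default
    jp1_interval=${3:-5}    # hypothetical default, in seconds
    jp1_try=0

    while :; do
        jp1_rc=0
        "$JIMDBSTATUS_CMD" -h "$jp1_host" >/dev/null 2>&1 || jp1_rc=$?
        case "$(jimdb_classify "$jp1_rc")" in
            ok) return 0 ;;
            retry)
                if [ "$jp1_try" -ge "$jp1_retries" ]; then
                    return 1    # assumption: retries exhausted counts as failure
                fi
                jp1_try=$((jp1_try + 1))
                sleep "$jp1_interval"
                ;;
            *)  return 1 ;;
        esac
    done
}
```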

Forced termination

Forcibly terminates JP1/IM - Manager and releases the current resources.

  • Command to be used

    jco_killall.cluster logical-host-name

The jco_killall.cluster command forcibly terminates each process without performing JP1/IM - Manager termination processing.

Note:

Before you execute forced termination, use the stop command to terminate JP1/IM - Manager.

#1

The commands used to check JP1 operation are the same in UNIX and Windows, but they behave differently.

Behavior in Windows differs from behavior in UNIX because it is tied to Windows service control. In Windows, when some of the processes terminate, JP1 process management automatically terminates each process and places the service in stopped status. Either treat the stopped service as an error, or detect the error when a command such as jco_spmd_status returns a value of 8.

#2

Executed when the IM databases are used.

Note

About JP1 restart

When a JP1 failure is detected in a cluster system, a restart of JP1 might be retried on the same server before failover to the secondary server is executed.

In such a case, do not perform restart using JP1 process management.

The cluster software attempts restart after detection of the JP1 failure. Depending on the nature of the failure, JP1's restart function might be affected and normal operation might not be achieved. To restart JP1 successfully, use the cluster software to restart JP1.