5.5.4 Registering daemons in the cluster software

In the cluster software used in your system, register the JP1/Base daemons as failover targets. For details on the registration procedure, see the documentation for your cluster software. Keep the following points in mind when registering the daemons:

Ensure that the secondary node can take over the daemons from the primary node, together with the IP address and shared disk. Also, if the failover of an application program leads to the failover of a service, ensure that the secondary node can also take over the application program.
After the logical IP address and shared disk have become available, start JP1/Base first, and then start JP1/IM and JP1/AJS. When stopping the products, stop them in the reverse order.
Before registering the daemons in the cluster software, change the value of the LANG environment variable set in `jbs_start.cluster` to the language specified in `jp1bs_env.conf` for the logical host.

The information needed when registering JP1/Base into cluster software is shown below:

Functionality: Start
Description: Start JP1/Base.
Command: `jbs_start.cluster` `logical-host-name`
End timing of the start command: The start command ends after JP1/Base has started. If startup does not complete within the timeout period (typically 60 seconds), the command ends before JP1/Base has started. In that case, the attempt to start JP1/Base is not suspended; the command ends, but the attempt to start JP1/Base continues.
Result judgment for the start command: Determine the result of starting JP1/Base based on the information in the Operation monitoring entry below. Usually, the operation monitoring functionality of the clustering software is used. The return value of the start command cannot be used for judgment because it is either `0` (normal end) or `1` (abnormal argument).

Functionality: Stop
Description: Stop JP1/Base.
Command: `jbs_stop.cluster` `logical-host-name`
End timing of the stop command: The stop command ends after JP1/Base has stopped. If stopping does not complete within the timeout period (typically 60 seconds), the command ends before JP1/Base has stopped. In that case, the attempt to stop JP1/Base is not suspended; the command ends, but the attempt to stop JP1/Base continues.
Result judgment for the stop command: Determine the result of stopping JP1/Base based on the information in the Operation monitoring entry below. The return value of the stop command cannot be used for judgment because it is either `0` (normal end) or `1` (abnormal argument).
Remarks: After the stop command finishes, execute the `jbs_spmd_status` and `jevstat` commands to check whether JP1/Base has stopped normally. If JP1/Base has not stopped, execute the command described in the Kill entry below.

Functionality: Operation monitoring
Description: Use the return values from the `jbs_spmd_status` and `jevstat` commands to monitor whether JP1/Base is operating normally. These commands judge the operating status based on whether each process is running. Some clustering software does not support this functionality. Register this functionality only when a failover is required upon a failure in JP1/Base.
Commands:
`jbs_spmd_status -h` `logical-host-name`
`jevstat` `logical-host-name`
Result judgment for operation monitoring: The return values have the following meanings:
Return value = 0 (all operating): JP1/Base is operating normally.
Return value = 1 (error): An unrecoverable error has occurred. Judge this as a failure. Note: If you execute the `jbs_spmd_status` command on the secondary node with the shared disk offline, it returns `1` because the shared disk is not found.
Return value = 4 (partial stop): Some of the JP1/Base processes have stopped for some reason. Judge this as a failure (for UNIX).#
Return value = 8 (all stopped): All processes of JP1/Base have stopped for some reason. Judge this as a failure.
Return value = 12 (error but retry possible): While the `jbs_spmd_status` command was checking the operating status, an error occurred that can be recovered by retrying. Retry checking the operating status up to a specified number of times. For the `jevstat` command, this return value indicates an error for which retry is not possible.

Functionality: Kill
Description: Kill JP1/Base and release the resources it has been using.
Command: `jbs_killall.cluster` `logical-host-name`
When you execute the `jbs_killall.cluster` command, each process is forcibly stopped without any of the normal JP1/Base stop processing being performed.
Note: Stop JP1/Base using the stop command before executing the kill command. Use the kill command only when a problem has occurred, for example, when the stop command cannot terminate processing.
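The stop, check, and kill sequence described for the stop and kill functionality can be sketched as a small shell function. This is only an illustration: the function name `stop_jp1base` and the override variables `JBS_STOP` and `JBS_KILL` are hypothetical, and the sketch assumes that a return value of 8 from both status commands means JP1/Base has stopped. Your cluster software may require a different script structure.

```shell
#!/bin/sh
# Sketch of a stop handler for cluster software (illustrative only).
# stop_jp1base, JBS_STOP and JBS_KILL are hypothetical names.

# stop_jp1base LOGICAL_HOST
stop_jp1base() {
    host=$1

    # The return value of the stop command cannot be used to judge the
    # result, so it is deliberately ignored here.
    "${JBS_STOP:-jbs_stop.cluster}" "$host" || true

    # Check the actual state with the two status commands.
    spmd_rc=0
    jbs_spmd_status -h "$host" >/dev/null 2>&1 || spmd_rc=$?
    jev_rc=0
    jevstat "$host" >/dev/null 2>&1 || jev_rc=$?

    # Assumption: 8 (all stopped) from both commands means a clean stop.
    if [ "$spmd_rc" -eq 8 ] && [ "$jev_rc" -eq 8 ]; then
        return 0
    fi

    # JP1/Base has not stopped: fall back to the kill command.
    "${JBS_KILL:-jbs_killall.cluster}" "$host"
}
```

A stop script registered in the cluster software could then call `stop_jp1base logical-host-name`. The kill command runs only after the stop command has been tried, in line with the note on the kill functionality.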

#

In Windows, operation differs from that in UNIX because of the relationship with Windows service control. If some processes stop in Windows, JP1 process management automatically stops all the remaining processes and places the service in the stopped state. You can therefore detect a failure either by detecting that the service has stopped or by checking whether the `jbs_spmd_status` command returns 8.
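For UNIX, the return-value rules for operation monitoring can be sketched as a monitor helper such as the following. The function names, the default of three attempts, and the `RETRY_INTERVAL` variable are assumptions made for illustration; they are not part of JP1/Base. The sketch treats every return value other than 0 and 12 as a failure, and treats 12 as retryable only for `jbs_spmd_status`.

```shell
#!/bin/sh
# Sketch of a UNIX monitor helper (illustrative only). Function names, the
# attempt count and RETRY_INTERVAL are assumptions, not part of JP1/Base.

# Map a jbs_spmd_status return value to an action for the monitor.
classify_spmd_rc() {
    case "$1" in
        0)  echo ok ;;       # all processes operating
        12) echo retry ;;    # recoverable error: check again
        *)  echo failure ;;  # 1, 4, 8, or any unexpected value
    esac
}

# Map a jevstat return value to an action; for jevstat, 12 is not retryable.
classify_jevstat_rc() {
    case "$1" in
        0) echo ok ;;
        *) echo failure ;;
    esac
}

# check_jp1base LOGICAL_HOST [MAX_ATTEMPTS]
# Returns 0 when both commands report normal operation, 1 otherwise.
check_jp1base() {
    host=$1
    max_attempts=${2:-3}
    attempt=0
    while :; do
        rc=0
        jbs_spmd_status -h "$host" >/dev/null 2>&1 || rc=$?
        action=$(classify_spmd_rc "$rc")
        if [ "$action" != retry ]; then
            break
        fi
        attempt=$((attempt + 1))
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1    # still failing after the allowed number of checks
        fi
        sleep "${RETRY_INTERVAL:-5}"
    done
    if [ "$action" != ok ]; then
        return 1
    fi
    rc=0
    jevstat "$host" >/dev/null 2>&1 || rc=$?
    [ "$(classify_jevstat_rc "$rc")" = ok ]
}
```

A monitor script could call `check_jp1base logical-host-name` and report a failure to the cluster software when the function returns a nonzero value.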

Remarks: Restarting JP1

If a JP1 failure is detected in a cluster system, the primary server might restart JP1 to attempt recovery before it performs a failover to the secondary server.

In such a case, we recommend that you use the clustering software control to restart JP1 rather than restarting by JP1 process management.

Because the clustering software itself attempts to restart JP1 after it detects a failure, it can interfere with the normal operation of the restart functionality of JP1 process management. To ensure a more reliable restart, restart JP1 under the control of the clustering software.
