C.14 JP1/DH - Server operations in the cluster

In a cluster system managed by clustering software, usually the software is responsible for detecting failure and switching the system.

Which failures can be detected depends on the clustering software and its settings. In this appendix, a failure is detected by checking whether service processes are running, as described in C.9 (3) Settings for each generic service and C.10 (1) Failure detection.

This approach can be inaccurate, because an application might be unable to serve its users even while its process is still running.

You can identify failures with higher accuracy by using the JP1/DH - Server commands.

To identify failures with higher accuracy, a system administrator must perform either of the following:

Change how your clustering software identifies failures.

Customize the monitoring scripts that the clustering software uses for detecting failures so that they run JP1/DH - Server commands. If a command succeeds, processing continues. If a command fails, the script reports an error.

Note that monitoring script functions differ among clustering software products. For details, see the documentation of your clustering software.

Monitor and detect failures separately from the monitoring functionality of the clustering software.

Use the JP1/DH - Server commands to create scripts. These scripts execute the JP1/DH - Server commands and send an email notification to a system administrator if a command fails. Run the scripts on a regular basis by using, for example, a task scheduler.
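The second approach can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: CHECK_COMMAND is a hypothetical placeholder that you would replace with the actual JP1/DH - Server command line, the function names and email addresses are invented for the example, and an SMTP server is assumed to be reachable on localhost. A non-zero exit status is assumed to indicate a failed command.

```python
import smtplib
import subprocess
import sys
from email.message import EmailMessage

# Hypothetical placeholder: replace with the actual JP1/DH - Server command line.
CHECK_COMMAND = [sys.executable, "-c", "pass"]


def run_check(command, timeout_seconds=60):
    """Run the command and return (succeeded, combined output)."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A command that cannot start or does not finish counts as failed.
        return False, str(exc)
    return result.returncode == 0, (result.stdout + result.stderr).strip()


def build_alert(sender, recipient, command, output):
    """Build the notification email for a failed check."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "JP1/DH - Server check failed"
    message.set_content(f"Command: {' '.join(command)}\n\nOutput:\n{output}")
    return message


def send_via_smtp(message, host="localhost", port=25):
    """Send the message through an SMTP server (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(message)


def check_and_notify(command, sender, recipient, send=send_via_smtp):
    """Run the check; notify the administrator only when it fails."""
    succeeded, output = run_check(command)
    if not succeeded:
        send(build_alert(sender, recipient, command, output))
    return succeeded


if __name__ == "__main__":
    check_and_notify(CHECK_COMMAND, "monitor@example.com", "admin@example.com")
```

Registering a script like this with a task scheduler gives the regular execution described above. The script only notifies the administrator. It never switches the system itself.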

The major difference between the two approaches above is whether the clustering software switches the system when a JP1/DH - Server command fails.

In the first approach, the software switches the system immediately after the JP1/DH - Server command fails. However, even if the command fails, the application might still be able to provide its service (including changing the configuration for authentication rules and delivery rules). Any operation attempted during system switchover causes an error, so an unnecessary switchover might reduce the time during which the service is available.
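For the first approach, the core of a customized monitoring script is a mapping from the command result to the status that the clustering software expects. The sketch below assumes that the clustering software treats exit status 0 as healthy and any other status as a failure. That convention and the function name monitor_exit_status are assumptions for illustration, so check the documentation of your clustering software for the actual interface.

```python
import subprocess
import sys


def monitor_exit_status(command, timeout_seconds=30):
    """Return 0 if the command succeeds, 1 otherwise (assumed convention)."""
    try:
        completed = subprocess.run(
            command,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The command could not be started or did not finish in time.
        return 1
    return 0 if completed.returncode == 0 else 1
```

A monitoring script would end with a call such as sys.exit(monitor_exit_status(command)), where command is the JP1/DH - Server command line to run. Because every failed command is reported as a failure, the clustering software starts a switchover even in cases where the service is still usable.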

In the second approach, the software does not switch the system. Thus, on receiving the notification, a system administrator must check what type of error exists and manually switch the system if a failure has actually occurred. Switching the system takes longer this way, but a switchover is performed only when the service is actually unavailable.

However, without email notifications sent by a script or some other means, the system administrator remains unaware that a failure has occurred. If you employ the second approach, make sure that the system administrator is notified of any failure, for example by email.

The first approach requires the system administrator to be familiar with the specifications of the monitoring scripts provided by the clustering software. Therefore, if you need more accurate failure detection in a clustered JP1/DH - Server configuration, we recommend the second approach.

In the cluster configuration described in this appendix, the clustering software detects a failure by checking whether JP1/DH - Server application processes are running. What is necessary for failure detection depends on how you operate your system. See the documentation of your clustering software to configure failure detection suitable for your operations.
