Flow of the recovery process

Operate the system with multiple executing nodes arranged in a cluster configuration and a single standby node that acts as the recovery server. If a failure occurs in even one of the executing node Application Servers, the cluster software detects the failure and switches to the recovery server on the standby node. The standby node then concludes the transaction of the executing node Application Server in which the failure occurred.

The following figure shows the flow of the recovery process:

Figure 19-2 Flow of the recovery process

[Figure]

1. The cluster software of the recovery server detects the failure.
   If a failure occurs in executing node Application Server C and the server terminates, the cluster software of the standby node recovery server detects the termination of Application Server C.
2. Perform node switching with the cluster software of the recovery server.
   Switch to the standby node recovery server. At this point, switch over and mount the shared disk device, and set up the IP address of the virtual host.
3. Lock the recovery server.
   The recovery server executes recovery processes sequentially, so lock the recovery server before you execute the recovery process.
4. Execute the recovery command in the cluster software of the recovery server, and then start the Application Server of the recovery server in recovery mode.
5. In the recovery server, execute the recovery process for the transaction that was running in Application Server C, in which the failure occurred.
   Only the transaction recovery process is executed in the recovery server. The Application Server of the recovery server terminates when the recovery process is complete.
6. Release the lock of the recovery server.
7. Execute the processing that follows the termination of the Application Server in the cluster software of the recovery server.
   Switch over and unmount the shared disk device, and delete the IP address of the virtual host.
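
The steps above can be sketched as a small simulation. This is only an illustrative model of the ordering, not the product's interface: the class RecoveryServer, its methods, and the log messages are all hypothetical names chosen for this sketch, and releasing the lock in a finally clause is an assumption about sensible error handling rather than documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryServer:
    """Hypothetical model of the standby node recovery server (not a product API)."""
    log: list = field(default_factory=list)
    locked: bool = False

    def switch_node(self, failed: str) -> None:
        # Node switching: mount the shared disk and set up the virtual host's IP address.
        self.log.append(f"mount shared disk of {failed}")
        self.log.append(f"set virtual host IP of {failed}")

    def recover_transactions(self, failed: str) -> None:
        # Start in recovery mode, recover the transaction, then terminate.
        self.log.append(f"start in recovery mode for {failed}")
        self.log.append(f"recover transactions of {failed}")
        self.log.append("Application Server terminated")

    def release_node(self, failed: str) -> None:
        # Post-termination processing: unmount the disk and delete the IP address.
        self.log.append(f"unmount shared disk of {failed}")
        self.log.append(f"delete virtual host IP of {failed}")


def run_recovery(server: RecoveryServer, failed: str) -> list:
    """Run the recovery sequence for one failed Application Server, in order."""
    server.log.append(f"detected termination of {failed}")  # failure detection
    server.switch_node(failed)                               # node switching
    server.locked = True                                     # lock the recovery server
    server.log.append("lock acquired")
    try:
        server.recover_transactions(failed)                  # recovery-mode run
    finally:
        # Assumption of this sketch: the lock is always released,
        # even if the recovery step raises an exception.
        server.locked = False
        server.log.append("lock released")
    server.release_node(failed)                              # unmount and delete IP
    return server.log


if __name__ == "__main__":
    for entry in run_recovery(RecoveryServer(), "Application Server C"):
        print(entry)
```

The point of the model is the ordering: the shared disk and virtual host IP address are set up before the lock is taken, recovery runs entirely inside the locked region, and the disk and IP address are released only after the lock is released.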

If a failure occurs in another executing node Application Server, execute the recovery process for that server as well. Note that executing node Application Servers A and B are not affected by the failure in Application Server C and continue application processing.

Notes

If failures occur in multiple executing nodes and their Application Servers terminate, the recovery server (standby node) executes the recovery processes one at a time, under exclusive control.
During the recovery process in the recovery server, the service port of the recovery server is blocked, and therefore no requests are accepted.
Timeout monitoring is performed in case the recovery processing in the recovery server does not finish.
If a timeout occurs due to a failure in the database, execute the recovery process manually. For details, see 2.5.11 When errors occur in the N-to-1 recovery system in the uCosminexus Application Server Maintenance and Migration Guide.
Double failures, such as termination of the recovery server itself, are not supported.
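
The exclusive, sequential recovery and the timeout monitoring described in these notes can be illustrated with a minimal sketch. It assumes a single in-process lock and a worker thread per recovery; the class SerialRecoveryServer, the timeout_s parameter, and the status strings are hypothetical and do not correspond to any product setting or command.

```python
import threading
import time


class SerialRecoveryServer:
    """Hypothetical model: one recovery at a time, each monitored for a timeout."""

    def __init__(self, timeout_s: float):
        self._lock = threading.Lock()   # enforces exclusive, sequential recovery
        self.timeout_s = timeout_s      # assumed monitoring interval, in seconds
        self.events = []                # (status, server name) in execution order

    def recover(self, name: str, work) -> str:
        """Run work() for the failed server `name` while holding the lock."""
        with self._lock:
            self.events.append(("start", name))
            worker = threading.Thread(target=work, daemon=True)
            worker.start()
            worker.join(self.timeout_s)  # timeout monitoring
            # This sketch only reports a recovery that is still running;
            # a real system would hand it over to manual recovery.
            status = "timeout" if worker.is_alive() else "recovered"
            self.events.append((status, name))
            return status


if __name__ == "__main__":
    server = SerialRecoveryServer(timeout_s=1.0)
    callers = [
        threading.Thread(target=server.recover, args=(n, lambda: time.sleep(0.05)))
        for n in ("Application Server A", "Application Server B")
    ]
    for t in callers:
        t.start()
    for t in callers:
        t.join()
    print(server.events)
```

Because both callers contend for the same lock, the second recovery cannot start until the first has finished or timed out, which mirrors the note that the recovery server handles multiple failures sequentially.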

19.3.2 Flow of the recovery process