uCosminexus Application Server, Maintenance and Migration Guide

2.5.11 If problems occur in N-to-1 recovery systems

This subsection describes, for each OS, the recovery procedure to use when problems occur in N-to-1 recovery systems.

Organization of this subsection
(1) In Windows
(2) In UNIX

(1) In Windows

With N-to-1 recovery systems, if the recovery processing of the standby node host (recovery server) times out because of a database server failure (such as the server going down or a deadlock), collect the logs and execute the recovery procedure manually. The recovery procedure is as follows:

  1. Collect the log of the N-to-1 node switching system.
    If a problem occurs in the N-to-1 recovery system, you must collect cluster logs. Collect the same logs that you would collect when a problem occurs in a 1-to-1 node switching system. For details on the logs to be collected, see the subsection 2.5.10 If a problem occurs in 1-to-1 node switching system.
  2. Recover the N-to-1 node switching system manually.
    Use one of the following methods to manually recover the N-to-1 recovery system:
    • Set up the target resources of the standby node host as online
    • Execute the transaction recovery command (cjstartrecover) of the J2EE server

(a) Setting the target resources of the standby node host as online and executing recovery

To set up the target resources of the standby node host as online and execute recovery:

  1. Perform operations, such as restarting the database, to resolve the cause of the timeout.
  2. Set up the target resources of the standby node host as online.

(b) Executing the transaction recovery command (cjstartrecover) of the J2EE server for recovery

To execute the transaction recovery command (cjstartrecover) of the J2EE server for recovery:

  1. Perform operations, such as restarting the database, to resolve the cause of the timeout.
  2. Create a folder in the path specified in Dir_Name using the universal script for the standby node host.
    If the folder already exists, delete it, and then create a new one.
  3. When the universal script for the standby node host is online, check the cluster logs, and then execute cjstartrecover.
  4. When recovery is successful, delete the folder created in the path specified in Dir_Name.

For details on the cjstartrecover command, see cjstartrecover (J2EE Server Transaction Recovery) in the uCosminexus Application Server Command Reference Guide.

(2) In UNIX

With N-to-1 recovery systems, if the recovery processing of the standby node host (recovery server) times out because of a database server failure (such as the server going down or a deadlock), execute the recovery procedure manually as follows:

  1. Perform operations, such as restarting the database, to resolve the cause of the timeout.
  2. In the standby node host, execute monbegin for the failed executing node host.
     
    # monbegin server-identification-name
     
    For server-identification-name, specify the identification name of the server for the executing node, as specified in the alias operand of the servers file.
  3. In the standby node host, execute monact for the failed executing node host.
     
    # monact server-identification-name
     
    For server-identification-name, specify the identification name of the server for the executing node, as specified in the alias operand of the servers file.
    You can use a standby node host (recovery server) to execute the recovery processing for the unconcluded transactions in the failed executing node host.
    Reference note
    For details on the environment settings to support servers in the HA monitor, such as defining the servers file, see 19.5.4 Environment settings of the HA Monitor in the uCosminexus Application Server Operation, Monitoring, and Linkage Guide.
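
Steps 2 and 3 of the UNIX procedure can be sketched as a small shell wrapper that runs monbegin and then monact for the same server identification name. This is only an illustrative sketch, not part of the product: the names DRY_RUN, run, and recover_node are hypothetical, and the sketch assumes that each command returns a non-zero exit status on failure. By default it only prints the commands it would run, so that the sequence can be reviewed before it is executed on the standby node host.

```shell
#!/bin/sh
# Sketch of steps 2 and 3 of the UNIX procedure (hypothetical helper names).
# monbegin and monact are the commands named in the procedure; everything
# else here is an assumption made for illustration.

# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# recover_node server-identification-name
# The argument is the value of the alias operand in the servers file.
recover_node() {
    server_id="${1:-}"
    if [ -z "$server_id" ]; then
        echo "usage: recover_node server-identification-name" >&2
        return 2
    fi
    # Step 2: execute monbegin for the failed executing node host.
    run monbegin "$server_id" || return 1
    # Step 3: execute monact for the same server.
    run monact "$server_id" || return 1
}
```

For example, after resolving the cause of the timeout (step 1), calling `recover_node AP1` prints the two commands for a hypothetical alias AP1, and setting `DRY_RUN=0` before the call executes them. Stopping when monbegin fails is a design choice of this sketch; the procedure itself does not describe error handling.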