Nonstop Database, HiRDB Version 9 System Operation Guide
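This subsection explains how to perform a system switchover when either of the following errors occurs: abnormal termination of a large number of server processes, or an RDAREA I/O error (path error).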
If a large number of server processes terminate abnormally, new services might not be accepted. Although abnormal termination of server processes will not cause HiRDB to terminate abnormally, HiRDB is essentially in online stopped status. Also, because HiRDB does not terminate abnormally, a system switchover is not performed. The following subsections explain how to perform system switchovers when HiRDB is in online stopped status.
In a mutual system switchover configuration, performing a system switchover might not be effective and might actually cause traffic to increase, because more than one HiRDB then runs on the same server machine. If you are using the process abnormal termination monitoring facility in a mutual system switchover configuration, we recommend that you do not perform a system switchover when HiRDB terminates abnormally. Instead, specify pd_mode_conf=MANUAL1 and restart HiRDB in the system where it terminated abnormally.
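As a minimal sketch, the restart setting named above might be written in a HiRDB system definition as follows. The set format and the # comment syntax are assumed from general HiRDB system definition conventions. This guide does not state which definition file holds the operand, so confirm the placement in the system definition manual.

```
# Assumption: written in the HiRDB system definition
set pd_mode_conf = MANUAL1
```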
When a large number of server processes terminate abnormally, a large amount of troubleshooting information might be output, causing the ensuing system switchover to take a long time. Specifying the following operands suppresses output of troubleshooting information, making it possible to reduce the system switchover time when many server processes have terminated abnormally:
Also, if you specify Y in the pd_ha_switch_timeout operand and the internal termination processing of the running HiRDB exceeds the server failure monitoring time during a system switchover, the system switchover proceeds without waiting for that termination processing to finish.
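For illustration only, the timeout behavior described above would be enabled with a single definition line such as the following. The operand name and the value Y come from this subsection. The surrounding syntax is an assumption based on the usual set format of HiRDB system definitions.

```
# Assumption: written in the HiRDB system definition
# Y: do not wait for internal termination processing beyond the server failure monitoring time
set pd_ha_switch_timeout = Y
```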
This subsection explains how to perform a system switchover when an RDAREA input/output error (path error) occurs. In this context, an input/output error (I/O error) means an error that occurs when HiRDB fails to perform an operation on a file because HiRDB cannot identify the file. The error code returned from the request for access to the HiRDB file system is -1544.
Specify the pd_db_io_error_action operand. If unitdown is specified in the pd_db_io_error_action operand, HiRDB (or the unit, in a HiRDB parallel server configuration) terminates abnormally when an RDAREA I/O error occurs, which causes a system switchover to be performed. When the cause of the I/O error is a path error, job tasks can continue because I/O processing can be performed after the system switchover. Here, a path error means a status in which files cannot be accessed because the communication path between HiRDB and the files was interrupted.
For details about specifying unitdown in the pd_db_io_error_action operand, see 20.20 Actions to take when an RDAREA I/O error occurs.
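The unitdown setting discussed above can be sketched as one line of a HiRDB system definition. The operand name and value are taken from this subsection. The set format and the comment line are assumptions, and the referenced section remains the authority on prerequisites and side effects.

```
# Assumption: written in the HiRDB system definition
# unitdown: terminate HiRDB (or the unit) abnormally on an RDAREA I/O error so that a system switchover is triggered
set pd_db_io_error_action = unitdown
```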
When an I/O error occurs and HiRDB terminates abnormally, perform a system switchover and continue the processing that was in progress when the error occurred. To resolve the error, read the messages that are output. Then, perform another system switchover, or terminate and restart HiRDB, as appropriate. If the I/O error recurs after the system switchover, the RDAREA is shut down. If this happens, use the database recovery utility (pdrstr command) to recover the RDAREA.
All Rights Reserved. Copyright (C) 2011, 2015, Hitachi, Ltd.