Nonstop Database, HiRDB Version 9 System Operation Guide


20.8.2 Procedure for starting a HiRDB (unit) while there is an erroneous status file

The following figure shows the procedure for starting a HiRDB (unit) while there is an erroneous status file.

Figure 20-1 Procedure for starting a HiRDB (unit) while there is an erroneous status file

[Figure]

Note
The numbers to the left of the process boxes correspond to the paragraph numbers of the explanations on the following pages. For example, step 5 is explained in paragraph (5) below.

Figure 20-2 Actions to be taken when an error occurs in a status file

[Figure]

Note
The numbers to the left of the process boxes correspond to the paragraph numbers of the explanations on the following pages. For example, step 8 is explained in paragraph (8) below.
Organization of this subsection
(1) Checking for a disk error
(2) Correcting a disk error
(3) Checking for a status file error
(4) Restarting HiRDB
(5) Restoring the status file in which the error occurred
(6) Check if single operation is being used
(7) Checking if HiRDB can identify the current file that was in effect during the previous session
(8) Using the current file identification facility
(9) Identifying the current file that was in effect during the previous session
(10) Checking if the identified current file is normal
(11) Specifying the identified current file

(1) Checking for a disk error

Check whether a disk error has occurred on the disk that stores the status file in which the error occurred. Check for physical errors (such as physical damage or a power outage) and for OS or disk driver errors, and confirm that the disk is enabled.

The following table shows how to determine whether a physical error has occurred at the disk.

Table 20-16 Determining if a physical error has occurred at the disk (physical error check)

Has a disk error occurred? | Is the physical error recoverable? | Status file data | Determination result
No | -- | -- | No physical error
Yes | Yes | Data remains. | No physical error
Yes | Yes | Data has been lost. | Physical error occurred (no entity)
Yes | No | -- | Physical error occurred (no entity)

Legend:
--: Not applicable
Note
Regardless of whether a disk error has occurred, unless otherwise indicated, do not use the pdstsinit, pdstsrm, or pdfmkfs command until error recovery is completed.
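
The physical error check can be written as a small decision function. The following Python sketch is illustrative only and is not part of HiRDB. The function name and parameters are hypothetical, and it assumes that a recoverable disk error whose status file data remains is treated as "No physical error".

```python
from typing import Optional

NO_PHYSICAL_ERROR = "No physical error"
PHYSICAL_ERROR_NO_ENTITY = "Physical error occurred (no entity)"


def classify_physical_error(
    disk_error: bool,
    recoverable: Optional[bool] = None,
    data_remains: Optional[bool] = None,
) -> str:
    """Return the determination result for one status file (hypothetical helper)."""
    if not disk_error:
        # No disk error: the other two columns are not applicable.
        return NO_PHYSICAL_ERROR
    if recoverable and data_remains:
        # Assumption: a recoverable error with intact data counts as no physical error.
        return NO_PHYSICAL_ERROR
    # Data lost, or the error is not recoverable.
    return PHYSICAL_ERROR_NO_ENTITY
```

For example, classify_physical_error(True, recoverable=False) yields the "no entity" result, because the status file cannot be read back from an unrecoverable disk.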

(2) Correcting a disk error

If the check identifies a disk error, correct it. If it is not possible to correct the error, start HiRDB with the remaining normal disks only.

(3) Checking for a status file error

Check for an error in a status file. The following table shows how to determine whether a logical error has occurred.

Table 20-17 Determining if a logical error has occurred (logical error check)

Command execution | Results displayed by the command (compared to the values specified at the time of file creation) | Determination result
Terminated normally | No inconsistency | No logical error
Terminated normally | Inconsistency found | Logical error occurred
Terminated abnormally (error message is output) | -- | Logical error occurred

Legend:

--: Not applicable

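Execute the pdcat command for each status file in which no physical error has occurred, and check its contents for errors. As Table 20-17 indicates, the status file is normal if both of the following conditions are satisfied:
  • The pdcat command terminates normally.
  • The results displayed by the command are consistent with the values specified at the time of file creation.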

An example of execution of the pdcat command follows:

 
pdcat -d sts -u UNT1 -f /sysfile/usts1a -v        ...1
pdcat -d sts -s b001 -f /sysfile/sstsb1a -v       ...2
 

Explanation
  1. Command execution example for a unit status file.
  2. Command execution example for a server status file.

If neither a physical error nor a logical error has occurred, proceed to the next step. If a physical error or a logical error has occurred, take the appropriate actions described in Figure 20-2 Actions to be taken when an error occurs in a status file.
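
The logical error check can be sketched in Python as follows. The sketch is illustrative only. It builds the pdcat argument list with the same options as the execution example above, without running the command, and applies the determination rules of Table 20-17. The function names are hypothetical, and deciding whether the displayed values match the values specified at file creation is left to the caller.

```python
from typing import List, Optional


def build_pdcat_command(
    status_file: str,
    unit: Optional[str] = None,
    server: Optional[str] = None,
) -> List[str]:
    """Build the pdcat argument list for a unit (-u) or server (-s) status file."""
    if (unit is None) == (server is None):
        raise ValueError("specify exactly one of unit or server")
    target = ["-u", unit] if unit is not None else ["-s", server]
    return ["pdcat", "-d", "sts", *target, "-f", status_file, "-v"]


def classify_logical_error(
    terminated_normally: bool, inconsistency_found: bool = False
) -> str:
    """Apply the logical error check to the outcome of one pdcat execution."""
    if terminated_normally and not inconsistency_found:
        return "No logical error"
    # Abnormal termination, or displayed values that do not match.
    return "Logical error occurred"
```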

(4) Restarting HiRDB

Use the pdstart command to restart HiRDB. If HiRDB cannot be restarted, take the appropriate actions described in 20.8.3 Actions to be taken when a HiRDB (unit) cannot be restarted due to an error in both versions of the current file.

(5) Restoring the status file in which the error occurred

If an error has occurred in one of the current files, the HiRDB administrator must immediately take the appropriate action described in Table 20-15 Actions to be taken in the event of an error in the current file.

If a file has been shut down by an error, place the shutdown file in spare status by following the procedure described in 20.8.1(1) Placing shutdown files in spare status.

After you have restored all status files, terminate HiRDB if necessary, restore the specification values of the following operands to their original values, then start HiRDB:

(6) Check if single operation is being used

Check if single operation is being used for the status file in which the error occurred. If the following operand is specified, as applicable, for the status file in which the error occurred, single operation is being used:

(7) Checking if HiRDB can identify the current file that was in effect during the previous session

Based on the results of steps (1) through (3), for each logical file of the status file in which the error occurred, determine whether file versions A and B were in the statuses shown in the following table. If the status file is in any of the statuses shown in the following table, HiRDB cannot identify the current file that was in effect during the previous session.

Table 20-18 Cases in which HiRDB cannot identify the current file that was in effect during the previous session

Status of file version A Status of file version B
Logical error present Logical error present
Logical error present Physical error present (no entity)
Physical error present (no entity) Logical error present
Physical error present (no entity) Physical error present (no entity)
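
Table 20-18 lists every combination in which both file versions are in error, so the check reduces to a single condition. The following Python sketch is illustrative only; the status constants and the function name are hypothetical.

```python
NORMAL = "normal"
LOGICAL_ERROR = "logical error"
PHYSICAL_ERROR = "physical error (no entity)"

_ERROR_STATUSES = {LOGICAL_ERROR, PHYSICAL_ERROR}


def hirdb_can_identify_current_file(status_a: str, status_b: str) -> bool:
    """Return False only when file versions A and B are both in error."""
    both_in_error = status_a in _ERROR_STATUSES and status_b in _ERROR_STATUSES
    return not both_in_error
```

Apply the function to each logical file of the status file, using the statuses determined in steps (1) through (3).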

(8) Using the current file identification facility

Use HiRDB's current file identification facility. Specify the following operand, as applicable, for the status file:

(9) Identifying the current file that was in effect during the previous session

Identify the current (most recent) file that was in effect during the previous session. You can identify this file from the messages listed below. Retrieve these messages from the message log file or syslogfile (retrieve the messages for the unit or server for which the current file cannot be identified).

Among these messages, find the one that was output most recently; that message indicates the current file.
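
The search for the most recent message can be sketched in Python as follows. The sketch is illustrative only and rests on several assumptions: the relevant message IDs are the three named in step (10) (KFPS01001-I, KFPS01044-I, and KFPS01063-I), the log lines have already been filtered to the applicable unit or server, each line contains its message ID as plain text, and the lines are supplied in chronological order. The function name is hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Assumption: message IDs taken from step (10) of this subsection.
CURRENT_FILE_MESSAGE_IDS = ("KFPS01001-I", "KFPS01044-I", "KFPS01063-I")


def latest_current_file_message(lines: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return (message ID, line) for the last matching line, or None if none match."""
    latest = None
    for line in lines:
        for message_id in CURRENT_FILE_MESSAGE_IDS:
            if message_id in line:
                # Later lines overwrite earlier ones, so the final value is the newest.
                latest = (message_id, line.rstrip("\n"))
                break
    return latest
```

The returned message ID also shows whether status file single operation was in effect, which step (10) needs.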

(10) Checking if the identified current file is normal

Check that the current file identified in step (9) is normal. You can determine this from the results of steps (1) through (3).

If status file single operation was in effect during the previous session (the last message that was output in step (9) was KFPS01044-I), check whether the status file for the active file system shown in the KFPS01044-I message is normal.

If status file single operation was not in effect during the previous session (the last message that was output in step (9) was KFPS01001-I or KFPS01063-I), check whether either of the status files shown in the KFPS01001-I or KFPS01063-I message is normal.

If the current file is normal (or if one of the files is normal if status file single operation was not in effect), proceed to the next step.

If an error has occurred in the current file, the current file that was in effect during the previous session has been lost, which means that HiRDB cannot be restarted. In such a case, take the actions described in 20.8.3 Actions to be taken when a HiRDB (unit) cannot be restarted due to an error in both versions of the current file.
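
The decision in this step can be sketched in Python as follows. The sketch is illustrative only. The function name and parameters are hypothetical: files_in_message holds the status file names taken from the last message found in step (9), and file_is_normal records the outcome of steps (1) through (3) for each file.

```python
from typing import Dict, Sequence


def current_file_is_usable(
    last_message_id: str,
    files_in_message: Sequence[str],
    file_is_normal: Dict[str, bool],
) -> bool:
    """Return True if HiRDB can proceed to step (11) with the identified file."""
    if last_message_id == "KFPS01044-I":
        # Single operation was in effect: the active file itself must be normal.
        return bool(files_in_message) and all(
            file_is_normal.get(name, False) for name in files_in_message
        )
    if last_message_id in ("KFPS01001-I", "KFPS01063-I"):
        # Single operation was not in effect: one normal file is enough.
        return any(file_is_normal.get(name, False) for name in files_in_message)
    raise ValueError(f"unexpected message ID: {last_message_id}")
```

A result of False corresponds to the case in which the current file has been lost and the actions in 20.8.3 apply.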

(11) Specifying the identified current file

Specify the identified current file that was in effect during the previous session in the following operands:

When the error occurred in a unit status file
Specify the following operands for the applicable unit:
  • pd_syssts_initial_error=continue or excontinue
  • pd_syssts_last_active_file=name-of-current-status-file-that-was-in-effect-during-the-previous-session#
  • pd_syssts_last_active_side=file-system-that-was-normal-during-the-previous-session#

When the error occurred in a server status file
Specify the following operands for the applicable server:
  • pd_sts_initial_error=continue or excontinue
  • pd_sts_last_active_file=name-of-current-status-file-that-was-in-effect-during-the-previous-session#
  • pd_sts_last_active_side=file-system-that-was-normal-during-the-previous-session#

#: Specify the current file name and normal file system identified in steps (9) and (10).
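
The operand settings above can be generated with a small helper. The following Python sketch is illustrative only. The function name is hypothetical, the name=value form mirrors the notation used in this subsection rather than the exact syntax of the definition files, and the values passed for the file name and file system must be those identified in steps (9) and (10).

```python
from typing import List


def recovery_operands(
    kind: str,
    last_active_file: str,
    last_active_side: str,
    initial_error: str = "continue",
) -> List[str]:
    """Return the three operand settings for a unit or server status file."""
    prefixes = {"unit": "pd_syssts", "server": "pd_sts"}
    if kind not in prefixes:
        raise ValueError("kind must be 'unit' or 'server'")
    if initial_error not in ("continue", "excontinue"):
        raise ValueError("initial_error must be 'continue' or 'excontinue'")
    prefix = prefixes[kind]
    return [
        f"{prefix}_initial_error={initial_error}",
        f"{prefix}_last_active_file={last_active_file}",
        f"{prefix}_last_active_side={last_active_side}",
    ]
```

Deriving both operand families from one prefix keeps the unit and server settings from drifting apart when the values are entered by hand.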