18.17.1 Restart procedure

When HiRDB runs out of system log file space and has no file available to become the current file, it issues the KFPS01220-E message and the unit terminates abnormally with abort code Psjnf07 or Psjn381. The HiRDB administrator must then restart HiRDB and resume jobs using the procedure shown below:

Procedure
[Figure]
[Figure]

The numbers to the left of the process boxes correspond to the paragraph numbers of the explanations on the following pages. For example, step 5 is explained in paragraph (5) below.

1 This step is explained in Section 18.17.2 Determining the minimum number of system log files to be added.

2 This step is explained in Section 18.17.3 Creating a file in swappable target status.

Several messages must be checked during the procedure explained below. Because messages in the HiRDB message log files ($PDDIR/spool/pdlog1 and pdlog2) may have been overwritten, check the messages in syslogfile instead.

Organization of this subsection
(1) Determine the server that caused the abnormal termination (for a HiRDB/Parallel Server)
(2) Determine the number of system log files to be used as input information during restart
(3) Use the pdfstatfs command to determine if there is enough space in the HiRDB file system area for system files
(4) Add the pdlogadfg and pdlogadpf operands in the server definition
(5) Use the pdloginit command to add the system log files
(6) Enter the pdstart command to restart the unit
(7) Check that the synchronization point dump has been validated
(8) Use the pdcopy command to back up data (applicable to operation without unloading of the system log)
(9) Use the pdlogls command to determine if there is a system log file in swappable target status
(10) Restart the application
(11) Reevaluate the size of the system log files

(1) Determine the server that caused the abnormal termination (for a HiRDB/Parallel Server)

The server that caused the abnormal termination can be determined from the KFPS01220-E message:

Contents of the syslogfile

KFPS01220-E PRDT untF Request to swap sys(bes1) log file unable to be executed
because there is no standby log file group available.(13830)

In this example, bes1 is the cause of abnormal termination. Assume for this example that this server's system log files are organized as follows:

[Figure]
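The lookup described in this step can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the pattern assumes that the server name always appears as sys(server-name) in the KFPS01220-E message, as it does in the sample above.

```python
import re

# Matches the server name inside "sys(<server>)" on a KFPS01220-E line.
# Assumption: the layout is the one shown in the sample message above.
KFPS01220_SERVER = re.compile(r"KFPS01220-E\b.*?\bsys\((?P<server>[^)]+)\)")

def servers_out_of_log_space(syslog_lines):
    """Return server names found in KFPS01220-E messages, in order of first appearance."""
    found = []
    for line in syslog_lines:
        match = KFPS01220_SERVER.search(line)
        if match and match.group("server") not in found:
            found.append(match.group("server"))
    return found

sample = [
    "KFPS01220-E PRDT untF Request to swap sys(bes1) log file unable to be executed",
    "because there is no standby log file group available.(13830)",
]
print(servers_out_of_log_space(sample))  # ['bes1']
```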

(2) Determine the number of system log files to be used as input information during restart

The number of system log files to be used as the input information during restart can be determined from the KFPS01229-I message and the pdlogls command.

Reference note
  • A method other than this method can also be used to determine the number of system log files to be used as input information during restart. For details on the other method, see 18.17.5 Determining the number of system log files to be used as input files during restart.
  • If pd_mode_conf=AUTO or pd_mode_conf=MANUAL1 is specified, automatic restart processing is executed several times after the unit terminates abnormally, and the KFPS01229-I message is output each time. Use the KFPS01229-I message that was output at the first abnormal termination (during online operation).

Contents of the syslogfile

KFPS01220-E PRDT untF Request to swap sys(bes1) log file unable to be executed
because there is no standby log file group available.(13830)
KFPO00105-E PRDT untF Server _log1s(process ID=13830) killed by
code=Psjnf07(13830)
KFPS01821-E PRDT untF Unable to continue HiRDB unit processing because serious
error occurred; stops HiRDB unit untF (13776)
KFPS01229-I PRDT untF Next bes1 log file restart point,generation number=4,
block number=d. restart end point, generation number=6, blocknumber=11.
last acquired syncpoint dump 1998/11/15 15:54:41 (13776)

Execution results of the pdlogls command:

pdlogls -d sys -s bes1

HOSTNAME : dcm3500(163541)
***** Off-line Information *****
Group    Type Server   Gen No.  Status    Run ID      Block No.
logfg01  sys  bes1     1        cna---u  364a4ac2     1       6
logfg02  sys  bes1     2        cna---u  364a4ac2     7       9
logfg03  sys  bes1     3        cna---u  364a4ac2     a       c
logfg04  sys  bes1     4        cna---u  364a4ac2     d       e
logfg05  sys  bes1     5        cna---u  364a4ac2     f       10
logfg06  sys  bes1     6        cn---cu  364a4ac2     11      0
logfg07  sys  bes1     0        cn-----  00000000     0       0
logfg08  sys  bes1     0        cn-----  00000000     0       0

Explanation
The KFPS01229-I message displays information about the system log files that are to be used as the input information during the restart. In this example, the input start generation is 4 and the input end generation is 6 for the system log files to be used during the restart. Files with generation numbers (Gen No) 4 to 6 (logfg04, logfg05, logfg06) are the system log files to be used as the input information during the restart. Therefore, a total of 3 system log files are used as the input information during the restart.
If the number of system log files that can be added equals number-of-system-log-files-used-as-input-during-restart + 1 (3 + 1 = 4), HiRDB can be restarted. Otherwise, determine the minimum number of system log files to be added by referring to 18.17.2 Determining the minimum number of system log files to be added. Then change files in overwrite-enabled status to swappable target status and restart HiRDB.
[Figure]
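The arithmetic in the explanation above can be sketched in Python as follows. The function names are hypothetical. The sketch assumes that the first "generation number=" in the KFPS01229-I message is the restart point and the second is the restart end point, as in the sample, and that the message has been joined into a single string. The base of the generation numbers is left as a parameter because the sample contains only single-digit values.

```python
import re

GENERATION = re.compile(r"generation number=([0-9a-fA-F]+)")

def restart_input_generations(message, base=10):
    """Return (start, end) generation numbers from a KFPS01229-I message.

    Assumption: the first match is the restart point and the second is
    the restart end point, following the sample message layout.
    """
    numbers = GENERATION.findall(message)
    if len(numbers) < 2:
        raise ValueError("expected two generation numbers in the message")
    return int(numbers[0], base), int(numbers[1], base)

def files_needed(start_gen, end_gen):
    """Return (number of input files, number of files to add = input files + 1)."""
    input_count = end_gen - start_gen + 1
    return input_count, input_count + 1

sample = (
    "KFPS01229-I PRDT untF Next bes1 log file restart point,generation number=4, "
    "block number=d. restart end point, generation number=6, blocknumber=11. "
    "last acquired syncpoint dump 1998/11/15 15:54:41 (13776)"
)
start, end = restart_input_generations(sample)
print(files_needed(start, end))  # (3, 4)
```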

(3) Use the pdfstatfs command to determine if there is enough space in the HiRDB file system area for system files

Use the pdfstatfs command to determine if there is enough space in the HiRDB file system area to add the four system log files determined in step (2).

pdfstatfs /bes1/sysfile_a
pdfstatfs /bes1/sysfile_b

The following procedure is used to determine the size of one system log file:

Procedure
To determine the system log file size:
  1. Use the pdfls command to check the number of records in the existing system log file.
  2. The size of one system log file can be obtained from the following formula, where a is the number of records obtained in step 1:
    a × 4096 (bytes)
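As a worked example of this formula (number of records multiplied by 4096 bytes), the following Python sketch computes the size of one system log file and the space that four added files need in each HiRDB file system area. The record count of 5000 is an assumption borrowed from the -n value in the step (5) example; use the value that pdfls actually reports. The function names are hypothetical.

```python
RECORD_BYTES = 4096  # record length used in the size formula

def log_file_bytes(record_count):
    """Size of one system log file in bytes."""
    return record_count * RECORD_BYTES

def required_bytes_per_area(record_count, files_to_add):
    """Space needed in one HiRDB file system area for the files added there."""
    return log_file_bytes(record_count) * files_to_add

print(log_file_bytes(5000))              # 20480000
print(required_bytes_per_area(5000, 4))  # 81920000
```

In this example, /bes1/sysfile_a and /bes1/sysfile_b each receive four files, so each area needs the second value. Compare it with the free space that pdfstatfs reports for that area.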
When there is not enough space in the HiRDB file system area
If there is not enough space in the HiRDB file system area to add the four system log files determined in step (2), take one of the following actions:
  1. See 18.17.2 Determining the minimum number of system log files to be added to determine the minimum number of files to be added. Then place overwrite-enabled files in swappable target status.
  2. Create a HiRDB file system area for system files on the hard disk, then add the system log files there. For details on how to create a HiRDB file system area for system files, see 18.17.4 Creating a HiRDB file system area for system files.
  3. If there is not enough space on the hard disk, system log files cannot be added and the unit cannot be restarted. In such a case, add a new hard disk and create a HiRDB file system area for system files there.

(4) Add the pdlogadfg and pdlogadpf operands in the server definition

pdlogadfg -d sys -g logfg09 ONL
pdlogadpf -d sys -g logfg09 -a /bes1/sysfile_a/log09a\
                           -b /bes1/sysfile_b/log09b
pdlogadfg -d sys -g logfg10 ONL
pdlogadpf -d sys -g logfg10 -a /bes1/sysfile_a/log10a\
                           -b /bes1/sysfile_b/log10b
pdlogadfg -d sys -g logfg11 ONL
pdlogadpf -d sys -g logfg11 -a /bes1/sysfile_a/log11a\
                           -b /bes1/sysfile_b/log11b
pdlogadfg -d sys -g logfg12 ONL
pdlogadpf -d sys -g logfg12 -a /bes1/sysfile_a/log12a\
                           -b /bes1/sysfile_b/log12b

Specify the pdlogadfg and pdlogadpf operands for the system log files to be added. In this example, four files are added to each of versions A and B.
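Because the operands follow a regular naming pattern, they can be generated rather than typed by hand. The following Python sketch is one way to do this. The function name and the default paths are hypothetical and simply mirror the example above; the output is text to paste into the server definition, and the sketch does not modify any HiRDB file.

```python
def log_group_definitions(first, last,
                          area_a="/bes1/sysfile_a",
                          area_b="/bes1/sysfile_b"):
    """Build pdlogadfg/pdlogadpf operand lines for groups logfg<first>..logfg<last>."""
    lines = []
    for n in range(first, last + 1):
        group = f"logfg{n:02d}"
        lines.append(f"pdlogadfg -d sys -g {group} ONL")
        # A trailing backslash continues the operand on the next line,
        # as in the example above.
        lines.append(f"pdlogadpf -d sys -g {group} -a {area_a}/log{n:02d}a\\")
        lines.append(f"                           -b {area_b}/log{n:02d}b")
    return lines

print("\n".join(log_group_definitions(9, 12)))
```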

Note
If HiRDB Datareplicator is running, it must be terminated. HiRDB Datareplicator can be started after steps (4) and (5) have been completed.

(5) Use the pdloginit command to add the system log files

pdloginit -d sys -s bes1 -f /bes1/sysfile_a/log09a -n 5000
pdloginit -d sys -s bes1 -f /bes1/sysfile_b/log09b -n 5000
pdloginit -d sys -s bes1 -f /bes1/sysfile_a/log10a -n 5000
pdloginit -d sys -s bes1 -f /bes1/sysfile_b/log10b -n 5000
pdloginit -d sys -s bes1 -f /bes1/sysfile_a/log11a -n 5000
pdloginit -d sys -s bes1 -f /bes1/sysfile_b/log11b -n 5000
pdloginit -d sys -s bes1 -f /bes1/sysfile_a/log12a -n 5000
pdloginit -d sys -s bes1 -f /bes1/sysfile_b/log12b -n 5000

In this example, four files are added to each of versions A and B.
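The file names passed to pdloginit must match the paths in the pdlogadpf operands added in step (4). A minimal Python sketch that derives the command lines from the operand text, so that the two cannot drift apart, is shown below. The function name is hypothetical, the sketch only builds command strings and does not execute them, and the record count must be supplied by the caller.

```python
def pdloginit_commands(definition_text, server, records):
    """Build one pdloginit command line per -a/-b path in pdlogadpf operands."""
    commands = []
    for raw in definition_text.splitlines():
        # Drop the trailing continuation backslash before splitting into tokens.
        tokens = raw.strip().rstrip("\\").split()
        for option, value in zip(tokens, tokens[1:]):
            if option in ("-a", "-b"):
                commands.append(
                    f"pdloginit -d sys -s {server} -f {value} -n {records}"
                )
    return commands

sample = r"""pdlogadfg -d sys -g logfg09 ONL
pdlogadpf -d sys -g logfg09 -a /bes1/sysfile_a/log09a\
                           -b /bes1/sysfile_b/log09b"""

for command in pdloginit_commands(sample, "bes1", 5000):
    print(command)
```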

(6) Enter the pdstart command to restart the unit

HiRDB/Single Server

pdstart

HiRDB/Parallel Server

In the case of a HiRDB/Parallel Server, restart the applicable unit.

pdstart -u untF

Although jobs can be accepted immediately after HiRDB (the unit) has been restarted, do not resume jobs yet. Immediately after HiRDB (the unit) is restarted, HiRDB is still recovering the database. If many database-updating jobs are entered at this point, a system log file space shortage may occur again. If that happens, HiRDB (the unit) will terminate abnormally again. Therefore, refrain from resuming jobs until all the steps through (9) have been completed.

(7) Check that the synchronization point dump has been validated

When HiRDB is restarted, a synchronization point dump is validated and the KFPS02183-I message is output. Check that the synchronization point dump has been validated before executing the next step.

Note that the KFPS02183-I message is output only if Y was specified in the pd_spd_assurance_msg operand or if this operand was omitted. If this condition is not satisfied, check that the synchronization point dump is validated using the procedure explained in Section 18.17.6 Checking for synchronization point dump validation.
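When the KFPS02183-I message is enabled, the check can be sketched in Python as follows. The sketch looks only at message IDs and tests whether KFPS02183-I appears after the most recent KFPS01220-E. The function name is hypothetical, and the message texts in the usage lines are placeholders rather than real HiRDB output.

```python
def validated_after_restart(syslog_lines,
                            failure_id="KFPS01220-E",
                            validated_id="KFPS02183-I"):
    """Return True if validated_id occurs after the last occurrence of failure_id."""
    last_failure = -1
    last_validated = -1
    for index, line in enumerate(syslog_lines):
        if failure_id in line:
            last_failure = index
        elif validated_id in line:
            last_validated = index
    return last_validated > last_failure

# Placeholder lines: only the message IDs matter to this check.
lines = [
    "KFPS01220-E (text omitted)",
    "KFPS02183-I (text omitted)",
]
print(validated_after_restart(lines))  # True
```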

(8) Use the pdcopy command to back up data (applicable to operation without unloading of the system log)

pdcopy -m dcm3500:/dbarea/area1/rdmt1 -M x -p /usr/ofile -f /usr/seifile/cfl
-z /usr/bes1/logpoint02

If operation without unloading of the system log was in effect, use the database copy utility (pdcopy command) to back up all RDAREAs in the server before restarting an application. In such a case, use the -z option to specify the log point information file.

After backing up the RDAREAs, use the pdlogchg -z command to release the system log file and place it in unload completed status:

pdlogchg -z /usr/bes1/logpoint02 [-x host_name]

(9) Use the pdlogls command to determine if there is a system log file in swappable target status

pdlogls -d sys -s bes1

If there is no file in swappable target status, HiRDB may terminate abnormally again. Immediately create a file in swappable target status. For example, if a file can be placed in swappable target status by releasing it from unload wait status, use the pdlogunld command to unload the system log.

(10) Restart the application

Restart the application.

(11) Reevaluate the size of the system log files

Reevaluate the size of the system log files. For details on how to obtain the size of system log files, see the manual HiRDB Version 8 Installation and Design Guide. If any added system log file is not needed, delete it. The procedure for deleting a system log file is shown below.

If operation without unloading of the system log is being used, a system log file cannot be deleted while it is in use as the current system log file. Check that the system log file to be deleted is not the current file, execute step (8) again, then delete the system log file.

Procedure
To delete a system log file:
  1. Enter the pdstop command to terminate HiRDB normally.
  2. Unload the system log file to be deleted, or change its status.
  3. If HiRDB Datareplicator is being used, check that it has finished extracting the system log file that is to be deleted.
  4. If HiRDB Datareplicator is running, terminate it.
  5. Use the pdlogrm command to delete the system log file.
  6. Delete the pdlogadfg and pdlogadpf operands for the deleted system log file.
  7. Enter the pdstart command to start HiRDB normally.
  8. If HiRDB Datareplicator is being used, start HiRDB Datareplicator.