Nonstop Database, HiRDB Version 9 System Operation Guide

26.7.7 Reducing the system switchover time (user server hot standby, rapid system switchover facility)

Two functions reduce the time required to perform a system switchover: user server hot standby and the rapid system switchover facility. The system switchover facility must be operating in the server mode to use either function.

Organization of this subsection
(1) User server hot standby
(2) Rapid system switchover facility
(3) System configuration examples when the rapid system switchover facility is used
(4) Checking procedure when startup of the standby system takes too much time
(5) Notes on using the rapid system switchover facility

(1) User server hot standby

When a system switchover occurs, the following processes are performed to start HiRDB on the standby system:

Activating these server processes accounts for a large part of the system switchover time. Because the activation time is proportional to the number of resident server processes, the system switchover time increases as the number of resident processes increases. To avoid this cost, the server processes of the standby system HiRDB can be activated in advance, so that no time is spent starting them during a switchover. This function is called user server hot standby, and it reduces the system switchover time by the amount that the activation processing would otherwise have taken. For example, activating one server process on a server machine operating at about 100 MIPS requires approximately 1 second, so eliminating that activation reduces the system switchover time by approximately 1 second for each such process.
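The estimate above can be sketched as simple arithmetic. The following Python fragment is an illustration only: the function name and the strictly linear model are assumptions, and the default of 1 second per process is the 100 MIPS example figure from the text, not a measured value for any particular machine.

```python
def estimated_saving_seconds(resident_processes: int,
                             seconds_per_process: float = 1.0) -> float:
    """Rough switchover time saved by activating server processes in advance.

    Assumes activation time grows linearly with the number of resident
    server processes; 1.0 s per process is the text's 100 MIPS example.
    """
    if resident_processes < 0:
        raise ValueError("resident_processes must be non-negative")
    return resident_processes * seconds_per_process


# Print the estimate for a few hypothetical process counts.
for n in (1, 10, 50):
    saved = estimated_saving_seconds(n)
    print(f"{n:>3} resident processes: about {saved:.0f} s saved")
```

Measure the actual activation time on your own server machine before relying on such an estimate.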

To use the user server hot standby function, specify Y in the pd_ha_server_process_standby operand.

(2) Rapid system switchover facility

Server processes or system servers for HiRDB on the standby system can be activated in advance, instead of during a system switchover. This function is called the rapid system switchover facility. The time required to perform a system switchover is reduced by the amount of time that would have been required for the activation processing of the server processes or system servers during a system switchover.

The rapid system switchover facility is more effective in reducing the time required to perform a system switchover than user server hot standby (the rapid system switchover facility includes the functionality of user server hot standby).

(a) Conditions for IP addresses

HiRDB single server configuration

The unit that uses the rapid system switchover facility cannot inherit IP addresses. Therefore, for a HiRDB single server configuration, configure the unit so that it does not inherit IP addresses.

HiRDB parallel server configuration

Configure the unit that uses the rapid system switchover facility so it does not inherit IP addresses. Specify N in the pd_ha_ipaddr_inherit operand in the unit control information definition for the applicable unit. Units that do not use the rapid system switchover facility can be configured to inherit IP addresses.

For system configuration examples, see 26.7.7(3) System configuration examples when the rapid system switchover facility is used.

If you are using HA Monitor as your cluster software

Before starting the running system and standby system HiRDB (or unit), activate the IP addresses specified in the -x and -c options of the pdunit operand. Do not specify any of these IP addresses in the .up or .down file for the alias operand value for HA Monitor. If IP addresses for client connections are to be inherited, specify those IP addresses in the .up and .down files. If there are no IP addresses to be inherited (such as IP addresses for client connections), either specify nouse in the lan_updown operand of the server definition file of HA Monitor or delete the .up and .down files for the alias operand value.

(b) Operands specified when using the rapid system switchover facility

To use the rapid system switchover facility, specify standbyunit in the pd_ha_agent operand.

For a HiRDB parallel server configuration, consider whether to use the transaction queuing facility. When the transaction queuing facility is used, fewer transaction errors will occur during system switchover. For details, see 26.7.10 Transaction queuing facility (for the rapid system switchover facility only).

(3) System configuration examples when the rapid system switchover facility is used

The following figure shows an example of a system configuration when the rapid system switchover facility is used.

Figure 26-94 System configuration example when the rapid system switchover facility is used

[Figure]

Explanation
  • Because the unit for the system manager and front-end server (unit 1) is configured to inherit IP addresses, it cannot use the rapid system switchover facility. This unit has a 1-to-1 switchover configuration that uses the user server hot standby function.
  • The back-end server units (unit 2 and unit 3) have a mutual system switchover configuration that uses the rapid system switchover facility.

The following examples show the specification of operands in the HiRDB system definition. These definition examples explain only operands related to the system common definition and unit control information definition.

[Figure] System common definition

set pd_ha = use
set pd_name_port = 20000
 
pdunit -x hostAA -u unt1 -d "/hirdb1" -p 20000                       1
pdunit -x hostB -u unt2 -d "/hirdb2" -c hostBB -p 20001              2
pdunit -x hostC -u unt3 -d "/hirdb3" -c hostCC -p 20002              3
 
pdstart -t MGR -u unt1
pdstart -t DIC -u unt1 -s DIC
pdstart -t FES -u unt1 -s FES
pdstart -t BES -u unt2 -s BES1
pdstart -t BES -u unt3 -s BES2

Explanation
  1. This is the definition for unit 1. Because this unit inherits IP addresses after a system switchover, the -c option is not specified.
  2. This is the definition for unit 2. Because this unit does not inherit IP addresses after a system switchover, the host name of the secondary system is specified in the -c option.
  3. This is the definition for unit 3. Because this unit does not inherit IP addresses after a system switchover, the host name of the secondary system is specified in the -c option.

[Figure] Unit control information definition (unit 1)

set pd_hostname = host1                             1
set pd_ha_acttype = server                          2
set pd_ha_server_process_standby = Y                3

Explanation
  1. Specifies the standard host name of the primary system.
  2. Executes the system switchover facility in the server mode. The server mode is required for a unit that uses user server hot standby.
  3. Specifies that user server hot standby is used for this unit.

[Figure] Unit control information definition (unit 2)

set pd_hostname = hostB                            1
set pd_ha_acttype = server                         2
set pd_ha_ipaddr_inherit = N                       3
set pd_ha_agent = standbyunit                      4

Explanation
  1. Specifies the host name of the primary system.
  2. Executes the system switchover facility in the server mode. The server mode is required for a unit that uses the rapid system switchover facility.
  3. Specifies that IP addresses are not to be inherited. Units that use the rapid system switchover facility cannot inherit IP addresses.
  4. Specifies that the rapid system switchover facility is used for this unit.

[Figure] Unit control information definition (unit 3)

set pd_hostname = hostC                           1
set pd_ha_acttype = server                        2
set pd_ha_ipaddr_inherit = N                      3
set pd_ha_agent = standbyunit                     4

Explanation
  1. Specifies the host name of the primary system.
  2. Executes the system switchover facility in the server mode. The server mode is required for a unit that uses the rapid system switchover facility.
  3. Specifies that IP addresses are not to be inherited. Units that use the rapid system switchover facility cannot inherit IP addresses.
  4. Specifies that the rapid system switchover facility is used for this unit.
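The constraints illustrated by these unit control information definitions can be checked mechanically before a unit is started. The following Python sketch is not a HiRDB tool; the function names are hypothetical, it understands only simple "set name = value" lines, and it assumes that a unit using the rapid system switchover facility states pd_ha_acttype and pd_ha_ipaddr_inherit explicitly, as in the examples for unit 2 and unit 3.

```python
def parse_set_operands(definition: str) -> dict[str, str]:
    """Collect 'set name = value' lines into a dict, ignoring other lines."""
    operands = {}
    for line in definition.splitlines():
        line = line.strip()
        if not line.startswith("set "):
            continue
        name, sep, value = line[4:].partition("=")
        if sep:
            operands[name.strip()] = value.strip()
    return operands


def check_rapid_switchover(definition: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    ops = parse_set_operands(definition)
    problems = []
    if ops.get("pd_ha_agent") != "standbyunit":
        # The unit does not use the rapid system switchover facility.
        return problems
    if ops.get("pd_ha_acttype") != "server":
        problems.append("pd_ha_acttype must be server")
    if ops.get("pd_ha_ipaddr_inherit") != "N":
        problems.append("pd_ha_ipaddr_inherit must be N")
    return problems


# Definition for unit 2 from the example, without the callout numbers.
unit2 = """
set pd_hostname = hostB
set pd_ha_acttype = server
set pd_ha_ipaddr_inherit = N
set pd_ha_agent = standbyunit
"""
print(check_rapid_switchover(unit2))
```

A definition that passes these two checks can still be invalid for other reasons; the sketch covers only the server mode and IP address conditions described in this subsection.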

(4) Checking procedure when startup of the standby system takes too much time

Some of the processing for starting the standby system waits until the running system is started. If startup of a standby system unit using the rapid system switchover facility is taking a long time, check the following items:

  1. Check whether the running system was started. Start the running system if it was not started.
  2. Use the pdls -d prc -a command to check whether the pdenvcp command that is issued internally by the HiRDB in the standby system has responded. If the pdenvcp command has not responded, check whether the _pd0envc command process remains on the running system. If this command process does remain, terminate it, and then restart the standby system.
  3. Use the pdls -d rpc command to check whether RPC or a file input/output process stopped the pdenvcp command that HiRDB in the primary system issues internally. If it did, eliminate the cause of the network or OS error, and then restart the standby system.
  4. If startup processing for the standby system during system switchover times out, redefine the value of the pd_system_complete_wait_time operand (completion wait time for the pdstart command). Set a value that takes into account the startup time of the standby system, and then restart the standby system.
  5. When a large number of lists are used, list initialization processing might increase the time required to perform a system switchover. In this case, consider changing the list initialization time with the pd_list_initialize_timing operand. For details about changing the list initialization time, see 13.21(9) Changes when initializing (deleted) lists.
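The order of the checks above can be summarized as a small decision function. This Python sketch only restates the procedure; it does not run pdls or any other HiRDB command. The function name and the boolean parameters are hypothetical, and you are assumed to have determined each value yourself from the command output and messages.

```python
def next_action(*, running_started: bool, pdenvcp_responded: bool,
                pd0envc_remains: bool, rpc_or_io_stalled: bool,
                startup_timed_out: bool, many_lists: bool) -> str:
    """Return the first applicable action, following steps 1 to 5 in order."""
    if not running_started:
        return "Start the running system."
    if not pdenvcp_responded and pd0envc_remains:
        return ("Terminate the _pd0envc process on the running system, "
                "then restart the standby system.")
    if rpc_or_io_stalled:
        return ("Eliminate the cause of the network or OS error, "
                "then restart the standby system.")
    if startup_timed_out:
        return ("Redefine pd_system_complete_wait_time, "
                "then restart the standby system.")
    if many_lists:
        return "Consider changing pd_list_initialize_timing."
    return "No cause identified by these checks."


# Hypothetical case: the running system is up, but startup timed out.
print(next_action(running_started=True, pdenvcp_responded=True,
                  pd0envc_remains=False, rpc_or_io_stalled=False,
                  startup_timed_out=True, many_lists=False))
```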

(5) Notes on using the rapid system switchover facility

(a) Operations requiring a restart of the HiRDB (or unit) in the standby system

After some of the operations listed in the table below, you must terminate the standby system HiRDB (the standby system unit for a HiRDB parallel server configuration), and then restart it. If you do not restart the standby system HiRDB in these cases, it will terminate abnormally when a system switchover occurs.

Table 26-40 Operations in which the standby system HiRDB (or unit) must be restarted

| Classification | Operation | Did a running system server restart after independently terminating normally? | Is it necessary to restart the standby system HiRDB (or unit)? |
|---|---|---|---|
| SQL execution | Execution of a definition SQL statement | No | N |
| | | Yes | Y |
| | Definition of the first abstract data type after HiRDB system construction (CREATE TYPE statement execution) | No | Y |
| | | Yes | Y |
| Operation command or utility execution | Database structure modification utility (pdmod) execution: Registration of HiRDB file system area generation | No | N |
| | | Yes | N |
| | Database structure modification utility (pdmod) execution: Deletion of HiRDB file system area generation | No | N |
| | | Yes | N |
| | Database structure modification utility (pdmod) execution: Auditor registration | No | N |
| | | Yes | N |
| | Database structure modification utility (pdmod) execution: Re-initialization of RDAREA without the with reconstruction operand specified | No | Y |
| | | Yes | Y |
| | Database structure modification utility (pdmod) execution: RDAREA attribute modification | No | Y |
| | | Yes | Y |
| | Database structure modification utility (pdmod) execution: Audit trail table creation | No | N |
| | | Yes | Y |
| | Database structure modification utility (pdmod) execution: Other operations | No | Y |
| | | Yes | Y |
| | Execution of the pddbchg command | No | N |
| | | Yes | Y |
| | Execution of online reorganization (execution of the pdorbegin or pdorend command) | No | N |
| | | Yes | Y |
| | First execution of the pdplgrgst command after HiRDB system construction (CREATE TYPE statement execution) | No | Y |
| | | Yes | Y |
| System common definition modification | Modification of one of the following definitions: global buffer definition (pdbuffer); maximum RDAREA count (pd_max_rdarea_no); maximum number of HiRDB files comprising an RDAREA (pd_max_file_no); maximum inner replica group count (pd_inner_replica_control); minimum guaranteed value for table reservation count (pd_assurance_table_no); minimum guaranteed value for index reservation (pd_assurance_index_no) | No | N |
| | | Yes | Y |

Legend:
Y: The HiRDB (or unit) must be restarted.
N: The HiRDB (or unit) need not be restarted.
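For operational scripts, Table 26-40 can be encoded as a lookup. The following Python sketch transcribes the Y and N values from the table; the dictionary keys and the function name are hypothetical labels chosen for illustration and are not HiRDB identifiers.

```python
# Each value is (restart needed if the running system server did NOT restart,
#                restart needed if the running system server DID restart).
TABLE_26_40 = {
    "definition_sql": (False, True),
    "first_abstract_data_type": (True, True),
    "pdmod_register_file_system_area_generation": (False, False),
    "pdmod_delete_file_system_area_generation": (False, False),
    "pdmod_auditor_registration": (False, False),
    "pdmod_reinitialize_rdarea_without_reconstruction": (True, True),
    "pdmod_rdarea_attribute_modification": (True, True),
    "pdmod_audit_trail_table_creation": (False, True),
    "pdmod_other_operations": (True, True),
    "pddbchg": (False, True),
    "online_reorganization": (False, True),
    "first_pdplgrgst": (True, True),
    "system_common_definition_modification": (False, True),
}


def standby_restart_required(operation: str, server_restarted: bool) -> bool:
    """Return True if the standby system HiRDB (or unit) must be restarted."""
    if_not_restarted, if_restarted = TABLE_26_40[operation]
    return if_restarted if server_restarted else if_not_restarted


# Hypothetical query: pddbchg was executed and the server restarted afterward.
print(standby_restart_required("pddbchg", server_restarted=True))
```

An unknown operation key raises KeyError rather than guessing, because skipping a required restart causes the standby system to terminate abnormally at switchover.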

If HiRDB in the standby system terminated abnormally, use the pdstart command (pdstart -u or pdstart -q for a HiRDB parallel server configuration) to start HiRDB in the standby system.

(b) RDAREA opening trigger attributes

Standby system units that are subject to the rapid system switchover facility do not open any RDAREAs when in waiting status. To minimize the time required to perform a system switchover, the rapid system switchover facility opens only the RDAREAs needed for full recovery when a system switchover occurs. Therefore, the RDAREA opening trigger is as follows:

For details about the RDAREA opening trigger, see 15.6 Modifying an RDAREA opening trigger attribute (RDAREA modification).

(c) Linking with OLTP products

Caution is urged when all of the following conditions are present:

In such a case, if the OLTP products perform recovery processing on an undetermined transaction, the X/Open-compliant API might return an error and not be able to recover the transaction. If this problem occurs, upgrade the HiRDB client to version 06-02-/B or later. If you cannot immediately upgrade the HiRDB client because, for example, you do not wish to stop the current job task, switch the HiRDB (or unit) in the primary system from the standby system to the running system. However, this is only a temporary measure. It is important to upgrade the HiRDB client after the current job task has been completed.