25.5.3 Creating HiRDB system definitions

Organization of this subsection
(1) Standby system switchover facility
(2) Standby-less system switchover (1:1) facility
(3) Standby-less system switchover (effects distributed) facility
(4) HiRDB system definition operands related to the system switchover facility
(5) Specifying the switching destination (standby-less system switchover (effects distributed) facility only)
(6) Allocating server processes following system switchover

(1) Standby system switchover facility

Use the same HiRDB system definitions in the primary system and the secondary system. Create the HiRDB system definitions for the primary system, then copy those HiRDB system definitions to the secondary system. Figures 25-32 and 25-33 show configuration examples of the HiRDB system definition files.

Figure 25-32 Configuration example of HiRDB system definition files when using the standby system switchover facility (for a HiRDB/Single Server)

[Figure]

Figure 25-33 Configuration example of HiRDB system definition files when using the standby system switchover facility (for a HiRDB/Parallel Server)

[Figure]

(2) Standby-less system switchover (1:1) facility

Copy the unit control information definition file and back-end server definition file of the normal BES unit to the alternate BES unit. Change the name of the unit control information definition file as shown below:

pdutsys.unit-identifier-of-normal-BES-unit

Of the operands specified in this definition file, those whose settings become effective during alternation are listed below. For all other operands (other than those listed below), the values that are set in the unit control information definition file of the alternate BES unit are effective.

Figure 25-34 shows a configuration example of the HiRDB system definition files when using the standby-less system switchover facility (mutual alternating configuration).

Figure 25-34 Configuration example of HiRDB system definition files when using the standby-less system switchover facility (mutual alternating configuration)

[Figure]

Explanation
  • Copy the unit control information definition file and back-end server definition file of the normal BES unit (BES1) to the alternate BES unit (BES2). Then change the name of the unit control information definition file to pdutsys.UNT1.
  • Copy the unit control information definition file and back-end server definition file of the normal BES unit (BES2) to the alternate BES unit (BES1). Then change the name of the unit control information definition file to pdutsys.UNT2.

(3) Standby-less system switchover (effects distributed) facility

Table 25-13 shows how system definition files are used when standby-less system switchover (effects distributed) is used.

Table 25-13 Use of system definition files when standby-less system switchover (effects distributed) is used

Definition type | Use of definition files
System common definition | Copy files to all units within the system. Specify in the system common definition the parameters that are to be set as the default values for back-end server definitions.
Unit control information definition | Specify only the following operands (operands that cannot be specified in the system common definition):
  • pd_unit_id
  • pd_hostname
  • pd_ha_unit
  • pd_rpl_hdepath
  • pd_ha_restart_failure
  • pd_ha_acttype
  • pd_ha_server_process_standby
  • pd_ha_agent
  • pd_ha_max_act_guest_servers
  • pd_ha_max_server_process
  • pd_ha_resource_act_wait_time
  • pd_ha_process_count
  • pd_syssts_file_name_1 to 7
  • pd_syssts_initial_error
  • pd_syssts_last_active_file
  • pd_syssts_last_active_side
  • pd_syssts_singleoperation
  Specify all other operands in the system common definition or in back-end server definitions. If any operand other than those listed above is specified, an error occurs (and the KFPS05618-E message is output).
Server common definition | Copy files to all units within the HA group.
Back-end server definition | Copy files to all units within the HA group.
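
The restriction on the unit control information definition in Table 25-13 can be checked before the definition files are distributed. The following is a minimal sketch, not a HiRDB utility: the function name is hypothetical, and it assumes that the operand names have already been extracted from the definition file and that "pd_syssts_file_name_1 to 7" expands to seven numbered operands.

```python
# Operands permitted in a unit control information definition when
# standby-less system switchover (effects distributed) is used (Table 25-13).
ALLOWED_UNIT_OPERANDS = {
    "pd_unit_id", "pd_hostname", "pd_ha_unit", "pd_rpl_hdepath",
    "pd_ha_restart_failure", "pd_ha_acttype", "pd_ha_server_process_standby",
    "pd_ha_agent", "pd_ha_max_act_guest_servers", "pd_ha_max_server_process",
    "pd_ha_resource_act_wait_time", "pd_ha_process_count",
    "pd_syssts_initial_error", "pd_syssts_last_active_file",
    "pd_syssts_last_active_side", "pd_syssts_singleoperation",
}
# Assumption: "pd_syssts_file_name_1 to 7" means seven numbered operands.
ALLOWED_UNIT_OPERANDS |= {f"pd_syssts_file_name_{n}" for n in range(1, 8)}


def find_disallowed_operands(operand_names):
    """Return, sorted, the operand names that Table 25-13 does not permit."""
    return sorted(set(operand_names) - ALLOWED_UNIT_OPERANDS)


# pd_max_users is not in the permitted list, so it is reported.
print(find_disallowed_operands(["pd_unit_id", "pd_ha_agent", "pd_max_users"]))
```

Any name that this check reports would have to be moved to the system common definition or to a back-end server definition.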

(4) HiRDB system definition operands related to the system switchover facility

Table 25-14 explains the HiRDB system definition operands that relate to the system switchover facility.

Table 25-14 HiRDB system definition operands related to the system switchover facility

Operand name | Explanation and notes
pd_ha | Specifies that the system switchover facility is to be used.
pd_ha_ipaddr_inherit | Specifies whether or not IP addresses are to be inherited after a system switchover. Specify N for units using the rapid system switchover facility. Omit this operand for units using the standby-less system switchover facility.
  Y: Inherit IP addresses after system switchover.
  N: Do not inherit IP addresses after system switchover.
pd_ha_unit | Specify nouse for any unit that is not to use the system switchover facility. You must specify nouse for a recovery-unnecessary front-end server.
pd_ha_acttype | Specifies whether the system switchover facility is to be used in the monitor mode or the server mode. The server mode cannot be used when the system switchover facility uses Sun Cluster, HACMP, or ClusterPerfect.
  monitor: Operate the system switchover facility in the monitor mode.
  server: Operate the system switchover facility in the server mode.
pd_ha_restart_failure | When operating the system switchover facility in the monitor mode, specifies a command to be executed if restarting HiRDB fails. This operand has no effect when you use the server mode.
pd_ha_server_process_standby | Specifies whether or not user server hot standby is to be used.
  Y: Use user server hot standby.
  N: Do not use user server hot standby.
pd_ha_agent | Specifies the system switchover facility to be used:
  standbyunit: Rapid system switchover facility
  server: Standby-less system switchover (1:1) facility
  activeunits: Standby-less system switchover (effects distributed) facility
pd_ha_transaction, pd_ha_trn_queuing_wait_time, pd_ha_trn_restart_retry_time |
  • Specify these operands when you use the transaction queuing facility.
  • If you specify queuing in the pd_ha_transaction operand and the maximum number of concurrent connections (value of the pd_max_users operand) is exceeded, the HiRDB client retries connecting to the HiRDB server only for a period equal to pd_ha_trn_queuing_wait_time + pd_ha_trn_restart_retry_time.
pd_ha_switch_timeout | This operand can be specified when the server mode is used. This operand is invalid if it is specified in the monitor mode.
  This operand specifies whether or not system switchover is to be performed without waiting for HiRDB termination processing when termination processing of HiRDB (or the unit for a HiRDB/Parallel Server) during system switchover exceeds the server failure monitoring time. Server failure monitoring time refers to the time specified in the patrol operand of HA monitor or Hitachi HA Toolkit Extension.
  Y: Switch systems without waiting for HiRDB termination processing when HiRDB termination processing during system switchover exceeds the server failure monitoring time.
  N: Do not switch systems until HiRDB termination processing during system switchover terminates.
pd_ha_prc_cleanup_check | Specifies whether or not system switchover processing is to be placed on hold until HiRDB processes have terminated. For details, see 25.5.2(2) Shared disk access control.
pd_mode_conf | This operand is related to HiRDB (or unit) startup. Specify this operand as follows:
  When the monitor mode is used, specify MANUAL1.
  When the server mode is used, specify one of the following:
  • MANUAL2 if switch is specified in the switchtype operand of the servers definition of Hitachi HA Toolkit Extension.
  • MANUAL1 if restart or manual is specified in the switchtype operand of the servers definition of Hitachi HA Toolkit Extension.
pd_hostname | Specifies the standard host name of the primary system. When using the standby-less system switchover facility, specifies the unit's standard host name. (This is the same as when not using the system switchover facility.)
pdunit -x | Specifies the host name of the primary system. When using the standby-less system switchover facility, specifies the unit's host name. (This is the same as when not using the system switchover facility.)
pdunit -u | Specifies the unit identifier.
pdunit -d | Specifies the HiRDB directory name. When using the standby-less system switchover (1:1) facility, specify the same directory name for the normal BES unit and the alternate BES unit. When using the standby-less system switchover (effects distributed) facility, specify the same directory name for all units within the HA group.
pdunit -c | Specifies the host name of the secondary system. Specify this option when not inheriting IP addresses after system switchover. Omit this option when using the standby-less system switchover facility.
pdunit -p | Specifies the port number. Specify this option when using a utility special unit or HiRDB/Parallel Server. When using the standby-less system switchover (1:1) facility, specify the same port number for the normal BES unit and the alternate BES unit. When using the standby-less system switchover (effects distributed) facility, specify the same port number for all units within the HA group.
pdstart -c | Specifies the alternate BES name. Specify this option when using the standby-less system switchover (1:1) facility.
pdstart -g | When using the standby-less system switchover (effects distributed) facility, specify the identifier of the HA group that constitutes the set of units that become server switching destinations.
pdbuffer -c | Specify this option when allocating global buffers that the alternate portion uses when alternating units. Specify this option when using a standby-less system switchover facility. For details when using the standby-less system switchover (1:1) facility, see 25.5.7 Definition of global buffers (standby-less system switchover (1:1) facility only); for details when using the standby-less system switchover (effects distributed) facility, see 25.5.8 Definition of global buffers (standby-less system switchover (effects distributed) facility only).
pdhagroup -g | To use the standby-less system switchover (effects distributed) facility, you define an HA group that will constitute the set of units that will become server switching destinations. Specify an identifier that will uniquely identify this HA group within the system.
pdhagroup -u | Specifies the unit identifiers of the units that are to comprise the HA group.
pd_ha_max_act_guest_servers | When using the standby-less system switchover (effects distributed) facility, specifies the maximum number of guest BESs that will be permitted to run concurrently in a unit.
pd_ha_max_server_process | When using the standby-less system switchover (effects distributed) facility, specifies the maximum permissible number of active user server processes in a unit.
pd_ha_resource_act_wait_time | When the standby-less system switchover (effects distributed) facility is used, specifies the maximum time to wait until the running server's resources are activated when the unit is started.
pd_service_port | Care must be exercised in specifying this operand in a server machine configuration that includes multiple units (including a mutual system switchover configuration). For such a configuration, use this operand to specify a separate port number for each unit in its unit control information definition.
  If either of the following specifications is made, system switchover to one of the units fails:
  • The pd_service_port operand of the system common definition is specified (when the pd_service_port operand of the unit control information definition is not specified).
  • A port number that is specified in the pd_service_port operand of another unit control information definition is specified in the pd_service_port operand of the unit control information definition.
pd_redo_allpage_put | When Y is specified in this operand, all pages that have been updated since a synchronization point are written into the database during full recovery processing that occurs when HiRDB is restarted. This can eliminate inconsistencies between the original and duplicate volumes that occurred during system switchover.
  For details about how to handle inconsistencies between the original and duplicate volumes, see 18.24 Actions to take when a mismatch occurs between the original and the mirror duplicate.
pd_ha_mgr_rerun | When notwait is specified in this operand, HiRDB does not wait to receive a processing startup completion notice from each unit when switching system manager units (when starting processing at the switching destination). As a result, system manager units can be switched even when some units are stopped.
  For details about the operation method, see 25.21 Actions to take when a stopped unit prevents switching of the system manager unit.
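
The pd_mode_conf row of Table 25-14 amounts to a small decision rule. The following sketch restates that rule for illustration only; the function name and its arguments are hypothetical, and the accepted values are limited to those named in the table.

```python
def recommended_pd_mode_conf(acttype, switchtype=None):
    """Return the pd_mode_conf value that Table 25-14 calls for.

    acttype    -- value of pd_ha_acttype ("monitor" or "server")
    switchtype -- switchtype operand of the servers definition of
                  Hitachi HA Toolkit Extension (server mode only)
    """
    if acttype == "monitor":
        return "MANUAL1"
    if acttype == "server":
        if switchtype == "switch":
            return "MANUAL2"
        if switchtype in ("restart", "manual"):
            return "MANUAL1"
        raise ValueError(f"switchtype not covered by the table: {switchtype!r}")
    raise ValueError(f"unknown pd_ha_acttype value: {acttype!r}")


print(recommended_pd_mode_conf("monitor"))           # MANUAL1
print(recommended_pd_mode_conf("server", "switch"))  # MANUAL2
```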

(5) Specifying the switching destination (standby-less system switchover (effects distributed) facility only)

When the standby-less system switchover (effects distributed) facility is used, the method of determining the switching destination differs significantly from when the other system switchover facilities are used.

(a) Accepting unit

Because the standby-less system switchover (effects distributed) facility switches systems on a server-by-server basis, a switching destination must be specified for each server. You may specify multiple accepting units for a server. Multiple accepting units are defined as an HA group; you must specify an HA group as the switching destination for each server.

When you use the standby-less system switchover (effects distributed) facility, you can also specify the maximum number of guest BESs that will be permitted to run concurrently in each unit (pd_ha_max_act_guest_servers).

Figure 25-35 shows an example of an HA group configuration.

Figure 25-35 HA group configuration example

[Figure]

pdhagroup -g hag1 -u unt1,unt2,unt3,unt4

pdstart -t BES -s bes1A -u unt1 -g hag1
pdstart -t BES -s bes1B -u unt1 -g hag1
pdstart -t BES -s bes1C -u unt1 -g hag1
pdstart -t BES -s bes2A -u unt2 -g hag1
pdstart -t BES -s bes2B -u unt2 -g hag1
pdstart -t BES -s bes2C -u unt2 -g hag1
pdstart -t BES -s bes2D -u unt2 -g hag1
pdstart -t BES -s bes3A -u unt3 -g hag1
pdstart -t BES -s bes3B -u unt3 -g hag1
pdstart -t BES -s bes3C -u unt3 -g hag1
pdstart -t BES -s bes4A -u unt4 -g hag1
pdstart -t BES -s bes4B -u unt4 -g hag1

(b) HA group definition

You use the HiRDB system definition to define an HA group. Specify a name for the HA group in the -g option of the pdhagroup operand, and specify in the -u option the unit identifiers of the units that will comprise the HA group.

You can specify only one HA group in each system definition.

Example: pdhagroup -g hag1 -u unt1,unt2,unt3,unt4
Defines an HA group named hag1 that consists of unt1, unt2, unt3, and unt4.

The following restrictions apply to HA groups:

Each unit comprising an HA group must satisfy all the following conditions:

  1. Because a unit that contains no host BES (an accepting-only unit) cannot belong to an HA group, each unit belonging to an HA group must contain at least one host BES.
  2. All servers that comprise a unit belonging to an HA group must be back-end servers; an HA group unit cannot contain any server whose server type is other than BES.
  3. The only type of system switchover that can be used for units belonging to an HA group is standby-less system switchover (effects distributed). This means that for units belonging to an HA group, the only value that can be specified in the pd_ha_agent operand is activeunits.
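
The three conditions above can be expressed as a simple check on each candidate unit. This is a sketch under stated assumptions: the function name and the way a unit is described (a list of the server types of its host servers, plus its pd_ha_agent value) are hypothetical and do not correspond to any HiRDB interface.

```python
def check_ha_group_unit(unit_id, host_server_types, pd_ha_agent):
    """Return a list of violated HA group conditions for one unit."""
    problems = []
    # Condition 1: the unit must contain at least one host BES.
    if "BES" not in host_server_types:
        problems.append(f"{unit_id}: contains no host BES")
    # Condition 2: every server in the unit must be a back-end server.
    if any(server_type != "BES" for server_type in host_server_types):
        problems.append(f"{unit_id}: contains a server that is not a BES")
    # Condition 3: only the effects-distributed facility may be used.
    if pd_ha_agent != "activeunits":
        problems.append(f"{unit_id}: pd_ha_agent must be activeunits")
    return problems


print(check_ha_group_unit("unt1", ["BES", "BES", "BES"], "activeunits"))  # []
```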

(c) Specifying an accepting unit

In the HiRDB system definition, you specify in the -g option of the pdstart command the HA group to which an accepting unit belongs.

You must specify the -g option for all servers that belong to a unit to which standby-less system switchover (effects distributed) is applicable.

Example: pdstart -t BES -s bes1A -u unt1 -g hag1
When unt1 or bes1A terminates abnormally, processing for bes1A can be accepted by a unit belonging to the HA group named hag1.

You should note the following about specifying the -g option:

  1. Both the regular unit and the accepting unit must consist exclusively of back-end servers.
    • BES must be specified in the -t option.
    • Each unit belonging to the HA group specified by the -g option must not contain any server whose server type is not BES.
  2. The number of servers comprising a regular unit need not be the same as the number of servers comprising an accepting unit.
    • The number of servers in the unit specified by the -u option (regular unit) need not be the same as the number of servers in the unit belonging to the HA group specified by the -g option (accepting unit).

(d) Specifying the maximum number of concurrently running guest BESs

You can specify in the pd_ha_max_act_guest_servers operand of the unit control information definition the maximum number of guest BESs that will be permitted to operate concurrently as running systems in a unit. The purpose of this specification is to reduce the amount of resources required by guest BESs. It can also prevent excessive increases in workload.

Example: pd_ha_max_act_guest_servers = 2

The maximum value that can be specified in the pd_ha_max_act_guest_servers operand is the number obtained by subtracting the number of servers in the local unit from the number of servers in the HA group. If you specify a value greater than this maximum, the maximum value will be set in the pd_ha_max_act_guest_servers operand. The number of host BESs plus the value of the pd_ha_max_act_guest_servers operand cannot exceed 34.
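
The limits described above can be worked through with the configuration in Figure 25-35, where the HA group hag1 contains 12 back-end servers and unt1 hosts 3 of them. The following sketch only restates that arithmetic; the function names are hypothetical.

```python
MAX_SERVERS_PER_UNIT = 34  # host BESs + pd_ha_max_act_guest_servers (from the text)


def effective_max_act_guest_servers(specified, servers_in_ha_group,
                                    servers_in_local_unit):
    """Value that takes effect: the specified value, capped at the maximum."""
    upper_limit = servers_in_ha_group - servers_in_local_unit
    return min(specified, upper_limit)


def within_unit_server_limit(host_bes_count, max_act_guest_servers):
    """True if host BESs plus permitted guest BESs do not exceed 34."""
    return host_bes_count + max_act_guest_servers <= MAX_SERVERS_PER_UNIT


# unt1 in Figure 25-35: 12 servers in hag1, 3 of them in unt1, so the maximum is 9.
print(effective_max_act_guest_servers(2, 12, 3))   # 2
print(effective_max_act_guest_servers(20, 12, 3))  # 9
print(within_unit_server_limit(3, 9))              # True
```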

The number of guest BESs that are in accepting status in a unit is not restricted. However, when the number of guest BESs that are operating as running systems in a unit reaches the value specified in the pd_ha_max_act_guest_servers operand, acceptability is cancelled for all the non-active guest BESs.

Once the number of erroneous BESs in an HA group exceeds the combined total number of free guest areas in the running units in the HA group, any subsequent error will cause some servers to stop and their processing will be suspended.
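
The condition in the preceding paragraph can be estimated when sizing an HA group. The sketch below assumes that a unit's free guest areas equal its pd_ha_max_act_guest_servers value minus the number of guest BESs it is currently running; that interpretation, and the function names, are assumptions made for illustration.

```python
def free_guest_areas(running_units):
    """Total free guest areas across the running units of an HA group.

    running_units -- iterable of (pd_ha_max_act_guest_servers,
                     guest BESs currently running) pairs, one per running unit
    """
    return sum(limit - running for limit, running in running_units)


def can_accept_all(bes_awaiting_acceptance, running_units):
    """True if every BES that needs an accepting unit can still be accepted."""
    return bes_awaiting_acceptance <= free_guest_areas(running_units)


# Three running units, each permitting 2 guests; one guest is already running.
units = [(2, 1), (2, 0), (2, 0)]
print(free_guest_areas(units))   # 5
print(can_accept_all(4, units))  # True
print(can_accept_all(6, units))  # False
```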

(6) Allocating server processes following system switchover

(a) Standby-less system switchover (1:1)

Once standby-less system switchover (1:1) occurs, the alternate BES unit both executes its own processing and takes over the normal BES's processing. For this to occur, server processes are allocated to the alternate BES's original processing as well as to the normal BES's processing. The number of server processes allocated to each varies according to need. However, the maximum number of active alternate BES processes (value of the pd_max_bes_process operand) is also the maximum for the combined total of the number of processes for both BESs. This prevents an excessive increase in workload at the alternate BES after system switchover. Be aware, however, that because both BESs now share the alternate BES's limit, the maximum number of service requests that can be processed concurrently after system switchover is limited to one half of the original. For this reason, when you specify the pd_max_bes_process operand for the alternate BES, you should take into consideration both the increase in the unit's workload and the number of service requests that can be processed concurrently.

If a safety margin has been built into the number of resident processes before system switchover (value of the pd_process_count operand), so that some resident processes are not actually processing service requests, those idle resident processes are available to take over the normal BES's processing after system switchover. As a result, processing performance after system switchover improves.

Figure 25-36 shows allocation of server processes following standby-less system switchover (1:1) (Part 1).

Figure 25-36 Allocation of server processes following standby-less system switchover (1:1) (Part 1)

[Figure]

Before system switchover occurs, the maximum number of processes that can be executed concurrently equals the value of the pd_max_bes_process operand specified for the alternate BES (bes1). Additionally, as many server processes as the value of the pd_process_count operand for the alternate BES (bes1) can be kept resident.

When system switchover occurs, processing for the normal BES (bes2) begins using available resident processes of the alternate BES (bes1). Therefore, there is no need to start processes for the normal BES (bes2) and the processing of the normal BES (bes2) resumes immediately following system switchover. Moreover, there is no need to start standby processes for the normal BES (bes2) before system switchover.

Once all resident processes are being used, additional processes are started as needed. However, the total number of processes is limited to the value of the pd_max_bes_process operand for the alternate BES (bes1).

Figure 25-37 shows allocation of server processes following standby-less system switchover (1:1) (Part 2).

Figure 25-37 Allocation of server processes following standby-less system switchover (1:1) (Part 2)

[Figure]

After system switchover, while the alternate BES (bes1) is handling the processing of the normal BES (bes2), processes are started as needed, up to the value of the pd_max_bes_process operand of the alternate BES, and are allocated to the processing of both the alternate BES (bes1) and the normal BES (bes2).

When there are processing requests only for the alternate BES (bes1), as many processes as the value of the pd_max_bes_process operand for the alternate BES (bes1) can be executed concurrently for the alternate BES (bes1).

When there are processing requests only for the normal BES (bes2), as many processes as the value of the pd_max_bes_process operand for the alternate BES (bes1) can be executed concurrently for the normal BES (bes2).
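
The shared limit described for Figures 25-36 and 25-37 can be pictured as a single counter covering both BESs. The following is an illustrative sketch, not HiRDB's internal logic; the function name and the example value 20 are hypothetical.

```python
def can_start_process(running_for_alternate, running_for_normal,
                      pd_max_bes_process):
    """True if one more server process may start on the alternate BES unit.

    After switchover, processes working for the alternate BES (bes1) and for
    the normal BES (bes2) count against the same limit: the alternate BES's
    pd_max_bes_process value.
    """
    return running_for_alternate + running_for_normal < pd_max_bes_process


# With pd_max_bes_process = 20 for bes1:
print(can_start_process(12, 7, 20))  # True  (19 running, one slot left)
print(can_start_process(12, 8, 20))  # False (combined total has reached 20)
print(can_start_process(0, 19, 20))  # True  (bes2 alone may use the full limit)
```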

(b) Standby-less system switchover (effects distributed) facility

Even though standby-less system switchover (effects distributed) has occurred, an accepting unit can continue to accept guest servers until the number of running guest servers reaches the value of the pd_ha_max_act_guest_servers operand.

At an accepting unit, the host BESs and guest BESs individually start server processes up to the maximum number of processes that can be started (value of the pd_max_bes_process operand). However, the total number of server processes in a unit is limited to the value of the pd_ha_max_server_process operand. This prevents an excessive increase in workload at the accepting unit. However, you should be aware that there may be an upper limit to the number of service requests that can be processed concurrently after system switchover. For this reason, when you specify the pd_ha_max_server_process operand, you should take into consideration both the increase in the unit's workload following system switchover and the number of service requests that can be processed concurrently.

If a safety margin has been built into the number of resident processes before system switchover (value of the pd_process_count operand), so that some resident processes are not actually processing service requests, those idle resident processes are available to take over the guest BES's processing after system switchover. As a result, processing performance after system switchover improves.

On the other hand, when the number of resident processes is set too large, processes that are not processing service requests may cause the number of processes to reach the value of the pd_ha_max_server_process operand. As a result, it may not be possible to process additional service requests even though the number of processes that have been started by other servers has not reached the value of the pd_max_bes_process operand.

It is advisable to keep the ratio between the total number of resident processes in a unit and the total of the maximum number of running processes the same before and after guest servers are accepted. For this purpose, the total number of resident processes in a unit after guest servers are accepted is restricted by the pd_ha_process_count operand. The actual number of resident processes for each server is the smaller of the following: the number obtained by allocating the value of the pd_ha_process_count operand proportionally, based on the values of the pd_process_count operands of the servers that are running in the unit, or the actual value of the server's pd_process_count operand.

The meanings of the operands related to number of processes are explained below:

Figure 25-38 shows allocation of server processes following standby-less system switchover (effects distributed) (Part 1).

Figure 25-38 Allocation of server processes following standby-less system switchover (effects distributed) (Part 1)

[Figure]

Before system switchover occurs, each host BES (bes1 and bes2) can execute concurrently as many processes as the value of its pd_max_bes_process operand. For each, as many server processes as the value of its pd_process_count operand can also be made resident.

When system switchover occurs, resident processes in the host BESs (bes1 and bes2) are used to provide processes for the guest BES (bes3). Therefore, there is no need to start processes for the guest BES (bes3) and processing of the guest BES (bes3) can begin immediately following system switchover. Moreover, there is no need to start standby processes for the guest BES (bes3) before system switchover.

Each server starts processes as needed up to the value of its pd_max_bes_process operand, but the combined total number of server processes in the unit is limited by the value of the pd_ha_max_server_process operand.

Also, the number of resident processes in each server is adjusted so that the combined total number of resident processes in the units equals the value of the pd_ha_process_count operand. The value of the pd_ha_process_count operand is allocated among the servers so that the number of resident processes for each server after adjustment maintains the ratio determined by each server's pd_process_count operand value.
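
The proportional adjustment of resident processes can be illustrated with a short calculation. This is a sketch under stated assumptions: the text does not say how fractional shares are rounded, so rounding down is assumed here, and the function name and sample values are hypothetical.

```python
def resident_process_counts(pd_ha_process_count, pd_process_counts):
    """Estimate resident processes per server after guest BESs are accepted.

    pd_process_counts -- dict mapping each server running in the unit to its
                         pd_process_count value
    Each server receives a share of pd_ha_process_count in proportion to its
    pd_process_count, but never more than its own pd_process_count.
    """
    total = sum(pd_process_counts.values())
    result = {}
    for server, count in pd_process_counts.items():
        # Assumption: fractional shares are rounded down.
        share = pd_ha_process_count * count // total if total else 0
        result[server] = min(share, count)
    return result


# Host BESs bes1 and bes2 plus guest BES bes3, with pd_ha_process_count = 12.
print(resident_process_counts(12, {"bes1": 10, "bes2": 10, "bes3": 20}))
# {'bes1': 3, 'bes2': 3, 'bes3': 6}
```

With a pd_ha_process_count value larger than the sum of the pd_process_count values, each server simply keeps its own pd_process_count.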

Figure 25-39 shows allocation of server processes following standby-less system switchover (effects distributed) (Part 2).

Figure 25-39 Allocation of server processes following standby-less system switchover (effects distributed) (Part 2)

[Figure]

After system switchover and once the guest BES (bes3) has been accepted, the processes of the host BESs (bes1 and bes2) and the processes of the guest BES (bes3) are started, as long as the number of processes in the unit does not exceed the value of the pd_ha_max_server_process operand.

If the number of processing requests to a particular host BES (bes1, for example) is especially large, processes can be executed concurrently up to the value of the pd_max_bes_process operand of that host BES (bes1). However, the number of processing requests that can be handled by other servers (bes3, for example) decreases accordingly.
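
The interaction between the per-server limit and the unit-wide limit can be sketched as follows. The function name, the data layout, and the sample values are hypothetical; the sketch only restates the two limits described above.

```python
def can_start_server_process(server, running, pd_max_bes_process,
                             pd_ha_max_server_process):
    """True if `server` may start one more process in the accepting unit.

    running            -- dict: server name -> processes currently running
    pd_max_bes_process -- dict: server name -> that server's own limit
    """
    under_server_limit = running[server] < pd_max_bes_process[server]
    under_unit_limit = sum(running.values()) < pd_ha_max_server_process
    return under_server_limit and under_unit_limit


limits = {"bes1": 20, "bes2": 20, "bes3": 20}

# bes1 is busy and has consumed most of the unit-wide limit of 20, so bes3
# cannot start a process although it is far below its own limit.
busy = {"bes1": 18, "bes2": 1, "bes3": 1}
print(can_start_server_process("bes3", busy, limits, 20))  # False

# With a lighter load on bes1, bes3 can start a process.
light = {"bes1": 5, "bes2": 1, "bes3": 1}
print(can_start_server_process("bes3", light, limits, 20))  # True
```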