Scalable Database Server, HiRDB Version 8 Description


8.1.1 Overview of the system switchover facility

The system switchover facility includes the standby system switchover facility and the standby-less system switchover facility.

Organization of this subsection
(1) Overview of the standby system switchover facility
(2) Overview of the standby-less system switchover facility
(3) Shared disk unit

(1) Overview of the standby system switchover facility

If you deploy a standby HiRDB separately from the HiRDB that is actively processing jobs, job processing can be switched over automatically to the standby HiRDB when a failure occurs on the server machine or on HiRDB. This capability is called the system switchover facility (standby system switchover facility). Job processing is interrupted from the time the failure occurs until processing has been switched over to the standby HiRDB. The purpose of the system switchover facility is to keep this downtime to a minimum.

You implement the system switchover facility with a cluster system configuration consisting of multiple server machines. For HiRDB/Single Server, the system is switched over on a per-system basis. For HiRDB/Parallel Server, the system is switched over on a per-unit basis.

The system on which jobs are currently being processed is called the running system, and the system that is currently in reserve is called the standby system. Whenever a system switchover occurs, the running system and the standby system are swapped. In addition, to distinguish between the two systems while you are building the systems and configuring the environments, the system that is initially started as the running system is called the primary system, and the system that is first started as the standby system is called the secondary system. Although the running system and the standby system change when a system switchover occurs, the primary system and secondary system do not. Figure 8-1 provides an overview of the system switchover facility (standby system switchover facility).

Figure 8-1 Overview of the system switchover facility (standby system switchover facility)

[Figure]

* For details about the shared disk unit, see (3) Shared disk unit below.

Explanation
If a failure occurs on the running system while it is processing jobs, the failure occurrence is reported to the standby system, and the system is switched over. The standby system becomes the running system, and resumes job processing.
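
The distinction between the fixed roles (primary, secondary) and the roles that change at switchover (running, standby) can be illustrated with a small model. The following is only a sketch: the class ClusterPair, its fields, and the host names hostA and hostB are hypothetical names chosen for illustration and are not part of any HiRDB interface.

```python
from dataclasses import dataclass


@dataclass
class ClusterPair:
    """Toy model of a switchover pair (hypothetical, not a HiRDB API)."""
    primary: str        # fixed when the systems are built
    secondary: str      # fixed when the systems are built
    running: str = ""   # changes at every system switchover
    standby: str = ""   # changes at every system switchover

    def __post_init__(self):
        # The primary system is the one first started as the running system.
        self.running = self.primary
        self.standby = self.secondary

    def switch_over(self):
        """Swap running and standby; primary and secondary stay unchanged."""
        self.running, self.standby = self.standby, self.running


pair = ClusterPair(primary="hostA", secondary="hostB")
pair.switch_over()                    # a failure occurs on hostA
print(pair.running, pair.standby)     # hostB hostA
print(pair.primary, pair.secondary)   # hostA hostB
```

After the switchover, hostB is the running system but remains the secondary system, which matches the terminology described above.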

(2) Overview of the standby-less system switchover facility

The system switchover facility includes the previously described standby system switchover facility and the standby-less system switchover facility. The standby-less system switchover facility is further classified as follows:
  • Standby-less system switchover (1:1) facility
  • Standby-less system switchover (effects distributed) facility

The standby-less system switchover facility can be used in a back-end server unit of a HiRDB/Parallel Server; it cannot be used in a unit in which a server other than a back-end server resides.

Unlike the standby system switchover facility, with the standby-less system switchover facility you do not have to allocate a standby unit. When a failure occurs, the system does not switch over to a standby unit; instead, the system switches over to another unit whose currently running back-end server takes over the back-end server processing of the failed unit. This is called the standby-less system switchover facility.

(a) Standby-less system switchover (1:1) facility

With the standby-less system switchover (1:1) facility, there is a one-to-one relationship between the unit on which the failure occurs and the unit whose back-end server takes over processing.

A back-end server whose processing is transferred to another unit when a failure occurs is called a normal BES, and a back-end server that takes over processing is called an alternate BES. Similarly, the unit containing the normal BES is called the normal BES unit, and the unit containing the alternate BES is called the alternate BES unit. Figure 8-2 provides an overview of the standby-less system switchover (1:1) facility.

Figure 8-2 Overview of the standby-less system switchover (1:1) facility

[Figure]

Explanation
  • Normally, both BES1 and BES2 perform processing.
  • If a failure occurs on the normal BES unit (UNT1), the system switches over, and processing is taken over by the alternate BES. The area in which processing is taken over is called the alternate portion and, when the alternate portion is performing processing, it is said to be alternating.
  • After the failure is resolved and the normal BES unit is started, the processing taken over by the alternate BES is switched back to the normal BES, and the system returns to normal status. This is called reactivating the system.

Remarks
The standby-less system switchover (1:1) facility can be mapped onto the primary/secondary and running/standby concepts of the standby system switchover facility as follows:
  • Think of the normal BES unit as the primary system, and the alternate BES unit as the secondary system.
  • Under normal conditions, think of the normal BES unit as the running system, and the alternate portion as the standby system. During alternating, think of the alternate portion as the running system, and the normal BES unit as the standby system.
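
The takeover and reactivation sequence described for Figure 8-2 can be sketched as a small state model. This is an illustration only: the class OneToOnePair and its methods are hypothetical, and the unit name UNT2 for the alternate BES unit is an assumption (the text names only UNT1, BES1, and BES2).

```python
class OneToOnePair:
    """Toy model of standby-less (1:1) switchover; not a HiRDB interface."""

    def __init__(self, normal_unit, normal_bes, alternate_unit, alternate_bes):
        self.normal_unit = normal_unit
        self.normal_bes = normal_bes
        self.alternate_unit = alternate_unit
        self.alternate_bes = alternate_bes
        self.alternating = False  # True while the alternate portion is processing

    def serving_unit(self, bes):
        """Return the unit that currently processes requests for the given BES."""
        if bes == self.normal_bes:
            # During alternating, the alternate BES unit handles the normal BES's work.
            return self.alternate_unit if self.alternating else self.normal_unit
        return self.alternate_unit

    def fail_normal_unit(self):
        """A failure on the normal BES unit: the alternate portion takes over."""
        self.alternating = True

    def reactivate(self):
        """The normal BES unit is restarted: processing returns to the normal BES."""
        self.alternating = False


bes_pair = OneToOnePair("UNT1", "BES1", "UNT2", "BES2")
bes_pair.fail_normal_unit()
print(bes_pair.serving_unit("BES1"), bes_pair.serving_unit("BES2"))  # UNT2 UNT2
bes_pair.reactivate()
print(bes_pair.serving_unit("BES1"))  # UNT1
```

While alternating, a single unit carries the work of both back-end servers, which is the load increase mentioned under the advantages of this facility.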

Prerequisites
To use the standby-less system switchover (1:1) facility, all of the following must be satisfied:
  • HiRDB Advanced High Availability is installed.
  • Hitachi HA Toolkit Extension is installed.
  • The system switchover facility is running in server mode.

Advantages of the standby-less system switchover facility
The following describes the advantages of the standby-less system switchover facility over the standby system switchover facility:
  • You do not need to set aside a standby system unit, which means you can use system resources more efficiently. However, remember that the load increases on the back-end server that takes over processing when the system is switched over, which may adversely affect processing performance.
  • The server processes and system servers are already running, so system switchover takes about as little time as it does when the rapid system switchover facility is used. For details about the rapid system switchover facility, see 8.1.5 Functions that reduce system switchover time (user server hot standby and the rapid system switchover facility).

(b) Standby-less system switchover (effects distributed) facility

When a failure occurs, processing requests directed to back-end servers in the failed unit can be distributed to and executed in multiple active units. This is called the standby-less system switchover (effects distributed) facility. This facility enables you to use your system resources more efficiently, without having to allocate a standby server machine or a standby unit. Of course, there may be adverse effects on transaction processing performance because of the increased processing load on the units that have taken over processing for the servers in the failed unit. However, because the processing requests directed to the failed servers are distributed to and executed in a number of units, the workload increase for each unit is minimized, reducing overall degradation of system performance.

The standby-less system switchover (effects distributed) facility switches processing over to multiple back-end servers and distributes the workload among them. The workload can also be distributed among multiple units. If another failure then occurs on a unit that was a switchover destination for the previous failure, processing can continue by switching over again to a unit that is still running (this is called multi-stage system switchover). Multi-stage system switchover cannot be performed with the standby-less system switchover (1:1) facility, so if a failure occurs at a switchover destination under that facility, processing for the failed unit cannot be continued.

It is appropriate to use the standby-less system switchover (effects distributed) facility in a system in which system resources must always be used at high efficiency and for which degradation of system performance must be minimized.

With the standby-less system switchover (effects distributed) facility, a back-end server that relinquishes processing when a failure occurs is called a host BES, and a back-end server that takes over processing is called a guest BES. The unit containing the host BESs is called the regular unit, and a unit containing a guest BES is called an accepting unit. All accepting units must be pre-defined as an HA group. The resources for back-end servers associated with guest BESs are called guest areas.
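
The idea of distributing a failed unit's back-end servers over an HA group, including multi-stage system switchover, can be sketched as follows. This is a simplified illustration under stated assumptions: the function switch_over, the unit and server names, and the "fewest servers first" selection rule are all hypothetical. The text does not describe how HiRDB actually chooses the accepting unit for each guest BES.

```python
def switch_over(placement, failed_unit, ha_group):
    """Return a new placement with the failed unit's BESs spread over survivors.

    placement maps each active unit to the back-end servers it is processing.
    The selection rule (fewest servers first, ties broken by unit name) is an
    assumption made for this sketch only.
    """
    survivors = [u for u in ha_group if u != failed_unit and u in placement]
    if not survivors:
        raise RuntimeError("no accepting unit left in the HA group")
    new_placement = {u: list(placement[u]) for u in survivors}
    for bes in placement[failed_unit]:
        # Pick the surviving unit that currently carries the fewest servers.
        target = min(survivors, key=lambda u: (len(new_placement[u]), u))
        new_placement[target].append(bes)  # bes now runs as a guest BES there
    return new_placement


ha_group = ["UNT1", "UNT2", "UNT3"]
placement = {
    "UNT1": ["BES1", "BES2"],
    "UNT2": ["BES3", "BES4"],
    "UNT3": ["BES5", "BES6"],
}

# First failure: the host BESs of UNT1 are distributed over UNT2 and UNT3.
after_first = switch_over(placement, "UNT1", ha_group)
print(after_first)
# {'UNT2': ['BES3', 'BES4', 'BES1'], 'UNT3': ['BES5', 'BES6', 'BES2']}

# Second failure on an accepting unit: multi-stage system switchover.
after_second = switch_over(after_first, "UNT2", ha_group)
print(after_second)
# {'UNT3': ['BES5', 'BES6', 'BES2', 'BES3', 'BES4', 'BES1']}
```

In the second step, BES1 moves a second time because its accepting unit failed, which is the case the (1:1) facility cannot handle.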

Figure 8-3 provides an overview of the standby-less system switchover (effects distributed) facility (with distribution alternates and multi-stage system switchover).

Figure 8-3 Overview of standby-less system switchover (effects distributed) facility (with distribution alternates and multi-stage system switchover)

[Figure]

Prerequisites
To use the standby-less system switchover (effects distributed) facility, the following conditions must be satisfied:
  • HiRDB Advanced High Availability is installed.
  • The standby-less system switchover (effects distributed) facility can switch only to a unit dedicated to back-end servers (a unit that consists only of back-end servers).
  • A unit that uses the standby-less system switchover (effects distributed) facility must consist of one or more back-end servers for the primary system. It cannot be used as a dedicated accepting unit.

(3) Shared disk unit

System switchover requires that there be an external hard disk that is shared by the primary and secondary systems. This hard disk is called the shared disk unit; it is used to transfer information from the running system to the standby system when system switchover occurs. The following HiRDB files must be created on the shared disk unit: