Hitachi

JP1 Version 12 JP1/Performance Management User's Guide


10.6.8 Performing the required operation when a failover occurs in a cluster system

When a failure occurs on the executing node, the cluster software executes a failover and the standby node takes over the processing.

(1) Flow of processing for a failover when a failure occurs in PFM - Manager

Figure 10‒31: Flow of processing when a failover occurs on the PFM - Manager host

[Figure]

  1. The cluster software forces PFM - Manager to terminate when a failover occurs.

  2. The cluster software directs the standby node to take over the PFM - Manager processing from the executing node.

  3. The cluster software starts up PFM - Manager on the standby node.

(a) Operation on PFM - Web Console

The KAVJS0012-E message is displayed if you are performing operations from a PFM - Web Console window when a failover occurs in PFM - Manager.

To connect to the failover destination PFM - Manager:

  1. Log out from the window of PFM - Web Console.

    Click the Logout menu in the Main window.

  2. Log on in the window of PFM - Web Console.

    Log on from the window of PFM - Web Console again after the failover destination PFM - Manager starts up.

    Important

    If a failover occurs while you are working with bookmarks, any information that had not yet been correctly written to the bookmark definition is lost. If the bookmarks no longer work properly, correct the bookmark definition.
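The second step of the procedure above depends on waiting until PFM - Manager has started on the failover destination. The following is a minimal sketch of such a wait, not a documented Performance Management interface. It assumes that a successful TCP connection to the logical host is an acceptable rough readiness signal. The function names, the host name, and the port are all hypothetical placeholders.

```python
import socket
import time
from typing import Callable


def tcp_probe(host: str, port: int, connect_timeout_s: float = 3.0) -> Callable[[], bool]:
    """Return a probe that reports whether host:port accepts a TCP connection.

    Assumption: reachability of the port is only a rough readiness signal.
    """
    def probe() -> bool:
        try:
            with socket.create_connection((host, port), timeout=connect_timeout_s):
                return True
        except OSError:
            return False
    return probe


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True          # service answered; safe to try logging on
        if clock() >= deadline:
            return False         # gave up; failover may still be in progress
        sleep(interval_s)


# Hypothetical usage (placeholder logical host name and port):
#   ready = wait_until_ready(tcp_probe("pfm-logical-host.example", 12345))
#   if ready: log on to PFM - Web Console again
```

The probe is passed in as a parameter so that a more reliable check can replace the TCP test without changing the wait loop.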

(b) Operations using PFM - Agent or PFM - RM

You do not need to perform special operations in PFM - Agent or PFM - RM when a failover occurs in PFM - Manager during operation. The performance data continues to be collected in PFM - Agent or PFM - RM during a failover of PFM - Manager.

(2) Effects when PFM - Manager stops

Stopping PFM - Manager affects the entire Performance Management system.

PFM - Manager performs integrated management of the agent information for each node where PFM - Agent or PFM - RM runs. PFM - Manager also controls the alarm event reports that are sent when a performance value exceeds a threshold value during monitoring, and the execution of actions triggered by an alarm event. For that reason, stopping PFM - Manager affects the Performance Management system in the areas listed in the following tables.

Table 10‒15: Effects on PFM - Web Console when PFM - Manager stops

Effect:

  • An alarm flashing in red in the window of PFM - Web Console returns to green immediately after PFM - Manager restarts or when a failover occurs, and then starts flashing in red again.

    When PFM - Manager stops, the KAVJS0012-E message is displayed and no further operations can be performed.

Solution: Start up PFM - Manager, and then log on again.

Effect:

  • If PFM - Manager has stopped, you cannot log on to Performance Management from the window of PFM - Web Console.

Solution: Start up PFM - Manager, and then log on again.

Table 10‒16: Effects on PFM - Agent or PFM - RM when PFM - Manager stops

Effect:

  • Performance data continues to be collected.

  • Because an alarm event that occurs cannot be reported to PFM - Manager, the alarm event is retained for each alarm definition and the report is retried until PFM - Manager starts up. The oldest alarm event is overwritten when more than three alarm events are retained. If PFM - Agent or PFM - RM is stopped, all retained alarm events are deleted.

  • When PFM - Manager restarts, the alarm status that has already been reported to PFM - Manager is reset once. PFM - Manager then checks the status of PFM - Agent or PFM - RM and updates the alarm status.

  • Stopping PFM - Agent or PFM - RM takes time because the stop cannot be reported to PFM - Manager.

Solution:

Start up PFM - Manager.

You can continue using the running PFM - Agent or PFM - RM without any changes. However, because an alarm might not be reported as expected, check the KAVE00024-I message output to the common message log of PFM - Agent or PFM - RM after PFM - Manager recovers.
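The alarm event retention described in Table 10‒16 can be pictured as a small bounded queue for each alarm definition. The following is a toy model written for illustration only. It is not the product's implementation, and the class name, the `send` callback, and the oldest-first delivery order are assumptions. Only the limit of three retained events, the retry until PFM - Manager starts, and the deletion when the agent stops come from the table.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List

MAX_RETAINED_PER_DEFINITION = 3  # limit stated in Table 10-16


class AlarmEventBuffer:
    """Toy model of agent-side alarm retention while PFM - Manager is down."""

    def __init__(self, send: Callable[[str, str], bool]) -> None:
        # `send(definition, event)` is a hypothetical transport; it returns
        # True only when the report reaches PFM - Manager.
        self._send = send
        self._pending: Dict[str, Deque[str]] = defaultdict(
            lambda: deque(maxlen=MAX_RETAINED_PER_DEFINITION)
        )

    def report(self, definition: str, event: str) -> None:
        """Retain the event, then try to deliver everything still pending."""
        # A full deque silently drops its oldest entry on append.
        self._pending[definition].append(event)
        self.retry()

    def retry(self) -> None:
        """Deliver retained events oldest first; stop at the first failure."""
        for definition, events in self._pending.items():
            while events and self._send(definition, events[0]):
                events.popleft()

    def stop_agent(self) -> None:
        """Stopping the agent deletes all retained alarm events."""
        self._pending.clear()

    def retained(self, definition: str) -> List[str]:
        """Return the events still waiting to be reported for a definition."""
        return list(self._pending.get(definition, ()))
```

In this model, if four events occur for one alarm definition while PFM - Manager is down, only the three most recent are reported after it recovers. That is one reason to check the common message log afterward.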

(3) Overview of failover when a failure occurs in PFM - Agent or PFM - RM

Figure 10‒32: Flow of processing when a failover occurs in PFM - Agent or PFM - RM

[Figure]

  1. The cluster software forces PFM - Agent or PFM - RM to terminate when a failover occurs.

  2. The cluster software directs the standby node to take over the PFM - Agent or PFM - RM processing from the executing node.

  3. The cluster software starts up PFM - Agent or PFM - RM on the standby node.

(a) Operations in the window of PFM - Web Console

If you perform operations in the window of PFM - Web Console during a PFM - Agent or PFM - RM failover, a message appropriate to the status is displayed. In such cases, wait until the failover completes, and then perform the operation.

If you perform operations in the window of PFM - Web Console after the PFM - Agent or PFM - RM failover, you are connected to, and operate on, the PFM - Agent or PFM - RM that was started up on the failover destination node.