Hitachi

JP1 Version 11 JP1/Performance Management Planning and Configuration Guide


2.5.2 Important considerations for operating JP1/Performance Management in large systems

This subsection describes some of the important considerations that you must take into account when operating JP1/Performance Management in large systems.


(1) Simultaneously starting all PFM services on the host for which the dispersion of the reconnection is disabled

If you simultaneously start all PFM services on a host for which the dispersion of the reconnection is disabled, it can take three hours or more for all PFM services to start running in normal mode. You can shorten this time by starting the agents in several batches. To do so, decide how many agents to start in each batch and how long to wait between batches. Base both values on the startup time that is permissible for your JP1/Performance Management operation and on the time that all agents need to start running without entering stand-alone mode.
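As a rough planning aid, a batched startup can be sketched as a simple schedule. The function name `plan_batches` and its parameters are hypothetical and are not part of JP1/Performance Management. The sketch only computes when each batch would begin and does not start any service. The batch size and interval in the usage lines are example values, not recommendations.

```python
import math


def plan_batches(total_agents, batch_size, interval_minutes):
    """Return a list of (start_offset_minutes, agents_in_batch) tuples.

    Hypothetical helper: it only lays out a timetable for starting
    agents in batches; choosing suitable values is up to the operator.
    """
    if total_agents < 0 or batch_size <= 0 or interval_minutes < 0:
        raise ValueError("counts must be non-negative and batch_size positive")

    batch_count = math.ceil(total_agents / batch_size)
    schedule = []
    for i in range(batch_count):
        remaining = total_agents - i * batch_size
        # The last batch may be smaller than batch_size.
        schedule.append((i * interval_minutes, min(batch_size, remaining)))
    return schedule


# Example values only (assumed, not taken from the manual).
for offset, count in plan_batches(1200, 500, 30):
    print(f"t+{offset} min: start {count} agents")
```

The offset of the last batch gives a lower bound on the total startup time. Compare it with the permissible startup time before settling on a batch size and interval.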

(2) The time it takes for all PFM services to start running in normal mode

Simultaneously starting all PFM services, for example when the operating systems start at the same time or during a scheduled restart, can strain the monitoring managers and cause agents and action handlers to start temporarily in stand-alone mode. When there are a large number of agents, it takes time for the agents and the action handlers to transition from stand-alone mode to normal mode. In stand-alone mode, records are collected but alarms are not evaluated. Take these factors into account when operating JP1/Performance Management in large systems. The table below shows the approximate time it takes for agents and action handlers to transition to normal mode.

Table 2‒3: The approximate time to normal mode activation

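Number of agents#    Number of action handlers    Approximate time to normal mode activation (units: minutes)
-----------------    -------------------------    -----------------------------------------------------------
100                  100                          20
500                  500                          40
1,200                1,024                        70
2,500                2,500                        120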

#: This is the number of Agent Collectors or RM Collectors.
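For agent counts that fall between the rows of Table 2‒3, a rough estimate can be obtained by interpolating between the neighboring rows. The sketch below makes two assumptions that the manual does not state: that the time grows linearly between table rows, and that the number of agents alone is a usable key (the table pairs each agent count with a number of action handlers). The function name `estimate_minutes` is hypothetical. Outside the table's range the function returns the nearest table value instead of extrapolating.

```python
from bisect import bisect_left

# (number of agents, approximate minutes to normal mode) from Table 2-3.
TABLE_2_3 = [(100, 20), (500, 40), (1200, 70), (2500, 120)]


def estimate_minutes(agents, table=TABLE_2_3):
    """Linearly interpolate the time to normal mode (assumed model)."""
    counts = [count for count, _ in table]
    # Clamp to the table's range; no extrapolation is attempted.
    if agents <= counts[0]:
        return float(table[0][1])
    if agents >= counts[-1]:
        return float(table[-1][1])

    i = bisect_left(counts, agents)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (agents - x0) / (x1 - x0)


print(estimate_minutes(850))  # halfway between the 500 and 1,200 rows
```

Treat the result as an order-of-magnitude guide only, because the table values themselves are approximate.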

(3) Command execution time

Because the jpctool config sync command, the jpctool config alarmsync command, and the jpcconf primmgr notify command access agents and action handlers, they take time to execute in large systems. The table below shows the approximate command execution time.

Table 2‒4: Approximate command execution time

                                                   Approximate command execution time (units: minutes)
Number of agents#1    Number of action handlers    jpctool config sync    jpctool config alarmsync#2    jpcconf primmgr notify
------------------    -------------------------    -------------------    --------------------------    ----------------------
100                   100                          25                     15                            2
500                   500                          120                    55                            10
1,200                 1,024                        240                    120                           20
2,500                 2,500                        585                    290                           50

#1: This is the number of Agent Collectors or RM Collectors.

#2: The execution time shown for this command assumes that all services are subject to synchronization (that is, the application status is either Failed or Uncertain).

The jpctool config sync command synchronizes alarm information and node information between agents and action handlers. The jpctool config alarmsync command synchronizes alarm information only, and only for the agents and action handlers whose application status is either Failed or Uncertain. Because these commands take a long time to execute in large systems, we recommend that you choose the command according to what needs to be synchronized. For example, use the jpctool config alarmsync command when only the alarm information of services in the Failed or Uncertain status must be brought up to date.
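To make the trade-off concrete, the following sketch computes, for each row of Table 2‒4, how much shorter the approximate jpctool config alarmsync time is than the jpctool config sync time. The dictionary keys `sync`, `alarmsync`, and `notify` and the function name `minutes_saved` are hypothetical labels chosen for this illustration. The alarmsync figures are the table's values, which assume that every service is subject to synchronization.

```python
# Approximate execution times in minutes, copied from Table 2-4.
# The outer keys are the number of agents (Agent Collectors or RM Collectors).
TABLE_2_4 = {
    100:  {"sync": 25,  "alarmsync": 15,  "notify": 2},
    500:  {"sync": 120, "alarmsync": 55,  "notify": 10},
    1200: {"sync": 240, "alarmsync": 120, "notify": 20},
    2500: {"sync": 585, "alarmsync": 290, "notify": 50},
}


def minutes_saved(agents):
    """Difference between the sync and alarmsync times for one table row."""
    row = TABLE_2_4[agents]
    return row["sync"] - row["alarmsync"]


for agents in TABLE_2_4:
    print(f"{agents} agents: alarmsync is about {minutes_saved(agents)} minutes shorter")
```

The two commands do not synchronize the same information, so the shorter time is a benefit only when alarm information is all that needs updating.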

(4) Simultaneously starting all PFM services on a system on which the automatic bind function is used

If you simultaneously start all PFM services when starting agents for the first time after setting automatic bind, automatic bind might fail for some agents because of the excessive load that the PFM services place on the system. If automatic bind fails, the KAVE00568-E message is output. If this message is output, set alarm bind again, or restart the agents in question and apply the alarm information to them.

You can avoid this problem by starting agents in several batches. To do so, decide how many agents to start in each batch and how long to wait between batches. Base both values on the startup time that is permissible for your JP1/Performance Management operation and on the time that all agents need to start running without entering stand-alone mode.