Hitachi

JP1 Version 12 JP1/Performance Management Planning and Configuration Guide


2.5.1 Important considerations for configuring JP1/Performance Management in large systems

This subsection describes the main points to consider when configuring JP1/Performance Management in large systems.

(1) Enabling Performance Management functions

Enable the following Performance Management functions in large systems:

Status management function

This function manages the status of the PFM services, thereby providing a safeguard against placing too much burden on the monitoring managers. For details about how to configure the status management function, see the chapter describing the problem detection mechanisms provided by Performance Management in the JP1/Performance Management User's Guide.

Functionality for binding multiple alarm tables

This function allows you to bind multiple alarm tables to agents. For details about how to configure the functionality for binding multiple alarm tables, see 4.4.4 Configuring the functionality for binding multiple alarm tables for Windows and 5.4.4 Configuring the functionality for binding multiple alarm tables for UNIX.

Dispersion of the reconnection

This function staggers the timing of reconnection attempts made by multiple agents that have started in stand-alone mode. For details about how to configure the dispersion of the reconnection, see the section describing what to do when multiple agents that started simultaneously take a long time to recover from stand-alone mode in the JP1/Performance Management User's Guide.

Monitoring suspension function

This function allows you to suspend or resume alarms, a health check, and other monitoring operations. For details about how to configure the monitoring suspension function, see the section describing the configuration of the monitoring suspension function in the JP1/Performance Management User's Guide.

Communication restriction by means of fixed ports

By fixing the port numbers of the Master Manager and the Correlator, this function reduces the number of accesses to the monitoring managers when agents start, and thereby reduces the load placed on the monitoring managers. For details about how to set port numbers, see 4.1.4 (2) Specifying settings for the network for Windows and 5.1.4 (3) Specifying settings for the network for UNIX.

Important

If, after fixing the port numbers of the Master Manager and the Correlator, you want to change the port number settings, see the following section:

For Windows

4.3.16 Changing the port number settings

For UNIX

5.3.16 Changing the port number settings

(2) Creating alarms by taking into account the maximum number of alarms allowed

You can create no more than 20,000 alarms within a single system. If you create an alarm table for each server available in a large system, the total number of alarms can exceed 20,000. To prevent this from happening, share an alarm table (or alarm tables) across several servers as necessary.

With Performance Management, alarm definitions in different alarm tables are counted as different alarms, even when they are identical. When there are multiple servers binding identical alarms to agents, you can define such alarms in a common alarm table to reduce the total number of defined alarms.
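The counting rule above can be illustrated with a minimal sketch. The table names, server count, and number of alarms per table below are hypothetical values chosen for illustration; only the 20,000 limit and the rule that identical definitions in different tables count separately come from this subsection.

```python
ALARM_LIMIT = 20_000  # system-wide maximum stated in this subsection


def total_defined_alarms(alarm_tables):
    """Sum the alarm definitions over all alarm tables.

    alarm_tables maps a table name to the number of alarms it defines.
    Identical definitions in different tables are counted separately,
    so the total is a plain sum over tables.
    """
    return sum(alarm_tables.values())


# Hypothetical layout A: one table of 25 alarms for each of 1,000 servers.
per_server = {f"table_for_server_{i}": 25 for i in range(1000)}

# Hypothetical layout B: the same 25 alarms defined once per server role.
shared = {f"table_for_role_{role}": 25 for role in ("web", "db", "app", "batch")}

for name, layout in (("per-server", per_server), ("shared", shared)):
    total = total_defined_alarms(layout)
    status = "exceeds limit" if total > ALARM_LIMIT else "within limit"
    print(f"{name}: {total} alarms ({status})")
```

In this example the per-server layout defines 25,000 alarms and exceeds the limit, while the shared layout defines 100.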

You can use a single alarm to monitor status changes for every instance of a record. To do this, create an alarm that sends a notification for each record instance whenever its status changes. For details about how to configure such an alarm, see 3.3.4 Triggers for sending alarms.

(3) Taking the frequency of alarm event occurrence into account when configuring an environment

Keep the number of alarm events occurring in the system to approximately 150 per minute, even during peak periods. As the number of agents running in the system increases, the number of alarm events allowable per agent decreases. We therefore recommend that you carefully consider the following settings when configuring JP1/Performance Management:

Occurrence frequency

The occurrence frequency is a setting whereby you can have a notification sent out only when the threshold is exceeded more than the specified number of times during the specified number of monitoring intervals. By using this setting, you can have alarm events issued only during continuous heavy load conditions, while suppressing the issuance of alarm events during temporary load conditions.

Temporary monitoring setting

When, for example, you are performing system maintenance that is likely to cause a large number of alarm events, you can use the monitoring suspension function to suspend alarm event monitoring.
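The effect of the occurrence frequency setting, and of the per-agent share of the system-wide event budget, can be sketched as follows. This is an illustration only and not the product's implementation: the class and function names are hypothetical, the budget is assumed to be divided evenly among agents, and the alarm condition is assumed to hold once the threshold has been exceeded at least the specified number of times within the window. Check the exact boundary behavior against the product documentation.

```python
from collections import deque

SYSTEM_EVENTS_PER_MINUTE = 150  # approximate system-wide figure from this subsection


def per_agent_budget(agent_count):
    """Alarm events per minute available to each agent (assumes an even split)."""
    return SYSTEM_EVENTS_PER_MINUTE / agent_count


class OccurrenceFilter:
    """Report an alarm condition only when the threshold was exceeded
    `occurrences` times within the last `intervals` monitoring intervals."""

    def __init__(self, occurrences, intervals):
        if not 1 <= occurrences <= intervals:
            raise ValueError("occurrences must be between 1 and intervals")
        self.occurrences = occurrences
        # Sliding window over the most recent `intervals` observations.
        self.window = deque(maxlen=intervals)

    def observe(self, value, threshold):
        """Record one monitoring interval and return whether the condition holds."""
        self.window.append(value > threshold)
        return sum(self.window) >= self.occurrences


# Hypothetical CPU usage samples, one per monitoring interval.
samples = [95, 40, 50, 45, 92, 96, 97, 30]
alarm_filter = OccurrenceFilter(occurrences=3, intervals=5)
fired = [alarm_filter.observe(s, threshold=90) for s in samples]
print(fired)
print(per_agent_budget(500))
```

With these samples, the isolated spike in the first interval does not satisfy the condition. The condition first holds at the seventh interval, after three exceedances fall within the five-interval window.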

(4) Polling interval setting for the health check agent

The health check function monitors the operating status of both the monitoring agents and the hosts on which they are running by using a health check agent dedicated to that purpose. By default, the health check agent is configured to perform polling every five minutes.

In a large system, the health check agent can take a long time to collect operating information. If it fails to collect the operating information within five minutes, the polling is skipped. For details, see the section that provides notes on the health check function in the JP1/Performance Management User's Guide.

(5) Setting the number of PA records to be saved

You must set the maximum number of PA records provided by the Master Store service that can be saved per Agent/RM Collector service.

Set the value b so that it satisfies the following inequality:

a × b × 0.015 < 2,000 (megabytes)

Legend:

a: Total number of Agent Collector services, Remote Monitor Collector services, Remote Agents, and Group Agents to be connected

b: Number of PA records to be retained (initial value: 1,000)
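As a worked example of the inequality above, the following sketch computes the largest value of b that satisfies it for a given a. The function names are hypothetical. The check uses integer arithmetic (a × b × 15 < 2,000,000, which is the same inequality multiplied by 1,000) to avoid floating-point rounding at the boundary.

```python
SCALED_COEFFICIENT = 15        # 0.015 multiplied by 1,000
SCALED_LIMIT = 2_000 * 1_000   # 2,000 (megabytes) multiplied by 1,000


def satisfies_limit(a, b):
    """Return True if a * b * 0.015 < 2,000 holds."""
    return a * b * SCALED_COEFFICIENT < SCALED_LIMIT


def max_pa_records(a):
    """Return the largest integer b for which a * b * 0.015 < 2,000."""
    if a < 1:
        raise ValueError("a must be at least 1")
    # Strict inequality: a * b * 15 must be at most SCALED_LIMIT - 1.
    return (SCALED_LIMIT - 1) // (SCALED_COEFFICIENT * a)


for a in (100, 133, 134, 200):
    print(f"a={a}: max b={max_pa_records(a)}, "
          f"initial value 1,000 allowed: {satisfies_limit(a, 1000)}")
```

Under this calculation, the initial value of 1,000 satisfies the inequality for up to 133 connected services and agents. For a = 200, b must be reduced to 666 or less.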