Hitachi

JP1 Version 12 JP1/Performance Management - Remote Monitor for Platform Description, User's Guide and Reference


2.2.1 Example of monitoring a processor

By monitoring processor performance, you can determine performance trends over the entire system.

Windows processes run in one of two processor access modes, called the user mode and the kernel mode. The following figure provides an overview of the Windows architecture.

Figure 2‒3: Overview of the Windows architecture

[Figure]

In UNIX, processing consists of operations performed by kernel processes and operations performed by user processes. The following figure shows the relationship between the UNIX kernel and processes.

Figure 2‒4: Relationship between the UNIX kernel and processes

[Figure]


(1) Overview of processor monitoring

To execute a job such as a process, the OS schedules the job and allocates CPU time to it. The queue request count, which indicates the number of jobs waiting for CPU allocation, tends to increase in proportion to the load on the entire system. Therefore, in general, you can determine the processor usage status by monitoring the CPU usage rate and the queue request count.
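The idea of watching both the CPU usage rate and the queue request count over time can be illustrated with a minimal sketch. It assumes the values have already been collected at a fixed interval. The function name `exceeds_continuously`, the sample values, and the thresholds are hypothetical and are not part of PFM - RM for Platform.

```python
from typing import Sequence


def exceeds_continuously(samples: Sequence[float], threshold: float,
                         min_consecutive: int) -> bool:
    """Return True if at least `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run on an exceedance; otherwise start over.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical samples (illustrative values only).
cpu_usage = [45.0, 92.0, 95.0, 93.0, 60.0]   # CPU usage rate (%)
queue_length = [1, 3, 4, 5, 1]               # queue request count

# Illustrative thresholds; choose values that suit the monitored system.
cpu_busy = exceeds_continuously(cpu_usage, threshold=90.0, min_consecutive=3)
queue_busy = exceeds_continuously(queue_length, threshold=2, min_consecutive=3)
print(cpu_busy, queue_busy)
```

Requiring several consecutive exceedances is one simple way to separate a sustained load from a momentary spike. The appropriate run length depends on the collection interval.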

The following table lists and describes the records and fields that are used by PFM - RM for Platform for monitoring the processor.

Table 2‒1: Records and fields used for processor monitoring

| No. | Record | Field | Description of value | Interpretation of value |
|---|---|---|---|---|
| 1 | PI | Processor Queue Length | Queue request count | When this value continuously exceeds the threshold value, the processor might be busy. |
| 2 | PI | Run Queue Avg 5 min | Average number of threads waiting in the execution queue | When this value is large, there might be a problem with processor utilization efficiency. |
| 3 | PI | CPU % | CPU usage rate | When this value continuously exceeds the threshold value, the processor might be responsible for a system bottleneck. |
| 4 | PI | System % | CPU usage rate in the kernel mode | When this value is large and the CPU % field of the PI record continuously exceeds the threshold value, there might be problems in a specific application process, such as a service or a system process. |
| 5 | PI | User % | CPU usage rate in the user mode | When this value is large and the CPU % field of the PI record continuously exceeds the threshold value, there might be problems in a specific application process, such as a service. |
| 6 | PI | Idle % | CPU idle rate | When this value is high, the load on the CPU is likely to be light. |
| 7 | PI | Interrupt Counts/sec | Hardware interrupt count (per second) | When this value increases greatly while the system workload is light, there might be a hardware interrupt problem, such as a slow device overloading the processor. |
| 8 | PI_CPU# | CPU % | CPU usage rate for a processor | When this value continuously exceeds the threshold value, the processor might be responsible for a system bottleneck. |
| 9 | PI_CPU# | System % | CPU usage rate for a processor in the kernel mode | When this value is large and the CPU % field of the PI_CPU record continuously exceeds the threshold value, there might be problems in a specific application process, such as a service or a system process. |
| 10 | PI_CPU# | User % | CPU usage rate for a processor in the user mode | When this value is large and the CPU % field of the PI_CPU record continuously exceeds the threshold value, there might be problems in a specific application process, such as a service. |
| 11 | PI_CPU# | Interrupt Counts/sec | Hardware interrupt count (per second) for a processor | When this value increases greatly while the system workload is light, there might be a hardware interrupt problem, such as a slow device overloading the processor. |

#: Each field of the PI_CPU record is used to monitor the performance of one processor.

In a multiprocessor environment, the average value of all CPU usage rates is treated as the CPU usage rate for the system. Therefore, to obtain an accurate CPU usage rate, check the value for each CPU. To identify a process causing a bottleneck, check the CPU usage rate for each process.

You must use PFM - Agent for Platform to check the CPU usage rate for each process. For details about how to monitor processes, see the manual JP1/Performance Management - Agent Option for Platform (for Windows systems), or JP1/Performance Management - Agent Option for Platform (for UNIX systems).
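The reason for checking each CPU in a multiprocessor environment can be shown with a small worked example. This is a minimal sketch with hypothetical per-processor values in the spirit of the CPU % field of the PI_CPU record. The names `per_cpu_usage` and `THRESHOLD` are assumptions made for illustration, and the code does not call any PFM interface.

```python
from statistics import mean

# Hypothetical CPU % values for four processors (illustrative only).
per_cpu_usage = {"cpu0": 98.0, "cpu1": 12.0, "cpu2": 10.0, "cpu3": 8.0}

# Illustrative threshold, borrowed from the warning condition of the CPU Usage alarm.
THRESHOLD = 80.0

# The system-wide figure is the average of all per-processor rates.
system_usage = mean(per_cpu_usage.values())

# Checking each processor reveals what the average hides.
hot_cpus = [name for name, pct in per_cpu_usage.items() if pct >= THRESHOLD]

print(f"system average: {system_usage:.1f}%")
print(f"processors at or above {THRESHOLD}%: {hot_cpus}")
```

With these values the system average is 32.0%, well below the threshold, even though one processor is almost saturated.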

(2) Example of a monitoring template for monitoring a processor

This subsection describes an example of alarms and reports that are provided as a monitoring template for monitoring a processor.

PFM - RM for Platform provides alarms and reports, such as the CPU Usage alarm and the CPU Used Status (Multi-Agent) report. To obtain more detailed performance information for a processor, you must monitor various aspects of the processor.

(a) Alarms

The following table lists and describes the processor-related alarms.

Table 2‒2: Examples of alarms related to processor monitoring

| No. | Alarm | Record | Field | Abnormal condition | Warning condition | Interpretation of value |
|---|---|---|---|---|---|---|
| 1 | CPU Usage | PI | CPU % | >= 90 | >= 80 | A processor usage rate of 80% or higher is treated as the warning or abnormal status. When this value reaches or exceeds the threshold value set in the warning or abnormal condition, the processor might be causing a system bottleneck. If you find a process that makes excessive use of the processor, check the status of the process, and then take appropriate action. If no process is using the processor excessively, you might need to consider upgrading the processor or adding a new processor. |
| 2 | Kernel CPU | PI | System % | > 75 | > 50 | A CPU usage rate in the kernel mode of higher than 50% is treated as the warning or abnormal status. When this value exceeds the threshold value set in the warning or abnormal condition, there might be a problem in the OS or in system operating procedures. Check whether processes have been created or deleted in a short period of time at a rate that kernel scheduling cannot keep up with, or whether a process is using the processor excessively, and then take appropriate action. If no process is using the processor excessively, you might need to consider upgrading the processor or adding a new processor. |
| 3 | Processor Queue | PI | Processor Queue Length | >= 10 | >= 2 | A queue request count that stays at 2 or greater is treated as the warning or abnormal status. When this value reaches or exceeds the threshold value set in the warning or abnormal condition, the processor might be causing a system bottleneck. If you find a process that makes excessive use of the processor, check the status of the process, and then take appropriate action. If no process is using the processor excessively, you might need to consider upgrading the processor or adding a new processor. |
| 4 | Run Queue | PI | Run Queue Avg 5 min | > 8 | > 4 | An average thread count greater than 4 in the execution queue is treated as the warning or abnormal status. When this value exceeds the threshold value set in the warning or abnormal condition, there might be a problem in the OS, in system operating procedures, or with a specific application. Check whether processes have been created or deleted in a short period of time at a rate that kernel scheduling cannot keep up with, or whether a process is using the processor excessively, and then take appropriate action. If no process is using the processor excessively, you might need to consider upgrading the processor or adding a new processor. |
| 5 | User CPU | PI | User % | > 85 | > 65 | A CPU usage rate higher than 65% in the user mode is treated as the warning or abnormal status. When this value exceeds the threshold value set in the warning or abnormal condition, there might be a problem with a specific application. Check whether processes have been created or deleted in a short period of time at a rate that kernel scheduling cannot keep up with, or whether a process is using the processor excessively, and then take appropriate action. If no process is using the processor excessively, you might need to consider upgrading the processor or adding a new processor. |
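The abnormal and warning conditions in Table 2-2 can be read as a simple two-level threshold check. The following minimal sketch encodes the thresholds from the table and classifies a set of values. The names `AlarmRule`, `ALARM_RULES`, and `evaluate`, and the sample values, are hypothetical. The sketch illustrates how to read the table only and does not show how PFM - RM for Platform evaluates alarms internally.

```python
import operator
from typing import Callable, NamedTuple


class AlarmRule(NamedTuple):
    field: str                               # field of the PI record
    compare: Callable[[float, float], bool]  # ">=" or ">" as listed in the table
    abnormal: float                          # abnormal-condition threshold
    warning: float                           # warning-condition threshold


# Thresholds transcribed from Table 2-2.
ALARM_RULES = {
    "CPU Usage": AlarmRule("CPU %", operator.ge, 90, 80),
    "Kernel CPU": AlarmRule("System %", operator.gt, 75, 50),
    "Processor Queue": AlarmRule("Processor Queue Length", operator.ge, 10, 2),
    "Run Queue": AlarmRule("Run Queue Avg 5 min", operator.gt, 8, 4),
    "User CPU": AlarmRule("User %", operator.gt, 85, 65),
}


def evaluate(alarm: str, value: float) -> str:
    """Classify a value as 'abnormal', 'warning', or 'normal' for the given alarm."""
    rule = ALARM_RULES[alarm]
    # Test the stricter abnormal condition first.
    if rule.compare(value, rule.abnormal):
        return "abnormal"
    if rule.compare(value, rule.warning):
        return "warning"
    return "normal"


# Hypothetical values for one collection interval (illustrative only).
sample = {
    "CPU %": 85.0,
    "System %": 50.0,
    "Processor Queue Length": 12,
    "Run Queue Avg 5 min": 3.0,
    "User %": 90.0,
}

for alarm, rule in ALARM_RULES.items():
    print(f"{alarm}: {evaluate(alarm, sample[rule.field])}")
```

Note the difference between `>=` and `>`: a System % value of exactly 50 does not satisfy the Kernel CPU warning condition, whereas a CPU % value of exactly 80 does satisfy the CPU Usage warning condition.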

(b) Reports

The following table lists and describes the processor-related reports.

Table 2‒3: Examples of reports related to processor monitoring

| No. | Report name | Information displayed in the report |
|---|---|---|
| 1 | CPU Used Status (Multi-Agent) | Displays the CPU usage status in multiple systems. |
| 2 | CPU Used Status | Displays the CPU usage status in the system. |
| 3 | CPU Per Processor Status | Displays the processor usage status for each processor. |