Job Management Partner 1/Performance Management - Agent Option for Platform Description, User's Guide and Reference
This subsection provides definition examples for each monitored resource, covering both the monitoring template and items that are not included in the monitoring template. Note the following when reading the definition examples:
- In the examples, the PFM - Web Console check boxes are shown as follows: (selected) and (not selected)
- In the examples, the PFM - Web Console radio buttons are shown as follows: (selected) and (not selected)
- In the examples, xxx, yyy, zzz, and dummy are variables that the user replaces with the character strings appropriate for the system environment. For other definition items, the values should be changed as required.
- In the examples, the proper values for the frequency of occurrence settings (for example, m occurrence(s) during n interval(s)) depend on the system environment, so specify values appropriate for your system. For example, assume that in your system environment the high-load status is one in which the threshold has been exceeded for at least two minutes. Also assume that the collection interval is 60 seconds and that the threshold may be exceeded at most twice per five intervals. Under these conditions, an unacceptable high-load condition occurs when the threshold is exceeded at least three times per five intervals, so the setting is 3 occurrence(s) during 5 interval(s).
- Organization of this subsection
- (1) Processor
- (2) Memory
- (3) Disks
- (4) Network
- (5) Processes and services
- (6) Event logs
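The frequency of occurrence setting described in the notes above (m occurrence(s) during n interval(s)) can be illustrated with a small counting sketch. This is not how PFM implements the setting; it only models the rule that a condition is reported once the threshold has been exceeded at least m times within the most recent n collection intervals. The function name make_occurrence_checker and the sample data are hypothetical.

```python
from collections import deque


def make_occurrence_checker(m, n):
    """Return a function that takes one result per collection interval and
    reports whether the threshold was exceeded at least m times during the
    most recent n intervals (hypothetical helper, for illustration only)."""
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    window = deque(maxlen=n)  # keeps only the most recent n results

    def check(threshold_exceeded):
        window.append(bool(threshold_exceeded))
        return sum(window) >= m

    return check


# 3 occurrence(s) during 5 interval(s), one sample per 60-second interval.
check = make_occurrence_checker(m=3, n=5)
samples = [True, False, True, False, True, False, False]
results = [check(s) for s in samples]
print(results)
```

With these samples, the condition is met only at the fifth interval, when three of the last five samples exceeded the threshold.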
(1) Processor
The following shows definition examples for the monitoring template and for items not included in the monitoring templates.
(a) Monitoring template
- Processor-related monitoring template alarms
Processor-related monitoring template alarms are stored in the alarm table for PFM Windows Template Alarms [CPU] 09.00.
Table 1-19 Processor-related monitoring template alarms
| Monitoring template alarm | Record | Field | Error threshold | Warning threshold | Description |
|---|---|---|---|---|---|
| CPU Usage | PI | CPU % | >= 90 | > 80 | If the processor usage (%) continues to be high, the processor might be a system bottleneck. Find any processes that are using the processor excessively and take appropriate action. If no such processes exist, the system environment is not adequate for the processing. In this case, you might need to upgrade the processor or add processors. |
| Processor Queue | PI | Processor Queue Length | >= 10 | >= 2 | If the number of requests continues at or above the threshold, this indicates processor congestion. |
| SVR Processor Queue | PI_SVRQ | Queue Length | >= 3 | >= 2 | If the queue length continues at or above the threshold, this indicates significant load on the processor. |
- Processor-related monitoring template reports
Table 1-20 Processor monitoring template reports
| Report name | Displayed information |
|---|---|
| CPU Status (Multi-Agent) | An hourly summary of the CPU usage by multiple agents for the last 24 hours |
| CPU Trend | Daily CPU usage in the user mode and daily CPU usage in the kernel mode for the last month |
| CPU Trend (Multi-Agent) | Daily CPU usage by multiple systems for the last month |
| CPU Usage Summary | A summary of the CPU usage on a minute-by-minute basis for the last hour |
For details about settings for existing reports, see 4. Monitoring Templates.
(b) Definition examples other than for monitoring templates
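- Real-time report for checking processes whose processor usage is high
Table 1-21 Definition example
| Item | Explanation |
|---|---|
| Name and Type > Report name | PD_PDI - Memory |
| Name and Type > Product | Windows (6.0) |
| Name and Type > Report type > Real-time (single agent) | (Select) |
| Name and Type > Report type > Historical (single agent) | -- |
| Name and Type > Report type > Historical (multiple agents) | -- |
| Field > Record | PD_PDI |
| Field > Selected fields | Program, PID, CPU %, Privileged CPU %, User CPU % |
| Filter > Conditional expression | (Select Simple or Complex.) Program <> "_Total" AND PID <> "0" |
| Filter > Specify when displayed | (Clear) |
| Indication settings > Specify when displayed | (Select) |
| Indication settings > Indicate delta value | (Clear) |
| Indication settings > Refresh interval > Do not refresh automatically | (Clear) |
| Indication settings > Refresh interval > Initial value | 30 |
| Indication settings > Refresh interval > Minimum value | 30 |
| Indication settings > Display by ranking > Field | CPU % |
| Indication settings > Display by ranking > Display number | 10# |
| Indication settings > Display by ranking > In descending order | (Clear) |
| Components > Table | All fields |
| Components > List | -- |
| Components > Graph | Privileged CPU %, User CPU % |
| Components > Display key > Field | (None) |
| Components > Display key > In descending order | -- |
| Graph > Graph type | Stacked bar graph |
| Graph > Series direction | Row |
| Graph > Axis labels > X-axis | Process name (process ID) |
| Graph > Axis labels > Y-axis | CPU % |
| Graph > Data label > Data label 1 | Process name |
| Graph > Data label > Data label 2 | Process ID |
| Drilldown > Report drilldown | Arbitrary |
| Drilldown > Field drilldown | Arbitrary |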
- Legend:
- --: Do not specify this item.
- #: Specify a value appropriate for the circumstances.
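The effect of the filter and ranking settings in the definition example above can be sketched outside PFM as follows. The sketch assumes that each PD_PDI record is available as a dictionary keyed by field name, and that ranking lists the highest values first, which is an assumption made here because the goal is to find processes with high processor usage. The function name top_processes, the highest_first parameter, and the sample values are hypothetical.

```python
def top_processes(records, field="CPU %", display_number=10, highest_first=True):
    """Apply the filter Program <> "_Total" AND PID <> "0", then rank the
    remaining records by the given field (illustrative sketch only)."""
    kept = [r for r in records if r["Program"] != "_Total" and r["PID"] != "0"]
    kept.sort(key=lambda r: r[field], reverse=highest_first)
    return kept[:display_number]


# Hypothetical sample data; xxx, yyy, and zzz stand for real program names.
records = [
    {"Program": "_Total", "PID": "0", "CPU %": 100.0},
    {"Program": "Idle", "PID": "0", "CPU %": 55.0},
    {"Program": "xxx", "PID": "1204", "CPU %": 30.0},
    {"Program": "yyy", "PID": "880", "CPU %": 12.5},
    {"Program": "zzz", "PID": "4312", "CPU %": 2.5},
]
top = top_processes(records, display_number=2)
print([(r["Program"], r["PID"], r["CPU %"]) for r in top])
```

The filter removes both the _Total summary instance and the instance with process ID 0, so neither dominates the ranking.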
(2) Memory
The following shows definition examples for the monitoring templates and for items not included in the monitoring templates.
(a) Monitoring template
- Memory-related monitoring template alarms
Table 1-22 Memory monitoring template alarms
| Monitoring template alarm | Record | Field | Error threshold | Warning threshold | Description |
|---|---|---|---|---|---|
| Available Memory | PI | Available Mbytes | < 3 | < 4 | When the unused size is below the threshold, physical memory might be insufficient. Find any processes that are using excessive memory and take the necessary countermeasures. If no process is at fault, the system environment is exceeding its resources, so take countermeasures such as adding memory. |
| Committed Mbytes | PI | Committed Mbytes | >= 2046 | >= 1024 | If usage of the virtual memory area continues at or above the threshold (the Total Physical Mem Mbytes field in the PI record), physical memory might be insufficient. |
| Pages/sec | PI | Pages/sec | >= 5 | >= 4 | If the number of pages per second continues at or above the threshold, memory might be causing a system bottleneck. However, if the threshold is exceeded only temporarily, the monitored value might be allowed to reach 20. |
| Page Faults/sec | PI | Page Faults/sec | >= 5 | >= 4 | If the rate of page faults continues at or above the threshold, memory might be a bottleneck. |
For details about settings for existing alarms, see 4. Monitoring Templates.
- Memory-related monitoring template reports
Table 1-23 Memory monitoring template reports
| Report name | Displayed information |
|---|---|
| Memory Available Trend (Multi-Agent) | The daily amount of available physical memory for multiple systems for the last month |
| Memory Paging | The number of times paging occurred on a minute-by-minute basis for the last hour |
| Memory Paging Status (Multi-Agent) | An hourly summary of memory page faults that occurred on multiple agents for the last 24 hours |
| OS Memory Usage Status (real-time report indicating memory usage) | Usage status of physical memory |
| OS Memory Usage Status (historical report indicating memory usage) | An hourly summary of the physical memory usage status for the last 24 hours |
| System Memory Detail | Details of system physical memory on a minute-by-minute basis for the last hour |
- System-related monitoring template reports (for memory)
Table 1-24 System monitoring template reports
| Report name | Displayed information |
|---|---|
| File System I/O Summary | A summary of the number of I/O operations on a minute-by-minute basis for the last hour |
| Process Trend | The number of processes executed by the system for the last month (by day) |
| System Overview (real-time report giving a system overview) | A summary of the status of the entire system |
| System Overview (historical report giving a system overview) | A summary of the system status on a minute-by-minute basis for the last hour |
| Workload Status | Data related to the system workload |
| Workload Status (Multi-Agent) | An hourly summary of the workload-related data for multiple systems for the last 24 hours |
For details about settings for existing reports, see 4. Monitoring Templates.
(b) Definition examples other than for monitoring templates
- Historical report for checking whether a memory leak has occurred
Table 1-25 Definition example
| Item | Explanation |
|---|---|
| Name and Type > Report name | PI - Memory |
| Name and Type > Product | Windows (6.0) |
| Name and Type > Report type > Real-time (single agent) | -- |
| Name and Type > Report type > Historical (single agent) | (Select) |
| Name and Type > Report type > Historical (multiple agents) | -- |
| Field > Record | PI |
| Field > Selected fields | Pool Nonpaged Bytes, Pool Paged Bytes, Pages/sec, Page Faults/sec, Data Map Hits %, Commit Limit Mbytes, Committed Mbytes, Non Committed Mbytes, % Committed Bytes in Use, Total Physical Mem Mbytes, Used Physical Mem Mbytes, Available Mbytes, % Physical Mem, Current Processes, Current Threads |
| Filter > Conditional expression | (Specify no filter condition.) |
| Filter > Specify when displayed | (Clear) |
| Indication settings > Specify when displayed | (Select) |
| Indication settings > Settings for the report display period > Date range | The value is specified when the report is displayed. |
| Indication settings > Settings for the report display period > Report interval | One minute |
| Indication settings > Peak time > Field | (None) |
| Indication settings > Maximum number of records | 1440# |
| Components > Table | All fields |
| Components > List | -- |
| Components > Graph | Pool Nonpaged Bytes |
| Components > Display name | -- |
| Components > Display key > Field | (None) |
| Components > Display key > In descending order | -- |
| Graph > Graph type | Line graph |
| Graph > Series direction | Row |
| Graph > Axis labels > X-axis | Time |
| Graph > Axis labels > Y-axis | Nonpaged pool |
| Graph > Data label > Data label 1 | (None) |
| Graph > Data label > Data label 2 | (None) |
| Drilldown > Report drilldown | Arbitrary |
| Drilldown > Field drilldown | Arbitrary |
- Legend:
- --: Do not specify this item.
- #: Specify a value appropriate for the circumstances.
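The historical report defined above graphs Pool Nonpaged Bytes over time so that steady growth can be spotted by eye. As a rough numeric counterpart, the following sketch fits a least-squares line to a series of samples and returns its slope per collection interval. Fitting a line is a technique chosen here for illustration and is not part of PFM; the function name and the sample values are hypothetical, and a positive slope is only a hint that warrants further investigation, not proof of a leak.

```python
def slope_per_interval(values):
    """Least-squares slope of the samples, in units per collection interval."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical Pool Nonpaged Bytes samples taken at one-minute intervals.
steady = [40_000_000, 40_100_000, 40_000_000, 40_100_000, 40_000_000]
growing = [40_000_000 + 50_000 * i for i in range(5)]

print(slope_per_interval(steady))
print(slope_per_interval(growing))
```

In practice a much longer series would be used, such as the 1440 one-minute records that the definition example allows, so that short-lived fluctuations do not dominate the fit.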
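- Real-time report for checking the memory usage of a process
Table 1-26 Definition example
| Item | Explanation |
|---|---|
| Name and Type > Report name | PD_PDI - Memory |
| Name and Type > Product | Windows (6.0) |
| Name and Type > Report type > Real-time (single agent) | (Select) |
| Name and Type > Report type > Historical (single agent) | -- |
| Name and Type > Report type > Historical (multiple agents) | -- |
| Field > Record | PD_PDI |
| Field > Selected fields | Select all fields. |
| Filter > Conditional expression | (Select Simple or Complex.) Program <> "_Total" AND PID <> "0" |
| Filter > Specify when displayed | (Clear) |
| Indication settings > Specify when displayed | (Select) |
| Indication settings > Indicate delta value | (Clear) |
| Indication settings > Refresh interval > Do not refresh automatically | (Clear) |
| Indication settings > Refresh interval > Initial value | 30 |
| Indication settings > Refresh interval > Minimum value | 30 |
| Indication settings > Display by ranking > Field | Pool Nonpaged Kbytes# |
| Indication settings > Display by ranking > Display number | 30# |
| Indication settings > Display by ranking > In descending order | (Select) |
| Components > Table | Program, PID, Handle Count, Page Faults/sec, Pool Nonpaged Kbytes, Pool Paged Kbytes, Working Set Kbytes, Page File Kbytes, Private Kbytes, CPU % |
| Components > List | -- |
| Components > Graph | Pool Nonpaged Kbytes, Pool Paged Kbytes, Working Set Kbytes, Page File Kbytes, Private Kbytes |
| Components > Display name | -- |
| Components > Display key > Field | (None) |
| Components > Display key > In descending order | -- |
| Graph > Graph type | Line graph |
| Graph > Series direction | Row |
| Graph > Axis labels > X-axis | Time |
| Graph > Axis labels > Y-axis | Memory usage |
| Graph > Data label > Data label 1 | (None) |
| Graph > Data label > Data label 2 | (None) |
| Drilldown > Report drilldown | Arbitrary |
| Drilldown > Field drilldown | Arbitrary |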
- Legend:
- --: Do not specify this item.
- #: Set the fields that you want to monitor.
(3) Disks
The following shows definition examples for the monitoring templates.
(a) Monitoring template
- Disk-related monitoring template alarms
Disk-related monitoring template alarms are stored in the alarm table for PFM Windows Template Alarms [DSK] 09.00.
Table 1-27 Disk monitoring template alarms
| Monitoring template alarm | Record | Field | Error threshold | Warning threshold | Description |
|---|---|---|---|---|---|
| Disk Space | PI_LOGD | % Free Space | < 5 | < 15 | If the free space is less than the threshold, disk capacity might be insufficient. Appropriate action, such as deleting unnecessary files, compressing files, defragmenting the disk, or adding a disk, might be required. |
| Logical Disk Free | PI_LOGD | ID | <> _Total | <> _Total | If there is little unused area, disk capacity might be insufficient. |
| | | Free Mbytes | < 5120 | < 10240 | |
| Disk Busy % | PI_LOGD | ID | <> _Total | <> _Total | If the percentage of time that the disk is busy continues at or above the threshold, this indicates high disk load. |
| | | % Disk Time | >= 90 | >= 50 | |
| Logical Disk Queue | PI_LOGD | ID | <> _Total | <> _Total | If the number of requests continues at or above the threshold, this indicates that the logical disk is congested. |
| | | Current Disk Queue Length | >= 5 | >= 3 | |
| Physical Disk Queue | PI_PHYD | ID | <> _Total | <> _Total | If the number of requests continues at or above the threshold, this indicates that the physical disk is congested. |
| | | Current Disk Queue Length | >= 5 | >= 3 | |
For details about settings for existing alarms, see 4. Monitoring Templates.
- Disk-related monitoring template reports
Table 1-28 Disk monitoring template reports
| Report name | Displayed information |
|---|---|
| Disk Time - Top 10 Logical Drives | 10 logical disks with the highest disk usage |
| Free Megabytes - Logical Drive Status | Information about the free space on a logical disk |
| Free Space - Low 10 Logical Drives | 10 logical disks with the smallest amount of free space |
| Free Space - Top 10 Logical Drives | 10 logical disks with the largest amount of free space |
| Logical Drive Detail | Details of a specific logical disk |
For details about existing reports, see 4. Monitoring Templates.
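Several disk alarms in Table 1-27 pair an ID condition with a numeric threshold. The following sketch shows one way to read the Logical Disk Free alarm, assuming that the two conditions of an alarm must both be true (combined with AND) and that the error thresholds are checked before the warning thresholds. Both points are assumptions of this sketch rather than statements from this subsection. The function name, the status strings, and the sample records are hypothetical.

```python
def logical_disk_free_status(record):
    """Classify one PI_LOGD record against the Logical Disk Free thresholds.

    Assumes both conditions of the alarm must hold (ID <> _Total AND
    Free Mbytes below the threshold); illustrative sketch only.
    """
    if record["ID"] == "_Total":
        return "normal"  # the ID <> _Total condition is not satisfied
    free_mbytes = record["Free Mbytes"]
    if free_mbytes < 5120:
        return "error"
    if free_mbytes < 10240:
        return "warning"
    return "normal"


# Hypothetical sample records.
disks = [
    {"ID": "C:", "Free Mbytes": 4096},
    {"ID": "D:", "Free Mbytes": 8000},
    {"ID": "E:", "Free Mbytes": 20000},
    {"ID": "_Total", "Free Mbytes": 100},
]
statuses = {d["ID"]: logical_disk_free_status(d) for d in disks}
print(statuses)
```

The ID <> _Total condition keeps the summary instance from raising the alarm on the combined value of all logical disks.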
(4) Network
The following shows definition examples for the monitoring template.
(a) Monitoring template
- Network-related monitoring template alarms
Network-related monitoring template alarms are stored in the alarm table for PFM Windows Template Alarms [NET] 09.00.
Table 1-29 Network monitoring template alarms
| Monitoring template alarm | Record used | Field used | Abnormal condition | Warning condition | Meaning |
|---|---|---|---|---|---|
| Network Received | PI_NETI | Bytes Rcvd/sec | >= 3000 | >= 2048 | Compare the number of bytes that the server receives from the network with the total bandwidth of the network card. If the received data is at or above 50% of the bandwidth (the amount of data that can be transferred over the network in a fixed time), the network connection might be a bottleneck. |
For details about existing alarms, see 4. Monitoring Templates.
- Network-related monitoring template reports
Table 1-30 Network monitoring template reports
| Report name | Displayed information |
|---|---|
| Access Failure Status (real-time report indicating system access errors) | The number of errors occurring in system access attempts |
| Access Failure Status (historical report indicating system access errors) | The cumulative number of errors occurring in system access attempts on an hourly basis for the last 24 hours |
| Server Activity Detail | Information about the status of communication with the network |
| Server Activity Summary (Multi-Agent) | An hourly summary of the status of communication with the network for the last 24 hours |
| Server Activity Summary (real-time report providing information about the status of communication over the network) | Information about the status of communication with the network |
| Server Activity Summary (historical report providing information about the status of communication over the network) | The status of communication with the network on a minute-by-minute basis for the last hour |
| Server Activity Summary Trend (Multi-Agent) | The status of data communication between the network and the servers of multiple systems on a daily basis for the last month |
| Server Sessions Trend (Multi-Agent) | The number of active sessions on the servers of multiple systems on a daily basis for the last month |
| System Utilization Status | The status of communication between the server and the network |
For details about existing reports, see 4. Monitoring Templates.
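The Network Received alarm in Table 1-29 is judged against the bandwidth of the network card, so a suitable Bytes Rcvd/sec threshold depends on the link speed. The following sketch converts a received byte rate into a percentage of bandwidth, assuming that the link speed is known in bits per second and that the full nominal speed is available for receiving. The function name and the sample values are hypothetical.

```python
def receive_utilization_percent(bytes_rcvd_per_sec, link_speed_bits_per_sec):
    """Received traffic as a percentage of the link's nominal bandwidth."""
    if link_speed_bits_per_sec <= 0:
        raise ValueError("link speed must be positive")
    return bytes_rcvd_per_sec * 8 * 100 / link_speed_bits_per_sec


# Hypothetical example: a 100 Mbit/s network card.
link_speed = 100_000_000
busy = receive_utilization_percent(7_500_000, link_speed)
quiet = receive_utilization_percent(1_000_000, link_speed)
print(busy, busy >= 50)
print(quiet, quiet >= 50)
```

Working the calculation backwards gives a byte-rate threshold for a given card: on the 100 Mbit/s link assumed here, 50% of the bandwidth corresponds to 6,250,000 bytes per second.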
(5) Processes and services
The following gives definition examples for monitoring templates.
(a) Monitoring template
- Process-related monitoring template alarms
Process-related monitoring template alarms are stored in the alarm table for PFM Windows Template Alarms [PS] 09.00.
Table 1-31 Process monitoring template alarms
| Monitoring template alarm | Record used | Field used | Abnormal condition | Warning condition | Meaning |
|---|---|---|---|---|---|
| Process End | PD_PDI | Program | = jpcsto | = jpcsto | If performance data is not collected, this indicates that the process has stopped. |
| Process Alive | PI_WGRP | Process Count | > 0 | > 0 | This indicates that the workgroup process is running. |
| | | Workgroup | = workgroup | = workgroup | |
| Service(Service Nm) | PD_SVC | Service Name | = JP1PCAGT_TS | = JP1PCAGT_TS | If the state of the application service (process) is not RUNNING, this indicates that the service has stopped. |
| | | State | <> RUNNING | <> RUNNING | |
| Service(Display Nm) | PD_SVC | Display Name | = PFM - Agent Store for Windows | = PFM - Agent Store for Windows | If the state of the application service (process) is not RUNNING, this indicates that the service has stopped. |
| | | State | <> RUNNING | <> RUNNING | |
For details about existing alarms, see 4. Monitoring Templates.
- Process-related monitoring template reports
Table 1-32 Process monitoring template reports
| Report name | Displayed information |
|---|---|
| CPU Usage - Top 10 Processes | The 10 processes with the highest CPU usage |
| Process Detail | Details about system resource consumption by a specific process |
| Page Faults - Top 10 Processes | The 10 processes with the highest page fault frequency |
For details about existing reports, see 4. Monitoring Templates.
(6) Event logs
The following gives definition examples for monitoring templates.
(a) Monitoring template
- Event log-related monitoring template alarms
Monitoring template alarms related to event logs are stored in the alarm table for PFM Windows Template Alarms [LOG] 09.00.
Table 1-33 Event log monitoring template alarms
| Monitoring template alarm | Record used | Field used | Abnormal condition | Warning condition | Meaning |
|---|---|---|---|---|---|
| Event Log(all) | PD_ELOG | Log Name | <> dummy | <> dummy | This indicates that an error or warning has occurred for the application. |
| | | Event Type Name | = Error | = Warning | |
| | | Source Name | <> dummy | <> dummy | |
| | | Event ID | <> 0 | <> 0 | |
| | | Description | <> dummy | <> dummy | |
| Event Log(System) | PD_ELOG | Log Name | = System | = System | This indicates that an error or warning has occurred for MSCS. |
| | | Event Type Name | = Error | = Warning | |
| | | Source Name | = ClusSvc | = ClusSvc | |
| | | Event ID | <> 0 | <> 0 | |
| | | Description | <> dummy | <> dummy | |
For details about existing alarms, see 4. Monitoring Templates.
- Event log-related monitoring template reports
N/A
All Rights Reserved. Copyright (C) 2009, Hitachi, Ltd.