3.1.5 Process and service monitoring examples
This subsection explains how to monitor process performance and service performance.
(1) Overview
Because system functionality is provided by individual processes and services, understanding the operating status of processes and services is essential for stable system operation.
If one of the processes or services that provide system functionality terminates abnormally, the system can stop, with serious consequences. To detect such an abnormal condition early and take appropriate action, you must monitor the status of processes and services, including their generation and disappearance.
Note that PFM - Agent for Platform checks processes at the same interval at which it collects information. Accordingly, the time at which the disappearance of a process is detected is the time at which PFM - Agent for Platform collected the information, not the actual time at which the process disappeared.
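The effect of interval-based checking on detection time can be illustrated with a small sketch. This is not PFM - Agent for Platform code; the names `Detection` and `detect_disappearance` are hypothetical, and the sketch assumes that collections run at a fixed interval starting from a known time.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    actual_end: float   # when the process really disappeared (seconds)
    detected_at: float  # collection time at which the loss is first seen

    @property
    def delay(self) -> float:
        """Gap between the real disappearance and its detection."""
        return self.detected_at - self.actual_end


def detect_disappearance(actual_end: float, interval: float,
                         first_collection: float = 0.0) -> Detection:
    """Return the first collection at or after actual_end.

    Assumption: collections run at first_collection,
    first_collection + interval, first_collection + 2 * interval, ...
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if actual_end <= first_collection:
        return Detection(actual_end, first_collection)
    # Whole intervals needed to reach or pass the disappearance time.
    steps = math.ceil((actual_end - first_collection) / interval)
    return Detection(actual_end, first_collection + steps * interval)
```

For example, with a 60-second interval, `detect_disappearance(130, 60)` reports the loss at the 180-second collection, 50 seconds after the process actually ended. Under these assumptions the delay is always shorter than one collection interval.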
The following table lists and describes the principal records and fields related to the monitoring of processes and services.
| Record | Field | Description (example) |
|---|---|---|
| PI_WGRP | Process Count | The number of processes. If the value of this field is at or below the threshold (the minimum number of processes that need to be active), some or all of the required processes are inactive.# |
| PD_PDI | Program | The name of a process. If this record is not collected, the process is inactive. |
| PD_ASVC, PD_SVC | Service Name, Display Name, State | The name and status of a service. If the status of the application service (process) is not RUNNING, the service is inactive. |
| PD_APS | Program Name | The name of a process. If this record is not collected, the process has stopped. |
| PD_APP, PD_APP2 | Application Name | The name of an application definition. |
| | Application Exist | The status of the applications. NORMAL indicates that the status of any one of the monitored targets is normal. ABNORMAL indicates that the status of all the monitored targets is abnormal. |
| | Application Status | The status of the applications. NORMAL indicates that the status of all the monitored targets is normal. ABNORMAL indicates that the status of any one of the monitored targets is abnormal. |
| | Application Name, Monitoring Label, Monitoring Status | Conditional results on the number of monitors. If the value of the Monitoring Status field is ABNORMAL, the number of running programs, services, or command lines is not within the specified range. |

#: The collection data addition utility must be set up to collect this record.
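The difference between the Application Exist and Application Status fields is easy to misread, so the following sketch restates the table's descriptions as code. It is an illustration only, not the product's implementation. The function names and parameters are hypothetical, and rejecting an empty list of targets is an assumption of this sketch, because the table does not cover that case.

```python
from typing import Sequence

NORMAL = "NORMAL"
ABNORMAL = "ABNORMAL"


def monitoring_status(running: int, minimum: int, maximum: int) -> str:
    """ABNORMAL when the running count is outside the specified range."""
    return NORMAL if minimum <= running <= maximum else ABNORMAL


def application_exist(statuses: Sequence[str]) -> str:
    """NORMAL if any one monitored target is normal."""
    if not statuses:
        # Assumption of this sketch: an empty target list is rejected.
        raise ValueError("at least one monitored target is required")
    return NORMAL if any(s == NORMAL for s in statuses) else ABNORMAL


def application_status(statuses: Sequence[str]) -> str:
    """NORMAL only if all monitored targets are normal."""
    if not statuses:
        raise ValueError("at least one monitored target is required")
    return NORMAL if all(s == NORMAL for s in statuses) else ABNORMAL
```

With two monitored targets, one inside its range and one outside it, `application_exist` returns NORMAL while `application_status` returns ABNORMAL.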
(2) Monitoring methods
(a) Monitoring process disappearance
You can use the Process End alarm provided by the monitoring templates to monitor process disappearance.
If a process terminates abnormally, the system can stop, with serious consequences. By using an alarm to monitor process disappearance, you can detect the failure early and recover the system promptly.
For details, see 3.2.5(1) Monitoring template.
(b) Monitoring process generation
You can use the Process Alive alarm provided by the monitoring templates to monitor process generation.
You can use an alarm to monitor the generation of processes for each application or the status of scheduled processes, enabling you to check the operating status of the production system.
By using the PI_WGRP record and specifying the workgroup settings of the collection data addition utility, you can perform several types of monitoring. For example, you can monitor the following items: process generation, process disappearance, the number of processes that have the same name, the number of processes for each application, and the number of processes activated for each user.
For details, see 3.2.5(1) Monitoring template.
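The kinds of monitoring listed above (generation, disappearance, and counts by name or by user) can be sketched as operations on two successive process snapshots. This is a conceptual illustration, not how the PI_WGRP record or the collection data addition utility works internally. The `Proc` type, the function names, and the sample data in the usage note are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set, Tuple


class Proc(NamedTuple):
    name: str  # process name
    user: str  # user that started the process


def diff_snapshots(previous: Iterable[Proc],
                   current: Iterable[Proc]) -> Tuple[Set[str], Set[str]]:
    """Return (generated, disappeared) process names between two collections."""
    before = {p.name for p in previous}
    after = {p.name for p in current}
    return after - before, before - after


def count_by_name(procs: Iterable[Proc]) -> Counter:
    """Number of processes that have the same name."""
    return Counter(p.name for p in procs)


def count_by_user(procs: Iterable[Proc]) -> Counter:
    """Number of processes activated for each user."""
    return Counter(p.user for p in procs)
```

Because `diff_snapshots` compares names only, it does not notice when one of several identically named processes ends. Comparing the per-name counts from `count_by_name` across the two snapshots covers that case.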
(c) Monitoring for service stoppages
You can use the Service (Service Nm) alarm or the Service (Display Nm) alarm provided by the monitoring templates to monitor service stoppages.
If a service terminates abnormally, the production system can stop, with serious consequences. By using an alarm to monitor a service for stoppages, you can detect the failure early and recover the system promptly.
For details, see 3.2.5(1) Monitoring template.
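The rule that a service whose state is not RUNNING is treated as inactive can be sketched as a simple filter. The function name `stopped_services` and the sample service names and states in the usage note are hypothetical; the alarms themselves are defined in the monitoring templates, not by this code.

```python
from typing import List, Mapping


def stopped_services(states: Mapping[str, str]) -> List[str]:
    """Return the names of services whose state is not RUNNING, sorted."""
    return sorted(name for name, state in states.items()
                  if state != "RUNNING")
```

For example, given `{"Spooler": "RUNNING", "W32Time": "STOPPED", "Dhcp": "PAUSED"}`, the function returns `["Dhcp", "W32Time"]`, the two services that would warrant attention.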