3.1.3 Disk monitoring examples

This subsection explains how to monitor disk performance.

Organization of this subsection

(1) Overview
(2) Monitoring methods

(1) Overview

You can monitor disk performance to detect disk resource shortages and bottlenecks caused by a disk. Continuous monitoring also reveals upward trends in disk space usage, which helps you determine an appropriate configuration for the system or when the system configuration should be expanded.
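As a rough illustration of trend checking, the following sketch fits a straight line to daily free-space readings and estimates how many days remain until free space reaches zero. The function name `days_until_full`, the one-reading-per-day interval, and the sample values are assumptions made for this example; they are not part of the product.

```python
from typing import Optional, Sequence


def days_until_full(free_mbytes: Sequence[float]) -> Optional[float]:
    """Estimate the days left until free space reaches zero.

    free_mbytes holds one hypothetical Free Mbytes reading per day,
    oldest first. Returns None when free space is not shrinking or
    when there are fewer than two readings.
    """
    n = len(free_mbytes)
    if n < 2:
        return None
    # Least-squares slope of free space against the day index.
    mean_x = (n - 1) / 2
    mean_y = sum(free_mbytes) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(free_mbytes))
    den = sum((i - mean_x) ** 2 for i in range(n))
    slope = num / den  # MB per day; a negative value means space is shrinking
    if slope >= 0:
        return None
    # Extrapolate from the fitted value at the most recent reading.
    intercept = mean_y - slope * mean_x
    fitted_last = intercept + slope * (n - 1)
    return max(fitted_last / -slope, 0.0)


samples = [5000.0, 4900.0, 4800.0, 4700.0, 4600.0]  # illustrative values only
print(days_until_full(samples))  # 46.0 for these samples
```

A linear fit is only one possible choice. Usage that grows in bursts would call for a longer observation window or a different model.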

A disk stores programs, the data used by the programs, and other data. If the amount of free disk space becomes insufficient, data might be lost or the system response might slow down.

If a program that is performing a disk I/O operation must pause (that is, wait for a response), the disk is becoming a bottleneck.

A disk bottleneck can cause any of several types of performance degradation, such as slow process response. For this reason, it is important to check that disk performance is not degrading.

If you think a disk bottleneck has occurred, first make sure that the disk is not fragmented. Next, check that no invalid files are occupying disk space, so that enough free disk space is available. If invalid files exist, you must identify the programs that created the files and take appropriate action.

The Disk Space alarm is provided by the monitoring templates. If you want to perform more detailed monitoring, see the following table, which lists and describes the principal records and fields related to the monitoring of disk performance.

Table 3‒5: Principal fields related to disk monitoring
Record	Field	Description (example)
`PI_LOGD`, `PI_PHYD`	`% Disk Time`	The disk busy rate. If the value of this field continues to be at or above the threshold (50% or more, or close to 100%), the load on the disk is high.
`PI_LOGD`, `PI_PHYD`	`Current Disk Queue Length`	The number of queued requests. If the value of this field continues to be at or above the threshold (3), the disk is congested.
`PI_LOGD`, `PI_PHYD`	`Avg Disk Bytes/Xfer`	The number of bytes transferred to or from the disk in one I/O operation. The larger the value of this field, the more efficiently the system is operating.
`PI_LOGD`, `PI_PHYD`	`Disk Bytes/sec`	The number of bytes transferred to or from the disk per second. The larger the value of this field, the more efficiently the system is operating.
`PI_LOGD`	`% Free Space`	The percentage of free disk space. If the percentage is low, the amount of free disk space is insufficient.
`PI_LOGD`	`Free Mbytes`	The amount of available disk space. If the value of this field is small, the amount of free disk space is insufficient.
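The table describes values that "continue to be" at or above a threshold. One way to read that condition is as a run of consecutive samples, sketched below. The helper `sustained_at_or_above`, the run lengths, and the sample values are assumptions for illustration; only the thresholds of 50% and 3 come from the table examples.

```python
from typing import Sequence


def sustained_at_or_above(values: Sequence[float], threshold: float,
                          min_consecutive: int) -> bool:
    """Return True if values contains at least min_consecutive
    consecutive samples at or above threshold."""
    run = 0
    for v in values:
        # Extend the current run, or reset it when a sample drops below.
        run = run + 1 if v >= threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Illustrative samples, not measured data.
busy_pct = [35.0, 62.0, 71.0, 80.0, 40.0]   # % Disk Time readings
queue_len = [1, 4, 2, 5, 1]                 # Current Disk Queue Length readings

print(sustained_at_or_above(busy_pct, 50.0, 3))  # True: 62, 71, 80 in a row
print(sustained_at_or_above(queue_len, 3, 2))    # False: never two in a row
```

Requiring several consecutive samples keeps a single short spike from being treated as a sustained high load.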


(2) Monitoring methods

(a) Monitoring the percentage of free logical-disk space

The percentage of free space on a logical disk can be monitored using the Disk Space alarm provided by the monitoring templates.

When the percentage of free logical-disk space (the % Free Space field of the PI_LOGD record) is near or at the threshold value, file defragmentation might be affected.

If the disk capacity is large, the system might operate normally even when the percentage of free logical-disk space is near or at the threshold value. Therefore, monitoring the amount of free logical-disk space, as well as the percentage, is recommended.

For details, see 3.2.3(1) Monitoring template.
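The following sketch shows why the percentage alone can mislead on a large disk: two disks with the same percentage of free space can differ greatly in the amount of free space. The function `free_space_status`, both threshold values, and the disk sizes are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Tuple


def free_space_status(total_mbytes: float, free_mbytes: float,
                      pct_threshold: float,
                      mbytes_threshold: float) -> Tuple[bool, bool]:
    """Return (percentage_low, amount_low) for one logical disk."""
    pct_free = 100.0 * free_mbytes / total_mbytes
    return pct_free <= pct_threshold, free_mbytes <= mbytes_threshold


# Hypothetical thresholds, not the defaults of any alarm.
PCT_THRESHOLD = 10.0          # percent
MBYTES_THRESHOLD = 10_000.0   # MB

# Both disks have 5% free space, but very different amounts.
large = free_space_status(4_000_000.0, 200_000.0, PCT_THRESHOLD, MBYTES_THRESHOLD)
small = free_space_status(100_000.0, 5_000.0, PCT_THRESHOLD, MBYTES_THRESHOLD)

print(large)  # (True, False): low percentage, but 200,000 MB is still free
print(small)  # (True, True): low percentage and only 5,000 MB is free
```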

(b) Monitoring the amount of free logical-disk space

The amount of free space on a logical disk can be monitored using the Logical Disk Free alarm provided by the monitoring templates.

You can effectively detect a low disk space level by using an alarm to monitor the amount of free logical-disk space.

The threshold for the amount of free logical-disk space (the Free Mbytes field of the PI_LOGD record) can be used as a guideline for determining whether to take action, such as deleting unnecessary files, compressing files, or adding a disk.

For details, see 3.2.3(1) Monitoring template.

(c) Monitoring the disk busy rate

You can use the Disk Busy % alarm provided by the monitoring templates to monitor the disk busy rate.

By using an alarm to monitor the disk busy rate, you can check whether excessive paging (reading and writing of pages by processes) is occurring.

If the disk busy rate (the % Disk Time field of the PI_PHYD or PI_LOGD record) continues to be at or above the threshold, you might need to take action. For example, you might need to identify the processes that frequently request disk I/O operations, and then distribute their processing.

When you monitor the disk busy rate, monitoring disk congestion is also recommended.

For details, see 3.2.3(1) Monitoring template.

(d) Monitoring disk congestion

Disk congestion can be monitored using the Logical Disk Queue alarm or Physical Disk Queue alarm provided by the monitoring templates.

By using an alarm to monitor disk congestion, you can check whether I/O requests have been excessive.

If the disk congestion level (the Current Disk Queue Length field of the PI_PHYD or PI_LOGD record) continues to be at or above the threshold, you might need to take action. For example, you might need to identify the processes that frequently request disk I/O operations, and then distribute their processing.

When you monitor disk congestion, monitoring the disk busy rate is also recommended.

For details, see 3.2.3(1) Monitoring template.
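Because the disk busy rate and disk congestion are recommended to be monitored together, the following sketch evaluates both values from one sample. The `DiskSample` type, the `classify` function, and the returned labels are hypothetical; the default thresholds reuse the example values from Table 3‒5.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    busy_pct: float       # corresponds to the % Disk Time field
    queue_length: float   # corresponds to the Current Disk Queue Length field


def classify(sample: DiskSample, busy_threshold: float = 50.0,
             queue_threshold: float = 3.0) -> str:
    """Label one sample by comparing both fields with their thresholds."""
    busy = sample.busy_pct >= busy_threshold
    congested = sample.queue_length >= queue_threshold
    if busy and congested:
        return "busy and congested"
    if busy:
        return "busy"
    if congested:
        return "congested"
    return "normal"


# Illustrative samples, not measured data.
print(classify(DiskSample(busy_pct=85.0, queue_length=6)))  # busy and congested
print(classify(DiskSample(busy_pct=85.0, queue_length=1)))  # busy
print(classify(DiskSample(busy_pct=20.0, queue_length=0)))  # normal
```

Looking at the two values side by side helps distinguish a disk that is heavily used but keeping up from one where requests are also piling up.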
