Hitachi

JP1 Version 12 JP1/Performance Management User's Guide


6.3.5 Setting the function for measurement value output at alarm recovery

Because alarms that monitor multi-instance records only enter normal status when the value of every instance is in the normal range, they do not identify the specific instance that caused the alarm to be generated. For this reason, fixed values are displayed for measured values and message text when conditions return to normal.

If you enable the function for measurement value output at alarm recovery, the instance that most recently caused the alarm to enter abnormal or warning status is assumed to be responsible for its return to normal status, and the measured values and message text are set accordingly.

The following table describes how this function affects the alarm generated when an alarm that monitors a multi-instance record returns to normal status.

Table 6‒4: Differences between alarms generated at return to normal status

| Item | Function enabled | Function disabled |
|---|---|---|
| Number of alarms generated | The number of alarms generated simultaneously when the last abnormal or warning alarm was generated | 1 |
| Measured value (the variable %CVS in the alarm definition) | The current value of the instance that caused the last abnormal or warning alarm to be generated# | <OK> |
| The contents of the message text (the variable %MTS in the alarm definition) | The value set in the alarm definition | -- |

Legend:

--: Not generated.

#

If the instance has no value, (N/A) is set.
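The differences in Table 6‒4 can be summarized as a small model. The following is a minimal sketch, not the product's implementation: the names `RecoveryAlarm` and `recovery_alarms` are hypothetical, and the sketch assumes that the message text is produced by substituting the measured value for %CVS in the text set in the alarm definition.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RecoveryAlarm:
    measured_value: str           # what %CVS expands to
    message_text: Optional[str]   # what %MTS expands to; None means "not generated"


def recovery_alarms(enabled: bool,
                    trigger_instances: List[str],
                    current_values: Dict[str, object],
                    message_template: str) -> List[RecoveryAlarm]:
    """Model the alarms issued when a multi-instance record returns to normal.

    trigger_instances: instances that caused the last abnormal or warning alarm(s)
    current_values:    instance -> current value (a missing key means "no value")
    """
    if not enabled:
        # Function disabled: one alarm, fixed measured value, no message text.
        return [RecoveryAlarm("<OK>", None)]

    alarms = []
    for instance in trigger_instances:
        value = current_values.get(instance)
        cvs = "(N/A)" if value is None else str(value)
        # Assumption: the message text is the definition text with %CVS replaced.
        alarms.append(RecoveryAlarm(cvs, message_template.replace("%CVS", cvs)))
    return alarms


# One triggering instance (C:), whose current value is 20.
print(recovery_alarms(True, ["C:"], {"C:": 20, "D:": 30}, "Disk Busy = %CVS"))
print(recovery_alarms(False, ["C:"], {"C:": 20, "D:": 30}, "Disk Busy = %CVS"))
```

With the function enabled, the sketch returns one alarm per triggering instance. With it disabled, it always returns a single alarm whose measured value is <OK>.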

For details on how to set the function for measurement value output at alarm recovery, see the chapter describing installation and setup in the JP1/Performance Management Planning and Configuration Guide.

Tip

Even with the function for measurement value output at alarm recovery enabled, the only part of the alarm message text that changes is the measured value. Therefore, if you write alarm message text that refers to the alarm being in abnormal or warning status, the message receiver might assume that the instance to which it refers is in abnormal or warning status, when in fact it indicates a return to normal status. For this reason, we recommend that you do not include references to the alarm status in alarm message text in a system with the function for measurement value output at alarm recovery enabled.

Also, some monitoring template alarms for PFM - Agent and PFM - RM might contain alarm message text that refers to the alarm status. When using such a monitoring template alarm in a system with the function for measurement value output at alarm recovery enabled, we recommend that you copy the alarm table and edit the alarm message text as needed.


(1) Contents of alarm message text

This subsection describes the contents of alarm message text, using the Abnormal Status(A) alarm of the health check agent as an example. The Abnormal Status(A) alarm monitors the health check status of an agent. In the example below, the alarm monitors PFM - Agent for Platform (Windows) on the server host01.

The following table describes how the function for measurement value output at alarm recovery affects the alarm message text.

Table 6‒5: Effect of the function for measurement value output at alarm recovery on alarm message text

| Event | Alarm message text with the function enabled | Alarm message text with the function disabled |
|---|---|---|
| A value exceeds the threshold and an abnormal or warning alarm is reported | Status of TA1host01 changed to Incomplete | Status of TA1host01 changed to Incomplete |
| The value enters a normal range and a normal alarm is reported | Status of TA1host01 changed to Running | -- |

Legend:

--: Not output.

(2) Examples of alarm settings and generated alarms

For each of the following alarm types, this subsection describes examples of issued alarms and their message text:

  • Standard alarms

  • Alarms that monitor whether a value exists#2

  • Alarms that evaluate all data#1

#1

For details, see the chapter describing the New Alarm > Basic Information window or the Edit > Basic Information window in the manual JP1/Performance Management Reference.

#2

For details, see 6.4.3 Setting a value whose existence is to be monitored.

(a) Standard alarms

The following describes an example of a standard alarm. In this example, the alarm monitors PFM - Agent for Platform (Windows). The alarm definition is as follows:

  • Message text: Disk Busy % (%CVS1) = %CVS2

  • Check whether the value exists: Not selected.

  • Activate alarm: Selected.

  • Notify when the state changed: Selected.

  • State changes for the alarm: Selected.

  • Evaluate all data: Not selected.

  • Evaluate regularly: Selected.

  • Alarm when damping conditions are satisfied: Not selected.

  • Record: Logical Disk Overview (PI_LOGD)

  • Abnormal: ID <> "_Total" AND % Disk Time >= "90.000"

  • Warning: ID <> "_Total" AND % Disk Time >= "50.000"
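To illustrate how the message text in this definition relates to the generated messages, the following minimal sketch expands the numbered %CVS variables. It is an illustration only, not the product's expansion logic. Judging from the example messages in this subsection, %CVS1 corresponds to the first field in the conditional expressions (ID) and %CVS2 to the second (% Disk Time). The function name `expand_message`, the treatment of a bare %CVS as the first field, and the (N/A) fallback for an out-of-range index are assumptions.

```python
import re
from typing import Sequence


def expand_message(template: str, fields: Sequence[str]) -> str:
    """Replace %CVSn in the template with the n-th field value (1-based).

    Assumption for this sketch: a bare %CVS is treated like %CVS1, and an
    index with no matching field is rendered as (N/A).
    """
    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1)) if match.group(1) else 1
        if 1 <= index <= len(fields):
            return fields[index - 1]
        return "(N/A)"

    return re.sub(r"%CVS(\d*)", substitute, template)


# Field values in the order they appear in the conditional expressions:
# ID first, then % Disk Time.
print(expand_message("Disk Busy % (%CVS1) = %CVS2", ["C:", "90"]))
# prints: Disk Busy % (C:) = 90
```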

■ When an instance enters abnormal or warning status after an alarm of that status has been generated

The following table describes an example in which an abnormal or warning alarm is generated for an instance, after which another instance enters the same status as the alarm.

Table 6‒6: Example of alarms issued when multiple instances have the same status

| Timing of alarm evaluation | Status of instance 1 (C drive) | Status of instance 2 (D drive) | Generated alarm |
|---|---|---|---|
| 1st | Measurement value: 10; Status: Normal | Measurement value: 20; Status: Normal | --#1 |
| 2nd | Measurement value: 90; Status: Abnormal | Measurement value: 30; Status: Normal | Alarm: abnormal; Alarm message: Disk Busy % (C:) = 90 |
| 3rd | Measurement value: 60; Status: Warning | Measurement value: 90; Status: Abnormal | --#2 |
| 4th | Measurement value: 20; Status: Normal | Measurement value: 30; Status: Normal | Alarm: normal; Alarm message: Disk Busy % (C:) = 20 |

Legend:

--: Not generated.

#1

A normal alarm is not generated because it is the first time the alarm is evaluated.

#2

An abnormal alarm is not generated because the alarm status has not changed.

In this example, different instances enter the abnormal range at the second and third evaluations. However, because the status of the record does not change between the second and third evaluations, no abnormal alarm is issued at the third evaluation.

The variable %CVS in a normal alarm stores the measurement value of the instance that caused the last abnormal or warning alarm to be issued. For this reason, the normal alarm generated at the fourth evaluation uses the measurement value of instance 1, which caused the abnormal alarm to be generated at the second evaluation.

■ When an alarm of another status is generated after an abnormal or warning alarm

The following table describes an example in which an abnormal or warning alarm is generated, after which another alarm of a different status is generated based on a measurement value of another instance.

Table 6‒7: Examples of alarms issued when multiple instances have different statuses

| Timing of alarm evaluation | Status of instance 1 (C drive) | Status of instance 2 (D drive) | Generated alarm |
|---|---|---|---|
| 1st | Measurement value: 10; Status: Normal | Measurement value: 20; Status: Normal | --# |
| 2nd | Measurement value: 90; Status: Abnormal | Measurement value: 30; Status: Normal | Alarm: abnormal; Alarm message: Disk Busy % (C:) = 90 |
| 3rd | Measurement value: 40; Status: Normal | Measurement value: 60; Status: Warning | Alarm: warning; Alarm message: Disk Busy % (D:) = 60 |
| 4th | Measurement value: 20; Status: Normal | Measurement value: 30; Status: Normal | Alarm: normal; Alarm message: Disk Busy % (D:) = 30 |

Legend:

--: Not generated.

#

A normal alarm is not generated because it is the first time the alarm is evaluated.

In this example, instance 1 enters the normal range the third time the alarm is evaluated. However, because instance 2 has entered the warning range, a warning alarm is generated.

The variable %CVS in a normal alarm stores the measurement value of the instance that caused the last abnormal or warning alarm to be issued. For this reason, the normal alarm generated at the fourth evaluation uses the measurement value of instance 2, which caused the warning alarm to be generated at the third evaluation.
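The behavior shown in Table 6‒6 and Table 6‒7 can be reproduced with a short simulation. The following is a minimal sketch with the function for measurement value output at alarm recovery enabled. It is not the product's evaluation logic. The function names are hypothetical, the thresholds come from the alarm definition above, and the sketch assumes that when several instances share the worst status, the first one listed is treated as the cause of the alarm.

```python
ABNORMAL_THRESHOLD = 90.0
WARNING_THRESHOLD = 50.0
SEVERITY = {"Normal": 0, "Warning": 1, "Abnormal": 2}


def status_of(value):
    """Classify one % Disk Time value against the alarm thresholds."""
    if value >= ABNORMAL_THRESHOLD:
        return "Abnormal"
    if value >= WARNING_THRESHOLD:
        return "Warning"
    return "Normal"


def simulate_standard_alarm(evaluations):
    """Return, per evaluation, None or a (status, message) tuple.

    evaluations: list of dicts mapping instance ID -> % Disk Time.
    """
    previous = "Normal"   # so that no normal alarm is issued at the first evaluation
    trigger = None        # instance that caused the last abnormal or warning alarm
    results = []
    for values in evaluations:
        statuses = {inst: status_of(v) for inst, v in values.items()}
        record = max(statuses.values(), key=SEVERITY.get)  # worst instance status
        if record == previous:
            results.append(None)  # status unchanged, so no alarm
            continue
        if record != "Normal":
            # Assumption: the first instance with the worst status is the cause.
            trigger = next(i for i, s in statuses.items() if s == record)
        value = values.get(trigger)
        shown = "(N/A)" if value is None else value
        results.append((record, f"Disk Busy % ({trigger}) = {shown}"))
        previous = record
    return results


table_6_6 = [{"C:": 10, "D:": 20}, {"C:": 90, "D:": 30},
             {"C:": 60, "D:": 90}, {"C:": 20, "D:": 30}]
table_6_7 = [{"C:": 10, "D:": 20}, {"C:": 90, "D:": 30},
             {"C:": 40, "D:": 60}, {"C:": 20, "D:": 30}]

for name, data in (("Table 6-6", table_6_6), ("Table 6-7", table_6_7)):
    print(name, simulate_standard_alarm(data))
```

For the Table 6‒6 data, the normal alarm at the fourth evaluation reports the value of the C drive. For the Table 6‒7 data, it reports the value of the D drive, because the D drive caused the most recent (warning) alarm.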

(b) Alarms that monitor whether a value exists

The following describes an example of an alarm that monitors whether a value exists. In this example, the alarm monitors PFM - Agent for Platform (Windows). The alarm definition is as follows:

  • Message text: %CVS

  • Check whether the value exists: Selected.

  • Activate alarm: Selected.

  • Notify when the state changed: Selected.

  • State changes for the alarm: Selected.

  • Evaluate all data: Not selected.

  • Evaluate regularly: Selected.

  • Alarm when damping conditions are satisfied: Not selected.

  • Record: Process Detail Interval (PD_PDI)

  • Field: Program

  • Value: process2

The following table describes an example in which this alarm is generated.

Table 6‒8: Example of alarms generated when monitoring whether a value exists

| Timing of alarm evaluation | Status of instance 1 | Status of instance 2 | Generated alarm |
|---|---|---|---|
| 1st | Measurement value: process1 | Measurement value: process2 | --#1 |
| 2nd | Measurement value: process1 | Measurement value: process3 | Alarm: abnormal; Alarm message: (N/A) |
| 3rd | Measurement value: process3 | Measurement value: process4 | --#2 |
| 4th | Measurement value: process2 | Measurement value: process3 | Alarm: normal; Alarm message: process2 |

Legend:

--: Not generated.

#1

A normal alarm is not generated because it is the first time the alarm is evaluated.

#2

An abnormal alarm is not generated because the alarm status has not changed.

In this example, the value process2 whose existence is being monitored is not found when the alarm is evaluated for the second time, and an abnormal alarm is generated. However, because the value does not exist when the alarm is generated, (N/A) appears for the %CVS variable in the message text.

In the fourth evaluation, because the value process2 is present again, a normal alarm is generated, and the value process2 appears for the %CVS variable.
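The sequence in Table 6‒8 can be modeled in a few lines. The following is a minimal sketch with the function for measurement value output at alarm recovery enabled. The function name `simulate_existence_alarm` is hypothetical, and the sketch does not represent how the product evaluates the record.

```python
def simulate_existence_alarm(evaluations, monitored_value):
    """Return, per evaluation, None or a (status, message) tuple.

    evaluations:     list of lists holding the Program field of each instance
    monitored_value: the value whose existence is monitored
    """
    previously_abnormal = False  # no normal alarm at the first evaluation
    results = []
    for values in evaluations:
        abnormal = monitored_value not in values
        if abnormal == previously_abnormal:
            results.append(None)  # status unchanged, so no alarm
        elif abnormal:
            # The monitored value does not exist, so %CVS has nothing to show.
            results.append(("Abnormal", "(N/A)"))
        else:
            # The monitored value is present again and appears for %CVS.
            results.append(("Normal", monitored_value))
        previously_abnormal = abnormal
    return results


table_6_8 = [["process1", "process2"], ["process1", "process3"],
             ["process3", "process4"], ["process2", "process3"]]
print(simulate_existence_alarm(table_6_8, "process2"))
# prints: [None, ('Abnormal', '(N/A)'), None, ('Normal', 'process2')]
```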

(c) Alarms that evaluate all data

The following describes an example of an alarm that evaluates all data. In this example, the alarm monitors PFM - Agent for Platform (Windows). The alarm definition is as follows:

  • Message text: Disk Busy % (%CVS1) = %CVS2

  • Check whether the value exists: Not selected.

  • Activate alarm: Selected.

  • Notify when the state changed: Selected.

  • State changes for the alarm: Selected.

  • Evaluate all data: Selected.

  • Evaluate regularly: Selected.

  • Alarm when damping conditions are satisfied: Not selected.

  • Record: Logical Disk Overview (PI_LOGD)

  • Abnormal: ID <> "_Total" AND % Disk Time >= "90.000"

  • Warning: ID <> "_Total" AND % Disk Time >= "50.000"

The following table describes an example in which this alarm is generated.

Table 6‒9: Example of alarms generated when evaluating all data

| Timing of alarm evaluation | Status of instance 1 (C drive) | Status of instance 2 (D drive) | Alarm generated for instance 1 (C drive) | Alarm generated for instance 2 (D drive) |
|---|---|---|---|---|
| 1st | Measurement value: 10; Status: Normal | Measurement value: 20; Status: Normal | --#1 | --#1 |
| 2nd | Measurement value: 90; Status: Abnormal | Measurement value: 90; Status: Abnormal | Alarm: abnormal; Alarm message: Disk Busy % (C:) = 90 | Alarm: abnormal; Alarm message: Disk Busy % (D:) = 90 |
| 3rd | Measurement value: 20; Status: Normal | Measurement value: 90; Status: Abnormal | --#2 | --#2 |
| 4th | Measurement value: 10; Status: Normal | Measurement value: 20; Status: Normal | Alarm: normal; Alarm message: Disk Busy % (C:) = 10 | Alarm: normal; Alarm message: Disk Busy % (D:) = 20 |

Legend:

--: Not generated.

#1

A normal alarm is not generated because it is the first time the alarm is evaluated.

#2

An abnormal alarm is not generated because the alarm status has not changed.

When an alarm evaluates all data, each time the alarm status changes, an alarm is generated for every instance whose data meets the alarm conditions. Therefore, at the second evaluation, abnormal alarms are generated for both instance 1 and instance 2, because both have entered the abnormal range.

When the measurement values of all instances return to the normal range, the number of normal alarms issued equals the number of abnormal or warning alarms generated at the last status change. In this example, because two abnormal alarms were generated at the second evaluation, two normal alarms are generated at the fourth evaluation. The %CVS variable in the message text of each normal alarm is replaced with the current measurement value of the corresponding instance (instance 1 or instance 2) that caused an abnormal alarm to be generated at the second evaluation.
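The sequence in Table 6‒9 can be modeled as follows. This is a minimal sketch with the function for measurement value output at alarm recovery enabled, not the product's evaluation logic. The function names are hypothetical. The sketch also assumes that, when the status changes to abnormal or warning, an alarm is issued for each instance whose own status matches the new status of the record. The source does not state how a mix of abnormal and warning instances is handled.

```python
ABNORMAL_THRESHOLD = 90.0
WARNING_THRESHOLD = 50.0
SEVERITY = {"Normal": 0, "Warning": 1, "Abnormal": 2}


def status_of(value):
    """Classify one % Disk Time value against the alarm thresholds."""
    if value >= ABNORMAL_THRESHOLD:
        return "Abnormal"
    if value >= WARNING_THRESHOLD:
        return "Warning"
    return "Normal"


def simulate_evaluate_all_data(evaluations):
    """Return, per evaluation, a list of (status, message) tuples (empty = no alarm).

    evaluations: list of dicts mapping instance ID -> % Disk Time.
    """
    previous = "Normal"  # so that no normal alarm is issued at the first evaluation
    triggers = []        # instances that caused the last abnormal or warning alarms
    results = []
    for values in evaluations:
        statuses = {inst: status_of(v) for inst, v in values.items()}
        record = max(statuses.values(), key=SEVERITY.get)  # worst instance status
        if record == previous:
            results.append([])  # status unchanged, so no alarm
            continue
        if record != "Normal":
            # Assumption: one alarm per instance whose status matches the record.
            triggers = [i for i, s in statuses.items() if s == record]
        # On recovery, one normal alarm per previously triggering instance,
        # each carrying that instance's current value.
        alarms = []
        for inst in triggers:
            value = values.get(inst)
            shown = "(N/A)" if value is None else value
            alarms.append((record, f"Disk Busy % ({inst}) = {shown}"))
        results.append(alarms)
        previous = record
    return results


table_6_9 = [{"C:": 10, "D:": 20}, {"C:": 90, "D:": 90},
             {"C:": 20, "D:": 90}, {"C:": 10, "D:": 20}]
for step, alarms in enumerate(simulate_evaluate_all_data(table_6_9), start=1):
    print(step, alarms)
```

For the Table 6‒9 data, the sketch produces two abnormal alarms at the second evaluation and two normal alarms at the fourth, one for each drive.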