Hitachi

Job Management Partner 1 Version 10 Job Management Partner 1/IT Service Level Management Description, User's Guide, Reference and Operator's Guide


4.4.1 Checking the timing of an event causing an error or warning

When an error or warning is displayed for a monitored service, you can check a performance chart for the monitored service's monitored target to determine the timing of the event that caused the error or warning.

Use the Home window, Real-time Monitor window, and Troubleshoot window for this check.

If you want to check the overall status of the service group, identify the target monitored service, and then use the Home window to investigate the cause of the event. If you are focusing on a specific monitored service and want to investigate the cause of an event that occurred in that monitored service, use the Real-time Monitor window.

Organization of this subsection

(1) Before you start

(2) Procedure

The following shows the Home window and the Troubleshoot window:

To check the timing of an event causing an error or warning:

  1. If the Home window is not displayed, click the Home button.

    The Current service group status summary, Caution service, and Events in the last 7 days areas are displayed.

    If you need to determine the monitored service to be investigated from the event issuance status, go to step 2.

    If you know which monitored service is to be investigated, go to step 3.

  2. In the Home window, from the Events in the last 7 days area, select an error or warning that you want to check, and then click the Details column of the corresponding line.

    For the selected error or warning, the Performance chart tab on the Troubleshoot window is displayed. Note that the Performance chart tab is displayed only when an event related to service performance is selected.

  3. In the Troubleshoot window, in the Event and Performance chart tabs area, check the performance chart displayed on the Performance chart tab to determine the timing of the event that caused the error or warning.

    Check the performance chart and look for the time period in which the average value for service performance started to veer significantly from the baseline. On a performance chart, a colored band indicates a timeframe during which a significant change in service performance occurred. The timeframe indicated by the colored band might be when the event causing the error or warning occurred.

    You can also determine the timing of the event causing the error by selecting an option from the Node state display pull-down menu. If you select Event, an icon indicating the event is displayed above the time at which the event occurred. Because this shows the status at the time the event occurred, it is useful as a starting point for troubleshooting. If you select Monitor item state, a band indicating the current events is displayed on the chart. You can check how the events changed over time by following the displayed band.

    You can change the item displayed in the performance chart. Click a display item to display the Select Items to Display dialog box, and then select the items that you want to display. For details about the display items, see 10.4.4 Event and Performance chart tabs area (Performance chart tab selected).

  4. Click [Figure] to display configuration information.

    Configuration information helps you identify the monitoring item of the monitored service that resulted in the error. If necessary, you can display performance information as a graph by clicking the [Figure] button associated with the monitoring item. You can also check whether a problem has occurred in the system, such as with a host or middleware. If a problem has occurred in the system, click the [Figure] button to connect to Performance Management for further investigation, if necessary.
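The visual check in step 3 (finding when the average value started to move away from the baseline) can be pictured as a small calculation. The following Python sketch is only an illustration and is not how the product computes its colored band. It assumes hypothetical samples, each holding an average value and baseline upper and lower bounds, and it reports the first time at which the average stayed outside the baseline range for several consecutive samples. The names `Sample`, `find_deviation_start`, and `min_run` are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Sample:
    time: datetime
    average: float  # average service performance (for example, response time in ms)
    lower: float    # hypothetical baseline lower bound
    upper: float    # hypothetical baseline upper bound


def find_deviation_start(samples: List[Sample], min_run: int = 3) -> Optional[datetime]:
    """Return the start time of the first run of at least min_run
    consecutive samples whose average lies outside the baseline range."""
    run_start = None
    run_length = 0
    for s in samples:
        if s.average < s.lower or s.average > s.upper:
            if run_length == 0:
                run_start = s.time
            run_length += 1
            if run_length >= min_run:
                return run_start
        else:
            # Back inside the baseline range: discard the current run.
            run_length = 0
            run_start = None
    return None


# Made-up data: one isolated spike, then a sustained deviation.
start = datetime(2024, 1, 1, 9, 0)
averages = [100, 105, 98, 140, 102, 150, 160, 170, 180]
samples = [
    Sample(start + timedelta(minutes=i), avg, lower=80, upper=120)
    for i, avg in enumerate(averages)
]
onset = find_deviation_start(samples)
print(onset)
```

Requiring several consecutive out-of-range samples is a design choice in this sketch. It ignores a single spike so that the reported time corresponds to a sustained change, which is closer to what you look for on the chart.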

Based on the information for the specific time period, check the CPU usage, memory usage, or disk usage for that period to evaluate the cause of the error or warning.
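If the system performance data for the period is available outside the window (for example, as exported records), narrowing it to the time period of interest can be sketched as follows. This is a minimal illustration under assumptions: the record layout and the field names `cpu_pct`, `mem_pct`, and `disk_pct` are hypothetical and do not represent a product data format.

```python
from datetime import datetime, timedelta
from statistics import mean


def summarize_window(rows, window_start, window_end):
    """Average each metric over the rows whose time falls within
    [window_start, window_end], inclusive."""
    in_window = [r for r in rows if window_start <= r["time"] <= window_end]
    if not in_window:
        return {}
    metrics = [key for key in in_window[0] if key != "time"]
    return {m: mean(r[m] for r in in_window) for m in metrics}


# Made-up records: one per minute, with CPU, memory, and disk usage in percent.
t0 = datetime(2024, 1, 1, 9, 0)
rows = [
    {"time": t0 + timedelta(minutes=i), "cpu_pct": cpu, "mem_pct": mem, "disk_pct": disk}
    for i, (cpu, mem, disk) in enumerate(
        [(20, 50, 40), (25, 50, 40), (90, 55, 40), (95, 60, 40), (30, 50, 40)]
    )
]

# Summarize only the two minutes in which the deviation was observed.
summary = summarize_window(rows, t0 + timedelta(minutes=2), t0 + timedelta(minutes=3))
print(summary)
```

In this made-up data, CPU usage stands out in the selected period while disk usage is unchanged, which is the kind of contrast that helps when you evaluate the cause.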

You can also check past service performance in the performance chart. For details about how to check past service performance, see 4.4.2 Checking past data.

Reference note

You can also display the Troubleshoot window from the Real-time Monitor window. The following explains how to check the timing of an event causing an error or warning from the Real-time Monitor window and shows the Real-time Monitor window used in the procedure:

  • Real-time Monitor window

    [Figure]

  1. Click the Real-time Monitor button.

    The Services, Service performance information, and System performance information areas and the Event and Performance chart tabs area are displayed.

  2. In the Services area of the Real-time Monitor window, select a service group, a monitored service, or a monitored target of a monitored service that you want to investigate.

    If you select a monitored target of a monitored service, go to step 4 (the task in step 3 is not necessary).

  3. In the Service performance information area of the Real-time Monitor window, select the monitored service's monitored target that you want to investigate.

    If threshold value monitoring, trend monitoring, or out-of-range value detection resulted in the error or warning, select the monitored target of a monitored service that you want to investigate based on the information, including icons, displayed in the Service performance information area. If you are monitoring system availability information by linking with Performance Management, the availability information is displayed in the Availability column in the Service performance information area. Check the displayed icon information and select the monitored service's monitored target that you want to investigate.

    Note that you can select a monitored target of a monitored service on the Event tab in step 4 without selecting it here.

  4. In the Real-time Monitor window, on the Event tab in the Event and Performance chart tabs area, check information about the event, and then click the Details column of the error or warning that you want to check.

    On the Event tab, you can check information about events that occurred in threshold value monitoring, trend monitoring, or out-of-range value detection. If you click the Details column, messages and a performance chart for the service performance resulting in the error or warning are displayed in the Event and Performance chart tabs area in the Troubleshoot window.

  5. In the Troubleshoot window, in the Event and Performance chart tabs area, check the performance chart displayed on the Performance chart tab to determine the timing of the event that caused the error or warning.

    Check the performance chart and look for a time period in which the average value for service performance started to veer significantly from the baseline. On the performance chart, a colored band indicates a timeframe during which a significant change in service performance occurred. The timeframe indicated by the colored band might be when the event causing the error or warning occurred.

    You can change the item displayed in the performance chart. Click a display item to display the Select Items to Display dialog box, and then select the items that you want to display. For details about the display items, see 10.4.4 Event and Performance chart tabs area (Performance chart tab selected).

  6. Click [Figure] to display configuration information.

    Configuration information helps you identify the monitoring item of the monitored service that resulted in the error. If necessary, you can display performance information as a graph by clicking the [Figure] button associated with the monitoring item. You can also check whether a problem has occurred in the system, such as with a host or middleware. If a problem has occurred in the system, click the [Figure] button to connect to Performance Management for further investigation, if necessary.
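Step 3 of this procedure mentions three kinds of detection: threshold value monitoring, trend monitoring, and out-of-range value detection. This subsection does not describe how the product implements them, so the following Python sketch shows only generic textbook versions of each idea to make the distinction concrete. Every function name and parameter here (`exceeds_threshold`, `trend_will_exceed`, `out_of_range`, `steps_ahead`, `k`) is an assumption for illustration.

```python
from statistics import mean, pstdev


def exceeds_threshold(values, threshold):
    """Threshold check: is the latest value above a fixed limit?"""
    return values[-1] > threshold


def trend_will_exceed(values, threshold, steps_ahead):
    """Trend check: fit a least-squares line to the values (needs at least
    two) and test whether it is above the threshold steps_ahead samples
    after the last one."""
    n = len(values)
    xs = range(n)
    x_mean, y_mean = mean(xs), mean(values)
    denom = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / denom
    intercept = y_mean - slope * x_mean
    predicted = intercept + slope * (n - 1 + steps_ahead)
    return predicted > threshold


def out_of_range(value, history, k=3.0):
    """Out-of-range check: is the value more than k standard deviations
    away from the mean of past values?"""
    return abs(value - mean(history)) > k * pstdev(history)


# Made-up data: a steadily rising series that is still below the threshold.
values = [100, 110, 120, 130, 140]
print(exceeds_threshold(values, 200))
print(trend_will_exceed(values, 200, steps_ahead=10))

# Made-up history that is stable around 100.
history = [100, 102, 98, 101, 99]
print(out_of_range(110, history))
```

The rising series illustrates why the kinds of detection can disagree: the latest value is still under the threshold, yet the fitted line passes it within ten more samples.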

(3) Related topics