Job Management Partner 1/Integrated Management - Manager Overview and System Design Guide



11.1.7 Considerations for setting event guide information

Using the event guide function, you can record your experience and success in resolving problems, and you can reference and accumulate diagnostic case studies, troubleshooting examples, and so on.

The system administrator manages the system through a cycle of error detection based on JP1 event monitoring, investigation, and remedial action. If you record your experience and results as event guide information after resolving a problem, users can respond quickly when the same type of JP1 event occurs again.

Event guide information is displayed as detailed information about a JP1 event in the Event Details window of the Central Console.

One item of event guide information can be displayed for one JP1 event. But the larger the system, the greater the number of JP1 events issued from linked JP1 products and user applications. Consider the following points when setting event guide information.

Organization of this subsection
(1) Restricting applicable JP1 events
(2) Setting appropriate event guide information
(3) Setting event guide information using variables (placeholder strings)

(1) Restricting applicable JP1 events

JP1 events cover a wide range and their number increases according to the size of the system. It would not be easy to set event guide information for every event. Also, the number of items that can be defined in an event guide information file is limited to 1,000.

For these reasons, you must restrict the JP1 events for which event guide information is set. For example, consider restricting them from the following perspectives.

(a) Restricting applicable JP1 events by event level

The JP1 event levels are Emergency, Alert, Critical, Error, Warning, Notice, Information, and Debug. Depending on the types of JP1 events issued by the managed hosts in your system, register event guide information for the more important JP1 events (Error level or higher, for example).

When you use the integrated monitoring database, the user-defined event level applies for JP1 events.

Under the default settings, JP1 events of Emergency, Alert, Critical, Error, or Warning level are forwarded to a manager from JP1/Base on an agent.
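
The selection by event level described above can be sketched in a few lines of Python. This is only an illustration, not part of JP1: it assumes that the order in which the levels are listed above runs from most to least severe, and the names LEVELS, at_least, and select_targets, as well as the (event ID, level) pair layout, are hypothetical.

```python
# Levels in the order listed in this subsection, assumed most severe first.
LEVELS = ["Emergency", "Alert", "Critical", "Error",
          "Warning", "Notice", "Information", "Debug"]


def at_least(level, threshold="Error"):
    """Return True if level is as severe as threshold, or more severe."""
    return LEVELS.index(level) <= LEVELS.index(threshold)


def select_targets(events, threshold="Error"):
    """Keep (event_id, level) pairs whose level meets the threshold."""
    return [e for e in events if at_least(e[1], threshold)]


if __name__ == "__main__":
    # Made-up candidate list for illustration.
    candidates = [("00004107", "Error"), ("00003FA5", "Warning")]
    print(select_targets(candidates))
```

Lowering the threshold to Warning would also keep the second candidate, which is one way to estimate how many definitions a given policy would require.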

(b) Restricting applicable JP1 events by frequency and urgency

Find out what sort of JP1 events are being issued from the managed hosts by performing an event search or by executing the JP1/Base jevexport command, and examine the subtotals in the output results. If it appears that some JP1 events of concern are being issued more often than others, you can target those JP1 events according to which host they originate from, or how urgently they need to be identified and dealt with.

If any JP1 events requiring urgent action are being issued at a high frequency, the system administrator and operators will need to discuss and determine troubleshooting procedures. Set event guide information for these sorts of JP1 events.

For details about the jevexport command, see the chapter on commands in the Job Management Partner 1/Base User's Guide.
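
The subtotals mentioned above can also be produced with a small script. The following is a minimal sketch in Python, assuming that the events have been exported as CSV text and that the event ID and source host appear in known columns. The column positions (ID_COL, HOST_COL), the sample rows, and the function name tally_events are assumptions made for illustration; check the actual jevexport output format in the JP1/Base documentation before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical column positions; adjust to the real export layout.
ID_COL = 0
HOST_COL = 1


def tally_events(csv_text, id_col=ID_COL, host_col=HOST_COL):
    """Count exported events per (event ID, source host) pair."""
    counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= max(id_col, host_col):
            continue  # skip blank or short rows
        counts[(row[id_col].strip(), row[host_col].strip())] += 1
    return counts


# Made-up sample rows, not real jevexport output.
SAMPLE = """00004107,Host-A
00003FA5,Host-B
00004107,Host-A
00004107,Host-C
"""

if __name__ == "__main__":
    # List the most frequent (event ID, host) pairs first.
    for (event_id, host), n in tally_events(SAMPLE).most_common():
        print(f"{event_id}  {host}  {n}")
```

Pairs that appear at the top of the list are natural candidates for event guide information, subject to the urgency considerations described above.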

Note
A maximum of 1,000 items of event guide information can be set. Make sure that you prioritize JP1 events to keep them within this limit.
If it is difficult to restrict the applicable JP1 events to no more than 1,000, consider the following strategy:
  • Group similar events or related events, and write a list of links (used as an index page) in the event-guide message for the group.
This approach requires the user to search for advice relating to a particular event from the list of links. You should therefore establish clear editing rules and explore other ways of making the list easy to search.
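
To keep track of how close a definition file is to the limit, a simple count of the definition headers may help. The sketch below is written in Python and assumes that every item begins with a header line of the form [EV_GUIDE_n], as in the examples in this subsection. The names count_guide_items and remaining_capacity are hypothetical, and the script is not a JP1 tool.

```python
import re

MAX_ITEMS = 1000  # limit stated in this subsection

# Assumes each item opens with a header such as [EV_GUIDE_001].
HEADER = re.compile(r"^\[EV_GUIDE_\d+\]\s*$")


def count_guide_items(text):
    """Return the number of [EV_GUIDE_n] headers found in the text."""
    return sum(1 for line in text.splitlines() if HEADER.match(line.strip()))


def remaining_capacity(text, limit=MAX_ITEMS):
    """Return how many more items fit before the limit is reached."""
    return limit - count_guide_items(text)


if __name__ == "__main__":
    # Shortened sample built from the examples in this subsection.
    sample = (
        "[EV_GUIDE_001]\nEV_COMP=B.ID:00004107:00000000\n[END]\n"
        "[EV_GUIDE_002]\nEV_COMP=B.IDBASE:00003FA5\n[END]\n"
    )
    print(count_guide_items(sample), remaining_capacity(sample))
```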

(2) Setting appropriate event guide information

Because you can write event guide information freely, you can tailor it to your operational requirements, as in the examples in (a) and (b) below.

You can also prepare event guide information according to the nature of the JP1 event. For example, for JP1 events of Error level or higher that require urgent action, you might describe the initial response procedure, while for JP1 events of Warning level indicating a preventable future problem, you might describe how to investigate and preempt the problem.

(a) Event guide information for initial response (example)

In this example, event guide information is needed for an event indicating that a JP1/AJS job running on a managed host has ended abnormally.

The JP1 event indicating abnormal termination of a JP1/AJS job has an event ID (B.ID) of 00004107 and an event level (E.SEVERITY) of Error. Set event guide information for this JP1 event as follows.

Example of contents written in the event guide information file (jco_guide.txt):
(extract of the condition definition)
[EV_GUIDE_001]
EV_COMP=B.ID:00004107:00000000
EV_COMP=E.SEVERITY:Error
EV_GUIDE=The job ended abnormally.\n Contact the system administrator in charge of host $E.C0 urgently.\n\n List of system administrator contact details \n Host-A:TEL(03-xxxx-xxxx) Mail(xxxxx@xxx.co.jp) \n Host-B:TEL(03-xxxx-xxxx) Mail(xxxxx@xxx.co.jp) \n Host-C:TEL(03-xxxx-xxxx) Mail(xxxxx@xxx.co.jp)
[END]

(b) Event guide information for error investigation and troubleshooting (example)

In this example, event guide information is needed for an event indicating that the number of commands queued in JP1/Base running on an agent has reached a set threshold.

The JP1 event indicating that the command queue count threshold has been exceeded has an event ID (B.ID) of 00003FA5 and an event level (E.SEVERITY) of Warning. Set event guide information for this JP1 event as follows.

Example of contents written in the event guide information file (jco_guide.txt):
(extract of the condition definition)
[EV_GUIDE_002]
EV_COMP=B.IDBASE:00003FA5
EV_COMP=E.SEVERITY:Warning
EV_FILE=user-specified-folder(path)\jco_guidemes_002.txt
[END]

Example of contents written in an event-guide message file (jco_guidemes_002.txt):
The number of queued commands has exceeded the threshold (10).
Determine the JP1/Base host from the message text.
Check whether there is insufficient memory or a backlog of automated actions on the host.
Open the List of Action Results window, or execute the jcashowa and jcocmdshow commands, to check the statuses of the automated actions.
If any urgent automated actions are waiting to be executed, cancel them as a temporary measure.
To cancel an automated action, use the jcacancel or jcocmddel command.
These two commands display a confirmation message requiring you to type y or n. When executing either command from the Execute Command window, specify the -f option to bypass the confirmation message.
If this event occurs frequently, use the jcocmddef command to modify the command execution environment.

(3) Setting event guide information using variables (placeholder strings)

A variable (placeholder string) can be used to represent a JP1 event attribute in an event-guide message. For example, if you set the host name of the server where the problem originated (B.SOURCESERVER) as a variable, the actual host name will be displayed in the event guide information by means of the variable, and the message text will match the actual situation. This reduces the time required to identify the host where the problem occurred.

The following table describes the variables you can use in an event-guide message.

Table 11-4 Variables that can be used in event-guide messages

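Attribute type      Event attribute                Variable             Format of substituted value
Basic attribute     Serial number                  B.SEQNO              Integer character string
Basic attribute     Event ID                       B.ID                 String in the format basic-code:extended-code
Basic attribute     Event ID                       B.IDBASE             String in the format basic-code
Basic attribute     Source process ID              B.PROCESSID          Integer character string
Basic attribute     Registered time                B.TIME               Integer character string
Basic attribute     Arrived time                   B.ARRIVEDTIME        Integer character string
Basic attribute     Source user ID                 B.USERID             Integer character string
Basic attribute     Source group ID                B.GROUPID            Integer character string
Basic attribute     Source user name               B.USERNAME           Character string
Basic attribute     Source group name              B.GROUPNAME          Character string
Basic attribute     Source event server name       B.SOURCESERVER       Character string
Basic attribute     Destination event server name  B.DESTSERVER         Character string
Basic attribute     Source serial number           B.SOURCESEQNO        Integer character string
Basic attribute     Message                        B.MESSAGE            Character string
Extended attribute  Event level                    E.SEVERITY           Character string
Extended attribute  User name                      E.USER_NAME          Character string
Extended attribute  Product name                   E.PRODUCT_NAME       Character string
Extended attribute  Object type                    E.OBJECT_TYPE        Character string
Extended attribute  Object name                    E.OBJECT_NAME        Character string
Extended attribute  Root object type               E.ROOT_OBJECT_TYPE   Character string
Extended attribute  Root object name               E.ROOT_OBJECT_NAME   Character string
Extended attribute  Object ID                      E.OBJECT_ID          Character string
Extended attribute  Occurrence                     E.OCCURRENCE         Character string
Extended attribute  Start time                     E.START_TIME         Character string
Extended attribute  End time                       E.END_TIME           Character string
Extended attribute  Return code                    E.RESULT_CODE        Character string
Extended attribute  Other extended attribute       E.xxxxxx#            Character string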

#: Any JP1 product-specific extended attribute can be used. For example, a JP1/AJS job execution host is E.C0. For details about program-specific extended attributes, see the documentation for the particular product that issues JP1 events.


By using these variables, you can write event-guide messages that can be generally applied. For example, if you use the variable for a JP1/AJS job execution host (E.C0), you can write event-guide messages like the following.

Example of an event-guide message using a variable (extract of the EV_GUIDE segment):
EV_GUIDE=The job ended abnormally.\n Check whether an error occurred on host $E.C0.\n In a previous case, the job failed due to insufficient memory on host A.\n Check the available memory using the vmstat command.

For details about JP1 event attributes, see 3.1 Attributes of JP1 events in the manual Job Management Partner 1/Integrated Management - Manager Command and Definition File Reference.

The character strings that can be substituted in a JP1 event attribute (variable) depend on the product. When using variables in event-guide messages, see also the description of JP1 events in the product documentation.
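
To preview how a message might read once variables are replaced, the substitution can be imitated outside the product. The following Python sketch is an illustration only and does not reproduce the actual processing in JP1/IM. It assumes that variables are written as $B.xxx or $E.xxx and that line breaks are written as the two characters \n, as in the examples in this subsection. The function name expand_guide_message and the choice to leave unknown variables unchanged are hypothetical.

```python
import re

# Matches $B.NAME or $E.NAME; this token syntax is inferred from the
# examples in this subsection and is an assumption.
VARIABLE = re.compile(r"\$([BE]\.[A-Z0-9_]+)")


def expand_guide_message(template, event):
    """Replace $B.xxx / $E.xxx with values from the event dict."""
    # The examples write line breaks as the two characters "\n".
    text = template.replace("\\n", "\n")

    def substitute(match):
        name = match.group(1)
        # Leave unknown variables untouched (an illustrative choice).
        return event.get(name, match.group(0))

    return VARIABLE.sub(substitute, text)


if __name__ == "__main__":
    template = ("The job ended abnormally.\\n"
                " Check whether an error occurred on host $E.C0.")
    # Made-up attribute value for illustration.
    print(expand_guide_message(template, {"E.C0": "Host-A"}))
```

Running the preview against a few representative attribute values is a quick way to confirm that a generally applicable message still reads naturally for each host.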


All Rights Reserved. Copyright (C) 2009, Hitachi, Ltd.