15.9.1 Preparatory tasks
This subsection describes precautions and points to consider at the planning stage, before you perform maintenance.
(1) Maintenance planning for the entire system
-
Decide the order in which maintenance will be implemented throughout the system.
Make sure that you implement maintenance from higher-level hosts to lower-level hosts according to the system hierarchy (IM configuration).
You should also consider the following points when planning the order in which maintenance is implemented:
-
When grouping the agents, make sure that maintenance of one host will not affect job processing on another host.
-
If a system operates around the clock, prepare a server that can take over processing while the system is being maintained.
-
Estimate how long maintenance will take for the JP1/IM servers and agents.
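As an illustration of the ordering rule above, the following minimal sketch walks a hierarchy from the top-level host downward and sums the per-host estimates. It is not part of JP1/IM: the host names, the `children` mapping, and the durations in `estimated_minutes` are hypothetical values chosen for the example, and nothing here is read from an actual IM configuration.

```python
from collections import deque

# Hypothetical hierarchy: each host maps to the hosts directly below it.
children = {
    "im-manager": ["site-mgr-a", "site-mgr-b"],
    "site-mgr-a": ["agent-a1", "agent-a2"],
    "site-mgr-b": ["agent-b1"],
}

# Hypothetical per-host maintenance estimates, in minutes.
estimated_minutes = {
    "im-manager": 60, "site-mgr-a": 30, "site-mgr-b": 30,
    "agent-a1": 15, "agent-a2": 15, "agent-b1": 15,
}

def maintenance_order(root, children):
    """Return hosts level by level, higher-level hosts first."""
    order, queue = [], deque([root])
    while queue:
        host = queue.popleft()
        order.append(host)
        # Hosts without an entry in the mapping have no lower-level hosts.
        queue.extend(children.get(host, []))
    return order

order = maintenance_order("im-manager", children)
total = sum(estimated_minutes[h] for h in order)
print(order)
print(f"Sequential estimate: {total} minutes")
```

The total assumes hosts are maintained one after another. If agents in separate groups can be maintained in parallel, the elapsed time would be shorter.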
- Important
-
When maintaining the entire system, use the jcoimdef command to adjust the event acquisition start location, and switch between event acquisition filters prepared in advance.
The event acquisition start location and the event acquisition filter are interrelated: each setting affects the result of the other. In brief, the relationship is as follows:
-
Event acquisition start location
At startup, JP1/IM - Manager acquires events from the JP1/Base event database, according to the start location setting, and stores them in its event buffer.
-
Event acquisition filter
When the events acquired from JP1/Base are stored in the JP1/IM - Manager event buffer, the event acquisition filter processes them according to the set conditions.
In other words, the contents of the event buffer can change across a restart. If you switch the event acquisition filter (so that different filter conditions apply when JP1/IM - Manager stops and when it restarts), and the event acquisition start location is set so that events are acquired from before JP1/IM - Manager restarts, the events stored in the event buffer after the restart might differ from those stored when JP1/IM - Manager stopped. That is, the events you can see in JP1/IM - View might be different. To avoid this problem, always schedule maintenance starting with the top-level host and proceeding down the hierarchy.
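The interaction described above can be illustrated with a small model. The sketch below is not JP1/IM code and does not use any JP1/IM interface: the `Event` type, the severity-based filters, and the start location expressed as a look-back in hours are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    hour: int        # hypothetical registration time, in hours
    severity: str

# Events registered in the (modeled) JP1/Base event database.
event_db = [
    Event(1, "Error"), Event(2, "Warning"), Event(3, "Error"),
    Event(5, "Warning"),  # registered while the manager was stopped
]

def filter_errors_only(e):
    return e.severity == "Error"

def filter_everything(e):
    return True

def load_buffer(db, start_hour, event_filter):
    """Acquire events from start_hour onward, then apply the filter."""
    return [e for e in db if e.hour >= start_hour and event_filter(e)]

# Buffer as it stood when the manager stopped at hour 4,
# built with the errors-only filter.
before = load_buffer([e for e in event_db if e.hour < 4], 0, filter_errors_only)

# Restart at hour 6 with a different filter and a start location
# that reaches back 5 hours, that is, to before the stop.
after = load_buffer(event_db, 6 - 5, filter_everything)

print([(e.hour, e.severity) for e in before])
print([(e.hour, e.severity) for e in after])
```

In this model the buffer after the restart contains the warning from hour 2, which the earlier filter had excluded, in addition to the event registered during the stop. This is the kind of difference an operator would notice in the event list.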
(2) Maintenance planning for managers (JP1/IM - Manager)
Consider the following points when performing maintenance on the managers (JP1/IM - Manager):
-
Backup requirements
-
Database maintenance
-
Disk space checks
-
Use of failure reports
For details about these aspects, see 15.8 JP1/IM maintenance considerations. You should also consider the following points from a monitoring perspective:
-
Acquire JP1 events generated while the manager (JP1/IM - Manager) is stopped.
During maintenance of the whole JP1/IM system, JP1/IM - Manager on the manager will be stopped at some stage. To manage all JP1 events, including those generated during that time, set up JP1/IM - Manager to acquire events from before it restarts.
Use the jcoimdef command to specify that events are acquired from before JP1/IM - Manager restarts.
The parameter settings are described below, based on an example.
Note that events generated while JP1/Base is stopped on the manager cannot be acquired, because JP1 events cannot be registered in the event database while JP1/Base is stopped. For details, see 15.9.1(3) Maintenance planning for agents (JP1/Base).
-
Disable monitoring of error events (JP1 events indicating failure occurrence) generated during agent maintenance.
Maintenance work will involve restarting the server at some point, which could generate a large number of unwanted error events. As this might disrupt ongoing monitoring operations, set up filtering to eliminate unwanted events.
To filter out unwanted events, you can define an event acquisition filter in which common exclusion-conditions exclude JP1 events issued by hosts that are under maintenance. You can then activate this filter during maintenance. The parameter settings are described below, based on an example.
If JP1/Base on the agent remains stopped throughout the maintenance work, filtering is unnecessary because no events are registered in the event database.
-
Exclude error events generated during agent maintenance from action execution.
If an automated action is configured, such as sending an email when an error event is received, the action can also be triggered by error events caused by maintenance work. If you want to collect error events generated during maintenance work but exclude them from automated-action execution, consider setting a common exclusion-condition that excludes those JP1 events from automated-action execution.
By specifying automated actions as the exclusion target of a common exclusion-condition, you can exclude error events generated by agents undergoing maintenance from automated-action execution. You do not need to change the existing automated action definitions for maintenance.
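The two uses of a common exclusion-condition described in this list (excluding events from acquisition, and excluding them only from automated-action execution) can be modeled as follows. This is a conceptual sketch, not the JP1/IM filter definition format: the `ExclusionCondition` class, the `target` values, and the host names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ExclusionCondition:
    hosts: frozenset   # hosts under maintenance (hypothetical names)
    target: str        # "acquisition" or "action" (modeled values)

def process(event_host, severity, condition):
    """Return (collected, action_executed) for one incoming event."""
    matches = event_host in condition.hosts
    if matches and condition.target == "acquisition":
        return (False, False)   # event is not acquired at all
    if matches and condition.target == "action":
        return (True, False)    # collected, but no automated action
    # Normal handling: in this model, error events trigger the action.
    return (True, severity == "Error")

maintenance = frozenset({"agent-a1"})
drop = ExclusionCondition(maintenance, "acquisition")
mute = ExclusionCondition(maintenance, "action")

print(process("agent-a1", "Error", drop))
print(process("agent-a1", "Error", mute))
print(process("agent-b1", "Error", mute))
```

In the model, neither condition touches the rule that maps error events to actions, which mirrors the point that existing automated action definitions stay unchanged during maintenance.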
(3) Maintenance planning for agents (JP1/Base)
-
Forward JP1 events generated while JP1/Base is stopped on the manager.
Configure JP1/Base on the agents to retry event forwarding, taking into account how long JP1/Base will be stopped on the manager. Configure it to retry at fixed intervals in case event forwarding fails because an error occurs or because JP1/Base is stopped on the destination host.
For details, see the description of setting the event service in the JP1/Base User's Guide.
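When choosing the retry settings, it can help to check that the retry window outlasts the planned downtime. The sketch below assumes a simple model in which forwarding is retried at a fixed interval until a time limit expires. The parameter names `retry_interval_s` and `retry_limit_s` are hypothetical, and the actual setting names and default values should be taken from the JP1/Base User's Guide.

```python
def retry_covers_downtime(downtime_s, retry_interval_s, retry_limit_s):
    """Return True if, in this model, a retry falls at or after the end
    of the downtime and no later than the retry limit."""
    if retry_interval_s <= 0:
        raise ValueError("retry interval must be positive")
    # Number of intervals needed to reach the end of the downtime
    # (ceiling division using integer arithmetic).
    attempts_needed = -(-downtime_s // retry_interval_s)
    first_retry_after_downtime = attempts_needed * retry_interval_s
    return first_retry_after_downtime <= retry_limit_s

# JP1/Base on the manager stopped for 2 hours (hypothetical plan).
print(retry_covers_downtime(7200, 600, 3600))    # limit shorter than downtime
print(retry_covers_downtime(7200, 600, 10800))   # limit longer than downtime
```

If the check fails for the planned downtime, either the retry limit needs to be extended for the maintenance period or the downtime of JP1/Base on the manager needs to be shortened.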