4.3.1 Handling a network device node that is going down

If an incident reports that a network device node is going down, identify the location of the problem, and then take corrective action.

Procedure

Use the topology map of the NNMi console to check the location in which a failure occurred.
When a failure is detected, the color of an icon on the map changes.

If maps are arranged in a hierarchy, open the child node group to check its status. A node group displays the most critical status among its members, and the statuses of child node groups are applied to the parent node group.
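The status roll-up described above can be sketched as follows. This is only an illustrative model, not NNMi code: the `Status` ordering, the `NodeGroup` class, and the group names are assumptions made for the example, and the actual status calculation in the product may be configured differently.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical severity ordering: a higher value means more critical.
    NORMAL = 0
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


@dataclass
class NodeGroup:
    # Hypothetical container for member nodes and child node groups.
    name: str
    node_statuses: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def status(self) -> Status:
        # Child group statuses are applied to the parent, and the
        # most critical value among all members is the one displayed.
        inherited = [child.status() for child in self.children]
        return max(self.node_statuses + inherited, default=Status.NORMAL)


branch = NodeGroup("branch-office", node_statuses=[Status.NORMAL, Status.CRITICAL])
site = NodeGroup("site", node_statuses=[Status.NORMAL], children=[branch])
print(site.status().name)
```

In this sketch the parent group shows the critical status even though its own node is normal, which is why opening the child node group is needed to find the failing node.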
Open the Incident Browsing workspace to check the incident that was reported as the root cause.
Open the Open Key Incidents view or the All Incidents view to review the content of the incident, and then check the location that has the problem. If you select the target node, and then open the Incident tab, you can check the incidents for that node in chronological order. First, check Source Node, Source Object, and Custom Attributes.
Double-click the incident to check detailed information about the incident.
The Incident view is displayed. Use the message and name information to check the type of the incident. Use the source node information to check the location where the failure occurred. Use the date and time information to check the time when the failure occurred.
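As a rough illustration of the checks in the preceding steps, the sketch below models an incident with the fields mentioned in this procedure and lists the incidents for one node in chronological order. The `Incident` class, the `incidents_for_node` helper, and the sample values are hypothetical; they do not represent an NNMi interface.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Incident:
    # Hypothetical record holding the fields checked in this procedure.
    name: str
    message: str
    source_node: str
    source_object: str
    occurred_at: datetime


def incidents_for_node(incidents, node):
    """Return the incidents whose source node matches, oldest first."""
    matching = [i for i in incidents if i.source_node == node]
    return sorted(matching, key=lambda i: i.occurred_at)


# Sample data invented for the example.
sample = [
    Incident("NodeDown", "Node is down", "router-a", "router-a",
             datetime(2024, 1, 10, 9, 5)),
    Incident("InterfaceDown", "Interface is down", "router-a", "Gi0/1",
             datetime(2024, 1, 10, 9, 1)),
    Incident("InterfaceDown", "Interface is down", "switch-b", "Gi0/2",
             datetime(2024, 1, 10, 9, 3)),
]

for incident in incidents_for_node(sample, "router-a"):
    print(incident.occurred_at, incident.name, incident.source_object)
```

Reading the incidents for one node from oldest to newest can help show which event came first, for example an interface failure that preceded the node going down.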

Note

For an SNMP trap incident, use the Custom Attributes tab to check the detailed information. In the Custom Attributes tab, the information reported by an SNMP trap is displayed. Check the content by referring to the documentation of the devices that issued the SNMP trap.
Set Lifecycle State of the incident to In Progress.
After you understand the details of the problem, from the Lifecycle State pull-down menu, select a state. Immediately after an incident is registered, the state is set to Registered.
From the Assigned To pull-down menu, select your account.
If you want to assign the incident to an operator other than yourself, make sure that the operator can access the assigned incident.
Click (Save and Close).
The changed setting is saved.
Check the status of the related parts.
A network failure often affects related parts of the communication routes. Therefore, check not only the root cause but also the related parts.
- In a map window, check the status of the related parts.
- In the Monitoring workspace, make sure that no other part has a problem.
Take corrective action.
If you configure automatic actions for an incident beforehand, the specified command can be automatically executed.
After taking corrective action, set Lifecycle State of the incident to Completed.
The state is automatically set to Closed when the system determines that the problem no longer exists.
Click (Save and Close).
The changed setting is saved.
Check the changed state of the incident.
In the Incident Browsing workspace, make sure that Lifecycle State is set to Closed.
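The Lifecycle State changes used in this procedure can be summarized in a small sketch. It models only the path described here (Registered, In Progress, Completed, then Closed set by the system). The function names and the transition table are assumptions for illustration, and the product may permit other transitions.

```python
from enum import Enum


class LifecycleState(Enum):
    REGISTERED = "Registered"
    IN_PROGRESS = "In Progress"
    COMPLETED = "Completed"
    CLOSED = "Closed"


# Operator-driven changes described in this procedure only (assumption:
# other transitions are deliberately left out of this sketch).
OPERATOR_TRANSITIONS = {
    LifecycleState.REGISTERED: {LifecycleState.IN_PROGRESS},
    LifecycleState.IN_PROGRESS: {LifecycleState.COMPLETED},
}


def operator_set(current, target):
    """Apply a state change that an operator selects from the menu."""
    if target not in OPERATOR_TRANSITIONS.get(current, set()):
        raise ValueError(f"{current.value} -> {target.value} is not in this sketch")
    return target


def system_close(current, problem_resolved):
    """Model the automatic change to Closed once no problem remains."""
    return LifecycleState.CLOSED if problem_resolved else current


state = LifecycleState.REGISTERED
state = operator_set(state, LifecycleState.IN_PROGRESS)
state = operator_set(state, LifecycleState.COMPLETED)
state = system_close(state, problem_resolved=True)
print(state.value)
```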

Result

You have now successfully taken corrective action for a network device node going down.