uCosminexus Application Server, Maintenance and Migration Guide


2.2.2 Flow of data acquisition when a trouble occurs

In a system built on Application Server, you can acquire troubleshooting data automatically. When you start a logical server, the Administration Agent starts monitoring it. If a failure occurs in the logical server, the Administration Agent detects the failure and notifies the Management Server. The Management Server collects the snapshot log data, and then stops and restarts the logical server.

The following figure shows the flow when the data is obtained automatically.

Figure 2-2 Flow for automatic data acquisition

[Figure]

The commands that are executed when an error is detected (step 2 of the figure) output the information required for troubleshooting. In step 3, the information output by these commands, together with any other information required for troubleshooting, is collected in compiled form as a snapshot log. The snapshot log can also be collected without using the failure detection command, after the logical server is stopped in step 4 of the figure, but in that case only the J2EE server information is collected. Hitachi therefore recommends that you use the failure detection command and collect the snapshot log in step 3 of the figure. For details about acquiring data by using failure detection commands, see 2.3.2 Collecting the Material Using Commands during Error Detection. For details about the snapshot log, see 2.3.3 Collecting the Snapshot Log.

You can also use the management command (mngsvrutil) of the Management Server to collect the snapshot log at any time. For details about collecting snapshot logs using the management command, see 2.3.3(4) Collecting snapshot logs using the management commands.

The following subsections describe the processing flow when the logical server process is down and when the logical server process hangs:

Organization of this subsection
(1) The processing flow when the process of logical server is down
(2) Processing flow when the logical server process hangs

(1) The processing flow when the process of logical server is down

After the logical server is started, the process monitoring of the Administration Agent periodically monitors the process by using the process ID of the logical server process. The following figure shows the processing flow when the logical server process is down.

Figure 2-3 The processing flow when the process of logical server is down

[Figure]

  1. The process is monitored periodically by using the process ID of the logical server process.
  2. If the logical server process ends abnormally, the Administration Agent detects that the process is down and notifies the Management Server.
    The existence of the process ID is checked during the monitoring of processes. The contents of this check differ according to the type of the logical server. For details, see 2.3.1 Logical Server Start and Recourse Information in the uCosminexus Application Server Operation, Monitoring, and Linkage Guide and 2.3.2 Stop Logical Server in the uCosminexus Application Server Operation, Monitoring, and Linkage Guide.
  3. If the Management Server finds that the process is down, the failure detection command is executed and the snapshot log is collected.
  4. The logical server restarts automatically after the failure detection command is executed and the snapshot log is collected.
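
The steps above can be sketched as a simple polling loop. This is only an illustrative model of the described flow, not the Administration Agent's or the Management Server's actual implementation. The function name and all four callbacks are hypothetical names introduced for this sketch; the real product performs these actions internally.

```python
from typing import Callable


def monitor_process_down(
    pid_exists: Callable[[], bool],
    run_failure_detection_command: Callable[[], None],
    collect_snapshot_log: Callable[[], None],
    restart_logical_server: Callable[[], None],
    max_polls: int,
) -> bool:
    """Model of the process-down flow (all callbacks are hypothetical).

    Returns True if a process-down failure was detected and handled
    within max_polls monitoring cycles, otherwise False.
    """
    for _ in range(max_polls):
        # Step 1: periodic monitoring by process ID.
        if pid_exists():
            continue
        # Step 2: the process ID no longer exists, so the process is
        # treated as down (the notification itself is not modeled).
        # Step 3: failure detection command, then snapshot log.
        run_failure_detection_command()
        collect_snapshot_log()
        # Step 4: automatic restart of the logical server.
        restart_logical_server()
        return True
    return False
```

In this sketch, the snapshot log is always collected before the restart, which mirrors the order of steps 3 and 4.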

(2) Processing flow when the logical server process hangs

After the logical server starts, the process monitoring of the Administration Agent periodically checks whether the logical server process is operating. The following figure shows the processing flow when the operation check finds that the logical server process has hung up.

Figure 2-4 Processing flow when the logical server process hangs up

[Figure]

  1. The operation of the logical server process is checked periodically.
    The operation is checked after the existence of the process ID is checked by the process monitoring. The contents of this check differ according to the type of the logical server.
    For details, see 2.3.1 Logical Server Start and Recourse Information in the uCosminexus Application Server Operation, Monitoring, and Linkage Guide and 2.3.2 Stop Logical Server in the uCosminexus Application Server Operation, Monitoring, and Linkage Guide.
  2. If the operation check fails twice consecutively (default value), the Administration Agent detects that the process has hung up and notifies the Management Server. You can change the number of consecutive operation check failures after which a process is determined to have hung up.
  3. If the Management Server detects that the process has hung up, it executes the failure detection command and collects the snapshot log.
  4. The automatic stop process is executed because the process is still running when it is detected as hung up.
  5. The Administration Agent executes the stop command of the logical server.
    The force stop command is executed if the logical server does not stop even after a certain time period has elapsed.
  6. The logical server restarts automatically after the failure detection command is executed and the snapshot log is collected.
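
The hang-detection rule in step 2 and the recovery order in steps 3 to 6 can be sketched as follows. This is a minimal model under stated assumptions, not the product's implementation: the function names, the action strings, and the `stopped_within_timeout` parameter are hypothetical. Only the default threshold of two consecutive failures comes from the description above.

```python
from typing import Iterable, List

# From the description above: two consecutive failures by default.
DEFAULT_FAILURE_THRESHOLD = 2


def detect_hang(
    check_results: Iterable[bool],
    threshold: int = DEFAULT_FAILURE_THRESHOLD,
) -> int:
    """Return the 1-based number of the operation check at which the
    process is judged to have hung up, or -1 if that never happens.

    A successful check (True) resets the consecutive-failure count.
    """
    consecutive_failures = 0
    for check_number, succeeded in enumerate(check_results, start=1):
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= threshold:
            return check_number
    return -1


def recover_from_hang(stopped_within_timeout: bool) -> List[str]:
    """Return the ordered actions of steps 3 to 6 (hypothetical labels)."""
    actions = [
        "run failure detection command",  # step 3
        "collect snapshot log",           # step 3
        "execute stop command",           # steps 4 and 5
    ]
    if not stopped_within_timeout:
        # Step 5: forced stop if the server is still running after
        # a certain time period has elapsed.
        actions.append("execute force stop command")
    actions.append("restart logical server")  # step 6
    return actions
```

For example, with the default threshold, a sequence of checks in which a single failure is followed by a success does not count as a hang, because the failures must be consecutive.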