uCosminexus Application Server, Maintenance and Migration Guide

6.7.2 Troubleshooting when a response is delayed

This subsection describes troubleshooting for a delayed response.

Organization of this subsection
(1) Flow of actions when a response is delayed
(2) Flow of actions when a response is delayed

(1) Flow of actions when a response is delayed

The following figure shows the flow of troubleshooting when a response is delayed.

Figure 6-7 Flow of actions when a response is delayed

[Figure]

The details of the processing shown in the figure are described in subsection (2).

(2) Flow of actions when a response is delayed

The following steps describe the actions to take at each stage of the flow for a delayed response:

  1. Check the CPU usage
    Check the CPU usage of the applicable process.
    The following figure shows an example of CPU usage displayed in Task Manager.

    Figure 6-8 CPU usage

    [Figure]

    Hint

    If one core is close to 100%
    A CPU bottleneck is a possible cause, for example processing that has run into an infinite loop or runaway recursive invocation. Go to step 2 and proceed with the check.

    If one core is close to 0%
    A possible cause is a back-end process that is not responding or is deadlocked, and therefore does not return a response. Go to step 2 and proceed with the check.
  2. Acquire the PRF trace
    Execute the mngsvrutil command to output the PRF trace.
    Example of execution
    mngsvrutil -m 123.45.67.89 -u admin2 collect allPrfTraces
  3. Open the PRF trace
    Open the PRF trace.
    Output destination
    C:\Program Files\Hitachi\Cosminexus\manager\log\prf
    File name
    The file name depends on the trace information that is collected, as listed below.
    In each file name, date-and-time is the date and time at which the PRF trace was collected.
    Performance tracer types                                                   File name
    All the performance tracers running on the hosts in the management domain  management-domain-name-date-and-time.zip
    All the performance tracers running on a specific host                     host-name-date-and-time.zip
    Specific performance tracer                                                logical-server-name-date-and-time.zip
  4. Check the PRF trace
    Check the Time column in the PRF trace and find the processing that takes a long time.
    The PRF trace is trace information that records events across processes, together with data that is useful for performance analysis and error analysis.

    Figure 6-9 Example of output of PRF trace

    [Figure]

    In the example, there is a gap of 11 minutes after the SQL statement is issued. Furthermore, the execution of the SQL statement has not ended. Therefore, a problem might have occurred in the database while the SQL statement was being executed.
    Note that the PRF trace is easier to check if you open it in spreadsheet software.
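
    The gap check in this step can also be sketched in code. The following is a minimal illustration only, not a parser for the actual PRF trace format: it assumes the records have already been reduced to hypothetical (Time, event) pairs with times in hh:mm:ss form, and the function name find_gaps and the one-minute threshold are assumptions made for this example. It does not handle traces that cross midnight.

```python
from datetime import datetime, timedelta


def find_gaps(records, threshold=timedelta(minutes=1)):
    """Return (gap, earlier_event, later_event) for each pair of
    consecutive records whose times differ by at least `threshold`."""
    gaps = []
    for (t_prev, ev_prev), (t_next, ev_next) in zip(records, records[1:]):
        start = datetime.strptime(t_prev, "%H:%M:%S")
        end = datetime.strptime(t_next, "%H:%M:%S")
        gap = end - start
        if gap >= threshold:
            gaps.append((gap, ev_prev, ev_next))
    return gaps


# Hypothetical, simplified records: (Time column value, event description).
sample = [
    ("10:00:01", "request received"),
    ("10:00:02", "SQL statement issued"),
    ("10:11:02", "next recorded event"),
]

for gap, before, after in find_gaps(sample):
    print(f"{gap} between '{before}' and '{after}'")
```

    With the sample data, the only gap reported is the 11 minutes that follow the SQL statement, which corresponds to the kind of gap described for the example figure.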
  5. Output the thread dump
    Execute the mngsvrutil command to output the thread dump.
    Example of execution
    mngsvrutil -m 123.45.67.89 -u admin2 dump server
  6. Check the thread dump
    For an infinite loop
    The following figure shows an example of the thread dump output and the check points in the case of an infinite loop.

    Figure 6-10 Example of thread dump output (infinite loop)

    [Figure]
    Output the thread dump multiple times and compare, in time order, the stack traces of the threads that have the same tid in each thread dump.
    Point 1
    If the thread attribute is runnable, the thread is executable and is contributing to the increased CPU usage. (If the attribute is waiting for monitor entry, the thread is not executable and so does not increase the CPU usage.)
    Point 2
    If the thread with the same tid is runnable in multiple thread dump files, the thread might have been running for a long period of time.
    Point 3
    If a specific line in the same method is being executed repeatedly, an infinite loop might be suspected.
    Hint
    If the checks so far suggest an infinite loop, ask the developer to investigate.
    If an infinite loop is not suspected, go to step 7.
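
    The comparison in Points 1 to 3 can be sketched as follows. This is an illustration under stated assumptions, not a thread dump parser: each dump is assumed to be already reduced to a dictionary that maps a tid to a (thread attribute, top stack frame) pair, and the function name loop_suspects, the tid values, and the class names are hypothetical. A match only marks a thread as worth a closer look by the developer.

```python
def loop_suspects(dumps):
    """dumps: list of dicts mapping tid -> (attribute, top_frame),
    ordered by the time at which each thread dump was output.

    Returns the sorted tids that are runnable in every dump and are
    executing the same top stack frame in every dump."""
    if len(dumps) < 2:
        # A single dump cannot show that a thread keeps running.
        return []
    suspects = []
    for tid, (attribute, frame) in dumps[0].items():
        if attribute != "runnable":
            continue  # Point 1: only runnable threads consume CPU.
        # Points 2 and 3: runnable at the same line in all later dumps.
        if all(d.get(tid) == ("runnable", frame) for d in dumps[1:]):
            suspects.append(tid)
    return sorted(suspects)


# Hypothetical data from three dumps taken a few seconds apart.
dumps = [
    {"0x1a": ("runnable", "MyServlet.process(MyServlet.java:42)"),
     "0x1b": ("waiting for monitor entry", "Worker.run(Worker.java:10)")},
    {"0x1a": ("runnable", "MyServlet.process(MyServlet.java:42)"),
     "0x1b": ("runnable", "Worker.run(Worker.java:15)")},
    {"0x1a": ("runnable", "MyServlet.process(MyServlet.java:42)"),
     "0x1b": ("runnable", "Worker.run(Worker.java:20)")},
]

print(loop_suspects(dumps))
```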
    For a deadlock
    The following figure shows an example of thread dump output and the check points in the case of a deadlock.

    Figure 6-11 Example of thread dump output (deadlock)

    [Figure]
    The above figure shows an example of thread dump when a deadlock occurs.
    The thread attributes are output after nid:... in the example of output.
    Find the thread with the attribute waiting for monitor entry.
    Check the contents of "-waiting to lock..." and "-locked...". A deadlock exists if the threads are each waiting to acquire a lock that another of those threads already holds.
    Point 1
    If the thread attribute is runnable, this thread is executable, and so this thread is irrelevant to a deadlock.
    Point 2
    If the thread attribute is waiting for monitor entry, it indicates that this thread is waiting to acquire a lock.
    This thread might have caused the deadlock.
    Point 3
    If a thread has acquired a lock and is also waiting for a lock as described in Point 2, there is a high possibility that the thread is causing the deadlock.
    For the threads to which both Point 2 and Point 3 apply, compare the addresses of the locked objects to detect the deadlock.
    In the example, Thread-3 has acquired the <02A328C8> lock and is waiting to acquire <02A328C0>.
    On the other hand, Thread-1 has acquired the <02A328C0> lock and is waiting to acquire <02A328C8>. This shows that Thread-3 and Thread-1 are in a deadlock.
    Hint
    If the checks so far suggest a deadlock, ask the developer to investigate.
    If a deadlock is not suspected, go to step 7.
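
    The address comparison used to detect a deadlock can be sketched as a search for a cycle among the threads that hold and wait for locks. This is a minimal illustration, not a thread dump parser: the "-locked..." and "-waiting to lock..." entries are assumed to be already extracted into a dictionary, and the function name find_deadlocks and the Thread-2 entry are hypothetical. The lock addresses are the ones from the example above.

```python
def find_deadlocks(threads):
    """threads: dict mapping thread name ->
        {"locked": set of lock addresses, "waiting": address or None}.

    Returns a list of cycles; each cycle is a list of thread names in
    which every thread waits for a lock held by the next one."""
    # Map each lock address to the thread that currently holds it.
    owner = {lock: name
             for name, info in threads.items()
             for lock in info["locked"]}
    cycles = []
    seen = set()
    for start in threads:
        if start in seen:
            continue
        path = []
        current = start
        # Follow "waits for the holder of" links until the chain ends,
        # reaches an already examined thread, or loops back on itself.
        while (current is not None
               and current not in path
               and current not in seen):
            path.append(current)
            current = owner.get(threads[current]["waiting"])
        if current in path:
            cycles.append(path[path.index(current):])
        seen.update(path)
    return cycles


threads = {
    "Thread-1": {"locked": {"02A328C0"}, "waiting": "02A328C8"},
    "Thread-2": {"locked": set(), "waiting": None},  # runnable thread
    "Thread-3": {"locked": {"02A328C8"}, "waiting": "02A328C0"},
}

print(find_deadlocks(threads))
```

    For the sample data, the single cycle returned contains Thread-1 and Thread-3, which matches the manual comparison of the lock addresses.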
  7. Improve the business application and remove redundant processing
    If the results of the checks on the PRF trace and the thread dump suggest that the business application is causing the delay, review the application and take action.
    Hint
    If the problem is resolved, the troubleshooting process ends at this point.
    If the problem is not resolved and if the CPU usage is high, go to step 8.
    If the problem is not resolved and if the CPU usage is low, contact the help desk in accordance with your purchase agreement.
  8. Reduce the values of the parameters for the number of concurrently executing threads to limit the amount of concurrent processing
    Pending requests might accumulate, and each request might have to wait for some time before it is processed.
  9. Upgrade the machine CPU
    Note that upgrading the CPU might incur additional middleware license costs.
  10. Add more machines and distribute the load of the transactions
    Note that adding machines might incur additional hardware and software license costs.
    Hint
    If the problem is resolved, the troubleshooting process is complete.
    If the problem is not resolved, contact the help desk in accordance with your purchase agreement.