Hitachi

JP1 Version 12 JP1/Performance Management User's Guide


17.2.1 Troubleshooting problems related to setup and service startup

(1) A Performance Management program service (other than PFM - Web Console) does not start

Possible causes and solutions:

(2) The PFM - Web Console service does not start

Possible causes and solutions are provided below. If only a cause is provided, take the corrective action described in 17.2.1(1) A Performance Management program service (other than PFM - Web Console) does not start.

(3) A service takes a long time to start once startup is requested

It might take a long time for a service to actually start after you execute the jpcspm start command or start the service by selecting Services in Windows. If the delay is due to any of the following factors, subsequent service startups should take less time.

(4) Immediately after a Performance Management program service is stopped, another program starts service and communication is not performed properly

Immediately after you stop a Performance Management program service, a service of another program might start and use the same port that the stopped service was using. In this case, communication might not be performed properly. You can use either of the following techniques to avoid this problem:

(5) After the message "The disk capacity is insufficient" is output, the Master Store service or Agent Store service stops

If there is insufficient space on the disk used by the Store database, the storing of data to the Store database is cancelled. In this case, after the message "The disk capacity is insufficient" is output, the Master Store service, Agent Store service, or Remote Monitor Store service stops.

If this message appears, use either of the following techniques to solve this problem.

If the Master Store service, the Agent Store service, or the Remote Monitor Store service does not start even after you take these actions, there might be unrecoverable logical errors in the Store database. In this case, you must restore the Store database from the backup data, and then restart the service. If you have no backup data, you must initialize the Store database, and then start the service. To initialize the Store database, delete all of the following files in the installation directories of the Store database:

When the Store database version is 1.0
  • Files with the extension .DB

  • Files with the extension .IDX

When the Store database version is 2.0
  • Files with the extension .DB

  • Files with the extension .IDX

    Delete the files in the STPI, STPD, and STPL directories.

    (Do not delete the STPI, STPD, and STPL directories themselves.)
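The file selection described above can be sketched in Python as follows. This is a minimal illustration, not a tool provided with Performance Management: the function name store_files_to_delete and its arguments are hypothetical, and the sketch assumes that "the files in the STPI, STPD, and STPL directories" means every regular file beneath those directories at any depth. It only lists candidate files and deletes nothing.

```python
from pathlib import Path

# Directory names taken from the text above (Store database version 2.0 only).
V2_SUBDIRS = ("STPI", "STPD", "STPL")


def store_files_to_delete(store_dir, store_version):
    """Return the files that initializing the Store database would delete.

    store_dir is the Store database installation directory (hypothetical
    argument; pass the directory that applies to your environment).
    store_version is "1.0" or "2.0".
    """
    root = Path(store_dir)
    # Both versions: files with the extension .DB or .IDX in the directory.
    targets = [p for p in root.iterdir()
               if p.is_file() and p.suffix in (".DB", ".IDX")]
    if store_version == "2.0":
        # Assumption: every regular file under STPI, STPD, and STPL is
        # a deletion target, while the directories themselves are kept.
        for name in V2_SUBDIRS:
            sub = root / name
            if sub.is_dir():
                targets.extend(p for p in sub.rglob("*") if p.is_file())
    return sorted(targets)
```

Because the function returns only files, a caller that reviews the list and then removes each entry (for example with Path.unlink()) leaves the STPI, STPD, and STPL directories in place.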

The following shows the default installation directories of the Store database.

Store database installation directory for performance data:

For details, see the appropriate PFM - Agent or PFM - RM manual.

Store database installation directory for event data:
When PFM - Manager is in a non-cluster environment
  • In Windows:

    installation-folder\mgr\store\

  • In UNIX:

    /opt/jp1pc/mgr/store/

When PFM - Manager is in a cluster environment
  • In Windows:

    environment-directory\jp1pc\mgr\store\

  • In UNIX:

    environment-directory/jp1pc/mgr/store/

(6) The Correlator service takes a long time to start after PFM - Manager restarts

The Correlator service checks alarm status on agents when it starts. If you restart PFM - Manager without stopping agents, the Correlator service might take some time to start. If you want to prevent this, consider enabling the Correlator quick start function.

When you enable the Correlator quick start function, the Correlator service does not check alarm status on agents during startup. Instead, it checks alarm status after it has started, and only when necessary. As a result, the Correlator service requires less time to start. When the Correlator service reports checked alarm status, PFM - Manager might issue agent events containing one of the following messages.

| Message | Description |
|---|---|
| State information | The Correlator service received an alarm event from an agent and successfully checked the alarm status. |
| State information (Unconfirmed) | The Correlator service received an alarm event from an agent but could not check the alarm status. |
| State change (Unconfirmed) | The Correlator service received an alarm event from an agent whose alarm status was unknown. The Correlator service assumed the status of PFM - Agent or PFM - RM based on the content of the received alarm event. |

The following table describes the triggers that prompt PFM - Manager to issue agent events containing the messages described in the above table.

| Status of the Correlator quick start function | Trigger for the Correlator service to check alarm status | Message in agent events when the check succeeds | Message in agent events when the check fails |
|---|---|---|---|
| Disabled (when the Retry Getting Alarm Status label is enabled in the startup information file (jpccomm.ini)) | When PFM - Manager starts | State information | State information (Unconfirmed) |
| Disabled (when the Retry Getting Alarm Status label is enabled in the startup information file (jpccomm.ini)) | When the Correlator service fails to check alarm status on an agent and receives the next alarm event from the agent | State information | State change (Unconfirmed)# |
| Enabled | When PFM - Manager starts and then the Correlator service receives the first alarm event from an agent | State information | State information (Unconfirmed) |
| Enabled | When the Correlator service fails to check alarm status on an agent and then receives the next alarm event from the agent | State information | State change (Unconfirmed)# |

#
The message is output only when the status of PFM - Agent or PFM - RM that is assumed by PFM - Manager changes.
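The relationship between the function status, the trigger, and the resulting message can also be written as a lookup. The following Python sketch simply restates the information above; the key names ("disabled", "enabled", "startup", "retry") and the function name are hypothetical labels chosen for illustration and do not appear in Performance Management.

```python
# Message names come from the text above. The key names are hypothetical:
#   "startup": the first check after PFM - Manager starts
#   "retry":   the check made on the next alarm event after a failed check
AGENT_EVENT_MESSAGE = {
    ("disabled", "startup", True): "State information",
    ("disabled", "startup", False): "State information (Unconfirmed)",
    ("disabled", "retry", True): "State information",
    ("disabled", "retry", False): "State change (Unconfirmed)",
    ("enabled", "startup", True): "State information",
    ("enabled", "startup", False): "State information (Unconfirmed)",
    ("enabled", "retry", True): "State information",
    ("enabled", "retry", False): "State change (Unconfirmed)",
}


def agent_event_message(quick_start, trigger, check_succeeded,
                        assumed_status_changed=True):
    """Return the message for an agent event, or None if none is output."""
    message = AGENT_EVENT_MESSAGE[(quick_start, trigger, check_succeeded)]
    # Footnote (#): "State change (Unconfirmed)" is output only when the
    # status assumed by PFM - Manager changes.
    if message == "State change (Unconfirmed)" and not assumed_status_changed:
        return None
    return message
```

Written this way, the lookup makes it visible that enabling the function changes when the first check happens, not which messages can be issued.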

To enable or disable the Correlator quick start function:

  1. Stop the Performance Management programs and services.

    When the Performance Management programs and services are running on the PFM - Manager host, execute the jpcspm stop command to stop all of them. When Performance Management is running in a cluster system, use cluster software to stop all the Performance Management programs and services.

  2. Use a text editor to open the jpccomm.ini file on the PFM - Manager host.

    The jpccomm.ini file is stored in the following location:

    For a physical host

    • For Windows

      installation-folder\

    • For UNIX

      /opt/jp1pc/

    For a logical host

    • For Windows

      environment-directory\jp1pc\

    • For UNIX

      environment-directory/jp1pc/

  3. Enable or disable the Correlator quick start function.

    In the Common Section section of the jpccomm.ini file, set the following label to the desired value.

    • Enabling the function

      Correlator Startup Mode=1

    • Disabling the function

      Correlator Startup Mode=0

  4. Save and close the jpccomm.ini file.

  5. Start the Performance Management programs and services.
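Steps 2 through 4 can also be performed by a script instead of a text editor. The following Python sketch is one possible approach under stated assumptions: it assumes that jpccomm.ini marks the section with a [Common Section] header line and stores each label as a label=value line, and the function name set_ini_label is hypothetical. It edits text in memory and leaves every other line unchanged; reading and writing the file (and choosing its encoding) is left to the caller.

```python
def set_ini_label(text, section, label, value):
    """Return text with label=value set inside [section].

    Lines other than the target label are kept unchanged. If the label is
    missing from the section, it is added after the section's last
    non-blank line. Raises ValueError if the section is not found.
    """
    lines = text.splitlines()
    header = "[" + section + "]"
    in_section = False
    insert_at = None  # where to add the label if it is not present
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_section = (stripped == header)
            if in_section:
                insert_at = i + 1
            continue
        if in_section and stripped:
            insert_at = i + 1
            name = stripped.split("=", 1)[0].strip()
            if "=" in stripped and name == label:
                lines[i] = f"{label}={value}"
                return "\n".join(lines) + "\n"
    if insert_at is None:
        raise ValueError("section not found: " + section)
    lines.insert(insert_at, f"{label}={value}")
    return "\n".join(lines) + "\n"
```

For example, calling set_ini_label(contents, "Common Section", "Correlator Startup Mode", "1") on the file contents corresponds to enabling the function in step 3. As in the manual procedure, stop the Performance Management programs and services before changing the file and start them afterward.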

(7) The Agent Collector service or Remote Monitor Collector service does not start

If the Agent Collector service or the Remote Monitor Collector service fails to start when PFM - Agent or PFM - RM starts on a PFM - Agent host or PFM - RM host that is running Windows, one of the following messages might be output to the Windows event log when Windows restarts.

These messages appear when the Windows Service Control Manager times out. The Service Control Manager is likely to time out if the communication load on PFM - Manager is high and PFM - Manager takes a long time to issue a response. These messages are output if all of the following conditions are met:

To prevent the Service Control Manager from timing out, perform either of the following procedures:

(8) Multiple agents that start simultaneously take a long time to recover from stand-alone mode

A monitoring agent that enters stand-alone mode during startup automatically tries to reconnect to the monitoring manager. If it succeeds, the monitoring agent enters normal mode.

If you start multiple monitoring agents simultaneously, communication from the monitoring agents to the monitoring manager becomes concentrated, connection errors can occur, and multiple monitoring agents might enter stand-alone mode. If those monitoring agents then repeatedly try to reconnect to the monitoring manager at the same time, communication becomes concentrated again and the monitoring agents might be delayed in entering normal mode.

If this occurs, set the Random Retry Mode label, which disperses reconnection attempts, to 1 (enabled) in the Common Section section of the startup information file (jpccomm.ini).

With this setting, monitoring agents in stand-alone mode attempt to reconnect to the monitoring manager at random intervals rather than at regular intervals, which avoids a concentration of communication.
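The effect of randomized reconnection intervals can be illustrated with a small simulation. The following Python sketch is not how Performance Management is implemented: the retry interval, the number of agents, and the uniform distribution used for the random waits are all assumptions chosen only to show why regular intervals keep agents synchronized while random intervals spread them out.

```python
import random
from collections import Counter


def peak_reconnects(num_agents, interval, duration, randomize, seed=0):
    """Return the largest number of reconnection attempts in any one second.

    All agents enter stand-alone mode at time 0. With randomize=False every
    agent retries exactly every `interval` seconds. With randomize=True each
    wait is drawn uniformly from 0.5 to 1.5 times `interval`, which is an
    illustrative distribution and not the one the product uses.
    """
    rng = random.Random(seed)
    per_second = Counter()
    for _ in range(num_agents):
        t = 0.0
        while True:
            if randomize:
                t += rng.uniform(0.5 * interval, 1.5 * interval)
            else:
                t += interval
            if t >= duration:
                break
            per_second[int(t)] += 1  # count the attempt in its 1-second slot
    return max(per_second.values(), default=0)


fixed_peak = peak_reconnects(100, 10, 60, randomize=False)
random_peak = peak_reconnects(100, 10, 60, randomize=True)
```

With regular intervals, all 100 simulated agents retry in the same second, so fixed_peak equals the number of agents. With random intervals the attempts fall in different seconds, so random_peak is lower.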

Note that these settings are applicable when the version of PFM - Manager or PFM - Base in the system is 10-10-20 or later and the version of PFM - Agent or PFM - RM is 10-00 or later.

For details on how to change the startup information file (jpccomm.ini), see the part that explains the startup information file (jpccomm.ini) in the appendixes of the manual JP1/Performance Management Reference.