Hitachi

Job Management Partner 1 Version 10 Job Management Partner 1/Performance Management - Remote Monitor for Platform Description, User's Guide and Reference


9.2.5 Troubleshooting problems with collection and management of performance data

This subsection describes how to handle problems that are related to collection and management of Performance Management performance data.

(1) The size of the PFM - RM for Platform's Store database is not reduced even though the data retention period was shortened

If the Store database has already reached its maximum file size, shortening the data retention period alone does not reduce the file size. In this case, shorten the retention period, back up the Store database, and then restore it.

For details about how to specify the data retention period, see the chapter that describes management of operation monitoring data in the Job Management Partner 1/Performance Management User's Guide. For details about how to back up and restore the Store database, see the chapter that describes backup and restore processing in the Job Management Partner 1/Performance Management User's Guide.

(2) The message "Illegal data was detected in the Store database" is output to the common message log

An inconsistency might have occurred in the Store database due to an unexpected service stop or machine shutdown. Take one of the following actions:

(3) PFM - RM for Platform was started, but no performance data is being collected

If the Status field value in the PD record is ERROR, take appropriate action based on the Reason field value.

The following describes the items to be checked for each Reason field value.

(a) Connection failed: Connection to the monitored host failed.

When the monitored host is running Windows
  • Is the monitored host running?

  • Is the WMI service running on the monitored host?

  • Were the settings for the following specified correctly when the monitoring target was set up?#1

    • TargetHost

  • Can the name be resolved by the host name (TargetHost) that was specified when the monitoring target was set up?

  • Were the following WMI connection setup procedures performed correctly?

    • DCOM setting at the PFM - RM host

    • WMI namespace setting at the monitored host

    • Firewall setting at the monitored host

  • If there is a firewall between PFM - RM for Platform and the monitoring target, is the firewall passage port set appropriately?

When the monitored host is running UNIX
  • Is the monitored host running?

  • Is the SSH service running on the monitored host?

  • Were the following settings specified correctly when the monitored host was set up?#1

    • TargetHost

    • UseCommonAccount#2

    • User#3

    • Private_Key_File#3

    • Port

  • Can the name be resolved by the host name (TargetHost) that was specified when the monitoring target was set up?

  • Were the settings for the following items specified correctly when the instance environment was set up?#4 (This applies only when the PFM - RM host is running Windows.)

    • SSH_Client

    • Perl_Module

  • Was the SSH connection setup procedure performed correctly?

  • If there is a firewall between PFM - RM for Platform and the monitoring target, is the firewall passage port set appropriately?

#1

To check the items that have been set up, execute the jpcconf target setup command. If you are using common account information, execute the jpcconf acc display command to check the setting items. Alternatively, in PFM - Web Console, from the Remote Monitor Collector service of PFM - RM for Platform, view the Remote Monitor Configuration property to check the settings.

#2

This item is displayed when both the version of PFM - RM for Platform and the version of the prerequisite program (PFM - Manager or PFM - Base) in the same device as PFM - RM for Platform are 10-50 or later.

#3

If you are using common account information, the values of User and Private_Key_File are the respective values that are specified in User and Private_Key_File in common account information (ssh).

#4

To check the items that have been set up, execute the jpcconf inst setup command. Alternatively, in PFM - Web Console, from the Remote Monitor Collector service of PFM - RM for Platform, view the Remote Monitor Configuration property to check the settings.
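Two of the checks above (whether the TargetHost name can be resolved, and whether the connection port can be reached through any firewall) can be approximated from the PFM - RM host with a short script. The following is a minimal sketch, not part of the product. The function names are hypothetical. It assumes that the port to probe is the value of Port for an SSH connection to a UNIX host (commonly 22), or TCP port 135 (the DCOM endpoint mapper) for a WMI connection to a Windows host. A successful TCP connection shows only that the port is reachable. It does not prove that SSH authentication or the WMI and DCOM settings are correct.

```python
import socket


def resolve_host(host):
    """Return the sorted IP addresses that host resolves to, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_target(host, port):
    """Classify the result as 'name resolution failed', 'port unreachable', or 'ok'."""
    if not resolve_host(host):
        return "name resolution failed"
    if not port_is_open(host, port):
        return "port unreachable"
    return "ok"
```

For example, check_target("targethost1", 22) would be called with the TargetHost value and the Port value of a UNIX monitoring target. The host name targethost1 is a placeholder.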

(b) Authorization failed: Authorization of the monitored host failed.

The items to be checked in Windows are described below. This error is not applicable to UNIX.

When the monitored host is running Windows
  • Were the following settings specified correctly when the monitoring target was set up?#1

    • UseCommonAccount#2

    • User#3

    • Password#3

    • Domain#3

  • Were the following WMI connection setup procedures performed correctly?

    • DCOM setting on the PFM - RM host

    • DCOM setting on the monitored host

#1

To check the settings, execute the jpcconf target setup command. If you are using common account information, execute the jpcconf acc display command to check the setting items. Alternatively, use PFM - Web Console to check the settings by displaying the Remote Monitor Configuration properties from the Remote Monitor Collector service of PFM - RM for Platform.

#2

This item is displayed when both the version of PFM - RM for Platform and the version of the prerequisite program (PFM - Manager or PFM - Base) in the same device as PFM - RM for Platform are 10-50 or later.

#3

If you are using common account information, the values of User, Password, and Domain are the respective values that are specified in User, Password, and Domain in common account information (wmi).

(c) Collection timeout: Performance data collection did not end within the specified time

The items to be checked are shown below:

  • In the instance environment, is the collection interval of the collection process for the monitored host too short?

    The collection interval of the collection process means the Interval setting in the instance environment. If this interval is short, either reduce the number of monitoring targets in the instance environment or lengthen the collection interval of the collection process.

  • Has the monitored host been started?

  • Were settings specified correctly when the monitored host was set up?#

  • Was the WMI connection setting procedure followed correctly (when the monitored host is running Windows)?

  • Was the SSH connection setting procedure followed correctly (when the monitored host is running UNIX)?

  • Is the PFM - RM host or the monitored host under a heavy system load?

If the cause of the error cannot be determined, collect maintenance data and contact the system administrator.

#

To check the items that have been set up, execute the jpcconf target setup command. Alternatively, in PFM - Web Console, from the Remote Monitor Collector service of PFM - RM for Platform, view the Remote Monitor Configuration property to check the settings.

(d) Invalid environment (SSH_Client): The file specified in SSH_Client when the instance environment was set up does not exist (when the PFM - RM host is running Windows and the monitored host is running UNIX)

The item to be checked when the PFM - RM host is running Windows and the monitored host is running UNIX is shown below. This error is not applicable when the monitored host is running Windows or when the PFM - RM host is running UNIX.

  • Was the following setting specified correctly when the instance environment was set up?#

    SSH_Client

#

To check the item that has been set up, execute the jpcconf inst setup command. Alternatively, in PFM - Web Console, from the Remote Monitor Collector service of PFM - RM for Platform, view the Remote Monitor Configuration property to check the setting.

(e) Invalid environment (Perl_Module): The file specified in Perl_Module when the instance environment was set up does not exist (when the PFM - RM host is running Windows and the monitored host is running UNIX)

The item to be checked when the PFM - RM host is running Windows and the monitored host is running UNIX is shown below. This error is not applicable when the monitored host is running Windows or when the PFM - RM host is running UNIX.

  • Was the following setting specified correctly when the instance environment was set up?#

    Perl_Module

#

To check the item that has been set up, execute the jpcconf inst setup command. Alternatively, in PFM - Web Console, from the Remote Monitor Collector service of PFM - RM for Platform, view the Remote Monitor Configuration property to check the setting.

(f) Invalid environment (Private_Key_File): The file specified in Private_Key_File when the monitored host was set up does not exist

The environments in which the message "The file specified in Private_Key_File when the monitored host was set up does not exist" is output differ according to the version of PFM - RM for Platform, as follows:

  • When the version of PFM - RM for Platform is from 09-50 to 10-00: The message is output if the PFM - RM host is running Windows, and the monitored host is running UNIX.

  • When the version of PFM - RM for Platform is 10-50 or later: The message is output if the PFM - RM host is running Windows or UNIX, and the monitored host is running UNIX.

The items below are checked when the monitored host is running UNIX. This error is not applicable when the monitored host is running Windows.

  • Were the following items specified correctly when the monitored host was set up?#1

    • UseCommonAccount#2

    • Private_Key_File#3

#1

To check the item that has been set up, execute the jpcconf target setup command. If you are using common account information, execute the jpcconf acc display command to check the setting items.

Alternatively, in PFM - Web Console, from the Remote Monitor Collector service of PFM - RM for Platform, view the Remote Monitor Configuration property to check the setting.

#2

This item is displayed when both the version of PFM - RM for Platform and the version of the prerequisite program (PFM - Manager or PFM - Base) in the same device as PFM - RM for Platform are 10-50 or later.

#3

If you are using common account information, the value of Private_Key_File is the value that is specified in Private_Key_File in common account information (ssh).
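The Reason values in (d), (e), and (f) all indicate that a configured file does not exist. Once the configured values have been confirmed with the commands described above, the existence of each file can be checked on the PFM - RM host. The following is a minimal sketch under that assumption. The function name and the dictionary layout are hypothetical, and the paths must be copied by hand from the actual settings because the sketch does not read them from Performance Management.

```python
import os


def missing_settings(settings):
    """Return the names of settings whose configured path is not an existing regular file.

    settings maps a setting name (for example SSH_Client, Perl_Module, or
    Private_Key_File) to the path that was specified for it.
    """
    return [name for name, path in settings.items() if not os.path.isfile(path)]
```

An empty list means that every listed file exists. Any name that is returned identifies a setting that must be corrected.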

(g) Values other than those described above

  • Collect maintenance data and contact the system administrator.

  • If the monitored host is running Windows, check the application event log and take the appropriate corrective action.

    To use PFM - RM for Platform to collect performance data for the records listed below, PFM - RM for Platform must be set up so that objects can be monitored on the performance console.# The table below lists the objects corresponding to each record, the source (service) names that are output to the event log, and the performance extension DLLs.

    #

    You can use Performance to check the object name that corresponds to each record. If there is no corresponding object, specify the settings according to the procedure provided in the Microsoft Knowledge Base so that the objects can be monitored.

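    Table 9‒2: Objects corresponding to each record, the source (service) names that are output to the event log, and the performance extension DLLs

    | No. | Category | Record name (record ID) | Object name | Source (service) name that is output to the event log | Performance extension DLL |
    |-----|----------|-------------------------|-------------|-------------------------------------------------------|---------------------------|
    | 1 | Disk | Logical Disk Overview (PI_LDSK) | LogicalDisk | WinMgmt | perfdisk.dll |
    | 2 | Disk | Physical Disk Overview (PI_PDSK) | PhysicalDisk | WinMgmt | perfdisk.dll |
    | 3 | Network-related | Network Interface Overview (PI_NET) | Network Interface | WinMgmt | perfctrs.dll |
    | 4 | OS in general (such as processors and memory) | System Overview (PI) | Memory | WinMgmt | perfos.dll |
    | 5 | OS in general (such as processors and memory) | System Overview (PI) | System | WinMgmt | perfos.dll |
    | 6 | OS in general (such as processors and memory) | System Overview (PI) | Processor | WinMgmt | perfos.dll |
    | 7 | OS in general (such as processors and memory) | Processor Overview (PI_CPU) | Processor | WinMgmt | perfos.dll |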

    If the source name WinMgmt is recorded in the application event log, PFM - RM for Platform might not function correctly, or the records corresponding to that source (service) might not be collected. If the application event log contains any of the events shown in the table below, reinstall the source (service), eliminate the cause of the error as described in the Microsoft Knowledge Base, or contact the developer of the source (service). Then repair the environment so that these events are no longer recorded.

    The following table shows examples of application event logs when PFM - RM for Platform is not functioning correctly or the records for the source (service) cannot be collected.

    Table 9‒3: Examples of application event logs when records are not collected successfully

    | No. | Event ID | Source (service) name | Event log information |
    |-----|----------|-----------------------|-----------------------|
    | 1 | 37 | WinMgmt | WMI ADAP 0x0 was unable to read the file-name performance library due to an unknown problem in the library. |
    | 2 | 41 | WinMgmt | WMI ADAP did not create object index n for the performance library service-name because the value was not found by the 009 subkey. |
    | 3 | 61 | WinMgmt | WMI ADAP was unable to process the file-name performance library due to a time violation in the open function. |

  • If the monitoring target is running UNIX, check whether the df command can be correctly executed, then take the appropriate recovery step.

    If the monitoring target is running UNIX, PFM - RM for Platform needs to run in a state in which the df command can be executed normally and the information in the mounted remote file system can be referenced. If you specify Y for the setting Disk_Category of the instance environment when the df command cannot be executed normally and the remote file system does not return a response, the Remote Agent service will not be able to collect performance data correctly. In this case, take the following actions:

    1. Change the setting Disk_Category of the instance environment to N.

    2. Execute either of the following commands to stop the df process on the remote host specified as the monitoring target:

      • kill -TERM df-process-ID

      • kill df-process-ID

    3. Correctly mount the remote file system by taking the appropriate action, such as restarting the NFS daemon.

    4. Return the setting Disk_Category of the instance environment to Y.
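Before returning Disk_Category to Y in step 4, it can help to confirm on the monitored host that the df command now completes normally. The following is a minimal sketch of such a probe, intended to be run on the monitored UNIX host. The function name and the 10-second default timeout are assumptions, not values taken from Performance Management. Note also that a df process blocked on an unresponsive hard-mounted remote file system might not terminate promptly even when it is killed, in which case the probe itself can take longer than the timeout to return.

```python
import subprocess


def command_responds(argv, timeout_seconds=10.0):
    """Return True if argv exits with status 0 within timeout_seconds, otherwise False."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,   # the output itself is not needed for this probe
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,     # the child is killed if it runs longer than this
        )
    except subprocess.TimeoutExpired:
        # The command did not finish in time (for example, a remote file system is not responding).
        return False
    except OSError:
        # The command could not be started at all (for example, it was not found).
        return False
    return result.returncode == 0
```

Calling command_responds(["df"]) returns True only when df finishes successfully within the timeout. A False result suggests that the remote file system still needs attention before Disk_Category is set back to Y.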

(4) Alarms related to process monitoring are not reported as intended

Note the following when you are monitoring the process operation status of a monitored host that is running UNIX: An error alarm might be reported even though the monitoring target process is not stopped, and then a normal alarm might be reported at the following collection time.

In a UNIX environment, when a process generates child processes, copies of these processes are created, and as a result duplicate copies of the same processes might appear to exist. Therefore, keep in mind that the number of processes increases when a process that generates child processes is the monitoring target. Specifically, an error alarm can be reported if process information is collected at the time the number of processes increases, and a normal alarm can then be reported if process information is collected at the time the number of processes returns to 1.

To avoid this phenomenon, take the following steps:

If the process operation status information could not be collected from the OS, the number of monitoring target processes might become 0 and an alarm might be reported. To prevent this alarm, from the Alarms window, open the New Alarm Table > Main Information window or the Edit > Main Information window. Then in Advanced settings, select Report alarm when the following damping condition is reached and specify 2 occurrence(s) during/Interval(s).
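The effect of a damping condition on a transient abnormal value can be illustrated with a small simulation. The following is a sketch only. It assumes that an alarm is reported when the abnormal condition is met at least a given number of times within the most recent collection intervals, which is an interpretation of the damping setting and not a statement of how Performance Management evaluates it internally. The function name, the parameter names, and the interval count used in the example are hypothetical.

```python
from collections import deque


def damped_states(raw_abnormal, occurrences, intervals):
    """For each collection, return True if at least `occurrences` of the
    most recent `intervals` evaluations were abnormal."""
    window = deque(maxlen=intervals)  # keeps only the latest `intervals` results
    reported = []
    for abnormal in raw_abnormal:
        window.append(abnormal)
        reported.append(sum(window) >= occurrences)
    return reported


# One isolated abnormal collection (index 1) followed later by two consecutive ones.
raw = [False, True, False, False, True, True, False]
print(damped_states(raw, occurrences=2, intervals=2))
# [False, False, False, False, False, True, False]
```

With these assumed values, the isolated abnormal result at the second collection is not reported, and only the second of the two consecutive abnormal results triggers a report. This is the behavior that suppresses an alarm caused by a single collection in which the process count was temporarily wrong.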