Hitachi

JP1 Version 11 JP1/Performance Management - Agent Option for Service Response Description, User's Guide and Reference


13.8.2 PFM - Agent for Service Response cannot perform measurements

You can determine the outcome of a measurement from the result code, which is a common field in PI records. Any result code other than 0 (SUCCESS) indicates that the measurement failed, and you can use the value of the code to isolate the cause of the failure.

When a measurement fails, an error message may also be written to the integrated trace log or to the PL_IESM record, so check these together with the result code. If you are measuring Web transactions, you can also view the error information by using the measurement test command for Web transactions (jpcvtest). For details about the command, see jpcvtest (executes measurement tests) in 11. Commands and 13.2.1 Measuring Web transactions.

This subsection describes cases where measurements often fail, organized by the value of the result code.
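The result codes covered in this subsection can be collected in a small lookup table. The following is a minimal Python sketch for illustration only; the helper name describe_result_code is hypothetical and is not part of PFM - Agent for Service Response, and only the codes named in this subsection are listed.

```python
# Result codes named in this subsection. Other values may exist,
# but they are not described here.
RESULT_CODES = {
    0: "SUCCESS",
    1: "NETWORK ERROR",
    2: "TIME OUT",
    3: "SERVICE ERROR",
    6: "IE SCENARIO REPLAY ERROR",
    7: "IE SCENARIO NAVIGATE ERROR",
}


def describe_result_code(code: int) -> str:
    """Return a short label for a result code (hypothetical helper)."""
    name = RESULT_CODES.get(code)
    if name is None:
        return f"{code} (not covered in this subsection)"
    # Any value other than 0 (SUCCESS) means that the measurement failed.
    status = "measurement succeeded" if code == 0 else "measurement failed"
    return f"{code} ({name}): {status}"
```

For example, describe_result_code(2) returns "2 (TIME OUT): measurement failed", which points to the TIME OUT cases described later in this subsection.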

Tip

The cases described here can also apply when scenarios cannot be replayed in the IE Recorder, so this subsection is worth reading in that situation as well.

(1) Value of the result code is 1 (NETWORK ERROR)

The measurement failed because of a communication error caused by a failure on the network between the monitored server and the PFM - Agent for Service Response host, on the monitored server itself, or elsewhere. The following table lists possible causes of the error.

Table 13‒7: Possible causes when the value of the result code is 1 (NETWORK ERROR)

Cause: The monitored server or service is not running.

What you check: Check whether the monitored server or service is running.

Action: Start the monitored server or service.

Cause: The name resolution via DNS failed.

What you check: Check whether the name of the monitored host or proxy server host can be resolved (by trying the nslookup command or viewing the hosts file).

Action: Set up your DNS system or the hosts file so that the name of the monitored host or proxy server host can be resolved.

Cause: TCP connection failed.

What you check: Check whether the measurement condition registration file (esptask.xml) has the correct values for the following settings:

  • Host name

  • Port number

  • Host name of the proxy server

Action: Correct the settings in the measurement condition registration file (esptask.xml).

What you check: Check whether a firewall located between the monitored server and the PFM - Agent for Service Response host blocks communication.

Action: Configure the firewall to allow communication.

What you check: Check how the monitored server (including relay devices) communicates with the PFM - Agent for Service Response host, in terms of:

  • Whether they can reach each other (through the ping command or other means)

  • Which path is used for communication (through the tracert command or other means)

  • Whether a connection to the monitored service can be established (through the telnet command or other means)

Action: Based on the items under What you check, check how the monitored server communicates with the PFM - Agent for Service Response host, and address problems, if any.

Cause: It takes a long time to establish the TCP connection, so the connection fails.

What you check: Check the load on the monitored server (including relay devices) and the PFM - Agent for Service Response host.

Action: Review operations so that the monitored server (including relay devices) and the PFM - Agent for Service Response host are not overloaded.

In most cases, network failures occur when PFM - Agent for Service Response starts monitoring new targets or when a communication path changes in the monitored environment (for example, when a proxy setting is changed or a load balancer is added). Hitachi therefore recommends that you verify in advance that PFM - Agent for Service Response can communicate with the monitored servers.
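The name resolution and TCP connection checks in Table 13‒7 can be approximated with a short script run on the PFM - Agent for Service Response host. The following is a minimal Python sketch that uses only the standard socket module; it is not a product tool, the function names are hypothetical, and the host name and port number you pass in are assumed to be the ones specified in esptask.xml.

```python
import socket


def check_name_resolution(host: str) -> bool:
    """Return True if the host name can be resolved (similar in spirit to nslookup)."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def check_tcp_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection is established within the timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

If check_name_resolution fails, review the DNS or hosts file settings first. If the name resolves but check_tcp_connect fails, the port number, a firewall, or the monitored service itself are the more likely causes.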

(2) Value of the result code is 2 (TIME OUT)

PFM - Agent for Service Response failed to perform a measurement because it could not finish processing within the timeout period defined in the measurement condition registration file (esptask.xml).

Table 13‒8: Possible causes when the value of the result code is 2 (TIME OUT)

Cause: The monitored server responded, but the specified timeout period was exceeded.

What you check:

  • Check the load on the monitored server or network.

  • Check the log for the monitored service (such as the Web server log) to see if communication is normal.

  • Check the timeout period defined in the measurement condition registration file (esptask.xml) to see if the file has a valid value.

Action: Based on the items under What you check, check the monitored service and whether communication is normal, and address problems, if any.

Cause: Unmeasurable Web pages are monitored (IE scenario).

What you check:

  • Check whether the software is monitoring Web pages that open dialog boxes which cannot be monitored.#1

  • Check whether the requested Web page is automatically forwarded to another page through JavaScript without issuing a completion notification.#2

Action: This problem cannot be avoided when measuring IE scenarios, so consider measuring such Web pages by using Web transactions. For details about whether these can be measured in Web transactions, read the following notes in advance to determine if they apply:

#1

The IE scenario cannot be used to monitor pages that open dialog boxes such as the following samples (for example, a security dialog box).

When the IE probe is used for measurement, it cannot control the dialog box and enters the waiting state, resulting in a timeout.

If you try to replay the IE scenario in the IE Recorder, the replay does not complete: no timeout occurs, and the recorder waits indefinitely for an operation on the dialog box.

Figure 13‒1: Dialog boxes that cannot be monitored in IE scenarios

[Figure]

[Figure]

[Figure]

#2

PFM - Agent for Service Response requests a Web page and replays the next operation when it receives a completion notification from the Web server. However, if it does not receive the completion notification for the Web page it requested, and the page is automatically forwarded to another Web page through JavaScript, it cannot perform the next operation and enters the waiting state.

If the IE probe is used for measurement, a timeout occurs.

If you try to replay the operation in the IE Recorder, the replay does not complete: no timeout occurs, and the recorder waits indefinitely for the operation on the dialog box.

Note that some Web pages display a waiting window indicating a message such as Waiting... only when it takes time to show the window. Depending on how long it takes to show the window, the IE Recorder may be able to replay the scenario successfully, but a timeout may occur during a measurement using the IE probe.

This applies to the following sample script:

<html>
 <body>
  <form name="form1" action="next.html"> 
  <script> document.form1.submit(); </script>
  </form>
 </body>
</html>
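Pages like the sample above can be spotted ahead of time by scanning the page source for inline scripts that submit a form without user action. The following is a minimal Python sketch built on the standard html.parser module; it is a rough heuristic and not a product feature, the names AutoSubmitDetector and page_auto_submits are hypothetical, and it only looks for the literal text .submit() inside script elements, so it can miss other forwarding techniques or flag scripts that submit only on user action.

```python
from html.parser import HTMLParser


class AutoSubmitDetector(HTMLParser):
    """Flags inline scripts that contain a .submit() call (heuristic only)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_script = False
        self.auto_submit_found = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies are delivered as plain data by HTMLParser.
        if self._in_script and ".submit()" in data:
            self.auto_submit_found = True


def page_auto_submits(html: str) -> bool:
    """Return True if the page source contains an inline .submit() call."""
    detector = AutoSubmitDetector()
    detector.feed(html)
    detector.close()
    return detector.auto_submit_found
```

A page flagged this way is a candidate for measurement by Web transactions rather than by an IE scenario, subject to the notes on Web transactions.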

(3) Value of the result code is 3 (SERVICE ERROR)

The measurement failed because the monitored server, although it is working properly, did not return the expected response to the request it received. For example, if the monitoring target is a Web server, a requested page or file could not be obtained (as with an HTTP 404 error). The following table lists possible causes of the error.

Table 13‒9: Possible causes when the value of the result code is 3 (SERVICE ERROR)

Cause: Authentication-related information (user name or password) is incorrect.

What you check:

When an Internet service is monitored:

  • Check whether the user names and passwords for the monitored service and the proxy server, as specified in the measurement condition registration file (esptask.xml) and by using the password utility (esppasswd command), are correct.

When a Web transaction is monitored:

  • Check whether the user names and passwords for Web authentication and the proxy server, as specified in the Web transaction file and by using the password utility (esppasswd command), are correct.

  • If more than one step is needed to access the Web server that requires Web authentication, check whether Web authentication is set up for all the steps.

Action: Based on the items under What you check, specify the correct user name and password for the monitored service.

Cause: The monitored service returns an error response.

What you check: Check the log for the monitored service (such as the Web server log) to see what the error is.

Action: Based on what you find in the log, address the error on the monitored service.

Cause: SSL authentication failed in measurements of HTTPS (including Web transactions).

Check whether an SSL/TLS communication error occurs.

What you check: Check whether the monitored server supports the encryption algorithm and SSL/TLS communication protocol that are supported by PFM - Agent for Service Response.

Action:

  • Configure the monitored server so that the software can connect to it using the supported encryption algorithm and SSL/TLS communication protocol.

  • If the monitored server does not support TLSv1.1 or TLSv1.2, configure PFM - Agent for Service Response not to use the corresponding protocol.#1

If neither approach works, PFM - Agent for Service Response cannot be used for monitoring.

Check for certificate errors (those that are checked for both the server certificate and the client certificate).

What you check: Check whether the certificates are stored in installation-folder\agtv\probe\cert.

Action: If not, store them in that folder.#2

What you check: Check whether the certificates use a hash algorithm supported by PFM - Agent for Service Response.

Action: Use a hash algorithm supported by PFM - Agent for Service Response.#2

What you check:

  • Check whether a self-signed certificate is being used.

  • Check whether an invalid certificate is being used, such as an expired one.

Action: Use the correct certificate.#2

Check for certificate errors (those that are checked for the client certificate).

What you check: Check whether the Base64-encoded file in X.509 format is stored in installation-folder\agtv\probe\cert.

Action: If not, store the Base64-encoded file in X.509 format in that folder.

What you check: Check whether the Base64-encoded file in X.509 format contains the encrypted private key.

Action: If not, store a Base64-encoded file in X.509 format that contains the private key.

What you check: Check whether the password in the client certificate matches the one specified using the password utility (esppasswd command).

Action: Check the password in the client certificate and use the password utility (esppasswd command) to reset the password.

#1

If the Web server does not support TLSv1.1 or TLSv1.2, PFM - Agent for Service Response tries to continue communicating with the Web server through a lower-level protocol, but it may not be able to reconnect to the Web server through such a protocol. Eventually, the error An SSL handshake has failed. occurs because the Web server cannot communicate with PFM - Agent for Service Response.

This problem can sometimes be avoided by setting a value for the disable_ssl_protocol key in the [General] section of the Probe action condition definition file (esp.conf). For details, see the description of disable_ssl_protocol in Table 8-3 Definition in the General section.

#2

If you want the software to continue processing even when an error related to the server certificate occurs, add the <SSL_AUTH_IGNORE> tag to the measurement condition registration file (esptask.xml) or the Web transaction file; the measurement then continues. However, the software cannot continue the measurement when an error related to the client certificate occurs.
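Some of the certificate checks in Table 13‒9 can be roughed out with a short script before you investigate further. The following is a minimal Python sketch using only the standard library; it is not a product tool, all function names are hypothetical, and the PEM marker strings are the common OpenSSL-style markers, assumed here for illustration rather than taken from the product's list of accepted formats.

```python
import ssl
import time
from typing import Optional

# Common PEM markers for private keys (an assumption for illustration;
# the formats the product accepts are described in the manual).
PRIVATE_KEY_MARKERS = (
    "-----BEGIN ENCRYPTED PRIVATE KEY-----",
    "-----BEGIN RSA PRIVATE KEY-----",
    "-----BEGIN PRIVATE KEY-----",
)


def pem_has_certificate(pem_text: str) -> bool:
    """Return True if the text contains a Base64-encoded (PEM) certificate."""
    return "-----BEGIN CERTIFICATE-----" in pem_text


def pem_has_private_key(pem_text: str) -> bool:
    """Return True if the text contains any of the private key markers."""
    return any(marker in pem_text for marker in PRIVATE_KEY_MARKERS)


def certificate_expired(not_after: str, now: Optional[float] = None) -> bool:
    """Return True if the certificate's notAfter time has passed.

    not_after uses the format that the ssl module reports,
    for example 'Jan  5 09:34:43 2018 GMT'.
    """
    expiry = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return current >= expiry
```

These helpers only inspect text and dates. They do not verify the certificate chain, the hash algorithm, or the client certificate password, which still need to be checked as described in the table.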

(4) Value of the result code is 6 (IE SCENARIO REPLAY ERROR)

The measurement failed because of an error that prevents the replay from continuing during IE scenario monitoring. For details about possible causes, see 5.2.3 Notes on IE scenarios and 5.2.4 Notes on the navigation. The following table lists common problems.

Table 13‒10: Possible causes when the value of the result code is 6 (IE SCENARIO REPLAY ERROR)

Cause: A pop-up dialog box that is not recorded in the IE scenario appeared.

What you check: If the value of ie_service_flag is set to Y in the [General] section of the Probe action condition definition file (esp.conf), check whether the measurement account that was used when the IE scenario was recorded in the IE Recorder is specified as the account for the Extensible Service IE Probe service.

Action: Specify the measurement account that was used when the IE scenario was recorded in the IE Recorder as the account for the Extensible Service IE Probe service.

What you check: Check whether all of the following conditions apply:

  • Internet Explorer Enhanced Security Configuration is enabled on the PFM - Agent for Service Response host.

  • The monitored page uses scripts and ActiveX controls.

  • Replays are successful in the IE Recorder (but measurements fail in the IE probe).

Action: See note 34 in 5.2.3 (3) Notes on recording and replaying the IE scenarios.

What you check: Check whether the monitored page shows the security dialog box of Internet Explorer saying Revocation information for the security certificate for this site is not available. Do you want to proceed?

Action: In Internet Explorer, open Internet Options, select the Advanced tab, and clear the Check for server certificate revocation check box.

Cause: A pop-up dialog box that is not recorded in the IE scenario appeared. A script error occurred.

What you check: Check whether the monitored page depends on the version of Internet Explorer.

Action: Configure either of the following settings:

What you check: Check whether the zone settings in Internet Explorer disable scripting if the monitored page uses JavaScript.

Action: Add the monitored page to the list of trusted sites.

What you check: Check whether the monitored page contains the cause of the script error.

Action: Review the script so that it will not cause the script error. You may be able to measure the page depending on the settings of Internet Explorer. For details, see notes 4 and 5 in 5.2.3 (3) Notes on recording and replaying the IE scenarios.

Cause: The target of an operation (such as a tag or button) during recording of the IE scenario was not found.

What you check: Check whether the monitored page has been modified since the IE scenario was recorded and the target of the operation has been removed from the page (if it has been removed, the operation that was recorded in the scenario cannot be continued, causing the measurement to fail).

Action: Re-create the IE scenario with the monitored page after the modification.

What you check: Check whether a description in 5.2.4 (4) How to handle the error There is no tag that will be operated although the Web page does not change applies.

Action: Follow the description in 5.2.4 (4) How to handle the error There is no tag that will be operated although the Web page does not change.

Cause: Others

What you check: Check whether a description in 5.2.3 Notes on IE scenarios applies.

Action: Follow the description in 5.2.3 Notes on IE scenarios. Depending on the note that applies, the IE scenario may not be available for measurement. In this case, consider using the Web transaction for measurement.

Note that you should read the following notes in advance to determine if the Web transaction is available for the measurement:

(5) Value of the result code is 7 (IE SCENARIO NAVIGATE ERROR)

The measurement failed because an error occurred during page transition (navigation) recorded by the IE Recorder when the IE scenario was replayed.

Table 13‒11: Possible causes when the value of the result code is 7 (IE SCENARIO NAVIGATE ERROR)

Cause: The browser function of Windows (.NET Framework) detected an error.

What you check: Check the PL_IESM record for the code 0x800C0005.

0x800C0005 (INET_E_RESOURCE_NOT_FOUND): The server or proxy was not found.

Action: If you see the code above, the target server may be unreachable. Check that the monitored server, the proxy server on the communication path, and other devices are up and running.

Cause: Others

What you check: Check whether any of the descriptions in 5.2.4 Notes on the navigation applies.

Action: Follow the description in 5.2.4 Notes on the navigation. Depending on the note that applies, the IE scenario may not be available for measurement.

(6) Result code cannot be used to determine the cause (the measurement result itself is not returned)

The following table shows a possible cause of the error if the result code cannot be used to determine it.

Table 13‒12: Possible cause when the result code cannot be used to determine the cause

Cause: The last operation recorded in an IE scenario involves displaying a dialog box (to close its own window, for example).

What you check: Check whether the last operation recorded in the IE scenario involves displaying a dialog box (to close its own window, for example).

Action: Add one more operation (such as moving to another page by entering an address) after the last operation in the IE scenario.