12.5.2 What happens and how to recover from major input errors

Organization of this subsection

(1) Prometheus server scrape specified an incorrect host or port
(2) You specified an incorrect host or port as the remote write destination of the Prometheus server
(3) You specified an incorrect host or port as the Prometheus server alert notification destination.
(4) You specified an incorrect host or port as the Alertmanager notification destination
(5) Blackbox exporter has specified an incorrect host or port to monitor
(6) Prometheus server definition file format is incorrect
(7) Incorrect format of discovery configuration file
(8) Log monitoring common definition file format is invalid
(9) Text-formatted log file monitoring definition file format is invalid
(10) Script exporter executed a script that does not exist
(11) An invalid host and port were specified for Promitor Scraper Resource Discovery

(1) Prometheus server scrape specified an incorrect host or port

Phenomenon

The scrape fails, and the up metric is collected as 0.

For data acquired through a Prometheus server whose scrape destination settings are incorrect, the latest information is not displayed in the trend information of the integrated operation viewer.

Recovery methods

Correct the scrape definition and reload or restart the Prometheus server.
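
To find which scrape targets are affected, one option is to run the instant query up == 0 against the Prometheus HTTP API (/api/v1/query) and list the job and instance labels in the result. The following is a minimal sketch that parses only the response body. The sample payload, job name, and instance name are hypothetical and are not taken from this manual.

```python
import json


def down_targets(response_body: str) -> list[tuple[str, str]]:
    """Return (job, instance) pairs from an instant-query response for `up == 0`."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    pairs = []
    for series in payload["data"]["result"]:
        labels = series.get("metric", {})
        pairs.append((labels.get("job", ""), labels.get("instance", "")))
    return pairs


# Hypothetical response body, shaped like a Prometheus instant-query result.
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "example_job", "instance": "hostA:9100"},
                "value": [1700000000.0, "0"],
            }
        ],
    },
})

print(down_targets(SAMPLE))
```

Each pair returned identifies a scrape definition whose host or port is worth checking first.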

(2) You specified an incorrect host or port as the remote write destination of the Prometheus server

Phenomenon: For data acquired through a Prometheus server whose remote write destination settings are incorrect, the latest information is not displayed in the trend information of the integrated operation viewer.
Recovery methods: Correct the remote write definition and reload or restart the Prometheus server.
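
Before reloading, it can help to confirm that the corrected host and port accept TCP connections at all. The following is a minimal sketch using only the Python standard library. The function name can_connect is hypothetical, and a successful connection shows only that something is listening on that port, not that it is the intended service.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Usage (host and port are placeholders for the values in your definition):
#   can_connect("manager-host.example", 20703)
```

The same check applies to the other destination settings in this subsection, because each of them fails in the same way when the host or port is wrong.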

(3) You specified an incorrect host or port as the Prometheus server alert notification destination.

Phenomenon: For alerts generated by a Prometheus server whose alert notification destination settings are incorrect, the latest information is not displayed in the JP1 event list of the integrated operation viewer.
Recovery methods: Correct the notification destination definition and reload or restart the Prometheus server.

(4) You specified an incorrect host or port as the Alertmanager notification destination

Phenomenon: For alerts sent from an Alertmanager whose notification destination settings are incorrect, the latest information is not displayed in the JP1 event list of the integrated operation viewer.
Recovery methods: Correct the notification destination definition and reload or restart Alertmanager.

(5) Blackbox exporter has specified an incorrect host or port to monitor

Phenomenon: Acquisition of information from the monitoring destination fails, and the probe_success metric is collected as 0.
Recovery methods: Correct the monitored host and port definitions and reload or restart the Prometheus server.

(6) Prometheus server definition file format is incorrect

Phenomenon

When the Prometheus server reload API is executed, status code 500 is returned and the following message is displayed:

failed to reload config: couldn't load configuration (--config.file="file-path"): parsing YAML file file-path: yaml: unmarshal errors:  line line-number: field test not found in type config.plain

Recovery methods

Check the file-path and line-number in the message, correct the definitions, and reload or restart the Prometheus server.

(7) Incorrect format of discovery configuration file

Phenomenon: Although the jddcreatetree and jddupdatetree commands run successfully, the IM management nodes for the information described in the discovery configuration file are not displayed in the integrated operation viewer.
Recovery methods: Check that the format of the discovery configuration file is correct, for example that colons are specified where required and that the number of half-width spaces is correct. After correcting the error, run the jddcreatetree and jddupdatetree commands again (specify configuration change mode (-c option)).
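
Whitespace mistakes of this kind are hard to see in an editor. The following sketch flags two common ones: tab characters in the indentation and full-width spaces (U+3000) anywhere on a line. It assumes the discovery configuration file is an indentation-sensitive text file in which only half-width spaces are valid. It is a narrow check with a hypothetical function name, and it does not validate the file format as a whole.

```python
def find_whitespace_problems(text: str) -> list[tuple[int, str]]:
    """Return (line number, problem) pairs for suspicious whitespace."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading run of spaces, tabs, and full-width spaces.
        indent = line[: len(line) - len(line.lstrip(" \t\u3000"))]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
        if "\u3000" in line:
            problems.append((number, "full-width space"))
    return problems


# Hypothetical file content: line 2 is indented with a tab,
# line 3 has a full-width space after the colon.
SAMPLE_TEXT = "a:\n\tb: 1\n  c:\u30002\n"
print(find_whitespace_problems(SAMPLE_TEXT))
```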

(8) Log monitoring common definition file format is invalid

Description

If the path parameter in the <buffer> directive of the log monitoring common definition file (jpc_fluentd_common.conf) specifies a path name longer than 256 bytes, or a path that contains :, ,, ;, *, ?, ", <, >, |, a tab, or a space, fluentd repeatedly logs the following error:

unexpected error error_class = Errno::ENOENT error="No such file or directory @ dir_s_mkdir - directory-name "

Corrective action

Verify that the path parameter in the <buffer> directive of the log monitoring common definition file (jpc_fluentd_common.conf) has the correct format and value.
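
The two conditions described above can be checked before the file is deployed. The following sketch applies them literally to a path string. It assumes the byte length is measured in UTF-8, and it does not special-case a drive-letter colon because this subsection does not state whether one is exempt. The function name is hypothetical.

```python
FORBIDDEN_CHARS = set(':,;*?"<>|\t ')
MAX_PATH_BYTES = 256


def buffer_path_problems(path: str, encoding: str = "utf-8") -> list[str]:
    """Return a list of problems found in a <buffer> path value (empty if none)."""
    problems = []
    size = len(path.encode(encoding))
    if size > MAX_PATH_BYTES:
        problems.append(f"path is {size} bytes (limit {MAX_PATH_BYTES})")
    bad = sorted({ch for ch in path if ch in FORBIDDEN_CHARS})
    if bad:
        problems.append("forbidden characters: " + " ".join(repr(ch) for ch in bad))
    return problems


# Hypothetical values for illustration.
print(buffer_path_problems("/var/log/buffer"))
print(buffer_path_problems("/var/log/my buffer"))
```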

(9) Text-formatted log file monitoring definition file format is invalid

Description

If the pos_file parameter in the [Input Settings] section of the text-formatted log file monitoring definition file (fluentd_@@trapname@@_tail.conf) is blank, or is a string that exceeds the OS upper limit for file names, fluentd repeatedly logs the following error:

unexpected error error_class = Errno::ENOENT error="No such file or directory @ rb_sysopen - directory-name "

Corrective action

Verify that the pos_file parameter in the [Input Settings] section of the text-formatted log file monitoring definition file (fluentd_@@trapname@@_tail.conf) has the correct format and value.
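
A pos_file value can be screened for both conditions in the same way. In the following sketch, the 255-byte limit for a file name is an assumption that matches many file systems but is not stated in this manual, so it is exposed as a parameter. The function name is hypothetical.

```python
import re


def pos_file_problems(pos_file: str, max_name_bytes: int = 255) -> list[str]:
    """Return a list of problems found in a pos_file value (empty if none)."""
    value = pos_file.strip()
    if not value:
        return ["pos_file is blank"]
    # Take the last path component, accepting either / or \ as the separator.
    name = re.split(r"[\\/]", value)[-1]
    size = len(name.encode("utf-8"))
    if size > max_name_bytes:
        return [f"file name is {size} bytes (assumed limit {max_name_bytes})"]
    return []


# Hypothetical values for illustration.
print(pos_file_problems("   "))
print(pos_file_problems("/var/pos/app.pos"))
```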

(10) Script exporter executed a script that does not exist

Result: The script_success metric is collected as 0. In addition, the script_exit_code metric is collected as -1.
Recovery method: Correct the definition, and then reload or restart Script exporter.
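
To catch this before Script exporter runs, the script paths in the definition can be checked for existence. The following sketch takes a list of paths that you have already collected from the definition. It does not read the Script exporter definition file itself, and the function name is hypothetical.

```python
import os


def missing_scripts(paths: list[str]) -> list[str]:
    """Return the paths that do not point to an existing regular file."""
    return [path for path in paths if not os.path.isfile(path)]


# Usage (paths are placeholders for the scripts named in your definition):
#   missing_scripts(["/opt/scripts/check_disk.sh", "/opt/scripts/check_db.sh"])
```

An empty list means every script was found. It does not confirm that the scripts are executable by the account that runs Script exporter.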

(11) An invalid host and port were specified for Promitor Scraper Resource Discovery

Result: No metrics are acquired from Resource Discovery.
Recovery method: Correct the definition, and then reload or restart the Promitor Scraper.
