3.15.1 Performance monitoring function by JP1/IM - Agent
The performance monitoring function consists of the add-on programs Prometheus, Alertmanager, and the Exporters, and provides the following two functions:
- Function to collect performance data through the Exporters and send it to the Integrated Manager host
- Function to monitor thresholds of the collected performance data and, when a condition is met, alert JP1/IM - Manager
Performance data and alerts sent to the Integrated Manager host can be viewed in the integrated operation viewer.
(1) Performance data collection function
The performance data collection function collects performance data from monitoring targets through the Prometheus server. It consists of the following two functions:
- Scrape function (Prometheus server)
- Function to acquire operational information of monitoring targets (Exporter)
(a) Scrape function
The scrape function is the Prometheus server function that acquires the performance data to be monitored via an Exporter.
When the Prometheus server accesses a specific URL of the Exporter, the Exporter retrieves the monitored performance data and returns it to the Prometheus server. This process is called a scrape.
Scrapes are executed in units of scrape jobs, each of which groups multiple scrapes performed for the same purpose.
When a discovery configuration file is used for UAP monitoring, the corresponding scrape jobs must be defined. Additional settings are also required in the scrape definitions for the log metrics feature.
For details on the scrape definitions for the log metrics feature, see 1.21.2(10) Setting up scraping definitions in the JP1/Integrated Management 3 - Manager Configuration Guide.
In JP1/IM - Agent, the following scrape definitions are set by default, one per scrape job name, according to the type of Exporter.
| Scrape Job Name | Scrape Definition |
|---|---|
| jpc_node | Scrape definition for Node exporter |
| jpc_windows | Scrape definition for Windows exporter |
| jpc_blackbox_http | Scrape definition for HTTP/HTTPS monitoring in Blackbox exporter |
| jpc_blackbox_icmp | Scrape definition for ICMP monitoring in Blackbox exporter |
| jpc_cloudwatch | Scrape definition for Yet another cloudwatch exporter |
| jpc_process | Scrape definition for Process exporter |
| jpc_promitor | Scrape definition for Promitor |
| jpc_script | Scrape definition for Script exporter |
| jpc_oracledb | Scrape definition for OracleDB exporter |
| jpc_node_aix | Scrape definition for Node exporter for AIX |
| jpc_web_probe | Scrape definition for Web exporter |
| jpc_vmware | Scrape definition for VMware exporter |
| jpc_hyperv | Scrape definition for Windows exporter (Hyper-V monitoring) |
| jpc_sql | Scrape definition for SQL exporter |
To scrape a user-defined Exporter, you must add a scrape definition for each target Exporter, as in the sketch below.
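The following is a minimal sketch of such an addition to jpc_prometheus_server.yml, using standard Prometheus scrape_configs syntax. The job name user_custom_exporter, the host name agenthost, and the port 9999 are illustrative assumptions, not values defined by the product:

```yaml
scrape_configs:
  # Hypothetical scrape job for a user-defined Exporter.
  - job_name: user_custom_exporter
    static_configs:
      - targets:
          - agenthost:9999   # host:port on which the user-defined Exporter listens
```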
The metrics that the Prometheus server obtains from an Exporter by scraping depend on the type of Exporter. For details, see the description of the metric definition file for each Exporter in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
In addition to the metrics obtained from the Exporter, the Prometheus server generates the following metrics each time a scrape is performed.
| Metric Name | Description |
|---|---|
| up | A metric that indicates "1" for a successful scrape and "0" for a failure. It can be used to monitor the operation of the Exporter. Scrape failures can be caused by the host being stopped, the Exporter being stopped, the Exporter returning a status other than 200, or a communication error. |
| scrape_duration_seconds | A metric that indicates how long the scrape took. It is not used in normal operation. It is used for investigations when a scrape does not finish within the expected time. |
| scrape_samples_post_metric_relabeling | A metric that indicates the number of samples remaining after metric relabeling. It is not used in normal operation. It is used to check the amount of data when building the environment. |
| scrape_samples_scraped | A metric that indicates the number of samples returned by the scraped Exporter. It is not used in normal operation. It is used to check the amount of data when building the environment. |
| scrape_series_added | A metric that shows the approximate number of newly created series. It is not used in normal operation. |
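Because up becomes 0 when a scrape fails, it can drive an alert that detects a stopped Exporter. The following is a minimal sketch in standard Prometheus alerting-rule syntax; the group name, alert name, and duration are illustrative assumptions, and the file in which JP1/IM - Agent stores its alerting rules is described in the reference manuals:

```yaml
groups:
  - name: exporter-availability        # hypothetical rule group
    rules:
      - alert: ExporterDown            # hypothetical alert name
        # up == 0: host stopped, Exporter stopped, non-200 response, or communication error.
        expr: up == 0
        for: 5m                        # require 5 minutes of consecutive failures before alerting
```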
For details about how scrapes are performed, see 5.24 API for scrape of Exporter used by JP1/IM - Agent in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference. An Exporter that you want to scrape must behave as described there.
The scrape definition method is as follows:
- Scrape definitions are written in units of scrape jobs.
- Scrape definitions are described in the Prometheus configuration file (jpc_prometheus_server.yml).
- To edit a scrape definition, download the Prometheus configuration file from the integrated operation viewer, edit it, and then upload it.
The following are the settings related to scrape definitions supported by JP1/IM - Agent.
| Setting Item | Description |
|---|---|
| Scrape job name (required) | Sets the name of the scrape job that Prometheus scrapes. You can specify multiple scrape job names. The specified scrape job name is set on each metric as the label job="scrape-job-name". |
| Scrape destination (required) | Sets the specific URL of the Exporter to be scraped. Only Exporters on hosts where JP1/IM - Agent resides can be specified as scrape destinations. The server in the URL must be specified by host name; "localhost" cannot be used. The total number of scrape destinations specified across all scrape jobs is limited to 100. |
| Scrape parameters (optional) | Sets parameters to pass to the Exporter when scraping. The contents that can be set differ depending on the type of Exporter. |
| Scrape interval (optional) | Sets the scrape interval. You can set a scrape interval common to all scrape jobs and a scrape interval for each scrape job. If both are set, the per-job scrape interval takes precedence. The following units can be specified: years, weeks, days, hours, minutes, seconds, or milliseconds. |
| Scrape timeout (optional) | Sets a timeout period for scrapes that take a long time. You can set a timeout period common to all scrape jobs and a timeout period for each scrape job. If both are set, the per-job timeout period takes precedence. |
| Relabeling (optional) | Deletes unnecessary metrics and customizes labels. By using this feature to drop metrics that you do not need, you can reduce the amount of data sent to JP1/IM - Manager. |
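Putting these settings together, a scrape job entry in jpc_prometheus_server.yml could look like the following sketch in standard Prometheus syntax. The job name, target, parameter, and regular expression are illustrative assumptions:

```yaml
scrape_configs:
  - job_name: jpc_example            # appears on every metric as job="jpc_example"
    scrape_interval: 60s             # per-job interval; overrides the common interval
    scrape_timeout: 10s              # per-job timeout; overrides the common timeout
    params:
      module: [http_2xx]             # hypothetical scrape parameter passed to the Exporter
    static_configs:
      - targets:
          - agenthost:9101           # scrape destination; a host name, not "localhost"
    metric_relabel_configs:
      - source_labels: [__name__]    # relabeling: drop metrics that are not needed
        regex: unneeded_metric_.*
        action: drop
```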
An Exporter scraped by the Prometheus server returns its results in the Prometheus text-based format. The Prometheus text-based format is described below.
- Text-based format basics

| Item | Description |
|---|---|
| Start time | April 2014 |
| Supported versions | Prometheus version 0.4.0 or later |
| Transmission format | HTTP |
| Character encoding | UTF-8. The line feed code is \n. |
| Content-Type | text/plain; version=0.0.4. If the version value is missing, it is treated as the latest text format version. |
| Content-Encoding | gzip |
| Advantages | Human-readable. Easy to assemble, especially for minimal cases (no nesting required). Readable line by line (except for type hints and docstrings). |
| Limitations | Verbose. Because types and docstrings are not part of the syntax, there is little validation of the metric contract. Parsing cost. |
| Supported metric types | Counter, Gauge, Histogram, Summary, Untyped |
- More information about Text-based format
  The Prometheus text-based format is line-oriented. Lines are separated by a line feed character (\n); \r\n is considered invalid. The last line must end with a line feed character. Blank lines are ignored.
- Row format
  Within a line, tokens can be separated by any number of blanks or tabs; a token that would otherwise merge with the previous token must be separated from it by at least one blank. Leading and trailing whitespace is ignored.
- Comments, help text, and type information
  A line whose first non-whitespace character is # is a comment. Such a line is ignored unless the first token after # is HELP or TYPE, in which case it is treated as follows:
  If the token is HELP, at least one more token is expected, which is the metric name. All remaining tokens are considered the docstring for that metric name. HELP lines can contain any sequence of UTF-8 characters after the metric name, but the backslash and the line feed character must be escaped as \\ and \n, respectively. Only one HELP line can exist for any given metric name.
  If the token is TYPE, exactly two more tokens are expected. The first is the metric name, and the second (counter, gauge, histogram, summary, or untyped) defines the type of the metric. Only one TYPE line can exist for a given metric name, and it must appear before the first sample for that metric name. If no TYPE line exists for a metric name, the type is set to untyped.
  Each sample is written on its own line using the following EBNF syntax:
  metric_name [ "{" label_name "=" `"` label_value `"` { "," label_name "=" `"` label_value `"` } [ "," ] "}" ] value [ timestamp ]
- Sample syntax
  - metric_name and label_name are subject to the usual restrictions of the Prometheus expression language.
  - label_value can be any sequence of UTF-8 characters, but the backslash (\), double quote ("), and line feed characters must be escaped as \\, \", and \n, respectively.
  - value is a floating-point number as parsed by the Go ParseFloat() function. In addition to standard numeric values, NaN, +Inf, and -Inf are valid values, representing not a number, positive infinity, and negative infinity, respectively.
  - timestamp is an int64 (milliseconds since the epoch, that is, 1970-01-01 00:00:00 UTC, excluding leap seconds), as parsed by the Go ParseInt() function, and is optional.
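  For example, the line `http_requests_total{method="post",code="200"} 1027 1395066363000` (taken from the sample exposition at the end of this subsection) matches this syntax: a metric name, two labels, a value, and an explicit timestamp.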
- Grouping and sorting
  All lines for a given metric must be provided as one uninterrupted group, with the optional HELP and TYPE lines first (in any order). Beyond that, reproducible sorting across repeated expositions is recommended but not required. Each line must have a unique combination of metric name and labels; otherwise, the ingestion behavior is undefined.
- Histograms and summaries
  Because the histogram and summary types are difficult to represent in the text format, the following conventions apply:
  - The sample sum for a summary or histogram named x is given as a separate sample named x_sum.
  - The sample count for a summary or histogram named x is given as a separate sample named x_count.
  - Each quantile of a summary named x is given as a separate sample line with the same name x and a label {quantile="y"}.
  - Each bucket count of a histogram named x is given as a separate sample line named x_bucket with a label {le="y"}, where y is the upper bound of the bucket.
  - A histogram must have a bucket with {le="+Inf"}. Its value must be identical to the value of x_count.
  - The buckets of a histogram and the quantiles of a summary must appear in ascending order of the value of their le or quantile label.
- Sample Text-based format
  The following is a sample Prometheus metric exposition that contains comments, HELP and TYPE lines, a histogram, a summary, and character escaping.
```
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"} 3 1395066363000

# Escaping in label values:
msdos_file_access_time_seconds{path="C:\\DIR\\FILE.TXT",error="Cannot find file:\n\"FILE.TXT\""} 1.458255915e9

# Minimalistic line:
metric_without_timestamp_and_labels 12.47

# A weird metric from before the epoch:
something_weird{problem="division by zero"} +Inf -3982045

# A histogram, which has a pretty complex representation in the text format:
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
http_request_duration_seconds_bucket{le="0.2"} 100392
http_request_duration_seconds_bucket{le="0.5"} 129389
http_request_duration_seconds_bucket{le="1"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320

# Finally a summary, which has a complex representation, too:
# HELP rpc_duration_seconds A summary of the RPC duration in seconds.
# TYPE rpc_duration_seconds summary
rpc_duration_seconds{quantile="0.01"} 3102
rpc_duration_seconds{quantile="0.05"} 3272
rpc_duration_seconds{quantile="0.5"} 4773
rpc_duration_seconds{quantile="0.9"} 9001
rpc_duration_seconds{quantile="0.99"} 76656
rpc_duration_seconds_sum 1.7560473e+07
rpc_duration_seconds_count 2693
```
(b) Ability to obtain monitored operational information
This function acquires operational information (performance data) from monitoring targets. The collection of operational information is performed by a program called an "Exporter".
In response to a scrape request from the Prometheus server, the Exporter collects operational information from the monitoring target and returns the results to Prometheus.
The Exporters shipped with JP1/IM - Agent accept scrapes only from the Prometheus server of the JP1/IM - Agent on the same host. Do not scrape them from a Prometheus server on another host or one provided by the user.
This section describes the functions of each exporter included with JP1/IM - Agent.
(c) Windows exporter (Windows performance data collection capability)
Windows exporter is an exporter that is embedded in a monitored Windows host to obtain operational information of that host.
Windows exporter is installed on the same host as the Prometheus server, and upon a scrape request from the Prometheus server, it collects operational information from the Windows OS of the host and returns it to the Prometheus server.
It is possible to collect operational information related to memory and disk, which cannot be collected by monitoring from outside the host (external monitoring by URL or CloudWatch), from inside the host.
In addition, with JP1/IM - Manager and JP1/IM - Agent version 13-01 or later, you can monitor the operational status of services on the integrated agent host (Windows), that is, programs registered as Windows services (service monitoring function#).
Note that you cannot use the service monitoring function when JP1/IM - Agent runs inside a container.
- #
  If you use the service monitoring function in an environment upgraded from version 13-00 to 13-01 or later, you need to configure the settings for service monitoring. The following are the JP1/IM - Manager and JP1/IM - Agent setup instructions:
- Where to find instructions for setting up JP1/IM - Manager
  See Editing category name definition file for IM management nodes (imdd_category_name.conf) (optional) in 1.19.3(1)(d) Settings of product plugin (for Windows) in the JP1/Integrated Management 3 - Manager Configuration Guide.
- Where to find instructions for setting up JP1/IM - Agent
  See the instructions for configuring service monitoring in 1.21.2(3)(f) Configuring service monitoring (for Windows) (optional) and 1.21.2(5)(b) Modify metric to Collect (optional) in the JP1/Integrated Management 3 - Manager Configuration Guide.
This feature creates an IM management node for each service that you want to monitor. For details on displaying the tree, see 3.15.6(1)(i) Tree Format. If you configure an alert, a JP1 event is issued when a service stops and is registered with the IM management node corresponding to the stopped service. You can check the past operational status of a service from the service trend display.
■ Main items to be acquired
The main retrieval items of Windows exporter are defined in Windows exporter metric definition file (default) and Windows exporter (service monitoring) metric definition file (default). For details, see Windows exporter metric definition file (metrics_windows_exporter.conf) in Chapter 2. Definition Files and Windows exporter (service monitoring) metric definition file (metrics_windows_exporter_service.conf) in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add retrieved items to the metric definition file. The following are the metrics that can be specified in the PromQL statement described in the definition file. For details of "Collector" in the table, refer to the description of "Collector" at the bottom of the table.
| Metric Name | Collector | What to Get | Label |
|---|---|---|---|
| windows_cache_copy_read_hits_total | cache | Number of copy read requests that hit the cache (cumulative) | instance: instance-identification-string, job: job-name |
| windows_cache_copy_reads_total | cache | Number of reads from the file system cache page (cumulative) | instance: instance-identification-string, job: job-name |
| windows_cpu_time_total | cpu | Number of seconds of processor time spent per mode (cumulative) | instance: instance-identification-string, job: job-name, core: core-ID, mode: mode# |
| windows_cs_physical_memory_bytes | cs | Number of bytes of physical memory capacity | instance: instance-identification-string, job: job-name |
| windows_logical_disk_idle_seconds_total | logical_disk | Number of seconds that the disk was idle (cumulative) | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_free_bytes | logical_disk | Number of bytes of unused disk space | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_read_bytes_total | logical_disk | Number of bytes transferred from disk during read operations (cumulative) | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_read_seconds_total | logical_disk | Number of seconds that the disk was busy with read operations (cumulative) | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_reads_total | logical_disk | Number of read operations on disk (cumulative) | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_requests_queued | logical_disk | Number of requests queued on the disk | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_size_bytes | logical_disk | Number of bytes of disk capacity | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_write_bytes_total | logical_disk | Number of bytes transferred to disk during write operations (cumulative) | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_write_seconds_total | logical_disk | Number of seconds that the disk was busy with write operations (cumulative) | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_logical_disk_writes_total | logical_disk | Number of write operations on disk (cumulative) | instance: instance-identification-string, job: job-name, volume: volume-name |
| windows_memory_available_bytes | memory | Number of bytes of unused physical memory | instance: instance-identification-string, job: job-name |
| windows_memory_cache_bytes | memory | Number of bytes of physical memory used for file system caching | instance: instance-identification-string, job: job-name |
| windows_memory_cache_faults_total | memory | Number of page faults in the file system cache (cumulative) | instance: instance-identification-string, job: job-name |
| windows_memory_page_faults_total | memory | Number of page faults (cumulative) | instance: instance-identification-string, job: job-name |
| windows_memory_pool_nonpaged_allocs_total | memory | Number of times a nonpageable physical memory region was allocated | instance: instance-identification-string, job: job-name |
| windows_memory_pool_paged_allocs_total | memory | Number of times a pageable physical memory region was allocated | instance: instance-identification-string, job: job-name |
| windows_memory_swap_page_operations_total | memory | Number of pages read from or written to disk to resolve hard page faults (cumulative) | instance: instance-identification-string, job: job-name |
| windows_memory_swap_pages_read_total | memory | Number of pages read from disk to resolve hard page faults (cumulative) | instance: instance-identification-string, job: job-name |
| windows_memory_swap_pages_written_total | memory | Number of pages written to disk to resolve hard page faults (cumulative) | instance: instance-identification-string, job: job-name |
| windows_memory_system_cache_resident_bytes | memory | Number of bytes of the system file cache active in physical memory | instance: instance-identification-string, job: job-name |
| windows_memory_transition_faults_total | memory | Number of page faults resolved by recovering pages that were in use by another process sharing the page, on the modified page list or standby list, or written to disk (cumulative) | instance: instance-identification-string, job: job-name |
| windows_net_bytes_received_total | net | Number of bytes received by the interface (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| windows_net_bytes_sent_total | net | Number of bytes sent from the interface (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| windows_net_bytes_total | net | Number of bytes received and sent by the interface (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| windows_net_packets_sent_total | net | Number of packets sent by the interface (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| windows_net_packets_received_total | net | Number of packets received by the interface (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| windows_system_context_switches_total | system | Number of context switches (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| windows_system_processor_queue_length | system | Number of threads in the processor queue | instance: instance-identification-string, job: job-name, device: network-device-name |
| windows_system_system_calls_total | system | Number of calls to OS service routines (cumulative) | instance: instance-identification-string, job: job-name |
| windows_process_start_time | process | Time of process start | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID |
| windows_process_cpu_time_total | process | Elapsed time that all threads of this process used the processor to execute instructions, by mode (privileged, user). An instruction is the basic unit of execution in a computer, a thread is the object that executes instructions, and a process is the object created when a program is run. Code executed to handle some hardware interrupts and trap conditions is included in this count. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID, mode: mode (privileged or user) |
| windows_process_io_bytes_total | process | Number of bytes issued in I/O operations, by mode (read, write, other). This counts all I/O activity generated by the process, including file, network, and device I/O. Read and write mode includes data operations; other mode includes operations that do not involve data, such as control operations. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID, mode: mode (read, write, or other) |
| windows_process_io_operations_total | process | Number of I/O operations issued, by mode (read, write, other). This counts all I/O activity generated by the process, including file, network, and device I/O. Read and write mode includes data operations; other mode includes operations that do not involve data, such as control operations. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID, mode: mode (read, write, or other) |
| windows_process_page_faults_total | process | Number of page faults caused by threads executing in this process. A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. A page fault does not cause the page to be fetched from disk if that page is on the standby list (and hence already in main memory) or in use by another process with which the page is shared. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID |
| windows_process_page_file_bytes | process | Current number of bytes this process has used in the paging file(s). Paging files are used to store pages of memory used by the process that are not contained in other files. Paging files are shared by all processes, and lack of space in paging files can prevent other processes from allocating memory. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID |
| windows_process_pool_bytes | process | Last observed number of bytes in the paged or nonpaged pool. The nonpaged pool is an area of system memory (physical memory used by the operating system) for objects that cannot be written to disk but must remain in physical memory as long as they are allocated. The paged pool is an area of system memory for objects that can be written to disk when they are not being used. Nonpaged pool bytes are calculated differently than paged pool bytes, so the value might not equal the total of paged pool bytes. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID, pool: paged or nonpaged |
| windows_process_priority_base | process | Current base priority of this process. Threads within a process can raise and lower their own base priority relative to the base priority of the process. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID |
| windows_process_private_bytes | process | Current number of bytes this process has allocated that cannot be shared with other processes. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID |
| windows_process_virtual_bytes | process | Current size, in bytes, of the virtual address space that the process is using. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. Virtual space is finite, and by using too much, the process can limit its ability to load libraries. | instance: instance-identification-string, job: job-name, process: process-name#, process_id: process-ID, creating_process_id: creator-process-ID |
| windows_service_state | service | State of the service (State) | instance: instance-identification-string, job: job-name, name: service-name#1, state: service-status#2 |
- #
  The process-name is set with ".exe" omitted.
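As an illustration of the kind of PromQL that can be written against these metrics, the following sketch computes CPU usage per instance from windows_cpu_time_total. The expression is shown inside a Prometheus-style recording-rule file; the group and record names are hypothetical, and the product's own expressions are those in the metric definition files:

```yaml
groups:
  - name: example-windows-cpu                      # hypothetical rule group
    rules:
      - record: instance:windows_cpu_busy:percent  # hypothetical recording rule
        # CPU usage (%): 1 minus the average idle fraction across cores.
        expr: 100 * (1 - avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[2m])))
```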
■ Collector
Windows exporter has a built-in collection process called a "collector" for each monitored resource such as CPU and memory.
If you want to add the metrics listed in the table above as acquisition fields, you must enable the collector corresponding to the metric you want to use. You can also disable collectors of metrics that you do not want to collect to suppress unnecessary collection.
Whether each collector is enabled or disabled is specified with the "--collectors.enabled" option on the Windows exporter command line or with the "collectors.enabled" entry in the Windows exporter configuration file (jpc_windows_exporter.yml).
For details about Windows exporter command-line options, see the description of windows_exporter command options in Service definition file (jpc_program-name.service.xml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
For details about Windows exporter configuration file entry "collectors.enabled", see the description of item collectors in Windows exporter configuration file (jpc_windows_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Specifying Monitored Services
When using the service monitoring function of Windows exporter, specify the services to be monitored in the "services-where" entry of the Windows exporter configuration file (jpc_windows_exporter.yml).
For details about Windows exporter configuration file entry "services-where", see the entry "services-where" in Windows exporter configuration file (jpc_windows_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
The value of the name label of the metrics output by the service collector of Windows exporter is set to the service name. Half-width uppercase characters in the monitored service name are converted to half-width lowercase characters, and full-width uppercase characters are converted to full-width lowercase characters.
- About monitoring JP1/IM - Agent services
  For the service names of the JP1/IM - Agent services, see 10.1 Service of JP1/IM - Agent in the JP1/Integrated Management 3 - Manager Administration Guide. For the service names in a logical host environment, see 7.3.6 Newly installing JP1/IM - Agent with integrated agent host (for Windows) in the JP1/Integrated Management 3 - Manager Configuration Guide.
Note that you cannot use the service monitoring function to monitor Prometheus server and Windows exporter services.
(d) Node exporter (Linux performance data collection capability)
Node exporter is an exporter that is embedded in a monitored Linux host to obtain operational information of that host.
The Node exporter is installed on the same host as the Prometheus server, and upon a scrape request from the Prometheus server, it collects operational information from the Linux OS of the host and returns it to the Prometheus server.
It is possible to collect operational information related to memory and disk, which cannot be collected by monitoring from outside the host (external monitoring by URL or CloudWatch), from inside the host.
In addition, with JP1/IM - Manager and JP1/IM - Agent version 13-01 or later, you can monitor the operational status of services on the integrated agent host (Linux), that is, programs registered in systemd (service monitoring function#).
Note that you cannot use the service monitoring function when JP1/IM - Agent runs inside a container.
- #
  If you use the service monitoring function in an environment upgraded from version 13-00 to 13-01 or later, you need to configure the settings for service monitoring. The following are the JP1/IM - Manager and JP1/IM - Agent setup instructions:
- Where to find instructions for setting up JP1/IM - Manager
  See Editing category name definition file for IM management nodes (imdd_category_name.conf) (optional) in 1.19.3(1)(d) Settings of product plugin (for Windows) in the JP1/Integrated Management 3 - Manager Configuration Guide.
- Where to find instructions for setting up JP1/IM - Agent
  See the instructions for configuring service monitoring in 2.19.2(3)(f) Configuring service monitor settings (for Linux) (optional) and 2.19.2(5)(b) Change metric to collect (optional) in the JP1/Integrated Management 3 - Manager Configuration Guide.
This feature creates an IM management node for each service that you want to monitor. For details on displaying the tree, see 3.15.6(1)(i) Tree Format. If you configure an alert, a JP1 event is issued when a service stops and is registered with the IM management node corresponding to the stopped service. You can check the past operational status of a service from the service trend display.
■ Main items to be acquired
The main retrieval items of Node exporter are defined in the Node exporter metric definition file (default) and the Node exporter (service monitoring) metric definition file (default). For details, see Node exporter metric definition file (metrics_node_exporter.conf) and Node exporter (service monitoring) metric definition file (metrics_node_exporter_service.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add retrieved items to the metric definition file. The following are the metrics that can be specified in the PromQL statement described in the definition file. For details of "Collector" in the table, refer to the description of "Collector" at the bottom of the table.
| Metric Name | Collector | What to Get | Label |
|---|---|---|---|
| node_boot_time_seconds | stat | Last boot time | instance: instance-identification-string, job: job-name |
| node_context_switches_total | stat | Number of context switches (cumulative) | instance: instance-identification-string, job: job-name |
| node_cpu_seconds_total | cpu | Number of seconds of CPU time spent in each mode (cumulative) | instance: instance-identification-string, job: job-name, cpu: cpu-ID, mode: mode# |
| node_disk_io_now | diskstats | Number of disk I/Os currently in progress | instance: instance-identification-string, job: job-name, device: device-name |
| node_disk_io_time_seconds_total | diskstats | Number of seconds spent on disk I/O (cumulative) | instance: instance-identification-string, job: job-name, device: device-name |
| node_disk_read_bytes_total | diskstats | Number of bytes successfully read from disk (cumulative) | instance: instance-identification-string, job: job-name, device: device-name |
| node_disk_read_time_seconds_total | diskstats | Number of seconds taken to read from disk (cumulative) | instance: instance-identification-string, job: job-name, device: device-name |
| node_disk_reads_completed_total | diskstats | Number of successfully completed reads from disk (cumulative) | instance: instance-identification-string, job: job-name, device: device-name |
| node_disk_write_time_seconds_total | diskstats | Number of seconds taken to write to disk (cumulative) | instance: instance-identification-string, job: job-name, device: device-name |
| node_disk_writes_completed_total | diskstats | Number of successfully completed writes to disk (cumulative) | instance: instance-identification-string, job: job-name, device: device-name |
| node_disk_written_bytes_total | diskstats | Number of bytes successfully written to disk (cumulative) | instance: instance-identification-string, job: job-name, device: device-name |
| node_filesystem_avail_bytes | filesystem | Number of file system bytes available to non-root users | instance: instance-identification-string, job: job-name, fstype: file-system-type, mountpoint: mount-point |
| node_filesystem_files | filesystem | Number of file nodes in the file system | instance: instance-identification-string, job: job-name, fstype: file-system-type, mountpoint: mount-point |
| node_filesystem_files_free | filesystem | Number of free file nodes in the file system | instance: instance-identification-string, job: job-name, fstype: file-system-type, mountpoint: mount-point |
| node_filesystem_free_bytes | filesystem | Number of bytes of free file system space | instance: instance-identification-string, job: job-name, fstype: file-system-type, mountpoint: mount-point |
| node_filesystem_size_bytes | filesystem | Number of bytes of file system capacity | instance: instance-identification-string, job: job-name, fstype: file-system-type, mountpoint: mount-point |
| node_intr_total | stat | Number of interrupts handled (cumulative) | instance: instance-identification-string, job: job-name |
| node_load1 | loadavg | One-minute average of the number of jobs in the run queue | instance: instance-identification-string, job: job-name |
| node_load15 | loadavg | 15-minute average of the number of jobs in the run queue | instance: instance-identification-string, job: job-name |
| node_load5 | loadavg | 5-minute average of the number of jobs in the run queue | instance: instance-identification-string, job: job-name |
| node_memory_Active_file_bytes | meminfo | Number of bytes of recently used file cache memory | instance: instance-identification-string, job: job-name |
| node_memory_Buffers_bytes | meminfo | Number of bytes in the file buffer | instance: instance-identification-string, job: job-name |
| node_memory_Cached_bytes | meminfo | Number of bytes in the file read cache memory | instance: instance-identification-string, job: job-name |
| node_memory_Inactive_file_bytes | meminfo | Number of bytes of file cache memory not used recently | instance: instance-identification-string, job: job-name |
| node_memory_MemAvailable_bytes | meminfo | Number of bytes of memory available to start a new application without swapping | instance: instance-identification-string, job: job-name |
| node_memory_MemFree_bytes | meminfo | Number of bytes of free memory | instance: instance-identification-string, job: job-name |
| node_memory_MemTotal_bytes | meminfo | Total number of bytes of memory | instance: instance-identification-string, job: job-name |
| node_memory_SReclaimable_bytes | meminfo | Number of bytes in the slab cache that can be reclaimed | instance: instance-identification-string, job: job-name |
| node_memory_SwapFree_bytes | meminfo | Number of bytes of free swap space | instance: instance-identification-string, job: job-name |
| node_memory_SwapTotal_bytes | meminfo | Total number of bytes of swap memory | instance: instance-identification-string, job: job-name |
| node_netstat_Icmp6_InMsgs | netstat | Number of ICMPv6 messages received (cumulative) | instance: instance-identification-string, job: job-name |
| node_netstat_Icmp_InMsgs | netstat | Number of ICMPv4 messages received (cumulative) | instance: instance-identification-string, job: job-name |
| node_netstat_Icmp6_OutMsgs | netstat | Number of ICMPv6 messages sent (cumulative) | instance: instance-identification-string, job: job-name |
| node_netstat_Icmp_OutMsgs | netstat | Number of ICMPv4 messages sent (cumulative) | instance: instance-identification-string, job: job-name |
| node_netstat_Tcp_InSegs | netstat | Number of TCP packets received (cumulative) | instance: instance-identification-string, job: job-name |
| node_netstat_Tcp_OutSegs | netstat | Number of TCP packets sent (cumulative) | instance: instance-identification-string, job: job-name |
| node_netstat_Udp_InDatagrams | netstat | Number of UDP packets received (cumulative) | instance: instance-identification-string, job: job-name |
| node_netstat_Udp_OutDatagrams | netstat | Number of UDP packets sent (cumulative) | instance: instance-identification-string, job: job-name |
| node_network_flags | netclass | Numeric value indicating the state of the interface | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_iface_link | netclass | Interface serial number | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_mtu_bytes | netclass | MTU value of the interface | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_receive_bytes_total | netdev | Number of bytes received by the network device (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_receive_errs_total | netdev | Number of receive errors on the network device (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_receive_packets_total | netdev | Number of packets received by the network device (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_transmit_bytes_total | netdev | Number of bytes sent by the network device (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_transmit_colls_total | netdev | Number of transmit collisions on the network device (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_transmit_errs_total | netdev | Number of transmit errors on the network device (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_network_transmit_packets_total | netdev | Number of packets sent by the network device (cumulative) | instance: instance-identification-string, job: job-name, device: network-device-name |
| node_time_seconds | time | System time in seconds since the epoch (1970) | instance: instance-identification-string, job: job-name |
| node_uname_info | uname | System information obtained by the uname system call | instance: instance-identification-string, job: job-name, domainname: NIS-and-YP-domain-names, machine: hardware-identifiers, nodename: machine-name-in-some-network-defined-at-implementation-time, release: operating-system-release-number (e.g. "2.6.28"), sysname: the-name-of-the-OS (e.g. "Linux"), version: operating-system-version |
| node_vmstat_pswpin | vmstat | Number of page swap-ins (cumulative) | instance: instance-identification-string, job: job-name |
| node_vmstat_pswpout | vmstat | Number of page swap-outs (cumulative) | instance: instance-identification-string, job: job-name |
| node_systemd_unit_state | systemd | State of the systemd unit | instance: instance-identification-string, job: job-name, name: unit-file-name, state: service-status#1, type: how-to-launch-a-process#2 |
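As an illustration of the kind of PromQL that can be written against these metrics, the following sketch derives physical memory usage from the meminfo metrics above. The expression is shown inside a Prometheus-style recording-rule file; the group and record names are hypothetical, and the product's own expressions are those in the metric definition files:

```yaml
groups:
  - name: example-node-memory                     # hypothetical rule group
    rules:
      - record: instance:node_memory_used:percent # hypothetical recording rule
        # Physical memory usage (%): share of memory not available to new applications.
        expr: 100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)
```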
■ Collector
The Node exporter has a built-in collection process called a "collector" for each monitored resource such as CPU and memory.
If you want to add the metrics listed in the table above as acquisition fields, you must enable the collector corresponding to the metric you want to use. You can also disable collectors of metrics that you do not want to collect to suppress unnecessary collection.
Whether each collector is enabled or disabled is specified in the Node exporter command-line options: specify a collector to enable with the "--collector.collector-name" option and a collector to disable with the "--no-collector.collector-name" option.
For details about Node exporter command-line options, see the description of node_exporter command options in Unit definition file (jpc_program-name.service) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Specifying monitored services
When using the service monitoring function of Node exporter, specify the services to be monitored in the "--collector.systemd.unit-include" item of the Node exporter unit definition file (jpc_node_exporter.service). Performance data is collected for a service specified in this file that meets one of the following conditions:
- Automatic start of the monitored service is enabled (systemctl enable has been run)
- Automatic start of the monitored service is disabled, but its status is active
Performance data for a service with automatic start disabled is not collected while the service is stopped. Therefore, to monitor a service that has automatic start disabled and is currently stopped, start the service and collect its performance data before creating the IM management node tree.
For details on the unit definition file, see the description of the "--collector.systemd.unit-include" item under node_exporter command options in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
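For example (an illustrative value, not a product default), specifying `--collector.systemd.unit-include="(jpc_.*|sshd)\.service"` in the unit definition file limits collection to unit files whose names match the regular expression, such as the JP1/IM - Agent services and sshd.service.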
- About monitoring JP1/IM - Agent services
  For the unit definition file names of the JP1/IM - Agent services, see 10.1 Service of JP1/IM - Agent in the JP1/Integrated Management 3 - Manager Administration Guide. For the unit definition file names in a logical host environment, see 8.3.6 Newly installing JP1/IM - Agent with integrated agent host (for UNIX) in the JP1/Integrated Management 3 - Manager Configuration Guide.
  Note that you cannot use the service monitoring function to monitor the Prometheus server and Node exporter services.
(e) Process exporter (Linux process data collection capability)
Process exporter, built into a monitored Linux host, collects operating information of processes running on that host.
Installed on the same host as the Prometheus server, Process exporter collects operational information about processes from the Linux OS on the host when triggered by scrape requests from the Prometheus server, and returns it to the server.
Process exporter allows you to collect process-related operational information from inside the host, which cannot be obtained through monitoring from outside the host (external monitoring by URL or CloudWatch).
■ Main items to be acquired
The main retrieval items of Process exporter are defined in the Process exporter metric definition file (default). For details, see Process exporter metric definition file (metrics_process_exporter.conf) in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add retrieved items to the metric definition file. The following are the metrics that can be specified in the PromQL statements described in the definition file.
| Metric Name | What to Get | Label |
|---|---|---|
| namedprocess_namegroup_num_procs | Number of processes in this group | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_cpu_seconds_total | CPU usage based on the /proc/[pid]/stat fields utime(14) and stime(15), that is, user and system time | instance: instance-identifier-string, job: job-name, groupname: group-name#, mode: user or system |
| namedprocess_namegroup_read_bytes_total | Number of bytes read, based on the /proc/[pid]/io field read_bytes. Because the kernel makes /proc/[pid]/io readable only by the process's user, process-exporter must run as that user or as root to obtain these values; otherwise they cannot be read and the metric is constantly 0. | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_write_bytes_total | Number of bytes written, based on the /proc/[pid]/io field write_bytes | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_major_page_faults_total | Number of major page faults, based on the /proc/[pid]/stat field majflt(12) | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_minor_page_faults_total | Number of minor page faults, based on the /proc/[pid]/stat field minflt(10) | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_context_switches_total | Number of context switches, based on the /proc/[pid]/status fields voluntary_ctxt_switches and nonvoluntary_ctxt_switches. The extra label ctxswitchtype can have two values: voluntary and nonvoluntary. | instance: instance-identifier-string, job: job-name, groupname: group-name#, ctxswitchtype: voluntary or nonvoluntary |
| namedprocess_namegroup_memory_bytes | Number of bytes of memory used. The extra label memtype can have three values: resident, virtual, and swapped. If gathering of the smaps file is enabled, two additional memtype values are added: proportionalResident (sum of the Pss fields from /proc/[pid]/smaps) and proportionalSwapped (sum of the SwapPss fields from /proc/[pid]/smaps). | instance: instance-identifier-string, job: job-name, groupname: group-name#, memtype: resident, virtual, swapped, proportionalResident, or proportionalSwapped |
| namedprocess_namegroup_open_filedesc | Number of file descriptors, based on counting the entries in the /proc/[pid]/fd directory | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_worst_fd_ratio | Worst ratio of open file descriptors to the file descriptor limit among all the processes in the group. The limit is the soft fd limit based on /proc/[pid]/limits. | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_oldest_start_time_seconds | Epoch time (seconds since 1970-01-01) at which the oldest process in the group started. Derived from the field starttime(22) of /proc/[pid]/stat, added to the boot time to make it relative to the epoch. | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_num_threads | Sum of the number of threads of all processes in the group, based on the field num_threads(20) of /proc/[pid]/stat | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_states | Number of threads in the group in each state, based on the field state(3) of /proc/[pid]/stat. The extra label state can have the values Running, Sleeping, Waiting, Zombie, and Other. | instance: instance-identifier-string, job: job-name, groupname: group-name#, state: Running, Sleeping, Waiting, Zombie, or Other |
| namedprocess_namegroup_thread_count | Number of threads in this thread subgroup | instance: instance-identifier-string, job: job-name, groupname: group-name#, threadname: thread-name |
| namedprocess_namegroup_thread_cpu_seconds_total | Same as cpu_user_seconds_total and cpu_system_seconds_total, but broken down per thread subgroup | instance: instance-identifier-string, job: job-name, groupname: group-name#, threadname: thread-name, mode: user or system |
| namedprocess_namegroup_thread_io_bytes_total | Same as read_bytes_total and write_bytes_total, but broken down per thread subgroup. Unlike read_bytes_total/write_bytes_total, the label iomode is used to distinguish between read and write bytes. | instance: instance-identifier-string, job: job-name, groupname: group-name#, threadname: thread-name, iomode: read or write |
| namedprocess_namegroup_thread_major_page_faults_total | Same as major_page_faults_total, but broken down per thread subgroup | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_thread_minor_page_faults_total | Same as minor_page_faults_total, but broken down per thread subgroup | instance: instance-identifier-string, job: job-name, groupname: group-name# |
| namedprocess_namegroup_thread_context_switches_total | Same as context_switches_total, but broken down per thread subgroup | instance: instance-identifier-string, job: job-name, groupname: group-name# |
- #
  The group-name is a name that uniquely identifies the collected performance values. Its value is stored according to what the user sets in the "name" item of the Process exporter configuration file (jpc_process_exporter.yml).
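The following sketch shows what such a grouping definition might look like in jpc_process_exporter.yml, using the process-exporter style of name templates and command-line matchers; the matcher value is hypothetical, and the authoritative layout is given in the Definition Files reference:

```yaml
process_names:
  # groupname is taken from the "name" template; {{.Comm}} groups by executable name.
  - name: "{{.Comm}}"
    cmdline:
      - "sshd"   # hypothetical: match processes whose command line matches this regexp
```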
- Important
  - Processes whose names contain multi-byte characters cannot be monitored.
  - Process exporter continues to output information about processes it has collected once, even after those processes stop running. Therefore, if Process exporter is configured to collect information based on PIDs, new time-series data is added every time a process is restarted and its PID changes, resulting in a large amount of unnecessary data. Furthermore, the use of PIDs is not recommended in the open source software, and version 13-00 is therefore configured not to collect PID information by default (groupname). If you want to manage processes that share the same command line separately, we recommend operational measures such as changing the order of arguments or using PIDs (in the latter case, periodic restarts are needed to prevent collected information from accumulating indefinitely). Note that the information collected by Windows exporter differs from what Process exporter collects, because Windows exporter collects PID information.
  - When Process exporter monitors a process, by default it also monitors the child processes of that process and acquires operational data that includes them. To exclude child processes, edit the unit definition file of Process exporter. For details, see 2.19.2(6)(d) Setting that excludes child processes from monitoring in the JP1/Integrated Management 3 - Manager Configuration Guide.
(f) Node exporter for AIX (AIX performance data collection capability)
Node exporter for AIX is an Exporter that is embedded in a monitored AIX host to obtain operational information of that host.
Node exporter for AIX is installed on a different host from the Prometheus server. When it receives a scrape request from the Prometheus server, it collects operational information from the AIX OS of its own host and returns it to the Prometheus server.
It can collect operational information related to memory and disks from inside the host, which cannot be collected by monitoring from outside the host (external monitoring by URL or CloudWatch).
■ Prerequisites
The ports used by Node exporter for AIX must be protected by firewalls, network configuration, and so on, so that they cannot be accessed by anything other than the Prometheus server of JP1/IM - Agent.
For the ports used by Node exporter for AIX, see the explanation of node_exporter_aix command options in 10.4.2(1) Enabling registering services in the JP1/Integrated Management 3 - Manager Administration Guide.
■ Conditions to be monitored
For the supported OS versions of the host on which Node exporter for AIX is installed, see the Release Notes.
WPAR is not supported.
Starting multiple instances of Node exporter for AIX on the same host is not supported, even when they are started on separate physical and logical hosts.
A logical host configuration of the monitored AIX host is supported only if the following condition is met:
- The host name of the monitored AIX host can be uniquely resolved from Prometheus.
  Note: If more than one IP address is assigned to the monitored AIX host, Node exporter for AIX accepts access at all of those IP addresses.
For the upper limit on the number of Node exporter for AIX instances that one Prometheus server can monitor, see the list of limits for JP1/IM - Agent in Appendix D.1 Limits when using the Intelligent Integrated Management Base.
■ Main items to be acquired
The main retrieval items of Node exporter for AIX shipped with JP1/IM - Agent are defined in the Node exporter for AIX metric definition file (default). For details, see Node exporter for AIX metric definition file (metrics_node_exporter_aix.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add retrieved items to the metric definition file. The following table lists the metrics that can be specified in PromQL expressions in the definition file:
| Metric Name | Command-Line Option for Retrieval | What to Get | Label | Data Source |
|---|---|---|---|---|
| node_context_switches | -C | Total number of context switches (cumulative) | cpupool_id: physical-processor-shared-pooling-ID, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID | Obtained by the perfstat_cpu_total function (pswitch of the perfstat_cpu_t structure) |
| node_cpu | -C | Number of seconds the CPUs spent in each mode (cumulative) | instance: instance-identity-string, job: job-name, cpu: cpu-ID, mode: mode (idle, sys, user, or wait) | Obtained by the perfstat_cpu function (perfstat_cpu_t structure) |
| aix_diskpath_wblks | -D | Number of blocks written via the path | cpupool_id: physical-processor-shared-pooling-ID, diskpath: disk-path-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID | Obtained by the perfstat_diskpath function (wblks of the perfstat_diskpath_t structure) |
| aix_diskpath_rblks | -D | Number of blocks read via the path | cpupool_id: physical-processor-shared-pooling-ID, diskpath: disk-path-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID | Obtained by the perfstat_diskpath function (rblks of the perfstat_diskpath_t structure) |
| aix_disk_rserv | -d | Read or receive service time | cpupool_id: physical-processor-shared-pooling-ID, disk: disk-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, vgname: volume-group-name | Obtained by the perfstat_disk function (rserv of the perfstat_disk_t structure) |
| aix_disk_rblks | -d | Number of blocks read from disk | cpupool_id: physical-processor-shared-pooling-ID, disk: disk-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, vgname: volume-group-name | Obtained by the perfstat_disk function (rblks of the perfstat_disk_t structure) |
| aix_disk_wserv | -d | Write or send service time | cpupool_id: physical-processor-shared-pooling-ID, disk: disk-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, vgname: volume-group-name | Obtained by the perfstat_disk function (wserv of the perfstat_disk_t structure) |
| aix_disk_wblks | -d | Number of blocks written to disk | cpupool_id: physical-processor-shared-pooling-ID, disk: disk-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, vgname: volume-group-name | Obtained by the perfstat_disk function (wblks of the perfstat_disk_t structure) |
| aix_disk_time | -d | Amount of time the disk is active | cpupool_id: physical-processor-shared-pooling-ID, disk: disk-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, vgname: volume-group-name | Obtained by the perfstat_disk function (time of the perfstat_disk_t structure) |
| aix_disk_xrate | -d | Number of transfers from disk | cpupool_id: physical-processor-shared-pooling-ID, disk: disk-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, vgname: volume-group-name | Obtained by the perfstat_disk function (xrate of the perfstat_disk_t structure) |
| aix_disk_xfers | -d | Number of transfers to and from disk | cpupool_id: physical-processor-shared-pooling-ID, disk: disk-name, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, vgname: volume-group-name | Obtained by the perfstat_disk function (xfers of the perfstat_disk_t structure) |
| node_filesystem_avail_bytes | -f | Number of file system bytes available to non-root users | cpupool_id: physical-processor-shared-pooling-ID, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, device: device-name, fstype: file-system-type, mountpoint: mount-point | Obtained by the stat_filesystems function (avail_bytes of the filesystem structure) |
| node_filesystem_files | -f | Total number of file nodes in the file system | cpupool_id: physical-processor-shared-pooling-ID, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, device: device-name, fstype: file-system-type, mountpoint: mount-point | Obtained by the stat_filesystems function (files of the filesystem structure) |
| node_filesystem_files_free | -f | Total number of free file nodes in the file system | cpupool_id: physical-processor-shared-pooling-ID, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, device: device-name, fstype: file-system-type, mountpoint: mount-point | Obtained by the stat_filesystems function (files_free of the filesystem structure) |
| node_filesystem_free_bytes | -f | Number of bytes of free file system space | cpupool_id: physical-processor-shared-pooling-ID, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, device: device-name, fstype: file-system-type, mountpoint: mount-point | Obtained by the stat_filesystems function (free_bytes of the filesystem structure) |
| node_filesystem_size_bytes | -f | Number of bytes of file system capacity | cpupool_id: physical-processor-shared-pooling-ID, group_id: group-ID, instance: instance-identity-string, job: job-name, lpar: partition-name, machine_serial: machine-ID, device: device-name, fstype: file-system-type, mountpoint: mount-point | |
Get by stat_filesystems func size_bytes of filesystem structure |
|
node_intr |
-C |
Total number of interrupts serviced. |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_cpu_total func decrintrs of perfstat_cpu_total_t structure mpcsintrs of perfstat_cpu_total_t structure devintrs of perfstat_cpu_total_t structure softintrs of perfstat_cpu_total_t structure |
|
node_load1 |
-C |
1m load average. |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_cpu_total func loadavg[0] of perfstat_cpu_total_t structure |
|
node_load5 |
-C |
5m load average. |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_cpu_total func loadavg[1] of perfstat_cpu_total_t structure |
|
node_load15 |
-C |
15m load average. |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_cpu_total func loadavg[2] of perfstat_cpu_total_t structure |
|
aix_memory_real_avail |
-m |
Number of pages (in 4KB pages) of memory available without paging out working segments |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_memory_total func real_avail of perfstat_memory_total_t structure |
|
aix_memory_real_free |
-m |
Free real memory (in 4 KB pages). |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_memory_total func real_free of perfstat_memory_total_t structures |
|
aix_memory_real_inuse |
-m |
Real memory which is in use (in 4KB pages) |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_memory_total func real_inuse of perfstat_memory_total_t structures |
|
aix_memory_real_total |
-m |
Total real memory (in 4 KB pages). |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_memory_total func real_total of perfstat_memory_total_t structure |
|
aix_netinterface_mtu |
-i |
Network frame size |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func mtu of perfstat_netinterface_t structure |
|
aix_netinterface_ibytes |
-i |
Number of bytes received on interface |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func ibytes of perfstat_netinterface_t structure |
|
aix_netinterface_ierrors |
-i |
Number of input errors on interface |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func ierrors of perfstat_netinterface_t structure |
|
aix_netinterface_ipackets |
-i |
Number of packets received on interface |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func ipackets of perfstat_netinterface_t structure |
|
aix_netinterface_obytes |
-i |
Number of bytes sent on interface |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func obytes of perfstat_netinterface_t structure |
|
aix_netinterface_collisions |
-i |
Number of collisions on csma interface |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func collisions of perfstat_netinterface_t structure |
|
aix_netinterface_oerrors |
-i |
Number of output errors on interface |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func oerrors of perfstat_netinterface_t structure |
|
aix_netinterface_opackets |
-i |
Number of packets sent on interface |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID netinterface=net-interface-name |
Get by perfstat_netinterface func opackets of perfstat_netinterface_t structure |
|
aix_memory_pgspins |
-m |
Number of page ins from paging space |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_memory_total func pgspins of perfstat_memory_total_t structure |
|
aix_memory_pgspouts |
-m |
Number of pages paged out from paging space |
cpupool_id=physical-processor-shared-pooling-ID group_id=group-ID instance: instance-identity-string job: job-name lpar=partition-name machine_serial=machine-ID |
Get by perfstat_memory_total func pgspouts of perfstat_memory_total_t structure |
Node exporter for AIX collects data for each monitored resource, such as CPU and memory. You can enable or disable collection for each resource that you want to monitor by using the Node exporter for AIX command-line options, as sketched below.
For Node exporter for AIX command-line options, see the description of node_exporter_aix command options in 10.4.2(1) Enabling registering services in the JP1/Integrated Management 3 - Manager Administration Guide.
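For example, assuming the per-resource options listed in the table above, a hypothetical invocation that collects only CPU (-C), memory (-m), and disk (-d) data might look as follows (check the node_exporter_aix command options in the Administration Guide for the actual syntax):
node_exporter_aix -C -m -d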
Use Script exporter to collect information about processes. For details on how to configure the settings, see 1.23.2(4)(e) Monitoring processes on monitoring hosts (AIX) (optional) in the JP1/Integrated Management 3 - Manager Configuration Guide.
Use the JP1/Base log file trapping function to monitor the log files of the monitored AIX hosts.
■ Notes on logging Node exporter for AIX
The Node exporter for AIX log is output to the OS system log, so the output destination depends on the OS system log settings. For details on changing the output destination of the system log to which Node exporter for AIX logs, see 1.23.2(4)(f) Changing the log destination of Node exporter for AIX (optional) in the JP1/Integrated Management 3 - Manager Configuration Guide.
■ Precautions When Using SMT or Micro-Partitioning
In an SMT (simultaneous multithreading) or Micro-Partitioning environment, the calculation of the CPU utilization metric (cpu_used_rate) by Node exporter for AIX does not include physical CPU quotas, whereas the CPU utilization displayed by the sar command does include them.
Therefore, the CPU utilization (cpu_used_rate) of Node exporter for AIX might show a lower value than the sar command output.
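For reference, a host-level CPU utilization figure of this kind can be sketched in PromQL from the node_cpu metric listed in the table above (a sketch only, not including physical CPU quotas; the expressions in the shipped metric definition file are authoritative):
100 * (1 - sum by (instance) (rate(node_cpu{mode="idle"}[2m])) / sum by (instance) (rate(node_cpu[2m])))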
(g) Yet another cloudwatch exporter (Amazon CloudWatch performance data collection capability)
Yet another cloudwatch exporter is an exporter included in the integrated agent that collects operating information of AWS services in the cloud through Amazon CloudWatch.
Yet another cloudwatch exporter is installed on the same host as the Prometheus server. Upon a scrape request from the Prometheus server, it collects CloudWatch metrics through the SDK provided by AWS (AWS SDK)# and returns them to the Prometheus server.
- #
-
The SDK provided by Amazon Web Services (AWS). Yet another cloudwatch exporter uses the Go language version, AWS SDK for Go (V1). CloudWatch monitoring requires that Amazon CloudWatch support the AWS SDK for Go (V1).
You can monitor services on which Node exporter or Windows exporter cannot be installed.
- Restrictions
-
To monitor with Yet another cloudwatch exporter (Amazon CloudWatch performance data collection capability), you must be able to connect to the AWS Security Token Service (STS) global endpoint. You cannot use regional endpoints with the Yet another cloudwatch exporter shipped with JP1/IM - Agent.
■ Main items to be acquired
The main retrieval items of Yet another cloudwatch exporter are defined in Yet another cloudwatch exporter metric definition file (default). For details, see Yet another cloudwatch exporter metric definition file (metrics_ya_cloudwatch_exporter.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ CloudWatch metrics you can collect
You can collect metrics in the AWS namespaces that are supported for monitoring by the Yet another cloudwatch exporter of JP1/IM - Agent, as listed in 3.15.6(1)(k) Creating an IM Management Node for Yet another cloudwatch exporter.
Specify the metrics to collect by describing the AWS service name and CloudWatch metric name in the Yet another cloudwatch exporter configuration file (jpc_ya_cloudwatch_exporter.yml).
The following is an example of the configuration file for collecting the CPUUtilization and DiskReadBytes CloudWatch metrics of the AWS/EC2 service.
discovery:
  exportedTagsOnMetrics:
    ec2:
      - jp1_pc_nodelabel
  jobs:
    - type: ec2
      regions:
        - ap-northeast-1
      period: 60
      length: 300
      delay: 60
      nilToZero: true
      searchTags:
        - key: jp1_pc_nodelabel
          value: .*
      metrics:
        - name: CPUUtilization
          statistics:
            - Maximum
        - name: DiskReadBytes
          statistics:
            - Maximum
For details about the contents of the Yet another cloudwatch exporter configuration file, see Yet another cloudwatch exporter configuration file (jpc_ya_cloudwatch_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can also add new metrics to the Yet another cloudwatch exporter metric definition file by using the metrics you set in the Yet another cloudwatch exporter configuration file.
The metrics and labels specified in the PromQL statement described in the definition file conform to the following naming conventions:
- - Naming conventions for Exporter metrics
-
Yet another cloudwatch exporter automatically converts CloudWatch metric names into Exporter metric names according to the following rules. Metrics specified in PromQL statements must therefore be written using the Exporter metric name.
"aws_"#1+name-space#2+"_"+CloudWatch-metric#2+"_"+statistic-type#2
- #1
-
Appended if the namespace does not begin with "aws_".
- #2
-
Indicates the names you set in the Yet another cloudwatch exporter configuration file (jpc_ya_cloudwatch_exporter.yml). They are converted by the following rules:
-
It is converted from camel case notation to snake case notation.
Camel case is a notation that capitalizes word breaks, such as "CamelCase" or "camelCase".
Snake case is a notation that separates words with "_", such as "snake_case".
-
The following symbols are converted to "_".
whitespace, comma, tab, /, \, half-width period, -, :, =, full-width left double quote, @, <, >
-
"%" is converted to "_percent".
-
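For example, under these rules, collecting the CPUUtilization metric of the AWS/EC2 namespace with the Maximum statistic (as in the configuration example shown earlier) yields the following Exporter metric name:
aws_ec2_cpuutilization_maximum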
- - Exporter label naming conventions
-
Yet another cloudwatch exporter automatically converts CloudWatch dimension and tag names into Exporter label names according to the following rules. Labels specified in PromQL statements must therefore be written using the Exporter label name.
-
For dimensions
"dimension"+"_"+dimensions_name#
-
For tags
"tag"+"_"+tag_name#
-
For custom tags
"custom_tag_"+"_"+custom tag_name#
- #
-
Indicates the name you set in the Yet another cloudwatch exporter configuration file (jpc_ya_cloudwatch_exporter.yml).
-
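For example, under these rules, the EC2 dimension InstanceId becomes the label dimension_InstanceId, and the jp1_pc_nodelabel search tag used in the configuration example shown earlier becomes tag_jp1_pc_nodelabel. A PromQL selector combining the converted names might therefore look as follows (the instance ID is an illustrative value):
aws_ec2_cpuutilization_maximum{dimension_InstanceId="i-0123456789abcdef0"}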
■ About policies for IAM users in your AWS account
To connect to AWS CloudWatch, you must create a policy with the following permissions and attach it to an IAM user.
"tag:GetResources", "cloudwatch:GetMetricData", "cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics"
For details on how to set the JSON format information, see 1.21.2(7)(b) Modify Setup to connect to CloudWatch (for Windows) (optional) in the JP1/Integrated Management 3 - Manager Configuration Guide. (The reference for Linux is the same.)
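The following is a minimal sketch of such a policy in JSON format (the Sid value is illustrative):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "JP1IMAgentCloudWatchRead",
      "Effect": "Allow",
      "Action": [
        "tag:GetResources",
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    }
  ]
}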
■ Environment-variable HTTPS_PROXY
This environment variable is specified when connecting to CloudWatch from Yet another cloudwatch exporter through a proxy. Only an http URL can be set in the environment variable HTTPS_PROXY, and Basic authentication is the only supported authentication method.
You can set the environment variable HTTPS_PROXY to connect to AWS CloudWatch through a proxy. The following shows an example configuration.
HTTPS_PROXY=http://username:password@proxy.example.com:5678
■ How to handle monitoring targets JP1/IM - Agent does not support
If a product or metric cannot be monitored by JP1/IM - Agent, you must retrieve it by other means, for example, by using a user-defined Exporter.
(h) Promitor (Azure Monitor performance data collection capability)
Promitor, included in the integrated agent, collects operating information of Azure services on the cloud environment through Azure Monitor and Azure Resource Graph.
Promitor consists of Promitor Scraper and Promitor Resource Discovery. Promitor Scraper collects metrics on resources from Azure Monitor according to schedule settings and returns them.
Metrics can be collected from target resources in two ways: by specifying the target resources individually in a configuration file, or by detecting the resources automatically. If you choose automatic detection, Promitor Resource Discovery detects resources in a tenant through Azure Resource Graph, and based on the results, Promitor Scraper collects the metric information.
In addition, Promitor Scraper and Promitor Resource Discovery each require two configuration files: one defines runtime settings, such as authentication information, and the other defines the metric information to be collected.
■ Key metric items
The key Promitor metric items are defined in the Promitor metric definition file (initial status). For details, see the description under Promitor metric definition file (metrics_promitor.conf) in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Metrics you can collect
Promitor can collect metrics for the services listed in the following table.
You specify the metrics you want to collect in the Promitor Scraper configuration file (metrics-declaration.yaml). A configuration sketch follows the table below.
If you want to change the metrics specified in the Promitor Scraper configuration file, see Change monitoring metrics (optional) under (d) Configuring scraping targets (required) in 1.21.2(8) Set up of Promitor in the JP1/Integrated Management 3 - Manager Configuration Guide.
You can also add new metrics to the Promitor metric definition file, based on the metrics specified in the Promitor Scraper configuration file. Metrics defined in the Promitor Scraper configuration file can be specified in the PromQL statements written in the definition file.
|
Promitor resourceType name |
Azure Monitor namespace |
Automatic discovery support |
|---|---|---|
|
VirtualMachine |
Microsoft.Compute/virtualMachines |
Y |
|
FunctionApp |
Microsoft.Web/sites |
Y |
|
ContainerInstance |
Microsoft.ContainerInstance/containerGroups |
-- |
|
KubernetesService |
Microsoft.ContainerService/managedClusters |
Y |
|
FileStorage |
Microsoft.Storage/storageAccounts/fileServices |
-- |
|
BlobStorage |
Microsoft.Storage/storageAccounts/blobServices |
-- |
|
ServiceBusNamespace |
Microsoft.ServiceBus/namespaces |
Y |
|
CosmosDb |
Microsoft.DocumentDB/databaseAccounts |
Y |
|
SqlDatabase |
Microsoft.Sql/servers/databases |
Y |
|
SqlServer |
Microsoft.Sql/servers/databases Microsoft.Sql/servers/elasticPools |
-- |
|
SqlManagedInstance |
Microsoft.Sql/managedInstances |
Y |
|
SqlElasticPool |
Microsoft.Sql/servers/elasticPools |
Y |
|
LogicApp |
Microsoft.Logic/workflows |
Y |
- Legend:
-
Y: Automatic discovery is supported.
--: Automatic discovery is not supported.
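As a reference, the following is a minimal sketch of a metric declaration in the Promitor Scraper configuration file for a VirtualMachine resource (the IDs, resource names, and the metric shown are illustrative; check the shipped metrics-declaration.yaml for the authoritative layout):
version: v1
azureMetadata:
  tenantId: 00000000-0000-0000-0000-000000000000
  subscriptionId: 11111111-1111-1111-1111-111111111111
  resourceGroupName: my-resource-group
metrics:
  - name: azure_virtual_machine_percentage_cpu
    description: "Average CPU utilization of the virtual machine"
    resourceType: VirtualMachine
    azureMetricConfiguration:
      metricName: Percentage CPU
      aggregation:
        type: Average
    resources:
      - virtualMachineName: my-vm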
■ Checking how Azure SDKs used by Promitor are supported
Promitor uses the Azure SDK for .NET. The end of support for an Azure SDK is announced 12 months in advance. For details on the Azure SDK lifecycle, see the Lifecycle FAQ at the following website:
https://learn.microsoft.com/ja-jp/lifecycle/faq/azure#azure-sdk-----------
The lifecycles of individual versions of the Azure SDK libraries can be found at the following website:
https://azure.github.io/azure-sdk/releases/latest/all/dotnet.html
■ Credentials required for account information
Promitor can connect to Azure through the service principal method or the managed ID method. For details on the credentials assigned to the service principal and managed ID, see (a) Configuring the settings for establishing a connection to Azure (required) in the JP1/Integrated Management 3 - Manager Configuration Guide 1.21.2(8) Set up of Promitor.
(i) Blackbox exporter (Synthetic metric collector)
Blackbox exporter is an exporter that sends simulated requests to monitored Internet services on the network and obtains operation information from the responses. The supported communication protocols are HTTP, HTTPS, and ICMP.
When Blackbox exporter receives a scrape request from the Prometheus server, it issues a service request such as HTTP to the monitored target and obtains the response and the response time. The execution results are summarized as metrics and returned to the Prometheus server.
■ Main items to be acquired
The main retrieval items of Blackbox exporter are defined in Blackbox exporter metric definition file (default). For details, see Blackbox exporter metric definition file (metrics_blackbox_exporter.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add retrieval items to the metric definition file. The following are the metrics that can be specified in the PromQL statements described in the definition file.
|
Metric Name |
Prober |
What to get |
Label |
|---|---|---|---|
|
probe_http_duration_seconds |
http |
The number of seconds taken per phase of the HTTP request
|
instance: instance-identification-string job: job-name phase: phase#
|
|
probe_http_content_length |
http |
HTTP content response length |
instance: instance-identification-string job: job-name |
|
probe_http_uncompressed_body_length |
http |
Uncompressed response body length |
instance: instance-identification-string job: job-name |
|
probe_http_redirects |
http |
Number of redirects |
instance: instance-identification-string job: job-name |
|
probe_http_ssl |
http |
Whether SSL was used for the final redirect
|
instance: instance-identification-string job: job-name |
|
probe_http_status_code |
http |
HTTP response status code value
|
instance: instance-identification-string job: job-name |
|
probe_ssl_earliest_cert_expiry |
http |
Earliest expiring SSL certificate UNIX time |
instance: instance-identification-string job: job-name |
|
probe_ssl_last_chain_expiry_timestamp_seconds |
http |
Expiration timestamp of the last certificate in the SSL chain
|
instance: instance-identification-string job: job-name |
|
probe_ssl_last_chain_info |
http |
SSL leaf certificate information
|
instance: instance-identification-string job: job-name fingerprint_sha256: SHA256-fingerprint-on-certificate |
|
probe_tls_version_info |
http |
TLS version used
|
instance: instance-identification-string job: job-name version:TLS-version |
|
probe_http_version |
http |
HTTP version of the probe response |
instance: instance-identification-string job: job-name |
|
probe_failed_due_to_regex |
http |
Whether the probe failed due to a regular expression check on the response body or response headers
|
instance: instance-identification-string job: job-name |
|
probe_http_last_modified_timestamp_seconds |
http |
UNIX time showing Last-Modified HTTP response headers |
instance: instance-identification-string job: job-name |
|
probe_icmp_duration_seconds |
icmp |
Seconds taken per phase of an ICMP request |
instance: instance-identification-string job: job-name phase: phase#
|
|
probe_icmp_reply_hop_limit |
icmp |
Hop limit (TTL for IPv4) value
|
instance: instance-identification-string job: job-name |
|
probe_success |
-- |
Whether the probe was successful
|
instance: instance-identification-string job: job-name |
|
probe_duration_seconds |
-- |
The number of seconds it took for the probe to complete |
instance: instance-identification-string job: job-name |
■ IP communication with monitored objects
Only IPv4 communication is supported.
■ Encrypted communication with monitored objects
HTTP monitoring enables encrypted communication using TLS. In this case, the Blackbox exporter acts as a TLS client to the monitored object (TLS server).
When using encrypted communication with TLS, specify it in the "tls_config" item in the Blackbox exporter configuration file (jpc_blackbox_exporter.yml). In addition, the following certificate and key files must be prepared. A configuration sketch follows the table below.
|
File |
Format |
|---|---|
|
CA certificate file |
A file encoding an X509 public key certificate in pkcs7 format in PEM format |
|
Client certificate file |
|
|
Client certificate key file |
A file in which the private key in pkcs1 or pkcs8 format is encoded in PEM format#
|
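The following is a minimal sketch of such a module definition (the module name and file paths are illustrative; ca_file, cert_file, and key_file are the standard tls_config items of Blackbox exporter):
modules:
  https_client_auth:
    prober: http
    http:
      tls_config:
        ca_file: /path/to/ca.pem
        cert_file: /path/to/client.pem
        key_file: /path/to/client.key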
The supported TLS versions and cipher suites are as follows.
|
Item |
Scope of support |
|---|---|
|
TLS Version |
1.2 to 1.3 |
|
Cipher suites |
|
■ Timeout for collecting health information
In a network environment where responses are slow even under normal conditions, operation information can still be collected by adjusting the timeout period.
On the Prometheus server, you can specify the scrape request timeout period in the entry "scrape_timeout" of the Prometheus configuration file (jpc_prometheus_server.yml). For details, see the description of item scrape_timeout in Prometheus configuration file (jpc_prometheus_server.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
In addition, the timeout period for connections from Blackbox exporter to the monitoring target is 0.5 seconds shorter than the value specified in "scrape_timeout" above.
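For example, the following sketch (the job name is one of the default scrape job names; the value is illustrative) sets the scrape timeout of the HTTP/HTTPS monitoring job to 30 seconds, which gives Blackbox exporter a connection timeout of 29.5 seconds:
scrape_configs:
  - job_name: jpc_blackbox_http
    scrape_timeout: 30s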
■ Certificate expiration
When collecting operation information by HTTPS monitoring, the exporter receives a certificate list (the server certificate and the chain of certificates certifying it) from the monitoring target.
Blackbox exporter lets you collect the expiration time (UNIX time) of the earliest-expiring certificate as the probe_ssl_earliest_cert_expiry metric.
You can also monitor certificates that are approaching expiration by using the functions in 3.15.1(3) Performance data monitoring notification function, because the number of seconds remaining until expiration can be calculated as the probe_ssl_earliest_cert_expiry metric value minus the value of PromQL's time() function.
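For example, the following PromQL expression (the 14-day threshold is illustrative) matches targets whose earliest-expiring certificate expires within 14 days:
(probe_ssl_earliest_cert_expiry - time()) < 14 * 86400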
■ User-Agent value in HTTP request header when monitoring HTTP
The default value of User-Agent included in HTTP request header during HTTP monitoring is as shown below:
-
For version 13-00 or earlier
"Go-http-client/1.1"
-
For version 13-00-01 or later
"Blackbox Exporter/0.24.0"
You can change the value of User-Agent in the setting of item "headers" in the Blackbox exporter configuration file (jpc_blackbox_exporter.yml).
The following is an example of changing the value of User-Agent to "My-Http-Client".
modules:
  http:
    prober: http
    http:
      headers:
        User-Agent: "My-Http-Client"
For details, see the description of item headers in Blackbox exporter configuration file (jpc_blackbox_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ About HTTP 1.1 Name-Based Virtual Host Support
Blackbox exporter supports HTTP 1.1 name-based virtual hosts and TLS Server Name Indication (SNI). You can monitor virtual hosts, where a single HTTP/HTTPS server behaves as multiple HTTP/HTTPS servers.
■ About TLS Server Authentication and Client Authentication
In Blackbox exporter's HTTPS monitoring, server authentication is performed using the CA certificate described in item "ca_file" of the Blackbox exporter configuration file (jpc_blackbox_exporter.yml) and the server certificate sent by the server when HTTPS communication with the server starts (TLS handshake).
If the sent certificate is incorrect (server name is incorrect, expired, self-certificate is used, etc.), HTTPS communication cannot be started and monitoring fails.
In addition, when a request is made to send a certificate from the monitored server at the start of HTTPS communication (TLS handshake), the client certificate described in item "cert_file" of the Blackbox exporter configuration file (jpc_blackbox_exporter.yml) is sent to the monitored server.
If the server validates the sent certificate, recognizes it as invalid, and returns an error to the Blackbox exporter via the TLS protocol (or if communication cannot be continued due to a loss of communication, etc.), the monitoring fails.
For details on the verification contents related to the client certificate and the operation in the event of an error on the monitored server, check the specifications of the monitored server (or relay device such as a load balancer).
Even when an invalid certificate would be detected during server authentication, specifying "true" in the "insecure_skip_verify" item in the Blackbox exporter configuration file (jpc_blackbox_exporter.yml) allows HTTPS communication to start without an error. Note, however, that this disables the certificate verification performed for server authentication.
For details, see the description of item insecure_skip_verify in Blackbox exporter configuration file (jpc_blackbox_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
Server authentication cannot be performed using certificates whose host name is not listed in the Subject Alternative Name field.
■ About cookie information
Blackbox exporter does not reuse cookie information sent from the monitored target in subsequent HTTP requests.
■ About external resources referenced from content included in the response body of HTTP communication
In Blackbox exporter, external resources (subframes, images, and so on) referenced from the content included in the response body of HTTP communication are not included in the monitoring scope.
■ About Monitoring of Content Included in HTTP Communication Response Body
Because Blackbox exporter does not parse content, execution results and execution times based on the syntax (HTML, JavaScript, and so on) of the content included in the response body of HTTP communication are not reflected in the monitoring results.
■ Precautions when the monitoring destination of HTTP monitoring redirects with Basic authentication
If the destination monitored by Blackbox exporter's HTTP monitoring redirects with Basic authentication, Blackbox exporter sends the same Basic authentication user name and password to both the redirect source and the redirect destination. Therefore, when Basic authentication is performed at both the redirect source and the redirect destination, the same user name and password must be set on both.
(j) Script exporter (UAP monitoring capability)
Script exporter runs scripts on a host and obtains the results.
The Script exporter is installed on the same host as the JP1/IM - Agent, and upon a scrape request from the Prometheus server, it executes a script on that host to retrieve the results and returns them to the Prometheus server.
Developing a script that obtains UAP information and converts it to metrics, and adding the script to Script exporter, enables you to monitor applications that are not supported by existing Exporters in the way you want, as sketched below.
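For example, a script entry of the following kind could be added (a minimal sketch; the name, path, and the scripts/name/command layout follow the OSS Script exporter and are assumptions here; the script must print its results in the Prometheus text exposition format, such as a line like "uap_records_processed 42"):
scripts:
  - name: uap_check
    command: /opt/uap/bin/uap_check.sh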
■ Key metric items
The key Script exporter metric items are defined in the Script exporter metric definition file (initial status). For details, see Script exporter metric definition file (metrics_script_exporter.conf) in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add more metric items to the metric definition file. The following table shows the metrics you can specify with PromQL statements used within the definition file.
|
Metric name |
Data to be obtained |
Label |
|---|---|---|
|
script_success |
Script exit status (0 = error, 1 = success) |
instance: instance-identifier-string job: job-name script: script-name |
|
script_duration_seconds |
Script execution time, in seconds. |
instance: instance-identifier-string job: job-name script: script-name |
|
script_exit_code |
The exit code of the script. |
instance: instance-identifier-string job: job-name script: script-name |
(k) OracleDB exporter (Oracle Database monitoring function)
OracleDB exporter is an Exporter for Prometheus that retrieves performance data from Oracle Database.
- - About the number of sessions
-
When OracleDB exporter monitors Oracle Database, it connects at each scrape and disconnects when data collection is complete. One session is used per connection.
■ Conditions to be monitored
The following are the Oracle Database configurations and database character sets that JP1/IM - Agent supports for monitoring:
-
Configuring Oracle Database
-
For non-clusters
Non-CDB and CDB configurations
-
For Oracle RAC
CDB configuration
-
Because one OracleDB exporter process connects to one service, launch one OracleDB exporter per target when there is more than one target.
- Note
-
-
Oracle RAC One Node and Oracle Database Cloud Service are not supported.
-
HA clustering configuration on Oracle Database is not supported.
-
-
Oracle Database database-character set
-
AL32UTF8(Unicode UTF-8)
-
JA16SJIS (Japanese-language SJIS)
-
ZHS16GBK (Simplified Chinese GBK)
-
■ Acquisition items
The metrics that can be retrieved with the OracleDB exporter shipped with JP1/IM - Agent are the metrics defined by default in OracleDB exporter, plus cache_hit_ratio.
OracleDB exporter retrieval items are defined in metric definition-file (default) of OracleDB exporter. For details, see OracleDB exporter metric definition file (metrics_oracledb_exporter.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
The following table lists the metrics that can be specified in PromQL expressions in the definition file. The value of each metric is obtained by executing the SQL statement shown in the table against Oracle Database. For details about a metric, contact Oracle based on the SQL statement of its data source.
|
Metric name |
Contents to be acquired |
Label |
Data source (SQL statement) |
|---|---|---|---|
|
oracledb_sessions_value |
Count of sessions |
status: status type: session-type |
SELECT status, type, COUNT(*) as value FROM v$session GROUP BY status, type |
|
oracledb_resource_current_utilization |
Resource usage#1 |
resource_name: resource-name |
SELECT resource_name,current_utilization,CASE WHEN TRIM(limit_value) LIKE 'UNLIMITED' THEN '-1' ELSE TRIM(limit_value) END as limit_value FROM v$resource_limit |
|
oracledb_resource_limit_value |
Resource usage limit#1 (UNLIMITED: -1) |
resource_name: resource-name |
|
|
oracledb_asm_diskgroup_total |
Bytes of total size of ASM disk group |
name: disk-group-name |
SELECT name,total_mb*1024*1024 as total,free_mb*1024*1024 as free FROM v$asm_diskgroup_stat where exists (select 1 from v$datafile where name like '+%') |
|
oracledb_asm_diskgroup_free |
Bytes of free space available on ASM disk group |
name: disk-group-name |
|
|
oracledb_activity_execute_count |
Total number of calls (user calls and recursive calls) executing SQL statements (cumulative value) |
none |
SELECT name, value FROM v$sysstat WHERE name IN ('parse count (total)', 'execute count', 'user commits', 'user rollbacks', 'db block gets from cache', 'consistent gets from cache', 'physical reads cache') |
|
oracledb_activity_parse_count_total |
Total number of parse calls (hard, soft and describe) (cumulative value) |
none |
|
|
oracledb_activity_user_commits |
Total number of user commit (cumulative value) |
none |
|
|
oracledb_activity_user_rollbacks |
The number of times a user manually issued a ROLLBACK statement, or the total number of times an error occurred during a user's transaction (cumulative value) |
none |
|
|
oracledb_activity_physical_reads_cache |
Total number of data blocks read from disk to the buffer cache (cumulative value) |
none |
|
|
oracledb_activity_consistent_gets_from_cache |
Number of times block read consistency was requested from the buffer cache (cumulative value) |
none |
|
|
oracledb_activity_db_block_gets_from_cache |
Number of times CURRENT blocking was requested from the buffer cache (cumulative value) |
none |
|
|
oracledb_process_count |
Count of Oracle Database active-processes |
none |
SELECT COUNT(*) as count FROM v$process |
|
oracledb_wait_time_administrative |
Time spent waiting for the Administrative wait class (in 1/100 seconds)#2 |
none |
SELECT n.wait_class as WAIT_CLASS, round(m.time_waited/m.INTSIZE_CSEC,3) as VALUE FROM v$waitclassmetric m, v$system_wait_class n WHERE m.wait_class_id=n.wait_class_id AND n.wait_class != 'Idle' |
|
oracledb_wait_time_application |
Time spent waiting for the Application wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_commit |
Time spent waiting for the Commit wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_concurrency |
Time spent waiting for the Concurrency wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_configuration |
Time spent waiting for the Configuration wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_network |
Time spent waiting for the Network wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_other |
Time spent waiting for the Other wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_scheduler |
Time spent waiting for the Scheduler wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_system_io |
Time spent waiting for the System I/O wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_wait_time_user_io |
Time spent waiting for the User I/O wait class (in 1/100 seconds)#2 |
none |
|
|
oracledb_tablespace_bytes |
Total bytes consumed by tablespaces |
tablespace: name-of-the-tablespace type: tablespace-contents |
SELECT dt.tablespace_name as tablespace, dt.contents as type, dt.block_size * dtum.used_space as bytes, dt.block_size * dtum.tablespace_size as max_bytes, dt.block_size * (dtum.tablespace_size - dtum.used_space) as free, dtum.used_percent FROM dba_tablespace_usage_metrics dtum, dba_tablespaces dt WHERE dtum.tablespace_name = dt.tablespace_name ORDER by tablespace |
|
oracledb_tablespace_max_bytes |
Maximum number of bytes in a tablespace |
tablespace: name-of-the-tablespace type: tablespace-contents |
|
|
oracledb_tablespace_free |
Number of free bytes in the tablespace |
tablespace: name-of-the-tablespace type: tablespace-contents |
|
|
oracledb_tablespace_used_percent |
Tablespace utilization. If auto extension is ON, it is calculated with auto extension taken into account. |
tablespace: name-of-the-tablespace type: tablespace-contents |
|
|
oracledb_exporter_last_scrape_duration_seconds |
The number of seconds taken by the last scrape |
none |
- |
|
oracledb_exporter_last_scrape_error |
Whether the last scrape resulted in an error (0: Success, 1: Error) |
none |
- |
|
oracledb_exporter_scrapes_total |
Total number of times Oracle Database was scraped for metrics |
none |
- |
|
oracledb_up |
Whether the Oracle Database Server is up 0: Not running 1: Running |
none |
- |
- #1
-
In a PDB, the source table v$resource_limit is empty, so the value cannot be retrieved.
- #2
-
In a PDB, the source table v$waitclassmetric is empty, so the value cannot be retrieved.
- Important
-
-
Before using OracleDB exporter, make sure that the SQL statements that serve as data sources can be executed, for example, with the SQL*Plus command, and that the required information is displayed. When checking, connect to Oracle Database as the user that OracleDB exporter uses for the connection.
-
The OracleDB exporter provided by JP1/IM - Agent does not support collecting arbitrary metrics (custom metrics).
-
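For example, the sessions query from the table above might be verified as follows (a sketch; the connection values match the specification example given later and are illustrative):
sqlplus orauser@//orahost:1521/orasrv
SQL> SELECT status, type, COUNT(*) as value FROM v$session GROUP BY status, type;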
■ Requirements for monitoring Oracle Database
If you want to monitor Oracle Database with OracleDB exporter, Oracle Database must be configured as follows:
You do not need to install Oracle Client or similar software on the JP1/IM - Agent host.
-
Oracle listener
-
Configure Oracle listener and the service name so that the monitoring target can be connected to.
-
Oracle listener is configured to accept unencrypted connection requests.
-
-
Oracle Database
Set Oracle Database database-character set to one of the following:
-
AL32UTF8 (Unicode UTF-8)
-
JA16SJIS (Japanese-language SJIS)
-
ZHS16GBK (Simplified Chinese GBK)
-
-
Users used to access Oracle Database
-
The user used to connect to Oracle Database must have the following permissions:
- Login permissions
- SELECT permissions to the following tables
dba_tablespace_usage_metrics
dba_tablespaces
v$system_wait_class
v$asm_diskgroup_stat
v$datafile
v$sysstat
v$process
v$waitclassmetric
v$session
v$resource_limit
-
User used to connect to Oracle Database
For details about the character types and maximum lengths that can be specified for user names, see Environment variables.
-
Password of the user used to connect to Oracle Database
The following character types can be used for passwords:
- Uppercase letters, lowercase letters, numbers, @, +, ', !, $, :, ., (, ), ~, -, _
- The password can be from 1 to 30 bytes in length.
-
■ Obfuscation of Oracle Database passwords
The OracleDB exporter shipped with JP1/IM - Agent manages the password used to access Oracle Database with the secret obfuscation function. For details, see 3.15.10 Secret obfuscation function.
■ Notes on Oracle Database log files
Monitoring Oracle Database with OracleDB exporter can generate a large number of log files. The Oracle Database administrator should therefore consider deleting log files periodically.
|
Directory where log files are generated (including subdirectories) |
Extensions of log files that increase |
|---|---|
|
$ORACLE_BASE/diag/rdbms |
.trc, .trm |
The following are example command lines for deleting ".trc" and ".trm" files with old modification dates. If necessary, consider running such commands periodically to delete unnecessary logs.
|
OS |
Command line example for deleting logs |
|---|---|
|
Windows |
forfiles /P "%ORACLE_BASE%\diag\rdbms" /M *.trm /S /C "cmd /C del /Q @path" /D -14 forfiles /P "%ORACLE_BASE%\diag\rdbms" /M *.trc /S /C "cmd /C del /Q @path" /D -14 |
|
Linux |
find $ORACLE_BASE/diag/rdbms -name '*.tr[cm]' -mtime +14 -delete |
Set the $ORACLE_BASE and %ORACLE_BASE% environment variables as needed.
■ Environment variables
The following environment variables are required when using OracleDB exporter.
- - Environment-variable "DATA_SOURCE_NAME" (required)
-
Specify the connection destination of OracleDB exporter in the following format. There is no default value.
-
For Windows
oracle://user-name@host-name:port/service-name?connection timeout=10[&instance name=instance-name]
-
For Linux
oracle://user-name@host-name:port/service-name?connection timeout=10[&instance name=instance-name]
- user-name
-
-
Specifies the user name used to connect to Oracle listener. Up to 30 characters can be specified.
-
You can use uppercase letters, numbers, underscores (_), dollar signs ($), pound signs (#), periods (.), and at signs (@). Note that lowercase letters cannot be used.
-
For Linux, replace each pound sign with "%%23" when you include the user name in the unit definition file. For example, for the shared CDB user "C##USER", specify "C%%23%%23USER".
-
For Windows, replace each pound sign with "%23" when you include the user name in the service definition file. For example, for the shared CDB user "C##USER", specify "C%23%23USER".
-
- host-name
-
-
Specifies the host name of Oracle Database host to monitor. Up to 253 characters can be specified.
-
You can use uppercase letters, lowercase letters, numbers, hyphens, and periods.
-
- port
-
-
Specifies the port number for connecting to Oracle listener.
-
- service-name
-
-
Specifies the service name of Oracle listener. Up to 64 characters can be specified.
-
You can use uppercase letters, lowercase letters, numbers, underscores, hyphens, and periods.
-
- Option
-
You can specify the following options. If you specify more than one, join them with "&" (on both Windows and Linux).
-
connection timeout=number
Specifies the connection timeout in seconds. This option must be specified.
Be sure to specify 10. If you specify a value other than 10 or omit this option, the scrape from the Prometheus server times out, and the up metric might be 0 even though OracleDB exporter is running.
-
instance name=instance-name
Specifies instance to connect to. Specifying this option is optional.
-
(Example of specification)
oracle://orauser@orahost:1521/orasrv?connection timeout=10
-
For Windows
oracle://orauser@orahost:1521/orasrv?connection timeout=10&instance name=orcl1
-
For Linux
oracle://orauser@orahost:1521/orasrv?connection timeout=10 &instance name=orcl1
-
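For reference, on Linux the variable might be set in the unit definition file roughly as follows (a sketch only; the value is illustrative and uses the %%23 escaping rule for pound signs described above):
[Service]
Environment="DATA_SOURCE_NAME=oracle://C%%23%%23USER@orahost:1521/orasrv?connection timeout=10"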
- - Environment variable DATA_SOURCE_NAME (required)
-
Specify the full path of jp1ima directory under JP1/IM - Agent installation directory.
For a logical host, specify the full path of jp1ima directory under JP1/IM - Agent shared directory.
(Example of specification)
-
For Windows
C:\Program files\Hitachi\jp1ima
-
For Linux
/opt/jp1ima
-
■ Notes
-
If you stop the monitored Oracle Database instance or containers before stopping OracleDB exporter, a NORMAL shutdown of Oracle might not complete. Stop OracleDB exporter first, or stop Oracle Database with an IMMEDIATE shutdown.
-
Stop OracleDB exporter before changing the configuration of, or performing maintenance on, the Oracle Database instance and containers.
(l) Fluentd (Log metrics)
This capability can generate and measure log metrics from log files created by monitoring targets. For details on the function, see 3.15.2 Log metrics by JP1/IM - Agent.
■ Key metric items
You define what figures you need from the log files created by your monitoring targets in the log metrics definition file (fluentd_any-name_logmetrics.conf). These definitions allow you to get quantified data (log metrics) as metric items.
For details on the log metrics definition file, see Log metrics definition file (fluentd_any-name_logmetrics.conf) in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Sample files
The following provides descriptions of sample files for when you use the log metrics feature. If you copy the sample files, be careful of the linefeed codes. For details, see the description of each file of 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference. These sample files are based on the assumptions in Assumptions of the sample files. Copy each file and change the settings according to your monitoring targets.
- - Assumptions of the sample files
-
The sample files described here assume that HostA, a monitored host (integrated agent host), exists and JP1/IM - Agent is installed in it, and that WebAppA, an application running on HostA, creates the following log file.
- - ControllerLog.log
-
As shown in target log message 1, a log message is created, saying that an HTTP endpoint in WebAppA is used, at the start of processing of the request for that endpoint. The log message also indicates the number of records handled upon request processing.
Target log message 1:
... 2022-10-19 10:00:00 [INFO] c.b.springbootlogging.LoggingController : endpoint "/register" started. Target record: 5. ...
In the sample files, a regular expression to match target log message 1 is used, and the number of the log messages that match the expression is counted. The number is then displayed in the Trends tab of the JP1/IM integrated operation viewer as log metric 1, Requests to the register Endpoint.
The definition for log metric 1 uses counter as its log metric type.
In addition, the regular expression used in the above also extracts the number indicated as Target record from target log message 1, and then the extracted numbers are summed up. The total is then displayed in the Trends tab of the JP1/IM integrated operation viewer as log metric 2, Number of Registered Records.
The definition for log metric 2 uses counter as its log metric type.
As many Fluentd workers (multi-process workers feature) as the number of log files to be monitored are required. For details on the worker settings related to the log metrics feature, see the log metrics definition file (fluentd_any-name_logmetrics.conf). Here, it is assumed that 11 Fluentd workers are running and that ControllerLog.log is monitored by the worker whose worker ID is 10.
These sample files also assume the tree structure consisting of the following IM management nodes:
All Systems
  + HostA
    + Application Server
      + WebAppA
- - Target files in this example
-
The target files used in this example are as follows:
-
Integrated manager host
- User-specific metric definition file
-
Integrated agent host
- Prometheus configuration file
- User-specific discovery configuration file
- Log metrics definition file
- Fluentd log monitoring target definition file
-
- - Sample user-specific metric definition file
-
- File name: metrics_logmetrics1.conf
- Written code
[ { "name":"logmetrics_request_endpoint_register", "default":true, "promql":"logmetrics_request_endpoint_register and $jp1im_TrendData_labels", "resource_en":{ "category":"HTTP", "label":"request_num_of_endpoint_register", "description":"The request number of endpoint register", "unit":"request" }, "resource_ja":{ "category":"HTTP", "label":"registerへのリクエスト数", "description":"The request number of endpoint register", "unit":"リクエスト" } }, { "name":"logmetrics_num_of_registeredrecord", "default":true, "promql":"logmetrics_num_of_registeredrecord and $jp1im_TrendData_labels", "resource_en":{ "category":"DB", "label":"logmetrics_num_of_registeredrecord", "description":"The number of registered record", "unit":"record" }, "resource_ja":{ "category":"DB", "label":"登録されたレコード数", "description":"The number of registered record", "unit":"レコード" } } ]- Note
-
The storage directory, written code, and file name follow the format of the user-specific metric definition file (metrics_any-Prometheus-trend-name.conf).
- - Sample Prometheus configuration file
-
- File name: jpc_prometheus_server.yml
- Written code
global:
  ... (omitted) ...
scrape_configs:
  - job_name: 'LogMetrics'
    file_sd_configs:
      - files:
        - 'user/user_file_sd_config_logmetrics.yml'
    relabel_configs:
      - target_label: jp1_pc_nodelabel
        replacement: Log trapper(Fluentd)
    metric_relabel_configs:
      - target_label: jp1_pc_nodelabel
        replacement: ControllerLog
      - source_labels: ['__name__']
        regex: 'logmetrics_request_endpoint_register|logmetrics_num_of_registeredrecord'
        action: 'keep'
      - regex: (jp1_pc_multiple_node|jp1_pc_agent_create_flag)
        action: labeldrop
... (omitted) ...
- Note
-
The storage directory and written code follow the format of the Prometheus configuration file (jpc_prometheus_server.yml). You do not have to create a new file. Instead, you add the scrape_configs section for the log metrics feature to the Prometheus configuration file (jpc_prometheus_server.yml) created during installation.
- - Sample user-specific discovery configuration file
-
- File name: user_file_sd_config_logmetrics.yml
- Written code
- targets:
  - HostA:24830
  labels:
    jp1_pc_exporter: logmetrics
    jp1_pc_category: WebAppA
    jp1_pc_trendname: logmetrics1
    jp1_pc_multiple_node: "{__name__=~'logmetrics_.*'}"
    jp1_pc_agent_create_flag: false
- Note
-
The storage directory and written code follow the format of the user-specific discovery configuration file (file_sd_config_any-name.yml).
ControllerLog.log is monitored by the worker whose Fluentd worker ID is 10. Thus, when 24820 is set for port in the Sample log metrics definition file, the port number of the worker monitoring ControllerLog.log is 24820 + 10 = 24830.
- - Sample log metrics definition file
-
- File name: fluentd_WebAppA_logmetrics.conf
- Written code
## Input
<worker 10>
  <source>
    @type prometheus
    bind '0.0.0.0'
    port 24820
    metrics_path /metrics
  </source>
</worker>
## Extract target log message 1
<worker 10>
  <source>
    @type tail
    @id logmetrics_counter
    path /usr/lib/WebAppA/ControllerLog/ControllerLog.log
    tag WebAppA.ControllerLog
    pos_file ../data/fluentd/tail/ControllerLog.pos
    read_from_head true
    <parse>
      @type regexp
      expression /^(?<logtime>[^\[]*) \[(?<loglebel>[^\]]*)\] (?<class>[^\[]*) : endpoint "\/register" started. Target record: (?<record_num>\d[^\[]*).$/
      time_key logtime
      time_format %Y-%m-%d %H:%M:%S
      types record_num:integer
    </parse>
  </source>
  ## Output
  ## Define log metrics 1 and 2
  <match WebAppA.ControllerLog>
    @type prometheus
    <metric>
      name logmetrics_request_endpoint_register
      type counter
      desc The request number of endpoint register
    </metric>
    <metric>
      name logmetrics_num_of_registeredrecord
      type counter
      desc The number of registered record
      key record_num
      <labels>
        loggroup ${tag_parts[0]}
        log ${tag_parts[1]}
      </labels>
    </metric>
  </match>
</worker>
- Note
-
The storage directory and written code follow the format of the log metrics definition file (fluentd_any-name_logmetrics.conf).
- - Sample Fluentd log monitoring target definition file
-
- File name: jpc_fluentd_common_list.conf
- Written code
## [Target Settings]
... (omitted) ...
@include user/fluentd_WebAppA_logmetrics.conf
- Note
-
The storage directory and written code follow the format of the Fluentd log monitoring target definition file (jpc_fluentd_common_list.conf) in JP1/IM - Agent definition files. You do not have to create a new file. Instead, you add the include section for the log metrics feature to the Fluentd log monitoring target definition file (jpc_fluentd_common_list.conf) created during installation.
(m) Web scenario monitoring function
In JP1/IM - Manager and JP1/IM - Agent version 13-10 and later, Web scenario monitoring function is available.#
Web scenario monitoring function is one of the Synthetic metric collectors. It monitors how long a series of user actions takes to play back in a Web browser. The monitoring scope is the HTTP(S) communication of the initial screen and of the series of operations from login to logoff. Based on HTTP(S) communication, it monitors the operation of Web content that issues a large number of requests combining HTML, json, xml, and so on. This enables monitoring from the viewpoint of user operations, which cannot be achieved with the Synthetic metric collector (single HTTP(S) monitoring) provided by Blackbox exporter.
- #
-
If JP1/IM - Manager is upgraded from a version earlier than 13-10 to version 13-10 or later and you want to use Web scenario monitoring function, you must configure the settings for it. For instructions on setting up JP1/IM - Manager, see Setting up the environment variables and Setting up Web exporter in 1.21.2(13)(a) Setting up JP1/IM - Agent in the JP1/Integrated Management 3 - Manager Configuration Guide. Also see Configuring authentication in 1.21.2(13)(a) Setting up JP1/IM - Agent.
■ Prerequisites
To use Web scenario monitoring function, the following prerequisites apply:
-
Prerequisite browser
The following browsers must be installed before you can create and monitor Web scenarios:
-
Google Chrome
-
Microsoft Edge
In addition, the monitoring targets must be accessible from the above browsers.
The above browsers are used both to create Web scenarios and to monitor with them.
-
-
Agent host
We recommend that you create and monitor Web scenarios on the same host.
If you want to migrate a Web scenario file to a different host, you must perform the steps in 1.5.1(9)(c) Migrating Web Scenario Files to another host in the JP1/Integrated Management 3 - Manager Administration Guide.
In addition, Web scenario monitoring function can be used only on agent hosts running Windows with JP1/IM - Agent for Windows installed.
-
Web exporter
The listen port used by Web exporter must be protected, for example, by a firewall or network configuration, so that it is not accessed by anything other than the Prometheus server of JP1/IM - Agent. For the port used by Web exporter, see the explanation of web_exporter command options in Service definition file (jpc_program-name_service.xml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Network configuration
We recommend that you install JP1/IM - Agent on a host whose network is close to the users who use the monitored Web contents, so that monitoring follows the users' Web scenarios. If the network path from JP1/IM - Agent to the monitoring target differs greatly from the path of the users who use the Web contents, it is difficult to detect monitoring errors caused by failures of relay devices.
■ Function List
Web scenario monitoring function monitors the response time of the user's experience by automatically playing back the user's actions on the browser screen and measuring the playing time.
Web scenario monitoring function consists of Web operation information collection function, which collects performance information for Web operation responses based on Web operation scenario, and Web scenario creation support function, which helps create a Web operation scenario (Web scenario).
| Function | Description |
|---|---|
| Web scenario monitoring function | Monitors the Web system based on the collected performance data of Web responses. |
| Web scenario creation support function | Supports you in creating Web operation scenarios. |
| Web scenario creation function | Launches a browser and creates Web scenarios. |
| Web operation information collection function | Collects performance information of Web operation responses based on a Web operation scenario. Provided by Web exporter. |
| Web scenario execution function | Performs Web operations as recorded in the Web scenario. |
| Trace viewer function | Displays trace information used for investigation when an error is detected during Web scenario execution. |
■ Web Scenario Creation Support Function
Web Scenario creation support function launches Web Scenario creation function, which launches a browser and records the user's interactions in the browser as a Web scenario.
- ■Scenarios that can be created
-
You can monitor Web contents using Web scenarios that record the following actions:
-
Operations for displaying the top page
This is just the operation to display the top page. No other operations are required.
-
Logging on from the login screen
Enter the username and password and click the logon button.
-
Clicking the logoff button on the logoff screen
-
■ Web Scenario Creation Function (playwright codegen)
Web Scenario creation function provides the ability to assist in the creation of Web operation scenarios (Web scenarios). Web Scenario creation function uses the OSS Playwright.
- ■Prerequisites
-
When you run the playwright command manually, the current folder must be the Playwright working folder. For Playwright working folders, see Appendix A.4(3) Integrated agent host (Windows).
The command must be run as the built-in Administrator.
- ■Starting Codegen
-
Web Scenario creation function uses Playwright's Codegen.
The user who runs Codegen must be the same user as the user who runs Web exporter.
Use the playwright codegen command to start Web scenario creation function.
The playwright codegen command opens a Web site and generates code corresponding to your operations. Users run it from a terminal.
Recording starts when Web Scenario creation function is activated.
npx playwright codegen --target playwright-test --channel channels --lang locale URL -o ./tests/Web-scenario-filename
Codegen opens two windows: a browser window for interacting with the monitored Web site and a Playwright Inspector window for recording Web scenario code.
When a user runs Codegen and performs operations in the browser, Playwright generates code according to those operations.
For details about the parameters that can be specified in the playwright codegen command, see the following table.
-
npx playwright codegen command options

| Item | Description | Changeability | What you set up in JP1/IM - Agent | JP1/IM - Agent default value |
|---|---|---|---|---|
| -o filename or --output filename | Saves the generated script to a file. | REQ | Specifies the path of the destination Web scenario file relative to the command-execution directory. If omitted, the script is discarded when Codegen terminates and must be copied to a text file by the user. File name rules: the file name must be in the format "string.spec.ts"; it can contain only single-byte alphanumeric characters and underscores (_); the parameter can be at most 256 bytes; folders and files on a network drive cannot be specified (if specified, operation is not guaranteed in the event of a network failure or delay on Windows); file names with a leading hyphen (-) and folder or file names containing environment-dependent characters cannot be specified. If the specified file does not exist, it is created; if it already exists, it is overwritten. For the storage location of the output Web scenario file, see Appendix A.4(3) Integrated agent host (Windows). | ./tests/Web-scenario-filename |
| --target language | Selects the language for generating the script. | -- | None | None |
| --channel channels | Specifies the distribution channel for Chromium. | REQ | Specify the browser for executing Codegen: "chrome" for Google Chrome, or "msedge" for Microsoft Edge. | None |
| --lang locale | Specifies the language and locale. Example: "ja-JP" | REQ | One of "en-US", "ja-JP", "zh-CN", or "th", matching the language code at the time of the test run. If omitted, the Web scenario is generated with a language code that differs from the one at test time, and a Web scenario that would otherwise succeed may fail. | None |
| --proxy-server proxy | Specifies the proxy server. Examples: "http://myproxy:3128", "socks5://myproxy:8080" | Y | Specifies the proxy used for requests. Specify the entire domain with up to 253 alphanumeric characters. | None |
| --proxy-bypass bypass | Specifies a comma-separated list of domains that bypass the proxy. Example: ".com,chromium.org,.domain.com" | Y | Specifies the domains for proxy bypass, up to 253 alphanumeric characters. | None |
| URL | Specifies the URL to be monitored. | Y | Specify the entire domain with up to 253 alphanumeric characters, in the format protocol://hostname:port-number | None |

- Legend:
-
REQ: Required setting, Y: Changeable, --: Not applicable
-
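For example, the following command line (the host name and output file name are hypothetical) starts Codegen with Microsoft Edge and the en-US locale, and saves the recorded Web scenario to a file named login_scenario.spec.ts:

```
npx playwright codegen --target playwright-test --channel msedge --lang "en-US" http://www.example.com:8080/login -o ./tests/login_scenario.spec.ts
```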
- ■Recording screen operations
-
When recording operations such as mouse clicks, text entry, and HTML operations, perform the operations that you want to record in the browser window while recording is in progress. As you operate, Web scenario code is generated in the Playwright Inspector window.
The following table shows the browser operations that can be recorded and measured as Web scenarios.
Table 3‒38: Browser operations that can be recorded and measured as Web scenarios

| Classification | Operation | Operable | Record | Remarks | Sample code recorded by Codegen |
|---|---|---|---|---|---|
| Mouse operation | -- | -- | -- | The mouse operation itself is not recorded, but button operations and other operations caused by the mouse are recorded. | -- |
| Mouse operation | Click | Y | Y | -- | await page.getByRole('button', { name: 'Login' }).click(); |
| Mouse operation | Double click | Y | Y | -- | await page.getByRole('button', { name: 'Clear' }).dblclick(); |
| Mouse operation | Sub button click | Y | Y | -- | await page.locator('body').click({ button: 'right' }); |
| Keyboard operation (key entry) | -- | -- | -- | -- | -- |
| Keyboard operation (key entry) | Entering characters | Y | Y | The value being input is reflected in real time. The entered value is recorded as an HTML operation, etc. | await page.locator('input[name="username"]').fill('username'); |
| Keyboard operation (key entry) | Shortcut key input | -- | -- | This is the same as browser operation. | This is the same as browser operation. |
| Keyboard operation (key entry) | Accelerator key input | -- | -- | This is the same as browser operation. | This is the same as browser operation. |
| Keyboard operation (key entry) | Other key input | -- | -- | This is the same as browser operation. | This is the same as browser operation. |
| Browser operation | -- | -- | -- | -- | -- |
| Browser operation | Move to next item [Tab] | Y | Y | Only recorded if an element in the HTML is selected. | await page.locator('body').press('Tab'); |
| Browser operation | Move to previous item [Shift]+[Tab] | Y | Y | Only recorded if an element in the HTML is selected. | await page.locator('body').press('Shift+Tab'); |
| Browser operation | Go to next page [Alt]+[→] | Y | Y | Keyboard actions are disabled. Page transitions are recorded. | await page.goto('URL'); |
| Browser operation | Go to previous page [Alt]+[←], [BackSpace] | Y | Y | Keyboard actions are disabled. Page transitions are recorded. | await page.goto('URL'); |
| Browser operation | Context menu display [Right-click], [Shift]+[F10] | Y | Y | -- | await page.locator('body').press('Shift+F10'); |
| Browser operation | Scroll up [↑] | Y | Y | -- | await page.locator('body').press('ArrowUp'); |
| Browser operation | Scroll down [↓] | Y | Y | -- | await page.locator('body').press('ArrowDown'); |
| Browser operation | Page up scroll [PgUp] | Y | Y | -- | await page.locator('body').press('PageUp'); |
| Browser operation | Page down scroll [PgDn] | Y | Y | -- | await page.locator('body').press('PageDown'); |
| Browser operation | Go to top of page [Home] | Y | Y | -- | await page.locator('body').press('Home'); |
| Browser operation | Go to end of page [End] | Y | Y | -- | await page.locator('body').press('End'); |
| Browser operation | Stop operation [Esc] | Y | Y | -- | await page.locator('body').press('Escape'); |
| Browser operation | Link click [Enter], [Click] | Y | Y | This is the same as the HTML link operation. Page transitions are recorded. | This is the same as the HTML link operation. |
| Browser operation | Multiple selection operation [Ctrl]+[Click] | Y | Y | -- | await page.getByRole('listbox').selectOption(['apple', 'banana', 'orange']); |
| Browser operation | Cut [Ctrl]+[X] | Y | Y | -- | await page.locator('body').press('Control+x'); |
| Browser operation | Copy [Ctrl]+[C] | Y | Y | -- | await page.locator('body').press('Control+c'); |
| Browser operation | Paste [Ctrl]+[V] | Y | Y | -- | await page.locator('body').press('Control+v'); |
| Browser operation | Select all [Ctrl]+[A] | Y | Y | -- | await page.locator('body').press('Control+a'); |
| Dialog operation | -- | -- | -- | The operation itself may not be recorded, but page transitions are recorded. | await page.goto('URL'); |
| Dialog operation | Text input | Y | #1 | -- | -- |
| Dialog operation | Key operation | Y | #1 | -- | -- |
| Dialog operation | Other input items | Y | #1 | -- | -- |
| HTML operation | -- | -- | -- | Records operations related to input operations and page transitions. | -- |
| HTML operation | Link operation | Y | Y | -- | -- |
| HTML operation | INPUT TEXT (text entry) | Y | Y | -- | await page.getByLabel('Name (4 to 8 characters):').fill('test'); |
| HTML operation | INPUT PASSWORD (password entry) | Y | Y | -- | await page.getByLabel('password (8 characters or more):').fill('pwdtest1'); |
| HTML operation | INPUT CHECKBOX | Y | Y | -- | await page.getByRole('checkbox').check(); |
| HTML operation | INPUT RADIO | Y | Y | -- | await page.getByLabel('apple').check(); |
| HTML operation | INPUT SUBMIT | Y | Y | -- | await page.getByRole('button', { name: 'Send' }).click(); |
| HTML operation | INPUT RESET | Y | Y | -- | await page.getByRole('button', { name: 'Reset Form' }).click(); |
| HTML operation | INPUT BUTTON | Y | Y | -- | await page.getByRole('button', { name: 'test' }).click(); |
| Script operation | -- | -- | -- | Scripts without page transitions, HTML operations, or button operations are not recorded, but page transitions, HTML operations, and button operations that are caused by script operations are recorded. #2 | -- |
| Script operation | Page transition operation | Y | Y | Operations implemented inside the script may not be recorded, but page transitions are recorded. | -- |
- Legend
-
Operable column Y: Can be operated, --: Not applicable
Record column Y: Recorded, --: Not applicable
Other columns --: Not applicable
- #1
-
The data is recorded based on the values entered in the dialog or the results of button operations. However, some dialogs may not be recorded correctly. Be sure to run the Web scenario after creating it to confirm that it runs correctly.
Dialogs that prevent the Web scenario from running correctly (for example, dialogs that remain open and stop the scenario) cannot be handled.
- #2
-
Depending on the timing of page transitions, HTML operations, and button operations caused by scripting, recording may not be possible.
Be sure to run the Web scenario after creating it to confirm that it runs correctly.
Operations or behaviors not described in the above table cannot be recorded or measured as Web scenarios.
Note that dialog authentication (other than user ID and password) and operations inside ActiveX controls are not supported. FTP is also not supported.
- ■Record of assertion
-
Assertion is an operation that checks whether the elements displayed on a Web site match the expected content. When you run Codegen to create a Web scenario, you can click an element displayed in the browser window to add an assertion to the Web scenario. The assertion determines whether the element displayed in the browser window when the Web scenario is run matches the element displayed when Codegen was run.
The following types of assertions are available:
-
assert visibility
Assert that the element exists.
-
assert text
Asserts that the element contains certain text.
-
assert value
Asserts that an element has a specific value.
If you want to add an assertion to a Web scenario, click one of the assert visibility, assert text, or assert value buttons, and then select the element to be asserted in the browser window. An assertion is generated for the selected element in the Playwright Inspector window.
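As an illustration, the following lines show the kind of code each assertion type generates in the Playwright Inspector window. The element names and values are hypothetical and will differ for your Web contents.

```typescript
// assert visibility: the element exists and is visible
await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
// assert text: the element contains the specified text
await expect(page.locator('#status')).toContainText('Ready');
// assert value: the element has the specified value
await expect(page.locator('input[name="username"]')).toHaveValue('user1');
```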
-
- ■Pausing Recording
-
If you want to pause recording, click the Record button. Clicking the Record button again resumes recording.
- ■Saving the generated Web scenario code
-
When you exit Web scenario creation function, the generated Web scenario is saved to the Web scenario file specified in the command-line options when Web scenario creation function was started.
- ■Exiting Codegen
-
To exit Web scenario creation function, press Ctrl+C in the terminal where the playwright codegen command was executed, or close the browser window that was opened when Web scenario creation function started.
- ■Codegen Window Structure
-
When you use the playwright codegen command to execute Web scenario creation function, the following windows are displayed.
-
Browser window
The Web site where you want to run the scenario is displayed. Clicks and typing operations are recorded as you navigate the Web pages.
The following table shows the buttons used to record Web page operations. For details, see the Playwright documentation.

| Item number | Button | How to operate |
|---|---|---|
| 1 | -- | Drag this button to move the tab. |
| 2 | Record | Click this button to stop or resume recording. |
| 3 | assert visibility | Click this button, and then select the element that you want to assert is visible. Click this button again to return to normal operation recording. |
| 4 | assert text | Click this button, and then select the element for which you want to assert that it contains specific text. Click this button again to return to normal operation recording. |
| 5 | assert value | Click this button, and then select the element for which you want to assert that it has a specific value. Click this button again to return to normal operation recording. |
-
Playwright Inspector window
Allows you to record Web scenarios.
The following table shows the buttons used to record Web scenarios. For details, see the Playwright documentation.

| Item number | Button | How to operate |
|---|---|---|
| 1 | Record | Same as the browser window. |
| 2 | assert visibility | Same as the browser window. |
| 3 | assert text | Same as the browser window. |
| 4 | assert value | Same as the browser window. |
-
■Notes
-
A Web scenario created by Web scenario creation function cannot determine the HTTP status code. Therefore, even if "404 Not Found" or "500 Internal Server Error" is returned, the Web operation may be determined to be successful.
-
When you use Web scenario creation function to verify Web page transitions, you cannot detect whether a page transition succeeded or failed with a Web scenario such as the following:
<Example of operation to check>
On the integrated operation viewer login page (URL: 'http://hostname:20703/login'), enter your registered username and password to verify that you can log in successfully.
<Part of the code that Codegen writes to the Web scenario>

```typescript
test('test', async ({ page }) => {
  await page.goto('http://hostname:20703/login');
  await page.locator('input[name="username"]').fill('username');
  await page.locator('input[name="password"]').fill('password');
  await page.getByRole('button', { name: 'Login' }).click();
});
```

When the above Web scenario is played back, the click on the Login button is played, and playback terminates regardless of whether the post-login screen is displayed. Therefore, the success or failure of the page transition cannot be detected.
The following are the actions to be taken to verify a successful page transition:
Use assertion of Codegen to add an action that asserts that the page displays its own elements after the page transition.
For details on how to record assertions, see ■Record of assertion.
Here is the above example modified:
<Part of the code that Codegen writes to the Web scenario>

```typescript
test('test', async ({ page }) => {
  await page.goto('http://hostname:20703/login');
  await page.locator('input[name="username"]').fill('username');
  await page.locator('input[name="password"]').fill('password');
  await page.getByRole('button', { name: 'Login' }).click();
  await expect(page.getByRole('button', { name: 'Logout' })).toBeVisible();
});
```

In the modified Web scenario above, an operation was added that asserts that the page displayed after the login process shows the Logout button that should appear on that page. If the assertion on the Logout button fails, the scenario can detect that the page transition after login failed.
-
When URL transitions are recorded in Codegen, if access to the specified URL is transferred (redirected#) to another URL, the URL transition to the redirection source may not be recorded; only the URL transition to the redirection destination may be recorded.
If Codegen records operations that are redirected to a different URL by the server of the specified URL, the monitoring scope cannot include the redirection from the redirection source to the destination.
- #
-
Refers to redirection performed by the HTTP protocol using an HTTP status code (in the 300 range) and the Location header field.
The following examples show cases where the URL transition to the redirection source is not recorded and only the URL transition to the redirection destination is recorded.
-
When the server redirects from a URL to a new URL due to, for example, the relocation of a monitored site
-
When a forward slash (/) is missing at the end of the URL specified in Codegen and the server automatically adds it and redirects to the correct URL
Note that redirects that are not performed by the HTTP protocol, such as the following, do not fall under this precaution; the URL transitions of both the redirection source and the redirection destination are recorded.
-
HTML redirection using the HTML <meta> element
-
JavaScript redirection executed by setting a URL string in the window.location property from a client script such as JavaScript
■ Web operation information collection function (Web exporter)
Web operation information collection function (Web exporter) executes the scenario in the Web scenario file created beforehand, triggered by a scrape request from the Prometheus server, and returns the execution result as the scrape result. Detailed behavior during scenario execution is output as a trace and can be viewed by the user with the trace viewer function.
- ■Acquisition items
-
The metrics that can be retrieved with Web exporter (Web operation information collection function) are probe_webscena_success (whether the probe was successful#1) and probe_webscena_duration_seconds (the number of seconds taken by the Web scenario probe#2).
- #1
-
Signifies the success or failure of the entire collection, including preparation for collection (such as process startup).
- #2
-
If the collection fails, the metric may not be retrieved.
Web exporter retrieval items are defined in metric definition file (metrics_web_exporter.conf) of Web exporter. For details, see Web exporter Metric Definition File (metrics_web_exporter.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
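For reference, the following PromQL expressions are a minimal sketch of how these metrics might be used as alert conditions; the 30-second threshold is a hypothetical value, not a product default.

```
# Web scenario probe failed
probe_webscena_success == 0

# Web scenario took longer than 30 seconds (hypothetical threshold)
probe_webscena_duration_seconds > 30
```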
- ■Monitoring when a monitoring target is temporarily stopped
-
To suppress error detection during a power outage or maintenance of the monitoring target, stop collecting operation information for that target.
You can stop the collection of operation information by deleting the applicable monitoring target from targets in the Web exporter discovery configuration file (jpc_file_sd_config_web.yml). For details, see Web exporter discovery configuration file (jpc_file_sd_config_web.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
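As an illustration only, assuming the discovery configuration file follows the standard Prometheus file_sd format, deleting (or commenting out) an entry under targets stops collection for that target. The host names below are hypothetical:

```yaml
- targets:
    - 'webapp1.example.com'     # collected
  # - 'webapp2.example.com'     # deleted/commented out: collection stopped
```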
■ Web scenario execution function
Playwright is used to perform Web scenario execution function.
The Playwright configuration file specifies the parameters for Web scenario execution function.
For Playwright configuration file, see Playwright configuration file (jpc_playwright.config.ts) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
- ■Trace
-
Web scenario execution function outputs a trace during Web scenario execution to a trace file in the following locations. The trace shows the results of the operations performed in the Web scenario file and the HTTP communication.
-
For physical hosts
Agent-path\logs\web_exporter\trace\Web-scenario-filename-test-project-name-number-of-retries_generation-number\trace.zip
-
For logical hosts
shared-folder\jp1ima\logs\web_exporter\trace\Web-scenario-filename-test-project-name-number-of-retries_generation-number\trace.zip
- Web-scenario-filename
-
If the Web scenario file name ends with ".spec.ts", the name without ".spec.ts" is used.
- project-name
-
The character string specified in name parameter of Playwright configuration file is set.
Spaces, control characters, and the following characters are converted to a hyphen (-).
! " # $ % & ' ( ) * + , . / : ; < = > ? @ [ \ ] ^ _ { | } ~
- number-of-retries
-
Used if the retries parameter of the Playwright configuration file is 1 or more. retry1, retry2, retry3, ... is set according to the number of retries when Web scenario execution fails.
The "-number-of-retries" part is added only when retrying. Therefore, it is not added to the trace of the first run of the Web scenario in each scrape.
- generation-number
-
A 4-digit number is set.
For the number of generations of traces to be saved, see tracenum in Web exporter configuration file (jpc_web_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
The file size of a trace is a few MB (it varies greatly depending on the monitored content). For a Web scenario that logs in to and logs off from the Intelligent Integrated Management Base, it is approximately 2 MB per scenario. If 2,000 generations with 0 retries are retained at 6-minute intervals (the defaults), approximately 4 GB of disk space is required.
-
- ■Trace Viewer
-
Web exporter trace file can be referenced in the trace viewer.
The trace viewer is used to investigate the details when an error is detected.
For details about the trace viewer, see 3.15.1(1)(n) Trace Viewer Function (playwright show-trace).
■ Monitoring with Other Monitoring Function
Web scenario monitoring function allows you to monitor Web contents from the user's point of view, but it does not allow detailed monitoring of HTTP communication (such as name-resolution times or certificate expiration) or monitoring of the inside of the monitoring target. Therefore, if an error occurs, you cannot investigate its cause using only the metric information acquired by Web scenario monitoring function.
For example, also monitor HTTP communication using the Blackbox exporter and monitor the inside of the monitored systems (HTTP servers and DB servers) using the log trapper of Fluentd.
■ Handling of Public Key Infrastructure (PKI) Certificates Used in TLS Communication
If the monitoring target is an HTTPS server, register the following certificates in the OS (for Windows, register them in the certificate store).
-
CA certificate of the certificate authority that issued the server certificate
-
Client certificate and private key (if the HTTPS server requires a client certificate during the TLS handshake)
For details about how to register them in the OS, see the documentation for your OS.
■ Web Scenarios for HTTP Authentication with Passwords
If the monitored Web contents require HTTP authentication with a username and password (such as Basic authentication), enter the username and password in the URL field of Web scenario creation function in the following format:
http://username:password@domain-name:port/Web-content-path
■ Handling passwords
Passwords used for HTTP authentication and for Web contents' own authentication (when a username and password are entered on a form) are stored in the Web exporter configuration file, the Web scenario file, and the trace file. When providing this information to others for failure investigation, mask the password part, for example by replacing it with a different character string, to prevent leakage.
■ Configuring HTTP Proxies
To use an HTTP proxy server for communication from the JP1/IM - Agent host to the monitoring target, set the "proxy" item in the Playwright configuration file (jpc_playwright.config.ts).
For details about Playwright configuration file, see Playwright configuration file (jpc_playwright.config.ts) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
For details on configuration file editing procedure, see To edit the configuration files (for Windows) in 1.19.3(1)(a) Common way to setup in the JP1/Integrated Management 3 - Manager Configuration Guide.
■ About Reviewing Web scenarios
Web scenario monitoring function does not provide the ability to test a Web scenario independently. Actually monitor with the Web scenario and make sure that the monitoring is successful. Refer to the probe_webscena_success metric to determine whether monitoring is normal.
■ Timeout Settings and User Tasks When Timeout Occurs
Web scenario creation support function uses timeouts to interrupt collection of operation information (execution of Web scenarios) that takes too long.
The following parameters relate to timeouts:
| Setting location | Parameter name |
|---|---|
| Prometheus configuration file (jpc_prometheus_server.yml)#1 | scrape_timeout (timeout period required for a scrape) |
| web_exporter command options#2 | --timeout-offset (the number of seconds subtracted from the Prometheus scrape_timeout value, that is, the offset subtracted from the timeout time). It is fixed at 0.5 seconds and cannot be changed by the user. |
- #1
-
For details about Prometheus configuration file parameters, see Prometheus configuration file (jpc_prometheus_server.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
- #2
-
For details about the options of the web_exporter command, see Service definition file (jpc_program-name_service.xml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
The timeout setting is always applied, and collection is interrupted when the timeout period is exceeded. As a result, collection never continues indefinitely.
The timeout period is the scrape_timeout value minus --timeout-offset (0.5 seconds).
This timeout period must include the execution time of the processing required to collect the operation information. Collection of operation information includes operating the Web contents (browser operations) according to the Web scenario.
In practice, we recommend that you set a timeout about 30 seconds longer than the time the actual Web scenario takes to run, because browsers and other processes need time to start.
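For example, if the Web scenario itself takes about 60 seconds to play back, the following excerpt of the Prometheus configuration file is a minimal sketch of the relevant settings; the values shown are illustrative, not definitive defaults:

```yaml
scrape_configs:
  - job_name: jpc_web_probe
    scrape_interval: 6m    # collect every 6 minutes
    scrape_timeout: 90s    # about 60s of scenario playback plus a 30s margin;
                           # the effective limit is 90s minus the fixed 0.5s --timeout-offset
```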
If processing is aborted due to a timeout, one or more of the following messages are output to the logs. In this case, the probe_webscena_success metric may be 0 (failed), or the metric may not be sent. Check the following log files to see whether processing was aborted by a timeout.
| Log file | Message |
|---|---|
| web_exporter log | KNBC20144-E An error occurred while an internal command was executing. (maintenance information = exit status 1) |
| web_exporter log | KNBC20147-E An error occurred while an internal command was executing. (message = Test timeout of milliseconds exceeded., ...) |
| Prometheus server log | msg="Scrape failed" err="Get URL: context deadline exceeded" |
Even if a timeout occurs, the child processes that were started are terminated, so the user does not need to terminate them.
■ Notes
-
The following monitoring cannot be performed using Web scenario monitoring function:
-
Monitoring of Web contents that cannot be used with the browsers supported by JP1/IM - Agent
-
Monitoring of Web contents that behave differently from when the Web scenario was created
-
Monitoring HTTP status codes
-
Monitoring Web sites using external authentication providers for authentication
-
-
If a timeout occurs during the collection of operation information, the browser process may remain. In this case, the user must stop the applicable process. For details, see ■Timeout Settings and User Tasks When Timeout Occurs.
(n) Trace Viewer Function (playwright show-trace)
The Trace viewer function provides a visual overview of the operations recorded in the trace during Web scenario execution.
- ■Prerequisites
-
When the user runs the playwright command manually, the current folder must be the Playwright working folder. For Playwright working folders, see Appendix A.4(3) Integrated agent host (Windows).
Run the command as a user with Administrator permissions (run it from the administrator console if the Windows UAC function is enabled).
You can use the playwright show-trace command to perform the trace viewer function.
The playwright show-trace command displays the trace viewer. Users run it from a terminal.
- ■Run Web Scenarios to log tracing
-
To log traces when running Web scenarios, you must specify "on" in the trace option of the Playwright configuration file (jpc_playwright.config.ts) so that a trace is recorded for every test run.
For the format and options of Playwright configuration file, see Playwright configuration file (jpc_playwright.config.ts) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
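The following excerpt is a minimal sketch of this setting, assuming the configuration file follows the standard Playwright configuration format; only the trace setting is shown:

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    trace: 'on',   // record a trace for every test run
  },
});
```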
- ■Open the trace
-
You can run the following command to display, in the trace viewer, the trace at the path specified in the command options.
Run the command as a user with Administrator's permissions (if the Windows UAC function is enabled, run the command from the administrator console).
npx playwright show-trace trace-file-path
For details about the parameters that you can specify for the playwright show-trace command, see the following table.
-
npx playwright show-trace command option

| Item | Description | Changeability | What you set up in JP1/IM - Agent | JP1/IM - Agent default value |
|---|---|---|---|---|
| Path to a trace file | Specifies the trace file to be displayed in the trace viewer. | Y | Specifies the path to the output trace file. If it is not specified, drag and drop the trace file onto the displayed HTML page to display the trace. | None |
- Legend:
-
Y: Changeable
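For example, the following command line opens a trace in the trace viewer; the trace file path is hypothetical, following the naming rules described under ■Trace:

```
npx playwright show-trace ..\logs\web_exporter\trace\login_scenario-test-chromium_0001\trace.zip
```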
In the trace viewer, you can see the following information:
-
Action
On the Action tab, you can see which locator was used for each action and how long each action took to execute.
To check how the DOM snapshot changes, hover over the corresponding action in the Web scenario.
If you are investigating or debugging, move the time axis forward or backward and click the action you want to review.
Use the Before and After tabs to see the differences before and after the actions.
-
Screenshots
Screenshots are recorded in the trace and displayed as thumbnail images in chronological order at the top of the trace viewer. You can hover over a thumbnail image to display an enlarged image of each action and state.
Double-click an action to view the time at which the action was executed. When you select multiple actions using the sliders on the timeline, they appear on the Action tab, and you can filter the log to view only the selected actions.
-
Snapshot
By default, tracing is performed with the snapshot option turned off.
If you want to use this function, you must specify true for the snapshots parameter of the Playwright configuration file. For Playwright configuration file, see Playwright configuration file (jpc_playwright.config.ts) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can switch the tabs in the center of the screen to see the types of snapshots listed in the following table.
| Type | Description |
|---|---|
| Action | Snapshot at the moment of the input that was executed. Use this type of snapshot to see exactly where Playwright clicked. |
| Before | Snapshot at the time the action was invoked. |
| After | Snapshot after the action. |
-
Source
When you hover over an action in a Web scenario, the code line for that action is highlighted in the Source tabbed page.
-
Call
The Call tab shows the execution time and the locators that were used.
-
Log
Use this tab to view a log of actions, such as scrolling, waiting for an element to appear, become enabled, and become stable, clicking, and filling in values.
-
Error
If Web scenario execution fails, an error message is displayed on the Error tabbed page. The timeline also displays a red line to indicate where the error occurred.
To check the source code line, select Source tabbed page.
-
Console
Browse the console logs for browser and Web scenario runs.
-
Network
The Network tab shows the network requests that were made during the Web scenario.
You can sort the requests by Name, Method, Status, Content Type, Duration, or Size.
Click a request to view information about it, such as the request header, response header, request body, and response body.
If you want to use this function, you must specify true for the snapshots parameter of the Playwright configuration file. For the Playwright configuration file, see Playwright configuration file (jpc_playwright.config.ts) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
-
Metadata
The Metadata tab next to the Action tab provides detailed information about the Web scenario execution, such as the browser, viewport size, and runtime.
start time shows the time when the Web scenario was started. The time displayed in the trace is the date and time of the JP1/IM - Agent host, displayed in "YYYY/MM/DD hh:mm:ss" format. Even if the time zone of the monitored host differs from that of the JP1/IM - Agent host, the date and time of the JP1/IM - Agent host is used.
-
- ■Close the trace
-
To exit the trace viewer, press Ctrl+C in the terminal where the playwright show-trace command was executed, or close the trace viewer window.
(o) VMware exporter (VMware performance data collection capability)
VMware exporter is an Exporter for Prometheus that retrieves performance data from VMware ESXi.
■ Prerequisites
The ports used by VMware exporter must be protected by firewalls, networking configurations, and so on, so that they are not accessed by anything other than the Prometheus server of JP1/IM - Agent.
For the port used by VMware exporter, see vmware_exporter command options in Service definition file (jpc_program-name_service.xml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Conditions to be monitored
-
VMware vCenter Server is not monitored.
-
VMware exporter target is VMware ESXi. For details about the supported VMware ESXi versions, see the Release Notes.
-
The name of the datastore# managed by VMware ESXi must be the same as the host name. If the datastore name and the host name differ, separate nodes are created for the datastore and the hypervisor, and the available metrics are split between them.
When nodes are divided into datastores and hypervisors, the metrics that can be retrieved for each node are as follows.
-
Data store
vmware_host_size, vmware_host_used, vmware_host_free, vmware_datastore_used_percent
-
Hypervisor
Metrics for hosts, except: vmware_host_size, vmware_host_used, vmware_host_free, vmware_datastore_used_percent
For details about each metric and its description, see VMware exporter metric definition file for host (metrics_vmware_exporter_host.conf) and VMware exporter metric definition file for VM (metrics_vmware_exporter_vm.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
#: If there is more than one datastore, use the name "host-name_any-string".
-
-
Do not use duplicate VM names in the VMs managed by VMware ESXi. If VM names are duplicated, the same node is displayed with multiple monitoring results. Therefore, be sure to set each VM name to a unique name.
■ Acquisition items
The VMware exporter shipped with JP1/IM - Agent has the metrics defined by the VMware exporter defaults.
VMware exporter retrieval items are defined in the metric definition file for host and the metric definition file for VM of VMware exporter. For details, see VMware exporter metric definition file for host (metrics_vmware_exporter_host.conf) and VMware exporter metric definition file for VM (metrics_vmware_exporter_vm.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
Metrics are obtained by using the OSS pyVmomi and the metrics officially provided by VMware vRealize Operations. The vRealize Operations metrics that are used are listed in the following table.
|
Metric Name |
Category |
Description |
Label |
Data source |
|---|---|---|---|---|
|
vmware_datastore_capacity_size |
DATASTORES |
VMware Datastore capacity in bytes (Unit:B) |
dc_name : data-center-name ds_name : datastore-name instance : data-retrieval-address job : job-name |
Get by pyVmomi vmware_datastore_capacity_size of datastore structure |
|
vmware_datastore_freespace_size |
DATASTORES |
VMware Datastore freespace in bytes (Unit:B) |
dc_name : data-center-name ds_name : datastore-name instance : data-retrieval-address job : job-name |
Get by pyVmomi vmware_datastore_freespace_size of datastore structure |
|
vmware_host_num_cpu |
HOSTS |
VMware Number of processors in the Host |
dc_name : data-center-name host_name : host-name instance : data-retrieval-address job : job-name |
Get by pyVmomi vmware_host_num_cpu of hosts structure |
|
vmware_host_memory_usage |
HOSTS |
VMware Host Memory usage in Mbytes (Unit:MB) |
dc_name : data-center-name host_name : host-name instance : data-retrieval-address job : job-name |
Get by pyVmomi vmware_host_memory_usage of hosts structure |
|
vmware_host_memory_max |
HOSTS |
VMware Host Memory Max availability in Mbytes (Unit:MB) |
dc_name : data-center-name host_name : host-name instance : data-retrieval-address job : job-name |
Get by pyVmomi vmware_host_memory_max of hosts structure |
|
vmware_host_mem_vmmemctl_average |
HOSTS |
The total amount of memory currently used for virtual machine memory control. (Unit:KB) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi mem.vmmemctl.average of performance counters |
|
vmware_vm_mem_swapped_average |
VMS |
The amount of unreserved memory in kilobytes. (Unit:KB) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi mem.swapped.average of performance counters |
|
vmware_host_net_bytesRX_average |
HOSTS |
Average amount of data received per second. (Unit:KBps) |
dc_name : data-center-name host_name : host-name |
Get by pyVmomi vmware_host_net_bytesRX_average of performance counters |
|
vmware_host_net_bytesTX_average |
HOSTS |
Average amount of data transferred per second. (Unit:KBps) |
dc_name : data-center-name host_name : host-name |
Get by pyVmomi vmware_host_net_bytesTX_average of performance counters |
|
vmware_vm_mem_active_average |
VMS |
The amount of memory that is being used effectively. (Unit:KB) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi mem.active.average of performance counters |
|
vmware_vm_guest_disk_capacity |
VMGUESTS |
Disk capacity metric per partition (Unit:B) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi vmware_vm_guest_disk_capacity of vmguests structure |
|
vmware_vm_guest_disk_free |
VMGUESTS |
Disk metric per partition (Unit:B) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi vmware_vm_guest_disk_free of vmguests structure |
|
vmware_vm_mem_vmmemctl_average |
VMS |
The total amount of memory currently used for virtual machine memory control. (Unit:KB) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi mem.vmmemctl.average of performance counters |
|
vmware_vm_mem_consumed_average |
VMS |
The amount of host memory consumed by the virtual machine for guest memory. (Unit:KB) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi mem.consumed.average of performance counters |
|
vmware_vm_net_transmitted_average |
VMS |
The average amount of data transferred per second. (Unit:KBps) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi net.transmitted.average of performance counters |
|
vmware_vm_net_received_average |
VMS |
The average amount of data received per second. (Unit:KBps) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi net.received.average of performance counters |
|
vmware_vm_power_state |
VMS |
VMWare VM Power state (On / Off) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi vmware_vm_power_state of vms structure |
|
vmware_host_cpu_used_summation |
HOSTS |
Used CPU (Unit:msec) |
dc_name : data-center-name host_name : host-name instance : data-retrieval-address job : job-name |
Get by pyVmomi cpu.used.summation of performance counters |
|
vmware_vm_cpu_ready_summation |
VMS |
Time spent in VMware host ready state. (Unit:msec) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi cpu.ready.summation of performance counters |
|
vmware_vm_num_cpu |
VMS |
VMWare Number of processors in the virtual machine |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi vmware_vm_num_cpu of vms structure |
|
vmware_vm_memory_max |
VMS |
VMWare VM Memory Max availability in Mbytes (Unit:MB) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi vmware_vm_memory_max of vms structure |
|
vmware_vm_max_cpu_usage |
VMS |
VMWare VM Cpu Max availability in hz (Unit:hz) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi vmware_vm_max_cpu_usage of vms structure |
|
vmware_vm_template |
VMS |
VMWare VM Template (true / false) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi vmware_vm_template of vms structure |
|
vmware_host_cpu_usage_average |
-- |
Average CPU usage |
dc_name : data-center-name host_name : host-name instance : data-retrieval-address job : job-name |
Get by pyVmomi cpu.usage.average of performance counters |
|
vmware_host_disk_write_average |
-- |
The amount of data written to disk during the performance interval. (Unit:KBps) |
dc_name : data-center-name host_name : host-name instance : data-retrieval-address job : job-name |
Get by pyVmomi disk.write.average of performance counters |
|
vmware_host_disk_read_average |
-- |
The amount of data read during the performance interval. (Unit:KBps) |
dc_name : data-center-name host_name : host-name instance : data-retrieval-address job : job-name |
Get by pyVmomi disk.read.average of performance counters |
|
vmware_vm_cpu_usage_average |
-- |
Average CPU usage |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi cpu.usage.average of performance counters |
|
vmware_vm_disk_write_average |
-- |
The amount of data written to disk during the performance interval. (Unit:KBps) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi disk.write.average of performance counters |
|
vmware_vm_disk_read_average |
-- |
The amount of data read during the performance interval. (Unit:KBps) |
dc_name : data-center-name ds_name : datastore-name host_name : host-name instance : data-retrieval-address job : job-name vm_name : virtual-machine-name |
Get by pyVmomi disk.read.average of performance counters |
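As a reference only, the following PromQL expressions sketch how the metrics above can be combined; they are illustrative queries, not expressions predefined in the product:

```
# Host memory usage ratio (%)
100 * vmware_host_memory_usage / vmware_host_memory_max

# Datastore free-space ratio (%)
100 * vmware_datastore_freespace_size / vmware_datastore_capacity_size
```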
■ Obfuscation of VMware exporter passwords
The VMware exporter shipped with JP1/IM - Agent manages the passwords for accessing VMware ESXi from VMware exporter by using the secret obfuscation function. For details, see 3.15.10 Secret obfuscation function.
(p) Windows exporter (Hyper-V monitoring function)
Hyper-V monitoring function monitors Hyper-V activity using the hyperv collector of Windows exporter.
■ Prerequisites
The port used by the Hyper-V monitoring function must be protected by a firewall or network configuration so that it cannot be accessed by anyone other than the Prometheus server of JP1/IM - Agent.
For details about the ports used by Hyper-V monitoring function, see the explanation of windows_exporter command options (Hyper-V monitoring) in Service definition file (jpc_program-name_service.xml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Conditions to be monitored
-
For details about the versions of Hyper-V that Hyper-V monitoring function supports as targets, see the Release Notes.
-
The following rules apply to VM naming:
-
The VM name must be the same as the host name of the guest OS.
-
Do not set a VM name containing a hyphen (-).
-
The name of the disk managed by Hyper-V must be the same as the VM name.
If the disk name and the VM name differ, the following metrics cannot be displayed for the VM:
- hyperv_vm_device_written
- hyperv_vm_device_read
For details about the individual metrics, see "Acquisition items" below.
If there are multiple disks, use the name "host-name_any-string".
-
-
If a VM is moved from a monitored host, for example by live migration, that VM can no longer be monitored. You can monitor the VM at the destination by making it a monitoring target.
-
Data is not collected for VMs that have never been started, and no nodes are created for them. Therefore, the tree must be updated when a VM is started for the first time.
-
Only the VMs of the host on which JP1/IM - Agent resides can be monitored. VMs in nested configurations are not monitored.
■ Acquisition items
Hyper-V monitoring function obtains the Hyper-V metrics defined by the Windows exporter (Hyper-V monitoring) defaults.
Windows exporter (Hyper-V monitoring) retrieval items are defined in the metric definition file for host and the metric definition file for VM of Windows exporter (Hyper-V monitoring). For details, see Windows exporter (Hyper-V monitoring) metric definition file (metrics_windows_exporter_hyperv_host.conf) and Windows exporter (Hyper-V monitoring) metric definition file for VM (metrics_windows_exporter_hyperv_vm.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
The following table lists the metrics that can be specified in a PromQL expression in the definition files. For details about "Collector" in the table, see the description under ■Collector below.
|
Metric Name |
Collector |
Contents to be acquired |
Type |
Label |
|---|---|---|---|---|
|
windows_hyperv_vm_cpu_total_run_time |
hyperv |
The time spent by the virtual processor in guest and hypervisor code |
gauge |
instance: instance-identification-string job: job-name core: coreid vm: virtual-machine-name |
|
windows_hyperv_vm_device_bytes_written |
hyperv |
The total number of bytes that have been written per second on this virtual device |
counter |
instance: instance-identification-string job: job-name vm_device: virtual-disk-file-path |
|
windows_hyperv_vm_device_bytes_read |
hyperv |
The total number of bytes that have been read per second on this virtual device |
counter |
instance: instance-identification-string job: job-name vm_device: virtual-disk-file-path |
|
windows_hyperv_host_cpu_total_run_time |
hyperv |
The time spent by the virtual processor in guest and hypervisor code |
gauge |
instance: instance-identification-string job: job-name core: coreid |
|
windows_hyperv_vswitch_bytes_received_total |
hyperv |
The total number of bytes received per second by the virtual switch |
counter |
instance: instance-identification-string job: job-name vswitch: virtual-switch-name |
|
windows_hyperv_vswitch_bytes_sent_total |
hyperv |
The total number of bytes sent per second by the virtual switch |
counter |
instance: instance-identification-string job: job-name vswitch: virtual-switch-name |
|
windows_cs_logical_processors |
cs |
Number of installed logical processors |
gauge |
instance: instance-identification-string job: job-name |
|
windows_hyperv_vm_cpu_hypervisor_run_time |
hyperv |
The time spent by the virtual processor in hypervisor code |
gauge |
instance: instance-identification-string job: job-name core: coreid vm: virtual-machine-name |
■ Collector
Windows exporter (Hyper-V monitoring) has a built-in collection process called a "collector" for each monitored resource, such as CPU and memory.
You must enable the collectors that correspond to the metrics you want to collect from the table above. You can also disable collectors for metrics that you do not want to collect, to suppress unnecessary collection.
Whether each collector is enabled or disabled can be specified with the "--collectors.enabled" option on the Windows exporter (Hyper-V monitoring) command line or in the "collectors.enabled" item in the Windows exporter (Hyper-V monitoring) configuration file (jpc_windows_exporter_hyperv.yml).
For details about Windows exporter (Hyper-V monitoring) command-line options, see the description of windows_exporter command options (Hyper-V monitoring) in Service definition file (jpc_program-name.service.xml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
For details about Windows exporter (Hyper-V monitoring) configuration file entry "collectors.enabled", see the description of item collectors in Windows exporter (Hyper-V monitoring) configuration file (jpc_windows_exporter_hyperv.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
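As an illustration, the following excerpt is a minimal sketch of the "collectors.enabled" item, assuming the configuration file follows the windows_exporter YAML format; it enables only the hyperv and cs collectors used by the metrics in the table above:

```yaml
collectors:
  enabled: hyperv,cs    # comma-separated list of collectors to enable
```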
■ Notes
Because Hyper-V monitoring monitors the Hyper-V on JP1/IM - Agent's own host, when you use HA host clusters or live migration, you must deploy JP1/IM - Agent on the monitored hosts according to the Hyper-V configuration that you want to monitor.
When the Hyper-V configuration is changed, the tree must be updated after the first startup of the VM to be monitored.
(q) SQL exporter (Microsoft SQL Server monitoring function)
SQL exporter is an Exporter for Prometheus that retrieves performance data from Microsoft SQL Server.
- About the number of sessions
-
When SQL exporter monitors Microsoft SQL Server, connections are made according to the number of connections defined in the SQL exporter configuration file (jpc_sql_exporter.yml). If data is acquired within the session retention time defined in that file, the same session is reused.
For details about SQL exporter configuration file, see SQL exporter configuration file (jpc_sql_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Supported targets and configurations
The target is an instance of Microsoft SQL Server. Monitoring is performed in units of instances, and the maximum number of monitored instances is 10.
For details about supported Microsoft SQL Server versions and editions, see the Release Notes for JP1/IM - Agent.
The following shows Microsoft SQL Server configurations that are supported for monitoring.
-
Monitoring a single host (including remote monitoring)
-
Monitoring multiple hosts (including remote monitoring)
In a mirrored configuration, you can monitor both the principal database and the secondary database by setting both as monitoring targets. However, because the instances are different, they are collected as separate nodes in the monitoring tree.
If you use the SQL Server AlwaysOn Availability Groups function, you can monitor both the primary and secondary databases by setting them as monitoring targets. However, because the instances are different, they are collected as separate nodes in the monitoring tree.
■ Acquisition items
The metrics that can be retrieved with the SQL exporter shipped with JP1/IM - Agent are the metrics defined by the SQL exporter defaults and the metrics listed below.
-
mssql_database_detail_process_count
-
mssql_global_server_summary_perc_busy
-
mssql_global_server_summary_packet_errors
-
mssql_server_detail_blocked_processes
-
mssql_server_overview_cache_hit
-
mssql_transaction_log_overview_log_space_used
SQL exporter retrieval items are defined in the metric definition file of SQL exporter. For details, see SQL exporter metric definition file (metrics_sql_exporter.conf) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
The following table lists the metrics that can be specified in a PromQL expression in the definition file. The value of each metric is obtained by executing the SQL statement shown in the table on Microsoft SQL Server. For details about a metric, contact Microsoft based on the SQL statement of its data source.
|
Metric name |
Contents to be acquired |
Label |
Data source (SQL statement) |
|---|---|---|---|
|
mssql_local_time_seconds |
Local time in seconds since epoch (UNIX time). |
none |
SELECT DATEDIFF(second, '19700101', GETUTCDATE()) AS unix_time |
|
mssql_connections |
Number of active connections. |
none |
SELECT DB_NAME(sp.dbid) AS db, COUNT(sp.spid) AS count FROM sys.sysprocesses sp GROUP BY DB_NAME(sp.dbid) |
|
mssql_deadlocks |
Number of lock requests that resulted in a deadlock. |
none |
SELECT cntr_value FROM sys.dm_os_performance_counters WITH (NOLOCK) WHERE counter_name = 'Number of Deadlocks/sec' AND instance_name = '_Total' |
|
mssql_user_errors |
Number of user errors. |
none |
SELECT cntr_value FROM sys.dm_os_performance_counters WITH (NOLOCK) WHERE counter_name = 'Errors/sec' AND instance_name = 'User Errors' |
|
mssql_kill_connection_errors |
Number of severe errors that caused SQL Server to kill the connection. |
none |
SELECT cntr_value FROM sys.dm_os_performance_counters WITH (NOLOCK) WHERE counter_name = 'Errors/sec' AND instance_name = 'Kill Connection Errors' |
|
mssql_page_life_expectancy_seconds |
The minimum number of seconds a page will stay in the buffer pool on this node without references. |
none |
SELECT top(1) cntr_value FROM sys.dm_os_performance_counters WITH (NOLOCK) WHERE counter_name = 'Page life expectancy' |
|
mssql_batch_requests |
Number of command batches received. |
none |
SELECT cntr_value FROM sys.dm_os_performance_counters WITH (NOLOCK) WHERE counter_name = 'Batch Requests/sec' |
|
mssql_log_growths |
Number of times the transaction log has been expanded, per database. |
none |
SELECT rtrim(instance_name) AS db, cntr_value FROM sys.dm_os_performance_counters WITH (NOLOCK) WHERE counter_name = 'Log Growths' AND instance_name <> '_Total' |
|
mssql_buffer_cache_hit_ratio |
Ratio of requests that hit the buffer cache |
none |
SELECT cntr_value FROM sys.dm_os_performance_counters WHERE [counter_name] = 'Buffer cache hit ratio' |
|
mssql_checkpoint_pages_sec |
Checkpoint Pages Per Second |
none |
SELECT cntr_value FROM sys.dm_os_performance_counters WHERE [counter_name] = 'Checkpoint pages/sec' |
|
mssql_io_stall_seconds |
Stall time in seconds per database and I/O operation. |
none |
|
|
mssql_io_stall_total_seconds |
Total stall time in seconds per database. |
none |
|
|
mssql_resident_memory_bytes |
SQL Server resident memory size (AKA working set). |
none |
FROM sys.dm_os_process_memory |
|
mssql_virtual_memory_bytes |
Microsoft SQL Server committed virtual memory size. |
none |
FROM sys.dm_os_process_memory |
|
mssql_memory_utilization_percentage |
The percentage of committed memory that is in the working set. |
none |
FROM sys.dm_os_process_memory |
|
mssql_page_fault_count |
The number of page faults that were incurred by the Microsoft SQL Server process. |
none |
FROM sys.dm_os_process_memory |
|
mssql_os_memory |
OS physical memory, used and available. |
none |
FROM sys.dm_os_sys_memory |
|
mssql_os_page_file |
OS page file, used and available. |
none |
FROM sys.dm_os_sys_memory |
|
mssql_database_detail_process_count |
Total number of processes |
none |
FROM master.sys.dm_exec_sessions des WHERE ISNULL(des.database_id,0) <> 0 GROUP BY DB_NAME(ISNULL(des.database_id,0)) |
|
mssql_global_server_summary_perc_busy |
Percentage of CPU busy time. Note: This field cannot acquire the correct value. |
none |
SELECT 100.0 * @@cpu_busy / (@@cpu_busy+ @@idle+ @@io_busy) AS cpu_busy_percent |
|
mssql_global_server_summary_packet_errors |
The number of packet errors |
none |
SELECT @@packet_errors AS count |
|
mssql_server_detail_blocked_processes |
The number of processes waiting due to processes running on Microsoft SQL Server being locked |
none |
SELECT DB_NAME(ISNULL(S.database_id,0)) AS db, SUM(ISNULL(R.blocking_session_id,0)) AS count FROM master.sys.dm_exec_sessions S LEFT OUTER JOIN master.sys.dm_exec_requests R ON S.session_id = R.session_id GROUP BY DB_NAME(ISNULL(S.database_id,0)) |
|
mssql_server_overview_cache_hit |
The percentage of times data pages were found in the data cache |
none |
SELECT 100.0 * ( SELECT cntr_value FROM master.sys.dm_os_performance_counters WHERE RTRIM(object_name) LIKE '%:Buffer Manager' AND RTRIM(LOWER(counter_name)) = 'buffer cache hit ratio' ) / ( SELECT cntr_value FROM master.sys.dm_os_performance_counters WHERE RTRIM(object_name) LIKE '%:Buffer Manager' AND RTRIM(LOWER(counter_name)) = 'buffer cache hit ratio base' ) AS cache_hity_percent |
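For reference, the following is a minimal sketch of an alert rule that could monitor one of these metrics. The group name, alert name, and threshold are assumptions, and the jp1_pc_* labels and annotations required to convert the alert into a JP1 event (see 3.15.1(3)(a)) are omitted for brevity.

    groups:
      - name: sql_exporter                    # assumed group name
        rules:
          - alert: mssql_deadlocks_detected   # assumed alert name
            # mssql_deadlocks is a cumulative counter, so alert when it
            # has increased during the last 5 minutes:
            expr: increase(mssql_deadlocks[5m]) > 0
            for: 1m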
■ Requirements for monitoring Microsoft SQL Server
To monitor Microsoft SQL Server with SQL exporter, you must configure the following settings:
-
Microsoft SQL Server
Set the database character set used by Microsoft SQL Server to one of the following:
-
AL32UTF8 (Unicode UTF-8)
-
JA16SJIS (Japanese-language SJIS)
-
ZHS16GBK (Simplified Chinese GBK)
The only supported authentication method is SQL Server authentication with a user ID and password registered in Microsoft SQL Server. Windows authentication is not supported.
-
Users used to access Microsoft SQL Server
Grant the permissions below to the users you want to use to connect to Microsoft SQL Server.
-
Login permissions
CONNECT SQL
-
SELECT permissions on the following tables
|
Table name |
Permissions |
|---|---|
|
sys.sysprocesses |
VIEW SERVER STATE |
|
sys.dm_os_performance_counters |
VIEW SERVER PERFORMANCE STATE |
|
sys.dm_io_virtual_file_stats |
VIEW SERVER PERFORMANCE STATE |
|
sys.master_files |
CREATE DATABASE, ALTER ANY DATABASE, or VIEW ANY DEFINITION |
|
sys.dm_os_process_memory |
VIEW SERVER PERFORMANCE STATE |
|
sys.dm_os_sys_memory |
VIEW SERVER PERFORMANCE STATE |
■ Obfuscation of Microsoft SQL Server passwords
The SQL exporter shipped with JP1/IM - Agent manages the password used to access Microsoft SQL Server with the secret obfuscation function. For details, see 3.15.10 Secret obfuscation function.
■ Notes
-
If Microsoft SQL Server is not installed or configured, or if Microsoft SQL Server is not running, no performance information is collected.
-
If the monitored Microsoft SQL Server is rebuilding an index while performance information is being collected, Microsoft SQL Server might make the collection request wait for lock release to ensure data integrity. In such cases, the lock-release wait is cleared for processes that Microsoft SQL Server determines to have little impact, but the performance information collection request is rolled back, and collection of performance information might fail.
-
If a table is created during a transaction on Microsoft SQL Server and the operation is not committed, data collection fails because a shared lock is placed on the system table. In this case, data collection might not be possible until the operation is committed.
-
A shared lock is placed on the database when performance information is collected. If you attempt to create a new Microsoft SQL Server database at this time, the creation might fail.
(r) Script exporter (job monitoring function)
The job monitoring function works in conjunction with JP1/AJS3 - Manager 13-50 or later to monitor JP1/AJS3 job information as metrics, detect anomalies in the execution time of root jobnets, and visualize the transition of root jobnet execution times in the integrated operation viewer.
Trend data using JP1/AJS3 root jobnet execution time as a metric can be stored in the trend data management DB of JP1/IM - Manager and displayed and monitored on the Trends and Dashboards tabs of the integrated operations viewer.
Metrics of JP1/AJS3 job information displayed in the integrated operation viewer are defined in the JP1/AJS3 metric definition file (metrics_ajs_rootjobnet.conf). For details, see Setup for linking JP1/IM3 in the JP1/Automatic Job Management System 3 Linkage Guide.
Script exporter is an Exporter that runs scripts residing on a host and retrieves their results.
The Script exporter is installed on the same host as the JP1/IM - Agent, and upon a scrape request from the Prometheus server, it executes a script on that host to retrieve the results and returns them to the Prometheus server.
When JP1/AJS3 linkage is configured and the job monitoring function is used, the Prometheus server can collect performance data of JP1/AJS3 job information via Script exporter after a JP1/AJS3 root jobnet finishes executing.
Note that the maximum number of JP1/AJS3 root jobnets for which a single integrated agent can collect performance data is 5,000. To collect performance data for more than 5,000 and up to 10,000 root jobnets with a single integrated agent, set alert rules to be evaluated at intervals of two minutes or longer (a hedged configuration sketch follows). For details about how often alert rules are evaluated, see the evaluation_interval entry of the Prometheus configuration file (jpc_prometheus_server.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
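As a hedged illustration (the surrounding file content is omitted; see the reference above for the exact entry), an evaluation interval of two minutes corresponds to the standard Prometheus global setting:

    # jpc_prometheus_server.yml (excerpt)
    global:
      evaluation_interval: 2m   # evaluate alert rules every two minutes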
You can also configure the integrated agent host and the JP1/AJS3 - Manager host as separate hosts. JP1/Base must be installed on the integrated agent host.
■ Creating IM management nodes for use with the job monitoring function
The IM management nodes of the JP1/AJS3 root jobnets whose job information can be monitored are created by using the adapter command included with the JP1/AJS3 product plug-in. You can create IM management nodes as follows:
-
Set up JP1/AJS3 linkage.
For details about how to set up monitoring of JP1/AJS3 root JobNets, see Setup for linking JP1/IM3 in the JP1/Automatic Job Management System 3 Linkage Guide.
-
Generate tree information from the integrated operations viewer or run the jddcreatetree command.
-
Accept tree information from the integrated operations viewer or run the jddupdatetree command.
■ Tree of IM management node created by the job monitoring function
The IM management node tree created by the job monitoring function is shown below.
All Systems
+ JP1/AJS3-manager-host
| + Job
| | + JP1/AJS3 - Manager
| | + scheduler-service
| | + job-group#1
| | + root-jobnet#2
| + Management Applications
| + JP1/AJS3 - Manager
| + JP1/AJS3 - Manager Scheduler Service
| + scheduler-service
+ JP1/AJS3-agent-host
+ Management Applications
+ JP1/AJS3 - Agent
- #1
-
A job group can have multiple hierarchies.
- #2
-
In JP1/IM - Manager 13-50 and later, a new SID for the configuration information of the IM management node corresponding to the root jobnet is created. However, the tree structure remains the same as in the JP1/AJS3 linkage used in JP1/IM - Manager version 13-11 and earlier. A root jobnet node has two configuration SIDs associated with one tree SID (one beginning with "_JP1PC-IMB_" and one without it).
The types and formats of configuration SID corresponding to IM management node created by the job monitoring function are shown below.
|
Type of configuration SID |
SID format |
|
|---|---|---|
|
Job category |
Root jobnet SID |
_JP1PC-IMB_JP1/IM-manager-host-name/_JP1PC-M_Prometheus-host-name/_JP1PC-AHOST_Exporter-host-name/JP1AJS-M_JP1/AJS3-manager-host-name/_HOST_JP1/AJS3-manager-host-name/_JP1SCHE_scheduler-service-name/_JP1JOBG_job-group-name/_JP1ROOTJOBNET_root-job-net-name# |
- #
-
If a job group is defined with multiple hierarchies, "_JP1JOBG_job-group-name" is repeated depending on the definition.
Because the job monitoring function uses Script exporter, the following IM management node tree is also created.
All Systems
+ JP1/IM-Agent-host
+ Script
| + ajseventmetrics#1
+ Management Applications
+ JP1/IM - agent control base
+ Metric forwarder(Prometheus server)
+ Alert forwarder(Alertmanager)
+ JP1/AJS3 metric collector(Script exporter)#2
- #1
-
Indicates agent SID of Script exporter for job monitoring.
- #2
-
Indicates agent service SID of Script exporter for job monitoring.
If you use Script exporter for the UAP monitoring function in addition to the job monitoring function, an IM management node for Script exporter is also created as the agent service SID Script metric collector(Script exporter), as shown in the following IM management node tree. If you want to monitor whether integrated agent processes are alive, set the related alert definition for each of the IM management nodes Script metric collector(Script exporter) and JP1/AJS3 metric collector(Script exporter); a hedged sketch of such an alert definition follows the tree below. In that case, when Script exporter stops, a JP1 event associated with each IM management node is issued. For details about integrated agent process alive monitoring, see 1.21.2(18) Setup of integrated agent process alive monitoring (for Windows) (optional) and 2.19.2(17) Setup of integrated agent process alive monitoring (for Linux) (optional) in the JP1/Integrated Management 3 - Manager Configuration Guide.
All Systems
+ JP1/IM-Agent-host
+ Script
| + ajseventmetrics#1
+ Platform#2
| + uap_run#3
+ Management Applications
+ JP1/IM - agent control base
+ Metric forwarder(Prometheus server)
+ Alert forwarder(Alertmanager)
+ JP1/AJS3 metric collector(Script exporter)#4
+ Script metric collector(Script exporter)#5
- #1
-
Indicates agent SID of Script exporter for job monitoring.
- #2
-
Indicates agent SID of Script exporter for user-specified UAP monitoring.
- #3
-
Indicates agent SID of Script exporter for UAP monitoring.
- #4
-
Indicates agent service SID of Script exporter for job monitoring.
- #5
-
Indicates agent service SID of Script exporter for UAP monitoring.
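The following is a minimal sketch of such an alive-monitoring alert definition. The group name, alert name, and waiting time are assumptions, and the jp1_pc_* labels and annotations required for JP1 event conversion (see 3.15.1(3)(a)) are omitted. It relies on the up metric, which becomes 0 when a scrape of the exporter fails.

    groups:
      - name: script_exporter_alive       # assumed group name
        rules:
          - alert: script_exporter_down   # assumed alert name
            expr: up{job="jpc_script"} == 0
            for: 3m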
■ Viewing performance data for JP1/AJS3 Job Information
When JP1/AJS3 linkage is set up and a JP1/AJS3 root jobnet has been executed, selecting the IM management node of the root jobnet (with the IM management nodes of JP1/AJS3 reflected in the tree of the integrated operation viewer) lets you check the metrics of the job information related to that root jobnet on the Dashboard tab or the Trend tab. You can also customize the Dashboard tab or create a new dashboard to check the trend data of job information metrics in the various panels.
When customizing the Dashboard tab or creating a new dashboard, we recommend specifying no more than 20 root jobnets as target nodes in each panel. If you specify more than 20, displaying the dashboard panels takes time. In addition to the number of target nodes, the time needed to display the dashboard panels also depends on the following conditions.
-
Fixed value of the range vector selector specified in PromQL (promql in the metric definition file) of the target metric#1
-
Number of samples of performance data associated with the target node of target metric
-
Number of performance data label sets associated with the target node of target metric
-
Display range setting duration in the panel#2
-
Number of plots in the panel#2
- #1
-
For details about JP1/AJS3 metric definition file (metrics_ajs_rootjobnet.conf), see Setup for linking JP1/IM3 in the JP1/Automatic Job Management System 3 Linkage Guide. For details on specifying the range vector selector, see Consolidation display of trend data with dynamic range vectors in 3.15.6(4)(c) About Performance Data to Retrieve.
- #2
-
For details on the various panel settings, see Target node of each panel in 2.4.3 Add panel window in the JP1/Integrated Management 3 - Manager GUI Reference.
The following table lists the dashboards that are automatically generated and displayed on the Dashboard tab and the information displayed on the Trend tab when IM management node that is created by JP1/AJS3 linkage is selected in integrated operation viewer. Depending on the support of the job monitoring function, the displayed content differs between JP1/IM - Manager 13-11 or earlier and 13-50 or later.
|
Selected node |
Panels from the second row of dashboards that are automatically generated and displayed on the Dashboard tab#1 |
Trend tab |
|||
|---|---|---|---|---|---|
|
JP1/IM - Manager Version |
JP1/IM - Manager Version |
||||
|
13-11 or earlier |
13-50 or later |
13-11 or earlier |
13-50 or later |
||
|
Host |
JP1/AJS3-manager-host |
None#2 |
None#2 |
None |
None |
|
Job category |
Job |
None#2 |
None#2 |
None |
None |
|
JP1/AJS3 - Manager |
None |
None |
None |
None |
|
|
scheduler-service |
None |
None |
None |
None |
|
|
job-group |
None |
None |
None |
None |
|
|
root-jobnet |
None |
Displays metric trend panel#3 associated with the root jobnet node |
None |
Displays metric associated with the root jobnet node |
|
|
Management Applications category |
Management Applications |
None |
None |
None |
None |
|
JP1/AJS3 - Manager |
None |
None |
None |
None |
|
|
JP1/AJS3 - Manager Scheduler Services |
None |
None |
None |
None |
|
|
scheduler-service |
None |
None |
Displays metric associated with the scheduler service |
Displays metric associated with the scheduler service |
|
|
Host |
JP1/AJS3-agent |
None#2 |
None#2 |
None |
None |
|
Management Applications category |
Management Applications |
None |
None |
None |
None |
|
JP1/AJS3 - Agent |
None |
None |
None |
None |
|
- #1
-
No matter which node you select, the first row of the dashboard shows the Node Status, Alert Information, Numeric and Trend panels.
- #2
-
If the hosts of integrated agent, user-defined Prometheus, or user-defined Fluentd are the same host as the JP1/AJS3 manager host, trend panels for the metrics associated with those nodes are displayed. If there is more than one terminal node under the selected node and the same metric relates to them, they are displayed in one panel. Note that panels for metrics related to the root jobnet are not displayed.
- #3
-
If the Display range setting of the dashboard is at its default of 1 hour, the metric trend panel associated with a root jobnet node displays seven days of data at one-day intervals.
In the integrated operation viewer, the dashboards that are automatically generated and displayed on the Dashboard tab when a node of Exporter or Fluentd or a node of a JP1/AJS3 root jobnet is selected differ as follows in how panels are displayed for the metrics associated with the terminal nodes under the selected node.
|
Selected node |
Terminal node under the node of Exporter or Fluentd |
Terminal node under the node of JP1/AJS3 root jobnet |
||
|---|---|---|---|---|
|
Panel view#1 |
Panel setting#1 |
Panel view#2 |
Panel setting#2 |
|
|
Terminal node |
Displays 1-hour trend data at one-minute intervals#3 |
|
Displays 7-day trend data at one-day intervals#4 |
|
|
Top nodes of the terminal node (except system node) |
No panel display |
Not applicable |
||
- #1
-
Applies to the panel display of metrics other than JP1/AJS3 job information.
- #2
-
Applies to the panel display of JP1/AJS3 job information metrics.
- #3
-
Assume that the dashboard display range is set to 1 hour.
- #4
-
The dashboard display range settings are configured as follows:
-
"Dashboard display range (start time or end time)" setting: Start time
-
"Past or future range of the time difference" setting: Past range
-
"Time difference" setting: 143h (5 days 23 hours)
(s) Whether Prometheus and Exporter support same-host and separate-host configurations
The following table shows whether Prometheus and Exporter can be used when they are configured on the same host and when they are configured on separate hosts.
|
Exporter type |
Configuring Prometheus and Exporter hosts |
||
|---|---|---|---|
|
Same host |
Separate host |
||
|
Exporter provided by JP1/IM - Agent |
Node exporter for AIX |
N |
Y |
|
Exporter other than the above |
Y |
N |
|
|
User-defined Exporter |
Y |
Y |
|
- Legend
-
Y: Supported
N: Not supported
The following configurations are not supported:
-
Configurations in which more than one Prometheus scrapes the same Exporter
-
Exporter# on a remote agent (the host running Exporter and the monitored host are separate hosts)
- #
-
An Exporter on a remote agent is an Exporter whose discovery configuration file contains the entry "jp1_pc_remote_monitor_instance", as sketched below.
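As a hedged sketch (the host name, port, and label value are assumptions; see the description of the discovery configuration file in the reference manual for the exact format), such an entry could look like:

    - targets:
        - "remote-host:20717"   # assumed Exporter address
      labels:
        jp1_pc_remote_monitor_instance: "monitored-instance"   # marks the Exporter as remote monitoring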
Also, if Prometheus and Exporter are configured on different hosts, it is assumed that the ports used by Exporter are protected by firewalls, network configuration, and so on, so that they cannot be accessed by anything other than the Prometheus server of JP1/IM - Agent (for example, by placing the integrated agent host and the Exporter hosts in the same network so that the ports cannot be accessed externally).
(2) Centralized management of performance data
This function allows the Prometheus server to store performance data collected from monitoring targets in the intelligent integrated management database of JP1/IM - Manager. It has the following function:
-
Remote write function
-
In addition, if JP1/IM - Agent 13-01 or later is newly installed, service monitoring performance data is centrally managed by default. When upgrading from JP1/IM - Agent 13-00 to 13-01 or later, you need to configure the settings for service monitoring. For details on where to find the setup instructions, see 3.15.1(1)(c) Windows exporter (Windows performance data collection capability) and 3.15.1(1)(d) Node exporter (Linux performance data collection capability).
(a) Remote write function
This is a function in which the Prometheus server sends performance data collected from monitoring targets to an external database suitable for long-term storage. JP1/IM - Agent uses this function to send performance data to JP1/IM - Manager.
The following shows how to define remote write.
-
Remote write definitions are described in the Prometheus server configuration file (jpc_prometheus_server.yml).
-
Download the Prometheus configuration file from the integrated operation viewer, edit it in a text editor to modify the remote write definition, and then upload it.
The following remote write settings are supported by JP1/IM - Agent. For details about the settings, see Prometheus configuration file (jpc_prometheus_server.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
|
Setting items |
Description |
|---|---|
|
Remote write destination (required) |
Set the endpoint URL for JP1/IM agent control base. |
|
Remote write timeout period (optional) |
You can set the timeout period if remote write takes a long time. Change it if you are not satisfied with the default value. |
|
Relabeling (optional) |
You can remove unwanted metrics and customize labels. |
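As a hedged sketch of how these settings map onto the standard Prometheus remote_write section (the endpoint URL is a placeholder, and the relabeling rule is an assumed example):

    # jpc_prometheus_server.yml (excerpt)
    remote_write:
      - url: "<endpoint URL of JP1/IM agent control base>"   # remote write destination (required)
        remote_timeout: 30s                                  # timeout period (optional)
        write_relabel_configs:                               # relabeling (optional)
          - source_labels: [__name__]
            regex: "go_.*"     # example: drop internal Go runtime metrics
            action: drop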
(3) Performance data monitoring notification function
This function allows the Prometheus server to monitor performance data collected from monitoring targets against thresholds and notify JP1/IM - Manager. It has the following three functions:
-
Alert evaluation function
-
Alert notification function
-
Notification suppression function
If you add a service to be monitored in an environment where an alert definition for service monitoring is set, the added service is also monitored. If you exclude from monitoring a service for which an alert has fired, you will receive a notification that the fired alert has been resolved.
For examples of defining alerts, see Alert definition example for metrics in Node exporter metric definition file and Alert definition example for metrics in Windows exporter metric definition file in Alert configuration file (jpc_alerting_rules.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference. For Linux, alerts are defined differently depending on whether automatic startup (systemctl enable) is enabled for the monitored service. If you want to monitor a service for which automatic startup is disabled, you must create and configure an alert definition for each target.
- - When using the job monitoring function
-
If you want to monitor performance data for job information, the alert rule evaluation interval must be at least one minute. For details about how often alert rules are evaluated, see the evaluation_interval entry of Prometheus configuration file (jpc_prometheus_server.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
(a) Alert evaluation function
This function monitors performance data collected from monitoring targets at a threshold value.
Define alert rules so that alerts are evaluated, performance data is monitored against thresholds, and alerts are notified.
Alerts can be evaluated by comparing the time series data directly with the thresholds, or by comparing the thresholds with the results of formulas using PromQL#.
- #
-
For details about PromQL, see 2.7.4(7) About PromQL.
For each time series, or for each series generated by evaluating the PromQL expression, an alert status is managed according to the evaluation result, and notification actions are executed according to that status.
There are three alert states: pending, firing, and resolved. When the alert rule condition is first met, the alert enters the "pending" state. If the condition continues to be met (is not resolved) for the time specified in the "for" clause of the alert rule definition, the alert enters the "firing" state.
When the condition is no longer met (resolved), or when the time series disappears, the alert enters the "resolved" state.
The relationship between alert status and notification behavior is as below.
|
Alert status |
Description |
Notification behavior |
|---|---|---|
|
pending |
The threshold is exceeded, but the time specified in the "for" clause of the alert rule definition has not yet passed. |
Alerts are not notified. |
|
firing |
The threshold is exceeded, and the time specified in the "for" clause of the alert rule definition has passed. Alternatively, the threshold is exceeded and no "for" clause is specified for the alert. |
Alerts are notified. |
|
resolved |
The alert rule condition is no longer met. |
A recovery notification is sent. |
The following shows how to define an alert rule.
-
Alert rule definitions are described in the alert configuration file (jpc_alerting_rules.yml) (definitions can also be written in any YAML-format file).
-
Before applying the created definition file to the target environment, check its format and test the alert rules with the promtool command.
-
Download alert configuration file from integrated operation viewer, edit it in a text editor, change the definition of the alert rule, and then upload it.
The following settings apply to the alert rule definitions supported by JP1/IM - Agent. For details about the settings, see Alert configuration file (jpc_alerting_rules.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference. There is no default alert rule definition.
|
Setting Item |
Description |
|---|---|
|
Alert Name (required) |
Set the alert name. |
|
Conditional expression (required) |
Set the alert condition expression (threshold). It can be configured using PromQL. |
|
Waiting time (required) |
Set the amount of time to wait after entering the "pending" state before changing to the "firing" state. Change it if you are not satisfied with the default value. |
|
Label (required) |
Set labels to add to alerts and recovery notifications. In JP1/IM - Agent, a specific label must be set. |
|
Annotation (required) |
Set to store additional information such as alert description and URL link. In JP1/IM - Agent, certain annotations must be set. |
Labels and annotations can use the following variables:
|
Variable# |
Description |
|---|---|
|
$labels |
A variable that holds the label key-value pairs of the alert instance. When time series data is specified in the conditional expression of the alert evaluation, any label that the data retains can be specified as the label key.
|
|
$value |
A variable that holds the evaluation value of the alert instance. When firing is notified, it expands to the value at the time firing was detected. When resolved is notified, it expands to the value at the time of the last firing before resolution (not the value at the time of resolution). |
|
$externalLabels |
This variable holds the label and value set in "external_labels" of item "global" in the Prometheus configuration file (jpc_prometheus_server.yml). |
- #
-
Variables are expanded by enclosing them in "{{" and "}}". The following is an example of how to use variables:
description: "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)"
■ Alert rule definition for converting to JP1 events
In order to convert the alert to be notified into a JP1 event on the JP1/IM - Manager side, the following information must be set in the alert rule definition.
|
Setting item |
Value to set |
Uses |
|---|---|---|
|
name |
Set any alert group definition name that is unique within the integrated agent. |
Alert group definition name |
|
alert |
Set any alert definition name that is unique within the integrated agent. |
Alert Definition Name |
|
expr |
Set the PromQL statement. It is recommended to set the PromQL statement described in the metric definition file. This way, when the JP1 event occurs, you can display trend information in the Integrated Operation Viewer. |
Firing conditions#
|
|
labels.jp1_pc_product_name |
Set "/HITACHI/JP1/JPCCS" as fixed. |
Set to the product name of the JP1 event. |
|
labels.jp1_pc_severity |
Set one of the following:
|
Set to JP1 event severity#.
|
|
labels.jp1_pc_eventid |
Set any value in the range of 0~1FFF,7FFF8000~7FFFFFFF. |
Set to the event ID of the JP1 event. |
|
labels.jp1_pc_metricname |
Set the metric name. For Yet another cloudwatch exporter, be sure to specify it. The JP1 event is associated with the IM management node in the AWS namespace corresponding to the metric name (or to the first metric name if multiple metric names are specified, separated by commas). |
Set to the metric name of the JP1 event. For yet another cloudwatch exporter, it is also used to correlate JP1 events. |
|
annotations.jp1_pc_firing_description |
Specify the value to be set for the message of the JP1 event when the firing condition of the alert is satisfied. If the length of the value is 1,024 bytes or more, set the string from the beginning to the 1,023rd byte. If the specification is omitted, the message content of the JP1 event is "The alert is firing. (alert = alert name)". You can also specify variables to embed job names and evaluation values. If a variable is used, the first 1,024 bytes of the expanded message are valid. |
It is set to the message of the JP1 event. |
|
annotations.jp1_pc_resolved_description |
Specify the value to be set for the message of the JP1 event when the firing condition of the alert is not satisfied. If the length of the value is 1,024 bytes or more, set the string from the beginning to the 1,023rd byte. If the specification is omitted, the content of the message in the JP1 event is "The alert is resolved. (alert = alert name)". You can also specify variables to embed job names and evaluation values. If a variable is used, the first 1,024 bytes of the expanded message are valid. |
It is set to the message of the JP1 event. |
For an example of setting an alert definition, see Definition example in alert configuration file (jpc_alerting_rules.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
For details about the properties of the corresponding JP1 event, see 3.2.3 Lists of JP1 events output by JP1/IM - Agent in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ How to operate in combination with trending-related functions
If you align the PromQL statement described in the metric definition file with the PromQL statement evaluated by the alert evaluation function, and describe the metric name of the corresponding trend data in annotations.jp1_pc_firing_description and annotations.jp1_pc_resolved_description of the alert definition in the alert configuration file, then when the JP1 event for the alert is issued you can check the past changes and current value of the performance value evaluated by the alert on the Trends tab of the integrated operation viewer.
For details about PromQL expression defined in trend displayed related capabilities, see 3.15.6(4) Return of trend data.
For example, if you want Node exporter to monitor CPU usage and to notify you when CPU usage exceeds 80%, create an alert configuration file (alert definition) and a metric definition file as shown in the following examples.
-
Example of description of alert configuration file (alert definition)
    groups:
      - name: node_exporter
        rules:
          - alert: cpu_used_rate(Node exporter)
            expr: 80 < (avg by (instance,job,jp1_pc_nodelabel,jp1_pc_exporter) (rate(node_cpu_seconds_total{mode="system"}[2m])) + avg by (instance,job,jp1_pc_nodelabel,jp1_pc_exporter) (rate(node_cpu_seconds_total{mode="user"}[2m]))) * 100
            for: 3m
            labels:
              jp1_pc_product_name: "/HITACHI/JP1/JPCCS2"
              jp1_pc_component: "/HITACHI/JP1/JPCCS/CONFINFO"
              jp1_pc_severity: "Error"
              jp1_pc_eventid: "0301"
              jp1_pc_metricname: "node_cpu_seconds_total"
            annotations:
              jp1_pc_firing_description: "CPU usage has exceeded the threshold (80%). value={{ $value }}%"
              jp1_pc_resolved_description: "CPU usage has fallen below the threshold (80%)."
-
Example of description of metric definition file
[ { "name":"cpu_used_rate", "default":true, "promql":"(avg by (instance,job,jp1_pc_nodelabel,jp1_pc_exporter) (rate(node_cpu_seconds_total{mode=\"system\"}[2m]) and $jp1im_TrendData_labels) + avg by (instance,job,jp1_pc_nodelabel,jp1_pc_exporter) (rate(node_cpu_seconds_total{mode=\"user\"}[2m]) and $jp1im_TrendData_labels)) * 100", "resource_en":{ "category":"platform_unix", "label":"CPU used rate", "description":"CPU usage.It also indicates the average value per processor. [Units: %]", "unit":"%" }, "resource_ja":{ "category":"platform_unix", "label":"CPU使用率", "description":"CPU使用率(%)。プロセッサごとの割合の平均値でもある。", "unit":"%" } } }When the conditions of the PromQL statement specified in expr of the alert definition are satisfied and the JP1 event of the alert is issued, the message "CPU usage has exceeded the threshold (80%). value = performance-value%" is set in the message of the JP1 event. Users can view this message to view "CPU Usage" trend information and see past changes and current values of CPU usage.
■ Behavior when the service is stopped
If the Alertmanager service is stopped, JP1 events for alerts are not issued. In addition, if the Prometheus server and Alertmanager services are running and an exporter whose alert is firing stops due to a failure, the alert becomes resolved and a normal JP1 event is issued.
When an alert is firing and the Prometheus server service is stopped while Alertmanager is running, a normal JP1 event notifying that the alert is resolved is issued.
For details, see About behavior when the Prometheus server is restarted or stopped while the Alertmanager is running.
■ About behavior when the service is restarted
Even if the Prometheus server, Alertmanager, or Exporter service is restarted while an alert is firing or resolved, no JP1 event is issued if the alert status after the restart is the same as before the restart.
When an alert is firing and the Prometheus server service is restarted while Alertmanager is running, a normal JP1 event notifying that the alert is resolved may be issued.
For details, see About behavior when the Prometheus server is restarted or stopped while the Alertmanager is running.
■ About Considering Performance Data Spikes
Performance data can momentarily spike (to large, small, or negative values). These sudden changes in performance data are commonly referred to as "spikes." In many cases, even if a spike momentarily produces an abnormal value, the value immediately returns to normal and does not need to be treated as an abnormality. A spike can also occur momentarily when performance data is reset, such as when the OS is restarted.
When monitoring such metrics, consider suppressing detection of momentary anomalies by specifying "for" (a grace period before an alert is treated as an anomaly) in the alert rule definition, as in the sketch below.
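The following minimal sketch shows the idea (group name, alert name, metric, and threshold are assumptions): with for: 3m, the condition must hold for three consecutive minutes before the alert enters the firing state, so a one-sample spike does not fire it.

    groups:
      - name: spike_tolerant_rules     # assumed group name
        rules:
          - alert: memory_usage_high   # assumed alert name
            expr: 90 < 100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)
            for: 3m   # grace period that absorbs momentary spikes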
■ About behavior when the Prometheus server is restarted or stopped while the Alertmanager is running
When an alert is firing and the Prometheus server service is restarted or stopped while Alertmanager is running, a normal JP1 event notifying that the alert is resolved may be issued.
A normal JP1 event is issued when the following condition is met:
-
The sum of the duration of the "for" clause# defined in the alert definition of the firing alert and the duration for which the Prometheus server service is not running (because it is stopped or reloading) is greater than the value of "evaluation_interval" defined in the Prometheus configuration file.
-
#: When no "for" clause is specified for the alert, treat the duration as 0.
■ About behavior when the service is reloaded
Even if the API that reloads the Prometheus server, Alertmanager, or Exporter service is executed while an alert is firing or resolved, no JP1 event is issued.
(b) Alert notification function
This function notifies you when the alert status becomes "firing" or "resolved" after the Prometheus server evaluates the alert.
If the state of an alert changes while JP1/IM - Manager (Intelligent Integrated Management Base) is stopped, notification of firing or resolved may not be performed.
The Prometheus server sends alerts one by one, and the sent alerts are forwarded to JP1/IM - Manager (Intelligent Integrated Management Base) via Alertmanager. Alerts are also retried one by one.
Alerts sent to JP1/IM - Manager are basically sent in the order in which they occurred, but the order may change when multiple alert rules meet their conditions at the same time or when a transmission error occurs and alerts are resent. However, because the alert information includes the time of occurrence, you can determine the order in which alerts occurred.
In addition, if the abnormal condition continues for 7 days, an alert will be re-notified.
The following shows how to define the notification destination of the alert.
-
Alert destinations are described in both the Prometheus configuration file (jpc_prometheus_server.yml) and the Alertmanager configuration file (jpc_alertmanager.yml).
In the Prometheus configuration file, specify the coexisting Alertmanager as the notification destination of the Prometheus server. In the Alertmanager configuration file, specify JP1/IM agent control base as the notification destination of Alertmanager.
-
Download each configuration file from the integrated operation viewer, edit it in a text editor to change the alert notification destination definition, and then upload it.
The following settings define the Prometheus server notification destinations supported by JP1/IM - Agent. For details about the settings, see Prometheus configuration file (jpc_prometheus_server.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
|
Setting items |
Description |
|---|---|
|
Notification destination (required) |
Configure the notification destination Alertmanager. If a host name or IP address is specified for --web.listen-address in the Alertmanager command line options, change localhost to the host name or IP address specified in --web.listen-address.
|
|
Label setting (optional) |
You can add labels. Configure as needed. |
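As a hedged sketch (the port number is an assumption), this notification destination corresponds to the standard Prometheus alerting section:

    # jpc_prometheus_server.yml (excerpt)
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
                - localhost:20714   # coexisting Alertmanager (assumed port)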
The following are the Alertmanager notification destination settings that JP1/IM - Agent supports. For details about the settings, see Alertmanager configuration file (jpc_alertmanager.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
|
Setting items |
Description |
|---|---|
|
Webhook settings (required) |
Set the endpoint URL for JP1/IM agent control base. |
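As a hedged sketch (the receiver name is an assumption and the URL is a placeholder), the webhook setting corresponds to the standard Alertmanager receiver configuration:

    # jpc_alertmanager.yml (excerpt)
    route:
      receiver: jp1_im_agent_control_base   # assumed receiver name
    receivers:
      - name: jp1_im_agent_control_base
        webhook_configs:
          - url: "<endpoint URL of JP1/IM agent control base>"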
(c) Notification suppression function
This function suppresses the notifications described in 3.15.1(3)(b) Alert notification function. It includes:
-
Silence function
Use this if you temporarily do not want to be notified of certain alerts.
■ Silence function
This function temporarily suppresses specific notifications. You can use it to avoid being notified of alerts that occur during temporary maintenance. Unlike suppression with the common exclusion conditions of JP1/IM - Manager, the notification suppression function does not notify JP1/IM - Manager at all.
While silence is enabled, you are not notified when the alert status changes. When silence is disabled, a notification is sent if the alert state has changed compared with the state before silence was enabled.
The following two examples show when notification occurs.
[Figure: the alert status changes from "abnormal" to "normal" while silence is enabled]
The above figure shows an example in which the alert status is "abnormal" when silence is enabled, changes to "normal" while silence is enabled, and silence is then disabled.
When the alert changes to "normal", no notification is sent because silence is enabled. When silence is disabled, a "normal" notification is sent because the alert status has changed from the "abnormal" status it had before silence was enabled.
[Figure: the alert status changes to "normal" and back to "abnormal" while silence is enabled]
The above figure shows an example in which, while silence was enabled, the alert status changed to "normal" once and then back to "abnormal", after which silence was disabled.
When silence is disabled, no notification is sent because the alert status is the same "abnormal" as before silence was enabled.
If an alert transmission has failed and is being retried, and silence is then enabled to suppress that alert, the alert is not retried.
- - How to Configure silence
-
Silence settings (enabling or disabling silence) and retrieval of the current silence settings are performed via the REST API (the GUI is not supported).
In addition, when configuring silence settings, the machine from which you operate must be able to communicate with the Alertmanager port on the integrated agent host.
For details about silence settings and REST API used to obtain current silence settings, see 5.22.3 Get silence list of Alertmanager, 5.22.4 Silence creation of Alertmanager, and 5.22.5 Silence Revocation of Alertmanager in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
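As a hedged sketch (all values are assumptions), the body of a silence-creation request sent to the Alertmanager REST API takes the following JSON form; the matchers select the alerts to silence, and startsAt and endsAt bound the suppression period:

    {
      "matchers": [
        { "name": "alertname", "value": "cpu_used_rate(Node exporter)", "isRegex": false }
      ],
      "startsAt": "2025-01-01T00:00:00Z",
      "endsAt": "2025-01-01T06:00:00Z",
      "createdBy": "jp1admin",
      "comment": "Temporary maintenance window"
    }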
(4) Communication function
(a) Communication protocols and authentication methods
The following shows the communication protocols and authentication methods used by integrated agent.
|
Connection source |
Connect to |
Protocol |
Authentication method |
|---|---|---|---|
|
Prometheus server |
JP1/IM agent control base |
HTTP |
No authentication |
|
Alertmanager |
|||
|
Prometheus server |
Alertmanager |
HTTP |
No authentication |
|
Exporter |
|||
|
Blackbox exporter |
Monitored target |
HTTP/HTTPS |
Basic Authentication |
|
Basic Authentication |
|||
|
No authentication |
|||
|
HTTPS |
Server Authentication |
||
|
With client authentication |
|||
|
No client authentication |
|||
|
ICMP#1 |
No authentication |
||
|
Yet another cloudwatch exporter |
Amazon CloudWatch |
HTTPS |
AWS IAM Authentication |
|
Promitor Scraper |
Azure Monitor |
HTTPS |
No client authentication |
|
Promitor Resource Discovery |
Azure Resource Graph |
HTTPS |
No client authentication |
|
Promitor Scraper |
Promitor Resource Discovery |
HTTP |
No authentication |
|
Prometheus |
Fluentd |
HTTP |
No authentication |
|
OracleDB exporter |
Oracle listener |
Oracle listener-specific (no encryption) |
Authentication by username/password |
|
Web scenario execution function |
Browser that invokes Web scenario-execution feature |
Chrome devtools protocol (CDP) |
No authentication |
|
Web Scenario Execute Function/Browser from which Web Scenario Execute Function starts |
Monitored server |
|
|
|
VMware exporter |
VMware ESXi |
Without SSL/TLS connection |
Authentication by Username and Password |
|
With SSL/TLS connection#4 |
|
||
|
SQL exporter |
Microsoft SQL Server |
Without TLS connection |
Authentication by username and password |
|
With TLS connection#5 |
Authentication by username and password |
- #1
-
ICMPv6 is not available.
- #2
-
The specific protocol depends on the target.
- #3
-
See Configuring authentication in 1.21.2(13)(a) Setting up JP1/IM - Agent in the JP1/Integrated Management 3 - Manager Configuration Guide.
- #4
-
Only TLS 1.1 and TLS 1.2 connections can be used.
- #5
-
You must enable the option for TLS communication with Microsoft SQL Server in the connection information of the monitoring target set in the SQL exporter configuration file (jpc_sql_exporter.yml). For details, see SQL exporter configuration file (jpc_sql_exporter.yml) in Chapter 2. Definition Files in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
(b) Network configuration
Integrated agent can be used in an IPv4-only network configuration or in a network configuration that mixes IPv4 and IPv6 environments. In a mixed IPv4 and IPv6 network configuration, only IPv4 communication is supported.
You can use integrated agent in the following configurations without a proxy server:
|
Connection source |
Connect to |
Connection type |
|---|---|---|
|
Prometheus server |
JP1/IM agent control base |
No proxy server |
|
Alertmanager |
||
|
Prometheus server |
Alertmanager |
|
|
Exporter |
||
|
Blackbox exporter |
Monitoring targets (ICMP monitoring) |
|
|
Monitoring targets (HTTP monitoring) |
|
|
|
Yet another cloudwatch exporter |
Amazon CloudWatch |
|
|
Promitor Scraper |
Azure Monitor |
|
|
Promitor Resource Discovery |
Azure Resource Graph |
|
|
OracleDB exporter |
Oracle listener |
No proxy server |
|
Web Scenario Execute Function/Browser from which Web Scenario Execute Function starts |
Monitored server |
|
|
VMware exporter |
VMware ESXi |
No proxy server |
|
SQL exporter |
Microsoft SQL Server |
No proxy server |
Integrated agent transmits the following:
|
Connection source |
Connect to |
Transmitted data |
Authentication method |
|---|---|---|---|
|
Prometheus server |
JP1/IM agent control base |
Performance data in Protobuf format |
|
|
Alertmanager |
Alert information in JSON format#1 |
||
|
Prometheus server |
Exporter |
None |
|
|
Exporter |
Prometheus server |
Prometheus textual performance data#2 |
|
|
Blackbox exporter |
Monitored target |
Response for each protocol |
|
|
Yet another cloudwatch exporter |
Amazon CloudWatch |
CloudWatch data |
|
|
Promitor Scraper |
Azure Monitor |
Azure Monitor data (metrics information) |
|
|
Promitor Resource Discovery |
Azure Resource Graph |
Azure Resource Graph data (resources exploration results) |
|
|
OracleDB exporter |
Oracle listener |
Proprietary Oracle listener data |
|
|
Web scenario execution function |
Browser that invokes Web scenario-execution feature |
Browser operation data |
|
|
Web Scenario Execute Function/Browser from which Web Scenario Execute Function starts |
Monitored server |
Data that depends on the target |
|
|
Monitored server |
Web Scenario Execute Function/Browser from which Web Scenario Execute Function starts |
Data that depends on the target |
|
|
VMware exporter |
VMware ESXi |
VMware ESXi information |
|
|
VMware ESXi |
VMware exporter |
||
|
SQL exporter |
Microsoft SQL Server |
None |
|
|
Microsoft SQL Server |
SQL exporter |
Result of executing SQL statement |
|
- #1
-
For details, see the description of the message body for the request in 5.6.5 JP1 Event converter in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
- #2
-
For details, see the description of Prometheus text formatting in 5.24 API for scrape of Exporter used by JP1/IM - Agent in the JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.