12.5.2 Performance monitoring capabilities
(1) Communication capability
(a) Communication protocol
The following table lists the communication protocols used by the IM Exporter add-on programs.

| Connected from | Connected to | Protocol | Authentication method |
|---|---|---|---|
| Yet another cloudwatch exporter | Amazon CloudWatch | See 9.5.3(1)(a) Communication protocols and authentication methods of JP1/IM - Agent. | |
| Promitor Scraper | Azure Monitor | HTTPS | No client authentication |
| Promitor Resource Discovery | Azure Resource Graph | HTTPS | No client authentication |
| Promitor Scraper | Promitor Resource Discovery | HTTP | No authentication |
| Prometheus | Fluentd | HTTP | No authentication |
(b) Network configuration
The environments where the IM Exporter add-on programs are available follow the standards for JP1/IM. The following table shows the proxy configurations that are available.

| Connected from | Connected to | Available proxy configuration |
|---|---|---|
| Yet another cloudwatch exporter | Amazon CloudWatch | See 9.5.3(1)(b) Network configuration of JP1/IM - Agent. |
| Promitor Scraper | Azure Monitor | |
| Promitor Resource Discovery | Azure Resource Graph | |
The following table shows what data is transmitted by the IM Exporter add-on programs.

| Connected from | Connected to | Data to be transmitted | Authentication method |
|---|---|---|---|
| Yet another cloudwatch exporter | Amazon CloudWatch | See 9.5.3(1)(b) Network configuration of JP1/IM - Agent. | |
| Promitor Scraper | Azure Monitor | Azure Monitor data (metrics information) | |
| Promitor Resource Discovery | Azure Resource Graph | Azure Resource Graph data (resource exploration results) | |
(2) Performance data collection capabilities
With these capabilities, Prometheus server collects performance data from monitoring targets. There are two capabilities available as follows:
- Scraping (Prometheus server)
- Operating data collection from monitoring targets (Exporters)
For details, see 9.5.3(2) Performance data collection function of JP1/IM - Agent.
(a) Scraping capability
Scraping is defined on a scraping job basis. In JP1/IM - Agent, scraping jobs with names that correspond to the types of Exporters are defined by default.
If a discovery configuration file is used for UAP monitoring, you must define the corresponding scraping jobs yourself. In addition, the scraping definitions for the log metrics feature require additional settings.
For details on the scraping description of the log metrics feature, see 10.1.2(2) Setting up scraping definitions (required) of IM Exporter in the manual JP1/Integrated Management 3 - Manager Configuration Guide.
The following table lists the default scraping definition for each IM Exporter add-on program.

| Scraping job name | Scraping definition |
|---|---|
| jpc_windows | Scraping definition for Windows exporter |
| jpc_process | Scraping definition for Process exporter |
| jpc_cloudwatch | Scraping definition for Yet another cloudwatch exporter |
| jpc_promitor | Scraping definition for Promitor |
| jpc_script | Scraping definition for Script exporter |
Prometheus server scrapes targets and receives different metrics from the Exporters depending on the types of Exporters. For details, see the description of the metric definition file for each Exporter under 10. IM Exporter definition files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
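
The following is a minimal sketch of what a scraping job entry of this kind might look like in the Prometheus configuration file (jpc_prometheus_server.yml). The job name jpc_process is taken from the table above; the scrape interval, target host, and port are placeholders, not the defaults actually shipped with the product.

```yaml
# Hypothetical excerpt from jpc_prometheus_server.yml; the default definitions
# shipped with JP1/IM - Agent may differ.
scrape_configs:
  - job_name: jpc_process                     # scraping job name from the table above
    scrape_interval: 60s                      # placeholder collection interval
    static_configs:
      - targets:
          - "agent-host.example.com:9999"     # placeholder Process exporter address and port
```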
(b) Operating data collection from monitoring targets
The following describes the capabilities of the IM Exporter add-on programs, which collect operating information (performance data) from monitoring targets.
(c) Windows exporter
Windows exporter, built into a monitored Windows host, collects operating information from that host. For details, see 9.5.3(2) Performance data collection function of JP1/IM - Agent.
In addition to the capabilities of the Windows exporter provided with JP1/IM - Agent, IM Exporter can collect operating information about processes. The process collector is enabled by default among the available collectors.
■ Key metric items
The key Windows exporter metric items are defined in the Windows exporter metric definition file (initial status). For details, see the description of Windows exporter metric definition file (metrics_windows_exporter.conf) of JP1/IM - Agent in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
In IM Exporter, the metric items listed in the following table can be added to the metric definition file and specified in the PromQL statements used within it.

| Metric name | Collector | Data to be obtained | Label |
|---|---|---|---|
| windows_process_start_time | process | Time of process start | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID |
| windows_process_cpu_time_total | process | Returns elapsed time that all of the threads of this process used the processor to execute instructions by mode (privileged, user). An instruction is the basic unit of execution in a computer, a thread is the object that executes instructions, and a process is the object created when a program is run. Code executed to handle some hardware interrupts and trap conditions is included in this count. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID mode: mode (privileged or user) |
| windows_process_io_bytes_total | process | Bytes issued to I/O operations in different modes (read, write, other). This property counts all I/O activity generated by the process to include file, network, and device I/Os. Read and write mode includes data operations; other mode includes those that do not involve data, such as control operations. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID mode: mode (read, write, or other) |
| windows_process_io_operations_total | process | I/O operations issued in different modes (read, write, other). This property counts all I/O activity generated by the process to include file, network, and device I/Os. Read and write mode includes data operations; other mode includes those that do not involve data, such as control operations. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID mode: mode (read, write, or other) |
| windows_process_page_faults_total | process | Page faults by the threads executing in this process. A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. This does not necessarily cause the page to be fetched from disk: the page might be on the standby list and hence already in main memory, or it might be in use by another process with which the page is shared. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID |
| windows_process_page_file_bytes | process | Current number of bytes this process has used in the paging file(s). Paging files are used to store pages of memory used by the process that are not contained in other files. Paging files are shared by all processes, and lack of space in paging files can prevent other processes from allocating memory. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID |
| windows_process_pool_bytes | process | Pool Bytes is the last observed number of bytes in the paged or nonpaged pool. The nonpaged pool is an area of system memory (physical memory used by the operating system) for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated. The paged pool is an area of system memory (physical memory used by the operating system) for objects that can be written to disk when they are not being used. Nonpaged pool bytes is calculated differently than paged pool bytes, so it might not equal the total of paged pool bytes. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID pool: paged or nonpaged |
| windows_process_priority_base | process | | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID |
| windows_process_private_bytes | process | Current number of bytes this process has allocated that cannot be shared with other processes. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID |
| windows_process_virtual_bytes | process | Current size, in bytes, of the virtual address space that the process is using. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. Virtual space is finite and, by using too much, the process can limit its ability to load libraries. | instance: instance-identifier-string job: job-name process: process-name process_id: process-ID creating_process_id: creator-process-ID |
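
As a concrete illustration, the following PromQL expressions show how metrics from this table might be used. They are hedged examples only, not the definitions actually shipped in metrics_windows_exporter.conf; the 2m range and the process name are placeholders.

```promql
# Per-process CPU usage in user mode over the last 2 minutes (placeholder range and labels)
rate(windows_process_cpu_time_total{mode="user"}[2m]) * 100

# Private bytes of a specific process; the process name is a placeholder
windows_process_private_bytes{process="sample_app"}
```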
■ Comparison with key performance data that can be collected by JP1/PFM - Agent for Platform
The following table shows whether Windows exporter can collect key performance data that can be collected by JP1/PFM - Agent for Platform as metrics, in comparison with the records JP1/PFM - Agent for Platform uses for collection.

| Record name (Record ID) | Information stored in the record | Record is based on | What can be collected | What cannot be collected |
|---|---|---|---|---|
| Process Detail (PD) | Performance data that shows the state of paging, memory usage, and time usage of one process at a point in time. | Process ID | This corresponds to where a node is created on a process_id basis. | |
| Process Detail Interval (PDI) | Performance data that shows the state of paging, memory usage, and time usage of one process at a point in time. | Process ID | The metrics to be collected are all included in PD. The metrics that are determined through the calculation of the average or frequency can be obtained by calculating using the start time of the process, not the collection interval. | -- |
| Process End Detail (PD_PEND) | Performance data that shows the state after the process ends. | Process ID | -- | The information of any ended process cannot be collected. |
| Workgroup Summary (PI_WGRP) | Performance data obtained by summarizing a record stored in the Process Detail (PD) record at a point in time on a workgroup basis. | Workgroup | A workgroup is a JP1/PFM-specific unit, which cannot be collected. | -- |
| Application Process Interval (PD_APSI) | Performance data that shows the state of a process for which process monitoring has been configured, at a point in time. | Process ID | A given unit cannot be specified. The metrics to be collected are all included in APS. | -- |
| Application Process Overview (PD_APS) | Performance data that shows the state of a process at a point in time. | Process ID | A given unit cannot be specified, but this corresponds to where a node is created on a process basis. | |
Legend:
--: Not applicable
(d) Process exporter
Process exporter, built into a monitored Linux host, collects operating information of processes running on that host.
Installed in the same host as Prometheus server, Process exporter collects operating information of the processes from the Linux OS on the host when triggered by scraping requests from Prometheus server, and returns it to the server.
Process exporter allows you to collect, from within the host, process-related operating information that cannot be obtained by monitoring from outside the host (for example, synthetic monitoring with URLs or CloudWatch).
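
A minimal sketch of a process-grouping definition follows, using the matching-template notation of the OSS process-exporter (for example {{.Comm}}); the grouping rule shown here is an assumption for illustration, and the configuration file name and defaults used by the product may differ.

```yaml
# Hypothetical grouping definition in the process-exporter configuration file.
process_names:
  - name: "{{.Comm}}"      # group processes by executable name; becomes the groupname label
    cmdline:
      - ".+"               # match all processes
```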
■ Key metric items
The key Process exporter metric items are defined in the Process exporter metric definition file (initial status). For details, see Process exporter metric definition file (metrics_process_exporter.conf) of 10. IM Exporter definition files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add more metric items to the metric definition file. The following table shows the metrics you can specify with PromQL statements used within the definition file.

| Metric name | Data to be obtained | Label |
|---|---|---|
| namedprocess_namegroup_num_procs | Number of processes in this group. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_cpu_seconds_total | CPU usage based on /proc/[pid]/stat fields utime(14) and stime(15), that is, user and system time. | instance: instance-identifier-string job: job-name groupname: group-name mode: user or system |
| namedprocess_namegroup_read_bytes_total | Bytes read based on /proc/[pid]/io field read_bytes. Because /proc/[pid]/io is readable only by the process's user, Process exporter must run either as that user or as root to obtain these values; otherwise the values cannot be read and the metric is constantly 0. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_write_bytes_total | Bytes written based on /proc/[pid]/io field write_bytes. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_major_page_faults_total | Number of major page faults based on /proc/[pid]/stat field majflt(12). | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_minor_page_faults_total | Number of minor page faults based on /proc/[pid]/stat field minflt(10). | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_context_switches_total | Number of context switches based on /proc/[pid]/status fields voluntary_ctxt_switches and nonvoluntary_ctxt_switches. The extra label ctxswitchtype can have two values: voluntary and nonvoluntary. | instance: instance-identifier-string job: job-name groupname: group-name ctxswitchtype: voluntary or nonvoluntary |
| namedprocess_namegroup_memory_bytes | Number of bytes of memory used. The extra label memtype can have three values: resident, virtual, and swapped. If gathering of the smaps file is enabled, two additional values for memtype are added: proportionalResident (sum of Pss fields from /proc/[pid]/smaps) and proportionalSwapped (sum of SwapPss fields from /proc/[pid]/smaps). | instance: instance-identifier-string job: job-name groupname: group-name memtype: resident, virtual, swapped, proportionalResident, or proportionalSwapped |
| namedprocess_namegroup_open_filedesc | Number of file descriptors, based on counting how many entries are in the directory /proc/[pid]/fd. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_worst_fd_ratio | Worst ratio of open filedescs to filedesc limit, amongst all the procs in the group. The limit is the fd soft limit based on /proc/[pid]/limits. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_oldest_start_time_seconds | Epoch time (seconds since 1970/1/1) at which the oldest process in the group started. This is derived from field starttime(22) from /proc/[pid]/stat, added to boot time to make it relative to epoch. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_num_threads | Sum of the number of threads of all processes in the group. Based on field num_threads(20) from /proc/[pid]/stat. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_states | Number of threads in the group in each of various states, based on the field state(3) from /proc/[pid]/stat. The extra label state can have these values: Running, Sleeping, Waiting, Zombie, Other. | instance: instance-identifier-string job: job-name groupname: group-name state: Running, Sleeping, Waiting, Zombie, or Other |
| namedprocess_namegroup_thread_count | Number of threads in this thread subgroup. | instance: instance-identifier-string job: job-name groupname: group-name threadname: thread-name |
| namedprocess_namegroup_thread_cpu_seconds_total | Same as cpu_user_seconds_total and cpu_system_seconds_total, but broken down per thread subgroup. | instance: instance-identifier-string job: job-name groupname: group-name threadname: thread-name mode: user or system |
| namedprocess_namegroup_thread_io_bytes_total | Same as read_bytes_total and write_bytes_total, but broken down per thread subgroup. Unlike read_bytes_total/write_bytes_total, the label iomode is used to distinguish between read and write bytes. | instance: instance-identifier-string job: job-name groupname: group-name threadname: thread-name iomode: read or write |
| namedprocess_namegroup_thread_major_page_faults_total | Same as major_page_faults_total, but broken down per thread subgroup. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_thread_minor_page_faults_total | Same as minor_page_faults_total, but broken down per thread subgroup. | instance: instance-identifier-string job: job-name groupname: group-name |
| namedprocess_namegroup_thread_context_switches_total | Same as context_switches_total, but broken down per thread subgroup. | instance: instance-identifier-string job: job-name groupname: group-name |
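
For example, a PromQL expression such as the following could be written against these metrics; the group name jpc_sample is a placeholder, not a value defined by the product.

```promql
# Number of running processes in the group; jpc_sample is a placeholder group name
namedprocess_namegroup_num_procs{groupname="jpc_sample"}
```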
- Important
  - Processes whose names contain multi-byte characters cannot be monitored.
  - Process exporter continues to output information about processes that it has collected once, even after those processes stop running. Therefore, if Process exporter is configured to collect information on a PID basis, new time-series data is added every time a process is restarted and its PID changes, resulting in a large amount of unnecessary data.
    Furthermore, because the use of PIDs is not recommended in the open source software (OSS), version 13-00 is configured by default not to include PID information in groupname. If you want to manage processes that run with the same command line separately, we recommend operational measures such as changing the order of arguments or using PIDs (in the latter case, periodic restarts are needed to prevent the collected information from accumulating indefinitely).
    Note that the information collected by Windows exporter differs from what Process exporter collects, because Windows exporter does collect PID information. (If you want to exclude the PIDs from the collected information, use drop in the scraping definition of the Prometheus configuration file (jpc_prometheus_server.yml) to exclude them; a sketch follows this note.)
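
The following is a minimal sketch of how such an exclusion might look in the scraping definition of jpc_prometheus_server.yml. The note above only says to use drop in the scraping definition; whether to drop the PID labels (labeldrop, shown here) or entire series (drop) depends on your purpose, and the label names are assumptions based on the Windows exporter table above.

```yaml
# Hypothetical excerpt; verify the actual label names and policy for your environment.
scrape_configs:
  - job_name: jpc_windows              # other settings of this job are omitted
    metric_relabel_configs:
      - action: labeldrop              # remove PID-related labels after scraping
        regex: process_id|creating_process_id
```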
■ Comparison with key performance data that can be collected by JP1/PFM - Agent for Platform
The following table shows whether Process exporter can collect key performance data that can be collected by JP1/PFM - Agent for Platform as metrics, in comparison with the records JP1/PFM - Agent for Platform uses for collection.

| Record name (Record ID) | Information stored in the record | Record is based on | What can be collected | What cannot be collected |
|---|---|---|---|---|
| Process Detail (PD) | Performance data that shows the state of a process at a point in time. | Process ID | The data can be collected on a process ID basis if groupname is specified such that it contains {{.PID}}. It also corresponds to cases where a node is created on a process ID basis. | |
| Process Detail Interval (PDI) | Performance data of a process over a certain unit of time. | Process ID | The metrics to be collected are all included in PD. The metrics that are determined through the calculation of the average or frequency can be obtained by calculating using the start time of the process, not the collection interval. | -- |
| Process Summary (PD_PDS) | Performance data obtained by summarizing data stored in the Process Detail (PD) record at a point in time. | System | This can be aggregated on an instance (host) basis. | |
| Program Summary (PD_PGM) | Performance data obtained by summarizing data stored in the Process Detail (PD) record at a point in time on a program basis. | Program | The data can be collected on a program basis if groupname is specified based on a program (that is, use {{.ExeBase}} or {{.ExeFull}}). | -- |
| Terminal Summary (PD_TERM) | Performance data obtained by summarizing data stored in the Process Detail (PD) record at a point in time on a terminal basis. | Terminal | -- | The data cannot be aggregated on a terminal basis because the terminal information cannot be collected. |
| User Summary (PD_USER) | Performance data obtained by summarizing data stored in the Process Detail (PD) record at a point in time on a user basis. | User ID | The data can be aggregated on a user basis by putting data having the same user name together with {{.Username}} contained in groupname. | |
| Workgroup Summary (PI_WGRP) | Performance data obtained by summarizing data stored in the Process Detail (PD) record at a point in time on a workgroup basis. | Workgroup | A workgroup is a JP1/PFM-specific unit, which cannot be collected. | -- |
| Application Process Interval (PD_APSI) | Performance data that shows the state of a process for which process monitoring has been configured, at a point in time. | Process ID | All the metrics to be collected, except for ApplicationName (which nearly corresponds to groupname of Process exporter), are included in APS. The metrics that are determined through the calculation of the average or frequency can be obtained by calculating using the start time of the process, not the collection interval. | -- |
| Application Process Overview (PD_APS) | Performance data of processor usage over a certain unit of time. | Process ID | This corresponds to cases where a node is created on a groupname basis. The metrics for each process (process-ID-based) are the same as PD. | Same as PD. |
Legend:
--: Not applicable
(e) Yet another cloudwatch exporter
Yet another cloudwatch exporter collects operating information about AWS services running in the cloud environment through Amazon CloudWatch. For details, see the description in 9.5.3(2) Performance data collection function of JP1/IM - Agent.
■ Key metric items
The key metric items of Yet another cloudwatch exporter are defined in the Yet another cloudwatch exporter metric definition file (initial status). For details, see the description under Yet another cloudwatch exporter metric definition file (metrics_ya_cloudwatch_exporter.conf) of JP1/IM - Agent definition files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ CloudWatch metrics you can collect
In addition to the AWS namespaces that Yet another cloudwatch exporter of JP1/IM - Agent supports as monitoring targets, IM Exporter can collect metrics for the AWS namespaces listed in the following table.

| AWS namespace | Metric category name on CloudWatch# | Dimension |
|---|---|---|
| AWS/EBS | Per-volume metrics | VolumeId |
| AWS/ECS | ClusterName, ServiceName | ClusterName, ServiceName |
| AWS/EFS | File system metrics | FileSystemId |
| AWS/EFS | File system storage metrics | FileSystemId, StorageClass |
| AWS/FSx | File system metrics | FileSystemId |
| AWS/RDS | Per-database metrics | DBInstanceIdentifier |
| AWS/RDS | DBClusterIdentifier | DBClusterIdentifier |
| AWS/SNS | Topic metrics | TopicName |
#: The name of a class into which metrics are categorized by dimension in Amazon CloudWatch. You can view these classes on the CloudWatch website.
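
As a hedged illustration only: assuming Yet another cloudwatch exporter's usual naming convention (aws_namespace_metricname_statistic, in lowercase with underscores), a metric from the AWS/EBS namespace might be queried as follows. The metric name is an assumption for illustration and is not taken from this manual; verify the name actually exposed in your environment.

```promql
# Hypothetical example; verify the actual metric name exported in your environment
sum(aws_ebs_volume_read_bytes_sum)
```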
(f) Promitor
Promitor, included in the integrated agent, collects operating information about Azure services in the cloud environment through Azure Monitor and Azure Resource Graph.
Promitor consists of Promitor Scraper and Promitor Resource Discovery. Promitor Scraper collects metrics on resources from Azure Monitor according to schedule settings and returns them.
Metrics can be collected from target resources in two ways: one method is to specify the target resources separately in a configuration file and the other is to detect the resources automatically. If you choose to detect them automatically, Promitor Resource Discovery detects resources in a tenant through Azure Resource Graph, and based on the results, Promitor Scraper collects metric information.
In addition, Promitor Scraper and Promitor Resource Discovery each require two configuration files: one defines runtime settings, such as authentication information, and the other defines the metric information to be collected.
■ Key metric items
The key Promitor metric items are defined in the Promitor metric definition file (initial status). For details, see the description under Promitor metric definition file (metrics_promitor.conf) of 10. IM Exporter definition files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Metrics you can collect
Promitor can collect metrics for the monitoring target services listed in the table below.
You specify the metrics you want to collect in the Promitor Scraper configuration file (metrics-declaration.yaml).
If you want to change the metrics specified in the Promitor Scraper configuration file, see Change monitoring metrics (optional) under 10.1.2(6) Setting up Promitor (d) Configuring scraping targets (required) in the manual JP1/Integrated Management 3 - Manager Configuration Guide.
You can also add new metrics to the Promitor metric definition file, based on the metrics specified in the Promitor Scraper configuration file. Metrics defined in the Promitor Scraper configuration file can be specified in the PromQL statements written in the definition file. (A configuration sketch follows the table below.)

| Promitor resourceType name | Azure Monitor namespace | Automatic discovery support |
|---|---|---|
| VirtualMachine | Microsoft.Compute/virtualMachines | Y |
| FunctionApp | Microsoft.Web/sites | Y |
| ContainerInstance | Microsoft.ContainerInstance/containerGroups | -- |
| KubernetesService | Microsoft.ContainerService/managedClusters | Y |
| FileStorage | Microsoft.Storage/storageAccounts/fileServices | -- |
| BlobStorage | Microsoft.Storage/storageAccounts/blobServices | -- |
| ServiceBusNamespace | Microsoft.ServiceBus/namespaces | Y |
| CosmosDb | Microsoft.DocumentDB/databaseAccounts | Y |
| SqlDatabase | Microsoft.Sql/servers/databases | Y |
| SqlServer | Microsoft.Sql/servers/databases, Microsoft.Sql/servers/elasticPools | -- |
| SqlManagedInstance | Microsoft.Sql/managedInstances | Y |
| SqlElasticPool | Microsoft.Sql/servers/elasticPools | Y |
| LogicApp | Microsoft.Logic/workflows | Y |
Legend:
Y: Automatic discovery is supported.
--: Automatic discovery is not supported.
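
The following is a minimal sketch of a metric declaration in the Promitor Scraper configuration file (metrics-declaration.yaml), based on the general Promitor declaration format. The metric name, tenant, subscription, resource group, and resource names are placeholders, and the file actually shipped with the product may structure these settings differently. The name declared here (azure_virtual_machine_percentage_cpu in this sketch) is what could then be referenced in the PromQL statements of the Promitor metric definition file.

```yaml
# Hypothetical metrics-declaration.yaml excerpt; all IDs and names are placeholders.
azureMetadata:
  tenantId: 00000000-0000-0000-0000-000000000000
  subscriptionId: 00000000-0000-0000-0000-000000000000
  resourceGroupName: sample-resource-group
metricDefaults:
  scraping:
    schedule: "0 * * ? * *"                          # placeholder collection schedule
metrics:
  - name: azure_virtual_machine_percentage_cpu        # Prometheus metric name, usable in PromQL
    description: "Average CPU utilization of a virtual machine"
    resourceType: VirtualMachine                      # resourceType name from the table above
    azureMetricConfiguration:
      metricName: Percentage CPU                      # Azure Monitor metric to collect
      aggregation:
        type: Average
    resources:
      - virtualMachineName: sample-vm                 # placeholder resource
```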
■ Checking how Azure SDKs used by Promitor are supported
Promitor employs the Azure SDK for .NET. The end of support for an Azure SDK is announced 12 months in advance. For details on the Azure SDK lifecycle, see the Lifecycle FAQ at the following website:
https://learn.microsoft.com/ja-jp/lifecycle/faq/azure#azure-sdk-----------
You can find the lifecycles of individual Azure SDK library versions at the following website:
https://azure.github.io/azure-sdk/releases/latest/all/dotnet.html
■ Credentials required for account information
Promitor can connect to Azure by using either the service principal method or the managed identity method. For details on the credentials assigned to the service principal or managed identity, see (a) Configuring the settings for establishing a connection to Azure (required) under 10.1.2(6) Setting up Promitor in the manual JP1/Integrated Management 3 - Manager Configuration Guide.
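
As a rough sketch, the service principal method might appear in a Promitor runtime configuration as follows. The key names follow the general Promitor runtime format and are assumptions; the file name, layout, and the way the secret is supplied in the product may differ.

```yaml
# Hypothetical Promitor runtime configuration excerpt.
authentication:
  mode: ServicePrincipal                              # a managed identity mode can be used instead
  identityId: 00000000-0000-0000-0000-000000000000    # placeholder application (client) ID
```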
(g) Container monitoring
Container environment monitoring uses different methods to collect operating information depending on monitoring targets, as listed in the following table.

| Monitoring target | How to collect operating information |
|---|---|
| Red Hat OpenShift | User-specific Prometheus |
| Kubernetes | User-specific Prometheus |
| Amazon Elastic Kubernetes Service (EKS) | User-specific Prometheus |
| Azure Kubernetes Service (AKS) | Azure's monitoring feature (Promitor) |
The following describes how operating information is collected for each monitoring target.
(h) Red Hat OpenShift
In Red Hat OpenShift, Prometheus as a default monitoring component collects operating information from scraping targets (kube-state-metrics, node_exporter, and kubelet) and sends the information to JP1/IM - Manager.
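
The following is a minimal sketch of how a user-specific Prometheus might forward the collected information, using Prometheus's standard remote_write mechanism. The URL is a placeholder; the actual endpoint, port, and authentication settings for sending data to JP1/IM - Manager are described in the configuration guide, not here.

```yaml
# Hypothetical remote_write excerpt for a user-specific Prometheus.
remote_write:
  - url: "https://jp1im-manager.example.com:<port>/<remote-write-endpoint>"   # placeholder
```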
■ Key metric items
The key metric items for Red Hat OpenShift are defined in the metric definition file (initial status) for each scraping target for container monitoring, as shown in the table below. For details, see the description on each metric definition file in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference (2. Definition Files or 10. IM Exporter Definition Files).

| Scraping target | Metric definition file |
|---|---|
| kube-state-metrics | Container monitoring metric definition file (metrics_kubernetes.conf) |
| node_exporter | Node exporter metric definition file (metrics_node_exporter.conf) |
| kubelet | Container monitoring metric definition file (metrics_kubernetes.conf) |
You can add more metric items to the metric definition file. The following table shows the metrics you can specify with PromQL statements used within the definition file.
- When kube-state-metrics is to be scraped

| Metric name | Data to be obtained | Label |
|---|---|---|
| kube_cronjob_info | Info about cronjob. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace schedule=schedule concurrency_policy=concurrency-policy |
| kube_cronjob_labels | Kubernetes labels converted to Prometheus labels. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace label_CRONJOB_LABEL=CRONJOB_LABEL |
| kube_cronjob_created | Unix creation timestamp | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace |
| kube_cronjob_next_schedule_time | Next time the cronjob should be scheduled. The time after lastScheduleTime, or after the cron job's creation time if it's never been scheduled. Use this to determine if the job is delayed. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace |
| kube_cronjob_status_active | Active holds pointers to currently running jobs. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace |
| kube_cronjob_status_last_schedule_time | LastScheduleTime keeps information of when was the last time the job was successfully scheduled. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace |
| kube_cronjob_spec_suspend | Suspend flag tells the controller to suspend subsequent executions. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace |
| kube_cronjob_spec_starting_deadline_seconds | Deadline in seconds for starting the job if it misses scheduled time for any reason. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace |
| kube_cronjob_metadata_resource_version | Resource version representing a specific version of the cronjob. | instance: instance-identifier-string job: job-name cronjob: cronjob-name namespace=cronjob-namespace |
| kube_daemonset_created | Unix creation timestamp | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_current_number_scheduled | The number of nodes running at least one daemon pod and are supposed to. | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_desired_number_scheduled | The number of nodes that should be running the daemon pod. | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_number_available | The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and available | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_number_misscheduled | The number of nodes running a daemon pod but are not supposed to. | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_number_ready | The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready. | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_number_unavailable | The number of nodes that should be running the daemon pod and have none of the daemon pod running and available | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_observed_generation | The most recent generation observed by the daemon set controller. | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_status_updated_number_scheduled | The total number of nodes that are running updated daemon pod | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_metadata_generation | Sequence number representing a specific generation of the desired state. | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace |
| kube_daemonset_labels | Kubernetes labels converted to Prometheus labels. | instance: instance-identifier-string job: job-name daemonset=daemonset-name namespace=daemonset-namespace label_DAEMONSET_LABEL=DAEMONSET_LABEL |
| kube_deployment_status_replicas | The number of replicas per deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_status_replicas_ready | The number of ready replicas per deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_status_replicas_available | The number of available replicas per deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_status_replicas_unavailable | The number of unavailable replicas per deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_status_replicas_updated | The number of updated replicas per deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_status_observed_generation | The generation observed by the deployment controller. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_status_condition | The current status conditions of a deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace condition=deployment-condition status=true\|false\|unknown |
| kube_deployment_spec_replicas | Number of desired pods for a deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_spec_paused | Whether the deployment is paused and will not be processed by the deployment controller. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_spec_strategy_rollingupdate_max_unavailable | Maximum number of unavailable replicas during a rolling update of a deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_spec_strategy_rollingupdate_max_surge | Maximum number of replicas that can be scheduled above the desired number of replicas during a rolling update of a deployment. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_metadata_generation | Sequence number representing a specific generation of the desired state. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_deployment_labels | Kubernetes labels converted to Prometheus labels. | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace label_DEPLOYMENT_LABEL=DEPLOYMENT_LABEL |
| kube_deployment_created | Unix creation timestamp | instance: instance-identifier-string job: job-name deployment=deployment-name namespace=deployment-namespace |
| kube_job_info | Information about job. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_labels | Kubernetes labels converted to Prometheus labels. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace label_JOB_LABEL=JOB_LABEL |
| kube_job_owner | Information about the Job's owner. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace owner_kind=owner kind owner_name=owner name owner_is_controller=whether owner is controller |
| kube_job_spec_parallelism | The maximum desired number of pods the job should run at any given time. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_spec_completions | The desired number of successfully finished pods the job should be run with. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_spec_active_deadline_seconds | The duration in seconds relative to the startTime that the job may be active before the system tries to terminate it. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_status_active | The number of actively running pods. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_status_succeeded | The number of pods which reached Phase Succeeded. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_status_failed | The number of pods which reached Phase Failed. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace reason=failure reason |
| kube_job_status_start_time | StartTime represents time when the job was acknowledged by the Job Manager. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_status_completion_time | CompletionTime represents time when the job was completed. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_job_complete | The job has completed its execution. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace condition=true\|false\|unknown |
| kube_job_failed | The job has failed its execution. | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace condition=true\|false\|unknown |
| kube_job_created | Unix creation timestamp | instance: instance-identifier-string job: job-name job_name=job-name namespace=job-namespace |
| kube_replicaset_status_replicas | The number of replicas per ReplicaSet. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace |
| kube_replicaset_status_fully_labeled_replicas | The number of fully labeled replicas per ReplicaSet. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace |
| kube_replicaset_status_ready_replicas | The number of ready replicas per ReplicaSet. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace |
| kube_replicaset_status_observed_generation | The generation observed by the ReplicaSet controller. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace |
| kube_replicaset_spec_replicas | Number of desired pods for a ReplicaSet. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace |
| kube_replicaset_metadata_generation | Sequence number representing a specific generation of the desired state. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace |
| kube_replicaset_labels | Kubernetes labels converted to Prometheus labels. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace label_REPLICASET_LABEL=REPLICASET_LABEL |
| kube_replicaset_created | Unix creation timestamp | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace |
| kube_replicaset_owner | Information about the ReplicaSet's owner. | instance: instance-identifier-string job: job-name replicaset=replicaset-name namespace=replicaset-namespace owner_kind=owner kind owner_name=owner name owner_is_controller=whether owner is controller |
| kube_statefulset_status_replicas | The number of replicas per StatefulSet. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_status_replicas_current | The number of current replicas per StatefulSet. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_status_replicas_ready | The number of ready replicas per StatefulSet. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_status_replicas_updated | The number of updated replicas per StatefulSet. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_status_observed_generation | The generation observed by the StatefulSet controller. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_replicas | Number of desired pods for a StatefulSet. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_metadata_generation | Sequence number representing a specific generation of the desired state for the StatefulSet. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_created | Unix creation timestamp | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace |
| kube_statefulset_labels | Kubernetes labels converted to Prometheus labels. | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace label_STATEFULSET_LABEL=STATEFULSET_LABEL |
| kube_statefulset_status_current_revision | Indicates the version of the StatefulSet used to generate Pods in the sequence [0,currentReplicas). | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace revision=statefulset-current-revision |
| kube_statefulset_status_update_revision | Indicates the version of the StatefulSet used to generate Pods in the sequence [replicas-updatedReplicas,replicas) | instance: instance-identifier-string job: job-name statefulset=statefulset-name namespace=statefulset-namespace revision=statefulset-current-revision |
| kube_namespace_created | Unix creation timestamp | instance: instance-identifier-string job: job-name namespace=namespace-name |
| kube_namespace_labels | Kubernetes labels converted to Prometheus labels | instance: instance-identifier-string job: job-name namespace=namespace-name label_NS_LABEL=NS_LABEL |
| kube_namespace_status_phase | kubernetes namespace status phase | instance: instance-identifier-string job: job-name namespace=namespace-name phase=Active\|Terminating |
| kube_node_info | Information about a cluster node | instance: instance-identifier-string job: job-name node=node-address kernel_version=kernel-version os_image=os-image-name container_runtime_version=container-runtime-and-version-combination kubelet_version=kubelet-version kubeproxy_version=kubeproxy-version pod_cidr=pod-cidr provider_id=provider-id system_uuid=system-uuid internal_ip=internal-ip |
| kube_node_labels | Kubernetes labels converted to Prometheus labels | instance: instance-identifier-string job: job-name node=node-address label_NODE_LABEL=NODE_LABEL |
| kube_node_spec_unschedulable | Whether a node can schedule new pods | instance: instance-identifier-string job: job-name node=node-address |
| kube_node_spec_taint | The taint of a cluster node. | instance: instance-identifier-string job: job-name node=node-address key=taint-key value=taint-value effect=taint-effect |
| kube_node_status_capacity | The capacity for different resources of a node | instance: instance-identifier-string job: job-name node=node-address resource=resource-name unit=resource-unit |
| kube_node_status_allocatable | The allocatable for different resources of a node that are available for scheduling | instance: instance-identifier-string job: job-name node=node-address resource=resource-name unit=resource-unit |
| kube_node_status_condition | The condition of a cluster node | instance: instance-identifier-string job: job-name node=node-address condition=node-condition status=true\|false\|unknown |
| kube_node_created | Unix creation timestamp | instance: instance-identifier-string job: job-name node=node-address |
| kube_pod_info | Information about pod | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace host_ip=host-ip pod_ip=pod-ip node=node-name created_by_kind=created_by_kind created_by_name=created_by_name uid=pod-uid priority_class=priority_class host_network=host_network |
| kube_pod_start_time | Start time in unix timestamp for a pod | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace ip=pod-ip-address ip_family=4 OR 6 uid=pod-uid |
| kube_pod_completion_time | Completion time in unix timestamp for a pod | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_owner | Information about the Pod's owner | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace owner_kind=owner kind owner_name=owner name owner_is_controller=whether owner is controller uid=pod-uid |
| kube_pod_labels | Kubernetes labels converted to Prometheus labels | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace label_POD_LABEL=POD_LABEL uid=pod-uid |
| kube_pod_status_phase | The pods current phase | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace phase=Pending\|Running\|Succeeded\|Failed\|Unknown uid=pod-uid |
| kube_pod_status_ready | Describes whether the pod is ready to serve requests | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace condition=true\|false\|unknown uid=pod-uid |
| kube_pod_status_scheduled | Describes the status of the scheduling process for the pod | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace condition=true\|false\|unknown uid=pod-uid |
| kube_pod_container_info | Information about a container in a pod | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace image=image-name image_id=image-id image_spec=image-spec container_id=containerid uid=pod-uid |
| kube_pod_container_status_waiting | Describes whether the container is currently in waiting state | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_container_status_waiting_reason | Describes the reason the container is currently in waiting state | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace reason=container-waiting-reason uid=pod-uid |
| kube_pod_container_status_running | Describes whether the container is currently in running state | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_container_state_started | Start time in unix timestamp for a pod container | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_container_status_terminated | Describes whether the container is currently in terminated state | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_container_status_ready | Describes whether the containers readiness check succeeded | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_container_status_restarts_total | The number of container restarts per container (Counter) | container=container-name namespace=pod-namespace instance: instance-identifier-string job: job-name pod=pod-name uid=pod-uid |
| kube_pod_created | Unix creation timestamp | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_restart_policy | Describes the restart policy in use by this pod | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace type=Always\|Never\|OnFailure uid=pod-uid |
| kube_pod_init_container_info | Information about an init container in a pod | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace image=image-name image_id=image-id image_spec=image-spec container_id=containerid uid=pod-uid |
| kube_pod_init_container_status_waiting | Describes whether the init container is currently in waiting state | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_init_container_status_running | Describes whether the init container is currently in running state | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_init_container_status_terminated | Describes whether the init container is currently in terminated state | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_init_container_status_ready | Describes whether the init containers readiness check succeeded | instance: instance-identifier-string job: job-name container=container-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_init_container_status_restarts_total | The number of restarts for the init container | instance: instance-identifier-string job: job-name container=container-name namespace=pod-namespace pod=pod-name uid=pod-uid |
| kube_pod_spec_volumes_persistentvolumeclaims_info | Information about persistentvolumeclaim volumes in a pod | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace volume=volume-name persistentvolumeclaim=persistentvolumeclaim-claimname uid=pod-uid |
| kube_pod_spec_volumes_persistentvolumeclaims_readonly | Describes whether a persistentvolumeclaim is mounted read only | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace volume=volume-name persistentvolumeclaim=persistentvolumeclaim-claimname uid=pod-uid |
| kube_pod_status_scheduled_time | Unix timestamp when pod moved into scheduled status | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace uid=pod-uid |
| kube_pod_status_unschedulable | Describes the unschedulable status for the pod | instance: instance-identifier-string job: job-name pod=pod-name namespace=pod-namespace uid=pod-uid |
- When node_exporter is to be scraped
See Key metric items in 9.5.3(2)(d) Node exporter.
- When kubelet is to be scraped

| Metric name | Data to be obtained | Label |
|---|---|---|
| container_blkio_device_usage_total | Blkio device bytes usage | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name major: major-identifier minor: minor-identifier operation: operation (Async, Sync, Discard, Read, Write, or Total) |
| container_cpu_cfs_periods_total | Number of elapsed enforcement period intervals | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_cpu_cfs_throttled_periods_total | Number of throttled period intervals | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_cpu_cfs_throttled_seconds_total | Total time duration the container has been throttled | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_cpu_system_seconds_total | Cumulative system cpu time consumed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_cpu_usage_seconds_total | Cumulative cpu time consumed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name cpu: CPU-name |
| container_cpu_user_seconds_total | Cumulative user cpu time consumed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_fs_inodes_free | Number of available Inodes | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_inodes_total | Total number of Inodes | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_io_current | Number of I/Os currently in progress | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_io_time_seconds_total | Cumulative count of seconds spent doing I/Os | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_io_time_weighted_seconds_total | Cumulative weighted I/O time | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_limit_bytes | Number of bytes that can be consumed by the container on this filesystem | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_reads_bytes_total | Cumulative count of bytes read | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_read_seconds_total | Cumulative count of seconds spent reading | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_reads_merged_total | Cumulative count of reads merged | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_reads_total | Cumulative count of reads completed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_sector_reads_total | Cumulative count of sector reads completed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_sector_writes_total | Cumulative count of sector writes completed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_usage_bytes | Number of bytes that are consumed by the container on this filesystem | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_writes_bytes_total | Cumulative count of bytes written | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_write_seconds_total | Cumulative count of seconds spent writing | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_writes_merged_total | Cumulative count of writes merged | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_fs_writes_total | Cumulative count of writes completed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name device: device-name |
| container_memory_cache | Total page cache memory | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_memory_failcnt | Number of memory usage hits limits | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_memory_failures_total | Cumulative count of memory allocation failures | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name failure_type: cause-of-failure (pgfault or pgmajfault) scope: scope (container or hierarchy) |
| container_memory_mapped_file | Size of memory mapped files | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_memory_max_usage_bytes | Maximum memory usage recorded | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_memory_rss | Size of RSS | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_memory_swap | Container swap usage | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_memory_usage_bytes | Current memory usage, including all memory regardless of when it was accessed | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_memory_working_set_bytes | Current working set | id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
| container_spec_cpu_period | CPU period of the container | |
id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
container_spec_cpu_quota |
CPU quota of the container |
id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
container_spec_cpu_shares |
CPU share of the container |
id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
container_spec_memory_limit_bytes |
Memory limit for the container |
id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
container_spec_memory_reservation_limit_bytes |
Memory reservation limit for the container |
id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
container_spec_memory_swap_limit_bytes |
Memory swap limit for the container |
id: container-identifier name: container-name image: image-name container: container-name (defined as kubernetes) namespace: namespace pod: pod-name |
(i) Kubernetes
In Kubernetes, the user-specific Prometheus that monitors the Kubernetes environment collects operating information from scraping targets (kube-state-metrics, node_exporter, and kubelet) and sends the information to JP1/IM - Manager.
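The following is a minimal, hypothetical sketch of such a user-specific Prometheus configuration, shown only to illustrate the data flow; it is not a shipped file. The scrape target addresses and the remote_write URL are placeholders that depend on your environment and on the write endpoint provided by your JP1/IM - Manager setup.

  # Hypothetical sketch only; all addresses and the remote_write URL are placeholders.
  scrape_configs:
    - job_name: 'kube-state-metrics'
      static_configs:
        - targets: ['kube-state-metrics.kube-system.svc:8080']   # placeholder address
    - job_name: 'node_exporter'
      static_configs:
        - targets: ['node1:9100', 'node2:9100']                  # placeholder addresses
    - job_name: 'kubelet'
      scheme: https
      kubernetes_sd_configs:
        - role: node
      tls_config:
        insecure_skip_verify: true    # simplification for this sketch; configure TLS properly in practice
  remote_write:
    - url: 'http://<manager-host>:<port>/<write-endpoint>'       # placeholder URL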
The following table lists the Kubernetes components to be monitored.
Configuration component name | Monitoring target | Component name |
---|---|---|
Cluster | Y | Cluster |
Control Plane - Host | Y#1 | Node |
Control Plane - Service (such as apiserver) | -- | -- |
Worker node - Host | Y#1 | Node |
Worker node - Service (such as apiserver) | -- | -- |
Worker node - Container | -- | -- |
Namespace | Y#1 | Namespace |
Workload#2 | Y#1 | See the table in #2. |
Pod | Y | Pod |
- Legend:
  Y: Monitored, --: Not monitored
- #1
  Not supported by AKS.
- #2
  The workloads can be divided into the six types shown in the following table.
Type of workload | Component name |
---|---|
CronJob | CronJob |
Job | Job |
DaemonSet | DaemonSet |
Deployment | Deployment |
ReplicaSet | ReplicaSet |
StatefulSet | StatefulSet |
■ Key metric items
See Key metric items in 12.5.2(2)(h) Red Hat OpenShift.
(j) Amazon Elastic Kubernetes Service (EKS)
In Amazon Elastic Kubernetes Service (EKS), Prometheus or an AWS Distro for OpenTelemetry (ADOT) agent (which uses Prometheus receiver and exporter) collects information from scraping targets (kube-state-metrics, node_exporter, and kubelet) and sends the information to JP1/IM - Manager.
To collect performance data of pods when you monitor the EKS on Fargate service, you must use the ADOT agent, as shown in the following table. (A configuration sketch for the ADOT agent follows the legend below.)
Collection tool | Service to be monitored: EKS on EC2 | Service to be monitored: EKS on Fargate |
---|---|---|
Prometheus | Y | C |
ADOT agent | Y | Y |
- Legend:
  Y: The tool can collect metrics (pods' performance data can also be collected).
  C (conditional): The tool can collect metrics, but pods' performance data cannot be collected.
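As an illustration of the ADOT agent approach mentioned above, the following is a minimal sketch of an OpenTelemetry Collector configuration with a Prometheus receiver and a Prometheus remote write exporter; it is not a shipped file. The scrape targets and the endpoint URL are placeholders that depend on your environment and on the write endpoint provided by your JP1/IM - Manager setup.

  # Hypothetical sketch only; targets and the endpoint URL are placeholders.
  receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: 'kube-state-metrics'
            static_configs:
              - targets: ['kube-state-metrics.kube-system.svc:8080']   # placeholder address
          - job_name: 'kubelet'
            scheme: https
            kubernetes_sd_configs:
              - role: node
            tls_config:
              insecure_skip_verify: true    # simplification for this sketch
  exporters:
    prometheusremotewrite:
      endpoint: 'http://<manager-host>:<port>/<write-endpoint>'        # placeholder URL
  service:
    pipelines:
      metrics:
        receivers: [prometheus]
        exporters: [prometheusremotewrite]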
■ Key metric items
See Key metric items in 12.5.2(2)(h) Red Hat OpenShift.
(k) Azure Kubernetes Service (AKS)
To monitor Azure Kubernetes Service (AKS), the Azure monitoring capability (Promitor) is used to collect default AKS information. For details on Promitor, see 12.5.2(2)(f) Promitor.
■ Key metric items
The key metric items when Promitor monitors AKS are defined in the Promitor metric definition file (initial status). For details, see Promitor metric definition file (metrics_promitor.conf) of 10. IM Exporter Definition Files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add more metric items to the Promitor metric definition file. For details on the AKS-monitoring metrics you can specify with PromQL statements used within the definition file, see Metrics you can collect in 12.5.2(2)(f) Promitor.
(l) Log metrics
This capability can generate and measure log metrics from log files created by monitoring targets.
■ Key metric items
In the log metrics definition file (fluentd_any-name_logmetrics.conf), you define what values you need from the log files created by your monitoring targets. These definitions allow you to obtain quantified data (log metrics) as metric items.
For details on the log metrics definition file, see Log metrics definition file (fluentd_any-name_logmetrics.conf) of 10. IM Exporter Definition Files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
■ Sample files
The following provides descriptions of sample files for when you use the log metrics feature. If you copy the sample files, be careful of the linefeed codes. For details, see the description of each file of 2. Definition Files and 10. IM Exporter Definition Files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference. These sample files are based on the assumptions in Assumptions of the sample files. Copy each file and change the settings according to your monitoring targets.
- Assumptions of the sample files
  The sample files described here assume that HostA, a monitored host (integrated agent host), exists with JP1/IM - Agent installed on it, and that WebAppA, an application running on HostA, creates the following log file.
- ControllerLog.log
  As shown in target log message 1, a log message is written at the start of processing of a request to an HTTP endpoint in WebAppA, indicating that the endpoint was used. The log message also indicates the number of records handled for that request.
Target log message 1:
... 2022-10-19 10:00:00 [INFO] c.b.springbootlogging.LoggingController : endpoint "/register" started. Target record: 5. ...
In the sample files, a regular expression that matches target log message 1 is used, and the number of log messages that match the expression is counted. The count is then displayed in the Trends tab of the JP1/IM integrated operation viewer as log metric 1, Requests to the register Endpoint.
The definition for log metric 1 uses counter as its log metric type.
In addition, the same regular expression extracts the number indicated as Target record from target log message 1, and the extracted numbers are summed up. The total is then displayed in the Trends tab of the JP1/IM integrated operation viewer as log metric 2, Number of Registered Records.
The definition for log metric 2 uses counter as its log metric type.
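For illustration only, assuming the sample definitions shown later in this subsection, after target log message 1 has been written once, the two log metrics exposed on the Fluentd worker's /metrics endpoint might look roughly as follows (before Prometheus server attaches labels such as instance and the jp1_pc_* labels):

  # Log metric 1: counts the log messages that match the regular expression
  logmetrics_request_endpoint_register 1
  # Log metric 2: sums the extracted Target record values (5 in target log message 1)
  logmetrics_num_of_registeredrecord{loggroup="WebAppA",log="ControllerLog"} 5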
You need as many Fluentd workers (multi-process workers feature) as there are log files to be monitored. For details on the worker settings related to the log metrics feature, see the log metrics definition file (fluentd_any-name_logmetrics.conf). In this example, it is assumed that 11 Fluentd workers are running and that ControllerLog.log is monitored by the worker whose worker ID is 10.
These sample files also assume a tree structure consisting of the following IM management nodes:
All Systems
  + HostA
    + Application Server
      + WebAppA
- Target files in this example
  The target files used in this example are as follows:
  - Integrated manager host
    - User-specific metric definition file
  - Integrated agent host
    - Prometheus configuration file
    - User-specific discovery configuration file
    - Log metrics definition file
    - Fluentd log monitoring target definition file
- Sample user-specific metric definition file
  - File name: metrics_logmetrics1.conf
  - Written code
    [
      {
        "name":"logmetrics_request_endpoint_register",
        "default":true,
        "promql":"logmetrics_request_endpoint_register and $jp1im_TrendData_labels",
        "resource_en":{
          "category":"HTTP",
          "label":"request_num_of_endpoint_register",
          "description":"The request number of endpoint register",
          "unit":"request"
        },
        "resource_ja":{
          "category":"HTTP",
          "label":"Requests to the register Endpoint",
          "description":"The request number of endpoint register",
          "unit":"request"
        }
      },
      {
        "name":"logmetrics_num_of_registeredrecord",
        "default":true,
        "promql":"logmetrics_num_of_registeredrecord and $jp1im_TrendData_labels",
        "resource_en":{
          "category":"DB",
          "label":"logmetrics_num_of_registeredrecord",
          "description":"The number of registered record",
          "unit":"record"
        },
        "resource_ja":{
          "category":"DB",
          "label":"Number of Registered Records",
          "description":"The number of registered record",
          "unit":"record"
        }
      }
    ]
- Note
  The storage directory, written code, and file name follow the format of the user-specific metric definition file (metrics_any-Prometheus-trend-name.conf).
- Sample Prometheus configuration file
  - File name: jpc_prometheus_server.yml
  - Written code
    global:
      ... (omitted) ...
    scrape_configs:
      - job_name: 'LogMetrics'
        file_sd_configs:
          - files:
            - 'user/user_file_sd_config_logmetrics.yml'
        relabel_configs:
          - target_label: jp1_pc_nodelabel
            replacement: Log trapper(Fluentd)
        metric_relabel_configs:
          - target_label: jp1_pc_nodelabel
            replacement: ControllerLog
          - source_labels: ['__name__']
            regex: 'logmetrics_request_endpoint_register|logmetrics_num_of_registeredrecord'
            action: 'keep'
          - regex: (jp1_pc_multiple_node|jp1_pc_agent_create_flag)
            action: labeldrop
      ... (omitted) ...
- Note
  The storage directory and written code follow the format of the Prometheus configuration file (jpc_prometheus_server.yml). You do not have to create a new file. Instead, you add the scrape_configs section for the log metrics feature to the Prometheus configuration file (jpc_prometheus_server.yml) created during installation.
- Sample user-specific discovery configuration file
  - File name: user_file_sd_config_logmetrics.yml
  - Written code
    - targets:
        - HostA:24830
      labels:
        jp1_pc_exporter: logmetrics
        jp1_pc_category: WebAppA
        jp1_pc_trendname: logmetrics1
        jp1_pc_multiple_node: "{__name__=~'logmetrics_.*'}"
        jp1_pc_agent_create_flag: false
- Note
  The storage directory and written code follow the format of the user-specific discovery configuration file (file_sd_config_any-name.yml).
  ControllerLog.log is monitored by the worker whose Fluentd worker ID is 10. Thus, when 24820 is set for port in the Sample log metrics definition file, the port number of the worker monitoring ControllerLog.log is 24820 + 10 = 24830.
- Sample log metrics definition file
  - File name: fluentd_WebAppA_logmetrics.conf
  - Written code
    ## Input
    <worker 10>
      <source>
        @type prometheus
        bind '0.0.0.0'
        port 24820
        metrics_path /metrics
      </source>
    </worker>

    ## Extract target log message 1
    <worker 10>
      <source>
        @type tail
        @id logmetrics_counter
        path /usr/lib/WebAppA/ControllerLog/ControllerLog.log
        tag WebAppA.ControllerLog
        pos_file ../data/fluentd/tail/ControllerLog.pos
        read_from_head true
        <parse>
          @type regexp
          expression /^(?<logtime>[^\[]*) \[(?<loglevel>[^\]]*)\] (?<class>[^\[]*) : endpoint "\/register" started. Target record: (?<record_num>\d[^\[]*).$/
          time_key logtime
          time_format %Y-%m-%d %H:%M:%S
          types record_num:integer
        </parse>
      </source>

      ## Output
      ## Define log metrics 1 and 2
      <match WebAppA.ControllerLog>
        @type prometheus
        <metric>
          name logmetrics_request_endpoint_register
          type counter
          desc The request number of endpoint register
        </metric>
        <metric>
          name logmetrics_num_of_registeredrecord
          type counter
          desc The number of registered record
          key record_num
          <labels>
            loggroup ${tag_parts[0]}
            log ${tag_parts[1]}
          </labels>
        </metric>
      </match>
    </worker>
- Note
  The storage directory and written code follow the format of the log metrics definition file (fluentd_any-name_logmetrics.conf).
- Sample Fluentd log monitoring target definition file
  - File name: jpc_fluentd_common_list.conf
  - Written code
    ## [Target Settings]
    ... (omitted) ...
    @include user/fluentd_WebAppA_logmetrics.conf
- Note
  The storage directory and written code follow the format of the Fluentd log monitoring target definition file (jpc_fluentd_common_list.conf) in JP1/IM - Agent definition files. You do not have to create a new file. Instead, you add the include section for the log metrics feature to the Fluentd log monitoring target definition file (jpc_fluentd_common_list.conf) created during installation.
(m) Script exporter
Script exporter runs scripts on a host and collects their results.
Script exporter is installed on the same host as Prometheus server. When triggered by a scraping request from Prometheus server, it runs a script on that host, obtains the result, and returns the result to the server.
By developing a script that collects UAP information and converts it into metrics, and then adding the script to Script exporter, you can monitor applications that are not supported by any Exporter in the way you want.
■ Key metric items
The key Script exporter metric items are defined in the Script exporter metric definition file (initial status). For details, see Script exporter metric definition file (metrics_script_exporter.conf) of 10. IM Exporter Definition Files in the manual JP1/Integrated Management 3 - Manager Command, Definition File and API Reference.
You can add more metric items to the metric definition file. The following table shows the metrics you can specify with PromQL statements used within the definition file.
Metric name |
Data to be obtained |
Label |
---|---|---|
script_success |
Script exit status (0 = error, 1 = success) |
instance: instance-identifier-string job: job-name script: script-name |
script_duration_seconds |
Script execution time, in seconds. |
instance: instance-identifier-string job: job-name script: script-name |
script_exit_code |
The exit code of the script. |
instance: instance-identifier-string job: job-name script: script-name |
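As an illustration only, assuming that the Script exporter metric definition file uses the same JSON structure as the user-specific metric definition file shown in the log metrics samples above (the shipped file may already define similar entries), an added trend item based on script_duration_seconds might look like the following. The script name monitor_uap used in the PromQL statement is a hypothetical example.

  [
    {
      "name":"script_duration_seconds_monitor_uap",
      "default":false,
      "promql":"script_duration_seconds{script='monitor_uap'} and $jp1im_TrendData_labels",
      "resource_en":{
        "category":"script",
        "label":"monitor_uap execution time",
        "description":"Execution time of the hypothetical monitor_uap script",
        "unit":"seconds"
      },
      "resource_ja":{
        "category":"script",
        "label":"monitor_uap execution time",
        "description":"Execution time of the hypothetical monitor_uap script",
        "unit":"seconds"
      }
    }
  ]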