2.3.10 Operands related to system monitoring

50) pd_utl_exec_time = utility-execution-monitoring-time
~<unsigned integer>((0-35791394)) <<0>> (minutes)
Specifies the monitoring time (in minutes) for monitoring the execution time of the following utilities:
  • Database load utility (pdload command)
  • Database reorganization utility (pdrorg command)
  • Free page release utility (pdreclaim command)
  • Global buffer residence utility (pdpgbfon command)
If a utility does not finish executing within the monitoring time specified by this operand, the utility is forcibly terminated, and error information for identifying the cause of the no-response state is output as shown in the following table.
Error information obtained | Output destination
Core files and .deb files | %PDDIR%\spool\save on the server machine on which the utility was running
Execution results of the pdls -d rpc -a and pdls -d lck commands | HiRDB-installation-drive:\tmp on the server machine on which the utility command was entered. However, if HiRDB executes the command, these results are also output to %PDDIR%\spool\save.
The file names of the command execution results are as follows:
command-nameYYYYMMDDHHMMSSpid.txt
  • pid is a process ID.
  • If the pdreclaim or pdpgbfon command is executed, HiRDB internally executes the pdrorg command, and therefore pdrorg is used as the command name in the file name.
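The file naming rule above can be illustrated with a short Python sketch. The helper name error_info_file_name is hypothetical, and the sketch assumes that the process ID is appended without padding, which the rule does not state.

```python
from datetime import datetime

# Per the note above, these commands run internally as pdrorg.
_RUNS_AS_PDRORG = {"pdreclaim", "pdpgbfon"}


def error_info_file_name(command: str, when: datetime, pid: int) -> str:
    """Build command-nameYYYYMMDDHHMMSSpid.txt (hypothetical helper).

    Assumption: pid is written as a plain decimal number with no padding.
    """
    name = "pdrorg" if command in _RUNS_AS_PDRORG else command
    return f"{name}{when:%Y%m%d%H%M%S}{pid}.txt"


print(error_info_file_name("pdreclaim", datetime(2024, 1, 31, 23, 5, 9), 4321))
# prints pdrorg202401312305094321.txt
```

A script that collects these files after a forced termination could use the same rule to search for pdrorg-prefixed names even when pdreclaim or pdpgbfon was the command that was entered.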
Advantage
Even if a utility ceases to respond because of an error (communication or disk error, for example) that occurs during utility execution in a nighttime batch job, the succeeding jobs can continue running.
Specification guidelines
  • This operand is designed to deal with no-response errors, not to monitor for large transactions. Therefore, specify a value that is somewhat greater than the actual maximum execution time of the utilities. For example, if the maximum execution time of the database load utility is approximately 60 minutes, and the maximum execution time of the database reorganization utility is approximately 90 minutes, specify pd_utl_exec_time=120 to leave some margin. In this case, when a process that normally finishes within 90 minutes has not returned a response after 30 more minutes, it is determined that a no-response error has occurred.
  • If this operand is omitted or 0 is specified for it, a utility may go into a no-response state when the following errors occur. Therefore, specify a value other than 0.
    • Communication error (including temporary error) between servers
    • Process not responding due to a disk error, for example
Operand rules
  • If this operand is omitted or 0 is specified for it, utility execution time is not monitored.
  • If monitoring time is specified in the exectime operand of the option control statement of each utility, the specification in the exectime operand takes precedence.
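The guideline and the precedence rule above can be sketched in Python as follows. Both function names are hypothetical, the 30-minute margin is taken from the example in the guidelines rather than from any fixed rule, and an omitted exectime operand is represented by None as an assumption.

```python
from typing import Optional

PD_UTL_EXEC_TIME_MAX = 35791394  # upper bound of the operand, in minutes


def suggest_utl_exec_time(max_minutes: dict, margin_minutes: int = 30) -> int:
    """Longest observed utility run time plus a margin (hypothetical helper)."""
    value = max(max_minutes.values()) + margin_minutes
    if not 1 <= value <= PD_UTL_EXEC_TIME_MAX:
        raise ValueError(f"{value} is outside 1-{PD_UTL_EXEC_TIME_MAX} minutes")
    return value


def effective_monitoring_time(pd_utl_exec_time: int,
                              exectime: Optional[int] = None) -> int:
    """exectime in a utility's option control statement takes precedence."""
    return pd_utl_exec_time if exectime is None else exectime


observed = {"pdload": 60, "pdrorg": 90}  # illustrative maximum run times
print(suggest_utl_exec_time(observed))     # prints 120
print(effective_monitoring_time(120, 45))  # prints 45
```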
51) pd_watch_time = SQL-maximum-execution-time
~<unsigned integer>((0-65535)) <<0>> (seconds)
This operand is applicable to a HiRDB/Parallel Server.
Specifies a maximum execution time for SQL statements that are executed in a HiRDB server process.
If execution of an SQL statement is not completed within the specified amount of time, execution of that SQL statement is terminated.
Advantage
If the HiRDB server does not halt execution of an SQL statement, even though the HiRDB client has canceled SQL execution (by forcibly terminating a client process, for example), the HiRDB server may continue to execute the SQL statement and may lock resources for a long time. Specifying this operand places a limit on such a lock time.
Specification guidelines
Specify the largest of the following values:
  • Time specified by the PDCWAITTIME operand of the client environment definition
  • Time specified by the pd_lck_wait_timeout operand
  • Processing time of the SQL statement with the longest execution time
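As a minimal sketch of this guideline, the following Python function takes the three values in seconds and returns the largest one. The function name and the sample numbers are illustrative assumptions, not values from HiRDB.

```python
PD_WATCH_TIME_MAX = 65535  # upper bound of the operand, in seconds


def suggest_watch_time(pdcwaittime: int, lck_wait_timeout: int,
                       longest_sql_seconds: int) -> int:
    """Return the largest of the three times named in the guideline."""
    value = max(pdcwaittime, lck_wait_timeout, longest_sql_seconds)
    if value > PD_WATCH_TIME_MAX:
        raise ValueError(f"{value} exceeds the maximum of {PD_WATCH_TIME_MAX}")
    return value


print(suggest_watch_time(300, 180, 420))  # prints 420
```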
Notes
  • When 0 is specified for this operand, the SQL execution time is not monitored.
  • When 0 (no time monitoring) is specified for the PDCWAITTIME operand of the client environment definition for an SQL statement, the execution time of that SQL statement is not monitored.
  • When a value shorter than the actual SQL execution time is specified for this operand, processing may terminate during SQL execution, and an SQL error may be reported to the HiRDB client or abnormal termination may be reported to the HiRDB server.
  • For a HiRDB/Single Server, SQL maximum execution time is not monitored, even if this operand is specified. If this operand is specified, the specified value is used as the default value of the pd_lck_wait_timeout operand. Therefore, Hitachi recommends that you omit this operand for a HiRDB/Single Server.
52) pd_queue_watch_time = message-queue-monitoring-time
~<unsigned integer>((0-3600)) <<600>> (seconds)
This operand is designed to prevent a HiRDB process from not responding. For details about a server process that has stopped responding, see the HiRDB Version 8 System Operation Guide.
Reference note
The number of HiRDB server processes is restricted by the following operands:
  • pd_max_server_process
    If a large number of servers are running within a unit, carefully estimate the value to be specified for this operand. Additionally, if the standby-less system switchover facility is used, the estimated value must also take into account system switchovers.
  • pd_max_bes_process
    If a multiple front-end server or the standby-less system switchover (1:1) facility is used, carefully estimate the value to be specified for this operand.
  • pd_max_dic_process
    If a multiple front-end server is used, carefully estimate the value to be specified for this operand.
  • pd_ha_max_server_process
    If the standby-less system switchover (effects distributed) facility is used, carefully estimate the value to be specified for this operand.
  • pd_max_users
    If the number of concurrent connections is large, an appropriate value must be specified for this operand.
HiRDB uses a message queue for allocating server processes. When a server process stops responding, messages cannot be extracted from the message queue. If messages cannot be extracted from the message queue within the time specified by this operand (message queue monitoring time), a warning message or error message (KFPS00888-W or KFPS00889-E) is output. This facility is called the message queue monitoring facility. If one of these messages is output, the server process may have stopped responding.
If 0 is specified in this operand, the message queue is not monitored.
For details on the message queue monitoring facility, see the HiRDB Version 8 System Operation Guide.
Operand rule
Because message queue monitoring is carried out in 10-second increments, specify a multiple of 10 seconds for this operand (100 or 110 seconds, for example). If the specified value is not a multiple of 10 seconds, it is rounded up to the next multiple of 10. For example, if you specify 105 seconds, 110 seconds is used.
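The rounding rule can be checked with a small Python sketch. The function name is hypothetical; the sketch only reproduces the arithmetic described above.

```python
def effective_queue_watch_time(specified: int) -> int:
    """Round a pd_queue_watch_time value up to a multiple of 10 seconds."""
    if not 0 <= specified <= 3600:
        raise ValueError("pd_queue_watch_time must be in the range 0-3600")
    # Ceiling division by 10, then scale back; 0 (no monitoring) stays 0.
    return -(-specified // 10) * 10


print(effective_queue_watch_time(105))  # prints 110
print(effective_queue_watch_time(100))  # prints 100
```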
Relationship to other operands
This operand is related to the pd_queue_watch_timeover_action operand.
53) pd_queue_watch_timeover_action = continue | stop
Specifies the processing to be performed by HiRDB when messages cannot be extracted from the message queue within the message queue monitoring time.
continue:
HiRDB outputs the warning message (KFPS00888-W).
stop:
HiRDB outputs the warning and error messages (KFPS00888-W and KFPS00889-E) and abnormally terminates the unit that has the message queue.
Specification guidelines
  • Normally, specify stop (or omit this operand). If messages cannot be extracted from the message queue within the message queue monitoring time, the HiRDB server process may have stopped responding. Restarting the unit containing the server process that has stopped responding can sometimes correct the non-responding server process.
  • If you do not wish to stop HiRDB, specify continue. In this case, transactions cannot be executed on the server that has stopped responding, but other servers can still execute transactions. To correct the non-responding server process, use the pdcancel command, for example, to terminate the transactions that were being executed in the server process that stopped responding. If there were no transactions, use the pdkill command to terminate the server that stopped responding. In either case, afterwards identify the cause of the server process no-response and take the necessary action. For details on server process no-response and corrective measures, see the HiRDB Version 8 System Operation Guide.
54) pd_down_watch_proc = upper-limit-for-server-process-abnormal-terminations[,monitoring-interval]
This operand is used for monitoring the number of abnormal terminations of HiRDB server processes. The processes to be monitored are those that are abnormally terminated because the PDCWAITTIME value was exceeded or because of an abort.
If abnormal terminations of server processes occur frequently, new services may not be accepted. However, because server process abnormal termination does not cause HiRDB abnormal termination, HiRDB is in an online stopped state, in effect. When this operand is specified, you can pull HiRDB out of this state by restarting it.
upper-limit-for-server-process-abnormal-terminations: ~<unsigned integer>((0-65535))<<0>>
If abnormal terminations of server processes exceed the value specified in this operand, HiRDB (applicable unit for a HiRDB/Parallel Server) is abnormally terminated. This is called the facility for monitoring abnormal process terminations. For details on this facility, see the HiRDB Version 8 System Operation Guide.
For a HiRDB/Single Server, abnormal terminations of single server processes are counted. In the case of a HiRDB/Parallel Server, the total of the abnormal terminations in the front-end servers, back-end servers, and dictionary servers inside the unit is counted.
If 0 is specified, abnormal terminations of server processes are not monitored.
monitoring-interval: ~<unsigned integer>((10-3600))<<600>> (seconds)
Specifies the interval (in seconds) for monitoring abnormal terminations of server processes.
For example, if 100 is specified, abnormal terminations of server processes are monitored every 100 seconds.
Advantages
  • Restart of HiRDB refreshes memory and resource statuses, improving the processing efficiency.
  • If abnormal termination of server processes occurs frequently, HiRDB is abnormally terminated, and thus the system can be switched over immediately.
Notes
  • When a server process is abnormally terminated, the KFPS01820-E message is output. Although this message is also output when the server process is abnormally terminated by the pdcancel command, this is not counted as an abnormal termination.
  • For a mutual system switchover configuration, multiple HiRDBs are activated on the same server machine when system switchover occurs. As a result, the system traffic may increase, causing an adverse effect instead. Therefore, if you specify this operand, Hitachi recommends that you restart HiRDB in the system that was abnormally terminated.
Operand rule
A monitoring interval cannot be specified alone. It must be specified with an upper limit for server process abnormal terminations.
Remarks
  • If HiRDB is abnormally terminated by the facility for monitoring abnormal process terminations, the KFPS01821-E and KFPS00729-E messages are output.
  • The following table shows the causes of server process abnormal termination and the server processes that are included in the abnormal termination count.
    Inclusion in abnormal termination count, by type of server process:
    Cause of server process abnormal termination | Single server process | Front-end server process | Dictionary server process | Back-end server process
    PDCWAITTIME operand value of the client environment definition has been exceeded | Y | Y | N*1 | N*1
    pdcancel command | N | N*2 | N | N
    Internal forced termination (HiRDB internally issues SIGKILL and terminates a process) | Y*3 | Y*3 | N*1 | N*1
    Abort | Y | Y | Y | Y
    Rollback has occurred in a UAP with XA connection | Y | Y | N | N
    Abnormal termination of process other than those described here | Y | Y | Y | Y
Legend:
Y: Included in abnormal termination count
N: Not included in abnormal termination count
*1 If an error is detected in a transaction branch, the abnormal terminations of the front-end server process that have occurred in the same transaction branch are counted.
*2 If the pdcancel command is used to forcibly terminate a back-end server process or dictionary server process, the front-end server process is internally and forcibly terminated. In this case, the abnormal termination of the front-end server process may be counted.
*3 If an error is detected in a global transaction by an OLTP system, the abnormal terminations of the single server process or front-end server process that have occurred in the same global transaction are counted.
55) pd_host_watch_interval = host-to-host-monitoring-interval
~<unsigned integer>((1-180))<<10>>(seconds)
Specifies in seconds the amount of time between monitoring of the operating status of other hosts (server machines).
Notes
  • This operand is not applicable to a HiRDB/Single Server or to a HiRDB/Parallel Server that has only one unit.
  • If this value is too large, it will be difficult to detect errors at other hosts.
Relationship to other operands
This operand is related to the pd_ipc_conn_nblock_time operand.
56) pd_watch_resource = MANUAL | AUTO
Specifies whether to output a warning message when the resource usage reaches or exceeds 80%.
MANUAL: Do not issue the warning message.
AUTO: Issue the warning message.
The following table lists the resources that are monitored.
Monitored resource | Message that is output
Maximum number of concurrent connections specified by the pd_max_users operand | KFPS05123-W
Concurrently accessible base tables count specified by the pd_max_access_tables operand | KFPS05123-W
Maximum number of RDAREAs specified by the pd_max_rdarea_no operand | KFPS05123-W
Maximum number of HiRDB files comprising an RDAREA specified by the pd_max_file_no operand | KFPS05123-W
Usage of HiRDB file system area for work table files, as specified by the pdwork operand | KFPS05123-W
Maximum number of users who can own lists concurrently, as specified by pd_max_list_users | KFPS05123-W
Maximum number of lists that a user can create, as specified by pd_max_list_count | KFPS05123-W
Number of audit trail files that cannot be specified as swappable targets | KFPS05123-W
Number of lists created within a server | KFPH22023-W
Relationship to other operands
When AUTO is specified, 80 is set as the default value for the operands listed below, which eliminates the need to specify these operands (however, if it is necessary to specify a value other than 80 for any of the listed operands, the appropriate value can be specified):
  • pd_max_users_wrn_pnt*
  • pd_max_access_tables_wrn_pnt
  • pd_max_rdarea_no_wrn_pnt
  • pd_max_file_no_wrn_pnt
  • pdwork_wrn_pnt
  • pd_max_list_users_wrn_pnt
  • pd_max_list_count_wrn_pnt
  • pd_aud_file_wrn_pnt*
  • pd_rdarea_list_no_wrn_pnt*
* 50 is set as the trigger for resetting the warning message output status.
Note
After a version upgrade, the number of resources to be monitored may increase (i.e., other operands in addition to those listed above may be included). Therefore, to keep the monitoring statuses up to date, the operands listed above should be specified in order to monitor the individual resources, rather than specifying this operand.
57) pd_max_users_wrn_pnt = trigger-for-outputting-warning-message-related-to-number-of-connections-to-HiRDB-server [,trigger-for-resetting-warning-message-output-status]
trigger-for-outputting-warning-message-related-to-number-of-connections-to-HiRDB-server ~<unsigned integer>((0-100))<<0 or 80>>(%)
Specifies, as a percentage of the maximum number of concurrent connections (specified by the pd_max_users operand), the number of connections to the HiRDB server at which the warning message is to be output. For example, if 200 and 90 are specified for the pd_max_users and pd_max_users_wrn_pnt operands, respectively, the KFPS05123-W warning message is output when the number of connections to the HiRDB server reaches or exceeds 180.
Notes
  • When 0 is specified, no warning message is issued.
  • Specification of this operand is invalid when 9 or less is specified for the pd_max_users operand.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pd_max_users_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pd_max_users_wrn_pnt operand (i.e., the warning message is issued when the number of concurrently executing users reaches 80% of the specified maximum).
trigger-for-resetting-warning-message-output-status ~<unsigned integer>((0-99))(%)
Specifies the trigger for resetting the warning message output status. When the warning message (KFPS05123-W) is output, HiRDB goes into the warning message output status. Once HiRDB goes into this status, the warning message is not output again even if the number of connections to the HiRDB server exceeds the warning message output trigger level again. However, when the number of connections to the HiRDB server falls below the trigger for resetting the warning message output status specified here, the warning message output status is reset.
For example, if pd_max_users_wrn_pnt=90,70 is specified, the warning message is output when the number of connections to the HiRDB server reaches or exceeds 90% of the maximum number of concurrent connections. Afterwards, no warning message is output until the number of connections to the HiRDB server falls below 70% of the maximum number of concurrent connections. After the percentage falls below 70%, and when it subsequently reaches or exceeds 90% again, the warning message is output.
Notes
  • When this specification is omitted, warning-message-output-trigger - 30 is assumed as the default (if the result is a negative number, 0 is used).
  • If a value greater than the warning message output trigger is specified, the warning message output trigger value is used.
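The output and reset triggers together behave like a threshold with hysteresis. The following Python sketch models that behavior under stated assumptions: the class name UsageWarning is hypothetical, connection counts are compared against exact percentages of the maximum, and how HiRDB itself rounds or samples the counts is not covered by this description.

```python
from typing import Optional


class UsageWarning:
    """Model of an output trigger and a reset trigger, both in percent."""

    def __init__(self, maximum: int, warn_pct: int,
                 reset_pct: Optional[int] = None):
        if reset_pct is None:
            # Default: output trigger minus 30, but never below 0.
            reset_pct = max(warn_pct - 30, 0)
        # A reset trigger above the output trigger is replaced by the latter.
        self.reset_pct = min(reset_pct, warn_pct)
        self.warn_pct = warn_pct
        self.maximum = maximum
        self.warned = False  # True while in the warning message output status

    def observe(self, current: int) -> bool:
        """Return True if the warning would be output for this count."""
        if self.warn_pct == 0:
            return False  # 0 means no warning message is issued
        if self.warned:
            # Stay silent until usage falls below the reset trigger.
            if current * 100 < self.reset_pct * self.maximum:
                self.warned = False
            return False
        if current * 100 >= self.warn_pct * self.maximum:
            self.warned = True
            return True
        return False


monitor = UsageWarning(maximum=200, warn_pct=90, reset_pct=70)
print([monitor.observe(n) for n in (170, 180, 190, 150, 185, 139, 180)])
# prints [False, True, False, False, False, False, True]
```

In the sample run, the count of 185 does not produce a second warning because usage had only dropped to 150 (75%), which is still above the 70% reset trigger. The warning is output again only after the count falls to 139 (below 140) and then returns to 180.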
58) pd_max_access_tables_wrn_pnt = trigger-for-issuing-concurrently-accessible-base-tables-count-warning
~<unsigned integer>((0-100))<<0 or 80>>(%)
Specifies, as a percentage of the maximum number of concurrently accessible base tables (specified by the pd_max_access_tables operand), the number of concurrently accessed base tables at which a warning message is to be issued. For example, if 90 is specified for this operand and 200 has been specified for the pd_max_access_tables operand, a warning message is issued when the number of concurrently accessed base tables reaches 180 (90% of 200). KFPS05123-W is issued as the warning message.
Note
When 0 is specified, no warning message is issued.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pd_max_access_tables_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pd_max_access_tables_wrn_pnt operand (i.e., the warning message is issued when the number of concurrently accessed base tables reaches 80% of the specified maximum).
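The default rule shared by the warning-trigger operands, and the conversion from a percentage to a count, can be sketched in Python as follows. The function names are hypothetical, an omitted operand is represented by None, and rounding a fractional threshold up to the next whole count is an assumption that the description above does not confirm.

```python
from typing import Optional


def effective_wrn_pnt(specified: Optional[int],
                      pd_watch_resource: str = "MANUAL") -> int:
    """Resolve a *_wrn_pnt operand: explicit value, else 80 (AUTO) or 0."""
    if specified is not None:
        return specified
    return 80 if pd_watch_resource == "AUTO" else 0


def warning_count(maximum: int, wrn_pnt: int) -> Optional[int]:
    """Smallest count that reaches wrn_pnt percent of maximum.

    Returns None when wrn_pnt is 0, because no warning is issued then.
    Assumption: a fractional threshold is rounded up.
    """
    if wrn_pnt == 0:
        return None
    return -(-maximum * wrn_pnt // 100)  # integer ceiling division


print(warning_count(200, 90))                               # prints 180
print(warning_count(200, effective_wrn_pnt(None, "AUTO")))  # prints 160
print(warning_count(200, effective_wrn_pnt(None)))          # prints None
```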
59) pd_max_rdarea_no_wrn_pnt = trigger-for-issuing-RDAREAs-count-warning
~<unsigned integer>((0-100))<<0 or 80>>(%)
Specifies as a percentage of the maximum number of RDAREAs (as specified by the pd_max_rdarea_no operand) the actual number of RDAREAs at which a warning message is to be issued. For example, if 90 is specified for this operand and 200 has been specified for the pd_max_rdarea_no operand, a warning message will be issued when the number of RDAREAs reaches 180 (90% of 200). KFPS05123-W is issued as the warning message.
Note
When 0 is specified, no warning message is issued.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pd_max_rdarea_no_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pd_max_rdarea_no_wrn_pnt operand (i.e., the warning message is issued when the number of RDAREAs reaches 80% of the specified maximum).
60) pd_max_file_no_wrn_pnt = trigger-for-issuing-HiRDB-files-count-warning
~<unsigned integer>((0-100))<<0 or 80>>(%)
Specifies as a percentage of the maximum number of files comprising an RDAREA (as specified by the pd_max_file_no operand) the number of files actually comprising an RDAREA at which a warning message is to be issued. For example, if 90 is specified for this operand and 200 has been specified for the pd_max_file_no operand, a warning message will be issued when the number of files comprising an RDAREA reaches 180 (90% of 200). KFPS05123-W is issued as the warning message.
Note
When 0 is specified, no warning message is issued.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pd_max_file_no_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pd_max_file_no_wrn_pnt operand (i.e., the warning message is issued when the number of files comprising an RDAREA reaches 80% of the specified maximum).
61) pdwork_wrn_pnt = trigger-for-issuing-work-table-files-warning
~<unsigned integer>((0-100))<<0 or 80>>(%)
Specifies as a percentage of the applicable usage* of a HiRDB file system area for work table files (as specified by the pdwork operand) the point at which a warning message is to be issued. For example, if 90 is specified for this operand, a warning message will be issued when the applicable usage rate of a HiRDB file system area for work table files reaches 90%. KFPS05123-W is issued as the warning message.
* This operand specifies a percentage of usage in terms of the items listed below, which are specified with the pdfmkfs command when the HiRDB file system areas are created; the warning message can be issued for each of these items:
  • Usage relative to capacity
  • Usage relative to the maximum number of files
  • Usage relative to the maximum number of increments
Note
When 0 is specified, no warning message is issued.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pdwork_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pdwork_wrn_pnt operand (i.e., the warning message is issued when the usage reaches 80%).
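Because the pdwork_wrn_pnt trigger applies separately to capacity, number of files, and number of increments, one way to picture the check is the Python sketch below. The function name, the dictionary keys, and the sample figures are all illustrative assumptions; the actual usage figures come from the HiRDB file system area itself.

```python
def pdwork_items_over_trigger(used: dict, limit: dict, wrn_pnt: int) -> list:
    """List the usage items that have reached wrn_pnt percent of their limit."""
    if wrn_pnt == 0:
        return []  # 0 means no warning message is issued
    return [item for item in limit
            if used[item] * 100 >= wrn_pnt * limit[item]]


# Hypothetical limits set when the area was created, and current usage.
limit = {"capacity": 1000, "files": 50, "increments": 20}
used = {"capacity": 910, "files": 30, "increments": 18}
print(pdwork_items_over_trigger(used, limit, 90))
# prints ['capacity', 'increments']
```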
62) pd_max_list_users_wrn_pnt = trigger-for-issuing-warning-about-number-of-users-who-have-created-lists
~<unsigned integer>((0-100)) <<0 or 80>> (%)
Specifies as a percentage of the number of users who can create lists (as specified by the pd_max_list_users operand) the point at which the number of users actually using lists is to cause a warning message to be issued. For example, if 90 is specified for this operand and 200 was specified for the pd_max_list_users operand, a warning message will be issued whenever the number of users using lists reaches 180. KFPS05123-W is issued as the warning message.
Notes
  • When 0 is specified, no warning message is issued.
  • The warning message is output only once. However, if the applicable server is restarted after the warning message has been output, a check of list usage is executed again, and the warning message will be output again.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pd_max_list_users_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pd_max_list_users_wrn_pnt operand (i.e., the warning message is issued when the usage reaches 80%).
63) pd_max_list_count_wrn_pnt = trigger-for-issuing-warning-about-number-of-lists-created-by-a-user
~<unsigned integer>((0-100)) <<0 or 80>> (%)
Specifies as a percentage of the number of lists that can be created per user (as specified by the pd_max_list_count operand) the point at which the number of lists created by a user is to cause a warning message to be issued. For example, if 90 is specified for this operand and 200 was specified for the pd_max_list_count operand, a warning message will be issued when a user creates 180 lists. KFPS05123-W is issued as the warning message.
Notes
  • When 0 is specified, no warning message is issued.
  • The warning message is output only once. However, if the applicable server is restarted after the warning message has been output, a check of list usage is executed again, and the warning message will be output again.
  • If multiple users are creating lists simultaneously, the warning message may be output at the server multiple times, depending on the timing.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pd_max_list_count_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pd_max_list_count_wrn_pnt operand (i.e., the warning message is issued when the usage reaches 80%).
64) pd_rdarea_list_no_wrn_pnt = trigger-for-issuing-warning-about-number-of-lists-created-at-server [,trigger-for-resetting-warning-output-status]
trigger-for-issuing-warning-about-number-of-lists-created-at-server ~<unsigned integer>((0-100)) <<0 or 80>> (%)
Specifies as a percentage of the number of lists that can be created at a server the point at which a warning message is to be issued. For example, if 90 is specified for this operand and the maximum number of lists that can be created at the server is 1000, a warning message will be issued whenever 900 lists have been created. KFPH22023-W is issued as the warning message.
Note
When 0 is specified, no warning message is issued.
Relationship to other operands
  • When this operand is omitted and MANUAL is specified for the pd_watch_resource operand (or the pd_watch_resource operand is omitted), 0 is assumed for the pd_rdarea_list_no_wrn_pnt operand (i.e., no warning message is issued).
  • When this operand is omitted and AUTO is specified for the pd_watch_resource operand, 80 is assumed for the pd_rdarea_list_no_wrn_pnt operand (i.e., the warning message is issued when the usage reaches 80%).
trigger-for-resetting-warning-output-status ~<unsigned integer>((0-99)) (%)
Specifies the trigger for resetting the warning message output status.
Whenever the KFPH22023-W warning message is issued, HiRDB goes into warning message output status. Once HiRDB is in this status, the warning message will not be issued again even when the number of created lists again reaches the trigger value. However, when the number of created lists falls below the trigger for resetting the warning message output status specified here, the warning message output status is reset.
For example, if pd_rdarea_list_no_wrn_pnt=90,70 is specified, the warning message will be issued when the number of created lists reaches 90% of the maximum permissible number of lists. Once this occurs, the warning message will not be eligible to be issued again until the number of created lists falls below 70% of the maximum number of permissible lists allowed. Thereafter, the warning message will be issued if the number of lists again reaches the 90% trigger value.
Notes
  • If no value is specified for the trigger for resetting the warning output status, the value of the trigger for issuing warning minus 30 is assumed. If the result is negative, 0 is assumed.
  • If the value specified here is greater than the value specified for the trigger for issuing warning, the value specified here is ignored and the same value as was specified for the trigger for issuing warning is assumed as the trigger for resetting the warning output status.