1.4.2 Collecting data for troubleshooting in UNIX

Organization of this subsection

(1) Execute the data collection tool
(2) Obtain the contents of the core file
(3) Check the status of processes
(4) Check the operation data
(5) Collect information about the embedded database

(1) Execute the data collection tool

JP1/AJS3 provides the data collection tools listed in the following table. Use one of these tools according to the operational requirements.

Table 1‒19: Data collection tools provided by JP1/AJS3 (for UNIX)
Data collection tool	Description
`jajs_log`	Use this tool if you have not changed the output destinations of log files and other data from the defaults, or if you do not need to collect data other than the data collected by default.
`_04`	Use this tool if you have changed the output destinations of log files and other data from the defaults, or if you need to collect data other than the data collected by default. You can customize this tool according to the output destinations or the data to be collected.

For details about how to set up the data collection tool (_04), see 16.1 Collecting log data in the Job Management Partner 1/Automatic Job Management System 3 Configuration Guide 1.

The following shows examples of executing the data collection tool:

When executing jajs_log:

# /opt/jp1ajs2/tools/jajs_log

When executing the file trouble.sh that is created by customizing _04:

# /home/jp1ajs2/trouble.sh

By default, the results of executing the data collection tool are output under /tmp/jp1ajs2/trouble/. Back up these files.
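The backup step can be scripted. The following is a minimal sketch, not part of JP1/AJS3: the function name `backup_trouble_data` and the default backup location `/var/tmp` are assumptions chosen for illustration. It archives the output directory into a time-stamped tar file and prints the archive path.

```shell
#!/bin/sh
# Sketch only: backup_trouble_data is a hypothetical helper, not a JP1/AJS3 tool.
# It archives the data collection output so that it survives cleanup of /tmp.
backup_trouble_data() {
    src_dir=${1:-/tmp/jp1ajs2/trouble}   # default output directory of the tools
    dest_dir=${2:-/var/tmp}              # assumed backup location; change as needed
    [ -d "$src_dir" ] || { echo "no such directory: $src_dir" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    dest_dir=$(cd "$dest_dir" && pwd) || return 1   # make the path absolute
    archive="$dest_dir/jp1ajs2_trouble_$(date +%Y%m%d%H%M%S).tar"
    # Archive relative to the parent so the tar holds only the last path element.
    ( cd "$(dirname "$src_dir")" && tar cf "$archive" "$(basename "$src_dir")" ) || return 1
    echo "$archive"
}
```

Calling `backup_trouble_data` with no arguments archives the default output directory. Pass a different source or destination directory as the first and second arguments if the output destination was customized.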

When you use the data collection tools in a cluster system, you can collect data by specifying a logical host name. The data collection tools also provide options for limiting the amount of data that is collected. For details on the data collection tools used in UNIX, see jajs_log or _04 (UNIX only) in 2. Commands in the manual Job Management Partner 1/Automatic Job Management System 3 Command Reference 1.


(2) Obtain the contents of the core file

Obtain the contents of the core file if the file has been output.

The core file is output to one of the following directories:

/opt/jp1ajs2/bin^#1
/var/opt/jp1ajs2/database^#1
/var/opt/jp1ajs2cm/database^#1
User home directory^#2
Current directory in which the command was executed

#1: The data collection tools can be used to collect data.
#2: If the core file was output after connection from JP1/AJS3 - View, this directory is the home directory of the mapped OS user.

If you want to collect only the information needed for analysis of the core file, use the ajs2collectcore command. For details about this command, see ajs2collectcore (UNIX only) in 2. Commands in the manual Job Management Partner 1/Automatic Job Management System 3 Command Reference 1.
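To see quickly whether a core file exists in any of the directories listed above, a small helper such as the following could be used. This is a sketch under assumptions: `find_core_files` is a hypothetical name, and the match on file names beginning with `core` is an assumption, because core file naming depends on the OS and its settings.

```shell
#!/bin/sh
# Sketch only: find_core_files is a hypothetical helper, not a JP1/AJS3 command.
# It prints regular files whose names start with "core" directly under each
# directory given as an argument (subdirectories are not searched).
find_core_files() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue          # skip directories absent on this host
        for f in "$dir"/core*; do
            if [ -f "$f" ]; then printf '%s\n' "$f"; fi
        done
    done
    return 0
}
```

For example, `find_core_files /opt/jp1ajs2/bin /var/opt/jp1ajs2/database /var/opt/jp1ajs2cm/database "$HOME" .` checks the directories named in this subsection for the current user.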


(3) Check the status of processes

Use the ps command to check the operating status of processes.

For details about JP1/AJS3 processes, see B.3 Processes (for UNIX).
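As an illustration, the ps output can be narrowed to the processes of interest while keeping the header line. The sketch below assumes a hypothetical helper named `filter_processes`. The pattern you pass to it is also an assumption and should be taken from the process names listed in B.3.

```shell
#!/bin/sh
# Sketch only: filter_processes is a hypothetical helper.
# It reads ps output on stdin and keeps the header line plus every line that
# matches the given extended regular expression.
filter_processes() {
    # The pattern is passed through the environment rather than on the awk
    # command line, so the awk process itself does not match the pattern.
    PS_PATTERN=$1 awk 'NR == 1 || $0 ~ ENVIRON["PS_PATTERN"]'
}
```

A typical call would be `ps -ef | filter_processes 'pattern'`, where `pattern` is replaced with a process name or a part of one.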


(4) Check the operation data

For the problem that has occurred, check the operation data and record it. You need to check the following information:

Details about the operation
Time that the problem occurred

Machine configuration (the version of each OS, host name, JP1/AJS3 - Manager and JP1/AJS3 - Agent configuration, and JP1/AJS3 Console Manager and JP1/AJS3 Console Agent configuration)

You can check the machine configuration by executing a command. The following table lists the commands you can use to check the machine configuration for each OS.

Table 1‒20: UNIX commands that can be used to check the machine configuration
OS	Command for checking the OS version	Command for checking the size of physical memory on the host	Command for checking the process information and required memory size
HP-UX	`/usr/bin/uname -a`	`/usr/sbin/dmesg`	`/usr/bin/ps -elf`
Solaris	`/usr/bin/uname -a`	`/usr/sbin/prtconf`	`/usr/bin/ps -elf`
AIX	`/usr/bin/uname -a`	`/usr/sbin/bootinfo -r`	`/usr/bin/ps -elf`
Linux	`/bin/uname -a`	`/usr/bin/free` (or `/bin/cat /proc/meminfo`)	`/bin/ps -elf`

Note: The options used in each command in the above table are typical options of the respective OSs. How the options are specified might vary depending on the environment being used. For details, see the documentation for the applicable OS.

Whether the problem is reproducible
Names of the users, if any, who were logged in from JP1/AJS3 - View or JP1/AJS3 Console View
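The machine configuration commands in Table 1‒20 can be run together and recorded in one file. The following sketch assumes two hypothetical helpers, `memory_command_for` and `record_machine_config`. It selects the memory command from the OS name reported by `uname -s` and relies on `PATH` to find `uname` and `ps`. As the note under the table says, options might differ in your environment.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not JP1/AJS3 commands.
# memory_command_for maps the OS name from `uname -s` to the physical-memory
# command listed in Table 1-20.
memory_command_for() {
    case $1 in
        HP-UX) echo "/usr/sbin/dmesg" ;;
        SunOS) echo "/usr/sbin/prtconf" ;;      # Solaris reports itself as SunOS
        AIX)   echo "/usr/sbin/bootinfo -r" ;;
        Linux) echo "/usr/bin/free" ;;
        *)     return 1 ;;
    esac
}

# Write the OS version, memory information, and process list to one file.
record_machine_config() {
    out_file=$1
    os_name=$(uname -s)
    mem_cmd=$(memory_command_for "$os_name") || {
        echo "unsupported OS: $os_name" >&2; return 1; }
    {
        echo "### uname -a";  uname -a
        echo "### $mem_cmd";  $mem_cmd    # unquoted on purpose: splits "bootinfo -r"
        echo "### ps -elf";   ps -elf
    } > "$out_file" 2>&1
}
```

For example, `record_machine_config /var/tmp/machine_config.txt` writes all three outputs to a single file that can be kept with the other operation data. The output path here is only an example.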


(5) Collect information about the embedded database

You need the following information if an error occurs while you are using the embedded database:

Data needed to investigate the cause
Information needed to re-create the environment of the embedded database

The following describes how to collect the above information.

(a) Information required to investigate the cause

To investigate the cause of an error, you mainly need to collect information about the OS and information about the embedded database. Use OS commands to collect information about the OS. Use the embedded database commands to collect information about the embedded database.

If you use the support service to solve problems, you must submit the information listed in the table below. For each type of problem, the table describes the information needed to investigate the cause and how to collect it. The number shown for each type of problem is the priority of that information on a scale of 1 to 7, where 1 is the highest priority.

Problems are grouped by type as follows:

Performance

The following processing or operation takes too much time:
- Startup of the embedded database system (including normal startup, restart, and startup after action has been taken for a failure)
- Stopping of the embedded database system (including normal stopping and forced stopping)
- Execution of an operation command for the embedded database

No response

A response is not returned when the following processing or operation is performed:
- Startup of the embedded database system (including normal startup, restart, and startup after action has been taken for a failure)
- Stopping of the embedded database system (including normal stopping and forced stopping)
- Execution of an operation command for the embedded database

Abnormal end

One of the following has occurred:
- Abnormal termination of the embedded database system
- Abnormal termination of an embedded database process
- Abnormal termination of an operation command for the embedded database

Table 1‒21: Information needed to investigate the cause of failures and how to collect that information
No.	Component	Information to be collected	Collection method	Performance	No response	Abnormal end
1	OS	`syslog`	Use an OS function (command).	1	1	1
2		CPU usage rate and device status	Use an OS command, such as the `sar` command, to collect the information. For details about the commands, see the documentation for the OS.	3	4	3
3		CPU running status and memory status for processes	Use an OS command, such as the `top` command, to collect the information. For details about the commands, see the documentation for the OS.	3	4	3
4		Virtual memory	Use an OS command, such as the `vmstat` command, to collect the information. For details about the commands, see the documentation for the OS.	3	4	3
5		Network status	Use an OS command, such as the `netstat` command, to collect the information. For details about the commands, see the documentation for the OS.	3	4	3
6	Embedded database	Information about embedded database failures	Obtain the files under the following directories and store them on a DAT or another storage device: `embedded-database-practical-directory/spool` and `embedded-database-practical-directory/tmp`. An error log file and a command log file are output to these directories.	2	2	2
7			Error log file: The error log is output to a file under `embedded-database-practical-directory/spool/errlog`.	2	2	2
8			Command log file: The command log is output to a file under `embedded-database-practical-directory/spool/cmdlog`.	2	2	2
9		Embedded database system definitions	Obtain the files under `embedded-database-practical-directory/conf` and store them on a DAT or another storage device.	4	5	4
10		SQL trace file and error log file	Obtain the output files and store them on a DAT or another storage device. A file name begins with `pderr` or `pdsql`.	--	6	5
11		System log file	Use the `ajsembdboplog` command to unload the system log. Obtain the unload log file and store it on a DAT or another storage device.	6	7	6

Legend: --: Information need not be collected.
Note: A file to which command output is continuously redirected grows steadily and takes up disk space. Instead of redirecting to a single file, create general-purpose shell scripts that switch the output file and reuse the files after a set number of generations.
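The generation switching mentioned in the note can be sketched as follows. The helper name `rotate_generations` and the `FILE.N` naming scheme are assumptions for illustration, not something the embedded database or JP1/AJS3 provides.

```shell
#!/bin/sh
# Sketch only: rotate_generations is a hypothetical helper.
# It renames FILE to FILE.1, FILE.1 to FILE.2, and so on, and overwrites the
# oldest generation so that at most KEEP old files remain.
rotate_generations() {
    file=$1
    keep=$2
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -f "$file.$prev" ]; then mv -f "$file.$prev" "$file.$i"; fi
        i=$prev
    done
    if [ -f "$file" ]; then mv -f "$file" "$file.1"; fi
    return 0
}

# Example use (path and intervals are assumptions): switch files, then collect.
#   rotate_generations /var/tmp/vmstat.log 3
#   vmstat 60 60 >> /var/tmp/vmstat.log
```

Running the two example commands from a scheduler once per collection period keeps the current file plus three older generations, so disk usage stays bounded.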

(b) Information needed to re-create the embedded database environment

If a problem occurs during operation of the embedded database, you might need to re-create the environment that produced the problem to test whether the problem is reproducible or to investigate the cause of the problem. To enable this procedure, collect the following information needed to re-create the embedded database environment.

conf under the embedded database practical directory (if the user has changed the definition files)
Environment variables related to the embedded database
Data in the embedded database

Use the ajsembdbrorg command to collect the data in the embedded database.

To collect the information needed to re-create the embedded database environment:

1. Start the embedded database.
2. Execute the ajsembdbrorg command with the -k unld option specified.
3. Save conf under the embedded database practical directory in a directory of your choice.
4. Record the environment variables related to the embedded database.

For details about how to use the ajsembdbrorg command and a description of the command, see 10.2.2 Reorganizing a database in the Job Management Partner 1/Automatic Job Management System 3 Administration Guide.
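The last two steps of the procedure (saving conf and recording the environment variables) can be sketched as a shell function. This is an illustration under assumptions: `save_embdb_environment` is a hypothetical name, and the pattern that selects the environment variables must be supplied by you, because the variable names are not listed in this subsection. The ajsembdbrorg command is run separately with the options described in its reference.

```shell
#!/bin/sh
# Sketch only: save_embdb_environment is a hypothetical helper. It does not run
# ajsembdbrorg; it only copies conf and records the matching environment variables.
save_embdb_environment() {
    practical_dir=$1     # embedded database practical directory
    dest_dir=$2          # new directory of your choice
    env_pattern=$3       # assumed: regex that selects the variable names to record
    [ -d "$practical_dir/conf" ] || {
        echo "conf not found under $practical_dir" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    cp -R "$practical_dir/conf" "$dest_dir/" || return 1
    # grep exits with 1 when nothing matches; treat that as an empty record.
    env | grep -E "^($env_pattern)=" > "$dest_dir/embdb_env.txt" || true
}
```

Call the function with the embedded database practical directory, a new destination directory, and a pattern for the variable names. The copied conf directory and the file `embdb_env.txt` then sit side by side in the destination directory.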
