Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


C. List of Files Output by HA Monitor

The table below lists the files that are output by HA Monitor.

In the file path names, paths that begin with history, etc, or spool are relative to /opt/hitachi/HAmon.
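The path convention above can be illustrated with a minimal Python sketch. The helper name `resolve_hamon_path` is hypothetical and is not part of HA Monitor; the sketch only joins a table entry onto the base directory stated in the text and leaves entries that are already absolute (such as /var/log/messages) unchanged.

```python
import posixpath

# Base directory stated in the text; all relative table entries live under it.
HAMON_ROOT = "/opt/hitachi/HAmon"


def resolve_hamon_path(table_path):
    """Expand a table entry such as 'spool/.msglog' to an absolute path.

    Hypothetical helper for illustration only. Entries that are already
    absolute are returned unchanged.
    """
    if table_path.startswith("/"):
        return table_path
    return posixpath.join(HAMON_ROOT, table_path)


print(resolve_hamon_path("etc/.CTLFILE"))
print(resolve_hamon_path("spool/cmdlog/.cmdlog1"))
```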

Table C‒1: List of files output by HA Monitor

No.

Name

File path name

Creation and deletion timing

Management method

Supported version#1

Format

Maximum file size (bytes)

Spool file that can be deleted during operation#10

1

Error information transfer

Any path name

(the default during execution of the monts command is under /tmp)

When the monts command is executed

--

01-10 or later

Binary

Variable#2

--

2

Module trace

All files beginning with /opt/hitachi/HAmon/core

When HA Monitor terminates abnormally

If two or more of these files exist under /opt/hitachi/HAmon when HA Monitor is started, HA Monitor first clears the contents of spool/save and then moves the core files from /opt/hitachi/HAmon.

01-10 or later

Binary

675,840,000

--

3

Reset path configuration file

etc/.CPSS_RESET

When the reset path is set by using the monsetup command

Manages one generation.

01-62 or earlier

Text

108

--

Manages two generations: One with no suffix and one with suffix _old (generation with suffix _old is output under spool/).

01-63 or later

4

HA Monitor start information control file

etc/.CTLFILE

When HA Monitor is started

Manages one generation.

01-10 or later

Binary

32

--

5

Configuration check file for HA Monitor start/termination methods

etc/.HApwon_manual

When automatic start and termination are configured by the monsetup command

Manages one generation.

01-30 or later

Text

0

--

6

etc/.HApwoff_manual

--

7

Reset path option configuration file

etc/.RSTDENV

When the reset path is set by using the monsetup command

Manages one generation.

01-62 or earlier

Text

42

--

Manages two generations: One with no suffix and one with suffix _old (generation with suffix _old is output under spool/).

01-63 or later

8

Connection settings

etc/connection

When HA Monitor is terminated

Manages one generation.

01-10 or later

Text

Variable#3

--

9

Backup copy of server environment definition file

etc/servers.bac

When the mondevice command is executed (except when the -c option is specified)

Manages one generation.

01-10 or later

Text

Variable#4

--

10

Monitoring history

history/patrol_history

When server or host slowdown is recovered while server monitoring history is being collected (with the ph_log_size operand specified)

Manages two _old generations

(wraparound and round-robin methods).

01-12 or later

Text

10,485,760

--

11

HA Monitor start message skip information

spool/.dup_start

When an old HA Monitor start message is received from the other host

Manages two .bak generations.

01-10 or later

Binary

73,400,320

Y

12

fuser execution results

All files with names beginning with spool/.ERRdisk_

When a disk disconnection error occurs

Manages the number of devices × two .bak generations.

01-10 or later

Text

199,680

Y

13

haiod troubleshooting information

spool/.haiod_trace

When a slowdown or another failure occurs#6

Manages 21 generations under spool/ and spool/haiod_trace/.

01-11 or later

Text

Through 01-33

4,142,080

01-40 or later

4,142,080 to 8,749,056#5

Y

14

halrstd troubleshooting information

spool/.halrstd_trace

When a slowdown or another failure occurs#6

Manages 21 generations under spool/ and spool/halrstd_trace/.

01-50 or later

Text

4,178,944

Y

15

VMware ESXi response information

spool/.halrstdlog

When HA Monitor starts, connection is established between hosts, or a host failure is detected

Manages one generation (wraparound method).

01-50 or later

Text

1,049,120

Y

16

hcmalive troubleshooting information

spool/.hcmalive_trace

When a slowdown or another failure occurs#6

Manages 21 generations under spool/ and spool/hcmalive_trace/.

01-32 or later

Text

4,178,944

Y

17

hcmread troubleshooting information

spool/.hcmr_trace

When a slowdown or another failure occurs#6

Manages 21 generations under spool/ and spool/hcmr_trace/.

01-10 or later

Text

01-30 or earlier

20,480

01-31 to 01-33

4,178,944

01-40 or later

4,178,944 to 8,749,056#5

Y

18

JP1 error information

spool/.jp1log

When JP1 event registration processing results in an error while the JP1 event notification function is being used (jp1_event operand is specified)

Manages one generation.

01-10 or later

Text

4,096

Y

19

HA Monitor's internal message log

spool/.msglog

When an HA Monitor message is issued

Manages one generation

(wraparound method).

01-10 or later

Text

Through 01-33

917,504

01-40 or later

917,504 (if HA Monitor Extension is used, 26,240,000)

N

20

Socket file for hcmalive process control

spool/.s_alive

When HA Monitor is started or terminated

Manages one generation.

01-30 or later

Binary

0

N

21

Socket file for hcmalive command processing

spool/.s_alivecmd

When HA Monitor is started or terminated

Manages one generation.

01-31 or later

Binary

0

N

22

Socket file for commands

spool/.s_cmnd

When HA Monitor is started or terminated

Manages one generation.

01-10 or later

Binary

0

N

23

Socket file for internal event control

spool/.s_event

When HA Monitor is started or terminated

Manages one generation.

01-10 or later

Binary

0

N

24

Socket file for halrstd process control

spool/.s_halrstd

When HA Monitor is started or terminated

Manages one generation.

01-50 or later

Binary

0

N

25

Socket file for hcmread command processing

spool/.s_hcmrcmd

When HA Monitor is started or terminated

Manages one generation.

01-31 or later

Binary

0

N

26

Socket file for haiod process control

spool/.s_iod

When HA Monitor is started or terminated

Manages one generation.

01-10 or later

Binary

0

N

27

spool/.s_iod2

28

spool/.s_iodother

29

spool/.s_iodcmd

30

API interface socket for haiod UAPs

spool/.s_iod_up

When HA Monitor is started or terminated

Manages one generation.

01-42 or later

Binary

0

N

31

Socket file for prcmain process external event control

spool/.s_other

When HA Monitor is started or terminated

Manages one generation.

01-10 or later

Binary

0

N

32

Socket file for server control

spool/.s_path

When HA Monitor is started or terminated

Manages one generation.

01-10 or later

Binary

0

N

33

Socket file for haresetd process control

spool/.s_resetd

When HA Monitor is started or terminated

Manages one generation.

01-10 or later

Binary

0

N

34

Failure management processor response information

spool/.sphralog

When HA Monitor starts, when the reset path fails, or when failure management processor commands are sent or received

Manages one generation

(wraparound method).

01-10 or later

Text

Through 01-33

21,248

01-40 or later

21,248 (if HA Monitor Extension is used, 1,049,120)

01-53 or later

524,560

N

35

Command log

spool/cmdlog/.cmdlog1

When HA Monitor's commands and user commands are executed or terminated

Manages one generation

(wraparound method).

01-11 or later

Text

917,504

N

36

spool/cmdlog/.cmdlog2

Through 01-33

917,504

01-40 or later

917,504 (if HA Monitor Extension is used, 3,145,728)

37

spool/cmdlog/.uoclog

Through 01-33

917,504

01-40 or later

917,504 (if HA Monitor Extension is used, 83,886,080)

38

spool/cmdlog/.unamewrap.log

When moncheck is executed or HA Monitor is started

Manages two .bak generations.

01-60 or later

Text

10,240

N

39

spool/cmdlog/.unamewrap_ins.log

When installation is performed

Text

10,240

N

40

Extended troubleshooting information

spool/DUPSTART_EX

When an old HA Monitor start message is received from the other host

Manages two .bak generations.

01-11 or later

Text

199,680

Y

41

spool/DUPSTART_EX_TOP#7

20,480

42

spool/ERRPATH_*_EX

When a monitoring path failure has occurred

199,680

43

spool/ERRPATH_*_EX_TOP#7

20,480

44

spool/HOSTSLOW_remote-host-name_EX

When a 70% slowdown has occurred on the remote host

199,680

45

spool/HOSTSLOW_remote-host-name_EX_TOP#7

20,480

46

spool/HOSTSLOW_local-host-name_EX

When the local host has recovered from a slowdown

199,680

47

spool/HOSTSLOW_local-host-name_EX_TOP#7

20,480

48

spool/MONTS_EX

When the monts command is executed

199,680

49

spool/MONTS_EX_TOP#7

20,480

50

spool/RESET_EX

When the system is reset

199,680

51

spool/RESET_EX_TOP#7

20,480

52

spool/SERVSLOW_server-name_EX

When a 70% slowdown has occurred on the server

199,680

53

spool/SERVSLOW_server-name_EX_TOP#7

20,480

54

spool/PTRLCMD_EX_SERVSLOW_server-name_EX

When the monitor-mode server monitoring function is used and a slowdown continues for 70% of the time specified (in seconds) for ptrlcmd_ex_tmout

01-65 or later

199,680

55

spool/PTRLCMD_EX_SERVSLOW_server-name_EX_TOP#7

20,480

56

Monitoring path failure information

All files beginning with spool/ERRPATH_

When a monitoring path failure has occurred

Manages the number of monitoring paths × two .bak generations.

01-10 or later

Binary

73,400,320

Y

57

Debug information during slowdown

spool/HOSTSLOW_remote-host-name

When a 70% slowdown has occurred on the remote host

Manages the number of hosts × two .bak generations.

01-10 or later

Binary

73,400,320

Y

58

spool/HOSTSLOW_local-host-name

When the local host has recovered from a slowdown

Manages two .bak generations.

59

spool/SERVSLOW_server-name

When a 70% slowdown has occurred on the server

Manages the number of servers × two .bak generations.

60

spool/UAPSLOW_UAP-alias-name

When a 70% slowdown has occurred on the UAP

Manages the number of programs × one generation.

01-42 or later

61

spool/PTRLCMD_EX_SERVSLOW_server-name

When the monitor-mode server monitoring function is used and a slowdown continues for 70% of the time specified (in seconds) for ptrlcmd_ex_tmout

Manages two .bak generations.

01-65 or later

62

LAN monitoring status

spool/lanpatrol/IF-name.log

When the LAN monitoring function is used

Manages one generation

(wraparound method).

If the KAMN529-I or KAMN530-E message is output, the information is backed up to _judge.#8

01-40 or later

and

01-56 or later#8

Text

917,504

Y

63

spool/lanpatrol/IF-name_stat.log

4,096

64

Internal execution logs for LAN monitoring script

spool/lanpatrol/lanpatrol_IF-name.log

When the LAN monitoring function is used

Manages two _old generations

(If the LAN monitoring function is used, the information is backed up to _err when the execution time of the LAN monitoring script exceeds the LAN monitoring interval.)

If the KAMN529-I or KAMN530-E message is output, the information is backed up to _judge.#8

01-52 or later

and

01-56 or later#8

Text

5,242,880

Y

65

Error information during system reset

spool/RESET

When the system is reset

Manages two .bak generations.

01-10 or later

Binary

73,400,320

Y

66

SCSI reservation and path check error information#9

spool/scsi3PR_check_err_server-alias-name.scsilog

When a SCSI command results in an error while SCSI reservation and SCSI connection path are checked

Manages two _old generations.

01-52 or later

Text

10,485,760

Y

67

SCSI reservation information#9

spool/scsi3PR_server-alias-name.scsilog

When the server starts

When the server terminates

(when the SCSI command is issued)

Manages two generations: One with no suffix and one with suffix _old

01-52 or later

Text

10,485,760

Y

spool/scsi3PR_scsiclr_server-alias-name.scsilog

When the monscsiclr command is executed

01-58 or later

68

Statistical information

spool/sms

(spool/oldsms)

When a host or server slowdown has occurred

01-10 or earlier:

Manages two generations: One with no suffix and one with suffix _old

(Continuous increase mode)

01-11 or later:

Manages two _old generations

(wraparound and round-robin methods).

01-10 or later

Text

01-10 or earlier

Continuous increase

01-11 or later

102,400

Y

69

Available resource control information

spool/ulimit.log

When HA Monitor is started

Manages one generation.

01-22 or later

Text

703

Y

70

File system switchover information

spool/server-alias-name.fslog

When file systems are switched

Manages the number of servers × two _old generations.

01-10 or later

Text

2,147,483,647

Y

71

Multipath command execution information#9

spool/server-alias-name.mplog

When the server starts

Manages two _old generations.

01-52 or later

Text

65,535

Y

72

LAN status settings file execution results

spool/server-alias-name.up.log

When the server starts or terminates

Manages the number of servers × two .bak generations.

01-12 or later

Text

5,120

Y

73

spool/server-alias-name.down.log

74

Volume group connection and disconnection information

spool/volume-group-name.vglog

When the vgchange -a y command is executed

Manages the number of volume groups × two _old generations.

01-10 or later

Text

65,535

Y

75

Message log

All files beginning with /var/log/messages

When an HA Monitor message is issued

--

01-10 or later

Text

Size specified in /etc/syslog.conf

--

76

Settings for handling a communication error occurring when the reset path is being set

spool/.CPSS_RESET_err

When the reset path is set by using monsetup

Manages one generation.

01-63 or later

Text

108

Y

spool/.RSTDENV_err

42

77

Execution log for the monitoring command

spool/ptrlcmd_ex/server-alias-name_ptrlcmdlog

  • When the server whose monitoring command is specified for the ptrlcmd_ex operand is started as the active server

  • When the server whose monitoring command is specified for the sby_ptrlcmd_ex operand is started as the standby server

Manages three generations: One with no suffix, one with suffix .bak, and one with suffix .bak2.

01-65 or later

Text

917,504

N

78

Details log for the monitoring command

spool/ptrlcmd_ex/server-alias-name.log

  • When the server whose monitoring command is specified for the ptrlcmd_ex operand is started as the active server

  • When the server whose monitoring command is specified for the sby_ptrlcmd_ex operand is started as the standby server

Manages one generation.

01-65 or later

Text

Depends on the condition (for example, the maximum size of the sample monitoring command (patrol_ex.sh) is about 1,000).

Y

spool/ptrlcmd_ex/server-alias-name.log_err

  • When the server whose monitoring command is specified for the ptrlcmd_ex operand detects a server failure while operating as the active server

  • When the server whose monitoring command is specified for the sby_ptrlcmd_ex operand detects a server failure while operating as the standby server

79

Information returned when the reset path is initialized

spool/.gtstpxlog

  • When HA Monitor is started

  • When the reset path is restored after initialization of the reset path failed

Manages two generations: One with no suffix and one with suffix .bak

01-68 or later

Text

1,537

N

80

Disk monitoring execution status log

spool/disk_patrol/disk-monitoring-target-file's-absolute-path#11_log

When disk monitoring is started

Manages one generation.

01-69 or later

Text

10,485,760

Y

spool/disk_patrol/disk-monitoring-target-file's-absolute-path#11_log_err

When disk access for disk monitoring fails

Manages one generation.

Legend:

Y: Can be deleted.

N: Cannot be deleted.

--: Not applicable

#1

Files created by version 01-41 or earlier are supported by P-9S2C-E111 HA Monitor.

#2

The maximum file size varies because data under a directory such as /opt/hitachi/HAmon/etc is acquired in batch mode and files that are not for HA Monitor are also transferred.

#3

The maximum file size depends on the number of connected hosts and the number of monitoring paths.

#4

The maximum file size varies according to the number of device definitions.

#5

The maximum file size depends on the servmax and hostmax operand values.

#6

The file is output at one of the following times:

• When HA Monitor's troubleshooting information collection command (monts command) is executed (corresponds to No. 1 in the table)

• When a monitoring path failure is detected (corresponds to No. 56 in the table)

• When a slowdown of the other host is detected (corresponds to No. 57 in the table)

• When a slowdown of the local host is detected (corresponds to No. 58 in the table)

• When a server slowdown is detected on the local host (corresponds to No. 59 in the table)

• When a UAP slowdown is detected on the local host (corresponds to No. 60 in the table)

• When the remote host is reset (corresponds to No. 65 in the table)

#7

The output of the top command is stored in xxx_EX_TOP.

#8

Information is backed up to _judge in version 01-56 or later. No data is backed up to _judge in version 01-55 or earlier, or in version 01-56 or later when the function for controlling hot standby based on the availability of LAN communications is not used.
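The condition in this note can be written as a small predicate. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes version strings always take the two-part form used in this table (for example 01-56).

```python
def parse_version(version):
    """Split a version string such as '01-56' into the tuple (1, 56)."""
    major, minor = version.split("-")
    return int(major), int(minor)


def judge_backup_expected(version, lan_hot_standby_control_used):
    """Return True when a _judge backup can be expected per note #8.

    Both conditions must hold: the version is 01-56 or later, and the
    function for controlling hot standby based on the availability of
    LAN communications is in use.
    """
    return parse_version(version) >= (1, 56) and lan_hot_standby_control_used


print(judge_backup_expected("01-56", True))
print(judge_backup_expected("01-55", True))
print(judge_backup_expected("01-60", False))
```

Comparing parsed tuples rather than raw strings avoids relying on zero-padding for a correct ordering.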

#9

This file is output when SCSI reservation for shared disks is used.

#10

Delete files after terminating HA Monitor. Do not delete directories.
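As a cautious illustration of this rule, the Python sketch below only lists deletion candidates and never removes anything. The function name and the glob patterns are hypothetical. The sketch assumes the caller has already terminated HA Monitor and passes patterns only for files marked Y in the table. Directories are always skipped, as the note requires.

```python
from pathlib import Path


def list_deletable_files(spool_dir, patterns):
    """Return regular files under spool_dir that match any glob pattern.

    Dry run only: nothing is deleted. Directories and symbolic links are
    never returned, so a caller cannot remove a directory by mistake.
    This function does not check whether HA Monitor is stopped.
    """
    spool = Path(spool_dir)
    found = set()
    for pattern in patterns:
        for entry in spool.glob(pattern):
            if entry.is_file() and not entry.is_symlink():
                found.add(entry)
    return sorted(found)


# Example call (path and patterns are assumptions for illustration):
# for path in list_deletable_files("/opt/hitachi/HAmon/spool", ["RESET*", "ERRPATH_*"]):
#     print(path)
```

Reviewing the returned list before deleting anything keeps the operation reversible up to the last step.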

#11

Compose this part of the file name from the absolute path of the disk monitoring target file by replacing each slash (/) with a hyphen (-).
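The naming rule in this note can be sketched in Python as follows. The function name and the sample target path are hypothetical. The sketch takes the note literally, so the leading slash also becomes a hyphen; that reading is an assumption and should be confirmed against the files actually produced on a system.

```python
def disk_patrol_log_names(target_path):
    """Derive the log file names for a disk monitoring target (No. 80).

    Every '/' in the absolute path is replaced with '-', then the
    '_log' and '_log_err' suffixes from the table are appended.
    Returned paths are relative to /opt/hitachi/HAmon.
    """
    if not target_path.startswith("/"):
        raise ValueError("expected an absolute path")
    flattened = target_path.replace("/", "-")
    base = "spool/disk_patrol/" + flattened
    return base + "_log", base + "_log_err"


log_name, err_name = disk_patrol_log_names("/mnt/shared/check.dat")
print(log_name)
print(err_name)
```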