Hitachi

For Linux(R) (x86) Systems HA Monitor Cluster Software


7.5.9 Handling device failures on a shared disk (while the standby server is starting, on standby, or terminating) (using SCSI reservation for shared disk)

If an I/O error occurs on the device specified in the scsi_device or dmmp_device operand of the server environment definition while the standby server is starting, on standby, or terminating, HA Monitor issues the KAMN725-W and KAMN726-E messages and resumes processing. In a redundant configuration with multipath software, HA Monitor issues the KAMN726-E message only when a failure has occurred on all paths to the same shared disk.
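
The all-paths condition for KAMN726-E can be illustrated with a minimal sketch. This is only a model of the rule stated above, not HA Monitor code: the function name kamn726e_expected and the list-of-booleans representation of path states are hypothetical, and a single-path configuration is assumed to behave like a one-path list.

```python
def kamn726e_expected(path_failed):
    """Model of the rule above: KAMN726-E is expected only when every
    path to the same shared disk has failed.

    path_failed is a hypothetical representation: one boolean per path,
    True meaning an I/O failure was detected on that path. A single-path
    configuration is assumed to be a one-element list.
    """
    if not path_failed:
        raise ValueError("at least one path is required")
    return all(path_failed)


# Two paths, one still healthy: the rule says no KAMN726-E yet.
one_path_down = kamn726e_expected([True, False])
# Both paths failed: KAMN726-E is expected.
all_paths_down = kamn726e_expected([True, True])
```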

To recover from a device failure when the standby server is starting, on standby, or terminating:

  1. Determine the cause of the device failure.

    Determine the cause of the failure by checking the KAMN725-W and KAMN726-E messages and any messages issued by the kernel, and by using hardware management tools.

  2. Resolve the cause of the device failure.

    Take an appropriate action to resolve the cause of the device failure, such as replacing the failed device.

    In a redundant configuration with multipath software, do not restore the failed path to online status (failback) at this point.

  3. In a multipath configuration, restore the recovered path to online status (failback).

    Restore a recovered path to online status (failback) by using the appropriate command provided by the multipath software (HDLM, DMMP, or HFC-PCM). For details about how to restore paths to online status with HDLM, see the manual Hitachi Dynamic Link Manager Software User's Guide (for Linux(R) systems). For DMMP or HFC-PCM, see the documentation for that software.

    This step is not necessary in a single-path configuration or in a VMware ESXi-based virtualization environment (where DMMP is not used).
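
When determining the cause in step 1, it can help to pull the relevant messages out of a log file. The following is a minimal sketch, assuming the KAMN725-W and KAMN726-E messages appear as plain text lines in a log file you can read. The function names and the sample lines are hypothetical, the sample lines are placeholders rather than real HA Monitor output, and the log file location depends on your syslog configuration.

```python
import re

# Message IDs named in this section.
KAMN_IDS = ("KAMN725-W", "KAMN726-E")
_PATTERN = re.compile("|".join(re.escape(msg_id) for msg_id in KAMN_IDS))


def find_kamn_messages(lines):
    """Return (line_number, message_id, text) for each line that
    contains one of the message IDs above."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = _PATTERN.search(line)
        if match:
            hits.append((number, match.group(0), line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan a log file; the path is whatever file your system logs to."""
    with open(path, errors="replace") as log:
        return find_kamn_messages(log)


# Placeholder lines only, not real HA Monitor or kernel output.
sample = [
    "placeholder line without a message ID\n",
    "placeholder line mentioning KAMN725-W\n",
    "placeholder line mentioning KAMN726-E\n",
]
hits = find_kamn_messages(sample)
```

Each hit gives the line number, so the surrounding kernel messages in the same file can be read in context.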

Note that if the OS is restarted while a server is operating as the active system on the remote host, SCSI-related messages might be output to the syslog file. For details, see 6.4.4 Handling SCSI-related messages output to syslog.