19.8.1 Common configuration mistakes : JP1 Version 12 JP1/Network Node Manager i Setup Guide

Disk configuration is not valid.

VCS or SCS: If a resource cannot be probed, the configuration is incorrect. If a disk cannot be probed, the operating system might no longer be able to access the disk.
Test the disk configuration manually, and check the HA product documentation to confirm that the configuration is appropriate.

The disk is in use and cannot be started for the HA resource group.

Always check that the disk is not activated before starting the HA resource group.
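
As a minimal sketch of such a pre-start check on a Linux or UNIX node, the following Python snippet reports whether a directory is currently an active mount point. The path /nnm_shared and the function name disk_is_activated are hypothetical, chosen only for illustration. The snippet is not part of NNMi or of any HA product.

```python
import os


def disk_is_activated(mount_point: str) -> bool:
    """Return True if mount_point is currently a mounted file system.

    os.path.ismount returns False for paths that do not exist.
    """
    return os.path.ismount(mount_point)


if __name__ == "__main__":
    shared = "/nnm_shared"  # hypothetical shared disk mount point
    if disk_is_activated(shared):
        print(f"{shared} is mounted; deactivate it before starting the HA resource group")
    else:
        print(f"{shared} is not mounted")
```

A check like this covers only the mount state on the local node. Whether the disk is active on another cluster node must still be confirmed with the HA product's own tools.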

WSFC network configuration is not valid.

If network traffic flows across multiple NICs, RDP sessions fail when you start programs that consume a large amount of network bandwidth, such as the NNMi ovjboss process.

Some HA products do not restart automatically at startup.

Review the HA product documentation for details about how to configure automatic restart at startup.

NFS or other access is added directly to the OS.

The resource group configuration must manage this behavior.

Working in a directory under the shared disk mount point during a failover or while the HA resource group is being taken offline.

The HA product kills any processes that prevent the shared disk from being unmounted. Change to a directory outside the shared disk before a failover occurs or before the resource group is taken offline.
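
To illustrate the condition described above, the following Python sketch tests whether a directory (for example, the current working directory of a shell or script) lies at or below the shared disk mount point. The path /nnm_shared and the function name is_inside are hypothetical. The sketch inspects a single path only; finding every process that holds the file system open would need an operating system tool such as fuser or lsof.

```python
from pathlib import Path


def is_inside(path: str, mount_point: str) -> bool:
    """Return True if path is the mount point itself or lies beneath it."""
    p = Path(path).resolve()
    m = Path(mount_point).resolve()
    return p == m or m in p.parents


if __name__ == "__main__":
    shared = "/nnm_shared"  # hypothetical shared disk mount point
    cwd = str(Path.cwd())
    if is_inside(cwd, shared):
        print(f"{cwd} is on the shared disk; change directory before a failover")
    else:
        print(f"{cwd} is outside {shared}")
```

The comparison works on whole path components, so a sibling directory such as /nnm_shared2 is not treated as being inside /nnm_shared.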

Reusing the HA cluster virtual IP address as the HA resource virtual IP address.

This configuration works on one system but not on the other. Configure different IP addresses for each system.

Timeouts are too frequent.

If the products are not behaving correctly, the HA product might time out the HA resource and cause a failover.

In WSFC, check the value of the "Time to wait for resource to start" setting. NNMi sets this value to 15 minutes, but you can increase it.

Maintenance mode is not being used.

Maintenance mode was created for debugging HA failures. If you attempt to bring a resource group online on a system and it fails over shortly thereafter, use maintenance mode to keep the resource group online so that you can see what is failing.

Cluster logs are not being used.

Cluster logs can reveal many common configuration mistakes.
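
As a rough first pass over a cluster log, a sketch such as the following can pull out the lines that mention a failure. The keyword list and the function name suspicious_lines are assumptions made for illustration. The location and format of the cluster log depend on the HA product, so the file path is passed as an argument rather than assumed.

```python
import sys
from typing import Iterable, List, Sequence

# Hypothetical keywords; adjust them to the messages your HA product writes.
KEYWORDS = ("error", "fail", "timeout", "timed out")


def suspicious_lines(lines: Iterable[str],
                     keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines that contain any keyword, ignoring case."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <path-to-cluster-log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for hit in suspicious_lines(log):
            print(hit)
```

A keyword match only narrows down where to look. The surrounding log entries and the HA product documentation are still needed to interpret each message.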