Recovering the Operational Node when the Other Node is Unclean

A node in a clustered Fabric Management Platform environment enters the Unclean state if there are multiple failures in the cluster, or if a reboot occurred and the cluster cannot communicate with the node. This typically occurs when the platform management console is not reachable from its peer. While a node is in the Unclean state, the Fabric Manager is unavailable.

Use the following procedure to restart the Fabric Manager services on the operational node when the other node is in the Unclean state:

Log in to the ClearPath Forward Fabric Management Platform web-based graphical user interface (also called the FMP Manager user interface).
Access the Cluster Dashboard page, and then check the following items:
- One node is Online (green) and one is Unclean (red).
- All resources in the Inactive Resources section are in the Stopped state.
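The same check can be approximated from a command prompt. The sketch below assumes that the cluster stack is Pacemaker (suggested by the `crm` commands used later in this procedure) and that its `crm_mon` tool is available on the platform. The helper name `unclean_nodes` is hypothetical. It reads one-shot `crm_mon -1` output on standard input and prints the name of each node reported as UNCLEAN.

```shell
# Hypothetical helper: print the names of nodes that crm_mon reports as UNCLEAN.
# Assumes status lines of the form "Node <name>: UNCLEAN (offline)".
unclean_nodes() {
    sed -n 's/.*Node \([^ :]*\).*UNCLEAN.*/\1/p'
}

# Typical use on a cluster node (not run here):
#   crm_mon -1 | unclean_nodes
```

Empty output means no node is currently reported as UNCLEAN. The Cluster Dashboard remains the documented source for this check.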
Remove the failed or Unclean node from the FM LAN or disconnect its power cables. This prevents any failover to the node until it can be repaired.
After isolating the failed or Unclean node, activate the functional node by disabling the STONITH feature in the cluster. While the STONITH feature is disabled, failover is not possible.
Note: Make sure that no changes are made to the Fabric Manager database on the Unclean node while the STONITH feature is disabled. If changes are made, the Fabric Manager database on the failed or Unclean node must be resynchronized from an empty state.
To disable the STONITH feature using a command prompt on the Fabric Management Platform, enter the following command:
```
crm configure property stonith-enabled=false
```
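To confirm that the property change took effect, the current value can be read back from the cluster configuration. The following is a minimal sketch, assuming that `crm configure show` prints the property as `stonith-enabled=false` (optionally with the value in double quotes). The helper name `stonith_enabled_value` is hypothetical.

```shell
# Hypothetical helper: extract the stonith-enabled value from
# "crm configure show" output supplied on standard input.
# Accepts both stonith-enabled=false and stonith-enabled="false".
stonith_enabled_value() {
    sed -n 's/.*stonith-enabled="\{0,1\}\([A-Za-z]*\)"\{0,1\}.*/\1/p' | head -n 1
}

# Typical use on a cluster node (not run here):
#   crm configure show | stonith_enabled_value
```

The helper prints nothing if the property is not set in the configuration.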
Because the platform management console on the failed or Unclean node is not reachable, one of the STONITH resources generates an error. The error appears in the failure history section of the Cluster Dashboard page after a few minutes.
If you expect the repair and reinstallation of the failed or Unclean node to take some time, you may want to stop the failed STONITH resource (for example, STONITH2) to prevent further error logging. To stop this resource, locate the failed STONITH resource on the Cluster Dashboard page, click the arrow button next to the resource, and then select Stop. Click the arrow button next to the failed STONITH resource again, and then select Clean Up.
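A command-prompt approximation of the Stop and Clean Up actions is sketched below. It assumes that the `crm` shell on the platform provides the standard `resource stop` and `resource cleanup` subcommands and that these correspond to the dashboard actions, which this document does not confirm. The function name `quiet_failed_stonith` and the `CRM` override variable are hypothetical.

```shell
# Hypothetical helper: stop a failed STONITH resource, then clear its
# failure history. Set CRM=echo to print the commands instead of running them.
CRM="${CRM:-crm}"

quiet_failed_stonith() {
    resource="$1"
    [ -n "$resource" ] || { echo "usage: quiet_failed_stonith RESOURCE" >&2; return 2; }
    "$CRM" resource stop "$resource" &&
    "$CRM" resource cleanup "$resource"
}

# Typical use on a cluster node (not run here):
#   quiet_failed_stonith STONITH2
```

Pass the name of the resource that the Cluster Dashboard shows as failed. A dry run with `CRM=echo` is a cautious first step.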
When the failed or Unclean node has been repaired and can be added back to the FM LAN, enable STONITH and restart the STONITH resource that failed. Enable STONITH before you reconnect the repaired node. STONITH errors continue to appear until that node is fully functional.
To enable the STONITH feature using a command prompt on the Fabric Management Platform, enter the following command:
```
crm configure property stonith-enabled=true
```
To restart the STONITH resource, locate the STONITH resource in the Inactive Resources section of the Cluster Dashboard page, click the arrow button next to the resource, and then select Start.
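The enable and restart actions can also be sketched as one command-prompt sequence. This assumes that the `crm` shell's standard `resource start` subcommand is equivalent to selecting Start on the dashboard, which this document does not confirm. The function name `restore_stonith` and the `CRM` override variable are hypothetical.

```shell
# Hypothetical helper: re-enable STONITH, then start the STONITH resource
# that was stopped. Set CRM=echo to print the commands instead of running them.
CRM="${CRM:-crm}"

restore_stonith() {
    resource="$1"
    [ -n "$resource" ] || { echo "usage: restore_stonith RESOURCE" >&2; return 2; }
    "$CRM" configure property stonith-enabled=true &&
    "$CRM" resource start "$resource"
}

# Typical use on a cluster node (not run here):
#   restore_stonith STONITH2
```

The resource is started only if the property change succeeds, which matches the order given in the procedure.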
Reconnect the power cables and apply power to the platform management console and/or reconnect the FM LAN to the previously failed or Unclean node. After the node boots, it may take several minutes to come online. The Cluster Dashboard then shows the status of the node as Online (green).
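Because the node can take several minutes to rejoin, a small polling check may be more convenient than refreshing the dashboard. The sketch below assumes that Pacemaker's `crm_mon -1` is available and that it lists active nodes on a line of the form `Online: [ node1 node2 ]`. The helper name `node_is_online` is hypothetical.

```shell
# Hypothetical helper: succeed if the given node name appears on the
# "Online: [ ... ]" line of crm_mon output supplied on standard input.
node_is_online() {
    awk -v node="$1" '
        /Online: \[/ { for (i = 1; i <= NF; i++) if ($i == node) found = 1 }
        END { exit !found }
    '
}

# Typical use on a cluster node (not run here), checking every 30 seconds:
#   until crm_mon -1 | node_is_online "$NODE"; do sleep 30; done
```

The comparison is an exact match on the node name, so a node listed only on an `OFFLINE:` line is not treated as online.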