How do I recover from a DRBD split-brain?
In some situations, DRBD detects a split brain. This typically happens when the primary FMP goes offline, the secondary FMP makes updates that are not replicated, and then a second outage occurs. DRBD detects the split brain when connectivity is restored and the peer nodes exchange the initial DRBD protocol handshake. If DRBD detects that both nodes are (or were at some point, while disconnected) in the primary role, it immediately tears down the replication connection.
If a split-brain situation occurred, a notification is displayed on the FFM High Availability Status pane as well as the Cluster Dashboard page of the FMP Manager user interface. To view the notification message, on the FMP Manager user interface, use the Diagnostics menu to access the Diagnostics page, and then choose the Replication Alerts tab. To automatically fix the split-brain situation, click Fix Split Brain.
Recovering from a DRBD Split-Brain Situation
After DRBD detects the split brain, one node always has the resource in the StandAlone connection state. The other node may be in either of the following connection states:
StandAlone, if both nodes detected the split brain simultaneously.
WFConnection, if the peer tore down the connection before the other node had a chance to detect the split brain.
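The two cases above can be expressed as a simple check. The following is a minimal sketch and not part of FMP Manager: it assumes you have already read each node's connection state as a string (for example, from the output of `drbdadm cstate <resource>` on each node), and the function name `looks_like_split_brain` is hypothetical.

```python
def looks_like_split_brain(state_a: str, state_b: str) -> bool:
    """Return True if two connection states match the post-split-brain pattern.

    The pattern described above: at least one node is StandAlone, and the
    other node is either StandAlone or WFConnection.
    """
    states = {state_a.strip(), state_b.strip()}
    return "StandAlone" in states and states <= {"StandAlone", "WFConnection"}


if __name__ == "__main__":
    # Example: one node gave up the connection, the other is still waiting.
    print(looks_like_split_brain("StandAlone", "WFConnection"))
```

A node can also be StandAlone for other reasons (for example, after a manual disconnect), so a match is only consistent with a split brain; the Replication Alerts tab remains the authoritative indication.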
At this point, you must manually intervene by selecting one node whose modifications will be discarded (this node is referred to as the split brain victim). This is typically the FMP that was not previously the master.
On the FMP Manager user interface, use the Diagnostics menu to access the Diagnostics page, and then select the Replication Alerts tab.
Click Fix Split Brain.
In the pop-up dialog box, select the victim node, and then click Fix.
The FMP Manager discards the data on the split brain victim and copies the data from the other node so that replication can resume.
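For reference, the DRBD 8.4 User's Guide describes a manual command-line recovery that has the same effect. The commands that FMP Manager runs internally are not documented here, so the following is only a sketch of that manual procedure: it builds the `drbdadm` commands for each node without running them. The function name `recovery_commands` and the resource name `r0` are hypothetical.

```python
def recovery_commands(resource: str, survivor_state: str) -> dict:
    """Build the drbdadm commands for manual split-brain recovery (DRBD 8.4 syntax).

    resource:       DRBD resource name (hypothetical, e.g. "r0").
    survivor_state: connection state of the node whose data is kept.
    """
    # On the victim: demote the resource, then reconnect while discarding
    # the victim's own modifications.
    victim = [
        ["drbdadm", "secondary", resource],
        ["drbdadm", "connect", "--discard-my-data", resource],
    ]
    # On the survivor: reconnect only if it is StandAlone. A survivor in
    # WFConnection is already waiting for the peer and reconnects by itself.
    survivor = []
    if survivor_state == "StandAlone":
        survivor.append(["drbdadm", "connect", resource])
    return {"victim": victim, "survivor": survivor}


if __name__ == "__main__":
    plan = recovery_commands("r0", "StandAlone")
    for node, commands in plan.items():
        for command in commands:
            print(node + ": " + " ".join(command))
```

Returning the commands instead of executing them makes it possible to review the plan before anything is discarded. Run the victim commands only on the node you selected as the split brain victim.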
When the split-brain situation is resolved, the entries in the table on the Cluster Summary page of the FMP Manager user interface display a state of UpToDate in the Replication Data Status column and a state of Connected in the Replication Connection Status column.
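The same two conditions can be checked from the command line. The sketch below assumes that `drbdadm cstate <resource>` prints the connection state (for example, `Connected`) and that `drbdadm dstate <resource>` prints the local and peer disk states separated by a slash (for example, `UpToDate/UpToDate`), as in DRBD 8.4. The function name `replication_healthy` is hypothetical.

```python
def replication_healthy(cstate: str, dstate: str) -> bool:
    """Return True when the resource is connected and both disks are UpToDate.

    cstate: output of `drbdadm cstate <resource>`.
    dstate: output of `drbdadm dstate <resource>`, in the form local/peer.
    """
    local_disk, _, peer_disk = dstate.strip().partition("/")
    return (
        cstate.strip() == "Connected"
        and local_disk == "UpToDate"
        and peer_disk == "UpToDate"
    )


if __name__ == "__main__":
    # While the victim is still resynchronizing, the check fails.
    print(replication_healthy("SyncSource", "UpToDate/Inconsistent"))
    # After resynchronization completes, the check passes.
    print(replication_healthy("Connected", "UpToDate/UpToDate"))
```

The check fails while the victim is still copying data from the other node, which is expected; it passes once resynchronization has finished.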
Refer to www.drbd.org/users-guide-8.4 for more information.