MySQL :: MySQL 8.0 Reference Manual :: 18.6.6 Responses to Failure Detection and Network Partitioning

18.6.6 Responses to Failure Detection and Network Partitioning

Group Replication's failure detection mechanism is designed to identify group members that are no longer communicating with the group, and expel them as and when it seems likely that they have failed. Having a failure detection mechanism increases the chance that the group contains a majority of correctly working members, and that requests from clients are therefore processed correctly.

Normally, all group members regularly exchange messages with all other group members. If a group member receives no messages from a particular fellow member for 5 seconds, it creates a suspicion of that member at the end of this detection period. When a suspicion times out, the suspected member is assumed to have failed and is expelled from the group. An expelled member is removed from the membership list seen by the other members, but it does not know that it has been expelled, so it sees itself as online and the other members as unreachable. If the member has not in fact failed (for example, because it was briefly disconnected by a temporary network issue) and it is able to resume communication with the other members, it receives a view containing the information that it has been expelled from the group.

The responses of group members, including the failed member itself, to these situations can be configured at a number of points in the process. By default, the following behaviors happen if a member is suspected of having failed:

When a suspicion is created, it times out immediately (its lifetime is set to 0), so the suspected member is expelled as soon as the expired suspicion is identified. The member could potentially survive for a further few seconds after the timeout because the check for expired suspicions is carried out periodically.
If an expelled member resumes communication and realizes that it was expelled, it accepts its expulsion and does not try to rejoin the group.
When an expelled member accepts its expulsion, it switches to super read only mode and awaits operator attention. (The exception is in releases from MySQL 8.0.12 to 8.0.15, where the default was for the member to shut itself down. From MySQL 8.0.16, the behavior was changed to match the behavior in MySQL 5.7.)
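The default timing described above can be worked through as simple arithmetic. The following is a minimal sketch, not part of Group Replication itself: the function name and parameter are hypothetical, and the model assumes only what the text states (a fixed 5-second detection period followed by the suspicion lifetime). The extra delay introduced by the periodic check for expired suspicions is not specified in the text, so it is left out.

```python
DETECTION_PERIOD_SECONDS = 5  # silence period before a suspicion is created


def earliest_expulsion_seconds(suspicion_lifetime_seconds: int = 0) -> int:
    """Return the seconds of silence after which a member can be expelled.

    Hypothetical helper for illustration. The default lifetime of 0 matches
    the default behavior described in the text. The periodic check for
    expired suspicions may add a short, unmodelled delay on top.
    """
    if suspicion_lifetime_seconds < 0:
        raise ValueError("suspicion lifetime cannot be negative")
    return DETECTION_PERIOD_SECONDS + suspicion_lifetime_seconds
```

With the default lifetime of 0, a silent member becomes eligible for expulsion as soon as the 5-second detection period ends.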

These defaults are set to prioritize the correct operation of the group and the correct handling of requests. However, they might be inconvenient in the case of slower networks or networks with a high rate of transient failures, because in these situations there could be a frequent requirement for operator intervention to fix expelled members. They also do not allow for continued operation of the group to be planned in the case of expected network failures or machine slowdowns. You can use Group Replication configuration options to change these behaviors either permanently or temporarily, to suit your system's requirements and your priorities, as follows:

You can use the group_replication_member_expel_timeout system variable, which is available from MySQL 8.0.13, to allow additional time between the creation of a suspicion and the expulsion of the suspect member. You can set the lifetime of the suspicion up to 3600 seconds (one hour) before it times out. (The 5-second detection period before a suspicion is created is not configurable.) Suspect members in this state are listed as UNREACHABLE, but are not removed from the group's membership list.
Bear in mind that while a group has unreachable members, you cannot add or remove any other members or elect a new primary. If you do want to take one of these actions and you cannot make the suspect member active again, you can force the suspicion to time out by changing group_replication_member_expel_timeout on any online member to a value less than the time that has already elapsed since the suspicion was created.
You can use the group_replication_autorejoin_tries system variable, which is available from MySQL 8.0.16, to make an expelled member that is able to resume communication automatically try to rejoin the group. You can specify a number of attempts that the member makes to rejoin the group, instead of just accepting its expulsion as soon as it resumes communication. When the member's expulsion or unreachable majority timeout is reached, it makes an attempt to rejoin (using the current plugin option values), then continues to make further auto-rejoin attempts up to the specified number of tries. After an unsuccessful auto-rejoin attempt, the member waits 5 minutes before the next try. During the auto-rejoin procedure, the expelled member remains in super read only mode and displays an ERROR state on its view of the replication group.
Bear in mind that while a member remains in this mode, although writes cannot be made on the member, reads can, with an increasing likelihood of stale reads over time. If you do want to intervene to take the member offline, the member can be stopped manually at any time by using a STOP GROUP_REPLICATION statement or shutting down the server. You can monitor the auto-rejoin procedure using the Performance Schema. While an auto-rejoin procedure is taking place, the Performance Schema table events_stages_current shows the event “Undergoing auto-rejoin procedure”, with the number of retries that have been attempted so far during this instance of the procedure (in the WORK_COMPLETED field). The events_stages_summary_global_by_event_name table shows the number of times the server instance has initiated the auto-rejoin procedure (in the COUNT_STAR field). The events_stages_history_long table shows the time each of these auto-rejoin procedures was completed (in the TIMER_END field).
You can use the group_replication_exit_state_action system variable, which is available from MySQL 8.0.12 and MySQL 5.7.24, to choose whether an expelled member that fails to rejoin (or does not try) shuts down MySQL Server or switches itself to super read only mode. As with the auto-rejoin process, if the member goes to super read only mode, the likelihood of stale reads increases over time. Instructing the member to shut itself down ends this situation and means that you do not need to proactively monitor the servers for failures, but it also means that the MySQL Server instance is unavailable and must be restarted. Operator intervention is required whatever exit action is set, because an ex-member that has exhausted its auto-rejoin attempts (or never had any) and has been expelled from the group is not allowed to rejoin without a restart of Group Replication.
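To get a feel for how long an auto-rejoin procedure can keep a member in super read only mode, the retry schedule can be sketched as follows. This is an illustrative model only: the function name is hypothetical, and it assumes that each attempt itself takes no time, so the offsets are lower bounds measured from the first attempt.

```python
from typing import List

RETRY_WAIT_SECONDS = 5 * 60  # wait after each unsuccessful auto-rejoin attempt


def rejoin_attempt_offsets(autorejoin_tries: int) -> List[int]:
    """Return the offset in seconds of each auto-rejoin attempt.

    Hypothetical helper: offsets are relative to the first attempt and
    ignore the (unspecified) duration of the attempts themselves.
    """
    if autorejoin_tries < 0:
        raise ValueError("number of tries cannot be negative")
    return [attempt * RETRY_WAIT_SECONDS for attempt in range(autorejoin_tries)]
```

Under these assumptions, a setting of 3 tries means the last attempt starts at least 10 minutes after the first, and the exit action is only taken after that attempt fails.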

Important

If a failure occurs before the member has successfully joined the group, the exit action specified by group_replication_exit_state_action is not taken. This is the case if there is a failure during the local configuration check, or a mismatch between the configuration of the joining member and the configuration of the group. In these situations, the super_read_only system variable is left with its original value, and the server does not shut down MySQL. To ensure that the server cannot accept updates when Group Replication did not start, we therefore recommend setting super_read_only=ON in the server's configuration file at startup. Group Replication changes this setting to OFF on primary members after it has started successfully. This safeguard is particularly important when the server is configured to start Group Replication on server boot (group_replication_start_on_boot=ON), but it is also useful when Group Replication is started manually using a START GROUP_REPLICATION statement.
If a failure occurs after the member has successfully joined the group, the specified exit action is taken. This is the case if there is an applier error, if the member is expelled from the group, or if the member is set to time out in the event of an unreachable majority. In these situations, if READ_ONLY is the exit action, the super_read_only system variable is set to ON, or if ABORT_SERVER is the exit action, the server shuts down MySQL.
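The two cases above can be summarized as a small decision function. This is a sketch for illustration rather than server code: the `Outcome` type and the function name are hypothetical, and only the two exit actions named in this section (READ_ONLY and ABORT_SERVER) are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    server_running: bool
    super_read_only: Optional[bool]  # None once the server has shut down


def outcome_after_failure(joined_group: bool,
                          exit_state_action: str,
                          super_read_only_before: bool) -> Outcome:
    """Model the server state after a Group Replication failure."""
    if not joined_group:
        # Failure before joining: the exit action is not taken, so
        # super_read_only keeps whatever value it already had.
        return Outcome(server_running=True,
                       super_read_only=super_read_only_before)
    if exit_state_action == "READ_ONLY":
        return Outcome(server_running=True, super_read_only=True)
    if exit_state_action == "ABORT_SERVER":
        return Outcome(server_running=False, super_read_only=None)
    raise ValueError(f"exit action not modelled here: {exit_state_action}")
```

The first branch shows why the recommendation to start with super_read_only=ON matters: if the member never joins, a server that started with super_read_only=OFF stays writable regardless of the configured exit action.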

Note that where group members are at an older MySQL Server release that does not support a relevant setting, or at a release with a different default, they act towards themselves and other group members according to the default behaviors stated above. For example, a member that does not support the group_replication_member_expel_timeout system variable expels other members as soon as an expired suspicion is detected, and this expulsion is accepted by other members even if they support the system variable and have a longer timeout set.

Network Partitioning

Members that have not failed might lose contact with part, but not all, of the replication group due to a network partition. For example, in a group of 5 servers (S1,S2,S3,S4,S5), if there is a disconnection between (S1,S2) and (S3,S4,S5) there is a network partition. The first group (S1,S2) is now in a minority because it cannot contact more than half of the group. Any transactions that are processed by the members in the minority group are blocked, because the majority of the group is unreachable and the group therefore cannot achieve quorum. If the servers in the majority group are still online, they can automatically form their own functional partition and continue to function as a replication group. For a detailed description of this scenario, see Section 18.4.5, “Network Partitioning”.
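The majority rule in this example can be checked with a one-line calculation. The sketch below is illustrative only; the function name is hypothetical, and it assumes that a partition counts the members that can still reach each other, including the member doing the counting.

```python
def has_quorum(reachable_members: int, group_size: int) -> bool:
    """Return True if a partition holds more than half of the group.

    Hypothetical helper: reachable_members includes the member itself.
    """
    if not 0 < reachable_members <= group_size:
        raise ValueError("reachable_members must be between 1 and group_size")
    return 2 * reachable_members > group_size
```

For the group of 5 above, the partition (S1,S2) gives `has_quorum(2, 5)`, which is false, while (S3,S4,S5) gives `has_quorum(3, 5)`, which is true. In a group with an even number of members, an even split leaves neither side with a majority.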

In this situation, the default behavior is for the members in both the minority and the majority to remain in the group, continue to accept transactions (although they are blocked on the members in the minority), and wait for operator intervention. The intervention process, which is described in Section 18.4.5, “Network Partitioning”, involves checking which servers are functioning and forcing a new group membership if necessary.

If you do not want to proactively monitor for this situation, and want to avoid the possibility of creating a split-brain situation (with two versions of the group membership) due to inappropriate intervention, you can instruct members that find themselves in a minority to exit the group after a timeout period. The system variable group_replication_unreachable_majority_timeout sets a number of seconds for a member to wait after losing contact with the majority of group members. After this time, all pending transactions that have been processed by the member and the others in the minority group are rolled back, and the servers in that group move to the ERROR state, then follow the exit action specified by group_replication_exit_state_action.
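The effect of the timeout on a member in the minority can be sketched as a simple check. This is an illustration under stated assumptions, not the plugin's logic: the function name is hypothetical, a timeout of 0 is taken to mean that the member waits indefinitely (the default behavior of waiting for operator intervention), and the member is assumed to leave exactly when the elapsed time reaches the timeout.

```python
def leaves_group(seconds_without_majority: float,
                 unreachable_majority_timeout: int) -> bool:
    """Return True once a minority member gives up waiting for the majority.

    Hypothetical helper. When it returns True, the text above describes what
    follows: pending transactions are rolled back, the member moves to the
    ERROR state, and the configured exit action is taken.
    """
    if unreachable_majority_timeout < 0 or seconds_without_majority < 0:
        raise ValueError("times cannot be negative")
    if unreachable_majority_timeout == 0:
        # Assumption: 0 means wait indefinitely for operator intervention.
        return False
    return seconds_without_majority >= unreachable_majority_timeout
```

With a timeout of 10 seconds, for example, the member keeps waiting at 9 seconds without a majority and leaves at 10 seconds.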
