
TQ: Shrink the ack requirements for partial commit #9852

Open

andrewjstone wants to merge 1 commit into main from tq-modify-commit-params

Conversation

@andrewjstone
Contributor

Currently we wait for N-Z acked commits to allow a new configuration to be installed in the Committing state. For a 32-node system this means that 29 nodes have to ack. If more than 3 nodes are offline then a customer cannot add or remove any sleds. This is an unnecessarily harsh restriction.

This PR changes it so that only K+Z nodes must ack commits to go from Preparing to CommittedPartially, similar to the prepare phase of the protocol that takes us from Preparing to Committing. We bump Z a little bit for larger clusters to balance things out. Now, only 21 nodes must ack commits for new configurations to be accepted. This allows up to 11 nodes to be offline and eliminates the need for an escape hatch for Oxide support if 4 or 5 nodes, rather than 3, are offline. If 11 nodes are offline, something major is going on and Oxide support really should be involved. We will not allow automatic triage via omdb in this situation.
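As a rough sketch of the arithmetic (not the actual omicron code; the helper name and the larger Z buckets are assumptions, and K is taken to be a simple majority):

```rust
/// Illustrative only: the commit threshold for a cluster of `n` nodes.
/// K is assumed to be a simple majority; Z grows slowly with cluster size.
fn commit_threshold(n: usize) -> usize {
    let k = n / 2 + 1;
    let z = match n {
        0..=7 => 1,
        8..=15 => 2,
        16..=23 => 3, // hypothetical bucket
        _ => 4,       // hypothetical bucket
    };
    k + z
}

fn main() {
    // Old rule: N - Z acks, i.e. 32 - 3 = 29 for a 32-node cluster.
    // New rule: K + Z acks, i.e. 17 + 4 = 21, tolerating up to 11 offline nodes.
    assert_eq!(commit_threshold(32), 21);
}
```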

This PR also makes acking commits idempotent in the datastore layer, which fixes a potential bug in the case of multiple nexuses trying to commit the same nodes simultaneously.

Fixes #9826

Diff context for the review comment below:

    9..=16 => 2,
    _ => 3,
    4..=7 => 1,
    8..=15 => 2,
@andrewjstone
Contributor Author

This might be too harsh. It would only allow 1 node to be offline for an 8-node cluster. Then again, we should never have 8-node clusters in production.
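(For context, assuming K is a simple majority: an 8-node cluster gives K = 5 and, under the new arm above, Z = 2, so K + Z = 7 acks are required, leaving room for only one offline node.)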

@andrewjstone
Contributor Author

This PR also makes acking commits idempotent in the datastore layer, which fixes a potential bug in the case of multiple nexuses trying to commit the same nodes simultaneously.

This actually isn't strictly a bug, except in the contrived test here. It can make commits happen more slowly, though. In the case of competing BG tasks, one can ack, say, nodes 1, 2, and 3 while a second tries to ack 2, 3, and 4, which would previously have failed. The next time around, a BG task will only ask for node 4 and record that, since it will know 2 and 3 have already committed. But this requires an extra loop through the BG task, which in our case can take up to 1 minute.

In any event, there was no reason for the removed filter and so it is gone.
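A minimal sketch of what idempotent ack recording looks like, assuming the key point is simply that already-recorded acks are treated as success rather than a conflict. The types and function names here are hypothetical stand-ins, not the omicron datastore API:

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory stand-in for the per-configuration commit state.
struct CommitState {
    acked: BTreeSet<u8>, // node ids that have acked the commit
}

impl CommitState {
    /// Record acks idempotently: nodes that already acked are simply kept,
    /// so two Nexus background tasks acking overlapping sets both succeed.
    fn record_acks(&mut self, nodes: &[u8]) {
        for &n in nodes {
            self.acked.insert(n); // no-op if already present
        }
    }
}

fn main() {
    let mut state = CommitState { acked: BTreeSet::new() };
    state.record_acks(&[1, 2, 3]); // first BG task
    state.record_acks(&[2, 3, 4]); // second BG task; the overlap is harmless
    assert_eq!(state.acked.len(), 4);
}
```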



Development

Successfully merging this pull request may close these issues.

TQ: Want emergency API for stuck trust quorum reconfiguration
