Fix bug where manageMachinePreservation uncordons nodes of Running Machines unconditionally#1107
Draft
thiyyakat wants to merge 5 commits into
Draft
Fix bug where manageMachinePreservation uncordons nodes of Running Machines unconditionally#1107thiyyakat wants to merge 5 commits into
manageMachinePreservation uncordons nodes of Running Machines unconditionally#1107thiyyakat wants to merge 5 commits into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
3118c8e to
07404d0
Compare
07404d0 to
4ab4e59
Compare
… uncordoning of preserved nodes only when stopping preservation. - preserveMachine and stopPreservationIfActive now return the updated *Machine so callers can thread the latest object through the reconcile cycle without re-fetching. - stopPreservationIfActive is split into stopPreservationOnMachineOnly and stopPreservationWithNode to make the two code paths explicit and easier to follow. - The uncordon step is moved into stopPreservationWithNode, gated on MachineRunning phase, and ordered before clearMachinePreserveExpiryTime so a failed uncordon retries with PreserveExpiryTime still set.
- `reconcileMachinePreservation` handles annotation validation and node/machine state gathering, then delegates to `manageMachinePreservation` which contains the handling for the effective preservation value. This separates concerns and makes each function individually testable. - `getEffectivePreservationAnnotations` is simplified to return only the effective value (no longer mutates machine annotations as a side effect). - `updatePreserveAnnotationOnMachine` is extracted to handle the node annotation tracking sync explicitly, replacing the deferred annotation-diff logic. `shouldAnnotationsBeUpdatedOnMachine` encodes the conditions under which that sync is needed. - Early-exit is added when the machine is neither preserved nor annotated, avoiding unnecessary work on the majority of machines.
4ab4e59 to
0e64a5d
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does / why we need it:
This PR does the following:
stopPreservationWithNode, gated onphase == MachineRunning, so it only fires when preservation is actively being stopped and the machine has recovered.clearMachinePreserveExpiryTime, so a failed uncordon is retried on the next reconcile instead of being silently dropped.manageMachinePreservationinto a gathering phase (reconcileMachinePreservation) and a transition phase (manageMachinePreservation).getEffectivePreservationAnnotationsinto a pure function.""as a valid preservation annotation value.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
IT run with
machine-controller-manager-provider-awsThe key behavioral change: uncordoning is gated on preservation being stopped in the current reconcile iteration AND phase == MachineRunning. Machines that are actively preserved, or unrelated to preservation, will never have their node uncordoned by this flow.
Release note: