fix(deploy): guarded CAS on redeploy 'building' flip, close TOCTOU (#14) #241
Merged
Both redeploy entry points (POST /deploy/:id/redeploy and POST /deploy/new redeploy=true) read the deployment row, assert it is non-terminal, then flip it to 'building'. Between the read and the flip the DeploymentExpirer / teardown reconciler can reap the row to 'expired'/'deleted'. The old unconditional UpdateDeploymentStatus would resurrect that reaped, over-TTL / over-cap workload back to 'building'.

New models.MarkDeploymentBuilding does the flip as a guarded compare-and-swap (WHERE status IN ('building','deploying','healthy','failed')) and returns rows-affected, mirroring the CAS guards already on MarkDeploymentTornDown and SetDeploymentTTL. Both handlers now:

  0 rows       -> 409 deployment_not_redeployable (reaped concurrently, do not re-arm)
  driver error -> log + continue (non-determinate; runRedeployAsync reconciles)
  1 row        -> proceed

New metric label DeployRedeployInPlaceTotal{outcome="not_redeployable"}.

Coverage:
  Symptom: redeploy resurrects an expired/deleted deploy to 'building'
  Enumeration: rg -F 'UpdateDeploymentStatus(c.Context(), h.db' + "building" flips in deploy.go
  Sites found: 2 (in-place /deploy/new redeploy=true; POST /deploy/:id/redeploy)
  Sites touched: 2
  Coverage test: TestMarkDeploymentBuilding_Branches (1-row/0-row/driver), TestDeployNew_Redeploy_CASMiss_Returns409, TestDeployRedeploy_ByID_CASMiss_Returns409, TestDeployRedeploy_ByID_CASSuccess_Returns202, TestDeployRedeploy_ByID_CASDriverError_StillAccepts, TestDeployNew_Redeploy_UpdateStatusError_StillAccepts (updated)
  Live verified: awaiting post-merge auto-deploy (rule 14 build-SHA gate in CI)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
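To make the race concrete, here is a minimal in-memory sketch of the difference between the old unconditional write and the guarded flip. It is not the repository's code: the `store` type, its methods, and the deployment ID used in the usage note are hypothetical stand-ins for the deployments table. The mutex plays the role that a single `UPDATE ... WHERE status IN (...)` statement plays in the database. Only the status list is taken from the commit message.

```go
package main

import "sync"

// redeployable lists the statuses that may be flipped back to 'building'.
// The values come from the WHERE clause quoted in the commit message.
var redeployable = map[string]bool{
	"building": true, "deploying": true, "healthy": true, "failed": true,
}

// store is a hypothetical in-memory stand-in for the deployments table.
type store struct {
	mu     sync.Mutex
	status map[string]string
}

// updateStatus models the old unconditional write: it overwrites whatever
// status is currently stored, including 'expired' or 'deleted'.
func (s *store) updateStatus(id, to string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.status[id] = to
}

// markBuilding models the guarded compare-and-swap: the status check and the
// write happen atomically, and the return value plays the role of
// rows-affected (1 = flipped, 0 = row was reaped or does not exist).
func (s *store) markBuilding(id string) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !redeployable[s.status[id]] {
		return 0
	}
	s.status[id] = "building"
	return 1
}
```

If a reaper sets a row to 'expired' between the handler's read and its write, `updateStatus("d1", "building")` would overwrite the reaped status, while `markBuilding("d1")` returns 0 and leaves the row untouched.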
What

Closes sweep finding #14 (P3, TOCTOU): redeploy unconditionally flips an expired/deleted deploy back to 'building'.

Both redeploy entry points, POST /deploy/:id/redeploy and POST /deploy/new (redeploy=true), read the deployment row, assert it's non-terminal (IsDeploymentTerminal / FindActiveDeploymentByTeamEnvName), then flip it to 'building'. Between the read and the flip, the DeploymentExpirer / teardown reconciler can reap the row to 'expired'/'deleted'. The old unconditional UpdateDeploymentStatus would resurrect that reaped, over-TTL / over-cap workload back to 'building'.

Fix
New models.MarkDeploymentBuilding performs the flip as a guarded CAS (WHERE status IN ('building','deploying','healthy','failed')), returning rows-affected and mirroring the existing CAS guards on MarkDeploymentTornDown and SetDeploymentTTL. Both handlers now branch on the result:

- 0 rows: 409 deployment_not_redeployable (reaped concurrently, do not re-arm)
- driver error: log and continue (non-determinate; runRedeployAsync reconciles status later)
- 1 row: proceed

New metric label: instant_deploy_redeploy_total{outcome="not_redeployable"}.

Coverage block
100% patch coverage (every new branch in deploy.go + MarkDeploymentBuilding exercised). go vet clean; gofmt clean.

🤖 Generated with Claude Code