fix(api): make actor destroy wait for ack #4327
MasterPtato wants to merge 1 commit into 02-27-chore_flatten_runtime_config
Conversation
🚅 Deployed to the rivet-pr-4327 environment in rivet-frontend
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
PR Review: fix(api): make actor destroy wait for ack

The fix is correct and the approach is sound. A few observations:

- Correct subscription ordering. Subscribing to DestroyComplete before sending the Destroy signal eliminates the race where the event could be emitted between sending the signal and setting up the subscription. This is the right pattern.
- Parallel fetching. Running actor::get and namespace::resolve_for_name_global concurrently with tokio::try_join! is a nice incidental improvement.
- Explicit destroy_ts guard. The early return on actor.destroy_ts.is_some() is important for correctness: without it, an already-destroyed actor would still proceed past validation, send a signal that returns None (workflow gone), skip destroy_sub.next(), and succeed silently. Now it returns NotFound consistently, which matches caller expectations.

Concerns:

- No timeout on destroy_sub.next().await?. If the destroy workflow stalls (stuck activity, infrastructure issue), this call will hang until the request times out at a higher level. If there is no upstream request timeout, the call hangs indefinitely. Consider wrapping the wait in tokio::time::timeout and returning an appropriate error.
- Concurrent delete requests sharing the same DestroyComplete event. If two requests delete the same actor concurrently, both subscribe to the topic before sending the signal. The first sends Destroy and triggers the workflow; the second may also get res = Some(...) (the workflow is still alive at signal time), so both enter the else branch and call destroy_sub.next().await?. Whether both subscribers receive the event depends on the pub/sub semantics of ctx.subscribe: if the message is consumed rather than broadcast, one caller could hang. Worth verifying.
- Subscription created before validation. The subscription is set up unconditionally before confirming the actor exists or belongs to the right namespace, so requests that fail validation create and immediately drop a subscription. This is minor overhead and should be fine assuming subscription setup has no significant server-side cost.

Minor: the PR description template was not filled in. It would help to note the user-visible behavior change (DELETE now blocks until the actor is fully torn down) and any impact on clients that expect a fast response.
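The subscribe-before-send ordering praised above can be illustrated with a minimal std-only sketch: a channel receiver stands in for the DestroyComplete subscription and a spawned thread plays the destroy workflow (destroy_and_wait is a hypothetical name, not the real Rivet API). Because the receiver exists before the workflow is triggered, the completion event is buffered and cannot be missed even if the workflow finishes immediately.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Illustrative sketch only: mpsc stands in for the pub/sub subscription,
// and the spawned thread for the destroy workflow.
fn destroy_and_wait() -> &'static str {
    // 1. Subscribe first, before anything can emit the event.
    let (complete_tx, complete_rx) = mpsc::channel();

    // 2. Then send the destroy signal (here: start the workflow).
    let worker = thread::spawn(move || {
        // The workflow may emit DestroyComplete at any point after this,
        // including before the caller reaches its recv below.
        complete_tx.send("destroy_complete").unwrap();
    });

    // 3. Wait for the ack; the event is buffered by the subscription
    //    even if it fired before we got here, so there is no race.
    let event = complete_rx
        .recv_timeout(Duration::from_secs(1))
        .expect("should receive ack");
    worker.join().unwrap();
    event
}

fn main() {
    assert_eq!(destroy_and_wait(), "destroy_complete");
    println!("ordering ok");
}
```

Subscribing after sending the signal would reintroduce the race: a fast workflow could publish before the subscription exists.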
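The destroy_ts guard can be sketched as a simple early return. Actor, DestroyError, and check_destroyable below are simplified stand-ins for the real Rivet types, shown only to make the silent-success failure mode concrete.

```rust
// Hypothetical stand-ins for the real actor types; illustrative only.
#[derive(Debug, PartialEq)]
enum DestroyError {
    NotFound,
}

struct Actor {
    destroy_ts: Option<i64>, // set once the actor has been destroyed
}

/// Rejects an already-destroyed actor up front instead of letting the
/// request send a signal to a workflow that no longer exists and then
/// succeed silently.
fn check_destroyable(actor: &Actor) -> Result<(), DestroyError> {
    if actor.destroy_ts.is_some() {
        return Err(DestroyError::NotFound);
    }
    Ok(())
}

fn main() {
    let live = Actor { destroy_ts: None };
    let gone = Actor { destroy_ts: Some(1_700_000_000) };
    assert_eq!(check_destroyable(&live), Ok(()));
    assert_eq!(check_destroyable(&gone), Err(DestroyError::NotFound));
    println!("guard ok");
}
```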
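The timeout concern can be sketched with std channels: recv_timeout plays the role that tokio::time::timeout would play around destroy_sub.next().await in the real async code. wait_for_ack is a hypothetical helper, not part of the PR.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Illustrative only: in the actual async code this would be
// tokio::time::timeout(dur, destroy_sub.next()).
fn wait_for_ack(
    rx: mpsc::Receiver<&'static str>,
    timeout: Duration,
) -> Result<&'static str, &'static str> {
    // Bound the wait so a stalled workflow surfaces as an error
    // instead of hanging the DELETE request indefinitely.
    rx.recv_timeout(timeout).map_err(|_| "destroy ack timed out")
}

fn main() {
    // Happy path: the ack arrives before the deadline.
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        tx.send("destroy_complete").unwrap();
    });
    assert_eq!(wait_for_ack(rx, Duration::from_secs(1)), Ok("destroy_complete"));

    // Stalled workflow: the sender never fires, so the wait errors out
    // instead of blocking forever.
    let (_tx2, rx2) = mpsc::channel::<&'static str>();
    assert!(wait_for_ack(rx2, Duration::from_millis(50)).is_err());
    println!("timeout ok");
}
```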
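The concurrent-delete concern hinges on whether ctx.subscribe broadcasts or consumes. This sketch shows the broadcast semantics both callers need: a hypothetical Topic fans each event out to every registered subscriber, so neither concurrent DELETE hangs. If the real topic were consume-once (one receiver takes the message), only one of the two subscribers would wake.

```rust
use std::sync::mpsc;

// Hypothetical broadcast topic, illustrative only; not the real
// ctx.subscribe implementation.
struct Topic<T: Clone> {
    subscribers: Vec<mpsc::Sender<T>>,
}

impl<T: Clone> Topic<T> {
    fn new() -> Self {
        Topic { subscribers: Vec::new() }
    }

    fn subscribe(&mut self) -> mpsc::Receiver<T> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.push(tx);
        rx
    }

    fn publish(&self, event: T) {
        // Broadcast: every subscriber gets its own copy of the event.
        for tx in &self.subscribers {
            let _ = tx.send(event.clone());
        }
    }
}

fn main() {
    let mut topic = Topic::new();
    // Two concurrent delete requests both subscribe before signaling.
    let sub_a = topic.subscribe();
    let sub_b = topic.subscribe();
    topic.publish("destroy_complete");
    // With broadcast semantics, both callers observe the completion.
    assert_eq!(sub_a.recv().unwrap(), "destroy_complete");
    assert_eq!(sub_b.recv().unwrap(), "destroy_complete");
    println!("broadcast ok");
}
```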
This change is part of a stack managed by git-spice.
Force-pushed: 1622668 to 06bdd41, ec5ffcb to c88c25b, 06bdd41 to 9df506d, c88c25b to c2fcae4.