feat(sync-service): cheap admission control (Tier 0) with post-load_shape reclassification by alco · Pull Request #4359 · electric-sql/electric

alco · 2026-05-19T13:27:57Z

Summary

Two changes to ServeShapePlug's pipeline that together make admission control cheap to run and accurately bound the resource it's supposed to protect.

Cheap Tier 0 classifier. :resolve_existing_shape is deleted. The admission classifier now reads a single ETS table (shape_meta_table for the stack) via ShapeStatus.has_shape_handle?/2. No SQLite, no GenServer call. Admission moves to the front of the plug pipeline so rejected requests don't pay for validation, shape lookup, or any of the other per-request setup.
Reclassify at snapshot start. Once the Snapshotter signals that the snapshot has started (i.e. it has obtained a connection from the snapshot pool), the in-flight permit moves from :initial to :existing. The :initial cap now bounds requests that are in validate-and-load or contending for a snapshot-pool connection, not requests currently streaming their response.

Pipeline change

Before:

:resolve_existing_shape   ← SQLite read (ShapeDb), runs before admission
:check_admission          ← acquires :initial or :existing permit
:put_resp_content_type
:parse_body
:validate_request
:reject_subquery_shape_compaction_request
:load_shape
:serve_shape_response     ← streams body; permit released in after-clause

After:

:put_resp_content_type                  ← runs first so admission 503s keep Content-Type
:check_admission                        ← cheap classifier (1× :ets.member)
:parse_body
:validate_request
:reject_subquery_shape_compaction_request
:load_shape
:hold_initial_until_snapshot_started    ← :initial-bucket requests park here
:reclassify_admission_kind              ← :initial → :existing swap
:serve_shape_response                   ← streams body

Why this is cheaper

:resolve_existing_shape called into Electric.Shapes.fetch_handle_by_shape/2, which falls through to ShapeDb.handle_for_shape/2 — a SQLite read on the read pool. Issue #4266 calls this out as the first bottleneck on the pre-admission path: every request, including the ones admission is about to reject, performs that read. Under thundering herd, the read pool saturates before admission can shed anything.

The new classifier is :ets.member(shape_meta_table(stack_id), handle). One lookup, no pool, no GenServer.

The rescue on ArgumentError covers the brief window during stack startup where the per-stack meta table hasn't been created yet — those requests classify as :initial, which is the safer bucket.

What this PR addresses, what it doesn't

ShapeCache (resolved within this PR). Before this branch, every shape request — handle or not — could end up in a GenServer.call(ShapeCache, {:create_or_wait_shape_handle, …}, 30_000) on a cache miss, with no upstream cap on concurrent callers. With cheap admission: :existing requests short-circuit on the validate_shape_handle ETS hash check and never touch ShapeCache; only :initial requests can hit {:create_or_wait_shape_handle, …}, and their population is hard-capped by the :initial bucket. ShapeCache's max instantaneous concurrent-caller count is now max_concurrent_requests.initial rather than "however many clients showed up." The 30s @call_timeout now exists only to absorb pathological filesystem tail latencies, not to throttle queue depth.

ShapeCache mailbox-overfill on coalesced creates (related concern, tracked in #4372). Even with the :initial bucket capping the population, all :initial requests for the same uncached shape still pile into the ShapeCache mailbox: handler 1 does the actual creation, handlers 2..N short-circuit via the SQLite critical fetch, but every caller occupies a mailbox slot for the duration of handler 1's work plus its own dispatch latency. #4372 proposes a GenServer-owned per-shape ETS lock observed by callers (via :ets.member before they send the call) plus a PollWait fallback — same pattern as #4371, with tighter backoff defaults to match shape-creation timescales.

StatusMonitor (out of scope, tracked in #4371). Api.validate_params calls StatusMonitor.wait_until for every request. The fast path is an ETS read and is unaffected. The slow path — stack starting, post-deploy, recovering from sleep — funnels every concurrent waiter through the StatusMonitor mailbox and replies serially on the readiness flip. The :initial bucket caps inflight requests across the validate+load_shape span but does not eliminate that single-process reply burst. Acceptable for now; #4371 has the per-process polling design with the adaptive congestion flag.

EtsInspector (out of scope, tracked in #4370). Used in validate_request to build %Shape{}. Warm cache is in-process ETS. Cold cache and PG-degraded paths serialize every concurrent request through the inspector mailbox, and crucially, failed DB lookups are not cached — so under PG pool exhaustion the inspector re-attempts the same failing query for each queued message. The :initial bucket bounds the herd size but does not stop the inspector from re-burning the DB pool on every retry.

Why reclassification happens at snapshot start

release_admission_permit/0 runs in call/2's after clause — i.e., after Api.Response.send_stream/2 has finished draining the body. For an initial snapshot of a non-trivial shape, or a long-poll, that's seconds to minutes.

So before this PR, :initial permits were held for the entire request lifetime:

t=0    t=5ms   t=10ms   t=50ms          t=2000ms
│      │       │        │               │
│ check_admission (:initial +1)
│      │ validate
│              │ load_shape (often coalesces; may spawn Snapshotter)
│                       │ start streaming chunks
│                                       │ done streaming, :initial -1
└────────────── :initial held for 2000ms ─────────────────────────┘

After this change:

t=0    t=5ms   t=10ms   t=50ms   t=N ms          t=2000ms
│      │       │        │        │               │
│ check_admission (:initial +1)
│      │ validate
│              │ load_shape (returns once Consumer + Snapshotter spawned)
│                       │ await_snapshot_start (Snapshotter checks out PG conn)
│                                │ try_swap (:initial -1, :existing +1)
│                                │ start streaming chunks
│                                                │ done streaming, :existing -1
└──────── :initial 50ms + Snapshotter-pool wait ─┤
                                 └────── :existing rest of lifetime ────┘

:initial residency drops from "entire request lifetime" to "validate + load_shape + Snapshotter pool checkout." Two things matter about that last span:

1. For coalesced handlers on a popular shape, the wait is essentially free. ShapeCache.await_snapshot_start short-circuits on ShapeStatus.snapshot_started? (one ETS lookup) when the snapshot is already running. So requests on a warm shape add at most one ETS read between load_shape and the swap.

2. For requests that trigger genuine shape creation, the wait is what makes :initial protect the snapshot pool. When the Snapshotter is queued waiting for a PG connection from the snapshot pool, the handler stays parked in :initial. Once the Snapshotter actually checks out a connection and casts {:snapshot_started, _}, the handler reclassifies and the next handler can enter. So :initial is no longer just "validate-and-load throughput" — it's also "number of requests that have caused a Snapshotter to either start or wait on the pool." Under PG-pool saturation, new requests get rejected at the admission gate (503 with Retry-After) instead of piling up as an unbounded Postgrex checkout queue.

The downstream get_merged_log_stream call no longer issues its own await_snapshot_start: the plug's result is propagated via request.snapshot_status and consumed inline. This avoids racing the consumer's stop_and_clean on the snapshot-failure path, which would otherwise turn legitimate SnapshotError 503s into must_refetch 409s.

Permit-release safety against handler death

Permits are stashed in Process.put(@admission_permit_key, …) at acquire/reclassify time and released in call/2's after clause. The after clause covers all in-process termination paths (normal return, raise, throw, halt, mid-stream client disconnect). The remaining concern — being killed mid-flight by an external EXIT signal — is absorbed by Thousand Island's connection handler GenServer, which sets Process.flag(:trap_exit, true) (deps/thousand_island/lib/thousand_island/handler.ex:343). Trapping converts incoming EXIT signals to mailbox messages, so the handler finishes the current request normally, releases the permit, and only then terminates on the next receive. The remaining gaps (:brutal_kill after shutdown timeout, externally-sent :kill) only happen during BEAM shutdown, where the ETS counters reset along with the VM.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.73%. Comparing base (eac6dea) to head (a39f565).
⚠️ Report is 6 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4359      +/-   ##
==========================================
+ Coverage   59.18%   59.73%   +0.55%     
==========================================
  Files         305      290      -15     
  Lines       28444    28579     +135     
  Branches     7493     7766     +273     
==========================================
+ Hits        16834    17073     +239     
+ Misses      11596    11489     -107     
- Partials       14       17       +3

Flag	Coverage Δ
electric-telemetry	`?`
elixir	`?`
packages/agents	`70.92% <ø> (+0.45%)`	⬆️
packages/agents-mcp	`77.54% <ø> (ø)`
packages/agents-runtime	`81.27% <ø> (+1.29%)`	⬆️
packages/agents-server	`73.93% <ø> (+0.77%)`	⬆️
packages/agents-server-ui	`6.66% <ø> (+0.62%)`	⬆️
packages/electric-ax	`42.61% <ø> (+0.23%)`	⬆️
packages/experimental	`87.73% <ø> (ø)`
packages/react-hooks	`86.48% <ø> (ø)`
packages/start	`82.83% <ø> (ø)`
packages/typescript-client	`94.39% <ø> (ø)`
packages/y-electric	`56.05% <ø> (ø)`
typescript	`59.73% <ø> (+0.76%)`	⬆️
unit-tests	`59.73% <ø> (+0.55%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Documents the caller-must-hold-from-permit precondition, adds `current` to swap_rejected telemetry metadata for parity with the existing reject event, and renames a local variable for consistency with the rest of the module. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…fication Moves :check_admission to the front of ServeShapePlug's pipeline and classifies requests on a single :ets.member/2 of the per-stack shape-meta table. Deletes :resolve_existing_shape — admission no longer touches the SQLite-backed ShapeDb (bottleneck 1 of #4266). Adds :reclassify_admission_kind between :load_shape and :serve_shape_response. Once load_shape returns, the handler atomically swaps its :initial permit for an :existing permit, freeing the :initial slot for the next validate-and-load wave while this request streams. If :existing is at cap, the handler keeps its :initial permit; the next swap attempt will succeed once :existing drains. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous commit added a nil-guard in check_admission to handle api_opts that omit max_concurrent_requests, but that introduced a silent bypass for misconfiguration. Reverts the guard and gives the Api struct a real default so direct plug_opts/1 callers (e.g. secure-mode tests) populate the field automatically. Misconfiguration once again surfaces as a Map.fetch! crash instead of silently disabling admission. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…lback logging - Comment the unknown-handle branch in admission_kind/1 so the two :initial arms don't read like a copy-paste oversight. - Logger.debug on the reclassify_admission_kind fallback (try_swap returning :overloaded) so operators can spot the "stuck on :initial" path with debug logging enabled. The rejection itself remains observable via the existing [:electric, :admission_control, :swap_rejected] telemetry emitted by try_swap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 44d35d2c41

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

claude · 2026-05-19T14:15:20Z

Claude Code Review

Summary

Incremental review of #4359 (iteration 6). Four new commits, all responses to iteration 5's open suggestions: 43c7384b3 (changeset wording), 5e98e904f (@spec for get_merged_log_stream), 3943d20c8 (residency assertion in router test), and a39f5650d (SnapshotError-vs-must_refetch 503/409 regression test). No behaviour changes; nothing blocking.

What Was Addressed Since Iteration 5

5e98e904f — @spec Shapes.get_merged_log_stream/3. Spec lands at shapes.ex:26-31 referencing Api.Request.snapshot_status() (the existing nil | :started | {:error, term()} alias at request.ex:20). Required adding the Electric.Shapes.Api alias at the module top; correctly enumerates the new :snapshot_status opt alongside the pre-existing :since, :up_to, :read_only?. Closes the convention gap from iteration 4.
3943d20c8 — Residency assertion in router test. Adds assert %{initial: 0, existing: 0} = Electric.AdmissionControl.get_current(stack_id) at router_test.exs:2350, right after the {:admission_swap, ...} telemetry assertion. With the request now complete, both buckets should be empty: :initial drained by the try_swap decrement, :existing drained by the after-clause release. This locks the contract that no permit leaks across the swap.
a39f5650d — SnapshotError 503-vs-409 regression test. New test at serve_shape_plug_test.exs:326-348 stubs await_snapshot_start to return {:error, Electric.SnapshotError.slow_snapshot_start()} and asserts conn.status == 503 plus message-body parity. Done at the plug level rather than router level (iteration 5 had suggested router), which is fine — the plug-level test exercises the same hold_initial_until_snapshot_started → request.snapshot_status → get_merged_log_stream propagation path that prevents the double-await race. A future refactor that re-introduces a second await_snapshot_start on the streaming path would still trip this test.
43c7384b3 — Changeset wording. Iteration 5 had suggested adding a sentence about snapshot-pool checkout protection. The author instead removed the second sentence ("Once load_shape returns, the in-flight permit atomically swaps from the :initial bucket to :existing …") entirely. Defensible: the @magnetised/@alco PR-comment exchange acknowledged that the snapshot-pool protection is "more hypothetical than practical" because the :initial cap can be two orders of magnitude larger than the snapshot pool — overclaiming it in release notes would mislead operators. Result: the changeset now describes only the cheap-admission win (bottleneck 1 of ShapeCache bottlenecks under thundering herd #4266), which is the protection that is operator-visible. Trade-off worth noting: the swap/try_swap/4 itself is no longer mentioned in the changeset, but it's an internal mechanism, not a user-visible API change.

New Code Review (delta only)

Nothing new to flag — all four commits are surgical, no functional changes.

Issues

None.

Suggestions

PR description still claims :initial bounds "requests contending for a snapshot-pool connection." As @magnetised pointed out and @alco confirmed in the conversation, the swap fires once the Snapshotter sends the {:snapshot_started, _} cast — i.e. after it checks out the connection but before the snapshot finishes — so a snapshot that is mid-query still holds a pool slot while its corresponding request has already been reclassified to :existing. Strictly, :initial bounds connection-checkout waiters, not "all requests with an outstanding snapshot pool slot." The changeset has been updated to avoid this overclaim; the PR body's summary bullet [Merged on #3] Write inserts, updates and deletes to Vaxine #2 could use the same trim before merge so the merge commit message doesn't carry it forward. Editorial only; not blocking.
try_swap/4 clamp behaviour comment (carried over from iteration 5). Still strictly cosmetic. The rollback path at admission_control.ex:163 goes through decr/3 which clamps at 0; the inline comment ("Rollback the claimed permit since the bucket is already at capacity") is correct for the happy invariant, but a future reader auditing whether to_kind can transiently underflow has to chase down decr/3. A one-line note like # decr clamps at 0; correct under "we just incremented" invariant. would preserve the audit trail. Low priority.

Issue Conformance

PR cites #4291 (primary, "Closes"), #4292 (parent), #4266 (bottleneck 1). linked_issues.json now resolves #4291 with its full problem statement — the cheap-gate properties listed there (ETS counters, no GenServer call on reject path, coarse classification from source/method/live/offset, 503 + adaptive Retry-After) all hold in the current implementation. The Retry-After is still the placeholder 5–10s jitter (TODO at serve_shape_plug.ex:341-348), which is consistent with #4291's framing of adaptive retry as a parent-issue goal (#4292), not this PR's scope. Changeset present and now accurate. PR description's bullet #2 wording remains the one editorial nit.

Previous Review Status

Iteration 5 suggestions: all four addressed. Changeset wording trimmed (different direction from the suggestion, but defensible given the conversation). @spec added. Residency assertion added. SnapshotError-vs-must_refetch regression test added at plug level.
Iteration 3 telemetry/log on swap-reject: still covered by the existing [:electric, :admission_control, :swap_rejected] event.
Iteration 5 try_swap/4 clamp comment: still cosmetic, re-listed above.
No regressions from the four new commits. The tests added in a39f5650d and 3943d20c8 exercise paths that already worked; they're forward guards, not bug fixes. The @spec is documentation, not enforcement. The changeset trim is release-notes editorial.

Review iteration: 6 | 2026-05-20

Reverts the struct-level default added in 739ae54. Production already populates max_concurrent_requests before Api.plug_opts/1 is reached (via Electric.Application.api_configuration/1 -> Electric.Config.get_env/1 -> the Defaults map at lib/electric/config.ex:73, which is the real source of truth at %{initial: 300, existing: 10_000}). The struct default was only ever consulted by tests that build Api opts directly without going through the production-config plumbing. Instead, route those test fixtures through Electric.Config.get_env/1 so they see the same defaults production sees: - test/support/component_setup.ex build_router_opts/2 (covers most router tests via the canonical helper). - test/electric/plug/serve_shape_plug_logging_test.exs build_plug_opts/1. - The secure-mode api_opts fixture in router_test.exs. Misconfiguration once again surfaces as a Map.fetch! crash instead of being silently papered over by a hardcoded struct default that did not match the production numbers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Moves :put_resp_content_type before :check_admission so 503 admission rejections still carry `Content-Type: application/json`. Prior order put admission first, but Api.Response.error/3 -> Api.Response.send/2 does not set the content-type header itself — only the JSON-encoded body. Strict clients, CDNs, and observability tooling could fall back to `application/octet-stream` on the rejection path the PR was optimizing for. Adds a regression assertion in the existing 503 router test. Reported by Claude review on PR #4359. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous implementation used a single ets:update_counter/4 call with a threshold-clamped increment for the destination column and a decrement for the source column. That combination has two distinct concurrency hazards under contention: 1. **Codex's race (rollback under-counts to_kind).** When N callers race past a saturated to_kind, every caller captures the same clamped post-increment value `cap + 1` — but only the first increment actually moved the row. All N rollbacks then decrement the destination by 1, removing real permits and breaking cap enforcement for subsequent acquires until something drains. 2. **Claude's race (concurrent acquire on from_kind sees under-count).** Because from_kind was decremented in the same atomic op as the to_kind increment, a concurrent try_acquire(from_kind) could observe a transiently lowered from_kind, succeed, and push from_kind above its cap on swap rollback. The fix is to switch to two unclamped ops: 1. Plain increment of to_kind. Each caller's `+1` is real, so each rejected caller's rollback removes exactly its own contribution. 2. If new_to > cap: atomic rollback (to_kind -= 1). from_kind is untouched, so the caller keeps its from_kind permit. 3. Otherwise: atomic decrement of from_kind (success). Cost: 2 atomic ETS ops on both the success and reject paths (vs. 1 on success / 2 on reject before). The mid-state between the two success ops shows total = original_total + 1, which can spuriously reject a concurrent try_acquire(to_kind) but never over-admit — a strictly safer trade than the previous "from + to invariant" framing, which turned out to be wrong under the threshold-clamp behaviour. Adds two regression tests: - `concurrent rejected swaps preserve the destination counter` (Codex's race). - `from_kind cap is preserved against concurrent try_acquire during swap` (Claude's race). Also addresses two smaller review notes: - Adds the missing @SPEC for try_swap/4. - Switches `swap_rejected` telemetry's `current` measurement from `cap` to `new_to` (the actual post-increment value the racer captured), matching the `:reject` event's post-increment pattern. Reported by Codex and Claude reviews on PR #4359. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tables StatusMonitor, EtsInspector, and LsnTracker each own a read-mostly ETS table that every shape HTTP request hits on the fast path: - StatusMonitor: `service_status/1` runs from `Api.validate_params` -> `hold_until_stack_ready` for every request. - EtsInspector `pg_inspector_table`: `load_relation_oid`, `load_relation_info`, `load_column_info`, `load_supported_features` all do an ETS lookup here before falling back to the GenServer. - LsnTracker: `determine_global_last_seen_lsn` reads it for every non-`offset=:now` request. All three tables were created without `read_concurrency`, so reads contend on the default ETS lock together with the infrequent writes that the owning GenServer makes. The sibling hot-path tables (shape_meta_table, AdmissionControl counter, PureFileStorage stack table, WriteBuffer, ConsumerRegistry) already set the flag; these three are the remaining holes on the per-request read path and the ones most likely to serialize under the concurrency the new admission control allows through. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reclassify after the Snapshotter signals snapshot_started, not at load_shape return. Holding :initial across the await_snapshot_start wait makes admission-gate backpressure cover snapshot-pool contention: when Snapshotters queue for a PG connection, requests stay parked in :initial and new requests get rejected at the gate (503 + Retry-After) instead of piling up as an unbounded Postgrex checkout queue. The await result is propagated via request.snapshot_status and consumed by Shapes.get_merged_log_stream, replacing its own await_snapshot_start call. Issuing it twice would race the consumer's stop_and_clean on the snapshot-failure path, turning legitimate SnapshotError 503s into must_refetch 409s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

magnetised · 2026-05-20T12:25:09Z

Reclassify at snapshot start. Once the Snapshotter signals that the snapshot has started (i.e. it has obtained a connection from the snapshot pool), the in-flight permit moves from :initial to :existing. The :initial cap now bounds requests that are in validate-and-load or contending for a snapshot-pool connection, not requests currently streaming their response.

@alco active snapshot is still holding a connection so shouldn't it still count towards "things contending for a snapshot-pool connection"?

magnetised

really appreciate the work that's gone into this. awesome stuff. have a couple of questions, one in the review one on the main pr, but am assuming that I've just not understood something.

🌟

magnetised · 2026-05-20T12:26:22Z

+    with {stack_id, :initial} <- Process.get(@admission_permit_key),
+         max = Map.fetch!(config[:api].max_concurrent_requests, :existing),
+         :ok <-
+           Electric.AdmissionControl.try_swap(stack_id, :initial, :existing, max_concurrent: max) do


what does it mean if the try_swap fails?

The call to try_swap fails only if the :existing bucket is already at capacity. An unlikely case in practice since we configure it with a higher value than :initial.

In any case, if the swap fails, the current request keeps holding the :initial permit until it's released in the after clause of the call function.

alco · 2026-05-20T12:44:20Z

Reclassify at snapshot start. Once the Snapshotter signals that the snapshot has started (i.e. it has obtained a connection from the snapshot pool), the in-flight permit moves from :initial to :existing. The :initial cap now bounds requests that are in validate-and-load or contending for a snapshot-pool connection, not requests currently streaming their response.

@alco active snapshot is still holding a connection so shouldn't it still count towards "things contending for a snapshot-pool connection"?

You're right.

With the changes made in this PR, the :initial bucket protects ShapeCache primarily. But if initial shape snapshots cannot keep up with the rate of new shape requests, some requests will inevitably be stuck waiting for DB connection checkout or time out on it.

So the protection of the snapshot pull is more hypothetical than practical, considering that the :initial bucket size can be 2 orders of magnitude larger than the connection pool size. But at the point when the shape consumer process is already running and waiting on the snapshot, ShapeCache is supposedly ready to accept the next new shape request. Which could end up being a request for an already existing shape.

In other words, it may end up being a moot point in practice, but reclassifying a shape req mid-flight has potential to make new shape creation faster (by not waiting on the full snapshot creation+serving cycle) but doesn't really make it worse in the thundering herd scenario where shape reqs are already timing out on connection pool checkout en masse.

alco and others added 2 commits May 19, 2026 15:27

chore(sync-service): add cheap admission control implementation plan

1929c7c

Captures the design and PR-description draft for Tier 0 cheap admission plus post-load_shape reclassification. Plan tracks #4291. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alco and others added 5 commits May 19, 2026 15:32

chore: changeset for cheap admission control

44d35d2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alco marked this pull request as ready for review May 19, 2026 13:59

alco added the claude label May 19, 2026

chatgpt-codex-connector Bot reviewed May 19, 2026

View reviewed changes

Comment thread packages/sync-service/lib/electric/admission_control.ex Outdated

Remove the implementation plan file (committed by accident)

5be3eb4

alco and others added 8 commits May 19, 2026 16:15

Simplify explainer comments in ServeShapePlug

3f2f976

Add missing @SPEC for ShapeStatus.has_shape_handle?()

89f5c39

Clean up admission control code in ServeShapePlug

e51b2fb

Remove excessive prose from admission control tests

989fc00

This was referenced May 20, 2026

EtsInspector: prevent mailbox overload and avoid serving orphaned replies #4370

Open

StatusMonitor: replace mailbox-based wait_until with adaptive per-process polling #4371

Open

alco mentioned this pull request May 20, 2026

ShapeCache: coalesce concurrent create_or_wait callers via GenServer-owned per-shape lock + polling #4372

Open

alco added 3 commits May 20, 2026 13:47

Remove excessively verbose comments

957c5c1

Dedupe atomic counter ops by reusing local helper functions

1b073ca

fixup! Dedupe atomic counter ops by reusing local helper functions

273893f

Tighten changeset wording for cheap admission control

43c7384

alco added 3 commits May 20, 2026 14:27

Add @SPEC for Shapes.get_merged_log_stream/3

5e98e90

Assert admission residency after :initial->:existing swap in router test

3943d20

Add regression test for SnapshotError-vs-must_refetch 503/409 race

a39f565

magnetised approved these changes May 20, 2026

View reviewed changes

alco mentioned this pull request May 20, 2026

feat(sync-service): adaptive poll-based wait_until under StatusMonitor congestion #4376

Draft

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(sync-service): cheap admission control (Tier 0) with post-load_shape reclassification#4359

feat(sync-service): cheap admission control (Tier 0) with post-load_shape reclassification#4359
alco wants to merge 24 commits into
mainfrom
alco/cheap-admission-control

alco commented May 19, 2026 •

edited

Loading

Uh oh!

codecov Bot commented May 19, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

claude Bot commented May 19, 2026 •

edited

Loading

Uh oh!

magnetised commented May 20, 2026

Uh oh!

magnetised left a comment

Uh oh!

magnetised May 20, 2026

Uh oh!

alco May 20, 2026

Uh oh!

alco commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

alco commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Pipeline change

Why this is cheaper

What this PR addresses, what it doesn't

Why reclassification happens at snapshot start

Permit-release safety against handler death

Related

Uh oh!

codecov Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

claude Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Code Review

Summary

What Was Addressed Since Iteration 5

New Code Review (delta only)

Issues

Suggestions

Issue Conformance

Previous Review Status

Uh oh!

magnetised commented May 20, 2026

Uh oh!

magnetised left a comment

Choose a reason for hiding this comment

Uh oh!

magnetised May 20, 2026

Choose a reason for hiding this comment

Uh oh!

alco May 20, 2026

Choose a reason for hiding this comment

Uh oh!

alco commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

alco commented May 19, 2026 •

edited

Loading

codecov Bot commented May 19, 2026 •

edited

Loading

claude Bot commented May 19, 2026 •

edited

Loading