diff --git a/.github/agents/editor-lens.agent.md b/.github/agents/editor-lens.agent.md index 1f06750..84144d1 100644 --- a/.github/agents/editor-lens.agent.md +++ b/.github/agents/editor-lens.agent.md @@ -25,9 +25,26 @@ target is a codebase where every remaining comment is one a maintainer would write today, from scratch, knowing nothing about the PR that introduced it. +A second, broader mandate: catch **cryptic references to internal +review artifacts** wherever they appear in the diff, including +user-facing files (`README.md`, `sphinx/source/**`, `CHANGELOG.md`, +top-level policy docs, `.github/**`). Finding IDs (`F3`, `G5`, +`H2`), remediation slugs, round/chunk markers, and back-references +to internal sketches / plans / review files are useful while a PR +is in flight but have no meaning to a downstream reader. Past PRs +have leaked these into published docs; the cryptic-reference sweep +is the backstop that catches them at finalize. + ## Scope -**In scope** — code prose only: +The lens has **two scopes**: a broad *cryptic-reference sweep* that +applies everywhere there is text, and a narrower *full prose edit* +scope where it may also apply the keep / rewrite / cut policy. + +### Full prose edit — in scope + +Apply the full Keep / Rewrite / Cut policy below to **code prose +only**: - `src/bocpy/**/*.{c,h,py,pyi}` (the library, including `_core.c`, `_math.c`, `boc_*.{c,h}`, `behaviors.py`, `transpiler.py`, `worker.py`, @@ -37,13 +54,17 @@ introduced it. - `templates/c_abi_consumer/src/**/*.{c,h,py}` - `scripts/**/*.py` -**Out of scope — do not touch:** +### Full prose edit — out of scope, do not touch -- `sphinx/source/**` — narrative documentation; different rules, - managed by the docs step of `finalize-pr`. -- `README.md` — user-facing entry point; outside this lens's mandate. -- `CHANGELOG.md` — append-only history; managed by the changelog step - of `finalize-pr`. 
+These have different rules (Sphinx narrative, user-facing entry +point, append-only history, policy docs, meta config). Do not apply +the general Keep / Rewrite / Cut policy here: + +- `sphinx/source/**` — narrative documentation; managed by the + docs step of `finalize-pr`. +- `README.md` — user-facing entry point. +- `CHANGELOG.md` — append-only history; managed by the changelog + step of `finalize-pr`. - `CONTRIBUTING.md`, `SUPPLY_CHAIN.md`, `SUPPORT.md`, `SECURITY.md`, `CODE_OF_CONDUCT.md` — top-level policy docs. - `.github/**` — agent / skill / workflow definitions (meta). @@ -51,6 +72,49 @@ introduced it. - `templates/c_abi_consumer/{README.md,pyproject.toml}` — template docs read by downstream consumers. +### Cryptic-reference sweep — applies everywhere + +In **every text file** in the branch diff — including the +full-prose-out-of-scope set above, *except* `.copilot/**` — also +scan for and flag **cryptic references to internal review +artifacts** that leaked out of in-flight PR machinery: + +- Finding IDs and remediation slugs: `F1`, `G3`, `H2`, `M5`, `L2`, + `H1–H4`, "Remediation B6", "per F2", "closes G5". +- Round / iteration / chunk markers: "Round-2 adv#6", "iter-3", + "adversarial-iter1", "chunk 4", "step 7e". +- Back-references to internal review or plan files that ship in + the public docs: "see review-finding-1.md", + "per .copilot/plans/X/40-draft-plan.md", + "sketch ID 23", "see PR-Plan Tier 4 item 13". +- Internal codename references that have no public meaning: + "main-pinned-cowns branch", "the X1 refactor". + +For these, the rule is uniform regardless of which file the +reference appears in: + +- If the reference is the *whole* point of the line / paragraph, + cut it. +- If the surrounding prose stands on its own once the reference is + removed, rewrite to drop the reference and keep the prose. 
+- If removing it would damage the surrounding prose, flag under + "Questions for the user" with a proposed rewrite — do **not** + silently rewrite user-facing docs (README, Sphinx, policy files). + +The sweep is constrained to the *cryptic-reference* category only. +When operating on out-of-scope files you may **only** remove +cryptic references; you may **not** otherwise trim wordiness, +collapse paragraphs, or restructure the prose. The rest of the +Keep / Rewrite / Cut policy below does not apply to those files. + +Rationale: PR-process tags (F#, G#, H#, remediation IDs, sketch +backrefs) are useful while the PR is in flight, but they have no +meaning to a user reading the published README, the Sphinx site, +or the changelog months later. Past PRs have shipped +`per F3 finding` into the README and `closes G5` into the +changelog; this sweep is the backstop that catches them at +finalize. + ## Keep / Rewrite / Cut Policy ### Keep (do not touch) @@ -173,16 +237,27 @@ introduced it. When reviewing, produce findings in these sections: -1. **Cuts (high confidence)** — comments that are pure scaffolding, +1. **Cryptic-reference cuts (all scopes)** — PR slugs, finding IDs, + remediation tags, round/chunk markers, and internal sketch / + plan / review backrefs leaked into any text file in the diff. + Group by file. Cuts inside the *full prose edit* scope can be + deleted; cuts inside user-facing docs (`README.md`, + `sphinx/source/**`, `CHANGELOG.md`, top-level policy files, + `.github/**`) must list the proposed rewrite verbatim so the + user can approve before the change lands. +2. **Cuts (high confidence)** — comments that are pure scaffolding, archaeology, or paraphrase. List file + line range + the comment - text. These can be deleted without further review. -2. **Rewrites** — wordy or stale comments that should be collapsed. - For each, give the original and the proposed replacement. -3. 
**Keep with edit** — load-bearing comments that need a small fix - (stale file path, wrong PEP number, dated phrasing). -4. **Keep as-is** — comments that initially looked like candidates + text. These can be deleted without further review. *Full prose + edit scope only.* +3. **Rewrites** — wordy or stale comments that should be collapsed. + For each, give the original and the proposed replacement. *Full + prose edit scope only.* +4. **Keep with edit** — load-bearing comments that need a small fix + (stale file path, wrong PEP number, dated phrasing). *Full prose + edit scope only.* +5. **Keep as-is** — comments that initially looked like candidates but are actually load-bearing. Brief justification each. -5. **Questions for the user** — comments whose intent is unclear and +6. **Questions for the user** — comments whose intent is unclear and that should not be removed without confirmation. Include the comment and what's ambiguous. Always include any `TODO` / `FIXME` without an issue or sketch link. @@ -197,9 +272,11 @@ findings remain. - **Adding new comments.** This lens removes; it does not author. The usability lens authors. -- **Editing prose under `sphinx/source/`, `README.md`, +- **Rewriting prose in `sphinx/source/`, `README.md`, `CHANGELOG.md`, the top-level policy docs, or anything under - `.github/`.** + `.github/` beyond removing cryptic internal references.** The + cryptic-reference sweep is the *only* edit permitted in those + files; general wordiness / archaeology / banner cuts are not. - **Rewriting code.** Behavior is out of scope. - **Style enforcement** (formatting, capitalization, period-at-end) unless it is a side effect of an otherwise-justified rewrite. diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 90ebd6e..398d0dc 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -190,9 +190,14 @@ interpreter's headers. 
extracts each decorated function into a top-level `__behavior__N` definition and rewrites the call site as `whencall('__behavior__N', cowns, captures)`. The captures tuple is built **at schedule time**, so loop variables are -snapshotted by value — no `x=x` default-arg idiom is needed (and adding one -breaks the behavior because the transpiler treats every signature name as a -behavior parameter and discards the default). +snapshotted by value. Two spellings of the loop-snapshot idiom are +supported transparently: just reference the loop variable in the body, or +write `def b(c, i=i)` and let the transpiler hoist the default into a +capture. Trailing positional parameters beyond the cown count are also +auto-captured by name (`def b(c, factor)` captures `factor`). The +transpiler also recognises aliased decorators — `from bocpy import when as +boc_when` and `import bocpy [as alias]` followed by `@bocpy.when(...)` or +`@alias.when(...)` — provided the aliasing import is at module level. When debugging behavior dispatch, capture resolution, parameter-count mismatches, or anything else that depends on the transpiler's output, use the diff --git a/.github/skills/testing-with-boc/SKILL.md b/.github/skills/testing-with-boc/SKILL.md index 17f4d8c..b1284e8 100644 --- a/.github/skills/testing-with-boc/SKILL.md +++ b/.github/skills/testing-with-boc/SKILL.md @@ -23,18 +23,27 @@ execute asynchronously on worker interpreters. | Concept | Description | |---------|-------------| | `Cown(value)` | A concurrently-owned wrapper. Behaviors receive exclusive temporal access to the cown's `.value`. | -| `@when(*cowns)` | Decorator that schedules the function as a behavior. The decorator replaces the function with a `Cown` holding the return value. **The decorated function must have exactly as many parameters as there are arguments to `@when`.** | +| `@when(*cowns)` | Decorator that schedules the function as a behavior. 
The decorator replaces the function with a `Cown` holding the return value. The first N parameters bind to the N cowns; any trailing parameters are auto-captured from the caller's frame (see below). | | `send(tag, contents)` | Sends a cross-interpreter message with the given tag. | | `receive(tags, timeout)` | Blocks until a message with a matching tag arrives (or times out). Returns `(TIMEOUT, None)` on timeout. | | `TIMEOUT` | Sentinel string returned as the tag by `receive` when a timeout elapses. | | `wait(timeout)` | Blocks until all scheduled behaviors have completed. | -### Critical rule: parameter count must match `@when` argument count +### Cown count, parameter count, and auto-captured extras -The number of parameters on the decorated function **must exactly equal** the -number of arguments passed to `@when`. A mismatch causes unspecified behavior -that can crash the worker interpreter. Because the behavior never completes, the -test will hang forever unless `receive` is called with a timeout. +The first `N` parameters of the decorated function bind positionally to the +`N` arguments of `@when`. Any **additional** trailing parameters are treated +as captures of names from the caller's scope: + +* `def b(c, factor)` — captures `factor` by the parameter's own name. +* `def b(c, i=i)` — captures `i` (the canonical loop-snapshot idiom). +* `def b(c, x=y)` — captures `y` and binds it into param `x`. + +Defaults must be plain names; computed defaults (`def b(c, k=foo())`) and +defaults on cown positions (`def b(c=c)`) raise `SyntaxError` at export +time. Free variables referenced in the body (but not in the signature) are +also auto-captured, so the simplest spelling is usually to omit the extras +entirely and just reference them in the body. ```python # CORRECT — 1 @when arg, 1 function param @@ -42,54 +51,60 @@ test will hang forever unless `receive` is called with a timeout. 
def good(x): return x.value * 2 -# WRONG — 1 @when arg, 2 function params (default args count!) +# ALSO CORRECT — extra params beyond the cown count are auto-captured +# from the caller's frame by name. Plain extras use the param's own +# name; defaults of the form ``i=i`` or ``x=y`` use the default's name. +factor = 2 @when(x) -def bad(x, factor=2): # will crash — factor is an extra param +def with_extra(x, factor): # ``factor`` captured by name return x.value * factor -# FIX — capture extra values via closure, not default args +# FIX — for older code: capture extra values via closure factor = 2 @when(x) -def fixed(x): # 1 param matches 1 @when arg - return x.value * factor # factor captured from enclosing scope +def fixed(x): # 1 param matches 1 @when arg + return x.value * factor # factor captured from enclosing scope ``` -### Do not use the `def _(c, x=x)` loop-capture idiom +### The `def _(c, i=i)` loop-capture idiom is supported -A common Python idiom for snapshotting a loop variable is to bind it as a -default argument: +The canonical Python idiom for snapshotting a loop variable as a default +argument works transparently: ```python for i, c in enumerate(cowns): @when(c) - def _(c, i=i): # unnecessary AND breaks @when + def _(c, i=i): # ``i`` captured at schedule time send("done", i) ``` -**You don't need this with `@when`.** The transpiler rewrites the call site as -`whencall('__behavior__N', (c,), (i,))`, snapshotting captures into a tuple at -schedule time. There is no late-binding hazard to defend against — just -reference the loop variable directly: +The transpiler hoists positional parameters beyond the cown count into +captures: bare extras (`def b(c, factor)`) capture by the parameter's own +name; defaults (`def b(c, i=i)` or the rename form `def b(c, x=y)`) capture by +the default expression's name. 
The default expression must be a plain +`Name`; computed defaults (`def b(c, k=foo())`) and defaults on cown +positions (`def b(c=c)`) raise `SyntaxError` at export time. + +Because the transpiler already snapshots loop variables into a tuple at +schedule time, you can also just reference the loop variable directly without +the `i=i` idiom — both spellings work: ```python for i, c in enumerate(cowns): @when(c) def _(c): - send("done", i) # i is captured by value at schedule time + send("done", i) # i is captured by value at schedule time ``` -Adding `i=i` to the signature actively breaks the behavior. The transpiler -treats every name in the signature as a behavior parameter and discards the -default, so the worker sees a function with an extra positional arg that the -runtime never supplies. See the "Inspecting Transpiler Output" section of -`.github/copilot-instructions.md` for how to use `export_module.py` to -confirm exactly which names are parameters and which are captures. +See the "Inspecting Transpiler Output" section of +`.github/copilot-instructions.md` for how to use `export_module.py` to confirm +exactly which names are parameters and which are captures. -If you do want a fresh scope per iteration (e.g. to avoid sharing mutable -state between iterations), use a helper function: +If you want a fresh scope per iteration (e.g. 
to avoid sharing mutable state +between iterations), use a helper function: ```python -def _schedule(c, i): # fresh scope per iteration +def _schedule(c, i): # fresh scope per iteration @when(c) def _(c): send("done", i) diff --git a/.github/workflows/build_wheels.yml b/.github/workflows/build_wheels.yml index 5a8074b..6a61ce4 100644 --- a/.github/workflows/build_wheels.yml +++ b/.github/workflows/build_wheels.yml @@ -20,7 +20,7 @@ jobs: python: [cp310, cp311, cp312, cp313, cp314] steps: - - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Git config for fetching pull requests run: | @@ -61,7 +61,7 @@ jobs: python: [cp310, cp311, cp312, cp313, cp314] steps: - - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 id: python @@ -112,7 +112,7 @@ jobs: python: [cp310, cp311, cp312, cp313, cp314] steps: - - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 id: python @@ -164,7 +164,7 @@ jobs: python: [cp310, cp311, cp312, cp313, cp314] steps: - - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Git config for fetching pull requests run: | @@ -206,7 +206,7 @@ jobs: python: [cp310, cp311, cp312, cp313, cp314] steps: - - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Git config for fetching pull requests run: | @@ -269,7 +269,7 @@ jobs: with: python-version: "3.14" - name: Checkout (for scripts/) - 
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 with: path: source - name: Validate every wheel against PyPI's checks @@ -336,7 +336,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Use Python 3.14 uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 diff --git a/.github/workflows/nightly_audit.yml b/.github/workflows/nightly_audit.yml index 9399e94..fe658b7 100644 --- a/.github/workflows/nightly_audit.yml +++ b/.github/workflows/nightly_audit.yml @@ -25,7 +25,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Use Python 3.14 uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 diff --git a/.github/workflows/pr_gate.yml b/.github/workflows/pr_gate.yml index 24f0a35..fe7b5aa 100644 --- a/.github/workflows/pr_gate.yml +++ b/.github/workflows/pr_gate.yml @@ -25,7 +25,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Use Python 3.14 uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 @@ -50,7 +50,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Use Python 3.14 uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 @@ -80,7 +80,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Checkout - uses: 
actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Use Python 3.14 uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 @@ -118,7 +118,7 @@ jobs: python_version: ["3.10", "3.11", "3.12", "3.13", "3.14"] steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup bocpy for testing uses: ./.github/actions/setup-bocpy-test @@ -144,7 +144,7 @@ jobs: BOCPY_BUILD_INTERNAL_TESTS: "" steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Use Python 3.14 uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 @@ -186,7 +186,7 @@ jobs: - check: templates/c_abi_consumer/src steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Run clang-format style check uses: jidicula/clang-format-action@654a770daa28443dd111d133e4083e21c1075674 # v4.18.0 @@ -202,7 +202,7 @@ jobs: python_version: ["3.10", "3.11", "3.12", "3.13", "3.14"] steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup bocpy for testing uses: ./.github/actions/setup-bocpy-test @@ -219,7 +219,7 @@ jobs: python_version: ["3.10", "3.11", "3.12", "3.13", "3.14"] steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup bocpy for testing uses: ./.github/actions/setup-bocpy-test @@ -236,7 +236,7 @@ jobs: python_version: ["3.12"] steps: - name: 
Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup bocpy for testing uses: ./.github/actions/setup-bocpy-test @@ -254,7 +254,7 @@ jobs: python_version: ["3.10", "3.11", "3.12", "3.13", "3.14"] steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup bocpy for testing uses: ./.github/actions/setup-bocpy-test @@ -271,7 +271,7 @@ jobs: python_version: ["3.10", "3.11", "3.12", "3.13", "3.14"] steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup bocpy for testing uses: ./.github/actions/setup-bocpy-test @@ -288,7 +288,7 @@ jobs: python_version: ["3.13t", "3.14t"] steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Setup bocpy for testing uses: ./.github/actions/setup-bocpy-test @@ -314,7 +314,7 @@ jobs: CPYTHON_VERSION: v3.14.2 steps: - name: Checkout - uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 + uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 - name: Install build dependencies run: | @@ -353,7 +353,7 @@ jobs: - name: Run tests env: - ASAN_OPTIONS: detect_leaks=0:halt_on_error=1 + ASAN_OPTIONS: detect_leaks=1:halt_on_error=1 UBSAN_OPTIONS: halt_on_error=1:print_stacktrace=1 run: | source "$RUNNER_TEMP/asan-venv/bin/activate" diff --git a/CHANGELOG.md b/CHANGELOG.md index 1a20f2d..1f74231 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,74 +1,347 @@ +## 2026-06-05 - Version 0.9.0 +Main-pinned cowns — a new `PinnedCown` subclass holds its +value as a plain `PyObject *` on the main interpreter, never 
+round-tripped through XIData. Behaviors whose request set contains +any pinned cown are routed by the scheduler to a single-consumer +main-thread queue and drained by the new `pump` entry point +(or implicitly by `wait`, which auto-pumps when pinned cowns +exist). Designed for objects that cannot survive cross-interpreter +shipping — pyglet shapes, Tk widgets, GPU contexts, open file +handles, ctypes pointers. The companion `examples/boids.py` +rewrite demonstrates the coarse-grained pinned-dispatch pattern: +per-cell physics stays on workers, and one `@when(PinnedCown)` +per frame batches the write-back into main-thread matrices. +Also in this release: `quiesce`, a non-tearing-down +checkpoint primitive. + + +**New Features** + +- **`quiesce(timeout=None, *, stats=False, noticeboard=False)`** — + blocks until every in-flight behavior completes, without tearing + down workers or the noticeboard thread. Implemented via a new + `terminator_seed_inc` peer of `terminator_seed_dec` + (Pyrona-style seed-up / seed-down pairing) so quiescence becomes + a *checkpoint* rather than a shutdown. Useful for parallel-search + patterns that need to inspect a best-so-far cown between rounds + and for tests that must read a worker-produced `send` queue + before its producer interpreter is destroyed. The `stats` and + `noticeboard` flags mirror `wait`: returns `None` by + default, a per-worker stats `list[dict]` when `stats=True`, + a noticeboard `dict[str, Any]` when `noticeboard=True`, or a + `WaitResult` when both are set. Raises `TimeoutError` + if quiescence is not reached within `timeout`. Exported from + `bocpy.__all__`. +- **`PinnedCown(Cown[T])`** — a cown whose value lives + permanently on the main interpreter. Constructible only from the + main interpreter (raises `RuntimeError` from workers); + the value is never picklable, never reified twice, and never + reconstructed in a worker. 
The capsule *handle* remains a + first-class cross-interpreter shareable — workers may hold it, + embed it in a regular `Cown` value graph, and place it in + noticeboard entries, but only the main thread may acquire the + value. See the new `pinned_cowns` page for the full + contract and the coarse-grained-dispatch pattern. +- **`pump(deadline_ms=None, max_behaviors=None, raise_on_error=False)`** + — drains the main-thread queue of behaviors whose request sets + contain a `PinnedCown`. Call from your event loop's + idle / on-tick hook (pyglet `schedule_interval`, Tk `after`, + asyncio task, …); script-mode programs need not call it + explicitly because `wait` pumps internally. Non-preemptive: + `deadline_ms` gates *starting* the next behavior, not + interrupting one already running. Body exceptions default to + landing on the result cown's `.exception`; + `raise_on_error=True` re-raises the first body exception after + drain. Returns a new `PumpResult` `NamedTuple` + (`executed`, `deadline_reached`, `raised`). +- **`set_pump_watchdog(warn_ms=1000, raise_ms=None, on_starve=None)`** + — configure the pinned-queue starvation watchdog. Both thresholds + gate on **queue-non-empty time**, not raw last-pump time, so + programs running only unpinned work never trip them. Default is + warn-only; users opt into fail-fast via an explicit `raise_ms` + so interactive debugger sessions are not wedged by a breakpoint. +- **`set_wait_pump_poll(ms=50)`** — set the poll cadence for + `wait`'s auto-pump loop. Re-read every iteration so a + concurrent call updates the active wait immediately. +- **`bocpy.PumpResult`** — three-field `NamedTuple` returned by + `pump`. `executed` counts pinned behaviors whose lifecycle + completed (including acquire-failure paths whose MCS chain still + drained). `deadline_reached` is `True` only when the + `deadline_ms` budget tripped before the queue drained. 
+ `raised` counts only body exceptions captured to a result cown + (cleanup-path failures use `PyErr_WriteUnraisable` and do not + count). Exported from `bocpy.__all__`. +- **Coarse-grained pinned-dispatch `examples/boids.py`** — the + per-cell `send("update")` / main-thread `receive("update")` + barrier is replaced by per-cell physics on workers plus one + pinned `@when` per frame that captures every per-cell result + cown together with the two main-thread `PinnedCown` matrices + and performs the batched write-back. Same visual output, fully + worker-parallel per-cell work, single main-thread touchpoint. + +**Public C ABI** + +- **`bocpy_main_interpid()`** — new `static inline` helper in + `` returning `PyInterpreterState_GetID( + PyInterpreterState_Main())` pre-typed as `int_least64_t` to + match `bocpy_interpid` for owner-field equality checks. + Safe to call from a worker sub-interpreter for diagnostic / + assert use. Additive — existing consumers recompile unchanged; + `BOCPY_ABI` is unchanged at 1. The + `templates/c_abi_consumer` `bocpy~=` pin moves to + `~=0.9` to signal the new ABI surface it was authored against. + +**Improvements** + +- **`@when` loop-variable snapshot via default arg** — the + transpiler now accepts `def b(c, i=i)` as an explicit + loop-snapshot idiom in addition to the existing implicit form + (just reference the loop variable in the body). Trailing + positional parameters beyond the cown count are also + auto-captured by name (`def b(c, factor)` captures + `factor`). +- **`@when` alias decorators** — the transpiler now recognises + `from bocpy import when as boc_when` and `import bocpy [as + alias]` followed by `@bocpy.when(...)` or + `@alias.when(...)`, provided the aliasing import is at module + level. Previously only the bare `@when` form was detected. 
+- **`Behaviors.start()` compiles the export module on main** — + the transpiler's rewritten module is now also instantiated as an + in-memory `types.ModuleType` on the main thread (plus a + `linecache` entry for traceback fidelity) so `pump` can + resolve `__behavior__N` the same way workers do via their + bootstrap. +- **Scheduler-owned behavior pre-header** — `bq_node` and the + new `pinned` OR-fold byte moved out of the opaque + `BOCBehavior` into a scheduler-owned `boc_behavior_prehdr_t` + allocated immediately before each behavior (CPython + `_PyGC_Head` style). `boc_sched.c` no longer needs any + knowledge of `BOCBehavior`'s internal layout; layout drift + between the scheduler and its users is impossible by + construction. +- **`terminator_wait_pumpable`** — new entry in + `boc_terminator.{c,h}` lets the auto-pump loop wake on either + count-zero or main-pinned-depth-becoming-non-zero, both wired + through the existing single condition variable. Single-pumper + enforcement on free-threaded builds (`Py_GIL_DISABLED`) lives + alongside via a `MAIN_PUMP_THREAD` CAS that raises + `RuntimeError` if a second thread tries to pump + concurrently, cleared on every exit path including + `BaseException`. + +**Bug Fixes** + +- **CWE-401: inheriting INCREF leak in `cown_decref_inline`** — + `CownCapsule_reduce` packs an encoded `XIData` payload by + taking an *inheriting* `COWN_INCREF` per embedded + `CownCapsule`, normally balanced when the bytes are + unpickled inside a worker. On the orphan-death path (the + consumer side never deserialised the payload) the matching + `COWN_DECREF`s never fired and every embedded cown leaked. + `cown_decref_inline` now feeds the encoded bytes through + `pickle.loads` and immediately drops the result, which lets + CPython's GC fire the matching `COWN_DECREF`s recursively. + Gated on the `pickled` flag so native `XIData` round-trips + (e.g. `Matrix`) skip the work entirely. 
+- **Main-pump behavior reference leak** — both + `_core_main_pump_bounded` and `_core_main_pump_drain_all` + popped a `BehaviorCapsule` from `MAIN_PINNED_QUEUE` but + never released the strong reference the capsule held on the + underlying `BOCBehavior`. Each pinned behavior leaked + one reference until the runtime was torn down. The pump + helpers now `BEHAVIOR_DECREF` the behavior immediately after + the worker-equivalent cleanup runs. +- **MSVC `` compatibility** — Microsoft's + `` (used by CPython's headers on Windows) does + not expose the unsigned `atomic_uint_least64_t` or + `atomic_uintptr_t` forms that the pinned-pump bookkeeping + used. `MAIN_PINNED_DEPTH`, `MAIN_PINNED_NONEMPTY_SINCE_NS`, + `LAST_PUMP_NS`, `WATCHDOG_WARN_MS`, `WATCHDOG_LAST_WARN_NS`, + `WATCHDOG_ON_STARVE` and `MAIN_PUMP_THREAD` are now + `atomic_int_least64_t` / `atomic_intptr_t`. Depth never + goes negative; pointer bits round-trip losslessly through the + signed atomic boundary. +- **CPython 3.10/3.11 `PyErr_SetRaisedException` polyfill** — + added to `include/bocpy/xidata.h` alongside the existing + `PyErr_GetRaisedException` polyfill so the public C ABI's + exception-stash pattern compiles on Python versions before + 3.12. `BOCPY_ABI` is unchanged. +- **Portable `boc_max_align_t`** — added to `boc_compat.h` as + a union of the most-strictly-aligned fundamental types + (`long long`, `long double`, `void *`, function pointer). + MSVC exposes the C11 `max_align_t` only under `/std:c11`, + which the CPython build does not pass; the + `boc_behavior_prehdr_t` size assertion now uses + `alignof(boc_max_align_t)` so the alignment contract holds on + every supported toolchain. +- **PEP 678 `add_note` 3.10 fallback** — the new + `Behaviors.quiesce` exception-context shim attaches a note + describing the seed-inc / seed-dec balance on failure. CPython + 3.10 predates `BaseException.add_note`; the shim now + writes to `BaseException.__notes__` directly when `add_note` + is missing. 
+- **Transpiler `except ... as X` mis-classification** — + `ExceptHandler` binds `X` on the handler node + itself rather than via `Name` `Store`, so the + transpiler's free-variable walker mis-classified any read of + `X` inside the handler body as a free variable, appended it + as a behavior parameter, and emitted a call site that + referenced an out-of-scope name. Fixed by a new + `visit_ExceptHandler` hook that registers `X` as a local + before recursing into the handler. Regression locked by + `TestCapturedLocals::test_except_as_name_excluded`. + +**Documentation** + +- New `pinned_cowns` page — concept and when to use, + `PinnedCown` / `pump` / `PumpResult` / `set_pump_watchdog` + / `set_wait_pump_poll` API, coarse-grained pinned-dispatch + pattern, event-loop integration recipes (pyglet, Tk, asyncio), + the queue-non-empty-time watchdog contract, free-threaded + single-pumper rule, and free-threaded support trajectory. + Linked from the root toctree. +- `api` expanded with the new `PinnedCown` / `pump` / + `PumpResult` / `set_pump_watchdog` / `set_wait_pump_poll` + entries. +- New "Talking to main-thread objects" subsection in the root + `README.md`'s "A taste of BOC" with a 10-line pyglet snippet + illustrating the coarse-grained pattern; the public-API list + picks up the five new symbols. +- `examples/README.md` calls out the rewritten `boids.py` and + the new `examples/benchmark.py --pinned-spinner` flag. 
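The `except ... as X` fix listed under Bug Fixes above can be illustrated with a small, self-contained walker. This is a minimal sketch and not the bocpy transpiler: `FreeVariableWalker` and `free_variables` are hypothetical names, and the pass is deliberately simplified (no nested scopes, no comprehensions, builtins filtered out by name). It only shows why a dedicated `visit_ExceptHandler` hook is needed when locals are otherwise discovered through `Name` nodes in `Store` context.

```python
import ast
import builtins


class FreeVariableWalker(ast.NodeVisitor):
    """Collect names that are read but never bound in a snippet.

    Hypothetical, simplified stand-in for a free-variable pass.
    """

    def __init__(self):
        self.local_names = set()
        self.free_names = set()

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Store):
            # Ordinary assignments bind through a Name node in Store context.
            self.local_names.add(node.id)
        elif node.id not in self.local_names and not hasattr(builtins, node.id):
            self.free_names.add(node.id)

    def visit_ExceptHandler(self, node):
        # `except E as X` keeps X as a plain string on the handler node,
        # so no Name/Store node exists for it. Register it by hand before
        # walking the handler body.
        if node.name is not None:
            self.local_names.add(node.name)
        self.generic_visit(node)


def free_variables(source):
    """Return the set of free names found in `source`."""
    walker = FreeVariableWalker()
    walker.visit(ast.parse(source))
    return walker.free_names


SOURCE = """
try:
    risky(c)
except ValueError as err:
    report(err)
"""

print(sorted(free_variables(SOURCE)))
```

Without the `visit_ExceptHandler` method, `err` would be reported as free, which is the shape of the mis-classification the changelog entry describes.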
+ +**Tests** + +- **`test/test_pinned_pump.py`** — new module covering the + full `PinnedCown` / `pump` matrix: pure-pinned, mixed + request sets, off-main construction rejection, locked + error-string smoke tests, `deadline_ms` / `max_behaviors` + bounding, body exceptions under default and + `raise_on_error=True`, `wait()` auto-pump, shutdown drain + via drop-exceptions, the watchdog warn-only and explicit-raise + paths, the `QUEUE_NONEMPTY_SINCE` regression for unpinned-only + workloads, hypothesis fuzz over mixed request sets, + `PinnedCown`-handle round-trip through closure capture and + through the noticeboard, `Cown(PinnedCown)` interop, and an + acquire-failure fault-injection test that proves + `IN_PUMP_BODY` / `terminator_dec` / `MAIN_PUMP_THREAD` + cleanup runs on every exit path. +- **`test/test_transpiler.py`** — 192 new lines covering the + `def b(c, i=i)` loop-snapshot form, `@when` alias decorators, + and the `except ... as X` regression. +- **`test_main_pump_drain_all_marks_result_cowns` flaky-shutdown + rewrite** — the original version scheduled eight pinned + behaviors, called `wait(timeout=0)` to force shutdown, then + asserted on the result cowns. The `timeout=0` propagated + through every stage of `Behaviors.stop` (quiescence, + noticeboard drain) and raised `TimeoutError` from one of + them under load before the post-`wait` assertions could run. + The rewritten test calls `_core.main_pump_drain_all` directly + to exercise the shutdown drain in isolation and asserts every + drained result cown carries the shutdown `RuntimeError`. + +**Internal** + +- **`examples/benchmark.py --pinned-spinner`** — high-rate + pinned-dispatch overlay that adds one tail-recursing + `@when(PinnedCown)` driven by `pump(max_behaviors=1)` on the + main thread at a configurable rate while the existing chain-ring + workload runs on workers. 
Used during development to verify + worker-throughput regression under high-rate pinned dispatch; + on CPython 3.14 at 4 workers / 10 s / 3 repeats the measured + delta with the spinner active was −0.38%. +- **Noticeboard read contract tightened** — `noticeboard` + now explicitly documents that calling `noticeboard` or + `notice_read` from the main thread *outside* a behavior is + undefined behavior; the supported main-thread read path is + `wait(noticeboard=True)`. Seeding the noticeboard with + `notice_write` from the main thread before scheduling any + behavior remains supported. +- **`test_matrix.TestVectorMethodsInCown` migrated to the + `send("assert", ...)` pattern** — the in-cown `Matrix` vector + tests previously asserted on `result.value` directly from the + test thread, which violates the cown ownership contract. They now + ship assertions out of each behavior via `send("assert", ...)` + and collect on the test thread via a `receive_asserts(count)` + helper, matching the project's BOC testing convention. +- **CI: ASAN `detect_leaks=1`** — the pinned-pump leak hunt + cleared the last masking leak; the ASAN job in + `.github/workflows/pr_gate.yml` now sets + `ASAN_OPTIONS=detect_leaks=1:halt_on_error=1` so any new + reachable leak fails the build at the source instead of + silently accumulating under `detect_leaks=0`. 
+ ## 2026-06-02 - Version 0.8.0 -Vector-oriented ``Matrix`` API — six new methods (``vecdot``, -``cross``, ``normalize``, ``perpendicular``, ``angle``, -``magnitude_squared``), two new read-only properties (``size``, -``length``), and a unified ``in_place=`` keyword on every unary -method round out ``Matrix`` as a first-class vector and +Vector-oriented `Matrix` API — six new methods (`vecdot`, +`cross`, `normalize`, `perpendicular`, `angle`, +`magnitude_squared`), two new read-only properties (`size`, +`length`), and a unified `in_place=` keyword on every unary +method round out `Matrix` as a first-class vector and batch-of-vectors type — plus an internal X-macro template refactor -of every ``_math.c`` op family that restores the compiler's +of every `_math.c` op family that restores the compiler's auto-vectoriser. 44 of 71 benched rows improved by ≥10%, with representative wins of −50% to −88% on aggregates, broadcast -arithmetic, and ``normalize``. The ``_math`` extension now ships -with ``-O3`` (Linux/macOS) / ``/O2`` (Windows) so end users pick +arithmetic, and `normalize`. The `_math` extension now ships +with `-O3` (Linux/macOS) / `/O2` (Windows) so end users pick up the wins by default. **New Features** -- **Vector-oriented ``Matrix`` methods** — six new methods designed - for the ``Nx2`` / ``2xN`` / ``Nx3`` / ``3xN`` vector and - batch-of-vectors shapes that show up in ``examples/boids.py`` and +- **Vector-oriented `Matrix` methods** — six new methods designed + for the `Nx2` / `2xN` / `Nx3` / `3xN` vector and + batch-of-vectors shapes that show up in `examples/boids.py` and similar simulation code: - - ``magnitude_squared(axis=None)`` — squared L2 norm without the - ``sqrt`` step. Cheaper than ``magnitude()`` and safe for + - `magnitude_squared(axis=None)` — squared L2 norm without the + `sqrt` step. Cheaper than `magnitude()` and safe for sub-normal thresholding. 
- - ``vecdot(other, axis=None)`` — axis-aware inner product matching - ``numpy.linalg.vecdot``. **Not** equivalent to ``numpy.dot``; - use ``@`` for matrix multiplication. Same-shape, row-broadcast - (``1xN`` vs ``MxN``), and column-broadcast (``Mx1`` vs ``MxN``) + - `vecdot(other, axis=None)` — axis-aware inner product matching + `numpy.linalg.vecdot`. **Not** equivalent to `numpy.dot`; + use `@` for matrix multiplication. Same-shape, row-broadcast + (`1xN` vs `MxN`), and column-broadcast (`Mx1` vs `MxN`) operands are all supported. - - ``cross(other, axis=None)`` — 2D scalar z-component or 3D cross - product. Five shape paths share one method: ``1x2`` / ``2x1`` - returns a float; ``1x3`` / ``3x1`` returns a same-orientation - ``Matrix``; ``Nx2`` / ``2xN`` batches collect per-vector - scalars; ``Nx3`` / ``3xN`` batches return same-shape ``Matrix`` - results. ``axis=`` disambiguates the square ``2x2`` / ``3x3`` + - `cross(other, axis=None)` — 2D scalar z-component or 3D cross + product. Five shape paths share one method: `1x2` / `2x1` + returns a float; `1x3` / `3x1` returns a same-orientation + `Matrix`; `Nx2` / `2xN` batches collect per-vector + scalars; `Nx3` / `3xN` batches return same-shape `Matrix` + results. `axis=` disambiguates the square `2x2` / `3x3` shapes (default per-row). - - ``normalize(axis=None, in_place=False)`` — divide every element + - `normalize(axis=None, in_place=False)` — divide every element by its magnitude. Zero-magnitude rows / columns are returned as - exact zeros (no NaN, no division by zero). ``axis=`` selects + exact zeros (no NaN, no division by zero). `axis=` selects per-row, per-column, or total normalisation. - - ``perpendicular(axis=None, in_place=False)`` — rotate every 2D - vector 90° counter-clockwise: ``(x, y) -> (-y, x)``. Accepts a - single 2D vector, an ``Nx2`` row batch, or a ``2xN`` column + - `perpendicular(axis=None, in_place=False)` — rotate every 2D + vector 90° counter-clockwise: `(x, y) -> (-y, x)`. 
Accepts a + single 2D vector, an `Nx2` row batch, or a `2xN` column batch. - - ``angle(axis=None)`` — polar angle ``atan2(y, x)`` of every 2D + - `angle(axis=None)` — polar angle `atan2(y, x)` of every 2D vector. Returns a float for a single 2D vector input, - otherwise a ``Matrix`` of per-vector angles. -- **``Matrix.size`` property** — total element count - (``rows * columns``). Matches ``numpy.ndarray.size``. -- **``Matrix.length`` property** — Frobenius (L2) magnitude as a - read-only ``@property`` so vector-like code reads naturally - (``direction.length``, ``velocity.length``) without the - parentheses of a method call. Equivalent to ``magnitude()`` with + otherwise a `Matrix` of per-vector angles. +- **`Matrix.size` property** — total element count + (`rows * columns`). Matches `numpy.ndarray.size`. +- **`Matrix.length` property** — Frobenius (L2) magnitude as a + read-only `@property` so vector-like code reads naturally + (`direction.length`, `velocity.length`) without the + parentheses of a method call. Equivalent to `magnitude()` with no axis argument. -- **``in_place=`` keyword on every unary ``Matrix`` method** — - ``transpose``, ``ceil``, ``floor``, ``round``, ``negate``, - ``abs``, plus the new ``normalize`` and ``perpendicular`` all - accept ``in_place=True`` to mutate ``self`` and return it. - Replaces the older ``transpose_in_place()`` method (see +- **`in_place=` keyword on every unary `Matrix` method** — + `transpose`, `ceil`, `floor`, `round`, `negate`, + `abs`, plus the new `normalize` and `perpendicular` all + accept `in_place=True` to mutate `self` and return it. + Replaces the older `transpose_in_place()` method (see **Breaking Changes** below). -- **``axis=`` keyword on aggregate methods** — ``sum``, ``mean``, - ``min``, ``max``, ``magnitude``, and the new ``magnitude_squared`` - now share a tri-state ``axis=`` argument (``None`` / ``0`` / ``1``) - decoded through a single classifier. 
Negative axes (``-1`` / - ``-2``) accepted for NumPy parity. +- **`axis=` keyword on aggregate methods** — `sum`, `mean`, + `min`, `max`, `magnitude`, and the new `magnitude_squared` + now share a tri-state `axis=` argument (`None` / `0` / `1`) + decoded through a single classifier. Negative axes (`-1` / + `-2`) accepted for NumPy parity. **Improvements** -- **Auto-vectorised ``_math.c`` op kernels** — the binary, +- **Auto-vectorised `_math.c` op kernels** — the binary, aggregate, unary, and two-operand-aggregate op families inside - ``_math.c`` are now stamped from per-family descriptor tables, + `_math.c` are now stamped from per-family descriptor tables, one kernel per (op, shape) combination. Each per-element body is literally substituted into its own monomorphic inner loop, restoring the precondition for GCC's / Clang's auto-vectoriser. @@ -76,84 +349,84 @@ up the wins by default. | Bench row | 0.7.0 (ns) | 0.8.0 (ns) | Δ | | ----------------------------------------- | ---------- | ---------- | ------- | - | ``mean()`` shape=(1000, 100) | 44179.6 | 9001.6 | −79.6% | - | ``mean(1)`` shape=(1000, 100) | 51699.4 | 7058.5 | −86.3% | - | ``max(1)`` shape=(1000, 100) | 97184.2 | 11322.7 | −88.3% | - | ``magnitude()`` shape=(1000, 3) | 1098.2 | 306.8 | −72.1% | - | ``add col-bcast`` shape=(1000, 100) | 37823.4 | 20172.5 | −46.7% | - | ``div same-shape`` shape=(1000, 100) | 80134.2 | 45458.9 | −43.3% | - | ``normalize()`` shape=(1000, 3) axis=None | 3644.6 | 1775.5 | −51.3% | + | `mean()` shape=(1000, 100) | 44179.6 | 9001.6 | −79.6% | + | `mean(1)` shape=(1000, 100) | 51699.4 | 7058.5 | −86.3% | + | `max(1)` shape=(1000, 100) | 97184.2 | 11322.7 | −88.3% | + | `magnitude()` shape=(1000, 3) | 1098.2 | 306.8 | −72.1% | + | `add col-bcast` shape=(1000, 100) | 37823.4 | 20172.5 | −46.7% | + | `div same-shape` shape=(1000, 100) | 80134.2 | 45458.9 | −43.3% | + | `normalize()` shape=(1000, 3) axis=None | 3644.6 | 1775.5 | −51.3% | Four rows in code paths untouched by the 
refactor regressed by - 5–15% from layout drift (``_math.so`` ``.text`` grew +125% from + 5–15% from layout drift (`_math.so` `.text` grew +125% from kernel specialisation); none are on a hot path. No behavioural - change; ``test_matrix.py`` passes unchanged. -- **``-O3`` / ``/O2`` on ``bocpy._math``** — the math extension now - sets per-platform ``extra_compile_args`` in ``setup.py`` - (``-O3 -fno-plt`` on Linux/macOS, ``/O2`` on Windows) so end-user + change; `test_matrix.py` passes unchanged. +- **`-O3` / `/O2` on `bocpy._math`** — the math extension now + sets per-platform `extra_compile_args` in `setup.py` + (`-O3 -fno-plt` on Linux/macOS, `/O2` on Windows) so end-user wheels and editable installs both pick up the auto-vectoriser - wins above. Other ``bocpy`` extensions are unaffected. The SBOM - hash for ``_math.*.so`` will drift accordingly — see - :doc:`sbom` for the auditor-facing note. + wins above. Other `bocpy` extensions are unaffected. The SBOM + hash for `_math.*.so` will drift accordingly — see + `sbom` for the auditor-facing note. **Breaking Changes** -- **``Matrix.transpose_in_place()`` removed** — superseded by - ``Matrix.transpose(in_place=True)``, which returns ``self`` and +- **`Matrix.transpose_in_place()` removed** — superseded by + `Matrix.transpose(in_place=True)`, which returns `self` and so composes the same way every other unary method does. - Migration is mechanical: replace ``m.transpose_in_place()`` with - ``m.transpose(in_place=True)``. + Migration is mechanical: replace `m.transpose_in_place()` with + `m.transpose(in_place=True)`. 
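For codebases with many call sites, the mechanical migration above could be scripted. The following is a rough sketch, assuming every call is written with empty parentheses; `migrate_transpose_calls` is a hypothetical helper, not something shipped with bocpy. Because it works on raw text, it would also rewrite matches inside strings and comments, so review the resulting diff before committing.

```python
import re

# Matches `.transpose_in_place()` with optional whitespace inside the
# parentheses. Assumes the call takes no arguments.
_OLD_CALL = re.compile(r"\.transpose_in_place\(\s*\)")


def migrate_transpose_calls(source: str) -> str:
    """Rewrite removed `transpose_in_place()` calls to the 0.8.0 form."""
    return _OLD_CALL.sub(".transpose(in_place=True)", source)


before = "m.transpose_in_place()\nv = grid.cells.transpose_in_place( )\n"
after = migrate_transpose_calls(before)
print(after)
```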
**Documentation** -- New ``Matrix`` API entries in :doc:`api` for ``size``, ``length``, - ``magnitude_squared``, ``vecdot``, ``cross``, ``normalize``, - ``perpendicular``, and ``angle``, plus updated ``in_place=`` +- New `Matrix` API entries in `api` for `size`, `length`, + `magnitude_squared`, `vecdot`, `cross`, `normalize`, + `perpendicular`, and `angle`, plus updated `in_place=` keyword signatures on the existing unary methods. **Tests** -- **234 new test cases** for the new ``Matrix`` methods and +- **234 new test cases** for the new `Matrix` methods and properties (1571 → 1805 passed). Coverage includes a stub-guard - test that greps ``__init__.pyi`` for every new C-level name and - in-cown coverage exercising each new method inside ``@when``. + test that greps `__init__.pyi` for every new C-level name and + in-cown coverage exercising each new method inside `@when`. - **Portable overflow regex + cross 2x3/3x2 contract pinning** — - the cross-product test for the doubly-valid ``2x3`` / ``3x2`` + the cross-product test for the doubly-valid `2x3` / `3x2` shapes now pins the 2D-batch interpretation explicitly, locking the documented behaviour. **Internal** -- **``scripts/bench_matrix.py``** — bench harness used to gate the - refactor: ``--json`` append mode, ``--report-median`` per-row +- **`scripts/bench_matrix.py`** — bench harness used to gate the + refactor: `--json` append mode, `--report-median` per-row merge, 200 ms warmup, batch-size auto-tuning. -- **``scripts/validate_wheel.py`` + - ``scripts/_vendored_warehouse_wheel.py``** — stdlib-only wheel - ``RECORD`` validator and a vendored slice of Warehouse's wheel - parser; used by the PR gate to catch ``RECORD`` regressions +- **`scripts/validate_wheel.py` + + `scripts/_vendored_warehouse_wheel.py`** — stdlib-only wheel + `RECORD` validator and a vendored slice of Warehouse's wheel + parser; used by the PR gate to catch `RECORD` regressions before PyPI does. 
**CI / build** -- **``cibuildwheel`` v3.4.0 → v3.4.1** and **``clang-format-action``** +- **`cibuildwheel` v3.4.0 → v3.4.1** and **`clang-format-action`** pin normalised to the underlying commit SHA (Dependabot's preferred format). Both pins move in lock-step with the github-actions Dependabot group. -- **``idna`` 3.16 → 3.17** in ``ci/constraints-docs.txt``. Five - other Dependabot proposals (``docutils`` 0.23, ``ruamel-yaml`` - 0.19, ``sphinx-tabs`` 3.4.7+, ``sphinx-toolbox`` 4.2, and - ``standard-imghdr`` 3.13) require Python ≥3.11 and so cannot +- **`idna` 3.16 → 3.17** in `ci/constraints-docs.txt`. Five + other Dependabot proposals (`docutils` 0.23, `ruamel-yaml` + 0.19, `sphinx-tabs` 3.4.7+, `sphinx-toolbox` 4.2, and + `standard-imghdr` 3.13) require Python ≥3.11 and so cannot enter a universal lock that still includes Python 3.10; a - comment above ``requires-python = ">=3.10"`` in - ``pyproject.toml`` lists them for the post-3.10-EOL bump. -- **``flake8`` ``extend-exclude``** for ``.copilot/``, ``build/``, - ``sphinx/build/``, and the scratch ``.env*`` venvs so the walker + comment above `requires-python = ">=3.10"` in + `pyproject.toml` lists them for the post-3.10-EOL bump. +- **`flake8` `extend-exclude`** for `.copilot/`, `build/`, + `sphinx/build/`, and the scratch `.env*` venvs so the walker no longer trips on generated or vendored Python files. ## 2026-05-28 - Version 0.7.0 Cown-lifecycle correctness fixes — three use-after-free paths in the -``CownCapsule`` pickle / acquire / noticeboard machinery now hold the -inner ``BOCCown`` alive across the writer's wrapper drop — plus +`CownCapsule` pickle / acquire / noticeboard machinery now hold the +inner `BOCCown` alive across the writer's wrapper drop — plus supply-chain hardening: pinned and hash-verified Python dependencies, SHA-pinned GitHub Actions, dependabot coverage, vulnerability scanning, and PEP 770 SBOMs embedded in every wheel. @@ -161,250 +434,250 @@ and PEP 770 SBOMs embedded in every wheel. 
**New Features** - **PEP 770 SBOMs in every wheel** — every wheel built by - ``.github/workflows/build_wheels.yml`` now embeds a - `CycloneDX 1.6 `_ - JSON SBOM under ``-.dist-info/sboms/bocpy.cdx.json``. + `.github/workflows/build_wheels.yml` now embeds a + [CycloneDX 1.6](https://cyclonedx.org/specification/overview/) + JSON SBOM under `-.dist-info/sboms/bocpy.cdx.json`. Generation runs inside cibuildwheel's repair step on every platform - (Linux ``auditwheel``, macOS ``delocate``, Windows direct injection) - via the new stdlib-only ``scripts/build_sbom.py``. The - ``inject`` subcommand rewrites the wheel's ``RECORD`` atomically + (Linux `auditwheel`, macOS `delocate`, Windows direct injection) + via the new stdlib-only `scripts/build_sbom.py`. The + `inject` subcommand rewrites the wheel's `RECORD` atomically (temp file + rename). -- **SBOM verification in CI** — the new ``verify_sboms`` job in - ``build_wheels.yml`` re-downloads the extracted SBOM artifact and - runs two checks: ``scripts/validate_sbom.py`` (stdlib-only +- **SBOM verification in CI** — the new `verify_sboms` job in + `build_wheels.yml` re-downloads the extracted SBOM artifact and + runs two checks: `scripts/validate_sbom.py` (stdlib-only structural validator pinning bocpy's wire format) and - `grype `_ (third-party SBOM - scanner) with ``--fail-on high``. A separate ``sboms`` artifact is - also uploaded by the ``merge`` job for downstream consumers. -- **``bocpy.__version__``** — a runtime version attribute derived - from ``importlib.metadata.version("bocpy")``, with a - ``PackageNotFoundError`` fallback. Exported from ``bocpy.__all__`` - and documented in ``__init__.pyi``. ``pyproject.toml`` remains the + [grype](https://github.com/anchore/grype) (third-party SBOM + scanner) with `--fail-on high`. A separate `sboms` artifact is + also uploaded by the `merge` job for downstream consumers. 
+- **`bocpy.__version__`** — a runtime version attribute derived + from `importlib.metadata.version("bocpy")`, with a + `PackageNotFoundError` fallback. Exported from `bocpy.__all__` + and documented in `__init__.pyi`. `pyproject.toml` remains the single source of truth for the version. -- **New documentation** — :doc:`sbom` walk-through covering the +- **New documentation** — `sbom` walk-through covering the embedded SBOM format, extraction recipes, and verification commands. -- **``wait(noticeboard=True)`` final-state capture** — :func:`wait` - now accepts a ``noticeboard`` keyword that returns the final - noticeboard contents as a plain ``dict`` at shutdown (after the +- **`wait(noticeboard=True)` final-state capture** — `wait` + now accepts a `noticeboard` keyword that returns the final + noticeboard contents as a plain `dict` at shutdown (after the noticeboard thread exits, before the entries are freed). Useful for surfacing an early-stopping result, last error, or aggregated counter that a behavior deposited just before the runtime - quiesced, replacing the older ``send`` / ``receive`` handshake - that earlier examples used. Combined with ``stats=True`` it - returns a new :class:`WaitResult` ``NamedTuple`` (also exported - from ``bocpy.__all__``) carrying both snapshots. The - ``examples/prime_factor.py`` example was migrated to the new + quiesced, replacing the older `send` / `receive` handshake + that earlier examples used. Combined with `stats=True` it + returns a new `WaitResult` `NamedTuple` (also exported + from `bocpy.__all__`) carrying both snapshots. The + `examples/prime_factor.py` example was migrated to the new pattern. **Bug Fixes** -- **Cown-in-cown use-after-free** — a ``Cown`` embedded inside +- **Cown-in-cown use-after-free** — a `Cown` embedded inside another cown's value, a message-queue payload, or a noticeboard snapshot was previously freed when the writer's local wrapper dropped, because pickle bytes carry no refcount on their own. 
- ``CownCapsule_reduce`` now takes an inheriting ``COWN_INCREF`` that - ``_cown_capsule_from_pointer_inheriting`` consumes on unpickle, so - the inner ``BOCCown`` survives until the consumer drops its + `CownCapsule_reduce` now takes an inheriting `COWN_INCREF` that + `_cown_capsule_from_pointer_inheriting` consumes on unpickle, so + the inner `BOCCown` survives until the consumer drops its decoded wrapper. Affects every cross-cown reference shape — see - the new ``TestCownInCown`` class for the full container-shape fuzz. -- **Acquire-failure poisoned-state** — when ``pickle.loads`` failed - partway through ``cown_acquire``, the cown was left in a + the new `TestCownInCown` class for the full container-shape fuzz. +- **Acquire-failure poisoned-state** — when `pickle.loads` failed + partway through `cown_acquire`, the cown was left in a half-acquired state with the encoded bytes still in place. A retry would re-run pickle against bytes whose embedded inherited refs had already been partially consumed by pickle's error path, - risking dereferences of freed ``BOCCown*`` pointers. The cown's - ``xidata`` is now recycled on the failure path and a guard at the - top of ``cown_acquire`` rejects any future acquire with a - deterministic ``RuntimeError``; the worker recovery arm surfaces + risking dereferences of freed `BOCCown*` pointers. The cown's + `xidata` is now recycled on the failure path and a guard at the + top of `cown_acquire` rejects any future acquire with a + deterministic `RuntimeError`; the worker recovery arm surfaces it on the failing behavior's result cown. 
- **Noticeboard hidden-cown audit** — when a noticeboard value - reached a ``Cown`` via a route the pin walker cannot see — custom - ``__reduce__`` / ``__getstate__``, ``copyreg.dispatch_table``, + reached a `Cown` via a route the pin walker cannot see — custom + `__reduce__` / `__getstate__`, `copyreg.dispatch_table`, closure capture, module-level cache — the borrowing reconstructor - produced a token whose inner ``BOCCown`` was not held alive by + produced a token whose inner `BOCCown` was not held alive by the entry's pin set, leaving the next reader to UAF after the writer's wrapper dropped. A per-thread borrowing context - (``BOC_NB_CTX``) now audits every ``CownCapsule_reduce`` against + (`BOC_NB_CTX`) now audits every `CownCapsule_reduce` against the caller's pin set during the noticeboard write pickle and - fails the whole ``notice_write`` / ``notice_update`` closed if + fails the whole `notice_write` / `notice_update` closed if any cown is unaccounted for. - **`UnicodeDecodeError` on non-UTF-8 Windows locales** — - ``Behaviors.start`` read ``worker.py`` with ``open(path)``, which - picks up ``locale.getpreferredencoding(False)``. On cp1252 + `Behaviors.start` read `worker.py` with `open(path)`, which + picks up `locale.getpreferredencoding(False)`. On cp1252 (English Windows) the UTF-8 em-dashes in the worker source were silently mojibake-d; on cp949 (Korean Windows) the read failed - with ``UnicodeDecodeError: 'cp949' codec can't decode byte 0xe2`` - and ``bocpy`` could not start at all (reported in - `#14 `_ by - `@Forthoney `_). Fixed by passing - ``encoding="utf-8"`` explicitly in ``Behaviors.start``, and the - same fix was applied to every other ``open()`` site in the repo + with `UnicodeDecodeError: 'cp949' codec can't decode byte 0xe2` + and `bocpy` could not start at all (reported in + [#14](https://github.com/microsoft/bocpy/issues/14) by + [@Forthoney](https://github.com/Forthoney)). 
Fixed by passing + `encoding="utf-8"` explicitly in `Behaviors.start`, and the + same fix was applied to every other `open()` site in the repo that reads or writes text known to contain non-ASCII bytes - (``sphinx/source/conf.py``, ``examples/sketches.py`` x2, - ``export_module.py``). -- **Silent worker-startup failures** — ``Behaviors.start_workers`` - ran ``interpreters.create()`` and ``interpreters.run_string()`` + (`sphinx/source/conf.py`, `examples/sketches.py` x2, + `export_module.py`). +- **Silent worker-startup failures** — `Behaviors.start_workers` + ran `interpreters.create()` and `interpreters.run_string()` on the worker thread without a try/except, so a failure in either - killed the thread without ever replying on ``boc_behavior``. The - parent's bounded ``receive()`` then timed out with no diagnostic. + killed the thread without ever replying on `boc_behavior`. The + parent's bounded `receive()` then timed out with no diagnostic. Both calls are now wrapped, and every failure path sends a - formatted traceback over ``boc_behavior`` so the parent sees a + formatted traceback over `boc_behavior` so the parent sees a structured error instead of a timeout. - **Silent worker bootstrap import failures** — the generated bootstrap script that loads the user module into each worker sub-interpreter is now wrapped in a top-level try/except. Any - ``BaseException`` is formatted with the user module name and sent - over ``boc_behavior`` (falls back to ``sys.stderr`` if the - message-queue ``send`` itself raises), then re-raised so - ``run_string`` reports it as well. Module-import failures that + `BaseException` is formatted with the user module name and sent + over `boc_behavior` (falls back to `sys.stderr` if the + message-queue `send` itself raises), then re-raised so + `run_string` reports it as well. Module-import failures that previously surfaced only as a worker-startup timeout now arrive as a proper traceback. 
-- **``boc_sched_worker_pop_slow`` skipped ``popped_local``** — the +- **`boc_sched_worker_pop_slow` skipped `popped_local`** — the slow-path pending-fallback and WSQ-dequeue branches returned - work without bumping ``popped_local`` (the fast path always + work without bumping `popped_local` (the fast path always did), so the documented producer/consumer identity in - :c:type:`boc_sched_stats_t` was violated whenever the fairness + `boc_sched_stats_t` was violated whenever the fairness arm fired or a worker entered the slow path directly. Both - branches now increment ``popped_local`` and reset the batch + branches now increment `popped_local` and reset the batch budget, matching the fast path. The header's reconciliation paragraph was also tightened to a "near-identity" that explicitly accounts for fairness-token pops (which are re-enqueued via raw - ``boc_wsq_enqueue`` rather than ``boc_sched_dispatch``, leaving + `boc_wsq_enqueue` rather than `boc_sched_dispatch`, leaving consumer-side counters without a matching producer-side bump). **Supply Chain** - **Hashed and pinned Python dependencies** — every CI dependency is - resolved into a ``ci/constraints-.txt`` file via - ``uv pip compile --universal --generate-hashes`` and installed with - ``pip install --require-hashes``. Covers the ``test``, ``linting``, - ``docs``, and new ``audit`` extras. ``bocpy`` itself is then - installed via ``pip install -e . --no-deps`` so an editable build + resolved into a `ci/constraints-.txt` file via + `uv pip compile --universal --generate-hashes` and installed with + `pip install --require-hashes`. Covers the `test`, `linting`, + `docs`, and new `audit` extras. `bocpy` itself is then + installed via `pip install -e . --no-deps` so an editable build cannot smuggle in an unpinned transitive dependency. -- **Vulnerability scanning** — new ``audit`` job in ``pr_gate.yml`` - runs ``pip-audit --strict`` against every constraints file on every - PR. 
``pip-audit`` itself is pinned via ``ci/constraints-audit.txt`` - and self-checked. A new ``.github/workflows/nightly_audit.yml`` - re-runs the audit nightly against ``main``. -- **SHA-pinned GitHub Actions** — every ``uses:`` line in - ``.github/workflows/`` is now pinned to a full 40-char commit SHA - with a trailing ``# vX.Y.Z`` comment. -- **Dependabot coverage** — new ``.github/dependabot.yml`` covers - three ecosystems (``pip`` rooted at ``/ci``, ``github-actions`` - rooted at ``/``, ``pip`` rooted at - ``/templates/c_abi_consumer``), grouped weekly per ecosystem. -- **Downstream template pinned** — ``templates/c_abi_consumer`` - pins ``bocpy~=MAJOR.MINOR`` as both a build requirement and a - runtime dependency. The ``finalize-pr`` skill bumps it in +- **Vulnerability scanning** — new `audit` job in `pr_gate.yml` + runs `pip-audit --strict` against every constraints file on every + PR. `pip-audit` itself is pinned via `ci/constraints-audit.txt` + and self-checked. A new `.github/workflows/nightly_audit.yml` + re-runs the audit nightly against `main`. +- **SHA-pinned GitHub Actions** — every `uses:` line in + `.github/workflows/` is now pinned to a full 40-char commit SHA + with a trailing `# vX.Y.Z` comment. +- **Dependabot coverage** — new `.github/dependabot.yml` covers + three ecosystems (`pip` rooted at `/ci`, `github-actions` + rooted at `/`, `pip` rooted at + `/templates/c_abi_consumer`), grouped weekly per ecosystem. +- **Downstream template pinned** — `templates/c_abi_consumer` + pins `bocpy~=MAJOR.MINOR` as both a build requirement and a + runtime dependency. The `finalize-pr` skill bumps it in lock-step with the root version. -- **New ``SUPPLY_CHAIN.md``** — top-level policy doc describing +- **New `SUPPLY_CHAIN.md`** — top-level policy doc describing everything above with the exact regeneration commands. 
**Documentation** -- **Cown pickle-leak note** — :class:`Cown` now documents that - ``pickle.dumps`` on a cown produces bytes that carry one strong +- **Cown pickle-leak note** — `Cown` now documents that + `pickle.dumps` on a cown produces bytes that carry one strong reference per embedded cown; orphan bytes (never unpickled in the producing process) leak one strong ref per byte string. The bocpy runtime never produces orphan bytes; the leak surface only - applies to third-party code that calls ``pickle.dumps(cown)`` + applies to third-party code that calls `pickle.dumps(cown)` directly. -- **Noticeboard cown-lifetime guarantee** — :func:`notice_write` and - :func:`notice_update` now document that values may embed - :class:`Cown` references and that the noticeboard keeps each +- **Noticeboard cown-lifetime guarantee** — `notice_write` and + `notice_update` now document that values may embed + `Cown` references and that the noticeboard keeps each embedded cown alive for as long as the entry remains. The new - paragraph in :doc:`noticeboard` mirrors this guarantee for + paragraph in `noticeboard` mirrors this guarantee for readers. -- **Noticeboard final-state capture guide** — :doc:`noticeboard` +- **Noticeboard final-state capture guide** — `noticeboard` gained a "Reading the Final State at Shutdown" section covering - the ``wait(noticeboard=True)`` contract, the combined - ``wait(stats=True, noticeboard=True)`` form returning - :class:`WaitResult`, the empty-dict fallbacks for the + the `wait(noticeboard=True)` contract, the combined + `wait(stats=True, noticeboard=True)` form returning + `WaitResult`, the empty-dict fallbacks for the never-started and never-written cases, and the recommendation - to use ``snap.get(key)`` since :func:`wait` quiesces as soon as + to use `snap.get(key)` since `wait` quiesces as soon as every behavior completes with no guarantee any particular write has landed. 
The early-stopping worked example in the same file was rewritten around the new API. **Tests** -- **``TestCownInCown``** in ``test/test_boc.py`` — pins the +- **`TestCownInCown`** in `test/test_boc.py` — pins the cown-in-cown UAF fix with three cases: an inner cown allocated inside a behavior and observed by a downstream behavior, a cown sent through the message queue and consumed by the receiver, and a 50-trial deterministic fuzz over seven container shapes - (``list`` / ``tuple`` / ``dict`` / ``@dataclass(slots=True)`` / - ``__dict__``-only / ``__slots__``-only / 2-level ``Cown[Cown[T]]``). -- **``TestAcquireFailureTerminal``** in ``test/test_boc.py`` — pins + (`list` / `tuple` / `dict` / `@dataclass(slots=True)` / + `__dict__`-only / `__slots__`-only / 2-level `Cown[Cown[T]]`). +- **`TestAcquireFailureTerminal`** in `test/test_boc.py` — pins the poisoned-state contract: after a deserialisation failure the cown stays permanently unavailable and every subsequent waiter - receives the deterministic ``RuntimeError`` on its result cown. + receives the deterministic `RuntimeError` on its result cown. - **Noticeboard hidden-cown regressions** in - ``test/test_noticeboard.py`` — exercises ``__reduce__`` and - ``copyreg.dispatch_table`` reductions that hide a cown from the + `test/test_noticeboard.py` — exercises `__reduce__` and + `copyreg.dispatch_table` reductions that hide a cown from the pin walker, and verifies the audit rejects the write closed rather than leaving an unpinned borrowing token in the entry. - A complementary ``_VisibleCownPair`` test guards against the + A complementary `_VisibleCownPair` test guards against the over-eager-rejection regression. 
-- **``test/test_version.py``** — covers ``bocpy.__version__``: - pyproject parity, PEP 440 shape, ``__all__`` export, and the - ``importlib.metadata`` fallback path (subprocess test that +- **`test/test_version.py`** — covers `bocpy.__version__`: + pyproject parity, PEP 440 shape, `__all__` export, and the + `importlib.metadata` fallback path (subprocess test that verifies the WARNING is emitted when the metadata lookup raises). -- **``test/test_build_sbom.py`` and ``test/test_validate_sbom.py``** +- **`test/test_build_sbom.py` and `test/test_validate_sbom.py`** — full coverage of the SBOM generator and validator: CycloneDX 1.6 shape, deterministic UUIDv5 serialNumber, - ``SOURCE_DATE_EPOCH`` timestamp, per-entry ZIP-attribute - preservation (``external_attr`` / ``create_system`` / - ``compress_type`` / ``date_time``) across symlink and - ``ZIP_STORED`` entries, atomic ``RECORD`` rewrite, and the CLI - ``generate`` / ``inject`` / ``validate`` modes. -- **``TestWaitNoticeboardCapture``** in ``test/test_noticeboard.py`` - — pins the ``wait(noticeboard=True)`` contract: returned dict is a - plain mutable ``dict``, empty-runtime / empty-noticeboard fallbacks - to ``{}``, single-flag back-compat (``wait()`` stays ``None``, - ``wait(stats=True)`` stays ``list``), combined-flag - :class:`WaitResult` shape, last-write-wins, delete propagation + `SOURCE_DATE_EPOCH` timestamp, per-entry ZIP-attribute + preservation (`external_attr` / `create_system` / + `compress_type` / `date_time`) across symlink and + `ZIP_STORED` entries, atomic `RECORD` rewrite, and the CLI + `generate` / `inject` / `validate` modes. 
+- **`TestWaitNoticeboardCapture`** in `test/test_noticeboard.py` + — pins the `wait(noticeboard=True)` contract: returned dict is a + plain mutable `dict`, empty-runtime / empty-noticeboard fallbacks + to `{}`, single-flag back-compat (`wait()` stays `None`, + `wait(stats=True)` stays `list`), combined-flag + `WaitResult` shape, last-write-wins, delete propagation through a chained behavior, fresh-session isolation, and the - single-shot guarantee that an explicit ``stop()`` followed by - ``wait(noticeboard=True)`` preserves the snapshot rather than + single-shot guarantee that an explicit `stop()` followed by + `wait(noticeboard=True)` preserves the snapshot rather than re-snapshotting the now-empty noticeboard. The existing - scheduler-stats tests in ``test/test_scheduler_stats.py`` were + scheduler-stats tests in `test/test_scheduler_stats.py` were simplified to use the cown-chain barrier directly rather than a - ``send``/``receive`` handshake, now that the same change is - exercised end-to-end by the new ``wait(noticeboard=True)`` tests. + `send`/`receive` handshake, now that the same change is + exercised end-to-end by the new `wait(noticeboard=True)` tests. **Internal** -- ``flake8`` now lints ``.pyi`` stubs (the default ``--filename`` +- `flake8` now lints `.pyi` stubs (the default `--filename` glob silently skipped them). Pre-existing defects in - ``__init__.pyi``, ``_core.pyi``, and ``test_boc.py`` cleaned up in - the same pass. The workflow also lints the new ``scripts/`` + `__init__.pyi`, `_core.pyi`, and `test_boc.py` cleaned up in + the same pass. The workflow also lints the new `scripts/` directory. - **`flake8-encodings` added to the `[linting]` extra** — pins the Windows-locale class of bug above as a permanent regression gate. - Any future ``open()`` call without an explicit ``encoding=`` - (or with ``encoding=None``) now fails the PR-gate lint job. 
The - plugin and its transitive dependencies (``flake8-helper``, - ``astatine``, ``domdf-python-tools``, ``natsort``) are pinned and - hash-verified in ``ci/constraints-linting.txt`` like every other + Any future `open()` call without an explicit `encoding=` + (or with `encoding=None`) now fails the PR-gate lint job. The + plugin and its transitive dependencies (`flake8-helper`, + `astatine`, `domdf-python-tools`, `natsort`) are pinned and + hash-verified in `ci/constraints-linting.txt` like every other CI dependency. -- **Defensive ``receive()`` timeouts on every lifecycle path** — - ``Behaviors.start_workers``, ``stop_workers``, ``_abort_workers``, +- **Defensive `receive()` timeouts on every lifecycle path** — + `Behaviors.start_workers`, `stop_workers`, `_abort_workers`, and the noticeboard mutator loop now pass a bounded timeout to - every ``_core.receive()`` they own. A wedged worker therefore - fails fast with a deterministic ``RuntimeError`` instead of + every `_core.receive()` they own. A wedged worker therefore + fails fast with a deterministic `RuntimeError` instead of hanging the parent forever. Defence in depth against the sub-interpreter wedge observed on macOS arm64 + Python 3.12/3.13. -- **No ``unittest.mock`` in test files that schedule ``@when``** — +- **No `unittest.mock` in test files that schedule `@when`** — the transpiler exports the whole test module for import in every - worker sub-interpreter, so a top-level ``from unittest import - mock`` triggers an ``import asyncio`` in every worker. On macOS + worker sub-interpreter, so a top-level `from unittest import + mock` triggers an `import asyncio` in every worker. On macOS arm64 + Python 3.12/3.13 this can deadlock during PEP 684 per-interpreter init. 
Replaced by a small in-house - ``test/mockreplacement.py`` (``patch_attr`` context manager + - ``Recorder`` / ``RecorderMethod`` stubs) imported lazily inside + `test/mockreplacement.py` (`patch_attr` context manager + + `Recorder` / `RecorderMethod` stubs) imported lazily inside the few tests that need it. The pitfall is documented in the - ``testing-with-boc`` skill. + `testing-with-boc` skill. ## 2026-05-10 - Version 0.6.0 Public C ABI for downstream extensions, enabling C-level participation @@ -412,38 +685,38 @@ in behavior-oriented concurrency across worker sub-interpreters. **New Features** -- **Decorator composition with ``@when``** — decorators stacked below - ``@when`` are now preserved on the generated behavior function and +- **Decorator composition with `@when`** — decorators stacked below + `@when` are now preserved on the generated behavior function and compose with the behavior body on the worker. Decorators placed - above ``@when`` raise a ``SyntaxError`` at transpile time with - actionable guidance. ``async def`` functions with ``@when`` are + above `@when` raise a `SyntaxError` at transpile time with + actionable guidance. `async def` functions with `@when` are also explicitly rejected. - **Public C ABI (``)** — downstream C extensions can now link against bocpy to register custom Python types as cross-interpreter shareable so `Cown` can carry instances of them across worker interpreters. The header is C-only, version-gated - via the ``BOCPY_ABI`` macro, and bumped on any incompatible change - to ``bocpy.h`` or ``xidata.h``. Wheels remain CPython-version-tagged + via the `BOCPY_ABI` macro, and bumped on any incompatible change + to `bocpy.h` or `xidata.h`. Wheels remain CPython-version-tagged so a runtime ABI mismatch cannot occur. 
- **`bocpy.get_include()` / `bocpy.get_sources()`** — Python-level - helpers that downstream ``setup.py`` files use to locate the bocpy + helpers that downstream `setup.py` files use to locate the bocpy headers and the small set of C sources that must be compiled into the consuming extension. - **`templates/c_abi_consumer/`** — a ready-to-copy template for building a C extension against the bocpy ABI, including a - ``setup.py``, a probe extension exercising the public surface, and - a pytest suite (``test_public_c_abi.py``) that validates the ABI + `setup.py`, a probe extension exercising the public surface, and + a pytest suite (`test_public_c_abi.py`) that validates the ABI end-to-end. - **C source reorganisation** — the per-subsystem translation units - introduced in 0.5.0 have been renamed with a ``boc_`` prefix - (``boc_compat.[ch]``, ``boc_sched.[ch]``, ``boc_tags.[ch]``, - ``boc_terminator.[ch]``, ``boc_noticeboard.[ch]``, ``boc_cown.h``) - to give the public ABI a stable, namespaced identity. ``xidata.h`` - has moved under ``include/bocpy/`` alongside ``bocpy.h``. + introduced in 0.5.0 have been renamed with a `boc_` prefix + (`boc_compat.[ch]`, `boc_sched.[ch]`, `boc_tags.[ch]`, + `boc_terminator.[ch]`, `boc_noticeboard.[ch]`, `boc_cown.h`) + to give the public ABI a stable, namespaced identity. `xidata.h` + has moved under `include/bocpy/` alongside `bocpy.h`. **Documentation** -- New :doc:`c_abi`, :doc:`messaging`, and :doc:`noticeboard` pages +- New `c_abi`, `messaging`, and `noticeboard` pages in the Sphinx site; the API reference has been expanded to cover the public ABI surface. @@ -453,7 +726,7 @@ in behavior-oriented concurrency across worker sub-interpreters. counter introduced in 0.4.0 has been removed. It exposed an implementation detail of the snapshot cache that did not survive the C ABI review and had no use case that was not better served - by ``notice_sync`` plus an explicit ``noticeboard()`` read. 
+ by `notice_sync` plus an explicit `noticeboard()` read. ## 2026-04-29 - Version 0.5.0 Verona-RT-style work-stealing scheduler, C source split into per-subsystem diff --git a/CITATION.cff b/CITATION.cff index bab0699..174d9cc 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -5,6 +5,6 @@ authors: given-names: "Matthew Alastair" orcid: "https://orcid.org/0000-0002-1019-8036" title: "bocpy" -version: 0.8.0 -date-released: 2026-06-02 +version: 0.9.0 +date-released: 2026-06-05 url: "https://github.com/microsoft/bocpy" \ No newline at end of file diff --git a/README.md b/README.md index 4ddc5cc..972d006 100644 --- a/README.md +++ b/README.md @@ -250,6 +250,43 @@ You can view the full example The BOC runtime ensures that this operates without deadlock, by construction. +### Talking to main-thread objects + +Some values can't survive an XIData round-trip — pyglet shapes, Tk widgets, +open file handles, ctypes pointers into a library loaded by `__main__`, a +GPU context. Wrap those in a `PinnedCown`. Behaviors whose request set +contains any pinned cown run on the main thread, drained by `pump()` from +your event loop (or implicitly by `wait()`). + +Keep dispatch coarse — one pinned `@when` per frame, not per item — so the +single-consumer main thread doesn't serialise your worker parallelism. The +`pump()` call drains whatever pinned behaviors are queued and returns +immediately when the queue is empty (it never blocks), so it is safe to +call from a tight render-loop tick. Hosts that want a starvation warning +when the queue stays non-empty can enable it explicitly with +`set_pump_watchdog()`; with no call, the runtime stays silent. 
+ +```python +from bocpy import Cown, PinnedCown, pump, start, when + +start() +canvas = PinnedCown(MyCanvas()) # main-thread-only handle + +def update(dt): + pump() # drains prior frame's write-back; returns immediately if nothing is queued + results = [worker_compute(i) for i in range(n)] # per-item worker @whens + + @when(*results, canvas) # one pinned behavior per frame + def _writeback(*args): + *cells, canvas = args + for cell in cells: + canvas.value.draw(cell.value) +``` + +See the [Pinned Cowns guide](https://microsoft.github.io/bocpy/pinned_cowns.html) +for the coarse-grained dispatch pattern, event-loop integration recipes +(pyglet, Tk, asyncio), and the starvation watchdog. + ### Examples We provide a few examples to show different ways of using BOC in a program: @@ -263,7 +300,9 @@ We provide a few examples to show different ways of using BOC in a program: 4. [`bocpy-cooking-boc`](https://github.com/microsoft/bocpy/blob/main/src/bocpy/examples/cooking_boc.py): The example from the [BOC tutorial](https://microsoft.github.io/bocpy/). 5. [`bocpy-boids`](https://github.com/microsoft/bocpy/blob/main/src/bocpy/examples/boids.py): An agent-based bird flocking - example demonstrating the `Matrix` class to do distributed computation over cores. + example demonstrating the `Matrix` class for parallel per-cell physics on + workers, with one `PinnedCown`-driven `@when` per frame batching the + pyglet-visible write-back (the coarse-grained pinned-dispatch pattern). Note: you'll need to install `pyglet` first in order to run the `bocpy-boids` example. 6. [`bocpy-primes`](https://github.com/microsoft/bocpy/blob/main/src/bocpy/examples/primes.py) and [`bocpy-prime-factor`](https://github.com/microsoft/bocpy/blob/main/src/bocpy/examples/prime_factor.py): @@ -313,6 +352,18 @@ read a frozen snapshot via `noticeboard()` / `notice_read()`. 
The [`bocpy-prime-factor`](https://github.com/microsoft/bocpy/blob/main/src/bocpy/examples/prime_factor.py) example uses it to coordinate early termination across worker behaviors. +For values that can't survive an XIData round-trip — UI handles, GPU +contexts, file descriptors — the library provides `PinnedCown`, a cown whose +value lives permanently in the main interpreter. Behaviors against a pinned +cown run on the main thread, drained by `pump()` from your event loop or +implicitly by `wait()`. The full surface is `PinnedCown`, `pump`, +`PumpResult`, `set_pump_watchdog`, and `set_wait_pump_poll`; the +[`bocpy-boids`](https://github.com/microsoft/bocpy/blob/main/src/bocpy/examples/boids.py) +example drives a pyglet window through one pinned `@when` per frame. See +the [Pinned Cowns guide](https://microsoft.github.io/bocpy/pinned_cowns.html) +for the coarse-grained dispatch pattern, watchdog, and free-threaded +support trajectory. + The library also includes lower-level Erlang-style messaging primitives (`send` / `receive`) for channel-based communication patterns; see the [API documentation](https://microsoft.github.io/bocpy/messaging.html) for @@ -330,6 +381,22 @@ wait() # block indefinitely wait(timeout=5) # raise TimeoutError if not done in 5 s ``` +For a synchronization checkpoint that does **not** tear the runtime +down — e.g. a parallel search that inspects a best-so-far cown +between rounds and then continues — use `quiesce()` instead. It +blocks until every in-flight behavior completes, optionally returns +a per-worker stats or noticeboard snapshot, and leaves the worker +pool and the noticeboard thread running so the next `@when` call +dispatches immediately. + +```python +from bocpy import quiesce + +snap = quiesce(noticeboard=True) # dict[str, Any] +print("best so far:", snap.get("best")) +# ... schedule the next round of @when calls ... +``` + ### Additional Info BOC is built on a solid foundation of serious scholarship and engineering. 
For further reading, please see: 1. [When Concurrency Matters: Behaviour-Oriented Concurrency](https://dl.acm.org/doi/10.1145/3622852) diff --git a/ci/constraints-audit.txt b/ci/constraints-audit.txt index 6e97382..67d819d 100644 --- a/ci/constraints-audit.txt +++ b/ci/constraints-audit.txt @@ -245,9 +245,9 @@ packaging==26.2 \ # via # pip-audit # pip-requirements-parser -pip==26.1.1 \ - --hash=sha256:99cb1c2899893b075ff56e4ed0af55669a955b49ad7fb8d8603ecdaf4ed653fb \ - --hash=sha256:d36762751d156a4ee895de8af39aa0abeeeb577f93a2eca6ab62467bbf0f8a78 +pip==26.1.2 \ + --hash=sha256:382ff9f685ee3bc25864f820aa50505825f10f5458ffff07e30a6d96e5715cab \ + --hash=sha256:f49cd134c61cf2fd75e0ce2676db03e4054504a5a4986d00f8299ae632dc4605 # via pip-api pip-api==0.0.34 \ --hash=sha256:8b2d7d7c37f2447373aa2cf8b1f60a2f2b27a84e1e9e0294a3f6ef10eb3ba6bb \ diff --git a/examples/README.md b/examples/README.md index b3bea5e..dfa1ecb 100644 --- a/examples/README.md +++ b/examples/README.md @@ -34,16 +34,22 @@ while the ordering required for correct computation is automatically maintained. ## [Boids](boids.py) The [boids](https://en.wikipedia.org/wiki/Boids) agent-based simulation of flocking -birds provides fertile ground for exploring many interesting aspects of BOC. First, -the threadsafe and BOC-enlightened +birds provides fertile ground for exploring many interesting aspects of BOC. The +threadsafe and BOC-enlightened [`Matrix`](http://microsoft.github.io/bocpy/sphinx/api.html#bocpy.Matrix) class is used -quite extensively to store boid positions and velocities and to compute the changing -positions. Boid updates are computed on grid cells, such that for each grid cell to -change it requires unique access to that cell and up to 8 of its neighbors. As the -demonstrator runs at framerate for hundreds of boids, the resulting simulation creates -tens of thousands of behaviors and cowns every second. 
Lots of thanks to -[Ben Eater's Boids repo](https://github.com/beneater/boids.git), which proved -a helpful starting point. +to store boid positions and velocities and to compute the changing positions. Boid +updates are computed on grid cells, such that updating a grid cell requires +unique access to that cell and up to 8 of its neighbors. As the demonstrator runs at +framerate for hundreds of boids, the resulting simulation creates tens of thousands +of behaviors and cowns every second. + +The simulation also demonstrates the +[`PinnedCown`](http://microsoft.github.io/bocpy/sphinx/api.html#bocpy.PinnedCown) +pattern: positions and velocities are exposed to the pyglet render loop via main-thread +aliases (read directly between frames) while a single pinned `@when` per frame +performs the write-back, dispatched by `pump()` inside the pyglet update tick. Lots +of thanks to [Ben Eater's Boids repo](https://github.com/beneater/boids.git), which +proved a helpful starting point. ## [Prime Factor](prime_factor.py) This example generates a semiprime (a product of two primes) and then factors it @@ -53,3 +59,27 @@ for a result before doing a batch of trial divisions. When any lane finds a factor it writes to the noticeboard, and the remaining lanes see the result on their next check and stop early. Demonstrates the "behavior loop" pattern and cross-behavior coordination via the noticeboard. + +## [Benchmark](benchmark.py) +`benchmark.py` is the workhorse used to track BOC runtime overhead across releases. +The default run measures end-to-end throughput on a matmul-fanout workload; the +key knobs are summarised below. + +- `--null-payload` — skip the matmul inner loop so the reported throughput + reflects pure scheduler / messaging overhead with the application work + removed. Useful when chasing scheduler regressions.
+- `--pinned-spinner` — during the measurement window, drive a tail-recursing + `@when` on a `PinnedCown` via `pump(max_behaviors=1)` so the C-level + pinned-queue 0→1 wakeup path is loaded *alongside* the worker `@when` + stream. Used to verify worker-throughput regression under high-rate pinned + dispatch. +- `--pinned-spinner-sleep-s` (default `0.001`, i.e. ~1 kHz) — per-iteration + sleep inside the pinned-spinner body. Controls the dispatch rate. +- `--repeats`, `--output`, `--table` / `--no-table`, `--quiet` — repeat-count, + results-file path, and reporting toggles for batch runs. +- `--emit-scheduler-stats` — capture per-worker `scheduler_stats()` and + `queue_stats()` snapshots after each repeat and embed them in the result + JSON. + +See [`scripts/bench_matrix.py`](../scripts/bench_matrix.py) for the +matrix-arithmetic micro-bench used to guard `_math.c` performance. diff --git a/examples/benchmark.py b/examples/benchmark.py index 0f66a3d..8863202 100644 --- a/examples/benchmark.py +++ b/examples/benchmark.py @@ -33,8 +33,8 @@ from typing import Optional from bocpy import _core -from bocpy import (Cown, Matrix, notice_write, noticeboard, receive, send, - start, wait, when) +from bocpy import (Cown, Matrix, notice_write, noticeboard, PinnedCown, + pump, start, wait, when) # Sentinels for the parent/child JSON protocol. Uppercase so the # transpiler keeps them as module-level constants in the worker export. @@ -175,6 +175,8 @@ class BenchConfig: payload_cols: int = 16 repeats: int = 1 null_payload: bool = False + pinned_spinner: bool = False + pinned_spinner_sleep_s: float = 0.001 @dataclass @@ -192,6 +194,10 @@ class RepeatResult: # per-window scheduler-stats delta (see # ``compute_derived_metrics``). derived: Optional[dict] = None + # Count of pinned spinner @when dispatches that fired during the + # run (0 when ``pinned_spinner`` was off). Lifted out of the + # noticeboard at shutdown. 
+ pinned_dispatches: int = 0 @dataclass @@ -324,39 +330,60 @@ def build_workload(cfg: BenchConfig): def schedule_snap(state_cowns: list) -> None: - """Schedule the final snapshot + publish behaviors. + """Schedule the final snapshot behavior. - See the module docstring for the snap ordering invariant. This - helper is structured so that the bare ``snap`` and ``_publish`` - return-cown locals fall out of scope at its return boundary, - satisfying the no-bare-Cowns-in-main rule before ``wait()`` runs. + The snap behavior writes the total count to the noticeboard under + ``"final_count"`` rather than sending a message; the parent lifts + the value back via ``wait(noticeboard=True)``. The bare ``snap`` + return-cown local falls out of scope at this function's return + boundary, satisfying the no-bare-Cowns-in-main rule before + ``wait()`` runs. :param state_cowns: Every chain's state cown. """ @when(state_cowns) def snap(states): - return sum(s.value.count for s in states) + notice_write("final_count", sum(s.value.count for s in states)) + notice_write("final_count_ts_ns", time.perf_counter_ns()) notice_write("cr_stop", True) - @when(snap) - def _publish(s): - send("snap", s.value) - -def emit_chain_snapshot(state_cown: Cown, tag: str) -> None: - """Send a chain's ``(count, head_idx)`` over the queue under ``tag``. - - Used by tests that need to inspect chain progress directly. The - helper lives in this module so the ``@when`` decorator runs through - the transpiler that registered ``schedule_step``. - - :param state_cown: The chain's state cown. - :param tag: The tag to ``send`` the snapshot under. +# The pinned cown wraps a one-slot list: ``[count]``. Counting inside +# the cown's own value keeps the hot per-dispatch path off the +# noticeboard, so NB_VERSION is not bumped on every spinner iteration +# (which would invalidate every worker's cached snapshot and skew the +# worker-throughput measurement). 
The spinner publishes the final +# count to the noticeboard exactly once -- on the iteration where it +# observes ``cr_stop`` and breaks the tail-recursion. +_PINNED_COUNT = 0 + + +def schedule_pinned_spinner(spin_cown: "PinnedCown", + sleep_s: float) -> None: + """Schedule the next pinned-spinner @when on ``spin_cown``. + + The body increments ``p[_PINNED_COUNT]`` (exclusive access to the + cown's value while the behavior runs) and either re-schedules + itself or, on the iteration that sees ``cr_stop`` set, writes the + final count to the noticeboard under ``"pinned_dispatches"`` -- + exactly one ``NB_VERSION`` bump per run instead of one per + dispatch. The sleep lives inside the body so the spinner + self-paces under a ``pump()`` / ``wait()`` auto-pump loop. + + :param spin_cown: The :class:`PinnedCown` the spinner runs on. + Its value must be a one-element list ``[count]``. + :param sleep_s: Per-iteration sleep, in seconds, controlling the + dispatch rate (e.g. ``0.001`` for ~1 kHz). """ - @when(state_cown) - def _emit(s): - send(tag, (s.value.count, s.value.head_idx)) + @when(spin_cown) + def _spinner(p): + p.value[_PINNED_COUNT] += 1 + if not noticeboard().get("cr_stop", False): + time.sleep(sleep_s) + schedule_pinned_spinner(spin_cown, sleep_s) + else: + notice_write("pinned_dispatches", p.value[_PINNED_COUNT]) # --------------------------------------------------------------------------- @@ -415,16 +442,25 @@ def run_single_point_body(cfg: BenchConfig, repeat_index: int) -> RepeatResult: from bocpy import _core sched_stats_warm = _core.scheduler_stats() wall_clock_ns_start = time.time_ns() - t_measure_start = time.perf_counter() - time.sleep(cfg.duration) + t_measure_start_ns = time.perf_counter_ns() + + # Measurement window. When ``--pinned-spinner`` is set, drive + # a tail-recursing @when on a PinnedCown by hand from this + # thread via ``pump(max_behaviors=1)``. The spinner's per-body + # ``time.sleep(sleep_s)`` self-paces the dispatch rate. 
The + # point is to load the C-level pinned-queue 0->1 wakeup path + # (single terminator cv broadcast per pump) and measure + # worker-throughput regression under that load. + if cfg.pinned_spinner: + pinned = PinnedCown([0]) + schedule_pinned_spinner(pinned, cfg.pinned_spinner_sleep_s) + deadline_ns = t_measure_start_ns + int(cfg.duration * 1e9) + while time.perf_counter_ns() < deadline_ns: + pump(max_behaviors=1) + else: + time.sleep(cfg.duration) schedule_snap(state_cowns) - msg = receive(["snap"], 60.0 + cfg.duration) - t_snap_received = time.perf_counter() - if msg is None or msg[0] != "snap": - raise RuntimeError("snap behavior did not publish in time") - _, total = msg - elapsed_s = t_snap_received - t_measure_start # Snapshot tagged-queue counters BEFORE wait() tears the # runtime down. Per-tag assignments are rebound on the next @@ -433,27 +469,45 @@ def run_single_point_body(cfg: BenchConfig, repeat_index: int) -> RepeatResult: _core.queue_stats() if hasattr(_core, "queue_stats") else None ) finally: - # Drop bare-Cown locals before wait(). + # Drop bare-Cown locals before wait(). ``pinned`` (when set) + # is intentionally left in scope: ``wait()`` auto-pumps any + # remaining pinned bodies on this thread, the spinner + # observes ``cr_stop`` from snap, publishes its final count, + # and stops re-scheduling so the queue drains. del rings del state_cowns - # ``wait(stats=True)`` returns the per-worker scheduler_stats - # snapshot captured AFTER all behaviors completed but BEFORE - # the per-worker array is freed -- the only correct moment - # for a session-final snapshot. - sched_stats_end = wait(stats=True) - + # ``wait(stats=True, noticeboard=True)`` returns a WaitResult + # carrying both the per-worker scheduler_stats snapshot and + # the final noticeboard contents. Both are captured AFTER all + # behaviors completed but BEFORE the per-worker array and the + # noticeboard entries are freed -- the only correct moment. 
+ wait_result = wait(stats=True, noticeboard=True) + sched_stats_end = wait_result.stats + nb_snap = wait_result.noticeboard + + total = int(nb_snap.get("final_count", 0)) + pinned_dispatches = int(nb_snap.get("pinned_dispatches", 0)) + # Use the snap behavior's own write-time so the elapsed_s + # numerator/denominator pairing matches: ``total`` is the count + # snap observed at that instant, ``t_measure_start_ns`` is when the + # chains began contributing to it. + snap_ts_ns = nb_snap.get("final_count_ts_ns") + if snap_ts_ns is None: + raise RuntimeError("snap behavior did not publish final_count_ts_ns") + elapsed_s = max(0.0, (int(snap_ts_ns) - t_measure_start_ns) / 1e9) sched_stats_delta = _delta_scheduler_stats(sched_stats_warm, sched_stats_end) throughput = total / elapsed_s if elapsed_s > 0 else 0.0 return RepeatResult(repeat_index=repeat_index, - completed_behaviors=int(total), + completed_behaviors=total, elapsed_s=elapsed_s, throughput=throughput, wall_clock_ns_start=wall_clock_ns_start, scheduler_stats=sched_stats_delta, queue_stats=queue_stats_snap, derived=compute_derived_metrics(sched_stats_delta, - int(total))) + total), + pinned_dispatches=pinned_dispatches) # --------------------------------------------------------------------------- @@ -586,6 +640,10 @@ def cfg_to_argv(cfg: BenchConfig) -> list: args += ["--chains-per-ring", str(cfg.chains_per_ring)] if cfg.null_payload: args += ["--null-payload"] + if cfg.pinned_spinner: + args += ["--pinned-spinner", + "--pinned-spinner-sleep-s", + str(cfg.pinned_spinner_sleep_s)] return args @@ -645,7 +703,8 @@ def run_in_subprocess(cfg: BenchConfig, repeat_index: int, wall_clock_ns_start=int(payload["wall_clock_ns_start"]), scheduler_stats=payload.get("scheduler_stats"), queue_stats=payload.get("queue_stats"), - derived=payload.get("derived")) + derived=payload.get("derived"), + pinned_dispatches=int(payload.get("pinned_dispatches", 0))) def _extract_sentinel_payload(stdout: str) -> Optional[dict]: @@ -1128,6
+1187,21 @@ def build_arg_parser() -> argparse.ArgumentParser: help="Skip the matmul inner loop in each behavior. " "Throughput then reflects pure BOC runtime " "overhead with the application work removed.") + p.add_argument("--pinned-spinner", dest="pinned_spinner", + action="store_true", default=False, + help="During the measurement window, drive a " + "tail-recursing @when on a PinnedCown via " + "pump(max_behaviors=1) so the C-level " + "pinned-queue 0->1 wakeup path is loaded " + "alongside the worker @when stream. Use to " + "verify worker-throughput regression under " + "high-rate pinned dispatch.") + p.add_argument("--pinned-spinner-sleep-s", + dest="pinned_spinner_sleep_s", + type=float, default=0.001, + help="Per-iteration sleep inside the pinned " + "spinner body, in seconds. Controls the " + "dispatch rate (default 1e-3 = ~1 kHz).") p.add_argument("--output", default=None) p.add_argument("--table", dest="table", action="store_true", default=None) p.add_argument("--no-table", dest="table", action="store_false") @@ -1169,6 +1243,8 @@ def args_to_base_cfg(args) -> BenchConfig: payload_cols=args.payload_cols, repeats=args.repeats, null_payload=args.null_payload, + pinned_spinner=args.pinned_spinner, + pinned_spinner_sleep_s=args.pinned_spinner_sleep_s, ) @@ -1195,6 +1271,7 @@ def child_main(args) -> int: "elapsed_s": rep.elapsed_s, "throughput": rep.throughput, "wall_clock_ns_start": rep.wall_clock_ns_start, + "pinned_dispatches": rep.pinned_dispatches, } if args.emit_scheduler_stats: # Read from the snapshot taken INSIDE run_single_point_body, diff --git a/examples/boids.py b/examples/boids.py index 7a0f44e..6d3a02c 100644 --- a/examples/boids.py +++ b/examples/boids.py @@ -6,7 +6,7 @@ import math from typing import Mapping, NamedTuple -from bocpy import Cown, Matrix, receive, send, start, wait, when +from bocpy import Cown, Matrix, PinnedCown, pump, start, wait, when class BoundingBox(NamedTuple("BoundingBox", [("left", int), ("top", int), ("right", int), 
("bottom", int)])): @@ -118,9 +118,8 @@ def limit_speed(velocity: Matrix, speed_limit=15): :param velocity: A 1 x 2 velocity vector (modified in place). :param speed_limit: The maximum speed for a boid """ - speed = velocity.magnitude() - if speed > speed_limit: - velocity /= speed + if velocity.magnitude_squared() > speed_limit * speed_limit: + velocity.normalize(in_place=True) velocity *= speed_limit @@ -169,6 +168,8 @@ def update(self, cell_data: Mapping[Cell, "CellData"], width: int, height: int): :param cell_data: The full grid mapping, used to locate neighbor cells. :param width: The simulation area width. :param height: The simulation area height. + :return: A ``Cown`` holding ``(pos_slice, vel_slice)`` for the + frame-end pinned writeback to consume. """ row, column = self.cell boids = self.boids @@ -189,18 +190,18 @@ def update(self, cell_data: Mapping[Cell, "CellData"], width: int, height: int): num_boids = len(boids) if num_boids == 1: @when(self.positions, self.velocities) - def _(positions: Cown[Matrix], velocities: Cown[Matrix]): + def single_cell(positions: Cown[Matrix], velocities: Cown[Matrix]): pos = positions.value vel = velocities.value limit_speed(vel) vel += keep_within_bounds(pos, width, height) pos += vel - send("update", (row, column, pos.copy(), vel.copy())) + return pos.copy(), vel.copy() - return + return single_cell @when(positions, velocities) - def _(positions: list[Cown[Matrix]], velocities: list[Cown[Matrix]]): + def multi_cell(positions: list[Cown[Matrix]], velocities: list[Cown[Matrix]]): batch_positions = Matrix.concat([c.value for c in positions]) batch_velocities = Matrix.concat([c.value for c in velocities]) @@ -219,9 +220,9 @@ def _(positions: list[Cown[Matrix]], velocities: list[Cown[Matrix]]): vcell[i] = batch_velocities[i] = vel pcell[i] = batch_positions[i] = pos + vel - pos_update = batch_positions[:num_boids] - vel_update = batch_velocities[:num_boids] - send("update", (row, column, pos_update, vel_update)) + return 
batch_positions[:num_boids], batch_velocities[:num_boids] + + return multi_cell class Simulation: @@ -237,7 +238,16 @@ def __init__(self, num_boids: int, width: int, height: int, spacing=50): """ self.spacing = spacing self.num_boids = num_boids - self.positions, self.velocities = init_boids(num_boids, width, height) + positions, velocities = init_boids(num_boids, width, height) + # The per-frame writeback runs on the main thread against + # these matrices; per-cell physics runs on workers against + # cell-local cowns. ``self.positions`` / ``self.velocities`` + # alias the same matrix objects for direct main-thread reads + # (spatial hashing, drawing). + self.positions_cown = PinnedCown(positions) + self.velocities_cown = PinnedCown(velocities) + self.positions = positions + self.velocities = velocities self.num_cells = 2 * num_boids self.cell_start = [0 for _ in range(self.num_cells + 1)] self.cell_entries = [0 for _ in range(self.num_cells)] @@ -328,17 +338,24 @@ def step(self, width: int, height: int): for cell in self.grid_cells: self.cell_data[cell] = self.build_cell_data(self.positions, self.velocities, cell.row, cell.column) - self.num_behaviors = 0 - for value in self.cell_data.values(): - value.update(self.cell_data, width, height) - self.num_behaviors += 1 - - for _ in range(self.num_behaviors): - _, (row, column, positions, velocities) = receive("update") - boids = self.cell_data[Cell(row, column)].boids - for b, pos, vel in zip(boids, positions, velocities): - self.positions[b] = pos - self.velocities[b] = vel + cells = list(self.cell_data.values()) + boid_indices = [cd.boids for cd in cells] + results = [cd.update(self.cell_data, width, height) for cd in cells] + self.num_behaviors = len(cells) + + # One pinned dispatch per frame -- coarse-grained writeback. + # Workers compute per-cell physics in parallel; this single + # main-thread behavior batches the global-matrix update once all + # cell results are ready. 
+ @when(results, self.positions_cown, self.velocities_cown) + def _writeback(per_cell, all_pos, all_vel): + pos_mat = all_pos.value + vel_mat = all_vel.value + for boids, result in zip(boid_indices, per_cell): + pos_slice, vel_slice = result.value + for b, p, v in zip(boids, pos_slice, vel_slice): + pos_mat[b] = p + vel_mat[b] = v self.cell_data.clear() @@ -375,6 +392,7 @@ def __init__(self, width: int, height: int, num_boids: int, self.samples = deque() self.fps_samples = deque() self.show_overlay = show_overlay + self.pending_updates = 0 if show_overlay: self.num_boids_label = pyglet.text.Label( @@ -428,11 +446,19 @@ def update(self, delta_time: float): :param delta_time: Seconds elapsed since the last frame. """ self.elapsed += delta_time + self.total_elapsed += delta_time + result = pump() + self.pending_updates -= result.executed + if self.pending_updates > 0: + # avoid creating extra work until the previous + # update has been applied + return + + self.pending_updates += 1 self.simulation.step(self.width, self.height) self.num_behaviors += self.simulation.num_behaviors self.num_frames += 1 self.total_behaviors += self.simulation.num_behaviors - self.total_elapsed += delta_time if self.elapsed > 1: self.samples.append(self.num_behaviors / self.elapsed) @@ -453,10 +479,10 @@ def update(self, delta_time: float): positions = self.simulation.positions velocities = self.simulation.velocities + angles = velocities.angle() for b, t in enumerate(self.triangles): pos = positions[b] - vel = velocities[b] - angle = math.atan2(vel.y, vel.x) + angle = angles[b, 0] r, g, b = colorsys.hsv_to_rgb(((angle + math.pi) / (2 * math.pi)), 1, 1) r = int(r * 255) g = int(g * 255) diff --git a/pyproject.toml b/pyproject.toml index 1e9105a..01c9492 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "bocpy" -version = "0.8.0" +version = "0.9.0" authors = [ {name = "bocpy Team", email="bocpy@microsoft.com"} ] diff 
--git a/setup.py b/setup.py index 70ee484..8fda04d 100644 --- a/setup.py +++ b/setup.py @@ -102,8 +102,12 @@ "src/bocpy/_internal_test_atomics.c", "src/bocpy/_internal_test_bq.c", "src/bocpy/_internal_test_wsq.c", + "src/bocpy/_core.c", "src/bocpy/boc_compat.c", + "src/bocpy/boc_noticeboard.c", "src/bocpy/boc_sched.c", + "src/bocpy/boc_tags.c", + "src/bocpy/boc_terminator.c", ], depends=_headers, include_dirs=_include_dirs, diff --git a/sphinx/source/api.rst b/sphinx/source/api.rst index 03c5948..757bfe5 100644 --- a/sphinx/source/api.rst +++ b/sphinx/source/api.rst @@ -20,6 +20,7 @@ Behaviors .. autofunction:: wait .. autoclass:: WaitResult :members: +.. autofunction:: quiesce .. autofunction:: start Cown Groups @@ -66,6 +67,11 @@ The bocpy runtime follows a simple lifecycle: is no central scheduler thread. 3. **Wait** — :func:`wait` blocks until all scheduled behaviors complete, then tears down the runtime (joins workers, closes the noticeboard). + For a non-tearing-down checkpoint (e.g. parallel-search inspection + between rounds), use :func:`quiesce` instead — it blocks until + the runtime is quiescent, returns optional ``stats`` / + ``noticeboard`` snapshots, and leaves workers and the noticeboard + thread running so further ``@when`` calls work immediately. 4. **Re-start** — after ``wait()`` returns, the next ``@when`` call spins up a fresh runtime. The noticeboard is cleared and worker statistics are reset; existing :class:`Cown` objects survive and can be scheduled @@ -80,6 +86,24 @@ Advanced .. autofunction:: whencall +Pinned Cowns +------------ + +See :ref:`pinned-cowns` for the conceptual overview, the +coarse-grained dispatch pattern, event-loop integration recipes, +and the free-threaded support trajectory. + +.. autoclass:: PinnedCown + :members: + :undoc-members: + +.. autofunction:: pump +.. autoclass:: PumpResult + :members: +.. autofunction:: set_pump_watchdog +.. 
autofunction:: set_wait_pump_poll + + Noticeboard ----------- diff --git a/sphinx/source/c_abi.rst b/sphinx/source/c_abi.rst index 03a48e5..e940d76 100644 --- a/sphinx/source/c_abi.rst +++ b/sphinx/source/c_abi.rst @@ -159,6 +159,18 @@ pattern so downstream code does not have to redefine them: called with the GIL held (or while attached to an interpreter on free-threaded builds) — same contract as the underlying ``PyInterpreterState_GetID(PyInterpreterState_Get())``. + * - ``bocpy_main_interpid()`` + - ``static inline int_least64_t``: returns the *main* + interpreter's ID, pre-typed to match ``bocpy_interpid()`` for + owner-field equality checks. Wraps + ``PyInterpreterState_GetID(PyInterpreterState_Main())``, which + returns the process's main interpreter regardless of which + interpreter the caller is currently attached to, so this + helper is safe to call from a worker sub-interpreter for + diagnostic / assert use (under the GIL or equivalent + attachment, same as ``bocpy_interpid()``). Used by bocpy's + own main-pinned-cown call sites to assert that the running + interpreter is the permanent owner of a pinned cown's value. The two are designed to be used together: producer-side, CAS the owner from ``bocpy_interpid()`` to ``BOCPY_NO_OWNER`` before calling diff --git a/sphinx/source/conf.py b/sphinx/source/conf.py index cc8c0e6..0e4789b 100644 --- a/sphinx/source/conf.py +++ b/sphinx/source/conf.py @@ -14,7 +14,7 @@ project = 'bocpy' copyright = '2026, Microsoft' author = 'Microsoft' -release = '0.8.0' +release = '0.9.0' # -- General configuration --------------------------------------------------- # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration diff --git a/sphinx/source/index.rst b/sphinx/source/index.rst index ac36d92..026252e 100644 --- a/sphinx/source/index.rst +++ b/sphinx/source/index.rst @@ -85,6 +85,13 @@ behavior once its cowns are free. 
The two transfers serialise on the ``Alice``/``Bob`` cowns, so their effects are interleaved in a deadlock-free, data-race-free order chosen by the runtime. +For a non-tearing-down synchronization point — e.g. a parallel search that +needs to inspect a best-so-far cown between rounds before continuing — use +:func:`quiesce` instead of :func:`wait`. It blocks until the runtime is +quiescent, optionally returns a stats or noticeboard snapshot, and leaves +the worker pool and the noticeboard thread running so the next ``@when`` +call dispatches immediately. + For cross-behavior shared state see :ref:`noticeboard`. For lower-level Erlang-style ``send`` / ``receive`` channels see :ref:`messaging`. @@ -93,6 +100,7 @@ Erlang-style ``send`` / ``receive`` channels see :ref:`messaging`. :caption: Contents: api + pinned_cowns noticeboard messaging c_abi diff --git a/sphinx/source/noticeboard.rst b/sphinx/source/noticeboard.rst index addb3c0..66e9acd 100644 --- a/sphinx/source/noticeboard.rst +++ b/sphinx/source/noticeboard.rst @@ -163,11 +163,35 @@ Edge cases: - The returned dict is a plain mutable ``dict``; mutating it locally does not affect the (now-freed) noticeboard. +Reading the State Between Rounds +--------------------------------- + +When you want a noticeboard snapshot at a synchronization point +*without* tearing the runtime down — e.g. a parallel search that +inspects its best-so-far state between rounds and then keeps +working — use :func:`quiesce` with ``noticeboard=True``:: + + from bocpy import quiesce + + snap = quiesce(noticeboard=True) # plain dict[str, Any] + print("best so far:", snap.get("best")) + # ... next batch of @when calls runs immediately ... 
+ +:func:`quiesce` blocks until every in-flight behavior completes, +captures the snapshot the same way :func:`wait` does (by cycling +the dedicated mutator thread, which guarantees every prior +``notice_write`` / ``notice_update`` / ``notice_delete`` has been +committed before the read), and then leaves the workers and the +noticeboard thread running. The combined ``stats=True, +noticeboard=True`` form returns a :class:`WaitResult` just like +:func:`wait`. + Reading the Noticeboard ----------------------- -From inside a behavior, call :func:`noticeboard` to get a read-only mapping -of all entries, or :func:`notice_read` for a single key:: +The noticeboard is a **behavior-scope read surface**. Inside a behavior, +call :func:`noticeboard` to get a read-only mapping of all entries, or +:func:`notice_read` for a single key:: from bocpy import noticeboard, notice_read, when, Cown @@ -183,7 +207,7 @@ of all entries, or :func:`notice_read` for a single key:: # Single key with a default threshold = notice_read("threshold", 0.5) -The snapshot is taken once per behavior and cached — multiple calls to +The snapshot is taken once per behavior and cached -- multiple calls to :func:`noticeboard` or :func:`notice_read` within the same behavior return data from the same point in time. @@ -191,6 +215,18 @@ Cowns embedded in a noticeboard entry remain valid for the lifetime of the entry; they survive as long as the entry has not been overwritten or deleted, regardless of how many readers have observed the entry. +.. warning:: + + Calling :func:`noticeboard` or :func:`notice_read` from the main + thread *outside* a behavior is **undefined behavior**. The only + supported ways to read the noticeboard from the main thread are + :func:`wait` with ``noticeboard=True`` (see "Reading the Final + State at Shutdown" above) and :func:`quiesce` with + ``noticeboard=True`` (see "Reading the State Between Rounds" + above). 
Seeding the noticeboard with :func:`notice_write` from + the main thread *before* scheduling behaviors is fine and is + the recommended pattern for installing read-mostly configuration. + Writing and Updating -------------------- diff --git a/sphinx/source/pinned_cowns.rst b/sphinx/source/pinned_cowns.rst new file mode 100644 index 0000000..c123597 --- /dev/null +++ b/sphinx/source/pinned_cowns.rst @@ -0,0 +1,286 @@ +.. _pinned-cowns: + +Pinned Cowns +============ + +.. module:: bocpy + :noindex: + +A :class:`PinnedCown` is a :class:`Cown` whose value never leaves the +**main interpreter**. Behaviors whose request set contains *any* +pinned cown run on the main thread, drained by :func:`pump` (called +from your event loop) or implicitly by :func:`wait`. Use a pinned +cown when the underlying value cannot survive an XIData round-trip: +pyglet shapes, Tk widgets, open file handles, ctypes pointers, GPU +contexts, asyncio loops. + +When to Use a Pinned Cown +------------------------- + +Reach for :class:`PinnedCown` when the value: + +- has a ``__reduce__`` that raises or silently reconstructs a + broken object on the other side of the worker boundary, +- is a handle into a library loaded only by ``__main__`` + (pyglet GL state, a Tk root, an open SQLite connection, a CUDA + context), +- must observe identity across acquires (``id(value)`` stable + between behaviors). + +A regular :class:`Cown` stores its value as cross-interpreter data +and the same Python object is never observed twice in a worker. A +:class:`PinnedCown` holds its value as a plain ``PyObject`` +reference in the main interpreter; every acquire sees the same +object. + +Pump Contract +------------- + +:func:`pump` drains the main-thread queue of pinned-aware +behaviors. Each behavior runs to completion before the next starts. +The pump is non-preemptive: ``deadline_ms`` gates *starting* the +next behavior, not interrupting one already running. + +.. 
code-block:: python + + from bocpy import pump + + result = pump() # drain to empty + result = pump(deadline_ms=4) # wall-clock budget + result = pump(max_behaviors=8) # hard count + result = pump(raise_on_error=True) # re-raise first body exception + +The result is a :class:`PumpResult` ``NamedTuple``: + +- ``executed`` — pinned behaviors whose lifecycle (acquire attempt + → optional body → release) ran to completion. +- ``deadline_reached`` — ``True`` iff the loop exited because + ``deadline_ms`` tripped before the queue drained. +- ``raised`` — pinned behaviors whose body raised an + ``Exception`` captured to the result cown's ``.exception``. + +Script-mode programs need not call :func:`pump` explicitly — +:func:`wait` pumps internally when any :class:`PinnedCown` exists +in the process. + +.. _pinned-coarse-grained: + +The Coarse-Grained Dispatch Pattern +----------------------------------- + +The pinned arm is **single-consumer**: only the main thread drains +the pump queue. If you schedule a pinned behavior per item, those +behaviors serialise on the main thread and you lose worker +parallelism. + +Schedule pinned behaviors **coarsely** — one per logical frame or +batch, not per item: + +1. Wrap per-item state in regular :class:`Cown`\s. +2. Schedule worker ``@when``\s that compute per-item physics and + **return** a result :class:`Cown`. +3. Schedule **one** pinned ``@when`` per frame that captures all + the result cowns together with the main-thread handle, and + performs the batched write-back. + +The +`boids example `_ +follows this pattern: worker behaviors compute per-cell flock +physics in parallel; one pinned behavior per frame writes the +results back into the global pyglet-visible position and velocity +matrices. Dispatch through the pump queue is ~1 per frame, not N. 
+ +Integrating with an Event Loop +------------------------------ + +Pyglet +^^^^^^ + +Call :func:`pump` at the top of each scheduled tick so the prior +frame's write-back drains before new work is scheduled: + +.. code-block:: python + + import pyglet + from bocpy import PinnedCown, pump, start, when + + start() + canvas = PinnedCown(MyCanvas()) # holds a pyglet handle + + def update(dt): + pump() # drain prior frame's write-back + # ... schedule worker behaviors that return result cowns ... + results = [worker_compute(i) for i in range(num_items)] + + @when(*results, canvas) + def _writeback(*args): + *cells, canvas = args + for cell in cells: + canvas.value.draw(cell.value) + + pyglet.clock.schedule_interval(update, 1 / 60) + pyglet.app.run() + +Tk / asyncio +^^^^^^^^^^^^ + +Drive :func:`pump` from a periodic callback (``root.after(ms, +...)`` for Tk, ``loop.call_later(...)`` or an ``asyncio``-friendly +periodic task for asyncio). The same coarse-grained pattern +applies: keep dispatch rate at one pinned behavior per logical +batch. + +Starvation and the Watchdog +--------------------------- + +**The watchdog is disabled until you call** :func:`set_pump_watchdog`. +No call means no warnings — the runtime stays silent regardless of +how long the pinned queue has been non-empty. + +Once enabled, if pinned work piles up because the host event loop +is wedged or not calling :func:`pump` often enough, the watchdog +logs a warning carrying the queue's age and depth. The threshold +gates on **queue-non-empty time**: a program that runs only +unpinned work indefinitely never trips it. + +.. code-block:: python + + from bocpy import set_pump_watchdog + + set_pump_watchdog(warn_ms=1000) # enable warn-at-1s (matches the kwarg default) + set_pump_watchdog(warn_ms=None) # disable + +- **No call ⇒ no watchdog.** The runtime ships with the warn + threshold unset; you opt in by calling + :func:`set_pump_watchdog` at least once. 
+- ``warn_ms`` (default ``1000`` when the kwarg is omitted) logs a + warning carrying the queue's non-empty duration (ms) and current + depth. Pass ``None`` to turn the warning off. +- ``on_starve`` lets the host replace the default ``logging`` + sink. Use it to escalate (``on_starve=lambda s, m: pytest.fail(m)`` + in tests, a counter / alert hook in production). + +The watchdog deliberately never raises on its own: the pinned queue +is bounded by the live :class:`PinnedCown` count by construction, +so there is no back-pressure threat the library can defend against +without lying about it. Fail-fast policy belongs in the host's +``on_starve`` callback, where the calling code can record the right +context and pick the right exception class. + +Hosts that need to tune :func:`wait`'s internal pump cadence call +:func:`set_wait_pump_poll`. The default cadence is **50 ms**, which +is the upper bound on how long the auto-pump loop will park between +checks when no broadcast wakes it. + +Main-thread Direct Reads +------------------------ + +The pinned-cown contract refuses worker-side reads of the underlying +value (the owner CAS rejects them). The symmetric question — "may +the main thread read the value directly, outside a pinned ``@when`` +body?" — has a narrower answer: + +- Reading the underlying object from the **main thread** is safe + **iff** no pinned ``@when`` is currently executing against that + cown. Pinned bodies run synchronously inside :func:`pump`; once + ``pump()`` (or :func:`wait`'s auto-pump) returns, no body holds + the cown, and an immediate main-thread read sees a consistent + value. +- Reading from the main thread **while** ``pump()`` is dispatching + a body that targets the same cown is **undefined**. Do not + alternate between "I'm pumping" and "I'm reading directly" + inside the same callback. 
+- The safe pattern is: stash any main-thread alias for read-only + rendering / event-loop integration, but treat the pinned ``@when`` + body as the only writer. + +In the boids example, the ``Simulation`` object aliases the same +``Matrix`` under ``self.positions`` (for pyglet rendering) and +``self.positions_cown = PinnedCown(positions)`` (for the per-frame +write-back). The render path runs on the main thread between +``pump()`` calls; the write-back runs inside ``pump()``. They never +overlap. + +Thread Affinity and Free-Threaded Builds +----------------------------------------- + +- :class:`PinnedCown` may only be **constructed** from the main + interpreter; a worker that calls ``PinnedCown(x)`` raises + :class:`RuntimeError`. +- :func:`pump` must run on the main interpreter. On classic + CPython, any thread within the main interpreter may pump (the + per-interpreter GIL serialises). +- On free-threaded builds (``Py_GIL_DISABLED``) only **one thread + at a time** may pump, enforced by a CAS on pump entry that + raises :class:`RuntimeError` if a second thread tries to enter + concurrently. The CAS is cleared on every exit path, including + ``BaseException`` propagation from a pinned body. +- :func:`pump` is **not reentrant**. Calling :func:`pump` from + inside a pinned-behavior body raises :class:`RuntimeError`. + +Handle vs. Value +---------------- + +A :class:`PinnedCown` *handle* (the Python wrapper and its C +capsule) is a normal cross-interpreter shareable. It travels via +the same XIData mechanism as a regular :class:`Cown` and may be: + +- shipped as a captured variable to a worker behavior, +- embedded in any value graph stored in a regular :class:`Cown` + (``Cown(PinnedCown(x))`` is supported), +- placed in a noticeboard entry via :func:`notice_write` or + :func:`notice_update`. + +What never crosses interpreter boundaries is the *value*. 
A worker +that ends up holding a pinned-cown handle can do exactly one +useful thing with it: schedule pinned ``@when``\s against it, +which the runtime auto-routes to the main pump queue. Any attempt +to acquire the value from a worker is rejected by the C-level +owner CAS. + +Mixed Request Sets +------------------ + +A behavior may freely combine pinned and unpinned cowns; the 2PL +acquisition order is unchanged. As soon as the request set contains +any pinned cown, the body runs on the main thread. Unpinned cowns +in the set still travel through XIData into the main interpreter +for the body's duration. + +Exception Model +--------------- + +Body exceptions follow the same rules as worker behaviors: +captured on the result :class:`Cown` and surfaced through +``cown.exception``. The default :func:`pump` does **not** re-raise; +pass ``raise_on_error=True`` to opt into fail-fast propagation. +``BaseException`` (``KeyboardInterrupt``, ``SystemExit``, +``GeneratorExit``) propagates from :func:`pump` immediately after +the offending behavior's per-iteration cleanup completes; any +behaviors still queued remain in the pinned queue and resume on +the next :func:`pump` (or :func:`wait`-driven auto-pump) call. + +Free-Threaded Trajectory +------------------------ + +On free-threaded CPython (the ``3.13t`` and ``3.15t`` builds), +:class:`PinnedCown` works identically to classic CPython — the +sub-interpreter boundary still exists for FT workers, and +free-threaded support is "experimental" across all of bocpy. +:class:`PinnedCown` inherits that label. The single-pumper CAS +prevents silent data races from concurrent pumpers, raising +:class:`RuntimeError` instead. + +Long term, bocpy will fork into a classic-CPython build (using +sub-interpreters — where :class:`PinnedCown` is meaningful) and a +free-threaded build (running workers as plain main-interpreter +threads — where every cown is effectively pinned and +:class:`PinnedCown` becomes a no-op). 
In that future, the +single-pumper CAS is removed. Out of scope for v1. + +API Reference +------------- + +See :ref:`api` for the autodoc-generated reference for +:class:`PinnedCown`, :func:`pump`, :class:`PumpResult`, +:func:`set_pump_watchdog`, and :func:`set_wait_pump_poll`. diff --git a/src/bocpy/__init__.py b/src/bocpy/__init__.py index 15ab029..49ad513 100644 --- a/src/bocpy/__init__.py +++ b/src/bocpy/__init__.py @@ -9,7 +9,9 @@ from ._math import Matrix from .behaviors import (Behaviors, Cown, notice_delete, notice_read, notice_sync, notice_update, notice_write, noticeboard, + PinnedCown, pump, PumpResult, quiesce, REMOVED, + set_pump_watchdog, set_wait_pump_poll, start, wait, WaitResult, when, whencall, WORKER_COUNT) try: @@ -78,11 +80,15 @@ def get_sources() -> list[str]: return [] -__all__ = ["Behaviors", "Cown", "Matrix", "REMOVED", "TIMEOUT", +__all__ = ["Behaviors", "Cown", "Matrix", "PinnedCown", "PumpResult", + "REMOVED", "TIMEOUT", "WORKER_COUNT", "__version__", "drain", "get_include", "get_sources", "notice_delete", "notice_read", "notice_sync", "notice_update", "notice_write", "noticeboard", + "pump", + "quiesce", "receive", - "send", "set_tags", "start", "wait", "WaitResult", + "send", "set_pump_watchdog", "set_tags", "set_wait_pump_poll", + "start", "wait", "WaitResult", "when", "whencall"] diff --git a/src/bocpy/__init__.pyi b/src/bocpy/__init__.pyi index f670629..853b864 100644 --- a/src/bocpy/__init__.pyi +++ b/src/bocpy/__init__.pyi @@ -577,6 +577,264 @@ class Cown(Generic[T]): """Debug representation.""" +class PinnedCown(Cown[T]): + """A cown whose value never leaves the main interpreter. + + Behaviors whose request set contains *any* PinnedCown run on the + main interpreter, scheduled onto a pump queue that the runtime + drains under :func:`wait` and that hosts may drive explicitly via + :func:`bocpy.pump`. 
+ + A regular :class:`Cown` stores its value as cross-interpreter + data: every time a worker acquires the cown the value is + unpickled into the worker's interpreter, mutated, and re-pickled + on release. That round-trip is the reason a cown can be acquired + by any worker -- but it also means the value must be picklable + and that **the same Python object is never observed twice** in + a worker. + + Many useful values cannot survive that round-trip: pyglet shapes, + Tk widgets, open file handles, ctypes pointers into a library + loaded by ``__main__``, an asyncio event loop, a GPU context. + Their ``__reduce__`` either raises or silently reconstructs a + broken object on the other side. + + A :class:`PinnedCown` holds its value as a plain + :c:type:`PyObject` reference in the main interpreter. The value + never goes through ``XIData``; the same Python object is + observed on every acquire. The trade-off: every behavior whose + request set contains a pinned cown runs **on the main thread**, + drained by :func:`pump` (called from your event loop) or + implicitly by :func:`wait`. + + Pattern: coarse-grained pinned dispatch + The pinned arm is single-consumer (the main thread). If you + schedule a pinned behavior per item, those behaviors + serialise on the main thread and you lose worker + parallelism. Schedule pinned behaviors coarsely -- one per + logical frame or batch, not per item. Do per-item + computation on workers against per-item :class:`Cown` + slices, then dispatch **one** pinned ``@when`` per frame + that captures all of them together with the main-thread + canvas / handle and performs the batched write-back. + + Thread affinity + Pinned cowns may only be constructed from the **main + interpreter**. Constructing one from a worker raises + :class:`RuntimeError`; the value would have no home + interpreter to live in. 
:func:`pump` likewise requires the + main interpreter -- any thread within it on classic CPython; + on free-threaded builds (``Py_GIL_DISABLED``) a single + thread at a time, enforced by a CAS on pump entry that + raises :class:`RuntimeError` if a second thread tries to + pump concurrently. The CAS is cleared on **every** exit + path, including ``BaseException`` propagation from a + pinned body. + + Mixed request sets + A behavior may freely combine pinned and unpinned cowns; + the 2PL acquisition order is unchanged. As soon as the + request set contains any pinned cown, the body runs on the + main thread. Unpinned cowns in the set still travel through + XIData into the main interpreter for the body's duration. + + Exception model + Body exceptions follow the same rules as worker behaviors: + captured on the result :class:`Cown` and surfaced through + ``cown.exception``. The default :func:`pump` does **not** + re-raise; pass ``raise_on_error=True`` to opt into + fail-fast propagation. + + Nested pumping + Calling :func:`pump` from inside a pinned-behavior body + raises :class:`RuntimeError` (v1). + + Handle vs. value + A :class:`PinnedCown` *handle* (the Python wrapper object + and its C capsule) is a normal cross-interpreter shareable. + It travels via the same XIData mechanism as a regular + :class:`Cown` and may be: + + - shipped as a captured variable to a worker behavior, + - embedded in any value graph stored in a regular + :class:`Cown` (``Cown(PinnedCown(x))`` is supported), + - placed in a noticeboard entry via :func:`notice_write` + or :func:`notice_update`. + + What never crosses interpreter boundaries is the *value* + ``x``. A worker that ends up holding a pinned-cown handle + can do exactly one useful thing with it: schedule pinned + behaviors against it (which the runtime auto-routes to + the main pump queue). 
Any attempt to acquire the value + from a worker is rejected by the C-level owner CAS -- the + value's owner is permanently the main interpreter. + + Restrictions + - Constructible only on the main interpreter (see + *Thread affinity* above). + - The pinning interpreter is the main interpreter, by + design. There is one pinned queue per process and one + consumer of that queue (the main pumper); pinned cowns do + not split across interpreters. + """ + + def __init__(self, value: T): + """Create a pinned cown wrapping *value*. + + :param value: The initial value to wrap. Stored as a plain + :c:type:`PyObject` reference in the main interpreter -- + no pickling, no XIData round-trip. + :raises RuntimeError: If called from a non-main interpreter. + """ + + +class PumpResult(NamedTuple): + """Result of a :func:`pump` call. + + :ivar executed: Pinned behaviors whose lifecycle ran to + completion this call. Counts the iteration even if the body + raised or the acquire failed (the MCS chain still drained). + :ivar deadline_reached: ``True`` iff the loop exited because + ``deadline_ms`` tripped before the queue drained and before + ``max_behaviors`` capped. ``False`` on drain, on + ``max_behaviors`` cap, or when ``deadline_ms`` is ``None``. + :ivar raised: Pinned behaviors whose body raised an + :class:`Exception` captured to the result cown's + ``.exception``. Cleanup-path failures (acquire, release, + noticeboard cache-clear) do **not** count: they are logged + via ``PyErr_WriteUnraisable`` and the iteration is still + counted in ``executed``. On :class:`BaseException` + propagation, :func:`pump` raises and no + :class:`PumpResult` is returned. + """ + + executed: int + deadline_reached: bool + raised: int + + +def pump(deadline_ms: Optional[int] = None, + max_behaviors: Optional[int] = None, + raise_on_error: bool = False) -> PumpResult: + """Run pinned behaviors that are ready, then return. 
+ + Drains the main-thread queue of behaviors whose request sets + contain at least one :class:`PinnedCown`. Each behavior runs to + completion before the next starts. The pump is non-preemptive: + ``deadline_ms`` gates *starting* the next behavior, not + interrupting one already running. + + Call :func:`pump` from your event loop's idle / on-tick hook. + Script-mode programs need not call it explicitly -- :func:`wait` + pumps internally when any :class:`PinnedCown` exists in the + process. + + Bounding + - ``deadline_ms``: wall-clock budget. ``None`` drains to + empty; otherwise a positive :class:`int`. + - ``max_behaviors``: hard count. ``None`` drains to empty; + otherwise a positive :class:`int`. + ``0`` is rejected for both bounds (use ``if budget:`` at + the call site instead of relying on the pump to no-op). + + Exception model + By default body exceptions land on the result cown; pump + continues. With ``raise_on_error=True``, the first body + exception re-raises on the pump thread after the queue + finishes draining. :class:`BaseException` + (``KeyboardInterrupt``, ``SystemExit``, ``GeneratorExit``) + propagates immediately after the offending behavior's + per-iteration cleanup completes; any behaviors still queued + are left in place for the next :func:`pump` call. + + Thread affinity + :func:`pump` must run on the **main interpreter**. Calling + from a worker interpreter raises :class:`RuntimeError` + immediately. On free-threaded builds (``Py_GIL_DISABLED``) + only one thread may pump at a time: a concurrent call from + a different thread raises :class:`RuntimeError`. Calling + :func:`pump` when no :class:`PinnedCown` exists is a no-op + returning ``PumpResult(0, False, 0)``. + + Reentrance + Not reentrant. Calling from inside a pinned-behavior body + raises :class:`RuntimeError` (v1). + + :param deadline_ms: Wall-clock budget in milliseconds. + ``None`` for unbounded; otherwise a positive :class:`int`. + Must not be :class:`bool`. 
+ :type deadline_ms: Optional[int] + :param max_behaviors: Maximum behaviors to start this call. + ``None`` for unbounded; otherwise a positive :class:`int`. + Must not be :class:`bool`. + :type max_behaviors: Optional[int] + :param raise_on_error: Re-raise the first body exception after + drain. + :type raise_on_error: bool + :return: :class:`PumpResult` (``executed``, + ``deadline_reached``, ``raised``). On + :class:`BaseException` propagation, :func:`pump` raises + and no :class:`PumpResult` is returned. + :rtype: PumpResult + :raises TypeError: if ``deadline_ms`` or ``max_behaviors`` is + neither ``None`` nor a positive :class:`int`, or is a :class:`bool`. + :raises RuntimeError: wrong interpreter, concurrent pump on + free-threaded builds, nested pump, no live runtime + (:func:`start` has not been called), or watchdog raise + threshold tripped. + """ + + +def set_pump_watchdog(warn_ms: Optional[int] = 1000, + on_starve: Optional[ + Callable[[int, str], None]] = None) -> None: + """Configure the pinned-queue starvation watchdog. + + **The watchdog is disabled until this function is called.** No + call means no warnings, regardless of how long the pinned queue + has been non-empty. ``warn_ms=1000`` is the kwarg default that + applies *if and when* you opt in, not the runtime default. + + Warn-side sampling fires from :func:`pump` on entry (so + :func:`wait`'s auto-pump loop counts). The threshold gates on + **queue-non-empty time**: a program that runs only unpinned work + indefinitely never trips it. + + - ``warn_ms`` (kwarg default 1000): logs a warning carrying the + queue's non-empty duration (ms) and current depth. Pass + ``None`` to disable. Must be a positive int when set. + - ``on_starve``: optional callable ``(severity, message)`` to + replace the default logger. Use this to escalate (for + example ``on_starve=lambda s, m: pytest.fail(m)`` in tests, or + a counter / alert hook in production).
+ + :param warn_ms: Warn-after threshold in milliseconds, or + ``None`` to disable warnings. + :type warn_ms: Optional[int] + :param on_starve: Optional ``(severity, message)`` callback that + replaces the default logger sink. + :type on_starve: Optional[Callable[[int, str], None]] + :raises TypeError: if ``warn_ms`` is neither ``None`` nor a positive + :class:`int`, or ``on_starve`` is neither ``None`` nor callable. + :raises OverflowError: if ``warn_ms`` exceeds the maximum + representable nanosecond value. + """ + + +def set_wait_pump_poll(ms: int = 50) -> None: + """Set the poll cadence for :func:`wait`'s auto-pump loop. + + Default cadence is **50 ms** -- the upper bound on how long the + auto-pump loop will park between checks when no broadcast wakes + it. The setting is process-global and may be changed at any + time; the active :func:`wait` loop picks up the new value on + its next iteration. + + :param ms: Poll cadence in milliseconds. Must be positive. + :type ms: int + """ + + def notice_write(key: str, value: Any) -> None: """Write a value to the noticeboard. @@ -676,15 +934,29 @@ def notice_delete(key: str) -> None: def noticeboard() -> Mapping[str, Any]: """Return a cached snapshot of the noticeboard. - Must be called from within a ``@when`` behavior. The first call within a - behavior captures all entries under mutex and caches the data. - Subsequent calls in the same behavior return a view of the same - cached data. + The noticeboard is a behavior-scope read surface. The supported + use is from inside a ``@when`` body: the first call captures all + entries under mutex and caches them, and every subsequent call + in the same behavior returns the same cached view. The returned mapping is read-only. - Calling from outside a behavior (e.g. the main thread) will return a - snapshot that is never refreshed for that thread.
+ The only supported way to read the noticeboard from the main + thread is to ask :func:`wait` for it via ``wait(noticeboard=True)`` + (or ``wait(stats=True, noticeboard=True)``); that snapshot is taken + on the main thread between joining the noticeboard mutator thread + and clearing the C-side entries. + + Calling :func:`noticeboard` or :func:`notice_read` from any other + main-thread context (outside a behavior, outside + ``wait(noticeboard=True)``) is **undefined behavior**: the cached + proxy is never re-anchored on a behavior boundary, so subsequent + calls may observe either a stale snapshot or partially-applied + writes. + + Seeding the noticeboard with :func:`notice_write` from the main + thread *before* scheduling behaviors is fine and is the + recommended pattern for installing read-mostly configuration. :return: A read-only mapping of keys to their stored values. :rtype: Mapping[str, Any] @@ -694,11 +966,11 @@ def noticeboard() -> Mapping[str, Any]: def notice_read(key: str, default: Any = None) -> Any: """Read a single key from the noticeboard. - Must be called from within a ``@when`` behavior. Convenience wrapper - that takes a snapshot and returns one value. - - Calling from outside a behavior (e.g. the main thread) will return a - snapshot that is never refreshed for that thread. + Convenience wrapper over :func:`noticeboard` that takes a snapshot + and returns one value. The same supported-usage contract applies: + call from inside a ``@when`` behavior, or read the final state on + main via ``wait(noticeboard=True)``. Calling :func:`notice_read` + from any other main-thread context is **undefined behavior**. :param key: The noticeboard key to read. :type key: str @@ -844,6 +1116,94 @@ class WaitResult(NamedTuple): noticeboard: dict[str, Any] +@overload +def quiesce(timeout: Optional[float] = None, *, + stats: Literal[False] = False, + noticeboard: Literal[False] = False) -> None: ... 
+ + +@overload +def quiesce(timeout: Optional[float] = None, *, + stats: Literal[True], + noticeboard: Literal[False] = False) -> list[dict]: ... + + +@overload +def quiesce(timeout: Optional[float] = None, *, + stats: Literal[False] = False, + noticeboard: Literal[True]) -> dict[str, Any]: ... + + +@overload +def quiesce(timeout: Optional[float] = None, *, + stats: Literal[True], + noticeboard: Literal[True]) -> "WaitResult": ... + + +def quiesce(timeout: Optional[float] = None, *, + stats: bool = False, + noticeboard: bool = False + ) -> Union[None, list[dict], dict[str, Any], "WaitResult"]: + """Block until in-flight behaviors complete **without** teardown. + + Unlike :func:`wait`, this leaves the runtime fully usable: + workers remain running, the noticeboard thread remains + registered, and the terminator is **not** closed. Further + ``@when`` calls work immediately after ``quiesce()`` returns. + + Typical use is to lift a result out of a long-running parallel + job at a defined synchronization point — e.g. a parallel search + that periodically wants to inspect its best-so-far state — and + then keep working. The flags mirror :func:`wait`: + + - neither flag set: returns ``None`` once the runtime is quiescent. + - ``stats=True`` only: returns the per-worker scheduler-stats + snapshot as ``list[dict]`` (same shape as :func:`wait`). + - ``noticeboard=True`` only: returns a plain ``dict[str, Any]`` + with the noticeboard contents at the quiescence point. + - both flags set: returns :class:`WaitResult`. + + The noticeboard snapshot is captured by cycling the dedicated + mutator thread: a shutdown sentinel is enqueued on the FIFO + ``boc_noticeboard`` tag, the thread is joined (guaranteeing + every prior mutation has been committed), the live state is + read, and the thread is restarted. 
The result is a true + cross-interpreter point-in-time view that reflects every + ``notice_write`` / ``notice_update`` / ``notice_delete`` posted + by a behavior that completed before the quiesce point. + + Single-caller: like :func:`wait`, ``quiesce`` assumes one + thread at a time on the primary interpreter. Concurrent + ``@when`` calls from secondary threads during a ``quiesce`` are + waited for (their behaviors are part of the quiescence + condition); concurrent ``notice_write`` calls have undefined + ordering with respect to the returned snapshot. + + :param timeout: Maximum seconds to wait. ``None`` means wait + forever. The same deadline bounds both the terminator wait + and the noticeboard-cycle join. + :type timeout: Optional[float] + :param stats: If ``True``, capture per-worker scheduler stats + AFTER quiescence so the counts are stable. + :type stats: bool + :param noticeboard: If ``True``, capture a noticeboard snapshot + via the thread-cycle protocol described above. + :type noticeboard: bool + :return: ``None`` when neither flag is set; the scheduler-stats + list when only ``stats=True``; the noticeboard dict when + only ``noticeboard=True``; a :class:`WaitResult` when both + flags are set. + :rtype: Union[None, list[dict], dict[str, Any], WaitResult] + :raises TimeoutError: If quiescence is not reached within + ``timeout`` (or if the noticeboard-cycle join times out). + Unlike :func:`wait`, ``quiesce`` propagates this rather + than swallowing it -- callers who need silent best-effort + behavior should wrap the call. + :raises RuntimeError: If called from a non-primary interpreter + while pinned cowns are live (same constraint as :func:`wait`). + """ + + def when(*cowns): """Decorator to schedule a function as a behavior using given cowns. @@ -889,6 +1249,15 @@ def start(**kwargs): thread. Scheduling and release run on the caller and worker threads themselves — there is no central scheduler thread. 
+ Idempotent: if the runtime is already up, returns silently. A + follow-up :func:`start` from a sibling code path in the same + process is a no-op rather than an error, which makes "ensure + the runtime is live before I :func:`notice_write`" usable as a + one-liner without try/except scaffolding. Arguments supplied to + a short-circuited call are **ignored**; callers who need a + different ``worker_count`` or ``module`` must :func:`wait` / + :func:`stop` the existing runtime first. + :param worker_count: The number of worker interpreters to start. If ``None``, defaults to the number of available cores minus one. :type worker_count: Optional[int] @@ -896,6 +1265,7 @@ def start(**kwargs): export for worker import. If ``None``, the caller's module will be used. :type module: Optional[tuple[str, str]] + :raises RuntimeError: If called from a non-primary interpreter. """ diff --git a/src/bocpy/_core.c b/src/bocpy/_core.c index 7c4809e..633fab4 100644 --- a/src/bocpy/_core.c +++ b/src/bocpy/_core.c @@ -6,6 +6,7 @@ #include "boc_sched.h" #include "boc_tags.h" #include "boc_terminator.h" +#include #include // Forward declaration — BOCQueue is defined below. @@ -48,6 +49,72 @@ const char *BOC_TIMEOUT = "__timeout__"; const int BOC_CAPACITY = 1024 * 16; atomic_int_least64_t BOC_COUNT = 0; atomic_int_least64_t BOC_COWN_COUNT = 0; +// Live pinned-cown count (incremented in the PinnedCownCapsule factory, +// decremented when a pinned cown's strong refcount reaches zero). +atomic_int_least64_t PINNED_COWN_COUNT = 0; + +// Process-global queue carrying behaviours that touch one or more +// pinned cowns. Producers (worker / caller threads, via +// `boc_sched_dispatch` -> `boc_main_pinned_enqueue`) enqueue the +// prehdr's `bq_node`; the main interpreter drains it from +// `main_pump_bounded`. Initialised once per process in +// `_core_module_exec`; never destroyed (kernel objects outlive +// module unload, matching the BOC_QUEUES / terminator pattern). 
+static boc_bq_t MAIN_PINNED_QUEUE; + +// Depth of MAIN_PINNED_QUEUE. Bumped by `boc_main_pinned_enqueue`, +// decremented by the main pump as it consumes nodes. A +// `wait()`-blocked main thread observes a non-zero depth via the +// terminator condvar (woken by `terminator_wake_all`). Signed type: MSVC +// stdatomic has no unsigned variant; depth never goes negative. +static atomic_int_least64_t MAIN_PINNED_DEPTH = 0; + +// Monotonic-ns timestamp of the most recent 0 -> 1 transition of +// MAIN_PINNED_DEPTH. Used by the main pump as a fairness signal +// (oldest-pending-pinned-work age). Sampled lock-free; race-tolerant. +static atomic_int_least64_t MAIN_PINNED_NONEMPTY_SINCE_NS = 0; + +// Thread-local re-entry flag for main_pump_bounded. Set true at the +// start of each pinned-body iteration and cleared in the per-iteration +// cleanup block, so a nested pump() called from inside a pinned body +// observes true at gate 3 and is rejected. Thread-local + plain bool +// (no atomicity needed): only the owning thread reads or writes it. +static thread_local bool IN_PUMP_BODY = false; + +// Monotonic-ns timestamp of the most recent pump iteration completion. +// Sampled lock-free; race-tolerant. The pump updates it after each +// iteration so the watchdog can compare against +// MAIN_PINNED_NONEMPTY_SINCE_NS. +static atomic_int_least64_t LAST_PUMP_NS = 0; + +#ifdef Py_GIL_DISABLED +// Free-threaded build only: ID (thrd_current() cast to uintptr_t) of +// the thread that currently owns the main pump CAS. Zero when idle. +// Gate 2 of main_pump_bounded CAS-acquires this so two distinct +// threads cannot pump concurrently. Same-thread re-entry leaves the +// CAS owned by the outer frame; gate 3's IN_PUMP_BODY check rejects +// the nested call. intptr_t (not uintptr_t) for MSVC: thread-id bits round-trip +// bit-cast losslessly. +static atomic_intptr_t MAIN_PUMP_THREAD = 0; +#endif + +// Pump-starvation watchdog config (set via _core.set_pump_watchdog). 
+// Both atomics default to "disabled": the watchdog produces no +// output until the user opts in by calling set_pump_watchdog(). +// Read on the hot path (boc_main_pinned_check_warn runs on pump +// entry), so the atomics are deliberately the cheapest shape -- +// relaxed reads against ms-resolution counters. 0 means disabled +// for WATCHDOG_WARN_MS. WATCHDOG_ON_STARVE is only touched from +// the main interpreter (set_pump_watchdog refuses non-main; the +// warn callback fires on pump entry which is also main-only); the +// atomic_intptr_t is for store/load visibility; intptr_t (not uintptr_t) for +// MSVC, PyObject* round-trips losslessly. +static atomic_int_least64_t WATCHDOG_WARN_MS = 0; +static atomic_intptr_t WATCHDOG_ON_STARVE = 0; +// Monotonic-ns timestamp of the most recent warn log emission, used +// to rate-limit the warn channel so a slow pump does not flood logs +// once per call. 0 = never warned in this NONEMPTY_SINCE epoch. +static atomic_int_least64_t WATCHDOG_LAST_WARN_NS = 0; #define BOC_SPIN_COUNT 64 #define BOC_BACKOFF_CAP_NS 1000000 // 1 ms @@ -948,6 +1015,32 @@ static PyObject *_core_terminator_wait(PyObject *self, PyObject *args) { Py_RETURN_FALSE; } +/// @brief Pumpable variant of @ref _core_terminator_wait. +/// @details Forwards to @ref terminator_wait_pumpable, supplying the +/// @ref _core_pinned_depth_load reader. Returns one of the three +/// integer constants exposed on the module (`TERMINATED`, +/// `PUMP_READY`, `WAIT_TIMED_OUT`). Releases the GIL across the wait. +/// @param self The module (unused) +/// @param timeout_obj A Python float — seconds to wait. Non-positive +/// performs a non-blocking poll. +/// @return Python int — one of the wake-reason sentinels. 
+static uint64_t _core_pinned_depth_load(void); +static PyObject *_core_terminator_wait_pumpable(PyObject *self, + PyObject *timeout_obj) { + BOC_STATE_SET(self); + double timeout_s = PyFloat_AsDouble(timeout_obj); + if (timeout_s == -1.0 && PyErr_Occurred()) { + return NULL; + } + + boc_terminator_wake_reason_t reason; + Py_BEGIN_ALLOW_THREADS reason = + terminator_wait_pumpable(timeout_s, _core_pinned_depth_load); + Py_END_ALLOW_THREADS + + return PyLong_FromLong((long)reason); +} + /// @brief Idempotent one-shot decrement of the Pyrona seed. /// @details Called by stop()/wait() to remove the seed that keeps the /// terminator count above zero across momentary quiescence. Safe to call @@ -971,6 +1064,26 @@ static PyObject *_core_terminator_seed_dec(PyObject *self, Py_RETURN_FALSE; } +/// @brief Idempotent one-shot re-arm of the Pyrona seed. +/// @param self The module (unused) +/// @param args Unused +/// @return Python bool — True if this call restored the seed, False if +/// the seed was already present. +static PyObject *_core_terminator_seed_inc(PyObject *self, + PyObject *Py_UNUSED(args)) { + BOC_STATE_SET(self); + if (BOC_STATE->index != 0) { + PyErr_SetString(PyExc_RuntimeError, + "terminator_seed_inc must be called from the primary " + "interpreter"); + return NULL; + } + if (terminator_seed_inc()) { + Py_RETURN_TRUE; + } + Py_RETURN_FALSE; +} + /// @brief Restore terminator state for a fresh runtime start. /// @details Sets count=1 (the Pyrona seed), clears the closed bit, and /// re-arms the seed one-shot. Called from Behaviors.start(). Returns @@ -983,6 +1096,7 @@ static PyObject *_core_terminator_seed_dec(PyObject *self, /// @return A 2-tuple @c (prior_count, prior_seeded). 
static PyObject *_core_terminator_reset(PyObject *self, PyObject *Py_UNUSED(args)) { + PRINTDBG("_core_terminator_reset\n"); BOC_STATE_SET(self); if (BOC_STATE->index != 0) { PyErr_SetString(PyExc_RuntimeError, @@ -993,6 +1107,17 @@ static PyObject *_core_terminator_reset(PyObject *self, int_least64_t prior_count = 0; int_least64_t prior_seeded = 0; terminator_reset(&prior_count, &prior_seeded); + // Pump-watchdog state: depth and timestamps must return to the + // depth==0 baseline so the watchdog does not carry stale "queue + // has been non-empty since X" / "last warn at Y" readings into + // the next run. MAIN_PINNED_DEPTH is reset by the drain in + // stop_workers; the rest live here. + atomic_store(&MAIN_PINNED_NONEMPTY_SINCE_NS, 0); + atomic_store(&WATCHDOG_LAST_WARN_NS, 0); + atomic_store(&LAST_PUMP_NS, 0); +#ifdef Py_GIL_DISABLED + atomic_store_intptr(&MAIN_PUMP_THREAD, (intptr_t)0); +#endif return Py_BuildValue("(LL)", (long long)prior_count, (long long)prior_seeded); } @@ -1026,6 +1151,15 @@ typedef struct boc_cown { bool pickled; /// @brief Whether the cown holds an exception object bool exception; + /// @brief Whether this cown is pinned to the main interpreter. + /// @details Set permanently by the @c PinnedCownCapsule constructor; + /// never modified afterwards. Adjacent to @c pickled / @c exception + /// so all three bools occupy the same alignment-fill slot and the + /// struct size is unchanged. Read (relaxed) by @c cown_acquire / + /// @c cown_release / @c cown_decref_inline / + /// @c report_unhandled_exception / @c cown_disown to short-circuit + /// the owner CAS and skip the XIData round-trip for pinned cowns. 
+ bool is_pinned; /// @brief the threadsafe serialized cown contents XIDATA_T *xidata; /// @brief the module which last released this cown @@ -1139,14 +1273,80 @@ static inline int_least64_t cown_decref_inline(BOCCown *cown) { // we can clear the object and recycle the xidata if (cown->value != NULL) { - assert(cown->owner == bocpy_interpid()); - Py_CLEAR(cown->value); + if (cown->is_pinned) { + // Pinned cowns hold a main-interpreter PyObject* in + // ``cown->value``. Running ``Py_CLEAR`` from a worker that + // happens to drop the last handle would invoke the value's + // destructor on the wrong interpreter — undefined behaviour + // under PEP 684 and ``Py_GIL_DISABLED``. The safe ship choice + // is a controlled leak: skip the clear here, surface the leak + // via ``PyErr_WriteUnraisable`` so it is at least observable + // in test runs, and let the main interpreter's process-exit + // reclaim the bytes. Callers that want zero leaks must keep + // pinned-cown handles on main. + if (bocpy_interpid() != bocpy_main_interpid()) { + // Preserve any pending exception on the calling thread: + // ``PyErr_WriteUnraisable(NULL)`` writes-and-clears the + // error indicator, so a caller mid-unwind would silently + // lose its in-flight exception when this leak path fires + // (e.g. interpreter shutdown that raced a worker dropping + // the last handle). Fetch around the format/write pair and + // restore on the way out. 
+ PyObject *prev_exc_type, *prev_exc_val, *prev_exc_tb; + PyErr_Fetch(&prev_exc_type, &prev_exc_val, &prev_exc_tb); + PyErr_Format(PyExc_RuntimeError, + "leaking pinned cown %p value: last handle " + "dropped on a non-main interpreter (interp=%" PRIdLEAST64 + "); call Py_DECREF on the " + "originating main interpreter to free the " + "underlying PyObject*", + (void *)cown, bocpy_interpid()); + PyErr_WriteUnraisable(NULL); + PyErr_Restore(prev_exc_type, prev_exc_val, prev_exc_tb); + } else { + Py_CLEAR(cown->value); + } + } else { + assert(cown->owner == bocpy_interpid()); + Py_CLEAR(cown->value); + } } if (cown->xidata != NULL) { + assert(!cown->is_pinned); + + // Deserialize-and-drop encoded xidata so embedded CownCapsule INCREFs + // balance on orphan death (CWE-401): CownCapsule_reduce takes an + // inheriting COWN_INCREF per embedded BOCCown that is normally + // consumed when the bytes are unpickled; running pickle.loads + DECREF + // here lets CPython's GC fire the matching COWN_DECREFs recursively. + // Gated on `pickled` because native XIData round-trips (e.g. Matrix) + // cannot embed CownCapsule and would just waste a pickle round-trip. + if (cown->pickled) { + // Preserve any in-flight error across the deserialize; + // PyErr_WriteUnraisable below clears it otherwise. + PyObject *prev_exc_type, *prev_exc_val, *prev_exc_tb; + PyErr_Fetch(&prev_exc_type, &prev_exc_val, &prev_exc_tb); + + PyObject *drained = xidata_to_object(cown->xidata, true); + if (drained == NULL) { + // Partial unpickle: already-decoded capsules unwind cleanly; tail-end + // opcodes leak, same as cown_acquire on failure. Surface as unraisable. 
+ PyErr_WriteUnraisable(NULL); + } else { + Py_DECREF(drained); + } + + PyErr_Restore(prev_exc_type, prev_exc_val, prev_exc_tb); + } + BOCRecycleQueue_enqueue(cown->recycle_queue, cown->xidata); } + if (cown->is_pinned) { + atomic_fetch_sub(&PINNED_COWN_COUNT, 1); + } + cown_weak_decref(cown); return 0; @@ -1234,6 +1434,7 @@ static BOCCown *BOCCown_new(PyObject *value) { cown->xidata = NULL; cown->pickled = false; cown->exception = false; + cown->is_pinned = false; atomic_store_intptr(&cown->last, 0); // each cown starts with both a strong and weak reference // the weak reference will only be decremented when the strong @@ -1611,6 +1812,31 @@ static PyObject *CownCapsule_acquired(PyObject *op, /// @param cown The cown to acquire /// @return -1 if failure, 0 if success static int cown_acquire(BOCCown *cown) { + if (cown->is_pinned) { + // Pinned cowns are permanently owned by main and never serialised: + // the structural owner-CAS would also reject worker callers (their + // bocpy_interpid() never matches BOCPY_NO_OWNER on a pinned cown), + // but the short-circuit avoids a wasted CAS and the xidata-NULL + // poisoning guard further down. + // + // The interpreter check is a runtime guard, not an assert: pinned + // acquire must never run off-main (it would race the main pump on + // ``cown->value`` without the MCS ordering protecting unpinned + // cowns). Release builds promote this to a hard ``RuntimeError`` + // so a structural bug surfaces deterministically instead of + // silently corrupting state. 
+ if (bocpy_interpid() != bocpy_main_interpid()) { + PyErr_Format(PyExc_RuntimeError, + "cannot acquire pinned cown %p from non-main " + "interpreter (interp=%" PRIdLEAST64 "); pinned " + "cowns are owned by the main interpreter and " + "acquired only by the main pump", + (void *)cown, bocpy_interpid()); + return -1; + } + assert(cown->owner == bocpy_main_interpid()); + return 0; + } int_least64_t expected = BOCPY_NO_OWNER; int_least64_t desired = bocpy_interpid(); if (!atomic_compare_exchange_strong(&cown->owner, &expected, desired)) { @@ -1712,6 +1938,24 @@ static PyObject *CownCapsule_acquire(PyObject *op, PyObject *Py_UNUSED(dummy)) { /// @param cown The cown to release /// @return -1 if error, 0 otherwise static int cown_release(BOCCown *cown) { + if (cown->is_pinned) { + // Pinned cowns never serialise out of main; release is a no-op so + // the value stays resident in main and the owner stays == main_id. + // + // Mirror the runtime guard in ``cown_acquire``: a release coming + // from a non-main interpreter is a structural bug, surface it as + // a hard ``RuntimeError`` instead of relying on debug asserts. + if (bocpy_interpid() != bocpy_main_interpid()) { + PyErr_Format(PyExc_RuntimeError, + "cannot release pinned cown %p from non-main " + "interpreter (interp=%" PRIdLEAST64 "); pinned " + "cowns are released only by the main pump", + (void *)cown, bocpy_interpid()); + return -1; + } + assert(cown->owner == bocpy_main_interpid()); + return 0; + } int_least64_t expected = bocpy_interpid(); int_least64_t owner = atomic_load(&cown->owner); if (owner != expected) { @@ -1779,6 +2023,15 @@ static PyObject *CownCapsule_release(PyObject *op, PyObject *Py_UNUSED(dummy)) { /// @param cown The cown to disown /// @return -1 if error, 0 otherwise static int cown_disown(BOCCown *cown) { + if (cown->is_pinned) { + // Defense-in-depth. Pinned cowns must never be disowned: the + // value lives permanently on main. 
The existing owner-CAS below + // already rejects worker callers because owner is permanently + // main_id, but a direct main-thread call would otherwise drop the + // value. Treat as a successful no-op. + assert(cown->owner == bocpy_main_interpid()); + return 0; + } int_least64_t expected = bocpy_interpid(); int_least64_t owner = atomic_load(&cown->owner); if (owner != expected) { @@ -2108,6 +2361,180 @@ BOCCown *cown_unwrap(PyObject *op) { return self->cown; } +/// @brief Module-level factory: allocate a pinned CownCapsule. +/// @details Refuses to run outside the main interpreter; the +/// pinned-ownership invariant requires owner == main_id +/// permanently, which only main can establish. The returned capsule +/// wraps a BOCCown with @c is_pinned set, value non-NULL, xidata +/// NULL, and owner == main_id, so it is "born acquired" by main and +/// every subsequent acquire/release on main is a no-op. +/// @param self The module (unused) +/// @param value The Python object to pin into the cown +/// @return A new CownCapsule object, or NULL on error +static PyObject *_core_pinned_cown_capsule(PyObject *self, PyObject *value) { + if (bocpy_interpid() != bocpy_main_interpid()) { + PyErr_SetString(PyExc_RuntimeError, + "PinnedCown must be constructed on the main interpreter"); + return NULL; + } + + BOC_STATE_SET(self); + PyTypeObject *type = BOC_STATE->cown_capsule_type; + CownCapsuleObject *capsule = (CownCapsuleObject *)type->tp_alloc(type, 0); + if (capsule == NULL) { + return NULL; + } + capsule->cown = NULL; + + BOCCown *cown = BOCCown_new(value); + if (cown == NULL) { + Py_DECREF((PyObject *)capsule); + return NULL; + } + + cown->is_pinned = true; + assert(cown->owner == bocpy_main_interpid()); + atomic_fetch_add(&PINNED_COWN_COUNT, 1); + capsule->cown = cown; + + PRINTDBG("PinnedCownCapsule(%p, cown=%p, cid=%" PRIdLEAST64 ", value=", + capsule, cown, cown->id); + PRINTOBJDBG(value); + PRINTFDBG(")\n"); + + return (PyObject *)capsule; +} + +/// @brief 
Module-level accessor: report whether a capsule is pinned. +/// @param self The module (unused) +/// @param op The CownCapsule (or any object with .impl) to inspect +/// @return Py_True if pinned, Py_False otherwise, or NULL on error +static PyObject *_core_cown_is_pinned(PyObject *Py_UNUSED(self), PyObject *op) { + BOCCown *cown = cown_unwrap(op); + if (cown == NULL) { + return NULL; + } + if (cown->is_pinned) { + Py_RETURN_TRUE; + } + Py_RETURN_FALSE; +} + +/// @brief Module-level accessor: report the live pinned-cown count. +/// @param self The module (unused) +/// @param args Unused +/// @return Python int -- current value of PINNED_COWN_COUNT +static PyObject *_core_pinned_cown_count(PyObject *Py_UNUSED(self), + PyObject *Py_UNUSED(args)) { + return PyLong_FromLongLong((long long)atomic_load(&PINNED_COWN_COUNT)); +} + +/// @brief Enqueue a pinned-bearing behaviour on the main-pinned queue. +/// @details Invoked from `boc_sched_dispatch` when a behaviour's +/// prehdr has the pinned byte set. The scheduler is layout-blind: it +/// reads `pinned` via @ref boc_behavior_node_is_pinned and then hands +/// the node to this function, which owns the actual queue. On the +/// `0 -> 1` depth transition we stamp +/// @ref MAIN_PINNED_NONEMPTY_SINCE_NS and broadcast on the terminator +/// condvar so a `wait()`-blocked main thread wakes to drive the pump. +/// +/// There is intentionally no `queue_max`-style cap: by construction +/// `MAIN_PINNED_DEPTH` cannot exceed the live pinned-cown count +/// (chained pinned `@when`s park in MCS rather than enqueuing), and +/// pinned cowns are only constructed on main -- a worker cannot +/// flood the queue. Run-away depth therefore signals a host-side +/// allocation bug (which is easier to debug at the `PinnedCown(...)` +/// call site than via an opaque library-internal cap). The same +/// invariant is why there is no raise-side back-pressure gate here: +/// only `warn_ms` diagnostics are warranted. 
+/// @param n The prehdr's `bq_node` (already populated, refcount +/// already incremented by the producer for queue ownership). +/// @return 0 on success. Cannot fail. +int boc_main_pinned_enqueue(boc_bq_node_t *n) { + boc_bq_enqueue(&MAIN_PINNED_QUEUE, n); + uint64_t prev = atomic_fetch_add(&MAIN_PINNED_DEPTH, 1); + if (prev == 0) { + atomic_store(&MAIN_PINNED_NONEMPTY_SINCE_NS, boc_now_ns()); + } + // Wake any wait()-blocked main thread so it can re-evaluate (the + // main pump will drain MAIN_PINNED_QUEUE before re-blocking). + terminator_wake_all(); + return 0; +} + +/// @brief Module-level accessor: report the current main-pinned queue +/// depth. +/// @details Test/diagnostic only. +/// @param self The module (unused) +/// @param args Unused +/// @return Python int -- current value of MAIN_PINNED_DEPTH. +static PyObject *_core_main_pump_queue_depth(PyObject *Py_UNUSED(self), + PyObject *Py_UNUSED(args)) { + return PyLong_FromUnsignedLongLong( + (unsigned long long)atomic_load(&MAIN_PINNED_DEPTH)); +} + +/// @brief Lock-free reader handed to @ref terminator_wait_pumpable. +/// @details Function-pointer indirection so `boc_terminator.c` does +/// not need to see the file-scope `MAIN_PINNED_DEPTH` atomic. +static uint64_t _core_pinned_depth_load(void) { + return (uint64_t)atomic_load(&MAIN_PINNED_DEPTH); +} + +/// @brief Configure the pump-starvation watchdog. +/// @details Refuses non-main callers (the watchdog state is process- +/// global but only meaningful on the pumping interpreter). `warn_ms` +/// is a positive Python int (warn threshold) or @c None (disabled); +/// `on_starve` is a callable or @c None. Defaults restore the +/// "disabled, no callback" state. Stores are atomic so the hot-path +/// reader (`boc_main_pinned_check_warn` on pump entry) sees a +/// consistent value without a lock.
+/// @param self The module (unused) +/// @param args Positional fallback for keyword args +/// @param kwargs `warn_ms`, `on_starve` +/// @return @c None on success; raises @c TypeError / @c RuntimeError +/// otherwise. +static PyObject *_core_set_pump_watchdog(PyObject *self, PyObject *args, + PyObject *kwargs) { + BOC_STATE_SET(self); + if (BOC_STATE->index != 0) { + PyErr_SetString(PyExc_RuntimeError, + "set_pump_watchdog() must be called from the main " + "interpreter"); + return NULL; + } + static char *kwlist[] = {"warn_ms", "on_starve", NULL}; + PyObject *warn_obj = Py_None; + PyObject *on_starve = Py_None; + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|OO", kwlist, &warn_obj, + &on_starve)) { + return NULL; + } + uint64_t warn_ms = 0; + if (warn_obj != Py_None) { + long long v = PyLong_AsLongLong(warn_obj); + if (v < 0 || (v == -1 && PyErr_Occurred())) { + PyErr_SetString(PyExc_TypeError, + "warn_ms must be a positive int or None"); + return NULL; + } + warn_ms = (uint64_t)v; + } + if (on_starve != Py_None && !PyCallable_Check(on_starve)) { + PyErr_SetString(PyExc_TypeError, "on_starve must be a callable or None"); + return NULL; + } + atomic_store(&WATCHDOG_WARN_MS, warn_ms); + // Swap the callback under refcount discipline. The previous slot + // is decref'd after the store so the new readers see the new + // callback before the old one can run to zero. + PyObject *new_cb = (on_starve == Py_None) ? NULL : Py_NewRef(on_starve); + PyObject *prev = + (PyObject *)atomic_exchange_intptr(&WATCHDOG_ON_STARVE, (intptr_t)new_cb); + Py_XDECREF(prev); + Py_RETURN_NONE; +} + static PyObject *BOCRecycleQueue_promote_cowns(BOCRecycleQueue *queue) { if (queue->xidata_to_cowns == NULL) { PyErr_Format(PyExc_RuntimeError, @@ -3171,17 +3598,6 @@ typedef struct behavior_s { struct boc_request **requests; /// @brief Number of entries in @c requests (post-dedup, ≤ args_size + 1). 
Py_ssize_t requests_size; - /// @brief Intrusive link node for the Verona-style behaviour MPMC - /// queue (`boc_bq_*` API in `sched.{h,c}`). - /// @details Ports `verona-rt/src/rt/sched/work.h::Work::next_in_queue`. - /// Initialised to NULL in @c behavior_new under the GIL, before the - /// behaviour can be reached from any other thread (preserves the - /// link-loop infallibility invariant). Hooked into the `boc_bq_*` - /// enqueue/dequeue path by `behavior_resolve_one` and - /// `request_release_inner`. Placement at struct end is - /// `pahole`-driven to keep the hot fields on their existing cache - /// lines. - boc_bq_node_t bq_node; /// @brief Fairness-token discriminator. /// @details 0 for ordinary behaviours; 1 for the per-worker /// @c token_work sentinel allocated by @@ -3190,9 +3606,7 @@ typedef struct behavior_s { /// flips @c should_steal_for_fairness on the popping worker and /// re-enqueues the token instead of calling @c run_behavior. /// Verona equivalent: @c Core::token_work + @c is_token discriminator - /// (`verona-rt/src/rt/sched/core.h:22-37`). Trailing position keeps - /// the hot fields (count, rc, thunk) on their existing cache lines; - /// the byte costs an 8-byte tail pad on x86_64. + /// (`verona-rt/src/rt/sched/core.h:22-37`). uint8_t is_token; /// @brief Index of the worker that owns this fairness token (or /// @c -1 for ordinary behaviours). @@ -3209,12 +3623,26 @@ typedef struct behavior_s { /// flag, not the running thread's. /// /// Width: @c int16_t. Sized to comfortably exceed any plausible - /// worker count (≤32767) while preserving the existing 8-byte - /// trailing pad with @c is_token; struct size is unchanged from - /// the original @c int8_t encoding (verified by pahole). + /// worker count (≤32767). int16_t owner_worker_index; } BOCBehavior; +// Layout note. 
The intrusive queue link (`bq_node`) and the OR-fold +// `pinned` byte live in a scheduler-owned `boc_behavior_prehdr_t` +// allocated immediately before each BOCBehavior — CPython +// `_PyGC_Head` / `_Py_AS_GC()` style. See `boc_sched.h` for the +// prehdr definition and the `BOC_BEHAVIOR_PREHDR(b)` recovery macro. +// `behavior_new` / `behavior_free` / the token allocator below own +// the combined allocation; the rest of `_core.c` keeps treating +// BOCBehavior as an ordinary pointer-indirect struct. + +// Recover a BOCBehavior pointer from the prehdr's bq_node. Inverse +// of `BOC_BEHAVIOR_PREHDR`; used by the worker pop sites that pull +// nodes out of the scheduler queues. `bq_node` sits at offset 0 of +// the prehdr, so the cast IS the container_of (no offsetof needed). +#define BEHAVIOR_FROM_PREHDR_NODE(node) \ + ((BOCBehavior *)(((boc_behavior_prehdr_t *)(node)) + 1)) + /// @brief Capsule for holding a pointer to a behavior typedef struct behavior_capsule_object { PyObject_HEAD BOCBehavior *behavior; @@ -3223,31 +3651,33 @@ typedef struct behavior_capsule_object { #define BehaviorCapsule_CheckExact(op) \ Py_IS_TYPE((op), BOC_STATE->behavior_capsule_type) -/// @brief Recover the enclosing @c BOCBehavior from its embedded -/// @c bq_node. -/// @details The dispatch path moves @c BOCBehavior * pointers -/// through the scheduler queue indirectly: the producer hands -/// @c &behavior->bq_node to @ref boc_sched_dispatch, the consumer -/// pops a @c boc_bq_node_t * back, and this macro reverses the -/// embedding offset to recover the owning @c BOCBehavior. Equivalent -/// to the kernel's @c container_of pattern; @c offsetof is the -/// portable C11 idiom. -#define BEHAVIOR_FROM_BQ_NODE(node_ptr) \ - ((BOCBehavior *)((char *)(node_ptr) - offsetof(BOCBehavior, bq_node))) - // Forward declaration: defined alongside the request helpers further down. 
// behavior_free uses it to clean up any unreleased request array if a // behavior is destroyed without going through behavior_release_all. static void request_decref(BOCRequest *request); BOCBehavior *behavior_new() { - BOCBehavior *behavior; - behavior = (BOCBehavior *)PyMem_RawMalloc(sizeof(BOCBehavior)); - if (behavior == NULL) { + // Combined `prehdr + BOCBehavior` allocation per the pre-header + // scheme (see `boc_sched.h` and the layout note above the struct). + // The returned pointer is past the prehdr so all existing + // `BOCBehavior *` consumers stay unchanged; `behavior_free` + // recovers the allocation origin via `BOC_BEHAVIOR_PREHDR(b)`. + void *raw = + PyMem_RawMalloc(sizeof(boc_behavior_prehdr_t) + sizeof(BOCBehavior)); + if (raw == NULL) { PyErr_NoMemory(); return NULL; } + boc_behavior_prehdr_t *prehdr = (boc_behavior_prehdr_t *)raw; + // Zero the prehdr in full — `pinned`, `_reserved`, and the + // intrusive link's `next_in_queue` (the boc_bq_* enqueue path + // requires this field to start NULL, and we are still under the + // GIL before the behaviour can be reached from any other thread). + memset(prehdr, 0, sizeof(*prehdr)); + + BOCBehavior *behavior = (BOCBehavior *)(prehdr + 1); + behavior->id = atomic_fetch_add(&BOC_BEHAVIOR_COUNT, 1); behavior->thunk = NULL; behavior->result = NULL; @@ -3259,11 +3689,6 @@ BOCBehavior *behavior_new() { behavior->captures = NULL; behavior->requests = NULL; behavior->requests_size = 0; - // Init the boc_bq link before the behaviour becomes reachable from - // any other thread (we are still under the GIL here). The boc_bq_* - // enqueue path requires this field to start NULL. - boc_atomic_store_ptr_explicit(&behavior->bq_node.next_in_queue, NULL, - BOC_MO_RELAXED); // Ordinary behaviours are not fairness tokens. Token allocation // is performed directly in `_core_scheduler_runtime_start` and // bypasses `behavior_new`. 
@@ -3327,7 +3752,9 @@ void behavior_free(BOCBehavior *behavior) { BOCTag_free(behavior->thunk); } - PyMem_RawFree(behavior); + // Free at the combined allocation's origin (one slot before the + // BOCBehavior, per the pre-header scheme). + PyMem_RawFree(BOC_BEHAVIOR_PREHDR(behavior)); BOC_REF_TRACKING_REMOVE_BEHAVIOR(); } @@ -3504,6 +3931,7 @@ static int BehaviorCapsule_init(PyObject *op, PyObject *args, PyErr_NoMemory(); return -1; } + uint8_t pinned = 0; for (Py_ssize_t i = 0; i < args_size; ++i) { PyObject *item = PySequence_Fast_GET_ITEM(cowns_list_fast, i); int group_id; @@ -3515,11 +3943,19 @@ static int BehaviorCapsule_init(PyObject *op, PyObject *args, } behavior->group_ids[i] = group_id; + pinned |= ((CownCapsuleObject *)cown)->cown->is_pinned; PyTuple_SET_ITEM(cowns, i, Py_NewRef(cown)); } Py_DECREF(cowns_list_fast); + // Publish the OR-fold to the prehdr. The scheduler reads this via + // `boc_behavior_node_is_pinned` from `boc_sched_dispatch` to route + // pinned-touching behaviours onto MAIN_PINNED_QUEUE instead of a + // worker WSQ. Token behaviours never reach this path (their prehdr + // stays zero-initialised by `_core_scheduler_runtime_start`). + BOC_BEHAVIOR_PREHDR(behavior)->pinned = pinned; + behavior->args = add_vars(cowns, &behavior->args_size); Py_DECREF(cowns); if (behavior->args == NULL) { @@ -3547,8 +3983,9 @@ static int BehaviorCapsule_init(PyObject *op, PyObject *args, /// @details Called when a request is at the head of the queue for a /// particular cown. If this is the last request (count -> 0) the thunk /// is dispatched: the unique caller that observes the transition takes -/// a queue-owned reference via @c BEHAVIOR_INCREF and hands -/// @c &behavior->bq_node to @ref boc_sched_dispatch. The matching +/// a queue-owned reference via @c BEHAVIOR_INCREF and hands the prehdr's +/// @c bq_node (recovered via @c BOC_BEHAVIOR_PREHDR) to +/// @ref boc_sched_dispatch. 
The matching /// @c BEHAVIOR_DECREF runs when the consumer's freshly allocated /// @c BehaviorCapsule (built by @c _core.scheduler_worker_pop) is /// deallocated on the worker side. @@ -3591,7 +4028,7 @@ static int behavior_resolve_one(BOCBehavior *behavior) { int_least64_t count = atomic_fetch_add(&behavior->count, -1) - 1; if (count == 0) { BEHAVIOR_INCREF(behavior); - if (boc_sched_dispatch(&behavior->bq_node) < 0) { + if (boc_sched_dispatch(&BOC_BEHAVIOR_PREHDR(behavior)->bq_node) < 0) { // Roll back the queue-owned reference we just took. The // dispatch failure means no consumer will ever see this // behavior, so no DECREF will fire from the worker side. @@ -3732,20 +4169,9 @@ static PyObject *BehaviorCapsule_create_requests(PyObject *op, return list; } -/// @brief Release every request the behavior owns and free the array. -/// @details Walks @c behavior->requests, calling @c request_release_inner -/// (MCS unlink + handoff to next behavior) on each, then frees the -/// per-request structs and the array itself. Invoked by the worker's -/// release arm in place of the per-request Python @c Request.release loop. -/// @param op The BehaviorCapsule whose requests should be released -/// @return Py_None on success, NULL on error -static PyObject *BehaviorCapsule_release_all(PyObject *op, - PyObject *Py_UNUSED(dummy)) { - BehaviorCapsuleObject *capsule = (BehaviorCapsuleObject *)op; - BOCBehavior *behavior = capsule->behavior; - +static int behavior_release_all_impl(BOCBehavior *behavior) { if (behavior->requests == NULL) { - Py_RETURN_NONE; + return 0; } // Detach the array from the behavior up front so behavior_free's @@ -3762,12 +4188,28 @@ static PyObject *BehaviorCapsule_release_all(PyObject *op, request_decref(requests[k]); } PyMem_RawFree(requests); - return NULL; + return -1; } request_decref(requests[i]); } PyMem_RawFree(requests); + return 0; +} + +/// @brief Release every request the behavior owns and free the array. 
+/// @details Walks @c behavior->requests, calling @c request_release_inner +/// (MCS unlink + handoff to next behavior) on each, then frees the +/// per-request structs and the array itself. Invoked by the worker's +/// release arm in place of the per-request Python @c Request.release loop. +/// @param op The BehaviorCapsule whose requests should be released +/// @return Py_None on success, NULL on error +static PyObject *BehaviorCapsule_release_all(PyObject *op, + PyObject *Py_UNUSED(dummy)) { + BehaviorCapsuleObject *capsule = (BehaviorCapsuleObject *)op; + if (behavior_release_all_impl(capsule->behavior) < 0) { + return NULL; + } Py_RETURN_NONE; } @@ -3779,7 +4221,7 @@ static PyObject *BehaviorCapsule_release_all(PyObject *op, /// @c Behavior.schedule() collapses to a single call to this function. /// Dispatch itself (the count → 0 transition in /// @ref behavior_resolve_one) is allocation-free and infallible: -/// @ref boc_sched_dispatch enqueues @c &behavior->bq_node directly +/// @ref boc_sched_dispatch enqueues the prehdr's @c bq_node directly /// onto a worker's per-task queue, so there is nothing to pre-build. /// @param op The BehaviorCapsule to schedule /// @return Py_None on success, NULL on error @@ -3868,6 +4310,12 @@ static PyObject *BehaviorCapsule_schedule(PyObject *op, Py_RETURN_NONE; } +static void behavior_set_exception_impl(BOCBehavior *behavior, + PyObject *value) { + cown_set_value(behavior->result, value); + behavior->result->exception = true; +} + /// @brief Store an exception as the behavior's result /// @details Sets the result value and marks the exception flag. Intended for /// the worker exception handler. 
@@ -3882,9 +4330,7 @@ static PyObject *BehaviorCapsule_set_exception(PyObject *op, PyObject *args) { } BehaviorCapsuleObject *self = (BehaviorCapsuleObject *)op; - BOCBehavior *behavior = self->behavior; - cown_set_value(behavior->result, value); - behavior->result->exception = true; + behavior_set_exception_impl(self->behavior, value); Py_RETURN_NONE; } @@ -3934,28 +4380,33 @@ static int acquire_vars(BOCCown **vars, Py_ssize_t size) { return 0; } -/// @brief Acquire all the cowns for the behavior. -/// @param args The behavior capsule -/// @return Py_None if successful, NULL otherwise -static PyObject *BehaviorCapsule_acquire(PyObject *op, - PyObject *Py_UNUSED(dummy)) { - BehaviorCapsuleObject *self = (BehaviorCapsuleObject *)op; - BOCBehavior *behavior = self->behavior; - +static int behavior_acquire_impl(BOCBehavior *behavior) { PRINTDBG("behavior_acquire(%" PRIdLEAST64 ")\n", behavior->id); if (cown_acquire(behavior->result) < 0) { - return NULL; + return -1; } if (acquire_vars(behavior->args, behavior->args_size) < 0) { - return NULL; + return -1; } if (acquire_vars(behavior->captures, behavior->captures_size) < 0) { - return NULL; + return -1; } + return 0; +} + +/// @brief Acquire all the cowns for the behavior. +/// @param op The behavior capsule +/// @return Py_None if successful, NULL otherwise +static PyObject *BehaviorCapsule_acquire(PyObject *op, + PyObject *Py_UNUSED(dummy)) { + BehaviorCapsuleObject *self = (BehaviorCapsuleObject *)op; + if (behavior_acquire_impl(self->behavior) < 0) { + return NULL; + } Py_RETURN_NONE; } @@ -3970,47 +4421,51 @@ static int release_vars(BOCCown **vars, Py_ssize_t size) { return 0; } -/// @brief Release the cowns for this behavior. 
-/// @param args The behavior capsule -/// @return Py_None if successful, NULL otherwise -static PyObject *BehaviorCapsule_release(PyObject *op, - PyObject *Py_UNUSED(dummy)) { - BehaviorCapsuleObject *self = (BehaviorCapsuleObject *)op; - BOCBehavior *behavior = self->behavior; - +static int behavior_release_impl(BOCBehavior *behavior) { PRINTDBG("behavior_release(%" PRIdLEAST64 ")\n", behavior->id); if (cown_release(behavior->result) < 0) { - return NULL; + return -1; } if (release_vars(behavior->args, behavior->args_size) < 0) { - return NULL; + return -1; } if (release_vars(behavior->captures, behavior->captures_size) < 0) { - return NULL; + return -1; } - Py_RETURN_NONE; + return 0; } -/// @brief Executes the thunk on the behavior. -/// @details Before this function can be called, all of the cowns for the -/// behavior must be acquired. -/// @param args The Behavior, and the object or module which contains the named -/// thunk function. -/// @return The result of calling the thunk -static PyObject *BehaviorCapsule_execute(PyObject *op, PyObject *args) { - PyObject *boc_export = NULL; - - if (!PyArg_ParseTuple(args, "O", &boc_export)) { +/// @brief Release the cowns for this behavior. +/// @param op The behavior capsule +/// @return Py_None if successful, NULL otherwise +static PyObject *BehaviorCapsule_release(PyObject *op, + PyObject *Py_UNUSED(dummy)) { + BehaviorCapsuleObject *self = (BehaviorCapsuleObject *)op; + if (behavior_release_impl(self->behavior) < 0) { return NULL; } + Py_RETURN_NONE; +} - BehaviorCapsuleObject *self = (BehaviorCapsuleObject *)op; - BOCBehavior *behavior = self->behavior; - +/// @brief C-level worker for the thunk execution path. +/// @details Builds the thunk argument tuple, looks up the named thunk on +/// @p boc_export, invokes it, and stashes the result on @c behavior->result. 
+/// Any raised exception (:class:`Exception` or :class:`BaseException`) is +/// captured into the result cown with @c exception=true and is **not** +/// propagated; only deeper setup failures (allocation errors, missing +/// thunk, etc.) return NULL with PyErr set. Callers that need to fork on +/// Exception vs BaseException must inspect the result cown's value type +/// after the call returns. +/// @param behavior The behavior whose thunk to invoke +/// @param boc_export Namespace (module or object) carrying the thunk +/// @return New reference to the stored result value, NULL on setup +/// failure +static PyObject *behavior_execute_impl(BOCBehavior *behavior, + PyObject *boc_export) { size_t num_groups = 0; if (behavior->args_size > 0) { num_groups = abs(behavior->group_ids[behavior->args_size - 1]); @@ -4090,6 +4545,7 @@ static PyObject *BehaviorCapsule_execute(PyObject *op, PyObject *args) { PyObject *thunk = PyObject_GetAttrString(boc_export, behavior->thunk->str); if (thunk == NULL) { + Py_DECREF(thunk_args); return NULL; } @@ -4128,7 +4584,25 @@ static PyObject *BehaviorCapsule_execute(PyObject *op, PyObject *args) { if (is_error) { behavior->result->exception = true; } - return behavior->result->value; + Py_XDECREF(result); + return Py_XNewRef(behavior->result->value); +} + +/// @brief Executes the thunk on the behavior. +/// @details Before this function can be called, all of the cowns for the +/// behavior must be acquired. +/// @param args The Behavior, and the object or module which contains the named +/// thunk function. 
+/// @return The result of calling the thunk +static PyObject *BehaviorCapsule_execute(PyObject *op, PyObject *args) { + PyObject *boc_export = NULL; + + if (!PyArg_ParseTuple(args, "O", &boc_export)) { + return NULL; + } + + BehaviorCapsuleObject *self = (BehaviorCapsuleObject *)op; + return behavior_execute_impl(self->behavior, boc_export); } static PyMethodDef BehaviorCapsule_methods[] = { @@ -4158,6 +4632,451 @@ static PyType_Spec BehaviorCapsule_Spec = { .flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_IMMUTABLETYPE, .slots = BehaviorCapsule_slots}; +// --------------------------------------------------------------------------- +// main_pump_bounded: drive the main-pinned queue from the main interpreter. +// Mirrors `worker.run_behavior`'s layered try/finally so the per-iteration +// cleanup (IN_PUMP_BODY clear + release pair + terminator_dec) always runs. +// --------------------------------------------------------------------------- + +// Convert a Python deadline_ms (long or None) to an absolute monotonic-ns +// deadline. Returns 0 when no deadline (None) or on invalid input; callers +// validate non-None inputs separately so we only reach the conversion path +// with a known-valid PyLong. +static uint64_t deadline_or_zero(PyObject *deadline_ms) { + if (deadline_ms == NULL || deadline_ms == Py_None) { + return 0; + } + long long ms = PyLong_AsLongLong(deadline_ms); + if (ms <= 0) { + PyErr_Clear(); + return 0; + } + return boc_now_ns() + (uint64_t)ms * 1000000ULL; +} + +// True if more iterations are permitted by the user's max_behaviors cap. +// None / non-positive values are treated as unbounded. +static bool max_behaviors_or_inf(PyObject *max_behaviors, Py_ssize_t executed) { + if (max_behaviors == NULL || max_behaviors == Py_None) { + return true; + } + long long lim = PyLong_AsLongLong(max_behaviors); + if (lim <= 0) { + PyErr_Clear(); + return true; + } + return (long long)executed < lim; +} + +// Release-path error logger. 
Mirrors worker.run_behavior's outer-finally +// logger.exception arms (release / release_all): log + swallow so a single +// misbehaving step cannot strand the runtime. +static inline void _core_log_release_error(void) { + PyErr_WriteUnraisable(NULL); +} + +// Pump-starvation watchdog warn-side check. Fires when the pinned +// queue has been non-empty for at least WATCHDOG_WARN_MS without a +// pump call making progress. Rate-limited: only one warn emission +// per non-empty epoch (cleared when the queue drains). +// +// Three exit paths: +// * watchdog disabled (warn_ms == 0): no work. +// * queue empty (NONEMPTY_SINCE_NS == 0): no work. +// * threshold not crossed: no work. +// Otherwise: invoke the user's on_starve callback if set, else log a +// default warning. Called from the main interpreter only (pump entry). +static void boc_main_pinned_check_warn(void) { + uint64_t warn_ms = atomic_load(&WATCHDOG_WARN_MS); + if (warn_ms == 0) { + return; + } + uint64_t since_ns = atomic_load(&MAIN_PINNED_NONEMPTY_SINCE_NS); + if (since_ns == 0) { + return; + } + uint64_t now_ns = boc_now_ns(); + if (now_ns < since_ns || (now_ns - since_ns) < warn_ms * 1000000ULL) { + return; + } + // Rate-limit: one warn per non-empty epoch. The epoch closes when + // the queue drains (NONEMPTY_SINCE_NS -> 0); WATCHDOG_LAST_WARN_NS + // is reset alongside. A relaxed compare against since_ns suffices + // because both are monotonic and only main reads/writes them. 
+ uint64_t last_warn = atomic_load(&WATCHDOG_LAST_WARN_NS); + if (last_warn >= since_ns) { + return; + } + atomic_store(&WATCHDOG_LAST_WARN_NS, now_ns); + + uint64_t age_ms = (now_ns - since_ns) / 1000000ULL; + uint64_t depth = atomic_load(&MAIN_PINNED_DEPTH); + PyObject *callback = (PyObject *)atomic_load_intptr(&WATCHDOG_ON_STARVE); + PyObject *msg = PyUnicode_FromFormat( + "pinned-cown queue non-empty for %llu ms with no pump() progress " + "(depth=%llu)", + (unsigned long long)age_ms, (unsigned long long)depth); + if (msg == NULL) { + PyErr_WriteUnraisable(NULL); + return; + } + if (callback != NULL) { + // Severity 0 = warn (raise side would use 1; reserved for future). + PyObject *res = PyObject_CallFunction(callback, "iO", 0, msg); + if (res == NULL) { + PyErr_WriteUnraisable(callback); + } else { + Py_DECREF(res); + } + } else { + PyObject *logging = PyImport_ImportModule("logging"); + if (logging != NULL) { + PyObject *logger = + PyObject_CallMethod(logging, "getLogger", "s", "bocpy.pump"); + if (logger != NULL) { + PyObject *r = PyObject_CallMethod(logger, "warning", "O", msg); + Py_XDECREF(r); + Py_DECREF(logger); + } + Py_DECREF(logging); + } + if (PyErr_Occurred()) { + PyErr_WriteUnraisable(NULL); + } + } + Py_DECREF(msg); +} + +// Acquire-failure capture. behavior_acquire_impl returned < 0 with +// PyErr set. Stash the exception on the result cown so a consumer reading +// it sees a diagnostic, then clear PyErr so the next iteration starts +// clean. +static void handle_pinned_acquire_failure(BOCBehavior *b) { + PyObject *exc = PyErr_GetRaisedException(); + if (exc == NULL) { + exc = PyObject_CallFunction( + PyExc_RuntimeError, "s", + "behavior_acquire failed without a Python exception"); + if (exc == NULL) { + PyErr_WriteUnraisable(NULL); + return; + } + } + behavior_set_exception_impl(b, exc); + Py_DECREF(exc); +} + +// Body-failure capture. behavior_execute_impl returned NULL. 
+// Fork on `Exception` vs `BaseException`: +// - `Exception`: capture on the result cown, clear PyErr, count as +// raised, populate first_err under raise_on_error. +// - `BaseException` (KeyboardInterrupt, SystemExit, GeneratorExit): +// stash via the out-param. Restored by the caller AFTER per-iteration +// cleanup completes, so the cleanup arms run with PyErr clear. +static void handle_pinned_body_exception(BOCBehavior *b, bool raise_on_error, + PyObject **first_err, + Py_ssize_t *raised, + PyObject **base_err) { + PyObject *exc = PyErr_GetRaisedException(); + if (exc == NULL) { + return; + } + if (PyErr_GivenExceptionMatches(exc, PyExc_Exception)) { + behavior_set_exception_impl(b, exc); + *raised += 1; + if (raise_on_error && *first_err == NULL) { + *first_err = Py_NewRef(exc); + } + Py_DECREF(exc); + } else { + // Transfer ownership to the caller's stash slot. + *base_err = exc; + } +} + +#ifdef Py_GIL_DISABLED +// Portable thread-id source for the FT-only single-pumper CAS. Linux/BSD +// use C11 `thrd_current`; macOS exposes pthread directly; Windows uses +// GetCurrentThreadId. All cast to uintptr_t for atomic storage. +static inline uintptr_t boc_pump_thread_id(void) { +#if defined(_WIN32) + return (uintptr_t)GetCurrentThreadId(); +#elif defined(__APPLE__) + return (uintptr_t)pthread_self(); +#else + return (uintptr_t)thrd_current(); +#endif +} +#endif + +/// @brief Drive the main-pinned queue from the main interpreter. +/// @details Implements the three pump gates (main-interp, FT +/// single-pumper CAS, nested-pump) plus the per-behavior layered +/// try/finally pattern from `worker.run_behavior`. Every popped +/// behavior runs its per-iteration cleanup (IN_PUMP_BODY clear + +/// release pair + terminator_dec) regardless of where the body +/// failed. ``BaseException`` is stashed across cleanup and restored +/// at the post-iteration check so the pump exits cleanly with the +/// FT CAS released. 
+/// @param args (deadline_ms, max_behaviors, raise_on_error, +/// boc_export). deadline_ms / max_behaviors accept None +/// for "no limit"; boc_export is the namespace whose +/// attributes resolve `__behavior__N` thunks (typically +/// ``sys.modules["__main__"]``). +/// @return ``(executed, deadline_reached, raised)`` 3-tuple on +/// success, NULL on gate rejection or raise_on_error trigger. +static PyObject *_core_main_pump_bounded(PyObject *Py_UNUSED(self), + PyObject *args, PyObject *kwds) { + static char *kwlist[] = {"deadline_ms", "max_behaviors", "raise_on_error", + "boc_export", NULL}; + PyObject *deadline_ms = Py_None; + PyObject *max_behaviors = Py_None; + int raise_on_error_flag = 0; + PyObject *boc_export = NULL; + + if (!PyArg_ParseTupleAndKeywords(args, kwds, "OOpO", kwlist, &deadline_ms, + &max_behaviors, &raise_on_error_flag, + &boc_export)) { + return NULL; + } + bool raise_on_error = (bool)raise_on_error_flag; + + Py_ssize_t executed = 0; + Py_ssize_t raised = 0; + bool deadline_reached = false; + PyObject *first_err = NULL; + PyObject *base_err = NULL; +#ifdef Py_GIL_DISABLED + bool pump_cas_owner = false; +#endif + + // Gate 1: main interpreter only. + if (bocpy_interpid() != bocpy_main_interpid()) { + PyErr_SetString(PyExc_RuntimeError, + "pump() must be called from the main interpreter"); + goto pump_exit; + } + +#ifdef Py_GIL_DISABLED + // Gate 2: free-threaded single-pumper CAS. + { + intptr_t self_id = (intptr_t)boc_pump_thread_id(); + intptr_t expected = 0; + if (atomic_compare_exchange_strong_intptr(&MAIN_PUMP_THREAD, &expected, + self_id)) { + pump_cas_owner = true; + } else if (expected != self_id) { + PyErr_SetString(PyExc_RuntimeError, + "pump() is in use by another thread on this " + "free-threaded build"); + goto pump_exit; + } + // expected == self_id: re-entry on same thread; gate 3 rejects + // nested calls. pump_cas_owner stays false so we do NOT clear + // MAIN_PUMP_THREAD on exit (outer frame still owns it). 
+ } +#endif + + // Gate 3: nested pump. + if (IN_PUMP_BODY) { + PyErr_SetString(PyExc_RuntimeError, + "pump() is not reentrant; cannot be called from " + "inside a pinned-behavior body (v1 limitation)"); + goto pump_exit; + } + + uint64_t deadline_ns = deadline_or_zero(deadline_ms); + boc_main_pinned_check_warn(); + + while (max_behaviors_or_inf(max_behaviors, executed)) { + assert(!PyErr_Occurred()); + + boc_bq_node_t *n = boc_bq_dequeue(&MAIN_PINNED_QUEUE); + if (n == NULL) { + break; + } + uint64_t new_depth = atomic_fetch_sub(&MAIN_PINNED_DEPTH, 1) - 1; + if (new_depth == 0) { + atomic_store(&MAIN_PINNED_NONEMPTY_SINCE_NS, 0); + // Close the watchdog warn epoch: the next time the queue + // becomes non-empty, the warn fires fresh. + atomic_store(&WATCHDOG_LAST_WARN_NS, 0); + } + + BOCBehavior *b = BEHAVIOR_FROM_PREHDR_NODE(n); + + IN_PUMP_BODY = true; + + bool acquired = false; + noticeboard_cache_clear_for_behavior(); + if (behavior_acquire_impl(b) < 0) { + handle_pinned_acquire_failure(b); + } else { + acquired = true; + } + + if (acquired) { + PyObject *rv = behavior_execute_impl(b, boc_export); + if (rv == NULL) { + // Setup failure inside execute_impl (missing thunk, + // allocation error). PyErr is live; capture to the result + // cown so a consumer sees the diagnostic, then count it as + // a raised Exception (setup failures are Exception subclasses). + handle_pinned_body_exception(b, raise_on_error, &first_err, &raised, + &base_err); + } else { + // Body returned normally OR an exception was captured onto + // the result cown. Fork on the stored value type: + // - exception flag set and value is BaseException-but-not- + // Exception (KeyboardInterrupt/SystemExit/GeneratorExit): stash for + // re-raise AFTER cleanup completes. + // - exception flag set and value is Exception: count as + // raised, populate first_err under raise_on_error. + // - exception flag clear: ordinary return value. 
+ if (b->result->exception) { + PyObject *captured = b->result->value; + if (PyObject_IsInstance(captured, PyExc_Exception) == 1) { + raised++; + if (raise_on_error && first_err == NULL) { + first_err = Py_NewRef(captured); + } + } else { + // BaseException-but-not-Exception: re-raise after cleanup. + base_err = Py_NewRef(captured); + } + } + Py_DECREF(rv); + } + } + + // release / release_all ALWAYS run (mirrors worker.run_behavior outer + // finally). cown_release tolerates partial-acquire by short-circuiting + // NO_OWNER cowns, so we can call it even when acquire failed midway. + if (behavior_release_impl(b) < 0) { + _core_log_release_error(); + } + if (behavior_release_all_impl(b) < 0) { + _core_log_release_error(); + } + terminator_dec(); + // Drop the queue-owned BEHAVIOR_INCREF from behavior_resolve_one; the + // inline pump path has no BehaviorCapsule dealloc to do it for us. + BEHAVIOR_DECREF(b); + + IN_PUMP_BODY = false; + atomic_store(&LAST_PUMP_NS, boc_now_ns()); + executed++; + + if (base_err) { + PyErr_SetRaisedException(base_err); + base_err = NULL; + goto pump_exit; + } + if (raise_on_error && first_err) { + goto pump_exit; + } + if (deadline_ns && boc_now_ns() >= deadline_ns) { + deadline_reached = true; + break; + } + } + +pump_exit: +#ifdef Py_GIL_DISABLED + if (pump_cas_owner) { + atomic_store_intptr(&MAIN_PUMP_THREAD, (intptr_t)0); + } +#endif + + if (first_err) { + PyErr_SetRaisedException(first_err); + first_err = NULL; + return NULL; + } + if (PyErr_Occurred()) { + return NULL; + } + return Py_BuildValue("(nOn)", executed, deadline_reached ? Py_True : Py_False, + raised); +} + +/// @brief Shutdown-drain the main-pinned queue without executing bodies. +/// @details Pops every queued behavior, marks its result cown with a +/// drop-exception so any consumer sees a deterministic diagnostic, then +/// releases the per-behavior request array (MCS unlink + handoff). 
Does +/// NOT invoke @ref behavior_acquire_impl or @ref behavior_execute_impl +/// — the runtime is tearing down and user code must not run on a +/// half-stopped runtime. Mirrors the depth/watchdog bookkeeping in +/// @ref _core_main_pump_bounded so the global counters return to the +/// depth==0 baseline. +/// @param self The module (unused) +/// @param args Unused +/// @return Python int — number of behaviors drained. +static PyObject *_core_main_pump_drain_all(PyObject *Py_UNUSED(self), + PyObject *Py_UNUSED(args)) { + // Gate: drain runs only from the main interpreter. The pinned queue + // is single-consumer by design and the result-cown acquire below + // asserts main ownership. + if (bocpy_interpid() != bocpy_main_interpid()) { + PyErr_SetString(PyExc_RuntimeError, + "main_pump_drain_all must be called from the main " + "interpreter"); + return NULL; + } + + Py_ssize_t drained = 0; + for (;;) { + boc_bq_node_t *n = boc_bq_dequeue(&MAIN_PINNED_QUEUE); + if (n == NULL) { + break; + } + uint64_t new_depth = atomic_fetch_sub(&MAIN_PINNED_DEPTH, 1) - 1; + if (new_depth == 0) { + atomic_store(&MAIN_PINNED_NONEMPTY_SINCE_NS, 0); + atomic_store(&WATCHDOG_LAST_WARN_NS, 0); + } + + BOCBehavior *b = BEHAVIOR_FROM_PREHDR_NODE(n); + + PyObject *exc = + PyObject_CallFunction(PyExc_RuntimeError, "s", + "behavior drained at shutdown without executing"); + if (exc == NULL) { + PyErr_WriteUnraisable(NULL); + } else { + // Result cown sits in published-and-released state (NO_OWNER, + // xidata set, value NULL). Acquire on main, write the + // drop-exception, then release back to NO_OWNER so any consumer + // reading the result cown observes a deterministic diagnostic. + // Mirrors `BehaviorCapsule_set_drop_exception`. 
+ if (cown_acquire(b->result) < 0) { + PyErr_WriteUnraisable(NULL); + } else { + cown_set_value(b->result, exc); + b->result->exception = true; + if (cown_release(b->result) < 0) { + PyErr_WriteUnraisable(NULL); + } + } + Py_DECREF(exc); + } + + // MCS unlink + successor handoff on every request the drained + // behavior owned. Must run so dependent behaviors on the same + // cowns are advanced (even though we're shutting down, the + // request arrays still hold strong refs that need to be dropped). + if (behavior_release_all_impl(b) < 0) { + PyErr_WriteUnraisable(NULL); + } + terminator_dec(); + // Drop the queue-owned BEHAVIOR_INCREF from behavior_resolve_one (same as + // _core_main_pump_bounded). + BEHAVIOR_DECREF(b); + drained++; + } + return PyLong_FromSsize_t(drained); +} + static PyObject *_new_behavior_object(XIDATA_T *xidata) { BOCBehavior *behavior = (BOCBehavior *)xidata->data; @@ -4899,40 +5818,44 @@ static PyObject *_core_scheduler_runtime_start(PyObject *Py_UNUSED(module), // `boc_sched_init`) because `boc_sched.c` deliberately treats // `BOCBehavior` as opaque. for (Py_ssize_t i = 0; i < (Py_ssize_t)n; ++i) { - BOCBehavior *token = (BOCBehavior *)PyMem_RawCalloc(1, sizeof(BOCBehavior)); - if (token == NULL) { + // Combined `prehdr + BOCBehavior` allocation per the pre-header + // scheme (see `boc_sched.h`). PyMem_RawCalloc zeroes both halves; + // the BOCBehavior pointer starts past the prehdr. + void *token_raw = + PyMem_RawCalloc(1, sizeof(boc_behavior_prehdr_t) + sizeof(BOCBehavior)); + if (token_raw == NULL) { // Roll back any tokens already installed and tear the runtime // back down so the caller sees a clean failure (no half-init). 
for (Py_ssize_t j = 0; j < i; ++j) { - BOCBehavior *prev = NULL; boc_bq_node_t *prev_node = boc_sched_get_token_node(j); - if (prev_node != NULL) { - prev = BEHAVIOR_FROM_BQ_NODE(prev_node); - } boc_sched_set_token_node(j, NULL); - if (prev != NULL) { - PyMem_RawFree(prev); + if (prev_node != NULL) { + // prev_node points at the prehdr's bq_node (offset 0), + // so the cast IS the allocation origin. + PyMem_RawFree((boc_behavior_prehdr_t *)prev_node); } } boc_sched_shutdown(); PyErr_NoMemory(); return NULL; } + boc_behavior_prehdr_t *token_prehdr = (boc_behavior_prehdr_t *)token_raw; + BOCBehavior *token = (BOCBehavior *)(token_prehdr + 1); // Mark as token. PyMem_RawCalloc has zeroed everything (NULL // thunk/result/args/captures/requests, count == rc == 0, - // bq_node.next_in_queue == NULL). The behaviour is never - // reference-counted via BEHAVIOR_INCREF/DECREF and never visits - // the request/cown machinery; it is recycled in place by the - // token re-enqueue path. We give it an `id` of -1 so any - // diagnostic that prints `behavior->id` for a token is - // immediately recognisable. + // prehdr->bq_node.next_in_queue == NULL, prehdr->pinned == 0). + // The behaviour is never reference-counted via + // BEHAVIOR_INCREF/DECREF and never visits the request/cown + // machinery; it is recycled in place by the token re-enqueue + // path. We give it an `id` of -1 so any diagnostic that prints + // `behavior->id` for a token is immediately recognisable. token->is_token = 1; token->id = -1; token->owner_worker_index = (int16_t)i; - if (boc_sched_set_token_node(i, &token->bq_node) < 0) { + if (boc_sched_set_token_node(i, &token_prehdr->bq_node) < 0) { // worker_index out of range: only possible if WORKER_COUNT // changed under us, which the GIL precludes. Defensive. 
- PyMem_RawFree(token); + PyMem_RawFree(token_prehdr); boc_sched_shutdown(); PyErr_SetString(PyExc_RuntimeError, "scheduler_runtime_start: token install failed"); @@ -4975,9 +5898,11 @@ static PyObject *_core_scheduler_runtime_stop(PyObject *Py_UNUSED(module), if (node == NULL) { continue; } - BOCBehavior *token = BEHAVIOR_FROM_BQ_NODE(node); boc_sched_set_token_node(i, NULL); - PyMem_RawFree(token); + // The bq_node sits at offset 0 of the prehdr, so the node + // pointer IS the allocation origin (per the pre-header scheme; + // see `boc_sched.h`). Free at the prehdr. + PyMem_RawFree((boc_behavior_prehdr_t *)node); } boc_sched_shutdown(); Py_RETURN_NONE; @@ -5091,7 +6016,7 @@ static PyObject *_core_scheduler_worker_pop(PyObject *Py_UNUSED(module), Py_RETURN_NONE; } } - behavior = BEHAVIOR_FROM_BQ_NODE(n); + behavior = BEHAVIOR_FROM_PREHDR_NODE(n); if (!behavior->is_token) { break; } @@ -5165,7 +6090,7 @@ static PyObject *_core_scheduler_drain_all_queues(PyObject *Py_UNUSED(module), if (n == NULL) { break; } - BOCBehavior *behavior = BEHAVIOR_FROM_BQ_NODE(n); + BOCBehavior *behavior = BEHAVIOR_FROM_PREHDR_NODE(n); if (behavior->is_token) { // Token sentinels are not reference-counted and own no // cowns; they live in the per-worker `token_work` slot and @@ -5250,6 +6175,37 @@ static PyMethodDef _core_module_methods[] = { {"cown_pin_pointers", _core_cown_pin_pointers, METH_VARARGS, "cown_pin_pointers($module, pins, /)\n--\n\n" "INCREF each CownCapsule and return raw pointer ints (transfers refs)."}, + {"PinnedCownCapsule", _core_pinned_cown_capsule, METH_O, + "PinnedCownCapsule($module, value, /)\n--\n\n" + "Allocate a CownCapsule whose value is permanently pinned to the " + "main interpreter. 
Acquire and release are no-ops on main and " + "structurally rejected on workers."}, + {"cown_is_pinned", _core_cown_is_pinned, METH_O, + "cown_is_pinned($module, capsule, /)\n--\n\n" + "Return True if the cown is pinned to the main interpreter."}, + {"pinned_cown_count", _core_pinned_cown_count, METH_NOARGS, + "pinned_cown_count($module, /)\n--\n\n" + "Return the number of pinned cowns currently alive in the process."}, + {"main_pump_queue_depth", _core_main_pump_queue_depth, METH_NOARGS, + "main_pump_queue_depth($module, /)\n--\n\n" + "Return the current depth of the main-pinned behaviour queue."}, + {"main_pump_bounded", (PyCFunction)_core_main_pump_bounded, + METH_VARARGS | METH_KEYWORDS, + "main_pump_bounded($module, deadline_ms, max_behaviors, " + "raise_on_error, boc_export, /)\n--\n\n" + "Drive the main-pinned queue from the main interpreter. " + "Returns (executed, deadline_reached, raised)."}, + {"main_pump_drain_all", _core_main_pump_drain_all, METH_NOARGS, + "main_pump_drain_all($module, /)\n--\n\n" + "Drain every queued pinned behavior without executing its body; " + "marks each result cown with a drop-exception. Used by stop() " + "to clear post-shutdown pinned work. Main-interp only."}, + {"set_pump_watchdog", (PyCFunction)_core_set_pump_watchdog, + METH_VARARGS | METH_KEYWORDS, + "set_pump_watchdog($module, /, warn_ms=None, on_starve=None)\n--\n\n" + "Configure the pinned-pump starvation watchdog. Pass a positive " + "int for warn_ms to enable warnings; pass None to disable. 
" + "Main-interp only."}, {"noticeboard_write_direct", _core_noticeboard_write_direct, METH_VARARGS, "noticeboard_write_direct($module, key, value, /)" "\n--\n\nWrites a key-value pair to the noticeboard."}, @@ -5293,9 +6249,17 @@ static PyMethodDef _core_module_methods[] = { {"terminator_wait", _core_terminator_wait, METH_VARARGS, "terminator_wait($module, timeout, /)" "\n--\n\nBlock until the terminator count reaches 0 or timeout."}, + {"terminator_wait_pumpable", _core_terminator_wait_pumpable, METH_O, + "terminator_wait_pumpable($module, timeout_s, /)" + "\n--\n\nBlock until the terminator count reaches 0, the " + "pinned-queue depth becomes positive, or timeout_s elapses. " + "Returns one of TERMINATED, PUMP_READY, or WAIT_TIMED_OUT."}, {"terminator_seed_dec", _core_terminator_seed_dec, METH_NOARGS, "terminator_seed_dec($module, /)" "\n--\n\nIdempotent one-shot decrement of the Pyrona seed."}, + {"terminator_seed_inc", _core_terminator_seed_inc, METH_NOARGS, + "terminator_seed_inc($module, /)" + "\n--\n\nIdempotent one-shot re-arm of the Pyrona seed."}, {"terminator_reset", _core_terminator_reset, METH_NOARGS, "terminator_reset($module, /)" "\n--\n\nRestore terminator state for a fresh runtime start. " @@ -5376,6 +6340,13 @@ static int _core_module_exec(PyObject *module) { // when the runtime starts; here we only initialize the kernel objects. terminator_init(); + // Initialize the main-pinned dispatch queue. This is a + // process-global Verona-style intrusive queue drained by + // `main_pump_bounded`. The depth/timestamp counters were + // zero-initialised at file scope; only the queue itself needs + // explicit init. + boc_bq_init(&MAIN_PINNED_QUEUE); + // Initialize the scheduler module with no workers. 
The // per-worker array stays unallocated and `_core.scheduler_stats()` // returns an empty list until `behaviors.start()` calls @@ -5498,6 +6469,21 @@ static int _core_module_exec(PyObject *module) { BOC_STATE = state; PyModule_AddStringConstant(module, "TIMEOUT", BOC_TIMEOUT); + // Wake-reason sentinels returned by ``terminator_wait_pumpable``. + // Values mirror ``boc_terminator_wake_reason_t`` so the Python loop + // can compare against module-level constants without re-importing. + if (PyModule_AddIntConstant(module, "TERMINATED", BOC_TERMINATOR_TERMINATED) < + 0) { + return -1; + } + if (PyModule_AddIntConstant(module, "PUMP_READY", BOC_TERMINATOR_PUMP_READY) < + 0) { + return -1; + } + if (PyModule_AddIntConstant(module, "WAIT_TIMED_OUT", + BOC_TERMINATOR_WAIT_TIMED_OUT) < 0) { + return -1; + } return 0; } diff --git a/src/bocpy/behaviors.py b/src/bocpy/behaviors.py index 70f9748..8124dc7 100644 --- a/src/bocpy/behaviors.py +++ b/src/bocpy/behaviors.py @@ -12,12 +12,14 @@ """ import inspect +import linecache import logging import os import sys from textwrap import dedent import threading import time +import types from types import MappingProxyType from typing import Any, Callable, Generic, Mapping, NamedTuple, Optional, TypeVar, Union @@ -82,6 +84,22 @@ def _default_worker_count() -> int: # loud failure here means CI fails in minutes instead of hours. _LIFECYCLE_RECEIVE_TIMEOUT = 120.0 +# Self-defence cap on the alternating pump / orphan drain loop in +# `stop_workers`. A pathological producer that keeps re-feeding +# MAIN_PINNED_QUEUE between rounds would otherwise wedge teardown +# forever; on overflow we log and give up rather than spin. +_MAX_STOP_DRAIN_ROUNDS = 64 + +# Upper bound on any millisecond-valued pump argument +# (`deadline_ms`, `warn_ms`). The C side converts ms to ns via +# `value * 1_000_000`; without a guard, a caller passing +# `2**63` quietly wraps to a small or negative deadline. 
The bound +# corresponds to the largest ms that fits in an int64 once scaled by +# 1_000_000 — ~9.2e12 ms (~292 years), enough that no real program +# should hit it but small enough to reject programmer-error inputs +# like `sys.maxsize` cleanly. +_MAX_PUMP_MS = (1 << 63) // 1_000_000 - 1 + T = TypeVar("T") # Sentinel distinguishing "key absent" from "key is None" in noticeboard updates. @@ -214,6 +232,364 @@ def __repr__(self) -> str: return repr(self.impl) +class PinnedCown(Cown[T]): + """A cown whose value never leaves the main interpreter. + + Behaviors whose request set contains *any* PinnedCown run on the + main interpreter, scheduled onto a pump queue that the runtime + drains under :func:`wait` and that hosts may drive explicitly via + :func:`bocpy.pump`. + + A regular :class:`Cown` stores its value as cross-interpreter + data: every time a worker acquires the cown the value is + unpickled into the worker's interpreter, mutated, and re-pickled + on release. That round-trip is the reason a cown can be acquired + by any worker -- but it also means the value must be picklable + and that **the same Python object is never observed twice** in + a worker. + + Many useful values cannot survive that round-trip: pyglet shapes, + Tk widgets, open file handles, ctypes pointers into a library + loaded by ``__main__``, an asyncio event loop, a GPU context. + Their ``__reduce__`` either raises or silently reconstructs a + broken object on the other side. + + A :class:`PinnedCown` holds its value as a plain + :c:type:`PyObject` reference in the main interpreter. The value + never goes through ``XIData``; the same Python object is + observed on every acquire. The trade-off: every behavior whose + request set contains a pinned cown runs **on the main thread**, + drained by :func:`pump` (called from your event loop) or + implicitly by :func:`wait`. + + Pattern: coarse-grained pinned dispatch + The pinned arm is single-consumer (the main thread). 
If you + schedule a pinned behavior per item, those behaviors + serialise on the main thread and you lose worker + parallelism. Schedule pinned behaviors coarsely -- one per + logical frame or batch, not per item. Do per-item + computation on workers against per-item :class:`Cown` + slices, then dispatch **one** pinned ``@when`` per frame + that captures all of them together with the main-thread + canvas / handle and performs the batched write-back. + + Thread affinity + Pinned cowns may only be constructed from the **main + interpreter**. Constructing one from a worker raises + :class:`RuntimeError`; the value would have no home + interpreter to live in. :func:`pump` likewise requires the + main interpreter -- any thread within it on classic CPython; + on free-threaded builds (``Py_GIL_DISABLED``) a single + thread at a time, enforced by a CAS on pump entry that + raises :class:`RuntimeError` if a second thread tries to + pump concurrently. The CAS is cleared on **every** exit + path, including ``BaseException`` propagation from a + pinned body. + + Mixed request sets + A behavior may freely combine pinned and unpinned cowns; + the 2PL acquisition order is unchanged. As soon as the + request set contains any pinned cown, the body runs on the + main thread. Unpinned cowns in the set still travel through + XIData into the main interpreter for the body's duration. + + Exception model + Body exceptions follow the same rules as worker behaviors: + captured on the result :class:`Cown` and surfaced through + ``cown.exception``. The default :func:`pump` does **not** + re-raise; pass ``raise_on_error=True`` to opt into + fail-fast propagation. + + Nested pumping + Calling :func:`pump` from inside a pinned-behavior body + raises :class:`RuntimeError` (v1). + + Handle vs. value + A :class:`PinnedCown` *handle* (the Python wrapper object + and its C capsule) is a normal cross-interpreter shareable. 
+ It travels via the same XIData mechanism as a regular + :class:`Cown` and may be: + + - shipped as a captured variable to a worker behavior, + - embedded in any value graph stored in a regular + :class:`Cown` (``Cown(PinnedCown(x))`` is supported), + - placed in a noticeboard entry via :func:`notice_write` + or :func:`notice_update`. + + What never crosses interpreter boundaries is the *value* + ``x``. A worker that ends up holding a pinned-cown handle + can do exactly one useful thing with it: schedule pinned + behaviors against it (which the runtime auto-routes to + the main pump queue). Any attempt to acquire the value + from a worker is rejected by the C-level owner CAS -- the + value's owner is permanently the main interpreter. + + Restrictions + - Constructible only on the main interpreter (see + *Thread affinity* above). + - The pinning interpreter is the main interpreter, by + design. There is one pinned queue per process and one + consumer of that queue (the main pumper); pinned cowns do + not split across interpreters. + """ + + def __init__(self, value: T): + """Create a pinned cown wrapping *value*. + + :param value: The initial value to wrap. Stored as a plain + :c:type:`PyObject` reference in the main interpreter -- + no pickling, no XIData round-trip. + :raises RuntimeError: If called from a non-main interpreter. + """ + # Skip super().__init__: the value must not go through XIData. + # Thread affinity lives entirely in C: PinnedCownCapsule refuses + # non-main construction, and pump's CAS enforces single-pumper + # on free-threaded builds. The capsule sets owner = main + # interpreter id permanently, which makes worker cown_acquire + # structurally fail. + self.impl = _core.PinnedCownCapsule(value) + + +class PumpResult(NamedTuple): + """Result of a :func:`pump` call. + + :ivar executed: Pinned behaviors whose lifecycle ran to + completion this call. 
Counts the iteration even if the body + raised or the acquire failed (the MCS chain still drained). + :ivar deadline_reached: ``True`` iff the loop exited because + ``deadline_ms`` tripped before the queue drained and before + ``max_behaviors`` capped. ``False`` on drain, on + ``max_behaviors`` cap, or when ``deadline_ms`` is ``None``. + :ivar raised: Pinned behaviors whose body raised an + :class:`Exception` captured to the result cown's + ``.exception``. Cleanup-path failures (acquire, release, + noticeboard cache-clear) do **not** count: they are logged + via ``PyErr_WriteUnraisable`` and the iteration is still + counted in ``executed``. On :class:`BaseException` + propagation, :func:`pump` raises and no + :class:`PumpResult` is returned. + """ + + executed: int + deadline_reached: bool + raised: int + + +def _validate_pump_bound(name: str, value: Optional[int], *, + ms: bool = False) -> Optional[int]: + """Validate a `pump()` bound argument. + + ``None`` is accepted as "unbounded". Otherwise the value must be + a positive :class:`int` and must not be a :class:`bool` (the + bool-as-int trap silently turns ``True`` into ``1`` and ``False`` + into ``0``, masking caller bugs). ``0`` is rejected: an explicit + zero bound carries no information the caller cannot express with + a one-line ``if budget:`` guard at the call site, and admitting + it forces a short-circuit branch that bypasses other entry-side + checks. ``ms=True`` additionally caps the value at + ``_MAX_PUMP_MS`` so the C side's ``value * 1_000_000`` ns + conversion cannot wrap past int64. The cap is keyed off the + explicit kwarg rather than a name-string heuristic so a future + caller that passes a non-``_ms`` name does not silently lose the + overflow protection. 
+ """ + if value is None: + return None + if isinstance(value, bool) or not isinstance(value, int): + raise TypeError( + f"{name} must be None or a positive int, " + f"got {type(value).__name__}" + ) + if value <= 0: + raise TypeError( + f"{name} must be None or a positive int, got {value}" + ) + if ms and value > _MAX_PUMP_MS: + raise OverflowError( + f"{name}={value} exceeds the maximum supported " + f"millisecond value ({_MAX_PUMP_MS}); the C side would " + f"overflow when scaling to nanoseconds" + ) + return value + + +def pump(deadline_ms: Optional[int] = None, + max_behaviors: Optional[int] = None, + raise_on_error: bool = False) -> PumpResult: + """Run pinned behaviors that are ready, then return. + + Drains the main-thread queue of behaviors whose request sets + contain at least one :class:`PinnedCown`. Each behavior runs to + completion before the next starts. The pump is non-preemptive: + ``deadline_ms`` gates *starting* the next behavior, not + interrupting one already running. + + Call :func:`pump` from your event loop's idle / on-tick hook. + Script-mode programs need not call it explicitly -- :func:`wait` + pumps internally when any :class:`PinnedCown` exists in the + process. + + Bounding + - ``deadline_ms``: wall-clock budget. ``None`` drains to + empty; otherwise a positive :class:`int`. + - ``max_behaviors``: hard count. ``None`` drains to empty; + otherwise a positive :class:`int`. + + ``0`` is rejected for both bounds (use ``if budget:`` at + the call site instead of relying on the pump to no-op). + + Exception model + By default body exceptions land on the result cown; pump + continues. With ``raise_on_error=True``, the first body + exception re-raises on the pump thread after the queue + finishes draining. 
:class:`BaseException` + (``KeyboardInterrupt``, ``SystemExit``, ``GeneratorExit``) + propagates immediately after the offending behavior's + per-iteration cleanup completes; any behaviors still queued + are left in place for the next :func:`pump` call. + + Thread affinity + :func:`pump` must run on the **main interpreter**. Calling + from a worker interpreter raises :class:`RuntimeError` + immediately. On free-threaded builds (``Py_GIL_DISABLED``) + only one thread may pump at a time: a concurrent call from + a different thread raises :class:`RuntimeError`. Calling + :func:`pump` when no :class:`PinnedCown` exists is a no-op + returning ``PumpResult(0, False, 0)``. + + Reentrance + Not reentrant. Calling from inside a pinned-behavior body + raises :class:`RuntimeError` (v1). + + :param deadline_ms: Wall-clock budget in milliseconds. + ``None`` for unbounded; otherwise a positive :class:`int`. + Must not be :class:`bool`. + :type deadline_ms: Optional[int] + :param max_behaviors: Maximum behaviors to start this call. + ``None`` for unbounded; otherwise a positive :class:`int`. + Must not be :class:`bool`. + :type max_behaviors: Optional[int] + :param raise_on_error: Re-raise the first body exception after + drain. + :type raise_on_error: bool + :return: :class:`PumpResult` (``executed``, + ``deadline_reached``, ``raised``). On + :class:`BaseException` propagation, :func:`pump` raises and + no :class:`PumpResult` is returned. + :rtype: PumpResult + :raises TypeError: if ``deadline_ms`` or ``max_behaviors`` is + neither ``None`` nor a positive :class:`int` (a :class:`bool` + is rejected). + :raises OverflowError: if ``deadline_ms`` exceeds + ``_MAX_PUMP_MS``, the largest millisecond value the C side + can scale to int64 nanoseconds. + :raises RuntimeError: wrong interpreter, concurrent pump on + free-threaded builds, nested pump, no live runtime + (:func:`start` has not been called), or watchdog raise + threshold tripped.
+ """ + deadline_ms = _validate_pump_bound("deadline_ms", deadline_ms, ms=True) + max_behaviors = _validate_pump_bound("max_behaviors", max_behaviors) + + # Pinned behaviors look up their `__behavior__N` thunk on the + # runtime's export_module (same shape contract as the worker + # bootstrap's `boc_export`). A NULL export here means the runtime + # is not initialised -- pinned schedules cannot work in that + # state, so fail loud rather than letting every behavior fall + # over with `AttributeError` on thunk lookup. + boc_export = None + if BEHAVIORS is not None: + boc_export = getattr(BEHAVIORS, "export_module", None) + if boc_export is None: + raise RuntimeError( + "pump() requires a live bocpy runtime: call bocpy.start() " + "(or schedule a @when, which auto-starts) before pump(). " + "If the runtime was already stopped, restart it before " + "draining pinned work." + ) + return PumpResult(*_core.main_pump_bounded( + deadline_ms, max_behaviors, raise_on_error, boc_export, + )) + + +def set_pump_watchdog(warn_ms: Optional[int] = 1000, + on_starve: Optional[ + Callable[[int, str], None]] = None) -> None: + """Configure the pinned-queue starvation watchdog. + + **The watchdog is disabled until this function is called.** No + call means no warnings, regardless of how long the pinned queue + has been non-empty. ``warn_ms=1000`` is the kwarg default that + applies *if and when* you opt in, not the runtime default. + + Warn-side sampling fires from :func:`pump` on entry (so + :func:`wait`'s auto-pump loop counts). The threshold gates on + **queue-non-empty time**: a program that runs only unpinned work + indefinitely never trips it. + + - ``warn_ms`` (kwarg default 1000): logs a warning carrying the + queue's non-empty duration (ms) and current depth. Pass + ``None`` to disable. Must be a positive int when set. + - ``on_starve``: optional callable ``(severity, message)`` to + replace the default logger. 
Use this to escalate (for + example ``on_starve=lambda s, m: pytest.fail(m)`` in tests, or + a counter / alert hook in production). + + :param warn_ms: Warn-after threshold in milliseconds, or + ``None`` to disable warnings. + :type warn_ms: Optional[int] + :param on_starve: Optional ``(severity, message)`` callback that + replaces the default logger sink. + :type on_starve: Optional[Callable[[int, str], None]] + :raises TypeError: if ``warn_ms`` is neither ``None`` nor a + positive :class:`int`, or ``on_starve`` is neither ``None`` + nor callable. + :raises OverflowError: if ``warn_ms`` exceeds ``_MAX_PUMP_MS``, + the largest millisecond value that scales to int64 + nanoseconds without overflow. + """ + # Validate before crossing the C boundary so callers get a clear + # TypeError with the offending arg rather than a generic C-side + # parse failure. + if warn_ms is not None: + if (not isinstance(warn_ms, int) or isinstance(warn_ms, bool) + or warn_ms <= 0): + # Reject 0 alongside negatives. The C side treats 0 as + # the disable sentinel, which would silently turn the + # watchdog off and surprise the caller; require explicit + # ``None`` to disable. + raise TypeError( + f"warn_ms must be a positive int or None to disable, " + f"got {warn_ms!r}") + if warn_ms > _MAX_PUMP_MS: + raise OverflowError( + f"warn_ms={warn_ms} exceeds the maximum supported " + f"millisecond value ({_MAX_PUMP_MS}); the C side " + f"would overflow when scaling to nanoseconds") + if on_starve is not None and not callable(on_starve): + raise TypeError( + f"on_starve must be a callable or None, got {on_starve!r}") + _core.set_pump_watchdog(warn_ms=warn_ms, on_starve=on_starve) + + +def set_wait_pump_poll(ms: int = 50) -> None: + """Set the poll cadence for :func:`wait`'s auto-pump loop. + + Default cadence is **50 ms**: the upper bound on how long the + auto-pump loop will park between checks when no broadcast wakes + it. The setting is process-global and may be changed at any + time; the active :func:`wait` loop picks up the new value on + its next iteration.
+ + :param ms: Poll cadence in milliseconds. Must be positive. + :type ms: int + """ + if not isinstance(ms, int) or isinstance(ms, bool) or ms <= 0: + raise TypeError(f"ms must be a positive int, got {ms!r}") + global _WAIT_PUMP_POLL_MS + _WAIT_PUMP_POLL_MS = ms + + +# Re-read on every iteration of the wait() auto-pump loop so a +# mid-wait `set_wait_pump_poll(...)` change is honoured without +# restarting the wait. +_WAIT_PUMP_POLL_MS = 50 + + WORKER_MAIN_END = "# END boc_export" @@ -232,6 +608,11 @@ def __init__(self, num_workers: Optional[int]): self.classes = set() self.worker_threads = [] self.behavior_lookup: Mapping[int, BehaviorInfo] = {} + # Main-side namespace holding the transpiled ``__behavior__N`` + # thunks. :func:`pump` reads this so pinned-behavior bodies + # scheduled via ``@when`` resolve on main the same way they + # resolve on workers. Populated by :meth:`start`. + self.export_module: Optional[types.ModuleType] = None self.logger = logging.getLogger("behaviors") self.logger.debug("behaviors init") # The runtime has no central scheduler thread. Caller threads do 2PL @@ -278,6 +659,19 @@ def __init__(self, num_workers: Optional[int]): self._final_noticeboard: Optional[dict[str, Any]] = None self.final_cowns: tuple[Cown, ...] = () self.bid = 0 + # Set by :meth:`start` to the synthetic linecache key for the + # main-side transpiled export, so :meth:`stop` (and the + # abort path) can pop the entry symmetrically. + self._main_export_file: Optional[str] = None + # Set by :meth:`start` to the prior value of + # ``sys.modules['__bocmain__']`` (None if no entry existed) + # so :meth:`stop` can restore it instead of unconditionally + # popping a slot we never owned. + self._installed_bocmain = False + self._prior_bocmain: Optional[types.ModuleType] = None + # (name, path) of the module pinned by start(); used to detect + # mismatched re-start requests. 
+ self._started_module: Optional[tuple[str, str]] = None def lookup_behavior(self, line_number: int, max_decorator_stack=32) -> BehaviorInfo: """Resolve behavior info from a source line number. @@ -489,15 +883,57 @@ def stop_workers(self): _core.send("boc_cleanup", True) self.teardown_workers() - # Drain any behaviours that were dispatched but never - # consumed (warned path of stop(), or any race where a - # late behaviour landed in a per-task queue between - # request_stop_all and the worker's pop_slow returning - # NULL). MUST run BEFORE scheduler_runtime_stop, which - # frees the worker array and the per-task queues with it. - # release_all on a drained behaviour may dispatch its - # successor; loop until the queues stay empty. - self._stop_drain_errors = self._drain_orphan_behaviors() + # Alternate `main_pump_drain_all` and + # `_drain_orphan_behaviors` until both report empty in + # the same iteration. `release_all` inside the orphan + # drain dispatches successors through + # `boc_sched_dispatch`, whose pinned fast path routes + # pinned-bearing successors onto MAIN_PINNED_QUEUE; a + # single pump-then-orphan ordering would leave those + # successors enqueued and their terminator_inc holds + # undecremented, wedging the next `start()`. The cap is + # a self-defence against a runaway producer that keeps + # re-feeding the queues: log + give up rather than spin + # forever. Main-interp only; skip the pump-side drain + # on sub-interpreter shutdown paths where the pinned + # queue is provably empty (only main can enqueue). 
+ accumulated_drain_errors = [] + try: + if _core.is_primary(): + for _round in range(_MAX_STOP_DRAIN_ROUNDS): + try: + pump_drained = _core.main_pump_drain_all() + except Exception as drain_ex: + self.logger.exception(drain_ex) + pump_drained = 0 + errors_this_round, orphan_drained = ( + self._drain_orphan_behaviors() + ) + accumulated_drain_errors.extend(errors_this_round) + if pump_drained == 0 and orphan_drained == 0: + break + else: + try: + depth = _core.main_pump_queue_depth() + except Exception: + depth = -1 + self.logger.error( + "stop_workers(): drain loop did not converge " + "within %d rounds; main_pump_queue_depth=%d " + "at give-up. Pinned-cown leak likely.", + _MAX_STOP_DRAIN_ROUNDS, depth, + ) + else: + errors_this_round, _ = self._drain_orphan_behaviors() + accumulated_drain_errors.extend(errors_this_round) + finally: + # KeyboardInterrupt/SystemExit re-raised mid-drain must + # not erase already-captured release_all failures. + # extend (not assign) because _drain_orphan_behaviors + # also pushes its in-flight errors before the re-raise. + if accumulated_drain_errors: + self._stop_drain_errors.extend( + accumulated_drain_errors) finally: try: # Snapshot the per-worker scheduler counters before @@ -688,6 +1124,34 @@ def start(self, module: Optional[tuple[str, str]] = None): self.behavior_lookup = export.behaviors + # Compile the transpiled source into a fresh module on the + # main interpreter so :func:`pump` can resolve + # ``__behavior__N`` thunks the same way workers do. Workers + # bootstrap their own copy inside a sub-interpreter + # (``_bocpy_mod`` in the worker_script below); main needs an + # equivalent namespace because pinned-behavior bodies execute + # under ``main_pump_bounded`` on the main interpreter and + # ``behavior_execute_impl`` looks up the thunk via + # ``PyObject_GetAttrString(boc_export, ...)``. 
Without this + # the lookup falls back to ``sys.modules["__main__"]`` (which + # under pytest is the test runner, not the test module) and + # every pinned ``@when`` body fails with ``AttributeError``. + main_export_name = f"__bocpy_main_export__{module_name}" + main_export_file = f"" + main_export = types.ModuleType(main_export_name) + main_export.__file__ = main_export_file + linecache.cache[main_export_file] = ( + len(export.code), None, + export.code.splitlines(keepends=True), + main_export_file, + ) + self._main_export_file = main_export_file + exec( + compile(export.code, main_export_file, "exec"), + main_export.__dict__, + ) + self.export_module = main_export + # Embed the transpiled source as a Python string literal # (via ``repr()``) into the worker bootstrap. Each worker # compiles and exec's the literal into a fresh @@ -752,6 +1216,8 @@ def start(self, module: Optional[tuple[str, str]] = None): ] if module_name == "__main__": + self._prior_bocmain = sys.modules.get("__bocmain__") + self._installed_bocmain = True sys.modules["__bocmain__"] = sys.modules["__main__"] for cls in export.classes: bootstrap.append(f'\n\nclass {cls}(sys.modules["__bocmain__"].{cls}):') @@ -844,9 +1310,31 @@ def start(self, module: Optional[tuple[str, str]] = None): # Drop the __bocmain__ alias if we installed one, so a # follow-up start() observes a clean sys.modules. Same # rationale as in the successful stop() path. - sys.modules.pop("__bocmain__", None) + self._restore_main_aliases() raise + def _restore_main_aliases(self): + # Symmetric cleanup of the main-side state ``start()`` may + # have installed: the synthetic ``linecache`` entry that + # backs tracebacks for the transpiled export, and the + # ``__bocmain__`` alias used by worker bootstrap to subclass + # user classes defined in ``__main__``. Restoring the prior + # ``__bocmain__`` (instead of unconditionally popping it) + # preserves an alias the host had set before the runtime + # started. 
+ mef = self._main_export_file + if mef is not None: + linecache.cache.pop(mef, None) + self._main_export_file = None + if self._installed_bocmain: + prior = self._prior_bocmain + if prior is None: + sys.modules.pop("__bocmain__", None) + else: + sys.modules["__bocmain__"] = prior + self._installed_bocmain = False + self._prior_bocmain = None + def _abort_workers(self): """Tear down the worker pool after a partial-startup failure. @@ -898,11 +1386,116 @@ def _abort_noticeboard(self): except Exception as ex: self.logger.exception(ex) + def cycle_noticeboard(self, timeout: Optional[float] = None) -> dict[str, Any]: + """Capture a noticeboard snapshot by cycling the mutator thread (sentinel -> join -> snapshot -> restart). + + :param timeout: Upper bound on the join. ``None`` waits forever. + :type timeout: Optional[float] + :returns: The noticeboard contents as a plain ``dict``. + :raises TimeoutError: If the noticeboard thread does not exit within ``timeout``. + """ + if self.noticeboard is None or not self.noticeboard.is_alive(): + _core.noticeboard_cache_clear() + return dict(_core.noticeboard_snapshot()) + + _core.send("boc_noticeboard", "shutdown") + self.noticeboard.join(timeout) + if self.noticeboard.is_alive(): + raise TimeoutError( + "cycle_noticeboard: noticeboard thread did not exit " + f"within timeout={timeout!r}; the in-flight mutation has " + "not finished. Retry once it has." + ) + + _core.clear_noticeboard_thread() + try: + _core.noticeboard_cache_clear() + snap = dict(_core.noticeboard_snapshot()) + finally: + # Restart unconditionally so a failed snapshot does not strand the runtime. + self.start_noticeboard() + return snap + + def quiesce(self, timeout: Optional[float] = None) -> bool: + """Wait for terminator quiescence without tearing down workers; re-arms the Pyrona seed on exit. + + :param timeout: Upper bound on the wait. ``None`` waits forever. + :type timeout: Optional[float] + :returns: ``True`` on quiescence, ``False`` on timeout. 
+ """ + if not _core.is_primary(): + raise RuntimeError( + "Behaviors.quiesce() must be called from the primary " + "interpreter." + ) + # Track whether seed_dec actually dropped the seed so we only re-arm it ourselves. + seed_dropped = _core.terminator_seed_dec() + try: + return self._wait_for_quiescence(timeout) + finally: + if seed_dropped: + # Re-arm so a future stop()/quiesce() can drop the seed again; CAS 0->1 is idempotent. + _core.terminator_seed_inc() + def __enter__(self): """Enter context by starting the runtime.""" self.start() return self + def _auto_pump_loop(self, timeout: Optional[float]) -> bool: + """Pump the pinned queue while waiting for terminator quiescence. + + Used by :meth:`stop` whenever live ``PinnedCown`` handles + exist. On each iteration: block on + ``terminator_wait_pumpable`` for the current + ``_WAIT_PUMP_POLL_MS`` budget (re-read every iteration so a + mid-wait ``set_wait_pump_poll`` change is honoured); + if it wakes with ``PUMP_READY``, drain up to 64 behaviors via + ``main_pump_bounded`` with ``raise_on_error=False`` so body + exceptions surface on result cowns instead of aborting the + wait. Returns ``True`` on terminator quiescence, ``False`` on + deadline expiry — matching ``_core.terminator_wait``'s bool + contract. + """ + deadline = None if timeout is None else time.monotonic() + timeout + while _core.terminator_count() > 0: + poll_s = _WAIT_PUMP_POLL_MS / 1000.0 + if deadline is not None: + remaining = deadline - time.monotonic() + if remaining <= 0: + return False + poll_s = min(poll_s, remaining) + outcome = _core.terminator_wait_pumpable(poll_s) + if outcome == _core.TERMINATED: + return True + if outcome == _core.PUMP_READY: + _core.main_pump_bounded( + None, 64, False, self.export_module, + ) + # WAIT_TIMED_OUT: fall through to the deadline check at + # the top of the next iteration. 
+ return True + + def _wait_for_quiescence(self, timeout: Optional[float]) -> bool: + """Wait for terminator quiescence, auto-pumping if pinned. + + Picks between the byte-equivalent fast path + (``_core.terminator_wait``) and the auto-pump loop based on + the live pinned-cown count. Refuses to run from a non-primary + interpreter when pinned cowns exist — the pinned queue is + single-consumer by design. + """ + if _core.pinned_cown_count() == 0: + return _core.terminator_wait(timeout) + if not _core.is_primary(): + raise RuntimeError( + f"wait() with pinned cowns must run on the main " + f"interpreter (called from a non-main interpreter; " + f"{_core.pinned_cown_count()} pinned cown(s) live). " + f"Call wait() from the original main thread." + ) + return self._auto_pump_loop(timeout) + def stop(self, timeout: Optional[float] = None): """Quiesce all behaviors and tear the runtime down. @@ -980,7 +1573,7 @@ def _remaining(): # complete. if not self._workers_stopped: _core.terminator_seed_dec() - _core.terminator_wait(_remaining()) + self._wait_for_quiescence(_remaining()) # Post-wait reconciliation. If wait() timed out the count is # still > 0 -- skip the assertion in that case so a partial @@ -1109,7 +1702,7 @@ def _remaining(): # subsequent bocpy.start() observes a clean sys.modules # (and so the main module isn't pinned in sys.modules under # an alias after the runtime has shut down). - sys.modules.pop("__bocmain__", None) + self._restore_main_aliases() if drain_errors: # Surface the first failure so the caller sees the leak at # the failure site rather than later as a mysterious @@ -1148,13 +1741,18 @@ def _drain_orphan_behaviors(self): so the loop drains again until ``scheduler_drain_all_queues`` returns an empty list. - :returns: A list of exceptions captured from - ``release_all`` failures, or ``[]`` on a clean - drain. 
``stop()`` re-raises if non-empty so a release-side - leak is visible at the failure site rather than later as a - mysterious deadlock on the affected cowns. + :returns: A ``(errors, drained_count)`` tuple. ``errors`` is + a list of exceptions captured from ``release_all`` + failures, or ``[]`` on a clean drain. ``drained_count`` + is the total number of capsules processed by this call; + ``stop_workers`` uses it to detect when the alternating + pump / orphan drain loop has converged. ``stop()`` + re-raises if ``errors`` is non-empty so a release-side + leak is visible at the failure site rather than later as + a mysterious deadlock on the affected cowns. """ errors = [] + drained_count = 0 # KeyboardInterrupt / SystemExit raised mid-drain must not # abort the drain partway -- the orphaned behaviors would # leak their MCS chains and terminator holds, so the next @@ -1166,9 +1764,32 @@ def _drain_orphan_behaviors(self): capsules = _core.scheduler_drain_all_queues() if not capsules: if deferred_base_exc is not None: + if errors: + # Stash current-round errors so a + # KeyboardInterrupt unwinding past stop() does + # not silently erase release_all failures. + self._stop_drain_errors.extend(errors) + note = ( + f"_drain_orphan_behaviors deferred " + f"{len(errors)} release_all error(s); " + "see Behaviors._stop_drain_errors" + ) + # add_note is PEP 678 (3.11+); fall back to writing __notes__ directly on 3.10. 
+ add_note = getattr( + deferred_base_exc, "add_note", None) + if add_note is not None: + add_note(note) + else: + existing = getattr( + deferred_base_exc, "__notes__", None) + if existing is None: + deferred_base_exc.__notes__ = [note] + else: + existing.append(note) raise deferred_base_exc - return errors + return errors, drained_count for payload in capsules: + drained_count += 1 self.logger.warning( "behavior dropped during stop(); the runtime was " "torn down before this behavior could acquire its cowns" @@ -1271,8 +1892,13 @@ def start(worker_count: Optional[int] = None, module: Optional[tuple[str, str]] = None): """Start the behavior runtime: worker pool plus noticeboard thread. - The runtime distributes scheduling (2PL link/release) across caller - and worker threads; there is no central scheduler thread. + Idempotent: bare ``start()`` on a running runtime is a silent no-op; mismatched ``worker_count``/``module`` raise. + + The runtime distributes scheduling (2PL link/release) across + caller and worker threads; there is no central scheduler thread. + + The runtime distributes scheduling (2PL link/release) across + caller and worker threads; there is no central scheduler thread. :param worker_count: The number of worker interpreters to start. If None, defaults to the number of available cores minus one. @@ -1280,20 +1906,42 @@ def start(worker_count: Optional[int] = None, :param module: A tuple of the target module name and file path to export for worker import. If None, the caller's module will be used. :type module: Optional[tuple[str, str]] + :raises RuntimeError: If called from a non-primary interpreter, + or if the runtime is already up under a different + ``worker_count`` / ``module`` than the one supplied. """ global BEHAVIORS + + if not _core.is_primary(): + raise RuntimeError("start() can only be called from the main interpreter") + + # Idempotent: bare start() no-ops; mismatched explicit args raise. 
if BEHAVIORS is not None: - raise RuntimeError("Behavior runtime already started") + if worker_count is not None and worker_count != BEHAVIORS.num_workers: + raise RuntimeError( + f"bocpy.start(worker_count={worker_count}) was called " + f"but the runtime is already up with worker_count=" + f"{BEHAVIORS.num_workers}. Call wait() (or stop()) to " + f"tear the existing runtime down before starting a new " + f"one with a different worker_count." + ) + if module is not None and module != BEHAVIORS._started_module: + raise RuntimeError( + f"bocpy.start(module={module!r}) was called but the " + f"runtime is already up with module=" + f"{BEHAVIORS._started_module!r}. Call wait() (or " + f"stop()) to tear the existing runtime down before " + f"starting a new one with a different module." + ) + return if worker_count is None: worker_count = WORKER_COUNT - if not _core.is_primary(): - raise RuntimeError("start() can only be called from the main interpreter") - if module is None: module = get_caller_module() BEHAVIORS = Behaviors(worker_count) + BEHAVIORS._started_module = module try: BEHAVIORS.start(module) except BaseException: @@ -1319,9 +1967,11 @@ def when(*cowns): result of executing the behavior. This Cown can be used for further coordination. - Note: the transpiler matches ``@when`` by literal name. Aliasing - the import (``from bocpy import when as boc_when``) is not - supported -- the rewrite will not fire and the worker will fail. + The transpiler recognises module-level aliases for the decorator, + so ``from bocpy import when as boc_when`` and ``@boc_when(...)``, + as well as ``import bocpy [as alias]`` followed by + ``@bocpy.when(...)`` / ``@alias.when(...)``, are all supported. + Aliases declared inside a function body are not tracked. 
""" def when_factory(func): @@ -1369,14 +2019,62 @@ def when_factory(func): return when_factory +def quiesce(timeout: Optional[float] = None, *, + stats: bool = False, noticeboard: bool = False): + """Block until in-flight behaviors complete without tearing down the runtime. + + :param timeout: Upper bound (seconds). ``None`` waits forever. + :type timeout: Optional[float] + :param stats: If True, capture per-worker scheduler stats. + :type stats: bool + :param noticeboard: If True, capture a noticeboard snapshot via a thread cycle. + :type noticeboard: bool + :raises TimeoutError: If quiescence is not reached within ``timeout``. + :raises RuntimeError: If called from a non-primary interpreter while pinned cowns are live. + """ + def _format(stats_snap, nb_snap): + if stats and noticeboard: + return WaitResult(stats=stats_snap, noticeboard=nb_snap) + if stats: + return stats_snap + if noticeboard: + return nb_snap + return None + + if BEHAVIORS is None: + return _format([], {}) + + if timeout is None: + deadline = None + else: + deadline = time.monotonic() + timeout + + def _remaining() -> Optional[float]: + if deadline is None: + return None + return max(0.0, deadline - time.monotonic()) + + if not BEHAVIORS.quiesce(_remaining()): + raise TimeoutError( + f"quiesce(): runtime did not reach quiescence within " + f"timeout={timeout!r}" + ) + # Sample stats post-quiescence so the per-worker counts are stable. + stats_snap = list(_core.scheduler_stats()) if stats else None + nb_snap = BEHAVIORS.cycle_noticeboard(_remaining()) if noticeboard else None + return _format(stats_snap, nb_snap) + + def wait(timeout: Optional[float] = None, *, stats: bool = False, noticeboard: bool = False): """Block until all behaviors complete, with optional timeout. When ``stats=True``, captures the per-worker - :func:`_core.scheduler_stats` snapshot at shutdown. When - ``noticeboard=True``, captures the final noticeboard contents - as a plain ``dict`` at shutdown. 
See the stub in + :func:`_core.scheduler_stats` snapshot. When + ``noticeboard=True``, captures the noticeboard contents as a + plain ``dict`` at the quiescence point (NOT after teardown — the + two are equivalent in single-caller programs but the quiescence + snapshot is the documented one). See the stub in ``__init__.pyi`` for the full contract. Return value: @@ -1385,6 +2083,10 @@ def wait(timeout: Optional[float] = None, *, - ``stats=True`` only: ``list[dict]`` (or ``[]``). - ``noticeboard=True`` only: ``dict[str, Any]`` (or ``{}``). - both flags: :class:`WaitResult`. + + Internally a thin wrapper around :func:`quiesce` + + :meth:`Behaviors.stop`; quiescence timeout warns rather than + raising. """ global BEHAVIORS @@ -1397,30 +2099,74 @@ def _format(stats_snap, nb_snap): return nb_snap return None - if BEHAVIORS: - # Clear BEHAVIORS only if stop() drove the runtime all the - # way through teardown (workers joined, noticeboard exited, - # C-level noticeboard slot released). On stop()'s - # noticeboard-join-timeout path the runtime is intentionally - # left running so the caller can diagnose the leak and - # retry; nulling the global handle there would strand the - # live workers / noticeboard thread with no Python-side - # reference. - try: - BEHAVIORS.stop(timeout) - except BaseException: - if BEHAVIORS._teardown_complete: - stats_snap = BEHAVIORS._final_stats if BEHAVIORS._final_stats is not None else [] - nb_snap = BEHAVIORS._final_noticeboard if BEHAVIORS._final_noticeboard is not None else {} - BEHAVIORS = None - if stats or noticeboard: - return _format(stats_snap, nb_snap) - raise + if BEHAVIORS is None: + return _format([], {}) + + if BEHAVIORS._teardown_complete: + # Idempotent: prior stop() already stashed final snapshots; + # return them rather than running on an empty runtime. 
stats_snap = BEHAVIORS._final_stats if BEHAVIORS._final_stats is not None else [] nb_snap = BEHAVIORS._final_noticeboard if BEHAVIORS._final_noticeboard is not None else {} BEHAVIORS = None return _format(stats_snap, nb_snap) - return _format([], {}) + + if timeout is None: + deadline = None + else: + deadline = time.monotonic() + timeout + + def _remaining() -> Optional[float]: + if deadline is None: + return None + return max(0.0, deadline - time.monotonic()) + + # quiesce() first for a pre-teardown snapshot; on TimeoutError fall + # back to stop()'s post-teardown one (historical warn-and-tear-down). + quiesce_snapshots = None + quiesce_timed_out = False + try: + quiesce_snapshots = quiesce( + _remaining(), stats=stats, noticeboard=noticeboard, + ) + except TimeoutError as ex: + quiesce_timed_out = True + BEHAVIORS.logger.warning( + "wait(): quiesce() timed out (%s); proceeding to stop().", ex, + ) + + # Clear BEHAVIORS only if stop() drove the runtime all the + # way through teardown (workers joined, noticeboard exited, + # C-level noticeboard slot released). On stop()'s + # noticeboard-join-timeout path the runtime is intentionally + # left running so the caller can diagnose the leak and + # retry; nulling the global handle there would strand the + # live workers / noticeboard thread with no Python-side + # reference. 
+ try: + BEHAVIORS.stop(_remaining()) + except BaseException: + if BEHAVIORS._teardown_complete: + if quiesce_snapshots is not None: + BEHAVIORS = None + if stats or noticeboard: + return quiesce_snapshots + return None + stats_snap = BEHAVIORS._final_stats if BEHAVIORS._final_stats is not None else [] + nb_snap = BEHAVIORS._final_noticeboard if BEHAVIORS._final_noticeboard is not None else {} + BEHAVIORS = None + if stats or noticeboard: + return _format(stats_snap, nb_snap) + raise + + if quiesce_snapshots is not None and not quiesce_timed_out: + BEHAVIORS = None + return quiesce_snapshots + + # Quiesce timed out: return stop()'s post-teardown snapshot instead of an empty result. + stats_snap = BEHAVIORS._final_stats if BEHAVIORS._final_stats is not None else [] + nb_snap = BEHAVIORS._final_noticeboard if BEHAVIORS._final_noticeboard is not None else {} + BEHAVIORS = None + return _format(stats_snap, nb_snap) def _validate_noticeboard_key(key: str) -> None: @@ -1656,15 +2402,29 @@ def notice_delete(key: str) -> None: def noticeboard() -> Mapping[str, Any]: """Return a cached snapshot of the noticeboard. - Must be called from within a ``@when`` behavior. The first call within a - behavior captures all entries under mutex and caches the data. - Subsequent calls in the same behavior return a view of the same - cached data. + The noticeboard is a behavior-scope read surface. The supported + use is from inside a ``@when`` body: the first call captures all + entries under mutex and caches them, and every subsequent call + in the same behavior returns the same cached view. The returned mapping is read-only. - Calling from outside a behavior (e.g. the main thread) will return a - snapshot that is never refreshed for that thread. 
+ The only supported way to read the noticeboard from the main + thread is to ask :func:`wait` for it via ``wait(noticeboard=True)`` + (or ``wait(stats=True, noticeboard=True)``); that snapshot is taken + on the main thread between joining the noticeboard mutator thread + and clearing the C-side entries. + + Calling :func:`noticeboard` or :func:`notice_read` from any other + main-thread context (outside a behavior, outside + ``wait(noticeboard=True)``) is **undefined behavior**: the cached + proxy is never re-anchored on a behavior boundary, so subsequent + calls may observe either a stale snapshot or partially-applied + writes. + + Seeding the noticeboard with :func:`notice_write` from the main + thread *before* scheduling behaviors is fine and is the + recommended pattern for installing read-mostly configuration. :return: A read-only mapping of keys to their stored values. :rtype: Mapping[str, Any] @@ -1675,11 +2435,11 @@ def noticeboard() -> Mapping[str, Any]: def notice_read(key: str, default: Any = None) -> Any: """Read a single key from the noticeboard. - Must be called from within a ``@when`` behavior. Convenience wrapper - that takes a snapshot and returns one value. - - Calling from outside a behavior (e.g. the main thread) will return a - snapshot that is never refreshed for that thread. + Convenience wrapper over :func:`noticeboard` that takes a snapshot + and returns one value. The same supported-usage contract applies: + call from inside a ``@when`` behavior, or read the final state on + main via ``wait(noticeboard=True)``. Calling :func:`notice_read` + from any other main-thread context is **undefined behavior**. :param key: The noticeboard key to read. :type key: str diff --git a/src/bocpy/boc_compat.h b/src/bocpy/boc_compat.h index c98b381..571cd71 100644 --- a/src/bocpy/boc_compat.h +++ b/src/bocpy/boc_compat.h @@ -70,6 +70,16 @@ #include #endif +/// @brief Portable stand-in for C11's @c max_align_t. 
+/// @details Avoids C11-mode @c max_align_t which MSVC only exposes under @c +/// /std:c11 (not set by the CPython build). +typedef union boc_max_align { + long long _ll; + long double _ld; + void *_p; + void (*_fp)(void); +} boc_max_align_t; + // --------------------------------------------------------------------------- // Memory-order tags // --------------------------------------------------------------------------- diff --git a/src/bocpy/boc_sched.c b/src/bocpy/boc_sched.c index 9168685..2d12b85 100644 --- a/src/bocpy/boc_sched.c +++ b/src/bocpy/boc_sched.c @@ -936,6 +936,40 @@ boc_bq_node_t *boc_sched_worker_pop_fast(boc_sched_worker_t *self) { int boc_sched_dispatch(boc_bq_node_t *n) { boc_sched_worker_t *self = current_worker; + + // Off-worker runtime-down gate. Must run BEFORE the pinned fast + // path so a pinned `@when` racing teardown is rejected the same + // way as an unpinned one — otherwise the pinned arm would drop + // the node onto MAIN_PINNED_QUEUE post-`terminator_close` with + // no rollback, leaking an undecremented `terminator_inc` and + // wedging the next `wait()`. Workers releasing successors of + // already-acquired behaviours skip this check (they bypass + // `self == NULL`) so mid-run dispatch is unaffected. + if (self == NULL) { + Py_ssize_t wc = + (Py_ssize_t)boc_atomic_load_u64_explicit(&WORKER_COUNT, BOC_MO_ACQUIRE); + if (wc == 0) { + PyErr_SetString( + PyExc_RuntimeError, + "cannot schedule behavior: bocpy runtime is not running. " + "Call bocpy.start() before scheduling, or avoid scheduling " + "after wait() / stop() has shut the runtime down."); + return -1; + } + } + + // Pinned-routing fast path. The OR-fold pinned byte was set by + // `BehaviorCapsule_init` from the per-arg `BOCCown::is_pinned` + // classification. Read it via the scheduler-public prehdr accessor + // (no knowledge of BOCBehavior layout required) and divert pinned + // behaviours onto the process-global main-pinned queue. 
The cold + // path transfers entirely to `boc_main_pinned_enqueue` (defined + // in `_core.c`); the worker-dispatch arms below run only when the + // behaviour has no pinned cowns. + if (boc_behavior_node_is_pinned(n)) { + return boc_main_pinned_enqueue(n); + } + boc_sched_worker_t *target; if (self != NULL) { @@ -968,37 +1002,14 @@ int boc_sched_dispatch(boc_bq_node_t *n) { } else { // Off-worker arm: round-robin over the worker ring. // - // Acquire-load WORKER_COUNT and INCARNATION so we observe the - // RELEASE-stores from `boc_sched_shutdown` BEFORE we could - // observe a freed WORKERS[] slot. Without this acquire, an - // off-worker producer running concurrently with shutdown - // could read a stale WORKER_COUNT > 0 and dereference - // WORKERS[0] after it had been freed. - Py_ssize_t wc = - (Py_ssize_t)boc_atomic_load_u64_explicit(&WORKER_COUNT, BOC_MO_ACQUIRE); - // Re-seed `rr_nonlocal` whenever the scheduler incarnation - // changes so a `start()`/`wait()`/`start()` cycle with a - // different worker count cannot land on a stale pointer. + // The runtime-down gate at the top of `boc_sched_dispatch` + // already rejected the case where `WORKER_COUNT == 0`, so by + // the time we get here at least one worker slot is live. + // INCARNATION still needs an acquire load so a stale + // `rr_nonlocal` from a prior incarnation is refreshed before + // we dereference it. size_t inc_now = (size_t)boc_atomic_load_u64_explicit(&INCARNATION, BOC_MO_ACQUIRE); - // Check WORKER_COUNT FIRST so the runtime-down sentinel is - // honoured even when the cached `rr_nonlocal` is non-NULL but - // points into the prior incarnation's freed array (the - // shutdown-then-restart-with-different-count race). - if (wc == 0) { - // No runtime up — surface as a Python exception. Prior - // behaviour was a silent drop, which left whencall's - // `terminator_inc` un-rolled-back: the next `wait()` would - // hang because the caller's hold was never released. 
The - // caller (`whencall` in `behaviors.py`) catches this and - // calls `terminator_dec` to roll back its hold. - PyErr_SetString( - PyExc_RuntimeError, - "cannot schedule behavior: bocpy runtime is not running. " - "Call bocpy.start() before scheduling, or avoid scheduling " - "after wait() / stop() has shut the runtime down."); - return -1; - } if (rr_nonlocal == NULL || rr_incarnation != inc_now) { rr_nonlocal = &WORKERS[0]; rr_incarnation = inc_now; diff --git a/src/bocpy/boc_sched.h b/src/bocpy/boc_sched.h index c2f5cbf..9e93442 100644 --- a/src/bocpy/boc_sched.h +++ b/src/bocpy/boc_sched.h @@ -34,16 +34,18 @@ // // The queue is intrusive: each node carries an `_Atomic` link // (`boc_bq_node_t::next_in_queue`). Production users embed a -// `boc_bq_node_t` field (`BOCBehavior::bq_node`) and pass its address -// to the enqueue/dequeue API; the queue never dereferences anything -// other than the link, so larger user-defined payloads are reached -// via container_of-style arithmetic at the call site. +// `boc_bq_node_t` field (see `boc_behavior_prehdr_t::bq_node` below +// for the BOCBehavior case) and pass its address to the +// enqueue/dequeue API; the queue never dereferences anything other +// than the link, so larger user-defined payloads are reached via +// container_of-style arithmetic at the call site. /// @brief Verona-style intrusive link node. -/// @details Embedded at a struct-end position inside @c BOCBehavior -/// (see `_core.c`). The queue treats nodes as opaque: the only field -/// it reads or writes is @c next_in_queue. Test code may allocate -/// bare @c boc_bq_node_t instances. +/// @details Embedded at offset 0 of @c boc_behavior_prehdr_t (see +/// below); the prehdr sits immediately before each @c BOCBehavior. +/// The queue treats nodes as opaque: the only field it reads or +/// writes is @c next_in_queue. Test code may allocate bare +/// @c boc_bq_node_t instances. 
typedef struct boc_bq_node { /// @brief Intrusive forward link, payload type /// `struct boc_bq_node *` stored in a `boc_atomic_ptr_t` slot for @@ -189,6 +191,90 @@ boc_bq_node_t *boc_bq_segment_take_one(boc_bq_segment_t *s); /// @return @c true if the queue currently appears empty. bool boc_bq_is_empty(boc_bq_t *q); +// --------------------------------------------------------------------------- +// Scheduler-visible behaviour pre-header (`boc_behavior_prehdr_t`) +// --------------------------------------------------------------------------- +// +// Pre-header sitting immediately *before* each BOCBehavior +// allocation (CPython `_PyGC_Head` / `_Py_AS_GC()` style). Holds +// the fields the scheduler needs to inspect without including +// BOCBehavior's private definition: the intrusive queue link and +// the OR-fold pinned byte set by `BehaviorCapsule_init` from the +// per-arg cown classification (`is_pinned`). +// +// Why a pre-header instead of fields on BOCBehavior. The dispatch +// path in `boc_sched.c` receives a `boc_bq_node_t *` and must read +// the pinned byte to route pinned behaviours onto the main-thread +// queue. BOCBehavior's struct definition is private to `_core.c`, +// so the alternatives are (a) leak the full struct via this header, +// (b) call through a function pointer on the hot path, or (c) +// hard-code an `offsetof(BOCBehavior, pinned)` magic number in +// `boc_sched.c` and protect it with `static_assert` mirrors. The +// pre-header avoids all three: the scheduler reads `pinned` via a +// normal struct field access, and `bq_node` at offset 0 makes the +// container_of cast trivial and impossible to drift. +// +// `_core.c` owns allocation: `behavior_new` calls +// `PyMem_RawMalloc(sizeof(prehdr) + sizeof(BOCBehavior))`, zeroes +// the prehdr, and returns the pointer past it. Recovery on the free +// path uses `BOC_BEHAVIOR_PREHDR(b)` to walk back. + +/// @brief Scheduler-visible pre-header attached to every behaviour. 
+/// @details Allocated in front of each @c BOCBehavior; sits in the +/// same cache line as the intrusive link the scheduler dereferences +/// on dispatch. The @c bq_node MUST stay at offset 0 so the cast +/// from @c boc_bq_node_t* to @c boc_behavior_prehdr_t* IS the +/// container_of arithmetic (no compile-time offset constant needed). +typedef struct boc_behavior_prehdr { + /// @brief Intrusive link node (offset 0; load-bearing). + /// @details Literal @c alignas(8) (not @c alignof(boc_max_align_t)) because + /// 32-bit MSVC's @c __declspec(align) only accepts integer literals; 8 + /// satisfies the trailing-fields alignment assert on every supported target. + alignas(8) boc_bq_node_t bq_node; + /// @brief OR-fold of @c BOCCown::is_pinned over the behaviour's + /// request set. Set by `BehaviorCapsule_init`; read by + /// `boc_sched_dispatch` to route the behaviour onto the main-thread + /// pinned queue. + uint8_t pinned; + /// @brief Room to grow for future scheduler-visible fields without + /// reallocating the prehdr footprint. + uint8_t _reserved[7]; +} boc_behavior_prehdr_t; + +static_assert(offsetof(boc_behavior_prehdr_t, bq_node) == 0, + "boc_behavior_prehdr_t.bq_node must be at offset 0 so the " + "container_of cast in boc_sched_dispatch is a no-op"); +static_assert(sizeof(boc_behavior_prehdr_t) % alignof(boc_max_align_t) == 0, + "sizeof(boc_behavior_prehdr_t) must be a multiple of " + "alignof(boc_max_align_t) so (prehdr + 1) is properly aligned " + "for the trailing BOCBehavior fields"); + +/// @brief Recover the prehdr from a BOCBehavior pointer. +/// @details Inverse of the "return the pointer past the prehdr" +/// allocation pattern in `behavior_new`. Cast-only; no field access. +#define BOC_BEHAVIOR_PREHDR(b) (((boc_behavior_prehdr_t *)(b)) - 1) + +/// @brief Read the OR-fold pinned byte from an intrusive queue node. 
+/// @details The cast is the container_of: @c bq_node is at offset 0 +/// of @c boc_behavior_prehdr_t by construction (see the struct +/// definition above). Used by `boc_sched_dispatch`'s leading branch. +/// @param n The intrusive link node (must point into a prehdr). +/// @return Non-zero iff the owning behaviour has at least one pinned +/// cown in its request set. +static inline uint8_t boc_behavior_node_is_pinned(const boc_bq_node_t *n) { + return ((const boc_behavior_prehdr_t *)n)->pinned; +} + +/// @brief Enqueue a pinned-bearing behaviour on the main-pinned queue. +/// @details Defined in `_core.c` (it owns @c MAIN_PINNED_QUEUE plus +/// the depth/timestamp counters and the terminator wake). Called by +/// @ref boc_sched_dispatch when @ref boc_behavior_node_is_pinned +/// returns non-zero. +/// @param n The prehdr's @c bq_node. +/// @return 0 (infallible; mirrors @ref boc_sched_dispatch's success +/// contract). +int boc_main_pinned_enqueue(boc_bq_node_t *n); + // --------------------------------------------------------------------------- // Verona work-stealing queue cursors (`boc_wsq_*`) // --------------------------------------------------------------------------- diff --git a/src/bocpy/boc_terminator.c b/src/bocpy/boc_terminator.c index 940b1e8..1ea4c4e 100644 --- a/src/bocpy/boc_terminator.c +++ b/src/bocpy/boc_terminator.c @@ -95,6 +95,17 @@ bool terminator_seed_dec(void) { return false; } +bool terminator_seed_inc(void) { + // CAS 0->1: single-shot inc; no broadcast needed (terminator_wait only wakes + // on count==0). 
+ int_least64_t expected = 0; + if (atomic_compare_exchange_strong(&TERMINATOR_SEEDED, &expected, 1)) { + atomic_fetch_add(&TERMINATOR_COUNT, 1); + return true; + } + return false; +} + void terminator_reset(int_least64_t *prior_count, int_least64_t *prior_seeded) { // Fence: raise the closed bit before we touch anything else so any // stray thread still holding a reference to the previous runtime @@ -118,3 +129,34 @@ int_least64_t terminator_seeded(void) { } int_least64_t terminator_count(void) { return atomic_load(&TERMINATOR_COUNT); } + +void terminator_wake_all(void) { + mtx_lock(&TERMINATOR_MUTEX); + cnd_broadcast(&TERMINATOR_COND); + mtx_unlock(&TERMINATOR_MUTEX); +} + +boc_terminator_wake_reason_t +terminator_wait_pumpable(double timeout_s, uint64_t (*pinned_depth_fn)(void)) { + boc_terminator_wake_reason_t reason = BOC_TERMINATOR_WAIT_TIMED_OUT; + double end_time = boc_now_s() + timeout_s; + mtx_lock(&TERMINATOR_MUTEX); + for (;;) { + if (atomic_load(&TERMINATOR_COUNT) == 0) { + reason = BOC_TERMINATOR_TERMINATED; + break; + } + if (pinned_depth_fn != NULL && pinned_depth_fn() > 0) { + reason = BOC_TERMINATOR_PUMP_READY; + break; + } + double now = boc_now_s(); + if (now >= end_time) { + reason = BOC_TERMINATOR_WAIT_TIMED_OUT; + break; + } + cnd_timedwait_s(&TERMINATOR_COND, &TERMINATOR_MUTEX, end_time - now); + } + mtx_unlock(&TERMINATOR_MUTEX); + return reason; +} diff --git a/src/bocpy/boc_terminator.h b/src/bocpy/boc_terminator.h index 8165cce..0d70446 100644 --- a/src/bocpy/boc_terminator.h +++ b/src/bocpy/boc_terminator.h @@ -66,6 +66,13 @@ bool terminator_wait(double timeout, bool wait_forever); /// removed. bool terminator_seed_dec(void); +/// @brief Idempotent one-shot re-arm of the Pyrona seed. +/// @details Atomic CAS 0->1 on SEEDED + INC on COUNT; symmetric with @c +/// terminator_seed_dec. No condvar broadcast (inc never wakes a waiter). +/// @return true if this call restored the seed, false if it was +/// already present. 
+bool terminator_seed_inc(void); + /// @brief Restore terminator state for a fresh runtime start. /// @details Sets count=1 (seed), clears the closed bit, and re-arms the /// seed one-shot. Returns the prior `(count, seeded)` via the out @@ -81,4 +88,43 @@ int_least64_t terminator_seeded(void); /// @brief Read the current counter. int_least64_t terminator_count(void); +/// @brief Broadcast on the terminator condvar without changing the +/// counter. +/// @details Wakes every thread blocked in @ref terminator_wait so they +/// re-evaluate the count or any external predicate (e.g. the main +/// pinned queue depth). Used by @c boc_main_pinned_enqueue to nudge a +/// main-thread @c wait()/pump cycle when pinned work arrives without +/// changing the global active count. +void terminator_wake_all(void); + +/// @brief Wake reasons returned by @ref terminator_wait_pumpable. +/// @details The auto-pump loop in `wait()` distinguishes three exits +/// so the caller can pump, drain, or time out without re-querying +/// shared state. Exposed to Python as ``_core.TERMINATED``, +/// ``_core.PUMP_READY``, and ``_core.WAIT_TIMED_OUT`` module-level +/// integer constants. +typedef enum boc_terminator_wake_reason { + BOC_TERMINATOR_TERMINATED = 0, + BOC_TERMINATOR_PUMP_READY = 1, + BOC_TERMINATOR_WAIT_TIMED_OUT = 2, +} boc_terminator_wake_reason_t; + +/// @brief Pumpable variant of @ref terminator_wait. +/// @details Blocks on the terminator condvar until one of: +/// - the active count reaches 0 (returns @ref BOC_TERMINATOR_TERMINATED), +/// - the pinned-queue depth becomes positive (returns +/// @ref BOC_TERMINATOR_PUMP_READY), or +/// - @p timeout_s elapses (returns @ref BOC_TERMINATOR_WAIT_TIMED_OUT). +/// @p pinned_depth_fn is a function pointer that returns the current +/// pinned-queue depth; it is invoked while holding the terminator +/// mutex and MUST be lock-free (a relaxed atomic load on the depth +/// counter). 
The function-pointer indirection keeps @c boc_terminator.c +/// independent of @c _core.c's pinned-queue state. +/// @param timeout_s Maximum wait in seconds. A non-positive value +/// performs a single non-blocking poll. +/// @param pinned_depth_fn Lock-free reader for the pinned-queue depth. +/// @return One of @ref boc_terminator_wake_reason_t. +boc_terminator_wake_reason_t +terminator_wait_pumpable(double timeout_s, uint64_t (*pinned_depth_fn)(void)); + #endif // BOCPY_TERMINATOR_H diff --git a/src/bocpy/include/bocpy/bocpy.h b/src/bocpy/include/bocpy/bocpy.h index 1f68358..9696e33 100644 --- a/src/bocpy/include/bocpy/bocpy.h +++ b/src/bocpy/include/bocpy/bocpy.h @@ -86,4 +86,19 @@ static inline int_least64_t bocpy_interpid(void) { return (int_least64_t)PyInterpreterState_GetID(PyInterpreterState_Get()); } +/// @brief Return the *main* interpreter's ID as `int_least64_t`. +/// +/// Convenience wrapper over +/// `PyInterpreterState_GetID(PyInterpreterState_Main())`, pre-typed to match +/// @ref bocpy_interpid for owner-field equality checks. Used by +/// main-pinned-cown call sites to assert that the running interpreter is the +/// permanent owner of a pinned cown's value. `PyInterpreterState_Main()` +/// returns the process's main interpreter regardless of which interpreter the +/// caller is currently attached to, so this helper is safe to call from a +/// worker sub-interpreter for diagnostic/assert use (under the GIL or +/// equivalent attachment, same as @ref bocpy_interpid). 
+static inline int_least64_t bocpy_main_interpid(void) { + return (int_least64_t)PyInterpreterState_GetID(PyInterpreterState_Main()); +} + #endif // BOCPY_H diff --git a/src/bocpy/include/bocpy/xidata.h b/src/bocpy/include/bocpy/xidata.h index b658580..70dd6b4 100644 --- a/src/bocpy/include/bocpy/xidata.h +++ b/src/bocpy/include/bocpy/xidata.h @@ -310,6 +310,17 @@ static inline PyObject *PyErr_GetRaisedException(void) { return ev; } +static inline void PyErr_SetRaisedException(PyObject *exc) { + if (exc == NULL) { + PyErr_Clear(); + return; + } + PyObject *typ = Py_NewRef((PyObject *)Py_TYPE(exc)); + PyObject *tb = PyException_GetTraceback(exc); + // PyErr_Restore steals all three references. + PyErr_Restore(typ, exc, tb); +} + #endif /** diff --git a/src/bocpy/transpiler.py b/src/bocpy/transpiler.py index eafbd06..74862c3 100644 --- a/src/bocpy/transpiler.py +++ b/src/bocpy/transpiler.py @@ -7,11 +7,38 @@ from typing import Mapping, NamedTuple, Set -def _has_when_decorator(node: ast.FunctionDef) -> bool: +def _is_when_call(node: ast.AST, + when_aliases: Set[str], + bocpy_module_aliases: Set[str]) -> bool: + """Return True iff ``node`` is a ``@when(...)`` decorator call. + + Matches three spellings: + - bare ``Name`` whose id is in ``when_aliases`` + (``from bocpy import when [as alias]``) + - ``Attribute`` on a ``Name`` whose id is in + ``bocpy_module_aliases`` and whose ``attr`` is ``"when"`` + (``import bocpy [as alias]`` then ``@alias.when(...)``) + The literal name ``"when"`` is always treated as an alias when a + ``from bocpy import when`` statement is present. 
+ """ + if not isinstance(node, ast.Call): + return False + func = node.func + if isinstance(func, ast.Name): + return func.id in when_aliases + if isinstance(func, ast.Attribute): + return (isinstance(func.value, ast.Name) + and func.value.id in bocpy_module_aliases + and func.attr == "when") + return False + + +def _has_when_decorator(node: ast.FunctionDef, + when_aliases: Set[str], + bocpy_module_aliases: Set[str]) -> bool: """Return True if the function carries an ``@when(...)`` decorator.""" for dec in node.decorator_list: - if (isinstance(dec, ast.Call) and isinstance(dec.func, ast.Name) - and dec.func.id == "when"): + if _is_when_call(dec, when_aliases, bocpy_module_aliases): return True return False @@ -19,16 +46,26 @@ def _has_when_decorator(node: ast.FunctionDef) -> bool: class CapturedVariableFinder(ast.NodeVisitor): """Finds captured variables in a FunctionDef.""" - def __init__(self, known_vars: Set[str]): + def __init__(self, known_vars: Set[str], + when_aliases: Set[str] = frozenset({"when"}), + bocpy_module_aliases: Set[str] = frozenset()): """Initialize the captured variable finder. :param known_vars: Any known identifiers (imports, global functions/classes) :type known_vars: Set[str] + :param when_aliases: Names that bind to ``bocpy.when`` (defaults + to the bare name ``"when"``). + :type when_aliases: Set[str] + :param bocpy_module_aliases: Names that bind to the ``bocpy`` + module so ``alias.when(...)`` is recognised. + :type bocpy_module_aliases: Set[str] """ self.local_vars: Set[str] = set() self.used_vars: Set[str] = set() self.captured_vars: Set[str] = set() self.known_vars: Set[str] = known_vars + self.when_aliases: Set[str] = when_aliases + self.bocpy_module_aliases: Set[str] = bocpy_module_aliases def clear(self): """Reset the tracked state between function visits.""" @@ -56,13 +93,18 @@ def visit_FunctionDef(self, node): # noqa: N802 # free names they reference must appear in the outer # behavior's captures. 
Plain nested def's keep their normal # opaque treatment because Python's own closure handles them. - if _has_when_decorator(stmt): - inner = CapturedVariableFinder(self.known_vars) + if _has_when_decorator(stmt, self.when_aliases, + self.bocpy_module_aliases): + inner = CapturedVariableFinder( + self.known_vars, + when_aliases=self.when_aliases, + bocpy_module_aliases=self.bocpy_module_aliases, + ) inner.visit(stmt) self.used_vars |= inner.captured_vars for dec in stmt.decorator_list: - if (isinstance(dec, ast.Call) and isinstance(dec.func, ast.Name) - and dec.func.id == "when"): + if _is_when_call(dec, self.when_aliases, + self.bocpy_module_aliases): for arg in dec.args: self.visit(arg) continue @@ -82,6 +124,15 @@ def visit_Name(self, node: ast.Name): # noqa: N802 self.generic_visit(node) + def visit_ExceptHandler(self, node: ast.ExceptHandler): # noqa: N802 + """Treat ``except ... as X`` binding as a local, not a capture.""" + # ``except ... as X`` (and ``try ... except* ... as X``) bind X + # on ``ExceptHandler.name`` as a plain identifier, not an + # ``ast.Name(Store)`` node, so the Name visitor never sees it. + if node.name: + self.local_vars.add(node.name) + self.generic_visit(node) + class BOCModuleTransformer(ast.NodeTransformer): """Prepares a main module for transpiling. @@ -96,6 +147,16 @@ def __init__(self): self.functions = set() self.imports = set() self.constants = set() + # Names that bind to ``bocpy.when`` (populated by + # ``visit_ImportFrom``). Always starts with the bare name + # ``"when"`` so a synthetic test or partial source still + # matches the historical literal-name spelling; the import + # visitor adds any explicit ``as`` alias to the set. + self.when_aliases: set = {"when"} + # Names that bind to the ``bocpy`` module (populated by + # ``visit_Import``). Used so ``@alias.when(...)`` is + # recognised as a behavior decorator. 
+ self.bocpy_module_aliases: set = set() def known_vars(self): """Return identifiers known at module scope for capture exclusion.""" @@ -115,6 +176,8 @@ def visit_Import(self, node: ast.Import): # noqa: N802 """Record imported names and keep the node.""" for name in node.names: self.imports.add(name.asname if name.asname else name.name) + if name.name == "bocpy": + self.bocpy_module_aliases.add(name.asname or name.name) return node @@ -123,9 +186,13 @@ def visit_ImportFrom(self, node: ast.ImportFrom): # noqa: N802 for name in node.names: self.imports.add(name.asname if name.asname else name.name) - if node.module == "bocpy" and not any((a.asname or a.name) == "whencall" for a in node.names): - node.names.append(ast.alias(name="whencall")) - self.imports.add("whencall") + if node.module == "bocpy": + for n in node.names: + if n.name == "when": + self.when_aliases.add(n.asname or n.name) + if not any((a.asname or a.name) == "whencall" for a in node.names): + node.names.append(ast.alias(name="whencall")) + self.imports.add("whencall") ast.fix_missing_locations(node) @@ -138,13 +205,8 @@ def visit_ClassDef(self, node: ast.ClassDef): # noqa: N802 def visit_FunctionDef(self, node: ast.FunctionDef): # noqa: N802 """Record non-when functions for later capture resolution.""" - when_dec = None - for dec in node.decorator_list: - if isinstance(dec, ast.Call) and isinstance(dec.func, ast.Name) and dec.func.id == "when": - when_dec = dec - break - - if when_dec is None: + if not _has_when_decorator(node, self.when_aliases, + self.bocpy_module_aliases): self.functions.add(node.name) return node @@ -214,6 +276,23 @@ def visit_Module(self, node: ast.Module): # noqa: N802 new_body.append(new_value) + # If the user only spelled ``import bocpy [as alias]`` we never + # injected ``whencall`` into a ``from bocpy import`` statement, + # but the generated ``__behavior__N`` rewrite still emits + # ``whencall(...)`` as a bare ``Name``. 
Prepend an explicit + # import so worker resolution succeeds. No-op when ``whencall`` + # is already imported or when no bocpy import is present (in + # which case nothing in the exported module would call it). + if (self.bocpy_module_aliases and "whencall" not in self.imports): + inject = ast.ImportFrom( + module="bocpy", + names=[ast.alias(name="whencall")], + level=0, + ) + ast.fix_missing_locations(inject) + new_body.insert(0, inject) + self.imports.add("whencall") + node.body[:] = new_body @@ -246,11 +325,19 @@ class WhenTransformer(ast.NodeTransformer): # below ``@when`` are on their own. _BANNED_BELOW_DECORATORS = frozenset({"staticmethod", "classmethod", "property"}) - def __init__(self, known_vars: set, path: str, module_scope_names: set): + def __init__(self, known_vars: set, path: str, module_scope_names: set, + when_aliases: Set[str] = frozenset({"when"}), + bocpy_module_aliases: Set[str] = frozenset()): """Prepare behavior extraction with known identifiers and file path.""" self.known_vars = known_vars self.module_scope_names = module_scope_names - self.cap_finder = CapturedVariableFinder(known_vars) + self.when_aliases = when_aliases + self.bocpy_module_aliases = bocpy_module_aliases + self.cap_finder = CapturedVariableFinder( + known_vars, + when_aliases=when_aliases, + bocpy_module_aliases=bocpy_module_aliases, + ) self.nodes = [] self.behaviors = {} self.path = path @@ -368,10 +455,8 @@ def visit_FunctionDef(self, node: ast.FunctionDef): # noqa: N802 """Transform @when functions into exported behaviors.""" when_dec: ast.Expr = None for dec in node.decorator_list: - if not isinstance(dec, ast.Call): - continue - - if isinstance(dec.func, ast.Name) and dec.func.id == "when": + if _is_when_call(dec, self.when_aliases, + self.bocpy_module_aliases): when_dec = dec break @@ -402,26 +487,76 @@ def visit_FunctionDef(self, node: ast.FunctionDef): # noqa: N802 behavior_node = copy.deepcopy(node) ast.copy_location(behavior_node, node) + # Extras-as-captures: 
positional parameters declared beyond the + # cown count are captured by name from the caller's frame. This + # supports two idioms transparently: + # * the canonical Python loop-snapshot ``def b(c, i=i)`` — + # defaults align with the *tail* of ``args.args``; the + # default name becomes the capture source. + # * the rename form ``def b(c, x=y)`` — capture by ``y``, + # bind into param ``x``. + # Undefaulted trailing positionals (``def b(c, factor)``) are + # captured by the parameter's own name. Non-Name defaults and + # defaults landing on cown positions are rejected up front so a + # broken signature surfaces at export time, not as a confusing + # worker TypeError. + extras_captures: list[str] = [] + n_cowns = len(when_dec.args) + all_params = behavior_node.args.args + defaults = behavior_node.args.defaults + + if len(defaults) > len(all_params) - n_cowns: + raise SyntaxError( + "Default arguments on @when behavior cown positions are " + "not supported — defaults are allowed only on trailing " + "parameters beyond the @when cown count.", + (self.path, node.lineno, node.col_offset, None), + ) + + extras = all_params[n_cowns:] + n_undefaulted = len(extras) - len(defaults) + for arg in extras[:n_undefaulted]: + extras_captures.append(arg.arg) + for arg, dflt in zip(extras[n_undefaulted:], defaults): + if not isinstance(dflt, ast.Name): + raise SyntaxError( + f"Default for @when behavior parameter '{arg.arg}' " + f"must be a plain name (e.g. ``{arg.arg}={arg.arg}``). " + f"Compute the value before the @when call and " + f"capture the resulting name.", + (self.path, dflt.lineno, dflt.col_offset, None), + ) + extras_captures.append(dflt.id) + + # Strip defaults so the worker never tries to evaluate them. + # The captured values are passed positionally by ``whencall``. + behavior_node.args.defaults = [] + # find all the captured variables. These will need to be passed # to the behavior as additional arguments, as the closure will - # no longer function properly. 
+ # no longer function properly. Extras (already in args.args) are + # in ``local_vars`` thanks to the finder's param walk, so they + # will not be re-classified as body free-vars. self.cap_finder.clear() self.cap_finder.visit(behavior_node) # __file__ is rewritten to a string constant by visit_Name below, # so it must not be added to the parameter list as a capture. - captures = [c for c in self.cap_finder.captured_vars if c != "__file__"] + body_captures = [c for c in self.cap_finder.captured_vars + if c != "__file__"] - # add the additional arguments to the function - for name in captures: + # add the body captures as trailing parameters; extras are + # already part of the user's signature. + for name in body_captures: behavior_node.args.args.append(ast.Name(id=name)) + captures = extras_captures + body_captures + # Remove only @when decorators; other decorators compose with # the behavior body and are preserved in the exported module. behavior_node.decorator_list = [ d for d in behavior_node.decorator_list - if not (isinstance(d, ast.Call) - and isinstance(d.func, ast.Name) - and d.func.id == "when") + if not _is_when_call(d, self.when_aliases, + self.bocpy_module_aliases) ] # Reject descriptor-producing decorators that would silently @@ -510,6 +645,8 @@ def export_module(tree: ast.Module, path: str = None) -> ExportResult: boc_export.known_vars() | builtins, path, module_scope_names=boc_export.module_scope_names() | builtins, + when_aliases=boc_export.when_aliases, + bocpy_module_aliases=boc_export.bocpy_module_aliases, ) when_transformer.visit(tree) diff --git a/templates/c_abi_consumer/pyproject.toml b/templates/c_abi_consumer/pyproject.toml index fd484ed..7e40b20 100644 --- a/templates/c_abi_consumer/pyproject.toml +++ b/templates/c_abi_consumer/pyproject.toml @@ -4,11 +4,11 @@ # so the build resolves headers against the bocpy install actually # being tested. See README.md. 
# -# The compatible-release bound (~=0.8) keeps the template aligned with +# The compatible-release bound (~=0.9) keeps the template aligned with # the public C ABI it was authored against; bump it in lock-step with # ``[project].version`` in the root ``pyproject.toml`` (see the # ``finalize-pr`` skill). -requires = ["setuptools", "wheel", "bocpy~=0.8"] +requires = ["setuptools", "wheel", "bocpy~=0.9"] build-backend = "setuptools.build_meta" [project] @@ -16,4 +16,4 @@ name = "bocpy-c-abi-consumer" version = "0.0.0" description = "Smoke test and canonical downstream template for the bocpy public C ABI." requires-python = ">=3.10" -dependencies = ["bocpy~=0.8"] +dependencies = ["bocpy~=0.9"] diff --git a/test/test_boc.py b/test/test_boc.py index cd2229a..83cfc34 100644 --- a/test/test_boc.py +++ b/test/test_boc.py @@ -1042,10 +1042,8 @@ def test_zero_args_behavior_capsule(self): """BehaviorCapsule with empty args list must construct cleanly.""" from bocpy import start as _start_runtime from bocpy._core import BehaviorCapsule - try: - _start_runtime() - except RuntimeError: - pass # Runtime already started by a prior test. + # start() is idempotent; no try/except needed on re-entry. + _start_runtime() result = Cown(None) # Empty args list — args_size == 0. The @@ -1063,10 +1061,8 @@ def test_large_args_behavior_capsule(self): """BehaviorCapsule with many args constructs and group_ids works.""" from bocpy import start as _start_runtime from bocpy._core import BehaviorCapsule - try: - _start_runtime() - except RuntimeError: - pass # Runtime already started by a prior test. + # start() is idempotent; no try/except needed on re-entry. + _start_runtime() result = Cown(None) # 32 distinct cowns with distinct group_ids. 
Exercises the @@ -1735,3 +1731,33 @@ def _(result): send("assert", (result.value, 8)) receive_asserts() + + +class TestLoopDefaultCapture: + """``def b(c, i=i)`` — canonical Python loop-snapshot idiom for @when.""" + + @classmethod + def teardown_class(cls): + """Ensure runtime is drained after suite.""" + wait() + + def test_loop_default_captures_per_iteration_value(self): + """``i=i`` captures the loop value at schedule time, not at execution.""" + c = Cown(0) + for i in range(4): + @when(c) + def _(c, i=i): + send("assert", (i, i)) + + receive_asserts(4) + + def test_rename_default_binds_into_param(self): + """``def b(c, x=y)`` — capture ``y`` from caller, bind into ``x``.""" + c = Cown(0) + y = 99 + + @when(c) + def _(c, x=y): + send("assert", (x, 99)) + + receive_asserts() diff --git a/test/test_matrix.py b/test/test_matrix.py index a8ba220..a831440 100644 --- a/test/test_matrix.py +++ b/test/test_matrix.py @@ -6,7 +6,7 @@ import pytest -from bocpy import Cown, Matrix, wait, when +from bocpy import Cown, drain, Matrix, receive, send, TIMEOUT, wait, when # --------------------------------------------------------------------------- @@ -4360,13 +4360,48 @@ class TestVectorMethodsInCown: Mirrors the in-process Matrix-vector tests but routes every call through the worker dispatch path so Matrix XIData round-trip plus - in-cown mutation are both exercised. Scalar-returning methods land - their result on ``result.value`` directly; matrix-returning methods - return an element tuple to keep the assertion self-contained; - in-place methods schedule a second behavior on the same cown to - read back the mutated matrix. + in-cown mutation are both exercised. Assertions are shipped out of + the behaviors via ``send("assert", ...)`` and collected on the test + thread by :meth:`receive_asserts`, per the project's BOC testing + convention; reading ``result.value`` directly from the test thread + would violate cown ownership. 
""" + RECEIVE_TIMEOUT = 10 + + @classmethod + def teardown_class(cls): + wait() + + def receive_asserts(self, count=1): + """Collect ``count`` ('assert', (actual, expected)) messages. + + Drains the queue on exit so a failure in one test does not leak + residual messages into the next. + """ + failed = None + timed_out = False + try: + for _ in range(count): + result = receive("assert", self.RECEIVE_TIMEOUT) + if result[0] == TIMEOUT: + timed_out = True + break + _, (actual, expected) = result + if failed is None and actual != expected: + failed = (actual, expected) + finally: + drain("assert") + + assert not timed_out, ( + "Timed out waiting for an 'assert' message from a behavior. " + "Check that every @when arg count matches the decorated " + "function's parameter count." + ) + if failed is not None: + actual, expected = failed + assert actual == expected, f"expected {expected!r}, got {actual!r}" + # ---- scalar-returning ------------------------------------------------- def test_vecdot_in_behavior(self): @@ -4378,9 +4413,12 @@ def test_vecdot_in_behavior(self): def result(a): return a.value.vecdot(b) - wait() - assert result.exception is False - assert result.value == pytest.approx(32.0) + @when(result) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value == pytest.approx(32.0), True)) + + self.receive_asserts(2) def test_length_in_behavior(self): """``length`` getter ``[3, 4] == 5`` via worker dispatch (it's a property).""" @@ -4390,9 +4428,12 @@ def test_length_in_behavior(self): def result(v): return v.value.length - wait() - assert result.exception is False - assert result.value == pytest.approx(5.0) + @when(result) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value == pytest.approx(5.0), True)) + + self.receive_asserts(2) def test_magnitude_squared_in_behavior(self): """``magnitude_squared([3, 4]) == 25`` via worker dispatch.""" @@ -4402,9 +4443,12 @@ def test_magnitude_squared_in_behavior(self): def 
result(v): return v.value.magnitude_squared() - wait() - assert result.exception is False - assert result.value == pytest.approx(25.0) + @when(result) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value == pytest.approx(25.0), True)) + + self.receive_asserts(2) def test_angle_in_behavior(self): """``angle([0, 1]) == pi/2`` via worker dispatch.""" @@ -4414,9 +4458,12 @@ def test_angle_in_behavior(self): def result(v): return v.value.angle() - wait() - assert result.exception is False - assert result.value == pytest.approx(math.pi / 2.0) + @when(result) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value == pytest.approx(math.pi / 2.0), True)) + + self.receive_asserts(2) # ---- matrix-returning (copy form) ------------------------------------ @@ -4430,11 +4477,14 @@ def result(a): out = a.value.cross(b) return (out[0, 0], out[0, 1], out[0, 2]) - wait() - assert result.exception is False - assert result.value[0] == pytest.approx(-3.0) - assert result.value[1] == pytest.approx(6.0) - assert result.value[2] == pytest.approx(-3.0) + @when(result) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value[0] == pytest.approx(-3.0), True)) + send("assert", (r.value[1] == pytest.approx(6.0), True)) + send("assert", (r.value[2] == pytest.approx(-3.0), True)) + + self.receive_asserts(4) def test_normalize_copy_in_behavior(self): """``normalize([3, 4]) == [0.6, 0.8]``; original cown value untouched.""" @@ -4445,13 +4495,16 @@ def result(v): n = v.value.normalize() return (n[0, 0], n[0, 1], v.value[0, 0], v.value[0, 1]) - wait() - assert result.exception is False - n0, n1, src0, src1 = result.value - assert n0 == pytest.approx(0.6) - assert n1 == pytest.approx(0.8) - assert src0 == pytest.approx(3.0) - assert src1 == pytest.approx(4.0) + @when(result) + def _(r): + send("assert", (r.exception, False)) + n0, n1, src0, src1 = r.value + send("assert", (n0 == pytest.approx(0.6), True)) + send("assert", (n1 == 
pytest.approx(0.8), True)) + send("assert", (src0 == pytest.approx(3.0), True)) + send("assert", (src1 == pytest.approx(4.0), True)) + + self.receive_asserts(5) def test_perpendicular_copy_in_behavior(self): """``perpendicular([1, 0]) == [0, 1]``; original cown value untouched.""" @@ -4462,13 +4515,16 @@ def result(v): p = v.value.perpendicular() return (p[0, 0], p[0, 1], v.value[0, 0], v.value[0, 1]) - wait() - assert result.exception is False - p0, p1, src0, src1 = result.value - assert p0 == pytest.approx(0.0) - assert p1 == pytest.approx(1.0) - assert src0 == pytest.approx(1.0) - assert src1 == pytest.approx(0.0) + @when(result) + def _(r): + send("assert", (r.exception, False)) + p0, p1, src0, src1 = r.value + send("assert", (p0 == pytest.approx(0.0), True)) + send("assert", (p1 == pytest.approx(1.0), True)) + send("assert", (src0 == pytest.approx(1.0), True)) + send("assert", (src1 == pytest.approx(0.0), True)) + + self.receive_asserts(5) # ---- in-place mutators ----------------------------------------------- @@ -4484,10 +4540,13 @@ def _(v): def check(v): return (v.value[0, 0], v.value[0, 1]) - wait() - assert check.exception is False - assert check.value[0] == pytest.approx(0.6) - assert check.value[1] == pytest.approx(0.8) + @when(check) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value[0] == pytest.approx(0.6), True)) + send("assert", (r.value[1] == pytest.approx(0.8), True)) + + self.receive_asserts(3) def test_perpendicular_in_place_in_behavior(self): """``perpendicular(in_place=True)`` mutates the matrix held by the cown.""" @@ -4501,10 +4560,13 @@ def _(v): def check(v): return (v.value[0, 0], v.value[0, 1]) - wait() - assert check.exception is False - assert check.value[0] == pytest.approx(0.0) - assert check.value[1] == pytest.approx(1.0) + @when(check) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value[0] == pytest.approx(0.0), True)) + send("assert", (r.value[1] == pytest.approx(1.0), True)) + 
+ self.receive_asserts(3) def test_negate_in_place_in_behavior(self): """``negate(in_place=True)`` mutates the matrix held by the cown.""" @@ -4518,8 +4580,11 @@ def _(v): def check(v): return (v.value[0, 0], v.value[0, 1], v.value[0, 2]) - wait() - assert check.exception is False - assert check.value[0] == pytest.approx(-1.0) - assert check.value[1] == pytest.approx(2.0) - assert check.value[2] == pytest.approx(-3.0) + @when(check) + def _(r): + send("assert", (r.exception, False)) + send("assert", (r.value[0] == pytest.approx(-1.0), True)) + send("assert", (r.value[1] == pytest.approx(2.0), True)) + send("assert", (r.value[2] == pytest.approx(-3.0), True)) + + self.receive_asserts(4) diff --git a/test/test_pinned_pump.py b/test/test_pinned_pump.py new file mode 100644 index 0000000..10534d3 --- /dev/null +++ b/test/test_pinned_pump.py @@ -0,0 +1,1004 @@ +"""Tests for :class:`bocpy.PinnedCown` and the pinned-behavior pump. + +Pinned cowns are permanently owned by the main interpreter; their +values never round-trip through XIData. Behaviors whose request set +contains at least one :class:`PinnedCown` run on main via the pump +rather than on a worker. + +Three test classes cover three use shapes: + +* :class:`TestPinnedCownBasics` -- construction invariants (value + identity preserved across acquire/release, debug-build destructor, + wrong-interpreter raise). +* :class:`TestPinnedCownsAutoDrain` -- the script-mode path: + :func:`bocpy.wait` auto-pumps when any :class:`PinnedCown` is + live, and the shutdown drain in ``stop_workers`` clears any + pinned work still in the queue. +* :class:`TestPinnedCownsManualPump` -- the event-loop integration + path: the public :func:`bocpy.pump` facade with ``deadline_ms``, + ``max_behaviors``, ``raise_on_error``, BaseException propagation, + and reentry rejection. + +All tests follow the standard ``send("assert", ...)`` / ``receive`` +/ trailing-``wait()`` idiom documented in +``.github/skills/testing-with-boc/SKILL.md``. 
+""" + +from __future__ import annotations + +from functools import partial +import gc +import time + +import pytest + +from bocpy import ( + _core, + Cown, + drain, + notice_sync, + notice_update, + notice_write, + noticeboard, + PinnedCown, + pump, + quiesce, + receive, + send, + set_pump_watchdog, + set_wait_pump_poll, + start, + TIMEOUT, + wait, + when, +) +from bocpy import behaviors as _behaviors + + +RECEIVE_TIMEOUT = 10 + + +def _replace_with(new_value, _old): + """notice_update fn that ignores the prior value and substitutes ``new_value``. + + Defined at module scope so it is picklable across the boundary into + the noticeboard sub-interpreter. Use via ``partial(_replace_with, x)``. + """ + return new_value + + +class _NotPicklable: + """Probe: pinned values must not be pickled. + + Any path that routes the value through ``object_to_xidata`` raises + :class:`TypeError` from ``__reduce_ex__``, so a regression in the + pinned acquire/release short-circuit surfaces immediately. + """ + + def __reduce_ex__(self, protocol): # noqa: D401 + raise TypeError("pinned cown values must never be pickled") + + def __repr__(self) -> str: + return "" + + +def receive_asserts(count=1): + """Drain all expected assertion messages, then fail on first mismatch. + + The "assert" queue is always drained before returning so that leftover + messages from a failing test do not leak into subsequent tests in CI. + """ + failed = None + timed_out = False + try: + for _ in range(count): + result = receive("assert", RECEIVE_TIMEOUT) + if result[0] == TIMEOUT: + timed_out = True + break + _, (actual, expected) = result + if failed is None and actual != expected: + failed = (actual, expected) + finally: + drain("assert") + + assert not timed_out, ( + "Timed out waiting for an 'assert' message from a behavior. " + "Check that every @when arg count matches the decorated " + "function's parameter count." 
+ ) + if failed is not None: + actual, expected = failed + assert actual == expected, f"expected {expected!r}, got {actual!r}" + + +class TestPinnedCownBasics: + """PinnedCown construction invariants (no schedule, no pump).""" + + @classmethod + def teardown_class(cls): + wait() + + def test_pinned_value_identity_and_no_pickle(self): + """Pinned-cown value keeps identity across many acquire cycles. + + Schedules 64 single-cown behaviors against one + :class:`PinnedCown` wrapping a non-picklable probe. Any path + that routes the value through ``object_to_xidata`` would + raise from ``__reduce_ex__``; any path that disowns the + value would change ``id(pc.value)``. + """ + obj = _NotPicklable() + obj_id = id(obj) + pc = PinnedCown(obj) + + for _ in range(64): + @when(pc) + def _body(pc): + send("assert", (id(pc.value), obj_id)) + + # quiesce() so worker sub-interpreters survive until + # receive_asserts reads their messages. + quiesce() + receive_asserts(64) + + def test_pinned_destruct_after_construction_only(self): + """Drop a pinned cown immediately after construction. + + Debug builds (``.env313d``) trip ownership assertions if + ``cown_decref_inline`` does not tolerate + ``value != NULL && xidata == NULL`` on pinned cowns. Touching + only the construct + drop path keeps the test focused on the + destructor; the auto-drain class covers the schedule path. + """ + pc = PinnedCown(_NotPicklable()) + del pc + gc.collect() + + def test_pinned_cown_off_main_raises(self): + """``PinnedCown(...)`` from a worker raises ``RuntimeError``. + + The except clause binds the exception as ``ex``; the + transpiler's free-variable scan treats ``ExceptHandler.name`` + as a local, not a capture. The type name and a substring of + the message are copied into plain locals inside the except + block and shipped through the standard ``"assert"`` tag.
+ """ + @when() + def _(): + exc_type_name = "no-raise" + msg_mentions_main = False + try: + PinnedCown(object()) + except RuntimeError as ex: + exc_type_name = "RuntimeError" + msg_mentions_main = "main interpreter" in str(ex) + + send("assert", ( + (exc_type_name, msg_mentions_main), + ("RuntimeError", True), + )) + + quiesce() + receive_asserts() + + +class TestPinnedCownsAutoDrain: + """Script-mode path: ``wait()`` auto-pumps and ``stop()`` drains.""" + + @classmethod + def teardown_class(cls): + wait() + + # wait() with pinned cowns auto-drains. + def test_wait_auto_drains_pinned(self): + """A pinned behavior scheduled before wait() runs on main without pump().""" + pc = PinnedCown({"hits": 0}) + + @when(pc) + def _body(pc): + pc.value["hits"] += 1 + send("assert", ("ran", "ran")) + + wait() + receive_asserts() + + def test_wait_pinned_cown_in_cown(self): + pc = PinnedCown({"hits": 0}) + wrap = Cown(pc) + + @when(wrap) + def _wrapper(w): + @when(w.value) + def _body(pc): + pc.value["hits"] += 1 + send("assert", ("ran", "ran")) + + wait() + receive_asserts() + + def test_main_pump_drain_all_marks_result_cowns(self): + """``_core.main_pump_drain_all`` pops every entry and marks each result Cown with a shutdown RuntimeError.""" + # 8 distinct cowns: same-cown behaviours serialise via MCS and only the head sits in MAIN_PINNED_QUEUE. + pcs = [PinnedCown(0) for _ in range(8)] + # Capture each @when's result Cown (the value returned by the + # decorator) so we can inspect its exception/value after the + # drain runs. + results = [] + for pc in pcs: + @when(pc) + def _body(pc): + pc.value += 1 + results.append(_body) + + # Precondition: all 8 still queued (no pump has run yet). 
+ assert _core.main_pump_queue_depth() == 8 + + drained = _core.main_pump_drain_all() + assert drained == 8, ( + f"main_pump_drain_all must pop every queued behavior; " + f"got {drained}" + ) + assert _core.main_pump_queue_depth() == 0 + + # Re-acquire each result via the Cown context manager; *every* one must carry the drop-exception. + for result in results: + with result: + assert result.exception is True, ( + "main_pump_drain_all must set exception=True on " + "every drained behavior's result Cown" + ) + assert isinstance(result.value, RuntimeError), ( + f"expected RuntimeError, got " + f"{type(result.value).__name__}" + ) + assert "shutdown" in str(result.value), ( + f"drop-exception message did not mention " + f"shutdown: {result.value!r}" + ) + + # stop() with pending pinned work: drain runs. + def test_stop_drains_pinned_queue(self): + """An explicit stop() should leave MAIN_PINNED_QUEUE empty. + + Also verifies the transpiler's per-iteration capture of ``i``: + a final pinned behaviour reads ``pc.value`` and ships the + tuple back via ``send("final", ...)``. A regression that + late-bound ``i`` at body-execution time would yield ``(3, 3, + 3, 3)`` instead of ``(0, 1, 2, 3)``. + """ + pc = PinnedCown([]) + for i in range(4): + @when(pc) + def _body(pc): + pc.value.append(i) # noqa: B023 + send("assert", ("ran", "ran")) + + @when(pc) + def _final(pc): + send("final", tuple(pc.value)) + + wait() + receive_asserts(4) + final_tag, final_payload = receive("final", RECEIVE_TIMEOUT) + try: + assert final_tag != TIMEOUT, ( + "timed out waiting for the final pinned behaviour" + ) + assert final_payload == (0, 1, 2, 3), ( + f"per-iteration capture of i broke: expected " + f"(0, 1, 2, 3), got {final_payload!r}" + ) + finally: + drain("final") + assert _core.main_pump_queue_depth() == 0 + + # shutdown_no_disown: refcount of pinned value preserved. + def test_shutdown_does_not_disown_pinned_value(self): + """The Python value inside a PinnedCown must outlive stop(). 
+ + Schedule a pinned behaviour that compares ``id(pc.value)`` with the + id captured at construction, appends to the value in place, and + checks the resulting contents. After ``wait()`` completes, both + asserts must have passed: an unchanged id and a visible in-place + mutation show the value was never disowned or round-tripped + through XIData. + """ + v = ["sentinel"] + pc = PinnedCown(v) + v_id = id(v) + + @when(pc) + def _body(pc): + # Value identity preserved across acquire/release. + send("assert", (id(pc.value), v_id)) + pc.value.append("post-acquire") + send("assert", (pc.value, ["sentinel", "post-acquire"])) + + wait() + receive_asserts(2) + + +class TestPinnedCownsManualPump: + """Public :func:`bocpy.pump` facade for event-loop integration. + + Script-mode users get the same drain via :func:`wait`; these + tests exist because the bounding arguments (``deadline_ms``, + ``max_behaviors``, ``raise_on_error``) and the + reentry/BaseException paths can only be exercised by an + explicit pump caller. + """ + + @classmethod + def teardown_class(cls): + wait() + + def test_pump_max_behaviors_caps_drain(self): + """``max_behaviors`` stops the drain at the requested bound.""" + pcs = [PinnedCown(i) for i in range(10)] + for pc in pcs: + @when(pc) + def _body(pc): + send("assert", ("ran", "ran")) + + assert _core.main_pump_queue_depth() == 10 + + result = pump(max_behaviors=3) + assert result.executed == 3 + assert result.raised == 0 + assert result.deadline_reached is False + assert _core.main_pump_queue_depth() == 7 + + rest = pump() + assert rest.executed == 7 + assert _core.main_pump_queue_depth() == 0 + + receive_asserts(10) + + def test_pump_deadline_caps_drain(self): + """``deadline_ms`` may trip before the queue drains. + + Tolerates a sub-1ms full drain on fast hardware: either the + deadline trips with a partial drain, or the whole queue + drains in time. Both paths verify the result-tuple + invariants.
+ """ + pcs = [PinnedCown(i) for i in range(50)] + for pc in pcs: + @when(pc) + def _body(pc): + send("assert", ("ran", "ran")) + + result = pump(deadline_ms=1) + assert result.raised == 0 + + if result.deadline_reached: + assert 0 < result.executed < 50 + remaining = 50 - result.executed + rest = pump() + assert rest.executed == remaining + else: + assert result.executed == 50 + + assert _core.main_pump_queue_depth() == 0 + receive_asserts(50) + + def test_pump_raise_on_error_re_raises(self): + """``raise_on_error`` re-raises the first body Exception. + + After re-raise the second behavior is still queued; a + follow-up unbounded pump drains it. + """ + pc1 = PinnedCown("payload1") + pc2 = PinnedCown("payload2") + + @when(pc1) + def _body1(pc): + raise ValueError(f"boom: {pc.value!r}") + + @when(pc2) + def _body2(pc): + raise ValueError(f"boom: {pc.value!r}") + + with pytest.raises(ValueError, match="boom"): + pump(raise_on_error=True) + + rest = pump() + assert rest.executed == 1 + + def test_pump_propagates_base_exception(self): + """:class:`BaseException` propagates out, cleanup still runs. + + After the first body's :class:`KeyboardInterrupt` re-raises, + the second behavior is still queued; a follow-up unbounded + pump drains it and surfaces its ``send``. + """ + pc1 = PinnedCown("payload1") + pc2 = PinnedCown("payload2") + + @when(pc1) + def _body1(pc): + raise KeyboardInterrupt("base-exc from pump body") + + @when(pc2) + def _body2(pc): + send("assert", ("survivor-ran", "survivor-ran")) + + assert _core.main_pump_queue_depth() == 2 + + with pytest.raises(KeyboardInterrupt, match="base-exc"): + pump() + + assert _core.main_pump_queue_depth() == 1 + + rest = pump() + assert rest.executed == 1 + receive_asserts(1) + + def test_pump_rejects_nested_call(self): + """``pump()`` from inside a pinned body raises ``RuntimeError``. 
+ + The body wraps the inner ``pump()`` in a try/except and ships + the captured type name + message-substring through the + standard ``"assert"`` tag, so the outer pump observes + ``raised == 0``. + """ + pc = PinnedCown("nest") + + @when(pc) + def _attempt(pc): + exc_type_name = "no-raise" + msg_says_reentrant = False + try: + pump() + except RuntimeError as ex: + exc_type_name = "RuntimeError" + msg_says_reentrant = "not reentrant" in str(ex) + send("assert", ( + (exc_type_name, msg_says_reentrant), + ("RuntimeError", True), + )) + + result = pump() + assert result.executed == 1 + assert result.raised == 0 + receive_asserts() + + +class TestPumpArgValidation: + """Type / bound validation in the :func:`pump` Python wrapper. + + Validates the contract: ``deadline_ms`` and ``max_behaviors`` + must be ``None`` or a positive :class:`int` (not :class:`bool`). + ``0`` is rejected outright: the caller's "skip if budget is + zero" intent belongs in a one-line ``if budget:`` guard at the + call site, not inside a short-circuit branch that would also + bypass the live-runtime check. + """ + + @classmethod + def teardown_class(cls): + wait() + + # Type rejection (incl. 0). + @pytest.mark.parametrize("bad", [0, -1, -1000, 1.5, "1", True, False]) + def test_pump_deadline_ms_rejects_bad_input(self, bad): + """Non-None / non-int / non-positive / bool ``deadline_ms`` raises.""" + with pytest.raises(TypeError, match="deadline_ms"): + pump(deadline_ms=bad) + + @pytest.mark.parametrize("bad", [0, -1, -1000, 1.5, "1", True, False]) + def test_pump_max_behaviors_rejects_bad_input(self, bad): + """Non-None / non-int / non-positive / bool ``max_behaviors`` raises.""" + with pytest.raises(TypeError, match="max_behaviors"): + pump(max_behaviors=bad) + + # The overflow cap is gated on the explicit ``ms=True`` + # kwarg, not a name-string heuristic. A non-ms bound named + # ``max_behaviors`` must NOT trip the cap even at huge values. 
+ def test_validator_non_ms_bound_not_capped(self): + """A non-ms bound passes through the validator without OverflowError.""" + huge = _behaviors._MAX_PUMP_MS * 1000 + 1 + assert ( + _behaviors._validate_pump_bound("max_behaviors", huge) + == huge + ) + + def test_validator_ms_bound_capped(self): + """An ms-flagged bound > _MAX_PUMP_MS raises OverflowError.""" + with pytest.raises(OverflowError, match="exceeds"): + _behaviors._validate_pump_bound( + "deadline_ms", + _behaviors._MAX_PUMP_MS + 1, + ms=True, + ) + + +class TestPumpRuntimeRequired: + """``pump()`` refuses to run without a live runtime. + + Previously the wrapper silently fell back to ``sys.modules['__main__']`` + when ``BEHAVIORS.export_module`` was unset, which let pinned behaviors + fail with cryptic per-iteration ``AttributeError``s on thunk lookup. + The new contract is a single loud :class:`RuntimeError` at + :func:`pump` entry naming the missing precondition. + """ + + def test_pump_before_start_raises_runtimeerror(self): + """Without a live ``BEHAVIORS``, :func:`pump` raises immediately.""" + # Ensure the runtime is fully torn down: a prior test in the + # session may have left BEHAVIORS populated. + assert _behaviors.BEHAVIORS is None, ( + "expected runtime to be stopped before this test; previous " + "test did not call wait() in teardown" + ) + + with pytest.raises(RuntimeError, match="bocpy.start"): + pump() + # Stillborn pump must not start the runtime as a side effect. + assert _behaviors.BEHAVIORS is None + + +# set_wait_pump_poll picked up mid-wait. 
+def test_set_wait_pump_poll_re_read(): + """``_WAIT_PUMP_POLL_MS`` is re-read on every auto-pump iteration.""" + set_wait_pump_poll(50) + assert _behaviors._WAIT_PUMP_POLL_MS == 50 + set_wait_pump_poll(5) + assert _behaviors._WAIT_PUMP_POLL_MS == 5 + # restore default + set_wait_pump_poll(50) + + +def test_set_wait_pump_poll_validation(): + """Reject zero, negative, non-int, and bool inputs.""" + with pytest.raises(TypeError): + set_wait_pump_poll(0) + with pytest.raises(TypeError): + set_wait_pump_poll(-1) + with pytest.raises(TypeError): + set_wait_pump_poll(1.5) + with pytest.raises(TypeError): + set_wait_pump_poll(True) + +# Sanity: the new C constants exist with the expected integer values. + + +def test_terminator_wake_reason_constants(): + assert _core.TERMINATED == 0 + assert _core.PUMP_READY == 1 + assert _core.WAIT_TIMED_OUT == 2 + + +# Sanity: terminator_wait_pumpable returns TERMINATED when no work is in flight. +def test_terminator_wait_pumpable_terminated_when_empty(): + # No outstanding behaviours: count must be 0 -> TERMINATED. + reason = _core.terminator_wait_pumpable(0.01) + assert reason == _core.TERMINATED + + +# Sanity: main_pump_drain_all on an empty queue returns 0 and is a no-op. +def test_main_pump_drain_all_empty(): + assert _core.main_pump_drain_all() == 0 + assert _core.main_pump_queue_depth() == 0 + gc.collect() + + +class TestPinnedRoundTrip: + """Pinned-cown handles round-trip through workers and noticeboards. + + The pinned *value* never crosses an interpreter boundary. The + pinned *handle* (the wrapper + capsule) does -- a worker that ends + up holding one can do exactly one useful thing with it: schedule + a pinned ``@when`` against it, which the runtime routes to the main + pump queue. These tests assert that handle round-trips and that the + routing decision lives in the capsule, not in the Python wrapper + class. 
+ """ + + @classmethod + def teardown_class(cls): + wait() + + def test_handle_round_trip_via_worker_closure(self): + """Worker receives a pinned handle via closure capture, schedules a pinned @when.""" + pc = PinnedCown([]) + unrelated = Cown(0) + + @when(unrelated) + def _ship(u): + send("assert", (_core.cown_is_pinned(pc.impl), True)) + + @when(pc) + def _on_main(pc): + pc.value.append("main-ran") + send("assert", (pc.value, ["main-ran"])) + + quiesce() + receive_asserts(2) + + def test_pinned_via_noticeboard_write(self): + """``notice_write("k", PinnedCown(x))`` round-trips to a worker reader.""" + start() + pc = PinnedCown([]) + notice_write("t5_pc", pc) + notice_sync() + + unrelated = Cown(0) + + @when(unrelated) + def _reader(u): + h = noticeboard()["t5_pc"] + send("assert", (_core.cown_is_pinned(h.impl), True)) + + @when(h) + def _on_main(h): + h.value.append("via-noticeboard") + send("assert", (h.value, ["via-noticeboard"])) + + wait() + receive_asserts(2) + + def test_pinned_list_via_noticeboard(self): + """A worker pulls handles out of a list payload and chains pinned @whens.""" + start() + pcs = [PinnedCown([]), PinnedCown([])] + notice_write("t6_pcs", pcs) + notice_sync() + + unrelated = Cown(0) + + @when(unrelated) + def _reader(u): + handles = noticeboard()["t6_pcs"] + send("assert", (len(handles), 2)) + for i, h in enumerate(handles): + send("assert", (_core.cown_is_pinned(h.impl), True)) + + @when(h) + def _on_main(h, i=i): + h.value.append(("chain", i)) + send("assert", (h.value, [("chain", i)])) + + wait() + # 1 length assert + 2 is_pinned asserts + 2 body asserts. 
+ receive_asserts(5) + + def test_pinned_nested_in_regular_cown_value(self): + """``Cown({"pc": PinnedCown(x), ...})`` -- worker extracts the inner handle.""" + pc = PinnedCown([]) + outer = Cown({"pc": pc, "tag": "wrap"}) + # Pass the expected literal in as a closure capture: the + # transpiler ships it via the captures tuple so the string + # arrives in the worker with its own ownership, sidestepping a + # 3.13 debug-build interned-string teardown bug that bites + # comparisons against literals round-tripped through + # ``o.value[...]``. + expected_tag = "wrap" + + @when(outer) + def _worker(o): + inner = o.value["pc"] + send("assert", (_core.cown_is_pinned(inner.impl), True)) + send("assert", (o.value["tag"], expected_tag)) + + @when(inner) + def _on_main(inner): + inner.value.append("from-nested") + send("assert", (inner.value, ["from-nested"])) + + quiesce() + receive_asserts(3) + + def test_two_workers_share_pinned_handle_via_noticeboard(self): + """Two workers each read the same pinned handle; both pinned bodies run on main.""" + start() + pc = PinnedCown([]) + notice_write("t16_pc", pc) + notice_sync() + + u1 = Cown(0) + u2 = Cown(0) + + @when(u1) + def _w1(u): + h = noticeboard()["t16_pc"] + send("assert", (_core.cown_is_pinned(h.impl), True)) + + @when(h) + def _body(h): + h.value.append("w1") + send("assert", (h.value[-1], "w1")) + + @when(u2) + def _w2(u): + h = noticeboard()["t16_pc"] + send("assert", (_core.cown_is_pinned(h.impl), True)) + + @when(h) + def _body(h): + h.value.append("w2") + send("assert", (h.value[-1], "w2")) + + wait() + # 2 is_pinned asserts + 2 body asserts. + receive_asserts(4) + + # Both workers mutated the *same* pinned value -- strong evidence + # that both handles resolved to the same underlying capsule. 
+ sentinel = PinnedCown(None) + + @when(sentinel) + def _inspect(_s): + content = sorted(pc.value) + send("assert", (content, ["w1", "w2"])) + + wait() + receive_asserts() + + def test_pinned_via_notice_update(self): + """``notice_update`` with a pinned producer; readers see the pinned handle.""" + start() + pc = PinnedCown([]) + notice_update("t16b_pc", partial(_replace_with, pc), default=None) + notice_sync() + + unrelated = Cown(0) + + @when(unrelated) + def _reader(u): + h = noticeboard()["t16b_pc"] + send("assert", (h is not None, True)) + send("assert", (_core.cown_is_pinned(h.impl), True)) + + wait() + receive_asserts(2) + + def test_body_raise_drains_queue(self): + """A raising pinned body marks its result cown and the queue still drains.""" + pc_raise = PinnedCown(0) + pc_ok = PinnedCown(0) + + @when(pc_raise) + def raiser(pc): + raise RuntimeError("planned-failure") + + @when(pc_ok) + def _survivor(pc): + send("assert", ("survived", "survived")) + + @when(raiser) + def _inspect(r): + send("assert", (r.exception, True)) + + quiesce() + receive_asserts(2) + assert _core.main_pump_queue_depth() == 0 + + # Mixed pinned/unpinned routing. 
+ @pytest.mark.parametrize("kind,expected_on_main", [ + (("p", "p"), True), + (("p", "u"), True), + (("u", "p"), True), + (("u", "u"), False), + (("p", "p", "u"), True), + (("u", "u", "u"), False), + ]) + def test_mixed_request_set_routes_to_main_iff_pinned( + self, kind, expected_on_main): + """Every request set containing a pinned cown routes to the main pump.""" + cowns = [PinnedCown(0) if k == "p" else Cown(0) for k in kind] + + if len(cowns) == 2: + a, b = cowns + + @when(a, b) + def _body(a, b, expected_on_main=expected_on_main): + send("assert", (_core.is_primary(), expected_on_main)) + else: + a, b, c = cowns + + @when(a, b, c) + def _body(a, b, c, expected_on_main=expected_on_main): + send("assert", (_core.is_primary(), expected_on_main)) + + quiesce() + receive_asserts() + + +class TestPinnedWatchdog: + """Pump-starvation watchdog: warn-side callback only. + + The watchdog samples at ``pump()`` entry. ``warn_ms`` invokes the + ``on_starve`` callback (or the default ``bocpy.pump`` logger) once + per non-empty epoch. It gates on the pinned queue's non-empty + time, so an unpinned-only window never trips it. + """ + + @classmethod + def teardown_class(cls): + wait() + + def teardown_method(self, method): + # Reset watchdog state so a leaked threshold cannot poison the + # next test. ``None`` disables the sampler. + set_pump_watchdog(warn_ms=None, on_starve=None) + + # Warn fires after starvation threshold. + def test_warn_only_fires_on_starvation(self): + """``warn_ms`` invokes on_starve once after the threshold elapses.""" + warns = [] + + def on_starve(severity, message): + warns.append((severity, str(message))) + + set_pump_watchdog(warn_ms=50, on_starve=on_starve) + + pc = PinnedCown(0) + + @when(pc) + def _body(pc): + send("assert", ("ran", "ran")) + + # Let the queue sit non-empty past warn_ms before the pump runs. + time.sleep(0.15) + # auto-pump drains the body; check_warn samples at pump entry + # and sees age > 50ms. 
+ quiesce() + receive_asserts() + + assert any(s == 0 for s, _ in warns), ( + f"expected warn (severity 0) in {warns!r}") + + # Unpinned-only window leaves warn untripped. + def test_unpinned_only_window_does_not_trip_watchdog(self): + """Watchdog gates on pinned-queue age, not on total work time.""" + warns = [] + + def on_starve(severity, message): + warns.append((severity, str(message))) + + set_pump_watchdog(warn_ms=20, on_starve=on_starve) + + c = Cown(0) + for _ in range(8): + @when(c) + def _busy(c): + # Per-behaviour sleep adds up to ~160 ms total worker + # time, well past warn_ms. The pinned queue remains + # empty throughout, so NONEMPTY_SINCE_NS stays 0. + time.sleep(0.02) + send("assert", ("ran", "ran")) + quiesce() + receive_asserts(8) + assert warns == [], ( + f"warn must not fire across unpinned-only window, got {warns!r}") + + # Now schedule a pinned @when. The pinned queue was empty + # across the unpinned window, so age = 0 < warn_ms. + pc = PinnedCown(0) + + @when(pc) + def _body(pc): + send("assert", ("pinned-ok", "pinned-ok")) + + quiesce() + receive_asserts() + + # Reconfigure-after-first-pinned. + def test_reconfigure_after_first_pinned(self): + """``set_pump_watchdog`` succeeds after live pinned work exists. + + This test pins the as-shipped contract: reconfiguration is + unconditional and the new thresholds take effect on subsequent + samples. + """ + set_pump_watchdog(warn_ms=20, on_starve=None) + + pc = PinnedCown(0) + + @when(pc) + def _body(pc): + send("assert", ("ran", "ran")) + + # Reconfigure mid-flight; must not raise. + warns = [] + + def on_starve(severity, message): + warns.append((severity, str(message))) + + set_pump_watchdog(warn_ms=200, on_starve=on_starve) + + quiesce() + receive_asserts() + # The replaced callback may or may not have fired depending on + # exact timing; either way no exception escapes quiesce(). 
+ + +class TestPumpWatchdogOverflow: + """ms-typed args reject inputs that would overflow ns scaling.""" + + @classmethod + def teardown_class(cls): + wait() + + def test_pump_ms_overflow_raises_overflowerror(self): + """``pump(deadline_ms=_MAX+1)`` raises :class:`OverflowError`.""" + too_big = _behaviors._MAX_PUMP_MS + 1 + with pytest.raises(OverflowError, match="deadline_ms"): + pump(deadline_ms=too_big) + + def test_set_pump_watchdog_ms_overflow(self): + """``set_pump_watchdog(warn_ms=_MAX+1)`` raises :class:`OverflowError`.""" + too_big = _behaviors._MAX_PUMP_MS + 1 + with pytest.raises(OverflowError, match="warn_ms"): + set_pump_watchdog(warn_ms=too_big) + # Restore defaults so we don't leak watchdog state into later tests. + set_pump_watchdog(warn_ms=1000, on_starve=None) + + +class TestSetPumpWatchdogValidation: + """Tighter validators -- `0` is rejected. + + `None` is the documented disable sentinel; `0` previously + slipped through the Python validator and silently turned the + sampler off in C. + """ + + @classmethod + def teardown_class(cls): + set_pump_watchdog(warn_ms=1000, on_starve=None) + wait() + + def test_zero_rejected(self): + """``warn_ms=0`` is no longer a silent-disable value.""" + with pytest.raises(TypeError, match="positive int or None"): + set_pump_watchdog(warn_ms=0) + + +class TestDrainErrorsSurviveBaseException: + """release_all failures are stashed before re-raising KI / SE.""" + + def test_drain_errors_preserved_on_keyboard_interrupt(self, monkeypatch): + """A KI mid-drain must not erase already-captured release_all errors. + + Fakes two orphan payloads: the first fails ``release_all`` with + a normal :class:`Exception` (which lands in the local ``errors`` + list); the second fails with :class:`KeyboardInterrupt` (which + defers the re-raise). Before the fix the deferred re-raise + skipped past the ``return errors, drained_count`` assignment + and the normal error was silently lost. 
+ """ + b = _behaviors.Behaviors(0) + + class _OkPayload: + def set_drop_exception(self, exc): + pass + + def release_all(self): + raise ValueError("release-fail-1") + + class _KIPayload: + def set_drop_exception(self, exc): + pass + + def release_all(self): + raise KeyboardInterrupt("from-drain") + + rounds = iter([[_OkPayload(), _KIPayload()], []]) + monkeypatch.setattr( + _core, "scheduler_drain_all_queues", lambda: next(rounds)) + monkeypatch.setattr(_core, "terminator_dec", lambda: 0) + + with pytest.raises(KeyboardInterrupt) as ei: + b._drain_orphan_behaviors() + + # Both failures survive: the ValueError that was already in + # the local list, and the KI that triggered the re-raise. + assert len(b._stop_drain_errors) == 2 + assert isinstance(b._stop_drain_errors[0], ValueError) + assert isinstance(b._stop_drain_errors[1], KeyboardInterrupt) + # The re-raised KI carries a note pointing at the stashed list. + notes = getattr(ei.value, "__notes__", []) or [] + assert any("2 release_all error" in n for n in notes), notes diff --git a/test/test_scheduling_stress.py b/test/test_scheduling_stress.py index ad24f27..4a9abd0 100644 --- a/test/test_scheduling_stress.py +++ b/test/test_scheduling_stress.py @@ -815,11 +815,14 @@ def _fake_terminator_dec(*args, **kwargs): assert behaviors is not None, ( "runtime must be alive for _drain_orphan_behaviors test" ) - errors = behaviors._drain_orphan_behaviors() + errors, drained_count = behaviors._drain_orphan_behaviors() assert errors == [], ( f"orphan drain reported unexpected errors: {errors!r}" ) + assert drained_count == 1, ( + f"expected exactly one capsule drained; got {drained_count}" + ) fake_capsule.set_drop_exception.assert_called_once() # The argument must be a RuntimeError carrying a stop() # diagnostic; the orphan drain UX contract requires the diff --git a/test/test_transpiler.py b/test/test_transpiler.py index 2f4007f..e728c7d 100644 --- a/test/test_transpiler.py +++ b/test/test_transpiler.py @@ -70,6 +70,19 @@ 
def helper(): return helper """) == set() + def test_except_as_name_excluded(self): + # ``except ... as X`` binds X via ``ExceptHandler.name`` (a + # plain identifier, not an ``ast.Name(Store)`` node). The + # finder must still treat it as local so a subsequent ``str(X)`` + # read is not classified as a capture. + assert "ex" not in self._captures("""\ + def f(): + try: + pass + except RuntimeError as ex: + return str(ex) + """, known_vars={"RuntimeError", "str"}) + class TestCapturedFreeVars: """Free variables that are not params, locals, or known are captured.""" @@ -908,3 +921,182 @@ def use_alias(x): assert "OD" not in info.captures, ( f"'OD' should not be captured; captures = {info.captures}" ) + + +# ── Defaults-as-captures (loop-snapshot idiom) ────────────────────────── + + +class TestDefaultsAsCaptures: + """``def b(c, i=i)`` and ``def b(c, x=y)`` hoist defaults to captures.""" + + @staticmethod + def _export(source, path="/tmp/test.py"): + tree = ast.parse(textwrap.dedent(source)) + return export_module(tree, path) + + def test_loop_snapshot_idiom(self): + """``def b(c, i=i)`` — capture ``i`` by name, strip the default.""" + result = self._export("""\ + from bocpy import when, whencall, Cown + + def run(c, i): + @when(c) + def b(c, i=i): + return i + """) + info = list(result.behaviors.values())[0] + assert info.captures == ["i"] + # Default must be stripped from the exported behavior. 
+ gen_tree = ast.parse(result.code) + for node in ast.walk(gen_tree): + if isinstance(node, ast.FunctionDef) and node.name.startswith("__behavior__"): + assert node.args.defaults == [], ( + "default for capture must be stripped from behavior signature" + ) + names = [a.arg for a in node.args.args] + assert names == ["c", "i"] + + def test_rename_default(self): + """``def b(c, x=y)`` — capture ``y``, bind into param ``x``.""" + result = self._export("""\ + from bocpy import when, whencall, Cown + + c = Cown(0) + y = 42 + @when(c) + def b(c, x=y): + return x + """) + info = list(result.behaviors.values())[0] + assert info.captures == ["y"] + gen_tree = ast.parse(result.code) + for node in ast.walk(gen_tree): + if isinstance(node, ast.FunctionDef) and node.name.startswith("__behavior__"): + names = [a.arg for a in node.args.args] + assert names == ["c", "x"] + assert node.args.defaults == [] + + def test_undefaulted_extra_captured_by_name(self): + """``def b(c, factor)`` — bare extra captured by its own name.""" + result = self._export("""\ + from bocpy import when, whencall, Cown + + c = Cown(0) + factor = 3 + @when(c) + def b(c, factor): + return factor + """) + info = list(result.behaviors.values())[0] + assert info.captures == ["factor"] + + def test_combined_default_and_body_capture(self): + """Defaults precede body free-vars in the captures list.""" + result = self._export("""\ + from bocpy import when, whencall, Cown + + def run(c, i, factor): + @when(c) + def b(c, i=i): + return i * factor + """) + info = list(result.behaviors.values())[0] + # Extras come first, then body captures. 
+ assert info.captures == ["i", "factor"] + + def test_non_name_default_rejected(self): + """Non-Name defaults cannot be hoisted — must be a bare name.""" + try: + self._export("""\ + from bocpy import when, whencall, Cown + + c = Cown(0) + @when(c) + def b(c, k=foo()): + return k + """) + except SyntaxError as e: + assert "must be a plain name" in str(e) + else: + raise AssertionError("expected SyntaxError for non-Name default") + + def test_default_on_cown_position_rejected(self): + """Defaults on cown positions are not allowed.""" + try: + self._export("""\ + from bocpy import when, whencall, Cown + + c = Cown(0) + @when(c) + def b(c=c): + return 1 + """) + except SyntaxError as e: + assert "cown positions" in str(e) + else: + raise AssertionError("expected SyntaxError for default on cown position") + + +# ── @when alias support ───────────────────────────────────────────────── + + +class TestWhenAlias: + """Aliased ``when`` decorators are detected and rewritten.""" + + @staticmethod + def _export(source, path="/tmp/test.py"): + tree = ast.parse(textwrap.dedent(source)) + return export_module(tree, path) + + def test_from_import_alias(self): + """``from bocpy import when as boc_when`` works end-to-end.""" + result = self._export("""\ + from bocpy import when as boc_when, whencall, Cown + + c = Cown(0) + @boc_when(c) + def b(c): + return c.value + """) + names = [info.name for info in result.behaviors.values()] + assert names == ["__behavior__0"] + # The aliased decorator must be stripped from the behavior. 
+ gen_tree = ast.parse(result.code) + for node in ast.walk(gen_tree): + if isinstance(node, ast.FunctionDef) and node.name.startswith("__behavior__"): + for dec in node.decorator_list: + assert "boc_when" not in ast.unparse(dec) + + def test_module_attr_decorator(self): + """``import bocpy`` + ``@bocpy.when(c)`` is recognized.""" + result = self._export("""\ + import bocpy + + c = bocpy.Cown(0) + @bocpy.when(c) + def b(c): + return c.value + """) + names = [info.name for info in result.behaviors.values()] + assert names == ["__behavior__0"] + # whencall must be auto-imported when only ``import bocpy`` is present. + assert "from bocpy import whencall" in result.code + gen_tree = ast.parse(result.code) + for node in ast.walk(gen_tree): + if isinstance(node, ast.FunctionDef) and node.name.startswith("__behavior__"): + for dec in node.decorator_list: + assert "bocpy.when" not in ast.unparse(dec) + + def test_module_alias_decorator(self): + """``import bocpy as boc`` + ``@boc.when(c)`` is recognized.""" + result = self._export("""\ + import bocpy as boc + + c = boc.Cown(0) + @boc.when(c) + def b(c): + return c.value + """) + names = [info.name for info in result.behaviors.values()] + assert names == ["__behavior__0"] + assert "from bocpy import whencall" in result.code