Skip to content

feat(0.30.0): FindingSubject — typed grammar + parser + Zod boundary#61

Merged
drewstone merged 1 commit into
mainfrom
feat/finding-subject-enforcement
May 20, 2026
Merged

feat(0.30.0): FindingSubject — typed grammar + parser + Zod boundary#61
drewstone merged 1 commit into
mainfrom
feat/finding-subject-enforcement

Conversation

@drewstone
Copy link
Copy Markdown
Contributor

Summary

Closes the substrate gap that made every per-vertical `ImprovementAdapter` dead code: the analyst kinds' actor prompts documented a subject grammar (`agent-knowledge:wiki:`, `system-prompt:

`, etc.) but `RawAnalystFindingSchema.subject` was an unvalidated optional string. The LLM could emit prose like `subject: "fix the prompt"` and downstream `startsWith(...)` routing in consumers silently dropped it.

This PR makes the grammar load-bearing — every emitted subject is parsed at the schema boundary; non-conforming rows fail loud (logged + skipped) rather than being lifted with free-form text.

What changes

  • `src/analyst/finding-subject.ts` — discriminated-union `FindingSubject`, `parseFindingSubject(raw)` parser, `renderFindingSubject(s)` inverse, `FINDING_SUBJECT_GRAMMAR_PROMPT` constant kinds embed as the single source of truth. 14 variants cover every locus the substrate routes on (knowledge `{wiki,claim,raw,stale}`, prompt/tool/scaffolding surfaces, stale signals, cluster labels).
  • `KIND_EXPECTED_SUBJECTS` — per-kind allow-list. failure-mode emits ONLY `cluster`; knowledge-gap can't sneak in a `system-prompt:*` (the improvement-analyst's job); improvement can't emit stale signals.
  • `RawAnalystFindingSchema.subject` — Zod `.refine` that runs the parser at parse time.
  • `kind-factory.ts` — after `parseRawFinding`, the factory enforces the per-kind allow-list. Wrong-kind subjects are logged + counted in `rejected_wrong_subject`.
  • Existing `'tool:foo'` test fixtures updated to canonical `'tool-doc:foo'`.

Why this matters

Downstream substrate work in agent-runtime (`defineAgent` + manifest-driven `ImprovementAdapter`) and per-vertical wiring (tax / legal / gtm / creative / N future verticals) all narrow on `FindingSubject['kind']` instead of fragile prefix matching. No silent skips. No fabricated paths. No theater.

Test plan

  • `pnpm test` — 1196/1196 pass (38 new cases)
  • `pnpm typecheck`
  • Bumps npm + pypi to 0.30.0

Closes the substrate gap that turned every per-vertical ImprovementAdapter into dead code: the analyst kinds' actor prompts documented a subject grammar (`agent-knowledge:wiki:<slug>`, `system-prompt:<section>`, ...) but `subject` was an unvalidated `z.string().optional()` and the LLM could emit prose like `subject: "fix the prompt"` which downstream `startsWith(...)` routing silently dropped.

This PR makes the grammar load-bearing:

1. **`src/analyst/finding-subject.ts`** — discriminated-union `FindingSubject`, `parseFindingSubject(raw)` parser, `renderFindingSubject(s)` inverse, and a `FINDING_SUBJECT_GRAMMAR_PROMPT` constant kinds can embed as the single source of truth.

   Variants cover every locus the substrate routes on:
   - `agent-knowledge:{wiki,claim,raw,stale}:<locus>` → `KnowledgeAdapter`
   - `system-prompt`, `tool-doc`, `new-tool`, `rag`, `memory`, `scaffolding`, `output-schema` → `ImprovementAdapter`
   - `websearch.outdated`, `prior-run-summary` → stale signals
   - `cluster` → failure-mode-only free-form labels

   Slugs / tool ids are constrained to `[a-z0-9-]+`; topics / sections / keys allow free-form text trimmed.

2. **`KIND_EXPECTED_SUBJECTS`** — per-kind allow-list. failure-mode emits ONLY `cluster`; knowledge-gap can't sneak in a `system-prompt:*` (the improvement-analyst's job); improvement can't emit stale signals. Enforced at the kind factory boundary.

3. **`RawAnalystFindingSchema.subject`** — Zod `.refine` that runs the parser. Malformed subjects fail the row at Zod parse time with a clear log message instead of being silently lifted with a free-form string.

4. **`kind-factory.ts`** — after `parseRawFinding`, the factory checks the parsed subject against the kind's allow-list. Wrong-kind subjects (e.g. an improvement finding pointing at `cluster:foo`) are logged + counted in `rejected_wrong_subject` and excluded from `out`. Visible to operators in the `analyst.kind <id> done` log line.

5. **Tests**: 38 new cases on `parseFindingSubject` cover every variant (positive + malformed), boundary inputs (null / empty / whitespace / prose), round-trip via `renderFindingSubject`, and the `KIND_EXPECTED_SUBJECTS` truth table (failure-mode is the ONLY kind that emits cluster; improvement excludes stale signals; etc.). Updated the legacy `'tool:foo'` fixtures in `kinds.test.ts` to canonical `'tool-doc:foo'`.

Result: every downstream consumer (agent-runtime's `KnowledgeAdapter` / `ImprovementAdapter`, per-vertical wiring) can now narrow on `FindingSubject['kind']` instead of `startsWith('agent-knowledge:wiki:')` — no more silent skips, no more fabricated paths, no more theater.

Tests: 1196/1196 pass (38 new). Typecheck clean. Bumps to 0.30.0 (npm + pypi + python `__version__`).
@drewstone drewstone merged commit 29ca3d2 into main May 20, 2026
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant