Routing directives ("pins"): force a model from the prompt (opt-in) #9
Merged
59ad5dc to d70f246
Adds an opt-in routing layer that lets a request's prompt force which backend serves it, overriding the orchestrator/worker selection AND the Auto Router.

The motivating use case: an automated multi-agent workflow where each spawned sub-agent must land on a specific model by role (e.g. plan->opus, code->composer, review->codex, fix->claude). The workflow script bakes a role tag into each agent() prompt; the proxy hard-pins that request deterministically.

Marker tiers (most explicit first; a tier wins only if it resolves to exactly one configured backend, so naming two models is ambiguous -> ignored):
1. [[route:NAME]] sentinel (stripped before forwarding)
2. @name / use:NAME / route:NAME / model:NAME tag (stripped)
3. natural language ("have codex review it") fallback (UC_DIRECTIVES_NL)

NAME resolves via an alias table auto-derived from configured model ids + display names (composer/codex/opus/minimax/mimo... work with no setup), plus optional directives.aliases overrides. An optional planner routes plan-mode turns (detected structurally via ExitPlanMode) to a chosen model.

Also adds a first-class "claude-opus" route to config.example.json: a real-Claude Anthropic-passthrough pick with a clean id. The id is deliberately NOT "claude-opus-4-8" (which the workflow engine hardcodes for background traffic that the orchestrator/worker layer remaps), so "claude-opus" is recognized as a deliberate pick and a directive target ([[route:opus]]) without colliding with stock traffic. This makes the role-pipeline example runnable on a fresh config and gives real Opus as a distinct orchestrator alongside a different worker model, which include_stock_models alone can't provide.

Backward compatibility:
- OPT-IN, OFF by default. With no directives block (or enabled:false) and no UC_DIRECTIVES env, behavior is byte-for-byte unchanged: a no-op for existing setups until explicitly enabled.
- Safe-degrading: _directive_pin is wrapped in try/except and returns None on any error or when no marker is present; unknown/ambiguous/auto markers are ignored. A request is never broken.
- Core pipeline untouched (envelope, Auto Router, orchestrator/worker, openai_compat/codex/passthrough, image forwarding, /uc/select). The only pre-existing function changed is _last_user_text, refactored to delegate to _latest_user_turn with identical behavior (covered by existing tests).

Knobs: UC_DIRECTIVES=1/0 (force on/off), UC_DIRECTIVES_NL=0 (tags only), UC_DIRECTIVES_LOG=1 (log decisions).

Includes unit + dispatch tests (incl. the opt-in default-off guarantee), a docs/DIRECTIVES.md guide, a runnable plan->code->review->fix workflow at examples/role_pipeline_workflow.js, and a documented (disabled) config.example.json block. Full suite + doctor pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
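The marker tiers and the safe-degrading behavior described above can be sketched roughly as follows. This is an illustration, not the code in proxy.py: the alias table contents, the two regular expressions, and the helper name `resolve_tier` are all assumptions made for the example, and `directive_pin` only borrows its name from `_directive_pin`. The natural-language tier is left out, and the fall-through from an unresolved tier to the next one is an assumed reading of "a tier wins only if it resolves to exactly one configured backend".

```python
import re

# Hypothetical alias table (alias -> backend id). Only "claude-opus" is an id
# named in the commit message; the other two ids are made up for this sketch.
ALIASES = {
    "opus": "claude-opus",
    "codex": "example-codex",
    "composer": "example-composer",
}

# Tier 1: explicit sentinel, e.g. [[route:codex]].
SENTINEL = re.compile(r"\[\[route:([A-Za-z0-9][\w-]*)\]\]")
# Tier 2: @name / use:NAME / route:NAME / model:NAME. The lookbehind keeps
# e-mail addresses and the inside of a sentinel from matching (an assumption).
TAG = re.compile(r"(?<![\w@:\[])(?:@|use:|route:|model:)([A-Za-z0-9][\w-]*)")


def resolve_tier(names, aliases):
    """Return the one backend the names resolve to, or None if zero or several."""
    backends = {aliases[n.lower()] for n in names if n.lower() in aliases}
    return next(iter(backends)) if len(backends) == 1 else None


def directive_pin(text, aliases=ALIASES):
    """Return the pinned backend id, or None when nothing pins the request."""
    try:
        # Most explicit tier first; an unresolved tier falls through.
        for pattern in (SENTINEL, TAG):
            backend = resolve_tier(pattern.findall(text), aliases)
            if backend is not None:
                return backend
        return None
    except Exception:
        # Safe-degrading: any error means "no pin", never a broken request.
        return None
```

Under these assumptions, `directive_pin("[[route:opus]] then @codex")` resolves at the sentinel tier, while a prompt naming two different backends in the same tier, such as `"@codex and @composer"`, yields `None` and leaves the request to the normal routing flow.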
d70f246 to dee1a25
…, gpt docs, doctor

Found via an independent gpt-5.5 review of the directives feature.

- planner: gate the plan-mode auto-route on DIRECTIVES_ENABLED. It was applied whenever a planner was configured, even with directives.enabled:false / UC_DIRECTIVES=0, so "off" wasn't fully off. Now it's a true hard-off.
- strip: (a) _DIRECTIVE_TAG used a leading capturing boundary, so stripping a tag like "(@composer)" swallowed the "(" and left an orphan ")"; switched to a fixed-width negative lookbehind so the boundary char is preserved. (b) the strip globally collapsed runs of spaces/tabs, flattening code indentation in a pinned prompt; now it removes only the marker and trims trailing/edge whitespace.
- natural-language tier is now OPT-IN (UC_DIRECTIVES_NL defaults off). Prose that merely mentions a model after a trigger word ("does this work with Claude?") was silently rerouting; explicit sentinel/tag pins are unaffected.
- docs: the alias table claimed `gpt` -> claude-gpt-5.5-codex, but in the shipped example `gpt` is dropped as ambiguous (collides with the Ollama gpt-oss model head). Corrected to `codex`, and documented the collision.
- doctor.py: validate directive alias overrides + planner against real routes.
- tests: cover planner-gated-when-disabled, surgical strip (paren + indentation), NL opt-in on/off, and the gpt collision (the reduced fixture previously missed it).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
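To make the two strip fixes concrete, here is a small sketch contrasting a capturing boundary with a fixed-width negative lookbehind, and a global whitespace collapse with a surgical trim. The patterns and the function names `strip_naive` and `strip_surgical` are hypothetical stand-ins written for this example; they are not the project's `_DIRECTIVE_TAG` or its strip routine, and only the `@name` form is handled.

```python
import re

# Capturing boundary: the character before "@" is part of the match,
# so removing the match also removes that character.
TAG_CAPTURING = re.compile(r"(^|[^\w@])@(\w+)")
# Fixed-width negative lookbehind: the boundary is asserted, not consumed.
TAG_LOOKBEHIND = re.compile(r"(?<![\w@])@(\w+)")


def strip_naive(text):
    """Old behavior (as described): eats the boundary, collapses spaces/tabs."""
    stripped = TAG_CAPTURING.sub("", text)
    return re.sub(r"[ \t]+", " ", stripped).strip()


def strip_surgical(text):
    """New behavior (as described): remove only the marker, trim edges."""
    stripped = TAG_LOOKBEHIND.sub("", text)
    # Drop trailing whitespace per line; leading indentation is kept.
    lines = [line.rstrip() for line in stripped.split("\n")]
    return "\n".join(lines).strip()


prompt = "fix this (@composer)\n    return x"
print(repr(strip_naive(prompt)))     # 'fix this )\n return x'
print(repr(strip_surgical(prompt)))  # 'fix this ()\n    return x'
```

The naive variant shows both reported symptoms at once: the "(" disappears with the tag, and the four-space indent on the second line collapses to one space. The lookbehind variant leaves the parentheses and the indentation alone.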
What
Adds routing directives ("pins") — an opt-in layer that lets a request's prompt force which backend serves it, overriding the orchestrator/worker selection and the Auto Router.
The motivating use case: an automated multi-agent workflow where each spawned sub-agent must land on a specific model by role, e.g. opus plans → composer codes → codex adversarially reviews → claude fixes. The workflow script bakes a role tag into each `agent()` prompt; the proxy reads it, hard-pins that request, strips the tag, and forwards the rest. No tag → the normal routing flow decides.

How a request gets pinned
Scanned on the latest real user turn, most-explicit tier first; a tier wins only if it resolves to exactly one configured backend (two names = ambiguous → ignored):

1. Sentinel: `[[route:codex]]`
2. Tag: `@codex` · `use:codex` · `route:codex` · `model:codex`

Names auto-derive from configured model ids + display names (`composer`, `codex`, `opus`, `minimax`, `mimo`, … work with zero setup), with optional `directives.aliases` overrides. An optional `planner` routes plan-mode turns (detected structurally via `ExitPlanMode`) to a chosen model.

Backward compatibility (the important part)
- Opt-in, off by default. With no `directives` block (or `enabled: false`) and no `UC_DIRECTIVES` env, behavior is byte-for-byte unchanged; pulling this is a no-op until explicitly enabled.
- Safe-degrading. `_directive_pin` is wrapped in try/except and returns `None` on any error or when no marker is present; unknown/ambiguous/`auto` markers are ignored. A request is never broken.
- Core pipeline untouched. The envelope, Auto Router, orchestrator/worker, openai_compat/codex/passthrough, image forwarding, and `/uc/select` are all unchanged. The only pre-existing function modified is `_last_user_text`, refactored to delegate to a new `_latest_user_turn` with identical behavior (covered by existing tests).

Knobs
`UC_DIRECTIVES=1/0` (force on/off, overrides config) · `UC_DIRECTIVES_NL=0` (deterministic tags only) · `UC_DIRECTIVES_LOG=1` (log every decision).

Included
- `proxy.py`: parser, auto-derived alias table, hard-pin override, optional plan-mode planner.
- `test_proxy.py`: unit + dispatch tests, including the opt-in default-off guarantee.
- `docs/DIRECTIVES.md`: full guide.
- `examples/role_pipeline_workflow.js`: runnable `plan → code → review → fix` pipeline.
- `config.example.json`: documented `directives` block, shipped disabled.

Testing
Full self-test suite + `scripts/doctor.py` pass. Verified live through the proxy: tagged request overrides an explicit model; untagged request routes normally; ambiguous/unknown markers ignored; `UC_DIRECTIVES=0` fully restores prior behavior.

🤖 Generated with Claude Code
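As a sketch of the on/off precedence listed under Knobs (`UC_DIRECTIVES` overrides config, and the feature is off when neither enables it), one possible resolution looks like the function below. The function name `directives_enabled`, the shape of the config dict, and treating a `directives` block without an `enabled` key as off are all assumptions for illustration; proxy.py may resolve this differently.

```python
import os


def directives_enabled(config, env=None):
    """Decide whether directives are active for this process.

    Assumed precedence: UC_DIRECTIVES=1/0 wins over config; otherwise the
    config's directives.enabled flag decides; the default is off.
    """
    env = os.environ if env is None else env
    override = env.get("UC_DIRECTIVES")
    if override == "1":
        return True
    if override == "0":
        return False
    block = config.get("directives") or {}
    return bool(block.get("enabled", False))
```

With this reading, an empty config and an empty environment give `False`, which matches the default-off guarantee, and `UC_DIRECTIVES=0` switches the feature off even when the config enables it.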