A policy-gated multi-agent orchestrator built on a real LangGraph StateGraph. It re-expresses a governance pattern (propose/decide/execute separation, fail-closed routing, and human-in-the-loop interrupts) as four nodes on a compiled graph: propose, govern, execute, and human_gate.
This is a clean-room reference implementation. No external "governed-swarm" source is imported. The propose step is deterministic and rule-based, so the demo runs with no API keys.

                   +---------------------+
                   |        START        |
                   +----------+----------+
                              |
                              v
                   +---------------------+
                   |       propose       | <-----+  (NEED_MORE_EVIDENCE loop, capped)
                   +----------+----------+       |
                              |                  |
                              v                  |
                   +---------------------+       |
                   |       govern        |-------+
                   +----------+----------+
                              |
               route(decision)|
         +--------------------+--------------------+-------------------+
         | ALLOW              | REQUIRE_HUMAN      | DENY              | NEED_MORE_EVIDENCE
         v                    v                    v                   (back to propose)
    +---------+        +-------------+          +-----+
    | execute |        | human_gate  |          | END |
    +----+----+        +------+------+          +-----+
         |                    | interrupt()
         v            approve | reject
      +-----+        +--------+--------+
      | END |        v                 v
      +-----+     execute             END


The same graph in Mermaid syntax:

    flowchart TD
        START([START]) --> propose
        propose --> govern
        govern -->|ALLOW| execute
        govern -->|REQUIRE_HUMAN| human_gate
        govern -->|DENY| END1([END])
        govern -->|NEED_MORE_EVIDENCE| propose
        human_gate -->|approve| execute
        human_gate -->|reject| END2([END])
        execute --> END3([END])

- propose builds a structured proposal (summary, declared effects, truth account) and a claimed receipt hash. Rule-based and deterministic.
- govern evaluates the proposal against policy and emits one Decision. It is the only node that decides.
- route is a pure conditional edge function mapping the decision to the next node.
- execute is reachable only on ALLOW or after human approval. It recomputes the receipt hash and records whether it matches.
- human_gate raises a LangGraph interrupt() and resumes via Command(resume=...).
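
As a rough illustration of the route bullet above, the following sketch shows what a pure, fail-closed routing function could look like. The state key `decision`, the `ROUTES` table, and the `END_NODE` constant are assumptions made for this example, not names taken from the project; in a real graph you would import `END` from `langgraph.graph` instead of using the stand-in string.

```python
# Stand-in for LangGraph's END sentinel so this sketch runs without LangGraph.
END_NODE = "__end__"

# Hypothetical routing table: decision label -> next node name.
ROUTES = {
    "ALLOW": "execute",
    "REQUIRE_HUMAN": "human_gate",
    "NEED_MORE_EVIDENCE": "propose",
    "DENY": END_NODE,
}


def route(state: dict) -> str:
    """Map the decision stored in state to the next node.

    Pure function: it reads state and returns a node name, nothing else.
    Anything missing or unrecognized falls through to END_NODE (fail-closed).
    """
    decision = state.get("decision")
    if not isinstance(decision, str):
        return END_NODE
    return ROUTES.get(decision, END_NODE)
```

Because the function has no side effects, it can be unit-tested without compiling a graph, for example by calling `route({"decision": "ALLOW"})` directly.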
govern checks conditions top to bottom and returns on the first match. Earlier (more restrictive) checks win, so anything unrecognized is denied rather than allowed.
| Order | Condition | Decision |
|---|---|---|
| 1 | Unknown / unrecognized effect | DENY |
| 2 | Request includes READ_SECRETS (always-deny) | DENY |
| 3 | Effects fall outside the agent's scope (unknown agent has no scope) | DENY |
| 4 | Autonomy budget exhausted (steps or tool calls) | DENY |
| 5 | Risk score above the agent's ceiling | REQUIRE_HUMAN |
| 6 | Toxic session topology (e.g. READ_REPO then NETWORK_CALL) | REQUIRE_HUMAN |
| 7 | Proposal insufficient, under the evidence-round cap | NEED_MORE_EVIDENCE |
| 8 | Proposal still insufficient after the cap | DENY |
| 9 | Within scope, under risk ceiling, sufficient evidence | ALLOW |
Fail-closed means: an unknown effect, an unknown agent, or any condition the policy does not explicitly permit resolves to DENY (or, for risk and topology concerns, escalates to a human) rather than silently proceeding. The NEED_MORE_EVIDENCE loop is capped (MAX_EVIDENCE_ROUNDS = 2) so it cannot spin forever; an insufficient proposal that survives the cap is denied.
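
The ladder can be sketched as a single function that returns on the first matching rule. This is a minimal sketch, not the project's policy code: the `Proposal` fields, the scope sets, the risk scores, and the ceilings below are all hypothetical values chosen so that the rule order is easy to follow. Only `MAX_EVIDENCE_ROUNDS = 2`, the effect names, and the rule order come from the text above.

```python
from dataclasses import dataclass, field

MAX_EVIDENCE_ROUNDS = 2  # cap stated in the text above

# Hypothetical policy data, invented for this sketch.
KNOWN_EFFECTS = {"READ_REPO", "RUN_TESTS", "CREATE_PR", "DEPLOY",
                 "NETWORK_CALL", "READ_SECRETS"}
ALWAYS_DENY = {"READ_SECRETS"}
EFFECT_RISK = {"READ_REPO": 1, "RUN_TESTS": 1, "NETWORK_CALL": 2,
               "CREATE_PR": 3, "DEPLOY": 5}
AGENT_SCOPE = {
    "Casey": {"READ_REPO", "RUN_TESTS"},
    "Jordan": {"READ_REPO", "CREATE_PR", "NETWORK_CALL"},
    "Riley": {"READ_REPO"},
}
RISK_CEILING = {"Casey": 2, "Jordan": 2, "Riley": 1}
TOXIC_SEQUENCES = [("READ_REPO", "NETWORK_CALL")]


@dataclass
class Proposal:
    agent: str
    effects: list[str]
    sufficient: bool = True          # does the proposal carry enough evidence?
    steps_left: int = 5              # remaining autonomy budget (steps)
    tool_calls_left: int = 5         # remaining autonomy budget (tool calls)
    evidence_rounds: int = 0         # evidence rounds already used
    prior_effects: list[str] = field(default_factory=list)


def is_toxic(history: list[str]) -> bool:
    """True if any toxic pair occurs in order anywhere in the history."""
    for first, then in TOXIC_SEQUENCES:
        if first in history and then in history[history.index(first) + 1:]:
            return True
    return False


def govern(p: Proposal) -> str:
    if any(e not in KNOWN_EFFECTS for e in p.effects):
        return "DENY"                                    # 1: unknown effect
    if ALWAYS_DENY & set(p.effects):
        return "DENY"                                    # 2: always-deny effect
    scope = AGENT_SCOPE.get(p.agent)
    if scope is None or not set(p.effects) <= scope:
        return "DENY"                                    # 3: out of scope / unknown agent
    if p.steps_left <= 0 or p.tool_calls_left <= 0:
        return "DENY"                                    # 4: budget exhausted
    risk = max((EFFECT_RISK[e] for e in p.effects), default=0)
    if risk > RISK_CEILING[p.agent]:
        return "REQUIRE_HUMAN"                           # 5: above risk ceiling
    if is_toxic(p.prior_effects + p.effects):
        return "REQUIRE_HUMAN"                           # 6: toxic topology
    if not p.sufficient:
        if p.evidence_rounds < MAX_EVIDENCE_ROUNDS:
            return "NEED_MORE_EVIDENCE"                  # 7: under the cap
        return "DENY"                                    # 8: cap reached
    return "ALLOW"                                       # 9: everything passed
```

Putting the deny rules first means a request that is both out of scope and high risk is denied outright instead of being escalated to a human, which matches the "earlier checks win" ordering of the table.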
| Governance concept | LangGraph construct |
|---|---|
| Propose / decide / execute separation | Distinct propose, govern, and execute nodes |
| Fail-closed routing | route() conditional edge with a default of END for unknown decisions |
| Single decision authority | Only govern emits a Decision |
| Human-in-the-loop gate | interrupt() in human_gate, resumed by Command(resume=...) |
| Evidence loop with a cap | Conditional edge back to propose, bounded by evidence_rounds |
| Durable pause/resume | MemorySaver checkpointer keyed by thread_id |
| Cumulative effect tracking | Annotated[list[str], add] reducer on cumulative_effects |
| Tamper-evident receipt | SHA-256 receipt hash claimed at propose, recomputed at execute |
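
To make the "cumulative effect tracking" row concrete, here is a plain-Python sketch of what a reducer annotation asks for: updates to an annotated key are combined with the existing value, while plain keys are overwritten. The `apply_update` helper and the `decision` field are hypothetical and exist only for this illustration. This is not how LangGraph is implemented internally, only an approximation of the merge behavior the annotation requests.

```python
from operator import add
from typing import Annotated, TypedDict, get_type_hints


class State(TypedDict):
    decision: str                                   # plain key: last write wins
    cumulative_effects: Annotated[list[str], add]   # reducer key: values are combined


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: merge a node's partial update into the state.

    Keys annotated with a reducer are combined via reducer(old, new);
    all other keys are simply replaced.
    """
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)
        else:
            merged[key] = value
    return merged


state = {"decision": "", "cumulative_effects": ["READ_REPO"]}
state2 = apply_update(state, {"decision": "ALLOW",
                              "cumulative_effects": ["RUN_TESTS"]})
```

With `operator.add` as the reducer, each node only returns the effects it adds, and the session-wide history accumulates in order. That ordered history is what a topology check such as "READ_REPO then NETWORK_CALL" would need to inspect.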
Install in editable mode:

    pip install -e ".[dev]"

Three demo paths through the ladder:

    # 1. Casey, in scope and under the risk ceiling -> ALLOW -> execute
    governed-langgraph "read the repo and run tests" --agent Casey --effects READ_REPO RUN_TESTS

    # 2. Riley requests DEPLOY, outside Riley's scope -> DENY (never reaches execute)
    governed-langgraph "ship to prod" --agent Riley --effects DEPLOY

    # 3. Jordan opens a PR, above Jordan's risk ceiling -> REQUIRE_HUMAN -> human gate
    #    Add --approve to approve the interrupt; omit it to reject.
    governed-langgraph "open a pull request" --agent Jordan --effects CREATE_PR --approve

Each run prints the final decision, the receipt evidence (claimed vs recomputed hash and whether it verified), and the full audit log.

You can also run the module directly: python -m governed_langgraph "..." --agent Casey.
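
The receipt evidence printed by each run (claimed hash, recomputed hash, verified flag) could be produced along the following lines. This is a sketch under assumptions: the function names, the proposal fields, and the choice of canonical JSON as the hashed representation are illustrative and may differ from what the project actually hashes. Only the use of SHA-256 and the claim-then-recompute flow come from the text.

```python
import hashlib
import json


def receipt_hash(proposal: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the proposal.

    sort_keys and fixed separators make the hash independent of key order
    and whitespace, so equal proposals always hash to the same value.
    """
    canonical = json.dumps(proposal, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_receipt(proposal: dict, claimed: str) -> dict:
    """Recompute the hash at execute time and compare it to the claimed one."""
    recomputed = receipt_hash(proposal)
    return {
        "claimed": claimed,
        "recomputed": recomputed,
        "verified": claimed == recomputed,
    }


# Propose step: build the proposal and claim a hash for it.
proposal = {"summary": "read the repo and run tests",
            "effects": ["READ_REPO", "RUN_TESTS"]}
claimed = receipt_hash(proposal)

# Execute step: an untouched proposal verifies, a modified one does not.
ok = verify_receipt(proposal, claimed)
tampered = verify_receipt({**proposal, "effects": ["READ_REPO", "DEPLOY"]}, claimed)
```

If the proposal changes between propose and execute, `verified` comes back `False`, which is the tamper-evidence property the receipt is meant to provide.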

Run the tests and the linter:

    pytest -q
    ruff check src tests

The suite covers the fail-closed ladder, scope enforcement, the human-in-the-loop pause/approve/reject paths, toxic-topology escalation, and the routing function.

This project exists to show a governance pattern expressed cleanly on LangGraph primitives. It is not a hardened, deployable service. The propose step is rule-based rather than model-backed, the policy table is illustrative, the "execution" step only verifies a receipt hash and performs no real side effects, and there is no authentication, persistence beyond an in-memory checkpointer, or operational tooling. Treat it as a readable example to learn from and adapt, not as production infrastructure.