- Status: active
- Source of truth:
package.json,docs/examples/synapse-network/README.md,docs/playbooks/index.md - Verified with:
npm run build,npm run test:unit,npm run validate:docs,npm run validate:synapse-example - Last verified: 2026-03-27
Explain how another project can adopt the Spec2Flow workflow using repository config, adapter-backed agents, deterministic execution, and collaboration tooling.
This guide is for teams that already have an application repository and want to use Spec2Flow as their requirements-to-execution workflow.
Typical adopters:
- product teams with web applications
- solo developers who want repeatable smoke validation
- QA or engineering teams that want issue-ready defect feedback
Before adopting Spec2Flow, an external project should already have:
- a Git repository
- markdown-based requirements or design docs
- a runnable local application or test environment
- a GitHub repository with Actions enabled
- a willingness to review generated tasks, tests, and bug drafts
If the external project does not already have standardized docs, startup scripts, or workflow configs, Spec2Flow should first run an environment preparation step to generate the minimum onboarding templates.
An external project uses Spec2Flow as an orchestrated workflow. At a high level:
The project supplies:
- product requirements
- design notes
- repository context
- startup commands
- target URLs or environments
Spec2Flow generates:
- requirement summaries
- implementation tasks
- test plans
- test cases
- execution reports
- bug drafts
The project uses repository-native validation commands and Playwright when browser validation is needed.
The project uses GitHub Actions and GitHub Issues to publish run status, artifacts, and defect follow-up.
The execution boundary should stay explicit:
- Spec2Flow orchestrates runs, task graphs, execution state, and artifacts
- an external adapter maps one claimed task into a task-scoped agent runtime
- deterministic executors handle startup, validation, and evidence capture
The minimal runtime loop in an adopting repository is:
- generate
task-graph.jsonfrom requirement text, requirement file, or change scope - initialize
execution-state.json - claim the next ready
taskId - send the emitted claim payload to the external adapter
- apply deterministic edits or command execution
- submit the task result back into Spec2Flow state
- repeat until the run is completed, blocked, or failed
For early adoption or integration testing, simulate-model-run can stand in for a real provider adapter so the team can validate controller behavior before wiring Copilot or another API.
Once that works, the next step is to add a model-adapter-runtime.json file and point run-task-with-adapter or run-workflow-loop --adapter-runtime ... at a real external adapter command.
For controller-safe stages, Spec2Flow now also has a deterministic path. run-deterministic-task can execute a claimed environment-preparation or automated-execution task directly, run the declared verification commands, and write schema-backed report artifacts without going through an external provider.
For automated-execution, that deterministic path now also understands topology-backed execution hardening:
- it can bring required entry services to a healthy state before command execution
- it can run route-scoped browser checks and attach HTML or optional Playwright evidence
- it writes an
execution-evidence-indexartifact that catalogs service, command, and browser evidence in one place - it enforces an execution lifecycle policy with timeout and managed-service teardown for services started by Spec2Flow
Route topology can now also declare an optional executionPolicy block with:
maxDurationSecondsteardownPolicyteardownTimeoutSeconds
That policy is routed into the automated-execution task input so deterministic execution can stop long-running commands and clean up only the services it started.
The repository's self-dogfood runtime for that path lives at .spec2flow/runtime/model-adapter-runtime.deterministic.json.
If one workflow needs both deterministic and provider-backed stages, keep one top-level runtime file and use adapterRuntime.stageRuntimeRefs to delegate specific stages. The repository's .spec2flow/runtime/model-adapter-runtime.json now does exactly that: it keeps Copilot CLI as the default runtime while routing environment-preparation and automated-execution to .spec2flow/runtime/model-adapter-runtime.deterministic.json.
Runtime configuration details now live in docs/runtime-config-reference.md.
Use that reference for:
- top-level
adapterRuntimefield meanings - bundled Copilot runtime defaults
- session reuse and persistence overrides
- environment variable semantics
- controller-injected internal wiring versus real user-facing override points
For requirement-driven execution, the most direct entrypoint is:
npm run spec2flow -- generate-task-graph \
--project .spec2flow/project.yaml \
--topology .spec2flow/topology.yaml \
--risk .spec2flow/policies/risk.yaml \
--requirement "Describe the feature request here" \
--output .spec2flow/task-graph.requirement.jsonWhen --requirement or --requirement-file is provided, Spec2Flow selects matching workflow routes first and only falls back to all routes if no route signals match the request.
That adapter command can be a thin wrapper around Copilot CLI, OpenAI, Azure OpenAI, Claude, or an internal agent platform. Spec2Flow does not need to know those provider details as long as the command returns one normalized JSON result.
The bundled example adapter in this repository now uses GitHub Copilot CLI via gh copilot -p.
To use it:
- install Copilot CLI or ensure
gh copilotcan download and run it - run
gh copilot login - verify auth with
gh auth status - optionally set
adapterRuntime.modelinmodel-adapter-runtime.jsonto pin a model that your Copilot CLI account can actually use
If you do not set adapterRuntime.model, the adapter will let Copilot CLI choose its default model. This is the safest default because model availability differs by account and plan.
Run a preflight before the first workflow execution:
npm run spec2flow -- preflight-copilot-cli \
--adapter-runtime docs/examples/synapse-network/model-adapter-runtime.jsonThe preflight confirms command availability, authentication, and that gh copilot -p can complete a minimal JSON probe with the configured model or the account default.
For Copilot-backed adapter runs, Spec2Flow executes that preflight automatically before run-task-with-adapter and run-workflow-loop.
Use --skip-preflight only when you intentionally want to bypass the guardrail, for example while debugging the adapter itself.
Spec2Flow still keeps workflow state in execution-state.json, not inside any provider chat history. The adapter may reuse provider sessions, but the session is never the workflow truth source.
The bundled Copilot adapter can optionally reuse Copilot CLI sessions with --resume when the runtime supplies a stable session key.
The bundled runtime now defaults to specialistSessionKey, which scopes sessions by specialist agent name such as requirements-agent, implementation-agent, or defect-agent. That keeps useful continuity for one responsibility without creating a new session every time a new workflow run starts.
The bundled adapter uses cleanup-safe session persistence by default. No extra runtime setting is required:
- stable single-role keys such as
requirements-agentare persisted and reused - dynamic multi-part keys such as
runId + route + executorTypeare canonicalized back to the stable specialist session by default, so older runtime configs do not keep spawning new task sessions - successful Copilot preflight checks are cached for a short window, so repeated runs against the same runtime do not keep creating identical probe sessions
- old dynamic session files can be consolidated with
npm run migrate:copilot-sessions
Leave the default unchanged unless you have a concrete isolation requirement. Override the session key only when the default specialist-scoped reuse is not the behavior you want.
If you do need an override, the available session scopes and env semantics are documented in docs/runtime-config-reference.md.
After that, run-workflow-loop can act as the first autonomous controller loop for local rehearsals, integration tests, and eventually provider-backed execution. For deterministic stages, the same loop can now point at a runtime that dispatches into run-deterministic-task instead of an external provider adapter.
In practical terms, one feature request becomes one workflow run identified by runId, and that run contains many taskId values.
Spec2Flow now includes the first two PostgreSQL-backed platform slices:
- Phase 1: runtime persistence bootstrap
- Phase 2: scheduler primitives for lease, heartbeat, expiry recovery, and run-state reads
The PostgreSQL commands are:
migrate-platform-db: apply the platform schema migrationsinit-platform-run: persist one repository, one run, its tasks, initial events, and task-graph artifact metadata into PostgreSQLlease-next-platform-task: atomically lease one ready task for one workerheartbeat-platform-task: renew a task lease for the owning workerstart-platform-task: transition one leased task intoin-progressexpire-platform-leases: requeue or block stale leased work after timeoutget-platform-run-state: read one DB-backed run snapshot with tasks, events, and artifactsregister-platform-project: upsert one project registry entry with workspace policy and branch defaultsserve-platform-control-plane: start the first HTTP backend for the future web control planeGET /api/projects: list registered projects with repository and workspace metadataPOST /api/projects: register one project before run intakeGET /api/runs/:runId/tasks/:taskId/artifact-catalog: read one task-scoped execution artifact catalog from the control planeGET /artifacts/:objectKey: serve one local-fs artifact through the control plane whenartifactStore.publicBaseUrlpoints at/artifacts/run-platform-worker-task: materialize one leased DB-backed task into a worker claim, execute it, and persist the result back to PostgreSQLrun-platform-requirements-worker: stage-locked wrapper forrequirements-analysisrun-platform-implementation-worker: stage-locked wrapper forcode-implementationrun-platform-test-design-worker: stage-locked wrapper fortest-designrun-platform-execution-worker: stage-locked wrapper forautomated-executionrun-platform-defect-worker: stage-locked wrapper fordefect-feedbackrun-platform-collaboration-worker: stage-locked wrapper forcollaborationweb:dev: start the frontend shell for the PostgreSQL-backed web control planeweb:build: build the frontend shell for static hosting or preview validation
Example bootstrap flow:
npm run migrate:platform-db -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform
npm run init:platform-run -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--task-graph docs/examples/synapse-network/generated/task-graph.json \
--repository-id spec2flow \
--repository-name Spec2Flow \
--repo-root .
npm run lease:platform-task -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--run-id spec2flow-platform-phase2 \
--worker-id worker-1
npm run heartbeat:platform-task -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--run-id spec2flow-platform-phase2 \
--task-id environment-preparation \
--worker-id worker-1
npm run start:platform-task -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--run-id spec2flow-platform-phase2 \
--task-id environment-preparation \
--worker-id worker-1
npm run expire:platform-leases -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--run-id spec2flow-platform-phase2
npm run get:platform-run-state -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--run-id spec2flow-platform-phase2
npm run spec2flow -- register-platform-project \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--repo-root . \
--project .spec2flow/project.yaml \
--topology .spec2flow/topology.yaml \
--risk .spec2flow/policies/risk.yaml \
--project-name Spec2Flow \
--workspace-root . \
--branch-prefix spec2flow/ \
--allowed-read-globs "**/*" \
--allowed-write-globs "src/**,docs/**"
When `register-platform-project` runs from one Spec2Flow workspace against another repository through `--repo-root`, it now writes `model-adapter-runtime.json` into the current control-plane workspace under `.spec2flow/runtime/` and persists that absolute runtime path into the project adapter profile. The runtime scaffold is no longer dropped into the target repository root by default in that flow.
npm run serve:platform-control-plane -- \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--host 127.0.0.1 \
--port 4310
npm run web:dev
npm run spec2flow -- run-platform-worker-task \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--run-id spec2flow-platform-phase3 \
--task-id environment-preparation \
--worker-id worker-1
npm run spec2flow -- run-platform-requirements-worker \
--database-url postgresql://synapse:12345678@127.0.0.1:5432/synapse_gateway \
--database-schema spec2flow_platform \
--run-id spec2flow-platform-phase3 \
--task-id frontend-smoke--requirements-analysis \
--worker-id worker-1 \
--adapter-runtime docs/examples/synapse-network/model-adapter-runtime.jsonThis persistence layer does not replace task-graph.json or execution-state.json yet.
It establishes the first shared runtime truth for:
repositoriesrunstaskstask_attemptsartifactseventsreview_gatespublications
Phase 2 adds scheduler-safe task runtime semantics on top of that schema:
tasks.statuscan now move throughready,leased,in-progress,blocked,completed, and retry-oriented recovery states- lease ownership is stored in PostgreSQL through
leased_by_worker_id,lease_expires_at, andlast_heartbeat_at - stale leases can be recovered without editing JSON files by running
expire-platform-leases get-platform-run-stateexposes the DB-backed run snapshot needed for future workers and web control-plane surfacesserve-platform-control-planeexposes the first backend HTTP surface for health checks, run submission, run lists, run detail, task detail, observability snapshots, and task-level operator actions for retry and approval- run lists, run detail, and DB-backed run snapshots now surface
project,workspace,branch, andworktreemetadata explicitly instead of burying them in raw run request payloads register-platform-projectandPOST /api/projectslet operators register projects and workspace policy before creating runsPOST /api/runsacceptsrepositoryRootPathplus optionalprojectId,projectName,workspaceRootPath,branchPrefix,worktreeRootPath,worktreeMode,workspacePolicy,projectPath,topologyPath,riskPath,requirement,requirementPath,changedFiles, and repository metadata overrides; onboarding files default to.spec2flow/project.yaml,.spec2flow/topology.yaml, and.spec2flow/policies/risk.yamlrelative torepositoryRootPathPOST /api/projectsacceptsrepositoryRootPathplus optionalprojectId,projectName,workspaceRootPath,projectPath,topologyPath,riskPath,repositoryId,repositoryName,defaultBranch,branchPrefix, andworkspacePolicy- run submission now provisions a project-scoped workspace binding, writes a run-scoped
task-graph.jsoninto the run worktree, and persists bothprojectsandrun_workspacesmetadata alongside the platform run - DB-backed worker claims now carry
workspacecontext and default deterministic or adapter execution into the run worktree instead of the controller repo cwd packages/webnow contains the first frontend shell for that API surface, andnpm run web:devexpects the backend to be available athttp://127.0.0.1:4310unlessVITE_CONTROL_PLANE_BASE_URLoverrides it
Phase 3 adds the first DB-backed worker runtime harness:
- one leased task is materialized into a standard
TaskClaimPayloadso existing adapter and deterministic executors can be reused - the worker harness writes a local claim file and a materialized
execution-state.jsonunder.spec2flow/runtime/platform-workers/ - after the run finishes, task status changes, promoted downstream tasks, artifacts, and worker events are persisted back into PostgreSQL
- the worker harness now runs its own background heartbeat loop, renews leases during async execution, and stops work when lease ownership is lost or heartbeat failures cross the configured threshold
- deterministic execution remains the default only for
environment-preparationandautomated-executionwhen no--adapter-runtimeis supplied - non-deterministic stages require
--adapter-runtime
Phase 3 also introduced platform-auto-runner-service. When serve-platform-control-plane starts, it also starts an embedded background scheduler that continuously polls the database for pending and running runs. This means:
- you do not need to run
run:platform-worker-taskmanually for normal operation environment-preparationandautomated-executiontasks are dispatched and completed in-process without any external adapter- AI-dependent stages are dispatched automatically to the configured
model-adapter-runtime.jsonin the target project's repository root - tasks that require an adapter but do not have one configured are marked
blockedwith a clear reason (no-adapter-configured) so the run pauses at the first AI stage rather than silently stalling
For how to configure an adapter runtime and unblock AI stages, see docs/ops-startup-guide.md — Configuring an AI Adapter.
The default stop semantics for run-platform-worker-task are:
- stop immediately when PostgreSQL rejects a heartbeat because the task is no longer owned, no longer leased, or the lease already expired
- stop and persist a blocked task receipt after three consecutive heartbeat transport failures unless
--heartbeat-error-thresholdoverrides that budget - stop heartbeating as soon as execution finishes and hand the final result back through the normal persistence contract
Phase 4 now adds the first explicit auto-repair policy layer:
- risk-policy rules can declare
maxAutoRepairAttempts,maxExecutionRetries,allowAutoCommit, andblockedRiskLevels - route task
reviewPolicynow carries those fields into the controller path packages/cli/src/runtime/auto-repair-policy-service.tsnow owns downstream rerun invalidation and repair escalation decisions afterdefect-feedbackcompletes- when the defect summary recommends a repairable action and the repair budget still allows it, the owning stage is requeued and downstream route tasks are invalidated back to
pending - when repair is blocked by policy, blocked risk level, or exhausted budget, the route escalates into
collaborationwith explicit escalation notes instead of silently stalling - PostgreSQL-backed runs now persist repair attempts in
repair_attemptsand attach repair lifecycle events, including escalation, to the run history throughpackages/cli/src/platform/platform-auto-repair-service.ts
Phase 5 now upgrades collaboration from a handoff-only stage into a policy-gated publish controller:
packages/cli/src/runtime/collaboration-publication-service.tsreads the completedcollaboration-handoffand decides whether the route can auto-publish or must stop at a manual gate- when
reviewPolicy.allowAutoCommitis enabled and approval is not required, the controller creates a scopedspec2flow/...branch, stages the implementation-summary file set, and creates a deterministic git commit - when auto-commit is blocked or approval is still required, the controller writes a
publication-recordplus optionalpr-draftartifact and moves the route toblockedso an operator can intervene through an explicit publication action approve-publicationnow replays publication through the governed runtime path, including branch push, pull-request creation throughgh, and merge orchestration request when the target repository allows it- PostgreSQL-backed runs now persist those publication outcomes in
publicationsand emit publication events throughpackages/cli/src/platform/platform-publication-service.ts
Another repository does not need to copy the entire Spec2Flow repository structure.
The minimal recommended layout is:
your-project/
├─ docs/
│ ├─ requirements/
│ ├─ design/
│ └─ testing/
├─ playwright/
│ ├─ tests/
│ └─ playwright.config.ts
├─ scripts/
│ ├─ start-service.sh
│ ├─ run-smoke.sh
│ └─ collect-artifacts.sh
├─ .github/
│ ├─ ISSUE_TEMPLATE/
│ └─ workflows/
└─ spec2flow/
├─ outputs/
│ ├─ requirements/
│ ├─ test-plans/
│ ├─ execution/
│ └─ bugs/
└─ config/
└─ project.yaml
Each adopting project should define a small configuration file that answers these questions:
- where the requirement docs live
- how to start the application
- what base URL to test
- where Playwright artifacts should be stored
- which smoke flows are critical
- which labels should be applied to bug issues
A practical configuration shape can include:
project:
name: sample-app
requirements:
paths:
- docs/requirements
- docs/design
runtime:
install: npm install
start: ./scripts/start-service.sh
baseUrl: http://localhost:3000
testing:
smokeSuite: playwright/tests/smoke
artifactsDir: spec2flow/outputs/execution
reporting:
bugDraftDir: spec2flow/outputs/bugs
issueLabels:
- type:defect
- area:executionRoute-scoped execution can also declare an artifact store for deterministic execution evidence:
Local-first default:
topology:
workflowRoutes:
- name: frontend-smoke
entryServices: [frontend]
verifyCommands:
- ./scripts/run-smoke.sh
artifactStore:
mode: local
provider: local-fs
publicBaseUrl: http://127.0.0.1:4310/artifacts/
keyPrefix: frontend-smoke/This is the recommended V1 shape for local deployments. It keeps repository-local writes as the source of truth, makes the local provider explicit, and lets the task's execution-artifact-catalog carry stable retrieval metadata.
When the control plane is running on http://127.0.0.1:4310, that publicBaseUrl now maps to a real built-in route:
http://127.0.0.1:4310/artifacts/<objectKey>
That means Web artifact panels can open catalog-backed local evidence without introducing MinIO or a remote object store.
Remote-catalog upgrade path:
topology:
workflowRoutes:
- name: frontend-smoke
entryServices: [frontend]
verifyCommands:
- ./scripts/run-smoke.sh
artifactStore:
mode: remote-catalog
provider: generic-http
publicBaseUrl: https://artifacts.example.com/spec2flow/
keyPrefix: frontend-smoke/
upload:
endpointTemplate: https://uploads.example.com/spec2flow/{objectKey}
method: PUT
authTokenEnv: SPEC2FLOW_ARTIFACT_TOKENFor mode: local, provider: local-fs is the explicit V1 provider and publicBaseUrl is optional. If it is present and points at serve-platform-control-plane, control-plane artifact views can show a stable local retrieval URL and the backend can serve the file bytes directly from /artifacts/<objectKey>.
For mode: remote-catalog, publicBaseUrl defines retrieval links that end up in the task's execution-artifact-catalog. upload.endpointTemplate turns remote-catalog from metadata-only into a real upload lifecycle, and the control plane can now read that catalog back through GET /api/runs/:runId/tasks/:taskId/artifact-catalog.
- scan the repository structure
- detect existing docs, startup commands, tests, and CI
- generate the minimum
.spec2flow/templates - produce a gap report for missing inputs that still need human confirmation
- keep requirements and design docs in markdown
- make sure the application can start with one clear command
- identify the critical smoke flow to automate first
- confirm project topology
- confirm service startup commands
- confirm risk policy and workflow definitions
- create
scripts/start-service.sh - create
scripts/run-smoke.sh - decide where screenshots, traces, and logs will be stored
- install Playwright in the target repository
- configure one smoke test for the most important user path
- confirm failures collect evidence consistently
- add a workflow to install dependencies
- add a workflow to start the app
- add a workflow to run Playwright
- upload artifacts on failure
- add a bug issue template
- define labels for severity, area, and type
- define who reviews draft defects before issue publication
- summarize a requirement with Copilot
- derive implementation tasks
- generate test design
- run Playwright locally
- run the same smoke path in CI
- convert failures into issue-ready bug drafts
- Create or update a GitHub Issue with the requirement.
- Use an adapter-backed agent to summarize the requirement and impacted modules.
- Use an adapter-backed agent to generate implementation tasks and validation scope.
- Implement the change.
- Generate or update the smoke and regression cases.
- Run Playwright locally.
- Open a pull request with validation notes.
- Let GitHub Actions rerun the smoke path.
- Reproduce the issue locally or in CI.
- Capture screenshot, trace, logs, and failing assertion.
- Convert evidence into a structured bug draft.
- Review the draft.
- Publish it to GitHub Issues.
- Link the issue to the failing run and the fixing PR.
- Identify the changed feature area.
- Regenerate risk-based test scope if needed.
- Run smoke coverage first.
- Expand into focused regression only where risk justifies it.
- writes or curates requirements
- reviews summaries and assumptions
- uses Copilot for implementation planning and code changes
- runs Playwright locally before PR updates
- checks code, validation notes, and artifacts
- confirms failures are turned into proper defect records
- keeps GitHub Actions, templates, and workflow conventions healthy
When using Spec2Flow in another project, keep these rules:
- do not skip requirement summarization for non-trivial work
- do not skip environment preparation when repository conventions are missing
- do not treat generated bug drafts as auto-publishable truth
- keep local and CI execution paths aligned
- automate only the flows that are stable enough to maintain
- keep outputs in a repository-visible location
An external project is successfully using Spec2Flow when:
- at least one critical requirement can be summarized and traced to implementation tasks
- at least one smoke flow runs with Playwright locally and in CI
- failed runs produce usable evidence
- bug drafts can be reviewed and published to GitHub Issues
Fix:
- automate one stable smoke path first
Fix:
- normalize local startup into a single script entrypoint
Fix:
- require trace, screenshot, and exact expected versus actual result
Fix:
- reuse the same scripts and environment assumptions in both places
Fix:
- link requirements, PRs, CI runs, and issues explicitly
For an external team, the first realistic milestone is:
- choose one feature area
- summarize one requirement
- generate one implementation task list
- automate one smoke flow with Playwright
- run it in GitHub Actions
- prove one failed run can become a GitHub Issue draft
Once that works, expand gradually instead of trying to automate the whole product at once.