AgentFlow

Your Kanban board builds your code.

The first AI development pipeline that uses your project management tool as the orchestration layer.
Full observability. Deterministic quality gates. Zero custom infrastructure.

Quick Start • Architecture • Getting Started • Gap Registry • Comparison

What is AgentFlow?

AgentFlow turns your existing Kanban board (Asana, GitHub Projects, Linear, Jira) into a fully autonomous AI development pipeline. Instead of building custom orchestration infrastructure, AgentFlow treats your project management tool as a distributed state machine — tasks move through stages, AI agents read and write state via comments, and humans intervene through the same UI they already use.

The result: Complete pipeline observability from your phone. Crash recovery for free (state lives in your PM tool, not in memory). Human override at any point by dragging a card.

Why AgentFlow?

Most AI coding tools either give you a chatbot or a black-box agent. AgentFlow gives you a visible, auditable pipeline where:

Every decision is a comment on a task card
Every stage transition is a card moving between columns
Every retry carries accumulated context from previous attempts
Every cost is tracked per-task with automatic guardrails
Every failure pattern is captured and fed back into the system

You don't need to trust a black box. Open your Kanban board and watch the pipeline work.

Key Features

Pipeline Orchestration

7-stage Kanban pipeline: Backlog → Research → Build → Review → Test → Integrate → Done
Stateless orchestrator: One-shot sweep via crontab — no daemon, no session dependency, crash-proof
Transitive priority dispatch: Tasks that unblock the most work get built first (automatic critical path)
Conflict-aware scheduling: Parallel tasks touching the same files are serialized automatically

Quality Gates

Deterministic before probabilistic: tsc + eslint + tests run as hard gates before any AI review — catches ~60% of issues at near-zero cost
Adversarial AI review: Reviewers must "list 3 things wrong before deciding to pass"
Coverage gate: 80% threshold on new files before promotion to Test stage
Integration testing: Full suite runs on main after every merge — auto-reverts on failure

Observability & Cost Tracking

Full pipeline observability from your phone — every task card shows current stage, assigned agent, retry count, and accumulated cost
Per-task cost tracking with stage cost ceilings (Sonnet default: Research ~$0.10, Build ~$0.40, Review ~$0.10, Test ~$0.05, Integrate ~$0.03)
Automatic cost guardrails: Warning at $3/$8, hard stop at $10/$20 (Sonnet/Opus) with human escalation
Real-time status dashboard pinned to each project
Heartbeat monitoring: Dead agents detected and reassigned within 10 minutes

System Learning

Feedback loops with accumulated context: Every retry carries what was tried, what failed, and what to do differently
System-level retrospectives: Every 10 completed tasks, common failure patterns are extracted to LEARNINGS.md
Cross-task learning: Builders and reviewers read LEARNINGS.md before starting work
Spec drift detection: SHA-256 hash comparison catches requirement changes mid-sprint

Safety & Recovery

Auto-revert on integration failure: git revert (new commit, never force-push)
Graceful shutdown: Active workers finish, unstarted tasks return to backlog
Blocked task detection: After 2 failed attempts, tasks escalate to human review
Scope creep detection: PR diff files compared against predicted files list
Secret management: Mock values in tests, environment variables in code, manual verification flags for real API tasks

Architecture

                    ┌─────────────────────────────────────────────┐
                    │           Your Kanban Board (Asana)          │
                    │                                             │
  Crontab runs      │  Backlog → Research → Build → Review → ...  │
  every 15 min      │     ↑                            │          │
       │            │     │     Needs Human ←──────────┘          │
       ▼            │     │         (drag card to intervene)      │
  ┌──────────┐      └─────┼───────────────────────────────────────┘
  │Orchestrate│           │
  │ (sweep)   │───reads───┘
  └──────────┘
       │            ┌──────────┐  ┌──────────┐  ┌──────────┐
       └─dispatches─│ Worker T2 │  │ Worker T3 │  │ Worker T4 │ ...
                    │ (Build)   │  │ (Build)   │  │ (Review)  │
                    └──────────┘  └──────────┘  └──────────┘
                         │              │              │
                         └──── all state flows through Kanban ────┘

Core principle: The Kanban board IS the orchestration layer. No separate database, no message queue, no custom infrastructure. State lives where humans already look.

Full architecture docs →

Quick Start

Prerequisites

Claude Code (CLI)
An Asana workspace with MCP integration (or GitHub Projects — adapter coming soon)
Git + Node.js project

Installation

# Clone the repo
git clone https://github.com/UrRhb/agentflow.git

# Copy skills and prompts to your Claude Code config
cp -r agentflow/skills/* ~/.claude/skills/
cp -r agentflow/prompts/* ~/.claude/sdlc/prompts/
cp agentflow/conventions.md ~/.claude/sdlc/conventions.md

Setup

1. Create your spec

Write a SPEC.md for your project — or use Claude to brainstorm one:

You: I want to build [your idea]
Claude: [brainstorms → produces SPEC.md]

2. Decompose into tasks

You: /spec-to-asana
Claude: Reading SPEC.md... Decomposing into atomic tasks...
        Created 14 tasks across 3 sub-phases in Asana.
        Dependencies mapped. Ready to build.

3. Start workers

Open 3-4 terminal windows, each as a worker slot:

# Terminal 2
claude -p "/sdlc-worker --slot T2"

# Terminal 3
claude -p "/sdlc-worker --slot T3"

# Terminal 4 (reviewer)
claude -p "/sdlc-worker --slot T4"

# Terminal 5 (tester)
claude -p "/sdlc-worker --slot T5"

4. Start the orchestrator

# Add to crontab (runs every 15 minutes)
crontab -e
# Add: */15 * * * * ~/.claude/sdlc/agentflow-cron.sh >> /tmp/agentflow-orchestrate.log 2>&1

5. Watch from your phone

Open Asana on your phone. Watch tasks flow through the pipeline. Drag any card to "Needs Human" to intervene.

Stop the pipeline

You: /sdlc-stop
Claude: Graceful shutdown initiated. Active workers finishing...
        3 tasks returned to Backlog. System paused.

Detailed getting started guide →

How It Works

The 7-Stage Pipeline

Stage	What Happens	Gate
Backlog	Task waiting for dependencies + available slot	Dependencies resolved, no file conflicts
Research	Conditional — only runs if task has research triggers	Structured findings posted
Build	AI writes code, creates PR	`tsc + eslint + npm test` (deterministic)
Review	Different AI agent reviews adversarially	Must list 3 issues before passing
Test	Full test suite + visual validation + coverage check	80% coverage on new files
Integrate	Merge to main, run full suite	All tests pass on main
Done	Task complete	—

Task Lifecycle

[SLOT:--] [STAGE:Backlog] [RETRY:0] [COST:~$0]
    │
    ├── Orchestrator assigns slot → [SLOT:T2] [STAGE:Build]
    │
    ├── Worker T2 builds → posts [BUILD:COMPLETE] with PR link
    │
    ├── Lint gate: tsc + eslint + tests → [LINT:PASS]
    │
    ├── Worker T4 reviews → finds 3 issues → all acceptable → [REVIEW:PASS]
    │
    ├── Coverage gate → [COV:PASS]
    │
    ├── Worker T5 tests → [TEST:PASS] → merges PR
    │
    ├── Integration check on main → [INTEGRATE:PASS]
    │
    └── [STAGE:Done] [COST:~$5.75]

What Happens When Things Fail

[REVIEW:REJECT] "SQL injection in user input handler"
    │
    ├── Retry counter increments → [RETRY:1]
    ├── Accumulated context posted (what was tried, what failed, what to do differently)
    ├── Slot cleared → different worker assigned on retry 2+
    ├── Task moves back to Build
    │
    └── If cost exceeds hard stop threshold → [COST:CRITICAL] → moves to "Needs Human"

Comparison

Feature	AgentFlow	GSD	Superpowers	Aperant
Orchestration layer	Your Kanban board (Asana/Linear/Jira)	CLI waves	CLAUDE.md prompts	Electron app
Pipeline observability	Full (phone, web, desktop)	Terminal only	File-based	Desktop app
Deterministic quality gates	tsc + lint + tests before AI review	None	None	None
Per-task cost tracking	Built-in with guardrails	None	None	None
Adversarial review	Different agent, must find 3 issues	Same agent	Same agent	Same agent
Integration testing	Auto-revert on main breakage	None	None	None
System-level learning	LEARNINGS.md retrospectives	None	None	None
Crash recovery	Free (state in PM tool)	Restart from scratch	Re-read files	Restart app
Human intervention	Drag a card	Kill process	Edit files	Click button
Spec drift detection	SHA-256 hash comparison	None	None	None
Multi-project support	Native (portfolio view)	Single project	Single project	Single project
Parallel agents	4+ workers with conflict detection	Wave-based	Sequential	Sequential
Custom infrastructure	None (uses existing PM tool)	CLI tool	Markdown files	Electron + SQLite
Adapter ecosystem	Asana, GitHub Projects (planned), Linear (planned)	GitHub only	Any git repo	Local only

When to Use What

AgentFlow: You want full pipeline observability, deterministic quality gates, cost tracking, and the ability to monitor/intervene from your phone. Best for teams and solo devs running multiple projects.
AgentFlow + Superpowers: You want the best of both — AgentFlow orchestrates across tasks, Superpowers optimizes each worker's methodology. See integration guide below.
GSD: You want a simple CLI tool for wave-based task execution. Good for quick prototyping.
Superpowers: You want a methodology-as-prompt approach with minimal setup. Good for single-project focus.
Aperant: You want a desktop GUI for agent management. Good for visual workflow preference.

Superpowers Integration

AgentFlow and Superpowers operate at different layers and are designed to stack:

┌───────────────────────────────────────────────────────────┐
│  OUTER LOOP — AgentFlow                                    │
│  "Which task should which agent work on, and when?"        │
│  Kanban board • dispatch • transitions • cost gates        │
│                                                            │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│  │  Worker T2   │  │  Worker T3   │  │  Worker T4   │       │
│  │  INNER LOOP  │  │  INNER LOOP  │  │  INNER LOOP  │       │
│  │  Superpowers │  │  Superpowers │  │  Superpowers │       │
│  │  brainstorm  │  │  brainstorm  │  │  code-review │       │
│  │  → plan      │  │  → plan      │  │  + adversary │       │
│  │  → execute   │  │  → execute   │  │    rules     │       │
│  │  → verify    │  │  → verify    │  │              │       │
│  └─────────────┘  └─────────────┘  └─────────────┘        │
└───────────────────────────────────────────────────────────┘

AgentFlow decides: "Task APP-007 goes to Worker T2 now" Superpowers decides: "Inside T2, I'll brainstorm → plan → execute with sub-agents → verify"

What Each Layer Controls

Concern	AgentFlow (outer)	Superpowers (inner)
Task assignment	Which worker gets which task	—
Build methodology	Lifecycle markers + heartbeats	brainstorm → plan → execute → verify
Parallelism	Across tasks (T2 builds one, T3 builds another)	Within a task (sub-agents write code in parallel)
Quality gates	Deterministic (tsc/lint/test) + adversarial review	Structured review methodology
Debugging	Retry context + worker rotation	Systematic debugging methodology
Cost tracking	Per-task with guardrails	—

Complexity Gating

Not every task needs Superpowers' full methodology. AgentFlow gates by task complexity:

Complexity	Superpowers Methodology	Why
S (Simple, <30min)	Skip brainstorm + plan. Direct build.	Overkill adds ~$0.50-1.00 for zero quality gain
M (Medium, <1hr)	Skip brainstorm. Use plan → execute.	Planning helps, brainstorming doesn't
L (Complex, <2hr)	Full: brainstorm → plan → execute → verify	Worth the investment on complex tasks

Integration Gaps (24-31)

Stacking two systems creates 8 new failure modes. All are addressed in AgentFlow's design:

#	Gap	Fix
24	Context window war (both systems load large prompts)	Lazy-load: only load Superpowers prompts matching task complexity
25	Two captains (conflicting workflow control)	AgentFlow owns lifecycle, Superpowers owns methodology
26	Sub-agents skip heartbeats (false dead-worker detection)	Parent worker posts heartbeats independently of sub-agents
27	Plan exceeds task scope (Superpowers plans freely)	Feed predicted files + acceptance criteria as hard constraints
28	Double cost tracking (sub-agents add hidden cost)	Adjusted ceilings: S=$3, M=$5, L=$8 when Superpowers active
29	Retry context fragmentation (sub-agent failures lost)	Parent aggregates all sub-agent outputs before posting
30	Brainstorm overkill on simple tasks	Complexity gating (table above)
31	Conflicting review standards	AgentFlow adversarial rules override; Superpowers provides methodology

Full details for all 45 gaps →

Adapters

AgentFlow uses an adapter pattern to support multiple project management tools. Each adapter implements the same interface for reading/writing pipeline state.

Adapter	Status	Notes
Asana	Available	Full MCP integration, recommended for production
GitHub Projects	Planned	Free alternative, community priority
Linear	Planned	For teams already on Linear
Jira	Planned	Enterprise support
Notion	Planned	For Notion-native teams

Want to build an adapter? See CONTRIBUTING.md.

The 45-Gap Registry

AgentFlow was designed by systematically identifying and closing 45 gaps in AI development pipelines — 23 for the core system, 8 for Superpowers integration, and 14 from production audit findings. Each gap represents a failure mode that existing tools don't address.

#	Gap	Fix
1	AI review has shared blindness with AI builder	Deterministic gates (tsc/lint/test) before AI review
2	No decomposition quality standard	9-field rubric with validation
3	No integration testing after merge	7th stage: full suite on main, auto-revert on failure
4	Unconditional research wastes time/money	Trigger-based: only research when task needs external knowledge
5	No cost visibility	Per-task tracking with stage ceilings and automatic guardrails
6	Parallel tasks can conflict on shared files	Predicted files comparison, automatic serialization
7	Same mistakes repeated across tasks	LEARNINGS.md retrospective every 10 tasks
8	Dead agents block the pipeline	Heartbeat every 5 min, reassign after 10 min timeout
9	AI reviews are too lenient	Adversarial prompt: "list 3 things wrong before deciding to pass"
10	Integration failures leave main broken	Auto-revert via `git revert` (new commit, never force-push)
11	Asana custom fields are fragile	All metadata in description headers, parsed with regex
12	Session-based scheduling dies with session	Stateless orchestrator + real crontab (most critical gap)
13	No visibility into what agents are doing	Comment-thread-as-memory: every action is a tagged comment
14	Runaway costs on stuck tasks	Cost ceilings per stage, warning at $3/$8, hard stop at $10/$20 (Sonnet/Opus)
15	Uncontrolled external API usage	Source priority: codebase → docs → web → GitHub (opt-in only)
16	Circular dependencies in task graph	Topological sort validation during decomposition
17	PRs that exceed task scope	Diff files vs predicted files → `[SCOPE:WARNING]`
18	Impossible tasks retry forever	After 2 failures: evaluate → `[BUILD:BLOCKED]` escalation
19	Secrets leaked in code	Mock values in tests, env vars in code, `[NEEDS:MANUAL_VERIFY]`
20	Wrong task gets built first	Transitive priority: count downstream blocked tasks
21	No dashboard for pipeline status	Pinned Status task updated every sweep
22	No clean shutdown mechanism	`/sdlc-stop` drains workers, returns unstarted to backlog
23	Spec changes mid-sprint go unnoticed	SHA-256 hash comparison, `[SPEC:CHANGED]` flag
	Superpowers Integration Gaps
24	Context window war (stacked prompts)	Lazy-load prompts by task complexity
25	Two captains (conflicting workflow control)	AgentFlow owns lifecycle, Superpowers owns methodology
26	Sub-agents skip heartbeats	Parent worker posts heartbeats independently
27	Plan exceeds task scope	Predicted files + acceptance criteria as hard constraints
28	Double cost tracking (hidden sub-agent cost)	Adjusted ceilings: S=$3, M=$5, L=$8 with Superpowers
29	Retry context fragmentation	Parent aggregates all sub-agent outputs
30	Brainstorm overkill on simple tasks	Complexity gating: S=skip, M=plan only, L=full
31	Conflicting review standards	AgentFlow adversarial rules override Superpowers
	Audit Finding Gaps
32	No worktree cleanup	Worktree removed on Done transition
33	Prompt version skew	Version field in conventions.md, re-read per task
34	No merge lock	`[MERGE_LOCK]` on Status task, 10 min timeout
35	Sub-agent git conflicts	Non-overlapping file sets; sequential fallback
36	No orchestrator health monitoring	`[LAST_SWEEP]` timestamp + external watchdog
37	Comment thread pollution	Read only last 10-20 comments per task
38	Dual sweep collision	`[SWEEP:RUNNING]` mutual exclusion lock
39	Adversarial review ping-pong	PASS WITH NOTES for minor-only issues
40	Prompt injection	Input sanitization check at stage entry
41	LEARNINGS.md context bomb	50-line cap with oldest-first rotation
42	Git revert can fail	`[INTEGRATE:REVERT_FAILED]` → Needs Human
43	Crontab environment/auth failure	Wrapper script sources shell environment
44	Cost ceilings assume wrong model	Dual cost profiles (Sonnet/Opus)
45	Orchestrator cost	Idle sweep optimization, doubles interval when idle

Full gap registry with details →

Project Structure

agentflow/
├── skills/                    # Claude Code skill files (copy to ~/.claude/skills/)
│   ├── spec-to-asana.md       # Decompose spec → Kanban tasks
│   ├── sdlc-worker.md         # Execute pipeline stages
│   ├── sdlc-orchestrate.md    # Stateless orchestration sweep
│   └── sdlc-stop.md           # Graceful shutdown
├── prompts/                   # Stage-specific prompt templates
│   ├── decompose.md           # Spec → atomic tasks
│   ├── research.md            # Conditional research stage
│   ├── build.md               # Build with lint gate
│   ├── review.md              # Adversarial review
│   └── test.md                # Test + integration
├── conventions.md             # System conventions and protocols
├── adapters/                  # PM tool adapters
│   ├── asana/                 # Asana MCP adapter (available)
│   └── github-projects/       # GitHub Projects adapter (planned)
├── docs/                      # Documentation
│   ├── architecture.md        # System architecture deep-dive
│   ├── getting-started.md     # Step-by-step setup guide
│   ├── gap-registry.md        # All 45 gaps with full details
│   └── comparison.md          # Detailed competitive analysis
├── examples/                  # Example specs and configurations
│   └── starter-spec.md        # Template SPEC.md to get started
├── CONTRIBUTING.md            # How to contribute
├── LICENSE                    # MIT
└── README.md                  # You are here

Contributing

We welcome contributions! The highest-impact areas:

GitHub Projects adapter — makes AgentFlow free to use (no Asana required)
Linear adapter — popular with dev teams
Stage prompt improvements — better review/test prompts
Documentation — tutorials, examples, translations

See CONTRIBUTING.md for guidelines.

License

MIT License. See LICENSE.

Automation Levels

AgentFlow operates on a spectrum from semi-automated to fully autonomous:

Level	What You Do	What AgentFlow Does
Manual	Write SPEC.md	—
Semi-automated	Run `/spec-to-asana`, open worker terminals, set crontab	Decomposes spec, creates board, validates tasks
Autonomous	Watch from your phone, handle "Needs Human" cards	Everything else — dispatch, build, review, test, merge, revert, retry, learn

What's autonomous today

Orchestrator sweeps (crontab-driven, no human in the loop)
Task dispatch with transitive priority and conflict detection
Build → lint gate → review → coverage gate → test → merge → integration
Feedback loops with accumulated context and worker rotation
Cost tracking with automatic guardrails (warning at $3/$8, hard stop at $10/$20 per Sonnet/Opus profile)
Dead worker detection and reassignment
System-level learning (LEARNINGS.md retrospectives)
Auto-revert on integration failure
Spec drift detection
Graceful shutdown

What's still manual

Starting worker terminals (you open 3-4 iTerm tabs)
Writing the initial SPEC.md
Handling "Needs Human" cards (blocked tasks, cost-critical, spec changes)
Adjusting crontab frequency

Roadmap

Auto-spawn workers: Orchestrator detects empty slots and spawns worker sessions automatically via claude -p "/sdlc-worker --slot T<N>"
GitHub Projects adapter: Free alternative to Asana — no paid PM tool required
Linear adapter: For teams already on Linear
Web dashboard: Real-time pipeline visualization beyond the Kanban board
Multi-language support: Python, Go, Rust project conventions (currently Node.js/TypeScript focused)
Slack/Discord notifications: Pipeline events pushed to team channels
Cost analytics: Historical cost data, trends, and optimization suggestions

Acknowledgments

Built with Claude Code by Anthropic
Inspired by the gaps in existing AI development tools
Designed through systematic CTO-level review (45 gaps identified and addressed)

AgentFlow — Your Kanban board builds your code.
_{Autonomous AI development pipeline with full observability, deterministic quality gates, and cost tracking.}

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.github/ISSUE_TEMPLATE		.github/ISSUE_TEMPLATE
adapters		adapters
bin		bin
docs		docs
examples		examples
prompts		prompts
skills		skills
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
conventions.md		conventions.md
setup.sh		setup.sh

Folders and files

Latest commit

History

Repository files navigation

AgentFlow

What is AgentFlow?

Why AgentFlow?

Key Features

Pipeline Orchestration

Quality Gates

Observability & Cost Tracking

System Learning

Safety & Recovery

Architecture

Quick Start

Prerequisites

Installation

Setup

Stop the pipeline

How It Works

The 7-Stage Pipeline

Task Lifecycle

What Happens When Things Fail

Comparison

When to Use What

Superpowers Integration

What Each Layer Controls

Complexity Gating

Integration Gaps (24-31)

Adapters

The 45-Gap Registry

Project Structure

Contributing

License

Automation Levels

What's autonomous today

What's still manual

Roadmap

Acknowledgments

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages