ph0smet · ph0smet · May 16, 2026 · May 16, 2026
diff --git a/README.md b/README.md
@@ -2,19 +2,36 @@
 
 [![CI](https://github.com/blindhacker99/code-reason/actions/workflows/ci.yml/badge.svg)](https://github.com/blindhacker99/code-reason/actions/workflows/ci.yml)
 
-**code-reason** is an MCP server that gives coding agents real program-analysis reasoning — so they verify data flow, trace evidence chains, and navigate call graphs from ground truth, not guesswork.
+> *Instead of grep-and-guess, provide your agents program analysis capabilities with code-reason.*
+
+**code-reason** is an MCP server that gives coding agents real program-analysis primitives — data-flow reachability, call-graph traversal, evidence-chain construction — so they verify code behavior from ground truth instead of speculation.
+
+## Why code-reason
 
 Coding agents are good at reading code, but they struggle with whole-program questions:
 
 - *Does user-controlled input actually reach this SQL call, or does sanitization cut the flow off?*
 - *Who really invokes this function across the codebase?*
 - *What is the complete evidence chain from source to sink?*
 
-These are program-analysis questions. Answering them with grep-and-guess, or with ever-larger context windows, scales poorly and produces false confidence. They are better answered with a code property graph and a few well-chosen graph queries — exposed as MCP tools the agent drives directly.
+Without a code-analysis tool, the agent answers these by **manual grep-based tracing.** It can read code, but it can't actually trace data flow, walk a control graph, or verify that user input reaches a sink. Ask any modern coding agent how it traces taint without a tool, and the answer is some variant of *"I read files and follow string matches."*
+
+That works for simple cases. The cracks show on anything non-trivial — aliased variables, inter-procedural flow, sanitization checks, framework-injected inputs. The agent still produces an answer, often with high confidence, but it's reading 6+ files to confirm a single chain, burning context on speculation, and silently missing flows it never thought to grep for. For security-sensitive work, a confident wrong answer is more dangerous than no answer at all — and that's exactly what grep-based tracing produces at scale.
+
+code-reason closes the gap. The agent stays in charge of *what's interesting*; code-reason answers *what's actually true* — backed by a code property graph parsed once and queried cheaply. The result: agents that are more deterministic, more token-efficient, and faster — and agentic security workflows you can actually trust.
+
+Where traditional SAST tools produce findings reports for humans to triage, code-reason exposes the underlying analysis primitives for an agent to drive its own investigation.
+
+Built on [Fraunhofer AISEC's Code Property Graph](https://github.com/Fraunhofer-AISEC/cpg) for multi-language data-flow, control-flow, and taint analysis, and the [Kotlin MCP SDK](https://github.com/modelcontextprotocol/kotlin-sdk) to expose those capabilities as agent-callable tools.
+
+## What you get
 
-**Tech stack.** code-reason uses [Fraunhofer AISEC's Code Property Graph](https://github.com/Fraunhofer-AISEC/cpg) for multi-language data-flow, control-flow, and taint analysis, and the [Kotlin MCP SDK](https://github.com/modelcontextprotocol/kotlin-sdk) to expose those capabilities as agent-callable tools.
+From dogfood sessions on real Java and Python codebases:
 
-**Long-term vision.** Make program-analysis primitives a first-class capability for coding agents — sharpening vulnerability detection precision, eliminating tokens wasted on speculative grep-and-guess exploration, and turning vulnerability hunting into a directed, evidence-driven workflow.
+- **30-40% fewer agent tokens** on multi-step security reviews, mostly from call-graph queries that would otherwise take 5-10 grep iterations to confirm by hand.
+- **Compact structured answers, not file dumps.** A call-graph query returns reachable methods in JSON; the grep-and-read equivalent forces the agent to read 6+ files to confirm one chain.
+- **One analysis pass per service, unlimited queries.** `reason_analyze_project` builds the CPG once; every other `reason_*` tool queries it cheaply.
+- **Real evidence chains.** `reason_trace_taint_path` returns the full source-to-sink path with intermediate steps and code context — not "looks like SQLi maybe."
 
 ## How it works
 
@@ -49,15 +66,17 @@ code-reason exposes nine MCP tools, grouped by purpose:
 
 | Group | Tool | Purpose |
 |---|---|---|
-| Analysis | `reason_analyze_project` | Parse a project into a CPG |
-| Discovery | `reason_find_entry_points` | Locate HTTP handlers and CLI entry points |
-| Discovery | `reason_list_supported_checks` | Enumerate built-in vulnerability checks |
-| Graph | `reason_find_callers` | "Who calls this function?" |
-| Graph | `reason_find_callees` | "What does this function call?" |
-| Graph | `reason_query_dataflow` | Forward/backward reachability over the data-flow graph |
-| Vulnerability | `reason_scan_injections` | Catalog-driven taint analysis (SQLi, XSS, command injection) |
-| Vulnerability | `reason_trace_taint_path` | Full source-to-sink evidence chain for a finding |
-| Vulnerability | `reason_get_finding_detail` | Complete finding description and remediation |
+| Setup | `reason_analyze_project` | Parse a project into a code property graph |
+| Navigation | `reason_find_entry_points` | Locate HTTP handlers, CLI entries, framework hooks |
+| Navigation | `reason_find_callers` | "Who calls this function?" |
+| Navigation | `reason_find_callees` | "What does this function call?" |
+| Data flow | `reason_query_dataflow` | Forward/backward reachability over the data-flow graph |
+| Data flow | `reason_trace_taint_path` | Full source-to-sink evidence chain between any two points |
+| Catalog scan (convenience) | `reason_scan_injections` | Catalog-driven taint analysis (SQLi/XSS/command injection) |
+| Catalog scan (convenience) | `reason_list_supported_checks` | Enumerate built-in vulnerability checks |
+| Catalog scan (convenience) | `reason_get_finding_detail` | Description + remediation for a scan finding |
+
+The marquee value is the **navigation** and **data flow** primitives — the agent composes them to answer whole-program questions on its own. The **catalog scan** tools are a convenience baseline for quick first-pass triage; the agent's own reasoning over the primitives is what makes the difference on real codebases.
 
 ## Prerequisites
 
@@ -79,7 +98,7 @@ The launcher lands at `build/install/code-reason/bin/code-reason`.
 ./gradlew test
 ```
 
-Twelve integration tests run the full pipeline against small Java and Python fixtures.
+Integration tests run the full pipeline against small Java and Python fixtures.
 
 ## Claude Code setup
 
@@ -100,15 +119,15 @@ Restart Claude Code; the `reason_*` tools will appear in its tool list.
 
 ## Example workflow
 
-A typical agent session looks like this:
+A typical agent-driven session — primitives composing into evidence:
 
-1. Agent calls `reason_analyze_project` on the target codebase to build the CPG.
-2. Agent calls `reason_find_entry_points` to enumerate reachable inputs.
-3. Agent calls `reason_scan_injections` to surface candidate data flows.
-4. For each candidate, agent calls `reason_trace_taint_path` to confirm reachability and retrieve the evidence chain.
-5. Agent reasons about exploitability from the structured result.
+1. Agent calls `reason_analyze_project` to build the CPG.
+2. Agent calls `reason_find_entry_points` to enumerate where external input enters the codebase.
+3. For a suspicious entry point, agent uses `reason_find_callees` and `reason_query_dataflow` to map the downstream reach.
+4. When the data reaches a sensitive call, agent calls `reason_trace_taint_path` for the full source-to-sink evidence chain.
+5. Agent reasons about exploitability from the structured result and decides what to investigate next.
 
-At any step the agent can pivot through the graph on its own using `reason_find_callers`, `reason_find_callees`, or `reason_query_dataflow`.
+For quick first-pass triage, the agent can also call `reason_scan_injections` to surface candidate flows from the built-in catalog, then verify each one with `reason_trace_taint_path`.
 
 ## Supported languages
 
@@ -119,7 +138,7 @@ Additional CPG frontends (C/C++, Go, TypeScript, JVM, LLVM, Ruby) can be enabled
 
 ## Status
 
-code-reason is v0.1.0 — early, research-grade. The build is currently pinned to CPG `main-SNAPSHOT`; this will move to a stable `11.x` release once Fraunhofer publishes one to Maven Central.
+code-reason is v0.1.0 — early, research-grade. CI runs on every push and pull request. The build is currently pinned to CPG `main-SNAPSHOT`; this will move to a stable `11.x` release once Fraunhofer publishes one to Maven Central.
 
 ## License