|
1 | | -# 🛡️ CodeGuardian: Advanced Agentic Code Review & Debugging System |
2 | | -### *Your Virtual Senior Engineer, Operating at Wire-Speed* |
| 1 | +<div align="center"> |
3 | 2 |
|
4 | | -[](https://github.com/Ismail-2001) |
5 | | -[](https://github.com/Ismail-2001) |
6 | | -[](https://github.com/Ismail-2001) |
| 3 | +# 🛡️ CodeGuardian: Cognitive Code Review & Debugging Agent |
| 4 | +### A Production-Grade Autonomous Orchestrator Powered by LangGraph & DeepSeek-V3 |
| 5 | + |
| 6 | +<br/> |
| 7 | + |
| 8 | +[](https://python.org) |
| 9 | +[](https://langchain-ai.github.io/langgraph/) |
| 10 | +[](https://deepseek.com) |
| 11 | +[](https://owasp.org) |
| 12 | +[](./LICENSE) |
| 13 | + |
| 14 | +<br/> |
| 15 | + |
| 16 | +> *"CodeGuardian is more than a linter—it's a virtual senior developer that reasons about your architecture, identifies silent logic failures, and autonomously generates surgical code repairs."* |
| 17 | +
|
| 18 | +**CodeGuardian** is an enterprise-grade autonomous system designed for deep code auditing and self-correction. Built on a **LangGraph-based cyclic architecture**, it transcends standard static analysis by employing **11 specialized reasoning nodes** that simulate the mental model of a senior staff engineer. |
| 19 | + |
| 20 | +[**🏗️ Architecture**](#-system-architecture) · [**🧠 Reasoning Core**](#-the-cognitive-pipeline) · [**🚀 Quick Start**](#-getting-started) · [**📊 Reporting**](#-enterprise-reporting) |
| 21 | + |
| 22 | +--- |
| 23 | + |
| 24 | +</div> |
| 25 | + |
| 26 | +## 📌 The "Code Rot" Challenge |
| 27 | + |
| 28 | +Enterprise codebases fail not only because of syntax errors, but also because of:
| 29 | +1. **Invisible Logic Breaches**: Code that runs but produces incorrect outcomes. |
| 30 | +2. **Architectural Drift**: Intentional patterns being ignored over time. |
| 31 | +3. **Security Gaps**: Hardcoded secrets or unsafe parsing that static linters miss. |
| 32 | +4. **Review Fatigue**: Human reviewers missing subtle concurrency or performance issues. |
| 33 | + |
| 34 | +**CodeGuardian solves this** by treating code review as an agentic workflow. It doesn't just "check" code; it **reasons** about it across multiple specialized dimensions. |
7 | 35 |
|
8 | 36 | --- |
9 | 37 |
|
10 | | -## 🎬 Overview |
11 | | -**CodeGuardian** is not just a linter; it’s an **Autonomous Agentic Orchestrator** designed to audit enterprise-grade codebases. By leveraging **LangGraph-based multi-phase reasoning**, it identifies security flaws, architectural debt, and performance bottlenecks, offering self-correcting fixes with human-aligned oversight. |
| 38 | +## ✨ Enterprise Capabilities |
| 39 | + |
| 40 | +### ⚡ Agentic Self-Correction (Auto-Fix) |
| 41 | +Powered by the `generate_fixes` node, CodeGuardian doesn't just point out problems—it proposes solutions. |
| 42 | +- **Safe-Only Refactoring**: Automatically addresses medium-to-high severity issues that have high confidence scores. |
| 43 | +- **Human-in-the-loop (HITL)**: Advanced configuration allows for mandatory developer approval before applying security or architectural fixes. |
| 44 | + |
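The two rules above can be read as a small triage step. The sketch below is illustrative only and is not taken from the `generate_fixes` node: the `Finding` fields, the severity scale, the `0.9` confidence threshold, and the `approval_categories` parameter are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical severity scale, ordered from least to most severe.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass
class Finding:
    finding_id: str
    category: str      # e.g. "security", "architecture", "performance"
    severity: str      # one of the SEVERITY_RANK keys
    confidence: float  # confidence in the proposed fix, 0.0 to 1.0


def plan_fixes(findings, min_confidence=0.9,
               approval_categories=("security", "architecture")):
    """Split findings into fixes to apply automatically and fixes awaiting approval."""
    auto_apply, needs_approval = [], []
    for finding in findings:
        if SEVERITY_RANK[finding.severity] < SEVERITY_RANK["medium"]:
            continue  # low-severity issues are reported but never auto-fixed
        if finding.confidence < min_confidence:
            continue  # not confident enough to propose a safe fix
        if finding.category in approval_categories:
            needs_approval.append(finding)  # human-in-the-loop gate
        else:
            auto_apply.append(finding)
    return auto_apply, needs_approval
```

With this split, a high-confidence performance fix would be applied directly, while a high-confidence security fix would wait for a developer to approve it.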
| 45 | +### 🛡️ 11-Phase Intelligence Pipeline |
| 46 | +CodeGuardian executes a strictly orchestrated workflow defined in `graph.py`. Its six core analysis phases are:
| 47 | +- **Static Analysis**: Pylint/Flake8/ESLint integration. |
| 48 | +- **Pattern Matching**: Detects anti-patterns and suboptimal "code smells". |
| 49 | +- **Security Audit**: Scans for CWE Top 25 weaknesses, insecure dependencies, and secret leakage.
| 50 | +- **Performance Profiling**: Identifies N+1 queries, memory bottlenecks, and complexity spikes. |
| 51 | +- **Testing Assessment**: Audits test coverage and identifies missing edge-case scenarios.
| 52 | +- **Logic Verification**: Deep reasoning about function intent vs. implementation. |
| 53 | + |
| 54 | +### ⚙️ Declarative Configuration (`.codeguardian.yml`) |
| 55 | +Manage your entire audit policy with a production-grade YAML spec: |
| 56 | +- Define excluded/included directory patterns. |
| 57 | +- Set complexity thresholds (Cyclomatic/Cognitive). |
| 58 | +- Configure language-specific linter rules. |
| 59 | +- Enable/Disable specific agent nodes based on performance needs. |
12 | 60 |
|
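As a rough picture of what such a policy controls, the sketch below models the four settings as a Python dataclass. Every field name and default here is hypothetical; the actual `.codeguardian.yml` schema is not shown in this README, so treat it as an illustration of the idea rather than the real spec.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class AuditPolicy:
    # Hypothetical fields; not the actual `.codeguardian.yml` keys.
    include: list = field(default_factory=lambda: ["*.py"])
    exclude: list = field(default_factory=lambda: ["tests/*", "venv/*"])
    max_cyclomatic: int = 10
    max_cognitive: int = 15
    disabled_nodes: set = field(default_factory=set)

    def should_scan(self, path: str) -> bool:
        """Scan a file if it matches an include pattern and no exclude pattern."""
        included = any(fnmatch(path, pattern) for pattern in self.include)
        excluded = any(fnmatch(path, pattern) for pattern in self.exclude)
        return included and not excluded

    def exceeds_complexity(self, cyclomatic: int, cognitive: int) -> bool:
        """Flag a function whose complexity is over either threshold."""
        return cyclomatic > self.max_cyclomatic or cognitive > self.max_cognitive

    def node_enabled(self, node: str) -> bool:
        """Agent nodes run unless the policy explicitly disables them."""
        return node not in self.disabled_nodes
```

Disabling a node in this model simply removes it from the run, which is one way a slow phase such as performance profiling could be skipped on large repositories.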
13 | 61 | --- |
14 | 62 |
|
15 | | -## 🏗️ The Intelligence Architecture |
16 | | -CodeGuardian operates on a multi-layered cognitive loop. Below is the rendered reasoning pipeline: |
| 63 | +## 🏗️ System Architecture |
| 64 | + |
| 65 | +### The Cognitive Execution Graph |
17 | 66 |
|
18 | 67 | ```mermaid |
19 | 68 | graph TD |
20 | | - Start((Start)) --> Init[1. Initialization] |
21 | | - Init --> Scope[2. Scope Discovery] |
22 | | - Scope --> Analysis{Deep Analysis Loop} |
| 69 | + Start((Repo URL/Path)) --> Init[1. Initialization & Mapping] |
| 70 | + Init --> Scope[2. Scope & Framework Discovery] |
23 | 71 | |
24 | | - subgraph "Reasoning Core" |
25 | | - Analysis --> Static[3. Static Analysis] |
26 | | - Static --> Security[4. Security Audit] |
27 | | - Security --> Logic[5. Logic Verification] |
28 | | - Logic --> Policy[6. RAG-Based Policy Check] |
| 72 | + subgraph "The Reasoning Engine" |
| 73 | + Scope --> Static[3. Static & Linter Check] |
| 74 | + Static --> Pattern[4. Pattern & Design Audit] |
| 75 | + Pattern --> Security[5. Deep Security Audit] |
| 76 | + Security --> Perform[6. Performance Bottleneck Scan]
| 77 | + Perform --> Testing[7. Testing & Coverage Analysis] |
| 78 | + Testing --> Logic[8. Deep Logic Verification] |
29 | 79 | end |
30 | 80 | |
31 | | - Policy --> Synthesis[7. Findings Synthesis] |
32 | | - Synthesis --> HITL{Human-in-the-loop} |
| 81 | + Logic --> Synth[9. Synthesis & Prioritization] |
33 | 82 | |
34 | | - HITL -- Approved --> Fix[Auto-Fix Generation] |
35 | | - HITL -- Rejected --> Adjust[Manual Adjustment] |
| 83 | + Synth -- "Critical Issues Found" --> Fix[10. Autonomous Fix Generation]
| 84 | + Synth -- "No Fixable/Safe Issues" --> Report[11. Report Generation] |
36 | 85 | |
37 | | - Fix --> Report[Final Report Generation] |
38 | | - Adjust --> Report |
39 | | - Report --> End((End)) |
| 86 | + Fix --> Report |
| 87 | + Report --> End((Final GitHub Issues / MD Report)) |
40 | 88 | ``` |
41 | 89 |
|
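The control flow in the diagram is linear until synthesis, then branches once. The sketch below reproduces that shape in plain Python as a stand-in; it does not use LangGraph's API and is not the code in `graph.py`. The phase names, the `fixable_issues` state key, and the helper functions are assumptions made for illustration.

```python
from typing import Callable

State = dict

# Phases 1 to 9 from the diagram, in execution order (names are illustrative).
ANALYSIS_PHASES = [
    "init", "scope", "static", "pattern", "security",
    "performance", "testing", "logic", "synthesis",
]


def make_phase(name: str) -> Callable[[State], State]:
    """Build a placeholder node that records its execution in the shared state."""
    def node(state: State) -> State:
        state.setdefault("trace", []).append(name)
        return state
    return node


def route_after_synthesis(state: State) -> str:
    """Conditional edge: generate fixes only when synthesis found fixable issues."""
    return "generate_fixes" if state.get("fixable_issues") else "report"


def run_review(state: State) -> State:
    """Run every phase in order, then follow the branch chosen after synthesis."""
    for name in ANALYSIS_PHASES:
        state = make_phase(name)(state)
    if route_after_synthesis(state) == "generate_fixes":
        state = make_phase("generate_fixes")(state)
    return make_phase("report")(state)
```

Calling `run_review({"fixable_issues": ["SEC-024"]})` visits all eleven nodes, while an empty state skips fix generation and goes straight to the report.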
42 | 90 | --- |
43 | 91 |
|
44 | | -## 🚀 Key Features |
45 | | -- **11-Phase Reasoning Pipeline:** Structured workflow from repo init to deep logic verification. |
46 | | -- **Security-First DNA:** Automated detection of CWE Top 25, hardcoded secrets, and unsafe patterns. |
47 | | -- **RAG-Enhanced Policies:** Use **Retrieval-Augmented Generation** to audit code against *your* custom company standards. |
48 | | -- **Human-in-the-Loop (HITL):** Strategic pauses for developer approval before executing automated code refactoring. |
49 | | -- **Adversarial Test Generation:** Automatically creates synthetic test suites to stress-test your logic. |
| 92 | +## 📊 Enterprise Reporting Suite |
50 | 93 |
|
51 | | ---- |
| 94 | +CodeGuardian generates high-fidelity reports tailored for different stakeholders: |
| 95 | +- **Markdown Summary**: Optimized for GitHub Action logs and PR comments. |
| 96 | +- **JSON Structured Data**: For integration into existing CI/CD dashboards (Datadog/ELK). |
| 97 | +- **GitHub Issues Integration**: Automatically creates and labels issues for discovered vulnerabilities. |
52 | 98 |
|
53 | | -## 📊 Sample Review Report (Output Snippet) |
| 99 | +### Sample Finding
54 | 100 | ```markdown |
55 | | -### 🔎 Finding ID: SEC-001 |
56 | | -- **Severity:** High |
57 | | -- **Issue:** SQL Injection vulnerability detected in `src/db_handler.py:45`. |
58 | | -- **Reasoning:** User-input is directly concatenated into the query string. |
59 | | -- **Proposed Fix:** Parameterized query implementation. |
60 | | -- **Confidence Score:** 98% |
| 101 | +### 🕵️ Finding: SEC-024 (Insecure SQL Handling) |
| 102 | +- **Severity**: 🔴 CRITICAL |
| 103 | +- **Location**: `src/db/manager.py:84` |
| 104 | +- **Reasoning**: User-controlled string 'query_tag' is directly interpolated into a raw SQL string, bypassing ORM protections. |
| 105 | +- **Recommendation**: Use a parameterized query or the helper method `safe_execute()`. |
| 106 | +- **Auto-Fix Generated**: ✅ YES (See PR Preview) |
61 | 107 | ``` |
62 | 108 |
|
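One finding can feed both the Markdown and the JSON report if it is held as structured data first. The sketch below assumes a hypothetical `ReportFinding` record whose fields mirror the sample above; the real report schema is not documented here and may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReportFinding:
    # Hypothetical record; field names mirror the sample finding, not a real schema.
    finding_id: str
    title: str
    severity: str
    location: str
    reasoning: str
    recommendation: str
    auto_fix: bool

    def to_markdown(self) -> str:
        """Render the finding in the layout used for PR comments."""
        lines = [
            f"### Finding: {self.finding_id} ({self.title})",
            f"- **Severity**: {self.severity}",
            f"- **Location**: `{self.location}`",
            f"- **Reasoning**: {self.reasoning}",
            f"- **Recommendation**: {self.recommendation}",
            f"- **Auto-Fix Generated**: {'YES' if self.auto_fix else 'NO'}",
        ]
        return "\n".join(lines)

    def to_json(self) -> str:
        """Serialize the same fields for dashboards or other CI/CD consumers."""
        return json.dumps(asdict(self), indent=2)
```

Keeping a single record as the source for both formats avoids the Markdown and JSON outputs drifting apart.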
63 | 109 | --- |
64 | 110 |
|
65 | | -## 🏁 Quick Start |
66 | | -### Prerequisites |
67 | | -- Python 3.10+ |
68 | | -- OpenAI / DeepSeek API Key |
| 111 | +## 🚀 Getting Started |
69 | 112 |
|
70 | | -### Local Setup |
| 113 | +### 1. Installation |
71 | 114 | ```bash |
72 | 115 | git clone https://github.com/Ismail-2001/Code-Review-and-Debugging-Agent.git |
73 | 116 | cd Code-Review-and-Debugging-Agent |
74 | 117 | pip install -r requirements.txt |
75 | | -python main.py --path ./your_project_directory |
76 | 118 | ``` |
77 | 119 |
|
78 | | ---- |
| 120 | +### 2. Configuration |
| 121 | +Create a `.env` file in the repository root and set the API key for the provider you use:
| 122 | +```env |
| 123 | +OPENAI_API_KEY=sk-... |
| 124 | +# OR |
| 125 | +DEEPSEEK_API_KEY=sk-... |
| 126 | +``` |
79 | 127 |
|
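Since only one of the two keys needs to be set, startup code has to decide which provider to use. The sketch below shows one way to do that once the variables are in the environment; the `resolve_provider` helper and the choice to prefer DeepSeek when both keys are present are assumptions, not the project's documented behavior.

```python
import os


def resolve_provider(env=None):
    """Return (provider, api_key) for the first configured key."""
    env = os.environ if env is None else env
    # Assumed preference order: DeepSeek first, then OpenAI.
    candidates = (("deepseek", "DEEPSEEK_API_KEY"), ("openai", "OPENAI_API_KEY"))
    for provider, variable in candidates:
        key = env.get(variable, "").strip()
        if key:
            return provider, key
    raise RuntimeError("Set OPENAI_API_KEY or DEEPSEEK_API_KEY before running a review.")
```

Failing fast with a clear message here is usually friendlier than letting the first model call fail with an authentication error.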
80 | | -## 🗺️ Roadmap |
81 | | -- [ ] Integration with GitLab & Bitbucket. |
82 | | -- [ ] Real-time Streaming Reports. |
83 | | -- [ ] Advanced Graph-based call-chain visualization. |
| 128 | +Configure your standards in `.codeguardian.yml`. |
| 129 | + |
| 130 | +### 3. Run Your First Review |
| 131 | +```bash |
| 132 | +# Analyze a local project |
| 133 | +python main.py --path /path/to/your/project --severity high |
| 134 | + |
| 135 | +# Analyze a specific branch |
| 136 | +python main.py --path /path/to/repo --branch feature/login-security |
| 137 | +``` |
84 | 138 |
|
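For readers wondering how these flags fit together, the sketch below shows an `argparse` parser that accepts the three options used above. It is a guess at the interface, not the parser in `main.py`: the allowed severity values, the defaults, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the commands above."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Run a CodeGuardian review."
    )
    parser.add_argument(
        "--path", required=True, help="Project or repository to analyze."
    )
    parser.add_argument(
        "--severity",
        choices=["low", "medium", "high", "critical"],  # assumed value set
        default="low",
        help="Minimum severity to report.",
    )
    parser.add_argument(
        "--branch", default=None, help="Branch to analyze instead of the current one."
    )
    return parser
```

Parsing `["--path", "/path/to/repo", "--severity", "high"]` with this parser yields `severity == "high"` and `branch is None`.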
85 | 139 | --- |
86 | 140 |
|
87 | | -### 🔗 Connecting the Intelligence |
88 | | -Developed by **[Ismail Sajid](https://ismail-sajid-agentic-portfolio.netlify.app/)**. |
89 | | -*Explore more Autonomous Agents on my [Main Profile](https://github.com/Ismail-2001).* |
| 141 | +## 🔭 The Lab Roadmap |
| 142 | + |
| 143 | +### ✅ Completed |
| 144 | +- [x] **LangGraph Orchestrator**: Multi-node state machine for reasoning. |
| 145 | +- [x] **Stateful Memory**: Preserves context across analysis phases. |
| 146 | +- [x] **Configuration Layer**: Full YAML-based control. |
| 147 | + |
| 148 | +### 🔨 Phase 2: Cognitive Depth (Next) |
| 149 | +- [ ] **Cross-File Dependency Mapping**: Detecting issues that span multiple modules. |
| 150 | +- [ ] **RAG-Powered Policies**: Auditing code against custom enterprise Wiki/Documentation. |
| 151 | +- [ ] **Interactive Debugger**: A CLI loop to chat with the agent about specific files. |
90 | 152 |
|
91 | 153 | --- |
| 154 | + |
| 155 | +<div align="center"> |
| 156 | + |
| 157 | +**Built for staff engineers. Powered by autonomous reasoning.** |
| 158 | + |
| 159 | +*CodeGuardian: Where expertise meets wire-speed.* |
| 160 | + |
| 161 | +Built with ❤️ by [Ismail Sajid](https://github.com/Ismail-2001) |
| 162 | + |
| 163 | +</div> |