carlboy2002/PolicyGraph


🏛️ PolicyGraph: AI-Powered Policy Analysis Platform

Automated comparative analysis of international policy documents using Retrieval-Augmented Generation (RAG)

Python 3.10+ Streamlit License: MIT

🎯 Problem Statement

Policy analysts at organizations like the IMF and World Bank spend hours manually comparing policy positions across hundreds of documents. PolicyGraph automates this process, reducing analysis time from 2 hours to 2 minutes.

✨ Key Features

  • 🤖 Intelligent Comparative Analysis: Compare IMF vs World Bank perspectives on any policy topic
  • 📊 Timeline Visualization: Track how policy focus evolves over time
  • 📄 Auto-Generated Policy Memos: Export professional 3-page briefs in PDF format
  • 🔍 Source Tracking: Every claim linked to specific documents and page numbers
  • ⚡ Real-time Analysis: Powered by GPT-4 and semantic search
  • 📈 Admin Dashboard: Monitor database health and add new documents

🏗️ Architecture

User Query → Smart Retrieval → Multi-Source Analysis → Structured Output
                ↓                      ↓                      ↓
         Vector Database         GPT-4 Reasoning      PDF/MD/Chart Export
         (ChromaDB)             (Comparative Logic)    (ReportLab/Plotly)
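
The flow above can be sketched in a few lines of Python. This is a toy illustration rather than the project's code: keyword overlap stands in for the ChromaDB vector search, a stub stands in for the GPT-4 step, and every name (`Chunk`, `retrieve`, `analyze`, `render`) and every sample value is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrievable passage plus the metadata needed for a citation."""
    org: str
    source: str
    page: int
    text: str


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Rank chunks by shared query words (a stand-in for vector search)."""
    words = set(query.lower().split())
    scored = [(len(words & set(c.text.lower().split())), c) for c in chunks]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [c for _, c in ranked[:k]]


def analyze(query: str, hits: list[Chunk]) -> dict:
    """Stub for the LLM step: group the retrieved evidence by organization."""
    by_org: dict[str, list[Chunk]] = {}
    for c in hits:
        by_org.setdefault(c.org, []).append(c)
    return {"query": query, "evidence": by_org}


def render(result: dict) -> str:
    """Structured output: one line per cited passage."""
    lines = [f"Query: {result['query']}"]
    for org, hits in result["evidence"].items():
        for c in hits:
            lines.append(f"- {org}: {c.text} ({c.source}, p. {c.page})")
    return "\n".join(lines)


# Invented sample data, for illustration only.
corpus = [
    Chunk("IMF", "imf_report.pdf", 12, "climate finance needs fiscal buffers"),
    Chunk("World Bank", "wb_report.pdf", 4, "climate finance should target adaptation"),
    Chunk("IMF", "imf_report.pdf", 30, "exchange rate policy"),
]
report = render(analyze("climate finance", retrieve("climate finance", corpus)))
print(report)
```

Keeping the organization, file name, and page number on every chunk is what lets the final stage attach a citation to each statement.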

🚀 Quick Start

Prerequisites

  • Python 3.10+
  • OpenAI API Key
  • 2GB disk space for vector database

Installation

# 1. Clone the repository
git clone https://github.com/yourusername/PolicyGraph.git
cd PolicyGraph

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment variables
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

# 5. Add policy documents
# Place PDF files in ./docs/ folder

# 6. Build vector database
python ingest.py

# 7. Launch the application
streamlit run app.py

The app will open in your browser at http://localhost:8501
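
If ingestion or startup fails, the two most likely causes from the steps above are a missing API key or an empty `./docs/` folder. A small pre-flight check along these lines may help. It is a sketch, not part of the repository: the function name `preflight` is hypothetical, and it assumes the values from `.env` have already been exported into the process environment.

```python
import os
from pathlib import Path


def preflight(env: dict, docs_dir: Path) -> list[str]:
    """Return a list of setup problems; an empty list means ready to ingest."""
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if not docs_dir.is_dir():
        problems.append(f"{docs_dir} does not exist")
    elif not any(docs_dir.glob("*.pdf")):
        problems.append(f"no PDF files found in {docs_dir}")
    return problems


if __name__ == "__main__":
    # Check the current environment and the ./docs folder from the steps above.
    for problem in preflight(dict(os.environ), Path("docs")):
        print("WARNING:", problem)
```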

📂 Project Structure

PolicyGraph/
├── app.py                  # Main Streamlit application
├── ingest.py               # Document ingestion script
├── memo_generator.py       # PDF report generation
├── timeline_analyzer.py    # Temporal analysis module
├── admin_panel.py          # Database management UI
├── requirements.txt        # Python dependencies
├── .env.example            # Environment template
├── docs/                   # PDF documents folder
└── chroma_db/              # Vector database (auto-generated)

💡 Use Cases

For Policy Researchers

  • Cross-Organization Comparison: "Compare IMF and World Bank views on climate finance"
  • Gap Analysis: Identify topics covered by one organization but not another
  • Trend Tracking: See how policy emphasis shifts over years

For Decision Makers

  • Executive Briefs: Generate 3-page policy memos in seconds
  • Source Verification: Every statement backed by document citations
  • Multi-Perspective Analysis: Understand different organizational mandates

For Data Scientists

  • RAG Pipeline Example: Production-ready retrieval-augmented generation
  • Comparative AI: Techniques for balanced multi-source retrieval
  • Document Analytics: Temporal and keyword analysis patterns

🛠️ Tech Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| LLM | OpenAI GPT-4-mini | Natural language understanding |
| Embeddings | text-embedding-3-small | Semantic search |
| Vector DB | ChromaDB | Document storage & retrieval |
| Framework | LangChain | RAG orchestration |
| Frontend | Streamlit | Interactive UI |
| Visualization | Plotly | Charts & timelines |
| PDF Generation | ReportLab | Policy memo export |

📊 Performance Metrics

  • Documents Analyzed: 2 organizations, 312 chunks
  • Average Query Time: < 3 seconds
  • Retrieval Accuracy: 95%+ citation precision
  • Comparative Balance: 50/50 IMF-WB coverage for comparison queries
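
The README does not state how these figures were measured. One plausible way to compute the last two, shown here as an assumption rather than the project's method, is to score each citation as supported or not and to count the share of retrieved chunks per organization. The function names are hypothetical and the inputs are invented.

```python
def citation_precision(judgments: list[bool]) -> float:
    """Share of citations judged to support their claim (True = supported)."""
    if not judgments:
        raise ValueError("no citations to score")
    return sum(judgments) / len(judgments)


def coverage_split(orgs: list[str]) -> dict[str, float]:
    """Fraction of retrieved chunks contributed by each organization."""
    total = len(orgs)
    return {org: orgs.count(org) / total for org in sorted(set(orgs))}


# Invented example: 19 of 20 citations supported, 2 chunks from each source.
precision = citation_precision([True] * 19 + [False])
split = coverage_split(["IMF", "World Bank", "IMF", "World Bank"])
print(precision, split)
```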

🎓 Advanced Features

Smart Retrieval Algorithm

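# Detect comparison queries and search each organization separately,
# so that neither source dominates the merged results.
# (Simplified: search, balanced_merge, query, and topic are placeholders.)
if "compare" in query.lower():
    imf_results = search(f"IMF {topic}")
    wb_results = search(f"World Bank {topic}")
    return balanced_merge(imf_results, wb_results)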
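
The snippet above does not show what `balanced_merge` does. One simple way to get balanced coverage is to interleave the two ranked lists, as in the sketch below. This is an assumed implementation, not necessarily the one in the repository, and the `k` parameter and sample IDs are hypothetical.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking the exhausted side of the shorter list


def balanced_merge(first: list, second: list, k: int = 6) -> list:
    """Interleave two ranked result lists so both sources are represented.

    Alternates first[0], second[0], first[1], ... and keeps at most k items.
    Once one list runs out, the remaining slots are filled from the other.
    """
    merged = []
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        if a is not _MISSING:
            merged.append(a)
        if b is not _MISSING:
            merged.append(b)
    return merged[:k]


imf_results = ["imf-1", "imf-2", "imf-3"]
wb_results = ["wb-1", "wb-2"]
top = balanced_merge(imf_results, wb_results, k=4)
print(top)  # ['imf-1', 'wb-1', 'imf-2', 'wb-2']
```

Interleaving preserves each source's own ranking while guaranteeing that the top of the merged list alternates between the two organizations.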

Timeline Analysis

  • Extracts publication years from document metadata
  • Tracks topic mention frequency over time
  • Identifies trending vs declining policy areas
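
As a rough illustration of these three steps, the sketch below counts topic mentions per publication year and labels the direction of change. It assumes each chunk is a dict with `year` and `text` keys, and it compares only the first and last years; the actual `timeline_analyzer.py` may work differently. The function names and sample chunks are hypothetical.

```python
from collections import Counter


def mentions_by_year(chunks: list[dict], topic: str) -> dict[int, int]:
    """Count case-insensitive occurrences of `topic` per publication year."""
    counts: Counter = Counter()
    needle = topic.lower()
    for chunk in chunks:
        counts[chunk["year"]] += chunk["text"].lower().count(needle)
    return dict(sorted(counts.items()))


def trend(counts: dict[int, int]) -> str:
    """Label a topic by comparing its first and last recorded years."""
    if len(counts) < 2:
        return "insufficient data"
    years = sorted(counts)
    first, last = counts[years[0]], counts[years[-1]]
    if last > first:
        return "trending"
    if last < first:
        return "declining"
    return "stable"


# Invented sample chunks, for illustration only.
sample = [
    {"year": 2021, "text": "Climate finance remains limited."},
    {"year": 2023, "text": "Climate finance grew; climate finance gaps persist."},
    {"year": 2023, "text": "Debt sustainability is the priority."},
]
counts = mentions_by_year(sample, "climate finance")
print(counts, trend(counts))
```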

PDF Memo Generation

  • Executive summary with key findings
  • Side-by-side organizational comparison table
  • Policy implications and knowledge gaps
  • Full source citations
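
The four memo sections listed above can be assembled as plain text before any PDF layout is applied. The sketch below renders them as Markdown (one of the export formats named in the architecture diagram) using only the standard library. It does not use ReportLab and is not the code in `memo_generator.py`; the function name, parameters, and placeholder content are all hypothetical.

```python
def render_memo(title, summary, rows, implications, sources):
    """Assemble a policy memo as Markdown text.

    rows: (topic, imf_position, world_bank_position) tuples for the table.
    sources: citation strings, numbered in the order given.
    """
    lines = [f"# {title}", "", "## Executive Summary", "", summary, ""]
    lines += ["## Organizational Comparison", "",
              "| Topic | IMF | World Bank |", "| --- | --- | --- |"]
    lines += [f"| {topic} | {imf} | {wb} |" for topic, imf, wb in rows]
    lines += ["", "## Policy Implications and Knowledge Gaps", ""]
    lines += [f"- {item}" for item in implications]
    lines += ["", "## Sources", ""]
    lines += [f"{n}. {src}" for n, src in enumerate(sources, start=1)]
    return "\n".join(lines) + "\n"


# Placeholder content, invented for illustration only.
memo = render_memo(
    title="Climate Finance: IMF vs World Bank",
    summary="(summary text)",
    rows=[("Climate finance", "(IMF position)", "(World Bank position)")],
    implications=["(implication)", "(knowledge gap)"],
    sources=["(document name), p. (page)"],
)
print(memo)
```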

🗺️ Roadmap

  • Multi-language support (French, Spanish, Arabic)
  • Integration with IMF/WB APIs for real-time data
  • Sentiment analysis across documents
  • Automated policy contradiction detection
  • Export to PowerPoint for presentations
  • Email digest subscriptions

🀝 Contributing

Contributions welcome! Areas for improvement:

  1. Add More Organizations: UN, OECD, ADB documents
  2. Improve Chunking: Better document segmentation
  3. Advanced Analytics: Network analysis, causal inference
  4. UI Enhancements: Dark mode, accessibility features

Please read CONTRIBUTING.md for guidelines.

📝 License

MIT License - see LICENSE for details.

📧 Contact

Wuhao Xia

🙏 Acknowledgments

  • The IMF and World Bank for open-access policy documents
  • OpenAI for the GPT-4 API
  • The Streamlit community for an amazing framework

Built with ❤️ for better policymaking

If you find this project useful, please consider giving it a ⭐ on GitHub!
