Isha-Das-06/Sentinel-LLM

Sentinel-LLM: Enterprise RAG & LLMOps Pipeline

Sentinel-LLM Branding



🛡️ Overview

Sentinel-LLM is a production-grade Retrieval-Augmented Generation (RAG) framework. Designed for enterprise scalability and strict data privacy, it delivers a fully self-contained AI stack that eliminates dependence on external APIs while providing full observability into model performance and hallucination control.

This project showcases a complete modern LLMOps lifecycle: from automated document ingestion and embedding to real-time semantic monitoring and unified lifecycle management.


✨ Enterprise-Grade Features

  • ⚡ High-Performance RAG: Sub-second context retrieval using Qdrant and semantic chunking.
  • 🛡️ Hallucination Guardrails: Automated "Faithfulness" and "Answer Relevancy" scoring integrated into the inference pipeline via RAGAS.
  • 📊 Real-time Observability: Comprehensive Grafana dashboards monitoring latency, token throughput, and retrieval drift.
  • 🔄 Autonomous Lifecycles: Apache Airflow DAGs automate the ingestion of massive document silos without manual intervention.
  • 📜 Prompt Governance: Version-controlled prompt engineering using MLflow, ensuring reproducibility across deployments.
  • 🔒 Local-First Privacy: Powered by Ollama, ensuring all data remains within your infrastructure.
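The "Recursive Splitting" stage referenced above can be illustrated with a minimal sketch. The separator hierarchy and chunk size below are illustrative assumptions, not the project's actual configuration:

```python
# Minimal sketch of recursive text splitting: try coarse separators
# (paragraphs) first, then fall back to finer ones (lines, spaces).

def recursive_split(text, separators=("\n\n", "\n", " "), chunk_size=200):
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: hard-cut at chunk_size.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, buf = [], ""
    for part in text.split(sep):
        candidate = buf + sep + part if buf else part
        if len(candidate) <= chunk_size:
            buf = candidate
        elif len(part) > chunk_size:
            # The part alone exceeds the limit: recurse with finer separators.
            if buf:
                chunks.append(buf)
            chunks.extend(recursive_split(part, rest, chunk_size))
            buf = ""
        else:
            if buf:
                chunks.append(buf)
            buf = part
    if buf.strip():
        chunks.append(buf)
    return chunks
```

Splitting at semantic boundaries (paragraphs before sentences before words) keeps each chunk coherent, which improves embedding quality and downstream retrieval precision.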

📊 Observability & Monitoring

Sentinel-LLM doesn't just respond; it observes. Every query is tracked for accuracy, latency, and system health.
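Wiring per-query evaluation scores into a pass/fail guardrail might look like the following sketch. The metric names mirror the RAGAS metrics used in the pipeline, but the threshold values and return structure are illustrative assumptions:

```python
# Hypothetical guardrail check on per-query evaluation scores.
# Thresholds are example values, not the project's tuned settings.

DEFAULT_THRESHOLDS = {"faithfulness": 0.7, "answer_relevancy": 0.6}

def guardrail_verdict(scores, thresholds=DEFAULT_THRESHOLDS):
    """Return (passed, failed_metrics) for a single evaluated response."""
    failed = [m for m, floor in thresholds.items() if scores.get(m, 0.0) < floor]
    return (len(failed) == 0, failed)

# Example: a faithful but off-topic answer fails the relevancy check.
ok, failed = guardrail_verdict({"faithfulness": 0.93, "answer_relevancy": 0.41})
```

A verdict like this can be attached to the response metadata, exported to Prometheus as a counter, and surfaced in Grafana alongside latency and throughput.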

Monitoring Dashboard Mockup


🧬 System Architecture & Lifecycle

Sentinel-LLM follows an event-driven, multi-layered architecture where every step of the document lifecycle and inference process is tracked and validated.

graph TD
    %% Define Styles
    classDef data fill:#e1f5fe,stroke:#01579b,stroke-width:1px;
    classDef logic fill:#f3e5f5,stroke:#4a148c,stroke-width:1px;
    classDef monitor fill:#fff3e0,stroke:#e65100,stroke-width:1px;
    classDef target fill:#f1f8e9,stroke:#33691e,stroke-width:2px,stroke-dasharray: 5 5;

    %% 📁 DATA INGESTION WORKFLOW
    subgraph Data_Pipe ["📥 Knowledge Ingestion (Airflow)"]
        RAW["Raw PDFs/Docs"] -->|Monitor| AF["Airflow DAG"]
        AF -->|Chunk| CHUNK["Recursive Splitting"]
        CHUNK -->|Embed| VEC["Llama-3 Vectors"]
        VEC -->|Index| QD[("Qdrant DB")]
    end

    %% 🧠 INFERENCE WORKFLOW
    subgraph Inference_Engine ["🧠 RAG Inference (FastAPI)"]
        USER["User Query"] -->|POST /chat| API["Sentinel Server"]
        API -->|Search| QD
        QD -->|Context| API
        API -->|Augment| LLM["Ollama Core"]
        LLM -->|Stream| API
        API -->|Final Response| USER
    end

    %% ⚖️ GUARDRAILS & OPS
    subgraph Observability_Layer ["⚖️ LLMOps & Guardrails"]
        API -->|Audit| RAGAS["RAGAS Evaluator"]
        RAGAS -->|Score| METRICS["Faithfulness/Relevancy"]
        API -->|Trace| PROM["Prometheus"]
        PROM -->|Visual| GRAF["Grafana Dashboards"]
        API -->|Register| MLF[("MLflow Registry")]
    end

    %% Node Styles
    class RAW,AF,CHUNK,VEC,QD data;
    class USER,API,LLM logic;
    class RAGAS,METRICS,PROM,GRAF,MLF monitor;
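The inference path in the diagram (query → vector search → context augmentation → LLM → response) reduces to a small skeleton. The retriever and generator here are stand-in callables; in the real stack they would wrap a Qdrant search and an Ollama chat call:

```python
# Skeletal RAG turn: fetch context, build an augmented prompt, generate.

def answer(query, retrieve, generate, top_k=3):
    """Run one RAG turn with pluggable retrieval and generation."""
    context_chunks = retrieve(query, top_k)   # Qdrant search in production
    prompt = (
        "Answer using only the context below.\n\n"
        "Context:\n" + "\n---\n".join(context_chunks)
        + f"\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)                   # Ollama call in production

# Toy stand-ins so the flow is runnable end to end.
fake_store = ["Qdrant stores the document vectors.", "Airflow ingests documents."]
retrieve = lambda q, k: fake_store[:k]
generate = lambda p: "stub answer"
```

Keeping retrieval and generation behind plain callables is also what makes the guardrail layer easy to slot in: the evaluator sees the same query, context, and answer the user does.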


🛠️ Quick Start

1. Requirements

  • Docker & Docker Compose
  • 16GB+ RAM recommended for local LLM inference.

2. Deployment

# Clone the repository
git clone https://github.com/your-username/sentinel-llm.git
cd sentinel-llm

# Initialize environment
cp .env.example .env

# Launch Infrastructure
docker compose up -d

3. Model Initialization

docker exec -it sentinel_ollama ollama pull llama3

4. Direct Chat

curl -X POST "http://localhost:8000/chat" \
     -H "Content-Type: application/json" \
     -d '{"prompt": "How does Sentinel-LLM handle hallucinations?"}'
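The same request can be issued from Python with only the standard library. The endpoint and JSON shape come from the curl example above; the helper name and the assumption that the server returns a readable body are illustrative:

```python
# Python equivalent of the curl call above (stdlib only).
import json
import urllib.request

def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Construct the POST /chat request without sending it."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_chat_request("How does Sentinel-LLM handle hallucinations?")
    with urllib.request.urlopen(req) as resp:  # requires the stack to be up
        print(resp.read().decode("utf-8"))
```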

📁 Directory Structure

sentinel-llm/
├── airflow/            # Ingestion Workflows (DAGs)
├── assets/             # README Visuals (Hero/Dashboards)
├── data/               # Document Ingestion Source
├── ingestion/          # Vectorization & Processing Logic
├── k8s/                # Production Kubernetes Manifests
├── monitoring/         # Prometheus & Grafana Configs
├── server/             # FastAPI RAG Engine
├── .env.example        # Environment Template
├── docker-compose.yml  # Full Stack Orchestration
└── README.md           # Project Documentation

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

Sentinel-LLM is released under the MIT License. See LICENSE for details.


Built with 💙 for the MLOps Community
