Sentinel-LLM is a production-grade Retrieval-Augmented Generation (RAG) framework. Designed for enterprise scalability and strict data privacy, it delivers a fully self-contained AI stack that eliminates dependence on external APIs while keeping model performance and hallucination control fully observable.
This project showcases a complete, modern LLMOps lifecycle: from automated document ingestion and vectorization to real-time semantic monitoring and unified lifecycle management.
- ⚡ High-Performance RAG: Sub-second context retrieval using Qdrant and semantic chunking.
- 🛡️ Hallucination Guardrails: Automated "Faithfulness" and "Answer Relevancy" scoring integrated into the inference pipeline via RAGAS.
- 📊 Real-time Observability: Comprehensive Grafana dashboards monitoring latency, token throughput, and retrieval drift.
- 🔄 Autonomous Lifecycles: Apache Airflow DAGs automate the ingestion of massive document silos without manual intervention.
- 📜 Prompt Governance: Version-controlled prompt engineering using MLflow, ensuring reproducibility across deployments.
- 🔒 Local-First Privacy: Powered by Ollama, ensuring all data remains within your infrastructure.
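The "Recursive Splitting" step used during ingestion can be sketched in plain Python. This is a simplified stand-in, not the actual implementation in `ingestion/`: it tries coarse separators first (paragraphs, then sentences, then words) and recurses until every chunk fits the size budget.

```python
def recursive_split(text: str, max_chars: int = 500,
                    seps=("\n\n", ". ", " ")) -> list[str]:
    """Split text on the coarsest separator that works, recursing as needed."""
    if len(text) <= max_chars:
        return [text]
    for sep in seps:
        parts = text.split(sep)
        if len(parts) > 1:
            chunks, buf = [], ""
            for part in parts:
                piece = part + sep
                # flush the buffer before it would exceed the budget
                if len(buf) + len(piece) > max_chars and buf:
                    chunks.append(buf.strip())
                    buf = ""
                buf += piece
            if buf.strip():
                chunks.append(buf.strip())
            # recurse into any chunk still over budget (finer separators)
            return [c for chunk in chunks
                    for c in recursive_split(chunk, max_chars, seps)]
    # no separator produced a split: hard cut as a last resort
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Real deployments typically also overlap adjacent chunks so retrieval does not lose context at chunk boundaries.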
Sentinel-LLM doesn't just respond; it observes. Every query is tracked for accuracy, latency, and system health.
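As a rough illustration of what a faithfulness guardrail measures, here is a token-overlap approximation: the fraction of answer tokens grounded in the retrieved context. This is a hypothetical simplification; the actual pipeline uses RAGAS, which judges individual claims with an LLM rather than word overlap.

```python
def faithfulness_score(answer: str, context: str) -> float:
    """Fraction of answer tokens that appear in the retrieved context."""
    answer_tokens = answer.lower().split()
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    grounded = sum(t in context_tokens for t in answer_tokens)
    return grounded / len(answer_tokens)
```

A low score flags answers that drift away from the retrieved evidence, which is exactly the signal the guardrail layer alerts on.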
Sentinel-LLM follows an event-driven, multi-layered architecture where every step of the document lifecycle and inference process is tracked and validated.
```mermaid
graph TD
    %% Define Styles
    classDef data fill:#e1f5fe,stroke:#01579b,stroke-width:1px;
    classDef logic fill:#f3e5f5,stroke:#4a148c,stroke-width:1px;
    classDef monitor fill:#fff3e0,stroke:#e65100,stroke-width:1px;
    classDef target fill:#f1f8e9,stroke:#33691e,stroke-width:2px,stroke-dasharray: 5 5;

    %% 📁 DATA INGESTION WORKFLOW
    subgraph Data_Pipe ["📥 Knowledge Ingestion (Airflow)"]
        RAW["Raw PDFs/Docs"] -->|Monitor| AF["Airflow DAG"]
        AF -->|Chunk| CHUNK["Recursive Splitting"]
        CHUNK -->|Embed| VEC["Llama-3 Vectors"]
        VEC -->|Index| QD[("Qdrant DB")]
    end

    %% 🧠 INFERENCE WORKFLOW
    subgraph Inference_Engine ["🧠 RAG Inference (FastAPI)"]
        USER["User Query"] -->|POST /chat| API["Sentinel Server"]
        API -->|Search| QD
        QD -->|Context| API
        API -->|Augment| LLM["Ollama Core"]
        LLM -->|Stream| API
        API -->|Final Response| USER
    end

    %% ⚖️ GUARDRAILS & OPS
    subgraph Observability_Layer ["⚖️ LLMOps & Guardrails"]
        API -->|Audit| RAGAS["RAGAS Evaluator"]
        RAGAS -->|Score| METRICS["Faithfulness/Relevancy"]
        API -->|Trace| PROM["Prometheus"]
        PROM -->|Visual| GRAF["Grafana Dashboards"]
        API -->|Register| MLF[("MLflow Registry")]
    end

    %% Node Styles
    class RAW,AF,CHUNK,VEC,QD data;
    class USER,API,LLM logic;
    class RAGAS,METRICS,PROM,GRAF,MLF monitor;
```
- Docker & Docker Compose
- 16GB+ RAM recommended for local LLM inference.
```bash
# Clone the repository
git clone https://github.com/your-username/sentinel-llm.git
cd sentinel-llm

# Initialize environment
cp .env.example .env

# Launch infrastructure
docker compose up -d

# Pull the local model into the Ollama container
docker exec -it sentinel_ollama ollama pull llama3

# Query the RAG endpoint
curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "How does Sentinel-LLM handle hallucinations?"}'
```

```
sentinel-llm/
├── airflow/             # Ingestion Workflows (DAGs)
├── assets/              # README Visuals (Hero/Dashboards)
├── data/                # Document Ingestion Source
├── ingestion/           # Vectorization & Processing Logic
├── k8s/                 # Production Kubernetes Manifests
├── monitoring/          # Prometheus & Grafana Configs
├── server/              # FastAPI RAG Engine
├── .env.example         # Environment Template
├── docker-compose.yml   # Full Stack Orchestration
└── README.md            # Project Documentation
```
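Prompt governance comes down to pinning an exact, reproducible template version per deployment. A minimal way to picture this is content-addressed versioning; this sketch is illustrative only, since the project registers prompts in MLflow rather than an in-memory dict.

```python
import hashlib

class PromptRegistry:
    """Content-address prompt templates so deployments can pin exact versions."""

    def __init__(self) -> None:
        self._versions: dict[str, str] = {}

    def register(self, template: str) -> str:
        # identical templates always hash to the same version id
        version = hashlib.sha256(template.encode()).hexdigest()[:12]
        self._versions[version] = template
        return version

    def get(self, version: str) -> str:
        return self._versions[version]
```

Because the version id is derived from the template's content, any edit to a prompt yields a new id, which is what makes runs reproducible across deployments.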
Contributions are welcome! Please feel free to submit a Pull Request.
Sentinel-LLM is released under the MIT License. See LICENSE for details.
Built with 💙 for the MLOps Community

