
LLM-Powered SQL Assistant

A production-ready assistant that converts natural language questions into safe, validated PostgreSQL queries — with support for multiple AI providers.


Features

  • Natural language to SQL — ask questions in plain English and get back executable SQL with results
  • Multi-provider AI — switch between Claude, OpenAI, Deepseek, and Ollama from the UI at any time
  • Schema-aware — reads your live database schema and injects it into every LLM prompt
  • SQL safety layer — only SELECT statements are allowed; DROP, DELETE, UPDATE, INSERT, and other mutating keywords are blocked at the validation layer
  • Conversational memory — follow-up questions carry context from previous turns within a session, across provider switches
  • Row limit enforcement — all queries are capped at 200 rows automatically
  • Query timeout — queries are killed after 10 seconds to protect the database
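The schema-aware step above can be sketched as a small formatting helper: rows read from `information_schema.columns` are condensed into a compact text block that is prepended to every LLM prompt. `format_schema_for_prompt` is a hypothetical illustration, not the actual code in `schema_service.py`:

```python
def format_schema_for_prompt(columns: list[tuple[str, str, str]]) -> str:
    """Condense (table_name, column_name, data_type) rows — e.g. from
    information_schema.columns, ordered by table and ordinal position —
    into one line per table for injection into the LLM prompt."""
    tables: dict[str, list[str]] = {}
    for table, column, dtype in columns:
        tables.setdefault(table, []).append(f"{column} {dtype}")
    return "\n".join(
        f"{table}({', '.join(cols)})" for table, cols in tables.items()
    )

# Example with two of the six tables:
rows = [
    ("departments", "id", "integer"),
    ("departments", "name", "text"),
    ("employees", "id", "integer"),
    ("employees", "salary", "numeric"),
]
print(format_schema_for_prompt(rows))
# departments(id integer, name text)
# employees(id integer, salary numeric)
```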

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 15, React 19, TailwindCSS |
| Backend | FastAPI, Python, SQLAlchemy (async) |
| Database | PostgreSQL |
| LLM — Claude | Anthropic SDK (claude-sonnet-4-6) |
| LLM — OpenAI | OpenAI SDK (gpt-4o) |
| LLM — Deepseek | OpenAI SDK + Deepseek base URL (deepseek-chat) |
| LLM — Ollama | OpenAI SDK + local base URL (llama3.2) |
| Driver | asyncpg, psycopg3 |

Project Structure

LLM-powered SQL assistant/
├── database/
│   ├── schema.sql          # PostgreSQL schema (6 tables)
│   └── seed_data.sql       # Test data (40 employees, 80 customers, 150 orders, …)
├── backend/
│   ├── requirements.txt
│   └── app/
│       ├── main.py          # FastAPI app entry point
│       ├── config.py        # Configuration & credentials (edit before running)
│       ├── database.py      # Async SQLAlchemy engine
│       ├── models.py        # ORM models
│       ├── routers/
│       │   ├── query.py     # POST /api/query
│       │   └── schema.py    # GET  /api/schema
│       └── services/
│           ├── llm_service.py      # Multi-provider LLM integration + session memory
│           ├── sql_validator.py    # SQL safety enforcement
│           └── schema_service.py   # Schema reader + query executor
└── frontend/
    ├── package.json
    ├── next.config.ts
    ├── tailwind.config.ts
    └── app/
        ├── layout.tsx
        ├── page.tsx                    # Main chat interface
        ├── globals.css
        ├── lib/
        │   └── api.ts                  # Typed API client
        └── components/
            ├── ProviderSelector.tsx    # Claude / OpenAI / Deepseek / Ollama toggle
            ├── QueryInput.tsx          # Auto-resizing input
            ├── ResultTable.tsx         # SQL + provider badge + data grid
            └── ConversationHistory.tsx # Sidebar query log with provider tag

Database Schema

Six tables designed to exercise all major SQL patterns (JOINs, aggregations, filters, GROUP BY, ORDER BY):

| Table | Description |
| --- | --- |
| departments | 6 departments with budgets and locations |
| employees | 40 employees across departments with salaries and job titles |
| customers | 80 customers from 50+ countries |
| products | 25 products across 6 categories |
| orders | 150 orders with statuses (delivered, shipped, processing, cancelled, pending) |
| order_items | ~400 line items linking orders to products |

Setup

Prerequisites

  • Python 3.11+
  • Node.js 20+
  • PostgreSQL 15+

1. Database

# Create the database
psql -U postgres -c "CREATE DATABASE sql_assistant;"

# Apply schema
psql -U postgres -d sql_assistant -f database/schema.sql

# Load test data
psql -U postgres -d sql_assistant -f database/seed_data.sql

2. Backend

Open backend/app/config.py and fill in your credentials. You only need the API key(s) for the provider(s) you intend to use:

# PostgreSQL — always required
DB_USER     = "YOUR_POSTGRES_USERNAME"   # ← replace
DB_PASSWORD = "YOUR_POSTGRES_PASSWORD"   # ← replace

# AI providers — replace only what you use
ANTHROPIC_API_KEY = "YOUR_ANTHROPIC_API_KEY"  # Claude
OPENAI_API_KEY    = "YOUR_OPENAI_API_KEY"     # OpenAI
DEEPSEEK_API_KEY  = "YOUR_DEEPSEEK_API_KEY"   # Deepseek
# Ollama needs no API key — runs locally at http://localhost:11434

Alternatively, create a backend/.env file with the same keys — it is loaded automatically.
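Environment-based configuration of this kind usually reduces to a small lookup with defaults. The sketch below is a stand-in for how `config.py` might resolve these values (the key names mirror the snippet above; the repo's actual loading code may differ):

```python
import os

def get_setting(name: str, default: str = "") -> str:
    """Read one config value from the environment, falling back to a
    default. Illustrative stand-in for the repo's config resolution."""
    return os.getenv(name, default)

DB_USER = get_setting("DB_USER", "postgres")           # always required
DB_PASSWORD = get_setting("DB_PASSWORD")               # always required
ANTHROPIC_API_KEY = get_setting("ANTHROPIC_API_KEY")   # only if using Claude
```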

Then install and run:

cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload

The API will be available at http://localhost:8000.

3. Ollama (optional)

If you want to use Ollama, install it from ollama.com and pull a model:

ollama pull llama3.2

Ollama runs at http://localhost:11434 by default. No API key is needed.

4. Frontend

cd frontend
npm install
npm run dev

Open http://localhost:3000.


AI Providers

| Provider | Default Model | API Key Required | Notes |
| --- | --- | --- | --- |
| Claude | claude-sonnet-4-6 | Yes | Anthropic |
| OpenAI | gpt-4o | Yes | OpenAI |
| Deepseek | deepseek-chat | Yes | OpenAI-compatible API |
| Ollama | llama3.2 | No | Runs locally; install separately |

Switch providers at any time using the toggle in the header. Conversational session history is shared across providers — a follow-up question works regardless of which provider is active.

Default models can be changed in backend/app/config.py:

CLAUDE_MODEL   = "claude-sonnet-4-6"
OPENAI_MODEL   = "gpt-4o"
DEEPSEEK_MODEL = "deepseek-chat"
OLLAMA_MODEL   = "llama3.2"

API Reference

POST /api/query

Convert a natural language question to SQL and return the results.

Request

{
  "question": "Show me the top 5 employees by salary",
  "session_id": "abc-123",
  "provider": "claude"
}

provider accepts: "claude" | "openai" | "deepseek" | "ollama". Defaults to "claude".

Response

{
  "sql": "SELECT first_name, last_name, salary FROM employees ORDER BY salary DESC LIMIT 5",
  "columns": ["first_name", "last_name", "salary"],
  "rows": [["Yara", "Hall", "135000.00"], "..."],
  "row_count": 5,
  "provider": "claude"
}
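A minimal Python client for this endpoint might look like the following urllib-based sketch. The payload and response shapes come from the examples above; the helper names are hypothetical:

```python
import json
import urllib.request

def build_query_payload(question: str, session_id: str,
                        provider: str = "claude") -> dict:
    """Assemble the request body for POST /api/query."""
    return {"question": question, "session_id": session_id,
            "provider": provider}

def ask(question: str, session_id: str, provider: str = "claude",
        base_url: str = "http://localhost:8000") -> dict:
    """POST a question and return the parsed JSON response
    (sql, columns, rows, row_count, provider)."""
    data = json.dumps(
        build_query_payload(question, session_id, provider)
    ).encode()
    req = urllib.request.Request(
        f"{base_url}/api/query", data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```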

GET /api/schema

Returns the full list of tables and their columns.

DELETE /api/query/history/{session_id}

Clears the conversational memory for a given session.


Example Questions

Show me the top 5 employees by salary
How many orders were placed per month in 2024?
Which department has the highest average salary?
List all products in the Electronics category with stock below 300
Show customers from the United States who joined after 2021
What is the total revenue per product category?
Which employee handled the most orders?

Follow-up queries work within the same session:

Show me the top 5 employees by salary
→ Now filter for the Engineering department only
→ Include their hire date as well
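Per-session memory of this kind can be sketched as an ordered message list keyed by session ID and shared by all providers, so a follow-up question is sent along with earlier turns. This is a hypothetical stand-in for the session logic in `llm_service.py`:

```python
from collections import defaultdict

_sessions: dict[str, list[dict]] = defaultdict(list)

def remember(session_id: str, role: str, content: str,
             max_turns: int = 20) -> None:
    """Append one message and trim the history to the most recent turns."""
    history = _sessions[session_id]
    history.append({"role": role, "content": content})
    del history[:-max_turns]

def history_for(session_id: str) -> list[dict]:
    """Messages to prepend to the next LLM call, newest last."""
    return list(_sessions[session_id])

def clear(session_id: str) -> None:
    """Roughly what DELETE /api/query/history/{session_id} maps to."""
    _sessions.pop(session_id, None)
```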

Security

  • Only SELECT statements are executed — the validator checks both statement type (sqlparse) and a keyword blocklist
  • All queries run through SQLAlchemy with parameterization — no raw string interpolation
  • Queries time out after 10 seconds
  • Results are capped at 200 rows
  • CORS is restricted to http://localhost:3000 in development
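The first, third, and fourth rules above can be sketched roughly as follows. The repo's validator uses sqlparse; this stand-in uses plain keyword checks and is illustrative only (a real validator must also handle keywords appearing inside identifiers and strings):

```python
import re

BLOCKED = {"drop", "delete", "update", "insert", "alter", "truncate", "grant"}
MAX_ROWS = 200

def validate_sql(sql: str) -> str:
    """Allow a single SELECT statement and cap it at MAX_ROWS rows.
    Keyword-based sketch of the sqlparse-based validator."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        raise ValueError("multiple statements are not allowed")
    if not stripped.lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    tokens = set(re.findall(r"[a-z_]+", stripped.lower()))
    if tokens & BLOCKED:
        raise ValueError("blocked keyword in query")
    if "limit" not in tokens:
        stripped += f" LIMIT {MAX_ROWS}"
    return stripped
```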
