nberdi/LexAI

LexAI

LexAI is a RAG-powered legal document analyzer for contracts, leases, terms of service, NDAs, and other PDF legal documents. Upload a document, ask questions in plain English, and get grounded answers with exact clause-level citations from the source text.

Tech Stack

  • Next.js 14 App Router
  • TypeScript
  • Tailwind CSS
  • Next.js API routes
  • LangChain.js
  • OpenAI text-embedding-3-small and gpt-4o
  • Prisma ORM
  • PostgreSQL with pgvector
  • pdf-parse for PDF text extraction
  • formidable for uploads
  • Docker Compose for local PostgreSQL
  • Simple password gate with HTTP-only session cookie
  • In-memory API rate limiting with lru-cache

Prerequisites

  • Node.js 18.17 or newer (the minimum supported by Next.js 14), with Node 20 LTS recommended
  • Docker and Docker Compose
  • OpenAI API key

Setup

  1. Install dependencies:

    npm install

  2. Create your local environment files:

    cp .env.local.example .env.local
    cp .env.local.example .env

    Both files are needed: Next.js loads .env.local when the app runs, while the Prisma CLI reads .env when running commands such as npx prisma migrate dev.

  3. Add your OpenAI key and app password to both .env.local and .env:

    DATABASE_URL=postgresql://postgres:postgres@localhost:5433/lexai
    OPENAI_API_KEY=your_openai_api_key_here
    APP_PASSWORD=your_password_here

  4. Start PostgreSQL with pgvector:

    docker-compose up -d

  5. Generate Prisma Client and run migrations:

    npx prisma generate
    npx prisma migrate dev

  6. Start the app:

    npm run dev

  7. Open http://localhost:3000.

How to Use

  1. Open http://localhost:3000.
  2. Enter the password configured as APP_PASSWORD.
  3. Click Upload Document or drop a PDF into the upload zone.
  4. Wait for LexAI to extract text, chunk it, generate embeddings, and index the document.
  5. Select the uploaded document from the sidebar.
  6. Ask a question in the chat input.
  7. Read the plain-language answer and review the cited source clauses.
  8. Click Sign out in the sidebar to clear the session cookie.

Architecture

When a PDF is uploaded, LexAI extracts its text with pdf-parse, splits the text into overlapping chunks of about 500 tokens, and generates an OpenAI embedding for every chunk. Each chunk and its vector are stored in PostgreSQL using pgvector.
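
The chunking step can be illustrated with a minimal sketch. It is not LexAI's actual code: the `chunkText` helper is hypothetical, whitespace-separated words stand in for tokens, and the overlap of 50 is an assumed value, since this README does not state the overlap size.

```typescript
// Hypothetical helper, not taken from the LexAI source.
// Assumption: words approximate tokens; a real tokenizer would count differently.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("chunkSize must be positive and overlap must be in [0, chunkSize)");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // each chunk starts this many words after the previous one
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // this chunk already reached the end of the text
  }
  return chunks;
}
```

Because each chunk repeats the last `overlap` words of the previous one, a clause that straddles a chunk boundary still appears whole in at least one chunk, provided it is shorter than the overlap.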

When a user asks a question, LexAI embeds the question, runs a cosine similarity search against the stored chunk vectors, and retrieves the five most relevant chunks. Those chunks are passed to gpt-4o through a LangChain chat model with a strict system prompt that requires grounded answers and exact citations.
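
The retrieval step can be sketched in memory as follows. This is an illustration only: LexAI runs the search inside PostgreSQL with pgvector, whereas this version ranks plain arrays in TypeScript, and the `StoredChunk` type and function names are hypothetical.

```typescript
// Hypothetical shape for a stored chunk; not the actual Prisma model.
interface StoredChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat a zero vector as unrelated to everything
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k chunks most similar to the query embedding, best match first.
function topKChunks(query: number[], chunks: StoredChunk[], k = 5): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

The in-memory version scores every chunk on each query, which is fine for illustration but is the work that a database-side vector search is meant to take over.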

LexAI is protected by a single shared password. Successful login sets an HTTP-only lexai_session cookie for the browser session, and middleware redirects unauthenticated requests to /login.
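
The redirect decision can be sketched as a pure function, separate from the Next.js middleware API. This is a simplified illustration with two assumptions: any non-empty lexai_session cookie counts as authenticated (a real check would also validate the value), and /login itself is exempt so the redirect cannot loop. The function names are hypothetical.

```typescript
const SESSION_COOKIE = "lexai_session";

// Assumption: presence of a non-empty cookie value is treated as "logged in".
function hasSessionCookie(cookieHeader: string | null): boolean {
  if (!cookieHeader) return false;
  const prefix = `${SESSION_COOKIE}=`;
  return cookieHeader
    .split(";")
    .map((part) => part.trim())
    .some((part) => part.startsWith(prefix) && part.length > prefix.length);
}

// Returns the path to redirect to, or null when the request may proceed.
function redirectTarget(pathname: string, cookieHeader: string | null): string | null {
  if (pathname === "/login") return null; // never redirect the login page to itself
  return hasSessionCookie(cookieHeader) ? null : "/login";
}
```

Keeping the decision in a plain function like this makes it easy to unit test without starting the app; middleware would only need to read the request path and Cookie header and act on the result.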

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| DATABASE_URL | Yes | PostgreSQL connection string. With the included Docker Compose file, use postgresql://postgres:postgres@localhost:5433/lexai for local development. |
| OPENAI_API_KEY | Yes | API key used for embeddings and chat completions. |
| APP_PASSWORD | Yes | Shared password required to access the app. |
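
Since all three variables are required, a startup check can catch a missing one early. The following is a minimal sketch, not part of LexAI: `missingEnvVars` is a hypothetical helper that takes the environment as a plain object so it can be tested without a running server.

```typescript
// The three variables this README lists as required.
const REQUIRED_VARS = ["DATABASE_URL", "OPENAI_API_KEY", "APP_PASSWORD"] as const;
type RequiredVar = (typeof REQUIRED_VARS)[number];

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): RequiredVar[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `missingEnvVars(process.env)` during startup, throwing an error that names the missing variables if the returned list is not empty.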
