A Generative AI chatbot that helps users ask questions about GitLab's public Handbook and Direction pages. The chatbot retrieves relevant GitLab documentation chunks from MongoDB, sends them to a Gemini-powered RAG pipeline, and returns grounded answers with source references.
- GitLab Handbook Q&A: Ask questions about GitLab's mission, values, remote work culture, asynchronous communication, DRI ownership, handbook-first approach, product direction, engineering, security, and more.
- RAG-based Answer Generation: The backend retrieves the most relevant chunks from MongoDB and generates answers using Gemini based only on the retrieved context.
- Source-backed Responses: Each answer can include source links so users can verify where the information came from.
- Guardrailing: If the user asks something unrelated to GitLab, the chatbot responds with: "I'm only able to answer questions about GitLab's Handbook."
- Suggested Questions: Starter prompts are shown on load to help users begin quickly.
- View Source Toggle: Users can expand or collapse sources for each assistant response.
- Authentication: Users can sign up, log in, and log out.
- Saved Chat History: Authenticated users can save and reload previous conversations.
- Persistent Current Chat: The current chat is preserved across page refreshes.
- Improved React UI: Modern frontend built with React, Vite, and Tailwind CSS.

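This README does not say how the backend ranks chunks, and Atlas Vector Search is listed only as a future improvement, so one plausible approach is to score the stored embeddings against the query embedding in application code. The sketch below illustrates that idea under stated assumptions: the field names `text`, `url`, and `embedding` and both helper functions are hypothetical, and the toy three-dimensional vectors stand in for real Gemini embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunks, k=3):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query_embedding, c["embedding"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


# Toy documents; the field names are assumptions, not the project's schema.
chunks = [
    {
        "text": "Mission page text",
        "url": "https://handbook.gitlab.com/handbook/company/mission/",
        "embedding": [1.0, 0.0, 0.0],
    },
    {
        "text": "Values page text",
        "url": "https://handbook.gitlab.com/handbook/values/",
        "embedding": [0.0, 1.0, 0.0],
    },
]

best = top_k_chunks([0.9, 0.1, 0.0], chunks, k=1)
print(best[0]["url"])  # prints the mission page URL
```

Scanning every chunk like this is only reasonable for a small collection; a database-side vector index would be the natural replacement as the number of chunks grows.
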
User asks a question
↓
Frontend sends question to FastAPI backend
↓
Backend checks guardrails
↓
Relevant GitLab Handbook chunks are retrieved from MongoDB
↓
Gemini generates an answer using retrieved chunks only
↓
Frontend displays answer + sources

Frontend:
- React
- Vite
- Tailwind CSS
- Axios / Fetch API
- LocalStorage for feedback and chat persistence

Backend:
- FastAPI
- Python
- Uvicorn
- Pydantic
- PyMongo
- BeautifulSoup
- Requests
- python-dotenv

AI / RAG:
- Google Gemini API
- Gemini embeddings
- MongoDB Atlas for chunk storage

Database:
- MongoDB Atlas

Collections used:
gitlab_chatbot
├── handbook_chunks
├── users
└── chats

gitlab-handbook-chatbot/
│
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── main.py
│   │   │
│   │   ├── routes/
│   │   │   ├── __init__.py
│   │   │   └── chat.py
│   │   │
│   │   ├── core/
│   │   │   ├── __init__.py
│   │   │   ├── rag_pipeline.py
│   │   │   ├── embeddings.py
│   │   │   ├── mongodb.py
│   │   │   └── guardrails.py
│   │   │
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   └── schemas.py
│   │   │
│   │   └── utils/
│   │       ├── __init__.py
│   │       └── helpers.py
│   │
│   ├── scripts/
│   │   └── ingest.py
│   │
│   ├── requirements.txt
│   └── .env
│
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── Header.jsx
│   │   │   ├── ChatWindow.jsx
│   │   │   ├── InputBar.jsx
│   │   │   ├── SuggestedQuestions.jsx
│   │   │   ├── MessageBubble.jsx
│   │   │   ├── FeedbackButtons.jsx
│   │   │   ├── SourceCard.jsx
│   │   │   ├── Sidebar.jsx
│   │   │   ├── Login.jsx
│   │   │   └── Signup.jsx
│   │   │
│   │   ├── hooks/
│   │   │   ├── useChat.js
│   │   │   └── useAuth.js
│   │   │
│   │   ├── App.jsx
│   │   ├── main.jsx
│   │   └── index.css
│   │
│   ├── package.json
│   └── .env
│
└── README.md

Clone the repository:
git clone https://github.com/your-username/gitlab-handbook-chatbot.git
cd gitlab-handbook-chatbot

Go to the backend folder and create a virtual environment:
cd backend
python -m venv venv

Activate it:
Windows:
venv\Scripts\activate
macOS/Linux:
source venv/bin/activate

Install the dependencies:
python -m pip install -r requirements.txt

If pip is missing, run:
python -m ensurepip --upgrade
python -m pip install --upgrade pip

Inside the backend/ folder, create a .env file:
MONGODB_URI=your_mongodb_atlas_connection_string
DATABASE_NAME=gitlab_chatbot
COLLECTION_NAME=handbook_chunks
GOOGLE_API_KEY=your_google_gemini_api_key
JWT_SECRET_KEY=your_secret_key
JWT_ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=1440

Create a MongoDB Atlas database named:
gitlab_chatbot

Create these collections:
handbook_chunks
users
chats

| Collection | Purpose |
|---|---|
| handbook_chunks | Stores scraped GitLab Handbook text chunks, embeddings, URLs, and metadata |
| users | Stores user account information |
| chats | Stores user chat sessions and messages |

MongoDB collections are created automatically when data is inserted, but you can also create them manually from MongoDB Atlas.

The ingestion script scrapes selected GitLab Handbook URLs, chunks the text, creates embeddings, and stores the data in MongoDB.
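
The chunking step of ingestion can be illustrated with a small helper. This is a minimal sketch, not the logic in scripts/ingest.py, which is not shown in this README: the function name `chunk_text`, the word-based splitting, and the default chunk size and overlap are all assumptions chosen for illustration.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some context in the next chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=1))
# prints ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Each resulting chunk would then be embedded and stored together with its source URL, which is what allows answers to link back to the page they came from.
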
Run:
python scripts/ingest.py

Recommended GitLab Handbook and Direction URLs to ingest:
https://handbook.gitlab.com/
https://handbook.gitlab.com/handbook/
https://handbook.gitlab.com/handbook/company/mission/
https://handbook.gitlab.com/handbook/values/
https://handbook.gitlab.com/handbook/company/culture/all-remote/
https://handbook.gitlab.com/handbook/company/culture/all-remote/asynchronous/
https://handbook.gitlab.com/handbook/company/culture/all-remote/handbook-first/
https://handbook.gitlab.com/handbook/people-group/directly-responsible-individuals/
https://about.gitlab.com/direction/
https://about.gitlab.com/direction/company/
https://about.gitlab.com/direction/dev/
https://about.gitlab.com/direction/ops/
https://about.gitlab.com/direction/sec/
https://about.gitlab.com/direction/data-science/

From the backend/ folder:
python -m uvicorn app.main:app --reload

The backend will run at:
http://127.0.0.1:8000

Health check endpoint:
http://127.0.0.1:8000/health

Go to the frontend folder and install the dependencies:
cd frontend
npm install

Inside the frontend/ folder, create a .env file containing:
VITE_API_BASE_URL=http://127.0.0.1:8000

Start the development server:
npm run dev

The frontend will run at:
http://localhost:5173

The chatbot shows starter questions like:
What is GitLab's mission?
How does GitLab work remotely?
What are GitLab's values?
What is asynchronous communication at GitLab?
What is a DRI at GitLab?
What is GitLab's handbook-first approach?

These questions are based on high-value GitLab Handbook sections and help demonstrate the chatbot's core functionality.

The app supports:
- User signup
- User login
- JWT-based authentication
- Saved chat history per user
- Sidebar profile information
- Logout
Basic flow:
User signs up / logs in
↓
Backend returns JWT token
↓
Frontend stores token
↓
Token is sent with chat/history requests
↓
User chats are saved and loaded from MongoDB

Example backend endpoints:

| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Check backend status |
| POST | /chat | Ask a chatbot question |
| POST | /auth/signup | Create new user account |
| POST | /auth/login | Log in user |
| GET | /chats | Get user's chat history |
| GET | /chats/{chat_id} | Load a specific chat |
| DELETE | /chats/{chat_id} | Delete a chat |

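
To make the token step of the authentication flow concrete, the sketch below builds and checks an HS256-signed JWT using only the Python standard library. It is an illustration of what a JWT library does internally, not the project's code: the function names, the `sub` and `exp` claims, and the error handling are assumptions, and a real backend would more likely rely on a maintained JWT library. The default expiry mirrors ACCESS_TOKEN_EXPIRE_MINUTES=1440.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_token(user_id: str, secret: str, expires_in_minutes: int = 1440) -> str:
    """Return a signed token of the form header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "exp": int(time.time()) + expires_in_minutes * 60}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: str) -> dict:
    """Return the payload if the signature is valid and the token is unexpired."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload


token = create_token("user-1", "example-secret")
print(verify_token(token, "example-secret")["sub"])  # prints user-1
```

The frontend would typically send such a token in an Authorization header on the /chat and /chats requests; the exact header format used by this project is not specified here.
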
Example request:
{
  "question": "What is GitLab's mission?"
}

Example response:
{
  "answer": "GitLab's mission is to make it so everyone can contribute...",
  "sources": [
    {
      "title": "GitLab Mission",
      "url": "https://handbook.gitlab.com/handbook/company/mission/"
    }
  ]
}

The chatbot is designed to answer only GitLab Handbook-related questions.
Example:
User: What is the capital of France?
Bot: I'm only able to answer questions about GitLab's Handbook.

This keeps the chatbot focused, transparent, and aligned with the project objective.

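
One simple way to implement such a guardrail is a keyword check that runs before retrieval. The sketch below is only an illustration under that assumption: the README does not describe the logic in guardrails.py, and the keyword list, `is_on_topic`, and `guarded_answer` are hypothetical names. Only the refusal message is taken from this document.

```python
import re

REFUSAL_MESSAGE = "I'm only able to answer questions about GitLab's Handbook."

# Hypothetical topic keywords; a real guardrail would likely be broader
# or use retrieval scores or a model-based classifier instead.
GITLAB_KEYWORDS = {
    "gitlab", "handbook", "dri", "remote", "asynchronous",
    "values", "mission", "direction",
}


def is_on_topic(question: str) -> bool:
    """Return True if the question mentions at least one topic keyword."""
    words = re.findall(r"[a-z]+", question.lower())
    return any(word in GITLAB_KEYWORDS for word in words)


def guarded_answer(question: str, answer_fn) -> str:
    """Refuse off-topic questions; otherwise delegate to the RAG pipeline."""
    if not is_on_topic(question):
        return REFUSAL_MESSAGE
    return answer_fn(question)


# A stand-in for the real pipeline, used only to exercise the guardrail.
fake_pipeline = lambda q: "answer from retrieved context"

print(guarded_answer("What is the capital of France?", fake_pipeline))
print(guarded_answer("What is GitLab's mission?", fake_pipeline))
```

A keyword check is cheap but coarse: words such as "remote" or "values" also appear in unrelated questions, so a similarity threshold on the retrieved chunks may be a more reliable signal.
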
Recommended deployment setup:
Deploy React frontend on:
- Vercel
- Netlify
Deploy FastAPI backend on:
- Render
- Railway
- Fly.io
Use for the database:
- MongoDB Atlas
Backend deployment variables:
MONGODB_URI=your_mongodb_atlas_connection_string
DATABASE_NAME=gitlab_chatbot
COLLECTION_NAME=handbook_chunks
GOOGLE_API_KEY=your_google_gemini_api_key
JWT_SECRET_KEY=your_secret_key
JWT_ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=1440

Frontend deployment variable:
VITE_API_BASE_URL=https://your-backend-url.com

After deploying the backend, update the frontend API URL to point to the deployed backend instead of localhost.

If you see an error like:
429 You exceeded your current quota

It means the Gemini API free-tier request limit has been reached. Wait for the quota to reset, reduce repeated calls, or switch to a different model or API key if allowed.

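
Reducing repeated calls can include retrying with increasing delays instead of retrying immediately. The sketch below shows exponential backoff with jitter under stated assumptions: `QuotaExceededError` is a stand-in for whatever exception the Gemini client raises on HTTP 429, and `call_with_backoff` is a hypothetical helper, not part of this project.

```python
import random
import time


class QuotaExceededError(Exception):
    """Stand-in for the client exception raised on an HTTP 429 response."""


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on quota errors with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Delays grow as base_delay * 1, 2, 4, ... plus a little jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)


# Demonstration with a fake call that fails twice and then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise QuotaExceededError("429 You exceeded your current quota")
    return "ok"


recorded_delays = []
result = call_with_backoff(flaky_call, sleep=recorded_delays.append)
print(result, len(recorded_delays))  # prints: ok 2
```

Backoff helps with short-term rate limits; if a daily quota is exhausted, retries will keep failing until the quota resets.
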
If PowerShell blocks npm:
running scripts is disabled on this system

Run PowerShell as administrator and use:
Set-ExecutionPolicy RemoteSigned

Or use Command Prompt/Git Bash instead.

If installing the latest npm fails because of a Node version mismatch, either upgrade Node.js or continue using the compatible npm version already installed.

If VS Code shows:
An environment file is configured but terminal environment injection is disabled.

Enable this setting in VS Code:
"python.terminal.useEnvFile": true

When the backend is started with:
python -m uvicorn app.main:app --reload

most backend changes reload automatically. However, restart the server manually after changing .env, installing packages, or changing major app startup logic.

- Add semantic vector search using MongoDB Atlas Vector Search
- Add admin dashboard for feedback analytics
- Add streaming responses
- Add better source ranking
- Add document refresh scheduler
- Add role-based user access
- Add public deployment URL in README after deployment
This chatbot improves access to GitLab's public knowledge base by allowing users to ask natural language questions and receive grounded, source-backed answers. It combines data scraping, RAG, generative AI, FastAPI, MongoDB, and React into a practical full-stack GenAI application.
Author: Sumit Verma
This project is licensed under the MIT License.