This project automates the extraction and organization of data from receipt images. It simplifies manual data entry by converting raw receipt images into structured JSON data and persistent local records.
Inspired by @IAmTomShaw's Receipt Vision.
- Image Processing: Upload receipt images for automated text extraction using Tesseract OCR.
- Language Model Parsing: OpenAI's GPT-4 converts the raw OCR text into strictly structured JSON.
- Mock Testing Mode: Bypass external API calls and Tesseract processing by using the built-in `test` API key, enabling rapid local UI development.
- Data Export: Export parsed receipt details (products, quantities, prices, categories) directly to standard CSV format.
- Local Storage: Parsed data is stored locally in a lightweight SQLite database.
- Responsive Interface: A clean, minimalistic web interface for uploading, viewing, and exporting receipts on any device.
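
The mock testing mode in the feature list can be pictured as a simple branch on the submitted key. The following is a minimal sketch, not the project's actual code: it assumes the sentinel is the exact string `test`, and the names `process_receipt`, `real_pipeline`, and `MOCK_RECEIPT` are hypothetical.

```python
import copy
from typing import Callable

MOCK_KEY = "test"  # sentinel key named in this README

# Hypothetical canned payload, shaped like the README's sample output.
MOCK_RECEIPT = {
    "total": 1250,
    "store": "SuperMart",
    "items": [
        {"product": "Milk", "quantity": 2, "price": 250, "type": "Groceries"},
    ],
}


def process_receipt(
    image_bytes: bytes,
    api_key: str,
    real_pipeline: Callable[[bytes, str], dict],
) -> dict:
    """Return canned data for the mock key, else delegate to the real pipeline."""
    if api_key == MOCK_KEY:
        # Mock mode: no OCR and no external API call is made.
        return copy.deepcopy(MOCK_RECEIPT)
    # Assumption: the real pipeline (OCR + language model) is injected by the caller.
    return real_pipeline(image_bytes, api_key)
```

Passing the real pipeline in as a callable keeps the branch easy to test, since the OCR and API steps can be replaced with a stub.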
- Backend: Flask (Python)
- Frontend: HTML, CSS, Vanilla JavaScript
- Database: SQLite
- External APIs: OpenAI API
- OCR Engine: Tesseract
- Python 3.10 or higher
- OpenAI API Key (or use `test` for mock data)
- Tesseract OCR installed on your system (optional if using mock mode)
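
The prerequisites above can be checked before starting the app. This is a rough sketch rather than part of the project: the helper name `missing_prerequisites` is hypothetical, and it assumes the Tesseract executable is named `tesseract` and is discoverable on `PATH`.

```python
import shutil
import sys


def missing_prerequisites(mock_mode: bool = False) -> list[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or higher is required")
    # Tesseract is only needed when real OCR runs (not in mock mode).
    # Assumption: the binary is called "tesseract" and is on PATH.
    if not mock_mode and shutil.which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")
    return problems
```

Calling `missing_prerequisites(mock_mode=True)` skips the Tesseract check, matching the note that Tesseract is optional in mock mode.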
- Clone the repository:

      git clone https://github.com/JustCabaret/receipt-parser.git
      cd receipt-parser

- Set up the virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Environment Configuration:
  - Create a copy of the `.env.example` file and rename it to `.env`:

        cp .env.example .env

  - Open the `.env` file and set your API Key:

        OPENAI_API_KEY=your_openai_api_key_here

- Run the application:

      python backend/app.py

  The SQLite database and tables will be initialized automatically on the first run. The application servers will be available at `http://127.0.0.1:5000` (API) and `http://127.0.0.1:8000` (Frontend).
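
Automatic initialization on first run is commonly done with `CREATE TABLE IF NOT EXISTS`, which makes startup safe to repeat. The sketch below illustrates that pattern only: the table and column names, the `receipts.db` filename, and the `init_db` helper are hypothetical and may not match the project's real schema.

```python
import sqlite3

# Hypothetical schema; the project's real tables and columns may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS receipts (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,
    store TEXT NOT NULL,
    total INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS items (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    receipt_id INTEGER NOT NULL REFERENCES receipts(id),
    product    TEXT NOT NULL,
    quantity   INTEGER NOT NULL,
    price      INTEGER NOT NULL,
    type       TEXT
);
"""


def init_db(path: str = "receipts.db") -> sqlite3.Connection:
    """Open the database and create any missing tables; safe to call repeatedly."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    conn.commit()
    return conn
```

Because every statement is guarded by `IF NOT EXISTS`, running the script against an existing database leaves stored receipts untouched.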
- Access the application: Open `http://127.0.0.1:8000` in your browser.
- Configure Authentication: Click "API Key" to input your token. You can enter the word `test` to simulate backend processing without consuming API credits.
- Upload Receipts: Submit images through the upload modal.
- View & Export: Click on any parsed receipt row to view the line-item breakdown, and use the "Export CSV" button to download the data.
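
To illustrate the export step, here is a small sketch of turning a parsed receipt's line items into CSV text with Python's standard `csv` module. It assumes items carry the `product`, `quantity`, `price`, and `type` keys from this README's sample data; the function name `receipt_to_csv` and the column order are hypothetical, and the app's real export may be implemented differently (for example, in the frontend).

```python
import csv
import io

# Column order is an assumption; keys follow the README's sample item objects.
FIELDS = ["product", "quantity", "price", "type"]


def receipt_to_csv(receipt: dict) -> str:
    """Serialize a receipt's line items to CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any item keys that are not in FIELDS.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for item in receipt.get("items", []):
        writer.writerow(item)
    return buffer.getvalue()
```

Using `csv.DictWriter` rather than manual string joining ensures product names containing commas or quotes are escaped correctly.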
- POST /process_receipt: Uploads and processes a receipt image.
  - Input: `multipart/form-data` (Image file, API key)
  - Output: Processed receipt data in JSON format
- GET /receipts: Retrieves all processed receipts.
- GET /receipts/{receipt_id}: Retrieves line-item details for a specific receipt.
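
A client consuming these endpoints might load the returned JSON into typed records. The following is a minimal sketch, assuming the response carries the `total`, `store`, and `items` fields shown in this README's sample payload; the `Item` and `Receipt` classes and `parse_receipt` are hypothetical names, and the exact response shape of each endpoint is not confirmed here.

```python
import json
from dataclasses import dataclass


@dataclass
class Item:
    product: str
    quantity: int
    price: int
    type: str


@dataclass
class Receipt:
    store: str
    total: int
    items: list[Item]


def parse_receipt(payload: str) -> Receipt:
    """Parse a JSON string into a Receipt.

    Raises KeyError if a top-level field is missing and TypeError if an
    item has missing or unexpected keys.
    """
    data = json.loads(payload)
    items = [Item(**raw) for raw in data["items"]]
    return Receipt(store=data["store"], total=data["total"], items=items)
```

Failing loudly on a malformed payload makes it easier to notice when the language model's output drifts from the expected structure.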
Example of processed receipt data:

    {
      "total": 1250,
      "store": "SuperMart",
      "items": [
        {"product": "Milk", "quantity": 2, "price": 250, "type": "Groceries"},
        {"product": "Bread", "quantity": 1, "price": 150, "type": "Groceries"}
      ]
    }

Feel free to fork the project, submit pull requests, report bugs, or suggest new features.
- JustCabaret (GitHub profile)
This project is licensed under the MIT License - see the LICENSE file for details.
