nextframedev/image-classification-trainer

Model Trainer

License: MIT | Python 3.10+

A browser-based web UI for training image classification models locally — no cloud required.
Configure hyperparameters, launch a training run, watch live logs and accuracy charts, then download the saved model — all from your browser.


✨ Features

  • Train — pick architecture, set epochs / batch size / learning rate, select dataset
  • Live Monitor — real-time log console streamed via SSE + Chart.js accuracy & loss charts
  • Pause / Resume / Stop — gracefully pause (saves checkpoint), resume, or stop a running job
  • Jobs — view all training runs with status, progress and best validation accuracy; bulk delete
  • Models — browse saved model files grouped by run; download individual files or full packages
  • Per-job output — each run gets its own output directory for clean isolation
  • Dark / Light theme — toggle from the nav bar; persisted in local storage
  • Help page — quick-start guide and full parameter reference built in
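The Live Monitor streams log lines over SSE (Server-Sent Events). As a rough illustration of what that framing looks like on the wire, the sketch below formats a payload as one SSE message. The helper name `format_sse` and the `log` event name are hypothetical and are not taken from app.py.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Frame a text payload as a single Server-Sent Events message."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Every line of the payload needs its own "data:" field.
    for part in data.splitlines() or [""]:
        lines.append(f"data: {part}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(format_sse("epoch 1/15 started", event="log"), end="")
```

A browser-side EventSource would receive each such message as one event; a Flask route would typically yield these strings from a generator with the `text/event-stream` content type.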

🚀 Quick Start

# 1. Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Start the server
python app.py
# → http://localhost:5002

⏸ Pause, Resume, and Stop

Three controls appear on the Monitor page while a job is running:

Pause — sends a graceful stop signal. Training finishes the current epoch cleanly, saves a checkpoint (checkpoint.pt) containing the model weights, optimizer state, and epoch number, then exits. The job is marked Paused.

Resume — restarts a paused job from its checkpoint. The log and metrics charts continue from where they left off; no history is lost.

Stop — terminates the job immediately. The best model saved so far (written every time validation accuracy improves) is kept on disk. The checkpoint is removed, since the job cannot be resumed after a hard stop.
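The pause and resume behavior described above can be sketched with the standard library alone. This is a simplified stand-in, not the project's trainer: it stores only the epoch number in a JSON file, whereas the real checkpoint.pt also holds model weights and optimizer state. The names `run_training`, `run_epoch`, and `pause_flag` are hypothetical.

```python
import json
import threading
from pathlib import Path
from typing import Callable


def run_training(
    total_epochs: int,
    ckpt_path: Path,
    pause_flag: threading.Event,
    run_epoch: Callable[[int], None],
) -> str:
    """Run epochs until finished or until a pause is requested."""
    start_epoch = 0
    if ckpt_path.exists():
        # Resume: continue with the epoch after the last completed one.
        start_epoch = json.loads(ckpt_path.read_text())["epoch"] + 1

    for epoch in range(start_epoch, total_epochs):
        run_epoch(epoch)  # the current epoch always finishes cleanly
        if pause_flag.is_set():
            # Assumption: a real trainer would save weights and optimizer
            # state here as well, not just the epoch counter.
            ckpt_path.write_text(json.dumps({"epoch": epoch}))
            return "paused"
    return "completed"
```

Checking the flag only after `run_epoch` returns is what makes the pause graceful: the checkpoint always corresponds to a fully completed epoch.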

Training runs in a background process decoupled from the web server. If you close the browser tab, restart the app, or lose the connection, training continues unaffected. When you re-open the app, the job status and live log are restored automatically.
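One common way to decouple a long-running job from a web server is to start it in its own session and send its output to a log file. The sketch below shows that pattern with `subprocess.Popen`; it is an assumption about the general approach, not a copy of what app.py does, and `launch_detached` is a hypothetical helper name.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def launch_detached(cmd: List[str], log_path: Path) -> subprocess.Popen:
    """Start cmd in its own session, appending its output to log_path."""
    log_file = open(log_path, "ab")
    try:
        return subprocess.Popen(
            cmd,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # POSIX only: detach from the parent's session
        )
    finally:
        # The child holds its own copy of the file handle, so the parent can close.
        log_file.close()


if __name__ == "__main__":
    proc = launch_detached([sys.executable, "-c", "print('training...')"], Path("job.log"))
    print("started pid", proc.pid)
```

Because the output goes to a file rather than a pipe owned by the server, a restarted server can reopen the same log and keep streaming it.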


📁 Project Structure

model-trainer/
├── app.py                  # Flask app — routes, job runner, SSE stream
├── requirements.txt        # Python dependencies
├── train/                  # Training scripts & dataset
│   ├── train_image_classifier.py   # Generic PyTorch trainer (timm)
│   ├── train_tf_classifier.py      # Generic TensorFlow trainer (Keras)
│   ├── config.py                   # Shared config defaults
│   ├── dataset/                    # Dataset files
│   │   └── images/                 # ImageFolder layout: images/ClassName/img.jpg
│   └── output/                     # Per-job output directories (auto-created)
│       └── jobs.json               # Job persistence across server restarts
├── templates/
│   ├── base.html           # Shared layout (nav, theme toggle, footer)
│   ├── train.html          # New run configuration form
│   ├── index.html          # Jobs list
│   ├── monitor.html        # Live training monitor
│   ├── models.html         # Model file browser
│   └── help.html           # Quick-start guide & reference
└── static/
    └── css/
        ├── style.css       # Base design system (light/dark theme)
        └── training.css    # Training-specific styles

🗂 Dataset Format

Two formats are supported. The trainer auto-detects which one you're using.

Option 1: ImageFolder (subdirectories per class)

dataset/
├── cats/
│   ├── img001.jpg
│   └── img002.jpg
├── dogs/
│   └── img001.jpg
└── ...

Option 2: CSV + flat images

dataset/
├── labels.csv          # columns: filename, label
├── img001.jpg
├── img002.jpg
└── ...

The CSV file needs a filename column (named filename, image, file, or image_id) and a label column (named label, class, category, breed, species, or target). Images can sit next to the CSV file or in an images/ subdirectory.

When creating a new run, point the Dataset Path at the dataset folder. Classes are auto-detected from either subdirectory names or unique labels in the CSV.
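A minimal sketch of how such auto-detection could work, using only the standard library. The accepted column names come from the text above; the function name `detect_classes`, the exact-match rule for column names, and the choice to check for a CSV file first are assumptions, not the trainer's actual logic.

```python
import csv
from pathlib import Path
from typing import List, Tuple

FILENAME_COLUMNS = ("filename", "image", "file", "image_id")
LABEL_COLUMNS = ("label", "class", "category", "breed", "species", "target")


def detect_classes(dataset_dir: Path) -> Tuple[str, List[str]]:
    """Return (format, sorted class names) for a dataset folder."""
    csv_files = sorted(dataset_dir.glob("*.csv"))
    if csv_files:
        with csv_files[0].open(newline="") as f:
            reader = csv.DictReader(f)
            fields = reader.fieldnames or []
            file_col = next((c for c in fields if c in FILENAME_COLUMNS), None)
            label_col = next((c for c in fields if c in LABEL_COLUMNS), None)
            if file_col is None or label_col is None:
                raise ValueError(f"unrecognized CSV columns: {fields}")
            # Classes are the unique values of the label column.
            classes = sorted({row[label_col] for row in reader})
        return "csv", classes
    # Otherwise treat each subdirectory as one class.
    classes = sorted(p.name for p in dataset_dir.iterdir() if p.is_dir())
    return "imagefolder", classes
```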

Class names are saved to class_names.json in the job output.
If folder names are ImageNet synset IDs (e.g. n01440764), a human-readable imagenet_labels.json mapping is generated automatically.
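For illustration, ImageNet synset IDs follow the pattern of the letter n plus eight digits, so they are easy to recognize with a regular expression. The sketch below checks for that pattern and writes class names to class_names.json. The helper names and the JSON layout (a plain list) are assumptions; the project's real file format may differ.

```python
import json
import re
from pathlib import Path
from typing import List

SYNSET_ID = re.compile(r"^n\d{8}$")  # e.g. n01440764


def looks_like_synsets(class_names: List[str]) -> bool:
    """True if every class name has the shape of an ImageNet synset ID."""
    return bool(class_names) and all(SYNSET_ID.match(n) for n in class_names)


def save_class_names(class_names: List[str], output_dir: Path) -> Path:
    """Write the class list as JSON into the job's output directory."""
    path = output_dir / "class_names.json"
    path.write_text(json.dumps(class_names, indent=2))
    return path
```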


⚙️ Supported Architectures

PyTorch (timm)

Key              Model            Notes
mobilevit_xxs    MobileViT-XXS    Lightweight; great for mobile
efficientnet_b0  EfficientNet-B0  Strong accuracy/size trade-off
resnet50         ResNet-50        Classic workhorse
convnext_tiny    ConvNeXt-Tiny    Modern CNN
vit_tiny         ViT-Tiny         Vision Transformer

TensorFlow (Keras)

Key                Model           Notes
mobilenetv2_tf     MobileNetV2     Good for TFLite export
efficientnetb0_tf  EfficientNetB0  Strong accuracy/size trade-off
resnet50_tf        ResNet50        Classic workhorse

🔧 Environment Variable Overrides

The training scripts respect these env vars (set automatically by the web app):

Variable            Default         Description
TRAIN_ARCHITECTURE  mobilevit_xxs   timm model name or Keras app name
TRAIN_EPOCHS        15 / 20         Number of training epochs
TRAIN_BATCH_SIZE    32              Mini-batch size
TRAIN_LR            1e-4            Learning rate
TRAIN_DATASET_PATH  dataset/images  Path to ImageFolder root
TRAIN_NUM_CLASSES   0 (auto)        Number of output classes (0 = auto-detect)
TRAIN_OUTPUT_DIR    output          Where model files are saved
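A script could read these overrides roughly as follows. This is a sketch under stated assumptions: the `TrainConfig` dataclass and `load_config` function are hypothetical names, and since the table lists two epoch defaults (15 / 20), the value 15 is picked here purely for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass
class TrainConfig:
    architecture: str
    epochs: int
    batch_size: int
    lr: float
    dataset_path: str
    num_classes: int
    output_dir: str


def load_config(env: Mapping[str, str] = os.environ) -> TrainConfig:
    """Build a config from environment variables, falling back to defaults."""
    return TrainConfig(
        architecture=env.get("TRAIN_ARCHITECTURE", "mobilevit_xxs"),
        epochs=int(env.get("TRAIN_EPOCHS", "15")),  # assumed: 15 of "15 / 20"
        batch_size=int(env.get("TRAIN_BATCH_SIZE", "32")),
        lr=float(env.get("TRAIN_LR", "1e-4")),
        dataset_path=env.get("TRAIN_DATASET_PATH", "dataset/images"),
        num_classes=int(env.get("TRAIN_NUM_CLASSES", "0")),  # 0 = auto-detect
        output_dir=env.get("TRAIN_OUTPUT_DIR", "output"),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dict, without touching the real process environment.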

Server env vars:

Variable     Default    Description
FLASK_DEBUG  false      Set to true to enable the Werkzeug debugger (dev only)
SECRET_KEY   (dev key)  Flask session secret — set a long random value in production
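As a small illustration, a long random SECRET_KEY can be generated with Python's `secrets` module, and a true/false flag such as FLASK_DEBUG can be parsed from its string form. How app.py actually interprets FLASK_DEBUG is not documented here, so the parsing rule below is an assumption, and both helper names are hypothetical.

```python
import os
import secrets
from typing import Mapping


def debug_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Treat FLASK_DEBUG as on only when it is the string 'true' (any case)."""
    return env.get("FLASK_DEBUG", "false").strip().lower() == "true"


def new_secret_key(nbytes: int = 32) -> str:
    """Return a random hex string suitable for use as a session secret."""
    return secrets.token_hex(nbytes)


if __name__ == "__main__":
    # Copy the printed value into the SECRET_KEY environment variable.
    print(new_secret_key())
```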

📄 License

MIT — see LICENSE.

Books by the Authors

QR code to our books on Amazon
Scan to check out our books on Amazon
