Pneumonia Detection from Chest X-ray Images using Transfer Learning

This project demonstrates a deep learning model for classifying chest X-ray images as either 'NORMAL' or 'PNEUMONIA'. It uses transfer learning with the VGG16 architecture and reaches about 93.4% validation accuracy, with about 94.6% recall on the PNEUMONIA class.

The code is structured for use in a Google Colab environment and implements a strategic two-phase training process to maximize the performance of the pre-trained model on this specific medical imaging task. This project was completed as part of my internship to showcase skills in computer vision, deep learning model development, and experimental iteration.

Key Features

- Uses transfer learning with a VGG16 model pre-trained on ImageNet.
- Implements a two-phase training strategy:
  1. Initial Training (Feature Extraction): trains a custom classifier head on top of the frozen base model, so the head learns to classify from the features VGG16 already extracts.
  2. Fine-Tuning: unfreezes the VGG16 base model and trains the entire network with a low learning rate to adapt it to the specifics of X-ray images.
- Handles the dataset's class imbalance with calculated class weights, so that errors on the under-represented class count for more during training.
- Employs data augmentation (random rotations, shifts, zooms, flips) to produce a more robust model that generalizes better to new, unseen images.
- Achieved a final validation accuracy of ~93.4%, with precision and recall for the PNEUMONIA class both near 95%.
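
To illustrate the class-weight idea, here is a minimal sketch of the common "balanced" weighting formula, n_samples / (n_classes * class_count), which is also what scikit-learn's `compute_class_weight` uses in its `"balanced"` mode. The helper name `balanced_class_weights` and the label counts (1000 vs. 3000) are hypothetical and not taken from the notebook or the dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }


# Hypothetical imbalanced label list: 0 = NORMAL, 1 = PNEUMONIA (assumed encoding).
labels = [0] * 1000 + [1] * 3000
weights = balanced_class_weights(labels)
print(weights)  # the rarer class 0 receives the larger weight
```

A dictionary of this shape can be passed to Keras `model.fit` through its `class_weight` argument.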

Dataset

The model was trained on the "Chest X-ray Images (Pneumonia)" dataset available on Kaggle.

Link to Dataset: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia

Methodology

The core of this project is the two-phase training approach, which is a standard and effective technique in transfer learning:

Initial Phase (Training the Head): We start with the VGG16 model, with its weights pre-trained on ImageNet, and freeze the entire base. We then add our own small classifier head (consisting of a GlobalAveragePooling2D, Dropout, and a final Dense layer with a sigmoid activation function). This new head is trained for several epochs on our X-ray data. This allows our new layers to learn how to interpret the features extracted by the powerful, frozen VGG16 base without disrupting its valuable, pre-learned weights. A very low learning rate is used to ensure stable learning.
Fine-Tuning Phase: After the initial phase provides a good set of weights for the classifier head, we unfreeze the VGG16 base model. The entire network is then trained with a very low learning rate (5e-5). This allows the model to make small, careful adjustments to the pre-trained features to better suit the specific nuances of the chest X-ray images, leading to a significant performance boost.
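
The two phases can be sketched in Keras roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the function names `build_model` and `compile_for_phase`, the 224x224 input size, the 0.5 dropout rate, the Adam optimizer, and the metric choices are all assumptions. Only the layer types in the head and the 5e-5 fine-tuning learning rate come from the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16


def build_model(weights="imagenet", input_shape=(224, 224, 3), dropout=0.5):
    """Frozen VGG16 base plus a small head: GAP -> Dropout -> sigmoid Dense."""
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # phase 1: keep the pre-trained features fixed

    inputs = layers.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(dropout)(x)  # dropout rate is an assumed value
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return models.Model(inputs, outputs), base


def compile_for_phase(model, learning_rate):
    """(Re)compile the model; needed again after changing `trainable`."""
    model.compile(
        optimizer=optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",
        metrics=[
            "accuracy",
            tf.keras.metrics.Precision(name="precision"),
            tf.keras.metrics.Recall(name="recall"),
        ],
    )
```

For phase 1, one would call `build_model()`, then `compile_for_phase(model, lr)` with the initial learning rate used in the notebook, then `model.fit(...)`. For phase 2, set `base.trainable = True`, call `compile_for_phase(model, 5e-5)` so the change takes effect, and fit again.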

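Callbacks such as EarlyStopping and ReduceLROnPlateau are used in both phases. EarlyStopping halts training when the monitored metric stops improving and restores the weights from the best-performing epoch, which limits overfitting. ReduceLROnPlateau lowers the learning rate when progress stalls.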
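
A callback setup along these lines might look like the sketch below. The helper name `make_callbacks`, the monitored quantity (`val_loss`), the patience values, the reduction factor, and the minimum learning rate are assumptions chosen for illustration; the notebook's actual settings may differ.

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau


def make_callbacks(patience=5):
    """Build the two callbacks; all numeric values here are assumed defaults."""
    early_stop = EarlyStopping(
        monitor="val_loss",
        patience=patience,
        restore_best_weights=True,  # roll back to the best epoch when stopping
    )
    reduce_lr = ReduceLROnPlateau(
        monitor="val_loss",
        factor=0.2,                      # multiply the learning rate by 0.2
        patience=max(1, patience // 2),  # react sooner than early stopping
        min_lr=1e-7,
    )
    return [early_stop, reduce_lr]
```

The returned list can be passed to `model.fit(..., callbacks=make_callbacks())` in each phase.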

Final Performance

The final model, after the completion of the fine-tuning phase, achieved the following performance on the validation set (624 images). The weights from the best-performing epoch were restored by the EarlyStopping callback.

Validation Accuracy: ~93.43%
Validation Loss: ~0.1887
Precision (for PNEUMONIA): ~94.86% (When the model predicts PNEUMONIA, it is correct ~95% of the time).
Recall (for PNEUMONIA): ~94.62% (The model correctly identifies ~95% of all actual PNEUMONIA cases).
F1-Score (for PNEUMONIA): ~94.74%
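
For readers who want to see how these metrics relate to one another, the sketch below computes them from confusion-matrix counts. The counts are hypothetical: the project's actual confusion matrix is not shown here, and these values were chosen only because they sum to 624 images and are consistent with the percentages listed above. The function name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the positive (PNEUMONIA) class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts (not from the notebook) totalling 624 images.
metrics = binary_metrics(tp=369, fp=20, fn=21, tn=214)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# accuracy: 93.43%
# precision: 94.86%
# recall: 94.62%
# f1: 94.74%
```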

How to Use This Project

1. Setup

Clone the repository:

git clone https://github.com/[Your-GitHub-Username]/[Your-Repo-Name].git
cd [Your-Repo-Name]

Install dependencies:
```
pip install -r requirements.txt
```
Download the dataset:
- The training notebook (Pneumonia_Detection_Training.ipynb) contains a cell to download the dataset from Kaggle. You will need a kaggle.json API token file from your Kaggle account.
- Place the downloaded and unzipped data in a directory structure that the script expects, e.g., /content/datasets/pneumonia/chest_xray/ with train and test subfolders.
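
Before training, it can help to confirm that the data landed where expected. The sketch below counts image files per split and class. It assumes each split folder contains `NORMAL` and `PNEUMONIA` subfolders (matching the two class names) and that images use `.jpeg`, `.jpg`, or `.png` extensions; the helper name `count_images` is hypothetical and not part of this repository.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}  # assumed image extensions


def count_images(root, splits=("train", "test"), classes=("NORMAL", "PNEUMONIA")):
    """Return {(split, class): image_count}; raise if a folder is missing."""
    root = Path(root)
    counts = {}
    for split in splits:
        for cls in classes:
            folder = root / split / cls
            if not folder.is_dir():
                raise FileNotFoundError(f"Missing folder: {folder}")
            counts[(split, cls)] = sum(
                1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

For example, `count_images("/content/datasets/pneumonia/chest_xray")` would report the per-class counts for the train and test folders, which also makes the class imbalance visible.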

2. Training the Model

The primary training code is in the Jupyter Notebook (Pneumonia_Detection_Training.ipynb).
Open the notebook in an environment like Google Colab or Jupyter Lab.
Run the cells sequentially to download the data, build the model, and execute the two-phase training process.
The final trained model will be saved as pneumonia_detection_finetuned_fixed.keras.

3. Making Predictions on New Images

A script predict_pneumonia.py is provided to load the saved model and make predictions on new images.
Place your sample X-ray images in the sample_images/ directory.
Run the script from your terminal:
```
python predict_pneumonia.py
```
The script will load the .keras model, process each image, and print the predicted class ('NORMAL' or 'PNEUMONIA') along with the model's confidence.
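
Because the model ends in a single sigmoid unit, its output is one probability per image. The sketch below shows one plausible way to turn that number into a label and a confidence value. It is not the code from predict_pneumonia.py: the function name `label_from_probability`, the 0.5 threshold, and the assumption that the output represents the probability of PNEUMONIA (class index 1) are all illustrative.

```python
def label_from_probability(p_pneumonia, threshold=0.5):
    """Map a sigmoid output to (label, confidence in that label)."""
    if not 0.0 <= p_pneumonia <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p_pneumonia >= threshold:
        return "PNEUMONIA", p_pneumonia
    # For NORMAL, confidence is the complement of the PNEUMONIA probability.
    return "NORMAL", 1.0 - p_pneumonia


for p in (0.93, 0.12):
    label, confidence = label_from_probability(p)
    print(f"p={p:.2f} -> {label} ({confidence:.0%} confidence)")
```

Lowering the threshold trades precision for recall, which may be preferable in a screening setting where missed pneumonia cases are costlier than false alarms.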

Project Structure

.
├── Pneumonia_Detection_Training.ipynb          # Main notebook for training the model
├── predict_pneumonia.py                        # Script to predict on new images
├── requirements.txt                            # List of Python dependencies
├── pneumonia_detection_finetuned_fixed.keras   # The final saved model file
├── sample_images/                              # Directory for placing sample images for prediction
│   ├── normal_sample.jpeg
│   └── pneumonia_sample.jpeg
└── README.md                                   # This file

Technologies Used

Python 3
TensorFlow & Keras
Scikit-learn
NumPy
Matplotlib
Kaggle API
