Comparative Analysis of Machine Learning and Deep Learning for Air Quality Prediction Using Meteorological and Climate Data
This repository contains the code for our research on air quality prediction with XGBoost, LSTM, and Informer models trained on meteorological and climate data. The goal is to compare the models' predictive performance, computational efficiency, and feature importance when predicting PM2.5 concentrations across multiple cities.
The associated paper is available on IEEE Xplore.
Comparative_Analysis_Of_Machine_Learning_and_Deep_Learning_For_Air_Quality_Prediction
│
├── Dataset and Training File/ # Dataset and scripts (training and testing)
│ ├── General_EDA.ipynb # General EDA on dataset
│ ├── Informer_Model_Training_(Exponential_Smoothing)_Fix (1).ipynb # Informer model training and testing
│ ├── LSTM_Preprocessing_Training.ipynb # LSTM model training and testing
│ ├── XGBoost_EDA_and_Preprocessing_Training.ipynb # XGBoost model training and testing
│ ├── combined_dataset.csv # Processed dataset
│ └── t_paired_test_For_RMSE_per_city_From_Each_Model.ipynb # Paired t-test script
│
└── README.md # Main project documentation
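Judging by its file name, the paired t-test notebook in the tree above compares per-city RMSE between models. As a rough illustration of that kind of test, and not the notebook's actual code, the sketch below computes a paired t statistic with only the Python standard library. The function name and the RMSE values are hypothetical.

```python
import math
import statistics


def paired_t_statistic(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for two paired samples.

    Each position in sample_a and sample_b is assumed to refer to the
    same city, so the test works on the per-city differences.
    """
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("samples must be paired and contain at least 2 values")

    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical per-city RMSE values for two models (not results from the paper).
rmse_model_a = [0.20, 0.25, 0.30, 0.35]
rmse_model_b = [0.10, 0.20, 0.20, 0.30]

t_stat, dof = paired_t_statistic(rmse_model_a, rmse_model_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

If SciPy is installed, `scipy.stats.ttest_rel` performs the same test and also returns a p-value.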
1. git clone https://github.com/Andersen-C/Comparative_Analysis_Of_Machine_Learning_and_Deep_Learning_For_Air_Quality_Prediction.git
2. Open the .ipynb notebooks in Jupyter Notebook, Google Colab, or VS Code
3. Install the libraries required by each notebook
4. Run the notebook cells
The performance of each model is summarized below:
| Model | RMSE | MAE | R² Score | MAPE |
|---|---|---|---|---|
| XGBoost | 0.1907 | 0.0939 | 0.9727 | 15.03% |
| LSTM | 0.0425 | 0.0215 | 0.9203 | 22.86% |
| Informer | 0.0253 | 0.0441 | 0.9666 | 69.12% |
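For readers unfamiliar with the four metrics in the table, the sketch below shows their standard definitions in plain Python. It is an illustration only, not the evaluation code from the notebooks. The function name and sample values are hypothetical, and MAPE as written assumes no true value is zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, R2, and MAPE (in percent) for paired sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Root mean squared error and mean absolute error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Coefficient of determination: 1 - residual SS / total SS.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    r2 = 1.0 - ss_res / ss_tot

    # Mean absolute percentage error; assumes every true value is nonzero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    return {"rmse": rmse, "mae": mae, "r2": r2, "mape": mape}


# Hypothetical values for illustration (not data from the paper).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]

for name, value in regression_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.4f}")
```

Because MAPE divides each error by the true value, it can become large when true concentrations are close to zero, even if RMSE and MAE are small.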
- Andersen Chandra - Lead Researcher
- Laurentius Nicholas - Lead Researcher
- Dr. Ir. Alexander Agung Santoso Gunawan, M.Si., M.Sc., IPM. - Supervisor
- Rilo Chandra Pradana - Supervisor