aeindri-tech/Student_Performance_Predictor

🎓 Student Performance Predictor (Machine Learning Project)

A complete end-to-end Machine Learning project that predicts a student's Performance Index based on academic and lifestyle factors using Linear Regression.


🚀 Project Overview

This project analyzes how different factors affect student performance and builds a predictive model on a realistic student dataset.

It takes user inputs such as study hours, sleep, and previous scores, and predicts the expected performance index.


📊 Dataset Features

The dataset contains 10,000 entries with the following features:

  • 📘 Hours Studied (Numeric)
  • 📝 Previous Scores (Numeric)
  • 🎯 Extracurricular Activities (Categorical: Yes/No)
  • 😴 Sleep Hours (Numeric)
  • 📄 Sample Question Papers Practiced (Numeric)
  • 🎯 Performance Index (Target Variable)

🧠 Machine Learning Workflow

✅ Steps Performed:

  1. Data Loading
  2. Data Cleaning
  3. Encoding Categorical Variables
  4. Feature Scaling (StandardScaler)
  5. Train-Test Split
  6. Model Training (Linear Regression)
  7. Model Evaluation
  8. User Input Prediction System
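The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of main.py: the synthetic data generator stands in for Student_Performance.csv, its coefficients are made up, and the cleaning step (dropping missing and duplicate rows) and the Yes/No to 1/0 mapping are assumptions. The sketch also fits the scaler on the training split only, a common way to keep test data out of preprocessing, which may differ from the order used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

FEATURES = [
    "Hours Studied",
    "Previous Scores",
    "Extracurricular Activities",
    "Sleep Hours",
    "Sample Question Papers Practiced",
]
TARGET = "Performance Index"


def make_synthetic_data(n_rows=500, seed=0):
    """Hypothetical stand-in for Student_Performance.csv (made-up relationship)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Hours Studied": rng.integers(1, 10, n_rows),
        "Previous Scores": rng.integers(40, 100, n_rows),
        "Extracurricular Activities": rng.choice(["Yes", "No"], n_rows),
        "Sleep Hours": rng.integers(4, 10, n_rows),
        "Sample Question Papers Practiced": rng.integers(0, 10, n_rows),
    })
    noise = rng.normal(0, 2, n_rows)
    df[TARGET] = 2.5 * df["Hours Studied"] + df["Previous Scores"] - 30 + noise
    return df


def train(df):
    # Data cleaning (assumed): drop missing values and exact duplicate rows.
    df = df.dropna().drop_duplicates().copy()
    # Encode the categorical column as 1 (Yes) / 0 (No).
    df["Extracurricular Activities"] = df["Extracurricular Activities"].map(
        {"Yes": 1, "No": 0}
    )
    X, y = df[FEATURES], df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Fit the scaler on the training split only, then reuse it for the test split.
    scaler = StandardScaler().fit(X_train)
    model = LinearRegression().fit(scaler.transform(X_train), y_train)
    preds = model.predict(scaler.transform(X_test))
    metrics = {
        "mae": mean_absolute_error(y_test, preds),
        "r2": r2_score(y_test, preds),
    }
    return model, scaler, metrics


if __name__ == "__main__":
    _, _, metrics = train(make_synthetic_data())
    print(metrics)
```

To run it against the real dataset, replace `make_synthetic_data()` with `pd.read_csv("Student_Performance.csv")`, assuming the CSV column names match the feature list above.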

📈 Model Performance

  • 🔹 Mean Absolute Error (MAE): 1.61
  • 🔹 R² Score: 0.9889

👉 The model performs extremely well: an R² of 0.9889 means it explains about 99% of the variance in the Performance Index.
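For reference, both metrics reported above can be computed by hand. The sketch below uses four made-up values (not project data) to show the standard definitions: MAE is the mean of the absolute errors, and R² is one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2_score(y_true, y_pred):
    """1 - SS_res / SS_tot: share of target variance explained by the model."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)


# Illustrative values only.
actual = [50, 60, 70, 80]
predicted = [52, 59, 71, 78]

print(mean_absolute_error(actual, predicted))  # 1.5
print(r2_score(actual, predicted))             # about 0.98
```

These hand-written functions mirror `sklearn.metrics.mean_absolute_error` and `sklearn.metrics.r2_score`, which are the usual choice in practice.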


🖥️ How It Works

After training, the program allows real-time predictions:

Example:

Enter student details:
Hours Studied: 5
Previous Scores: 85
Extracurricular Activities (Yes/No): yes
Sleep Hours: 8
Sample Papers Practiced: 0

Predicted Performance Index: 71.21
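One way the typed answers could be turned into a feature row is sketched below. The function names `encode_activity` and `build_feature_row` are hypothetical, and the 1/0 encoding for Yes/No is an assumption; the actual handling in main.py may differ.

```python
def encode_activity(answer: str) -> int:
    """Map a Yes/No answer (any casing, surrounding spaces allowed) to 1/0."""
    normalized = answer.strip().lower()
    if normalized in ("yes", "y"):
        return 1
    if normalized in ("no", "n"):
        return 0
    raise ValueError(f"Expected Yes or No, got {answer!r}")


def build_feature_row(hours, previous_scores, activity, sleep_hours, papers):
    """Convert raw text answers into numeric features, in training column order."""
    return [
        float(hours),
        float(previous_scores),
        encode_activity(activity),
        float(sleep_hours),
        float(papers),
    ]


# The answers from the example session above.
row = build_feature_row("5", "85", "yes", "8", "0")
print(row)  # [5.0, 85.0, 1, 8.0, 0.0]
```

Rejecting anything other than a Yes/No answer with a `ValueError` keeps a typo from silently becoming a wrong feature value.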


⚠️ Note

You may see this warning:

UserWarning: X does not have valid feature names

👉 This happens because the model (or scaler) was fitted on a pandas DataFrame with column names, but the prediction input is a plain list or array without them.

✅ Fix (Optional Improvement):

Use a DataFrame for prediction instead of a list.


🛠️ Tech Stack

  • 🐍 Python
  • 📊 NumPy
  • 📁 Pandas
  • 🤖 Scikit-learn

📂 Project Structure

Student_Performance_Predictor/
│── main.py
│── Student_Performance.csv
│── README.md


🔮 Future Improvements

  • 🔹 Add Polynomial Regression
  • 🔹 Try Advanced Models (Random Forest, XGBoost)
  • 🔹 Hyperparameter Tuning
  • 🔹 Build Web App (Streamlit)

💡 Key Learnings

  • Real-world ML pipeline
  • Feature scaling importance
  • Handling categorical data
  • Model evaluation metrics
  • Building interactive ML systems

⭐ Final Thoughts

This is a solid beginner-to-intermediate ML project that demonstrates:

✔ End-to-end pipeline
✔ Real-world data handling
✔ Model deployment logic (CLI-based)


🚀 Built with dedication by Aeindri
