kavvyaaaa/OptiSys


OptiSys — Predictive System Resource Optimization

AI/ML pipeline for cloud cluster resource intelligence: workload clustering, multi-model utilization forecasting, exhaustion detection, and optimization recommendations. Inspired by Google Cluster Trace patterns (CPU/memory utilization, machine events, task events).

Pipeline

Simulated cluster traces
        ↓
Workload clustering (K-Means + DBSCAN anomalies)
        ↓
Resource usage prediction (4 models)
        ↓
CPU / memory exhaustion forecast
        ↓
Optimization suggestions (rightsizing, migration, savings)
        ↓
Streamlit dashboard
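The exhaustion-forecast stage can be illustrated with a minimal sketch. It assumes utilization is sampled at equal intervals as a fraction of capacity and extrapolates a straight-line trend. The function names and the 0.9 threshold are hypothetical, and the project's own forecast uses its trained models rather than this simple fit.

```python
from typing import Optional, Sequence


def fit_line(values: Sequence[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for samples taken at t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return slope, mean_v - slope * mean_t


def steps_until_exhaustion(
    values: Sequence[float], threshold: float = 0.9  # hypothetical alert level
) -> Optional[float]:
    """Estimate how many sampling steps remain before the trend hits threshold.

    Returns 0.0 if the fitted current value is already at or above the
    threshold, and None if the trend is flat or falling.
    """
    slope, intercept = fit_line(values)
    current = intercept + slope * (len(values) - 1)
    if current >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - current) / slope


if __name__ == "__main__":
    cpu = [0.50, 0.55, 0.61, 0.64, 0.70]
    print(steps_until_exhaustion(cpu))
```

A linear fit is only a baseline; it is shown here because it makes the "time until threshold" idea easy to follow.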

Project structure

.
├── app.py                 # Streamlit dashboard (ClusterMind)
├── smoke_test.py          # End-to-end pipeline verification
├── requirements.txt
├── data/                  # Generated CSV traces (gitignored; created on first run)
└── src/
    ├── data_generator.py  # Google-style cluster trace simulator
    ├── clustering.py      # K-Means + DBSCAN workload clustering
    ├── prediction.py      # RF, XGBoost, LSTM, Cluster+XGBoost
    └── optimizer.py       # Alerts, rightsizing, migration, savings
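As an illustration of what a rightsizing rule can look like, the sketch below sizes an allocation from the 95th percentile of observed usage plus headroom. This is an assumed heuristic, not the logic in src/optimizer.py; `rightsize`, `RightsizingAdvice`, and the `headroom` and `tolerance` defaults are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RightsizingAdvice:
    current_alloc: float
    recommended_alloc: float
    action: str  # "downsize", "upsize", or "keep"


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rightsize(
    usage: Sequence[float],
    current_alloc: float,
    headroom: float = 1.2,   # assumed 20% safety margin over p95 usage
    tolerance: float = 0.1,  # ignore changes smaller than 10% of the allocation
) -> RightsizingAdvice:
    """Recommend an allocation based on p95 usage times headroom."""
    if not usage:
        raise ValueError("usage must not be empty")
    target = percentile(usage, 95) * headroom
    if target < current_alloc * (1 - tolerance):
        action = "downsize"
    elif target > current_alloc * (1 + tolerance):
        action = "upsize"
    else:
        action = "keep"
    return RightsizingAdvice(current_alloc, round(target, 3), action)


if __name__ == "__main__":
    # A task that was given 4 cores but rarely uses more than 1.
    print(rightsize([0.8, 0.9, 1.0, 0.7, 1.0], current_alloc=4.0))
```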

Requirements

  • Python 3.10+
  • See requirements.txt for dependencies

Optional: install TensorFlow to enable a true LSTM backend. TensorFlow is not installed by default on Python 3.13.
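One common way to treat an optional dependency like this is to check whether it is importable before choosing a backend. The sketch below uses only the standard library; the function names and the "fallback" label are hypothetical, and what the project actually substitutes when TensorFlow is missing is not stated here.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_sequence_backend() -> str:
    """Choose "tensorflow" when it is installed, otherwise a placeholder label."""
    return "tensorflow" if has_module("tensorflow") else "fallback"


if __name__ == "__main__":
    print(f"Sequence model backend: {pick_sequence_backend()}")
```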

Setup

git clone <repo-url>
cd <project-folder>

python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate

pip install -r requirements.txt

Run

Dashboard

python -m streamlit run app.py

Open the URL shown in the terminal (default: http://localhost:8501).

Verify the full pipeline

python smoke_test.py

Run individual modules

From the project root:

python -m src.data_generator   # regenerate trace data
python -m src.clustering
python -m src.prediction
python -m src.optimizer

On first run, trace CSVs are generated under data/ (~12 MB for task-level metrics). Use Regenerate Data in the dashboard sidebar or force_regenerate=True in code to rebuild them.
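The generate-on-first-run behavior follows a familiar cache pattern, sketched below under stated assumptions. Only the `force_regenerate` flag name comes from this README; `load_or_generate`, `generate_rows`, the column names, and the file path are hypothetical stand-ins for the real generator.

```python
import csv
import random
from pathlib import Path

FIELDS = ["machine_id", "cpu_util"]  # hypothetical columns


def generate_rows(n_rows: int, seed: int = 0) -> list[dict]:
    """Produce synthetic utilization rows (stand-in for the trace simulator)."""
    rng = random.Random(seed)
    return [
        {"machine_id": i % 4, "cpu_util": round(rng.random(), 3)}
        for i in range(n_rows)
    ]


def load_or_generate(
    path: Path, n_rows: int = 100, force_regenerate: bool = False
) -> list[dict]:
    """Reuse the CSV at path if present; rebuild it when missing or forced."""
    if force_regenerate or not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(generate_rows(n_rows))
    with path.open(newline="") as fh:
        return list(csv.DictReader(fh))  # values come back as strings


if __name__ == "__main__":
    rows = load_or_generate(Path("data") / "demo_trace.csv")
    print(len(rows), "rows loaded")
```

Calling the function again without the flag reads the existing file, so the comparatively slow generation step runs only when needed.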

Dashboard tabs

  1. Cluster Live Feed — CPU heatmap, per-machine CPU/memory charts, machine summary
  2. Workload Analysis — cluster scatter, distribution, DBSCAN anomaly detection
  3. Predictive Models — model metrics (MAE, RMSE, R²), actual vs predicted, future forecast
  4. Optimizer — exhaustion alerts, rightsizing, migration suggestions, forecast gauges
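For reference, the three model metrics shown in the Predictive Models tab have standard definitions, which can be computed as in the sketch below. The function name is hypothetical, and the project may well rely on Scikit-learn's metric functions instead.

```python
import math
from typing import Sequence


def regression_metrics(
    actual: Sequence[float], predicted: Sequence[float]
) -> dict[str, float]:
    """Compute MAE, RMSE, and R² for paired actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mean_a = sum(actual) / n
    ss_tot = sum((a - mean_a) ** 2 for a in actual)  # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R² is undefined when the actual values have zero variance.
        "r2": 1 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


if __name__ == "__main__":
    print(regression_metrics([0.42, 0.55, 0.61], [0.40, 0.57, 0.66]))
```

Lower MAE and RMSE indicate smaller errors, while R² closer to 1 means the model explains more of the variance in the actual utilization.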

Tech stack

Layer                    Tools
Data processing          Python, Pandas, NumPy
ML                       Scikit-learn, XGBoost
Clustering               K-Means, DBSCAN
Time series (optional)   LSTM (TensorFlow/Keras)
Visualization            Plotly, Matplotlib
Dashboard                Streamlit

Author

Made by KAVYA RAJ
