TL;DR: Compares Logistic Regression and Random Forest on a small dataset (Iris), reporting accuracy, ROC-AUC, and a confusion matrix for each model.
- Load the Iris dataset from scikit-learn
- Train two classifiers: Logistic Regression and Random Forest
- Evaluate using Accuracy, ROC-AUC, and Confusion Matrix (Iris has three classes, so ROC-AUC needs a multiclass scheme such as one-vs-rest)
- PM-style commentary on tradeoffs
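
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the 70/30 stratified split, `random_state=42`, `max_iter=1000`, `n_estimators=100`, and the one-vs-rest ROC-AUC are all assumptions, so the numbers it prints may differ from the results reported in this README.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Load the 150-sample, 3-class Iris dataset.
X, y = load_iris(return_X_y=True)

# Assumed split: 70% train / 30% test, stratified by class, fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Assumed hyperparameters; the notebook may use different ones.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)  # shape: (n_samples, 3)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        # Multiclass ROC-AUC computed one-vs-rest from class probabilities.
        "roc_auc": roc_auc_score(y_test, y_proba, multi_class="ovr"),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.3f}, ROC-AUC (OvR)={r['roc_auc']:.3f}")
    print(r["confusion_matrix"])
```

Rows of each confusion matrix are true classes and columns are predicted classes, so off-diagonal entries show which species get confused with each other.
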
Open the notebook in Google Colab:
- Logistic Regression accuracy: ~95%
- Random Forest accuracy: ~97%
- Random Forest scores about two percentage points higher on this split, a small gap on a 150-sample dataset; Logistic Regression is simpler and faster to train.
- Both models separate the classes well; Random Forest can also capture nonlinear decision boundaries, which Logistic Regression, as a linear model, cannot.
- Try more classifiers (SVM, KNN, Gradient Boosting)
- Use cross-validation instead of a single train/test split
- Plot feature importance from Random Forest
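
Two of these next steps, cross-validation and feature importance, could look something like the sketch below. The 5-fold stratified setup, the seed, and the hyperparameters are assumptions chosen for illustration. The importances are printed rather than plotted to keep the example dependency-free.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

iris = load_iris()
X, y = iris.data, iris.target

# Assumed setup: 5 stratified, shuffled folds with a fixed seed.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# One accuracy score per fold for each model.
cv_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}
for name, scores in cv_scores.items():
    print(f"{name}: mean accuracy={scores.mean():.3f} (std={scores.std():.3f})")

# Impurity-based feature importances from a forest fit on the full dataset.
forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)
importances = sorted(
    zip(iris.feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for feature, score in importances:
    print(f"{feature}: {score:.3f}")
```

The sorted `importances` pairs can be passed to a bar chart (for example matplotlib's `barh`) to produce the plot. The per-fold standard deviation gives a rough sense of whether a gap between the two models is larger than split-to-split noise.
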
