Demonstrate binary classification starting with a 2D chirp-boundary data set and ending with confusion-dots and ROC plots.
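A 2D chirp-boundary data set and an ROC curve for a fitted classifier can be sketched as below. The exact chirp boundary used in the notebook is not specified here, so the boundary formula, sample sizes, and classifier choice are assumptions for illustration:

```python
# Sketch: a 2D "chirp"-boundary dataset plus an ROC curve for a simple classifier.
# The chirp formula below is a hypothetical stand-in for the notebook's boundary.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(2000, 2))
# Hypothetical chirp boundary: a sine whose frequency increases with x
boundary = 0.5 * np.sin(2 * np.pi * (X[:, 0] + 1) ** 2)
y = (X[:, 1] > boundary).astype(int)  # class 1 above the chirp, class 0 below

X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

clf = LogisticRegression().fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]   # P(class 1) used as the ROC score
fpr, tpr, _ = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)
```

The `fpr`/`tpr` arrays can be passed straight to a matplotlib plot to draw the ROC curve shown in the notebook.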
The models in the notebook are listed here with their Test and [Training] accuracies (these will vary for different random training samples):
1- 87.5% [86.5%] The Known Chirp model. Just about the best that any model can do; it indicates the Bayes limit.
2- 49.6% [52.7%] A Random model. Random guessing gives ~50% for the chirp here.
3- 65.6% [69.2%] A Really Simple model. There looks to be more blue in the top half, so it predicts that...
4- 74.5% [75.0%] Logistic Regression. A simple linear boundary does OK; the test and training accuracies are similar, indicating an under-fitting model.
5- 72.6% [79.2%] Decision Tree model. The training data are over-fitted by a poorly-shaped (although cute) model...
6- 74.7% [80.0%] SVM with Polynomial Features (deg=7). Some over-fitting, but a reasonable model shape.
7- 75.1% [90.2%] SVM using a Kernel (poly, deg=15). This SVM has the flexibility to fit the training data well beyond the Bayes limit.
8- 79.5% [96.2%] Neural Network (hidden=[40,9], no regularization). Finally a model breaks the ~75% accuracy level; note the extreme over-fitting of the training data!
9- 82.1% [90.2%] Neural Network (hidden=[40,9] with L2 regularization). L2 regularization has tamed the over-fitting and improved the test accuracy.
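The train-versus-test pattern of models 4-7 can be reproduced on a synthetic chirp-like dataset. The dataset and hyperparameters here are assumptions (the synthetic boundary is noiseless, so the accuracies will differ from the table above):

```python
# Sketch of models 4-7 on a synthetic chirp-like dataset; the chirp formula
# and hyperparameters are assumptions, so accuracies will differ from the table.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y = (X[:, 1] > 0.5 * np.sin(2 * np.pi * (X[:, 0] + 1) ** 2)).astype(int)
Xtr, Xte, ytr, yte = X[:1500], X[1500:], y[:1500], y[1500:]

models = {
    "LogisticRegression": LogisticRegression(),
    # Unconstrained tree: memorizes the training set (train accuracy -> 100%)
    "DecisionTree": DecisionTreeClassifier(),
    "SVM+PolyFeatures(deg=7)": make_pipeline(
        PolynomialFeatures(degree=7), StandardScaler(), LinearSVC(dual=False)),
    "SVM kernel(poly, deg=15)": make_pipeline(
        StandardScaler(), SVC(kernel="poly", degree=15, coef0=1)),
}
for name, m in models.items():
    m.fit(Xtr, ytr)
    print(f"{name}: test={m.score(Xte, yte):.3f} train={m.score(Xtr, ytr):.3f}")
```

Comparing each model's test score against its train score makes the under-fitting (similar scores) versus over-fitting (large gap) distinction in the list above concrete.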
Models 4-7 are based on the book and accompanying repo Hands-On Machine Learning by Aurélien Géron.
The Neural Network model used here comes from Andrew Ng's Deep Learning course, specifically the Regularization exercise in Week 1 of Course 2. A model with 2 hidden layers and L2 regularization is implemented in the file reg_utils_dd.py here, which is modified from the course's reg_utils.py.
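The effect of L2 regularization in models 8-9 can be sketched without the course's numpy implementation by using scikit-learn's MLPClassifier, whose `alpha` parameter is an L2 penalty. The hidden-layer sizes [40, 9] match the text; the dataset, `alpha` values, and iteration count are assumptions:

```python
# Sketch of the L2-regularization comparison (models 8-9) using sklearn's
# MLPClassifier in place of the course-derived reg_utils_dd.py code.
# hidden_layer_sizes matches the text; alpha values here are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(2000, 2))
y = (X[:, 1] > 0.5 * np.sin(2 * np.pi * (X[:, 0] + 1) ** 2)).astype(int)
Xtr, Xte, ytr, yte = X[:1500], X[1500:], y[:1500], y[1500:]

for alpha in (0.0, 1.0):  # no regularization vs. an L2 penalty
    nn = MLPClassifier(hidden_layer_sizes=(40, 9), alpha=alpha,
                       max_iter=2000, random_state=0).fit(Xtr, ytr)
    print(f"alpha={alpha}: test={nn.score(Xte, yte):.3f} "
          f"train={nn.score(Xtr, ytr):.3f}")
```

With `alpha=0.0` the train-test gap is typically larger; raising `alpha` shrinks the weights and narrows that gap, which is the taming effect model 9 shows.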
Images of the Trained Models Classifying the Test Data