Machine Learning on a Cancer Dataset - Part 30
In this machine learning tutorial we're going to continue tuning our support vector classifier (SVC) to improve its performance on the breast cancer dataset that ships with scikit-learn.
Remember from the previous video tutorial that scaling the data improved performance significantly. In fact, we went from an overfitting scenario (on the unscaled data) to an underfitting one, where the training and test scores are close to each other but the model is still too simple to fit the data as well as it could. So, we have to do something to fix the underfitting of the classifier.
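As a quick refresher, here is a minimal sketch of the scaled versus unscaled comparison. The choice of `MinMaxScaler`, the stratified split, `random_state=42`, and the helper name `train_and_score` are my assumptions for illustration, not necessarily what the video uses. Also keep in mind that the default `gamma` of `SVC` changed in newer scikit-learn releases, so the exact scores you get may differ from the ones in the video.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()

# Assumed split settings; the seed is arbitrary and only makes runs repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42
)


def train_and_score(X_tr, X_te):
    """Fit a default SVC and return (training accuracy, test accuracy)."""
    svm = SVC()
    svm.fit(X_tr, y_train)
    return svm.score(X_tr, y_train), svm.score(X_te, y_test)


# Fit the scaler on the training data only, then apply it to both sets.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

unscaled_scores = train_and_score(X_train, X_test)
scaled_scores = train_and_score(X_train_scaled, X_test_scaled)

print("unscaled train/test accuracy: {:.3f} / {:.3f}".format(*unscaled_scores))
print("scaled   train/test accuracy: {:.3f} / {:.3f}".format(*scaled_scores))
```

A large gap between the training and test score points to overfitting, while two similar scores that both leave room for improvement point to underfitting.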
There are numerous hyperparameters for SVMs that could be adjusted. However, we're only going to modify one of them here and see how it changes the performance of the classifier. Specifically, we're going to adjust the C parameter, which controls regularization: the strength of the regularization is inversely proportional to C.
By default, C is equal to 1. We're going to set it to 1,000, which weakens the regularization and thereby allows a more complex model. Then, we're going to assess whether or not the performance of our SVC improved (notice I'm using SVM and SVC interchangeably here).
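If you want to try this before watching, the experiment could look roughly like the sketch below. It assumes the data was scaled with `MinMaxScaler` and split with `random_state=42`; both are my choices for illustration, and the helper `score_for_c` and the `results` dictionary are hypothetical names. Other SVC hyperparameters are left at their defaults.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()

# Assumed split settings; the seed is arbitrary and only makes runs repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42
)

# Assumed scaler: rescale every feature to [0, 1] using training statistics.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)


def score_for_c(c_value):
    """Train an SVC with the given C and return (train, test) accuracy."""
    svm = SVC(C=c_value)
    svm.fit(X_train_scaled, y_train)
    return (
        svm.score(X_train_scaled, y_train),
        svm.score(X_test_scaled, y_test),
    )


# Compare the default C=1 with the less regularized C=1000.
results = {c: score_for_c(c) for c in (1, 1000)}

for c, (train_acc, test_acc) in results.items():
    print("C={:<5} train: {:.3f}  test: {:.3f}".format(c, train_acc, test_acc))
```

If the underfitting diagnosis is right, the larger C should push the training score up. Whether the test score follows is exactly what we want to find out, and the numbers you see depend on your split and scikit-learn version.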
Please see the video below for the complete walk-through.
Previous videos in this series:
- Machine Learning on a Cancer Dataset - Part 20
- Machine Learning on a Cancer Dataset - Part 21
- Machine Learning on a Cancer Dataset - Part 22
- Machine Learning on a Cancer Dataset - Part 23
- Machine Learning on a Cancer Dataset - Part 24
- Machine Learning on a Cancer Dataset - Part 25
- Machine Learning on a Cancer Dataset - Part 26
- Machine Learning on a Cancer Dataset - Part 27
- Machine Learning on a Cancer Dataset - Part 28
- Machine Learning on a Cancer Dataset - Part 29
To stay in touch with me, follow @cristi
Cristi Vlad, Self-Experimenter and Author