Machine Learning on a Cancer Dataset - Part 30

in #programming, 8 years ago

In this machine learning tutorial we're going to continue optimizing our support vector machine algorithm (SVC) to improve its performance on the cancer dataset in scikit-learn.

Recall from the previous video tutorial that scaling the data improved the performance significantly. In fact, we went from overfitting (which we had on the unscaled data) to underfitting. So, we now have to do something to fix the underfitting of the classifier.

SVMs have numerous hyper-parameters that could be adjusted. However, we're only going to modify one of them here, and see how it changes the performance of the classifier. Specifically, we're going to adjust the C parameter, which controls regularization: the larger C is, the weaker the regularization.

By default, C is equal to 1. We're going to set it to 1,000, thereby increasing the complexity of our model. Then, we're going to assess whether the performance of our SVC (notice I'm using SVM and SVC interchangeably here) has improved.
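The comparison described above can be sketched roughly as follows. This is an illustration under my own assumptions rather than the exact code from the video: the split (`random_state=0`), the `MinMaxScaler` preprocessing, `gamma="auto"` (the scikit-learn default back then), and the `results` dictionary are all hypothetical choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()

# Assumption: fixed seed so both models see the same split.
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0
)

# Assumption: min-max scaling fitted on the training data only.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit one SVC per value of C and record (train, test) accuracy for each.
results = {}
for C in (1, 1000):
    svm = SVC(kernel="rbf", C=C, gamma="auto")
    svm.fit(X_train_scaled, y_train)
    results[C] = (
        svm.score(X_train_scaled, y_train),
        svm.score(X_test_scaled, y_test),
    )

for C, (train_acc, test_acc) in results.items():
    print(f"C={C:<5} train accuracy: {train_acc:.3f}  test accuracy: {test_acc:.3f}")
```

If the larger C raises both training and test accuracy, the extra model complexity is helping with the underfitting. If only the training accuracy rises, the model is starting to overfit again.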

Please see the video below for the complete walk-through.


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 20
  2. Machine Learning on a Cancer Dataset - Part 21
  3. Machine Learning on a Cancer Dataset - Part 22
  4. Machine Learning on a Cancer Dataset - Part 23
  5. Machine Learning on a Cancer Dataset - Part 24
  6. Machine Learning on a Cancer Dataset - Part 25
  7. Machine Learning on a Cancer Dataset - Part 26
  8. Machine Learning on a Cancer Dataset - Part 27
  9. Machine Learning on a Cancer Dataset - Part 28
  10. Machine Learning on a Cancer Dataset - Part 29


To stay in touch with me, follow @cristi


Cristi Vlad, Self-Experimenter and Author


Awesome! ♥ !!!!

Thank you, will study this

what do you mean?

Well as Python beginner I will study your posts, but I'm not there yet, need to learn basics first :-)
