Machine Learning on a Cancer Dataset - Part 10

in #machine-learning, 7 years ago

This is the third and last video on Logistic Regression on the cancer dataset in scikit-learn, and the 10th video in the machine learning series.

This one is all about visualization. In it I discuss how changing the 'C' parameter in Logistic Regression, which controls the strength of regularization, affects the coefficients and the performance of the algorithm. Using matplotlib, we can see that a lower value of 'C' (meaning stronger regularization) pushes the coefficients toward zero. With the default L2 penalty they shrink but never actually reach 0.
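To make the effect of 'C' concrete, here is a minimal sketch. It is not the exact code from the video: the three values of C, the train/test split, and the feature scaling step are my own assumptions, chosen so the solver converges quickly. It fits the model three times and compares the overall size of the coefficients.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

# Assumption: standardize the features so the solver converges quickly.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

coef_norms = {}     # overall size of the coefficient vector, per C
nonzero_counts = {} # number of coefficients that are not exactly 0, per C

for C in (0.01, 1, 100):
    # Default penalty is L2; smaller C means stronger regularization.
    clf = LogisticRegression(C=C, max_iter=10000).fit(X_train_s, y_train)
    coef_norms[C] = np.linalg.norm(clf.coef_)
    nonzero_counts[C] = int(np.count_nonzero(clf.coef_))
    print(f"C={C}: |coef|={coef_norms[C]:.3f}, "
          f"nonzero={nonzero_counts[C]}/{clf.coef_.size}, "
          f"test accuracy={clf.score(X_test_s, y_test):.3f}")
```

To get a picture like the one in the video, you could plot `clf.coef_.T` for each value of C with matplotlib's `plt.plot` and compare how far the markers sit from the zero line.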

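I also discuss the decision making (the prediction) behind Logistic Regression and other linear models in scikit-learn. They basically depend on the simple equation of a line (y = mx + n; remember it from math class?), extended to one weight per feature: the model computes a weighted sum of the features plus an intercept, and the sign of that sum decides the predicted class.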
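The line-equation idea can be checked by hand. The sketch below (my own illustration, with an assumed train/test split) recomputes the score as "weights times features plus intercept" and compares it with what scikit-learn returns from `decision_function` and `predict`.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

clf = LogisticRegression(max_iter=10000).fit(X_train, y_train)

# y = m*x + n, with one "m" per feature: a weighted sum plus an intercept.
w = clf.coef_.ravel()    # the slopes (one per feature)
b = clf.intercept_[0]    # the intercept
scores_manual = X_test @ w + b

# A positive score means the second class in clf.classes_, otherwise the first.
pred_manual = clf.classes_[(scores_manual > 0).astype(int)]

print(np.allclose(scores_manual, clf.decision_function(X_test)))
print(np.array_equal(pred_manual, clf.predict(X_test)))
```

Both checks compare the hand-computed values against scikit-learn's own output, so the model really is just a line (a hyperplane, strictly speaking) with a threshold at zero.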

See the video below for a walk-through if you have no freakin' clue of what I'm talking about here.


As a reminder:

In this series I'm going to explore the cancer dataset that comes pre-loaded with scikit-learn. The purpose is to train classifiers on this labeled dataset (569 tumor samples, each labeled malignant or benign) and then use them on new, unlabeled data.


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 1
  2. Machine Learning on a Cancer Dataset - Part 2
  3. Machine Learning on a Cancer Dataset - Part 3
  4. Machine Learning on a Cancer Dataset - Part 4
  5. Machine Learning on a Cancer Dataset - Part 5
  6. Machine Learning on a Cancer Dataset - Part 6
  7. Machine Learning on a Cancer Dataset - Part 7
  8. Machine Learning on a Cancer Dataset - Part 8
  9. Machine Learning on a Cancer Dataset - Part 9


To stay in touch with me, follow @cristi

#machine-learning #science #python


Cristi Vlad, Self-Experimenter and Author
