Machine Learning on a Cancer Dataset - Part 2

in #machine-learning, 8 years ago

In the second video, I talk about the dependencies that we're gonna use in this machine learning series. These include:

  • scikit-learn
  • matplotlib
  • and others that may come along as we progress through the series

The initial imports that we're doing:

  • load_breast_cancer - this is the dataset that we're working on
  • train_test_split - to split the dataset into training and test subsets
  • KNeighborsClassifier - the first ML classifier that we're gonna use
  • matplotlib
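
For reference, the imports listed above could look roughly like this. It's a minimal sketch, assuming a reasonably recent scikit-learn where train_test_split lives in sklearn.model_selection; the variable name cancer is just my choice for illustration.

```python
# Initial imports for the series (module paths assume a recent scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt

# Load the dataset; "cancer" is an illustrative variable name.
cancer = load_breast_cancer()

# The feature matrix has one row per sample and one column per feature.
print(cancer.data.shape)
```

The imported names are only loaded here; splitting, fitting and plotting come later in the series.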

I also look at what the description (DESCR) for this dataset looks like in scikit-learn. More in the video below.
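
If you want to peek at the description yourself, something like the following should work. It's only a sketch: the 500-character cutoff is an arbitrary choice of mine to keep the output short, not something from the video.

```python
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# The returned object behaves like a dictionary; list what it contains.
print(cancer.keys())

# DESCR is a plain string describing the dataset.
# Print only the first 500 characters (arbitrary cutoff for brevity).
print(cancer.DESCR[:500])

# The two class labels and the names of the measured features.
print(cancer.target_names)
print(cancer.feature_names)
```

Printing cancer.DESCR without the slice shows the full text, including the attribute information and class distribution.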


As a reminder:

In this series I'm going to explore the breast cancer dataset that comes pre-loaded with scikit-learn. The purpose is to train classifiers on this labeled dataset (569 tumor samples, each labeled malignant or benign) and then use them on new, unlabeled data.
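
To make that goal concrete, here is a rough sketch of the whole loop: split, train, evaluate, predict. The settings (stratify, random_state=0, n_neighbors=5) are assumptions I picked for illustration, not necessarily what the videos use, and the held-out test rows stand in for "new, unlabeled data".

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

cancer = load_breast_cancer()

# Hold out part of the labeled data so we can check the classifier later.
# stratify and random_state=0 are illustrative choices for a reproducible split.
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=0
)

# Train a k-nearest neighbors classifier (n_neighbors=5 is an assumed value).
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Fraction of held-out samples that were classified correctly.
accuracy = knn.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")

# Treat a few held-out rows as "new" samples and predict their labels.
new_samples = X_test[:5]
predicted = knn.predict(new_samples)
print(cancer.target_names[predicted])
```

The exact accuracy depends on the split and on the number of neighbors, so treat the printed number as a starting point rather than a result.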


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 1


To stay in touch with me, follow @cristi

#machine-learning #science #python


Cristi Vlad, Self-Experimenter and Author
