Machine Learning on a Cancer Dataset - Part 3

in #machine-learning · 7 years ago

In this third video of the series on machine learning, I discuss the dataset. In this case, we're gonna be working with the breast cancer dataset that comes preloaded with scikit-learn.

Basically, this dataset is built from digitized images of 569 FNAs (fine needle aspirates) of tumor masses. The data is labeled: each sample is either benign or malignant. So, each sample has a set of 30 features, which describe the cell nuclei in the image (mean perimeter, mean area, mean smoothness, etc.), and a target, which is malignant or benign.
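To make the description above concrete, here is a minimal sketch of loading the dataset and checking its shape and labels. It assumes the dataset in question is the one returned by scikit-learn's `load_breast_cancer`; the variable names (`cancer`, `counts`) are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Load the dataset that ships with scikit-learn
cancer = load_breast_cancer()

# One row per sample, one column per feature
print(cancer.data.shape)          # (569, 30)

# Names of the first few features and of the two classes
print(cancer.feature_names[:3])
print(cancer.target_names)        # ['malignant' 'benign']

# Number of samples per class (index 0 = malignant, 1 = benign)
counts = np.bincount(cancer.target)
print(dict(zip(cancer.target_names, counts)))
```

Note that in `cancer.target` the value 0 stands for malignant and 1 for benign, which is easy to get backwards.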

We will feed this data into machine learning algorithms, train them (fit), check their accuracy, and improve or optimize them where needed. The ultimate purpose is to use the trained algorithm to classify new samples, that is, to predict whether a new sample is malignant or benign. More in the video...
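The workflow described above (fit, check accuracy, predict) might look roughly like the sketch below. The choice of a k-nearest neighbors classifier, the `n_neighbors=5` setting, and the `random_state=42` split are assumptions made for illustration, not necessarily what the video uses.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

cancer = load_breast_cancer()

# Hold out part of the labeled data so accuracy is measured on unseen samples
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42
)

# Train (fit) the classifier; KNN is an illustrative choice
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

# Check accuracy on the training set and on the held-out test set
train_acc = clf.score(X_train, y_train)
test_acc = clf.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")

# Predict the class of a "new" sample (here, the first held-out one)
new_sample = X_test[:1]
predicted_label = cancer.target_names[clf.predict(new_sample)[0]]
print("prediction:", predicted_label)
```

A large gap between train and test accuracy would be a sign of overfitting, which is where the "improve or optimize" step comes in.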


As a reminder:

In this series I'm going to explore the breast cancer dataset that comes preloaded with scikit-learn. The purpose is to train classifiers on this dataset, which consists of labeled data (569 tumor samples, each labeled malignant or benign), and then use them on new, unlabeled data.


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 1
  2. Machine Learning on a Cancer Dataset - Part 2


To stay in touch with me, follow @cristi

#machine-learning #science #python


Cristi Vlad, Self-Experimenter and Author


This is relevant to my interests.

if you need help, let me know!
