Machine Learning on a Cancer Dataset - Part 17

in #machine-learning, 7 years ago

This is the second tutorial on Random Forests and the 17th in the Machine Learning on a Cancer Dataset series.

In the previous video, we trained a Random Forest classifier on the cancer dataset so that we can use it to predict whether tumor samples are malignant or benign.

As previously discussed, each tumor sample is characterized by 30 features, which the algorithm uses during training. In this video tutorial we look at the importance (relative weight) each feature carries in the decision-making process and in the training of the algorithm. We use matplotlib to visualize these importances.
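For readers following along without the video, here is a minimal sketch of how feature importances could be read from a trained forest and plotted. It is not necessarily the exact code from the video: the train/test split, `n_estimators=100`, and `random_state=0` are my assumptions, chosen only to make the example reproducible.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the dataset and hold out a test set (split settings are assumptions).
cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=0
)

# Train the forest (hyperparameters are assumptions, not the video's values).
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# One importance value per feature; scikit-learn normalizes them to sum to 1.
importances = forest.feature_importances_
n_features = cancer.data.shape[1]

# Horizontal bar chart with one bar per feature.
fig, ax = plt.subplots(figsize=(8, 8))
ax.barh(np.arange(n_features), importances, align="center")
ax.set_yticks(np.arange(n_features))
ax.set_yticklabels(cancer.feature_names)
ax.set_xlabel("Feature importance")
ax.set_ylabel("Feature")
fig.tight_layout()
# In an interactive session, call plt.show() to display the chart.
```

A longer bar means the forest relied more on that feature when splitting. The exact values depend on the random seed and the split, so treat the ranking as indicative rather than definitive.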


As a reminder:

In this series I explore the breast cancer dataset that comes pre-loaded with scikit-learn. The purpose is to train classifiers on this labeled dataset (569 tumor samples, each labeled malignant or benign) and then use them on new, unlabeled data.
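As a quick refresher, the dataset can be loaded and inspected as sketched below. This assumes the dataset in question is the one returned by scikit-learn's `load_breast_cancer`; the variable names are mine.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Rows are tumor samples, columns are features.
print(cancer.data.shape)      # (569, 30)

# Label 0 is malignant, label 1 is benign.
print(cancer.target_names)    # ['malignant' 'benign']

# Number of samples per class, in the same order as target_names.
counts = np.bincount(cancer.target)
print(dict(zip(cancer.target_names, counts)))
```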


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 11
  2. Machine Learning on a Cancer Dataset - Part 12
  3. Machine Learning on a Cancer Dataset - Part 13
  4. Machine Learning on a Cancer Dataset - Part 14
  5. Machine Learning on a Cancer Dataset - Part 15
  6. Machine Learning on a Cancer Dataset - Part 16


To stay in touch with me, follow @cristi


Cristi Vlad, Self-Experimenter and Author
