Machine Learning on a Cancer Dataset - Part 32

in #programming, 7 years ago

In this machine learning tutorial we diverge from the usual algorithmic implementation in scikit-learn. Instead, we learn about uncertainty estimation.

Two of the most common methods for uncertainty estimation in scikit-learn are decision_function and predict_proba (which returns the predicted probability of each class).

In this tutorial, we look specifically at the decision function, which returns a signed score for each sample: roughly, how far the point lies from the separating hyperplane and on which side of it.
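To make the link between the score and the hyperplane concrete, here is a minimal sketch on made-up 2-D data (not the cancer dataset). It assumes a linear kernel, which is the case where the relationship is easy to check by hand: the score equals w·x + b, and dividing by the norm of w gives the signed geometric distance. The toy points and variable names are my own, chosen only for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data, invented for illustration: two linearly separable clusters.
X = np.array([[0.0, 0.0], [1.0, 0.5], [0.5, 1.0],
              [3.0, 3.0], [4.0, 3.5], [3.5, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel exposes the hyperplane as coef_ (w) and intercept_ (b).
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# One signed score per sample; positive means the second class (clf.classes_[1]).
scores = clf.decision_function(X)

# The same score computed by hand as w.x + b.
manual = X @ clf.coef_.ravel() + clf.intercept_[0]

# Dividing by ||w|| turns the score into a signed geometric distance.
distances = scores / np.linalg.norm(clf.coef_)

print(np.allclose(scores, manual))
print(scores)
print(distances)
```

With a non-linear kernel such as RBF, coef_ is not available and the score is no longer a distance in the original feature space, but its sign and magnitude are read the same way.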

To illustrate, we use the support vector machine classifier (SVC) trained in previous tutorials and call the decision function on a subsample of the test set. The output is a positive or negative value for each tumor sample: the sign indicates the predicted class, and the magnitude reflects how confident the model is in that prediction. This is binary classification, so the two classes are malignant and benign.
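In code, the steps described here might look roughly like the sketch below. The exact setup from the earlier parts is not shown in this post, so the train/test split, the random_state, the feature scaling step, and the default SVC parameters are all assumptions on my part.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()

# Assumed split; the earlier tutorials may have used different settings.
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0
)

# Assumed model: scaled features and an SVC with default parameters.
svm = make_pipeline(StandardScaler(), SVC())
svm.fit(X_train, y_train)

# Call the decision function on a small subsample of the test set.
sample = X_test[:10]
scores = svm.decision_function(sample)
predicted = svm.predict(sample)

# Negative scores map to class 0 (malignant), positive to class 1 (benign).
for score, label in zip(scores, predicted):
    print(f"{score:+.3f} -> {cancer.target_names[label]}")
```

The further a score is from zero, the more confident the classifier is about that sample; values close to zero sit near the decision boundary.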

Please see the video below for a full walkthrough of the code and the tutorial.


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 30
  2. Machine Learning on a Cancer Dataset - Part 31


To stay in touch with me, follow @cristi


Cristi Vlad, Self-Experimenter and Author
