Machine Learning on a Cancer Dataset - Part 32
In this machine learning tutorial, we step away from implementing algorithms in scikit-learn and look at uncertainty estimation instead.
Two of the most common methods for uncertainty estimation in scikit-learn are decision_function and predict_proba (which returns the predicted probability of each class).
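In this tutorial, we're specifically looking at the decision function, which returns a signed score for each sample: the sign tells us which side of the separating hyperplane the point falls on, and the magnitude grows with its distance from that hyperplane.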
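To make the link between the score and the hyperplane concrete, here is a minimal sketch. It assumes scikit-learn's built-in breast cancer dataset, standardized features, and a linear kernel; none of these choices are confirmed by the tutorial itself. With a linear kernel, the score is w·x + b, and dividing it by the norm of w gives the geometric distance. With a non-linear kernel such as RBF, the score is still signed but is not a distance in the original feature space.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the built-in breast cancer dataset stands in for the series' data.
X, y = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# Assumption: a linear kernel, so that coef_ (the vector w) is available.
clf = SVC(kernel="linear", C=1.0).fit(X_scaled, y)

points = X_scaled[:5]

# Scores returned by scikit-learn for the first five samples.
scores = clf.decision_function(points)

# The same scores computed by hand as w.x + b.
manual = points @ clf.coef_.ravel() + clf.intercept_[0]

# Signed geometric distance to the hyperplane: score divided by ||w||.
distances = scores / np.linalg.norm(clf.coef_)

print(scores)
print(distances)
```

The sign of each distance matches the sign of the score, so both say the same thing about which side of the hyperplane a sample is on.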
To illustrate, we're going to use the support vector classifier (SVC) trained in previous tutorials and call its decision function on a subsample of the test set. The result is an array of positive and negative values, one per tumor sample, which reflect how confident the model is that the sample belongs to one class or the other. We're dealing with binary classification here, so the two classes are malignant and benign.
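The steps described above might look roughly like the sketch below. The train/test split, the scaling step, the default SVC settings, and the size of the subsample are all assumptions made for illustration; the model from the earlier tutorials may have been set up differently.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()

# Assumption: a stratified split with a fixed seed, for reproducibility only.
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=0
)

# Assumption: features are standardized before fitting a default SVC.
svm = make_pipeline(StandardScaler(), SVC()).fit(X_train, y_train)

# Call the decision function on a small subsample of the test set.
sample = X_test[:10]
scores = svm.decision_function(sample)
predicted = svm.predict(sample)

# Negative scores map to classes_[0], positive scores to classes_[1].
# In this dataset, label 0 is "malignant" and label 1 is "benign".
for score, label in zip(scores, predicted):
    print(f"{score:+.3f} -> {cancer.target_names[label]}")
```

Scores far from zero indicate samples the model places well inside one class, while scores close to zero are the ones it is least sure about.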
Please see the video below for a full walkthrough of the code and the tutorial.
To stay in touch with me, follow @cristi
Cristi Vlad, Self-Experimenter and Author