Machine Learning on a Cancer Dataset - Part 6
In this sixth video of the series, we train our first machine learning classifier, k-nearest neighbors (KNN), on the breast cancer dataset that ships with scikit-learn.
The purpose is to predict whether a tumor sample is malignant or benign, based on its measured features.
The code:
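# Load the dataset, the KNN classifier, and the train/test splitting helper
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Jupyter notebook magic for inline plots; remove this line when running as a plain Python script
%matplotlib inline
cancer = load_breast_cancer()
# Hold out a test set, stratified so both sets keep the same class proportions
X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, stratify=cancer.target, random_state=42)
# KNeighborsClassifier uses n_neighbors=5 by default
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)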
print('Accuracy of KNN n=5, on the training set: {:.3f}'.format(knn.score(X_train, y_train)))
print('Accuracy of KNN n=5, on the test set: {:.3f}'.format(knn.score(X_test, y_test)))
And the output:
Accuracy of KNN n=5, on the training set: 0.946
Accuracy of KNN n=5, on the test set: 0.930
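The 5 in the printed messages refers to the number of neighbors: KNeighborsClassifier uses n_neighbors=5 when no value is passed. Here is a minimal sketch, assuming the same split as in the code above, that makes this setting explicit, checks that it scores the same as the default, and turns a few predictions into readable labels. The names knn_default, knn_explicit and predicted_labels are only for illustration and are not from the video.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

# Illustrative names: one classifier with the defaults, one with n_neighbors spelled out
knn_default = KNeighborsClassifier().fit(X_train, y_train)
knn_explicit = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# The default classifier reports the number of neighbors it uses
print('Default n_neighbors:', knn_default.n_neighbors)

# Both classifiers are configured identically, so their test accuracy should match
default_test_acc = knn_default.score(X_test, y_test)
explicit_test_acc = knn_explicit.score(X_test, y_test)
print('Test accuracy (default):  {:.3f}'.format(default_test_acc))
print('Test accuracy (explicit): {:.3f}'.format(explicit_test_acc))

# Predict the first three test samples and map the 0/1 targets to class names
predicted = knn_default.predict(X_test[:3])
predicted_labels = cancer.target_names[predicted]
print('Predicted labels:', list(predicted_labels))
```

Writing n_neighbors=5 out explicitly changes nothing in the result, but it makes the setting visible for anyone who wants to try other values later.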
For a walkthrough, see the video below.
As a reminder:
In this series I explore the breast cancer dataset that comes pre-loaded with scikit-learn. The purpose is to train classifiers on this labeled dataset (569 tumor samples, each labeled malignant or benign) and then use them to classify new, unlabeled data.
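To check the size and labels of the dataset yourself, a short sketch along these lines should work. The variable names n_samples, n_features and class_counts are only for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Rows are tumor samples, columns are the measured features
n_samples, n_features = cancer.data.shape
print('Samples:', n_samples, 'Features:', n_features)

# target holds 0/1 codes; target_names gives the matching class names
class_counts = {
    str(name): int(count)
    for name, count in zip(cancer.target_names, np.bincount(cancer.target))
}
print('Samples per class:', class_counts)
```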
Previous videos in this series:
- Machine Learning on a Cancer Dataset - Part 1
- Machine Learning on a Cancer Dataset - Part 2
- Machine Learning on a Cancer Dataset - Part 3
- Machine Learning on a Cancer Dataset - Part 4
- Machine Learning on a Cancer Dataset - Part 5
To stay in touch with me, follow @cristi
#machine-learning #science #python
Cristi Vlad, Self-Experimenter and Author