Machine Learning on a Cancer Dataset - Part 6

in #machine-learning, 7 years ago

In this 6th video of the series, we're actually training our first machine learning classifier, KNN, on the cancer dataset in scikit-learn.

The goal is to predict whether a tumor sample is malignant or benign, based on features computed from a digitized image of it.

The code:

from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

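# matplotlib is not used in this part; it's imported for the plots later in the series
import matplotlib.pyplot as plt
# Jupyter/IPython magic: remove this line when running as a plain Python script
%matplotlib inline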

cancer = load_breast_cancer()

X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, stratify=cancer.target, random_state=42)

knn = KNeighborsClassifier()
knn.fit(X_train, y_train)

print('Accuracy of KNN n-5, on the training set: {:.3f}'.format(knn.score(X_train, y_train)))
print('Accuracy of KNN n-5, on the test set: {:.3f}'.format(knn.score(X_test, y_test)))

And the output:

Accuracy of KNN n-5, on the training set: 0.946
Accuracy of KNN n-5, on the test set: 0.930
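To make those two numbers less of a black box, here is a minimal sketch of what `score` reports for a classifier: the mean accuracy, meaning the fraction of samples whose predicted label matches the true label. The "n-5" in the printout refers to the default `n_neighbors=5`. The names `y_pred`, `manual_acc` and `builtin_acc` are my own for this illustration and are not in the video code.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()

# Same split as in the post: stratified on the labels, fixed seed
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

# No arguments, so the classifier uses the default n_neighbors=5
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)

# Accuracy by hand: share of test samples where prediction == true label
y_pred = knn.predict(X_test)
manual_acc = np.mean(y_pred == y_test)

# Accuracy as reported by scikit-learn
builtin_acc = knn.score(X_test, y_test)

print('n_neighbors: {}'.format(knn.n_neighbors))
print('Manual accuracy:  {:.3f}'.format(manual_acc))
print('Built-in score(): {:.3f}'.format(builtin_acc))
```

Both values should agree, since `score` on a classifier is just this accuracy computed for you.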

For a walkthrough, see the video below.


As a reminder:

In this series I'm exploring the cancer dataset that comes pre-loaded with scikit-learn. The goal is to train classifiers on this labeled data (569 tumor samples, each labeled malignant or benign) and then use them on new, unlabeled data.
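If you want to check the shape of the data yourself, a short sketch like the one below prints the number of samples, the number of features, and how many samples fall into each class. The variable name `counts` is just my choice for the illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Rows are tumor samples, columns are the numeric features
print('Data shape: {}'.format(cancer.data.shape))

# Label 0 corresponds to the first name, label 1 to the second
print('Target names: {}'.format(cancer.target_names))

# Count how many samples carry each label
counts = np.bincount(cancer.target)
for name, count in zip(cancer.target_names, counts):
    print('{}: {}'.format(name, count))
```

The classes are not evenly balanced, which is one reason the split in the code uses `stratify=cancer.target`: it keeps roughly the same malignant/benign ratio in the training and test sets.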


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 1
  2. Machine Learning on a Cancer Dataset - Part 2
  3. Machine Learning on a Cancer Dataset - Part 3
  4. Machine Learning on a Cancer Dataset - Part 4
  5. Machine Learning on a Cancer Dataset - Part 5


To stay in touch with me, follow @cristi

#machine-learning #science #python


Cristi Vlad, Self-Experimenter and Author


Nice programmer, love you. I'm campaigning about cancer in my area; maybe your skills could be helpful.

Useful information, nice post.
