Machine Learning with Scikit-Learn - [Part 42]

in #machine-learning, 7 years ago

In this tutorial we're diving into another area of machine learning: automatic feature selection.

One of the main reasons to use automatic feature selection is to reduce the dimensionality of your data, which is a very desirable outcome in many projects. One possible reason is a lack of computational resources or time to train an algorithm or a neural network on every feature.

Some of the methods used for automatic feature selection include:

  • univariate statistics
  • model-based selection
  • iterative selection

In this tutorial we're going to start with univariate statistics. Two classes that scikit-learn provides for this purpose are SelectKBest and SelectPercentile.
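To make the difference between the two concrete, here is a minimal sketch (not the code from the video). SelectKBest keeps a fixed number of top-scoring features, while SelectPercentile keeps a fixed share of them. The values k=10 and percentile=50 are assumptions chosen for illustration, and f_classif (the ANOVA F-test, scikit-learn's default scoring function for both classes) is spelled out only for clarity.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, SelectPercentile, f_classif

# The breast cancer dataset that ships with scikit-learn (30 features).
X, y = load_breast_cancer(return_X_y=True)

# Keep a fixed NUMBER of features: the 10 with the highest F-scores.
# k=10 is an illustrative choice, not a value from the tutorial.
k_best = SelectKBest(score_func=f_classif, k=10)
X_k = k_best.fit_transform(X, y)

# Keep a fixed SHARE of features: the top 50% by F-score.
top_half = SelectPercentile(score_func=f_classif, percentile=50)
X_p = top_half.fit_transform(X, y)

print("original features:        ", X.shape[1])
print("after SelectKBest:        ", X_k.shape[1])
print("after SelectPercentile:   ", X_p.shape[1])
```

Both selectors score each feature on its own against the target, which is what makes them univariate: interactions between features are not taken into account.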

Here we're going to use SelectPercentile and work on the breast cancer dataset that comes preloaded with scikit-learn. We add noise features to the original dataset and then apply SelectPercentile.

As seen in the video, this method reduces the number of features by about 50%.
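The workflow described above can be sketched roughly as follows. This is an approximation under stated assumptions, not the exact code from the video: the number of noise features (50), the random seeds, the 50/50 train/test split, and percentile=50 are all illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectPercentile
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()

# Append random noise columns to the real features.
# 50 noise features and the seed are assumptions for this sketch.
rng = np.random.RandomState(42)
noise = rng.normal(size=(len(cancer.data), 50))
X_w_noise = np.hstack([cancer.data, noise])

# Fit the selector on training data only, so the test set stays unseen.
X_train, X_test, y_train, y_test = train_test_split(
    X_w_noise, cancer.target, random_state=0, test_size=0.5
)

# Keep the top 50% of features, scored by the default ANOVA F-test.
select = SelectPercentile(percentile=50)
select.fit(X_train, y_train)
X_train_selected = select.transform(X_train)
X_test_selected = select.transform(X_test)

print("features before selection:", X_train.shape[1])
print("features after selection: ", X_train_selected.shape[1])

# Boolean mask over all columns: True where a feature was kept.
mask = select.get_support()
n_original = cancer.data.shape[1]
print("original features kept:", mask[:n_original].sum())
print("noise features kept:   ", mask[n_original:].sum())
```

The mask returned by get_support() shows which columns survived, so you can check how many of the kept features are original ones and how many are noise.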

Please watch the video below for a complete walkthrough:

To stay in touch with me, follow @cristi

Cristi Vlad Self-Experimenter and Author


The univariate statistics method of automatic feature selection is super cool.

keep it up ♥ following you

Surprising !

wow!! i am personally like it in the winter..

I've seen in other tutorials at least basic diagrams that help visually explain what is going on, I watched your video and maybe it's just me but I think people in general like them for at least some concepts that can be explained in such a way. Does the book have any?

Part 42 is really awesome, master... I already read it...
Thanks for sharing your valuable post

the tutorial is so good for ourselves..
thanks for sharing
