Recursive Feature Elimination (RFE) for Feature Selection in Python


Recursive Feature Elimination, or RFE for short, is a popular feature selection algorithm.
RFE is popular because it is easy to configure and use and because it is effective at selecting those features (columns) in a training dataset that are more or most relevant in predicting the target variable.
There are two important configuration options when using RFE: the number of features to select and the algorithm used to help choose features. Both of these hyperparameters can be explored, although the performance of the method is not strongly dependent on these hyperparameters being configured well.
In this tutorial, you will discover how to use Recursive Feature Elimination (RFE) for feature selection in Python.
After completing this tutorial, you will know:

That RFE is an efficient approach to feature selection that works by eliminating features from a training dataset.
How to use RFE for feature selection for classification and regression predictive modeling problems.
How to explore the number of selected features and wrapped algorithm used by the RFE procedure.

Let’s get started.

Recursive Feature Elimination (RFE) for Feature Selection in Python. Photo by djandywdotcom, some rights reserved.

Tutorial Overview
This tutorial is divided into three parts; they are:

Recursive Feature Elimination
RFE With scikit-learn
    RFE for Classification
    RFE for Regression
RFE Hyperparameters
    Explore Number of Features
    Automatically Select the Number of Features
    Which Features Were Selected
    Explore Base Algorithm

Recursive Feature Elimination
Recursive Feature Elimination, or RFE for short, is a feature selection algorithm.
A machine learning dataset for classification or regression is composed of rows and columns, like an Excel spreadsheet. Rows are often referred to as samples and columns as features, e.g. features of an observation in a problem domain.
Feature selection refers to techniques that select a subset of the most relevant features (columns) for a dataset. Fewer features can allow machine learning algorithms to run more efficiently (less space or time complexity) and be more effective. Some machine learning algorithms can be misled by irrelevant input features, resulting in worse predictive performance.
For more on feature selection generally, see the tutorial:

Feature Selection for Machine Learning in Python

RFE is a wrapper-type feature selection algorithm. This means that a separate machine learning algorithm is supplied to RFE, sits at the core of the method, and is used to help select features. This is in contrast to filter-based feature selection methods, which score each feature and select those features with the largest (or smallest) score.
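The difference between the wrapper and filter approaches can be sketched with scikit-learn. This is a minimal illustration rather than a recommended configuration: the synthetic dataset, the decision tree used as the wrapped model, the ANOVA F-score used for the filter, and the choice of five features are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.tree import DecisionTreeClassifier

# Synthetic dataset (assumed for illustration): 10 columns, 5 informative, 5 redundant
X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           n_redundant=5, random_state=1)

# Wrapper approach: RFE repeatedly fits the wrapped model and removes
# the least important feature until 5 remain
rfe = RFE(estimator=DecisionTreeClassifier(random_state=1), n_features_to_select=5)
rfe.fit(X, y)

# Filter approach: score every feature once with a statistic and keep the top 5
skb = SelectKBest(score_func=f_classif, k=5)
skb.fit(X, y)

# Boolean masks marking which columns each method kept
rfe_mask = rfe.support_
filter_mask = skb.get_support()
print("RFE selected columns:   ", [i for i, keep in enumerate(rfe_mask) if keep])
print("Filter selected columns:", [i for i, keep in enumerate(filter_mask) if keep])
```

The two methods will not necessarily agree on which columns to keep, because RFE relies on the importance scores of the wrapped model while the filter relies on a per-feature statistic computed independently of any model.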
Technically, RFE...
