How to Use StandardScaler and MinMaxScaler Transforms in Python


Many machine learning algorithms perform better when numerical input variables are scaled to a standard range.
This includes algorithms that use a weighted sum of the input, like linear regression, and algorithms that use distance measures, like k-nearest neighbors.
The two most popular techniques for scaling numerical data prior to modeling are normalization and standardization.
Normalization scales each input variable separately to the range 0-1, which is the range for floating-point values where we have the most precision.
Standardization scales each input variable separately by subtracting the mean (called centering) and dividing by the standard deviation, shifting the distribution to have a mean of zero and a standard deviation of one.
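As a minimal sketch of the two transforms (the column of values below is made up purely for illustration), scikit-learn's MinMaxScaler and StandardScaler can be applied to a single column as follows:

from numpy import asarray
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# toy column of values on an arbitrary scale (illustrative only)
data = asarray([[100.0], [0.001], [8.0], [50.0], [88.0]])

# normalization: rescale the variable to the range 0-1
print(MinMaxScaler().fit_transform(data))

# standardization: center to zero mean and scale to unit standard deviation
print(StandardScaler().fit_transform(data))

Both transforms are covered in detail, with a real dataset, later in the tutorial.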
In this tutorial, you will discover how to use scaler transforms to standardize and normalize numerical input variables for classification and regression.
After completing this tutorial, you will know:

Data scaling is a recommended pre-processing step when working with many machine learning algorithms.
Data scaling can be achieved by normalizing or standardizing real-valued input and output variables.
How to apply standardization and normalization to improve the performance of predictive modeling algorithms.

Let’s get started.

How to Use StandardScaler and MinMaxScaler Transforms. Photo by Marco Verch, some rights reserved.

Tutorial Overview
This tutorial is divided into six parts; they are:

The Scale of Your Data Matters
Numerical Data Scaling Methods
    Data Normalization
    Data Standardization
Sonar Dataset
MinMaxScaler Transform
StandardScaler Transform
Common Questions

The Scale of Your Data Matters
Machine learning models learn a mapping from input variables to an output variable.
As such, the scale and distribution of the data drawn from the domain may be different for each variable.
Input variables may have different units (e.g. feet, kilometers, and hours) that, in turn, may mean the variables have different scales.
Differences in the scales across input variables may increase the difficulty of the problem being modeled.
For example, large input values (e.g. a spread of hundreds or thousands of units) can result in a model that learns large weight values.
A model with large weight values is often unstable, meaning that it may suffer from poor performance during learning and from sensitivity to input values, resulting in higher generalization error.
One of the most common forms of pre-processing consists of a simple linear rescaling of the input variables.
— Page 298, Neural Networks for Pattern Recognition , 1995.
This difference in scale for input variables does not affect all machine learning algorithms.
For example, algorithms that fit a model using a weighted sum of the input variables, such as linear regression, logistic regression, and neural networks, are affected, as are algorithms that use distance measures between examples, such as k-nearest neighbors.
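As a minimal sketch of this effect (the synthetic dataset and the choice of k-nearest neighbors below are illustrative assumptions, not the tutorial's Sonar example), the same model can be evaluated with and without normalization on data where one input variable has a deliberately exaggerated scale:

from numpy import mean
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# synthetic binary classification dataset (illustrative only)
X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
# exaggerate the scale of the first input variable
X[:, 0] = X[:, 0] * 1000.0

# distance-based model evaluated on the raw, unscaled data
raw = KNeighborsClassifier()
print('raw: %.3f' % mean(cross_val_score(raw, X, y, cv=10)))

# the same model with normalization applied inside a pipeline
scaled = Pipeline([('scale', MinMaxScaler()), ('model', KNeighborsClassifier())])
print('scaled: %.3f' % mean(cross_val_score(scaled, X, y, cv=10)))

Wrapping the scaler and the model in a Pipeline ensures the scaler is fit only on the training folds during cross-validation, avoiding data leakage.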
