Multi-Class Imbalanced Classification

Imbalanced classification problems are prediction tasks where the distribution of examples across class labels is not equal.
Most imbalanced classification examples focus on binary classification tasks, yet many of the tools and techniques for imbalanced classification also directly support multi-class classification problems.
In this tutorial, you will discover how to use the tools of imbalanced classification with a multi-class dataset.
After completing this tutorial, you will know:

About the glass identification dataset, a standard imbalanced multi-class prediction problem.
How to use SMOTE oversampling for imbalanced multi-class classification.
How to use cost-sensitive learning for imbalanced multi-class classification.

Let’s get started.

Tutorial Overview
This tutorial is divided into three parts; they are:

Glass Multi-Class Classification Dataset
SMOTE Oversampling for Multi-Class Classification
Cost-Sensitive Learning for Multi-Class Classification

Glass Multi-Class Classification Dataset
In this tutorial, we will focus on the standard imbalanced multi-class classification problem referred to as “Glass Identification” or simply “glass.”
The dataset describes the chemical properties of glass, and the task is to classify each sample of glass as one of six classes based on those properties. The dataset is credited to Vina Spiehler in 1987.
Ignoring the sample identification number, there are nine input variables that summarize the properties of each glass sample; they are:

RI: Refractive Index
Na: Sodium
Mg: Magnesium
Al: Aluminum
Si: Silicon
K: Potassium
Ca: Calcium
Ba: Barium
Fe: Iron

The chemical compositions are measured as the weight percent in the corresponding oxide.
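To make the column layout concrete, the sketch below parses rows into named fields. It assumes a headerless, comma-separated file with the nine inputs in the order listed above, followed by the class label as the last column, and with the sample identification number already removed. The function name load_glass, the label name "Type", and the file name glass.csv are placeholders chosen for illustration, not names defined by the dataset.

```python
import csv

# Input names follow the variable list above. "Type" is an assumed name
# for the class label column, not one defined by the dataset.
INPUT_COLUMNS = ["RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe"]
TARGET_COLUMN = "Type"


def load_glass(lines):
    """Parse headerless CSV lines into (X, y).

    X is a list of dicts mapping each input name to a float.
    y is a list of integer class labels taken from the last column.
    """
    expected = len(INPUT_COLUMNS) + 1
    X, y = [], []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if len(row) != expected:
            raise ValueError(f"expected {expected} columns, got {len(row)}")
        # zip stops after the nine inputs, leaving the label for y
        X.append({name: float(value) for name, value in zip(INPUT_COLUMNS, row)})
        y.append(int(float(row[-1])))
    return X, y


# Example call (assumes a local file named glass.csv in the layout above):
# with open("glass.csv", newline="") as handle:
#     X, y = load_glass(handle)
```

Keeping the inputs as named fields rather than bare positions makes it easier to check later steps against the variable list.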
There are seven types of glass listed; they are:

Class 1: building windows (float processed)
Class 2: building windows (non-float processed)
Class 3: vehicle windows (float processed)
Class 4: vehicle windows (non-float processed)
Class 5: containers
Class 6: tableware
Class 7: headlamps

“Float” refers to the float process used to make the glass, in which molten glass is floated on a bed of molten metal.
There are 214 observations in the dataset and the number of observations in each class is imbalanced. Note that there are no examples for class 4 (non-float processed vehicle windows) in the dataset.
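A class distribution like this can be tallied directly from the label column. The following is a minimal sketch using only the Python standard library; the function names class_distribution and summarize are hypothetical helpers, and the labels are assumed to be a plain list of integer class values.

```python
from collections import Counter


def class_distribution(labels):
    """Return {class_label: (count, percentage)} ordered by class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: (count, 100.0 * count / total)
        for label, count in sorted(counts.items())
    }


def summarize(labels):
    """Print one line per class that appears in the labels."""
    for label, (count, pct) in class_distribution(labels).items():
        print(f"Class {label}: {count} examples ({pct:.1f}%)")
```

Because the tally only reports labels that actually occur, a class with no examples, such as class 4 here, would simply be absent from the output rather than shown with a count of zero.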

Class 1: 70 examples
Class 2: 76 examples
Class 3: 17 examples
Class 4: 0 examples
Class 5: 13 examples