How to Develop an AdaBoost Ensemble in Python


Boosting is a class of ensemble machine learning algorithms that involve combining the predictions from many weak learners.
A weak learner is a model that is very simple, although it has some skill on the dataset. Boosting was a theoretical concept long before a practical algorithm could be developed, and the AdaBoost (adaptive boosting) algorithm was the first successful approach for the idea.
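To make the idea of a weak learner concrete, the snippet below fits a single one-level decision tree (a decision stump) to a synthetic binary classification dataset; the dataset, the split, and the parameter values are assumptions chosen for illustration rather than part of the tutorial's worked examples. The stump's accuracy is typically only modestly better than random guessing, which is exactly the "some skill" a weak learner needs.

```python
# A minimal sketch of a "weak learner": a one-level decision tree (decision stump)
# fit on a synthetic dataset. The dataset and parameter values are illustrative
# assumptions, not taken from the tutorial itself.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# synthetic binary classification problem
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

# a decision stump: a tree restricted to a single split
stump = DecisionTreeClassifier(max_depth=1)
stump.fit(X_train, y_train)

# accuracy is usually only a little better than the 50 percent of random guessing
print('Stump accuracy: %.3f' % accuracy_score(y_test, stump.predict(X_test)))
```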
The AdaBoost algorithm involves using very short (one-level) decision trees as weak learners that are added sequentially to the ensemble. Each subsequent model attempts to correct the predictions made by the model before it in the sequence. This is achieved by weighting the training dataset to put more focus on training examples on which prior models made prediction errors.
In this tutorial, you will discover how to develop AdaBoost ensembles for classification and regression.
After completing this tutorial, you will know:

An AdaBoost ensemble is created from decision trees added sequentially to the model.
How to use the AdaBoost ensemble for classification and regression with scikit-learn.
How to explore the effect of AdaBoost model hyperparameters on model performance.

Let’s get started.

How to Develop an AdaBoost Ensemble in Python. Photo by Ray in Manila, some rights reserved.

Tutorial Overview
This tutorial is divided into three parts; they are:

1. AdaBoost Ensemble Algorithm
2. AdaBoost Scikit-Learn API
   1. AdaBoost for Classification
   2. AdaBoost for Regression
3. AdaBoost Hyperparameters
   1. Explore Number of Trees
   2. Explore Weak Learner
   3. Explore Learning Rate
   4. Explore Alternate Algorithm

AdaBoost Ensemble Algorithm
Boosting refers to a class of machine learning ensemble algorithms where models are added sequentially and later models in the sequence correct the predictions made by earlier models in the sequence.
AdaBoost, short for “Adaptive Boosting,” is a boosting ensemble machine learning algorithm, and was one of the first successful boosting approaches.
We call the algorithm AdaBoost because, unlike previous algorithms, it adjusts adaptively to the errors of the weak hypotheses
— A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, 1996.
AdaBoost combines the predictions from short one-level decision trees, called decision stumps, although other algorithms can also be used. Decision stumps are used because AdaBoost seeks to combine many weak models, correcting their predictions by adding additional weak models to the sequence.
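As a rough sketch of how this looks in practice, the example below builds an ensemble of decision stumps using scikit-learn's AdaBoostClassifier, whose default weak learner is a one-level decision tree; the synthetic dataset and hyperparameter values are assumptions for illustration only, and the scikit-learn API is covered in detail later in the tutorial.

```python
# A minimal sketch of AdaBoost with decision stumps via scikit-learn's
# AdaBoostClassifier; the dataset and parameter values here are assumptions
# chosen for illustration only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# synthetic binary classification problem
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# by default the weak learner is a one-level decision tree (a decision stump)
model = AdaBoostClassifier(n_estimators=50, random_state=1)

# estimate performance with cross-validation
scores = cross_val_score(model, X, y, cv=10, scoring='accuracy')
print('Mean accuracy: %.3f' % scores.mean())
```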
The training algorithm involves starting with one decision tree, finding those examples in the training dataset that were misclassified, and adding more weight to those examples. Another tree is then trained on the same data, although now weighted according to those misclassification errors. The process is repeated until the desired number of trees has been added.
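The sketch below illustrates this reweighting loop directly, loosely following an AdaBoost.M1-style update with decision stumps; the dataset, the number of boosting rounds, and the exact form of the weight update are assumptions made to keep the example short, not a faithful reproduction of the full algorithm.

```python
# A simplified sketch of the reweighting step described above, loosely following
# an AdaBoost.M1-style update; the dataset, number of rounds, and the exact
# weight-update formula used here are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
y_signed = np.where(y == 1, 1, -1)          # recode labels as {-1, +1}
weights = np.full(len(X), 1.0 / len(X))     # start with uniform example weights

for i in range(3):                          # a few boosting rounds
    # fit a decision stump on the weighted training data
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=weights)
    pred = np.where(stump.predict(X) == 1, 1, -1)
    miss = pred != y_signed
    # weighted error of the stump and its contribution (alpha)
    error = np.sum(weights[miss]) / np.sum(weights)
    alpha = 0.5 * np.log((1.0 - error) / max(error, 1e-10))
    # increase the weight of misclassified examples, decrease the rest, renormalize
    weights *= np.exp(-alpha * y_signed * pred)
    weights /= np.sum(weights)
    print('round %d: weighted error=%.3f, alpha=%.3f' % (i + 1, error, alpha))
```

Running the sketch prints the weighted error of each new stump, showing how the example weights shift toward the harder cases from one round to the next.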
