What is Bagging in statistics?: “two heads are better than one”

“Bagging,” or bootstrap aggregation, is an ensemble learning technique for improving machine learning models. Pioneered by Leo Breiman in the 1990s, it trains each model on a bootstrap sample of the training data: a sample drawn with replacement, so some observations may appear more than once within a sample and may be shared between different samples.
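The sampling step above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation; `bootstrap_sample` is a hypothetical helper name chosen for clarity.

```python
import random

def bootstrap_sample(data, rng):
    """Draw a sample the same size as `data`, with replacement.

    Because we sample with replacement, some observations typically
    appear more than once and others are left out entirely.
    """
    return [rng.choice(data) for _ in range(len(data))]

rng = random.Random(0)
data = list(range(10))
sample = bootstrap_sample(data, rng)
# `sample` has the same size as `data`, but usually contains repeats.
```

Each model in a bagged ensemble is trained on its own such sample, which is what makes the models partially independent of one another.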

Bagging has been used extensively in machine learning to produce better-fitting models. The idea is that several independently trained models, taken together, can outperform a single model built with more resources.

To illustrate how this works, think of each part of the bagging process as an individual brain. Without bagging, machine learning would consist of one very smart brain working on a problem. With bagging, the process consists of many “weak brains” collaborating on a project. Each has its own domain of thinking, and some of those domains overlap. When you put the final result together, it is far stronger than what any one “brain” would produce alone.

In a very real sense, the philosophy of bagging can be described by a very old axiom that predates the technology by quite a few years: “two heads are better than one.” In bagging, 10 or 20 or 50 heads are better than one, because their results are aggregated into a single, better answer (by averaging for regression, or majority vote for classification). Bagging also helps engineers battle “overfitting” in machine learning, where a model fits its training data so closely that it fails to generalize to new data; averaging many models trained on different samples smooths out their individual quirks.
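The aggregation step can be sketched with a toy example. Here each “weak learner” simply predicts the mean of its bootstrap sample, and bagging averages those predictions; this is an illustrative sketch (`bootstrap_sample` and `bagged_mean` are made-up names), and a real ensemble would train actual models such as decision trees on each sample.

```python
import random
import statistics

def bootstrap_sample(data, rng):
    """Sample with replacement, same size as the original data."""
    return [rng.choice(data) for _ in range(len(data))]

def bagged_mean(data, n_models, rng):
    """Train `n_models` trivial learners, each on its own bootstrap
    sample, then aggregate their predictions by averaging."""
    predictions = [statistics.mean(bootstrap_sample(data, rng))
                   for _ in range(n_models)]
    return statistics.mean(predictions)

rng = random.Random(42)
data = [2.0, 4.0, 6.0, 8.0]
estimate = bagged_mean(data, n_models=50, rng=rng)
# `estimate` lands close to the true mean of 5.0.
```

The aggregation is what tames overfitting: any single learner's sample-specific quirks tend to cancel out when many learners' outputs are averaged.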

