Neural network ensembles

This page describes neural network ensembles and their implementation in ALGLIB. Before reading it, please look through the article on the general principles of the data analysis methods. It contains important information which, because it applies to every algorithm in this section, has been moved to a separate page to avoid duplication.

A neural network ensemble is a set of neural network models that makes a decision by averaging the outputs of its individual members. Depending on how the ensemble is built, it addresses one of two problems: the tendency of the base network architecture to underfit (the boosting meta-algorithm) or its tendency to overfit (the bagging meta-algorithm and others).

The current version of the ALGLIB package implements two ensemble-building algorithms: bagged neural networks and early stopping ensembles. Both use averaging to overcome the neural network's tendency to overfit. Boosting algorithms are not implemented yet.

Contents

    1 Using Neural Network Ensembles
    2 Bagged neural networks
    3 Early stopping ensembles
    4 Training Set Format

Using Neural Network Ensembles

The neural network ensemble subroutines are similar to those used to operate on individual neural networks (the mlpbase and mlptrain modules). Operations with ensemble models are performed in three stages (a short sketch follows the list):

  1. Selection of a base neural network architecture and of the number of networks in the ensemble, followed by initialization using one of the MLPECreateXX subroutines. Their interface closely mirrors the corresponding subroutines of the mlpbase module. The ensemble is initialized with random values and needs training.
  2. Training using one of the algorithms set forth below.
  3. Using the trained ensemble (mapping inputs to outputs, serialization, etc.). The subroutines performing these operations closely follow the analogous subroutines of the mlpbase module.
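
The sketch below walks through these three stages in C++. It assumes the mlpe* API (mlpecreate1, mlpebagginglbfgs, mlpeprocess) as documented for the mlpe and mlptrain modules; exact names, signatures and parameter values may differ between ALGLIB versions, and the toy dataset is purely illustrative.

    #include <cstdio>
    #include "dataanalysis.h"

    using namespace alglib;

    int main()
    {
        // Toy regression dataset: each row is 2 inputs followed by 1 target.
        real_2d_array xy = "[[0,0,0],[0,1,1],[1,0,1],[1,1,0]]";

        // Stage 1: base architecture (2 inputs, 5 hidden neurons, 1 output)
        // and ensemble size (10 networks), then initialization.
        mlpensemble ensemble;
        mlpecreate1(2, 5, 1, 10, ensemble);

        // Stage 2: training (here bagging with L-BFGS; see the sections below).
        ae_int_t info;
        mlpreport rep;
        mlpcvreport ooberrors;
        mlpebagginglbfgs(ensemble, xy, 4, 0.001, 5, 0.01, 0, info, rep, ooberrors);

        // Stage 3: using the trained ensemble to map inputs to outputs.
        real_1d_array x = "[1,0]";
        real_1d_array y = "[0]";
        mlpeprocess(ensemble, x, y);
        printf("prediction: %.4f\n", y[0]);
        return 0;
    }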

There are two restrictions placed by ALGLIB on the neural network ensembles:

Bagged neural networks

The bagging meta-algorithm (short for "bootstrap aggregating") generates K new training sets by sampling examples from the original training set uniformly and with replacement, and then trains K neural networks on these sets. The records that do not get into the j-th set serve as a test set for the j-th neural network (see the resampling sketch below). The algorithm is described in more detail in Wikipedia.
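
To illustrate the resampling step only (this is not ALGLIB code), the following C++ sketch draws one bootstrap replicate and collects the out-of-bag records that bagging uses as a test set:

    #include <random>
    #include <vector>

    // Draw one bootstrap replicate of the index set {0, ..., n-1}: n indices
    // sampled uniformly with replacement. Indices never drawn form the
    // out-of-bag (OOB) set, used as a test set for the corresponding network.
    void bootstrapSplit(int n, std::mt19937 &rng,
                        std::vector<int> &bag, std::vector<int> &oob)
    {
        std::uniform_int_distribution<int> pick(0, n - 1);
        std::vector<bool> drawn(n, false);
        bag.clear();
        oob.clear();
        for (int i = 0; i < n; i++)
        {
            int j = pick(rng);        // uniform sampling with replacement
            bag.push_back(j);
            drawn[j] = true;
        }
        for (int i = 0; i < n; i++)   // records never drawn go to the OOB set
            if (!drawn[i])
                oob.push_back(i);
    }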

The following two subroutines can be used for training an ensemble: MLPEBaggingLM and MLPEBaggingLBFGS. The first one uses a modified Levenberg-Marquardt algorithm to train the individual networks, while the second uses the L-BFGS algorithm.
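
A minimal sketch of calling the Levenberg-Marquardt variant and reading the out-of-bag error report, assuming the mlpebagginglm signature and mlpcvreport fields documented for the mlptrain module (the decay and restarts values are illustrative assumptions):

    #include <cstdio>
    #include "dataanalysis.h"

    using namespace alglib;

    // Train a bagged ensemble with the Levenberg-Marquardt variant and report
    // the internal (out-of-bag) generalization error estimate.
    void trainBaggedLM(mlpensemble &ensemble, const real_2d_array &xy, ae_int_t npoints)
    {
        ae_int_t info;
        mlpreport rep;
        mlpcvreport oob;      // filled with out-of-bag error estimates
        mlpebagginglm(ensemble, xy, npoints, 0.001 /* decay */, 3 /* restarts */,
                      info, rep, oob);
        if (info > 0)
            printf("OOB RMS error: %.4f, OOB avg error: %.4f\n",
                   oob.rmserror, oob.avgerror);
    }

The oob report here is the cross-validation-like internal estimate discussed in the next paragraph.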

The main advantage of the algorithm is that it produces an internal (out-of-bag) generalization error estimate during training, similar to the cross-validation estimate. The main disadvantage is its high computational cost, comparable to that of cross-validation, while the generalization error is no better than that of a sufficiently regularized individual neural network. Some time can be saved because the averaging permits less stringent stopping criteria for the individual network training algorithm. On the whole, however, this algorithm is not much better than the "individual neural network + regularization + cross-validation" combination.

Early stopping ensembles

Early stopping is a well-known way to deal with overfitting of a neural network model. The training set is split into two parts: one is used for training, the other for validation. A neural network with a redundant number of neurons in the hidden layer is used (e.g., a network with N inputs, M outputs and one hidden layer containing 30-100 neurons). This redundancy is essential for the algorithm's success: the network must be flexible enough for early stopping to be effective. Training is stopped when the error on the validation set starts growing (hence the name "early stopping").

Such neural networks have low bias but high variance. This means that an individual network trained with early stopping has a rather high error, but averaging several networks (ten is a good value) substantially decreases it. Experimental results show that the generalization error of an early stopping ensemble is comparable to that of an individual neural network with an optimal architecture trained by a traditional algorithm. The individual network, however, needs long and complex tuning (searching through combinations of architecture and regularization parameter), whereas the early stopping ensemble needs no tuning at all.
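
A hedged sketch of building and training an early stopping ensemble, assuming that mlpetraines is the corresponding training routine with the signature documented for the mlptrain module; the architecture (2-50-1) and ensemble size (10) are illustrative choices, not recommendations:

    #include "dataanalysis.h"

    using namespace alglib;

    // Build and train an early stopping ensemble: a deliberately redundant base
    // network (2 inputs, 50 hidden neurons, 1 output) replicated 10 times; each
    // member is trained with early stopping on its own train/validation split.
    void trainESEnsemble(mlpensemble &ensemble, const real_2d_array &xy, ae_int_t npoints)
    {
        mlpecreate1(2, 50, 1, 10, ensemble);   // redundant hidden layer, 10 members
        ae_int_t info;
        mlpreport rep;
        mlpetraines(ensemble, xy, npoints, 0.001 /* decay */, 3 /* restarts */, info, rep);
    }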

Thus, early stopping neural network ensembles are characterized by the following advantages:

The algorithm's disadvantages:

Open issues:

Training Set Format

The training set format is described in the article recommended at the top of this page. That article also covers such issues as missing values and nominal variable encoding. Note that the dataset format depends on whether the network solves a regression or a classification problem (see the sketch below).
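
For illustration, and assuming ALGLIB's usual dataset convention (inputs followed by continuous targets for regression, inputs followed by a class index for classification), the two layouts might look as follows in C++:

    #include "dataanalysis.h"

    using namespace alglib;

    // Regression ensemble (mlpecreate1 and friends): each row holds NIn input
    // values followed by NOut continuous targets (here 2 inputs, 1 target).
    real_2d_array xyRegression = "[[0.1,0.5,1.2],[0.7,0.3,0.9]]";

    // Classification ensemble (mlpecreatec1 and friends): each row holds NIn
    // inputs followed by a single class index in 0..NClasses-1 in the last column.
    real_2d_array xyClassification = "[[0.1,0.5,0],[0.7,0.3,1]]";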
