Extreme Learning Machines (ELMs) are single-hidden-layer feed-forward neural networks. They are one of the neural network approaches to time series forecasting (as opposed to statistical time series forecasting).

Original paper by Huang et al., 2004.


The training process consists of these steps:

  1. All weights and biases are initialized with random values.
  2. The hidden layer output matrix (H) is calculated by multiplying the inputs with the randomly assigned weights, adding the biases, and finally applying an activation function to the result.
  3. The output weight matrix is calculated by multiplying the Moore-Penrose pseudoinverse of the hidden layer output matrix (H) with the training target matrix (T).
  4. The output weight matrix is finally used to make predictions on new data.
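The four steps above can be sketched in NumPy. This is a minimal illustration on a hypothetical toy regression task (fitting a noisy sine curve); the data, hidden layer size, and tanh activation are all assumptions chosen for the example, not prescribed by the method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: learn y = sin(x) from noisy samples.
X = rng.uniform(-np.pi, np.pi, size=(200, 1))
T = np.sin(X) + 0.05 * rng.normal(size=X.shape)

n_hidden = 50  # hidden layer size (a tunable hyperparameter)

# Step 1: random input weights and biases; these are never trained.
W = rng.normal(size=(X.shape[1], n_hidden))
b = rng.normal(size=(1, n_hidden))

# Step 2: hidden layer output matrix H = activation(X @ W + b).
H = np.tanh(X @ W + b)

# Step 3: output weights = Moore-Penrose pseudoinverse of H times T.
beta = np.linalg.pinv(H) @ T

# Step 4: predictions on new data reuse the same random W and b.
X_new = np.linspace(-np.pi, np.pi, 100).reshape(-1, 1)
Y_pred = np.tanh(X_new @ W + b) @ beta
```

Note that the only learned quantity is `beta`, computed in closed form; everything else is fixed at initialization.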

In short, training happens in a single shot: a closed-form least-squares solve replaces the iterative, multi-step backpropagation training that is usually used with feed-forward neural networks.

The tuning of the network will mostly be around its hyperparameters:

  • Hidden layer size
  • Selection of activation function
  • Selection of input sources
  • Selection of the distribution for random values used in the initialization step
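Tuning these hyperparameters typically means comparing candidate settings on held-out data. A minimal sketch of such a sweep, assuming the same hypothetical sine-regression task as above and two candidate activations (tanh and ReLU, chosen here only as examples):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy data: regression on a sine curve.
X = rng.uniform(-np.pi, np.pi, size=(300, 1))
T = np.sin(X)

# Hold out the last 100 samples for validation.
X_tr, X_va = X[:200], X[200:]
T_tr, T_va = T[:200], T[200:]

# Candidate activation functions (an assumed, illustrative set).
activations = {"tanh": np.tanh, "relu": lambda z: np.maximum(z, 0.0)}

best = None
for name, act in activations.items():
    for n_hidden in (5, 20, 80):  # candidate hidden layer sizes
        # Random initialization; the distribution itself is also tunable.
        W = rng.normal(size=(1, n_hidden))
        b = rng.normal(size=(1, n_hidden))
        # Single-shot training via the pseudoinverse.
        beta = np.linalg.pinv(act(X_tr @ W + b)) @ T_tr
        # Score on the validation split.
        mse = np.mean((act(X_va @ W + b) @ beta - T_va) ** 2)
        if best is None or mse < best[0]:
            best = (mse, name, n_hidden)

print(best)  # lowest validation MSE and the hyperparameters that achieved it
```

Because each candidate network trains in a single closed-form solve, such sweeps are cheap compared with tuning an iteratively trained network.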


It is not as popular as deep neural networks because it often falls short of the accuracy they achieve on highly non-linear data.