Project Report (Conv-ELM)
Purusharth Verma
National Institute of Technology, Jamshedpur
02/05/2021
Abstract

Extreme Learning Machines (ELMs) offer several advantages over traditional neural networks. Since their parameters are set in a single closed-form step rather than learned iteratively, they can be trained in an extremely short amount of time. In addition, the parameters they learn tend not to overfit the training data. Because of these advantages, this project uses Extreme Learning Machines in place of conventional neural networks, and their performance has been recorded and analyzed.
Table of Contents
1. Introduction
2. Extreme Learning Machines
3. Methodology
3.1 The Training Process
3.2 Sample Code
4. Results and Discussion
4.1 MNIST Dataset
4.2 PBC Dataset
4.3 Dermatology Dataset
5. Conclusions and Recommendations
6. Acknowledgements
7. References
1 Introduction
Three image classification datasets have been used to evaluate the models developed in this project:
⦁ The MNIST Dataset: a dataset of 60,000 28×28 black-and-white images of the handwritten decimal digits 0-9.
⦁ The PBC Dataset: a dataset of over 17,000 360×360 color images of 8 different types of cells found in the human body.
⦁ The Dermatology Dataset: a dataset of over 10,000 360×360 color images of 7 different diagnostic categories of pigmented lesions.
2 Extreme Learning Machines

If H denotes the matrix of hidden-layer outputs for the training data, T the matrix of target outputs, and X the output weight matrix, then the output layer must satisfy

HX = T

which gives the closed-form solution

X = H†T

where H† is the Moore-Penrose pseudo-inverse of H.
Thus, all the trainable weights of an ELM can be set in a single step using the above equation. Further, due to the special properties of the Moore-Penrose pseudo-inverse, the weights thus obtained are the minimum-norm least-squares solution of HX = T: no other linear transformation of the hidden outputs can fit the targets more accurately, and the minimum-norm property means that the weights thus learned should, in principle, generalize well to new inputs.
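As an illustration, the snippet below performs this closed-form solve on random stand-in data; the matrix shapes and values here are illustrative assumptions, not values from the project's experiments.

import torch

H = torch.randn(32, 64)                          # hidden-layer outputs for a batch of 32
T = torch.eye(10)[torch.randint(0, 10, (32,))]   # one-hot target matrix
X = torch.linalg.pinv(H) @ T                     # X = H†T: minimum-norm least-squares weights
print(torch.norm(H @ X - T))                     # residual of the linear fit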
3 Methodology
3.1 The Training Process

❏ Since the weights of an ELM are set in a single step, and there is no way to incrementally update them for the current iteration, we simply overwrite the weights carried over from the previous iteration on every batch.
❏ To enable gradient flow through an ELM, we treat the input-output mapping it performs as just another function and assume its differentiability, as illustrated in the sketch below.
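This differentiability assumption is, in practice, supported by PyTorch, since torch.linalg.pinv itself supports autograd. The following minimal sketch, using illustrative stand-in features, shows gradients flowing back through the closed-form solve into the earlier layers:

import torch

features = torch.randn(32, 64, requires_grad=True)    # stand-in convolutional features
targets = torch.eye(10)[torch.randint(0, 10, (32,))]  # one-hot targets
wts = torch.linalg.pinv(features) @ targets           # closed-form ELM solve
loss = torch.nn.functional.mse_loss(features @ wts, targets)
loss.backward()                                       # gradients flow through pinv
print(features.grad.shape)                            # torch.Size([32, 64])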
3.2 Sample Code
import numpy as np
import torch


class SimpleBaseELM(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        # Randomly initialized hidden projection; it is never trained, since
        # only the output layer's weights are solved for.
        self.hidden_layer = torch.nn.Linear(input_dim, hidden_dim)
        self.linear_layer = torch.nn.Linear(
            hidden_dim,
            output_dim,
            bias=False
        )

    def get_hidden_output(self, x):
        return torch.relu(self.hidden_layer(x))  # ReLU activation is an assumed choice

    def set_linear_weights(self, wts):
        self.linear_layer.weight.data = wts

    def forward(self, x):
        return self.linear_layer(self.get_hidden_output(x))

    def forward_with_output(self, x, y):
        # Closed-form solve for the output weights: X = H†T
        x = self.get_hidden_output(x)
        inv = torch.linalg.pinv(x)
        wts = torch.matmul(inv, y)
        self.set_linear_weights(torch.transpose(wts, 0, 1))
        return self.linear_layer(x)
class ConvELM(torch.nn.Module):
    def __init__(self, conv_model, elm_dims):
        super().__init__()
        self.conv_model = conv_model
        self.flatten = torch.nn.Flatten()
        self.output_dim = elm_dims[2]
        self.elm_model = SimpleBaseELM(
            elm_dims[0],
            elm_dims[1],
            elm_dims[2]
        )

    def forward(self, x):
        x = self.conv_model(x)
        x = self.flatten(x)
        x = self.elm_model(x)
        return x

    def forward_with_output(self, x, y):
        # Passes the batch through the convolutional base, then solves for
        # (and overwrites) the ELM output weights using the targets y.
        x = self.conv_model(x)
        x = self.flatten(x)
        x = self.elm_model.forward_with_output(x, y)
        return x
# Inside a training/evaluation step of ConvELM (fragment):
if train:
    # One-hot encode the labels to form the ELM target matrix T.
    labels_categorical = torch.Tensor(
        np.eye(self.output_dim, dtype='float32')[labels.cpu()]
    )
    outputs = self.forward_with_output(images, labels_categorical)
else:
    outputs = self(images)
4 Results and Discussion

A K-fold training setup with K = 5 has been used. The generic model architecture for all three datasets consists of a base convolutional model (Resnet18 in most cases) with an Extreme Learning Machine appended as the final layer. The Adam optimizer, along with a learning rate scheduler implementing the One Cycle Policy, was used for training.
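A minimal sketch of this setup is shown below; the ELM dimensions, learning rates, and step counts are illustrative assumptions, not the exact values used in the experiments.

import torch
import torchvision

# Base convolutional model: Resnet18 with its classifier head removed, so that
# its 512-dimensional features feed the ELM (dimensions here are assumptions).
conv_model = torchvision.models.resnet18()
conv_model.fc = torch.nn.Identity()
model = ConvELM(conv_model, elm_dims=(512, 1024, 10))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-2, total_steps=5 * 100  # epochs * batches per epoch (illustrative)
)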
4.1 MNIST Dataset
The results obtained for the MNIST dataset are summarized below. Resnet18 was used
as the base convolutional model during the training process.
4.2 PBC Dataset
The results obtained for the PBC dataset are summarized below. Resnet18 was used as
the base convolutional model during the training process.
4.3 Dermatology Dataset
The results obtained for the dermatology dataset are summarized below. Resnet18 was
used as the base convolutional model during the training process.
5 Conclusions and Recommendations

From the results described above, it is easy to conclude that ELMs are a viable substitute for traditional feed-forward neural networks. They are easy to train and optimize, and they generalize well to new inputs. ELMs can also be embedded in other deep neural networks, allowing them to leverage the depth and complexity of those networks to solve more complicated AI tasks. This capacity to serve as a drop-in replacement in any neural network makes ELMs extremely extensible and flexible, extending their applications to a wide variety of AI-related domains. That said, ELMs also have some drawbacks, a few of which are listed below:
❏ In an ELM-augmented network, since the weights of the ELM are solved for (and overwritten) on every batch, the batch size used during iterative training must be large enough that the ELM does not overfit the batch data. This often makes ELM-augmented neural networks harder to train due to their increased memory requirements.
❏ Because the weights of an ELM are set in a single step, and because ELMs have no mechanism for updating existing weights with new batch data, they are prone to catastrophic forgetting whenever they are used in a batched training setup, as the sketch below illustrates.
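A small sketch of this behaviour, using the SimpleBaseELM class from Section 3.2 with illustrative dimensions: solving on a second batch completely replaces the weights fitted to the first.

import torch

elm = SimpleBaseELM(16, 32, 4)
x1, y1 = torch.randn(64, 16), torch.eye(4)[torch.randint(0, 4, (64,))]
x2, y2 = torch.randn(64, 16), torch.eye(4)[torch.randint(0, 4, (64,))]

elm.forward_with_output(x1, y1)
w1 = elm.linear_layer.weight.detach().clone()   # weights fitted to batch 1
elm.forward_with_output(x2, y2)
w2 = elm.linear_layer.weight.detach().clone()   # weights fitted to batch 2

print(torch.allclose(w1, w2))   # False: the first batch's solution is gone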
6 Acknowledgements
I would like to thank my professors, Aditya Hati and Dr. Vinay Kumar, as well as Arijit Nandi, for helping and guiding me throughout this project.
7 References
[1] Guang-Bin Huang, Qin-Yu Zhu, and Chee Kheong Siew, "Extreme learning machine: A new learning scheme of feedforward neural networks."
[2] Oyekale Abel Alade, Ali Selamat, and Roselina Sallehuddin, "A Review of Advances in Extreme Learning Machine Techniques and Its Applications."