Som MJJ
Introduction
An important aspect of an ANN model is whether it needs guidance during
learning. Based on the way they learn, artificial neural networks can be
divided into two categories: supervised and unsupervised.
In supervised learning, a desired output result for each input vector is
required when the network is trained.
An ANN of the supervised learning type, such as the multi-layer perceptron,
uses the target result to guide the formation of the neural parameters.
It is thus possible to make the neural network learn the behavior of the
process under study.
In unsupervised learning, the training of the network is entirely data-
driven and no target results for the input data vectors are provided.
An ANN of the unsupervised learning type, such as the self-organizing map,
can be used for clustering the input data and for finding features inherent
in the problem.
SOM
The Self-Organizing Map (SOM) was developed by Professor Teuvo Kohonen and
has proven useful in many applications. It is one of the most popular neural
network models and belongs to the category of competitive learning networks.
In competitive learning, the output neurons compete amongst themselves to be
activated, with the result that only one is activated at any one time. This activated
neuron is called a winner-takes-all neuron or simply the winning neuron.
The result is that the neurons are forced to organize themselves. For obvious reasons,
such a network is called a Self-Organizing Map (SOM).
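The winner-takes-all step described above can be sketched as follows. This is an illustrative snippet: the function name find_winner and the array layout are assumptions, not from the original text.

```python
import numpy as np

# Sketch of the winner-takes-all step: the winning neuron is the one whose
# weight vector is closest to the input vector in Euclidean distance.
def find_winner(weights, x):
    """Return the index of the Best Matching Unit (BMU) for input x."""
    distances = np.linalg.norm(weights - x, axis=1)
    return int(np.argmin(distances))

# Example: three neurons in a 2-D input space.
weights = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
x = np.array([0.9, 1.2])
winner = find_winner(weights, x)  # neuron 1 is closest to x
```

Only the winning neuron (and, during training, its neighbors) is updated; all other neurons are left unchanged, which is what forces the map to organize itself.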
The SOM is based on unsupervised learning, which means that no human
involvement is needed during learning and that little needs to be known
about the characteristics of the input data.
The SOM can be used for clustering data without knowing the class memberships
of the input data. Because it can detect features inherent in the problem, it
has also been called the Self-Organizing Feature Map (SOFM).
Contd.
Keep returning to Step 2 until the change in the weights is less than a pre-
specified threshold value or the maximum number T of iterations is reached;
then stop.
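The training loop with this stopping rule might be sketched as follows. This is a minimal illustration assuming a 1-D grid of neurons, a Gaussian neighborhood function, and a linearly decaying learning rate; the names and default values (train_som, lr, sigma) are illustrative, not from the original text.

```python
import numpy as np

# Minimal sketch of the SOM training loop: repeat the BMU search and weight
# update until the total weight change drops below a threshold or the
# maximum number T of iterations is reached.
def train_som(data, n_neurons=5, lr=0.5, sigma=1.0, threshold=1e-4, T=200, seed=0):
    rng = np.random.default_rng(seed)
    weights = rng.random((n_neurons, data.shape[1]))  # random initial neurons
    grid = np.arange(n_neurons)                       # 1-D neuron grid positions
    for t in range(T):
        old = weights.copy()
        for x in data:
            # Step 2: find the Best Matching Unit (winning neuron) for x
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood around the winner on the grid
            h = np.exp(-((grid - bmu) ** 2) / (2 * sigma ** 2))
            # Move the winner and its neighbors toward x, with decaying rate
            weights += lr * (1 - t / T) * h[:, None] * (x - weights)
        # Stopping rule: total weight change below the pre-specified threshold
        if np.abs(weights - old).sum() < threshold:
            break
    return weights

# Two well-separated groups of points; the neurons should settle near them.
data = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 4.9]])
weights = train_som(data, n_neurons=3)
```

The decaying learning rate is one common choice; other schedules (exponential decay, shrinking neighborhood width) work equally well in practice.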
Drawbacks
Does not handle categorical variables well. To obtain a scatterplot with good
data spread for identifying clusters, the SOM has to assume that all variables
in the dataset are continuous.
Computationally expensive. A dataset with more variables would
require longer times to calculate distances and identify BMUs. To expedite
computations, we could improve our initial positions of neurons from a
random state to a more informed approximation with the help of a simpler
dimension reduction technique, such as principal components analysis. By
starting the neurons off closer to the data points, less time would be needed
to move them to their optimal locations.
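The PCA-based initialization suggested above could be sketched like this, with the neurons spread evenly along the first principal component of the data. The function name pca_init is illustrative, not from the original text.

```python
import numpy as np

# Sketch of PCA-based initialization: instead of random starting positions,
# place the neurons evenly along the first principal component, so they
# begin closer to the data and need fewer updates to reach good locations.
def pca_init(data, n_neurons):
    mean = data.mean(axis=0)
    centered = data - mean
    # First principal component via singular value decomposition
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = vt[0]
    # Project the data onto the principal axis to find its extent
    spread = centered @ pc1
    positions = np.linspace(spread.min(), spread.max(), n_neurons)
    # Neurons evenly spaced along the axis, spanning the projected data
    return mean + positions[:, None] * pc1

data = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
init_weights = pca_init(data, 4)  # four neurons along the data's main axis
```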
Potentially inconsistent solutions. Because the initial positions of the
neurons differ each time the SOM analysis is run, the resulting map will
also differ.
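One common mitigation, not stated in the text but widely used, is to fix the random seed so that the initialization, and hence the resulting map, is reproducible across runs. A minimal sketch with NumPy:

```python
import numpy as np

# The run-to-run inconsistency comes from the random initial neuron
# positions. Fixing the random seed makes the initialization identical
# on every run, so the trained map is reproducible.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
init_a = rng_a.random((10, 3))  # 10 neurons, 3 input features
init_b = rng_b.random((10, 3))  # same seed, identical initializations
```

A fixed seed gives reproducibility but not stability: different seeds can still yield different maps, so it is common to run the SOM several times and compare the resulting cluster structures.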