
LOVELY PROFESSIONAL UNIVERSITY

ASSIGNMENT
OF
INTRODUCTION TO ARTIFICIAL INTELLIGENCE
(ECE241)

TOPIC : HEBB NET

Submitted to :- Ms. Ayani Nandi
Submitted by :- Navdeep Singh
Roll no. :- 08
Reg no. :- 11712668
Section :- EE018
Introduction :-

Hebbian theory is a neuroscientific theory claiming that an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell. It is an attempt to explain synaptic plasticity, the adaptation of brain neurons during the learning process. It was introduced by Donald Hebb in his 1949 book The Organization of Behavior. The theory is also called Hebb's rule, Hebb's postulate, and cell assembly theory.

Hebb states it as follows :-


Let us assume that the persistence or repetition of a reverberatory activity (or "trace") tends to induce lasting cellular changes that add to its stability. When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased.

The theory is often summarized as "Cells that fire together wire together." This summary, however, should not be taken too literally. Hebb emphasized that cell A needs to "take part in firing" cell B, and such causality can occur only if cell A fires just before, not at the same time as, cell B. This important aspect of causation in Hebb's work foreshadowed what is now known about spike-timing-dependent plasticity, which requires temporal precedence.
Description :-

Generalised Hebbian Learning :-

Generalised Hebbian learning extracts a set of principal directions along which the data are organised in a p-dimensional space. Each direction is represented by a weight vector, and the number of such principal directions is, at most, p. In the illustrative example of Figure 1, two-dimensional data are organised along two directions, w1 and w2.

Competitive Learning :-

Competitive learning extracts a set of centers of data clusters. Each center is stored in a weight vector. Note that the number of clusters is independent of the dimensionality of the input space. In Figure 2, two-dimensional data are organised in three clusters, whose centers are represented by three weight vectors.
An important extension of basic competitive learning is the feature map, obtained by adding some form of topological organization to the neurons. A minimal winner-take-all sketch follows.
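To illustrate the basic competitive (winner-take-all) update, here is a minimal MATLAB sketch; the data, the number of centers, and the learning rate eta are illustrative assumptions, not values from this report.

% Minimal winner-take-all competitive learning sketch (illustrative values).
eta = 0.05;                                          % learning rate (assumed)
X = [randn(2,100), randn(2,100)+3, randn(2,100)-3];  % three rough clusters of 2-D data
W = X(:, randi(size(X,2), 1, 3));                    % initialise 3 centers from the data
for k = 1:size(X, 2)
    d = sum((W - repmat(X(:,k), 1, size(W,2))).^2, 1);   % squared distance to each center
    [~, win] = min(d);                               % index of the winning (closest) center
    W(:, win) = W(:, win) + eta * (X(:,k) - W(:, win));  % move the winner towards the input
end
disp(W);                                             % approximate cluster centers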
Locality of Plain Hebbian Learning :-
Looking at one synapse at a time (from input unit j),
∆wj = η V Ej
This is a local update rule: the Hebbian synapse, characterized by a weight wj, is modified as a function of the activity of only the two units it connects. By contrast, back-propagation is non-local; its update rule involves the back-propagation of a distant error signal, computed (potentially many) layers above. This locality makes Hebbian learning a plausible candidate for biological systems.
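A minimal MATLAB sketch of this local update for a single linear neuron is given below; the learning rate eta, the pattern E, and the weight vector w are illustrative assumptions.

% Plain Hebbian update for one linear neuron (illustrative sketch).
eta = 0.01;                % learning rate (assumed)
E = randn(5, 1);           % one input pattern with 5 components (assumed)
w = 0.1 * randn(5, 1);     % small random initial weights, one per input
V = w' * E;                % output activation V = w'E
w = w + eta * V * E;       % local rule: each wj changes using only V and Ej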

A Consequence of ∆w = η V E :-
Let us look at the update rule ∆w = η V E, using the expression V = wᵀE for the neuron's output:
∆w = η V E
   = η (wᵀE) E   (inner product)
   = η (E Eᵀ) w  (outer product)

Given a current value of the weight vector w, the weight update ∆w will be a function of the outer product of the input pattern with itself.

Note that learning is incremental; that is, a weight update is performed each time the neuron is exposed to a training pattern E.

We can compute an expectation for ∆w if we take into account the distribution over patterns, P(E).

The Correlation Matrix :-


Taking the expectation of the update rule under the input distribution P(E),
⟨∆w⟩ = η ⟨E Eᵀ⟩ w ≡ η C w
where we have defined the correlation matrix as

C = ⟨E Eᵀ⟩ =
    [ ⟨E1E1⟩  ⟨E1E2⟩  …  ⟨E1EN⟩ ]
    [ ⟨E2E1⟩  ⟨E2E2⟩  …  ⟨E2EN⟩ ]
    [    ⋮        ⋮     ⋱     ⋮   ]
    [ ⟨ENE1⟩  ⟨ENE2⟩  …  ⟨ENEN⟩ ]

and N is the number of inputs to our Hebbian neuron.
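In practice C is estimated from samples; a minimal MATLAB sketch, with an assumed sample matrix E holding one zero-mean pattern per column:

% Sample estimate of the correlation matrix C = <E E'> (illustrative).
nPatterns = 500;
E = randn(2, nPatterns);        % N = 2 inputs, one pattern per column (assumed data)
C = (E * E') / nPatterns;       % average of the outer products E*E'
disp(C);                        % close to the identity for this particular data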

More on the Correlation Matrix :-

C is similar to the covariance matrix
Σ = ⟨(E − µ)(E − µ)ᵀ⟩ ;
C is the second moment of the input distribution about the origin, while Σ is the second moment about the mean. If µ = 0, then C = Σ.
Like Σ, C is symmetric. Symmetry implies that all eigenvalues are real and all eigenvectors are orthogonal.
Additionally, because C is an expectation of outer products, it is positive semi-definite. This means that the eigenvalues are not just real, they are also all non-negative.
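These properties are easy to check numerically; a minimal MATLAB sketch (the data are an illustrative assumption):

% Numerical check that C is symmetric with real, non-negative eigenvalues.
E = randn(3, 1000);             % assumed sample patterns, one per column
C = (E * E') / size(E, 2);      % sample correlation matrix
lambda = eig(C);                % eigenvalues of the symmetric matrix C
disp(norm(C - C', 'fro'));      % symmetry: essentially zero
disp(min(lambda));              % smallest eigenvalue: non-negative (up to round-off)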

The Stability of Hebbian Learning :-


Recalling the expected update
⟨∆w⟩ = η C w ,
from a mathematical perspective this is a discretized version of a linear system of coupled first-order differential equations,
dw/dt = C w
whose natural solutions are of the form
w(t) = e^(λt) u
where λ is a scalar and u is a vector independent of time. If we wanted to solve this system of equations, we would still need to solve for both λ and u. There turn out to be multiple pairs {λ, u}; w(t) is then a linear combination of terms of this form.
Note that if any λi > 0, w(t) blows up.

Hebbian Learning Blows Up :-

To see that w(t) = e^(λt) u solves the differential equation dw/dt = C w, substitute it in:

C w = dw/dt
C (e^(λt) u) = d/dt (e^(λt) u) = λ e^(λt) u
⇒ C u = λ u

So the λi's are the eigenvalues of the correlation matrix C!
1. C is an expectation of outer products, so it is positive semi-definite ⇒ all λ ≥ 0.
2. From the solution w(t) = e^(λt) u, if any λ > 0 then w → ∞.
3. The λi's cannot all be zero (that is true only for the zero matrix).
So plain Hebbian learning blows up.
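A short MATLAB simulation makes the blow-up visible; the learning rate, data, and number of iterations are illustrative assumptions:

% Iterating the plain Hebbian rule: the norm of w grows without bound.
eta = 0.01;
E = randn(2, 2000);             % zero-mean training patterns, one per column (assumed)
w = 0.1 * randn(2, 1);          % small initial weights
for k = 1:size(E, 2)
    V = w' * E(:, k);           % neuron output
    w = w + eta * V * E(:, k);  % plain Hebbian update
end
disp(norm(w));                  % orders of magnitude larger than the initial norm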

Plain Hebbian Learning: Conclusions :-

We have shown that, for zero-mean data, the weight vector aligns itself with the direction of maximum variance as training continues.

This direction corresponds to the eigenvector of the correlation matrix C with the largest eigenvalue. If we decompose the current w in terms of the eigenvectors of C, w = Σi αi ui, then the expected weight update
⟨∆w⟩ = η C w
     = η C Σi αi ui
     = η Σi αi λi ui
will move the weight vector w towards each eigenvector ui by a factor proportional to λi. Over many updates, the eigenvector with the largest λi will drown out the contributions from the others. But because of this mechanism, the magnitude of w blows up.
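This alignment can be checked numerically; a minimal MATLAB sketch in which the data are deliberately stretched along one axis (all values are illustrative assumptions):

% After plain Hebbian training, w points (up to sign) along the leading eigenvector of C.
E = randn(2, 2000);  E(2, :) = 0.2 * E(2, :);  % most variance along the first axis
C = (E * E') / size(E, 2);
[U, D] = eig(C);
[~, idx] = max(diag(D));                       % leading eigenvector of C
eta = 0.005;  w = 0.1 * randn(2, 1);
for k = 1:size(E, 2)
    V = w' * E(:, k);
    w = w + eta * V * E(:, k);                 % plain Hebbian update
end
disp(abs((w / norm(w))' * U(:, idx)));         % close to 1: the directions agree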

Oja’s Rule :-

An alternative, suggested by Oja, is to add weight decay. The weight-decay term is proportional to V² (recall that V = wᵀE is the output activation of our Hebbian neuron):
∆w = η V (E − V w)
Under this rule the weight vector approaches unit length, without any additional explicit normalization, but it retains the property that it points in the direction of maximum variance in the data (the largest principal component).
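A minimal MATLAB sketch of Oja's rule on the same kind of stretched data (all values are illustrative assumptions):

% Oja's rule: Hebbian learning with a V^2-proportional decay term.
eta = 0.01;
E = randn(2, 2000);  E(2, :) = 0.2 * E(2, :);  % assumed zero-mean data
w = 0.1 * randn(2, 1);
for k = 1:size(E, 2)
    V = w' * E(:, k);
    w = w + eta * V * (E(:, k) - V * w);       % Oja update
end
disp(norm(w));                                 % approaches 1 (unit length)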

Principal Component Analysis (PCA) :-

PCA is a tool from statistics for data analysis. It can reveal structure in high-dimensional (N-dimensional) data that is not otherwise obvious.
Like Hebbian learning, it discovers the direction of maximum variance in the data. But then, in the (N − 1)-dimensional subspace perpendicular to that direction, it discovers the direction of maximum remaining variance, and so on for all N.
The result is an ordered sequence of principal components. For zero-mean data these are equivalently the eigenvectors of the correlation matrix C, ordered by magnitude of eigenvalue in descending order. They are mutually orthogonal.

PCA: Batch Algorithm :-

Because of this, eigenvalue decomposition is an effective means of performing principal component analysis, if we can afford to retain all the data and do the computation offline. This is typically the case with PCA.
[nData, nDim] = size(xi);                    % rows are samples, columns are dimensions
xi0 = xi - ones(nData, 1) * mean(xi, 1);     % subtract the mean from every column
C = cov(xi0);                                % covariance matrix of the centred data
[u, lambda] = eig(C);                        % columns of u are the principal directions
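Note that eig does not guarantee any particular ordering of the eigenvalues, so in practice one sorts them in descending order before reading off the principal components; a minimal continuation of the snippet above (same variable names):

[lam, order] = sort(diag(lambda), 'descend');  % largest eigenvalue first
u = u(:, order);                               % principal components, ordered accordingly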

PCA Decorrelates Data :-

In producing the N principal components (which are orthogonal), PCA provides an alternative, complete, orthonormal basis of N-space:
E = Σi Ei ei = Σi ci ui
e.g. in two dimensions, E = E1 e1 + E2 e2 = c1 u1 + c2 u2,
where the ei are the original basis vectors, the ui are the principal components, and the ci are the coordinates of E in the new basis.

In this basis, the data are uncorrelated.
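A minimal MATLAB check of the decorrelation claim (the data matrix X is an illustrative assumption):

% Projecting zero-mean data onto the principal directions decorrelates it.
X = randn(1000, 2) * [2 1; 0 1];               % correlated two-dimensional data (assumed)
X0 = X - ones(size(X, 1), 1) * mean(X, 1);     % zero-mean data
[U, ~] = eig(cov(X0));                         % principal directions
Y = X0 * U;                                    % coefficients in the new basis
disp(cov(Y));                                  % off-diagonal entries are (numerically) zero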


PCA: Geometric Interpretation :-

The Karhunen-Loève transform operates on zero-mean data; it is a function of the covariance matrix Σ only.

The covariance matrix is a second-order statistic. This makes PCA a second-order method; it ignores higher-order statistics of the data.
We start with the original covariance matrix in the original orthonormal
basis.
PCA applies a set of rotations to the original orthonormal basis, resulting
in another orthonormal basis, such that the off-diagonal entries of the
covariance matrix become zero and the diagonal entries of the covariance
matrix take on the maximum values possible.

Data Compression with PCA :-

If we wanted to transmit our original zero-mean data set {E}, we could instead send the zero-mean, diagonal-covariance coefficients (the KL-transformed data), provided the receiver knew the KL transform Uᵀ we used to decorrelate it.
Suppose that several, say P, of the eigenvalues were very small, i.e. that the contribution of the corresponding eigenvectors was (almost) statistically insignificant. Then we would not need to send coefficients N−P+1 through N of every data point to obtain an (almost) perfect reconstruction of the original data point E.
This results in a compression factor of (N−P)/N (plus the overhead of communicating the N × N transform, if necessary).
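A minimal MATLAB sketch of this idea, keeping only the leading components (the data, the dimensionality, and the number of retained components are illustrative assumptions):

% PCA compression: keep the leading coefficients, then reconstruct.
X = randn(1000, 5) * randn(5, 5);                % assumed correlated data, N = 5
X0 = X - ones(size(X, 1), 1) * mean(X, 1);       % zero-mean data
[U, D] = eig(cov(X0));
[~, order] = sort(diag(D), 'descend');
U = U(:, order);                                 % principal components, largest first
keep = 2;                                        % transmit only 2 of the 5 coefficients
Y = X0 * U(:, 1:keep);                           % compressed representation
Xhat = Y * U(:, 1:keep)';                        % approximate reconstruction
disp(norm(X0 - Xhat, 'fro') / norm(X0, 'fro'));  % relative reconstruction error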
PCA with Neural Networks :-
We have seen that Hebbian learning, with appropriate provisions for preventing blow-up, extracts the largest principal component.
Let us take a look at two different neural network architectures capable of extracting more of them:
1. Cascading multiple Hebbian neurons (a minimal sketch follows).
2. Autoencoder networks (we have already seen these before).
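One common way of cascading Hebbian neurons is Sanger's generalised Hebbian algorithm, in which each neuron learns the variance left over by the neurons before it; the following MATLAB sketch is illustrative (the learning rate, data, and number of neurons are assumptions, not taken from this report):

% Sanger's rule (generalised Hebbian algorithm) with two cascaded neurons:
% the rows of W converge towards the first two principal directions.
eta = 0.005;
E = randn(3, 5000);  E(2, :) = 0.5 * E(2, :);  E(3, :) = 0.1 * E(3, :);  % assumed data
W = 0.1 * randn(2, 3);                        % one row of weights per Hebbian neuron
for k = 1:size(E, 2)
    V = W * E(:, k);                          % outputs of both neurons
    W = W + eta * (V * E(:, k)' - tril(V * V') * W);   % Sanger's update
end
disp(W);                                      % rows roughly align with the leading eigenvectors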

Applications of Neural Networks


Neural networks have been successfully applied to a broad spectrum of data-intensive applications, such as the following:

Application | Architecture / Algorithm | Activation Function
Process modeling and control | Radial Basis Network | Radial Basis
Machine Diagnostics | Multilayer Perceptron | Tan-Sigmoid Function
Portfolio Management | Classification Supervised Algorithm | Tan-Sigmoid Function
Target Recognition | Modular Neural Network | Tan-Sigmoid Function
Medical Diagnosis | Multilayer Perceptron | Tan-Sigmoid Function
Credit Rating | Logistic Discriminant Analysis with ANN, Support Vector Machine | Logistic function
Targeted Marketing | Back Propagation Algorithm | Logistic function
Voice recognition | Multilayer Perceptron, Deep Neural Networks (Convolutional Neural Networks) | Logistic function
Financial Forecasting | Backpropagation Algorithm | Logistic function
Intelligent searching | Deep Neural Network | Logistic function
Fraud detection | Gradient Descent Algorithm and Least Mean Square (LMS) algorithm | Logistic function

Flow Chart :-
MATLAB CODE :-
clc;
clear all;
close all;
x = [1 1 -1 -1; 1 -1 1 -1];     % input patterns, one per column
t = [1 -1 -1 -1];               % target outputs
w = [0, 0];                     % initial weights
b = 0;                          % initial bias
for i = 1:4
    for j = 1:2
        w(j) = w(j) + t(i) * x(j, i);   % Hebb rule: w(new) = w(old) + t*x
    end
    b = b + t(i);                       % bias update: b(new) = b(old) + t
end
disp('final weight matrix:');
disp(w);
disp('final bias values:');
disp(b);
plot(x(1,1), x(2,1), 'or', 'MarkerSize', 20, 'MarkerFaceColor', [0 0 1]); hold on;
plot(x(1,2), x(2,2), 'or', 'MarkerSize', 20, 'MarkerFaceColor', [1 0 0]); hold on;
plot(x(1,3), x(2,3), 'or', 'MarkerSize', 20, 'MarkerFaceColor', [1 0 0]); hold on;
plot(x(1,4), x(2,4), 'or', 'MarkerSize', 20, 'MarkerFaceColor', [1 0 0]); hold on;
m = -(w(1) / w(2));             % slope of the decision boundary w(1)*x1 + w(2)*x2 + b = 0
c = -b / w(2);                  % intercept of the decision boundary
x1 = linspace(-2, 2, 100);
x2 = m * x1 + c;
plot(x1, x2, 'r');              % decision line separating the two classes
axis([-2 2 -2 2]);

MATLAB screenshot :-

Code :-
Output :-
Questions :-
Ans 1.
# Learning outcomes :-
1. I learnt about the basic models of ANN.
2. I learnt about the comparison between biological and artificial neuron.
3. I learnt about the basic fundamental neuron model and the Hebb network.
4. I learnt about the different types of connections of NN, learning rules, and
activation functions.

Ans 2.
The best learning outcome is learning about the basic fundamental neuron
model and the Hebb network, because it gives detailed and very useful
information about the Hebb net.

Ans 3.
I used MATLAB software to solve the given problem.
MATLAB combines a desktop environment tuned for iterative analysis
and design processes with a programming language that expresses matrix
and array mathematics directly.
MATLAB stands for matrix laboratory.
System specifications for installing MATLAB software :-

Processors :-
Minimum: Any Intel or AMD x86-64 processor
Recommended: Any Intel or AMD x86-64 processor with four logical
cores and AVX2 instruction set support
Disk :-
Minimum: 2.9 GB of HDD space for MATLAB only, 5-8 GB for a
typical installation
Recommended: an SSD
A full installation of all MathWorks products may take up to 29 GB of
disk space.

RAM :-
Minimum: 4 GB
Recommended: 8 GB
For Polyspace, 4 GB per core is recommended

Graphics :-
No specific graphics card is required.
Hardware accelerated graphics card supporting OpenGL 3.3 with 1GB
GPU memory is recommended.
GPU acceleration using the Parallel Computing Toolbox requires a
CUDA GPU. See GPU Computing Support for details.

Ans 4.
Access MATLAB Add-On Toolboxes
Statistics and Machine Learning Toolbox
Curve Fitting Toolbox
Control System Toolbox
Signal Processing Toolbox
Mapping Toolbox
System Identification Toolbox
Deep Learning Toolbox
DSP System Toolbox
Datafeed Toolbox
Financial Toolbox
Image Processing Toolbox
Text Analytics Toolbox
Predictive Maintenance Toolbox

Applications covered by MATLAB :-


Capstan with slip, double Cardan joint, dynamometer, hydromechanical
hoist, power window system, sheet metal feeder, etc.

Ans 5.
71 companies reportedly use MATLAB in their tech stacks, including
Empatica, ADEXT, and doubleSlash.

• Empatica
• ADEXT
• doubleSlash
• Walter
• RideCell
• Broadcom
• Diffbot

Cost of the software is Rs.135,000 to Rs.145,000.

THANK YOU
