An Overview of the Artificial Neural Network Approach

Dr. Mrinmoy Ray
mrinmoy.ray@icar.gov.in
Scientist, ICAR-IASRI
Algorithm/Model vs. Machine Learning

[Diagram: in the traditional approach, data and a program/model go into the computer and the output comes out; in machine learning, data and the output go into the computer and the program/model comes out.]
Introduction

 An artificial neural network (ANN), also called a neural network (NN), is a computational model inspired by the structure as well as the functional aspects of biological neural networks, especially the human brain.
 A neural network is made up of many interconnected simple processing elements called neurons or nodes.
 Each node receives an input signal, which is the aggregate ‘‘information’’ from other nodes or external stimuli, processes it locally through an activation or transfer function, and produces a transformed output signal to other nodes or external outputs.
 This information-processing characteristic makes ANNs effective computational devices, able to learn from examples and then generalize to examples never seen before.
BIOLOGICAL NEURON MODEL
Four parts of a typical nerve cell:
 DENDRITES: Accept the inputs.
 SOMA: Processes the inputs.
 AXON: Turns the processed inputs into outputs.
 SYNAPSES: The electrochemical contacts between the neurons.
ARTIFICIAL NEURON MODEL
 Inputs to the network are represented by the mathematical symbols x1, x2, ..., xn.
 Each of these inputs is multiplied by a connection weight w1, w2, ..., wn.
 These products are simply summed and fed through the transfer function f( ) to generate a result, which is then output:

sum = w1 x1 + ... + wn xn
output = f(w1 x1 + ... + wn xn)
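As a minimal sketch of this computation (not from the slides; the function names are illustrative), the artificial neuron reduces to a dot product followed by a transfer function:

```python
import numpy as np

def neuron(x, w, f):
    # Weighted sum of the inputs, passed through the transfer function f.
    return f(np.dot(w, x))

# Example with two inputs and a sigmoid transfer function.
sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))
x = np.array([0.5, 0.8])    # inputs x1, x2
w = np.array([0.4, -0.2])   # connection weights w1, w2
print(neuron(x, w, sigmoid))  # f(w1*x1 + w2*x2)
```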
Fig 1: AND gate graph — inputs x1, x2 ∈ {0, 1} and output o = x1 AND x2.
Fig 2: AND gate network — the same inputs connected through weights w1, w2.

The graph of Fig 1 cannot be considered a neural network, since the connections between the nodes are fixed and appear to play no role other than carrying the inputs to the node that computes their conjunction. The graph structure of Fig 2, in which the connection weights are modifiable using a learning algorithm, qualifies the computing system to be called an artificial neural network.
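As a hedged illustration (the slides do not fix specific values), one weight and threshold choice that makes a single neuron with a step transfer function compute AND:

```python
import numpy as np

def step(s, threshold=1.5):
    # Fires (outputs 1) only when the weighted input sum exceeds the threshold.
    return 1 if s > threshold else 0

w = np.array([1.0, 1.0])  # one possible weight choice realizing AND
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", step(np.dot(w, np.array([x1, x2]))))
```

Any weights and threshold satisfying w1, w2 <= threshold < w1 + w2 would serve equally well; this freedom is exactly what a learning algorithm adjusts.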
Overview of ANN architecture

i) Input layers
These layers are responsible for receiving information (data), signals,
features, or measurements from the external environment. These
inputs (samples or patterns) are usually normalized within the limit
values produced by activation functions. This normalization results
in better numerical precision for the mathematical operations
performed by the network.
ii) Hidden, intermediate, or invisible layers
These layers are composed of neurons which are responsible for
extracting patterns associated with the process or system being
analyzed. These layers perform most of the internal processing of the
network.
iii) Output layers
These layers are also composed of neurons, and thus are
responsible for producing and presenting the final network
outputs, which result from the processing performed by the
neurons in the previous layers.

Multilayer feedforward networks

[Figure: a multilayer feedforward network with input, hidden, and output layers]
ANN approach to time series forecasting

 The ANN performs the following nonlinear function mapping between the input and output:

y_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-p}, w) + \varepsilon_t

where w is a vector of all parameters and f is a function determined by the network structure and connection weights. Therefore, the neural network resembles a nonlinear autoregressive model.
Architecture of ANN for time series forecasting

[Figure: feedforward network with lagged inputs y_{t-1}, ..., y_{t-p}, a hidden layer of q nodes, and output y_t]

y_t = f\left( \sum_{j=0}^{q} \alpha_j \, g\left( \sum_{i=0}^{p} \beta_{ij} \, y_{t-i} \right) \right)
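A minimal numpy sketch of this forward pass (an assumption-laden illustration: the bias terms corresponding to j = 0 and i = 0 are omitted, g is taken as the logistic sigmoid, and f as the identity):

```python
import numpy as np

def nar_forward(y_lags, alpha, beta,
                g=lambda s: 1.0 / (1.0 + np.exp(-s)),
                f=lambda s: s):
    # y_t = f( sum_j alpha_j * g( sum_i beta_ij * y_{t-i} ) )
    hidden = g(beta @ y_lags)   # q hidden-node activations
    return f(alpha @ hidden)    # single output y_t

p, q = 3, 4                              # illustrative lag order and hidden-node count
rng = np.random.default_rng(0)
alpha = rng.normal(size=q)               # hidden-to-output weights
beta = rng.normal(size=(q, p))           # input-to-hidden weights
y_lags = np.array([0.92, 0.89, 0.78])    # normalized y_{t-1}, y_{t-2}, y_{t-3}
print(nar_forward(y_lags, alpha, beta))
```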
Most commonly used activation functions

Activation function    Equation
Identity               x
Sigmoid                1 / (1 + e^{-x})
TanH                   tanh(x) = 2 / (1 + e^{-2x}) - 1
ArcTan                 tan^{-1}(x)
Sinusoid               sin(x)
Gaussian               e^{-x^2}
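The table translates directly into code; a small reference sketch of these six functions:

```python
import numpy as np

activations = {
    "identity": lambda x: x,
    "sigmoid":  lambda x: 1.0 / (1.0 + np.exp(-x)),
    "tanh":     lambda x: 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0,  # same as np.tanh(x)
    "arctan":   np.arctan,
    "sinusoid": np.sin,
    "gaussian": lambda x: np.exp(-x ** 2),
}

for name, f in activations.items():
    print(f"{name:9s} f(0.5) = {f(0.5):.4f}")
```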
Three common learning algorithms for ANNs

1) Supervised Learning
 The supervised learning strategy consists of having the desired outputs available for a given set of input signals; in other words, each training sample is composed of the input signals and their corresponding outputs. Hence, it requires a table of input/output data, also called an attribute/value table, which represents the process and its behavior.

2) Unsupervised Learning
 Unlike supervised learning, algorithms based on unsupervised learning do not require any knowledge of the respective desired outputs. The network organizes itself according to the particularities shared by elements of the sample set, identifying subsets (or clusters) that present similarities. The learning algorithm adjusts the synaptic weights and thresholds of the network so that these clusters are reflected within the network itself.

3) Reinforcement Learning
 It is a hybrid of supervised and unsupervised learning.
Supervised Learning
Contd…
 (x, y) pairs (pre-classified training examples)
 Given an observation or set of observations x, what is the best label y?

Regression problem:

Rainfall (X1)   Temperature (X2)   Yield (Y)
12              25                 67
23              30                 78
47              32                 80
[Diagram: one-hot encoding of a categorical label, e.g., 1 0 0]
Time Series

Year   Yield
2000   78
2001   89
2002   92
2003   96
2004   108
2005   117
2006   156
2007   158
2008   165

Rearranged into lagged input/output form (a sketch of this rearrangement follows):

Y_{t-2}   Y_{t-1}   Y_t
78        89        92
89        92        96
92        96        108
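An illustrative helper (the name is an assumption) producing exactly this lagged layout with p = 2:

```python
import numpy as np

def make_lagged(series, p):
    # Rows of [y_{t-p}, ..., y_{t-1}] paired with the target y_t.
    X = np.array([series[t - p:t] for t in range(p, len(series))])
    y = np.array(series[p:])
    return X, y

yield_series = [78, 89, 92, 96, 108, 117, 156, 158, 165]  # 2000-2008
X, y = make_lagged(yield_series, p=2)
print(X[:3])  # [[78 89] [89 92] [92 96]]
print(y[:3])  # [ 92  96 108]
```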
Graphical Representation of Supervised Learning

[Diagram: the training data set of input/output pairs (Input1/Output1, ..., Inputn/Outputn) is fed to the learning algorithm, which produces a model; the test data set, containing only the input X, is then fed to the model to obtain the predicted Y. During training, the weights are adjusted according to the error between the network output Ŷ and the target Y.]
Unsupervised Learning
Contd…
 Given an observation or set of observations x, there is no label.
 A typical task is cluster analysis.
Graphical Representation of Unsupervised Learning

[Diagram: the training data set of unlabeled inputs (Input1, ..., Inputn) is fed to the learning algorithm, which partitions them into clusters (Group1, Group2, Group3), as sketched below.]
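As a concrete if minimal sketch of such grouping, assuming scikit-learn is available, k-means clustering assigns each unlabeled input to one of three groups:

```python
import numpy as np
from sklearn.cluster import KMeans  # assumes scikit-learn is installed

rng = np.random.default_rng(1)
# Unlabeled two-dimensional observations drawn around three centers.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2)) for c in (0, 5, 10)])

labels = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X)
print(labels)  # each input assigned to group 0, 1, or 2
```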
Reinforcement Learning
Gradient descent backpropagation algorithm

E = \frac{1}{N} \sum_{n=1}^{N} (e_n)^2
  = \frac{1}{N} \sum_{n=1}^{N} \left[ y_t - f\left( \sum_{j=0}^{q} \alpha_j \, g\left( \sum_{i=0}^{p} \beta_{ij} \, y_{t-i} \right) \right) \right]^2

\Delta \alpha_j(t) = -\eta \frac{\partial E}{\partial \alpha_j} + \delta \, \Delta \alpha_j(t-1)

\frac{\partial E}{\partial w_j} = -e_j(n) \, f'(x) \, y_j(n)

\Delta \beta_{ij}(t) = -\eta \frac{\partial E}{\partial \beta_{ij}} + \delta \, \Delta \beta_{ij}(t-1)

\frac{\partial E}{\partial w_{ij}} = g'(x) \sum_{j=0}^{q} e_j(n) \, w_j(n), \qquad g'(x) = \frac{\exp(-x)}{(1+\exp(-x))^2}
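A numpy sketch of these updates for the single-hidden-layer network (assumptions: sigmoid hidden nodes, linear output, no bias or momentum terms, and inputs already normalized as in the procedure that follows):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_nar(X, y, q=4, eta=0.1, epochs=2000, seed=0):
    # Gradient-descent fit of y_t = alpha . sigmoid(beta @ lags), minimizing MSE.
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    alpha = rng.normal(scale=0.1, size=q)
    beta = rng.normal(scale=0.1, size=(q, p))
    for _ in range(epochs):
        h = sigmoid(X @ beta.T)                     # hidden activations, shape (N, q)
        e = h @ alpha - y                           # prediction errors, shape (N,)
        grad_alpha = (e @ h) / len(y)               # dE/d alpha_j
        delta = (e[:, None] * alpha) * h * (1 - h)  # errors propagated to hidden nodes
        grad_beta = (delta.T @ X) / len(y)          # dE/d beta_ij
        alpha -= eta * grad_alpha                   # weight updates
        beta -= eta * grad_beta
    return alpha, beta

# Usage: fit the normalized yield series with p = 2 lags.
series = np.array([78, 89, 92, 96, 108, 117, 156, 158, 165], dtype=float)
series = (series - series.min()) / (series.max() - series.min())
X = np.array([series[t - 2:t] for t in range(2, len(series))])
alpha, beta = train_nar(X, series[2:])
```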
Step-by-step procedure

1. Division of the data:
Data is divided into training and test sets. The training sample is used for ANN model development, and the test sample is utilized to evaluate the forecasting performance (a chronological split is sketched below).
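A sketch of the chronological split (shuffling is avoided so that no future observations leak into training):

```python
def split_series(series, test_fraction=0.2):
    # Hold out the most recent observations as the test sample.
    cut = int(len(series) * (1 - test_fraction))
    return series[:cut], series[cut:]

yield_series = [78, 89, 92, 96, 108, 117, 156, 158, 165]
train, test = split_series(yield_series)
print(train, test)  # first seven values for training, last two for testing
```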

2. Data Normalization:
Normalization procedures:
Linear transformation to [0, 1]: X_n = (X_0 - X_min) / (X_max - X_min)
Statistical normalization: X_n = (X_0 - mean(X)) / sd(X)
Simple normalization: X_n = X_0 / X_max
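The three procedures as code (a sketch; the function names are illustrative):

```python
import numpy as np

def linear_01(x):
    return (x - x.min()) / (x.max() - x.min())  # linear transformation to [0, 1]

def statistical(x):
    return (x - x.mean()) / x.std()             # statistical (z-score) normalization

def simple(x):
    return x / x.max()                          # simple normalization

x = np.array([78., 89., 92., 96., 108., 117., 156., 158., 165.])
print(linear_01(x).round(3))
```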

3. Selection of the appropriate number of hidden nodes and the optimum number of lags:
There are no established theories available for the selection of p and q; hence, experiments are often conducted to determine the optimal values of p and q (one such experiment is sketched below).
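One such experiment, sketched with scikit-learn's MLPRegressor standing in for the ANN (an assumption, not the slides' own implementation): try each (p, q) pair and keep the one with the lowest test RMSE.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor  # assumes scikit-learn is installed
from sklearn.metrics import mean_squared_error

def make_lagged(series, p):
    X = np.array([series[t - p:t] for t in range(p, len(series))])
    return X, np.array(series[p:])

series = np.sin(np.arange(100) / 5.0)            # illustrative series
best = None
for p in range(1, 5):                            # candidate lag orders
    for q in range(1, 6):                        # candidate hidden-node counts
        X, y = make_lagged(series, p)
        cut = int(len(y) * 0.8)                  # chronological train/test split
        model = MLPRegressor(hidden_layer_sizes=(q,), max_iter=2000,
                             random_state=0).fit(X[:cut], y[:cut])
        rmse = mean_squared_error(y[cut:], model.predict(X[cut:])) ** 0.5
        if best is None or rmse < best[0]:
            best = (rmse, p, q)
print("best (RMSE, p, q):", best)
```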

4. Estimation of connection weights:
The connection weights are estimated during training, e.g., using the gradient descent backpropagation algorithm described earlier.
5. Evaluating forecasting performance:

MAPE = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100

MSE = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2

RMSE = \sqrt{ \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2 }
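The three criteria in code (a direct transcription of the formulas above):

```python
import numpy as np

def mape(y, y_hat):
    return np.mean(np.abs((y - y_hat) / y)) * 100  # mean absolute percentage error

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)               # mean squared error

def rmse(y, y_hat):
    return np.sqrt(mse(y, y_hat))                  # root mean squared error

y = np.array([92., 96., 108.])
y_hat = np.array([90., 99., 104.])
print(mape(y, y_hat), mse(y, y_hat), rmse(y, y_hat))
```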

LIMITATIONS

 ANNs are nonlinear time series models; hence, for linear time series data the approach may not be better than a linear statistical model.
 ANNs are black-box methods. There is no exact form to describe and analyze the relationship between inputs and outputs, which makes interpretation of the results troublesome. In addition, no formal statistical test is available.
 ANNs are subject to overfitting problems owing to their large number of parameters.
 There are no established theories available for the selection of p and q; hence, experiments are often conducted to determine their optimal values, which is tedious.
 ANNs usually require more data for time series forecasting.

THANK YOU ALL