
3.3.2 Basic Concepts of ANNs

The artificial neural network (ANN), which employs the model structure of the biological neural network, is a powerful computational technique for modeling complex non-linear relationships, particularly in situations where the explicit form of the relation between the variables involved is unknown. The basic structure of an ANN model usually comprises three distinct layers: the input layer, where the data are introduced to the model and the weighted sum of the inputs is computed; the hidden layer or layers, where the data are processed; and the output layer, where the results of the ANN are generated [1].

According to Tapkin (2004), the system is known as a feed-forward artificial neural network (FANN) because the inputs are passed through its model neurons to produce the output once, as in figure 3.1. The FANN is a very powerful tool in optimization modeling and has been used extensively for the prediction of water resource variables [9,10].

Figure 3.1 Typical Model of a Feed-forward Artificial Neural Network

Moreover, it is important to know that a neural network is characterized by three points, which are:
 Architecture – its pattern of connections between the neurons (units, cells, or nodes)
 Training/learning algorithm – its method of determining the weights (which represent the information used by the network to solve a problem) on the connections
 Activation function – a function used to transform the activation level of a unit (neuron) into an output signal
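
The three points above can be illustrated with a minimal sketch of a single neuron; the weights, bias, and inputs are illustrative values, and the sigmoid is only one common choice of activation function:

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs (the architecture's connections carry
    the weights) passed through a sigmoid activation function to
    produce the unit's output signal."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-activation))  # sigmoid activation

# A neuron with two inputs; all numbers here are illustrative.
out = neuron_output([0.5, -1.0], [0.8, 0.2], 0.1)
```

Training/learning algorithms then adjust the weights and bias; the architecture and activation function stay fixed.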

3.3.3 Types of ANNs

Classification of ANNs is usually based on their network topology, node characteristics, and learning and training algorithms (Fausett, 1994). Alternatively, Jane et al. (1996) conclude that ANNs can be grouped into two categories based on the connection pattern (as in figure 3.2):
 Feedforward neural network – the connections between the units do not form a directed cycle; there are no cycles or loops in the network
 Feedback (recurrent) neural network – the connections between units form a directed cycle

A feed-forward network with one hidden layer of neurons is given in figure 3.2 (Konar, 1999). As can be seen, the most common family of feed-forward networks is the multilayer perceptron, where neurons are organized into layers with unidirectional connections between them. Different connections yield different network behaviors, and different network architectures require appropriate learning algorithms (Jane et al., 1996). In contrast with feed-forward neural networks, in recurrent (feedback) networks the inputs of each layer can be influenced by the output of the previous layer.
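
The feed-forward behavior described above can be sketched as a single forward pass through a multilayer perceptron with one hidden layer; all weights here are illustrative, not taken from any study:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feedforward(inputs, hidden_weights, hidden_biases, out_weights, out_bias):
    """One forward pass through a one-hidden-layer network: signals
    flow input -> hidden -> output with no cycles or loops."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(hidden_weights, hidden_biases)]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)

# Two inputs, two hidden neurons, one output (illustrative parameters).
y = feedforward([1.0, 0.5],
                [[0.4, -0.6], [0.3, 0.8]], [0.0, -0.1],
                [0.7, -0.2], 0.05)
```

A recurrent network would instead feed some of these outputs back as inputs at the next time step, forming a directed cycle.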
Figure 3.2 Taxonomy of feed-forward and recurrent/feedback neural network architectures

3.3.4 Determination of Network Structure

Network structure is defined by the number of hidden layers and the number of neurons. These numbers determine the number of parameters that must be estimated. This is key in the training process: if the number of parameters to be estimated is insufficient, the model developed may not be able to fit the training data. In contrast, if the number of parameters is too large relative to the available number of training samples, the network may lose its ability to generalize (Maier and Dandy, 2001).
Generally, the network structure is determined by trial and error. However, some general upper and lower bounds on these numbers exist. Hecht-Nielsen (1987) suggested the following upper limit on the number of hidden-layer neurons in order to ensure that ANNs are able to approximate any continuous function:

NH ≤ 2NI + 1

Where
NH : number of neurons in hidden layer,
NI : number of inputs (number of neurons in input layer)

Moreover, to prevent the network from overfitting the training data, Rogers and Dowla (1994) proposed the following relationship between the number of training samples (the number of input data) and the number of network neurons:

NH ≤ NTR / (NI + 1)

Where
NTR : number of training samples (number of training data set)
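
The two upper limits can be combined into a small helper; the example figures (29 inputs, 300 training samples) are hypothetical:

```python
def hidden_neuron_bounds(n_inputs, n_training_samples):
    """Upper limits on the number of hidden-layer neurons:
    Hecht-Nielsen (1987):     NH <= 2*NI + 1
    Rogers and Dowla (1994):  NH <= NTR / (NI + 1)
    The usable upper bound is the smaller of the two."""
    hecht_nielsen = 2 * n_inputs + 1
    rogers_dowla = n_training_samples // (n_inputs + 1)
    return min(hecht_nielsen, rogers_dowla)

# Hypothetical case: 29 input parameters and 300 training samples.
limit = hidden_neuron_bounds(29, 300)  # Rogers-Dowla binds: 300 // 30 = 10
```

With few training samples the Rogers-Dowla limit usually binds; with abundant data the Hecht-Nielsen limit does.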

3.3.5 ANNs as Function Predictor

During ANN training, the connection weights between the neurons are continuously altered so that the error between the computed output value and the target output value is minimized. This is done after each pattern is presented to the network. In general, the more patterns presented to the network during the training process, the better a predictor the ANN will become. Nevertheless, Tapkin (2004) suggested that the network should be tested for its generalization, in practice to verify that the ANN model has established a functional relationship between the input and output data rather than memorizing the behavior, since ANNs basically learn by example.

You can check some linear regression models:

1. Multiple linear regression (MLR) model
http://www.stat.yale.edu/Courses/1997-98/101/linmult.htm
https://corporatefinanceinstitute.com/resources/knowledge/other/multiple-linear-regression/
http://mezeylab.cb.bscb.cornell.edu/labmembers/documents/supplement%205%20-%20multiple%20regression.pdf
https://www.mathworks.com/help/stats/regress.html
2. Partial least squares (PLS) model
https://stats.idre.ucla.edu/wp-content/uploads/2016/02/pls.pdf
https://www.mathworks.com/help/stats/partial-least-squares.html#:~:text=Partial%20least%2Dsquares%20(PLS),of%20the%20original%20predictor%20variables.
https://www.mathworks.com/help/stats/partial-least-squares-regression-and-principal-components-regression.html
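
As a rough illustration of the MLR model linked above, ordinary least squares can be computed directly in Python; the data here are synthetic, generated from known coefficients (in MATLAB the equivalent is the regress function):

```python
import numpy as np

# Multiple linear regression by least squares: y = b0 + b1*x1 + b2*x2.
# Synthetic, noise-free data generated from known coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.5 + 2.0 * X[:, 0] - 0.5 * X[:, 1]

A = np.column_stack([np.ones(len(X)), X])     # prepend an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # estimates [b0, b1, b2]
```

Because the synthetic data contain no noise, the least-squares fit recovers the generating coefficients essentially exactly.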

6.0 METHODOLOGY (generic)

In this section, the methods applied are described in chronological order. The study employed four different steps. These are:
 Data collection
 Modeling using ANN / data analysis
 Testing and training? Analysis tools
 etc.
Generic flowchart (should be in line with your model and case study)

Project Initialization

PHASE I: DATA COLLECTION
Data preprocessing and analysis

PHASE II: MODELING DESIGN I
Network structure
Variables selection and output generation (prediction)

PHASE III: MODELING DESIGN II
Model training and testing
Statistical analysis

PHASE IV: MODELING DESIGN III
Model validation/testing to evaluate the performances
Statistical analysis

PHASE V: MODEL EXECUTION IN GUI ENVIRONMENT
Employ the model in a graphical user interface (Simulink and LabVIEW)
Experimental work to run the model (laboratory and in-situ stations)

PHASE VI: PORTABLE WATER QUALITY KIT DEVELOPMENT
Hardware and software communication
GUI communication for input-output (I/O)

Completion of the thesis

Figure 6.2 Steps of the proposed research

6.3 Description of Methodology

Phase I : Data Collection and Preprocessing

Input-target training data are usually pretreated, as explained above, in order to improve the numerical conditioning of the optimization problem and the behavior of the training process. The data are then normally divided into three subsets: training, validation, and testing.

As expected, the sensitivity of the 29 parameters towards the target parameters will differ. Some of them will have a strong correlation, and some the reverse. Different combinations of input parameters may therefore give different predictions.
1. Normalization
2. Outliers, etc.
3. Total number of data
4. Data division method (check in Matlab)
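
The preprocessing steps listed above can be sketched as follows: min-max normalization plus a random 70/15/15 data division (the proportions are illustrative, similar in spirit to MATLAB's dividerand):

```python
import random

def normalize(values):
    """Min-max normalization to [0, 1], a common pretreatment
    before ANN training."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def divide_data(samples, train=0.70, val=0.15, seed=1):
    """Randomly divide the data into training, validation and
    testing subsets (remainder goes to testing)."""
    samples = samples[:]
    random.Random(seed).shuffle(samples)
    n_train = int(round(train * len(samples)))
    n_val = int(round(val * len(samples)))
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

data = list(range(100))
tr, va, te = divide_data(data)  # 70 / 15 / 15 split
```

Outlier screening would be applied before the division, so that extreme values do not distort the normalization range.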

Phase II: Model Design

The main aim of this study was to build multivariate models capable of simultaneously predicting the water quality values in the river using an independent set of measured quality variables. In this study, a FANN model is proposed and chosen as the mathematical tool, given the consolidation of its theoretical background and the development of its underlying learning and optimization algorithms. All the computations were performed using Excel 97 and Matlab 7.0.

Please check time series models:

Y(t) = Fn(x(t), Y(t-1), ...)  (NARX model, etc.)
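
The NARX input structure above can be sketched by building lagged training pairs, where each target Y(t) is paired with the current exogenous input x(t) and the past outputs Y(t-1), ..., Y(t-lags); the toy series below is purely illustrative:

```python
def narx_inputs(x, y, lags=1):
    """Build NARX-style training pairs: each input row is
    [x(t), Y(t-1), ..., Y(t-lags)] and the target is Y(t)."""
    rows, targets = [], []
    for t in range(lags, len(y)):
        rows.append([x[t]] + [y[t - k] for k in range(1, lags + 1)])
        targets.append(y[t])
    return rows, targets

# Toy exogenous input x and output series y, for illustration only.
x = [0.1, 0.2, 0.3, 0.4]
y = [1.0, 1.1, 1.3, 1.6]
rows, targets = narx_inputs(x, y, lags=1)
```

The resulting rows can then be fed to an ordinary feed-forward network, which is how a NARX model reduces a time-series problem to static function approximation.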
Phase III: Model Training

The training subset data are used to accomplish the network learning and to fit the network weights by minimizing an appropriate error function. Backpropagation is the training technique usually used for this purpose. It refers to the method of computing the gradient of the case-wise error function with respect to the weights of a feedforward network. The performance of the networks is then compared by independently evaluating the error function on the validation subset data.
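
The gradient-descent idea behind backpropagation can be sketched for a single sigmoid unit; full backpropagation chains this error term backwards through the hidden layers, and the learning rate and initial weights below are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(w, b, x, target, lr=0.5):
    """One weight update for a single sigmoid neuron: compute the
    output, the gradient of the squared error E = 0.5*(out-target)^2
    with respect to each weight, and descend along it."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    out = sigmoid(z)
    delta = (out - target) * out * (1.0 - out)  # dE/dz for a sigmoid unit
    w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
    b = b - lr * delta
    return w, b, 0.5 * (out - target) ** 2

# Repeatedly present one training pattern and record the error.
w, b = [0.2, -0.1], 0.0
errors = []
for _ in range(200):
    w, b, e = backprop_step(w, b, [1.0, 0.5], 1.0)
    errors.append(e)
```

The recorded error shrinks over the iterations, which is exactly the behavior monitored on the validation subset during training.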

Phase IV: Model Validation

Actual values are compared to the values predicted by the neural network models to evaluate model performance, that is, how accurately the network predicts targets for inputs that are not in the training set (this is sometimes referred to as holdout validation). Furthermore, the models were validated by a statistical response study using the RMSE. In addition, the prediction validations were made based on the percentage error, PE.
Statistical analysis (R², RMSE, MSE, error or residue analysis)
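
The statistical measures named above can be computed directly; the PE sign convention used here, (actual - predicted)/actual, is one common choice, and the sample values are illustrative:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def percentage_error(actual, predicted):
    """Per-sample percentage error, 100 * (actual - predicted) / actual."""
    return [100.0 * (a - p) / a for a, p in zip(actual, predicted)]

actual = [2.0, 4.0, 6.0]      # illustrative observed values
predicted = [2.2, 3.8, 6.0]   # illustrative model predictions
```

MSE is simply the square of the RMSE, so only one of the two needs to be computed from scratch.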
