Guidelines For Modelling
The artificial neural network (ANN) is a powerful computational technique for modelling complex non-linear relationships, particularly in situations where the explicit form of the relation between the variables involved is unknown. The basic structure of an ANN model usually comprises three distinctive layers: the input layer, where the data are introduced to the model and the weighted sum of the inputs is computed; the hidden layer or layers, where the data are processed; and the output layer, where the results of the ANN are generated [1].
Figure 3.1 Typical Model of a Feed-forward Artificial Neural Network
A feed-forward network with one hidden layer of neurons is given in Figure 3.2 (Konar, 1999). As can be seen, the most common family of feed-forward networks is the multilayer perceptron, in which neurons are organized into layers with unidirectional connections between them. Different connections yield different network behaviours, and different network architectures require appropriate learning algorithms (Jane et al., 1996). In contrast with feed-forward neural networks, in recurrent (feedback) networks the outputs of a layer can be fed back as inputs to the same or preceding layers.
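The feed-forward computation described above can be sketched as follows. This is a minimal illustration, assuming a sigmoid hidden layer and a linear output layer; the activation functions and layer sizes of the networks actually used in this study may differ.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass through a one-hidden-layer feed-forward network:
    weighted sum of the inputs, sigmoid activation at the hidden layer,
    then the output layer's weighted sum."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))  # hidden-layer activations
    return W2 @ h + b2                        # linear output layer

# Example: 3 inputs, 4 hidden neurons, 1 output (weights are arbitrary here)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
y = forward(np.array([0.5, -1.0, 2.0]), W1, b1, W2, b2)
print(y.shape)  # one output neuron -> shape (1,)
```

Note how information flows strictly forward, layer by layer; a recurrent network would additionally feed `h` or `y` back into a later forward pass.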
Figure 3.2 Taxonomy of feed-forward and recurrent/feedback neural network architectures
The network structure is defined by the number of hidden layers and the number of neurons; these numbers in turn determine the number of parameters that must be estimated. This is key in the training process: if the number of parameters to be estimated is insufficient, the model developed may not be able to fit the training data, whereas if the number of parameters is too large relative to the number of available training samples, the network may lose its ability to generalize (Maier and Dandy, 2001).
Generally, the network structure is determined by trial and error; however, some general upper and lower bounds on these numbers exist. Hecht-Nielsen (1987) suggested the following upper limit on the number of hidden-layer neurons in order to ensure that ANNs are able to approximate any continuous function:

NH ≤ 2NI + 1    (1)

where
NH : number of neurons in the hidden layer,
NI : number of inputs (number of neurons in the input layer).
Moreover, to prevent the network from overfitting the training data, Rogers and Dowla (1994) proposed the following relationship between the number of training samples (number of input data) and the number of network neurons:

NH ≤ NTR / (NI + 1)    (2)

where
NTR : number of training samples (size of the training data set).
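The two bounds above can be combined into a simple sizing check. The function name and the example figures (5 inputs, 120 training samples) are illustrative, not values from this study:

```python
def hidden_neuron_bounds(n_inputs, n_training_samples):
    """Upper bounds on the hidden-layer size:
    Eq. (1), Hecht-Nielsen:   NH <= 2*NI + 1   (universal-approximation bound)
    Eq. (2), Rogers & Dowla:  NH <= NTR/(NI+1) (overfitting bound)
    Returns the tighter (smaller) of the two bounds."""
    upper_hecht_nielsen = 2 * n_inputs + 1
    upper_rogers_dowla = n_training_samples // (n_inputs + 1)
    return min(upper_hecht_nielsen, upper_rogers_dowla)

# e.g. 5 input variables and 120 training samples
print(hidden_neuron_bounds(5, 120))  # min(11, 20) -> 11
```

With few training samples, Eq. (2) becomes the binding constraint; with ample data, Eq. (1) dominates.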
During ANN training, the connection weights between the neurons are continuously adjusted so that the error between the computed output value and the target output value is minimized; this adjustment is performed after each pattern is presented to the network. In general, the more patterns presented to the network during the training process, the better a predictor the ANN will become. Nevertheless, since ANNs fundamentally learn by example, Tapkin (2004) suggested that the network should be tested for its generalization ability, that is, to verify whether the ANN model has established a functional relationship between the input and output data rather than merely memorized their behaviour.
In this section, the methods applied are described in chronological order. The study employed four different steps:
- Data collection
- Modelling using ANN and data analysis
- Training and testing
- Analysis tools
The generic flowchart of the study, following project initialization, comprises the phases below.

Phase I: Data collection. Data preprocessing and analysis.
Phase II: Modelling design I. Network structure; variable selection and output generation (prediction).
Phase III: Modelling design II. Model training and testing; statistical analysis.
Phase IV: Modelling design III. Model validation/testing to evaluate the performances; statistical analysis.
Phase V: Model execution in a GUI environment. Employing the model in a graphical user interface using Simulink and LabVIEW; experimental work to run the model (laboratory and in-situ stations).
Phase VI: Portable water quality kit development. Hardware and software communication; GUI communication for input/output (I/O).
The main aim of this study was to build multivariate models capable of simultaneously predicting water quality values in the river from an independent set of measured quality variables. A feed-forward ANN (FANN) model was chosen as the mathematical tool, owing to its consolidated theoretical background and its well-developed learning and optimization algorithms. All computations were performed using Excel 97 and Matlab 7.0.
The training subset data are used to accomplish the network learning and to fit the network weights by minimizing an appropriate error function. Backpropagation is the training technique usually used for this purpose; it refers to the method for computing the gradient of the case-wise error function with respect to the weights of a feed-forward network. The performance of the networks is then compared by independently evaluating the error function on the validation subset data.
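A single backpropagation training step for a one-hidden-layer network can be sketched as below. This is a generic illustration with a sigmoid hidden layer, linear output, and mean-squared error trained by gradient descent; the toy data and learning rate are assumptions for demonstration, not the configuration used in this study (which was implemented in Matlab).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(X, T, W1, b1, W2, b2, lr=0.1):
    """One backpropagation step: forward pass, gradient of the
    mean-squared error with respect to every weight, then a
    gradient-descent update (weights are modified in place)."""
    # forward pass
    H = sigmoid(X @ W1 + b1)         # hidden activations, shape (n, n_hidden)
    Y = H @ W2 + b2                  # linear outputs, shape (n, n_out)
    E = Y - T                        # case-wise output errors
    n = X.shape[0]
    # backward pass: chain rule from the error back to each layer's weights
    dW2 = H.T @ E / n
    db2 = E.mean(axis=0)
    dH = (E @ W2.T) * H * (1 - H)    # propagate error through the sigmoid
    dW1 = X.T @ dH / n
    db1 = dH.mean(axis=0)
    # gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    return 0.5 * float((E ** 2).mean())

# toy regression: learn y = x1 + x2 from 200 samples
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
T = X.sum(axis=1, keepdims=True)
W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)
errors = [train_step(X, T, W1, b1, W2, b2) for _ in range(500)]
print(errors[0] > errors[-1])  # error decreases over training
```

Each step repeats the forward pass, error computation, and weight update described above; in practice the loop stops when the error on a held-out validation subset stops improving.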
Actual values in the training data sets are compared with the values predicted by the neural network models in order to evaluate model performance, that is, how accurately the network predicts targets for inputs that are not in the training set (this is sometimes referred to as holdout validation). Furthermore, those models were validated by a statistical response study using the root mean squared error (RMSE), while the prediction validations were based on the percentage error (PE).
Statistical analysis (R², RMSE, MSE, error/residual analysis)
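The performance statistics listed above can be computed as follows. The function name and the three-point example data are illustrative only; PE is taken here as the per-sample absolute percentage error, one common convention:

```python
import numpy as np

def performance_metrics(target, predicted):
    """Compute the evaluation statistics used in this study:
    MSE, RMSE, coefficient of determination R^2, and mean percentage error."""
    target = np.asarray(target, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = target - predicted
    mse = float((residuals ** 2).mean())
    rmse = mse ** 0.5
    ss_tot = float(((target - target.mean()) ** 2).sum())
    r2 = 1.0 - float((residuals ** 2).sum()) / ss_tot
    pe = 100.0 * np.abs(residuals / target)   # percentage error per sample
    return {"MSE": mse, "RMSE": rmse, "R2": r2, "PE_mean": float(pe.mean())}

m = performance_metrics([2.0, 4.0, 5.0], [2.2, 3.8, 5.1])
print(round(m["RMSE"], 3))  # 0.173
```

RMSE and MSE summarize the size of the residuals in the units of the target variable (and its square), while R² expresses the fraction of the target's variance explained by the model.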