Thesis Full Report
Prediction Model for Resource Requirement Analysis of Network Projects Using Artificial Neural Network

Thesis submitted in partial fulfilment of the requirements for the award of the Degree of
MASTER OF TECHNOLOGY in Construction Technology and Management
By
ANKUR MITRA
16CM38F
MAY 2018
DECLARATION
I hereby declare that the thesis entitled “Prediction Model for Resource Requirement
Analysis of Network Projects Using Artificial Neural Network”, which is being submitted at
the National Institute of Technology Karnataka, Surathkal in partial fulfilment of the
requirements for the award of the Degree of Master of Technology in Construction
Technology and Management, is a bonafide report of the research work carried out by me.
The material contained in this report has not been submitted to any University or Institution
for the award of any degree.
……………………………………………..
ANKUR MITRA
(16CM38F)
Date:
CERTIFICATE
This is to certify that the thesis report entitled “Prediction Model for Resource Requirement
Analysis of Network Projects Using Artificial Neural Network” submitted by Mr. ANKUR
MITRA (16CM38F) is a bonafide record of the work carried out by him for the partial
fulfilment of the requirements for the award of Master of Technology in Construction
Technology and Management, Department of Civil Engineering, National Institute of
Technology Karnataka, Surathkal.
……………………………………………….
Dr. C. P. DEVATHA
Assistant Professor
N.I.T.K, Surathkal
……………………………………………….
Chairman – DPGC
N.I.T.K, Surathkal
ACKNOWLEDGEMENT
I would like to express my sincere appreciation to my project guide and supervisor Dr.
C.P. Devatha for her continued support and encouragement. She always gave me the freedom
to work at my own unrelenting pace and gave her valuable suggestions whenever I required
them. Her critical approach to every nuanced aspect was crucial in making this report as
flawless as possible.
I am very thankful to L&T Construction - Water & Effluent Treatment IC for giving me the
opportunity and guiding me throughout my summer internship. It was a unique and very
valuable learning experience to be involved in.
During the period of my internship, I received generous help from many departments of the
IT City, Mohali site, such as Accounting, Planning, Contracts and Tendering, Quality Control,
and Plant and Machinery, which I would like to put on record here with deep gratitude and
great pleasure.
I would also like to extend my warmest gratitude to Mr. Manoj Kumar Dubey (Project
Manager of IT City, Mohali during my visit) for allowing me to encroach upon his precious
time and giving me all the necessary help and insights required for my study.
I would like to extend my appreciation towards all the staff members of the Mohali project
site and L&T internal BIS students of NITK, Surathkal for their encouragement, support and
guidance throughout this venture.
Ankur Mitra
(16CM38F)
ABSTRACT
Construction projects are highly complex and subtly evolving endeavours aimed at time-bound
achievement of predetermined objectives. Achieving the exact goals precisely, without
hindrances and delays, is still an elusive art. A majority of projects still work on half-baked
predictions and results obtained from previous experience. The biggest challenge facing the
construction industry today is the need to accurately optimize all the essential parameters of
change.
In this study, an attempt has been made to learn, from past construction projects, all the
important factors that control resources and their availability (resource parameters cover the
4Ms in terms of activities, namely excavation and backfilling, masonry, piling, road laying,
reinforcement and concreting). The success or failure of past projects has been utilised to
train a model for future projects that analyses these changing factors to a quantifiable degree
and applies them to obtain a close fit of resource requirement and scheduling for a particular
project. In this study, an Artificial Neural Network (ANN) has been used to learn and train a
model coded in MATLAB. The Non-linear Auto-Regressive network with eXogenous input
(NARX) model extracts vital information and hidden patterns from a large database to create
near-accurate predictions.
All subsequent data have been obtained from individual sites under the Water & Effluent
Treatment IC of L&T. Within the limited scope of the project, resource analysis of waste water
(& network) projects is used. The results obtained underline the need to quantify unknown
factors to a suitable degree and to classify future projects for accurate predictions. The
finalized weights for the parameters of the case studies were obtained. In the 1st case study,
the weights in order of the above parameters were 5.1159, 3.0205, 2.2348, 1.087 and 0.5289,
with a mean squared error of 170.85. For the 2nd case study, the weights in order of the
parameters above were 12.666, 12.902, 6.514 and 1.307, with a mean squared error of
7.17 × 10⁶. The results obtained from the prediction model are within the acceptable range, as
shown in the auto-correlation error graphs. Although the model works well for the two case
studies, errors occurred due to data insufficiency. The use of Artificial Intelligence methods
like ANN enables very flexible prediction cycles for future projects.
TABLE OF CONTENTS
Acknowledgement………………………………………………………………………..….iv
Abstract..............................................................................................................................…...v
List of Figures……………………………………………………………………………......ix
List of Notations……………………………………………………………………………..xi
CHAPTER 1 INTRODUCTION…………………….……….……………………………..1
1.1 Artificial Neural Network……………….…………………………………...........4
1.1.1 Background…………………..….………………………………………4
1.1.2 Feed Forward ANN……………….…………….……………………….5
1.1.3 Feedback ANN……......…………………………………………………6
1.2 Machine Learning in ANN….....………….……………………………………….7
1.3 ANN in Construction Projects…………….……………………………………….8
1.4 Decision Support System………………….……………………………………....9
1.5 Objective of the study………….………………………………………………...10
1.6 Scope of the study………….…………………………………………………….10
CHAPTER 3 METHODOLOGY…………………….……………………………………26
3.1 Division of Data……………………………………….........................................27
3.2 Data Processing…………………………………………......................................28
3.3 Determination of Network Architecture………….………………………………28
3.4 Model Building Process….....……………………………………………………29
3.4.1 Obtaining Training Data………………………………………………..29
3.4.2 Training of Input Data…………………………………….....................30
3.4.2.1 Transfer Function………………….........................................31
3.4.2.2 Levenberg Marquardt Algorithm…………………………….32
3.4.2.3 Scaled Conjugate or Conjugate Gradient Algorithm………...33
3.4.2.4 Other Factors………………………………………………....34
3.4.3 Testing and Validation………………………………………................35
3.4.3.1 Problem of Under-fitting or Overfitting……………………...36
3.4.4 Error Analysis – Performance Measures……………………………….36
3.5 Forecast Performance Measures………………………………………………….36
3.5.1 Mean Squared Error……………………………………………………36
3.5.2 Sum of Squared Error…………………………………………………..37
3.5.3 Signed Mean Squared Error……………………………………………37
3.5.4 Root Mean Squared Error………………………………………….......37
3.5.5 Normalized Mean Squared Error………………………………………37
3.5.6 Mean Forecast Error……………………………………………………38
3.5.7 Mean Absolute Error…………………………………………………...38
3.5.8 Mean Absolute Percentage Error………………………………………38
3.5.9 Mean Percentage Error…………………………………………………39
4.2.3 Civil Works………….…………………………………………………47
4.2.4 Electrical Works………….…………………………………………….47
CHAPTER 6 CONCLUSION……………………………………………………………...64
REFERENCES……………………………………………………………………………...65
LIST OF FIGURES
Fig 4.4. Reinforcement Works for Ghat Floor Slab at Chaudhary Tola Ghat……………….45
Fig 4.5. Concrete Work under progress using Barge Mounted Batching Plant………….......46
Fig 5.1. Regression Graph of Case Study 1………………….……………...……………….49
LIST OF NOTATIONS
CM Construction Management
GA Genetic Algorithm
LM Levenberg-Marquardt algorithm
NTSTOOL Neural Time Series Tool
SD Standard Deviation
CHAPTER 1
INTRODUCTION
This brings us to the problem of planning resources. Planning, per se, is a phenomenon based
on experience and predictions. There may be cases when the resources planned fall short of
the required target or exceed it by far, both cases causing huge cost burdens to the overall
project. A systematic approach to resource planning with accurate predictions has not been
achieved so far. Many resource optimization techniques have been used, such as resource
levelling, resource smoothing, data gathering and representation, and analytical techniques.
The use of PRIMAVERA and MS Project for resource optimization has not yet yielded
sufficient results.
Few companies can remain competitive in today's demanding business environment without
effectively managing the cost of resources. In practice, basic PERT and CPM scheduling
techniques have proven helpful only when the project deadline is not fixed and the resources
are not constrained by availability or time. Since this is not practical even for small-sized
projects, several techniques have been used to modify CPM results to account for practical
considerations. In dealing with project resources, the two techniques already mentioned are
resource allocation and resource levelling. Resource allocation (sometimes referred to as
constrained-resource scheduling) attempts to reschedule the project
tasks so that a limited number of resources can be efficiently utilized while keeping the
unavoidable extension of the project to a minimum. Resource levelling (often referred to as
resource smoothing), on the other hand, attempts to reduce the sharp variations among the
peaks and valleys in the resource demand histogram while maintaining the original project
duration. These techniques, as such, deal with two distinct sub-problems that can only be
applied to a project one after the other rather than simultaneously. Accordingly, they do not
guarantee (either individually or combined) a project schedule that minimizes the overall
project time or cost.
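The two sub-problems can be illustrated with a toy sketch (hypothetical task figures, not data from this study): build the resource demand histogram from the schedule, then shift one non-critical task within its float and observe the peak fall while the project duration stays the same.

```python
# Toy resource-levelling sketch (assumed data, not from this study):
# each task is (start_day, duration_days, workers_per_day).
tasks = {
    "excavation": (0, 4, 6),
    "masonry":    (0, 3, 5),
    "backfill":   (4, 3, 4),
}

def histogram(tasks, horizon=8):
    """Daily resource demand histogram over the project horizon."""
    demand = [0] * horizon
    for start, dur, res in tasks.values():
        for day in range(start, start + dur):
            demand[day] += res
    return demand

before = histogram(tasks)
# Levelling: shift the non-critical masonry task within its float,
# reducing the peak demand without extending the project duration.
tasks["masonry"] = (4, 3, 5)
after = histogram(tasks)
print(max(before), max(after))
```

The total resource-days are unchanged; only the peaks and valleys of the histogram are smoothed, which is exactly the distinction drawn above between levelling and allocation.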
There are a number of factors, however, that make resource management a difficult task, as
summarized in the following points:
Segregation of Resource Management: Various researchers have introduced a
number of techniques to deal with individual aspects of resource management, such as
resource allocation, resource levelling, cash flow management, and time-cost trade-off
(TCT) analysis. The work of Talbot and Patterson (1979) and Gavish and Pirkul
(1991), for example, deals with resource allocation, while the work of Easa (1989)
and Shah et al. (1993) deal with resource levelling. Other models focused only
on TCT analysis, e.g., Liu et al. (1995). While these studies are beneficial, they deal
with isolated aspects that can be applied to projects one after the other, rather than
simultaneously. Little effort has been made to achieve combined resource optimization
because of the inherent complexity of projects and the difficulties associated with
modelling all aspects combined.
Inadequacy of Traditional Optimization Algorithms: In the past few decades,
traditional resource optimization was based on either mathematical methods or
heuristic techniques. Mathematical methods, such as integer, linear, or dynamic
programming, have been proposed for individual resource problems. Mathematical
methods, however, are computationally intractable for any real-life project of
reasonable size. In addition, mathematical models suffer from being complex in
their formulation and may be trapped in a local optimum. Heuristic methods, on the
other hand, use experience and rules-of-thumb, rather than rigorous mathematical
formulations. Despite their simplicity, heuristic methods perform with varying
effectiveness when used on different project networks, and there are no hard
guidelines that help in selecting the best heuristic approach to use.
Difficulties with Simulation Modelling: During the past three decades, computer
simulation has been introduced to support the efficient use of construction resources.
Even though researchers were interested in its ability to mimic real-world
construction processes on computers, construction practitioners may find it difficult to
master. As a very beneficial tool for resource planning, extensive research was
directed at developing simulation models of construction operations. However, many
of the existing tools require knowledge of computer programming and a simulation
language, and lack integration with existing project management software and with
optimization algorithms.
Availability of a New Breed of Tools: Recent developments in computer science
have produced a new breed of tools that are beneficial to be utilized for construction
applications.
Based on recent advances in artificial intelligence, a new optimization technique, genetic
algorithms (GA) has emerged. Simulating natural-evolution and survival-of-the-fittest
mechanisms, GAs apply a random search for the optimum solution to a problem. Due to their
perceived benefits, GAs have been used successfully to solve several engineering and
construction management problems.
In the recent past, a move has been made towards the use of algorithms to solve the problem
of optimization. Heuristic, analytical and multi-objective optimization techniques have been
employed to solve the critical problems of resource allocation. In fact, there is not even a
universally accepted definition of "optimum" as in single-objective optimization, which
makes it difficult to compare the results of one method with those of another, because the
decision about what the "best" answer is normally depends on the decision maker. As noted
above, mathematical methods are computationally intractable for real-life projects and may
be trapped in a local optimum, while heuristic methods, though proposed in many variants for
resource allocation, perform with varying effectiveness on different project networks and
therefore cannot guarantee optimum solutions. Furthermore, their inconsistent solutions have
contributed to large discrepancies among the resource-constrained capabilities of commercial
project management software.
In this study, a new genre of decision making is used to solve the problem of resource
optimization. Artificial intelligence and deep learning have gathered speed in the past decade.
An Artificial Neural Network has been employed to learn from previous projects and exercise
a better understanding of resource allocation for future projects. The complexity of
mathematical methods and the unpredictability of heuristic methods are eliminated through
the use of the Artificial Neural Network (ANN). The process involving an ANN uses vast
amounts of data from previous projects to set a pattern and classify the important factors that
sway decisions on resources.
An ANN is composed of simple processing elements called nodes (artificial neurons), which
are connected by links and interact with each other. The nodes can take input data and
perform simple operations on the data. The result of these operations is passed to other
neurons. The output at each node is called its activation or node value. ANNs are considered
nonlinear statistical data modelling tools in which the complex relationships between inputs
and outputs are modelled or patterns are found.
Each link is associated with a weight. ANNs are capable of learning, which takes place by
altering the weight values. Figure 1.1 shows a simple ANN:
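The role of links, weights and activations can be sketched for a single node (a hypothetical Python illustration, not code from this thesis; the sigmoid transfer function is an assumption):

```python
import math

# Illustrative sketch of one ANN node: each input link carries a
# weight; the node sums the weighted inputs plus a bias and passes
# the sum through a transfer function to produce its activation
# (node value), which is then passed on to other neurons.
def neuron_activation(inputs, weights, bias):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))  # sigmoid transfer

# Two input values with assumed link weights; altering the weights
# changes the activation, which is how learning takes place.
print(neuron_activation([0.5, 0.8], [0.4, -0.2], 0.1))
```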
There are many different types of neural networks, from relatively simple to very complex.
ANNs can be classified into feed-forward and feedback (recurrent) networks.
In a feed-forward network, information flows in one direction only, and no connection loops
back to the network itself. When feed-forward neural networks are extended to include
feedback connections, they are called recurrent neural networks, as shown in Figure 1.2.
In supervised learning, a teacher is assumed to be present during the learning process: a
comparison is made between the network's computed output and the correct expected output
to determine the error. The error can then be used to change the network parameters, which
results in an improvement in performance. In unsupervised learning, the target output is not
presented to the network. It is as if there is no teacher to present the desired patterns; hence,
the system learns on its own by discovering and adapting to structural features in the input
patterns (Deepthi I. Gopinath and G.S. Dwarakish, 2015).
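A minimal sketch of the supervised case (an assumed single linear neuron, not the thesis model): the error between the computed output and the teacher's expected output is used to alter the weights, improving performance step by step.

```python
# Hypothetical single linear neuron trained with a teacher signal.
def train_step(weights, bias, inputs, target, rate=0.1):
    output = sum(x * w for x, w in zip(inputs, weights)) + bias
    error = target - output                        # teacher's correction
    weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    bias = bias + rate * error
    return weights, bias, error

w, b = [0.0, 0.0], 0.0
for _ in range(200):       # repeated corrections drive the error to zero
    w, b, err = train_step(w, b, [1.0, 2.0], target=3.0)
print(abs(err) < 1e-6)
```

The same error-driven weight alteration, generalized to many layers via back-propagation, is what the training algorithms described later in this thesis perform.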
1.3 ANN in Construction Projects
Since the late 1980s, several investigators have applied ANNs in civil engineering to carry
out a variety of tasks such as prediction, optimization, system modelling and classification.
The applications of ANNs in Construction Management include the following fields: Cost,
Productivity, Risk Analysis, Safety, Duration, Dispute, Unit Rate and Hybrid Models. ANNs
were found to learn the relationships between input and output provided through training data
and could generalize the output, making them suitable for non-linear problems where
judgment, experience and surrounding conditions are the key features. ANNs typically
comprise three layers: an input layer with input neurons, hidden layer(s) with hidden neurons,
and an output layer with output neurons.
Each neuron in the input layer is connected to each neuron in the hidden layer and each
neuron in hidden layer is connected to each neuron in the output layer. The number of hidden
layers and number of neurons in each hidden layer can be one or more than one. The number
of input neurons, hidden neurons and output neurons constitute the network architecture.
Before its application the network is trained, i.e., the connection weights and bias values are
fixed, with the help of a mathematical optimization algorithm and using part of the data set
until a very low value of error is attained. The network is then tested with an unseen data set
to judge the accuracy of the developed model. The network is trained using various training
algorithms, which aim at minimizing the error between the observed and network-predicted
values. Networks are classified according to the direction of information flow: either in the
forward direction (feed-forward) or in reverse or lateral directions (recurrent networks).
Generally, three-layer feed-forward or recurrent networks are found to be sufficient in civil
engineering practice.
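The train-then-test procedure just described can be sketched in miniature (a hypothetical Python toy standing in for the thesis's MATLAB workflow; the data and model form are assumptions): part of the data fixes the weights until error is low, and an unseen set judges the model's accuracy.

```python
import random

# Toy train/test illustration (assumed data, not from this thesis).
random.seed(0)
data = [(i / 20.0, 2.0 * (i / 20.0) + 1.0 + random.gauss(0.0, 0.1))
        for i in range(20)]
random.shuffle(data)
train, test = data[:15], data[15:]   # part of the data set vs unseen data

w, b = 0.0, 0.0                      # connection weight and bias value
for _ in range(1000):                # training: minimize observed error
    for x, y in train:
        err = (w * x + b) - y
        w -= 0.1 * err * x           # alter weight in proportion to error
        b -= 0.1 * err

# Testing: mean squared error on data the model has never seen,
# which is how the developed model's accuracy is judged.
mse = sum(((w * x + b) - y) ** 2 for x, y in test) / len(test)
print(mse)
```

If the test error is much larger than the training error, the model has overfitted the training set, the under-/over-fitting problem discussed in the methodology chapter.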
There are several papers and studies that confirm the usefulness of ANNs in carrying out a
variety of prediction, classification, optimization and modelling related tasks in areas of CM.
ANNs are based on the input-output data in which the model can be trained and can always
be updated to obtain better results by presenting new training examples. ANN thus has
significant benefits that make it a powerful tool for solving many problems in the field of
CM.
1.4 Decision Support System (DSS)
A decision support system (DSS) is a computerized information system used to support
decision-making in an organization or a business. A DSS lets users sift through and analyse
massive reams of data and compile information that can be used to solve problems and make
better decisions referred in Figure 1.5. DSSs serve the management, operations and planning
levels of an organization and help people make decisions about problems that may be rapidly
changing and not easily specified in advance—i.e. unstructured and semi-structured decision
problems.
The primary purpose of a DSS is to present information to the user in a way that is
easy to understand. A DSS is beneficial because it can be programmed to generate
many types of reports, all based on user specifications. A DSS can generate information and
output it graphically, such as a bar chart that represents projected revenue, or as a written
report.
Some authors have extended the definition of DSS to include any system that might support
decision making and some DSS include a decision-making software component; Sprague
(1980) defines a properly termed DSS as follows:
DSS tends to be aimed at the less well structured, underspecified problem that upper
level managers typically face;
DSS attempts to combine the use of models or analytic techniques with traditional
data access and retrieval functions;
DSS specifically focuses on features which make them easy to use by non-computer-
proficient people in an interactive mode; and
DSS emphasizes flexibility and adaptability to accommodate changes in the
environment and the decision making approach of the user.
DSSs include knowledge-based systems. A properly designed DSS is an interactive software-
based system intended to help decision makers compile useful information from a
combination of raw data, documents, and personal knowledge, or business models to identify
and solve problems and make decisions.
CHAPTER 2
LITERATURE REVIEW
In this chapter, a number of previous studies that have used Artificial Neural Networks in
model predictions for construction projects are reviewed. Various authors have given
comprehensive accounts of the ways and methods to increase the accuracy of ANN-based
prediction models. The concept of time series analysis is also explained in detail, along with
how it can be implemented in an ANN.
Kulkarni et al. (2017) reviewed applications of ANNs in construction activities related to
prediction of costs, risk and safety, tender bids, as well as labour and equipment productivity.
The review suggested that ANNs had been highly beneficial in correctly interpreting
inadequate input information. It was seen that most of the investigators used feed-forward
back-propagation networks; however, when a single ANN architecture was found to be
insufficient, hybrid modelling in association with other machine learning tools such as
genetic programming and support vector machines proved useful. It was, however, clear
that the authenticity of the data and the experience of the modeller are important in obtaining
good results.
The review confirmed the usefulness of ANNs in carrying out a variety of prediction,
classification, optimization and modelling related tasks in areas of CM. ANNs are based on
the input-output data in which the model can be trained and can always be updated to obtain
better results by presenting new training examples. ANN thus has significant benefits that
make it a powerful tool for solving many problems in the field of CM. However, large scope
still exists for experimenting with a variety of network architectures, training algorithms and
hybrid methods, which could lead to a higher level of model performance. Acceptability of
ANNs for routine use in CM can be increased if clear guidelines to select inputs, network
architecture, learning algorithms and other network control parameters are evolved from an
exhaustive assessment of all past work. Providing a standard benchmark for determining the
accuracy level of construction proposals will help increase the use of ANNs in CM. Large-scale
attempts in future to unlock the potential knowledge in the network system can also go a long
way in increasing user confidence in ANN use. Only a few instances are seen of developed
ANN models being used in practical applications. Implementation of ANNs for live projects,
and a step towards understanding the user-related problems of such implementation, should
be undertaken.
Sodikov (2005) attempted to prove that cost estimation inaccuracy at the conceptual phase
can be reduced to half of its present level by using network simulation models. ANN could be
an appropriate tool for problems involving numerous uncertainties, such as cost estimation at
the conceptual phase. Future work was focused on developing an ANN cost estimation model
incorporating other methods, including fuzzy logic, case-based reasoning and other up-to-date
techniques. A limitation of this study was that only two types of road works were
investigated: new construction and asphalt overlay projects. Some assumptions were accepted
during data analysis, such as the treatment of missing data and the use of synthetic data.
Nevertheless, this study can be reproduced and applied to other road works with a complete
dataset. Some recommendations developed during the study, such as how to choose the
number of variables corresponding to project type and prediction accuracy levels, can be
reproduced for other model types.
Hung and Babel (2009) presented a new approach using an Artificial Neural Network
technique to improve rainfall forecast performance. A real world case study was set up in
Bangkok; 4 years of hourly data from 75 rain gauge stations in the area were used to develop
the ANN model. The developed ANN model is being applied for real time rainfall forecasting
and flood management in Bangkok, Thailand. Aimed at providing forecasts in a near real
time schedule, different network types were tested with different kinds of input information.
Gopinath and Dwarakish (2015) attempted to predict waves at New Mangalore Port Trust
(NMPT), located along the west coast of India, using Feed Forward Back Propagation (FFBP)
with the LM algorithm and a recurrent network called the Non-linear Auto-Regressive with
eXogenous input (NARX) network. Field data from NMPT were used to train and test the
network performance, measured in terms of mean square error (mse) and correlation
coefficient (r). The effect of network architecture on model performance was also studied.
The correlation coefficient was found to be 0.94 for NARX predictions, indicating better
performance than the FFBP network, whose ‘r’ value was 0.9. It was found that for time
series prediction the NARX network outperforms the FFBP network not only in terms of
accuracy but also in terms of the time required for computation.
Their study made use of the relatively new technique of Artificial Neural Networks, which
has been tried and tested in various coastal engineering applications, with FFBP and NARX
networks used to predict waves at NMPT along the west coast of India. Prediction of waves
at NMPT for one year, carried out using yearlong wave data with the FFBP network, gave
satisfactory correlation coefficient ‘r’ values of 0.90 and 0.91 for data sets divided on a
monthly and weekly basis respectively. Using the NARX network, prediction up to 25 weeks
could be achieved with accuracy greater than 0.94 using one week's data, and yearly
prediction could be achieved with accuracy greater than 0.94 using one month's data.
Comparison of the results of the two networks showed NARX performing better, with an ‘r’
of 0.94.
Ezeldin and Sharara (2016) developed three neural networks to estimate the productivity,
within a developing market, for formwork assembly, steel fixing, and concrete pouring
activities. Eighteen experts working in six projects were carefully selected to gather the data
for the neural networks. Ninety-two data surveys were obtained and processed for use by the
neural networks. Commercial software was used to perform the neural network calculations.
The processed data were used to develop, train, and test the neural networks. The results of
the developed framework of neural networks indicate adequate convergence and relatively
strong generalization capabilities. When used to perform a sensitivity analysis on the input
factors influencing the productivity of concreting activities, the framework has demonstrated
a good potential in identifying trends of such factors.
Mandal and Prabharan (2005) described an artificial neural network, namely a recurrent
neural network with the rprop update algorithm, applied to wave forecasting. Measured
ocean waves off Marmugao, on the west coast of India, were used for this study. The recurrent
neural network yielded correlation coefficients of 0.95, 0.90 and 0.87 for 3-, 6- and 12-hourly
wave forecasting respectively. This shows that wave forecasting using a recurrent neural
network yields better results than previous neural network applications.
Shukla et al. (2014) focused on a data mining technique based on artificial neural networks
and its application in runoff forecasting. Long-term and short-term forecasting models were
developed for runoff forecasting using various Artificial Neural Network approaches. The
study compares the various ANN approaches available for runoff forecasting and, on the
basis of this comparison, attempts to identify the better approach from the perspective of the
present research work.
2.1 Time Series Modelling (NTSTOOL)
Time series modelling is a dynamic research area which has attracted the attention of the
research community over the last few decades. The main aim of time series modelling is to
carefully collect and rigorously study the past observations of a time series in order to develop
an appropriate model which describes the inherent structure of the series. This model is then
used to generate future values for the series, i.e. to make forecasts. Time series forecasting
can thus be termed the act of predicting the future by understanding the past. Due to the
indispensable importance of time series forecasting in numerous practical fields such as
business, economics, finance, science and engineering, proper care should be taken to fit
an adequate model to the underlying time series. It is obvious that successful time series
forecasting depends on appropriate model fitting. Much effort has been devoted by
researchers over many years to the development of efficient models to improve forecasting
accuracy. As a result, various important time series forecasting models have evolved.
A time series is a sequential set of data points, typically measured at successive times. It is
mathematically defined as a set of vectors x(t), t = 0, 1, 2, ..., where t represents the elapsed
time. The variable x(t) is treated as a random variable. The measurements taken during an
event in a time series are arranged in proper chronological order.
A time series containing records of a single variable is termed as univariate. But if records of
more than one variable are considered, it is termed as multivariate. A time series can be
continuous or discrete. In a continuous time series observations are measured at every
instance of time, whereas a discrete time series contains observations measured at discrete
points of time. For example, temperature readings, the flow of a river, or the concentration of
a chemical process can be recorded as a continuous time series. On the other hand, the
population of a particular city, the production of a company, or the exchange rate between
two different currencies may represent discrete time series. Usually, in a discrete time series
the consecutive observations are recorded at equally spaced time intervals, such as hourly,
daily, weekly, monthly or yearly separations.
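As a concrete illustration (hypothetical figures), here is a univariate discrete series recorded at equal monthly intervals, together with the simplest possible act of "predicting the future by understanding the past", forecasting each value by the previous observation:

```python
# Hypothetical univariate discrete time series: monthly production
# figures of a company, recorded at equally spaced (monthly) intervals.
x = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]

# Naive one-step forecast: predict x(t) by the past observation x(t-1),
# and measure the mean absolute error of that forecast.
errors = [x[t] - x[t - 1] for t in range(1, len(x))]
mae = sum(abs(e) for e in errors) / len(errors)
print(round(mae, 2))
```

Any fitted model, Box-Jenkins or NARX alike, is only worthwhile if it beats this naive baseline on such error measures.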
The general tendency of a time series to increase, decrease or stagnate over a long period of
time is termed the Secular Trend, or simply the Trend. Thus, it can be said that the trend is a
long-term movement in a time series. For example, series relating to population growth or the
number of houses in a city show an upward trend, whereas a downward trend can be
observed in series relating to mortality rates, epidemics, etc.
Seasonal variations in a time series are fluctuations within a year during the seasons. The
important factors causing seasonal variations are climate and weather conditions, customs,
traditional habits, etc. For example, sales of ice-cream increase in summer and sales of
woollen clothes increase in winter. Seasonal variation is an important factor for businessmen,
shopkeepers and producers when making proper future plans.
The cyclical variation in a time series describes medium-term changes in the series, caused
by circumstances that repeat in cycles. The duration of a cycle extends over a longer period
of time, usually two or more years. Most economic and financial time series show some kind
of cyclical variation. For example, a business cycle consists of four phases, viz. i) Prosperity,
ii) Decline, iii) Depression and iv) Recovery.
Schematically a typical business cycle is shown in Figure 2.1:
2.2 Box-Jenkins Methodology
After describing the various time series models, the next step is to select an appropriate
model that can produce accurate forecasts based on a description of the historical pattern in
the data, and to determine the optimal model orders. The statisticians George Box and
Gwilym Jenkins developed a practical approach to building auto-regressive models that best
fit a given time series. Their work has had fundamental importance in the area of time series
analysis and forecasting.
The Box-Jenkins methodology does not assume any particular pattern in the historical data of
the series to be forecasted. Rather, it uses a three step iterative approach of model
identification, parameter estimation and diagnostic checking to determine the best model
from a general class of Auto-regressive models. This three-step process is repeated several
times until a satisfactory model is finally selected. Then this model can be used for
forecasting future values of the time series. The Box-Jenkins forecast method is shown
schematically in Figure 2.3 below.
used measures for model identification are the Akaike Information Criterion (AIC) and the
Bayesian Information Criterion (BIC), which are defined below:
AIC(p) = n ln(σe²/n) + 2p
BIC(p) = n ln(σe²/n) + p + p ln(n)
Here n is the number of effective observations used to fit the model, p is the number of
parameters in the model and σe² is the sum of squared sample residuals. The optimal model
order is chosen as the number of model parameters that minimizes either AIC or BIC.
Fig 2.4: Flowchart of Methodology followed
In MATLAB, ntstool launches the neural time series application and leads the user through
solving a time series problem using a two-layer feed-forward network. Several methods of
time-series modelling are available, such as the NAR model, the NARx model and the
nonlinear input-output model. The System Identification Toolbox software provides tools for
modelling and forecasting time-series data; both linear and nonlinear black-box and grey-box
models can be estimated for time series data. Particular model types include parametric
autoregressive (AR), autoregressive moving average (ARMA) and autoregressive integrated
moving average (ARIMA) models. For nonlinear time series models, the toolbox supports
nonlinear ARX models. Figure 2.5 below shows the MATLAB time series tool used for
developing the prediction model.
A nonlinear autoregressive network is a type of recurrent neural network that can learn to
predict a time series Y given past values of Y. In recurrent networks, the output depends not
only on the current input to the network but also on its previous inputs and outputs. The
response of a static network at any point in time depends only on the value of the input
sequence at that same point, whereas the response of a recurrent network lasts longer than
the input pulse: its response at any given time depends not only on the current input but on
the history of the input sequence. This is achieved by introducing a tapped delay line in the
network, which makes the input pulse persist beyond its duration by an amount equal to the
delay of the tapped delay line.
The nonlinear autoregressive model predicts the time series y(t) from its d past values, as
shown in Figure 2.6. It follows y(t) = f(y(t−1), y(t−2), ..., y(t−d)).
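The relation y(t) = f(y(t−1), ..., y(t−d)) can be sketched as follows; this is an illustrative stand-in for the MATLAB network, in which a hand-fitted linear f replaces the trained nonlinear network and make_lagged/forecast are hypothetical helper names:

```python
import numpy as np

def make_lagged(y, d):
    """Arrange a series into NAR training pairs:
    inputs [y(t-d), ..., y(t-1)] and target y(t)."""
    X = np.array([y[t - d:t] for t in range(d, len(y))])
    T = y[d:]
    return X, T

def forecast(f, history, d, steps):
    """Closed-loop NAR forecasting: each prediction is fed back
    as an input for the next step, y(t) = f(y(t-1), ..., y(t-d))."""
    buf = list(history[-d:])
    out = []
    for _ in range(steps):
        y_next = f(np.array(buf[-d:]))
        out.append(y_next)
        buf.append(y_next)
    return out

# Illustration with a least-squares linear f (a trained network would replace it)
y = np.sin(np.linspace(0, 20, 200))
X, T = make_lagged(y, d=3)
w, _, _, _ = np.linalg.lstsq(X, T, rcond=None)
f = lambda lags: float(lags @ w)
preds = forecast(f, y, d=3, steps=5)
```

The feedback loop in forecast mirrors the closed-loop mode of a NAR network, where predicted values replace measured ones once the known history runs out.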
Fig 2.7: NAR network diagram with inputs
2.5.1 Structure of Nonlinear ARX Models
A nonlinear ARX model consists of model regressors and a nonlinearity estimator. The
nonlinearity estimator comprises both linear and nonlinear functions that act on the model
regressors to give the model output. The block diagram in Figure 2.9 represents the structure
of a nonlinear ARX model in a simulation scenario.
The MATLAB software computes the nonlinear ARX model output (y) in two stages:
1. It computes regressor values from the current and past input values and past output data.
In the simplest case, regressors are delayed inputs and outputs, such as u(t−1) and y(t−3);
these are called standard regressors and are specified using the model orders and delay.
Custom regressors, which are nonlinear functions of delayed inputs and outputs, can also be
specified. By default, all regressors are inputs to both the linear and the nonlinear function
blocks of the nonlinearity estimator, but a subset of regressors can be chosen as inputs to the
nonlinear function block.
2. It maps the regressors to the model output using the nonlinearity estimator block, which
can include linear and nonlinear blocks in parallel. For example:
F(x) = Lᵀ(x−r) + d + g(Q(x−r))
Here, x is a vector of the regressors and r is the mean of the regressors. Lᵀ(x−r) + d is the
output of the linear function block; it is affine when the scalar offset d ≠ 0. g(Q(x−r))
represents the output of the nonlinear function block, and Q is a projection matrix that keeps
the calculations well-conditioned. The exact form of F(x) depends on the choice of
nonlinearity estimator; available estimators include tree-partition networks, wavelet
networks and multilayer neural networks. Either the linear or the nonlinear function block
can also be excluded from the nonlinearity estimator.
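The two-block computation of F(x) can be written out directly. The following is a minimal numerical sketch in which the weights L, offset d, projection Q and the tanh-based g are all toy values chosen for illustration, not estimated parameters:

```python
import numpy as np

def nlarx_output(x, L, r, d, Q, g):
    """Evaluate F(x) = L^T (x - r) + d + g(Q (x - r)):
    an affine linear block plus a nonlinear block, both acting
    on mean-centred regressors."""
    centred = x - r
    linear_part = L @ centred + d
    nonlinear_part = g(Q @ centred)
    return linear_part + nonlinear_part

# Toy instance: 3 regressors, tanh sum as a stand-in nonlinearity g
x = np.array([0.5, -1.0, 2.0])           # e.g. [y(t-1), y(t-2), u(t-1)]
r = np.zeros(3)                           # regressor mean
L = np.array([0.2, 0.1, -0.3])            # linear block weights
d = 0.05                                  # scalar offset
Q = np.eye(3)                             # projection matrix
g = lambda z: float(np.sum(np.tanh(z)))   # hypothetical nonlinear block
y = nlarx_output(x, L, r, d, Q, g)
```

In an estimated model, L, r, d, Q and the parameters of g would be computed from data rather than set by hand, as the next paragraph describes.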
When estimating a nonlinear ARX model, the MATLAB software computes the model
parameter values, such as L, r, d, Q, and the other parameters specifying g. Typically,
nonlinear ARX models act as black-box structures: the nonlinear function is a flexible
nonlinearity estimator whose parameters need not have physical significance. Nonlinear
ARX models can be estimated in the System Identification app or at the command line using
the nlarx command in MATLAB. Uniformly sampled time-domain input-output data, or
time-series data (no inputs), are used for estimating nonlinear ARX models. The data can
have one or more input and output channels; frequency-domain data cannot be used for
estimation.
which does not generate better forecasts. To penalize the addition of extra parameters, model
comparison criteria such as AIC and BIC can be used.
In summary, NNs are remarkably simple yet powerful techniques for time series forecasting.
The selection of appropriate network parameters is crucial when using an NN for forecasting,
and a suitable transformation or rescaling of the training data is often necessary to obtain the
best results.
CHAPTER 3
METHODOLOGY
At the beginning of the model building process, it is important to clearly define the criteria by
which the performance of the model will be judged, as they can have a significant impact on
the model architecture and weight optimisation techniques chosen. In most applications,
performance criteria include one or more of the following: prediction accuracy, training
speed and the time delay between the presentation of inputs and the reception of outputs for a
trained network. The time delay between the presentation of network inputs and the reception
of the corresponding outputs is a function of processing speed.
For a particular computational platform, this is a function of the number of connection
weights and the type of connection between them. In order to maximise processing speed, it
is desirable to keep the number of connection weights as small as possible and the
connections between them as simple as possible. The time taken to train a network is highly
problem dependent. However, for a particular case study, training speed is a function of a
number of factors. The optimisation method and its associated parameters have a major
influence on convergence speed. The size of the training set can also play a significant role.
Another factor that affects training speed is the size of the network. Larger networks
generally require fewer weight updates to find an acceptable solution. However, the time
taken to perform one weight update is increased.
Prediction accuracy is affected by the optimisation algorithm; the method's ability to escape
local minima in the error surface is of particular importance. A number of performance
measures have been proposed; these are generally calculated using data that have not been
utilised in the training process, so that the model's generalisation ability can be assessed.
Generalisation ability is defined as a model's ability to perform well on data that were not
used to calibrate it, and is a function of the ratio of the number of training samples to the
number of connection weights. If this ratio is too small, continued training can result in
overfitting of the training data, a problem exacerbated by the presence of noise in the data.
To minimise overfitting, various techniques can be used to determine the smallest number of
connection weights that will adequately represent the desired relationship. Generalisation
ability is also affected by the degree to which the training and validation sets represent the
population to be modelled, and by the stopping criterion used.
3.1 Division of data
It is common practice to split the available data into two sub-sets: a training set and an
independent validation set; the validation set is itself divided into validation and testing
subsets. Typically, ANNs are unable to extrapolate beyond the range of the data used for
training. Consequently, poor forecasts/predictions can be expected when the validation data
contain values outside the range of those used for training. It is also imperative that the
training and validation sets be representative of the same population. When limited data are
available, it may be difficult to assemble a representative validation set. In this model, the
data-division percentages were varied in order to achieve optimum results.
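As a minimal illustration (not the thesis's MATLAB workflow; the 70/15/15 fractions and the function name are assumptions for the example), such a three-way division can be sketched as:

```python
import numpy as np

def split_data(X, y, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the records, then divide them into training,
    validation and testing subsets; the remaining fraction
    goes to the test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train_frac * len(X))
    n_val = int(val_frac * len(X))
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

# 100 synthetic records for illustration
X = np.arange(100).reshape(100, 1).astype(float)
y = X.ravel() * 2.0
train, val, test = split_data(X, y)
```

Shuffling before splitting helps each subset represent the same population, which the paragraph above notes is imperative.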
One method which maximises utilisation of the available data is the holdout method
(Masters, 1993). The basic idea is to withhold a small subset of the data for validation and to
train the network on the remaining data. Once the generalisation ability of the trained
network has been obtained with the aid of the validation set, a different subset of the data is
withheld and the above process is repeated. Different subsets are withheld in turn, until the
generalisation ability has been determined for all of the available data. Other methods for
maximising the availability of data for training have also been proposed: Lachtermacher and
Fuller (1994), for example, generated a synthetic test set that possessed the same statistical
properties as the training data, while Maier and Dandy (1998) suggested using a subset of the
data as a testing set in a trial phase to determine how long training should be carried out so
that acceptable generalisation ability is achieved. The subset used for testing is then added to
the remaining training data, and the whole data set is used to train the network for a fixed
number of epochs, based on the results from the trial phase.
Cross-validation (Stone, 1974) is a technique that is used frequently in ANN modelling and
has a significant impact on the way the available data are divided. It can be used to determine
when to terminate training and to compare the generalisation ability of different models. In
cross-validation, an independent test set is used to assess the performance of the model at
various stages of learning. As the validation set must not be used as part of the training
process, a different, independent testing set is needed for the purposes of cross-validation.
This means that the available data need to be divided into three subsets; a training set, a
testing set and a validation set, which is very data intensive. The same applies to cases where
network geometry or internal parameters are optimised by trial and error.
3.2 Data Processing
The data processing was conducted in three steps. The first step was obtaining the
Management Planning Control System (MPCS) data record of the latest month from each
site, as prepared by the planning department of each individual L&T site. These data were
obtained in several parts containing schedules of planned vs actual quantities of the various
resources utilized. Each part had specific details of the resources used in each activity
included in the study: the quantities of equipment, material, labour and money used for the
activities. These individual data were used to create the neural networks for the study model.
The second step in the data processing was the conversion of the data into numeric values
(where needed) and the formulation of the individual set for each neural network. If the input
data fields in the Excel sheet were numbers, they were entered into the neural networks
directly, without manipulation or calculation; each numerical data field was linked to its
corresponding position in its matrix and simply transferred there. Data fields of this nature
included concrete quantity, steel quantity and crew size. Input data in text form, on the other
hand, had to be converted into numerical form for the neural network to use in its
computations.
The third step was an optional scheme for randomizing the data in order to avoid incorrect
outcomes. In this step, data processing included randomizing the records and normalizing the
data. The records were fed into the main Excel table in the order in which they appeared, so
randomization (shuffling) of the records was required to avoid the ordering bias that could
otherwise arise. The randomization improved the generalization capability of the network
and allowed smoother convergence. Normalizing the data was a further manipulation: the
process converted the numbers in each matrix to values within a fixed range. Such scaling
allows the neural networks to converge faster and later to generalize better.
Fig 3.1: Basic Architecture of Proposed Network Model
Normalising the input data guarantees that no attribute dominates the others merely because
of differences in data ranges, and it also facilitates stable convergence of the network weights
and biases. In most traditional statistical models, the data have to be normally distributed
before the model coefficients can be estimated efficiently; if this is not the case, suitable
transformations to normality have to be found. It has been suggested in the literature that
ANNs overcome this problem, as the probability distribution of the input data does not have
to be known.
More recently, it has been pointed out that as the mean squared error function is generally
used to optimise the connection weights in ANN models, the data need to be normally
distributed in order to obtain optimal results. However, this has not been confirmed by
empirical trials. Clearly, this issue requires further investigation.
Until recently, the issue of stationarity has been rarely considered in the development of
ANN models. However, there are good reasons why the removal of deterministic components
in the data (i.e. trends, variance, seasonal and cyclic components) should be considered. As
previously discussed, it is generally accepted that ANNs cannot extrapolate beyond the range
of the training data. One way to deal with this problem is to remove any deterministic
components using methods commonly used in time series modelling such as classical
decomposition (Chatfield, 1975) or differencing (Box and Jenkins, 1976). Differencing has
already been applied to neural network modelling of non-stationary time series. However, use
of the classical decomposition model may be preferable, as differenced time series can
possess infinite variance. Another way of dealing with trends in the data is to use an adaptive
weight update strategy during on-line operation.
It has been suggested that the ability of ANNs to find non-linear patterns in data makes them
well suited to dealing with time series with non-regular cyclic variation. Maier and Dandy
(1996a) investigated the effect of input data with and without seasonal variation on the
performance of ANN models. Their results indicate that ANNs have the ability to cater to
irregular seasonal variation in the data with the aid of their hidden layer nodes.
M = (sum of all entries) / (number of entries)
Then the standard deviation, SD, of each parameter is calculated individually. With the mean
and SD of every parameter available, the values of each parameter were normalized as
Normalized value = (x − M) / SD
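The two formulas above amount to z-score normalization, which can be sketched as:

```python
import numpy as np

def normalize(values):
    """Z-score normalization as in the text:
    M = mean of the entries, SD = standard deviation,
    normalized value = (x - M) / SD."""
    values = np.asarray(values, dtype=float)
    M = values.sum() / len(values)              # mean
    SD = np.sqrt(((values - M) ** 2).mean())    # population standard deviation
    return (values - M) / SD

raw = [10.0, 20.0, 30.0, 40.0]
z = normalize(raw)   # zero mean, unit standard deviation
```

Each parameter (column) of the input matrix would be normalized independently with its own M and SD.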
3.4.2.2. Levenberg-Marquardt algorithm
The Levenberg-Marquardt algorithm, founded in the works of Levenberg (1944) and
Marquardt (1963), combines the steepest descent algorithm of Rumelhart, Hinton and
Williams (1986b) with Newton's method. Like other quasi-Newton methods, it is not strictly
a second-order gradient method, because the Hessian matrix is approximated by
combinations of the Jacobian matrix of the errors (the matrix of first-order gradients), so that
no second-order gradients remain to be calculated. According to Haykin (2009), the
advantage of this method is therefore that it converges rapidly, like Newton's method, but
cannot diverge, owing to the influence of the steepest descent algorithm. By modifying
certain parameters, the Levenberg-Marquardt algorithm can be made equivalent to either the
steepest descent or Newton's algorithm. According to Bishop (1995), the Levenberg-
Marquardt algorithm is especially applicable to error-sum-of-squares performance functions.
The Levenberg-Marquardt update used in this study can be written as:
Wnew = Wold − (JᵀJ + γI)⁻¹ JᵀE
where J is the Jacobian of the error function (E), I is the identity matrix and γ is the
parameter used to define the iteration step value. It minimizes the error function while trying
to keep the step between the old weight configuration (Wold) and the new updated one
(Wnew) small. The performance of the network is measured in terms of various performance
functions, such as the sum squared error (SSE), the mean squared error (MSE) and the
coefficient of correlation (CC or 'r') between the predicted and observed values of the
quantities. A lower MSE and a higher CC indicate better network performance.
The Levenberg–Marquardt (LM) modification of the classical Newton algorithm overcomes
one of the problems of the classical Newton algorithm, as it guarantees that the Hessian is
positive definite. However, the computation/memory requirements are still O(k²). The LM
algorithm may be considered to be a hybrid between the classical Newton and steepest
descent algorithms. When far away from a local minimum, the algorithm’s behaviour is
similar to that of gradient descent methods. However, in the vicinity of a local minimum, it
has a convergence rate of order two. This enables it to escape local minima in the error
surface.
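A single LM update of the form Wnew = Wold − (JᵀJ + γI)⁻¹JᵀE can be sketched on a toy linear least-squares problem (the problem data and the fixed γ are assumptions for this illustration; practical implementations adapt γ between iterations):

```python
import numpy as np

def lm_step(w, residual_fn, jacobian_fn, gamma):
    """One Levenberg-Marquardt update:
    w_new = w_old - (J^T J + gamma I)^(-1) J^T e.
    Small gamma ~ Newton-like behaviour; large gamma ~ steepest descent."""
    e = residual_fn(w)                 # residual vector
    J = jacobian_fn(w)                 # Jacobian of the residuals w.r.t. w
    A = J.T @ J + gamma * np.eye(len(w))
    return w - np.linalg.solve(A, J.T @ e)

# Toy least-squares fit: residuals e = X w - y, so the Jacobian is X
X = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
y = np.array([1.0, 4.0, 3.0])
residual_fn = lambda w: X @ w - y
jacobian_fn = lambda w: X
w = np.zeros(2)
for _ in range(50):
    w = lm_step(w, residual_fn, jacobian_fn, gamma=0.1)
# w approaches the least-squares solution of X w = y
```

The damping term γI is what guarantees the matrix being inverted is positive definite, mirroring the advantage over the classical Newton algorithm described above.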
3.4.2.3 Scaled Conjugate or Conjugate Gradient Algorithm
In the Conjugate Gradient Algorithm (CGA), a search is performed along conjugate
directions, which generally produces faster convergence than steepest descent directions. A
search is made along the conjugate gradient direction to determine the step size that
minimizes the performance function along that line: a line search is performed to find the
optimal distance to move along the current search direction. The next search direction is then
chosen to be conjugate to the previous search direction.
Conjugate gradient methods may be viewed as approximations to the Shanno algorithm. The
quasi-Newton approach proposed by Shanno (1978) overcomes both problems associated
with the classical Newton algorithm while maintaining a convergence rate of order two. The
computation/memory requirements are reduced to O(k) by using an approximation of the
inverse of the Hessian. This approximation also has the property of positive definiteness,
avoiding the problem of 'uphill' movement. One potential problem with the Shanno
algorithm is that it cannot escape local minima in the error surface and may thus converge to
a sub-optimal solution. However, the Shanno algorithm is expected to converge more rapidly
to a nearby strict local minimum, take fewer uphill steps and have greater numerical
robustness than the generic conjugate gradient algorithm.
The general procedure for determining the new search direction is to combine the new
steepest descent direction with the previous search direction. An important feature of the
CGA is that the minimization performed in one step is not partially undone by the next, as it
is the case with gradient descent methods. The key steps of the CGA are summarized as
follows:
1. Choose an initial weight vector w1.
2. Evaluate the gradient vector g1 and set the initial search direction d1 = −g1.
3. At step j, minimize E(wj + a·dj) with respect to a to give w(j+1) = wj + amin·dj.
4. Test whether the stopping criterion is satisfied.
5. Evaluate the new gradient vector g(j+1).
6. Evaluate the new search direction using d(j+1) = −g(j+1) + βj·dj.
The various versions of the conjugate gradient method are distinguished by the manner in
which the constant βj is computed.
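The listed steps can be sketched for a quadratic error function; the Fletcher-Reeves choice of βj and the exact line search specialised to the quadratic case are assumptions made for this illustration:

```python
import numpy as np

def conjugate_gradient(E_grad, w, n_steps=50, tol=1e-10):
    """Conjugate gradient minimisation following the listed steps, with a
    Fletcher-Reeves beta and an exact line search specialised to a
    quadratic E(w) = 0.5 w^T A w - b^T w (gradient A w - b)."""
    g = E_grad(w)
    d = -g                                   # initial search direction d1 = -g1
    for _ in range(n_steps):
        if np.linalg.norm(g) < tol:          # stopping criterion
            break
        # Exact quadratic line search: a = g^T g / (d^T A d)
        Ad = E_grad(w + d) - E_grad(w)       # equals A d, by linearity of the gradient
        a = (g @ g) / (d @ Ad)
        w = w + a * d
        g_new = E_grad(w)
        beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves formula
        d = -g_new + beta * d                # new conjugate search direction
        g = g_new
    return w

# Quadratic test problem: minimise 0.5 w^T A w - b^T w
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
w_min = conjugate_gradient(lambda w: A @ w - b, np.zeros(2))
```

On an N-dimensional quadratic with exact line searches, CG reaches the minimum in at most N steps; the two-dimensional example above converges in two iterations.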
The approximation tends in the limit to the true value of E(wj + a·dj). The calculation
complexity and memory usage of gj are, respectively, O(3N²) and O(N). If this strategy is
combined with the CG approach, we get an algorithm directly applicable to a feedforward
neural network. This slightly modified version of the original CG algorithm will also be
referred to as CG. The CG algorithm was tested on a generic test problem; it fails in almost
every case and converges to a non-stationary point. The cause of this failure is that the
algorithm only works for functions with positive definite Hessian matrices, and the quadratic
approximations on which it relies can be very poor when the current point is far from the
desired minimum. The Hessian matrix of the global error function E has been shown to be
indefinite in different areas of the weight space, which explains why CG fails in the attempt
to minimize E.
Learning rate
The learning rate is directly proportional to the size of the steps taken in weight space.
However, it is worthwhile re-iterating that learning rate is only one of the parameters that
affect the size of the steps taken in weight space. Traditionally, learning rates remain fixed
during training and optimal learning rates are determined by trial and error. Guidelines for
appropriate learning rates have been proposed for single-layer networks but it is difficult to
extend these guidelines to the multilayer case. Many heuristics have been proposed which
adapt the learning rate, and hence the size of the steps taken in weight space, as training
progresses based on the shape of the error surface.
Stopping criteria
The criteria used to decide when to stop the training process are vitally important, as they
determine whether the model has been optimally or sub-optimally trained. Examples of sub-
optimal training include stopping too early or stopping only after overfitting of the training
data has occurred.
At this stage, it is important to understand that overfitting is intricately linked to the ratio of
the number of training samples to the number of connection weights. Overfitting does not
occur if the above ratio exceeds 30. In such cases, training can be stopped when the training
error has reached a sufficiently small value or when changes in the training error remain
small.
When the above condition is not met, there are clear benefits in using cross-validation. In
practical terms, however, this is not a straightforward task, and there has been much
discussion about the relative merits of the use of cross-validation as a stopping criterion.
Some researchers suggest that it is impossible to determine the optimal stopping time and that
there is a danger that training is stopped prematurely (i.e. even though the error obtained
using the test set might increase at some stage during training, there is no guarantee that it
will not reach lower levels at a later stage if training were continued). Consequently, when
cross-validation is used, it is vital to continue training for some time after the error in the test
set first starts to rise.
3.4.3.1 The Problem of Under-fitting or Overfitting
The training set was used to train the network, i.e. to choose its parameters (weights); the
cross-validation set was used for generalization, that is, to produce better output for unseen
examples; finally, the test set was used to measure the performance of the selected ANN
model. A practical issue is the under-fitting/overfitting dilemma. Under-fitting may occur
when the model is too simple or there are insufficient training data; in such cases both the
training error and the testing error are large. Overfitting occurs when the model is too
complex or the training data are insufficient or noisy, resulting in a small training error but a
large testing error. To tackle these problems, stopping criteria and weight resets were used
during network training.
MSE is sensitive to changes of scale and to data transformations.
Although MSE is a good measure of overall forecast error, it is not as intuitive and
easily interpretable as the other measures.
3.5.6 The Mean Forecast Error (MFE)
This measure is defined as MFE = (1/n) Σ et
The properties of MFE are:
It is a measure of the average deviation of forecasted values from actual ones.
It shows the direction of the error and is thus also termed the Forecast Bias.
In MFE, the effects of positive and negative errors cancel out, and there is no way to
know their exact magnitudes.
A zero MFE does not mean that the forecasts are perfect, i.e. contain no error; it only
indicates that the forecasts are on target on average.
MFE does not penalise extreme errors.
It depends on the scale of measurement and is also affected by data transformations.
For a good forecast, i.e. one with minimum bias, it is desirable that the MFE be as
close to zero as possible.
3.5.9 The Mean Percentage Error (MPE)
It is defined as MPE = (1/n) Σ (et / yt) × 100
The properties of MPE are:
MPE represents the average percentage error that occurred while forecasting.
It has similar properties to MAPE, except that it shows the direction of the error.
Opposite-signed errors affect each other and cancel out. Thus, as with MFE, obtaining
a value of MPE close to zero does not allow us to conclude that the corresponding
model performed very well.
For a good forecast it is desirable that the obtained MPE be small.
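Both measures can be computed together; the sign convention et = yt − ft is an assumption here, since the text does not fix one:

```python
import numpy as np

def forecast_errors(actual, forecast):
    """Mean Forecast Error and Mean Percentage Error as defined in the text:
    MFE = (1/n) * sum(e_t), MPE = (1/n) * sum(e_t / y_t) * 100,
    with e_t = y_t - f_t (an assumed sign convention)."""
    y = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    e = y - f
    mfe = e.mean()
    mpe = (e / y).mean() * 100.0
    return mfe, mpe

# Opposite-signed errors cancel: MFE is zero despite imperfect forecasts
actual = [100.0, 200.0]
forecast = [110.0, 190.0]   # errors -10 and +10
mfe, mpe = forecast_errors(actual, forecast)
```

This cancellation is exactly why a near-zero MFE or MPE alone does not establish that a model performed well.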
Each of these measures has some unique properties different from the others. In experiments
it is better to consider more than one performance criterion, as this helps to obtain a
reasonable knowledge of the amount, magnitude and direction of the overall forecast error.
For this reason, time series analysts usually use more than one measure for judgement.
Other plots are also produced, such as the regression best-fit plot, the error histogram and the
performance plot. Error in this study is assessed on the basis of the mean squared error and
the sum of squared errors; both error measures have to be close to zero for the predictions to
be accurate, and the further they are from zero, the less accurate the predictions are. In the
regression plot, the correlation coefficient 'r' should be close to 1 in order to indicate a high
correlation between output values and target values.
CHAPTER 4
CASE STUDY DETAILS
In this study, the Artificial Neural Network models developed using the NAR and NARx
networks were tried on data from two different sites, obtained from the L&T Waste Water
Unit and the Industrial & Large Water Unit of the Water & Effluent Treatment (WET) IC.
The two projects had different project scopes, site conditions, essential requirements and
different critical parameters governing the resource requirements of the sites. The data were
obtained from the MPCS records of each site, which are updated regularly.
The resource requirements of both sites are discussed in connection with the case studies
created to understand how an ANN can be used to predict resource requirements for future
projects. A detailed analysis with regression and error graphs is provided in the study.
Certain aspects of each project were found to be unique and do not correlate with one
another.
To understand this, a comparative analysis is carried out at the end of both sets of findings.
The comparison presents the similarities as well as the differences in the pattern of resource
requirements, and an attempt has been made to understand which critical factors explain the
differences observed in the findings. Because the model can only be validated accurately
when a large data sample is available, and only two sites and their resource analyses are used
here, the efficiency of the model is considerably reduced. Some performance measures may
vary beyond permissible limits, as the prediction is done on a very limited database; these
problems can be overcome with adequate data availability.
Project Scope : Construction of Roads, PH services (Water
Supply, Sewerage Storm Water Drainage &
Treated Water Networks), Electrical Services
(HT, LT and Streetlights), Maintenance of PH
and Electrical Works for 5 years
Project Value : Rs 349 Cr
Total built up area : 1700 acres
Type of Contract : Item Rate
A detailed site layout was prepared at the beginning of the project, keeping the logistics
requirements in mind. Important areas such as the office area, material stacking area, lay-
down area, parking area, quality lab area and dumping yard were clearly demarcated, as were
the soil, aggregate and reinforcement stacking areas. The formwork and rebar yards were
clearly separated, considering the vehicular movement requirements. Access roads to all
parts of the project site were constructed after finalizing the above areas. The temporary
buildings for the L&T offices were also built keeping in mind the requirement of five years
of maintenance after completion. All necessary drainage works, infrastructure work and
office establishment were constructed, and the PPE and non-PPE areas were clearly
demarcated as per the layout.
The Waste Water Business Unit of the WET IC of L&T Construction was awarded the
construction of roads, the development of public health services (water supply & distribution
network, sewerage & storm-water network and treated-water network) and electrical services
(including HT & LT cabling works and street lighting), complete with maintenance of the
PH and electrical services for a period of five years.
The project was initiated on 30th August 2014, after receiving environmental clearances
from the State Departments. The estimated project duration was 18 months, with a
completion date of 30th April 2017. However, due to multiple interconnected fault lines and
discrepancies in resource handling, the project was delayed. As per the monthly progress
report for April 2017, the physical progress was 95% complete, with a financial progress of
86%.
connections. This drawing was received by mail from the DTP Dept. on 13th April '15.
There was also a no-work period of three months, from December 2014 to February 2015,
during which GMADA undertook a feasibility check of the IT City Layout Plan, as shown in
Figure 4.3.
4.2 Case Study 2
4.2.1 Site Details
1. LOA DATE : 04-03-2014
2. AGREEMENT DATE : 10-04-2014
3. LOA REFERENCE NO. : BUIDCo/Yo-24/10(lll) -665 dated 04-03-2014
4. START DATE : 24-04-2014
5. COMPLETION DATE : 24-06-2016
6. ORIGINAL PROJECT DURATION : 26 Months
7. DLP PERIOD : 1 year after project completion
8. OPERATION & MAINTENANCE : NIL
9. CONTRACT VALUE : 254.52 crores
10. TYPE OF CONTRACT : Item Rate Contract
Fig 4.4: Reinforcement Works for Ghat Floor Slab at Chaudhary Tola Ghat
4.2.2 Brief Scope of Work:
PART A:
1) Development of Ghats - 20 Nos.
2) Promenade of 6.6 km.
3) Crematorium at Gulvi Ghat.
Fig 4.5: Concrete Work under progress using Barge Mounted Batching Plant
4.2.3 Civil Works (Approx.):
1. Earthwork
1. Excavation : 85,000 cum
2. Filling : 90,000 cum
2. Pile Work : 9,000 Nos.
3. Concrete : 83,500 cum
4. Reinforcement steel : 8,500 MT
5. Shuttering : 3,80,000 Sqm
6. Brickwork : 11,000 cum
7. Flooring works : 1,32,000 Sqm
8. Plastering & Finishing works : 4,54,000 Sqm
9. Gabion : 25,000 cum
This project was also complete as of January 2017. It was delayed by a period of 5 months,
mainly due to intense monsoon rainfall in the Gangetic belt. The project was executed under
the Industrial and Large Water unit of the Water & Effluent Treatment IC of L&T.
CHAPTER 5
RESULTS AND DISCUSSION
5.1 Results of Case Study 1
5.1.1 Project Data
The latest update of the site MPCS contained the details of every activity performed each
month up to the present. Since site construction was complete, the records gave the full
picture of the project's resource requirements. The MPCS records data in a plan-versus-actual
format: the planned monthly usage of resources under each activity was calculated in the
project initiation phase.
Here, it is necessary to understand the planned variable inputs of the projects. The planned
resources are fixed under various assumptions and on the basis of previous experience. The
data calculated for planned resources represents an overall optimum condition fulfilling the
project criteria. Certain allowances are naturally made for seasonal or other uncontrollable
hindrances. However, most of the time these planned figures do not accommodate
unforeseeable circumstances, and the actual data obtained from the physical implementation
of activities on site shows a marked difference from the estimates.
For the model analysis, the actual data obtained from the sites were taken as input. The
choice of input parameters is notable here: the noise in the actual data is very high, as the
values do not correlate, or even tend to correlate, with one another.
This makes it harder for the model to find the underlying pattern in these values. Hence these
actual resource values were used to test the efficiency of the model under adverse
circumstances. In future, if operations on site are made more viable and less risk-prone, the
actual values will correlate far more closely with the planned values, which will make the
model's task easier.
As described earlier, the entire data set was divided into three parts: a training set, a
validation set and a testing set. For this project analysis, the data was divided in the ratio
70%, 15% and 15%. The main consideration in deciding how to divide the data is the
normalizing effect across the input parameters. If the values lie within a finite scope and the
percentage of noise is below 20%, the bulk of the inputs is given to the training set, which
helps the network formulate a better non-linear relation between the inputs. Since there is
very little noise in the remainder of the data, it is highly probable that those values will fit the
fitted correlation.
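The 70/15/15 division described above can be sketched as follows. This is a minimal illustration with invented monthly values, not the site data; the MATLAB toolbox used in this work may divide the records randomly rather than sequentially, so the slicing here is an assumption for clarity.

```python
# Sketch: dividing a monthly resource record into training, validation
# and testing subsets in a 70/15/15 ratio. Sequential slicing is
# assumed for illustration; toolbox defaults may divide randomly.
def split_dataset(records, train=0.70, val=0.15):
    """Return (training, validation, testing) slices of `records`."""
    n = len(records)
    n_train = round(n * train)
    n_val = round(n * val)
    return (records[:n_train],
            records[n_train:n_train + n_val],
            records[n_train + n_val:])

monthly_usage = list(range(1, 41))  # e.g. 40 months of recorded usage
train_set, val_set, test_set = split_dataset(monthly_usage)
print(len(train_set), len(val_set), len(test_set))  # 28 6 6
```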
The regression 'r' value of all three sets combined comes to 0.997, as shown in Figure 5.1.
This is quite high, the optimum value being 1, and it shows the stability of the data set and
the limited range of errors. The regression line stays close to the ideal in the graph,
demonstrating that the LM algorithm has an inbuilt mechanism to fine-tune the weights
towards the point of least convergence. It should also be noted that a normalization process,
standardizing the values, was applied to this set of data inputs. The performance graph shows
a strong convergence between the training and validation data sets. This normalization had a
great impact in filling the "Data Gaps": the standardization of inputs helped reduce the noise
and produced a more sustained result. In the larger scheme, the slight adjustment of the data
inputs due to normalization does not make the prediction unreliable, since the normalization
takes into account the range and variation of consecutive values.
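One common form of the standardization described above is the z-score transform, sketched below with invented values. The exact scheme applied by the toolbox may differ (e.g. min-max mapping to [-1, 1]), so this is an assumption for illustration only.

```python
# Sketch: z-score standardization of an input series, one plausible
# form of the normalization described in the text. Values are
# illustrative, not the case-study data.
def standardize(values):
    """Rescale a series to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / var ** 0.5 for v in values]

raw = [120.0, 150.0, 90.0, 180.0, 160.0]   # hypothetical monthly inputs
scaled = standardize(raw)
# The standardized series has mean 0 and unit variance, so inputs of
# very different magnitudes (manpower counts vs. money) become comparable.
```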
The regression graph shows a very high convergence pattern within the input data, indicating
a high correlation among the individual input parameters and a noise level that is minimal,
well within the prescribed 20% limit.
Fig 5.2: Performance Graph of Case Study 1
The performance graph shows that the validation and training data sets are almost correlated,
to within the error range. The testing data falters to a certain extent: its curve shifts slightly
upward, following a similar but deviated path. The best convergence, i.e. the minimum MSE,
was obtained at the 9th epoch, as shown in Figure 5.2. As detailed earlier, an epoch is one
iteration in which the model continues searching for the minimum error. The iterations
continued until the 15th epoch, after which the error between the predicted and actual values
starts diverging. The LM algorithm escapes local minima by continuing the iterations until
such a condition occurs; the points of intersection before the 9th epoch between the testing
curve and the validation and training curves are points of local minima, and these
redundancies are done away with by the LM algorithm.
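The stopping behaviour described above, where training runs past the best epoch and keeps the weights from the epoch of minimum validation error, can be sketched as validation-based early stopping. The error values below are invented for illustration; the patience of 6 epochs mirrors the 9th-best/15th-last pattern reported, but is an assumption, not the toolbox's actual setting.

```python
# Sketch: early stopping on validation error. Training continues for
# `patience` epochs past the best epoch; if validation error keeps
# rising, the best epoch's weights are the ones retained.
def early_stop(val_errors, patience=6):
    """Return the index of the epoch whose weights should be kept."""
    best_epoch, best_err, waited = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation error has diverged long enough
    return best_epoch

# Invented validation MSE per epoch: minimum at epoch 9, rising after.
val_mse = [900, 700, 500, 400, 320, 280, 250, 230, 210,
           205, 240, 260, 300, 330, 380, 410]
print(early_stop(val_mse))  # epoch with minimum validation MSE
```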
5.2.2 Time Series Graph
The time series graph in Figure 5.3 shows the input data and predicted values for all three
data subsets; each point denotes both values. A time series analysis is used because the
recorded data follow a monthly pattern, although no direct connection between the months
and the data values can be found. The time series formulation makes it easier for the
algorithm to treat each value as an index of a particular time (in this case, one month), which
is used as the corresponding point for the next month's value.
The time series graph also shows the error for each predicted value. The error analysis is
described in detail in the next section.
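The month-to-month indexing described above amounts to building lagged input/target pairs, as in a nonlinear autoregressive setup. The sketch below uses a lag of 1 and invented values; the actual network likely used several lagged months as inputs.

```python
# Sketch: turning a monthly series into lagged (input, target) pairs,
# so that each month's value serves as a predictor for the next
# month's. A lag of 1 is assumed here for simplicity.
def make_lagged_pairs(series, lag=1):
    """Pair each value with the value `lag` months earlier."""
    return [(series[t - lag], series[t]) for t in range(lag, len(series))]

monthly = [10, 12, 15, 14, 18]   # illustrative monthly resource values
pairs = make_lagged_pairs(monthly)
# pairs == [(10, 12), (12, 15), (15, 14), (14, 18)]
```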
The recommended maximum value of error is determined by eliminating the noise and then
driving the MSE towards 0. Although 0 is the optimum value for any error variable, a large
cumulative error in this model can still reflect only minor individual errors over a sufficiently
large data set. In such cases the regression 'r' value comes into play: the better the 'r' value,
the more acceptable the error value is.
The error histogram in Figure 5.4 shows the deviation and magnitude of error of all the data
points. The central yellow line marks the point of zero error. In this graph, 60% of the inputs
lie at zero error or within the prescribed bin, while the remaining inputs span the data range
shown: the errors lie between -86.79 and 61.76, and only 12% of the values fall beyond the
central acceptable range of -30 to 30.
Fig 5.5: Error Auto-Correlation Graph of Case Study 1
The Mean Squared Error (MSE) comes to 170.85, which is within reasonable limits; the MSE
gives the mean squared deviation of the error from 0. The Sum of Squared Errors (SSE)
comes to about 6.08 × 10⁴. As the error autocorrelation graph in Figure 5.5 shows, the error
deviations lie within the considerable ranges marked by the confidence limit lines. This
reasserts that the model prediction has a high convergence of data inputs.
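The two error measures quoted above are related as MSE = SSE / n over the same residuals; the sketch below computes both on invented values, not the case-study figures.

```python
# Sketch: SSE and MSE as used in the text. SSE is the sum of squared
# residuals between actual and predicted values; MSE is their mean.
def sse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def mse(actual, predicted):
    return sse(actual, predicted) / len(actual)

actual = [100.0, 110.0, 95.0, 120.0]     # hypothetical monthly values
predicted = [102.0, 108.0, 97.0, 118.0]  # hypothetical model outputs
print(sse(actual, predicted), mse(actual, predicted))  # 16.0 4.0
```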
From the predicted data, the model worked well for the given data set of the first project: the
correlation 'r' value comes to 0.997 while the MSE is 170.85. The estimates are highly
accurate and suit such projects and sequences of events well.
The parameters for estimation and prediction were the 4M resources: Manpower, Machinery,
Materials and Money. These resources were estimated from the essential activities over the
entire project duration. In this 1st case study the activities were road laying, excavation and
backfilling, masonry, reinforcement and concreting; the finalized weights for the above
parameters, in the respective order, were 5.1159, 3.0205, 2.2348, 1.087 and 0.5289, with a
mean squared error of 170.85.
5.3 Results of Case Study 2
5.3.1 Project Data
As in the first project, the data was collected from the MPCS records after project
completion. The work was divided into activities such as concreting, excavation and
reinforcement, and the resource requirements for each activity were listed. The resources
primarily considered for this study are the manpower, machinery and materials used.
The activities taken into consideration are generally the important ones forming the bulk of
the construction work; peripheral activities of only small-scale importance were not
considered. Activities such as flooring, wood work, aluminium works and painting each
occupy only a short period within the entire project timeline, so their monthly requirement is
essentially 0 except for the few months in which they are executed. Entering such inputs
would not produce a good correlation equation and generally tends to fail and create large
errors.
Fig 5.6: Regression Graph of Case Study 2
It should be noted that normalization was also applied to this data set; however, there were
large "Data Gaps" with no values. As per the site records, no work was done in the heavy
months of rain, and to account for this the corresponding data inputs were 0. The problem
then arises when the LM algorithm tries to find the correlation pattern between the data
inputs: a consistent series of 0 values at an intermediate point or at either end creates an
apparent decreasing trend in resource requirement. This effect, superimposed on the already
established trend in the data, creates a different pattern, and the pattern finalized in the
validation phase then predicts wrong values for future inputs.
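One plausible remedy for the zero-gap problem described above is to treat no-work months as missing data rather than as genuine zero demand, so they are excluded from fitting instead of dragging the trend downward. This is a sketch of that idea with invented values, not a technique used in the original analysis.

```python
# Sketch: marking zero-valued "data gap" months (no-work periods) as
# missing so they can be excluded from statistics and curve fitting,
# rather than being read as real zero demand.
def mask_gaps(series, gap_value=0):
    """Replace gap months with None so fitting can skip them."""
    return [None if v == gap_value else v for v in series]

def mean_ignoring_gaps(series):
    kept = [v for v in mask_gaps(series) if v is not None]
    return sum(kept) / len(kept)

monthly = [400, 420, 0, 0, 0, 450, 430]  # monsoon months recorded as 0
print(mean_ignoring_gaps(monthly))       # 425.0, not biased by the zeros
```

Including the zeros would give a mean of about 243, illustrating how a run of gap months pulls the apparent trend down.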
The error is very high in this case, as the data hardly converges, as shown in the performance
graph in Figure 5.7. Epoch 3 is taken as the best value, after which the validation and testing
curves diverge or follow nearly parallel paths with respect to each other. The iterations were
continued until the 9th epoch, beyond which any point of convergence is highly unlikely.
The point to be made here is that this particular analysis of the data inputs is not completely
wrong. The correlation value comes to quite a significant 0.75, which is good given the
conditions. The error in this case is very high, indicating substantial individual errors. As
stated for the previous model, there is no cap or range of acceptable error, since error can
increase with the number of data inputs; the most reliable measure is the correlation value.
Going by that, this predicted data set is still a moderately good output. This is discussed
further in the error analysis.
5.4.2 Time Series Graph
In Figure 5.8, as in the first case study, each data point is represented by both its actual value
and its predicted value for all three input subsets. The time series formulation again treats
each value as an index of a particular month, used as the corresponding point for the next
month's value. The graph also shows the error for each predicted value; the error analysis is
described in detail in the next section.
5.4.3.1 Error Histogram
The error histogram in Figure 5.9 is slightly skewed to the left of the zero error line, as for
the majority of points the predicted value exceeds the actual value; negative errors dominate
the data sets. The near-zero bin is also wider in this case, about -75 to 75, and the majority of
data inputs show an error range of -1000 to 1000. This is higher than in the previous model,
for the reasons discussed earlier.
Another interesting point arising from the high error values is the data scattering, or noise. In
this model the noise has a very high percentage, exceeding the considerable limit of 20%,
which is yet another reason the error obtained is large. Data scattering is one of the main
obstacles to obtaining a uniform correlation equation between the input data sets, and it also
causes the skewing of the histogram above. The noise could not be reduced by data
normalization in this case because of the existing data gaps. This caused a "Pseudo Effect",
in which a parametric correlation equation was formed between input and output that took
the intermediate zero values into account: the model formed a decreasing gradient for value
estimation, whereas the subsequent values did not follow that trend.
Fig 5.10: Error Auto-Correlation Graph of Case Study 2
The Mean Squared Error (MSE) comes to 7.17 × 10⁶, which is reasonably high; the MSE
gives the mean squared deviation of the error from 0. The Sum of Squared Errors (SSE)
comes to about 1.65 × 10⁹. As the error autocorrelation graph in Figure 5.10 shows, the error
deviations still lie within the considerable ranges marked by the confidence limit lines. This
reasserts the earlier discussion: high error values do not necessarily signify wrong prediction.
The confidence limits show that, given the inputs and their inherent shortcomings, the
predicted values lie within range, proving that the model works reasonably well even with
such discrepancies.
From the predicted data, the model worked moderately well for the given data set of the
second project: the correlation 'r' value comes to 0.75 while the MSE is 7.17 × 10⁶. The
estimates have prediction difficulties which can be eliminated with better data inputs and no
data gaps.
The parameters for estimation and prediction were the 4M resources: Manpower, Machinery,
Materials and Money, estimated from the essential activities over the entire project duration.
In this 2nd case study the activities were excavation, piling, reinforcement and concreting;
the finalized weights for the above parameters, in the respective order, were 12.666, 12.902,
6.514 and 1.307, with a mean squared error of 7.17 × 10⁶.
5.5 Comparative Analysis
As stated in the individual results of the two projects, the first prediction worked extremely
well while the second had certain issues. These issues do not represent the entire spectrum of
problems that may arise in ANN models. However, in the resource analysis of large
construction projects, the underlying errors in judgement arise from the problems stated
above, and in trying to assess these misinterpretations of data, certain core essentials of the
model's working come into the picture.
Looked at closely, these prediction errors or misinterpretations do indeed follow a certain
pattern, with traceable symptoms. These mechanisms need to be understood in order to fulfil
the desired goals of the model. The main reasons can be explained as follows:
When we do that, it creates a problem for the prediction algorithm (in this case the LM
algorithm), which tries to formulate a pattern including the 0 values. This creates a tendency
for the algorithm to generate lower or random values in place of the 0 values.
Uniformity of data is a must in order to understand and predict accurate values. The problem
of data gaps cannot be accommodated in the pre-defined algorithms, and it is necessary to
understand that data normalization is not a solution in this case: it cannot solve the problem
of missing data.
5.5.4 Volume of Data-Set
One of the most important requirements in obtaining data for these models is the bulk of the
data. For a model to work well and predict nearly accurate outputs, it needs a large data set to
train from: the larger the data set, the better the predictive power of the algorithm. Since the
data set is divided into three subsets, namely the training, validation and testing sets, a large
volume of data needs to be assigned to the training part of the model.
In the first model the data set is much larger than in the second. The first project had a 5-year
timeline with 5 different crucial activities, creating a larger data set with more meaningful
insight into the pattern of data variation. The second project had a timeline of a little more
than 3 years, or 40 months, with only excavation, piling, reinforcement and concreting as
essential activities; moreover, reinforcement and concreting formed only a part of the entire
operation, which further reduced the acceptable data inputs.
Another interesting point is the error generated between outputs and targets. As stated briefly
earlier, a large data set tends to create a larger cumulative error, for the obvious reason that
each additional data point contributes its own error value. But this relation is not
proportional. Two things must be kept clear: firstly, a large data set creates a higher number
of individual errors; secondly, a large data set also allows a better fit for the resulting
prediction pattern. These two mechanisms act as counteracting forces against each other,
creating a more stable neural pattern. Since the algorithm works on the principle of reducing
error by adjusting the pattern weights, and not the other way round, the larger the data set,
the better the predictability of the algorithm: although more individual errors are created,
their magnitudes become negligible, thereby reducing the overall cumulative error.
CHAPTER 6
CONCLUSION
From the two distinct iterative processes, using separate data from two individual sites and
applying a Nonlinear Autoregressive Neural Network, certain characteristic observations
were obtained. General observations on data representation, prediction, misinterpretation and
error generation were made, which may be applicable to all construction projects. Better
normalization of data, more innovative algorithms, larger data sets and more convergent data
inputs would certainly have yielded better results. However, given the persistence required in
manually running each model with varying sets of parameters, the best obtainable output has
been achieved. The observations noted in the process of training, validating and testing the
neural network are as follows:
1. The parameters for estimation and prediction were the 4M resources: Manpower,
Machinery, Materials and Money. These resources were estimated from the essential
activities over the entire project duration. In the 1st case study the activities were road
laying, excavation and backfilling, masonry, reinforcement and concreting; in the 2nd
case study they were excavation, piling, reinforcement and concreting.
2. The finalized weights for the parameters of the case studies were obtained. In the 1st
case study, the weights in order of the above parameters were 5.1159, 3.0205, 2.2348,
1.087 and 0.5289, with a mean squared error of 170.85. In the 2nd case study the
weights were 12.666, 12.902, 6.514 and 1.307, with a mean squared error of
7.17 × 10⁶.
3. The main determinants of the success of the tests are the mean squared error (MSE)
and the correlation coefficient (r). The best correlation was obtained at an average
value of 0.997 across all three phases combined, in the first model. The correlation
coefficient of the second model comes to a decent 0.75, a moderately suitable
prediction; that analysis does not provide accurate prediction but brings certain data
shortcomings into the picture.
4. The model can be used to accurately predict the resource requirements of partially
completed projects from the data of work already completed. It can help predict
resource volumes for any specific site in accordance with its historical data, capturing
the slight nuances and differences of each individual site with precision.