Kim2009

Probabilistic Forecasting of Project Duration Using Bayesian
Inference and the Beta Distribution

Byung-cheol Kim1 and Kenneth F. Reinschmidt2
Abstract: Reliable forecasting is instrumental in successful project management. In order to ensure the successful completion of a
project, the project manager constantly monitors actual performance and updates the current predictions of project duration and cost at
completion. This study introduces a new probabilistic forecasting method for schedule performance control and risk management of
on-going projects. The Bayesian betaS-curve method (BBM) is based on Bayesian inference and the beta distribution. The BBM provides
confidence bounds on predictions, which can be used to determine the range of potential outcomes and the probability of success.
Furthermore, it can be applied from the outset of a project by integrating prior performance information (i.e., the original estimate of
project duration) with observations of new actual performance. A comparative study reveals that the BBM provides, early in the project,
much more accurate forecasts than the earned value method or the earned schedule method and as accurate forecasts as the critical path
method without analyzing activity-level technical data.
DOI: 10.1061/(ASCE)0733-9364(2009)135:3(178)
CE Database subject headings: Forecasting; Scheduling; Bayesian analysis; Construction management.
1 Assistant Professor of Civil Engineering, Ohio Univ., 114 Stocker Center, Athens, OH 45701-2927. E-mail: kimb@ohio.edu
2 Professor of Civil Engineering and J. L. “Corky” Frank/Marathon Ashland Petroleum LLC Chair in Engineering Project Management, Zachry Dept. of Civil Engineering, Texas A&M Univ., 3136 TAMU, College Station, TX 77843-3136.
Note. Discussion open until August 1, 2009. Separate discussions must be submitted for individual papers. The manuscript for this paper was submitted for review and possible publication on January 4, 2008; approved on October 15, 2008. This paper is part of the Journal of Construction Engineering and Management, Vol. 135, No. 3, March 1, 2009. ©ASCE, ISSN 0733-9364/2009/3-178–186/$25.00.

Introduction

Forecasting is an essential element of project management throughout the life cycle of a project. Reliable forecasts are critical once a project gets started because, even with a detailed plan, there are inherent risk factors that may influence the actual performance of a project. As a result, the project manager constantly seeks leading indicators for potential problems so that appropriate actions can be taken in a timely manner. That is, a current deviation from the plan serves as an early indicator of potential deviation of the project duration and cost at completion from the project’s objectives.

This paper focuses on the probabilistic schedule forecasting of ongoing projects. In the project management community, two common practices for performance forecasting are (1) to use the earned value method (EVM) for cost and schedule forecasting and (2) to use the EVM for cost forecasting and the critical path method (CPM) for schedule forecasting (Fleming and Koppelman 2006). CPM is based on activity-level time estimates and precedence relationships among the activities in the project network schedule. In spite of its mathematical simplicity, applying CPM to large, complex projects requires extensive and detailed technical knowledge of all activities and the logical precedences among them. On the other hand, EVM relies only on summary project-level cumulative progress data and provides a systematic way of measuring, analyzing, and communicating the actual performance of a project. EVM also has the advantage of being universally applicable over a wide range of project types and sizes because every project, no matter how large or complex, is represented by three functions: the planned value (PV or the budgeted cost of work scheduled), the earned value (EV or the budgeted cost of work performed), and the actual cost (AC or the actual cost of work performed).

However, these methods are fundamentally deterministic and fail to account for the inherent uncertainty in project planning, execution, and performance measurement. The need for forecasting arises only when there is uncertainty about the future. Furthermore, uncertainty is an important factor in planning and making decisions about the future progress of a project. However, conventional methods such as EVM and CPM do not provide confidence bounds on predictions, which are essential to effective decision making under uncertainty. The program evaluation and review technique (PERT) provides a simplified probabilistic evaluation of project duration. However, it has been criticized for bias due to neglecting the influence of near-critical paths and its assumption of the independence of activities.

The objective of this paper is to present a new probabilistic forecasting method for project duration, which relies only on summary project-level performance information and updates the current prediction of project duration in the light of new performance information. The ultimate goal of project performance forecasting is to provide decision makers with objective and refined forecasts in a timely manner. However, actual performance data, which are probably the most objective and reliable source of predictive performance information, are limited early in the project. Therefore, a major challenge in project performance forecasting is to make use of subjective judgment or prior knowledge to overcome the lack of measured performance data to work with (Gardoni et al. 2007) during the early phase of a project.

The method presented in this paper, Bayesian betaS-curve method (BBM), is developed based on Bayesian inference and a curve fitting technique using the beta distribution. The BBM is a
178 / JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT © ASCE / MARCH 2009
J. Constr. Eng. Manage. 2009.135:178-186.

Table 1. Typical Approaches for Ongoing Project Forecasting

Category I. The original estimate
  Description:
  • Actual performance data are atypical.
  • Typically early in a project.
  Examples: CPM

Category II. A new estimate
  Description:
  • Original estimates are not reliable and actual performance data are atypical.
  • Additional time and cost.
  • Typically during the closing phase.
  Examples: Bottom up estimating, check list

Category III. The original estimate modified by past performance data
  Description:
  • Actual performance data are typical.
  • Applicable only when significant data are available
  Examples: EVM, regression models

regression model that fits an S-curve function to cumulative progress curves of a project and updates the parameter estimates of the S-curve using a Bayesian inference approach. Recently, two new models based on this concept have been reported (Gardoni et al. 2007; Kim and Reinschmidt 2007). In those models, actual performance records are fitted by a single or a group of rather simple S-curve functions with two parameters. As a result, their efficiency depends on the existence of S-curve functions that fit a specific progress pattern with an acceptable accuracy. The BBM in this paper exploits the potential of the combined use of Bayesian inference and S-curve functions in project performance forecasting, with emphasis on ease of implementation and robustness in quantifying various prior performance information from various sources, such as project plans, historical data, and subjective judgments.

This paper is organized as follows. The next section reviews conventional project performance forecasting methods. In the subsequent section, two component methodologies, S-curve models and Bayesian inference, are reviewed. The BBM is then formulated based on the general framework of the Bayesian adaptive forecasting method and a curve fitting technique using the beta distribution. Numerical examples using real project data are presented in order to demonstrate the predictive properties of the BBM. In addition, the life-cycle forecasting accuracy of the BBM is evaluated and compared against a state-of-the-art EVM schedule forecasting method and the CPM.

Review of Project Performance Forecasting Methods

Typical forecasting approaches to update the original estimates with actual performance data from an ongoing project can be grouped into three categories, depending on the decision maker’s perception of the relationship between past and future performance (PMI 2004). Table 1 summarizes basic properties and examples of the three categories. Approaches in Categories I and II are valid only when actual performance data observed from a project are considered irrelevant to the future performance of remaining jobs. The Category I approach can be applied when the original estimates are still believed reliable. For example, the CPM updates the project duration at completion given delays in some critical path activities in the past, but typically assumes that the causes that affected past activities will not affect future ones. When the remaining jobs are considered as a new project, the Category II approach can be applied.

Category III methods address the situations in which project duration and cost at completion are updated using both the original estimate and actual performance data up to the time of forecasting. A typical case of the Category III forecasting is the earned value method. In EVM, the schedule and cost performance of a project are analyzed in terms of four performance indicators: (1) cost variance (CV) = EV − AC; (2) cost performance index (CPI) = EV/AC; (3) schedule variance (SV) = EV − PV; and (4) schedule performance index (SPI) = EV/PV. The standard EVM prediction for the project duration and cost at completion rests on the assumption that the cumulative performance indices [SPI(t) = EV(t)/PV(t) and CPI(t) = EV(t)/AC(t)] will represent the performance efficiency of the jobs in the future. The estimate at completion (EAC) at time t is then equal to the cost already spent (AC) plus the adjusted cost for the remaining work

$$\mathrm{EAC}(t) = \mathrm{AC}(t) + \frac{\mathrm{BAC} - \mathrm{EV}(t)}{\mathrm{CPI}(t)} = \frac{\mathrm{BAC}}{\mathrm{CPI}(t)} \tag{1}$$

where BAC = budget at completion. Although the use of EVM forecasting formulas for cost performance has been supported widely, many modified versions of the standard formula have been suggested (Anbari 2003; Christensen 1993). However, those methods are typically linear extrapolations, assuming, for example, that the computed CPI, which has changed in the past (or it would always equal 1.00), will not change in the future.

On the other hand, forecasting project duration using the cumulative planned value (PV) and earned value (EV) has been criticized for systematic distortion in results (Lipke 2003; Vandevoorde and Vanhoucke 2006). To improve schedule forecasting with EVM, several modified forecasting formulas have been suggested (Anbari 2003; Lipke 2003). Recently, Vanhoucke and Vandevoorde (2007) conducted a comprehensive comparative study against three EVM-based schedule forecasting methods in the literature and reported that the earned schedule method (Lipke 2003) outperforms, on the average, other methods. In the earned schedule method (ESM), the earned schedule at time t, ES(t), is defined as the planned time to achieve the current earned value EV(t). The estimated duration at completion at time t, EDAC(t), is then calculated as

$$\mathrm{EDAC}(t) = t + \frac{\mathrm{PD} - \mathrm{ES}(t)}{\mathrm{ES}(t)/t} \tag{2}$$

where PD = planned project duration.

However, application of these present methods to schedule forecasting of ongoing projects has some limitations. First, all these methods are deterministic and fail to provide confidence bounds on predictions. The level of uncertainty in forecasts may influence decisions about planning and controlling projects. In addition, most methods forecast the final outcome of a project based on the assumption that the current status measurements are accurate and without any errors, which is unrealistic because there are inherent errors in both measuring time to report and measuring performance at the reporting times. Furthermore, EVM forecasting formulas are not recommended early in a project because of large prediction errors due to few reporting intervals to

sum over (small sample size) (Fleming and Koppelman 2006). The stabilization period that is required to get reliable EVM forecasts may change from one project to another and no study has been done about how to determine in advance the right time, for a specific project, after which the current EVM forecasting formulas can be applied with acceptable reliability.

Methodologies

The Bayesian betaS-curve method is developed based on two well-known approaches in project management and engineering: S-curve models and Bayesian inference. This section reviews these two methodologies with an emphasis on their fundamental properties that characterize the BBM.

S-Curve Models

Typical cumulative progress curves of projects show S-like patterns, regardless of the unit of measurement; for example, cumulative costs, labor hours, or percentage of work (PMI 2004). S-curve representations have long been widely used in the management of real projects. In project management, S-curves, which are also called progress curves (Schexnayder and Mayo 2003), represent cumulative progress over time, which represents the amount of work done or to be done by a specific time throughout the execution period. In the construction industry, S-curves have been used as a graphical tool for measuring progress (Barraza et al. 2000; Miskawi 1989), as control limits based on early start and late start dates of activities (Schexnayder and Mayo 2003), and as a cash flow forecasting tool at relatively early stages of the project life cycle (Fellows et al. 2002). Touran et al. (2004) introduced a financial model for measuring the effect of new U.S. Department of Transportation prompt pay provisions, in which a beta distribution function was used for the expenditure S-curve.

However, previous studies of the use of the S-curve as a quantitative performance management tool for ongoing projects, not just visual displays, are very limited. Murmis (1997) generated a symmetric S-curve from a normal distribution and forced it to pass fixed points of the cumulative progress curve in order to detect problems in project performance. Cioffi (2005) modified a typical sigmoid curve used frequently in ecology to develop a more flexible S-curve. Barraza et al. (2000, 2004) tried to use a set of S-curves generated from a network-based simulation as a visual project control tool and extended the concept to a probabilistic forecasting method by adjusting the parameters of probability distributions of future activities with performance indices (the CPI and SPI in the EVM) of finished activities. Useful as they are, these previous works are not suitable to the objective of probabilistically forecasting at-completion project duration of ongoing projects, largely because of poor flexibility of the suggested S-curves and the lack of a mathematically sound forecasting framework based on actual performance information available at the time of forecasting.

Bayesian Inference

Bayesian inference provides a systematic way of updating knowledge in the light of new observations (Gelman et al. 2003). A Bayesian approach also provides a systematic way of combining all pertinent information from various sources in terms of probability distributions. If a project manager has an initial estimate of project progress (that is, a project plan) and if this progress curve is fitted to some known model with associated parameters (Θ), the project manager’s belief in the individual model parameters can be updated with actual performance data (D) as the project proceeds. Bayes’ law for this case can be written as

$$P(\Theta \mid D) = \frac{P(D \mid \Theta)\,P(\Theta)}{P(D)} \tag{3}$$

where P(Θ) = prior distribution reflecting the belief in the parameters before observing new outcomes; P(D | Θ) = conditional probability that the particular outcomes D would be observed, given the parameters Θ; P(D) = marginal distribution of the observables D; and P(Θ | D) = posterior distribution of the parameters Θ given that the outcomes D were observed.

Formulation of the Bayesian BetaS-Curve Method

Assumptions and the General Framework of Bayesian Adaptive Forecasting Using S-Curve Functions

Every project is unique and can be characterized by a unique S-curve (Kenley and Wilson 1986). The Bayesian adaptive forecasting method is developed based on the premise that every project proceeds following a characteristic cumulative progress curve (Kim 2007). The underlying strategy of the approach is to identify the characteristic progress curve of a project using the prior performance information and to use that curve, which is named the progress curve template, to forecast future progress of the project.

A reasonable way, probably the most reliable way, of obtaining the progress curve template is to construct it from the detailed plans of the project itself. For example, the planned value distributed over time, which is used as the baseline of project performance analysis in the EVM, can be directly used as a progress curve template for forecasting. Once the progress curve template for a project is determined, it is approximated by fitting some S-curve models. If a model is found that fits the progress curve template acceptably well, it is used to forecast future performance of the project by updating the parameters of the model based on actual performance data. This approach is valid under the assumption that an actual progress curve of an ongoing project or job will follow the same S-curve model that approximates the progress curve template. In projects, if the actual progress deviates significantly from the plan, then something is amiss with the project.

The general framework of the Bayesian adaptive forecasting method consists of three steps: (1) generating prior distributions of model parameters; (2) updating model parameters based on reported data; and (3) using the updated model for forecasting. Once the prior performance information of a project is developed, the information is used to generate the corresponding prior distributions of model parameters. The periodic performance data during execution are then used to revise prior beliefs on the model parameters through Bayesian inference. Finally, updated distributions of model parameters are used to forecast meaningful project information such as the estimated duration at completion (EDAC), the range of possible outcomes, and the probability of meeting the planned project duration. Based on the assumptions and the general framework of the Bayesian adaptive forecasting method, the Bayesian betaS-curve method is developed. The remaining part of this section discusses the definition of the betaS-curve function and its application to the formulation of the BBM.
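
The three-step framework can be illustrated with a minimal discrete sketch of Bayes' law in Eq. (3). It is only an illustration under stated assumptions: the three candidate durations, the prior weights, and the likelihood values are hypothetical numbers chosen for the example, and the function name `bayes_update` is not from the paper.

```python
def bayes_update(prior, likelihood):
    """Discrete form of Bayes' law: posterior is proportional to likelihood x prior."""
    joint = {theta: likelihood[theta] * p for theta, p in prior.items()}
    evidence = sum(joint.values())  # P(D), the marginal probability of the data
    return {theta: value / evidence for theta, value in joint.items()}


# Step 1: prior belief over three hypothetical candidate durations (months).
prior = {22: 0.25, 24: 0.50, 26: 0.25}

# Step 2: hypothetical likelihoods P(D | duration) for one progress report
# that suggests the project is running late.
likelihood = {22: 0.1, 24: 0.3, 26: 0.6}
posterior = bayes_update(prior, likelihood)

# Step 3: forecast from the updated distribution, here the posterior mean.
expected_duration = sum(t * p for t, p in posterior.items())
print(posterior, expected_duration)
```

Because the report favors the longest candidate, the posterior shifts weight toward 26 months and the expected duration moves above the 24-month prior mode. Repeating the update with each new report is what makes the forecast adaptive.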

BetaS-Curve Function

The betaS-curve function is an S-curve function derived based on the beta distribution. The beta distribution has a long history of application in engineering and project management (AbouRizk et al. 1991; Malcolm et al. 1959; Perry and Greig 1975). The primary advantage of applying the beta distribution is the fact that the beta distribution can generate a wide range of shapes with only two parameters (Touran et al. 2004). In statistics, the beta distribution is a continuous probability distribution on a finite interval A to B with two shape parameters α and β. The probability density function (PDF) of a beta random variable X is

$$f(x;\alpha,\beta) = \frac{1}{B(\alpha,\beta)}\,\frac{(x-A)^{\alpha-1}(B-x)^{\beta-1}}{(B-A)^{\alpha+\beta-1}}, \qquad \alpha>0,\ \beta>0,\ A \le x \le B \tag{4}$$

where B(α, β) = beta function

$$B(\alpha,\beta) = \int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\,dt \tag{5}$$

The cumulative distribution function of the beta distribution is

$$F(x;\alpha,\beta) = \frac{B\big((x-A)/(B-A);\alpha,\beta\big)}{B(\alpha,\beta)} \tag{6}$$

where B((x − A)/(B − A); α, β) = incomplete beta function, which is defined as

$$B(s;\alpha,\beta) = \int_0^s t^{\alpha-1}(1-t)^{\beta-1}\,dt \tag{7}$$

The betaS-curve function is defined over a range [0, T], where T represents the project duration. The ranges of the two shape parameters are restricted to α ≥ 1 and β ≥ 1 in order to confine the plausible solution space to S-curves with unimodal PDF. The unimodal shape resembles the typical resource level distribution of projects during the execution period. It should be noted that a uniform PDF based on α = 1 and β = 1 is also included in the solution space. In addition, a new parameter m is introduced to represent the location of the mode. The betaS-curve function is then defined as

$$\text{BetaS-curve}(x;\alpha,m,T) = \frac{B(x/T;\alpha,\beta)}{B(\alpha,\beta)} \tag{8}$$

where α ≥ 1, 0 < m < 1, T > 0, and β = (α − 1)/m − (α − 2).

Input Elements

The primary information that should be relied on for project performance forecasting is the past performance data observed in the project itself. Early in a project, however, project managers may suffer from a lack of sufficient actual performance data to make reliable forecasts, resulting in deferring any judgment about performance control at the risk of missing the opportune time to take appropriate corrective actions. Therefore, for a method based on actual performance data to be also applicable during the early phase of a project, the method should have an adaptive nature that updates an original estimate developed during the planning phase in the light of new performance data reported periodically during the execution phase. The Bayesian betaS-curve method makes use of all relevant performance information available from standard project management practices and theories. Information used in the method can be grouped into two categories: the prior performance information and the actual performance data. The prior performance information is defined as all relevant performance information other than actual performance data, which is available even before the inception of a project.

In the BBM, the prior performance information consists of two elements: the prior probability distribution of project duration and the progress curve template. Fig. 1 shows these elements in a graphical way. First, the prior distribution of project duration represents the best probabilistic estimate of the project duration, which is made before observing actual performance data. Common probabilistic schedule planning methods such as PERT (Malcolm et al. 1959) and network-based simulation (Lee 2005) can be used to generate the distribution of project duration. In the absence of detailed project plans, a subjective probability estimate can be made using subjective estimating methods such as three-point estimates and range estimating techniques. The second element of prior performance information, the progress curve template, represents the prior knowledge of the project manager and project engineers about the plausible progress pattern of the actual performance. The progress curve template of a project represents the characteristics of the project in terms of the cumulative progress over time.

Fig. 1. Two elements of the prior performance information and the actual performance data (figure labels: Progress ($), BAC, progress curve template, actual performance data, prior probability distribution of the project duration; time axis marked t and PDAC)

Step 1: Generation of Prior Distributions of BetaS-Curve Parameters

In the Bayesian approach, the three parameters (α, m, and T) of the betaS-curve function are not single valued and deterministic, but rather are random variables themselves, with their own probability distributions. Depending on the types of prior information available and the level of confidence a decision maker puts on that information, different types of priors can be used for the betaS-curve parameters. Table 2 summarizes the types of prior distributions recommended in the Bayesian betaS-curve method.

Because the duration parameter (T) explicitly represents the project duration, probabilistic, not single-point, priors should be used in order to get probabilistic predictions of the completion date. When reliable probabilistic estimates of project duration are available prior to the start of a project, informative prior distribution of T can be used. Otherwise, a noninformative prior should be used for parameter T. In this case, predictions are made based only on actual performance reports from the project.

For the shape parameters, α and m, both single-point estimates and probabilistic estimates can be used, depending on the types of information used in forecasting. For example, when a single

Table 2. Types of Prior Distribution for the BetaS-Curve Parameters

Duration parameter (T)
  Informative prior:
  • Recommended when prior performance information is considered relevant to actual performance.
  • E.g., T ∼ PERT(optimistic, most likely, pessimistic); T ∼ normal(μ, σ²).
  Noninformative prior:
  • Recommended when prior performance information is not relevant to actual performance.
  • Use uniform distribution over a reasonable range, e.g., T ∼ uniform(0, 3 * PD).

Shape parameters (α and m)
  Informative prior:
  • Recommended when prior performance information is considered relevant to actual performance.
  • E.g., α ∼ normal(μ, σ²) and m ∼ normal(μ, σ²) for probabilistic priors. α ∼ α̂ and m ∼ m̂ for single-point priors.
  Noninformative prior:
  • Recommended when no information about plausible progress curve is available.
  • E.g., α ∼ uniform(1, 9), m ∼ uniform(0, 1).

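
The prior types in Table 2 can be sketched with Python's standard `random` module. This is a hedged illustration, not the authors' implementation: the paper does not spell out its PERT distribution, so the common beta-PERT form (weight 4 on the most likely value) is assumed here, and the function names and the 80% to 140% range used for the informative prior are illustrative choices.

```python
import random


def beta_from_mode(alpha, m):
    """Second shape parameter implied by alpha and the mode location m."""
    return (alpha - 1.0) / m - (alpha - 2.0)


def sample_pert(rng, optimistic, most_likely, pessimistic):
    """Draw from a beta-PERT distribution on [optimistic, pessimistic].

    Assumption: the common beta-PERT form with weight 4 on the most likely value.
    """
    span = pessimistic - optimistic
    a = 1.0 + 4.0 * (most_likely - optimistic) / span
    b = 1.0 + 4.0 * (pessimistic - most_likely) / span
    return optimistic + span * rng.betavariate(a, b)


def sample_prior(rng, planned_duration, informative_T=True):
    """Draw one parameter set (alpha, m, T) from priors of the kinds in Table 2."""
    if informative_T:
        # Illustrative range: 80% to 140% of the planned duration.
        T = sample_pert(rng, 0.8 * planned_duration, planned_duration,
                        1.4 * planned_duration)
    else:
        T = rng.uniform(0.0, 3.0 * planned_duration)  # T ~ uniform(0, 3 * PD)
    alpha = rng.uniform(1.0, 9.0)                      # alpha ~ uniform(1, 9)
    m = rng.uniform(1e-6, 1.0 - 1e-6)                  # m ~ uniform(0, 1), kept off the endpoints
    return alpha, m, T


rng = random.Random(42)
for alpha, m, T in (sample_prior(rng, planned_duration=24.0) for _ in range(5)):
    print(round(alpha, 2), round(m, 2), round(T, 1), round(beta_from_mode(alpha, m), 2))
```

Note that β = (α − 1)/m − (α − 2) can be rewritten as (α − 1)(1/m − 1) + 1, so any draw with α ≥ 1 and 0 < m < 1 automatically satisfies the β ≥ 1 restriction of the betaS-curve function.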
planned progress curve is used, single-point estimates of α and m can be obtained by a common curve fitting technique. This approach is fairly simple and applicable to most projects that are planned and monitored with cumulative progress curves. When there is a network schedule and the project manager can extract probabilistic estimates of activity durations, a more refined approach can be used to build probability distributions of α and m (Kim 2007). The method proceeds through three steps: (1) given a network schedule and probabilistic estimates of activity durations, generate a large sample of potential progress curves, which are collectively called stochastic S-curves of the project (Barraza et al. 2000, 2004), using a simulation approach; (2) for each of the stochastic S-curves, calculate the best-fit parameters (α, m, T) using a betaS-curve function; (3) repeat this fitting process to all the random S-curves generated and obtain a set of marginal probability distributions of the shape parameters, p(α), p(m), p(T), along with the correlation coefficients between them. With this method, the stochastic nature of the progress curves of a project can be quantified in a systematic way and represented as a set of prior probability distributions of the betaS-curve parameters. It should be noted that information used for prior distribution generation is not limited to documented plans for a specific project. Historical data from similar projects contain valuable information about possible outcomes of a future project and, adjusted with appropriate professional judgments, a project manager may be able to develop useful prior distributions.

Step 2: Parameter Updating with Bayesian Inference

Because the shape parameters are random variables themselves, estimates of their numerical values can and should be revised whenever any new information becomes available. With the betaS-curve function, the Bayesian updating process in Eq. (3) is formulated as follows. Let Θ denote the set of parameters {α, m, T}. The parameters are chosen independently so that the prior probability distribution of the parameter set is represented as

$$p(\Theta) = p(\alpha)\,p(m)\,p(T) \tag{9}$$

Once a project gets started, actual progress is reported periodically and the data can be represented as a series of discrete values D

$$D: (w_i, t_i), \quad i = 1,\ldots,N \tag{10}$$

where w_i represents the cumulative progress reported at time t_i; and N = number of records up to the time of forecasting.

The likelihood of the data conditional on the parameters chosen is measured based on the deviations between the actual times of performance reporting and the planned times determined by the betaS-curve parameters T(w_i | Θ). The goal is to seek a set of parameters that make the deviations normally distributed with zero mean and standard deviation σ. It is assumed that the random errors corresponding to different observations are uncorrelated. The likelihood of the data conditional on the parameters can then be calculated as the product of the likelihood of each observation

$$p(D \mid \Theta) = \prod_{i=1}^{N} p(t_i, w_i \mid \Theta) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{t_i - T(w_i \mid \Theta)}{\sigma}\right)^{2}\right] \tag{11}$$

The value of σ is determined by the forecaster to adjust the sensitivity of predictions to the actual data reported.

The marginal distribution of the observed actual progress D is determined from

$$p(D) = \int p(D,\Theta)\,d\Theta \tag{12}$$

where the joint probability distribution of data and parameters is constructed from Eqs. (9) and (11) as p(D, Θ) = p(D | Θ)p(Θ).

The goal of the Bayesian updating is to obtain a revised or posterior marginal distribution of each model parameter conditional on the observed data. Using fundamental properties of conditional distributions, the posterior marginal distribution of a parameter, for example α, can be derived by integrating the joint parameter distribution conditional on the observed data, which is determined from Eqs. (9)–(12), with respect to the remaining parameters Θ_{−α} = {m, T}

$$p(\alpha \mid D) = \int p(\Theta \mid D)\,d\Theta_{-\alpha} \tag{13}$$

It should be noted that computing the posterior distributions derived above requires multifold integration over the parameters used in the analysis. In this work, a Monte Carlo integration technique has been successfully applied without resorting to more sophisticated methods such as importance sampling (Gardoni et al. 2007) and Markov chain Monte Carlo method (Ghosh et al. 2007) that consume large amounts of computer time.

Numerical Examples

The Bayesian betaS-curve method has been implemented for the convenience of the spreadsheet user as an add-in for Microsoft Excel and applied here to two examples of ongoing projects being monitored with monthly progress reports. Fig. 2 shows the

Fig. 2. Progress curves of the example projects: (a) Project A; (b) Project B. Axes: Month versus % Complete; plotted series: Planned, Actual, and Planned Fit; annotations: Planned, 24 (Project A) and Planned, 25 (Project B).
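
The updating step of Eqs. (9)–(13) can be sketched as a simple Monte Carlo integration: draw the duration T from its prior, weight each draw by the likelihood of Eq. (11), and summarize the weighted draws. This is a minimal sketch under stated assumptions, not the Excel add-in described in the paper: the shape parameters are held at single-point values (α = 3, m = 0.5), the two progress reports are hypothetical, σ = 1 month is an arbitrary choice, and the incomplete beta function is evaluated by Simpson's rule to keep the example free of third-party libraries.

```python
import math
import random


def regularized_beta(s, a, b, n=400):
    """Normalized incomplete beta B(s; a, b) / B(a, b), by Simpson's rule.

    Adequate for a >= 1 and b >= 1, where the integrand is bounded.
    """
    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    def simpson(upper):
        h = upper / n  # n is even
        acc = integrand(0.0) + integrand(upper)
        for k in range(1, n):
            acc += (4.0 if k % 2 else 2.0) * integrand(k * h)
        return acc * h / 3.0

    return simpson(s) / simpson(1.0)


def planned_fraction(w, a, b):
    """Fraction of the duration T at which the S-curve reaches progress w (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if regularized_beta(mid, a, b) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def posterior_duration(data, alpha, m, sigma, sample_T, n_samples=5000, seed=0):
    """Posterior mean and 10%/90% bounds of T by prior sampling and likelihood weighting."""
    rng = random.Random(seed)
    beta = (alpha - 1.0) / m - (alpha - 2.0)
    fractions = [planned_fraction(w, alpha, beta) for w, _ in data]
    weighted = []
    for _ in range(n_samples):
        T = sample_T(rng)  # draw from the prior p(T)
        # Log of Eq. (11); the constant 1/(sqrt(2*pi)*sigma) cancels on normalization.
        log_like = sum(-0.5 * ((t - T * u) / sigma) ** 2
                       for (_, t), u in zip(data, fractions))
        weighted.append((T, math.exp(log_like)))
    total = sum(wt for _, wt in weighted)
    mean = sum(T * wt for T, wt in weighted) / total

    def quantile(q):
        cumulative = 0.0
        for T, wt in sorted(weighted):
            cumulative += wt
            if cumulative >= q * total:
                return T
        return max(T for T, _ in weighted)

    return mean, quantile(0.10), quantile(0.90)


# Hypothetical reports: (cumulative progress fraction, month reported).
reports = [(0.10, 6.0), (0.50, 12.0)]
mean, lower, upper = posterior_duration(
    reports, alpha=3.0, m=0.5, sigma=1.0,
    sample_T=lambda rng: rng.uniform(0.0, 72.0),  # noninformative uniform(0, 3 * PD), PD = 24
)
print(round(mean, 1), round(lower, 1), round(upper, 1))
```

Holding α and m fixed means the planned fraction of the duration for each reported progress value is computed once, so each prior draw costs only a few arithmetic operations. Treating α and m as random as well would require recomputing those fractions for every draw.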

planned progress (planned value or budgeted cost of work scheduled) and actual progress (earned value or budgeted cost of work performed) curves from the monthly progress reports of the example projects. Project A is an engineering project with a budget over 25 million US dollars. The planned duration of Project A is 24 months and it is slightly ahead of schedule with 58% completion as of the eleventh month. Project A represents an on-schedule project. In contrast, Project B represents a typical case of a project that suffers perpetual schedule delay. Project B is a plant project with a budget over 11 million US dollars and its network schedule consists of over 1,000 activities. Originally scheduled to finish within 25 months, Project B is 68% complete as of the eighteenth month.

The primary objective of these examples is to demonstrate advanced properties of the BBM, such as the range of possible completion dates and the potential advantages of using appropriate prior performance information in combination with actual performance data. To that end, two different prior distributions are used for the project duration parameter T, and the estimated duration at completion (EDAC) is updated every month with a new monthly progress report. The prior distributions used for the betaS-curve parameters are summarized in Table 3. The informative prior for project duration is determined from subjective judgment of possible outcomes of project duration using the PERT range estimates. In this paper, planned project duration is used as the most likely (ML) estimate of the project duration, while the optimistic estimate (O) and pessimistic estimate (P) of the project duration are assumed as 80% and 140%, respectively, of the planned duration. For example, the PERT estimates of Project B are O = 20 months, ML = 25 months, and P = 35 months. On the other hand, a uniform distribution spanning from zero to three times the planned project duration is used to represent a noninformative prior distribution for project duration. For the shape parameters, α and m, least-squares estimates of the best-fit betaS-curve function to the planned progress curves are used.

The BBM is repeatedly applied to the two projects using the two prior cases and the time histories of the EDAC are shown in Figs. 3 and 4. In the graphs, the thick solid line represents the mean of the posterior distribution of the EDAC over the forecasting time. The upper and lower bounds are determined at the 10% confidence level on each side. Therefore, the confidence bounds have 80% probability of including the actual project duration. That is, the lower bound indicates a completion date that the project will exceed with 90% probability.

Forecasts made for Project A are shown in Fig. 3. The results reveal that the use of an informative prior for the duration parameter [Fig. 3(b)] provides a more stable EDAC profile than the noninformative T. On the other hand, when the noninformative prior is used [Fig. 3(a)], the mean of the EDAC responds quickly to early project reports. The results indicate that the same actual data have a stronger influence on the revision of EDAC when a noninformative prior is used. However, influence from prior information diminishes as more data accrue from the project. The diminishing impact of prior information is also found in the profiles of the prediction intervals. Results in Fig. 3 show that the width of the prediction interval narrows, as it should, as more data are observed. However, the rate of narrowing is greater with the noninformative prior than with the PERT estimates of T.

The results for Project B (Fig. 4) show patterns similar to those discussed for Project A and, more importantly, demonstrate the benefits of using the BBM when a project is behind schedule. Obviously, this is the case when project managers start to keep a close eye on their projects. In such cases, the prediction bounds on the EDAC provide an objective, quantitative indicator of the significance of the gap between the planned progress and the actual. For example, the lower bounds in Figs. 4(a and b) indicate that, 10 months after the project start, completing the project within its original duration (25 months) is highly unlikely (the probability falls below 10%). Of course, one can adjust the confidence level according to his or her accepted level of risk, for example, to 95% or to 99%. The point here is that the BBM provides project managers with additional information about schedule risk of their projects, which deterministic methods cannot convey.

It should be noted that a forecast can only be as accurate as the information used. Whether one should use an informative prior or
Table 3. Priors Used in Numerical Examples

            Project A                                                  Project B
Parameter   Informative T                        Noninformative T      Informative T                      Noninformative T
T           T ~ PERT(O = 19.2; ML = 24; P = 34)  T ~ Uniform(0, 72)    T ~ PERT(O = 20; ML = 25; P = 35)  T ~ Uniform(0, 75)
α           α ~ α̂ = 2.00                         α ~ α̂ = 2.00          α ~ α̂ = 3.39                       α ~ α̂ = 3.39
m           m ~ m̂ = 0.47                         m ~ m̂ = 0.47          m ~ m̂ = 0.42                       m ~ m̂ = 0.42
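The range estimates in Table 3 follow from the stated rules (O = 80% and P = 140% of the planned duration, and a uniform prior from zero to three times the planned duration). The sketch below reproduces that arithmetic. The names `DurationPriors` and `make_priors` are hypothetical, and the mean and standard deviation use the classical PERT approximations, (O + 4 ML + P)/6 and (P - O)/6, which are an assumption here and not necessarily the mapping used to build the informative prior in this paper. For Project A the 140% rule gives 33.6 months, which Table 3 lists as 34, presumably after rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurationPriors:
    """Range estimates for the duration parameter T (hypothetical container)."""

    optimistic: float      # O, in months
    most_likely: float     # ML, in months
    pessimistic: float     # P, in months
    uniform_upper: float   # upper limit of the noninformative Uniform(0, upper)

    @property
    def pert_mean(self) -> float:
        # Classical PERT approximation of the mean (assumption, see lead-in).
        return (self.optimistic + 4.0 * self.most_likely + self.pessimistic) / 6.0

    @property
    def pert_std(self) -> float:
        # Classical PERT approximation of the standard deviation (assumption).
        return (self.pessimistic - self.optimistic) / 6.0


def make_priors(planned_duration: float) -> DurationPriors:
    """Apply the 80% / 100% / 140% rule and the 3x uniform range from the text."""
    if planned_duration <= 0.0:
        raise ValueError("planned_duration must be positive")
    return DurationPriors(
        optimistic=0.8 * planned_duration,
        most_likely=planned_duration,
        pessimistic=1.4 * planned_duration,
        uniform_upper=3.0 * planned_duration,
    )


if __name__ == "__main__":
    for name, planned in (("Project A", 24.0), ("Project B", 25.0)):
        priors = make_priors(planned)
        print(name, priors, round(priors.pert_mean, 2), round(priors.pert_std, 2))
```

Keeping the rule in one function makes it easy to test other optimistic and pessimistic percentages when better information than a rough range estimate is available.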

[Figure 3: two panels, (a) using noninformative T and (b) using PERT estimates of T. x-axis: Month (0–25); y-axis: EDAC (15–40 months); curves: EDAC(t), Upper Bound, Lower Bound; reference line: Planned, 24.]

Fig. 3. EDAC(t) of Project A
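The three curves plotted in Figs. 3 and 4 (posterior mean, lower bound, and upper bound of the EDAC) can be read as summaries of a posterior distribution on the duration T. The sketch below computes them for a posterior that has already been evaluated on a sorted grid of candidate durations. It does not implement the BBM likelihood; the function names, the grid, and the weights are hypothetical, chosen only to show how a 10% tail on each side gives an 80% interval and how the probability of finishing within the planned duration is obtained.

```python
from typing import List, Sequence, Tuple


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale nonnegative weights so that they sum to one."""
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def quantile(grid: Sequence[float], probs: Sequence[float], q: float) -> float:
    """Smallest grid value whose cumulative probability reaches q.

    The grid is assumed to be sorted in ascending order.
    """
    cumulative = 0.0
    for t, p in zip(grid, probs):
        cumulative += p
        if cumulative >= q - 1e-12:  # tolerance for floating-point round-off
            return t
    return grid[-1]


def summarize_edac(
    grid: Sequence[float],
    weights: Sequence[float],
    planned: float,
    tail: float = 0.10,
) -> Tuple[float, float, float, float]:
    """Return (mean, lower bound, upper bound, P(T <= planned))."""
    probs = normalize(weights)
    mean = sum(t * p for t, p in zip(grid, probs))
    lower = quantile(grid, probs, tail)        # exceeded with probability ~ 1 - tail
    upper = quantile(grid, probs, 1.0 - tail)  # exceeded with probability ~ tail
    p_within_plan = sum(p for t, p in zip(grid, probs) if t <= planned)
    return mean, lower, upper, p_within_plan


if __name__ == "__main__":
    # Hypothetical posterior on a coarse monthly grid, for illustration only.
    months = [23.0, 24.0, 25.0, 26.0, 27.0]
    posterior_weights = [1.0, 2.0, 4.0, 2.0, 1.0]
    print(summarize_edac(months, posterior_weights, planned=25.0))
```

Under this reading, an overrun warning of the kind marked in Fig. 4 corresponds to the first report at which the probability of finishing within the planned duration drops below the chosen tail level.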
a noninformative prior depends on whether the prior knowledge and beliefs available are considered important and useful in predicting the performance of the current project. The greater the emphasis placed on prior knowledge, the less the emphasis on current data when making project predictions. In practice, an informative prior for the three parameters of the betaS-curve model can be generated from a variety of information sources such as similar projects in the past, in-house project databases, specific project plans, or subjective judgment based on personal experience. It is the project manager’s call to select the most appropriate information source or to combine all information available using his or her own judgment. Although the writers relied on rough PERT estimates for the informative prior distribution of the duration parameter, project managers in real projects may be able to extract valuable information about the plausible distribution of all three betaS-curve parameters using other approaches discussed in previous sections. When reliable information is not available, a noninformative prior should be used so that a prediction is made based only on actual performance data from the current project.

[Figure 4: two panels, (a) using noninformative T and (b) using PERT estimates of T. x-axis: Month (0–25); y-axis: EDAC (15–40 months); curves: EDAC(t), Upper Bound, Lower Bound; reference line: Planned, 25; annotation: Overrun Warning Point.]

Fig. 4. EDAC(t) of Project B

Life-Cycle Accuracy Profile

In this study, the BBM is compared with the earned schedule method [Eq. (2)] and CPM in terms of forecasting accuracy at different stages of a project. Such a life-cycle accuracy profile provides the project manager with useful information about how reliable a prediction is at a specific completion point. In order to get statistically meaningful results, each forecasting method should be independently applied to a large set of sample projects that represent diverse progress patterns in real projects (Kim 2007; Vanhoucke and Vandevoorde 2007). Comparing different forecasting methods based on a limited number of projects does not prove much because with a small sample size it would almost always be possible to find some other contradicting cases. This would be like trying to determine the probability distribution of a general class of coins by flipping only a couple of coins.

In this study, a large set of artificial projects is used to overcome limited and often incomplete real project data. More specifically, 30 projects with 200 activities are generated with a random network generation technique combined with a network-based schedule simulation (Barraza et al. 2000; Kim 2007). Individual projects have different network structures and different levels of schedule complexity. For each project, 100 “actual” progress curves are simulated. Being randomly generated, these 3,000 sets of actual project data are unbiased and independent of each other, representing the range of project outcomes that might be realized in networks of 200 activities. In this example, a network-based simulation technique is used to generate informative prior distributions for the duration parameter T.

In this paper, the life-cycle accuracy profiles of BBM, ESM, and CPM are measured with the mean absolute percentage error (MAPE), which is defined as a function of time t

$$\mathrm{MAPE}(t) = \left(\frac{100}{N}\right) \sum_{i=1}^{N} \frac{\left|\mathrm{APDU}_i - \mathrm{EDAC}_i(t)\right|}{\mathrm{APDU}_i} \tag{14}$$

where APDU_i = actual project duration of random execution i, which is generated by a network-based schedule simulation approach.

[Figure 5: MAPE (y-axis, 0–20%) of the ESM, CPM, and BBM at six evaluation points (x-axis: 5%PD, 10%PD, 20%PD, 30%PD, 40%PD, 90%C). Labeled values visible in the extraction: 17.3%, 12.6%, 7.1%.]

Fig. 5. MAPE at different stages of completion

Fig. 5 shows the MAPE obtained from the 3,000 project observations at six evaluation points. Among the six evaluation points, the first five (5%PD, 10%PD, 20%PD, 30%PD, and 40%PD) correspond to the time points at 5, 10, 20, 30, and 40% of the planned project duration. The last evaluation point (90%C), however, is determined at the time point when actual project progress reaches the 90% complete point of the project.

The results in Fig. 5 can be summarized as follows:
• Early in the project, the ESM has significantly larger errors than the CPM and the BBM. The forecasting accuracy of the ESM improves over time. However, it still takes about 30% of the planned project duration to get comparable MAPE to the other methods.
• Although the ESM gives predictions of project duration that are not accurate early in the project, both the CPM and the BBM have MAPE less than 5% even very early in the project (even at 5% of the planned duration, for example), when reliable forecasts are most valuable. Therefore, both these methods should give, on the average, forecasts of project duration that are accurate enough for use by project managers throughout the project.
• The BBM outperforms, on the average, the CPM at the first two evaluation points. However, the MAPE of the BBM converges to that of the ESM, resulting in slightly larger mean errors than the CPM during the following evaluation points.

Statistical tests revealed that all the MAPEs in Fig. 5 were statistically different from each other at every time period at the α = 0.05 level, except for the BBM and the ESM at 30%PD and 40%PD, which were not significantly different.

The example here does not demonstrate that the proposed method produces more accurate forecasts than the CPM network method. It does demonstrate that the proposed method generates forecasts that are much more accurate than the ESM early in the project and just as accurate as CPM forecasting, while being much easier to use than CPM. It should be noted that the BBM does not require detailed information about the status of individual activities, as the CPM does, or revised durations of future activities. Therefore, it can be concluded that, for the given project, both the CPM and the BBM provide forecasts sufficiently accurate to be usable, and the BBM provides quick, and still much more accurate, predictions than the ESM without drilling down into activity level information.

Note that the predictions of project durations made by all these methods are algorithmic; they combine the information embedded in periodically updated project progress reports with (perhaps) information and beliefs available before the project starts. They do not model the attitudes or psychology of the project manager, owner, contractors, or others engaged in the project, such as what actions these people will take in response to the forecasts, when they will take those actions, and whether those actions will work successfully. In response to the predictions made, some project managers may rebaseline (modify the project duration or BAC) and some may not. Moreover, the predictions themselves may be self-negating. Consider the example in Fig. 4. Suppose that at month 10, the project manager determines that a probability below 10% of finishing within the original duration is unacceptable, and takes action to restore the schedule; subsequently, as a result of these actions, the project gets back on track at 18 months. Does this prove that the forecast at month 10 was manifestly wrong? Or was it right, given that it was this forecast that inspired the project manager to take action, and that without this action the project would have been late? The only way in which the accuracy of predictive methods can be obtained would be if project managers took their hands off the wheel and put the project on autopilot, and this is not going to happen. The only claim for the predictive method proposed in this paper is that it squeezes more information out of a set of project data than other methods can, and the value of this is to provide better analyses to the project manager with which to make, hopefully, better decisions.

Conclusions

A new probabilistic schedule forecasting method has been developed. The Bayesian betaS-curve method is a probabilistic method that provides confidence bounds on predictions. It is also an adaptive method that starts with the original estimate of project duration and adjusts the influence of prior performance information on prediction as actual performance data accrue. Furthermore, the BBM relies on project-level performance data, which allows the method to be integrated seamlessly with common project performance metrics, such as the earned value and percent complete in various units.

Combined with a curve fitting technique, the BBM is extremely robust in quantifying all relevant performance information from various sources. The flexibility of the betaS-curve function gives the BBM a greater capability of representing extreme cases and variability in project progress patterns than common two-parameter S-curve functions in the literature (Kim and Reinschmidt 2007). Using two real projects, it has been demonstrated that the BBM extracts more information from common monthly progress reports and provides more informative predictions than the conventional methods, which can be used as an objective, quantitative indicator of schedule risk of ongoing projects.

This paper also provides a comparative study about the life-cycle accuracy of three forecasting methods: the BBM, the ESM, and the CPM. In order to draw statistically meaningful conclusions, a large test set of artificial project data is generated independently from these methods and used to obtain the mean absolute percentage error (MAPE) at different stages of a project life cycle. The results reveal that, for a project group scheduled with 200 activities, the ESM, even though it has been asserted to be the best EVM schedule forecasting method (Vanhoucke and

Vandevoorde 2007), has, early in the project, stunningly larger errors than the BBM. Both the BBM and the CPM provide sufficiently accurate forecasts: MAPE less than 5% even very early in the project. However, it should be noted that the BBM does not rely on activity level performance data and analysis as the CPM does.

References

AbouRizk, S. M., Halpin, D. W., and Wilson, J. R. (1991). “Visual interactive fitting of beta distributions.” J. Constr. Eng. Manage., 117(4), 589–605.
Anbari, F. T. (2003). “Earned value project management method and extensions.” Proj. Manage. J., 34(4), 12–23.
Barraza, G. A., Back, W. E., and Mata, F. (2000). “Probabilistic monitoring of project performance using SS-curves.” J. Constr. Eng. Manage., 126(2), 142–148.
Barraza, G. A., Back, W. E., and Mata, F. (2004). “Probabilistic forecasting of project performance using stochastic S curves.” J. Constr. Eng. Manage., 130(1), 25–32.
Christensen, D. S. (1993). “The estimate at completion problem: A review of three studies.” Proj. Manage. J., 24(1), 37–42.
Cioffi, D. F. (2005). “A tool for managing projects: An analytic parameterization of the S-curve.” Int. J. Proj. Manage., 23, 215–222.
Fellows, R., Langford, D., Newcombe, R., and Urry, S. (2002). Construction management in practice, 2nd Ed., Blackwell Publishing, Malden, Mass.
Fleming, Q. W., and Koppelman, J. M. (2006). Earned value project management, 3rd Ed., Project Management Institute, Newtown Square, Pa.
Gardoni, P., Reinschmidt, K. F., and Kumar, R. (2007). “A probabilistic framework for Bayesian adaptive forecasting of project progress.” Comput. Aided Civ. Infrastruct. Eng., 22(3), 182–196.
Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2003). Bayesian data analysis, 2nd Ed., Chapman and Hall/CRC, Boca Raton, Fla.
Ghosh, B., Basu, B., and O’Mahony, M. (2007). “Bayesian time-series model for short-term traffic flow forecasting.” J. Transp. Eng., 133(3), 180–189.
Kenley, R., and Wilson, O. D. (1986). “A construction project cash flow model—An idiographic approach.” Constr. Manage. Econom., 4(3), 213–232.
Kim, B. C. (2007). “Forecasting project progress and early warning of project overruns with probabilistic methods.” Ph.D. thesis, Texas A&M Univ., College Station, Tex.
Kim, B. C., and Reinschmidt, K. (2007). “An S-curve Bayesian model for forecasting probabilistic distributions on project duration and cost at completion.” Proc., 25th Anniversary Conf. of Construction Management and Economics: Past, Present, and Future, 136.
Lee, D.-E. (2005). “Probability of project completion using stochastic project scheduling simulation.” J. Constr. Eng. Manage., 131(3), 310–318.
Lipke, W. H. (2003). “Schedule is different.” Measurable News, 31–34.
Malcolm, D. G., Roseboom, J. H., Clark, C. E., and Fazar, W. (1959). “Application of a technique for research and development program evaluation.” Oper. Res., 7(5), 646–669.
Miskawi, Z. (1989). “An S-curve equation for project control.” Civ. Eng. Pract., 7(2), 115–124.
Murmis, G. M. (1997). “‘S’ curves for monitoring project progress.” Proj. Manage. J., 28(3), 29–35.
Perry, C., and Greig, I. D. (1975). “Estimating the mean and variance of subjective distributions in PERT and decision analysis.” Manage. Sci., 21(12), 1477–1480.
PMI. (2004). A guide to the project management body of knowledge, 3rd Ed., Project Management Institute, Inc., Newtown Square, Pa.
Schexnayder, C. J., and Mayo, R. (2003). Construction management fundamentals, McGraw-Hill, Boston.
Touran, A., Atgun, M., and Bhurisith, I. (2004). “Analysis of the United States Dept. of Transportation prompt pay provisions.” J. Constr. Eng. Manage., 130(5), 719–725.
Vandevoorde, S., and Vanhoucke, M. (2006). “A comparison of different project duration forecasting methods using earned value metrics.” Int. J. Proj. Manage., 24(4), 289–302.
Vanhoucke, M., and Vandevoorde, S. (2007). “A simulation and evaluation of earned value metrics to forecast the project duration.” J. Oper. Res. Soc., 58(10), 1361–1374.

178 / JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT © ASCE / MARCH 2009

J. Constr. Eng. Manage. 2009.135:178-186.
