5 - Estimation Techniques
Ashish Kumar
Software Project Planning
Before starting to build software, it is essential to plan the
entire development effort in detail
Software project planning encompasses five major activities
Estimation, scheduling, risk analysis, quality management
planning, and change management planning
Estimation determines how much money, effort, resources,
and time it will take to build a specific system or product
The software team first estimates
The work to be done
The resources required
The time that will elapse from start to finish
Then they establish a project schedule that
Defines tasks and milestones
Identifies who is responsible for conducting each task
Specifies the inter-task dependencies
What is an estimate?
According to Webster,
“the act of appraising or valuing something”
“a statement of the cost of work to be done”
PMO defines an estimate as,
a rough calculation of the costs and the amount of work
prior to the commencement of work.
When estimating project costs, we can also
estimate ongoing costs.
Outputs of an estimation exercise are:
Project Costs Estimate
Ongoing Costs Estimate
What are IT costs?
Why do we need to estimate?
To get an idea of the costs of a project.
To get an idea of the time needed to complete the
project.
Very important from a scheduling and project planning
standpoint.
To identify resource needs.
To identify ongoing costs and resource needs.
Importance of Estimations
During the planning phase of a project, a first guess
about cost and time is necessary.
Estimations are often the basis for the decision to start
a project or not.
Estimations are the foundation for project planning
and for further actions.
Software Cost Estimation
Determine size of the product.
From the size estimate,
determine the effort needed.
Software Cost Estimation
Three main approaches to estimation:
Empirical techniques:
an educated guess based on past experience.
Heuristic techniques:
assume that the characteristics to be estimated can be
expressed in terms of some mathematical expression.
Analytical techniques:
derive the required results starting from certain simple
assumptions.
Estimation Techniques
Expert estimates
Lines of code
Function point analysis
Halstead Size Estimation
COCOMO Model
Expert Estimates
= Guess from experienced people
It is subjective (consensus is difficult to achieve)
No better than the participants
Suitable for atypical projects
Result justification difficult
Important when no detailed estimation can be done (due
to lacking information about scope)
Delphi
Developed by the RAND Corporation in the 1950s; participants
are involved in two assessment rounds.
Work Breakdown Structure (WBS)
A way of organizing project elements into a hierarchy that
simplifies the task of budget estimation and control.
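The way a WBS hierarchy simplifies budget estimation can be sketched as a simple roll-up: each element's budget is its own cost plus the budgets of its children. This is a minimal illustration only; the element names, the `cost` and `children` keys, and all dollar figures are hypothetical and not taken from the slides.

```python
def rolled_up_budget(element):
    """Return the budget of a WBS element: its own cost plus all descendants."""
    own_cost = element.get("cost", 0)
    children = element.get("children", [])
    return own_cost + sum(rolled_up_budget(child) for child in children)


# Hypothetical WBS with illustrative costs in dollars.
wbs = {
    "name": "Project",
    "children": [
        {
            "name": "Design",
            "children": [
                {"name": "UI design", "cost": 8000},
                {"name": "Database design", "cost": 12000},
            ],
        },
        {"name": "Implementation", "cost": 40000},
        {"name": "Testing", "cost": 15000},
    ],
}

total_budget = rolled_up_budget(wbs)
design_budget = rolled_up_budget(wbs["children"][0])
```

Because every level is summed from the level below, a change to one leaf estimate propagates to the project total without re-estimating anything else.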
Expert Estimates
Advantages
Useful in the absence of quantified, empirical data.
Can factor in differences between past project
experiences and requirements of the proposed project
Can factor in impacts caused by new technologies,
applications and languages.
Disadvantages
Estimate is only as good as the expert’s opinion
Hard to document the factors used by the experts
Size Estimation/Problem Based Estimation
Here, we subdivide the problem into small problems.
When all the small problems are solved the main
problem is solved.
Lines of Code
Function Point
LOC (Lines of Code), FP(Function Point) estimation
methods consider the size as the measure.
In LOC the cost is calculated based on the number of
lines.
In FP the cost is calculated based on the number of
various functions in the program.
Problem Based Estimation
For both approaches, the planner uses lessons learned to
estimate an optimistic, most likely, and pessimistic size
value for each function or count (for each information
domain value)
Then the expected size value S is computed as follows:
$S = (S_{opt} + 4 S_{m} + S_{pess}) / 6$
where $S_{opt}$ is the optimistic value, $S_{m}$ the most likely
value, and $S_{pess}$ the pessimistic value
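As a small sketch of the three-point calculation, the function below assumes the usual weighting in which the most likely value counts four times, S = (S_opt + 4 S_m + S_pess) / 6. The function name and the sample LOC figures are illustrative assumptions, not values from the slides.

```python
def expected_size(optimistic, most_likely, pessimistic):
    """Three-point expected value, weighting the most likely estimate by 4."""
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    return (optimistic + 4 * most_likely + pessimistic) / 6


# Hypothetical size estimates for one function, in lines of code.
size = expected_size(optimistic=4600, most_likely=6900, pessimistic=8600)
```

Repeating this for every function and summing the results gives the size estimate for the whole system.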
EXAMPLE OF (LOC) BASED ESTIMATION
FUNCTIONS ESTIMATED LOC
Total Estimated Project Cost can be calculated as follows:
Considering that the Total LOC ( ∑ LOC) for the System is 33,200
and the cost per LOC is $13,
Total Estimated Project Cost = (33,200 * 13) = $431,600
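The cost arithmetic above can be reproduced in a few lines. This sketch assumes that the factor 13 in the example is a cost per line of code in dollars; the function name is hypothetical.

```python
def loc_based_cost(total_loc, cost_per_loc):
    """Project cost as total lines of code times the cost of one line."""
    return total_loc * cost_per_loc


# Figures from the example: 33,200 LOC at an assumed $13 per LOC.
project_cost = loc_based_cost(total_loc=33200, cost_per_loc=13)
```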
Problems with lines of code estimation
Creating source code is only a small part of the total
software development effort.
Different programming languages produce the same result with
different numbers of lines of code.
How do you count the lines of code?
Only executable lines? or include data definitions?
Include comments?
What if you use inheritance? Recount inherited lines?
Not all code written is delivered to the client.
A code generator can produce thousands of lines of
code in a few minutes.
Only when the product is final do you know the actual
lines of code.
LOC is not defined for nonprocedural languages (like
LISP).
Function Point Analysis
Developed by Allan Albrecht, IBM Research, 1979.
Technique to determine size of software projects.
Size is measured from a functional point of view.
Estimates are based on functional requirements.
Albrecht originally used the technique to predict effort.
Size is usually the primary driver of development effort.
Independent of:
Implementation language and technology
Development methodology
Capability of the project team
A top-down approach based on function types
Three steps: Plan the count, perform the count, estimate the
effort.
How to count functions
There are two ways to count functions.
The first way is the traditional Albrecht approach based on the
analysis of transactions.
The second way is based on the data modeling approach (E/R
model).
To do the count, the FP counter has to do the following three
steps:
1. Count data function types.
2. Count the transaction function types.
3. Estimate the effort.
Compute the unadjusted function points (UFP)
Compute the Value Adjustment Factor (VAF)
Compute the adjusted Function Points (FA)
Compute the performance factor
Calculate the effort in person days
Count data function types
Data functions are either Internal Logical Files
(Maf) or External Interface Files (Inf).
The count of the Mafs and Infs is done by examining the
data model (E/R model or the attributes of the persistent
objects in the object model) or by asking the system
designer for the application’s main categories of persistent
data.
This count must be done before the transactions function
types are counted, because the complexity of the
transactions is dependent on the way the data access
functions are identified.
Count the transaction function types
Transaction function types are External Inputs (Inp),
External Outputs (Out) or External Inquiries (Inq).
This is typically the longest part of the count, and the part
where the system expert's assistance really saves time.
Complexity of each transaction is based, in part, on the
number of data types it references.
The system expert can usually make this determination
without consulting any documentation.
Function Points Estimation
Based on the number of External inputs (Inp), external
outputs (Out), external inquiries (Inq), internal logical files
(Maf), external interface files (Inf)
Input: A set of related inputs is counted as one input.
Output: A set of related outputs is counted as one output.
Inquiries: Each user query type is counted.
Files: Files are logically related data and thus can be data
structures or physical files.
Interface: Data transfer to other systems.
For any product, the size in “function points” is given by
FP = 4 Inp + 5 Out + 4 Inq + 10 Maf + 7 Inf
This is an oversimplification of a 3-step process.
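The simplified formula can be written as a weighted sum over the five component counts. The sketch below uses the weights from the formula above; the function name and the sample counts are hypothetical.

```python
# Average-complexity weights taken from the simplified formula.
AVERAGE_WEIGHTS = {"Inp": 4, "Out": 5, "Inq": 4, "Maf": 10, "Inf": 7}


def simplified_function_points(counts):
    """Weighted sum of component counts; missing component types count as 0."""
    unknown = set(counts) - set(AVERAGE_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown component types: {sorted(unknown)}")
    return sum(weight * counts.get(kind, 0) for kind, weight in AVERAGE_WEIGHTS.items())


# Hypothetical counts for a small product.
fp = simplified_function_points({"Inp": 10, "Out": 8, "Inq": 5, "Maf": 4, "Inf": 2})
```

Treating every component as average is what makes this an oversimplification: the full method classifies each component individually before weighting it.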
Function Points Estimation
Step 1. Classify each component of the product (Inp,
Out, Inq, Maf, Inf) as simple, average, or complex
Assign the appropriate number of function points
The sum gives UFP (unadjusted function points)
Function Points Estimation
Step 2. Compute the technical complexity factor (TCF)
Assign a value from 0 (“not present”) to 5 (“strong influence
throughout”) to each of 14 factors such as transaction rates,
portability
Add the 14 numbers
This gives the total degree of influence (DI)
$\text{TCF} = 0.65 + 0.01 \times \text{DI}$
The technical complexity factor (TCF) lies between 0.65 and 1.35
Function Points Estimation
Step 3. The number of function points (FP) is then
given by
$\text{FP} = \text{UFP} \times \text{TCF}$
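The three steps can be sketched end to end as follows. The slides do not show the per-complexity weight table, so the table below is an assumption: it uses the commonly published Albrecht weights, whose average column matches the simplified formula (4, 5, 4, 10, 7). The function names and sample data are hypothetical.

```python
# Assumed weight table (commonly published values; not shown in the slides).
WEIGHTS = {
    "Inp": {"simple": 3, "average": 4, "complex": 6},
    "Out": {"simple": 4, "average": 5, "complex": 7},
    "Inq": {"simple": 3, "average": 4, "complex": 6},
    "Maf": {"simple": 7, "average": 10, "complex": 15},
    "Inf": {"simple": 5, "average": 7, "complex": 10},
}


def unadjusted_function_points(components):
    """Step 1: sum the weight of each (type, complexity) component."""
    return sum(WEIGHTS[kind][complexity] for kind, complexity in components)


def technical_complexity_factor(ratings):
    """Step 2: TCF = 0.65 + 0.01 * DI, where DI is the sum of 14 ratings (0..5)."""
    if len(ratings) != 14 or any(not 0 <= r <= 5 for r in ratings):
        raise ValueError("expected 14 ratings, each between 0 and 5")
    return 0.65 + 0.01 * sum(ratings)


def function_points(components, ratings):
    """Step 3: FP = UFP * TCF."""
    return unadjusted_function_points(components) * technical_complexity_factor(ratings)


# Hypothetical product: three simple inputs, two average outputs, one complex file.
components = [("Inp", "simple")] * 3 + [("Out", "average")] * 2 + [("Maf", "complex")]
ratings = [3] * 14  # every factor rated 3, so DI = 42
ufp = unadjusted_function_points(components)
fp = function_points(components, ratings)
```

With all ratings at 0 the factor is 0.65 and with all at 5 it is 1.35, which is the range stated in Step 2.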
Advantages of Function Point Estimation
Independent of implementation language and
technology
Estimates are based on design specification
Usually known before implementation tasks are known
Users without technical knowledge can be integrated
into the estimation process
Incorporation of experiences from different
organizations
Easy to learn
Limited time effort
Disadvantages of Function Point Estimation
Complete description of functions necessary
Often not the case in early project stages -> especially in
iterative software processes
Only complexity of specification is estimated
Implementation is often more relevant for estimation
High uncertainty in calculating function points:
Weight factors are usually derived from past
experiences (environment, used technology and tools
may be out-of-date in the current project)
Does not measure the performance of people
Example : Function Point Analysis
Assuming
Estimated FP = 400
Organisation average productivity (similar project type) =
6.5 FP/p-m (person-month)
Burdened labour rate = 8000 $/p-m
Then
Estimated effort = 400/6.5 = 61.54 ≈ 62 p-m
Cost per FP = 8000/6.5 = 1231 $/FP
Project cost = 8000 * 62 = 496000 $
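The same arithmetic as a short sketch. It assumes, as the example does, that effort is rounded to a whole person-month before the project cost is computed; the function and parameter names are hypothetical.

```python
def fp_effort_and_cost(fp, productivity_fp_per_pm, labour_rate_per_pm):
    """Return (effort in p-m, cost per FP, project cost) from an FP estimate."""
    effort_pm = round(fp / productivity_fp_per_pm)              # 61.54 -> 62
    cost_per_fp = round(labour_rate_per_pm / productivity_fp_per_pm)
    project_cost = labour_rate_per_pm * effort_pm
    return effort_pm, cost_per_fp, project_cost


effort_pm, cost_per_fp, project_cost = fp_effort_and_cost(
    fp=400, productivity_fp_per_pm=6.5, labour_rate_per_pm=8000
)
```

Multiplying the unrounded cost per FP by 400 would give a slightly lower total (about $492,300), so the rounding step matters when comparing estimates.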
COCOMO (COnstructive COst MOdel)
Developed by Barry Boehm in 1981.
Also called COCOMO I or Basic COCOMO or COCOMO’81.
COCOMO’81 is derived from the analysis of 63 software
projects.
Top-down approach to estimate cost, effort and schedule of
software projects, based on size and complexity of projects.
Assumptions:
Derivability of effort by comparing finished projects
(“COCOMO database”).
System requirements do not change during development.
Exclusion of some efforts (for example administration,
training, rollout, integration).
COCOMO (COnstructive COst MOdel)
Divides software product developments into three
(3) categories:
Organic
The required team is reasonably small, the problem is well
understood and has been solved in the past, and the team
members have nominal experience with the problem.
Semidetached
Project team consists of a mixture of experienced and
inexperienced staff.
Embedded
The software is strongly coupled to complex hardware or
real-time systems.
COCOMO (COnstructive COst MOdel)
For each of the three product categories:
From size estimation (in KLOC), Boehm provides
equations to predict:
project duration in months
effort in programmer-months
software products.
Tdev is the estimated time to develop the software in
months.
Effort estimation is obtained in terms of person
months (PMs).
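Effort Estimation
Organic: $\text{Effort} = 2.4 \times (\text{KLOC})^{1.05}$ PM
Semi-detached: $\text{Effort} = 3.0 \times (\text{KLOC})^{1.12}$ PM
Embedded: $\text{Effort} = 3.6 \times (\text{KLOC})^{1.20}$ PM
Time Estimation
Organic: $T_{dev} = 2.5 \times (\text{Effort})^{0.38}$ months
Semi-detached: $T_{dev} = 2.5 \times (\text{Effort})^{0.35}$ months
Embedded: $T_{dev} = 2.5 \times (\text{Effort})^{0.32}$ months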
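A minimal sketch of the Basic COCOMO equations, using the coefficients listed above with the trailing numbers read as exponents (for example, Effort = 2.4 × KLOC^1.05 for organic products). The function name, the dictionary layout, and the 32 KLOC sample size are illustrative assumptions.

```python
# (a, b, c) per category: Effort = a * KLOC**b, Tdev = 2.5 * Effort**c
BASIC_COCOMO = {
    "organic": (2.4, 1.05, 0.38),
    "semi-detached": (3.0, 1.12, 0.35),
    "embedded": (3.6, 1.20, 0.32),
}


def basic_cocomo(kloc, category):
    """Return (effort in person-months, development time in months)."""
    a, b, c = BASIC_COCOMO[category]
    effort_pm = a * kloc ** b
    tdev_months = 2.5 * effort_pm ** c
    return effort_pm, tdev_months


# Sample input: an organic product of 32 KLOC.
effort_pm, tdev_months = basic_cocomo(32, "organic")
```

For this input the effort comes out at roughly 91 person-months and the development time at roughly 14 months.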
Basic COCOMO Model
Effort is somewhat super-linear in problem size.
Development time is a sublinear function of product size.
When product size increases two times, development time
does not double.
Time taken is almost the same for all the three product types.
[Figure: Effort vs. Size and Dev. Time vs. Size curves for organic, semi-detached, and embedded products; the size axis is marked at 30K and 60K, the development-time axis at 14 months and 18 months]
Basic COCOMO Model
Development time does not increase linearly with
product size:
For larger products more parallel activities can be identified;
these can be carried out simultaneously by a number of engineers.
Development time is roughly the same for all the three
categories of products:
For example, a 60 KLOC program can be developed in
approximately 18 months
regardless of whether it is of organic, semi-detached, or
embedded type.
There is more scope for parallel activities for system and
application programs,
than utility programs.
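Both observations can be checked numerically with the Basic COCOMO equations. The sketch below assumes the standard Basic COCOMO coefficients (Effort = a × KLOC^b, Tdev = 2.5 × Effort^c) and compares 30 KLOC with 60 KLOC for each category; the names used are hypothetical.

```python
# Assumed Basic COCOMO coefficients (a, b, c) per category.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 0.38),
    "semi-detached": (3.0, 1.12, 0.35),
    "embedded": (3.6, 1.20, 0.32),
}


def tdev_months(kloc, category):
    """Nominal development time in months for a product of the given size."""
    a, b, c = COEFFICIENTS[category]
    return 2.5 * (a * kloc ** b) ** c


# Development time at 30 KLOC and 60 KLOC for every category.
times = {
    category: (tdev_months(30, category), tdev_months(60, category))
    for category in COEFFICIENTS
}

for category, (t30, t60) in times.items():
    print(f"{category:14s} 30 KLOC: {t30:5.1f} months   60 KLOC: {t60:5.1f} months")
```

Under these coefficients all three categories land near 14 months at 30 KLOC and near 18 months at 60 KLOC, so doubling the size adds only about a third to the schedule.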
Example
The size of an organic software product has been
estimated to be 32,000 lines of source code.
$\text{Effort} = 2.4 \times (32)^{1.05} \approx 91$ PM
Nominal development time $= 2.5 \times (91)^{0.38} \approx 14$ months
Intermediate COCOMO
Basic COCOMO model assumes
effort and development time depend on product size alone.
However, several parameters affect effort and development
time:
Reliability requirements
Availability of CASE tools and modern facilities to the developers
Intermediate COCOMO
If modern programming practices are used,
initial estimates are scaled downwards.
If there are stringent reliability requirements on the
product :
initial estimate is scaled upwards.
Rate different parameters on a scale of one to five:
Depending on these ratings,
multiply cost driver values with the estimate obtained using
the basic COCOMO.
Cost Drivers Classes
Intermediate COCOMO -Effort Estimation
$\text{Effort} = a \times (\text{KLOC})^{b} \times m(x)$
m(x) = product of all cost driver factors
Organic: a = 3.2, b = 1.05
Semi-detached: a = 3.0, b = 1.12
Embedded: a = 2.8, b = 1.20
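A minimal sketch of the Intermediate COCOMO effort equation, Effort = a × KLOC^b × m(x), where m(x) is the product of the cost driver factors. The a and b values are the ones listed above; the two multiplier values in the sample call are hypothetical stand-ins for a reliability driver (scaling up) and a modern-practices driver (scaling down), since the cost driver table is not reproduced here.

```python
import math

# (a, b) per category, as listed for Intermediate COCOMO.
INTERMEDIATE_COCOMO = {
    "organic": (3.2, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def intermediate_effort(kloc, category, cost_driver_factors=()):
    """Effort in person-months: nominal estimate scaled by m(x)."""
    a, b = INTERMEDIATE_COCOMO[category]
    m_x = math.prod(cost_driver_factors)  # empty product is 1 (no adjustment)
    return a * kloc ** b * m_x


nominal = intermediate_effort(32, "organic")
# Hypothetical multipliers: 1.15 (stringent reliability), 0.91 (modern practices).
adjusted = intermediate_effort(32, "organic", [1.15, 0.91])
```

A factor above 1 raises the estimate and a factor below 1 lowers it, so the two sample drivers partly cancel each other.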
Shortcomings of both models
Both models:
consider a software product as a single homogeneous
entity:
However, most large systems are made up of several
smaller sub-systems.
Some sub-systems may be considered as organic type, some
may be considered embedded, etc.
for some the reliability requirements may be high, and so on.
Complete COCOMO
Cost of each sub-system is estimated separately.
Costs of the sub-systems are added to obtain total cost.
Reduces the margin of error in the final estimate.
For Example:
A Management Information System (MIS) for an organization having
offices at several places across the country:
Database part (semi-detached)
Graphical User Interface (GUI) part (organic)
Communication part (embedded)
Costs of the components are estimated separately
and summed up to give the overall cost of the system.
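The idea of estimating each sub-system separately and summing the results can be sketched as follows. The sub-system sizes are hypothetical, and for simplicity the sketch applies the Basic COCOMO effort equation (Effort = a × KLOC^b) to each part; a fuller treatment would also apply cost drivers per sub-system.

```python
# Assumed Basic COCOMO effort coefficients (a, b) per category.
EFFORT_COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def subsystem_effort(kloc, category):
    """Effort in person-months for one sub-system."""
    a, b = EFFORT_COEFFICIENTS[category]
    return a * kloc ** b


# Hypothetical sizes (in KLOC) for the three parts of the MIS example.
subsystems = [
    ("Database", 20, "semi-detached"),
    ("GUI", 10, "organic"),
    ("Communication", 5, "embedded"),
]

efforts = {name: subsystem_effort(kloc, category) for name, kloc, category in subsystems}
total_effort = sum(efforts.values())
```

Treating the same 35 KLOC as one embedded product would give a much higher figure, and treating it as one organic product a much lower one, which is the error the per-sub-system approach avoids.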
COCOMO II
Revision of COCOMO I in 1997
Provides three models of increasing detail
Application Composition Model
Estimates for prototypes based on GUI builder tools and existing
components
Similar goal as for Function Point analysis
Based on counting Object Points (instead of function points)
Application Composition Model
Supports prototyping projects and projects where
there is extensive reuse.
Based on standard estimates of developer productivity
in object points/month.
Takes CASE tool use into account.
Formula is
$\text{PM} = \dfrac{\text{NOP} \times (1 - \%\text{reuse}/100)}{\text{PROD}}$
PM is the effort in person-months, NOP is the number
of object points and PROD is the productivity
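The application composition formula, PM = NOP × (1 - %reuse/100) / PROD, can be sketched in a few lines. The sample inputs are hypothetical; in particular, the PROD value is only a placeholder because the productivity table is not reproduced here.

```python
def application_composition_effort(nop, reuse_percent, prod):
    """Effort in person-months from object points, reuse, and productivity."""
    if not 0 <= reuse_percent <= 100:
        raise ValueError("reuse_percent must be between 0 and 100")
    if prod <= 0:
        raise ValueError("prod must be positive")
    return nop * (1 - reuse_percent / 100) / prod


# Hypothetical inputs: 100 object points, 20% reuse, 10 object points per month.
effort_pm = application_composition_effort(nop=100, reuse_percent=20, prod=10)
```

Only the object points that are not covered by reuse contribute to the effort, so at 100% reuse the estimate drops to zero.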
Object point productivity
Advantages of COCOMO
Appropriate for a quick, high-level estimation of
project costs.
Fair results with smaller projects in a well known
development environment.
Assumes comparison with past projects is possible.
Covers all development activities (from analysis to
testing).
Intermediate COCOMO yields good results for
projects on which the model is based.
Tools that automate Intermediate COCOMO and
COCOMO II are available.
Problems with COCOMO
Judgment is required to determine the influencing
factors and their values.
Experience shows that estimation results can deviate
from actual effort by a factor of 4.
Some important factors are not considered:
Skills of team members, travel, environmental factors,
user interface quality, overhead cost.