
Estimation Techniques

Ashish Kumar

1
Software Project Planning
Before starting to build software, it is essential to plan the
entire development effort in detail
Software project planning encompasses five major activities
 Estimation, scheduling, risk analysis, quality management
planning, and change management planning
Estimation determines how much money, effort, resources,
and time it will take to build a specific system or product
The software team first estimates
 The work to be done
 The resources required
 The time that will elapse from start to finish
Then they establish a project schedule that
 Defines tasks and milestones
 Identifies who is responsible for conducting each task
 Specifies the inter-task dependencies
2
What is an estimate?
According to Webster,
“the act of appraising or valuing something”
“a statement of the cost of work to be done”
PMO defines an estimate as,
a rough calculation of the costs and the amount of work
prior to the commencement of work.
When estimating project costs, we can also
estimate the ongoing costs.
Outputs of an estimation exercise are:
Project Costs Estimate
Ongoing Costs Estimate

3
What are IT costs?

4
Why do we need to estimate?
To get an idea of the costs of a project.
To get an idea of the time needed to complete the
project.
Very important from a scheduling and project planning
standpoint.
To identify resource needs.
To identify ongoing costs and resource needs.

6
Importance of Estimations
During the planning phase of a project, a first guess
about cost and time is necessary.
Estimations are often the basis for the decision to start
a project or not.
Estimations are the foundation for project planning
and for further actions.

→ Estimating is one of the core tasks of project
management, but it is still considered black magic!

7
Software Cost Estimation
Determine the size of the product.
From the size estimate, determine the effort needed.
From the effort estimate, determine the project duration and cost.

8
Software Cost Estimation
Three main approaches to estimation:
Empirical techniques:
an educated guess based on past experience.
Heuristic techniques:
assume that the characteristics to be estimated can be
expressed in terms of some mathematical expression.
Analytical techniques:
derive the required results starting from certain simple
assumptions.

9
Estimation Techniques
Expert estimates
Lines of code
Function point analysis
Halstead Size Estimation
COCOMO Model

10
Expert Estimates
= Guess from experienced people
It is subjective. (consensus is difficult to achieve)
No better than the participants
Suitable for atypical projects
Result justification difficult
Important when no detailed estimation can be done (due
to lacking information about scope)
Delphi
 Developed at the RAND Corporation in the 1950s; participants
are involved in two assessment rounds.
Work Breakdown Structure (WBS)
 A way of organizing project elements into a hierarchy that
simplifies the task of budget estimation and control.
11
Expert Estimates
Advantages
Useful in the absence of quantified, empirical data.
Can factor in differences between past project
experiences and requirements of the proposed project
Can factor in impacts caused by new technologies,
applications and languages.
Disadvantages
Estimate is only as good as the expert’s opinion
Hard to document the factors used by the experts

12
Size Estimation/Problem Based Estimation
Here, we subdivide the problem into small problems.
When all the small problems are solved the main
problem is solved.
Lines of Code
Function Point
LOC (Lines of Code), FP(Function Point) estimation
methods consider the size as the measure.
In LOC the cost is calculated based on the number of
lines.
In FP the cost is calculated based on the number of
various functions in the program.
13
Problem Based Estimation
For both approaches, the planner uses lessons learned to
estimate an optimistic, most likely, and pessimistic size
value for each function or count (for each information
domain value)
Then the expected size value S is computed as a weighted average:

S = (Sopt + 4 × Sm + Spess) / 6

where Sopt is the optimistic value, Sm the most likely value, and
Spess the pessimistic value.

 Once the expected value has been determined, historical LOC and
FP productivity data are applied.
14
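The weighted average above can be sketched in a few lines of Python; the sample values are the 3DGA numbers from the LOC-based example later in the deck:

```python
# PERT-style expected size: the 1-4-1 weights favour the most likely value
def expected_size(optimistic, most_likely, pessimistic):
    return (optimistic + 4 * most_likely + pessimistic) / 6

# 3DGA example from the LOC-based estimation slides
print(expected_size(5_000, 6_700, 9_000))  # 6800.0
```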
Lines of Code Estimation
The measure was first proposed when programs were
typed on cards with one line per card.
Traditional way for estimating application size.
Advantage: Easy to do.
Disadvantages:
Focus on developer’s point of view.
No standard definition for “Line of Code”.
“You get what you measure”: If the number of lines of
code is the primary measure of productivity,
programmers ignore opportunities of reuse.
Multi-language environments: Hard to compare mixed
language projects with single language projects.
15
Example - Statement of Scope
The mechanical CAD software will accept two- and three-
dimensional geometric data from an engineer. The
engineer will interact and control the CAD system through
a user interface that will exhibit characteristics of good
human/machine interface design. All geometric data and
other supporting information will be maintained in a CAD
database. Design analysis modules will be developed to
produce the required output, which will be displayed on a
variety of graphics devices. The software will be designed to
control and interact with peripheral devices that include a
mouse, digitizer, laser printer.

16
EXAMPLE OF (LOC) BASED ESTIMATION
FUNCTIONS — ESTIMATED LOC
- User Interface and Control Facilities (UICF): 2,300
- Two-Dimensional Analysis (2DGA): 5,300
- 3D Geometric Analysis Function (3DGA): 6,800 ***
- Database Management (DBM): 3,350
- Computer Graphic Display Facility (CGDF): 4,950
- Peripheral Control Function (PCF): 2,100
- Design Analysis Modules (DAM): 8,400
_____________________________________________________________
TOTAL ESTIMATED LOC (∑ LOC): 33,200
=========================================================
For example, using the expected value equation, the estimated value
for the 3DGA function (***) is calculated as follows:

Optimistic estimate = 5,000 LOC
Most likely estimate = 6,700 LOC
Pessimistic estimate = 9,000 LOC
S = (5,000 + 4 × 6,700 + 9,000) / 6 = 6,800 LOC
EXAMPLE OF (LOC) BASED ESTIMATION
Historical data obtained from metrics indicates the following
organizational averages:
Average productivity: 620 LOC/pm (lines of code per person-month)
Average labor cost: $8,000 per person-month
Cost per line of code: COST/LOC = 8,000 / 620 ≈ $13

Given a total estimated size (∑ LOC) of 33,200 for the system:
 Total estimated project cost = 33,200 × 13 = $431,600
 Total estimated project effort = 33,200 / 620 ≈ 54 person-months

18
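The slide's arithmetic can be reproduced directly (a minimal sketch; the cost per LOC is rounded to $13 as on the slide before multiplying):

```python
total_loc = 33_200      # sum of the estimated LOC for all functions
productivity = 620      # LOC per person-month (organizational average)
labor_rate = 8_000      # dollars per person-month

cost_per_loc = round(labor_rate / productivity)   # 8000 / 620 -> $13
total_cost = total_loc * cost_per_loc             # 33,200 * 13 = $431,600
effort_pm = round(total_loc / productivity)       # 33,200 / 620 -> ~54 PM
print(cost_per_loc, total_cost, effort_pm)        # 13 431600 54
```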
Problems with lines of code estimation
Creating source code is only a small part of the total
software development effort.
Programming languages produce the same result with
different lines of code.
How do you count the lines of code?
 Only executable lines? or include data definitions?
 Include comments?
 What if you use inheritance? Recount inherited lines?
Not all code written is delivered to the client.
A code generator can produce thousands of lines of
code in a few minutes.
Only when the product is final do you know the actual
lines of code.
LOC is not defined for nonprocedural languages (like
LISP).
20
Function Point Analysis
Developed by Allan Albrecht, IBM Research, 1979.
Technique to determine size of software projects.
 Size is measured from a functional point of view.
 Estimates are based on functional requirements.
Albrecht originally used the technique to predict effort.
 Size is usually the primary driver of development effort.
Independent of:
 Implementation language and technology
 Development methodology
 Capability of the project team
A top-down approach based on function types
 Three steps: plan the count, perform the count, estimate the effort.
21
How to count functions
There are two ways to count functions.
The first way is the traditional Albrecht approach based on the
analysis of transactions.
The second way is based on the data modeling approach (E/R
model).
To do the count, the FP counter has to do the following three
steps:
1. Count data function types.
2. Count the transaction function types.
3. Estimate the effort.
 Compute the unadjusted function points (UFP)
 Compute the value adjustment factor (VAF)
 Compute the adjusted function points (AFP)
 Compute the performance factor
 Calculate the effort in person-days
22
Count data function types
Data functions are either Internal Logical Files
(Maf) or External Interface Files (Inf).
The count of the Mafs and Infs is done by examining the
data model (E/R model or the attributes of the persistent
objects in the object model) or by asking the system
designer for the application’s main categories of persistent
data.
This count must be done before the transactions function
types are counted, because the complexity of the
transactions is dependent on the way the data access
functions are identified.

23
Count the transaction function types
Transaction function types are External Inputs (Inp),
External Outputs (Out) or External Queries (Inq).
This is typically the longest part of the count, and the part
where the system expert's assistance really saves time.
Complexity of each transaction is based, in part, on the
number of data types it references.
The system expert can usually make this determination
without consulting any documentation.

24
Function Points Estimation
Based on the number of External inputs (Inp), external
outputs (Out), external inquiries (Inq), internal logical files
(Maf), external interface files (Inf)
Input: A set of related inputs is counted as one input.
Output: A set of related outputs is counted as one output.
Inquiries: Each user query type is counted.
Files: Files are logically related data and thus can be data
structures or physical files.
Interface: Data transfer to other systems.
For any product, the size in “function points” is given by
FP = 4 × Inp + 5 × Out + 4 × Inq + 10 × Maf + 7 × Inf
This is an oversimplification of a 3-step process.
25
Function Points Estimation
Step 1. Classify each component of the product (Inp,
Out, Inq, Maf, Inf) as simple, average, or complex
Assign the appropriate number of function points
The sum gives UFP (unadjusted function points)

26
Function Points Estimation
Step 2. Compute the technical
complexity factor (TCF)
Assign a value from 0 (“not
present”) to 5 (“strong influence
throughout”) to each of 14 factors
such as transaction rates,
portability
Add the 14 numbers
This gives the total degree of
influence (DI)
TCF = 0.65 + 0.01 × DI
The technical complexity factor
(TCF) lies between 0.65 and 1.35
27
Function Points Estimation
Step 3. The number of function points (FP) is then
given by

FP = UFP × TCF

28
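The three steps can be combined into a short Python sketch. The component counts and degree of influence below are hypothetical, and the weights are the average-complexity values from the simplified formula on the earlier slide:

```python
# Average-complexity weights: FP = 4·Inp + 5·Out + 4·Inq + 10·Maf + 7·Inf
WEIGHTS = {"Inp": 4, "Out": 5, "Inq": 4, "Maf": 10, "Inf": 7}

def function_points(counts, degree_of_influence):
    ufp = sum(WEIGHTS[k] * n for k, n in counts.items())  # Step 1: UFP
    tcf = 0.65 + 0.01 * degree_of_influence               # Step 2: 0.65..1.35
    return ufp * tcf                                      # Step 3: FP = UFP × TCF

# Hypothetical component counts for illustration
counts = {"Inp": 10, "Out": 7, "Inq": 5, "Maf": 4, "Inf": 2}
print(round(function_points(counts, degree_of_influence=42), 2))  # 159.43
```

Here UFP = 149 and TCF = 0.65 + 0.42 = 1.07, so FP = 149 × 1.07 = 159.43.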
Advantages of Function Point Estimation
Independent of implementation language and
technology
Estimates are based on design specification
Usually known before implementation tasks are known
Users without technical knowledge can be integrated
into the estimation process
Incorporation of experiences from different
organizations
Easy to learn
Limited time effort

29
Disadvantages of Function Point Estimation
Complete description of functions necessary
Often not the case in early project stages -> especially in
iterative software processes
Only complexity of specification is estimated
Implementation is often more relevant for estimation
High uncertainty in calculating function points:
Weight factors are usually derived from past
experience (the environment, technology, and tools used
may be out-of-date in the current project)
Does not measure the performance of people

30
Example : Function Point Analysis

31
Example : Function Point Analysis
Assuming
Estimated FP = 400
Organisation average productivity (similar project type) =
6.5 FP/p-m (person-month)
Burdened labour rate = 8000 $/p-m

Then
Estimated effort = 400 / 6.5 = 61.54 ≈ 62 p-m
Cost per FP = 8000 / 6.5 ≈ 1231 $/FP
Project cost = 8000 * 62 = 496,000 $
33
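The same calculation in Python (a sketch of the slide's numbers; the effort is rounded up to whole person-months, as on the slide):

```python
import math

fp = 400
productivity = 6.5     # FP per person-month (organizational average)
labor_rate = 8_000     # dollars per person-month

effort_pm = math.ceil(fp / productivity)   # 61.54 -> 62 person-months
cost_per_fp = labor_rate / productivity    # ≈ 1231 $/FP
project_cost = labor_rate * effort_pm      # 8000 * 62 = 496,000 $
print(effort_pm, round(cost_per_fp), project_cost)  # 62 1231 496000
```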
COCOMO (COnstructive COst MOdel)
Developed by Barry Boehm in 1981.
Also called COCOMO I or Basic COCOMO or COCOMO’81.
COCOMO’81 is derived from the analysis of 63 software
projects.
Top-down approach to estimate cost, effort and schedule of
software projects, based on size and complexity of projects.
Assumptions:
 Effort can be derived by comparison with finished projects
(the “COCOMO database”).
 System requirements do not change during development.
 Some efforts are excluded (for example administration,
training, rollout, integration).
34
COCOMO (COnstructive COst MOdel)
Divides software product developments into three
(3) categories:
Organic
 The required team is small, the problem is well understood
and has been solved before, and the team members have
experience with the problem domain.
Semidetached
 Project team consists of a mixture of experienced and
inexperienced staff.
Embedded
 The software is strongly coupled to complex hardware, or real-
time systems.

35
COCOMO (COnstructive COst MOdel)
For each of the three product categories:
From size estimation (in KLOC), Boehm provides
equations to predict:
 project duration in months
 effort in programmer-months

Boehm obtained these equations by examining
historical data collected from a large number of actual
projects.
Software cost estimation is done through three stages:
Basic COCOMO,
Intermediate COCOMO,
Complete COCOMO.
36
Basic COCOMO Model
Gives only an approximate estimation:
Effort = a1 × (KLOC)^a2
Tdev = b1 × (Effort)^b2
 KLOC is the estimated size in kilo lines of source code.
 a1, a2, b1, b2 are constants for the different categories of
software products.
 Tdev is the estimated time to develop the software in
months.
 Effort estimation is obtained in person-months (PM).

37
Effort Estimation | Time Estimation
Organic:
 Effort = 2.4 × (KLOC)^1.05 PM | Tdev = 2.5 × (Effort)^0.38 months
Semi-detached:
 Effort = 3.0 × (KLOC)^1.12 PM | Tdev = 2.5 × (Effort)^0.35 months
Embedded:
 Effort = 3.6 × (KLOC)^1.20 PM | Tdev = 2.5 × (Effort)^0.32 months

38
Basic COCOMO Model
Effort is somewhat super-linear in problem size.
Development time sublinear function of product size.
When product size increases two times, development time
does not double.
Development time is almost the same for all three product types.

[Figure: Effort and development time plotted against product size
for embedded, semidetached, and organic projects. Development time
is roughly 14 months at 30 KLOC and 18 months at 60 KLOC,
regardless of category.]
39
Basic COCOMO Model
Development time does not increase linearly with
product size:
 For larger products more parallel activities can be identified:
 can be carried out simultaneously by a number of engineers.
Development time is roughly the same for all the three
categories of products:
For example, a 60 KLOC program can be developed in
approximately 18 months
 regardless of whether it is of organic, semi-detached, or
embedded type.
There is more scope for parallel activities for system and
application programs,
 than utility programs.
40
Example
The size of an organic software product has been
estimated to be 32,000 lines of source code.

Effort = 2.4 × (32)^1.05 = 91 PM
Nominal development time = 2.5 × (91)^0.38 = 14 months

41
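A minimal Python sketch of the basic model, using the constants from the effort/time equations and reproducing the organic example above:

```python
# (a1, a2, b1, b2) per product category, from the basic COCOMO equations
PARAMS = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}

def basic_cocomo(kloc, category):
    a1, a2, b1, b2 = PARAMS[category]
    effort = a1 * kloc ** a2    # person-months
    tdev = b1 * effort ** b2    # months
    return effort, tdev

effort, tdev = basic_cocomo(32, "organic")
print(round(effort), round(tdev))  # 91 PM, 14 months
```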
Intermediate COCOMO
Basic COCOMO model assumes
 effort and development time depend on product size alone.
However, several parameters affect effort and development
time:
 Reliability requirements
 Availability of CASE tools and modern facilities to the developers
 Size of data to be handled
For accurate estimation, the effect of all relevant
parameters must be considered.
Intermediate COCOMO model recognizes this fact:
 refines the initial estimate obtained by the basic COCOMO by
using a set of 15 cost drivers (multipliers).

42
Intermediate COCOMO
If modern programming practices are used,
 initial estimates are scaled downwards.
If there are stringent reliability requirements on the
product :
 initial estimate is scaled upwards.
Rate different parameters on a scale of one to five:
 Depending on these ratings,
 multiply cost driver values with the estimate obtained using
the basic COCOMO.

43
Cost Drivers Classes

44
Intermediate COCOMO -Effort Estimation
Effort = a × (KLOC)^b × m(x)
where m(x) is the product of all cost driver multipliers.
Organic:
 a = 3.2, b = 1.05
Semi-detached:
 a = 3.0, b = 1.12
Embedded:
 a = 2.8, b = 1.20

45
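As a sketch of the intermediate formula, with two hypothetical cost-driver multipliers (the values are illustrative, not taken from the official driver tables):

```python
import math

# (a, b) per product category for intermediate COCOMO
PARAMS = {"organic": (3.2, 1.05), "semidetached": (3.0, 1.12), "embedded": (2.8, 1.20)}

def intermediate_cocomo(kloc, category, drivers):
    a, b = PARAMS[category]
    m = math.prod(drivers)      # m(x): product of all cost-driver multipliers
    return a * kloc ** b * m

# Hypothetical multipliers: high reliability (1.15), modern practices (0.91)
print(round(intermediate_cocomo(32, "organic", [1.15, 0.91])))
```

With no drivers selected, `math.prod([])` is 1 and the formula reduces to the unadjusted estimate.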
Shortcomings of both models
Both models:
consider a software product as a single homogeneous
entity:
However, most large systems are made up of several
smaller sub-systems.
 Some sub-systems may be considered as organic type, some
may be considered embedded, etc.
 for some the reliability requirements may be high, and so on.

46
Complete COCOMO
 Cost of each sub-system is estimated separately.
 Costs of the sub-systems are added to obtain total cost.
 Reduces the margin of error in the final estimate.
 For Example:
 A Management Information System (MIS) for an organization having
offices at several places across the country:
 Database part (semi-detached)
 Graphical User Interface (GUI) part (organic)
 Communication part (embedded)
 Costs of the components are estimated separately:
 summed up to give the overall cost of the system.

47
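The complete-COCOMO idea can be sketched by estimating each subsystem with its own category and summing (the KLOC sizes below are hypothetical, and basic-COCOMO effort constants are used for simplicity):

```python
# Basic COCOMO effort constants (a, b) per category
PARAMS = {"organic": (2.4, 1.05), "semidetached": (3.0, 1.12), "embedded": (3.6, 1.20)}

def effort(kloc, category):
    a, b = PARAMS[category]
    return a * kloc ** b

# Hypothetical MIS subsystems, each with its own category
subsystems = [("GUI", 20, "organic"),
              ("Database", 30, "semidetached"),
              ("Communication", 10, "embedded")]

total = sum(effort(kloc, cat) for _, kloc, cat in subsystems)
print(round(total))  # total person-months across subsystems
```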
COCOMO II
Revision of COCOMO I in 1997
Provides three models of increasing detail
 Application Composition Model
 Estimates for prototypes based on GUI builder tools and existing
components
 Similar goal as Function Point analysis
 Based on counting object points (instead of function points)
 Early Design Model
 Estimates before the software architecture is defined
 For the system design phase; closest to the original COCOMO; uses
function points as the size estimate
 Post-Architecture Model
 Estimates once the architecture is defined; most detailed
 For the actual development phase and maintenance; uses FPs or
SLOC as the size measure
The estimator selects one of the three models based on the current
state of the project.
48
COCOMO II
Targeted for iterative software lifecycle models
 Boehm’s spiral model
 COCOMO I assumed a waterfall model
 30% design; 30% coding; 40% integration and test
COCOMO II includes new costs drivers to deal with
 Team experience
 Developer skills
 Distributed development
COCOMO II includes new equations for reuse
 Enables build vs. buy trade-offs
COCOMO II has 17 multiplicative cost drivers (7 are new).

49
Application Composition Model
Supports prototyping projects and projects where
there is extensive reuse.
Based on standard estimates of developer productivity
in object points/month.
Takes CASE tool use into account.
Formula is
Effort (PM) = ( NOP × (1 − %reuse/100) ) / PROD
where PM is the effort in person-months, NOP is the number
of object points, and PROD is the productivity.

50
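A sketch of the formula with hypothetical values (in practice PROD would come from a productivity table based on developer capability and CASE tool maturity):

```python
def app_composition_effort(nop, pct_reuse, prod):
    """Effort (PM) = NOP * (1 - %reuse/100) / PROD."""
    return nop * (1 - pct_reuse / 100) / prod

# Hypothetical: 120 object points, 25% reuse, 13 object points per month
print(round(app_composition_effort(120, 25, 13), 2))  # 6.92
```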
Object point productivity

51
Advantages of COCOMO
Appropriate for a quick, high-level estimation of
project costs.
Fair results with smaller projects in a well known
development environment.
Assumes comparison with past projects is possible.
Covers all development activities (from analysis to
testing).
Intermediate COCOMO yields good results for
projects on which the model is based.
Tools that automate intermediate COCOMO and
COCOMO II are available.
52
Problems with COCOMO
Judgment requirement to determine the influencing
factors and their values.
Experience shows that estimation results can deviate
from actual effort by a factor of 4.
Some important factors are not considered:
Skills of team members, travel, environmental factors,
user interface quality, overhead cost.

53
Thank You

54
