SLIDES01 - SE15 - AOA

Analysis of Algorithms

Day 1

1
Objectives of the course

• To introduce the concept of ‘Analysis of Algorithms’


• To learn the various factors that affect the performance of an algorithm
• To introduce algorithm design techniques
• To learn Code Tuning Techniques
• To introduce Numerical Analysis (Accuracy)
• To introduce Intractable problems

ER/CORP/CRS/SE15/003
Copyright © 2004, Infosys 2
Technologies Ltd Version No: 2.0

The main concerns of a software engineer are to ensure:

(i) Correctness of the solution
(ii) Decomposition of the software application into small, clean units which can be
maintained easily
(iii) Good performance of the software application

The main objective of the course is to introduce “Analysis of Algorithms” and to compute the
performance parameters of an algorithm.
After studying this course, you will get a better understanding of the importance of
designing good algorithms and efficient programs.

References
1. Donald E. Knuth (1997), The Art of Computer Programming, Volume 1: Fundamental
Algorithms, Third Edition, Addison-Wesley
2. Cormen, Leiserson, Rivest, Stein (2001), Introduction to Algorithms, Second
Edition, MIT Press
3. Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman (1998), The Design and Analysis of
Computer Algorithms, Addison-Wesley
4. Ellis Horowitz, Sartaj Sahni, Sanguthevar Rajasekaran (1998), Fundamentals of Computer
Algorithms, Galgotia Publications, New Delhi
5. Weiss, M. A. (1993), Data Structures and Algorithm Analysis in C, Benjamin/Cummings
6. Jon Bentley (2000), Programming Pearls, Second Edition, Pearson Education
7. McConnell, S. (1993), Code Complete, Microsoft Press
8. Press et al. (2002), Numerical Recipes in C++, Cambridge University Press

Course Plan
Day 1

• Introduction to Analysis of algorithms


– What is an Algorithm?
– Properties of an Algorithm
– Life cycle of an Algorithm

• Analyzing Algorithms
– Introduction to Space and Time complexities
– Basic Mathematical principles
– Order of magnitude
– Introduction to Asymptotic notations
• Best case
• Worst case
• Average case

Course Plan (cont...)
Day 2

• Algorithm design techniques


– Brute force
– Greedy
– Divide & Conquer
– Decrease & Conquer
– Dynamic Programming

Course plan (cont…)
Day 3

• Code Tuning
• SQL Query Tuning
• Introduction to Numerical Analysis
• Intractable problems
– Deterministic vs Non-Deterministic machines
– P vs NP
– NP Complete

Analysis of Algorithms
Unit 1 - Introduction

7
Introduction to Algorithms
The etymology of the word “Algorithm” dates back to the 8th century AD.
The word “Algorithm” is derived from the name of the Persian author
“Abu Jafar Mohammad ibn Musa al Khowarizmi”.

[Figure: Muhammad al-Khowarizmi, from a 1983 USSR commemorative stamp scanned by Donald Knuth. Reference: ACM Trans. on Algorithms]


Abu Jafar Mohammad ibn Musa al Khowarizmi was a great mathematician who was born
around 780 AD in Baghdad. He worked on algebra, geometry, and astronomy. His treatise
on algebra, Hisab al-jabr w'al-muqabala, was the most famous and important of all of al-
Khwarizmi's works. It is the title of this text that gives us the word "algebra".

What is an Algorithm?
• Finite set of instructions to accomplish a task. The algorithm should be
correct
• The properties of an algorithm are as follows:

[Figure: the five properties of an algorithm, Finiteness, Definiteness, Input, Output and Effectiveness, surrounding the Algorithm]


An Algorithm is defined as “Finite set of instructions to accomplish a task”.

An Algorithm has five properties as follows:


Finiteness: An algorithm should end in a finite number of steps.
Definiteness: Every step of an algorithm should be clear and unambiguously defined.
Input: The input of an algorithm can either be given interactively by the user or generated
internally.
Output: An algorithm should have at least one output.
Effectiveness: Every step in the algorithm should be easy to understand and prove using
paper and pencil.

Algorithm

Practice:

Write an algorithm to find the GCD of two numbers.

Step 1: Get two numbers m and n
Step 2: Divide m by n
Step 3: If the remainder is 0, then return n as the GCD
        else
            m ← n, n ← remainder
            Goto Step 2

Check whether the above algorithm (Euclid's Algorithm) for finding the GCD of two given
numbers satisfies all the properties of an algorithm.


The above algorithm satisfies all the properties except definiteness: consider what
happens if m = -2 and n = 3.45.
So change Step 1 to "Get two positive non-zero integers m and n".
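The steps above translate directly into C. A minimal sketch (the function name `gcd` and the use of the `%` operator for "divide and take the remainder" are our choices, not part of the slides):

```c
/* Euclid's algorithm, following Steps 1-3 of the slide.
 * m and n are assumed to be positive non-zero integers,
 * as required by the corrected Step 1. */
int gcd(int m, int n)
{
    int remainder = m % n;      /* Step 2: divide m by n            */
    while (remainder != 0) {    /* Step 3: repeat until remainder 0 */
        m = n;
        n = remainder;
        remainder = m % n;
    }
    return n;                   /* n is the GCD                     */
}
```

For example, gcd(48, 18) walks through the pairs (48, 18), (18, 12), (12, 6) and returns 6.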

10
Algorithms span a vast space
• The definition
“Finite Set of Instructions to accomplish a task”
spans a very vast space. We will only discuss a few kinds of algorithms,
but will briefly indicate the larger picture through a simplified banking
application


The gamut of algorithms is very vast, spanning symbolic, numerical, power-efficient, fault-
tolerant algorithms, etc. We illustrate this wide variety through a simplified banking
example.

The different kinds of algorithms used in the banking application have different speeds,
memory requirements, real time response, numerical accuracy, fault tolerance, etc.

Banking Applications: Utilize Computers, Networks, and Storage

[Figure: architecture of a banking application, annotated with the kinds of algorithms used:
• Routing tables, link state information
• Telephone banking (IVR), real-time TTS
• Authentication, encryption
• Fault-tolerant data structures; failover
• Communication protocols, real time / error recovery
• High-speed rule-based system, with/without state, 10K+ transactions/second
• Financially accurate calculation (Rs 1 in Rs 1,000,000 crores, one part in 10^13)
• WAN link to mirror; keep multiple copies in DR in sync
• Huge databases: 10's of terabytes; replication
• Disk layout, data compression, database optimization, encryption
• Bandwidth conservation, MP3
• Fault tolerance (detect potential loss)]

This banking application utilizes all kinds of algorithms, from symbolic through real time through fault
tolerant. The figure also illustrates in a simplified form the architectural building blocks which comprise this
banking system, and the algorithm classes written to execute on it.

We show a Finacle installation, with terminals at a branch connected to a set of clustered web servers for
authentication. The web servers are in turn connected to a set of application servers implementing banking
rules and policies. The application servers access mirrored and/or replicated data storage. Redundancy is present
in the network also. Telephone banking using an Interactive Voice Response system is used as a backup if the
branch terminals break down.

The design of the authentication hardware and software requires fault tolerance – the users should not have to
relogin if one or more servers fail – some state should be stored in the form of cookies in non-volatile storage
somewhere. The banking calculations require very high accuracy (30+ digit accuracy). Various kinds of fault
tolerance schemes are used for storage. For example, two mirrored disks always keep identical data. A write to
one disk is not considered complete till the other is written also. The servers have to respond within seconds to
each user level request (deposit, withdrawal, etc) – the real time response of the system has to be evaluated using
queuing theory and similar techniques. For TTS, the response output speech samples have to be guaranteed to
be delivered at periodic time intervals, say every 125 microseconds.

Glossary:
DR: Disaster Recovery
TTS: Text to Speech
IVR: Interactive Voice Response
WAN: Wide Area Network

Pseudo Code
• An algorithm is independent of any language or machine whereas a program is
dependent on a language and a machine
• To fill the gap between these two, we need pseudo-code
• Pseudo-code is a way to represent the step-by-step method of finding the solution
to the given problem.

Example:
Algorithm arrayMax (A, n)
    Input: array A of n integers
    Output: maximum element of A
    currentMax ← A[0]
    for i = 1 to n - 1 do
        if A[i] > currentMax then
            currentMax ← A[i]
    return currentMax


Algorithms are developed during the design phase of software engineering. During the
design phase, we first look at the problem, try to write the "pseudo-code" and then move
towards the programming (implementation) phase.

Pseudo-code is a high-level description of the algorithm:
• It is less detailed than the program
• It will not reveal the design issues of the program
• It uses an English-like language
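The arrayMax pseudo-code above translates almost line for line into C; a minimal sketch (the parameter types and the `const` qualifier are our additions):

```c
/* Returns the maximum element of an array A of n integers.
 * n is assumed to be at least 1, so A[0] is a valid start value. */
int arrayMax(const int A[], int n)
{
    int currentMax = A[0];          /* currentMax <- A[0]     */
    for (int i = 1; i < n; i++)     /* for i = 1 to n - 1 do  */
        if (A[i] > currentMax)      /* if A[i] > currentMax   */
            currentMax = A[i];      /*     currentMax <- A[i] */
    return currentMax;
}
```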

Life Cycle of an Algorithm
• Design the Algorithm

• Write (Implementation of the Algorithm)

• Test the Algorithm

• Analyze the Algorithm


The life cycle of an algorithm consists of the four phases: Design, Write, Test and Analyze.
(i) Design:
The design techniques help in devising algorithms. Some techniques are Divide & Conquer, the Greedy
technique, Dynamic Programming, etc.
The design techniques will be dealt with in Unit 3 (Day 2).

(ii) Write (implementation): Implementing the algorithm in pseudo code which will be later represented in an
appropriate programming language.

(iii) Test: Testing the algorithm for its correctness.

(iv) Analyze: Estimating the amount of time/space (which are considered to be prime resources) required while
executing the algorithm.

Resources available in a computer

[Figure: the primary resources available in a computer, including CPU, primary memory and POWER]

The Primary Resources available in a deterministic silicon computer are:


CPU &
Primary memory

In this course we will focus on time (CPU utilization) and space (memory utilization).

When an algorithm is designed it should be analyzed for the amount of these resources it
consumes. While solving a problem, an algorithm consuming more resources than others
will not be considered in most of the cases.

Analysis of Algorithms
• An algorithm when implemented, uses the computer’s primary memory and
Central Processing Unit

• Analyzing the amount of resources needed for a particular solution of the


problem

• The Analysis is done at two stages:


– Priori Analysis:
» Analysis done before implementation
– Posteriori Analysis:
» Analysis done after implementation


In Analysis we analyze the amount of resources needed for a particular solution of the
problem.
There are two types of Analysis:
Priori Analysis:
This is the theoretical estimation of resources required. Here the efficiency of the algorithm
is checked. If possible the logic of the algorithm can be improved for efficiency.
This is done before the implementation of the algorithm on a machine and so it is done
independent of any machine/software.
Posteriori Analysis:
This Analysis is done after implementing the algorithm on a target machine. It is aimed
at determination of actual statistics about algorithm’s consumption of time and space
requirements (primary memory) in the computer when it is being executed as a program.

Eg. Algorithm to check whether a number is prime or not.

Algo1: Divide the number n by each number from 2 to (n-1) and check the remainder
Algo2: Divide the number n by each number from 2 to n/2 and check the remainder
Algo3: Divide the number n by each number from 2 to sqrt(n) and check the remainder

Before implementing the algorithm in a programming language (Priori Analysis), the best
of the three algorithms will be selected (Algo3 will suit if n is large).

After implementing the algorithm in a programming language (Posteriori Analysis), the
performance is checked with the help of a profiler.
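Algo3 can be sketched in C as follows; testing `i * i <= n` is an equivalent way to stop at sqrt(n) without a floating-point `sqrt` call (this variant, and the function name, are our choices):

```c
#include <stdbool.h>

/* Algo3: trial division from 2 up to sqrt(n).
 * Returns true if n is prime; n is assumed to be >= 2. */
bool is_prime(int n)
{
    for (int i = 2; i * i <= n; i++)   /* i runs up to sqrt(n) */
        if (n % i == 0)                /* check the remainder  */
            return false;
    return true;
}
```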

A high-level view of analysis of algorithms

[Figure: dimensions along which algorithms can be analyzed:
• Correctness
• Accuracy: accurate to within an error margin; condition number
• Resource usage: time / memory / power / communication / I-O
• Resiliency analysis: mirroring, replication; distributed system analysis
• Asymptotics: O(N^2), O(N log(N)), ...
• Beyond asymptotics: mean, variance, power analysis, physical modeling]

Algorithms can be analyzed in many dimensions: speed, accuracy, power consumption,
and resiliency.

•Numerical algorithms have to be devised for adequate accuracy. Only after you get
sufficient accuracy can we look at speed.

•Speed has many dimensions, asymptotics, mean time, variance of the execution time, etc.
Memory or in general resource usage is a dual metric

•Embedded systems have to be power efficient, e.g. cell phones.

•Many algorithms, especially in banking and finance, are required to be fault tolerant,
especially against server failures. These systems are generally required to be geographically
distributed. The resulting communication overhead can often be the dominant contribution
to time.

Efficiency Measures
• Performance of a solution

• Most of the software problems do not have a single best solution

• Then how do we judge these solutions?

• The solutions are chosen based on performance measures

• Performance Measures

• Time

• Quality

• Simplicity…


Why Performance?

Since most software problems do not have a unique solution, we are always
interested in finding the better solution. A better solution is judged based on its performance.
Some of the performance measures include the time taken by the solution, the quality of the
solution, the simplicity of the solution, etc.

For any solution to a problem we would always ask the following questions:

"Is it feasible to use this solution?" In other words, is it efficient enough to be used in
practice? The efficiency measures which we normally look for are time and space. How much
time does this solution take? How much space (memory) does this solution occupy?

Improving the performance of a solution can be done by improving the algorithm design,
database design, transaction design and by paying attention to the end-user psychology.
Also continuous improvements in hardware and communication infrastructure aid in
improving the performance of a solution.

Efficiency Measures (Contd…)
• Space Time Tradeoff

Example 1: Consider a personnel management product that an organization can purchase


and use to maintain information about its employees. If employee details were to be stored in
an array, the array would have to be declared large enough to be able to hold the maximum
number of records the system was rated to handle. This would always take up a large
amount of memory. With a linked list implementation on the other hand, there would be
better utilization of memory.

Which implementation would provide faster access to an employee with a given employee
number?

Which implementation would be easier to code?

Which implementation would be easier to test?


The above example tries to highlight the need for performance. Each of the three
questions asked is aimed at some performance measure.

The array is the better data structure for each of these questions. However, if a
different company also plans to buy this product, then the size of the array must be very high
(which could as well lead to wastage of space). In this case a linked list data structure might
be a better option.

This example also highlights a universal problem called the space-time tradeoff, which we
will be discussing shortly.

Efficiency Measures (Contd …)
Example 2: Think of a GUI drop-down list box that displays a list of employees
whose names begin with a specified sequence of characters. If the employee
database is on a different machine, then there are two options:

Option a: fire a SQL and retrieve the relevant employee names each time the list
is dropped down.

Option b: keep the complete list of employees in memory and refer to it each time
the list is dropped down.

In your opinion which is the preferred option and why?


This example again does not have a unique solution. It depends on various parameters
which include:
• The number of employees
• The transmission time from the database server to the client machine
• The volume of data transmission each time
• The frequency of such requests
• The network bandwidth

Neither of the solutions is the better one. The main point here is the tradeoff. Whenever we
need better performance in terms of time taken, we could opt for option b, which would
however lead to more memory requirements. The vice versa is also true: when we want our
solution to occupy less memory (space), we need to strike a compromise on the efficiency
in terms of time taken. This tradeoff is called the space-time tradeoff, which is a universal
principle.

Efficiency Measures (Contd …)
Example 3:
Which one of the following problems requires more space?

• Design a computer program which produces an output 1 if the word is of
  length 3n (n = 0, 1, 2, …) and 0 otherwise
  Example:
  If the input is “aabcef” the output is 1
  If the input is “aabc” then the output is 0

• Design a computer program that sorts (in ascending order) and outputs the
  result for any input sequence a1, a2, …, an of numbers, where n is any natural
  number


Consider the RAM size required by both programs.
Program 1 always requires a constant amount of memory.
Program 2 requires memory that grows with the input length n.

Summary of Unit - 1
• What is an Algorithm?

• Properties of an Algorithm

• Life Cycle of an Algorithm

• Performance Measures

Analysis of Algorithms
Unit 2 - Analyzing Algorithms

Analysis of Algorithms

• Refers to predicting the resources required by the algorithm, based on the
size of the problem

• The primary resources required are Time and Space

• Analysis based on the time taken to execute the algorithm is called the
Time complexity of the algorithm

• Analysis based on the memory required to execute the algorithm is called
the Space complexity of the algorithm


When a programmer builds an algorithm during the design phase of the software life cycle,
he/she might not be able to implement it immediately, because programming comes in a later
part of the software life cycle. But there is a need to analyze the algorithm at that stage.
This will help in forecasting how much time the algorithm takes or how much primary
memory it might occupy when it is implemented. So analysis of algorithms becomes very
important.

Complexity of an algorithm represents the amount of resources required while executing


the algorithm.
There will always be a tradeoff between the time and space complexity.
Most of the problems which require more space will take less time to execute and vice
versa.

Space Complexity
The space needed by a program has the following components:
• Instruction space
• Data space
• Environment stack space


Instruction space:
Space needed to store the object code.
Data space:
Space needed to store constants & variables.
Environment stack space:
Space needed when functions are called. If a function fnA calls another
function fnB, then the return address and all the local variables and formal parameters are
to be stored.

Time Complexity
Time complexity depends on the machine, compilers and other real time factors.

Total time = Σ ( ti * opi(n) )

where opi(n) is the number of times the operation opi occurs and ti is the time
taken to execute the operation

This Total time is a varying factor which depends on the current load of the system
and other real time factors like communication


Time complexity also depends on all the factors that the space complexity depends on.

Time complexity includes the compilation time and the execution time, but compilation is
done once whereas execution is done many times. So the compilation time is not considered
in most cases, only the execution time.

Time Complexity (Cont…)
Operation count is one way to estimate the Time Complexity.

• Example 1: Searching an array for the presence of an element

  Here the time complexity is estimated based on the number of search operations.

• Example 2: Finding the roots of a quadratic equation ax^2 + bx + c = 0

  The roots are (-b + sqrt(b^2 - 4ac)) / (2a) and (-b - sqrt(b^2 - 4ac)) / (2a).

  Here the number of operations can be reduced by computing the common
  expression sqrt(b^2 - 4ac) only once.


The success of this method (Operation count) depends on the identification of the exact
operation/s that contribute most to the time complexity.

Time Complexity (Cont…)
Step count is another way to estimate time complexity

Consider the code below (total steps for each line shown at right):

sum(array, n)                              0
{                                          0
1.1    tsum = 0;                           1
1.2    for (i = 0; i < n; i++)             2n + 2
1.2.1      tsum = tsum + array[i];         n
1.3    return tsum;                        1
}                                          0

Total number of steps: 3n + 4
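The 3n + 4 count can be checked mechanically by instrumenting the function with a counter; each `steps++` below mirrors one row of the table (the global counter is our addition, not part of the original code):

```c
long steps = 0;   /* global step counter (our instrumentation) */

int sum(const int array[], int n)
{
    int tsum = 0;               steps++;   /* 1.1: 1 step             */
    steps++;                               /* 1.2: i = 0              */
    for (int i = 0; i < n; i++) {
        steps++;                           /* 1.2: test i < n (true)  */
        tsum = tsum + array[i]; steps++;   /* 1.2.1: n steps in total */
        steps++;                           /* 1.2: i++                */
    }
    steps++;                               /* 1.2: final test i < n   */
    steps++;                               /* 1.3: return             */
    return tsum;
}
```

Running it with n = 5 leaves `steps` at 3*5 + 4 = 19, matching the table.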

Time Complexity (Cont…)
Recursive functions:

fact(n)                                    0
{                                          0
1.1    if (n <= 1)                         n
           return 1;                       1
1.2    return ( n * fact(n-1) );           2n - 2
}                                          0

Total number of steps: 3n - 1


Step 1.1 is executed n times, and its return is executed 1 time.

Step 1.2 contains one multiplication and one function call. Each is done (n-1)
times, so 2n - 2.

Time Complexity (Cont…)
Function calling:
Consider a function calling the function sum(array, n) (ref: slide 28)

Callsum(array1, array2, n)                 0
{                                          0
1.1    for (i = 0; i < n; i++)             2n + 2
1.1.1      array2[i] = sum(array1, i+1);   3i + 8 each, Σ = n(3n+13)/2
}                                          0

Total number of steps: (3n^2 + 17n + 4)/2


In step 1.1.1 the function sum(array, n) is called.

The total number of steps for that function was already calculated as 3n + 4. sum is
called with n = i + 1, so substituting n = i + 1 gives 3(i + 1) + 4 = 3i + 7.
This value is incremented by 1 for the function call itself, so it becomes 3i + 8.
This 3i + 8 varies for i = 0 to n-1, which gives (3*0 + 8) + (3*1 + 8) + … + (3*(n-1) + 8)
= 3(0 + 1 + 2 + … + (n-1)) + 8n = 3(n-1)n/2 + 8n = n(3n+13)/2
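The closed form (3n^2 + 17n + 4)/2 can be verified the same way, by instrumenting both functions with one shared counter (the counter, and n = 4 in the usage note, are our additions):

```c
long steps = 0;   /* shared step counter (our instrumentation) */

int sum(const int array[], int n)          /* 3n + 4 steps (slide 28) */
{
    int tsum = 0;               steps++;
    steps++;                               /* i = 0                   */
    for (int i = 0; i < n; i++) {
        steps++;                           /* test i < n              */
        tsum = tsum + array[i]; steps++;
        steps++;                           /* i++                     */
    }
    steps++;                               /* final test              */
    steps++;                               /* return                  */
    return tsum;
}

void Callsum(const int array1[], int array2[], int n)
{
    steps++;                               /* 1.1: i = 0              */
    for (int i = 0; i < n; i++) {
        steps++;                           /* 1.1: test i < n         */
        array2[i] = sum(array1, i + 1);
        steps++;                           /* 1.1.1: the call itself  */
        steps++;                           /* 1.1: i++                */
    }
    steps++;                               /* 1.1: final test i < n   */
}
```

For n = 4 this leaves `steps` at (3*16 + 17*4 + 4)/2 = 60, and array2 holds the prefix sums {1, 3, 6, 10} of array1 = {1, 2, 3, 4}.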

Kinds of Analysis of Algorithms

• Posteriori Analysis is aimed at determination of actual statistics about an
algorithm's consumption of time and space requirements (primary memory) in
the computer when it is being executed as a program. The Profiler tool is
mainly used in finding the performance bottlenecks of a program

• Priori Analysis is aimed at analyzing the algorithm before it is implemented on
any computer. It will give the approximate amount of resources required to
solve the problem before execution


Posteriori analysis is done after implementing the algorithm in a Programming Language


and running it in a machine.

Priori Analysis is carried out before the program is written (based on the algorithm). The
calculation of order of magnitude in the examples we have seen is the priori
analysis of the algorithm.

In the case of priori analysis, we ignore the machine and platform dependent factors, and we
analyze the algorithm before we write the program. It is always better to analyze the
algorithm at an earlier stage of the software life cycle.

Posteriori analysis
• External factors influence the execution of the algorithm
– Network delay
– Hardware failure etc.,

• The same algorithms might behave differently on different systems


• The load on the machine can vary which affects the real performance
measure of the algorithm
• Profiler tool can be used for performing Posteriori analysis

Posteriori Analysis (Cont…)
PROFILER
• What is a Profiler?
A tool to identify the performance bottlenecks of an application.

• Why Profiler?
– To find the performance bottle necks.
– Visualizing the run time of the code.
– Finding out the time consumed by the code for the given input

• Limitations of a Profiler
  – Most profilers report results in terms of specific time durations
  – These may vary depending on the load on the system

• Queries can also be profiled (provided by database vendors)


– tkprof


Build a table which lists the total number of steps that each statement contributes. Add the
contributions of all statements to obtain the step count for the entire program, so that we can
get the percentage contribution of each statement. This approach to obtaining the step count
(ref: time complexity) is called profiling. The same approach is applicable to the various
functions (subprograms) in a program.

Refer Lab guide for VC++ profiler.

Priori Analysis
Priori analysis require the knowledge of
– Mathematical equations
– Determination of the problem size
– Order of magnitude of any algorithm

Each of these are discussed in the forthcoming sections

Some Basic Mathematics
Arithmetic Progressions:

   ∑ i  (i = 1 to n)  =  1 + 2 + 3 + ... + (n-1) + n  =  n(n+1)/2

Geometric Progressions:

   ∑ x^i  (i = 0 to n)  =  (x^(n+1) - 1) / (x - 1),  if x ≠ 1

   ∑ x^i  (i = 0 to ∞)  =  1 / (1 - x),  if |x| < 1

Mathematical knowledge is essential for performing priori analysis.

Arithmetic progressions:
In this series, the difference between an element and its successor is the same as the
difference between the element and its predecessor.
So the series will be:
a, a + d, a + 2d, a + 3d, …
Sum of n terms = (n/2) * (first term + last term)
Also, the sum of n terms = (n/2) * [2 * first term + (n-1) * common difference] = (n/2) * [2a + (n-1)d]

Geometric progressions:
There is a constant ratio between an element and its successor (it is the same as the
ratio between an element and its predecessor).
So the series will be:
a, ar, ar^2, ar^3, …

The sums to n terms are shown in the slide above.

Some Basic Mathematics (Contd…)

Logarithms:

   a = b^(log_b a)

   log_b a = (log_c a)(log_b c)

   log_b a = 1 / (log_a b)

   log_b a = (log_c a) / (log_c b)

The log functions grow slowly compared to linear functions.

•log_a(x) is a constant multiple of log_b(x) for fixed a, b.

Whenever the log is specified without a base, it is log base 2.

Factorials:
The number n! is 1 * 2 * 3 * … * (n-1) * n

Some Basic Mathematics (Contd…)
A few mathematical formulae.

• 1^2 + 2^2 + … + n^2 = n(n+1)(2n+1)/6

• 1 + a + a^2 + … + a^n = (a^(n+1) - 1) / (a - 1)

• Floor function f(x), or ⌊x⌋:
  For a real number x, f(x) is the largest integer not greater than x.

• Choice function: nCr = n! / ( r! (n-r)! )

•Applying the basic concepts seen so far, the above series can be evaluated.
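A quick brute-force check of the first two formulae for small n (the function names and test values are ours):

```c
/* Left-hand sides computed term by term, to compare against
 * the closed-form right-hand sides of the slide. */
long sum_of_squares(int n)            /* 1^2 + 2^2 + ... + n^2   */
{
    long s = 0;
    for (int i = 1; i <= n; i++)
        s += (long)i * i;
    return s;
}

long geometric_sum(long a, int n)     /* 1 + a + a^2 + ... + a^n */
{
    long s = 0, term = 1;
    for (int i = 0; i <= n; i++) {
        s += term;
        term *= a;
    }
    return s;
}
```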

Growth of functions
• Algorithm complexity will be represented in terms of mathematical functions,
  e.g. n log n, n^2

• Given the complexities log(n), n, n log(n), n^2 and 2^n, which will grow
  slowest?

[Figure: growth curves of log(n), n, n log(n), n^2 and 2^n, with problem size on the x-axis and resources on the y-axis]

•In the figure in the slide, the x axis represents the problem size and the y axis represents
the resources.
•As part of Basic Mathematical Principles we will introduce applicable mathematics as
required for this course.

•Growth of functions: The figure above shows the growth of a few mathematical
functions. The x-axis varies from 0 to 50 and the y-axis varies from 0 to 100. The
point to be observed here is that the growth rate of the function log(n) is smaller when
compared to the other functions, namely n, n log(n), n^2 and 2^n. An exponential
function like 2^n will ultimately overtake any polynomial function. The need to
understand the growth of these basic functions will be well appreciated in the later
chapters wherein we analyze algorithms.
•From the graph, we can see that the logarithmic functions grow more slowly
and the exponential functions grow much faster.

What are factorial functions? What is their growth rate?

Functions which grow at the rate of n! are called factorial functions.

The growth rate of the factorial is so tremendous that it is much greater than even 2^n.

Some Basic Mathematics (Contd…)
How many times should we divide (into half) the number of elements 'n'
(discarding remainders if any) to reach 1 element?

Since n is being divided by 2 consecutively, we need to consider two cases.

Case 1: n is a power of 2:
Say for example n = 8, in which case 8 must be halved 3 times to reach 1:
8 → 4 → 2 → 1. Similarly 16 must be halved 4 times to reach 1:
16 → 8 → 4 → 2 → 1
Case 2: n is not a power of 2:
Say for example n = 9, in which case 9 must be halved 3 times to reach 1:
9 → 4 → 2 → 1. Similarly 15 must be halved 3 times to reach 1:
15 → 7 → 3 → 1. So if 2^m < n < 2^(m+1), then n must be halved m times to reach 1.
In general, n must be halved m times, where m is given by:

m = floor(log2 n)


•The above result is necessary for analyzing most algorithms.

•As a corollary to the above result, we can easily see that a
number n must be halved floor(log2 n) + 1 times to reach 0.
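The halving count can be computed directly and compared with floor(log2 n) (the function name is ours):

```c
/* Counts how many times n must be halved (discarding remainders,
 * via integer division) before reaching 1. The result equals
 * floor(log2(n)) for any n >= 1. */
int halvings_to_one(int n)
{
    int m = 0;
    while (n > 1) {
        n = n / 2;     /* integer division discards the remainder */
        m++;
    }
    return m;
}
```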

A high-level view of analysis of algorithms

[Figure: dimensions along which algorithms can be analyzed:
• Correctness
• Accuracy: accurate to within an error margin; condition number
• Resource usage: time / memory / power / communication / I-O
• Resiliency analysis: mirroring, replication; distributed system analysis
• Asymptotics: O(N^2), O(N log(N)), ...
• Beyond asymptotics: mean, variance, power analysis, physical modeling]

Given the wide variety of algorithms, they can be analyzed in many dimensions, speed,
accuracy, power consumption, and resiliency.

•Numerical algorithms have to be devised for adequate accuracy. Only after you get
sufficient accuracy can we look at speed.

•Speed has many dimensions, asymptotics, mean time, variance of the execution time, etc.
Instead of time, we can look at memory or in general resource usage also.

•Embedded systems have to be power efficient, e.g. cell phones.

•Many algorithms, especially in banking and finance, are required to be fault tolerant,
especially against server failures. These systems are generally required to be geographically
distributed. The resulting communication overhead can often be the dominant contribution
to time.

•In this module, we shall primarily focus on ASYMPTOTICS

Problem size
The problem size depends on the nature of the problem for which we are
developing the algorithm.
The complexity of an algorithm is expressed as a function of the problem size.

Examples:
• If we are searching for an element in an array having ‘n’ elements, the problem
size is the size of the array ( = ‘n’).
• If we are merging 2 arrays of size ‘n’ and ‘m’, the problem size of the algorithm
is the sum of the two array sizes ( = ‘n + m’).
• If we are computing n factorial, the problem size is ‘n’.


The space required for storing n elements is n.

The space required for representing a number n in binary is floor(log2(n)) + 1 bits.

Order of Magnitude of an algorithm
Calculate the running time and consider only the leading term of the formula, which
gives the order of magnitude.
• Example 1
for( i = 0; i < n; i++ )
{
...
...
}
Assume there are ‘c’ statements inside the loop
Each statement takes 1 unit of time

Execution time for 1 iteration = c * 1 = c

Total execution time = n * c

Since ‘c’ is a constant, it is insignificant. So the order is ‘n’


In calculating the order of magnitude, the lower order terms are left out as they are
relatively insignificant.

The assumptions in the example are made because we will not know on which machine the
algorithm is to be implemented. So we can’t exactly say how much time each statement will
take. The exact time depends on the machine on which the algorithm is run.
In the example the approximation is done because for higher values of ‘n’, the effect of ‘c’
(constant) will not be significant. Thus, constants can be ignored.

Order of Magnitude of an algorithm (Cont…)
• Example 2
for( i=0;i<n; i ++) {
for(j=0;j<m;j++) {
…. ….
}
}
Assume we have ‘c’ number of statements inside the innermost loop
Following the same assumptions as the earlier example

Execution time for one iteration of the inner loop = c * 1

Execution time for the inner loop = m * c

Total execution time = n * (m * c)

Since c is a constant, the order is n * m


In the above example, the inner loop will be executed m times and the outer loop n times.

Analysis based on the nature of the problem
The analysis of the algorithm can be performed based on the nature of the
problem.

Thus we have:
• Worst case analysis
• Average case analysis
• Best case analysis


Worst case:
Under what conditions does the algorithm, when executed, consume the maximum amount of
resources? It is the maximum amount of resource the algorithm can consume for any given
problem size.

Best case:
Under what conditions does the algorithm, when executed, consume the minimum amount of
resources?

Average case:
This lies between the worst case and the best case. It is probabilistic in nature. Average-case
running times are calculated by first arriving at an understanding of the average nature of the
input, and then performing a running-time analysis of the algorithm for this configuration.
Average case analysis is usually done by considering every possible input to be equally likely.

Why Worst case analysis?
Even though the average case tends to reflect the real situation more closely, worst case
analysis is preferred for the following reasons:

• It is better to bound one’s pessimism – the time of execution can’t go beyond
T(n), as it is the upper bound

• Generally it is easier to compute the worst case than the best case or
average case of an algorithm


During a priori analysis, worst case complexity is preferred. Why?

The goodness of an algorithm is most often expressed in terms of its worst-case running
time. There are two reasons for this: the need for a bound on one’s pessimism, and the
ease of calculating (in most cases) worst-case times as compared to average-case
times.

We prefer worst case complexity because it is easier to compute than the average case
complexity, and because the best case is rarely useful. It is also safer to know the
maximum time of execution of an algorithm.

Asymptotic notations for determination of order of
magnitude of an algorithm
The limiting behavior of the complexity of a problem as problem size increases is
called asymptotic complexity

The most common asymptotic notations are:

• ‘Big Oh’ ( ‘O’ ) notation:
It represents the upper bound of the resources required to solve a problem.
It is represented by ‘O’

• ‘Omega’ notation:
It represents the lower bound of the resources required to solve a problem.
It is represented by Ω


The goodness of an algorithm is usually expressed in terms of its worst case running time.
The ‘worst case running time’ of an algorithm is the ‘upper bound’ on the time of execution
of that algorithm for different problem sizes.
An algorithm is said to have a worst-case running time of O(n^2) if its running time
(execution time) is always bounded by a constant multiple of n^2, where n is the problem size.

Goodness of an algorithm refers to its efficiency or capability.

Upper bound is also called the upper limit or the range of maximum values. E.g., when we
consider the marks of a student out of 100, 100 is the upper bound. A student can’t get marks
greater than 100.

Asymptotic analysis: What it does?
• Asymptotic analysis is necessary but not sufficient for many kinds of problems

[Figure: if a problem of size N takes 100% of some unit time, what percentage does a
problem of size 2N take?]


The large body of literature on asymptotic (a priori) analysis basically answers the question:
in relative terms, how much more time does a problem of, say, twice the size take? If I
can sort 1000 numbers in unit time, how much time will it take to sort 10000 numbers?
The unit time is not specified (the analysis is relative), but could be, say, 10-100 microseconds
on a typical modern PC.
It does not attempt to give exact estimates of runtime. In database and similar applications,
asymptotic analysis is very useful, as it yields insight into scalability to larger database
sizes. In real-time and transaction-processing systems, scalability in terms of throughput
(more answers/second for problems of the same size) requires the mean and variance
of the execution time to be controlled instead.

A large portion of this course will deal with asymptotic analysis.

Big Oh notation
T(n) = O(f(n)) if there are constants c and n0 such that T(n) <= cf(n) when
n >= n0. In this Big-Oh notation for worst case analysis, c and n0 are positive
integers. n0 represents the threshold problem size.

[Graph: T(n) and c·f(n) plotted against problem size. Beyond the threshold problem
size n0, T(n) stays below c·f(n), the upper bound of the algorithm.]


While we compute the complexity of any algorithm, we take the problem size beyond the
threshold, i.e. n > n0, where n0 is the threshold problem size and n is the problem size.
Accordingly we determine the upper bound of computation.
In the above graph, the dotted line (parallel to the y-axis) passing through the intersection of
T(n) and c·f(n) represents the threshold problem size.
The threshold problem size is taken into account in a priori analysis because the algorithm
might have some assignment operations which can’t be neglected for a lower problem size
(i.e. for lower values of ‘n’).

Example:
T(n) = (n+1)^2, which is O(n^2):
f(n) = n^2
Let n0 = 1 (threshold value)
c = (1+1)^2 = 4
So there exist n0 and c such that T(n) <= c·f(n).
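The constants in the example can also be verified numerically; this Python check (illustrative, not part of the slides) confirms T(n) ≤ c·f(n) for the chosen c = 4 and n0 = 1:

```python
def T(n):
    return (n + 1) ** 2   # the running time from the example

def f(n):
    return n ** 2         # the bounding function

c, n0 = 4, 1
# (n+1)^2 <= 4n^2 holds for every n >= n0, so T(n) = O(n^2)
assert all(T(n) <= c * f(n) for n in range(n0, 10_000))
```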

Theta & Omega notations

Theta notation (Θ):
T( n ) = Θ( f( n )) if there are positive constants c1, c2 and n0 such that
c2.f(n) ≤ T( n ) ≤ c1.f(n), for all n ≥ n0.

Omega Notation (Ω):
T( n ) = Ω( f( n )) if there are positive constants c and n0 such that
T( n ) ≥ c.f( n ) for all n ≥ n0.


Theta notation:
If it can be proved that for some constants c1 & c2, T(n) lies between c2.f(n) and c1.f(n),
then T(n) can be expressed as Θ( f( n )).

Omega notation:
The function f(n) is a lower bound for T(n). This means that for every value of n (n ≥ n0),
the time of computation of the algorithm, T(n), is always above the graph of c.f(n). So f(n)
serves as a lower bound for T(n).

Big ‘Oh’ Vs Omega notations
Case (i) : A Project manager requires maximum of 100 software engineers to
finish the project on time.
Case (ii) : The Project manager can start the project with a minimum of 50 software
engineers but cannot assure completion of the project on time.

Case (i) is similar to Big Oh notation, specifying the upper bound of resources
needed to do a task.
Case (ii) is similar to Omega notation, specifying the lower bound of resources
needed to do a task.


Which case is preferred?

Case (i) is preferred in most of the situations.

‘Big Oh’ manipulations

While finding the worst case complexities of algorithms using Big Oh notation,
some/all of the following rules are used.

Rule I
The leading coefficient of the highest power of ‘n’, all lower powers of ‘n’, and the
constants are ignored in f(n)

Example:
T(n) = O(100n^3 + 29n^2 + 19n)

Representing the same in big Oh notation:

T(n) = O(n^3)


The constants and the slower growing terms are ignored as their growth rates are
insignificant compared to the growth rate of the highest power.
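A small sketch (Python, illustrative only) showing why the lower-order terms in Rule I's example are safe to drop — their share of T(n) shrinks as n grows:

```python
def T(n):
    # T(n) = 100n^3 + 29n^2 + 19n, from Rule I's example
    return 100 * n**3 + 29 * n**2 + 19 * n

# Fraction of T(n) contributed by everything except the leading 100n^3 term
fractions = [(T(n) - 100 * n**3) / T(n) for n in (10, 100, 1000)]
assert fractions[0] > fractions[1] > fractions[2]  # shrinks towards 0
```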

Big Oh Manipulations (contd.,)
Rule II :
The time of execution of a ‘for loop’ is the ‘running time’ of all statements
inside the ‘for loop’ multiplied by number of iterations of the ‘for loop’.

Example:
for( i = 0 to n )
{
x ← x + 1;
y ← y + 1;
x ← x + y
}

The for loop is executed n times.

So, the worst case running time of the algorithm is

T ( n ) = O( 3 * n ) = O ( n )


Big Oh Manipulations (contd.,)
Rule III :
If we have a ‘nested for loop’, in an algorithm, the analysis of that algorithm should
start from the inner loop and move it outwards towards outer loop.
Example:
for( j = 0 to m ) {
for( i = 0 to n ) {
x ← x + 1;
y ← y + 1;
z ← x + y;
}
}
The worst case running time of the inner loop is O( 3*n )

The worst case running time of the outer loop is O( m*3*n )

The total running time = O ( m * n )
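Rules II and III can be sanity-checked by simply counting statement executions; a small Python sketch (not from the slides):

```python
def nested_loop_ops(n, m):
    """Count statement executions in the nested loop of Rule III."""
    ops = 0
    for j in range(m):        # outer loop: m iterations
        for i in range(n):    # inner loop: n iterations per outer iteration
            ops += 3          # the three assignments in the body
    return ops

# Total work is 3*m*n statements; the constant 3 is dropped, giving O(m*n)
assert nested_loop_ops(10, 5) == 3 * 5 * 10
```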


Big Oh Manipulations (contd.,)
Rule IV :
The execution time of an ‘if-else statement’ in an algorithm comprises:
• Execution time for testing the condition
• The maximum of the execution times of the ‘if’ and ‘else’ branches (whichever is larger)
Example:
if( x > y ) {
print( “x is larger than y” );
print( “x is the value to be selected” );
z ← x;
x ← x + 1;
}
else print( “x is smaller than y” );
The execution time of the program is the execution time of testing (x > y) +
the execution time of the ‘if’ branch, as the execution time of the ‘if’ branch is
more than that of the ‘else’ branch

O(constant) = O(1).
For example, O(100) = O(1).

Case study on analysis of algorithms

The following examples will help us to understand the concept of worst case and
average case complexities

Example – 1: Consider the following pseudocode.


To insert a given value, k at a particular index, l in an array, a[1…n]:
1. Begin
2. Copy a[l…n] to a[l+1…n+1] (Assuming space is available)
3. Copy k to a[l]
4. End

BEST CASE: O (1)

WORST CASE: O (n)

AVERAGE CASE: O (n)


The above code inserts a value k at position l in an array a. The basic operation here
is the copy.
Worst Case Analysis: Step 2 does n-1 copies in the worst case. Step 3 does 1 copy. So
the total number of copy operations is n-1+1 = n. Hence the worst case complexity of array
insertion is O(n).
Average Case Analysis: Each insertion position is equally likely, so the probability that
step 2 performs k copies is 1/n, for k = 1, 2, …, n. Hence the average number of copies
that step 2 performs is (1 + 2 + … + n)/n = (n+1)/2. Also, step 3 performs 1 copy. So on
average the array insertion performs ((n+1)/2) + 1 copies. Hence the average case
complexity of array insertion is O(n).
Best case Analysis:
O(1), as only one copy is done with no movements.
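A runnable sketch of the insertion (Python, not from the slides; the exact copy counts may differ by ±1 from the slide's tally depending on which positions are allowed, but the growth is linear either way):

```python
def insert_count(a, l, k):
    """Insert k at 1-based index l of list a; return (new list, copies made).
    Step 2 shifts a[l..n] right by one (n-l+1 copies); step 3 places k (1 copy)."""
    n = len(a)
    copies = (n - l + 1) + 1
    return a[:l - 1] + [k] + a[l - 1:], copies

a = [10, 20, 30, 40]
_, worst = insert_count(a, 1, 5)            # inserting at the front
_, best = insert_count(a, len(a) + 1, 5)    # inserting just past the end
assert worst == len(a) + 1                  # O(n) in the worst case
assert best == 1                            # O(1) in the best case
```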

Case study (Contd…)

Example – 2: Consider the following pseudocode.

To delete the value, k, at a given index, i, in an array, a[1…n]:
1. Begin
2. Copy a[i+1…n] to a[i…n-1]
3. Clear a[n]
4. End


The above code deletes the value k at a given index i in an array a. The basic
operation here is the copy.
Worst Case Analysis: Step 2 does n-1 copies in the worst case. So the total number of
copy operations is n-1. Hence the worst case complexity is O(n).
Average Case Analysis: Each index is equally likely, so the probability that step 2
performs k copies is 1/n, for k = 0, 1, …, n-1. Hence the average number of copies that
step 2 performs is (0 + 1 + … + (n-1))/n = (n-1)/2. So on average the array deletion
performs (n-1)/2 copies. Hence the average case complexity of array deletion is O(n).
Best case Analysis:
O(1), as only the last element is cleared with no movements.
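The average-case figure (n-1)/2 for deletion can be checked by enumerating all n positions; a Python sketch (not from the slides):

```python
def delete_copies(n, i):
    """Copies made when deleting the element at 1-based index i from an
    array of size n: a[i+1..n] is shifted left by one, i.e. n-i copies."""
    return n - i

n = 100
counts = [delete_copies(n, i) for i in range(1, n + 1)]
# Every index equally likely: average = (0 + 1 + ... + (n-1)) / n = (n-1)/2
assert sum(counts) / n == (n - 1) / 2
assert max(counts) == n - 1   # worst case, matching the slide
```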

Summary of Unit-2

• Analyzing Algorithms
– Introduction to Space and Time complexities
– Basic Mathematical principles
– Order of magnitude
– Introduction to Asymptotic notations
• Best case
• Worst case
• Average case


Thank You!


