
2. Algorithm analysis
2.1. Introduction
Algorithm analysis is a field of computer science whose overall goal is to understand the time and space complexity of algorithms. Programmers are interested in the performance of algorithms and in how the cost of an algorithm increases as the size of the problem increases. Cost can include:
• Time required
• Space (memory) required
2.2. Lesson objectives
By the end of this lesson, the learner will be able to:
• Describe time and space complexity
• Describe the cases used in algorithm analysis
• Describe the three notations in complexity analysis
• Estimate running time of an algorithm
2.3. Lesson outline
This lesson is organized as follows:
2.1. Introduction
2.2. Lesson objectives
2.3. Lesson outline
2.4. Efficiency of an algorithm
2.5. Cases used in complexity analysis
2.6. Time and space complexity
2.7. Asymptotic notations
2.8. Growth rates
2.9. Working with big Oh
2.10. Dominance relations
2.11. Importance of algorithm analysis
2.12. Calculating time complexity of a program
2.13. Debugging and profiling
2.14. Types of algorithms
2.15. Revision questions
2.16. Summary
2.17. Suggested reading
2.4. Efficiency of an algorithm
The terms efficiency and complexity are often used in algorithm analysis. The time and space used by an algorithm are the two main measures of its efficiency. Time is measured by counting the number of important operations; space is measured by counting the maximum amount of memory needed by the algorithm. The complexity of an algorithm is therefore a function f(n) which gives the running time and/or storage requirement of the algorithm in terms of the size n of the input data. Often, the storage required by an algorithm is a multiple of the input size n. Generally, complexity is used to refer to the running time of an algorithm.
2.5. Cases used in complexity analysis
Usually, three cases are used to find the complexity of the function f(n):
a) Best case: the minimum value of f(n) over all possible inputs.
b) Worst case: the maximum value of f(n) over all possible inputs.
c) Average case: the value of f(n) between the maximum and minimum over all possible inputs (the expected value of f(n)).
Unless otherwise stated or implied, we are interested in determining the complexity of an algorithm in the worst case.
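For example, the three cases can be seen in a simple linear search. The sketch below is illustrative and not part of the original lesson text:

/* Illustrative sketch: linear search in an unsorted int array.
   Best case: the key is at index 0, so 1 comparison is made.
   Worst case: the key is last or absent, so n comparisons are made.
   Average case: about n/2 comparisons if the key is equally likely
   to be at any position. */
int linear_search(int a[], int n, int key)
{
    int i;
    for (i = 0; i < n; i++)
        if (a[i] == key)   /* one comparison per step */
            return i;      /* best case: stop early   */
    return -1;             /* worst case: scanned all n items */
}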
2.6. Time and space complexity
Given an algorithm, it is possible to find its time complexity in the best, average and worst cases. To do this, programmers measure the complexity of every step in the algorithm and then calculate the overall time complexity. Two measures of efficiency are used: time complexity and space complexity.
Space complexity
This is the amount of memory needed by the program to run to completion.
Time complexity
This is the amount of computer time required by a program to run to completion. The time complexity of an algorithm is given by the number of steps taken by the algorithm to compute the function it was written for. The time taken by a program is the sum of its compile time and its run (execution) time. The compile time does not depend on the instance characteristics, i.e. the number of inputs, number of outputs, magnitude of inputs, magnitude of outputs, etc. The major concern of programmers is the running time of the program.
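As an illustrative sketch of the space-complexity distinction (the reversal task is an assumption, chosen only to contrast the two measures): both functions below reverse an array, but the first needs only a constant amount of extra memory, O(1) space, while the second allocates a new array and therefore needs O(n) extra space.

#include <stdlib.h>

/* Reverse in place: a few scalar variables, so O(1) extra space. */
void reverse_in_place(int a[], int n)
{
    int i, tmp;
    for (i = 0; i < n / 2; i++) {
        tmp = a[i];
        a[i] = a[n - 1 - i];
        a[n - 1 - i] = tmp;
    }
}

/* Reverse into a copy: the new array grows with n, so O(n) extra space. */
int *reverse_copy(const int a[], int n)
{
    int i;
    int *b = malloc(n * sizeof *b);
    if (b == NULL) return NULL;
    for (i = 0; i < n; i++)
        b[i] = a[n - 1 - i];
    return b;                      /* caller frees the copy */
}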
2.7. Asymptotic notations
There are three basic asymptotic (n→∞) notations used to express the running time of an algorithm in terms of a function whose domain is the set of natural numbers.
a) Big Oh notation (O): used to express an upper bound (maximum steps) required to solve a problem.
b) Big Omega notation (Ω): used to express a lower bound (minimum/least steps) required to solve a problem.
c) Theta notation (Θ): used to express both an upper and a lower bound, i.e. the minimum and maximum steps required to solve a problem. It is also called a tight bound.
Asymptotic notation gives the rate of growth, i.e. the performance, of the running time for large input size n (n→∞); it is not a measure of the particular running time for a specific input size.
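As a short worked example (added here for illustration): take f(n) = 3n + 2. Since 3n + 2 ≤ 5n for all n ≥ 1, f(n) = O(n); since 3n + 2 ≥ 3n for all n ≥ 1, f(n) = Ω(n); and because both bounds hold, f(n) = Θ(n).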
2.8. Growth rates
In Big Oh notation, multiplicative constants are discarded. Thus, the functions f(n) = 0.001n² and g(n) = 1000n² are treated identically, even though g(n) is a million times larger than f(n) for all values of n. The bottom line is that even ignoring constant factors, we get an excellent idea of whether a given algorithm is appropriate for a problem of a given size. An algorithm whose running time is f(n) = n³ seconds will beat one whose running time is g(n) = 1,000,000·n² seconds only when n < 1,000,000 (since n³ < 1,000,000·n² exactly when n < 1,000,000). Such enormous differences in constant factors between algorithms occur far less frequently in practice than large problems do.
2.9. Working with the Big Oh
You learned how to do simplifications of algebraic expressions back in high school.
Working with the Big Oh requires dusting off these tools. Most of what you learned there
still holds in working with the Big Oh, but not everything.
Rules for Big O
There are several rules that make it easy to find the order of a function:
i). Any constant multipliers are dropped.
ii). The order of a sum is the maximum of the orders of its summands.
iii). A higher power of n beats a lower power of n.
iv). n beats log(n).
v). 2ⁿ beats any power of n.
Example
In analyzing f(n) = 2n² + 6n + 100, first drop all the constant multipliers to get n² + n + 1 and then drop the lower-order terms to get O(n²).
2.10. Dominance Relations
The Big Oh notation groups functions into a set of classes, such that all the functions in a particular class are equivalent with respect to the Big Oh. The functions f(n) = 0.34n and g(n) = 234,234n belong in the same class, namely those that are of order Θ(n). We say that a faster-growing function dominates a slower-growing one.
a) Constant functions, f(n) = 1
Such functions might measure the cost of adding two numbers, printing out "Hello world", or the growth realized by functions such as f(n) = min(n, 100). There is no dependence on the parameter n.
b) Logarithmic functions, f(n) = log n
Logarithmic time complexity shows up in algorithms such as binary search (a sketch of binary search is given after this list). Such functions grow quite slowly as n gets big.
c) Linear functions, f(n) = n
Such functions measure the cost of looking at each item once in an n-element array, say to
identify the biggest item, the smallest item, or compute the average value.
d) Superlinear functions, f(n) = n lg n
This important class of functions arises in such algorithms as Quicksort and Mergesort.
e) Quadratic functions, f(n) = n²
Such functions measure the cost of looking at most or all pairs of items in an n-element
universe. This arises in algorithms such as insertion sort and selection sort.
f) Cubic functions, f(n) = n³
Such functions enumerate through all triples of items in an n-element array. These also arise
in certain dynamic programming algorithms.
g) Exponential functions, f(n) = cⁿ
For a given constant c > 1, functions like 2ⁿ arise when enumerating all subsets of n items. Exponential algorithms become useless fast.
h) Factorial functions, f(n) = n!
Functions like n! arise when generating all permutations or orderings of n items.
Remark: To understand the intricacies of dominance relations, all you really need to understand is that:
n! ≥ 2ⁿ ≥ n³ ≥ n² ≥ n log n ≥ n ≥ log n ≥ 1
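The sketch below (illustrative; it assumes a sorted array of integers) shows the binary search mentioned in class (b): each iteration halves the remaining range, so at most about log₂ n comparisons are made.

/* Illustrative sketch of binary search on a sorted int array (ascending order).
   Each iteration halves the search range, giving O(log n) time. */
int binary_search(const int a[], int n, int key)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* avoid overflow of (lo + hi) */
        if (a[mid] == key)
            return mid;                 /* found                       */
        else if (a[mid] < key)
            lo = mid + 1;               /* discard the left half       */
        else
            hi = mid - 1;               /* discard the right half      */
    }
    return -1;                          /* not present                 */
}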
2.11. Importance of algorithm analysis
As programmers, once we are given an algorithm we should be able to estimate its Big O so that we can:
i). Predict how well an algorithm will scale up.
ii). Figure out what is wrong with a given program that is too slow.
iii). Select algorithms based on data sizes and Big O.
2.12. Calculating time complexity of a program
The number of machine instructions which a machine executes during a program's running time is called its time complexity. This number depends primarily on the size of the program's input. The time taken by the program is the sum of the compile time and the run time; in time complexity we consider only the run time. The time required by an algorithm is determined by the number of elementary operations. The following basic operations are used to calculate the running time; they are independent of any programming language:
i). Assigning a value to a variable
ii). Calling a function
iii). Performing an arithmetic operation, e.g. x + y - z
iv). Comparing two variables, e.g. (x >= y)
v). Indexing into an array or following a pointer reference
vi). Returning from a function
Rules for calculating running time of a program
Nested loops are multiplied together
Sequential loops are added
Only largest term is kept, the others are dropped
Constants must be dropped
Conditional checks are treated as constants
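To illustrate these rules (a sketch added here, not from the original text), the fragment below has a sequential loop followed by a nested loop: the counts are n and n·n, the terms add to n + n², and keeping only the largest term with constants dropped gives O(n²).

/* Illustrative fragment: applying the rules above.
   First loop: n steps.  Nested loops: n * n steps.
   Total n + n^2, so the order is O(n^2). */
int count_pairs(int n)
{
    int i, j, count = 0;
    for (i = 0; i < n; i++)            /* sequential loop: n steps  */
        count++;
    for (i = 0; i < n; i++)            /* nested loops: n * n steps */
        for (j = 0; j < n; j++)
            count++;
    return count;
}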
2.13. Debugging and profiling
Program testing involves two phases: debugging and profiling (performance measurement). Debugging is the process of executing a program on sample data sets to find out whether wrong results occur and, if so, to correct the errors. Debugging can only point to the presence of errors. Profiling, on the other hand, is the process of executing a correct program on data sets and measuring its time and space. These timing figures are used to confirm previous algorithm analysis and to point out logical errors, and they are useful for judging which algorithm is better.
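A minimal profiling sketch (illustrative; the routine being timed and the input size are assumptions) using the standard C clock() function to measure processor time on a sample data set:

#include <stdio.h>
#include <time.h>

/* Hypothetical routine under test; any function being profiled could go here. */
long long work(long n)
{
    long i;
    long long s = 0;
    for (i = 0; i < n; i++)
        s += i;
    return s;
}

int main(void)
{
    clock_t start = clock();                 /* start timing              */
    long long result = work(10000000L);      /* run on a sample data set  */
    clock_t end = clock();                   /* stop timing               */
    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    printf("result=%lld time=%.3f s\n", result, seconds);
    return 0;
}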

Example 1
1. int sum(int n)
2. {
3.   int i, sum;
4.   sum = 0;                 //add 1 to time count
5.   for(i = 0; i < n; i++)   //add n+1 to time count
6.   {
7.     sum = sum + i*i;       //add n to time count
8.   }
9.   return sum;              //add 1 to time count
10. }

Explanation
The function returns the sum of i² for i = 0 to n−1:
sum = 0² + 1² + 2² + … + (n−1)²
To determine the running time of this function we count the number of statements that are executed. The statement at line 4 executes 1 time, the for loop at line 5 executes n+1 times, line 7 executes n times and line 9 executes 1 time. Therefore:
The total is 1 + (n+1) + n + 1 = 2n + 3
In terms of O-notation this function is O(n).
Example 2
1. void sum(int m, int n, int a[m][n], int b[m][n], int c[m][n])
2. {
3.   int i, j;
4.   for(i = 0; i < m; i++)            //total steps m
5.     for(j = 0; j < n; j++)          //total steps m*n
6.       c[i][j] = a[i][j] + b[i][j];  //total steps m*n
7. }
Explanation
Line 4 executes m times, line 5 executes m×n times, while line 6 executes m×n times.
The total is m + mn + mn = m + 2mn
In terms of O-notation this function is O(mn).
2.14. Types of algorithms
Algorithms can be classified into several categories based on their design technique. Algorithms that use a similar problem-solving approach can be grouped together:
a) Divide-and-conquer algorithms
b) Greedy algorithms
c) Dynamic programming
d) Backtracking algorithms
e) Branch and bound algorithms
f) Brute force algorithms
g) Genetic algorithms
Divide and conquer algorithms
These algorithms use a top-down approach to solve a problem. Such algorithms involve three basic steps:
Step 1: Divide the problem into a set of sub-problems.
Step 2: Conquer, or solve, each sub-problem individually.
Step 3: Combine the solutions of the sub-problems to get the solution of the original problem.
Examples: Merge sort and quick sort algorithms.
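Below is a minimal illustrative sketch of the three steps; finding the maximum of an array (rather than sorting) is an assumption made only to keep the example short.

/* Illustrative divide-and-conquer sketch: maximum of a[lo..hi], lo <= hi.
   Step 1: divide the range in half.
   Step 2: conquer each half recursively.
   Step 3: combine by taking the larger of the two results. */
int range_max(const int a[], int lo, int hi)
{
    if (lo == hi)                            /* base case: one element */
        return a[lo];
    int mid = lo + (hi - lo) / 2;            /* divide                 */
    int left  = range_max(a, lo, mid);       /* conquer left half      */
    int right = range_max(a, mid + 1, hi);   /* conquer right half     */
    return (left > right) ? left : right;    /* combine                */
}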
Greedy algorithms
These algorithms are used to solve optimization problems. Optimization problems require that, given input values, we minimize or maximize an objective function with respect to some constraints. Greedy algorithms do not always guarantee an optimal solution, but they generally produce solutions that are very close in value to the optimal. A greedy algorithm works in phases: at each phase, take the best you can get presently, without regard for future consequences, and hope that by choosing a local optimum at each step you will end up at a global optimum.
Examples: Kruskal’s algorithm, Prim’s algorithm, Dijkstra’s algorithm
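A small illustrative sketch of the greedy strategy (the coin denominations are an assumption, and this is not one of the graph algorithms listed above): making change by always taking the largest coin that still fits chooses a local optimum at each step. For the denominations used here this gives an optimal answer, but for other coin systems the same greedy choice can fail, which is why greedy algorithms do not always guarantee optimality.

/* Illustrative greedy sketch: make change for a non-negative `amount`
   in cents using the assumed denominations {25, 10, 5, 1}.  At each
   step the largest coin that fits is taken, without looking ahead. */
int greedy_change(int amount)
{
    int coins[] = {25, 10, 5, 1};
    int i, used = 0;
    for (i = 0; i < 4; i++) {
        used   += amount / coins[i];    /* take as many of this coin as fit */
        amount %= coins[i];             /* remainder still to be paid       */
    }
    return used;                        /* total coins used                 */
}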
Dynamic programming
These algorithms use a bottom-up approach to solve a problem. They are similar to divide and conquer algorithms in that they break the original problem into sub-problems that are then solved recursively. However, in dynamic programming the results obtained after solving sub-problems are reused (by maintaining a table of results) in the calculation of larger sub-problems. Most importantly, there is no recalculation of the same sub-problem as there is in divide and conquer. Dynamic programming algorithms always give an optimal solution.
Examples: Calculating numbers in the Fibonacci series, the travelling salesman problem.
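As an illustrative sketch of this idea (the limit on n is an assumption made only to keep the table small), the function below computes Fibonacci numbers bottom-up, reusing stored sub-problem results instead of recalculating them.

/* Illustrative dynamic-programming sketch: Fibonacci numbers.
   Sub-problem results are kept in a table and reused, so each value
   is computed exactly once: O(n) time, O(n) space.
   Assumption: 0 <= n <= 90 so the result fits in a long long. */
long long fib(int n)
{
    long long table[91];
    int i;
    table[0] = 0;
    table[1] = 1;
    for (i = 2; i <= n; i++)
        table[i] = table[i - 1] + table[i - 2];   /* reuse stored results */
    return table[n];
}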
Backtracking algorithms

These algorithms solve a problem by trying each possibility until the right one is found. They solve the problem using the following steps:
Step 1: Make a choice.
Step 2: Recur on the remaining problem.
Step 3: If the recursion returns a solution, return it; otherwise backtrack and try another choice.
Examples: Depth first search algorithm
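A minimal illustrative sketch (the subset-sum problem is an assumed example, not taken from the text): each call makes a choice about one item, recurs, and backtracks to the other choice if the first fails.

/* Illustrative backtracking sketch: does any subset of a[0..n-1] sum
   to `target`?  Each item is either included or excluded. */
int subset_sum(const int a[], int n, int target)
{
    if (target == 0) return 1;        /* solution found                 */
    if (n == 0) return 0;             /* no items left: dead end        */
    /* Choice 1: include a[n-1], then recur */
    if (subset_sum(a, n - 1, target - a[n - 1])) return 1;
    /* Choice 2: backtrack and exclude a[n-1], then recur */
    return subset_sum(a, n - 1, target);
}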
Branch and bound algorithms
This is a systematic method for solving optimization problems. The original problem is considered the root problem. A method is used to construct an upper and a lower bound for a given problem. At each node, apply the bounding methods. If the bounds match, that node yields a feasible solution to that particular sub-problem. If the bounds do not match, partition the problem represented by that node and make the two sub-problems into child nodes. Continue, using the best known feasible solution to trim sections of the tree, until all nodes have been solved or trimmed.
Examples: Breadth First Search, Assignment problem.
Brute Force algorithms
These algorithms solve a problem directly from the problem statement and the definitions of the concepts involved. A brute force algorithm tries all possibilities until a satisfactory solution is found. It is useful for solving small instances of a problem.
Examples: Computing cⁿ, computing n!, selection sort, bubble sort, sequential search, etc.
Genetic algorithms
Genetic algorithms are mainly used for optimization problems for which exact algorithms are of very low efficiency. They search for good solutions to a problem from among a large number of possible solutions. The current set of possible solutions is used to generate a new set of possible solutions.
Examples: The knapsack problem, the travelling salesman problem.
2.15. Revision questions
a) Differentiate between the following asymptotic notations:
i). Big Oh (O)
ii). Big Omega (Ω)
iii). Theta (Θ)
b) Define time complexity. Explain how time complexity of an algorithm is calculated.
c) Determine the time complexity of the following function.
for(i = 1; i <= n; i++)
  for(j = 1; j <= n; j = j*2)
  {
    ...
  }

Answer: O(n log n)


d) Determine the worst-case running time to search for an element in an array of size n.
Answer: O(n)
e) Determine the time complexity of the following function.
for(i = 1; i <= n; i = i*2)
{
  sum = 0;
}

Answer: O(log n)
f) Find the O-notation, Ω-notation and Θ-notation for the following functions.
i). f(n) = 5n³ - 6n² + 1
ii). f(n) = 7n² + 6n - 8

2.16. Summary
In this lesson we have learnt that algorithm analysis requires us to estimate time and space complexity. However, our focus was on time complexity. To calculate this complexity we have to consider three cases: best case, average case and worst case. Our interest was in the worst-case scenario. When working with these cases we use three notations: Big Oh, Big Omega and Theta. Again, our attention was on the worst case, the upper bound. We classified functions based on their growth rate. Finally, we discussed how knowledge of Big O can be used to estimate the running time of a program.
2.17. Suggested reading
[1]. Data Structures Using C and C++, 2nd Edition, by Yedidyah Langsam, Moshe J. Augenstein and Aaron M. Tenenbaum. Publisher: Pearson.
[2]. Data Structures and Algorithms in C++, by Michael T. Goodrich, Roberto Tamassia and David Mount. Publisher: Wiley.
[3]. Fundamentals of Data Structures in C++, by Ellis Horowitz, Sartaj Sahni and Dinesh Mehta. Publisher: Galgotia.
[4]. Introduction to Data Structures and Algorithms with C++, by Glenn W. Rowe. Publisher: Prentice Hall.
