
Design and Analysis of Algorithms
Prerequisites
Topics
Evaluation
Why Study Algorithms?
● Important for all other branches of computer science
● Routing protocols in communication networks piggyback on classical
shortest path algorithms.
● Public-key cryptography relies on efficient number-theoretic
algorithms.
● Computer graphics requires the computational primitives supplied by
geometric algorithms.
● Database indices rely on balanced search tree data structures.
● Computational biology uses dynamic programming algorithms to measure genome similarity.
Why Study Algorithms?
Good for the brain, and fun!
Integer Multiplication
● Input: two n-digit numbers x and y
● Output: product of x and y
● Primitive operations: add or multiply
The Grade-School Algorithm

Can we do better?
Karatsuba Multiplication
Which multiplication is better?
Karatsuba Multiplication
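To make the divide-and-conquer idea concrete, here is a minimal Python sketch of Karatsuba multiplication (the function name and the base-10 splitting are illustrative choices, not taken from the slides); it replaces four recursive products with three:

def karatsuba(x, y):
    # Multiply non-negative integers x and y using three recursive
    # products instead of four (Karatsuba's trick).
    if x < 10 or y < 10:                      # base case: a single-digit factor
        return x * y
    m = max(len(str(x)), len(str(y))) // 2    # split point, in decimal digits
    xh, xl = divmod(x, 10 ** m)               # x = xh * 10^m + xl
    yh, yl = divmod(y, 10 ** m)               # y = yh * 10^m + yl
    a = karatsuba(xh, yh)                     # high * high
    b = karatsuba(xl, yl)                     # low * low
    c = karatsuba(xh + xl, yh + yl) - a - b   # cross terms, reusing a and b
    return a * 10 ** (2 * m) + c * 10 ** m + b

print(karatsuba(1234, 5678))                  # 7006652, same as 1234 * 5678

The recurrence T(n) = 3T(n/2) + O(n) gives roughly n^1.59 basic operations (n raised to log₂ 3), compared with n² for the grade-school method.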
Analysis of algorithms
Measuring efficiency of an algorithm
● Time: How long the algorithm takes (running time)
● Space: Memory requirement
Time and space
● Time depends on processing speed
○ Impossible to change for given hardware
● Space is a function of available memory
○ Easier to reconfigure, augment
● Typically, we will focus on time, not space
Measuring running time
● Analysis independent of underlying hardware
● Don’t use actual time
● Measure in terms of “basic operations”
● Typical basic operations?
RAM Model
● Algorithms can be studied in a machine/language independent way.
Because we use the RAM model of computation for all our analysis.
● Each "simple" operation (+, −, =, if) takes 1 step.
● Loops and subroutine calls are not simple operations. They depend
upon the size of the data and the contents of a subroutine.
● Run time of an algorithm is measured by counting number of steps
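As a rough illustration of step counting in the RAM model (the function and the exact step counts are an assumed example, not from the slides):

def sum_array(A):
    total = 0                # 1 step: assignment
    for x in A:              # the loop body runs n times
        total = total + x    # 1 addition + 1 assignment per iteration
    return total             # 1 step

For an array of n elements this is roughly 2n + 2 basic steps, so the running time grows linearly with n; the constant factor in front is exactly what the RAM model lets us ignore.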
Search for K in an unsorted array A
● Running time on input of size n varies across inputs
Best, Average, Worst Case Time Complexity
● Best case complexity of an algorithm is the function defined by the
minimum number of steps taken on input data of n elements
● Average-case complexity of an algorithm is the function defined by the average number of steps taken on input data of n elements
● Worst-case complexity indicates the maximum number of steps
performed on input data of n elements
Worst case complexity
● For each n, worst case input forces algorithm to take the maximum
amount of time
○ If K not in A, search scans all elements
● Upper bound for the overall running time
○ Here the worst case is proportional to n, the size of the array
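A minimal sketch of the search itself (the function name is mine):

def search(A, K):
    # Scan A from left to right; return the index of K, or -1 if absent.
    for i in range(len(A)):
        if A[i] == K:
            return i     # best case: K == A[0], a single comparison
    return -1            # worst case: K not in A, all n elements examined

The best case is one comparison and the worst case is n comparisons, which is exactly the gap the three complexity measures try to capture.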
Average case complexity
● Worst case may be very pessimistic
● Compute average time taken over all inputs
● Difficult to compute
○ Average over what?
○ Are all inputs equally likely?
○ Need probability distribution over inputs
Worst case vs average case
● Worst case can be unrealistic …
● … but average case is hard, if not impossible, to compute

● A good worst case upper bound is useful


● A bad worst case upper bound may be less informative
● Try to “classify” worst case inputs, look for simpler subclasses
Measuring running time
● Input size - Running time depends on input size, e.g., larger arrays
will take longer to sort
● Measure time efficiency as function of input size
● Input size n → Running time t(n)
● Different inputs of size n may each take a different amount of time
● Typically t(n) is worst case estimate
Typical functions t(n)…
1: Sorting
● Sorting an array with n elements
● Naive algorithms: time proportional to n²
● Best algorithms: time proportional to n log n
● How important is this distinction?
○ Suppose typical CPUs process up to 10⁸ operations per second
■ Useful for approximate calculations
1: Sorting
● Telephone directory for mobile phone users in India. India has about 100 crores = 10⁹ phone numbers
● Naïve n² algorithm requires 10¹⁸ operations; at 10⁸ operations per second ⟹ 10¹⁰ seconds ≈ 2,778,000 hours
● ≈ 115,700 days
● ≈ 300 years!
● Smart n log n algorithm takes less than 3 × 10¹⁰ operations
● About 300 seconds, or 5 minutes
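The back-of-envelope numbers above can be reproduced directly (the factor 30 stands in for log₂ 10⁹ ≈ 30; the CPU speed is the assumed 10⁸ operations per second):

n = 10 ** 9                  # phone numbers
ops_per_sec = 10 ** 8        # assumed processing speed
naive = n * n                # ~10^18 operations
smart = 30 * n               # ~n log2(n) operations
print(naive / ops_per_sec)   # 1e10 seconds, i.e. roughly 300 years
print(smart / ops_per_sec)   # 300 seconds, i.e. about 5 minutes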
2: Video game
● Several objects on screen
● Basic step: Given n objects, find closest pair of objects
○ For each pair of objects, compute their distance, and report
minimum distance over all such pairs
● Naive algorithm is again n²
● There is a clever algorithm that takes time n log n
2: Video game
● High resolution monitor has 2500 × 1500 pixels = 375 × 10⁴ points
● Suppose we have 500,000 = 50 × 10⁴ objects
● Naïve algorithm takes 25 × 10¹⁰ steps = 2500 seconds
● 2500 seconds = 42 minutes; this response time is unacceptable!
● Smart n log n algorithm takes a fraction of a second
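For reference, the naive n² strategy looks like this in Python (a sketch only; the n log n divide-and-conquer algorithm is more involved and not shown here):

from itertools import combinations
from math import dist, inf

def closest_pair_naive(points):
    # Compare every pair of points: about n^2 / 2 distance computations.
    best = inf
    for p, q in combinations(points, 2):
        best = min(best, dist(p, q))
    return best

print(closest_pair_naive([(0, 0), (3, 4), (1, 1)]))   # 1.414..., the (0,0)-(1,1) pair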
3: Xerox Shop
● Campus Xerox has several photocopiers
● Tomorrow is the deadline for BTech projects and there is a rush of
reports to be printed
● The number of pages for each job is known
● Each customer has been promised delivery by a deadline
● Campus Xerox offers discount if deadline is not met
● How to sequentially allocate the jobs to photocopiers to maximize
revenue?
3: Xerox Shop
● Brute force
● Try all possible allocations; choose the one that is optimal
● Number of possibilities is exponential!
● Even with 30 jobs, it would take hours to compute an optimal
schedule
3: Xerox Shop
Decompose the problem
● Choose a job to schedule first, and the machine on which it will run,
according to some strategy
● Now, recursively solve the problem for the remaining n-1 jobs
3: Xerox Shop
Greedy approach
● Fix the choice of next job greedily
● Never go back and try another sequence

● How to choose the next job?


○ Shortest processing time?
○ Earliest deadline?
● How to show that this strategy is optimal?
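As one possible concretisation of the greedy idea (the earliest-deadline rule and all names here are illustrative assumptions, not a claim about the best strategy):

def greedy_schedule(jobs, num_machines):
    # jobs: list of (pages, deadline) pairs; machines are assumed identical.
    # Greedy rule tried here: take jobs by earliest deadline and always
    # give the next job to the machine that becomes free first.
    finish = [0] * num_machines                    # current finish time per machine
    schedule = []
    for pages, deadline in sorted(jobs, key=lambda job: job[1]):
        m = finish.index(min(finish))              # machine that frees up first
        schedule.append((pages, deadline, m))
        finish[m] += pages
    return schedule

Whether such a rule is optimal, and under which assumptions about machine speeds and revenues, is exactly the question the slides raise.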
Variations
● Some photocopiers are old and slow, some are new and fast
○ Time for a job depends on choice of machine
● Cost of ink and paper varies across machines
○ Net revenue for a job depends on choice of machine
4: Document similarity
● Given two documents, how similar are they? (applications)
○ Plagiarism detection
○ Checking changes between versions of code
○ Answering web search queries more effectively
● Document similarity
○ Edit Distance: Minimum number of edit operations to transform
one document to another
○ Jaccard Similarity (e.g., Locality-sensitive hashing)
○ Word Embeddings (e.g., Word2Vec)
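Of the three measures, edit distance is the one computed by a classic dynamic programming algorithm; a compact Python sketch (function name mine):

def edit_distance(a, b):
    # d[i][j] = minimum edits (insert, delete, substitute) turning a[:i] into b[:j]
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i                          # delete all of a[:i]
    for j in range(len(b) + 1):
        d[0][j] = j                          # insert all of b[:j]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            same = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # delete a[i-1]
                          d[i][j - 1] + 1,          # insert b[j-1]
                          d[i - 1][j - 1] + same)   # match or substitute
    return d[len(a)][len(b)]

print(edit_distance("kitten", "sitting"))    # 3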
Design and Analysis of Algorithms
Sorting Motivation
Sorting
Strategy 1
Selection Sort
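A standard in-place Python version of selection sort (a sketch of the usual formulation; the slides' own pseudocode is not reproduced here):

def selection_sort(A):
    # In pass i, find the minimum of A[i:] and swap it into position i.
    n = len(A)
    for i in range(n - 1):
        m = i
        for j in range(i + 1, n):
            if A[j] < A[m]:
                m = j
        A[i], A[m] = A[m], A[i]
    return A

print(selection_sort([5, 2, 8, 1, 9]))   # [1, 2, 5, 8, 9]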
Analysis of Selection Sort
Recursive formulation
Selection Sort, recursive
Alternative calculations
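For the recursive formulation, one standard way to do the calculation: the number of comparisons satisfies T(n) = T(n-1) + (n-1) with T(1) = 0, which telescopes to (n-1) + (n-2) + … + 1 = n(n-1)/2, so selection sort takes on the order of n² steps on every input.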
Sorting
Can we do better?
Strategy 2
Insertion Sort
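A standard in-place Python version of insertion sort (again a sketch of the usual formulation):

def insertion_sort(A):
    # Grow a sorted prefix A[:i]; insert A[i] by shifting larger elements right.
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
    return A

print(insertion_sort([5, 2, 8, 1, 9]))   # [1, 2, 5, 8, 9]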
Analysis of Insertion Sort
Recursive formulation
Insertion Sort, recursive
Recurrence
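One way to read the recurrence: in the worst case (a reverse-sorted array) inserting the nth element compares it against the whole sorted prefix, giving T(n) = T(n-1) + (n-1) = O(n²); in the best case (an already sorted array) each insertion needs only one comparison, giving T(n) = T(n-1) + 1 = O(n), which matches the table below.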
Sorting algorithms
Algorithm        Time complexity: Best   Time complexity: Worst   Space complexity: Worst
Insertion sort   O(n)                    O(n²)                    O(1)
Selection sort   O(n²)                   O(n²)                    O(1)

Lower bound for Sorting
● A lower bound for a problem is the worst-case running time of the best
possible algorithm for that problem.
● Can we say that it is impossible to sort faster than Ω(n lg n) using a comparison-based algorithm?
Sorting
Assume the elements to be sorted are n distinct numbers
Sorting
There must be n! leaves (one for
each of the n! permutations of n
elements)
Sorting
– A tree of height h has at most 2ʰ leaves
Sorting

Can we say that it is impossible to sort faster than Ω(n lg n)?
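Combining the two decision-tree facts: any comparison-based sort must have at least n! leaves, and a binary tree of height h has at most 2ʰ leaves, so 2ʰ ≥ n!, i.e. h ≥ lg(n!). Since lg(n!) ≥ (n/2) lg(n/2), every comparison sort performs Ω(n lg n) comparisons in the worst case, so no comparison-based algorithm can beat the n log n sorts asymptotically.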
Sorting algorithms
Algorithm        Time complexity: Best   Time complexity: Worst   Space complexity: Worst
Insertion sort   O(n)                    O(n²)                    O(1)
Selection sort   O(n²)                   O(n²)                    O(1)
Bubble sort      O(n)                    O(n²)                    O(1)
Merge sort       O(n log n)              O(n log n)               O(n)
Quick sort       O(n log n)              O(n²)                    O(n)
Heap sort        O(n log n)              O(n log n)               O(1)
