Divide and Conquer: Analysis of Algorithms


Analysis of Algorithms

Chapter 2
Divide and Conquer
• The General Method
• Binary Search
• Maximum and Minimum
• Merge Sort
• Quick Sort
• Selection

There are many ways to design algorithms. Insertion sort uses an incremental approach: having sorted
the subarray A[1 . . j−1], we insert the single element A[j] into its proper place, yielding the sorted subarray
A[1 . . j]. In this section, we examine an alternative design approach, known as "divide and conquer." We shall
use divide and conquer to design a sorting algorithm whose worst-case running time is much less than that of
insertion sort. One advantage of divide-and-conquer algorithms is that their running times can often be
determined easily using recurrence relations.
The Divide and Conquer approach:

Many useful algorithms are recursive in structure: to solve a given problem, they call themselves
recursively one or more times to deal with closely related subproblems. These algorithms typically follow a
divide-and-conquer approach: they break the problem into several subproblems that are similar to the original
problem but smaller in size, solve the subproblems recursively, and then combine these solutions to create a
solution to the original problem.

The divide-and-conquer paradigm involves three steps at each level of the recursion:
• Divide the problem into a number of subproblems.
• Conquer the subproblems by solving them recursively. If the subproblem sizes are small enough,
however, just solve the subproblems in a straightforward manner.
• Combine the solutions to the subproblems into the solution for the original problem.

To be more precise, suppose we consider the divide-and-conquer strategy when it splits the input into
two subproblems of the same kind as the original problem. The algorithm DAndC is initially invoked as
DAndC(P), where P is the problem to be solved.

1 Prepared by: Mujeeb Rahman


Small(P) is a Boolean-valued function that determines whether the input size is small enough that the
answer can be computed without splitting. If this is so, the function S is invoked. Otherwise the problem P is
divided into smaller subproblems. These subproblems P1, P2, …, Pk are solved by recursive applications of
DAndC. Combine is a function that determines the solution to P using the solutions to the k subproblems. If the
size of P is n and the sizes of the k subproblems are n1, n2, …, nk respectively, then the computing time of
DAndC is described by the recurrence relation

    T(n) = g(n)                                if n is small
    T(n) = T(n1) + T(n2) + … + T(nk) + f(n)    otherwise

where T(n) is the time for DAndC on any input of size n and g(n) is the time to compute the answer directly
for small inputs. The function f(n) is the time for dividing P and combining the solutions to the subproblems.
For divide-and-conquer-based algorithms that produce subproblems of the same type as the original problem, it
is very natural to first describe such algorithms using recursion.
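The control abstraction itself is not reproduced above; as a rough Python sketch (parameter names are hypothetical, chosen to mirror Small, S, and Combine from the text), the recursive structure of DAndC might look like:

```python
def d_and_c(P, small, solve_small, divide, combine):
    """Generic divide-and-conquer control abstraction, a sketch of DAndC(P)."""
    if small(P):                     # Small(P): answer computable directly
        return solve_small(P)        # S(P)
    subproblems = divide(P)          # split P into P1, ..., Pk
    solutions = [d_and_c(Pi, small, solve_small, divide, combine)
                 for Pi in subproblems]
    return combine(solutions)        # Combine the k sub-solutions

# Example instance: summing a list by splitting it in half.
total = d_and_c(
    [3, 1, 4, 1, 5, 9],
    small=lambda P: len(P) <= 1,
    solve_small=lambda P: P[0] if P else 0,
    divide=lambda P: (P[:len(P) // 2], P[len(P) // 2:]),
    combine=sum,
)
print(total)  # 23
```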
The complexity of many divide-and-conquer algorithms is given by recurrences of the form

    T(n) = T(1)              n = 1
    T(n) = aT(n/b) + f(n)    n > 1

where a and b are known constants. We assume that T(1) is known and that n is a power of b (i.e., n = b^k).
One of the methods for solving such a recurrence relation is the substitution method. This method repeatedly
substitutes for each occurrence of the function T on the right-hand side until all such occurrences disappear.
Example: Consider the case in which a = 2 and b = 2. Let T(1) = 2 and f(n) = n. We have
T(n) = 2T(n/2) + n
     = 2[2T(n/4) + n/2] + n
     = 4T(n/4) + 2n
     = 4[2T(n/8) + n/4] + 2n
     = 8T(n/8) + 3n
     .
     .
In general, T(n) = 2^i T(n/2^i) + in for any i with 1 ≤ i ≤ log2 n. In particular, T(n) = 2^(log2 n) T(n/2^(log2 n)) + n log2 n,
corresponding to the choice i = log2 n. Thus, T(n) = nT(1) + n log2 n = n log2 n + 2n.
Thus the time complexity is O(n lg n).
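The unrolling can be checked numerically; a small Python sketch of the recurrence, assuming T(1) = 2 and f(n) = n as in the example:

```python
import math

def T(n):
    # Recurrence T(n) = 2T(n/2) + n with T(1) = 2 (a = b = 2, f(n) = n)
    if n == 1:
        return 2
    return 2 * T(n // 2) + n

# The closed form n*log2(n) + 2n holds exactly for powers of two.
for k in range(1, 11):
    n = 2 ** k
    assert T(n) == n * math.log2(n) + 2 * n

print(T(8))  # 8*3 + 16 = 40
```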

Binary Search
Let ai, 1 ≤ i ≤ n, be a list of elements that are sorted in non-decreasing order. Consider the problem of
determining whether a given element x is present in the list. If x is present, we are to determine a value j such
that aj = x. If x is not in the list, then j is to be set to zero. Let P = (n, ai, …, al, x) denote an arbitrary instance of
this search problem (n is the number of elements in the list, ai, …, al is the list of elements, and x is the element
searched for).
Divide and conquer can be used to solve this problem. Let Small(P) be true if n = 1. In this case, S(P)
takes the value i if x = ai; otherwise it takes the value 0. Then g(1) = Θ(1). If P has more than one element,
it can be divided (or reduced) into a new subproblem as follows. Pick an index q (in the range [i, l]) and
compare x with aq. There are three possibilities: (1) x = aq: In this case the problem P is immediately solved.
(2) x < aq: In this case x has to be searched for only in the sublist ai, ai+1, …, aq-1. Therefore, P reduces to
(q − i, ai, …, aq-1, x). (3) x > aq: In this case the sublist to be searched is aq+1, …, al. P reduces to (l − q, aq+1, …, al, x).
In this scheme, any given problem P gets divided (reduced) into one new subproblem. This division
takes only Θ(1) time. After a comparison with aq, the instance remaining to be solved (if any) can be solved by
using this divide-and-conquer scheme again. If q is always chosen such that aq is the middle element (that is,
q = floor((i + l)/2)), then the resulting search algorithm is known as binary search. Note that the answer to the new
subproblem is also the answer to the original problem P; there is no need for any combining. The algorithm
BinSrch takes four inputs a[ ], i, l and x. It is initially invoked as BinSrch(a, 1, n, x).
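The BinSrch pseudocode is not reproduced here; a minimal Python sketch of the method just described (1-based indices to match the text, returning 0 when x is absent) might be:

```python
def bin_srch(a, i, l, x):
    """Recursive binary search on a[i..l] (1-based, inclusive), as in BinSrch.
    Returns j with a[j] == x, or 0 if x is not in the list."""
    if i > l:
        return 0
    if i == l:                        # Small(P): one element
        return i if a[i] == x else 0
    q = (i + l) // 2                  # middle element, q = floor((i + l)/2)
    if x == a[q]:
        return q                      # (1) problem solved immediately
    elif x < a[q]:
        return bin_srch(a, i, q - 1, x)   # (2) search left sublist
    else:
        return bin_srch(a, q + 1, l, x)   # (3) search right sublist

a = [None, 10, 20, 30, 40, 50, 60, 70]    # a[1..7]; index 0 unused (1-based)
print(bin_srch(a, 1, 7, 40))  # 4
print(bin_srch(a, 1, 7, 35))  # 0
```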

Example: (figure: four passes of binary search on a sample array)

Best Case: The best case occurs when we are searching for the middle element itself. In this case, the total
number of comparisons required is 1. Therefore, the best-case time complexity is O(1).
Worst Case: Let
– T(n) be the cost involved to search n elements
– T(n/2) be the cost involved to search either the left part or the right part
The recurrence relation is given by

    T(n) = a              if n = 1
    T(n) = T(n/2) + b     otherwise

T(n) = T(n/2) + b
     = T(n/2^2) + 2b
     = T(n/2^3) + 3b
     .
     .
     = T(n/2^k) + kb      where n/2^k = 1, i.e., k = lg n
     = a + b · lg n
     = lg n               (neglecting the constants a and b)
     = O(lg n)

Finding Maximum and Minimum


Let us consider another simple problem that can be solved by the divide-and-conquer technique. The
problem is to find the maximum and minimum items in a set of n elements. A divide-and-conquer algorithm for
this problem would proceed as follows. Let P = (n, a[i], …, a[j]) denote an arbitrary instance of the problem. Here n
is the number of elements in the list a[i], …, a[j], and we are interested in finding the maximum and minimum of
this list. Let Small(P) be true when n ≤ 2. If n = 1, the maximum and minimum are both a[i]. If n = 2, the
problem can be solved by making one comparison.



If the list has more than two elements, P has to be divided into smaller instances. For example, we might
divide P into the two instances P1 = (floor(n/2), a[1], …, a[floor(n/2)]) and
P2 = (n − floor(n/2), a[floor(n/2) + 1], …, a[n]). After having divided P into two smaller subproblems, we can solve
them by recursively invoking the same divide-and-conquer algorithm.
How can we combine the solutions for P1 and P2 to obtain a solution for P? If MAX(P) and MIN(P) are
the maximum and minimum of the elements in P, then MAX(P) is the larger of MAX(P1) and MAX(P2). Also,
MIN(P) is the smaller of MIN(P1)and MIN(P2).
MaxMin is a recursive algorithm that finds the maximum and minimum of the set of elements {a(i), a(i+1), …, a(j)}.
Sets of size one (i = j) and size two (i = j − 1) are handled separately. For sets containing
more than two elements, the midpoint is determined (just as in binary search) and two new subproblems are
generated. When the maxima and minima of these subproblems have been determined, the two maxima are
compared and the two minima are compared to obtain the solution for the entire set.
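The MaxMin pseudocode is not reproduced above; a Python sketch of the scheme just described, with a counter added to track element comparisons (the sample array here is hypothetical, not the textbook's nine-element instance), might look like:

```python
comparisons = 0

def max_min(a, i, j):
    """Return (max, min) of a[i..j] (1-based), mirroring the recursive MaxMin.
    Sizes one (i == j) and two (i == j - 1) are the base cases."""
    global comparisons
    if i == j:                          # one element: no comparison needed
        return a[i], a[i]
    if i == j - 1:                      # two elements: one comparison
        comparisons += 1
        return (a[i], a[j]) if a[i] > a[j] else (a[j], a[i])
    mid = (i + j) // 2                  # split at the midpoint, as in binary search
    max1, min1 = max_min(a, i, mid)
    max2, min2 = max_min(a, mid + 1, j)
    comparisons += 2                    # compare the two maxima and the two minima
    return max(max1, max2), min(min1, min2)

a = [None, 7, 3, 12, -4, 9, 25, 1, 18, 6]    # a[1..9]; index 0 unused
print(max_min(a, 1, 9))  # (25, -4)
print(comparisons)       # 12, matching ceil(3n/2) - 2 for n = 9
```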
The procedure is initially invoked by the statement MaxMin(1, n, x, y). Suppose we simulate MaxMin on
the following nine elements.

A good way of keeping track of recursive calls is to build a tree by adding a node each time a new call is made.
For this algorithm each node has four items of information: i,j, max, and min. On the array a[ ] above, the tree of
Figure is produced.

Examining the figure, we see that the root node contains 1 and 9 as the values of i and j, corresponding to the
initial call to MaxMin. This execution produces two new calls to MaxMin, in which i and j have the values 1, 5 and
6, 9 respectively, thus splitting the set into two subsets of approximately the same size. From the tree we can
immediately see that the maximum depth of recursion is four (including the first call). The circled numbers in
the upper left corner of each node represent the orders in which max and min are assigned values.

Time Complexity:
Now what is the number of element comparisons needed by MaxMin? If T(n) represents this number,
then the resulting recurrence relation is

    T(n) = T(floor(n/2)) + T(ceil(n/2)) + 2    n > 2
    T(n) = 1                                   n = 2
    T(n) = 0                                   n = 1

When n is a power of two, n = 2^k for some positive integer k, then

T(n) = 2T(n/2) + 2
     = 2(2T(n/4) + 2) + 2
     = 4T(n/4) + 4 + 2
     = 2^2 T(n/2^2) + 2^2 + 2
     = 2^(k-1) T(2) + 2^(k-1) + … + 2^2 + 2
     = 2^(k-1) + Σ(i=1 to k-1) 2^i
     = n/2 + n − 2
     = 3n/2 − 2
     = O(n)
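The closed form 3n/2 − 2 can be sanity-checked by unrolling the comparison-count recurrence numerically for powers of two:

```python
def T(n):
    # Comparison-count recurrence for MaxMin: T(2) = 1, T(n) = 2T(n/2) + 2
    if n == 2:
        return 1
    return 2 * T(n // 2) + 2

# For every power of two the count equals the closed form 3n/2 - 2.
for k in range(1, 12):
    n = 2 ** k
    assert T(n) == 3 * n // 2 - 2

print(T(16))  # 22
```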
Merge Sort
The key operation of the merge sort algorithm is the merging of two sorted sequences in the "combine"
step. To perform the merging, we use an auxiliary procedure MERGE(A, p, q, r), where A is an array and p, q,
and r are indices numbering elements of the array such that p ≤ q < r. The procedure assumes that the subarrays
A[p . . q] and A[q + 1 . . r] are in sorted order. It merges them to form a single sorted subarray that replaces the
current subarray A[p . . r].
Our MERGE procedure takes time Θ(n), where n = r − p + 1 is the number of elements being merged.
MERGE(A, p, q, r)
1  n1 ← q − p + 1
2  n2 ← r − q
3  create arrays L[1 . . n1 + 1] and R[1 . . n2 + 1]
4  for i ← 1 to n1
5      do L[i] ← A[p + i − 1]
6  for j ← 1 to n2
7      do R[j] ← A[q + j]
8  L[n1 + 1] ← ∞
9  R[n2 + 1] ← ∞
10 i ← 1
11 j ← 1
12 for k ← p to r
13     do if L[i] ≤ R[j]
14         then A[k] ← L[i]
15              i ← i + 1
16         else A[k] ← R[j]
17              j ← j + 1

In detail, the MERGE procedure works as follows. Line 1 computes the length n1 of the subarray A[p . . q], and
line 2 computes the length n2 of the subarray A[q + 1 . . r]. We create arrays L and R ("left" and "right"), of
lengths n1 + 1 and n2 + 1, respectively, in line 3. The for loop of lines 4–5 copies the subarray A[p . . q] into
L[1 . . n1], and the for loop of lines 6–7 copies the subarray A[q + 1 . . r] into R[1 . . n2]. Lines 8–9 put the
sentinels at the ends of the arrays L and R. Lines 10–17 perform the r − p + 1 basic steps by maintaining the
following invariant: at the start of each iteration of the for loop of lines 12–17, the subarray A[p . . k − 1]
contains the k − p smallest elements of L[1 . . n1 + 1] and R[1 . . n2 + 1], in sorted order. Moreover, L[i] and
R[j] are the smallest elements of their arrays that have not been copied back into A.
We can now use the MERGE procedure as a subroutine in the merge sort algorithm. The procedure MERGE-
SORT(A, p, r) sorts the elements in the subarray A[p . . r]. If p ≥ r, the subarray has at most one element and is
therefore already sorted. Otherwise, the divide step simply computes an index q that partitions A[p . . r] into two
subarrays: A[p . . q], containing ceil(n/2) elements, and A[q + 1 . . r], containing floor(n/2) elements.

MERGE-SORT(A, p, r)
1 if p < r
2    then q ← floor((p + r)/2)
3         MERGE-SORT(A, p, q)
4         MERGE-SORT(A, q + 1, r)
5         MERGE(A, p, q, r)

To sort the entire sequence A = <A[1], A[2], . . . , A[n]>, we make the initial call
MERGE-SORT(A, 1, length[A]), where once again length[A] = n.
Merge: Merging means combining two sorted lists
into one sorted list. For this, the elements from both
the sorted lists are compared. The smaller of both the
elements is then stored in the third array. The sorting
is complete when all the elements from both the lists
are placed in the third list.
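A runnable Python sketch of MERGE and MERGE-SORT (0-based inclusive indices, with bounds checks in place of the ∞ sentinels used in the pseudocode):

```python
def merge(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] (0-based, inclusive) in place."""
    L = A[p:q + 1]          # copy of the left run
    R = A[q + 1:r + 1]      # copy of the right run
    i = j = 0
    for k in range(p, r + 1):
        # Take the smaller head element; drain whichever run remains.
        if j >= len(R) or (i < len(L) and L[i] <= R[j]):
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1

def merge_sort(A, p, r):
    if p < r:
        q = (p + r) // 2        # divide at the midpoint
        merge_sort(A, p, q)     # conquer left half
        merge_sort(A, q + 1, r) # conquer right half
        merge(A, p, q, r)       # combine

A = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(A, 0, len(A) - 1)
print(A)  # [1, 2, 2, 3, 4, 5, 6, 7]
```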

Analysis of Merge Sort


Divide: The divide step just computes the middle of the subarray, which takes constant time. Thus, D(n) = Θ(1).
Conquer: We recursively solve two subproblems, each of size n/2, which contributes 2T (n/2) to the running
time.
Combine: We have already noted that the MERGE procedure on an n-element subarray takes time Θ(n), so C(n)
= Θ(n).

The worst-case running time T(n) of merge sort therefore satisfies the recurrence

    T(n) = Θ(1)            if n = 1
    T(n) = 2T(n/2) + n     if n > 1

T(n) = 2T(n/2) + n
     = 2[2T(n/4) + n/2] + n
     = 4T(n/4) + 2n
     = 4[2T(n/8) + n/4] + 2n
     = 2^3 T(n/2^3) + 3n
     .
     .
In general, T(n) = 2^i T(n/2^i) + in for any i with 1 ≤ i ≤ log2 n. In particular, T(n) = 2^(log2 n) T(n/2^(log2 n)) + n log2 n,
corresponding to the choice i = log2 n. Thus T(n) = nT(1) + n log2 n; taking T(1) = 2 as before, T(n) = n log2 n + 2n.
Thus the time complexity is O(n lg n).



Quick Sort
Quicksort is one of the most powerful sorting algorithms. Its worst-case running time is Θ(n²) on an
input array of n numbers. In spite of this slow worst-case running time, quicksort is often the best practical
choice for sorting because it is remarkably efficient on the average: its expected running time is Θ(n lg n), and
the constant factors hidden in the Θ(n lg n) notation are quite small.
Quicksort, like merge sort, works on the divide-and-conquer design principle. Here is the three-step divide-and-
conquer process for sorting a typical subarray A[p . . r].
Divide: Partition (rearrange) the array A[p . . r] into two (possibly empty) subarrays A[p . . q − 1] and A[q + 1 . . r]
such that each element of A[p . . q − 1] is less than or equal to A[q] (the pivot), which is, in turn, less than or
equal to each element of A[q + 1 . . r]. Compute the index q as part of this partitioning procedure.
Conquer: Sort the two subarrays A[p . . q−1] and A[q +1 . . r] by recursive calls to quicksort.
Combine: Since the subarrays are sorted in place, no work is needed to combine them: the entire array A[p . . r]
is now sorted.
• Partition
  Choose a pivot.
  Find the position for the pivot so that
    all elements to the left are less;
    all elements to the right are greater.

    [ < Pivot | Pivot | > Pivot ]

• Conquer
  Apply the same algorithm to each half.
The following procedure implements quicksort.


QUICKSORT(A, p, r)
1 if p < r
2    then q ← PARTITION(A, p, r)
3         QUICKSORT(A, p, q − 1)
4         QUICKSORT(A, q + 1, r)

To sort an entire array A, the initial call is QUICKSORT(A, 1, length[A]).

Partitioning the array: The key to the algorithm is the PARTITION procedure, which rearranges the subarray
A[p . . r] in place.

PARTITION(A, p, r)
1 x ← A[r]
2 i ← p − 1
3 for j ← p to r − 1
4    do if A[j] ≤ x
5        then i ← i + 1
6             exchange A[i] ↔ A[j]
7 exchange A[i + 1] ↔ A[r]
8 return i + 1
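A runnable Python sketch of PARTITION and QUICKSORT as given above (0-based inclusive indices; the sample array mirrors the values mentioned in the figure description below):

```python
def partition(A, p, r):
    """Lomuto-style PARTITION: pivot x = A[r]; returns the pivot's final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:               # A[j] belongs to the "no greater than x" side
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1] # place the pivot between the two partitions
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)      # divide
        quicksort(A, p, q - 1)      # conquer left of the pivot
        quicksort(A, q + 1, r)      # conquer right of the pivot

A = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(A, 0, len(A) - 1)
print(A)  # [1, 2, 3, 4, 5, 6, 7, 8]
```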



Figure shows the operation of PARTITION on an 8-element array. PARTITION always selects an element x =
A[r] as a pivot element around which to partition the subarray A[p . . r].
The operation of PARTITION on a sample array. Lightly shaded
array elements are all in the first partition with values no greater
than x. Heavily shaded elements are in the second partition with
values greater than x. The unshaded elements have not yet been
put in one of the first two
partitions, and the final white element is the pivot.
(a) The initial array and variable settings. None of the elements
have been placed in either of the first two partitions.
(b) The value 2 is "swapped with itself" and put in the partition
of smaller values.
(c)–(d) The values 8 and 7 are added to the partition of larger
values.
(e) The values 1 and 8 are swapped, and the smaller partition
grows.
(f) The values 3 and 7 are swapped, and the smaller partition
grows.
(g)–(h) The larger partition grows to include 5 and 6 and the loop
terminates.
(i) In lines 7–8, the pivot element is swapped so that it lies
between the two partitions.

Analysis of Quick Sort – Worst Case


The worst case occurs when the pivot always ends up at the first or last position, that is, when partitioning
splits the array as unequally as possible.

Let T(n) – cost involved to sort n elements


T(n-1) – cost involved to sort left or right sub array
c.n – cost involved for partitioning, c is constant.

The recurrence relation is given by


T(n) = T(n − 1) + c·n
     = T(n − 2) + c·(n − 1) + c·n
     = T(n − 3) + c·(n − 2) + c·(n − 1) + c·n
     .
     .
     = T(n − n) + c·1 + c·2 + c·3 + … + c·(n − 2) + c·(n − 1) + c·n
     = T(0) + c[1 + 2 + 3 + … + (n − 2) + (n − 1) + n]
     = 0 + c · n(n + 1)/2
     = (c/2)(n² + n)
T(n) ≈ n²        (since n² dominates n)
T(n) = O(n²)



Analysis of Quick Sort – Best Case
The best case is clearly when the pivot always partitions the array equally.
Let T(n) – cost involved to sort n elements
T(n/2) – cost involved to sort left or right sub array
c.n – cost involved for partitioning, c is constant.

The recurrence relation is given by

    T(n) = a               if n = 1
    T(n) = 2T(n/2) + cn    otherwise

Taking c = 1 for simplicity,
T(n) = 2T(n/2) + n
     = 2[2T(n/4) + n/2] + n
     = 4T(n/4) + 2n
     = 4[2T(n/8) + n/4] + 2n
     = 2^3 T(n/2^3) + 3n
     .
     .
In general, T(n) = 2^i T(n/2^i) + in for any i with 1 ≤ i ≤ log2 n. In particular, T(n) = 2^(log2 n) T(n/2^(log2 n)) + n log2 n,
corresponding to the choice i = log2 n. Thus T(n) = nT(1) + n log2 n = n log2 n + 2n.
Thus the time complexity is O(n lg n).

Selection
The Partition operation of quicksort can also be used to obtain an efficient solution to the selection
problem. In this problem, we are given n elements a[1 : n] and are required to determine the kth-smallest element.
If the partitioning element (the pivot) x is positioned at a[q], then (q − 1) elements are less than or equal to
a[q] and (n − q) elements are greater than or equal to a[q]. Hence if k < q, then the kth-smallest element is in
a[1 : q − 1]; if k = q, then a[q] is the kth-smallest element; and if k > q, then the kth-smallest element is the
(k − q)th-smallest element in a[q + 1 : n].

The following procedure implements selection.


SELECT(A, p, r, k)
1 if p < r
2    then q ← PARTITION(A, p, r)
3         if k = q then return
4         else if k < q
5              then SELECT(A, p, q − 1, k)
6         else SELECT(A, q + 1, r, k)

Partitioning the array: The key to the algorithm is the PARTITION procedure, which rearranges the subarray
A[p . . r] in place.

PARTITION(A, p, r)
1 x ← A[r]
2 i ← p − 1
3 for j ← p to r − 1
4    do if A[j] ≤ x
5        then i ← i + 1
6             exchange A[i] ↔ A[j]
7 exchange A[i + 1] ↔ A[r]
8 return i + 1

To find the kth-smallest element in an array A, the initial call is SELECT(A, 1, length[A], k); on return, a[k]
holds the kth-smallest element.



Example: Find the 5th-smallest element of the given array A.

index: 1  2  3  4  5  6  7  8
A:     20 80 70 10 30 50 60 40

The initial call is SELECT(A, 1, 8, 5). After the call Partition(A, 1, 8), the array becomes

20 10 30 40 70 50 60 80

Now q = 4. Since k = 5 > q, SELECT calls Partition(A, 5, 8) and the array becomes

20 10 30 40 70 50 60 80

Now q = 8. Since k = 5 < q, SELECT calls Partition(A, 5, 7) and the array becomes

20 10 30 40 50 60 70 80

Now q = 6. Since k = 5 < q, SELECT(A, 5, 5, 5) is called, and execution stops since p is not less than r. The
5th element of the final array, a[5] = 50, is the 5th-smallest element of the array.

The worst-case and best-case time complexities of the above algorithm are O(n²) and O(n),
respectively.
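A runnable Python sketch of SELECT on this example (index 0 is left unused so the array indices match the 1-based walkthrough above):

```python
def partition(A, p, r):
    """Lomuto PARTITION on A[p..r]; returns the pivot's final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def select(A, p, r, k):
    """Place the kth-smallest element at index k, recursing into one side only."""
    if p < r:
        q = partition(A, p, r)
        if k < q:
            select(A, p, q - 1, k)
        elif k > q:
            select(A, q + 1, r, k)
        # if k == q, a[q] is already the kth-smallest element

A = [None, 20, 80, 70, 10, 30, 50, 60, 40]   # 1-based a[1..8]; index 0 unused
select(A, 1, 8, 5)
print(A[5])  # 50, the 5th-smallest element
```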

***

