Lecture 5-Merge Sort

MERGE SORT
Introduction
⚫ Many useful algorithms are recursive in structure: to
solve a given problem, they call themselves recursively
one or more times to deal with closely related
subproblems.
⚫ These algorithms typically follow a
divide-and-conquer approach: they break the
problem into several subproblems that are similar to
the original problem but smaller in size, solve the
subproblems recursively, and then combine these
solutions to create a solution to the original problem.
Divide and conquer approach
Mergesort
⚫ Mergesort (divide-and-conquer)
⚫ Divide array into two halves.
⚫ Recursively sort each half.
⚫ Merge two halves to make sorted whole.
A L G O R I T H M S      input
A L G O R | I T H M S    divide
A G L O R | H I M S T    sort
A G H I L M O R S T      merge
Merging
⚫ Merge.
⚫ Keep track of smallest element in each sorted half.
⚫ Insert smallest of two elements into auxiliary array.
⚫ Repeat until done.
Sorted halves: A G L O R and H I M S T
Step  Smallest in each half     Auxiliary array
1     A, H                      A
2     G, H                      A G
3     L, H                      A G H
4     L, I                      A G H I
5     L, M                      A G H I L
6     O, M                      A G H I L M
7     O, S                      A G H I L M O
8     R, S                      A G H I L M O R
9     first half exhausted, S   A G H I L M O R S
10    first half exhausted, T   A G H I L M O R S T
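The merging walk-through above can be reproduced with a short sketch. The function name merge_trace and the snapshot list are illustrative assumptions, not part of the lecture; the code simply records the auxiliary array after each element is inserted.

```python
def merge_trace(first, second):
    """Merge two sorted lists; return the auxiliary array after each step."""
    aux = []
    snapshots = []
    i = j = 0
    while i < len(first) or j < len(second):
        # Take from the first half when the second half is exhausted, or
        # when the first half's smallest remaining element is no larger.
        if j == len(second) or (i < len(first) and first[i] <= second[j]):
            aux.append(first[i])
            i += 1
        else:
            aux.append(second[j])
            j += 1
        snapshots.append(" ".join(aux))
    return snapshots


for step in merge_trace(list("AGLOR"), list("HIMST")):
    print(step)
```

The last line printed is the fully merged sequence A G H I L M O R S T.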
Divide and Conquer
⚫ Divide the problem into a number of subproblems
that are smaller instances of the
same problem.
⚫ Conquer the subproblems by solving them
recursively. If the subproblem sizes are small enough,
however, just solve the subproblems in a
straightforward manner.
⚫ Combine the solutions to the subproblems into the
solution for the original problem.
Merge sort
⚫ Divide: Divide the n-element sequence to be
sorted into two subsequences of n/2 elements
each.
⚫ Conquer: Sort the two subsequences recursively
using merge sort.
⚫ Combine: Merge the two sorted subsequences to
produce the sorted answer.
⚫ The recursion “bottoms out” when the sequence to be
sorted has length 1, in which case there is no work to
be done, since every sequence of length 1 is already in
sorted order.
⚫ The key operation of the merge sort algorithm is the
merging of two sorted sequences in the “combine”
step.
⚫ We merge by calling an auxiliary procedure
MERGE(A,p,q,r), where A is an array and p, q, and r are
indices into the array such that p <= q < r.
⚫ MERGE assumes that the subarrays A[p..q] and A[q+1..r]
are in sorted order.
⚫ It merges them to form a single sorted subarray
A[p..r].
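A minimal Python sketch of the MERGE(A,p,q,r) procedure described above. It assumes 0-based, inclusive indices and uses a temporary list named result (an illustrative choice); it is one possible way to write the procedure, not necessarily the lecture's own version.

```python
def merge(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] (inclusive) into sorted A[p..r]."""
    result = []
    i, j = p, q + 1
    # Repeatedly take the smaller of the two front elements.
    while i <= q and j <= r:
        if A[i] <= A[j]:
            result.append(A[i])
            i += 1
        else:
            result.append(A[j])
            j += 1
    # One side is exhausted; copy whatever remains of the other.
    result.extend(A[i:q + 1])
    result.extend(A[j:r + 1])
    # Copy the merged result back into A[p..r].
    A[p:r + 1] = result


A = [9, 3, 7, 12, 14, 2, 6, 11, 5]
merge(A, 1, 4, 7)   # merges A[1..4] with A[5..7]
print(A)            # [9, 2, 3, 6, 7, 11, 12, 14, 5]
```

Only positions p through r change; the elements outside that range (9 and 5 here) are left alone.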
Merge sort indices
Merge sort
⚫ We need a base case.
⚫ The base case is a subarray containing fewer than two
elements, that is, when p >= r, since a subarray with no
elements or just one element is already sorted.
⚫ So we'll divide-conquer-combine only when p < r.
⚫ Let's see an example. Let's start with an array holding
[14, 7, 3, 12, 9, 11, 6, 2], so that the first subarray is
actually the full array, array[0..7] (p=0 and r=7).
⚫ This subarray has at least two elements, and so it's not
a base case.
Steps
⚫ Divide by finding the position q midway between
p and r: add p and r, divide by 2, and
round down.
⚫ Conquer by recursively sorting the subarrays in each
of the two subproblems created by the divide step.
That is, recursively sort the subarray array[p..q] and
recursively sort the subarray array[q+1..r].
⚫ Combine by merging the two sorted subarrays back
into the single sorted subarray array[p..r].
⚫ You can check for the base case easily. Finding the
midpoint q in the divide step is also really easy.
⚫ You have to make two recursive calls in the conquer
step.
⚫ It's the combine step, where you have to merge
two sorted subarrays, where the real work
happens.
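The divide, conquer, and combine steps can be sketched as follows. To keep the focus on the recursion, this sketch uses Python's standard heapq.merge as a stand-in for the combine step; that substitution is an assumption made here for brevity, not the lecture's MERGE procedure.

```python
from heapq import merge as merge_sorted


def merge_sort(array, p, r):
    """Sort array[p..r] (inclusive) in ascending order."""
    if p >= r:                      # base case: fewer than two elements
        return
    q = (p + r) // 2                # divide: midpoint, rounded down
    merge_sort(array, p, q)         # conquer: sort array[p..q]
    merge_sort(array, q + 1, r)     # conquer: sort array[q+1..r]
    # combine: merge the two sorted halves back into array[p..r]
    array[p:r + 1] = list(merge_sorted(array[p:q + 1], array[q + 1:r + 1]))


data = [14, 7, 3, 12, 9, 11, 6, 2]
merge_sort(data, 0, len(data) - 1)
print(data)  # [2, 3, 6, 7, 9, 11, 12, 14]
```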
Steps
Merge (A,p,q,r)
In each of the two sorted subarrays, the smallest
remaining element is on the top.
Take the smaller of these two and put it into the
result array. Repeat until both subarrays
are empty.
Copy the result back to A[p..r].
Recursively ..
Merging
Merging the final two sorted
sequences
⚫ Compare the topmost elements of both sorted
sequences.
⚫ So order n comparisons are required.
⚫ That is, at most n-1 comparisons, since each comparison
places one element and the last element is placed
without a comparison.
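The comparison count can be checked with a small sketch. The function merge_count and its counter are illustrative assumptions; the merge itself follows the description above.

```python
def merge_count(left, right):
    """Merge two sorted lists; return (merged list, comparisons made)."""
    merged = []
    comparisons = 0
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1            # one comparison places one element
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # Leftover elements are copied without any comparison.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


print(merge_count([1, 3, 5, 7], [2, 4, 6, 8])[1])  # 7, i.e. n - 1 for n = 8
print(merge_count([1, 2, 3, 4], [5, 6, 7, 8])[1])  # 4
```

Interleaved inputs hit the n-1 upper limit; when one half runs out early, fewer comparisons are needed, but the count stays proportional to n.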
Recurrence Analysis
Analysis
⚫ How much time does the divide step take? Constant time,
Θ(1), since it only computes the midpoint q.
⚫ The conquer step, where we recursively sort two
subarrays of approximately n/2 elements each, takes
some amount of time, but we'll account for that time
when we consider the subproblems.
⚫ The merge step takes Θ(n) time, because each of the n
elements is copied into the result once, with at most
one comparison per element.
Divide and merge steps alone
⚫ Together they take Θ(n) time,
⚫ because the Θ(1) divide cost is absorbed by the Θ(n)
merge cost.
⚫ To make things more concrete, let's say that the divide
and combine steps together take cn time for some
constant c.
⚫ Now we have to figure out the running time of the two
recursive calls on n/2 elements.
⚫ Each of these two recursive calls takes cn/2 time for its
own divide and merge steps, plus twice the running time
of mergeSort on an (n/4)-element subarray.
⚫ And so on…
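Written as a recurrence, with T(n) denoting the running time of mergeSort on n elements and assuming n is divisible by 4 so the halves are exact, the first expansion step looks like this:

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 2\bigl(2\,T(n/4) + c\,n/2\bigr) + cn \\
     &= 4\,T(n/4) + 2cn
\end{align*}
```

Each further expansion doubles the number of subproblems and adds another cn term.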
Recursion Tree for Merge Sort
For the original problem, we have a cost of cn, plus two
subproblems each of size (n/2) and running time T(n/2).
Each of the size n/2 problems has a cost of cn/2 plus two
subproblems, each costing T(n/4).
              cn                       cost of divide and merge
            /    \
        cn/2      cn/2
        /  \      /  \
  T(n/4) T(n/4) T(n/4) T(n/4)          cost of sorting subproblems
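Running time at each level of
merge: recursive merge sort calls
⚫ Merging time per level: cn, 2 × cn/2, 4 × cn/4, 8 × cn/8, …
⚫ First there are 2 merge sort calls, each on n/2 elements
and each taking cn/2 merging time.
⚫ Then..
⚫ 4 merge sort calls, each on n/4 elements and each taking
cn/4 merging time.
⚫ 8 calls, each taking cn/8 merging time…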
⚫ As the subproblems get smaller, the number of
subproblems doubles at each "level" of the recursion,
but the merging time halves.
⚫ The doubling and halving cancel each other out, and
so the total merging time is cn at each level of
recursion.
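The cancellation can be stated in one line. Here i is the level number, an index introduced for this sketch with the root at i = 0, and n is assumed to be a power of 2:

```latex
\[
\underbrace{2^{i}}_{\text{subproblems at level } i}
\cdot
\underbrace{c\,\frac{n}{2^{i}}}_{\text{merging time for each}}
= cn
\]
```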
⚫ Eventually, we get down to subproblems of size 1.
⚫ We have to spend Θ(1) time to sort subarrays of size 1,
because we have to test whether p < r, and this test
takes constant time.
⚫ How many subarrays of size 1 are there?
⚫ Since we started with n elements, there must be n of
them.
⚫ Since each base case takes Θ(1) time, let's say that
altogether, the base cases take cn time.
How many levels?
⚫ The total time for mergeSort is the sum of the merging
times for all the levels.
⚫ If there are l levels in the tree, then the total merging
time is l · cn.
⚫ So what is l?
⚫ We start with subproblems of size n and repeatedly
halve until we get down to subproblems of size 1.
⚫ The answer is l = lg n + 1 (taking n to be a power of 2).
⚫ For example, if n=8, then lg n + 1 = 4, and sure
enough, the tree has four levels:
⚫ n = 8, 4, 2, 1
⚫ The total time for mergeSort, then, is cn · (lg n + 1).
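The level count and the total can be checked numerically. This sketch assumes n is a power of 2 and takes c = 1 by default; the function name level_costs is an illustrative assumption.

```python
import math


def level_costs(n, c=1):
    """Cost charged to each level of the recursion tree, top to bottom.

    Assumes n is a power of 2. A level with `count` subproblems of size
    `size` is charged count * c * size, which always equals c * n.
    """
    costs = []
    size, count = n, 1
    while size >= 1:
        costs.append(count * c * size)
        size //= 2
        count *= 2
    return costs


costs = level_costs(8)
print(costs)                     # [8, 8, 8, 8]
print(len(costs))                # 4 levels, i.e. lg 8 + 1
print(sum(costs))                # 32, i.e. 8 * (lg 8 + 1)
print(8 * (math.log2(8) + 1))    # 32.0
```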
Running time
⚫ When we use big-Θ notation to describe this running
time, we can discard the low-order term (+1)
⚫ and the constant coefficient c, giving us a running
time of Θ(n lg n).
⚫ Θ is a tight bound here because merge sort's best, worst,
and average cases all take Θ(n lg n) time.
Not ‘in place’
⚫ During merging, it makes a copy of the entire array
being sorted, with one half in lowHalf and the other
half in highHalf.
⚫ Because it copies more than a constant number of
elements at some time, we say that merge sort does
not work in place
⚫ How many merge sort calls perform a merge? n - 1 recursive
calls, i.e. order n. Why?
⚫ Each block in the recursion tree that holds at least two
elements makes one merging mergesort call, i.e. 7 for n = 8.
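Both the copying and the call count can be observed with an instrumented sketch. The stats dictionary and its keys are assumptions added for illustration; low_half and high_half correspond to lowHalf and highHalf in the text above.

```python
def merge_sort_counting(array, p, r, stats):
    """Sort array[p..r] in place while recording call and copy counts."""
    stats["calls"] += 1
    if p >= r:                       # base case: nothing to merge
        return
    stats["merging_calls"] += 1
    q = (p + r) // 2
    merge_sort_counting(array, p, q, stats)
    merge_sort_counting(array, q + 1, r, stats)
    # Merging copies both halves, so it is not an in-place operation.
    low_half = array[p:q + 1]
    high_half = array[q + 1:r + 1]
    copied = len(low_half) + len(high_half)
    stats["max_copied"] = max(stats["max_copied"], copied)
    i = j = 0
    k = p
    while i < len(low_half) and j < len(high_half):
        if low_half[i] <= high_half[j]:
            array[k] = low_half[i]
            i += 1
        else:
            array[k] = high_half[j]
            j += 1
        k += 1
    # Copy whichever half still has elements left.
    for value in low_half[i:] + high_half[j:]:
        array[k] = value
        k += 1


stats = {"calls": 0, "merging_calls": 0, "max_copied": 0}
data = [14, 7, 3, 12, 9, 11, 6, 2]
merge_sort_counting(data, 0, len(data) - 1, stats)
print(data)   # [2, 3, 6, 7, 9, 11, 12, 14]
print(stats)  # {'calls': 15, 'merging_calls': 7, 'max_copied': 8}
```

For n = 8 there are 15 calls in total, of which n - 1 = 7 perform a merge, and the final merge copies all 8 elements.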
