
Chapter 2

Sorting and Order Statistics

Yonas Y.
School of Electrical and Computer Engineering

1
1 Introduction
2 Heapsort
Heaps
Maintaining the heap property
Building a heap
The heapsort algorithm
Priority queues
3 Quicksort
Quicksort
Performance of quicksort
Analysis of quicksort
4 Sorting in Linear Time
Lower bounds for sorting
Counting sort
Radix sort
Bucket sort
5 Medians and Order Statistics
Introduction
2
Introduction

This part presents several algorithms that solve the sorting


problem.

The numbers to be sorted =⇒ collection of data called a


record.

Each record contains a key, which is the value to be sorted.

The remainder of the record consists of satellite data, which


are usually carried around with the key.

3
There are a number of sorting algorithms with different expected
running time and applicability as shown below.

4
Heapsort

Heapsort combines the better attributes of the two sorting


algorithms - Insertion sort and Merge sort.

O(n lg n) worst case - like merge sort.

Sorts in place - like insertion sort.

To understand heapsort =⇒ heaps and heap operations =⇒


priority queues.

5
Heaps

Heaps

The (binary) heap data structure is an array object that we can


view as a nearly complete binary tree.

Heap data structure

Height of node = number of edges on a longest simple path


from the node down to a leaf.

Height of heap = height of root = Θ(lg n).

A.length: gives the number of elements in the array.

A.heap-size: represents how many elements in the heap are


stored within array A.

6
Heaps

A heap can be stored as an array A.


Root of tree is A[1].
Parent of A[i] = A[⌊i/2⌋].
Left child of A[i] = A[2i].
Right child of A[i] = A[2i + 1].
Computing these indices is fast: in binary, PARENT shifts i right one bit, LEFT shifts left one bit, and RIGHT shifts left and adds 1.
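
As a quick illustration (my own Python sketch, not from the slides): Python lists are 0-based, so the 1-based index formulas above shift by one.

# Index arithmetic for a binary heap stored in a 0-based Python list.
def parent(i): return (i - 1) // 2   # 1-based form: floor(i/2)
def left(i):   return 2 * i + 1      # 1-based form: 2i
def right(i):  return 2 * i + 2      # 1-based form: 2i + 1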

Figure: 2.1 A heap viewed as a binary tree and an array.


7
Heaps

Heap property

There are two kinds of binary heaps: max-heaps and min-heaps.

For max-heaps (largest element at root) ⇐⇒ max-heap


property: for all nodes i, excluding the root, A[PARENT(i)]
≥ A[i].

For min-heaps (smallest element at root) ⇐⇒ min-heap


property: for all nodes i, excluding the root, A[PARENT(i)]
≤ A[i].

By induction and transitivity of ≥, the max-heap property

guarantees that

The maximum element of a max-heap is at the root.

A similar argument (with ≤) holds for min-heaps.


8
Maintaining the heap property

Maintaining the heap property

In order to maintain the max-heap property, we call the procedure


MAX-HEAPIFY.

Before MAX-HEAPIFY, A[i] may be smaller than its children.

Assume left and right subtrees of i are max-heaps.

After MAX-HEAPIFY, subtree rooted at i is a max-heap.

9
Maintaining the heap property

Algorithm 1 MAX-HEAPIFY(A, i)
1: l = Left(i)
2: r = Right(i)
3: if l ≤ A.heap − size and A[l] > A[i] then
4: largest = l
5: else
6: largest = i
7: end if
8: if r ≤ A.heap − size and A[r ] > A[largest] then
9: largest = r
10: end if
11: if largest ≠ i then
12: exchange A[i] with A[largest]
13: MAX-HEAPIFY(A, largest)
14: end if

10
Maintaining the heap property

The way MAX-HEAPIFY works:

Compare A[i], A[LEFT(i)], and A[RIGHT(i)].

If necessary, swap A[i] with the larger of the two children to


preserve heap property.

Continue this process of comparing and swapping down the


heap, until subtree rooted at i is max-heap.

If we hit a leaf, then the subtree rooted at the leaf is trivially


a max-heap.
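
As a hedged sketch of the procedure just described (0-based Python indexing, with the heap size passed explicitly; the name follows the pseudocode but the code is illustrative, not from the slides):

def max_heapify(A, i, heap_size):
    # Sift A[i] down until the subtree rooted at i satisfies the
    # max-heap property, assuming its left and right subtrees already do.
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and A[l] > A[largest]:
        largest = l
    if r < heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]   # swap with the larger child
        max_heapify(A, largest, heap_size)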

11
Maintaining the heap property

Figure: 2.2 The action of MAX-HEAPIFY(A, 2), where A.heap-size =


10.
12
Maintaining the heap property

The running time of MAX-HEAPIFY on a subtree of size n rooted


at a given node i is:

1 Θ(1) time to fix up the relationships among the elements


A[i], A[LEFT(i)], and A[RIGHT(i)], plus;

2 The time to run MAX-HEAPIFY on a subtree rooted at one


of the children of node i (assuming that the recursive call
occurs).

13
Maintaining the heap property

The children’s subtrees each have size at most 2n/3

The worst case occurs when the bottom level of the tree is
exactly half full.

Therefore, we can describe the running time of MAX-HEAPIFY by


the recurrence

T (n) ≤ T (2n/3) + Θ(1).

The solution to this recurrence, is T(n) = O(lg n).

14
Building a heap

Building a heap

The following procedure, given an unordered array, will produce a


max-heap.

Algorithm 2 BUILD-MAX-HEAP(A)
1: A.heap − size = A.length
2: for i = ⌊A.length/2⌋ downto 1 do
3: MAX-HEAPIFY(A, i)
4: end for
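
A corresponding Python sketch (0-based, reusing max_heapify from the earlier sketch; the non-leaf nodes are indices 0 .. len(A)//2 − 1):

def build_max_heap(A):
    heap_size = len(A)
    for i in range(len(A) // 2 - 1, -1, -1):   # last non-leaf down to the root
        max_heapify(A, i, heap_size)
    return heap_size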

15
Building a heap

Example: Building a max-heap from the unsorted array shown as


A results in the heap of Figure 2.3.

16
Building a heap

Figure: 2.3 The operation of BUILD-MAX-HEAP, for a 10-element input


array A.

17
Building a heap

To show why BUILD-MAX-HEAP works correctly, we use the


following loop invariant:

At the start of each iteration of the for loop of lines 2-3, each
node i + 1, i + 2,..., n is the root of a max-heap.

Initialization: Prior to the first iteration of the loop, i = ⌊n/2⌋.

Each node ⌊n/2⌋ + 1, ⌊n/2⌋ + 2,..., n is a leaf and is thus the
root of a trivial max-heap.

18
Building a heap

Maintenance: Children of node i are indexed higher than i, so by


the loop invariant, they are both roots of max-heaps.

Correctly assuming that i + 1, i + 2,..., n are all roots


of max-heaps, MAX-HEAPIFY makes node i a max-heap
root.

Decrementing i re-establishes the loop invariant at each


iteration.

Termination: When i = 0, the loop terminates. By the loop


invariant, each node, notably node 1, is the root of a max-heap.

19
Building a heap

Analysis

Simple bound:

O(n) calls to MAX-HEAPIFY, each of which takes O(lg n)
time =⇒ O(n lg n).

This upper bound, though correct, is not asymptotically tight.

20
Building a heap

Tighter analysis:

Time to run MAX-HEAPIFY is linear in the height of the


node it’s run on, and most nodes have small heights.
The tighter analysis relies on the properties that an n-element
heap has height ⌊lg n⌋ and at most ⌈n/2^(h+1)⌉ nodes of any
height h.
The time required by MAX-HEAPIFY when called on a node
of height h is O(h), so the total cost of BUILD-MAX-HEAP is

Σ_{h=0}^{⌊lg n⌋} ⌈n/2^(h+1)⌉ · O(h) = O( n · Σ_{h=0}^{⌊lg n⌋} h/2^h )

21
Building a heap

Evaluate the last summation by substituting x = 1/2 into the
formula Σ_{k=0}^{∞} k·x^k = x/(1 − x)², which yields

Σ_{h=0}^{∞} h/2^h = (1/2)/(1 − 1/2)² = 2

Thus, the running time of BUILD-MAX-HEAP is O(n).


Hence, we can build a max-heap from an unordered array in
linear time.

22
The heapsort algorithm

The heapsort algorithm

Given an input array, the heapsort algorithm acts as follows:

Builds a max-heap from the array.


Starting with the root (the maximum element), the algorithm
places the maximum element into the correct place in the
array by swapping it with the element in the last position in
the array.
”Discard” this last node (knowing that it is in its correct
place) by decreasing the heap size, and calling
MAX-HEAPIFY on the new (possibly incorrectly-placed) root.
Repeat this ”discarding” process until only one node (the
smallest element) remains, and therefore is in the correct
place in the array.
23
The heapsort algorithm

Algorithm 3 HEAPSORT(A)
1: BUILD-MAX-HEAP(A)
2: for i = A.length downto 2 do
3: exchange A[1] with A[i]
4: A.heap − size = A.heap − size − 1
5: MAX-HEAPIFY(A, 1)
6: end for
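
A Python sketch of the same loop (0-based, reusing build_max_heap and max_heapify from the earlier sketches); e.g. heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]) sorts the list in place:

def heapsort(A):
    heap_size = build_max_heap(A)
    for i in range(len(A) - 1, 0, -1):
        A[0], A[i] = A[i], A[0]        # move the current maximum to its final slot
        heap_size -= 1
        max_heapify(A, 0, heap_size)   # restore the max-heap on the shrunken heap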

24
The heapsort algorithm

Figure: 2.4 The operation of HEAPSORT for input array A. 25


The heapsort algorithm

Analysis

The HEAPSORT procedure takes a total running time of:

BUILD-MAX-HEAP: O(n)
for loop: n - 1 times
exchange elements: O(1)
MAX-HEAPIFY: O(lg n)

Thus, the total time will be =⇒ O(n lg n).

Though heapsort is a great algorithm, a well-implemented


quicksort usually beats it in practice.

26
Priority queues

Priority queues

Heaps efficiently implement priority queues.

Max-priority queues implemented with max-heaps.

Min-priority queues are implemented with min-heaps similarly.

As an example, we can use max-priority queues to schedule


jobs on a shared computer.

A min-priority queue can be used in an event-driven simulator.

27
Priority queues

A priority queue:

Maintains a dynamic set S of elements.

Max-priority queue supports the following dynamic-set


operations:

INSERT(S, x): inserts element x into set S.

MAXIMUM(S): returns element of S with largest key.

EXTRACT-MAX(S): removes and returns element of S with


largest key.

INCREASE-KEY(S, x, k): increases value of element x’s key


to k. Assume k ≥ x’s current key value.

28
Priority queues

Finding the maximum element

Getting the maximum element is easy: it’s the root.

Algorithm 4 HEAP-MAXIMUM(A)
1: return A[1]

Time: Θ(1).

29
Priority queues

Extracting maximum element

Given the array A:

Make sure heap is not empty.

Make a copy of the maximum element (the root).

Make the last node in the tree the new root.

Re-heapify the heap, with one fewer node.

Return the copy of the maximum element.

30
Priority queues

Algorithm 5 HEAP-EXTRACT-MAX(A)
1: if A.heap − size < 1 then
2: error ”heap underflow”
3: end if
4: max = A[1]
5: A[1] = A[A.heap-size]
6: A.heap-size = A.heap-size - 1
7: MAX-HEAPIFY(A, 1)
8: return max

Time: O(lg n).

31
Priority queues

Increasing key value

Given set S, element x, and new key value k:

Make sure k ≥ x’s current key.

Update x’s key value to k.

Traverse the tree upward comparing x to its parent and


swapping keys if necessary, until x’s key is smaller than its
parent’s key.

32
Priority queues

Algorithm 6 HEAP-INCREASE-KEY(A, i, key)


1: if key < A[i] then
2: error ”new key is smaller than current key”
3: end if
4: A[i] = key
5: while i > 1 and A[PARENT (i)] < A[i] do
6: exchange A[i] with A[PARENT (i)]
7: i = PARENT (i)
8: end while

Time: O(lg n).

33
Priority queues

Figure: 2.5 The operation of HEAP-INCREASE-KEY.

34
Priority queues

Inserting into the heap

Given a key k to insert into the heap:

Insert a new node in the very last position in the tree with key
−∞.

Increase the −∞ key to k using the HEAP-INCREASE-KEY


procedure defined above.

35
Priority queues

Algorithm 7 MAX-HEAP-INSERT(A, key)


1: A.heap − size = A.heap − size + 1
2: A[A.heap − size] = −∞
3: HEAP-INCREASE-KEY(A, A.heap-size, key)

Analysis: Constant time assignments + time for


HEAP-INCREASE-KEY.

Time: O(lg n).
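
The four operations above fit together as in the following Python sketch (0-based, reusing max_heapify from the earlier sketch; the heap occupies the whole list, so the list itself plays the role of A[1..heap-size]):

def heap_maximum(A):
    return A[0]

def heap_extract_max(A):
    if len(A) < 1:
        raise IndexError("heap underflow")
    maximum = A[0]
    A[0] = A[-1]
    A.pop()
    max_heapify(A, 0, len(A))
    return maximum

def heap_increase_key(A, i, key):
    if key < A[i]:
        raise ValueError("new key is smaller than current key")
    A[i] = key
    while i > 0 and A[(i - 1) // 2] < A[i]:   # float the key up toward the root
        A[i], A[(i - 1) // 2] = A[(i - 1) // 2], A[i]
        i = (i - 1) // 2

def max_heap_insert(A, key):
    A.append(float("-inf"))                    # new leaf with key -infinity
    heap_increase_key(A, len(A) - 1, key)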

36
Quicksort

Quicksort

Worst-case running time: Θ(n²)

Expected running time: Θ(n lg n)

Constants hidden in Θ(n lg n) are small.

Sorts in place.

Despite the slow worst-case running time =⇒ remarkably


efficient on the average.

37
Quicksort

Description of quicksort

Quicksort is based on the three-step process of divide-and-conquer.


To sort the subarray A[p..r]:

Divide: Partition A[p..r] into two (possibly empty)

subarrays A[p..q-1] and A[q+1..r].
Each element of the first subarray A[p..q-1] is ≤ A[q], and A[q] is ≤ each
element in the second subarray A[q+1..r].

Conquer: Sort the two subarrays by recursive calls to


QUICKSORT.

Combine: No work is needed to combine the subarrays,


because they are sorted in place.

38
Quicksort

Algorithm 8 QUICKSORT(A, p, r)
1: if p < r then
2: q = PARTITION(A, p, r)
3: QUICKSORT(A, p, q-1)
4: QUICKSORT(A, q+1, r)
5: end if

To sort an entire array A, the initial call is QUICKSORT(A, 1,


A.length).

39
Quicksort

Partitioning

Partition subarray A[p..r ] by the following procedure:

Algorithm 9 PARTITION(A, p, r)
1: x = A[r]
2: i = p − 1
3: for j = p to r − 1 do
4: if A[j] ≤ x then
5: i = i + 1
6: exchange A[i] with A[j]
7: end if
8: end for
9: exchange A[i + 1] with A[r]
10: return i + 1
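
For concreteness, a Python sketch of PARTITION and QUICKSORT (0-based indices; illustrative, the pseudocode above is the reference):

def partition(A, p, r):
    # Lomuto partition: the pivot is A[r]; returns the pivot's final index.
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p=0, r=None):
    if r is None:
        r = len(A) - 1
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)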

40
Quicksort

PARTITION always selects the last element A[r ] in the


subarray A[p..r ] as the pivot - the element around which to
partition.

As the procedure executes, the array is partitioned into four


regions, some of which may be empty:

41
Quicksort

We state these properties as a loop invariant:

1 All entries in A[p..i] are ≤ pivot.

2 All entries in A[i + 1..j − 1] are > pivot.

3 A[r ] = pivot.

The indices between j and r − 1 are not covered by any of the


three cases, and the values in these entries have no particular
relationship to the pivot x.

42
Quicksort

Example: On an 8-element subarray.

Figure: 2.6 The operation of PARTITION on a sample array. 43


Quicksort

We use the loop invariant to prove correctness of PARTITION:

Initialization: Before the loop starts, all the conditions of the


loop invariant are satisfied, because r is the pivot and the
subarrays A[p..i] and A[i+1..j-1] are empty.

Maintenance: While the loop is running, if A[j] ≤ pivot,


increment i, then A[j] and A[i] are swapped and then j is
incremented. If A[j] > pivot, then increment only j.

Termination: When the loop terminates, j = r, so all


elements in A are partitioned into one of the three cases:
A[p..i] ≤ pivot, A[i+1..r-1] > pivot, and A[r] =
pivot.

Time for partitioning: Θ(n) to partition an n-element subarray.


44
Performance of quicksort

Performance of quicksort

The running time of quicksort depends on the partitioning of the


subarrays:

If the subarrays are balanced, then quicksort can run as fast as


merge sort.

If they are unbalanced, then quicksort can run as slowly as


insertion sort.

45
Performance of quicksort

Worst-case partitioning
Occurs when the subarrays are completely unbalanced.
Have 0 elements in one subarray and n - 1 elements in the
other subarray.
Thus we will get the recurrence

T (n) = T (n − 1) + T (0) + Θ(n)
= T (n − 1) + Θ(n)
= Θ(n²)

Same running time as insertion sort.


In fact, the worst-case running time occurs when quicksort
takes a sorted array as input, but insertion sort runs in O(n)
time in this case. 46
Performance of quicksort

Best-case partitioning

Occurs when the subarrays are completely balanced every


time.

Each subarray has ≤ n/2 elements.

Thus we will get the recurrence

T (n) = 2T (n/2) + Θ(n)


= Θ(n lg n)

47
Performance of quicksort

Balanced partitioning

Quicksort’s average running time is much closer to the best


case than to the worst case.

Imagine that PARTITION always produces a 9-to-1 split.

Thus we will get the recurrence

T (n) = T (9n/10) + T (n/10) + Θ(n)

Figure 2.7 shows the recursion tree for this recurrence.

48
Performance of quicksort

Figure: 2.7 A recursion tree for QUICKSORT in which PARTITION always


produces a 9-to-1 split.
49
Performance of quicksort

Every level of the tree has cost cn, until the recursion reaches
a boundary condition at depth log10 n = Θ(lg n).

Then the levels have cost at most cn.

The recursion terminates at depth log10/9 n = Θ(lg n).

The total cost of quick-sort is therefore O(n lg n).

50
Performance of quicksort

Intuition for the average case


Splits in the recursion tree will not always be constant.
There will usually be a mix of good and bad splits throughout
the recursion tree.
To see that this doesn’t affect the asymptotic running time of
quicksort, assume that levels alternate between best-case and
worst-case splits.

Figure: 2.8 Running time of two levels of a recursion tree for quicksort
(”worst partition” and ”best partition”) vs. a single level of a recursion
tree that is very well balanced. 51
Performance of quicksort

The combination of the bad split followed by the good split


produces three sub-arrays of sizes 0, (n-1)/2-1 and (n-1)/2
at a combined partitioning cost of Θ(n) + Θ(n-1) = Θ(n).

Thus, the extra level in the left-hand figure only adds to the
constant hidden in the Θ - notation.

Both figures result in O(n lg n) time, though the constant for


the figure on the left is higher than that of the figure on the
right.

52
Performance of quicksort

Randomized version of quicksort

In exploring the average - case behavior of quicksort, we have


made an assumption that all permutations of the input numbers
are equally likely. This is not always true.

To correct this, we add randomization to quicksort. We could


randomly permute the input array.

Instead, we use random sampling, or picking one element at


random.

Don’t always use A[r] as the pivot. Instead, randomly pick


an element from the subarray that is being sorted.

53
Performance of quicksort

Algorithm 10 RANDOMIZED-PARTITION(A, p, r)
1: i = RANDOM(p, r)
2: exchange A[r ] with A[i]
3: return PARTITION(A, p, r)

Randomly selecting the pivot element will, on average, cause the


split of the input array to be reasonably well balanced.

54
Performance of quicksort

Algorithm 11 RANDOMIZED-QUICKSORT(A, p, r)
1: if p < r then
2: q = RANDOMIZED-PARTITION(A, p, r)
3: RANDOMIZED-QUICKSORT(A, p, q-1)
4: RANDOMIZED-QUICKSORT(A, q+1, r)
5: end if
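
The randomized variants differ only in how the pivot is chosen; a Python sketch (reusing partition from the earlier quicksort sketch):

import random

def randomized_partition(A, p, r):
    i = random.randint(p, r)        # pick a pivot position uniformly at random
    A[i], A[r] = A[r], A[i]         # move it to the end, then partition as before
    return partition(A, p, r)

def randomized_quicksort(A, p=0, r=None):
    if r is None:
        r = len(A) - 1
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)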

For example, an already-sorted array causes worst-case


behavior in non-randomized QUICKSORT, but not in
RANDOMIZED-QUICKSORT.

55
Analysis of quicksort

Analysis of quicksort

We will analyze

the worst-case running time of QUICKSORT and


RANDOMIZED-QUICKSORT (the same), and

the expected (average-case) running time of


RANDOMIZED-QUICKSORT.

56
Analysis of quicksort

Worst-case analysis

We will prove that a worst-case split at every level produces a


worst-case running time of O(n²).

Recurrence for the worst-case running time of QUICKSORT:

T (n) = max_{0≤q≤n−1} [T (q) + T (n − q − 1)] + Θ(n).

Because PARTITION produces two subproblems, totaling size


n - 1, q ranges from 0 to n - 1.

Guess: T(n) ≤ cn², for some constant c.

57
Analysis of quicksort

Substituting our guess into the above recurrence:

T (n) ≤ max_{0≤q≤n−1} (cq² + c(n − q − 1)²) + Θ(n)
= c · max_{0≤q≤n−1} (q² + (n − q − 1)²) + Θ(n)

The maximum value of q² + (n − q − 1)² occurs when q is either
0 or n − 1. This means that

q² + (n − q − 1)² ≤ (n − 1)²
= n² − 2n + 1

58
Analysis of quicksort

Therefore,

T (n) ≤ cn² − c(2n − 1) + Θ(n)
≤ cn² if c(2n − 1) ≥ Θ(n)

Pick c so that the c(2n − 1) term dominates Θ(n).

Therefore, the worst-case running time of quicksort is O(n²).

59
Analysis of quicksort

Expected running time

The running time of QUICKSORT is dominated by the time spent


in the PARTITION procedure.

PARTITION removes the pivot element from future


consideration each time.

Thus, PARTITION is called at most n times.

QUICKSORT recurses on the partitions.

60
Analysis of quicksort

The amount of work that each call to PARTITION does is a


constant plus the number of comparisons that are performed
in its for loop.

Let X = the total number of comparisons performed in all


calls to PARTITION.

Therefore, the total work done over the entire execution is


O(n + X).

61
Analysis of quicksort

Compute a bound on the overall number of comparisons.

For ease of analysis:

Rename the elements of A as z1 , z2 , ..., zn , with zi being the


i th smallest element.

Define the set Zij = {zi , zi+1 , ..., zj } to be the set of elements
between zi and zj , inclusive.

Each pair of elements is compared at most once, because elements


are compared only to the pivot element, and then the pivot
element is never later called to PARTITION.

62
Analysis of quicksort

Let Xij = I {zi is compared to zj }.


Considering whether zi is compared to zj at any time during
the entire quicksort algorithm, not just during one call of
PARTITION.

Since each pair is compared at most once, the total number of


comparisons performed by the algorithm is
X = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xij

63
Analysis of quicksort

Take expectations of both sides, and by linearity of expectation:

 
E[X] = E[ Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xij ]
= Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[Xij]
= Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Pr{zi is compared to zj}.

64
Analysis of quicksort

Now all we have to do is find the probability that two elements are
compared.

Think about when two elements are not compared.

For example, numbers in separate partitions will not be


compared.
In the previous example, 8, 1, 6, 4, 0, 3, 9, 5 and the
pivot is 5, so that none of the set {1, 4, 0, 3} will ever be
compared to any of the set {8, 6, 9}.

65
Analysis of quicksort

Once a pivot x is chosen such that zi < x < zj , then zi and zj


will never be compared at any later time.

If either zi or zj is chosen before any other element of Zij , then


it will be compared to all the elements of Zij , except itself.

The probability that zi is compared to zj is the probability


that either zi or zj is the first element chosen.

There are j − i + 1 elements, and pivots are chosen randomly


and independently. Thus, the probability that any particular
one of them is the first one chosen is 1/(j − i + 1).

66
Analysis of quicksort

Therefore,

Pr{zi is compared to zj} = Pr{zi or zj is the first pivot chosen from Zij}
= Pr{zi is the first pivot chosen from Zij}
+ Pr{zj is the first pivot chosen from Zij}
= 1/(j − i + 1) + 1/(j − i + 1)
= 2/(j − i + 1)

Substituting into the equation for E [X ]:


E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1)

67
Analysis of quicksort

Evaluate by using a change in variables (k = j − i) and the bound


on the harmonic series:
E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1)
= Σ_{i=1}^{n−1} Σ_{k=1}^{n−i} 2/(k + 1)
< Σ_{i=1}^{n−1} Σ_{k=1}^{n} 2/k
= Σ_{i=1}^{n−1} O(lg n)
= O(n lg n)

So the expected running time of quicksort, using


RANDOMIZED-PARTITION, is O(n lg n). 68
Lower bounds for sorting

Sorting in Linear Time

All sorts seen so far are comparison sorts: insertion sort,


merge sort, quicksort, heapsort.
The only operation that may be used to gain order information
about a sequence is comparison of pairs of elements.

Lower bounds for sorting


In a comparison sort, we use only comparisons between elements to
gain order information about an input sequence ha1 , a2 , ..., an i.

Ω(n) to examine all the input.


All sorts seen so far are Ω(n lgn).

We can view comparison sorts abstractly in terms of decision trees.


69
Lower bounds for sorting

Decision tree

A decision tree is a full binary tree that represents comparisons


made by:

a specific sorting algorithm

on inputs of a given size.

We’re counting only comparisons.

Abstracts away everything else: control and data movement.

70
Lower bounds for sorting

For insertion sort on 3 elements:

Figure: 2.9 The decision tree for insertion sort operating on three
elements.

71
Lower bounds for sorting

Because any correct sorting algorithm must be able to produce
each permutation of its input,

each of the n! permutations must appear as one of the leaves of the
decision tree.

Furthermore, each of these leaves must be reachable from the root


by a downward path corresponding to an actual execution of the
comparison sort.

72
Lower bounds for sorting

A lower bound for the worst case

The length of the longest simple path from the root of a


decision tree to any of its reachable leaves represents the
worst-case.

A lower bound on the heights of all decision trees is therefore


a lower bound on the running time of any comparison sort
algorithm.

73
Lower bounds for sorting

Theorem:
Any comparison sort algorithm requires Ω(n lg n) comparisons in
the worst case.
Proof: Consider a decision tree of height h with l reachable leaves
corresponding to a comparison sort on n elements.
Any binary tree of height h has ≤ 2^h leaves.
Because each of the n! permutations of the input appears as
some leaf, we have n! ≤ l.
Thus,
n! ≤ l ≤ 2^h
which, by taking logarithms, implies
h ≥ lg(n!) (then, by using Stirling's approximation n! > (n/e)^n)
= Ω(n lg n)
Heapsort and merge sort are asymptotically optimal comparison
sorts. 74
Counting sort

Counting sort

Counting sort assumes that each of the n input elements is an


integer in the range 0 to k, for some integer k.

When k = O(n), the sort runs in Θ(n) time.

Counting sort determines, for each input element x, the


number of elements less than x.

It uses this information to place element x directly into its


position in the output array.

75
Counting sort

Input: A[1..n], where A[j] ∈ {0, 1,..., k} for j = 1,


2,..., n. Array A and values n and k are given as
parameters.

Output: B[1..n], sorted. B is assumed to be already


allocated and is given as a parameter.

Auxiliary storage: C[0..k] provides temporary working


storage.

76
Counting sort

Algorithm 12 COUNTING-SORT(A, B, k)
1: let C [0 . . . k] be a new array
2: for i = 0 to k do
3: C [i] = 0
4: end for
5: for j = 1 to A.length do
6: C [A[j]] = C [A[j]] + 1
7: ▷ C [i] now contains the number of elements equal to i.
8: end for
9: for i = 1 to k do
10: C [i] = C [i] + C [i − 1]
11: ▷ C [i] now contains the number of elements less than or equal to i.
12: end for
13: for j = A.length downto 1 do
14: B[C [A[j]]] = A[j]
15: C [A[j]] = C [A[j]] − 1
16: end for 77
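
A Python sketch of the same procedure (0-based output positions; assumes the input values lie in 0..k):

def counting_sort(A, k):
    C = [0] * (k + 1)
    for a in A:                      # C[i] = number of elements equal to i
        C[a] += 1
    for i in range(1, k + 1):        # C[i] = number of elements <= i
        C[i] += C[i - 1]
    B = [0] * len(A)
    for a in reversed(A):            # walking backwards keeps the sort stable
        B[C[a] - 1] = a
        C[a] -= 1
    return B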
Counting sort

Figure: 2.10 The operation of COUNTING-SORT on an input array


A[1..8], where each element of A is a nonnegative integer no larger
than k = 5.

78
Counting sort

Analysis of counting sorts running time

The for loop of lines 2-4 takes time Θ(k).

The for loop of lines 5-8 takes time Θ(n).

The for loop of lines 9-12 takes time Θ(k).

The for loop of lines 13-16 takes time Θ(n).

Thus, the overall time is Θ(n + k), which is Θ(n) if k = O(n).

An important property of counting sort is that it is stable.

Normally, the property of stability is important only when


satellite data are carried around with the element being sorted.

79
Radix sort

Radix sort

Key idea: Sort least significant digits first.

Radix sort is the algorithm used by card-sorting machines; it is the
algorithm that extends single-column sorting to multi-column keys.

Works on one digit (column) at a time.

80
Radix sort

Algorithm 13 RADIX-SORT(A, d)
1: for i = 1 to d do
2: use a stable sort to sort array A on digit i
3: end for
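
A Python sketch that uses a stable counting sort on each digit, least significant first; the base parameter is my own generalization, the slides assume decimal digits:

def radix_sort(A, d, base=10):
    for exp in range(d):                          # digit 1 (least significant) .. digit d
        C = [0] * base
        for a in A:
            C[(a // base**exp) % base] += 1
        for i in range(1, base):
            C[i] += C[i - 1]
        B = [0] * len(A)
        for a in reversed(A):                     # backwards pass keeps stability
            digit = (a // base**exp) % base
            B[C[digit] - 1] = a
            C[digit] -= 1
        A[:] = B
    return A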

Figure: 2.11 The operation of radix sort on a list of seven 3-digit numbers.

81
Radix sort

Correctness:

Assuming digits 1, 2,..., i - 1 are sorted.


Stable sort on digit i leaves digits 1,..., i sorted:
If 2 digits in position i are different, ordering by position i is
correct, and positions 1,..., i - 1 are irrelevant.
If 2 digits in position i are equal, numbers are already in the
right order (by inductive hypothesis). The stable sort on digit
i leaves them in the right order.

This argument shows why it’s so important to use a stable


sort for intermediate sort.

82
Radix sort

Analysis of Radix Sort running time

Assume that we use counting sort as the intermediate sort.

Θ(n + k) per pass (digits in range 0, . . . , k)

d passes

Thus, total running time is Θ(d(n + k)).

If k = O(n), then total running time Θ(dn).

83
Bucket sort

Bucket sort

Bucket sort assumes that the input is drawn from a uniform


distribution and has an average-case running time of O(n).

Like counting sort, bucket sort is fast because it assumes


something about the input.

Bucket sort assumes that the input is generated by a random


process that distributes elements uniformly and independently
over the interval [0, 1).

84
Bucket sort

Idea of bucket sort:

Divide [0, 1) into n equal-sized buckets.

Distribute the n input values into the buckets.

Sort each bucket.

Then go through buckets in order, listing elements in each


one.

Input: A[1..n], where 0 ≤ A[i] < 1 for all i.

Auxiliary array: B[0..n - 1] of linked lists, each list initially


empty.

85
Bucket sort

Algorithm 14 BUCKET-SORT(A, n)
1: let B[0 . . . n − 1] be a new array
2: n = A.length
3: for i = 0 to n − 1 do
4: make B[i] an empty list
5: end for
6: for i = 1 to n do
7: insert A[i] into list B[⌊n · A[i]⌋]
8: end for
9: for i = 0 to n − 1 do
10: sort list B[i] with insertion sort
11: end for
12: concatenate the lists B[0], B[1],. . . , B[n-1] together in order.
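
A Python sketch of the same idea (assumes 0 ≤ A[i] < 1; the built-in sorted() stands in for the per-bucket insertion sort of the pseudocode):

def bucket_sort(A):
    n = len(A)
    B = [[] for _ in range(n)]
    for x in A:
        B[int(n * x)].append(x)      # bucket index = floor(n * x)
    out = []
    for bucket in B:
        out.extend(sorted(bucket))   # sort each bucket, then concatenate in order
    return out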

86
Bucket sort

Figure: 2.12 The operation of BUCKET-SORT for n = 10.

87
Bucket sort

Correctness: Consider A[i], A[j].

Assume without loss of generality that A[i] ≤ A[j]. Then


⌊n · A[i]⌋ ≤ ⌊n · A[j]⌋.

So A[i] is placed into the same bucket as A[j] or into a


bucket with a lower index.
If same bucket, insertion sort fixes up.
If earlier bucket, concatenation of lists fixes up.

88
Bucket sort

Running time analysis of bucket sort

Relies on no bucket getting too many values.

All lines of algorithm except insertion sorting take Θ(n)


altogether.

Intuitively, if each bucket gets a constant number of elements,


it takes O(1) time to sort each bucket ⇒ O(n) sort time for
all buckets.

89
Bucket sort

To analyze the cost of the calls to insertion sort, define a random


variable:

ni = the number of elements placed in bucket B[i].

Because insertion sort runs in quadratic time, bucket sort running


time is
T (n) = Θ(n) + Σ_{i=0}^{n−1} O(ni²).

90
Bucket sort

Take expectations of both sides:


E[T (n)] = E[ Θ(n) + Σ_{i=0}^{n−1} O(ni²) ]
= Θ(n) + Σ_{i=0}^{n−1} E[O(ni²)]
= Θ(n) + Σ_{i=0}^{n−1} O(E[ni²])

Using indicator random variables, one can show that

E[ni²] = 2 − (1/n) for i = 0, 1, ..., n − 1
(check the proof in the textbook)

91
Bucket sort

Therefore:
E[T (n)] = Θ(n) + Σ_{i=0}^{n−1} O(2 − 1/n)
= Θ(n) + O(n)
= Θ(n)

This is a probabilistic analysis - we used probability to analyze an


algorithm whose running time depends on the distribution of
inputs.

92
Introduction

Medians and Order

The ith order statistic is the ith smallest element of a set of n


elements.

The minimum is the first order statistic (i = 1).

The maximum is the nth order statistic (i = n).

A median is the ”halfway point” of the set.

93
Introduction

When n is odd, the median is unique, at i = (n + 1)/2.

When n is even, there are two medians:

The lower median, at i = n/2, and


The upper median, at i = n/2 + 1.
We mean lower median when we use the phrase ”the median”.

94
Introduction

The selection problem:

Input: A set A of n distinct numbers and a number i, with


1 ≤ i ≤ n.
Output: The element x ∈ A that is larger than exactly i - 1
other elements in A. In other words, the ith smallest element
of A.

The selection problem can be solved in O(n lg n) time.

Sort the numbers using an O(n lg n)-time algorithm, such as
heapsort or merge sort.
Then return the ith element in the sorted array.

95
Minimum and Maximum

Minimum and Maximum

We can easily obtain an upper bound of n - 1 comparisons for


finding the minimum of a set of n elements.

Examine each element in turn and keep track of the smallest


one.

This is the best we can do, because each element, except the
minimum, must be compared to a smaller element at least
once.

96
Minimum and Maximum

The following pseudocode finds the minimum element in array


A[1..n].

Algorithm 15 MINIMUM(A)
1: min = A[1]
2: for i = 2 to A.length do
3: if min > A[i] then
4: min = A[i]
5: end if
6: end for
7: return min

The maximum can be found in exactly the same way by replacing


the > with < in the above algorithm.
97
Minimum and Maximum

Simultaneous minimum and maximum

Some applications need both the minimum and maximum of a set


of elements.
A simple algorithm to find the minimum and maximum is to find
each one independently.

There will be n - 1 comparisons for the minimum and n - 1


comparisons for the maximum, for a total of 2n - 2
comparisons.
This will result in Θ(n) time.

98
Minimum and Maximum

In fact, with 3⌊n/2⌋ comparisons we can find both the minimum


and maximum:

Maintain the minimum and maximum of elements seen so far.


Don’t compare each element to the minimum and maximum
separately.
Process elements in pairs.
Compare the elements of a pair to each other.
Then compare the larger element to the maximum so far, and
compare the smaller element to the minimum so far.

This leads to only 3 comparisons for every 2 elements.

99
Minimum and Maximum

Setting up the initial values for the min and max depends on
whether n is odd or even.

If n is even, compare the first two elements and assign the


larger to max and the smaller to min. Then process the
rest of the elements in pairs.

If n is odd, set both min and max to the first element. Then
process the rest of the elements in pairs.
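
Putting the pairing idea together, a Python sketch (my own illustration; assumes a non-empty list, roughly 3⌊n/2⌋ comparisons):

def min_and_max(A):
    n = len(A)
    if n % 2 == 0:
        lo, hi = (A[0], A[1]) if A[0] < A[1] else (A[1], A[0])
        start = 2
    else:
        lo = hi = A[0]
        start = 1
    for i in range(start, n - 1, 2):
        small, big = (A[i], A[i + 1]) if A[i] < A[i + 1] else (A[i + 1], A[i])
        if small < lo:
            lo = small
        if big > hi:
            hi = big
    return lo, hi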

100
Minimum and Maximum

Analysis of the total number of comparisons

If n is even, we do 1 initial comparison and then 3(n − 2)/2
more comparisons.

# of comparisons = 3(n − 2)/2 + 1
= (3n − 6)/2 + 1
= 3n/2 − 2

If n is odd, we do 3(n − 1)/2 = 3⌊n/2⌋ comparisons.

In either case, the maximum number of comparisons is ≤ 3⌊n/2⌋.

101
Selection

Selection in expected linear time

Selection of the ith smallest element of the array A can be


done in Θ(n) time.

The function RANDOMIZED-SELECT uses


RANDOMIZED-PARTITION from the quicksort algorithm:

RANDOMIZED-SELECT differs from quicksort because it


recurses on one side of the partition only, which gives it a
reduced running time of Θ(n).

102
Selection

The following code for RANDOMIZED-SELECT returns the ith


smallest element of the array A[p..r].

Algorithm 16 RANDOMIZED-SELECT(A, p, r, i)
1: if p == r then
2: return A[p]
3: end if
4: q = RANDOMIZED-PARTITION(A, p, r)
5: k = q - p + 1
6: if i == k then . the pivot value is the answer
7: return A[q]
8: else if i < k then
9: return RANDOMIZED-SELECT(A, p, q-1, i)
10: else
11: return RANDOMIZED-SELECT(A, q+1, r, i-k)
12: end if
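
A Python sketch of the same procedure (0-based array indices but a 1-based rank i, reusing randomized_partition from the quicksort sketch):

def randomized_select(A, p, r, i):
    if p == r:
        return A[p]
    q = randomized_partition(A, p, r)
    k = q - p + 1                      # rank of the pivot within A[p..r]
    if i == k:                         # the pivot value is the answer
        return A[q]
    elif i < k:
        return randomized_select(A, p, q - 1, i)
    else:
        return randomized_select(A, q + 1, r, i - k)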
103
Selection

After the call to RANDOMIZED-PARTITION, the array is partitioned


into two subarrays A[p..q − 1] and A[q + 1..r ], along with a pivot
element A[q].

The elements of subarray A[p..q − 1] are all ≤ A[q].

The elements of subarray A[q + 1..r ] are all > A[q].

The pivot element is the k th element of the subarray A[p..r ],


where k = q − p + 1.

If the pivot element is the i th smallest element (i.e., i = k),


return A[q].

104
Selection

Otherwise, recurse on the subarray containing the i th smallest


element.

If i < k, this subarray is A[p..q − 1], and we want the i th


smallest element.

If i > k, this subarray is A[q + 1..r ] and, since there are k


elements in A[p..r ] that precede A[q + 1..r ], we want the
(i − k)th smallest element of this subarray.

105
Selection

Running time analysis

Worst-case running time: Θ(n²), because we could be


extremely unlucky and always recurse on a subarray that is
only 1 element smaller than the previous subarray.

Expected running time: RANDOMIZED-SELECT works well on


average. Because it is randomized, no particular input brings
out the worst-case behavior consistently. As a result of that
this algorithm runs in linear time ⇒ Θ(n).

106
Selection

End of Chapter 2

Questions?

107
