
Chapter 2

Sorting and Order Statistics

Yonas Y.
School of Electrical and Computer Engineering

1
1 Introduction
2 Heapsort
Heaps
Maintaining the heap property
Building a heap
The heapsort algorithm
Priority queues
3 Quicksort
Quicksort
Performance of quicksort
Analysis of quicksort
4 Sorting in Linear Time
Lower bounds for sorting
Counting sort
Radix sort
Bucket sort
5 Medians and Order Statistics
Introduction
2
Introduction

This part presents several algorithms that solve the sorting


problem.

The numbers to be sorted =⇒ collection of data called a


record.

Each record contains a key, which is the value to be sorted.

The remainder of the record consists of satellite data, which


are usually carried around with the key.

3
There are a number of sorting algorithms with different expected
running time and applicability as shown below.

4
Heapsort

Heapsort combines the better attributes of the two sorting


algorithms - Insertion sort and Merge sort.

O(n lg n) worst case - like merge sort.

Sorts in place - like insertion sort.

To understand heapsort =⇒ heaps and heap operations =⇒


priority queues.

5
Heaps

Heaps

The (binary) heap data structure is an array object that we can


view as a nearly complete binary tree.

Heap data structure

Height of node = number of edges on a longest simple path


from the node down to a leaf.

Height of heap = height of root = Θ(lg n).

A.length: gives the number of elements in the array.

A.heap-size: represents how many elements in the heap are


stored within array A.

6
Heaps

A heap can be stored as an array A.


Root of tree is A[1].
Parent of A[i] = A[⌊i/2⌋].
Left child of A[i] = A[2i].
Right child of A[i] = A[2i + 1].
Computing these indices is fast: in binary, PARENT shifts i right one bit, LEFT shifts left one bit, and RIGHT shifts left and adds 1.
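
As a quick illustration (my own Python sketch, not from the slides): Python lists are 0-based, so the 1-based index formulas above shift by one.

# Index arithmetic for a binary heap stored in a 0-based Python list.
def parent(i): return (i - 1) // 2   # 1-based form: floor(i/2)
def left(i):   return 2 * i + 1      # 1-based form: 2i
def right(i):  return 2 * i + 2      # 1-based form: 2i + 1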

Figure: 2.1 A heap viewed as a binary tree and an array.


7
Heaps

Heap property

There are two kinds of binary heaps: max-heaps and min-heaps.

For max-heaps (largest element at root) ⇐⇒ max-heap


property: for all nodes i, excluding the root, A[PARENT(i)]
≥ A[i].

For min-heaps (smallest element at root) ⇐⇒ min-heap


property: for all nodes i, excluding the root, A[PARENT(i)]
≤ A[i].

By induction and transitivity of ≥, the max-heap property

guarantees that

The maximum element of a max-heap is at the root.

A similar argument (with ≤) holds for min-heaps.


8
Maintaining the heap property

Maintaining the heap property

In order to maintain the max-heap property, we call the procedure


MAX-HEAPIFY.

Before MAX-HEAPIFY, A[i] may be smaller than its children.

Assume left and right subtrees of i are max-heaps.

After MAX-HEAPIFY, subtree rooted at i is a max-heap.

9
Maintaining the heap property

Algorithm 1 MAX-HEAPIFY(A, i)
1: l = Left(i)
2: r = Right(i)
3: if l ≤ A.heap − size and A[l] > A[i] then
4: largest = l
5: else
6: largest = i
7: end if
8: if r ≤ A.heap − size and A[r ] > A[largest] then
9: largest = r
10: end if
11: if largest ≠ i then
12: exchange A[i] with A[largest]
13: MAX-HEAPIFY(A, largest)
14: end if

10
Maintaining the heap property

The way MAX-HEAPIFY works:

Compare A[i], A[LEFT(i)], and A[RIGHT(i)].

If necessary, swap A[i] with the larger of the two children to


preserve heap property.

Continue this process of comparing and swapping down the


heap, until subtree rooted at i is max-heap.

If we hit a leaf, then the subtree rooted at the leaf is trivially


a max-heap.
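
As a hedged sketch of the procedure just described (0-based Python indexing, with the heap size passed explicitly; the name follows the pseudocode but the code is illustrative, not from the slides):

def max_heapify(A, i, heap_size):
    # Sift A[i] down until the subtree rooted at i satisfies the
    # max-heap property, assuming its left and right subtrees already do.
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and A[l] > A[largest]:
        largest = l
    if r < heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]   # swap with the larger child
        max_heapify(A, largest, heap_size)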

11
Maintaining the heap property

Figure: 2.2 The action of MAX-HEAPIFY(A, 2), where A.heap-size =


10.
12
Maintaining the heap property

The running time of MAX-HEAPIFY on a subtree of size n rooted


at a given node i is:

1 Θ(1) time to fix up the relationships among the elements


A[i], A[LEFT(i)], and A[RIGHT(i)], plus;

2 The time to run MAX-HEAPIFY on a subtree rooted at one


of the children of node i (assuming that the recursive call
occurs).

13
Maintaining the heap property

The children’s subtrees each have size at most 2n/3

The worst case occurs when the bottom level of the tree is
exactly half full.

Therefore, we can describe the running time of MAX-HEAPIFY by


the recurrence

T (n) ≤ T (2n/3) + Θ(1).

The solution to this recurrence, is T(n) = O(lg n).

14
Building a heap

Building a heap

The following procedure, given an unordered array, will produce a


max-heap.

Algorithm 2 BUILD-MAX-HEAP(A)
1: A.heap − size = A.length
2: for i = ⌊A.length/2⌋ downto 1 do
3: MAX-HEAPIFY(A, i)
4: end for
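
A corresponding Python sketch (0-based, reusing max_heapify from the earlier sketch; the non-leaf nodes are indices 0 .. len(A)//2 − 1):

def build_max_heap(A):
    heap_size = len(A)
    for i in range(len(A) // 2 - 1, -1, -1):   # last non-leaf down to the root
        max_heapify(A, i, heap_size)
    return heap_size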

15
Building a heap

Example: Building a max-heap from the unsorted array shown as


A results in the heap of Figure 2.3.

16
Building a heap

Figure: 2.3 The operation of BUILD-MAX-HEAP, for a 10-element input


array A.

17
Building a heap

To show why BUILD-MAX-HEAP works correctly, we use the


following loop invariant:

At the start of each iteration of the for loop of lines 2-3, each
node i + 1, i + 2,..., n is the root of a max-heap.

Initialization: Prior to the first iteration of the loop, i = ⌊n/2⌋.

Each node ⌊n/2⌋ + 1, ⌊n/2⌋ + 2,..., n is a leaf and is thus the
root of a trivial max-heap.

18
Building a heap

Maintenance: Children of node i are indexed higher than i, so by


the loop invariant, they are both roots of max-heaps.

Correctly assuming that i + 1, i + 2,..., n are all roots


of max-heaps, MAX-HEAPIFY makes node i a max-heap
root.

Decrementing i re-establishes the loop invariant at each


iteration.

Termination: When i = 0, the loop terminates. By the loop


invariant, each node, notably node 1, is the root of a max-heap.

19
Building a heap

Analysis

Simple bound:

O(n) calls to MAX-HEAPIFY, each of which takes O(lg n)
time =⇒ O(n lg n).

This upper bound, though correct, is not asymptotically tight.

20
Building a heap

Tighter analysis:

Time to run MAX-HEAPIFY is linear in the height of the


node it’s run on, and most nodes have small heights.
The tighter analysis relies on the properties that an n-element
heap has height ⌊lg n⌋ and at most ⌈n/2^(h+1)⌉ nodes of any
height h.
The time required by MAX-HEAPIFY when called on a node
of height h is O(h), so the total cost of BUILD-MAX-HEAP is

Σ_{h=0}^{⌊lg n⌋} ⌈n/2^(h+1)⌉ · O(h) = O( n · Σ_{h=0}^{⌊lg n⌋} h/2^h )

21
Building a heap

Evaluate the last summation by substituting x = 1/2 into the
formula Σ_{k=0}^{∞} k·x^k = x/(1 − x)², which yields

Σ_{h=0}^{∞} h/2^h = (1/2)/(1 − 1/2)² = 2

Thus, the running time of BUILD-MAX-HEAP is O(n).


Hence, we can build a max-heap from an unordered array in
linear time.

22
The heapsort algorithm

The heapsort algorithm

Given an input array, the heapsort algorithm acts as follows:

Builds a max-heap from the array.


Starting with the root (the maximum element), the algorithm
places the maximum element into the correct place in the
array by swapping it with the element in the last position in
the array.
”Discard” this last node (knowing that it is in its correct
place) by decreasing the heap size, and calling
MAX-HEAPIFY on the new (possibly incorrectly-placed) root.
Repeat this ”discarding” process until only one node (the
smallest element) remains, and therefore is in the correct
place in the array.
23
The heapsort algorithm

Algorithm 3 HEAPSORT(A)
1: BUILD-MAX-HEAP(A)
2: for i = A.length downto 2 do
3: exchange A[1] with A[i]
4: A.heap − size = A.heap − size − 1
5: MAX-HEAPIFY(A, 1)
6: end for
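
A Python sketch of the same loop (0-based, reusing build_max_heap and max_heapify from the earlier sketches); e.g. heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]) sorts the list in place:

def heapsort(A):
    heap_size = build_max_heap(A)
    for i in range(len(A) - 1, 0, -1):
        A[0], A[i] = A[i], A[0]        # move the current maximum to its final slot
        heap_size -= 1
        max_heapify(A, 0, heap_size)   # restore the max-heap on the shrunken heap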

24
The heapsort algorithm

Figure: 2.4 The operation of HEAPSORT for input array A. 25


The heapsort algorithm

Analysis

The HEAPSORT procedure takes a total running time of:

BUILD-MAX-HEAP: O(n)
for loop: n - 1 times
exchange elements: O(1)
MAX-HEAPIFY: O(lg n)

Thus, the total time will be =⇒ O(n lg n).

Though heapsort is a great algorithm, a well-implemented


quicksort usually beats it in practice.

26
Priority queues

Priority queues

Heaps efficiently implement priority queues.

Max-priority queues implemented with max-heaps.

Min-priority queues are implemented with min-heaps similarly.

As an example, we can use max-priority queues to schedule


jobs on a shared computer.

A min-priority queue can be used in an event-driven simulator.

27
Priority queues

A priority queue:

Maintains a dynamic set S of elements.

Max-priority queue supports the following dynamic-set


operations:

INSERT(S, x): inserts element x into set S.

MAXIMUM(S): returns element of S with largest key.

EXTRACT-MAX(S): removes and returns element of S with


largest key.

INCREASE-KEY(S, x, k): increases value of element x’s key


to k. Assume k ≥ x’s current key value.

28
Priority queues

Finding the maximum element

Getting the maximum element is easy: it’s the root.

Algorithm 4 HEAP-MAXIMUM(A)
1: return A[1]

Time: Θ(1).

29
Priority queues

Extracting maximum element

Given the array A:

Make sure heap is not empty.

Make a copy of the maximum element (the root).

Make the last node in the tree the new root.

Re-heapify the heap, with one fewer node.

Return the copy of the maximum element.

30
Priority queues

Algorithm 5 HEAP-EXTRACT-MAX(A)
1: if A.heap − size < 1 then
2: error ”heap underflow”
3: end if
4: max = A[1]
5: A[1] = A[A.heap-size]
6: A.heap-size = A.heap-size - 1
7: MAX-HEAPIFY(A, 1)
8: return max

Time: O(lg n).

31
Priority queues

Increasing key value

Given set S, element x, and new key value k:

Make sure k ≥ x’s current key.

Update x’s key value to k.

Traverse the tree upward comparing x to its parent and


swapping keys if necessary, until x’s key is smaller than its
parent’s key.

32
Priority queues

Algorithm 6 HEAP-INCREASE-KEY(A, i, key)


1: if key < A[i] then
2: error ”new key is smaller than current key”
3: end if
4: A[i] = key
5: while i > 1 and A[PARENT (i)] < A[i] do
6: exchange A[i] with A[PARENT (i)]
7: i = PARENT (i)
8: end while

Time: O(lg n).

33
Priority queues

Figure: 2.5 The operation of HEAP-INCREASE-KEY.

34
Priority queues

Inserting into the heap

Given a key k to insert into the heap:

Insert a new node in the very last position in the tree with key
−∞.

Increase the −∞ key to k using the HEAP-INCREASE-KEY


procedure defined above.

35
Priority queues

Algorithm 7 MAX-HEAP-INSERT(A, key)


1: A.heap − size = A.heap − size + 1
2: A[A.heap − size] = −∞
3: HEAP-INCREASE-KEY(A, A.heap-size, key)

Analysis: Constant time assignments + time for


HEAP-INCREASE-KEY.

Time: O(lg n).
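
The four operations above fit together as in the following Python sketch (0-based, reusing max_heapify from the earlier sketch; the heap occupies the whole list, so the list itself plays the role of A[1..heap-size]):

def heap_maximum(A):
    return A[0]

def heap_extract_max(A):
    if len(A) < 1:
        raise IndexError("heap underflow")
    maximum = A[0]
    A[0] = A[-1]
    A.pop()
    max_heapify(A, 0, len(A))
    return maximum

def heap_increase_key(A, i, key):
    if key < A[i]:
        raise ValueError("new key is smaller than current key")
    A[i] = key
    while i > 0 and A[(i - 1) // 2] < A[i]:   # float the key up toward the root
        A[i], A[(i - 1) // 2] = A[(i - 1) // 2], A[i]
        i = (i - 1) // 2

def max_heap_insert(A, key):
    A.append(float("-inf"))                    # new leaf with key -infinity
    heap_increase_key(A, len(A) - 1, key)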

36
Quicksort

Quicksort

Worst-case running time: Θ(n²)

Expected running time: Θ(n lg n)

Constants hidden in Θ(n lg n) are small.

Sorts in place.

Despite the slow worst-case running time =⇒ remarkably


efficient on the average.

37
Quicksort

Description of quicksort

Quicksort is based on the three-step process of divide-and-conquer.


To sort the subarray A[p..r]:

Divide: Partition A[p..r] into two (possibly empty)

subarrays A[p..q-1] and A[q+1..r].
Each element of the first subarray A[p..q-1] is ≤ A[q], and A[q] is ≤ each
element in the second subarray A[q+1..r].

Conquer: Sort the two subarrays by recursive calls to


QUICKSORT.

Combine: No work is needed to combine the subarrays,


because they are sorted in place.

38
Quicksort

Algorithm 8 QUICKSORT(A, p, r)
1: if p < r then
2: q = PARTITION(A, p, r)
3: QUICKSORT(A, p, q-1)
4: QUICKSORT(A, q+1, r)
5: end if

To sort an entire array A, the initial call is QUICKSORT(A, 1,


A.length).

39
Quicksort

Partitioning

Partition subarray A[p..r ] by the following procedure:

Algorithm 9 PARTITION(A, p, r)
1: x = A[r]
2: i = p − 1
3: for j = p to r − 1 do
4: if A[j] ≤ x then
5: i = i + 1
6: exchange A[i] with A[j]
7: end if
8: end for
9: exchange A[i + 1] with A[r]
10: return i + 1
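
For concreteness, a Python sketch of PARTITION and QUICKSORT (0-based indices; illustrative, the pseudocode above is the reference):

def partition(A, p, r):
    # Lomuto partition: the pivot is A[r]; returns the pivot's final index.
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p=0, r=None):
    if r is None:
        r = len(A) - 1
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)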

40
Quicksort

PARTITION always selects the last element A[r ] in the


subarray A[p..r ] as the pivot - the element around which to
partition.

As the procedure executes, the array is partitioned into four


regions, some of which may be empty:

41
Quicksort

We state these properties as a loop invariant:

1 All entries in A[p..i] are ≤ pivot.

2 All entries in A[i + 1..j − 1] are > pivot.

3 A[r ] = pivot.

The indices between j and r − 1 are not covered by any of the


three cases, and the values in these entries have no particular
relationship to the pivot x.

42
Quicksort

Example: On an 8-element subarray.

Figure: 2.6 The operation of PARTITION on a sample array. 43


Quicksort

We use the loop invariant to prove correctness of PARTITION:

Initialization: Before the loop starts, all the conditions of the


loop invariant are satisfied, because r is the pivot and the
subarrays A[p..i] and A[i+1..j-1] are empty.

Maintenance: While the loop is running, if A[j] ≤ pivot,


increment i, then A[j] and A[i] are swapped and then j is
incremented. If A[j] > pivot, then increment only j.

Termination: When the loop terminates, j = r, so all


elements in A are partitioned into one of the three cases:
A[p..i] ≤ pivot, A[i+1..r-1] > pivot, and A[r] =
pivot.

Time for partitioning: Θ(n) to partition an n-element subarray.


44
Performance of quicksort

Performance of quicksort

The running time of quicksort depends on the partitioning of the


subarrays:

If the subarrays are balanced, then quicksort can run as fast as


merge sort.

If they are unbalanced, then quicksort can run as slowly as


insertion sort.

45
Performance of quicksort

Worst-case partitioning
Occurs when the subarrays are completely unbalanced.
Have 0 elements in one subarray and n - 1 elements in the
other subarray.
Thus we will get the recurrence

T (n) = T (n − 1) + T (0) + Θ(n)
= T (n − 1) + Θ(n)
= Θ(n²)

Same running time as insertion sort.


In fact, the worst-case running time occurs when quicksort
takes a sorted array as input, but insertion sort runs in O(n)
time in this case. 46
Performance of quicksort

Best-case partitioning

Occurs when the subarrays are completely balanced every


time.

Each subarray has ≤ n/2 elements.

Thus we will get the recurrence

T (n) = 2T (n/2) + Θ(n)


= Θ(n lg n)

47
Performance of quicksort

Balanced partitioning

Quicksort’s average running time is much closer to the best


case than to the worst case.

Imagine that PARTITION always produces a 9-to-1 split.

Thus we will get the recurrence

T (n) = T (9n/10) + T (n/10) + Θ(n)

Figure 2.7 shows the recursion tree for this recurrence.

48
Performance of quicksort

Figure: 2.7 A recursion tree for QUICKSORT in which PARTITION always


produces a 9-to-1 split.
49
Performance of quicksort

Every level of the tree has cost cn, until the recursion reaches
a boundary condition at depth log10 n = Θ(lg n).

Then the levels have cost at most cn.

The recursion terminates at depth log10/9 n = Θ(lg n).

The total cost of quick-sort is therefore O(n lg n).

50
Performance of quicksort

Intuition for the average case


Splits in the recursion tree will not always be constant.
There will usually be a mix of good and bad splits throughout
the recursion tree.
To see that this doesn’t affect the asymptotic running time of
quicksort, assume that levels alternate between best-case and
worst-case splits.

Figure: 2.8 Running time of two levels of a recursion tree for quicksort
(”worst partition” and ”best partition”) vs. a single level of a recursion
tree that is very well balanced. 51
Performance of quicksort

The combination of the bad split followed by the good split


produces three sub-arrays of sizes 0, (n-1)/2-1 and (n-1)/2
at a combined partitioning cost of Θ(n) + Θ(n-1) = Θ(n).

Thus, the extra level in the left-hand figure only adds to the
constant hidden in the Θ - notation.

Both figures result in O(n lg n) time, though the constant for


the figure on the left is higher than that of the figure on the
right.

52
Performance of quicksort

Randomized version of quicksort

In exploring the average - case behavior of quicksort, we have


made an assumption that all permutations of the input numbers
are equally likely. This is not always true.

To correct this, we add randomization to quicksort. We could


randomly permute the input array.

Instead, we use random sampling, or picking one element at


random.

Don’t always use A[r] as the pivot. Instead, randomly pick


an element from the subarray that is being sorted.

53
Performance of quicksort

Algorithm 10 RANDOMIZED-PARTITION(A, p, r)
1: i = RANDOM(p, r)
2: exchange A[r ] with A[i]
3: return PARTITION(A, p, r)

Randomly selecting the pivot element will, on average, cause the


split of the input array to be reasonably well balanced.

54
Performance of quicksort

Algorithm 11 RANDOMIZED-QUICKSORT(A, p, r)
1: if p < r then
2: q = RANDOMIZED-PARTITION(A, p, r)
3: RANDOMIZED-QUICKSORT(A, p, q-1)
4: RANDOMIZED-QUICKSORT(A, q+1, r)
5: end if
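
The randomized variants differ only in how the pivot is chosen; a Python sketch (reusing partition from the earlier quicksort sketch):

import random

def randomized_partition(A, p, r):
    i = random.randint(p, r)        # pick a pivot position uniformly at random
    A[i], A[r] = A[r], A[i]         # move it to the end, then partition as before
    return partition(A, p, r)

def randomized_quicksort(A, p=0, r=None):
    if r is None:
        r = len(A) - 1
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)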

For example, an already-sorted array causes worst-case


behavior in non-randomized QUICKSORT, but not in
RANDOMIZED-QUICKSORT.

55
Analysis of quicksort

Analysis of quicksort

We will analyze

the worst-case running time of QUICKSORT and


RANDOMIZED-QUICKSORT (the same), and

the expected (average-case) running time of


RANDOMIZED-QUICKSORT.

56
Analysis of quicksort

Worst-case analysis

We will prove that a worst-case split at every level produces a


worst-case running time of O(n²).

Recurrence for the worst-case running time of QUICKSORT:

T (n) = max_{0≤q≤n−1} [T (q) + T (n − q − 1)] + Θ(n).

Because PARTITION produces two subproblems, totaling size


n - 1, q ranges from 0 to n - 1.

Guess: T(n) ≤ cn², for some constant c.

57
Analysis of quicksort

Substituting our guess into the above recurrence:

T (n) ≤ max_{0≤q≤n−1} (cq² + c(n − q − 1)²) + Θ(n)
= c · max_{0≤q≤n−1} (q² + (n − q − 1)²) + Θ(n)

The maximum value of q² + (n − q − 1)² occurs when q is either
0 or n − 1. This means that

q² + (n − q − 1)² ≤ (n − 1)²
= n² − 2n + 1

58
Analysis of quicksort

Therefore,

T (n) ≤ cn² − c(2n − 1) + Θ(n)
≤ cn² if c(2n − 1) ≥ Θ(n)

Pick c so that the c(2n − 1) term dominates Θ(n).

Therefore, the worst-case running time of quicksort is O(n²).

59
Analysis of quicksort

Expected running time

The running time of QUICKSORT is dominated by the time spent


in the PARTITION procedure.

PARTITION removes the pivot element from future


consideration each time.

Thus, PARTITION is called at most n times.

QUICKSORT recurses on the partitions.

60
Analysis of quicksort

The amount of work that each call to PARTITION does is a


constant plus the number of comparisons that are performed
in its for loop.

Let X = the total number of comparisons performed in all


calls to PARTITION.

Therefore, the total work done over the entire execution is


O(n + X).

61
Analysis of quicksort

Compute a bound on the overall number of comparisons.

For ease of analysis:

Rename the elements of A as z1 , z2 , ..., zn , with zi being the


i th smallest element.

Define the set Zij = {zi , zi+1 , ..., zj } to be the set of elements
between zi and zj , inclusive.

Each pair of elements is compared at most once, because elements


are compared only to the pivot element, and then the pivot
element is never later called to PARTITION.

62
Analysis of quicksort

Let Xij = I {zi is compared to zj }.


Considering whether zi is compared to zj at any time during
the entire quicksort algorithm, not just during one call of
PARTITION.

Since each pair is compared at most once, the total number of


comparisons performed by the algorithm is
X = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xij

63
Analysis of quicksort

Take expectations of both sides, and by linearity of expectation:

 
E[X] = E[ Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xij ]
= Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[Xij]
= Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Pr{zi is compared to zj}.

64
Analysis of quicksort

Now all we have to do is find the probability that two elements are
compared.

Think about when two elements are not compared.

For example, numbers in separate partitions will not be


compared.
In the previous example, 8, 1, 6, 4, 0, 3, 9, 5 and the
pivot is 5, so that none of the set {1, 4, 0, 3} will ever be
compared to any of the set {8, 6, 9}.

65
Analysis of quicksort

Once a pivot x is chosen such that zi < x < zj , then zi and zj


will never be compared at any later time.

If either zi or zj is chosen before any other element of Zij , then


it will be compared to all the elements of Zij , except itself.

The probability that zi is compared to zj is the probability


that either zi or zj is the first element chosen.

There are j − i + 1 elements, and pivots are chosen randomly


and independently. Thus, the probability that any particular
one of them is the first one chosen is 1/(j − i + 1).

66
Analysis of quicksort

Therefore,

Pr{zi is compared to zj} = Pr{zi or zj is the first pivot chosen from Zij}
= Pr{zi is the first pivot chosen from Zij}
+ Pr{zj is the first pivot chosen from Zij}
= 1/(j − i + 1) + 1/(j − i + 1)
= 2/(j − i + 1)

Substituting into the equation for E [X ]:


E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1)

67
Analysis of quicksort

Evaluate by using a change in variables (k = j − i) and the bound


on the harmonic series:
E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1)
= Σ_{i=1}^{n−1} Σ_{k=1}^{n−i} 2/(k + 1)
< Σ_{i=1}^{n−1} Σ_{k=1}^{n} 2/k
= Σ_{i=1}^{n−1} O(lg n)
= O(n lg n)

So the expected running time of quicksort, using


RANDOMIZED-PARTITION, is O(n lg n). 68
Lower bounds for sorting

Sorting in Linear Time

All sorts seen so far are comparison sorts: insertion sort,


merge sort, quicksort, heapsort.
The only operation that may be used to gain order information
about a sequence is comparison of pairs of elements.

Lower bounds for sorting


In a comparison sort, we use only comparisons between elements to
gain order information about an input sequence ha1 , a2 , ..., an i.

Ω(n) to examine all the input.


All sorts seen so far are Ω(n lgn).

We can view comparison sorts abstractly in terms of decision trees.


69
Lower bounds for sorting

Decision tree

A decision tree is a full binary tree that represents comparisons


made by:

a specific sorting algorithm

on inputs of a given size.

We’re counting only comparisons.

Abstracts away everything else: control and data movement.

70
Lower bounds for sorting

For insertion sort on 3 elements:

Figure: 2.9 The decision tree for insertion sort operating on three
elements.

71
Lower bounds for sorting

Because any correct sorting algorithm must be able to produce
each permutation of its input,

each of the n! permutations must appear as one of the leaves of the
decision tree.

Furthermore, each of these leaves must be reachable from the root


by a downward path corresponding to an actual execution of the
comparison sort.

72
Lower bounds for sorting

A lower bound for the worst case

The length of the longest simple path from the root of a


decision tree to any of its reachable leaves represents the
worst-case.

A lower bound on the heights of all decision trees is therefore


a lower bound on the running time of any comparison sort
algorithm.

73
Lower bounds for sorting

Theorem:
Any comparison sort algorithm requires Ω(n lg n) comparisons in
the worst case.
Proof: Consider a decision tree of height h with l reachable leaves
corresponding to a comparison sort on n elements.
Any binary tree of height h has ≤ 2^h leaves.
Because each of the n! permutations of the input appears as
some leaf, we have n! ≤ l.
Thus,
n! ≤ l ≤ 2^h
which, by taking logarithms, implies
h ≥ lg(n!) (then, by using Stirling's approximation n! > (n/e)^n)
= Ω(n lg n)
Heapsort and merge sort are asymptotically optimal comparison
sorts. 74
Counting sort

Counting sort

Counting sort assumes that each of the n input elements is an


integer in the range 0 to k, for some integer k.

When k = O(n), the sort runs in Θ(n) time.

Counting sort determines, for each input element x, the


number of elements less than x.

It uses this information to place element x directly into its


position in the output array.

75
Counting sort

Input: A[1..n], where A[j] ∈ {0, 1,..., k} for j = 1,


2,..., n. Array A and values n and k are given as
parameters.

Output: B[1..n], sorted. B is assumed to be already


allocated and is given as a parameter.

Auxiliary storage: C[0..k] provides temporary working


storage.

76
Counting sort

Algorithm 12 COUNTING-SORT(A, B, k)
1: let C [0 . . . k] be a new array
2: for i = 0 to k do
3: C [i] = 0
4: end for
5: for j = 1 to A.length do
6: C [A[j]] = C [A[j]] + 1
7: ▷ C [i] now contains the number of elements equal to i.
8: end for
9: for i = 1 to k do
10: C [i] = C [i] + C [i − 1]
11: ▷ C [i] now contains the number of elements less than or equal to i.
12: end for
13: for j = A.length downto 1 do
14: B[C [A[j]]] = A[j]
15: C [A[j]] = C [A[j]] − 1
16: end for 77
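
A Python sketch of the same procedure (0-based output positions; assumes the input values lie in 0..k):

def counting_sort(A, k):
    C = [0] * (k + 1)
    for a in A:                      # C[i] = number of elements equal to i
        C[a] += 1
    for i in range(1, k + 1):        # C[i] = number of elements <= i
        C[i] += C[i - 1]
    B = [0] * len(A)
    for a in reversed(A):            # walking backwards keeps the sort stable
        B[C[a] - 1] = a
        C[a] -= 1
    return B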
Counting sort

Figure: 2.10 The operation of COUNTING-SORT on an input array


A[1..8], where each element of A is a nonnegative integer no larger
than k = 5.

78
Counting sort

Analysis of counting sorts running time

The for loop of lines 2-4 takes time Θ(k).

The for loop of lines 5-8 takes time Θ(n).

The for loop of lines 9-12 takes time Θ(k).

The for loop of lines 13-16 takes time Θ(n).

Thus, the overall time is Θ(n + k), which is Θ(n) if k = O(n).

An important property of counting sort is that it is stable.

Normally, the property of stability is important only when


satellite data are carried around with the element being sorted.

79
Radix sort

Radix sort

Key idea: Sort least significant digits first.

Radix sort is the algorithm used by card-sorting machines; it is the
algorithm that extends single-column sorting to multi-column keys.

Works on one digit (column) at a time.

80
Radix sort

Algorithm 13 RADIX-SORT(A, d)
1: for i = 1 to d do
2: use a stable sort to sort array A on digit i
3: end for
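
A Python sketch that uses a stable counting sort on each digit, least significant first; the base parameter is my own generalization, the slides assume decimal digits:

def radix_sort(A, d, base=10):
    for exp in range(d):                          # digit 1 (least significant) .. digit d
        C = [0] * base
        for a in A:
            C[(a // base**exp) % base] += 1
        for i in range(1, base):
            C[i] += C[i - 1]
        B = [0] * len(A)
        for a in reversed(A):                     # backwards pass keeps stability
            digit = (a // base**exp) % base
            B[C[digit] - 1] = a
            C[digit] -= 1
        A[:] = B
    return A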

Figure: 2.11 The operation of radix sort on a list of seven 3-digit numbers.

81
Radix sort

Correctness:

Assuming digits 1, 2,..., i - 1 are sorted.


Stable sort on digit i leaves digits 1,..., i sorted:
If 2 digits in position i are different, ordering by position i is
correct, and positions 1,..., i - 1 are irrelevant.
If 2 digits in position i are equal, numbers are already in the
right order (by inductive hypothesis). The stable sort on digit
i leaves them in the right order.

This argument shows why it’s so important to use a stable


sort for intermediate sort.

82
Radix sort

Analysis of Radix Sort running time

Assume that we use counting sort as the intermediate sort.

Θ(n + k) per pass (digits in range 0, . . . , k)

d passes

Thus, total running time is Θ(d(n + k)).

If k = O(n), then total running time Θ(dn).

83
Bucket sort

Bucket sort

Bucket sort assumes that the input is drawn from a uniform


distribution and has an average-case running time of O(n).

Like counting sort, bucket sort is fast because it assumes


something about the input.

Bucket sort assumes that the input is generated by a random


process that distributes elements uniformly and independently
over the interval [0, 1).

84
Bucket sort

Idea of bucket sort:

Divide [0, 1) into n equal-sized buckets.

Distribute the n input values into the buckets.

Sort each bucket.

Then go through buckets in order, listing elements in each


one.

Input: A[1..n], where 0 ≤ A[i] < 1 for all i.

Auxiliary array: B[0..n - 1] of linked lists, each list initially


empty.

85
Bucket sort

Algorithm 14 BUCKET-SORT(A, n)
1: let B[0 . . . n − 1] be a new array
2: n = A.length
3: for i = 0 to n − 1 do
4: make B[i] an empty list
5: end for
6: for i = 1 to n do
7: insert A[i] into list B[⌊n · A[i]⌋]
8: end for
9: for i = 0 to n − 1 do
10: sort list B[i] with insertion sort
11: end for
12: concatenate the lists B[0], B[1],. . . , B[n-1] together in order.
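
A Python sketch of the same idea (assumes 0 ≤ A[i] < 1; the built-in sorted() stands in for the per-bucket insertion sort of the pseudocode):

def bucket_sort(A):
    n = len(A)
    B = [[] for _ in range(n)]
    for x in A:
        B[int(n * x)].append(x)      # bucket index = floor(n * x)
    out = []
    for bucket in B:
        out.extend(sorted(bucket))   # sort each bucket, then concatenate in order
    return out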

86
Bucket sort

Figure: 2.12 The operation of BUCKET-SORT for n = 10.

87
Bucket sort

Correctness: Consider A[i], A[j].

Assume without loss of generality that A[i] ≤ A[j]. Then


⌊n · A[i]⌋ ≤ ⌊n · A[j]⌋.

So A[i] is placed into the same bucket as A[j] or into a


bucket with a lower index.
If same bucket, insertion sort fixes up.
If earlier bucket, concatenation of lists fixes up.

88
Bucket sort

Running time analysis of bucket sort

Relies on no bucket getting too many values.

All lines of algorithm except insertion sorting take Θ(n)


altogether.

Intuitively, if each bucket gets a constant number of elements,


it takes O(1) time to sort each bucket ⇒ O(n) sort time for
all buckets.

89
Bucket sort

To analyze the cost of the calls to insertion sort, define a random


variable:

ni = the number of elements placed in bucket B[i].

Because insertion sort runs in quadratic time, bucket sort running


time is
T (n) = Θ(n) + Σ_{i=0}^{n−1} O(ni²).

90
Bucket sort

Take expectations of both sides:


E[T (n)] = E[ Θ(n) + Σ_{i=0}^{n−1} O(ni²) ]
= Θ(n) + Σ_{i=0}^{n−1} E[O(ni²)]
= Θ(n) + Σ_{i=0}^{n−1} O(E[ni²])

Using indicator random variables, one can show that

E[ni²] = 2 − (1/n) for i = 0, 1, ..., n − 1
(check the proof in the textbook)

91
Bucket sort

Therefore:
E[T (n)] = Θ(n) + Σ_{i=0}^{n−1} O(2 − 1/n)
= Θ(n) + O(n)
= Θ(n)

This is a probabilistic analysis - we used probability to analyze an


algorithm whose running time depends on the distribution of
inputs.

92
Introduction

Medians and Order

The ith order statistic is the ith smallest element of a set of n


elements.

The minimum is the first order statistic (i = 1).

The maximum is the nth order statistic (i = n).

A median is the ”halfway point” of the set.

93
Introduction

When n is odd, the median is unique, at i = (n + 1)/2.

When n is even, there are two medians:

The lower median, at i = n/2, and


The upper median, at i = n/2 + 1.
We mean lower median when we use the phrase ”the median”.

94
Introduction

The selection problem:

Input: A set A of n distinct numbers and a number i, with


1 ≤ i ≤ n.
Output: The element x ∈ A that is larger than exactly i - 1
other elements in A. In other words, the ith smallest element
of A.

The selection problem can be solved in O(n lg n) time.

Sort the numbers using an O(n lg n)-time algorithm, such as
heapsort or merge sort.
Then return the ith element in the sorted array.

95
Minimum and Maximum

Minimum and Maximum

We can easily obtain an upper bound of n - 1 comparisons for


finding the minimum of a set of n elements.

Examine each element in turn and keep track of the smallest


one.

This is the best we can do, because each element, except the
minimum, must be compared to a smaller element at least
once.

96
Minimum and Maximum

The following pseudocode finds the minimum element in array


A[1..n].

Algorithm 15 MINIMUM(A)
1: min = A[1]
2: for i = 2 to A.length do
3: if min > A[i] then
4: min = A[i]
5: end if
6: end for
7: return min

The maximum can be found in exactly the same way by replacing


the > with < in the above algorithm.
97
Minimum and Maximum

Simultaneous minimum and maximum

Some applications need both the minimum and maximum of a set


of elements.
A simple algorithm to find the minimum and maximum is to find
each one independently.

There will be n - 1 comparisons for the minimum and n - 1


comparisons for the maximum, for a total of 2n - 2
comparisons.
This will result in Θ(n) time.

98
Minimum and Maximum

In fact, with 3⌊n/2⌋ comparisons we can find both the minimum


and maximum:

Maintain the minimum and maximum of elements seen so far.


Don’t compare each element to the minimum and maximum
separately.
Process elements in pairs.
Compare the elements of a pair to each other.
Then compare the larger element to the maximum so far, and
compare the smaller element to the minimum so far.

This leads to only 3 comparisons for every 2 elements.

99
Minimum and Maximum

Setting up the initial values for the min and max depends on
whether n is odd or even.

If n is even, compare the first two elements and assign the


larger to max and the smaller to min. Then process the
rest of the elements in pairs.

If n is odd, set both min and max to the first element. Then
process the rest of the elements in pairs.
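
Putting the pairing idea together, a Python sketch (my own illustration; assumes a non-empty list, roughly 3⌊n/2⌋ comparisons):

def min_and_max(A):
    n = len(A)
    if n % 2 == 0:
        lo, hi = (A[0], A[1]) if A[0] < A[1] else (A[1], A[0])
        start = 2
    else:
        lo = hi = A[0]
        start = 1
    for i in range(start, n - 1, 2):
        small, big = (A[i], A[i + 1]) if A[i] < A[i + 1] else (A[i + 1], A[i])
        if small < lo:
            lo = small
        if big > hi:
            hi = big
    return lo, hi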

100
Minimum and Maximum

Analysis of the total number of comparisons

If n is even, we do 1 initial comparison and then 3(n − 2)/2
more comparisons.

# of comparisons = 3(n − 2)/2 + 1
= (3n − 6)/2 + 1
= 3n/2 − 2

If n is odd, we do 3(n − 1)/2 = 3⌊n/2⌋ comparisons.

In either case, the maximum number of comparisons is ≤ 3⌊n/2⌋.

101
Selection

Selection in expected linear time

Selection of the ith smallest element of the array A can be


done in Θ(n) time.

The function RANDOMIZED-SELECT uses


RANDOMIZED-PARTITION from the quicksort algorithm:

RANDOMIZED-SELECT differs from quicksort because it


recurses on one side of the partition only, which gives it a
reduced running time of Θ(n).

102
Selection

The following code for RANDOMIZED-SELECT returns the ith


smallest element of the array A[p..r].

Algorithm 16 RANDOMIZED-SELECT(A, p, r, i)
1: if p == r then
2: return A[p]
3: end if
4: q = RANDOMIZED-PARTITION(A, p, r)
5: k = q - p + 1
6: if i == k then . the pivot value is the answer
7: return A[q]
8: else if i < k then
9: return RANDOMIZED-SELECT(A, p, q-1, i)
10: else
11: return RANDOMIZED-SELECT(A, q+1, r, i-k)
12: end if
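
A Python sketch of the same procedure (0-based array indices but a 1-based rank i, reusing randomized_partition from the quicksort sketch):

def randomized_select(A, p, r, i):
    if p == r:
        return A[p]
    q = randomized_partition(A, p, r)
    k = q - p + 1                      # rank of the pivot within A[p..r]
    if i == k:                         # the pivot value is the answer
        return A[q]
    elif i < k:
        return randomized_select(A, p, q - 1, i)
    else:
        return randomized_select(A, q + 1, r, i - k)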
103
Selection

After the call to RANDOMIZED-PARTITION, the array is partitioned


into two subarrays A[p..q − 1] and A[q + 1..r ], along with a pivot
element A[q].

The elements of subarray A[p..q − 1] are all ≤ A[q].

The elements of subarray A[q + 1..r ] are all > A[q].

The pivot element is the k th element of the subarray A[p..r ],


where k = q − p + 1.

If the pivot element is the i th smallest element (i.e., i = k),


return A[q].

104
Selection

Otherwise, recurse on the subarray containing the i th smallest


element.

If i < k, this subarray is A[p..q − 1], and we want the i th


smallest element.

If i > k, this subarray is A[q + 1..r ] and, since there are k


elements in A[p..r ] that precede A[q + 1..r ], we want the
(i − k)th smallest element of this subarray.

105
Selection

Running time analysis

Worst-case running time: Θ(n²), because we could be


extremely unlucky and always recurse on a subarray that is
only 1 element smaller than the previous subarray.

Expected running time: RANDOMIZED-SELECT works well on


average. Because it is randomized, no particular input brings
out the worst-case behavior consistently. As a result of that
this algorithm runs in linear time ⇒ Θ(n).

106
Selection

End of Chapter 2

Questions?

107
