
Definitions

Height of a node = the number of edges on the longest simple path from the node down to a leaf
Level of a node = the length of the path from the root to the node
Height of tree = height of root node

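[Figure: binary tree with root 4; second level 1, 3; third level 2, 16, 9, 10; node 2 has children 14 and 8. Height of root = 3, height of node (2) = 1, level of node (10) = 2.]
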
Useful Properties

height

height

(see Ex 6.1-2, page 129)

The Heap Data Structure
A heap is a nearly complete binary tree with
the following two properties:
Structural property: all levels are full, except
possibly the last one, which is filled from left to right
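Order (heap) property: for any node x, Parent(x) ≥ x

[Figure: max-heap with root 8; second level 7, 4; node 7 has children 5 and 2.]
From the heap property, it follows that the root holds the largest element of the heap.
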
A heap is a binary tree that is filled in order
Array Representation of Heaps
A heap can be stored as an array A.
Root of tree is A[1]
Left child of A[i] = A[2i]
Right child of A[i] = A[2i + 1]
Parent of A[i] = A[⌊i/2⌋]
Heapsize[A] ≤ length[A]
The elements in the subarray A[(⌊n/2⌋ + 1) .. n] are leaves

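The index arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming the 1-based indexing of the slides (slot 0 of the Python list is left unused); the helper names `parent`, `left`, `right`, and `is_leaf` are chosen here for illustration and do not come from the slides.

```python
def parent(i: int) -> int:
    """Index of the parent of node i (1-based indexing)."""
    return i // 2


def left(i: int) -> int:
    """Index of the left child of node i."""
    return 2 * i


def right(i: int) -> int:
    """Index of the right child of node i."""
    return 2 * i + 1


def is_leaf(i: int, n: int) -> bool:
    """In a heap of n elements, nodes floor(n/2)+1 .. n are leaves."""
    return i > n // 2


# Slot 0 is a placeholder so that the root sits at index 1.
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
n = len(A) - 1
```

With this array, `A[left(2)]` and `A[right(2)]` are the children of the node holding 14, and every index above `n // 2` refers to a leaf.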
Heap Types
Max-heaps (largest element at root) have the max-heap property:
for all nodes i, excluding the root: A[PARENT(i)] ≥ A[i]

Min-heaps (smallest element at root) have the min-heap property:
for all nodes i, excluding the root: A[PARENT(i)] ≤ A[i]

Adding/Deleting Nodes
New nodes are always inserted at the bottom level (left to right)
Nodes are removed from the bottom level (right to left)

Operations on Heaps
Create a max-heap from an unordered array
BUILD-MAX-HEAP
Maintain/Restore the max-heap property
MAX-HEAPIFY
Sort an array in place
HEAPSORT
Priority queues

Maintaining the Heap Property
Suppose a node i is smaller than one of its children
Left and Right subtrees of i are max-heaps
To eliminate the violation:
Exchange i with its larger child
Move down the tree
Continue until the node is not smaller than its children

Example
MAX-HEAPIFY(A, 2, 10)

A[2] violates the heap property: exchange A[2] ↔ A[4]
A[4] violates the heap property: exchange A[4] ↔ A[9]
Heap property restored

Maintaining the Heap Property
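Assumptions:
Left and Right subtrees of i are max-heaps
A[i] may be smaller than its children

MAX-HEAPIFY(A, i, n)
    l ← LEFT(i)
    r ← RIGHT(i)
    if l ≤ n and A[l] > A[i]
        then largest ← l
        else largest ← i
    if r ≤ n and A[r] > A[largest]
        then largest ← r
    if largest ≠ i
        then exchange A[i] ↔ A[largest]
             MAX-HEAPIFY(A, largest, n)
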
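The sift-down procedure can be sketched in Python as follows. This is a hedged illustration rather than the slides' own code: it assumes 0-based list indexing (so the children of index i are 2i + 1 and 2i + 2), and the input array is chosen here for illustration.

```python
def max_heapify(a: list, i: int, n: int) -> None:
    """Restore the max-heap property at index i, assuming both subtrees
    of i are already max-heaps. Only the first n entries of a are used."""
    left, right = 2 * i + 1, 2 * i + 2   # 0-based child indices (assumption)
    largest = i
    if left < n and a[left] > a[largest]:
        largest = left
    if right < n and a[right] > a[largest]:
        largest = right
    if largest != i:
        # Exchange with the larger child and continue down the tree.
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)


# Illustrative input: only the node at 0-based index 1 violates the property.
a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(a, 1, len(a))
```

In 1-based terms this call corresponds to MAX-HEAPIFY(A, 2, 10): the value 4 is exchanged first with 14 and then with 8.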
MAX-HEAPIFY Running Time
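Intuitively:
- MAX-HEAPIFY follows one downward path of length at most h
- At each level it makes at most 2 comparisons
- Total number of comparisons is at most 2h
- Running time is O(h)

Running time of MAX-HEAPIFY is O(lg n)
Can be written in terms of the height of the heap, as being O(h)
Since the height of the heap is ⌊lg n⌋
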
Building a Heap
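Convert an array A[1 .. n] into a max-heap (n = length[A])
The elements in the subarray A[(⌊n/2⌋ + 1) .. n] are leaves
Apply MAX-HEAPIFY on elements between 1 and ⌊n/2⌋

BUILD-MAX-HEAP(A)
    n = length[A]
    for i ← ⌊n/2⌋ downto 1
        do MAX-HEAPIFY(A, i, n)

A: 4 1 3 2 16 9 10 14 8 7
[Figure: A drawn as a tree with nodes numbered 1 to 10: root 4; second level 1, 3; third level 2, 16, 9, 10; bottom level 14, 8, 7.]
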
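A minimal Python sketch of the bottom-up construction, assuming 0-based indexing (so the last internal node is at index n // 2 - 1) and an iterative sift-down in place of the recursive MAX-HEAPIFY; the function names are illustrative.

```python
def max_heapify(a: list, i: int, n: int) -> None:
    """Iterative sift-down of a[i] within the first n entries (0-based)."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a: list) -> None:
    """Turn an unordered list into a max-heap in place."""
    n = len(a)
    # Leaves are already heaps, so start at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)


a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(a)
```

Running this on the array A from the slide yields the heap 16 14 10 8 7 9 3 2 4 1.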
Example: A = 4 1 3 2 16 9 10 14 8 7
[Figure: the tree before each call MAX-HEAPIFY(A, i, n) for i = 5, 4, 3, 2, 1, followed by the final max-heap. The corresponding array contents are listed below.]
i = 5: 4 1 3 2 16 9 10 14 8 7
i = 4: 4 1 3 2 16 9 10 14 8 7
i = 3: 4 1 3 14 16 9 10 2 8 7
i = 2: 4 1 10 14 16 9 3 2 8 7
i = 1: 4 16 10 14 7 9 3 2 8 1
Result: 16 14 10 8 7 9 3 2 4 1

Running Time of BUILD MAX HEAP

BUILD-MAX-HEAP(A)
    n = length[A]
    for i ← ⌊n/2⌋ downto 1          O(n) iterations
        do MAX-HEAPIFY(A, i, n)     O(lg n) each

Running time: O(n lg n)
This is not an asymptotically tight upper bound

Running Time of BUILD MAX HEAP
HEAPIFY takes O(h), so the cost of HEAPIFY on a node i is proportional to the height of the node i in the tree

Height              Level    No. of nodes
h_0 = 3 (⌊lg n⌋)    i = 0    2^0
h_1 = 2             i = 1    2^1
h_2 = 1             i = 2    2^2
h_3 = 0             i = 3    2^3

h_i = h − i: height of the heap rooted at level i
n_i = 2^i: number of nodes at level i

Running Time of BUILD MAX HEAP
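$T(n) = \sum_{i=0}^{h} n_i h_i$   (cost of HEAPIFY at level i × number of nodes at that level)

$= \sum_{i=0}^{h} 2^i (h - i)$   (replace the values of n_i and h_i computed before)

$= \sum_{i=0}^{h} \frac{h - i}{2^{h-i}} \, 2^h$   (multiply by 2^h both the numerator and the denominator and write 2^i as 1/2^{-i})

$= 2^h \sum_{k=0}^{h} \frac{k}{2^k}$   (change variables: k = h − i)

$\le n \sum_{k=0}^{\infty} \frac{k}{2^k}$   (the sum above is smaller than the sum of all elements to ∞, and h = lg n)

$= O(n)$   (the infinite sum is at most 2)

Running time of BUILD-MAX-HEAP: T(n) = O(n)
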
Heapsort
Goal:
Sort an array using heap representations

Idea:
Build a max-heap from the array
Swap the root (the maximum element) with the last element in the array
"Discard" this last node by decreasing the heap size
Call MAX-HEAPIFY on the new root
Repeat this process until only one node remains

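Example: A = [7, 4, 3, 1, 2]
[Figure: the heap after each exchange of the root with the last heap element. The array contents after each call are listed below, with the sorted suffix to the right of the bar.]
MAX-HEAPIFY(A, 1, 4): 4 2 3 1 | 7
MAX-HEAPIFY(A, 1, 3): 3 2 1 | 4 7
MAX-HEAPIFY(A, 1, 2): 2 1 | 3 4 7
MAX-HEAPIFY(A, 1, 1): 1 | 2 3 4 7
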
HEAPSORT(A)
    BUILD-MAX-HEAP(A)                   O(n)
    for i ← length[A] downto 2
        do exchange A[1] ↔ A[i]         n − 1 times
           MAX-HEAPIFY(A, 1, i − 1)     O(lg n)

Running time: O(n lg n)

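The whole procedure fits in a short Python sketch. It assumes 0-based indexing and an iterative sift-down helper (named `sift_down` here for illustration) in place of the recursive MAX-HEAPIFY; it is one possible rendering, not the slides' reference code.

```python
def sift_down(a: list, i: int, n: int) -> None:
    """Sift a[i] down within the heap formed by the first n entries."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a: list) -> None:
    """Sort a in place in ascending order."""
    n = len(a)
    # Build a max-heap from the unordered list.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Move the current maximum behind the shrinking heap, then repair the root.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)


a = [7, 4, 3, 1, 2]
heapsort(a)
```

The second loop runs n − 1 times and each repair costs O(lg n), which matches the O(n lg n) bound stated above.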
Priority Queues

Operations on Priority Queues
Max-priority queues support the following operations:
INSERT(S, x): inserts element x into set S
EXTRACT-MAX(S): removes and returns element of S with largest key
MAXIMUM(S): returns element of S with largest key
INCREASE-KEY(S, x, k): increases value of element x's key to k
(Assume k ≥ x's current key value)

HEAP-MAXIMUM
Goal:
Return the largest element of the heap

Running time: O(1)
HEAP-MAXIMUM(A)
    return A[1]

[Figure: heap A with root 7.]
Heap-Maximum(A) returns 7

HEAP-EXTRACT-MAX
Goal:
Extract the largest element of the heap (i.e., return the max value and also remove that element from the heap)
Idea:
Exchange the root element with the last
Decrease the size of the heap by 1 element
Call MAX-HEAPIFY on the new root, on a heap of size n−1

Heap A: Root is the largest element

Example: HEAP-EXTRACT-MAX
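[Figure: three trees showing the steps; the corresponding array contents are listed below.]
max = A[1] = 16: 16 14 10 8 7 9 3 2 4 1
Move the last element to the root; heap size decreased by 1: 1 14 10 8 7 9 3 2 4
Call MAX-HEAPIFY(A, 1, n−1): 14 8 10 4 7 9 3 2 1
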
HEAP-EXTRACT-MAX
HEAP-EXTRACT-MAX(A, n)
    if n < 1
        then error "heap underflow"
    max ← A[1]
    A[1] ← A[n]
    MAX-HEAPIFY(A, 1, n − 1)    ▹ remakes heap
    return max

Running time: O(lg n)

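A hedged Python sketch of the extraction step, assuming a 0-based list whose length is the heap size (so shrinking the heap is done with `list.pop`); the helper name `sift_down` is illustrative.

```python
def sift_down(a: list, i: int, n: int) -> None:
    """Sift a[i] down within the heap formed by the first n entries."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_extract_max(a: list):
    """Remove and return the largest element of the max-heap a."""
    if not a:
        raise IndexError("heap underflow")
    maximum = a[0]
    last = a.pop()            # heap size decreases by 1
    if a:
        a[0] = last           # the last element becomes the new root
        sift_down(a, 0, len(a))
    return maximum


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
top = heap_extract_max(heap)
```

After the call, `top` holds the old root and `heap` is again a max-heap with one element fewer.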
HEAP-INCREASE-KEY
Goal:
Increases the key of an element i in the heap
Idea:
Increment the key of A[i] to its new value
If the max-heap property does not hold anymore: traverse a path toward the root to find the proper place for the newly increased key

[Figure: max-heap 16 14 10 8 7 9 3 2 4 1 drawn as a tree.]

Example: HEAP-INCREASE-KEY
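[Figure: four trees showing the steps; the corresponding array contents are listed below.]
Initial heap: 16 14 10 8 7 9 3 2 4 1
Key[i] ← 15 for i = 9: 16 14 10 8 7 9 3 2 15 1
Exchange with parent A[4]: 16 14 10 15 7 9 3 2 8 1
Exchange with parent A[2]: 16 15 10 14 7 9 3 2 8 1
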
HEAP-INCREASE-KEY
HEAP-INCREASE-KEY(A, i, key)
    if key < A[i]
        then error "new key is smaller than current key"
    A[i] ← key
    while i > 1 and A[PARENT(i)] < A[i]
        do exchange A[i] ↔ A[PARENT(i)]
           i ← PARENT(i)

Running time: O(lg n)

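The upward traversal can be sketched in Python as below, assuming 0-based indexing (the parent of index i is (i − 1) // 2) and a `ValueError` standing in for the pseudocode's error.

```python
def heap_increase_key(a: list, i: int, key) -> None:
    """Raise a[i] to key and restore the max-heap property in place."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    # Walk toward the root while the parent is smaller than the new key.
    while i > 0 and a[(i - 1) // 2] < a[i]:
        p = (i - 1) // 2
        a[i], a[p] = a[p], a[i]
        i = p


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_increase_key(heap, 8, 15)   # 0-based index 8 is node 9 in the slides
```

The loop visits at most one node per level, so the cost is bounded by the height of the heap.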
MAX-HEAP-INSERT
Goal:
Inserts a new element into a max-heap
Idea:
Expand the max-heap with a new element whose key is −∞
Calls HEAP-INCREASE-KEY to set the key of the new node to its correct value and maintain the max-heap property

[Figure: the heap 16 14 10 8 7 9 3 2 4 1 extended with a new last node, first holding −∞ and then holding 15.]

Example: MAX-HEAP-INSERT
Insert value 15:
- Start by inserting −∞: 16 14 10 8 7 9 3 2 4 1 −∞
Increase the key to 15:
- Call HEAP-INCREASE-KEY on A[11] = 15: 16 14 10 8 7 9 3 2 4 1 15
- Exchange with parent A[5]: 16 14 10 8 15 9 3 2 4 1 7
- Exchange with parent A[2]: 16 15 10 8 14 9 3 2 4 1 7
The restored heap containing the newly added element

MAX-HEAP-INSERT

MAX-HEAP-INSERT(A, key, n)
    heap-size[A] ← n + 1
    A[n + 1] ← −∞
    HEAP-INCREASE-KEY(A, n + 1, key)

Running time: O(lg n)

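A short Python sketch of insertion, assuming a 0-based list whose length is the heap size and `float("-inf")` as the −∞ sentinel; both function names are illustrative.

```python
def heap_increase_key(a: list, i: int, key) -> None:
    """Raise a[i] to key and bubble it up toward the root."""
    if key < a[i]:
        raise ValueError("new key is smaller than current key")
    a[i] = key
    while i > 0 and a[(i - 1) // 2] < a[i]:
        p = (i - 1) // 2
        a[i], a[p] = a[p], a[i]
        i = p


def max_heap_insert(a: list, key) -> None:
    """Insert key into the max-heap a."""
    a.append(float("-inf"))                  # expand the heap with a -inf leaf
    heap_increase_key(a, len(a) - 1, key)    # then raise it to its real key


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
max_heap_insert(heap, 15)
```

The result matches the example above: 15 ends up as the left child of the root, with 14 and 7 pushed one level down each.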
Summary
We can perform the following operations on heaps:
MAX-HEAPIFY O(lg n)
BUILD-MAX-HEAP O(n)
HEAP-SORT O(n lg n)
MAX-HEAP-INSERT O(lg n)
HEAP-EXTRACT-MAX O(lg n)
HEAP-INCREASE-KEY O(lg n)
HEAP-MAXIMUM O(1)
Exercise 1
Assuming the data in a max-heap are distinct, what are
the possible locations of the second-largest element?

Exercise 2
(a) What is the maximum number of nodes in a max heap of height h?

(b) What is the maximum number of leaves?

(c) What is the maximum number of internal nodes?
Exercise 3
Demonstrate, step by step, the operation of
Build-Heap on the array
A=[5, 3, 17, 10, 84, 19, 6, 22, 9]

Exercise 4
Let A be a heap of size n. Give the most efficient algorithm for the following tasks:

(a) Find the sum of all elements

(b) Find the sum of the largest lg n elements

Sorting in Linear Time
Sorting in linear time (for students to read)
Comparison sort:
Lower bound: Ω(n lg n).
Non-comparison sort:
Bucket sort, counting sort, radix sort
They are possible in linear time (under certain assumptions).
Lower bound of comparison sort

Lower bound: Ω(f(n)) in the worst case.
Decision tree model:
Example: for comparison sorting.
Adversary argument:
Example: find the maximum

Comparison sort
Comparison sort:
Insertion sort, O(n^2), upper bound in worst case
Merge sort, O(n lg n), upper bound in worst case
Heapsort, O(n lg n), upper bound in worst case
Quicksort, O(n lg n), in average case
Question:
What is the lower bound for any comparison sort, i.e., at least how many comparisons are needed in the worst case?
It turns out: lower bound in worst case: Ω(n lg n), so merge sort and heapsort are asymptotically optimal comparison sorts.

Decision Tree Model
Assumptions:
All numbers are distinct (so no use for a_i = a_j)
All comparisons have form a_i ≤ a_j (since a_i ≤ a_j, a_i ≥ a_j, a_i < a_j, a_i > a_j are equivalent).
Decision tree model
Full binary tree
Internal node represents a comparison.
Ignore control, movement, and all other operations; just see comparisons
Each leaf represents one possible result.
The height (i.e., longest path) is the lower bound.

Decision tree model
[Figure: decision tree for three elements; the left branch of each comparison is ≤ and the right branch is >.]
1:2
    ≤: 2:3
        ≤: <1,2,3>
        >: 1:3
            ≤: <1,3,2>
            >: <3,1,2>
    >: 1:3
        ≤: <2,1,3>
        >: 2:3
            ≤: <2,3,1>
            >: <3,2,1>

Internal node i:j indicates comparison between a_i and a_j.
Suppose three elements <a_1, a_2, a_3> with instance <6,8,5>.
Leaf node <π(1), π(2), π(3)> indicates ordering a_π(1) ≤ a_π(2) ≤ a_π(3).
Path of bold lines indicates sorting path for <6,8,5>.
There are total 3! = 6 possible permutations (paths).

Lower bound: for comparison sort
The longest path is the worst-case number of comparisons. The length of the longest path is the height of the decision tree.
Theorem 8.1: Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.
Proof:
Suppose the height of a decision tree is h, and the number of paths (i.e., permutations) is n!.
Since a binary tree of height h has at most 2^h leaves, n! ≤ 2^h, so h ≥ lg(n!) = Ω(n lg n) (by equation 3.18).
That is to say: any comparison sort needs Ω(n lg n) comparisons in the worst case.

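To make the bound h ≥ lg(n!) concrete, the following small Python sketch evaluates ⌈lg(n!)⌉ for a given n. The function name `min_comparisons` is an illustrative choice; the value is only the decision-tree lower bound, not the comparison count of any particular algorithm.

```python
import math


def min_comparisons(n: int) -> int:
    """Decision-tree lower bound: the smallest h with 2**h >= n!."""
    return math.ceil(math.log2(math.factorial(n)))


bound_3 = min_comparisons(3)   # three elements, as in the tree above
bound_5 = min_comparisons(5)
```

For n = 3 the bound is 3, which agrees with the height of the decision tree shown earlier: its longest root-to-leaf path makes three comparisons.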
Comparison Sorting Review
Insertion sort:
Easy to code
Fast on small inputs (less than ~50 elements)
Fast on nearly-sorted inputs
O(n^2) worst case
O(n^2) average case
O(n^2) reverse-sorted case
Comparison Sorting Review
Merge sort:
Divide-and-conquer:
Split array in half
Recursively sort sub-arrays
Linear-time merge step
O(n lg n) worst case - asymptotically optimal for comparison sorts
Comparison Sorting Review
Heap sort:
Uses the very useful heap data structure
Complete binary tree
O(n lg n) worst case - asymptotically optimal for comparison sorts
Sorts in place
Fair amount of shuffling memory around
Comparison Sorting Review
Quick sort:
Divide-and-conquer:
Partition array into two sub-arrays, recursively sort
All of first sub-array < all of second sub-array
O(n lg n) average case
Sorts in place
Fast in practice (why?)
O(n^2) worst case
Naïve implementation: worst case on sorted input
Good partitioning makes this very unlikely.
Non-Comparison Based Sorting
Many times we have restrictions on our keys
Deck of cards: Ace->King and four suits
Social Security Numbers

We will examine three algorithms which under certain conditions can run in O(n) time.
Bucket sort
Counting sort
Radix sort
Bucket Sort
Assumption: uniform distribution
Input numbers are uniformly distributed in [0,1).
Suppose input size is n.
Idea:
Divide [0,1) into n equal-sized subintervals (buckets).
Distribute n numbers into buckets
Expect that each bucket contains few numbers.
Sort numbers in each bucket (insertion sort as default).
Then go through buckets in order, listing elements.
BUCKET-SORT(A)
    n ← length[A]
    for i ← 1 to n
        do insert A[i] into bucket B[⌊nA[i]⌋]
    for i ← 0 to n−1
        do sort bucket B[i] using insertion sort
    Concatenate buckets B[0], B[1], …, B[n−1]
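A minimal Python sketch of the procedure, assuming every input value lies in [0, 1). The `min(...)` guard against floating-point rounding and the sample data are additions made here for illustration.

```python
def insertion_sort(b: list) -> None:
    """Sort the list b in place."""
    for j in range(1, len(b)):
        key = b[j]
        i = j - 1
        while i >= 0 and b[i] > key:
            b[i + 1] = b[i]
            i -= 1
        b[i + 1] = key


def bucket_sort(a: list) -> list:
    """Return a sorted copy of a, whose values are assumed to lie in [0, 1)."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        # Bucket floor(n * x); the min() only guards against float rounding.
        buckets[min(int(n * x), n - 1)].append(x)
    result = []
    for b in buckets:
        insertion_sort(b)
        result.extend(b)      # concatenate the buckets in order
    return result


data = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
out = bucket_sort(data)
```

With ten inputs there are ten buckets of width 0.1, so for example 0.21, 0.23 and 0.26 share bucket 2 and are ordered by the insertion sort.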
Example of BUCKET-SORT
Analysis of BUCKET-SORT(A)
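n ← length[A]                                    Θ(1)
for i ← 1 to n                                   O(n)
    do insert A[i] into bucket B[⌊nA[i]⌋]        Θ(1) (i.e. total O(n))
for i ← 0 to n−1                                 O(n)
    do sort bucket B[i] with insertion sort      O(n_i^2) (total $\sum_{i=0}^{n-1} O(n_i^2)$)
Concatenate buckets B[0], B[1], …, B[n−1]        O(n)

Where n_i is the size of bucket B[i].

Thus $T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)$.
Under the uniform-distribution assumption, $E[n_i^2] = 2 - 1/n$, so the expected time is
$E[T(n)] = \Theta(n) + n \cdot O(2 - 1/n) = \Theta(n)$. Beats Ω(n lg n).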
Counting Sort
Assumption: n input numbers are integers in range [0,k], k=O(n).
Idea:
Determine the number of elements less than x, for each input x.
Place x directly in its position.
COUNTING-SORT(A,B,k)
    for i ← 0 to k
        do C[i] ← 0
    for j ← 1 to length[A]
        do C[A[j]] ← C[A[j]]+1
    // C[i] contains number of elements equal to i.
    for i ← 1 to k
        do C[i] ← C[i]+C[i-1]
    // C[i] contains number of elements ≤ i.
    for j ← length[A] downto 1
        do B[C[A[j]]] ← A[j]
           C[A[j]] ← C[A[j]]-1
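The same procedure in Python, as a hedged sketch: it assumes 0-based output positions (so the count is decremented before placing an element rather than after) and non-negative integer keys no larger than k. The sample input is chosen here for illustration.

```python
def counting_sort(a: list, k: int) -> list:
    """Return a stably sorted copy of a; all values must be ints in [0, k]."""
    count = [0] * (k + 1)
    for x in a:
        count[x] += 1                 # count[i] = number of elements equal to i
    for i in range(1, k + 1):
        count[i] += count[i - 1]      # count[i] = number of elements <= i
    out = [None] * len(a)
    # Scan from the right so that equal keys keep their input order.
    for x in reversed(a):
        count[x] -= 1
        out[count[x]] = x
    return out


result = counting_sort([2, 5, 3, 0, 2, 3, 0, 3], k=5)
```

Scanning the input from right to left is what makes the sort stable, which matters when counting sort is used as the per-digit pass of radix sort.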
Example of Counting Sort
Analysis of COUNTING-SORT(A,B,k)
for i ← 0 to k                                        Θ(k)
    do C[i] ← 0                                       Θ(1) (Θ(1)·Θ(k) = Θ(k))
for j ← 1 to length[A]                                Θ(n)
    do C[A[j]] ← C[A[j]]+1                            Θ(1) (Θ(1)·Θ(n) = Θ(n))
// C[i] contains number of elements equal to i.
for i ← 1 to k                                        Θ(k)
    do C[i] ← C[i]+C[i-1]                             Θ(1) (Θ(1)·Θ(k) = Θ(k))
// C[i] contains number of elements ≤ i.
for j ← length[A] downto 1                            Θ(n)
    do B[C[A[j]]] ← A[j]                              Θ(1) (Θ(1)·Θ(n) = Θ(n))
       C[A[j]] ← C[A[j]]-1                            Θ(1) (Θ(1)·Θ(n) = Θ(n))

Total cost is Θ(k+n); suppose k=O(n), then total cost is Θ(n). Beats Ω(n lg n).
Radix sort
Suppose a group of people, with last name, middle, and first name (each has one letter).
Sort it by the last name, then by middle, finally by the first name
Solution 1:
Sort by last name first into (possibly) 26 bins,
Sort each bin by middle name into (possibly) 26 more bins (26*26 = 676)
Sort each of the 676 bins by the first name into 26 bins
So if there are many names, up to 26*26*26 bins may be needed.
Suppose there are n names; up to n bins may be needed.
What is the efficient solution?
Radix sort
By first name, then middle, finally last name.
Then after every pass of sort, the bins can be combined as one file and proceed to the next sort.
Radix-sort(A,d)
    for i ← 1 to d
        do use a stable sort to sort array A on digit i (digit 1 is the lowest-order digit)
Lemma 8.3: Given n d-digit numbers in which each digit can take on up to k possible values, Radix-sort correctly sorts these numbers in Θ(d(n+k)) time, provided the stable sort it uses takes Θ(n+k) time.
If d is constant and k=O(n), then time is Θ(n).
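A compact Python sketch of least-significant-digit radix sort on non-negative integers. As a stand-in for counting sort, each pass distributes the numbers into per-digit lists, which is also stable; the `base` parameter and the sample input are assumptions made here for illustration.

```python
def radix_sort(a: list, d: int, base: int = 10) -> list:
    """Return a sorted copy of a, whose values are non-negative ints
    with at most d digits in the given base."""
    for pos in range(d):                  # pos 0 is the lowest-order digit
        divisor = base ** pos
        bins = [[] for _ in range(base)]
        for x in a:
            # Appending preserves the order of equal digits (stable pass).
            bins[(x // divisor) % base].append(x)
        # Combine the bins into one list before the next pass.
        a = [x for b in bins for x in b]
    return a


result = radix_sort([329, 457, 657, 839, 436, 720, 355], d=3)
```

Each pass touches n numbers and `base` bins, so d passes cost Θ(d(n + k)) with k equal to the base, in line with Lemma 8.3.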
Example
