Chapter 6 Heapsort

Introduction
This part presents several algorithms that solve the sorting problem:
Input: A sequence of n numbers $\langle a_1, a_2, a_3, \ldots, a_n \rangle$
Output: A permutation (reordering) $\langle a'_1, a'_2, a'_3, \ldots, a'_n \rangle$ of the input sequence such that
$a'_1 \le a'_2 \le a'_3 \le \cdots \le a'_n$
The structure of data
• The numbers to be sorted are rarely isolated values.
• Each is usually part of a collection called a record.
• Each record contains a key, which is the value to be sorted.
• The remainder of the record is satellite data, which is carried around with the key.
• When focusing on the problem of sorting, we typically assume that the input consists only of numbers.
• Translating an algorithm for sorting numbers into a program for sorting records is conceptually straightforward.
Why sorting
• Sometimes an application inherently needs to sort information.
• Algorithms often use sorting as a key subroutine.
• We can prove a nontrivial lower bound for sorting. Our best upper bounds match the lower bound asymptotically, and so we know that our sorting algorithms are asymptotically optimal.
• We can use the lower bound for sorting to prove lower bounds for certain other problems.
Sorting algorithms
• A sorting algorithm sorts in place if only a constant number of elements of the input array are ever stored outside the array.
• Insertion sort, merge sort, heapsort, and quicksort are all comparison sorts: they determine the sorted order of an input array by comparing elements.
Chapter 6 Heapsort
Heapsort’s running time is O(n lg n).

It is an in-place sorting algorithm.
It uses a data structure called a heap to manage information.

Heaps are useful for heapsort, and they also make an efficient priority queue.
Heaps
The (binary) heap data structure is an array object that can be viewed as a nearly complete binary tree.
Each node of the tree corresponds to an element of the array.
The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point.
An array A that represents a heap is an object with 2 attributes:

1. A.length: the number of elements in the array
2. A.heap-size: how many elements of the heap are stored in array A
Although A[1..A.length] may contain numbers, only the elements in A[1..A.heap-size], where 0 ≤ A.heap-size ≤ A.length, are elements of the heap.
A[1] is the root of the tree.
For the index i of a node, the indices of its parent, left child, and right child can be computed as:
$\text{PARENT}(i) = \lfloor i/2 \rfloor$
$\text{LEFT}(i) = 2i$
$\text{RIGHT}(i) = 2i + 1$
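As a minimal sketch of this index arithmetic, the Python helpers below compute parent and child positions. The function names are illustrative, and the array values after 16, 14, and 10 are assumed for the example. Because Python lists are 0-based, the sketch gives both the 1-based formulas used in these notes and their 0-based equivalents.

```python
# 1-based formulas, matching the A[1..n] convention of these notes.
def parent(i: int) -> int:
    return i // 2

def left(i: int) -> int:
    return 2 * i

def right(i: int) -> int:
    return 2 * i + 1

# 0-based equivalents, convenient for plain Python lists.
def parent0(i: int) -> int:
    return (i - 1) // 2

def left0(i: int) -> int:
    return 2 * i + 1

def right0(i: int) -> int:
    return 2 * i + 2

# Example heap (values after 16, 14, 10 are assumed for illustration).
heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

# Node 2 in 1-based numbering is heap[1] in Python.
i = 2
print(heap[i - 1], heap[left(i) - 1], heap[right(i) - 1])  # prints: 14 8 7
```

The later sketches in these notes use the 0-based form so that they run directly on Python lists.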
Note
The heap is constructed in such a way that it remains as balanced as possible. First 16 is inserted, then 14; when 10 is inserted, it becomes a child of 16 rather than a child of 14, to keep the tree balanced. In either case, the max-heap property would be satisfied.
There are 2 types of binary heaps:
1. Max-heaps
2. Min-heaps
In each kind of heap, the nodes satisfy a specific property.
In a max-heap, the max-heap property is that, for every node i other than the root,
$A[\text{PARENT}(i)] \ge A[i]$

A min-heap is organized in the opposite way.
The min-heap property is that, for every node i other than the root,
$A[\text{PARENT}(i)] \le A[i]$
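The two properties can be checked directly from their definitions. The sketch below is illustrative only: the function names are assumptions, and it uses 0-based Python lists, where the parent of index i is (i - 1) // 2.

```python
def is_max_heap(a: list) -> bool:
    """True if every non-root element is <= its parent (0-based indices)."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))

def is_min_heap(a: list) -> bool:
    """True if every non-root element is >= its parent (0-based indices)."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))

print(is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))  # prints: True
print(is_max_heap([16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))  # prints: False
print(is_min_heap([1, 2, 3, 4, 7, 9, 10]))             # prints: True
```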
The height of a node in a heap is the number of edges on the longest simple downward path from the node to a leaf.
The depth of a node is the number of edges from the root to that node.
A completely filled layer at depth d contains $2^d$ nodes.

The height of an n-element heap is $\lfloor \lg n \rfloor$, where $\lfloor \cdot \rfloor$ is the floor function.
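As a quick sanity check of the height formula, the sketch below compares floor(lg n) with the number of edges counted by walking from the root to the leftmost leaf, which is a longest downward path in a nearly complete tree. Both function names are hypothetical, and the indices are 0-based.

```python
def heap_height(n: int) -> int:
    """Height of an n-element heap, floor(lg n), for n >= 1."""
    return n.bit_length() - 1

def height_by_walking(n: int) -> int:
    """Count edges on the path from the root to the leftmost leaf."""
    edges, i = 0, 0
    while 2 * i + 1 < n:  # while node i has a left child
        i = 2 * i + 1
        edges += 1
    return edges

print(heap_height(10), height_by_walking(10))  # prints: 3 3
```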
Maintaining the heap property
MAX-HEAPIFY maintains the max-heap property. Its inputs are the array A and an index i.
When it is called, it assumes that the trees rooted at the nodes with indices LEFT(i) and RIGHT(i) are max-heaps, but A[i] may be smaller than its children, violating the max-heap property.
MAX-HEAPIFY lets the value A[i] float down the max-heap, so that the subtree rooted at index i obeys the property.
In each step, the largest of the elements A[i], A[LEFT(i)], and A[RIGHT(i)] is determined, and its index is stored in the variable largest.
If A[i] is the largest, then the subtree rooted at node i is already a max-heap and the procedure terminates.
If one of the children is the largest, A[i] is swapped with A[largest], so node i and its children satisfy the property.
The node indexed by largest now holds the original value of A[i], so the subtree rooted at it might violate the property.
So MAX-HEAPIFY is called recursively on that subtree.
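The steps above can be sketched in Python as follows. This is a minimal illustration, not the notes' own pseudocode: it uses 0-based lists (children of i at 2i + 1 and 2i + 2) and takes the heap size as an explicit parameter, both of which are assumptions made for the example.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    """
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    # Find the largest among a[i] and its children that lie inside the heap.
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        # Swap, then repair the subtree that received the smaller value.
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

# Only index 1 (value 4) violates the max-heap property here.
a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(a, 1, len(a))
print(a)  # prints: [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```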

Let T(n) be the running time of MAX-HEAPIFY on a subtree of size n.
T(n) is the time to fix the relationships among the elements A[i], A[LEFT(i)], and A[RIGHT(i)], which takes Θ(1) time,
plus
the time to run MAX-HEAPIFY recursively on a subtree rooted at one of the children of node i (assuming the recursive call occurs).
The worst case occurs when the bottom level of the tree is exactly half full, so that the left subtree is as large as possible relative to the whole tree.
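In that case, let h be the height of the right subtree. The right subtree is then a complete tree with $2^{h+1} - 1$ nodes, and the left subtree is a complete tree of height h + 1 with $2^{h+2} - 1$ nodes, so
(no. of nodes of left subtree) + (no. of nodes of right subtree) + 1 = n
$(2^{h+2} - 1) + (2^{h+1} - 1) + 1 = n$
$(2 \cdot 2^{h+1} - 1) + (2^{h+1} - 1) + 1 = n$
$3 \cdot 2^{h+1} = n + 1$
$2^{h+1} = \frac{n+1}{3}$
no. of nodes in left subtree $= 2^{h+2} - 1 = \frac{2(n+1)}{3} - 1 = \frac{2n}{3} - \frac{1}{3} < \frac{2n}{3}$
Hence the subtree on which the recursive call is made has size at most 2n/3.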

The recurrence is
$T(n) \le T(2n/3) + \Theta(1)$
By the master method, the solution to the recurrence is T(n) = O(lg n).
Equivalently, the running time of MAX-HEAPIFY on a node of height h is O(h)
(in the worst case, the recursion continues along a path from the node all the way down to a leaf).
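The master-method step can be spelled out as follows. This is a worked sketch using the usual notation a, b, and f(n) for a recurrence of the form T(n) = aT(n/b) + f(n); those symbols are not used elsewhere in these notes.

```latex
% Master method applied to T(n) <= T(2n/3) + \Theta(1)
\begin{align*}
a &= 1, \qquad b = \tfrac{3}{2}, \qquad f(n) = \Theta(1) \\
n^{\log_b a} &= n^{\log_{3/2} 1} = n^{0} = 1 \\
f(n) &= \Theta\!\left(n^{\log_b a}\right) \quad \text{(case 2)} \\
T(n) &= O\!\left(n^{\log_b a} \lg n\right) = O(\lg n)
\end{align*}
```

Because the recurrence is an inequality, the result is stated as an upper bound rather than a tight bound.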
Building a heap
MAX-HEAPIFY can be used to convert an array A[1..A.length] into a max-heap in a bottom-up manner.
For a tree of n nodes, n is the index of the last (rightmost) leaf on the bottom level.
$\lfloor n/2 \rfloor$ is the index of the parent of this node. Every node with an index i greater than $\lfloor n/2 \rfloor$ is a leaf, because its left child would have index 2i > n, which lies outside the tree.
So,
$A[\lfloor n/2 \rfloor + 1 \,..\, n]$
are all leaves of the tree.
The procedure BUILD-MAX-HEAP goes through the remaining nodes of the tree, from index $\lfloor n/2 \rfloor$ down to 1, and runs MAX-HEAPIFY on each one.
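A bottom-up construction along these lines might look like the sketch below. It is an illustration under stated assumptions: 0-based Python lists are used, so the last internal node sits at index n // 2 - 1 instead of ⌊n/2⌋, and the helper max_heapify is repeated here so that the block runs on its own.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at i is a max-heap."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

def build_max_heap(a: list) -> None:
    """Rearrange a in place into a max-heap."""
    n = len(a)
    # Indices n // 2 .. n - 1 are leaves, so start at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)

a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(a)
print(a)  # prints: [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```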
Analysis
Loop Invariant
At the start of each iteration of the for loop of BUILD-MAX-HEAP, each node i+1, i+2, …, n is the root of a max-heap.
Time
We can compute a simple upper bound:

cost of each call to MAX-HEAPIFY = O(lg n)
no. of calls made in BUILD-MAX-HEAP = O(n)
Thus, the running time is O(n lg n).
But this bound is not asymptotically tight,
because a call to MAX-HEAPIFY on a node of height h takes O(h) time, and most nodes have heights much smaller than lg n.
Using this and the properties that

the height of an n-element heap is $\lfloor \lg n \rfloor$
and
the number of nodes of height h is at most $\lceil n / 2^{h+1} \rceil$,
we can derive a tighter bound of O(n).
Hence, we can build a max-heap from an unordered array in linear time.
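The linear bound can be derived as follows. This is a worked sketch that sums the O(h) cost of MAX-HEAPIFY over all nodes, using the standard fact that an n-element heap has at most ⌈n/2^(h+1)⌉ nodes of height h, together with the series Σ k x^k = x/(1 − x)^2 evaluated at x = 1/2.

```latex
\begin{align*}
\sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
  &= O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right) \\
\sum_{h=0}^{\infty} \frac{h}{2^{h}}
  &= \frac{1/2}{(1 - 1/2)^{2}} = 2 \\
O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right)
  &= O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right) = O(n)
\end{align*}
```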
The heapsort algorithm
1. Build a max-heap from the input array A[1..n], where n = A.length, using BUILD-MAX-HEAP.
2. The maximum element is stored in A[1], so swap A[1] and A[n] to put it into its final position.
3. To discard A[n] from the heap, decrement A.heap-size.
4. The new root may violate the max-heap property, so call MAX-HEAPIFY on the new root, which gives a max-heap in A[1..n−1].
5. Repeat this process for the max-heap of size n−1, down to a heap of size 2.
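Putting the five steps together gives the following sketch. It assumes 0-based Python lists, so the root is a[0] and the last heap element is a[heap_size - 1]. The helpers are repeated so that the block is self-contained, and all function names are illustrative.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at i is a max-heap."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)

def build_max_heap(a: list) -> None:
    """Rearrange a in place into a max-heap."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))

def heapsort(a: list) -> None:
    """Sort a in place in nondecreasing order."""
    build_max_heap(a)                      # step 1
    heap_size = len(a)
    for end in range(len(a) - 1, 0, -1):   # step 5: repeat down to size 2
        a[0], a[end] = a[end], a[0]        # step 2: move the maximum to the end
        heap_size -= 1                     # step 3: discard it from the heap
        max_heapify(a, 0, heap_size)       # step 4: repair the new root

data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(data)
print(data)  # prints: [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

Only a constant number of elements is held outside the list at any time, which is what makes the sort in place.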
Priority queues
One of the most popular applications of a heap is as a priority queue.

Just like heaps, priority queues come in 2 forms: min-priority queues and max-priority queues.
A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key.
A max-priority queue supports the following operations:
INSERT(S, x): Inserts the element x into the set S. It is equivalent to the operation S = S ∪ {x}.
MAXIMUM(S): Returns the element of S with the largest key.
EXTRACT-MAX(S): Removes and returns the element of S with the largest key.
INCREASE-KEY(S, x, k): Increases the value of element x's key to the new value k, which is assumed to be at least as large as x's current key value.
Implementation
With a max-heap of n elements, the operations have the following running times:
HEAP-MAXIMUM: Θ(1)
HEAP-EXTRACT-MAX: O(lg n)
HEAP-INCREASE-KEY: O(lg n)
MAX-HEAP-INSERT: O(lg n)
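A max-priority queue built on a max-heap could be sketched as below. This is an illustrative class, not a definitive implementation: the class and method names are hypothetical, it stores bare keys without satellite data, it uses 0-based lists, and increase_key identifies the element by its index in the underlying array, which is an assumption of this sketch.

```python
class MaxPriorityQueue:
    """Max-priority queue of keys, backed by a 0-based max-heap."""

    def __init__(self) -> None:
        self._a = []

    def maximum(self):
        """Return the largest key in Theta(1) time."""
        if not self._a:
            raise IndexError("priority queue is empty")
        return self._a[0]

    def extract_max(self):
        """Remove and return the largest key in O(lg n) time."""
        if not self._a:
            raise IndexError("heap underflow")
        top = self._a[0]
        last = self._a.pop()
        if self._a:
            # Move the last element to the root and float it down.
            self._a[0] = last
            self._max_heapify(0)
        return top

    def increase_key(self, i: int, key) -> None:
        """Raise the key at index i to key, then float it up, in O(lg n)."""
        if key < self._a[i]:
            raise ValueError("new key is smaller than current key")
        self._a[i] = key
        while i > 0 and self._a[(i - 1) // 2] < self._a[i]:
            p = (i - 1) // 2
            self._a[i], self._a[p] = self._a[p], self._a[i]
            i = p

    def insert(self, key) -> None:
        """Add key as a new leaf and float it up, in O(lg n) time."""
        self._a.append(key)
        self.increase_key(len(self._a) - 1, key)

    def _max_heapify(self, i: int) -> None:
        a, n = self._a, len(self._a)
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and a[left] > a[largest]:
                largest = left
            if right < n and a[right] > a[largest]:
                largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

q = MaxPriorityQueue()
for k in [5, 3, 17, 10, 84, 19, 6, 22, 9]:
    q.insert(k)
print(q.maximum())      # prints: 84
print(q.extract_max())  # prints: 84
print(q.extract_max())  # prints: 22
```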
