Chapter 6 Heapsort

You might also like

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 12

Introduction

This part presents several algorithms that solve the sorting problem
Input: A sequence of n numbers ¿ a1 , a2 , a3 , … , an >¿
Output: A permutation (reordering) ¿ a'1 , a'2 , a'3 , … , a'n >¿ of the input sequence such that
a '1 ≤ a'2 ≤ a '3 ≤ … ≤ a'n

The structure of data

 The numbers to be sorted are rarely isolated values


 Each is usually part of a collection called a record
 Each record contains a key, which is the value to be sorted
 The remainder of the record is satellite data, which are carried around with the key.
 when focusing on the problem of sorting, we typically assume that the input consists only of
numbers.
 Translating an algorithm for sorting numbers into a program for sorting records is
conceptually straightforward
Why sorting

 Sometimes an application inherently needs to sort information.


 Algorithms often use sorting as a key subroutine.
 We can prove a nontrivial lower bound for sorting. Our best upper bounds match the lower
bound asymptotically, and so we know that our sorting algorithms are asymptotically
optimal.
 We can use the lower bound for sorting to prove lower bounds for certain other problems.

Sorting algorithms

 a sorting algorithm sorts in place if only a constant number of elements of the input array
are ever stored outside the array.
 Insertion sort, merge sort, heapsort, and quicksort are all comparison sorts: they determine
the sorted order of an input array by comparing elements.
Chapter 6 Heapsort

Heapsort’s running time is Θ(n lg n)


It is an in-place sorting algorithm

We use a data structure called heap to manage information.


Heaps are useful for heapsort and they also make an efficient priority queue.

heaps

the (binary) heap data structure is an array object that can be viewed as a nearly complete binary
tree
each node of the tree corresponds to an element of the array

The tree is completely filled on all levels except the lowest. Which is filed from the left up till a point.

An array A which represents a heap is an object with 2 attributes


1. A . lenght : the number of elements in the array
2. A . heap−¿ ¿: how many elements in the heap are stored in array A

Although A[1. . A . lenght ] may contain numbers, only the elements in A ¿ are elements of the heap
A[1] is the root of the tree

For index i of a node, the parent, left child and right child can be computed

Note

the heap is constructed in such a way that it remains as balanced as possible. first 16 is
inserted, then 14, when 10 is inserted it is not inserted as a child of 14 but as a child of 16,
to keep the tree balance. in either case, the max heapify property would be satisfied.

There are 2 types of binary heaps –

1. Max heaps
2. Min heaps

In each heap the nodes satisfy a specific property

In max heap the max heap property is that, for every node i , other than the root

A [ parent (i) ] ≥ A [i]


A min heap is organized in the opposite way

The min heap property is

A [ parent ( i ) ] ≤ A [i]

Height of a node in a heap is the number of edges on the longest simple downward path from the
node to a leaf.

The number of nodes in layer of depth h=2h

depth=no .of edges ¿ theroot ¿ that node


the height of an n element heap=[ lg n ] ,[ ]isthe floor function

Maintaining the heap property

The inputs are the array A and an index i .

when it is called it assumes that the trees rooted at the nodes with indices ¿ and ¿(i) are max-
heaps, but A[i] may be smaller than its children, violating the max heap property.

MAX−HEAPIFY lets the value A[i] float down the max-heap, so that the subtree rooted at index
i obeys the property

In each step the largest of the elements A [ i ] , A ¿(i¿) ¿ ¿∧A ¿ is determined and stored in the
variable largest .

If A[i] is the largest, then the subtree rooted at node i is already a max-heap and the procedure
terminates.

If one of the children is the largest, A[i] is swapped with A[largest ] so node i and its children
follow the property.

The node indexed at largest value has changed, so the subtree rooted at it might violate the
property.

So MAX−HEAPIFY is called recursively on that subtree.


Let T (n) be the running time of MAX−HEAPIFY a subtree of size n .

T (n) is given by the time to fix the relationship amongst the elements A [ i ] , A ¿ and A[¿(i)] –
which takes Θ(1) time.
+¿
Time to run MAX−HEAPIFY recursively on the subtree rooted at index i . (assuming it occurs)

Worst case scenario is when the last layer of the heap is half filled

In that case

no . of nodes of ¿+ no . of nodes of ¿ 1=n

( 2h +2−1 ) + ( 2h +1−1 ) +1=n

( 2 ⋅ 2h+1 −1 )+ ( 2h+1−1 )+ 1=n

3 ⋅2h+1 =n+1

n+ 1
2h+ 1=
3

2 ( n+ 1 ) 2n 2 2n
no . of nodes∈¿=2h+2 −1= −1= − <
3 3 3 3

Hence largest size of the subtree called recursively is 2 n/3


The recurrence is,
T (n )≤ T ( 23n )+Θ(1)
Solution to the recurrence by master method is T ( n )=Ο(lg n)

Also, the running time can be characterized by Ο ( h )

(worst case all the subtrees till the root are called recursively)
Building a heap

MAX HEAPIFY can be used to convert an array A[1. . A . lenght ] into a max heap in a bottom up
manner.

For a tree of n nodes, n is the index of the right most leaf.

is the parent of this node with index n . All the nodes to the right of this parent node will be
leaf nodes (think about it).
So,

are all leaves of the tree.

The procedure BUILD−MAX−HEAP goes through the remaining nodes of the tree and runs
MAX−HEAPIFY on each one.
Analysis

Loop Invariant

At the start of each iteration of the for loop of lines 2 – 3, each node i+1 ,i+2 , … ,n is
the root of a max-heap.

Time

We can compute a simple upper bound


cost of call ¿ MAX−HEAPIFY =Ο(lg n)
no . of calls made∈ BUILD−HEAP=Ο ( n )

Thus, running time is Ο(n lg n)

But it’s not asymptotically tight

Because each call to MAX−HEAPIFY is actually Ο(h), and h<O(lg n) for subtrees

Using this and the properties that


h=[ lg n ] , for ann element heap
and
no . of nodes∈ alayer isatmost=¿
We can derive a tighter bound
Hence, we can build a max heap from an unordered array in linear time
The heap sort algorithm

1. Build a heap from the input array A[1. . A . length] using BUILD−MAX−HEAP
2. The maximum element is stored in A[1], so swap A[1] and A[n]
3. To discard A[n] from the heap decrement A . heap−¿ ¿.
4. The new root may violate the property, so call MAX−HEAPIFY on the new root, which
gives a max-heap in A[1. . n−1]
5. Then repeat this process for a max-heap of size n−1, down to a size of 2
Priority queues

One of the most popular applications of heap: priority queues


Just like heaps, priority queues come in 2 forms, min-priority queues and max-priority queues.

A priority queue is a data structure for maintaining a set S of elements, each with an associated
value called a key.
A max-priority queue supports the following operations:

INSERT (S , x): Inserts the element x into the set S. It is equivalent to the operation S=S ∪ { x }
MAXIMUM (S ): Returns the elements of S with the largest key.
EXTRACT−MAX (S) : Removes and returns the element of S with the largest key.
INCREASE−KEY (S , x , k ): Increases the value of element x ' s key to the new value k , which is
assumed to be at least as large as x ' s current key value.

Implementation
HEAP−MAXIMUM =Θ(1)
HEAP−EXTRACT −MAXIMUM =Ο(lg n)
HEAP−INCREASE−KEY =Ο(lg n)
MAX−HEAP−INSERT =Ο(lg n)

You might also like