
COMP2230 Introduction to Algorithmics Lecture 5

A/Prof Ljiljana Brankovic

Lecture Overview
Priority Queues, Heaps (Text, Chapter 3, Section 3.5)
Disjoint sets (Text, Chapter 3, Section 3.6)
Binary Search (Text, Chapter 4, Section 4.1)
Next week: Searching (Text, Chapter 4, Sections 4.2-4.5)

Priority Queues
A priority queue is an abstract data type that allows:
insertion of items with specified priorities, and deletion of an item with the highest priority

Example: Suppose we insert items with priorities 20, 389, 12 and 7 into an initially empty priority queue. After deleting an item, the priority queue contains 20, 12, 7, and after inserting 67 it contains 20, 12, 7, 67.
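The example above can be replayed with a short Python sketch. It assumes Python's standard heapq module, which is a min-heap, so priorities are stored negated; the class name MaxPriorityQueue and its methods are hypothetical names chosen for illustration.

```python
import heapq

class MaxPriorityQueue:
    """Max-priority queue on top of heapq (a min-heap) by negating priorities."""

    def __init__(self):
        self._heap = []

    def insert(self, priority):
        heapq.heappush(self._heap, -priority)

    def delete_max(self):
        # Remove and return the item with the highest priority.
        return -heapq.heappop(self._heap)

    def items(self):
        # Contents in decreasing priority order (for inspection only).
        return sorted((-p for p in self._heap), reverse=True)

pq = MaxPriorityQueue()
for p in (20, 389, 12, 7):
    pq.insert(p)
removed = pq.delete_max()   # the item with priority 389
pq.insert(67)
```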

Implementing a Priority Queue Using an Array


Insertion (at the end of the array): Θ(1)
Deletion:
   Finding an item with largest priority: Θ(n)
   Shifting all elements to the right of the deleted item one cell to the left: O(n) (Θ(n)?)
   TOTAL: Θ(n)

For n insertions and deletions we have O(n²) (is it also Θ(n²)?)

Implementing a Priority Queue Using an Array


What happens if we maintain a sorted array?
Insertion:
   Locating the position: Θ(lg n)
   Shifting all elements to the right of that position one cell to the right: Θ(n)
Deletion: Θ(1)
TOTAL: Θ(n)

For n insertions and deletions we have Θ(n²) in the worst case.
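The two array-based approaches can be compared in a small Python sketch. The function names are hypothetical, and Python lists stand in for the arrays; list.pop and bisect.insort perform the shifting described on the slides.

```python
import bisect

def insert_unsorted(a, x):
    a.append(x)                       # place at the end: constant time

def delete_unsorted(a):
    # Linear scan for the position of the largest item.
    k = max(range(len(a)), key=a.__getitem__)
    return a.pop(k)                   # pop shifts the later items one cell left

def insert_sorted(a, x):
    bisect.insort(a, x)               # binary search, then shift items right

def delete_sorted(a):
    return a.pop()                    # largest item is last: constant time

unsorted, ordered = [], []
for p in (20, 389, 12, 7):
    insert_unsorted(unsorted, p)
    insert_sorted(ordered, p)
```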

Heaps
A heap is a binary tree where all levels except possibly the last level are completely filled in; the nodes in the last level are at the left.

Binary Maxheaps
In a binary maxheap the value of each node is greater than or equal to the values of its children.

[Figure: a binary maxheap with 12 nodes on levels 0 to 3, drawn as a tree and stored in an array indexed from 1.]

Index:  1    2   3   4   5   6   7   8   9   10  11  12
Value:  104  71  24  66  27  23  8   5   32  25  18  22

Algorithm 3.5.6 Largest


This algorithm returns the largest value in a heap. The array v represents the heap.

Input Parameter: v
Output Parameters: None

heap_largest(v) {
   return v[1]
}
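As a rough illustration of the array representation, the sketch below stores the lecture's example heap in a Python list and checks the maxheap property with index arithmetic (the parent of node i is node i/2, rounded down). The unused placeholder at v[0] is an assumption made here so that indexing starts at 1, as in the pseudocode; is_maxheap is a hypothetical helper.

```python
# v[0] is an unused placeholder so that the heap occupies v[1..n].
v = [None, 104, 71, 24, 66, 27, 23, 8, 5, 32, 25, 18, 22]

def is_maxheap(v, n):
    """True if every node in v[2..n] is no larger than its parent v[i // 2]."""
    return all(v[i // 2] >= v[i] for i in range(2, n + 1))

def heap_largest(v):
    return v[1]
```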

Deleting from a Heap


[Figure: deleting the root 104 from the example heap. The last node, 22, is moved to the root and sifted down; 71, 66 and 32 move up in turn.]

Heap after the deletion (array order): 71 66 24 32 27 23 8 5 22 25 18

Algorithm 3.5.7 Siftdown

The array v represents a heap structure indexed from 1 to n. The left and right subtrees of node i are heaps. After siftdown(v, i, n) is called, the subtree rooted at i is a heap.

Input Parameters: v,i,n
Output Parameters: v

siftdown(v,i,n) {
   temp = v[i]
   // 2 * i <= n tests for a left child
   while (2 * i <= n) {
      child = 2 * i
      // if there is a right child and it is
      // bigger than the left child, move child
      if (child < n && v[child + 1] > v[child])
         child = child + 1
      // move child up?
      if (v[child] > temp)
         v[i] = v[child]
      else
         break // exit while loop
      i = child
   }
   // insert original v[i] in correct spot
   v[i] = temp
}
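A direct Python transcription of siftdown might look as follows. It is a sketch that assumes a list with an unused placeholder at index 0, so that the heap occupies v[1..n]; the sample data is the lecture's heap after 22 has replaced the deleted root.

```python
def siftdown(v, i, n):
    """Restore the heap property in the subtree of v[1..n] rooted at i."""
    temp = v[i]
    while 2 * i <= n:                 # node i has a left child
        child = 2 * i
        # Pick the right child if it exists and is bigger than the left one.
        if child < n and v[child + 1] > v[child]:
            child += 1
        if v[child] > temp:
            v[i] = v[child]           # move the bigger child up
        else:
            break
        i = child
    v[i] = temp                       # original value lands in its final spot

v = [None, 22, 71, 24, 66, 27, 23, 8, 5, 32, 25, 18]
siftdown(v, 1, 11)
```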

Algorithm 3.5.9 Delete


This algorithm deletes the root (the item with largest value) from a heap containing n elements. The array v represents the heap.

Input Parameters: v,n
Output Parameters: v,n

heap_delete(v,n) {
   v[1] = v[n]
   n = n - 1
   siftdown(v,1,n)
}

Complexity: Θ(lg n) in the worst case
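The deletion can be sketched in Python as below. Because Python integers are passed by value, this version returns the new size instead of updating n as an output parameter; that choice, and the placeholder at v[0], are assumptions of the sketch rather than part of Algorithm 3.5.9.

```python
def siftdown(v, i, n):
    temp = v[i]
    while 2 * i <= n:
        child = 2 * i
        if child < n and v[child + 1] > v[child]:
            child += 1
        if v[child] > temp:
            v[i] = v[child]
        else:
            break
        i = child
    v[i] = temp

def heap_delete(v, n):
    """Delete the root of the heap v[1..n] and return the new size."""
    v[1] = v[n]          # the last node replaces the root
    n -= 1
    siftdown(v, 1, n)    # at most one step per level of the heap
    return n

v = [None, 104, 71, 24, 66, 27, 23, 8, 5, 32, 25, 18, 22]
n = heap_delete(v, 12)
```

After the call only v[1..n] is meaningful; the old last cell still holds a stale copy of 22.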

Algorithm 3.5.10 Insert


This algorithm inserts the value val into a heap containing n elements. The array v represents the heap.

Input Parameters: val,v,n
Output Parameters: v,n

heap_insert(val,v,n) {
   i = n = n + 1
   // i is the child and i/2 is the parent.
   // If i > 1, i is not the root.
   while (i > 1 && val > v[i/2]) {
      v[i] = v[i/2]
      i = i/2
   }
   v[i] = val
}

Complexity?
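One way to explore the question is with a Python sketch of the insertion. Each iteration of the loop moves i one level closer to the root, so the loop runs at most about lg n times. The sketch assumes a list with a placeholder at index 0 and returns the new size, since Python cannot update n in place.

```python
def heap_insert(val, v, n):
    """Insert val into the heap v[1..n] and return the new size."""
    n += 1
    if len(v) <= n:
        v.append(None)                # make room for cell n
    i = n
    # i is the child and i // 2 is the parent; i == 1 is the root.
    while i > 1 and val > v[i // 2]:
        v[i] = v[i // 2]              # move the smaller parent down
        i //= 2
    v[i] = val
    return n

v = [None, 104, 71, 24, 66, 27, 23, 8, 5, 32, 25, 18, 22]
n = heap_insert(50, v, 12)
```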

Algorithm 3.5.12 Heapify


This algorithm rearranges the data in the array v, indexed from 1 to n, so that it represents a heap.

Input Parameters: v,n
Output Parameters: v

heapify(v,n) {
   // n/2 is the index of the parent of the last node
   for i = n/2 downto 1
      siftdown(v,i,n)
}
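A minimal Python sketch of heapify, again assuming a list with an unused cell at index 0. The input order below is an arbitrary permutation of the lecture's example values, chosen only for illustration.

```python
def siftdown(v, i, n):
    temp = v[i]
    while 2 * i <= n:
        child = 2 * i
        if child < n and v[child + 1] > v[child]:
            child += 1
        if v[child] > temp:
            v[i] = v[child]
        else:
            break
        i = child
    v[i] = temp

def heapify(v, n):
    # n // 2 is the parent of the last node; the nodes after it are leaves.
    for i in range(n // 2, 0, -1):
        siftdown(v, i, n)

v = [None, 5, 8, 22, 104, 18, 23, 24, 66, 32, 25, 27, 71]
original = sorted(v[1:])
heapify(v, 12)
```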

Complexity?

Algorithm 3.5.12 Heapify


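Complexity? Θ(n)

Proof: The siftdown is applied to all the nodes except the leaves. Let h be the height of the heap, with the root on level 0. For a node at level h-1 the siftdown takes at most 1 step. In general, for a node at level i, the siftdown takes at most h-i steps. There are 2^i internal nodes on level i for i < h-1, and at most 2^(h-1) internal nodes on level h-1. Therefore, the worst-case time satisfies

$$T(n) \le 1 \cdot 2^{h-1} + 2 \cdot 2^{h-2} + 3 \cdot 2^{h-3} + \cdots + h \cdot 2^{h-h}$$

$$T(n) \le \sum_{i=1}^{h} i \, 2^{h-i} = 2^{h} \sum_{i=1}^{h} \frac{i}{2^{i}} < 2^{h} \cdot 2 = 2^{\lfloor \lg n \rfloor} \cdot 2 \le 2^{\lg n} \cdot 2 = 2n$$

Therefore T(n) = O(n).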

Algorithm 3.5.12 Heapify


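We need to show that

$$\sum_{i=1}^{h} \frac{i}{2^{i}} < 2$$

Write each term i/2^i as i copies of 1/2^i and arrange the copies in rows, so that row k contains one copy of each of 1/2^k, ..., 1/2^h. Each row is a geometric sum:

$$
\begin{array}{cccccl}
\frac{1}{2} & \frac{1}{2^{2}} & \frac{1}{2^{3}} & \cdots & \frac{1}{2^{h}} & < 1 \\
 & \frac{1}{2^{2}} & \frac{1}{2^{3}} & \cdots & \frac{1}{2^{h}} & < \frac{1}{2} \\
 & & \frac{1}{2^{3}} & \cdots & \frac{1}{2^{h}} & < \frac{1}{2^{2}} \\
 & & & \ddots & \vdots & \quad \vdots \\
 & & & & \frac{1}{2^{h}} & < \frac{1}{2^{h-1}} \\
\hline
 & & & & \text{TOTAL} & < 2
\end{array}
$$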
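The bound can also be checked numerically. The sketch below uses exact rational arithmetic from Python's fractions module; weighted_sum is a hypothetical helper name. The closed form 2 - (h+2)/2^h used in the test is a standard identity, not something stated on the slides.

```python
from fractions import Fraction

def weighted_sum(h):
    """Exact value of the sum of i / 2**i for i = 1..h."""
    return sum(Fraction(i, 2 ** i) for i in range(1, h + 1))

# The partial sums grow with h but never reach 2.
values = [weighted_sum(h) for h in range(1, 30)]
```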

Algorithm 3.5.12 Heapify


On the other hand, there are ⌊n/2⌋ internal nodes (nonleaves) in the heap, and siftdown is called once for each of them, therefore the complexity is Ω(n); thus we have Θ(n). Note that this proves not only that the worst-case complexity is Θ(n), but also the best-case and average-case complexity.

Indirect Heaps
[Figure: an indirect heap, shown as three arrays. key holds the values; into[i] is the position in the heap structure of key[i]; outof[p] is the index into key of the value at heap position p.]

Algorithm 3.5.15 Increase

This algorithm increases a value in an indirect heap and then restores the heap. The input is an index i into the key array, which specifies the value to be increased, and the replacement value newval.

Input Parameters: i,newval
Output Parameters: None

increase(i,newval) {
   key[i] = newval
   // p is the parent index in the heap structure
   // c is the child index in the heap structure
   c = into[i]
   p = c/2
   while (p >= 1) {
      if (key[outof[p]] >= newval)
         break // exit while loop
      // move value at p down to c
      outof[c] = outof[p]
      into[outof[c]] = c
      // move p and c up
      c = p
      p = c/2
   }
   // put newval in heap structure at index c
   outof[c] = i
   into[i] = c
}
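The following Python sketch shows how increase keeps into and outof consistent. The arrays are passed as arguments rather than kept as globals, and the sample data is made up for illustration (it is not a reconstruction of the slide's figure); index 0 of each list is an unused placeholder.

```python
def increase(key, into, outof, i, newval):
    """Raise key[i] to newval and restore the indirect heap."""
    key[i] = newval
    c = into[i]                  # child position in the heap structure
    p = c // 2                   # parent position in the heap structure
    while p >= 1:
        if key[outof[p]] >= newval:
            break
        outof[c] = outof[p]      # move the value at p down to c
        into[outof[c]] = c
        c = p
        p = c // 2
    outof[c] = i                 # newval settles at heap position c
    into[i] = c

# Hypothetical data: eight keys; outof lists key indices in heap order.
key   = [None, 12, 312, 25, 8, 109, 7, 18, 66]
outof = [None, 2, 5, 8, 3, 7, 1, 4, 6]
into  = [None, 6, 1, 4, 7, 2, 8, 5, 3]
increase(key, into, outof, 4, 200)
```

Only index entries move; the key array itself changes in a single cell.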

Algorithm 3.5.16 Heapsort


This algorithm sorts the array v[1], ... , v[n] in nondecreasing order. It uses the siftdown and heapify algorithms (Algorithms 3.5.7 and 3.5.12).

Input Parameters: v,n
Output Parameter: v

heapsort(v,n) {
   // make v into a heap
   heapify(v,n)
   for i = n downto 2 {
      // v[1] is the largest among v[1], ... , v[i].
      // Put it in the correct cell.
      swap(v[1],v[i])
      // Heap is now at indexes 1 through i - 1.
      // Restore heap.
      siftdown(v,1,i - 1)
   }
}
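Putting the pieces together, heapsort can be sketched in Python as follows, under the same assumption of a list with an unused cell at index 0. The sample values are arbitrary.

```python
def siftdown(v, i, n):
    temp = v[i]
    while 2 * i <= n:
        child = 2 * i
        if child < n and v[child + 1] > v[child]:
            child += 1
        if v[child] > temp:
            v[i] = v[child]
        else:
            break
        i = child
    v[i] = temp

def heapify(v, n):
    for i in range(n // 2, 0, -1):
        siftdown(v, i, n)

def heapsort(v, n):
    heapify(v, n)
    for i in range(n, 1, -1):
        # v[1] is the largest of v[1..i]; move it to its final cell.
        v[1], v[i] = v[i], v[1]
        siftdown(v, 1, i - 1)    # restore the heap on v[1..i-1]

v = [None, 23, 5, 104, 8, 71, 66]
heapsort(v, 6)
```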

Disjoint sets
We consider nonempty pairwise disjoint sets, where each set has an element marked as a representative element of the set. Sets X and Y are disjoint iff X ∩ Y = ∅.

Example: The sets {1}, {2,4,3}, {5,6,7} are nonempty pairwise disjoint sets. The marked elements of these sets are 1, 4 and 6, respectively.

Disjoint sets
The operations:
   makeset(i) constructs the set {i}
   findset(i) returns the marked member of the set to which i belongs
   union(i,j) replaces the set containing i and the set containing j with their union.

Example:
makeset(1) makeset(2) makeset(3) makeset(4) makeset(5) makeset(6) makeset(7)
We have {1}, {2}, {3}, {4}, {5}, {6}, {7}
union(2,4) union(2,3) union(5,6) union(6,7)
We have {1}, {2,4,3}, {5,6,7}
findset(2) returns 4.

Disjoint sets
We represent the disjoint-set abstract data type as an arbitrary tree with the marked element as the root. Note that such a tree is not necessarily binary.

Example: The sets {1}, {2,4,3}, {5,6,7} may be represented as follows (note that there is more than one way to represent these sets):

[Figure: three trees with roots 1, 4 and 6; the remaining nodes are 2, 3, 5 and 7.]

Disjoint sets
We then use an array parent to represent such a tree; parent[i] is the parent of i in the corresponding tree, unless i is the root, in which case parent[i] is i. We can also use a single array to represent a collection of such trees (a forest).

Example: The sets {1}, {2,4,3}, {5,6,7}, represented as trees with roots 1, 4 and 6, have a corresponding parent array.

[Figure: the three trees and their parent array.]

Algorithm 3.6.4 Makeset, Version 1


This algorithm represents the set {i} as a one-node tree.

Input Parameter: i
Output Parameters: None

makeset1(i) {
   parent[i] = i
}

Algorithm 3.6.5 Findset, Version 1


This algorithm returns the root of the tree to which i belongs.

Input Parameter: i
Output Parameters: None

findset1(i) {
   while (i != parent[i])
      i = parent[i]
   return i
}

Algorithm 3.6.6 Mergetrees, Version 1


This algorithm receives as input the roots of two distinct trees and combines them by making one root a child of the other root.

Input Parameters: i,j
Output Parameters: None

mergetrees1(i,j) {
   parent[i] = j
}

Algorithm 3.6.8 Union, Version 1


This algorithm receives as input two arbitrary values i and j and constructs the tree that represents the union of the sets to which i and j belong. The algorithm assumes that i and j belong to different sets.

Input Parameters: i,j
Output Parameters: None

union1(i,j) {
   mergetrees1(findset1(i), findset1(j))
}
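The version 1 algorithms translate almost line for line into Python. In this sketch the parent array is a dictionary, and depth is a hypothetical helper that counts the edges from a node to its root; the sequence of unions is chosen to show how tall a version 1 tree can become.

```python
parent = {}

def makeset1(i):
    parent[i] = i

def findset1(i):
    while i != parent[i]:
        i = parent[i]
    return i

def mergetrees1(i, j):
    parent[i] = j

def union1(i, j):
    mergetrees1(findset1(i), findset1(j))

def depth(i):
    """Number of edges on the path from i to the root of its tree."""
    d = 0
    while i != parent[i]:
        i = parent[i]
        d += 1
    return d

n = 7
for i in range(1, n + 1):
    makeset1(i)
for i in range(2, n + 1):
    union1(1, i)     # each union hangs the current root under a new node
```

With these unions the tree degenerates into the chain 1, 2, ..., 7, so node 1 lies n-1 edges below the root.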

Q: What is the height of a tree constructed by the previous algorithms?
A: At most n-1, where n is the number of elements in the set.

It is more desirable for a tree representing a disjoint set to have a smaller height, as the algorithms findset and union would then be more efficient. The following algorithms guarantee that the height of a tree with n nodes is at most lg n.

Algorithm 3.6.9 Makeset, Version 2


This algorithm represents the set {i} as a one-node tree and initializes its height to 0.

Input Parameter: i
Output Parameters: None

makeset2(i) {
   parent[i] = i
   height[i] = 0
}

Algorithm 3.6.10 Findset, Version 2


This algorithm returns the root of the tree to which i belongs.

Input Parameter: i
Output Parameters: None

findset2(i) {
   while (i != parent[i])
      i = parent[i]
   return i
}

Algorithm 3.6.11 Mergetrees, Version 2


This algorithm receives as input the roots of two distinct trees and combines them by making the root of the tree of smaller height a child of the other root. If the trees have the same height, we arbitrarily make the root of the first tree a child of the other root.

Input Parameters: i,j
Output Parameters: None

mergetrees2(i,j) {
   if (height[i] < height[j])
      parent[i] = j
   else if (height[i] > height[j])
      parent[j] = i
   else {
      parent[i] = j
      height[j] = height[j] + 1
   }
}

Algorithm 3.6.12 Union, Version 2


This algorithm receives as input two arbitrary values i and j and constructs the tree that represents the union of the sets to which i and j belong. The algorithm assumes that i and j belong to different sets.

Input Parameters: i,j
Output Parameters: None

union2(i,j) {
   mergetrees2(findset2(i), findset2(j))
}
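As a sketch, the version 2 algorithms can be run on the earlier example (seven singletons followed by four unions). Dictionaries stand in for the parent and height arrays; that representation is an assumption of this sketch.

```python
parent, height = {}, {}

def makeset2(i):
    parent[i] = i
    height[i] = 0

def findset2(i):
    while i != parent[i]:
        i = parent[i]
    return i

def mergetrees2(i, j):
    if height[i] < height[j]:
        parent[i] = j
    elif height[i] > height[j]:
        parent[j] = i
    else:
        parent[i] = j            # equal heights: first root goes under second
        height[j] += 1

def union2(i, j):
    mergetrees2(findset2(i), findset2(j))

for i in range(1, 8):
    makeset2(i)
for a, b in [(2, 4), (2, 3), (5, 6), (6, 7)]:
    union2(a, b)
```

With these rules the marked elements come out as 1, 4 and 6, and findset2(2) returns 4, in agreement with the example.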

Can we do better than this?


We can apply the so-called path compression in the findset algorithm. When executing findset(i) , we make every node on the path from i to the root a child of the root (except, of course, the root itself). Example:
[Figure: the original tree, and the tree after findset(7) is called.]

The following algorithms incorporate path compression. Note that the algorithms use rank instead of height. As path compression can decrease the height of the tree, this parameter does not necessarily denote the height of the tree. Rather, it denotes an upper bound on the height, which we call rank.

Algorithm 3.6.16 Makeset, Version 3


This algorithm represents the set {i} as a one-node tree and initializes its rank to 0.

Input Parameter: i
Output Parameters: None

makeset3(i) {
   parent[i] = i
   rank[i] = 0
}

Algorithm 3.6.17 Findset, Version 3


This algorithm returns the root of the tree to which i belongs and makes every node on the path from i to the root, except the root itself, a child of the root.

Input Parameter: i
Output Parameters: None

findset3(i) {
   root = i
   while (root != parent[root])
      root = parent[root]
   j = parent[i]
   while (j != root) {
      parent[i] = root
      i = j
      j = parent[i]
   }
   return root
}

Algorithm 3.6.18 Mergetrees, Version 3

This algorithm receives as input the roots of two distinct trees and combines them by making the root of the tree of smaller rank a child of the other root. If the trees have the same rank, we arbitrarily make the root of the first tree a child of the other root.

Input Parameters: i,j
Output Parameters: None

mergetrees3(i,j) {
   if (rank[i] < rank[j])
      parent[i] = j
   else if (rank[i] > rank[j])
      parent[j] = i
   else {
      parent[i] = j
      rank[j] = rank[j] + 1
   }
}

Algorithm 3.6.19 Union, Version 3


This algorithm receives as input two arbitrary values i and j and constructs the tree that represents the union of the sets to which i and j belong. The algorithm assumes that i and j belong to different sets.

Input Parameters: i,j
Output Parameters: None

union3(i,j) {
   mergetrees3(findset3(i), findset3(j))
}
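The effect of path compression can be observed with a small Python sketch of the version 3 algorithms. The eight elements and the order of the unions are made up for illustration: they build a tree in which node 1 sits three edges below the root, and a single findset3(1) then reattaches the nodes on that path directly to the root.

```python
parent, rank = {}, {}

def makeset3(i):
    parent[i] = i
    rank[i] = 0

def findset3(i):
    root = i
    while root != parent[root]:
        root = parent[root]
    j = parent[i]
    while j != root:             # second pass: compress the path
        parent[i] = root
        i = j
        j = parent[i]
    return root

def mergetrees3(i, j):
    if rank[i] < rank[j]:
        parent[i] = j
    elif rank[i] > rank[j]:
        parent[j] = i
    else:
        parent[i] = j
        rank[j] += 1

def union3(i, j):
    mergetrees3(findset3(i), findset3(j))

for i in range(1, 9):
    makeset3(i)
for a, b in [(1, 2), (3, 4), (2, 4), (5, 6), (7, 8), (6, 8), (4, 8)]:
    union3(a, b)

path_before = [1, parent[1], parent[parent[1]], parent[parent[parent[1]]]]
root = findset3(1)
```

The rank of the root stays at 3 even though the tree is now shallower, which is why rank is only an upper bound on the height.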

Binary Search
Binary search is a very efficient algorithm for searching in a sorted array. In fact, it is an optimal algorithm: no other algorithm that uses only comparisons has a better worst-case time than binary search.

Binary Search - Example


Index:  0  1  2  3   4   5   6   7   8   9   10  11  12
Value:  2  4  7  11  13  17  22  23  24  35  36  38  43

Find 11.
i=0; j=12; k=(i+j)/2=6; 22>11 -> j=k-1=5
i=0; j=5;  k=(i+j)/2=2; 7<11  -> i=k+1=3
i=3; j=5;  k=(i+j)/2=4; 13>11 -> j=k-1=3
i=3; j=3;  k=(i+j)/2=3; 11=11 -> found at index 3

Algorithm 4.1.1 Binary Search

This algorithm searches for the value key in the nondecreasing array L[i], ... , L[j]. If key is found, the algorithm returns an index k such that L[k] equals key. If key is not found, the algorithm returns -1, which is assumed not to be a valid index.

Input Parameters: L,i,j,key
Output Parameters: None

bsearch(L,i,j,key) {
   while (i <= j) {
      k = (i + j)/2
      if (key == L[k]) // found
         return k
      if (key < L[k]) // search first part
         j = k - 1
      else // search second part
         i = k + 1
   }
   return -1 // not found
}
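A Python sketch of the same loop is given below. The optional probes list is an addition made here, not part of Algorithm 4.1.1; it records each midpoint k so that the trace of the worked example (find 11) can be reproduced.

```python
def bsearch(L, i, j, key, probes=None):
    """Return an index k in i..j with L[k] == key, or -1 if there is none."""
    while i <= j:
        k = (i + j) // 2
        if probes is not None:
            probes.append(k)     # remember which cell was examined
        if key == L[k]:
            return k
        if key < L[k]:
            j = k - 1            # search first part
        else:
            i = k + 1            # search second part
    return -1

L = [2, 4, 7, 11, 13, 17, 22, 23, 24, 35, 36, 38, 43]
probes = []
found = bsearch(L, 0, len(L) - 1, 11, probes)
missing = bsearch(L, 0, len(L) - 1, 5)
```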

Correct or not?
bsearch(L,i,j,key) {
   while (i <= j) {
      k = (i + j)/2
      if (key == L[k])
         return k
      if (key < L[k])
         j = k
      else
         i = k
   }
   return -1
}

If correct, what is the worst-case time?


The algorithm is not correct. When the key is not in the array, the algorithm does not terminate.

Correct or not?
bsearch(L,i,j,key) {
   if (i > j)
      return -1
   k = (i + j)/2
   if (key == L[k])
      return k
   flag = bsearch(L,i,k-1,key)
   if (flag == -1)
      return bsearch(L,k+1,j,key)
   else
      return flag
}

If correct, what is the worst-case time?


The algorithm is correct but it is not very efficient. It always searches through the lower half of the array first, regardless of how key compares with L[k], and thus the worst-case time is Θ(n).
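The linear behaviour can be observed by counting probes. In the Python sketch below, the counter argument is an addition for illustration (a one-element list used as a mutable counter); the rest follows the recursive pseudocode.

```python
def bsearch_slow(L, i, j, key, counter):
    if i > j:
        return -1
    counter[0] += 1              # one examination of L[k]
    k = (i + j) // 2
    if key == L[k]:
        return k
    flag = bsearch_slow(L, i, k - 1, key, counter)
    if flag == -1:
        return bsearch_slow(L, k + 1, j, key, counter)
    return flag

L = [2, 4, 7, 11, 13, 17, 22, 23, 24, 35, 36, 38, 43]
hit_counter, miss_counter = [0], [0]
hit = bsearch_slow(L, 0, len(L) - 1, 11, hit_counter)
miss = bsearch_slow(L, 0, len(L) - 1, 5, miss_counter)
```

When the key is absent, every cell of the array is examined exactly once.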

Correct or not?
bsearch(L,i,j,key) {
   if (i > j)
      return -1
   k = (i + j)/2
   if (key == L[k])
      return k
   if (key < L[k])
      return bsearch(L,i,k,key)
   else
      return bsearch(L,k+1,j,key)
}

If correct, what is the worst-case time?


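The algorithm is not correct. It does not always terminate. For example, when i = j and key < L[i], we get k = i and the recursive call bsearch(L,i,k,key) repeats with the same arguments.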

Binary Search: Complexity for successful search


Best-case: C_best(n) = 1 = Θ(1)
Worst-case: C_worst(n) = ⌊lg n⌋ + 1 = Θ(lg n)
Average-case: C_avg(n) ≈ lg n = Θ(lg n)

Prove the above!

Can we do better than binary search, that is, better than Θ(lg n)?

Interpolation search
Binary search compares a search key with the middle value of the sorted array. Interpolation search takes into account the values of the search key and the smallest and largest element in the array, and estimates where the search key would be if it is in the array. Interpolation search assumes that values in the array increase linearly with the index.

Interpolation search
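[Figure: the straight line through the points (i, a(i)) and (j, a(j)); x is the index at which the line reaches the value key.]

(x - i)/(key - a(i)) = (j - i)/(a(j) - a(i))

x = i + (key - a(i))(j - i)/(a(j) - a(i))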
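The formula for x leads to the following Python sketch of interpolation search. It is one possible formulation, not taken from the text: it assumes integer keys in a nondecreasing list, rounds x down with integer division, and stops as soon as key falls outside the range a[i]..a[j].

```python
def interpolation_search(a, key):
    """Return an index x with a[x] == key, or -1 if key is not in a."""
    i, j = 0, len(a) - 1
    while i <= j and a[i] <= key <= a[j]:
        if a[i] == a[j]:
            return i             # here key must equal a[i]
        # Estimated position of key on the line through (i, a[i]), (j, a[j]).
        x = i + (key - a[i]) * (j - i) // (a[j] - a[i])
        if a[x] == key:
            return x
        if a[x] < key:
            i = x + 1
        else:
            j = x - 1
    return -1

evenly_spaced = list(range(0, 100, 5))
lecture_array = [2, 4, 7, 11, 13, 17, 22, 23, 24, 35, 36, 38, 43]
```

On evenly spaced values the first estimate already hits the key, which matches the assumption that values increase linearly with the index.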

Interpolation search
The worst-case complexity of interpolation search is Θ(n). However, in an array with random keys, the average-case complexity is Θ(lg lg n).

Hashing
Hashing distributes the values of the array evenly among elements of the hash array. This is done by computing a hash function. For example, if the elements of the array are integers, the hash function can be h(K) = K mod m. The hash values will be integers between 0 and m-1, inclusive.

Hashing
Let n be the size of the original array, and m the size of the hash array. Whenever m < n we will certainly have collisions, that is, two or more elements of the original array being hashed into the same cell of the hash array.

Hashing
One way to deal with collisions is to have a linked list for each cell of the hash array that gets more than one element of the original array. Then for a good hash function, that is, the function that distributes the values evenly among elements of the hash array, the average number of comparisons for a successful search will be 1+n/(2m). The drawback of hashing is the extra space required for the hash array.
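A minimal Python sketch of hashing with chaining, using the hash function h(K) = K mod m from the earlier slide. The class name ChainedHashTable is hypothetical, and Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Hash array of m cells; each cell holds a chain of colliding keys."""

    def __init__(self, m):
        self.m = m
        self.cells = [[] for _ in range(m)]

    def h(self, K):
        return K % self.m        # hash values lie between 0 and m - 1

    def insert(self, K):
        chain = self.cells[self.h(K)]
        if K not in chain:
            chain.append(K)

    def contains(self, K):
        # Only the chain in cell h(K) is compared against K.
        return K in self.cells[self.h(K)]

table = ChainedHashTable(5)
for K in (12, 7, 22, 30):
    table.insert(K)
```

Here 12, 7 and 22 all hash to cell 2 and share one chain, while 30 hashes to cell 0.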
