UNIT 4


ALGORITHMS – ANALYSIS AND DESIGN

Lekha A
Computer Applications

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Introduction
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Greedy Method

• Prim’s Algorithm

• Kruskal’s Algorithm

• Dijkstra’s Algorithm

• Huffman Trees
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Dynamic Programming

• Computing Binomial co-efficient

• Warshall’s Algorithm

• Floyd’s Algorithm

• Knapsack Problem

• Memory Functions
ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer

• This design method is based on the idea of transformation.

• It works as a two-stage procedure.

• In the transformation stage

• The problem's instance is modified so that it is more amenable to solution.

• In the conquering stage

• The problem is solved.


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Three major variations

• Instance simplification

• Transformation to a simpler or more convenient

instance of the same problem


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Three major variations

• Representation change

• Transformation to a different representation of the

same instance
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Three major variations

• Problem reduction

• Transformation to an instance of a different problem

for which an algo is already available.


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Problem's instance → (simpler instance, or another representation, or another problem's instance) → Solution
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Transform and Conquer


Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Presort
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Transform and Conquer


ALGORITHMS – ANALYSIS AND DESIGN
Presorting

• "Presorting" is a common example of "instance simplification."

• Presorting is sorting ahead of time, to make repetitive

solutions faster.

• Presorting is a form of preconditioning.

• Preconditioning is manipulating the data to make the

algorithm faster.
ALGORITHMS – ANALYSIS AND DESIGN
Presorting

• Example

• If many kth order statistics must be found in an array, it might make sense to sort the array ahead of time so that the cost of determining each statistic is constant.


ALGORITHMS – ANALYSIS AND DESIGN
Presorting

• Some examples include

• Checking element uniqueness in an array

• Computing a mode

• Searching problem
ALGORITHMS – ANALYSIS AND DESIGN
Checking element uniqueness in an array

• Algorithm

• Sort the array

• Check only consecutive elements

• If the array has equal elements, a pair of them must be next to each other.
ALGORITHMS – ANALYSIS AND DESIGN
PresortElementUniqueness(A[0..n-1])

• //Solves the element uniqueness problem by sorting the

//array first.

• //Input: An array A [0..n - 1] of ordered elements.

• //Output: Returns "true" if A contains no equal elements,

//otherwise returns false.


ALGORITHMS – ANALYSIS AND DESIGN
PresortElementUniqueness(A[0..n-1])

• Sort the array A.

• for i ← 0 to n − 2 do

• if A[i] = A[i + 1] return false

• return true
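The pseudocode above maps directly to a short Python sketch (the function name is mine, not from the slides):

```python
def presort_element_uniqueness(a):
    """Element uniqueness by instance simplification:
    sort first, then compare consecutive elements."""
    a = sorted(a)                 # transformation stage: O(n log n)
    for i in range(len(a) - 1):   # conquering stage: O(n) scan
        if a[i] == a[i + 1]:
            return False          # equal elements are adjacent after sorting
    return True
```

For example, presort_element_uniqueness([3, 1, 2]) returns True, while [3, 1, 3] gives False.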
ALGORITHMS – ANALYSIS AND DESIGN
PresortElementUniqueness(A[0..n-1]) - Analysis

• The total time is the sum of the time spent on sorting and the

time spent on checking consecutive elements.

• Sorting requires about n log₂ n comparisons

• Checking requires no more than n − 1 comparisons

• T(n) = Tsort(n) + Tscan(n) ∈ Θ(n log n) + Θ(n) = Θ(n log n)


ALGORITHMS – ANALYSIS AND DESIGN
Computing Mode

• A mode is a value that occurs most often in a given list of

numbers.

• Example

• 5, 5, 2, 3, 5, 8, 8, 7

• Here 5 is the mode.
ALGORITHMS – ANALYSIS AND DESIGN
Computing a mode – Brute Force

• Compute the frequencies of all its distinct values.

• Later find the value with the largest frequency.

• To implement this idea

• Store the values already encountered along with their

frequencies in a separate list.


ALGORITHMS – ANALYSIS AND DESIGN
Computing a mode – Brute Force

• On each iteration the ith element of the original list is

compared with the values already encountered by

traversing this auxiliary list.

• If a match is found, its frequency is incremented; otherwise the

current element is added to the list of distinct values

seen so far with a frequency of 1.


ALGORITHMS – ANALYSIS AND DESIGN
Computing a mode – Brute Force – Analysis

• Worst case scenario

• There are no equal elements

• Here the ith element is compared with i – 1 elements that

have been added to the list with a frequency of 1.


C(n) = Σ_{i=1}^{n} (i − 1) = 0 + 1 + ... + (n − 1) = n(n − 1)/2 ∈ Θ(n²)
ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1])

• //Computes the mode of an array by sorting it first.

• //Input: An array A[0…n-1] of orderable elements.

• //Output: The array’s mode.


ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1])

• Sort the array A.

• i ← 0 //current run begins at position i

• modefrequency ← 0 //highest frequency seen so far

• while i ≤ n − 1 do

• runlength ← 1

• runvalue ← A[i]
ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1])

• while i + runlength ≤ n − 1 and A[i + runlength] = runvalue do

• runlength ← runlength + 1

• if runlength > modefrequency

• modefrequency ← runlength

• modevalue ← runvalue

• i ← i+ runlength

• return modevalue
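The run-scanning idea in the pseudocode can be sketched in Python as follows (names are illustrative):

```python
def presort_mode(a):
    """Mode by presorting: sort, then scan maximal runs of equal values."""
    a = sorted(a)
    i, mode_frequency, mode_value = 0, 0, None
    while i < len(a):
        run_length, run_value = 1, a[i]
        while i + run_length < len(a) and a[i + run_length] == run_value:
            run_length += 1
        if run_length > mode_frequency:     # longest run seen so far wins
            mode_frequency, mode_value = run_length, run_value
        i += run_length                     # jump to the start of the next run
    return mode_value
```

On the earlier example, presort_mode([5, 5, 2, 3, 5, 8, 8, 7]) returns 5.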
ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1]) - Tracing

• For 5, 5, 2, 3, 5, 8, 8, 7: the sorted array is 2, 3, 5, 5, 5, 7, 8, 8; the run lengths are 1, 1, 3, 1, 2, so the mode is 5.

ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1]) - Analysis

• The total time is the sum of the time spent on sorting and the

time spent on checking consecutive elements.

• Sorting requires about n log₂ n comparisons

• Checking requires no more than n comparisons

• T(n) = Tsort(n) + Tscan(n) ∈ Θ(n log n) + Θ(n) = Θ(n log n)


ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Presorting

• Checking uniqueness of elements

• Computing Mode
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Balanced Search Trees
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Presorting

• Checking uniqueness of elements

• Computing Mode
ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• Binary Search Trees

• It is a binary tree whose nodes contain elements of a set of

orderable items, one element per node, so that all

elements in the left subtree are smaller than the element

in the subtree’s root, and all the elements in the right

subtree are greater than it.

• These trees have to be balanced.


ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• The first approach is of the instance-simplification variety: an

unbalanced binary search tree is transformed into a balanced

one.

• Such trees are called self-balancing.


ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• An AVL tree requires that the difference between the heights of

the left and right subtrees of every node never exceed 1.


ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• A red-black tree tolerates the height of one subtree being

twice as large as the other subtree of the same node.

• If an insertion or deletion of a new node creates a tree

with a violated balance requirement, the tree is

restructured by one of a family of special

transformations called rotations that restore the

balance required.
ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• The second approach is of the representation-change variety:

allow more than one element in a node of a search tree.

• Specific cases of such trees are 2-3 trees, 2-3-4 trees, and

more general and important B-trees.

• They differ in the number of elements admissible in a

single node of a search tree, but all are perfectly

balanced.
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

• AVL trees were invented in 1962 by two Russian scientists, G.

M. Adelson-Velsky and E. M. Landis after whom this data

structure is named.
Fig 1 – Adapted from https://en.wikipedia.org/wiki/Georgy_Adelson-Velsky

Fig 2 – Adapted from https://en.wikipedia.org/wiki/Evgenii_Landis
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

• An AVL tree is a binary search tree in which the balance factor

of every node, which is defined as the difference between the

heights of the node’s left and right subtrees, is either 0 or +1

or −1.
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

• The height of the empty tree is defined as −1.

• The balance factor can also be computed as the difference

between the numbers of levels rather than the height

difference of the node’s left and right subtrees.


ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

[Figure: an example AVL tree with the balance factor shown at each node]
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees - Rotation

• When the insertion of a new node makes an AVL tree unbalanced, a rotation is used to transform the tree.

• A rotation is a local transformation of the subtree rooted at a node whose balance factor has become either +2 or −2.

• If there are several such nodes

• Rotate the subtree rooted at the unbalanced node that is closest to the newly inserted leaf.


ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees - Rotation

• Four kinds

• Single right rotation

• Single left rotation

• Double left-right rotation

• Double right-left rotation


ALGORITHMS – ANALYSIS AND DESIGN
Single Right Rotation

• Also known as R-rotation

• It is performed after inserting a new key into the left subtree

of the left child of a tree whose root had a balance of +1 before

the insertion.
ALGORITHMS – ANALYSIS AND DESIGN
Single Right Rotation

[Figure: keys 3, 2, 1 inserted in that order; the root 3 has balance +2 before the rotation]

• 2 becomes the new root.

• 3 takes ownership of 2’s right child as its left child. Here the

right child of 2 is null.

• 2 takes ownership of 3 as its right child.
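The three steps can be sketched with a minimal node class (an illustrative sketch; Node and rotate_right are assumed names, not part of the slides, and this is not a full AVL implementation):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(root):
    """Single R-rotation: the left child becomes the new subtree root."""
    new_root = root.left          # 2 becomes the new root
    root.left = new_root.right    # 3 adopts 2's right child as its left child
    new_root.right = root         # 2 takes ownership of 3 as its right child
    return new_root

# Keys 3, 2, 1 inserted in decreasing order form a left chain:
unbalanced = Node(3, left=Node(2, left=Node(1)))
balanced = rotate_right(unbalanced)
# balanced.key == 2, balanced.left.key == 1, balanced.right.key == 3
```

A full AVL insertion would also update heights and pick the rotation by balance factor; the slides cover those cases next.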


ALGORITHMS – ANALYSIS AND DESIGN
Single Right Rotation

[Figure: after the R-rotation, 2 is the root with children 1 and 3, and all balance factors are 0]
ALGORITHMS – ANALYSIS AND DESIGN
Single Left Rotation

• It is a symmetric rotation known as L-rotation.

• It is the mirror image of the single R-rotation.

• It is performed after a new node is inserted into the right

subtree of the right child of a tree whose root has a balance of

-1 before the insertion


ALGORITHMS – ANALYSIS AND DESIGN
Single Left Rotation

[Figure: keys 1, 2, 3 inserted in that order; the root 1 has balance −2 before the rotation]
• 2 becomes the new root.

• 1 takes ownership of 2’s left child as its right child. Here the

left child of 2 is null.

• 2 takes ownership of 1 as its left child.


ALGORITHMS – ANALYSIS AND DESIGN
Single Left Rotation

[Figure: after the L-rotation, 2 is the root with children 1 and 3, and all balance factors are 0]
ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• Also known as LR rotation.

• Is a combination of two rotations

• Perform L rotation on the left subtree of root r followed

by the R-rotation of the new tree rooted at r.

• Is performed after a new key is inserted into the right subtree

of the left child of the tree whose root has a balance of 1

before the insertion.


ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

[Figure: keys 3, 1, 2 inserted in that order; the root 3 has balance +2 and its left child 1 has balance −1]

• Consider first the subtree rooted at 1, with 2 as its right child.
ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• 2 becomes the new root.

• 1 takes ownership of 2’s left child as its right child. Here the

left child of 2 is null.

• 2 takes ownership of 1 as its left child.

ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• Now the tree is the left chain 3 → 2 → 1, which a single R-rotation balances.
ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• Applying R rotation

• 2 becomes the new root.

• 3 takes ownership of 2’s right child as its left child. Here the

right child of 2 is null.

• 2 takes ownership of 3 as its right child.

[Figure: final tree: 2 is the root with children 1 and 3]
ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Also known as RL rotation.

• Is a mirror image of the double LR rotation.

• Is performed after a new key is inserted into the left subtree

of the right child of the tree whose root has a balance of -1

before the insertion.


ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

[Figure: keys 1, 3, 2 inserted in that order; the root 1 has balance −2 and its right child 3 has balance +1]

• Consider first the subtree rooted at 3, with 2 as its left child.
ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Applying R rotation

• 2 becomes the new root.

• 3 takes ownership of 2’s right child as its left child. Here the

right child of 2 is null.

• 2 takes ownership of 3 as its right child.


ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Now the tree is the right chain 1 → 2 → 3, which a single L-rotation balances.
ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Applying L rotation

• 2 becomes the new root.

• 1 takes ownership of 2’s left child as its right child. Here the

left child of 2 is null.


• 2 takes ownership of 1 as its left child.

[Figure: final tree: 2 is the root with children 1 and 3]
ALGORITHMS – ANALYSIS AND DESIGN
AVL Tree - Efficiency

• The major factor involved is the height of the tree.

• The height of the AVL tree satisfies the inequalities


⌊log₂ n⌋ ≤ h < 1.4405 log₂(n + 2) − 1.3277

• It implies that the operations of searching and insertion are

Θ(log n) in a worst-case scenario.


ALGORITHMS – ANALYSIS AND DESIGN
AVL Tree - Efficiency

• Searching in an AVL tree requires almost the same number of

comparisons as searching in a sorted array using binary

search.
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Balanced Search Trees • Efficiency

• AVL Trees

• Rotation

• Single Right

• Single Left

• Double Right-Left Rotation

• Double Left-Right Rotation


Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Balanced Search Trees: 2-3 Trees
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Balanced Search Trees • Efficiency

• AVL Trees

• Rotation

• Single Right

• Single Left

• Double Right-Left Rotation

• Double Left-Right Rotation


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• It follows the representation-change variety: allow more than

one element in a node of a search tree.

• The simplest implementation of this idea is 2-3 trees,

introduced by the U.S. computer scientist John Hopcroft in

1970
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• John Edward Hopcroft (born October 7, 1939) is an American

theoretical computer scientist.

Fig 3 – Adapted from https://en.wikipedia.org/wiki/John_Hopcroft
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• It can have nodes of two kinds

• 2-nodes

• 3-nodes
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 2-node

• Contains a single key K and has two children.

• The left child serves as the root of a subtree whose keys are less than K.

• The right child serves as the root of a subtree whose keys are greater than K.
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 2-node

[Figure: a 2-node holding key K; left subtree keys < K, right subtree keys > K]
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 3-node

• Contains two ordered keys K1 and K2 with K1<K2.

• It has three children.

• The left most child serves as the root of a subtree

whose keys are less than K1.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 3-node

• Middle child serves as the root of a subtree whose keys

are between K1 and K2.

• Right most child serves as the root of a subtree whose

keys are greater than K2.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 3-node

K1, K2

< K1 (K1, K2) > K2


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• All the leaves of the 2-3 tree must be on the same level.

• 2-3 tree is always perfectly balanced.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching

• Start at the root.

• If the root is a 2-node, the search continues as in a BST.

• Stop if K is equal to the root’s key or continue the search in the

left or the right subtree if K is smaller or larger than the root’s

key.
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching

• If the root is a 3-node, no more than 2 key comparisons determine whether

• the search stops because K is equal to one of the root's keys,

• or in which of the root's three subtrees the search has to be continued.
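The search can be sketched in Python, assuming a simple dictionary representation of nodes ({'keys': [...], 'children': [...]}) that is mine, not the slides':

```python
def search_23(node, k):
    """Search for key k in a 2-3 tree.
    A node is a dict {'keys': [...], 'children': [...]};
    leaves have no 'children' entry."""
    if node is None:
        return False
    keys, children = node['keys'], node.get('children', [])
    # At most two comparisons decide: stop, or pick one of the subtrees.
    for i, key in enumerate(keys):
        if k == key:
            return True
        if k < key:
            return search_23(children[i], k) if children else False
    return search_23(children[-1], k) if children else False

# The final tree from the insertion example below: root (5),
# children (3) and (8), leaves (2), (4), (7), (9).
leaf = lambda k: {'keys': [k]}
tree = {'keys': [5], 'children': [
    {'keys': [3], 'children': [leaf(2), leaf(4)]},
    {'keys': [8], 'children': [leaf(7), leaf(9)]},
]}
```

Here search_23(tree, 7) finds the key while search_23(tree, 6) does not.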
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• A new key K is inserted in a leaf (unless the tree is empty).

• The appropriate leaf is found by performing the search for K.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• If the leaf is a

• 2-node

• Insert K as the first key if K is smaller than the node's old key.

• Insert K as the second key if K is larger than the node's old key.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• 3-node

• Split the leaf into two.

• The smallest of the three keys (two old ones and the new

key) is put in the first leaf

• The largest key is put in the second leaf.

• The middle key is promoted to the old leaf’s parent.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• List: 9, 5, 8, 3, 2, 4, 7

• Insert 9: (9). Insert 5: (5, 9). Insert 8: (5, 8, 9) has to split; 8 is promoted, giving root (8) with leaves (5) and (9).

• Insert 3: leaves (3, 5) and (9). Insert 2: (2, 3, 5) has to split; 3 is promoted, giving root (3, 8) with leaves (2), (5) and (9).

• Insert 4: the middle leaf becomes (4, 5).
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• Insert 7: (4, 5, 7) has to split; 5 is promoted into (3, 8), which becomes (3, 5, 8) and has to split in turn, promoting 5 to a new root.

• Final tree: root (5) with children (3) and (8), and leaves (2), (4), (7) and (9).
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching - Analysis

• The efficiency depends on the height of the tree

• A 2-3 tree of height h with the smallest number of keys is a full

tree of 2-nodes.

• For any 2-3 tree of height h with n nodes

n  1 + 2 + ........ + 2 h

h +1
= 2 −1
hence h  log 2 (n + 1) −1
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching - Analysis

• A 2-3 tree of height h with the largest number of keys is a full

tree of 3-nodes each with two keys and three children.

• For any 2-3 tree with n nodes

• n  2.1 + 2.3 + ........ + 2.3 h

= 2(1 + 3 + ........ + 3)
= 3h +1 −1`
hence h  log 3 (n + 1) −1
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching - Analysis

• The lower and upper bounds on height h

log₃(n + 1) − 1 ≤ h ≤ log₂(n + 1) − 1


• This implies that the time efficiencies of searching, insertion

and deletion are all in Θ(log n) in both the worst and average

case.
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• 2-3 Trees

• Searching

• Insertion

• Analysis
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Heaps
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• 2-3 Trees

• Searching

• Insertion

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

• Heapsort was invented by J. W. J. Williams in 1964.

• John William Joseph Williams (2 September 1930 – 29

September 2012) was a Welsh-Canadian computer scientist

best known for inventing in 1964 heapsort and the binary

heap data structure

Fig 4 – Adapted from https://en.wikipedia.org/wiki/J._W._J._Williams
ALGORITHMS – ANALYSIS AND DESIGN
Heap

• Heap is a partially ordered data structure that is suitable for

implementing priority queues.

• It can be defined as a binary tree with keys assigned to its

nodes, provided the following two conditions are met:

• The tree’s shape requirement

• The parental dominance requirement


ALGORITHMS – ANALYSIS AND DESIGN
Heap

• The tree’s shape requirement

• The binary tree is essentially complete

• All its levels are full except possibly the last level where

only some rightmost leaves are missing.

• The parental dominance requirement

• The key at each node is greater than or equal to the

keys at its children.


ALGORITHMS – ANALYSIS AND DESIGN
Heap

[Figure: three trees; the first satisfies both requirements and is a heap, while the other two violate the shape or the parental dominance requirement and are not heaps]
ALGORITHMS – ANALYSIS AND DESIGN
Heap

• The key values in a heap are ordered top down.

• The sequence of values on any path from the root to a leaf is non-increasing.

• There is no left-to-right order in key values.

• There is no relationship among the key values of nodes on the same level of the tree.


ALGORITHMS – ANALYSIS AND DESIGN
Heap - Properties

• There exists exactly one essentially complete binary tree with

n nodes.

• Its height is equal to ⌊log₂ n⌋.

• The root of the heap always contains the largest element (in a max-heap) or the smallest element (in a min-heap).

• A node of the heap considered with all its descendants is also

a heap.
ALGORITHMS – ANALYSIS AND DESIGN
Heap - Properties

• A heap can be implemented as an array by recording its

elements in the top-down, left-to-right fashion.

• It is convenient to store the heap’s elements in positions 1

thru n of such an array leaving H[0] either unused or

putting there a sentinel element whose value is greater

than every element in the heap.


ALGORITHMS – ANALYSIS AND DESIGN
Heap - Properties

• Here

• The parental node keys will be in the first ⌊n/2⌋ positions of the array, while the leaf keys will occupy the last ⌈n/2⌉ positions.

• The children of a key in the array's parental position i (1 ≤ i ≤ ⌊n/2⌋) will be in positions 2i and 2i + 1.

• Correspondingly, the parent of a key in position i (2 ≤ i ≤ n) will be in position ⌊i/2⌋.


ALGORITHMS – ANALYSIS AND DESIGN
Heap

• Heap can be defined as an array H[1..n] where every element

in position i in the first half of the array is greater than or

equal to the elements in positions 2i and 2i+1

H[i] ≥ max{H[2i], H[2i + 1]} for i = 1, ..., ⌊n/2⌋
ALGORITHMS – ANALYSIS AND DESIGN
Heaps

• Max heap

• It is a complete binary tree in which the value in each

internal node is greater than or equal to the values in the

children of that node.


ALGORITHMS – ANALYSIS AND DESIGN
Heaps

• Min heap

• It is a complete binary tree in which the value in each

internal node is less than or equal to the values in the children

of that node.
ALGORITHMS – ANALYSIS AND DESIGN
Heap Construction

• Two approaches

• Bottom-up heap construction

• Top-down heap construction


ALGORITHMS – ANALYSIS AND DESIGN
Bottom-up Heap construction

• Initializes the complete binary tree with n nodes placing keys

in the order given.


ALGORITHMS – ANALYSIS AND DESIGN
Bottom-up Heap construction

• Heapify the tree as follows

• Starts with the parental nodes while checking whether the

parental dominance holds for the key of this node.

• If it does not, it exchanges the node's key K with the larger

key of its children.

• It then checks whether parental dominance holds for K in

its new position.


ALGORITHMS – ANALYSIS AND DESIGN
Bottom-up Heap construction

• The process continues until the parental dominance

requirement for K is satisfied.

• The algorithm proceeds to do the same for the node's immediate

predecessor.

• The algorithm stops after this is done for the tree's root.
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n])

• //Constructs a heap from the elements of a given array by the

// bottom – up algorithm

• // Input: An array H[1..n] of orderable items

• // Output: A heap H[1..n]


ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n])

• for i ← [n/2] down to 1 do

• k ← i; v ← H [k]

• heap ← false

• while not heap and 2*k ≤ n do

• j ← 2*k

• if j < n // there are two children

• if H [ j ] < H [ j + 1] then j ← j + 1
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n])

• if v ≥ H[j] then heap ← true

• else

• H[k] ← H[j]

• k ← j

• H[k] ← v
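The pseudocode translates line for line into Python; the 1-based indexing with an unused H[0] slot is kept, and the function name is mine:

```python
def heap_bottom_up(h):
    """Bottom-up max-heap construction on a 1-based array (h[0] unused)."""
    n = len(h) - 1
    for i in range(n // 2, 0, -1):        # last parental node down to the root
        k, v = i, h[i]
        heap = False
        while not heap and 2 * k <= n:
            j = 2 * k
            if j < n and h[j] < h[j + 1]: # pick the larger of the two children
                j += 1
            if v >= h[j]:
                heap = True               # parental dominance holds for v
            else:
                h[k] = h[j]               # move the larger child up
                k = j
        h[k] = v                          # place v in its final position
    return h

# heap_bottom_up([None, 2, 9, 7, 6, 5, 8]) → [None, 9, 6, 8, 2, 5, 7]
```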
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Tracing

• Example: 2, 9, 7, 6, 5, 8

[Figure: heapify trace; 7 is exchanged with 8, then 2 sifts down past 9 and 6, giving the heap 9, 6, 8, 2, 5, 7]
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Worst-case scenario

• n = 2^k − 1

• A heap’s tree is full – maximum number of nodes occurs

on each level.
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Let h be the height of the tree

• h = ⌊log₂ n⌋

• Each key on level i of the tree will travel to the leaf level h

in the worst case of construction.


ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Moving to the next level down requires two comparisons

• One to find the largest child

• One to determine whether the exchange is required

• Total number of comparisons involving a key on level i will

be 2(h-i).
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Total number of comparisons

C_worst(n) = Σ_{i=0}^{h−1} Σ_{keys on level i} 2(h − i)

= Σ_{i=0}^{h−1} 2(h − i) · 2^i, since the number of nodes on level i is 2^i

= 2h Σ_{i=0}^{h−1} 2^i − 2 Σ_{i=0}^{h−1} i · 2^i
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Total number of comparisons (using Σ_{i=0}^{h−1} 2^i = 2^h − 1 and Σ_{i=0}^{h−1} i · 2^i = (h − 2) · 2^h + 2)

= 2(h(2^h − 1) − ((h − 2) · 2^h + 2))

= 2(h · 2^h − h − h · 2^h + 2 · 2^h − 2)

= 2(2 · 2^h − h − 2)
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• n = 2^(h+1) − 1, so n + 1 = 2^(h+1)

Applying log₂ on both sides: log₂(n + 1) = h + 1, hence h = log₂(n + 1) − 1

Substituting back we get

C_worst(n) = 2(2 · 2^(log₂(n+1) − 1) − (log₂(n + 1) − 1) − 2)

= 2(2^(log₂(n+1)) − log₂(n + 1) − 1)
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• = 2(2^(log₂(n+1)) − log₂(n + 1) − 1)

= 2(n + 1 − log₂(n + 1) − 1)

= 2(n − log₂(n + 1))
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Heaps

• Bottom-up Heap construction

• Analysis
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Heapsort
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Heaps

• Bottom-up Heap construction

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• Constructs a heap by successive insertions of a new key into a

previously constructed heap.

• Attach a new node with key K in it after the last leaf of the

existing heap.
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• Shift K up to its appropriate place in the new heap as

follows.

• Compare K with its parent’s key.

• If the latter is greater than or equal to K stop.

• Else swap these two keys and compare K with its new

parent.
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• This swapping continues until K is not greater than its last

parent or it reaches the root.
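The sift-up procedure described above can be sketched as follows (an assumed helper name, not from the slides; 1-based indexing with h[0] unused):

```python
def heap_insert(h, key):
    """Top-down heap construction step: append the key, then sift it up."""
    h.append(key)                      # attach after the last leaf
    k = len(h) - 1
    while k > 1 and h[k // 2] < h[k]:  # parent smaller than K: swap, go up
        h[k // 2], h[k] = h[k], h[k // 2]
        k //= 2

# Inserting 10 into the heap 9, 6, 8, 2, 5, 7:
h = [None, 9, 6, 8, 2, 5, 7]
heap_insert(h, 10)
# h == [None, 10, 6, 9, 2, 5, 7, 8]
```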


ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• Example: 2, 9, 7, 6, 5, 8 built by successive insertions gives the heap 9, 6, 8, 2, 5, 7.

• Insert 10 into the heap constructed: 10 is attached after the last leaf, as the right child of 8.
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

[Figure: 10 is swapped with its parent 8 and then with its parent 9, giving the heap 10, 6, 9, 2, 5, 7, 8]
ALGORITHMS – ANALYSIS AND DESIGN
Top-down heap construction – Analysis

• The insertion operation cannot require more key comparisons

than the heap’s height.

• Since the height of a heap with n nodes is about log2n, the

time efficiency of insertion is in O(log n)


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

• Maximum Key Deletion from a heap

• Exchange the root’s key with the last key K of the heap

• Decrease the heap’s size by 1.


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

• “Heapify” the smaller tree by shifting K down the tree

exactly in the same way as it is done in bottom-up heap

construction algorithm.

• Verify the parental dominance for K.


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

• If it holds stop.

• else

• Swap K with the larger of its children.

• Repeat this operation until the parental dominance

condition holds for K in its new position.
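The deletion procedure can be sketched in Python, reusing the sift-down idea from bottom-up construction (names are illustrative; 1-based indexing with h[0] unused):

```python
def delete_max(h):
    """Delete and return the root key of a 1-based max-heap."""
    h[1], h[-1] = h[-1], h[1]   # exchange the root with the last key
    maximum = h.pop()           # decrease the heap's size by 1
    n, k = len(h) - 1, 1
    while 2 * k <= n:           # sift the new root down
        j = 2 * k
        if j < n and h[j] < h[j + 1]:
            j += 1              # larger of the two children
        if h[k] >= h[j]:
            break               # parental dominance holds
        h[k], h[j] = h[j], h[k]
        k = j
    return maximum

h = [None, 9, 8, 6, 2, 5, 1]
top = delete_max(h)   # top == 9; h is now [None, 8, 5, 6, 2, 1]
```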


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

[Figure: deleting 9 from the heap 9, 8, 6, 2, 5, 1; exchange 9 with the last key 1, remove 9, then sift 1 down (swap with 8, then with 5), giving the heap 8, 5, 6, 2, 1]
ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Analysis

• The efficiency of deletion is determined by the number of key

comparisons needed to “heapify” the tree after the swap has

been made and the size of the tree is decreased by 1.

• Since this cannot require more key comparisons than twice

the heap’s height, the time efficiency of deletion is in O(log n).


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

• This is a two-stage algorithm that works as follows.

• Stage 1 (heap construction): Construct a heap for a given

array.

• Stage 2 (maximum deletions): Apply the root-deletion

operation n − 1 times to the remaining heap
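Combining the two stages gives a compact in-place heapsort sketch (assuming the 1-based array convention used earlier; sift_down is a hypothetical helper name):

```python
def heapsort(a):
    """In-place heapsort on a 1-based array (a[0] unused)."""
    n = len(a) - 1

    def sift_down(k, end):
        v = a[k]
        while 2 * k <= end:
            j = 2 * k
            if j < end and a[j] < a[j + 1]:
                j += 1                   # larger of the two children
            if v >= a[j]:
                break
            a[k] = a[j]
            k = j
        a[k] = v

    for i in range(n // 2, 0, -1):       # stage 1: bottom-up heap construction
        sift_down(i, n)
    for end in range(n, 1, -1):          # stage 2: n − 1 maximum deletions
        a[1], a[end] = a[end], a[1]      # move the current maximum to the end
        sift_down(1, end - 1)
    return a

# heapsort([None, 2, 9, 7, 6, 5, 8]) → [None, 2, 5, 6, 7, 8, 9]
```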


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

[Figure: stage 2 begins on the heap 9, 8, 6, 2, 5, 1; deleting 9 leaves the heap 8, 5, 6, 2, 1]
ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

[Figure: successive maximum deletions; after deleting 8 the heap is 6, 5, 1, 2; after 6 it is 5, 2, 1; after 5 it is 2, 1]
ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

[Figure: deleting 2 leaves the single-key heap 1]

• Remove 1 from the heap; the keys 9, 8, 6, 5, 2, 1 have been removed in decreasing order, so the array is sorted.


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort – Analysis

• The efficiency of the heap construction stage is Θ(n)

with bottom-up heap construction.

• The effort required to build the heap is linear.

• With top-down heap construction each insertion is in O(log n),

so the heap construction stage is in O(n log n).


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort – Analysis

• The array elements are deleted in decreasing order from a max-heap.

• Time efficiency of the second stage

• C(n) is the number of comparisons needed for eliminating

the root keys from the heaps of diminishing sizes from n to

2.
ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Analysis

C(n) ≤ 2⌊log₂(n − 1)⌋ + 2⌊log₂(n − 2)⌋ + ... + 2⌊log₂ 2⌋

• ≤ 2 Σ_{i=1}^{n−1} log₂ i ≤ 2 Σ_{i=1}^{n−1} log₂(n − 1) = 2(n − 1) log₂(n − 1)

≈ 2n log₂ n for very large values of n

• For heapsort as a whole: T(n) = Θ(n) + Θ(n log₂ n) = Θ(max{n, n log₂ n}) = Θ(n log₂ n)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Top-down Heap Construction

• Deletion of the root’s key from a heap

• Analysis

• Heapsort

• Algorithm

• Tracing

• Analysis
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Space and Time Tradeoffs
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Top-down Heap Construction

• Deletion of the root’s key from a heap

• Analysis

• Heapsort

• Algorithm

• Tracing

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• A space–time or time–memory tradeoff in computer science

is a case where an algorithm or program trades increased

space usage for decreased running time.

• Space refers to the data storage consumed in performing a

given task (RAM, HDD)

• Time refers to the time consumed in performing a given task

(computation time or response time).


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• The utility of a given space-time tradeoff is affected by related

fixed and variable costs (ex. CPU speed, storage space).

• The best algorithm is one that solves the problem while

requiring less space in memory and also taking less time to

generate the output.

• But in general, it is not always possible to achieve both

at the same time.


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• To overcome this issue there are three approaches

• Preprocess the problem's input, in whole or in part.

• Store the additional information obtained to accelerate

solving the problem later.

• This approach is known as input enhancement.


ALGORITHMS – ANALYSIS AND DESIGN
Input Enhancement

• Algorithms based on this technique

• Counting methods for sorting

• Boyer-Moore algorithm for string matching

• Horspool algorithm for string matching


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Second approach

• Called prestructuring.

• Use extra space to facilitate faster and more flexible access

to data.

• Algorithms based on this technique

• Hashing

• Indexing with B trees


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Third approach

• Dynamic Programming

• Based on recording solutions to overlapping sub-

problems of a given problem in a table from which a

solution to the problem in question is then obtained.


ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• One method

• For each element of a list to be sorted, count the total

number of elements smaller than this element and record

the results in a table.

• The numbers will indicate the positions of the elements in

the sorted list.


ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• Example

• If the count is 10 it means the element will be in 11th

position with index 10.

• The algo is known as comparison counting algorithm.


ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1])

• //Sorts an array by comparison counting

• //Input: An array A[0..n-1] orderable elements

• //Output: Array S[0..n-1] of A’s elements sorted in non-

//decreasing order
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1])

• for i ← 0 to n-1 do

• count[i] ← 0

• for i ← 0 to n-2 do

• for j ← i+1 to n-1 do

• if A[i] < A[j]

• count[j] ← count[j] + 1
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1])

• else

• count[i] ← count[i]+1

• for i ← 0 to n-1 do

• S[count[i]] ← A[i]

• return S
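The pseudocode above translates almost line for line into Python; `comparison_counting_sort` is just an illustrative name for this sketch.

```python
def comparison_counting_sort(a):
    """Sort a list by comparison counting (as in ComparisonCountingSort)."""
    n = len(a)
    count = [0] * n                  # count[i] = elements that must precede a[i]
    for i in range(n - 1):
        for j in range(i + 1, n):
            if a[i] < a[j]:
                count[j] += 1        # a[j] is larger, so one more element precedes it
            else:
                count[i] += 1        # ties also go to the earlier element
    s = [None] * n
    for i in range(n):
        s[count[i]] = a[i]           # count[i] is a[i]'s final position
    return s
```

For the lecture's example, `comparison_counting_sort([65, 28, 87, 93, 22, 44])` returns `[22, 28, 44, 65, 87, 93]`.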
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• 65, 28, 87, 93, 22, 44

• Initially, count[] = 0 0 0 0 0 0

• After i = 0, count[] = 3 0 1 1 0 0

• After i = 1, count[] = 3 1 2 2 0 1
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• After i = 2, count[] = 3 1 4 3 0 1

• After i = 3, count[] = 3 1 4 5 0 1
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• After i = 4, count[] = 3 1 4 5 0 2

• Final state, count[] = 3 1 4 5 0 2
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• count[] = 3 1 4 5 0 2

• 65, 28, 87, 93, 22, 44


• After i = 0, S[] = _ _ _ 65 _ _

• After i = 1, S[] = _ 28 _ 65 _ _

• After i = 2, S[] = _ 28 _ 65 87 _

• After i = 3, S[] = _ 28 _ 65 87 93

(positions 0 1 2 3 4 5; ‘_’ marks an empty slot)
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• count[] = 3 1 4 5 0 2

• 65, 28, 87, 93, 22, 44

• After i = 4, S[] = 22 28 _ 65 87 93

• After i = 5, S[] = 22 28 44 65 87 93

(positions 0 1 2 3 4 5)
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) – Analysis

• Step 1: Input size is ‘n’

• Step 2: The basic operation is Comparison A[i] < A[j]

• Step 3: The number of comparisons depends on n.

• Step 4: One comparison per iteration.


ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) – Analysis

• Step 5: The number of times it is executed is


C(n) = Σ_{i=0}^{n-2} Σ_{j=i+1}^{n-1} 1

     = Σ_{i=0}^{n-2} [(n - 1) - (i + 1) + 1]

     = Σ_{i=0}^{n-2} (n - 1 - i)

     = n(n - 1)/2 ∈ O(n²)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Introduction

• Input Enhancement

• Presorting

• Comparison Counting

• Tracing

• Analysis
Thank you

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Sorting by Counting

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Introduction

• Input Enhancement

• Presorting

• Comparison Counting

• Tracing

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• Sort a list of items with some other information associated

with the keys.

• Copy the elements into a new array S[0..n-1] to hold the

sorted elements.

• The elements of A whose values are equal to the lowest

possible value l are copied into the first D[0] elements of S, i.e.

positions 0 to D[0] – 1.
ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• The elements of value l+1 are copied into positions from D[0]

to (D[0]+D[1]) – 1 and so on.

• The accumulated sums of frequencies are called distribution in

statistics.

• The method is known as Distribution Counting.


ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u)

• //Sorts an array of integers from a limited range by

//distribution counting

• //Input: An array A[0..n-1] of integers between l and u

• //Output: Array S[0..n-1] of A’s elements sorted in non -

//decreasing order
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u)

• for j ← 0 to u-l do

• D[j]←0 //initialize frequencies

• for i ← 0 to n-1 do

• D[A[i]-l]←D[A[i]-l]+1 // compute frequencies

• for j ← 1 to u – l do

• D[j] ← D[j-1] + D[j] //reuse for distribution


ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u)

• for i ← n-1 downto 0 do

• j ← A[i]-l

• S[D[j]-1] ← A[i]

• D[j] ← D[j] – 1

• return S
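The pseudocode above can be sketched in Python as follows; `distribution_counting_sort` is an illustrative name. The only cosmetic difference is that the distribution value is decremented before it is used as an index, which is equivalent to the pseudocode's `S[D[j]-1] ← A[i]; D[j] ← D[j] - 1`.

```python
def distribution_counting_sort(a, l, u):
    """Sort integers from the known range [l, u] by distribution counting."""
    d = [0] * (u - l + 1)
    for v in a:                      # compute frequencies
        d[v - l] += 1
    for j in range(1, u - l + 1):    # reuse for the distribution (prefix sums)
        d[j] += d[j - 1]
    s = [None] * len(a)
    for v in reversed(a):            # right-to-left pass keeps the sort stable
        d[v - l] -= 1
        s[d[v - l]] = v
    return s
```

For the lecture's example, `distribution_counting_sort([15, 13, 14, 15, 14, 14], 13, 15)` returns `[13, 14, 14, 14, 15, 15]`.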
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• Array – 15, 13, 14, 15, 14, 14

• These have come from the set {13,14,15}.

• Here l = 13, u = 15

• Initially D[] = 0 0 0

• After i = 0 D[] = 0 0 1

• After i = 1 D[] = 1 0 1

• After i = 2 D[] = 1 1 1
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 3, D[] = 1 1 2

• After i = 4, D[] = 1 2 2

• After i = 5, D[] = 1 3 2
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After j = 1, D[] = 1 4 2

• After j = 2, D[] = 1 4 6
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 5: j = 1, S[] = _ _ _ 14 _ _, D[] = 1 3 6

• After i = 4: j = 1, S[] = _ _ 14 14 _ _, D[] = 1 2 6
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 3: j = 2, S[] = _ _ 14 14 _ 15, D[] = 1 2 5

• After i = 2: j = 1, S[] = _ 14 14 14 _ 15, D[] = 1 1 5
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 1: j = 0, S[] = 13 14 14 14 _ 15, D[] = 0 1 5

• After i = 0: j = 2, S[] = 13 14 14 14 15 15, D[] = 0 1 4
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Analysis

• When the range of values are fixed it is a linear algorithm.

• It makes two consecutive passes through its input array A.

• Parameter for analysis is n- number of inputs

• Basic Operation is traversal of input array.

• The array is traversed twice during the algorithm.


ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Analysis

• The time efficiency of the algorithm is given as


T(n) = Σ_{i=0}^{n-1} 1 + Σ_{i=0}^{n-1} 1

     = [(n - 1) - 0 + 1] + [(n - 1) - 0 + 1]

     = 2n

T(n) ∈ Θ(n)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Distribution Counting

• Tracing

• Analysis
THANK YOU

Lekha A
Department of Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Input Enhancement in String
Matching - I

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Distribution Counting

• Tracing

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• The problem of string matching requires finding an occurrence

of a given string of m characters called the pattern in a longer

string of n characters called the text.

• In input enhancement method preprocess the pattern to get

some information about it, store this information in a table,

and then use this information during an actual search for the

pattern in a given text.


ALGORITHMS – ANALYSIS AND DESIGN
Input enhancement in string matching

• Boyer-Moore Algorithm

• Horspool Algorithm
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• It was published by Nigel Horspool in 1980.

Fig 3 – adapted from


http://webhome.cs.uvic.ca/~nigelh/
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• It is a simplified version of Boyer-Moore algorithm.

• Both algorithms start by aligning the pattern against the

beginning characters of the text.

• If no matching string is found (that is, the first trial fails), shift

the pattern to the right.

• They differ in deciding the shift size.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Consider, as an example, searching for the pattern BARBER in

some text:

• If a mismatch occurs, shift the pattern to the right.

• Try to make as large a shift as possible without risking the

possibility of missing a matching substring in the text.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Horspool’s algorithm determines the size of such a shift by

looking at the character c of the text that is aligned against the

last character of the pattern.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 1

• If there are no c’s in the pattern—e.g., c is letter S in

the example— safely shift the pattern by its entire

length (if shift is less, some character of the pattern

would be aligned against the text’s character c that is

known not to be in the pattern)


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 1
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 2

• If there are occurrences of character c in the pattern

but it is not the last one there—e.g., c is letter B in the

example—the shift should align the rightmost

occurrence of c in the pattern with the c in the text


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 2
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 3

• If c happens to be the last character in the pattern but

there are no c’s among its other m − 1 characters—e.g.,

c is letter R in the example—the situation is similar to

that of Case 1 and the pattern should be shifted by the

entire pattern’s length m


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 3
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 4

• If c happens to be the last character in the pattern and

there are other c’s among its first m − 1 characters.

• c is letter R in the example

• The situation is similar to that of Case 2


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 4

• The rightmost occurrence of c among the first m −

1 characters in the pattern should be aligned with

the text’s c.
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Precompute shift sizes and store them in a table.

• For every character c, the shift’s value is determined by

• t(c) = the pattern’s length, m

• If c is not among the first m - 1 characters of the pattern

• else

• t(c) = the distance from the rightmost c among the first m – 1

characters of the pattern to its last character


ALGORITHMS – ANALYSIS AND DESIGN
Algorithm – ShiftTable(P[0..m-1])

• //Fills the shift table used by Horspool’s and Boyer- Moore

//algorithms

• //Input: Pattern P[0..m-1] and an alphabet of possible

//characters

• //Output: Table[0..size-1] indexed by the alphabet’s characters

//and filled with shift sizes computed by formula


ALGORITHMS – ANALYSIS AND DESIGN
Algorithm – ShiftTable(P[0..m-1])

• Initialize all the elements of Table with m

• for j ←0 to m-2 do

• Table[P[j]] ← m-1-j

• return Table
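The table-filling loop above can be sketched in Python with a dictionary; `shift_table` is an illustrative name, and characters missing from the dictionary get the default shift m via `table.get(c, m)` at lookup time.

```python
def shift_table(p):
    """Shift table used by Horspool's and Boyer-Moore algorithms."""
    m = len(p)
    table = {}
    for j in range(m - 1):           # the last pattern character is excluded
        table[p[j]] = m - 1 - j      # rightmost occurrences overwrite earlier ones
    return table
```

For the pattern BARBER this yields `{'B': 2, 'A': 4, 'R': 3, 'E': 1}`, with every other character shifting by m = 6.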
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Step 1: For a given pattern of length m and the alphabet used

in both the pattern and text, construct the shift table.

• Step 2: Align the pattern against the beginning of the text.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Step 3: Scan pattern from the right to left by comparing its

characters with the corresponding characters in the text

• If a mismatching occurs, shift the pattern to the right by

t(c), where c is the text character currently aligned against

the last character of the pattern.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Start the right-to-left comparison for the new aligned

position of the pattern until a single or multiple matching

is found or the text is exhausted


ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1])

• //Implements Horspool algorithm for string matching

• //Input: Pattern P[0..m-1] and text T[0..n-1]

• //Output: The index of the left end of the first matching

• // substring or -1 if there are no matches


ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1])

• ShiftTable(P[0..m-1])//generate table of shifts

• i ← m -1 // pos of pattern’s right end

• while i ≤ n-1 do

• k←0 //no. of matched characters

• while k ≤m-1 and P[m-1-k] = T[i-k] do

• k ← k+1
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1])

• if k = m

• return i - m + 1

• else

• i← i + Table[T[i]]

• return -1
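The algorithm above can be sketched in Python; `horspool_matching` is an illustrative name, and the shift table is inlined so the snippet is self-contained.

```python
def horspool_matching(p, t):
    """Horspool string matching: index of the first match in t, or -1."""
    m, n = len(p), len(t)
    table = {}                       # shifts for the pattern's first m-1 characters
    for j in range(m - 1):
        table[p[j]] = m - 1 - j
    i = m - 1                        # position of the pattern's right end in t
    while i <= n - 1:
        k = 0                        # number of matched characters
        while k <= m - 1 and p[m - 1 - k] == t[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        i += table.get(t[i], m)      # shift by t(c) for the aligned text char
    return -1
```

For the lecture's example, `horspool_matching("BARBER", "JIM_SAW_ME_IN_A_BARBERSHOP")` returns 16.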
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• Search for the pattern BARBER in the following text.

• JIM_SAW_ME_IN_A_BARBERSHOP

• P[] = B A R B E R
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After calling ShiftTable(P[0..m-1])

• j = 0, Table[B] = 6 - 1 - 0 = 5

• j = 1, Table[A] = 6 - 1 - 1 = 4

• j = 2, Table[R] = 6 - 1 - 2 = 3

• j = 3, Table[B] = 6 - 1 - 3 = 2

• j = 4, Table[E] = 6 - 1 - 4 = 1

Character c:  A  B  E  R  all others
Shift t(c):   4  2  1  3  6
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After i = 5 k = 0, P[5] = ‘R’ T[5] = ‘A’

• k not = 6, i = 5 + 4 = 9

• After i = 9 k = 0, P[5] = ‘R’, T[9] = ‘E’

• k not = 6, i = 9 + 1 = 10

• After i=10 k = 0, P[5] = ‘R’, T[10] = ‘ ’

• k not = 6, i = 10 + 6 = 16
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After i =16 k = 0, P[5] = ‘R’, T[16] = ‘B’

• k not = 6, i = 16 + 2 = 18

• After i =18 k = 0, P[5] = ‘R’, T[18] = ‘R’

• k = 1, P[4] = ‘E’, T[17] = ‘A’

• k not = 6, i = 18 + 3 = 21
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After i =21 k = 0, P[5] = ‘R’, T[21] = ‘R’

• k = 1, P[4] = ‘E’, T[20] = ‘E’

• k = 2, P[3] = ‘B’, T[19] = ‘B’

• k = 3, P[2] = ‘R’, T[18] = ‘R’

• k = 4, P[1] = ‘A’, T[17] = ‘A’

• k = 5, P[0] = ‘B’, T[16] = ‘B’

• k = 6 = m, return i - m + 1 = 21 - 6 + 1 = 16
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm- Analysis

• Input – n and m

• Basic operation is P[m-1-k] = T[i-k]


C(n) = Σ_{i=m-1}^{n-1} Σ_{k=0}^{m-1} 1

     = Σ_{i=m-1}^{n-1} m

     = m[(n - 1) - (m - 1) + 1]

     = m(n - m + 1) = mn - m² + m

     ≈ mn for large values of n, i.e., the worst case is O(nm)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Input enhancement in string matching

• Horspool algorithm
THANK YOU

Lekha A
Department of Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Input Enhancement in String Matching
- II
Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Input enhancement in string matching

• Horspool algorithm
ALGORITHMS – ANALYSIS AND DESIGN
Exact Matching

• P: word

• T: There would have been a time for such a word.

• ---------word---------------------------------------------->

• Here w and o are matched and a mismatch occurs at u

since r ≠ u.

• Since u doesn’t occur in P, it is better to skip the next two

alignments.
ALGORITHMS – ANALYSIS AND DESIGN
Exact Matching

• P: word

• T: There would have been a time for such a word.

• ---------word---------------------------------------------->

• word Skip

• word Skip

• word
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• It was developed by Robert S. Boyer and J Strother Moore in

1977.

Fig 4 – adapted from


https://www.cs.utexas.edu/users/boyer/

Fig 5 – adapted from


https://en.wikipedia.org/wiki/J_Strother_Moore
ALGORITHMS – ANALYSIS AND DESIGN
Boyer- Moore Algorithm

• Use knowledge gained from character comparisons to skip

future alignments that definitely won’t match

• If mismatch occurs, use knowledge of the mismatched text

character to skip alignments

• “Bad character rule”


ALGORITHMS – ANALYSIS AND DESIGN
Boyer- Moore Algorithm

• Use knowledge gained from character comparisons to skip

future alignments that definitely won’t match

• If some characters are matched, use knowledge of the

matched characters to skip alignments

• “Good suffix rule”


ALGORITHMS – ANALYSIS AND DESIGN
Boyer- Moore Algorithm

• Use knowledge gained from character comparisons to skip

future alignments that definitely won’t match

• Try alignments in one direction, then try character

comparisons in opposite direction

• For longer skips


ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• Avoids repeating unsuccessful comparisons against a target

character.

• Bad-symbol shift is guided by the text character c that caused

a mismatch.
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• Upon mismatch, let b be the mismatched character in T.

• Skip alignments until

• A. b matches its opposite in P,

• B. P moves past b
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• Skipped 8 alignments and 5 characters in T were never looked at


Fig 6 – Adapted from
http://www.cs.jhu.edu/~langmea/resources/lecture_notes/04_boyer_moore_v2.pdf
ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Let t = substring matched by inner loop; skip until

• A. there are no mismatches between P and t

• B. or P moves past t
ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Step 1

• t occurs in its entirety to the left within P


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Step 2

• Prefix of P matches a suffix of t


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Case (a) has two subcases according to whether t occurs in its

entirety to the left within P (as in step 1), or a prefix of P

matches a suffix of t (as in step 2)


ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule

• Bad character rule says skip 2.

• Good suffix rule says skip 7.

• Take the maximum! (7)


ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule

• Use bad character or good suffix rule, whichever skips more.

• Example
ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule

• 11 characters of T were ignored


ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• If c is not present in the pattern, shift the pattern to just pass

this c in the text.

• d1 = max{t1(c) - k, 1}

• Where t1(c) is the entry in the pre-computed table and

k is the number of the matched characters.


ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• If t1(c) - k ≤ 0, then shift the pattern by one position to the

right.

• If c is not in the pattern shift the pattern to just pass this c.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Aligns only matching pattern characters against target

characters already successfully matched

• It works for the suffix.

• Suffix is the ending portion of the pattern.

• Its size is given by k.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Suppose for a given pattern P if we have already matched

some suffix ‘S’ but a mismatch occurs with the preceding

character xP.

• Shift the pattern to right along the string so that the matched

part is occupied by the same suffix ‘S’.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• No complete match of the suffix ‘S’ is possible if ‘S’ does not

occur elsewhere in P.

• Match the largest prefix of P.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Good-suffix shift is guided by a successful match of the last k >

0 characters of the pattern.

• suff(k) is the ending portion of the pattern of size k

• d2 = distance between such second rightmost occurrence

of suff(k) and its rightmost occurrence.

• The second rightmost occurrence should not be preceded

by the same character as in the last occurrence.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• However, if there is no other occurrence of suff(k), then

find the longest prefix of size l < k that matches the suffix of

the same size l. The d2 is the distance between such a

prefix and the suffix.


ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Need for Boyer-Moore Algorithm

• Bad character Rule

• Good suffix Rule


THANK YOU

Lekha A
Department of Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Input Enhancement in String Matching
- III
Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Need for Boyer-Moore Algorithm

• Bad character Rule

• Good suffix Rule


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• Step 1: For a given pattern and the alphabet used in both the

pattern and the text construct the bad symbol shift table.

• Step 2: Using the pattern construct the good suffix shift table

• Step 3: Align the pattern against the beginning of the text.


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• Step 4: Repeat the following step until a matching substring is

found or the pattern reaches beyond the last character of the

text.

• Starting with the last character compare the corresponding

characters in the pattern and the text until

• A. All m characters are matched.

• Return position of occurrence and stop


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• B. A mismatching pair is encountered after k ≥ 0

characters are matched successfully.

• Retrieve the entry t1(c) from the c’s column of the

bad symbol table where c is the text’s mismatched

character.
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• If k > 0 also retrieve the corresponding d2 entry

from the good-suffix table.

• Shift the pattern to the right by the number of

positions computed by the formula


d = d1,           if k = 0
d = max{d1, d2},  if k > 0

• where d1 = max{t1(c) - k, 1}
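A minimal Python sketch combining both tables with the shift formula above. The function names are illustrative, and the good-suffix table is built by a brute-force scan that follows the rules from the previous slides (this is a sketch of the idea, not an optimized implementation).

```python
def bad_symbol_table(p):
    # t1(c) for every character among the first m-1 pattern positions;
    # absent characters shift by m via table.get(c, m)
    m = len(p)
    return {p[j]: m - 1 - j for j in range(m - 1)}

def good_suffix_table(p):
    m = len(p)
    d2 = [0] * m                     # d2[k] for k = 1..m-1 matched characters
    for k in range(1, m):
        suff = p[m - k:]
        shift = None
        # second rightmost occurrence of suff(k), not preceded by the same
        # character that precedes its rightmost occurrence
        for i in range(m - k - 1, -1, -1):
            if p[i:i + k] == suff and (i == 0 or p[i - 1] != p[m - 1 - k]):
                shift = (m - k) - i
                break
        if shift is None:
            # longest prefix of size l < k matching a suffix of the same size
            shift = m
            for l in range(k - 1, 0, -1):
                if p[:l] == p[m - l:]:
                    shift = m - l
                    break
        d2[k] = shift
    return d2

def boyer_moore(p, t):
    m, n = len(p), len(t)
    t1, d2 = bad_symbol_table(p), good_suffix_table(p)
    i = m - 1                        # position of the pattern's right end in t
    while i <= n - 1:
        k = 0                        # number of matched characters
        while k <= m - 1 and p[m - 1 - k] == t[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        d1 = max(t1.get(t[i - k], m) - k, 1)
        i += d1 if k == 0 else max(d1, d2[k])
    return -1
```

For the pattern BAOBAB this reproduces the bad-symbol and good-suffix tables used in the tracing slides, and `boyer_moore("BAOBAB", "BESS KNEW ABOUT BAOBABS")` returns 16.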
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• Text: BESS KNEW ABOUT BAOBABS

• Pattern: BAOBAB

• Using shift table the computed bad-symbol table is

Character c:  A  B  O  all others
t1(c):        1  2  3  6
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• The good suffix table is generated as

• For the suffix of k characters compare the prefix of k

characters

• If they are same check for another occurrence of the

same suffix.

• If it exists d2 is the distance from the second

occurrence till the end of the string


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• If they are not the same, reduce the suffix size by one and

check whether they match.

• If they match, d2 is the distance from the second

occurrence to the end of the string.


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• If the prefix does not match, check the pattern for an

occurrence of the suffix at any other position.

• If one is found, d2 is the distance from that occurrence to the

end of the string.
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• Pattern is BAOBAB
k    suff(k)    d2
1    B          2
2    AB         5
3    BAB        5
4    OBAB       5
5    AOBAB      5
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

B E S S _ K N E W _ A B O U T _ B A O B A B S
B A O B A B

• d1 = t1(K) - 0 = 6; k = 0, so shift by d1 = 6 (‘_’ marks a space)

B E S S _ K N E W _ A B O U T _ B A O B A B S
            B A O B A B

• d1 = t1(_) - 2 = 6 - 2 = 4

• There have been two matches, so also use the good-suffix table: d2 = 5

• d = max(4, 5) = 5
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

B E S S _ K N E W _ A B O U T _ B A O B A B S
                      B A O B A B

• d1 = t1(_) - 1 = 6 - 1 = 5

• There has been one match, so also use the good-suffix table: d2 = 2

• d = max(5, 2) = 5

B E S S _ K N E W _ A B O U T _ B A O B A B S
                                B A O B A B

• All six characters match: the pattern is found at position 16
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Analysis

• For a text of length n and a pattern of length m:

• The worst-case efficiency of finding the first occurrence is

O(n + m).

• The average-case efficiency is about O(n/m).

• This is because a shift by the full pattern length m is often

possible due to the bad-character heuristic.
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Boyer-Moore Algorithm

• Tracing

• Analysis
THANK YOU

Lekha A
Department of Computer Applications
