UNIT 4


ALGORITHMS – ANALYSIS AND DESIGN

Lekha A
Computer Applications

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Introduction
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Greedy Method

• Prim’s Algorithm

• Kruskal’s Algorithm

• Dijkstra’s Algorithm

• Huffman Trees
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Dynamic Programming

• Computing Binomial co-efficient

• Warshall’s Algorithm

• Floyd’s Algorithm

• Knapsack Problem

• Memory Functions
ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer

• This design method is based on the idea of transformation.

• It works as a two-stage procedure.

• In the transformation stage

• The problem's instance is modified so that it is more amenable to solution.

• In the conquering stage

• The problem is solved.


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Three major variations

• Instance simplification

• Transformation to a simpler or more convenient

instance of the same problem


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Three major variations

• Representation change

• Transformation to a different representation of the

same instance
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Three major variations

• Problem reduction

• Transformation to an instance of a different problem

for which an algo is already available.


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Problem's instance → (simpler instance, or another representation, or another problem's instance) → Solution
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Transform and Conquer


Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Presort
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Transform and Conquer


ALGORITHMS – ANALYSIS AND DESIGN
Presorting

• "Presorting" is a common example of "instance simplification."

• Presorting is sorting ahead of time, to make repetitive

solutions faster.

• Presorting is a form of preconditioning.

• Preconditioning is manipulating the data to make the

algorithm faster.
ALGORITHMS – ANALYSIS AND DESIGN
Presorting

• Example

• If many kth order statistics must be found in an array, it might make sense to sort the array ahead of time so that the cost of determining each statistic is constant.


ALGORITHMS – ANALYSIS AND DESIGN
Presorting

• Some examples include

• Checking element uniqueness in an array

• Computing a mode

• Searching problem
ALGORITHMS – ANALYSIS AND DESIGN
Checking element uniqueness in an array

• Algorithm

• Sort the array

• Check only consecutive elements

• If the array has equal elements, a pair of them must be next to each other.
ALGORITHMS – ANALYSIS AND DESIGN
PresortElementUniqueness(A[0..n-1])

• //Solves the element uniqueness problem by sorting the

//array first.

• //Input: An array A [0..n - 1] of ordered elements.

• //Output: Returns "true" if A contains no equal elements,

//otherwise returns false.


ALGORITHMS – ANALYSIS AND DESIGN
PresortElementUniqueness(A[0..n-1])

• Sort the array A.

• for i ← 0 to n − 2 do

• if A[i] = A[i + 1] return false

• return true
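The pseudocode above maps directly to a short Python sketch (the function name is mine, not from the slides):

```python
def presort_element_uniqueness(a):
    """Element uniqueness by instance simplification:
    sort first, then compare consecutive elements."""
    a = sorted(a)                 # transformation stage: O(n log n)
    for i in range(len(a) - 1):   # conquering stage: O(n) scan
        if a[i] == a[i + 1]:
            return False          # equal elements are adjacent after sorting
    return True
```

For example, presort_element_uniqueness([3, 1, 2]) returns True, while [3, 1, 3] gives False.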
ALGORITHMS – ANALYSIS AND DESIGN
PresortElementUniqueness(A[0..n-1]) - Analysis

• The total time is the sum of the time spent on sorting and the

time spent on checking consecutive elements.

• Sorting requires about n log₂ n comparisons

• Checking requires no more than n − 1 comparisons

• T(n) = Tsort(n) + Tscan(n) ∈ Θ(n log n) + Θ(n) = Θ(n log n)


ALGORITHMS – ANALYSIS AND DESIGN
Computing Mode

• A mode is a value that occurs most often in a given list of

numbers.

• Example

• 5, 5, 2, 3, 5, 8, 8, 7

• Here 5 is the mode.
ALGORITHMS – ANALYSIS AND DESIGN
Computing a mode – Brute Force

• Compute the frequencies of all its distinct values.

• Later find the value with the largest frequency.

• To implement this idea

• Store the values already encountered along with their

frequencies in a separate list.


ALGORITHMS – ANALYSIS AND DESIGN
Computing a mode – Brute Force

• On each iteration the ith element of the original list is

compared with the values already encountered by

traversing this auxiliary list.

• If a match is found, its frequency is incremented; otherwise the

current element is added to the list of distinct values

seen so far with a frequency of 1.


ALGORITHMS – ANALYSIS AND DESIGN
Computing a mode – Brute Force – Analysis

• Worst case scenario

• There are no equal elements

• Here the ith element is compared with i – 1 elements that

have been added to the list with a frequency of 1.


C(n) = Σ_{i=1}^{n} (i − 1) = 0 + 1 + ... + (n − 1) = n(n − 1)/2 ∈ Θ(n²)
ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1])

• //Computes the mode of an array by sorting it first.

• //Input: An array A[0…n-1] of orderable elements.

• //Output: The array’s mode.


ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1])

• Sort the array A.

• i ← 0 //current run begins at position i

• modefrequency ← 0 //highest frequency seen so far

• while i ≤ n − 1 do

• runlength ← 1

• runvalue ← A[i]
ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1])

• while i + runlength ≤ n − 1 and A[i + runlength] = runvalue do

• runlength ← runlength + 1

• if runlength > modefrequency

• modefrequency ← runlength

• modevalue ← runvalue

• i ← i+ runlength

• return modevalue
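The run-scanning idea in the pseudocode can be sketched in Python as follows (names are illustrative):

```python
def presort_mode(a):
    """Mode by presorting: sort, then scan maximal runs of equal values."""
    a = sorted(a)
    i, mode_frequency, mode_value = 0, 0, None
    while i < len(a):
        run_length, run_value = 1, a[i]
        while i + run_length < len(a) and a[i + run_length] == run_value:
            run_length += 1
        if run_length > mode_frequency:     # longest run seen so far wins
            mode_frequency, mode_value = run_length, run_value
        i += run_length                     # jump to the start of the next run
    return mode_value
```

On the earlier example, presort_mode([5, 5, 2, 3, 5, 8, 8, 7]) returns 5.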
ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1]) - Tracing

• For 5, 5, 2, 3, 5, 8, 8, 7: the sorted array is 2, 3, 5, 5, 5, 7, 8, 8; the run lengths are 1, 1, 3, 1, 2, so the mode is 5.

ALGORITHMS – ANALYSIS AND DESIGN
Presortmode(A[0….n-1]) - Analysis

• The total time is the sum of the time spent on sorting and the

time spent on checking consecutive elements.

• Sorting requires about n log₂ n comparisons

• Checking requires no more than n comparisons

• T(n) = Tsort(n) + Tscan(n) ∈ Θ(n log n) + Θ(n) = Θ(n log n)


ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Presorting

• Checking uniqueness of elements

• Computing Mode
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Balanced Search Trees
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Presorting

• Checking uniqueness of elements

• Computing Mode
ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• Binary Search Trees

• It is a binary tree whose nodes contain elements of a set of

orderable items, one element per node, so that all

elements in the left subtree are smaller than the element

in the subtree’s root, and all the elements in the right

subtree are greater than it.

• These trees have to be balanced.


ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• The first approach is of the instance-simplification variety: an

unbalanced binary search tree is transformed into a balanced

one.

• Such trees are called self-balancing.


ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• An AVL tree requires that the difference between the heights of

the left and right subtrees of every node never exceed 1.


ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• A red-black tree tolerates the height of one subtree being

twice as large as the other subtree of the same node.

• If an insertion or deletion of a new node creates a tree

with a violated balance requirement, the tree is

restructured by one of a family of special

transformations called rotations that restore the

balance required.
ALGORITHMS – ANALYSIS AND DESIGN
Balanced Search Trees

• The second approach is of the representation-change variety:

allow more than one element in a node of a search tree.

• Specific cases of such trees are 2-3 trees, 2-3-4 trees, and

more general and important B-trees.

• They differ in the number of elements admissible in a

single node of a search tree, but all are perfectly

balanced.
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

• AVL trees were invented in 1962 by two Russian scientists, G.

M. Adelson-Velsky and E. M. Landis after whom this data

structure is named.
Fig 1 – Adapted from https://en.wikipedia.org/wiki/Georgy_Adelson-Velsky

Fig 2 – Adapted from https://en.wikipedia.org/wiki/Evgenii_Landis
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

• An AVL tree is a binary search tree in which the balance factor

of every node, which is defined as the difference between the

heights of the node’s left and right subtrees, is either 0 or +1

or −1.
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

• The height of the empty tree is defined as −1.

• The balance factor can also be computed as the difference

between the numbers of levels rather than the height

difference of the node’s left and right subtrees.


ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees

[Figure: an example AVL tree with the balance factor shown at each node]
ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees - Rotation

• When the insertion of a new node makes an AVL tree unbalanced, a rotation is used to transform the tree.

• A rotation is a local transformation of the subtree rooted at a node whose balance factor has become either +2 or −2.

• If there are several such nodes

• Rotate the subtree rooted at the unbalanced node that is closest to the newly inserted leaf.


ALGORITHMS – ANALYSIS AND DESIGN
AVL Trees - Rotation

• Four kinds

• Single right rotation

• Single left rotation

• Double left-right rotation

• Double right-left rotation


ALGORITHMS – ANALYSIS AND DESIGN
Single Right Rotation

• Also known as R-rotation

• It is performed after inserting a new key into the left subtree

of the left child of a tree whose root had a balance of +1 before

the insertion.
ALGORITHMS – ANALYSIS AND DESIGN
Single Right Rotation

[Figure: keys 3, 2, 1 inserted in that order; the root 3 has balance +2 before the rotation]

• 2 becomes the new root.

• 3 takes ownership of 2’s right child as its left child. Here the

right child of 2 is null.

• 2 takes ownership of 3 as its right child.
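The three steps can be sketched with a minimal node class (an illustrative sketch; Node and rotate_right are assumed names, not part of the slides, and this is not a full AVL implementation):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(root):
    """Single R-rotation: the left child becomes the new subtree root."""
    new_root = root.left          # 2 becomes the new root
    root.left = new_root.right    # 3 adopts 2's right child as its left child
    new_root.right = root         # 2 takes ownership of 3 as its right child
    return new_root

# Keys 3, 2, 1 inserted in decreasing order form a left chain:
unbalanced = Node(3, left=Node(2, left=Node(1)))
balanced = rotate_right(unbalanced)
# balanced.key == 2, balanced.left.key == 1, balanced.right.key == 3
```

A full AVL insertion would also update heights and pick the rotation by balance factor; the slides cover those cases next.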


ALGORITHMS – ANALYSIS AND DESIGN
Single Right Rotation

[Figure: after the R-rotation, 2 is the root with children 1 and 3, and all balance factors are 0]
ALGORITHMS – ANALYSIS AND DESIGN
Single Left Rotation

• It is a symmetric rotation known as L-rotation.

• It is the mirror image of the single R-rotation.

• It is performed after a new node is inserted into the right

subtree of the right child of a tree whose root has a balance of

-1 before the insertion


ALGORITHMS – ANALYSIS AND DESIGN
Single Left Rotation

[Figure: keys 1, 2, 3 inserted in that order; the root 1 has balance −2 before the rotation]
• 2 becomes the new root.

• 1 takes ownership of 2’s left child as its right child. Here the

left child of 2 is null.

• 2 takes ownership of 1 as its left child.


ALGORITHMS – ANALYSIS AND DESIGN
Single Left Rotation

[Figure: after the L-rotation, 2 is the root with children 1 and 3, and all balance factors are 0]
ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• Also known as LR rotation.

• Is a combination of two rotations

• Perform L rotation on the left subtree of root r followed

by the R-rotation of the new tree rooted at r.

• Is performed after a new key is inserted into the right subtree

of the left child of the tree whose root has a balance of 1

before the insertion.


ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

[Figure: keys 3, 1, 2 inserted in that order; the root 3 has balance +2 and its left child 1 has balance −1]

• Consider first the subtree rooted at 1, with 2 as its right child.
ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• 2 becomes the new root.

• 1 takes ownership of 2’s left child as its right child. Here the

left child of 2 is null.

• 2 takes ownership of 1 as its left child.

ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• Now the tree is the left chain 3 → 2 → 1, which a single R-rotation balances.
ALGORITHMS – ANALYSIS AND DESIGN
Double left-right rotation

• Applying R rotation

• 2 becomes the new root.

• 3 takes ownership of 2’s right child as its left child. Here the

right child of 2 is null.

• 2 takes ownership of 3 as its right child.

[Figure: final tree: 2 is the root with children 1 and 3]
ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Also known as RL rotation.

• Is a mirror image of the double LR rotation.

• Is performed after a new key is inserted into the left subtree

of the right child of the tree whose root has a balance of -1

before the insertion.


ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

[Figure: keys 1, 3, 2 inserted in that order; the root 1 has balance −2 and its right child 3 has balance +1]

• Consider first the subtree rooted at 3, with 2 as its left child.
ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Applying R rotation

• 2 becomes the new root.

• 3 takes ownership of 2’s right child as its left child. Here the

right child of 2 is null.

• 2 takes ownership of 3 as its right child.


ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Now the tree is the right chain 1 → 2 → 3, which a single L-rotation balances.
ALGORITHMS – ANALYSIS AND DESIGN
Double Right-Left Rotation

• Applying L rotation

• 2 becomes the new root.

• 1 takes ownership of 2’s left child as its right child. Here the

left child of 2 is null.


• 2 takes ownership of 1 as its left child.

[Figure: final tree: 2 is the root with children 1 and 3]
ALGORITHMS – ANALYSIS AND DESIGN
AVL Tree - Efficiency

• The major factor involved is the height of the tree.

• The height of the AVL tree satisfies the inequalities


⌊log₂ n⌋ ≤ h < 1.4405 log₂(n + 2) − 1.3277

• It implies that the operations of searching and insertion are

Θ(log n) in a worst-case scenario.


ALGORITHMS – ANALYSIS AND DESIGN
AVL Tree - Efficiency

• Searching in an AVL tree requires almost the same number of

comparisons as searching in a sorted array using binary

search.
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Balanced Search Trees • Efficiency

• AVL Trees

• Rotation

• Single Right

• Single Left

• Double Right-Left Rotation

• Double Left-Right Rotation


Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Balanced Search Trees: 2-3 Trees
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Balanced Search Trees • Efficiency

• AVL Trees

• Rotation

• Single Right

• Single Left

• Double Right-Left Rotation

• Double Left-Right Rotation


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• It follows the representation-change variety: allow more than

one element in a node of a search tree.

• The simplest implementation of this idea is 2-3 trees,

introduced by the U.S. computer scientist John Hopcroft in

1970
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• John Edward Hopcroft (born October 7, 1939) is an American

theoretical computer scientist.

Fig 3 – Adapted from https://en.wikipedia.org/wiki/John_Hopcroft
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• It can have nodes of two kinds

• 2-nodes

• 3-nodes
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 2-node

• Contains a single key K and has two children.

• The left child serves as the root of a subtree whose keys are less than K.

• The right child serves as the root of a subtree whose keys are greater than K.
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 2-node

[Figure: a 2-node holding key K; left subtree keys < K, right subtree keys > K]
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 3-node

• Contains two ordered keys K1 and K2 with K1<K2.

• It has three children.

• The left most child serves as the root of a subtree

whose keys are less than K1.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 3-node

• Middle child serves as the root of a subtree whose keys

are between K1 and K2.

• Right most child serves as the root of a subtree whose

keys are greater than K2.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• 3-node

K1, K2

< K1 (K1, K2) > K2


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Trees

• All the leaves of the 2-3 tree must be on the same level.

• 2-3 tree is always perfectly balanced.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching

• Start at the root.

• If the root is a 2-node, the search continues as in a BST.

• Stop if K is equal to the root’s key or continue the search in the

left or the right subtree if K is smaller or larger than the root’s

key.
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching

• If the root is a 3-node, no more than 2 key comparisons determine whether

• the search stops because K is equal to one of the root's keys,

• or in which of the root's three subtrees the search has to be continued.
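The search can be sketched in Python, assuming a simple dictionary representation of nodes ({'keys': [...], 'children': [...]}) that is mine, not the slides':

```python
def search_23(node, k):
    """Search for key k in a 2-3 tree.
    A node is a dict {'keys': [...], 'children': [...]};
    leaves have no 'children' entry."""
    if node is None:
        return False
    keys, children = node['keys'], node.get('children', [])
    # At most two comparisons decide: stop, or pick one of the subtrees.
    for i, key in enumerate(keys):
        if k == key:
            return True
        if k < key:
            return search_23(children[i], k) if children else False
    return search_23(children[-1], k) if children else False

# The final tree from the insertion example below: root (5),
# children (3) and (8), leaves (2), (4), (7), (9).
leaf = lambda k: {'keys': [k]}
tree = {'keys': [5], 'children': [
    {'keys': [3], 'children': [leaf(2), leaf(4)]},
    {'keys': [8], 'children': [leaf(7), leaf(9)]},
]}
```

Here search_23(tree, 7) finds the key while search_23(tree, 6) does not.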
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• A new key K is inserted in a leaf (unless the tree is empty).

• The appropriate leaf is found by performing the search for K.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• If the leaf is a

• 2-node

• Insert K as the first key if K is smaller than the node's old key.

• Insert K as the second key if K is larger than the node's old key.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• 3-node

• Split the leaf into two.

• The smallest of the three keys (two old ones and the new

key) is put in the first leaf

• The largest key is put in the second leaf.

• The middle key is promoted to the old leaf’s parent.


ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• List: 9, 5, 8, 3, 2, 4, 7

• Insert 9: (9). Insert 5: (5, 9). Insert 8: (5, 8, 9) has to split; 8 is promoted, giving root (8) with leaves (5) and (9).

• Insert 3: leaves (3, 5) and (9). Insert 2: (2, 3, 5) has to split; 3 is promoted, giving root (3, 8) with leaves (2), (5) and (9).

• Insert 4: the middle leaf becomes (4, 5).
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Insertion

• Insert 7: (4, 5, 7) has to split; 5 is promoted into (3, 8), which becomes (3, 5, 8) and has to split in turn, promoting 5 to a new root.

• Final tree: root (5) with children (3) and (8), and leaves (2), (4), (7) and (9).
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching - Analysis

• The efficiency depends on the height of the tree

• A 2-3 tree of height h with the smallest number of keys is a full

tree of 2-nodes.

• For any 2-3 tree of height h with n nodes

n  1 + 2 + ........ + 2 h

h +1
= 2 −1
hence h  log 2 (n + 1) −1
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching - Analysis

• A 2-3 tree of height h with the largest number of keys is a full

tree of 3-nodes each with two keys and three children.

• For any 2-3 tree with n nodes

• n  2.1 + 2.3 + ........ + 2.3 h

= 2(1 + 3 + ........ + 3)
= 3h +1 −1`
hence h  log 3 (n + 1) −1
ALGORITHMS – ANALYSIS AND DESIGN
2-3 Nodes Searching - Analysis

• The lower and upper bounds on height h

log₃(n + 1) − 1 ≤ h ≤ log₂(n + 1) − 1


• This implies that the time efficiencies of searching, insertion

and deletion are all in Θ(log n) in both the worst and average

case.
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• 2-3 Trees

• Searching

• Insertion

• Analysis
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Heaps
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• 2-3 Trees

• Searching

• Insertion

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

• Heapsort was invented by J. W. J. Williams in 1964.

• John William Joseph Williams (2 September 1930 – 29

September 2012) was a Welsh-Canadian computer scientist

best known for inventing in 1964 heapsort and the binary

heap data structure

Fig 4 – Adapted from https://en.wikipedia.org/wiki/J._W._J._Williams
ALGORITHMS – ANALYSIS AND DESIGN
Heap

• Heap is a partially ordered data structure that is suitable for

implementing priority queues.

• It can be defined as a binary tree with keys assigned to its

nodes, provided the following two conditions are met:

• The tree’s shape requirement

• The parental dominance requirement


ALGORITHMS – ANALYSIS AND DESIGN
Heap

• The tree’s shape requirement

• The binary tree is essentially complete

• All its levels are full except possibly the last level where

only some rightmost leaves are missing.

• The parental dominance requirement

• The key at each node is greater than or equal to the

keys at its children.


ALGORITHMS – ANALYSIS AND DESIGN
Heap

[Figure: three trees; the first satisfies both requirements and is a heap, while the other two violate the shape or the parental dominance requirement and are not heaps]
ALGORITHMS – ANALYSIS AND DESIGN
Heap

• The key values in a heap are ordered top down.

• The sequence of values on any path from the root to a leaf is non-increasing.

• There is no left-to-right order in key values.

• There is no relationship among the key values of nodes on the same level of the tree.


ALGORITHMS – ANALYSIS AND DESIGN
Heap - Properties

• There exists exactly one essentially complete binary tree with

n nodes.

• Its height is equal to ⌊log₂ n⌋.

• The root of the heap always contains the largest element (in a max-heap) or the smallest element (in a min-heap).

• A node of the heap considered with all its descendants is also

a heap.
ALGORITHMS – ANALYSIS AND DESIGN
Heap - Properties

• A heap can be implemented as an array by recording its

elements in the top-down, left-to-right fashion.

• It is convenient to store the heap’s elements in positions 1

thru n of such an array leaving H[0] either unused or

putting there a sentinel element whose value is greater

than every element in the heap.


ALGORITHMS – ANALYSIS AND DESIGN
Heap - Properties

• Here

• The parental node keys will be in the first ⌊n/2⌋ positions of the array, while the leaf keys will occupy the last ⌈n/2⌉ positions.

• The children of a key in the array's parental position i (1 ≤ i ≤ ⌊n/2⌋) will be in positions 2i and 2i + 1.

• Correspondingly, the parent of a key in position i (2 ≤ i ≤ n) will be in position ⌊i/2⌋.


ALGORITHMS – ANALYSIS AND DESIGN
Heap

• Heap can be defined as an array H[1..n] where every element

in position i in the first half of the array is greater than or

equal to the elements in positions 2i and 2i+1

H[i] ≥ max{H[2i], H[2i + 1]} for i = 1, ..., ⌊n/2⌋
ALGORITHMS – ANALYSIS AND DESIGN
Heaps

• Max heap

• It is a complete binary tree in which the value in each

internal node is greater than or equal to the values in the

children of that node.


ALGORITHMS – ANALYSIS AND DESIGN
Heaps

• Min heap

• It is a complete binary tree in which the value in each

internal node is less than or equal to the values in the children

of that node.
ALGORITHMS – ANALYSIS AND DESIGN
Heap Construction

• Two approaches

• Bottom-up heap construction

• Top-down heap construction


ALGORITHMS – ANALYSIS AND DESIGN
Bottom-up Heap construction

• Initializes the complete binary tree with n nodes placing keys

in the order given.


ALGORITHMS – ANALYSIS AND DESIGN
Bottom-up Heap construction

• Heapify the tree as follows

• Starts with the parental nodes while checking whether the

parental dominance holds for the key of this node.

• If it does not, it exchanges the node's key K with the larger

key of its children.

• It then checks whether parental dominance holds for K in

its new position.


ALGORITHMS – ANALYSIS AND DESIGN
Bottom-up Heap construction

• The process continues until the parental dominance

requirement for K is satisfied.

• The algorithm proceeds to do the same for the node's immediate

predecessor.

• The algorithm stops after this is done for the tree's root.
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n])

• //Constructs a heap from the elements of a given array by the

// bottom – up algorithm

• // Input: An array H[1..n] of orderable items

• // Output: A heap H[1..n]


ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n])

• for i ← [n/2] down to 1 do

• k ← i; v ← H [k]

• heap ← false

• while not heap and 2*k ≤ n do

• j ← 2*k

• if j < n // there are two children

• if H [ j ] < H [ j + 1] then j ← j + 1
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n])

• if v ≥ H[j] then heap ← true

• else

• H[k] ← H[j]

• k ← j

• H[k] ← v
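The pseudocode translates line for line into Python; the 1-based indexing with an unused H[0] slot is kept, and the function name is mine:

```python
def heap_bottom_up(h):
    """Bottom-up max-heap construction on a 1-based array (h[0] unused)."""
    n = len(h) - 1
    for i in range(n // 2, 0, -1):        # last parental node down to the root
        k, v = i, h[i]
        heap = False
        while not heap and 2 * k <= n:
            j = 2 * k
            if j < n and h[j] < h[j + 1]: # pick the larger of the two children
                j += 1
            if v >= h[j]:
                heap = True               # parental dominance holds for v
            else:
                h[k] = h[j]               # move the larger child up
                k = j
        h[k] = v                          # place v in its final position
    return h

# heap_bottom_up([None, 2, 9, 7, 6, 5, 8]) → [None, 9, 6, 8, 2, 5, 7]
```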
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Tracing

• Example: 2, 9, 7, 6, 5, 8

[Figure: heapify trace; 7 is exchanged with 8, then 2 sifts down past 9 and 6, giving the heap 9, 6, 8, 2, 5, 7]
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Worst-case scenario

• n = 2^k − 1

• A heap’s tree is full – maximum number of nodes occurs

on each level.
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Let h be the height of the tree

• h = ⌊log₂ n⌋

• Each key on level i of the tree will travel to the leaf level h

in the worst case of construction.


ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Moving to the next level down requires two comparisons

• One to find the largest child

• One to determine whether the exchange is required

• Total number of comparisons involving a key on level i will

be 2(h-i).
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Total number of comparisons

C_worst(n) = Σ_{i=0}^{h−1} Σ_{keys on level i} 2(h − i)

= Σ_{i=0}^{h−1} 2(h − i) · 2^i, since the number of nodes on level i is 2^i

= 2h Σ_{i=0}^{h−1} 2^i − 2 Σ_{i=0}^{h−1} i · 2^i
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• Total number of comparisons (using Σ_{i=0}^{h−1} 2^i = 2^h − 1 and Σ_{i=0}^{h−1} i · 2^i = (h − 2) · 2^h + 2)

= 2(h(2^h − 1) − ((h − 2) · 2^h + 2))

= 2(h · 2^h − h − h · 2^h + 2 · 2^h − 2)

= 2(2 · 2^h − h − 2)
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• n = 2^(h+1) − 1, so n + 1 = 2^(h+1)

Applying log₂ on both sides: log₂(n + 1) = h + 1, hence h = log₂(n + 1) − 1

Substituting back we get

C_worst(n) = 2(2 · 2^(log₂(n+1) − 1) − (log₂(n + 1) − 1) − 2)

= 2(2^(log₂(n+1)) − log₂(n + 1) − 1)
ALGORITHMS – ANALYSIS AND DESIGN
HeapBottomUp(H[1..n]) - Analysis

• = 2(2^(log₂(n+1)) − log₂(n + 1) − 1)

= 2(n + 1 − log₂(n + 1) − 1)

= 2(n − log₂(n + 1))
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Heaps

• Bottom-up Heap construction

• Analysis
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Heapsort
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Heaps

• Bottom-up Heap construction

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• Constructs a heap by successive insertions of a new key into a

previously constructed heap.

• Attach a new node with key K in it after the last leaf of the

existing heap.
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• Shift K up to its appropriate place in the new heap as

follows.

• Compare K with its parent’s key.

• If the latter is greater than or equal to K stop.

• Else swap these two keys and compare K with its new

parent.
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• This swapping continues until K is not greater than its last

parent or it reaches the root.
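The sift-up procedure described above can be sketched as follows (an assumed helper name, not from the slides; 1-based indexing with h[0] unused):

```python
def heap_insert(h, key):
    """Top-down heap construction step: append the key, then sift it up."""
    h.append(key)                      # attach after the last leaf
    k = len(h) - 1
    while k > 1 and h[k // 2] < h[k]:  # parent smaller than K: swap, go up
        h[k // 2], h[k] = h[k], h[k // 2]
        k //= 2

# Inserting 10 into the heap 9, 6, 8, 2, 5, 7:
h = [None, 9, 6, 8, 2, 5, 7]
heap_insert(h, 10)
# h == [None, 10, 6, 9, 2, 5, 7, 8]
```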


ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

• Example: 2, 9, 7, 6, 5, 8 built by successive insertions gives the heap 9, 6, 8, 2, 5, 7.

• Insert 10 into the heap constructed: 10 is attached after the last leaf, as the right child of 8.
ALGORITHMS – ANALYSIS AND DESIGN
Top-down Heap Construction

[Figure: 10 is swapped with its parent 8 and then with its parent 9, giving the heap 10, 6, 9, 2, 5, 7, 8]
ALGORITHMS – ANALYSIS AND DESIGN
Top-down heap construction – Analysis

• The insertion operation cannot require more key comparisons

than the heap’s height.

• Since the height of a heap with n nodes is about log2n, the

time efficiency of insertion is in O(log n)


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

• Maximum Key Deletion from a heap

• Exchange the root’s key with the last key K of the heap

• Decrease the heap’s size by 1.


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

• “Heapify” the smaller tree by shifting K down the tree

exactly in the same way as it is done in bottom-up heap

construction algorithm.

• Verify the parental dominance for K.


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

• If it holds stop.

• else

• Swap K with the larger of its children.

• Repeat this operation until the parental dominance

condition holds for K in its new position.
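The deletion procedure can be sketched in Python, reusing the sift-down idea from bottom-up construction (names are illustrative; 1-based indexing with h[0] unused):

```python
def delete_max(h):
    """Delete and return the root key of a 1-based max-heap."""
    h[1], h[-1] = h[-1], h[1]   # exchange the root with the last key
    maximum = h.pop()           # decrease the heap's size by 1
    n, k = len(h) - 1, 1
    while 2 * k <= n:           # sift the new root down
        j = 2 * k
        if j < n and h[j] < h[j + 1]:
            j += 1              # larger of the two children
        if h[k] >= h[j]:
            break               # parental dominance holds
        h[k], h[j] = h[j], h[k]
        k = j
    return maximum

h = [None, 9, 8, 6, 2, 5, 1]
top = delete_max(h)   # top == 9; h is now [None, 8, 5, 6, 2, 1]
```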


ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Algorithm

[Figure: deleting 9 from the heap 9, 8, 6, 2, 5, 1; exchange 9 with the last key 1, remove 9, then sift 1 down (swap with 8, then with 5), giving the heap 8, 5, 6, 2, 1]
ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Analysis

• The efficiency of deletion is determined by the number of key

comparisons needed to “heapify” the tree after the swap has

been made and the size of the tree is decreased by 1.

• Since this cannot require more key comparisons than twice

the heap’s height, the time efficiency of deletion is in O(log n).


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

• This is a two-stage algorithm that works as follows.

• Stage 1 (heap construction): Construct a heap for a given

array.

• Stage 2 (maximum deletions): Apply the root-deletion

operation n − 1 times to the remaining heap
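Combining the two stages gives a compact in-place heapsort sketch (assuming the 1-based array convention used earlier; sift_down is a hypothetical helper name):

```python
def heapsort(a):
    """In-place heapsort on a 1-based array (a[0] unused)."""
    n = len(a) - 1

    def sift_down(k, end):
        v = a[k]
        while 2 * k <= end:
            j = 2 * k
            if j < end and a[j] < a[j + 1]:
                j += 1                   # larger of the two children
            if v >= a[j]:
                break
            a[k] = a[j]
            k = j
        a[k] = v

    for i in range(n // 2, 0, -1):       # stage 1: bottom-up heap construction
        sift_down(i, n)
    for end in range(n, 1, -1):          # stage 2: n − 1 maximum deletions
        a[1], a[end] = a[end], a[1]      # move the current maximum to the end
        sift_down(1, end - 1)
    return a

# heapsort([None, 2, 9, 7, 6, 5, 8]) → [None, 2, 5, 6, 7, 8, 9]
```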


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

[Figure: stage 2 begins on the heap 9, 8, 6, 2, 5, 1; deleting 9 leaves the heap 8, 5, 6, 2, 1]
ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

[Figure: successive maximum deletions; after deleting 8 the heap is 6, 5, 1, 2; after 6 it is 5, 2, 1; after 5 it is 2, 1]
ALGORITHMS – ANALYSIS AND DESIGN
Heapsort

[Figure: deleting 2 leaves the single-key heap 1]

• Remove 1 from the heap; the keys 9, 8, 6, 5, 2, 1 have been removed in decreasing order, so the array is sorted.


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort – Analysis

• The efficiency of the heap construction stage is Θ(n)

with bottom-up heap construction.

• The effort required to build the heap is linear.

• With top-down heap construction each insertion is in O(log n),

so the heap construction stage is in O(n log n).


ALGORITHMS – ANALYSIS AND DESIGN
Heapsort – Analysis

• The array elements are deleted in decreasing order from a max-heap.

• Time efficiency of the second stage

• C(n) is the number of comparisons needed for eliminating

the root keys from the heaps of diminishing sizes from n to

2.
ALGORITHMS – ANALYSIS AND DESIGN
Deletion of the root’s key from a heap – Analysis

C(n) ≤ 2⌊log₂(n − 1)⌋ + 2⌊log₂(n − 2)⌋ + ... + 2⌊log₂ 2⌋

• ≤ 2 Σ_{i=1}^{n−1} log₂ i ≤ 2 Σ_{i=1}^{n−1} log₂(n − 1) = 2(n − 1) log₂(n − 1)

≈ 2n log₂ n for very large values of n

• For heapsort as a whole: T(n) = Θ(n) + Θ(n log₂ n) = Θ(max{n, n log₂ n}) = Θ(n log₂ n)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Top-down Heap Construction

• Deletion of the root’s key from a heap

• Analysis

• Heapsort

• Algorithm

• Tracing

• Analysis
Thank you

ALGORITHMS – ANALYSIS AND DESIGN
Transform and Conquer, Space and Time Tradeoffs
Space and Time Tradeoffs
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Top-down Heap Construction

• Deletion of the root’s key from a heap

• Analysis

• Heapsort

• Algorithm

• Tracing

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• A space–time or time–memory tradeoff in computer science

is a case where an algorithm or program trades increased

space usage for decreased running time.

• Space refers to the data storage consumed in performing a

given task (RAM, HDD)

• Time refers to the time consumed in performing a given task

(computation time or response time).


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• The utility of a given space-time tradeoff is affected by related

fixed and variable costs (ex. CPU speed, storage space).

• The best algorithm is one that solves the problem while

requiring less space in memory and also taking less time to

generate the output.

• But in general, it is not always possible to achieve both

at the same time.


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• To overcome this issue there are three approaches

• Preprocess the problem's input, in whole or in part.

• Store the additional information obtained to accelerate

solving the problem later.

• This approach is known as input enhancement.


ALGORITHMS – ANALYSIS AND DESIGN
Input Enhancement

• Algorithms based on this technique

• Counting methods for sorting

• Boyer-Moore algorithm for string matching

• Horspool algorithm for string matching


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Second approach

• Called prestructuring.

• Use extra space to facilitate faster and more flexible access

to data.

• Algorithms based on this technique

• Hashing

• Indexing with B trees


ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• Third approach

• Dynamic Programming

• Based on recording solutions to overlapping sub-

problems of a given problem in a table from which a

solution to the problem in question is then obtained.


ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• One method

• For each element of a list to be sorted, count the total

number of elements smaller than this element and record

the results in a table.

• The numbers will indicate the positions of the elements in

the sorted list.


ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• Example

• If the count is 10 it means the element will be in 11th

position with index 10.

• The algo is known as comparison counting algorithm.


ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1])

• //Sorts an array by comparison counting

• //Input: An array A[0..n-1] orderable elements

• //Output: Array S[0..n-1] of A’s elements sorted in non-

//decreasing order
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1])

• for i ← 0 to n-1 do

• count[i] ← 0

• for i ← 0 to n-2 do

• for j ← i+1 to n-1 do

• if A[i] < A[j]

• count[j] ← count[j] + 1
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1])

• else

• count[i] ← count[i]+1

• for i ← 0 to n-1 do

• S[count[i]] ← A[i]

• return S
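The pseudocode above translates almost line for line into Python; `comparison_counting_sort` is just an illustrative name for this sketch.

```python
def comparison_counting_sort(a):
    """Sort a list by comparison counting (as in ComparisonCountingSort)."""
    n = len(a)
    count = [0] * n                  # count[i] = elements that must precede a[i]
    for i in range(n - 1):
        for j in range(i + 1, n):
            if a[i] < a[j]:
                count[j] += 1        # a[j] is larger, so one more element precedes it
            else:
                count[i] += 1        # ties also go to the earlier element
    s = [None] * n
    for i in range(n):
        s[count[i]] = a[i]           # count[i] is a[i]'s final position
    return s
```

For the lecture's example, `comparison_counting_sort([65, 28, 87, 93, 22, 44])` returns `[22, 28, 44, 65, 87, 93]`.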
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• 65, 28, 87, 93, 22, 44

• Initially, count[] = 0 0 0 0 0 0

• After i = 0, count[] = 3 0 1 1 0 0

• After i = 1, count[] = 3 1 2 2 0 1
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• After i = 2, count[] = 3 1 4 3 0 1

• After i = 3, count[] = 3 1 4 5 0 1
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• After i = 4, count[] = 3 1 4 5 0 2

• Final state, count[] = 3 1 4 5 0 2
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• count[] = 3 1 4 5 0 2

• 65, 28, 87, 93, 22, 44


• After i = 0, S[] = _ _ _ 65 _ _

• After i = 1, S[] = _ 28 _ 65 _ _

• After i = 2, S[] = _ 28 _ 65 87 _

• After i = 3, S[] = _ 28 _ 65 87 93

(positions 0 1 2 3 4 5; ‘_’ marks an empty slot)
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) - Tracing

• count[] = 3 1 4 5 0 2

• 65, 28, 87, 93, 22, 44

• After i = 4, S[] = 22 28 _ 65 87 93

• After i = 5, S[] = 22 28 44 65 87 93

(positions 0 1 2 3 4 5)
ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) – Analysis

• Step 1: Input size is ‘n’

• Step 2: The basic operation is Comparison A[i] < A[j]

• Step 3: The number of comparisons depends on n.

• Step 4: One comparison per iteration.


ALGORITHMS – ANALYSIS AND DESIGN
ComparisonCountingSort(A[0..n-1]) – Analysis

• Step 5: The number of times it is executed is


C(n) = Σ_{i=0}^{n-2} Σ_{j=i+1}^{n-1} 1

     = Σ_{i=0}^{n-2} [(n - 1) - (i + 1) + 1]

     = Σ_{i=0}^{n-2} (n - 1 - i)

     = n(n - 1)/2 ∈ O(n²)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Introduction

• Input Enhancement

• Presorting

• Comparison Counting

• Tracing

• Analysis
Thank you

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Sorting by Counting

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Introduction

• Input Enhancement

• Presorting

• Comparison Counting

• Tracing

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• Sort a list of items with some other information associated

with the keys.

• Copy the elements into a new array S[0..n-1] to hold the

sorted elements.

• The elements of A whose values are equal to the lowest

possible value l are copied into the first D[0] elements of S, i.e.

positions 0 to D[0] – 1.
ALGORITHMS – ANALYSIS AND DESIGN
Sorting by counting

• The elements of value l+1 are copied into positions from D[0]

to (D[0]+D[1]) – 1 and so on.

• The accumulated sums of frequencies are called distribution in

statistics.

• The method is known as Distribution Counting.


ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u)

• //Sorts an array of integers from a limited range by

//distribution counting

• //Input: An array A[0..n-1] of integers between l and u

• //Output: Array S[0..n-1] of A’s elements sorted in non -

//decreasing order
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u)

• for j ← 0 to u-l do

• D[j]←0 //initialize frequencies

• for i ← 0 to n-1 do

• D[A[i]-l]←D[A[i]-l]+1 // compute frequencies

• for j ← 1 to u – l do

• D[j] ← D[j-1] + D[j] //reuse for distribution


ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u)

• for i ← n-1 downto 0 do

• j ← A[i]-l

• S[D[j]-1] ← A[i]

• D[j] ← D[j] – 1

• return S
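The pseudocode above can be sketched in Python as follows; `distribution_counting_sort` is an illustrative name. The only cosmetic difference is that the distribution value is decremented before it is used as an index, which is equivalent to the pseudocode's `S[D[j]-1] ← A[i]; D[j] ← D[j] - 1`.

```python
def distribution_counting_sort(a, l, u):
    """Sort integers from the known range [l, u] by distribution counting."""
    d = [0] * (u - l + 1)
    for v in a:                      # compute frequencies
        d[v - l] += 1
    for j in range(1, u - l + 1):    # reuse for the distribution (prefix sums)
        d[j] += d[j - 1]
    s = [None] * len(a)
    for v in reversed(a):            # right-to-left pass keeps the sort stable
        d[v - l] -= 1
        s[d[v - l]] = v
    return s
```

For the lecture's example, `distribution_counting_sort([15, 13, 14, 15, 14, 14], 13, 15)` returns `[13, 14, 14, 14, 15, 15]`.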
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• Array – 15, 13, 14, 15, 14, 14

• These have come from the set {13,14,15}.

• Here l = 13, u = 15

• Initially D[] = 0 0 0

• After i = 0 D[] = 0 0 1

• After i = 1 D[] = 1 0 1

• After i = 2 D[] = 1 1 1
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 3, D[] = 1 1 2

• After i = 4, D[] = 1 2 2

• After i = 5, D[] = 1 3 2
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After j = 1, D[] = 1 4 2

• After j = 2, D[] = 1 4 6
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 5: j = 1, S[] = _ _ _ 14 _ _, D[] = 1 3 6

• After i = 4: j = 1, S[] = _ _ 14 14 _ _, D[] = 1 2 6
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 3: j = 2, S[] = _ _ 14 14 _ 15, D[] = 1 2 5

• After i = 2: j = 1, S[] = _ 14 14 14 _ 15, D[] = 1 1 5
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Tracing

• After i = 1: j = 0, S[] = 13 14 14 14 _ 15, D[] = 0 1 5

• After i = 0: j = 2, S[] = 13 14 14 14 15 15, D[] = 0 1 4
ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Analysis

• When the range of values are fixed it is a linear algorithm.

• It makes two consecutive passes through its input array A.

• Parameter for analysis is n- number of inputs

• Basic Operation is traversal of input array.

• The array is traversed twice during the algorithm.


ALGORITHMS – ANALYSIS AND DESIGN
DistributionCounting(A[0..n-1], l, u) - Analysis

• The time efficiency of the algorithm is given as


T(n) = Σ_{i=0}^{n-1} 1 + Σ_{i=0}^{n-1} 1

     = [(n - 1) - 0 + 1] + [(n - 1) - 0 + 1]

     = 2n

T(n) ∈ Θ(n)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Distribution Counting

• Tracing

• Analysis
THANK YOU

Lekha A
Department of Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Input Enhancement in String
Matching - I

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Distribution Counting

• Tracing

• Analysis
ALGORITHMS – ANALYSIS AND DESIGN
Introduction

• The problem of string matching requires finding an occurrence

of a given string of m characters called the pattern in a longer

string of n characters called the text.

• In input enhancement method preprocess the pattern to get

some information about it, store this information in a table,

and then use this information during an actual search for the

pattern in a given text.


ALGORITHMS – ANALYSIS AND DESIGN
Input enhancement in string matching

• Boyer-Moore Algorithm

• Horspool Algorithm
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• It was published by Nigel Horspool in 1980.

Fig 3 – adapted from


http://webhome.cs.uvic.ca/~nigelh/
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• It is a simplified version of Boyer-Moore algorithm.

• Both algorithms start by aligning the pattern against the

beginning characters of the text.

• If no matching string is found (that is, the first trial fails), shift

the pattern to the right.

• They differ in deciding the shift size.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Consider, as an example, searching for the pattern BARBER in

some text:

• If a mismatch occurs, shift the pattern to the right.

• Try to make as large a shift as possible without risking the

possibility of missing a matching substring in the text.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Horspool’s algorithm determines the size of such a shift by

looking at the character c of the text that is aligned against the

last character of the pattern.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 1

• If there are no c’s in the pattern—e.g., c is letter S in

the example— safely shift the pattern by its entire

length (if shift is less, some character of the pattern

would be aligned against the text’s character c that is

known not to be in the pattern)


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 1
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 2

• If there are occurrences of character c in the pattern

but it is not the last one there—e.g., c is letter B in the

example—the shift should align the rightmost

occurrence of c in the pattern with the c in the text


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 2
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 3

• If c happens to be the last character in the pattern but

there are no c’s among its other m − 1 characters—e.g.,

c is letter R in the example—the situation is similar to

that of Case 1 and the pattern should be shifted by the

entire pattern’s length m


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 3
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 4

• If c happens to be the last character in the pattern and

there are other c’s among its first m − 1 characters.

• c is letter R in the example

• The situation is similar to that of Case 2


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm - Possibilities

• In general, the following four possibilities can occur

• Case 4

• The rightmost occurrence of c among the first m −

1 characters in the pattern should be aligned with

the text’s c.
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Precompute shift sizes and store them in a table.

• For every character c, the shift’s value is determined by

• t(c) = the pattern’s length, m

• If c is not among the first m - 1 characters of the pattern

• else

• t(c) = the distance from the rightmost c among the first m – 1

characters of the pattern to its last character


ALGORITHMS – ANALYSIS AND DESIGN
Algorithm – ShiftTable(P[0..m-1])

• //Fills the shift table used by Horspool’s and Boyer- Moore

//algorithms

• //Input: Pattern P[0..m-1] and an alphabet of possible

//characters

• //Output: Table[0..size-1] indexed by the alphabet’s characters

//and filled with shift sizes computed by formula


ALGORITHMS – ANALYSIS AND DESIGN
Algorithm – ShiftTable(P[0..m-1])

• Initialize all the elements of Table with m

• for j ←0 to m-2 do

• Table[P[j]] ← m-1-j

• return Table
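The table-filling loop above can be sketched in Python with a dictionary; `shift_table` is an illustrative name, and characters missing from the dictionary get the default shift m via `table.get(c, m)` at lookup time.

```python
def shift_table(p):
    """Shift table used by Horspool's and Boyer-Moore algorithms."""
    m = len(p)
    table = {}
    for j in range(m - 1):           # the last pattern character is excluded
        table[p[j]] = m - 1 - j      # rightmost occurrences overwrite earlier ones
    return table
```

For the pattern BARBER this yields `{'B': 2, 'A': 4, 'R': 3, 'E': 1}`, with every other character shifting by m = 6.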
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Step 1: For a given pattern of length m and the alphabet used

in both the pattern and text, construct the shift table.

• Step 2: Align the pattern against the beginning of the text.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Step 3: Scan pattern from the right to left by comparing its

characters with the corresponding characters in the text

• If a mismatching occurs, shift the pattern to the right by

t(c), where c is the text character currently aligned against

the last character of the pattern.


ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm

• Start the right-to-left comparison for the new aligned

position of the pattern until a single or multiple matching

is found or the text is exhausted


ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1])

• //Implements Horspool algorithm for string matching

• //Input: Pattern P[0..m-1] and text T[0..n-1]

• //Output: The index of the left end of the first matching

• // substring or -1 if there are no matches


ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1])

• ShiftTable(P[0..m-1])//generate table of shifts

• i ← m -1 // pos of pattern’s right end

• while i ≤ n-1 do

• k←0 //no. of matched characters

• while k ≤m-1 and P[m-1-k] = T[i-k] do

• k ← k+1
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1])

• if k = m

• return i - m + 1

• else

• i← i + Table[T[i]]

• return -1
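The algorithm above can be sketched in Python; `horspool_matching` is an illustrative name, and the shift table is inlined so the snippet is self-contained.

```python
def horspool_matching(p, t):
    """Horspool string matching: index of the first match in t, or -1."""
    m, n = len(p), len(t)
    table = {}                       # shifts for the pattern's first m-1 characters
    for j in range(m - 1):
        table[p[j]] = m - 1 - j
    i = m - 1                        # position of the pattern's right end in t
    while i <= n - 1:
        k = 0                        # number of matched characters
        while k <= m - 1 and p[m - 1 - k] == t[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        i += table.get(t[i], m)      # shift by t(c) for the aligned text char
    return -1
```

For the lecture's example, `horspool_matching("BARBER", "JIM_SAW_ME_IN_A_BARBERSHOP")` returns 16.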
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• Search for the pattern BARBER in the following text.

• JIM_SAW_ME_IN_A_BARBERSHOP

• P[] = B A R B E R
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After calling ShiftTable(P[0..m-1])

• j = 0, Table[B] = 6 - 1 - 0 = 5

• j = 1, Table[A] = 6 - 1 - 1 = 4

• j = 2, Table[R] = 6 - 1 - 2 = 3

• j = 3, Table[B] = 6 - 1 - 3 = 2

• j = 4, Table[E] = 6 - 1 - 4 = 1

Character c:  A  B  E  R  all others
Shift t(c):   4  2  1  3  6
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After i = 5 k = 0, P[5] = ‘R’ T[5] = ‘A’

• k not = 6, i = 5 + 4 = 9

• After i = 9 k = 0, P[5] = ‘R’, T[9] = ‘E’

• k not = 6, i = 9 + 1 = 10

• After i=10 k = 0, P[5] = ‘R’, T[10] = ‘ ’

• k not = 6, i = 10 + 6 = 16
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After i =16 k = 0, P[5] = ‘R’, T[16] = ‘B’

• k not = 6, i = 16 + 2 = 18

• After i =18 k = 0, P[5] = ‘R’, T[18] = ‘R’

• k = 1, P[4] = ‘E’, T[17] = ‘A’

• k not = 6, i = 18 + 3 = 21
ALGORITHMS – ANALYSIS AND DESIGN
HorspoolMatching(P[0..m-1], T[0..n-1]) - Tracing

• After i =21 k = 0, P[5] = ‘R’, T[21] = ‘R’

• k = 1, P[4] = ‘E’, T[20] = ‘E’

• k = 2, P[3] = ‘B’, T[19] = ‘B’

• k = 3, P[2] = ‘R’, T[18] = ‘R’

• k = 4, P[1] = ‘A’, T[17] = ‘A’

• k = 5, P[0] = ‘B’, T[16] = ‘B’

• k = 6 = m, return i - m + 1 = 21 - 6 + 1 = 16
ALGORITHMS – ANALYSIS AND DESIGN
Horspool Algorithm- Analysis

• Input – n and m

• Basic operation is P[m-1-k] = T[i-k]


C(n) = Σ_{i=m-1}^{n-1} Σ_{k=0}^{m-1} 1

     = Σ_{i=m-1}^{n-1} m

     = m[(n - 1) - (m - 1) + 1]

     = m(n - m + 1) = mn - m² + m

     ≈ mn for large values of n, i.e., the worst case is O(nm)
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Input enhancement in string matching

• Horspool algorithm
THANK YOU

Lekha A
Department of Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Input Enhancement in String Matching
- II
Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Input enhancement in string matching

• Horspool algorithm
ALGORITHMS – ANALYSIS AND DESIGN
Exact Matching

• P: word

• T: There would have been a time for such a word.

• ---------word---------------------------------------------->

• Here w and o are matched and a mismatch occurs at u

since r ≠ u.

• Since u doesn’t occur in P, it is better to skip the next two

alignments.
ALGORITHMS – ANALYSIS AND DESIGN
Exact Matching

• P: word

• T: There would have been a time for such a word.

• ---------word---------------------------------------------->

• word Skip

• word Skip

• word
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• It was developed by Robert S. Boyer and J Strother Moore in

1977.

Fig 4 – adapted from


https://www.cs.utexas.edu/users/boyer/

Fig 5 – adapted from


https://en.wikipedia.org/wiki/J_Strother_Moore
ALGORITHMS – ANALYSIS AND DESIGN
Boyer- Moore Algorithm

• Use knowledge gained from character comparisons to skip

future alignments that definitely won’t match

• If mismatch occurs, use knowledge of the mismatched text

character to skip alignments

• “Bad character rule”


ALGORITHMS – ANALYSIS AND DESIGN
Boyer- Moore Algorithm

• Use knowledge gained from character comparisons to skip

future alignments that definitely won’t match

• If some characters are matched, use knowledge of the

matched characters to skip alignments

• “Good suffix rule”


ALGORITHMS – ANALYSIS AND DESIGN
Boyer- Moore Algorithm

• Use knowledge gained from character comparisons to skip

future alignments that definitely won’t match

• Try alignments in one direction, then try character

comparisons in opposite direction

• For longer skips


ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• Avoids repeating unsuccessful comparisons against a target

character.

• Bad-symbol shift is guided by the text character c that caused

a mismatch.
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• Upon mismatch, let b be the mismatched character in T.

• Skip alignments until

• A. b matches its opposite in P,

• B. P moves past b
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• Skipped 8 alignments and 5 characters in T were never looked at


Fig 6 – Adapted from
http://www.cs.jhu.edu/~langmea/resources/lecture_notes/04_boyer_moore_v2.pdf
ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Let t = substring matched by inner loop; skip until

• A. there are no mismatches between P and t

• B. or P moves past t
ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Step 1

• t occurs in its entirety to the left within P


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Step 2

• Prefix of P matches a suffix of t


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Case (a) has two subcases according to whether t occurs in its

entirety to the left within P (as in step 1), or a prefix of P

matches a suffix of t (as in step 2)


ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule

• Bad character rule says skip 2.

• Good suffix rule says skip 7.

• Take the maximum! (7)


ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule

• Use bad character or good suffix rule, whichever skips more.

• Example
ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule

• 11 characters of T were ignored


ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad Character Rule + Good Suffix Rule
ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• If c is not present in the pattern, shift the pattern to just pass

this c in the text.

• d1 = max{t1(c) - k, 1}

• Where t1(c) is the entry in the pre-computed table and

k is the number of the matched characters.


ALGORITHMS – ANALYSIS AND DESIGN
Bad character Rule

• If t1(c) - k ≤ 0, then shift the pattern by one position to the

right.

• If c is not in the pattern shift the pattern to just pass this c.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Aligns only matching pattern characters against target

characters already successfully matched

• It works for the suffix.

• Suffix is the ending portion of the pattern.

• Its size is given by k.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Suppose for a given pattern P if we have already matched

some suffix ‘S’ but a mismatch occurs with the preceding

character xP.

• Shift the pattern to right along the string so that the matched

part is occupied by the same suffix ‘S’.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• No complete match of the suffix ‘S’ is possible if ‘S’ does not

occur elsewhere in P.

• Match the largest prefix of P.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• Good-suffix shift is guided by a successful match of the last k >

0 characters of the pattern.

• suff(k) is the ending portion of the pattern of size k

• d2 = distance between such second rightmost occurrence

of suff(k) and its rightmost occurrence.

• The second rightmost occurrence should not be preceded

by the same character as in the last occurrence.


ALGORITHMS – ANALYSIS AND DESIGN
Good Suffix Rule

• However, if there is no other occurrence of suff(k), then

find the longest prefix of size l < k that matches the suffix of

the same size l. The d2 is the distance between such a

prefix and the suffix.


ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Need for Boyer-Moore Algorithm

• Bad character Rule

• Good suffix Rule


THANK YOU

Lekha A
Department of Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN

Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND
DESIGN
Transform and Conquer, Space and
Time Tradeoffs
Input Enhancement in String Matching
- III
Lekha A
Computer Applications
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Need for Boyer-Moore Algorithm

• Bad character Rule

• Good suffix Rule


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• Step 1: For a given pattern and the alphabet used in both the

pattern and the text construct the bad symbol shift table.

• Step 2: Using the pattern construct the good suffix shift table

• Step 3: Align the pattern against the beginning of the text.


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• Step 4: Repeat the following step until a matching substring is

found or the pattern reaches beyond the last character of the

text.

• Starting with the last character compare the corresponding

characters in the pattern and the text until

• A. All m characters are matched.

• Return position of occurrence and stop


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• B. A mismatching pair is encountered after k ≥ 0

characters are matched successfully.

• Retrieve the entry t1(c) from the c’s column of the

bad symbol table where c is the text’s mismatched

character.
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm

• If k > 0 also retrieve the corresponding d2 entry

from the good-suffix table.

• Shift the pattern to the right by the number of

positions computed by the formula


d = d1,           if k = 0
d = max{d1, d2},  if k > 0

• where d1 = max{t1(c) - k, 1}
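A minimal Python sketch combining both tables with the shift formula above. The function names are illustrative, and the good-suffix table is built by a brute-force scan that follows the rules from the previous slides (this is a sketch of the idea, not an optimized implementation).

```python
def bad_symbol_table(p):
    # t1(c) for every character among the first m-1 pattern positions;
    # absent characters shift by m via table.get(c, m)
    m = len(p)
    return {p[j]: m - 1 - j for j in range(m - 1)}

def good_suffix_table(p):
    m = len(p)
    d2 = [0] * m                     # d2[k] for k = 1..m-1 matched characters
    for k in range(1, m):
        suff = p[m - k:]
        shift = None
        # second rightmost occurrence of suff(k), not preceded by the same
        # character that precedes its rightmost occurrence
        for i in range(m - k - 1, -1, -1):
            if p[i:i + k] == suff and (i == 0 or p[i - 1] != p[m - 1 - k]):
                shift = (m - k) - i
                break
        if shift is None:
            # longest prefix of size l < k matching a suffix of the same size
            shift = m
            for l in range(k - 1, 0, -1):
                if p[:l] == p[m - l:]:
                    shift = m - l
                    break
        d2[k] = shift
    return d2

def boyer_moore(p, t):
    m, n = len(p), len(t)
    t1, d2 = bad_symbol_table(p), good_suffix_table(p)
    i = m - 1                        # position of the pattern's right end in t
    while i <= n - 1:
        k = 0                        # number of matched characters
        while k <= m - 1 and p[m - 1 - k] == t[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        d1 = max(t1.get(t[i - k], m) - k, 1)
        i += d1 if k == 0 else max(d1, d2[k])
    return -1
```

For the pattern BAOBAB this reproduces the bad-symbol and good-suffix tables used in the tracing slides, and `boyer_moore("BAOBAB", "BESS KNEW ABOUT BAOBABS")` returns 16.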
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• Text: BESS KNEW ABOUT BAOBABS

• Pattern: BAOBAB

• Using shift table the computed bad-symbol table is

Character c:  A  B  O  all others
t1(c):        1  2  3  6
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• The good suffix table is generated as

• For the suffix of k characters compare the prefix of k

characters

• If they are same check for another occurrence of the

same suffix.

• If it exists d2 is the distance from the second

occurrence till the end of the string


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• If they are not the same, reduce the suffix size by one and

check whether they match.

• If they match, d2 is the distance from the second

occurrence to the end of the string.


ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• If the prefix does not match, check the pattern for an

occurrence of the suffix at any other position.

• If one is found, d2 is the distance from that occurrence to the

end of the string.
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

• Pattern is BAOBAB
k    suff(k)    d2
1    B          2
2    AB         5
3    BAB        5
4    OBAB       5
5    AOBAB      5
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

B E S S _ K N E W _ A B O U T _ B A O B A B S
B A O B A B

• d1 = t1(K) - 0 = 6; k = 0, so shift by d1 = 6 (‘_’ marks a space)

B E S S _ K N E W _ A B O U T _ B A O B A B S
            B A O B A B

• d1 = t1(_) - 2 = 6 - 2 = 4

• There have been two matches, so also use the good-suffix table: d2 = 5

• d = max(4, 5) = 5
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Tracing

B E S S _ K N E W _ A B O U T _ B A O B A B S
                      B A O B A B

• d1 = t1(_) - 1 = 6 - 1 = 5

• There has been one match, so also use the good-suffix table: d2 = 2

• d = max(5, 2) = 5

B E S S _ K N E W _ A B O U T _ B A O B A B S
                                B A O B A B

• All six characters match: the pattern is found at position 16
ALGORITHMS – ANALYSIS AND DESIGN
Boyer-Moore Algorithm - Analysis

• For a text of length n and a pattern of length m:

• The worst-case efficiency of finding the first occurrence is

O(n + m).

• The average-case efficiency is about O(n/m).

• This is because a shift by the full pattern length m is often

possible due to the bad-character heuristic.
ALGORITHMS – ANALYSIS AND DESIGN
Recap

• Boyer-Moore Algorithm

• Tracing

• Analysis
THANK YOU

Lekha A
Department of Computer Applications
