
1.3. EXAMPLE 1: LONGEST COMMON SUBSEQUENCE

1. Explain Longest Common Subsequence.

Definition: The Longest Common Subsequence (LCS) problem is as follows. We are given two strings: string S of length n, and string T of length m. Our goal is to produce their longest common subsequence: the longest sequence of characters that appear left-to-right (but not necessarily in a contiguous block) in both strings.


For example, consider:

S = ABAZDC
T = BACBAD

In this case, the LCS has length 4 and is the string ABAD. Another way to look at it is that we are finding a 1-1 matching between some of the letters in S and some of the letters in T such that none of the edges in the matching cross each other.
For instance, this type of problem comes up all the time in genomics: given two DNA fragments, the LCS gives information about what they have in common and the best way to line them up.
Let us now solve the LCS problem using Dynamic Programming. As subproblems we will look at the LCS of a prefix of S and a prefix of T, running over all pairs of prefixes. For simplicity, let us worry first about finding the length of the LCS; we can then modify the algorithm to produce the actual sequence itself.


So, here is the question: say LCS[i,j] is the length of the LCS of S[1..i] with T[1..j]. How can we solve for LCS[i,j] in terms of the LCS's of the smaller problems?

Case 1: what if S[i] != T[j]? Then, the desired subsequence has to ignore one of S[i] or T[j], so we have:

LCS[i,j] = max(LCS[i-1,j], LCS[i,j-1]).

Case 2: what if S[i] = T[j]? Then the LCS of S[1..i] and T[1..j] might as well match them up. For instance, if you had a common subsequence that matched S[i] to an earlier location in T, you could always match it to T[j] instead. So, in this case we have:

LCS[i,j] = 1 + LCS[i-1,j-1].

So, we can just do two loops (over values of i and j), filling in the LCS table using these rules. Here is what it looks like pictorially for the example above, with S along the leftmost column and T along the top row.

[Table: the LCS matrix for S = ABAZDC and T = BACBAD; the original figure is not reproduced here.]

We just fill out this matrix row by row, doing a constant amount of work per entry, so this takes O(mn) time overall. The final answer (the length of the LCS of S and T) is in the lower-right corner.
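To make the table-filling concrete, here is a short sketch in Python (the code and the function name lcs_length are illustrative, not from the original notes); it fills the matrix row by row exactly as described, using Case 1 and Case 2 above:

    # Sketch of the LCS table-filling described above. LCS[i][j] holds
    # the length of the LCS of S[1..i] and T[1..j].
    def lcs_length(S, T):
        n, m = len(S), len(T)
        LCS = [[0] * (m + 1) for _ in range(n + 1)]  # row/column 0 = empty prefix
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if S[i - 1] == T[j - 1]:
                    LCS[i][j] = 1 + LCS[i - 1][j - 1]              # Case 2: match them up
                else:
                    LCS[i][j] = max(LCS[i - 1][j], LCS[i][j - 1])  # Case 1
        return LCS[n][m]  # the answer sits in the lower-right corner

    print(lcs_length("ABAZDC", "BACBAD"))  # prints 4 (the LCS is ABAD)

Each entry is filled in constant time, so the overall running time is O(mn), matching the analysis above.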
Therefore, "Dynamic programming is a technique applicable when subproblems are not independent, that is, when subproblems share subproblems."

Like the greedy approach, dynamic programming is typically applied to optimization problems: there can be many possible solutions, and the requirement is to find the optimal solution among them. But the dynamic programming approach is a little different from the greedy approach. In greedy algorithms, solutions are computed by making choices in a serial forward way, with no backtracking or revision of choices, whereas dynamic programming computes its solution bottom-up by producing solutions from smaller subproblems, and by trying many possibilities and choices before it arrives at the optimal set of choices.

The development of a dynamic-programming algorithm can be broken into a sequence of four steps:

1. Divide (subproblems): The main problem is divided into several smaller subproblems. The solution of the main problem is expressed in terms of the solutions for the smaller subproblems. Basically, this step is about characterizing the structure of an optimal solution and recursively defining the value of an optimal solution.

2. Table (storage): The solution for each subproblem is stored in a table, so that it can be used many times whenever required.

3. Combine (bottom-up computation): The solution to the main problem is obtained by combining the solutions of smaller subproblems, i.e., the value of an optimal solution is computed in a bottom-up fashion.

4. Construct an optimal solution from the computed information. (This step is optional and is required only if some additional information is needed after finding the optimal solution.)
Various algorithms which make use of the dynamic programming technique are as follows:

1. Knapsack problem.
2. Chain matrix multiplication.
3. All-pairs shortest path.
4. Travelling salesman problem.
5. Tower of Hanoi.
6. Checkerboard.
7. Fibonacci sequence.
8. Assembly line scheduling.
9. Optimal binary search trees.

Dynamic Programming

Dynamic programming is an algorithmic paradigm that solves a given complex problem by breaking it into subproblems, and stores the results of the subproblems to avoid computing the same results again. Dynamic programming is used when the subproblems are not independent. Dynamic programming is a bottom-up approach: we solve all possible small problems and then combine them to obtain solutions for bigger problems.

Dynamic programming is often used in optimization problems (a problem with many possible solutions, for which we want to find an optimal solution).

Dynamic programming works when a problem has the following two main properties:
1. Overlapping subproblems
2. Optimal substructure

Overlapping subproblems: When a recursive algorithm would visit the same subproblems repeatedly, then the problem has overlapping subproblems.

Like divide and conquer, dynamic programming combines solutions to sub-problems. Dynamic programming is mainly used when solutions of the same subproblems are needed again and again. In dynamic programming, computed solutions to subproblems are stored in a table so that they don't have to be recomputed. So dynamic programming is not useful when there are no common (overlapping) subproblems, because there is no point in storing solutions that are not needed again.

Optimal substructure: A given problem has the optimal substructure property if an optimal solution of the given problem can be obtained by using optimal solutions of its subproblems.
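As a minimal illustration of both properties (our example, not from the text): a naive recursion for the Fibonacci numbers revisits the same subproblems over and over, while storing each computed value in a table makes every subproblem cost constant time after its first computation:

    from functools import lru_cache

    @lru_cache(maxsize=None)          # the "table" of stored subproblem answers
    def fib(n):
        if n <= 1:
            return n
        # Optimal substructure: fib(n) is built from the answers to the
        # two smaller subproblems, which overlap heavily across calls.
        return fib(n - 1) + fib(n - 2)

    print(fib(50))  # 12586269025, computed with about 50 additions

Without the cache, fib(50) would trigger billions of overlapping recursive calls; with it, each of the 51 subproblems is solved exactly once.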
Hash Table Structure

In computing, a hash table, also known as a hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.


A hash table is a data structure which stores data in an associative manner. In a hash table, data is stored in an array format, where each data value has its own unique index value. Access of data becomes very fast if we know the index of the desired data.

Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of the size of the data. A hash table uses an array as a storage medium and uses a hash technique to generate an index where an element is to be inserted or located.
[Figure: keys key_1, key_2, key_3 passed through a hash function to slots holding value_1, value_2, value_3, value_4; example key-value pairs: (1,20), (2,70), (42,80), (4,25), (12,44), (14,32), (17,11).]
Hash Table in a Direct File

Hashed File Organisation

Hashed file organisation is also called direct file organisation.

In this method, for storing the records a hash function is calculated, which provides the address of the block to store the record. Any type of mathematical function can be used as a hash function. It can be simple or complex.

The hash function is applied to columns or attributes to get the block address. The records are stored randomly. So, it is also known as direct or random file organization.


If the generated hash function is on the column which is considered as key, then the column can be called a hash key; and if the generated hash function is on the column which is considered as non-key, then the column can be called a hash column.
[Figure: Hash file organization in DBMS. Data records R1 (AA4BF), R3 (GDSKA), R6 (AB7HL), R4 (SG9KA), R5 (SV4HD) are mapped to data blocks in memory.]
Hash Function
A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.

[Figure: A hash function that maps names (keys) to integers from 0 to 15 (hashes). There is a collision between keys "John Smith" and "Sandra Dee".]
Hash functions and their associated hash tables are used in data storage and retrieval applications to access data in a small and nearly constant time per retrieval. They require an amount of storage space only fractionally greater than the total space required for the data or records themselves. Hashing is a computationally and storage space-efficient form of data access that avoids the non-constant access time of ordered and unordered lists and structured trees, and the often exponential storage requirements of direct access of state spaces of large or variable-length keys.

Use of hash functions relies on statistical properties of key and function interaction: worst-case behaviour is intolerably bad with a vanishingly small probability, and average-case behaviour can be nearly optimal (minimal collision).

Hash functions are related to (and often confused with) checksums, check digits, fingerprints, lossy compression, randomization functions, error-correcting codes, and ciphers. Although the concepts overlap to some extent, each one has its own uses and requirements.
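As a small sketch (our own, in Python; the multiplier 31 and the table size 16 are arbitrary choices), here is a polynomial hash that maps a string key of arbitrary length to a fixed-size table index, in the spirit of the name-to-integer mapping in the figure above:

    # Map an arbitrary-length string key to an index in [0, table_size).
    def hash_index(key, table_size=16):
        h = 0
        for ch in key:
            h = (h * 31 + ord(ch)) % (2 ** 32)  # keep h a fixed-size value
        return h % table_size                   # fold into a table index

    for name in ("John Smith", "Lisa Smith", "Sam Doe", "Sandra Dee"):
        print(name, "->", hash_index(name))

Two different keys can map to the same index (a collision), which is why the chaining scheme described next is needed.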
Description of Chained Hash Tables

A chained hash table fundamentally consists of an array of linked lists. Each list forms a bucket in which we place all elements hashing to a specific position in the array (see Figure 8.1).

To insert an element, we first pass its key to a hash function in a process called hashing the key. This tells us in which bucket the element belongs. We then insert the element at the head of the appropriate list. To look up or remove an element, we hash its key again to find its bucket, then traverse the appropriate list until we find the element we are looking for. Because each bucket is a linked list, a chained hash table is not limited to a fixed number of elements. However, performance degrades if the table becomes too full.

Figure 8.1. A chained hash table with five buckets containing a total of seven elements. [Figure: buckets at h(k) = 0 through h(k) = 4, each holding a linked list of key/data nodes.]
Chaining is a technique used for avoiding collisions in hash tables. A collision occurs when two keys are hashed to the same index in a hash table. Collisions are a problem because every slot in a hash table is supposed to store a single element.

The benefits of chaining:

1. Through chaining, insertion in a hash table always occurs in O(1), since linked lists allow insertion in constant time.
2. Theoretically, a chained hash table can grow infinitely as long as there is enough space.
3. A hash table which uses chaining will never need to be resized.
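A runnable sketch of the chained hash table just described (Python; the class and method names are ours, and Python lists stand in for the linked lists):

    class ChainedHashTable:
        def __init__(self, num_buckets=5):
            # One list per bucket, as in Figure 8.1.
            self.buckets = [[] for _ in range(num_buckets)]

        def _bucket(self, key):
            # Hash the key to find the bucket it belongs to.
            return self.buckets[hash(key) % len(self.buckets)]

        def insert(self, key, value):
            self._bucket(key).insert(0, (key, value))  # insert at the head

        def lookup(self, key):
            for k, v in self._bucket(key):             # traverse the bucket
                if k == key:
                    return v
            raise KeyError(key)

        def remove(self, key):
            bucket = self._bucket(key)
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    del bucket[i]
                    return
            raise KeyError(key)

    table = ChainedHashTable()
    table.insert("John Smith", "521-1234")
    print(table.lookup("John Smith"))  # 521-1234

Insertion is O(1); lookup and removal cost is proportional to the length of one bucket, which is why performance degrades as the table fills.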
Explain Travelling Salesman Problem.

The travelling salesman problem (also called the travelling salesperson problem or TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?" It is an NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

[Figure] Solution of a travelling salesman problem: the black line shows the shortest possible loop that connects every red dot.

The traveling purchaser problem and the vehicle routing problem are both generalizations of TSP.

In the theory of computational complexity, the decision version of the TSP (where, given a length L, the task is to decide whether the graph has a tour of at most L) belongs to the class of NP-complete problems. Thus, it is possible that the worst-case running time for any algorithm for the TSP increases superpolynomially (but no more than exponentially) with the number of cities.

The problem was first formulated in 1930 and is one of the most intensively studied problems in optimization. It is used as a benchmark for many optimization methods. Even though the problem is computationally difficult, many heuristics and exact algorithms are known, so that some instances with tens of thousands of cities can be solved completely and even problems with millions of cities can be approximated within a small fraction of 1%.

The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing. In these applications, the concept city represents, for example, customers, soldering points, or DNA fragments, and the concept distance represents travelling times or cost, or a similarity measure between DNA fragments.

The TSP also appears in astronomy, as astronomers observing many sources will want to minimize the time spent moving the telescope between the sources; in such problems, the TSP can be embedded inside an optimal control problem. In many applications, additional constraints such as limited resources or time windows may be imposed.
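Since the theme of these notes is dynamic programming, here is a hedged sketch (our own, in Python) of the classical Held-Karp dynamic program for TSP. C[(S, j)] is the length of the shortest path that starts at city 0, visits exactly the cities in the set S, and ends at city j; the distance matrix is a made-up example. The running time is O(n^2 * 2^n), superpolynomial but far better than checking all (n - 1)! tours:

    from itertools import combinations

    def held_karp(dist):
        n = len(dist)
        # Base case: paths 0 -> j that visit only {0, j}.
        C = {(frozenset([0, j]), j): dist[0][j] for j in range(1, n)}
        for size in range(3, n + 1):
            for subset in combinations(range(1, n), size - 1):
                S = frozenset(subset) | {0}
                for j in subset:
                    # Best way to end at j: arrive from some k in S - {j}.
                    C[(S, j)] = min(C[(S - {j}, k)] + dist[k][j]
                                    for k in subset if k != j)
        full = frozenset(range(n))
        # Close the tour by returning to city 0.
        return min(C[(full, j)] + dist[j][0] for j in range(1, n))

    dist = [[0, 2, 9, 10],
            [1, 0, 6, 4],
            [15, 7, 0, 8],
            [6, 3, 12, 0]]
    print(held_karp(dist))  # 21 for this made-up 4-city instance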


Explain Kruskal's Spanning Tree Algorithm.

Kruskal's Algorithm

This is a greedy algorithm. A greedy algorithm chooses some local optimum (i.e., picking an edge with the least weight in an MST).

Kruskal's algorithm works as follows: take a graph with n vertices, and keep on adding the shortest (least cost) edge, while avoiding the creation of cycles, until (n - 1) edges have been added. Sometimes two or more edges may have the same cost. The order in which the edges are chosen, in this case, does not matter. Different MSTs may result, but they will all have the same total cost, which will always be the minimum cost.


Algorithm:

The algorithm for finding the MST using Kruskal's method is as follows:

Algorithm Kruskal (E, cost, n, t)
// E is the set of edges in G. G has n vertices. cost[u, v] is the
// cost of edge (u, v). t is the set of edges in the minimum-cost
// spanning tree. The final cost is returned.
{
    Construct a heap out of the edge costs using Heapify;
    for i := 1 to n do parent[i] := -1;
    // Each vertex is in a different set.
    i := 0; mincost := 0.0;
    while ((i < n - 1) and (heap not empty)) do
    {
        Delete a minimum cost edge (u, v) from the heap and
        re-heapify using Adjust;
        j := Find(u); k := Find(v);
        if (j != k) then
        {
            i := i + 1;
            t[i, 1] := u; t[i, 2] := v;
            mincost := mincost + cost[u, v];
            Union(j, k);
        }
    }
    if (i != n - 1) then write ("no spanning tree");
    else return mincost;
}

Running time:

The number of finds is at most 2e, and the number of unions at most n - 1. Including the initialization time for the trees, this part of the algorithm has a complexity that is just slightly more than O(n + e).

We can add at most n - 1 edges to tree T. So, the total time for operations on T is O(n).

Summing up the various components of the computing times, we get O(n + e log e) as the asymptotic complexity.
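A compact Python sketch of the same procedure (our own identifiers), with a union-find array playing the role of parent[] and Python's heapq module serving as the edge heap:

    import heapq

    def kruskal(n, edges):
        # edges: list of (cost, u, v) tuples; vertices are 0 .. n-1.
        heap = list(edges)
        heapq.heapify(heap)             # heap of edges keyed by cost
        parent = list(range(n))         # each vertex starts in its own set

        def find(x):                    # find the representative of x's set
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        tree, mincost = [], 0
        while heap and len(tree) < n - 1:
            cost, u, v = heapq.heappop(heap)   # delete a minimum-cost edge
            ru, rv = find(u), find(v)
            if ru != rv:                # accept only edges joining two sets
                tree.append((u, v))
                mincost += cost
                parent[ru] = rv         # union the two sets
        if len(tree) != n - 1:
            raise ValueError("no spanning tree")
        return mincost, tree

    edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
    print(kruskal(4, edges))  # (6, [(0, 1), (1, 3), (1, 2)])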

Example: [Figure: a weighted graph and the step-by-step construction of its minimum-cost spanning tree by Kruskal's algorithm; the original figure is not reproduced here.]


Knapsack Problem

Let us apply the greedy method to solve the knapsack problem. We are given n objects and a knapsack of capacity m. If a fraction x_i, 0 <= x_i <= 1, of object i is placed into the knapsack, then a profit of p_i * x_i is earned. The objective is to fill the knapsack so that the total profit earned is maximized. Since the knapsack capacity is m, we require the total weight of all chosen objects to be at most m. The problem is stated as:

    maximize    sum from i = 1 to n of p_i * x_i
    subject to  sum from i = 1 to n of w_i * x_i <= m,
    where 0 <= x_i <= 1 and 1 <= i <= n.

The profits and weights are positive numbers.

Algorithm:

If the objects are already sorted into non-increasing order of p[i] / w[i], then the algorithm given below obtains solutions corresponding to this strategy.

Algorithm GreedyKnapsack (m, n)
// p[1:n] and w[1:n] contain the profits and weights respectively of the
// n objects ordered so that p[i] / w[i] >= p[i + 1] / w[i + 1].
// m is the knapsack size and x[1:n] is the solution vector.
{
    for i := 1 to n do x[i] := 0.0;  // Initialize x.
    U := m;
    for i := 1 to n do
    {
        if (w[i] > U) then break;
        x[i] := 1.0;
        U := U - w[i];
    }
    if (i <= n) then x[i] := U / w[i];
}

Running time:

The objects are to be sorted into non-increasing order of the p_i / w_i ratio. But if we disregard the time to sort the objects, the algorithm requires only O(n) time.

Example:

Consider the following instance of the knapsack problem: n = 3, m = 20, (p1, p2, p3) = (25, 24, 15) and (w1, w2, w3) = (18, 15, 10).

1. First, we try to fill the knapsack by selecting the objects in some order:

   (x1, x2, x3) = (1/2, 1/3, 1/4)
   Total weight = 18 x 1/2 + 15 x 1/3 + 10 x 1/4 = 16.5
   Total profit = 25 x 1/2 + 24 x 1/3 + 15 x 1/4 = 24.25

2. Select the object with the maximum profit first (p1 = 25), so x1 = 1 and 2 units of capacity are left. Then select the object with the next largest profit (p2 = 24), so x2 = 2/15:

   (x1, x2, x3) = (1, 2/15, 0)
   Total weight = 18 x 1 + 15 x 2/15 = 20
   Total profit = 25 x 1 + 24 x 2/15 = 28.2

3. Considering the objects in the order of non-decreasing weights w_i:

   (x1, x2, x3) = (0, 2/3, 1)
   Total weight = 15 x 2/3 + 10 x 1 = 20
   Total profit = 24 x 2/3 + 15 x 1 = 31

4. Considering the objects in the order of non-increasing ratio p_i / w_i:

   p1/w1 = 25/18,  p2/w2 = 24/15,  p3/w3 = 15/10

   Select object 2 first (largest ratio), so x2 = 1 and 5 units of space are left. Then select the object with the next largest ratio, object 3, so x3 = 1/2 and the profit earned from it is 7.5.

   (x1, x2, x3) = (0, 1, 1/2)
   Total weight = 15 x 1 + 10 x 1/2 = 20
   Total profit = 24 x 1 + 15 x 1/2 = 31.5

This solution is the optimal solution.
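A short Python sketch of GreedyKnapsack (our own function name), checked against the instance above; it sorts by non-increasing profit/weight ratio and then fills greedily, taking a fraction of the first object that no longer fits:

    def greedy_knapsack(m, p, w):
        # Consider objects in non-increasing order of p[i] / w[i].
        order = sorted(range(len(p)), key=lambda i: p[i] / w[i], reverse=True)
        x = [0.0] * len(p)        # solution vector
        U = m                     # remaining capacity
        for i in order:
            if w[i] > U:
                x[i] = U / w[i]   # take only a fraction of this object
                break
            x[i] = 1.0            # take the whole object
            U -= w[i]
        return x, sum(pi * xi for pi, xi in zip(p, x))

    x, profit = greedy_knapsack(20, [25, 24, 15], [18, 15, 10])
    print(x, profit)  # [0.0, 1.0, 0.5] 31.5, matching strategy 4 above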
Explain Dijkstra's Single Source Shortest Path Algorithm.

4.8.7. Single Source Shortest-Path Problem: DIJKSTRA'S ALGORITHM

In the previously studied graphs, the edge labels are called costs, but here we think of them as the lengths of the edges. In the single source shortest path problem, we find a shortest path from a given source vertex to each of the other vertices (called destinations) in the graph.

Dijkstra's algorithm is similar to Prim's algorithm. It takes a labeled graph and a pair of vertices P and Q, and finds the shortest path between them (or one of the shortest paths, if there is more than one). Dijkstra's algorithm does not work for negative edges at all.

[Figure: the shortest paths from vertex 1 in a five-vertex weighted digraph; the original figure is not reproduced here.]

Algorithm:

Algorithm ShortestPaths (v, cost, dist, n)
// dist[j], 1 <= j <= n, is set to the length of the shortest path from
// vertex v to vertex j in a digraph G with n vertices; dist[v] is set
// to zero. G is represented by its cost adjacency matrix cost[1:n, 1:n].
{
    for i := 1 to n do
    {   // Initialize S.
        S[i] := false; dist[i] := cost[v, i];
    }
    S[v] := true; dist[v] := 0.0;  // Put v in S.
    for num := 2 to n - 1 do
    {
        // Determine n - 1 paths from v.
        Choose u from among those vertices not in S such that dist[u] is minimum;
        S[u] := true;  // Put u in S.
        for (each w adjacent to u with S[w] = false) do
            if (dist[w] > dist[u] + cost[u, w]) then  // Update distances.
                dist[w] := dist[u] + cost[u, w];
    }
}

Running time:

Depends on the implementation of the data structure for dist:
- Build a structure with n items: A
- At most m = |E| times decrease the value of an item: B
- n times select the smallest value: C
- For an array: A = O(n); B = O(1); C = O(n), which gives O(n^2) total.
- For a heap: A = O(n); B = O(log n); C = O(log n), which gives O((n + m) log n) total.


Example:

Use Dijkstra's algorithm to find the shortest path from A to each of the other six vertices in the graph:

[Figure: a seven-vertex weighted graph; not reproduced here.]

Solution:

The cost adjacency matrix is [7 x 7 matrix not reproduced here]. Here '-' means infinite.

The problem is solved by considering the following information:

- Status[v] will be either '0', meaning that the shortest path from v to v0 has definitely been found, or '1', meaning that it hasn't.
- Dist[v] will be a number, representing the length of the shortest path from v to v0 found so far.
- Next[v] will be the first vertex on the way to v0 along the shortest path found so far from v to v0.
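A heap-based Python sketch of Dijkstra's algorithm (our own identifiers), matching the O((n + m) log n) analysis above; the small digraph is a made-up example:

    import heapq

    def dijkstra(n, adj, source):
        # adj[u] is a list of (v, cost) pairs for the edges leaving u.
        INF = float("inf")
        dist = [INF] * n
        dist[source] = 0
        done = [False] * n              # plays the role of Status[v]
        heap = [(0, source)]            # (distance found so far, vertex)
        while heap:
            d, u = heapq.heappop(heap)  # closest vertex not yet finished
            if done[u]:
                continue
            done[u] = True
            for v, cost in adj[u]:
                if dist[u] + cost < dist[v]:   # update distances
                    dist[v] = dist[u] + cost
                    heapq.heappush(heap, (dist[v], v))
        return dist

    # Vertices 0..3; edge lists as (target, cost) pairs.
    adj = [[(1, 3), (2, 6)], [(2, 2), (3, 4)], [(3, 1)], []]
    print(dijkstra(4, adj, 0))  # [0, 3, 5, 6]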
