DS unit-IV
1. Magnetic tapes
2. Magnetic Disks
1. Magnetic tapes
o A magnetic tape is made of plastic material coated with
ferrite.
o The size of a magnetic tape is expressed as the number of
characters or the number of tracks on the tape.
o Information is stored on the tape bit by bit.
o Information is read from or written to the tape using a tape
drive.
o The gap between the records on the tape is called the record gap. It
varies from ½ to ¾ inch.
o Records are grouped into blocks. The gap between blocks
is called the inter-block gap.
o The number of records in a block is called the blocking factor.
Advantages
Disadvantages
Example
A file containing 4500 records can be sorted on disk in the
following way.
K-Way Merging
An ordinary two-way merge on disk or tape needs log_2 m passes over
the data, where m is the number of initial runs.
This can be reduced using k-way merging.
In this method k runs are merged simultaneously.
A k-way merge needs only log_k m passes.
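The idea can be sketched in Python as follows. This is only an illustration, assuming the runs are already sorted lists held in memory; the names k_way_merge and merge_passes are made up for this sketch, and a heap is used here to pick the smallest front record.

```python
import heapq

def k_way_merge(runs):
    """Merge k sorted runs into one sorted list using a min-heap."""
    # Each heap entry is (value, run number, position inside that run).
    heap = [(run[0], r, 0) for r, run in enumerate(runs) if run]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, r, pos = heapq.heappop(heap)
        merged.append(value)
        if pos + 1 < len(runs[r]):
            heapq.heappush(heap, (runs[r][pos + 1], r, pos + 1))
    return merged

def merge_passes(m, k):
    """Count the merge passes needed to reduce m runs to one run."""
    passes = 0
    while m > 1:
        m = (m + k - 1) // k  # every group of k runs becomes one run
        passes += 1
    return passes
```

For 16 initial runs, merge_passes(16, 2) gives 4 passes, while merge_passes(16, 4) gives 2.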
Example
Selection Tree
In k-way merging, each step must select the smallest record
among the current records of the k runs.
The smallest record is found using a selection
tree.
A selection tree is a binary tree in which each node
represents the smaller of its two children.
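A minimal sketch of a selection (winner) tree used for merging is given below. It assumes numeric keys, so that infinity can mark an exhausted run, and it stores the tree in an array where the leaf for run r is at index k + r. The function name and layout are choices made for this sketch only.

```python
INF = float("inf")  # marks an exhausted run (assumes numeric keys)

def merge_with_selection_tree(runs):
    k = len(runs)
    if k == 0:
        return []
    pos = [0] * k  # next unread position in each run

    def key(r):
        return runs[r][pos[r]] if pos[r] < len(runs[r]) else INF

    # tree[k + r] is the leaf of run r; tree[i] for 1 <= i < k stores the
    # run number whose current record is the smaller of its two children.
    tree = [0] * (2 * k)
    for r in range(k):
        tree[k + r] = r

    def play(i):
        left, right = tree[2 * i], tree[2 * i + 1]
        tree[i] = left if key(left) <= key(right) else right

    for i in range(k - 1, 0, -1):
        play(i)

    out = []
    for _ in range(sum(len(run) for run in runs)):
        winner = tree[1]  # root: run holding the smallest record
        out.append(key(winner))
        pos[winner] += 1
        i = (k + winner) // 2
        while i >= 1:  # replay only the games on the path to the root
            play(i)
            i //= 2
    return out
```

After each output only the nodes on one leaf-to-root path are recomputed, so selecting the next record costs about log_2 k comparisons.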
Example
Tournament Tree
For k-way merging, it is more efficient to only store the loser of each
game (see image). The data structure is therefore called a loser tree.
When building the tree or replacing an element with the next one from
its list, we still promote the winner of the game to the top.
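The loser tree described above can be sketched as follows. This is an illustrative version, assuming numeric keys (infinity marks an exhausted run) and an array layout in which the leaf of run r sits at the implicit index k + r; tree[0] holds the overall winner and tree[1..k-1] hold the losers.

```python
INF = float("inf")  # marks an exhausted run (assumes numeric keys)

def merge_with_loser_tree(runs):
    k = len(runs)
    if k == 0:
        return []
    pos = [0] * k  # next unread position in each run

    def key(r):
        return runs[r][pos[r]] if pos[r] < len(runs[r]) else INF

    tree = [0] * k  # tree[0]: overall winner, tree[1..k-1]: losers

    def build(node):
        """Play all games below node, store the losers, return the winner."""
        if node >= k:
            return node - k  # a leaf: the run itself
        a, b = build(2 * node), build(2 * node + 1)
        winner, loser = (a, b) if key(a) <= key(b) else (b, a)
        tree[node] = loser
        return winner

    tree[0] = build(1)
    out = []
    for _ in range(sum(len(run) for run in runs)):
        winner = tree[0]
        out.append(key(winner))
        pos[winner] += 1  # replace the winner by the next record of its run
        node = (k + winner) // 2
        while node >= 1:
            # One comparison per level: the newcomer against the stored loser.
            if key(tree[node]) < key(winner):
                winner, tree[node] = tree[node], winner
            node //= 2
        tree[0] = winner
    return out
```

Compared with a winner tree, the replay does not need to look at the sibling node, because the stored loser is exactly the opponent on the path.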
Sorting with tape
Sorting on disk and sorting on tape are the same in principle, but they differ in the following ways
T2: Run2 Run4 Run6
T3: Run1 Run3
T4: Run2
T3.Run1 = T1.Run1 + T2.Run2
T4.Run2 = T1.Run3 + T2.Run4
T3.Run3 = T1.Run5 + T2.Run6
T1: Run1
T1.Run1 = T3.Run1 + T4.Run2
T2: Run1
T2.Run1 = T1.Run1 + T3.Run3
Finally, the sorted records are stored on Tape-2.
This type of sorting is known as balanced merge sort.
A k-way balanced merge needs 2k tapes.
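The tape movements can be simulated with a short sketch. It is an assumption-laden model: each tape is a Python list of runs, the initial runs are distributed round-robin over k input tapes, and a run that has no partner in a pass is simply copied to an output tape. The name balanced_merge is made up for this sketch, and k is assumed to be at least 2.

```python
import heapq

def balanced_merge(runs, k=2):
    """Simulate a k-way balanced merge on 2k tapes.

    Returns (sorted records, number of merge passes). Assumes k >= 2.
    """
    # Distribute the initial runs round-robin over the k input tapes.
    inputs = [[] for _ in range(k)]
    for n, run in enumerate(runs):
        inputs[n % k].append(run)

    passes = 0
    while sum(len(tape) for tape in inputs) > 1:
        outputs = [[] for _ in range(k)]
        for g in range(max(len(tape) for tape in inputs)):
            # Merge the g-th run of every input tape into one longer run.
            group = [tape[g] for tape in inputs if g < len(tape)]
            outputs[g % k].append(list(heapq.merge(*group)))
        inputs = outputs  # input and output tapes swap roles
        passes += 1

    final = next((tape[0] for tape in inputs if tape), [])
    return final, passes
```

With six initial runs and k = 2 (four tapes), the simulation needs three merge passes, which matches the three merge steps listed above.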
Balanced Merge sort
Balanced merge sort uses the M1, M2 and M3 algorithms.
M1 Algorithm
Analysis of M1 algorithm
The total number of passes needed by the M1 algorithm is 2 log_k m, where k is the order of the merge and m is the number of initial runs.
M2 Algorithm
Analysis of M2 algorithm
The total number of passes needed by the M2 algorithm is (3/2) log_k m + 1/2.
M3 Algorithm
Analysis of M3 algorithm
The total number of passes needed by the M3 algorithm is log_k m.
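To compare the three pass counts for concrete values, the formulas quoted in these notes can be evaluated directly. The sketch below takes those formulas as given and does not derive them; the function name is made up for illustration.

```python
import math

def balanced_merge_passes(m, k):
    """Evaluate the pass-count formulas for m initial runs and a k-way merge."""
    log_k_m = math.log2(m) / math.log2(k)
    return {
        "M1": 2 * log_k_m,
        "M2": 1.5 * log_k_m + 0.5,
        "M3": log_k_m,
    }
```

For example, with m = 64 runs and k = 4 the formulas give 6 passes for M1, 5 for M2 and 3 for M3.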
Symbol Tables
A symbol table is a set of name-value pairs. The following operations can be
performed on a symbol table:
i. Check whether a particular name is already present
ii. Retrieve the attributes of that name
iii. Insert a new name and its value
iv. Delete a name and its value
Symbol tables can be implemented in the following ways:
i. Static tree tables
ii. Dynamic tree tables
iii. Hash tables
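The four operations can be illustrated with a small Python class. This sketch simply wraps the built-in dict (itself a hash table); the class and method names are chosen for illustration and are not from these notes.

```python
class SymbolTable:
    """A set of name-value pairs supporting the four basic operations."""

    def __init__(self):
        self._entries = {}

    def contains(self, name):
        """i. Is this name already present?"""
        return name in self._entries

    def retrieve(self, name):
        """ii. Return the attributes stored for this name."""
        return self._entries[name]

    def insert(self, name, value):
        """iii. Insert a new name and its value."""
        self._entries[name] = value

    def delete(self, name):
        """iv. Delete a name and its value."""
        del self._entries[name]
```

A typical use is table.insert("count", "integer") followed by table.retrieve("count").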
Example
if
for while
repeat
Loop
Example-2
10
5 20
Algorithm
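Procedure search(T,X,i)
// Search the binary search tree T for identifier X.
// On return, i points to the node containing X, or i=0 if X is not in T.
i=T
while i<>0 do
  case
    :X<ident(i): i=LCHILD(i)
    :X=ident(i): return
    :X>ident(i): i=RCHILD(i)
  End
End
End search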
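The same search can be written in Python. The sketch below is one possible rendering: the Node class and the insert helper are added only so that the example can be run, and None plays the role of the 0 pointer.

```python
class Node:
    def __init__(self, ident):
        self.ident = ident
        self.lchild = None
        self.rchild = None

def insert(root, ident):
    """Insert ident into the binary search tree and return the root."""
    if root is None:
        return Node(ident)
    if ident < root.ident:
        root.lchild = insert(root.lchild, ident)
    elif ident > root.ident:
        root.rchild = insert(root.rchild, ident)
    return root

def search(root, x):
    """Return the node containing x, or None if x is not in the tree."""
    i = root
    while i is not None:
        if x < i.ident:
            i = i.lchild
        elif x > i.ident:
            i = i.rchild
        else:
            return i
    return None
```

Inserting the identifiers if, for, while, repeat, loop in that order places "if" at the root, "for" and "while" as its children, "repeat" to the left of "while", and "loop" to the left of "repeat".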
Example
10
5 5
2 3
Algorithm
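Procedure Huffman(L,n)
// L is a list of n single-node trees.
// Least(L) removes and returns the tree in L with the least weight.
For i=1 to n-1 do
  Call getnode(T)
  Lchild(T)=Least(L)
  Rchild(T)=Least(L)
  Weight(T)=Weight(Lchild(T))+Weight(Rchild(T))
  Call insert(L,T)
End
End Huffman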
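A runnable sketch of the same procedure is shown below. It assumes a non-empty list of weights and uses a heap in place of the list L, so that popping the heap plays the role of Least(L). Each tree is a tuple (weight, tie-breaker, left, right); this representation is a choice made for the sketch.

```python
import heapq
from itertools import count

def huffman(weights):
    """Build a Huffman tree; returns the root (weight, id, left, right)."""
    tie = count()  # unique ids so equal weights never compare subtrees
    heap = [(w, next(tie), None, None) for w in weights]
    heapq.heapify(heap)
    for _ in range(len(weights) - 1):
        left = heapq.heappop(heap)   # Lchild(T) = Least(L)
        right = heapq.heappop(heap)  # Rchild(T) = Least(L)
        heapq.heappush(heap, (left[0] + right[0], next(tie), left, right))
    return heap[0]

def weighted_path_length(node, depth=0):
    """Sum of weight * depth over all leaves of the tree."""
    weight, _, left, right = node
    if left is None:
        return weight * depth
    return (weighted_path_length(left, depth + 1)
            + weighted_path_length(right, depth + 1))
```

For the weights 2, 3 and 5 the two smallest trees are combined into a tree of weight 5, which is then combined with the remaining 5 to give a root of weight 10.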
D) Optimal Binary Search Tree
An optimal binary search tree is a binary search tree for which the
nodes are arranged on levels such that the tree cost is minimum.
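One common way to make "tree cost" precise is to sum, over all keys, the search frequency of the key multiplied by its level (root at level 1). Under that assumption, and ignoring unsuccessful searches, the minimum cost can be found by dynamic programming. The sketch below is one such formulation, not necessarily the cost measure used in the examples of these notes.

```python
from functools import lru_cache

def optimal_bst_cost(freq):
    """Minimum of sum(freq[i] * level[i]) over all BSTs on the sorted keys."""
    n = len(freq)
    prefix = [0]
    for f in freq:
        prefix.append(prefix[-1] + f)

    @lru_cache(maxsize=None)
    def cost(i, j):
        """Best cost for the keys i .. j-1."""
        if i >= j:
            return 0
        # Every key in the range moves one level down below the chosen root,
        # which adds the total frequency of the range once.
        weight = prefix[j] - prefix[i]
        return weight + min(cost(i, r) + cost(r + 1, j) for r in range(i, j))

    return cost(0, n)
```

For three equally likely keys the best tree is the balanced one, with cost 1 + 2 + 2 = 5, whereas a chain costs 1 + 2 + 3 = 6.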
Example
Tree1
stop
if
do
Maximum cost is 2
Tree2
if
do stop
Example
A
B C
Hash Functions
The following functions are used as Hash Functions
1. Mid-Square
2. Division
3. Folding
4. Digit Analysis
1. Mid-Square
In the Mid-Square method the identifier is squared and
the middle of the square is taken as the bucket address.
Ex:
Rno=134
X=134
X^2=17956
The middle digit is 9. It is used as the index.
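The example can be checked with a few lines of Python. The sketch works on decimal digits, as the example does; the parameter digits (how many middle digits to keep) is an assumption added for illustration.

```python
def mid_square(x, digits=1):
    """Square x and return the middle `digits` decimal digits of the square."""
    squared = str(x * x)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])
```

mid_square(134) squares 134 to 17956 and returns the middle digit 9.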
2. Division
The identifier X is divided by some number m and
the remainder is used as the hash address.
Ex:
Rno=134, m=6
X mod m = 134 mod 6 = 2
The remainder 2 is used as the hash address.
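In Python the division method is a single remainder operation. The function name below is chosen for illustration.

```python
def division_hash(x, m):
    """Hash address of x is the remainder of x divided by m."""
    return x % m
```

division_hash(134, 6) returns 2, since 134 = 22 * 6 + 2.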
3. Folding
The identifier X is partitioned into several parts of the
same length, and the parts are added to obtain the hash
address.
Ex:
X=123|203|241
123+203+241=567
567 is used as the hash address.
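A short sketch of folding on decimal digits follows. It assumes the identifier is a non-negative integer and that the last part may be shorter when the digit count is not a multiple of the part length; both are assumptions of this sketch.

```python
def fold(x, part_len=3):
    """Split the decimal digits of x into parts of part_len and add them."""
    digits = str(x)
    parts = [digits[i:i + part_len] for i in range(0, len(digits), part_len)]
    return sum(int(p) for p in parts)
```

fold(123203241) splits the identifier into 123, 203 and 241 and returns 567.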
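4. Digit Analysis
In digit analysis the identifier X is interpreted as a number using some
radix r, and selected digits of that number are used as the hash address.
Ex (each part is reversed and the parts are then added):
X=123|203|241
123 -> 321
203 -> 302
241 -> 142
-----
765
-----
765 is used as the hash address.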
Overflow Handling
1. Open Addressing
A. Linear Probing
When a new identifier is hashed into a full bucket, it is necessary
to find another bucket for this identifier.
Find the closest unfilled bucket and put the identifier into that
bucket. This is called linear probing.
Example
If R1 is placed in bucket A1 and R2 also gets the hash
address A1, then find the closest unfilled bucket, A2. So R2 is placed in
bucket A2.
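Linear probing can be sketched as below. The table is assumed to be a Python list with one identifier per bucket and None marking an unfilled bucket; hash_fn is whatever hash function is in use. These conventions are assumptions of the sketch.

```python
def linear_probe_insert(table, key, hash_fn):
    """Insert key into the closest unfilled bucket at or after its home bucket."""
    b = len(table)
    home = hash_fn(key) % b
    for i in range(b):
        slot = (home + i) % b  # wrap around at the end of the table
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")
```

With 7 buckets and the hash function x mod 7, the key 8 goes to bucket 1; the key 15 also hashes to 1 and is therefore placed in bucket 2.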
B. Quadratic Probing
Linear probing examines the buckets (f(x)+i) mod b for i = 1, 2, ...
Quadratic probing instead examines the buckets (f(x)+i^2) mod b for
i = 1, 2, ... until an unfilled bucket is found, and places the identifier there.
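A minimal sketch of quadratic probing follows, under the same assumptions as before (a list with None for unfilled buckets). Note that the quadratic sequence does not always visit every bucket, so the sketch gives up after b attempts.

```python
def quadratic_probe_insert(table, key, hash_fn):
    """Try buckets (f(x) + i*i) mod b for i = 0, 1, 2, ... until one is free."""
    b = len(table)
    home = hash_fn(key) % b
    for i in range(b):
        slot = (home + i * i) % b
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no unfilled bucket found on the probe sequence")
```

With 7 buckets and x mod 7, the keys 1, 8 and 15 all hash to bucket 1 and end up in buckets 1, 2 and 5 (offsets 0, 1 and 4).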
C. Random Probing
In random probing, if the hash address f(x) is already full, a new
hash address is calculated by adding a random number
to f(x).
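For a later search to find the identifier again, the "random" numbers have to be reproducible. One way to sketch this, as an assumption of this example rather than the method of these notes, is to seed a pseudo-random generator with the integer key so that the same key always produces the same probe sequence.

```python
import random

def probe_sequence(key, home, b):
    """Home bucket first, then the other buckets in a key-dependent order."""
    rng = random.Random(key)  # seeded: same key gives the same sequence
    offsets = list(range(1, b))
    rng.shuffle(offsets)
    return [home] + [(home + off) % b for off in offsets]

def random_probe_insert(table, key, hash_fn):
    """Insert key into the first unfilled bucket on its probe sequence."""
    b = len(table)
    for slot in probe_sequence(key, hash_fn(key) % b, b):
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")
```

Because the offsets are a permutation of 1 .. b-1, every bucket is visited exactly once before the table is declared full.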
2. Chaining
If the bucket at hash address f(x) is full, we do not search for a new
address; instead, the identifiers that hash to the same address are stored in a linked
list.
Example
1 R1 R4
2 R2 R6
R3
3
4 R5
Algorithm
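Procedure Chsearch(X,HT,b,J)
// Search the chained hash table HT with b buckets for X.
// On return, J points to the node containing X, or J=0 if X is not present.
J=HT(f(X))
While (J<>0 and ident(J) <> X) do
  J=Link(J)
End
End Chsearch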
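The chained search translates to Python as follows. The ChainNode class and the chain_insert helper are added only to make the sketch runnable; None plays the role of the 0 pointer, and new identifiers are put at the front of their chain as a simplification.

```python
class ChainNode:
    def __init__(self, ident, link=None):
        self.ident = ident
        self.link = link

def chain_insert(table, x, hash_fn):
    """Add x at the front of the chain of its bucket."""
    j = hash_fn(x) % len(table)
    table[j] = ChainNode(x, table[j])

def chain_search(table, x, hash_fn):
    """Return the node containing x, or None if x is not in the table."""
    j = table[hash_fn(x) % len(table)]
    while j is not None and j.ident != x:
        j = j.link
    return j
```

Only the chain of one bucket is examined, so identifiers in other buckets never take part in the search.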
3. Rehashing
The hash address is calculated using some hash function f(x). If the bucket at f(x)
is full, we change the hash function, build a new table
and store the identifiers in that table. This method is called
rehashing.
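A minimal sketch of rehashing is given below. It assumes integer identifiers, a new hash function of the form x mod new_size, linear probing for collisions in the new table, and a new table large enough to hold every identifier; none of these choices comes from these notes.

```python
def rehash(old_table, new_size):
    """Build a new table of new_size buckets and re-insert every identifier."""
    new_table = [None] * new_size
    for key in old_table:
        if key is None:
            continue
        slot = key % new_size  # the new hash function
        while new_table[slot] is not None:  # assumes new_size >= number of keys
            slot = (slot + 1) % new_size
        new_table[slot] = key
    return new_table
```

For example, rehash([8, 1, 15, 7], 11) moves the identifiers to buckets 8, 1, 4 and 7 of an 11-bucket table.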