Dbms Unit 5.2 (Ar16)
Slide No:L1-1
Various File Organizations
Many alternatives exist, each ideal for some situations:
– Heap (random order) files: Files of randomly ordered
records are called heap files. This is the simplest, unordered
file structure. Suitable when the typical access is a file
scan, i.e. retrieving all records.
– Sorted files: Files sorted on some fields are called sorted
files. The records are stored in order of those fields. Best if
records must be retrieved in some order, or when only a
`range' of records is needed.
– Hashed files: Files that are hashed on some fields are called
hashed files.
• Like sorted files, they speed up searches for a subset of
records, based on values in certain ("search key") fields.
• Updates are much faster than in sorted files.
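The three organizations can be contrasted with a small in-memory sketch. It is only an illustration under assumptions: the records, the bucket count `NUM_BUCKETS`, and the function names are hypothetical, and Python lists stand in for disk pages.

```python
import bisect

# Hypothetical records: (search_key, payload).
records = [(42, "d"), (7, "a"), (19, "c"), (13, "b"), (88, "e")]

# Heap file: records stay in arrival order, so a search scans every record.
heap_file = list(records)

def heap_search(key):
    return [r for r in heap_file if r[0] == key]

# Sorted file: records ordered by search key, so binary search
# locates the start and end of a range.
sorted_file = sorted(records)
sorted_keys = [r[0] for r in sorted_file]

def sorted_range(lo, hi):
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_right(sorted_keys, hi)
    return sorted_file[start:end]

# Hashed file: a hash of the search key picks the bucket to look in.
NUM_BUCKETS = 4  # assumed bucket count
buckets = [[] for _ in range(NUM_BUCKETS)]
for r in records:
    buckets[r[0] % NUM_BUCKETS].append(r)

def hash_search(key):
    return [r for r in buckets[key % NUM_BUCKETS] if r[0] == key]
```

The heap search touches every record, the sorted file answers a range with two binary searches, and the hashed file inspects a single bucket.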
Slide No:L1-2
Index
• Primary vs. secondary: an index on a set of fields that
includes the primary key is called a primary index; other
indexes are called secondary indexes.
• Clustered vs. unclustered: when the file is organized
so that the order of data records is the same as, or
`close to', the order of data entries in some index, that
index is called a clustered index; otherwise it is unclustered.
– A file can be clustered on at most one search key.
– Cost of retrieving data records through index varies
greatly based on whether index is clustered or not!
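A rough sketch of why the retrieval cost differs: count how many distinct data pages hold the records matching a range query. The page size, the key layout, and the helper name `pages_touched` are assumptions made for illustration, not part of the slides.

```python
PAGE_SIZE = 4  # records per data page (assumed)

def pages_touched(record_order, lo, hi):
    """Count distinct data pages that hold keys in [lo, hi].

    record_order lists the keys in the physical order in which
    the records sit in the data file.
    """
    page_of = {key: pos // PAGE_SIZE for pos, key in enumerate(record_order)}
    return len({page_of[k] for k in record_order if lo <= k <= hi})

keys = list(range(16))

# Clustered: physical order of records matches the index order.
clustered = keys

# Unclustered: physical order is unrelated to key order
# (a deterministic interleaving, chosen only for this example).
unclustered = [k for start in range(4) for k in range(start, 16, 4)]
```

With these layouts the range 4..7 sits on one page in the clustered file but is spread over four pages in the unclustered one, so the same four records cost four page reads instead of one.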
Slide No:L1-3
Index data structures
Slide No:L3-3
Cost Model for Our Analysis
Slide No:L3-1
Cost of Operations
Slide No:L4-1
Indexed Sequential Access Method(ISAM)
In the ISAM data structure, the number of leaf pages is
fixed at file creation time. The records are stored in
leaf pages and are sorted by search key.
Disadvantages
Inserts into full leaf pages can result in long chains of overflow pages.
When a record is deleted from a primary leaf
page, the freed space is not reclaimed: the page stays
allocated even if it becomes empty.
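Index entry format (non-leaf pages):
P0 | K1 | P1 | K2 | P2 | ... | Km | Pm
[Figure: ISAM structure. Non-leaf pages on top; leaf pages below, made up of primary pages with overflow pages chained to them.]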
Slide No:L7-3
ISAM
[Figure: ISAM file layout: data pages, index pages, overflow pages.]
Slide No:L7-4
Example ISAM Tree
• Each node can hold 2 entries.
Root: 40
Index pages: [20 33] [51 63]
Leaf pages: [10* 15*] [20* 27*] [33* 37*] [40* 46*] [51* 55*] [63* 97*]
Slide No:L7-5
After Inserting 23*, 48*, 41*, 42* ...
Root: 40
Index pages: [20 33] [51 63]
Primary leaf pages: [10* 15*] [20* 27*] [33* 37*] [40* 46*] [51* 55*] [63* 97*]
Overflow pages: [23*] chained to [20* 27*]; [48* 41*] then [42*] chained to [40* 46*]
Slide No:L7-6
... Then Deleting 42*, 51*, 97*
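Root: 40
Index pages: [20 33] [51 63]
Primary leaf pages: [10* 15*] [20* 27*] [33* 37*] [40* 46*] [55*] [63*]
Overflow pages: [23*] chained to [20* 27*]; [48* 41*] chained to [40* 46*]
Note that 51 still appears in the index pages, but 51* no longer appears in a leaf.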
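The insert and delete sequence on the example tree can be traced with a minimal sketch. It rests on assumptions: the two index levels are flattened into one sorted `separators` list (they only direct the search), pages are Python lists, and the names `Leaf`, `find_leaf`, `insert` and `delete` are illustrative.

```python
import bisect

LEAF_CAPACITY = 2  # each page holds 2 entries, as in the example

class Leaf:
    def __init__(self, keys):
        self.keys = list(keys)  # entries on the primary page
        self.overflow = []      # chain of overflow pages (lists of keys)

# Static part, fixed at file creation: index keys and primary leaf pages.
separators = [20, 33, 40, 51, 63]
leaves = [Leaf(p) for p in
          ([10, 15], [20, 27], [33, 37], [40, 46], [51, 55], [63, 97])]

def find_leaf(key):
    # Keys equal to a separator belong to the page on its right.
    return leaves[bisect.bisect_right(separators, key)]

def insert(key):
    leaf = find_leaf(key)
    if len(leaf.keys) < LEAF_CAPACITY:
        bisect.insort(leaf.keys, key)
        return
    # Primary page is full: append to the overflow chain.
    if not leaf.overflow or len(leaf.overflow[-1]) == LEAF_CAPACITY:
        leaf.overflow.append([])
    leaf.overflow[-1].append(key)

def delete(key):
    leaf = find_leaf(key)
    if key in leaf.keys:
        leaf.keys.remove(key)  # primary page stays allocated
        return
    for page in leaf.overflow:
        if key in page:
            page.remove(key)
            break
    # Emptied overflow pages are dropped from the chain.
    leaf.overflow = [p for p in leaf.overflow if p]

for k in (23, 48, 41, 42):
    insert(k)
for k in (42, 51, 97):
    delete(k)
```

After both loops the `separators` list is untouched, which mirrors the point that the index levels of an ISAM file never change: 51 is still an index key although 51* is gone from the leaves.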
Slide No:L7-7
B+ Trees
A B+ tree is a dynamic index structure and the most widely used one.
It is a height-balanced tree: every path from
the root to any leaf has the same length.
The leaf (data) pages are organized as a doubly
linked list, so we can traverse them in both directions.
Each node except the root has between d and 2d
entries, where d is the order of the tree.
d is a parameter that measures the capacity of a
tree node.
Insert and delete operations keep the tree balanced.
B+ trees perform better than ISAM under inserts, as inserts are
handled without overflow pages.
[Figure: B+ tree structure. Non-leaf pages on top; leaf pages below, sorted by search key.]
Leaf pages contain data entries, and are chained (prev & next).
Non-leaf pages have index entries, used only to direct searches.
Index entry format:
P0 | K1 | P1 | K2 | P2 | ... | Km | Pm
Slide No:L2-2
Example B+ Tree
Root: 17
Index pages: [5 13] [27 30]
Leaf pages: [2* 3*] [5* 7* 8*] [14* 16*] [22* 24*] [27* 29*] [33* 34* 38* 39*]
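Searching the example tree can be sketched as follows. This is a minimal in-memory illustration that covers lookup and range scan only (no insert or delete); the class and function names are assumptions, and Python objects stand in for pages.

```python
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.prev = None  # backward link of the leaf chain
        self.next = None  # forward link of the leaf chain

class Inner:
    def __init__(self, keys, children):
        self.keys = keys          # K1..Km
        self.children = children  # P0..Pm, one more pointer than keys

# Leaf pages of the example tree, chained in both directions.
leaves = [Leaf(k) for k in
          ([2, 3], [5, 7, 8], [14, 16], [22, 24], [27, 29], [33, 34, 38, 39])]
for a, b in zip(leaves, leaves[1:]):
    a.next, b.prev = b, a

root = Inner([17], [Inner([5, 13], leaves[0:3]),
                    Inner([27, 30], leaves[3:6])])

def find_leaf(node, key):
    # Index entries only direct the search: follow Pi where Ki <= key < Ki+1.
    while isinstance(node, Inner):
        node = node.children[bisect.bisect_right(node.keys, key)]
    return node

def search(key):
    return key in find_leaf(root, key).keys

def range_scan(lo, hi):
    # Descend once to the leaf for lo, then walk the leaf chain.
    leaf = find_leaf(root, lo)
    out = []
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return out
            if k >= lo:
                out.append(k)
        leaf = leaf.next
    return out
```

A range scan descends the tree only once and then follows the `next` links, which is the reason the leaf pages are chained; the `prev` links would serve a scan in descending order.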
Slide No:L2-3