Ads 2 Part 3
1. Understand indexing
2. Understand B and B+ Trees
INDEXING
DENSE INDEX
MULTILEVEL INDEX
M-WAY SEARCH TREE
MOTIVATION FOR B-TREES
The B-Tree was developed in 1972 by Bayer and McCreight under the
name Height Balanced m-way Search Tree. It was later renamed the B-Tree.
If a large index is kept on disk as a binary tree, we end up with a very
deep tree that needs many separate disk accesses per search.
The solution is to use more branches per node and thus reduce the height
of the tree!
⚫ As branching increases, depth decreases
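The effect of branching on depth can be estimated with a few lines of Python. This is only a rough sketch: the function name `levels_needed` is hypothetical, and it assumes an idealised tree in which every node has exactly `fanout` children.

```python
def levels_needed(n_keys: int, fanout: int) -> int:
    """Smallest number of levels h with fanout**h >= n_keys (idealised full tree)."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout   # one more level multiplies the reachable slots by fanout
        levels += 1
    return levels

# One million keys: a binary tree needs about 20 levels, a 100-way tree about 3.
depth_binary = levels_needed(1_000_000, 2)
depth_100_way = levels_needed(1_000_000, 100)
```

If each level costs one disk access, the 100-way tree needs far fewer accesses per search than the binary tree.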
DEFINITION OF A B-TREE
A B-tree of order m is an m-way tree (i.e., a tree where each node
may have up to m children) in which:
1. the number of keys in each non-leaf node is one less than the
number of its children and these keys partition the keys in the
children in the fashion of a search tree
2. all leaves are on the same level
3. all non-leaf nodes except the root have at least ⌈m / 2⌉ children
4. the root is either a leaf node, or it has from two to m children
5. a leaf node contains no more than m – 1 keys
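The five conditions can be checked mechanically. The following is a minimal sketch, not a reference implementation: the `Node` class and `is_valid_btree` function are hypothetical names, keys are assumed to be distinct numbers, and the minimum in condition 3 is read as the ceiling of m / 2. The sample tree is the order-5, 26-item example from these slides.

```python
import math

class Node:
    """Hypothetical node: sorted keys plus child nodes (empty list for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])

def is_valid_btree(root, m):
    """Return True if the tree satisfies the five conditions for order m."""
    min_children = math.ceil(m / 2)
    leaf_depths = set()

    def visit(node, depth, lo, hi, is_root):
        # Keys must be sorted and lie strictly between the bounds set by ancestors.
        if node.keys != sorted(node.keys):
            return False
        if any(not (lo < k < hi) for k in node.keys):
            return False
        if not node.children:                        # leaf node
            leaf_depths.add(depth)
            return len(node.keys) <= m - 1           # condition 5
        if len(node.keys) != len(node.children) - 1: # condition 1
            return False
        least = 2 if is_root else min_children       # conditions 4 and 3
        if not (least <= len(node.children) <= m):
            return False
        bounds = [lo] + node.keys + [hi]
        return all(visit(child, depth + 1, bounds[i], bounds[i + 1], False)
                   for i, child in enumerate(node.children))

    ok = visit(root, 0, -math.inf, math.inf, True)
    return ok and len(leaf_depths) == 1              # condition 2

example = Node([26], [
    Node([6, 12], [Node([1, 2, 4]), Node([7, 8]), Node([13, 15, 18, 25])]),
    Node([42, 51, 62], [Node([27, 29]), Node([45, 46, 48]),
                        Node([53, 55, 60]), Node([64, 70, 90])]),
])
valid = is_valid_btree(example, 5)
```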
INSERTING INTO A B-TREE
Attempt to insert the new key into a leaf
If this would result in that leaf becoming too big, split the leaf into
two, promoting the middle key to the leaf’s parent
If this would result in the parent becoming too big, split the parent
into two, promoting the middle key
This strategy might have to be repeated all the way to the top
If necessary, the root is split in two and the middle key is promoted
to a new root, making the tree one level higher
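The insertion strategy above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: `Node`, `insert` and `levels` are hypothetical names, keys are assumed distinct, and for an even number of keys the "middle" key is taken to be the one at index `len(keys) // 2`. The key order in the example is an assumption chosen to be consistent with the order-5 construction example in these slides.

```python
from bisect import bisect_left

class Node:
    """Hypothetical node: sorted keys plus child nodes (empty list for a leaf)."""
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []

def _insert(node, key, m):
    """Insert key below node. Return (middle_key, right_node) if node split, else None."""
    i = bisect_left(node.keys, key)
    if not node.children:                      # leaf: the key goes here
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key, m)
        if split is None:
            return None
        mid_key, right = split                 # child split: take in the promoted key
        node.keys.insert(i, mid_key)
        node.children.insert(i + 1, right)
    if len(node.keys) <= m - 1:                # node is not too big
        return None
    mid = len(node.keys) // 2                  # too big: split around the middle key
    mid_key = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return mid_key, right

def insert(root, key, m):
    """Insert key and return the root, which is new if the old root split."""
    split = _insert(root, key, m)
    if split is None:
        return root
    mid_key, right = split
    return Node([mid_key], [root, right])      # the tree grows one level

def levels(root):
    """Keys of every node, level by level, for inspection."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out

root = Node()
for k in [1, 12, 8, 2, 25, 6, 14, 28, 17, 7, 52, 16, 48, 68, 3, 26, 29, 53, 55, 45]:
    root = insert(root, k, 5)
```

Calling `levels(root)` afterwards gives one list of nodes per level, which can be compared with the trees drawn in the construction example.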
AN EXAMPLE B-TREE
A B-tree of order 5 containing 26 items
Root: [26]
Level 2: [6 12] [42 51 62]
Leaves: [1 2 4] [7 8] [13 15 18 25] [27 29] [45 46 48] [53 55 60] [64 70 90]
Root: [1 2 8 12]
A node of an order-5 tree holds at most four keys, so the root is now full.
Therefore, when 25 arrives, pick the middle key (8) to make a new root
CONSTRUCTING A B-TREE (CONTD.)
Root: [8]
Leaves: [1 2] [12 25]
After 6, 14 and 28 are added to the leaves:
Root: [8]
Leaves: [1 2 6] [12 14 25 28]
CONSTRUCTING A B-TREE (CONTD.)
Adding 17 to the right leaf node would over-fill it, so we take the
middle key, promote it (to the root) and split the leaf
Root: [8 17]
Leaves: [1 2 6] [12 14] [25 28]
After 7, 16, 48 and 52 are added to the leaves:
Root: [8 17]
Leaves: [1 2 6 7] [12 14 16] [25 28 48 52]
CONSTRUCTING A B-TREE (CONTD.)
Adding 68 causes us to split the right most leaf, promoting 48 to the
root, and adding 3 causes us to split the left most leaf, promoting 3
to the root; 26, 29, 53, 55 then go into the leaves
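Root: [3 8 17 48]
Leaves: [1 2] [6 7] [12 14 16] [25 26 28 29] [52 53 55 68]
Adding 45 over-fills the leaf [25 26 28 29], so 28 is promoted to the
root; the root is then over-full as well, so it splits and 17 becomes
the new root
Root: [17]
Level 2: [3 8] [28 48]
Leaves: [1 2] [6 7] [12 14 16] [25 26] [29 45] [52 53 55 68]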
OPERATIONS
B-Tree of order 4
⚫ Each node has at most 4 (M) pointers and 3 (M-1) keys, and each non-root
node has at least 2 (⌈M/2⌉) pointers and 1 (⌈M/2⌉-1) key.
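Insert 5:
a = [5]
Insert 3:
a = [3 5]
Insert 21:
a = [3 5 21]
INSERT 9
Node a is over-full, so it splits and 9 is promoted to a new root
a = [9]
b = [3 5], c = [21]
Insert 1 and 13:
a = [9]
b = [1 3 5], c = [13 21]
Insert 2 (node b splits, promoting 3):
a = [3 9]
b = [1 2], d = [5], c = [13 21]
Insert 7 and 10:
a = [3 9]
b = [1 2], d = [5 7], c = [10 13 21]
Insert 12 (node c splits, promoting 13):
a = [3 9 13]
b = [1 2], d = [5 7], c = [10 12], e = [21]
Insert 4:
a = [3 9 13]
b = [1 2], d = [4 5 7], c = [10 12], e = [21]
Insert 8:
a = [9]
f = [3 7], g = [13]
b = [1 2], d = [4 5], h = [8], c = [10 12], e = [21]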
Node d must split into 2 nodes. This causes node a to split into 2
nodes and the tree grows a level.
REMOVAL FROM A B-TREE
Maximum children = M (for an order 5 tree it’s 5)
Maximum keys = M-1 (for an order 5 tree it’s 4)
LEAF NODE DELETION CASES:
⚫ Node has less than min. number of keys
⚫ Node has more than min. number of keys
If (1) or (2) leads to a leaf node containing fewer than the minimum
number of keys, then we have to look at the siblings immediately
adjacent to the leaf:
⚫ 3: if one of them has more than the min. number of keys, then we can
promote one of its keys to the parent and take the parent key into our
lacking leaf
⚫ 4: if neither of them has more than the min. number of keys, then the
lacking leaf and one of its neighbours can be combined with their shared
parent key (the opposite of promoting a key) and the new leaf will have the
correct number of keys; if this step leaves the parent with too few keys,
then we repeat the process up to the root itself, if required
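Cases 3 and 4 can be sketched for a single leaf as follows. This is a simplified illustration, assuming a hypothetical `Node` class and a hypothetical helper `fix_leaf_underflow`; it repairs one leaf only and leaves it to the caller to repeat the process if the parent ends up with too few keys. The sample values are taken from node g = [13] and its leaves [10 12] and [21] in the order-4 tree of these slides; the deleted keys (21, then 10) are chosen for illustration.

```python
class Node:
    """Hypothetical node: sorted keys plus child nodes (empty list for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])

def fix_leaf_underflow(parent, i, min_keys):
    """Repair the leaf parent.children[i] after a deletion left it with too few keys."""
    leaf = parent.children[i]
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None

    # Case 3: a sibling has a spare key. Its key moves up to the parent and
    # the separating parent key moves down into the lacking leaf.
    if left is not None and len(left.keys) > min_keys:
        leaf.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
        return "borrowed from left sibling"
    if right is not None and len(right.keys) > min_keys:
        leaf.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
        return "borrowed from right sibling"

    # Case 4: no sibling can spare a key. Combine the leaf, one neighbour and
    # the parent key between them into a single leaf.
    if left is not None:
        left.keys += [parent.keys.pop(i - 1)] + leaf.keys
        parent.children.pop(i)
    else:
        leaf.keys += [parent.keys.pop(i)] + right.keys
        parent.children.pop(i + 1)
    return "merged with sibling"

g = Node([13], [Node([10, 12]), Node([21])])
g.children[1].keys.remove(21)                      # delete 21: right leaf is empty
first = fix_leaf_underflow(g, 1, min_keys=1)       # case 3
state_after_borrow = (g.keys[:], [c.keys[:] for c in g.children])
g.children[0].keys.remove(10)                      # delete 10: left leaf is empty
second = fix_leaf_underflow(g, 0, min_keys=1)      # case 4
```

After the second call the parent g has no keys left, which is the situation in which the process has to be repeated one level up.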
B TREE
a = [9]
f = [3 7], g = [13]
b = [1 2], d = [4 5], h = [8], c = [10 12], e = [21]
B-tree of order 4
Minimum keys: 1
Maximum keys: 3
Minimum children: 2
Maximum children: 4
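The limits listed above follow directly from the order. A small sketch, assuming the hypothetical helper name `btree_limits` and taking the minimum number of children of a non-root node to be the ceiling of M / 2:

```python
import math

def btree_limits(order: int) -> dict:
    """Key and child limits for non-root nodes of a B-tree of the given order."""
    min_children = math.ceil(order / 2)
    return {
        "min_keys": min_children - 1,
        "max_keys": order - 1,
        "min_children": min_children,
        "max_children": order,
    }

order_4 = btree_limits(4)   # the tree used in this example
order_5 = btree_limits(5)   # the order used in the removal rules
```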
DELETE 2
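CASE: NODE HAS MORE THAN MIN. NUMBER OF KEYS
Before:
a = [9]
f = [3 7], g = [13]
b = [1 2], d = [4 5], h = [8], c = [10 12], e = [21]
After (leaf b still has the minimum of one key):
a = [9]
f = [3 7], g = [13]
b = [1], d = [4 5], h = [8], c = [10 12], e = [21]
Removing 21 empties leaf e; its sibling c has a spare key, so 12 moves up
to g and 13 moves down to e
a = [9]
f = [3 7], g = [12]
b = [1], d = [4 5], h = [8], c = [10], e = [13]
Removing 1 empties leaf b; its sibling d has a spare key, so 4 moves up
to f and 3 moves down to b
a = [9]
f = [4 7], g = [12]
b = [3], d = [5], h = [8], c = [10], e = [13]
Removing 10 empties leaf c; its sibling e has no spare key, so c, the
parent key 12 and e are combined into one leaf. Node g is then left
without a key, so f, the root key 9 and g are combined and the tree
loses a level
a = [4 7 9]
b = [3], d = [5], h = [8], e = [12 13]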
When searching tables held on disc, the cost of each disc transfer is
high but doesn't depend much on the amount of data transferred,
especially if consecutive items are transferred
⚫ If we use a B-tree of order 101, say, we can transfer each node in one disc
read operation
⚫ A B-tree of order 101 and height 3 can hold 101^4 – 1 items (approximately
100 million) and any item can be accessed with 3 disc reads (assuming we
hold the root in memory)
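The capacity figure can be checked with a short calculation. This sketch assumes the root is at level 0, so a tree of height 3 has four levels, and that every node is full; the function name `max_keys` is hypothetical.

```python
def max_keys(order: int, height: int) -> int:
    """Keys held by a completely full B-tree whose root is at level 0."""
    # Level i holds order**i nodes and each node holds order - 1 keys.
    nodes = sum(order ** level for level in range(height + 1))
    return nodes * (order - 1)

capacity = max_keys(101, 3)   # 104060400, i.e. 101**4 - 1
```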
If we take m = 3, we get a 2-3 tree, in which non-leaf nodes have
two or three children (i.e., one or two keys)
⚫ B-Trees are always balanced (since the leaves are all at the same level), so
2-3 trees make a good type of balanced tree
B+ TREES:
B+ tree is an extension of the B tree.
The difference between the B+ tree and the B tree is that in a B tree
records can be stored in internal nodes as well as leaf
nodes, whereas in a B+ tree the records are stored only in leaf
nodes and the internal nodes hold only keys.
The leaf nodes are linked to each other in a linked list
fashion.
This arrangement lets a search that has reached one leaf continue
through the neighbouring leaves, which makes range searches efficient.
Internal nodes of the B+ tree are called index nodes.
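The layout described above can be illustrated with a tiny hand-built B+ tree. This is a sketch only: the classes `Leaf` and `Index` and the functions `find_leaf` and `range_scan` are hypothetical names, and it assumes the convention that keys greater than or equal to an index key are found in the subtree to its right.

```python
from bisect import bisect_right

class Leaf:
    """Leaf node: keys, their records, and a link to the next leaf."""
    def __init__(self, keys, records):
        self.keys, self.records, self.next = keys, records, None

class Index:
    """Internal (index) node: keys and child pointers only, no records."""
    def __init__(self, keys, children):
        self.keys, self.children = keys, children

def find_leaf(node, key):
    """Descend through the index nodes to the leaf that may hold key."""
    while isinstance(node, Index):
        node = node.children[bisect_right(node.keys, key)]
    return node

def range_scan(root, lo, hi):
    """Collect (key, record) pairs with lo <= key <= hi by following leaf links."""
    leaf, out = find_leaf(root, lo), []
    while leaf is not None:
        for k, r in zip(leaf.keys, leaf.records):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, r))
        leaf = leaf.next          # move along the linked list of leaves
    return out

l1 = Leaf([1, 3], ["r1", "r3"])
l2 = Leaf([5, 7], ["r5", "r7"])
l3 = Leaf([9, 12], ["r9", "r12"])
l1.next, l2.next = l2, l3
root = Index([5, 9], [l1, l2, l3])   # 5 and 9 appear in the index and in a leaf

hits = range_scan(root, 3, 9)
```

Only one descent from the root is needed; the rest of the range is read by walking the leaf links.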
DIFFERENCE OF B TREES AND B+ TREES:
B-Tree | B+ Tree
Data is stored in leaf nodes as well as internal nodes. | Data is stored only in leaf nodes.
Searching is a bit slower as data is stored in internal as well as leaf nodes. | Searching is faster as the data is stored only in the leaf nodes.
No redundant search keys are present. | Redundant search keys may be present.
Leaf nodes cannot be linked together. | Leaf nodes are linked together to form a linked list.
B+ Tree Node Structure
IMPORTANCE
B+ trees are used by
⚫ NTFS, ReiserFS, NSS, XFS, JFS, ReFS, and BFS file
systems for metadata indexing
⚫ BFS for storing directories.
⚫ IBM DB2, Informix, Microsoft SQL Server, Oracle 8, Sybase
ASE, and SQLite for table indexes
DATA STRUCTURES FOR STRINGS
TRIES Data Structure
INTRODUCTION
Bioinformatics
(DNA/RNA or protein sequence data).
Search Engines.
Spell checker.
TRIES
The basic tool for string data structures, similar in role
to the balanced binary search tree, is the “trie”.
TRIES EXAMPLE
STRING TERMINATION
The use of the special termination character ‘\0’ has a
number of advantages in simplifying code.
Example:
⚫ Insert “extra”
[Figure: trie with path e → x, where x branches into t → r → a → \0 and a → m → \0]
FIND, INSERT AND DELETE
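To perform a delete operation in this structure:
1. Perform find
2. Delete the nodes on the path from ‘\0’ back toward the root of the
tree, stopping when we reach a node with more than 1 child
Example:
⚫ Delete “extra”
[Figure: trie with path e → x, where x branches into t → r → a → \0 and a → m → \0; deleting “extra” removes the branch t → r → a → \0 and stops at x]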
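Find, insert and delete on a trie with the ‘\0’ terminator can be sketched as follows. This is a minimal illustration, assuming each node is represented as a plain Python dictionary from character to child node; the function names are hypothetical. The words “extra” and “exam” match the example trie.

```python
END = "\0"   # termination character marking the end of a stored string

def insert(root, word):
    """Add word to the trie, creating nodes as needed."""
    node = root
    for ch in word + END:
        node = node.setdefault(ch, {})

def find(root, word):
    """Return True if word (including its terminator) is stored in the trie."""
    node = root
    for ch in word + END:
        if ch not in node:
            return False
        node = node[ch]
    return True

def delete(root, word):
    """Remove word. Return False if it was not stored."""
    # 1. Perform find, remembering the path taken.
    path, node = [], root
    for ch in word + END:
        if ch not in node:
            return False
        path.append((node, ch))
        node = node[ch]
    # 2. Walk back from '\0' toward the root, deleting nodes, and stop at the
    #    first node that still has another child (it had more than 1 child).
    for parent, ch in reversed(path):
        del parent[ch]
        if parent:
            break
    return True

trie = {}
insert(trie, "extra")
insert(trie, "exam")
removed = delete(trie, "extra")
```

Because every stored string ends in ‘\0’, a word that is a prefix of another word needs no special flag: it simply has a ‘\0’ child next to the other children.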