10. Tree
A Tree is a non-linear data structure in which items are arranged hierarchically rather than in a linear sequence.
It is used to represent the hierarchical relationships existing among several data items.
• A Tree is a recursive data structure containing a set of one or more data nodes, where one node is
designated as the root of the tree while the remaining nodes are the descendants of the root.
• The nodes other than the root node are partitioned into disjoint, non-empty sets, each of which is
called a sub-tree.
• Nodes of a tree either maintain a parent-child relationship between them or they are sibling nodes.
• In a general tree, a node can have any number of child nodes but it can have only a single parent.
• The following image shows a tree, where the node A is the root node of the tree while the other nodes are
the descendants of A.
Tree Terminology:
1. Root: It is the specially designated data item in a tree. It is the first in the hierarchical arrangement of data
items. In the above tree, A is the root item.
2. Node: Each data item in a tree is called a node. It is the basic structure in a tree. It holds the data
and the links (branches) to other data items. There are 10 nodes in the above tree.
3. Degree of a node: It is the number of sub-trees of a node in a given tree. In the above tree, the degree of node A
is 2 and that of E is 1.
4. Degree of a Tree: It is the maximum degree of the nodes in a given tree. In the above tree, the node A has degree
2 and no node has a higher degree. So, the degree of the above tree is 2.
5. Terminal Node: A node with degree zero is called a terminal node or a leaf. In the above tree, there are 5
terminal nodes. They are H, I, J, F and G.
6. Non-terminal nodes: Any node (except the root node) whose degree is not zero is called a non-terminal node.
Non-terminal nodes are the intermediate nodes in traversing the given tree from its root node to the terminal
nodes (leaves). There are 4 non-terminal nodes in the above tree.
7. Sibling: The child nodes of a given parent node are called siblings. They are also called brothers. In the
above tree, D and E are siblings whose parent node is B.
8. Level: The entire tree structure is leveled in such a way that the root node is always at level 0. Then, its
immediate children are at level 1 and their immediate children are at level 2 and so on up to the terminal nodes.
In general, if a node is at level n, then its children will be at level n+1.
9. Edge: It is the connecting line between two nodes. That is, the line drawn from one node to another node is
called an edge.
10. Path: It is a sequence of consecutive edges from the source node to the destination node. In the above tree, the
path between A and J is given by the node pairs (A, B), (B, E) and (E, J).
11. Depth: It is the maximum level of any node in a given tree, that is, the number of levels one can descend the
tree from its root to the farthest terminal node (leaf). In the above tree, the root node A is at level 0 and the
leaf J, reached through B and E, is at level 3. The term height is also used to denote the depth.
12. Forest: It is a set of disjoint trees. If you remove the root node of a given tree, it becomes a forest. In the
above tree, removing A leaves a forest with two trees.
13. Ancestor & Descendant: If B is a child of A, then A is said to be the father of B and B is said to be a son of A.
Node n1 is an ancestor of node n2 (and n2 is a descendant of n1) if n1 is either the father of n2 or the father of
some ancestor of n2.
Why Trees?
1. One reason to use trees might be because you want to store information that naturally forms a hierarchy. For
example, the file system on a computer:
file system
-----------
      /     <-- root
   /     \
 ...      home
        /      \
    ugrad      course
     /       /    |    \
   ...   cs101  cs112  cs113
2. Trees (with some ordering e.g., BST) provide moderate access/search (quicker than Linked List and slower
than arrays).
3. Trees provide moderate insertion/deletion (quicker than Arrays and slower than Unordered Linked Lists).
4. Like Linked Lists and unlike Arrays, Trees don’t have an upper limit on number of nodes as nodes are linked
using pointers.
Types of Tree
The tree data structure can be classified into six different categories.
General Tree
A General Tree stores the elements in a hierarchical order in which the top-level element is always present at level
0 as the root element. All the nodes except the root node are present at some level below it. The nodes which
share the same parent are called siblings, while nodes on adjacent levels that are joined by an edge exhibit the
parent-child relationship. A node may contain any number of sub-trees. A tree in which each
node contains at most 3 sub-trees is called a ternary tree.
Forests
A forest can be defined as the set of disjoint trees obtained by deleting the root node and the edges
which connect the root node to the first-level nodes.
Binary Tree
A binary tree is a data structure in which each node can have at most 2 children. The node present at the topmost
level is called the root node. A node with 0 children is called a leaf node. Binary trees are used in
applications like expression evaluation and many more. We will discuss binary trees in detail later in this tutorial.
Expression Tree
Expression trees are used to evaluate the simple arithmetic expressions. Expression tree is basically a binary tree
where internal nodes are represented by operators while the leaf nodes are represented by operands. Expression
trees are widely used to solve algebraic expressions like (a+b)*(a-b). Consider the following example.
Q. Construct an expression tree by using the following algebraic expression. (a + b) / (a*b - c) + d
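As a rough illustration of the question above, the following sketch builds the expression tree for (a + b) / (a*b - c) + d by hand and evaluates it bottom-up. The class ExprNode, the helpers evaluate and infix, and the operand values are all hypothetical choices made for this example; they are not taken from the original figure.

```python
import operator

# Hypothetical node type: operators sit in internal nodes, operands in leaves.
class ExprNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node, env):
    """Evaluate the tree bottom-up; env maps operand names to numbers."""
    if node.left is None and node.right is None:
        return env[node.value]                 # leaf: an operand
    left = evaluate(node.left, env)
    right = evaluate(node.right, env)
    return OPS[node.value](left, right)        # internal node: an operator

def infix(node):
    """Rebuild a fully parenthesised infix string with an in-order walk."""
    if node.left is None and node.right is None:
        return node.value
    return "(" + infix(node.left) + " " + node.value + " " + infix(node.right) + ")"

# (a + b) / (a*b - c) + d : the root is the operator applied last, the final '+'.
tree = ExprNode("+",
                ExprNode("/",
                         ExprNode("+", ExprNode("a"), ExprNode("b")),
                         ExprNode("-",
                                  ExprNode("*", ExprNode("a"), ExprNode("b")),
                                  ExprNode("c"))),
                ExprNode("d"))

values = {"a": 6, "b": 2, "c": 8, "d": 1}      # sample values, chosen arbitrarily
print(infix(tree))             # (((a + b) / ((a * b) - c)) + d)
print(evaluate(tree, values))  # 3.0
```

The operator that is applied last always ends up at the root, which is why the trailing "+ d" sits above the division.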
Tournament Tree
Tournament trees are used to record the winner of the match in each round played between two players.
A tournament tree can also be called a selection tree or winner tree. External nodes represent the players between
whom a match is played, while the internal nodes represent the winner of the match played. At the topmost
level, the winner of the tournament is present as the root node of the tree.
For example, the tree of a chess tournament played among 4 players is shown as follows. The
winner in the left sub-tree plays against the winner of the right sub-tree.
Binary Tree: A tree whose elements have at most 2 children is called a binary tree. Since each element in a
binary tree can have at most 2 children, we typically name them the left and right child.
Full Binary Tree or Strictly Binary Tree: A Binary Tree is full if every node has 0 or 2 children. Following
are examples of a full binary tree. We can also say a full binary tree is a binary tree in which all nodes except
leaves have two children.
              18
           /      \
         15        30
        /  \      /  \
      40    50  100   40

              18
           /      \
         15        20
        /  \
      40    50
     /  \
   30    50

              18
           /      \
         40        30
                  /  \
                100   40
Complete Binary Tree: A Binary Tree is a complete Binary Tree if all levels are completely filled except
possibly the last level, and the last level has all keys as far left as possible.
              18
           /      \
         15        30
        /  \      /  \
      40    50  100   40
     /  \   /
    8    7 9
Practical example of Complete Binary Tree is Binary Heap.
Perfect Binary Tree: A Binary Tree is a Perfect Binary Tree if all internal nodes have two children and all
leaves are at the same level.
Following are examples of Perfect Binary Trees.
              18
           /      \
         15        30
        /  \      /  \
      40    50  100   40

              18
           /      \
         15        30
A Perfect Binary Tree of height h (where height is the number of nodes on the path from the root to a leaf) has
2^h - 1 nodes.
Example of a Perfect binary tree is ancestors in the family. Keep a person at root, parents as children, and
parents of parents as their children.
A degenerate (or pathological) tree: A tree where every internal node has one child. Such trees are
performance-wise the same as a linked list.
      10
      /
    20
      \
       30
         \
          40
Here the circles represent the internal nodes and the boxes represent the external nodes.
Properties of Extended binary tree
1. The nodes from the original tree are internal nodes and the special nodes are external nodes.
2. All external nodes are leaf nodes and the internal nodes are non-leaf nodes.
3. Every internal node has exactly two children and every external node is a leaf. The result is therefore
a full (strictly) binary tree.
Prove that E = I + 2n, where E is the external path length, I is the internal path length and n is the total number of
internal nodes.
We do this by using induction on n.
Induction Base:
When n = 0 the binary tree has no internal node and 1 external node. For this tree E = I = n = 0.
Therefore, E = I + 2n.
Induction Hypothesis:
Let m be any integer >= 0. Assume that E = I + 2m for all binary trees that have m internal nodes.
Induction Step:
We will show that E = I + 2(m + 1) for all binary trees that have m + 1 internal nodes. Consider any binary tree T
that has m + 1 internal nodes. Remove any one internal node whose two children are both external nodes, and put a
single external node in its place. The resulting tree, T', has m internal nodes. From the induction hypothesis it
follows that E' = I' + 2m where E' and I' are, respectively, the external and internal path lengths of T'.
Suppose that the removed node was at level 'level' of T. Its two external children at level + 1 are lost and one
external node at 'level' is gained, so E = E' + 2(level + 1) - level = E' + level + 2, and I = I' + level,
where E and I are, respectively, the external and internal path lengths of T. Therefore,
E = E' + level + 2
  = I' + 2m + level + 2
  = I - level + 2m + level + 2
  = I + 2(m + 1)
Here, the sum of total weights is already calculated and stored in the external nodes, which makes it much
easier to calculate the total path length of a tree with given weights. The same technique can be used to
update routing tables in a network.
2. To convert binary tree in complete binary tree: The above-given tree having removed all the external
nodes, is not a complete binary tree. To introduce any tree as complete tree, external nodes are added onto it.
Heap is a great example of a complete binary tree and thus each binary tree can be expressed as heap if
external nodes are added to it.
So:
Depth d    # nodes at depth d    # of child nodes
--------------------------------------------------------------
0          1 = 2^0               2 (each node has 2 children)
1          2 = 2^1               4 (each node has 2 children)
2          4 = 2^2               8 (each node has 2 children)
...
I.e.: The number of nodes doubles every time the depth increases by 1!
Therefore: Maximum number of nodes at level l = 2^l
2) For any nonempty binary tree, T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2,
then n0=n2+1
Proof:
Let n, e be the total no. of nodes and edges of the binary tree respectively.
Let n0 = total no. of nodes with 0 children, n1 = total no. of nodes with 1 child and n2 = total no. of nodes with 2
children.
Therefore, n = n0 + n1 + n2......(1)
Again e= n-1.....................(2)
also e = 0*n0 + 1*n1 + 2*n2...............(3)
Now, from (2) and (3) we get,
n-1 = n1 + 2*n2
=> n=1 + n1 + 2*n2.............(4)
from (1) and (4) we get,
n0 + n1 + n2 = 1 + n1 + 2*n2 => n0 = 1 + n2... proved
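The counting argument can be checked on a concrete tree. This is a small sketch with hypothetical names (Node, degree_counts); the example tree is arbitrary.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def degree_counts(root):
    """Return [n0, n1, n2]: how many nodes have 0, 1 and 2 children."""
    counts = [0, 0, 0]
    stack = [root] if root is not None else []
    while stack:
        cur = stack.pop()
        children = [c for c in (cur.left, cur.right) if c is not None]
        counts[len(children)] += 1
        stack.extend(children)
    return counts

tree = Node(Node(Node(), None), Node(Node(), Node()))
n0, n1, n2 = degree_counts(tree)
n = n0 + n1 + n2                  # equation (1)
e = n1 + 2 * n2                   # equation (3): every child contributes one edge
print(n0, n1, n2)                 # 3 1 2
print(e == n - 1, n0 == n2 + 1)   # True True
```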
3) In a Binary Tree with N nodes, the minimum possible height or minimum number of levels is ⌈log2(N+1)⌉.
This follows from the fact that a binary tree with h levels holds at most 2^h - 1 nodes. If we consider the
convention where the height of a leaf node is 0, then the above formula for minimum possible height becomes
⌈log2(N+1)⌉ - 1.
4) A Binary Tree with L leaves has at least ⌈log2 L⌉ + 1 levels.
A Binary Tree has the maximum number of leaves (and minimum number of levels) when all levels are fully filled.
Let all leaves be at level l (counting the root as level 1); then the following is true for the number of leaves L.
L <= 2^(l-1) [From Point 1]
l = ⌈log2 L⌉ + 1
where l is the minimum number of levels.
5) In Binary tree where every node has 0 or 2 children, number of leaf nodes is always one more than nodes
with two children.
L=T+1
Where L = Number of leaf nodes, T = Number of internal nodes with two children
The root node is always at index 0. Then, in successive memory locations the left child and right child are stored.
Consider a binary tree with only three nodes as shown. Let BT denote a binary tree.
How do we identify the father, the left child and the right child of an arbitrary node in such a representation? It
is very simple. For any node at index n, 0 <= n <= (max-1), we have:
1. Father (n): The father of the node having index n is at floor((n-1)/2) if n is not equal to 0. If n = 0, then it
is the root node and has no father.
Example: Consider the node numbered 3 (i.e., D). The father of D is B, whose index is 1 (floor((3-1)/2) = 1).
4. Siblings: If the left child is at index n, then its right sibling (or brother) is at (n+1). Similarly, if the
right child is at index n, then its left sibling is at (n-1).
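The index arithmetic can be sketched as follows. The father and sibling rules are the ones stated above; the child formulas 2n+1 and 2n+2 are an assumption here, obtained by inverting the father formula for a root stored at index 0. The function names and the contents of BT beyond A, B and D are hypothetical.

```python
def father(n):
    """Index of the father of the node at index n, or None for the root."""
    return (n - 1) // 2 if n > 0 else None

def left_child(n):
    """Assumed: the left child of index n is stored at 2n + 1."""
    return 2 * n + 1

def right_child(n):
    """Assumed: the right child of index n is stored at 2n + 2."""
    return 2 * n + 2

def sibling(n):
    """Left children sit at odd indexes, right children at even indexes."""
    if n == 0:
        return None                    # the root has no sibling
    return n + 1 if n % 2 == 1 else n - 1

# Hypothetical array BT for a small complete binary tree.
BT = ["A", "B", "C", "D", "E", "F", "G"]

print(BT[father(3)])                   # B  (father of D)
print(BT[left_child(1)], BT[right_child(1)])   # D E
print(BT[sibling(3)], BT[sibling(4)])  # E D
```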
The array representation is ideal for complete binary trees. However, it is not suitable for other
binary trees, as it results in unnecessary wastage of memory space. Consider the following binary tree:
It is a skewed binary tree. Since only the left sub-tree is present at every node, this type of binary tree is
called a left skewed binary tree. You can also have a right skewed binary tree. The array representation of the
above binary tree is given above. Note that the right child of A is empty, and so are both of its child positions,
as is the position at index 4. Therefore, these indexes in array BT are left unused, which wastes memory.
Pre-order traversal
Steps
o Visit the root node
o traverse the left sub-tree in pre-order
o traverse the right sub-tree in pre-order
Algorithm
o Step 1: IF TREE != NULL, perform Steps 2 to 4
o Step 2: Write TREE -> DATA
o Step 3: PREORDER(TREE -> LEFT)
o Step 4: PREORDER(TREE -> RIGHT)
[END OF IF]
o Step 5: END
Example
Traverse the following binary tree by using pre-order traversal
o Since the traversal scheme we are using is pre-order traversal, the first element to be printed is the root,
18.
o Traverse the left sub-tree recursively. The root node of the left sub-tree is 211; print it and move to its left.
o Its left is empty, therefore print its right child, 90, and then move to the right sub-tree of the root.
o 20 is the root of that sub-tree, therefore print it and move to its left. Since the left sub-tree is empty, move
to the right and print the only element present there, i.e. 190.
o Therefore, the printing sequence will be 18, 211, 90, 20, 190.
In-order traversal
Steps
o Traverse the left sub-tree in in-order
o Visit the root
o Traverse the right sub-tree in in-order
Algorithm
o Step 1: IF TREE != NULL, perform Steps 2 to 4
o Step 2: INORDER(TREE -> LEFT)
o Step 3: Write TREE -> DATA
o Step 4: INORDER(TREE -> RIGHT)
[END OF IF]
o Step 5: END
Example
Traverse the following binary tree by using in-order traversal.
o print the left most node of the left sub-tree i.e. 23.
o print the root of the left sub-tree i.e. 211.
o print the right child i.e. 89.
o print the root node of the tree i.e. 18.
o Then, move to the right sub-tree of the binary tree and print the left most node i.e. 10.
o print the root of the right sub-tree i.e. 20.
o print the right child i.e. 32.
o hence, the printing sequence will be 23, 211, 89, 18, 10, 20, 32.
Post-order traversal
Steps
o Traverse the left sub-tree in post-order
o Traverse the right sub-tree in post-order
o visit the root
Algorithm
o Step 1: IF TREE != NULL, perform Steps 2 to 4
o Step 2: POSTORDER(TREE -> LEFT)
o Step 3: POSTORDER(TREE -> RIGHT)
o Step 4: Write TREE -> DATA
[END OF IF]
o Step 5: END
Example
Traverse the following tree by using post-order traversal
o Print the left child of the left sub-tree of binary tree i.e. 23.
o print the right child of the left sub-tree of binary tree i.e. 89.
o print the root node of the left sub-tree i.e. 211.
o Now, before printing the root node, move to right sub-tree and print the left child i.e. 10.
o print 32 i.e. right child.
o Print the root node 20.
o Now, at the last, print the root of the tree i.e. 18.
The printing sequence will be 23, 89, 211, 10, 32, 20, 18.
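The three traversals can be sketched in a few lines of Python. The tree below is reconstructed from the in-order and post-order walkthroughs (root 18, left sub-tree 211 with children 23 and 89, right sub-tree 20 with children 10 and 32); since the figure is not reproduced here, that shape is an assumption. The class and function names are hypothetical.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def preorder(tree):
    """Root, then left sub-tree, then right sub-tree."""
    if tree is None:
        return []
    return [tree.data] + preorder(tree.left) + preorder(tree.right)

def inorder(tree):
    """Left sub-tree, then root, then right sub-tree."""
    if tree is None:
        return []
    return inorder(tree.left) + [tree.data] + inorder(tree.right)

def postorder(tree):
    """Left sub-tree, then right sub-tree, then root."""
    if tree is None:
        return []
    return postorder(tree.left) + postorder(tree.right) + [tree.data]

tree = Node(18,
            Node(211, Node(23), Node(89)),
            Node(20, Node(10), Node(32)))

print(preorder(tree))    # [18, 211, 23, 89, 20, 10, 32]
print(inorder(tree))     # [23, 211, 89, 18, 10, 20, 32]
print(postorder(tree))   # [23, 89, 211, 10, 32, 20, 18]
```

Each function returns a list instead of printing, which keeps the recursion identical to the Steps above while making the result easy to compare.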
The last element in postorder[] will be the root of the tree; here it is 1.
Now search for the element 1 in inorder[]. Say you find it at position i; once you find it, make a note of the
elements which are to the left of i (these will form the left subtree) and the elements which are to the right of i
(these will form the right subtree).
Suppose that in the previous step there are X elements to the left of i (which will form the left subtree). Take
the first X elements from the postorder[] traversal; this will be the post-order traversal of the elements
which are to the left of i. Similarly, if there are Y elements to the right of i (which will form the
right subtree), take the next Y elements, after the first X elements, from the postorder[] traversal; this will be
the post-order traversal of the elements which are to the right of i.
From the previous two steps, construct the left and right subtrees and link them to root.left and root.right
respectively.
See the picture for a better explanation.
Make a Binary Tree from Given Inorder and Preorder Traversal.
int [] inOrder = {2,5,6,10,12,14,15};
int [] preOrder = {10,5,2,6,14,12,15};
The first element in preorder[] will be the root of the tree; here it is 10.
Now search for the element 10 in inorder[]. Say you find it at position i; once you find it, make a note of the
elements which are to the left of i (these will form the left subtree) and the elements which are to the right of i
(these will form the right subtree).
Repeat this step recursively: construct the left subtree and link it to root.left, then construct the right
subtree and link it to root.right.
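A minimal sketch of this construction, using the two arrays given above, might look as follows. It assumes that all keys are distinct (otherwise the position i is ambiguous); Node, build and postorder are hypothetical names, and list slicing is used for clarity rather than efficiency.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def build(preorder, inorder):
    """Rebuild a binary tree from its pre-order and in-order sequences."""
    if not preorder:
        return None
    root = Node(preorder[0])               # first pre-order element is the root
    i = inorder.index(root.data)           # position of the root in in-order
    # The i elements left of position i form the left sub-tree.
    root.left = build(preorder[1:i + 1], inorder[:i])
    # Everything right of position i forms the right sub-tree.
    root.right = build(preorder[i + 1:], inorder[i + 1:])
    return root

def postorder(tree):
    if tree is None:
        return []
    return postorder(tree.left) + postorder(tree.right) + [tree.data]

in_order = [2, 5, 6, 10, 12, 14, 15]
pre_order = [10, 5, 2, 6, 14, 12, 15]
root = build(pre_order, in_order)
print(root.data, root.left.data, root.right.data)   # 10 5 14
print(postorder(root))                               # [2, 6, 5, 12, 15, 14, 10]
```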
A binary search tree is shown in the above figure. Because of the constraint applied to a BST, we can see that the
root node 30 has no value greater than or equal to 30 in its left sub-tree and no
value less than 30 in its right sub-tree.
Advantages of using binary search tree
1. Searching becomes very efficient in a binary search tree since we get a hint at each step about which
sub-tree contains the desired element.
2. The binary search tree is considered an efficient data structure in comparison to arrays and linked lists. When
the tree is balanced, each step of a search discards about half of the remaining tree, so searching for an element
takes O(log2 n) time. In the worst case (a skewed tree), the time it takes to search for an element is O(n).
3. It also speeds up the insertion and deletion operations as compared to those in arrays and linked lists.
Q. Create the binary search tree using the following data elements.
43, 10, 79, 90, 12, 54, 11, 9, 50
1. Insert 43 into the tree as the root of the tree.
2. Read the next element; if it is less than the root node element, insert it into the left sub-tree, repeating
the comparison at each node reached until an empty position is found.
3. Otherwise, insert it into the right sub-tree in the same way.
The process of creating the BST by using the given elements is shown in the image below.
Operations on Binary Search Tree
There are many operations which can be performed on a binary search tree.
SN   Operation          Description
1    Searching in BST   Finding the location of some specific element in a binary search tree.
2    Insertion in BST   Adding a new element to the binary search tree at the appropriate location so that
                        the property of the BST is not violated.
3    Deletion in BST    Deleting some specific node from a binary search tree. However, there can be
                        various cases in deletion depending upon the number of children the node has.
Searching
Searching means finding or locating some specific element or node within a data structure. However, searching
for some specific node in binary search tree is pretty easy due to the fact that, element in BST are stored in a
particular order.
1. Compare the element with the root of the tree.
2. If the item is matched then return the location of the node.
3. Otherwise check if item is less than the element present on root, if so then move to the left sub-tree.
4. If not, then move to the right sub-tree.
5. Repeat this procedure recursively until match found.
6. If element is not found then return NULL.
Algorithm:
Search (ROOT, ITEM)
o Step 1: IF ROOT = NULL OR ROOT -> DATA = ITEM
              Return ROOT
          ELSE
              IF ITEM < ROOT -> DATA
                  Return search(ROOT -> LEFT, ITEM)
              ELSE
                  Return search(ROOT -> RIGHT, ITEM)
              [END OF IF]
          [END OF IF]
o Step 2: END
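The search procedure translates almost line for line into Python. In this sketch the names Node and search are hypothetical, and the small tree is built by hand from some of the keys used in the construction question earlier in this section.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def search(root, item):
    """Return the node holding item, or None if item is not in the BST."""
    if root is None or root.data == item:   # test for None before reading data
        return root
    if item < root.data:
        return search(root.left, item)      # item can only be on the left
    return search(root.right, item)         # otherwise it can only be on the right

tree = Node(43,
            Node(10, Node(9), Node(12)),
            Node(79, Node(54), Node(90)))

print(search(tree, 54).data)   # 54
print(search(tree, 11))        # None
```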
Insertion
The insert function is used to add a new element to a binary search tree at the appropriate location. It must
be designed in such a way that it does not violate the property of the binary search tree at any node.
1. Allocate the memory for the new node.
2. Set the data part to the value and set the left and right pointers of the node to NULL.
3. If the item to be inserted is the first element of the tree, then this node becomes the root, with its left and
right pointing to NULL.
4. Else, check if the item is less than the root element of the tree, if this is true, then recursively perform
this operation with the left of the root.
5. If this is false, then perform this operation recursively with the right sub-tree of the root.
Insert (TREE, ITEM)
o Step 1: IF TREE = NULL
              Allocate memory for TREE
              SET TREE -> DATA = ITEM
              SET TREE -> LEFT = TREE -> RIGHT = NULL
          ELSE
              IF ITEM < TREE -> DATA
                  Insert(TREE -> LEFT, ITEM)
              ELSE
                  Insert(TREE -> RIGHT, ITEM)
              [END OF IF]
          [END OF IF]
o Step 2: END
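A possible Python rendering of Insert is sketched below, applied to the elements of the construction question (43, 10, 79, 90, 12, 54, 11, 9, 50). Because Python cannot assign through a pointer argument the way the pseudocode does, this version returns the root of the updated sub-tree and the caller re-links it; that is a design choice of the sketch, and the names Node, insert and inorder are hypothetical.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def insert(tree, item):
    """Insert item into the BST rooted at tree and return the (new) root."""
    if tree is None:
        return Node(item)                    # empty spot found: create the node
    if item < tree.data:
        tree.left = insert(tree.left, item)
    else:
        tree.right = insert(tree.right, item)
    return tree

def inorder(tree):
    if tree is None:
        return []
    return inorder(tree.left) + [tree.data] + inorder(tree.right)

root = None
for value in [43, 10, 79, 90, 12, 54, 11, 9, 50]:
    root = insert(root, value)

print(root.data)        # 43
print(inorder(root))    # [9, 10, 11, 12, 43, 50, 54, 79, 90]
```

An in-order traversal of a binary search tree always yields the keys in ascending order, which makes it a convenient sanity check after a series of insertions.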
Deletion
The delete function is used to delete the specified node from a binary search tree. However, we must delete a node
from a binary search tree in such a way that the property of the binary search tree is not violated. There are three
situations when deleting a node from a binary search tree.
r = Node(50)
r = insert(r, 30)
r = insert(r, 20)
r = insert(r, 40)
r = insert(r, 70)
r = insert(r, 60)
r = insert(r, 80)
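def search(root, key):
    # Return the node holding key, or None if the key is absent.
    if root is None or root.val == key:
        return root
    if key < root.val:
        return search(root.left, key)
    return search(root.right, key)

def preOrder(node):
    # Print the keys in pre-order: node, left sub-tree, right sub-tree.
    if not node:
        return
    print(node.val)
    preOrder(node.left)
    preOrder(node.right)

# TreeNode and delete_Node are assumed to be defined earlier in the program.
root = TreeNode(5)
root.left = TreeNode(3)
root.right = TreeNode(6)
root.left.left = TreeNode(2)
root.left.right = TreeNode(4)
root.left.right.left = TreeNode(7)
print("Original node:")
preOrder(root)  # preOrder prints the keys itself and returns None
result = delete_Node(root, 4)
print("After deleting specified node:")
preOrder(result)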
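    # Tail of the threaded-BST insert function: link the new node tmp under its parent par.
    if par is None:
        # Empty tree: tmp becomes the root.
        root = tmp
        tmp.left = None
        tmp.right = None
    elif ikey < (par.info):
        # tmp becomes the left child; it inherits par's left thread.
        tmp.left = par.left
        tmp.right = par
        par.lthread = False
        par.left = tmp
    else:
        # tmp becomes the right child; it inherits par's right thread.
        tmp.left = par
        tmp.right = par.right
        par.rthread = False
        par.right = tmp
    return root

# Main Code
if __name__ == '__main__':
    root = None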
Output
5 10 13 14 16 17 20 30
AVL Tree
The AVL Tree was invented by G. M. Adelson-Velsky and E. M. Landis in 1962. The tree is named AVL in honour of its
inventors.
An AVL Tree can be defined as a height-balanced binary search tree in which each node is associated with a balance
factor, calculated by subtracting the height of its right sub-tree from that of its left sub-tree.
The tree is said to be balanced if the balance factor of each node is between -1 and 1; otherwise, the tree is
unbalanced and needs to be rebalanced.
Balance Factor (k) = height (left(k)) - height (right(k))
If the balance factor of any node is 1, it means that the left sub-tree is one level higher than the right sub-tree.
If the balance factor of any node is 0, it means that the left sub-tree and right sub-tree have equal height.
If the balance factor of any node is -1, it means that the left sub-tree is one level lower than the right sub-tree.
An AVL tree is given in the following figure. We can see that the balance factor associated with each node is
between -1 and +1. Therefore, it is an example of an AVL tree.
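The balance factor can be computed directly from the definition. The sketch below uses hypothetical names (Node, height, balance_factor, is_balanced), counts the height of an empty sub-tree as 0, and checks only the height condition, not the binary-search-tree ordering.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def height(node):
    """Number of levels in the sub-tree; an empty sub-tree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    """height(left sub-tree) - height(right sub-tree)."""
    return height(node.left) - height(node.right)

def is_balanced(node):
    """True if every node has a balance factor of -1, 0 or 1."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_balanced(node.left)
            and is_balanced(node.right))

balanced = Node(30, Node(20, Node(10)), Node(40))
chain = Node(30, Node(20, Node(10)))          # left-skewed, not an AVL tree

print(balance_factor(balanced), is_balanced(balanced))   # 1 True
print(balance_factor(chain), is_balanced(chain))         # 2 False
```

Recomputing heights on every call is quadratic in the worst case; practical AVL implementations usually cache the height in each node.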
Complexity
Algorithm    Average case    Worst case
Space        O(n)            O(n)
Search       O(log n)        O(log n)
Insert       O(log n)        O(log n)
Delete       O(log n)        O(log n)
Operations on AVL tree
Due to the fact that, AVL tree is also a binary search tree therefore, all the operations are performed in the same
way as they are performed in a binary search tree. Searching and traversing do not lead to the violation in
property of AVL tree. However, insertion and deletion are the operations which can violate this property and
therefore, they need to be revisited.
Q. Construct an AVL tree by inserting the following elements in the given order.
63, 9, 19, 27, 18, 108, 99, 81
The process of constructing an AVL tree from the given set of elements is shown in the following figure.
At each step, we must calculate the balance factor for every node; if it is found to be more than 1 or less than -1,
then we need a rotation to rebalance the tree. The type of rotation is determined by the location of the
inserted element with respect to the critical node. All the elements are inserted so as to maintain the order of
the binary search tree.
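The construction can be sketched with a standard recursive AVL insertion. This is one common way to implement the rotations, not necessarily the one used to draw the figure; all names are hypothetical, and each node caches its height (a single node has height 1).

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1

def height(node):
    return node.height if node else 0

def update(node):
    node.height = 1 + max(height(node.left), height(node.right))

def balance(node):
    return height(node.left) - height(node.right)

def rotate_right(a):
    """The left child b of the critical node a becomes the sub-tree root."""
    b = a.left
    a.left, b.right = b.right, a
    update(a)
    update(b)
    return b

def rotate_left(a):
    """The right child b of the critical node a becomes the sub-tree root."""
    b = a.right
    a.right, b.left = b.left, a
    update(a)
    update(b)
    return b

def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    update(node)
    bf = balance(node)
    if bf > 1:                                   # left sub-tree too high
        if balance(node.left) < 0:               # LR case: rotate the child first
            node.left = rotate_left(node.left)
        return rotate_right(node)                # LL case
    if bf < -1:                                  # right sub-tree too high
        if balance(node.right) > 0:              # RL case: rotate the child first
            node.right = rotate_right(node.right)
        return rotate_left(node)                 # RR case
    return node

def inorder(node):
    return inorder(node.left) + [node.key] + inorder(node.right) if node else []

root = None
for key in [63, 9, 19, 27, 18, 108, 99, 81]:
    root = insert(root, key)

print(root.key)         # 19
print(inorder(root))    # [9, 18, 19, 27, 63, 81, 99, 108]
```

With this input, inserting 19 triggers an LR rotation at 63 and inserting 81 triggers an LL rotation at 108.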
Example:
Delete the node 30 from the AVL tree shown in the following image.
Solution
In this case, the node B has balance factor 0, therefore the tree is rotated by using the R0 rotation as shown in
the following image. The node B(10) becomes the root, while the node A is moved to its right. The right child of
node B now becomes the left child of node A.
R1 Rotation (Node B has balance factor 1)
R1 Rotation is to be performed if the balance factor of Node B is 1. In R1 rotation, the critical node A is moved
to its right having sub-trees T2 and T3 as its left and right child respectively. T1 is to be placed as the left sub-
tree of the node B.
The process involved in R1 rotation is shown in the following image.
Example
Delete Node 55 from the AVL tree shown in the following image.
Solution:
Deleting 55 from the AVL Tree disturbs the balance factor of the node 50, i.e. node A, which becomes the critical
node. This is the condition for the R1 rotation, in which the node A is moved to its right (shown in the image
below). The right child of B now becomes the left child of A (i.e. 45).
The process involved in the solution is shown in the following image.
R-1 Rotation (Node B has balance factor -1)
The R-1 rotation is to be performed if the node B has balance factor -1. This case is treated in the same way as the
LR rotation. In this case, the node C, which is the right child of node B, becomes the root node of the tree with B
and A as its left and right children respectively.
The sub-trees T1 and T2 become the left and right sub-trees of B, whereas T3 and T4 become the left and right
sub-trees of A.
The process involved in R-1 rotation is shown in the following image.
Example
Delete the node 60 from the AVL tree shown in the following image.
Solution:
In this case, node B has balance factor -1. Deleting the node 60 disturbs the balance factor of the node 50;
therefore, it needs to be R-1 rotated. The node C, i.e. 45, becomes the root of the tree with the nodes B(40) and
A(50) as its left and right children.
Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each, and one heap node is
root of tree with 3 elements
character        Frequency
c                12
d                13
Internal Node    14
e                16
f                45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with frequency 12 + 13 = 25
Now min heap contains 4 nodes where 2 nodes are roots of trees with single element each, and two heap nodes
are roots of trees with more than one node.
character        Frequency
Internal Node    14
e                16
Internal Node    25
f                45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 + 16 = 30
Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100
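The merging steps can be reproduced with Python's heapq module. This section shows only the internal node of frequency 14, not the two leaves beneath it; the sketch assumes they are hypothetical characters a (frequency 5) and b (frequency 9). The function names and the convention of appending 0 for the left branch and 1 for the right branch are likewise choices of this example.

```python
import heapq
from itertools import count

def huffman(freqs):
    """Return (merge_sums, tree); a tree is a character or a (left, right) pair."""
    tie = count()                       # tie-breaker so trees are never compared
    heap = [(f, next(tie), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    merge_sums = []
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # the two minimum-frequency nodes
        f2, _, right = heapq.heappop(heap)
        merge_sums.append(f1 + f2)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    return merge_sums, heap[0][2]

def codes(tree, prefix=""):
    """Assign 0 to every left branch and 1 to every right branch."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}
    left, right = tree
    table = codes(left, prefix + "0")
    table.update(codes(right, prefix + "1"))
    return table

# a and b are assumed; c, d, e and f are the frequencies shown in the tables.
freqs = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
merge_sums, tree = huffman(freqs)
print(merge_sums)     # [14, 25, 30, 55, 100]
print(codes(tree))    # {'f': '0', 'c': '100', 'd': '101', 'a': '1100', 'b': '1101', 'e': '111'}
```

The sums 25, 30 and 100 match Steps 3, 4 and 6 above.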
While performing some operations on a B Tree, a property of the B Tree may be violated, such as the minimum number
of children a node can have. To maintain the properties of the B Tree, nodes may be split or joined.
Searching:
Searching in B Trees is similar to that in a binary search tree. For example, suppose we search for the item 49 in
the following B Tree. The process will be something like the following:
1. Compare item 49 with the root node 78. Since 49 < 78, move to its left sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. 49 > 45, so move to the right. Compare with 49.
4. Match found; return.
Searching in a B tree depends upon the height of the tree. The search algorithm takes O(log n) time to search for
any element in a B tree.
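A rough sketch of this search is given below. The figure is not reproduced in this text, so the tree is a hypothetical one that merely agrees with the steps above (root 78, a left child holding 40 and 56, and a leaf holding 45 and 49); every other key, and the names BTreeNode and search, are assumptions.

```python
from bisect import bisect_left

class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                   # sorted keys stored in this node
        self.children = children or []     # empty list for a leaf

def search(node, item):
    """Return (node, index) where item is stored, or None if it is absent."""
    while node is not None:
        i = bisect_left(node.keys, item)   # first key that is >= item
        if i < len(node.keys) and node.keys[i] == item:
            return node, i
        if not node.children:              # reached a leaf without a match
            return None
        node = node.children[i]            # descend between keys[i-1] and keys[i]
    return None

root = BTreeNode([78], [
    BTreeNode([40, 56], [
        BTreeNode([20, 30]),
        BTreeNode([45, 49]),
        BTreeNode([60, 70]),
    ]),
    BTreeNode([90], [
        BTreeNode([80, 85]),
        BTreeNode([95, 99]),
    ]),
])

found = search(root, 49)
print(found[0].keys, found[1])   # [45, 49] 1
print(search(root, 50))          # None
```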
Inserting
Insertions are done at the leaf node level. The following algorithm needs to be followed in order to insert an item
into B Tree.
1. Traverse the B Tree in order to find the appropriate leaf node at which the node can be inserted.
2. If the leaf node contains fewer than m-1 keys, then insert the element in increasing order.
3. Else, if the leaf node contains m-1 keys, then follow the following steps.
o Insert the new element in the increasing order of elements.
o Split the node into the two nodes at the median.
o Push the median element up to its parent node.
o If the parent node also contains m-1 number of keys, then split it too by following the same steps.
Example:
Insert the node 8 into the B Tree of order 5 shown in the following image.
The node now contains 5 keys, which is more than the maximum of 5 - 1 = 4 keys. Therefore split the node at the
median, i.e. 8, and push it up to its parent node, shown as follows.
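The leaf-level part of this procedure (insert in order, then split at the median when the node overflows) can be sketched as follows. The function name insert_into_leaf and the leaf contents [5, 6, 9, 10] are hypothetical; only the order 5, the new key 8 and the median 8 come from the example. Propagating the median into the parent is left out.

```python
from bisect import insort

def insert_into_leaf(keys, item, m):
    """Insert item into a sorted leaf of a B tree of order m.

    Returns (left_keys, median, right_keys). When no split is needed,
    median is None and left_keys holds the whole leaf.
    """
    keys = list(keys)
    insort(keys, item)                 # keep the keys in increasing order
    if len(keys) <= m - 1:             # still within the m-1 key limit
        return keys, None, []
    mid = len(keys) // 2               # position of the median key
    return keys[:mid], keys[mid], keys[mid + 1:]

left, median, right = insert_into_leaf([5, 6, 9, 10], 8, 5)
print(left, median, right)                 # [5, 6] 8 [9, 10]
print(insert_into_leaf([5, 6], 8, 5))      # ([5, 6, 8], None, [])
```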
Deletion
Deletion is also performed at the leaf nodes. The node which is to be deleted can either be a leaf node or an
internal node. The following algorithm needs to be followed in order to delete a node from a B tree.
1. Locate the leaf node.
2. If there are more than m/2 keys in the leaf node, then delete the desired key from the node.
3. If the leaf node doesn't contain m/2 keys, then make up the shortfall by taking an element from the right or left
sibling.
o If the left sibling contains more than m/2 elements, then push its largest element up to its parent
and move the intervening element down to the node where the key is deleted.
o If the right sibling contains more than m/2 elements, then push its smallest element up to the
parent and move the intervening element down to the node where the key is deleted.
4. If neither of the siblings contains more than m/2 elements, then create a new leaf node by joining the two leaf
nodes and the intervening element of the parent node.
5. If the parent is left with fewer than m/2 nodes, then apply the above process to the parent too.
If the node which is to be deleted is an internal node, then replace the node with its in-order successor or
predecessor. Since the successor or predecessor will always be in a leaf node, the process will be similar to
that of deleting the node from a leaf node.
Example 1
Delete the node 53 from the B Tree of order 5 shown in the following figure.
Now, 57 is the only element which is left in the node, whereas the minimum number of elements that must be present
in a node of a B tree of order 5 is 2. The node has fewer than that, and its left and right siblings do not have
elements to spare either; therefore, merge it with the left sibling and the intervening element of the parent, i.e. 49.
The final B tree is shown as follows.
Application of B tree
B tree is used to index the data and provides fast access to the actual data stored on the disks since, the access to
value stored in a large database that is stored on a disk is a very time consuming process.
Searching an un-indexed and unsorted database containing n key values needs O(n) running time in worst case.
However, if we use B Tree to index this database, it will be searched in O(log n) time in worst case.
B+ Tree
The B+ Tree is an extension of the B Tree which allows efficient insertion, deletion and search operations.
In a B Tree, keys and records can both be stored in the internal as well as the leaf nodes. In a B+ tree, however,
records (data) can only be stored in the leaf nodes, while internal nodes can only store the key values.
The leaf nodes of a B+ tree are linked together in the form of a singly linked list to make search queries
more efficient.
B+ Trees are used to store large amounts of data which cannot be stored in the main memory. Because the
size of main memory is always limited, the internal nodes (keys to access records) of the B+ tree are stored
in the main memory, whereas the leaf nodes are stored in the secondary memory.
The internal nodes of a B+ tree are often called index nodes. A B+ tree of order 3 is shown in the following figure.
Advantages of B+ Tree
1. Records can be fetched in equal number of disk accesses.
2. Height of the tree remains balanced and is smaller as compared to a B tree.
3. We can access the data stored in a B+ tree sequentially as well as directly.
4. Keys are used for indexing.
5. Faster search queries as the data is stored only on the leaf nodes.
B Tree VS B+ Tree
SN   B Tree                                              B+ Tree
1    Search keys cannot be stored repeatedly.            Redundant search keys can be present.
2    Data can be stored in leaf nodes as well as         Data can only be stored in the leaf nodes.
     internal nodes.
3    Searching for some data is a slower process,        Searching is comparatively faster, as data can
     since data can be found in internal nodes as        only be found in the leaf nodes.
     well as in the leaf nodes.
4    Deletion of internal nodes is complicated and       Deletion is never a complex process, since an
     time consuming.                                     element is always deleted from the leaf nodes.
5    Leaf nodes are not linked together.                 Leaf nodes are linked together to make the search
                                                         operations more efficient.
Insertion in B+ Tree
Step 1: Insert the new node as a leaf node
Step 2: If the leaf doesn't have required space, split the node and copy the middle node to the next index node.
Step 3: If the index node doesn't have required space, split the node and copy the middle element to the next
index page.
Example :
Insert the value 195 into the B+ tree of order 5 shown in the following figure.
195 will be inserted in the right sub-tree of 120, after 190. Insert it at the desired position.
The node now contains more than the maximum number of elements, i.e. 4; therefore split it and place the median
node up in the parent.
Now, the index node contains 6 children and 5 keys, which violates the B+ tree properties; therefore we need to
split it, shown as follows.
Deletion in B+ Tree
Step 1: Delete the key and data from the leaves.
Step 2: If the leaf node contains fewer than the minimum number of elements, merge the node with its sibling
and delete the key in between them.
Step 3: If the index node contains fewer than the minimum number of elements, merge the node with the sibling and
move down the key in between them.
Example
Delete the key 200 from the B+ Tree shown in the following figure.
200 is present in the right sub-tree of 190, after 195. Delete it.
Merge the two nodes by using 195, 190, 154 and 129.
Now, element 120 is the single element present in the node, which violates the B+ Tree properties. Therefore,
we need to merge it by using 60, 78, 108 and 120.
Now, the height of the B+ tree is decreased by 1.