10. Tree

Trees

A tree is a non-linear data structure in which items are arranged hierarchically rather than in a linear sequence. It is used to represent
the hierarchical relationships existing among several data items.
• A tree is a recursive data structure containing a set of one or more data nodes, where one node is
designated as the root of the tree while the remaining nodes are its descendants.
• The nodes other than the root node are partitioned into non-empty disjoint sets, each of which is
called a sub-tree of the root.
• Nodes of a tree are related either as parent and child or as siblings (children of the same parent).
• In a general tree, a node can have any number of child nodes but it can have only a single parent.
• The following image shows a tree, where the node A is the root node of the tree while the other nodes are
descendants of A.

Tree Terminology:
1. Root: It is the specially designated data item in a tree. It is the first in the hierarchical arrangement of data items. In
the above tree, A is the root item.
2. Node: Each data item in a tree is called a node. It is the basic structure in a tree. It specifies the data
information and links (branches) to other data items. There are 10 nodes in the above tree.
3. Degree of a node: It is the number of subtrees of a node in a given tree. In the above tree, the degree of node A
is 2 and the degree of node E is 1.
4. Degree of a Tree: It is the maximum degree of the nodes in a given tree. In the above tree, the node A has degree
2, and this is the maximum over all nodes. So, the degree of the above tree is 2.
5. Terminal Node: A node with degree zero is called a terminal node or a leaf. In the above tree, there are 5
terminal nodes. They are H, I, J, F and G.
6. Non-terminal nodes: Any node (except the root node) whose degree is not zero is called a non-terminal node.
Non-terminal nodes are the intermediate nodes in traversing the given tree from its root node to the terminal
nodes (leaves). There are 4 non-terminal nodes in the above tree.
7. Sibling: The child nodes of a given parent node are called siblings. They are also called brothers. In the
above tree, D and E are siblings whose parent node is B.
8. Level: The entire tree structure is leveled in such a way that the root node is always at level 0. Then, its
immediate children are at level 1 and their immediate children are at level 2 and so on up to the terminal nodes.
In general, if a node is at level n, then its children will be at level n+1.
9. Edge: It is the line connecting two nodes. That is, the line drawn from one node to another node is called an
edge.
10. Path: It is a sequence of consecutive edges from the source node to the destination node. In the above tree, the
path between A and J is given by the node pairs (A, B), (B, E) and (E, J).
11. Depth: It is the maximum level of any node in a given tree, that is, the number of levels one can descend
the tree from its root to the farthest terminal node (leaf). In the above tree, the path (A, B), (B, E), (E, J) places J
at level 3, which is the maximum level, so the depth is 3. The term height is also used to denote the depth.
12. Forest: It is a set of disjoint trees. If you remove the root node of a given tree, it becomes a forest. In the
above tree, removing the root A leaves a forest with two trees.
13. Ancestor & Descendant: If A is the parent of B, then A is said to be the father of B and B is said to be a son of A. Node n1 is
an ancestor of node n2 (and n2 is a descendant of n1) if n1 is either the father of n2 or the father of some
ancestor of n2.
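
The terminology above can be checked with a short sketch. The figure is not reproduced here, so the tree below is a reconstruction from the facts stated in the list (10 nodes, leaves H, I, J, F, G, path A-B-E-J); which of F, G, H, I hang under C and which under D is an assumption, and the dictionary `children` and the helper names are illustrative only.

```python
# Hypothetical reconstruction of the example tree as a parent -> children map.
children = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],   # assumed placement of F and G
    "D": ["H", "I"],   # assumed placement of H and I
    "E": ["J"],
    "F": [], "G": [], "H": [], "I": [], "J": [],
}

def degree(node):
    """Degree of a node: the number of its subtrees (children)."""
    return len(children[node])

def leaves():
    """Terminal nodes: nodes whose degree is zero."""
    return sorted(n for n in children if degree(n) == 0)

def level(node, root="A"):
    """Level of a node, with the root at level 0."""
    def walk(current, lvl):
        if current == node:
            return lvl
        for child in children[current]:
            found = walk(child, lvl + 1)
            if found is not None:
                return found
        return None
    return walk(root, 0)

def depth(root="A"):
    """Depth of the tree: the maximum level of any node."""
    return max(level(n, root) for n in children)

tree_degree = max(degree(n) for n in children)
print(degree("A"), degree("E"), tree_degree)   # degrees of A, E and of the tree
print(leaves())                                # terminal nodes
print(level("J"), depth())                     # level of J and depth of the tree
```

Under these assumptions the sketch reports degree 2 for A, degree 1 for E, five leaves, and a depth of 3.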

Why Trees?
1. One reason to use trees might be because you want to store information that naturally forms a hierarchy. For
example, the file system on a computer:
file system
-----------
        /        <-- root
      /   \
    ...    home
          /    \
      ugrad    course
       /      /   |   \
     ...  cs101 cs112 cs113
2. Trees (with some ordering e.g., BST) provide moderate access/search (quicker than Linked List and slower
than arrays).
3. Trees provide moderate insertion/deletion (quicker than Arrays and slower than Unordered Linked Lists).
4. Like Linked Lists and unlike Arrays, Trees don’t have an upper limit on number of nodes as nodes are linked
using pointers.

Main applications of trees include:


1. Manipulate hierarchical data.
2. Make information easy to search (see tree traversal).
3. Manipulate sorted lists of data.
4. As a workflow for compositing digital images for visual effects.
5. Router algorithms
6. Form of a multi-stage decision-making (see business chess).

Types of Tree
The tree data structure can be classified into six different categories.

General Tree
A general tree stores its elements in a hierarchical order in which the top-level element is always present at level
0 as the root element. All the nodes except the root node are present at level 1 or below. Nodes that
share the same parent are called siblings, while nodes on adjacent levels that are joined by an edge exhibit the
parent-child relationship. A node may contain any number of sub-trees. A tree in which each
node contains at most 3 sub-trees is called a ternary tree.
Forests
A forest can be defined as a set of disjoint trees, which can be obtained by deleting the root node and the edges
which connect the root node to the first-level nodes.
Binary Tree
Binary tree is a data structure in which each node can have at most 2 children. The node present at the top most
level is called the root node. A node with the 0 children is called leaf node. Binary Trees are used in the
applications like expression evaluation and many more. We will discuss binary tree in detail, later in this tutorial.

Binary Search Tree


Binary search tree is an ordered binary tree. All the elements in the left sub-tree are less than the root while
elements present in the right sub-tree are greater than or equal to the root node element. Binary search trees are
used in most of the applications of computer science domain like searching, sorting, etc.

Expression Tree
Expression trees are used to evaluate the simple arithmetic expressions. Expression tree is basically a binary tree
where internal nodes are represented by operators while the leaf nodes are represented by operands. Expression
trees are widely used to solve algebraic expressions like (a+b)*(a-b). Consider the following example.
Q. Construct an expression tree by using the following algebraic expression. (a + b) / (a*b - c) + d
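
One possible answer to the question above, sketched in Python. The class `ExprNode`, the helpers `evaluate` and `to_infix`, and the sample values for a, b, c and d are all illustrative assumptions; the tree shape follows the usual precedence rules (division before the final addition).

```python
import operator

class ExprNode:
    """Internal nodes hold an operator, leaf nodes hold an operand name."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node, env):
    """Evaluate the tree bottom-up, looking operands up in env."""
    if node.left is None and node.right is None:
        return env[node.value]
    return OPS[node.value](evaluate(node.left, env), evaluate(node.right, env))

def to_infix(node):
    """Fully parenthesised infix form (an in-order walk of the tree)."""
    if node.left is None and node.right is None:
        return node.value
    return "(" + to_infix(node.left) + " " + node.value + " " + to_infix(node.right) + ")"

# Tree for (a + b) / (a*b - c) + d: the root is the last operator applied.
tree = ExprNode("+",
                ExprNode("/",
                         ExprNode("+", ExprNode("a"), ExprNode("b")),
                         ExprNode("-",
                                  ExprNode("*", ExprNode("a"), ExprNode("b")),
                                  ExprNode("c"))),
                ExprNode("d"))

values = {"a": 4, "b": 2, "c": 5, "d": 1}   # sample values, chosen arbitrarily
print(to_infix(tree))
print(evaluate(tree, values))
```

With the sample values the expression is (4 + 2) / (8 - 5) + 1, so the evaluation gives 3.0.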

Tournament Tree
Tournament trees are used to record the winner of the match in each round being played between two players.
A tournament tree can also be called a selection tree or winner tree. External nodes represent the players among
which a match is being played while the internal nodes represent the winner of the match played. At the topmost
level, the winner of the tournament is present as the root node of the tree.
For example, the tree of a chess tournament being played among 4 players is shown as follows. The
winner in the left sub-tree will play against the winner of the right sub-tree.

Binary Tree: A tree whose elements have at most 2 children is called a binary tree. Since each element in a
binary tree can have at most 2 children, we typically name them the left and right child.

Binary Tree (Types of Binary Tree)


Following are common types of Binary Trees.

Full Binary Tree or Strictly Binary Tree: A Binary Tree is full if every node has 0 or 2 children. Following
are examples of a full binary tree. We can also say a full binary tree is a binary tree in which all nodes except
leaves have two children.
         18
       /    \
     15      30
    /  \    /  \
  40    50 100  40

         18
       /    \
     15      20
    /  \
  40    50
 /  \
30   50

         18
       /    \
     40      30
            /  \
          100   40

In a Full Binary Tree, the number of leaf nodes is the number of internal nodes plus 1:

L = I + 1, where L = number of leaf nodes and I = number of internal nodes
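
As a quick check of L = I + 1, the sketch below counts leaves and internal nodes in the second example tree above (root 18, children 15 and 20). The `Node` class and the helper names are illustrative, not part of the original notes.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.val = key
        self.left = left
        self.right = right

def is_full(root):
    """True if every node has either 0 or 2 children."""
    if root is None:
        return True
    if (root.left is None) != (root.right is None):
        return False  # exactly one child, so the tree is not full
    return is_full(root.left) and is_full(root.right)

def count_leaves_and_internal(root):
    """Return (number of leaf nodes, number of internal nodes)."""
    if root is None:
        return 0, 0
    if root.left is None and root.right is None:
        return 1, 0
    left_leaves, left_internal = count_leaves_and_internal(root.left)
    right_leaves, right_internal = count_leaves_and_internal(root.right)
    return left_leaves + right_leaves, 1 + left_internal + right_internal

# 18 -> (15, 20), 15 -> (40, 50), 40 -> (30, 50)
tree = Node(18,
            Node(15, Node(40, Node(30), Node(50)), Node(50)),
            Node(20))

leaves, internal = count_leaves_and_internal(tree)
print(is_full(tree), leaves, internal)
```

Here the tree is full, with 4 leaves and 3 internal nodes, in line with L = I + 1.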

Complete Binary Tree: A Binary Tree is complete Binary Tree if all levels are completely filled except
possibly the last level and the last level has all keys as left as possible

Following are examples of Complete Binary Trees


         18
       /    \
     15      30
    /  \    /  \
  40    50 100  40

           18
        /      \
      15        30
     /  \      /  \
   40    50  100   40
  /  \   /
 8    7 9
Practical example of Complete Binary Tree is Binary Heap.

Perfect Binary Tree: A Binary tree is Perfect Binary Tree in which all internal nodes have two children and all
leaves are at the same level.
Following are examples of Perfect Binary Trees.
         18
       /    \
     15      30
    /  \    /  \
  40    50 100  40

     18
    /  \
  15    30
A Perfect Binary Tree of height h (where height is the number of nodes on the path from the root to a leaf) has
2^h – 1 nodes.
Example of a Perfect binary tree is ancestors in the family. Keep a person at root, parents as children, and
parents of parents as their children.
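
The node count of a perfect binary tree can be verified with a small sketch. It uses the same convention as above (height counted in nodes on a root-to-leaf path); `build_perfect`, `perfect_height` and `count_nodes` are hypothetical helper names.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.val = key
        self.left = left
        self.right = right

def build_perfect(h):
    """Build a perfect binary tree whose height, counted in nodes, is h."""
    if h == 0:
        return None
    return Node(h, build_perfect(h - 1), build_perfect(h - 1))

def count_nodes(root):
    if root is None:
        return 0
    return 1 + count_nodes(root.left) + count_nodes(root.right)

def perfect_height(root):
    """Return the height in nodes if the tree is perfect, otherwise -1."""
    if root is None:
        return 0
    left = perfect_height(root.left)
    right = perfect_height(root.right)
    if left == -1 or right == -1 or left != right:
        return -1  # subtrees differ in height, so some leaf is on another level
    return left + 1

for h in range(1, 6):
    tree = build_perfect(h)
    # columns: h, measured height, node count, 2^h - 1
    print(h, perfect_height(tree), count_nodes(tree), 2 ** h - 1)
```

For every h in the loop the last two columns agree, which is the 2^h – 1 formula.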

Balanced Binary Tree


A binary tree is balanced if the height of the tree is O(log n), where n is the number of nodes. For example,
an AVL tree maintains O(log n) height by making sure that the difference between the heights of the left and right
subtrees of every node is at most 1. Red-Black trees maintain O(log n) height by making sure that the number of black nodes on every
root-to-leaf path is the same and that there are no adjacent red nodes. Balanced binary search trees perform
well, as they provide O(log n) time for search, insert and delete.

A degenerate (or pathological) tree: A tree where every internal node has one child. Such trees are
performance-wise the same as a linked list.
    10
   /
  20
    \
     30
       \
        40

Extended Binary trees or 2-Tree:


A Binary tree T is said to be a 2-tree or an extended binary tree if each node N has either 0 or 2 children. In such
a case, the nodes with 2 children are called internal nodes, and the nodes with 0 children are called external
nodes.
Extended binary tree is a type of binary tree in which all the null sub tree of the original tree are replaced with
special nodes called external nodes whereas other nodes are called internal nodes.

Here the circles represent the internal nodes and the boxes represent the external nodes.
Properties of Extended binary tree
1. The nodes from the original tree are internal nodes and the special nodes are external nodes.
2. All external nodes are leaf nodes and the internal nodes are non-leaf nodes.
3. Every internal node has exactly two children and every external node is a leaf. The result is therefore
a strictly binary (full) tree.

Prove that E = I + 2n, where E is the external path length, I is the internal path length and n is the total number of
internal nodes.
We do this by using induction on n.
Induction Base:
When n = 0 the binary tree has no internal node and 1 external node. For this tree E = I = n = 0.
Therefore, E = I + 2n.

Induction Hypothesis:
Let m be any integer >= 0. Assume that E = I + 2m for all binary trees that have m internal nodes.

Induction Step:
We will show that E = I + 2(m + 1) for all binary trees that have m + 1 internal nodes. Consider any binary tree T that
has m + 1 internal nodes. Remove any one internal node whose two children are both external nodes (that is, a leaf
of the original tree) and put a single external node in its place. The resulting
tree, T', has m internal nodes. From the induction hypothesis it follows that E' = I' + 2m where E' and I' are,
respectively, the external and internal path lengths of T'.

Suppose that the removed leaf was at level ‘level’ of T. Its two external children were at level + 1 and are replaced by
one external node at ‘level’, so E' = E - 2(level + 1) + level. It follows that E = E' + level + 2 and that I = I' +
level, where E and I are, respectively, the external and internal path lengths of T. Therefore,
E = E' + level + 2
= I' + 2m + level + 2
= I - level + 2m + level + 2
= I + 2(m + 1)
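
The identity can also be checked numerically. In the sketch below every missing child (`None`) plays the role of an external node; the `Node` class and the function `path_lengths` are assumptions made for illustration, and the test tree is the degenerate tree 10, 20, 30, 40 shown earlier.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.val = key
        self.left = left
        self.right = right

def path_lengths(root, level=0):
    """Return (n, I, E): internal node count, internal and external path length.

    A missing child (None) is treated as one external node at that level.
    """
    if root is None:
        return 0, 0, level
    n_left, i_left, e_left = path_lengths(root.left, level + 1)
    n_right, i_right, e_right = path_lengths(root.right, level + 1)
    return (1 + n_left + n_right,
            level + i_left + i_right,
            e_left + e_right)

# Degenerate tree: 10 -> left 20 -> right 30 -> right 40
tree = Node(10, Node(20, None, Node(30, None, Node(40))))

n, I, E = path_lengths(tree)
print(n, I, E, I + 2 * n)
```

For this tree n = 4, I = 0 + 1 + 2 + 3 = 6 and E = 1 + 2 + 3 + 4 + 4 = 14, which equals I + 2n.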

Application of extended binary tree:


1. Calculate weighted path length: It is used to calculate total path length in case of weighted tree.

Here, the sum of the weights is already calculated and stored in the external nodes, which makes it much
easier to calculate the total path length of a tree with given weights. The same technique can be used to
update routing tables in a network.
2. To convert binary tree in complete binary tree: The above-given tree having removed all the external
nodes, is not a complete binary tree. To introduce any tree as complete tree, external nodes are added onto it.
Heap is a great example of a complete binary tree and thus each binary tree can be expressed as heap if
external nodes are added to it.

Binary Tree (Properties)


1) The maximum number of nodes at level ‘l’ of a binary tree is 2^l.
There is only 1 node (the root node) at depth 0:
2^0 = 1
In a perfect binary tree, every node has 2 child nodes.

So:
Depth d    # nodes at depth d    # of child nodes
--------------------------------------------------------------
0          1 = 2^0               2 (each node has 2 children)
1          2 = 2^1               4 (each node has 2 children)
2          4 = 2^2               8 (each node has 2 children)
...
I.e.: The number of nodes doubles every time the depth increases by 1!
Therefore: Maximum number of nodes at level l = 2^l

2) For any nonempty binary tree T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2,
then n0 = n2 + 1.
Proof:
Let n and e be the total number of nodes and edges of the binary tree respectively.
Let n0 = total number of nodes with 0 children, n1 = total number of nodes with 1 child and n2 = total number of nodes with 2
children.
Therefore, n = n0 + n1 + n2          ... (1)
Every node except the root has exactly one incoming edge, so
e = n - 1                            ... (2)
Counting the edges by the node they leave from also gives
e = 0*n0 + 1*n1 + 2*n2               ... (3)
Now, from (2) and (3) we get
n - 1 = n1 + 2*n2
=> n = 1 + n1 + 2*n2                 ... (4)
From (1) and (4) we get
n0 + n1 + n2 = 1 + n1 + 2*n2 => n0 = 1 + n2, which proves the claim.

2) Maximum number of nodes in a binary tree of height ‘h’ is 2^(h+1) – 1 (here the height is the maximum level, with the root at level 0).

We know that the maximum number of nodes at level l is 2^l.
At each level d there may be from 1 to 2^d nodes.
level 0: 1 to 2^0 = 1 node
level 1: 1 to 2^1 = 2 nodes
level 2: 1 to 2^2 = 4 nodes
level 3: 1 to 2^3 = 8 nodes
………………………………
level l: 1 to 2^l nodes
So the total number of nodes in a perfect binary tree of height h is:
S = 2^0 + 2^1 + ... + 2^h
  = 2^(h+1) – 1
This follows by subtracting S from 2S:
2S = 2 + 2^2 + 2^3 + ... + 2^h + 2^(h+1)
 S = 1 + 2 + 2^2 + 2^3 + ... + 2^h
------------------------------------------------------------
2S - S = 2^(h+1) - 1
<==> S = 2^(h+1) - 1

3) In a Binary Tree with N nodes, the minimum possible height or minimum number of levels is ⌈log2(N+1)⌉.
This can be directly derived from the bound on the maximum number of nodes above. If we consider the convention where the height of a leaf node is
considered as 0, then the above formula for the minimum possible height becomes ⌈log2(N+1)⌉ – 1.
4) A Binary Tree with L leaves has at least ⌈log2 L⌉ + 1 levels.
A Binary tree has the maximum number of leaves (and minimum number of levels) when all levels are fully filled.
Let all leaves be at level l (counting levels from 1); then the following is true for the number of leaves L:
L <= 2^(l-1) [From Point 1]
l = ⌈log2 L⌉ + 1
where l is the minimum number of levels.
where l is the minimum number of levels.

5) In Binary tree where every node has 0 or 2 children, number of leaf nodes is always one more than nodes
with two children.
L=T+1
Where L = Number of leaf nodes, T = Number of internal nodes with two children

Array representation of Binary Trees


An array can be used to store the nodes of a binary tree. The nodes stored in an array are accessible sequentially.
In C, array indexes run from 0 to max-1. Here, the binary tree nodes are therefore numbered starting from 0 rather than 1, and the
maximum number of nodes is specified by max.

The root node is always at index 0. Then, in successive memory locations the left child and right child are stored.
Consider a binary tree with only three nodes as shown. Let BT denote a binary tree.

The array representation of this binary tree is as follows:


Here, A is the father of B and C. B is the left child of A and C is the right child of A. Let us extend the above
tree by one more level as shown below:
The array representation of this binary tree is as follows:

How to identify the father, the left child and the right child of an arbitrary node in such representation? It is very
simple to identify the father and the children of a node. For any node n, 0<=n <= (max-1), then we have

1. Father (n): The father of the node having index n is at floor((n-1)/2) if n is not equal to 0. If n = 0, then it is the
root node and has no father.
Example: Consider a node numbered 3 (i.e., D). The father of D, no doubt, is B whose index is 1 (floor((3-1)/2)
= 1).

2. Lchild (n): The left child of node numbered n is at (2n+1).

For example, lchild(C) = lchild(2)
= 2 x 2 + 1
= 5 i.e., F

3. Rchild (n): The right child of node numbered n is at (2n+2).

4. Siblings: If the left child at index n is given then its right sibling (or brother) is at (n+1). And, similarly, if the
right child at index n is given, then its left sibling is at (n-1).
The array representation is ideal for complete binary trees. But it is not suitable for trees other than
complete binary trees, as it results in unnecessary wastage of memory space. Consider the following binary tree:

It is a skewed binary tree. Since only the left sub-tree is present, this type of binary tree is called a
left skewed binary tree. You can also have a right skewed binary tree. The array representation of the above binary
tree is given above. Note that the right child of A is empty, so the positions of both its left child and right child are also empty,
as is the position at index 4. Therefore, these indexes in array BT are left unused. This results in wastage of memory.

Operations on Binary Tree:


Apart from primitive operations, other operations that can be applied to the Binary Tree are:
1. Tree Traversal
2. Insertion of nodes
3. Deletion of nodes
4. Searching for the node
5. Copying the binary tree
Traversal of a Binary Tree:
Tree traversal is one of the most common operations performed on tree data structures. It is a way in which each
node in the tree is visited exactly once in a systematic manner. There are many applications that essentially
require traversal of binary trees. A full traversal produces a linear order for the nodes in a
binary tree. There are three popular ways of binary tree traversal. They are:
1. Preorder Traversal
2. Inorder Traversal
3. Postorder Traversal

Pre-order traversal
Steps
o Visit the root node
o traverse the left sub-tree in pre-order
o traverse the right sub-tree in pre-order

Algorithm
o Step 1: IF TREE != NULL
o Step 2: Write TREE -> DATA
o Step 3: PREORDER(TREE -> LEFT)
o Step 4: PREORDER(TREE -> RIGHT)
[END OF IF]
o Step 5: END

Example
Traverse the following binary tree by using pre-order traversal

o Since we are using pre-order traversal, the first element to be printed is the root,
18.
o Traverse the left sub-tree recursively. The root node of the left sub-tree is 211; print it and move to its left.
o The left is empty, so print the right child, 90, and move to the right sub-tree of the root.
o 20 is the root of that sub-tree, so print it and move to its left. Since the left sub-tree is empty, move
to the right and print the only element present there, i.e. 190.
o Therefore, the printing sequence will be 18, 211, 90, 20, 190.

In-order traversal
Steps
o Traverse the left sub-tree in in-order
o Visit the root
o Traverse the right sub-tree in in-order

Algorithm
o Step 1: IF TREE != NULL
o Step 2: INORDER(TREE -> LEFT)
o Step 3: Write TREE -> DATA
o Step 4: INORDER(TREE -> RIGHT)
[END OF IF]
o Step 5: END

Example
Traverse the following binary tree by using in-order traversal.

o print the left most node of the left sub-tree i.e. 23.
o print the root of the left sub-tree i.e. 211.
o print the right child i.e. 89.
o print the root node of the tree i.e. 18.
o Then, move to the right sub-tree of the binary tree and print the left most node i.e. 10.
o print the root of the right sub-tree i.e. 20.
o print the right child i.e. 32.
o hence, the printing sequence will be 23, 211, 89, 18, 10, 20, 32.

Post-order traversal
Steps
o Traverse the left sub-tree in post-order
o Traverse the right sub-tree in post-order
o visit the root

Algorithm
o Step 1: IF TREE != NULL
o Step 2: POSTORDER(TREE -> LEFT)
o Step 3: POSTORDER(TREE -> RIGHT)
o Step 4: Write TREE -> DATA
[END OF IF]
o Step 5: END

Example
Traverse the following tree by using post-order traversal

o Print the left child of the left sub-tree of the binary tree, i.e. 23.
o Print the right child of the left sub-tree of the binary tree, i.e. 89.
o Print the root node of the left sub-tree, i.e. 211.
o Now, before printing the root node, move to the right sub-tree and print the left child, i.e. 10.
o Print 32, i.e. the right child.
o Print the root node of the right sub-tree, 20.
o Finally, print the root of the tree, i.e. 18.
The printing sequence will be 23, 89, 211, 10, 32, 20, 18.

# Python program for tree traversals

# A class that represents an individual node in a Binary Tree
class Node:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.val = key

# A function to do inorder tree traversal
def printInorder(root):
    if root:
        # First recur on left child
        printInorder(root.left)
        # then print the data of node
        print(root.val, end=" ")
        # now recur on right child
        printInorder(root.right)

# A function to do postorder tree traversal
def printPostorder(root):
    if root:
        # First recur on left child
        printPostorder(root.left)
        # then recur on right child
        printPostorder(root.right)
        # now print the data of node
        print(root.val, end=" ")

# A function to do preorder tree traversal
def printPreorder(root):
    if root:
        # First print the data of node
        print(root.val, end=" ")
        # Then recur on left child
        printPreorder(root.left)
        # Finally recur on right child
        printPreorder(root.right)

# main
root = Node(1)
root.left = Node(2)
root.right = Node(3)
root.left.left = Node(4)
root.left.right = Node(5)

print("Preorder traversal of binary tree is")
printPreorder(root)

print("\nInorder traversal of binary tree is")
printInorder(root)

print("\nPostorder traversal of binary tree is")
printPostorder(root)

Output:
Preorder traversal of binary tree is
1 2 4 5 3
Inorder traversal of binary tree is
4 2 5 1 3
Postorder traversal of binary tree is
4 5 2 3 1

Construct a binary tree from given Inorder and Postorder Traversal


int[] inOrder = { 4, 2, 5, 1, 6, 3, 7 };
int[] postOrder = { 4, 5, 2, 6, 7, 3, 1 };

The last element in postorder[] will be the root of the tree; here it is 1.
Now search for element 1 in inorder[]. Say you find it at position i; once you find it, make a note of the elements
to the left of i (these will form the left subtree) and the elements to the right of i (these will form the
right subtree).
Suppose that, in the previous step, there are X elements to the left of i (which will form the
left subtree). Take the first X elements of postorder[]; this is the post-order traversal of the elements
to the left of i. Similarly, if there are Y elements to the right of i (which will form the
right subtree), take the next Y elements, after the first X elements, of postorder[]; this is the post-order
traversal of the elements to the right of i.

From the previous two steps, recursively construct the left and right subtrees and link them to root.left and root.right respectively.
See the picture for better explanation.
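
The steps above can be sketched in Python as follows. This is a minimal version that assumes all keys are distinct and uses list slicing for clarity rather than speed; `build_from_in_post` and `preorder` are hypothetical names.

```python
class Node:
    def __init__(self, key):
        self.val = key
        self.left = None
        self.right = None

def build_from_in_post(inorder, postorder):
    """Rebuild a binary tree from its inorder and postorder sequences."""
    if not inorder:
        return None
    root = Node(postorder[-1])          # the last postorder element is the root
    i = inorder.index(root.val)         # position of the root in inorder
    # The first i postorder elements belong to the left subtree,
    # the following ones (up to the root) to the right subtree.
    root.left = build_from_in_post(inorder[:i], postorder[:i])
    root.right = build_from_in_post(inorder[i + 1:], postorder[i:-1])
    return root

def preorder(root):
    """Preorder sequence as a list, used here only to inspect the result."""
    if root is None:
        return []
    return [root.val] + preorder(root.left) + preorder(root.right)

in_order = [4, 2, 5, 1, 6, 3, 7]
post_order = [4, 5, 2, 6, 7, 3, 1]
tree = build_from_in_post(in_order, post_order)
print(preorder(tree))
```

For the arrays given above the rebuilt tree has root 1 with children 2 and 3, so its preorder sequence is 1, 2, 4, 5, 3, 6, 7.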
Make a Binary Tree from Given Inorder and Preorder Traversal.
int [] inOrder = {2,5,6,10,12,14,15};
int [] preOrder = {10,5,2,6,14,12,15};

The first element in preorder[] will be the root of the tree; here it is 10.
Now search for element 10 in inorder[]. Say you find it at position i; once you find it, make a note of the elements
to the left of i (these will form the left subtree) and the elements to the right of i (these will form the
right subtree).

Repeat this step recursively: construct the left subtree and link it to root.left, then construct the right
subtree and link it to root.right.
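
A minimal sketch of this construction, again assuming distinct keys and using list slicing for clarity; `build_from_in_pre` and `postorder` are hypothetical names.

```python
class Node:
    def __init__(self, key):
        self.val = key
        self.left = None
        self.right = None

def build_from_in_pre(inorder, preorder):
    """Rebuild a binary tree from its inorder and preorder sequences."""
    if not inorder:
        return None
    root = Node(preorder[0])            # the first preorder element is the root
    i = inorder.index(root.val)         # i elements lie in the left subtree
    root.left = build_from_in_pre(inorder[:i], preorder[1:i + 1])
    root.right = build_from_in_pre(inorder[i + 1:], preorder[i + 1:])
    return root

def postorder(root):
    """Postorder sequence as a list, used here only to inspect the result."""
    if root is None:
        return []
    return postorder(root.left) + postorder(root.right) + [root.val]

in_order = [2, 5, 6, 10, 12, 14, 15]
pre_order = [10, 5, 2, 6, 14, 12, 15]
tree = build_from_in_pre(in_order, pre_order)
print(postorder(tree))
```

For the arrays given above the root is 10 with children 5 and 14, and the postorder sequence of the rebuilt tree is 2, 6, 5, 12, 15, 14, 10.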

Technique of conversion of an expression into Binary Tree:


The divide and conquer technique is used to convert an expression into a binary tree. The following clarifies the
technique:
Expression: A + (B + C * D + E) + F / G
1. Note the order of precedence. All expressions in parentheses are to be evaluated first.
2. Exponentiation will come next.
3. Division and multiplication will be the next in order of precedence.
4. Subtraction and addition will be the last to be processed.

Binary Search Tree


1. Binary Search tree can be defined as a class of binary trees, in which the nodes are arranged in a specific
order. This is also called ordered binary tree.
2. In a binary search tree, the value of all the nodes in the left sub-tree is less than the value of the root.
3. Similarly, value of all the nodes in the right sub-tree is greater than or equal to the value of the root.
4. This rule will be recursively applied to all the left and right sub-trees of the root.

A Binary search tree is shown in the above figure. As the constraint applied on the BST, we can see that the root
node 30 doesn't contain any value greater than or equal to 30 in its left sub-tree and it also doesn't contain any
value less than 30 in its right sub-tree.
Advantages of using binary search tree
1. Searching becomes very efficient in a binary search tree since we get a hint at each step about which
sub-tree contains the desired element.
2. The binary search tree is considered an efficient data structure in comparison to arrays and linked lists. In a
balanced tree, the searching process discards about half of the remaining nodes at every step, so searching for an
element takes O(log2 n) time. In the worst case (a skewed tree), the time it takes to search for an element is O(n).
3. It also speeds up the insertion and deletion operations as compared to arrays and linked lists.

Q. Create the binary search tree using the following data elements.
43, 10, 79, 90, 12, 54, 11, 9, 50
1. Insert 43 into the tree as the root of the tree.
2. Read the next element. If it is less than the root node element, insert it into the left sub-tree, applying the same rule recursively.
3. Otherwise, insert it into the right sub-tree in the same way.
The process of creating BST by using the given elements, is shown in the image below.
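
Because the image is not reproduced here, the resulting tree can be inspected with a short sketch. It uses an iterative insert (one of several possible ways to apply the rule above) and sends keys that are not smaller than the current node to the right; the function names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.val = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key and return the root; smaller keys go left, others right."""
    new_node = Node(key)
    if root is None:
        return new_node
    current = root
    while True:
        if key < current.val:
            if current.left is None:
                current.left = new_node
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = new_node
                return root
            current = current.right

def preorder(root):
    if root is None:
        return []
    return [root.val] + preorder(root.left) + preorder(root.right)

def inorder(root):
    if root is None:
        return []
    return inorder(root.left) + [root.val] + inorder(root.right)

root = None
for key in [43, 10, 79, 90, 12, 54, 11, 9, 50]:
    root = insert(root, key)

print(preorder(root))   # shows the shape: each root before its sub-trees
print(inorder(root))    # sorted order, as expected for a BST
```

The preorder listing is 43, 10, 9, 12, 11, 79, 54, 50, 90: 43 is the root, 10 and 79 are its children, 9 and 12 hang under 10 (with 11 to the left of 12), and 54 and 90 hang under 79 (with 50 to the left of 54).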
Operations on Binary Search Tree
There are many operations which can be performed on a binary search tree.
SN   Operation          Description
1    Searching in BST   Finding the location of some specific element in a binary search tree.
2    Insertion in BST   Adding a new element to the binary search tree at the appropriate location so that
                        the property of the BST is not violated.
3    Deletion in BST    Deleting some specific node from a binary search tree. There can be
                        various cases in deletion depending upon the number of children the node has.

Searching
Searching means finding or locating some specific element or node within a data structure. However, searching
for some specific node in binary search tree is pretty easy due to the fact that, element in BST are stored in a
particular order.
1. Compare the element with the root of the tree.
2. If the item is matched then return the location of the node.
3. Otherwise check if item is less than the element present on root, if so then move to the left sub-tree.
4. If not, then move to the right sub-tree.
5. Repeat this procedure recursively until match found.
6. If element is not found then return NULL.

Algorithm:
Search (ROOT, ITEM)
o Step 1: IF ROOT = NULL OR ROOT -> DATA = ITEM
Return ROOT
ELSE
IF ITEM < ROOT -> DATA
Return search(ROOT -> LEFT, ITEM)
ELSE
Return search(ROOT -> RIGHT, ITEM)
[END OF IF]
[END OF IF]
o Step 2: END

Insertion
The insert function is used to add a new element to a binary search tree at the appropriate location. It must
be designed in such a way that it does not violate the property of the binary search tree at any node.
1. Allocate the memory for the new node.
2. Set the data part to the value and set the left and right pointers of the node to NULL.
3. If the item to be inserted is the first element of the tree, this node becomes the root and its left and right
pointers stay NULL.
4. Else, check if the item is less than the root element of the tree; if this is true, then recursively perform
this operation with the left sub-tree of the root.
5. If this is false, then perform this operation recursively with the right sub-tree of the root.
Insert (TREE, ITEM)
o Step 1: IF TREE = NULL
Allocate memory for TREE
SET TREE -> DATA = ITEM
SET TREE -> LEFT = TREE -> RIGHT = NULL
ELSE
IF ITEM < TREE -> DATA
Insert(TREE -> LEFT, ITEM)
ELSE
Insert(TREE -> RIGHT, ITEM)
[END OF IF]
[END OF IF]
o Step 2: END
Deletion
The delete function is used to delete the specified node from a binary search tree. However, we must delete a node
from a binary search tree in such a way that the property of the binary search tree is not violated. There are three
situations of deleting a node from a binary search tree.

The node to be deleted is a leaf node

It is the simplest case; in this case, replace the leaf node with NULL and simply free the allocated space.
In the following image, we are deleting the node 85. Since the node is a leaf node, it will be
replaced with NULL and the allocated space will be freed.

The node to be deleted has only one child.

In this case, replace the node with its child, that is, link the node's parent directly to that child, and free the
space allocated to the deleted node.
In the following image, the node 12 is to be deleted. It has only one child. The node will be replaced with its
child node and the space of node 12 will be freed.

The node to be deleted has two children.

It is a somewhat more complex case than the other two. The value of the node which is to be deleted is replaced with
the value of its in-order successor (or predecessor). That successor (or predecessor) node has at most one child, so it is
then deleted using one of the two cases above.
In the following image, the node 50 is to be deleted, which is the root node of the tree. The in-order traversal of
the tree is given below.
6, 25, 30, 50, 52, 60, 70, 75.
Replace 50 with its in-order successor 52. The node that originally held 52 is then
deleted.
Algorithm
Delete (TREE, ITEM)
o Step 1: IF TREE = NULL
Write "item not found in the tree"
ELSE IF ITEM < TREE -> DATA
Delete(TREE->LEFT, ITEM)
ELSE IF ITEM > TREE -> DATA
Delete(TREE -> RIGHT, ITEM)
ELSE IF TREE -> LEFT AND TREE -> RIGHT
SET TEMP = findLargestNode(TREE -> LEFT)
SET TREE -> DATA = TEMP -> DATA
Delete(TREE -> LEFT, TEMP -> DATA)
ELSE
SET TEMP = TREE
IF TREE -> LEFT = NULL AND TREE -> RIGHT = NULL
SET TREE = NULL
ELSE IF TREE -> LEFT != NULL
SET TREE = TREE -> LEFT
ELSE
SET TREE = TREE -> RIGHT
[END OF IF]
FREE TEMP
[END OF IF]
o Step 2: END

# Python program to demonstrate insert operation in binary search tree

class Node:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.val = key

def insert(root, key):
    if root is None:
        return Node(key)
    else:
        if root.val == key:
            return root
        elif root.val < key:
            root.right = insert(root.right, key)
        else:
            root.left = insert(root.left, key)
    return root

def inorder(root):
    if root:
        inorder(root.left)
        print(root.val)
        inorder(root.right)

# Main
# Let us create the following BST
#        50
#      /    \
#     30     70
#    /  \   /  \
#   20  40 60   80

r = Node(50)
r = insert(r, 30)
r = insert(r, 20)
r = insert(r, 40)
r = insert(r, 70)
r = insert(r, 60)
r = insert(r, 80)

# Print inorder traversal of the BST
inorder(r)

def search(root, key):
    # Base Cases: root is null or key is present at root
    if root is None or root.val == key:
        return root

    # Key is greater than root's key
    if root.val < key:
        return search(root.right, key)

    # Key is smaller than root's key
    return search(root.left, key)

# Deletion: Binary tree node.

class TreeNode(object):
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def delete_Node(root, key):
    # if root doesn't exist, just return it
    if not root:
        return root
    # Find the node in the left subtree if key value is less than root value
    if root.val > key:
        root.left = delete_Node(root.left, key)
    # Find the node in right subtree if key value is greater than root value
    elif root.val < key:
        root.right = delete_Node(root.right, key)
    # Delete the node if root.val == key
    else:
        # If there is no right child, delete the node and new root would be root.left
        if not root.right:
            return root.left
        # If there is no left child, delete the node and new root would be root.right
        if not root.left:
            return root.right
        # If both left and right children exist in the node, replace its value with
        # the minimum value in the right subtree. Now delete that minimum node
        # in the right subtree
        temp_val = root.right
        mini_val = temp_val.val
        while temp_val.left:
            temp_val = temp_val.left
            mini_val = temp_val.val
        # Copy the minimum value (the in-order successor) into this node
        root.val = mini_val
        # Delete the minimum node in right subtree
        root.right = delete_Node(root.right, mini_val)
    return root

def preOrder(node):
    if not node:
        return
    print(node.val)
    preOrder(node.left)
    preOrder(node.right)

root = TreeNode(5)
root.left = TreeNode(3)
root.right = TreeNode(6)
root.left.left = TreeNode(2)
root.left.right = TreeNode(4)
# 7 is placed to the right of 6 so that the tree is a valid BST
root.right.right = TreeNode(7)
print("Original node:")
preOrder(root)
result = delete_Node(root, 4)
print("After deleting specified node:")
preOrder(result)

Threaded Binary Tree:


The idea of a threaded binary tree is to make inorder traversal faster by doing it without a stack and without
recursion. A binary tree is made threaded by making every right child pointer that would normally be NULL point
to the inorder successor of the node (if it exists).

There are two types of threaded binary trees.

Single Threaded: Each NULL right pointer is made to point to the inorder successor (if a successor exists).
Double Threaded: Both left and right NULL pointers are made to point to the inorder predecessor and
inorder successor respectively. The predecessor threads are useful for reverse inorder traversal and postorder
traversal.
The threads are also useful for quickly accessing the ancestors of a node.
The following diagram shows an example single threaded binary tree. The dotted lines represent threads.
Since the right pointer is used for two purposes, a flag rightThread (named rthread in the program below) is
used to indicate whether the right pointer points to the right child or to the inorder successor. Similarly, we
can add leftThread (lthread) for a double threaded binary tree.

# Insertion in a Threaded Binary Search Tree


class newNode:
    def __init__(self, key):
        self.info = key
        self.left = None
        self.right = None

        # True if the left pointer is a thread to the
        # predecessor in inorder traversal (not a real child)
        self.lthread = True

        # True if the right pointer is a thread to the
        # successor in inorder traversal (not a real child)
        self.rthread = True

# Insert a node in a threaded binary search tree


def insert(root, ikey):
    # Searching for a node with the given value
    ptr = root
    par = None  # Parent of key to be inserted
    while ptr is not None:
        # If key already exists, return
        if ikey == ptr.info:
            print("Duplicate Key !")
            return root

        par = ptr  # Update parent pointer

        # Moving on left subtree
        if ikey < ptr.info:
            if ptr.lthread == False:
                ptr = ptr.left
            else:
                break

        # Moving on right subtree
        else:
            if ptr.rthread == False:
                ptr = ptr.right
            else:
                break

    # Create a new node
    tmp = newNode(ikey)

    if par is None:
        # Tree was empty: the new node becomes the root
        root = tmp
        tmp.left = None
        tmp.right = None
    elif ikey < par.info:
        # New left child: inherits par's predecessor thread, successor is par
        tmp.left = par.left
        tmp.right = par
        par.lthread = False
        par.left = tmp
    else:
        # New right child: predecessor is par, inherits par's successor thread
        tmp.left = par
        tmp.right = par.right
        par.rthread = False
        par.right = tmp

    return root

# Returns inorder successor using rthread


def inorderSuccessor(ptr):
    # If rthread is set, we can quickly find the successor
    if ptr.rthread == True:
        return ptr.right

    # Else return leftmost child of right subtree
    ptr = ptr.right
    while ptr.lthread == False:
        ptr = ptr.left
    return ptr

# Printing the threaded tree


def inorder(root):
    if root is None:
        print("Tree is empty")
        return

    # Reach leftmost node
    ptr = root
    while ptr.lthread == False:
        ptr = ptr.left

    # One by one print successors
    while ptr is not None:
        print(ptr.info, end=" ")
        ptr = inorderSuccessor(ptr)

# Main Code
if __name__ == '__main__':
    root = None

    root = insert(root, 20)
    root = insert(root, 10)
    root = insert(root, 30)
    root = insert(root, 5)
    root = insert(root, 16)
    root = insert(root, 14)
    root = insert(root, 17)
    root = insert(root, 13)
    inorder(root)

Output
5 10 13 14 16 17 20 30

AVL Tree
The AVL tree was invented by G. M. Adelson-Velsky and E. M. Landis in 1962. The tree is named AVL in honour of its
inventors.
An AVL tree can be defined as a height-balanced binary search tree in which each node is associated with a balance
factor, calculated by subtracting the height of its right sub-tree from the height of its left sub-tree.
The tree is said to be balanced if the balance factor of every node is between -1 and 1; otherwise, the tree is
unbalanced and needs to be rebalanced.
Balance Factor (k) = height (left(k)) - height (right(k))
If the balance factor of a node is 1, the left sub-tree is one level higher than the right sub-tree.
If the balance factor of a node is 0, the left sub-tree and right sub-tree have equal height.
If the balance factor of a node is -1, the left sub-tree is one level lower than the right sub-tree.
An AVL tree is given in the following figure. We can see that the balance factor associated with each node is
between -1 and +1. Therefore, it is an example of an AVL tree.
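
The balance factor definition above can be checked mechanically. The following is a minimal sketch, not part of the original notes: the names Node, height, balance_factor and is_avl are chosen here for illustration, and height is counted in levels (0 for an empty sub-tree).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    # Number of levels in the sub-tree rooted at node (0 for an empty sub-tree)
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    # height(left sub-tree) - height(right sub-tree)
    return height(node.left) - height(node.right)


def is_avl(node):
    # Every node must have a balance factor of -1, 0 or 1
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_avl(node.left)
            and is_avl(node.right))


balanced = Node(20, Node(10, Node(5)), Node(30))
skewed = Node(30, Node(20, Node(10)))

print(balance_factor(balanced))  # 1: left sub-tree is one level higher
print(is_avl(balanced))          # True
print(balance_factor(skewed))    # 2: outside the allowed range
print(is_avl(skewed))            # False
```

Recomputing the height at every call is wasteful; a practical implementation would usually store the height (or the balance factor) in each node.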

Complexity
Algorithm     Average case     Worst case
Space         O(n)             O(n)
Search        O(log n)         O(log n)
Insert        O(log n)         O(log n)
Delete        O(log n)         O(log n)

Operations on AVL tree
Since an AVL tree is also a binary search tree, all the operations are performed in the same way as they are
performed in a binary search tree. Searching and traversing do not modify the tree and so cannot violate the AVL
property. However, insertion and deletion can violate this property and therefore need to be revisited.

AVL Tree Rotations


In an AVL tree, after performing an insertion or deletion we need to check the balance factor of the nodes in the
tree (only the nodes on the path from the changed position up to the root can be affected). If every node
satisfies the balance factor condition, the operation is complete; otherwise we must rebalance the tree.
Whenever the tree becomes imbalanced due to an operation, we use rotation operations to make it balanced again.
Rotation is the process of moving nodes either to the left or to the right to make the tree balanced.
There are four rotations, classified into two types: single rotations (LL and RR) and double rotations (LR and RL).

Single Left Rotation (LL Rotation)


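In LL Rotation, every node on the unbalanced path moves one position to the left from its current position. (Some
texts name rotations after the imbalance case instead, so the same names can refer to the opposite direction
elsewhere.) To understand LL Rotation, let us consider the following insertion operation in an AVL Tree...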

Single Right Rotation (RR Rotation)


In RR Rotation, every node on the unbalanced path moves one position to the right from its current position. To
understand RR Rotation, let us consider the following insertion operation in an AVL Tree...

Left Right Rotation (LR Rotation)

The LR Rotation is a sequence of a single left rotation followed by a single right rotation. In LR Rotation, the
nodes first move one position to the left and then one position to the right from the current position. To
understand LR Rotation, let us consider the following insertion operation in an AVL Tree...

Right Left Rotation (RL Rotation)


The RL Rotation is a sequence of a single right rotation followed by a single left rotation. In RL Rotation, the
nodes first move one position to the right and then one position to the left from the current position. To
understand RL Rotation, let us consider the following insertion operation in an AVL Tree...
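
The four rotations can be sketched as pointer manipulations. This is a minimal illustration under assumptions: the Node class and the function names are hypothetical, and the naming follows these notes (a left rotation moves nodes to the left). Each function returns the new root of the rotated sub-tree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(x):
    # Single left rotation: x's right child y becomes the new sub-tree root
    y = x.right
    x.right = y.left
    y.left = x
    return y


def rotate_right(x):
    # Single right rotation: x's left child y becomes the new sub-tree root
    y = x.left
    x.left = y.right
    y.right = x
    return y


def rotate_left_right(x):
    # LR: left rotation on the left child, then right rotation on x
    x.left = rotate_left(x.left)
    return rotate_right(x)


def rotate_right_left(x):
    # RL: right rotation on the right child, then left rotation on x
    x.right = rotate_right(x.right)
    return rotate_left(x)


def preorder(node):
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


# Keys 1, 2, 3 inserted in order form a right-leaning chain
chain = Node(1, right=Node(2, right=Node(3)))
print(preorder(rotate_left(chain)))         # [2, 1, 3]

# Keys 3, 1, 2 form a zig-zag: a left child that has a right child
zigzag = Node(3, left=Node(1, right=Node(2)))
print(preorder(rotate_left_right(zigzag)))  # [2, 1, 3]
```

In both cases the middle key 2 ends up as the root with 1 and 3 as its children, which is the balanced arrangement.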

Q. Construct an AVL tree by inserting the following elements in the given order.
63, 9, 19, 27, 18, 108, 99, 81
The process of constructing an AVL tree from the given set of elements is shown in the following figure.
At each step, we must calculate the balance factor of every affected node; if it is found to be more than 1 or
less than -1, then we need a rotation to rebalance the tree. The type of rotation is determined by the location of
the inserted element with respect to the critical node. Each element is inserted at the position that maintains
the order of the binary search tree.
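
The construction can also be traced in code. The following is a sketch under assumptions, not the method of the figure: it stores a height in every node, and all names (Node, insert, is_balanced and so on) are chosen here for illustration. After each recursive insertion the balance factor is recomputed and the appropriate rotation is applied.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # a single node has height 1


def height(n):
    return n.height if n else 0


def update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def balance(n):
    return height(n.left) - height(n.right)


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    update(x)
    update(y)
    return y


def insert(node, key):
    # Ordinary BST insertion first
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicates are ignored

    update(node)
    bf = balance(node)
    if bf > 1:  # left sub-tree too high
        if key > node.left.key:  # inserted in right sub-tree of left child
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if bf < -1:  # right sub-tree too high
        if key < node.right.key:  # inserted in left sub-tree of right child
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


def is_balanced(n):
    return n is None or (abs(balance(n)) <= 1
                         and is_balanced(n.left) and is_balanced(n.right))


root = None
for k in [63, 9, 19, 27, 18, 108, 99, 81]:
    root = insert(root, k)

print(inorder(root))      # [9, 18, 19, 27, 63, 81, 99, 108]
print(is_balanced(root))  # True
```

The inorder listing confirms that the binary search tree order is kept, and is_balanced confirms that every balance factor stays between -1 and 1 after all eight insertions.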

Example: Construct an AVL Tree by inserting numbers from 1 to 8.


Deletion in AVL Tree
Deleting a node from an AVL tree is similar to deleting from a binary search tree. Deletion may disturb the
balance factor of some node, and the tree then needs to be rebalanced in order to maintain the AVL property. For
this purpose, we need to perform rotations. The two types of rotations are L rotation and R rotation. Here, we
will discuss R rotations. L rotations are their mirror images.
If the node to be deleted is present in the left sub-tree of the critical node, then an L rotation needs to be
applied; if the node to be deleted is present in the right sub-tree of the critical node, then an R rotation is
applied.
Let us consider that A is the critical node and B is the root node of its left sub-tree. If node X, present in the
right sub-tree of A, is to be deleted, then there can be three different situations:

R0 rotation (Node B has balance factor 0 )


If the node B has balance factor 0, and the balance factor of node A is disturbed upon deleting the node X, then
the tree is rebalanced using an R0 rotation.
The critical node A is moved to its right and the node B becomes the root of the tree with T1 as its left sub-tree.
The sub-trees T2 and T3 become the left and right sub-trees of the node A. The process involved in R0 rotation is
shown in the following image.

Example:
Delete the node 30 from the AVL tree shown in the following image.

Solution
In this case, the node B has balance factor 0, therefore the tree is rotated using an R0 rotation as shown in
the following image. The node B(10) becomes the root, while the node A is moved to its right. The right child of
node B now becomes the left child of node A.

R1 Rotation (Node B has balance factor 1)

R1 Rotation is performed if the balance factor of node B is 1. In R1 rotation, the critical node A is moved
to its right, with sub-trees T2 and T3 as its left and right children respectively. T1 is placed as the left
sub-tree of the node B.
The process involved in R1 rotation is shown in the following image.

Example
Delete Node 55 from the AVL tree shown in the following image.

Solution:
Deleting 55 from the AVL Tree disturbs the balance factor of the node 50, i.e. node A, which becomes the critical
node. This is the condition for R1 rotation, in which the node A is moved to its right (shown in the image
below). The right child of B now becomes the left child of A (i.e. 45).
The process involved in the solution is shown in the following image.

R-1 Rotation (Node B has balance factor -1)

R-1 rotation is performed if the node B has balance factor -1. This case is treated in the same way as LR
rotation. In this case, the node C, which is the right child of node B, becomes the root node of the tree with B and
A as its left and right children respectively.
The sub-trees T1 and T2 become the left and right sub-trees of B, whereas T3 and T4 become the left and right
sub-trees of A.
The process involved in R-1 rotation is shown in the following image.
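
The choice between R0, R1 and R-1 can be sketched as follows. This is an illustration under assumptions, not the notes' own code: r_rotation is a hypothetical helper that receives the critical node A after X has already been removed from its right sub-tree, and the small trees used below are made-up minimal cases.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))


def balance_factor(n):
    return height(n.left) - height(n.right)


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y


def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y


def r_rotation(a):
    # a is the critical node; b is the root of its left sub-tree
    b = a.left
    bf_b = balance_factor(b)
    if bf_b == 0:
        return "R0", rotate_right(a)
    if bf_b == 1:
        return "R1", rotate_right(a)
    # bf_b == -1: handled like an LR rotation, C (right child of B) becomes root
    a.left = rotate_left(b)
    return "R-1", rotate_right(a)


def preorder(n):
    return [] if n is None else [n.key] + preorder(n.left) + preorder(n.right)


# R-1 case: A = 50, B = 40, C = 45, after the right child of A was deleted
name, root = r_rotation(Node(50, left=Node(40, right=Node(45))))
print(name, preorder(root))  # R-1 [45, 40, 50]

# R0 case: A = 20, B = 10 with children 5 and 15
name, root = r_rotation(Node(20, left=Node(10, Node(5), Node(15))))
print(name, preorder(root))  # R0 [10, 5, 20, 15]
```

R0 and R1 use the same single right rotation; they differ only in the balance factors that result afterwards.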

Example
Delete the node 60 from the AVL tree shown in the following image.

Solution:
In this case, node B has balance factor -1. Deleting the node 60 disturbs the balance factor of the node 50,
therefore an R-1 rotation is needed. The node C, i.e. 45, becomes the root of the tree with the nodes B(40) and
A(50) as its left and right children.

Why AVL Tree?


An AVL tree controls the height of the binary search tree by not letting it become skewed. The time taken for all
operations in a binary search tree of height h is O(h). However, this degrades to O(n) if the BST becomes
skewed (i.e. the worst case). By limiting the height to O(log n), an AVL tree imposes an upper bound of O(log n)
on each operation, where n is the number of nodes.

Huffman Algorithm (Coding)


Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input
characters; the lengths of the assigned codes are based on the frequencies of the corresponding characters. The
most frequent character gets the shortest code and the least frequent character gets the longest code.
The variable-length codes assigned to input characters are prefix codes, meaning the codes (bit sequences) are
assigned in such a way that the code assigned to one character is never a prefix of the code assigned to any other
character. This is how Huffman coding makes sure that there is no ambiguity when decoding the generated bit
stream.
Let us understand prefix codes with a counterexample. Let there be four characters a, b, c and d, and let their
corresponding variable-length codes be 00, 01, 0 and 1. This coding leads to ambiguity because the code assigned to
c is a prefix of the codes assigned to a and b. If the compressed bit stream is 0001, the decompressed output may be
“cccd” or “ccb” or “acd” or “cad” or “ab”.
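
The ambiguity can be demonstrated by listing every way a bit stream splits into code words. This is a small sketch; the function name decodings is chosen here for illustration. The second code table is the prefix code derived in the worked example later in this section.

```python
def decodings(bits, codes):
    # Return every string whose encoding under `codes` is exactly `bits`
    if not bits:
        return [""]
    results = []
    for char, code in codes.items():
        if bits.startswith(code):
            for rest in decodings(bits[len(code):], codes):
                results.append(char + rest)
    return results


# Not a prefix code: the code for c is a prefix of the codes for a and b
ambiguous = {"a": "00", "b": "01", "c": "0", "d": "1"}
print(sorted(decodings("0001", ambiguous)))
# ['ab', 'acd', 'cad', 'ccb', 'cccd']

# A prefix code: no code word is a prefix of another
prefix_free = {"f": "0", "c": "100", "d": "101",
               "a": "1100", "b": "1101", "e": "111"}
print(decodings("01001100", prefix_free))
# ['fca']
```

With the prefix code, exactly one code word can match at each position, so the stream decodes in only one way.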
There are two major parts in Huffman Coding:
1) Build a Huffman Tree from input characters.
2) Traverse the Huffman Tree and assign codes to characters.

Steps to build Huffman Tree:


The input is an array of unique characters along with their frequencies of occurrence, and the output is the
Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes. (The min heap is used as a
priority queue. The value of the frequency field is used to compare two nodes in the min heap. Initially, the least
frequent character is at the root.)
2. Extract the two nodes with the minimum frequency from the min heap.
3. Create a new internal node with frequency equal to the sum of the two nodes' frequencies. Make the first
extracted node its left child and the other extracted node its right child. Add this node to the min heap.
4. Repeat steps #2 and #3 until the heap contains only one node. The remaining node is the root node and the tree
is complete.
Let us understand the algorithm with an example:
character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1: Build a min heap that contains 6 nodes where each node represents root of a tree with single node.
Step 2: Extract two minimum frequency nodes from min heap. Add a new internal node with frequency 5 + 9 =
14.

Now the min heap contains 5 nodes, where 4 nodes are roots of trees with a single element each, and one heap node
is the root of a tree with 3 elements.
character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with frequency 12 + 13 = 25

Now the min heap contains 4 nodes, where 2 nodes are roots of trees with a single element each, and two heap nodes
are roots of trees with more than one node.
character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 + 16 = 30

Now min heap contains 3 nodes.


character Frequency
Internal Node 25
Internal Node 30
f 45
Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 25 + 30 = 55

Now min heap contains 2 nodes.


character Frequency
f 45
Internal Node 55

Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100

Now min heap contains only one node.


character Frequency
Internal Node 100
Since the heap contains only one node, the algorithm stops here.

Steps to print codes from Huffman Tree:


Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving to the left child, write
0 to the array. While moving to the right child, write 1 to the array. Print the array when a leaf node is
encountered.
The codes are as follows:
character code-word
f 0
c 100
d 101
a 1100
b 1101
e 111
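
The whole procedure, from frequency table to code words, can be sketched with Python's heapq module as the min heap. This is a minimal illustration under assumptions: leaves are represented as plain strings and internal nodes as (left, right) tuples, and the function names are chosen here. A running counter is stored in each heap entry so that two entries with equal frequency are never compared by their tree payload.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    tie = count()  # tie-breaker so heap entries always compare cleanly
    heap = [(f, next(tie), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Extract the two nodes with minimum frequency
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # First extracted node is the left child, second is the right child
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    return heap[0][2]


def assign_codes(tree, prefix="", table=None):
    if table is None:
        table = {}
    if isinstance(tree, str):
        # Leaf: record the path; a one-character alphabet gets the code "0"
        table[tree] = prefix or "0"
    else:
        left, right = tree
        assign_codes(left, prefix + "0", table)   # moving left writes 0
        assign_codes(right, prefix + "1", table)  # moving right writes 1
    return table


freqs = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
codes = assign_codes(build_huffman_tree(freqs))
for ch in sorted(codes, key=lambda c: (len(codes[c]), codes[c])):
    print(ch, codes[ch])
# f 0
# c 100
# d 101
# e 111
# a 1100
# b 1101
```

The code words agree with the table above. With these codes the encoded text needs 45*1 + (12 + 13 + 16)*3 + (5 + 9)*4 = 224 bits in total.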

B Tree
A B Tree is a specialized m-way tree that is widely used for disk access. A node of a B Tree of order m can have
at most m-1 keys and m children. One of the main reasons for using a B Tree is its capability to store a large
number of keys (and large key values) in a single node while keeping the height of the tree relatively small.
A B Tree of order m has all the properties of an m-way tree. In addition, it has the following properties.
1. Every node in a B Tree contains at most m children.
2. Every node in a B Tree except the root node and the leaf nodes contains at least ⌈m/2⌉ children (m/2 rounded up).
3. The root node must have at least 2 children, unless it is a leaf.
4. All leaf nodes must be at the same level.
It is not necessary that all the nodes contain the same number of children, but each internal node other than the
root must have at least ⌈m/2⌉ children.
A B tree of order 4 is shown in the following image.

While performing operations on a B Tree, a property of the B Tree may be violated, such as the minimum number of
children a node can have. To maintain the properties of the B Tree, nodes may be split or joined.
Searching:
Searching in a B Tree is similar to searching in a binary search tree. For example, suppose we search for the item
49 in the following B Tree. The process will be something like the following:
1. Compare item 49 with root node 78. Since 49 < 78, move to its left sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. 49 > 45, move to the right. Compare 49.
4. Match found, return.
Searching in a B Tree depends upon the height of the tree. The search algorithm takes O(log n) time to search for
any element in a B Tree.
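
The search procedure can be sketched as follows. This is an illustration under assumptions: BTreeNode and search are hypothetical names, and the tree below is a made-up example that only mirrors the keys mentioned in the steps above (78 at the root, 40 and 56 below it, 45 and 49 in a leaf), since the original figure is not available here.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted list of keys
        self.children = children or []  # empty list for a leaf


def search(node, key):
    # Returns (node, index) where the key was found, or None
    while node is not None:
        # Position of the first key that is >= the search key
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node, i
        if not node.children:
            return None  # reached a leaf without finding the key
        node = node.children[i]
    return None


# Hypothetical tree in which all leaves are on the same level
target = BTreeNode([45, 49])
left = BTreeNode([40, 56],
                 [BTreeNode([10, 20]), target, BTreeNode([60, 70])])
right = BTreeNode([85, 95],
                  [BTreeNode([80, 82]), BTreeNode([88, 90]), BTreeNode([97, 99])])
root = BTreeNode([78], [left, right])

found = search(root, 49)
print(found[0].keys, found[1])  # [45, 49] 1
print(search(root, 50))         # None
```

Only one node per level is visited, so the number of node accesses is bounded by the height of the tree.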
Inserting
Insertions are done at the leaf node level. The following algorithm is used to insert an item into a B Tree.
1. Traverse the B Tree in order to find the appropriate leaf node at which the key can be inserted.
2. If the leaf node contains fewer than m-1 keys, then insert the element in increasing order.
3. Else, if the leaf node already contains m-1 keys, then follow these steps.
o Insert the new element in the increasing order of elements.
o Split the node into two nodes at the median.
o Push the median element up to its parent node.
o If the parent node also contains m-1 keys, then split it too by following the same steps.
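
The leaf-level part of these steps can be sketched on a plain sorted list of keys. This is only an illustration: insert_into_leaf is a hypothetical helper, the leaf contents below are made up, and pushing the median into the parent (and splitting the parent in turn) is left out.

```python
from bisect import insort


def insert_into_leaf(keys, key, order):
    # Insert key into a sorted leaf; split at the median if it overflows.
    # Returns (left_keys, median, right_keys); median is None when no split
    # was needed, in which case left_keys holds the whole leaf.
    keys = list(keys)
    insort(keys, key)
    if len(keys) <= order - 1:
        return keys, None, None
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


# Leaf with room to spare in a B Tree of order 5 (at most 4 keys per node)
print(insert_into_leaf([3, 5, 10], 8, order=5))
# ([3, 5, 8, 10], None, None)

# Full leaf: inserting 8 gives 5 keys, so the node splits around the median 8
print(insert_into_leaf([3, 5, 10, 12], 8, order=5))
# ([3, 5], 8, [10, 12])
```

In the second call the median 8 is the key that would be pushed up to the parent node.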
Example:
Insert the node 8 into the B Tree of order 5 shown in the following image.

8 belongs to the right of 5, therefore insert 8 there.

The node now contains 5 keys, which is greater than the maximum of 5 - 1 = 4 keys. Therefore split the node at the
median, i.e. 8, and push the median up to its parent node as shown below.

Deletion
Deletion is also performed at the leaf nodes. The key to be deleted can be in either a leaf node or an
internal node. The following algorithm is used to delete a key from a B Tree. Here the minimum number of keys in a
node other than the root is ⌈m/2⌉ - 1.
1. Locate the leaf node.
2. If there are more than the minimum number of keys in the leaf node, then delete the desired key from the node.
3. If the leaf node would be left with fewer than the minimum number of keys, then complete the keys by taking an
element from the right or left sibling.
o If the left sibling contains more than the minimum number of keys, then push its largest element up to its
parent and move the intervening element down to the node where the key is deleted.
o If the right sibling contains more than the minimum number of keys, then push its smallest element up to the
parent and move the intervening element down to the node where the key is deleted.
4. If neither sibling contains more than the minimum number of keys, then create a new leaf node by joining the two
leaf nodes and the intervening element of the parent node.
5. If the parent is left with fewer than the minimum number of keys, then apply the above process to the parent too.
If the key to be deleted is in an internal node, then replace it with its in-order successor or
predecessor. Since the successor or predecessor is always in a leaf node, the rest of the process is the same as
deleting a key from a leaf node.
Example 1
Delete the node 53 from the B Tree of order 5 shown in the following figure.

53 is present in the right child of element 49. Delete it.

Now, 57 is the only element left in the node, while the minimum number of elements that must be present in a node
of a B Tree of order 5 is 2. The node has fewer than that, and its left and right siblings do not have spare
elements either; therefore, merge it with the left sibling and the intervening element of the parent, i.e. 49.
The final B Tree is shown as follows.

Application of B tree
A B Tree is used to index data and provides fast access to the actual data stored on disk, since accessing a
value stored in a large database on disk is a very time-consuming process.
Searching an un-indexed and unsorted database containing n key values needs O(n) running time in the worst case.
However, if we use a B Tree to index this database, it can be searched in O(log n) time in the worst case.

B+ Tree
A B+ Tree is an extension of the B Tree which allows efficient insertion, deletion and search operations.
In a B Tree, keys and records can both be stored in the internal nodes as well as the leaf nodes. In a B+ Tree,
records (data) can only be stored in the leaf nodes, while internal nodes store only key values.
The leaf nodes of a B+ Tree are linked together in the form of a singly linked list to make search queries
more efficient.
B+ Trees are used to store large amounts of data which cannot fit in main memory. Because the size of main memory
is always limited, the internal nodes (keys to access records) of the B+ Tree are stored in main memory, whereas
the leaf nodes are stored in secondary memory.
The internal nodes of a B+ Tree are often called index nodes. A B+ Tree of order 3 is shown in the following figure.

Advantages of B+ Tree
1. Every record can be fetched in the same number of disk accesses, since all records are at the leaf level.
2. The tree remains balanced and its height is smaller than that of a comparable B Tree.
3. We can access the data stored in a B+ Tree sequentially as well as directly.
4. Keys are used for indexing.
5. Search queries are faster as the data is stored only in the leaf nodes.
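
Sequential access through the linked leaves can be sketched as a range scan. This is a simplified illustration: Leaf, link and range_scan are hypothetical names, the keys and record values are made up, and the scan starts from the leftmost leaf, whereas a real B+ Tree would first descend through the index nodes to the leaf containing the lower bound.

```python
class Leaf:
    def __init__(self, keys, values):
        self.keys = keys      # sorted keys stored in this leaf
        self.values = values  # record for each key
        self.next = None      # link to the next leaf in key order


def link(leaves):
    # Chain the leaves into a singly linked list and return the first one
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]


def range_scan(first_leaf, lo, hi):
    # Collect all (key, record) pairs with lo <= key <= hi
    out = []
    leaf = first_leaf
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out  # keys are sorted, nothing further can match
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next
    return out


first = link([
    Leaf([10, 20], ["r10", "r20"]),
    Leaf([30, 40], ["r30", "r40"]),
    Leaf([50, 60], ["r50", "r60"]),
])
print(range_scan(first, 20, 50))
# [(20, 'r20'), (30, 'r30'), (40, 'r40'), (50, 'r50')]
```

The scan never returns to an index node: once the first leaf is known, the remaining records are reached by following the next links.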

B Tree VS B+ Tree
SN | B Tree | B+ Tree
1 | Search keys cannot be repeatedly stored. | Redundant search keys can be present.
2 | Data can be stored in leaf nodes as well as internal nodes. | Data can only be stored in the leaf nodes.
3 | Searching for some data is a slower process since data can be found in internal nodes as well as in the leaf nodes. | Searching is comparatively faster as data can only be found in the leaf nodes.
4 | Deletion from internal nodes is complicated and time consuming. | Deletion is never a complex process since an element is always deleted from the leaf nodes.
5 | Leaf nodes are not linked together. | Leaf nodes are linked together to make the search operations more efficient.

Insertion in B+ Tree
Step 1: Insert the new key into the appropriate leaf node.
Step 2: If the leaf doesn't have the required space, split the node and copy the middle key up to the parent index
node.
Step 3: If the index node doesn't have the required space, split the node and move the middle key up to the next
index node.
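
Step 2 differs from a B Tree split in that the middle key is copied rather than removed from the leaf. The following is a small sketch of that difference; split_leaf is a hypothetical helper and the leaf contents are made-up values.

```python
from bisect import insort


def split_leaf(keys, key, max_keys):
    # Insert key into a sorted B+ Tree leaf; split if it overflows.
    # Returns (left_keys, separator, right_keys); separator is None when no
    # split was needed, in which case left_keys holds the whole leaf.
    keys = list(keys)
    insort(keys, key)
    if len(keys) <= max_keys:
        return keys, None, None
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    # The separator is copied up: it also stays in the right leaf,
    # because records live only in the leaves
    return left, right[0], right


# Order 5 allows at most 4 keys per node
print(split_leaf([154, 179, 190, 200], 195, max_keys=4))
# ([154, 179], 190, [190, 195, 200])
```

The separator 190 appears both in the returned right leaf and as the key handed to the parent, which is why redundant search keys can be present in a B+ Tree.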
Example :
Insert the value 195 into the B+ tree of order 5 shown in the following figure.

195 belongs in the right sub-tree of 120, after 190. Insert it at that position.

The node now contains more than the maximum number of elements, i.e. 4; therefore split it and place the median
key in the parent.
Now the index node contains 6 children and 5 keys, which violates the B+ Tree properties; therefore we need to
split it, as shown below.

Deletion in B+ Tree
Step 1: Delete the key and its data from the leaf.
Step 2: If the leaf node contains fewer than the minimum number of elements, merge the node with its sibling
and delete the key in between them.
Step 3: If the index node contains fewer than the minimum number of elements, merge the node with its sibling and
move down the key in between them.
Example
Delete the key 200 from the B+ Tree shown in the following figure.

200 is present in the right sub-tree of 190, after 195. Delete it.

Merge the two nodes by using 195, 190, 154 and 129.

Now, 120 is the only element present in its node, which violates the B+ Tree properties. Therefore,
we need to merge it by using 60, 78, 108 and 120.
Now, the height of the B+ Tree is decreased by 1.
