DSF Combined notes


Trees

Data Structure

Department of Computer Engineering

BRACT’S, Vishwakarma Institute of Information Technology, Pune-48


(An Autonomous Institute affiliated to Savitribai Phule Pune University)
(NBA and NAAC accredited, ISO 9001:2015 certified)
Objective/s of this session

1. Understand the difference between linear and non-linear data structures


2. Understand concept of Trees
3. Introduce basic properties of trees
Learning Outcome/Course Outcome

1. Differentiate between linear and non-linear data structures


2. Understand basic terminologies of trees
3. Understand basic properties of trees

Content
Part-I
➢ Difference between linear and non-linear data structures. Trees and binary trees - basic terminology
➢ Representation using linked organization, binary tree properties
➢ Converting a tree to a binary tree
Part-II
➢ Binary tree traversals, recursive and non-recursive: depth first and breadth first
➢ Binary Search Tree (BST), BST operations: Create, Display
➢ BST insertion and deletion

Data Structure Classification

Linear and Non Linear Data Structure Difference

Linear and Non Linear Data Structure Difference

Faculty Name(optional), Department of ______Engineering,


6
VIIT,Pune-48
What is a tree?

Definition of Tree

• A tree is a finite set of one or more nodes


such that:

• There is a specially designated node called


the root.

• The remaining nodes are partitioned into n>=0 disjoint sets T1, ...,
Tn, where each of these sets is a tree.

• We call T1, ..., Tn the subtrees of the root.


Terminology

• Finite set of elements in tree are called NODES

• Finite set of lines in tree are called as BRANCHES

• Branch directed towards the node is the INDEGREE of the node

• Branch directed away from the node is the OUTDEGREE of the node

• The degree of a node is the number of sub trees of the node

• INDEGREE of ROOT is ZERO


Terminology
• All other nodes of the tree have indegree one, and outdegree can be zero or more

• OUTDEGREE of LEAF node is ZERO

• A node that is neither ROOT nor LEAF is an INTERNAL NODE

• The total number of edges on the longest path from a leaf up to a particular node is called the HEIGHT of that node.

• In a tree, height of the root node is said to be Height of the tree.


• A node is called PARENT node if it has successor nodes i.e. outdegree
greater than ZERO
Terminology
• The node with degree 0 is a leaf or terminal node.

• A node that has sub trees is the parent of the sub trees.

• The sub trees of the node are the children of the node.

• Children of the same parent are siblings.

• The ancestors of a node are all the nodes along the path from the
root to the node.
Terminology

[Figure slides illustrating the terms above on an example tree]
Minimum Number Of Nodes

 Minimum number of nodes in a binary tree whose height is h:
 At least one node at each of the h+1 levels, so the minimum number of nodes is h+1, where h is the height.
Maximum Number Of Nodes
 All possible nodes at the first h+1 levels are present.

Applicable for:
Full binary tree
Perfect binary tree

Maximum number of nodes:
2^0 + 2^1 + ... + 2^h = 2^(h+1) − 1
Number Of Nodes & Height

 Let n be the number of nodes in a binary tree whose height is h.
 The number of nodes can at most double every time the depth increases by 1.
 h+1 <= n <= 2^(h+1) − 1. Ex. for h = 3: 4 <= n <= 15, as the height of a leaf is considered as 0.
 The number of nodes in any binary tree is (nodes in the left subtree) + (nodes in the right subtree) + 1.
Full Binary Tree

 A full binary tree of a given height h has 2^(h+1) − 1 nodes.

Height 3 full binary tree.

Numbering Nodes In A Full Binary Tree

 Number the nodes 1 through 2^(h+1) − 1.
 Number by levels from top to bottom.
 Within a level, number from left to right.
1

2 3

4 5 6 7
8 9 10 11 12 13 14 15
Node Number Properties
1

2 3

4 5 6 7
8 9 10 11 12 13 14 15

 Parent of node i is node ⌊i / 2⌋, unless i = 1.
 Node 1 is the root and has no parent.
Node Number Properties
1

2 3

4 5 6 7
8 9 10 11 12 13 14 15

 Left child of node i is node 2i, unless 2i > n, where n


is the number of nodes.
 If 2i > n, node i has no left child.
Node Number Properties
1

2 3

4 5 6 7
8 9 10 11 12 13 14 15

 Right child of node i is node 2i+1, unless 2i+1 > n,


where n is the number of nodes.
 If 2i+1 > n, node i has no right child.
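The parent/child index formulas above can be checked with a small sketch (plain Python; node numbers are 1-based as on the slides, and the function names are ours, not from the deck):

```python
# Index arithmetic for a binary tree stored by level order, 1-based,
# following the numbering scheme above. n is the number of nodes.
def parent(i):
    return i // 2 if i > 1 else None      # node 1 (the root) has no parent

def left_child(i, n):
    return 2 * i if 2 * i <= n else None  # no left child when 2i > n

def right_child(i, n):
    return 2 * i + 1 if 2 * i + 1 <= n else None

n = 10  # complete binary tree with 10 nodes
print(parent(5))          # -> 2
print(left_child(5, n))   # -> 10
print(right_child(5, n))  # -> None, since 2*5+1 = 11 > 10
```

This index arithmetic is what makes the array representation (next slides) work without storing any pointers.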
Complete Binary Tree With n Nodes

A complete binary tree is a binary tree in which every


level, except possibly the last, is completely filled, and
all nodes are as far left as possible.

A binary tree T is full if each node is either a leaf or possesses exactly two child nodes.

 A binary tree T with n levels is complete if all levels except possibly the last are completely full, and the last level has all its nodes to the left side.

Perfect Binary Tree


A binary tree is perfect binary Tree if all
internal nodes have two children and all
leaves are at the same level.
Example

1
2 3
4 5 6 7
8 9 10

 Complete binary tree with 10 nodes.

Binary Tree Representation

 Array representation.
 Linked representation.
Array Representation
 Number the nodes using the numbering scheme for
a full binary tree. The node that is numbered i is
stored in tree[i].
1: a
2: b   3: c
4: d   5: e   6: f   7: g
8: h   9: i   10: j

tree[]: a b c d e f g h i j
        (the node numbered i is stored in tree[i])
Right-Skewed Binary Tree
1: a
3: b
7: c
15: d

tree[]: a - b - - - c - - - - - - - d

 To represent a binary tree of depth 'n' using array representation, we need a one-dimensional array with a maximum size of 2^(n+1) − 1.
Linked Representation

 Each binary tree node is represented as an


object whose data type is BinaryTreeNode.
 The space required by an n node binary tree is n
* (space required by one node).
The Class BinaryTreeNode

class BinaryTreeNode
{
    int element;                 // data held in the node
    BinaryTreeNode *leftChild;   // left subtree
    BinaryTreeNode *rightChild;  // right subtree
    // constructors and any other methods
    // come here
};
Linked Representation Example
root a

b c

d e

f g
Node layout: [ leftChild | element | rightChild ]

[Figure: the same tree shown in sequential (array) and linked representation]
Binary Trees
• Two special kinds of binary trees:
(a) skewed tree, (b) complete binary tree
• All leaf nodes of these trees lie on at most two adjacent levels
Binary Trees
• Properties of binary trees
• Lemma 5.1 [Maximum number of nodes]:
1. The maximum number of nodes on level i of a binary tree is 2^(i−1), i >= 1.
2. The maximum number of nodes in a binary tree of depth k is 2^k − 1, k >= 1.
• Lemma 5.2 [Relation between number of leaf nodes and degree-2 nodes]:
For any nonempty binary tree T, if n0 is the number of leaf nodes and n2 is the number of nodes of degree 2, then n0 = n2 + 1.
• These lemmas allow us to define full and complete binary trees
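Lemma 5.2 (n0 = n2 + 1) can be verified mechanically on any linked binary tree; a quick sketch (the Node class here is hypothetical, not from the slides):

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def count(node):
    """Return (leaf_count, degree2_count) for the subtree at node."""
    if node is None:
        return (0, 0)
    l0, l2 = count(node.left)
    r0, r2 = count(node.right)
    if node.left is None and node.right is None:
        return (l0 + r0 + 1, l2 + r2)      # leaf node
    if node.left is not None and node.right is not None:
        return (l0 + r0, l2 + r2 + 1)      # degree-2 node
    return (l0 + r0, l2 + r2)              # degree-1 node

# Root with two children; the right child has one child (degree 1).
root = Node(Node(), Node(Node()))
n0, n2 = count(root)
print(n0, n2)  # 2 leaves and 1 degree-2 node: n0 = n2 + 1 holds
```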
Binary Trees
• Definition:
• A full binary tree of depth k is a binary tree of depth k having 2^k − 1 nodes, k >= 0.
• A binary tree with n nodes and depth k is complete iff its nodes correspond to the nodes numbered from 1 to n in the full binary tree of depth k.
• From Lemma 5.1, the height of a complete binary tree with n nodes is ⌈log2(n+1)⌉
• Full Binary Tree
• A full binary tree is a binary tree where every node has exactly 0 or 2 children.
Algorithm to convert General Tree to Binary

• Identify branches from parent to left most child


• Connect remaining siblings from left most
child using branches for each sibling to its
right
• Delete all unneeded branches
Example Tree Conversion General to
Binary
A

B C D

E F G H I J K
Example Tree Conversion General to Binary

[Figure: the resulting binary tree — each node's leftmost child hangs below it on the left link, and the remaining siblings are chained to the right]
Converting tree to binary tree
The process of converting the general tree to a binary tree is as follows:

* use the root of the general tree as the root of the binary tree

* determine the first child of the root. This is the leftmost node in the general tree at the next level

* insert this node. The child reference of the parent node refers to this node

* continue finding the first child of each parent node and insert it below the parent node with the child
reference of the parent to this node.

* when no more first children exist in the path just used, move back to the parent of the last node entered and
repeat the above process. In other words, determine the first sibling of the last node entered.

* complete the tree for all nodes. In order to locate where the node fits you must search for the first child at
that level and then follow the sibling references to a nil where the next sibling can be inserted. The
children of any sibling node can be inserted by locating the parent and then inserting the first child.
Then the above process is repeated.
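The steps above amount to the classic left-child / right-sibling transformation; a minimal sketch (the GNode/BNode classes are our own illustration, not from the slides):

```python
class GNode:                      # general-tree node: any number of children
    def __init__(self, val, children=()):
        self.val, self.children = val, list(children)

class BNode:                      # binary-tree node
    def __init__(self, val):
        self.val, self.left, self.right = val, None, None

def to_binary(g):
    """Left pointer = first child, right pointer = next sibling."""
    if g is None:
        return None
    b = BNode(g.val)
    prev = None
    for child in g.children:
        cb = to_binary(child)
        if prev is None:
            b.left = cb           # leftmost child hangs below the parent
        else:
            prev.right = cb       # remaining siblings chain to the right
        prev = cb
    return b

# A has children B, C, D; B has children E, F, G
t = GNode('A', [GNode('B', [GNode('E'), GNode('F'), GNode('G')]),
                GNode('C'), GNode('D')])
r = to_binary(t)
print(r.left.val, r.left.right.val, r.left.left.val)  # B C E
```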

Binary Tree

[Figure slides with binary tree examples]
Binary Tree Traversal

 Many binary tree operations are done by


performing a traversal of the binary tree.
 In a traversal, each element of the binary tree is
visited exactly once.
 During the visit of an element, all action (make a
clone, display, evaluate the operator, etc.) with
respect to this element is taken.
Tree Traversal
 Two main methods:
 Depth first Traversal
 Breadth first Traversal

 Three methods of Depth first Traversal


 Preorder
 Postorder
 Inorder

 PREorder:
 visit the root
 Traverse left subtree
 Traverse right subtree
Tree Traversal

 POSTorder
 Traverse left subtree
 Traverse right subtree
 visit the root

 Inorder
 Traverse left subtree
 visit the root
 Traverse right subtree
Pre-Order Traversal

A
B E

C D F

A B C D E F
def preorder(tree):
    if tree:
        print(tree.getRootVal())
        preorder(tree.getLeftChild())
        preorder(tree.getRightChild())

In-Order Traversal

A
B E

C D F

C B D A E F
Post-Order Traversal

A
B E

C D F
C D B F E A

def postorder(tree):
    if tree:
        postorder(tree.getLeftChild())
        postorder(tree.getRightChild())
        print(tree.getRootVal())
Breadth first Traversal

A
B C

D E F G

A B C D E F G
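No code is given on the slide for breadth-first traversal; a level-order sketch using a FIFO queue could look like this (the simple Node class is ours, standing in for the deck's tree interface):

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def breadth_first(root):
    """Visit nodes level by level, left to right, using a queue."""
    order, q = [], deque([root] if root else [])
    while q:
        node = q.popleft()          # dequeue the oldest discovered node
        order.append(node.val)
        if node.left:               # enqueue children for the next level
            q.append(node.left)
        if node.right:
            q.append(node.right)
    return order

# The tree from the slide: A with children B, C; B has D, E; C has F, G
tree = Node('A', Node('B', Node('D'), Node('E')),
                 Node('C', Node('F'), Node('G')))
print(breadth_first(tree))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```

Swapping the queue for a stack turns this into an iterative depth-first traversal.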
Recursive Traversal
def preorder(tree):
    if tree is None: return
    print(tree.getRootVal())
    preorder(tree.getLeftChild())
    preorder(tree.getRightChild())

def postorder(tree):
    if tree is None: return
    postorder(tree.getLeftChild())
    postorder(tree.getRightChild())
    print(tree.getRootVal())

def inorder(tree):
    if tree is None: return
    inorder(tree.getLeftChild())
    print(tree.getRootVal())
    inorder(tree.getRightChild())

Binary Search Trees
• Ordered data stored in Array will have efficient
search algorithms but inefficient insertion and
deletion

• Linked list storage will increase the efficiency of insertion and deletion, but search will always be sequential, as you always have to start from the first node
Properties of Binary Search Trees
• All items in left sub tree are less than
root

• All items in right sub tree are greater


than or equal to root

• Each sub tree is itself a binary search tree

Is this Binary Search Tree

17

6 19
Is this Binary Search Tree

17

6 19

3 14
Is this Binary Search Tree

17

19

22
Is this Binary Search Tree

17

22 19
Is this Binary Search Tree

17

6 11
Is this Binary Search Tree

17

6 19

11 15
Is this Binary Search Tree

17

6 19

3 22
Insertion in BST

17

6 19

4 15

Add node - 2
Insertion in BST

17

6 19

4 15

2
Insertion in BST

17

6 19

4 15

Add node - 22
Insertion in BST

17

6 19

4 15
22
Insertion

 Algorithm addBST (root, newnode)


If (empty)
Set root to newnode
Return newnode
End if
If (newnode < root)
Return addBST (left sub tree, newnode)
Else
Return addBST (right sub tree, newnode)
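The addBST pseudocode above translates directly into a recursive insert; a sketch (the Node class and function names are ours):

```python
class Node:
    def __init__(self, val):
        self.val, self.left, self.right = val, None, None

def add_bst(root, val):
    """Insert val: smaller keys go left, greater-or-equal keys go right."""
    if root is None:
        return Node(val)            # empty spot found: attach the new node
    if val < root.val:
        root.left = add_bst(root.left, val)
    else:
        root.right = add_bst(root.right, val)
    return root

root = None
for v in (17, 6, 19, 4, 15, 2):     # the insertion example from the slides
    root = add_bst(root, v)
print(root.val, root.left.val, root.right.val)      # 17 6 19
print(root.left.left.val, root.left.left.left.val)  # 4 2
```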
Deleting from BST
• If the node doesn't have children, then just delete the reference to this node from its parent and recycle the memory

• If the node has only a right sub tree, then simply attach the right subtree to the deleted node's parent

• If the node has only a left sub tree, then simply attach the left subtree to the deleted node's parent
Deleting from BST
• If the node has left and right subtree then
• Find largest node from left subtree and
replace that with node we have to delete
OR
• Find smallest node from right subtree and
replace that with node we have to delete
BST Operations: Removal

▪ Removes a specified item from the BST and adjusts the tree. It uses a binary search to locate the target item: starting at the root it probes down the tree till it finds the target or reaches a leaf node (target not in the tree).
▪ Removal of a node must not leave a 'gap' in the tree.
Removal in BST - Pseudocode
method remove (key)
I if the tree is empty return false
II Attempt to locate the node containing the target using the binary
search algorithm
if the target is not found return false
else the target is found, so remove its node:
Case 1: if the node has 2 empty subtrees
replace the link in the parent with null
Case 2: if the node has a left and a right subtree
- replace the node's value with the max value in the
left subtree
- delete the max node in the left subtree
Removal in BST - Pseudocode

Case 3: if the node has no left child


- link the parent of the node
- to the right (non-empty) subtree
Case 4: if the node has no right child
- link the parent of the target
- to the left (non-empty) subtree
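The four removal cases above can be handled by one recursive function; Case 2 replaces the node's value with the maximum of the left subtree, as in the pseudocode (the Node class is our own illustration):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def remove(root, key):
    if root is None:
        return None                      # key not in tree
    if key < root.val:
        root.left = remove(root.left, key)
    elif key > root.val:
        root.right = remove(root.right, key)
    else:
        if root.left is None:            # cases 1 and 3: no left child
            return root.right
        if root.right is None:           # case 4: no right child
            return root.left
        m = root.left                    # case 2: max of the left subtree
        while m.right:
            m = m.right
        root.val = m.val                 # copy the max value up
        root.left = remove(root.left, m.val)  # delete the max node
    return root

# Removing 7 from the slide's example tree replaces it with 6
tree = Node(7, Node(5, Node(4), Node(6)), Node(9, Node(8), Node(10)))
tree = remove(tree, 7)
print(tree.val)  # 6
```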
Removal in BST: Example

Case 1: removing a node with 2 EMPTY SUBTREES

parent: 7
cursor: 5 9
4 6 8 10

Removing 4: replace the link in the parent with null

7
5 9
6 8 10
Removal in BST: Example
Case 2: removing a node with 2 SUBTREES
- replace the node's value with the max value in the left subtree
- delete the max node in the left subtree What other element
Removing 7 can be used as
cursor
cursor replacement?
7 6
5 9 5 9
4 6 8 10 4 8 10
Removal in BST: Example

Case 3: removing a node with 1 EMPTY SUBTREE


the node has no left child:
link the parent of the node to the right (non-empty) subtree
parent
parent
7 7
cursor
5 9 cursor 5 9
6 8 10 6 8 10
Removal in BST: Example
Case 4 : removing a node with 1 EMPTY SUBTREE
the node has no right child:
link the parent of the node to the left (non-empty) subtree
Removing 5
parent
parent
7 cursor 7
cursor
5 9 5 9
4 8 10 4 8 10
Thank You

Presentation Topic: Advanced Trees
Advanced Data Structures

Department of Computer Engineering

BRACT’S, Vishwakarma Institute of Information Technology, Pune-48

(An Autonomous Institute affiliated to Savitribai Phule Pune University)


(NBA and NAAC accredited, ISO 9001:2015 certified)
Objective/s of this session

1. Introduce Threaded Binary Tree


2. Understand AVL Trees
3. Understand indexing
4. Introduce B+ Trees
Learning Outcome/Course Outcome

1. Understand Threaded Binary Tree


2. Understand AVL Trees
3. Understand indexing
4. Understand B+ Trees

Threaded Binary Tree
• In a linked representation of a binary tree, the number of null links (null pointers) is actually greater than the number of non-null pointers.
• Consider the following binary tree:
Threaded Binary Tree
• In the above binary tree, there are 7 null pointers and 5 actual pointers.
• In all there are 12 pointers.
• We can generalize: for any binary tree with n nodes there will be (n+1) null pointers out of 2n total pointers.
• The objective here is to make effective use of these null pointers.
• A. J. Perlis and C. Thornton jointly proposed the idea of making effective use of these null pointers.
• According to this idea, we replace all the null pointers by appropriate pointer values called threads.
Threaded Binary Tree
• A binary tree with such pointers is called a threaded tree.
• In the memory representation of a threaded binary tree,
it is necessary to distinguish between a normal pointer
and a thread.
Threaded Binary Tree
▪ Therefore we have an alternate node representation for a threaded binary tree which contains five fields, as shown below:
Threaded Binary Tree
• Also one may choose a one-way threading or a
two-way threading.
• Here, our threading will correspond to the in
order traversal of T.
Threaded Binary Tree
One-Way
• Accordingly, in the one way threading of T, a
thread will appear in the right field of a node
and will point to the next node in the in-order
traversal of T.
• See the example of one-way in-order threading below.
Threaded Binary Tree:
One-Way

Inorder of below tree is: D,B,F,E,A,G,C,L,J,H,K


Threaded Binary Tree
Two-Way
• In the two-way threading of T.
• A thread will also appear in the left field of a node and
will point to the preceding node in the in-order
traversal of tree T.
• Furthermore, the left pointer of the first node and the
right pointer of the last node (in the in-order traversal
of T) will contain the null value when T does not have
a header node.
Threaded Binary Tree
• Below figure show two-way in-order threading.
• Here, right pointer=next node of in-order traversal and
left pointer=previous node of in-order traversal
• Inorder of below tree is: D,B,F,E,A,G,C,L,J,H,K
Threaded Binary Tree
Threaded Binary Tree

Dangling Pointers
Double Threaded binary tree with dummy node
Threaded Binary Tree
Two-way Threading with Header node
• In two-way threading with a header node, the left thread of the first node and the right thread of the last node (in the in-order traversal of T) point to the header node instead of containing the null value.
Threaded Binary Tree
• Below figure to explain two-way threading with header node.
Threaded Binary Tree
• Below is an example of the linked representation of a threaded binary tree.
• In-order traversal of below tree: G,F,B,A,D,C,E
Threaded Binary Tree
Threaded Tree Code
• Example code:
class Node {
    Node left, right;                 // child pointers or in-order threads
    boolean leftThread, rightThread;  // true when the pointer is a thread
}

Amir Kamil 8/8/02 20


Threaded Tree Example
6
3 8
1 5 7 11
9 13



Threaded Tree Traversal
• We start at the leftmost node in the tree, print
it, and follow its right thread
• If we follow a thread to the right, we output
the node and continue to its right
• If we follow a link to the right, we go to the
leftmost node, print it, and continue



Threaded Tree Traversal

The figure slides step through the in-order traversal of the example tree (root 6; children 3 and 8; then 1, 5, 7, 11; then 9 and 13), alternating the two moves above:

• Start at the leftmost node, print it: 1
• Follow thread to right, print node: 3
• Follow link to right, go to leftmost node and print: 5
• Follow thread to right, print node: 6
• Follow link to right, go to leftmost node and print: 7
• Follow thread to right, print node: 8
• Follow link to right, go to leftmost node and print: 9
• Follow thread to right, print node: 11
• Follow link to right, go to leftmost node and print: 13

Output: 1 3 5 6 7 8 9 11 13
Threaded Tree Traversal Code

Node leftMost(Node n) {
    Node ans = n;
    if (ans == null) {
        return null;
    }
    while (ans.left != null) {
        ans = ans.left;
    }
    return ans;
}

void inOrder(Node n) {
    Node cur = leftMost(n);
    while (cur != null) {
        print(cur);
        if (cur.rightThread) {
            cur = cur.right;
        } else {
            cur = leftMost(cur.right);
        }
    }
}


Threaded Binary Tree
• Advantages of threaded binary tree:
• Threaded binary trees have numerous advantages over
non-threaded binary trees listed as below:
– The traversal operation is faster than in the unthreaded version, because a non-recursive implementation is possible with a threaded binary tree, which runs faster and does not require the overhead of stack management.
Threaded Binary Tree
• Advantages of threaded binary tree:
– The second advantage is more subtle: with a threaded binary tree, we can efficiently determine the predecessor and successor of any node starting from that node. In an unthreaded binary tree this task is more time consuming and difficult, since a stack is required to provide upward-pointing information; in a threaded binary tree the same can be carried out with the threads, without the overhead of a stack mechanism.
Threaded Binary Tree
• Advantages of threaded binary tree:
– Any node can be reached from any other node. Threads usually point upward whereas links point downward, so in a threaded tree one can move in either direction, and the nodes are in fact circularly linked. This is not possible in the unthreaded counterpart, where we can move only downward starting from the root.
– Insertions into and deletions from a threaded tree are somewhat time consuming, but they are easy to implement.
Threaded Binary Tree
• Disadvantages of threaded binary tree:
– Insertion and deletion in a threaded tree are more time consuming compared to a non-threaded binary tree.
– Each node requires an additional bit to identify a threaded link.
AVL TREE (HEIGHT BALANCED TREE)

 Named after Georgy Adelson-Velsky and Evgenii Landis.

3/3/2022 37
Binary Search Tree disadvantages

[Figure slide]
AVL Trees
•We have seen that all operations depend on the depth of the tree.
•We don't want trees with nodes of large height. This can be attained if both subtrees of each node have roughly the same height.
•An AVL tree is a binary search tree where the heights of the two subtrees of a node differ by at most one. The height of a null tree is -1.
•For the tree to be balanced, the balance factor of every node should be -1, 0 or 1.
•The balance factor is calculated as HL - HR.
•Named after their inventors, Adelson-Velskii and Landis, AVL trees were the first dynamically balanced search trees.
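The balance-factor rule (HL − HR in {−1, 0, 1} at every node, with a null tree of height −1) can be checked with a short sketch (the Node class is our own illustration; this is an O(n²) check, not an AVL implementation):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def height(node):
    if node is None:
        return -1                   # height of a null tree, as on the slide
    return 1 + max(height(node.left), height(node.right))

def is_avl(node):
    """True if every node's balance factor HL - HR is -1, 0 or 1."""
    if node is None:
        return True
    bf = height(node.left) - height(node.right)
    return abs(bf) <= 1 and is_avl(node.left) and is_avl(node.right)

balanced   = Node(2, Node(1), Node(3))
unbalanced = Node(3, Node(2, Node(1)))   # left height 1, right height -1
print(is_avl(balanced), is_avl(unbalanced))  # True False
```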
AVL Trees
(Adelson – Velskii – Landis)
AVL tree:

[Figure: an example AVL tree — every node's balance factor is −1, 0 or 1]
AVL Trees
Not AVL tree:
20
16 25

7 18 30

5 17 19 27
AVL Tree Rotations
Single rotations: insert 14, 15, 16, 13, 12, 11, 10
• First insert 14 and 15:

14
15
• Now insert 16.

AVL Tree Rotations
Single rotations:
• Inserting 16 causes AVL violation:

14
15
16
• Need to rotate.

AVL Tree Rotations

Single rotations: Right of Right Rotation
(Insert element to the right of the right subtree)
Rotate to left

• Inserting 16 causes an AVL violation:

14
  15
    16

rr_rotation(avl_node *parent)
{
    avl_node *temp;
    temp = parent->right;
    parent->right = temp->left;
    temp->left = parent;
    return temp;
}

• Need to rotate.
AVL Tree Rotations
Single rotations:
• Rotation type:

14
15
16

AVL Tree Rotations
Single rotations:
• Rotation restores AVL balance:

15

14 16

AVL Tree Rotations

Single rotations: Left of Left
(Insert element to the left of the left subtree)
Rotate to Right

• Now insert 13 and 12:

      15
   14    16
 13
12

ll_rotation(avl_node *parent)
{
    avl_node *temp;
    temp = parent->left;
    parent->left = temp->right;
    temp->right = parent;
    return temp;
}

• AVL violation - need to rotate.
AVL Tree Rotations
Single rotations:
• Rotation type:
15
14 16
13
12
AVL Tree Rotations
Single rotations:

15
13 16
12 14

AVL Tree Rotations
Double rotations: Right of Left
Insert element to the Right (part) of the Left (subtree)
Rotate to left first and then rotate to right

15
  13
    14

lr_rotation(avl_node *parent)
{
    avl_node *temp;
    temp = parent->left;
    parent->left = rr_rotation(temp);
    return ll_rotation(parent);
}
AVL Tree Rotations

    15                14
  13         →     13    15
      14

(After the double rotation, 14 becomes the subtree root with children 13 and 15.)
AVL Tree Rotations
Double rotations: Left of Right
Insert element to the Left (part) of the Right (subtree)
Rotate to right first and then rotate to left

15
    18
  17

rl_rotation(avl_node *parent)
{
    avl_node *temp;
    temp = parent->right;
    parent->right = ll_rotation(temp);
    return rr_rotation(parent);
}
AVL Tree Rotations

  15                  17
      18     →     15    18
    17

(After the double rotation, 17 becomes the subtree root with children 15 and 18.)
AVL Tree Rotations
Single rotations (continued): the tree now holds 12, 13, 14, 15, 16 with 15 at the root.

• Now insert 11: this causes an AVL violation at the root (left of left); a single right rotation makes 13 the new root, with 12 (left child 11) on the left and 15 (children 14, 16) on the right.

• Now insert 10: this causes a violation at 12 (left of left); a single right rotation makes 11 the parent of 10 and 12. AVL balance restored: root 13, left subtree 11 (children 10, 12), right subtree 15 (children 14, 16).

[Figure slides step through each insertion and rotation]
AVL Tree Rotations
Double rotations: insert 1, 2, 3, 4, 5, 7, 6, 9, 8
• First insert 1 and 2, then 3: an AVL violation at 1 (right of right); a single left rotation makes 2 the root with children 1 and 3.
• Insert 4, then 5: a violation at 3 (right of right); a single left rotation makes 4 the parent of 3 and 5.
• Insert 7: a violation at the root 2 (right of right); after the rotation 4 is the root, with 2 (children 1, 3) on the left and 5 (right child 7) on the right.
• Insert 6: a violation at 5 (left of right); the double rotation makes 6 the parent of 5 and 7.
• Insert 9, then 8: a violation at 7 (left of right); the double rotation makes 8 the parent of 7 and 9.

Final tree:
• The tree is almost perfectly balanced: root 4, left subtree 2 (children 1, 3), right subtree 6 (left child 5, right child 8 with children 7 and 9).

[Figure slides step through each insertion and rotation]
GRAPHS
Data Structures

1
SYLLABUS
 Basic Concepts, Storage representation, Adjacency
matrix, adjacency list, adjacency multi list, inverse
adjacency list. Traversals-depth first and breadth first,
Introduction to Greedy Strategy, Minimum spanning
Tree, Greedy algorithms for computing minimum
spanning tree- Prims and Kruskal Algorithms,
Dikjtra's Single source shortest path, Topological
ordering.
 Case study- Data structure used in Webgraph and
Google map.

2
WHAT IS A GRAPH?

 A data structure that consists of a set of nodes


(vertices) and a set of edges that relate the nodes to
each other
 The set of edges describes relationships among the
vertices
 An edge is a pair of vertices. As in mathematics, an
edge (x, y) is said to point or go from x to y.
 If two of the vertices are connected with edge, then
we say these two vertices are adjacent
 A graph may not have an edge from a vertex to itself, i.e. <v, v>
 A graph may not have multiple occurrences of the same edge
Example
B
Vertices E

A D

Edges
C
 Undirected graph with 5 vertices and 7 edges
 Maximum number of edges in an undirected graph: n(n-1)/2
 If a graph has exactly n(n-1)/2 edges, then it is a complete graph
Example
B
E

A D

 Directed Graph with 5 vertices and 7 edges


5
FORMAL DEFINITION OF GRAPHS
 A graph G is defined as follows:
G=(V,E)
V(G): a finite, nonempty set of vertices
E(G): a set of edges (pairs of vertices)

6
DIRECTED VS. UNDIRECTED GRAPHS

 When the edges in a graph have no direction, the


graph is called undirected

7
DIRECTED VS. UNDIRECTED GRAPHS
(CONT.)
 When the edges in a graph have a direction, the
graph is called directed (or digraph)
if the graph is
directed, the
order of the
vertices in each
edge is important

E(Graph2) = {(1,3), (3,1), (5,9), (9,11), (5,7)}


TREES VS GRAPHS

 Trees are special cases of graphs!!

9
The degree of a vertex is the number of edges incident to that vertex.
For a directed graph:
  the in-degree of a vertex v is the number of edges that have v as the head
  the out-degree of a vertex v is the number of edges that have v as the tail

If di is the degree of vertex i in a graph G with n vertices and e edges, then the number of edges is

    e = ( Σ_{i=0}^{n−1} di ) / 2
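The formula e = (Σ di) / 2 holds because each edge contributes 1 to the degree of both endpoints; a quick check on a hypothetical undirected graph with 7 edges (the edge list is ours, not the figure's):

```python
# Hypothetical undirected graph on vertices A-E with 7 edges
edges = [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'E'), ('B', 'D'),
         ('C', 'D'), ('D', 'E')]

degree = {}
for u, v in edges:                  # each edge adds 1 to both endpoints
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

print(sum(degree.values()))         # 14 = 2 * 7
print(sum(degree.values()) // 2 == len(edges))  # True
```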
GRAPH TERMINOLOGY

 Adjacent nodes: two nodes are adjacent if they are


connected by an edge

 Path: a sequence of vertices that connect two nodes


in a graph

 Complete graph: a graph in which every vertex is


directly connected to every other vertex

11
MORE TERMINOLOGY
 simple path: no repeated vertices (e.g. b → e → c)
 cycle: a simple path, except that the last vertex is the same as the first vertex (e.g. a → c → d → a)

[Figure: example graph with vertices a, b, c, d, e]
EVEN MORE TERMINOLOGY
•connected graph: any two vertices are connected by some path

connected not connected


 subgraph: subset of vertices and edges forming a graph
 connected component: maximal connected subgraph. E.g., the graph
below has 3 connected components.

13
GRAPH TERMINOLOGY (CONT.)
 What is the number of edges in a
complete directed graph with N
vertices?
N * (N-1)

14
GRAPH TERMINOLOGY (CONT.)
 What is the number of edges in a complete
undirected graph with N vertices?
N * (N-1) / 2

15
GRAPH TERMINOLOGY (CONT.)
 Weighted graph: a graph in which each edge
carries a value

16
ABSTRACT DATA TYPE
 objects: a nonempty set of vertices and a set of
undirected edges, where each edge is a pair of vertices

 functions: for all graph ∈ Graph, v, v1 and v2 ∈ Vertices

 Graph Create()::=return an empty graph

 Graph InsertVertex(graph, v)::= return a graph with


v inserted. v has no incident edge.

 Graph InsertEdge(graph, v1,v2)::= return a graph


with new edge between v1 and v2
17
 Graph DeleteVertex(graph, v)::= return a graph in
which v and all edges incident to it are removed

 Graph DeleteEdge(graph, v1, v2)::=return a graph


in which the edge (v1, v2) is removed

 Boolean IsEmpty(graph)::= if (graph==empty


graph) return TRUE else return FALSE

 List Adjacent(graph,v)::= return a list of all


vertices that are adjacent to v

18
GRAPH TRAVERSAL
 Use the same depth-first and breadth-first traversal
algorithms seen for the binary trees.
 Differences between graph and tree traversals:
1) Tree traversal always visit all the nodes in the tree
2) Graph traversal visits all the nodes in the graph only when it is connected. Otherwise it visits only a subset of the nodes. This subset is called a connected component of the graph.
 Recursive and iterative implementations of the algorithms.
 Iterative: Use a stack for the depth-first search (dfs)
 Use a queue for the breadth-first search (bfs) 19
GRAPH IMPLEMENTATION
 Array-based implementation
 A 1D array is used to represent the vertices
 A 2D array (adjacency matrix) is used to represent the
edges

20
ARRAY-BASED IMPLEMENTATION

21
GRAPH IMPLEMENTATION (CONT.)
 Linked-list implementation
 A 1D array is used to represent the vertices
 A list is used for each vertex v which
contains the vertices which are adjacent from
v (adjacency list)

22
LINKED-LIST IMPLEMENTATION

23
ADJACENCY MATRIX VS.
ADJACENCY LIST REPRESENTATION

 Adjacency matrix
 Good for dense graphs
 Memory requirements( Consider all edges)
 Connectivity between two vertices can be tested quickly
 Adjacency list
 Good for sparse graphs
 Memory requirements(Consider only connected edges)
 Vertices adjacent to another vertex can be found quickly

24
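The trade-off above can be sketched in C. This is a minimal illustration, assuming a hypothetical 4-vertex undirected graph; the names `insert_edge`, `is_adjacent`, and `degree` are illustrative. One insertion routine fills both representations, the matrix giving O(1) connectivity tests and the lists giving fast enumeration of actual neighbours:

```c
#include <stdio.h>
#include <stdlib.h>

#define V 4  /* illustrative number of vertices */

/* Adjacency matrix: matrix[i][j] == 1 iff edge (i, j) exists.
   O(1) connectivity test; O(V^2) memory (all edges considered). */
int matrix[V][V];

/* Adjacency list: one linked list of neighbours per vertex.
   O(V + E) memory (only connected edges considered). */
struct ListNode { int vertex; struct ListNode *next; };
struct ListNode *adj_list[V];

/* Insert an undirected edge into both representations. */
void insert_edge(int u, int v) {
    matrix[u][v] = matrix[v][u] = 1;

    struct ListNode *a = malloc(sizeof *a);
    a->vertex = v; a->next = adj_list[u]; adj_list[u] = a;

    struct ListNode *b = malloc(sizeof *b);
    b->vertex = u; b->next = adj_list[v]; adj_list[v] = b;
}

/* Matrix strength: test connectivity of two vertices in O(1). */
int is_adjacent(int u, int v) { return matrix[u][v]; }

/* List strength: walk only the actual neighbours of v. */
int degree(int v) {
    int d = 0;
    for (struct ListNode *p = adj_list[v]; p; p = p->next) d++;
    return d;
}
```

For a dense graph the matrix wastes little of its O(V²) space; for a sparse graph the lists store only the edges that exist.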
GRAPH SEARCHING

 Problem: find a path between two nodes of the


graph (e.g., Austin and Washington)
 Methods: Depth-First-Search (DFS) or Breadth-
First-Search (BFS)

25
DEPTH-FIRST-SEARCH (DFS)
 What is the idea behind DFS?
 Travel as far as you can down a path
 Back up as little as possible when you reach a "dead
end" (i.e., next vertex has been "marked" or there is
no next vertex)
 DFS can be implemented efficiently using a
stack

26
DEPTH-FIRST-SEARCH (DFS) (CONT.)
Set found to false
stack.Push(startVertex)
DO
  stack.Pop(vertex)
  IF vertex == endVertex
    Set found to true
  ELSE IF vertex is not marked
    Mark vertex
    Push all unmarked adjacent vertices onto stack
WHILE !stack.IsEmpty() AND !found

IF(!found)
  Write "Path does not exist"

27
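The pseudocode above can be sketched in C with an explicit stack and a visited array. A minimal sketch, assuming an adjacency-matrix graph of at most `MAXV` vertices; the function name `dfs_path_exists` is illustrative:

```c
#include <stdio.h>

#define MAXV 10   /* capacity for this sketch */

/* Iterative DFS with an explicit stack. Each vertex is marked when
   it is processed, so no vertex is expanded twice. Returns 1 if
   endVertex is reachable from start, 0 otherwise. */
int dfs_path_exists(int n, int adj[MAXV][MAXV], int start, int end) {
    int visited[MAXV] = {0};
    int stack[MAXV * MAXV];       /* generous bound on pushed vertices */
    int top = 0;

    stack[top++] = start;
    while (top > 0) {
        int v = stack[--top];
        if (v == end) return 1;   /* path found */
        if (visited[v]) continue; /* skip already-processed vertices */
        visited[v] = 1;
        for (int w = n - 1; w >= 0; w--)   /* reverse order so that   */
            if (adj[v][w] && !visited[w])  /* lower-numbered vertices */
                stack[top++] = w;          /* are popped first        */
    }
    return 0;                     /* stack empty: no path */
}
```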
WALK-THROUGH
Visited Array
F C A
A B
B C
D
H D
E
G E F
G
H

Task: Conduct a depth-first search of the


graph starting with node D
28
Walk-Through
Visited Array
F C A
A B
B C
D
H D √
E
G E F
G
H D
The order nodes are visited:
Visit D
D 29
Walk-Through
Visited Array
F C A
A B
B C
D
H D √
E
G E F
G
H D
The order nodes are visited:
Consider nodes adjacent to D, decide to
D
visit C first 30

Prefer Alphabetical order


Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
E
G E F
G C
H D
The order nodes are visited:
Visit C
D, C 31
Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
E
G E F
G C
The order nodes are visited: H D

D, C
No nodes adjacent to C; cannot
continue ➔ backtrack, i.e., pop the stack
and restore the previous state 32


Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
E
G E F
G
H D
The order nodes are visited:
Back to D – C has been visited,
D, C decide to visit E next33
Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
E √
G E F
G E
H D
The order nodes are visited:
D, C, E
Back to D – C has been visited, 34

decide to visit E next


Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
E √
G E F
G E
H D
The order nodes are visited:
Only G is adjacent to E
D, C, E 35
Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
E √
G E G
F
G √ E
H D
The order nodes are visited:
Visit G
D, C, E, G 36
Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
E √
G E G
F
G √ E
H D
The order nodes are visited:
D, C, E, G Nodes D and H are adjacent to
G. D has already been 37

visited. Decide to visit H.


Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
H
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
Visit H
D, C, E, G, H 38
Walk-Through
Visited Array
F C A
A B
B C √
D
H D √
H
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
Nodes A and B are adjacent to H.
D, C, E, G, H Decide to visit A next. 39
Walk-Through
Visited Array
F C A √
A B
B C √ A
D
H D √
H
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
Visit A
D, C, E, G, H, A 40
Walk-Through
Visited Array
F C A √
A B
B C √ A
D
H D √
H
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
Only Node B is adjacent to A.
D, C, E, G, H, A Decide to visit B next.41
Walk-Through
Visited Array
F C A √
A B √ B
B C √ A
D
H D √
H
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
Visit B
D, C, E, G, H, A, B 42
Walk-Through
Visited Array
F C A √
A B √
B C √ A
D
H D √
H
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
No unvisited nodes adjacent to
D, C, E, G, H, A, B B. Backtrack (pop the stack).
43
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
H
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
No unvisited nodes adjacent to
D, C, E, G, H, A, B A. Backtrack (pop the stack).
44
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E G
F
G √ E
H √ D
The order nodes are visited:
No unvisited nodes adjacent to
D, C, E, G, H, A, B H. Backtrack (pop the
stack). 45
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E F
G √ E
H √ D
The order nodes are visited:
No unvisited nodes adjacent to
D, C, E, G, H, A, B G. Backtrack (pop the
stack). 46
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E F
G √
H √ D
The order nodes are visited:
No unvisited nodes adjacent to
D, C, E, G, H, A, B E. Backtrack (pop the stack).
47
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E F
G √
H √ D
The order nodes are visited:
F is unvisited and is adjacent to
D, C, E, G, H, A, B D. Decide to visit F next.
48
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E F √
G √ F
H √ D
The order nodes are visited:
Visit F
D, C, E, G, H, A, B, F 49
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E F √
G √
H √ D
The order nodes are visited:
No unvisited nodes adjacent to
D, C, E, G, H, A, B, F F. Backtrack. 50
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E F √
G √
H √
The order nodes are visited:
No unvisited nodes adjacent to
D, C, E, G, H, A, B, F D. Backtrack. 51
Walk-Through
Visited Array
F C A √
A B √
B C √
D
H D √
E √
G E F √
G √
H √
The order nodes are visited:
Stack is empty. Depth-first
D, C, E, G, H, A, B, F traversal is done. 52
BREADTH-FIRST-SEARCHING (BFS)
 What is the idea behind BFS?
 Look at all possible paths at the same depth before you
go to a deeper level
 All vertices at distance k from the start are visited
before any vertex at distance k + 1

53
 Can be used to attempt to visit all nodes of a
graph in a systematic manner
 Works with directed and undirected graphs

 Works with weighted and unweighted graphs

Steps
Breadth-first search starts with given node
Then visits nodes adjacent in some specified order
(e.g., alphabetical)

54
BREADTH-FIRST-SEARCHING (BFS)
(CONT.)
 BFS can be implemented efficiently using a queue

Set found to false
queue.Enqueue(startVertex)
DO
  queue.Dequeue(vertex)
  IF vertex == endVertex
    Set found to true
  ELSE
    Enqueue all unenqueued adjacent vertices onto queue
WHILE !queue.IsEmpty() AND !found

IF(!found)
  Write "Path does not exist"

55
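The queue-based pseudocode can be sketched in C. A minimal sketch, assuming an adjacency-matrix graph of at most `MAXV` vertices; `bfs_path_exists` is an illustrative name. The "enqueued" array plays the role of the rules on the next slide: mark a vertex when it enters the queue, visit it when it is dequeued:

```c
#include <stdio.h>

#define MAXV 10  /* capacity for this sketch */

/* Breadth-first search with a simple array queue. The enqueued[]
   array keeps every vertex from entering the queue more than once,
   so a queue of MAXV entries suffices. Returns 1 if end is
   reachable from start. */
int bfs_path_exists(int n, int adj[MAXV][MAXV], int start, int end) {
    int enqueued[MAXV] = {0};
    int queue[MAXV];
    int head = 0, tail = 0;

    queue[tail++] = start;
    enqueued[start] = 1;
    while (head < tail) {
        int v = queue[head++];    /* dequeue = visit */
        if (v == end) return 1;
        for (int w = 0; w < n; w++)
            if (adj[v][w] && !enqueued[w]) {
                enqueued[w] = 1;  /* mark before enqueueing */
                queue[tail++] = w;
            }
    }
    return 0;                     /* queue empty: no path */
}
```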
Walk-Through Enqueued Array

F C A
A B Q→
B C
D
H D
E
G E F
G
H

How is this accomplished? Simply replace the stack
with a queue! Rules: (1) Maintain an enqueued
array. (2) Visit a node when it is dequeued. 56
Walk-Through Enqueued Array

F C A
A B Q→D
B C
D
H D √
E
G E F
G
Nodes visited: H

Enqueue D. Notice, D not yet visited.


57
Walk-Through Enqueued Array

F C A
A B Q→C→E→F
B C √
D
H D √
E √
G E F √
G
Nodes visited: D H

Dequeue D. Visit D. Enqueue unenqueued nodes


58
adjacent to D.
Walk-Through Enqueued Array

F C A
A B Q→E→F
B C √
D
H D √
E √
G E F √
G
Nodes visited: D, C H

Dequeue C. Visit C. Enqueue unenqueued nodes


59
adjacent to C.
Walk-Through Enqueued Array

F C A
A B Q→F→G
B C √
D
H D √
E √
G E F √
G
Nodes visited: D, C, E H

Dequeue E. Visit E. Enqueue unenqueued nodes


60
adjacent to E.
Walk-Through Enqueued Array

F C A
A B Q→G
B C √
D
H D √
E √
G E F √
G √
Nodes visited: D, C, E, F H

Dequeue F. Visit F. Enqueue unenqueued nodes


61
adjacent to F.
Walk-Through Enqueued Array

F C A
A B Q→H
B C √
D
H D √
E √
G E F √
G √
Nodes visited: D, C, E, F, G H √

Dequeue G. Visit G. Enqueue unenqueued nodes


62
adjacent to G.
Walk-Through Enqueued Array

F C A √
A B √ Q→A→ B
B C √
D
H D √
E √
G E F √
G √
Nodes visited: D, C, E, F, G, H H √

Dequeue H. Visit H. Enqueue unenqueued nodes


63
adjacent to H.
Walk-Through Enqueued Array

F C A √
A B √ Q→B
B C √
D
H D √
E √
G E F √
G √
Nodes visited: D, C, E, F, G, H, A H √

Dequeue A. Visit A. Enqueue unenqueued nodes


64
adjacent to A.
Walk-Through Enqueued Array

F C A √
A B √ Q empty
B C √
D
H D √
E √
G E F √
G √
Nodes visited: D, C, E, F, G, H, H √
A, B
Dequeue B. Visit B. Enqueue unenqueued nodes
65
adjacent to B.
Walk-Through Enqueued Array

F C A √
A B √ Q empty
B C √
D
H D √
E √
G E F √
G √
Nodes visited: D, C, E, F, G, H, H √
A, B
Q empty. Algorithm done.
66
DEFINITION
 A Minimum Spanning Tree (MST) is a subgraph
of an undirected graph such that the subgraph
spans (includes) all nodes, is connected, is
acyclic, and has minimum total edge weight

67
MINIMUM SPANNING TREES

Prim’s Algorithm
• Focuses on nodes
Kruskal’s Algorithm
• Focuses on edges, rather than nodes

68
 Minimum Spanning Tree is concerned with connected
undirected graphs.

 The idea is to find another graph with the same number


of vertices, but with minimum number of edges.

 Let G = (V,E) be connected undirected graph. A


minimum spanning tree for G is a graph, T = (V’, E’)
with the following properties:
 V’ = V
 T is connected
 T is acyclic.

69
Example of Minimum Spanning Tree

• The following figure shows a graph G1 together with its three


possible minimum spanning trees.

•a •b •d

•c •e •f

•a •b •d •a •b •d •a •b •d

•c •e •f •c •e •f •c •e •f

70
WHAT IS A MINIMUM-COST SPANNING
TREE
• Minimum-Cost spanning tree is concerned with edge-weighted
connected undirected graphs.
• For an edge-weighted , connected, undirected graph, G, the total
cost of G is the sum of the weights on all its edges.
• A minimum-cost spanning tree for G is a minimum spanning tree
of G that has the least total cost.

71
CONSTRUCTING MINIMUM SPANNING TREE

• A spanning tree can be constructed using any of the traversal


algorithms discussed, and taking note of the edges that are
actually followed during the traversals.

• The spanning tree obtained using breadth-first traversal is called


breadth-first spanning tree, while that obtained with depth-first
traversal is called depth-first spanning tree.

• For example, for the graph in figure 1, figure (c) is a depth-first


spanning tree, while figure (d) is a breadth-first spanning tree.

• In the rest of the lecture, we discuss two algorithms for


constructing minimum-cost spanning tree.
72
ALGORITHM CHARACTERISTICS

 Both Prim’s and Kruskal’s Algorithms work with


undirected graphs
 Both work with weighted and unweighted graphs but
are more interesting when edges are weighted
 Both are greedy algorithms that produce optimal
solutions

73
PRIM’S ALGORITHM

 Similar to Dijkstra’s Algorithm except that


dv( Distance of Vertex ) records edge weights, not path
lengths

74
PRIM’S ALGORITHM
 Prim’s algorithm finds a minimum cost spanning tree by selecting
edges from the graph one-by-one as follows:
 It starts with a tree, T, consisting of the starting vertex, x.
 Then, it adds the shortest edge emanating from x that connects T
to the rest of the graph.
 It then moves to the added vertex and repeat the process.

•Let T be a tree consisting of only the starting vertex x


•While (T has fewer than n vertices)
•{
• find the smallest edge connecting T to G-T
• add it to T
•}
75
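The loop above, together with the dv/pv table used in the walk-through that follows, can be sketched in C. A minimal sketch, assuming a connected graph stored as an adjacency matrix with 0 meaning "no edge"; `prim_mst_cost` is an illustrative name:

```c
#include <limits.h>

#define MAXV 8
#define INF  INT_MAX

/* Prim's algorithm. dv[v] is the weight of the lightest edge
   connecting v to the tree T built so far; pv[v] is the tree
   endpoint of that edge. Each round selects the unselected vertex
   with minimum dv, then updates the dv of its unselected
   neighbours. Returns the total MST cost. */
int prim_mst_cost(int n, int w[MAXV][MAXV], int start) {
    int dv[MAXV], pv[MAXV], in_tree[MAXV];
    int cost = 0;

    for (int v = 0; v < n; v++) { dv[v] = INF; pv[v] = -1; in_tree[v] = 0; }
    dv[start] = 0;

    for (int k = 0; k < n; k++) {
        /* select node with minimum distance */
        int u = -1;
        for (int v = 0; v < n; v++)
            if (!in_tree[v] && (u == -1 || dv[v] < dv[u])) u = v;
        in_tree[u] = 1;
        cost += dv[u];

        /* update distances of adjacent, unselected nodes */
        for (int v = 0; v < n; v++)
            if (w[u][v] && !in_tree[v] && w[u][v] < dv[v]) {
                dv[v] = w[u][v];
                pv[v] = u;
            }
    }
    return cost;
}
```

The final cost is the sum of the dv values of the selected vertices, exactly as the walk-through tallies Σ dv at the end.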
WALK-THROUGH
2
Initialize array
3
10
F C K dv pv
A 7 3 A False  −
8 4
18 B False  −
4
9
B D C False  −
10
H 25 D False  −
2
3 E False  −
G 7
E F False  −
G False  −
H False  −

•dv( Distance of Vertex )


•Pv( Previous Vertex )
76
2
Start with any node, say D
3
10
F C K dv pv
A 7
4
3 A
8 B
18
4
9
B D C
10
H 25 D T 0 −
2
3 E
G 7
E F
G
H

77
2 Update distances of
adjacent, unselected nodes
3
10
F C K dv pv
A 7
4
3 A
8 B
18
4
9
B D C 3 D
10
H 25 D T 0 −
2
3 E 25 D
G 7
E F 18 D
G 2 D
H

78
2 Select node with
minimum distance
3
10
F C
A 7 3 K dv pv
8 4
18 A
4
9
B D B
10
H 25
C 3 D
2
3 D T 0 −
G 7
E E 25 D
F 18 D
G T 2 D
H

79
2 Update distances of
adjacent, unselected nodes
3
10
F C K dv pv
A 7
4
3 A
8 B
18
4
9
B D C 3 D
10
H 25 D T 0 −
2
3 E 7 G
G 7
E F 18 D
G T 2 D
H 3 G

80
2 Select node with
minimum distance
3
10
F C K dv pv
A 7
4
3 A
8 B
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E 7 G
G 7
E F 18 D
G T 2 D
H 3 G

81
2 Update distances of
adjacent, unselected nodes
3
10
F C K dv pv
A 7
4
3 A
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E 7 G
G 7
E F 3 C
G T 2 D
H 3 G

82
2 Select node with
minimum distance
3
10
F C K dv pv
A 7
4
3 A
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E 7 G
G 7
E F T 3 C
G T 2 D
H 3 G

83
2 Update distances of
adjacent, unselected nodes
3
10
F C K dv pv
A 7
4
3 A 10 F
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E 2 F
G 7
E F T 3 C
G T 2 D
H 3 G

84
2 Select node with
minimum distance
3
10
F C K dv pv
A 7
4
3 A 10 F
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E T 2 F
G 7
E F T 3 C
G T 2 D
H 3 G

85
2 Update distances of
adjacent, unselected nodes
3
10
F C K dv pv
A 7
4
3 A 10 F
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E T 2 F
G 7
E F T 3 C
G T 2 D
H 3 G
Table entries unchanged

86
2 Select node with
minimum distance
3
10
F C K dv pv
A 7
4
3 A 10 F
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E T 2 F
G 7
E F T 3 C
G T 2 D
H T 3 G

87
2 Update distances of
adjacent, unselected nodes
3
10
F C K dv pv
A 7
4
3 A 4 H
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E T 2 F
G 7
E F T 3 C
G T 2 D
H T 3 G

88
2 Select node with
minimum distance
3
10
F C K dv pv
A 7
4
3 A T 4 H
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E T 2 F
G 7
E F T 3 C
G T 2 D
H T 3 G

89
2 Update distances of
adjacent, unselected nodes
3
10
F C K dv pv
A 7
4
3 A T 4 H
8 B 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E T 2 F
G 7
E F T 3 C
G T 2 D
H T 3 G
Table entries unchanged

90
2 Select node with
minimum distance
3
10
F C K dv pv
A 7
4
3 A T 4 H
8 B T 4 C
18
4
9
B D C T 3 D
10
H 25 D T 0 −
2
3 E T 2 F
G 7
E F T 3 C
G T 2 D
H T 3 G

91
2 Cost of Minimum
Spanning Tree = Σ dv = 21
3
F C K dv pv
A 4
3 A T 4 H
B T 4 C
4
B D C T 3 D
H D T 0 −
2
3 E T 2 F
G E F T 3 C
G T 2 D
H T 3 G

Done
92
Kruskal’s Algorithm

• Work with edges, rather than nodes


• Two steps:
– Sort edges by increasing edge weight
– Select the first |V| – 1 edges that do not
generate a cycle

93
KRUSKAL'S ALGORITHM.
 Kruskal's algorithm also finds the minimum cost spanning tree of
a graph by adding edges one-by-one.

 However, Kruskal's algorithm forms a forest of trees, which are
joined together incrementally to form the minimum cost spanning
tree.

•sort the edges of G in increasing order of weights


•form a forest by taking each vertex in G to be a tree
•for each edge e in sorted order
• if the endpoints of e are in separate trees
• join the two tree to form a bigger tree
•return the resulting single tree

94
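The forest-joining step of the pseudocode can be sketched in C with a simple union-find structure: two endpoints are in separate trees exactly when their roots differ. A minimal sketch, assuming at most 64 vertices; `kruskal_mst_cost` and `struct Edge` are illustrative names:

```c
#include <stdlib.h>

struct Edge { int u, v, w; };

static int parent[64];            /* union-find forest, one tree per vertex */

static int find_root(int x) {
    while (parent[x] != x) x = parent[x];
    return x;
}

static int cmp_weight(const void *a, const void *b) {
    return ((const struct Edge *)a)->w - ((const struct Edge *)b)->w;
}

/* Sort the m edges by increasing weight, then accept each edge whose
   endpoints lie in different trees, joining the two trees. Stops
   after |V|-1 accepted edges; returns the total cost (assumes the
   graph is connected). */
int kruskal_mst_cost(int n, struct Edge *edges, int m) {
    int cost = 0, taken = 0;

    for (int v = 0; v < n; v++) parent[v] = v;   /* each vertex its own tree */
    qsort(edges, m, sizeof *edges, cmp_weight);

    for (int i = 0; i < m && taken < n - 1; i++) {
        int ru = find_root(edges[i].u), rv = find_root(edges[i].v);
        if (ru != rv) {           /* separate trees: edge creates no cycle */
            parent[ru] = rv;      /* join the two trees */
            cost += edges[i].w;
            taken++;
        }
    }
    return cost;
}
```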
Walk-Through
Consider an undirected, weight graph
3
10
F C
A 4
4
3
8
6
5
4
B D
4
H 1
2
3
G 3
E

95
Sort the edges by increasing edge weight
3
10
F C edge dv edge dv

A 4
4
3 (D,E) 1 (B,E) 4
8
6
5 (D,G) 2 (B,F) 4
4
B D
4
H 1
(E,G) 3 (B,H) 4
2
3 (C,D) 3 (A,H) 5
G 3
E
(G,H) 3 (D,F) 6

(C,F) 3 (A,B) 8

(B,C) 4 (A,F) 10

96
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4
8 4
6 (D,G) 2 (B,F) 4
5
4
B D (E,G) 3 (B,H) 4
4
H 1
(C,D) 3 (A,H) 5
2
3 (G,H) 3 (D,F) 6
G 3
E (C,F) 3 (A,B) 8
(B,C) 4 (A,F) 10

97
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4
4
8
6 (D,G) 2  (B,F) 4
5
4
B D (E,G) 3 (B,H) 4
4
H 1
(C,D) 3 (A,H) 5
2
3 (G,H) 3 (D,F) 6
G 3
E (C,F) 3 (A,B) 8
(B,C) 4 (A,F) 10

98
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4
4
8
6 (D,G) 2  (B,F) 4
5
4
B D (E,G) 3  (B,H) 4
4
H 1
(C,D) 3 (A,H) 5
2
3 (G,H) 3 (D,F) 6
G 3
E (C,F) 3 (A,B) 8
(B,C) 4 (A,F) 10

Accepting edge (E,G) would create a cycle

99
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4
4
8
6 (D,G) 2  (B,F) 4
5
4
B D (E,G) 3  (B,H) 4
4
H 1
(C,D) 3  (A,H) 5
2
3 (G,H) 3 (D,F) 6
G 3
E (C,F) 3 (A,B) 8
(B,C) 4 (A,F) 10

100
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4
4
8
6 (D,G) 2  (B,F) 4
5
4
B D (E,G) 3  (B,H) 4
4
H 1
(C,D) 3  (A,H) 5
2
3 (G,H) 3  (D,F) 6
G 3
E (C,F) 3 (A,B) 8
(B,C) 4 (A,F) 10

101
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4
4
8
6 (D,G) 2  (B,F) 4
5
4
B D (E,G) 3  (B,H) 4
4
H 1
(C,D) 3  (A,H) 5
2
3 (G,H) 3  (D,F) 6
G 3
E (C,F) 3  (A,B) 8
(B,C) 4 (A,F) 10

102
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4
4
8
6 (D,G) 2  (B,F) 4
5
4
B D (E,G) 3  (B,H) 4
4
H 1
(C,D) 3  (A,H) 5
2
3 (G,H) 3  (D,F) 6
G 3
E (C,F) 3  (A,B) 8
(B,C) 4  (A,F) 10

103
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4 
4
8
6 (D,G) 2  (B,F) 4
5
4
B D (E,G) 3  (B,H) 4
4
H 1
(C,D) 3  (A,H) 5
2
3 (G,H) 3  (D,F) 6
G 3
E (C,F) 3  (A,B) 8
(B,C) 4  (A,F) 10

104
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv
A 4 3
8 4 (D,E) 1  (B,E) 4 
6
5
4
B D (D,G) 2  (B,F) 4 
4
H 1 (E,G) 3  (B,H) 4
2
3 (C,D) 3  (A,H) 5
G 3
E
(G,H) 3  (D,F) 6

(C,F) 3  (A,B) 8

(B,C) 4  (A,F) 10

105
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4 
4
8
6 (D,G) 2  (B,F) 4 
5
4
B D (E,G) 3  (B,H) 4 
4
H 1
(C,D) 3  (A,H) 5
2
3 (G,H) 3  (D,F) 6
G 3
E (C,F) 3  (A,B) 8
(B,C) 4  (A,F) 10

106
Select first |V|–1 edges which do not
generate a cycle
3
10
F C edge dv edge dv

A 4 3 (D,E) 1  (B,E) 4 
4
8
6 (D,G) 2  (B,F) 4 
5
4
B D (E,G) 3  (B,H) 4 
4
H 1
(C,D) 3  (A,H) 5 
2
3 (G,H) 3  (D,F) 6
G 3
E (C,F) 3  (A,B) 8
(B,C) 4  (A,F) 10

107
Select first |V|–1 edges which do not
generate a cycle

[resulting spanning tree: edges (D,E), (D,G), (C,D), (G,H), (C,F), (B,C), (A,H)]

edge dv edge dv
(D,E) 1 ✓ (B,E) 4 ✗
(D,G) 2 ✓ (B,F) 4 ✗
(E,G) 3 ✗ (B,H) 4 ✗
(C,D) 3 ✓ (A,H) 5 ✓
(G,H) 3 ✓ (D,F) 6 }
(C,F) 3 ✓ (A,B) 8 } not considered
(B,C) 4 ✓ (A,F) 10 }

Done
Total Cost = Σ dv = 21 108


• Kruskal’s algorithm
1. Select the shortest edge in a network
2. Select the next shortest edge which does not create a cycle
3. Repeat step 2 until all vertices have been connected

• Prim’s algorithm
1. Select any vertex
2. Select the shortest edge connected to that vertex
3. Select the shortest edge connected to any vertex already connected
4. Repeat step 3 until all vertices have been connected
109
•Some points to note

•Both algorithms will always give solutions with


the same length.

•They will usually select edges in a different order

•Occasionally they will use different edges – this


may happen when you have to choose between
edges with the same length. In this case there is
more than one minimum connector for the
network.

110
SINGLE-SOURCE SHORTEST-PATH
PROBLEM

 There are multiple paths from a source vertex to a


destination vertex
 Shortest path: the path whose total weight (i.e., sum
of edge weights) is minimum
 Examples:
 Austin->Houston->Atlanta->Washington: 1560 miles
 Austin->Dallas->Denver->Atlanta->Washington: 2980
miles

111
SINGLE-SOURCE SHORTEST-PATH
PROBLEM (CONT.)
 Common algorithms: Dijkstra's algorithm,
Bellman-Ford algorithm
 BFS can be used to solve the shortest graph
problem when the graph is weightless or all the
weights are the same
(mark vertices before Enqueue)

112
Dijkstra(L[1…n, 1…n]): array[2…n]

Array D[2…n]
Initialize C = {2, 3, …, n}   { S = N \ C }
For i = 2 to n do D[i] = L[1, i]
Loop n-2 times
  v = some element of C minimizing D[v]
  C = C \ {v}   { S = S ∪ {v} }
  For w in C do
    D[w] = min(D[w], D[v] + L[v, w])
Return D 113
/* Dijkstra's single-source shortest paths over a cost matrix.
   Vertices are numbered 1..n; cost[u][v] holds the edge weight,
   with a large sentinel (here 999) meaning "no edge". */
void dij(int n, int s, int cost[10][10], int dist[])
{
    int i, u = s, count, v, flag[10], min;

    for (i = 1; i <= n; i++) {
        flag[i] = 0;
        dist[i] = cost[s][i];    /* direct distance from the source */
    }
    count = 2;
    while (count <= n) {
        /* select the closest unprocessed vertex u */
        min = 999;
        for (v = 1; v <= n; v++)
            if (dist[v] < min && !flag[v]) {
                min = dist[v];
                u = v;
            }
        flag[u] = 1;
        count++;
        /* relax the edges leaving u */
        for (v = 1; v <= n; v++)
            if ((dist[u] + cost[u][v] < dist[v]) && !flag[v])
                dist[v] = dist[u] + cost[u][v];
    }
}
114
SYMBOL TABLE & HASH TABLE

1
 A compiler is composed of an analysis part and a synthesis part

Compiler

Analysis          Synthesis

2
3
ANALYSIS

 Lexical Analysis (produces a stream of tokens)

 Syntax Analyzer (checks that the program is syntactically correct)

 Semantic Analyzer

 Together these phases check the grammar and create an
intermediate representation of the program
4
 Lexical analysis is the first phase of a compiler. It takes the
modified source code from language preprocessors that are
written in the form of sentences.
 The lexical analyzer breaks these syntaxes into a series of
tokens, by removing any whitespace or comments in the source
code.
 The input string of characters of the source program are divided
into sequence of characters that have a logical meaning, and are
known as tokens
 In programming language, keywords, constants, identifiers,
strings, numbers, operators and punctuations symbols can be
considered as tokens.
 int value = 100; contains the tokens:

int (keyword), value (identifier), = (operator),


100 (constant) and ; (symbol)
5
 Optimization is a program transformation technique,
which tries to improve the code by making it consume
less resources (i.e. CPU, Memory) and deliver high
speed.

 The final phase of the compiler is the generation of


target code, consisting normally of relocatable
machine code or assembly code

6
Lexemes are said to be a sequence of characters (alphanumeric)
in a token.

7
 The lexical analyzer also follows rule priority where a
reserved word, e.g., a keyword, of a language is given priority
over user input. That is, if the lexical analyzer finds a lexeme
that matches with any existing reserved word, it should
generate an error.
 it involves grouping the tokens of the source program into
grammatical phrases that are used by the compiler to
synthesize output

 The semantic analysis phase checks the source program for semantic errors
and gathers type information for the subsequent code-generation phase.
It uses the hierarchical structure determined by the syntax-analysis phase to
identify the operators and operands of expressions and statements.
 An important component of semantic analysis is type checking.

8
COMPILER

9
 Symbol table is a data structure used by a language translator
such as a compiler or interpreter, where each identifier in a
program's source code is associated with information relating to
its declaration or appearance in the source, such as
its type, scope level and sometimes its location.

 A data structure called a symbol table is generally used to


store information about various source language
constructs.

 The information is collected by the analysis phases of the


compiler and used by the synthesis phases to generate the
target code

10
 A common implementation technique is to use
a hash table

 The symbol table is accessed by most phases of a


compiler, beginning with the lexical analysis to
optimization.

 When accessing variables and allocating
memory dynamically, a compiler must perform
much bookkeeping, and so the extended stack
model requires the symbol table.

11
 Symbol table is an important data structure
created and maintained by compilers in order to
store information about the occurrence of various
entities such as variable names, function names,
objects, classes, interfaces, etc.

 Symbol table is used by both the analysis and


the synthesis parts of a compiler.

12
 Consider the following program written in C
// Declare an external function

extern double Abc(double x);


// Define a public function
double fun (int count)
{ double sum = 0.0 ;
// Sum all the values Abc(1) to Abc(count)
for (int i = 1; i <= count; i++) sum += Abc((double) i);
return sum;
}

13
Symbol name Type Scope

Abc function, double extern

function
x double
parameter

fun function, double global

function
count int
parameter
sum double block local
for-loop
i int
statement

In addition to this the symbol table also has dynamic entries


suppose in for loop value of any variable ‘i’ changes it is
always reflected in table
14
SCOPE RULE OF SYMBOL TABLE

15
OPERATIONS ON SYMBOL TABLE
 A symbol table, either linear or hash, should provide the
following operations.
 insert()
 This operation is more frequently used by analysis phase, i.e.,
the first half of the compiler where tokens are identified and
names are stored in the table. This operation is used to add
information in the symbol table about unique names occurring
in the source code

 Scope Management :- A compiler maintains two types of


symbol tables: a global symbol table which can be accessed by
all the procedures and scope symbol tables that are created
for each scope in the program.

16
 lookup()
 lookup() operation is used to search a name in the
symbol table to determine:
 if the symbol exists in the table.

 if it is declared before it is being used.

 if the name is used in the scope.

 if the symbol is initialized.

 if the symbol is declared multiple times.

17
HASH TABLE
 a hash table, or a hash map, is a data structure
that associates keys with values (attributes).

 A hash table generalizes the simpler notion of an ordinary


array.
 The array index is computed from the key.

 Directly addressing into an ordinary array makes


effective use of our ability to examine an arbitrary position
in an array in O(1) time.
 Used for
 Look-Up Table
 Dictionary
 Cache
18
 Extended Array
HASHING

 Key-value pairs are stored in a fixed size table


called a hash table.
 A hash table is partitioned into many buckets.
 Each bucket has many slots.
 Each slot holds one record.
 A hash function f(x) transforms the identifier (key) into an
address in the hash table

19
 We can insert an element into the hash table using direct
addressing

 In direct addressing we calculate the slot in the
bucket where the data can be stored

 The problems with direct addressing are:

 When a bucket is completely filled we cannot insert
any more elements

 The same slot may be assigned to two different
elements (a collision)
20
 A hash function that maps every key to a distinct slot
is called a perfect hash function

 To measure how full a table is we use the identifier
density and the load factor

 Let n be the total number of elements in the table

 T is the total number of possible key values

 Identifier density is n/T; the load factor is α = n/(s·b)

 where s is the number of slots per bucket and b the
number of buckets

21
[figure: hash table of B buckets (0 … B-1), each bucket holding m slots]

22
WHAT IS A HASH TABLE ?
 The simplest kind of hash table is an array of records.
 This example has 701 records.

[0] [1] [2] [3] [4] [5] [ 700]

... 23

An array of records
WHAT IS A HASH TABLE ?
 Each record has a special field,
called its key. [4]
 In this example, the key is a
Number 506643548
long integer field called
Number.

[0] [1] [2] [3] [4] [5] [ 700]


... 25
WHAT IS A HASH TABLE ?
 The number might be a
person's identification [4]
number, and the rest of the
record has information about Number 506643548

the person.

[0] [1] [2] [3] [4] [5] [ 700]


... 26
WHAT IS A HASH TABLE ?
 When a hash table is in use, some spots contain valid
records, and other spots are "empty".

[0] [1] [2] [3] [4] [5] [ 700]


Number 281942902 Number 233667136 Number 506643548 Number 155778322

...
27
INSERTING A NEW RECORD
 In order to insert a new
record, the key must Number 580625685
somehow be converted to an
array index.
 The index is called the hash
value of the key.

[0] [1] [2] [3] [4] [5] [ 700]


Number 281942902 Number 233667136 Number 506643548 Number 155778322

...
28
INSERTING A NEW RECORD
 Typical way to create a hash
value: Number 580625685

(Number mod 701)

What is (580625685 mod 701) ?

[0] [1] [2] [3] [4] [5] [ 700]


Number 281942902 Number 233667136 Number 506643548 Number 155778322

...
29
INSERTING A NEW RECORD
 The hash value is used for the [3]
location of the new record. Number 580625685

[3]

[0] [1] [2] [4] [5] . . . [ 700]


Number 281942902 Number 233667136 Number 506643548 Number 155778322
31
INSERTING A NEW RECORD
 The hash value is used for the location of the new
record.

[0] [1] [2] [3] [4] [5] [ 700]


Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

...
32
COLLISIONS
 Here is another new record
to insert, with a hash value Number 701466868

of 2.

My hash
value is
[2].

[0] [1] [2] [3] [4] [5] [ 700]


Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

...
33
COLLISION RESOLUTION POLICIES
 Two classes
 Closed hashing (open addressing)
 After a collision the record is stored in some other empty

 slot in the table itself

 Linear Probing

 Quadratic Probing

 Double Hashing

 Open hashing (chaining)
 After a collision records are stored outside the table, on a

 linked list attached to the bucket
34
LINEAR PROBING
 Linear probing (linear hashing, sequential probing):
f(i) = i
 Insert: When there is a collision we just probe the next slot
in the table. If it is unoccupied – we store the key there. If it
is occupied – we continue probing the next slot
 One main disadvantage of linear probing is it forms a
cluster such clustering increases time to search for a record

35
Consider an example where we want to insert elements.

131, 22 , 31 , 11 , 121 , 61 , 7 , 8 and size of hash table is-10

Hash key: 131 % 10 = 1
22 % 10 = 2
31 % 10 = 1, but index 1 is occupied, so probe to the next slot

Index Data
0

1 131
2 22
3 31
4 11
5 121
6 61
7 7
8 8
36
9
10
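The table above can be reproduced with a short C routine. A minimal sketch, assuming a 10-slot table with -1 marking an empty slot; `lp_insert` and `lp_init` are illustrative names:

```c
#define TABLE_SIZE 10
#define EMPTY (-1)

/* Mark every slot of the table as empty. */
void lp_init(int table[TABLE_SIZE]) {
    for (int i = 0; i < TABLE_SIZE; i++) table[i] = EMPTY;
}

/* Linear probing: on a collision at h, try h+1, h+2, ... (mod the
   table size) until an empty slot is found. Returns the index the
   key was placed at, or -1 if the table is full. */
int lp_insert(int table[TABLE_SIZE], int key) {
    int h = key % TABLE_SIZE;
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (h + i) % TABLE_SIZE;
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;                    /* no free slot */
}
```

Inserting 131, 22, 31, 11, 121, 61, 7, 8 in that order reproduces the slide's table: 31, 11, 121 and 61 all hash to index 1 and probe forward to 3, 4, 5 and 6, illustrating the cluster that linear probing builds.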
 This is called a collision,
because there is already Number 701466868
another valid record at [2].

When a collision
occurs,
move forward until you
find an empty spot.
[0] [1] [2] [3] [4] [5] [ 700]
Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

...
37
 This is called a collision,
Number 701466868
because there is already
another valid record at [2].

When a collision
occurs,
move forward until
you
find an empty spot.
[0] [1] [2] [3] [4] [5] [ 700]
Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

...
38
 This is called a collision, Number 701466868
because there is already
another valid record at [2].

When a collision
occurs,
move forward until
you
find an empty spot.
[0] [1] [2] [3] [4] [5] [ 700]
Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

...
39
 This is called a collision, because there is already
another valid record at [2].

The new record


goes
in the empty spot.

[0] [1] [2] [3] [4] [5] [ 700]


Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

...
40
QUADRATIC PROBING

 Quadratic Probing
 Quadratic probing: f(i) = i2
 A quadratic function is used to compute the next index in
the table to be probed.
 Example:
 In linear probing we check the i-th position. If it is occupied,
we check the i+1st position, next we check the i+2nd, etc.

 In quadric probing, if the i-th position is occupied we check


the i+1st, next we check the i+4th, next - i + 9th etc.
 The idea here is to skip regions in the table with possible
clusters.
41
Hi(K) = (H(K) + i²) % n, taking i = 0, 1, 2, 3, …

Index Data
0 90
1 131
2 22
3 33
4
5 121
6 87
7 7
8 8
9

Insert 121:
H0(121) = (121 + 0²) % 10 → 1 (occupied)
H1(121) = (121 + 1²) % 10 → 2 (occupied)
H2(121) = (121 + 2²) % 10 → 5 (free: 121 goes to index 5)

Next insert 87:
H0(87) = (87 + 0²) % 10 → 7 (occupied)
H1(87) = (87 + 1²) % 10 → 8 (occupied)
H2(87) = (87 + 2²) % 10 → 1 (occupied)
H3(87) = (87 + 3²) % 10 → 6 (free: 87 goes to index 6)
42
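The probe sequence h, h+1, h+4, h+9, … can be sketched in C. A minimal sketch, assuming the same 10-slot table with -1 for empty; `qp_insert` is an illustrative name. Note that for i = 0 … n-1 a quadratic sequence does not necessarily visit every slot, so the routine may report failure even when the table has room:

```c
#define TABLE_SIZE 10
#define EMPTY (-1)

/* Quadratic probing: on a collision at h, try (h + 1), (h + 4),
   (h + 9), ... (mod the table size). The quadratic skips jump over
   the clusters that linear probing would walk through. Returns the
   index used, or -1 if the probe sequence is exhausted. */
int qp_insert(int table[TABLE_SIZE], int key) {
    int h = key % TABLE_SIZE;
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (h + i * i) % TABLE_SIZE;
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

Preloading the slide's table and inserting 121 and 87 reproduces the worked example: 121 lands at index 5 after probing 1 and 2, and 87 lands at index 6 after probing 7, 8 and 1.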
INSERTING A NEW RECORD
• Division Method:- The hash function depends on the reminder of
division typically the divisor is length of table (10) or Prime
Number

Index Value
• H(K)=K (mod)m
• 4=54%10 0
• 2=72%10 1
• Using prime Number
2 72
• Choose prime number close to 99

so that table size can be increased i.e 97 3

• H(3205)=4 4 54

43
INSERTING A NEW RECORD

• Mid-Square Method: the key is squared, and the hash function is defined as H(K) = l, where l is obtained from the middle digits of the square:
  K = 3205, K² = 10272025, l = 72
• Folding Method: the key K is partitioned into a number of parts, where each part has the same number of digits (except possibly the last):
  H(K) = K1 + K2 + K3 + … + Kn
  H(3205) = 32 + 05 = 37
  The record will be placed at location 37.
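The three hash-function families above can be sketched in a few lines; the constants follow the slides' worked examples (table size 10, prime 97, key 3205), and the function names are illustrative.

```python
def division_hash(k, m):
    # Division method: the remainder of K divided by m.
    return k % m

def mid_square_hash(k, digits=2):
    # Mid-square method: square the key and take its middle digits.
    sq = str(k * k)                          # 3205^2 = 10272025
    mid = (len(sq) - digits) // 2
    return int(sq[mid:mid + digits])

def folding_hash(k, part_len=2):
    # Folding method: split the key into equal-length parts and add them.
    s = str(k)
    parts = [int(s[i:i + part_len]) for i in range(0, len(s), part_len)]
    return sum(parts)

print(division_hash(3205, 97))   # 4
print(mid_square_hash(3205))     # 72
print(folding_hash(3205))        # 32 + 05 = 37
```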
DOUBLE HASHING

 Double-hash probing
 A technique which eliminates both primary and secondary clustering is double hashing.
 The idea is to compute a second hash value for the element to be inserted.
 h1(key) = key % M, where M is the size of the table or a prime number; we want 0 ≤ h1(key) < capacity.
 Using this value, search this sequence of alternative locations: (h1(key) + i · h2(key)) % M for i = 1, 2, …
 This second hash function needs to generate integers somewhere in the range from 1 (not 0) to capacity − 1.
 h2(key) = 1 + (key % M′), where M′ is a prime number less than or equal to the size of the table.
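A minimal sketch of the double-hashing probe sequence described above, assuming a table size M and a smaller prime M′ for the second hash (both constants are illustrative):

```python
M = 10          # table size (illustrative)
M_PRIME = 7     # prime smaller than the table size

def h1(key):
    # First hash: the home index.
    return key % M

def h2(key):
    # Second hash: always in 1 .. M_PRIME, never 0, so probing always advances.
    return 1 + (key % M_PRIME)

def probe_sequence(key, limit=5):
    # The i-th probe lands at (h1 + i * h2) % M.
    return [(h1(key) + i * h2(key)) % M for i in range(limit)]

print(probe_sequence(23))  # step size h2(23) = 1 + 2 = 3 -> [3, 6, 9, 2, 5]
```

Because the step size depends on the key itself, two keys with the same home index usually follow different probe sequences, which is what removes secondary clustering.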
OPEN HASHING

 Each bucket in the hash table is the head of a linked list.
 All elements that hash to a particular bucket are placed on that bucket's linked list.
 Records within a bucket can be ordered in several ways: by order of insertion, or by key value order.
OPEN HASHING (CHAINING)

From a set of keys k1, k2, …, kn, each key is appended to the list at its bucket:

0 → k2 → k3 → …
1
2
3
4
…
D-1 → …
CHAINING

Chaining without replacement involves maintaining two tables in memory: the data table, and a chain table whose entries link keys with the same hash value (−1 marks the end of a chain or an empty slot).

Index   Data   Chain
0       -1     -1
1       131    2
2       21     5
3       3      -1
4       64     -1
5       61     7
6       66     -1
7       71     -1
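The chain values in the table above can be reproduced with a sketch like the following. It is illustrative, not the course's code: each slot is assumed to be a `[data, chain]` pair, the free slot is found by probing forward from the home slot, and a new key is always appended to its home slot's chain.

```python
# Chaining without replacement: each slot holds [data, chain], where chain is
# the index of the next key with the same hash value, or -1 at the end.
def chained_insert(table, key):
    size = len(table)
    home = key % size
    if table[home][0] is None:
        table[home] = [key, -1]
        return home
    # Find the next free slot by probing forward from the home slot...
    free = home
    while table[free][0] is not None:
        free = (free + 1) % size
    # ...then link it at the end of the home slot's chain.
    cur = home
    while table[cur][1] != -1:
        cur = table[cur][1]
    table[cur][1] = free
    table[free] = [key, -1]
    return free

table = [[None, -1] for _ in range(10)]
for k in (131, 21, 3, 64, 61, 66, 71):   # the keys from the table above
    chained_insert(table, k)
# Keys 131, 21, 61, 71 all hash to 1, so slot 1 chains to 2, 2 to 5, 5 to 7.
```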
SEARCHING FOR A KEY

 The data that's attached to a key can be found fairly quickly.
 Calculate the hash value, then check that location of the array for the key.
 Example: searching for Number 701466868, start by computing the hash value, which is 2 in this case. Then check location [2].
 If location [2] has a different key than the one you are looking for, then move forward.
 Keep moving forward until you find the key, or you reach an empty spot.
 When the item is found, the information can be copied to the necessary location.
 What if a search reaches an empty spot? In that case, it can halt and indicate that the key was not in the hash table.

[0] [1] [2] [3] [4] [5] ... [700]
Number 281942902 | Number 233667136 | Number 580625685 | Number 506643548 | Number 701466868 | ... | Number 155778322
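The search procedure above can be sketched as follows, assuming the same linear-probing table layout as earlier (a sketch with illustrative names, `None` marking an empty spot):

```python
def search(table, key):
    # Follow the probe sequence until the key is found or an empty spot stops us.
    size = len(table)
    i = key % size
    for _ in range(size):
        if table[i] is None:
            return -1        # empty spot: the key is not in the table
        if table[i] == key:
            return i         # found it
        i = (i + 1) % size   # "not me" -- move forward
    return -1

table = [None] * 7
table[2] = 9
table[3] = 16
print(search(table, 16))  # hashes to 2, moves forward, found at 3
print(search(table, 23))  # hashes to 2, probes 2 and 3, empty at 4 -> -1
```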
DELETING A RECORD

 Records may also be deleted from a hash table.
 But the location must not be left as an ordinary "empty spot", since that could interfere with searches. (Remember that a search can stop when it reaches an empty spot.)
 The location must be marked in some special way, so that a search can tell that the spot used to have something in it.

[0] [1] [2] [3] [4] [5] ... [700]
Number 281942902 | Number 233667136 | Number 580625685 | (deleted) | Number 701466868 | ... | Number 155778322
o When an empty slot is encountered during insertion, that slot can be used to store the new record.

o To avoid inserting duplicate keys, it will still be necessary for the search procedure to follow the probe sequence until a truly empty position has been found.
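The special "used to have something in it" marker is commonly called a tombstone; a sketch under the same linear-probing assumptions as before (the `DELETED` sentinel and function names are illustrative):

```python
DELETED = object()   # tombstone: marks a spot that used to hold a record

def delete(table, key):
    size = len(table)
    i = key % size
    for _ in range(size):
        if table[i] is None:
            return False             # truly empty spot: the key was never here
        if table[i] is not DELETED and table[i] == key:
            table[i] = DELETED       # mark, don't blank, so searches probe past
            return True
        i = (i + 1) % size
    return False

def insert(table, key):
    size = len(table)
    i = key % size
    for _ in range(size):
        if table[i] is None or table[i] is DELETED:
            table[i] = key           # reuse a tombstone or take an empty slot
            return i
        i = (i + 1) % size
    raise RuntimeError("table full")

table = [None] * 7
insert(table, 9)
insert(table, 16)      # both hash to 2 -> slots 2 and 3
delete(table, 9)       # slot 2 becomes a tombstone, not None
insert(table, 23)      # 23 hashes to 2 and reuses the tombstone at slot 2
```

As the last bullet above notes, a duplicate-avoiding insert should remember the first tombstone it sees but keep probing to a truly empty position before reusing it; this sketch omits that check for brevity.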