In the linked representation of a binary tree T, approximately half of the entries in the left and right pointer fields contain NULL. The space occupied by these NULL entries can be put to use by storing other useful pointers in it. Such special pointers are called threads, and a binary tree that contains them is called a threaded binary tree. In diagrams, threads are drawn as dotted lines. There are several ways to thread a binary tree:

1. The right NULL pointer of each node can be replaced by a thread to the successor of that node under inorder traversal. Such a pointer is called a right thread, and the tree is called a right threaded tree or right threaded binary tree.
2. The left NULL pointer of each node can be replaced by a thread to the predecessor of that node under inorder traversal. Such a pointer is called a left thread, and the tree is called a left threaded tree.
3. Both left and right NULL pointers can be used to point to the predecessor and the successor of that node, respectively, under inorder traversal. Such a tree is called a fully threaded tree.

A threaded binary tree in which only one kind of thread is used is also known as a one way threaded tree; one in which both kinds are used is also known as a two way threaded tree.

Memory Representation of Threaded Binary Tree

In the above representation, the char field thread serves as a tag: it holds 0 when the right pointer is a normal pointer to a child and 1 when it is a thread.
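As a rough illustration of how such a tag might be used, here is a minimal sketch of a right threaded node and a non-recursive inorder traversal over it. The struct layout and the names tnode, leftmost and threaded_inorder are assumptions made for this sketch; the representation referred to above is not reproduced in the text.

```c
#include <stddef.h>

/* Hypothetical node of a right threaded binary tree. */
struct tnode {
    int data;
    struct tnode *left;   /* ordinary left child, or NULL */
    struct tnode *right;  /* right child, or thread to the inorder successor */
    char thread;          /* 0: right is a normal pointer, 1: right is a thread */
};

/* Leftmost node of the subtree rooted at p (p must not be NULL). */
static struct tnode *leftmost(struct tnode *p)
{
    while (p->left != NULL)
        p = p->left;
    return p;
}

/* Inorder traversal without recursion or an explicit stack.
   Stores at most cap values in out and returns the number of nodes visited. */
size_t threaded_inorder(struct tnode *root, int *out, size_t cap)
{
    size_t n = 0;
    struct tnode *p = (root != NULL) ? leftmost(root) : NULL;

    while (p != NULL) {
        if (n < cap)
            out[n] = p->data;
        n++;
        if (p->thread)
            p = p->right;            /* thread leads straight to the successor */
        else if (p->right != NULL)
            p = leftmost(p->right);  /* successor is leftmost node of right subtree */
        else
            p = NULL;                /* last node in inorder: no successor */
    }
    return n;
}
```

The point of the thread is visible in the loop: when a node has no right child, its successor is reached in one step instead of by unwinding a stack.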

You have seen that linked lists can represent an ordered collection of values without using arrays. Although linked lists require more memory than arrays (each node has to store an address), they have definite advantages: items can be inserted and deleted without considerable movement of data, and the ordering relationship among the values is obtained through pointers. However, we need not restrict ourselves to linear structures. In this chapter we extend the use of pointers to define a non-linear structure that models hierarchical relationships, such as a family tree. In such a tree, links run from an ancestor to a parent and from the parent to its children. There are many other examples of tree-structured hierarchies.

Directory hierarchies: In computers, files are stored in directories that form a tree. The top-level directory represents the root. It has many subdirectories and files, and the subdirectories in turn have further sets of subdirectories.

Organization charts: In a company a number of vice presidents report to a president. Each VP has a set of general managers, each GM has his or her own set of managers, and so on.

Biological classifications: Starting from living beings at the root, such a tree can branch off to mammals, birds, marine life, etc.

Game trees: In any game that requires only mental effort, each position offers a number of possible moves, and each move has a number of possible counter-moves. This repetitive pattern results in what is known as a game tree.

Tree as a data structure

A tree is a data structure made of nodes and pointers, much like a linked list. The difference lies in how they are organized: the top node in the tree is called the root, and all other nodes branch off from it.


Every node in the tree can have some number of children. Each child node can in turn be the parent of its own children, and so on. A child node is linked from exactly one parent. Any node higher up than the parent is called an ancestor node. Nodes that have no children are called leaves. Any node that is neither the root nor a leaf is called an interior node. The height of a tree is defined as the length of the longest path from the root to a leaf in that tree (counting the root itself, so a tree consisting of only a root has height 1). A common example of a tree structure is the binary tree.

Binary Trees

Definition: A binary tree is a tree in which each node can have at most two children. Thus each node has no child, one child, or two children. The pointers identify whether a child is a left child or a right child.

Application of a binary tree

Before we define any formal algorithms, let us look at one possible application of a binary tree. Consider a set of numbers: 25, 63, 13, 72, 18, 32, 59, 67.

Suppose we store these numbers in the individual nodes of a singly linked list. To search for a particular item we have to go through the list, possibly all the way to the end. Thus if there are n numbers, the search complexity is O(n). Is that because the numbers are not in any particular sequence? Now suppose we order the numbers (13, 18, 25, 32, 59, 63, 67, 72) and store them in another linked list. What is the search complexity now? You may be surprised to discover that it is still O(n). Binary search with O(log n) complexity cannot be applied to a linked list, because there is no direct access to the middle element: you still have to follow each link to locate a particular number. So a linear linked structure does not help us at all.

Let us see if we can improve the situation by storing the data in a binary tree. Consider the following binary tree, in which the numbers have been stored in a specific order: the value at any node is greater than the values stored in its left subtree and less than the values stored in its right subtree.
            59
          /    \
        18      67
       /  \    /  \
     13   32  63   72
          /
        25
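A search in a tree arranged this way can be sketched as follows. This is only an illustration: the function name bst_search and the steps counter are assumptions of the sketch, and the node type anticipates the tree_node structure introduced later in the text.

```c
#include <stddef.h>

struct tree_node {
    int data;
    struct tree_node *left_child;
    struct tree_node *right_child;
};

/* Looks for key starting at p. Returns the node holding key, or NULL.
   *steps receives the number of nodes examined along the way. */
struct tree_node *bst_search(struct tree_node *p, int key, int *steps)
{
    *steps = 0;
    while (p != NULL) {
        (*steps)++;
        if (key == p->data)
            return p;
        /* smaller keys live in the left subtree, larger ones in the right */
        p = (key < p->data) ? p->left_child : p->right_child;
    }
    return NULL;
}
```

Each comparison discards one whole subtree, so the number of nodes examined is bounded by the number of levels in the tree rather than by the number of values stored.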

With this arrangement any search takes at most 4 steps. For a larger set of numbers, if we can come up with a good tree arrangement, then the search time can be reduced dramatically.

Examples of binary trees:

[Figure: two example binary trees, each drawn with its root at the top]

The following are NOT binary trees:


If node n2 is the root of the left or right subtree of node n1 in a binary tree, then n1 is the parent of n2 and n2 is the left or right child of n1.

The level of a node in a binary tree:
- The root of the tree has level 0.
- The level of any other node in the tree is one more than the level of its parent.

[Figure: a binary tree with its rows labeled Level 0 (the root) through Level 3]

Full Binary Tree

How many nodes?

Level 0: 1 node  (height 1)
Level 1: 2 nodes (height 2)
Level 2: 4 nodes (height 3)
Level 3: 8 nodes (height 4)

Total number of nodes (maximum) in a tree of height h: n = 2^h - 1, and therefore h = log2(n + 1).


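[Figure: a full binary tree of height 4; below the root, level 1 holds 24 and 76, level 2 holds 14, 32, 61, 87, and level 3 holds 8, 20, 27, 37, 56, 67, 81, 94]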
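The node count can be checked with a few lines of C. The helper names max_nodes and levels_needed are made up for this sketch, and height is counted as in this section (a tree with only a root has height 1).

```c
/* Maximum number of nodes in a binary tree of height h: 2^h - 1.
   Assumes h is smaller than the number of bits in unsigned long. */
unsigned long max_nodes(unsigned h)
{
    return (1UL << h) - 1UL;
}

/* Smallest height whose full binary tree can hold n nodes.
   For n = 2^h - 1 this returns exactly h, i.e. log2(n + 1). */
unsigned levels_needed(unsigned long n)
{
    unsigned h = 0;
    while (n > 0) {
        n >>= 1;   /* each halving corresponds to one level */
        h++;
    }
    return h;
}
```

For the four levels listed above, 1 + 2 + 4 + 8 = 15 = 2^4 - 1, which is what max_nodes(4) computes.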

A binary tree has a natural implementation in linked storage. A tree is referenced with a pointer to its root.

Recursive definition of a binary tree: A binary tree is either
- empty, or
- a node (called the root) together with two binary trees (called the left subtree and the right subtree of the root).

Each node of a binary tree has both a left and a right subtree, which can be reached with pointers:

    struct tree_node {
        int data;
        struct tree_node *left_child;
        struct tree_node *right_child;
    };

[Figure: node layout with the three fields left_child, data, right_child]

Note that the definition is recursive: a tree is a node whose structure contains more trees. In effect, there is a tree rooted at every node of a tree.
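To make the linked representation concrete, here is one possible way to allocate and release nodes. The helpers new_node and free_tree are not part of the original text; they are assumptions of this sketch.

```c
#include <stdlib.h>

struct tree_node {
    int data;
    struct tree_node *left_child;
    struct tree_node *right_child;
};

/* Allocates a leaf holding data. Returns NULL if memory is exhausted. */
struct tree_node *new_node(int data)
{
    struct tree_node *p = malloc(sizeof *p);
    if (p != NULL) {
        p->data = data;
        p->left_child = NULL;   /* empty left subtree */
        p->right_child = NULL;  /* empty right subtree */
    }
    return p;
}

/* Releases a whole tree. The recursion mirrors the recursive definition:
   free both subtrees, then the node itself. */
void free_tree(struct tree_node *p)
{
    if (p != NULL) {
        free_tree(p->left_child);
        free_tree(p->right_child);
        free(p);
    }
}
```

A small tree can then be built by assigning the result of new_node to the left_child and right_child fields of an existing node.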

Traversal of Binary Trees

Linked lists are traversed sequentially from the first node to the last node. However, there is no such natural linear order for the nodes of a tree. Different orderings are possible for traversing a binary tree. Every node in the tree is a root for the subtree that it points to. There are three common traversals for binary trees:
- Preorder
- Inorder
- Postorder

These names are chosen according to the sequence in which the root node and its children are visited. Suppose there are only 3 nodes in the tree, with n1 as the root, n2 as its left child, and n3 as its right child:

       n1
      /  \
    n2    n3

With inorder traversal the order is left child, root node, right child.
With preorder traversal the order is root node, left child, right child.
With postorder traversal the order is left child, right child, root node.

Inorder:   n2 n1 n3
Preorder:  n1 n2 n3
Postorder: n2 n3 n1


A tree will typically have more than 3 nodes. Instead of nodes n2 and n3 there would be subtrees:

            root
           /    \
      Left        Right
      subtree     subtree

With inorder traversal the order is the left subtree, then the root, and finally the right subtree. Thus the root is visited in between the left and right subtrees. With preorder traversal the root node is visited first, then the nodes in the left subtree, followed by the nodes in the right subtree. With postorder traversal the root is visited after both subtrees have been visited (left subtree followed by right subtree). As the structure of a binary tree is recursive, the traversal algorithms are inherently recursive.

Algorithm for Preorder traversal

In a preorder traversal, we first visit the root node. If there is a left child, we visit the left subtree (all its nodes) in preorder fashion starting with that left child. If there is a right child, we then visit the right subtree in preorder fashion starting with that right child. The function may seem very simple, but the real power lies in the recursive formulation. In fact there is a double recursion. The real job is done by the system on the runtime stack. This simplifies coding while putting a heavier burden on the system.

    void preorder(struct tree_node *p)
    {
        if (p != NULL) {
            printf("%d\n", p->data);
            preorder(p->left_child);
            preorder(p->right_child);
        }
    }

Take a tree of, say, height 3 with perhaps 6 nodes, and run the above recursion by hand to find the actual order in which the nodes are printed.

Example:

        a
       / \
      b   c
         / \
        d   e
       / \
      f   g

Preorder traversal: a b c d f g e, that is, a (root), b (left subtree), c d f g e (right subtree).

Algorithm for Inorder traversal

In an inorder traversal, we first visit the left subtree (all its nodes), then the root node, and then the right subtree.

    void inorder(struct tree_node *p)
    {
        if (p != NULL) {
            inorder(p->left_child);
            printf("%d\n", p->data);
            inorder(p->right_child);
        }
    }

Example (tree with root a, whose children are b and c; c has children d and e; d has children f and g):

Inorder traversal: b a f d g c e, that is, b (left subtree), a (root), f d g c e (right subtree).

Algorithm for Postorder traversal

In a postorder traversal, we first visit the left subtree (all its nodes), then the right subtree (all its nodes), and finally the root node.

    void postorder(struct tree_node *p)
    {
        if (p != NULL) {
            postorder(p->left_child);
            postorder(p->right_child);
            printf("%d\n", p->data);
        }
    }

Example (tree with root a, whose children are b and c; c has children d and e; d has children f and g):

Postorder traversal: b f g d e c a, that is, b (left subtree), f g d e c (right subtree), a (root).
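The three example orders can be reproduced with a small variant of the traversal functions that collects node labels in a buffer instead of printing them. The type cnode and the function names ending in _to are hypothetical and exist only for this sketch.

```c
#include <stddef.h>

/* Node with a single-character label, as in the a..g example tree. */
struct cnode {
    char data;
    struct cnode *left_child;
    struct cnode *right_child;
};

/* Each function appends the labels it visits to buf, advancing *n.
   The caller must supply a buffer large enough for all nodes. */
void preorder_to(const struct cnode *p, char *buf, size_t *n)
{
    if (p != NULL) {
        buf[(*n)++] = p->data;                  /* root first */
        preorder_to(p->left_child, buf, n);
        preorder_to(p->right_child, buf, n);
    }
}

void inorder_to(const struct cnode *p, char *buf, size_t *n)
{
    if (p != NULL) {
        inorder_to(p->left_child, buf, n);
        buf[(*n)++] = p->data;                  /* root in between */
        inorder_to(p->right_child, buf, n);
    }
}

void postorder_to(const struct cnode *p, char *buf, size_t *n)
{
    if (p != NULL) {
        postorder_to(p->left_child, buf, n);
        postorder_to(p->right_child, buf, n);
        buf[(*n)++] = p->data;                  /* root last */
    }
}
```

The only difference between the three functions is the position of the line that records the root, which is exactly what the names preorder, inorder and postorder describe.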

Finding Maximum value in a given tree p

    int findMax(struct tree_node *p)
    {
        int node_data, leftmax, rightmax, max;

        max = -1;   /* assume all values in the tree are positive integers */
        if (p != NULL) {
            node_data = p->data;
            leftmax = findMax(p->left_child);
            rightmax = findMax(p->right_child);
            /* find the largest of the three values */
            if (leftmax > rightmax)
                max = leftmax;
            else
                max = rightmax;
            if (node_data > max)
                max = node_data;
        }
        return max;
    }

For a tree with three nodes:

        45
       /  \
     24    76

node_data = 45
leftmax = 24 (max of 24, -1, -1)
rightmax = 76 (max of 76, -1, -1)
max = max of 45, 24, 76 = 76

For a larger tree, leftmax is the maximum of the whole left subtree:

          45
         /  \
       24    76
      /  \
    14    32

leftmax = max(24, 14, 32) = 32
rightmax = 76
max of tree = max(45, leftmax, rightmax) = 76
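findMax relies on the assumption that every value is positive, because it uses -1 to stand for an empty subtree. One possible way to drop that assumption is to report emptiness separately from the value. The function tree_max below is a sketch of that idea, not part of the original text.

```c
#include <stddef.h>

struct tree_node {
    int data;
    struct tree_node *left_child;
    struct tree_node *right_child;
};

/* Stores the largest value of the tree in *out and returns 1.
   Returns 0, leaving *out untouched, when the tree is empty. */
int tree_max(const struct tree_node *p, int *out)
{
    int m;

    if (p == NULL)
        return 0;
    *out = p->data;                                   /* start with the root */
    if (tree_max(p->left_child, &m) && m > *out)      /* best of left subtree */
        *out = m;
    if (tree_max(p->right_child, &m) && m > *out)     /* best of right subtree */
        *out = m;
    return 1;
}
```

With this shape the function also works for trees that contain zero or negative values, at the cost of a slightly less convenient call.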

Finding sum of values of all the nodes of a tree

To find the sum, add the value of the current node to the sum of the values of all nodes in the left subtree and the sum of the values of all nodes in the right subtree.

    int sum(struct tree_node *p)
    {
        if (p != NULL)
            return p->data + sum(p->left_child) + sum(p->right_child);
        else
            return 0;
    }

AVL Trees
The Concept
These are self-adjusting, height-balanced binary search trees and are named after the inventors: Adelson-Velskii and Landis. A balanced binary search tree has Theta(lg n) height and hence Theta(lg n) worst case lookup and insertion times. However, ordinary binary search trees have a bad worst case. When sorted data is inserted, the binary search tree is very unbalanced, essentially more of a linear list, with Theta(n) height and thus Theta(n) worst case insertion and lookup times. AVL trees overcome this problem.

Here the height of a binary tree is the maximum path length (the number of links followed) from the root to a leaf. A single-node binary tree has height 0, and an empty binary tree has height -1. As another example, the following binary tree has height 3.
          7
        /   \
      3      12
     /      /  \
    2     10    20
         /  \
        9    11

An AVL tree is a binary search tree in which every node is height balanced, that is, the difference in the heights of its two subtrees is at most 1. The balance factor of a node is the height of its right subtree minus the height of its left subtree. An equivalent definition, then, for an AVL tree is that it is a binary search tree in which each node has a balance factor of -1, 0, or +1. Note that a balance factor of -1 means that the subtree is left-heavy, and a balance factor of +1 means that the subtree is right-heavy. For example, in the following AVL tree, note that the root node with balance factor +1 has a right subtree of height 1 more than the height of the left subtree. (The balance factors are shown at the top of each node.)
            +1
            30
          /    \
       -1        0
       22        62
      /        /    \
     0       +1      -1
     5       44      95
               \    /
                0  0
               51  77
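The definitions of height and balance factor translate directly into code. The sketch below recomputes heights on every call, which is simple but slow; the function names are assumptions of this sketch. Note that is_height_balanced checks only the balance condition, not the binary search tree ordering.

```c
#include <stddef.h>

struct tree_node {
    int data;
    struct tree_node *left_child;
    struct tree_node *right_child;
};

/* Height as used for AVL trees: empty tree -1, single node 0. */
int height(const struct tree_node *p)
{
    int hl, hr;

    if (p == NULL)
        return -1;
    hl = height(p->left_child);
    hr = height(p->right_child);
    return 1 + (hl > hr ? hl : hr);
}

/* Height of the right subtree minus height of the left subtree.
   p must not be NULL. */
int balance_factor(const struct tree_node *p)
{
    return height(p->right_child) - height(p->left_child);
}

/* Returns 1 if every node has balance factor -1, 0 or +1, else 0. */
int is_height_balanced(const struct tree_node *p)
{
    int bf;

    if (p == NULL)
        return 1;
    bf = balance_factor(p);
    return bf >= -1 && bf <= 1
        && is_height_balanced(p->left_child)
        && is_height_balanced(p->right_child);
}
```

Applied to the tree shown above, balance_factor gives +1 at node 30, -1 at node 22 and 0 at node 62, matching the numbers written over the nodes.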

The idea is that an AVL tree is close to being completely balanced. Hence it should have Theta(lg n) height (it does - always) and so have Theta(lg n) worst case insertion and lookup times. An AVL tree does not have a bad worst case, like a binary search tree which can become very unbalanced and give Theta(n) worst case lookup and insertion times. The following binary search tree is not an AVL tree. Notice the balance factor of -2 at node 70.
                  -1
                  100
               /       \
            -2           -1
            70           150
          /    \        /    \
        +1      0     +1      0
        30      80    130     180
       /  \              \
      0    -1             0
      10   40             140
          /
         0
         36

Inserting a New Item

Initially, a new item is inserted just as in a binary search tree. Note that the item always goes into a new leaf. The tree is then readjusted as needed in order to maintain it as an AVL tree. There are three main cases to consider when inserting a new node.

Case 1:
A node with balance factor 0 changes to +1 or -1 when a new node is inserted below it. No change is needed at this node. Consider the following example. Note that after an insertion one only needs to check the balances along the path from the new leaf to the root.

          0
          40
        /    \
      +1      0
      20      50
        \    /  \
         0  0    0
        30  45   70

After inserting 60 we get:

          +1
          40
        /    \
      +1      +1
      20      50
        \    /  \
         0  0    -1
        30  45   70
                /
               0
               60

Case 2:
A node with balance factor -1 changes to 0 when a new node is inserted in its right subtree. (Similarly for +1 changing to 0 when inserting in the left subtree.) No change is needed at this node. Consider the following example.
             -1
             40
          /      \
        +1         0
        20         50
       /  \       /  \
      0    0     0    0
      10   30    45   70
          /  \
         0    0
         22   32

After inserting 60 we get:

              0             <-- the -1 changed to a 0 (case 2)
             40
          /      \
        +1         +1       <-- an example of case 1
        20         50
       /  \       /  \
      0    0     0    -1    <-- an example of case 1
      10   30    45   70
          /  \        /
         0    0      0
         22   32     60

Case 3:
A node with balance factor -1 changes to -2 when a new node is inserted in its left subtree. (Similarly for +1 changing to +2 when inserting in the right subtree.) Change is needed at this node. The tree is restored to an AVL tree by using a rotation.

Subcase A: This consists of the following situation, where P denotes the parent of the subtree being examined, LC is P's left child, and X is the new node added. Note that inserting X makes P have a balance factor of -2 and LC have a balance factor of -1. The -2 must be fixed. This is accomplished by doing a right rotation at P. Note that rotations do not disturb the order of the nodes given by an inorder traversal. This is very important, since it means that we still have a legitimate binary search tree. (Note, too, that the mirror image situation is also included under subcase A.)
          (rest of tree)
                |
               -2
                P
              /   \
           -1      subtree of
           LC      height n
          /  \
  subtree of  subtree of
  height n    height n
     /
    X

The fix is to use a single right rotation at node P. (In the mirror image case a single left rotation is used at P.) This gives the following picture.
          (rest of tree)
                |
                0
                LC
              /    \
    subtree of       0
    height n         P
       /           /   \
      X    subtree of   subtree of
           height n     height n

Consider the following more detailed example that illustrates subcase A.

               -1
               80
             /    \
          -1        -1
          30        100
         /  \       /
        0    0     0
        15   40    90
       /  \
      0    0
      10   20

We then insert 5 and then check the balance factors from the new leaf up toward the root. (Always check from the bottom up.)
                 -2
                 80
               /    \
            -2        -1
            30        100
           /  \       /
         -1    0     0
         15    40    90
        /  \
      -1    0
      10    20
      /
     0
     5



This reveals a balance factor of -2 at node 30 that must be fixed. (Since we work bottom up, we reach the -2 at 30 first. The other -2 problem will go away once we fix the problem at 30.) The fix is accomplished with a right rotation at node 30, leading to the following picture.
               -1
               80
             /    \
           0        -1
           15       100
          /  \      /
        -1    0    0
        10    30   90
        /    /  \
       0    0    0
       5    20   40
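In terms of pointers, the single right rotation used in this example only changes two links. The following is a minimal sketch; the name rotate_right is an assumption, and updating the stored balance numbers is deliberately left out.

```c
#include <stddef.h>

struct tree_node {
    int data;
    struct tree_node *left_child;
    struct tree_node *right_child;
};

/* Single right rotation at p. p and its left child must not be NULL.
   Returns the new root of the subtree; the caller must store it in the
   link that used to point to p (the parent's child pointer or the root). */
struct tree_node *rotate_right(struct tree_node *p)
{
    struct tree_node *lc = p->left_child;

    p->left_child = lc->right_child;  /* LC's right subtree moves under P */
    lc->right_child = p;              /* P becomes LC's right child */
    return lc;
}
```

In the example, rotating at node 30 makes 15 the new subtree root, moves 20 from being the right child of 15 to being the left child of 30, and node 80 must then point to 15 instead of 30.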

Recall that the mirror image situation is also included under subcase A. The following is a general illustration of this situation. The fix is to use a single left rotation at P. See if you can draw a picture of the following after the left rotation at P. Then draw a picture of a particular example that fits our general picture below and fix it with a left rotation.
      (rest of tree)
            |
           +2
            P
          /   \
 subtree of    +1
 height n      RC
              /  \
     subtree of   subtree of
     height n     height n
                      \
                       X
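For checking your drawing afterwards, the mirror image rotation can be sketched in code as follows. The name rotate_left is an assumption of this sketch, and balance numbers are again not maintained here.

```c
#include <stddef.h>

struct tree_node {
    int data;
    struct tree_node *left_child;
    struct tree_node *right_child;
};

/* Single left rotation at p. p and its right child must not be NULL.
   Returns the new root of the subtree, which the caller must link in
   where p used to be. */
struct tree_node *rotate_left(struct tree_node *p)
{
    struct tree_node *rc = p->right_child;

    p->right_child = rc->left_child;  /* RC's left subtree moves under P */
    rc->left_child = p;               /* P becomes RC's left child */
    return rc;
}
```

A simple particular case is the chain 10, 20, 30 in which each node is the right child of the previous one: a left rotation at 10 makes 20 the root with children 10 and 30.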

Subcase B: This consists of the following situation, where P denotes the parent of the subtree being examined, LC is P's left child, NP is the node that will be the new parent, and X is the new node added. X might be added to either of the subtrees of height n-1. Note that inserting X makes P have a balance factor of -2 and LC have a balance factor of +1. The -2 must be fixed. This is accomplished by doing a double rotation at P (explained below). (Note that the mirror image situation is also included under subcase B.)
          (rest of tree)
                |
               -2
                P
              /   \
           +1      subtree of
           LC      height n
          /  \
 subtree of   -1
 height n     NP
             /  \
       subtree  subtree
       n-1      n-1
        /
       X

The fix is to use a double right rotation at node P. A double right rotation at P consists of a single left rotation at LC followed by a single right rotation at P. (In the mirror image case a double left rotation is used at P. This consists of a single right rotation at the right child RC followed by a single left rotation at P.) In the above picture, the double rotation gives the following (where we first show the result of the left rotation at LC, then a new picture for the result of the right rotation at P).
          (rest of tree)
                |
               -2
                P
              /   \
           -2      subtree of
           NP      height n
          /  \
         0    subtree
         LC   n-1
        /  \
subtree of  subtree
height n    n-1
             /
            X

Finally we have the following picture after doing the right rotation at P.
             (rest of tree)
                   |
                   0
                   NP
               /        \
             0            +1
             LC            P
            /  \          /  \
   subtree of  subtree  subtree  subtree of
   height n    n-1      n-1      height n
                /
               X
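Written with pointers, the double right rotation is just the two single rotations applied one after the other. The sketch below uses hypothetical helper names and does not update balance numbers.

```c
#include <stddef.h>

struct tree_node {
    int data;
    struct tree_node *left_child;
    struct tree_node *right_child;
};

/* Single left rotation; returns the new subtree root. */
static struct tree_node *rotate_left(struct tree_node *p)
{
    struct tree_node *rc = p->right_child;
    p->right_child = rc->left_child;
    rc->left_child = p;
    return rc;
}

/* Single right rotation; returns the new subtree root. */
static struct tree_node *rotate_right(struct tree_node *p)
{
    struct tree_node *lc = p->left_child;
    p->left_child = lc->right_child;
    lc->right_child = p;
    return lc;
}

/* Double right rotation at p: left rotation at LC, then right rotation at P.
   Requires p, p->left_child and p->left_child->right_child (NP) to exist.
   Returns NP, the new root of the subtree. */
struct tree_node *double_rotate_right(struct tree_node *p)
{
    p->left_child = rotate_left(p->left_child);  /* NP moves above LC */
    return rotate_right(p);                      /* NP moves above P */
}
```

After the call, NP has LC as its left child and P as its right child, and NP's two former subtrees have been handed to LC (as its right subtree) and P (as its left subtree).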

Consider the following concrete example of subcase B.

                  -1
                  80
               /      \
             0          0
             30         100
           /    \       /  \
         -1      0     0    0
         20      50    90   120
         /      /  \
        0      0    0
        10     40   60

After inserting 55, we get a problem, a balance factor of -2 at the root node, as seen below.
                  -2
                  80
               /      \
            +1          0
            30          100
           /   \        /  \
         -1     +1     0    0
         20     50     90   120
         /     /  \
        0     0    -1
        10    40   60
                   /
                  0
                  55

As discussed above, this calls for a double rotation. First we do a single left rotation at 30. This gives the following picture.
                    -2
                    80
                 /      \
              -1          0
              50          100
             /  \         /  \
          -1     -1      0    0
          30     60      90   120
         /  \    /
       -1    0  0
       20    40 55
       /
      0
      10

Finally, the right rotation at 80 restores the binary search tree to be an AVL tree. The resulting picture is shown below.
                  0
                  50
              /        \
           -1            0
           30            80
          /  \         /    \
        -1    0      -1      0
        20    40     60      100
        /            /       /  \
       0            0       0    0
       10           55      90   120
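Putting the three cases together, a complete insertion routine might look like the following sketch in C. It is not the example program described below: the node type avl_node, the choice to store a height in each node instead of a balance number, and all function names are assumptions made for illustration.

```c
#include <stdlib.h>

struct avl_node {
    int key;
    int height;                    /* leaf = 0; an empty subtree counts as -1 */
    struct avl_node *left, *right;
};

static int height_of(const struct avl_node *p) { return p ? p->height : -1; }

static void update_height(struct avl_node *p)
{
    int hl = height_of(p->left), hr = height_of(p->right);
    p->height = 1 + (hl > hr ? hl : hr);
}

/* Balance factor as defined in the text: right height minus left height. */
int balance_factor(const struct avl_node *p)
{
    return height_of(p->right) - height_of(p->left);
}

static struct avl_node *rotate_right(struct avl_node *p)
{
    struct avl_node *lc = p->left;
    p->left = lc->right;
    lc->right = p;
    update_height(p);              /* p is now the lower node: fix it first */
    update_height(lc);
    return lc;
}

static struct avl_node *rotate_left(struct avl_node *p)
{
    struct avl_node *rc = p->right;
    p->right = rc->left;
    rc->left = p;
    update_height(p);
    update_height(rc);
    return rc;
}

/* Inserts key below p and returns the (possibly new) root of that subtree.
   Duplicate keys are ignored; if malloc fails the tree is left unchanged. */
struct avl_node *avl_insert(struct avl_node *p, int key)
{
    if (p == NULL) {               /* the new item always goes into a new leaf */
        struct avl_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->key = key;
            n->height = 0;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (key < p->key)
        p->left = avl_insert(p->left, key);
    else if (key > p->key)
        p->right = avl_insert(p->right, key);
    else
        return p;

    update_height(p);              /* checked bottom up as the recursion unwinds */
    if (balance_factor(p) == -2) {                 /* case 3, left side */
        if (balance_factor(p->left) > 0)           /* subcase B */
            p->left = rotate_left(p->left);
        return rotate_right(p);                    /* subcase A, or second half of B */
    }
    if (balance_factor(p) == 2) {                  /* case 3, mirror image */
        if (balance_factor(p->right) < 0)
            p->right = rotate_right(p->right);
        return rotate_left(p);
    }
    return p;                      /* cases 1 and 2: no change needed here */
}
```

A caller keeps the result of each call, as in root = avl_insert(root, key). Inserting 80, 30, 100, 20, 50, 90, 120, 10, 40, 60 in that order builds the tree of the concrete example without any rotation, and inserting 55 afterwards triggers the double rotation traced above.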

Example Program


bstnode.h bstnode.cpp bstree.h bstree.cpp avlnode.h avlnode.cpp avltree.h avltree.cpp avltest.cpp

This example program inserts some characters into an AVL tree, uses a print routine to see that the AVL tree is correct, and tries out other features such as the copy constructor, the Find function, etc.

The class AVLClass is derived by public inheritance from the class BSTClass. Since we have already implemented binary search trees, and AVL trees are a specialized form of binary search tree, this allows considerable code reuse. The public functions provided are a constructor, a copy constructor, a destructor, an overloaded assignment operator, and a function to insert a new item. The following public functions are inherited from BSTClass and hence are also available: NumItems, Empty, Find, and Print. AVLClass also has numerous private functions to carry out the various rotations and the like. Note that the Print function prints the binary search tree sideways, as it is much easier to do so. Note, too, that AVLClass is named as a friend of BSTClass so that it can have direct access to the private data fields of the latter class. This makes the coding simpler. There are no new data fields added in the derived class.

AVLNodeClass is derived by public inheritance from BSTNodeClass. In BSTNodeClass, note that the data fields are protected fields instead of the usual private fields. (Public fields are also possible but rarely used, since they violate the principle of information hiding.) A protected field is directly accessible in a publicly derived class, but is not accessible elsewhere, such as from our application program in the main function. This means that we do not need to make AVLNodeClass a friend of BSTNodeClass in order for the derived class to have direct access to the data fields. AVLNodeClass only needs to add one data field to the inherited ones: a field to hold the balance number for this node.

One messy feature of the above example is that in many places a cast is required so that a pointer to a BSTNodeClass node is seen as a pointer to an AVLNodeClass node. For example, the CopySubtree function in AVLClass contains:
NewLeftPtr = CopySubtree(reinterpret_cast <AVLNodePtr> (Current->Left))

This is needed because the data field Left is a pointer to the wrong type of node, a BSTNodeClass node. Since CopySubtree expects a pointer to an AVLNodeClass node, we need the cast. One type of node is derived by inheritance from the other and is almost the same, but to the compiler they are different types. So a pointer to one type of node is not of the same type as a pointer to the other type of node. Hence we need the cast. Similar casts occur in the application code found in the main function, such as:
Result = reinterpret_cast <AVLNodePtr> (AVLTreeA.Find('E'))

This one is needed since we are using the inherited Find function on an AVL tree, which belongs to the derived class. Our Find function returns a pointer to the wrong type of node, so we fix it up with the cast. Code reuse helps a lot, but here it does add the nuisance of a cast.

