
CONCEPT OF DATA STRUCTURES

A data structure is an organized grouping of data items treated as a unit. It is the arrangement of
individual items of data, which may come in different forms. The logical or mathematical model of a
particular organization of data is called a data structure.

TERMS USED IN DATA STRUCTURES

a) Data item: A data item is a single unit of value. It is a raw fact which becomes
information after processing.
b) Entity: An entity is anything that has attributes or properties which may be assigned
values, and about which there is a need to record data, e.g. people, objects and events
such as an employee, a student or an item of stock. The values may be numeric or non-numeric.
c) Attributes: These are facts about the state of an object. They are the individual properties
of the entity; for example, the attributes of a student (entity) will include the “Name”,
“Sex”, “Address”, “Matric. Number”, “Year of entry”, etc. Entities with similar attributes,
for example all the 200 level Computer Science & Statistics students, form an entity set.
d) Algorithm: A finite sequence of instructions, each of which has a clear meaning and can
be executed with a finite amount of effort in finite time. Whatever the input values, an
algorithm will definitely terminate after executing a finite number of instructions.
Characteristics of algorithm:
Has a finite set of steps with definite instructions.
Instructions have definite order.
Algorithm must eventually stop.
Actions are deterministic.
e) Name: A word that an entity is known by e.g. School, Williams.
f) Value range: All the possible values that could be assigned to a given attribute of an entity set
are called the range of values of the attribute, i.e. the limits between the lowest value and the
highest value within which the value varies.
g) Data type: A data type is the characteristic of the programmed data, such as Characters
(Alphabets), Numbers (Real and Integer), Logical statements (Boolean expressions) and
Graphics. It can also be said to be a set of values together with the operations defined on the
values: {(values), (operations)}. The operations are performed on the values defined, e.g. for
integers, (-4, -1, 1, 3, 4) are values while (+, -, *, /) are operations. Data types also allow us to
associate meaning with sequences of bits in the computer memory, e.g. string AA@, integer 5 etc.

   Data type       Operations       Storage representation
   1. Integer      *, +, -, /       2's complement, sign magnitude
   2. Real
   3. Boolean      AND, OR, NOT     True = 1, False = 0
   4. Character                     8 bits ASCII/EBCDIC; length followed by sequence of characters

UNITS FOR IDENTIFYING DATA

(i) Character – A character is the smallest element in a file or the smallest unit of
information and can be alphabetic, numeric or special e.g. letters, digits and
special symbols such as + (Plus sign), - (minus sign), \, /, $, a, b, z, A, B…. Z
etc. The name Olu1 contains four (4) characters.
(ii) Field – A field is an item of data within a record. It is a single unit of information
representing an attribute of an entity and it is made up of a number of characters
e.g. a name, a date, an amount etc. In a database concept fields are usually in
columns of a given table. Examples are Hameen, NCS/13/03456.
(iii) Record – A record is made up of a number of logically related fields of a given
entity e.g. a student record, an employee payroll record etc. For instance, all the
information about a student (Name, age, sex, matric. No., level) are his/her
record.
(iv) Files – A file consists of a number of records of the entities in a given entity set.
e.g. a students’ file, NDI file etc.

TOOLS FOR STUDYING DATA

1. Symbols - A symbol is a sign, number, letter etc. that is used to represent data e.g
can be used to represent 2 men.

2. Relation – A relation shows the connection between two operands. Two operands are
compared to define the relationship between them. e.g. 4 > 2.
Symbols for expressing relations
A relational operator defines the comparison between the two operands. The relational
operators are
=   for Equal to
>   for Greater than
<   for Less than
<>  for Not equal to
<=  for Less than or equal to
>=  for Greater than or equal to
The result of the comparison is either “True” or “False” e.g. 4 > 2 is True.
3. Graphs – A planned drawing consisting of a line or lines, showing how two or more sets
of numbers or operands are related to each other. Data from tables can be put into
graphical form to show their results. Graphs are natural models that are used to represent
arbitrary relationships among data objects.
Properties of Graph
a) Routes – A route is the path through which a graph is traced (through the vertices).
b) Edges – An edge is a connection between two vertices of a graph. A graph is usually
depicted in pictorial form, in which the vertices appear as dots or other shapes
and the edges are shown as lines joining the appropriate points.
A graph consists of two things:
i. A set V of elements called nodes (or points or vertices)
ii. A set E of edges such that each edge e in E is identified with a unique
unordered pair [u, v] of nodes in V, denoted by e = [u, v].
Sometimes we indicate the parts of a graph by writing G = (V, E). Suppose e = [u,
v]; then the nodes u and v are called the endpoints of e, and u and v are said to be adjacent
nodes or neighbors.
c) Sequences – A path P of length n from a node u to a node v is defined as a
sequence of n + 1 nodes, P = (v0, v1, ..., vn), such that u = v0, vn = v and each node
vi-1 is adjacent to vi for i = 1, 2, ..., n.

(A typical example of a Graph)

d) Directed and Non-Directed Graphs – A graph G is said to be directed if each edge e is
identified with an ordered pair (u, v) of nodes in G rather than an unordered pair [u, v].
Directed graphs are also called digraphs.
The following terminology applies to an edge e = (u, v) of a directed graph:
• e begins at u and ends at v.
• u is the origin or initial point of e, and v is the destination or terminal point of e.
• u is the predecessor of v, and v is the successor or neighbor of u.
• u is adjacent to v, and v is adjacent to u.

In the figure, the edges e2 and e3 are said to be parallel since each begins at B and ends at A.
The edge e7 is a loop, since it begins and ends at the same point. A directed graph
is said to be connected or strongly connected if for each pair u, v of nodes in G
there is a path from u to v and there is also a path from v to u. On the other hand,
G is said to be unilaterally connected if for any pair u, v of nodes in G there is a
path from u to v or a path from v to u. An undirected graph is a graph G in which
each edge is an unordered pair of vertices, i.e. the edges have no specified
direction.

Directed graph Undirected graph
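
The definitions of strongly and unilaterally connected graphs can be checked mechanically. The Python sketch below is only an illustration and is not part of the original notes: it assumes a directed graph stored as a dictionary that maps each node to the list of its successors, and the function names and the two sample graphs are hypothetical.

```python
def reachable(graph, start):
    """Return the set of nodes that can be reached from start by following edges."""
    seen = {start}
    pending = [start]
    while pending:
        u = pending.pop()
        for v in graph[u]:          # every edge (u, v) that begins at u
            if v not in seen:
                seen.add(v)
                pending.append(v)
    return seen


def strongly_connected(graph):
    """True if there is a path from u to v AND from v to u for every pair u, v."""
    nodes = set(graph)
    return all(reachable(graph, u) == nodes for u in graph)


def unilaterally_connected(graph):
    """True if there is a path from u to v OR from v to u for every pair u, v."""
    reach = {u: reachable(graph, u) for u in graph}
    return all(v in reach[u] or u in reach[v] for u in graph for v in graph)


# Hypothetical sample graphs: a directed cycle and a directed chain.
cycle = {"A": ["B"], "B": ["C"], "C": ["A"]}
chain = {"A": ["B"], "B": ["C"], "C": []}

print(strongly_connected(cycle), strongly_connected(chain))   # True False
print(unilaterally_connected(chain))                           # True
```

The chain is unilaterally connected but not strongly connected, because no path leads back from C to A.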

The following operations are commonly performed when processing data in data structures:
i. Traversing – This is the process of accessing each record exactly once
so that certain items in the record may be processed.
ii. Searching – This is finding the location of a record with a given key value,
or finding the locations of all records which satisfy one or more
conditions.
iii. Inserting – This is the addition of a new record to a structure.
iv. Deleting – This is the removal of a record from a structure.
v. Sorting – This is the rearranging of records in some logical order (e.g.
alphabetically according to some name key, or in numerical order
according to some number key such as matric. number).

DATA LIFE CYCLE

Data life cycle is a policy-based approach to managing the flow of an information system’s data
throughout its life: from creation and initial storage to the time when it becomes obsolete
and is deleted. It is the complete lifetime of data, i.e. the phases of development through which
data passes.
Data life cycle can be represented by the diagram below:

ORIGINATION → INPUT → PROCESSING → OUTPUT → DISTRIBUTION

                      STORAGE

Some of the terms that can be used in relation to data life cycle are:

a. Occupancy – It is derived from the word “occupy” and can be defined as the state whereby a
particular data item occupies a space or memory in the computer, i.e. the act of occupying a space
in a document by data.
b. Empty – This can also be called Null, and it is defined as the state whereby
there is no element or data in a set.

SEQUENTIAL LIST (LINEAR)

A sequential list is a list of data in sequential or serial form, such that one item must come before the
other: (x1, x2, ..., xn) where n ≥ 0. If n = 0, then the list has no element and is called the null list or empty
list. If n > 0, the list has at least one element x1, which is called the head of the list. The list
consisting of the remaining elements is called the tail of the original list. The tail of the null list is
the null list, since the null list contains no element.
It is possible for an item to be another list, in which case it is known as a sub-list.

ARRAYS
An array is a set of data items which may be conveniently arranged into a sequence and referred
to by a single identifier e.g. MARK = (56, 42, 89, 65, 48).
Individual data items in the array may be referred to separately by stating their position in the
array. Therefore, MARK (1) refers to 56 and MARK (2) refers to 42 etc. The numbers 1 and 2 are
called subscripts.

Example:
Let DATA be a 5-elements linear array of integers such that DATA (1) = 247, DATA (2) = 56,
DATA (3) = 429, DATA (4) = 135, DATA (5) = 87. Sometimes we denote such an array simply
by writing DATA: 247, 56, 429, 135, 87.
Thus the array DATA is pictured as below:

DATA
  1     2     3     4     5
 247    56   429   135    87

Or

DATA
1   247
2    56
3   429
4   135
5    87

FIXED AND VARIABLE LENGTH FIELDS

A fixed length field value has a fixed number of character places available for data storage. A
variable length field on the other hand provides the data with just the number of spaces it needs.

Example: Detail of Fixed

Character position   1  2  3  4  5   6  7  8  9 10  11 12 13 14 15  16 17 18 19 20
Content              J  A  M  E  S   K  A  T  Y      A  N  N         J  O
Comment              1ST STRING      2ND STRING      3RD STRING      4TH STRING

In the diagram above, four fixed-length strings (each string is five characters long) are concatenated
into a single string called Fixed.

On the other hand, Detail of Variable

Character position   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Content              J  A  M  E  S  *  K  A  T  Y  *  A  N  N  *  J  O  *
Comment              1ST STRING (1-6), 2ND STRING (7-11), 3RD STRING (12-15), 4TH STRING (16-18), saved storage (19-20)

The above diagram shows four variable-length strings concatenated into a single string called
Variable. The same data is used, but the end of each string is indicated by a “*”. Note the saving
in storage.
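
The two layouts can be compared with a short Python sketch. It is an illustration only: the helper names `pack_fixed`, `pack_variable` and `unpack_variable` are hypothetical, and the width of 5 and the “*” end marker are taken from the diagrams above.

```python
NAMES = ["JAMES", "KATY", "ANN", "JO"]


def pack_fixed(strings, width=5):
    """Pad every string with blanks to the same width and concatenate them."""
    return "".join(s.ljust(width) for s in strings)


def pack_variable(strings, end="*"):
    """Concatenate the strings, marking the end of each one with the end marker."""
    return "".join(s + end for s in strings)


def unpack_variable(packed, end="*"):
    """Recover the original strings from a variable-length layout."""
    return packed.split(end)[:-1]   # the final piece after the last marker is empty


fixed = pack_fixed(NAMES)
variable = pack_variable(NAMES)
print(len(fixed), len(variable))    # 20 18
print(unpack_variable(variable))    # ['JAMES', 'KATY', 'ANN', 'JO']
```

The fixed layout always uses 4 × 5 = 20 character places, while the variable layout uses only 18 for this data.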

SET AND RELATIONS

ORDERED AND LINEAR LIST

A Linear list is called one-dimensional array or list because each element in such an array is
referenced by one subscript. It provides a flexible way of handling data items in order. Changes
to the order can be achieved with minimal data movement and little loss of storage space.

Example:
Ade does not like cake is written as a list

ADE DOES NOT LIKE CAKE

OR
ADE
DOES
NOT

LIKE
CAKE

In the first example, each word is regarded as data item or datum which is linked to the next
datum by a pointer. Datum plus pointer make one element or node of the list. The last datum is
the Terminator.

OPERATIONS THAT CAN BE PERFORMED ON AN ORDERED LIST

1. Append
When we append to a list, we simply add an element/record to the original one. It is
therefore sensible to tackle data entry and the appending of records at the same time.
Appending can also be called inserting.
Elements can be inserted (a) At the end of a list (b) In the middle of a list
(a) Inserting at the end of a list is easily done, provided the memory space allocated for the array is
large enough to accommodate the addition.
(b) Inserting in the middle of a list requires that, on average, half of the elements be moved
downward to new locations to accommodate the new element and keep the order of
the elements.
Suppose TEST has been declared to be a 5-elements array but data have been
recorded only for TEST [1], TEST [2], and TEST [3]. If x is the value of the next
test, then one simply assigns
TEST [4]: = x (to add x to the list)
Similarly, if y is the value of the subsequent test, then we simply assign
TEST [5]: = y (to add y to the list)

N.B
We cannot add any further test scores to the list because TEST has been declared a 5-element
array. The general form is INSERT (Text, Position, String) e.g. INSERT (TEST, 2, K), which
means insert the letter K at position 2 in a linear array called TEST.

Write an algorithm to insert ITEM into the Kth position in a linear array with N elements.
INSERT (LA, N, K, ITEM)
LA – Linear Array N – N elements K – Positive integers such that K ≤ N
1. Set J: = N [Initialize Counter]
2. Repeat Steps 3 and 4 while J ≥ K
3. Set LA [J + 1]: = LA [J]
4. Set J: = J – 1 [ Decrease Counter]
[End of step 2 loop]
5. Set LA [K]: = ITEM [Insert element]
6. Set N: = N + 1
7. Exit
The first four steps create space in LA by moving downward one location each element from
the Kth position on. The elements are moved in reversed order i.e. first LA [N], then LA [N-
1], …LA [K] otherwise data might be erased. We set J: = N and then using J as counter
decrease J each time the loop is executed until J reaches K. Step 5, insert Item into the array
in the created space. Increase N by 1 to account for the new element.
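
The INSERT algorithm can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: it assumes the array is a Python list whose location 0 is left unused so that subscripts match the 1-based notation of the notes, and the overflow check is an addition that the algorithm above does not contain.

```python
def insert(la, n, k, item):
    """Insert item at position k of la[1..n] and return the new value of n.

    la[0] is unused so that subscripts start at 1, as in the notes.
    """
    if n + 1 >= len(la):            # added check: no free location left in the array
        raise OverflowError("array is full")
    j = n                           # step 1: initialize counter
    while j >= k:                   # step 2: repeat while J >= K
        la[j + 1] = la[j]           # step 3: move the Jth element downward
        j -= 1                      # step 4: decrease counter
    la[k] = item                    # step 5: insert element
    return n + 1                    # step 6: N := N + 1


# A 5-element array holding three test scores (sample values are hypothetical).
test = [None, 56, 42, 89, None, None]
n = insert(test, 3, 2, 70)
print(test[1:n + 1], n)             # [56, 70, 42, 89] 4
```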

2. Delete
Delete is used to remove an element/record from a list or structure. Deleting can also be
done (a) At the end of a list/array (b) In the middle of a list/array
Deleting at the end of the array has no difficulties but deleting an element somewhere in
the middle of the array would require that each subsequent element be moved one
location upward in order to “fill up” the array. The general form is

Write an algorithm to delete the Kth element from a linear array LA and assign it to a
variable ITEM.
DELETE (LA, N, K, ITEM)
1. Set ITEM: = LA [K]
2. Repeat for J = K to N – 1
Set LA [J]: = LA [J + 1] [Move J + 1st element upward]
[End of loop]
3. Set N: = N – 1
4. Exit
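
A matching Python sketch of DELETE is given below, under the same assumption that location 0 of the list is unused so that subscripts start at 1. Clearing the vacated last location is an extra step added for tidiness and is not part of the algorithm above.

```python
def delete(la, n, k):
    """Remove the kth element of la[1..n]; return (item, new n)."""
    item = la[k]                    # step 1: save the element being deleted
    for j in range(k, n):           # step 2: J = K to N - 1
        la[j] = la[j + 1]           # move the (J + 1)st element upward
    la[n] = None                    # added: clear the vacated last location
    return item, n - 1              # step 3: N := N - 1


data = [None, 247, 56, 429, 135, 87]
item, n = delete(data, 5, 2)
print(item, data[1:n + 1])          # 56 [247, 429, 135, 87]
```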

3. Sorting and Searching


Sorting and Searching are fundamental operations in data structures. Sorting refers to the
operation of rearranging of data in some given order, such as increasing or decreasing
with numerical data, or alphabetically with character data.
Searching refers to the operation of finding the location of a given item in a collection of
items.
Let A be a list of n elements A1, A2 …An in memory. Sorting A refers to the operation
of rearranging the contents of A so that they are increasing in order, so that A1 ≤ A2 ≤
A3 ≤ ...≤ An. Since A has n elements, there are n! ways that the contents can appear in A.
These ways correspond to the n! permutations of 1, 2, 3...n.

Example:
Suppose an array DATA contains 8 elements DATA: 77, 33, 44, 11, 88, 22, 66, 55
After sorting, DATA must appear in memory as 11, 22, 33, 44, 55, 66, 77, 88
N.B: Since DATA consists of 8 elements, there are 8! = 40,320 ways that the numbers
can appear in DATA.

4. Selection
Selection involves the process of searching, identifying or finding a data within a record
or file in order to perform other operations on the data. It is similar to some operations we
carry out on paper which is usually of identifying named record in a file. For example, in
a college database with details of students on it, it is possible to find the name of the
student and select it. The database works through its records until it finds all the
information relating to that student, such as a photograph, matric. number, address, age,
schools attended, sex etc.

5. Exchange
This is the means of replacing one data item/record with another. Exchange selection
(bubble sort) is a form of sorting by exchanging, which simply interchanges pairs of
elements that are out of order in a sequence of passes through the file until no such pairs
exist.

6. Merge
Combining multiple sets of data to produce only one set usually in an ordered sequence.
The process is usually employed in external sorting where data is stored in backing store.
A data handling system can merge data from one file into another; two sets of
information are taken and are put together in a file – in order, becoming one.
Suppose A is a sorted list with R elements and B is a sorted list with S elements. The
operation that combines the elements of A and B into a single sorted list C with n
elements such that n = R + S is called merging.
One simple way to merge is to place the elements of B after the elements of A and then
use some sorting algorithm on the entire list.

Example:
Write an algorithm to merge a sorted R-element array A and a sorted S-element array
B into a sorted array C, with n = R + S elements.
i. Keep track of the locations of the smallest element of A and the
smallest element of B which have not yet been placed in C. Let NA and
NB denote these locations, respectively. Also let PTR denote the
location in C to be filled.
ii. Initially, set NA: = 1, NB: = 1 and PTR: = 1
iii. At each step of the algorithm, we compare A [NA] and B [NB] and
assign the smaller element to C [PTR].
iv. Increase PTR by setting PTR: = PTR + 1, and either increase NA by setting
NA: = NA + 1 or increase NB by setting NB: = NB + 1, according to
whether the new element in C has come from A or B.
v. If NA > R, then the remaining elements of B are assigned to C; or if NB
> S, then the remaining elements of A are assigned to C.

MERGING (A, R, B, S, C)
Let A and B be sorted arrays with R and S elements respectively. This algorithm merges
A and B into an array C with N = R + S elements.
1. [Initialize] Set NA: = 1, NB: = 1 and PTR: = 1.
2. [Compare] Repeat while NA ≤ R and NB ≤ S: If A [NA] < B [NB], then:
a. [Assign element from A to C] Set C [PTR]: = A [NA]
b. [Update pointer] Set PTR: = PTR + 1 and NA: = NA + 1
Else:
a. [Assign element from B to C] Set C [PTR]: = B [NB]
b. [Update pointer] Set PTR: = PTR + 1 and NB: = NB + 1
[End of If Structure]
[End of Loop]
3. [Assign remaining elements to C]
If NA > R, then: (element not existing in A, check B)
Repeat for K = 0, 1, 2…, S – NB:
Set C [PTR + K]: = B [NB + K]
[End of Loop]
Else:
If NB > S (element not existing in B, check A)
Repeat for K = 0, 1, 2..., R – NA:
Set C [PTR + K]: = A [NA + K]
[End of Loop]
[End of If Structure]
4. Exit
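
The MERGING algorithm translates almost line for line into Python. The sketch below is an illustration under one stated difference: Python lists are 0-based, so NA and NB start at 0 instead of 1, and C is built by appending rather than by filling a pre-declared array through PTR. The sample lists are hypothetical.

```python
def merge(a, b):
    """Merge two sorted lists a and b into one sorted list c."""
    c = []
    na = nb = 0                               # step 1: initialize
    while na < len(a) and nb < len(b):        # step 2: compare
        if a[na] < b[nb]:
            c.append(a[na])                   # assign element from A to C
            na += 1
        else:
            c.append(b[nb])                   # assign element from B to C
            nb += 1
    c.extend(a[na:])                          # step 3: assign remaining elements;
    c.extend(b[nb:])                          # at most one of these slices is non-empty
    return c


print(merge([11, 33, 55], [22, 44, 66, 88]))  # [11, 22, 33, 44, 55, 66, 88]
```

Each element is examined once, so the work is proportional to R + S, which is why this is preferred to placing B after A and sorting the whole list.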

POLYPHASE MERGE
This is a method in which the data are kept on more than one backing store or file. Items
are merged from the source files onto another file. Whenever one of the source files is
exhausted, it immediately becomes the destination of the merge operations from the non-
exhausted and previous destination files. When there is only one file left, the process
stops. The repeated merging is referred to as polyphase merging.

SIMPLE LINKED LIST

LINKED LIST ARRAY

This is a list representation in which items are not necessarily sequential in storage.
Access is made possible by the use in every item of a link that contains the address of the
next item in the list. The last item in the list has a special null link to indicate that there
are no more items in the list.
Let LIST be a linked list.
LIST will require 2 linear arrays, one is INFO and the other is LINK such that INFO [K]
contains the information part while LINK [K] contains the next pointer field of a node of
LIST. LIST also requires a variable name e.g. START which contains the location of the
beginning of the list, and a next-pointer sentinel denoted by NULL which indicates the
end of the list.

Consider the example below:


This is a linked list in memory where each node of the list contains a single character. We
can obtain the list of characters, or the string, as follows.
START = 9, so INFO [9] = N (1st character)
LINK [9] = 3, so INFO [3] = O (2nd character)
LINK [3] = 6, so INFO [6] = (blank) (3rd character)
LINK [6] = 11, so INFO [11] = E (4th character)
LINK [11] = 7, so INFO [7] = X (5th character)
LINK [7] = 10, so INFO [10] = I (6th character)
LINK [10] = 4, so INFO [4] = T (7th character)
LINK [4] = 0, the null value, so the list has ended
The character string is NO EXIT.

START = 9

       INFO     LINK
  1
  2
  3     O         6
  4     T         0
  5
  6   (blank)    11
  7     X        10
  8
  9     N         3
 10     I         4
 11     E         7
 12
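
The walk through the list can be reproduced in Python. The sketch below is an illustration only: it stores INFO and LINK as two parallel lists with location 0 unused, uses 0 for NULL as in the example, and the function name `traverse` is hypothetical.

```python
NULL = 0

# Locations 1..12 of the example; location 0 is unused and None marks an empty cell.
INFO = [None, None, None, "O", "T", None, " ", "X", None, "N", "I", "E", None]
LINK = [None, None, None, 6, 0, None, 11, 10, None, 3, 4, 7, None]
START = 9


def traverse(info, link, start):
    """Follow the links from start until NULL and collect the characters."""
    chars = []
    ptr = start
    while ptr != NULL:
        chars.append(info[ptr])     # information part of the current node
        ptr = link[ptr]             # move to the next node
    return "".join(chars)


print(traverse(INFO, LINK, START))  # NO EXIT
```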

TYPES OF LINKED LIST

1. Single Linked List (One-way linked list)
This is a linked list in which each item contains a single link to its successor. By
following links, it is possible to access the entire structure from the first item.
2. Double Linked List (Two-way linked list or Symmetric list)
This is the type of linked list where each item contains links to both its predecessor
and its successor. This makes it possible to traverse the list in either direction.

QUEUES (FIFO)

A queue is a linear list where all insertions are made at one end of the list and all
removals and access at the other end. A queue can be implemented in hardware as a
specialized form of addressless memory, and is most commonly used for speed
buffering between real-time data i/o stream and a form of memory that requires
start/stop time.
It is sometimes called a push-up list or it is a First-In, First-Out (FIFO) organization
of data.
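
The FIFO behaviour can be seen in a few lines of Python. This is a small sketch, assuming the standard library's `collections.deque` as the underlying container; the item names are hypothetical.

```python
from collections import deque

queue = deque()

# Insertions are made at one end (the rear) ...
for item in ("AAA", "BBB", "CCC"):
    queue.append(item)

# ... and removals are made at the other end (the front).
served = [queue.popleft() for _ in range(3)]
print(served)   # ['AAA', 'BBB', 'CCC']
```

The items leave the queue in exactly the order in which they entered it.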
STACK
A Stack is a linear list where all accesses, insertion and removals are made at one end
of the list called the top. This implies access on Last-In, First-Out (LIFO) basis i.e
data is entered into the stack and dealt with by taking items individually off the top.
The operations being performed under stack are:
PUSH – Insertion
POP – Removal

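Push and Pop Operations (three ways of picturing the same stack):

(a) Vertical picture, top element uppermost: FFF (TOP), EEE, DDD, CCC, BBB, AAA
(b) Vertical array with locations 1 to N: 1 AAA, 2 BBB, 3 CCC, 4 DDD, 5 EEE, 6 FFF (TOP); locations 7 to N are empty
(c) Horizontal array with locations 1 to N:

    AAA  BBB  CCC  DDD  EEE  FFF   ...
     1    2    3    4    5    6    7    8    9   ...   N-1   N
                             TOP
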
The figures above show three ways of picturing a stack which has 6 elements AAA, BBB, CCC,
DDD, EEE, FFF. The implication is that the right-most element is the top element. Since
insertions and deletions can occur only at the top of the stack, this means EEE cannot be deleted
before FFF is deleted, so also DDD cannot be deleted before EEE and FFF are deleted etc.
Consequently, the elements may be popped from the stack only in the reverse order of that in
which they were pushed onto the stack.

A. ALGORITHM TO PUSH ITEM INTO A STACK


PUSH (STACK, TOP, MAXSTK, ITEM)
1. [Stack already filled?]
If TOP = MAXSTK, then: Print: OVERFLOW, and Return
2. Set TOP: = TOP + 1 [Increase TOP by 1]
3. Set STACK [TOP]: = ITEM [Inserts ITEM in new TOP position]
4. Return
B. ALGORITHM TO POP ITEM FROM A STACK
POP (STACK, TOP, ITEM)
This procedure deletes the top element of STACK and assigns it to the variable
ITEM.
1. [Stack has an item to be removed?]
If TOP = 0, then: Print: UNDERFLOW, and Return
2. Set ITEM: = STACK [TOP] [Assigns TOP elements to ITEM]
3. Set TOP: = TOP – 1 [Decrease TOP by 1]
4. Return.
Note that the value of TOP is changed before the insertion in PUSH but the value of TOP
is changed after the deletion in POP.
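
The PUSH and POP procedures can be sketched as a small Python class. This is an illustration under stated assumptions: the class name `Stack` is hypothetical, location 0 of the underlying list is unused so that TOP = 0 means an empty stack, and the OVERFLOW and UNDERFLOW messages are raised as exceptions instead of being printed.

```python
class Stack:
    def __init__(self, maxstk):
        self.maxstk = maxstk
        self.items = [None] * (maxstk + 1)   # locations 1..maxstk; 0 is unused
        self.top = 0                         # TOP = 0 means the stack is empty

    def push(self, item):
        if self.top == self.maxstk:          # stack already filled?
            raise OverflowError("OVERFLOW")
        self.top += 1                        # increase TOP by 1 first ...
        self.items[self.top] = item          # ... then insert ITEM at the new TOP

    def pop(self):
        if self.top == 0:                    # nothing to remove?
            raise IndexError("UNDERFLOW")
        item = self.items[self.top]          # take the TOP element first ...
        self.top -= 1                        # ... then decrease TOP by 1
        return item


stack = Stack(8)
for name in ("XXX", "YYY", "ZZZ"):
    stack.push(name)
stack.push("WWW")
print(stack.top)      # 4
print(stack.pop())    # WWW
print(stack.pop())    # ZZZ
```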
Example
Consider the stack below, where TOP = 3 and MAXSTK = 8:

STACK:   XXX   YYY   ZZZ
          1     2     3     4     5     6     7     8
                     TOP                          MAXSTK

(a) We simulate the operation PUSH (STACK, WWW)
1. Since TOP = 3 and MAXSTK = 8, the stack is not full, so control is transferred to step 2
2. TOP: = 3 + 1 = 4
3. STACK [TOP]: = STACK [4]: = WWW
4. Return

(b) We simulate the operation POP (STACK, ITEM) on the original stack
1. Since TOP = 3, the stack is not empty, so control is transferred to step 2
2. ITEM: = STACK [3] = ZZZ
3. TOP: = 3 – 1 = 2
4. Return

DEQUES (Double-ended Queues)

A deque is a linear list in which elements can be added or removed at either end but not in the
middle. The term deque is a contraction of the name double-ended queue.
There are two variations of a deque:
(i) Input-restricted deque
This is a deque which allows insertions at only one end of the list but allows
deletions at both ends of the list.

(ii) Output-restricted deque
This is a deque which allows deletions at only one end of the list but allows
insertions at both ends of the list.

POINTER
A pointer is a value that indicates the storage location of an item of data. When a field of an item
A in a data structure contains the address of another item B i.e of its first word in memory then A
contains a pointer to B, it is said to point to B.

NON – LINEAR STRUCTURES

TREES
A tree is a non-linear data structure which is mainly used to represent data containing a
hierarchical relationship between elements e.g. records, family trees and tables of contents.
Properties of Tree
1. A tree has a root, which can also be called the parent node.
2. A tree has sub-root(s), which can be referred to as children or nodes.
Trees are constructed using a rule of precedence for data items. If the order of the subtrees is
significant, the tree is called an ordered tree; otherwise, it is called an unordered tree.
TYPES OF TREE

1. GENERAL TREES
A general tree T is defined to be a non-empty finite set of elements called nodes, such
that:
i. T contains a distinguished element R, called the root of T.
ii. The remaining elements of T form an ordered collection of zero or more
disjoint trees T1, T2, ..., Tm.
The trees T1, T2, ..., Tm are called subtrees of the root R, and the roots of T1, T2, ..., Tm
are called successors of R.

Example:
This is a general tree with 13 nodes: A, B, C, D, E, F, G, H, J, K, L, M, N

                 A
         /       |       \
        B        C        D
       / \     / | \      |
      E   F   G  H  J     K
                 |       / \
                 L      M   N

The root is the node at the top of the diagram and the children of the node are ordered from left
to right.
i. The root is A and it has three children, B, C, D.
ii. Node C has three children G, H, J.
iii. Each of nodes B and K has two children.
iv. Each of nodes D and H has only one child.
v. Nodes E, F, G, J, L, M, and N have no children.
The last groups of nodes with no children are called terminal nodes

Question: Represent the above tree in another form.

COMPUTER REPRESENTATION OF GENERAL TREES

Suppose T is a general tree, unless otherwise stated, T will be maintained in memory by means
of a linked representation which uses three parallel arrays INFO, CHILD and SIBL. Each node N
of T will correspond to a location K such that:
i. INFO [K] contains the data at node N
ii. CHILD [K] contains the location of the first child of N. The condition CHILD [K] =
NULL indicates that N has no children.

iii. SIBL [K] contains the location of the next sibling of N. the condition SIBL [K] = NULL
indicates that N is the last child of its parent.
ROOT will contain the location of the root R of T.
a. Root A of T is stored in INFO [2], Set ROOT: = 2
b. B is first child of A which is stored in INFO [3], set CHILD [2]: = 3, A has no sibling
SIBL [2]: = NULL
c. E is the first child of B which is stored in INFO [6], set CHILD [3]: = 6; since node C is
the next sibling of B and C is stored in INFO [4], set SIBL [3]: = 4 etc.

ROOT = 2

       INFO   CHILD   SIBL
  1
  2     A       3       0
  3     B       6       4
  4     C       8       5
  5     D      11       0
  6     E       0       7
  7     F       0       0
  8     G       0       9
  9     H      12      10
 10     J       0       0
 11     K      13       0
 12     L       0       0
 13     M       0      14
 14     N       0       0
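
The CHILD and SIBL arrays are enough to list the children of any node: start at the first child and keep following sibling pointers. The Python sketch below illustrates this with the values from the table; location 0 of each list is unused, 0 stands for NULL, and the function name `children` is hypothetical.

```python
NULL = 0

# Locations 1..14 from the table; location 1 is empty and location 0 is unused.
INFO = [None, None, "A", "B", "C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N"]
CHILD = [None, None, 3, 6, 8, 11, 0, 0, 0, 12, 0, 13, 0, 0, 0]
SIBL = [None, None, 0, 4, 5, 0, 7, 0, 9, 10, 0, 0, 0, 14, 0]
ROOT = 2


def children(k):
    """Return the data of all children of the node stored at location k."""
    result = []
    ptr = CHILD[k]                  # location of the first child
    while ptr != NULL:
        result.append(INFO[ptr])
        ptr = SIBL[ptr]             # location of the next sibling
    return result


print(children(ROOT))   # ['B', 'C', 'D']
print(children(4))      # ['G', 'H', 'J']
print(children(6))      # []
```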

2. BINARY TREES
A binary tree T is defined as a finite set of elements called nodes, such that:
a. T is empty (null/ empty tree) or

b. T contains a distinguished node R, called the root of T, and the remaining nodes
of T form an ordered pair of disjoint binary trees T1 and T2.
If T contains a root R, then the two trees T1 and T2 are respectively called the left and
right subtrees of R. If T1 is non-empty, then its root is called the left successor of R;
similarly, if T2 is non-empty, then its root is called the right successor of R.

Example:
The binary tree T below consists of eleven (11) nodes, represented by the letters A
through L, excluding I.
i. The root of T is the node A at the top
ii. A left-downward slanted line from a node N indicates a left successor of N, and
a right-downward slanted line indicates a right successor of N.

               A
           /       \
         B           C
       /   \       /   \
      D     E     G     H
           /           / \
          F           J   K
                     /
                    L

The list generated by the tree is: A, B, C, D, E, F, G, H, J, K, L
NOTE
Any node in a binary tree has 0, 1 or 2 successors.

Similar Trees
Binary trees T and T1 are said to be similar if they have the same structure, i.e. the
same shape. The trees are said to be copies if they are similar and have the same contents at
corresponding nodes.

(Figure: four binary trees (a), (b), (c) and (d). Trees (a) and (c) each contain the nodes A, B, C, D;
trees (b) and (d) each contain the nodes E, F, G, H.)

i. The trees (a), (c) and (d) are similar
ii. The trees (a) and (c) are copies
iii. The tree (b) is neither similar to nor a copy of tree (d).

TREE REPRESENTATION OF ALGEBRAIC EXPRESSIONS

Consider any algebraic expression E involving only binary operations such as


E = (a -b) / (c * d + e)
E can be represented by means of the binary tree T as shown below. Each variable or constant in
E appears as a terminal node in T, and each operation in E appears as an internal node in T
whose left and right subtrees correspond to the operands of the operation e.g.
a. In the expression E, the operands of + are c * d and e.
b. In the tree T, the subtrees of the node + correspond to the subexpressions c * d and e.

              (/)                the root is the division operator
            /     \
          -         +
         / \       / \
        a   b     *   e
                 / \
                c   d

NOTE
Any algebraic expression will correspond to a unique tree and vice versa.
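
Once an expression is held as a tree, it can be evaluated by working upward from the operands. The Python sketch below is only an illustration for E = (a - b) / (c * d + e): it assumes each operation is stored as a tuple (operator, left subtree, right subtree) and each variable as a string, and the sample values given to a, b, c, d and e are hypothetical.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

# E = (a - b) / (c * d + e): operators are internal nodes, variables are leaves.
TREE = ("/", ("-", "a", "b"), ("+", ("*", "c", "d"), "e"))


def evaluate(node, values):
    """Evaluate the subtree rooted at node using the given variable values."""
    if isinstance(node, str):               # a leaf: look up the variable
        return values[node]
    op, left, right = node                  # an internal node: an operation
    return OPS[op](evaluate(left, values), evaluate(right, values))


print(evaluate(TREE, {"a": 10, "b": 4, "c": 1, "d": 2, "e": 1}))   # 2.0
```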
Example:
Consider the algebraic expression E = (2x + y)(5a - b)↑3, i.e. (2x + y) multiplied by the cube of (5a - b).
(a) Draw the tree T which corresponds to the expression E.
(b) Find the scope of the exponential operator i.e. find the subtree rooted at the exponential
operator.
N.B: An arrow (↑) is used for exponentiation.
Solution:
(a)
                  *
             /         \
            +           ↑
          /   \       /   \
         *     y     -     3
        / \         / \
       2   x       *   b
                  / \
                 5   a

(b) The scope of the exponential operator is the subtree rooted at ↑. It contains the nodes
↑, -, 3, *, b, 5 and a, and it corresponds to the subexpression (5a - b)↑3.

COMPUTER REPRESENTATION OF BINARY TREES IN MEMORY

There are two ways of representing a binary tree in memory:
(i) Linked representation of Binary trees
(ii) Sequential representation of Binary trees

(i) Linked representation of Binary trees

A binary tree T will be maintained in memory by means of a linked representation which
uses three parallel arrays, INFO, LEFT and RIGHT, and a pointer variable ROOT. Each node N of T
will correspond to a location K such that:
(i) INFO [K] contains the data at the node N
(ii) LEFT [K] contains the location of the left child of node N
(iii) RIGHT [K] contains the location of the right child of node N
ROOT will contain the location of the root R of T. If any subtree is empty, then the corresponding
pointer will contain the null value; if the tree T itself is empty, then ROOT will contain the null value.

Example 1: Consider the binary tree below:

               A
           /       \
         B           C
       /   \       /   \
      D     E     G     H
           /           / \
          F           J   K
                     /
                    L

The tree can be represented as shown below:

ROOT = 4

       INFO   LEFT   RIGHT
  1     K       0      0
  2     C       3      5
  3     G       0      0
  4     A       7      2
  5     H      10      1
  6     L       0      0
  7     B      11      9
  8     F       0      0
  9     E       8      0
 10     J       6      0
 11     D       0      0
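
The linked representation can be exercised with a short Python sketch. It is an illustration only: the three arrays hold the values of the table with location 0 unused and 0 standing for NULL, and the function `walk` (a hypothetical name) visits a node first, then its left subtree, then its right subtree.

```python
NULL = 0

# Locations 1..11 from the table; location 0 is unused.
INFO = [None, "K", "C", "G", "A", "H", "L", "B", "F", "E", "J", "D"]
LEFT = [None, 0, 3, 0, 7, 10, 0, 11, 0, 8, 6, 0]
RIGHT = [None, 0, 5, 0, 2, 1, 0, 9, 0, 0, 0, 0]
ROOT = 4


def walk(k):
    """List the node at location k, then its left subtree, then its right subtree."""
    if k == NULL:                   # an empty subtree contributes nothing
        return []
    return [INFO[k]] + walk(LEFT[k]) + walk(RIGHT[k])


print(walk(ROOT))      # ['A', 'B', 'D', 'E', 'F', 'C', 'G', 'H', 'J', 'L', 'K']
print(len(walk(ROOT))) # 11
```

Every one of the eleven nodes is reached from ROOT, which confirms that the pointers in the table describe the whole tree.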

Example 2: Suppose the personnel file of a small company contains the following data on its
nine employees; Name, Sex, Monthly Salary. The file can be maintained in memory as a binary
tree as shown:

ROOT = 10

       NAME    SEX   SALARY   LEFT   RIGHT
  1                             0
  2    DAVID    M    22,800     0      9
  3    KENNY    F    19,000     0      0
  4    GOKE     M    27,200     2      0
  5    BUNMI    F    14,700     0      0
  6    LEKAN    M    16,400     3      8
  7    LANRE    M    19,000     5      4
  8    RAFIU    M    15,500     0      0
  9    KUNLE    M    34,200     0      0
 10    JONES    F    22,800     7      6

Suppose we want to draw the tree diagram which corresponds to the binary tree above. We label
the nodes in the tree diagram only by the key values NAME, and we construct the tree as follows:
(i) The value ROOT = 10 indicates that Jones is the root of the tree
(ii) LEFT [10] = 7 indicates that Lanre is the left child of Jones, and RIGHT [10] = 6 indicates
that Lekan is the right child of Jones

Solution:

                  JONES
               /         \
          LANRE           LEKAN
         /     \         /     \
     BUNMI     GOKE   KENNY   RAFIU
               /
           DAVID
               \
               KUNLE

Question
Draw a computer representation of the binary tree below.
60

30 70

20 55 90

35 80 95

45

40 50

SEQUENTIAL REPRESENTATION OF BINARY TREES

There is an efficient way of maintaining a binary tree T in memory, called the sequential
representation of T. This representation uses only a single linear array TREE as follows:
(i) The root R of T is stored in TREE [1]
(ii) If a node N occupies TREE [K], then its left child is stored in TREE [2*K] and its
right child is stored in TREE [2*K+1]
NULL is used to indicate an empty subtree. In particular, TREE [1] = NULL indicates that the tree is
empty.
Example:
The sequential representation of the binary tree below

               45
            /      \
          22        77
         /  \         \
       11    30        90
         \   /        /
         15 25      88

is as shown:

TREE
 1   45
 2   22
 3   77
 4   11
 5   30
 6
 7   90
 8
 9   15
10   25
11
12
13
14   88
15
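
The index rule of the sequential representation can be tried out in Python. The sketch below is an illustration under stated assumptions: location 0 of the list is unused, `None` plays the role of NULL, the helper names are hypothetical, and 88 is taken to be the left child of 90, which by rule (ii) places it at location 2 * 7 = 14.

```python
NULL = None

# TREE[1..15]; location 0 is unused and NULL marks an empty subtree.
TREE = [NULL] * 16
for k, value in {1: 45, 2: 22, 3: 77, 4: 11, 5: 30, 7: 90,
                 9: 15, 10: 25, 14: 88}.items():
    TREE[k] = value


def left(k):
    """Location of the left child of the node stored at location k."""
    return 2 * k


def right(k):
    """Location of the right child of the node stored at location k."""
    return 2 * k + 1


def value_at(k):
    """Data stored at location k, or NULL if k lies outside the array."""
    return TREE[k] if k < len(TREE) else NULL


print(value_at(left(1)), value_at(right(1)))   # 22 77
print(value_at(left(3)), value_at(right(3)))   # None 90
print(value_at(left(7)))                       # 88
```

No pointers are stored, but the empty locations show the cost: a tree that is far from full wastes array space.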

Example 2: Consider example 2 of linked representation and represent it sequentially.

SORTING

Sorting is the arrangement of elements or records in a list in some logical order, either
alphabetically or numerically. If A is a list of n elements A1, A2, A3, ..., An in memory, sorting
A refers to the operation of rearranging the contents of A into a given order. Before sorting, there are
n! ways that the contents can appear in A.

BUBBLE SORT

Suppose the list of numbers A [1], A [2], …, A [N] is in memory. The bubble sort algorithm works
as follows:

Step 1: Compare A [1] and A [2] and arrange them so that A [1] < A [2]. Then compare A [2]
and A [3] and arrange them so that A [2] < A [3]. Then compare A [3] and A [4] and arrange them
so that A [3] < A [4]. Continue until you compare A [N-1] with A [N] and arrange them so that
A [N-1] < A [N].
* Step 1 involves N-1 comparisons. At the end of this step, the largest element has “bubbled up”
to the Nth position, so A [N] contains the largest element.
Step 2: Repeat Step 1 with one less comparison, i.e. stop after comparing and possibly
rearranging A [N-2] and A [N-1]. When Step 2 is completed, the second largest element will
occupy A [N-1].
Step 3: Repeat Step 1 with two fewer comparisons, i.e. stop after comparing and possibly
rearranging A [N-3] and A [N-2].
Continue in this way until Step N-1, which compares only A [1] and A [2]. After N-1 steps the
list is sorted in increasing order.
* The process of sequentially traversing all or part of a list is called a “pass”, so each of the
above steps is a pass. Therefore, the bubble sort algorithm requires n-1 passes, where n is the
number of input items.

Example:
Suppose the following numbers are stored in an array A: 32, 51, 27, 85, 66, 23, 13, 57. The
bubble sort is applied to the array A and each pass is discussed separately.

Pass 1
(i) Compare A [1] and A [2]. Since 32 < 51, the list is not altered.
(ii) Compare A [2] and A [3]. Since 51 > 27, interchange them to produce 32, 27, 51, 85, 66,
23, 13, 57
(iii) Compare A [3] and A [4]. Since 51 < 85, the list is not altered.
(iv) Compare A [4] and A [5]. Since 85 > 66, interchange them to produce 32, 27, 51, 66, 85,
23, 13, 57
(v) Compare A [5] and A [6]. Since 85 > 23, interchange them to produce 32, 27, 51, 66,
23, 85, 13, 57
(vi) Compare A [6] and A [7]. Since 85 > 13, interchange them to produce 32, 27, 51, 66, 23,
13, 85, 57
(vii) Compare A [7] and A [8]. Since 85 > 57, interchange them to produce 32, 27, 51, 66, 23,
13, 57, 85
* Notice that the largest number has moved to the last position at the end of the first
pass. The rest of the numbers are not sorted, even though some of them have changed
positions.
At the end of the first pass the list is: 32, 27, 51, 66, 23, 13, 57, 85.

Pass 2

27, 32, 51, 66, 23, 13, 57, 85

27, 32, 51, 23, 66, 13, 57, 85

27, 32, 51, 23, 13, 66, 57, 85

27, 32, 51, 23, 13, 57, 66, 85


* At the end of Pass 2, the second largest number, 66, has moved its way to the next-to-last
position.
Pass 3

27, 32, 23, 51, 13, 57, 66, 85

27, 32, 23, 13, 51, 57, 66, 85


Pass 4

27, 23, 32, 13, 51, 57, 66, 85

27, 23, 13, 32, 51, 57, 66, 85
Pass 5

23, 27, 13, 32, 51, 57, 66, 85

23, 13, 27, 32, 51, 57, 66, 85


Pass 6

13, 23, 27, 32, 51, 57, 66, 85


Pass 7
13, 23, 27, 32, 51, 57, 66, 85
No interchange takes place in Pass 7.

Bubble Sort Algorithm

(Bubble Sort) BUBBLE (DATA, N)

DATA is an array with N elements. This algorithm sorts the elements in DATA.
1. Repeat Steps 2 and 3 for K = 1 to N – 1
2.     Set PTR := 1 [Initializes pass pointer PTR]
3.     Repeat while PTR ≤ N – K [Executes pass]
           (a) If DATA [PTR] > DATA [PTR + 1], then:
                   Interchange DATA [PTR] and DATA [PTR + 1]
               [End of If structure]
           (b) Set PTR := PTR + 1
       [End of inner loop]
   [End of Step 1 outer loop]
4. Exit
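
The BUBBLE procedure can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the name bubble_sort and the list of per-pass snapshots it returns are assumptions made so the passes can be compared with the worked example, and because Python lists are indexed from 0 the pointer runs from 0 to N – K – 1.

```python
def bubble_sort(values):
    """Return (sorted copy of values, snapshot of the list after each pass)."""
    data = list(values)  # work on a copy so the caller's list is untouched
    n = len(data)
    passes = []
    for k in range(1, n):  # passes K = 1 .. N-1
        for ptr in range(n - k):  # 0-based pointer: compares ptr and ptr+1
            if data[ptr] > data[ptr + 1]:
                # interchange the adjacent pair that is out of order
                data[ptr], data[ptr + 1] = data[ptr + 1], data[ptr]
        passes.append(list(data))
    return data, passes


result, passes = bubble_sort([32, 51, 27, 85, 66, 23, 13, 57])
for number, snapshot in enumerate(passes, start=1):
    print("Pass", number, snapshot)
```

For the eight numbers of the example this prints seven passes, the last of which makes no interchange.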

INSERTION SORT

Suppose an array A with n elements A [1], A [2], …, A [N] is in memory. The insertion sort
algorithm scans A from A [1] to A [N], inserting each element A [K] into its proper position in
the previously sorted subarray A [1], A [2], …, A [K – 1], i.e.

Pass 1
A [1] by itself is trivially sorted.
Pass 2
A [2] is inserted either before or after A [1] so that A [1], A [2] is sorted.
Pass 3
A [3] is inserted into its proper place in A [1], A [2], i.e. either before A [1], between A [1]
and A [2], or after A [2], so that A [1], A [2], A [3] is sorted.
Pass 4
A [4] is inserted into its proper place in A [1], A [2], A [3], i.e. either before A [1], between
A [1] and A [2], between A [2] and A [3], or after A [3], so that A [1], A [2], A [3], A [4] is sorted.
…
Pass N
A [N] is inserted into its proper place in A [1], A [2], …, A [N – 1] so that A [1], A [2], …, A [N] is
sorted.
* Insertion sort is normally used only when N is small, since inserting A [K] may require moving
up to K – 1 elements.

Example:
Suppose an array A contains 8 elements as follows: 77, 33, 44, 11, 88, 22, 66, 55. The table
below illustrates the insertion sort. Each row shows the array at the start of pass K, with the
element A [K] to be inserted in that pass shown in parentheses; the following row shows A [K]
in its proper place. A [0] holds the sentinel -∞.

PASS     A[0]  A[1]  A[2]  A[3]  A[4]  A[5]  A[6]  A[7]  A[8]
K=1       -∞   (77)   33    44    11    88    22    66    55
K=2       -∞    77   (33)   44    11    88    22    66    55
K=3       -∞    33    77   (44)   11    88    22    66    55
K=4       -∞    33    44    77   (11)   88    22    66    55
K=5       -∞    11    33    44    77   (88)   22    66    55
K=6       -∞    11    33    44    77    88   (22)   66    55
K=7       -∞    11    22    33    44    77    88   (66)   55
K=8       -∞    11    22    33    44    66    77    88   (55)
Sorted    -∞    11    22    33    44    55    66    77    88

Insertion Sort Algorithm

(Insertion Sort) INSERTION (A, N)

This algorithm sorts the array A with N elements.
1. Set A [0] := –∞ [Initializes sentinel element]
2. Repeat Steps 3 to 5 for K = 2, 3, …, N
3.     Set TEMP := A [K] and PTR := K – 1
4.     Repeat while TEMP < A [PTR]
           (a) Set A [PTR + 1] := A [PTR] [Moves element forward]
           (b) Set PTR := PTR – 1
       [End of loop]
5.     Set A [PTR + 1] := TEMP [Inserts element in proper place]
   [End of Step 2 loop]
6. Return
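
A possible Python rendering of INSERTION is sketched below. It is an illustration under assumptions, not the notes' own code: the name insertion_sort is hypothetical, the function sorts a copy rather than the array itself, and float("-inf") is used as the sentinel A [0], which restricts this sketch to numeric data.

```python
def insertion_sort(values):
    """Return a sorted copy of a list of numbers, using a sentinel in a[0]."""
    a = [float("-inf")] + list(values)  # a[0] is the sentinel; data is a[1..n]
    n = len(values)
    for k in range(2, n + 1):
        temp = a[k]
        ptr = k - 1
        while temp < a[ptr]:  # the sentinel guarantees this loop stops
            a[ptr + 1] = a[ptr]  # move the larger element one place forward
            ptr -= 1
        a[ptr + 1] = temp  # insert the element in its proper place
    return a[1:]  # drop the sentinel


print(insertion_sort([77, 33, 44, 11, 88, 22, 66, 55]))
```

The sentinel removes the need to test PTR ≥ 1 inside the inner loop, because no element can be smaller than –∞.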

SELECTION SORT

Suppose an array A with n elements A [1], A [2], …, A [N] is in memory. The selection sort
algorithm for sorting A works as follows:
(i) First find the smallest element in the list and put it in the first position.
(ii) Then find the second smallest element in the list and put it in the second position, and
so on.
Pass 1
Find the location LOC of the smallest in the list of N elements A [1], A [2], …, A [N] and then
interchange A [LOC] and A [1]. Then A [1] is sorted.
Pass 2
Find the location LOC of the smallest in the sublist of N – 1 elements A [2], A [3], …, A [N] and
then interchange A [LOC] and A [2]. Then A [1], A [2] is sorted, since A [1] ≤ A [2].
Pass 3
Find the location LOC of the smallest in the sublist of N – 2 elements A [3], A [4], …, A [N] and
then interchange A [LOC] and A [3]. Then A [1], A [2], A [3] is sorted, since A [2] ≤ A [3].
…
Pass N – 1
Find the location LOC of the smaller of the elements A [N – 1], A [N] and then interchange
A [LOC] and A [N – 1]. Then A [1], A [2], …, A [N] is sorted, since A [N – 1] ≤ A [N].
Thus A is sorted after N – 1 passes.
Example:
Suppose an array A contains 8 elements as follows: 77, 33, 44, 11, 88, 22, 66, 55

Applying the selection sort yields

An asterisk marks A [K] and A [LOC], the two elements interchanged in each pass.

PASS              A[1]  A[2]  A[3]  A[4]  A[5]  A[6]  A[7]  A[8]
K = 1, LOC = 4    *77    33    44   *11    88    22    66    55
K = 2, LOC = 6     11   *33    44    77    88   *22    66    55
K = 3, LOC = 6     11    22   *44    77    88   *33    66    55
K = 4, LOC = 6     11    22    33   *77    88   *44    66    55
K = 5, LOC = 8     11    22    33    44   *88    77    66   *55
K = 6, LOC = 7     11    22    33    44    55   *77   *66    88
K = 7, LOC = 7     11    22    33    44    55    66   *77    88
Sorted             11    22    33    44    55    66    77    88

In pass 7, A [7] = 77 is already the smaller of A [7] and A [8], so LOC = 7 and the interchange
leaves the list unchanged.

The Selection Sort Algorithm

(Selection Sort) SELECTION (A, N)

This algorithm sorts the array A with N elements. The procedure MIN (A, K, N, LOC) finds the
location LOC of the smallest element among A [K], A [K + 1], …, A [N].
1. Repeat Steps 2 and 3 for K = 1, 2, …, N – 1
2.     Call MIN (A, K, N, LOC)
3.     [Interchange A [K] and A [LOC]]
       Set TEMP := A [K], A [K] := A [LOC] and A [LOC] := TEMP
   [End of Step 1 loop]
4. Exit
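
SELECTION and the MIN procedure it calls can be sketched in Python as shown below. The names min_loc and selection_sort are assumptions made for this illustration, min_loc is only a stand-in for MIN as the passes above describe it, and indices are 0-based, so location 4 in the text corresponds to index 3 here.

```python
def min_loc(a, k):
    """Return the index of the smallest element among a[k], a[k+1], ..."""
    loc = k
    for j in range(k + 1, len(a)):
        if a[j] < a[loc]:
            loc = j
    return loc


def selection_sort(values):
    """Return a sorted copy of values using selection sort."""
    a = list(values)
    for k in range(len(a) - 1):  # N - 1 passes
        loc = min_loc(a, k)
        a[k], a[loc] = a[loc], a[k]  # interchange A[K] and A[LOC]
    return a


print(selection_sort([77, 33, 44, 11, 88, 22, 66, 55]))
```

Python's tuple assignment replaces the three-step interchange through TEMP; when loc equals k the interchange simply leaves the list unchanged.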
