Data Structure Manual


COURSE TITLE: DATA STRUCTURE

COURSE CODE: CTE 115

At the beginning of studying data structures, one should ask: (1) What is data?
(2) Where does data come from? (3) How is it computed? (4) How can we
manage data?

Data are raw figures, values or facts pertaining to an object, person,
organization, place or thing. Data are considerably more than just a collection of
numbers; each number must be interpreted as part of a representation of an
object. Sources of data: data (singular: datum) are the raw material from which
information is derived. Data can come from many sources; they may be compiled
as a result of a telephone call, received in the form of letters and turnaround
documents, collected at a remote workstation, or captured at a point-of-sale
transaction. Data can also be obtained in other ways, e.g. a census taken in the
street, or sensors that relay data to a central computer that controls traffic. You
must know in advance exactly what you want to do with the data before you collect them.

Data management is the storage, retrieval, classification and manipulation of
data. There are two basic methods of managing data: the traditional method and
the computer-based method.
DATA HIERARCHY
There are six levels of data hierarchy namely:

(1) Bit
(2) Character
(3) Field
(4) Record
(5) File
(6) Data Base

BIT: The term is derived from the two words binary digit (0 & 1). It is the smallest
unit of information and the most primitive data item that a computer can
manipulate.

CHARACTER: This refers to a single letter, numeral or special symbol, e.g. A, B, C,
/, ?. A group of characters is known as a character string.

FIELD: A single elementary unit of information representing an attribute of an
entity, e.g. a name, an address, a phone number.

RECORD: A collection of related data items, each of which is called a field or
attribute, e.g. student records, farm records, employee records. An example of a
student record is: MATRIC NO., AGE, NAME, CLASS, DEPARTMENT.

FILE: A collection of records of the entities in a given entity set, e.g. a file of
ND 1 students of the computer engineering department.

DATABASE: A repository where data/information are kept and organised for easy
access, management and updating, e.g. sales transactions, customer data,
financial and product information.

Hierarchy (top to bottom): a File is made up of Records; each Record is made up
of Fields; each Field is made up of Characters.

BIT, BYTE AND WORD

BIT: Coined from the two words binary digit (0 & 1); it is the smallest unit of
information and the most primitive data item which a computer can manipulate.

BYTE: Formed when bits are grouped together. It consists of eight (8) consecutive
bits and is the most universally accepted basic unit of memory-size measurement.

WORD: A fixed-size group of bytes/bits handled as a unit by the computer
(e.g. 16 bits or two bytes).

The bits of a number are usually numbered starting from 0, from right to left.
For a 16-bit word:

15 14 13 12 11 10 9 8 | 7 6 5 4 3 2 1 0
     (high byte)      |    (low byte)

The byte on the right is often called the low byte, while the one on the left is
called the high byte.

Most computers have no instruction to manipulate a single bit. However, some
programming languages provide bit-wise operations that may be applied to a byte
or word with an appropriate sequence of bits (called a mask) to isolate a bit for
manipulation.
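Most such languages expose the bit-wise operators directly. The following is a minimal C sketch (the variable names and the chosen bit position are only for illustration) showing how a mask isolates one bit of a byte:

#include <stdio.h>

int main(void)
{
    unsigned char word = 0x5A;               /* the byte 0101 1010                     */
    unsigned char mask = 1u << 3;            /* mask with a single 1 in bit position 3 */

    printf("bit 3 is %d\n", (word & mask) ? 1 : 0);   /* test (isolate) the bit        */
    word |= mask;                            /* set bit 3 to 1                         */
    word &= (unsigned char)~mask;            /* clear bit 3 back to 0                  */
    printf("bit 3 is now %d\n", (word & mask) ? 1 : 0);
    return 0;
}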
Example: Student records have variable lengths, since different students take
different numbers of courses. Variable-length records have a minimum and a
maximum length. The above organization of data into fields, records and files may
not be complex enough to maintain and efficiently process certain collections of
data. For this reason, data are also organized into more complex types of
structures.

TRADITIONAL APPROACH TO DATA MANAGEMENT

In traditional file processing, files are sorted, merged and processed by a key data
element. When you write programs based on the traditional approach, data are
manipulated and retrieved on the recording medium by either sequential or random-
access capabilities. The problems inherent in traditional file processing are
overcome by DBMS (Database Management System) software. DBMSs vary
considerably among the commercially available software packages; the main models are:

1. The hierarchical DBMS – trees
2. The network (CODASYL) DBMS – graphs
3. The relational DBMS – tables (e.g. Oracle)

DATA STRUCTURE
Data structure is the arrangement of data in an interconnected manner
so as to make access to each data item easy via an algorithm. Usually,
algorithms are defined on the structure using storage and retrieval instructions such
as add and fetch, and processing operations such as search. Examples of data
structures are array, set, stack, queue, matrix, record, tree, graph, list etc. These
can conveniently be specified in an abstract data language. A data structure is the
principal method for collecting related data values into a single unit. An employee
record may consist of a name, address, phone number and salary, each of which
may be of a different data type, but which together represent the employee's
information as a whole. Another example is a group of items, all of which have the
same data type and which need to be kept together for the purpose of searching.
A typical data structure provided by programming languages is the "array". An
array collects data into a sequence of individually indexed items, e.g. int A[10],
where "A" is the name of an array of 10 integer values. An array is a set of
successive locations in memory where data are kept. Depending on your
requirements and project, it is important to choose the right data structure. For
example, an array is a good choice if a user wants to store data in normal
(sequential) order in memory.

Memory location: 1004  1005  1006  1007  1008  1009  1010
Contents:          ***     2     1     5     3     4   ***
Index:                     0     1     2     3     4
Array data structure representation
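As a small illustration in C (a sketch only; the values 2, 1, 5, 3, 4 are the ones shown in the figure above), such an array can be declared and each item reached through its index:

#include <stdio.h>

int main(void)
{
    int A[5] = {2, 1, 5, 3, 4};   /* five integers stored in successive memory locations */
    int i;

    for (i = 0; i < 5; i++)
        printf("A[%d] = %d\n", i, A[i]);   /* index i selects one element                  */
    return 0;
}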

NEED FOR FASTER PROCESSING

1. Improved productivity: more jobs can be done within the same space of time
that a slower processor would take to complete one job.

2. Reduced time taken to solve complex problems (i.e. some large simulations
and number-crunching programs).

3. Generation of relevant information is made readily available at a faster rate.

4. The rate of data processing and information generation is sustained as data
grow daily; due to growth in organizations and society, data are becoming
massive and growing at a faster rate. There is a need to meet this challenge
and keep up the rate of production of information over the same period.

CONCEPT OF ABSTRACTION

Abstraction is one of the most important means of handling complexity. It
entails the separation of the relevant and the irrelevant aspects of an object. In
high-level programming languages, abstraction is used to define more meaningful
and concise constructs of data, operations and control by hiding some of the
underlying details of the programming.

ABSTRACT DATA TYPE (ADT): the specification of a data type which specifies the
logical and mathematical model of the data type. In other words, an ADT is defined
as a mathematical model of the data objects that make up a data type, as well as
the functions that operate on these objects. We can think of an ADT as a
mathematical model with a collection of operations defined on that model. Sets of
integers, together with the operations of union, intersection and set difference,
form a simple example of an ADT. So, a useful tool for specifying the logical
properties of a data type is the abstract data type.

An ADT is the specification of logical and mathematical properties of a data
type or data structure. A data type is a class of data objects together with a set of
operations for creating and manipulating them. Data object represents space in
the memory of actual processor, which can be recognized by variable name or
identifier name by the programmer. Data object is also called a container for data
values. A data object is elementary if it contains a data value that is always
manipulated as a unit, and it is called a data structure if it is an aggregate of other
objects. The definition of an ADT involves mainly two parts:

1. Description of the way in which components are related to each other.

2. Statement of operations that can be performed on the data type.

An ADT acts as a useful guideline to implement a data type correctly. An
implementation chooses a data structure to represent the ADT; each data
structure is built up from the basic data types of the underlying programming
language using the available data-structuring facilities. An ADT (i.e. objects
together with their access functions) may be defined behaviourally by a set of
functions. Each function is specified in terms of pre- and post-conditions using the
abstract model specification technique called ABSTRACT SPECIFICATION. In
designing an ADT, we often ask the following three questions:

a. What data structure suitably represents the objects we wish to manipulate?

b. What are the fundamental operations that can legitimately be performed on
such objects?

c. What properties should the objects possess?

When these questions are correctly answered, only the relevant information is
made available to the user; details of the implementation are hidden from the
user. Checks can then easily be made to ensure that the implementation of the
designed ADT is a proper representation of what is expected or required.
Examples of ADTs that will be discussed are stack, queue, list, tree and graph.
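As a foretaste of the stack ADT, the following C sketch separates the operations the user sees from one possible hidden (array-based) representation. The type and function names (Stack, init, push, pop, isEmpty) and the capacity MAXSTACK are choices made for this example, not notation from the text:

#define MAXSTACK 100

typedef struct {
    int items[MAXSTACK];   /* hidden representation: a fixed-size array                 */
    int top;               /* index of the topmost element, -1 when the stack is empty  */
} Stack;

/* The operations that make up the ADT's interface */
void init(Stack *s)          { s->top = -1; }
int  isEmpty(const Stack *s) { return s->top == -1; }
void push(Stack *s, int x)   { s->items[++s->top] = x; }    /* assumes the stack is not full  */
int  pop(Stack *s)           { return s->items[s->top--]; } /* assumes the stack is not empty */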
ABSTRACT DATA TYPE SPECIFICATION
The specification of an ADT does not imply any implementation consideration.
The implementation of an ADT is a translation, into statements of a programming
language, of the declaration that defines a variable to be of that particular ADT,
including the procedures in that language for each operation of the ADT. The aim
here is to show how to specify the various attributes of our ADT. The specification
of an abstract data type usually defines a syntax part and a semantics part. The
syntax part describes the manner in which constructs can be used; this covers
such details as the names of operations and the types of their arguments and
results. The semantics part, on the other hand, describes the meaning or behaviour
of the operations on the abstract data type, meaning that it can be defined by
specifying the desired axioms.

As part of the semantics section, there may also be a section listing exceptions and
restrictions. The use of abstract data types in a program has the following advantages:

1. It allows separation of objects from their implementation: this allows the
developers to change the implementation of an abstract data type without
affecting the user applications built on it.
2. It encourages modularity and hence inherits the associated benefits such as
readability and ease of maintenance and debugging.
3. It enhances reusability of existing code: a program can be extended easily,
where needed, to meet some changes or additions to its specification by
adding to the existing data types already in the implementation.
SPECIFYING THE SEMANTICS OF ABSTRACT DATA TYPE
There are many approaches to the problems of specifying the semantic of an
abstract data type. All these approaches fall into one of the two categories below:

a. The model approach.


b. The definition approach.

The model approach involves finding some formal mathematical object
whose behaviour is likely to reflect that of the abstract data type. Thus, if we wish
to implement a set, for example, we can equate its properties with those of the
mathematical concept of a set. In the definition approach, the user specifies the
properties that an abstract data type should possess; one way of doing this is to
employ the axiomatic approach.
FORMAL SPECIFICATION APPROACHES
There are 3 basic families of specification approaches, the algebraic, the state
machine and the abstract model also called the predicate.

PRECONDITION RESTRICTION NOTATION

This informs the author of the using program that a given condition is assumed to
hold, and makes that author responsible for ensuring that the offending condition
does not occur.

THE ERROR OR EXCEPTION APPROACH

Here, the detection of the offending condition is made part of the specification,
and a special value is defined to be passed out across the interface. The using
program then has the responsibility to check for that value and take appropriate
action. The responsibility is thereby split between the implementing program for
the function (which must detect the offending condition) and the using program
(which must react to it).
NEED OF A DATA STRUCTURE

Data structure helps us to understand the relationship of one element with the
other and organize it within the memory. The study of complex data structures
includes the following three steps:
1. Logical or mathematical description of the structure
2. Implementation of the structure on a computer
3. Quantitative analysis of the structure, which includes determining the amount of
memory needed to store the structure and the time required to process it.

Areas in which data structures are applied extensively are:
⚫ DBMS
⚫ Simulation
⚫ Operating systems
⚫ Statistical analysis packages
⚫ Artificial intelligence
⚫ Numerical analysis
⚫ Graphics
⚫ Computer design
⚫ Network analysis

In many applications, different data structures can be used to perform the same
operations on the data. In such a situation, there is a trade-off between memory
utilization and run time: one data structure sacrifices memory compactness for
speed; another utilizes memory efficiently but results in a slow run time. So, each
data structure has its own strengths and weaknesses.

CHARACTERISTICS OF DATA STRUCTURES

S/N  DATA STRUCTURE  ADVANTAGES                                         DISADVANTAGES
1.   Array           Quick insertion; very fast access if index known   Slow search; slow deletion; fixed size
2.   Ordered array   Quicker search than an unsorted array              Slow insertion and deletion; fixed size
3.   Stack           Provides last-in, first-out access                 Slow access to other items
4.   Queue           Provides first-in, first-out access                Slow access to other items
5.   Linked list     Quick insertion and quick deletion                 Slow search
6.   Binary tree     Quick search, insertion and deletion if the        Deletion algorithm is complex
                     tree remains balanced
7.   Hash table      Very fast access if key is known; fast insertion   Slow deletion; slow access if key is
                                                                        unknown; inefficient memory usage
8.   Heap            Fast insertion and deletion; fast access to        Slow access to other items
                     largest item
9.   Graph           Models real-world situations                       Some algorithms are slow

NOTE: Data structures and data types are slightly different. A data structure is a
collection of data types arranged in a specific order, while a data type is a class of
data objects together with a set of operations for creating and manipulating them.
Data structures are used in almost every program or software system.
Specific data structures are essential ingredients of many efficient algorithms, and
they make the management of huge amounts of data possible, such as large
databases and internet indexing services. Some formal design methods and
programming languages emphasize data structures, rather than algorithms, as the
key organizing factor in software design.
DATA TYPE VS. DATA STRUCTURE

In a programming language, the data type of a variable is the set of values
that the variable may assume. For example, a variable of type Boolean can assume
either the value true or the value false, but no other value. The basic data types
vary from language to language. The rules for constructing composite data types
out of basic ones also vary from language to language.
Data may be defined as a set of values (a collection of items) needed in an
application or system to process and obtain information. Data occur in types; the
type of a data item determines how the data might be manipulated and the kinds
of operations that may be performed on it. Indeed, a data type is a set of data
items that have the same properties and may be manipulated in the same way;
it has the same set of operations defined on the items that form the set.
DATA ITEMS
A data item refers to a single unit of data in a storage record. The term refers
specifically to the smallest possible unit of information or, more generally, to a single
entry or field. Data items are classified into two kinds:

1. Group items: data items that are themselves divisible into sub-items, e.g.
name: first name, other name and last name.

2. Elementary items: data items that cannot be further divided, e.g. sex, age.

ENTITY: an entity is something that has certain attributes and properties which
may be assigned values; these values may be numeric, non-numeric, or a mixture
of the two (alphanumeric). For example, the following are possible attributes and
their corresponding values of an entity, i.e. a student of a school.

ATTRIBUTE: MATRIC NO.          NAME  CLASS  DEPARTMENT  SEX  AGE
VALUE:     ND/19/COM/FT/410    OLA   100L   COMPUTER    F    21

Entity with similar attributes form an entity set e.g. all 100L students in
department of computer science. Each attribute of an entity set has a range of
value.

The term "information" is used to describe meaningfully processed data, or to
refer to processed data from which meaningful decisions can be made. This is why
data are considerably more than just a collection of numbers or facts; each number
or fact must be part of a model or structure. For example, the facts

ND/14/COM/PT/300  ZAZU   500L  COMPUTER SCI.   20
ND/17/COM/PT/301  SOZO   400L  COMPUTER ENGR.  15
ND/18/COM/PT/305  KELLY  300L  ACCOUNTING      14

are all meaningless unless described as records of the student data file of a school.

TYPES OF DATA STRUCTURE

Depending upon the arrangement of data, data structures can be classified as
arrays, records, linked lists, stacks, queues, trees, graphs etc.

1. ACCORDING TO NATURE OF SIZE

a. Static data structure

b. Dynamic data structure

Static data structure: a data structure is said to be static if it can store data
only up to a fixed number of elements, e.g. an array.

Dynamic data structure: a type of data structure that allows the programmer
to change its size during program execution in order to add or delete data,
e.g. linked list, tree, graph.

2. ACCORDING TO ITS OCCURRENCE

a. Linear data structure

b. Non-linear data structure

LINEAR DATA STRUCTURE: in a linear data structure, data are stored in consecutive
memory locations, i.e. in sequential form; every element in the structure has a
unique predecessor and a unique successor, e.g. array, linked list, queue, stack.

NON-LINEAR DATA STRUCTURE: here, data are stored in non-consecutive memory
locations, i.e. in non-sequential form, e.g. trees, graphs. A non-linear structure is
mainly used to represent data containing a hierarchical relationship between
elements. In this case, elements do not form a sequence; there is no unique
predecessor or unique successor.

Data Structure

• Logical data structures
   - Linear data structures: linear list (array, stack, queue, dequeue, circular),
     linked list (one-way, two-way, circular)
   - Non-linear data structures: tree, graph
• Physical data structures: int, float, char
LINEAR DATA STRUCTURE

Figures (omitted): an array of elements 1, 2, ..., n; a stack with insertion and
deletion at the top; a queue with insertion at the rear and deletion at the front;
a linked list A -> B -> C.

NON-LINEAR DATA STRUCTURE

Figures (omitted): a) a tree; b) a graph.

3. PRIMITIVE AND NON-PRIMITIVE DATA STRUCTURES:

PRIMITIVE DATA STRUCTURES: these are the basic data structures that are directly
operated upon by the machine instructions, i.e. at the primitive level. They have
different representations on different computers. They include integers, floating-point
numbers, characters, string constants, pointers etc.

NON-PRIMITIVE DATA STRUCTURES: these are more complicated data structures,
emphasizing the structuring of a group of homogeneous or heterogeneous data
items. They are derived from the primitive data structures. Arrays, lists, linked lists,
trees and graphs fall into this category.

CLASSIFICATION OF DATA STRUCTURES

Data Structure
• Primitive data structures: integer, float, character, pointer
• Non-primitive data structures: arrays, lists, files
   - Linear lists: stacks, queues
   - Non-linear lists: graphs, trees
4. HOMOGENEOUS AND NON-HOMOGENEOUS DATA STRUCTURES:

In homogeneous data structures, the data elements are all of the same type, as in
an array. In non-homogeneous data structures, the data elements may not be of
the same type, as in a structure (struct) in a programming language such as "C".
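As a rough illustration in C (the field names here are made up for the example), a struct groups items of different types into one non-homogeneous unit, while an array holds items of a single type:

struct Employee {          /* non-homogeneous: fields of different types       */
    char  name[30];
    int   age;
    float salary;
};

int marks[10];             /* homogeneous: ten elements, all of the same type  */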

DATA STRUCTURE OPERATIONS

The data appearing in our data structures are processed by means of certain
operations. As a matter of fact, the particular data structure that one chooses for
a given situation depends largely on the frequency with which specific operations
are performed.

The following are the basic operations on data structures:

1. CREATING: The first operation is to create the data structure. It is the declaration
and initialization of the data structure, and it reserves memory locations for the
data elements.

2. INSERTING: This refers to the operation of adding another element to the
collection or structure.

3. DELETING: Is the removal of a record from the structure.

4. UPDATING: Is the amendment of the data value of the data structure.

5. TRAVERSING: The accessing of each record exactly once so that certain
items in the records may be processed (this processing is sometimes called visiting
the records), e.g. traversing a tree.

6. SEARCHING: It involves finding the location of the record with a given value
or finding the location of all records, which satisfy one or more conditions in the
data.

7. SORTING: The process of arranging the records in some logical order, i.e.
in ascending or descending order with numeric data, or alphabetically with
character data.

8. MERGING: this is the process of combining the data elements (Records) in


two different sorted files into a single sorted file.
9. DESTROYING: this operation is applied when the particular data structure
is no longer needed. It is the last operation of the data structure.

ARRAY

An array is the most frequently used data structure in programming
languages; mathematical problems like matrix algebra can easily be handled with
arrays. An array is a collection of sequential memory locations which can be
referred to by a single name along with a number, known as the index, to access a
particular item of data. In other words, an array is a collection of homogeneous data
elements described by a single name. Each element of an array is referenced by a
subscripted variable or value, called a SUBSCRIPT or INDEX, enclosed in parentheses
or brackets. If an element of an array is referenced by a single subscript, the array
is known as a one-dimensional array or linear array; if two subscripts are required
to reference an element, the array is known as a two-dimensional array, and so on.
Analogously, arrays whose elements are referenced by two or more subscripts are
called multi-dimensional arrays. An array is a data type that is used to represent
a large number of homogeneous values that are all of one data type. The data type
could be char, in which case we have a string; it could just as easily be int, float or
another array. Intuitively, an array is a set of pairs (index, value): for each index
which is defined, there is a value associated with that index.

DECLARATION OF ARRAY

Arrays must be declared before they are used, just like every other variable. The
general form of a declaration specifies:

1. The name of the array

2. The data type of the array

3. The index set (size) of the array

type variable-name[size];

The type specifies the type of elements that will be contained in the array, such as
int, float or char, and the size indicates the maximum number of elements that can
be stored inside the array. So, when we declare an array, we have to assign a type
as well as a size. For example, when we want to store 10 integer values, we can
use the following declaration: int A[10].

By this declaration, we are declaring A to be an array which is to contain in all
10 integer values. When the allocation is done for array A, 10 locations of 2 bytes
each (assuming 2-byte integers), i.e. 20 bytes in all, are allocated, and the starting
address is stored in A. When we say A[0] we are referring to the first integer
value in A.

float A[10];
(here "float" is the data type, "A" is the array name, and 10 is the array size)

ONE-DIMENSIONAL ARRAY:

The simplest form of data structure is the one-dimensional array (linear array);
it is an array that can be represented by only one dimension, such as a row or a
column, and that holds a finite number of data items of the same type. A
one-dimensional array (linear array) is a set of 'n' finite, homogeneous data elements
referenced respectively by a set of n consecutive numbers, usually 1, 2, 3, ..., n.
If B is the name of the array, then its elements can be denoted by the parenthesis
notation B(1), B(2), B(3), ..., B(N) or by the bracket notation B[1], B[2], B[3], ..., B[N].

A linear array is such that:

(i) The elements of the array are referenced respectively by an index set
consisting of 'n' consecutive numbers.

(ii) The elements of the array are stored respectively in successive memory
locations. The number 'n' of elements is called the length or size of the array;
the number 'n' in A[n] is called a subscript or an index, and A[n] is called a
subscripted variable.

If 'n' is 10, then the array elements A[0], A[1], ..., A[9] are stored in sequential
memory locations as follows:

A[0] A[1] A[2] .............. A[9]

In the C language, the elements of an array 'A' are denoted A[0], A[1], A[2], ...,
A[n-1]. In C, an array can always be read or written through a loop; reading or
writing a one-dimensional array requires one loop, for example:

For reading an array of 'n' elements:

for (i = 0; i < n; i++)
    scanf("%d", &a[i]);

For writing an array of 'n' elements:

for (i = 0; i < n; i++)
    printf("%d ", a[i]);

The operations that are normally performed on any linear (one-dimensional) array
are: traversal, search, insertion, deletion, sorting and merging.

REPRESENTING ONE-DIMENSIONAL ARRAY (LINEAR ARRAY) IN MEMORY


Let LA be a linear array in the memory of the computer. The memory of the
computer is simply a sequence of addressed locations. Let us use the notation

LOC(LA[K]) = address of the element LA[K] of the array LA.

The elements of LA are stored in successive memory cells. Accordingly, the
computer does not need to keep track of the address of every element of LA, but
needs to keep track only of the address of the first element of LA, denoted by
Base(LA). Using this address Base(LA), the computer calculates the address of any
element of LA by the following formula:

LOC(LA[K]) = Base(LA) + W(K - lower bound)

Where W is the number of bytes per storage location for one element of the array
LA. The time to calculate LOC(A[K]) is essentially the same for any value of K. Giving
any subscript K, one can locate and access the content of (LA[K]) without scanning
any other element of LA. The memory of the computer is composed of a sequence
of words, each word is referred to by its position in the sequence of words
(address). The address of the first word is normally zero.

Example 1:

Let the base address of the first element of the array be 400 and let each element
of the array occupy 2 bytes in memory. Then the address of the 4th element of an
array A[10] is given as follows (taking K = 4 and lower bound 0):

Address of 4th element = 400 + 2*4
                       = 400 + 8
                       = 408

Length = UB – LB + 1

Where UB is the largest index, called the upper bound and LB is the smallest index,
known as the lower bound of the array.

The length = UB when LB = 1
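The address formula can also be written as a small C helper; the following is a rough sketch (the function name loc and its parameters are illustrative, not part of the text's notation):

/* LOC(LA[K]) = Base(LA) + W * (K - lower bound)                           */
/* base - address of the first element, w - bytes per element,             */
/* k    - index of the wanted element,  lb - lower bound of the index set  */
long loc(long base, long w, long k, long lb)
{
    return base + w * (k - lb);
}

/* Example 1 above: loc(400, 2, 4, 0) evaluates to 408. */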

Example 2

An automobile company uses an array AUTO to record the number of automobiles
sold each year from 1970 through 1988. Rather than beginning the index set with
1, it is more useful to begin the index set with 1970, so that AUTO[K] = number of
automobiles sold in year K.

LB = 1970 (The lower bound)

UB = 1988 (The upper bound)

That is AUTO contains 19 elements and its index set consists of integers from 1970
through 1988.
Example 3
Consider the array AUTO in the previous example, which records the number of
automobiles sold each year from 1970 to 1988. Suppose Base(AUTO) = 200 and
W = 4 words per memory cell for AUTO; then LOC(AUTO[1970]) = 200,
LOC(AUTO[1971]) = 204, LOC(AUTO[1972]) = 208, LOC(AUTO[1973]) = 212.
NOTE: A collection 'A' of data element is said to be indexed if any element of A,
which we shall call AK can be located and processed in a time that is independent
of K (This is a very important property of linear array).
TRAVERSING LINEAR ARRAY
Let A be a collection of data element stored in the memory of the computer.
Suppose we want to print the content of each element of A or we want to count
the element A with a given property. This can be accomplished by traversing A.
That is by accessing and processing (frequently called visiting) each element of A
exactly once. The following algorithm traverses a linear array LA. The simplicity of
algorithm comes from the fact that LA is a linear structure. Other linear structures
such as linked lists, can also be easily traversed. On the other hand, the traversal
of non-linear structures such as trees and graph are considerably more
complicated.
Algorithm 1a (To traverse a linear array)
1. [Initialize counter] Set K := LB
2. Repeat steps 3 and 4 while K <= UB
3. [Visit element] Apply PROCESS to LA[K]
4. [Increase counter] Set K := K + 1
   [End of loop]
5. Exit.
NOTE: - LA is a linear array with lower bound LB and upper bound UB, this
algorithm traverses LA applying the operation process to each element of LA.
Algorithm 1b
1. Repeat for k = LB TO UB
Apply process to LA[k]
[End of Loop]
2. Exit
The operation process in the traversal algorithm may use certain variable which
must be initialized before process is applied to any of the element in the array.
Accordingly, the algorithm may need to be preceded by such an initialization step.
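In C, the traversal algorithm reduces to a single loop. The sketch below assumes an int array with 0-based indices and prints each element as its "process":

#include <stdio.h>

void traverse(int la[], int n)        /* visit each of the n elements exactly once       */
{
    int k;
    for (k = 0; k < n; k++)           /* k runs from the lower bound to the upper bound  */
        printf("%d\n", la[k]);        /* "apply PROCESS to LA[K]" - here, print it        */
}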
Example 4
Consider the array AUTO in Example 2, which records the number of
automobiles sold each year from 1970 through 1988. Each of the following modules,
which carry out the given operations, involves traversing AUTO.
(i) Find the number NUM of years during which more than 300 automobiles
were sold.
1. [Initialization step] Set NUM := 0
2. Repeat for K = 1970 to 1988:
       If AUTO[K] > 300, then: Set NUM := NUM + 1
   [End of loop]
3. Return
Note: this algorithm requires an initialization step for variable NUM before
traversing the array AUTO
(ii) Print each year and the number of automobiles sold in that year.

1. Repeat for K := 1970 to 1988:
       Write: K, AUTO[K]
   [End of loop]
2. Return
INSERTING AND DELETING
INSERTING: Insertion refers to the operation of adding another element to the
collection. If the memory space allocated for a linear array is large enough,
inserting an element at the end is easy, but inserting an element in the middle
poses some difficulty. This is because, on average, about half of the elements of the
array must be moved downward to new locations to accommodate the new element
and maintain the order of the elements.
Example 5
The insertion operation is described below using COURSE, which is a linear array
holding six titles: Agric, Botany, Computer, Geography, History and Physics. The
titles are kept in alphabetical order at all times. If English is added to the array,
Geography, History and Physics will have to move one location downward. If
Government is then added, History and Physics have to move down again.
Algorithm (Inserting into a linear array): INSERT(LA, N, K, ITEM)
1. [Initialize counter] Set J := N
2. Repeat steps 3 and 4 while J >= K
3. [Move Jth element downward] Set LA[J + 1] := LA[J]
4. [Decrease counter] Set J := J - 1
   [End of step 2 loop]
5. [Insert element] Set LA[K] := ITEM
6. [Reset N] Set N := N + 1
7. Exit
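A C version of the insertion algorithm might look like the sketch below. It assumes the array has room for one more element and uses 0-based indices, so k is the position the new item will occupy:

/* Insert item at position k of la[], which currently holds *n elements. */
void insert(int la[], int *n, int k, int item)
{
    int j;
    for (j = *n - 1; j >= k; j--)   /* move elements downward to open a gap at k */
        la[j + 1] = la[j];
    la[k] = item;                   /* insert the new element                    */
    (*n)++;                         /* reset N := N + 1                          */
}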
DELETING
Deleting is the removal of a record from the structure; that is, the operation of
removing one of the elements from the collection of data elements in the memory
of a computer. Similarly, it is easy to delete an element at the end of an array but
tedious to do so in the middle of the array, as it requires moving each subsequent
element one location upward in order to close up the array. Given the array from
the insertion example, removing HISTORY requires PHYSICS to move upward.

COURSE

1 AGRIC
2 BOTANY
3 COMPUTER
4 ENGLISH
5 GEOGRAPHY
6 GOVERNMENT
7 PHYSICS

Algorithm (Deleting from a linear array): DELETE(LA, N, K, ITEM)

1. Set ITEM := LA[K]
2. Repeat for J := K to N - 1:
       [Move (J + 1)st element upward] Set LA[J] := LA[J + 1]
   [End of loop]
3. [Reset the number N of elements in LA] Set N := N - 1
4. Exit
NOTE: LA is a linear array with N elements and K is a positive integer such that
K <= N. This algorithm deletes the Kth element from LA.
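A corresponding C sketch of the deletion algorithm (0-based indices; it returns the element that was removed):

/* Delete and return the element at position k of la[], which holds *n elements. */
int delete_at(int la[], int *n, int k)
{
    int j, item = la[k];            /* save the element being removed        */
    for (j = k; j < *n - 1; j++)    /* move each later element one place up  */
        la[j] = la[j + 1];
    (*n)--;                         /* reset N := N - 1                      */
    return item;
}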
SORTING
Sorting is the process of arranging records in some logical order, such as increasing
or decreasing order with numerical data, or alphabetical order with character data.
For example, suppose A is a list of N elements A1, A2, ..., AN in memory. Sorting A
refers to the operation of rearranging the contents of A so that they are in
increasing order numerically, i.e. A1 < A2 < A3 < ... < AN. Since A has N elements,
there are N! ways that the contents can appear in A.

CLASSIFICATION OF SORTING METHODS
Sorting methods used in computing can be classified into two areas:
(1) internal
(2) external
Internal sorting involves storing in main memory all the data to be sorted.
However, when the amount of data is too large to be stored and sorted in main
memory, it is sorted on an external (secondary) storage medium such as disk, and
successive parts of the data are sorted in main memory; such a technique is known
as external sorting.
The type of sorting method that is used will depend upon at least one
of the following factors:
i. The amount of information to be sorted; clearly it will be time-consuming to
use a relatively inefficient sorting method on a large quantity of data.
ii. The computer configuration being used, i.e. the size of the main memory and
the number of tape or disk units available.
iii. The nature of the application and the urgency of the results.
BUBBLE SORT
Bubble sort is the simplest sorting algorithm; it works by repeatedly swapping
adjacent elements if they are in the wrong order. Suppose the list of numbers
A[1], A[2], ..., A[N] is in memory. The bubble sort algorithm works as follows:
Step 1: Compare A[1] and A[2] and arrange them in the desired order, so that A[1]
< A[2]. Then compare A[2] and A[3] and arrange them so that A[2] < A[3]. Then
compare A[3] and A[4] and arrange them so that A[3] < A[4]. Continue until we
compare A[N-1] with A[N] and arrange them so that A[N-1] < A[N]. When step 1 is
completed, the largest element occupies A[N].
Step 2: Repeat step 1 with one less comparison; that is, we stop after we compare
and possibly rearrange A[N-2] and A[N-1]. Step 2 involves N-2 comparisons and,
when it is completed, the second largest element occupies A[N-1].
Step 3: Repeat step 1 with two fewer comparisons; that is, we stop after we
compare and possibly rearrange A[N-3] and A[N-2].
...
Step N-1: Compare A[1] with A[2] and arrange them so that A[1] < A[2].
NOTE: After N-1 steps, the list will be sorted in increasing order.
Example 6

Suppose the following numbers are stored in an array "A" 32, 51, 27, 85, 66, 23,
13, 57. We apply bubble sort to the array A, each pass will be discussed separately.
Pass 1:
We have the following comparison:
a. Compare A1 and A2: since 32 < 51, the list is not altered.
b. Compare A2 and A3: since 51 > 27, interchange 51 and 27 as follows 32,
27,51,85,66,23,13,57
c. Compare A3 and A4: since 51 < 85, the list is not altered
d. Compare A4 and A5: since 85 > 66, interchange 85 and 66 as
follows 32, 27, 51, 66, 85, 23, 13, 57
e. Compare A5 and A6: since 85 > 23, interchange 85 and 23 as follows
32, 27, 51, 66, 23, 85, 13, 57
f. Compare A6 and A7: since 85 > 13, interchange 85 and 13 as follows 32,
27,51,66,23, 13,85,57
g. Compare A7 and A8: since 85 > 57, interchange 85 and 57 to yield 32, 27, 51,
66, 23, 13, 57, 85
At the end of this first pass, the largest number, 85, has moved to the last
position. However, the rest of the numbers are not sorted even though some of
them have changed their position for the remainder of the passes, we show only
the interchanges.
Pass 2:
27, 32, 51, 66, 23, 13, 57, 85
27, 32, 51, 23, 66, 13, 57, 85
27, 32, 51, 23, 13, 66, 57, 85
27, 32, 51, 23, 13, 57, 66, 85
At the end of pass 2, the second largest number, 66, has moved to the next-to-last
position.
Pass 3:
27, 32, 23, 51, 13, 57, 66, 85
27, 32, 23, 13, 51, 57, 66, 85
Pass 4:
27, 23, 32, 13, 51, 57, 66, 85
27, 23, 13, 32, 51, 57, 66, 85
Pass 5:
23, 27, 13, 32, 51, 57, 66, 85
23, 13, 27, 32, 51, 57, 66, 85
Pass 6:
13, 23, 27, 32, 51, 57, 66, 85
Pass 6 actually has two comparisons: A1 with A2, and A2 with A3. The second
comparison does not involve an interchange.
Pass 7: Finally, A1 is compared with A2; since 13 < 23, no interchange takes place.
Since the list has 8 elements, it is sorted after the seventh pass.
Algorithm (a) Bubble sort
1. Repeat step 2 and 3 for k = 1 to N-1
2. Set PTR = 1 [initialize pass pointer PTR]
3. Repeat while PTR <= N-K: [Execute pass]
(a) If DATA[PTR] > DATA[PTR + 1], then interchange
DATA[PTR] and DATA[PTR +1]
[End If structure]
Set PTR := PTR +1
[End of inner loop]
[End of step 1 outer loop]
4. Exit.
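The same algorithm written as a C function (a sketch for an int array data[] holding n elements):

void bubble_sort(int data[], int n)
{
    int k, ptr, temp;
    for (k = 1; k <= n - 1; k++) {            /* N-1 passes                          */
        for (ptr = 0; ptr < n - k; ptr++) {   /* one pass over the unsorted portion  */
            if (data[ptr] > data[ptr + 1]) {  /* adjacent pair in the wrong order    */
                temp = data[ptr];             /* interchange DATA[PTR], DATA[PTR+1]  */
                data[ptr] = data[ptr + 1];
                data[ptr + 1] = temp;
            }
        }
    }
}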

GRAPH
A graph is a non-linear data structure that consists of a set of vertices (also
known as nodes) and edges. The vertices represent objects, and the edges
represent relationships between the objects. In a graph, the vertices and edges can
have different weights or values that represent the strength or importance of the
relationship between the vertices. A graph can be either directed or undirected. In
a directed graph, the edges have a direction, meaning that they have a start vertex
and an end vertex, while in an undirected graph, the edges have no direction and

are bidirectional. Graphs are used in various applications such as computer
networks, social networks, road networks, and many others. They provide a way
to model and analyse complex relationships between objects. Different types of
graphs exist, including weighted graphs, connected graphs, disconnected graphs,
and many others, each with its own properties and uses. In other words, a graph
is a pictorial representation of a set of objects where some pairs of objects are
connected by links. The interconnected objects are represented by points termed
as vertices, and the links that connect the vertices are called edges. More Formally,
a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of
edges connecting the pairs of vertices. Consider, for example, the following graph
(figure omitted), in which

V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}
The concept of graphs in graph theory stands up on some basic terms such as
point, line, vertex, edge, degree of vertices, properties of graphs, etc.
Mathematical graphs can be represented in data structure. We can represent a
graph using an array of vertices and a two-dimensional array of edges.
IMPORTANT TERMS USED IN GRAPHS
• Vertex − Each node of the graph is represented as a vertex. In the following
example, the labelled circle represents vertices. Thus, A to G are vertices.
We can represent them using an array as shown in the following image. Here
A can be identified by index 0. B can be identified using index 1 and so on.
• Edge − Edge represents a path between two vertices or a line between two
vertices. In the following example, the lines from A to B, B to C, and so on
represents edges. We can use a two-dimensional array to represent an array
as shown in the following image. Here AB can be represented as 1 at row 0,
column 1, BC as 1 at row 1, column 2 and so on, keeping other combinations
as 0.
• Adjacency − Two nodes or vertices are adjacent if they are connected to each
other through an edge. In the following example, B is adjacent to A, C is
adjacent to B, and so on.
• Path − Path represents a sequence of edges between the two vertices. In
the following example, ABCD represents a path from A to D.

BASIC OPERATIONS
Following are basic primary operations of a Graph −
• Add Vertex − Adds a vertex to the graph.
• Add Edge − Adds an edge between the two vertices of the graph.
• Display Vertex − Displays a vertex of the graph.
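These operations can be sketched in C with an adjacency-matrix representation of the graph. The limit MAXV and the function names below are assumptions of this example:

#include <stdio.h>

#define MAXV 10

char vertices[MAXV];       /* vertex labels, e.g. 'A', 'B', 'C', ...                */
int  adj[MAXV][MAXV];      /* adj[i][j] = 1 if an edge joins vertex i and vertex j  */
int  nvertices = 0;

void addVertex(char label)   { vertices[nvertices++] = label; }   /* Add Vertex     */

void addEdge(int i, int j)   /* Add Edge: undirected edge between vertices i and j  */
{
    adj[i][j] = 1;
    adj[j][i] = 1;
}

void displayVertex(int i)    { printf("%c\n", vertices[i]); }     /* Display Vertex */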

SYMBOL AND RELATIONS

RELATIONS
A binary relation is determined by specifying all ordered pairs of objects in that
relation; it does not matter by what property the set of these ordered pairs is
described. We are led to the following definition: “A set R is a binary relation if all
elements of R are ordered pairs, i.e., if for any z ∈ R there exist x and y such that z
= (x, y). It is customary to write xRy instead of (x, y) ∈ R. We say that x is in relation
R with y if xRy holds. The set of all x which are in relation R with some y is called
the domain of R and denoted by “dom R.” So, dom R = {x | there exists y such that
xRy}. dom R is the set of all first coordinates of ordered pairs in R. The set of all y
such that, for some x, x is in relation R with y is called the range of R, denoted by
“ran R.” So, ran R = {y | there exists x such that xRy}.
SYMBOL
A symbol is something such as an object, picture, written word, sound, or
particular mark that represents something else by association, resemblance, or
convention. For example, a red octagon may stand for "STOP". On maps, crossed
sabres may indicate a battlefield. Numerals are symbols for numbers. The study or
interpretation of symbols is known as symbology, and the study of signs is known
as semiotics.

Logic Symbols

s/n  Symbol  LaTeX Command     HTML Entity
1    ¬       \neg              &not;
2    ∧       \wedge            &and;
3    ∨       \vee              &or;
4    ⊕       \oplus            &oplus;
5    ⇒       \Rightarrow
6    ⇔       \Leftrightarrow
7    ∃       \exists           &exist;
8    ∀       \forall           &forall;

EQUIVALENCE RELATION
An equivalence relation is a binary relation between two elements of a set
which groups them together as being "equivalent" in some way. Let a, b, and c be
arbitrary elements of some set X. Then "a ~ b" or "a ≡ b" denotes that a is
equivalent to b. An equivalence relation "~" is reflexive, symmetric, and transitive.
In other words, the following must hold for "~" to be an equivalence relation on X:
• Reflexivity: a ~ a
• Symmetry: if a ~ b then b ~ a
• Transitivity: if a ~ b and b ~ c then a ~ c.
An equivalence relation partitions a set into several disjoint subsets, called
equivalence classes. All the elements in a given equivalence class are equivalent
among themselves, and no element is equivalent to any element from a different
class. The equivalence class of a under "~", denoted [a], is the subset of X
consisting of all elements b such that a ~ b. X together with "~" is called a setoid.
A ubiquitous example of an equivalence relation is the equality ("=") relation
between elements of any set.
Other examples include:
• "Has the same birthday as" on the set of all people, given naive set theory.
• "Is similar to" or "congruent to" on the set of all triangles.
• "Is congruent to modulo n" on the integers.
• "Has the same image under a function" on the elements of the domain of the
function.
• Logical equivalence of logical sentences.
• "Is isomorphic to" on models of a set of sentences.
• In some axiomatic set theories other than the canonical ZFC (e.g., New
Foundations and related theories):
➢ Similarity on the universe of well-orderings gives rise to equivalence classes
that are the ordinal numbers.
➢ Equinumerosity on the universe of:
▪ Finite sets give rise to equivalence classes which are the natural
numbers.
▪ Infinite sets give rise to equivalence classes which are the transfinite
cardinal numbers.
• Let a, b, c, d be natural numbers, and let (a, b) and (c, d) be ordered pairs of such
numbers. Then the equivalence classes under the relation (a, b) ~ (c, d) are the:
▪ Integers if a + d = b + c;
▪ Positive rational numbers if ad = bc.
• Let (rn) and (sn) be any two Cauchy sequences of rational numbers. The real
numbers are the equivalence classes of the relation (rn) ~ (sn), if the sequence (rn
− sn) has limit 0.
• Green's relations are five equivalence relations on the elements of a semigroup.
• "Is parallel to" on the set of subspaces of an affine space.
Examples of relations that are not equivalences:
• The relation "≥" between real numbers is reflexive and transitive, but not
symmetric. For example, 7 ≥ 5 does not imply that 5 ≥ 7. It is, however, a partial
order.
• The relation "has a common factor greater than 1 with" between natural
numbers greater than 1, is reflexive and symmetric, but not transitive. (The natural
numbers 2 and 6 have a common factor greater than 1, and 6 and 3 have a common
factor greater than 1, but 2 and 3 do not have a common factor greater than 1).
• The empty relation R on a non-empty set X (i.e., aRb is never true) is vacuously
symmetric and transitive, but not reflexive. (If X is also empty then R is reflexive).
• The relation "is approximately equal to" between real numbers, even if more
precisely defined, is not an equivalence relation, although reflexive and symmetric,
it is not transitive, since multiple small changes can accumulate to become a big
change. However, if the approximation is defined asymptotically, for example by
saying that two functions f and g are approximately equal near some point if the
limit of f-g is 0 at that point, then this defines an equivalence relation.
• The relation "is a sibling of" on the set of all human beings is not an equivalence
relation. Although siblinghood is symmetric (if A is a sibling of B, then B is a sibling
of A) it is neither reflexive (no one is a sibling of himself), nor transitive (since if A
is a sibling of B, then B is a sibling of A, but A is not a sibling of A). Instead of being
transitive, siblinghood is "almost transitive", meaning that if A ~ B, and B ~ C, and
A ≠ C, then A ~ C. However, the relation "A is a sibling of B or A is B" is an
equivalence relation. (This applies only to full siblings. A and B could have the same
mother, and B and C the same father, without A and C having a common parent).
• The concept of parallelism in ordered geometry is not symmetric and is
therefore, not an equivalence relation.
An equivalence relation on a set is never an equivalence relation on a
proper superset of that set. For example, R = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3),
(3,1), (3,2), (3,3)} is an equivalence relation on {1,2,3} but not on {1,2,3,4} or on the
natural number. The problem is that reflexivity fails because (4,4) is not a member.

SYMBOLS FOR EXPRESSING RELATIONS AMONG DATA


Relationship symbols show how numbers or terms of an equation relate to
each other. The relationship symbols show if one value is larger, smaller, equal, or
approximately equal to another value. The so-called semantic modeling method
nowadays is commonly used in database structure design. Semantic modeling is
modeling data structures, based on the meaning of these data. Different variants
of the entity-relationship diagrams are used as a tool for the semantic modeling.
ER-model based diagrams have three main components: an entity, a relation and
attributes. An entity is a class of similar objects, information about which should
be taken into account in the model. Each entity must have a name, expressed by a
noun in the singular. Examples of entities can be such classes of objects as
"Supplier", "Employee", "Invoice". Each entity in the model is depicted in the form
of a rectangle with the name. While determining the type of entity, it is necessary
to be sure that each instance of an entity can be distinguished from any other
instance of the same entity. A relation is presented in the form of a non-directional
line connecting two entities or leading from an entity to itself. At the point where
the line "docks" with an entity, one of the following is used:

• A three-point entry to the entity rectangle, if many instances of the entity may
participate in this relation
• A single-point entry, if only one instance of the entity can (or should) participate
in the relation.

Below are the specialized Entity Relationship diagram symbols and the meanings of
those symbols.

Entity Relationship Diagram Symbols — Chen notation

Entities

Entity: An entity is represented by a rectangle which contains the entity's name.

Weak Entity: An entity that cannot be uniquely identified by its attributes alone.
The existence of a weak entity is dependent upon another entity called the owner
entity. The weak entity's identifier is a combination of the identifier of the owner
entity and the partial key of the weak entity.

Associative Entity: An entity used in a many-to-many relationship (represents an
extra table). All relationships for the associative entity should be many.

Attributes

Attribute: In the Chen notation, each attribute is represented by an oval containing
the attribute's name.

Key attribute: An attribute that uniquely identifies a particular entity. The name of
a key attribute is underscored.

Multivalued attribute: An attribute that can have many values (there are many
distinct values entered for it in the same column of the table). A multivalued
attribute is depicted by a dual oval.

Derived attribute: An attribute whose value is calculated (derived) from other
attributes. The derived attribute may or may not be physically stored in the
database. In the Chen notation, this attribute is represented by a dashed oval.

Relationships

Strong relationship: A relationship where the entity is existence-independent of
other entities, and the PK of the child doesn't contain a PK component of the
parent entity. A strong relationship is represented by a single rhombus.

Weak (identifying) relationship: A relationship where the child entity is
existence-dependent on the parent, and the PK of the child entity contains a PK
component of the parent entity. This relationship is represented by a double
rhombus.

Composite Relations

If the elements of a set A are related to those of a set B, and those of B are in turn
related to the elements of a set C, then one can expect a relation between A and
C. For example, if Tom is my father (parent-child relation) and Sarah is a sister of
Tom (sister relation), then Sarah is my aunt (aunt-nephew/niece relation).
Composite relations give that kind of relations.
Definition (composite relation): Let R1 be a binary relation from a set A to
a set B, and R2 a binary relation from B to a set C. Then the composite relation
from A to C, denoted by R1R2 (also written R1 ∘ R2), is defined as:

R1R2 = { <a, c> | a ∈ A ∧ c ∈ C ∧ ∃b [ b ∈ B ∧ <a, b> ∈ R1 ∧ <b, c> ∈ R2 ] }

In English, this means that an element a in A is related to an element c in C if there


is an element b in B such that a is related to b by R1 and b is related to c by R2.
Thus, R1R2 is a relation from A to C via B in a sense. If R1 is a parent-child relation
and R2 is a sister relation, then R1R2 is an aunt-nephew/niece relation.
Properties of Composite Relations
Composite relations defined above have the following properties. Let R1 be a
relation from A to B, and R2 and R3 be relations from B to C. Then
1. R1(R2R3) = (R1R2)R3
2. R1(R2 ∪ R3) = R1R2 ∪ R1R3
3. R1(R2 ∩ R3) ⊆ R1R2 ∩ R1R3
Powers of Relation
Let R be a binary relation on A. Then R^n for all positive integers n is defined
recursively as follows:
Definition (power of relation):
Basis Clause: R^0 = E, where E is the equality relation on A.
Inductive Clause: For an arbitrary natural number n, R^(n+1) = R^n R. Note that
there is no need for an extremal clause here. Thus, for example, R^1 = R,
R^2 = RR, and R^3 = R^2 R = (RR)R = R(RR) = RRR. The powers of a binary
relation R on a set A defined above have the following properties:
1. R^(m+n) = R^m R^n
2. (R^m)^n = R^(mn)
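A finite relation on a small set can be represented as a boolean (0/1) matrix, and the composite R1R2 is then computed as sketched below; composing R with itself repeatedly gives the powers R^n. The size N and the names used are assumptions of this example:

#define N 4   /* number of elements in the underlying set */

/* c becomes the composite R1R2: c[a][x] = 1 iff there is some b
   with r1[a][b] = 1 and r2[b][x] = 1                            */
void compose(int r1[N][N], int r2[N][N], int c[N][N])
{
    int a, b, x;
    for (a = 0; a < N; a++)
        for (x = 0; x < N; x++) {
            c[a][x] = 0;
            for (b = 0; b < N; b++)
                if (r1[a][b] && r2[b][x])
                    c[a][x] = 1;
        }
}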

PROPERTIES OF GRAPH

Graphs, like the dynamic systems of objects they represent, take on an


unfathomable number of shapes & sizes; it therefore helps to create a set of
properties in order to specify unique graph attributes. Let’s examine the defining
properties of our example simple graph:

a) Route: Routing is the act of finding a path to a destination and moving data
across this path from source to destination. It is also known as path in graph
theory, which is any route along the edges of a graph. A path may follow a
single edge directly between two vertices, or it may follow multiple edges
through multiple vertices. If there is a path linking any two vertices in a
graph, that graph is said to be connected.
b) Edges: The line segment which acts as an interface between two faces is
called an edge. Sometimes it is also described as the line segment joining
two vertices. A vertex is a corner. An edge is a line segment between faces.
A face is a single flat surface.
c) Sequences: Sequence graph, also called an alignment graph, breakpoint
graph, or adjacency graph, are bidirected graphs used in comparative
genomics. The structure consists of multiple graphs or genomes with a
series of edges and vertices represented as adjacencies between segments
in a genome respectively.
d) Directed: A directed graph (or digraph) is a set of vertices and a collection
of directed edges, each of which connects an ordered pair of vertices. We say
that a directed edge points from the first vertex in the pair and points to
the second vertex in the pair. We use the names 0 through V-1 for the
vertices in a V-vertex graph. In a directed graph, every vertex has a
minimum of one incoming edge and one outgoing edge, signifying the strict
direction of each edge relative to its two connected vertices. Directed
graphs are a class of graphs that don't presume symmetry or reciprocity in
the edges established between vertices. In a directed graph, if two vertices
are connected by an edge in one direction, this doesn't necessarily mean
that an edge connecting them in the opposite direction also exists.

e) Non-directed/undirected: An undirected graph is a graph, i.e. a set of
objects (called vertices or nodes) that are connected together, in which all the
edges are bidirectional. Undirected graphs are more restrictive kinds of
graphs. They represent only whether or not a relationship exists between
two vertices; they do not, however, represent a distinction between subject
and object in that relationship. One type of graph can sometimes be used
to approximate the other. Undirected graphs are more specific: for them,
there's an extra assumption regarding the reciprocity in the relationship
between pairs of vertices connected by an edge. If an edge exists between
two vertices A and B, then it implicitly also exists from B to A.

SETS
Sets are fundamental objects that can be used to define all other concepts
in mathematics; they are not defined in terms of more fundamental concepts.
Rather, sets are introduced either informally, and understood as something
self-evident, or, as is now standard in modern mathematics, axiomatically, with
their properties postulated by the appropriate formal axioms. The language of set
theory is based on a single fundamental relation, called membership. We say that
A is a member of B (in symbols A ∈ B), or that the set B contains A as its element.
The understanding is that a set is determined by its elements; in other words, two
sets are deemed equal if they have exactly the same elements. In practice, one
considers sets of numbers, sets of points, sets of functions and so on. In theory, it
is not necessary to distinguish between objects that are members and objects that
contain members; the only objects one needs for the theory are sets.

The type SET denotes sets whose elements are integers in the range 0 to a small
number, typically 31 or 63. Given, for example, variables

VAR r, s, t: SET

possible assignments are

r := {5}; s := {x, y .. z}; t := {}

Here, the value assigned to r is the singleton set consisting of the single element 5;
to t is assigned the empty set, and to s the elements x, y, y+1, ..., z-1, z.

SET OPERATIONS
A relation is a set of ordered pairs if it is a binary relation, and a set of
ordered n-tuples if it is an n-ary relation. Thus, all the set operations, such as
union (∪), intersection (∩), and complementing, apply to relations. For example, the
union of the "less than" and "equality" relations on the set of integers is the "less
than or equal to" relation on the set of integers. The intersection of the "less than"
and "less than or equal to" relations on the set of integers is the "less than" relation
on the same set. The complement of the "less than" relation on the set of integers
is the "greater than or equal to" relation on the same set. The following elementary
operators are defined on variables of type SET:
* set intersection
+ set union
- set difference
/ symmetric set difference
IN set membership
Constructing the intersection or the union of two sets is often called set
multiplication or set addition, respectively; the priorities of the set operators are
defined accordingly, with the intersection operator having priority over the union
and difference operators, which in turn have priority over the membership
operator, which is classified as a relational operator. Following are examples of set
expressions and their fully parenthesized equivalents:
r * s + t = (r*s) + t
r - s * t = r - (s*t)
r - s + t = (r-s) + t
r + s / t = r + (s/t)
x IN s + t = x IN (s+t)
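On most machines a SET of this kind can be held in a single word, and the operators above then map onto bitwise operations. The following is a rough C++ sketch under that assumption (elements restricted to the range 0..31; the variable names echo the example above):

#include <cstdint>
#include <iostream>
using namespace std;

int main() {
    // A SET with elements in 0..31 can be held in one 32-bit word.
    uint32_t r = (1u << 5);                  // r = {5}
    uint32_t s = (1u << 2) | (1u << 3);      // s = {2, 3}
    uint32_t t = 0;                          // t = {}

    uint32_t inter = r & s;                  // r * s  (intersection)
    uint32_t uni   = r | s;                  // r + s  (union)
    uint32_t diff  = r & ~s;                 // r - s  (set difference)
    uint32_t sym   = r ^ s;                  // r / s  (symmetric difference)
    bool member    = (s >> 3) & 1u;          // 3 IN s (membership)

    cout << member << " " << inter << " " << uni << " "
         << diff << " " << sym << " " << t << "\n";
    return 0;
}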

Sets, Elements, and Subsets
A set is a number of things naturally connected by location, formation, or order in
time. Although "set" holds the record for the word with the most dictionary
definitions, there are terms mathematicians choose to leave undefined, or rather,
defined by usage. Set, element, member, and subset are four such terms, and they
are discussed in this part. Each item inside a set is termed an element. The brace
symbols { and } are used to enclose the elements of a set. Each element is a member
of the set (or belongs to the set).

The symbol for membership is ∈. It can be read "is an element of" and looks quite
similar to the Greek letter epsilon (ε). The symbol for subset is ⊆. Some books will
allow and use it reversed—we will not. A superset is a set that includes other sets.
For example: if A ⊆ B, then A is a subset of B and B is a superset of A.
A subset might have no members, in which case it is termed the null set or empty
set. The empty set is denoted either by {} or by Ø, a Norwegian letter. The null set
is a subset of every set. Note: a common mistake is to use {Ø} to denote the null
set. This is actually a set with one element and that element is the null set. Since
some people slash their zeroes, it is safest when using handwriting to always use
the notation {} to denote the empty or null set. A singleton is a set with only one
element. A subset might contain every member of the original set. In this case it is
termed an improper subset. A proper subset does not contain every member of
the original set. Sets may be finite, {1, 2, 3..., 10}, or infinite, {1, 2, 3…….}. The
cardinality of a set A, n(A), is how many elements are in the set.
The power set of a set is the complete set of subsets of the set. Only safe sets
will be considered in this part; any set we consider should be well-defined. There
should be no ambiguity as to whether or not an element belongs to a set. That is
why we will avoid things like the village barber who shaves everyone in the village
that does not shave himself. This results in a contradiction as to whether or not he
shaves himself. Also consider Russell's Paradox: Form the set of sets that are not
members of themselves. It is both true and false that this set must contain itself.
These are examples of ill-defined sets. Sometimes, instead of listing elements in a
set, we use set builder notation: {x| x is a letter in the word "mathematics"}. The
symbol | can be read as "such that." Sometimes the symbol ⊂ is reserved to mean
proper subset and the symbol ⊆ is used to allow the inclusion of the improper
subset. Compare this with the use of < and ≤ to exclude or include an endpoint.
We will make no such distinction. A set may contain the same elements as another
set. Such sets are equal or identical sets— element order is unimportant.
A = B where
A = {m,o,r,e} and
B = {r,o,m,e},
In general A = B if A ⊆ B and B ⊆ A. Sets may be termed equivalent if they have the
same cardinality. If they are equivalent, a one-to-one correspondence can be
established between their elements. The universal set is chosen arbitrarily, but
must be large enough to include all elements of all sets under discussion.
Complementary set, A', is a set that contains all the elements of the universal set
that are not included in A. The symbol ' can be read "prime."
For example:
if U={0, 1, 2, 3, 4, 5, 6,...} and

A={0, 2, 4, 6, ...}, then
A'={1, 3, 5, ...}
Care must be taken to avoid such paradoxes as those mentioned above, particularly
those involving infinities.
INTERSECTION AND UNION
Once we have created the concept of a set, we can manipulate sets in useful ways
termed set operations. Consider the following sets: animals, birds, and white
things. Some animals are white: polar bears, mountain goats, big horn sheep, for
example. Some birds are white: dove, stork, sea gulls. Some white things are not
birds or animals (but birds are animals!): snow, milk, wedding gowns (usually). The
intersection of sets is the set of those elements which belong to all intersected sets.
Although we usually intersect only two sets, the definition above is general. The
symbol for intersection is ∩. The union of sets is the set of those elements which
belong to any set in the union. Again, although we usually form the union of only
two sets, the definition above is general. The symbol for union is ∪.
For the example given above, we can see that:
{white things} ∩ {birds} = white birds
{white animals} ∪ {birds} = white animals and all birds
{white birds} ⊆ {white animals} ⊆ {animals}
Another name for intersection is conjunction. This comes from the fact that an
element must be a member of set A and set B to be a member of A ∩ B. Another
name for union is disjunction. This comes from the fact that an element must be a
member of set A or set B to be a member of A ∪ B. Conjunction and disjunction are
grammar terms and date back to when Latin was widely used.
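As a small, hedged illustration of these two operations in code, the sketch below uses std::set together with the standard set_intersection and set_union algorithms; the concrete element values are invented for the example.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include <vector>
using namespace std;

int main() {
    set<string> whiteThings = {"polar bear", "dove", "snow"};
    set<string> birds       = {"dove", "stork", "sea gull"};

    vector<string> inter, uni;
    // Intersection: elements that belong to both sets (the white birds).
    set_intersection(whiteThings.begin(), whiteThings.end(),
                     birds.begin(), birds.end(), back_inserter(inter));
    // Union: elements that belong to either set.
    set_union(whiteThings.begin(), whiteThings.end(),
              birds.begin(), birds.end(), back_inserter(uni));

    for (const auto& x : inter) cout << x << " ";   // prints: dove
    cout << "\n";
    return 0;
}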

OCCUPANCY
Occupancy refers to the number of elements or items stored in a particular
structure, such as an array or a hash table. "Load factor" is the measure of how
full a data structure is, calculated as the ratio of the number of elements stored in
the structure to the number of spaces or buckets available to store elements.
"Lean" generally means that the data structure is not very full and has a low load
factor. This can be desirable in some cases, as it may lead to faster operations and
better performance. "Empty" means that the data structure has no elements and
its load factor is 0. "Loose", in this context, generally means that the data structure
is relatively full and has a high load factor. This can lead to slower operations and
worse performance, as the structure may become too full and require rehashing or
resizing.
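As a quick illustration of the load factor described above, the following sketch computes it for a hypothetical hash table (the numbers are invented for the example):

#include <iostream>
using namespace std;

int main() {
    int elementsStored   = 12;   // occupancy: items currently in the structure
    int bucketsAvailable = 64;   // total slots in the (hypothetical) hash table

    double loadFactor = static_cast<double>(elementsStored) / bucketsAvailable;
    cout << "Load factor = " << loadFactor << "\n";   // 0.1875, i.e. fairly "lean"
    return 0;
}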

"Birth," "Death," and "Change" refer to the addition, removal, and


modification of elements or items stored in the structure. "Birth" refers to the
addition of a new element or item to a data structure. For example, adding a new
item to a list or inserting a new key-value pair into a hash table. "Death" is the
removal of an element or item from a data structure. For example, deleting an item
from a list or removing a key-value pair from a hash table. "Change" is the process
of modifying an existing element or item in a data structure. For example, updating
the value associated with a key in a hash table or changing the value of an element
in a list. It is important to note that different data structures handle these
operations differently: some data structures are efficient at insertions and
deletions, while others are efficient at searching and retrieval.

SEQUENTIAL LIST
A sequential list is also known as a linear list. It is a type of data structure that stores
a sequence of elements in a linear order. Each element in the list has a unique
position, and elements can be accessed by their position in the list. In a sequential
list, elements are typically stored in contiguous memory locations and can be
accessed directly by their index. The most common operations on a sequential list
include adding an element to the list, removing an element from the list, and
searching for an element in the list. A common implementation of a sequential list
is an array, where each element of the list is stored at a specific index in the array.
It can also be implemented as a linked list, where each element of the list is
represented by a node that contains the element's value and a reference to the
next node in the list. Sequential lists are useful in situations where elements need
to be stored in a specific order, and the order of elements is important.
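A minimal array-based sketch of a sequential list in C++ might look as follows; the structure name, fixed capacity, and sample values are illustrative only.

#include <iostream>
using namespace std;

const int CAPACITY = 100;

struct SequentialList {
    int items[CAPACITY];   // elements kept in contiguous memory
    int length = 0;        // current number of elements
};

// Append an element at the end of the list (if there is room).
bool append(SequentialList& list, int value) {
    if (list.length == CAPACITY) return false;
    list.items[list.length++] = value;
    return true;
}

// Linear search: return the index of the element, or -1 if it is absent.
int find(const SequentialList& list, int value) {
    for (int i = 0; i < list.length; i++)
        if (list.items[i] == value) return i;
    return -1;
}

int main() {
    SequentialList list;
    append(list, 10);
    append(list, 20);
    cout << "20 is at index " << find(list, 20) << "\n";   // prints 1
    return 0;
}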

DIFFERENCES BETWEEN FIXED LENGTH AND VARIABLE LENGTH FIELDS


Fixed-length fields are simpler to work with, as the size of each field is known in advance.
This can make it easier to access and manipulate the data stored in the structure,
and can also make it easier to ensure that the data is stored in a specific format.
However, fixed-length fields can be less efficient in terms of memory usage, as
they can lead to wasted space if the data stored in a field is shorter than the
maximum length of the field. Variable-length fields on the other hand, are more
efficient in terms of memory usage, as they only take up as much space as is
needed to store the actual data. However, they can be more complex to work with,
as the size of each field is not known in advance. This can make it more difficult to

access and manipulate the data stored in the structure, and can also make it more
difficult to ensure that the data is stored in a specific format. It is important to note
that depending on the data structure, different structures will have different
advantages and disadvantages with regard to fixed-length and variable-length
fields.

IMPLEMENTATION OF FIXED LENGTH AND VARIABLE-LENGTH FIELDS


Implementing fixed-length fields in a data structure can be done using
arrays or fixed-size structures, where each field is assigned a fixed size and position
in memory. For example, in a C-style struct, you can define a field with a fixed
length by specifying its data type, such as an int or a char array, and the fixed
number of bytes the field should occupy, such as:
struct myStruct {
    int id;
    char name[10];   // fixed-length field: always occupies 10 bytes
    int age;
};
Here, the name field is of fixed length and will occupy 10 bytes of memory. On the
other hand, implementing variable-length fields in a data structure can be done
using dynamic arrays or linked lists. For example, in a linked list, each node
contains a variable-length field that stores a value and a pointer to the next node
in the list. The size of the field can be determined at runtime based on the actual
data stored in the field.

struct myStruct {
    int id;
    char* name;   // variable-length field: pointer to memory sized at runtime
    int age;
};
Here, the name field is of variable length; the size of this field will be determined
at runtime based on the value assigned to the field. Another way to implement
variable fields is by using pointers; variable-length fields can be implemented by
storing a pointer to a block of memory allocated dynamically, that can be resized
as needed. It is important to note that different data structures handle fixed and
variable length fields differently, and the choice of which one to use will depend
on the specific requirements of the application and the operations that will be
performed on the data.
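To make the variable-length case concrete, here is a small, hedged sketch in which the name field is allocated dynamically at runtime; the sample value and sizes are invented for the example.

#include <cstring>
#include <iostream>
using namespace std;

struct myStruct {
    int   id;
    char* name;   // variable-length field: points to memory sized at runtime
    int   age;
};

int main() {
    const char* input = "Ada";               // the actual data decides the size
    myStruct s;
    s.id = 1;
    s.age = 36;
    s.name = new char[strlen(input) + 1];    // allocate just enough space
    strcpy(s.name, input);

    cout << s.name << "\n";
    delete[] s.name;                          // the caller must release the memory
    return 0;
}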

ORDERED AND LINEAR LIST

An ordered list is a data structure in which elements are arranged in a
specific order, such as ascending or descending. The order of the elements can be
determined by their value, such as numerical or alphabetical. A linear list, on the
other hand, is a type of data structure in which elements are stored in a linear
sequence, with each element being connected to the next element in the list. This
can be represented as a linked list or an array.

OPERATIONS OF ORDERED LIST

1. Append: This operation adds a new element to the end of the ordered list.
2. Search: This operation searches for a specific element in the ordered list. It
can be implemented using various algorithms such as linear search or
binary search.
3. Delete: This operation removes an element from the ordered list. It can be
implemented by searching for the element and then removing it from the
list.
4. Sort: This operation rearranges the elements in the ordered list according
to a specific order, such as ascending or descending. There are various
sorting algorithms that can be used, such as bubble sort, insertion sort, and
quick sort.
5. Merge: This operation combines two ordered lists into a single ordered list.
The resulting list will also be ordered (a short sketch of this operation
follows this list).
6. Multiway merge: This operation combines multiple ordered lists into a
single ordered list. It is an extension of the merge operation.
7. Balance merge: This operation is used to balance the distribution of
elements in an ordered list. It is useful in scenarios where the list is
unbalanced and certain elements are grouped together, making searching
and accessing them slower.
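To illustrate the merge operation (item 5 above), here is a minimal C++ sketch; the helper name mergeOrdered and the sample values are illustrative only.

#include <iostream>
#include <vector>
using namespace std;

// Merge two ordered (ascending) lists into a single ordered list.
vector<int> mergeOrdered(const vector<int>& a, const vector<int>& b) {
    vector<int> result;
    size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] <= b[j]) result.push_back(a[i++]);   // take the smaller head
        else              result.push_back(b[j++]);
    }
    while (i < a.size()) result.push_back(a[i++]);    // copy any leftovers
    while (j < b.size()) result.push_back(b[j++]);
    return result;
}

int main() {
    vector<int> a = {1, 4, 9}, b = {2, 3, 10};
    for (int x : mergeOrdered(a, b)) cout << x << " ";   // 1 2 3 4 9 10
    return 0;
}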

LINKED LIST
A linked list is a data structure that consists of a sequence of elements,
where each element is called a "node". Each node contains two fields: a value field
that stores the element's value, and a next field that stores a reference (or pointer)
to the next node in the list. The first node in the list is called the "head", and the
last node in the list is called the "tail". The tail's next field contains a null reference,
indicating the end of the list. Linked lists can be used to implement various data
structures such as stacks, queues, and hash tables. A singly linked list is a type of
linked list where each node points to its next node in the list, a doubly linked list is
a type of linked list where each node points to both next and previous nodes in the
list. Linked lists are dynamic data structures and can change their size during the
execution of a program, they are useful when you don't know how much data you
will have, or when you need to add/remove elements frequently.
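A minimal C++ sketch of a singly linked node together with an insert-at-head operation is shown below; the names used are illustrative only.

#include <iostream>
using namespace std;

struct Node {
    int   value;   // the element's value
    Node* next;    // pointer to the next node (nullptr marks the tail)
};

// Insert a new node at the head of the list and return the new head.
Node* insertAtHead(Node* head, int value) {
    return new Node{value, head};
}

int main() {
    Node* head = nullptr;              // an empty list
    head = insertAtHead(head, 3);
    head = insertAtHead(head, 2);
    head = insertAtHead(head, 1);

    for (Node* p = head; p != nullptr; p = p->next)
        cout << p->value << " ";       // prints: 1 2 3

    // Free the nodes before exiting.
    while (head != nullptr) { Node* next = head->next; delete head; head = next; }
    return 0;
}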

COMPARISONS BETWEEN LINKED LIST AND LINEAR LIST

The main difference between a linked list and a linear list is how the
elements are stored in memory. In a linear list, elements are stored in contiguous
memory locations, whereas in a linked list, elements are stored in non-contiguous
memory locations. This means that in a linear list, elements can be accessed by
their index position, whereas in a linked list, elements must be accessed by
following the next pointers.

Linked lists have the advantage of being dynamic data structures and can
change their size during the execution of a program, they are useful when you
don't know how much data you will have, or when you need to add/remove
elements frequently. On the other hand, a linear list has the advantage of being
more cache-friendly and allows for constant-time access to elements at a specific
index position.

TYPES OF LINKED LIST
There are several types of linked lists, including:
1. Singly Linked List: In this type of linked list, each node contains a value and
a reference (or pointer) to the next node in the list. The last node in the list
contains a null reference, indicating the end of the list. This type of linked
list is also known as a one-way linked list, as the nodes can only be traversed
in one direction (from head to tail).

2. Doubly Linked List: In this type of linked list, each node contains a value
and references to both the next and previous nodes in the list. This type of
linked list is also known as a two-way linked list, as the nodes can be
traversed in both directions (from head to tail and from tail to head).
3. Circular Linked List: In this type of linked list, the last node in the list
contains a reference to the first node in the list, creating a circular loop. This
type of linked list can be either singly or doubly linked.
4. Multi-Linked List: In this type of linked list, each node contains multiple
references to other nodes in the list. This allows for multiple traversals of
the list and can be used to implement more complex data structures.
5. Skip List: Skip list is a data structure that allows for efficient search,
insertion and deletion of elements. It is a type of linked list where each node
has multiple pointers, each pointing to another node at a different level.
6. XOR Linked List: An XOR linked list is a special type of linked list in which
each node stores the XOR of the addresses of its previous and next nodes.
This utilizes a special property of the XOR operation to minimize memory
usage while still allowing the list to be traversed in both directions.

POINTERS
Pointers are a fundamental concept in computer programming and are used in
many programming languages, including C and C++. They are variables that store
memory addresses of other variables. Here are some of the main uses of pointers:
1. Dynamic memory allocation: Pointers are commonly used to dynamically
allocate memory during the execution of a program. This allows for the
creation of data structures such as linked lists and trees, which can change
their size during runtime.
2. Implementing data structures: Pointers are used to implement various
data structures such as linked lists, trees, and graphs. They are used to link
together the different elements of the data structure, allowing for efficient
traversal and manipulation of the data.
3. Function parameters: Pointers can be used as function parameters to pass
variables by reference rather than by value. This can be useful for situations
where the called function needs to modify the original variable, rather than
a copy of it.
4. Indirection: This means that the pointers can be used to refer to other data
indirectly. This can be useful in situations where the location of data
changes frequently, such as in linked lists or dynamic arrays.

5. Arrays and Strings: Pointers are used to manipulate arrays and strings; they
are used to traverse and access elements at specific indexes.
6. Inter-process communication: Pointers can be used in inter-process
communication, where two or more processes share the same memory
space; this helps in sharing data between processes.
7. Device drivers: Pointers are used in device drivers to interact with
hardware; they provide access to the memory addresses of specific
hardware registers in order to read or write data.
8. Smart pointers: smart pointers are a type of C++ feature that are used to
manage dynamically allocated memory and help to prevent memory leaks
and other issues related to manual memory management. They also
provide additional functionality such as thread-safety and automatic
deallocation of memory.
9. Callback functions: Pointers to functions can be used to create callback
functions, where a function pointer is passed as an argument to another
function, and the function pointer is called at a certain point in the
execution of the second function.
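As a small sketch of item 9 above (a callback through a function pointer), consider the following; the function names are invented for the example.

#include <iostream>
using namespace std;

// The callback type: a pointer to a function taking an int and returning void.
typedef void (*Callback)(int);

void printSquare(int x) { cout << x * x << "\n"; }

// A function that accepts a function pointer and calls it for each value.
void forEachUpTo(int n, Callback cb) {
    for (int i = 1; i <= n; i++)
        cb(i);                         // the callback is invoked here
}

int main() {
    forEachUpTo(3, printSquare);       // prints 1, 4, 9
    return 0;
}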

STORAGE MAPPING

Storage mapping is a technique used in computer science and engineering to map
data elements to physical storage locations in a computer. The goal of storage
mapping is to improve the performance of data access, by reducing the time it
takes to retrieve data from storage.

Types of Storage Mapping
There are several different types of storage mapping techniques, each with their
own strengths and weaknesses. Some of the most common types include:
1. Contiguous Storage Mapping: In this technique, data elements are stored in
contiguous blocks of memory. This allows for efficient access to data, as it can
be accessed sequentially without the need to jump around in memory.
2. Linked Storage Mapping: Data elements are stored in non-contiguous blocks
of memory and linked together using pointers. This allows for efficient access
to data, as it can be accessed directly by following the pointers.
3. Indexed Storage Mapping: In this technique, data elements are stored in non-
contiguous blocks of memory and an index is used to map the data elements
to their physical storage locations. This allows for efficient access to data, as
the index can be used to quickly locate the desired data element.
4. Hashed Storage Mapping: In this technique, data elements are stored in non-
contiguous blocks of memory and a hashed key is used to map the data elements
to their physical storage locations. This allows for efficient access to data, as
the hashed key can be used to quickly locate the desired data element.
The choice of storage mapping technique will depend on the specific requirements
of the application, such as the size of the data, the number of data elements, and
the frequency of data access.

TREE
A tree is a non-linear data structure that consists of nodes connected by
edges usually in form of hierarchy. The topmost node is called the root, and its
children are referred to as its descendants. The nodes without any children are
called leaves. The height of a tree is the number of edges from the root to the
furthest leaf. Each node in a tree has a parent (except the root) and zero or more
children.

Trees are a fundamental data structure that can be used to represent
hierarchical relationships between elements and are commonly used in computer
science for various purposes such as searching, sorting, and organizing data. There
are several different types of trees, each with their own specific characteristics and
uses. Some of the most common types of trees include:

1. Binary Tree: A binary tree is a tree data structure in which each node has at
most two children, which are referred to as the left child and the right child.
Binary trees can be used to implement various data structures such as binary
search trees and heap.
2. AVL Tree: An AVL tree is a self-balancing binary search tree, in which the
difference in heights of left and right subtrees of any node is at most one. AVL
tree is a balanced tree, which performs basic operations like insert, delete and
search in O (log n) time.
3. B-tree: A B-tree is a self-balancing tree data structure that is commonly used in
databases and file systems to store large amounts of data. B-trees have a higher
branching factor than other trees, which allows them to efficiently store and
retrieve large amounts of data.

4. B+ tree: A B+ tree is a variation of a B-tree, where all the data is stored in leaf
nodes, and all the leaf nodes are linked together to form a linked list. This
allows for efficient range queries and sequential access to the data.
5. Trie: A Trie (prefix tree) is a tree-like data structure that is used to store a
collection of strings. Each node in the Trie represents a character in the string,
and the path from the root to the node represents the prefix of the string. Tries
are commonly used for efficient string matching and prefix matching.
6. Heap: A Heap is a special tree-based data structure in which the root-node key
is compared with its children and arranged accordingly, it can be either min
heap or max heap. It is used to implement priority queues and also used in
algorithms such as Dijkstra's shortest path algorithm.
7. Red-Black Tree: A red-black tree is a self-balancing binary search tree, where
each node is coloured either red or black. It is similar to AVL tree, but it has less
strict balancing condition and is simpler to implement.

Properties of tree
A tree has several properties that define its structure and behaviour:
1. Hierarchical structure: A tree has a hierarchical structure where each node
has a parent node and zero or more child nodes.
2. Root node: The root node is the topmost node in the tree and has no
parent.
3. Leaf nodes: Leaf nodes are the nodes without any children.
4. Edge: An edge is the connection between a parent node and a child node.
5. Depth: The depth of a node is the number of edges from the root to that
node.
6. Height: The height of a tree is the number of edges from the root to the
furthest leaf.
7. Path: A path is a sequence of nodes connected by edges from the root to a
leaf.
8. Ancestors and descendants: Ancestors of a node are its parent and
grandparent, and so on. Descendants of a node are its children,
grandchildren, and so on.
9. Subtree: A subtree is a portion of a tree consisting of a node and its
descendants.
10. Degree: The degree of a node is the number of children it has.
These properties help to describe the structure of a tree and its behaviour when
used in various algorithms and data structures.

Binary tree

A binary tree is a type of tree in data structure where each node has at most
two children.
TYPES OF BINARY TREE
There are several types of binary trees, including:
1. Full binary tree: A full binary tree is a binary tree where every node has
either 0 or 2 children.
2. Complete binary tree: A complete binary tree is a binary tree in which all
the levels are completely filled except possibly the last level, and the keys
in the last level are as far left as possible.
3. Perfect binary tree: A perfect binary tree is a tree in which all the internal
nodes have two children and all leaf nodes are at the same level.
4. Balanced binary tree: A balanced binary tree is a tree in which the height of
the two subtrees of any node never differ by more than 1. Examples of
balanced binary trees are AVL trees and Red-Black trees.
5. Skewed binary tree: A skewed binary tree is a binary tree in which one of
the subtrees is much taller than the other. Examples of skewed binary trees
are left-skewed trees and right-skewed trees.
6. Binary search tree: A binary search tree is a tree in which for each node, the
key of all the nodes in its left subtree is less than its own key, and the key
of all the nodes in its right subtree is greater than its own key.
These different types of binary trees have different properties and are used in
various algorithms and data structures for specific purposes.

Representation of binary tree


A binary tree is a type of tree data structure where each node has at most two
children. In a binary tree, each node has the following elements:
i. Data or key: A node in a binary tree stores data or a key value.
ii. Left child: A pointer to the left child of the node.
iii. Right child: A pointer to the right child of the node.
A binary tree can be represented using arrays or linked lists. In an array
representation, the nodes are stored in an array, and the left and right children of
a node can be calculated using its index in the array. In a linked list representation,
each node has pointers to its left and right children. The root node is the first node
in the array representation or the node with a null parent pointer in the linked list
representation. The leaf nodes are the nodes without any children. The height of
a binary tree is the number of edges from the root to the furthest leaf.
The binary tree can be used for various purposes such as searching, sorting,
and organizing data. The different types of binary trees have different properties
and are used in various algorithms and data structures for specific purposes.
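A linked representation of a binary tree can be sketched in C++ as follows; the insert routine follows the binary search tree rule described earlier (smaller keys to the left, larger keys to the right), and the names and sample keys are illustrative only.

#include <iostream>
using namespace std;

struct TreeNode {
    int       key;
    TreeNode* left;    // pointer to the left child
    TreeNode* right;   // pointer to the right child
};

// Insert a key into a binary search tree and return the (possibly new) root.
TreeNode* insert(TreeNode* root, int key) {
    if (root == nullptr) return new TreeNode{key, nullptr, nullptr};
    if (key < root->key) root->left  = insert(root->left, key);
    else                 root->right = insert(root->right, key);
    return root;
}

// In-order traversal visits the keys in ascending order.
void inorder(TreeNode* root) {
    if (root == nullptr) return;
    inorder(root->left);
    cout << root->key << " ";
    inorder(root->right);
}

int main() {
    TreeNode* root = nullptr;
    int keys[] = {8, 3, 10, 1, 6};
    for (int k : keys) root = insert(root, k);
    inorder(root);                     // prints: 1 3 6 8 10
    return 0;                          // nodes are not freed in this short sketch
}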
TIME COMPLEXITY ISSUES
Time complexity issues arise when algorithms take too long to complete a
task or require too much memory. Such issues can be caused by inefficient
algorithms, data structures or implementation code. Time complexity is the amount
of time an algorithm takes to run and complete as a function of the size of the
input. Issues arise when an algorithm takes too long to run for large inputs, causing
the program to become unacceptably slow or even to halt. This can lead to
scalability issues, where the program becomes impractical to use as the size of the
input grows.
time complexity is often a key factor in the design and optimization of algorithms,
and involves trade-offs between time and other resources such as memory usage.
Common techniques for improving time complexity include using more efficient
data structures, reducing the number of operations performed, and parallelizing
computations.
BIG ‘O’ NOTATION
Big ‘O’ notation is a way of describing the upper bound of the growth rate
of the running time of an algorithm, expressed as a function of the size of the input.
In the context of data structures, Big ‘O’ is commonly used to describe the time
complexity of operations such as inserting, deleting, or searching for elements. For
example:
1. O(1) means the operation takes a constant time, regardless of the size of
the input. An example of this would be accessing an element in an array
using an index.
2. O(n) means the operation takes linear time, proportional to the size of the
input. An example of this would be searching for an element in a linked list
by traversing it from start to end.
3. O(log n) means the operation takes logarithmic time, where the growth rate
is logarithmic with respect to the size of the input. An example of this would
be finding an element in a binary search tree.
4. O(n log n) means the operation takes log-linear time, where the growth rate
is proportional to the size of the input times the log of the size of the input.
An example of this would be sorting an array using a divide-and-conquer
algorithm such as merge sort.
5. O(n^2) means the operation takes quadratic time, proportional to the
square of the size of the input. An example of this would be checking all
pairs of elements in an array to find the closest ones.

It's important to note that Big ‘O’ notation provides an upper bound on the growth
rate, and does not take into account factors such as the actual constant factors or
lower order terms.
Analysing algorithms and determining their running times
The running time of an algorithm refers to the amount of computational
resources (such as time or memory) required to run the algorithm as a function of
the size of the input. The order of the running time of an algorithm is often
described using big ‘O’ notation, which provides an upper bound on the growth
rate of the running time as the size of the input increases. For example, a linear
search algorithm has a running time of O(n), where n is the number of elements in
the list, as it needs to examine each element in the list once to find the target
element. On the other hand, a binary search algorithm has a running time of O(log
n), as it reduces the search space by half with each iteration.
Linked list algorithms have different running times depending on the specific
operation being performed. For example, inserting an element at the beginning of
a linked list has a running time of O(1), as it only requires updating a few pointers.
However, inserting an element at a specific position in a linked list has a running
time of O(n), as it may require traversing the entire list to find the correct position.
In conclusion, the running time and the order of running time of algorithms
can be analysed using big ‘O’ notation and depend on the specific operations being
performed and the size of the input.

SORTING
Sorting is the process of arranging a set of elements in a specific order,
either ascending or descending, based on their values. Some of the most
commonly used sorting algorithms include:
1. Bubble sort
2. Insertion sort
3. Selection sort
4. Quick sort
5. Merge sort
6. Heap sort
7. Radix sort
Sorting is an important operation in computer science and is used in various
applications, such as searching, data analysis, and database management. The
choice of sorting algorithm depends on the specific requirements of the problem,

such as the size of the data, the required speed of sorting, and the memory
constraints.

Sorting Techniques
There are several sorting algorithms, each with its own advantages and
disadvantages. Some of the most commonly used sorting techniques include:
1. Bubble sort: Bubble sort is a simple sorting algorithm that repeatedly steps
through the list, compares adjacent elements and swaps them if they are in the
wrong order. The process continues until the list is sorted.
2. Insertion sort: Insertion sort is a simple sorting algorithm that builds the final
sorted array one item at a time. It is much less efficient on large lists than more
advanced algorithms such as quicksort, heapsort, or merge sort.
3. Selection sort: Selection sort is a simple sorting algorithm that works by
repeatedly finding the minimum element from the unsorted part of the list and
swapping it with the first unsorted element.
4. Quick sort: Quick sort is a divide and conquer algorithm that works by selecting
a "pivot" element from the array and partitioning the other elements into two
sub-arrays, according to whether they are less than or greater than the pivot.
The sub-arrays are then sorted recursively.
5. Merge sort: Merge sort is a divide and conquer algorithm that works by dividing
the unsorted list into n sub-lists, each containing one element, and then
repeatedly merging sub-lists to produce new sorted sub-lists until there is only
one sub-list remaining.
6. Heap sort: Heap sort is a comparison-based sorting algorithm that works by
dividing the input into a sorted and an unsorted region, and iteratively
shrinking the unsorted region by extracting the largest element and moving
that to the sorted region.
7. Radix sort: Radix sort is a non-comparative integer sorting algorithm that sorts
data with integer keys by grouping the keys by the individual digits which share
the same significant position and value.
These are some of the most widely used sorting techniques. The choice of sorting
algorithm depends on the specific requirements of the problem, such as the size
of the data, the required speed of sorting, and the memory constraints.

Comparison Based Sorting


Comparison-based sorting is a type of sorting algorithm that sorts elements by
comparing them. The algorithms work by dividing the input data into two parts,
the sorted and the unsorted, and iteratively sorting the unsorted part. The sorting
process continues until all elements are in their correct positions. Comparison-
based sorting algorithms include:

i. Bubble sort
ii. Insertion sort
iii. Selection sort
iv. Quick sort
v. Merge sort
vi. Heap sort
In comparison-based sorting, the elements are compared to each other, and their
positions are swapped if they are in the wrong order. The algorithms use the
comparison operator (<, >, =) to determine the order of elements. Comparison-
based sorting algorithms can be applied to any data type for which a comparison
operator is defined. The performance of comparison-based sorting algorithms is
typically measured in terms of the number of comparisons required to sort the
data. Some algorithms, such as quick sort and merge sort, have a best-case time
complexity of O(n log n), while others, such as selection sort and bubble sort, have
a worst-case time complexity of O(n^2). The choice of sorting algorithm depends
on the specific requirements of the problem, such as the size of the data, the
required speed of sorting, and the memory constraints.
Bubble Sorting Algorithm
Bubble sort is a simple sorting algorithm that works by repeatedly stepping
through the list, comparing adjacent elements and swapping them if they are in
the wrong order. The process continues until the list is sorted.
Here is how bubble sort works:
i. Start at the beginning of the list.
ii. Compare the first two elements of the list.
iii. If the first element is greater than the second, swap their positions.
iv. Move to the next pair of elements and repeat step 3.
v. Repeat steps 3 and 4 for all elements in the list.
If no swaps were made during the pass through the list, the list is sorted. If not,
repeat the entire process again. Bubble sort has a time complexity of O(n^2) in the
worst-case scenario, which means it is not very efficient for large lists. However, it
is easy to understand and implement, making it a good starting point for beginners
learning about sorting algorithms. Bubble sort can be improved by adding a flag
that keeps track of whether any swaps were made during a pass through the list.
If no swaps were made, the list is already sorted, and the algorithm can be stopped.
This optimization reduces the number of passes required for the algorithm, making
it slightly faster.
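A hedged sketch of that flag optimization (a slight variation on the plain bubble sort listed later in this manual) might look like this:

#include <iostream>
using namespace std;

void bubbleSortOptimized(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        bool swapped = false;               // flag: did this pass swap anything?
        for (int j = 0; j < n - i - 1; j++) {
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];          // swap arr[j] and arr[j+1]
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                swapped = true;
            }
        }
        if (!swapped) break;                // no swaps: already sorted, stop early
    }
}

int main() {
    int arr[] = {5, 1, 4, 2, 8};
    int n = sizeof(arr) / sizeof(arr[0]);
    bubbleSortOptimized(arr, n);
    for (int i = 0; i < n; i++) cout << arr[i] << " ";   // 1 2 4 5 8
    return 0;
}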
Selection Sorting Algorithm

Selection sort is a simple sorting algorithm that works by repeatedly finding the
minimum element from the unsorted part of the list and swapping it with the first
unsorted element.
Here is how selection sort works:
i. Start with the first element in the list and consider it the current minimum.
ii. Compare the current minimum with each remaining element in the list.
iii. If a smaller element is found, set it as the new minimum.
iv. After checking all the remaining elements, swap the current minimum with
the first unsorted element.
v. Repeat steps 1 to 4 for all remaining elements in the list.
Selection sort has a time complexity of O(n^2) in the worst-case scenario, which
means it is not very efficient for large lists. However, it is simple to understand and
implement, making it a good starting point for beginners learning about sorting
algorithms. The key to the selection sort algorithm is the concept of dividing the
input data into two parts, the sorted and the unsorted, and iteratively sorting the
unsorted part. The algorithm works by finding the minimum element from the
unsorted part of the list and swapping it with the first unsorted element. The
algorithm continues until all elements are in their correct positions.
Insertion Sorting
Insertion sort is a sorting algorithm that works by taking an input list and
sorting it into an output list. The algorithm works by taking each element of the
input list and inserting it into its proper position in the output list. This is done by
comparing the element to the elements already in the output list. Elements that
are greater than the current element are shifted to the right to make room for the
current element. This process is repeated until the entire input list is sorted.
Insertion sort is an in-place sorting algorithm - it does not require any extra
memory to store the sorted list. It is a simple sorting algorithm which has a time
complexity of O(n^2).
Algorithm for Insertion Sorting
1. Begin by setting the first element of the array as the sorted partition.

2. For each element in the unsorted partition, starting from the second element,
compare it to the elements in the sorted partition.
3. If the element is smaller than the element in the sorted partition, move all
elements greater than it one index to the right.
4. Insert the element into its correct position in the sorted partition.
5. Repeat step 2-4 until all elements in the unsorted partition have been sorted.
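A minimal C++ sketch of the steps above might look as follows; the sample array is the one used in the exercise further below.

#include <iostream>
using namespace std;

void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {          // arr[0..i-1] is the sorted partition
        int key = arr[i];                  // the element to insert
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {   // shift larger elements one index right
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;                  // insert into its correct position
    }
}

int main() {
    int arr[] = {8, 5, 2, 9, 7, 1, 3};
    int n = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, n);
    for (int i = 0; i < n; i++) cout << arr[i] << " ";   // 1 2 3 5 7 8 9
    return 0;
}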

Differences between Linear and Binary Search Algorithm Techniques
Linear Search: Linear search is a very simple search algorithm. It sequentially
checks each element of the list until a match is found or the whole list has been
searched. Linear search is also known as sequential search. Binary search, on the
other hand, is a much more efficient searching algorithm than linear search. It works by
repeatedly dividing the search interval in half. The idea is to narrow down the
search by eliminating the half of the elements that cannot be the solution. Binary
search works only on a sorted list of elements. The list should be in either an
ascending or descending order.
Apply a sorting algorithm to sort the following array of numbers.
[8, 5, 2, 9, 7, 1, 3]
Answer:
[1, 2, 3, 5, 7, 8, 9].

Different Sorting and Searching in C/C++ Programming Language


Here are code snippets for various sorting and searching algorithms in C/C++:

SORTING:
i. Bubble Sort:

#include <iostream>
using namespace std;

// Repeatedly compare adjacent elements and swap them if they are out of order.
void bubbleSort(int arr[], int n) {
    for (int i = 0; i < n-1; i++) {
        for (int j = 0; j < n-i-1; j++) {
            if (arr[j] > arr[j+1]) {
                int temp = arr[j];      // swap arr[j] and arr[j+1]
                arr[j] = arr[j+1];
                arr[j+1] = temp;
            }
        }
    }
}

int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);

    bubbleSort(arr, n);

    cout << "Sorted array: \n";
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}

ii. Selection Sort:

#include <iostream>
using namespace std;

// Repeatedly find the minimum of the unsorted part and swap it into place.
void selectionSort(int arr[], int n) {
    int i, j, minIndex;
    for (i = 0; i < n-1; i++) {
        minIndex = i;
        for (j = i+1; j < n; j++) {
            if (arr[j] < arr[minIndex])
                minIndex = j;
        }
        int temp = arr[minIndex];       // swap the minimum into position i
        arr[minIndex] = arr[i];
        arr[i] = temp;
    }
}

int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);

    selectionSort(arr, n);

    cout << "Sorted array: \n";
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}

Searching:
Linear Search:
#include <iostream>
using namespace std;

// Check each element in turn; return its index, or -1 if it is not present.
int linearSearch(int arr[], int n, int x) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == x)
            return i;
    }
    return -1;
}

int main() {
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);
    int x = 25;

    int result = linearSearch(arr, n, x);
    if (result == -1)
        cout << "Element not found\n";
    else
        cout << "Element found at index: " << result << endl;
    return 0;
}
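For comparison with the linear search above, here is a hedged binary search sketch; note that, as discussed earlier, it assumes the array is already sorted.

#include <iostream>
using namespace std;

// Repeatedly halve the search interval; return the index of x, or -1 if absent.
int binarySearch(int arr[], int n, int x) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // middle of the current interval
        if (arr[mid] == x)
            return mid;                     // found
        else if (arr[mid] < x)
            low = mid + 1;                  // search the right half
        else
            high = mid - 1;                 // search the left half
    }
    return -1;                              // not found
}

int main() {
    int arr[] = {11, 12, 22, 25, 34, 64, 90};   // sorted array
    int n = sizeof(arr)/sizeof(arr[0]);
    int x = 25;

    int result = binarySearch(arr, n, x);
    if (result == -1)
        cout << "Element not found\n";
    else
        cout << "Element found at index: " << result << endl;
    return 0;
}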
