NF-Data Structures Using Python

1 Basics of Python

Objectives
- Installing and using the Python interpreter
- Declaring and using Python variables
- Understanding and using Python operators
- Making use of Python selection and iterative statements
- Working with strings and string-handling functions

1.1 Introduction
Python is a popular programming language. It was created by Guido van Rossum and released
in 1991. It is an object-oriented, high-level programming language with dynamic semantics. Python
code is executed by an interpreter. Python's simple, easy-to-learn syntax emphasizes
readability and therefore reduces the cost of program maintenance. Python supports modules and
packages, which encourages program modularity and code reuse. The Python interpreter and the
extensive standard library are available in source or binary form without charge for all major
platforms, and can be freely distributed. Python supports various kinds of programming, including:
• Web programming
• Software Development
• Mathematical Computations
• Machine learning

Python is widely accepted by the programming community for the following reasons:

• Python interpreters are available for different platforms (Windows, Mac, Linux,
Raspberry Pi, etc.), so code written once works on all of them.
• The syntax of Python is similar to the English language, which makes it easy to learn.
• Python has a robust built-in library that allows developers to write programs with fewer
lines than some other programming languages.
• Python runs on an interpreter system, meaning that code can be executed as soon as it is
written. This makes it easy to write and test code, and it also helps in prototyping.
• Python code can be written using a procedural, an object-oriented or a functional
approach. A combination of these approaches can also be used.
• It is also considered one of the best programming languages for machine learning.

In this unit you start with the installation of Python, followed by writing and executing a Python script.
The latter part covers basic programming constructs such as data types, control structures and
iterative statements of Python.

1.1.1 Python Installation

For Windows operating system, the installation process is as follows:

• To install Python, first go to the Download Python page on the
official site, python.org/download, and click on the latest version.

Figure 1-1: Python Download Page

Once the Python distribution has been downloaded, double-click the downloaded
executable and then click Run; this will start the installation.

Note: To install the latest version of Python (Python 3.9), you need at least Windows 8.1.

• Ensure that you have checked the checkboxes for ‘Install launcher for all users
(recommended)’ and for ‘Add Python 3.9 to PATH’ at the bottom.
• It is recommended to choose the custom installation so that Python is installed for all users.
• After clicking Choose
• The next dialog will prompt you to select whether to Disable path length limit. Choosing
this option will allow Python to bypass the 260-character MAX_PATH limit. Effectively, it
will enable Python to use long path names.
• This will complete the installation.

1.1.2 Writing and executing first python script


If Python is installed successfully, you are ready to write your first script. This first script can be
written at the Python command line or in an editor. Let’s check whether Python is
installed and running properly by using the command line to check the installed version.
The command and its response are shown below.

If you get the message 'python' is not recognized as an internal or external command, it
means that either Python is not installed properly or the python command is not in the path. You are
requested to troubleshoot the problem by using Google or with the help of your instructor.

The above figure also shows the Python command prompt, which is used to execute Python scripts in
interactive mode. Now you are ready to learn the versatile programming language Python.
Let us run a script to print ‘Hello World’. This snapshot also includes one more statement, which
adds two numbers (3 and 4). More about syntax, data types, operators and many other topics is
covered in the book as we proceed further.
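
The two statements described above can be sketched as a short script. This is only an illustration of what they do; the variable names greeting and total are chosen here and are not taken from the snapshot.

```python
# Print a greeting, then add two numbers (3 and 4).
greeting = 'Hello World'
print(greeting)

total = 3 + 4
print(total)   # prints 7
```

At the interactive prompt the same statements can be typed one at a time, and the result of 3 + 4 is echoed without calling print().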

1.1.3 Using python editors to write and execute python scripts


One of the most important skills you need to build is the ability to run Python scripts and code.
You can execute Python scripts using any of the following mechanisms:
• The operating system command line or terminal
• The Python interactive mode
• Any IDE (IDLE, PyCharm, Spyder, etc.)

We have already learnt how to use the Python interactive mode at the command line in the previous
section. Let’s now learn how to use the basic editor IDLE (Integrated Development and Learning
Environment), which is installed along with Python on Windows and Mac. On Linux, it may
need to be installed separately.

IDLE supports execution of Python in two ways:

1. Using Interactive Interpreter
2. Using File Editor
The interactive interpreter is also called the Python shell. The Python shell is an excellent place to
experiment with small code snippets. You can access it through the terminal or command line
app on your machine. You can simplify your workflow with Python IDLE, which
immediately starts a Python shell when you open it. IDLE thus offers an interactive mode and a file mode.
This self-learning material (SLM) uses Python 3.9, and also Python 3.7, to execute scripts. The snapshot after
launching IDLE is shown below.

The interactive interpreter, also known as a shell, is the best place to experiment with Python code.
The shell is a basic Read-Eval-Print Loop (REPL). It reads a Python statement, evaluates
the result of that statement, and then prints the result on the screen. Then, it loops back to read
the next statement.

The file editor helps to edit and save scripts in files. Python scripts are saved with the .py
extension. Files are used to save Python scripts that can be used or edited later on as needed.

1.2 Keywords and Identifiers
Any general-purpose programming language processes data consisting of numbers,
characters and strings, and the result of the processing is useful information called output. To achieve
the desired result, the programmer defines variables, using keywords where needed, and gives
names to those variables; these names are called identifiers. The data is processed with the help of instructions.
These variables and instructions must conform to the rules and semantics of the language. The
grammar includes keywords, rules to define variables, writing scripts using control structures, and
input/output statements.

1.2.1 Keywords
Keywords are reserved words that are understood by the interpreter. They have predefined
meanings which cannot be changed by a programmer. Keywords act as building blocks for
program statements. While using keywords one must remember that they are case
sensitive: all keywords are written in lowercase letters except True, False and None.
We cannot use a keyword as a variable name, function name or any other identifier.
The list of keywords in Python is given in the table below.

False    await      else      import     pass
None     break      except    in         raise
True     class      finally   is         return
and      continue   for       lambda     try
as       def        from      nonlocal   while
assert   del        global    not        with
async    elif       if        or         yield

To get the list of keywords, use the command help("keywords") in
interactive mode. To get help for a specific keyword, use help('specific_keyword').
An example is shown below.
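
Besides help(), the standard library's keyword module exposes the same list to a program. The sketch below is one way to inspect it; the exact number of keywords depends on the Python version, so no count is assumed here.

```python
import keyword

# kwlist holds every reserved word of the running interpreter as a string.
all_keywords = keyword.kwlist
print(len(all_keywords))
print(all_keywords)

# iskeyword() tests a single name.
print(keyword.iskeyword('while'))    # True
print(keyword.iskeyword('number'))   # False
```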
1.2.2 Identifiers
An identifier is the name given to an entity such as a variable, function or class. Identifiers
are created to give a unique name to an entity so that it can be identified during the execution
of the program; this helps to differentiate one entity from another. The rules for naming
identifiers are:
• An identifier can contain only alphanumeric characters (a-z, A-Z, 0-9), i.e. letters and digits,
and the underscore ( _ ) symbol.
• Identifier names must be unique.
• Both uppercase and lowercase letters are permitted.
• The first character must be a letter or an underscore.
• Keywords (reserved words) cannot be used as identifiers.
• Identifiers must not contain white space.

Following are valid identifiers:

number    address    roll_no          emp_no
num_1     num_123    date_of_birth    num_123_second

Following are invalid identifiers:

1972_amit    The first character must be a letter or an underscore.
emp no       A blank character cannot be used in an identifier.
pass         A reserved word cannot be used as an identifier name.
while        A reserved word cannot be used as an identifier name.
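
The rules above can be checked programmatically. The following is a minimal sketch that combines the built-in str.isidentifier() method with keyword.iskeyword(); the helper name is_valid_identifier is hypothetical and introduced only for this illustration. Note that isidentifier() also accepts non-English letters, which Python 3 permits in names.

```python
import keyword

def is_valid_identifier(name):
    """Return True if name can be used as a Python identifier."""
    # isidentifier() checks the character rules; iskeyword() rejects reserved words.
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ['roll_no', 'num_123', '1972_amit', 'emp no', 'pass', 'while']
results = {name: is_valid_identifier(name) for name in candidates}
for name, ok in results.items():
    print(f'{name!r}: {"valid" if ok else "invalid"}')
```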

Python is a case-sensitive language. This means number1 and Number1 are not the same. It is
recommended that identifier names make sense. Even though n = 10 uses a
valid name, writing number = 10 would make more sense, and it would be easier to figure out
what it represents when you look at your code after a long gap. If a variable name contains
multiple words, the words can be separated using an underscore, like
this_is_a_long_variable.

1.3 Python Statements


Instructions that a Python interpreter can execute are called statements. An assignment
such as num = 30 is a simple statement, and an arithmetic, relational or logical
expression written on its own is an expression statement. Control structures like the if statement, for
statement, while statement, etc. are compound statements. Note that a newline character is
treated as the statement terminator, whereas a semicolon is treated as a statement
separator rather than a terminator.

Multiline statements
In Python, the end of a statement is marked by a newline character. But we can make a statement
extend over multiple lines with the line continuation character (\). For example:

sum = 11 + 22 + 33 + \
      44 + 55 + 66 + \
      77 + 88 + 99

This is an explicit line continuation. In Python, line continuation is implied inside parentheses ( ),
brackets [ ], and braces { }. For instance, we can implement the above multi-line statement as:

sum = (11 + 22 + 33 +
       44 + 55 + 66 +
       77 + 88 + 99)

Here, the surrounding parentheses ( ) do the line continuation implicitly. Same is the case with [ ]
and { }. For example:

fruits = ['Apple',
          'Banana',
          'Grapes']

We can also put multiple statements in a single line using semicolons, as follows:

num1 = 1; num2 = 2; num3 = 3

1.4 Python Indentation


Most programming languages, such as C, C++, and Java, use braces { } to define a block of
code. Python, however, uses indentation. A code block (body of a function, loop, etc.) starts with
indentation and ends with the first unindented line. The amount of indentation is up to you, but it
must be consistent throughout that block. Generally, four spaces are used for indentation
and are preferred over tabs. Here is an example.

for i in range(1, 11):
    print(i)
    if i == 5:
        break

The enforcement of indentation in Python makes the code look neat and clean. This results in
Python programs that look similar and consistent. Indentation can be ignored in line continuation,
but it's always a good idea to indent. It makes the code more readable. For example:

if True:
    print('Hello')
    a = 5

and

if True: print('Hello'); a = 5

both are valid and do the same thing, but the former style is clearer. Remember that incorrect
indentation will result in IndentationError.
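
To see the error without breaking a script, the faulty code can be kept in a string and handed to the built-in compile() function. This is only a sketch for demonstration; the names bad_source and caught are illustrative.

```python
# Source text whose if-body is not indented. It is held in a string so that
# this example itself remains valid Python.
bad_source = "if True:\nprint('Hello')\n"

try:
    compile(bad_source, '<example>', 'exec')
    caught = False
except IndentationError as err:
    caught = True
    print('IndentationError:', err.msg)
```

The exact wording of the message varies between Python versions, but the exception type is always IndentationError.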

1.5 Python Comments


Comments are very important while writing a program. They describe what is going on inside a
program, so that a person looking at the source code does not have a hard time figuring it out.
You might forget the key details of a program you wrote just a month ago. So taking the
time to explain these concepts in the form of comments is always fruitful.

In Python, we use the hash (#) symbol to start writing a comment. It extends up to the newline
character. Comments are written by programmers for programmers, to help them better understand a program.
The Python interpreter ignores comments when it executes the program.

#This is a comment
#following statement print Hello to console
print('Hello')

Multi-line comments
We can have comments that extend over multiple lines. One way is to use
the hash (#) symbol at the beginning of each line. For example:

# This is a long comment
# it is required to put on multiple lines
# and it extends
# to multiple lines

Another way of doing this is to use triple quotes, either ''' or """. These triple quotes are generally
used for multi-line strings, but they can be used as multi-line comments as well. Unless they are
docstrings, they do not generate any extra code. Following are the two ways to write
such comments in a Python program.
'''this is an
example of
multiline comment'''

OR

"""this is also a
perfect example of
multiline comment"""

Docstrings in Python
Python docstrings (short for documentation strings) are the
string literals that appear right after the definition of a function, method, class, or module. Triple
quotes are used while writing docstrings. For example:

def double(num):
    """Function to double the value"""
    return 2*num

Docstrings appear right after the definition of a function, class, or module. This separates
docstrings from multi-line comments written using triple quotes. A docstring is associated with its
object as the __doc__ attribute. So, we can access the docstring of the above function with the
following lines of code:

def double(num):
    """Function to double the value"""
    return 2*num

print(double.__doc__)

Output of the above code snippet is


Function to double the value

1.6 Python Data Types


The type of data value stored in an identifier or variable is called its data type. Every value in
Python that is stored in a variable has a data type. Since everything is an object in Python
programming, data types are actually classes and variables are instances (objects) of these classes.
Python has standard data types, which define the operations possible on them and
the storage method for each of them. Python supports seven standard data types, which are shown below.
Figure 1-2: Data Types in Python (Numbers: Integer, Float, Complex; Dictionary; Boolean; Set; Sequence Types: String, List, Tuple)

1. Number: The numeric data type represents data that has a numeric value and is used to
perform mathematical operations. A numeric value can be an integer, a
floating-point number or a complex number. These values are defined by the int, float and
complex classes in Python.
▪ Integers – This value is represented by the int class. It contains positive or
negative whole numbers (without fraction or decimal). In Python there is no
limit to how long an integer value can be.
▪ Float – This value is represented by the float class. It is a real number with
floating-point representation. It is specified by a decimal point. Optionally, the
character e or E followed by a positive or negative integer may be appended to
specify scientific notation.
▪ Complex Numbers – A complex number is represented by the complex class. It is
specified as (real part) + (imaginary part)j. For example – 2+3j

2. Dictionary: It represents a collection of data that associates a unique key with each
value. We can say it is a collection where each item has a key, and that key is used
to access the value associated with it.
3. Boolean: It represents a type whose variables can take one of two values, True
or False.
4. Set: A set is an unordered collection that is iterable, mutable and has no
duplicate elements. The order of elements in a set is undefined, and a set may contain
elements of various types.
5. Sequence Types: A sequence is an ordered collection of items of similar or different data types.
Sequences allow multiple values to be stored in an organized and efficient fashion. There
are several sequence types in Python:
▪ String: A string is a sequence of Unicode characters. It is a
collection of one or more characters put in single quotes, double quotes or triple
quotes. A string may include special symbols and alphanumeric characters. In Python
there is no character data type; a character is a string of length one.
▪ List: A list is like the arrays declared in other languages: it is an ordered
collection of data. Lists in Python are very flexible, as the items in a list do not
need to be of the same type.
▪ Tuple: A tuple is like a list; it is also an ordered collection of objects. The only
difference between a tuple and a list is that tuples are immutable, i.e. a tuple cannot be
modified after it is created.
In this unit we will study the number and string types. Other types are covered later on in
the second unit.
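
As a quick illustration of the types listed above, the sketch below creates one value of each kind and asks Python for its class name. The sample values are arbitrary and chosen only for this example.

```python
samples = [
    7,                  # int
    7.5,                # float
    2 + 3j,             # complex
    {'roll_no': 1},     # dict
    True,               # bool
    {1, 2, 3},          # set
    'Python',           # str
    [1, 'two', 3.0],    # list (items may differ in type)
    (1, 2, 3),          # tuple
]

# type(value).__name__ gives the name of the class the value belongs to.
type_names = [type(value).__name__ for value in samples]
for value, name in zip(samples, type_names):
    print(name, ':', value)
```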

1.7 Number Data Type


As stated earlier, Python supports the number data type to deal with numbers. Python supports three
types of numbers, viz. integer, floating point and complex. Integers and floating-point numbers are
distinguished by the presence or absence of a decimal point. For instance, 7 is an integer whereas 7.0
is a floating-point number. Complex numbers are written in the form a + bj, where a is the real
part and b is the imaginary part.

We can use the type() function to know which class a variable or a value belongs to and
isinstance() function to check if it belongs to a particular class.

Let's look at an example:


# number demo example
a = 7
print("a =", a)
print("type of a is:", type(a))

print('type(7.0) :', type(7.0))

# creating a complex variable
c = 7 + 3j
print('c =', c)
print('c + 3 =', c + 3)
print('c + 2j =', c + 2j)
print("Is c of complex type? :", isinstance(c, complex))

If we run the above code we get the following output:

a = 7
type of a is: <class 'int'>
type(7.0) : <class 'float'>
c = (7+3j)
c + 3 = (10+3j)
c + 2j = (7+5j)
Is c of complex type? : True

In Python, we can also write integers in the binary, octal and hexadecimal number systems by
placing the appropriate prefix before the number. The following table lists these prefixes.

Number System    Prefix          Example      Decimal Value
Binary           '0b' or '0B'    0b1101011    107
Octal            '0o' or '0O'    0o15         13
Hexadecimal      '0x' or '0X'    0xFd         253
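
The table entries can be verified directly. The sketch below uses the prefixed literals from the table, the built-in functions bin(), oct() and hex() for the reverse direction, and int() with an explicit base for strings; the variable names are illustrative.

```python
# Prefixed literals are ordinary integers.
b = 0b1101011
o = 0o15
h = 0xFd
print(b, o, h)                        # 107 13 253

# bin(), oct() and hex() return the prefixed string form of an integer.
print(bin(107), oct(13), hex(253))    # 0b1101011 0o15 0xfd

# int() with a base converts a string of digits in that base.
from_strings = (int('1101011', 2), int('15', 8), int('fd', 16))
print(from_strings)                   # (107, 13, 253)
```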

1.8 Type conversion


We can convert one type of number into another. This is also known as coercion. Operations like
addition, subtraction coerce integer to float implicitly (automatically), if one of the operands is
float.

>>>5 + 2.0
7.0

We can see above that 5 (integer) is coerced into 5.0 (float) for addition and the result is also a
floating-point number. We can also use built-in functions like int(), float() and complex() to
convert between types explicitly. These functions can even convert from strings.

>>>int(12.30) #float to int


12
>>>int(-12.30) #float to int
-12
>>>float(5) #int to float
5.0
>>>float('5') #string to float
5.0
>>> complex('3+8j') # string to complex
(3+8j)

When converting from float to integer, number gets truncated (decimal parts are removed) and
the sign remains the same.
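
Truncation is not the same as rounding down, which matters for negative numbers. The following sketch compares int() with math.floor() from the standard library; the sample values are arbitrary.

```python
import math

values = [12.30, -12.30, 12.99, -12.99]

# int() drops the fractional part (moves towards zero).
truncated = [int(v) for v in values]
# math.floor() always moves towards negative infinity.
floored = [math.floor(v) for v in values]

print(truncated)   # [12, -12, 12, -12]
print(floored)     # [12, -13, 12, -13]
```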

1.9 User input using keyboard


Developers often need to interact with users, either to get data or to provide some sort
of result. Most programs today use a dialog box to ask the user to provide some
type of input, while Python provides a built-in function to read input from the keyboard.

input(prompt)

This function displays the optional prompt, waits for the user to type a line and press Enter,
and returns that line as a string. In Python 3 the input is not evaluated, so the function does
not identify whether the user entered a number or a list; the text must be converted
explicitly when another type is needed. For
example

str1 = input('Enter a String ')
print(str1)

Working of the input function in Python:
• When the input() function executes, program flow stops until the user has given an
input.
• The prompt, i.e. the text or message displayed on the screen to ask the user to enter a value,
is optional.
• Whatever you enter as input, the input function converts it into a string. Even if you enter an
integer value, input() returns it as a string. You need to explicitly convert it into an
integer in your code using typecasting.

If we want to convert the string to another type (int or float), we can use the conversion
functions as below:

intNum = int(str1)
floatNum = float(str1)
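
If the text typed by the user is not a valid number, int() and float() raise a ValueError. One possible way to guard against this is sketched below; the helper name to_number is hypothetical and not part of Python.

```python
def to_number(text):
    """Convert text to int if possible, else to float; return None on failure."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue   # try the next conversion
    return None

print(to_number('42'))      # 42
print(to_number('3.5'))     # 3.5
print(to_number('hello'))   # None
```

In a real script the argument would typically come from the keyboard, for example to_number(input('Enter a number: ')).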

1.10 Operators in Python


In every programming language we need to write computational statements, which carry out some
mathematical computation. These mathematical statements are called expressions; they
contain operators and operands. To understand this in more detail, consider the following examples:
(a) num = 12
(b) a = b + 3
(c) sum = num1 + num2 + num3
(d) avg = sum / 3
In the above examples, =, + and / are the operators, and num, a, b, sum, num1, num2, num3, avg and
even the constant 3 are called operands. Operators are used to perform certain operations on
operands. The expression num = 12 is an assignment
operation and b + 3 is an arithmetic addition. There are several types of operators supported
in Python.
• With the help of operators, individual constants and variables can be joined to form
expressions.
• An expression may contain operators, functions, constants and variables.
The operators in python are categorized as follows.
1. Arithmetic operators
2. Assignment operators
3. Relational operators
4. Logical operators
5. Bitwise operators
6. Conditional operator (ternary operator)
7. Increment/decrement (Python has no ++ or -- operators; the shorthand assignments += 1 and -= 1 are used instead)
8. Membership operators
9. Special operators

Arithmetic operators

Arithmetic operators are used to perform mathematical operations such as addition, subtraction,
multiplication and division on numerical values (variables and constants). Table 1-1 shows the
types of arithmetic operators.
Table 1-1: Python Arithmetic Operators

Operator   Description                                                               Syntax
+          Addition: adds two operands                                               a + b
-          Subtraction: subtracts two operands                                       a - b
*          Multiplication: multiplies two operands                                   a * b
/          Division (float): divides the first operand by the second                 a / b
//         Division (floor): divides the first operand by the second                 a // b
%          Modulus: returns the remainder when first operand is divided by second    a % b
**         Power: returns first raised to power second                               a ** b

Following code demonstrates the result of using arithmetic operators.

# Examples of Arithmetic Operators
a = 9
b = 4
print('a =', a, ', b =', b)

# Addition of numbers
add = a + b
print('Addition (a + b):', add)

# Subtraction of numbers
sub = a - b
print('Subtraction (a - b):', sub)

# Multiplication of numbers
mul = a * b
print('Multiplication (a * b):', mul)

# Division (float) of numbers
div1 = a / b
print('Division (a / b):', div1)

# Modulo of both numbers
mod = a % b
print('Modulo/remainder (a % b):', mod)

# Power
p = a ** b
print('Power (a ** b):', p)

# Division (floor) of numbers
div2 = a // b
print('Floor division (a // b):', div2)

# If any one operand is negative
a = -9
div3 = a // b
print('Floor division (one operand negative, a = -9) (a // b):', div3)

Output of the above code is:

a = 9 , b = 4
Addition (a + b): 13
Subtraction (a - b): 5
Multiplication (a * b): 36
Division (a / b): 2.25
Modulo/remainder (a % b): 1
Power (a ** b): 6561
Floor division (a // b): 2
Floor division (one operand negative, a = -9) (a // b): -3
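
The result -3 above follows from the fact that floor division rounds towards negative infinity, and that the remainder always satisfies a == (a // b) * b + (a % b). The sketch below checks this relationship for all sign combinations using the built-in divmod() function; the list name pairs is illustrative.

```python
pairs = [(9, 4), (-9, 4), (9, -4), (-9, -4)]
results = []
for a, b in pairs:
    q, r = divmod(a, b)          # same as (a // b, a % b)
    assert q == a // b and r == a % b
    assert a == q * b + r        # quotient and remainder rebuild a
    results.append((q, r))
    print(f'{a} // {b} = {q}, {a} % {b} = {r}')
```

In Python the remainder takes the sign of the second operand, which is why -9 % 4 is 3 rather than -1.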
Relational Operators

Relational operators are used to compare two values, which may be variables, constants or
expressions. All the relational operators have lower precedence than the arithmetic operators.
The following table shows the list of relational operators in Python, with examples
for a = 10 and b = 20.
Table 1-2: Relational Operators

Operator   Description                                                            Example
>          True if the value of the left operand is greater than the value        (a > b) is not true.
           of the right operand.
<          True if the value of the left operand is less than the value of        (a < b) is true.
           the right operand.
>=         True if the value of the left operand is greater than or equal to      (a >= b) is not true.
           the value of the right operand.
<=         True if the value of the left operand is less than or equal to         (a <= b) is true.
           the value of the right operand.
==         True if the values of the two operands are equal.                      (a == b) is not true.
!=         True if the values of the two operands are not equal.                  (a != b) is true.

# Examples of Relational Operators
a = 25
b = 45

# print values of a and b
print('a =', a, ', b =', b)

# a > b is False
print('a > b :', a > b)

# a < b is True
print('a < b :', a < b)

# a == b is False
print('a == b :', a == b)

# a != b is True
print('a != b :', a != b)

# a >= b is False
print('a >= b :', a >= b)

# a <= b is True
print('a <= b :', a <= b)

Output of the above code is

a = 25 , b = 45
a > b : False
a < b : True
a == b : False
a != b : True
a >= b : False
a <= b : True

Logical Operators:

These operators are used to perform logical operations on the given expressions. Logical
operators are used to combine two or more conditions. There are three logical operators in
Python: logical AND (and), logical OR (or) and logical NOT (not). The following
table shows the logical operators in Python.

Table 1-3: Logical Operators

Operator   Description                                           Syntax
and        Logical AND: True if both the operands are true       x and y
or         Logical OR: True if either of the operands is true    x or y
not        Logical NOT: True if operand is false                 not x


# Examples of Logical Operators
a = True
b = False

# print values of a and b
print('a =', a, ', b =', b)

# a and b is False
print('a and b:', a and b)

# a or b is True
print('a or b:', a or b)

# not a is False
print('not a :', not a)

Output of the above code is

a = True , b = False
a and b: False
a or b: True
not a : False
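
In practice, logical operators usually combine relational conditions rather than bare True/False values. The sketch below shows this, together with the short-circuit behaviour of and (the right operand is skipped when the left one is already false). All variable names and values are illustrative.

```python
age = 25
has_id = True

is_adult = age >= 18 and has_id              # both conditions hold
is_minor_or_senior = age < 18 or age > 60    # neither condition holds
is_not_adult = not is_adult

print(is_adult, is_minor_or_senior, is_not_adult)   # True False False

# Short circuit: because b != 0 is False, a / b is never evaluated,
# so no ZeroDivisionError is raised.
a, b = 10, 0
safe_ratio_check = b != 0 and a / b > 1
print(safe_ratio_check)                      # False
```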

Bitwise Operators
These are special operators that act on integer types only. They allow the programmer to get
closer to the machine level by operating at bit-level in their arguments.
Table 1-4: Bitwise Operators

Operator   Description           Syntax
&          Bitwise AND           a & b
|          Bitwise OR            a | b
~          Bitwise NOT           ~a
^          Bitwise XOR           a ^ b
>>         Bitwise right shift   a >> n
<<         Bitwise left shift    a << n

Consider an integer stored in 8 bits. This means it is made up of 8 distinct bits or
binary digits, normally designated as illustrated below, with Bit 0 being the Least Significant Bit
(LSB) and Bit 7 being the Most Significant Bit (MSB). The value represented below is 13 in
decimal.

Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
0       0       0       0       1       1       0       1

Python integers are not limited to a fixed width such as 32 bits; they can grow to any size, and
negative numbers behave as if stored in two's complement form with the sign bit extended
indefinitely to the left. The following are illustrations of using bitwise operators for small
numbers which can fit into 8 bits.

Bitwise AND (&)

If both bits in the same bit position are set, then the resultant bit in that position is set;
otherwise it is zero.
For example:
  1011 0010 (178)
& 0011 1111 (63)
= 0011 0010 (50)

Bitwise OR ( | )

If either bit in corresponding positions is set, the resultant bit in that position is set.
For example:
  1011 0010 (178)
| 0000 1000 (8)
= 1011 1010 (186)

Bitwise XOR (^)

If the bits in corresponding positions are different, then the resultant bit is set.
For example:
  1011 0010 (178)
^ 0011 1100 (60)
= 1000 1110 (142)

Shift Operators, << and >>

These move all bits in the operand left or right by a specified number of places.

Syntax : variable << number of places
         variable >> number of places

For example:
2 << 2 = 8
i.e.
0000 0010 becomes 0000 1000

NB : shift left by one place multiplies by 2
     shift right by one place divides by 2

Bitwise NOT / Ones Complement (~)

It reverses the state of each bit: every 1 is changed to 0 and every 0 is
changed to 1. As this affects the sign bit too, the sign of the number also changes. For any
integer a, ~a equals -(a + 1), so a non-negative number becomes negative and a negative
number becomes non-negative.

For example: 1101 0011 becomes 0010 1100

Example code is provided below.

# Examples of Bitwise operators
a = 10
b = 4

# print values of a and b
print('a =', a, 'b =', b)

# Print bitwise AND operation
print('a & b :', a & b)

# Print bitwise OR operation
print('a | b:', a | b)

# Print bitwise NOT operation
print('~a:', ~a)

# Print bitwise XOR operation
print('a ^ b :', a ^ b)

# Print bitwise right shift operation
print('a >> 2:', a >> 2)

# Print bitwise left shift operation
print('a << 2:', a << 2)

Output of the above code is

a = 10 b = 4
a & b : 0
a | b: 14
~a: -11
a ^ b : 14
a >> 2: 2
a << 2: 40
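
The 8-bit worked examples from this section can be reproduced in code. The sketch below writes the operands as binary literals and uses a small helper, bits, to print the low 8 bits of a result; the helper and variable names are hypothetical and introduced only for this illustration.

```python
def bits(n):
    """Return the low 8 bits of n as a string of binary digits."""
    return format(n & 0xFF, '08b')

x = 0b10110010                      # 178

and_result = x & 0b00111111
or_result = x | 0b00001000
xor_result = x ^ 0b00111100

print(bits(and_result), and_result)   # 00110010 50
print(bits(or_result), or_result)     # 10111010 186
print(bits(xor_result), xor_result)   # 10001110 142

# Left shift by two places multiplies by 4.
shifted = 2 << 2
print(bits(shifted), shifted)         # 00001000 8

# ~n flips every bit; masked to 8 bits, 1101 0011 becomes 0010 1100.
flipped = bits(~0b11010011)
print(flipped)                        # 00101100
print(~10 == -(10 + 1))               # True
```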

Assignment Operators

The operator which is used to assign the value of an expression to a
variable is called an assignment operator. The left-side operand of the assignment operator
is a variable and the right-side operand is a value or an expression. Python is dynamically
typed, so the value on the right side does not have to be of the same data type (number, string)
as the value previously held by the variable; the variable simply refers to the new value.

Table 1-5: Short- hand Assignment Operators

Operator   Description                                                               Syntax
=          Assign value of right side of expression to left side operand            x = y + z
+=         Add AND: add right operand to left operand and then assign to            a += b
           left operand
-=         Subtract AND: subtract right operand from left operand and then          a -= b
           assign to left operand
*=         Multiply AND: multiply right operand with left operand and then          a *= b
           assign to left operand
/=         Divide AND: divide left operand by right operand and then assign         a /= b
           to left operand
%=         Modulus AND: take modulus using left and right operands and assign       a %= b
           result to left operand
//=        Divide (floor) AND: divide left operand by right operand and then        a //= b
           assign the floor value to left operand
**=        Exponent AND: calculate exponent (raise power) using operands and        a **= b
           assign value to left operand
&=         Perform bitwise AND on operands and assign value to left operand         a &= b
|=         Perform bitwise OR on operands and assign value to left operand          a |= b
^=         Perform bitwise XOR on operands and assign value to left operand         a ^= b
>>=        Perform bitwise right shift on operands and assign value to left         a >>= b
           operand
<<=        Perform bitwise left shift on operands and assign value to left          a <<= b
           operand
The most commonly used assignment operator is ‘=’.

Syntax : identifier = expression
Example : x = 5
Here x refers to a number and the value 5 is assigned to x.

Python also has shorthand assignment operators, which combine an arithmetic or bitwise
operation with assignment. All the assignment operators are listed
in Table 1-5: Short-hand Assignment Operators.
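
A short walk-through of the shorthand operators is sketched below. Each statement updates the variable in place, and the comment shows the value afterwards; the starting values are arbitrary.

```python
a = 9
a += 4     # a is now 13
a -= 3     # a is now 10
a *= 2     # a is now 20
a //= 3    # a is now 6
a %= 4     # a is now 2
a **= 3    # a is now 8
a /= 2     # a is now 4.0 (true division always gives a float)
print(a)

b = 10     # binary 1010
b &= 6     # 1010 & 0110 = 0010, b is now 2
b |= 5     # 0010 | 0101 = 0111, b is now 7
b ^= 3     # 0111 ^ 0011 = 0100, b is now 4
b <<= 2    # b is now 16
b >>= 1    # b is now 8
print(b)
```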

1.11 Special operators


Python language offers some special types of operators like the identity operator or the
membership operator. They are described below with examples.

Identity operators
There are two identity operators in Python (is and is not). They are used to check whether two
values (or variables) refer to the same object, i.e. are located at the same place in memory.
Two variables that are equal are not necessarily identical.

Table 1-6: Identity Operators

Operator   Meaning                                                              Example
is         True if the operands are identical (refer to the same object)        x is y
is not     True if the operands are not identical (do not refer to the          x is not y
           same object)

# Examples of identity operators
a1 = 5
b1 = 5
# strings
a2 = 'Hello'
b2 = 'Hello'
# lists
a3 = [1, 2, 3]
b3 = [1, 2, 3]

# Output: True
print('a1 is b1:', a1 is b1)

# Output: False
print('a1 is not b1:', a1 is not b1)

# Output: True
print('a2 is b2:', a2 is b2)

# Output: False
print('a3 is b3:', a3 is b3)

Output of the above code is:

a1 is b1: True
a1 is not b1: False
a2 is b2: True
a3 is b3: False

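Here, we see that a1 and b1 are integers with the same value, so they are equal as well as identical.
The same is the case with a2 and b2 (strings). This happens because the CPython interpreter reuses
a single object for small integers and for short string literals; it is an implementation detail
and should not be relied upon.

But a3 and b3 are lists. They are equal but not identical, because the interpreter stores them
separately in memory although they are equal.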
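
The difference between equality (==) and identity (is) is easiest to see with a second name for the same list. The sketch below is illustrative; the names first, second and alias are chosen only for this example.

```python
first = [1, 2, 3]
second = [1, 2, 3]     # a separate list with the same contents
alias = first          # another name for the very same list object

equal = first == second        # True: same contents
identical = first is second    # False: two different objects
same_object = first is alias   # True: one object, two names

print(equal, identical, same_object)

# A change made through one name is visible through the other.
alias.append(4)
print(first)                   # [1, 2, 3, 4]
```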

Membership operators
There are two membership operators in Python (in and not in). They are used to test whether a
value or variable is found in a sequence or collection (string, list, tuple, set and dictionary). In a
dictionary we can only test for the presence of a key, not a value. These data types are discussed
in detail later on.
Table 1-7: Membership Operators

Operator    Meaning                                                  Example

in          True if value/variable is found in the sequence          5 in x

not in      True if value/variable is not found in the sequence      5 not in x
                                                                     (Here x is a sequential data type)

# Examples of Membership operator


name = 'Bharati Vidyapeeth, Pune'
nums = [1,2,3,4,5]
# Output: True
print('B' in name)
# Output: False
print('Bharati' not in name)
# Output: True
print(3 in nums)
# Output: False
print('b' in nums)
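
The point about dictionaries can be illustrated with a small sketch. The dictionary marks and its contents are hypothetical and serve only as an example.

```python
# Hypothetical dictionary mapping student names (keys) to marks (values)
marks = {'Asha': 82, 'Ravi': 67}

key_found = 'Asha' in marks          # membership on a dictionary tests the keys
value_as_key = 82 in marks           # 82 is a value, not a key
value_found = 82 in marks.values()   # test the values explicitly

print(key_found)      # True
print(value_as_key)   # False
print(value_found)    # True
```

To test for a value rather than a key, the values() method of the dictionary can be used as shown in the last test.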

1.12 Operator Precedence and Associativity


When an expression is evaluated, it follows the rules of precedence. The following table lists the
operators in decreasing order of precedence. If an expression contains operators with the same
precedence, they are evaluated from left to right; the exponent operator ** is the exception and is
evaluated from right to left.

Table 1-8: Operator Precedence

Operators                                   Meaning
(Decreasing order of precedence)

**                                          Exponent

*, /, //, %                                 Multiplication, Division, Floor division, Modulus

+, -                                        Addition, Subtraction

==, !=, <, <=, >, >=,                       Comparison, identity and membership operators
is, is not, in, not in                      (all at the same level)

not                                         Logical NOT

and                                         Logical AND

or                                          Logical OR

=, +=, -=, *=, /=, //=, %=, **=             Assignment (performed last, after the expression on
                                            the right-hand side has been evaluated)
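
A few expressions help to check these rules. This is a minimal sketch; the names result1 to result5 are illustrative only.

```python
result1 = 2 + 3 * 4      # * binds tighter than +, so this is 2 + 12
result2 = (2 + 3) * 4    # parentheses override precedence
result3 = 2 ** 3 ** 2    # ** groups right to left: 2 ** (3 ** 2) = 2 ** 9
result4 = 10 - 4 - 3     # same precedence, left to right: (10 - 4) - 3
# comparison and membership are evaluated before not, and not before or
result5 = not 5 > 3 or 2 in [1, 2]

print(result1)  # 14
print(result2)  # 20
print(result3)  # 512
print(result4)  # 3
print(result5)  # True
```

When the intended order is not obvious, adding parentheses makes an expression easier to read and avoids mistakes.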

1.13 Control Structures in Python


Control structures are one of the most vital elements in any programming language. Most
programs do not work by executing a simple sequential set of statements. The code is
constructed so that decisions and different paths through the program can be taken based on
changes in variable values. To make this possible, all programming languages have a set of control
structures.

A program’s control flow is the order in which the program’s code executes. The control flow of
a Python program is regulated by conditional statements, loops, and function calls. Python has
three types of control structures:

▪ Sequential - default mode
▪ Selection - used for decisions and branching
▪ Repetition - used for looping, i.e., repeating a piece of code multiple times.

In Python, the selection statements are also known as Decision control statements or branching
statements.

The selection statement allows a program to test several conditions and execute instructions
based on which condition is true.

Some Decision Control Statements are:

• Simple if
• if-else
• nested if
• if-elif-else

Simple if: An if statement is a control flow statement that runs a particular block of code only
when a certain condition is met. A simple if has only one condition to check.

If Statement Syntax

if expression:
    Statement-1
    Statement-2
    ...
    Statement-n

Here, the program evaluates the expression and executes the statement(s) only if the expression
is True. If the test expression is False, the statement(s) are not executed. In Python, the body of the
if statement is indicated by indentation. The body starts with an indented line and the first
unindented line marks the end.
Example:
# If the number is positive, we print an appropriate message

num = 13
if num > 0:
    print(num, "is a positive number.")
print("This is always printed.")

When you run the program, the output will be:

13 is a positive number.
This is always printed.

Note that Python interprets non-zero numbers as True. None and 0 are interpreted as
False.

Figure 1-3: functioning of If

In the above example, num > 0 is the test expression. The body of if is executed only if this
evaluates to True. When the variable num is equal to 13, the test expression is true and the
statement(s) inside the body of if are executed.
1.14 Python if…else Statement
An if statement can optionally include an else clause. The else clause is included as follows:

if expression:
    statement1
else:
    statement2
If expression evaluates to true, statement1 is executed. If expression evaluates to false, control
goes to the else clause and statement2 is executed. Both statement1 and statement2
can be a single statement or a block of statements.

Figure 1-4: functioning of if-else

Example: An if statement with an else clause

x = 13
if x >= 0:
    print("x is a non-negative number")
else:
    print("x is a negative number")
In the above example, x >= 0 is the test expression. The body of if is executed only if this
evaluates to True; otherwise the body of the else is executed. When the variable x is equal to 13,
the test expression is true and the statement(s) inside the body of if are executed. If we change the
value of x to -3, the expression x >= 0 becomes false and the body of else executes.

1.15 Python if...elif...else Statement

if test expression:
    Body of if
elif test expression:
    Body of elif
else:
    Body of else

The elif is short for else if. It allows us to check multiple expressions. If the condition for if is
False, it checks the condition of the next elif block and so on. If all the conditions are False, the
body of else is executed. Only one block among the several if...elif...else blocks is executed
according to the condition. An if statement can have only one else block, but it can have multiple
elif blocks. The following diagram shows the flowchart of the if...elif...else statement.

Figure 1-5: functioning of if-elif-else


Example of if...elif...else Statement
x = 12
y = 34
if x == y:
    print("x is equal to y")
elif x > y:
    print("x is greater than y")
else:
    print("x is smaller than y")
If we execute the above code, we get the output

x is smaller than y

If values of x and y are changed then we get different output depending on values of variables.
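
The nested if listed among the decision control statements is simply an if placed inside the body of another if or else. A minimal sketch follows; the variable names number and kind are illustrative.

```python
number = 0

if number >= 0:
    # this inner if runs only when the outer condition is True
    if number == 0:
        kind = 'zero'
    else:
        kind = 'positive'
else:
    kind = 'negative'

print(number, 'is', kind)  # 0 is zero
```

Changing number to 13 or -3 selects the 'positive' or 'negative' branch respectively.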

1.16 Python Iterative Statements


In Python, the iterative statements are also known as looping statements or repetitive statements.
The iterative statements are used to execute a part of the program repeatedly as long as a given
condition is True. Repetition is used, for example, to print a sequence of numbers, to add up the
elements of a collection, or to send messages to other objects. Python supports the following loops.
1. while
2. for

1.16.1 while Statement


In Python, the while statement is also known as an entry-controlled loop statement because
the given condition is checked first, and the statements are executed only if the
condition is true. The general syntax of the while statement in Python is as
follows.

While Loop Syntax :

while condition:
    Statement_1
    Statement_2
    Statement_3
    ...
The flowchart of the while loop is shown in Figure 1-6. Note that all statements within the
loop body are executed on every iteration. If we use if or if-else inside the loop, the execution
of some statements can be made to depend on the condition used in the if.

When we define a while statement, the block of statements must be specified using indentation
only. The indentation is a series of white-spaces. The number of white-spaces may vary,
but all statements in the block must use the same number of white-spaces. Let's look at the
following example Python code.

Figure 1-6: Functioning of while loop

While Statement Example:

#while loop example
count = int(input('How many even numbers do we want to print: '))
print('First %d even numbers are:' % count)

#while loop construct
i = 1
while i <= count:
    evenNum = i * 2
    print(evenNum)
    i += 1

#this is not a part of the while loop
print('Job is done! Thank you!!')
1.16.2 for Statement
In Python, the for statement is used to iterate through a sequential collection like a list, a tuple, a
set, a dictionary, or a string. The for statement is used to repeat the execution of a set of
statements for every element of a sequence. The general syntax of for statement in Python is as
follows.

For loop Syntax

for <variable> in <sequence>:
    Statement_1
    Statement_2
    Statement_3
    ...
In the above syntax, the variable is assigned each element of the sequence in turn, one per
iteration. The flow chart for the for-in loop is shown below.

Figure 1-7: Functioning of for loop

Example
#for loop example
#Creating list of first 10 natural numbers
nums = [1,2,3,4,5,6,7,8,9,10]
print("list of first 10 natural numbers is:")
print(nums)

#initialize total to zero (the name sum is avoided because it is a built-in function)
total = 0
#loop adds numbers in list into total
for num in nums:
    total += num

#this is not a part of the for loop
print('Job is done!')
print("sum of numbers in list : ", total)

1.16.3 Loop Control Statements


Loop control statements change execution from its normal sequence. Python supports the
following control statements.
• break
• continue
• pass

The Python break statement immediately terminates a loop entirely. Program execution
proceeds to the first statement following the loop body.

The Python continue statement immediately terminates the current loop iteration. Execution
jumps to the top of the loop, and the controlling expression is re-evaluated to determine whether
the loop will execute again or terminate.

The Python pass statement does nothing. It is used as a placeholder where a statement is
required by the syntax but no action is needed.

Note that break and continue statements usually appear as part of a conditional statement (used
with if). If a break statement executes in either kind of loop, it terminates the loop
entirely and the statements after the loop are executed. If continue executes, the statements
within the loop after continue are skipped and control goes to the next iteration of the loop by
checking the loop expression.
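
The behaviour of the loop control statements can be sketched as follows. The list nums and the limit 15 are arbitrary values chosen for illustration.

```python
nums = [3, 8, 5, 12, 7, 20, 9]
odd_values = []

for n in nums:
    if n > 15:
        break       # leave the loop at the first value above 15
    if n % 2 == 0:
        continue    # skip even values and go to the next iteration
    odd_values.append(n)

print(odd_values)   # [3, 5, 7]

for n in nums:
    pass            # placeholder: the loop runs but does nothing
```

The value 9 is never examined by the first loop because break ends the loop when 20 is reached.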

1.17 Working with Strings in Python


String type, strings concatenations and comparing strings, using string functions
In Python, strings can be created by enclosing a character or a sequence of characters in
quotes. A character is simply a symbol. For example, there are 10 digit symbols
(0,1,2,3,4,5,6,7,8,9) and the English alphabet has 26 letters.

Computers do not deal with characters; they deal with numbers (binary). Even though you may
see characters on your screen, internally they are stored and manipulated as a combination of 0s and
1s. This conversion of a character to a number is called encoding, and the reverse process is
decoding. ASCII and Unicode are some of the popular encodings used.

In Python, a string is a sequence of Unicode characters. Unicode was introduced to include every
character in all languages and bring uniformity in encoding.

Python has a built-in string class named "str" with many handy features (there is an older module
named "string" which you should not use). Whenever we create a string it is automatically of
type ‘str’. To understand this consider following example.

#Creating Strings in python


str1 = 'I am String'
print(type(str1))

Output of the above code is


<class 'str'>

In Python, strings are treated as the sequence of characters, which means that Python doesn't
support the character data-type; instead, a single character written as 'p' is treated as the string of
length 1.

1.17.1 Creating Strings in Python


Strings can be created by enclosing characters inside single quotes or double quotes. Triple
quotes can also be used in Python, but they are generally used for multiline strings and docstrings.
Backslash escapes work the usual way within both single and double quoted literals -- e.g. \n \'
\". A double quoted string literal can contain single quotes (e.g. "I didn't do it") and likewise a
single quoted string can contain double quotes.
Example
#Creating Strings in python
#(the variable is named text because str is the name of the built-in string class)
text = 'I am String in single quote.'
print(text)

text = "I am string in double quotes."
print(text)

text = '''I am string in triple quotes.'''
print(text)

text = "I am string containing 'single quote' in double quotes."
print(text)

text = 'I am string containing "double quote" in single quotes.'
print(text)

text = '''I am multiline comment/string.
It is also called as "docstring".
This is the third line.'''
print(text)

Output of the above code is

I am String in single quote.
I am string in double quotes.
I am string in triple quotes.
I am string containing 'single quote' in double quotes.
I am string containing "double quote" in single quotes.
I am multiline comment/string.
It is also called as "docstring".
This is the third line.

Accessing String characters


We can access individual characters using indexing and a range of characters using slicing. The index
starts from 0. Trying to access a character outside the index range raises an IndexError. The index
must be an integer; using a float or another type results in a TypeError.
Python allows negative indexing for its sequences. The index -1 refers to the last item, -2 to the
second last item and so on. We can access a range of items in a string by using the slicing
operator : (colon). Examples of accessing string characters and string slicing are given below.

String slicing example

#code for string slicing
str = 'programming'
print('str = ', str)

#first character
print('str[0] = ', str[0])

#second character
print('str[1] = ', str[1])

#last character
print('str[-1] = ', str[-1])

#slicing 4th to 8th character (indexes 3 to 7)
print('str[3:8] = ', str[3:8])

#slicing 6th to 3rd last character (the end index -2 is excluded)
print('str[5:-2] = ', str[5:-2])

#accessing character out of index
#print('str[15] = ', str[15]) #second last line

#accessing character with non-integer index
#print('str[5.0] = ', str[5.0]) #last line

Output of the above code is:


str = programming
str[0] = p
str[1] = r
str[-1] = g
str[3:8] = gramm
str[5:-2] = ammi
If we try to access an index that does not exist, Python raises an IndexError. Accessing index 15
(str[15]), as in the second last line of the above example, raises the following
error.

File "C://python slm/code/Unit1/string-slice.py", line 22, in <module>


print('str[15] = ', str[15]) #second last line
IndexError: string index out of range

If we use a non-integer value as an index, we get a TypeError as below. You can try it out by
uncommenting the last line in the above code.

File "C:/python slm/code/Unit1/string-slice.py", line 25, in <module>


print('str[5.0] = ', str[5.0]) #last line
TypeError: string indices must be integers

Python strings are "immutable", which means they cannot be changed after they are created (Java
strings also use this immutable style). Since strings can't be changed, we construct *new* strings
as we go to represent computed values. So, for example, the expression ('hello' + 'there') takes in
the 2 strings 'hello' and 'there' and builds a new string 'hellothere'. Trying to replace the
character at any index position raises an error. For example, the statement str[5] = 'T'
leads to the error TypeError: 'str' object does not support item assignment
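
The following sketch shows immutability in practice. It uses a try/except block, which is assumed here only to catch the error so that the rest of the example can run; the names word, message and capitalized are illustrative.

```python
word = 'programming'

try:
    word[0] = 'P'              # strings do not support item assignment
except TypeError as error:
    message = str(error)
    print('TypeError:', message)

# Build a new string instead of changing the existing one
capitalized = 'P' + word[1:]
print(capitalized)  # Programming
print(word)         # programming (the original string is unchanged)
```

The new string is built from a literal and a slice of the old one, which is the usual way to "modify" a string.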

1.17.2 String Methods

A method is like a function, but it runs "on" an object. If the variable s is a string, then the code
s.lower() runs the lower() method on that string object and returns the result (this idea of a
method running on an object is one of the basic ideas that make up Object Oriented
Programming, OOP). Here are some of the most common
string methods:

• s.lower(), s.upper() -- returns the lowercase or uppercase version of the string
• s.strip() -- returns a string with whitespace removed from the start and end
• s.isalpha()/s.isdigit()/s.isspace()... -- tests if all the string chars are in the various character
classes
• s.startswith('other'), s.endswith('other') -- tests if the string starts or ends with the given other
string
• s.find('other') -- searches for the given other string (not a regular expression) within s, and
returns the first index where it begins or -1 if not found
• s.replace('old', 'new') -- returns a string where all occurrences of 'old' have been replaced by
'new'
• s.split('delim') -- returns a list of substrings separated by the given delimiter. The delimiter is
not a regular expression, it's just text. 'aaa,bbb,ccc'.split(',') -> ['aaa', 'bbb', 'ccc']. As a
convenient special case s.split() (with no arguments) splits on all whitespace chars.
• s.join(list) -- opposite of split(), joins the elements in the given list together using the string as
the delimiter. e.g. '---'.join(['aaa', 'bbb', 'ccc']) -> 'aaa---bbb---ccc'
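
The methods listed above can be tried out with a short sketch. The sample string and the variable names are chosen only for illustration.

```python
s = '  Hello, World  '

trimmed = s.strip()                  # remove leading and trailing whitespace
print(trimmed)                       # Hello, World
print(trimmed.lower())               # hello, world
print(trimmed.upper())               # HELLO, WORLD
print(trimmed.startswith('Hello'))   # True
print(trimmed.find('World'))         # 7
print(trimmed.find('Python'))        # -1 (not found)

replaced = trimmed.replace('World', 'Python')
print(replaced)                      # Hello, Python

parts = 'aaa,bbb,ccc'.split(',')     # split text into a list
joined = '---'.join(parts)           # join the list back with a new delimiter
print(parts)                         # ['aaa', 'bbb', 'ccc']
print(joined)                        # aaa---bbb---ccc
```

Each method returns a new string (or list) and leaves the original string s unchanged.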

1.18 Review Questions


1. Describe the rules used to name identifiers.
2. What do you know about Python statements? List and describe the types of statements
supported in Python.
3. Describe various types of comments in python.
4. List and describe various data types in python.
5. Which are number data types supported by Python? Briefly describe each.
6. What is type conversion? Explain the mechanism to convert types in python.
7. Describe various Arithmetic operators in python.
8. List and describe working of logical operators in Python.
9. When to use relational operators? List and describe relational operators in python.
10. Describe following operators
a. Bitwise
b. Assignment
c. Identity
d. Membership
11. Write a note on operator precedence and associativity in Python.
12. What is the need for control statements in programming? List the control structures in Python
and explain any one of them.
13. Explain the working of the if, if-else and if-elif statements in Python.
14. List and describe iterative statements in Python.
15. What do you know about loop control statements? Explain usage of each.
16. How to create strings in python? Describe various mechanism of accessing string data.
17. List and describe various methods of string object.
2 Working with functions and Built in data structures

Objectives
After completing this chapter, students will be able to
- Understand the use of functions and the mechanism for defining them.
- Describe types of functions.
- Define functions and use built-in functions.
- Pass parameters to a function and return values from it.
- Use recursive functions and functions with a variable number of arguments.
- Describe the built-in data structures and differentiate between them.

2.1 Introduction

In this Unit, you'll learn about functions: what a function is, its syntax, components, and types.
You'll also learn to create a user defined function in Python. We will learn about
passing parameters to functions and returning values from them. We will also learn about the Python
built-in data structures tuple, list, set and dictionary.

2.2 Function
A function is a group of related statements that performs a specific task. Functions help to break
a program into smaller and modular chunks. As our program grows larger and larger, functions
make it more organized and manageable. A function is a self-contained sub-program that is used
when needed. Functions help to avoid repetition and make the code reusable. A function
may or may not need inputs and may or may not return values.

There are three types of functions in Python:

• Built-in functions: such as help() to ask for help, print() to print an object to the terminal
• User-Defined Functions (UDFs): functions that users/programmers create
• Anonymous functions: these are also called lambda functions because they are not
declared with the standard def keyword.

2.3 User Defined Functions


User defined functions are defined by the programmer to perform a specific task, such as finding a
minimum/maximum value or checking whether a number is even. Once a function is written,
it can be used whenever that functionality is needed. For example, suppose we have
written a function to check whether a number is even and we now want the sum
of the even numbers in a given collection; we can use the earlier function to check every number
in the collection and add up those that are even. The syntax of a user defined
function is below.
Function Syntax

def function_name(parameters):
    """docstring"""
    statement(s)

If we look at the function syntax above it consists of the following components.

▪ Keyword def that marks the start of the function header.
▪ A function_name, the name given to the function to identify it uniquely.
Function naming follows the same rules of writing identifiers in Python.
▪ Parameters (arguments) through which we pass values to a function. They are optional.
▪ A colon (:) to mark the end of the function header.
▪ Optional documentation string (docstring) to describe what the function does.
▪ One or more valid python statements that make up the function body. Statements must
have the same indentation level (usually 4 spaces).
▪ A function may have an optional return statement to return a value from the function.

Example: let’s consider the need to print a copyright message in a program, which may be used in
multiple places and may need to change. The following code defines a function that takes no
parameters, returns no value and prints a copyright message. This function also contains a docstring.

def copyright():
    """ This does not take any parameter
    and does not return any value.
    This function just prints copyright message.
    """
    print("This contents are ©copyright of ABC-Organization")

#below code calls the function defined above
copyright()

Output of the above code execution is:

This contents are ©copyright of ABC-Organization

The string appearing immediately after the function header is called the docstring; it is used as a
documentation string to describe what a function does. We generally use triple quotes
so that the docstring can extend over multiple lines. A docstring is optional and may also be written
in triple single quotes. Documentation is a good programming practice; it helps the reader of the
program to know more about the function. This string is available to us as the __doc__ attribute of
the function. To access and print the docstring of the above function, use
print(copyright.__doc__).
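
As a further illustration, the sketch below defines a hypothetical function square() with a one-line docstring and reads the docstring back through the __doc__ attribute.

```python
def square(number):
    """Return the square of number."""
    return number * number

print(square(4))        # 16
print(square.__doc__)   # Return the square of number.
```

The built-in help() function mentioned earlier displays the same docstring when called as help(square).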

2.4 Argument Passing in Function


The function defined above does not take any arguments. That can sometimes be useful, and you’ll
occasionally write such functions. More often, we need to pass data into a function
so that it performs its task depending on the parameters passed to it.
Consider the following function, which takes three arguments.
#function with arguments
def getCost(item, qty, price):
    cost = qty * price
    print('Cost of %d %s is %.2f' %(qty, item, cost))
    #above statement can also be written as below
    #print(f'Cost of {qty} {item} is {cost:.2f}')

#this is function call
getCost('Apple', 4, 180.0)

The output for above function call is

Cost of 4 Apple is 720.00

The parameters (item, qty, and price) behave like variables that are defined locally to the
function. When the function is called, the arguments that are passed ('Apple', 4, and 180)
are bound to the parameters in order, as though by variable assignment.

The parameters given in the function definition are referred to as formal parameters, and the
arguments in the function call are referred to as actual parameters. Arguments can be passed in
the following ways
1. Positional Arguments
2. Keyword Arguments
3. Default parameters

2.4.1 Positional Arguments


In the function call getCost('Apple', 4, 180.0), the arguments are passed
as positional arguments (also called required arguments). In the function definition, you specify a
comma-separated list of parameters inside the parentheses, and you need to call the function with
arguments in the same order.

With positional arguments, the arguments in the call and the parameters in the definition must
agree not only in order but in number as well. That’s the reason positional arguments are also
referred to as required arguments. Let us see the result of calling the above function with the
arguments in an incorrect order (it is assumed that the function has been defined before executing
the following code).

# call function to get cost of 6 banana at Rs.2


getCost(6, "Banana", 2)

The result of the above function call is


File "E:/programming/python/function-02.txt", line 3, in getCost
print('Cost of %d %s is %.2f' %(qty, item, cost))
TypeError: %d format: a number is required, not str

Positional arguments are conceptually straightforward to use, but they’re not very
forgiving. You must specify the same number of arguments in the function call as
there are parameters in the definition, and in exactly the same order. In the sections
that follow, you’ll see some argument-passing techniques that relax these restrictions.
2.4.2 Keyword Arguments


When you’re calling a function, you can specify arguments in the form <keyword>=<value>. In
that case, each <keyword> must match a parameter in the Python function definition. For
example, the previously defined function getCost() may be called with keyword arguments as
follows (invoked at the Python command prompt); every call gives the desired output.

>>> getCost(qty=6, item="Banana", price=2)
Cost of 6 Banana is 12.00
>>> getCost(item="Banana", qty=6, price=2)
Cost of 6 Banana is 12.00
>>> getCost(item="Banana", price=2, qty=6)
Cost of 6 Banana is 12.00

Using keyword arguments lifts the restriction on argument order. Each keyword
argument explicitly designates a specific parameter by name, so you can specify them
in any order and Python will still know which argument goes with which parameter.
Like with positional arguments, though, the number of arguments and parameters
must still match. If we specify a keyword that doesn’t match any of the declared
parameters, or omit a required argument, the function call raises an exception:
>>> getCost(item="Banana", qty=6, cost=2) # argument name does not match
TypeError: getCost() got an unexpected keyword argument 'cost'

>>> getCost(item="Banana", qty=6) # only two arguments
TypeError: getCost() missing 1 required positional argument: 'price'

2.4.3 Default Parameters


Python allows function arguments to have default values. If the function is called without
the argument, the argument gets its default value. If a parameter specified in a Python function
definition has the form <name>=<value>, then <value> becomes a default value for that
parameter. Parameters defined this way are referred to as default or optional parameters. An
example of a function definition with default parameters is shown below:

def greet(name, msg="Good morning!"):
    """
    This function greets to the person with the
    provided message.

    If the message is not provided,
    it defaults to "Good morning!"
    """
    print("Hello", name + ', ' + msg)

#calling greet function without msg
greet("Nisha")

#calling greet function with msg
greet("Amit", "How do you do?")

Output of the above code execution is:

Hello Nisha, Good morning!
Hello Amit, How do you do?

In this function, the parameter name does not have a default value and is required (mandatory)
during a call. On the other hand, the parameter msg has a default value of "Good morning!". So, it
is optional during a call. If a value is provided, it will override the default value.

Any number of arguments in a function can have a default value. But once we have a default
argument, all the arguments to its right must also have default values. This means that
non-default arguments cannot follow default arguments. For example, if we had defined the above
function header as
def greet(msg = "Good morning!", name):

This will result in syntax error.
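
Default parameters combine naturally with keyword arguments. The sketch below uses a hypothetical function build_greeting() with two default parameters; it returns the text instead of printing it so that the results can be compared.

```python
def build_greeting(name, msg='Good morning!', sign='- Admin'):
    """Hypothetical variant of greet() with two default parameters."""
    return 'Hello ' + name + ', ' + msg + ' ' + sign

a = build_greeting('Nisha')                 # both defaults are used
b = build_greeting('Amit', sign='- Team')   # msg keeps its default, sign is given by keyword
c = build_greeting('Ravi', 'Welcome!')      # msg is given positionally, sign keeps its default

print(a)  # Hello Nisha, Good morning! - Admin
print(b)  # Hello Amit, Good morning! - Team
print(c)  # Hello Ravi, Welcome! - Admin
```

Using a keyword makes it possible to skip one default parameter and still supply a later one.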

2.4.4 Functions and variable scope


The variables declared within a function are called local variables of that function and
have their scope only in that particular function. In simple words, they cannot be accessed
outside that function. A variable declared outside the function with the same
name as one within the function is a completely different variable.
Local Scope
Whenever you define a variable within a function, its scope is limited within the
function. Variable is accessible from the point at which it is defined until the end of the
function. Value of local variable cannot be changed or even accessed from outside the
function. Let's take a simple example:

#This is function
def func():
    #create a local variable
    num = 23
    print('In func(); num = ', num)

#call a function
func()
#attempt to access num defined in function
print('Outside func(); num = ', num)

Output of the above code looks like below.


In func(); num = 23
NameError: name 'num' is not defined

Enclosing Scope
Python supports nested blocks and nested functions too. If a variable is declared in an outer block,
it is accessible in the inner block also. The inner block can be a control statement (if, if-else,
while, for) or it can be a function too. Consider the following example:

def outer():
    num1 = 1
    def inner():
        num2 = 2
        # Print statement 1 - Scope: Inner
        print("num1 from outer: ", num1)
        # Print statement 2 - Scope: Inner
        print("num2 from inner: ", num2)
    inner()
    # Print statement 3 - Scope: Outer
    print("num2 from inner: ", num2)

#call the outer function
outer()

Output of the above code is:

num1 from outer: 1


num2 from inner: 2

NameError: name 'num2' is not defined

Global Scope
When we declare a variable outside any function, the scope of that variable is global. Any
Python code, including code in a function, can access its value. Consider the following code

#this is a global variable
num = 23

#this is function
def func():
    #accessing global num
    print('In func(); num = ', num)

#call a function
func()
print('Outside func(); num = ', num)

Output of the above code execution is:

In func(); num = 23
Outside func(); num = 23

The variable num in the above example is declared outside the function and hence its scope is
global. The variable num can be accessed everywhere in the program.
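
As noted at the start of this section, a variable assigned inside a function is a different variable from a global one with the same name. A minimal sketch, with illustrative names:

```python
num = 23          # global variable

def func():
    num = 99      # assignment creates a new local variable named num
    return num

inside = func()
print(inside)     # 99 (the local num)
print(num)        # 23 (the global num is unchanged)
```

The assignment inside func() does not touch the global num; it only creates a local variable that disappears when the function returns.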

Built-in Scope
This is the widest scope that exists. All the built-in names, such as print() and len(), fall under
this scope. We can use these names anywhere within our program without having to define them
before use.
2.4.5 Call by reference in Python
In Python, every value is an object and every variable name is a reference to an object.
When we pass a variable to a function, a new reference to the same object is created.
Parameter passing in Python is the same as reference passing in Java: when we pass a
parameter to a function, a copy of the reference is created. Through that reference we can change
the contents of a mutable object, but assigning a whole new object to the parameter does not
affect the caller.

Remember that some Python objects (boolean, integer, float, string, and tuple) are immutable.
This means that after such an object is created, its value cannot be modified.
Every time we assign a new value to such a variable, a new object is created and the variable
refers to the new object. Consider the example below.

# this tries to modify num
def change(num):
    num = 23
    print ('in change() num =', num)

#define num
num = 11
print("Before call to change(); num = ", num)
#call change()
change(num)
print("After call to change(); num = ", num)

Output of the above code is

Before call to change(); num = 11


in change() num = 23
After call to change(); num = 11

From this it is clear that the value of num is changed inside the function, but the change is not
reflected at the calling position. Now let us pass a list and check the behavior, as in the example below.
#function tries to change list as whole
def changeList(fruits):
    fruits = ['Orange','Dates', 'pineapple' ]
    print('In changeList(), fruits = ', fruits)

fruits = ['Apple', 'Banana', 'Grapes']

print('Before calling changeList(), fruits = ', fruits)
changeList(fruits)
print('After calling changeList(), fruits = ', fruits)

Output of the above code is:

Before calling changeList(), fruits = ['Apple', 'Banana', 'Grapes']


In changeList(), fruits = ['Orange', 'Dates', 'pineapple']
After calling changeList(), fruits = ['Apple', 'Banana', 'Grapes']

From this it is clear that if we try to replace the whole object, the caller's scope is not affected.
If instead the function modifies part of the list, the change is visible to the caller.

#function modifies list part
def changeListPart(fruits):
    fruits.append('Orange')
    print('In changeListPart(), fruits = ', fruits)

#define list of fruits
fruits = ['Apple', 'Banana', 'Grapes']

print('Before changeListPart(), fruits = ', fruits)
changeListPart(fruits)
print('After changeListPart(), fruits = ', fruits)

If we run the code above we will get the output below.

Before changeListPart(), fruits = ['Apple', 'Banana', 'Grapes']
In changeListPart(), fruits = ['Apple', 'Banana', 'Grapes', 'Orange']
After changeListPart(), fruits = ['Apple', 'Banana', 'Grapes', 'Orange']

The parameter name does not have to match the caller's variable name. The function
changeListPart(fruits) above can also be written as:

def changeListPart(items):
    items.append('Orange')
    print('In changeListPart(), items = ', items)

Output of the code is:

Before changeListPart(), fruits = ['Apple', 'Banana', 'Grapes']
In changeListPart(), items = ['Apple', 'Banana', 'Grapes', 'Orange']
After changeListPart(), fruits = ['Apple', 'Banana', 'Grapes', 'Orange']
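The difference between replacing a list and modifying it can also be checked with the built-in id() function, which returns the identity of an object. The following is a minimal sketch; the function names rebind() and mutate() are illustrative and are not part of the examples above.

```python
def rebind(items):
    # Assignment makes the local name refer to a brand-new list object.
    items = ['Orange']
    return id(items)

def mutate(items):
    # append() changes the existing list object in place.
    items.append('Orange')
    return id(items)

fruits = ['Apple', 'Banana']
original_id = id(fruits)

rebound_id = rebind(fruits)
print(rebound_id == original_id)   # False: a different object inside rebind()
print(fruits)                      # ['Apple', 'Banana']

mutated_id = mutate(fruits)
print(mutated_id == original_id)   # True: the caller's object was modified
print(fruits)                      # ['Apple', 'Banana', 'Orange']
```

Because mutate() works on the same object as the caller, its change is visible after the call, whereas rebind() only changes what its local name refers to.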

2.4.6 Variable number of arguments


Sometimes, we do not know in advance the number of arguments that will be passed into a
function. Python allows us to handle this kind of situation through function calls with
variable/arbitrary number of arguments. In the function definition, we use an asterisk (*) before
the parameter name to denote this kind of argument. Here is an example.

# function with variable number of arguments
def greet(*names):
    """This function greets all
    the persons in the names tuple."""

    # names is a tuple with arguments
    for name in names:
        print("Hello", name)

# Calling a function with variable values
greet("Abhay", "Beena", "Chetan", "Dhruv")
print()
greet("Savita", "Seema", "Simran")

If we execute the above code we will get output as:

Hello Abhay
Hello Beena
Hello Chetan
Hello Dhruv

Hello Savita
Hello Seema
Hello Simran
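A starred parameter can also follow ordinary parameters, and an existing list can be unpacked into separate arguments with * at the call site. The sketch below assumes a hypothetical function total() that is not part of the text above.

```python
def total(label, *numbers):
    # numbers is a tuple holding every extra positional argument
    result = sum(numbers)
    print(label, result)
    return result

sum_three = total("Sum of three:", 1, 2, 3)   # Sum of three: 6
sum_none = total("Sum of none:")              # Sum of none: 0

# An existing list can be unpacked into separate arguments with *
marks = [10, 20, 30]
sum_marks = total("Sum of marks:", *marks)    # Sum of marks: 60
```

When no extra arguments are passed, numbers is simply an empty tuple, so the function still works.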

2.5 Functions Returning values


Certain functions need to perform some task and, at the end, return a result back to the caller. To
do so, the return statement is used. The return statement can contain an expression, which is
evaluated and whose value is returned to the caller. The syntax to use return in a function is

return [expression]

Once the return statement executes, it terminates the function and transfers the result back to the
caller of the function. The return statement cannot be used outside of a function.

# function with return statement
def isEven(num):
    """ This function returns True if number is even
    otherwise returns False
    """
    rem = num % 2
    # the number is even when the remainder is 0
    if rem == 0:
        return True

    return False

The above function returns either True or False. If the return statement has no expression, or if
no return statement is used in the function at all, the function returns the None object. To
understand this, consider the example below.

# function with return not having return expression
def printHello():
    print("Hello, Using function")
    return  # implicitly every function has this at the end of function

result = printHello()
print(result)

The output of the above code is the same whether the bare return is written or omitted:

Hello, Using function
None
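Since return ends the function immediately, it can be used to stop as soon as a result is known. The following sketch uses a hypothetical function first_even() to show both an early return and the implicit None.

```python
def first_even(numbers):
    """Return the first even number in numbers, or None if there is none."""
    for num in numbers:
        if num % 2 == 0:
            return num      # the loop and the function both stop here
    # no return statement is reached, so None is returned implicitly

found = first_even([7, 9, 12, 14])
missing = first_even([1, 3, 5])
print(found)     # 12
print(missing)   # None
```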

2.6 Understanding Recursive Function


A recursive function is a function defined in terms of itself via self-referential expressions. This
means that the function will continue to call itself and repeat its behavior until some condition is
met to return a result. All recursive functions share a common structure made up of two parts:
base case and recursive case.

To understand this, let's write a recursive function for calculating the factorial of n (n!). The
formula to calculate the factorial of any number n is

n! = n * (n-1)!
n! = n * (n-1) * (n-2)!
...
n! = n * (n-1) * (n-2) * … * 3 * 2 * 1!

If we observe the above carefully, to get the factorial of any given number 'n' we need the factorial
of (n-1). To solve a large problem we keep solving smaller sub-problems until we reach a base
problem. In the case of factorial, the base problem is to get 1!, which is 1.

To calculate the factorial of n we can write the expression as

factorial(n) = 1                      if n == 1   (base case)
             = n * factorial(n-1)     if n > 1    (recursive case)

By considering the above we can write a factorial function and use it as below:


# define a recursive function to get factorial
def factorial(n):
    if n > 1:
        # recursive case n > 1
        return n * factorial(n-1)
    else:
        # base case n = 1
        return 1

# using recursive function
print("Factorial of 1 : ", factorial(1))
print("Factorial of 3 : ", factorial(3))
print("Factorial of 5 : ", factorial(5))

The above function can also be written without the else clause, as below.

# define a recursive function to get factorial
def factorial(n):
    if n > 1:
        # recursive case n > 1
        return n * factorial(n-1)
    # base case n = 1
    return 1

# driver code for recursive function
print("Factorial of 1 : ", factorial(1))
print("Factorial of 3 : ", factorial(3))
print("Factorial of 5 : ", factorial(5))

If we execute the function along with the driver code we will get output as:

Factorial of 1 : 1
Factorial of 3 : 6
Factorial of 5 : 120

Each recursive call adds a stack frame (containing its execution context) to the call stack until
we reach the base case. Then the stack begins to unwind as each call returns its result. The
complete stacks for calculating the factorial of 3 and of 5 are shown below (most recent call at the top).

Stack for factorial(3)    Stack for factorial(5)
                          Factorial(1)
                          Factorial(2)
Factorial(1)              Factorial(3)
Factorial(2)              Factorial(4)
Factorial(3)              Factorial(5)

Factorial(3) = 3 * factorial(2)
             = 3 * 2 * factorial(1)
             = 3 * 2 * 1

Factorial(5) = 5 * factorial(4)
             = 5 * 4 * factorial(3)
             = 5 * 4 * 3 * factorial(2)
             = 5 * 4 * 3 * 2 * factorial(1)
             = 5 * 4 * 3 * 2 * 1
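The growth and unwinding of the call stack can be made visible by printing a message on entry to and exit from each call. The sketch below is an illustrative variant of the factorial function; the name traced_factorial and the depth parameter are assumptions added only for tracing.

```python
def traced_factorial(n, depth=0):
    # depth counts how many calls are already waiting on the stack
    indent = "  " * depth
    print(f"{indent}call factorial({n})")
    if n > 1:
        result = n * traced_factorial(n - 1, depth + 1)   # recursive case
    else:
        result = 1                                        # base case
    print(f"{indent}factorial({n}) returns {result}")
    return result

answer = traced_factorial(3)
```

Each deeper call is indented further; the "returns" lines appear in the reverse order of the "call" lines, which is the unwinding described above.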

2.7 Python Inbuilt Data Structures

Data structures are the fundamental constructs around which you build your programs. Each
data structure provides a particular way of organizing data so it can be accessed efficiently,
depending on your use case. Python has built-in support for an extensive set of data structures
in its standard library. Python has four basic built-in data structures, namely
▪ List
▪ Tuple
▪ Set
▪ Dictionary

2.8 Lists
Python's list structure is a mutable sequence container that can change size as items are added or
removed. It is an abstract data type that is implemented using an array structure to store the items
contained in the list.
2.8.1 Creating Python List
List is sequential in nature and it is created by placing all the items (elements) inside square
brackets [], separated by commas. It can have any number of items and they may be of different
types (integer, float, string etc.).

# Empty list
list1 = []

# List of integers
list2 = [11, 22, 33]

# List with mixed data types
list3 = [11, "Banana", 3.4]

print("list1 :", list1)
print("list2 :", list2)
print("list3 :", list3)

An empty list can also be created by calling list():

list4 = list()  # object is created
print("list4 :", list4)

Remember that a list can contain items of any data type; this includes other lists and user-defined
types too. The following example is a list containing a list (a nested list).

list5 = [11, "Amit", 3.4, [1,2,3]]

2.8.2 Accessing List elements


A list is an index-based data structure that supports sequential access, so we can use the index
operator [] to access an item in a list. In Python, indices start at 0, so a list having 5 elements
has indices from 0 to 4. Nested lists are accessed using nested indexing.

Trying to use an index that is out of range raises an IndexError. The index must be an
integer; using a float or another type results in a TypeError.

# List with mixed data types
list1 = [1, "Two", 3, 'Four']

# accessing elements of list1
print('list1[0] : ', list1[0])
print('list1[1] : ', list1[1])
print('list1[2] : ', list1[2])
print('list1[3] : ', list1[3])

# List of list (Nested list)
list2 = [1, "Two", 3, [11, 12, 13]]

# Accessing [11, 12, 13]
print('list2[3] : ', list2[3])

# Accessing individual elements 11, 12, 13
print('list2[3][0] : ', list2[3][0])
print('list2[3][1] : ', list2[3][1])
print('list2[3][2] : ', list2[3][2])

Python allows negative indexing for its sequences. The index -1 refers to the last item, -2 to
the second last item, and so on.

list1            1      'Two'    3      'Four'
Index            0      1        2      3
Negative Index   -4     -3       -2     -1

If we use list1[-1] it returns 'Four'
If we use list1[-2] it returns 3
If we use list1[-3] it returns 'Two'
If we use list1[-4] it returns 1
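The following sketch checks that a negative index is equivalent to counting from the length of the list, and shows the two errors that indexing can raise. It uses try/except only to keep the script running after each error; the exact wording of the error messages may differ between Python versions, so it is not reproduced here.

```python
list1 = [1, "Two", 3, "Four"]

last = list1[-1]                       # 'Four'
same_item = list1[len(list1) - 1]      # also 'Four'
print(last, same_item)

try:
    list1[4]                # valid positions are 0..3 and -4..-1
except IndexError as err:
    index_error = str(err)
    print("IndexError:", index_error)

try:
    list1[1.0]              # a float is not accepted as an index
except TypeError as err:
    type_error = str(err)
    print("TypeError:", type_error)
```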

2.8.3 List Slicing


We can access a range of items in a list by using the slicing operator : (colon). Slicing is an indexing
syntax that extracts a portion of a list. If nums is a list, then nums[m:n:step] returns the
portion of nums where:
• m: starting position/index
• n: end position/index (the item at n is not included)
• step: default value is 1; we can change it to any non-zero value
• Negative indexing can also be used

Remember that, with the default (positive) step, the start index must be position-wise lower than
the end index. If we use negative indexing, then -6 (m) must appear before -2 (n). If this rule is
not followed we get an empty list.

# list of numbers
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print('nums[1:4] : ', nums[1:4])

# m > n returns empty list
print('nums[4:1] : ', nums[4:1])

print('nums[-7: -4]:', nums[-7:-4])

# m > n returns empty list
print('nums[-1: -4]:', nums[-1:-4])

# this will give full list
print('nums[:] : ', nums[:])

# first 4 elements
print('nums[:4] : ', nums[:4])

# This will give alternate elements
print('nums[1::2] : ', nums[1::2])

Output of above code execution is:

nums[1:4] : [2, 3, 4]
nums[4:1] : []
nums[-7: -4]: [3, 4, 5]
nums[-1: -4]: []
nums[:] : [1, 2, 3, 4, 5, 6, 7, 8, 9]
nums[:4] : [1, 2, 3, 4]
nums[1::2] : [2, 4, 6, 8]
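The step value can be larger than 1 or negative. With a negative step the slice runs from right to left, so in that case the start position must lie after the end position. A small sketch of these cases, using the same nums list:

```python
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9]

every_third = nums[::3]      # every third element from the start
reversed_copy = nums[::-1]   # the whole list in reverse order
backwards = nums[4:1:-1]     # from index 4 down to (not including) index 1

print(every_third)     # [1, 4, 7]
print(reversed_copy)   # [9, 8, 7, 6, 5, 4, 3, 2, 1]
print(backwards)       # [5, 4, 3]

# slicing always builds a new list; the original is unchanged
print(nums)            # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```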

2.8.4 Adding Elements in List


There are four ways to add elements to a list in Python.
▪ append(): appends the object to the end of the list.
▪ insert(): inserts the object before the given index.
▪ extend(): extends the list by appending elements from the iterable.
▪ List concatenation: we can use the + operator to concatenate multiple lists and create a new
list.

Using Append
The append() method adds an element to the end of the list. Consider the following example, where
the flowers list contains two elements and we add a new flower to it. Note that append() is used
to add a single element.

# list of flowers
flowers = ["Rose", "Marigold"]

# print list of flowers
print('Current Flowers List', flowers)
# above statement can also be written using formatting as below
# print(f'Current Flowers List {flowers}')

flwr = input("Please enter a flower name: ")
flowers.append(flwr)

print(f'Updated Flowers List {flowers}')

Output of the above code is:

Current Flowers List ['Rose', 'Marigold']
Please enter a flower name: Tulip
Updated Flowers List ['Rose', 'Marigold', 'Tulip']

The parameter to append() can be any object; it can be a list too. When we use append() to add
another list, the current list becomes a nested list.
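A short sketch of this behavior, reusing the flowers list from the example above with two illustrative flower names:

```python
flowers = ["Rose", "Marigold"]

# the whole list is added as ONE element
flowers.append(["Tulip", "Lily"])

print(flowers)         # ['Rose', 'Marigold', ['Tulip', 'Lily']]
print(len(flowers))    # 3
print(flowers[2][0])   # Tulip
```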

Using Insert
The insert() method adds an element at the given index of the list; the elements from that
index onwards shift one position to the right.

# using insert to insert elements in a list

# current number list
num_list = [1, 2, 3, 4, 5]
# print current number list
print(f'Current Numbers List {num_list}')

num = int(input("Please enter a number to add to list:\n"))

# get number of elements in list
# (named length so that the built-in len() is not overwritten)
length = len(num_list)
index = int(input(f'Please enter the index between 0 and {length - 1}:\n'))

# insert a num at index position
num_list.insert(index, num)

# print updated list
print(f'Updated Numbers List {num_list}')

Output of above code execution is:

Current Numbers List [1, 2, 3, 4, 5]
Please enter a number to add to list:
8
Please enter the index between 0 and 4:
3
Updated Numbers List [1, 2, 3, 8, 4, 5]

Using Extend
The append() method adds a single element to the list; if we pass a list as the parameter to
append(), the new list is added as a single element. The extend() method is useful to append the
elements of an iterable to the end of the list, each element separately.

#Create an empty list
list_num = []
list_num.extend([1, 2]) # extending list elements
print(list_num)
list_num.extend((3, 4)) # extending tuple elements
print(list_num)
list_num.extend("ABC") # extending string elements
print(list_num)

Output of the above code is:

[1, 2]
[1, 2, 3, 4]
[1, 2, 3, 4, 'A', 'B', 'C']

Using List Concatenation

If you have to concatenate multiple lists, you can use the “+” operator. This will create a new list
and the original lists will remain unchanged. Consider following examples where two lists are
concatenated using ‘+’ operator

# list of even numbers
evens = [2, 4, 6]
# list of odd numbers
odds = [1, 3, 5]
# use + operator to concatenate two lists
nums = odds + evens
print(nums)  # [1, 3, 5, 2, 4, 6]

The new list contains the elements of the operand lists in order, from left to right. This is
similar to string concatenation in Python.
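The difference between concatenation and extend() is which list changes. The sketch below uses the same odds and evens lists; the name combined is illustrative.

```python
evens = [2, 4, 6]
odds = [1, 3, 5]

combined = odds + evens      # builds a new list object
print(combined)              # [1, 3, 5, 2, 4, 6]
print(odds, evens)           # [1, 3, 5] [2, 4, 6]  (both unchanged)

odds.extend(evens)           # changes odds itself
print(odds)                  # [1, 3, 5, 2, 4, 6]
print(evens)                 # [2, 4, 6]
```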

2.8.5 Delete/Remove list elements


As lists are mutable, we can modify them after creation. To remove elements from a list we can
use the following methods.
1. Using del keyword
2. Using remove
3. Using pop
4. Clear method to empty list

Using del keyword

We can delete one or more items from a list using the keyword del. It can even delete the list
entirely.

# Deleting list items

# Original list
my_list = ['B', 'h', 'a', 'r', 'a', 't', 'i']

# delete one item
del my_list[2]
print(my_list)

# delete multiple items
del my_list[1:5]
print(my_list)

# delete entire list
del my_list

# Error: List not defined
print(my_list)

Output of the above code is:

['B', 'h', 'r', 'a', 't', 'i']
['B', 'i']
Traceback (most recent call last):
  File "C:/python slm/code/Unit2/list-del.py", line 19, in <module>
    print(my_list)
NameError: name 'my_list' is not defined

Using remove, pop and clear
We can use the remove() method to remove the first occurrence of a given item, or the pop()
method to remove the item at a given index.

The pop() method removes and returns the last item if the index is not provided. This helps
us use lists as stacks (a last in, first out data structure).

We can also use the clear() method to empty a list.

# Deleting list items

# Original list
my_list = ['B', 'h', 'a', 'r', 'a', 't', 'i']

# Output: ['B', 'h', 'a', 'r', 'a', 't', 'i']
print(f'mylist : {my_list}')

my_list.remove('B')

# Output: ['h', 'a', 'r', 'a', 't', 'i']
print("Mylist after removing 'B': ", my_list)

# pop element at index 1
# pops out : 'a'
print('popped : ', my_list.pop(1))

# Output: ['h', 'r', 'a', 't', 'i']
print(f'my_list : {my_list}')

# pops out last element: 'i'
print('popped', my_list.pop())

# Output: ['h', 'r', 'a', 't']
print(f'my_list : {my_list}')

my_list.clear()

# Output: []
print(my_list)

Output of the above code execution is:

mylist : ['B', 'h', 'a', 'r', 'a', 't', 'i']
Mylist after removing 'B': ['h', 'a', 'r', 'a', 't', 'i']
popped : a
my_list : ['h', 'r', 'a', 't', 'i']
popped i
my_list : ['h', 'r', 'a', 't']
[]
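Using a list as a stack needs only append() to push and pop() without an index to take the most recent item. A minimal sketch with illustrative page names:

```python
stack = []

# push three items; the last one pushed is on top
for page in ["home", "search", "results"]:
    stack.append(page)

top = stack.pop()         # removes and returns the most recent item
next_top = stack.pop()

print(top)        # results
print(next_top)   # search
print(stack)      # ['home']
```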

2.8.6 Important functions in List


The methods available on list objects in Python are given below. They are called as
list.method(). A few of them have already been studied in the sections above.

Python List Methods

▪ append() - Adds an element to the end of the list
▪ extend() - Adds all elements of an iterable (for example another list) to the list
▪ insert() - Inserts an item at the given index
▪ remove() - Removes the first occurrence of an item from the list
▪ pop() - Removes and returns the element at the given index (the last one by default)
▪ clear() - Removes all items from the list
▪ index() - Returns the index of the first matched item
▪ count() - Returns the number of times the given item occurs in the list
▪ sort() - Sorts the items of the list in ascending order
▪ reverse() - Reverses the order of items in the list
▪ copy() - Returns a shallow copy of the list
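The methods not yet demonstrated (index, count, sort, reverse and copy) can be tried on a small list. This is a sketch with made-up marks:

```python
marks = [40, 75, 60, 75, 90]

first_75 = marks.index(75)    # position of the first 75
count_75 = marks.count(75)    # how many times 75 occurs
backup = marks.copy()         # independent (shallow) copy

print(first_75)   # 1
print(count_75)   # 2

marks.sort()                  # sorts the list in place
print(marks)      # [40, 60, 75, 75, 90]

marks.reverse()               # reverses the list in place
print(marks)      # [90, 75, 75, 60, 40]

print(backup)     # [40, 75, 60, 75, 90]  (the copy is not affected)
```

Note that sort() and reverse() change the list itself and return None, so their result should not be assigned to a variable.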

2.8.7 List Comprehension


Consider an example of creating a list of the perfect squares of all the numbers from 0 to 9. We
can do it with a for loop using the following three steps.
▪ Instantiate an empty list.
▪ Loop over an iterable or range of elements.
▪ Append each element to the end of the list.

# define empty list
squares = []
# add items in the list using append
for i in range(10):
    squares.append(i * i)
# print list
print(squares)

It will create the list below.

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

List comprehensions in Python provide a short and concise way to construct a new list.
The following example creates the same list of perfect squares concisely.

squares = [n*n for n in range(10)]

The common syntax for a list comprehension is:

var_list = [output_exp for var in input_list if (var satisfies this condition)]

In the above example the condition is missing. To use a condition, change the requirement to a
list of perfect squares of odd numbers; then we can write the list comprehension as

squares = [n*n for n in range(10) if n%2 == 1]

The above comprehension creates a list of perfect squares of all odd numbers from 0 to 9, and we
get the list [1, 9, 25, 49, 81].

It should be noted that a list comprehension can also call functions as needed. Consider an
example of getting the perfect squares of all prime numbers from 0 to 9. The code for the same is
below.

# function returns square of a given number
def square(n):
    return n*n

# function checks whether number is prime or not
def isPrime(n):
    if n <= 1:
        return False

    # try every divisor from 2 up to n/2
    d = 2
    while d <= n/2:
        if n % d == 0:
            return False
        d += 1
    return True

# l2 is the list of perfect squares of prime numbers
l2 = [square(n) for n in range(10) if isPrime(n)]

print('l2 : ', l2)

The code above will produce output as below:

l2 : [4, 9, 25, 49]

2.9 Tuple
A tuple in Python is similar to a list. The difference between the two is that we cannot change the
elements of a tuple once it is assigned whereas we can change the elements of a list.

2.9.1 Creating a Tuple


A tuple is created by placing all the items (elements) inside parentheses (), separated by commas.
The parentheses are optional, however, it is a good practice to use them. Similar to list; tuple can
have any number of items and they may be of different types (integer, float, list, string, etc.).
Following example describes various methods of creating tuple.

# Different types of tuples

# Empty tuple
my_tuple = ()
print('Empty tuple: ', my_tuple)

# Tuple having integers
my_tuple = (1, 2, 3)
print('tuple of integers: ', my_tuple)

# tuple with mixed datatypes
my_tuple = (1, "Hello", 3.4)
print('tuple with mixed type: ', my_tuple)

# nested tuple
my_tuple = ("mouse", [8, 4, 6], (1, 2, 3))
print('nested tuple: ', my_tuple)

Output of the above code execution is:

Empty tuple: ()
tuple of integers: (1, 2, 3)
tuple with mixed type: (1, 'Hello', 3.4)
nested tuple: ('mouse', [8, 4, 6], (1, 2, 3))

Tuples can also be created without using parentheses, as below. This example also demonstrates
the unpacking of a tuple.

# creating tuple without parentheses
names = 'Amit', 'Biru', 'Hemant'
print("names: ", names)

# unpacking tuple
name1, name2, name3 = names

print('name1: ', name1)
print('name2: ', name2)
print('name3: ', name3)

Output after execution of above code is:

names: ('Amit', 'Biru', 'Hemant')
name1: Amit
name2: Biru
name3: Hemant

2.9.2 Accessing Tuple Elements


Tuples are similar to lists; hence all the mechanisms used to access list elements are applicable to
tuples too. We can use indexing, which includes both positive and negative indexing. Tuples also
support slicing in the same way as lists.
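The sketch below applies indexing and slicing to a tuple and then attempts an assignment, which fails because a tuple cannot be changed. The tuple point is an illustrative example; try/except is used only so the script continues after the error.

```python
point = (10, 20, 30, 40)

print(point[0])      # 10
print(point[-1])     # 40
print(point[1:3])    # (20, 30)  (a slice of a tuple is a tuple)

try:
    point[0] = 99    # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)

print(point)         # (10, 20, 30, 40)  (unchanged)
```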

2.10 Set
A set is an unordered collection of items. Every set element is unique (no duplicates) and must
be immutable (cannot be changed). However, a set itself is mutable. We can add or remove items
from it. Sets can also be used to perform mathematical set operations like union, intersection,
symmetric difference, etc. The major advantage of using a set, as opposed to a list, is that it has a
highly optimized method for checking whether a specific element is contained in the set.

2.10.1 Creating Set


A set is created by placing all the items (elements) inside curly braces {}, separated by comma,
or by using the built-in set() function. It can have any number of items and they may be of
different types (integer, float, tuple, string etc.). Attempting to create set using mutable elements
like lists, sets or dictionaries as its elements will result into TypeError: unhashable type:
'type'.

# Creating a set
set1 = {1, 2, 3}
print("set1: ", set1)

# Creating a set with set function
set2 = set([11, 12, 13])
print("set2: ", set2)

# Set of mixed type
# Here tuple is non-mutable
set3 = {1, 'two', 3.0, 'four', (5, 6, 7)}
print("set3: ", set3)

# Creating set with mutable type
# This leads to TypeError
set4 = {[1, 2, 3], 4, 5, 6}

Output of the above code execution is

set1: {1, 2, 3}
set2: {11, 12, 13}
set3: {1, 'four', 3.0, (5, 6, 7), 'two'}
Traceback (most recent call last):
File "filepath", line 16, in <module>
set4 = {[1,2,3],4,5,6}
TypeError: unhashable type: 'list'

Note that set3 contains a tuple; as a tuple is immutable, the set is created without any error. In
the case of set4 an error is raised because it contains a list, which is mutable.

We can also create a set by using a comprehension; we can get the set of all natural numbers less
than 10 as below.

s1 = {n for n in range(1, 10)}
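A set comprehension discards duplicates automatically, and the resulting set can be tested for membership with the in operator. The sketch below uses an illustrative list of words; sorted() is used for printing because the display order of a set is not defined.

```python
words = ["apple", "kiwi", "mango", "plum", "grape"]

# five words, but only two distinct lengths
lengths = {len(w) for w in words}

print(sorted(lengths))   # [4, 5]
print(5 in lengths)      # True
print(7 in lengths)      # False
```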

2.10.2 Modifying a set in Python


Sets are mutable. However, since they are unordered, indexing has no meaning. We cannot
access or change an element of a set using indexing or slicing; the set data type does not support it.

We can add a single element using the add() method, and multiple elements using the update()
method. The update() method can take tuples, lists, strings or other sets as its arguments. In all
cases duplicates are avoided. The arguments themselves may be mutable (for example lists), but
each element added to the set must still be immutable.

# Creating a set
set1 = {1, 2, 3}
print("set1: ", set1)

# Add new element
set1.add(4)
print("set1: ", set1)

# Update set with the elements of a single list
set1.update([5, 6])
print("set1: ", set1)

# Update set with the elements of several iterables
set1.update([1, 6], (6, 7, 8))
print("set1: ", set1)

The output of the execution is given below. From this it is clear that the update() method can take
multiple parameters, which may contain elements already present in the set. Also note that
duplicate elements are ignored.

set1: {1, 2, 3}
set1: {1, 2, 3, 4}
set1: {1, 2, 3, 4, 5, 6}
set1: {1, 2, 3, 4, 5, 6, 7, 8}

2.10.3 Removing elements from a set


A particular item can be removed from a set using the methods discard() and remove(). The only
difference between the two is that the discard() function leaves a set unchanged if the element is
not present in the set. On the other hand, the remove() function will raise an error in such a
condition (if element is not present in the set). The following example will illustrate this.

# Creating a set
set1 = {1, 2, 3, 4, 5}
print("set1: ", set1)

# Remove 4 from set
set1.remove(4)
print("set1: ", set1)

# Another way to remove element from set
set1.discard(5)
print("set1: ", set1)

# Try to discard element not present in the set
# This leaves the set unchanged
set1.discard(6)

# Try to remove element not present in the set
# This leads to KeyError
set1.remove(6)

Output of the above code execution is:

set1: {1, 2, 3, 4, 5}
set1: {1, 2, 3, 5}
set1: {1, 2, 3}
Traceback (most recent call last):
File "filepath", line 18, in <module>
set1.remove(6)
KeyError: 6

We can also use the pop() method to remove an element from a set. Because a set is unordered,
there is no "last" element: pop() removes and returns an arbitrary element, and it is not possible
to predict which one. We can also remove all the items from a set using the clear() method.
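A short sketch of pop() and clear() on a set of illustrative colour names. Because the element returned by pop() is arbitrary, the example only checks that it came from the original set.

```python
colours = {"red", "green", "blue"}

removed = colours.pop()          # some element; which one is not predictable
print(removed in ("red", "green", "blue"))   # True
print(len(colours))              # 2

colours.clear()                  # remove everything that is left
print(colours)                   # set()

try:
    colours.pop()                # nothing left to remove
except KeyError as err:
    print("KeyError:", err)
```

Calling pop() on an empty set raises KeyError, as the last part of the sketch shows.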

2.10.4 Set operations


Sets can be used to carry out mathematical set operations like union, intersection, difference and
symmetric difference. We can do this with operators or methods.

Operation               Operator    Method
Union                   |           union()
Intersection            &           intersection()
Difference              -           difference()
Symmetric difference    ^           symmetric_difference()

The following Python code demonstrates these operations.

# define two sets
SetA = {1, 2, 3, 4, 5}
SetB = {2, 4, 6, 8}

print('SetA: ', SetA)
print('SetB: ', SetB)

# Perform union
print('Union using operator(|) :', (SetA | SetB))
print('Union using method :', SetA.union(SetB))

print()  # used for new line

# Perform intersection
print('Intersection using operator(&) :', (SetA & SetB))
print('Intersection using method :', SetA.intersection(SetB))

print()  # used for new line

# Perform difference
print('Difference using operator(-) :', (SetA - SetB))
print('Difference using method :', SetA.difference(SetB))

print()  # used for new line

# Perform symmetric difference
print('Symmetric Difference using operator(^) :', (SetA ^ SetB))
print('Symmetric difference using method :',
      SetA.symmetric_difference(SetB))

Output of the above code is:

SetA: {1, 2, 3, 4, 5}
SetB: {8, 2, 4, 6}
Union using operator(|) : {1, 2, 3, 4, 5, 6, 8}
Union using method : {1, 2, 3, 4, 5, 6, 8}

Intersection using operator(&) : {2, 4}
Intersection using method : {2, 4}

Difference using operator(-) : {1, 3, 5}
Difference using method : {1, 3, 5}

Symmetric Difference using operator(^) : {1, 3, 5, 6, 8}
Symmetric difference using method : {1, 3, 5, 6, 8}

2.11 Dictionary
A dictionary in Python is a collection of data values used to store data like a map. Unlike other
data types, which hold only a single value as an element, a dictionary holds a key:value pair for
each element. Elements are accessed by key rather than by position (since Python 3.7 a
dictionary also preserves the order in which keys were inserted). Keys are what make looking up
a value efficient.

2.11.1 Creating a Dictionary

In Python, a dictionary can be created by placing a sequence of elements within curly braces {},
separated by commas. Each element is a pair written as key:value, one part being the key and the
other its corresponding value. Values in a dictionary can be of any data type and can be
duplicated, whereas keys can't be repeated and must be immutable. Note that dictionary keys are
case sensitive: keys with the same name but different cases are treated as distinct. The syntax for
creating a dictionary is as follows.

d = {
<key>: <value>,
<key>: <value>,
.
.
.
<key>: <value>
}

The following example defines dictionaries in various ways, where we map integers to their word
equivalents.

# define a dictionary d1
d1 = {
    1 : 'One',
    2 : 'Two',
    3 : 'Three',
    4 : 'Four'
}

# print dictionary
print('d1: ', d1)

# create empty dictionary
d2 = {}
print('d2: ', d2)

# one more way to create dictionary
d3 = dict()
print('d3: ', d3)

# add new value into dictionary
d1[5] = 'Five'
d2[1] = "First"
print('d1: ', d1)
print('d2: ', d2)

Output of the above code execution is:

d1: {1: 'One', 2: 'Two', 3: 'Three', 4: 'Four'}
d2: {}
d3: {}
d1: {1: 'One', 2: 'Two', 3: 'Three', 4: 'Four', 5: 'Five'}
d2: {1: 'First'}
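The rules for keys stated above (case sensitive, not repeatable, immutable) can be checked with a short sketch. The dictionaries used here are illustrative; try/except is used only so the script continues after the error.

```python
# keys are case sensitive: these are two different keys
stock = {"Apple": 10, "apple": 25}
print(len(stock))        # 2

# dict() can also build a dictionary from (key, value) pairs
pairs = dict([("x", 1), ("y", 2)])
print(pairs)             # {'x': 1, 'y': 2}

# a repeated key keeps only the last value given for it
repeated = {"a": 1, "a": 2}
print(repeated)          # {'a': 2}

# a mutable object such as a list cannot be used as a key
try:
    bad = {[1, 2]: "list as key"}
except TypeError as err:
    print("TypeError:", err)
```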

2.11.2 Accessing Dictionary Elements


We can access the elements of a dictionary by using the key as an index with the [] operator, or by
passing the key as a parameter to the get() method of the dictionary object. If we use [] with a key
that does not exist, a KeyError is raised. With the get() method, if the key is not found it returns
None.

# Create dictionary of nation capitals
capitals = {
    'India': 'New Delhi',
    'USA': 'Washington',
    'Srilanka': 'Colombo'
}

# Output: New Delhi
print("capitals['India']: ", capitals['India'])

# Output: Washington
print("capitals.get('USA'):", capitals.get('USA'))

# Output: None
print("capitals.get('UK'):", capitals.get('UK'))

# Trying to access keys which don't exist throws error
# KeyError
print("capitals['UK']:", capitals['UK'])

Output of the above code execution is:

capitals['India']: New Delhi
capitals.get('USA'): Washington
capitals.get('UK'): None
Traceback (most recent call last):
  File "filepath", line 19, in <module>
    print("capitals['UK']:", capitals['UK'])
KeyError: 'UK'
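The get() method also accepts an optional second argument that is returned instead of None when the key is missing, and the in operator can test for a key before it is used. A sketch using the same capitals dictionary; the default text 'Unknown' is an illustrative choice.

```python
capitals = {
    'India': 'New Delhi',
    'USA': 'Washington',
    'Srilanka': 'Colombo'
}

# second argument of get() is the value to return for a missing key
uk_capital = capitals.get('UK', 'Unknown')
usa_capital = capitals.get('USA', 'Unknown')
print(uk_capital)     # Unknown
print(usa_capital)    # Washington

# membership test looks at the keys
has_uk = 'UK' in capitals
print(has_uk)         # False
```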

2.11.3 Changing and Removing Dictionary Elements

Dictionaries are mutable. We can add new items or change the value of existing items using the
assignment operator. To add an item we can use the syntax below.

dict_name[key] = value

If the key is already present, the existing value gets updated. If the key is not present, a
new (key: value) pair is added to the dictionary. To add a capital with the key 'UK' we can use

# Adding a new value


capitals['UK'] ='London'

We can remove items from a dictionary by using the methods below.

▪ pop(): This method removes the item with the provided key and returns its value.
▪ popitem(): This method removes and returns a (key, value) pair from the dictionary. Since
Python 3.7 it is the most recently inserted pair; in earlier versions the pair was arbitrary.
▪ clear(): This method removes all the items from the dictionary at once.
▪ del keyword: It is used to remove individual items or the entire dictionary itself.

# Create dictionary of nation capitals
capitals = {
    'India': 'New Delhi',
    'USA': 'Washington',
    'Srilanka': 'Colombo'
}

# Print all capitals
print("capitals: ", capitals)

# removing using pop
retVal = capitals.pop('USA')
print("pop('USA') returned: ", retVal)

# Print all capitals
print("capitals: ", capitals)

# Add new items
capitals['UK'] = 'London'
capitals['Japan'] = 'Tokyo'

# Using popitem()
retVal = capitals.popitem()
print("popitem() returned: ", retVal)

# Print all capitals
print("capitals: ", capitals)

# Using del keyword
del capitals['UK']

# Print all capitals
print("capitals: ", capitals)
del capitals['UK']  # this will result into KeyError

Output of the above code execution is:

capitals: {'India': 'New Delhi', 'USA': 'Washington', 'Srilanka': 'Colombo'}
pop('USA') returned: Washington
capitals: {'India': 'New Delhi', 'Srilanka': 'Colombo'}
popitem() returned: ('Japan', 'Tokyo')
capitals: {'India': 'New Delhi', 'Srilanka': 'Colombo', 'UK': 'London'}
capitals: {'India': 'New Delhi', 'Srilanka': 'Colombo'}
Traceback (most recent call last):
File "filepath", line 34, in <module>
del capitals['UK']
KeyError: 'UK'

Trying to del an item which does not exist raises KeyError, as observed in the above output. If we
use the pop() method in place of the last statement, with the key 'UK' as capitals.pop('UK'), it
raises the same error.

Traceback (most recent call last):
  File "filepath ", line 34, in <module>
    capitals.pop('UK')
KeyError: 'UK'

Using popitem() on an empty dictionary also raises KeyError: 'popitem(): dictionary is empty'
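The KeyError from pop() can be avoided by passing a second argument, which pop() returns when the key is missing. The sketch below shows this on a small illustrative dictionary and also triggers the popitem() error on an empty dictionary; try/except is used only so the script continues after the error.

```python
capitals = {'India': 'New Delhi'}

# with a default value, pop() does not raise KeyError for a missing key
missing = capitals.pop('UK', None)
print(missing)      # None

removed = capitals.pop('India', None)
print(removed)      # New Delhi
print(capitals)     # {}

try:
    capitals.popitem()       # the dictionary is empty now
except KeyError as err:
    print("KeyError:", err)
```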

2.11.4 Python Dictionary Comprehension


Like list comprehensions, Python allows dictionary comprehensions. We can create dictionaries
using simple expressions. A dictionary comprehension takes the form

{key: value for (key, value) in iterable}

Here key and value are two separate values taken from each item of the iterable. Consider the
following example, where we create a dictionary from two lists.

# Python code to demonstrate dictionary comprehension

# Lists to represent keys and values
keys = ['a', 'b', 'c', 'd', 'e']
values = [1, 2, 3, 4, 5]

# this line shows the dict comprehension
myDict = {k: v for (k, v) in zip(keys, values)}

# We can use below too
# myDict = dict(zip(keys, values))

print(myDict)

The above code prints the dictionary below.

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

Consider one more example: creating a dictionary with prime numbers as keys and the squares of
those numbers as values. Only prime numbers are used as keys. We use the isPrime() function
written in the list comprehension example.

# create a list using comprehension
list1 = [n for n in range(1, 9)]
# print list items
print("list1: ", list1)

# function checks whether number is prime or not
def isPrime(n):
    if n <= 1:
        return False
    d = 2
    while d <= n/2:
        if n % d == 0:
            return False
        d += 1
    return True

# create a dictionary using comprehension
dict1 = {k: k*k for k in list1 if isPrime(k)}

# print the dictionary
print(f'dict1 : {dict1}')

We will get the output of the above code as:

list1: [1, 2, 3, 4, 5, 6, 7, 8]
dict1 : {2: 4, 3: 9, 5: 25, 7: 49}

2.11.5 Python Dictionary Methods


Python supports various methods to deal with dictionaries. Some of them are already used in the
above examples. The built-in methods that you can use on dictionaries are listed below.
▪ clear() Removes all the elements from the dictionary
▪ copy() Returns a shallow copy of the dictionary
▪ fromkeys() Returns a new dictionary with the specified keys, all set to the specified value
▪ get() Returns the value of the specified key
▪ items() Returns a view object containing a (key, value) tuple for each pair
▪ keys() Returns a view object containing the dictionary's keys
▪ pop() Removes the element with the specified key and returns its value
▪ popitem() Removes and returns the last inserted key-value pair
▪ setdefault() Returns the value of the specified key. If the key does not exist: insert the
key, with the specified value
▪ update() Updates the dictionary with the specified key-value pairs
▪ values() Returns a view object containing all the values in the dictionary
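The following sketch exercises several of these methods on an illustrative dictionary of prices. The view objects returned by keys(), values() and items() are converted with list() for printing; the order shown relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
prices = {"pen": 10, "book": 50}

print(list(prices.keys()))      # ['pen', 'book']
print(list(prices.values()))    # [10, 50]
print(list(prices.items()))     # [('pen', 10), ('book', 50)]

# setdefault() inserts only when the key is missing
bag_price = prices.setdefault("bag", 300)   # key added with value 300
pen_price = prices.setdefault("pen", 99)    # key exists, value stays 10
print(bag_price, pen_price)     # 300 10

# update() changes existing keys and adds new ones
prices.update({"book": 55, "ink": 20})
print(prices)   # {'pen': 10, 'book': 55, 'bag': 300, 'ink': 20}

# fromkeys() builds a new dictionary with one value for every key
blank = dict.fromkeys(["a", "b"], 0)
print(blank)    # {'a': 0, 'b': 0}
```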

2.12 Review Questions


1. What is the need for functions? Explain the syntax of defining a function in Python with an example.
2. What do you know about the scope of a variable? Describe the various scopes with examples.
3. Explain the concept of default parameters in function arguments.
4. Illustrate how Python supports a variable number of arguments with a suitable example.
5. What is recursion? Comment on its use with an example in Python.
6. Describe lists in Python. List and describe the various operations that can be carried out on them.
7. What do you know about tuples? Comment on when to use them.
8. List and describe the various methods used with the list type.
9. Explain the set data structure in Python.
10. Illustrate the various set operations that can be carried out with the set type in Python.
11. Describe the dictionary data type in Python.
12. Comment on the usage of the dictionary data type in Python.
13. What do you know about list comprehension in Python?
14. Can we use comprehensions with sets and dictionaries? Illustrate your answer with examples.

3 Handling Exceptions and File Input/Output

Objectives
After completing this chapter, students will be able to
- Understand exceptions and the basic mechanism to handle them
- Make use of ‘try’, ‘except’ and ‘else’ to handle exceptions
- Handle generic and specific exceptions
- Define and raise exceptions
- Use file objects
- Read and write data into files
- Handle CSV files with file objects
- Make use of exception handling while performing input/output

3.1 Introduction
Programs are written to perform specific tasks. While a program executes, it may stop because
of errors. Two types of errors occur in programs.
1. Syntax errors
2. Runtime errors
Syntax errors are caused by statements that violate the rules of the language. In compiled
languages such as Java and C++, these errors are identified by the compiler while it parses the
program. Python also parses a complete source file before running it, so a syntax error anywhere
in the file is reported before any statement of that file is executed. Such errors in the
program's own source cannot be handled during execution; they must be corrected in the code.
Runtime errors arise during execution and are caused by logical mistakes or by exceptional
conditions related to an object or the data it holds. To make a program execute smoothly we need
to handle these exceptions. This unit explains what an exception is, the types of exceptions,
and the various mechanisms to handle them.

Input/output is an essential part of any program: data is read from sources such as the
keyboard, and output is written to destinations such as the console window. Input need not come
from the keyboard, and output need not be printed to the console; files can also be used to read
and store data. This unit also discusses file objects and how to use them to read data from and
write data into files.

3.2 Need of exception Handling


An exception that occurs during the execution of a program disrupts the normal flow of the
program's instructions. An exception can be treated as an event that is raised when a script
encounters a situation it cannot cope with in the normal way. In Python, an exception is an
object that represents an error. When a Python script raises an exception, it must handle the
exception; otherwise it terminates and quits. Many kinds of errors may occur; some that we have
already encountered are:
▪ NameError
▪ TypeError
▪ IndexError
▪ KeyError
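
As a small illustration of the four errors listed above, the sketch below triggers each of them on purpose. The function names are hypothetical, and the try-except construct used to keep the program running is explained in the next section.

```python
# Each small function deliberately triggers one of the errors listed above
def cause_name_error():
    return undefined_variable      # this name is never assigned

def cause_type_error():
    return 'age: ' + 25            # str and int cannot be concatenated

def cause_index_error():
    return [10, 20, 30][5]         # index 5 is out of range

def cause_key_error():
    return {'a': 1}['b']           # key 'b' is not present

raised = []
for func in (cause_name_error, cause_type_error,
             cause_index_error, cause_key_error):
    try:
        func()
    except Exception as e:
        # record and show the class name of the exception object
        raised.append(type(e).__name__)
        print(type(e).__name__, ':', e)
```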

3.3 Handling Exception


Any code that may cause an exception needs to be written safely. In Python, try-except blocks
provide the mechanism to catch and handle exceptions. The try block contains the code that may
cause an exception, and the except block contains the code that handles an exception raised in
the try block. The code in the try block is executed in the normal flow; the except block is
executed only if an exception occurs. The basic syntax to handle a generic exception is as below.

try:
    statements that may cause an exception
except:
    statements to handle the exception raised in the try block
    this block handles any type of exception

The critical operation that can raise an exception is placed inside the try clause. The code that
handles the exception is written in the except clause. Consider the following example:

print("Begin")
print('x = ',x)
print('End')

If we execute above code; we will get output as below.

Begin
Traceback (most recent call last):
File "filepath ", line 2, in <module>
print('x = ',x)
NameError: name 'x' is not defined

Here line 1 is executed, a NameError occurs at line 2, and line 3 is never executed. We are
trying to print the value of x, which is not defined, and this causes the NameError. Once the
error occurs, the lines after it are not executed. If we make use of try and except, line 3
executes every time, whether an exception occurs or not. Observe the following code written
using try-except.

# Simple example to handle exception
print("Begin")
try:
    print('x = ', x)
except:
    # Here the actual exception type is not specified;
    # a generic message is used
    print('Variable x is not defined')
print('End')

The output of above code execution is:

Begin
Variable x is not defined
End

From the output it is clear that try-except lets us handle the exception: the program executes
all the remaining lines and the user is informed about the exception.

3.4 Handling Generic Exceptions


Consider the following code, in which we read two numbers and divide the first by the second.
The values are read from the keyboard using the input() function, which returns a string that
must be converted into an integer. The int(value_to_convert) function performs this conversion;
if the argument cannot be converted into an integer, it raises a ValueError. The arithmetic
expression num1/num2 raises a ZeroDivisionError if num2 is equal to zero.

# Program may cause exceptions
# The following two lines may cause ValueError.
num1 = int(input('Enter First number: '))
num2 = int(input('Enter Second number: '))

# This may cause ZeroDivisionError
result = num1/num2
print(f'{num1}/{num2} = {result}')

To understand this, consider the following two runs and the exceptions they produce.

The first run fails while converting a12 into an integer.

Enter First number: 12
Enter Second number: a12
Traceback (most recent call last):
  File "filepath", line 4, in <module>
    num2 = int(input('Enter Second number: '))
ValueError: invalid literal for int() with base 10: 'a12'

When we run the same program with num2 = 0, we get the error below.

Enter First number: 23
Enter Second number: 0
Traceback (most recent call last):
  File "filepath", line 7, in <module>
    result = num1/num2
ZeroDivisionError: division by zero

Above code can be written using exception handling mechanism as below.

# Program with exception handling
# Statements in try block may cause exceptions
try:
    # The following two lines may cause ValueError.
    num1 = int(input('Enter First number: '))
    num2 = int(input('Enter Second number: '))

    # This may cause ZeroDivisionError
    result = num1/num2

    # print result
    print(f'{num1}/{num2} = {result}')

# Handles any type of exception
except:
    # Handle exception here
    print("Some Exception is Occurred")

The above code uses the exception handling mechanism. Observe that it has only one except block
to handle both exceptions. Using except in this way is called generic exception handling,
because it handles exceptions of any type. The above code can also be written as below.
# Program with exception handling
# Statements in try block may cause exceptions
try:
    # The following two lines may cause ValueError
    num1 = int(input('Enter First number: '))
    num2 = int(input('Enter Second number: '))

    # This may cause ZeroDivisionError
    result = num1/num2

    # print result
    print(f'{num1}/{num2} = {result}')

# Handles any type of exception
except Exception as e:
    # Handle exception here
    # Either of the statements below can be used to display the error
    print("Exception ",e, " Occurred")
    # It is better to use this one as it specifies the error type
    print("Oops!", e.__class__, "occurred.")

The output of one run of the above code that causes a ValueError is:

Enter First number: 12
Enter Second number: a
Exception invalid literal for int() with base 10: 'a' Occurred
Oops! <class 'ValueError'> occurred.

3.5 Handling Specific Exceptions


Generic exception handling lets the programmer write less code, but it is not a good way to
handle exceptions. It is always better to handle each specific exception and provide details of
that exception in its own block.

In the above example, ValueError occurs when the user enters a value that cannot be converted
into an integer, and ZeroDivisionError occurs when the divisor is zero. If we handle these two
exceptions separately, each with a self-explanatory message, it is easier for the user to
understand what went wrong. The syntax to handle specific exceptions is as below.
try:
    statements that may cause exceptions
    ......................
except ExceptionI:
    If there is ExceptionI, then execute this block.
except ExceptionII:
    If there is ExceptionII, then execute this block.

The example used to demonstrate generic exception handling can be rewritten using specific
exception handling as below.

# Program with exception handling
# Statements in try block may cause exceptions
try:
    # The following two lines may cause ValueError.
    num1 = int(input('Enter First number: '))
    num2 = int(input('Enter second number: '))

    # This may cause ZeroDivisionError
    result = num1/num2

    # print result
    print(f'{num1}/{num2} = {result}')

except ValueError as e:
    # Handle ValueError here
    print("Failed to convert input into integer ")

except ZeroDivisionError as e:
    # Handle ZeroDivisionError here
    print("Divisor(num2) should not be zero")

If we run the above code and a wrong input causes a ValueError, the except block for ValueError
is executed; if num2 = 0, the except block for ZeroDivisionError is executed. An except block
for a specific exception handles only that exception, which helps the user understand exactly
what went wrong. An except block may also contain additional code beyond what is shown in the
example above.
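
The same idea can be sketched without keyboard input, which makes it easier to try several cases at once. The helper name divide_text is hypothetical and not part of the original example; it receives the two numbers as strings, just as input() would return them.

```python
def divide_text(first, second):
    """Hypothetical helper: divide two numbers supplied as strings."""
    try:
        result = int(first) / int(second)
    except ValueError:
        # raised by int() when the text is not a whole number
        return 'Failed to convert input into integer'
    except ZeroDivisionError:
        # raised by the division when the second number is zero
        return 'Divisor should not be zero'
    return f'{first}/{second} = {result}'

for pair in [('12', '4'), ('12', 'a12'), ('23', '0')]:
    print(divide_text(*pair))
```

Each call ends in exactly one of the three return statements, depending on which exception, if any, was raised.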

3.6 Using else clause


While handling exceptions using try-except we can add an else clause as below. It separates the
code that should be executed only if no exception occurs. The following syntax shows where else
fits in try-except.

try:
    statements that may cause an exception
except:
    statements to handle the exception raised in the try block
    this block handles any type of exception
else:
    If there is no exception, then execute this block

In the earlier code the result is printed inside the try block when everything goes smoothly.
The code can be modified to use an else clause as below.

# Program with exception handling with else clause
# Statements in try block may cause exceptions
try:
    # The following two lines may cause ValueError.
    num1 = int(input('Enter First number: '))
    num2 = int(input('Enter Second number: '))

    # This may cause ZeroDivisionError
    result = num1/num2

except ValueError as e:
    # Handle ValueError here
    print("Failed to convert input into integer ")

except ZeroDivisionError as e:
    # Handle ZeroDivisionError here
    print("Divisor(num2) should not be zero")

else:
    # print result
    print(f'{num1}/{num2} = {result}')

If we execute the above code with valid input, no exception occurs and the code written in the
else clause executes. An example run is as below.

Enter First number: 45
Enter Second number: 11
45/11 = 4.090909090909091

3.7 Handling Multiple Exceptions


We can use one except statement to handle multiple exceptions and the syntax for the same is as
follows.

try:
    You do your operations here;
    ......................
except (Exception1[, Exception2[, ...ExceptionN]]):
    If there is any exception from the given exception list,
    then execute this block.
    ......................
else:
    If there is no exception, then execute this block.

From the syntax it is clear that the except block executes if the code in the try block raises
any one of the exceptions listed in the except clause. Our earlier example can be written with a
single except block for multiple exceptions as below. Remember that it is usually better to
handle each exception in its own except clause.

# Program with exception handling
# Statements in try block may cause exceptions
# Single except is used to catch multiple exceptions
try:
    # The following two lines may cause ValueError.
    num1 = int(input('Enter First number: '))
    num2 = int(input('Enter Second number: '))

    # This may cause ZeroDivisionError
    result = num1/num2

except (ValueError, ZeroDivisionError) as e:
    print("Error: ", e)
else:
    # print result
    print(f'{num1}/{num2} = {result}')

3.8 Raising Exception


In Python programming, exceptions are raised when errors occur at runtime. We can also
manually raise exceptions using the raise keyword. We can optionally pass values to the
exception to clarify why that exception was raised. The general syntax for the raise statement is
as follows.

raise [ExceptionClass[(args)] [from cause]]
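
A minimal sketch of the raise statement in Python 3 follows. The function name parse_percentage and the limits 0 to 100 are assumptions made for illustration. The sketch shows a message being passed to the exception, and the optional from clause, which records the original exception as the cause of the new one.

```python
def parse_percentage(text):
    """Hypothetical helper: convert text to an int between 0 and 100."""
    try:
        value = int(text)
    except ValueError as err:
        # 'from err' keeps the original exception as the cause
        raise ValueError(f'not a whole number: {text!r}') from err
    if not 0 <= value <= 100:
        # the message passed here explains why the exception was raised
        raise ValueError(f'{value} is outside 0..100')
    return value

messages = []
for sample in ['42', '150', 'abc']:
    try:
        print(parse_percentage(sample))
    except ValueError as e:
        messages.append(str(e))
        print('ValueError:', e)
```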

We need to understand why an exception may have to be raised and how to do it. Consider an
example that calculates the average age of 5 persons. One solution is:

# Program to calculate average age of 5 persons
totalAge = 0
personCount = 0
age = 0
print("Enter age values for 5 persons between 18 to 99 ")
while personCount < 5:
    try:
        age = int(input("Enter age for "+str(personCount+1)+" Person : "))

    except ValueError as e:
        print(e)

    else:
        personCount += 1
        totalAge += age

avgAge = totalAge/5
print("Total age = ", totalAge)
print("Average age = ", avgAge)

One execution run with all valid ages is as below.

Enter age values for 5 persons between 18 to 99
Enter age for 1 Person : 22
Enter age for 2 Person : 44
Enter age for 3 Person : 55
Enter age for 4 Person : 66
Enter age for 5 Person : 77
Total age = 264
Average age = 52.8

Here the only possible exception is ValueError, which is caused when a string cannot be
converted into an integer. Now suppose we add the constraint that the age of a person must lie
between 18 and 99. Input outside this range is invalid and needs to be handled. To do this we
can raise an exception and handle it as below.
# Program to calculate average age of 5 persons
totalAge = 0
personCount = 0
age = 0
print("Enter age values for 5 persons between 18 to 99 ")
while personCount < 5:
    try:
        age = int(input("Enter age for "+str(personCount+1)+" Person : "))

        # check value of age
        if age < 18 or age > 99:
            # it is invalid age
            # raise exception here
            raise Exception('Invalid Age ' + str(age) +
                            '; expected value is 18<=age<=99')

    # Handle exception occurred due to type conversion string to integer
    except ValueError as e:
        print("Value Error: ", e)

    # Handle exception raised programmatically
    except Exception as e:
        print("AgeError:", e)

    # This block executes if no exception occurs
    else:
        personCount += 1
        totalAge += age

avgAge = totalAge/5

print()
print("Total age = ", totalAge)
print("Average age = ", avgAge)

If we run the above code and any input is invalid, an exception is raised programmatically.
Once any exception, either built-in or user-defined, is raised, the code in the try block stops
executing and Python looks for a matching except block; when a match is found, the code in that
block is executed. If there is no match, the program terminates. Here we have written an except
block of type Exception, which handles all exceptions not handled before it. It is recommended
to list the except blocks from most specific to least specific. If the first except block is
the more generic (least specific) one, all exceptions are caught by that block and the specific
handlers are never reached. One run of the above code with invalid input is as below.

Enter age values for 5 persons between 18 to 99
Enter age for 1 Person : 32
Enter age for 2 Person : as
Value Error: invalid literal for int() with base 10: 'as'
Enter age for 2 Person : 22
Enter age for 3 Person : 24
Enter age for 4 Person : 13
AgeError: Invalid Age 13; expected value is 18<=age<=99
Enter age for 4 Person : 22
Enter age for 5 Person : 23

Total age = 123
Average age = 24.6
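
The effect of the order of except blocks can be seen in the following sketch. The function classify and its specific_first flag are hypothetical and exist only to place the two orderings side by side.

```python
def classify(text, specific_first):
    """Return the name of the except block that handled int(text)."""
    if specific_first:
        try:
            int(text)
        except ValueError:
            return 'ValueError block'
        except Exception:
            return 'Exception block'
    else:
        try:
            int(text)
        except Exception:      # generic clause listed first catches everything
            return 'Exception block'
        except ValueError:     # never reached for errors raised in the try block
            return 'ValueError block'
    return 'no exception'

print(classify('as', specific_first=True))
print(classify('as', specific_first=False))
```

Python accepts both orderings without complaint, so a misplaced generic block shows up only as a specific handler that silently never runs.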

3.9 User defined exceptions


In the above example we raised an exception of type Exception. We can also define new exception
types and use them to raise exceptions. We define our own exception by creating a new class
derived from the Exception class, either directly or indirectly. Although it is not mandatory,
most exception names end in “Error”, following the naming of the standard exceptions in Python.
A user-defined exception is written as below. More about creating classes is covered in the
next unit.

# class MyError is derived from super class Exception
class MyError(Exception):

    # Constructor or Initializer
    # message is a parameter passed when object is created
    def __init__(self, message):
        self.value = message

    # __str__ is used by print() to display the value
    # Here self declares that this method is used with an object
    def __str__(self):
        return repr(self.value)

A simple AgeError exception can be defined in the same way, and the program to calculate the
average age written above can be modified as:

# Define user defined exception
class AgeError(Exception):
    # constructor
    def __init__(self, message):
        super().__init__(message)  # pass the message to the base class

# Program to calculate average age of 5 persons
totalAge = 0
personCount = 0
age = 0
print("Enter age values for 5 persons between 18 to 99 ")
while personCount < 5:
    try:
        age = int(input("Enter age for "+str(personCount+1)+" Person : "))

        # check value of age
        if age < 18 or age > 99:
            # it is invalid age
            # raise exception here
            raise AgeError('Invalid Age ' +
                           str(age) + '; expected value is 18<=age<=99')

    # Handle exception occurred due to type conversion string to integer
    except ValueError as e:
        print("Value Error: ", e)

    # Handle exception raised programmatically
    except AgeError as e:
        print("AgeError:", e)

    # This block executes if no exception occurs
    else:
        personCount += 1
        totalAge += age

avgAge = totalAge/5

print("Total age = ", totalAge)
print("Average age = ", avgAge)

Observe that AgeError is used with the raise statement to raise the exception, and that the
except clause uses the AgeError name to handle it.
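
A user-defined exception can also carry extra data for the handler to inspect. The following is a minimal sketch under that assumption; the class name AgeRangeError, its attributes and the helper check_age are all hypothetical.

```python
class AgeRangeError(Exception):
    """Hypothetical exception that remembers the rejected age and the limits."""

    def __init__(self, age, low=18, high=99):
        self.age = age
        self.low = low
        self.high = high
        # the base class stores the message that print() displays
        super().__init__(
            f'Invalid Age {age}; expected value is {low}<=age<={high}')

def check_age(age):
    if age < 18 or age > 99:
        raise AgeRangeError(age)
    return age

try:
    check_age(13)
except AgeRangeError as e:
    rejected = e.age               # attribute set in the constructor
    print('AgeRangeError:', e)
```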
3.10 Using try-finally Clause
While writing programs, there may be situations in which a method raises an exception and ends
up handling it. The method may still need to perform some additional steps before it terminates,
such as closing a file or a network connection. Python provides the finally clause, used with
try-except, for such situations. The finally block always executes, either after normal
termination of the try block or after the try block terminates due to an exception.

Consider the following example, in which we accept a number from the user and compute its
reciprocal. Here the try-except block is accompanied by finally, which is always executed.

# Program to get reciprocal of given number
try:
    num = int(input('Enter a valid number: '))

    # try to get reciprocal of number 'num'
    reciprocal = 1/num

    # print reciprocal of number
    print("Reciprocal of", num, " is: ", reciprocal)

# handle exceptions in generic way
except Exception as e:
    print("In except: ", e)

# This always executes
finally:
    print("In finally")

Sample runs of the above code are provided below.

Case 1: No exception occurred

Enter a valid number: 10
Reciprocal of 10 is: 0.1
In finally

Case 2: Exception occurred

Enter a valid number: 0
In except: division by zero
In finally
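
The finally block runs even when the try or except block leaves the function with a return statement. The sketch below records the order of events in a list; the names reciprocal and events are used here for illustration only.

```python
events = []

def reciprocal(num):
    try:
        return 1 / num
    except ZeroDivisionError:
        events.append('except')
        return None
    finally:
        # runs before the function actually returns, on both paths
        events.append('finally')

print(reciprocal(10))
print(reciprocal(0))
print(events)
```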
3.11 File Object
Files are named locations on disk used to store related information. They store data
permanently in non-volatile memory (e.g. hard disk). Python provides built-in functions for
creating, writing, and reading files, so no external library needs to be imported to read or
write data through a file object.

To open a file, use the built-in function open(). It opens a file and returns a file object
that provides methods and attributes for performing various operations on the file. The syntax
of the open() function is:

file_object = open("filename", "mode")

Here,
▪ filename: the name (path) of the file to be opened.
▪ mode: a string that specifies the mode in which the file is opened.

There are various file opening modes, as summarized in the table below.

Mode   Description

r      Opens a file for reading only. This is the default mode. Raises an error if the
       file does not exist. The file pointer is placed at the beginning of the file.
r+     Opens a file for both reading and writing. The file pointer is placed at the
       beginning of the file.
w      Opens a file for writing only. Overwrites the file if the file exists. If the file
       does not exist, creates a new file for writing.
w+     Opens a file for both writing and reading. Overwrites the existing file if the
       file exists. If the file does not exist, creates a new file for reading and
       writing.
a      Opens a file for appending. The file pointer is at the end of the file if the file
       exists. If the file does not exist, it creates a new file.
a+     Opens a file for both appending and reading. The file pointer is at the end of
       the file if the file exists. If the file does not exist, it creates a new file.
x      Opens a file for exclusive creation in writing mode. If the file already exists,
       it raises FileExistsError.

In addition to the modes specified above, we can add one of two further modes (t and b),
depending on whether the file is a text or a binary file. The mode is not a compulsory
parameter; if it is not provided, the file opens in read mode. If neither t nor b is specified,
the file opens in text mode. Some examples of using the open() function are:

f = open("test.txt")         # equivalent to 'r' or 'rt'
f = open("test.txt", 'w')    # write in text mode
f = open("img.bmp", 'r+b')   # read and write in binary mode

When we open a file in text mode without specifying an encoding, a platform-dependent default
encoding is used. To use a specific encoding, pass it through the encoding keyword parameter of
the open() function.

f = open("test.txt", mode='r', encoding='utf-8')
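
The behaviour of some of these modes can be checked with the following sketch. It works inside a temporary directory created with the standard tempfile module, so no real file is touched; the file name demo.txt is arbitrary.

```python
import os
import tempfile

exclusive_failed = False

# Work in a temporary directory so that no real file is touched
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')

    f = open(path, 'x', encoding='utf-8')   # 'x' creates a file that must not exist yet
    f.write('first\n')
    f.close()

    try:
        open(path, 'x')                     # second attempt fails: file already exists
    except FileExistsError:
        exclusive_failed = True

    f = open(path, 'a', encoding='utf-8')   # 'a' keeps old data and writes at the end
    f.write('second\n')
    f.close()

    f = open(path, 'r', encoding='utf-8')
    after_append = f.read()
    f.close()

    f = open(path, 'w', encoding='utf-8')   # 'w' truncates the file on opening
    f.close()

    f = open(path, 'rb')                    # binary mode returns bytes
    after_truncate = f.read()
    f.close()

print(repr(after_append), after_truncate, exclusive_failed)
```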

3.12 Using Exception handling with file operations


When we open a file in read mode, the file may or may not exist. If we open a file in write or
append mode and there is no permission to write into the file or to create a new file, the
open() function fails and raises an exception. Consider the following attempt to open a
non-existent file in the interactive shell.

>>> open("test.txt")
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
open("test.txt")
FileNotFoundError: [Errno 2] No such file or directory: 'test.txt'
>>>
It is better to use the exception handling mechanism, with try-except, to handle this.

# Program to attempt to open file
filename = "test.txt"
try:
    file1 = open(filename, "r")
    print(filename, ' is opened successfully')
    file1.close()
except Exception as e:
    print(e)

If the file does not exist, the open() function raises an exception and the except block
executes with the following output.

[Errno 2] No such file or directory: 'test.txt'

You can change the declaration of the except block to handle a specific exception as below.

# Program to attempt to open file
filename = "test.txt"
try:
    file1 = open(filename, "r")
    print(filename, ' is opened successfully')
    file1.close()
except FileNotFoundError as e:
    print("There is no file named", filename)

3.13 Reading File contents


To read a file we have to open it in read mode and use any of the following methods of the file
object to read data.
▪ read(size) : reads at most size characters (bytes in binary mode) from the file; if the size
parameter is not specified, it reads the full contents of the file.
▪ readline() : reads an individual line of the file, from the current position up to and
including the newline character.
▪ readlines() : reads all the remaining lines and returns them as a list.
At the end of file (EOF), read() and readline() return an empty string and readlines() returns
an empty list.
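
The three reading methods described above can be compared on a small file. The sketch below first creates a three-line file in a temporary directory (the contents are made up) and then reads it back piece by piece.

```python
import os
import tempfile

# Create a small three-line file to read back (contents are made up)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'sample.txt')
    f = open(path, 'w', encoding='utf-8')
    f.write('alpha\nbeta\ngamma\n')
    f.close()

    f = open(path, 'r', encoding='utf-8')
    first_three = f.read(3)       # 'alp'
    rest_of_line = f.readline()   # continues from the current position: 'ha\n'
    remaining = f.readlines()     # ['beta\n', 'gamma\n']
    # at end of file the methods return empty values
    at_eof = (f.read(), f.readline(), f.readlines())
    f.close()

print(repr(first_three), repr(rest_of_line), remaining, at_eof)
```

Each method continues from the position where the previous one stopped, which is why readline() returns only the rest of the first line.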

When we do file operation then it is expected to be done in the following order.


▪ Open a file
▪ Read or write - Performing operation
▪ Close the file

Consider the following example of reading data from the file ‘test.txt’, which has the following contents.

This is file 'text.txt'.


The data is in text and has multiple lines.
The open function reads data from file and
can be used with different ways.
following three methods are used
1.read(size) : reads size number of data from file.
2.readline() : reads individual lines of a file.
3.readlines() : read all lines and returns list of them.
File contents of file test.txt

The following code attempts to open the above file and read its contents. try and except are
used to handle the case in which the file does not exist. Note that read() is called without a
parameter to read the whole data at once, and that the file is closed with the close() method.
# Program to read data from file
filename = "test.txt"
file1 = None
try:
    file1 = open(filename, "r")
    print(filename, ' is opened successfully')
    # reads whole data at a time
    data = file1.read()

    # print data
    print(data)

except FileNotFoundError as e:
    print("There is no file named", filename)

finally:
    # close the file only if it was opened
    if file1 is not None:
        file1.close()

The output of the above code is the full content of the ‘test.txt’ file, exactly as stored. If
we read the file line by line using the readline() method, the above code can be written as:

# Program to read data from file
filename = "test.txt"
file1 = None
try:
    file1 = open(filename, "r")  # open a file

    # read one line at a time
    while True:
        line = file1.readline()  # returns a line

        # at end of file readline() returns an empty string
        if line == "":
            break
        print(line, end="")

# if exception occurs then this block executes
except FileNotFoundError as e:
    print("There is no file named", filename)

# finally block always executes
finally:
    # close the file only if it was opened
    if file1 is not None:
        file1.close()
The above program uses the readline() method to read the contents. Note that the print()
function in the while block is called with end="". This parameter specifies the end character
used by print(); its default value is ‘\n’. Each line read from the file already ends with a
newline, so the default would print an extra blank line after every line.

If we use the readlines() method, it returns a list of all the lines. This list can be iterated
to access the values in it. Example code to read all lines at once and iterate over the list is:

linesData = file1.readlines()
# iterate over the list of lines
for line in linesData:
    print(line, end="")
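
A file object can also be iterated directly, one line at a time, without calling readline() or readlines(). The following sketch assumes a small made-up file created in a temporary directory and strips the trailing newline from each line.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:   # temporary location for the sketch
    path = os.path.join(folder, 'sample.txt')
    f = open(path, 'w', encoding='utf-8')
    f.write('one\ntwo\nthree\n')
    f.close()

    f = open(path, 'r', encoding='utf-8')
    try:
        cleaned = []
        for line in f:                 # the file object yields one line per iteration
            cleaned.append(line.rstrip('\n'))
    finally:
        f.close()

print(cleaned)
```

Iterating this way avoids loading the whole file into a list, which matters for large files.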

3.14 Writing data into file


To write into a file in Python, we need to open it in write (w), append (a) or exclusive
creation (x) mode. We need to be careful with the w mode, as it overwrites the file if it
already exists, erasing all the previous data. When existing data must not be lost, the
exclusive creation (x) mode supported by Python 3 is recommended, so that an existing file is
not accidentally truncated or overwritten.

Once file is opened; writing a string or sequence of bytes (for binary files) is done using the
write() method. This method returns the number of characters/bytes written to the file.

There are two ways to write into a file.

▪ write() : writes the string str1 to the text file; no newline is added.
Example: file_object.write(str1)
▪ writelines() : writes each string of a list to the text file; used to write multiple strings
at a single time.
Example: file_object.writelines(L) Here L=[str1, str2, str3] is a list

Consider the following example, in which we open the file ‘file2.txt’ in write mode and write
some data into it.
# Program to write string contents in file
file2 = None
try:
    file2 = open("file2.txt", 'w')
    file2.write("This is first line")
    file2.write("This is Second line")
    file2.write("This is Third line")
except Exception as e:
    print(e)
finally:
    # close the file only if it was opened
    if file2 is not None:
        file2.close()
In the above case the file is opened in ‘w’ mode: if the file does not exist a new one is
created, otherwise the existing file is opened and its contents are overwritten. A file must be
closed whenever it has been opened. Note that even though write() is called three times, the
contents are written on a single line, because ‘\n’ is not automatically added by the write()
method.

The following example code illustrates the second method, writelines(), to write data from a
list into a file.

# Program to write list contents in file
data = ['One\n', 'two\n', 'three\n', 'four\n', 'five\n']
file2 = None
try:
    # open a file
    file2 = open("file4.txt", 'w')

    # write whole list data at a time into file
    file2.writelines(data)
except Exception as e:
    print(e)
finally:
    # close the file only if it was opened
    if file2 is not None:
        file2.close()
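
Both writing methods described in this section can be checked by reading the file back. The sketch below writes to a file in a temporary directory; it shows that write() returns the number of characters written and that neither method adds a newline on its own.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:   # temporary location for the sketch
    path = os.path.join(folder, 'file2.txt')

    f = open(path, 'w', encoding='utf-8')
    count = f.write('This is first line\n')     # write() returns characters written
    f.write('This is Second line\n')
    f.writelines(['a', 'b', 'c\n'])             # writelines() adds no separators either
    f.close()

    f = open(path, 'r', encoding='utf-8')
    lines = f.readlines()
    f.close()

print(count, lines)
```

The three strings passed to writelines() end up on one line because only the last of them contains a newline character.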

3.15 Using with clause


We open a file for reading or writing with the open() function. Once a file is opened it must
be closed. An exception may occur while reading or writing, and the program may then terminate
without closing the file. One way to deal with this is the finally clause; the file handling
code used to write data in the above section is written with it as below.

# Program to use finally clause to close file
data = ['1: One\n', '2: Two\n', '3:Three\n', '4: Four\n', '5: Five\n']
file2 = None
try:
    # open a file
    file2 = open("test3.txt", 'w')

    # write whole list data at a time into file
    file2.writelines(data)

except Exception as e:
    print(e)

finally:
    # This always executes; close the file only if it was opened
    if file2 is not None:
        file2.close()
The finally clause in the above code makes sure that the file is closed every time, whether an
exception occurs or not. An alternative to the finally clause is the with statement, which
makes the code cleaner and much more readable. It simplifies the management of common resources
such as file streams. Observe in the following code how the with statement makes the code
cleaner.

# Program to illustrate use of with clause
data = ['1: One\n', '2: Two\n', '3:Three\n', '4: Four\n', '5: Five\n']

with open("file4.txt", 'w') as file2:
    # write whole list data at a time into file
    file2.writelines(data)

Notice that, unlike the first implementation, there is no need to call file2.close() when using
the with statement. The with statement itself ensures proper acquisition and release of
resources. The try-except-finally approach also handles the exceptions, whereas the with
statement only guarantees that the file is closed; it makes the code compact and much more
readable. Thus, the with statement helps avoid bugs and leaks by ensuring that a resource is
properly released once the code using it has finished executing.
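
That the with statement closes the file even when an exception interrupts the block can be verified with the closed attribute of the file object. In the sketch below the RuntimeError is raised on purpose to simulate a failure, and the file lives in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:   # temporary location for the sketch
    path = os.path.join(folder, 'file4.txt')

    try:
        with open(path, 'w', encoding='utf-8') as file2:
            file2.write('1: One\n')
            raise RuntimeError('simulated failure while writing')
    except RuntimeError as e:
        print('Error:', e)

    closed_after_error = file2.closed           # True: with closed the file

    with open(path, 'r', encoding='utf-8') as file2:
        saved = file2.read()

print(closed_after_error, repr(saved))
```

The exception itself is not suppressed by with; it still has to be handled by a surrounding try-except, as done here.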

3.16 Reading and Writing CSV files


A CSV file (Comma Separated Values file) is a type of plain text file that uses a specific
structure to arrange tabular data. Because it is a plain text file, it can contain only text
data, with data values normally separated by commas. In a CSV file the first line usually
contains the names of the columns separated by commas, and every line thereafter contains data
values separated by commas. Below is an example of a CSV file:

Student.csv
Prn, first_name, last name, email, phone
2018300212, Anil, Desai,anil@abc.orrg, 9234567891
2019300301, Beena, Desai,beena@abc.orrg, 9144556677
2020300218, Chetan, Desai,chetan@abc.orrg, 9277886655

In general, the separator character is called a delimiter, and the comma is not the only one used.
Other popular delimiters include the tab (\t), colon (:) and semi-colon (;) characters. Properly
parsing a CSV file requires us to know which delimiter is being used.

Python has the csv library, which provides functionality to both read from and write to CSV
files. It can be used to work with Excel-generated CSV files and is easily adapted to a variety
of CSV formats. The csv library contains objects and other code to read, write, and process
data from and to CSV files.

3.16.1 Reading CSV Files


Reading from a CSV file is done using a reader object. The CSV file is opened as a text file
with Python’s built-in open() function, which returns a file object. This is then passed to
the reader() function of the csv library. The general syntax for using reader() is as below.
Here csv_reader is the object returned by csv.reader().

csv_reader = csv.reader(csv_file, delimiter=',')

# program to read csv file
import csv  # library needed to import

with open('student.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0

    # reads the current line and stores it in row
    for row in csv_reader:
        if line_count == 0:
            print(f'Column names are: \n{row}')
            print('-' * 50)  # print separator
        else:
            print(row)

        # increment line counter
        line_count += 1

    print(f'Processed {line_count} lines.')

It generates output as

Column names are:
['Prn', ' first_name', ' last name', ' email', ' phone']
--------------------------------------------------
['2018300212', ' Anil', ' Desai', 'anil@abc.orrg', ' 9234567891']
['2019300301', ' Beena', ' Desai', 'beena@abc.orrg', ' 9144556677']
['2020300218', ' Chetan', ' Desai', 'chetan@abc.orrg', ' 9277886655']
Processed 4 lines.

The output makes clear that each row is a list of string tokens with the delimiter removed. The
first line of the CSV file contains the column names and hence needs to be handled separately.
Once the first line has been read, all other lines are data lines. Using indexing, we can
retrieve an individual column name or data value from a row.
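Indexing into a row can be sketched as follows. The text is an in-memory copy of the first two
lines of Student.csv, and the skipinitialspace=True option of csv.reader is used here (an
addition not in the program above) so that the blank after each comma does not end up inside
the tokens.

```python
import csv
import io

# In-memory copy of the first two lines of Student.csv from the example above.
text = ("Prn, first_name, last name, email, phone\n"
        "2018300212, Anil, Desai,anil@abc.orrg, 9234567891\n")

reader = csv.reader(io.StringIO(text), delimiter=',', skipinitialspace=True)
header = next(reader)              # first line: column names
for row in reader:                 # remaining lines: data values
    prn = row[0]                                    # access by position
    first_name = row[header.index('first_name')]    # access by column name
    print(prn, first_name)
```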

3.16.2 Writing CSV File


To write data into a file, we open it using the open() function with the filename and write mode
as parameters. The file object returned by open() is then passed to the writer() function of the
csv library to obtain a writer object. To write data, we call the writerow() method of the
writer, which writes one CSV line to the file. One example of using it is shown below.

# Program to write CSV file
import csv

with open('employee.csv', mode='w', newline='') as employee_file:
    employee_writer = csv.writer(employee_file, delimiter=',')
    # write column names
    employee_writer.writerow(['Name', 'Department', 'Job Title'])
    # write data
    employee_writer.writerow(['Nitin Singh', 'Accounting', 'Sr. Accountant'])
    employee_writer.writerow(['Nina', 'IT', 'Developer'])

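The writerow() method of the csv.writer object ends each row with its own line terminator
('\r\n' by default). If the file is opened in ordinary text mode, that terminator can be
translated again on some platforms (notably Windows), which produces an extra blank line between
rows. Passing newline='' to open() disables this translation, so each row ends with exactly one
line terminator.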
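The line terminator written by csv.writer can be inspected without touching the disk. The sketch
below uses io.StringIO as an in-memory stand-in for a file opened with newline='' (the rows are
illustrative); the captured text shows that the writer itself appends '\r\n' after every row.

```python
import csv
import io

buffer = io.StringIO(newline='')      # in-memory stand-in for open(..., newline='')
writer = csv.writer(buffer, delimiter=',')
writer.writerow(['Name', 'Department'])
writer.writerow(['Nina', 'IT'])

written = buffer.getvalue()
print(repr(written))                  # shows the terminators added by the writer
```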

Although the csv library supports CSV file handling, the pandas library is often recommended for
reading such files, because it offers powerful features for processing tabular data. A
description of the pandas library is outside the scope of this course.

3.17 Review Questions


1. What is an exception? List the various types of exceptions that can occur in Python.
2. Illustrate the exception handling mechanism in Python.
3. Comment on the use of the finally clause with an example.
4. When should an exception be raised? Describe the need with an example.
5. What do you know about user-defined exceptions?
6. What is a file? Explain the various modes of opening a file.
7. List the steps to read file contents. Also discuss the usage of the various read methods.
8. Describe how you would write contents to a file.
9. Why is there a need to handle exceptions when working with files?
10. Explain the usage of the with clause.
11. Discuss Python's support for reading and writing CSV files.

4 ADT: Abstract Data Type
Objectives
After completing this chapter, students will be able to
- Write a simple Class in Python
- Create object of class, Instance Methods, Class Variables and special methods.
- Understand ADT and its implementation.
- Implement Stack and Queue ADT.
- Understand Concepts of circular and double ended queue.
- Implement applications of Stack and Queue.

This chapter introduces data structure concepts, the need for them, and the concept of abstract
data types (ADTs) for simple and complex data types. ADTs are presented in terms of their
definition, use, and implementation. After discussing the importance of abstraction, we define
several ADTs and then show how a well-defined ADT can be used without knowing how it is
actually implemented. The focus then turns to the implementation of the ADTs, with an emphasis
on the importance of selecting an appropriate data structure. The chapter includes an
introduction to the Python iterator mechanism and provides an example of a user-defined iterator
for use with a container type ADT. This chapter also discusses stack and queue implementation
using ADTs.

4.1 Introduction: Data Structures

Every day we deal with various software applications that make our life convenient, such as
online banking, railway reservation, booking an appointment with a doctor and online shopping.
Almost all organizations use database software for better functioning, for example an employee
management system, a student data management system in colleges, or a payroll system
to handle financial matters. All these applications store and process some kind of data or
files. The efficiency of all these applications depends on how the data is stored in memory,
how it is accessed for processing and how it is manipulated.

The design and implementation of any software depends upon the structuring and organization of
data (i.e. the data structure). A data structure specifies the logical relationship between data
elements and how this relationship affects the physical processing of data.

4.1.1 Who should have knowledge of Data Structure?

Knowledge of data structures is important for people who design and develop software for
commercial or technical applications and system software. This course also explains different
algorithms for building and manipulating data structures. The algorithms
are written in pseudo code, and some implementations are shown with the help of Python
programming.
4.1.1. Elementary Data Structure

This section discusses the major types of data structures and how they represent real-life data
such as date, temperature, distance and pay. This introductory section gives a basic
understanding of what a data structure is and why it is important in any information system.

Before starting a discussion on data structures, let us first understand what data is and why it
is important for any information system. Let us consider the following diagram, Figure 1:

Role of Data in Information System

• The big rectangle represents the data of overall organization.


• The small rectangle represents individual data elements belonging to various sections of
the organization.

The small data elements provide some facts for the organization. These data elements are
aggregated and summarized to produce meaningful information for the
organization. This information helps in making important decisions for the organization. These
decisions result in actions, which generate more useful data. This newly generated
data can be used to start another cycle of the decision-making process. Because data drives
decision making, it affects the operation and planning of the organization. Hence data is
very valuable for any organization. It must be managed to ensure the accuracy and
availability of the information produced.

Simple data handling operations include:


• Measurement
• Collection
• Storage
• Aggregation
• Updation
• Retrieval
• Protection
• Validation

A data structure is a way of organizing data which determines not only how data is stored in
memory but also how the data elements are related to each other. It is a mathematical or
logical model of the organization of data elements.

A data structure is mainly specified as follows:

• How data is organized in memory


• What are different ways to access data
• How data elements are associated with each other
• What are different ways to process data elements to generate information

Efficiency of a program depends on data structure selected for the program. In other words Data
Structure is a class of data that specifies organization of related data in memory and what are the
various operations that can be applied over the data and how these operations are performed.

Data Structure = Data Organization + Allowed Operation

Data structure and algorithm:

An algorithm is a step-by-step way of finding the solution to a given problem. It is a set of
instructions to carry out a certain task, whereas a data structure is a way to organize data
with their logical relationships retained.

To develop a program for an algorithm, we should select an appropriate data structure for the
algorithm. Therefore an algorithm and its associated data structure together make a program.

Algorithm + Data Structure = Program

4.1.2 Classification of data structures:

There are various ways to classify data structures:

1. Linear and nonlinear data structures: In linear data structures, elements are arranged
in a linear sequence. For example, an array.
In non-linear data structures, elements are not arranged in a linear sequence. For
example, a tree.

2. Homogeneous and non-homogeneous data structures: In homogeneous data
structures all the data elements are of the same type. For example, an array.

In non-homogeneous data structures the data elements may or may not be of the
same type. For example, records.

3. Primitive and non-primitive data structures:

Primitive data structures are atomic, i.e. not composed of other data structures. Examples
of primitive data structures are int, real, boolean and character.

Non-primitive data structures are composed of primitive and other non-primitive data
structures. For example, records, arrays and strings.

4. Static and dynamic data structures: Static data structures are the ones whose size, structure
and memory locations are fixed at compile time.

Dynamic data structures are the ones which expand and shrink as required during
program execution, and whose associated memory locations change. The classification of data
structures is shown in the figure below.

Classification of Data Structure
    Primitive: Int, Float, Char, Pointer
    Non-Primitive
        Linear: Array, Stack, Queue
        Non-Linear: Tree, Graph

Figure 8: Classification of Data Structure


4.2 Concept of Abstract Data Type:

An abstract data type defines the logical properties of a data type. Generally, a data type is a
collection of values together with the operations on those values. An abstract data type can be
used to define a user-defined data type when primitive data types are not suitable for
representing real-life data structures. With an abstract data type we define the logical
properties of a user-defined data type: the user has to define what kind of values the abstract
data type can have, and also has to describe the different operations that can be carried out on
these values.

We will use a semi-formal notation to describe an Abstract Data Type (ADT). To
understand the concept, let us consider rational as an abstract data type: ADT
RATIONAL, which corresponds to the mathematical concept of a rational number. A rational
number is a number that can be expressed as a quotient of two integers. The denominator of
a rational number cannot be zero.

The various operations that we can apply over rational numbers are the creation of
rational number from two integers, addition, multiplication, and testing for equality. The
following is specification for ADT RATIONAL:

/* Value definition */
abstract typedef <integer, integer> RATIONAL;
Condition RATIONAL[1] != 0;

/* Operator definition */
abstract RATIONAL makerational(a,b)
int a,b;
Precondition b != 0;
Postcondition makerational[0] == a;
              makerational[1] == b;

/* Add rational a,b */
abstract RATIONAL add(a,b)
RATIONAL a,b;
Postcondition add[1] == a[1] * b[1];
              add[0] == a[0] * b[1] + b[0] * a[1];

/* Multiply rational a,b */
abstract RATIONAL multy(a,b)
RATIONAL a,b;
Postcondition multy[0] == a[0] * b[0];
              multy[1] == a[1] * b[1];

/* Check equality of rational a,b */
abstract equal(a,b)
RATIONAL a,b;
Postcondition equal == (a[0] * b[1] == b[0] * a[1]);

An ADT consists of two parts:

• A value definition
• Operator definition

A value definition: define the collection of values that ADT can have. It has two sub parts:

• Definition clause
• Condition clause

In the abstract data type RATIONAL, the definition clause specifies that a value consists of two
integers, and the condition clause specifies that the second integer cannot be zero. The keywords
abstract typedef introduce the value definition, and the keyword Condition is used to specify
any conditions on the newly defined data type.

In this definition the condition specifies that the denominator may not be zero. The definition
clause is required, but the condition clause may not be necessary for every abstract data type.

Immediately after the value definition there is declaration for operator definition. Each operator
is defined as an abstract function. Abstract function has three parts:

• Header
• Operational preconditions
• Post conditions.

The post condition specifies what the operation does. In a post condition the name of the
function (say, add) is used to denote the result of the operation. The creation operation
creates a rational number from two integers and contains the first example of a precondition.
In general, preconditions specify any restrictions that must be satisfied before the operation
can be applied. In this example the precondition states that makerational cannot be applied if
the second parameter is zero.
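The RATIONAL specification can be turned into a Python class fairly directly. The following is
only a sketch under stated assumptions: the class name Rational, the value attribute and the use
of ValueError for a violated precondition are choices made for this illustration, and results
are not reduced to lowest terms because the specification does not require it.

```python
class Rational:
    """Sketch of ADT RATIONAL: a pair <numerator, denominator>."""

    def __init__(self, a, b):
        # makerational: the precondition b != 0 is checked before the value is built
        if b == 0:
            raise ValueError("denominator must not be zero")
        self.value = (a, b)          # value[0] is the numerator, value[1] the denominator

    def add(self, other):
        a, b = self.value, other.value
        return Rational(a[0] * b[1] + b[0] * a[1], a[1] * b[1])

    def multy(self, other):
        a, b = self.value, other.value
        return Rational(a[0] * b[0], a[1] * b[1])

    def equal(self, other):
        a, b = self.value, other.value
        return a[0] * b[1] == b[0] * a[1]


half = Rational(1, 2)
third = Rational(1, 3)
total = half.add(third)              # 1/2 + 1/3 = 5/6
product = half.multy(third)          # 1/2 * 1/3 = 1/6
print(total.value, product.value, half.equal(Rational(2, 4)))
```

Each method mirrors one postcondition of the specification, so the class can be checked against
the ADT line by line.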

In similar way we can create other data structures such as Integer, Rational, Currency, Date,
Temperature, distance, Pay, Marks, Grade_card etc.

4.3. ADT in Python


An abstract data type (or ADT) is a programmer-defined data type that specifies a set of data
values and a collection of well-defined operations that can be performed on those values.
Abstract data types are defined independent of their implementation, allowing us to focus on the
use of the new data type instead of how it’s implemented. This separation is typically enforced
by requiring interaction with the abstract data type through an interface or defined set of
operations. This is known as information hiding. By hiding the implementation details and
requiring ADTs to be accessed through an interface, we can work with an abstraction and focus
on what functionality the ADT provides instead of how that functionality is implemented.
Abstract data types can be viewed like black boxes as shown in following figure:

User programs interact with instances of the ADT by invoking one of the several operations
defined by its interface.

The set of operations can be grouped into four categories:


• Constructors: Create and initialize new instances of the ADT.
• Accessors: Return data contained in an instance without modifying it.
• Mutators: Modify the contents of an ADT instance.
• Iterators: Process individual data components sequentially.
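
The four categories can be seen together in one small class. This is an illustrative sketch only:
ScoreBag is a hypothetical container invented for this example, not an ADT defined in this
chapter, and it stores its items in a Python list.

```python
class ScoreBag:
    """Hypothetical container ADT used only to illustrate the four categories."""

    def __init__(self):              # constructor: creates and initializes an instance
        self._items = []

    def __len__(self):               # accessor: returns data without modifying it
        return len(self._items)

    def add(self, score):            # mutator: modifies the contents
        self._items.append(score)

    def __iter__(self):              # iterator: visits the items one by one
        return iter(self._items)


bag = ScoreBag()
bag.add(72)
bag.add(85)
total = sum(score for score in bag)  # the for clause uses the iterator
print(len(bag), total)
```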

Following are the benefits of Abstraction:


• We can focus on solving the problem at hand instead of getting bogged down in the
implementation details.
• We can reduce logical errors that can occur from accidental misuse of storage structures
and data types by preventing direct access to the implementation.
• The implementation of the abstract data type can be changed without having to modify
the program code that uses the ADT.
• It’s easier to manage and divide larger programs into smaller modules, allowing different
members of a team to work on the separate modules

There are many common data structures, including arrays, linked lists, stacks, queues, and trees,
to name a few. All data structures store a collection of values, but differ in how they organize the
individual data items and by what operations can be applied to manage the collection. The choice
of a particular data structure depends on the ADT and the problem at hand. Some data structures
are better suited to particular problems. For example, the queue structure is perfect for
implementing a printer queue, while the B-Tree is the better choice for a database index. No
matter which data structure we use to implement an ADT, by keeping the implementation
separate from the definition, we can use an abstract data type within our program and later
change to a different implementation, as needed, without having to modify our existing code.

1.4 The Date Abstract Data Type

An abstract data type is defined by specifying the domain of the
data elements that compose the ADT and the set of operations that can be performed on that
domain. The definition should provide a clear description of the ADT including both its domain
and each of its operations as only those operations specified can be performed on an instance of
the ADT. Next, we provide the definition of a simple abstract data type for representing a date in
the proleptic Gregorian calendar.

The Gregorian calendar was introduced in the year 1582 by Pope Gregory XIII to replace the
Julian calendar. The new calendar corrected for the miscalculation of the lunar year and
introduced the leap year. The official first date of the Gregorian calendar is Friday, October 15,
1582. The proleptic Gregorian calendar is an extension for accommodating earlier dates with the
first date on November 24, 4713 BC. This extension simplifies the handling of dates across older
calendars and its use can be found in many software applications.

A date represents a single day in the proleptic Gregorian calendar in which the first day starts on
November 24, 4713 BC.

Date( month, day, year ): Creates a new Date instance initialized to the given Gregorian date
which must be valid. Year 1 BC and earlier are indicated by negative year components.

day(): Returns the Gregorian day number of this date.


month(): Returns the Gregorian month number of this date.

year(): Returns the Gregorian year of this date.

monthName(): Returns the Gregorian month name of this date.


dayOfWeek(): Returns the day of the week as a number between 0 and 6 with 0 representing
Monday and 6 representing Sunday.

numDays( otherDate ): Returns the number of days as a positive integer between this date and
the otherDate.

isLeapYear(): Determines if this date falls in a leap year and returns the appropriate boolean
value.
advanceBy( days ): Advances the date by the given number of days. The date is incremented if
days is positive and decremented if days is negative. The date is capped to November 24, 4714
BC, if necessary.

comparable ( otherDate ): Compares this date to the otherDate to determine their logical
ordering. This comparison can be done using any of the logical operators <, <=, >, >=, ==, !=.

toString (): Returns a string representing the Gregorian date in the format mm/dd/yyyy.
Implemented as the Python operator that is automatically called via the str() constructor.
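
A partial sketch of the Date ADT is given below. It is not the implementation this chapter
builds on: as a simplifying assumption it wraps Python's datetime.date, which supports only
years 1 through 9999, so BC dates and the capping rule of advanceBy() are out of scope, and
only a subset of the operations (including just two comparison operators) is shown.

```python
import calendar
import datetime


class Date:
    """Partial Date ADT sketch built on datetime.date (years 1..9999 only)."""

    def __init__(self, month, day, year):
        # datetime.date raises ValueError if the date is not valid
        self._date = datetime.date(year, month, day)

    def day(self):
        return self._date.day

    def month(self):
        return self._date.month

    def year(self):
        return self._date.year

    def dayOfWeek(self):
        return self._date.weekday()          # 0 = Monday ... 6 = Sunday

    def numDays(self, otherDate):
        return abs((self._date - otherDate._date).days)

    def isLeapYear(self):
        return calendar.isleap(self._date.year)

    def advanceBy(self, days):
        self._date += datetime.timedelta(days=days)

    def __eq__(self, otherDate):
        return self._date == otherDate._date

    def __lt__(self, otherDate):
        return self._date < otherDate._date

    def __str__(self):                       # called automatically by str()
        return f"{self.month():02d}/{self.day():02d}/{self.year():04d}"


first = Date(10, 15, 1582)                   # first official Gregorian date
d = Date(2, 28, 2024)
d.advanceBy(1)
print(first.dayOfWeek(), str(d), d.isLeapYear(), first < d)
```

User code only calls the operations of the interface; the datetime.date object stays hidden
behind the _date attribute and could be replaced by another representation later.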

4.3 Writing a simple Class in Python:

A class is a user-defined blueprint or prototype from which objects are created. Classes provide
a means of bundling data and functionality together. Creating a new class creates a new type of
object, allowing new instances of that type to be made. Each class instance can have attributes
attached to it for maintaining its state. Class instances can also have methods (defined by their
class) for modifying their state.
To understand the need for creating a class, let’s consider an example. Let’s say you wanted to
track a number of dogs that may have different attributes like breed and age. If a list is used,
the first element could be the dog’s breed while the second element could represent its age.
Suppose there are 100 different dogs; how would you know which element is supposed to
be which? What if you wanted to add other properties to these dogs? This approach lacks
organization, and that is exactly the need for classes.
A class creates a user-defined data structure, which holds its own data members and member
functions, which can be accessed and used by creating an instance of that class. A class is like
a blueprint for an object.

Some points on Python class:

• Classes are created by keyword class.


• Attributes are the variables that belong to a class.
• Attributes are always public and can be accessed using the dot (.) operator. Eg.:
Myclass.Myattribute

Class Definition Syntax:

class ClassName:
    # Statement-1
    .
    .
    .
    # Statement-N

4.3.1 Class Objects

An object is an instance of a class. A class is like a blueprint, while an instance is a copy of
the class with actual values. It’s not an idea anymore; it’s an actual dog, like a dog of breed
pug who’s seven years old. You can create many different instances to represent many dogs, but
without the class as a guide you would be lost, not knowing what information is required.
An object consists of:
• State: It is represented by the attributes of an object. It also reflects the properties of an
object.
• Behavior: It is represented by the methods of an object. It also reflects the response of an
object to other objects.
• Identity: It gives a unique name to an object and enables one object to interact with other
objects.

4.3.2 Declaring Objects (Also called instantiating a class)


When an object of a class is created, the class is said to be instantiated. All the instances share
the attributes and the behavior of the class. But the values of those attributes, i.e. the state are
unique for each object. A single class may have any number of instances.
Example:
# program to demonstrate instantiating a class

class Dog:

    # A simple class
    # attribute
    attr1 = "mammal"
    attr2 = "dog"

    # A sample method
    def fun(self):
        print("I'm a", self.attr1)
        print("I'm a", self.attr2)

# Driver code
# Object instantiation
Rodger = Dog()

# Accessing class attributes
# and method through objects
print(Rodger.attr1)
Rodger.fun()

Output:
mammal
I'm a mammal
I'm a dog

4.3.3 The self

• Methods defined in a class must have an extra first parameter in the method definition,
conventionally named self. We do not give a value for this parameter when we call the method;
Python provides it.
• Even if a method takes no arguments, it must still declare this one parameter.
• This is similar to the this pointer in C++ and the this reference in Java.

When we call a method of an object as myobject.method(arg1, arg2), this is automatically
converted by Python into MyClass.method(myobject, arg1, arg2). This is all that the special self
is about.
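
The equivalence of the two call forms can be checked with a short sketch. Greeter and its
methods are hypothetical names used only for this illustration.

```python
class Greeter:
    """Hypothetical class used only to illustrate self."""

    def greet(self, name):
        return f"Hello, {name}"

    def ping(self):                  # no caller arguments, but self is still required
        return "pong"


myobject = Greeter()
via_object = myobject.greet("Asha")            # Python supplies self automatically
via_class = Greeter.greet(myobject, "Asha")    # the same call with self written out
print(via_object == via_class, myobject.ping())
```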

4.3.4 __init__ method


The __init__ method is similar to constructors in C++ and Java. Constructors are used to
initialize the object’s state. Like methods, a constructor also contains a collection of
statements (i.e. instructions) that are executed at the time of object creation. It is run as
soon as an object of a class is instantiated. The method is useful for any initialization you
want to do with your object.
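
A minimal sketch of __init__ in use is shown below, reusing the dog example from earlier in this
section. The attribute names breed and age and the describe() method are assumptions made for
this illustration.

```python
class Dog:
    species = "dog"                  # class attribute shared by all instances

    def __init__(self, breed, age):
        # runs automatically when Dog(...) is called
        self.breed = breed           # instance attributes hold each object's own state
        self.age = age

    def describe(self):
        return f"{self.breed}, {self.age} years old"


pug = Dog("pug", 7)
beagle = Dog("beagle", 3)
print(pug.describe())
print(beagle.describe())
```

Each instance receives its own breed and age, while species is shared through the class.
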
4.4 Particular Linear Data Structures (Stack, Queue)

This section discusses a commonly used data structure, the stack. A stack is a linear data
structure which allows insertion and deletion at only one end. The insertion operation is called
PUSH and the deletion operation is called POP. Elements are removed in the opposite order to
that in which they were added. This process is called last in first out (LIFO). The last part of
the chapter discusses the queue and its applications.

4.5 STACK

Let us understand stack with the help of examples:

Example 1: A stack of plates at a cafeteria: this arrangement of plates allows a person to pick
only the plate at the top.

Example 2: A railway track used to shunt cars from one position to another. The last rail car on
the track is the first one to be removed from the stack of cars.

4.5.1 Memory representation of stack:

Stack can be represented in two ways:

1. Static implementation
2. Dynamic implementation

Let us first discuss the static implementation of a stack using an array.


4.5.2 Static implementation of stack:

A stack can be implemented with the help of an array. In a stack, insertion and deletion are
possible only at one end, which is called the top. The following diagram shows the memory
representation of a stack as an array.

A pointer TOP denotes the top element of the stack. For an empty stack, TOP has a negative
value (i.e. TOP = -1). TOP is incremented by one before a new element is placed on the stack.
Conversely, TOP is decremented by one when an element is deleted from the stack.

Implementation of a stack using an array is not flexible, because the size of the stack cannot
be varied. Static implementation is also not efficient with respect to memory utilisation. The
array is declared before the stack operations begin. If too few elements are stored in the
stack, the remaining memory is wasted. On the other hand, if there are more elements than the
array can hold, we cannot change the size of the array to accommodate the new elements.

4.5.3 Operations of stack:

The basic operations of stack are as follows:

1. Push: The process of adding a new element at the top of the stack is known as the push
operation. To add a new element to the stack, the top position has to be incremented by one.
If the array is full and no new element can be accommodated, the situation is called the
stack full condition. This condition is also known as the stack overflow condition.
2. Pop: The process of deleting an element from the top of the stack is called the pop
operation. After every element is deleted from the stack, the top is decremented by one. If
the stack is empty when a pop operation is called, the result is a stack underflow
condition.

Stack terminology:

MAXSIZE: This term refers to the maximum size of the stack.

TOP: This term refers to the top of the stack. It is used to check the overflow and underflow
conditions. The initial value of TOP is -1. This convention is chosen so that whenever a new
element is added to the stack, TOP is first incremented and the element is then inserted at the
location indicated by TOP.

ST: The name of the array, with maximum size MAXSIZE.

Stack underflow: This is the situation when the stack contains no elements and a pop is
attempted. At this point the top of the stack is below the bottom of the stack (i.e. TOP = -1).

Stack overflow: This is the situation when the stack is full and no more elements can be
inserted. At this point the top of the stack is at the highest location of the stack (i.e.
TOP = MAXSIZE - 1).

Algorithm: PUSH: to insert an element on the TOP of STACK

Assumptions:
Name of array representing stack- ST
Maximum capacity of array- MAXSIZE
Top of stack – TOP

• Step 1 − Checks if the stack is full.

o IF TOP == MAXSIZE – 1

o THEN PRINT “ OVERFLOW”

o EXIT

• Step 2 − If the stack is not full, increments top to point next empty space.

o SET TOP = TOP + 1

• Step 3 − Adds data element to the stack location, where top is pointing.

o ST[TOP] = ITEM

• Step 4 − Returns success.


4.5.4 Algorithm: POP: to DELETE an element from the TOP of STACK
Assumptions:

Name of array representing stack- ST

Maximum capacity of array- MAXSIZE

Top of stack – TOP

• Step 1 − Checks if the stack is empty.

o IF TOP == – 1

o THEN PRINT “UNDERFLOW”

o EXIT

• Step 2 − If the stack is not empty, decrement top by 1.

o SET TOP = TOP - 1

• Step 3 − Returns success.
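
The PUSH and POP algorithms can be sketched in Python as follows. The class name ArrayStack, the
capacity of 3 and the choice of exceptions for overflow and underflow are assumptions made for
this illustration; a fixed-length Python list stands in for the array ST, and pop() also
returns the removed element, which the algorithm above leaves implicit.

```python
MAXSIZE = 3                          # illustrative capacity


class ArrayStack:
    def __init__(self, maxsize=MAXSIZE):
        self.maxsize = maxsize
        self.st = [None] * maxsize   # ST: fixed-size storage
        self.top = -1                # TOP = -1 means the stack is empty

    def push(self, item):
        if self.top == self.maxsize - 1:         # Step 1: overflow check
            raise OverflowError("stack overflow")
        self.top += 1                            # Step 2: increment TOP
        self.st[self.top] = item                 # Step 3: store the item

    def pop(self):
        if self.top == -1:                       # Step 1: underflow check
            raise IndexError("stack underflow")
        item = self.st[self.top]
        self.st[self.top] = None
        self.top -= 1                            # Step 2: decrement TOP
        return item


stack = ArrayStack()
for value in (10, 20, 30):
    stack.push(value)
popped = [stack.pop(), stack.pop(), stack.pop()]
print(popped)                        # elements come back in reverse (LIFO) order
```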

4.6 Application of STACK

A stack is used in many applications where the last inserted item is processed first (i.e. Last
In First Out, LIFO). Some of the applications where last in first out processing is used are the
evaluation of arithmetic expressions, recursion and the Tower of Hanoi.

Let us discuss some applications of STACK.

4.6.1 Applications of stack for arithmetic expressions:

Arithmetic expressions can be defined as combinations of variables and constants connected by
arithmetic operators.

Following is the table of basic mathematical operators used in expressions:

SYMBOL   PURPOSE          EXAMPLE
↑        Exponentiation   A↑B
/        Division         A/B
*        Multiplication   A*B
+        Addition         A+B
-        Subtraction      A-B

The order of evaluation of arithmetic operators is known as operator precedence. Following is
the order of operator precedence:
1. Highest: ↑
2. Next highest: / *
3. Lowest: + -

The operators on same level are evaluated from left to right. Operators may be enclosed within
parentheses to override the above rules. Operators contained within parentheses are evaluated
first. When the sets of parentheses are nested, the innermost is evaluated first and so on.

4.6.2 Notation for a mathematical expression:

There are basically Three Types of notation for a mathematical expression:

1. Infix notation
2. Prefix notation
3. Postfix notation

4.6.2.1 Infix notation:

This is the notation we generally use in mathematics, where the operator is written in between
the operands. For example, the expression to add two numbers A and B is written in infix
notation as follows:

A+B

It is important to note that, the operator, '+' is written between the operands A and B, that is why
it is known as infix notation.
4.6.2.2 Prefix notation or polish notation:

Prefix notation is one in which the operator is written before the operands. The same
expression to add A and B can be written in prefix notation as follows:

+ AB

It is important to note that the operator '+' is written before the operands A and B; that is
why it is known as prefix notation or Polish notation.

4.6.2.3 Postfix notation or reverse polish notation:

In postfix notation the operators are written after the operands, which is why it is called
postfix notation. It is sometimes also known as suffix notation or reverse Polish notation.

The above mathematical expression to add A and B can be written in postfix notation as
following:

AB +

4.6.2.4 Need for prefix and postfix notations:

Human beings are quite used to working with mathematical expressions in infix notation. This
notation is quite complex for computers, because evaluating it requires a set of
non-trivial rules: the rules of precedence, associativity and BODMAS.

With infix notation it is difficult to determine the order in which operators should be applied
just by looking at the expression; the operators with higher precedence, and those inside
parentheses, must be found and evaluated first. In postfix notation, by contrast, the operands
appear before the operator, so there is no need for operator precedence or other rules. The
postfix expression is scanned from left to right: each operand is pushed onto a stack, and as
soon as an operator appears, the topmost operands are popped off, the operator is applied to
them, and the result is pushed back onto the stack. At the end of the process the stack contains
a single value, which is the result.

Generally, we write mathematical expressions in infix notation, which is not convenient for a
computer to process. A postfix expression, on the other hand, can be processed in a single pass
from left to right. An expression entered into the computer is therefore typically first
converted into postfix notation and then evaluated with the help of a stack.

Postfix notation is well suited for computers to calculate mathematical expressions because it
maps directly onto stack operations. Therefore it is important to study postfix notation.
Prefix notation is also familiar from programming: in the C language, a function call to add two
variables is written as follows:

Add (A, B)

In this description the operator Add precedes the operands A and B.
Let us see an example:

INFIX:   A+B*C
POSTFIX: ABC * +
PREFIX:  + A* BC
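
Both steps, conversion to postfix and evaluation with a stack, can be sketched in Python. The
function names and the use of '^' in place of the ↑ symbol are assumptions of this illustration;
every operator is treated as left-to-right, following the rule stated earlier in this section,
and the input is assumed to be a well-formed list of tokens.

```python
import operator

PRECEDENCE = {'^': 3, '*': 2, '/': 2, '+': 1, '-': 1}    # '^' stands in for the ↑ symbol
OPERATIONS = {'^': operator.pow, '*': operator.mul, '/': operator.truediv,
              '+': operator.add, '-': operator.sub}


def infix_to_postfix(tokens):
    """Convert a list of infix tokens to postfix using an operator stack."""
    output, stack = [], []
    for token in tokens:
        if token == '(':
            stack.append(token)
        elif token == ')':
            while stack[-1] != '(':
                output.append(stack.pop())
            stack.pop()                          # discard the '('
        elif token in PRECEDENCE:
            # operators of higher or equal precedence on the stack are applied first
            while (stack and stack[-1] != '(' and
                   PRECEDENCE[stack[-1]] >= PRECEDENCE[token]):
                output.append(stack.pop())
            stack.append(token)
        else:
            output.append(token)                 # operand goes straight to the output
    while stack:
        output.append(stack.pop())
    return output


def evaluate_postfix(tokens):
    """Evaluate postfix tokens whose operands are numeric strings."""
    stack = []
    for token in tokens:
        if token in OPERATIONS:
            right = stack.pop()                  # topmost operand is the right-hand one
            left = stack.pop()
            stack.append(OPERATIONS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


postfix = infix_to_postfix(list("A+B*C"))
value = evaluate_postfix(infix_to_postfix(list("2+3*4")))
print(' '.join(postfix), value)
```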

4.7 Recursion

When a function calls itself, it is called a recursive function. A function that calls itself
can result in an infinite process. To stop the indefinite execution of a recursive program, a
recursive function must have the following two properties:

1. There must be some argument value for which the function does not refer to
itself. This value is known as the base value.
2. Every time the recursive function refers to itself, the argument value of the function must
go one step closer to the base value.

A recursive function that has the above two characteristics is also known as a well-defined
recursive function.

Let us understand recursive function with the help of an example.

Example 1: factorial function. The recursive definition of n! is the following:

n! = 1                if n = 0
n! = n * (n - 1)!     if n > 0

4.7.1 Algorithm: factorial of a number

Assumptions:

N - is the number for which we need to find the factorial
F - is the variable that stores the value of the factorial

Procedure: FACT (F, N)

Step 1: if N = 0 then
            Set F := 1 and return
Step 2: Call FACT (F, N-1)
Step 3: Set F := N * F
Step 4: return
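
The procedure above can be sketched in Python as shown below. Instead of storing the result in a separate variable F, this version simply returns it; the function name fact is chosen here for illustration.

```python
def fact(n):
    """Return n! for a non-negative integer n (illustrative sketch)."""
    assert n >= 0, "Factorial is only defined here for n >= 0"
    if n == 0:
        # Base value: the function does not call itself.
        return 1
    # Each call moves the argument one step closer to the base value.
    return n * fact(n - 1)


value = fact(5)
```

Both properties of a well-defined recursive function are visible: the case n == 0 is the base value, and every recursive call uses n - 1.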

4.7.2 Tower of Hanoi:

This problem is based on a children's game consisting of three poles and a
number of disks of different sizes. Each disk has a hole in the centre. Initially, all the disks
are placed on the leftmost pole with the largest one at the bottom and the smallest one at the
top. The objective of the game is to transfer the disks from the leftmost pole to the rightmost
pole according to the following rules:

• The disks have to be moved one at a time.
• During shifting, at no point of time should a larger disk rest on a smaller one.
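
A recursive solution can be sketched as follows. The idea, under the two rules above, is to move the top n - 1 disks to the spare pole, move the largest disk to the target pole, and then move the n - 1 disks from the spare pole onto it. The function name hanoi and the pole labels are illustrative assumptions.

```python
def hanoi(n, source, target, spare, moves):
    """Record the moves that transfer n disks from source to target."""
    if n == 0:
        # Base value: nothing to move.
        return
    # Move the n - 1 smaller disks out of the way, onto the spare pole.
    hanoi(n - 1, source, spare, target, moves)
    # Move the largest remaining disk directly to the target pole.
    moves.append((source, target))
    # Move the n - 1 smaller disks from the spare pole onto the target pole.
    hanoi(n - 1, spare, target, source, moves)


moves = []
hanoi(3, "left", "right", "middle", moves)
for source, target in moves:
    print("Move the top disk from", source, "to", target)
```

For n disks this procedure records 2^n - 1 moves, so three disks need seven moves.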

The following diagram represents the scenario:

4.8 The Stack ADT


A stack is used to store data such that the last item inserted is the first item removed. It is used to
implement a last-in first-out (LIFO) type protocol. The stack is a linear data structure in which
new items are added, or existing items are removed from the same end, commonly referred to as
the top of the stack. The opposite end is known as the base.

Stack ADT in Python:

A stack is a data structure that stores a linear collection of items with access limited to a last-in
first-out order. Adding and removing items is restricted to one end known as the top of the stack.
An empty stack is one containing no items.

Stack(): Creates a new empty stack.
isEmpty(): Returns a boolean value indicating if the stack is empty.
length(): Returns the number of items in the stack.
pop(): Removes and returns the top item of the stack, if the stack is not empty.
Items cannot be popped from an empty stack. The next item on the stack
becomes the new top item.
peek(): Returns a reference to the item on top of a non-empty stack without
removing it. Peeking, which cannot be done on an empty stack, does not
modify the stack contents.
push( item ): Adds the given item to the top of the stack.

4.9 Implementing the Stack Using a Python List

The Stack ADT can be implemented in several ways. The two most common approaches in
Python include the use of a Python list and a linked list. The choice depends on the type of
application involved. The Python list-based implementation of the Stack ADT is the easiest to
implement. The first decision we have to make when using the list for the Stack ADT is which
end of the list to use as the top and which as the base. For the most efficient ordering, we let the
end of the list represent the top of the stack and the front represent the base. As the stack grows,
items are appended to the end of the list and when items are popped, they are removed from the
same end.

The peek() and pop() operations can only be used with a non-empty stack since you cannot
remove or peek at something that is not there. To enforce this requirement, we first assert the
stack is not empty before performing the given operation. The peek() method simply returns a
reference to the last item in the list. To implement the pop() method, we call the pop() method of
the list structure, which actually performs the same operation that we are trying to implement.
That is, it saves a copy of the last item in the list, removes the item from the list, and then returns
the saved copy. The push() method simply appends new items to the end of the list since that
represents the top of our stack.

# Implementation of the Stack ADT using a Python list.

class Stack :
    # Creates an empty stack.
    def __init__( self ):
        self._theItems = list()

    # Returns True if the stack is empty or False otherwise.
    def isEmpty( self ):
        return len( self ) == 0

    # Returns the number of items in the stack.
    def __len__ ( self ):
        return len( self._theItems )

    # Returns the top item on the stack without removing it.
    def peek( self ):
        assert not self.isEmpty(), "Cannot peek at an empty stack"
        return self._theItems[-1]

    # Removes and returns the top item on the stack.
    def pop( self ):
        assert not self.isEmpty(), "Cannot pop from an empty stack"
        return self._theItems.pop()

    # Push an item onto the top of the stack.
    def push( self, item ):
        self._theItems.append( item )
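
The last-in first-out behaviour can be seen in a small sketch that uses a plain Python list in the same way the class above does: append() plays the role of push and the list's pop() plays the role of pop. The function name reverse_with_stack is hypothetical and is used only for illustration.

```python
def reverse_with_stack(items):
    """Return the items in reverse order by pushing and popping them."""
    stack = []
    for item in items:
        stack.append(item)                   # push onto the top (end of list)
    reversed_items = []
    while len(stack) > 0:
        reversed_items.append(stack.pop())   # pop from the top (end of list)
    return reversed_items


result = reverse_with_stack([1, 2, 3])
```

Because the item pushed last is popped first, the items come out in the reverse of the order in which they went in.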

4.10 Queue data structure:


A linear data structure is known as a queue if deletions are performed at the beginning (the
front) and insertions are performed at the end (the rear) of the list. In a queue, information
is processed in the same order in which it was received, on a first-in first-out (FIFO) or
first-come first-served (FCFS) basis.

In a stack, insertion and deletion of elements are restricted to the top of the list. In a
queue, elements are added at the rear end and deleted from the beginning (the head) of the list.
You are already familiar with queues, as they often arise in daily life: we stand in a queue at
the bank, the supermarket, and the school bus. In this part of the chapter we will discuss the
queue data structure, its representation in memory, operations on a queue such as insertion and
deletion, different types of queues such as the circular queue and the deque (double-ended
queue), and their applications in computers.

4.10.1 Memory representation of queue data structure:


As discussed for the stack, a queue can also be implemented in two ways:
• Using an array
• Using a linked list
A queue is a non-primitive linear data structure. New elements are added at the rear end and
existing elements are deleted from the front end.

The following diagrams show a queue graphically during the insertion operation:


A queue is also an abstract data type or a linear data structure, just like the stack, in
which elements are inserted at one end called the REAR (also called the tail), and existing
elements are removed from the other end called the FRONT (also called the head).
This makes the queue a FIFO (First In First Out) data structure, which means that the element
inserted first will be removed first.
This is exactly how a queue works in the real world. If you go to a ticket counter to buy
movie tickets, and are first in the queue, then you will be the first one to get the tickets. The
same is the case with the queue data structure: data inserted first will leave the queue first.
The process of adding an element to the queue is called Enqueue and the process of removing an
element from the queue is called Dequeue.
Basic features of Queue

1. Like a stack, a queue is also an ordered list of elements of similar data types.
2. A queue is a FIFO (First In First Out) structure.
3. Once a new element is inserted into the queue, all the elements inserted before it
must be removed before the new element can be removed.
4. The peek() function is often used to return the value of the first element without dequeuing it.

4.11 Applications of Queue

A queue, as the name suggests, is used whenever we need to manage a group of objects in an
order in which the first one coming in also gets out first while the others wait for their turn,
as in the following scenarios:
1. Serving requests on a single shared resource, such as a printer or CPU task scheduling.
2. In real life, call center phone systems use queues to hold callers in
order until a service representative is free.
3. Handling of interrupts in real-time systems. The interrupts are handled in the same order as
they arrive, i.e. first come, first served.

4.11.1 Implementation of Queue Data Structure

A queue can be implemented using an array, a stack or a linked list. The easiest way of
implementing a queue is by using an array.
Initially the head (FRONT) and the tail (REAR) of the queue both point at the first index of the
array (starting the index of the array from 0). As we add elements to the queue, the tail keeps
moving ahead, always pointing to the position where the next element will be inserted, while
the head remains at the first index.

When we remove an element from the queue, we can follow two possible approaches (labelled
[A] and [B] in the above diagram). In approach [A], we remove the element at the head position
and then shift all the other elements one position toward the front.
In approach [B], we remove the element from the head position and then move the head to the next
position.
In approach [A] there is an overhead of shifting the elements one position forward every time
we remove the first element.
In approach [B] there is no such overhead, but every time we move the head one position ahead
after removing the first element, the usable size of the queue is reduced by one slot.
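
The two removal approaches can be sketched as follows. A fixed-length Python list stands in for the array, and the class names ShiftingArrayQueue and MovingHeadArrayQueue are hypothetical names used only for this illustration.

```python
class ShiftingArrayQueue:
    """Approach [A]: the head stays at index 0; items shift on removal."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._tail = 0                  # index where the next item is stored

    def enqueue(self, item):
        assert self._tail < len(self._slots), "Queue is full"
        self._slots[self._tail] = item
        self._tail += 1

    def dequeue(self):
        assert self._tail > 0, "Queue is empty"
        item = self._slots[0]
        # Shift every remaining item one position toward the front.
        for i in range(1, self._tail):
            self._slots[i - 1] = self._slots[i]
        self._tail -= 1
        self._slots[self._tail] = None
        return item


class MovingHeadArrayQueue:
    """Approach [B]: the head index advances; freed slots are not reused."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0
        self._tail = 0

    def enqueue(self, item):
        assert self._tail < len(self._slots), "No room left at the rear"
        self._slots[self._tail] = item
        self._tail += 1

    def dequeue(self):
        assert self._head < self._tail, "Queue is empty"
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head += 1                 # no shifting, but this slot is lost
        return item

    def free_slots(self):
        # Only the slots after the tail can still be used.
        return len(self._slots) - self._tail
```

In approach [A] a slot is freed by every removal at the cost of a shift; in approach [B] removal is a single step, but a queue of capacity 3 that has received three items has no usable slot left even after items are removed.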
4.12 The Queue ADT in Python:

A queue is a specialized list with a limited number of operations in which items can only be
added to one end and removed from the other. The definition of the Queue ADT follows:

A queue is a data structure that stores a linear collection of items in which access is
restricted to a first-in first-out basis. New items are inserted at the back and existing items
are removed from the front. The items are maintained in the order in which they are added to
the structure.

Queue(): Creates a new empty queue, which is a queue containing no items.
isEmpty(): Returns a boolean value indicating whether the queue is empty.
length(): Returns the number of items currently in the queue.
enqueue( item ): Adds the given item to the back of the queue.
dequeue(): Removes and returns the front item from the queue. An item cannot be dequeued
from an empty queue.

4.13 Implementing the Queue Using a Python List


Since the queue data structure is simply a specialized list, it is commonly implemented using
some type of list structure. There are three common approaches to implementing a queue: using a
vector (such as Python's list), a linked list, or an array. The simplest way to implement the
Queue ADT is to use Python's list. It provides the necessary routines for adding and removing
items at the respective ends. By applying these routines, we can remove items from the front of
the list and append new items to the end. To use a list for the Queue ADT, the constructor must
define a single data field to store the list, which is initially empty. We can test for an empty
queue by examining the length of the list. To enqueue an item, we simply append it to the end of
the list. The dequeue operation is implemented in a similar fashion, except that the item is
removed from the front of the list. Before attempting to remove an item from the list, we must
ensure the queue is not empty. Remember, the queue definition prohibits the use of the dequeue()
operation on an empty queue. Thus, to enforce this, we first assert that the queue is not empty,
which raises an exception when the operation is attempted on an empty queue.

# Implementation of the Queue ADT using a Python list.

class Queue :
    # Creates an empty queue.
    def __init__( self ):
        self._qList = list()

    # Returns True if the queue is empty.
    def isEmpty( self ):
        return len( self ) == 0

    # Returns the number of items in the queue.
    def __len__( self ):
        return len( self._qList )

    # Adds the given item to the queue.
    def enqueue( self, item ):
        self._qList.append( item )

    # Removes and returns the first item in the queue.
    def dequeue( self ):
        assert not self.isEmpty(), "Cannot dequeue from an empty queue."
        return self._qList.pop( 0 )

4.14 Circular Queue: Using a Circular Array


The list-based implementation of the Queue ADT is easy to implement, but its operations can
require linear time: dequeue removes the first item, so all remaining items have to be shifted,
and enqueue requires linear time in the worst case, when the underlying array has to be
expanded. We can improve these times using an array structure and treating it as a circular
array. A circular array is simply an array viewed as a circle instead of a line. An example is
illustrated in the following figure:

A circular array allows us to add new items to a queue and remove existing ones without having
to shift items in the process. Unfortunately, this approach introduces the concept of a maximum-
capacity queue that can become full. A circular array queue implementation is typically used
with applications that only require small-capacity queues and allows for the specification of a
maximum size.
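
A circular array queue can be sketched as shown below. This is a minimal illustration rather than a definitive implementation: a fixed-length Python list stands in for the array, and the class name CircularQueue, the isFull() method and the field names are assumptions made for this example.

```python
class CircularQueue:
    """A fixed-capacity queue stored in a circular array (sketch)."""

    def __init__(self, max_size):
        self._slots = [None] * max_size
        self._front = 0       # index of the front item
        self._count = 0       # number of items currently stored

    def isEmpty(self):
        return self._count == 0

    def isFull(self):
        return self._count == len(self._slots)

    def __len__(self):
        return self._count

    def enqueue(self, item):
        assert not self.isFull(), "Cannot enqueue to a full queue."
        # The back position wraps around to index 0 past the last slot.
        back = (self._front + self._count) % len(self._slots)
        self._slots[back] = item
        self._count += 1

    def dequeue(self):
        assert not self.isEmpty(), "Cannot dequeue from an empty queue."
        item = self._slots[self._front]
        self._slots[self._front] = None
        # Advance the front index, wrapping around at the end of the array.
        self._front = (self._front + 1) % len(self._slots)
        self._count -= 1
        return item
```

The modulus operator is what makes the array behave like a circle: no items are shifted, and slots freed at the front are reused when the back index wraps around.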

4.14.1 Application of Circular Queue


Below we have some common real-world examples where circular queues are used:

1. Computer-controlled traffic signal systems use circular queues.
2. CPU scheduling and memory management.

4.15 Review Questions:

Q1. Define data structure. What are the different types of data structures?
Q2. What is the difference between a built-in data structure and a user-defined data structure?
Q3. What is an abstract data type? How is an abstract data type implemented in a language?
Specify and implement simple data structures such as date.
Q4. What is the role of pre-conditions and post-conditions in implementing a data structure?
Q5. Explain the different operations to be performed on data structures.
Q6. What are linear data structures?
Q7. Explain an algorithm to implement a stack with the help of a Python list.
Q8. Explain an algorithm to implement the stack ADT.
Q9. What is the significance of the top in a stack?
Q10. Why is a stack called a LIFO data structure?
Q11. List applications of the stack in computers.
Q12. Explain the difference between postfix and prefix expressions with the help of an example.
Q13. Explain an algorithm to implement a queue with the help of a Python list.
Q14. Explain an algorithm to implement the queue ADT.
Q15. Explain an algorithm to implement a circular queue.
Q16. Explain an algorithm to implement dequeue.

5 Linked Lists

Objectives
After completing this chapter, students will be able to

- Define List as ADT
- Implement Singly Linked Lists
- Implement Circularly Linked Lists
- Implement Doubly Linked Lists
- Compare Link-Based vs Array-Based Sequences
- Implement Stack and Queue using Linked List
- Implement applications of Linked List (Polynomial Equations)

In this chapter, we introduce the linked list data structure, which is a general purpose structure
that can be used to store a collection in linear order. The linked list improves on the construction
and management of an array and Python list by requiring smaller memory allocations and no
element shifts for insertions and deletions. But it does eliminate the constant time direct element
access available with the array and Python list. Thus, it's not suitable for every data storage
problem. There are several varieties of linked lists. The singly linked list is a linear structure in
which traversals start at the front and progress, one element at a time, to the end. Other variations
include the circularly linked, the doubly linked, and the circularly doubly linked lists.

5.1 Linked Lists:


A linked list is a linear collection of data elements called nodes, where the linear order is
maintained by using pointers. A node in a linked list has two fields:
1. An information field (DATA): it contains the data value of the element.
2. A pointer field (ADDR): it gives the address of the next element in the list.
The structure of a node is shown in the following diagram:

        NODE
P -> [ DATA | ADDR ]

- Here P denotes the address of the node.

Linked lists are also used to create trees and graphs. The following diagram shows a linked
list; here the header holds the address of the first node in the linked list.

5.1.1 Advantages of Linked Lists

• Linked lists are dynamic in nature; memory is allocated only when there is a
requirement.
• Insertion and deletion operations in a linked list require fewer memory operations than
in an array, since no elements have to be shifted.
• Linked lists can also be used to implement other data structures such as stacks and queues.
• The nodes of a linked list do not have to occupy contiguous memory locations; each node
can be stored wherever free memory is available.

5.1.2 Disadvantages of Linked Lists

• Each element is required to store the address of the next element. The extra memory
needed to store this address is an overhead.
• Elements cannot be accessed randomly. To access any element in the list we
have to traverse sequentially from the header node.
• A singly linked list can be traversed in only one direction, starting from the header.
Reverse traversal is difficult in a singly linked list.

5.1.3 Applications of Linked Lists

• Linked lists are used to implement other data structures such as stacks, queues, graphs and trees.
• A linked list allows insertion and deletion of an element at any position in the list.
• A linked list can be extended at run-time. It is not required to know the size of the data
in advance.

5.2 Types of Linked Lists

There are three different types of linked list:

1. Singly Linked List
2. Doubly Linked List
3. Circular Linked List

Let us look at each of them and how they differ from each other.

5.2.1 Singly Linked List


Singly linked lists contain nodes which have a data part as well as an address part i.e. next,
which points to the next node in the sequence of nodes.
The operations we can perform on singly linked lists are insertion, deletion and traversal.

5.2.2 Doubly Linked List


In a doubly linked list, each node contains a data part and two addresses, one for
the previous node and one for the next node.

5.2.3 Circular Linked List

In circular linked list the last node of the list holds the address of the first node hence forming a
circular chain.
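
The three variants can be contrasted with a short sketch. The class names SinglyNode and DoublyNode and the variable names are hypothetical and serve only to illustrate how the address fields differ.

```python
class SinglyNode:
    """A node with a data part and one address part (next)."""
    def __init__(self, data):
        self.data = data
        self.next = None


class DoublyNode:
    """A node with a data part and two address parts (prev and next)."""
    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None


# Singly linked list: 1 -> 2 -> 3, the last node refers to nothing.
first, second, third = SinglyNode(1), SinglyNode(2), SinglyNode(3)
first.next = second
second.next = third

# Circular linked list: the last node holds the address of the first node.
ring_a, ring_b = SinglyNode("a"), SinglyNode("b")
ring_a.next = ring_b
ring_b.next = ring_a

# Doubly linked list: each node refers to both of its neighbours.
left, right = DoublyNode("x"), DoublyNode("y")
left.next = right
right.prev = left
```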

5.3 Link List using Python:


Suppose we have a basic class containing a single data field:

class ListNode :
    def __init__( self, data ) :
        self.data = data

We can create several instances of this class, each storing data of our choice.

In the following example, we create three instances, each storing an integer value:

a = ListNode( 11 )
b = ListNode( 52 )
c = ListNode( 18 )

The above statements result in the creation of three variables and three objects, as shown in
the following diagram:

Now, suppose we add a second data field to the ListNode class:

class ListNode :
    def __init__( self, data ) :
        self.data = data
        self.next = None

The three objects from the previous example would now have a second data field
initialized with a null reference, as illustrated in the following:

Since the next field can contain a reference to any type of object, we can assign to it a reference
to one of the other ListNode objects. For example, suppose we assign b to the next field of object
a:

a.next = b

which results in object a being linked to object b, as shown here:

And finally, we can link object b to object c:

b.next = c

This will result in a chain of objects, as illustrated here:

We can remove the two external references b and c by assigning None to each, as shown here:

b = None
c = None

The result is a linked list structure. The two objects previously pointed to by b and c are still
accessible via a. For example, suppose we wanted to print the values of the three objects. We can
access the other two objects through the next field of the first object:

print( a.data )
print( a.next.data )
print( a.next.next.data )

A linked structure contains a collection of objects called nodes, each of which contains data and
at least one reference or link to another node. A linked list is a linked structure in which the
nodes are connected in sequence to form a linear list. Figure 6.1 provides an example of a linked
list consisting of five nodes. The last node in the list, commonly called the tail node, is indicated
by a null link reference. Most nodes in the list have no name and are simply referenced via the
link field of the preceding node. The first node in the list, however, must be named or referenced
by an external variable as it provides an entry point into the linked list. This variable is
commonly known as the head pointer, or head reference. A linked list can also be empty, which
is indicated when the head reference is null.
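
The role of the head reference can be sketched as follows. The helper name build_list is hypothetical; it builds a linked list from a Python list by repeatedly placing a new node in front of the current head.

```python
class ListNode:
    def __init__(self, data):
        self.data = data
        self.next = None


def build_list(values):
    """Return the head reference of a linked list holding the given values."""
    head = None                      # an empty list has a null head reference
    for value in reversed(values):
        node = ListNode(value)
        node.next = head             # the new node refers to the old first node
        head = node                  # the new node becomes the first node
    return head


head = build_list([11, 52, 18])
empty = build_list([])
```

Only head is needed to reach every node: the second node is head.next, the third is head.next.next, and the tail node is the one whose next field is None.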

5.3.1 Traversing a Link List:

To traverse the linked list we use a temporary external reference traNode that initially points
to the first node of the list. After entering the loop, the value stored in the first node is
printed by accessing the data component stored in the node using the external reference. The
external reference is then advanced to the next node by assigning it the value of the current
node's link field. The loop iteration continues until every node in the list has been accessed.
The completion of the traversal is determined when traNode becomes null. After accessing the
last node in the list, traNode is advanced to the next node, but there being no next node,
traNode is assigned None from the next field of the last node.

def traversal( head ):
    traNode = head
    while traNode is not None :
        print( traNode.data )
        traNode = traNode.next
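
The same traversal pattern can be used for more than printing. The sketch below, with the hypothetical function names length and contains, counts the nodes and searches for a value; a minimal ListNode class is repeated so that the example is self-contained.

```python
class ListNode:
    def __init__(self, data):
        self.data = data
        self.next = None


def length(head):
    """Count the nodes by visiting each one exactly once."""
    count = 0
    node = head
    while node is not None:
        count += 1
        node = node.next
    return count


def contains(head, target):
    """Return True if some node stores the target value."""
    node = head
    while node is not None:
        if node.data == target:
            return True
        node = node.next
    return False


head = ListNode(11)
head.next = ListNode(52)
head.next.next = ListNode(18)
```

Both functions also work for an empty list, since the loop body is never entered when head is None.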

5.3.2 Removing Nodes:

An item can be removed from a linked list by removing or unlinking the node containing that
item. Consider the linked list from Figure 6.4(c) and assume we want to remove the node
containing 18. First, we must find the node containing the target value and position an external
reference variable pointing to it. After finding the node, it has to be unlinked from the list, which
entails adjusting the link field of the node's predecessor to point to its successor. The node's link
field is also cleared by setting it to None.

Removing a node from a linked list:

# Given the head reference, remove a target from a linked list.
predNode = None
curNode = head

while curNode is not None and curNode.data != target :
    predNode = curNode
    curNode = curNode.next

if curNode is not None :
    if curNode is head :
        head = curNode.next
    else :
        predNode.next = curNode.next

5.4 Application: Polynomials

Polynomials, which are an important concept throughout mathematics and science, are arithmetic
expressions specified in terms of variables and constants. A polynomial in one variable can be
expressed in expanded form as

a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + ... + a_1 x^1 + a_0

where each a_i x^i component is called a term. The a_i part of the term, which is a scalar that
can be zero, is called the coefficient of the term. The exponent of the x^i part is called the
degree of that variable and is limited to whole numbers. For example,

15x^2 - 17x + 2

Polynomials can be characterized by degree (i.e., all second-degree polynomials). The degree of
a polynomial is the largest single degree of its terms. The example polynomial above has a
degree of 2 since its first term, 15x^2, has the largest degree. In this section, we design and
implement an abstract data type to represent polynomials in one variable expressed in expanded
form.

# Implementation of the Polynomial ADT using a sorted linked list.

class Polynomial :
    # Create an empty polynomial or one consisting of a single term.
    def __init__( self, degree = None, coefficient = None ):
        if degree is None :
            self._polyHead = None
        else :
            self._polyHead = _PolyTermNode( degree, coefficient )
        self._polyTail = self._polyHead

    # Return the degree of the polynomial.
    def degree( self ):
        if self._polyHead is None :
            return -1
        else :
            return self._polyHead.degree

    # Return the coefficient for the term of the given degree.
    def __getitem__( self, degree ):
        assert self.degree() >= 0, \
            "Operation not permitted on an empty polynomial."
        curNode = self._polyHead
        # The terms are stored in decreasing order of degree, so skip
        # the terms whose degree is larger than the one requested.
        while curNode is not None and curNode.degree > degree :
            curNode = curNode.next

        if curNode is None or curNode.degree != degree :
            return 0.0
        else :
            return curNode.coefficient

    # Evaluate the polynomial at the given scalar value.
    def evaluate( self, scalar ):
        assert self.degree() >= 0, \
            "Only non-empty polynomials can be evaluated."
        result = 0.0
        curNode = self._polyHead
        while curNode is not None :
            result += curNode.coefficient * (scalar ** curNode.degree)
            curNode = curNode.next
        return result

5.4.1 Polynomial Operations

Polynomial Addition:

The addition of two polynomials can be performed for our linked list implementation using a
simple brute-force method, as illustrated in the code segment below:

def simple_add( self, rhsPoly ):
    newPoly = Polynomial()
    if self.degree() > rhsPoly.degree() :
        maxDegree = self.degree()
    else :
        maxDegree = rhsPoly.degree()

    # Add the coefficients degree by degree, from the largest down to 0.
    i = maxDegree
    while i >= 0 :
        value = self[i] + rhsPoly[i]
        newPoly._appendTerm( i, value )
        i -= 1
    return newPoly
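
The pieces of the polynomial implementation can be put together in a compact, runnable sketch. The listings in this section do not show the _PolyTermNode class or the _appendTerm() helper, so the versions below are assumptions made for illustration: the node simply stores a degree, a coefficient and a link, and _appendTerm() adds a non-zero term at the tail of the list. The assertions on empty polynomials are omitted here to keep the sketch short.

```python
class _PolyTermNode:
    # Assumed storage class: one term of the polynomial.
    def __init__(self, degree, coefficient):
        self.degree = degree
        self.coefficient = coefficient
        self.next = None


class Polynomial:
    def __init__(self, degree=None, coefficient=None):
        if degree is None:
            self._polyHead = None
        else:
            self._polyHead = _PolyTermNode(degree, coefficient)
        self._polyTail = self._polyHead

    def degree(self):
        return -1 if self._polyHead is None else self._polyHead.degree

    def __getitem__(self, degree):
        curNode = self._polyHead
        while curNode is not None and curNode.degree > degree:
            curNode = curNode.next
        if curNode is None or curNode.degree != degree:
            return 0.0
        return curNode.coefficient

    def evaluate(self, scalar):
        result = 0.0
        curNode = self._polyHead
        while curNode is not None:
            result += curNode.coefficient * (scalar ** curNode.degree)
            curNode = curNode.next
        return result

    def _appendTerm(self, degree, coefficient):
        # Assumed helper: terms must be appended in decreasing degree order.
        if coefficient != 0.0:
            node = _PolyTermNode(degree, coefficient)
            if self._polyHead is None:
                self._polyHead = node
            else:
                self._polyTail.next = node
            self._polyTail = node

    def simple_add(self, rhsPoly):
        newPoly = Polynomial()
        i = max(self.degree(), rhsPoly.degree())
        while i >= 0:
            newPoly._appendTerm(i, self[i] + rhsPoly[i])
            i -= 1
        return newPoly


# p = 15x^2 - 17x + 2 and q = 17x, so p + q = 15x^2 + 2.
p = Polynomial(2, 15.0)
p._appendTerm(1, -17.0)
p._appendTerm(0, 2.0)
q = Polynomial(1, 17.0)
total = p.simple_add(q)
```

The brute-force addition looks up every degree from the maximum down to 0 in both polynomials, so each lookup traverses the list again; this is simple to write but not the most efficient approach.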

Polynomial Multiplication:

def multiply( self, rhsPoly ):
    assert self.degree() >= 0 and rhsPoly.degree() >= 0, \
        "Multiplication only allowed on non-empty polynomials."

    # Create a new polynomial by multiplying rhsPoly by the first term.
    node = self._polyHead
    newPoly = rhsPoly._termMultiply( node )

    # Iterate through the remaining terms of the poly computing the
    # product of the rhsPoly by each term.
    node = node.next
    while node is not None :
        tempPoly = rhsPoly._termMultiply( node )
        # Accumulate the partial product using the simple_add() method.
        newPoly = newPoly.simple_add( tempPoly )
        node = node.next

    return newPoly

# Helper method for creating a new polynomial from multiplying an
# existing polynomial by another term.
def _termMultiply( self, termNode ):
    newPoly = Polynomial()

    # Iterate through the terms and compute the product of each term and
    # the term in termNode, starting from the first term.
    curr = self._polyHead
    while curr is not None :
        # Compute the product of the term.
        newDegree = curr.degree + termNode.degree
        newCoeff = curr.coefficient * termNode.coefficient

        # Append it to the new polynomial.
        newPoly._appendTerm( newDegree, newCoeff )

        # Advance the current pointer.
        curr = curr.next

    return newPoly

5.5 Implementing Stack Using a Linked List

This section discusses the implementation of a commonly used data structure, the stack, using a
linked list. A stack is a linear data structure which allows insertion and deletion at only one
end. The insertion operation is called PUSH and the deletion operation is called POP. Elements
are removed in the opposite order to that in which they were added. This behaviour is called
last in first out (LIFO).

To use a linked list, we again must decide how to represent the stack structure. With the Python
list implementation of the stack, it was most efficient to use the end of the list as the top of the
stack. With a linked list, however, the front of the list provides the most efficient representation
for the top of the stack.

# Implementation of the Stack ADT using a singly linked list.

class Stack :
    # Creates an empty stack.
    def __init__( self ):
        self._top = None
        self._size = 0

    # Returns True if the stack is empty or False otherwise.
    def isEmpty( self ):
        return self._top is None

    # Returns the number of items in the stack.
    def __len__( self ):
        return self._size

    # Returns the top item on the stack without removing it.
    def peek( self ):
        assert not self.isEmpty(), "Cannot peek at an empty stack"
        return self._top.item

    # Removes and returns the top item on the stack.
    def pop( self ):
        assert not self.isEmpty(), "Cannot pop from an empty stack"
        node = self._top
        self._top = self._top.next
        self._size -= 1
        return node.item

    # Pushes an item onto the top of the stack.
    def push( self, item ) :
        self._top = _StackNode( item, self._top )
        self._size += 1

# The private storage class for creating stack nodes.
class _StackNode :
    def __init__( self, item, link ) :
        self.item = item
        self.next = link

5.6 Queue data structure: Queue Using a Linked List

A linear data structure is known as a queue if deletions are performed at the beginning (the
front) and insertions are performed at the end (the rear) of the list. In a queue, information
is processed in the same order in which it was received, on a first-in first-out (FIFO) or
first-come first-served (FCFS) basis.

A major disadvantage in using a Python list to implement the Queue ADT is the expense
of the enqueue and dequeue operations. The circular array implementation improved on these
operations, but at the cost of limiting the size of the queue. A better solution is to use a linked list
consisting of both head and tail references. To work on a queue, we need to maintain two
references, qhead and qtail as shown below:

# Implementation of the Queue ADT using a linked list.

class Queue :
    # Creates an empty queue.
    def __init__( self ):
        self._qhead = None
        self._qtail = None
        self._count = 0

    # Returns True if the queue is empty.
    def isEmpty( self ):
        return self._qhead is None

    # Returns the number of items in the queue.
    def __len__( self ):
        return self._count

    # Adds the given item to the queue.
    def enqueue( self, item ):
        node = _QueueNode( item )
        if self.isEmpty() :
            self._qhead = node
        else :
            self._qtail.next = node

        self._qtail = node
        self._count += 1

    # Removes and returns the first item in the queue.
    def dequeue( self ):
        assert not self.isEmpty(), "Cannot dequeue from an empty queue."
        node = self._qhead
        if self._qhead is self._qtail :
            self._qtail = None
        self._qhead = self._qhead.next
        self._count -= 1
        return node.item

# Private storage class for creating the linked list nodes.
class _QueueNode( object ):
    def __init__( self, item ):
        self.item = item
        self.next = None

5.7 Difference between Array and Linked List


Both the linked list and the array are used to store linear data of similar type, but a
statically declared array occupies contiguous memory locations whose size is fixed at the time
the array is declared (at compile time in languages such as C), while for a linked list, memory
is assigned as and when data is added to it, that is, at runtime.
This is the basic and most important difference between a linked list and an array. In the
section below, we discuss this in detail and highlight other differences.
5.8 Linked List vs. Array
The array is a data type that is provided as a built-in type in almost all modern
programming languages, and is used to store data of similar type.
But there are many use cases, such as when we do not know in advance the quantity of data to be
stored, for which more advanced data structures are required. One such data structure is the
linked list.
Let us see how an array differs from a linked list.

ARRAY | LINKED LIST

An array is a collection of elements of similar data type. | A linked list is an ordered collection of elements of the same type, which are connected to each other using pointers.

An array supports random access, which means elements can be accessed directly using their index, like arr[0] for the 1st element, arr[6] for the 7th element etc. Hence, accessing elements in an array is fast, with a constant time complexity of O(1). | A linked list supports sequential access, which means that to access any element/node in a linked list, we have to sequentially traverse the linked list up to that element. To access the nth element of a linked list, the time complexity is O(n).

In an array, elements are stored in contiguous memory locations, in a consecutive manner in the memory. | In a linked list, new elements can be stored anywhere in the memory. The address of the memory location allocated to the new element is stored in the previous node of the linked list, hence forming a link between the two nodes/elements.

In an array, insertion and deletion operations take more time, as the memory locations are consecutive and fixed. | In a linked list, a new element is stored at the first free and available memory location, with only a single overhead step of storing the address of the memory location in the previous node of the linked list. Insertion and deletion operations are fast in a linked list.

Memory is allocated as soon as the array is declared, at compile time. This is also known as Static Memory Allocation. | Memory is allocated at runtime, as and when a new node is added. This is also known as Dynamic Memory Allocation.

In an array, each element is independent and can be accessed using its index value. | In a linked list, each node/element points to the next, previous, or maybe both nodes.

An array can be single-dimensional, two-dimensional or multidimensional. | A linked list can be a linear (singly), doubly or circular linked list.

The size of the array must be specified at the time of array declaration. | The size of a linked list is variable. It grows at runtime, as more nodes are added to it.

An array gets memory allocated in the stack section. | A linked list gets memory allocated in the heap section.

Below we have a pictorial representation showing how consecutive memory locations are
allocated for array, while in case of linked list random memory locations are assigned to nodes,
but each node is connected to its next node using pointer.

On the left, we have Array and on the right, we have Linked List.

An array is the most basic sequence container used to store and access a collection of
data. It provides easy and direct access to the individual elements and is supported at the
hardware level. But arrays are limited in their functionality. The Python list, which is also a
sequence container, is an abstract sequence type implemented using an array structure. It extends
the functionality of an array by providing a larger set of operations than the array, and it can
automatically adjust in size as items are added or removed.
The array and Python list can be used to implement many different abstract data types.
They both store data in linear order and provide easy access to their elements. The binary search
can be used with both structures when the items are stored in sorted order to allow for quick
searches. But there are several disadvantages in the use of the array and Python list. First,
insertion and deletion operations typically require items to be shifted to make room or close a
gap. This can be time consuming, especially for large sequences. Second, the size of an array is
fixed and cannot change. While the Python list does provide for an expandable collection, that
expansion does not come without a cost. Since the elements of a Python list are stored in an
array, an expansion requires the creation of a new larger array into which the elements of the
original array have to be copied.

Finally, the elements of an array are stored in contiguous bytes of memory, no matter the
size of the array. Each time an array is created, the program must find and allocate a block of
memory large enough to store the entire array. For large arrays, it can be difficult or impossible
for the program to locate a block of memory into which the array can be stored. This is
especially true in the case of a Python list that grows larger during the execution of a program
since each expansion requires ever larger blocks of memory. In this chapter, we introduce the
linked list data structure, which is a general purpose structure that can be used to store a
collection in linear order. The linked list improves on the construction and management of an
array and Python list by requiring smaller memory allocations and no element shifts for
insertions and deletions. But it does eliminate the constant time direct element access available
with the array and Python list. Thus, it's not suitable for every data storage problem. There are
several varieties of linked lists. The singly linked list is a linear structure in which traversals start
at the front and progress, one element at a time, to the end. Other variations include the circularly
linked, the doubly linked, and the circularly doubly linked lists.
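
The trade-off described above can be illustrated with a minimal sketch of a singly linked list. The names ListNode, prepend, traverse and nth are hypothetical and chosen only for this illustration. Adding a node at the front needs no element shifts, while reaching the nth element requires walking the links one at a time.

```python
class ListNode:
    """One node of a singly linked list (hypothetical helper class)."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def prepend(head, data):
    """Insert a new node at the front in O(1); no elements are shifted."""
    return ListNode(data, head)


def traverse(head):
    """Visit the nodes from the front to the end, one element at a time."""
    items = []
    node = head
    while node is not None:
        items.append(node.data)
        node = node.next
    return items


def nth(head, n):
    """Return the n-th element (0-based); this walks n links, so it is O(n)."""
    node = head
    for _ in range(n):
        if node is None:
            raise IndexError("index out of range")
        node = node.next
    if node is None:
        raise IndexError("index out of range")
    return node.data


head = None
for value in [3, 2, 1]:
    head = prepend(head, value)

print(traverse(head))   # [1, 2, 3]
print(nth(head, 2))     # 3
```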

5.9 Review Questions:

Q1. What is a linear data structure? Explain the following operations on a linked list with the
help of an algorithm.
a) Insertion
b) Deletion
c) Traversing

Q2. Explain the difference between the array and linked list implementations.

Q3. What is a linked list? Represent the following polynomial with the help of a linked list:
3x^3 + 5x^2 + 7 = 0
Q5. Give an idea of how polynomial addition can be done using linked lists.
Q6. Implement a STACK with a linked list.
Q7. Implement a QUEUE with a linked list.
6 Trees

Objectives
After completing this chapter, students will be able to

- Understand the concepts of Trees and Binary Trees
- Define binary tree as ADT
- Implement Binary Trees and Tree Traversal Algorithms
- Implement Search Trees: Binary Search Trees, Balanced Search Trees, Python
Framework for Balancing Search Trees, AVL Trees, Splay Trees, Red-Black Trees
- Understand Heaps, Maps, Hash Tables, and Skip Lists

6.1 Trees: Concepts of Trees and Binary Trees

In the previous chapters, we discussed linear data structures such as arrays, stacks, queues and
linked lists. In these data structures the elements follow one another in a linear pattern. A tree
is a nonlinear data structure: each element may have more than one next element. In a tree, data
elements are organized so that items of information are related by branches. There are many
applications where we can use tree data structures. For example, a pedigree chart shows
someone's ancestors.

A tree structure consists of nodes and edges that organize data in a hierarchical fashion. The
relationships between data elements in a tree are similar to those of a family tree: "child,"
"parent," "ancestor," etc. The data elements are stored in nodes and pairs of nodes are connected
by edges. The edges represent the relationship between the nodes that are linked with arrows or
directed edges to form a hierarchical structure resembling an upside-down tree complete with
branches, leaves, and even a root. Formally, we can define a tree as a set of nodes that either is
empty or has a node called the root that is connected by edges to zero or more subtrees to form a
hierarchical structure. Each subtree is itself, by definition, a tree. A classic example of a tree
structure is the representation of directories and subdirectories in a file system.
Let us understand the basic terminology of the tree data structure.

A tree is a connected undirected graph with no simple circuits. An undirected graph is a tree if
and only if there is a unique simple path between any two of its vertices. A rooted tree is a tree
in which one vertex has been designated as the root and every edge is directed away from the
root. Different choices of root produce different rooted trees.

6.2 Properties of Rooted Trees


Parent – A vertex is a parent if it has one or more children. Every vertex other than the root
has exactly one parent.
• The parent of c is b

Children – If A is a vertex with successors B and C, then B and C are the children of A.
• The children of a are b, f and g

Siblings – Children with the same parent vertex.
• h, i and j are siblings

Level – The length of the unique path from the root to a vertex.
• Vertex a is at level 0
• Vertices d and e are at level 3

Height – The maximum level of all the vertices.
• The height of this tree is 3

Ancestors of a vertex v – The vertices on the path from the root to v, excluding v itself.
• The ancestors of e are c, b and a

Descendants of a vertex v – The vertices that have v as an ancestor.
• The descendants of b are c, d and e

Leaf – A vertex with no children.
• The leaves are d, e, f, i, k, l and m

Internal vertices – Vertices that have children.
• The internal vertices are a, b, c, g, h and j

Subtree – A subgraph of the tree consisting of a root and its descendants and all edges incident
to these descendants.
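
The definitions above can be checked with a short sketch that stores a rooted tree as a dictionary of child lists. The edges below follow the examples in this section; the exact placement of k, l and m under h and j is an assumption, because the text only states that h and j are internal vertices and that k, l and m are leaves. All function names are hypothetical.

```python
# Child lists of the example rooted tree. The placement of k, l and m is an
# assumption; the remaining edges follow the examples given in the text.
TREE = {
    "a": ["b", "f", "g"],
    "b": ["c"],
    "c": ["d", "e"],
    "g": ["h", "i", "j"],
    "h": ["k"],
    "j": ["l", "m"],
}
ROOT = "a"


def children(v):
    return TREE.get(v, [])


def ancestors(v):
    """Vertices on the path from v up to the root, excluding v itself."""
    parent = {c: p for p, kids in TREE.items() for c in kids}
    path = []
    while v in parent:
        v = parent[v]
        path.append(v)
    return path


def level(v):
    """Length of the unique path from the root to v."""
    return len(ancestors(v))


def descendants(v):
    """All vertices that have v as an ancestor."""
    result = []
    for c in children(v):
        result.append(c)
        result.extend(descendants(c))
    return result


def vertices():
    return [ROOT] + descendants(ROOT)


def leaves():
    return [v for v in vertices() if not children(v)]


def height():
    """Maximum level over all vertices."""
    return max(level(v) for v in vertices())


print(ancestors("e"))      # ['c', 'b', 'a']
print(descendants("b"))    # ['c', 'd', 'e']
print(sorted(leaves()))    # ['d', 'e', 'f', 'i', 'k', 'l', 'm']
print(height())            # 3
```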

6.3 m-ary Tree

► A rooted tree is called an m-ary tree if every vertex has no more than m children.
► The tree is called a full m-ary tree if every internal vertex has exactly m children.
► A rooted m-ary tree of height h is balanced if all leaves are at levels h or h-1.

6.4 Tree Traversal

Ordered trees are often used to store data. Tree traversal is a procedure for systematically
visiting each vertex of an ordered rooted tree to access data. If the tree is labeled using the
Universal Address System, we can totally order the vertices using lexicographic ordering.

• Example: 0 < 1 < 1.1 < 1.2 < 1.2.1 < 1.3 < 2 < 3 < 3.1 < 3.1.1 < 3.1.2 < 3.1.2.1 < 3.1.2.2 < 4 < 4.1

The following are the different tree traversal algorithms:

• Preorder traversal
• Inorder traversal
• Postorder traversal

6.4.1 Preorder Traversal

• Let T be an ordered rooted tree with root r.
• If T consists only of r, then r is the preorder traversal of T.
• If T1, T2, …, Tn are subtrees at r from left to right in T, then the preorder traversal
begins by visiting r, continues by traversing T1 in preorder, then T2 in preorder,
and so on until Tn is traversed in preorder.

6.4.2 Inorder Traversal

• Let T be an ordered rooted tree with root r.
• If T consists only of r, then r is the inorder traversal of T.
• If T1, T2, …, Tn are subtrees at r from left to right in T, then the inorder traversal
begins by traversing T1 in inorder, then visiting r, continues by traversing T2 in
inorder, and so on until Tn is traversed in inorder.

6.4.3 Postorder Traversal

• Let T be an ordered rooted tree with root r.
• If T consists only of r, then r is the postorder traversal of T.
• If T1, T2, …, Tn are subtrees at r from left to right in T, then the postorder traversal
begins by traversing T1 in postorder, then T2 in postorder, and so on until Tn is
traversed in postorder and ends by visiting r.
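
The three definitions translate almost directly into recursive functions. The following is a minimal sketch, assuming a tree is stored as a pair (label, list of subtrees from left to right); this representation and the example tree are hypothetical and chosen only for illustration.

```python
def preorder(tree):
    """Visit the root, then each subtree from left to right."""
    label, subtrees = tree
    result = [label]
    for t in subtrees:
        result += preorder(t)
    return result


def inorder(tree):
    """Traverse T1, visit the root, then traverse T2, ..., Tn."""
    label, subtrees = tree
    if not subtrees:
        return [label]
    result = inorder(subtrees[0]) + [label]
    for t in subtrees[1:]:
        result += inorder(t)
    return result


def postorder(tree):
    """Traverse every subtree from left to right, then visit the root."""
    label, subtrees = tree
    result = []
    for t in subtrees:
        result += postorder(t)
    return result + [label]


# Hypothetical example: a has children b, c, d; b has children e, f.
tree = ("a", [("b", [("e", []), ("f", [])]), ("c", []), ("d", [])])

print(preorder(tree))    # ['a', 'b', 'e', 'f', 'c', 'd']
print(inorder(tree))     # ['e', 'b', 'f', 'a', 'c', 'd']
print(postorder(tree))   # ['e', 'f', 'b', 'c', 'd', 'a']
```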
6.5 Represent Expression by Rooted Tree

We can represent complicated expressions (propositions, sets, arithmetic) using ordered rooted
trees.
EXAMPLE: A binary tree representing ((x+y)↑2)+((x-4)/3)

Infix, Prefix & Postfix Notation

• We obtain the infix form of an expression when we traverse its rooted tree in inorder.
The infix form for expression ((x+y)↑2)+((x-4)/3) is x + y ↑ 2 + x – 4 / 3. Without
parentheses this form is ambiguous, so it is written fully parenthesized as ((x+y)↑2)+((x-4)/3).

• We obtain the prefix form of an expression when we traverse its rooted tree in preorder.
The prefix form for expression ((x+y)↑2)+((x-4)/3) is + ↑ + x y 2 / - x 4 3

• We obtain the postfix form of an expression when we traverse its rooted tree in postorder.
The postfix form for expression ((x+y)↑2)+((x-4)/3) is x y + 2 ↑ x 4 – 3 / +
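
As a sketch of how the three notations come out of one tree, the expression above can be stored as nested tuples of the form (operator, left, right), with variables and constants as plain strings. This representation and the function names are assumptions made for the illustration.

```python
# Expression tree for ((x+y)↑2)+((x-4)/3), stored as (operator, left, right).
EXPR = ("+",
        ("↑", ("+", "x", "y"), "2"),
        ("/", ("-", "x", "4"), "3"))


def prefix(node):
    """Preorder traversal: operator first, then both operands."""
    if isinstance(node, str):
        return [node]
    op, left, right = node
    return [op] + prefix(left) + prefix(right)


def postfix(node):
    """Postorder traversal: both operands first, then the operator."""
    if isinstance(node, str):
        return [node]
    op, left, right = node
    return postfix(left) + postfix(right) + [op]


def infix(node):
    """Inorder traversal, fully parenthesized to avoid ambiguity."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return "(" + infix(left) + op + infix(right) + ")"


print(" ".join(prefix(EXPR)))    # + ↑ + x y 2 / - x 4 3
print(" ".join(postfix(EXPR)))   # x y + 2 ↑ x 4 - 3 / +
print(infix(EXPR))               # (((x+y)↑2)+((x-4)/3))
```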


6.6 Evaluating Prefix Expression

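A prefix expression is evaluated by working from right to left: each time an operator is met, it
is applied to the two operands immediately to its right, and the result replaces them.
EXAMPLE:
The value of the prefix expression + - * 2 3 5 / ↑ 2 3 4 is 3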
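
One common way to carry out this right-to-left evaluation is with a stack. The sketch below assumes space-separated tokens and the five binary operators used in this chapter; the names OPS and eval_prefix are hypothetical.

```python
import operator

# Binary operators used in the examples; ↑ denotes exponentiation.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "↑": operator.pow,
}


def eval_prefix(tokens):
    """Evaluate a prefix expression by scanning the tokens right to left."""
    stack = []
    for tok in reversed(tokens):
        if tok in OPS:
            first = stack.pop()    # operand immediately right of the operator
            second = stack.pop()   # the operand after that
            stack.append(OPS[tok](first, second))
        else:
            stack.append(float(tok))
    return stack.pop()


print(eval_prefix("+ - * 2 3 5 / ↑ 2 3 4".split()))   # 3.0
```

For the example, ↑ 2 3 gives 8, / 8 4 gives 2, * 2 3 gives 6, - 6 5 gives 1, and finally + 1 2 gives 3.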

6.7 Evaluating Postfix Expression


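A postfix expression is evaluated by working from left to right: each time an operator is met, it
is applied to the two operands immediately to its left, and the result replaces them.
EXAMPLE:
The value of the postfix expression 7 2 3 * - 4 ↑ 9 3 / + is 4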
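
The left-to-right evaluation can also be sketched with a stack. As before, space-separated tokens are assumed, and the names OPS and eval_postfix are hypothetical.

```python
import operator

# Binary operators used in the examples; ↑ denotes exponentiation.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "↑": operator.pow,
}


def eval_postfix(tokens):
    """Evaluate a postfix expression by scanning the tokens left to right."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()   # operand immediately left of the operator
            left = stack.pop()    # the operand before that
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()


print(eval_postfix("7 2 3 * - 4 ↑ 9 3 / +".split()))   # 4.0
```

For the example, 2 3 * gives 6, 7 6 - gives 1, 1 4 ↑ gives 1, 9 3 / gives 3, and finally 1 3 + gives 4.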

6.8 Tree Searching (Binary Search Tree):

Tree searches:
◼ A tree search starts at the root and explores nodes from there, looking for a goal node (a
node that satisfies certain conditions, depending on the problem)

◼ For some problems, any goal node is acceptable (N or J); for other problems, you want a
minimum-depth goal node, that is, a goal node nearest the root (only J)
6.8.1 Depth-first searching
◼ A depth-first search (DFS) explores a path all the way to a leaf before backtracking and
exploring another path

◼ For example, after searching A, then B, then D, the search backtracks and tries another
path from B

◼ Nodes are explored in the order A B D E H L M N I O P C F G J K Q

◼ N will be found before J


6.8.2 How to do depth-first searching
◼ Put the root node on a stack;
   while (stack is not empty) {
       remove a node from the stack;
       if (node is a goal node) return success;
       put all children of node onto the stack;
   }
   return failure;

◼ At each step, the stack contains some nodes from each of a number of levels

◼ The size of stack that is required depends on the branching factor b

◼ While searching level n, the stack contains approximately b*n nodes

◼ When this method succeeds, it doesn’t give the path
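
The stack-based procedure above can be sketched in Python as follows. The original figure is not reproduced here, so the TREE dictionary is a reconstruction from the visiting orders quoted in this section and should be read as an assumption; the function name dfs is also hypothetical. Children are pushed in reverse so that the leftmost child is removed from the stack first, which reproduces the order given in the text.

```python
# Child lists reconstructed from the DFS and BFS orders quoted in the text
# (an assumption; the original figure is not available).
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "E": ["H", "I"],
    "G": ["J", "K"],
    "H": ["L", "M", "N"],
    "I": ["O", "P"],
    "K": ["Q"],
}


def dfs(root, goals):
    """Iterative depth-first search; returns (goal found or None, visit order)."""
    stack = [root]
    visited = []
    while stack:
        node = stack.pop()
        visited.append(node)
        if node in goals:
            return node, visited
        # Push children right to left so the leftmost child is popped first.
        stack.extend(reversed(TREE.get(node, [])))
    return None, visited


goal, order = dfs("A", {"N", "J"})
print(goal)              # N
print(" ".join(order))   # A B D E H L M N
```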

Recursive depth-first search

◼ search(node):
       if node is a goal, return success;
       for each child c of node {
           if search(c) is successful, return success;
       }
       return failure;

◼ The (implicit) stack contains only the nodes on a path from the root to a goal

◼ The stack only needs to be large enough to hold the deepest search path

◼ When a solution is found, the path is on the (implicit) stack, and can be extracted
as the recursion “unwinds”
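
A minimal Python sketch of the recursive version, which also shows how the path can be collected while the recursion unwinds. The TREE dictionary is reconstructed from the visiting orders quoted in this section and is an assumption, as is the function name search.

```python
# Child lists reconstructed from the visiting orders quoted in the text
# (an assumption; the original figure is not available).
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "E": ["H", "I"],
    "G": ["J", "K"],
    "H": ["L", "M", "N"],
    "I": ["O", "P"],
    "K": ["Q"],
}


def search(node, goals):
    """Recursive depth-first search.

    Returns the path from node to the first goal found, or None on failure.
    """
    if node in goals:
        return [node]
    for child in TREE.get(node, []):
        path = search(child, goals)
        if path is not None:
            # The path is assembled as the recursion unwinds.
            return [node] + path
    return None


print(search("A", {"N", "J"}))   # ['A', 'B', 'E', 'H', 'N']
print(search("A", {"J"}))        # ['A', 'C', 'G', 'J']
```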

6.8.3 Breadth-first searching

◼ A breadth-first search (BFS) explores nodes nearest the root before exploring nodes
further away

◼ For example, after searching A, then B, then C, the search proceeds with D, E, F, G

◼ Nodes are explored in the order A B C D E F G H I J K L M N O P Q

◼ J will be found before N

6.8.3.1 How to do breadth-first searching
◼ Put the root node on a queue;
   while (queue is not empty) {
       remove a node from the queue;
       if (node is a goal node) return success;
       put all children of node onto the queue;
   }
   return failure;

◼ Just before starting to explore level n, the queue holds all the nodes at level n-1

◼ In a typical tree, the number of nodes at each level increases exponentially with the depth

◼ Memory requirements may be infeasible

◼ When this method succeeds, it doesn’t give the path

◼ There is no “recursive” breadth-first search equivalent to recursive depth-first search
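
The queue-based procedure can be sketched with collections.deque from the Python standard library. The TREE dictionary is reconstructed from the visiting orders quoted in this section and is an assumption, as is the function name bfs.

```python
from collections import deque

# Child lists reconstructed from the visiting orders quoted in the text
# (an assumption; the original figure is not available).
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "E": ["H", "I"],
    "G": ["J", "K"],
    "H": ["L", "M", "N"],
    "I": ["O", "P"],
    "K": ["Q"],
}


def bfs(root, goals):
    """Breadth-first search; returns (goal found or None, visit order)."""
    queue = deque([root])
    visited = []
    while queue:
        node = queue.popleft()   # remove from the front of the queue
        visited.append(node)
        if node in goals:
            return node, visited
        queue.extend(TREE.get(node, []))   # add children at the back
    return None, visited


goal, order = bfs("A", {"N", "J"})
print(goal)              # J
print(" ".join(order))   # A B C D E F G H I J
```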

6.8.4 Comparison of algorithms (DFS Vs. BFS)

◼ Depth-first searching:

   Put the root node on a stack;
   while (stack is not empty) {
       remove a node from the stack;
       if (node is a goal node) return success;
       put all children of node onto the stack;
   }
   return failure;

◼ Breadth-first searching:

   Put the root node on a queue;
   while (queue is not empty) {
       remove a node from the queue;
       if (node is a goal node) return success;
       put all children of node onto the queue;
   }
   return failure;

Depth- vs. breadth-first searching


◼ When a breadth-first search succeeds, it finds a minimum-depth (nearest the root) goal
node

◼ When a depth-first search succeeds, the found goal node is not necessarily minimum
depth

◼ For a large tree, breadth-first search memory requirements may be excessive

◼ For a large tree, a depth-first search may take an excessively long time to find even a very
nearby goal node

◼ How can we combine the advantages (and avoid the disadvantages) of these two search
techniques?

6.9 Binary tree:

If the degree of every node in a tree is less than or equal to 2, the tree is called a binary tree.

A binary tree distinguishes between the left and right subtree of each node. So we define
the binary tree as follows: “A binary tree is a finite set of nodes which is either empty or consists
of a root and two disjoint binary trees called the left subtree and the right subtree.”

A binary tree is shown in the following diagram:

[Figure: a binary tree with nodes B and C on the second level and nodes D, E and F on the third level]
6.9.1 Complete or full binary tree:

A binary tree in which each node has degree 0 or 2 and all the leaves are at the same level is
called a complete binary tree. The following diagram shows a complete or full binary tree:

Binary trees are different from general trees. A binary tree distinguishes between left and right
subtrees, whereas in a general tree there is no notion of a left or right subtree.

Let us consider the following scenario, where options (1), (2) and (3) are shown in the diagram
given below.

If we consider them as general trees, then options (1), (2) and (3) are the same.

But if we consider them as binary trees, then option (1) has a left subtree and no right
subtree, in option (2) B is called the left child of A, and option (3) has no meaning.

[Figure: three trees labeled Option (1), Option (2) and Option (3), each with root A and a single child B drawn in a different position]


6.9.2 Properties of a binary tree:
• A binary tree with N internal (non-terminal) nodes has a maximum of (N + 1) external
(terminal) nodes. The root is considered an internal node.
• A full binary tree with N nodes has log2(N + 1) levels. With the root at level 0, its height is
therefore log2(N + 1) - 1.

6.10 Binary Tree ADT:

Binary trees are commonly implemented as a dynamic structure in the same fashion as linked lists. A
binary tree is a data structure that can be used to implement many different abstract data types. Since the
operations that a binary tree supports depend on its application, we are going to create and work with the
trees directly instead of creating a generic binary tree class. Trees are generally illustrated as abstract
structures with the nodes represented as circles or boxes and the edges as lines or arrows. To implement a
binary tree, however, we must explicitly store in each node the links to the two children along with the
data stored in that node. We define the _BinTreeNode storage class for creating the nodes in a binary tree.
Like other storage classes, the tree node class is meant for internal use only.

# The storage class for creating binary tree nodes.

class _BinTreeNode :
    def __init__( self, data ):
        self.data = data
        self.left = None
        self.right = None

6.11 Binary Search Tree:

A linear search of an array or Python list is very slow, but that can be improved with a binary search.
Even with the improved search time, arrays and Python lists have a disadvantage when it comes to the
insertion and deletion of search keys. Remember, a binary search can only be performed on a sorted
sequence. When keys are added to or removed from an array or Python list, the order must be maintained.
This can be time consuming since keys have to be shifted to make room when adding a new key or to
close the gap when deleting an existing key. The use of a linked list provides faster insertions and
deletions without having to shift the existing keys. Unfortunately, the only type of search that can be
performed on a linked list is a linear search, even if the list is sorted.

The tree structure can be used to organize dynamic data in a hierarchical fashion. Trees come in various
shapes and sizes depending on their application and the relationship between the nodes. When used for
searching, each node contains a search key as part of its data entry (sometimes called the payload) and the
nodes are organized based on the relationship between the keys. There are many different types of search
trees, some of which are simply variations of others, and some that can be used to search data stored
externally. But the primary goal of all search trees is to provide an efficient search operation for quickly
locating a specific item contained in the tree.
A binary search tree (BST) is a binary tree in which each node contains a search key within its payload
and the tree is structured such that for each interior node V :

- All keys less than the key in node V are stored in the left subtree of V .
- All keys greater than the key in node V are stored in the right subtree of V .
Consider the binary search tree in the figure below, which contains integer search keys. The root node
contains key value 60, all keys in the root's left subtree are less than 60, and all of the keys in the right
subtree are greater than 60. If you examine every node in the tree, you will notice the same key
relationship applies to every node. Given the relationship between the nodes, an inorder
traversal will visit the nodes in increasing search key order. For the example binary search tree, the order
would be 1, 4, 12, 23, 29, 37, 41, 60, 71, 84, 90, 100.

6.11.1 Partial implementation of the Map ADT using a binary search tree.
class BSTMap :
    # Creates an empty map instance.
    def __init__( self ):
        self._root = None
        self._size = 0

    # Returns the number of entries in the map.
    def __len__( self ):
        return self._size

    # Returns an iterator for traversing the keys in the map.
    # (The _BSTMapIterator class is not part of this partial listing.)
    def __iter__( self ):
        return _BSTMapIterator( self._root )

# Storage class for the binary search tree nodes of the map.
class _BSTMapNode :
    def __init__( self, key, value ):
        self.key = key
        self.value = value
        self.left = None
        self.right = None

As with any binary tree, a reference to the root node must also be maintained for a binary search tree. The
constructor defines the _root field for this purpose and also defines the _size field to keep track of the
number of entries in the map. The latter is needed by the __len__ method.
6.11.2 Searching in a Binary Search Tree
Given a binary search tree, you will eventually want to search the tree to determine if it contains a
given key or to locate a specific element. Earlier in this chapter, we saw that there is a single path from the
root to every other node in a tree. If the binary search tree contains the target key, then there will be a
unique path from the root to the node containing that key.

Since the root node provides the single access point into any binary tree, our search must begin
there. The target value is compared to the key in the root node. If the root contains the target value, our
search is over with a successful result. But if the target is not in the root, we must decide which of two
possible paths to take. From the definition of the binary search tree, we know the key in the root node is
larger than the keys in its left subtree and smaller than the keys in its right subtree. Thus, if the target is
less than the root's key, we move left and we move right if it's greater. We repeat the comparison on the
root node of the subtree and take the appropriate path. This process is repeated until target is located or
we encounter a null child link.

Searching for a target key in a binary search tree.

class BSTMap :
    # ...
    # Determines if the map contains the given key.
    def __contains__( self, key ):
        return self._bstSearch( self._root, key ) is not None

    # Returns the value associated with the key.
    def valueOf( self, key ):
        node = self._bstSearch( self._root, key )
        assert node is not None, "Invalid map key."
        return node.value

    # Helper method that recursively searches the tree for a target key.
    def _bstSearch( self, subtree, target ):
        if subtree is None :            # base case: target is not in the tree.
            return None
        elif target < subtree.key :     # target is left of the subtree root.
            return self._bstSearch( subtree.left, target )
        elif target > subtree.key :     # target is right of the subtree root.
            return self._bstSearch( subtree.right, target )
        else :                          # base case: target found.
            return subtree

Example : Binary search tree for the words mathematics, physics, geography, zoology,
meteorology, geology, psychology, and chemistry using alphabetical order
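
Since the figure for this example is not reproduced here, the tree can be rebuilt with a short sketch. It assumes the words are inserted one at a time in the order listed, each new word becoming a leaf; the Node class and the insert, search and inorder functions are hypothetical helpers, not part of the BSTMap listing above.

```python
class Node:
    """Hypothetical BST node holding only a key."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(subtree, key):
    """Insert key as a new leaf and return the (possibly new) subtree root."""
    if subtree is None:
        return Node(key)
    if key < subtree.key:
        subtree.left = insert(subtree.left, key)
    elif key > subtree.key:
        subtree.right = insert(subtree.right, key)
    return subtree   # duplicate keys are ignored


def search(subtree, target):
    """Follow the unique path from the root towards target."""
    while subtree is not None and subtree.key != target:
        subtree = subtree.left if target < subtree.key else subtree.right
    return subtree


def inorder(subtree):
    """Keys in increasing (here: alphabetical) order."""
    if subtree is None:
        return []
    return inorder(subtree.left) + [subtree.key] + inorder(subtree.right)


WORDS = ["mathematics", "physics", "geography", "zoology",
         "meteorology", "geology", "psychology", "chemistry"]

root = None
for word in WORDS:
    root = insert(root, word)

print(root.key, root.left.key, root.right.key)   # mathematics geography physics
print(inorder(root) == sorted(WORDS))            # True
print(search(root, "geology") is not None)       # True
print(search(root, "biology") is not None)       # False
```

With this insertion order, mathematics is the root, geography and physics are its children, chemistry and geology hang below geography, and meteorology and zoology hang below physics, with psychology as the left child of zoology.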
6.12 Decision Trees
• A decision tree is a rooted tree in which each internal vertex corresponds to a decision, with a
subtree at these vertices for each possible outcome of the decision.

• The possible solutions of the problem correspond to the paths to the leaves of this rooted
tree.
6.13 Balanced trees:

In a binary search tree, if all levels are filled, then search, insertion and deletion operations have
an efficiency of O(log N). However, the performance of a binary search tree degrades to that of a
linear search if the nodes are inserted in the following order:

Solution of the problem: the above tree is imbalanced; it is heavier on the right side of the root.
To get better performance during searching, the tree should be kept in balanced form. Keep the
tree height balanced (for every node, the difference in height between the left and the right
subtree should be at most one). This will result in logarithmic performance.

How to keep the tree balanced:

There are two approaches to balance the tree:

1. Top down insertion: changes are made while searching for the place to insert the item, so
only one pass through the tree is needed. Red-black trees can use the top down approach.
2. Bottom up insertion: we first insert the item and then walk back through the tree to make
changes. It is less efficient because we make two passes through the tree.

6.13.1 AVL Trees:

AVL (Adelson-Velskii & Landis) trees are binary search trees where nodes also have
additional information: the difference in depth between their left and right subtrees (the balance
factor). The balance factor is represented as a number equal to the depth of the right subtree minus
the depth of the left subtree. In a balanced tree the balance factor of every node is 0, 1 or -1.
Let us take an example of a balanced tree:

The following diagram is not a balanced tree:

Insertion in an AVL tree:

• To insert a new node in an AVL tree, insert a new leaf node, as for an ordinary binary search
tree.
• Then work back up from the new leaf to the root, checking if any height imbalance has
been introduced (by computing new balance factors).
• Perform a rotation to correct the height imbalance (the direction of the rotation depends on
the balance factor).

Rotations in an AVL tree:

An imbalanced tree is restructured by performing rotations to result in a balanced tree. A rotation
is performed around some node, say X. Rotations can be:

• Single right rotation,
• Single left rotation or
• A double rotation

Let’s take the example of these rotations one by one.

Right rotation: a right rotation around X is performed if the tree is heavier on the left side. Let us
take an example of the following tree, which is imbalanced on the left side and requires a right
rotation around X.

After the right rotation the tree looks like the following:

Steps for right rotation:

• X moves down and to the right, into the position of right child
• X's right subtree is unchanged
• X's left child A moves up to take X's place. X is now A's new right child.
• This leaves the old right child of A unaccounted for. Since it comes from the left of X, it
becomes X's new left child.

Steps for left rotation:
Let us take the example of the following imbalanced tree, which is heavier on the right side:

After the left rotation around X, the diagram looks like the following:
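
The rotation steps can be sketched in a few lines of Python. The Node class and the function names below are hypothetical; the left rotation is written as the mirror image of the right rotation steps listed above, and the balance factor follows the definition used in this section (depth of the right subtree minus depth of the left subtree).

```python
class Node:
    """Hypothetical BST node used only for this sketch."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(x):
    """Right rotation around x; returns the new root of the subtree."""
    a = x.left          # X's left child A moves up to take X's place
    x.left = a.right    # A's old right child becomes X's new left child
    a.right = x         # X becomes A's new right child
    return a


def rotate_left(x):
    """Left rotation around x (mirror image of rotate_right)."""
    b = x.right
    x.right = b.left
    b.left = x
    return b


def height(node):
    """Height in edges; an empty subtree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    """Depth of the right subtree minus depth of the left subtree."""
    return height(node.right) - height(node.left)


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# A tree that is heavier on the left side: 30 -> 20 -> 10.
x = Node(30, left=Node(20, left=Node(10)))
print(balance_factor(x))             # -2
root = rotate_right(x)
print(root.key, balance_factor(root))   # 20 0
print(inorder(root))                 # [10, 20, 30]
```

The inorder sequence is the same before and after the rotation, so the binary search tree ordering is preserved while the height is reduced.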

6.13.2 A red-black tree

A red-black tree is a binary search tree which has the following red-black properties:
• Root is always black.

• Every node is either red or black.

• Every leaf (NULL) is black.

• If a node is red, then both its children are black. (Implies that on any path from the root to
a leaf, red nodes must not be adjacent. However, any number of black nodes may appear
in a sequence.)

• Every simple path from a node to a descendant leaf contains the same number of black
nodes.

Following is the example of Red-Black Tree:


Insertion in Red Black Tree:
• Insertion can be done top down, changing colors and rotating subtrees while searching for
a place to insert the node.

• The red-black properties must still be satisfied after the new node is inserted. Top down
insertion requires only one pass through the tree and is guaranteed to be O(log N).

• If we need to insert an item in the red-black tree and we arrive at the insertion point, a
node S that is missing at least one child, and S is black, we are done: we insert our item as
a red node and nothing else changes.

Following diagram shows an example for insertion in Red Black Tree:

6.14 Splay tree


A splay tree is a binary search tree with the additional property that recently accessed elements
are quick to access again. Like self-balancing binary search trees, a splay tree performs basic
operations such as insertion, look-up and removal in O(log n) amortized time. For many
sequences of non-random operations, splay trees perform better than other search trees, even
performing better than O(log n) for sufficiently non-random patterns, all without requiring
advance knowledge of the pattern.
All normal operations on a binary search tree are combined with one basic operation,
called splaying. Splaying the tree for a certain element rearranges the tree so that the element is
placed at the root of the tree. One way to do this with the basic search operation is to first
perform a standard binary tree search for the element in question, and then use tree rotations in a
specific fashion to bring the element to the top. Alternatively, a top-down algorithm can combine
the search and the tree reorganization into a single phase.

6.15 Heap:

A heap is a complete binary tree in which the nodes are organized based on their data entry
values. There are two variants of the heap structure. A max-heap has the property, known as the
heap order property, that for each non-leaf node V, the value in V is greater than or equal to the
values of its children. The largest value in a max-heap will always be stored in the root while the
smallest values will be stored in the leaf nodes. The min-heap has the opposite property. For each
non-leaf node V, the value in V is smaller than or equal to the values of its children.
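
The heap order property can be checked with a short sketch. It assumes the common array representation of a complete binary tree, in which the children of the node at index i sit at indices 2i + 1 and 2i + 2; this layout is not described in the text above and is used here only for illustration. The function names are hypothetical, while heapq is the min-heap module of the Python standard library.

```python
import heapq


def is_max_heap(values):
    """True if every node is >= its children (array layout is an assumption)."""
    n = len(values)
    for i in range(n):
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and values[i] < values[child]:
                return False
    return True


def is_min_heap(values):
    """True if every node is <= its children."""
    n = len(values)
    for i in range(n):
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and values[i] > values[child]:
                return False
    return True


max_heap = [100, 84, 71, 60, 23, 12, 29, 1, 37, 4]
print(is_max_heap(max_heap))   # True
print(max_heap[0])             # 100, the largest value is at the root

values = [60, 12, 90, 4, 41]
heapq.heapify(values)          # rearranges the list into a min-heap in place
print(is_min_heap(values))     # True
print(values[0])               # 4, the smallest value is at the root
```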

6.16 MAPS
Searching for data items based on unique key values is a very common application in computer science.
An abstract data type that provides this type of search capability is often referred to as a map or dictionary
since it maps a key to a corresponding value. Consider the problem of a university registrar having to
manage and process large volumes of data related to students. To keep track of the information or records
of data, the registrar assigns a unique student identification number to each individual student. Later,
when the registrar needs to search for a student's information, the identification number is used. Using
this keyed approach allows access to a specific student record. If the names were used to identify the
records instead, then what happens when multiple students have the same name? Or, what happens if the
name was entered incorrectly when the record was initially created?
6.16.1 The Map Abstract Data Type
The Map ADT provides a great example of an ADT that can be implemented using one of many different
data structures. Our definition of the Map ADT, which is provided next, includes the minimum set of
operations necessary for using and managing a map.

Map ADT

A map is a container for storing a collection of data records in which each record is associated with a
unique key. The key components must be comparable.

• Map(): Creates a new empty map.

• length(): Returns the number of key/value pairs in the map.

• contains( key ): Determines if the given key is in the map and returns True if the key is found and False
otherwise.

• add( key, value ): Adds a new key/value pair to the map if the key is not already in the map or replaces
the data associated with the key if the key is in the map. Returns True if this is a new key and False if the
data associated with the existing key is replaced.

• remove( key ): Removes the key/value pair for the given key if it is in the map and raises an exception
otherwise.

• valueOf( key ): Returns the data record associated with the given key. The key must exist in the map or
an exception is raised.

• iterator(): Creates and returns an iterator that can be used to iterate over the keys in the map.

6.17 Hashing Table:

We can use the direct access technique for small sets of keys that are composed of consecutive integer
values. But what if the key can be any integer value? Even with a small collection of keys, we cannot
create an array large enough to store all possible integer values. That's where hashing comes into play.
Hashing is the process of mapping a search key to a limited range of array indices with the goal of
providing direct access to the keys. The keys are stored in an array called a hash table and a hash function
is associated with the table. The function converts or maps the search keys to specific entries in the table.
For example, suppose we have the following set of keys:

765, 431, 96, 142, 579, 226, 903, 388

and a hash table, T, containing M = 13 elements. We can define a simple hash function h(·) that maps the
keys to entries in the hash table: h(key) = key % M

You will notice this is the same operation we used with the product codes in our earlier example.
Dividing the integer key by the size of the table and taking the remainder ensures the value returned by
the function will be within the valid range of indices for the given table.
To add keys to the hash table, we apply the hash function to determine the entry in which the given key
should be stored. Applying the hash function to key 765 yields a result of 11, which indicates 765 should
be stored in element 11 of the hash table. Likewise, if we apply the hash function to the next four keys in
the list, we find:

h(431) => 2 h(96) => 5 h(142) => 12 h(579) => 7

all of which are unique index values.
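
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch; the names table and collisions are hypothetical, and the sketch only records keys that map to an occupied entry rather than resolving them.

```python
M = 13
KEYS = [765, 431, 96, 142, 579, 226, 903, 388]


def h(key, size=M):
    """Simple hash function: remainder of the key divided by the table size."""
    return key % size


table = [None] * M
collisions = []          # (key, index, key already stored at that index)
for key in KEYS:
    index = h(key)
    if table[index] is None:
        table[index] = key
    else:
        collisions.append((key, index, table[index]))

print([h(k) for k in KEYS[:5]])   # [11, 2, 5, 12, 7]
print(collisions)                 # [(226, 5, 96), (388, 11, 765)]
```

The first five keys land in distinct entries, as stated above. Of the remaining keys, 903 maps to the free entry 6, while 226 and 388 map to entries that are already occupied by 96 and 765.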

6.18 Skip list

The skip list is a probabilistic data structure that is built upon the general idea of a linked list. The
skip list uses probability to build subsequent layers of linked lists upon an original linked list. Each
additional layer of links contains fewer elements, but no new elements.

You can think about the skip list like a subway system. There's one train that stops at every single
stop. However, there is also an express train. This train doesn't visit any unique stops, but it will stop
at fewer stops. This makes the express train an attractive option if you know where it stops.

Skip lists are very useful when you need to be able to concurrently access your data structure.
Imagine a red-black tree, an implementation of the binary search tree. If you insert a new node into
the red-black tree, you might have to rebalance the entire thing, and you won't be able to access your
data while this is going on. In a skip list, if you have to insert a new node, only the adjacent nodes
will be affected, so you can still access a large part of your data while this is happening.

A skip list starts with a basic, ordered, linked list. This list is sorted, but we can't do a binary search
on it because it is a linked list and we cannot index into it. But the ordering will come in handy later.

Then, another layer is added on top of the bottom list. This new layer will include any given element
from the previous layer with probability p. This probability can vary, but oftentimes ½ is used.
Additionally, the first node in the linked list is usually kept, as a header for the new layer. Take
a look at the following graphics and see how some elements are kept but others are discarded. Here,
it just so happened that half of the elements are kept in each new layer, but it could be more or less--
it's all probabilistic. In all cases, each new layer is still ordered.

A skip list S has a few important properties that are referenced in its analysis. It has a height h,
which is the number of linked lists in it. It has a number of distinct elements, n. And it has a
probability p, which is usually ½.
The highest element (one that appears in the most lists) will appear in log_{1/p}(n) lists, on average;
we'll prove this later. Thus, if we use p = 1/2, there are log_2(n) lists. This is the average value of h.
Another way of saying "Every element in a linked list is in the linked list below it" is "Every element
in level S_{i+1} exists in level S_i." Each element in the skip list has four pointers. It points to the node
to its left, its right, its top, and its bottom. These quad-nodes will allow us to efficiently search
through the skip list.
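The layered search can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the quad-node layout described above: each node keeps only an array of right-hand (forward) pointers, one per layer it appears in. The names SkipNode, SkipList, max_level and seed are hypothetical.

```python
import random


class SkipNode:
    """A key plus one forward (right) pointer for each layer it appears in."""

    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level


class SkipList:
    def __init__(self, p=0.5, max_level=16, seed=None):
        self.p = p                              # promotion probability
        self.max_level = max_level              # assumed cap on the height h
        self.level = 1                          # number of layers in use
        self.head = SkipNode(None, max_level)   # header kept in every layer
        self._rng = random.Random(seed)

    def _random_level(self):
        # Promote the new element one layer up with probability p each time.
        level = 1
        while level < self.max_level and self._rng.random() < self.p:
            level += 1
        return level

    def contains(self, key):
        node = self.head
        # Start on the top layer; drop down when the next key would overshoot.
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = [self.head] * self.max_level   # last node visited per layer
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        nxt = node.forward[0]
        if nxt is not None and nxt.key == key:
            return                              # elements are distinct
        level = self._random_level()
        self.level = max(self.level, level)
        new_node = SkipNode(key, level)
        for i in range(level):
            # Only the neighbouring links in each layer are changed.
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node

    def keys(self):
        """Yield the keys of the bottom layer, which holds every element."""
        node = self.head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]


sl = SkipList(seed=1)
for k in [765, 431, 96, 142, 579, 226, 903, 388]:
    sl.insert(k)
print(list(sl.keys()))
print(sl.contains(579), sl.contains(580))
```

Whatever layers the random promotions produce, the bottom layer always holds every key in sorted order.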

6.19 Review Questions:

1. Represent the following expressions using binary trees. Then write these expressions in infix,
prefix and postfix notations.

a. ((x+2)↑3)*(y – (3+x)) – 5

b. (A∩B) – (A∪(B – A))

2. What is the value of each of these prefix expressions?

a. + – ↑3 2 ↑ 2 3 / 6 – 4 2

b. *+3+3↑3+333

3. What is the value of each of these postfix expressions?

a. 32*2↑53–84/*–

b. 93/5+72–*

4. What are binary trees? Explain how a binary tree can be stored using an array.
5. Explain the following operations for a binary tree:
a. Insertion
b. Deletion
c. Traversal
6. Devise an algorithm for determining whether two binary trees t1 and t2 are similar or not.
7. Write an algorithm to display the elements of a binary tree in level order, that is, list the
element in the root, followed by the elements at depth 1, then the elements at depth 2, and
so on.
(Hint: Use queue structure and perform preorder traversal on the tree.)
7 Searching, Sorting and Analysis of Algorithms

Objectives
After completing this chapter, students will be able to

• Understand the need for searching and the linear search algorithm.
• Understand the need for using binary search for efficient searching.
• Understand the need for sorting and various sorting algorithms: insertion sort, bubble sort,
selection sort, merge sort and quick sort.
• Implement Python’s Built-In Sorting Functions, Selection Algorithms.
• Analyze Algorithms: Measuring Algorithm Efficiency, Asymptotic Analysis, The Big-O
Notation and Find the complexity of Algorithms

7.1 Need of searching, linear search, using binary search for efficient search.
The problem of search: A file contains many records, and each record contains many fields. These
fields help us to differentiate between records. One of the fields acts as a key for
searching in a collection of records. As one record can be used in many
applications, different fields can be used as the search key in different applications. Let us take
an example: the record of a student, which contains three fields: name, address and roll number.
Generally, the key used is the person's name. In one application it may be required to look for a
student residing at a particular address. In that case the address can act as the key.

In other applications it may be required to look for a particular student with a given roll number. In
that case the roll number can act as the key. A collection of records can be stored sequentially
or non-sequentially.

We will assume that the data records are stored sequentially.

Searching: Linear, Binary search algorithms

Searching for an element in a list is the process of checking whether a specified element is present in
the list and determining the location of the desired element in the list.

We will discuss two searching algorithms:

1. Linear search
2. Binary search

7.1.1 Linear search:

Linear search, also known as sequential search, is a search algorithm suitable for searching a set of
data for a particular value.

• Every element in the list is checked, starting from the first, until a match is found.
• It compares each element with the value being searched for and stops when that value is
found or the end of the array is reached.

7.1.1.1 Advantage of Linear search:


• It is easy to understand.
• It does not require the array to be ordered.

7.1.1.2 Disadvantage of Linear search:


• If there are 20000 items in the array and the element being searched for is at position
19999, then almost the entire list has to be searched.

Best case: the value is equal to the first element tested, in which case only one comparison is needed.

Worst case: the value is not in the list or it is the last item in the list, in which case n
comparisons are needed.

Algorithm development:

Let A be an array of n elements, so that we have the list A[1], A[2], A[3], A[4], ..., A[n].
Suppose X is the element to be searched for.

This algorithm finds the location of X in the array if it is in the list; otherwise it displays the
message "element is not in the list."

Step 1: Start
Step 2: Repeat for i = 1 to n
            if (A[i] == X) then
                Display "element X found at position i."
                Exit.
Step 3: Display "element not found in the list."
Step 4: Stop.
7.1.1.3 Python implementation of Linear Search

Implementation of the linear search on an unsorted sequence.


def linearSearch( theValues, target ) :
    n = len( theValues )
    for i in range( n ) :
        # If the target is in the ith element, return True
        if theValues[i] == target :
            return True

    return False   # If not found, return False.
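The step-wise algorithm above reports the position at which X is found, whereas linearSearch only answers True or False. A small variant that returns the position might look like the following sketch. The name linear_search_index and the sample list marks are illustrative, and positions are 0-based as is usual in Python.

```python
def linear_search_index(values, target):
    """Return the 0-based index of the first match, or -1 if target is absent."""
    for i, value in enumerate(values):
        if value == target:
            return i
    return -1


marks = [42, 17, 96, 8, 73]
print(linear_search_index(marks, 96))   # position of a value that is present
print(linear_search_index(marks, 5))    # -1 signals "element not found"
```

Returning -1 for a missing element follows the same convention as the recursive binary search shown later in this chapter.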
7.1.2 Binary search:

Binary search can be used only on a sorted list (here assumed to be in ascending order).

In this method, the value of the element in the middle of the list is compared with the value of
the element to be searched for. If the middle element is larger, the desired element can only be in
the lower half of the list. If the middle element is smaller, the desired element can only be in the
upper half of the list.

Therefore the search domain is reduced by half at every step.

The logic behind the technique is as follows:

1. Find the middle element of the array.
2. Compare the middle element with the item to be searched for.
3. There are three cases:
   1. If it is the desired element, then the search is successful.
   2. If it is less than the desired element, then search only the second half of the array.
   3. If it is greater than the desired element, then search only the first half of the array.
4. Repeat the above steps until the element is found or the search area is exhausted.

Algorithm – Binary Search

Suppose A = array name,
N = size of the array,
X = element to be searched for.
The algorithm finds the location of X in the array, or otherwise gives the message that the element is
not found.

Step1 START
      [INITIALIZE THE VARIABLES]
Step2 BEG = 1
      END = N and LOC = 0
      MID = int((BEG+END)/2)
Step3 Repeat step 4 while BEG <= END and A[MID] != X
Step4 if (X < A[MID]) then
          END = MID-1
      else
          BEG = MID+1
      MID = int((BEG+END)/2)
Step5 if (BEG <= END and X == A[MID]) then
          LOC = MID
          Display “X found at LOC = MID”
Step6 if (LOC == 0)
          Display “X not found in list”
7.1.2.1 Python implementation of Binary Search
# Program for recursive binary search.
# Returns index of e in arr if present, else -1
def binary_search(arr, low, high, e):
    # Check base case
    if high >= low:
        mid = (high + low) // 2

        # If element is present at the middle itself
        if arr[mid] == e:
            return mid

        # If element is smaller than mid, then it can only
        # be present in left subarray
        elif arr[mid] > e:
            return binary_search(arr, low, mid - 1, e)

        # Else the element can only be present in right subarray
        else:
            return binary_search(arr, mid + 1, high, e)

    else:
        # Element is not present in the array
        return -1

# Test array
arr = [ 1, 2, 5, 7, 11, 13, 14, 15, 16 ]
element = 13

# Function call
result = binary_search(arr, 0, len(arr)-1, element)

if result != -1:
    print("Element is present at index", str(result))
else:
    print("Element is not present in array")
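The BEG/END/MID steps of the algorithm given earlier can also be written without recursion. The following is a minimal sketch, assuming a list sorted in ascending order and 0-based indices; the function name binary_search_iterative and the list data are illustrative.

```python
def binary_search_iterative(arr, e):
    """Return the 0-based index of e in the sorted list arr, or -1 if absent."""
    beg, end = 0, len(arr) - 1
    while beg <= end:
        mid = (beg + end) // 2
        if arr[mid] == e:
            return mid
        if e < arr[mid]:
            end = mid - 1      # continue in the lower half
        else:
            beg = mid + 1      # continue in the upper half
    return -1


data = [1, 2, 5, 7, 11, 13, 14, 15, 16]
print(binary_search_iterative(data, 13))
print(binary_search_iterative(data, 4))
```

The loop keeps only the two bounds in memory, so this version needs no recursion stack.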
7.1.3 Efficiency of linear Search and binary Search Algorithm:

Binary search is more efficient than linear search, but the two algorithms also differ in what they
require. Let us discuss the differences one by one:

1. In linear search the data elements are not required to be sorted; on the other side, the basic
requirement of binary search is that the list should be sorted.
2. Linear search can be applied to an array or to a linked list; on the other side,
binary search cannot be directly implemented on a linked list.
3. If we have to insert an element in an array that will be searched with linear search, the insertion
can be done at any place or at the end of the list. But if we have to insert an element in
an array that will be searched with binary search, the insertion cannot be done at an arbitrary place,
because the resulting list must remain sorted for binary search to apply.
4. The linear search algorithm is iterative in nature. On the other side, the binary search algorithm
is of divide-and-conquer nature.
5. In linear search, in the worst case the element to be searched for is at the end of the list (or
absent), and the number of comparisons required is N. So the complexity of linear search is
O(N). On the other side, during binary search the list is divided in half at every
step. Hence the time complexity of binary search is O(log N).
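The difference between O(N) and O(log N) can be observed by counting how many elements each search examines. The sketch below uses hypothetical helper names, linear_steps and binary_steps, and searches for the last element of a sorted list, which is a worst case for linear search.

```python
def linear_steps(values, target):
    """Number of elements examined by a linear search for target."""
    steps = 0
    for value in values:
        steps += 1
        if value == target:
            break
    return steps


def binary_steps(values, target):
    """Number of middle elements examined by a binary search on sorted values."""
    beg, end, steps = 0, len(values) - 1, 0
    while beg <= end:
        mid = (beg + end) // 2
        steps += 1
        if values[mid] == target:
            break
        if target < values[mid]:
            end = mid - 1
        else:
            beg = mid + 1
    return steps


for n in (16, 1024, 1_000_000):
    values = list(range(n))
    # Searching for the last element forces linear search to scan everything.
    print(n, linear_steps(values, n - 1), binary_steps(values, n - 1))
```

The linear count grows with n itself, while the binary count grows only with the number of times n can be halved.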
7.1.4 Sorting Algorithm-

Searching and sorting are among the most common operations required in any
programming environment. In this chapter we are going to discuss some of the common sorting
algorithms. We will also discuss the advantages and disadvantages of one
technique over another.

In our daily life, the availability of data in a particular order affects convenience. Suppose we
have to find a telephone number in a telephone directory. This process is called searching,
and it is simplified because the entries in the telephone directory are sorted
alphabetically. Now consider the complication if the telephone numbers were recorded in the
directory in the order in which the customers ordered their phones. In that
situation, the names of customers would appear in random order. But because a telephone directory
is arranged in alphabetical order, the problem of searching is simplified. Hence we
can say that with the help of sorted data, searching becomes faster.
A file contains multiple records, and each record contains multiple fields. Any field in the record can
be used as the key for a sorting algorithm.

Sorting algorithms can be classified into two categories:

Internal: the data elements that are being sorted are in main memory.

External: the data elements that are being sorted are in auxiliary storage. Our focus will be
on internal sorting algorithms.

A sorting algorithm is used to arrange random data into some order, for example ascending order.

Different sorting algorithms are:

* Bubble Sort.
* Insertion Sort.
* Selection Sort.

7.1.4.1 BUBBLE SORT:-


In this method, adjacent elements of the list to be sorted are compared. If the element on top
is greater than the item immediately below it, they are exchanged.
This process is carried out till the list is sorted.
After pass 1, the largest element is at the bottom.
After pass 2, the second largest element is at the second-last position.
And so on.

Following is an example:

Original   Pass1   Pass2   Pass3   Sorted
List                               List
   4         4       4       4       2
   8         8       6       2       4
  10         6       2       6       6
   6         2       8       8       8
   2        10      10      10      10
Algorithm –Bubble Sort.

Let A be a linear array with N elements.

Step1 START
Step2 Repeat steps 3 and 4 for I = 1 to N-1
Step3 Repeat step 4 for J = 1 to N-1
Step4 [Exchange of elements]
      if (A[J] > A[J+1]) then
      {
          TEMP = A[J]
          A[J] = A[J+1]
          A[J+1] = TEMP
      }
Step5 STOP
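The steps above translate almost directly into Python. The following sketch uses 0-based indices and adds two small refinements that are assumptions rather than part of the algorithm as stated: the inner loop skips the elements already moved to the bottom, and the sort stops early if a pass makes no exchange. The function name bubble_sort is illustrative.

```python
def bubble_sort(values):
    """Sort the list in place in ascending order and return it."""
    n = len(values)
    for i in range(n - 1):
        swapped = False
        # After i passes, the last i elements are already in their final place.
        for j in range(n - 1 - i):
            if values[j] > values[j + 1]:
                values[j], values[j + 1] = values[j + 1], values[j]
                swapped = True
        if not swapped:
            break          # no exchange in this pass: the list is sorted
    return values


print(bubble_sort([4, 8, 10, 6, 2]))
```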

7.1.4.2 Selection sort:

Selection sort is a simple sorting technique that improves on bubble sort by performing at most one
swap per pass. The logic of selection sort is as follows:

Step1 Find the minimum value in the list.
Step2 Swap it with the value in the first position.
Step3 Repeat the above steps for the remaining elements of the list (starting at the second
position).

Original   Pass1   Pass2   Pass3   Sorted
List                               List
  31        11      11      11      11
  25        25      12      12      12
  12        12      25      22      22
  22        22      22      25      25
  11        31      31      31      31

During pass 1, find the smallest element using a linear scan and swap it into the first position
in the list; then during pass 2 find the second smallest element by scanning the remaining
list, and so on.
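The three steps can be sketched in Python as follows, using the list from the example table. The function name selection_sort is an illustrative choice and indices are 0-based.

```python
def selection_sort(values):
    """Sort the list in place in ascending order and return it."""
    n = len(values)
    for i in range(n - 1):
        # Linear scan of the unsorted part to find the smallest element.
        smallest = i
        for j in range(i + 1, n):
            if values[j] < values[smallest]:
                smallest = j
        # At most one swap per pass.
        if smallest != i:
            values[i], values[smallest] = values[smallest], values[i]
    return values


print(selection_sort([31, 25, 12, 22, 11]))
```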
7.1.4.3 INSERTION SORT

In insertion sort we insert an element into its proper place in the previously sorted sublist.
Consider an array A with N elements.

The steps can be stated as:

1) A[1] by itself is sorted.
2) A[2] is inserted before or after A[1] so that A[1] and A[2] are sorted.
3) Similarly, A[3] is inserted so that A[1], A[2] and A[3] are sorted.
4) This process is continued till all the elements are sorted.

Original list: 10 8 4 6 2
Pass1 10
Pass2 8 10
Pass3 4 8 10
Pass4 4 6 8 10
Pass5 2 4 6 8 10

Algorithm- Insertion sort

Let A be a linear array with N elements.

1) START
2) Repeat steps 3 to 5 for I = 2 to N
3) Set TEMP = A[I]
   POSITION = I-1
4) [Move down 1 position all elements greater than TEMP]
   Repeat while (POSITION >= 1 and TEMP < A[POSITION])
   {
       A[POSITION+1] = A[POSITION]
       POSITION = POSITION-1
   }
5) [Insert TEMP at proper position]
   A[POSITION+1] = TEMP
6) STOP
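A Python rendering of these steps, as a sketch with 0-based indices, is shown below. The variable names temp and position mirror TEMP and POSITION, and the function name insertion_sort is illustrative.

```python
def insertion_sort(values):
    """Sort the list in place in ascending order and return it."""
    for i in range(1, len(values)):
        temp = values[i]
        position = i - 1
        # Move every element greater than temp one place down.
        while position >= 0 and temp < values[position]:
            values[position + 1] = values[position]
            position -= 1
        # Insert temp at its proper position in the sorted sublist.
        values[position + 1] = temp
    return values


print(insertion_sort([10, 8, 4, 6, 2]))
```

The bound on position is tested before the comparison so that the loop never reads an element before the start of the list.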
7.1.4.4 Quick Sort and Merge Sort:

Quick Sort:

The divide-and-conquer strategy is used by various sorting algorithms such as quick sort and merge sort.
Let us discuss these algorithms one by one.

Quick sort: this algorithm is also known as partition-exchange sort. In this algorithm the set of
data elements is divided into two parts repeatedly until it is not possible to divide them further.
To partition the elements, a key element known as the pivot is used.

The pivot value partitions the whole data set into two parts. In the first partition, the data elements
are smaller than the pivot element. In the second partition, the data elements are greater than the
pivot value. Each partition is further divided into two by using the same principle. The elements are
sorted recursively.

Algorithm: Quick sort

Assumptions: suppose you have an array A with n elements to be sorted. Following are the steps
for the algorithm:
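One common way to realize the partitioning idea described above is sketched below in Python. The choice of the last element as the pivot, and the names quick_sort and partition, are assumptions made for illustration; other pivot rules and partition schemes are equally valid.

```python
def partition(a, low, high):
    """Place the pivot a[high] at its final index and return that index."""
    pivot = a[high]
    i = low - 1
    for j in range(low, high):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    # Elements a[low..i] are <= pivot; put the pivot right after them.
    a[i + 1], a[high] = a[high], a[i + 1]
    return i + 1


def quick_sort(a, low=0, high=None):
    """Sort a[low..high] in place in ascending order and return the list."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)
        quick_sort(a, low, p - 1)     # part with elements <= pivot
        quick_sort(a, p + 1, high)    # part with elements > pivot
    return a


print(quick_sort([56, 57, 92, 38, 44, 90, 61, 73]))
```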
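Advantages of Quick sort:

On average, the performance of the quick sort algorithm is better than that of bubble sort, insertion
sort and selection sort.

Disadvantages of Quick sort:

In the worst case (for example, when a poorly chosen pivot repeatedly splits the array into one empty
part and one large part), quick sort degrades to O(n^2) comparisons, and the recursion becomes very
deep for large arrays.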
7.1.4.5 Merge Sort Algorithm:

Algorithm: MERGE SORT (KEYS, N)

This algorithm sorts the N-element array or vector KEYS using an auxiliary vector TEMP.

Step 1: Repeat Step 2 and Step 3 for P_NO = 1 to log2 N
Step 2: [Find the size of subvector in current pass]
            S_SIZE := 2 ↑ (P_NO – 1)
Step 3: [Perform current pass]
            If ODD (P_NO) then
                Call MERGEPASS (KEYS, N, S_SIZE, TEMP)
            Else
                Call MERGEPASS (TEMP, N, S_SIZE, KEYS)
            [End of If statement]
Step 4: [Find which area (the original or auxiliary) the sorted array is in]
            If ODD (log2 N) then
                KEYS := TEMP
            [End of If statement]
Step 5: [Finished]
            Exit

Procedure: MERGEPASS (INPUT, N, S_SIZE, OUTPUT)

Given a vector INPUT of N elements which contains sorted subvectors of size S_SIZE, this
procedure merges the pairs of subvectors of INPUT and copies the result into the vector
OUTPUT. The integer variable LB denotes the position of the first element in the first subvector
and I is a local integer variable.
Step 1: [Find the number of pairs of subvectors to be merged]
            NO_PAIRS := N div (2 * S_SIZE)
Step 2: [Perform successive simple merges for pass]
            Repeat Step 3 and Step 4 for I = 1 to NO_PAIRS
Step 3: [Find lower bound (LB) position of first subvector]
            LB := 1 + (2 * I - 2) * S_SIZE
Step 4: [Perform simple merge of subvector pairs]
            Call SIMPLE_MERGE (INPUT, LB, S_SIZE, INPUT, LB + S_SIZE, S_SIZE, OUTPUT, LB)
Step 5: [Finished current pass]
            Return

Procedure: SIMPLE_MERGE (A, LBA, ASIZE, B, LBB, BSIZE, C, LBC)

Given two sorted vectors A and B of size ASIZE and BSIZE respectively, this procedure merges
the given vectors A and B and stores the resulting sorted vector into C. The positions LBA, LBB
and LBC represent the first elements of A, B and C respectively. The variables P, Q and R are
local integer variables.

Step 1: [Initialize]
            UBA := LBA + ASIZE – 1
            UBB := LBB + BSIZE – 1
            P := LBA
            Q := LBB
            R := LBC
Step 2: [Compare corresponding elements and output the smallest]
            Repeat while (P ≤ UBA) and (Q ≤ UBB)
                If A[P] < B[Q] then
                    C[R] := A[P]
                    P := P + 1
                    R := R + 1
                Else
                    C[R] := B[Q]
                    Q := Q + 1
                    R := R + 1
                [End of If statement]
            [End of loop]
Step 3: [Copy the remaining unprocessed elements into the output array]
            If P > UBA then
                Repeat while (Q ≤ UBB)
                    C[R] := B[Q]
                    R := R + 1
                    Q := Q + 1
                [End of loop]
            Else
                Repeat while (P ≤ UBA)
                    C[R] := A[P]
                    P := P + 1
                    R := R + 1
                [End of loop]
            [End of If statement]
Step 4: [Finished simple merge]
            Return
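The pass-by-pass merging described by these procedures can be sketched in Python as below. This is a simplified illustration, not a line-by-line translation: it builds new lists with slicing instead of alternating between KEYS and TEMP, it also handles sizes that are not a power of two, and it uses <= when comparing so that equal keys keep their original order. The names simple_merge and merge_sort are illustrative.

```python
def simple_merge(a, b):
    """Merge two sorted lists into one new sorted list."""
    c = []
    p = q = 0
    while p < len(a) and q < len(b):
        if a[p] <= b[q]:
            c.append(a[p])
            p += 1
        else:
            c.append(b[q])
            q += 1
    # Copy whatever remains in the list that was not exhausted.
    c.extend(a[p:])
    c.extend(b[q:])
    return c


def merge_sort(keys):
    """Bottom-up merge sort; returns a new list in ascending order."""
    n = len(keys)
    current = list(keys)
    size = 1                      # subvector size: 1, 2, 4, ...
    while size < n:
        merged = []
        for lb in range(0, n, 2 * size):
            left = current[lb:lb + size]
            right = current[lb + size:lb + 2 * size]
            merged.extend(simple_merge(left, right))
        current = merged
        size *= 2
    return current


print(merge_sort([56, 57, 92, 38, 44, 90, 61, 73]))
```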

• Analysis of Algorithms: Measuring Algorithm Efficiency, Asymptotic Analysis, The
Big-O Notation, Find the complexity of Algorithms: Linear Search, Binary Search,
Sorting Algorithms. Compare complexity of various searching and sorting Algorithms.

7.2 Analysis of Algorithms:


Algorithms are designed to solve problems, but a given problem can have many different
solutions. How then are we to determine which solution is the most efficient for a given
problem? One approach is to measure the execution time. We can implement the solution by
constructing a computer program, using a given programming language. We then execute the
program and time it using a wall clock or the computer’s internal clock.

The execution time depends on the following factors:

• First, the amount of data that must be processed directly affects the execution time. As
the data set size increases, so does the execution time.
• Second, the execution times can vary depending on the type of hardware and the time of
day a computer is used. If we use a multi-process, multi-user system to execute the
program, the execution of other programs on the same machine can directly affect the
execution time of our program.
• Finally, the choice of programming language and compiler used to implement an
algorithm can also influence the execution time. Some compilers are better optimizers
than others and some languages produce better optimized code than others.

Thus, we need a method to analyze an algorithm’s efficiency independent of the
implementation details.

7.3 Complexity Analysis:

To determine the efficiency of an algorithm, we can examine the solution itself and measure
those aspects of the algorithm that most critically affect its execution time. For example, we
can count the number of logical comparisons, data interchanges, or arithmetic operations.
Consider the following algorithm for computing the sum of each row of an n × n matrix and
an overall sum of the entire matrix:

# Version 1

totalSum = 0
for i in range( n ) :
    rowSum[i] = 0
    for j in range( n ) :
        rowSum[i] = rowSum[i] + matrix[i,j]
        totalSum = totalSum + matrix[i,j]

Suppose we want to analyze the algorithm based on the number of additions performed. In this
example, there are only two addition operations, making this a simple task. The algorithm
contains two loops, one nested inside the other. The inner loop is executed n times and since it
contains the two addition operations, there are a total of 2n additions performed by the inner loop
for each iteration of the outer loop. The outer loop is also performed n times, for a total of 2n^2
additions.

Can we improve upon this algorithm to reduce the total number of addition operations
performed? Consider a new version of the algorithm in which the second addition is moved out
of the inner loop and modified to sum the entries in the rowSum array instead of individual
elements of the matrix.

# Version 2

totalSum = 0
for i in range( n ) :
    rowSum[i] = 0
    for j in range( n ) :
        rowSum[i] = rowSum[i] + matrix[i,j]
    totalSum = totalSum + rowSum[i]

In this version, the inner loop is again executed n times, but this time, it only contains one
addition operation. That gives a total of n additions for each iteration of the outer loop, but the
outer loop now contains an addition operator of its own. To calculate the total number of
additions for this version, we take the n additions performed by the inner loop and add one for
the addition performed at the bottom of the outer loop. This gives n + 1 additions for each
iteration of the outer loop, which is performed n times for a total of n^2 + n additions.

If we compare the two results, it’s obvious the number of additions in the second version is less
than the first for any n greater than 1. Thus, the second version will execute faster than the first,
but the difference in execution times will not be significant. The reason is that both algorithms
execute on the same order of magnitude, namely n^2. Thus, as the size of n increases, both
algorithms increase at approximately the same rate (though one is slightly better), as illustrated
numerically in the following table:

n          2n^2              n^2 + n
10         200               110
100        20,000            10,100
1000       2,000,000         1,001,000
10000      200,000,000       100,010,000
100000     20,000,000,000    10,000,100,000
Table: Growth rate comparisons for different input sizes
Figure: Graphical comparison of the growth rates from Table
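The two addition counts can also be confirmed by instrumenting the loops. The sketch below replaces the matrix arithmetic with a counter, so it counts additions without needing an actual matrix; the function names additions_version1 and additions_version2 are hypothetical.

```python
def additions_version1(n):
    """Count the additions performed by Version 1 for an n x n matrix."""
    count = 0
    for i in range(n):
        for j in range(n):
            count += 2      # one addition into rowSum[i], one into totalSum
    return count


def additions_version2(n):
    """Count the additions performed by Version 2 for an n x n matrix."""
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1      # addition into rowSum[i]
        count += 1          # addition of rowSum[i] into totalSum
    return count


for n in (10, 100, 1000):
    print(n, additions_version1(n), additions_version2(n))
```

For every n tried, the counts agree with the closed forms 2n^2 and n^2 + n used in the table.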

7.4 Big-O Notation:


Instead of counting the precise number of operations or steps, computer scientists are more
interested in classifying an algorithm based on the order of magnitude as applied to execution
time or space requirements. This classification approximates the actual number of required steps
for execution or the actual storage requirements in terms of variable-sized data sets. The term
big-O, which is derived from the expression “on the order of,” is used to specify an algorithm’s
classification.

Defining Big-O:
Assume we have a function T(n) that represents the approximate number of steps required by an
algorithm for an input of size n. For the second version of our algorithm in the previous section,
this would be written as
T2(n) = n^2 + n
We can express algorithmic complexity using the big-O notation. For a problem of size N:
• A constant-time function/method is “order 1” : O(1)
• A linear-time function/method is “order N” : O(N)
• A quadratic-time function/method is “order N squared” : O(N^2)

Definition: Let g and f be functions from the set of natural numbers to itself. The function f is
said to be O(g) (read big-oh of g), if there is a constant c > 0 and a natural number n0 such that
f(n) ≤ c·g(n) for all n ≥ n0.
Note: O(g) is a set!
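As a worked instance of the definition, consider the count for the second version of the row-sum algorithm, n squared plus n. The constants c = 2 and n0 = 1 below are one possible choice; any larger values would also serve.

```latex
\begin{align*}
T_2(n) &= n^2 + n \\
       &\le n^2 + n^2 && \text{since } n \le n^2 \text{ for all } n \ge 1 \\
       &= 2\,n^2 .
\end{align*}
```

So with g(n) = n^2, c = 2 and n0 = 1, the inequality in the definition holds for all n ≥ n0, and T2 is O(n^2).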
The general stepwise procedure for Big-O runtime analysis is as follows:
1. Figure out what the input is and what n represents.
2. Express the maximum number of operations the algorithm performs in terms of n.
3. Eliminate all but the highest-order term.
4. Remove all the constant factors.
Some of the useful properties of Big-O notation analysis are as follows:
Constant Multiplication:
If f(n) = c.g(n), then O(f(n)) = O(g(n)), where c is a nonzero constant.

Polynomial Function:
If f(n) = a0 + a1.n + a2.n^2 + ... + am.n^m, then O(f(n)) = O(n^m).

Summation Function:
If f(n) = f1(n) + f2(n) + ... + fm(n) and fi(n) ≤ fi+1(n) ∀ i = 1, 2, ..., m-1,
then O(f(n)) = O(max(f1(n), f2(n), ..., fm(n))).

Logarithmic Function:
If f(n) = log_a(n) and g(n) = log_b(n), then O(f(n)) = O(g(n));
all log functions grow in the same manner in terms of Big-O.
Basically, this asymptotic notation is used to measure and compare the worst-case scenarios of
algorithms theoretically. For any algorithm, the Big-O analysis should be straightforward as
long as we correctly identify the operations that are dependent on n, the input size.

7.5 Runtime Analysis of Algorithms


In general, we measure and compare the worst-case theoretical running
time complexities of algorithms for performance analysis. The fastest possible running time
for any algorithm is O(1), commonly referred to as Constant Running Time. In this case, the
algorithm always takes the same amount of time to execute, regardless of the input size. This is
the ideal runtime for an algorithm, but it’s rarely achievable. In actual cases, the performance
(runtime) of an algorithm depends on n, that is, the size of the input or the number of
operations required for each input item.

The algorithms can be classified as follows from the best-to-worst performance (Running Time
Complexity):

• A logarithmic algorithm – O(log n)
Runtime grows logarithmically in proportion to n.
• A linear algorithm – O(n)
Runtime grows directly in proportion to n.
• A superlinear algorithm – O(n log n)
Runtime grows in proportion to n log n, slightly faster than linearly.
• A polynomial algorithm – O(n^c)
Runtime grows faster than all of the previous classes (for c > 1).
• An exponential algorithm – O(c^n)
Runtime grows even faster than a polynomial algorithm (for c > 1).
• A factorial algorithm – O(n!)
Runtime grows the fastest and becomes quickly unusable for even small values of n.

Where n is the input size and c is a positive constant.
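To get a feel for how quickly these classes separate, the growth functions can simply be evaluated for a few input sizes. In the sketch below the constant c = 2 and the helper name growth_row are illustrative assumptions.

```python
import math


def growth_row(n, c=2):
    """Evaluate each growth function from the list above at input size n."""
    return {
        "log n": math.log2(n),
        "n": n,
        "n log n": n * math.log2(n),
        "n^c": n ** c,
        "c^n": c ** n,
        "n!": math.factorial(n),
    }


for n in (4, 8, 16):
    row = growth_row(n)
    print(n, [round(value) for value in row.values()])
```

Already at n = 16 each value in the row is larger than the one before it, in the same order as the classification above.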

7.6 Algorithmic Examples of Runtime Analysis:

Some examples of all those types of algorithms (in worst-case scenarios) are mentioned
below:
• Logarithmic algorithm – O(log n) – Binary Search.
• Linear algorithm – O(n) – Linear Search.
• Superlinear algorithm – O(n log n) – Heap Sort, Merge Sort.
• Polynomial algorithm – O(n^c) – Strassen’s Matrix Multiplication, Bubble Sort,
Selection Sort, Insertion Sort, Bucket Sort.
• Exponential algorithm – O(c^n) – Tower of Hanoi.
• Factorial algorithm – O(n!) – Determinant Expansion by Minors, Brute force
Search algorithm for Traveling Salesman Problem.

7.7 Memory Footprint Analysis of Algorithms


For performance analysis of an algorithm, runtime is not the only relevant metric;
we also need to consider the amount of memory the program uses. This is referred to as the
Memory Footprint of the algorithm, also known as Space Complexity.
Here also, we need to measure and compare the worst-case theoretical space complexities of
algorithms for the performance analysis. It basically depends on two major aspects described
below:
• Firstly, the implementation of the program is responsible for memory usage. For example,
a recursive implementation usually reserves more memory than the
corresponding iterative implementation of a particular problem.
• The other one is n, the input size or the amount of storage required for each item. For
example, a simple algorithm with a large input can consume more memory
than a complex algorithm with a small input.

Algorithmic Examples of Memory Footprint Analysis: The algorithms below are
classified from best to worst performance (Space Complexity), based on worst-case
scenarios:
Ideal algorithm - O(1) - Linear Search, Binary Search (iterative), Bubble Sort, Selection Sort,
Insertion Sort, Heap Sort, Shell Sort.
Logarithmic algorithm - O(log n) - Binary Search (recursive, for the recursion stack).
Linear algorithm - O(n) - Merge Sort (auxiliary array), Quick Sort (recursion stack in the worst case).
Linear algorithm in n and k - O(n+k) - Radix Sort.

7.8 Space-Time Tradeoff and Efficiency:


There is usually a trade-off between optimal memory use and runtime performance. In general,
for an algorithm, space efficiency and time efficiency sit at two opposite ends, and each
point in between them has a certain time and space efficiency. So, the more time efficiency you
have, the less space efficiency you tend to have, and vice versa.
For example, the merge sort algorithm is fast but requires extra space to do its
operations. On the other side, bubble sort is slow but requires the minimum
space.
At the end of this topic, we can conclude that finding an algorithm that runs in less
time and also requires less memory space can make a huge difference in how
well a program performs.

7.9 Review Questions


1. Explain linear search algorithm with the help of an example. What is complexity of
linear search algorithm?
2. Write algorithm for binary search. Compare the complexity of binary search algorithm
with linear search algorithm.
3. What is sorting? List the different types of sorting techniques.
4. What factors are to be considered during selection of a sorting technique?
5. What is bubble sort? Explain the complexity of bubble sort.
6. What is the advantage of bubble sort over other sorting techniques?
7. What is the disadvantage of bubble sort?
8. How is bubble sort different from quick sort?
9. What is Quick Sort? Explain its complexity?
10. What is the advantage of Quick sort over other sorting techniques?
11. What is the disadvantage of Quick sort?
12. What is Merge Sort? Explain its complexity?
13. What is the advantage of Merge sort over other sorting techniques?
14. What is the disadvantage of Merge sort?
15. What is insertion sort? Explain its complexity?
16. What is the advantage of Insertion sort over other sorting techniques?
17. What is the disadvantage of Insertion sort?
18. What is selection sort? Explain its complexity?
19. What is the advantage of selection sort over other sorting techniques?
20. What are the disadvantages of selection sort?
21. What is Heap sort? Explain its complexity?
22. What is the advantage of Heap sort over other sorting techniques?
23. What are the disadvantages of Heap sort?
24. Create a heap 'H' from the following list of numbers:4
44,30,50,22,60,55,77,55
25. Consider the heap 'H' in question 24 above. ('H' is a min heap, since the
smaller elements are on the top of the heap rather than the larger elements.) Describe the
heap after item=11 is inserted into 'H' and item=22 is deleted.
26. Sort the following list in ascending order using bubble / selection / insertion / quick /
merge sort: a)56,57,92,38,44,90,61,73.
b) D,A,T,A,S,T,R,U,C,T,U,R,E.
27. Which of the following sorts will take the longest time to execute and which will take the
shortest time?
(a)Merge Sort
(b)Insertion sort
(c)selection sort
(d)Heap sort
(e)Quick sort
(f)Bubble sort
28. What is a binary heap and what is its complexity?
29. How many comparisons are required to sort an array of 50 elements using selection sort /
bubble sort / merge sort / quick sort / insertion sort / heap sort, if the original array is
already sorted?
