
About This Course Towards Computational Problem Solving From Problems to Programs The Python Programming Language

Programming Fundamentals 1
Lesson 1

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 59

Outline

About This Course

Towards Computational Problem Solving

From Problems to Programs

The Python Programming Language

2 / 59
About this course

Course Organization

- Course name: Programming Fundamentals 1 (BICS 1st semester)
- Schedule:
  - Total volume: 5 ECTS (What does this mean?)
  - Lecture: every Tuesday 08:45-11:15 in room MSA 3.370
  - Practicals: every Wednesday 11:30-13:00 in room MSH 1.010-TIC
- Evaluation
  - Course: 60%
  - Practicals: 40%

3 / 59
Towards Computational Problem Solving

Human Problem Solving

- We solve problems "by hand" (without technical aid) on a daily basis
  - find car keys
  - get lunch
  - find classroom

4 / 59
Towards Computational Problem Solving

Dealing with Complexity

- Humans are not very efficient at dealing with complex problems
  - Especially those involving large amounts of data
  - DISCUSSION: examples?
- There are limits to our mental capacities

Miller's Law
The number of objects the average person can keep in working memory is seven, plus or minus two.

5 / 59
Towards Computational Problem Solving

Computational Power

- Computers are much faster than humans at solving many problems.
- Computing power has grown rapidly over the past decades.
  - Trillion-fold increase in computing power over the last 60 years

Large numbers
1 million = 10^6 = 1,000,000
1 billion = 10^9 = 1,000,000,000
1 trillion = 10^12 = 1,000,000,000,000

For comparison: human lifespan in seconds? Less than 4 billion seconds.

6 / 59
Towards Computational Problem Solving

Course Objective

- Computational problem solving: solving problems using computers

Main Objective of this Course
Teaching computational problem solving.

7 / 59
From Problems to Programs

Problem Definition

- Starting point: Problem Definition
- Discussion: how would you define a problem?
- Problem definition comprises
  - Problem input (including possible restrictions)
  - Problem output, and relation between input and output

8 / 59
From Problems to Programs

Example Problem

Example (Number Search)
Input: a sequence of numbers and a number x
Output: first position at which x occurs (starting with 0), or -1 if x does not occur in the sequence
Sample inputs/outputs:
- Input "23, 5, 6", 7 produces output: -1
- Input "23, 5, 6", 5 produces output: 1
- Input "23, 5, 6", 23 produces output: 0

9 / 59
From Problems to Programs

Declarative versus Imperative Knowledge

- The above problem description defines what needs to be computed. This is an example of declarative knowledge.
- Ultimately, we need to define how to compute it, typically as a sequence of instructions
- Such a sequence of instructions is called a program. This is an example of imperative knowledge.
- The instructions must be understandable by a computer

10 / 59

From Problems to Programs


Overview

11 / 59
From Problems to Programs

Algorithms

- A key step in designing a program is coming up with the method to solve the problem
- We call such a method an algorithm
- An algorithm is designed for a particular problem.

12 / 59
From Problems to Programs

Properties of Algorithms

- An algorithm consists of a finite number of instructions
- It accepts an input and produces an output of the type required by the problem
- An algorithm is correct if
  - for every valid input, it produces in a finite amount of time the output as specified in the problem definition

13 / 59
From Problems to Programs

Algorithm for Number Search

Recall the Number Search Problem
Input: a sequence of numbers and a number x
Output: first position at which x occurs (starting with 0), or -1 if x does not occur in the sequence

Algorithm for the Number Search Problem
- As long as the end of the list has not been reached, repeat the following two steps
  - Select the next number
  - If the selected number = x, output current position and EXIT
- Output -1
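
For readers who want to see where this is heading, the steps above could be written in Python roughly as follows. This is only a sketch: the function name number_search and the use of a Python list and a while loop are assumptions of this example, and those constructs have not been introduced at this point in the course.

```python
def number_search(numbers, x):
    """Return the first position of x in numbers (counting from 0), or -1."""
    position = 0
    while position < len(numbers):      # end of the list not yet reached
        if numbers[position] == x:      # selected number equals x
            return position             # output current position and exit
        position += 1                   # select the next number
    return -1                           # x does not occur in the sequence


print(number_search([23, 5, 6], 7))   # -1
print(number_search([23, 5, 6], 5))   # 1
print(number_search([23, 5, 6], 23))  # 0
```

The three calls reproduce the sample inputs and outputs from the problem definition.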

14 / 59
From Problems to Programs

Algorithm versus Program

Discussion
Why does the algorithm described on the previous slide not constitute a program? What is missing?

Answer
The algorithm was formulated in English.
A program must be formulated in a language the computer understands.

15 / 59

From Problems to Programs


What is a computer?

16 / 59
From Problems to Programs

Fixed-Program Computers

- Early computers were fixed-program computers
- Could only execute a fixed task, e.g., solve systems of linear equations

17 / 59
From Problems to Programs

Stored-Program Computers

- Modern computers are generally stored-program computers
- Programs (sequences of instructions) are stored in memory and can be executed
- Programs can easily be changed

[Figure: Manchester Mark 1]

18 / 59
From Problems to Programs

Power of Computers

- To understand the power of computers, it helps to study an abstraction: the Turing Machine

19 / 59
From Problems to Programs

Turing Machine

- A Turing machine (TM) represents an abstraction of a fixed-program computer
- Consists of
  - an infinite tape containing symbols
  - a finite set of states
  - a read-write head with a current state
- In one time unit
  - TM reads a symbol
  - Based on the current state and the symbol read, it
    - writes a new symbol
    - moves head left or right
    - enters the next state
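
The read/write/move/next-state cycle can be made concrete with a very small simulator. The sketch below is illustrative only: the function run_turing_machine, the dictionary-based transition table, the blank symbol "_" and the example machine flip_bits are all assumptions of this example, not part of the course material.

```python
def run_turing_machine(transitions, tape, state, halt_state, blank="_", max_steps=1000):
    """Run a single-tape Turing machine and return the final tape contents."""
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):      # safety bound, since a TM may never halt
        if state == halt_state:
            break
        symbol = cells.get(head, blank)                          # read a symbol
        new_symbol, move, state = transitions[(state, symbol)]   # look up the rule
        cells[head] = new_symbol                                 # write a new symbol
        head += 1 if move == "R" else -1                         # move head right or left
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical example machine: invert every bit, halt at the first blank.
flip_bits = {
    ("scan", "0"): ("1", "R", "scan"),
    ("scan", "1"): ("0", "R", "scan"),
    ("scan", "_"): ("_", "R", "done"),
}

result = run_turing_machine(flip_bits, "1011", "scan", "done")
print(result)  # 0100
```

The transition table plays the role of the fixed program: changing the machine's behavior means supplying a different table.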

20 / 59
From Problems to Programs

Turing Computability

Church-Turing Thesis
If a problem is computable (that is, solvable by a computer), then a Turing Machine can be programmed to compute it.

21 / 59
From Problems to Programs

Halting Problem

Question
Can every problem be computed?

Answer
No. Example of a non-computable problem: Halting Problem
- Input: Program P (as string), input x for Program P
- Output: Yes if P halts on input x, No otherwise.

22 / 59
From Problems to Programs

Machine Code

- Natively, computers only understand machine code

23 / 59
From Problems to Programs

From Machine Language to Programming Languages

Question
Does this mean that we need to use machine code to program computers?

Answer
- Fortunately NOT.
- We can use programming languages that use a higher level of abstraction, closer to our natural language
- There exist special programs that translate these high-level programs to machine code.
- These programs are called compilers.

24 / 59

From Problems to Programs


Poll

Question
Which programming language(s) do you know?

25 / 59
From Problems to Programs

Programming Languages

- A programming language is said to be Turing complete if any Turing machine can be expressed in this language
- All modern programming languages are Turing complete (Java, C++, Python, ...)
- We will use Python for expressing programs

26 / 59
The Python Programming Language

Basic Facts

- Python was created by Guido van Rossum and first released in 1991
- Emphasizes code readability (e.g., through indentation)
- Programs can be expressed in fewer lines of code than in languages such as Java or C++
- Widely used in many fields (computer science, engineering disciplines, life sciences, ...)
  - thanks to many specialized libraries
- Free software, well documented

27 / 59
The Python Programming Language

Further Facts

- Python is an interpreted language
  - Instructions executed directly
  - Not first translated into machine code, as is the case for compiled languages
  - Good for beginners (Why?)
- Python is a dynamically typed language
  - Correctness of types can generally only be checked at runtime
  - More about this later

28 / 59
The Python Programming Language

Installation

- Python can be downloaded from python.org
- Course uses Python 3 (currently: version 3.6)
  - Not backwards compatible with Python 2
- First place to go to for Python resources: python.org

29 / 59
The Python Programming Language

Python Shell

- Most basic way to use Python: Python shell
  - Can be run in a terminal window on most platforms (next slide) ... or executed online: python.org/shell (DEMO)
  - Allows you to execute Python instructions interactively
  - No support for saving code

30 / 59
The Python Programming Language

ID(L)E

- IDE: Integrated Development Environment
  - Bunch of tools to allow you to write programs
- IDLE: Integrated Development and Learning Environment
  - IDE that comes with Python
  - Basic tool to get started with Python
  - More advanced IDE (PyCharm) will be introduced in the practicals
  - Includes Python shell
  - DEMO of IDLE

31 / 59
The Python Programming Language

The Simplest Program

- Earlier definition: a program is a sequence of instructions
- We will use the term "statement" since it is commonly used in Python
- Thus: a Python program is a sequence of statements

Question
So what is the simplest Python program?

Answer
The empty program = empty sequence of statements.
Not a very useful one!

32 / 59
The Python Programming Language

Comments

# This is another version of the empty program
# It at least illustrates the use of comments in Python
# Comments are single lines of text preceded by '#'
# Comments are ignored by the Python interpreter

33 / 59
The Python Programming Language

Data and Objects

- To write useful programs, we need data
- In Python data is represented in terms of objects
- Each object has
  - an identity
  - a type
  - a value

34 / 59
The Python Programming Language

About Types

- The type of an object determines which operations the object supports.
- Example:
  - Integers can be multiplied
  - Strings can be concatenated or split into substrings

35 / 59
The Python Programming Language

Numeric Types

- Objects representing numbers have numeric types
- We distinguish three different numeric types:
  - Integer numbers (called int in Python)
  - Floating-point numbers (called float in Python)
  - Complex numbers (called complex in Python)

36 / 59
The Python Programming Language

Literals

- A literal denotes a concrete data value in a program
- Examples of literals for the numeric types:
  - 42 (integer number)
  - 3.14 (floating-point number)
  - 1j (imaginary number)

37 / 59
The Python Programming Language

Boolean Type

- The type bool in Python is used to represent the Boolean values True and False
- True and False are the only literals of Boolean type

38 / 59
The Python Programming Language

Variables

- To reuse a data value, we need to give it a name
- A variable is a name of a data value
- To assign a name to a value, we need to use an assignment statement

Example (Assignment statements)
a = 3
b = 2.5
c = 1j
Compare with equality operator in math!

39 / 59
The Python Programming Language

Assignment Statements

- Multiple assignments can be done in the same statement
  - a, b = 2, 3 is equivalent to: a = 2; b = 3
  - All expressions on the right side are evaluated before performing any assignment
- So how do you exchange values of variables x and y?
  - Like this: x, y = y, x
  - DEMO
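
The following short sketch shows why evaluating the whole right side first matters. The variable names and values are arbitrary choices for illustration.

```python
x, y = 1, 2
x, y = y, x      # the right side (2, 1) is evaluated first, then assigned
print(x, y)      # 2 1

# For contrast: two separate assignments lose one of the values.
a, b = 1, 2
a = b            # a is now 2; the old value 1 is gone
b = a            # b stays 2
print(a, b)      # 2 2
```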

40 / 59
The Python Programming Language

Variable Identifiers

- Variable identifiers (or names) can contain
  - uppercase and lowercase letters (Python identifiers are case-sensitive!)
  - digits (not in first position)
  - underscore
- Do not use reserved words as identifiers:
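
One way to look up the reserved words for the installed Python version is the standard-library module keyword. The sketch below also uses the string method isidentifier() to check the character rules listed above; the sample names are made up for illustration.

```python
import keyword

print(keyword.kwlist)                  # the reserved words of this Python version

if_is_reserved = keyword.iskeyword('if')      # True: 'if' is a reserved word
area_is_reserved = keyword.iskeyword('area')  # False: 'area' is free to use

# isidentifier() checks only the character rules (letters, digits, underscore).
valid_name = 'my_var2'.isidentifier()         # True
invalid_name = '2nd_value'.isidentifier()     # False: starts with a digit

print(if_is_reserved, area_is_reserved, valid_name, invalid_name)
```

Note that isidentifier() does not check for reserved words, so both tests are needed to decide whether a name is usable.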

41 / 59
The Python Programming Language

Variable Naming: Advice

- Use meaningful names for variables
  - This makes your programs more readable

Question
Which of the following two code snippets is easier to understand, and why?

Snippet 1:
x1 = 3.14
x2 = 13.1
c = x1*(x2**2)

Snippet 2:
pi = 3.14
radius = 13.1
area = pi*(radius**2)

42 / 59
The Python Programming Language

About dynamic typing

- Variables can be bound to objects of different types because Python is dynamically typed
- This is not possible in statically typed languages (such as Java or C++)
  - In these languages the type of a variable has to be declared before first use
  - Once declared it cannot be changed afterwards
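
A minimal illustration of rebinding one name to objects of different types; the name value and the data are arbitrary.

```python
seen_types = []

value = 3                  # value is bound to an int object
seen_types.append(type(value))

value = 2.5                # the same name is rebound to a float object
seen_types.append(type(value))

value = 'three'            # and now to a str object
seen_types.append(type(value))

print(seen_types)  # [<class 'int'>, <class 'float'>, <class 'str'>]
```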

43 / 59
The Python Programming Language

Fun with Objects and Variables

Exercise
Enter assignments similar to those given on slide 42. After each assignment check the id (or "location" or "address") of the variable and its type using the functions id() and type().
- DEMO

44 / 59
The Python Programming Language

Expressions

- Assignment statements given above are very simple
  - right-hand side is a literal
- Expressions are "pieces" of Python code that produce a value
- Expressions can be used wherever a value is needed
  - e.g., in the right-hand side of an assignment statement
- Simplest expressions: variables and literals are expressions
  - e.g., a and 3.14 are expressions

45 / 59
The Python Programming Language

Operators

- More complicated expressions can be built from simpler ones by joining them using operators
- Depending on how many expressions (called operands) are joined by an operator, we get different types of operators
  - One operand: unary operator
    - e.g., -a returns the opposite of an integer a
  - Two operands: binary operator
    - e.g., a+b returns the sum of two integers a and b
  - Three operands: ternary operator

46 / 59
The Python Programming Language

Equality and Inequality

- e1 == e2 tests the equality of the values of expressions e1 and e2
- The equality operator is not to be confused with the assignment operator = (beware, mathematicians!)
- Inequality is checked with the operator !=
- Both operators return a Boolean value

47 / 59
The Python Programming Language

Arithmetic Operators

The following operators work on numeric types; e1 and e2 denote expressions with an int or float value.
- e1 + e2, e1 - e2 and e1 * e2 return the sum, difference and product of two expressions
- two types of division:
  - e1 / e2 returns a float that is the result of dividing e1 by e2
  - e1 // e2 returns the result of the floor division (aka integer division) of e1 by e2 (the quotient rounded down to the nearest whole number, i.e., towards negative infinity)
  - E.g., 3 / 4 returns 0.75 and 3 // 4 returns 0
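
The difference between the two division operators is easiest to see with a few concrete values, including a negative one. The variable names are chosen only for this illustration.

```python
true_division = 3 / 4        # 0.75 (always a float)
floor_small = 3 // 4         # 0
floor_positive = 7 // 2      # 3
floor_negative = -7 // 2     # -4: rounds down, not towards zero
floor_float = 7.0 // 2       # 3.0: the result is a float if an operand is a float

print(true_division, floor_small, floor_positive, floor_negative, floor_float)
```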

48 / 59
The Python Programming Language

Arithmetic Operators - Continued

- e1 % e2 is the remainder of dividing integer e1 by integer e2
  - e.g., 3 % 4 returns 3 and 7 % 3 returns 1
- e1**e2 returns e1 to the power of e2, i.e., e1^e2
  - e.g., 2 ** 4 returns 16

49 / 59
The Python Programming Language

Arithmetic Conversions

- If arithmetic expressions are combined with an operator and they have different types, numeric types are converted to a common type:
  - If either argument is a complex number, the other is converted to complex;
  - otherwise, if either argument is a floating point number, the other is converted to floating point;
  - otherwise, both must be integers and no conversion is necessary
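
The three conversion rules can be checked directly in the shell by inspecting the type of each result. The variable names below are assumptions of this example.

```python
int_sum = 1 + 2              # both int: no conversion, result is int
mixed_float = 1 + 2.0        # int and float: the int is converted to float
mixed_complex = 1.5 + 2j     # float and complex: the float is converted to complex

print(int_sum, type(int_sum))              # 3 <class 'int'>
print(mixed_float, type(mixed_float))      # 3.0 <class 'float'>
print(mixed_complex, type(mixed_complex))  # (1.5+2j) <class 'complex'>
```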

50 / 59
The Python Programming Language

Boolean Operators

- The following operators work on Boolean expressions
  - a and b is True if both a and b are True and False otherwise
  - a or b is True if at least one of a and b is True and False otherwise
  - not a is True if a is False and False otherwise
- Note: a != b is equivalent to: not (a == b)
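
The definitions above amount to a truth table, which the following sketch prints. It uses a for loop, which has not been introduced yet in the course, and the list rows exists only so the results can be inspected afterwards.

```python
rows = []
for a in (True, False):
    for b in (True, False):
        # columns: a, b, a and b, a or b, not a
        rows.append((a, b, a and b, a or b, not a))
        print(a, b, a and b, a or b, not a)
```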

51 / 59
The Python Programming Language

Operator Precedence

- The arithmetic operators have the usual precedence, i.e., *, / and // have higher precedence than + and -
- Thus:
  - a + b * c has the same meaning as a + (b * c) since * has higher precedence than +
  - we can override the precedence by putting parentheses, as in (a + b) * c
- Precedence of Boolean operators is as follows (from highest to lowest): not, and, or
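
A few expressions whose value depends on precedence; the operand values and variable names are arbitrary.

```python
a, b, c = 2, 3, 4

default_order = a + b * c                 # 14: same as a + (b * c)
forced_order = (a + b) * c                # 20: parentheses override precedence

bool_default = True or False and False    # True: same as True or (False and False)
bool_forced = (True or False) and False   # False
not_first = not False and False           # False: same as (not False) and False

print(default_order, forced_order, bool_default, bool_forced, not_first)
```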

52 / 59
The Python Programming Language

Fun with Operators

Exercise
Enter expressions similar to those given above into the Python shell and observe their result
- DEMO

53 / 59
The Python Programming Language

Augmented Assignment

- Binary operators can be combined with assignment statements
  - x op= y is equivalent to: x = x op y (where op is a binary operator)

Example
x += y is equivalent to: x = x + y
x *= y is equivalent to: x = x * y

54 / 59
The Python Programming Language

Strings

- So far we have only spoken about numeric and Boolean types
- Sequences of characters are represented by objects of type str in Python
  - we call objects of this type strings
- Strings are enclosed in single quotes or double quotes, e.g., 'Paul' and "Pierre" are both valid strings

55 / 59
The Python Programming Language

Basic Output

- To output text, we can use the print function
- Arguments to a function are given between parentheses
- The print function accepts any number of objects as arguments
- Non-string arguments are converted to strings when output

Example (Basic output)
print(3) # outputs: 3
print(4, 'Hello', 'there', 27) # outputs: 4 Hello there 27
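
To inspect exactly what print produces, the output can be redirected into a text buffer. This sketch goes beyond the slide: it uses the keyword arguments sep and file of print and the standard-library class io.StringIO, none of which are covered in this lesson.

```python
import io

buffer = io.StringIO()                       # in-memory text buffer instead of the screen

print(4, 'Hello', 'there', 27, file=buffer)  # arguments are separated by one space
print(2.5, True, sep=' | ', file=buffer)     # sep changes the separator

captured = buffer.getvalue()
print(captured, end='')                      # show what was collected
```

Each print call converts its arguments to strings, joins them with the separator and appends a newline.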

56 / 59
The Python Programming Language

Basic Input

- To read text from the keyboard, use the input function
- The input function either has no argument or a prompt string as argument
- It reads a string entered by the user on the keyboard (ended by the ENTER key)
- It returns as value the string it has read

Example (Basic input)
# read string (terminated by newline) from keyboard
input()
# prompt user for new number and read it
input('Enter a number: ')

57 / 59
The Python Programming Language

A Simple Program

- The following Python program reads a number entered by the user and outputs the opposite number
- All input is read as strings; the function int() is used to convert a string (given as argument) to an integer

n = int(input('Please enter a number: '))
print(-n)

- DEMO: run this program in IDLE.

58 / 59
The Python Programming Language

Type Conversion

- The call to the function int (with argument of type string) is an example of a type conversion
- Generally we can use the name of a type to convert to that type

Example
str(3) # converts int to string
bool(2) # converts integer to boolean

DEMO
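
A few more conversions of the same kind, as a sketch to try in the shell. The variable names are made up for this example.

```python
as_text = str(3)           # '3'
as_int = int('42')         # 42
as_float = float('2.5')    # 2.5
truthy = bool(2)           # True: any non-zero number converts to True
falsy = bool(0)            # False

print(as_text, as_int, as_float, truthy, falsy)  # 3 42 2.5 True False
```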

59 / 59
Syntax and Semantics Control Flow Statements Strings and Sequences Two Simple Problems

Programming Fundamentals 1
Lesson 2

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 61

Outline

Syntax and Semantics

Control Flow Statements

Strings and Sequences

Two Simple Problems

2 / 61
Syntax and Semantics

Syntax of a Program

- At the end of the last lesson we presented the following simple program:

n = int(input('Please enter a number: '))
print(-n)

- Note that this program exhibits a certain structure and form
- We call this structure and form the syntax of the program

Discussion
Can you describe in your own words the syntax of this program?

3 / 61
Syntax and Semantics

Syntax of Python

- Python itself has a syntax, namely, the set of rules that determine which strings constitute well-formed Python programs.
- The above program is well-formed because its structure conforms to the syntax of the Python language
- In other words: the structure of the program satisfies the rules making up the syntax of Python.

Discussion
Can you describe, as an analogy, some syntactic rules of English?

4 / 61

Syntax and Semantics


Beyond Syntax

Question
Is the Python syntax enough to define the Python language?

Answer
No. We also need to define the meaning of Python programs.

Discussion
How would you define the meaning of a Python program?

5 / 61
Syntax and Semantics

Semantics

- The semantics of Python associates with each program a meaning
- The meaning can be defined as the behavior of the interpreter when it executes the program
  - Includes the output that is generated
- What we want:
  - The meaning of the program should be in line with the problem we are trying to solve
  - E.g., for the Number Search Problem the program should indeed output the position of the number to be searched (or -1 if the number is not present)

6 / 61
Syntax and Semantics

Errors in Python

- There are two types of errors you can encounter when programming in Python
- Syntax errors are those that are caught before the program is actually run
  - Example of a statement that will raise a syntax error (DEMO in IDLE):
    if True print('hello')
- The other type of errors are exceptions. These are found while running/interpreting the program
  - Example of a statement that will raise an exception (DEMO in IDLE):
    print(x) # will raise a NameError if x has not been defined
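
Both kinds of error can be observed from within a running program. The sketch below is an illustration, not part of the course demo: it uses try/except and the built-in function compile(), which are not covered in this lesson, and the names source and undefined_name are assumptions of the example.

```python
# A syntax error is detected when the code is compiled, before anything runs.
source = "if True print('hello')"
try:
    compile(source, '<example>', 'exec')
    syntax_ok = True
except SyntaxError as error:
    syntax_ok = False
    print('SyntaxError:', error.msg)

# An exception is detected only when the offending statement is executed.
try:
    print(undefined_name)      # this name was never assigned
except NameError as error:
    name_error_message = str(error)
    print('NameError:', name_error_message)
```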

7 / 61
Control Flow Statements

Straight-line Programs

- Recall that a Python program is a sequence of statements
- The few very simple programs we have seen so far are straight-line programs
  - Statements are executed one after the other, in the order they occur, until there are no more to execute
- For many applications more flexible control is needed
  - Discuss possible scenarios where this would be needed.

8 / 61
Control Flow Statements

Basic if-statements

- The if-statement is used for conditional execution
- Simplest form: if expression: statement(s)
- Meaning:
  - Expression is evaluated
  - Statements are executed only if the expression evaluates to True

9 / 61
Control Flow Statements

Basic if-statements

Example
if x < 0: print('x is negative')
Most Python programmers prefer to put the statement on a separate line:
if x < 0:
    print('x is negative')
Note: indentation is mandatory! (see the Indentation slide below)


Control Flow Statements


Conditional expressions

I The expression controlling the if-statement can be an arbitrary
Python expression
I In other words: any value can be interpreted as True or
False
I if a value is of bool type, it is obviously either True or False
I if a value is of a numeric type, it counts as True if it is not equal to 0
I if a value is of a string type, it counts as True if it is not the empty
string
I In the previous example the expression "x<0" is of type bool;
"<" is a relational operator. Other relational operators are: >,
<= (less or equal), >= (greater or equal), == (equal), != (not equal)
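
To see how a value is interpreted as a condition, one can pass it to the built-in bool function. The following lines are a small illustration with assumed sample values, not part of the original slides.

```python
# How some sample values are interpreted when used as a condition.
zero_is_true = bool(0)          # numeric zero
number_is_true = bool(3.5)      # non-zero number
empty_is_true = bool('')        # empty string
text_is_true = bool('abc')      # non-empty string
print(zero_is_true, number_is_true, empty_is_true, text_is_true)
# False True False True

x = -4                          # assumed sample value
if x:                           # -4 is non-zero, so the branch is taken
    print('x is non-zero')
print(x < 0, x == -4, x != -4)  # True True False
```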


Control Flow Statements


if-elif-else statements

I A more complete form of the if-statement is the following:

if expression:
    statement(s)
elif expression:
    statement(s)
elif expression:
    statement(s)
...
else:
    statement(s)

This selects at most one group of statements by evaluating the
expressions one by one until one is found to be true; then that
group of statements is executed. If none are true, the statements
following the else clause (if present) are executed.
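
A small illustration of the selection rule, with a hypothetical variable temperature and made-up thresholds: although several of the tests below are true for the sample value, only the first true one determines the branch that runs.

```python
# Sketch: only the first branch whose test is true is executed.
temperature = 18   # assumed sample value

if temperature < 0:
    label = 'freezing'
elif temperature < 15:
    label = 'cold'
elif temperature < 25:     # first true test for 18
    label = 'mild'
else:
    label = 'hot'

print(label)   # mild
```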


Control Flow Statements


Indentation
I The statements following an expression form a block
I Statements in a block need to be indented if on separate lines
I The block is finished when indentation level returns to that of
preceding clause (if, elif, or else)
I Python style recommendation: use 4 spaces for indentation

Example
if x < 0:
    print('x is negative')
elif x%2:
    print('x is positive and odd')
else:
    print('x is non-negative and even')

Programming Advice
Python Coding Conventions

I The recommendation to use 4 spaces per indentation level is part of
the Python coding conventions
I There are many other coding conventions for Python
I You can find all of them in:
PEP 8 – Style Guide for Python Code [1]

[1] https://www.python.org/dev/peps/pep-0008/

Control Flow Statements


Basic while-statements

I The while-statement is used for repeated execution
I This statement is also called a while-loop
I Simplest Form:
while expression:
    statement(s)
I Meaning: if expression is True, statement(s) are executed and
this process is repeated; otherwise we proceed past the while
loop.


Control Flow Statements


Example: basic while-statements

Example
x = 512
print('value of x is ', x)
count = 0
while x > 1:
    x //= 2 # floor division
    count += 1 # current block ends here
print('The approximate log2 is', count)
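
To see what "approximate" means here, the loop can be compared with the rounded-down base-2 logarithm from the math module. This is only a sketch: the helper name approx_log2 is hypothetical, and it uses a function definition, which the course introduces later.

```python
import math

def approx_log2(x):
    """Count how often x can be floor-divided by 2 before it reaches 1."""
    count = 0
    while x > 1:
        x //= 2
        count += 1
    return count

for value in [1, 512, 1000]:
    # for these values the count equals log2 rounded down
    print(value, approx_log2(value), math.floor(math.log2(value)))
# 1 0 0
# 512 9 9
# 1000 9 9
```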


Control Flow Statements


Basic for-statements

I The for-statement is used for repeated execution over a
sequence of values (from first to last)
I This statement is also called a for-loop
I Simplest Form:
for variable in expression:
    statement(s)
I Here expression denotes a sequence of elements
I To obtain the sequence of integers 0, ..., n − 1, we can
use the expression range(n)


Control Flow Statements


Example: basic for-loop

I Request a number n from the user and print the sum of the
squares of the numbers from 0 to n

n = int(input('enter number: '))
total = 0 # named total so that the built-in function sum is not hidden
for i in range(n+1):
    total = total + i * i # or: total += i*i
print('result: ', total)
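
One way to gain confidence in such a loop is to compare it with the known closed form n(n+1)(2n+1)/6 for the sum of the first n squares. The sketch below fixes n instead of reading it from the user; the function name sum_of_squares is hypothetical and not part of the slides.

```python
def sum_of_squares(n):
    """Sum of i*i for i = 0, ..., n, computed with a for-loop."""
    total = 0
    for i in range(n + 1):
        total += i * i
    return total

n = 10                                        # assumed sample value
closed_form = n * (n + 1) * (2 * n + 1) // 6  # 10 * 11 * 21 // 6
print(sum_of_squares(n), closed_form)         # 385 385
```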


Strings and Sequences


About Strings

I In the Number Search Problem we have to read a sequence of
numbers
I Not possible with the functions we have seen so far
I The input function returns a single string
I Need to break up this string to retrieve individual numbers.
I How?


Strings and Sequences


About Classes

I A class is the type of an object
I E.g., string objects have as type the str class -> DEMO
I The main elements of a class
I data items called instance variables
I functions called methods


Strings and Sequences


Functions vs Methods

I The functions we have used so far are defined outside classes
I Functions inside classes are called methods
I Unlike functions, methods must be called on objects using the
following syntax:
I x.f()
I calls method f on object x
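
The two call styles can be put side by side on a string object. The variable names below are chosen for illustration only.

```python
word = 'Python'          # assumed sample string

length = len(word)       # function call: the object is passed as an argument
shouted = word.upper()   # method call: upper is called on the object word

print(type(word))        # <class 'str'>
print(length, shouted)   # 6 PYTHON
```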


Strings and Sequences


A Few Simple String Methods

I x.lower() (x.upper()) returns a copy of the string x with all
characters converted to lower (upper) case
I x.startswith(y) (x.endswith(y)) returns True if string x
starts with (ends with) string y and False otherwise

Example
'A3B'.lower() #returns: 'a3b'
'A3B'.startswith('A3') #returns: True
'A3B'.endswith('A3') #returns: False

DEMO!


Strings and Sequences


Immutability of Strings

I x.lower() returns a copy of the string
I Question: Why doesn't it modify the string x?
I Answer: Strings are immutable in Python
I i.e. [2], their contents cannot be changed

[2] abbreviation of "id est" (Latin), meaning: "that is"
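
Immutability can be checked directly: calling lower leaves the original string untouched, and an attempt to overwrite a single character is rejected. The sketch uses try/except (not yet covered in the slides) only to keep the program running after the rejected assignment.

```python
original = 'Hello'               # assumed sample string
lowered = original.lower()       # a new string; original is unchanged
print(original, lowered)         # Hello hello

try:
    original[0] = 'J'            # strings do not support item assignment
    changed = True
except TypeError:
    changed = False
print(changed)                   # False
```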

Strings and Sequences


Strings are Sequences!

I Strings are an example of a Python sequence type
I All objects of a sequence type support the built-in function
len which returns the length of the sequence
I Objects of sequence type also support the following
operations:
I concatenation
I membership testing
I indexing
I slicing
I We give examples of these operations for strings on the following
slides (but they apply to any sequence type)


Strings and Sequences


Concatenating/ Repeating Strings

I x + y denotes the concatenation of strings x and y
I i*x denotes concatenation of i copies of string x

Example
'Hello ' + 'world' #'Hello world'
3*'4' #'444'

I Question: What is the value of 0*'abc'?
I Answer: the empty string


Strings and Sequences


Membership Testing

I Generally, for sequences "x in s" tests whether some item in
sequence s is equal to x
I For strings we can use it more generally to test whether x is a
substring of s
I To find the position of the first occurrence of an element x in s, we
can use the index method: s.index(x) (it raises a ValueError if x does
not occur in s)
Example
'Hello' in 'Hello world' #True
'' in 'abc' #True
'z' in 'xy' #False
'Hello'.index('l') #2


Strings and Sequences


Indexing

I We denote the item at position n in a sequence s by s[n]
I Indices start at 0, so s[0] is the first item

Example
'Hello world'[0] #'H'
'xyz'[2] #'z'


Strings and Sequences


Slicing

I For a sequence s, s[i:j] denotes the subsequence of elements at
positions i to j−1
I if i is omitted, it behaves the same as if i=0
I if j is omitted, it behaves the same as if j=len(s)

Example
'Hello world'[0:5] #'Hello'
'Hello world'[5:] #' world'
'Hello world'[:5] #'Hello'
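
A few properties of slices that follow from the rules above, shown with assumed sample positions i and j (the variable names are illustrative only):

```python
s = 'Hello world'
i, j = 3, 8                      # assumed sample positions

piece = s[i:j]                   # characters at positions 3, 4, 5, 6, 7
print(repr(piece))               # 'lo wo'
print(len(piece) == j - i)       # True
print(s[:5] + s[5:] == s)        # True: the two slices split s without overlap
print(s[0] == s[0:1])            # True: for strings, an item is a string of length 1
```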


Strings and Sequences


Lists

I As explained above, strings are an example of a sequence type
I For more general sequences (not just sequences of characters)
we can use the list datatype
I Unlike strings, lists are mutable (i.e., the contents can change)
I A list is written as a comma-separated sequence of elements
enclosed in square brackets

Example
['2','1a','3b'] #list with 3 strings
[3,1] #list with two numbers
[] #empty list


Strings and Sequences


Lists

I Since the list type is a sequence type, we can use all the operations
(as well as the len function) shown above for strings

Example
len(['2','1a','3b']) # 3
[1] + [2,3] # [1,2,3]
[5,4,'7a'][1:] # [4,'7a']
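
The remaining sequence operations work on lists as well, and, unlike with strings, an item can be replaced in place. A short sketch with an assumed sample list:

```python
numbers = [5, 4, '7a']      # assumed sample list
numbers[0] = 9              # allowed: lists are mutable
print(numbers)              # [9, 4, '7a']
print(4 in numbers)         # True  (membership testing)
print(numbers.index(4))     # 1     (position of first occurrence)
print(2 * [0, 1])           # [0, 1, 0, 1]  (repetition)
print(len(numbers + [1]))   # 4     (concatenation, then len)
```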


Strings and Sequences


String methods
I We have seen a few simple methods for strings (e.g., upper and
lower)
I We now present a more sophisticated one, the split method,
which breaks up a string into multiple strings
I The simplest form is
x.split() # x must be a string
I This call returns the list of strings making up x that are separated by
whitespace (e.g., spaces, tabs or newlines)

Example
'Hello world'.split() #['Hello','world']
'23 1 78'.split() #['23','1','78']
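
Note that split returns strings, not numbers. If numeric comparisons are needed, each piece can be converted with int. The sketch below assumes the fixed string line in place of user input and uses the list method append, which the slides have not introduced.

```python
line = '23 1 78'             # stands in for what input() would return
parts = line.split()

numbers = []
for part in parts:
    numbers.append(int(part))    # convert each piece to an integer

print(parts)                 # ['23', '1', '78']
print(numbers)               # [23, 1, 78]
print('9' < '10', 9 < 10)    # False True: strings compare character by character
```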

Two Simple Problems


Algorithm for Number Search

Recall the Number Search Problem

Input: a sequence of numbers + a number x
Output: first position at which x occurs (starting with 0), or -1 if
number not in list

Algorithm for the Number Search Problem

I as long as the end of the list has not been reached
I select the next number
I If the selected number = x, output current position and EXIT
I output -1

We are now ready to implement this algorithm in Python


Two Simple Problems


Program for Number Search

#request input sequence
s = input('Please enter the sequence of numbers: ')
numberList = s.split()
#request number to search for
n = input('Please enter the number to search for: ')
#look for first occurrence
pos = -1
for x in numberList: # for-loop works with any sequence
    pos += 1
    if numberList[pos] == n: # or: if x == n
        print('number found at position', pos)
        quit()
print('number not found')


Two Simple Problems


Exiting a loop

I In the previous program we use the quit() command to exit
the program
I Note that this command stops the interpreter
I This is not always what we want, e.g., if this code is executed
within some larger program
I We can use the break command to exit the enclosing loop
(for or while) and continue with the following statements
I An alternative version of the program using break is
shown on the next slide


Two Simple Problems


Program for Number Search: Version 2

# request input sequence
# initialize numberList and n as before
# look for first occurrence
pos = -1
found = False
for x in numberList:
    pos += 1
    if numberList[pos] == n: # or: if x == n
        found = True
        break # leaves the for loop
# output result
if found:
    print('number found at position: ', pos)
else:
    print('number not found')
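
For experimenting without typing input each time, the same search can be wrapped in a function that returns the position, or -1 as required by the problem statement. This is a sketch under assumptions: the name linear_search is hypothetical, and function definitions are only introduced in the next lesson.

```python
def linear_search(number_list, target):
    """Return the first position of target in number_list, or -1 if absent."""
    position = -1
    pos = 0
    for x in number_list:
        if x == target:
            position = pos
            break              # leave the loop at the first match
        pos += 1
    return position

print(linear_search([7, 3, 9, 3], 3))   # 1
print(linear_search([7, 3, 9, 3], 5))   # -1
print(linear_search([], 5))             # -1
```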


Two Simple Problems


Loop termination

I In general, programs may not terminate if they contain loops
I It is not always easy to distinguish infinite loops from programs
that take very long

Question
Why is it clear that the previous program (both versions)
terminates?

Answer
Because the body of the for-loop is executed at most once for each
number in the numberList sequence, and this sequence is finite.


Two Simple Problems


Another Search Problem

I We consider a variation of the number search problem
I We restrict the input: the sequence of numbers has to be
sorted.
I We relax the output: the output is any position at which the
number to be searched occurs, or -1 if it does not occur in the
sequence
I Note: the algorithm for the Number Search problem still
works (Why?)

Discussion
Do you see a more efficient (faster) method to solve this problem?


Two Simple Problems


Binary Search

I The algorithm one uses for the modified problem is called
binary search
I Here is an informal description from Wikipedia [3] (slightly
adapted)

Binary Search - Informal Description

Binary search compares the target value to the middle element of
the array; if they are unequal, the half in which the target cannot
lie is eliminated and the search continues on the remaining half
until it is successful. If the search ends with the remaining half
being empty, the target is not in the array.

[3] https://en.wikipedia.org/wiki/Binary_search_algorithm

Two Simple Problems


Binary Search

I Although the algorithm looks fairly obvious, it is known that it
is easy to get the implementation (program) wrong
I For that reason we will proceed carefully when devising an
algorithm
I We start by defining the "search interval" by two variables a
and b, a denoting the leftmost index and b denoting the
rightmost index
I Initially: a = 0 and b = L − 1 where L is the length of numberList
I The problem amounts to finding an integer i such that
a ≤ i ≤ b and numberList[i] == n (where n is the number to
be searched)


Two Simple Problems


Binary Search

I We will now present the first version of a program for
performing binary search
I Note that the program is incomplete since the actual search is
expressed using pseudocode
I Pseudocode is an English description that will be refined into
code
I We have marked the pseudocode that needs to be refined as a
comment with contents between quotes


Two Simple Problems


Binary Search Program - Version 1

# "read numberList and n from user"
a = 0
b = len(numberList)-1
# "compute i such that a <= i <= b and numberList[i] == n"


Two Simple Problems


Binary Search Program - Version 2

I Replace the second piece of pseudocode in version 1 by a loop (also
in pseudocode)

# "read numberList and n from user"
a=0
b=len(numberList)-1
# "as long as list from positions a to b nonempty"
#     "if n is equal to middle,
#      output middle position and exit loop"
#     "else if n smaller than middle,
#      restrict sequence to left half"
#     "else restrict sequence to right half"
# "if sequence from a to b is empty, output 'not found' "


Two Simple Problems


Computing the Middle

I A key step in the loop is to determine the position of the
middle element
I Question: How do we compute the middle position?
I Answer: we first need to define what a middle element is

Definition
A middle element is an element in a sequence such that at most
half of the elements are on each side of the middle element.

Example
For [2, 4, 8] we want at most 3/2, i.e., at most one element on
either side, so the only choice is 4
For [3, 5, 7, 9] we want at most 4/2, i.e., at most two elements on
either side, so the possible choices are 5 or 7.

Two Simple Problems


Computing the Middle - Code

I More generally:
I if a sequence has odd length 2k + 1, the middle element is the
element at position k (first index = 0)
I if a sequence has even length 2k, the middle element is the element
at position k − 1 or k
I Guess expression for middle element: (a + b)//2
I Check this for the sequence from position a to position b:
I if it has odd length 2k + 1, then b = a + 2k and
(a + b)//2 = (2a + 2k)//2 = a + k (satisfying above
condition)
I if it has even length 2k, then b = a + 2k − 1 and
(a + b)//2 = (2a + 2k − 1)//2 = a + k − 1 (satisfying above
condition)
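
The case analysis above can also be checked by brute force for small intervals. The sketch below encodes the definition of a middle element (at most half of the elements on each side) in a hypothetical helper is_middle and tests (a + b)//2 against it; the bounds 6 and 12 are arbitrary choices.

```python
def is_middle(a, b, m):
    """True if at most half of the elements a..b lie on each side of m."""
    length = b - a + 1
    left = m - a       # number of elements before position m
    right = b - m      # number of elements after position m
    return 2 * left <= length and 2 * right <= length

all_ok = True
for a in range(0, 6):
    for b in range(a, 12):
        if not is_middle(a, b, (a + b) // 2):
            all_ok = False
print(all_ok)   # True
```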


Two Simple Problems


Binary Search Program - Version 3

# "read numberList and n from user"
a = 0
b = len(numberList)-1
while a<=b:
    m= (a+b)//2
    if n==numberList[m]:
        print('Number at position: ',m)
        break
    elif n<numberList[m]:
        b = m-1
    else:
        a = m+1
if a>b:
    print('number not found')
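
A runnable variant of Version 3, written as a sketch under assumptions: the loop is wrapped in a hypothetical function binary_search that returns the position (or -1) instead of printing, and the input is assumed to be a sorted list of integers rather than the strings produced by split.

```python
def binary_search(number_list, n):
    """Return a position of n in the sorted list number_list, or -1."""
    a = 0
    b = len(number_list) - 1
    while a <= b:
        m = (a + b) // 2
        if n == number_list[m]:
            return m               # found: leave the loop and the function
        elif n < number_list[m]:
            b = m - 1              # continue in the left half
        else:
            a = m + 1              # continue in the right half
    return -1                      # interval a..b is empty: n is absent

data = [2, 3, 5, 7, 11, 13]        # assumed sorted sample data
print(binary_search(data, 7))      # 3
print(binary_search(data, 4))      # -1
print(binary_search([], 4))        # -1
```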

Two Simple Problems


Termination

I As we did for the Number Search Problem, we want to argue
why the above loop (and hence the program) terminates
I Each execution of the statements inside the loop is called an
iteration
I Termination thus means that the number of iterations of the
loop is finite
I Proving termination is a bit more difficult here because
I while-loop instead of for-loop (Explain!)
I search algorithm more complex

Discussion
Can anybody explain why the loop in Version 3 should terminate?


Two Simple Problems


Loop Variant

I There is a general approach for proving termination of a loop
I It is based on a loop variant

Definition (Loop Variant)

A loop variant is an integer expression E with the following
properties:
I For an iteration to execute we must have E ≥ 0
I each iteration of the loop decreases the value of E
We note: if there is a loop variant for a loop, then the loop must
terminate (Why?)


Two Simple Problems


Find the loop variant

I Challenge: find the loop variant in the loop below!

while a<=b:
    m= (a+b)//2
    if n==numberList[m]:
        print('Number at position: ',m)
        break
    elif n<numberList[m]:
        b = m-1
    else:
        a = m+1


Two Simple Problems


Loop Variant for Binary Search

I We claim that b − a is a loop variant for the above loop
I Indeed
I An iteration is only executed if a ≤ b, i.e., if b − a ≥ 0
I Each iteration decreases the value of b − a (Why?)
I We conclude that the program for binary search (version 3)
does indeed terminate.
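
The behaviour of the variant can be observed by recording b − a at the start of every iteration. This is an illustrative sketch, not a proof; the function name variant_trace and the sample data (the even numbers below 40, searching for the absent value 33) are assumptions.

```python
def variant_trace(number_list, n):
    """Run binary search and record b - a at the start of each iteration."""
    a, b = 0, len(number_list) - 1
    trace = []
    while a <= b:
        trace.append(b - a)        # value of the variant before this iteration
        m = (a + b) // 2
        if n == number_list[m]:
            break
        elif n < number_list[m]:
            b = m - 1
        else:
            a = m + 1
    return trace

trace = variant_trace(list(range(0, 40, 2)), 33)
print(trace)   # [19, 9, 4, 1, 0]
```

Every recorded value is non-negative and smaller than the one before it, which is what the two properties of a loop variant require.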


Two Simple Problems


Correctness of Programs

I Similarly to the correctness notion for algorithms (see lesson
1) we define correctness of programs as follows:
I for each possible input, the program terminates in a finite time
and returns the output requested
I Correctness for the number search program is fairly obvious
I This is not so for the binary search program (Why?)

Discussion
How can we argue that the program for binary search is correct?


Two Simple Problems


Loop Invariant

I To prove correctness, it is helpful to work with loop invariants

Definition (Loop Invariant)

A loop invariant for a loop is a (boolean) condition that holds before
and after each iteration of the loop


Two Simple Problems


Find the loop invariant

I Challenge: find a loop invariant in the loop below!

while a<=b:
    m= a+(b-a+1)//2
    if n==numberList[m]:
        print('Number at position: ',m)
        break
    elif n<numberList[m]:
        b = m-1
    else:
        a = m+1


Two Simple Problems



Loop Invariant for Binary Search

I We claim that the following is a loop invariant for the above
loop:
if numberList contains n, then it must be at a position
between a and b
I To see this:
I first note that it trivially holds initially, since the positions from
a = 0 to b = len(numberList) − 1 cover the whole list
I now suppose it holds before an iteration
I if n is found in the middle then it still holds after the iteration
since a and b are not modified
I if n is smaller than the middle, it must be in the first half (if at
all), because the sequence is sorted
I otherwise n must be larger and thus be in the second half (if
at all)
I We conclude that the condition holds after the iteration as well
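
The invariant can be turned into an executable check with assert statements. The sketch below is an illustration under assumptions (a hypothetical function checked_binary_search on a sorted list of integers, using the middle computation (a + b)//2); the membership tests make it slow, so it is meant for experimenting, not for real use.

```python
def checked_binary_search(number_list, n):
    """Binary search that asserts the loop invariant around the iterations."""
    a, b = 0, len(number_list) - 1
    while a <= b:
        # invariant: if n occurs in number_list, it occurs at a position in a..b
        assert n not in number_list or n in number_list[a:b + 1]
        m = (a + b) // 2
        if n == number_list[m]:
            return m
        elif n < number_list[m]:
            b = m - 1
        else:
            a = m + 1
    # here a > b, so the positions a..b form an empty range
    assert n not in number_list
    return -1

data = [1, 4, 4, 9, 16, 25]        # assumed sorted sample data
results = []
for n in range(0, 30):
    results.append(checked_binary_search(data, n))
print(results[4], results[9], results[5])   # 2 3 -1
```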

Two Simple Problems


Loop Invariant implies Correctness

I An answer 'number found at position x' must be correct since
it only occurs when n is indeed found in the middle of the
sequence
I An answer 'number not found' must be correct for the
following reason:
I The program outputs this answer only when a > b after the
loop
I This condition must have been true after the last iteration.
I The loop invariant then implies that after the last iteration n
must be in an empty subsequence, i.e., n is not in numberList


Two Simple Problems


Algorithmic Complexity

I Let us refer to the search done for the Number Search
Problem as linear search
I Recall why we introduced binary search instead of linear
search (Why?)
I because we can search faster in a sorted list
I We can quantify the difference in performance of the two
algorithms


Two Simple Problems


Complexity of Linear Search

I Recall the linear search algorithm
I There is at most one iteration per number in the sequence
I Thus in the worst case linear search takes time roughly
proportional to the number of elements in the input sequence
I We say that linear search has linear complexity.


Two Simple Problems


Linear Search: Best Case

Question
What is the complexity of linear search in the best case?

Answer
I In the best case the element is found at the first position of
the sequence
I Thus the complexity in the best case is independent of the
length of the input sequence
I We talk about constant complexity


Two Simple Problems


Complexity of Binary Search

I To analyze the complexity of binary search, note the following:
I In one iteration we divide the length of the sequence that
needs to be checked by a factor of at least 2
I So how many iterations will be required?
I Discussion.


Two Simple Problems


Complexity of Binary Search

I Suppose we start out with a sequence of n numbers
I After one iteration we are left with at most n/2 numbers
I After i iterations we are left with how many numbers?
I At most n/2^i numbers


Two Simple Problems


Logarithmic Complexity

I If the input sequence has 2^k numbers, then after at most k
iterations we are left with at most one number
I What is the relation between k and 2^k?
I k = log2(2^k)
I Thus, binary search takes time roughly logarithmic in
the length of the input sequence
I Binary search has logarithmic complexity
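
The logarithmic growth can be measured by counting iterations. The sketch below makes several assumptions: the helper name iterations_needed is hypothetical, the sorted input is modelled by range(length) (which can be indexed without storing all numbers), and the searched value is larger than every element, so the search always continues in the right half.

```python
import math

def iterations_needed(length):
    """Count loop iterations when searching for a value larger than all elements."""
    numbers = range(length)      # sorted sequence 0, 1, ..., length-1
    n = length                   # larger than every element, so never found
    a, b = 0, length - 1
    iterations = 0
    while a <= b:
        iterations += 1
        m = (a + b) // 2
        if n == numbers[m]:
            break
        elif n < numbers[m]:
            b = m - 1
        else:
            a = m + 1
    return iterations

for k in [1, 10, 20, 30]:
    length = 2 ** k
    print(length, iterations_needed(length), int(math.log2(length)))
# 2 2 1
# 1024 11 10
# 1048576 21 20
# 1073741824 31 30
```

In this scenario the count is k + 1 for a sequence of length 2^k: k iterations reduce the interval to one element, and one more iteration inspects that element.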


Two Simple Problems


Logarithmic Complexity vs [4] Linear Complexity

Example

n=2           log2 n = 1
n=1024        log2 n = 10
n=1048576     log2 n = 20
n=1073741824  log2 n = 30

[4] "vs" stands for "versus" or "compared to"
Towards Functions Functions in Python

Programming Fundamentals 1
Lesson 3

Pierre Kelsen

CSC Research Unit, MNO

2017


Outline

Towards Functions

Functions in Python


Towards Functions
Code Reuse

I In lesson 2 we saw a program for the Number Search Problem
I Suppose we want to write another program (let us call it
Program 2) that needs to solve this problem multiple times
I This is an example scenario for code reuse
I We could simply copy and paste the code for the number
search problem into those places where we need to solve this
problem


Towards Functions
Structure of Program 2

I Program 2 would look like this:

# Program 2

# some statements
# code for Number Search

# some more statements
# code for Number Search

# even more statements
# code for Number Search

# yet more statements



Towards Functions
Problems with Copy and Paste

Question
What are drawbacks of this approach to code reuse?

Answer
Any modification to the method used for number search will entail
changes in multiple places

Discussion
Why is this really a problem? Imagine real world scenarios.


Towards Functions
Programming Advice

Rule
Say each thing only once in one place!

Discussion
Why does the ”copy and paste” approach to code reuse violate
this rule?


Towards Functions
How to Reuse

Question
So how can we reuse code without reproducing it?

Answer
By packaging the code as a function.

7 / 55
Towards Functions Functions in Python

Towards Functions
Functions in Mathematics

Definition (function)1
In mathematics, a function is a relation between a set of inputs
and a set of permissible outputs with the property that each input
is related to exactly one output.

1
https://en.wikipedia.org/wiki/Function_(mathematics)
8 / 55
Towards Functions Functions in Python

Towards Functions
From Programs to Functions

I Suppose that a program solves a given problem


I We can then view the program as realizing a (mathematical)
function

Question
What is missing for a program to be a full-fledged function?

Answer
We need to clearly identify inputs and outputs.

9 / 55
Towards Functions Functions in Python

Functions in Python
Example of a Function in Python

I The following Python code computes the minimum of two
numbers x and y

def myMin(x, y):   # myMin is the name of the function (see footnote 2)
    # x and y are the "parameters" of the function
    # they constitute the "input"
    if x < y:
        return x
    else:
        return y
    # output is either x or y

2
There is a built-in function min, hence we choose a different name
10 / 55
Towards Functions Functions in Python

Functions in Python
General Form

I A Python function definition looks as follows:


def <name>(<parameters>):
    <body>
I <name> stands for the name of the function
I <parameters> stands for a comma-separated list of names
that constitute the input to the function
I <body> stands for a sequence of (indented) Python
statements
I In the previous example: <name> = myMin, <parameters> =
x, y and <body> is composed of an if-else statement

11 / 55
Towards Functions Functions in Python

Functions in Python
Function Call

I To call a function, we use the syntax:


<name>(<arguments>)
where < name > is the name of the function and
< arguments > is a comma-separated list of arguments
I Arguments are expressions (that have a value)

Example
myMin(3, 4) calls function myMin with arguments 3 and 4
Parameter x will be bound to 3, and parameter y will be bound to 4

12 / 55
Towards Functions Functions in Python

Functions in Python
Return value

I Function calls are expressions


I Like all expressions, function calls have a value
I The value is that returned by the function
I So what is the value of myMin(3,4)?
I To obtain the value, execute the body with x=3 and y=4
I The value 3 is returned since 3 < 4
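Because a function call is an expression, it can appear wherever a value is expected. The following minimal sketch repeats the definition of myMin from the earlier slide; the variable names smallest and total are chosen for this illustration only.

```python
def myMin(x, y):
    if x < y:
        return x
    else:
        return y

smallest = myMin(3, 4)               # the call has the value 3
total = myMin(3, 4) + myMin(10, 7)   # calls used inside a larger expression
print(smallest, total)
```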

13 / 55
Towards Functions Functions in Python

Functions in Python
Function Call Execution

I To execute a function call:


I the expressions of the arguments are evaluated, and the
parameters are bound to these values
I the point of execution moves from the point of the function
call to the first statement in the body
I the statements in the body are executed until a return is
encountered, or there are no more statements to execute
I in the first case the value of the function call is the value of
the expression following the return
I in the second case the value is None
I the point of execution is transferred back to the place of the
call where it continues executing the current statement

14 / 55
Towards Functions Functions in Python

Functions in Python
Positional Parameter Binding

I In the above examples parameters were bound to arguments


via their position, i.e., 1st parameter → 1st argument, 2nd
parameter → 2nd argument,...
I Python also supports keyword arguments: these arguments
are bound to the parameters via the parameter name
I A keyword argument must not (i.e., is not allowed to) be
followed by a non-keyword argument.

Example
The call myMin(3,4) can be replaced by myMin(x=3,y=4) or even
myMin(y=4,x=3). These calls are all equivalent.
The call myMin(x=3, 4) is not allowed (see last item above).
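The equivalent calls can be checked directly. This is a small sketch that reuses myMin from the earlier slide; the variable names are illustrative.

```python
def myMin(x, y):
    if x < y:
        return x
    else:
        return y

byPosition = myMin(3, 4)
byKeyword = myMin(x=3, y=4)
reordered = myMin(y=4, x=3)
mixed = myMin(3, y=4)   # positional argument first, keyword argument afterwards
print(byPosition, byKeyword, reordered, mixed)

# myMin(x=3, 4) is rejected with a SyntaxError before the program even runs,
# which is why it appears here only as a comment.
```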

15 / 55
Towards Functions Functions in Python

Functions in Python
Optional Parameters

I Parameters may supply a default value using the syntax:


<param> = <defaultValue>
I A corresponding argument is optional: if it is not given, the
parameter binds to the default value

Example
In the function definition
def sortList(numberList, ascending = True):
the second parameter has a default value.
The call sortList(list) is equivalent to sortList(list,
True)

16 / 55
Towards Functions Functions in Python

Functions in Python
Optional Parameters - Continued

I The following restriction exists for the definition of default


values:
I All parameters with default values must follow all parameters
without default values.
I Keyword arguments are commonly used in conjunction with
default parameter values.
I In general use of keyword arguments may improve the
readability of the code
I E.g., the call sortList(l, ascending=False) documents
the purpose of the second parameter.
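The slides only show the header of sortList. The sketch below supplies a hypothetical body (based on the built-in sorted function) so that the effect of the default value and of the keyword argument can be tried out.

```python
def sortList(numberList, ascending=True):
    # Hypothetical body: the original slides do not show one
    return sorted(numberList, reverse=not ascending)

values = [3, 1, 2]
up = sortList(values)                      # default value True is used
alsoUp = sortList(values, True)            # same result, argument given explicitly
down = sortList(values, ascending=False)   # keyword argument documents its purpose
print(up, alsoUp, down)
```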

17 / 55
Towards Functions Functions in Python

Functions in Python
To Return or Not to Return

I The functions above all contain return statements followed by


an expression
I A function may also contain no return statement at all, or a
return statement without an expression

Example
def meaningOfLife():
    print('Not sure what it is')
is a valid function definition

18 / 55
Towards Functions Functions in Python

Functions in Python
None as a Return Value

Question
In the cases outlined in the second item on the previous slide
(including the example), does the function have a return value?

Answer
Yes, it does! In those cases the function returns None.

19 / 55
Towards Functions Functions in Python

Functions in Python
None

I None is a special value in Python


I Since all values are objects, it is an object.
I A function that contains no return statement (or a return not
followed by an expression) returns None

Question
Recall the function
def meaningOfLife():
    print('Not sure what it is')
print(meaningOfLife()==None)
What is the output? → DEMO

20 / 55
Towards Functions Functions in Python

Functions in Python
What is None? - continued

Question
What is the type of None? How can you find this out?
→ DEMO
Answer
The type of None is NoneType. None is the only value of
NoneType.
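One way to check these statements is sketched below, using the meaningOfLife function from the previous slides. The variable names are illustrative.

```python
def meaningOfLife():
    print('Not sure what it is')

result = meaningOfLife()         # prints the message, then returns None
isNone = result is None          # 'is None' is the usual idiom; '== None' also gives True here
typeName = type(None).__name__   # name of the type of None
print(isNone, typeName)
```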

21 / 55
Towards Functions Functions in Python

Functions in Python
Local Variable

I A local variable in a function is a variable on the left side of


an assignment statement within a function

Example
In the following function, y is a local variable
def f(x):
    print(x)
    y=1
    print(x+y)

22 / 55
Towards Functions Functions in Python

Functions in Python
Global versus Local Variables

Example
Consider the following program:
def f(x):
    y=1
    print(y)
y=3
f(y)
print(y)

Question
What is the output?

Answer
1
3
23 / 55
Towards Functions Functions in Python

Functions in Python
Global versus Local Variables - Continued

I The assignment y = 3, being outside any function, defines a


global variable
I The assignment y = 1, being inside function f, defines a local
variable
I These are two different variables

24 / 55
Towards Functions Functions in Python

Functions in Python
Lifetime of Variables
I Global variables exist as long as the program has not
terminated
I Local variables are created every time the function is executed
and removed when the function ends
I When the example program executes we have the following
situation

25 / 55
Towards Functions Functions in Python

Functions in Python
Meaning of Variable Names

I Suppose a statement is executed that is referring to y.


I How do we know which variable we are referring to?
I Answer: inside function f (i.e., during the lifetime of the local
variable) any reference to y refers to the local variable; everywhere
else it refers to the global variable.
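A minimal sketch of both points (lifetime and name lookup) follows. The function body differs slightly from the slide example so that the results can be inspected; the names first and second are illustrative.

```python
y = 3                  # global variable

def f(x):
    y = x + 1          # local variable, created anew for every call
    return y

first = f(10)
second = f(20)
# The assignments inside f never touched the global y
print(first, second, y)
```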

26 / 55
Towards Functions Functions in Python

Functions in Python
Referencing Global Variables

I Discuss the following program


def f():
    print(y)
y=3
f()
print(y)

I There is a single global variable y.


I The global variable is referenced inside the function.
I Thus the output is:
3
3

27 / 55
Towards Functions Functions in Python

Functions in Python
Referencing Global Variables - Continued

Question
Is it a good idea to access global variables from inside a function?

Answer
No, it’s not a good idea because ...
I ... it makes programs more difficult to read
I ... it makes functions more difficult to reuse

28 / 55
Towards Functions Functions in Python

Functions in Python
Programming Advice

Advice
It is generally not a good idea to reference global variables from a
function. If the function needs to access information at the global
level, this information should be passed via parameters.

29 / 55
Towards Functions Functions in Python

Functions in Python
Applying the Advice

I So let us apply the advice to the previous program

# BEFORE
def f():
    print(y)
y=3
f()
print(y)

# AFTER
def f(z):
    print(z)
y=3
f(y)
print(y)

30 / 55
Towards Functions Functions in Python

Functions in Python
Alternative

I Note that we could have done this:

# BEFORE
def f():
    print(y)
y=3
f()
print(y)

# AFTER
def f(y):
    print(y)
y=3
f(y)
print(y)

I Would the final program behave in the same way?


I Yes, it would as parameters are treated as local variables:
during the function execution y refers to the parameter.

31 / 55
Towards Functions Functions in Python

Functions in Python
Programming Advice

Advice
It is recommended to give parameters names different from those
of global variables to avoid confusion and enhance readability.

32 / 55
Towards Functions Functions in Python

Functions in Python
A Simple Function

I Consider the following function:

def sumOfSquares(n):
    """ Assumes n is an integer with n>=0
    Returns the sum of squares of numbers from 0 to n """
    s = 0
    for i in range(n+1):
        s += i * i
    return s

Question
What do you notice about this function?

33 / 55
Towards Functions Functions in Python

Functions in Python
Docstrings
I The function sumOfSquares contains a special comment
(multi-line!) enclosed by triple quotes (watch the indent!)
""" Assumes n is an integer with n>=0
Returns the sum of squares of numbers from 0 to n """

I Such a comment is called a docstring in Python

Question
What is the purpose of this particular comment?
Answer
To define the contract between the user of the function and the
implementer (programmer). We also talk about the specification of
a function
34 / 55
Towards Functions Functions in Python

Functions in Python
Functions and Contracts

I With each Function we can associate a contract/specification


consisting of two parts:
I Assumptions (also called preconditions): conditions that must
be met by users or clients of the function
I Guarantees (also called postconditions) that must be met by
the function.
I Provided that the assumptions are satisfied, the guarantees
are required to hold

Discussion
Can you think of analogies in the real world?

35 / 55
Towards Functions Functions in Python

Functions in Python
Assumptions

I Assumptions typically define constraints on the parameters of


the function
I the type of the parameters
I other (boolean) conditions on the parameters

Question
What happens when an assumption is violated?

Answer
Errors or other unpredictable behavior may occur.

36 / 55
Towards Functions Functions in Python

Functions in Python
Violating an Assumption

I Let us explore what happens in the example when the


assumption is violated → DEMO

def sumOfSquares(n):
    """ Assumes n is an integer with n>=0
    Returns the sum of squares of numbers from 0 to n """
    s = 0
    for i in range(n+1):
        s += i*i
    return s

Question
Can you explain the observed behavior?
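The slides leave the outcome to the demo. One possible exploration is sketched below; the two inputs (-3 and 2.5) are chosen for this illustration and are not taken from the original demo.

```python
def sumOfSquares(n):
    """ Assumes n is an integer with n>=0
    Returns the sum of squares of numbers from 0 to n """
    s = 0
    for i in range(n+1):
        s += i*i
    return s

# Violation 1: negative integer. range(-2) is empty, so the loop never runs.
negativeCase = sumOfSquares(-3)

# Violation 2: non-integer. range() refuses a float argument.
try:
    sumOfSquares(2.5)
    floatRaised = False
except TypeError:
    floatRaised = True

print(negativeCase, floatRaised)
```

In the first case the function silently returns a value that the contract says nothing about; in the second case an error occurs. Both are consistent with the answer on the previous slide.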

37 / 55
Towards Functions Functions in Python

Functions in Python
Benefits of contracts

Question
Who benefits from the use of contracts?

Answer
I The programmer because he knows what he must implement.
I The client of the function because ideally the contract is
enough to understand what the function does

38 / 55
Towards Functions Functions in Python

Functions in Python
Information Hiding

I The use of contracts is an example of information hiding


I The client does not need to read the detailed implementation
to understand what the function does

Question
Why is this use of information hiding useful?

Answer
It shields the client from the complexity of the implementation.

39 / 55
Towards Functions Functions in Python

Functions in Python
Viewing docstrings

I Python presents a convenient way to view docstrings


I Use the call help(sumOfSquares) to view the contents of the
docstring
I Typing "sumOfSquares(" in the shell or editor will display the list of
parameters and the first few lines of the docstring
I DEMO
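Besides help, the docstring is also stored in the __doc__ attribute of the function. The following small sketch prints it directly; calling help(sumOfSquares) in the shell shows the same text together with the function header.

```python
def sumOfSquares(n):
    """ Assumes n is an integer with n>=0
    Returns the sum of squares of numbers from 0 to n """
    s = 0
    for i in range(n+1):
        s += i*i
    return s

doc = sumOfSquares.__doc__   # the docstring as an ordinary string
print(doc)
```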

40 / 55
Towards Functions Functions in Python

Modules
Software Complexity

I We have presented functions as a way to facilitate reuse by


encapsulating commonly used functionalities
I We can also view it as being helpful to tame (or control) the
complexity of a program
I The structural complexity of a program is also known as
software complexity.

Discussion
Why is software complexity a problem?

41 / 55
Towards Functions Functions in Python

Modules
Definition of Modules

I All the programs we have seen so far are contained in a single


Python file (with extension .py)
I When programs get large, it is natural to try to divide them
up into different files to reduce software complexity
I Each such file is called a module
I A module contains a collection of Python definitions and
statements

42 / 55
Towards Functions Functions in Python

Modules
Benefits of Modules

I As mentioned above modules help to reduce the software


complexity of a program
I Other uses are:
I for software development: dividing up programming tasks
I for software testing: allows different functionalities to be tested
separately
I for software maintenance: facilitates modification of specific
program functionalities

43 / 55
Towards Functions Functions in Python

Modules
Module Example
I Example of a module circle.py
pi = 3.14159

def area(radius):
    return pi*(radius**2)

def circumference(radius):
    return 2*pi*radius

def sphereSurface(radius):
    return 4.0*area(radius)

def sphereVolume(radius):
    return (4.0/3.0)*pi*(radius**3)
44 / 55
Towards Functions Functions in Python

Modules
Modules and Namespaces

I Each module provides its own namespace


I To use a module, you can import it as follows:
import <moduleName>
I The namespace provides a context for the names in a module
I Two different modules can be imported having functions with
the same name
I These functions are differentiated using dot notation

45 / 55
Towards Functions Functions in Python

Modules
Modules and Namespaces - Example

Example
I Suppose we have two implementations of a circle module
called circle1 and circle2 with the same function and variable
names
I We could then test values of pi constants as follows:

import circle1, circle2

...
if circle1.pi == circle2.pi:
    print('Values of pi are the same')
else:
    print('Values of pi are different')
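The modules circle1 and circle2 are hypothetical. The same idea can be tried with two modules of the Python Standard Library, math and cmath, which both define a function named sqrt. This is a sketch for illustration, not part of the original example.

```python
import math, cmath

# Same function name, two namespaces: dot notation tells them apart
realRoot = math.sqrt(4)        # square root on real numbers
complexRoot = cmath.sqrt(-1)   # square root on complex numbers
print(realRoot, complexRoot)
```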

46 / 55
Towards Functions Functions in Python

Modules
Other Forms of Import

Example
I We can also import using the following syntax:
from <moduleName> import <something>
I Here <something> can either be
I a list of identifiers, e.g., func1, func2
I a single renamed identifier, e.g.,
from <moduleName> import f1 as func
I an asterisk, as in: from circle import *
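The first two forms are sketched below with the math module of the Python Standard Library (the choice of module and of the new name fact is for illustration only). The asterisk form is only mentioned in a comment.

```python
from math import sqrt, floor          # a list of identifiers
from math import factorial as fact    # a single renamed identifier
# from math import *                  # asterisk form: imports all public names

root = sqrt(16)        # no dot notation needed
rounded = floor(2.7)
product = fact(5)
print(root, rounded, product)
```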

47 / 55
Towards Functions Functions in Python

Modules
Other Forms of Import (2)

I There is a fundamental difference between this second version


of import and the earlier one:
I For this version the namespace of the imported module
becomes part of the importing module
I Identifiers can thus be used without the dot notation

Caveat3
If there is a name clash between an imported identifier and an
identifier defined in the importing module, one of them will be masked
(or hidden): the name refers to whichever was bound last.
The form with the asterisk makes name clashes more likely.

3
”caveat” is a synonym for ”warning”
48 / 55
Towards Functions Functions in Python

Modules
Programming Advice

Advice
With the from-import mechanism, name clashes become
possible. Because of this, import <moduleName> is the preferred
form of import in Python.

49 / 55
Towards Functions Functions in Python

Modules
Main Module

I When running a Python program file, this module becomes


the main module
I The current directory is the directory containing the main
module
I The namespace for the main module is the global namespace
I The namespace is reset every time the interpreter is started

50 / 55
Towards Functions Functions in Python

Modules
Viewing the Namespace

I We can view the items in the namespace by executing the


dir() function
I DEMO: test this after restarting the shell
I DEMO: redo this test after running an external file
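A minimal sketch of what the demo might look like follows. The name counter is made up for this illustration.

```python
import math

counter = 0              # a name we define ourselves

namesHere = dir()        # names in the current namespace
mathNames = dir(math)    # names in the namespace of module math

print('counter' in namesHere, 'math' in namesHere, 'sqrt' in mathNames)
```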

51 / 55
Towards Functions Functions in Python

Modules
Loading Modules

I Each imported module needs to be located and loaded


I Python first searches for modules in the current directory
I It then searches the directories listed in the sys.path variable
I To view this path either execute the following code ...

import sys
sys.path

I ... or check the ”Path browser” in the File menu of Idle


I DEMO

52 / 55
Towards Functions Functions in Python

Modules
Information Hiding with Modules

I All identifiers in an imported Python module are public


I Sometimes one wants to restrict access to an item (Why?)
I Python only provides a way to give a "hint" for this
I if a name starts with two underscores (__), it is intended to be
private
I private entities should not be accessed
I if a module is imported using
from <moduleName> import *, then private identifiers are
not imported

53 / 55
Towards Functions Functions in Python

Modules
Three Namespaces

I When a program executes, there are up to three namespaces
active
I Global namespace = namespace of currently executing module
I built-in namespace = namespace of builtin functions and
constants
I local namespace: namespace of currently executing function
I An identifier in a namespace can mask an identifier in another
namespace

54 / 55
Towards Functions Functions in Python

Modules
Masking of Identifiers

Example
I When a function f executes with a local variable y, and a
variable with that name exists in the global namespace, the
local variable masks the global one (see slide 25)
I When a module is imported using
from <moduleName> import * and that module defines a
function of the same name as a builtin function, the imported
name masks the builtin function.

As the previous example shows, importing with
from <moduleName> import * should be avoided (as stated
earlier)
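Masking between the local and the built-in namespace can be observed in the same way. In the sketch below (the function name demo is made up), a local variable named max hides the built-in function max, but only inside the function.

```python
def demo():
    max = 10      # local name masks the built-in max inside demo only
    return max

inside = demo()
outside = max(1, 2)   # outside demo, max is still the built-in function
print(inside, outside)
```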

55 / 55
Problem Solving Recursion

Programming Fundamentals 1
Lesson 4

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 58
Problem Solving Recursion

Outline

Problem Solving

Recursion

2 / 58
Problem Solving Recursion

Problem Solving
Reducing Complexity

I We have already seen a mechanism in Python to reduce
software complexity
I What is it?
I We can break down a program into functions and modules,
which result in smaller pieces that can be worked on
independently
I So how is this done when dealing with a concrete problem?

3 / 58
Problem Solving Recursion

Problem Solving
Top-Down Approach

I One standard problem solving technique is the top-down


method
I It consists of breaking the original problem into subproblems,
then breaking those subproblems into subsubproblems, until the
problems we end up with are easy to solve (e.g., by a function)
I How far this decomposition goes depends on each problem

4 / 58
Problem Solving Recursion

Problem Solving
Top-Down Approach

I For a simple problem such as the number search problem, a


one level decomposition will be sufficient

Example
I The Number Search Problem can be decomposed as follows:
I Subproblem 1: obtain input
I Subproblem 2: compute result
I Subproblem 3: output result
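The decomposition can be sketched with one function per subproblem. Everything in this sketch is an assumption made for illustration: the function names, the reading of the Number Search Problem as "does a given number occur in a list?", and the use of strings instead of interactive input so that the code can run unattended.

```python
def obtainInput(rawNumbers, rawTarget):
    # Subproblem 1: turn the raw text into a list of integers and a target
    numbers = [int(part) for part in rawNumbers.split()]
    return numbers, int(rawTarget)

def computeResult(numbers, target):
    # Subproblem 2: decide whether the target occurs in the list
    return target in numbers

def outputResult(found):
    # Subproblem 3: report the result
    message = 'found' if found else 'not found'
    print(message)
    return message

def numberSearch(rawNumbers, rawTarget):
    # The original problem, expressed in terms of its three subproblems
    numbers, target = obtainInput(rawNumbers, rawTarget)
    found = computeResult(numbers, target)
    return outputResult(found)

report = numberSearch('4 8 15 16', '15')
```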

5 / 58
Problem Solving Recursion

Problem Solving
Top-Down Approach - Example

I The following is a graphical representation of the top-down


decomposition of the number search problem

I Each subproblem can now be solved using a function or


directly with code (as we did in lesson 2 by refining
pseudocode)

6 / 58
Problem Solving Recursion

Problem Solving
Bottom-up Approach

I Besides the top-down approach there is another approach


called bottom-up approach
I It basically consists in working from what’s already known to
build solutions to bigger problems
I This is in line with the well-known adage: do not reinvent the
wheel.

7 / 58
Problem Solving Recursion

Problem Solving
Bottom-up Approach

I To use the bottom-up approach with Python, we need to


check what’s already known
I A good place to start is the Python Standard Library
I DEMO

8 / 58
Problem Solving Recursion

Problem Solving
Top-down or Bottom-up Approach

Question
So which one should we use – top-down or bottom-up?

Answer
In practice both approaches are often used together: one
decomposes the problem down to some level and then uses existing
solutions (e.g., existing functions) for solving some subproblems

9 / 58
Problem Solving Recursion

Problem Solving
Bottom-up Approach

Question
Can you think of examples of top-down and bottom-up approaches
for solving problems in the real world?

10 / 58
Problem Solving Recursion

Problem Solving
An Interesting Special Case

I What if we can reduce the problem to subproblems of the


same type?
I E.g., we can reduce the problem of sorting a sequence to the
problem of sorting subsequences
I In those cases we use a technique called recursion to solve the
original problem
I This technique is closely related to the divide-and-conquer approach

11 / 58
Problem Solving Recursion

Recursion
Recursion in Real Life

1 2

1
https://upload.wikimedia.org/wikipedia/en/c/c1/Vache_qui_rit.png
2
https://en.wikipedia.org/wiki/File:Russian-Matroshka.jpg
12 / 58
Problem Solving Recursion

Recursion
Factorial Function

Definition (from Wikipedia)3


In mathematics, the factorial of a non-negative integer n, denoted
by n!, is the product of all positive integers less than or equal to n.
For example,

5! = 5 × 4 × 3 × 2 × 1 = 120

Question
What is the practical significance of the factorial function?

Answer
n! represents the number of orderings (also called ”permutations”)
of n elements
3
https://en.wikipedia.org/wiki/Factorial
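The claim in the answer above can be checked for a small case. The sketch below uses itertools.permutations from the Python Standard Library to list all orderings of three elements; the element names are arbitrary.

```python
import math
from itertools import permutations

orderings = list(permutations(['a', 'b', 'c']))   # all orderings of 3 elements
count = len(orderings)
print(count, math.factorial(3))
```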
13 / 58
Problem Solving Recursion

Recursion
Factorial Function in Python

I The factorial function is available in the math module of the


Python Standard Library
I Example usage:
import math
math.factorial(100)
I DEMO

14 / 58
Problem Solving Recursion

Recursion
Factorial Function

Recursive Definition of Factorial Function

1! = 1
(n + 1)! = (n + 1) · n!

I In functional notation:

f(1) = 1
f(n + 1) = (n + 1) · f(n)

15 / 58
Problem Solving Recursion

Recursion
Factorial Function

I The first part f (1) = 1 is called the base case


I The second part f(n + 1) = (n + 1) · f(n) is called the
recursive (or inductive) case
I Thus we can express the problem of computing n! in terms of
a subproblem of the same type but of smaller size

Question
Why do we need the base case?

Answer
To avoid infinite recursive descent

16 / 58
Problem Solving Recursion

Recursion
Recursive Program for Factorial

I The above recursive definition leads to the following Python


program
def f(n):
    if n == 1:
        return 1
    else:
        return n*f(n-1)

I DEMO: compute 100!


I What’s missing?

17 / 58
Problem Solving Recursion

Recursion
Recursive Program for Factorial

I Let us complete the program for factorial by adding the


contract/specification:
def f(n):
    """ Assumes that n is an integer >0
    Returns n! """
    if n == 1:
        return 1
    else:
        return n*f(n-1)

18 / 58
Problem Solving Recursion

Recursion
Functions calling Functions

I The function implementing factorial has the property that it


calls a function, in fact itself
I This is the first example of a function call being done within a
function
I We can represent the function calls in a call chain

Example
I The call chain for f(4)
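The call chain can also be made visible in text form. The sketch below adds a hypothetical extra parameter depth (not part of the original function) that is used only to indent a trace line for each call.

```python
def f(n, depth=0):
    """ Assumes that n is an integer >0
    Returns n! (depth is only used for tracing) """
    print('  ' * depth + 'f(' + str(n) + ')')   # one line per call in the chain
    if n == 1:
        return 1
    else:
        return n * f(n - 1, depth + 1)

result = f(4)   # trace shows f(4), f(3), f(2), f(1), each indented one level more
print(result)
```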

19 / 58
Problem Solving Recursion

Recursion
Activation Records
I Each function call is associated with an activation record
I An activation record contains the names and values of all
parameters and local variables; it also contains the place from
where the function was called (omitted in figure below)
I The data in the activation record is required for the function
call to be correctly executed (Why?)

Example
I Call chain with activation records

20 / 58
Problem Solving Recursion

Recursion
Managing the Activation Records

I We note that there is a one-to-one correspondence between


function calls and activation records
I When a function call is executed, an activation record is
created for this function call (which allows this function call to
execute correctly)
I When a function call terminates, we can remove its activation
record since it is no longer needed
I Note that the activation record created last will be removed
first → LIFO strategy: last in first out (Why?)

21 / 58
Problem Solving Recursion

Recursion
Visualization of Nested Calls

22 / 58
Problem Solving Recursion

Recursion
Stack

I There exists a data structure that manages data in a LIFO
manner
I it is called a stack
I stacks offer the following operations:
I push for pushing a new data item onto the stack
I pop for removing the data item added last (and not removed
yet), which we call the top item
I top returns the top item of the stack
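
As a small illustration, a Python list can play the role of a stack. This is a sketch under the assumption that the end of the list is treated as the top; the strings pushed here are arbitrary placeholders.

```python
stack = []               # an empty stack

stack.append("f(3)")     # push
stack.append("f(2)")     # push
stack.append("f(1)")     # push

top = stack[-1]          # top: look at the item added last without removing it
removed = stack.pop()    # pop: remove and return the item added last

print(top, removed, stack)
```

The item pushed last is the one that is popped first, which is exactly the LIFO behaviour needed for activation records.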

23 / 58
Problem Solving Recursion

Recursion
Nested Calls with Stack

Circled numbers indicate order in which stack operations are called. Each
push/pop corresponds to the function call above it

24 / 58
Problem Solving Recursion

Recursion
Common Uses of Stacks

[Two images, not reproduced here; sources:]
4: https://pixabay.com/p-2630076/?no redirect
5: https://static.pexels.com/photos/9415/food-lunch-kitchen-eat.jpg
25 / 58
Problem Solving Recursion

Recursion
Recursion Depth

I Let us try to run the recursive implementation of factorial for
large numbers
I DEMO
I We get an exception called RecursionError stating that the
maximum recursion depth is exceeded
I The maximum recursion depth is a limit on the size of the call
stack (i.e., on the maximum nesting level of calls)
I Two options:
I Increase the maximum recursion depth → DEMO
I Develop an iterative program for the problem → next slides
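
The first option can be tried with the sys module. The sketch below assumes CPython, where the default limit is commonly 1000; the amounts added to the limit are arbitrary choices for illustration.

```python
import sys

def f(n):
    """ Assumes that n is an integer >0
        Returns n! """
    if n == 1:
        return 1
    else:
        return n * f(n - 1)

old_limit = sys.getrecursionlimit()
print("current limit:", old_limit)

try:
    f(old_limit + 100)               # needs more nested calls than allowed
    too_deep = False
except RecursionError:
    too_deep = True
print("RecursionError raised:", too_deep)

sys.setrecursionlimit(old_limit + 1000)   # raise the limit
value = f(old_limit + 100)                # the same call now succeeds
sys.setrecursionlimit(old_limit)          # restore the previous limit
print("result has", value.bit_length(), "bits")
```

Raising the limit only postpones the problem: each nested call still needs its own activation record, so very large limits can exhaust memory.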

26 / 58
Problem Solving Recursion

Recursion
Iterative Solution

I We can view the recursive program as a top-down solution for
the factorial function
I It is straightforward to develop a bottom-up solution to this
problem
I Try this yourself!
I Here is a possible implementation:

def f(n):
    """ Assumes that n is an integer >0
        Returns n! """
    result = 1
    for i in range(2, n+1): # from 2 to n
        result *= i
    return result

27 / 58
Problem Solving Recursion

Recursion
Recursive vs Iterative Solution

I Can you compare the recursive vs iterative solution?
("vs" stands for "versus", meaning "compared to")

I The recursive solution is more elegant but
I it has a bit of overhead for managing the stack (so it may be a
bit slower)
I it does not work for large numbers because of the limit on
recursion depth
I it is also wasteful in memory (why?)
28 / 58
Problem Solving Recursion

Recursion
Fibonacci Numbers

Question
Who has heard about Fibonacci numbers?

Definition (source: https://en.wikipedia.org/wiki/Fibonacci_number)
In mathematics, the Fibonacci numbers are the numbers in the
following integer sequence, called the Fibonacci sequence, and are
characterized by the fact that every number after the first two is
the sum of the two preceding ones:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...
29 / 58
Problem Solving Recursion

Recursion
Fibonacci Numbers: Recursive Definition

I We can define the Fibonacci numbers recursively as follows:

F_1 = 1
F_2 = 1
F_n = F_{n-1} + F_{n-2} for n > 2

I This straightforward recursive definition leads to a
straightforward recursive implementation (next slide)

30 / 58
Problem Solving Recursion

Recursion
Fibonacci Numbers: Recursive Implementation

def fib(n):
    """ Assumes that n is an integer >0
        Returns nth Fibonacci number """
    if n == 1:
        return 1
    elif n == 2:
        return 1
    else:
        return fib(n-1) + fib(n-2)

31 / 58
Problem Solving Recursion

Recursion
Testing the Recursive Program

I Let us try to run the Fibonacci program for a few inputs
I DEMO
I We notice that for even moderately large numbers the
program becomes really slow
I Do you have an explanation for this?

32 / 58
Problem Solving Recursion

Recursion
From Call Chain to Call Tree

I Recall that the call to the factorial function resulted in a call
chain

Question
Can the calls resulting from executing fib(n) still be represented by
a call chain?

Answer
No, calling fib(n) will result in a call tree.

33 / 58
Problem Solving Recursion

Recursion
Call Tree for fib(5)

Question
Complexity of fib(n)?

34 / 58
Problem Solving Recursion

Recursion
Complexity

I We notice in the previous call tree that the same subproblem
(e.g., fib(2)) is solved multiple times
I This is the reason why fib(n) becomes really slow for larger n
I its running time is actually exponential in n
I This is also a reason to look for an iterative solution (next
slide)
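
The repeated work can be measured by counting calls. In this sketch the dictionary calls is a hypothetical helper added for illustration; the function itself is the recursive fib from the earlier slide.

```python
calls = {}   # hypothetical helper: maps n to the number of times fib(n) is called

def fib(n):
    """ Assumes that n is an integer >0
        Returns nth Fibonacci number """
    calls[n] = calls.get(n, 0) + 1
    if n == 1:
        return 1
    elif n == 2:
        return 1
    else:
        return fib(n-1) + fib(n-2)

for n in (5, 10, 20):
    calls.clear()
    value = fib(n)
    # n, fib(n), total number of calls, number of calls to fib(2)
    print(n, value, sum(calls.values()), calls.get(2, 0))
```

For n = 5 there are 9 calls in total, 3 of them to fib(2); for n = 20 there are already 13529 calls.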

35 / 58
Problem Solving Recursion

Recursion
Iterative Solution

I Iterative bottom-up solution to computing fib(n)

def fib(n):
    """ Assumes that n is an integer >0
        Returns nth Fibonacci number """
    if n == 1:
        return 1
    elif n == 2:
        return 1
    else:
        last = 1
        secondToLast = 1
        for i in range(3, n+1): # i.e., from 3 to n
            last, secondToLast = last + secondToLast, last
        return last

36 / 58
Problem Solving Recursion

Recursion
Complexity of Iterative Solution
I What is the complexity of the iterative solution?
I The time taken is roughly proportional to n
I We say that the complexity is roughly a linear function of n
I Complexity of recursive solution is roughly exponential in n
I Thus, for Fibonacci numbers the iterative solution is much
preferred!

37 / 58
Problem Solving Recursion

Recursion
Another Problem

I We have seen recursive solutions to
I computing the factorial function
I computing Fibonacci numbers
I In both cases the recursive implementation followed rather
directly from the recursive definition of the problem
I We now consider a problem that is not defined recursively

38 / 58
Problem Solving Recursion

Recursion
Problem Definition

I Input: a configuration with three pegs, the leftmost peg
having a stack of disks sorted in descending size from bottom
to top (see figure)
I Output: a sequence of moves of disks from one peg to another
without ever putting a larger disk on a smaller one so that we
end up with the disks in the same order on the rightmost peg

Question
What is this problem called?

Answer
The Tower of Hanoi problem

39 / 58
Problem Solving Recursion

Recursion
Solution for n=4

I Here is a solution for n=4 (external link in the original slides)

I Can we deduce a general solution?

40 / 58
Problem Solving Recursion

Recursion
Some observations

I Assume that at some point in time, the largest disk is moved
from the leftmost to the rightmost peg
I At that point in time:
I there can be no disk on top of the largest disk, i.e., the
leftmost peg has exactly one disk
I there can be no disk on the rightmost peg (Why?)
I i.e., all other disks are on the middle peg
I We must thus have the following situation:

41 / 58
Problem Solving Recursion

Recursion
Some observations (2)

I After moving the largest disk from the leftmost peg to the
rightmost peg, it would suffice to move the remaining disks
from the middle peg to the rightmost peg
I This suggests the following method to solve the problem
I move all the disks but the largest one from the leftmost peg to
the middle one
I move the largest disk to the rightmost one
I move the other disks from the middle to the rightmost peg

Question
Is this a divide and conquer approach, i.e., are the subproblems of
the same type as the initial problem?

42 / 58
Problem Solving Recursion

Recursion
Towards divide and conquer

I Answer to the previous question:
I yes, if we define the initial problem in sufficient generality

Attempt 1
The Tower of Hanoi problem consists in moving n disks from the
leftmost peg to the rightmost peg without ever placing a larger
disk on a smaller disk

Question
Does this definition lead to a divide and conquer approach?

43 / 58
Problem Solving Recursion

Recursion
A general definition

I Answer to the previous question:
I No, this definition is too restrictive. (Why?)

Attempt 2
The Tower of Hanoi problem consists in moving n disks from a
source peg to a target peg without ever placing a larger disk on a
smaller disk

44 / 58
Problem Solving Recursion

Recursion
A more formal definition

I Let us be a bit more formal
I Number the pegs as 1,2,3 from left to right
I Let T(n,i,j,k) denote the problem of moving the topmost n
disks from peg i to peg j, using peg k as auxiliary peg, while
never placing a larger disk on a smaller disk
I What does the original problem correspond to?
T(n, 1, 3, 2)

45 / 58
Problem Solving Recursion

Recursion
A more formal definition

I We can now rephrase the earlier method of solving the
problem as follows:
I To solve T(n,i,j,k) for n > 0:
I first solve T(n-1,i,k,j)
I next move the (largest) disk from peg i to peg j
I finally solve T(n-1,k,j,i)
I We can thus view our solution approach indeed as a divide
and conquer approach

46 / 58
Problem Solving Recursion

Recursion
A recurrence relation

I Let t(n,i,j,k) denote the mathematical function that returns
the sequence of moves necessary to solve problem T(n,i,j,k)
I We then have the following recursive definition for t (based on
the above reasoning):
I t(n,i,j,k) = t(n-1,i,k,j) + (i,j) + t(n-1,k,j,i) for n > 0
I t(0,i,j,k) = empty sequence of moves
I Here + denotes sequence concatenation
I Note that in these equations we use the pair (i,j) to denote a
move of the topmost disk of peg i to peg j
47 / 58
Problem Solving Recursion

Recursion
Python Implementation

I The above recursive procedure can be expressed using a
Python function

def t(n,i,j,k):
    ''' Assume n is an integer >= 0
        Assume that {i,j,k} = {1,2,3}
        Return the sequence of moves to move the n top disks
        from peg i to peg j while using peg k as auxiliary peg'''
    if n == 0:
        return []
    else:
        return t(n-1,i,k,j) + [[i,j]] + t(n-1,k,j,i)

48 / 58
Problem Solving Recursion

Recursion
Question about the Implementation

Question
Why do we write [[i,j]] for the move (i,j) and not simply [i,j]?
(Recall that elements surrounded by brackets denote a list; see lesson 2.)

Answer
Since otherwise we lose the structure of the sequence of moves as
a sequence of pairs, i.e., the function simply outputs a sequence of
numbers.
DEMO
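
The difference can be seen by running both variants side by side. In this sketch, t is the function from the implementation slide, while t_flat is a hypothetical variant written only to show what goes wrong.

```python
def t(n, i, j, k):
    ''' Return the moves as a list of pairs [i, j] '''
    if n == 0:
        return []
    else:
        return t(n-1, i, k, j) + [[i, j]] + t(n-1, k, j, i)

def t_flat(n, i, j, k):
    ''' Hypothetical variant that concatenates [i, j] directly '''
    if n == 0:
        return []
    else:
        return t_flat(n-1, i, k, j) + [i, j] + t_flat(n-1, k, j, i)

print(t(2, 1, 3, 2))        # [[1, 2], [1, 3], [2, 3]]
print(t_flat(2, 1, 3, 2))   # [1, 2, 1, 3, 2, 3]
```

Concatenating [i, j] splices the two numbers into the result, so the boundaries between moves are no longer visible.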
49 / 58
Problem Solving Recursion

Recursion
Testing the Python Implementation

I Let us test the Python implementation. How?
I Use a Tower of Hanoi simulator (external link in the original slides)
I Demo
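
The moves can also be checked by a small program instead of an external simulator. The function check_moves below is a hypothetical checker, not part of the slides; it assumes the pegs are numbered 1, 2, 3 and that all disks start on peg 1 and must end on peg 3.

```python
def t(n, i, j, k):
    ''' Return the sequence of moves for T(n,i,j,k) '''
    if n == 0:
        return []
    else:
        return t(n-1, i, k, j) + [[i, j]] + t(n-1, k, j, i)

def check_moves(n, moves):
    """ Hypothetical checker: returns True if moves legally transfers
        n disks from peg 1 to peg 3 """
    # Each peg is a list of disk sizes; the last item is the top disk
    pegs = {1: list(range(n, 0, -1)), 2: [], 3: []}
    for src, dst in moves:
        if not pegs[src]:
            return False                     # no disk to move
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                     # larger disk on a smaller one
        pegs[dst].append(disk)
    return pegs[1] == [] and pegs[2] == [] and pegs[3] == list(range(n, 0, -1))

for n in range(0, 8):
    print(n, len(t(n, 1, 3, 2)), check_moves(n, t(n, 1, 3, 2)))
```

Each peg is itself handled as a stack: a move pops the top disk from the source peg and pushes it onto the target peg.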

50 / 58
Problem Solving Recursion

Recursion
Call Tree for Towers of Hanoi

I Since there are two recursive calls, we get a call tree rather
than a call chain
I Here is the call tree for 3 disks:

51 / 58
Problem Solving Recursion

Recursion
About that Call Tree (2)

Question
What does the call tree for the t-function have in common with
the call tree for the Fibonacci function?

Answer
They both contain redundant computations, i.e., multiple
computations of the same subproblem.
Can you give examples?

52 / 58
Problem Solving Recursion

Recursion
About that Call Tree (3)

Question
Does the previous observation imply that the t-function has high
complexity? What do you think?

Answer
In fact, no. To understand why, let us take a closer look at the
moves produced by t.

53 / 58
Problem Solving Recursion

Recursion
Moves in the call tree

I Recall the recursion for t:
t(n,i,j,k) = t(n-1,i,k,j) + (i,j) + t(n-1,k,j,i) for n > 0
I We can interpret this equation as follows: with each node
t(n,i,j,k) with n>0, we associate a concrete move (namely
(i,j))
I Represent this single move graphically by placing a black dot
next to each node t(n,i,j,k) in the call tree
I The black dot represents move (i,j)

54 / 58
Problem Solving Recursion

Recursion
Call Tree With Moves

Question
What can we say about the number of moves (= number of black
dots)?

55 / 58
Problem Solving Recursion

Recursion
Tree terminology

I If a node a has an edge to a node b, we say that b is a child
of a and a is the parent of b
I If a node has no children, we call it a leaf node, otherwise we
call it an internal node
I Exactly one node in the tree has no parent: we call it the root
node

Example
Root node: a
Children of c: d, e
Parent of c: a
Internal nodes: a, c
Leaf nodes: b, d, e

56 / 58
Problem Solving Recursion

Recursion
Call Tree With Moves (2)

Answer
The number of moves is equal to the number of internal nodes of
the tree.
How many internal nodes are there? 7

Question
How does the total number of nodes in the call tree compare to
the number of internal nodes?

Answer
We claim that the total number of nodes is at most three times
the number of internal nodes. Why?

57 / 58
Problem Solving Recursion

Recursion
Number of Nodes in the Call Tree

I Let i denote the number of internal nodes.
I Every leaf node has an internal node as a parent
I Each internal node has at most 2 children
I Therefore there are at most 2 · i leaf nodes
I We conclude that the total number of nodes is between i and
3i
I Since i is actually the number of moves, the total number of
nodes in the call tree is roughly proportional to the number of
moves
I We conclude that the recursive procedure is rather efficient
since any other algorithm must take time at least proportional
to the number of moves (Why?)
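
The bound can be checked numerically. The sketch below uses a hypothetical helper count_nodes that relies only on the shape of the recursion for t: a call with n = 0 is a leaf, and a call with n > 0 has two subtrees of the same shape for n-1.

```python
def count_nodes(n):
    """ Assumes n is an integer >= 0
        Returns (internal, total) for the call tree of t(n,i,j,k) """
    if n == 0:
        return (0, 1)                        # a leaf: one node, no move
    internal, total = count_nodes(n - 1)     # both subtrees look the same
    return (2 * internal + 1, 2 * total + 1)

for n in range(1, 6):
    internal, total = count_nodes(n)
    print(n, internal, total, internal <= total <= 3 * internal)
```

For 3 disks this gives 7 internal nodes and 15 nodes in total, in line with the count of 7 moves found earlier.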

58 / 58
Structured Types Functions as Objects

Programming Fundamentals 1
Lesson 5

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 44
Structured Types Functions as Objects

Outline

Structured Types

Functions as Objects

2 / 44
Structured Types Functions as Objects

Structured Types
Structured vs Non-Structured Types

I The first types we have seen in this course (the numeric
types and the boolean type) are unstructured, meaning:
I an item of such a type has no internal structure, i.e., it is
not further subdivided
I We have also seen two structured types. Which ones?
I strings are sequences of characters
I lists are sequences of objects

3 / 44
Structured Types Functions as Objects

Structured Types
Sequences

I Sequences are ordered containers of items that can be
accessed via an index
I Let s denote a sequence. Recall from lesson 2 that all
sequence types support the following operations:
I length function: len(s) returns the length of the sequence
I indexing, e.g., s[0] returns the first element
I concatenation, e.g., for lists: [1, 2, 3] + [4, 5, 6] returns [1, 2,
3, 4, 5, 6]
I membership testing, e.g., "a in s" returns True if a is an item
in s, and False otherwise
I slicing, e.g., s[0:3] returns the prefix of s consisting of the first
three items
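
A short sketch of these operations applied to a string and a list; the variable names and values are arbitrary examples.

```python
text = "python"
numbers = [1, 2, 3]

print(len(text), len(numbers))            # length
print(text[0], numbers[0])                # indexing
print(text + "!", numbers + [4, 5, 6])    # concatenation
print("y" in text, 7 in numbers)          # membership testing
print(text[0:3], numbers[0:3])            # slicing
```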

4 / 44
Structured Types Functions as Objects

Structured Types
Tuples

I Besides strings and lists, there is a third sequence type in
Python: tuple
I Just like lists, tuples can contain objects of arbitrary types
I Literals (= concrete values) of tuple type are written as a
comma-separated list of elements surrounded by parentheses
I Example: a = (3, True, 'wind') contains elements of type int,
bool and str.

5 / 44
Structured Types Functions as Objects

Structured Types
Operations on Tuples

I Since tuples are sequences, all operations mentioned above for
sequences are applicable
I Example: given a tuple t, return its last element:
I t[len(t)-1]
I Other way to express this:
t[-1] # works for all sequences

Question
How do you express a tuple with one element?
Like this: (1) ? → DEMO.

Answer
No, single element tuples need to be terminated by a comma.
E.g., (1,) denotes a tuple with a single integer 1
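
The following sketch shows both points in the shell; the variable names are illustrative.

```python
not_a_tuple = (1)     # parentheses only group here: this is the int 1
one_tuple = (1,)      # the trailing comma makes it a tuple

print(type(not_a_tuple).__name__, type(one_tuple).__name__, len(one_tuple))

t = (3, True, 'wind')
print(t[len(t)-1], t[-1])   # two ways to get the last element
```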
6 / 44
Structured Types Functions as Objects

Structured Types
Tuples vs Lists

Question
Tuples and lists look very similar. So what’s the difference?

Answer
Tuples are immutable (as are strings) while lists are mutable

Definition (Mutability)
A type is mutable if objects of this type can be modified, otherwise
it is immutable.
7 / 44
Structured Types Functions as Objects

Structured Types
Mutations

I Illustrate mutability vs immutability by trying the following in
the shell:
a = [1,2,3]
a[0] = 2
a
a = (1,2,3)
a[0] = 2

I Note that the last assignment produces an exception (a TypeError)
because tuples are immutable
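
The same experiment can be written as a script that catches the exception. This is a sketch; the names b, failed and error are illustrative.

```python
a = [1, 2, 3]
a[0] = 2              # fine: lists are mutable
print(a)

b = (1, 2, 3)
try:
    b[0] = 2          # tuples do not support item assignment
    failed = False
except TypeError as error:
    failed = True
    print("TypeError:", error)
print(b)              # the tuple is unchanged
```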

8 / 44
Structured Types Functions as Objects

Structured Types
Another example

I Try the following example:
a = (1,4)
a += (3,) # shorthand for: a = a + (3,)
print(a)

I What is the output?
I (1, 4, 3)
I So we have shown how to add an element to a tuple.
I Doesn’t this contradict the immutability of tuples?

9 / 44
Structured Types Functions as Objects

Structured Types
Another example (2)
I Let us check object identities by modifying the previous code
snippet:
a = (1,4)
print(id(a))
a += (3,) # shorthand for: a = a + (3,)
print(id(a))

I DEMO
I We note that the object identity of the object to which a is
bound changes
I This is because concatenating two sequences creates a new
sequence
I So there is no contradiction with the immutability of tuples.
10 / 44
Structured Types Functions as Objects

Structured Types
Mutating Lists

I So how would you append an element to a list?
I Like this?
a = [1,4] # a is a list now
a += [3]

I Since lists are mutable, there should be another way that does
not require creating a second list

11 / 44
Structured Types Functions as Objects

Structured Types
Mutating Lists

I There are several methods in the list class that allow us to
modify a list (remember that methods are functions defined
inside classes; see lesson 3)
I Let L be a list
I L.append(e) adds object e to the end of the list
I L.insert(i,e) inserts object e into the list at position i
I L.extend(L1) adds the objects in list L1 to the end of L
I L.remove(e) deletes the first occurrence of e from the list
I L.sort() sorts the objects in L in ascending order
I L.reverse() reverses the order of objects in L
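
A short sketch that applies each of these methods in turn to one list; the starting values are arbitrary, and the comments show the list after each step.

```python
L = [4, 1]
L.append(3)        # [4, 1, 3]
L.insert(1, 7)     # [4, 7, 1, 3]
L.extend([1, 9])   # [4, 7, 1, 3, 1, 9]
L.remove(1)        # [4, 7, 3, 1, 9]   (only the first 1 is removed)
L.sort()           # [1, 3, 4, 7, 9]
L.reverse()        # [9, 7, 4, 3, 1]
print(L)
```

All six methods change L itself and return None, so writing L = L.sort() would lose the list.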
12 / 44
Structured Types Functions as Objects

Structured Types
The append Method

I The following example exercises the append method.
I DEMO.
a = [1,4] # a is a list now
print(id(a))
a.append(3)
print(id(a))
print(a)
I Contrary to the example with concatenation, no new list is
created but the existing one is modified (which is more
efficient in general)
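
The identities can be compared directly. The sketch below also covers two cases not shown on the slide, stated here as facts about Python lists: for a list, a += [5] extends the existing object, whereas a = a + [6] builds a new list.

```python
a = [1, 4]
before = id(a)

a.append(3)                          # modifies the existing list
same_after_append = (id(a) == before)

a += [5]                             # for lists, += also extends in place
same_after_iadd = (id(a) == before)

a = a + [6]                          # + builds a new list and rebinds a
same_after_concat = (id(a) == before)

print(same_after_append, same_after_iadd, same_after_concat, a)
```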

13 / 44
Structured Types Functions as Objects

Structured Types
The insert Method

I L.insert(i,e) inserts object e into the list at position i
I The following example exercises the insert method
a = [1,4] # a is a list now
a.insert(1,3)
print(a)

I What is the output?
[1, 3, 4]

14 / 44
Structured Types Functions as Objects

Structured Types
Information on the insert Method

I How do we look up information on the insert method?
I Using the help function in the Python shell
>>> help(list.insert)

will return:
Help on method_descriptor:
insert(...)
    L.insert(index, object) -- insert object before index

15 / 44
Structured Types Functions as Objects

Structured Types
Meaning of insertion

I Is the statement "insert object before index" clear?
I To avoid any ambiguities we give a more precise statement as
a docstring:
''' Assumes that this object is a list
    and index is a position in the list
    Modifies this list as follows:
    - the elements at positions i<index are unchanged
    - the element at position index will be the inserted object
    - the element at a position i>index will be the element
      that was previously at position i-1
'''

16 / 44
Structured Types Functions as Objects

Structured Types
The remove Method

I L.remove(e) deletes the first occurrence of e from the list
I How do we define the first occurrence?
I the smallest index i such that "L[i] == e" returns True
I So what is the meaning of "L[i] == e"?
I It means that L[i] and e have the same value

17 / 44
Structured Types Functions as Objects

Structured Types
Equality vs Identity

I Two objects that have the same identity necessarily have the
same value
I It is possible to redefine the meaning of "==" for a custom
type.
I If "==" has not been redefined, "==" just means object
identity
I For numeric types, "==" behaves as expected (with "value"
meaning mathematical value)

Example
For type "str" equality has been redefined to mean: s1 == s2 iff
they consist of exactly the same characters in the same order.
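
The effect of redefining "==" can be sketched with two small classes. Both PlainPoint and ValuePoint are hypothetical names invented for this illustration; in Python, "==" is redefined by giving the class an __eq__ method.

```python
class PlainPoint:                 # "==" not redefined: identity decides
    def __init__(self, x, y):
        self.x = x
        self.y = y

class ValuePoint:                 # "==" redefined: coordinates decide
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __eq__(self, other):
        return (isinstance(other, ValuePoint)
                and (self.x, self.y) == (other.x, other.y))

p, q = PlainPoint(1, 2), PlainPoint(1, 2)
v, w = ValuePoint(1, 2), ValuePoint(1, 2)

print(p == q, p == p)                 # False True
print(v == w, v is w)                 # True False
print([p].count(q), [v].count(w))     # 0 1
```

List methods such as remove, count and index rely on "==", so they treat w as an occurrence of v but do not treat q as an occurrence of p.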

18 / 44
Structured Types Functions as Objects

Structured Types
Query Methods for List

I Besides the methods of the list class mentioned above that
modify the list, there are two additional methods:
I L.count(e) returns the number of times that e occurs in L
I L.index(e) returns the index of the first occurrence of e in L
I Note that neither method modifies the list.

Question
Why couldn’t we just use the index-method of list to solve the
number search problem?

Answer
Because this method throws an exception when the element is not
in the list → this requires exception handling.
DEMO
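
A sketch of what such exception handling could look like. The function search is a hypothetical helper, and returning -1 for a missing element is an assumption made for this example only.

```python
def search(L, e):
    """ Assumes L is a list
        Returns the index of the first occurrence of e in L,
        or -1 if e does not occur in L """
    try:
        return L.index(e)
    except ValueError:        # raised by index when e is not in L
        return -1

numbers = [7, 3, 9, 3]
print(numbers.count(3), numbers.index(3), search(numbers, 5))
```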
19 / 44
Structured Types Functions as Objects

Structured Types
List Comprehension

I List comprehension provides a way to construct a list out of
another list.
I Why would you need to do this?

Example
I Suppose you have a list of temperatures in degrees Celsius
I You would like to convert the list into temperatures in
degrees Fahrenheit
I The following slide shows a way to do this.


Structured Types
Converting Temperatures - Version 1

celsiusTemps = [10.2, 3, 1.5, 23]
print(celsiusTemps)
farenheitTemps = []
for t in celsiusTemps:
    farenheitTemps.append((t*9/5) + 32)
print(farenheitTemps)


Structured Types
List Comprehension

Using list comprehension we can replace the code snippet:

farenheitTemps = []
for t in celsiusTemps:
    farenheitTemps.append((t*9/5) + 32)

by the following one:

farenheitTemps = [(t*9/5) + 32 for t in celsiusTemps]


Structured Types
List Comprehension Explained

I The effect of
farenheitTemps = [(t*9/5) + 32 for t in celsiusTemps]
is to create a new list by evaluating the indicated expression
((t*9/5) + 32) for each element of the sequence celsiusTemps
I We can add a condition that selects the elements for which the
expression is evaluated, e.g.,
L1 = [3, 4, 7, 2]
L2 = [x**2 for x in L1 if x%2==1]
print(L2) # output: [9, 49]
places in list L2 the squares of the odd numbers in L1
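
To make the role of the condition explicit, the comprehension can be compared with an equivalent loop. This is only a sketch; the name L3 is introduced here for the comparison.

```python
L1 = [3, 4, 7, 2]

# comprehension with a condition
L2 = [x**2 for x in L1 if x % 2 == 1]

# the same computation written as an explicit loop
L3 = []
for x in L1:
    if x % 2 == 1:          # keep only the odd numbers
        L3.append(x**2)     # and store their squares

print(L2 == L3)
```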


Structured Types
The Set Data Type

I List represents a (mutable) ordered collection of elements
I What do you get if you drop the order and remove duplicates?
I We get a set of elements (same as in mathematics)
I To define a set, use a notation similar to that of lists but
replace brackets by braces (just like we do in maths)

Example
S = {0, 3, 6, 9}
defines the set of multiples of 3 in the range 0..10
Caveat: unlike sets in maths, sets in Python are mutable


Structured Types
Operations on Sets

I If S and T are sets
I ”x in S” tests whether x is an element of S
I S.add(x) adds element x to S
I S.remove(x) removes element x from S (and raises a KeyError
if x is not in S)
I S | T results in a new set that is the union of S and T
I S - T results in the set of elements that are in S but not in T
I S ^ T results in the set (S − T ) ∪ (T − S)
I len(S) returns the number of elements in S

Question
Which operations modify the set S?

Answer
Only add and remove modify the set S.
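
A short sketch of these operations on two small example sets (the set T and the variable names are chosen for this illustration):

```python
S = {0, 3, 6, 9}
T = {0, 2, 4, 6, 8, 10}

union = S | T            # new set, S and T are unchanged
difference = S - T       # elements of S that are not in T
symmetric = S ^ T        # elements in exactly one of the two sets

S.add(12)                # modifies S
S.remove(0)              # modifies S

print(union, difference, symmetric, S, len(S))
```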


Structured Types
The Empty Set

I What is the notation for an empty set?
I It is NOT {}
I {} denotes an empty dictionary (see below)
I You have to use set() instead
I set() is called a constructor
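
The difference can be checked directly in the shell; the variable names below are only for illustration.

```python
empty_dict = {}        # braces without elements create a dictionary
empty_set = set()      # the constructor creates an empty set

print(type(empty_dict))
print(type(empty_set))

empty_set.add(3)       # works: sets have an add method
```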


Structured Types
Set Comprehension

I Recall the initial example
S = {0, 3, 6, 9}
I Same example with set comprehension:
{ x for x in range(11) if x%3 == 0}
I DEMO
I Note that the elements of a set are not necessarily displayed
in order


Structured Types
Dictionaries

I Elements in a sequence can be accessed by an integer index
I What if we need to look up an element in a more flexible way?
I Can you think of a real world example?

Example
Suppose we have a collection of daily temperatures which we want
to look up by the day (as a string).
I Any other examples?
I In Python we can represent such collections using a dictionary


Structured Types
Dictionary Example

I The following defines a collection of daily temperatures:
dailyTemps = {'sun': 19.2, 'mon': 21, 'tue': 23,
              'wed': 17, 'thu': 11, 'fri': 13, 'sat': 19}
I We can look up a temperature by day as follows:
print(dailyTemps['wed'])
# looks up the temperature for Wednesday
I We can think of a dictionary as a set of key-value pairs
I e.g., ’tue’: 23 is a key-value pair in which ’tue’ is the key and
23 is the value
I key and value are separated by a colon (:)
I key-value pairs are separated by commas


Structured Types
Dictionaries vs Sequences

Question
Using the key-value terminology, how would you express the
difference between a sequence and a dictionary?

Answer
A sequence is similar to a dictionary in which the keys are the
consecutive integers 0, 1, 2, ...


Structured Types
Key Types

I Like lists, dictionaries are mutable
I The type of the keys can be any immutable type (e.g., a
numeric type, strings, tuples but not lists)
I Unlike lists, dictionaries are not indexed by position: values
are always looked up by key (since Python 3.7 a dictionary
remembers the order in which its keys were inserted)
I Key types can be different within the same dictionary

Example
The following is a valid definition of a dictionary. What are the key
types?
monthNumbers = {'jan': 1, 'feb': 2, 'mar': 3, 5: 'may'}
print(monthNumbers['feb'],' ', monthNumbers[5])
# output: 2   may (print inserts a space between its arguments)
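
The restriction on key types can be tried out as follows. The dictionary grid and its contents are invented for this sketch; the try statement is explained in lesson 6.

```python
grid = {}
grid[(0, 1)] = 'wall'          # a tuple is immutable, so it can be a key

try:
    grid[[0, 2]] = 'door'      # a list is mutable and is rejected as a key
    list_key_accepted = True
except TypeError:
    list_key_accepted = False

print(grid, list_key_accepted)
```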


Structured Types
Operations on Dictionaries

I Here are some operations on dictionaries:
I dict(): creates an empty dictionary (you can also use {})
I len(d): returns the number of key-value pairs in the dictionary
I d[k] = v: adds key-value pair to the dictionary (if key not yet
in the dictionary) or replaces existing pair with key k
I del d[k]: removes key and associated value from dictionary
I k in d: True if k is a key in the dictionary, False otherwise
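
A minimal run through these operations, using made-up temperature values:

```python
d = dict()             # same as d = {}
d['mon'] = 21          # adds a new key-value pair
d['tue'] = 23
d['mon'] = 20          # key exists already: the value is replaced
size = len(d)          # number of key-value pairs

del d['tue']           # removes the key 'tue' and its value
has_mon = 'mon' in d
has_tue = 'tue' in d

print(d, size, has_mon, has_tue)
```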


Structured Types
Dictionaries as Lists?

I We could simulate a dictionary by a list. How?
I Each element of the list would be a key-value pair.
I Let us rewrite earlier dictionary example (shortened version)
as a list:
dailyTemps = [('sun',19.2),('mon', 21),('tue', 23),...]
I We can then simulate dictionary lookup by a simple function

def lookup(k, l): #look up value with key k in list l
    for elem in l:
        if elem[0] == k:
            return elem[1]
    return None # nothing found


Structured Types
Dictionaries as Lists?

I So why do we need dictionaries if we can easily simulate
dictionaries by lists?
I It turns out that dictionary lookup is much faster
I How long does the lookup function (previous slide) take to
find the value?
I It takes time proportional to the length of the list in the worst
case (and even on average)
I Lookup in dictionaries takes roughly constant time on average,
independent of the number of entries, thanks to a technique
called hashing (see Algorithms & Complexity course, semester 3)
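
The difference can be observed with a rough timing experiment. This is only a sketch: the size n, the number of repetitions and the use of the timeit module are choices made for this illustration, and the measured times depend on the machine.

```python
import timeit

n = 100_000
pairs = [(i, i * i) for i in range(n)]   # list of key-value pairs
table = dict(pairs)                      # the same data as a dictionary

def lookup(k, l):  # linear search, as on the earlier slide
    for elem in l:
        if elem[0] == k:
            return elem[1]
    return None

# look up the last key 20 times with each representation
list_time = timeit.timeit(lambda: lookup(n - 1, pairs), number=20)
dict_time = timeit.timeit(lambda: table[n - 1], number=20)
print('list:', list_time, 'seconds; dict:', dict_time, 'seconds')
```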


Functions as Objects
Example

I Consider the following very simple function:

def f(x):
    return x*x

I What will be the output in the shell of: print(f(3))?
9
I What will be the output if we accidentally write: print(f)?
I The output will be something like: <function f at
0x105923c80>
I Can you explain this output?


Functions as Objects
Functions are Objects

I Answer to the last question:
I print(f) outputs a textual representation of the function
object f: its name and its address in memory
I Thus, functions are indeed objects
I We can therefore inspect the type and id of a function
I DEMO
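
What the demo might look like, as a sketch (the second name square is introduced here only to show that a function object can be bound to another variable):

```python
def f(x):
    return x*x

print(type(f))     # the type of the function object
print(id(f))       # its identity; the value differs from run to run

square = f         # no call here: square now refers to the same object
print(square is f, square(3))
```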


Functions as Objects
Functions as Arguments

I Since functions are objects, we can do something even cooler.
What is it?
I We can pass functions as arguments to another function.
I Here is an example:

def g(x,h): # apply function h to parameter x
    return h(x)


Functions as Objects
Example

I Here is a more complete example. What is the output?

def f(x):
    return x*x
def g(x,h): # apply function h to parameter x
    return h(x)
print(g(3,f)) # output: 9


Functions as Objects
Another Example

I Here is a more interesting application of this feature
I The example applies a function to each element of a list

def applyForEach(L, f):
    """ Assumes L is a list
    Replaces each x in L by f(x) (L itself is modified)
    and returns L"""
    for i in range(len(L)):
        L[i] = f(L[i])
    return L
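
A possible use of applyForEach with two different functions. The helper double and the list values are assumptions made for this sketch; abs is a built-in function.

```python
def applyForEach(L, f):
    for i in range(len(L)):
        L[i] = f(L[i])
    return L

def double(x):
    return 2 * x

values = [1, -2, 3]
result = applyForEach(values, double)   # values becomes [2, -4, 6]
same_object = result is values          # the returned list is L itself

applyForEach(values, abs)               # built-in functions can be passed too
print(values, same_object)
```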


Functions as Objects
Another Example
I We adapt an earlier example of a function that computes the
sum of squares
I We assume now that the function returns the sum of squares
of the numbers in a list passed as parameter (rather than
limiting ourselves to the numbers 1...n)
I Can we shorten the code by using the applyForEach function?
How?

def sumOfSquares(L):
    """ Assumes L is a list of numbers
    Returns the sum of squares of numbers in L """
    s = 0
    for x in L:
        s += x * x
    return s

I Hint: use the built-in function sum.

Functions as Objects
Another Example

I We can write a one-liner

def sumOfSquares(L):
    """ Assumes L is a list of numbers
    Returns the sum of squares of numbers in L """
    return sum(applyForEach(L, f))
    # assuming f(x) returns x*x

I This is not really a one-line implementation. Why?
I We still need to write a function f that computes the square
of a number.


Functions as Objects
Expressing Simple Functions

I A function that computes the square of a number is a very
simple function with a single return statement
I It would be nice to have a more compact notation for such
functions
I Lambda expressions in Python provide such a notation


Functions as Objects
Lambda Expressions

I A lambda expression is an expression that defines an
”anonymous function”
I Its general form is:
lambda <parameters>: <expression>
I A lambda expression can be used to express a simple function
whose body is a single return statement

Example
The lambda expression lambda x: x*x defines a function that
computes the square of a number
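
Two small uses of lambda expressions, as a sketch. The word list is invented, and sorted with its key parameter is a built-in that the slides have not introduced.

```python
square = lambda x: x*x        # the lambda from the example, bound to a name
nine = square(3)

# a lambda passed directly as an argument, without naming it
words = ['pear', 'fig', 'banana']
by_length = sorted(words, key=lambda w: len(w))

print(nine, by_length)
```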


Functions as Objects
Applying Lambda Expressions

I We can apply the lambda expression on the previous slide to
the sum of squares example:

def sumOfSquares(L):
    """ Assumes L is a list of numbers
    Returns the sum of squares of numbers in L """
    return sum(applyForEach(L, lambda x: x*x))

I and going one step further we can get a true one-liner:

lambda L: sum(applyForEach(L, lambda x: x*x))
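
One point worth checking: because applyForEach replaces the elements of its argument, the version above also changes the list passed to it. The sketch below compares it with a variant that leaves the list untouched; the names sumOfSquaresInPlace, data and work are made up here, and the generator expression inside sum is a feature the slides have not introduced.

```python
def applyForEach(L, f):
    for i in range(len(L)):
        L[i] = f(L[i])
    return L

def sumOfSquaresInPlace(L):    # version from the slide
    return sum(applyForEach(L, lambda x: x*x))

def sumOfSquares(L):           # alternative that does not modify L
    return sum(x*x for x in L)

data = [1, 2, 3]
total = sumOfSquares(data)                  # data is unchanged

work = list(data)                           # copy, so data stays intact
total_in_place = sumOfSquaresInPlace(work)  # work now holds the squares

print(total, data, total_in_place, work)
```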

Files Exceptions and Assertions

Programming Fundamentals 1
Lesson 6

Pierre Kelsen

CSC Research Unit, MNO

2017


Outline

Files

Exceptions and Assertions


Files
Text

I The very first code snippet we saw in this course is the
following (from lesson 1):
n = int(input('Please enter a number: '))
print(-n)
I The program uses the input function which reads a line of
text from the keyboard
I Generally text is made up of a sequence of lines
I each line is a sequence of characters (a character being any
symbol you can type on the keyboard in one keystroke)
I each line is terminated by the ”non-printable” newline
character, denoted \n


Files
Transient versus Persistent Data

I The string read by the input function in the last example only
exists during the lifetime of the program
I This string is thus an example of transient data
I If data exists outside the lifetime of the program, we call it
persistent
I Note that you have not yet seen persistent data in your
practicals

Question
Can you think of scenarios in which you would need to process
persistent data?


Files
Text Files

I Persistent text data can be represented in the form of text files
on your computer
I Text files must be stored on a storage medium, e.g., a flash
drive or a hard disk
I The first part of this lesson deals with how to work with text
files


Files
Opening Text Files

I Before you can read text from a text file you must open the
text file
I Can you think of an analogy in the real world?
I In order to read a book, you must first open it
I Text files can be opened using the built-in function open

Question
How can we look up information on open?

Answer
By typing help(open) in the shell → DEMO


Files
Simplest Form of open

I At least one argument has to be given in a call to open.
(Why?)
I In its simplest version we just pass the name/location of the
file as argument
I The simplest case is when the file you want to open is in the
current directory


Files
Current Directory

I How do we find the location of the current directory?
I Use the getcwd() function in the os module of the Python
standard library, e.g., like this:
import os
print(os.getcwd())
I DEMO (both in shell and Python file)
I Note: the location of the current directory may change when we
switch from executing commands in the shell to executing
commands in a Python file


Files
An Example

I The following represents a very simple use of the
open-function:
f = open("myfile.txt")
I If the call succeeds, the file named ”myfile.txt” is open for
reading.
I DEMO
I Note that we encounter an exception (=error at runtime)
when executing this: FileNotFoundError: [Errno 2] No
such file or directory: 'myfile.txt'
I Let’s fix it by creating a file in the current directory with this
name → DEMO


Files
Reading the File

I We can read a text file by calling the readline-method on
the object returned by the open-function
I This method returns a single line of text including the
newline-character \n or returns the empty string (in case the
end of the file has been reached)
I On the following slide we give a program that reads lines of
text from a text file and prints them to the screen
I Recall that we can interpret a string value as a boolean value
which is True if and only if the string is non-empty


Files
Example Program: First Attempt

I Trying to read a file line by line:

f = open("myfile.txt")
while s = f.readline():
    print(s)

I DEMO
I Why doesn’t this work (it gives a syntax error)?


Files
Review: Assignment Statement

I Recall from lesson 1: ”x = <expr>” is an assignment statement
I It is not an expression
I Thus it does not have a value and cannot be used inside
expressions
I Here Python differs from languages such as Java or C!
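
As a side note, Python 3.8 and later offer a separate assignment expression, written := (the "walrus operator"), which can appear inside a condition. The sketch below assumes such a Python version; it creates a small temporary file so that it is self-contained, and the file content is invented.

```python
import os
import tempfile

# prepare a small text file to read from (writing is covered later)
path = os.path.join(tempfile.mkdtemp(), 'myfile.txt')
f = open(path, 'w')
f.write('first line\nsecond line\n')
f.close()

lines = []
f = open(path)
while (s := f.readline()):   # assign and test in one expression
    lines.append(s)
f.close()

print(lines)
```

The plain "=" remains a statement, so the first attempt still fails on every Python version.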


Files
Example Program: Second Attempt

I We have to slightly rewrite the previous program:

f = open("myfile.txt")
s = f.readline()
while s:
    print(s)
    s = f.readline()

I Having two readlines in the code above is not ideal. Why?
I If we change the reading method we need to modify code in
two places.


Files
Example program: Version 2

I In Python files can be viewed as sequences of lines
I So what is a more elegant way to read the lines?
I Using a for-loop:

f = open("myfile.txt")
for s in f:
    print(s)

I Note that we get extra blank lines in the output. Why?
I Because print adds a newline character by default at the
end each time it is called, and the string s (representing a
line) itself includes the end-of-line character


Files
Example program: Version 3

I We can get rid of the extra lines in the output by stripping
each line of the last character
I This can be done using string slicing: s[:-1] includes
everything but the last character

f = open("myfile.txt")
for s in f:
    print(s[:-1]) #same as: print(s[:len(s)-1])
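
A caveat with slicing: if the last line of the file does not end with a newline, s[:-1] cuts off a real character. The sketch below uses two made-up strings to compare slicing with the string method rstrip, which only removes trailing newline characters when called as rstrip('\n').

```python
with_newline = 'last line\n'
without_newline = 'last line'     # e.g. the final line of a file

sliced = without_newline[:-1]               # loses the final letter
stripped = without_newline.rstrip('\n')     # nothing to remove: unchanged
also_stripped = with_newline.rstrip('\n')   # newline removed

print(repr(sliced), repr(stripped), repr(also_stripped))
```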


Files
The print Function

I A better way to suppress extra lines is to use a more elaborate
version of print
I What is it? Look for help on the print-function. → DEMO
I The print-function has a parameter named end that specifies
the string to be appended at the end
I By default the string to be appended is a newline
I We change it to the empty string


Files
Example program: Version 4

I We thus obtain the following program

f = open("myfile.txt")
for s in f:
    print(s, end = '')

I DEMO


Files
File Access Modes

I When calling the open-function with a single parameter, we
open the file for reading
I The open-function has a parameter called mode that is used
to indicate how we want to access the file
I Examples of available modes are:
I ’r’ for reading (the default)
I ’w’ for writing (an existing file with this name is overwritten)
I ’a’ for appending (new text is added at the end of the file)
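
The three modes can be compared on a throwaway file. This is a sketch: the file name modes.txt and the temporary directory are assumptions made so that the example does not touch the current directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

f = open(path, 'w')      # creates the file (or overwrites an existing one)
f.write('first\n')
f.close()

f = open(path, 'a')      # keeps the content and adds at the end
f.write('second\n')
f.close()

f = open(path, 'r')      # reads everything back
content = f.read()
f.close()

f = open(path, 'w')      # reopening with 'w' empties the file
f.close()
size_after_rewrite = os.path.getsize(path)

print(repr(content), size_after_rewrite)
```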


Files
Writing Text Files

I We can write a string to a file using the write-method
I The following code copies lines from a file ”f1.txt” to ”f2.txt”
in the current directory

f1 = open("f1.txt",'r') # same as: f1 = open("f1.txt")
f2 = open("f2.txt",'w')
for s in f1:
    f2.write(s)
f1.close()
f2.close()
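
A common alternative to calling close by hand is the with statement, which closes the files automatically when its block ends, even if an exception occurs. The slides do not cover it, so the following is only a sketch; it works in a temporary directory and creates its own source file.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
src = os.path.join(folder, 'f1.txt')
dst = os.path.join(folder, 'f2.txt')

with open(src, 'w') as f:            # create a small source file
    f.write('line 1\nline 2\n')

with open(src) as f1, open(dst, 'w') as f2:
    for s in f1:                     # copy line by line
        f2.write(s)
# both files are closed here

with open(dst) as f:
    copied = f.read()
print(repr(copied))
```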


Files
About Buffers

I Why do we need to execute f2.close() at the end?
I The write method first writes its data into an area of internal
memory called a buffer
I Normally the content of the buffer is written to the file only
when the buffer is full
I Since the last lines may not fill the buffer we call the
close-method to flush/empty the buffer
I Buffers are used for efficiency reasons (Explain)


Files
Path Names
I All examples above assume the file is located in the current
directory
I If this is not the case, we can give the absolute path name of
the file
I The form of absolute path names depends on the platform,
e.g.,
I C:\MyFiles\file1.txt could be an absolute path name
(including the file name) on Windows
I /Users/pierre/MyFiles/file1.txt could be an absolute
path name (including the file name) on a Mac
I In a Python string literal the backslash starts an escape
sequence (e.g., \f is a form feed), so the Windows path has to be
written as "C:\\MyFiles\\file1.txt" or r"C:\MyFiles\file1.txt"
I Python allows the use of the same separator - ”/” - on all
platforms
I we can thus rewrite the Windows path above as:
"C:/MyFiles/file1.txt"
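
Why backslashes in Windows path names need care can be seen by comparing string lengths. The path used below is a shortened, made-up example; none of the files needs to exist.

```python
bad = "C:\file1.txt"         # \f is read as one form-feed character
doubled = "C:\\file1.txt"    # each \\ stands for a single backslash
raw = r"C:\file1.txt"        # raw string: backslashes are kept as typed
forward = "C:/file1.txt"     # forward slashes avoid the problem

print(len(bad), len(doubled), len(raw), len(forward))
print(doubled == raw)
```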

Files
Relative Path Names

I Besides absolute path names we can use relative path names
I E.g., ”MyFiles/f1.txt” would refer to a file named ”f1.txt” in
a subdirectory named ”MyFiles” of the current directory
I Note the absence of a leading slash in relative path names


Exceptions and Assertions


Motivation
I Recall the difference between syntax errors and exceptions in
Python (from lesson 2)
I Syntax errors are found before execution, and exceptions are
found during execution
I Exceptions signal that something unexpected has happened
while the program was executed

Question
Why do we need to deal with exceptions?
Answer
Because we make mistakes in our coding.
Example:
t = [1,2,3]
print(t[3])
I DEMO → raises an exception of type IndexError


Exceptions and Assertions


The Perfect Programmer

I Suppose you were a perfect programmer, i.e., your code never
contains bugs
I Would you still need to deal with exceptions?
I Yes, because unexpected things can happen beyond our
control. For example?
I A file that we expect to be present is not there
I The user enters invalid input
I The computer runs out of memory


Exceptions and Assertions


Standard Exceptions

I Python contains a predefined set of exceptions referred to as
standard exceptions
I For example:
I ImportError: raised when an import statement fails
I IndexError: raised when a sequence index is out of range
I NameError: raised when a name is not found
I TypeError: raised when an operation or function is applied to
an object of inappropriate type
I ValueError: raised when an operation or function is applied
to an object of appropriate type but inappropriate value
I IOError: raised when an input/output operation fails (in
Python 3 this name is an alias of OSError)
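
Several of these exceptions can be provoked on purpose. In the sketch below the helper exception_name is made up for this illustration; it relies on the try statement introduced on the following slides, and on the "except ... as e" form, which gives access to the exception object.

```python
def exception_name(action):
    """Call action() and return the name of the exception it raises."""
    try:
        action()
    except Exception as e:
        return type(e).__name__
    return None               # no exception was raised

names = [
    exception_name(lambda: [1, 2, 3][3]),     # index out of range
    exception_name(lambda: undefined_name),   # name that was never defined
    exception_name(lambda: 'a' + 1),          # str + int is not allowed
    exception_name(lambda: int('abc')),       # right type, unusable value
]
print(names)
```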


Exceptions and Assertions


Example: ValueError

I Recall: ValueError: raised when an operation or function is
applied to an object of appropriate type but inappropriate
value
I Can you think of an example?
I An example is given in the following code

import math
print(math.factorial(4))
print(math.factorial(-1))

I DEMO
I The second print statement raises a ValueError since the
factorial function expects a non-negative argument


Exceptions and Assertions


To Handle or not to Handle

I In all examples we have seen so far (including the last one) an
exception leads to the termination of the program
I Is this always appropriate?
I No, since in some cases (e.g., in a safety-critical system)
errors need to be handled in a more "graceful" way
I In those cases we need to handle the exceptions ourselves

Another Example: Arithmetic Exceptions

I Consider the following code:

successFailureRatio = numSuccesses/float(numFailures)
print('The success/failure ratio is', successFailureRatio)
print('Now here')

I What unexpected event can happen?
I numFailures could be zero
I In that case Python raises a ZeroDivisionError, which is a
kind of ArithmeticError

Try Statement

I We can handle exceptions using the try statement


I The simplest form of the try statement is the following:

try:
    statement(s)
except [expression]:
    statement(s)

I brackets ([,]) indicate that the expression is optional

Try Statement: Example

1 try:
2     successFailureRatio = numSuccesses/float(numFailures)
3     print('The success/failure ratio is', successFailureRatio)
4 except ZeroDivisionError:
5     print('No failures')
6 print('Now here')

I If an exception is raised in line 2, execution proceeds with line 5
and then line 6 (the assignment in line 2 is not executed)
I If no exception is raised in line 2, execution proceeds with line 3
and then line 6
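
The control flow described above can be tried out with a small self-contained sketch. The function name describe_ratio and its parameters are hypothetical (they do not appear on the slides); the function returns the message instead of printing it so that both paths are easy to compare.

```python
def describe_ratio(num_successes, num_failures):
    """Return a message describing the success/failure ratio."""
    try:
        # Raises ZeroDivisionError when num_failures is 0
        ratio = num_successes / num_failures
        return 'The success/failure ratio is ' + str(ratio)
    except ZeroDivisionError:
        # Reached only if the division above failed
        return 'No failures'


print(describe_ratio(6, 3))  # the try block runs to completion
print(describe_ratio(6, 0))  # the except clause takes over
```

In the second call the assignment to ratio never happens, which mirrors the jump from line 2 to line 5 in the numbered example.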

Try Statement: Another Example

I Consider the following code:

val = int(input('Enter an integer: '))
print('The square of the number you entered is:', val * val)

I What kind of unexpected event can occur?


I The user can enter an invalid number!
I Which kind of exception would this correspond to? (Show
standard errors.)
I If the string entered does not represent a valid number, a
ValueError is raised

Try Statement: Another Example

I We can handle the ValueError as follows:

try:
    val = int(input('Enter an integer: '))
    print('The square of the number you entered is:', val**2)
except ValueError:
    print('The entered number is not an integer')

I What if the val value is needed for further computations?
What should be done?

Keep trying

I The following code repeatedly asks the user for a number until
a valid number is given:

while True:
    val = input('Enter a number: ')
    try:
        val = int(val)
        print('The square of the number you entered is:', val**2)
        break  # exit the while loop
    except ValueError:
        print('The entered number is not an integer')

I This code guarantees that the code following the while loop
can only be reached once val has a proper integer value

Make it a Function
I If we need to request an input in several places, it makes
sense to package this as a function:
def requestInput(valType, prompt):
    while True:
        val = input(prompt)
        try:
            val = valType(val)
            return val
        except ValueError:
            print('The value does not have the required type')

I The code is more general than the code on the previous slide.
Why? Because the conversion function is passed as a parameter,
so it works for any type (e.g., int or float).
I We can ask for an integer like this:
requestInput(int, 'Please enter an integer: ')
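
To experiment with this function without typing at the keyboard, one possible variant takes the reading function as an extra parameter. This is only a sketch: the names request_input, read and answers are assumptions introduced here, not part of the course code.

```python
def request_input(val_type, prompt, read=input):
    """Keep asking until the entered text can be converted by val_type.

    read is a hypothetical extra parameter: any function that takes the
    prompt and returns a string (by default the built-in input).
    """
    while True:
        text = read(prompt)
        try:
            return val_type(text)
        except ValueError:
            print('The value does not have the required type')


# Simulate a user who first types 'abc' and then '2.5'
answers = iter(['abc', '2.5'])
value = request_input(float, 'Please enter a number: ',
                      read=lambda prompt: next(answers))
print(value)
```

The first simulated answer triggers the except clause and the loop asks again; the second answer is converted and returned.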

Multiple Exceptions

s = input('Enter a sequence of numbers: ')
numberSequence = s.split()
print(numberSequence)
inverseSequence = [1/int(x) for x in numberSequence]
print(inverseSequence)

I What exception(s) can be raised in this code snippet?


I Both a ValueError and ZeroDivisionError could be raised
I How can we handle both exceptions at once?

Multiple Exceptions: Approach 1

I We can handle multiple exceptions using multiple except
clauses

try:
    s = input('Enter a sequence of numbers: ')
    numberSequence = s.split()
    print(numberSequence)
    inverseSequence = [1/int(x) for x in numberSequence]
    print(inverseSequence)
except ValueError:
    print('invalid number')
except ZeroDivisionError:
    print('Attempted to divide by 0')

I At most one except clause is executed → DEMO
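
The "at most one except clause" rule can be observed with a sketch that replaces the keyboard input by a string argument. The function name invert_all is hypothetical, and it returns a message instead of printing it.

```python
def invert_all(s):
    """Return the inverses of the integers in s, or an error message."""
    try:
        return [1 / int(x) for x in s.split()]
    except ValueError:
        return 'invalid number'
    except ZeroDivisionError:
        return 'Attempted to divide by 0'


print(invert_all('1 2 4'))    # no exception: neither clause runs
print(invert_all('1 two 0'))  # 'two' fails first: only the ValueError clause runs
print(invert_all('1 0 two'))  # 0 fails first: only the ZeroDivisionError clause runs
```

The first exception raised ends the try block, so the later problem in each input is never reached.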


Multiple Exceptions: Approach 2

try:
    s = input('Enter a sequence of numbers: ')
    numberSequence = s.split()
    print(numberSequence)
    inverseSequence = [1/int(x) for x in numberSequence]
    print(inverseSequence)
except (ValueError, ZeroDivisionError):
    print('ValueError or ZeroDivisionError occurred')

I An except clause can be followed by a tuple of Exceptions


I DEMO

Multiple Exceptions: Approach 3

try:
    s = input('Enter a sequence of numbers: ')
    numberSequence = s.split()
    print(numberSequence)
    inverseSequence = [1/int(x) for x in numberSequence]
    print(inverseSequence)
except:
    print('Some error occurred')

I A bare except clause (one that names no exception) catches all
exceptions, even unexpected ones (e.g., a NameError caused by a
misspelled identifier)
I Generally we don't want to do this since we would like to be
notified of unexpected events
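
A small sketch of why a bare except clause can hide programming mistakes. Both functions below are hypothetical and deliberately contain the same typo (valuess instead of values).

```python
def mean_bare(values):
    try:
        return sum(values) / len(valuess)  # typo: 'valuess' is undefined
    except:
        # The NameError caused by the typo is silently swallowed here
        return 'Some error occurred'


def mean_specific(values):
    try:
        return sum(values) / len(valuess)  # same typo
    except ZeroDivisionError:
        return 'empty list'


print(mean_bare([1, 2, 3]))  # the typo goes unnoticed

try:
    mean_specific([1, 2, 3])
except NameError as e:
    # With a specific except clause the typo surfaces as a NameError
    print('NameError surfaced:', e)
```

Naming the expected exception keeps the handler from masking errors it was never meant to deal with.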

Accessing the Error Message

I Each exception has an error message


I If an exception may have several underlying causes, it may be
relevant to access the error message

try:
    f = open('test.txt')
    s = f.readline()
except IOError as e:
    print('IO error', e)

I An IOError may be raised for several reasons, e.g., the file
may not exist or we may not have permission to read it
I In the above example e refers to the exception object; printing
it displays the error message
I DEMO

Exception Propagation

I If an exception is raised in a function, either
I the exception is handled within the function, or
I the exception is passed to the caller
I This process continues until either a handler is found in your
code, or the interpreter code is reached
I In the latter case the interpreter handles the exception by
printing an error message, including a traceback that gives
details about functions terminated during propagation

Exception Propagation: Example

I Consider the following example:

def f():
    g()
def g():
    h()
def h():
    i = int(input('Enter a number please: '))
    print(i*i)

I Try out both valid and invalid data entry


I DEMO
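
Propagation can also be observed without the interpreter taking over, by placing a handler in the outermost caller. The sketch below is a variant of the example above under two assumptions: the text is passed as a parameter instead of being read with input, and run is a hypothetical extra function that contains the only handler.

```python
def f(text):
    return g(text)


def g(text):
    return h(text)


def h(text):
    i = int(text)  # may raise ValueError; there is no handler in h
    return i * i


def run(text):
    try:
        return f(text)
    except ValueError as e:
        # The exception raised inside h passed through g and f
        return 'handled in run: ' + str(e)


print(run('7'))
print(run('a3'))
```

Neither h, g nor f handles the ValueError, so it travels up the call chain until it reaches the try statement in run.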

Exception Propagation: Traceback

I If we enter an invalid number after calling f() in the shell,
Python displays the following:

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    f()
  File "/Users/pierre.kelsen/Documents/Temp/test.py", line 2, in f
    g()
  File "/Users/pierre.kelsen/Documents/Temp/test.py", line 4, in g
    h()
  File "/Users/pierre.kelsen/Documents/Temp/test.py", line 6, in h
    i = int(input('Enter a number please: '))
ValueError: invalid literal for int() with base 10: 'a3'

Call Stack

I After the exception interrupts function h(), the interpreter
displays the call chain (including line numbers) that has led to
the execution of the function that raises the exception (in this
case the int function)
I The call chain can be extracted from the stack of activation
records (see lesson 4) (revisit previous slide)

Raising your own Exceptions

I You can raise your own exception using the syntax:
raise expression
I Here is an example where this would be useful:

def crossProduct(seq1, seq2):
    if not seq1 or not seq2:  # if either sequence is empty
        raise ValueError('Sequence arguments must be non-empty')
    return [(x1, x2) for x1 in seq1 for x2 in seq2]

I Note that the exception name is followed by an error message
(in parentheses)
I This exception will either be handled by the caller, or passed
up the call stack
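
A short sketch of how a caller might deal with the exception raised by crossProduct. The function is repeated so that the snippet runs on its own; the sample arguments are made up for illustration.

```python
def crossProduct(seq1, seq2):
    if not seq1 or not seq2:  # if either sequence is empty
        raise ValueError('Sequence arguments must be non-empty')
    return [(x1, x2) for x1 in seq1 for x2 in seq2]


print(crossProduct([1, 2], 'ab'))

try:
    crossProduct([], 'ab')
except ValueError as e:
    # e carries the message given in the raise statement
    print('Caught:', e)
```

Without the try statement, the second call would propagate the ValueError further up the call stack.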

Revisiting: Number Search Problem
I Recall the Number Search Problem from the first lesson:
I Input: a sequence of numbers numberList and a number n
I Output: first position at which n occurs (starting with 0), or
-1 if number not in list
I We proposed the following code for finding the number:
pos = -1
found = False
for x in numberList:
    pos += 1
    if numberList[pos] == n:
        found = True
        break
if found:
    print('number found at position: ', pos)
else:
    print('number not found')

The index Method

I In lesson 2 we mentioned the index method for lists
I Let us check the official documentation:
I help(list.index) will return:
Help on method_descriptor:

index(...)
    L.index(value, [start, [stop]]) -> integer -- return first index of value.
    Raises ValueError if the value is not present.

I How can we simplify the code on the previous slide to find a
number in the sequence?

Applying the index Method

I Using exception handling we come up with a much cleaner
and shorter solution:
try:
    pos = numberList.index(n)
    print('number found at position: ', pos)
except ValueError:
    print('number not found')
I This also shows that exception handling is not only used to
deal with "abnormal" events
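
The same idea can be packaged as a function that meets the original problem statement (return -1 if the number is not in the list). The name find_position is hypothetical.

```python
def find_position(number_list, n):
    """Return the first position of n in number_list, or -1 if absent."""
    try:
        return number_list.index(n)
    except ValueError:
        # list.index signals "not present" by raising ValueError
        return -1


print(find_position([4, 8, 15, 8], 8))
print(find_position([4, 8, 15, 8], 3))
```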

Exception Checking Strategies

I There are two basic ways of dealing with exceptions
I The first method, sometimes known as "Look Before You
Leap" (LBYL), is to check in advance for any possible
problems
I This approach is illustrated in the following example

def safeDivide(x, y):
    if y == 0:
        print('Divide-by-0 attempt detected')
        return None
    else:
        return x/y

Exception Checking Strategies

I Main disadvantage of the LBYL approach:
I The checks diminish the readability of the main (normal) case
I Since the checks come first, the mainstream case is somewhat
hidden at the end of the function
I The second approach, also known as "It's Easier to Ask
Forgiveness than Permission" (EAFP), is the preferred
approach in Python
I On the next slide is a version of the previous example using
this approach

Example with EAFP Approach

def safeDivide(x, y):
    try:
        return x/y
    except ZeroDivisionError:
        print('Divide-by-0 attempt detected')
        return None

I Note that the mainstream case is upfront, which makes the
code more readable
I The EAFP motto is widely attributed to Grace Murray
Hopper
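
The two strategies can be compared side by side on a second, made-up task: returning the first item of a sequence, or None if the sequence is empty. Both function names are hypothetical.

```python
def first_item_lbyl(seq):
    # LBYL: check for the problem before attempting the operation
    if len(seq) == 0:
        return None
    return seq[0]


def first_item_eafp(seq):
    # EAFP: attempt the operation and handle the failure afterwards
    try:
        return seq[0]
    except IndexError:
        return None


for sample in ([3, 4], [], 'abc', ''):
    print(repr(sample), first_item_lbyl(sample), first_item_eafp(sample))
```

Both versions give the same results here; the difference lies in where the normal case appears in the code.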

Grace Murray Hopper

I Grace Murray Hopper was an American computer scientist
and a US Navy rear admiral
I Worked on the Harvard Mark I
I Designed FLOW-MATIC, a language that strongly influenced COBOL
I Is credited with the saying behind the EAFP approach to error checking

COBOL vs Python

I Simple COBOL program:


$ SET SOURCEFORMAT"FREE"
IDENTIFICATION DIVISION.
PROGRAM-ID. ShortestProgram.

PROCEDURE DIVISION.
DisplayPrompt.
DISPLAY "I did it".
STOP RUN.

I Equivalent Python program:


print('I did it')

Testing

Programming Fundamentals 1
Lesson 7

Pierre Kelsen

CSC Research Unit, MNO

2017

Outline

Testing

Testing
Testing vs Debugging

I Recall from lesson 2:


I A program is correct if for every valid input it returns within a
finite amount of time the output requested
I Testing consists in checking the correctness of a program by
running it on a set of inputs
I Debugging consists in finding the error in a program that we
know not to be correct

Testing vs Debugging: The Big Picture

Debugging will be covered in lesson 8

Aim of Testing
Observation
Program testing can be used to show the presence of bugs, but
never to show their absence.

Edsger W. Dijkstra

Aim of Testing (2)
I A similar quote attributed to Albert Einstein (and paraphrased
here):
No amount of experimentation can ever prove me right; a
single experiment can prove me wrong.

Albert Einstein
Limits of Testing

Question
Why can’t we just test all possible inputs?

Answer
Because there are too many. E.g., suppose you wrote a function
that tests whether a number is prime. How many possible inputs
are there?
In Python 3, there is no limit to the size of an integer!

Test Suite

I Since we generally cannot test all possible inputs, the key
question is:
How to find a suitable set of possible inputs to test?
I We call such a set a test suite

Partition

Definition [1]
In mathematics, a partition of a set is a grouping of the set’s
elements into non-empty subsets, in such a way that every element
is included in one and only one of the subsets.

Question
Can you give examples of set partitions?

[1] https://en.wikipedia.org/wiki/Partition_of_a_set
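
The definition can be turned into a small check. This is a sketch with hypothetical names (is_partition, blocks, universe); it uses the digits 0 to 9 split into even and odd numbers as one example of a partition.

```python
def is_partition(blocks, universe):
    """Return True if the sets in blocks form a partition of universe."""
    seen = set()
    for block in blocks:
        if not block:     # subsets must be non-empty
            return False
        if seen & block:  # no element may occur in two subsets
            return False
        seen |= block
    return seen == universe  # every element must be covered


digits = set(range(10))
evens = {0, 2, 4, 6, 8}
odds = {1, 3, 5, 7, 9}

print(is_partition([evens, odds], digits))       # a partition
print(is_partition([evens, {1, 3}], digits))     # some digits are missing
print(is_partition([evens, odds, {0}], digits))  # 0 occurs twice
```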
From Partitions to Test Suites

I We could partition the set of all possible inputs into subsets
with the property:
I if the program is correct for one element in a subset, then it is
correct for all elements in this subset
I Let us call such a partition a nice partition
I How can we construct a test suite from a nice partition?
I By selecting one element from each subset of the partition

Finding Test Suites

I Ideally we would like to come up with a nice partition of
minimal size (which would lead to a test suite of minimal size)
I Unfortunately we don’t know in general how to do this
I So we’ll settle in general for some other test suite of
reasonable size
I We now describe two strategies for finding good test suites:
black-box testing and glass-box testing

Black-Box Testing

I Black-box test suites are created without looking at the code
I Advantages of this approach:
I Testers and implementers can be drawn from separate
populations (see next slide)
I The approach is robust with respect to implementation changes:
when the implementation changes, we do not need to change the
test suite
I We can write the test cases before writing the code (assuming
the requirements are defined)
I We can let the tests guide the development → test-driven
development

Testers vs Implementers

Question
Why can it be advantageous for testers to be distinct from
implementers?

Answer
Test suites may exhibit mistakes that are correlated with mistakes
in the code. How?

Example
Suppose that the programmer made the incorrect assumption that
a function is never called with a negative number.
If the programmer writes the test suite, (s)he is likely to repeat the
same mistake (by not including cases where the argument is
negative).
Specifications

I Since black-box tests are not based on the code, they must be
based on the requirements of the code

Question
We have already seen a method for defining the requirements of
a function in Python. What is it?

Answer
The requirements can be specified by a string literal placed at the
start of a function, called a docstring. The docstring defines the
assumptions on the input and the guarantees on the output. We
called this a specification or a contract for the function.

Example Specification

I The following function returns an integer based on the age
interval of a person
I Such a function could for instance be useful in the context of
a human resources software that determines what kind of
employment contract a person may have with a company

Example
def ageInterval(x):
    """Assumes x is a non-negative integer in the range 0..99
    Returns 0 if 0<=x<=15, 1 if 16<=x<18,
    2 if 18<=x<=60, 3 if 60<x<=99
    """

Boundary-Values Technique

I One technique employed within black-box testing is the
boundary value technique
I It is based on examining test cases at the boundary values

Question
How would you define the boundary values of the ageInterval
function?

Answer
Define the boundary values as those values that are the endpoints
of the age intervals being tested.
In this case: 0, 15, 16, 18, 60, 99

Boundary-Values Technique: Application

I The boundary value technique consists in testing the boundary
values as well as the values immediately around them
I if b is a boundary value, we will check b-1, b and b+1
I From boundary values: 0, 15, 16, 18, 60, 99 we thus obtain
the following test cases:
-1, 0, 1, 14, 15, 16, 17, 18, 19, 59, 60, 61, 98, 99, 100

Question
Do you notice anything special about the test suite?

Answer
The first and last values are not valid inputs according to the
requirements.
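
The list of test cases above can be derived mechanically from the boundary values. A minimal sketch, with the hypothetical function name boundary_test_cases:

```python
def boundary_test_cases(boundaries):
    """Return b-1, b and b+1 for every boundary value b, without duplicates."""
    cases = set()
    for b in boundaries:
        cases.update({b - 1, b, b + 1})
    return sorted(cases)


print(boundary_test_cases([0, 15, 16, 18, 60, 99]))
```

A set is used because neighbouring boundaries such as 15 and 16 produce overlapping values.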

Invalid inputs

I If an input value is outside the range of valid inputs, what are
the guarantees on the output?
I There are no guarantees!!
I Does that mean that we should not test those invalid inputs?
I Two approaches:
I contract based testing: only test valid inputs
I defensive testing: test all inputs

Defensive Testing

I What would be the motivation for doing defensive testing?


I In some circumstances (e.g., safety-critical systems) it is
important that the software continues to function even if
abnormal circumstances are met

Equivalence Class Testing

I A second technique for black-box testing (besides boundary
value testing) is equivalence class testing
I It goes back to the idea of a nice partition introduced above:
we define subsets of input values that are thought to be
equivalent in the sense that if the code behaves correctly for
one input in a subset it behaves correctly for all inputs of this
subset

Equivalence Class Testing: Application

I We can apply equivalence class testing to the ageInterval
function as follows:
I We note that the function should behave in the same way for
two inputs if they both belong to one of the following
intervals: [0,...,15], [16,17], [18,...,60], [61,...,99]
I According to equivalence class testing it would suffice to choose
one test case from each interval
I Since we already know how to check for boundary values, it is
natural to choose non-boundary values (when possible) in this
case
I Possible test suite: 7, 16, 40, 73
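
To run this test suite we need some implementation of the specification. The slides only give the docstring, so the function age_interval below is one hypothetical implementation, used here purely to illustrate how the suite could be executed.

```python
def age_interval(x):
    """Hypothetical implementation of the ageInterval specification.

    Assumes x is an integer in the range 0..99.
    """
    if x <= 15:
        return 0
    elif x < 18:
        return 1
    elif x <= 60:
        return 2
    else:
        return 3


# One representative per equivalence class, mapped to the expected output
suite = {7: 0, 16: 1, 40: 2, 73: 3}

failures = [x for x, expected in suite.items() if age_interval(x) != expected]
print('failed inputs:', failures)
```

An empty list of failures only tells us that the four representatives behave as specified; it relies on the assumption that each interval really is an equivalence class.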

Glass-Box Testing

I In many cases black-box testing is not sufficient to find some
errors
I Generally incorrect outputs can be traced to logical errors in
the code
I Without looking at the code it is thus often difficult to find
certain errors

Glass-Box Testing: Example

I Consider the following function:

def isPrime(x):
    """Assumes x is a non-negative integer
    Returns True if x is prime, False otherwise"""
    if x<=2:
        return False
    for i in range(2,x):
        if x%i == 0:
            return False
    return True

Glass-Box Testing: Example

I By inspecting the code of function isPrime we notice that
some values are treated differently. Which ones?
I Values 0, 1 and 2 are treated differently (because of the first
if-statement); they should therefore be tested
I Testing these values will uncover the error
I isPrime(2) incorrectly returns False
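
One possible repair, shown as a sketch, is to change the first test so that only 0 and 1 are rejected. The name is_prime is used here to keep the corrected version apart from the faulty isPrime above.

```python
def is_prime(x):
    """Assumes x is a non-negative integer.
    Returns True if x is prime, False otherwise."""
    if x < 2:  # only 0 and 1 are rejected here; 2 falls through to the loop
        return False
    for i in range(2, x):
        if x % i == 0:
            return False
    return True


# The specially treated values 0, 1 and 2, plus a few more
for x in (0, 1, 2, 3, 4):
    print(x, is_prime(x))
```

For x = 2 the range is empty, so the loop body never runs and the function returns True.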

Paths through Code
I Different input values give rise to different statements being
executed
I Each sequence of statements that can be executed from start
to finish is called a path
I Example of paths (as sequences of line numbers) are given
below

1 def isPrime(x):
2     if x<=2:
3         return False
4     for i in range(2,x):
5         if x%i == 0:
6             return False
7     return True

Path for x=1: 2, 3
Path for x=3: 2, 4, 5, 7
Path for x=4: 2, 4, 5, 6
Path-Completeness

I A glass-box test suite is path-complete if it exercises every
path through the code
I It is not always possible to have a path-complete test suite.
I E.g., in the above example it is not possible:
I the for-loop can be repeated an arbitrary number of times

Path-Completeness and Correctness
I Suppose we do have a path-complete test suite
I Is that sufficient to detect all errors?
I No, it is not - see example below:

def abs(x):
    """Assumes x is an int
    Returns absolute value of x"""
    if x < -1:
        return -x
    else:
        return x

I The test suite {2,-2} is path-complete


I It does not reveal the erroneous output: abs(-1) = -1
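
This can be checked directly. In the sketch below the faulty function is renamed buggy_abs (a hypothetical name) so that it does not hide Python's built-in abs.

```python
def buggy_abs(x):
    """Faulty absolute value: the test should be x < 0, not x < -1."""
    if x < -1:
        return -x
    else:
        return x


# Path-complete suite: 2 takes the else branch, -2 takes the if branch
path_complete_suite = {2: 2, -2: 2}
passed = all(buggy_abs(x) == expected
             for x, expected in path_complete_suite.items())
print('path-complete suite passed:', passed)

# The error only shows up for an input outside the suite
print('buggy_abs(-1) =', buggy_abs(-1))
```

Both paths are exercised and both tests pass, yet the function is still incorrect for the input -1.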
Guidelines

I As we have seen it is not easy to design good test suites


I There are however a few rules of thumb that are worth
following
I These rules are given on the following slides

If-Statements and Except Clauses

I Both branches of an if-statement should be exercised


I All except clauses should be executed

For-Loops and While-Loops

I For each loop we should have test cases in which:
I the loop is not entered (e.g., when iterating over an empty list)
I the body of the loop is executed exactly once
I the body of the loop is executed more than once
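
As an illustration, here are the three kinds of test cases for a simple summing loop. The function total and the list loop_cases are hypothetical examples, not taken from the course material.

```python
def total(numbers):
    """Return the sum of the numbers in the list."""
    result = 0
    for x in numbers:
        result += x
    return result


# (input, expected output) pairs, one per guideline
loop_cases = [
    ([], 0),         # loop not entered
    ([5], 5),        # body executed exactly once
    ([1, 2, 3], 6),  # body executed more than once
]

for numbers, expected in loop_cases:
    print(numbers, total(numbers) == expected)
```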

Recursive Functions

I For testing recursive functions include test cases that lead to:
I no recursive call
I exactly one recursive call
I more than one recursive call

Unit Testing vs Integration Testing

I Testing is usually done at two levels:


I Unit testing tests individual units of code (e.g., functions or
modules)
I Integration testing tests whether the program as a whole
behaves as expected

Unit Testing

I Unit Testing is typically done before integration testing


I There are several unit testing frameworks that facilitate
writing and executing tests
I unittest is a unit testing framework that is part of the Python
standard library
I doctest is another unit testing framework within the Python
standard library
I Because doctest is rather lightweight and does not require
many object-oriented concepts, we will cover it here

Example

I Recall the function for determining whether a number is prime


def isPrime(x):
    if x<2:  # 0 and 1 are not prime
        return False
    for i in range(2,x):
        if x%i == 0:
            return False
    return True
I Assume that this function is contained in a module named
prime

Example

I To test the isPrime function, we first import the function as
follows:
>>> from prime import isPrime

I We can then test the isPrime function in the shell as follows:
>>> isPrime(7)
True

I Note that the Python statement that calls the function is
followed by the expected output and a blank line

Example

I Let us save the text on the previous slide into a text file
"testPrime.txt"
I DEMO
I We can now use doctest to read this file as follows (in the
shell):
>>> import doctest
>>> doctest.testfile("testPrime.txt")
I Run this Python code → DEMO
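
For experimenting without creating files, the same kind of check can be run on a string held in memory. The sketch below uses doctest.DocTestParser and doctest.DocTestRunner from the standard library instead of testfile; the names TEST_TEXT, is_prime and results, as well as the example text itself, are assumptions made for this illustration.

```python
import doctest


def is_prime(x):
    if x < 2:
        return False
    for i in range(2, x):
        if x % i == 0:
            return False
    return True


# Plays the role of the text file: prose mixed with interactive examples
TEST_TEXT = """
The function should recognise small primes.

>>> is_prime(7)
True
>>> is_prime(2)
True

It should reject 1 and composite numbers.

>>> is_prime(1)
False
>>> is_prime(9)
False
"""

parser = doctest.DocTestParser()
# globs supplies the names that the examples may use
test = parser.get_doctest(TEST_TEXT, {'is_prime': is_prime},
                          'testPrime', None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, 'failed out of', results.attempted)
```

The explanatory sentences between the examples are ignored by the parser, which is what makes it possible to document the test cases in the same text.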

Result

I When running the above testfile command, we get an output
such as:
TestResults(failed=0, attempted=3)

I or an output such as:
File "/Users/pierre/Documents/Temp/testPrime.txt", line 5, in testPrime.txt
Failed example:
    isPrime(7)
Expected:
    False
Got:
    True
**********************************************************************
1 items had failures:
   1 of   2 in testPrime.txt
***Test Failed*** 1 failures.
TestResults(failed=1, attempted=2)

37 / 44
Testing

Testing
Explaining the Result

I To understand the result, we need to understand what the
testfile function does
I This function scans the file and searches for interactive
examples
I An interactive example consists of:
I a statement as entered in the shell, preceded by ">>>", e.g.,
>>> from prime import isPrime
I possibly followed by expected output on the next line(s)
I expected output is terminated by the next line that consists
only of whitespace (or by the next ">>>" line)

38 / 44
Testing

Testing
Explaining the Result (2)

I Thus the following three lines:

>>> isPrime(7)
True

I would be interpreted as follows:
I isPrime(7) is an interactive example
I the expected output is "True", which is terminated by a
whitespace-only line

39 / 44
Testing

Testing
The testfile Function

I The testfile function of module doctest
I searches for all interactive examples in the file given as argument
I verifies that the output is the expected one (if given)
I also reports any exceptions encountered
I We can thus write a series of test cases for a function or a
module, and at the same time document/explain these test
cases by embedding them in a text file
I DEMO
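
The workflow described above can be sketched in a single script. This is a minimal sketch under stated assumptions: the test file is written to a temporary folder instead of being saved by hand, its contents are made up for illustration, and isPrime is defined inline and handed to doctest through the globs parameter instead of being imported from the module prime.

```python
import doctest
import os
import tempfile

def isPrime(x):
    """Return True if the integer x is prime (stand-in for prime.isPrime)."""
    if x < 2:
        return False
    for i in range(2, x):
        if x % i == 0:
            return False
    return True

# A text file mixing explanations and interactive examples (hypothetical content)
test_text = """\
Seven is prime:

>>> isPrime(7)
True

Nine is divisible by three:

>>> isPrime(9)
False
"""

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "testPrime.txt")
    with open(path, "w") as f:
        f.write(test_text)
    # module_relative=False: path is an ordinary file system path
    # globs: names made available to the examples in the file
    results = doctest.testfile(path, module_relative=False,
                               globs={"isPrime": isPrime})

print(results.failed, results.attempted)
```

The explanatory sentences in the file are ignored by doctest; only the lines starting with >>> and the expected output that follows them are checked.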

40 / 44
Testing

Testing
Regressions

I Suppose that successive releases of a piece of software have integer
version numbers
I Suppose version i of the software has a set of errors E
I What can we say about the set of errors E’ of version i+1?
I Is it necessarily the case that E’ ⊂ E?
I No, new errors can be introduced
I Such errors are called regressions
I Regressions can be introduced:
I when adding new features
I by incorrect fixes of existing errors

41 / 44
Testing

Testing
Regression Testing

I Regression testing consists of checking that no regressions
have been introduced by changes to the software
I Unit testing frameworks such as doctest are well suited for
regression testing. Why?
I They allow us to specify a large number of test cases and to
have those test cases checked automatically every time changes
have been applied
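
To make the idea concrete, here is a hedged sketch in which the same stored test cases are run against two versions of a function. The two versions, the test cases, and the helper run_tests are all hypothetical; the sketch uses the DocTestParser and DocTestRunner classes of the doctest module rather than a test file.

```python
import doctest

# Test cases written once and kept alongside the code (hypothetical examples)
TESTS = """
>>> isPrime(2)
True
>>> isPrime(7)
True
>>> isPrime(9)
False
"""

def isPrime_v1(x):
    """Version i: tries every divisor from 2 to x-1."""
    if x < 2:
        return False
    for i in range(2, x):
        if x % i == 0:
            return False
    return True

def isPrime_v2(x):
    """Version i+1: a faulty 'speed-up' that only tries even divisors."""
    if x < 2:
        return False
    for i in range(2, x, 2):
        if x % i == 0:
            return False
    return True

def run_tests(candidate):
    """Run TESTS with isPrime bound to candidate; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(TESTS, {"isPrime": candidate},
                              "regression", "<tests>", 0)
    runner = doctest.DocTestRunner(verbose=False)
    result = runner.run(test, out=lambda text: None)   # silence the report
    return result.failed, result.attempted

print(run_tests(isPrime_v1))
print(run_tests(isPrime_v2))
```

Rerunning the unchanged test cases after the change reveals that version i+1 now answers incorrectly for 9, an error that version i did not have.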

42 / 44
Testing

Testing
Integration Testing

I The above slides are mainly concerned with testing individual
units of code (e.g., functions)
I The test of the program as a whole is called integration testing
I Integration testing is much harder than unit testing. Why?
I This is because it is much harder to define the expected
behavior of a system than the behavior of small subunits

Example
Compare the problem of specifying the behavior of a word
processor to that of a function that counts the number of
occurrences of a word in a document.

43 / 44
Testing

Testing
Software Quality Assurance

I Many large software development companies have a software
quality assurance (SQA) group
I its mission is to ensure that released software is of
sufficient quality
I Often
I the development group (responsible for developing the
software) is in charge of unit testing
I the SQA group is in charge of integration testing

44 / 44
Debugging

Programming Fundamentals 1
Lesson 8

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 43
Debugging

Outline

Debugging

2 / 43
Debugging

Debugging
Software Bugs

Definition [1]
A software bug is an error, flaw, failure or fault in a computer
program or system that causes it to produce an incorrect or
unexpected result, or to behave in unintended ways.

[1] https://en.wikipedia.org/wiki/Software_bug
3 / 43
Debugging

Debugging
Origin of Bugs

I The term "bug" has been used in an engineering context since
at least the 19th century.

Quote (Thomas Edison, 1878)
It has been just so in all of my inventions. The first step is an
intuition, and comes with a burst, then difficulties arise – this thing
gives out and [it is] then that "Bugs" – as such little faults and
difficulties are called – show themselves and months of intense
watching, study and labor are requisite before commercial success
or failure is certainly reached.

4 / 43
Debugging

Debugging
Origin of Software Bugs

I The use of the term "bug" in the context of programming is
more recent.

[Figure: A page from the Harvard Mark II electromechanical computer's log,
featuring a dead moth that was removed from the device (1947)]

5 / 43
Debugging

Debugging
Types of Software Bugs: Overt vs Covert

I Software bugs are due to programming errors
I We can classify them in different ways:
I Overt vs covert bugs: An overt bug is one that has an obvious
manifestation, e.g., an exception is raised, or the program
takes much longer than it should. A covert bug is one that has
no obvious manifestation
I Many bugs fall in between those extremes. Why?
I Their classification depends on how much effort we invest in
analyzing the behavior of the program.
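
The distinction can be illustrated with a small sketch. Both functions below are hypothetical and deliberately buggy; they are assumed only for this example.

```python
def mean_overt(values):
    """Overt bug: raises ZeroDivisionError for an empty list."""
    return sum(values) / len(values)

def mean_covert(values):
    """Covert bug: silently divides by the wrong count."""
    return sum(values) / (len(values) + 1)    # off by one, no error raised

try:
    mean_overt([])
except ZeroDivisionError as error:
    print("overt bug shows itself:", error)

# Looks plausible, but the correct mean of 2, 4, 6 is 4.0
print(mean_covert([2, 4, 6]))
```

The first bug announces itself through an exception; the second produces a reasonable-looking number and is only noticed by someone who checks the result.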

6 / 43
Debugging

Debugging
Types of Software Bugs: Persistent vs Intermittent

I We can also classify software bugs according to the frequency
of their manifestation:
I Persistent vs intermittent: A persistent bug is one that occurs
every time the program is run with the same inputs. An
intermittent bug is one that occurs only some of the time
(even with the same inputs)
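
A hedged sketch of an intermittent bug: the hypothetical function below is meant to return two different items, but because it draws twice at random it fails only on some calls, even though the input never changes. The fixed seed is an assumption made solely to keep the sketch repeatable.

```python
import random

def pick_two_distinct(items):
    """Intended to return two different items (hypothetical buggy helper)."""
    first = random.choice(items)
    second = random.choice(items)    # bug: may pick the same item again
    return first, second

random.seed(0)                       # fixed seed, only for repeatability
outcomes = [pick_two_distinct(['a', 'b', 'c']) for _ in range(200)]
failures = sum(1 for first, second in outcomes if first == second)
print(failures, 'of', len(outcomes), 'calls returned a duplicate')
```

A single test run may well succeed, which is why such bugs are harder to detect than persistent ones.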

7 / 43
Debugging

Debugging
Software Bugs: The Good, the Bad and the Ugly

I The best bugs to have are those that are overt and persistent
since they are easiest to discover
I Next are those that are overt but intermittent; these are
harder to detect and are more likely to occur in production
software
I Worst are those that are covert
I possibly only discovered after many ”wrong” answers have
been produced
I Can you think of a possible scenario with bad
consequences?
I E.g., software that evaluates the credit worthiness of a bank
customer

8 / 43
Debugging

Debugging
What is Debugging?

I We have already seen a technique to uncover bugs. What is it?
I We can use testing techniques to uncover software bugs
I Once we detect a bug, we need to locate and correct the bug
I the process of finding and resolving software bugs is called
debugging
I we will focus on the search part of the problem since it is
usually the hardest

9 / 43
Debugging

Debugging
About Debugging

I Debugging is one of the most creative and challenging aspects
of programming
I Often problems are more psychological than technical
I You must learn to think in a different way

10 / 43
Debugging

Debugging
Programming and Role Playing

Question
What does programming have in common with a role-playing
game?

11 / 43
Debugging

Debugging
Programming and Role Playing

I When programming, you need to assume different roles:
I When designing a program, you need to act like an architect
I When coding, you need to act like an engineer
I When testing, you need to act like a vandal
I When debugging, you need to act like a detective
I When you change roles, you need to adopt different strategies
and goals
I Often not easy to switch perspectives

12 / 43
Debugging

Debugging
10 Truths about Debugging

I The following slides summarize some lessons learnt about
debugging
I They are attributed to Nick Parlante (Stanford University) [2]

[2] http://cs.stanford.edu/people/nick/compdocs/Debugging.pdf
13 / 43
Debugging

Debugging
Truth 1: Intuition vs Facts

Truth 1
Intuition and hunches are great - you just have to test them out.
When a hunch and a fact collide, the fact wins.

14 / 43
Debugging

Debugging
Truth 2: Simplicity vs Complexity

Truth 2
Don't look for complex explanations. Even the simplest omission or
typo can lead to very weird behavior. Don't just sweep your eye
over that series of simple statements assuming that they are too
simple to be wrong.

15 / 43
Debugging

Debugging
Truth 3: Importance of Watching

Truth 3
The clue to what is wrong in your code is in the values of your
variables and the flow of control. Try to see what the facts are
pointing to. The computer is not trying to mislead you. Work
from the facts.

16 / 43
Debugging

Debugging
Truth 4: Test as You Go

Truth 4
If your code was working a minute ago, but now it doesn't – what
was the last thing you changed? This incredibly reliable rule of
thumb is the reason your section leader told you to test your code
as you go rather than all at once.

17 / 43
Debugging

Debugging
Truth 5: Be Prudent

Truth 5
Do not change your code haphazardly trying to track down a bug.
This is sort of like a scientist who changes more than one variable
in an experiment at a time. It makes the observed behavior much
more difficult to interpret, and you tend to introduce new bugs.

18 / 43
Debugging

Debugging
Truth 6: Related Bugs

Truth 6
If you find some wrong code that does not seem to be related to
the bug you were tracking, fix the wrong code anyway. Many times
the wrong code was related to or obscured the bug in a way you
had not imagined.

19 / 43
Debugging

Debugging
Truth 7: Explain your Reasoning

Truth 7
You should be able to explain in Sherlock Holmes style the series
of facts, tests, and deductions that led you to find a bug.
Alternately, if you have a bug but can't pinpoint it, then you should
be able to give an argument to a critical third party detailing why
each one of your functions cannot contain the bug. One of these
arguments will contain a flaw since one of your functions does in
fact contain a bug. Trying to construct the arguments may help
you to see the flaw.

20 / 43
Debugging

Debugging
Truth 8: Question Yourself

Truth 8
Be critical of your beliefs about your code. It's almost impossible to
see a bug in a function when your instinct is that the function is
innocent. Only when the facts have proven without question that
the function is not the source of the problem should you assume it
to be correct.

21 / 43
Debugging

Debugging
Truth 9: Guess Anyway

Truth 9
Although you need to be systematic, there is still an enormous
amount of room for beliefs, hunches, guesses, etc. Use your
intuition about where the bug probably is to direct the order that
you check things in your systematic search. Check the functions
you suspect the most first. Good instincts will come with
experience.

22 / 43
Debugging

Debugging
Truth 10: Take a Break

Truth 10
Debugging code is more mentally demanding than writing code.
The longer you try to track down a bug without success, the less
perspective you tend to have. Realize when you have lost the
perspective on your code to debug. Take a break. Get some sleep.
You cannot debug when you are not seeing things clearly. Many
times a programmer can spend hours late at night hunting for a
bug only to finally give up at 4:00 A.M. The next day, they find the
bug in 10 minutes.

23 / 43
Debugging

Debugging
Getting Down to Business

I We will now discuss the debugging process


I A very simple way to debug does not use any special tool but
makes use of print-statements at various places in the
program
I What are the drawbacks of this approach?
I pollutes the main code
I may result in much unnecessary information being output

24 / 43
Debugging

Debugging
A Buggy Program
def isPal(x):
    """Assumes x is a list
       Returns True if the list is a palindrome; False otherwise"""
    temp = x
    temp.reverse
    if temp == x:
        return True
    else:
        return False

# main program
n = int(input('Enter number n:'))
for i in range(n):
    result = []
    elem = input('Enter element: ')
    result.append(elem)
if isPal(result):
    print('Yes')
else:
    print('No')

25 / 43
Debugging

Debugging
Testing the Buggy Program

I The program on the previous page is supposed to test whether
a sequence of strings entered by the user reads the same
forward and backward
I For example:
I When the user enters the number 3 and then the strings
"Pierre", "Paul", "Pierre", the program is supposed to print
"Yes"
I DEMO: shows that the program works on this input
I Test a smaller example: n=2, strings: "a", "b"
I DEMO: reveals that the program incorrectly answers "Yes"

26 / 43
Debugging

Debugging
Setting Breakpoints

I A key feature of debugging tools is the ability to set breakpoints
I Breakpoints allow us to interrupt the program at a predefined
location
I What is this good for?
I This helps determine where the program starts to behave
incorrectly
I In PyCharm, breakpoints can be set or unset by clicking in the
margin
I DEMO

28 / 43
Debugging

Debugging
Debug Mode

I A program can be executed either in
I run mode, or
I debug mode
I Breakpoints will only work if the program is being executed in
debug mode
I In PyCharm: to execute a program in debug mode, right-click
on the Python file and choose "Debug" (or choose Debug in
the Run menu)
I DEMO

29 / 43
Debugging

Debugging
Where to Break

I Finding the error can be viewed as a search problem. How?
I We need to find the line(s) that contain the error

Where to look?

30 / 43
Debugging

Debugging
Binary Code Search

I A common strategy is to look roughly half-way down the
program
I The goal is to quickly reduce the set of candidate lines to
inspect
I Applying this strategy to the example, we would set the
breakpoint right after the for-loop

31 / 43
Debugging

Debugging
Looking around
I What are we going to do when the program pauses at the
breakpoint?
I We inspect the value of variables to check for abnormalities
I PyCharm provides a variable window which shows the values of
relevant variables
I DEMO
I What variable value is most relevant?
I The result list that will be passed as argument to the isPal
function

32 / 43
Debugging

Debugging
What to do next

I What abnormality do we notice?
I The result list does not have the correct value:
I Expected value: ['a', 'b']
I Actual value: ['b']
I Since the list is constructed in the for loop we decide to
I set a breakpoint at the start of the for-loop
I execute the loop step-by-step

33 / 43
Debugging

Debugging
Stepwise execution

I After the program halts at a breakpoint, PyCharm's debugger
allows us to:
I step over a code line
I step into function code
I step out of function code
I We are going to use the ”step over” functionality to execute
the for-loop, one line at a time
I While we do this, we watch the value of the result list

34 / 43
Debugging

Debugging
Restarting

I Before we examine the for loop we must


I set the new breakpoint
I stop the current debugging session (click on red square)
I restart a debugging session
I DEMO

Question
So why does result not ”behave” correctly?

Answer
Because result is reinitialized in every iteration. How do we fix this?
By initializing result once before the loop.

35 / 43
Debugging

Debugging
Fixing the Error

I Let us fix this error


I The resulting code is shown below (only main program is
shown)

# main program
n = int(input('Enter number n:'))
result = []  # initialization has been moved here
for i in range(n):
    elem = input('Enter element: ')
    result.append(elem)
if isPal(result):
    print('Yes')
else:
    print('No')

So we are done, right?


36 / 43
Debugging

Debugging
Retesting

I We need to retest the code
I We check whether the input that failed before is now handled correctly
I DEMO
I Verdict: The code still does not work
I the result list is correctly computed
I but the isPal function returns an incorrect result

37 / 43
Debugging

Debugging
Testing the Function

I How shall we proceed?
I We step into the function execution
I equivalently we could set a breakpoint in the function
I DEMO
I So what is the problem?
I We suspect that it has to do with the reverse method
I Check the documentation: list.reverse
I We forgot the parentheses!!!
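
Why does the missing pair of parentheses raise no error? A minimal sketch (the list letters is assumed only for illustration): without parentheses, the expression merely looks up the method and the result is discarded.

```python
letters = ['a', 'b', 'c']

letters.reverse              # only looks the method up; nothing is called
print(letters)               # the list is unchanged

method = letters.reverse     # a bound method object that could be called later
print(callable(method))

outcome = letters.reverse()  # the parentheses perform the call
print(letters, outcome)      # reversed in place; the call itself returns None
```

Because a bare method reference is a valid expression, Python accepts it silently, which is what makes this a covert bug.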

38 / 43
Debugging

Debugging
One More Fix

I Let’s write the call to the reverse method correctly
I The resulting code of function isPal is shown below

def isPal(x):
    """Assumes x is a list
       Returns True if the list is a palindrome; False otherwise"""
    temp = x
    temp.reverse()  # before: temp.reverse
    if temp == x:
        return True
    else:
        return False

I Are we done?
No: the output is still wrong. Why?

39 / 43
Debugging

Debugging
Final Fix
I x and temp are bound to the same list object
I when we reverse the list bound to temp, x is reversed as well
I we need to make a copy of the list bound to x, and then reverse
that copy

def isPal(x):
    """Assumes x is a list
       Returns True if the list is a palindrome; False otherwise"""
    temp = x.copy()  # performs a shallow copy of list x
    temp.reverse()   # before: temp.reverse
    if temp == x:
        return True
    else:
        return False

This version passes the test.
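
The aliasing effect can be observed directly. The following is a minimal sketch; the names alias, clone and is_pal are hypothetical, and is_pal is just one possible compact alternative using a reversed slice, not the version developed in the lecture.

```python
x = ['a', 'b']

alias = x             # binds a second name to the same list object
clone = x.copy()      # creates a new list containing the same items

print(alias is x)     # True: one object, two names
print(clone is x)     # False: a separate object

alias.reverse()       # reverses the shared list, so x changes too
print(x, clone)       # ['b', 'a'] ['a', 'b']

def is_pal(items):
    """Compare the list with a reversed copy obtained by slicing."""
    return items == items[::-1]

print(is_pal(['Pierre', 'Paul', 'Pierre']), is_pal(['a', 'b']))
```

The slice items[::-1] always builds a new list, so this variant never modifies its argument.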


40 / 43
Debugging

Debugging
Stack Trace

I We review a few more debugging features of PyCharm
I When the program halts at a breakpoint the stack trace is
shown at the bottom left
I We can inspect individual frames in the stack trace (we called
these activation records in lesson 4)
I We can even inspect variables at each level
I DEMO

41 / 43
Debugging

Debugging
Watches

I We can watch the value of certain expressions using watches
I We can add an expression to watch either to the variable
window or to a separate window
I Every time the program stops we can easily view the values of
those expressions
I DEMO

42 / 43
Debugging

Debugging
Conditional Breakpoints

I The breakpoints we set so far are always ”on”


I It is possible to set conditional breakpoints
I right-click on the breakpoint and write a boolean expression
I such a breakpoint will only halt the program when the
expression is True
I When could conditional breakpoints be useful?
I For instance inside a loop in which a certain expression should
not become negative

43 / 43
Iterators Generators

Programming Fundamentals 1
Lesson 9

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 46
Iterators Generators

Outline

Iterators

Generators

2 / 46
Iterators Generators

Iterators
For-Loops

I Recall from lesson 2 the basic syntax of for-loops:

for variable in expression:
    statement(s)

I When we introduced for-loops, we said that expression can
be any sequence
I While this is correct, it turns out that the for-loop works for a
more general class of expressions

3 / 46
Iterators Generators

Iterators
From Expressions to Iterators

I For-loops allow variables to iterate over expressions


I What kind of expressions can we iterate over?
I Let’s think in terms of contracts
I What would be a simple contract that expressions would need
to fulfill so we can iterate over them?
I The expression should accept next-requests to return the next
element

4 / 46
Iterators Generators

Iterators
From Expressions to Iterators (2)

I The idea that the key to iterating over an expression is to
respect a contract based on next-requests is also taken up by
Python
I There is however one important difference with the approach
outlined on the previous slide:
I It is not the expression itself that is responsible for iterating
but a separate object called an iterator
I It is the iterator that fulfills the contract based on
next-requests
I We call an expression that has an iterator an iterable

5 / 46
Iterators Generators

Iterators
From Expressions to Iterators (3)

[Figure: Iterating in Python. An ITERATOR iterates over an EXPRESSION
(which must be iterable) using next-requests]

6 / 46
Iterators Generators

Iterators
Iterables

I Iterable expressions generalize sequences


I every sequence (e.g., string, list, tuple) is an iterable
I In Python terms:
I An iterable is any object that is a valid argument of the built-in
function iter, which returns an iterator for this expression

7 / 46
Iterators Generators

Iterators
The iter-function

I In its simplest form the iter-function has a single argument:
iter(iterable) -> iterator
I it takes one argument that is an iterable and returns an iterator
I Let’s test this function on a sequence
I DEMO
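
A short sketch of what such a test might look like (the variable names are assumptions; the type name shown is the one used by the standard CPython interpreter):

```python
letters = ['a', 'b']
it = iter(letters)            # a list is iterable, so this succeeds
print(type(it).__name__)      # list_iterator in CPython

try:
    iter(42)                  # an int is not iterable
except TypeError as error:
    print('not iterable:', error)
```

Calling iter on an object that is not iterable raises a TypeError, which gives a simple way to check whether an expression is iterable.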

8 / 46
Iterators Generators

Iterators
Iterators

I As said above, an iterator takes care of iterating over the
expression
I In Python terms:
I An iterator is an object that is a valid argument of the built-in
next-function
I This is how it fulfills the contract of responding to
next-requests (see earlier slide)
I In its simplest form next(iterator) returns the next item
from the iterator
I if there is no next item, a StopIteration exception is raised
I In other words: an iterator is an object that returns items one
at a time (using the next-function)
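
The behavior of next can be sketched as follows. The string 'ab' is chosen only for illustration; the last line uses the optional second argument of next, which supplies a default value instead of raising StopIteration.

```python
it = iter('ab')               # strings are sequences, hence iterable

first = next(it)
second = next(it)
print(first, second)          # a b

try:
    next(it)                  # no item left
except StopIteration:
    print('exhausted')

fallback = next(it, 'no more items')   # default returned instead of raising
print(fallback)
```

Once an iterator is exhausted it stays exhausted; a fresh call to iter is needed to go over the items again.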

9 / 46
Iterators Generators

Iterators
Example
I So how do we iterate over an iterable expression?
I obtain an iterator from the expression
I keep invoking the next() function with the iterator as
argument to obtain the next item

Example
L = [2, 5, 7]
i = iter(L)
while True:
    x = next(i)
    print('Item: ', x)

I DEMO
I this code raises a StopIteration exception after the last item
has been printed

10 / 46
Iterators Generators

Iterators
Rewriting for-loops

I The traditional for-loop

for variable in expression:
    statement(s)

is thus equivalent to (but more concise than):

itr = iter(expression)
try:
    while True:
        variable = next(itr)
        statement(s)
except StopIteration:
    pass  # does nothing
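
The equivalence can be checked on concrete inputs. In this hedged sketch, the two helper functions are hypothetical: one uses an ordinary for-loop, the other the rewriting with iter and next, and both simply collect the items they visit.

```python
def collect_with_for(iterable):
    """Gather the items using an ordinary for-loop."""
    items = []
    for variable in iterable:
        items.append(variable)
    return items

def collect_with_next(iterable):
    """Gather the items using iter and next, as in the rewriting above."""
    items = []
    itr = iter(iterable)
    try:
        while True:
            variable = next(itr)
            items.append(variable)
    except StopIteration:
        pass  # the iterator is exhausted: leave the loop
    return items

for sample in ([2, 5, 7], 'abc', range(4), ()):
    assert collect_with_for(sample) == collect_with_next(sample)
print('both versions visit the same items')
```

Both versions visit the same items in the same order for lists, strings, ranges and the empty tuple.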

11 / 46
Iterators Generators

Iterators
The range function

I We have used the range-function before
I Here is an example use in a for-loop:

for i in range(10):
    if i % 3 == 0:
        print(i)

I What is the output?

0
3
6
9

12 / 46
Iterators Generators

Iterators
What is range?

I Even though range is listed under built-in functions, it is
actually a type (class)
I Executing range(10) constructs an object of type range that
represents the sequence of numbers 0..9
I it does not actually construct a list
I So why can we iterate over the range?
I Because range(n) is iterable

13 / 46
Iterators Generators

Iterators
range is iterable

I How can we check that range(10) is iterable


I By executing: iter(range(10))
I DEMO
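
One possible version of this demo is sketched below. The comparison with the integer 42 is an added illustration (an assumption, not from the slides) of what happens for a non-iterable object.

```python
r = range(10)
it = iter(r)            # succeeds, so range(10) is iterable
first = next(it)        # 0
second = next(it)       # 1

try:
    iter(42)            # an int is not iterable: TypeError is raised
    int_is_iterable = True
except TypeError:
    int_is_iterable = False

print(type(r).__name__, first, second, int_is_iterable)
```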

14 / 46
Iterators Generators

Iterators
Iterables vs explicit lists
I So what is the advantage of implementing for-loops using
iterables
I do not need to keep all the items in memory at once
I thus much more memory-efficient (and possibly faster)
I The following example illustrates this:

N = 10**12
for i in range(N):
    if i >= 10:
        break
    print(i, end=', ')

I DEMO
I This code executes instantly because only 11 items are
constructed explicitly (numbers 0..10)
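
The memory argument can also be checked directly. This is a rough sketch assuming CPython, where sys.getsizeof reports the size of the object itself; N is kept small here so that the explicit list still fits in memory.

```python
import sys

N = 10**6                       # kept small so the list fits in memory
lazy = range(N)                 # stores only start, stop and step
explicit = list(range(N))       # stores a reference to every item

print(sys.getsizeof(lazy))      # a small constant, independent of N
print(sys.getsizeof(explicit))  # grows with N
```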
15 / 46
Iterators Generators

Iterators
Iterables as lists

I Note that iterables "behave like" a sequence without the
sequence necessarily being present in memory
I If we want to explicitly construct the list of items
corresponding to an iterable, how would we do it?
I Using the syntax: list(iterable)
I e.g., list(range(3)) returns the list:[0, 1, 2]
I This syntax is consistent with the general type conversion
syntax of Python. (Why?)

16 / 46
Iterators Generators

Iterators
The enumerate function

I In the for-loop shown above the variable iterates over the


values of the sequence (or other iterable)
I Sometimes it is useful to also have the index of an item
I e.g., if we want to print positions of certain items
I The built-in function enumerate returns a special iterator
I the iterator returns tuples that contain the index as well as a
value
I Check documentation for this function using :
help(enumerate)
I DEMO

17 / 46
Iterators Generators

Iterators
The enumerate Function: Example

seasons =['Spring', 'Summer', 'Fall', 'Winter']


list(enumerate(seasons))

I Output:
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]

I The following for-loop prints the position of seasons starting


with ’S’:

seasons =['Spring', 'Summer', 'Fall', 'Winter']


for i, v in enumerate(seasons): # note the special syntax
    if v.startswith('S'):
        print(i)

18 / 46
Iterators Generators

Iterators
The map function

I The built-in function map returns an iterator that applies a


function to items of iterables
I In its simplest form it applies a function of a single argument
to items of an iterable

# create list of squares of numbers from 0 to 9


list(map(lambda x: x**2, range(10)))
Note that we can express this also using list comprehension:
[x**2 for x in range(10)]

19 / 46
Iterators Generators

Iterators
The map function: general case

I In a more general form, map may take several iterables as


arguments
I The function must then have the same number of arguments
I The iterator returns as ith item the function applied to the ith
item of each iterable
I The iterator stops when the ”shortest” iterable is finished
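
A small sketch of the general case, with two made-up lists of different lengths to show where the iterator stops:

```python
a = [1, 2, 3]
b = [10, 20]                      # shorter iterable: map stops after two items
sums = list(map(lambda x, y: x + y, a, b))
print(sums)                       # [11, 22]
```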

20 / 46
Iterators Generators

Iterators
Example for the General Case

I Recall that the inner product of two vectors $a = [a_1, \dots, a_n]$
and $b = [b_1, \dots, b_n]$ is defined as:
$$a \cdot b = \sum_{i=1}^{n} a_i b_i$$
I We can compute the inner product of two vectors represented
by tuples (or lists) in Python as follows:

a = (1, 5, 8)
b = (2, 6, 3)
sum(map(lambda x,y: x*y, a, b))

21 / 46
Iterators Generators

Iterators
The filter Function

I The built-in function filter takes as arguments a function and
an iterable (just like map) and returns an iterator yielding only
those items for which the function evaluates to true

Example
filter(lambda x: x%2==1, range(10))

Output: an iterator over odd numbers between 0 and 9


DEMO
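
A possible version of this demo: printing the filter object shows only the iterator itself, so the sketch converts it to a list to see the items.

```python
odd = filter(lambda x: x % 2 == 1, range(10))
print(odd)            # a filter object, not the numbers themselves
odd_list = list(odd)  # consume the iterator to obtain the items
print(odd_list)       # [1, 3, 5, 7, 9]
```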

22 / 46
Iterators Generators

Iterators
The itertools Module

I The itertools-module contains a number of useful iterators


I We will review some of them on the following slides

23 / 46
Iterators Generators

Iterators
Infinite Iterators

I All the iterators we have seen so far are finite


I they produce only a finite number of items
I Does it make sense to think of infinite iterators?
I Yes, for instance when looking for an integer value with a
certain property

24 / 46
Iterators Generators

Iterators
The count Iterator

I The count function produces an iterator over evenly spaced


values starting with a given value
I Its general form is:
itertools.count(start=0, step =1)
I Both arguments have default values and are thus optional
I The following simple iterator returns even numbers starting at
2:
itertools.count(start=2, step=2)

25 / 46
Iterators Generators

Iterators
The count Iterator: Example

I A more involved example produces data points of a function
I The following iterator yields the values of the function x**2
at the odd arguments 1, 3, 5, ...
from itertools import count
map(lambda x: x**2, count(start=1, step=2))
I DEMO
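
Since this iterator never terminates, calling list() on it would not return. One way to look at its first items, sketched here with itertools.islice (a function not mentioned on the slides), is:

```python
from itertools import count, islice

squares_of_odds = map(lambda x: x**2, count(start=1, step=2))
first_five = list(islice(squares_of_odds, 5))   # take only five items
print(first_five)     # [1, 9, 25, 49, 81]
```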

26 / 46
Iterators Generators

Iterators
Combinatoric Iterators

I Combinatoric iterators allow us to iterate over various types of
combinations of items of another iterable
I A well-known class of combinations are permutations
I itertools.permutations(iterable) returns all permutations of an
iterable

Example
>>> from itertools import permutations
>>> a= permutations(range(4))

I DEMO
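
A possible version of this demo, using range(3) instead of range(4) to keep the output short:

```python
from itertools import permutations

perms = list(permutations(range(3)))   # each permutation is a tuple
print(len(perms))     # 6, i.e. 3!
print(perms[0])       # (0, 1, 2)
```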

27 / 46
Iterators Generators

Generators
Definition

I A generator is a function whose body contains the special


keyword yield
I Generators are related to iterators as follows:
I when calling a generator, the function body does not execute
but a special iterator (object) is returned, called a generator
object
I the generator object has a state composed of
I the function body
I the local variables and parameters
I the current point of execution (initially the start of the
function)

28 / 46
Iterators Generators

Generators
Definition(2)

I when you call next on a generator object, the function body


executes from the current point to the next yield, which is of
the form:
yield expression
I the value of this call of next is the value of the expression
following yield
I when you call next again on a generator object, function
execution resumes again up to the next yield expression
I when the function body ends or a return statement is
executed, an exception of type StopIteration is raised
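
The behaviour described on the last two slides can be traced with a tiny generator. The name two_items and the print calls are assumptions made for illustration only.

```python
def two_items():
    print('started')
    yield 'a'
    print('resumed')
    yield 'b'

g = two_items()        # nothing is printed yet: the body has not started
first = next(g)        # prints 'started', the value of the call is 'a'
second = next(g)       # prints 'resumed', the value of the call is 'b'
try:
    next(g)            # the body ends, so StopIteration is raised
    finished = False
except StopIteration:
    finished = True
```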

29 / 46
Iterators Generators

Generators
General Form of range

I Before we can introduce an example for generators, we


introduce the general form of the range function:
range(start, stop, step)
I This iterator works as follows:
I for a positive step the iterator produces the set of values

{start + i ∗ step | i ≥ 0 and start + i ∗ step < stop}


I for a negative step the iterator produces the set of values

{start + i ∗ step | i ≥ 0 and start + i ∗ step > stop}

30 / 46
Iterators Generators

Generators
General Form of range: Example

I list(range(0, 30, 5)) produces:


[0, 5, 10, 15, 20, 25]
I list(range(0, -10, -1)) produces:
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
I list(range(0, -10, 1)) produces:
[]

31 / 46
Iterators Generators

Generators
From Generators to Iterators

I We may view generators as a convenient way to build iterators


I For example, suppose we want to build an iterator that counts
from 1 to n and then down to 1 again
I The following generator code does just that:
def updown(n):
    for i in range(1, n):
        yield i
    for i in range(n, 0, -1):
        yield i
I DEMO

32 / 46
Iterators Generators

Generators
Calling the Generator

I As explained above calling updown(n) will return an iterator


I We can use it as follows (e.g.)

for i in updown(3):
    print(i)

I Output:
1
2
3
2
1

33 / 46
Iterators Generators

Generators
Iterators are Iterables!

I We explained earlier that for-loops allow to iterate over any


iterable, i.e., any object that produces an iterator using the
iter-function
I One may now argue that in the example
for i in updown(3):
    print(i)
updown(3) is not an iterable but an iterator
I So why does this work?
I because for any iterator object i, iter(i) returns i itself
I Thus each iterator is also an iterable
I DEMO
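
A possible version of this demo, reusing the updown generator from the earlier slide. The comparison with a list is an added illustration.

```python
def updown(n):
    for i in range(1, n):
        yield i
    for i in range(n, 0, -1):
        yield i

g = updown(3)
same_object = iter(g) is g             # True: an iterator is its own iterator

L = [2, 5, 7]
list_is_own_iterator = iter(L) is L    # False: a list is iterable, not an iterator
print(same_object, list_is_own_iterator)
```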

34 / 46
Iterators Generators

Generators
Generators as Functions
I We can simulate a generator by a function that returns a list
of items
I For the previous example we can write a function that returns
the list [1,..., n, n-1,..., 1]
I An example implementation is the following:

def updown2(n):
    l1 = list(range(1, n))
    l2 = list(range(n, 0, -1))
    return l1 + l2
The following statements will then produce the same output:
for i in updown2(3):
    print(i)
35 / 46
Iterators Generators

Generators
Lazy Evaluation

I So why introduce generators?


I Because the associated iterator only computes items when
needed, just in time
I The ”equivalent” function must compute all items beforehand
I very wasteful of memory
I possibly also wasteful of time (Why?)
I If we really need the full list of items, we can obtain it like this:
list(g) # where g is the generator object

36 / 46
Iterators Generators

Generators
Generator Expressions

I Python provides a convenient way to write simple generators


using generator expressions
I The syntax is similar to that of list comprehension but
parentheses are used instead of brackets
I Here is an example of a generator expression:
(n**2 for n in range(10))
I This expression evaluates to a generator object. DEMO
I The equivalent generator (function) is:
def gen():
    for n in range(10):
        yield n**2

37 / 46
Iterators Generators

Generators
Infinite Generator

I We have seen iterators that go on forever, e.g., the count()


iterator in module itertools
I Using generators it is easy to produce a similar iterator
def countgen():
    i = 0
    while True:
        yield i
        i += 1

38 / 46
Iterators Generators

Generators
Single Use vs Multiple Use
I Iterators (and therefore also generator objects) have an
internal state
I Once they have produced all items, they terminate and cannot
be used again
I Thus
a = (n for n in range(4))
for i in a:
    print(i, end=' ')
for i in a:
    print(i, end=' ')

will output:
0 1 2 3
I The numbers are printed only once since a is exhausted after
the first loop
39 / 46
Iterators Generators

Generators
Single Use vs Multiple Use (2)

I If we need to iterate multiple times over an iterable, we can


just generate the list
a = list((n for n in range(4)))
for i in a:
    print(i, end=' ')
for i in a:
    print(i, end=' ')

will output:
0 1 2 3 0 1 2 3

40 / 46
Iterators Generators

Generators
Single Use vs Multiple Use (2)

I Because iterators have state, we can stop and resume an


iterator
G = (n**2 for n in range(12))
for n in G:
    print(n, end=' ')
    if n > 30:
        break
print("\ndoing something in between")
for n in G:
    print(n, end=' ')

will output:
0 1 4 9 16 25 36
doing something in between
49 64 81 100 121

41 / 46
Iterators Generators

Generators
Application: Sieve of Eratosthenes

Definition [1]
In mathematics, the sieve of Eratosthenes is a simple, ancient
algorithm for finding all prime numbers up to any given limit.
It does so by iteratively marking as composite (i.e., not prime) the
multiples of each prime, starting with the first prime number, 2.
The multiples of a given prime are generated as a sequence of
numbers starting from that prime, with constant difference
between them that is equal to that prime.

[1] https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
42 / 46
Iterators Generators

Generators
Sieve of Eratosthenes: Animation

I Here is an animation for applying the sieve to get all prime


numbers below 121 click here

43 / 46
Iterators Generators

Generators
Functions of Iterators

I There are two useful built-in functions that take an iterable as


argument
I all(iterable) returns True if all items are true
I any(iterable) returns True if at least one item is true
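
A few illustrative calls (the sample values are made up). Note the results for an empty iterable, which follow from the definitions above.

```python
r1 = all([True, 1, 'x'])                  # every item is true
r2 = any([0, '', None])                   # no item is true
r3 = all(n % 2 == 0 for n in [2, 4, 7])   # 7 is odd
r4 = all([])                              # no item is false
r5 = any([])                              # no item is true
print(r1, r2, r3, r4, r5)                 # True False False True False
```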

44 / 46
Iterators Generators

Generators
A Generator for the Sieve

I The following code generates prime numbers up to N using a
generator (each candidate is tested for divisibility by the primes
found so far)

def gen_primes(N):
    """Generate primes up to N"""
    primes = set()
    for n in range(2, N+1):
        if all(n % p > 0 for p in primes):
            primes.add(n)
            yield n
print(list(gen_primes(100)))

45 / 46
Iterators Generators

Generators
A Generator for the Sieve

I Can we convert the previous generator into one generating all


primes?
I Yes, the following code shows how.

import itertools
def gen_primes():
    """Generate primes one at a time"""
    primes = set()
    for n in itertools.count(start=2): # default step is 1
        if all(n % p > 0 for p in primes):
            primes.add(n)
            yield n
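
Because this generator never terminates, list(gen_primes()) would run forever. A hedged usage sketch, again relying on itertools.islice (not introduced on the slides), takes only the first ten primes:

```python
import itertools

def gen_primes():
    """Generate primes one at a time (same code as on the slide)."""
    primes = set()
    for n in itertools.count(start=2):
        if all(n % p > 0 for p in primes):
            primes.add(n)
            yield n

first_ten = list(itertools.islice(gen_primes(), 10))
print(first_ten)   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```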

46 / 46
Floating-Point Numbers

Programming Fundamentals 1
Lesson 10

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 45
Floating-Point Numbers

Outline

Floating-Point Numbers

2 / 45
Floating-Point Numbers

Floating-Point Numbers
Numeric Types

I Recall from lesson 1 the numeric types in Python:


I Integer numbers (called int in Python)
I Floating-point numbers (called float in Python)
I Complex Numbers (called complex in Python)
I We now take a closer look at floating-point numbers

3 / 45
Floating-Point Numbers

Floating-Point Numbers
Type float

I There is no limit to the size of integers in Python


I Let us assume for now that floating-point numbers are
represented in decimal (base 10)
I We can represent a float as a pair of integers
I the significant digits (or mantissa)
I the exponent
I Example: the number 1.949 would be represented by
I significant digits: 1949
I exponent: -3
meaning: $1.949 = 1949 \cdot 10^{-3}$
I Python syntax: 1949e-3 or 1949E-3

4 / 45
Floating-Point Numbers

Floating-Point Numbers
Precision and Range

I The number of significant digits determines the precision of


floats
I If there were only two significant decimal digits, we would not
be able to represent 1.949 precisely
I We would need to use an approximation, e.g., $19 \cdot 10^{-1}$
I This approximation is called a rounded value
I Since floats are internally not expressed in the decimal system
but in binary we need to review binary numbers

5 / 45
Floating-Point Numbers

Floating-Point Numbers
From Decimal to Binary

I Numbers we have used so far in the lecture have been


expressed in decimal notation
I The value of the number can thus be expressed as a sum of
powers of 10
I Thus, the integer $a_n a_{n-1} \dots a_0$ in base 10 has value
$$\sum_{i=0}^{n} a_i \cdot 10^i$$

I For example,
$543 = 5 \cdot 10^2 + 4 \cdot 10^1 + 3 \cdot 10^0 = 5 \cdot 100 + 4 \cdot 10 + 3$

6 / 45
Floating-Point Numbers

Floating-Point Numbers
From Decimal to Binary(2)

I Expressing a number in binary means its value can be


expressed as a sum of powers of 2
I Thus, the integer $a_n a_{n-1} \dots a_0$ in binary notation has value
$$\sum_{i=0}^{n} a_i \cdot 2^i$$

I For example,
$1011 = 1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 1 \cdot 8 + 0 \cdot 4 + 1 \cdot 2 + 1 = 11$ (in decimal)
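
The conversion can be checked in Python. This sketch uses the built-ins int (with base 2) and bin, and repeats the sum of powers of 2 by hand; the variable names are illustrative.

```python
value = int('1011', 2)     # parse a string of binary digits
print(value)               # 11
print(bin(11))             # 0b1011

# the same sum of powers of 2, written out explicitly
manual = sum(int(d) * 2**i for i, d in enumerate(reversed('1011')))
print(manual)              # 11
```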

7 / 45
Floating-Point Numbers

Floating-Point Numbers
Floats in Binary

I Floats are internally not represented in decimal but in binary


I This means
I Significant digits are either 0 or 1 (i.e., bits)
I current implementation has 53 significant digits
I The exponent is expressed in binary as well (the largest float is
about $1.8 \cdot 10^{308}$, hence a maximal decimal exponent of 308)
I 2 rather than 10 is raised to the power of the exponent
I Represent the float by the pair (s,e) where s consists of the
significant digits and e is the exponent
I Example of a float (in binary):
I (101, 100) representing:
I $(1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0) \cdot 2^{(2^2)} = 5 \cdot 2^4 = 80$ (in decimal)
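
These figures can be inspected from within Python. The sketch below assumes an implementation whose floats are IEEE 754 double-precision numbers, as is the case for CPython on common platforms; the names s and e mirror the pair (s,e) above.

```python
import sys

print(sys.float_info.mant_dig)     # 53 significant binary digits
print(sys.float_info.max_10_exp)   # 308, the largest decimal exponent
print(sys.float_info.max)          # about 1.8e308

s, e = 0b101, 0b100                # the pair (101, 100) written as binary literals
value = float(s * 2**e)
print(value)                       # 80.0
```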

8 / 45
Floating-Point Numbers

Floating-Point Numbers
Floats in Binary: Example

I Another example (non-integer): 0.625


I What is the representation in binary?
I Noting that 0.625 = 5/8 means it can be represented by
(101, -11) since
I $5/8 = 5 \cdot 2^{-3}$
I binary representations of 5 and 3 are 101 and 11, respectively

9 / 45
Floating-Point Numbers

Floating-Point Numbers
Floats in Binary: Example 2

I Another example (non-integer): 0.1


I What is the representation in binary?
I If we have four significant digits, we can approximate 0.1 by
(0011, -101)
I decimal value of (0011, -101) is $3 \cdot 2^{-5} = 3/32 = 0.09375$
I What precision and range do we need to represent 0.1 exactly?
I This is not possible with finite precision and range! (see next
slide)

10 / 45
Floating-Point Numbers

Floating-Point Numbers
Need for approximations

I Suppose that $0.1 = n \cdot 2^{e}$ where n and e are integers
I Note that e has to be negative since 0.1 < 1
I Rewrite above equality as $1 = 10 \cdot n \cdot 2^{e}$ or
$2^{-e} = 10 \cdot n$
I Why is this not possible for integer n?
I Because the left side is a power of 2 and thus cannot be a
multiple of 5, whereas the right side is
I We conclude that Python often needs to approximate real
numbers (even simple ones)
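
The argument can be checked numerically with the standard fractions module, which converts a float to the exact fraction it stores. This is a sketch; the variable names are illustrative.

```python
from fractions import Fraction

stored = Fraction(0.1)             # exact value of the float nearest to 0.1
d = stored.denominator
is_power_of_two = d & (d - 1) == 0 # true exactly when d is a power of 2

print(stored == Fraction(1, 10))   # False: the stored value is not 1/10
print(is_power_of_two)             # True: the float has the form n * 2**e
```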

11 / 45
Floating-Point Numbers

Floating-Point Numbers
Testing Python Implementation

I Let us test the Python implementation of floats:
>>> a = 0.1
>>> a
I What will be the output?
I DEMO
Output is: 0.1
I Doesn’t this contradict what we said above?
I No, because Python rounds when displaying floats: it prints the
shortest decimal string that converts back to the same stored value
I In this case it does not display the exact value stored in the
computer
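
To look behind the rounded display, one can ask for more digits explicitly. A minimal sketch using format and the standard decimal module:

```python
from decimal import Decimal

a = 0.1
print(repr(a))              # 0.1
print(format(a, '.20f'))    # 0.10000000000000000555
print(Decimal(a))           # the exact stored value, with many more digits
```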

12 / 45
Floating-Point Numbers

Floating-Point Numbers
Testing Python Implementation (2)
I Consider the following code:
x = 0.0
for i in range(10):
    x += 0.1
if x == 1.0:
    print(x, '= 1.0')
else:
    print(x, 'is not 1.0')
I What is the output?
I DEMO.
I Does not recognize equality. Why?
I 0.1 is only stored approximately, and each addition rounds again
to the available significant digits, so the rounding errors accumulate
13 / 45
Floating-Point Numbers

Floating-Point Numbers
Testing Python Implementation (3)

I Let’s modify the code as follows:


x = 0.0
for i in range(10):
    x += 0.1
    print(x)
I DEMO.
I As we can see it is rather unpredictable when exact results
come out

14 / 45
Floating-Point Numbers

Floating-Point Numbers
Testing Equality

I What have we learnt from previous examples?


I We should never test for equality of floats
I Generally it is sufficient to test whether two floats are close,
e.g.,
I test whether: abs(f1-f2) < epsilon
where epsilon is a suitably small number
I We will see a concrete application of this in practical examples
on the following slides
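
Applied to the earlier loop, the closeness test might look as follows. The value of epsilon is an arbitrary choice for illustration; math.isclose is a standard-library alternative that uses a relative tolerance by default.

```python
import math

x = 0.0
for _ in range(10):
    x += 0.1

epsilon = 1e-9                       # tolerance chosen for illustration
print(x == 1.0)                      # False
print(abs(x - 1.0) < epsilon)        # True
print(math.isclose(x, 1.0))          # True
```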

15 / 45
Floating-Point Numbers

Floating-Point Numbers
Polynomials

I A polynomial p(x) in one variable x is an expression of the form:
$$p(x) = \sum_{i=0}^{d} a_i x^i$$
where each $a_i$ is a real number and $a_d \neq 0$
I We call d the degree of p
I Examples of polynomials:
I 3 (of degree 0)
I $x - 1$ (of degree 1)
I $x^7 - x^2 + 37$ (of degree 7)

16 / 45
Floating-Point Numbers

Floating-Point Numbers
Roots of Polynomials

I For a polynomial p(x) we denote by p(r) the value of p for x = r
I r is a root of p if p(r) = 0

Theorem (Fundamental Theorem of Algebra)
Every non-zero, single-variable polynomial of degree n has exactly
n complex roots (counted with multiplicities)
Equivalently: there exist complex numbers $r_1, \dots, r_n$ such that
$p(x) = a \prod_{i=1}^{n} (x - r_i)$ for some complex number a

17 / 45
Floating-Point Numbers

Floating-Point Numbers
Computing Roots

I Suppose you are told that you are supposed to compute at


least one real root of a given polynomial
I Why is this task ill-defined?
I Because, in general, there is no finite representation of a root
of a polynomial
I Can you give an example?
I There is no finite decimal (or binary) representation of the roots
of $x^2 - 2$. Why?
I $\sqrt{2}$ is an irrational number. Meaning?
I It cannot be represented by a fraction $p/q$ where p and q are
integers
I Note that all floating-point numbers are rational. Why?
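
Both observations can be illustrated in Python. The float method as_integer_ratio returns the exact fraction behind a float; the sketch below applies it to the float returned by math.sqrt(2).

```python
import math
from fractions import Fraction

r = math.sqrt(2)                 # a float, hence only a rational approximation
p, q = r.as_integer_ratio()      # r equals p / q exactly
exact = Fraction(p, q)

print(exact ** 2 == 2)           # False: no fraction squares to exactly 2
print(q & (q - 1) == 0)          # True: the denominator is a power of 2
```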

18 / 45
Floating-Point Numbers

Floating-Point Numbers
Computing Roots (2)

I So what is the next question you should ask?


I How good an approximation should I compute?
I Suppose you are told that you have to compute a real root
with an error of at most $10^{-6}$
I How should you proceed to do this?

19 / 45
Floating-Point Numbers
An Example

I Let us take a concrete example

$p(x) = x^5 + x - 1$

I Does this polynomial have at least one real root?
I Yes: since $p(0) = -1$, $p(1) = 1$ and $p$ is a continuous function, there must be at least one root (intermediate value theorem)
I What can we say about the location of this root?
I It must lie in the interval $[0, 1]$

Floating-Point Numbers
Looking up the Answer

I CHALLENGE: look up the root online
I You have five minutes
I display "countdown timer"

Floating-Point Numbers
A Naive Method

I What would be a very simple and straightforward method of computing such a root?
I Exhaustive enumeration. How?
I Since we already know that the root lies in the interval $[0, 1]$, we could use the following simple procedure

b = 0.0
step = 10**-6
while b**5 + b - 1 < 0:
    b += step  # same as: b = b + step

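As a minimal sketch, the enumeration can be wrapped in a function that also counts its iterations. The names `naive_root` and `iterations` are illustrative and not part of the course code, and a coarser step than on the slide is used so that the run finishes quickly.

```python
def p(x):
    return x**5 + x - 1


def naive_root(step):
    """Advance b from 0 in increments of step until p(b) is no longer negative."""
    b = 0.0
    iterations = 0
    while p(b) < 0:
        b += step
        iterations += 1
    return b, iterations


root, iterations = naive_root(10**-4)
print(root, iterations)
```

The iteration count is roughly the root divided by the step, so dividing the step by 10 multiplies the running time by about 10.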
Floating-Point Numbers
Drawback of Naive Method

I What is the main drawback of the simple method?
I It can become really slow for small values of epsilon (the step size): the number of iterations grows proportionally to 1/epsilon
I DEMO

Floating-Point Numbers
Another method?

I We can view the current problem as a search problem. How?
I Among the values $0, 10^{-6}, 2 \cdot 10^{-6}, \ldots, 1$ find one with the desired property
I We can consider the previous program as a linear search in this set
I So how can we speed up the search?
I Using binary search (see lesson 2)
I More precisely, we will use bisection search, which halves the search interval exactly in each iteration

Floating-Point Numbers
Bisection Search

I The following code searches for a root using bisection search

p = lambda x: x**5+x-1
epsilon = 10**(-6)
low = 0
high = 1
b = 0.5
while high-low > epsilon:
    if p(b) < 0:
        low = b   # search interval [b,high]
    else:
        high = b  # search interval [low,b]
    b = (low+high)/2
print(b)

Floating-Point Numbers
Complexity of Bisection Search

I How many iterations does the bisection search need?
I After each iteration the interval to search is halved
I The initial size of the interval is 1 and the final one is at most epsilon
I If $k$ is the required number of iterations for the bisection search, then it suffices that $\frac{1}{2^k} < \text{epsilon}$, or equivalently $2^k > \frac{1}{\text{epsilon}}$
I Taking logarithms in base 2 on both sides we get

$k > \log_2\left(\frac{1}{\text{epsilon}}\right) = -\log_2(\text{epsilon})$

I Assuming $\text{epsilon} = 10^{-p}$, we need $k > -\log_2(10^{-p}) = p \log_2 10 \approx 3.32\,p$, so $4p$ iterations are sufficient (versus $10^p$ for the linear search)

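The bound can be checked empirically. The following sketch counts the iterations of the bisection loop for several values of epsilon and prints them next to the smallest integer at least $\log_2(1/\text{epsilon})$ and next to $4p$. The function name `bisect_count` is an assumption made for this illustration.

```python
import math


def p(x):
    return x**5 + x - 1


def bisect_count(epsilon):
    """Bisection on [0, 1]; returns the approximation and the iteration count."""
    low, high = 0.0, 1.0
    b = (low + high) / 2
    iterations = 0
    while high - low > epsilon:
        if p(b) < 0:
            low = b   # root lies in [b, high]
        else:
            high = b  # root lies in [low, b]
        b = (low + high) / 2
        iterations += 1
    return b, iterations


for digits in (2, 4, 6, 8):
    epsilon = 10**-digits
    root, iterations = bisect_count(epsilon)
    bound = math.ceil(math.log2(1 / epsilon))
    print(digits, iterations, bound, 4 * digits)
```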
Floating-Point Numbers
How to Find Roots?

I We have seen that bisection search can be quite fast
I It relies, however, on having an initial search interval
I Does there exist a method that is even faster?

[Figure: portrait of Sir Isaac Newton]

Floating-Point Numbers
Newton's Method

I Newton came up with a method for finding roots of polynomials; this method was later generalized to real-valued functions
I The Newton method (also called the Newton-Raphson method) finds successively better approximations to the roots of a real-valued function
I A root of a function $f(x)$ is a value $r$ such that $f(r) = 0$
I The method states the following:
I if $x_0$ is close to a root of $f$ (and $f'(x_0) \neq 0$), then $x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$ is even closer
I here $f'$ denotes the derivative of the function $f$
I Geometric interpretation: $(x_1, 0)$ is the intersection of the tangent to the graph of $f$ at $(x_0, f(x_0))$ with the x-axis

Floating-Point Numbers
Application of Newton's Method

I We would like to apply Newton's method to find the root of our example polynomial $p(x) = x^5 + x - 1$
I Applying Newton's formula, we know that if $b$ is an approximation to a root of $p(x)$, then

$b - \frac{b^5 + b - 1}{5b^4 + 1}$

should be an even better approximation (note that the denominator is the derivative of the numerator)

Floating-Point Numbers
Computing the Next Approximation

I The following Python code computes the next approximation

p = lambda x: x**5+x-1
p2 = lambda x: 5*x**4+1  # derivative of p
def nextVal(b):
    '''Assumes float b is an approximation to a root of p
    Returns the next approximation (using Newton's method)'''
    return b-p(b)/p2(b)

Floating-Point Numbers
Computing a Root

I The following Python code stops when the approximation is close enough
I It makes use of the nextVal function defined on the previous slide

b = 1  # initial approximation to a root of p
epsilon = 10**(-6)
while abs(p(b)) > epsilon:
    b = nextVal(b)

I DEMO

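To see how quickly the approximations improve, the loop can record every intermediate value. This is a self-contained sketch; `newton_trace` and `p_prime` are illustrative names, not functions from the course material.

```python
def p(x):
    return x**5 + x - 1


def p_prime(x):
    return 5 * x**4 + 1  # derivative of p


def newton_trace(b, epsilon):
    """Return the list of Newton approximations, starting from b."""
    trace = [b]
    while abs(p(trace[-1])) > epsilon:
        b = trace[-1]
        trace.append(b - p(b) / p_prime(b))
    return trace


trace = newton_trace(1.0, 10**-6)
for k, b in enumerate(trace):
    print(k, b, p(b))
```

Printing `p(b)` next to each approximation shows the residual shrinking much faster than by a constant factor per step, which is the behavior the empirical comparison below relies on.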
Floating-Point Numbers
Comparing Complexities

I Newton's method appears to be just as fast as the bisection search
I Because it is not easy to analyze the complexity of Newton's method, we will do an empirical comparison
I Why doesn't it make sense to compare the running times directly, i.e., comparing how long each one takes for a given epsilon?
I Because they have different termination criteria
I Newton's method considers the value of p(b) rather than the value of b
I To make them comparable we modify the bisection search (see next page)

Floating-Point Numbers
Modified Bisection Search

I Recall the main loop of the bisection search:

while high-low > epsilon:
    if p(b) < 0:
        low = b   # search interval [b,high]
    else:
        high = b  # search interval [low,b]
    b = (low+high)/2

I We modify it by basing termination on p(b):

while abs(p(b)) > epsilon:
    if p(b) < 0:
        low = b   # search interval [b,high]
    else:
        high = b  # search interval [low,b]
    b = (low+high)/2

Floating-Point Numbers
Comparing Functions

I Let
I bisect denote the function that computes the root using the modified bisection search
I newt denote the function that computes the root using Newton's method
I We assume that both functions take epsilon as an argument (rather than having it as a local variable)

Floating-Point Numbers
Running the experiments

I We would like to measure the time taken by both functions for various values of epsilon
I We could do this by hand, but this would be rather painful
I What is the better way?
I Using Python's time measuring facilities

Floating-Point Numbers
The timeit module

I The timeit module provides a simple way to time Python code snippets
I It contains a function, also called timeit
I In its simplest form it has just one parameter: the statement to be executed (given as a string)
I By default it returns the time in seconds consumed by running the statement 1 million times
I To run a function f a single time we invoke it as follows (f must be visible to timeit, e.g., via the globals parameter):
timeit.timeit("f()", number=1)

Floating-Point Numbers
The timeit Function: Example

I Here is a simple example that times the square root function for values from 0 to 9

import timeit
import math
for i in range(10):
    statement = 'math.sqrt('+str(i)+')'
    print(timeit.timeit(statement, number=1, globals=globals()))

I Setting the globals parameter to globals() executes the statement in the current global namespace, which gives it access to imported modules (math in this case)

Floating-Point Numbers
Timing the Root Finding Functions

I The following code times the functions bisect and newt for the values of epsilon $10^{-2}, \ldots, 10^{-10}$
I We assume each function is in a module with the same name (note that a local module named bisect shadows the standard-library module of the same name)

import timeit
import bisect
import newt
for epsilon in [10**-i for i in range(2,11)]:
    statement1 = 'bisect.bisect('+str(epsilon)+')'
    statement2 = 'newt.newt('+str(epsilon)+')'
    print(timeit.timeit(statement1, number=1, globals=globals()), '/',
          timeit.timeit(statement2, number=1, globals=globals()))

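A self-contained variant of the experiment is sketched below. It makes a few assumptions that differ from the slides: both functions live in one file under the illustrative names `bisect_root` and `newton_root`, they also return their iteration counts, and each measurement repeats the call 200 times (timeit accepts a callable as well as a string) to reduce the noise of a single run.

```python
import timeit


def p(x):
    return x**5 + x - 1


def p_prime(x):
    return 5 * x**4 + 1


def bisect_root(epsilon):
    """Modified bisection search: stop when |p(b)| <= epsilon."""
    low, high = 0.0, 1.0
    b = (low + high) / 2
    steps = 0
    while abs(p(b)) > epsilon:
        if p(b) < 0:
            low = b
        else:
            high = b
        b = (low + high) / 2
        steps += 1
    return b, steps


def newton_root(epsilon):
    """Newton's method starting from b = 1: stop when |p(b)| <= epsilon."""
    b = 1.0
    steps = 0
    while abs(p(b)) > epsilon:
        b = b - p(b) / p_prime(b)
        steps += 1
    return b, steps


for i in range(2, 11):
    epsilon = 10**-i
    t_bisect = timeit.timeit(lambda: bisect_root(epsilon), number=200)
    t_newton = timeit.timeit(lambda: newton_root(epsilon), number=200)
    print(i, bisect_root(epsilon)[1], newton_root(epsilon)[1], t_bisect, t_newton)
```

Printing the iteration counts next to the timings separates the algorithmic difference from measurement noise.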
Floating-Point Numbers
Timing the Root Finding Functions

I The output of the experiment is shown below:


2.863499685190618e-05 / 2.6583002181723714e-05
2.192300235037692e-05 / 1.8225000530947e-05
2.230900281574577e-05 / 2.2709002223564312e-05
4.1784998757066205e-05 / 2.3335996957030147e-05
3.45799999195151e-05 / 1.981499735848047e-05
3.96939976781141e-05 / 1.909700222313404e-05
4.870500197284855e-05 / 2.1547999494941905e-05
4.93240004288964e-05 / 2.0577001123456284e-05
4.965499829268083e-05 / 2.9588001780211926e-05
I We conclude that both methods have similar running times (at least for this example), with Newton's method being a bit faster, most visibly for the smaller values of epsilon

Floating-Point Numbers
Python Decimal Module

I The decimal module in the Python standard library supports decimal floating-point arithmetic
I Purpose of this module:
I it is supposed to work in the way people are used to
I decimal numbers (e.g., 0.1) can be represented exactly
I exactness carries over to arithmetic computations, e.g., 0.1+0.1+0.1-0.3 is exactly equal to zero
I the module incorporates the notion of significant digits
I e.g., 1.20 + 1.30 = 2.50
I the trailing zero is kept to indicate significance
I this is the customary presentation for monetary applications

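The two properties can be observed side by side with binary floats. The sketch below uses only the standard library; the variable names are illustrative.

```python
from decimal import Decimal

# Binary floats: 0.1 and 0.3 are not exactly representable, so a residue remains.
float_sum = 0.1 + 0.1 + 0.1 - 0.3
print(float_sum)

# Decimals built from strings are exact, so the result is exactly zero.
decimal_sum = Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
print(decimal_sum)

# Trailing zeros are preserved to indicate significance.
total = Decimal('1.20') + Decimal('1.30')
print(total)  # 2.50
```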
Floating-Point Numbers
Floating Point Standards

I Both binary and decimal floating-point implementations follow published standards: IEEE 754 (binary) and IEEE 854 (radix-independent, since merged into IEEE 754-2008)
I DEMO
I Three main concepts of the decimal module: the decimal number, the context for arithmetic operations, and signals

Floating-Point Numbers
Constructing Decimals

I Here are some examples of creating Decimals
I Note that construction from an integer or float performs an exact conversion from that integer or float

>>> from decimal import Decimal
>>> Decimal(10) # Decimal from int
Decimal('10')
>>> Decimal('3.14') # Decimal from string
Decimal('3.14')
>>> Decimal(3.14) # Decimal from float
Decimal('3.140000000000000124344978758017532527446746826171875')
>>> Decimal(0.1) # Decimal from float
Decimal('0.1000000000000000055511151231257827021181583404541015625')

Floating-Point Numbers
Context

I The context for arithmetic is an environment specifying precision, rounding rules, limits on exponents, flags indicating the results of operations, and trap enablers which determine whether signals are treated as exceptions
I DEMO

>>> from decimal import getcontext
>>> getcontext()
Context(prec=28, rounding=ROUND_HALF_EVEN,
Emin=-999999, Emax=999999, capitals=1,
clamp=0, flags=[],
traps=[InvalidOperation, DivisionByZero, Overflow])

I We can modify parameters in the context, e.g.,

>>> getcontext().prec = 6

I sets the precision to 6 significant digits (for arithmetic operations)

Floating-Point Numbers
Precision

I Let's see the effect of setting the precision:

>>> from decimal import *
>>> getcontext().prec = 3
>>> a = Decimal(3.14)
>>> a
Decimal('3.140000000000000124344978758017532527446746826171875')
>>> a+0
Decimal('3.14')
>>> Decimal('3.14')*Decimal('3.14')
Decimal('9.86')

I The precision applies to the results of arithmetic operations, not to the construction of a Decimal

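Changing `getcontext().prec` affects all later computations in the thread. As a hedged alternative, the standard library's `localcontext()` limits the change to a `with` block. The sketch below assumes the default precision of 28 outside the block.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 3
    a = Decimal(3.14)        # construction is exact; prec is not applied here
    rounded = a + 0          # arithmetic rounds to 3 significant digits
    product = Decimal('3.14') * Decimal('3.14')

# Outside the block the previous precision is in effect again.
outside = Decimal('3.14') * Decimal('3.14')

print(a)
print(rounded)   # 3.14
print(product)   # 9.86
print(outside)   # 9.8596
```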
Floating-Point Numbers
Mixing Floats and Decimals

I Mixing floats and decimals can lead to unwanted effects (see earlier examples)
I We can have an exception raised when mixing floats and decimals by trapping the FloatOperation signal
I Once the signal is trapped, both constructing a Decimal from a float and comparing a Decimal with a float raise the exception

>>> from decimal import *
>>> getcontext().traps[FloatOperation] = True
>>> a = Decimal(3.14)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
decimal.FloatOperation: [<class 'decimal.FloatOperation'>]
>>> Decimal(3.14) < 3.14
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
decimal.FloatOperation: [<class 'decimal.FloatOperation'>]

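The trap can also be enabled temporarily. The following sketch records which operations raise `FloatOperation` while the trap is set; the `outcomes` dictionary and its labels are illustrative. According to the decimal documentation, explicit conversion with `Decimal.from_float` remains allowed.

```python
from decimal import Decimal, FloatOperation, localcontext

outcomes = {}
with localcontext() as ctx:
    ctx.traps[FloatOperation] = True
    checks = [
        ("constructor", lambda: Decimal(3.14)),
        ("comparison", lambda: Decimal('3.14') < 3.14),
        ("from_float", lambda: Decimal.from_float(3.14)),
    ]
    for label, action in checks:
        try:
            action()
            outcomes[label] = "ok"
        except FloatOperation:
            outcomes[label] = "FloatOperation"

print(outcomes)
```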
Matplotlib NumPy Pandas

Programming Fundamentals 1
Lesson 11

Pierre Kelsen

CSC Research Unit, MNO

2017

Outline

Matplotlib
NumPy
Pandas

Matplotlib
What is Matplotlib?

I Matplotlib is a library for making 2D plots of arrays (which are similar to lists) in Python
I On the next slides we present a simple way to use Matplotlib that is similar in flavor to Matlab
I Matlab is a numerical computing environment used by engineers and scientists

Matplotlib
What is PyLab?

I PyLab is a module (part of Matplotlib) that provides facilities for data visualization, data analysis and numeric computations
I In this lesson we focus on the data plotting features of PyLab

Matplotlib
PyLab: Example 1

I Below you find a simple example that uses PyLab to produce a plot
I Lists instead of arrays are given as arguments

import pylab

pylab.figure(1)  # create figure 1
pylab.plot([1,2,3,4], [1,7,3,5])  # draw on figure 1
pylab.show()  # show figure on screen

Matplotlib
PyLab: Figure 1

[Figure: the plot produced by Example 1]
I Explain figure

Matplotlib
Saving Figures

I Is there any point in plotting figures without showing them?
I Yes, we can create different figures and save them to files (e.g., for viewing them later)
I The following source code does just that

import pylab

pylab.figure(1)  # create figure 1
pylab.plot([1,2,3,4], [1,7,3,5])  # draw on figure 1
pylab.savefig('fig1')  # save figure 1
pylab.figure(2)  # create figure 2
pylab.plot([1,2,3,4], [4,3,2,1])  # draw on figure 2
pylab.savefig('fig2')  # save figure 2

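The same idea can be sketched with `matplotlib.pyplot`, the interface that current Matplotlib documentation recommends over `pylab`. The example assumes Matplotlib is installed. The non-interactive `Agg` backend, the temporary output directory and the `.png` file names are choices made for this illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: figures are only written to files
import matplotlib.pyplot as plt

out_dir = tempfile.mkdtemp()
paths = []
for number, ys in [(1, [1, 7, 3, 5]), (2, [4, 3, 2, 1])]:
    plt.figure(number)              # create figure and make it current
    plt.plot([1, 2, 3, 4], ys)      # draw on the current figure
    path = os.path.join(out_dir, f"fig{number}.png")
    plt.savefig(path)               # save the current figure
    paths.append(path)
plt.close("all")

print(paths)
```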
Matplotlib
Current Figure

I Note that the plot and savefig commands do not refer to a specific figure
I PyLab has the notion of a current figure
I The current figure is the figure that was most recently created or selected using the figure command
I Thus, in

pylab.figure(2)  # create figure 2
pylab.plot([1,2,3,4], [4,3,2,1])  # draw on figure 2

the plot command refers to figure 2

Matplotlib
Another Example

I The following example shows the effect of compound interest on an initial investment

import pylab
principal = 1000  # initial investment
interest = 0.05
years = 20
values = []
for i in range(years+1):
    values.append(principal)
    principal += principal*interest
pylab.plot(values)
pylab.show()

I Note that when plot is called with a single list argument l, the x-values will be range(len(l))

Matplotlib
Plot

[Figure: plot of the principal over 20 years]

Matplotlib
Informative Labeling

I The figure should be self-explanatory
I For that we need to give a title to the figure and labels to the axes
I We can add the following lines (which apply to the current figure):

pylab.title('5% Interest, Compounded Annually')
pylab.xlabel('Years of Compounding')
pylab.ylabel('Principal')

I See next slide

Matplotlib
Plot #2

[Figure: the same plot with title and axis labels]

Matplotlib
Finding the Roots

I In an earlier lesson we showed how to compute roots of a polynomial
I Challenge:
I Determine (roughly) a real root of $x^5 + x - 1$ graphically
I How shall we do this?

Matplotlib
Finding the Roots (2)

I We need to plot the graph of our polynomial
I How should we choose the x-coordinates?
I Since we already know that one root must be between 0 and 1, we limit ourselves to the interval [0,1]

import pylab

xvalues = [i*0.01 for i in range(101)]
yvalues = [x**5+x-1 for x in xvalues]
pylab.plot(xvalues, yvalues)
pylab.show()

I DEMO

Matplotlib
Finding the Roots (3)

I The value of the root is still a bit hard to determine
I What would help?
I Drawing a horizontal line y=0
I We can simply draw it on top of the current figure (see code below)

import pylab

xvalues = [i*0.01 for i in range(101)]
yvalues = [x**5+x-1 for x in xvalues]
pylab.plot(xvalues, yvalues)
pylab.plot([0,0])  # y-values [0,0]; x-coordinates will be [0,1]
pylab.show()

I We see that the real root is about 0.75

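The visual estimate can be cross-checked without a plot by locating the sign change in the same sampled values. This is a small sketch in plain Python; the name `bracket` is illustrative.

```python
xvalues = [i * 0.01 for i in range(101)]
yvalues = [x**5 + x - 1 for x in xvalues]

# Find the first pair of neighboring sample points between which p changes sign.
bracket = None
for left, right, y_left, y_right in zip(xvalues, xvalues[1:], yvalues, yvalues[1:]):
    if y_left < 0 <= y_right:
        bracket = (left, right)
        break

print(bracket)
```

The root lies between the two printed x-values, which agrees with reading the crossing point off the graph.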
NumPy
What is NumPy?

I NumPy provides an efficient way to store and manipulate multi-dimensional arrays in Python
I ndarray type: efficient storage and manipulation of vectors, matrices and higher-dimensional data
I readable and efficient syntax to operate on this data

NumPy
A First Example

I Arrays look a lot like lists in Python

>>> import numpy as np
>>> x = np.arange(9)
>>> x
array([0, 1, 2, 3, 4, 5, 6, 7, 8])

I This is similar to list(range(9))

NumPy
Applying Operators

I We can apply operators to arrays directly like this:

>>> x**2
array([ 0, 1, 4, 9, 16, 25, 36, 49, 64])

I We could do the same with a list comprehension (in a more verbose way):

>>> [x**2 for x in range(9)]

NumPy
Applying Operators

I NumPy can also deal with multi-dimensional arrays
I Let's reshape vector x into a matrix

>>> M = x.reshape((3,3))
>>> M
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

NumPy
Matrix Operations

I NumPy knows how to efficiently do matrix operations
I Here is how to compute the transpose of a matrix:

>>> M.T
array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])

I Has M been modified?
I No, only the transpose of M has been computed and displayed.

NumPy
Matrix Operations (2)

I We can very simply express multiplication of a matrix by a vector:

>>> np.dot(M,[2, 3, 4])
array([11, 38, 65])

I or do something more complicated, like computing eigenvalues:

>>> np.linalg.eigvals(M)
array([ 1.33484692e+01, -1.34846923e+00, -2.48477279e-16])

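The interactive steps from the NumPy slides can be collected into one short script. The sketch assumes NumPy is installed. The `@` operator is used alongside `np.dot` as an equivalent way to write the matrix-vector product.

```python
import numpy as np

x = np.arange(9)
squares = x**2                 # element-wise operation on the whole array
M = x.reshape((3, 3))          # 3x3 matrix with the same nine values
transposed = M.T               # transpose; M itself is left unchanged
product = np.dot(M, [2, 3, 4])
same_product = M @ np.array([2, 3, 4])

print(squares)
print(transposed)
print(product)
```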
Pandas
What is Pandas?

I Pandas is built on top of NumPy
I It is much more recent than NumPy
I It provides a labeled interface to multi-dimensional data, using a DataFrame object

Pandas
A First Example

>>> import pandas as pd
>>> df = pd.DataFrame({'label': ['A', 'B', 'C', 'A', 'B', 'C'],
...                    'value': [1, 2, 3, 4, 5, 6]})
>>> df

I Output:

  label  value
0     A      1
1     B      2
2     C      3
3     A      4
4     B      5
5     C      6

Pandas
A More Interesting Example

I This example is based on data from the Hubble Space Telescope (source: http://pythonforengineers.com/introduction-to-pandas/)
I The data is presented in CSV format
I CSV stores tabular data in plain text
I Each line is composed of one or more fields separated by commas

Pandas
Inspecting the Data

I The data is contained in a file named hubble_data.csv
I Here is an excerpt of the data

distance,recession_velocity
.032,170
.034,290
.214,-130
.263,-70
.275,-185
.275,-220
.45,200
.5,290

Pandas
Loading the Data

I The data can be read with the read_csv function of pandas

import pandas as pd
data = pd.read_csv('hubble_data.csv')

I What type is data?
I It's a DataFrame.
I We can print the first five rows of the DataFrame like this: data.head() → DEMO
I The top row is the header
I If the CSV file has no header row, we can pass the column names (as a list of strings) to read_csv via its names parameter

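Both ways of reading can be tried without a file on disk by feeding `read_csv` an in-memory text buffer. The sketch assumes pandas is installed and reuses the first rows of the excerpt. The variable names `csv_text`, `headerless` and `data_named` are illustrative.

```python
import io

import pandas as pd

csv_text = """distance,recession_velocity
.032,170
.034,290
.214,-130
.263,-70
.275,-185
"""

# Case 1: the first line of the file is the header.
data = pd.read_csv(io.StringIO(csv_text))

# Case 2: the file has no header line, so the column names are supplied explicitly.
headerless = "\n".join(csv_text.splitlines()[1:])
data_named = pd.read_csv(io.StringIO(headerless),
                         names=["distance", "recession_velocity"])

print(type(data))
print(data.head())
```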
Pandas
Viewing a Column

I So what can we do with the tabular data?
I We can view one column at a time
I E.g., the following code just extracts the distances

data['distance']

I Output (first five rows):

0    0.032
1    0.034
2    0.214
3    0.263
4    0.275

Pandas
Plotting

I We would like to plot recession velocity versus distance
I In other words, the x-values should be the distances and the y-values should be the recession velocities
I DataFrame offers a plot method

import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('hubble_data.csv')
data.plot()
plt.show()

I DEMO

Matplotlib NumPy Pandas

Pandas
Fixing the Plot

I The plot is still not exactly what we want. Why?
I Because it uses the integer row index for the x-values and draws each
column (including distance) as its own line
I To correct the plot we make 'distance' the index, so that
recession_velocity is the only remaining data column

import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('hubble_data.csv')
data.set_index('distance',inplace = True)
data.plot()
plt.show()
I DEMO
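
What set_index changes can be checked without plotting. This is a small sketch on a made-up excerpt of the data (held in memory, which is an assumption so that it runs without the CSV file).

```python
import io
import pandas as pd

csv_text = "distance,recession_velocity\n.032,170\n.034,290\n.214,-130\n"
data = pd.read_csv(io.StringIO(csv_text))

# Before: two data columns and an integer index 0, 1, 2.
print(list(data.columns))

# After: 'distance' has moved into the index, one data column is left.
indexed = data.set_index("distance")
print(list(indexed.columns))
print(indexed.index.name)
```

An alternative that leaves the DataFrame unchanged is to name the axes directly in the call: data.plot(x='distance', y='recession_velocity').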

29 / 52
Matplotlib NumPy Pandas

Pandas
Data Analysis

I Let’s now start to do some more advanced data analysis [2]
I We use as base data set the World Happiness Reports of 2015-2017

import pandas as pd
import numpy as np
# reading the data
wh = pd.read_csv('wh_data.csv', index_col=0)

[2] https://www.dataquest.io/blog/pandas-pivot-table/
30 / 52
Matplotlib NumPy Pandas

Pandas
Looking at the data

I Let’s look at the raw data


I DEMO: open the file directly

31 / 52
Matplotlib NumPy Pandas

Pandas
Info on the DataFrame

I We can obtain information about the DataFrame using its info method:
>>> wh.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 495 entries, 0 to 494
Data columns (total 12 columns):
Country                          495 non-null object
Region                           495 non-null object
Happiness Rank                   470 non-null float64
Happiness Score                  470 non-null float64
Economy (GDP per Capita)         470 non-null float64
Family                           470 non-null float64
Health (Life Expectancy)         470 non-null float64
Freedom                          470 non-null float64
Trust (Government Corruption)    470 non-null float64
Generosity                       470 non-null float64
Dystopia Residual                470 non-null float64
Year                             495 non-null int64
dtypes: float64(9), int64(1), object(2)
memory usage: 50.3+ KB

32 / 52
Matplotlib NumPy Pandas

Pandas
Info on the DataFrame

I Suppose we want to sort the data by ascending year and descending
happiness scores:
>>> wh.sort_values(["Year", "Happiness Score"],
...                ascending=[True, False], inplace=True)
>>> wh.head()

         Country          Region  Happiness Rank  Happiness Score
141  Switzerland  Western Europe             1.0            7.587
60       Iceland  Western Europe             2.0            7.561
38       Denmark  Western Europe             3.0            7.527
108       Norway  Western Europe             4.0            7.522
25        Canada   North America             5.0            7.427

Note: Only showing first 5 data columns
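
The sort order can be tried on a tiny table. The country names and scores below are made up for illustration; only sort_values and its ascending parameter come from the slide.

```python
import pandas as pd

# Made-up scores, for illustration only.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Year": [2016, 2015, 2016, 2015],
    "Happiness Score": [5.0, 6.5, 7.1, 4.2],
})

# One flag per sort key: Year ascending, Happiness Score descending.
# Without inplace=True, sort_values returns a new, sorted DataFrame.
ordered = df.sort_values(["Year", "Happiness Score"],
                         ascending=[True, False])
print(ordered)
```

Note that the original row labels travel with the rows, which is why the index in the slide output (141, 60, 38, ...) is no longer in increasing order.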

33 / 52
Matplotlib NumPy Pandas

Pandas
Calculating Happiness Score

I The happiness score is obtained by summing up seven other variables:
I Economy: real GDP per capita
I Family: social support
I Health: healthy life expectancy
I Freedom: freedom to make life choices
I Trust: perceptions of corruption
I Generosity: perceptions of generosity
I Dystopia: each country is compared against a hypothetical
nation that represents the lowest national averages for each
key variable and is, along with residual error, used as a
regression benchmark

34 / 52
Matplotlib NumPy Pandas

Pandas
Some Statistical information

I We can use the describe method to obtain some basic statistical
information
>>> wh.describe()
I Note: only showing the first four data columns

       Happiness Rank  Happiness Score  Economy (GDP per Capita)      Family
count      470.000000       470.000000                470.000000  470.000000
mean        78.829787         5.370728                  0.927830    0.990347
std         45.281408         1.136998                  0.415584    0.318707
min          1.000000         2.693000                  0.000000    0.000000
25%         40.000000         4.509000                  0.605292    0.793000
50%         79.000000         5.282500                  0.995439    1.025665
75%        118.000000         6.233750                  1.252443    1.228745
max        158.000000         7.587000                  1.870766    1.610574

35 / 52
Matplotlib NumPy Pandas

Pandas
Some Statistical Terms

I Mean: in this case denotes the arithmetic mean
I sum of values divided by number of values
I std: standard deviation
I measures how widely the values are spread around the mean
I Percentile: the value below which a given percentage of the
observations falls (e.g., roughly a quarter of the values lie below
the 25% entry)
I Review the previous slide
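
The three terms can be computed by hand on a short series and compared with describe. The five values are made up; note that pandas computes the sample standard deviation (dividing by n - 1).

```python
import pandas as pd

scores = pd.Series([2.0, 4.0, 4.0, 6.0, 9.0])  # made-up values

mean = scores.sum() / scores.count()   # arithmetic mean by hand
std = scores.std()                     # sample standard deviation (n - 1)
median = scores.quantile(0.5)          # the 50% row of describe()

print(mean, std, median)
print(scores.describe())
```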

36 / 52
Matplotlib NumPy Pandas

Pandas
Missing Values

I Are there any missing values in the table?
I How do we find out?
I By comparing the "counts" for each column with the number
of rows (see result of info method)
I We conclude that there are no missing values in the Year
column but 25 missing values (495 - 470) in several other columns
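
The same comparison can be done in code. A minimal sketch on a made-up three-row table: isna marks missing entries, and count returns the number of non-missing entries per column.

```python
import numpy as np
import pandas as pd

# Made-up table with one missing score.
df = pd.DataFrame({
    "Year": [2015, 2016, 2017],
    "Happiness Score": [7.5, np.nan, 7.4],
})

missing = df.isna().sum()      # missing entries per column
print(missing)
print(len(df) - df.count())    # same numbers: rows minus non-null counts
```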

37 / 52
Matplotlib NumPy Pandas

Pandas
Pivot Table

I We will use pivot tables to get more insight into the data

Definition [3]
A pivot table is a table that summarizes data in another table, and
is made by applying an operation such as sorting, averaging, or
summing to data in the first table, typically including grouping of
the data.

[3] https://en.wikipedia.org/wiki/Pivot_table
38 / 52
Matplotlib NumPy Pandas

Pandas
Our First Pivot Table

I In pandas, to create a pivot table we need to supply at least
the data (usually a DataFrame) and an index
I The following example also uses the values parameter

>>> pd.pivot_table(wh, index='Year', values="Happiness Score")

      Happiness Score
Year
2015         5.375734
2016         5.382185
2017         5.354019

39 / 52
Matplotlib NumPy Pandas

Pandas
Interpreting the Output

I For each year the table shows a single happiness score. Why?
I Because the pivot table aggregates rows with the same value
for the index (the year in this case)
I The default aggregation method is to take the average (mean)
I Thus from the previous table we can conclude that the average
happiness score was lowest in 2017
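
That the default aggregation is the mean can be verified on a small made-up table by comparing the pivot table with an explicit groupby.

```python
import pandas as pd

# Made-up scores: two rows per year.
df = pd.DataFrame({
    "Year": [2015, 2015, 2016, 2016],
    "Happiness Score": [7.0, 5.0, 6.0, 4.0],
})

table = pd.pivot_table(df, index="Year", values="Happiness Score")
by_hand = df.groupby("Year")["Happiness Score"].mean()

print(table)
print(by_hand)
```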

40 / 52
Matplotlib NumPy Pandas

Pandas
Ranking by Region

I Suppose we want to know the average happiness score by region
I We can simply use "Region" as index

>>> pd.pivot_table(wh, index='Region', values="Happiness Score")

                                 Happiness Score
Region
Australia and New Zealand               7.302500
Central and Eastern Europe              5.371184
Eastern Asia                            5.632333
Latin America and Caribbean             6.069074
Middle East and Northern Africa         5.387879
North America                           7.227167
Southeastern Asia                       5.364077
Southern Asia                           4.590857
Sub-Saharan Africa                      4.150957
Western Europe                          6.693000

41 / 52
Matplotlib NumPy Pandas

Pandas
Multi-Index Pivot Tables
I Suppose we want to know the average happiness score by
region and year
I We can simply pass both column names as a list to the index parameter
>>> pd.pivot_table(wh, index=['Region', 'Year'],
...                values="Happiness Score")
(Only showing top 12 rows)
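
A minimal sketch of a two-level index, on made-up data with hypothetical region names "North" and "South". Rows of the result are addressed by a (region, year) tuple.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "Region": ["North", "North", "North", "South"],
    "Year": [2015, 2015, 2016, 2015],
    "Happiness Score": [7.0, 5.0, 6.0, 4.0],
})

table = pd.pivot_table(df, index=["Region", "Year"],
                       values="Happiness Score")
print(table)

# One row per (Region, Year) combination that occurs in the data.
print(table.loc[("North", 2015), "Happiness Score"])
```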

42 / 52
Matplotlib NumPy Pandas

Pandas
Columns Parameter
I Using the columns parameter we can improve the readability
of the previous table
>>> pd.pivot_table(wh, index='Region', columns='Year',
...                values="Happiness Score")

Year                                 2015      2016      2017
Region
Australia and New Zealand        7.285000  7.323500  7.299000
Central and Eastern Europe       5.332931  5.370690  5.409931
Eastern Asia                     5.626167  5.624167  5.646667
Latin America and Caribbean      6.144682  6.101750  5.957818
Middle East and Northern Africa  5.406900  5.386053  5.369684
North America                    7.273000  7.254000  7.154500
Southeastern Asia                5.317444  5.338889  5.444875
Southern Asia                    4.580857  4.563286  4.628429
Sub-Saharan Africa               4.202800  4.136421  4.111949
Western Europe                   6.689619  6.685667  6.703714
43 / 52
Matplotlib NumPy Pandas

Pandas
Columns Parameter (2)

I What is the effect of the columns parameter?
I Each distinct value of that column (here: each year) becomes a
column header in the output
I Rows are still aggregated by region and year
I e.g., for a given region we have one value for each year in the
corresponding column
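
The effect can be observed on a small made-up table (hypothetical regions "North" and "South"). A combination that does not occur in the data shows up as NaN in the wide layout.

```python
import pandas as pd

# Made-up data: "South" has no entry for 2016.
df = pd.DataFrame({
    "Region": ["North", "North", "North", "South"],
    "Year": [2015, 2015, 2016, 2015],
    "Happiness Score": [7.0, 5.0, 6.0, 4.0],
})

wide = pd.pivot_table(df, index="Region", columns="Year",
                      values="Happiness Score")
print(wide)
print(list(wide.columns))   # the distinct years become the headers
```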

44 / 52
Matplotlib NumPy Pandas

Pandas
Aggregation Functions

I Up to now, when aggregating data, we have taken averages
I Using the aggfunc parameter of pivot_table we can pass
other aggregation functions
>>> pd.pivot_table(wh, index='Region', values="Happiness Score",
...                aggfunc=[np.mean, np.median, min, max, np.std])

I The effect of this is to display per region the listed statistical
measures for the happiness score. DEMO.
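
A sketch of the same idea on made-up data. It passes the aggregations by name as strings, which pandas also accepts; this is a substitution for the function objects used on the slide, not the slide's exact call.

```python
import pandas as pd

# Made-up scores for two hypothetical regions.
df = pd.DataFrame({
    "Region": ["North", "North", "North", "South", "South"],
    "Happiness Score": [7.0, 5.0, 6.0, 4.0, 2.0],
})

stats = pd.pivot_table(df, index="Region", values="Happiness Score",
                       aggfunc=["mean", "median", "min", "max", "std"])
print(stats)

# The columns form two levels: (aggregation name, value column).
print(stats.loc["North", ("mean", "Happiness Score")])
```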

45 / 52
Matplotlib NumPy Pandas

Pandas
Custom Aggregation Functions

I We can pass custom aggregation functions to the pivot_table function
I This can be done via lambda functions
>>> pd.pivot_table(wh, index='Region', values="Happiness Score",
...                aggfunc=[np.mean, min, max, np.std,
...                         lambda x: x.count()/3])

I What will be the effect of the custom aggregation function?
I It adds, per region, the average number of countries per year
(the number of non-missing scores divided by the three years)
I DEMO
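
The custom function can be checked on a made-up data set in which the answer is known in advance: region "A" has two countries and region "B" has one, each observed in all three years. Only the lambda is taken from the slide; it is passed on its own here to keep the output to a single column.

```python
import pandas as pd

# Made-up data: every country appears once per year.
rows = []
for region, country in [("A", "a1"), ("A", "a2"), ("B", "b1")]:
    for year in (2015, 2016, 2017):
        rows.append({"Region": region, "Country": country,
                     "Year": year, "Happiness Score": 5.0})
df = pd.DataFrame(rows)

# count() gives the non-missing scores per region; dividing by the
# three years yields the average number of countries per year.
countries_per_year = pd.pivot_table(
    df, index="Region", values="Happiness Score",
    aggfunc=lambda x: x.count() / 3)
print(countries_per_year)
```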

46 / 52
Matplotlib NumPy Pandas

Pandas
Custom Categorization
I So far we have categorized the data according to the
categories present in the original data
I Using string search we can do more flexible categorization
I Suppose we want to select all regions of one continent, e.g., Asia
I We can do this as follows
>>> t = pd.pivot_table(wh, index='Region', values="Happiness Score",
...                    aggfunc=[np.mean])
>>> t[t.index.str.contains('Asia')]

                             mean
                  Happiness Score
Region
Eastern Asia             5.632333
Southeastern Asia        5.364077
Southern Asia            4.590857
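
How the string search selects rows can be seen step by step. The small table below is built by hand with made-up scores; str.contains returns one boolean per index label, and that mask picks the rows.

```python
import pandas as pd

# Hand-built stand-in for the pivot table; scores are made up.
table = pd.DataFrame(
    {"Happiness Score": [5.6, 4.6, 6.7]},
    index=pd.Index(["Eastern Asia", "Southern Asia", "Western Europe"],
                   name="Region"),
)

mask = table.index.str.contains("Asia")   # one True/False per row label
asia = table[mask]

print(mask)
print(asia)
```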
47 / 52
Matplotlib NumPy Pandas

Pandas
Extracting Data

I If we want to restrict more than one column to certain values,
we can use the query method of DataFrames
t = pd.pivot_table(wh, index=['Region', 'Year'], values='Happiness Score',
                   aggfunc=[np.mean])
t.query('Year == [2015, 2017] and Region == ["Sub-Saharan Africa", "Middle East and Northern Africa"]')

                                                 mean
                                      Happiness Score
Region                          Year
Middle East and Northern Africa 2015         5.406900
                                2017         5.369684
Sub-Saharan Africa              2015         4.202800
                                2017         4.111949
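
A small sketch of query on a two-level index, using made-up scores and hypothetical region names "A", "B" and "C". Inside the query string, the named index levels can be used like columns, and comparing with a list behaves like a membership test.

```python
import pandas as pd

# Made-up data: three regions, one score per year.
scores = {"A": (5.0, 5.1, 5.2), "B": (6.0, 6.1, 6.2), "C": (7.0, 7.1, 7.2)}
rows = [{"Region": region, "Year": year, "Happiness Score": score}
        for region, values in scores.items()
        for year, score in zip((2015, 2016, 2017), values)]
df = pd.DataFrame(rows)

t = pd.pivot_table(df, index=["Region", "Year"], values="Happiness Score")

# Keep only the years 2015 and 2017 of regions A and B.
subset = t.query('Year == [2015, 2017] and Region == ["A", "B"]')
print(subset)
```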

48 / 52
Matplotlib NumPy Pandas

Pandas
Handling Missing Data

I pandas represents missing data by a special floating-point value
called NaN ("not a number")
I There are two parameters of pivot_table that allow us to
handle missing data
I dropna is a boolean, used to indicate that you do not want to
include columns whose entries are all NaN (default: True)
I fill_value is a scalar, used to choose a value to replace
missing values (default: None).
I Since the default value of fill_value is None, we did not replace
missing data values so far

49 / 52
Matplotlib NumPy Pandas

Pandas
Handling Missing Data (2)

I To illustrate handling of missing data, we construct a new
pivot table using the qcut function
I This is a pandas function that splits data into a given number of
quantile-based bins
I Thus
pd.qcut(wh["Happiness Score"], 4) will produce four
bins (quartiles) whose lower edges are the following percentiles:
I 0
I 25
I 50
I 75
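
What qcut returns can be inspected on a short made-up series. With eight distinct values and four bins, every bin receives the same number of values.

```python
import pandas as pd

scores = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])  # made up

# Each value is replaced by the interval (bin) it falls into.
bins = pd.qcut(scores, 4)

print(bins.cat.categories)                 # the four intervals
print(bins.value_counts().sort_index())    # values per interval
```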

50 / 52
Matplotlib NumPy Pandas

Pandas
Handling Missing Data (2)
I The following code splits our data into 4 quantiles:
score = pd.qcut(wh["Happiness Score"], 4)
pd.pivot_table(wh, index=['Region', score], values="Happiness Score",
               aggfunc='count').head(9)

                                            Happiness Score
Region                     Happiness Score
Australia and New Zealand  (2.692, 4.509]               NaN
                           (4.509, 5.283]               NaN
                           (5.283, 6.234]               NaN
                           (6.234, 7.587]               6.0
Central and Eastern Europe (2.692, 4.509]              10.0
                           (4.509, 5.283]              28.0
                           (5.283, 6.234]              46.0
                           (6.234, 7.587]               3.0
Eastern Asia               (2.692, 4.509]               NaN

51 / 52
Matplotlib NumPy Pandas

Pandas
Handling Missing Data (2)
I A NaN in the last column really means that no row is in this
quantile
I It thus makes more sense to replace NaN by 0
score = pd.qcut(wh["Happiness Score"], 4)
pd.pivot_table(wh, index=['Region', score], values="Happiness Score",
               aggfunc='count', fill_value=0).head(9)

                                            Happiness Score
Region                     Happiness Score
Australia and New Zealand  (2.692, 4.509]                 0
                           (4.509, 5.283]                 0
                           (5.283, 6.234]                 0
                           (6.234, 7.587]                 6
Central and Eastern Europe (2.692, 4.509]                10
                           (4.509, 5.283]                28
                           (5.283, 6.234]                46
                           (6.234, 7.587]                 3
Eastern Asia               (2.692, 4.509]                 0

52 / 52
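
The effect of fill_value can be reproduced on a tiny made-up table. This sketch uses a different layout than the slide (years as columns instead of quantile bins), chosen only because an empty cell is easy to see there.

```python
import pandas as pd

# Made-up data: region "B" has no entry for 2016.
df = pd.DataFrame({
    "Region": ["A", "A", "B"],
    "Year": [2015, 2016, 2015],
    "Happiness Score": [7.0, 6.0, 5.0],
})

with_nan = pd.pivot_table(df, index="Region", columns="Year",
                          values="Happiness Score", aggfunc="count")
with_zero = pd.pivot_table(df, index="Region", columns="Year",
                           values="Happiness Score", aggfunc="count",
                           fill_value=0)

print(with_nan)    # the empty cell (B, 2016) is NaN
print(with_zero)   # the same cell is now 0
```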
Software Engineering

Programming Fundamentals 1
Lesson 12

Pierre Kelsen

CSC Research Unit, MNO

2017

1 / 53
Software Engineering

Outline

Software Engineering

2 / 53
Software Engineering

Software Engineering
Software vs Programs

I There is more to software development than programming. Why?
I There is typically a difference in size
I You are used to developing small programs consisting of fewer
than 100 lines of code
I Typical commercial software often comprises thousands to tens
of thousands of lines of code
I What is a consequence of this?
I Software is usually developed by teams of developers
I Therefore management issues need to be tackled (among
others)

3 / 53
Software Engineering

Software Engineering
Software vs Programs (2)

I Other considerations to be taken into account:
I The software needs to be usable. Proper user interface design
is important.
I Software is developed for a customer. We need to make sure that
it satisfies the requirements of the customer.

4 / 53
Software Engineering

Software Engineering
Software Engineering

Definition [1]
Software Engineering is the set of techniques - including theories,
methods, processes, tools and languages - for developing and
operating production software meeting defined standards of quality.
Production software is operational software, intended to function
in real environments to solve real problems.

[1] Bertrand Meyer, Touch of Class, Springer, 2009
5 / 53
Software Engineering

Software Engineering
Production Software

I Constraints on production software:
I Quality constraints, e.g., the system must consistently and
efficiently deliver correct results
I Duration constraints: production software must often be
maintained over many years
I Team constraints: because of the size and complexity,
development usually has to be done in a team
I Impact constraints: malfunctions in the software may have
serious implications in the real world (examples?), hence the
importance of quality

6 / 53
Software Engineering

Software Engineering
Challenges of Software Engineering

I Describe: an important activity is the description of the
problem and of the solution
I Implement: the task of building the software; includes both
high-level design and programming of specific functionalities
I Assess: we need to assess the quality of the software as we
develop it
I Manage: we need to manage teams of developers; includes
setting deadlines, deliverables, scheduling tasks, coordinating
meetings with customers
I Operate: putting a system into service is a non-trivial task.

7 / 53
Software Engineering

Software Engineering
Software Quality

I Quality factors can be classified into two broad categories:
I Product quality: immediately related to the software that is
developed
I Process quality: characterizing the effectiveness of the software
development process
I Why should we care about process quality?
I It affects time of delivery, cost effectiveness, and the trust the
customer puts in the software

8 / 53
Software Engineering

Software Engineering
Product Quality

I Immediate Product Qualities
I Adequacy: does the software provide all the functionalities that
it is required to have?
I Correctness: does it fulfill those functionalities in a correct
way?
I Why is correctness hard to achieve?
Because writing correct programs is hard,
and writing complete specifications is hard as well.

9 / 53
Software Engineering

Software Engineering
Product Quality

I Immediate Product Qualities (continued)
I Robustness: how well does the software handle unforeseen
situations?
I Why is this an issue?
Because (e.g.) users can act in unpredictable ways, as can the
environment.
I How can we achieve it?
Using error handling and recovery mechanisms.
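
A minimal sketch of error handling for unpredictable user input. The function name parse_age and the accepted range 0 to 150 are assumptions made for this illustration only.

```python
def parse_age(text):
    """Return the age as an int, or None if the input is not usable.

    Hypothetical helper: the name and the range 0..150 are
    illustrative choices, not part of the lecture.
    """
    try:
        age = int(text.strip())
    except ValueError:
        # The user typed something that is not a whole number.
        return None
    if not 0 <= age <= 150:
        # A number, but not a plausible age.
        return None
    return age


print(parse_age(" 42 "))
print(parse_age("forty-two"))
print(parse_age("-5"))
```

The caller can then recover, for example by asking the user again, instead of the program crashing with a traceback.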

10 / 53
Software Engineering

Software Engineering
Product Quality

I Immediate Product Qualities (continued)
I Security: how well does the system react to hostile attempts to
break it or steal data from it?
I Efficiency: adequate use of time, memory, and other resources
I Especially critical on small devices
I Ease of use: make the system easy to use for various categories
of users

11 / 53
Software Engineering

Software Engineering
Product Quality

I Long-term Product Qualities
I Corrigibility: how easy is it to update the software to correct
deficiencies?
I Extendibility: how easy is it to add new features?
I Portability: can the software be easily deployed on another
platform?
I Reusability: can part of this software be reused for later
projects?
I Why is this important?
I Developing software is costly. Reuse lowers the cost
of software development.

12 / 53
Software Engineering

Software Engineering
Internal vs External Quality Factors

I The immediate and long-term quality factors above are also called
external quality factors because they directly affect the client
I Besides the external quality factors there are also internal
quality factors
I They characterize how well the software is written
I They can be expressed using software metrics
I Are there relations between internal and external quality
factors?
I Yes, internal factors often influence external factors. Can you
give some examples?
I Poorly written code is difficult to correct and to extend
I Poorly written code is likely to have errors and security issues
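
To give an idea of what a software metric looks like, here is a small sketch that measures three simple size metrics of Python source code using the standard ast module. The choice of metrics and the function name simple_metrics are assumptions for illustration; real projects use more refined measures.

```python
import ast


def simple_metrics(source):
    """Compute three illustrative size metrics for Python source text."""
    tree = ast.parse(source)
    functions = [node for node in ast.walk(tree)
                 if isinstance(node, ast.FunctionDef)]
    # Lines that are neither blank nor pure comments.
    code_lines = [line for line in source.splitlines()
                  if line.strip() and not line.strip().startswith("#")]
    # Length in lines of the longest function (0 if there is none).
    longest = max((f.end_lineno - f.lineno + 1 for f in functions),
                  default=0)
    return {"code_lines": len(code_lines),
            "functions": len(functions),
            "longest_function": longest}


sample = '''
def add(a, b):
    return a + b

# a comment
def mean(values):
    total = sum(values)
    return total / len(values)
'''

print(simple_metrics(sample))
```

Very long functions are a common warning sign: they tend to be harder to correct and extend, which links this internal factor to the external ones.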

13 / 53
Software Engineering

Software Engineering
Process Quality

I Process Qualities
I Production speed: ability to deliver the software quickly
I Cost effectiveness: how many resources does the development
consume? Main component is usually personnel costs which
can be expressed in person-months
I Collaboration effectiveness: effectiveness of developers working
together

14 / 53
Software Engineering

Software Engineering
Process Quality (2)

I Process Qualities (continued)


I Predictability: use of reliable methods to predict quality factors
ahead of time
I Reproducibility: can successful practices of a project be carried
over to future projects?
I Self-improvement: inclusion in the process of mechanisms to
improve the process itself

15 / 53
Software Engineering

Software Engineering
Software Development Activities

I Software development involves a number of tasks. Different
activities are defined to deal with these tasks
I Feasibility Analysis
I Is it possible to build the software system?
I Why would it not be possible?
I It may be too costly or take too long

16 / 53
Software Engineering

Software Engineering
Software Development Activities(2)

I Requirements analysis
I Defines the requirements of the system
I Both functional and non-functional requirements need to be
described
I Functional requirements: what is the system able to do?
I Non-functional requirements: have to do with how the system
should operate. Examples?
I E.g., performance needs to reach a certain level
I Software needs to integrate with existing systems
I Constraints on the choice of technologies
I Given levels of security and reliability

17 / 53
Software Engineering

Software Engineering
Software Development Activities(3)

I Specification
I Precise description of individual elements of the system
I Difference between specification and requirements analysis?
I Requirements are customer-oriented, the specification is
developer-oriented
I Difference in the amount of rigour and formalism (higher for
the specification document)

18 / 53
Software Engineering

Software Engineering
Software Development Activities(4)

I Design
I Builds the overall architecture of the system
I Defines the different parts/modules as well as their interaction
I Implementation: the actual programming of the system
I Can be model-driven: code generation from models may
automate parts of the coding process

19 / 53
Software Engineering

Software Engineering
Software Development Activities(5)

- Documentation
  - Documents the software system, for customers and developers
  - Documents the software process
  - May be partly automated
- Verification & Validation (V&V)
  - Making sure the system does the right things in the right way
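
As a small illustration of how documentation may be partly automated, the sketch below collects function signatures and docstrings into a plain-text reference. The function `area_of_rectangle` and the helper `summarize` are hypothetical names chosen for this example; only the standard `inspect` module is used.

```python
import inspect


def area_of_rectangle(width, height):
    """Return the area of a rectangle with the given width and height."""
    return width * height


def summarize(functions):
    """Build a plain-text reference from each function's signature and docstring."""
    lines = []
    for func in functions:
        signature = inspect.signature(func)
        doc = inspect.getdoc(func) or "(undocumented)"
        lines.append(f"{func.__name__}{signature}: {doc}")
    return "\n".join(lines)


reference = summarize([area_of_rectangle])
print(reference)
```

The documentation text lives next to the code it describes, so regenerating the reference after a change keeps the two in step.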

Lifecycle Models
- Lifecycle models address the question of how the above
  activities should be organized (e.g., in what sequence)
- The waterfall model is the traditional way to organize the
  activities
  - based on executing the above activities in the order given

About the Waterfall Model

- What are the drawbacks of the waterfall model?
  - Rigidity: assumes that one activity finishes completely before
    the next one starts
    - Why is this unrealistic?
    - We may trace errors back to earlier phases, which then need
      to be revisited
  - Late appearance of code
    - many problems can only be discovered when coding
    - The later a problem is found, the more expensive it is to fix
      (Why?)

The Spiral Model

About the Spiral Model

- Principles of the spiral model:
  - Focus is on risk minimization by breaking the project into
    smaller segments
  - Each cycle follows a sequence of steps similar to those in the
    waterfall model
  - Each cycle results in a prototype, to be evaluated by the
    customer
  - More flexible than the waterfall model, but the focus on
    prototypes may result in shipping an unfinished product

Agile Development

- Agile methods (including extreme programming) deemphasize
  plans and processes and put the focus on:
  - Working code as the main measure of progress
  - Frequent communication
  - Tests drive the development rather than specifications
  - Small increments of development and regular feedback from
    the customer
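
The idea that tests drive the development can be sketched as follows. The function `leap_year` and its test cases are illustrative assumptions, not taken from the course: the assertions state the expected behaviour first, and the function is then written until they pass.

```python
def leap_year(year):
    """Return True if `year` is a leap year in the Gregorian calendar."""
    # Written after the tests below, with just enough logic to make them pass.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def test_leap_year():
    # In test-driven development these checks exist before the implementation.
    assert leap_year(2016)
    assert not leap_year(2017)
    assert not leap_year(1900)  # divisible by 100 but not by 400
    assert leap_year(2000)      # divisible by 400


test_leap_year()
print("all tests passed")
```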

Requirements Engineering

- A requirements engineering process should produce two things:
  - A requirements document
  - A test plan describing how the future software will be tested
- Why is the early design of a test plan important?
  - to make sure it is driven by user needs rather than design
    choices

A Standard for Requirements Engineering

- The IEEE Computer Society has developed a standard (IEEE
  29148:2011) describing best practices in requirements
  engineering
  - relatively short document
  - you should read it before writing/reading your first
    requirements document

Scope of Requirements

- The system to be developed will generally be part of a larger
  system
  - an embedded system will interface with hardware
  - business software will have to be integrated with existing
    enterprise software
- The first question to answer is how large the scope of the
  requirements document will be
  - Will it include the surrounding system?
  - Generally it is sufficient to define the interface to other
    systems

Obtaining Requirements

- In rare cases obtaining requirements is straightforward
- More often it requires good negotiation skills because
  - customers often do not know how difficult it is to implement
    certain features
  - customers often do not have a precise idea of what they want
  - there may be conflicting views of what is required from the
    new system

The glossary

- Every requirements document should start out with a glossary
  - precisely defines the main concepts of the problem domain
  - Why is this important?
    - There has to be agreement on what basic terms mean
    - Developers often do not know the domain well

Twelve Properties of Good Requirements

- Ideally the defined requirements will have a number of
  properties that guarantee the quality of the requirements
  document
  - many of these are found in the IEEE standard
  - this list provides a good yardstick to assess the quality of a
    requirements document
  - very few requirements documents fulfill all of them

Property 1: Justification

- Property 1: each requirement should be justified
  - it can be traced back to the needs of some stakeholder
  - stakeholders are the people who are affected by the new
    system

Property 2: Correctness

- Property 2: the requirements should be correct
  - any system satisfying the requirements will meet the needs of
    the stakeholders
  - this is difficult to guarantee formally
  - it is therefore important to have customers formally approve
    the requirements

Property 3: Completeness

- Property 3: the requirements should be complete
  - they should cover all the needs of the stakeholders
  - difficult to realize in practice
  - a more practical approach:
    - we should define the effect of every command on the system
      state
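
The practical approach above, defining the effect of every command on the system state, can be checked mechanically when states and commands can be enumerated. A minimal sketch, assuming a hypothetical media player whose states, commands and transition table are invented for this example:

```python
from itertools import product

STATES = {"stopped", "playing", "paused"}
COMMANDS = {"play", "pause", "stop"}

# Requirements as a transition table: (state, command) -> next state.
# The entry for ("stopped", "pause") is deliberately left out.
TRANSITIONS = {
    ("stopped", "play"): "playing",
    ("stopped", "stop"): "stopped",
    ("playing", "play"): "playing",
    ("playing", "pause"): "paused",
    ("playing", "stop"): "stopped",
    ("paused", "play"): "playing",
    ("paused", "pause"): "paused",
    ("paused", "stop"): "stopped",
}


def missing_cases(states, commands, transitions):
    """Return the (state, command) pairs whose effect is not specified."""
    return sorted(pair for pair in product(states, commands)
                  if pair not in transitions)


print(missing_cases(STATES, COMMANDS, TRANSITIONS))  # → [('stopped', 'pause')]
```

An empty result would mean that the table says what happens for every command in every state.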

Property 4: Consistency

- Property 4: the requirements should be consistent
  - they should not contradict each other
  - again difficult to realize in practice because of the size and
    complexity of requirements documents
  - inconsistent requirements can often be traced back to
    conflicting needs of stakeholders
  - Note: an inconsistent requirements document can never be
    correct (Why?)
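
To make the notion of contradiction concrete, requirements can be modelled as predicates over a design parameter. The sketch below is an assumption-laden toy: the two requirements about a session timeout are invented, and they stand for conflicting needs of two stakeholders.

```python
# Each requirement is a predicate over one design parameter:
# the session timeout in minutes (a hypothetical example).
requirements = {
    "R1 (users): timeout is at least 30 minutes": lambda t: t >= 30,
    "R2 (security): timeout is at most 15 minutes": lambda t: t <= 15,
}


def satisfying_values(reqs, candidates):
    """Return the candidate values that satisfy every requirement."""
    return [t for t in candidates if all(check(t) for check in reqs.values())]


candidates = range(1, 121)  # timeouts from 1 to 120 minutes
print(satisfying_values(requirements, candidates))  # → []
```

No candidate value satisfies both requirements at once, so one of them has to be renegotiated.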

Property 5: Unambiguous

- Property 5: the requirements should be unambiguous
  - Why is this a challenge?
    - Because requirements documents are written in natural
      language (e.g., English)
    - Such languages are by their nature imprecise
  - Example of ambiguous requirement:

    All payments, regardless of the amount, can be made in
    contactless mode. Below EUR 25, there is no need to enter your
    PIN. Above EUR 25, you will be asked for the PIN.
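
One possible ambiguity in the example requirement is that it does not say what happens for a payment of exactly EUR 25. The sketch below encodes two readings that a developer might choose; the function names are hypothetical and neither reading is claimed to be the intended one.

```python
PIN_LIMIT_EUR = 25


def pin_required_reading_a(amount):
    """Reading A: the PIN is requested only for amounts strictly above the limit."""
    return amount > PIN_LIMIT_EUR


def pin_required_reading_b(amount):
    """Reading B: the PIN is requested for the limit itself and anything above."""
    return amount >= PIN_LIMIT_EUR


for amount in (24.99, 25, 25.01):
    a = pin_required_reading_a(amount)
    b = pin_required_reading_b(amount)
    status = "agree" if a == b else "DISAGREE"
    print(f"EUR {amount}: A={a}, B={b} -> {status}")
```

Both implementations satisfy the text of the requirement, yet they behave differently at the boundary.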

Property 6: Feasibility

- Property 6: Requirements should be feasible
  - Can you think of a non-feasible requirement?
    - A non-functional requirement imposing unrealistic
      performance constraints on certain operations

Property 7: Verifiability

- Property 7: the requirements should be verifiable
  - There should be a clear criterion to decide whether a system
    meets a requirement.
  - Example of non-verifiable requirement:
    - The system shall respond in real-time
    - Not defined clearly enough to be verifiable
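
For contrast, a verifiable variant of the requirement above could state a measurable bound, for example "the system shall respond within 0.2 seconds". The threshold, the stand-in function `handle_request` and the helper `meets_response_time` are assumptions made for this sketch.

```python
import time

MAX_RESPONSE_SECONDS = 0.2  # assumed threshold, chosen only for illustration


def handle_request(payload):
    """Stand-in for the system under test (hypothetical)."""
    return payload.upper()


def meets_response_time(func, payload, limit):
    """Return True if one call to `func` completes within `limit` seconds."""
    start = time.perf_counter()
    func(payload)
    elapsed = time.perf_counter() - start
    return elapsed <= limit


print(meets_response_time(handle_request, "ping", MAX_RESPONSE_SECONDS))
```

With a number in the requirement, the check has a clear pass/fail criterion; a single measurement is of course only a starting point for a real performance test.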

Property 8: Interfaced

- Property 8: the requirements should be interfaced
  - they should precisely describe how the system interacts with
    existing systems

Property 9: Priorities

- Property 9: the requirements should be prioritized
  - Why is this important?
    - The given time and budget constraints (or other unexpected
      difficulties) may not allow all requirements to be realized

Property 10: Understandability

- Property 10: the requirements should be understandable
  - A requirements document that is difficult to read and
    understand will not be a good basis for development

Property 11: Endorsement

- Property 11: the requirements should be endorsed
  - It is important to have customers formally approve a
    requirements document

Verification and Validation

- We have already seen two techniques in this course for
  verification and validation.
  - Testing
  - Debugging
- These are called dynamic techniques since they are based on
  executing the code
- We now review some static techniques (based on examining
  the source code)

Design and Code Reviews

- Design and code reviews are a manual process (meaning?)
  designed to uncover faults and other deficiencies
- The target is some software element, for example,
  - a code module
  - a design document
  - a chapter from a user manual
- Text is circulated in advance and then discussed in a meeting
  to uncover possible problems

Design and Code Reviews (2)

- Design and code reviews are not an effective tool for
  systematic detection of faults
  - they can be viewed as a spot check
  - they should be complemented by other more systematic
    investigations (e.g., testing)
- Design and code reviews can be useful in assessing design and
  code practices (and possibly improving them)

Static Analysis

- A more systematic V&V practice is based on static analysis
- Static analysis is already done to some extent by the compiler
  - detects syntax errors
- Special tools called static analyzers look for code patterns
  that may be faulty

Suspicious Code Patterns

- Here are some examples of code patterns that a static
  analyzer may flag:
  - Variables that can on some execution path be accessed
    without first being set
  - Variables that are not used
  - Missing return statements
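
A toy version of one such check, variables that are not used, can be sketched with Python's standard `ast` module. This is a deliberately simplified illustration, not how production analyzers work: it ignores scopes and treats the whole source as one namespace. The analyzed function `average` is invented for the example, and it is only parsed, never executed.

```python
import ast

SOURCE = '''
def average(values):
    total = sum(values)
    count = len(values)
    unused = 42
    return total / count
'''


def unused_variables(source):
    """Return names that are assigned in `source` but never read."""
    assigned, read = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                read.add(node.id)
    return sorted(assigned - read)


print(unused_variables(SOURCE))  # → ['unused']
```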

Completeness of V&V Techniques

- None of the techniques we have described so far is complete
  in the following sense
  - They can show the presence of errors but cannot prove their
    absence
  - Can you think of a technique that could prove the absence of
    errors?
    - We could try to formally prove that the software is correct
    - This is called program proving
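
The limitation of testing described above can be seen on a small example. The function `absolute` below is a hypothetical example with a deliberate bug: every test that was written passes, yet one untested input still gives a wrong result.

```python
def absolute(x):
    """Intended to return |x| (contains a deliberate bug for x == 0)."""
    if x > 0:
        return x
    if x < 0:
        return -x
    # Bug: no return for x == 0, so the function returns None.


# These tests all pass, yet the function is still wrong.
assert absolute(5) == 5
assert absolute(-3) == 3
assert absolute(-0.5) == 0.5

print(absolute(0))  # → None
```

Passing tests only show that no error was found for the inputs that were tried.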

Program Proving

- For program proving to be applicable we need to have a
  mathematical description of what it means for the software to
  be correct
- Once such a description has been established, one can try to
  prove correctness using an automated tool such as an
  interactive theorem prover
  - requires human intervention
  - a completely automated solution is not possible in general
- Because of the difficulty of establishing a precise
  mathematical description of the software, program proving is
  rarely used in practice

Other Static Techniques

- There exist other formal static techniques that do not attempt
  to establish full correctness but focus on some aspects of the
  system
  - Model checking
  - Abstract interpretation
- Both of these techniques attempt to analyze a simplified
  version of the program

Model Checking

- Model checking explores the state space of a program
  - Usually the state space is very large
  - For that reason model checking uses special techniques to
    traverse the state space
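
Exploring a state space can be sketched on a system small enough to enumerate completely. The example below is an invented toy, not a real model checker: two traffic lights at a crossing, a plain breadth-first search over the reachable states, and the property "the lights are never both green". Real tools rely on far more elaborate techniques to cope with large state spaces.

```python
from collections import deque


def successors(state, safe):
    """Yield the states reachable in one step from `state`.

    A state is a pair (north_south, east_west) of light colours.
    With `safe=True` a light may only turn green while the other is red.
    """
    for i in (0, 1):
        other = state[1 - i]
        if state[i] == "green":
            new_colour = "red"
        elif not safe or other == "red":
            new_colour = "green"
        else:
            continue  # turning green is blocked by the controller
        new_state = list(state)
        new_state[i] = new_colour
        yield tuple(new_state)


def find_violation(initial, safe):
    """Search the reachable states breadth-first for one with both lights green."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if state == ("green", "green"):
            return state
        for nxt in successors(state, safe):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # every reachable state satisfies the property


print(find_violation(("red", "red"), safe=True))   # → None
print(find_violation(("red", "red"), safe=False))  # → ('green', 'green')
```

Because every reachable state is visited, a result of `None` is a guarantee for this model, not just the outcome of a few test runs.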

Abstract Interpretation

- Abstract interpretation is a two-step process:
  - derive a simpler, more abstract version of a program (by
    leaving out “irrelevant details”)
  - apply static analysis techniques to the simplified version
- Intuition about abstracting information:
  - How can we check whether a student is missing in today’s
    lecture?
    - by checking the student ids of all the students registered for
      this class
    - and also checking the student ids of the students present in
      this room
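
Just as the student check keeps only the ids and ignores everything else about a student, an analysis can keep only one aspect of a program's values. The sketch below is a minimal illustration under that assumption: integers are abstracted to their sign, and arithmetic is carried out on signs alone. The names `sign_of`, `abstract_mul` and `abstract_add` are invented for this example.

```python
NEG, ZERO, POS, UNKNOWN = "neg", "zero", "pos", "unknown"


def sign_of(n):
    """Abstraction: keep only the sign of a concrete integer."""
    return POS if n > 0 else NEG if n < 0 else ZERO


def abstract_mul(a, b):
    """Sign of a product, computed from the signs of the factors alone."""
    if ZERO in (a, b):
        return ZERO
    if UNKNOWN in (a, b):
        return UNKNOWN
    return POS if a == b else NEG


def abstract_add(a, b):
    """Sign of a sum; mixed signs cannot be decided without the actual values."""
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else UNKNOWN


# For any negative x and positive y, x * y + x is negative.
# This is established without knowing the concrete values of x and y.
x, y = NEG, POS
print(abstract_add(abstract_mul(x, y), x))  # → neg
print(abstract_add(POS, NEG))               # → unknown
```

The answer `unknown` shows the price of abstraction: the simplified version is easier to analyze but sometimes cannot decide.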

The End

print('This is the last slide of this course')
print('I hope you enjoyed it!')
print('Remember to do the course evaluations!')