Download as pdf or txt
Download as pdf or txt
You are on page 1of 47

INF1002F

Strings
Richard Maliwatu ~richard.maliwatu@uct.ac.za

Department of Information Systems

1
Recap of previous session
• What is iteration?
• Counter-controlled loops
• Condition-controlled loops
• The range function
• Nested loops

2
Objectives
• String type
• String operations
• Read/Convert
• String library
• Indexing strings [] • String comparisons
• Slicing strings [2:4] • Searching in strings
• Looping through strings • Replacing text
with for and while • Stripping white space
• Concatenating strings with +

3
String data type
• A string is a sequence of characters (letters, digits, symbols)
• A string literal are enclosed in quotes
'Hello' or "Hello"
• For strings, + means “concatenate”
• When a string contains numbers, it is still a string
• We can convert numbers in a string into a number using int()
• All strings are objects so have methods for various useful functions.

4
How the computer stores strings
• Strings are sequences of characters.
• A character is internally a single number representing some symbol.
• The mapping from numbers to symbols is called Unicode - it is a
standard for electronic data.
• Unicode has thousands of symbols defined to cater for all living
languages!

Symbol ... A B C …. a b c ...


Unicode number ... 65 66 67 ... 97 98 99 ...

5
...and some non-living languages

6
Internal data vs Input/Output
• What the computer stores
72 101 108 108 111 32 87 111 114 108 100

• What the user sees on the screen


H e l l o W o r l d

7
Actually …
• What the computer stores:
1001000 1100101 1101100 1101100 1101111 100000 1010111 1101111 1110010 1101100 1100100

• What the user sees on the screen:


H e l l o W o r l d

8
Reading and converting
• We prefer to read data in using strings and then parse and convert
the data as we need.
• This gives us more control over error situations and/or bad user input
• Input numbers must be converted from strings.

num1 = input("Enter number: ")


num1 = eval(num1)

9
Converting to/from numbers
• ord ("a")
▪ return the Unicode number for 1-character string "a"
• chr(97)
▪ returns the 1-character string with Unicode symbol 97
• int ("1234")
▪ returns the integer value 1234
▪ note: we can also use eval() and float()
• str (1234)
▪ returns the string value "1234"

10
Looking inside strings
B A N A N A
• We can get at any single
character in a string using an 0 1 2 3 4 5
index specified in square
brackets >>> fruit = 'banana'
>>> letter = fruit[1]
• The index value must be an
>>> print(letter)
integer and starts at zero
A
• The index value can be an >>> x = 3
expression that is computed
>>> w = fruit[x - 1]
>>> print(w)
N
11
A character too far
• You will get a python error if >>> zen = 'abc'
you attempt to index beyond >>> print(zen[5])
the end of a string Traceback (most recent
• So be careful when call last): File
constructing index values and "<stdin>", line 1, in
slices <module>
IndexError: string
index out of range
>>>

12
Strings have length
B A N A N A
• The built-in function len ()
gives us the length of a 0 1 2 3 4 5
string
>>> fruit = 'banana'
>>> print (len(fruit))
>>> print(letter)
6

13
Iterating over strings
• Using a while fruit = 'banana'
statement, an index = 0
iteration variable, and while index < len(fruit):
the len() function, we letter = fruit[index]
can construct a loop
print(index, letter)
to look at each of the
letters in a string index = index + 1
individually Output:
0b
1a
2n
3a
4n
5a 14
Iterating over strings (cont'd)
• A definite loop using a fruit = 'banana'
for statement is much for letter in fruit:
more elegant print(letter)
• The iteration variable
is completely taken
care of by the for loop Output:

b
a
n
a
n
a
15
Looping and counting
This is a simple loop that word = 'banana'
loops through each letter count = 0
in a string and counts the for letter in word :
number of times the loop if letter == 'a' :
encounters the 'a'
count = count + 1
character
print(count)

16
Looking deeper into in
• The iteration variable
“iterates” through the
sequence (ordered set)
Iteration variable Six-character string
• The block (body) of code
is executed once for each
value in the sequence
• The iteration variable for letter in 'banana' :
moves through all of the Print(letter)
values in the sequence

17
More string operations ...

18
Slicing strings [1/2] Mo n t y P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11
• We can also look at any
continuous section of a
string using a colon >>> s = 'Monty Python'
operator >>> print(s[0:4])
• The second number is one Mont
beyond the end of the slice - >>> print(s[6:7])
“up to but not including” P
• If the second number is >>> print(s[6:20])
beyond the end of the Python
string, it stops at the end

19
Slicing strings [2/2] Mo n t y P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11
• If we leave off the first
number or the last number
of the slice, it is assumed to >>> s = 'Monty Python'
be the beginning or end of >>> print(s[:2])
the string respectively Mo
>>> print(s[8:])
thon
>>> print(s[:])
Monty Python

20
String slicing quiz
What string is generated from the following expressions?
Assume greet = “Hello Bob”
1. greet[0:3]
2. greet[5:9]
3. greet[:5]
4. greet[5:]
5. greet[:]

21
H e l l 0 B o b
String slicing quiz 0 1 2 3 4 5 6 7 8

What string is generated from the following expressions?


Assume greet =“Hello Bob”
Strings generated
1. greet[0:3]
2. greet[5:9] 1. Hel
3. greet[:5] 2. Bob
3. Hello
4. greet[5:]
4. Bob
5. greet[:] 5. Hello Bob

22
More string slicing examples
The third argument to the slice
a = "Jabberwocky” operation denotes the step
size/amount through the string
• b = a[::2] # b = 'Jbewcy'
• c = a[::-2] # c = 'ycwebJ'
• d = a[0:5:2] # d = 'Jbe'
• e = a[5:0:-2] # e = 'rba'
• f = a[:5:1] # f = 'Jabbe'
• g = a[:5:-1] # g = 'ykcow'
• h = a[5::1] # h = 'rwocky'
• i = a[5::-1] # i = 'rebbaJ'
• j = a[5:0:-1] # j = 'rebba'

23
String concatenation
• When the + operator >>> a = 'Hello'
is applied to strings, it >>> b = a + 'There'
means >>> print(b)
“concatenation” HelloThere
>>> c = a + ' ' + 'There'
>>> print(c)
Hello There
>>>

24
Using in as a logical operator
>>> fruit = 'banana'
• The in keyword can also >>> 'n' in fruit
be used to check to see if True
one string is “in” another >>> 'm' in fruit
string
False
• The in expression is a >>> 'nan' in fruit
logical expression that True
returns True or False and
>>> if 'a' in fruit :
can be used in an if
statement ... print('Found it!')
...
Found it!
>>> 25
String comparison
if word == 'banana':
print('All right, bananas.')
if word < 'banana':
print('Your word,' + word + ', comes before banana. ')
elif word > 'banana':
print('Your word,' + word + ', comes after banana.')
else:
print('All right, bananas.')

26
Objects
• Objects are a special data type, including both the actual data and
functions (called methods) that operate on the data. All Python
strings are objects.
• Examples of string functions/methods include:
▪ count, find, join, lower, ...
• For non-object functions, we specify the string as a parameter:
▪ len (s)
• For object function (methods), we call the function on the string using
a dot and then the function name (aka dot-notation):
▪ s.lower ()
27
String library
• Python has several string functions >>> greet = 'Hello Bob'
which are in the string library. >>> zap = greet.lower()
>>> print(zap)
• These functions are already built hello bob
into every string - we invoke them >>> print(greet)
by appending the function to the Hello Bob
string variable. >>> print('Hi There'.lower())
• These functions do not modify the hi there
original string, instead they return >>>
a new string that has been altered.

28
>>> stuff = 'Hello world'
>>> type(stuff)
<class 'str'>
>>> dir(stuff)
['capitalize', 'casefold', 'center', 'count',
'encode', 'endswith', 'expandtabs', 'find',
'format', 'format_map', 'index', 'isalnum',
'isalpha', 'isdecimal', 'isdigit', 'isidentifier',
'islower', 'isnumeric', 'isprintable', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower',
'lstrip', 'maketrans', 'partition', 'replace',
'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit',
'rstrip', 'split', 'splitlines', 'startswith',
'strip', 'swapcase', 'title', 'translate', 'upper',
'zfill']

https://docs.python.org/3/library/stdtypes.html#string-methods 29
30
String library (cont'd)
str.capitalize()
str.center(width[, fillchar])
str.endswith(suffix[, start[, end]])
str.find(sub[, start[, end]])
str.lstrip([chars])
str.replace(old, new[, count])
str.lower()
str.rstrip([chars])
str.strip([chars])
str.upper()

31
Searching a string
• We use the find() function to b a n a n a
0 1 2 3 4 5
search for a substring within
another string >>> fruit = 'banana'
• find() finds the first occurrence >>> pos = fruit.find('na')
of the substring and returns the >>> print(pos)
position 2
>>> aa = fruit.find('z')
• If the substring is not found, >>> print(aa)
find() returns -1 -1
• Remember that string position
starts at zero
32
Making everything UPPER CASE
• You can make a copy of a >>> greet = 'Hello Bob'
string in lower case or >>> nnn = greet.upper()
upper case >>> print(nnn)
• Often when we are HELLO BOB
searching for a string using >>> www = greet.lower()
find() we first convert the >>> print(www)
string to lower case so we
can search a string hello bob
regardless of case >>>

33
Search and replace
• The replace() >>> greet = 'Hello Bob'
function is like a >>> nstr = greet.replace('Bob','Jane')
“search and >>> print(nstr)
replace” operation
in a word processor Hello Jane
>>> nstr = greet.replace('o','X')
• It replaces all >>> print(nstr)
occurrences of the
search string with HellX BXb
the replacement >>>
string.

34
Stripping whitespace
• Sometimes we want >>> greet = ' Hello Bob '
to take a string and
remove whitespace >>> greet.lstrip()
at the beginning 'Hello Bob '
and/or end >>> greet.rstrip()
• lstrip() and rstrip() ' Hello Bob'
remove whitespace >>> greet.strip()
at the left or right 'Hello Bob'
• strip() removes >>>
both beginning and
ending whitespace

35
Prefixes
>>> line = 'Please have a nice day'
>>> line.startswith('Please')
True
>>> line.startswith('p')
False

36
find syntax:
find(<substring>, <starting position>, <ending position>)

Parsing and extracting


>>> data = 'From richard.maliwatu@uct.ac.za Sat Jan 5 09:14:16 2023'
>>> atpos = data.find('@')
>>> print(atpos)
21
>>> sppos = data.find(' ',atpos)
>>> print(sppos)
31
>>> host = data[atpos+1 : sppos]
>>> print(host)
uct.ac.za
21 31
From richard.maliwatu@uct.ac.za Sat Jan 5 09:14:16 2023
37
<intentionally left blank>

38
To print single/double quotes
• print("SA rugby team, "Springboks " ")
Output: SyntaxError: invalid syntax

• Escape the quotes, e.g.


print("SA rugby team, \"Springboks \" ")

Output: SA rugby team, "Springboks "

39
Escape characters
• Escape characters can be classified as non-printable characters when
backslash precedes them. The print statements do not print escape
characters.

40
Escape characters
Code Description Example Output
\\ backslash print ("\\") \
\n New line print("SA rugby team, \n 2023") SA rugby team,
2023
\r Carriage return print("SA rugby team, \r 2023") 2023gby team
\t Tab print("\t SA rugby team, 2023") SA rugby team, 2023
\b Backspace print("SA rugby team, \b2023") SA rugby team,2023
\f Form feed print(" SA rugby team, \f 2023") SA rugby team,
2023

41
String formatting
• Use a format language to specify a template and expressions to fit
into the template.
• General syntax:
<template string>.format (<var1>, <var2>, …)
• Python has multiple formatting approaches – this is one!

42
String formatting example 1
• Print two variables in order
print("{0} {1}".format ("one", "two"))
Output: one two
• Print two variable in reverse order
print("{1} {0}".format ("one", "two"))
Output: two one
• Left-aligned in fixed width
print("{0:<20}".format ("abc"))
Output: abc

43
String formatting example 2
• Right-aligned in fixed width
print("{0:>20}".format ("abc"))
Output: abc
• center-aligned in fixed width
print("{0:^20}".format ("abc"))
Output: abc
• Floating point number rounding
print("{0:5.3f}".format (1.23456789))
#5 implies 5 indent spaces; 3 implies round to 3 decimal points
Output: 1.235

44
String formatting example 3
• Printing a message that is not a fixed string
name = "Jane"
age = 23
print("Hello! My name is %s." % name)
Output: Hello! My name is Jane.
print("Hello! My name is %s and I am %d years old." % (name, age))
Output: Hello! My name is Jane and I am 23 years old.

45
Problems
1. Write a program to print out the reverse of a sentence. For
example: “hello” becomes “olleh”. Use first principles - i.e., process
the string character-by-character.
2. Suppose we have a variable containing: “the quick brown
fox jumps over the lazy dog” Write a program to
extract the colour of the quick fox from the sentence using only
string manipulations.

46
End

47

You might also like