Download as pdf or txt
Download as pdf or txt
You are on page 1of 31

Strings

Chapter 6

reachus@cloudxlab.com
String Data Type >>> str1 = "Hello"
>>> str2 = 'there'
>>> bob = str1 + str2
• A string is a sequence of characters >>> print (bob)
Hellothere
• A string literal uses quotes >>> str3 = '123'
'Hello' or "Hello" >>> str3 = str3 + 1
Traceback (most recent call
• For strings, + means “concatenate” last): File "<stdin>", line
1, in <module>TypeError:
• When a string contains numbers, it is cannot concatenate 'str' and
still a string 'int' objects
>>> x = int(str3) + 1
• We can convert numbers in a string >>> print (x)
into a number using int() 124
>>>

reachus@cloudxlab.com
Reading and >>> name = input('Enter:')
Converting Enter:Chuck
>>> print (name)
Chuck
>>> apple = input('Enter:')
• We prefer to read data in using
Enter:100
strings and then parse and
>>> x = apple – 10
convert the data as we need
Traceback (most recent call
last): File "<stdin>", line 1,
• This gives us more control over
in <module>TypeError:
error situations and/or bad user
unsupported operand type(s) for
input
-: 'str' and 'int'
• Raw input numbers must be >>> x = int(apple) – 10
>>> print (x)
converted from strings
90

reachus@cloudxlab.com
Looking Inside Strings
• We can get at any single character b a n a n a
in a string using an index specified
in square brackets 0 1 2 3 4 5

• The index value must be an integer


>>> fruit = 'banana'
>>> letter = fruit[1]
and starts at zero >>> print (letter)
a
>>> x = 3
• The index value can be an >>>
>>>
w = fruit[x - 1]
print (w)
expression that is computed n
reachus@cloudxlab.com
A Character Too Far

• You will get a python error if you >>> zot = 'abc'


>>> print (zot[5])
attempt to index beyond the Traceback (most recent call
end of a string. last): File "<stdin>", line
1, in <module>IndexError:

• So be careful when constructing string index out of range


>>>
index values and slices

reachus@cloudxlab.com
Strings Have Length

b a n a n a
• There is a built-in function len that 0 1 2 3 4 5
gives us the length of a string
>>> fruit = 'banana'
>>> print (len(fruit))
6

reachus@cloudxlab.com
Len Function
>>> fruit = 'banana' A function is some stored
>>> x = len(fruit) code that we use. A
>>> print (x) function takes some input
6 and produces an output.

'banana' len() 6
(a (a number)
string) function

Guido wrote this code


reachus@cloudxlab.com
Len Function
>>> fruit = 'banana' A function is some stored
>>> x = len(fruit) code that we use. A
>>> print (x) function takes some
6 input and produces an
output.
def len(inp):
'banana' blah
6
blah
(a for x in y: (a number)
string) blah
blah

reachus@cloudxlab.com
Looping Through Strings

• Using a while statement and fruit = 'banana'


0b
1a
an iteration variable, and the index = 0
while index < len(fruit): 2n
len function, we can letter = fruit[index] 3a
construct a loop to look at print (index, letter) 4n
each of the letters in a string index = index + 1
5a
individually

reachus@cloudxlab.com
Looping Through Strings

• A definite loop using a for b


statement is much more a
elegant fruit = 'banana' n
for letter in fruit: a
• The iteration variable is print (letter) n
completely taken care of by a
the for loop

reachus@cloudxlab.com
Looping Through Strings

• A definite loop using a for fruit = 'banana'


for letter in fruit : b
statement is much more print (letter) a
elegant n
a
• The iteration variable is index = 0
while index < len(fruit) :
n
completely taken care of by letter = fruit[index]
a
the for loop print (letter)
index = index + 1

reachus@cloudxlab.com
Looping and Counting

• This is a simple loop that word = 'banana'


loops through each letter in a count = 0
string and counts the number for letter in word :
of times the loop encounters if letter == 'a' :
the 'a' character count = count + 1
print (count)

reachus@cloudxlab.com
Looking deeper into in
• The iteration variable
“iterates” through the
sequence (ordered set)
Six-character string
Iteration variable
• The block (body) of code is
executed once for each for letter in 'banana' :
value in the sequence
print (letter)
• The iteration variable moves
through all of the values in
the sequence

reachus@cloudxlab.com
Yes No b a n a n a
Done? Advance letter

print letter

for letter in 'banana' :


print (letter)

The iteration variable “iterates” through the string and the block
(body) of code is executed once for each value in the sequence
reachus@cloudxlab.com
M o n t y P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11
• We can also look at any
continuous section of a string >>> s = 'Monty Python'
using a colon operator >>> print (s[0:4])
Mont
• The second number is one >>> print (s[6:7])
beyond the end of the slice - P
“up to but not including”
>>> print (s[6:20])
• If the second number is Python
beyond the end of the string, it
stops at the end
Slicing Strings
reachus@cloudxlab.com
M o n t y P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11

>>> s = 'Monty Python'


• If we leave off the first >>> print (s[:2])
number or the last number Mo
of the slice, it is assumed to >>> print (s[8:])
be the beginning or end of thon
the string respectively >>> print (s[:])
Monty Python

Slicing Strings
reachus@cloudxlab.com
String Concatenation
>>> a = 'Hello'
• When the + operator is >>> b = a + 'There'
>>> print (b)
applied to strings, it
HelloThere
means “concatenation”
>>> c = a + ' ' + 'There'
>>> print (c)
Hello There
>>>

reachus@cloudxlab.com
Using in as a logical Operator
• The in keyword can also be >>> fruit = 'banana'
>>> 'n' in fruit
used to check to see if one True
string is “in” another string >>> 'm' in fruit
False

• The in expression is a >>> 'nan' in fruit


True
logical expression that >>> if 'a' in fruit :
returns True or False and ... print ('Found it!')
...
can be used in an if Found it!
statement >>>

reachus@cloudxlab.com
String Comparison
if word == 'banana':
print ('All right, bananas.')

if word < 'banana':


print ('Your word,' + word + ', comes before banana.')
elif word > 'banana':
print ('Your word,' + word + ', comes after banana.')
else:
print ('All right, bananas.')

reachus@cloudxlab.com
String Library
• Python has a number of string
functions which are in the string >>> greet = 'Hello Bob'
library >>> zap = greet.lower()
>>> print (zap)
• These functions are already built into
hello bob
every string - we invoke them by
>>> print (greet)
appending the function to the string
Hello Bob
variable
>>> print ('Hi There'.lower())
• These functions do not modify the hi there
original string, instead they return a >>>
new string that has been altered
reachus@cloudxlab.com
>>> stuff = 'Hello world'
>>> type(stuff)
<type 'str'>
>>> dir(stuff)
['capitalize', 'center', 'count', 'decode', 'encode',
'endswith', 'expandtabs', 'find', 'format', 'index',
'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower',
'lstrip', 'partition', 'replace', 'rfind', 'rindex',
'rjust', 'rpartition', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase',
'title', 'translate', 'upper', 'zfill']

https://docs.python.org/2/library/stdtypes.html#string-methods
reachus@cloudxlab.com
reachus@cloudxlab.com
String Library

str.capitalize() str.replace(old, new[, count])


str.center(width[, fillchar]) str.lower()
str.endswith(suffix[, start[, end]]) str.rstrip([chars])
str.find(sub[, start[, end]]) str.strip([chars])
str.lstrip([chars]) str.upper()

reachus@cloudxlab.com
Searching a String
b a n a n a
• We use the find() function to search
for a substring within another string 0 1 2 3 4 5
• find() finds the first occurrence of
>>> fruit = 'banana'
the substring
>>> pos = fruit.find('na')
>>> print (pos)
• If the substring is not found, find() 2
returns -1 >>> aa = fruit.find('z')
>>> print (aa)
• Remember that string position -1
starts at zero

reachus@cloudxlab.com
Making everything UPPER CASE
• You can make a copy of a string >>> greet = 'Hello Bob'
in lower case or upper case >>> nnn = greet.upper()
>>> print (nnn)
• Often when we are searching for HELLO BOB
a string using find() - we first >>> www = greet.lower()
convert the string to lower case >>> print (www)
so we can search a string hello bob
regardless of case >>>

reachus@cloudxlab.com
Search and Replace
• The replace()
function is like a
>>> greet = 'Hello Bob'
“search and replace” >>> nstr = greet.replace('Bob','Jane')
operation in a word >>> print (nstr)
processor Hello Jane
>>> nstr = greet.replace('o','X')

• It replaces all
>>> print (nstr)
HellX BXb
occurrences of the >>>
search string with the
replacement string
reachus@cloudxlab.com
Stripping Whitespace
• Sometimes we want to take a
string and remove whitespace at >>> greet = ' Hello Bob '
the beginning and/or end >>> greet.lstrip()
'Hello Bob '

• lstrip() and rstrip() remove >>> greet.rstrip()


' Hello Bob'
whitespace at the left or right >>> greet.strip()
'Hello Bob'

• strip() removes both beginning


>>>

and ending whitespace


reachus@cloudxlab.com
Prefixes

>>> line = 'Please have a nice day'


>>> line.startswith('Please')
True
>>> line.startswith('p')
False

reachus@cloudxlab.com
Parsing and Extracting
21 31

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008

>>> data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'


>>> atpos = data.find('@')
>>> print (atpos)
21
>>> sppos = data.find(' ',atpos)
>>> print (sppos)
31
>>> host = data[atpos+1 : sppos]
>>> print (host)
uct.ac.za

reachus@cloudxlab.com
Summary
• String type • String operations
• Read/Convert • String library
• Indexing strings [] • String Comparisons
• Slicing strings [2:4] • Searching in strings
• Looping through strings • Replacing text
with for and while • Stripping white space
• Concatenating strings with +

reachus@cloudxlab.com
Acknowledgements / Contributions
These slides are Copyright 2010- Charles R. Severance
...
(www.dr-chuck.com) of the University of Michigan School of
Information and open.umich.edu and made available under a
Creative Commons Attribution 4.0 License. Please maintain this
last slide in all copies of the document to comply with the
attribution requirements of the license. If you make a change,
feel free to add your name and organization to the list of
contributors on this page as you republish the materials.

Initial Development: Charles Severance, University of Michigan


School of Information

… Insert new Contributors and Translators here

You might also like