
Strings

Chapter 6

Python for Informatics: Exploring Information


www.pythonlearn.com
String Data Type

• A string is a sequence of characters
• A string literal uses quotes: 'Hello' or "Hello"
• For strings, + means “concatenate”
• When a string contains numbers, it is still a string
• We can convert numbers in a string into a number using int()

>>> str1 = "Hello"
>>> str2 = 'there'
>>> bob = str1 + str2
>>> print bob
Hellothere
>>> str3 = '123'
>>> str3 = str3 + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
>>> x = int(str3) + 1
>>> print x
124
>>>
Reading and Converting

• We prefer to read data in using strings and then parse and convert the data as we need
• This gives us more control over error situations and/or bad user input
• Raw input numbers must be converted from strings

>>> name = raw_input('Enter:')
Enter:Chuck
>>> print name
Chuck
>>> apple = raw_input('Enter:')
Enter:100
>>> x = apple - 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'int'
>>> x = int(apple) - 10
>>> print x
90
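
One way to get the "more control over error situations" mentioned above is to wrap the conversion in try/except. This is only a sketch: parse_int and its default parameter are hypothetical names, and the input strings are hard-coded so the example runs without a keyboard (a real program would read them with raw_input).

```python
def parse_int(text, default=None):
    """Return int(text), or default when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as 'ten'
        return default

# Simulated user input (hard-coded here instead of calling raw_input)
good = parse_int('100')       # 100
bad = parse_int('ten')        # None, so the caller can ask again
padded = parse_int(' 42 ')    # 42: int() ignores surrounding whitespace
x = good - 10                 # 90, arithmetic works once we have an int
```

Returning a default instead of crashing lets the calling code decide what to do with bad input.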
Looking Inside Strings

b a n a n a
0 1 2 3 4 5

• We can get at any single character in a string using an index specified in square brackets
• The index value must be an integer and starts at zero
• The index value can be an expression that is computed

>>> fruit = 'banana'
>>> letter = fruit[1]
>>> print letter
a
>>> x = 3
>>> w = fruit[x - 1]
>>> print w
n
A Character Too Far

• You will get a Python error if you attempt to index beyond the end of a string
• So be careful when constructing index values and slices

>>> zot = 'abc'
>>> print zot[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>>
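
A possible way to stay inside the string is to compare the index with len() before using it. The helper name char_at and its default parameter are made up for this sketch; it only accepts indexes from 0 up to len(text) - 1.

```python
def char_at(text, index, default=''):
    """Return text[index], or default when index is past either end."""
    # Valid positions run from 0 to len(text) - 1
    if 0 <= index < len(text):
        return text[index]
    return default

zot = 'abc'
inside = char_at(zot, 1)     # 'b'
outside = char_at(zot, 5)    # '' instead of an IndexError
last = char_at(zot, len(zot) - 1)   # 'c', the last valid position
```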
Strings Have Length

b a n a n a
0 1 2 3 4 5

• There is a built-in function len that gives us the length of a string

>>> fruit = 'banana'
>>> print len(fruit)
6
Len Function

>>> fruit = 'banana'
>>> x = len(fruit)
>>> print x
6

A function is some stored code that we use. A function takes some input and produces an output.

'banana' (a string) -> len() function -> 6 (a number)

Guido wrote this code


Len Function

'banana' (a string) -> len() -> 6 (a number)

Inside len() is stored code, shown schematically:

def len(inp):
    blah
    blah
    for x in y:
        blah
        blah
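
To make the "stored code" idea concrete, here is a sketch of a length function written in plain Python. The name my_len is hypothetical, and this is not how the built-in len is actually implemented; it only shows that a function takes an input, runs some code, and hands back an output.

```python
def my_len(inp):
    """Count the items in inp by visiting them one at a time."""
    count = 0
    for _ in inp:            # the loop body runs once per character
        count = count + 1
    return count             # the output of the function

fruit = 'banana'
x = my_len(fruit)            # 6, the same answer len(fruit) gives
```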
Looping Through Strings

• Using a while statement and an iteration variable, and the len function, we can construct a loop to look at each of the letters in a string individually

fruit = 'banana'
index = 0
while index < len(fruit):
    letter = fruit[index]
    print index, letter
    index = index + 1

0 b
1 a
2 n
3 a
4 n
5 a
Looping Through Strings

• A definite loop using a for statement is much more elegant
• The iteration variable is completely taken care of by the for loop

fruit = 'banana'
for letter in fruit:
    print letter

b
a
n
a
n
a
Looping Through Strings

The for loop and the while loop, side by side:

fruit = 'banana'
for letter in fruit:
    print letter

index = 0
while index < len(fruit):
    letter = fruit[index]
    print letter
    index = index + 1

Both print:
b
a
n
a
n
a
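
A small sketch to check that the two loop styles really visit the same letters in the same order. Instead of printing, each loop rebuilds the string by concatenation; the variable names via_while and via_for are made up for this example.

```python
fruit = 'banana'

# Indefinite loop: we create, test, and advance the index ourselves
via_while = ''
index = 0
while index < len(fruit):
    via_while = via_while + fruit[index]
    index = index + 1

# Definite loop: the for statement hands us each letter in turn
via_for = ''
for letter in fruit:
    via_for = via_for + letter

same = (via_while == via_for)   # True, both are 'banana'
```

The for version has no index to initialize or increment, which is what removes the chance of an off-by-one mistake.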
Looping and Counting

• This is a simple loop that loops through each letter in a string and counts the number of times the loop encounters the 'a' character

word = 'banana'
count = 0
for letter in word:
    if letter == 'a':
        count = count + 1
print count
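
The counting loop can be wrapped in a function so it works for any word and any letter. This is a sketch; count_letter is a hypothetical name. The count() string method listed by dir() on a string gives the same answer and is used here as a cross-check.

```python
def count_letter(word, target):
    """Count how many times the single character target appears in word."""
    count = 0
    for letter in word:
        if letter == target:
            count = count + 1
    return count

total = count_letter('banana', 'a')     # 3
none_found = count_letter('banana', 'z')  # 0
builtin = 'banana'.count('a')           # 3, same result from the string method
```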
Looking deeper into in

• The iteration variable “iterates” through the sequence (ordered set)
• The block (body) of code is executed once for each value in the sequence
• The iteration variable moves through all of the values in the sequence

for letter in 'banana':
    print letter

(letter is the iteration variable; 'banana' is the six-character string)
[Flowchart: "Done?" with Yes / No branches; on No, advance letter to the next of b a n a n a, then run print letter]

for letter in 'banana':
    print letter

The iteration variable “iterates” through the string and the block (body) of code is executed once for each value in the sequence
Slicing Strings

M o n t y   P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11

• We can also look at any continuous section of a string using a colon operator
• The second number is one beyond the end of the slice - “up to but not including”
• If the second number is beyond the end of the string, it stops at the end

>>> s = 'Monty Python'
>>> print s[0:4]
Mont
>>> print s[6:7]
P
>>> print s[6:20]
Python
Slicing Strings

M o n t y   P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11

• If we leave off the first number or the last number of the slice, it is assumed to be the beginning or end of the string respectively

>>> s = 'Monty Python'
>>> print s[:2]
Mo
>>> print s[8:]
thon
>>> print s[:]
Monty Python
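
The "up to but not including" rule has a handy consequence: cutting a string at position n with s[:n] and s[n:] gives two pieces that join back into the original. A short sketch, using a hypothetical helper named split_at:

```python
def split_at(text, n):
    """Return the part before position n and the part from n onward."""
    return text[:n], text[n:]

s = 'Monty Python'

left, right = split_at(s, 5)           # 'Monty' and ' Python'
rejoined = left + right                # 'Monty Python' again

# A position past the end does not raise an error; the slice just stops
far_left, far_right = split_at(s, 20)  # 'Monty Python' and ''

# The length of s[a:b] is b - a when both are inside the string
width = len(s[2:6])                    # 4
```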
String Concatenation

• When the + operator is applied to strings, it means “concatenation”

>>> a = 'Hello'
>>> b = a + 'There'
>>> print b
HelloThere
>>> c = a + ' ' + 'There'
>>> print c
Hello There
>>>
Using in as a logical Operator

• The in keyword can also be used to check to see if one string is “in” another string
• The in expression is a logical expression that returns True or False and can be used in an if statement

>>> fruit = 'banana'
>>> 'n' in fruit
True
>>> 'm' in fruit
False
>>> 'nan' in fruit
True
>>> if 'a' in fruit:
...     print 'Found it!'
...
Found it!
>>>
String Comparison

if word == 'banana':
    print 'All right, bananas.'

if word < 'banana':
    print 'Your word, ' + word + ', comes before banana.'
elif word > 'banana':
    print 'Your word, ' + word + ', comes after banana.'
else:
    print 'All right, bananas.'
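
The comparison above can be tried on a few words by putting it in a function. This is a sketch with a hypothetical name, compare_to_banana. One thing to watch: all uppercase letters sort before all lowercase letters, so a capitalized word can land on the "wrong" side unless it is converted with lower() first.

```python
def compare_to_banana(word):
    """Report where word falls relative to 'banana'."""
    if word < 'banana':
        return 'before'
    elif word > 'banana':
        return 'after'
    else:
        return 'same'

r1 = compare_to_banana('apple')       # 'before'
r2 = compare_to_banana('cherry')      # 'after'
r3 = compare_to_banana('banana')      # 'same'

# Uppercase 'P' sorts ahead of lowercase 'b', so this is 'before'
r4 = compare_to_banana('Pineapple')
# Converting to lower case first gives the alphabetical answer, 'after'
r5 = compare_to_banana('Pineapple'.lower())
```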
String Library

• Python has a number of string functions which are in the string library
• These functions are already built into every string - we invoke them by appending the function to the string variable
• These functions do not modify the original string, instead they return a new string that has been altered

>>> greet = 'Hello Bob'
>>> zap = greet.lower()
>>> print zap
hello bob
>>> print greet
Hello Bob
>>> print 'Hi There'.lower()
hi there
>>>
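
Because these functions return a new string, calling one without saving the result has no visible effect. A short sketch of that pitfall, and of chaining two calls; the variable names are made up for the example.

```python
greet = 'Hello Bob'

greet.lower()            # the new string is thrown away
unchanged = greet        # still 'Hello Bob'

zap = greet.lower()      # keep the result by assigning it: 'hello bob'

# Each call returns a string, so another call can be appended to it
shout = greet.upper().replace('BOB', 'JANE')   # 'HELLO JANE'
```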
>>> stuff = 'Hello world'
>>> type(stuff)
<type 'str'>
>>> dir(stuff)
['capitalize', 'center', 'count', 'decode', 'encode',
'endswith', 'expandtabs', 'find', 'format', 'index',
'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower',
'lstrip', 'partition', 'replace', 'rfind', 'rindex',
'rjust', 'rpartition', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase',
'title', 'translate', 'upper', 'zfill']

https://docs.python.org/2/library/stdtypes.html#string-methods
String Library

str.capitalize()
str.center(width[, fillchar])
str.endswith(suffix[, start[, end]])
str.find(sub[, start[, end]])
str.lstrip([chars])
str.replace(old, new[, count])
str.lower()
str.rstrip([chars])
str.strip([chars])
str.upper()
Searching a String

b a n a n a
0 1 2 3 4 5

• We use the find() function to search for a substring within another string
• find() finds the first occurrence of the substring
• If the substring is not found, find() returns -1
• Remember that string position starts at zero

>>> fruit = 'banana'
>>> pos = fruit.find('na')
>>> print pos
2
>>> aa = fruit.find('z')
>>> print aa
-1
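
find() only reports the first occurrence, but its optional start argument lets a loop continue the search from a later position. The sketch below collects every position in a list; find_all is a hypothetical helper, not a string method.

```python
def find_all(text, sub):
    """Return a list of every position where sub starts in text."""
    positions = []
    pos = text.find(sub)
    while pos != -1:                    # -1 means no further match
        positions.append(pos)
        pos = text.find(sub, pos + 1)   # resume just past the last hit
    return positions

fruit = 'banana'
na_hits = find_all(fruit, 'na')   # [2, 4]
a_hits = find_all(fruit, 'a')     # [1, 3, 5]
z_hits = find_all(fruit, 'z')     # []
```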
Making everything UPPER CASE

• You can make a copy of a string in lower case or upper case
• Often when we are searching for a string using find() - we first convert the string to lower case so we can search a string regardless of case

>>> greet = 'Hello Bob'
>>> nnn = greet.upper()
>>> print nnn
HELLO BOB
>>> www = greet.lower()
>>> print www
hello bob
>>>
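
A sketch of the "convert to lower case, then search" idea. The helper name find_ignore_case is an assumption for this example; it lowers both the text and the search string so that neither side's capitalization matters.

```python
def find_ignore_case(text, sub):
    """Like text.find(sub), but treats upper and lower case as equal."""
    return text.lower().find(sub.lower())

greet = 'Hello Bob'
exact = greet.find('bob')                   # -1, because 'B' is not 'b'
relaxed = find_ignore_case(greet, 'bob')    # 6
also = find_ignore_case(greet, 'HELLO')     # 0
```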
Search and Replace

• The replace() function is like a “search and replace” operation in a word processor
• It replaces all occurrences of the search string with the replacement string

>>> greet = 'Hello Bob'
>>> nstr = greet.replace('Bob','Jane')
>>> print nstr
Hello Jane
>>> nstr = greet.replace('o','X')
>>> print nstr
HellX BXb
>>>
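
replace() also accepts an optional third argument, count, which limits how many occurrences are changed. A brief sketch; the variable names are made up.

```python
greet = 'Hello Bob'

all_o = greet.replace('o', 'X')        # 'HellX BXb', every 'o' replaced
first_o = greet.replace('o', 'X', 1)   # 'HellX Bob', only the first one

# When the search string is absent, the result equals the original
no_match = greet.replace('z', 'X')     # 'Hello Bob'
```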
Stripping Whitespace

• Sometimes we want to take a string and remove whitespace at the beginning and/or end
• lstrip() and rstrip() remove whitespace at the left or right
• strip() removes both beginning and ending whitespace

>>> greet = ' Hello Bob '
>>> greet.lstrip()
'Hello Bob '
>>> greet.rstrip()
' Hello Bob'
>>> greet.strip()
'Hello Bob'
>>>
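
Whitespace here includes tabs and newlines as well as spaces, and only the ends of the string are affected. A sketch with a made-up input line; the last statement uses the optional chars argument of strip() to remove a different character.

```python
line = '  \tHello   Bob \n'

cleaned = line.strip()       # 'Hello   Bob', inner spaces are kept
left_only = line.lstrip()    # 'Hello   Bob \n'
right_only = line.rstrip()   # '  \tHello   Bob'

# With an argument, strip() removes those characters instead of whitespace
starred = '***Hello***'.strip('*')   # 'Hello'
```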


Prefixes

>>> line = 'Please have a nice day'


>>> line.startswith('Please')
True
>>> line.startswith('p')
False
Parsing and Extracting

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
(the @ is at position 21; the next space is at position 31)

>>> data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
>>> atpos = data.find('@')
>>> print atpos
21
>>> sppos = data.find(' ',atpos)
>>> print sppos
31
>>> host = data[atpos+1 : sppos]
>>> print host
uct.ac.za
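
The same find-and-slice steps can be packaged as a function that also copes with lines that do not fit the pattern. This is a sketch: extract_host is a hypothetical name, and returning None for a line without an @ is one possible choice among several.

```python
def extract_host(line):
    """Return the text between the first '@' and the next space."""
    atpos = line.find('@')
    if atpos == -1:
        return None                  # no address on this line
    sppos = line.find(' ', atpos)    # first space after the '@'
    if sppos == -1:
        sppos = len(line)            # address runs to the end of the line
    return line[atpos + 1:sppos]

data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
host = extract_host(data)                      # 'uct.ac.za'
at_end = extract_host('From a@example.org')    # 'example.org'
missing = extract_host('no address here')      # None
```

Checking for -1 matters because using -1 directly in the slice would silently produce the wrong text instead of an error.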
Summary

• String type
• Read/Convert
• Indexing strings []
• Slicing strings [2:4]
• Looping through strings with for and while
• Concatenating strings with +
• String operations
• String library
• String Comparisons
• Searching in strings
• Replacing text
• Stripping white space
Acknowledgements / Contributions
These slides are Copyright 2010- Charles R. Severance
(www.dr-chuck.com) of the University of Michigan School of Information
and open.umich.edu and made available under a Creative
Commons Attribution 4.0 License. Please maintain this last slide
in all copies of the document to comply with the attribution
requirements of the license. If you make a change, feel free to
add your name and organization to the list of contributors on this
page as you republish the materials.

Initial Development: Charles Severance, University of Michigan School of Information

… Insert new Contributors and Translators here
