
Social Data Analytics

Section 1 - Introduction to Python

1- Python Arithmetic
Python allows you to perform all of the basic mathematical computations that you can perform on
a calculator.
To perform arithmetic with Python, simply type the operation you wish to perform into a code cell
and then run the cell to view the result.
Use the + sign to perform addition:
[1]: 10 + 5

[1]: 15

Use the - sign to perform subtraction:


Note: putting a - sign in front of a number makes it negative.
[2]: 10 - 5

[2]: 5

Use * for multiplication:


[3]: 10 * 5

[3]: 50

Use / for decimal division:

[4]: 10 / 3

[4]: 3.3333333333333335

Use // for floor division (Round decimal remainders down):

[5]: 10 // 3

[5]: 3

Use ** for exponentiation (Power):

[6]: 10 ** 3

[6]: 1000

Math expressions in Python follow the normal arithmetic order of operations so * and / are executed
before + and - and ** is executed before multiplication and division.
Note: Lines within Python code that begin with “#” are text comments. Comments are typically
used to make notes and describe what code does and do not affect the code when it is run.
[7]: # These operations are executed in reverse order of appearance due to the order of operations.

# First -> Exponentiation, Then -> Multiplication, Finally -> Addition

2 + 3 * 5 ** 2

[7]: 77

You can use parentheses in your math expressions to ensure that operations are carried out in the
correct order.
Operations within parentheses are carried out before operations that are external to the parentheses,
just like you’d expect.
[8]: # This time, the addition comes first and the exponentiation comes last.

((2 + 3) * 5 ) ** 2

[8]: 625

The modulus produces the remainder you’d get when dividing two numbers. Use the % sign to
take the modulus in Python:
[9]: 100 % 75

[9]: 25
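Floor division and the modulus fit together: for whole numbers, multiplying the floor quotient by the divisor and adding the remainder gives back the original number. The short sketch below uses illustrative variable names that are not part of the original lesson. It also shows that // rounds toward negative infinity, so a negative result is not the same as simply dropping the decimal part.

```python
# Floor division and modulus are two halves of the same division
quotient = 17 // 5
remainder = 17 % 5
rebuilt = quotient * 5 + remainder  # Recombine the parts to recover 17

# The built-in divmod() returns the quotient and remainder together
pair = divmod(17, 5)

# Floor division rounds down (toward negative infinity), not toward zero
negative_floor = -7 // 2

print(quotient, remainder, rebuilt, pair, negative_floor)
```

Note that -7 / 2 is -3.5, and rounding that down gives -4 rather than -3.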

Beyond symbolic operators, Python contains a variety of named math functions available in the
“math” module.
To load a library into Python, use “import” followed by the name of the library.
[10]: import math # Load the math module

[11]: # Use math.sqrt() to take the square root of a number:

math.sqrt(64)

[11]: 8.0

[12]: # Use abs() to get the absolute value of a number. Note abs() is a base Python function so you do not need to load the math package to use it.

abs(-30)

[12]: 30

[13]: math.pi # Get the constant pi

[13]: 3.141592653589793

Rounding Numbers
Base Python contains a round() function that lets you round numbers to the nearest whole number.
You can also round up or down with math.ceil and math.floor respectively.
[14]: # Use round() to round a number to the nearest whole number:

round(233.234)

[14]: 233

[15]: # Add a second argument to round to a specified decimal place

round(233.234, 1) # round to 1 decimal place

[15]: 233.2

[16]: # Round down to the nearest whole number with math.floor()

math.floor(2.8)

[16]: 2

[17]: # Round up with math.ceil()

math.ceil(2.2)

[17]: 3
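One detail worth knowing about round() in Python 3: when a number sits exactly halfway between two whole numbers, it rounds to the nearest even one rather than always rounding up. The sketch below (variable names are illustrative) shows this, along with how math.floor() and math.ceil() treat negative numbers.

```python
import math

# Ties (values ending in exactly .5) round to the nearest even integer
tie_results = [round(0.5), round(1.5), round(2.5)]

# For negative numbers, floor moves away from zero and ceil moves toward zero
floor_negative = math.floor(-2.5)
ceil_negative = math.ceil(-2.5)

print(tie_results, floor_negative, ceil_negative)
```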

2- Basic Data Types

Integers
Integers or “ints” for short, are whole-numbered numeric values.
Any positive or negative number (or 0) without a decimal is an integer in Python.
You can check the type of a Python object with the type() function.
Let’s run type() on an integer:

[18]: type(12)

[18]: int

You can also use the function isinstance() to check whether an object is an instance of a given type:

[19]: # Check if 12 is an instance of type "int"

isinstance(12, int)

[19]: True

Floats
Floating point numbers or “floats” are numbers with decimal values.
[20]: type(1.0)

[20]: float

[21]: isinstance(0.33333, float)

[21]: True

The arithmetic operations we learned last time work on floats as well as ints.
If you use both floats and ints in the same math expression the result is a float:
[22]: 5 + 1.0

[22]: 6.0

You can convert a float to an integer using the int() function:

[23]: int(6.0)

[23]: 6

You can convert an integer to a float with the float() function:

[24]: float(6)

[24]: 6.0
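Two behaviors of these conversions can be surprising, so here is a small sketch with illustrative variable names. int() discards the decimal part instead of rounding, and floats are stored approximately, so comparing them with == can fail where math.isclose() succeeds.

```python
import math

# int() truncates toward zero; it does not round
truncated_positive = int(6.9)
truncated_negative = int(-6.9)

# Floats are approximate, so exact equality checks can fail
exact_match = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)

print(truncated_positive, truncated_negative, exact_match, close_enough)
```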

Booleans
Booleans or “bools” are true/false values that result from logical statements.
In Python, booleans start with the first letter capitalized so True and False are recognized as bools
but true and false are not.
[25]: type(True)

[25]: bool

[26]: isinstance(False, bool) # Check if False is of type bool

[26]: True

You can create boolean values with logical expressions.


Python supports all of the standard logic operators you’d expect:
[27]: # Use > and < for greater than and less than:

20 > 10

[27]: True

[28]: 20 < 5

[28]: False

[29]: # Use >= and <= for greater than or equal and less than or equal:

20 >= 20

[29]: True

[30]: # Use == (two equal signs in a row) to check equality:

10 == 10

[30]: True

[31]: # Use != to check inequality. (think of != as "not equal to")

1 != 2

[31]: True

[32]: # Use the keyword "not" for negation:

not False

[32]: True

[33]: # Use the keyword "and" for logical and:

(2 > 1) and (10 > 11)

[33]: False

[34]: # Use the keyword "or" for logical or:

(2 > 1) or (10 > 11)

[34]: True

You can convert numbers into boolean values using the bool() function.
All numbers other than 0 convert to True:
[35]: bool(1)

[35]: True

[36]: bool(0)

[36]: False
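The rule applies to floats and negative numbers as well, and it works in the other direction too: in arithmetic, True behaves like 1 and False like 0. A brief sketch, with illustrative names:

```python
# Any nonzero number converts to True, including negatives and floats
negative_is_true = bool(-1)
fraction_is_true = bool(0.5)
float_zero_is_true = bool(0.0)

# Booleans act like the integers 1 and 0 in arithmetic
true_count = True + True + False

print(negative_is_true, fraction_is_true, float_zero_is_true, true_count)
```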

Strings
Text data in Python is known as a string or “str”.
Surround text with single (' ') or double (" ") quotation marks to create a string:

[37]: type("cat")

[37]: str

[38]: type('1')

[38]: str

None
In Python, “None” is a special data type that is often used to represent a missing value.
For example, if you define a function that doesn’t return anything (does not give you back some
resulting value) it will return “None” by default.

[39]: type(None)

[39]: NoneType

[40]: # Define a function that prints the input but returns nothing

def my_function(x):
    print(x)

my_function("hello") == None # The output of my_function equals None

hello

[40]: True
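Comparing with == works here, but the more common idiom for checking for None is the "is" keyword, which tests whether two names refer to the very same object. The sketch below uses a hypothetical helper, print_only, that mirrors my_function above.

```python
def print_only(value):  # Hypothetical helper, same idea as my_function
    print(value)        # Prints the input; there is no return statement

result = print_only("hello")  # The call prints "hello" and returns None

# "is None" is the idiomatic way to test for a missing value
is_missing = result is None
print(is_missing)
```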

3- Variables
A variable is a name you assign a value or object, such as one of the basic data types.
After assigning a variable, you can access its associated value or object using the variable’s name.
Variables are a convenient way of storing values with names that are meaningful.
In Python, assign variables using “=”:
[41]: x = 10
y = "Python is fun"
z = 144**0.5 == 12 # 144 power 0.5

print(x)
print(y)
print(z)

10
Python is fun
True
Note: assigning a variable does not produce any output.
When assigning a variable, it is good practice to put a space between the variable name, the
assignment operator and the variable value for clarity:
[42]: p=8 # This works, but it looks messy.
print(p)

p = 10 # Use spaces instead
print(p)

8
10
As shown above, you can reassign a variable after creating it by assigning the variable name a new
value.
After assigning variables, you can perform operations on the objects assigned to them using their
names:
[43]: x = 10
z = 12
p = 10

x + z + p

[43]: 32

You can assign the same object to multiple variables with a multiple assignment statement.
[44]: n = m = 4
print(n)
print(m)

4
4
You can also assign several different variables at the same time using a comma separated sequence
of variable names followed by the assignment operator and a comma separated sequence of values
inside parentheses:
[45]: # Assign 3 variables at the same time:

x, y, z = (10, 20 ,30)

print(x)
print(y)
print(z)

10
20
30
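Because the right-hand side is evaluated before any names are assigned, the same syntax gives a compact way to swap two variables without a temporary one. A minimal sketch with illustrative names:

```python
first, second = 1, 2   # The parentheses around the values are optional

# The right side is evaluated first, then both names are reassigned
first, second = second, first

print(first, second)
```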

4- Lists
A list is a mutable, ordered collection of objects. “Mutable” means a list can be altered after it is
created.

You can, for example, add new items to a list or remove existing items.
Lists are heterogeneous, meaning they can hold objects of different types.
Construct a list with a comma separated sequence of objects within square brackets:
[46]: my_list = ["Lesson", 5, "Is Fun?", True]

print(my_list)

['Lesson', 5, 'Is Fun?', True]


A list with no contents is known as the empty list:
[47]: empty_list = []

print( empty_list )

[]
You can add an item to an existing list with the list.append() function:

[48]: empty_list.append("I'm no longer empty!")

print(empty_list)

["I'm no longer empty!"]


Remove a matching item from a list with list.remove():

[49]: my_list.remove(5)

print(my_list)

['Lesson', 'Is Fun?', True]


Note: Remove deletes the first matching item only.
Join two lists together with the + operator:
[50]: combined_list = my_list + empty_list

print(combined_list)

['Lesson', 'Is Fun?', True, "I'm no longer empty!"]


You can also add a sequence to the end of an existing list with the list.extend() function:

[51]: combined_list = my_list # Both names now refer to the same list object

combined_list.extend(empty_list) # This changes my_list as well

print(combined_list)

['Lesson', 'Is Fun?', True, "I'm no longer empty!"]
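Note that an assignment such as combined_list = my_list does not copy the list; it gives the same list a second name, so extending one also changes the other. The sketch below, with illustrative names, contrasts that with list.copy(), which creates an independent list.

```python
original = [1, 2, 3]

alias = original              # A second name for the same list
independent = original.copy() # A separate list with the same items

alias.append(4)               # Also visible through "original"
independent.append(99)        # Does not affect "original"

print(original)
print(independent)
```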
Check the length, maximum, minimum and sum of a list with the len(), max(), min() and sum()
functions, respectively.
[52]: num_list = [1, 3, 5, 7, 9]
print( len(num_list)) # Check the length
print( max(num_list)) # Check the max
print( min(num_list)) # Check the min
print( sum(num_list)) # Check the sum
print( sum(num_list)/len(num_list)) # Check the mean

5
9
1
25
5.0
You can check whether a list contains a certain object with the “in” keyword:
[53]: 1 in num_list

[53]: True

Add the keyword “not” to test whether a list does not contain an object:
[54]: 1 not in num_list

[54]: False

Count the occurrences of an object within a list using the list.count() function:

[55]: num_list.count(3)

[55]: 1

Other common list functions include list.sort() and list.reverse():

[56]: new_list = [1, 5, 4, 2, 3, 6] # Make a new list

new_list.reverse() # Reverse the list


print("Reversed list", new_list)

new_list.sort() # Sort the list


print("Sorted list", new_list)

Reversed list [6, 3, 2, 4, 5, 1]


Sorted list [1, 2, 3, 4, 5, 6]
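Both list.sort() and list.reverse() change the list in place and return None. When the original order needs to be kept, the built-in sorted() function returns a new sorted list instead. A short sketch, assuming an illustrative list named scores:

```python
scores = [1, 5, 4, 2, 3, 6]

# sorted() returns a new list; reverse=True sorts from largest to smallest
descending = sorted(scores, reverse=True)
scores_after_sorted = list(scores)  # Snapshot: the original is unchanged

# list.sort() sorts in place and returns None
in_place_result = scores.sort()

print(descending)
print(scores_after_sorted)
print(in_place_result, scores)
```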

List Indexing and Slicing
Lists are indexed, meaning each position in the sequence has a corresponding number called the
index that you can use to look up the value at that position.
Python sequences are zero-indexed, so the first element of a sequence is at index position zero, the
second element is at index 1 and so on.
Retrieve an item from a list by placing the index in square brackets after the name of the list:
[57]: another_list = ["Hello","my", "bestest", "old", "friend."]

print (another_list[0])
print (another_list[2])

Hello
bestest
If you supply a negative number when indexing into a list, it accesses items starting from the end
of the list (-1) going backward:

[58]: print (another_list[-1])


print (another_list[-3])

friend.
bestest
Supplying an index outside of a list’s range will result in an IndexError:
[59]: print (another_list[5])

---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[59], line 1
----> 1 print (another_list[5])

IndexError: list index out of range

If your list contains other indexed objects, you can supply additional indexes to get items contained
within the nested objects:
[60]: nested_list = [[1,2,3],[4,5,6],[7,8,9]]

print (nested_list[0][2])

3
You can take a slice (sequential subset) of a list using the syntax [start:stop:step] where start and
stop are the starting and ending indexes for the slice and step controls how frequently you sample
values along the slice.

The default step size is 1, meaning you take all values in the range provided, starting from the first,
up to but not including the last:
[61]: another_list = ["Hello","my", "bestest", "old", "friend."]

my_slice = another_list[1:3] # Slice index 1 and 2


print(my_slice )

['my', 'bestest']

[62]: # Slice the entire list but use step size 2 to get every other item:

my_slice = another_list[0:6:2]
print(my_slice )

['Hello', 'bestest', 'friend.']


You can leave the starting or ending index blank to slice from the beginning or up to the end of
the list respectively:
[63]: slice1 = another_list[:4] # Slice everything up to index 4
print(slice1)

['Hello', 'my', 'bestest', 'old']

[64]: slice2 = another_list[3:] # Slice everything from index 3 to the end


print(slice2)

['old', 'friend.']
If you provide a negative number as the step, the slice steps backward:
[65]: # Take a slice starting at index 4, backward to index 2

another_list = ["Hello","my", "bestest", "old", "friend."]

my_slice = another_list[4:2:-1]
print(my_slice )

['friend.', 'old']
If you don’t provide a starting or ending index, you take a slice of the entire list:
[66]: my_slice = another_list[:] # This slice operation copies the list
print(my_slice)

['Hello', 'my', 'bestest', 'old', 'friend.']

Using a step of -1 without a starting or ending index slices the entire list in reverse, providing a
shorthand to reverse a list:
[67]: my_slice = another_list[::-1] # This slice operation reverses the list
print(my_slice)

['friend.', 'old', 'bestest', 'my', 'Hello']
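Slicing is more forgiving than indexing: a slice whose bounds run past the end of the list is quietly clipped, whereas a single out-of-range index raises an IndexError. The sketch below uses an illustrative list named words and a try/except block, which is used here only to capture the error message.

```python
words = ["Hello", "my", "bestest", "old", "friend."]

# A slice that runs past the end is clipped to the list's length
clipped = words[3:100]

# A single out-of-range index raises an IndexError instead
try:
    words[100]
    message = "no error"
except IndexError as error:
    message = str(error)

print(clipped)
print(message)
```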


You can use indexing to change the values within a list or delete items in a list:
[68]: another_list = ["Hello","my", "bestest", "old", "friend."]

another_list[3] = "new" # Set the value at index 3 to "new"

print(another_list)

del another_list[3] # Delete the item at index 3

print(another_list)

['Hello', 'my', 'bestest', 'new', 'friend.']


['Hello', 'my', 'bestest', 'friend.']
You can also remove items from a list using the list.pop() function. pop() removes the final item
in a list and returns it:
[69]: next_item = another_list.pop()

print(next_item)
print(another_list)

friend.
['Hello', 'my', 'bestest']

5- Tuples and Strings


Tuples
Tuples are used to hold short collections of related data.
For instance, if you wanted to store latitude and longitude coordinates for cities, tuples might be
a good choice, because the values are related and not likely to change.
Construct a tuple with a comma separated sequence of objects within parentheses:
[70]: my_tuple = (1,3,5)

print(my_tuple)

(1, 3, 5)

Tuples generally support the same indexing and slicing operations as lists and they also support
some of the same functions, but tuples cannot be changed after they are created.
This means we can do things like find the length, max or min of a tuple, but we can’t append new
values to them or remove values from them:
[71]: another_tuple = (2, 3, 1, 4)

[72]: another_tuple[2] # You can index into tuples

[72]: 1

[73]: another_tuple[2:4] # You can slice tuples

[73]: (1, 4)

[74]: # You can use common sequence functions on tuples:

print( len(another_tuple))
print( min(another_tuple))
print( max(another_tuple))
print( sum(another_tuple))

4
1
4
10

[75]: another_tuple.append(1) # You can't append to a tuple

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[75], line 1
----> 1 another_tuple.append(1)

AttributeError: 'tuple' object has no attribute 'append'

[76]: del another_tuple[1] # You can't delete from a tuple

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[76], line 1
----> 1 del another_tuple[1]

TypeError: 'tuple' object doesn't support item deletion

You can sort the objects in a tuple using the sorted() function, but doing so creates a new list
containing the result rather than sorting the original tuple itself like the list.sort() function does
with lists:
[77]: sorted(another_tuple)

[77]: [1, 2, 3, 4]
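Returning to the coordinates example, the sketch below stores a latitude and longitude pair in a tuple and unpacks it into two names. The coordinate values are approximate and purely illustrative, and the try/except block is only there to show that assigning to a tuple position fails.

```python
# Approximate, illustrative coordinates stored as a (latitude, longitude) tuple
paris_coordinates = (48.86, 2.35)

# Unpack the tuple into two separate variables
latitude, longitude = paris_coordinates

# Tuples are immutable, so assigning to an index raises a TypeError
try:
    paris_coordinates[0] = 0.0
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(latitude, longitude, assignment_failed)
```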

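Strings
Strings are immutable sequences of text characters, so they support indexing and slicing just like
lists and tuples: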

[78]: my_string = "Hello world"

my_string[3] # Get the character at index 3

[78]: 'l'

[79]: my_string[3:] # Slice from index 3 to the end

[79]: 'lo world'

[80]: my_string[::-1] # Reverse the string

[80]: 'dlrow olleH'

In addition, certain sequence functions like len() and count() work on strings:

[81]: len(my_string)

[81]: 11

[82]: my_string.count("l") # Count the l's in the string

[82]: 3

Some basic string functions include:


[83]: my_string.lower() # Make all characters lowercase

[83]: 'hello world'

[84]: my_string.upper() # Make all characters uppercase

[84]: 'HELLO WORLD'

[85]: my_string.title() # Make the first letter of each word uppercase

[85]: 'Hello World'

Find the index of the first appearing substring within a string using str.find().

If the substring does not appear, find() returns -1:

[86]: my_string = "Hello world"

my_string.find("W")

[86]: -1

Notice that since strings can’t be changed after creation, we never actually changed the original
value of my_string with any of the code above, but instead generated new strings that were printed
to the console.
This means “W” does not exist in my_string even though our call to str.title() produced the output
‘Hello World’.
The original lowercase “w” still exists at index position 6:
[87]: my_string.find("w")

[87]: 6
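To keep the result of a string function, assign it to a variable, either a new one or the original name. A minimal sketch with illustrative names:

```python
greeting = "Hello world"

# String functions return new strings; store the result to keep it
titled = greeting.title()
print(greeting)  # The original is unchanged
print(titled)

# Reassigning the name replaces the old string with the new one
greeting = greeting.replace("world", "friend")
print(greeting)
```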

Find and replace a target substring within a string using str.replace():

[88]: my_string.replace("world", # Substring to replace
                  "friend") # New substring

[88]: 'Hello friend'

Split a string into a list of substrings based on a given separating character with str.split():

[89]: my_string.split() # str.split() splits on spaces by default

[89]: ['Hello', 'world']

Split a multi-line string into a list of lines using str.splitlines():

[90]: multiline_string = """I am
a multiline
string!
"""

multiline_string.splitlines()

[90]: ['I am', 'a multiline', 'string!']

[91]: # str.strip() removes whitespace by default

" strip white space! ".strip()

[91]: 'strip white space!'

To join or concatenate two strings together, you can use the plus (+) operator:

[92]: "Hello " + "World"

[92]: 'Hello World'

Convert a list of strings into a single string separated by a given delimiter with str.join():

[93]: " ".join(["Hello", "World!", "Join", "Me!"])

[93]: 'Hello World! Join Me!'
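str.split() and str.join() are natural counterparts: one breaks a string apart on a delimiter and the other glues the pieces back together. The sketch below uses a made-up comma-separated line to illustrate splitting on a character other than a space.

```python
csv_line = "name,age,city"   # Illustrative comma-separated text

# Pass the separator to split on something other than whitespace
fields = csv_line.split(",")

# Join the pieces back together with a different delimiter
rejoined = " | ".join(fields)

print(fields)
print(rejoined)
```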

String Operations
Although the + operator works for string concatenation, things can get messy if you start trying
to join more than a couple values together with pluses.
[94]: name = "Joe"
age = 10
city = "Paris"

"My name is " + name + " I am " + str(age) + " and I live in " + city

[94]: 'My name is Joe I am 10 and I live in Paris'

For complex string operations of this sort, it is preferable to use the str.format() function or
formatted strings.
str.format() takes in a template string with curly braces as placeholders for values you provide to
the function as the arguments.
The arguments are then filled into the appropriate placeholders in the string:
[95]: template_string = "My name is {} I am {} and I live in {}"

template_string.format(name, age, city)

[95]: 'My name is Joe I am 10 and I live in Paris'

Formatted strings or f-strings for short are an alternative, relatively new (as of Python version 3.6)
method for string formatting.
F-strings are strings prefixed with “f” (or “F”) that allow you to insert existing variables into a
string by name by placing them within curly braces:
[96]: # Remaking the example above using an f-string

f"My name is {name} I am {age} and I live in {city}"

[96]: 'My name is Joe I am 10 and I live in Paris'
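F-strings can also hold expressions and format specifications, written after a colon inside the braces. The sketch below uses made-up values to show two common ones: fixing the number of decimal places and adding thousands separators.

```python
price = 3.14159        # Illustrative values
population = 1234567

# ".2f" keeps two decimal places; "," inserts thousands separators
summary = f"Price: {price:.2f}, population: {population:,}"

# Any expression can go inside the braces, not just a variable name
expression = f"{2 + 3}"

print(summary)
print(expression)
```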

6- Dictionaries and Sets
Dictionaries
A dictionary or dict is an object that maps a set of named indexes called keys to a set of
corresponding values.
Dictionaries are mutable, so you can add and remove keys and their associated values.
Create a dictionary with a comma-separated list of key: value pairs within curly braces:
[97]: my_dict = {"name": "Joe",
"age": 10,
"city": "Paris"}

print(my_dict)

{'name': 'Joe', 'age': 10, 'city': 'Paris'}


Dictionaries are indexed by key rather than by position, so look up values using keys rather than numeric indexes:
[98]: my_dict["name"]

[98]: 'Joe'

Add new items to an existing dictionary with the following syntax:


[99]: my_dict["new_key"] = "new_value"

print(my_dict)

{'name': 'Joe', 'age': 10, 'city': 'Paris', 'new_key': 'new_value'}


Delete existing key: value pairs with del:
[100]: del my_dict["new_key"]

print(my_dict)

{'name': 'Joe', 'age': 10, 'city': 'Paris'}


Check the number of items in a dict with len():

[101]: len(my_dict)

[101]: 3

Check whether a certain key exists with “in”:


[102]: "name" in my_dict

[102]: True

You can access all the keys, all the values or all the key: value pairs of a dictionary with the keys(),
values() and items() functions respectively:

[103]: my_dict.keys()

[103]: dict_keys(['name', 'age', 'city'])

[104]: my_dict.values()

[104]: dict_values(['Joe', 10, 'Paris'])

[105]: my_dict.items()

[105]: dict_items([('name', 'Joe'), ('age', 10), ('city', 'Paris')])
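Looking up a key that does not exist with square brackets raises a KeyError. The dict.get() function is a gentler alternative that returns a default value instead. The sketch below uses an illustrative dictionary named person, and also loops over items() to read each key and value together.

```python
person = {"name": "Joe", "age": 10, "city": "Paris"}

# get() returns the supplied default when the key is missing
country = person.get("country", "unknown")

# Square-bracket lookup of a missing key raises a KeyError
try:
    person["country"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# items() yields (key, value) pairs that can be unpacked in a loop
lines = []
for key, value in person.items():
    lines.append(f"{key}: {value}")

print(country, lookup_failed)
print(lines)
```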

Sets
Sets are unordered collections of objects that cannot contain duplicates.
Sets are useful for storing and performing operations on data where each value is unique.
Create a set with a comma separated sequence of values within curly braces:
[106]: my_set = {1,2,3,4,5,6,7}

type(my_set)

[106]: set

Add and remove items from a set with add() and remove() respectively:

[107]: my_set.add(8)

my_set

[107]: {1, 2, 3, 4, 5, 6, 7, 8}

[108]: my_set.remove(7)

my_set

[108]: {1, 2, 3, 4, 5, 6, 8}

Sets do not support indexing, but they do support basic sequence functions like len(), min(), max()
and sum().
You can also check membership and non-membership as usual with in and not in:
[109]: 6 in my_set

[109]: True

One of the main purposes of sets is to perform set operations that compare or combine different
sets.
Python sets support many common mathematical set operations like union, intersection, difference
and checking whether one set is a subset of another:
[110]: set1 = {1,3,5,6}
set2 = {1,2,3,4}

set1.union(set2) # Get the union of two sets

[110]: {1, 2, 3, 4, 5, 6}

[111]: set1.intersection(set2) # Get the intersection of two sets

[111]: {1, 3}

[112]: set1.difference(set2) # Get the difference between two sets

[112]: {5, 6}

[113]: set1.issubset(set2) # Check whether set1 is a subset of set2

[113]: False
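The same set operations are also available as operators, which some people find easier to read. The sketch below repeats the examples above with |, & and -, adds the symmetric difference (^), and shows a common practical use of sets: removing duplicates from a list.

```python
set1 = {1, 3, 5, 6}
set2 = {1, 2, 3, 4}

union = set1 | set2          # Items in either set
intersection = set1 & set2   # Items in both sets
difference = set1 - set2     # Items in set1 but not set2
symmetric = set1 ^ set2      # Items in exactly one of the two sets

# Converting a list to a set drops duplicate values
unique_values = set([1, 1, 2, 2, 3])

print(union, intersection, difference, symmetric, unique_values)
```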

7- Control Flow
If, Else and Elif
The most basic control flow statement in Python is the “if” statement.
An if statement checks whether some logical expression evaluates to true or false and then executes
a code block if the expression is true.
In Python, an if statement starts with if, followed by a logical expression and a colon.
The code to execute if the logical expression is true appears on the next line, indented from the if
statement above it by 4 spaces:
[114]: x = 10 # Assign some variables
y = 5

if x > y: # If statement
    print("x is greater than y")

x is greater than y

In the code above, the logical expression was true (x is greater than y), so the print statement
was executed.
If statements are often accompanied by else statements.
Else statements come after if statements and execute code in the event that the logical expression
checked by the if statement is false:
[115]: y = 25
x = 10

if x > y:
    print("x is greater than y")
else:
    print("y is greater than x")

y is greater than x
In this case the logical expression after the if statement is false, so the print statement in the
if block is skipped and the print statement in the else block is executed instead.
You can extend this basic if/else construct to perform multiple logical checks in a row by adding
one or more “elif” (else if) statements between the opening if and closing else.
Each elif statement performs an additional logical check and executes its code if the check is true:
[116]: y = 10

if x > y:
    print("x is greater than y")
elif x == y:
    print("x and y are equal!")
else:
    print("y is greater than x")

x and y are equal!
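The checks in an if/elif/else chain run from top to bottom, and only the first branch whose condition is true is executed, even if later conditions would also be true. The sketch below wraps a chain in a hypothetical function named describe, with arbitrary thresholds, so that several inputs can be tried.

```python
def describe(score):  # Hypothetical helper; the thresholds are arbitrary
    if score >= 90:
        return "high"
    elif score >= 50:   # Only checked when the first test was false
        return "medium"
    else:
        return "low"

# 95 also satisfies "score >= 50", but only the first matching branch runs
results = [describe(95), describe(70), describe(10)]
print(results)
```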

For Loops
For loops are a programming construct that let you go through each item in a sequence and then
perform some operation on each one.
For instance, you could use a for loop to go through all the values in a list, tuple, dictionary or
series and check whether each conforms to some logical expression or print the value to the console.
Create a for loop using the following syntax:
[117]: my_sequence = list(range(0,101,10)) # Make a new list

for number in my_sequence: # Create a new for loop over the specified items
    print(number) # Code to execute

0
10
20
30
40
50
60
70
80
90
100
In each iteration of the loop, the variable “number” takes on the value of the next item in the
sequence.
For loops support a few special keywords that help you control the flow of the loop: continue and
break.
The continue keyword causes a for loop to skip the current iteration and go to the next one:
[118]: my_sequence = list(range(0,101,10))

for number in my_sequence:
    if number < 50:
        continue # Skip numbers less than 50
    print(number)

50
60
70
80
90
100
Use break to “break out” of a loop:
[119]: my_sequence = list(range(0,101,10))

for number in my_sequence:
    if number > 50:
        break # Break out of the loop if number > 50
    print(number)

0
10
20
30
40
50
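Two further patterns come up often with for loops. A variable defined before the loop can accumulate a result across iterations, and the built-in enumerate() function supplies each item's index alongside its value. A brief sketch with illustrative names:

```python
# Accumulate a running total across iterations
total = 0
for number in range(0, 101, 10):
    total = total + number

# enumerate() yields (index, item) pairs
words = ["Hello", "my", "friend"]
indexed = []
for position, word in enumerate(words):
    indexed.append((position, word))

print(total)
print(indexed)
```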

While Loops
While loops are similar to for loops in that they allow you to execute code over and over again.
For loops execute their contents, at most, a number of iterations equal to the length of the sequence
you are looping over.
While loops, on the other hand, keep executing their contents as long as a logical expression you
supply remains true:
[120]: x = 5
iters = 0

while iters < x: # Execute the contents as long as iters < x
    print("Study")
    iters = iters+1 # Increment iters by 1 each time the loop executes

Study
Study
Study
Study
Study
While loops can get you into trouble because they keep executing until the logical statement
provided is false.
If you supply a logical statement that will never become false and don’t provide a way to break out
of the while loop, it will run forever.
For instance, if the while loop above didn’t include the statement incrementing the value of iters
by 1, the logical statement would never become false and the code would run forever.
Infinite while loops are a common cause of programs hanging or crashing.
The continue and break statements work inside while loops just like they do in for loops.
You can use the break statement to escape a while loop even if the logical expression you supplied
is true.
Consider the following while loop:
[121]: while True: # True is always true!
    print("Study")
    break # But we break out of the loop here

Study
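While loops are most useful when the number of iterations is not known in advance. The sketch below halves a value until it drops below 1 and counts the steps. As a precaution against the infinite loops described above, it also adds an iteration cap; the max_steps name and its value are assumptions made for this example.

```python
value = 100
steps = 0
max_steps = 1000  # Illustrative safety cap so the loop cannot run forever

# Keep halving until the value falls below 1 (or the cap is reached)
while value >= 1 and steps < max_steps:
    value = value / 2
    steps = steps + 1

print(steps, value)
```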

8- Functions
A function is a named code block that performs a job or returns a value.

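Defining Functions
Define a function using the “def” keyword followed by the function’s name, a parenthesized list of
arguments and a colon:
[122]: def my_function(arg1, arg2): # Defines a new function
    return arg1 + arg2 # Function body (code to execute)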

After defining a function, you can call it using the name you assigned to it, just like you would
with a built in function.
The “return” keyword specifies what the function produces as its output.
When a function reaches a return statement, it immediately exits and returns the specified value.
The function we defined above takes two arguments and then returns their sum:
[123]: my_function(5, 10)

[123]: 15

You can give function arguments a default value that is used automatically unless you override it.
Set a default value with the argument_name = argument_value syntax:
[124]: def sum_3_items(x, y, z, print_args = False):
    if print_args:
        print(x,y,z)
    return x + y + z

[125]: sum_3_items(5,10,20) # By default the arguments are not printed

[125]: 35

[126]: sum_3_items(5,10,20,True) # Changing the default prints the arguments

5 10 20

[126]: 35
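Arguments can also be passed by name, which makes calls easier to read and lets you supply them in any order. The sketch below uses a hypothetical function named power, with a default exponent of 2, to show positional, default and keyword-style calls.

```python
def power(base, exponent=2):  # Hypothetical example; exponent defaults to 2
    return base ** exponent

squared = power(3)                      # Uses the default exponent
cubed = power(3, 3)                     # Overrides the default by position
by_keyword = power(exponent=5, base=2)  # Named arguments can go in any order

print(squared, cubed, by_keyword)
```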
