Coding Training Material
About the doc
These exercises are meant to make sure that you are familiar enough with
programming in general and with Python in particular. This is not meant to
be a stand-alone introduction to computer programming. Rather, it is a way
for someone with almost no previous exposure to programming to get some
practice and to learn the basics of Python. The goal of the exercises is to give
you practice with Python concepts and to help diagnose your level of
programming ability. The later problem sets are much longer than the
earlier ones, because we need the concepts from the earlier sections before we
can write many interesting programs. So don't be misled by the short
length of the early problem sets. Each presented solution is just one sample
solution; there are many different ways to arrive at the same result.
Exercises
Question: Time Counter
Question: The biggest number I
Question: The biggest number II
Question: If and not If
Question: Product & sum
Question: Pandas I
Question: Pandas II
Question: Pandas III
Question: Pandas IV
Question: Pandas V
Question: Pandas VI
Question: Pandas VII
Question: Pandas VIII
Question: Pandas IX
Question: Pandas X
Question: Pandas XI
Question: Pandas XII
Question: Pandas XIII
Question: Pandas XIV
Question: Pandas XV
Question: The Descent
Question: Onboarding
Question: Weather
Question: Power of the light
Question: Time Series I
Question: Time Series II
Question: Time Series III
Question: Time Series IV
Question: Time Series V
Question: Time Series VI
Question: Time Series VII
Question: Time Series VIII
Solutions
Question: Time Counter
Question: The biggest number I
Question: The biggest number II
Question: If and not If (A)
Question: If and not If (B)
Question: Product & Sum
Question: Pandas I
Question: Pandas II
Question: Pandas III
Question: Pandas IV
Question: Pandas V
Question: Pandas VI
Question: Pandas VII
Question: Pandas VIII
Question: Pandas IX
Question: Pandas X
Question: Pandas XI
Question: Pandas XII
Question: Pandas XIII
Question: Pandas XIV
Question: Pandas XV
Question: The Descent
Question: Onboarding
Question: Weather
Question: Power of the light
Question: Time Series I
Question: Time Series II
Question: Time Series III
Question: Time Series IV
Question: Plotting X
Question: Plotting XI
Question: Plotting XII
Question: Plotting XIII
Question: Plotting XIV
Question: Plotting XV
Question: Plotting XVI
Question: Plotting XVII
Question: Plotting XVIII
Question: Plotting XIX
Exercises
Question: Time Counter
Write a program that asks for a number of seconds and reports how many
days, hours, minutes and seconds that number corresponds to.
Example:
Number of seconds: 345678
Output: days: 4 hours: 0 minutes: 1 seconds: 18
Question: Pandas I
Write a Pandas program to create and display a DataFrame from the sample dictionary data below.
Sample data: {'X':[78,85,96,80,86], 'Y':[84,94,89,83,86],'Z':[86,97,96,72,83]}
Expected Output:
    X   Y   Z
0  78  84  86
1  85  94  97
2  96  89  96
3  80  83  72
4  86  86  83
Question: Pandas II
Write a Pandas program to create and display a DataFrame from the dictionary
data below, using the given list as index labels. Sample Python dictionary data
and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
   attempts       name qualify  score
a         1  Anastasia     yes   12.5
b         3       Dima      no    9.0
....
i         2      Kevin      no    8.0
j         1      Jonas     yes   19.0
Question: Pandas IV
Write a Pandas program to select the rows where the score is missing, i.e. is
NaN. Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Rows where score is missing:
   attempts   name qualify  score
d         3  James      no    NaN
h         1  Laura      no    NaN
Question: Pandas V
Write a Pandas program to select the rows where number of attempts in the
examination is less than 2 and score greater than 15. Sample Python
dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Number of attempts in the examination is less than 2 and score greater than 15 :
    name  score  attempts qualify
j  Jonas   19.0         1     yes
Question: Pandas VI
Write a Pandas program to sort the DataFrame first by 'name' in descending
order, then by 'score' in ascending order. Sample Python dictionary data and
list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Question: Pandas IX
Write a Pandas program to iterate over rows in a DataFrame. Sample Python
dictionary data and list labels:
exam_data = [{'name':'Anastasia', 'score':12.5}, {'name':'Dima','score':9},
{'name':'Katherine','score':16.5}]
Expected Output:
Anastasia 12.5
Dima 9.0
Katherine 16.5
Question: Pandas X
Write a Pandas program to change the order of the columns of a DataFrame.
Sample data:
Original DataFrame
Question: Pandas XI
Write a Pandas program to write a DataFrame to a CSV file using a tab separator.
Sample data:
Original DataFrame
   col1  col2  col3
0     1     4     7
1     4     5     8
2     3     6     9
3     4     7     0
4     5     8     1
Data from new_file.csv file:
  col1\tcol2\tcol3
0        1\t4\t7
1        4\t5\t8
2        3\t6\t9
3        4\t7\t0
4        5\t8\t1
........
8 2 Kevin no 8.0
9 1 Jonas yes 19.0
Question: Pandas XV
Write a Pandas program to merge datasets and check uniqueness.
Sample Output:
Original DataFrame:
Name Date_Of_Birth Age
0 Alberto Franco 17/05/2002 18.5
1 Gino Mcneill 16/02/1999 21.2
2 Ryan Parkes 25/09/1998 22.5
3 Eesha Hinton 11/05/2002 22.0
4 Syed Wharton 15/09/1997 23.0
New DataFrames:
Name Date_Of_Birth Age
2 Ryan Parkes 25/09/1998 22.5
3 Eesha Hinton 11/05/2002 22.0
4 Syed Wharton 15/09/1997 23.0
Name Date_Of_Birth Age
0 Alberto Franco 17/05/2002 18.5
1 Gino Mcneill 16/02/1999 21.2
3 Eesha Hinton 11/05/2002 22.0
4 Syed Wharton 15/09/1997 23.0
"one_to_one": check if merge keys are unique in both left and right datasets:
Name Date_Of_Birth Age
0 Eesha Hinton 11/05/2002 22.0
1 Syed Wharton 15/09/1997 23.0
Question: Onboarding
Your program must destroy the enemy ships by shooting the closest enemy
on each turn. On each start of turn (within the infinite game loop), you obtain
information on the two closest enemies:
● enemy1 and dist1: the name and the distance to enemy 1.
● enemy2 and dist2: the name and the distance to enemy 2.
Before your turn is over (end of the loop), output the value of either enemy1 or
enemy2 to shoot the closest enemy.
Question: Weather
In this exercise, you have to analyze records of temperature to find the one
closest to zero. Write a program that prints the temperature closest to 0 among
the input data. If two numbers are equally close to zero, the positive integer
has to be considered closest to zero (for instance, if the temperatures are -5
and 5, then display 5). Your program must read the data from the standard
input and write the result on the standard output.
Input:
Line 1: N, the number of temperatures to analyze
Line 2: A string with the N temperatures expressed as integers ranging from
-273 to 5526
Output:
Display 0 (zero) if no temperatures are provided. Otherwise, display the
temperature closest to 0.
Input:
The program must first read the initialization data from the standard input,
then, in an infinite loop, provide on the standard output the instructions to
move Roomba.
Initialization input
Line 1: 4 integers lightX lightY initialTX initialTY. (lightX, lightY) indicates the
position of the light. (initialTX, initialTY) indicates the initial position of
Roomba.
Input for a game round
Line 1: the number of remaining moves for Roomba to reach the light of
power: remainingTurns. You can ignore this data but you must read it.
Output for a game round
A single line providing the move to be made: N NE E SE S SW W NW
Constraints
● 0 ≤ lightX < 40
● 0 ≤ lightY < 18
● 0 ≤ initialTX < 40
● 0 ≤ initialTY < 18
Question: Plotting I
Dataset:
Historical stock prices of Alphabet Inc. (GOOG)
Time Period: April 01, 2020 - October 01, 2020
Download here:
https://drive.google.com/file/d/1hKxLZLTx2rWPM1zid6cTAXs-33OgoTBj/view?usp=sharing
Description of Dataset:
Write a Pandas program to create a line plot of the historical stock prices of
Alphabet Inc. between two specific dates.
Question: Plotting II
Write a Pandas program to create a line plot of the opening, closing stock
prices of Alphabet Inc. between two specific dates.
Question: Plotting IV
Write a Pandas program to create a bar plot of opening, closing stock prices
of Alphabet Inc. between two specific dates.
Question: Plotting V
Write a Pandas program to create a stacked bar plot of opening, closing stock
prices of Alphabet Inc. between two specific dates.
Question: Plotting VI
Write a Pandas program to create a horizontal stacked bar plot of opening,
closing stock prices of Alphabet Inc. between two specific dates.
Question: Plotting IX
Write a Pandas program to draw a horizontal, cumulative histogram plot
of opening stock prices of Alphabet Inc. between two specific dates.
Question: Plotting X
Write a Pandas program to create a stacked histogram plot of opening,
closing, high, low stock prices of Alphabet Inc. between two specific dates
with more bins.
Question: Plotting XI
Write a Pandas program to create a stacked histogram plot with more bins
of opening, closing, high, low stock prices of Alphabet Inc. between two
specific dates.
Question: Plotting XV
Write a Pandas program to create a plot of adjusted closing prices, 30 days
simple moving average and exponential moving average of Alphabet Inc.
between two specific dates.
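To make the two kinds of average concrete before plotting them, here is a minimal pure-Python sketch. It is not the sample solution: the prices are made up, the function names are illustrative, and the smoothing factor alpha = 2 / (span + 1) is one common convention that is assumed here.

```python
def simple_moving_average(values, window):
    """Mean of the last `window` values at each position (None until enough data)."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            result.append(sum(values[i + 1 - window:i + 1]) / window)
    return result


def exponential_moving_average(values, span):
    """Recursive EMA: each new value is blended with the previous average."""
    alpha = 2 / (span + 1)  # assumed convention for the smoothing factor
    result = [values[0]]
    for value in values[1:]:
        result.append(alpha * value + (1 - alpha) * result[-1])
    return result


prices = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up prices
sma = simple_moving_average(prices, 3)
ema = exponential_moving_average(prices, 3)
print(sma)
print(ema)
```

In pandas, the corresponding tools are `Series.rolling(window).mean()` for the simple average and `Series.ewm(span=..., adjust=False).mean()` for this recursive form of the exponential average.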
Solutions
Question: Time Counter
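secs = int(input('Number of seconds: '))
days = secs // 86400
secs = secs % 86400
hours = secs // 3600
secs = secs % 3600
minutes = secs // 60
secs = secs % 60
print('days: ', days, ' hours: ', hours, ' minutes: ', minutes, ' seconds: ', secs)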
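As one alternative to the solution above, the same breakdown can be sketched with the built-in `divmod`, which returns the quotient and the remainder in a single call. The function name `split_seconds` is an illustrative choice, not part of the original material.

```python
def split_seconds(total_seconds):
    """Break a number of seconds into (days, hours, minutes, seconds)."""
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, seconds


days, hours, minutes, seconds = split_seconds(345678)
print('days:', days, 'hours:', hours, 'minutes:', minutes, 'seconds:', seconds)
```

Wrapping the arithmetic in a function keeps it separate from the input and output, which makes it easy to check against the example in the question.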
Question: The biggest number I
# n1, n2 and n3 are the three numbers to compare
if n1 > n2:
    if n1 > n3:
        biggest = n1
    else:
        biggest = n3
else:
    if n2 > n3:
        biggest = n2
    else:
        biggest = n3
Question: The biggest number II
def max_of_two(x, y):
    if x > y:
        return x
    return y
Question: If and not If (A)
def equal_five(input_number):
    if input_number == 5:
        return True
    else:
        return False
Question: If and not If (B)
def belowFive(input_number):
    return input_number == 5
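Question: Product & Sum
def multiplication_or_sum(num1, num2):
    # calculate the product of the two numbers
    product = num1 * num2
    if product <= 1000:
        return product
    else:
        # product is greater than 1000, calculate sum
        return num1 + num2

# first condition
result = multiplication_or_sum(20, 30)
print("The result is", result)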
Question: Pandas I
import pandas as pd
df = pd.DataFrame({'X': [78, 85, 96, 80, 86],
                   'Y': [84, 94, 89, 83, 86],
                   'Z': [86, 97, 96, 72, 83]})
print(df)
Question: Pandas II
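import pandas as pd
import numpy as np
# exam_data and labels are the dictionary and the list given in the question
df = pd.DataFrame(exam_data, index=labels)
print(df)
Question: Pandas III
import pandas as pd
import numpy as np
# exam_data and labels are the dictionary and the list given in the question
df = pd.DataFrame(exam_data, index=labels)
print("Summary of the basic information about this DataFrame and its data:")
# info() prints the summary itself and returns None, so it is not wrapped in print()
df.info()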
Question: Pandas IV
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data , index=labels)
print("Rows where score is missing:")
print(df[df['score'].isnull()])
Question: Pandas V
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
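'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)
print("Number of attempts in the examination is less than 2 and score greater than 15 :")
print(df[(df['attempts'] < 2) & (df['score'] > 15)])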
Question: Pandas VI
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data , index=labels)
print("Original rows:")
print(df)
# sort_values returns a new DataFrame, so the result has to be assigned
df = df.sort_values(by=['name', 'score'], ascending=[False, True])
print("Sort the data frame first by 'name' in descending order, then by 'score' in ascending order:")
print(df)
Question: Pandas VII
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data , index=labels)
print("Original rows:")
print(df)
print("\nReplace the values 'yes' and 'no' in the 'qualify' column with True and False:")
df['qualify'] = df['qualify'].map({'yes': True, 'no': False})
print(df)
Question: Pandas VIII
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data , index=labels)
print("Original rows:")
print(df)
color = ['Red', 'Blue', 'Orange', 'Red', 'White', 'White', 'Blue', 'Green', 'Green', 'Red']
df['color'] = color
print("\nNew DataFrame after inserting the 'color' column")
print(df)
Question: Pandas IX
import pandas as pd
import numpy as np
exam_data = [{'name':'Anastasia', 'score':12.5},
{'name':'Dima','score':9}, {'name':'Katherine','score':16.5}]
df = pd.DataFrame(exam_data)
for index, row in df.iterrows():
    print(row['name'], row['score'])
Question: Pandas X
import pandas as pd
import numpy as np
d = {'col1': [1, 4, 3, 4, 5], 'col2': [4, 5, 6, 7, 8], 'col3': [7, 8,
9, 0, 1]}
df = pd.DataFrame(data=d)
print("Original DataFrame")
print(df)
print('After altering col1 and col3')
df = df[['col3', 'col2', 'col1']]
print(df)
Question: Pandas XI
import pandas as pd
import numpy as np
d = {'col1': [1, 4, 3, 4, 5], 'col2': [4, 5, 6, 7, 8], 'col3': [7, 8,
9, 0, 1]}
df = pd.DataFrame(data=d)
print("Original DataFrame")
print(df)
print('Data from new_file.csv file:')
df.to_csv('new_file.csv', sep='\t', index=False)
new_df = pd.read_csv('new_file.csv')
print(new_df)
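The expected output in the question shows every row as a single column containing "\t". That is what happens when a tab-separated file is read back with the default comma separator. The sketch below illustrates the effect with the standard-library `csv` module and an in-memory buffer rather than pandas and a file on disk; the sample rows are shortened.

```python
import csv
import io

rows = [['col1', 'col2', 'col3'], ['1', '4', '7'], ['4', '5', '8']]

# Write the rows tab-separated into an in-memory buffer.
buffer = io.StringIO()
csv.writer(buffer, delimiter='\t', lineterminator='\n').writerows(rows)
text = buffer.getvalue()

# Read back with the default comma delimiter: every row is a single field.
as_commas = list(csv.reader(io.StringIO(text)))
# Read back with the matching tab delimiter: the three columns are recovered.
as_tabs = list(csv.reader(io.StringIO(text), delimiter='\t'))

print(as_commas[0])
print(as_tabs[0])
```

In pandas, the separate columns come back when the same separator is passed on reading, for example `pd.read_csv('new_file.csv', sep='\t')`.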
Question: Pandas XII
import pandas as pd
df1 = pd.DataFrame({'name': ['Anastasia', 'Dima', 'Katherine',
'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'city': ['California', 'Los Angeles', 'California', 'California',
Question: Pandas XIII
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
'no', 'no', 'yes']}
df = pd.DataFrame(exam_data)
print("Original DataFrame")
print(df)
df = df.fillna(0)
print("\nNew DataFrame replacing all NaN with 0:")
print(df)
Question: Pandas XIV
import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
'no', 'no', 'yes']}
df = pd.DataFrame(exam_data)
print("Original DataFrame:")
print(df)
print("\nSort the above DataFrame on attempts, name:")
df = df.sort_values(['attempts', 'name'], ascending=[True, True])
print(df)
Question: Pandas XV
import pandas as pd
df = pd.DataFrame({
'Name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Syed Wharton'],
'Date_Of_Birth': ['17/05/2002', '16/02/1999', '25/09/1998', '11/05/2002', '15/09/1997'],
'Age': [18.5, 21.2, 22.5, 22, 23]
})
print("Original DataFrame:")
print(df)
df1 = df.copy(deep = True)
df = df.drop([0, 1])
df1 = df1.drop([2])
print("\nNew DataFrames:")
print(df)
print(df1)
print('\n"one_to_one": check if merge keys are unique in both left and right datasets:')
df_one_to_one = pd.merge(df, df1, validate="one_to_one")
print(df_one_to_one)
print('\n"one_to_many" or "1:m": check if merge keys are unique in left dataset:')
df_one_to_many = pd.merge(df, df1, validate="one_to_many")
print(df_one_to_many)
print('"many_to_one" or "m:1": check if merge keys are unique in right dataset:')
df_many_to_one = pd.merge(df, df1, validate="many_to_one")
print(df_many_to_one)
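Question: The Descent
# game loop
while True:
    amax = 0
    imax = 0
    for i in range(8):
        # represents the height of one mountain.
        mountain_h = int(input())
        # remember the index of the highest mountain seen so far
        if mountain_h > amax:
            amax = mountain_h
            imax = i
    print(imax)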
Question: Onboarding
# Game loop.
while True:
    # Read inputs.
    enemy1 = input()
    dist1 = int(input())
    enemy2 = input()
    dist2 = int(input())
    # Shoot the closest enemy.
    if dist1 < dist2:
        print(enemy1)
    else:
        print(enemy2)
Question: Weather
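# Read inputs.
N = int(input())
inputs = input().split()
# Display 0 if no temperatures are provided.
closestToZero = 5526 if N > 0 else 0
for i in range(N):
    T = int(inputs[i])
    # Closer to zero wins; on a tie the positive value wins.
    if abs(T) < abs(closestToZero) or (abs(T) == abs(closestToZero) and T > 0):
        closestToZero = T
# Print output.
print(closestToZero)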
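As a compact alternative to the loop above, the tie-breaking rule can be expressed as a sort key passed to `min`. This is only a sketch; the function name `closest_to_zero` and the sample values are illustrative.

```python
def closest_to_zero(temperatures):
    """Return the temperature closest to 0, preferring the positive value on ties."""
    if not temperatures:
        return 0
    # Smallest absolute value first; on a tie, -t ranks the positive value first.
    return min(temperatures, key=lambda t: (abs(t), -t))


print(closest_to_zero([-5, 5, 7, -12]))
print(closest_to_zero([]))
```

The key is a tuple, so Python compares the absolute values first and only looks at the second element when two temperatures are equally far from zero.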
Question: Power of the light
# Read inputs.
inputs = input().split(' ')
lightX = int(inputs[0])
lightY = int(inputs[1])
initialTX = int(inputs[2])
initialTY = int(inputs[3])
while True:
    remainingTurns = int(input())
    move = ''
    # Vertical movement.
    if lightY > initialTY:
        initialTY += 1
        move += 'S'
    elif lightY < initialTY:
        initialTY -= 1
        move += 'N'
    # Horizontal movement.
    if lightX > initialTX:
        initialTX += 1
        move += 'E'
    elif lightX < initialTX:
        initialTX -= 1
        move += 'W'
    # Output the move for this round.
    print(move)
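The movement logic above can be tried without the game's input stream by moving it into a function. The sketch below does that; the function name `next_move` and the coordinates used in the walk are made up for illustration.

```python
def next_move(light_x, light_y, x, y):
    """Return (move, new_x, new_y) for one step toward the light.

    As in the solution above, moving south increases y and moving north decreases it.
    """
    move = ''
    # Vertical movement.
    if light_y > y:
        y += 1
        move += 'S'
    elif light_y < y:
        y -= 1
        move += 'N'
    # Horizontal movement.
    if light_x > x:
        x += 1
        move += 'E'
    elif light_x < x:
        x -= 1
        move += 'W'
    return move, x, y


# Walk from (5, 4) to a light at (7, 2); both positions are made-up values.
x, y = 5, 4
path = []
while (x, y) != (7, 2):
    move, x, y = next_move(7, 2, x, y)
    path.append(move)
print(path)
```

Combining the vertical and the horizontal letter in the same turn is what produces the diagonal moves NE, SE, SW and NW.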
Question: Time Series I
from datetime import datetime
print("Datetime object for Jan 11 2012:")
print(datetime(2012, 1, 11))
print("\nSpecific date and time of 9:20 pm")
print(datetime(2011, 1, 11, 21, 20))
print("\nLocal date and time:")
print(datetime.now())
print("\nA date without time: ")
print(datetime.date(datetime(2012, 5, 22)))
print("\nCurrent date:")
print(datetime.now().date())
print("\nTime from a datetime:")
print(datetime.time(datetime(2012, 12, 15, 18, 12)))
print("\nCurrent local time:")
print(datetime.now().time())
Question: Time Series II
import pandas as pd
print("\nA specific date using timestamp:")
print(pd.Timestamp('2016-11-10'))
print("\nDate and time using timestamp:")
print(pd.Timestamp('2012-05-03 11:30'))
print("\nA time combined with the current local date using timestamp:")
print(pd.Timestamp('11:30'))
print("\nCurrent date and time using timestamp:")
print(pd.Timestamp("now"))
import pandas as pd
import datetime
import pandas as pd
import numpy as np
import datetime
from datetime import datetime, date
dates = [datetime(2011, 9, 1), datetime(2011, 9, 2)]
print("Time-series with two index labels:")
time_series = pd.Series(np.random.randn(2), dates)
print(time_series)
print("\nType of the index:")
print(type(time_series.index))
import pandas as pd
import numpy as np
import datetime
from datetime import datetime, date
dates = ['2014-08-01','2014-08-02','2014-08-03','2014-08-04']
time_series = pd.Series(np.random.randn(4), dates)
print(time_series)
import pandas as pd
index = pd.DatetimeIndex(['2011-09-02', '2012-08-04',
'2015-09-03', '2010-08-04',
'2015-03-03', '2011-08-04',
'2015-04-03', '2012-08-04'])
import pandas as pd
date_range = pd.date_range('2020-01-01', periods=45)
print("Date range of periods 45:")
print(date_range)
import pandas as pd
dates = pd.Series(pd.date_range('2020-12-01',periods=31, freq='D'))
print("Month of December 2020:")
print(dates)
dates = pd.Series(pd.date_range('2020-12-01',periods=31, freq='D'))
import pandas as pd
time_series = pd.date_range('1/1/2021', periods = 36, freq='3M')
print("Time series using three months frequency:")
print(time_series)
import pandas as pd
date_range = pd.timedelta_range(0, periods=49, freq='h')
print("Hourly range of periods 49:")
print(date_range)
import pandas as pd
data = {
    "year": [2002, 2003, 2015, 2018],
    "day_of_the_year": [250, 365, 1, 140]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
df["combined"] = df["year"]*1000 + df["day_of_the_year"]
df["date"] = pd.to_datetime(df["combined"], format = "%Y%j")
print("\nNew DataFrame:")
print(df)
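For comparison, the same year plus day-of-year conversion can be sketched with the standard library alone, by adding the day offset to the first of January. The function name `from_year_and_day` is an illustrative choice; the input values are the ones from the DataFrame above.

```python
from datetime import date, timedelta


def from_year_and_day(year, day_of_year):
    """Return the calendar date for a 1-based day of the year."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


for year, day in [(2002, 250), (2003, 365), (2015, 1), (2018, 140)]:
    print(year, day, from_year_and_day(year, day))
```

This avoids building the combined integer, at the cost of working one value at a time instead of on a whole column.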
import pandas as pd
df = pd.DataFrame({'year': [2018, 2019, 2020],
'month': [2, 3, 4],
'day': [4, 5, 6],
'hour': [2, 3, 4]})
print("Original dataframe:")
print(df)
result = pd.to_datetime(df)
print("\nSeries of Timestamps from the said dataframe:")
print(result)
print("\nSeries of Timestamps using specified columns:")
print(pd.to_datetime(df[['year', 'month', 'day']]))
import pandas as pd
def is_business_day(date):
    return bool(len(pd.bdate_range(date, date)))
print("Check business day or not?")
print('2020-12-01: ',is_business_day('2020-12-01'))
print('2020-12-06: ',is_business_day('2020-12-06'))
print('2020-12-07: ',is_business_day('2020-12-07'))
print('2020-12-08: ',is_business_day('2020-12-08'))
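A similar check can be sketched without pandas by looking at the weekday number. This is a simplification under a stated assumption: only Saturday and Sunday count as non-business days, and public holidays are not taken into account. The function name `is_weekday` is illustrative.

```python
from datetime import date


def is_weekday(day):
    """True for Monday to Friday; public holidays are NOT taken into account."""
    return day.weekday() < 5  # Monday is 0, Sunday is 6


for text in ['2020-12-01', '2020-12-06', '2020-12-07', '2020-12-08']:
    print(text, is_weekday(date.fromisoformat(text)))
```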
import pandas as pd
s = pd.date_range('2021-01-01', periods=12, freq='BM')
df = pd.DataFrame(s, columns=['Date'])
print('last working days of each month of a specific year:')
print(df)
import pandas as pd
result = pd.timedelta_range(0, periods=30, freq="1h20min")
print("For a frequency of 1 hour 20 minutes, here we have combined the hour (h) and minute (min):\n")
print(result)
import pandas as pd
epoch_t = 1621132355
time_stamp = pd.to_datetime(epoch_t, unit='s')
# UTC (Coordinated Universal Time) is one of the well-known names of
# the UTC+0 time zone, which has an offset of 0h.
# By default, time series objects of pandas do not have an assigned
# time zone.
print("Regular time stamp in UTC:")
print(time_stamp)
print("\nConvert the said timestamp into US/Pacific:")
print(time_stamp.tz_localize('UTC').tz_convert('US/Pacific'))
print("\nConvert the said timestamp into Europe/Berlin:")
print(time_stamp.tz_localize('UTC').tz_convert('Europe/Berlin'))
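The same epoch value can also be interpreted with the standard library, which makes the UTC step explicit. In the sketch below, the fixed +02:00 offset is only an illustration; unlike a named zone such as Europe/Berlin, it does not follow daylight-saving rules.

```python
from datetime import datetime, timedelta, timezone

epoch_t = 1621132355

# Interpret the epoch seconds as an aware datetime in UTC.
utc_time = datetime.fromtimestamp(epoch_t, tz=timezone.utc)
print(utc_time)

# Fixed +02:00 offset, used purely as an illustration (no daylight-saving rules).
plus_two = utc_time.astimezone(timezone(timedelta(hours=2)))
print(plus_two)
```

Both values describe the same instant; only the wall-clock representation changes.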
import pandas as pd
print("Timezone: Europe/Berlin:")
print("Using pytz:")
date_pytz = pd.Timestamp('2019-01-01', tz = 'Europe/Berlin')
print(date_pytz.tz)
print("Using dateutil:")
date_util = pd.Timestamp('2019-01-01', tz = 'dateutil/Europe/Berlin')
print(date_util.tz)
print("\nUS/Pacific:")
print("Using pytz:")
date_pytz = pd.Timestamp('2019-01-01', tz = 'US/Pacific')
print(date_pytz.tz)
print("Using dateutil:")
date_util = pd.Timestamp('2019-01-01', tz = 'dateutil/US/Pacific')
print(date_util.tz)
import pandas as pd
date1 = pd.Timestamp('2019-01-01', tz='Europe/Berlin')
date2 = pd.Timestamp('2019-01-01', tz='US/Pacific')
date3 = pd.Timestamp('2019-01-01', tz='US/Eastern')
print("Time series data with time zone:")
print(date1)
print(date2)
print(date3)
print("\nTime series data without time zone:")
print(date1.tz_localize(None))
print(date2.tz_localize(None))
print(date3.tz_localize(None))
import pandas as pd
print("Subtract two timestamps of same time zone:")
date1 = pd.Timestamp('2019-03-01 12:00', tz='US/Eastern')
import pandas as pd
thursdays = pd.date_range('2020-01-01',
'2020-12-31', freq="W-THU")
print("All Thursdays between 2020-01-01 and 2020-12-31:\n")
print(thursdays.values)
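What the weekly "W-THU" frequency produces can be reproduced by hand: step forward to the first Thursday, then jump seven days at a time. A standard-library sketch, with an illustrative function name `weekdays_between`:

```python
from datetime import date, timedelta


def weekdays_between(start, end, weekday):
    """All dates from start to end (inclusive) that fall on `weekday` (Monday=0)."""
    # Step forward to the first matching weekday, then jump a week at a time.
    current = start + timedelta(days=(weekday - start.weekday()) % 7)
    found = []
    while current <= end:
        found.append(current)
        current += timedelta(days=7)
    return found


thursdays_2020 = weekdays_between(date(2020, 1, 1), date(2020, 12, 31), 3)
print(len(thursdays_2020), thursdays_2020[0], thursdays_2020[-1])
```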
import pandas as pd
q_start_dates = pd.date_range('2020-01-01', '2020-12-31',
freq='BQS-JUN')
q_end_dates = pd.date_range('2020-01-01', '2020-12-31',
freq='BQ-JUN')
print("All the business quarterly begin dates of 2020:")
print(q_start_dates.values)
print("\nAll the business quarterly end dates of 2020:")
print(q_end_dates.values)
import pandas as pd
import pandas as pd
dateset1 = pd.date_range('2029-01-01 00:00:00', periods=20,
freq='3h10min')
print("Time series with frequency 3h10min:")
print(dateset1)
dateset2 = pd.date_range('2029-01-01 00:00:00', periods=20,
                         freq='1D10min20us')
print("\nTime series with frequency 1 day 10 minutes and 20 microseconds:")
print(dateset2)
import pandas as pd
newday = pd.Timestamp('2020-02-07')
print("First date:")
print(newday)
print("\nThe day name of the said date:")
print(newday.day_name())
print("\nAdd 2 days with the said date:")
newday1 = newday + pd.Timedelta('2 day')
print(newday1.day_name())
print("\nNext business day:")
nbday = newday + pd.offsets.BDay()
print(nbday.day_name())
import pandas as pd
dates1 = pd.to_datetime([1329806505, 129806505, 1249892905,
1249979305, 1250065705], unit='s')
print("Convert integer or float epoch times to Timestamp and DatetimeIndex up to second:")
print(dates1)
print("\nConvert integer or float epoch times to Timestamp and DatetimeIndex up to millisecond:")
dates2 = pd.to_datetime([1249720105100, 1249720105200, 1249720105300,
1249720105400, 1249720105500], unit='ms')
print(dates2)
import pandas as pd
from pandas.tseries.offsets import *
import datetime
from datetime import datetime, date
dt = datetime(2020, 1, 4)
print("Specified date:")
print(dt)
print("\nOne business day from the said date:")
obday = dt + BusinessDay()
print(obday)
print("\nTwo business days from the said date:")
tbday = dt + 2 * BusinessDay()
print(tbday)
print("\nThree business days from the said date:")
thbday = dt + 3 * BusinessDay()
print(thbday)
print("\nNext business month end from the said date:")
nbday = dt + BMonthEnd()
print(nbday)
import pandas as pd
import datetime
from datetime import datetime, date
sdt = datetime(2020, 1, 1)
edt = datetime(2020, 12, 31)
dateset = pd.period_range(sdt, edt, freq='M')
print("All monthly boundaries of a given year:")
print(dateset)
print("\nStart and end time for each period object in the said index:")
for d in dateset:
    print("{0} {1}".format(d.start_time, d.end_time))
import pandas as pd
import numpy as np
pi = pd.Series(np.random.randn(36),
pd.period_range('1/1/2029',
'12/31/2031', freq='M'))
print("PeriodIndex which represents all the calendar month periods from 2029 to 2031:")
print(pi)
print("\nValues for all periods in 2030:")
print(pi['2030'])
import pandas as pd
from datetime import datetime
from pandas.tseries.holiday import USFederalHolidayCalendar
sdt = datetime(2021, 1, 1)
edt = datetime(2030, 12, 31)
print("Holidays between 2021-01-01 and 2030-12-31 using the US federal holiday calendar.")
cal = USFederalHolidayCalendar()
for dt in cal.holidays(start=sdt, end=edt):
    print(dt)
import pandas as pd
mtp = pd.Period('2021-11','M')
print("Monthly time period: ", mtp)
print("\nAll the properties of the said period:")
print(dir(mtp))
import pandas as pd
ytp = pd.Period('2020','A-DEC')
print("Yearly time period:", ytp)
print("\nAll the properties of the said period:")
print(dir(ytp))
Question: Plotting I
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-09-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1.set_index('Date')
plt.figure(figsize=(5,5))
plt.suptitle('Stock prices of Alphabet Inc.,\n01-04-2020 to 30-09-2020',
             fontsize=18, color='black')
plt.xlabel("Date", fontsize=16, color='black')
plt.ylabel("$ price", fontsize=16, color='black')
df2['Close'].plot(color='green')
plt.show()
Question: Plotting II
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-09-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df2 = df.loc[new_df]
plt.figure(figsize=(10,10))
df2.plot(x='Date', y=['Open', 'Close'])
plt.suptitle('Opening/Closing stock prices of Alphabet Inc.,\n01-04-2020 to 30-09-2020',
             fontsize=12, color='black')
plt.xlabel("Date",fontsize=12, color='black')
plt.ylabel("$ price", fontsize=12, color='black')
plt.show()
Question: Plotting III
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-4-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1.set_index('Date')
plt.figure(figsize=(6,6))
plt.suptitle('Trading Volume of Alphabet Inc. stock,\n01-04-2020 to 30-04-2020',
             fontsize=16, color='black')
plt.xlabel("Date", fontsize=12, color='black')
plt.ylabel("Trading Volume", fontsize=12, color='black')
df2['Volume'].plot(kind='bar')
plt.show()
Question: Plotting IV
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-4-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Date', 'Open', 'Close']]
df3 = df2.set_index('Date')
plt.figure(figsize=(20,20))
df3.plot(kind='bar')
plt.suptitle('Opening/Closing stock prices Alphabet Inc.,\n01-04-2020 to 30-04-2020',
             fontsize=12, color='black')
plt.show()
Question: Plotting V
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-4-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Date', 'Open', 'Close']]
df3 = df2.set_index('Date')
plt.figure(figsize=(20,20))
df3.plot.bar(stacked=True)
plt.suptitle('Opening/Closing stock prices Alphabet Inc.,\n01-04-2020 to 30-04-2020',
             fontsize=12, color='black')
plt.show()
Question: Plotting VI
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-4-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Date', 'Open', 'Close']]
df3 = df2.set_index('Date')
plt.figure(figsize=(20,20))
df3.plot.barh(stacked=True)
plt.suptitle('Opening/Closing stock prices Alphabet Inc.,\n01-04-2020 to 30-04-2020',
             fontsize=12, color='black')
plt.show()
Question: Plotting VII
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Open','Close','High','Low']]
#df3 = df2.set_index('Date')
plt.figure(figsize=(25,25))
df2.plot.hist(alpha=0.5)
plt.suptitle('Opening/Closing/High/Low stock prices of Alphabet Inc.,\n From 01-04-2020 to 30-09-2020',
             fontsize=12, color='blue')
plt.show()
Question: Plotting VIII
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Open','Close','High','Low']]
# Stacked histogram of the four price columns with 20 bins.
df2.plot.hist(stacked=True, bins=20, figsize=(25, 25))
plt.suptitle('Opening/Closing/High/Low stock prices of Alphabet Inc.,\nFrom 01-04-2020 to 30-09-2020', fontsize=12, color='blue')
plt.show()
Question: Plotting IX
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-4-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Open']]
# Horizontal, cumulative histogram of the opening price.
df2.plot.hist(orientation='horizontal', cumulative=True, figsize=(15, 15))
plt.suptitle('Opening stock prices of Alphabet Inc.,\nFrom 01-04-2020 to 30-04-2020', fontsize=12, color='black')
plt.show()
Question: Plotting X
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Open','Close','High','Low']]
# Same stacked histogram as before, but with a much finer binning (200 bins).
df2.plot.hist(stacked=True, bins=200, figsize=(25, 25))
plt.show()
Question: Plotting XI
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Open','Close','High','Low']]
# DataFrame.hist draws one histogram per column in a grid of subplots.
df2.hist(figsize=(30, 30))
plt.suptitle('Opening/Closing/High/Low stock prices of Alphabet Inc., From 01-04-2020 to 30-09-2020', fontsize=12, color='black')
plt.show()
Question: Plotting XII
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
stock_data = df1.set_index('Date')
top_plt = plt.subplot2grid((5,4), (0, 0), rowspan=3, colspan=4)
top_plt.plot(stock_data.index, stock_data["Close"])
plt.title('Historical stock prices of Alphabet Inc. [01-04-2020 to 30-09-2020]')
bottom_plt = plt.subplot2grid((5,4), (3,0), rowspan=1, colspan=4)
bottom_plt.bar(stock_data.index, stock_data['Volume'])
plt.title('\nAlphabet Inc. Trading Volume', y=-0.60)
plt.gcf().set_size_inches(12,8)
plt.show()
Question: Plotting XIII
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
stock_data = df1.set_index('Date')
stock_data.plot(subplots = True, figsize = (8, 8));
plt.legend(loc = 'best')
plt.suptitle('Open,High,Low,Close,Adj Close prices & Volume of Alphabet Inc., From 01-04-2020 to 30-09-2020', fontsize=12, color='black')
plt.show()
Question: Plotting XIV
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
stock_data = df1.set_index('Date')
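close_px = stock_data['Adj Close']
# Simple moving averages of the adjusted close over 30 and 40 rows (trading days).
# Selecting the column by name avoids relying on its position (iloc[:, 4]).
stock_data['SMA_30_days'] = close_px.rolling(window=30).mean()
stock_data['SMA_40_days'] = close_px.rolling(window=40).mean()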
plt.figure(figsize=[10,8])
plt.grid(True)
plt.title('Historical stock prices of Alphabet Inc. [01-04-2020 to 30-09-2020]\n', fontsize=18, color='black')
plt.plot(stock_data['Adj Close'], label='Adjusted Closing Price', color='black')
plt.plot(stock_data['SMA_30_days'], label='30 days simple moving average', color='red')
plt.plot(stock_data['SMA_40_days'], label='40 days simple moving average', color='green')
plt.legend(loc=2)
plt.show()
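The moving-average lines in the plot above start later than the price line. A minimal sketch on made-up prices, assuming a window of 3 instead of 30 or 40 so the numbers are easy to check by hand, shows why: `rolling(window=n).mean()` returns NaN until n values are available, unless `min_periods` is lowered.

```python
import pandas as pd

# Made-up prices; the window is shortened to 3 for readability.
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Default behaviour: the first window-1 results are NaN.
sma_3 = prices.rolling(window=3).mean()
# -> NaN, NaN, 11.0, 12.0, 13.0

# With min_periods=1 the average is taken over whatever is available so far.
sma_3_partial = prices.rolling(window=3, min_periods=1).mean()
# -> 10.0, 10.5, 11.0, 12.0, 13.0

print(pd.DataFrame({"price": prices, "sma_3": sma_3, "sma_3_partial": sma_3_partial}))
```

With a 30-row window, the first 29 rows of SMA_30_days are therefore empty, and matplotlib simply does not draw them.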
Question: Plotting XV
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
stock_data = df1.set_index('Date')
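close_px = stock_data['Adj Close']
# 30-row simple moving average and 20-row exponential moving average of the adjusted close.
# Selecting the column by name avoids relying on its position (iloc[:, 4]).
stock_data['SMA_30_days'] = close_px.rolling(window=30).mean()
stock_data['EMA_20_days'] = close_px.ewm(span=20, adjust=False).mean()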
plt.figure(figsize=[15,10])
plt.grid(True)
plt.title('Historical stock prices of Alphabet Inc. [01-04-2020 to 30-09-2020]\n', fontsize=18, color='black')
plt.plot(stock_data['Adj Close'], label='Adjusted Closing Price', color='black')
plt.plot(stock_data['SMA_30_days'], label='30 days Simple moving average', color='red')
plt.plot(stock_data['EMA_20_days'], label='20 days Exponential moving average', color='green')
plt.legend(loc=2)
plt.show()
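The exponential moving average above comes from `ewm(span=20, adjust=False)`. As a rough illustration on made-up prices, assuming a span of 3 to keep the arithmetic short: with `adjust=False` pandas starts from the first value and then applies y_t = alpha * x_t + (1 - alpha) * y_(t-1), where alpha = 2 / (span + 1). The loop below (the name `manual` is only illustrative) reproduces that recursion by hand.

```python
import pandas as pd

# Made-up prices; span shortened to 3 so alpha = 2 / (3 + 1) = 0.5.
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])
span = 3
alpha = 2 / (span + 1)

# pandas result, computed the same way as in the solution (adjust=False).
ema = prices.ewm(span=span, adjust=False).mean()

# Hand-written recursion: start at the first price, then blend each new
# price with the previous average.
manual = [prices.iloc[0]]
for x in prices.iloc[1:]:
    manual.append(alpha * x + (1 - alpha) * manual[-1])

print(ema.tolist())  # -> [10.0, 10.5, 11.25, 12.125, 13.0625]
print(manual)        # same values
```

Unlike the simple moving average, the EMA has a value from the very first row, which is why its line spans the whole plot.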
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1.set_index('Date')
# Column names are passed as plain strings; figsize goes to pandas directly.
df2.plot.scatter(x='Close', y='Volume', s=50, figsize=(15, 10))
plt.grid(True)
plt.title('Trading Volume/Price of Alphabet Inc. stock,\n01-04-2020 to 30-09-2020', fontsize=14, color='black')
plt.xlabel("Stock Price",fontsize=12, color='black')
plt.ylabel("Trading Volume", fontsize=12, color='black')
plt.show()
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Date', 'Close']]
df3 = df2.set_index('Date')
data_filled = df3.asfreq('D', method='ffill')
data_returns = data_filled.pct_change()
data_std = data_returns.rolling(window=30, min_periods=30).std()
# Rolling 30-day standard deviation of daily returns, used here as a volatility measure.
data_std.plot(figsize=(20, 20))
plt.suptitle('Volatility over a period of time of Alphabet Inc. stock price,\n01-04-2020 to 30-09-2020', fontsize=12, color='black')
plt.grid(True)
plt.show()
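The volatility solution chains three steps: `asfreq('D', method='ffill')`, `pct_change()` and a rolling standard deviation. The sketch below walks through them on four made-up closing prices, assuming a window of 3 instead of 30; the dates skip a weekend on purpose to show what the forward fill does.

```python
import pandas as pd

# Made-up closes for Wed, Thu, Fri and the following Mon (no weekend rows).
close = pd.Series(
    [100.0, 102.0, 101.0, 103.0],
    index=pd.to_datetime(["2020-04-01", "2020-04-02", "2020-04-03", "2020-04-06"]),
    name="Close",
)

# Step 1: reindex to calendar days; missing days repeat the last known close.
daily = close.asfreq("D", method="ffill")   # 6 rows, Sat and Sun = 101.0

# Step 2: day-over-day relative change; the first row has no predecessor.
returns = daily.pct_change()

# Step 3: rolling standard deviation; min_periods=3 requires a full window.
vol = returns.rolling(window=3, min_periods=3).std()

print(pd.DataFrame({"close": daily, "return": returns, "vol": vol}))
```

The forward-filled weekend days have a return of exactly zero, which pulls the rolling standard deviation down. Working on trading days only (skipping the `asfreq` step) is an alternative if that effect is not wanted.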
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("alphabet_stock_data.csv")
start_date = pd.to_datetime('2020-4-1')
end_date = pd.to_datetime('2020-9-30')
df['Date'] = pd.to_datetime(df['Date'])
new_df = (df['Date']>= start_date) & (df['Date']<= end_date)
df1 = df.loc[new_df]
df2 = df1[['Date', 'Adj Close']]
df3 = df2.set_index('Date')
daily_changes = df3.pct_change(periods=1)
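# sns.distplot is deprecated in recent seaborn releases; histplot with kde=True
# draws a comparable density histogram with a smoothed curve on top.
sns.histplot(daily_changes['Adj Close'].dropna(), bins=100, color='purple', kde=True, stat='density')
plt.suptitle('Daily % return of Alphabet Inc. stock price,\n01-04-2020 to 30-09-2020', fontsize=12, color='black')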
plt.grid(True)
plt.show()