
CONTENTS

i PYTHON BASICS QUICK

1 Variables and Statements
2 User-Defined Functions
3 Conditions
4 Loops
5 Nested Loops
6 List and Dictionary

ii PYTHON & DATA SCIENCE

7 Numpy Quick
8 Pandas Quick
9 Visualisation Toolkits Quick

iii MUST DO EXERCISES

10 Some Basic Problems
11 Let's Play with Data

Part I

PYTHON BASICS QUICK


1
VARIABLES AND STATEMENTS

Let us start with the most basic command in Python, the one
that prints something.

print('Hello! Let us learn python')
# After writing this code, press Shift and Enter
# in your Jupyter notebook to run the code.

Out: Hello! Let us learn python


Note that you can add a comment next to the code using #, as
done above. Let us do some interesting prints for fun.

print('1')
print('1', '2')
print('1', '2', '3')
print('1', '2', '3', '4')  # after writing this, press Shift+Enter in your Jupyter notebook

Out:
1
1 2
1 2 3
1 2 3 4


This may be fun but is a long and tiring way to print such
a pattern. Maybe we will write a short code that can do
the same after some experience. Note that there are various
ways to write code with all of them giving the same output.
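
As a preview of that shorter code, here is a minimal sketch that builds the same pattern with a loop. It relies on a for loop, a list and a function, all of which are introduced in later chapters, and the helper name pattern_lines is a hypothetical choice made only for this illustration.

```python
def pattern_lines(n):
    """Return the n lines of the pattern as strings (hypothetical helper)."""
    lines = []
    for row in range(1, n + 1):
        # Row number `row` holds the numbers 1..row separated by spaces.
        numbers = [str(k) for k in range(1, row + 1)]
        lines.append(' '.join(numbers))
    return lines

for line in pattern_lines(4):
    print(line)
```

Changing the 4 to any other positive number prints a longer or shorter pattern without writing more print lines.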

Now let us talk about values and variables.

Consider a value (it can be an integer like 2, a float/decimal
type like 2.2, or a string like 'hello') and let us store
this value in a variable named x. This is just like how you
define an unknown x in mathematics and only later learn
what x holds after solving the problem.
The way to store a value in a variable is as follows:
x = 2.2
print(x)  # note that variables go inside print without any single quotes
Out: 2.2

Let us define one more variable y that stores an integer

y=4
print (y)
Out: 4

Let us do some mathematical operations like the following:

2*x  # multiplication
Out: 4.4
x/2  # division
Out: 1.1
x**2  # x raised to the power of 2
Out: 4.840000000000001
# Let us put them together horizontally and add some space between values
print(2*x, ' ', x/2, ' ', x**2)
Out: 4.4   1.1   4.840000000000001
print(x+y, ' ', x*y, ' ', x/y)
Out: 6.2   8.8   0.55
2
USER-DEFINED FUNCTIONS

Let us understand user-defined functions in Python.

Look at this. We have
f(x) = 2 × x
The function f is like a machine that eats a variable x and
vomits 2 × x.
Say
f(9) = 2 × 9 = 18
How can we define such a function in Python? This can be
done using def (which states the form of the function) and return
(which states what the function should vomit).
def f(x): return 2*x
f(9)
Out: 18
There is yet another way to do that, using a lambda function.
Let us define another function g such that

g(x) = x²

so that g(3) = 9.

g = lambda x: x**2
g(3)
Out: 9
Look at this:
h(x, y) = x + y
The function h is like another machine that eats two variables
x and y and vomits their sum.

h(2, 3) = 2 + 3 = 5

Thus h is a function of two variables x and y, unlike f,
which is a function of one variable x.
def h(x, y): return x+y
h(2, 3)
Out: 5
Let us do the same using a lambda function.
h = lambda x, y: x+y
Let us do something funny. Let a function print something.
How do we do that?
def printit():
    print('Hello world')
# call the function
printit()
Out: Hello world
Thus we have defined a function named printit(), which
runs the piece of code indented below its def line.
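
The functions f, g and h hand a value back with return, while printit only prints. The sketch below contrasts the two behaviours; the names double and show_double are hypothetical and chosen only for this illustration.

```python
def double(x):
    return 2 * x      # hands the value back to whoever called the function

def show_double(x):
    print(2 * x)      # only displays the value; nothing is returned

result = double(9)          # result stores the returned value
nothing = show_double(9)    # the value is printed, but nothing is stored
print(result, nothing)
```

A function without a return statement gives back the special value None, which is why a returned value can be reused in later calculations while a printed one cannot.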
3
CONDITIONS

Let us understand if and else. What do you understand by
a condition? Well, your friend SHYAM can put a condition on you,
saying that if "you give him more than 3 chocolates" then "he
will help you with your homework". Say C is the number of
chocolates, i.e. C is the variable that stores the number of
chocolates.
C = 9
if C > 3:
    print('I will help you')  # your friend SHYAM says he will help you
Out: I will help you
C = 2
if C > 3:
    print('I will help you')
Note that the above code does not give any output because
C is 2, which is not greater than 3. For situations like this, in
order to have an output, we use an else statement as
below. The else statement tells Python what piece of code
to execute if the given condition is not satisfied.
C = 2
if C > 3:
    print('I will help you')
else:
    print('I will not help you')
Out: I will not help you

Thus we understood what if does and what else does. In
summary, if tells you what to execute when the condition is
satisfied, and else tells you what to execute when the condition is
not satisfied.
Let us understand if, elif and else together using the example
below.
Consider that your friend asks you to play a dice game. You need
to roll two dice, A and B. First the value on dice A is
considered, and then the value on dice B.
If the value on dice A is greater than 3, you win. If not, we
look at the value on dice B, and if that value
is greater than 5, you win. Else you lose.
A = int(input('Enter the value on dice A:'))
B = int(input('Enter the value on dice B:'))
Out:
Enter the value on dice A:6
Enter the value on dice B:6
if A > 3:
    print('You win because of dice A')
elif B > 5:
    print('You win because of dice B')
else:
    print('You lose')
Out: You win because of dice A
Now your friend makes the game a bit more difficult. He tells
you to get a value greater than 4 on both dice when rolled.
This is now your winning condition. Thus the condition can
be restated as: dice A must have a value greater than 4 and
dice B must have a value greater than 4 too.
if A > 4 and B > 4:  # notice the and
    print('You win this difficult game of two dices')
else:
    print('You lose')
Out: You win this difficult game of two dices

However, your friend changes the rule in the middle of the game
because it was difficult for you to win. So he tells you to get
a value greater than 4 on either dice A or dice B. The code
becomes the following:
if A > 4 or B > 4:  # notice the or
    print('You win this difficult game of two dices')
else:
    print('You lose')
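
To see how and and or differ, it helps to try a few rolls side by side. The sketch below wraps the two rules in small functions; the function names and the sample rolls are assumptions made only for this illustration.

```python
def both_rule(a, b):
    # Win only if dice A and dice B are both greater than 4.
    return a > 4 and b > 4

def either_rule(a, b):
    # Win if at least one of the two dice is greater than 4.
    return a > 4 or b > 4

# Each pair is (value on dice A, value on dice B).
for a, b in [(6, 6), (5, 2), (2, 3)]:
    print(a, b, both_rule(a, b), either_rule(a, b))
```

The roll (5, 2) loses under the and rule but wins under the or rule, which is exactly why the second version of the game is easier.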
4
LOOPS

Let us understand the for loop.

Say you have a variable that stores a string (a word), and
you want to bring down its letters one by one.
D = 'Hello'
for i in D:
    print(i)

Out:
H
e
l
l
o

The i in the code touches each letter of D one by one, and
every time it touches a letter, it executes the print command,
which prints the letter that i stores temporarily.
for i in D:
    print(i, end=' ')
# Here we intend to print horizontally instead of vertically,
# as done in the previous cell.
Out: H e l l o
How about bringing down the elements one by one from a list
containing numbers?
E = [1, 2, 3, 4, 5]  # This is a list. We will read more about it later.
for i in E:
    print(i)
Out:
1
2
3
4
5

Let us make it more interesting. Since i touches all values
of E, let us add a condition so that a value that i touches is
printed only if it is greater than 4.
for i in E:
    if i > 4:
        print(i)
# The only value greater than 4 in the list E is 5
Out: 5
5
NESTED-LOOPS

The idea is to understand what happens when we use a second
for loop inside a for loop.

for i in range(1, 4):
    for j in range(1, 3):
        print('Hello')

Out:
Hello
Hello
Hello
Hello
Hello
Hello
Let us try to understand why we get 6 Hellos.
Well, i takes on the values 1, 2, 3. When i=1, j takes on
the values 1, 2. And so on.

i=1 j=1
    j=2
i=2 j=1
    j=2
i=3 j=1
    j=2

So we have a total of 6 (i, j) pairs over all possible i
values. The print('Hello'), which is inside the j loop, executes
once for each pair, which is 6 times here.
To understand this, let us print the corresponding i and j
values for each 'Hello' that is printed.
for i in range(1, 4):
    for j in range(1, 3):
        print('Hello', i, j)

Out:
Hello 1 1
Hello 1 2
Hello 2 1
Hello 2 2
Hello 3 1
Hello 3 2

import pandas as pd

data = {'output': ['Hello', 'Hello', 'Hello', 'Hello', 'Hello', 'Hello'],
        'i': [1, 1, 2, 2, 3, 3],
        'j': [1, 2, 1, 2, 1, 2]}

df = pd.DataFrame(data)

print(df)

Out:
  output  i  j
0  Hello  1  1
1  Hello  1  2
2  Hello  2  1
3  Hello  2  2
4  Hello  3  1
5  Hello  3  2
We will understand the above code later, but you can now
see how each printed Hello is related to its corresponding
i and j values.
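
The count of 6 can also be checked by letting the loops collect the pairs instead of typing them by hand. This is a minimal sketch; the list name pairs is an assumption made for this illustration.

```python
pairs = []
for i in range(1, 4):        # i takes the values 1, 2, 3
    for j in range(1, 3):    # for each i, j takes the values 1, 2
        pairs.append((i, j))

print(len(pairs))            # 3 values of i times 2 values of j
print(pairs)
```

In general the inner statement runs (number of i values) × (number of j values) times, so changing either range changes the total.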
6
LIST AND DICTIONARY

Let us understand data structures.

First note that a data structure is a collection of multiple
pieces of data. You can think of a list as storing data such as
numbers or strings.

L_NUM = [1, 2, 3, 4]  # L_NUM is a list
L_STR = ['a', 'b', 'c', 'd']  # L_STR is a list too
L_FRUIT_DATA = ['apple', 'orange', 'mango']  # L_FRUIT_DATA is a list of fruit names
L_SALARY_DATA = [25000, 30000, 40000, 80000]  # L_SALARY_DATA is a list of the salaries of 4 employees in a company

Let us focus on the salary data list, i.e. L_SALARY_DATA. It
would be nice to have a tabular representation of the
salaries along with the employees' names. This is where the
dictionary comes in.

Dic_SALARY_DATA = {'TOM': 25000, 'HARRY': 30000,
                   'TONY': 40000, 'HARRY': 80000}
print(Dic_SALARY_DATA)

Out: {'TOM': 25000, 'HARRY': 80000, 'TONY': 40000}

Note that the key 'HARRY' appears twice in the dictionary, so
only its last value (80000) is kept.
Still not satisfied with how it looks? Let us move to using
a pandas Series then.
import pandas as pd
pd.Series(Dic_SALARY_DATA)
Out:
TOM      25000
HARRY    80000
TONY     40000
dtype: int64
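
The salary list had four values, but the output above shows only three entries. That happens because a dictionary key must be unique, so a repeated key overwrites the earlier value. The sketch below demonstrates this; the keys 'HARRY_1' and 'HARRY_2' are hypothetical names used only to show one possible way around it.

```python
salaries = {'TOM': 25000, 'HARRY': 30000, 'TONY': 40000, 'HARRY': 80000}
print(len(salaries))        # the second 'HARRY' replaced the first one
print(salaries['HARRY'])

# One possible fix (an assumption, not from the original data):
# give the two employees distinct keys.
salaries_fixed = {'TOM': 25000, 'HARRY_1': 30000,
                  'TONY': 40000, 'HARRY_2': 80000}
print(len(salaries_fixed))
```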
Part II

PYTHON & DATA SCIENCE
7
NUMPY QUICK

We know what a list is. For example, here is a list L=[1,2,3].
Now try to double all values of the list. A natural thing to
do is to multiply it by 2. Let's see what we get.
L = [1, 2, 3]
2*L
Out: [1, 2, 3, 1, 2, 3]
Unfortunately, we did not get what we expected. We expected
to get 2*L=[2,4,6]. However, numpy can help us
here.
import numpy as np
2 * np.array(L)
Out: array([2, 4, 6])
Now that was simple. Let us look at the data types using the
type function.
print(type(L), type(np.array(L)))
Out: <class 'list'> <class 'numpy.ndarray'>
What do you see? type(L) is class list and type(np.array(L))
is class numpy.ndarray.
What else can we not do with a list but can do with a numpy
array? How about adding two lists? Let's see what happens.
Let us define two lists L and L_1 and add them.
L = [1, 2, 3]
L_1 = [4, 5, 6]
L + L_1
Out: [1, 2, 3, 4, 5, 6]
Well, adding two lists gives you a third list containing all
values of the first and second lists. But we were unable to
actually do the mathematical addition. Numpy will help us.
Convert each list to a numpy array and store it (storing
is optional). Then add them.
N = np.array(L)  # We have converted list L into a numpy array and stored it in N
N_1 = np.array(L_1)  # We have converted list L_1 into a numpy array and stored it in N_1
N + N_1  # we add the two numpy arrays here
Out: array([5, 7, 9])
All good. Now we understand what a list can do and what
numpy can do.
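
For comparison, the same element-by-element results can be produced with plain lists, at the cost of writing the loop yourself. This is a minimal sketch using list comprehensions and zip; the names doubled and summed are assumptions made for this illustration.

```python
L = [1, 2, 3]
L_1 = [4, 5, 6]

# Double every element: visit each value and multiply it by 2.
doubled = [2 * v for v in L]

# Add element by element: zip pairs up (1, 4), (2, 5), (3, 6).
summed = [a + b for a, b in zip(L, L_1)]

print(doubled)
print(summed)
```

Numpy does the same work without the explicit loop, which is why it is the more convenient tool for numerical data.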
8
PANDAS QUICK

Just as numpy is closely related to the list, you can
think of pandas as being closely related to the dictionary. Let us
understand a dictionary first. Then we will look at the pandas
Series and DataFrame. Let us define two dictionary types:
the KEY AND NUMBER DICTIONARY and the KEY AND LIST DICTIONARY.
KEY AND NUMBER DICTIONARY
This is dictionary of type 1.
dic_1 = {'Ram_toys': 1, 'Shyam_toys': 2, 'Radha_toys': 3}  # This is a key and number dictionary.
# Here the keys are 'Ram_toys', 'Shyam_toys' and 'Radha_toys', and they store only numbers.
print(dic_1)
Out: {'Ram_toys': 1, 'Shyam_toys': 2, 'Radha_toys': 3}
Let us connect the list L=[1,2,3] and the dictionary dic_1 defined above.
You can think of it like this. The list L stores only values,
without indicating what they are, whereas in the dictionary dic_1
you at least know that the values 1, 2, 3 are the numbers of toys that
Ram, Shyam and Radha have.

It would have been nice if we could see them in some
tabular form. This is where the pandas Series helps.
A pandas Series eats a key and number dictionary; the keys become the index.
import pandas as pd
pd.Series(dic_1)
Out:
Ram_toys      1
Shyam_toys    2
Radha_toys    3
dtype: int64
Looks nice in tabular form, right? Let us now see the KEY AND
LIST DICTIONARY.
KEY AND LIST DICTIONARY
Now let each key be related to a list instead of a single number.
This is dictionary of type 2.

dic_2 = {'Ram_toys': [1, 2, 3], 'Shyam_toys': [4, 5, 6], 'Radha_toys': [7, 8, 9]}  # This is a key and list dictionary.
# Here the keys are 'Ram_toys', 'Shyam_toys' and 'Radha_toys', and they store only lists.
print(dic_2)
Out: {'Ram_toys': [1, 2, 3], 'Shyam_toys': [4, 5, 6], 'Radha_toys': [7, 8, 9]}
It would be nice to have them in tabular form. This is
where the pandas DataFrame enters and helps. Think of a pandas
DataFrame as something that eats a dictionary of type 2.
pd.DataFrame(dic_2, index=['cat', 'dog', 'dragon'])
Out:
        Ram_toys  Shyam_toys  Radha_toys
cat            1           4           7
dog            2           5           8
dragon         3           6           9

What do you understand from the above table? The table says
that Ram has 1 cat toy, 2 dog toys and 3 dragon toys, and
that Shyam has 4 cat toys, 5 dog toys and 6 dragon
toys. And so on for Radha.
9
VISUALISATION TOOLKITS QUICK

We will look at two libraries: matplotlib and seaborn. First
let us understand how to plot a straight line using matplotlib.
A) LINEAR AND NON-LINEAR CURVE PLOT

def f(x): return 2*x+3  # here we define a straight line
import numpy as np
import matplotlib.pyplot as plt
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])  # the values of x for which you want to plot f(x), i.e. y = f(x) = 2x+3
print(f(x))
plt.plot(f(x))

Out: [ 3  5  7  9 11 13 15 17 19 21]

We can make the function a bit more complicated and see what
graph we get. Maybe a non-linear curve, for fun.

def g(x): return x**2  # here we define a non-linear curve
import matplotlib.pyplot as plt
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])  # the values of x for which you want to plot g(x), i.e. y = g(x) = x**2
print(g(x))
plt.plot(g(x))

Out: [ 0  1  4  9 16 25 36 49 64 81]
# Plot many functions in one figure. Here is an example.
plt.plot(g(x))
plt.plot(f(x))

Out:

Now we know how to plot curves using matplotlib. Now
let us plot some histograms.

B) HISTOGRAMS
Histograms basically show a frequency distribution, that is,
how many times a particular number appears in a list.
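
Before drawing anything, the frequencies can be counted directly. The sketch below uses Counter from Python's standard library on the same numbers that the histogram example uses; the variable names are assumptions made for this illustration.

```python
from collections import Counter

numbers = [1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6]

counts = Counter(numbers)      # maps each number to how often it appears
for value in sorted(counts):
    print(value, counts[value])
```

These counts are what the heights of the histogram bars represent, so they are a handy check on the plot.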

List = [1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6]

plt.hist(List, width=0.1)  # I have reduced the width here.

Out: (array([5., 0., 1., 0., 1., 0., 7., 0., 3., 3.]),
array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ]),

plt.hist(List, width=1.1)  # I have increased the width here.

Out: (array([5., 0., 1., 0., 1., 0., 7., 0., 3., 3.]),
array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ]),

Now it is your job to understand how width affects the intuition
behind the visualisation. Let's move on.
Remember our dataframe? Let's give it the name df.
df = pd.DataFrame(dic_2, index=['cat', 'dog', 'dragon'])
# Let us call individual columns.
df['Ram_toys']
Out:
cat       1
dog       2
dragon    3
Name: Ram_toys, dtype: int64
df['Shyam_toys']
Out:
cat       4
dog       5
dragon    6
Name: Shyam_toys, dtype: int64
df['Radha_toys']
Out:
cat       7
dog       8
dragon    9
Name: Radha_toys, dtype: int64
Let us learn the bar plot. Note that whether you should use a
histogram or a bar plot depends on the data.
This is where you should know the data properly and use
your brain.

C) BAR PLOT
# Let us use the pandas library to plot a bar graph
df.plot.bar()
Out:

Kids = list(df.keys())
values = list(df.values)
# Here we took the keys and values from df and kept them in lists named Kids and values
Kids

Out: ['Ram_toys', 'Shyam_toys', 'Radha_toys']

values

Out: [array([1, 4, 7], dtype=int64), array([2, 5, 8], dtype=int64), array([3, 6, 9], dtype=int64)]
Let us use matplotlib instead of the pandas plot. Pandas itself
has plotting methods, which can sometimes be simpler.

plt.bar(Kids, values[0], color='maroon', width=0.4)
plt.xlabel("Kids")
plt.ylabel("cats")
# This plot says Ram has 1 cat, Shyam has 4 and Radha has 7 cats.
# values[0] says that you are looking at the cat values. Think here properly.

Out:

Here you can ask yourself whether a histogram would make any
sense for the dataframe df under consideration.
D) SCATTER PLOT
To understand the scatter plot, let us work with a dataset.
The following dataset is simple. There are two
columns: one is 'Hours' and the other is 'Scores'. The dataset
says that if a student studies for a particular number of hours (say
2.5 hours) then he gets a particular score (say 21).
You can put 'Hours' on the x-axis and 'Scores' on the y-axis. Now
let us first see the dataset.
#READ DATA USING PANDAS
df = pd.read_csv(r"C:\Users\HP\Downloads\student_scores.csv")
df

Out:
Let us look at the scatter plot. We will use the seaborn
library.

import seaborn as sns

# scatter plot
sns.scatterplot(data=df, x='Hours', y='Scores', marker='*')

Out:

# You can use pandas to plot too. It depends on which one is more comfortable.
df.plot.scatter(x='Hours', y='Scores')

Out:

What do you learn from the scatter plot? Basically, it shows that
students who study more score more. Natural. As
the value of Hours becomes higher, the corresponding
value of Scores is also higher.

Seaborn has an interesting function that plots histograms and
scatter plots in one shot:

sns.pairplot(data=df)

Out:

E) BOX PLOT

You can use either pandas or seaborn to draw a box plot. Let
us use pandas first and then seaborn.

df.plot.box()

Out:

Out:

It is clear that the box plot tells you the range of values in the
'Hours' column and the 'Scores' column.

df.describe()

Out:
To understand the box plot, let us see the minimum and maximum
values in both columns. The hours range from a minimum of 1.1 to a
maximum of 9.2, and the same is reflected in the box plot. The Scores
column ranges from a minimum of 17 to a maximum of 95, and the same
is reflected in the box plot.
Part III

MUST DO EXERCISES
10
SOME BASIC PROBLEMS

1. Try to print this:
Hello.
How are you?

print('''Hello.
How are you?''')  # '''text''' instead of "text" keeps the line break.
Out:
Hello.
How are you?
2. Store some values in variables x, y, z.
x = 2
y = 3.5
z = 6
x, y, z  # you can just call your variables without using print.
Out: (2, 3.5, 6)
3. Add all of them and double it.
2*(x+y+z)
Out: 23.0
4. Solve this:
(x + y) − 2x × z/2
= (2 + 3.5) − 2(2) × 6/2
= 5.5 − 24/2
= 5.5 − 12
= −6.5

(x+y) - 2*x*z/2

Out: -6.5
5. Find the square root of x.
x**(1/2)

Out: 1.4142135623730951
6. Let's solve a math problem with the help of Python.
Q. My brother is 5 years older than me. If my age is 15, find
my brother's age.
brother_age = my_age+5
my_age = 15
brother_age

Out:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[6], line 1
----> 1 brother_age = my_age+5
      2 my_age = 15
      3 brother_age
NameError: name 'my_age' is not defined
Why didn't it work?
Because Python follows your instructions in sequence. It
did not recognise "my_age" in the 1st line because you
introduced it only in the 2nd line.
Let's see whether swapping the order works.
my_age = 15
brother_age = my_age+5
brother_age

Out: 20
7. What is the type of "brother_age"?
type(brother_age)
Out: int
8. Print your brother's age in a sentence.
print("My brother's age is " + str(brother_age))  # + joins strings only, so the number is converted with str
Out: My brother's age is 20
9. Input a number and write if it is even or odd.
n = int(input("Enter a number : "))
if n % 2 == 0:
    print("It is an even number")
else:
    print("It is an odd number")
Out: Enter a number : 3
It is an odd number

10. Count the letters in "Captain".
count = 0
for i in "Captain":
    count = count+1
print(count)
Out: 7
11. Write a program which directs you to the right path.
Children below 14 do not require a ticket; they can enter the
museum. Females buy their ticket from the left counter. Males
buy their ticket from the right counter.
age = int(input("Enter your age: "))
gender = input("Enter your gender (male/female)")
if age >= 14:
    if gender.lower() == "female":  # lower() makes "Female" and "female" both match
        print("You can buy your ticket from the counter on left")
    else:
        print("You can buy your ticket from the counter on right")
else:
    print("you do not require a ticket. You can enter the museum")
Out: Enter your age: 6
Enter your gender (male/female)male
you do not require a ticket. You can enter the museum
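
A program that waits for input is awkward to check for every case. One option is to move the decision into a function and call it with a few sample visitors. This is a minimal sketch of that idea; the function name museum_direction and the returned phrases are assumptions made for this illustration.

```python
def museum_direction(age, gender):
    """Return where a visitor should go (hypothetical helper)."""
    if age < 14:
        return 'no ticket needed'
    # strip() and lower() make ' Female', 'FEMALE' and 'female' all match.
    if gender.strip().lower() == 'female':
        return 'left counter'
    return 'right counter'

print(museum_direction(6, 'male'))
print(museum_direction(30, 'Female'))
print(museum_direction(30, 'male'))
```

Comparing the text after lower() matters here: a check such as gender == "Female" would send a visitor who types "female" to the wrong counter.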
List and Dictionaries

Let's do some problems and improve our understanding
of lists and dictionaries.
1. Given a list, find the location of its elements.
L = ['apple', 'orange', 'mango', 'pineapple']
# The location can be found using index
L.index('apple')
Out: 0
L.index('orange')
Out: 1
L.index('mango')
Out: 2
L.index('pineapple')
Out: 3
2. Make a list containing the locations of the elements of L.
L_location = [L.index('apple'), L.index('orange'), L.index('mango'), L.index('pineapple')]
L_location
Out: [0, 1, 2, 3]
3. The above way of making the list of locations is time
consuming if the list contains, say, 100 elements. Can you
find an easier way to do this?
The best way is to use a loop. Let us write code below to
do the same task.
L_Location = []
for i in L:
    L_Location.append(L.index(i))
L_Location
Out: [0, 1, 2, 3]
Let us explain append. Say L=[1,2,3,4]. Suppose you want
to add another number 5 to the list. Then all you need to do
is
L.append(5)
Now print L. You will see the following:
[1, 2, 3, 4, 5]
4. Can you write code to show the evolution of the list
L_Location?
L_Location = []
for i in L:
    L_Location.append(L.index(i))
    print(L_Location)
Out:
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
Note how the list gets filled after each pass of the loop.
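
One caveat about the loop above: index returns the position of the first match, so it gives misleading locations when a value repeats. The sketch below shows the difference on a hypothetical list with a repeated fruit and uses the built-in enumerate as an alternative.

```python
fruits = ['apple', 'orange', 'apple']   # 'apple' appears twice (made-up data)

# index() always reports the first place a value is found.
with_index = [fruits.index(f) for f in fruits]

# enumerate() hands out the running position together with each element.
with_enumerate = [position for position, f in enumerate(fruits)]

print(with_index)
print(with_enumerate)
```

For a list without repeats, such as L above, both approaches give the same answer.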
5. Can you slice the list and make a new list with a limited
number of elements from the original list?
L = [1, 2, 3, 4]
L_1 = L[0:2]
L_1
Out: [1, 2]
L = [1, 2, 3, 4]
L_2 = L[0:3]
L_2
Out: [1, 2, 3]
L = [1, 2, 3, 4]
L_3 = L[1:3]
L_3
Out: [2, 3]
L = [1, 2, 3, 4]
L_4 = L[2:3]
L_4
Out: [3]
6. Make a list containing the numbers from 0 to 19.
J = [*range(20)]
J
Out: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19]
7. Make a list containing the even numbers from 20 to 30.
J_1 = [*range(20, 30, 2)]
J_1
Out: [20, 22, 24, 26, 28]
So here the range runs from 20 up to (but not including) 30 with
step size 2. Let us also mention that a for loop can help us
with this task.
for i in range(20, 30, 2):
    print(i)
Out: 20 22 24 26 28
All we need is to append these values to a blank list.
J_2 = []
for i in range(20, 30, 2):
    J_2.append(i)
print(J_2)
Out: [20, 22, 24, 26, 28]
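
Notice that 30 itself is missing from these results, because range stops just before its second argument. If 30 should be included, the stop value has to go past it. The sketch below shows this; the variable names are assumptions made for this illustration.

```python
evens_to_28 = list(range(20, 30, 2))   # stops before 30
evens_to_30 = list(range(20, 31, 2))   # stop value 31 lets 30 in

# The same list built by testing each number with the % operator.
evens_by_test = [n for n in range(20, 31) if n % 2 == 0]

print(evens_to_28)
print(evens_to_30)
print(evens_by_test)
```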
8. Given two lists containing names of flowers, what happens
when you add them?
H_1 = ['rose', 'sunflower', 'marigold']
H_2 = ['lotus', 'jasmine']
H_1 + H_2
Out: ['rose', 'sunflower', 'marigold', 'lotus',
'jasmine']
Thus adding two lists gives you a bigger list.
9. How can a list be utilized to represent a 2 by 2 matrix?
Matrix_2D = [[1, 2], [3, 4]]

Matrix_2D represents the matrix whose rows are [1, 2] and [3, 4].
10. Can you bring down all rows of the above matrix?
for i in Matrix_2D:
    print(i, end=' ')
Out: [1, 2] [3, 4]
11. Can you bring down all elements of the above matrix?
for i in Matrix_2D:
    for j in i:
        print(j, end=' ')
Out: 1 2 3 4
Here we have two loops. The i brings down all rows and the j
brings down all elements in each row.
12. Given the dictionary dic_1={'a':1,'b':2,'c':3}, use it in a
pandas Series. Then write code to bring down all its keys and
values.
dic_1 = {'a': 1, 'b': 2, 'c': 3}
import pandas as pd
df_1 = pd.Series(dic_1)
# Bring down all the values
for i in df_1.values:
    print(i)

Out: 1 2 3
# Bring down all the keys
for i in df_1.keys():
    print(i)

Out: a b c
13. Given dic_2={'a':[100,200,300],'b':[400,500,600]}, use a pandas
DataFrame to make a dataframe out of this dictionary. Then write code
to bring down the keys and associated lists.
dic_2 = {'a': [100, 200, 300], 'b': [400, 500, 600]}
df_2 = pd.DataFrame(dic_2)
df_2

Out:
     a    b
0  100  400
1  200  500
2  300  600

for i in df_2.keys():
    print(i)

Out: a b
Basically, 'a' and 'b' are just the column names of the table.
Now let us bring down the column values (list values).

df_2['a']

Out:
0    100
1    200
2    300
Name: a, dtype: int64

df_2['b']

Out:
0    400
1    500
2    600
Name: b, dtype: int64

Rows can be extracted by position using iloc.

df_2.iloc[0]

Out:
a    100
b    400
Name: 0, dtype: int64

Similarly, we can extract the other row values.

df_2.iloc[1]

Out:
a    200
b    500
Name: 1, dtype: int64

df_2.iloc[2]

Out:
a    300
b    600
Name: 2, dtype: int64

15. Write the code to locate a particular value in the table/data
frame.

df_2.loc[0, 'a']

Out: 100
Note that the value 100 can be located exactly if the index
value (on the left) and the column name (on the top) are
known. The code goes like this:
df_2.loc[index value, column name]
11
LET'S PLAY WITH DATA

Let us start by importing a simple dataset with one input
feature and one output feature. This is the 'Hours and
Scores' dataset.

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

#READ DATA USING PANDAS
df = pd.read_csv(r"C:\Users\HP\Downloads\student_scores.csv")
df

Out:

One extra piece of information. The reason I am calling the 'Hours'
column the input feature and the 'Scores' column the output feature
is that in machine learning, we would like to develop a
model that takes a new value of 'Hours' as input and predicts
the 'Score'.
That is, if a student studies for, say, 6.5 hours, what will be the
score? Note that the value 6.5 hours is a new value and does
not exist in the dataset. But anyway, we will learn more
about it in our future booklets.
Now, let us work with the 'Hours' column.

# Bring out the Hours column from the dataframe
df['Hours']

Out: 0 2.5
1 5.1
2 3.2
3 8.5
4 3.5
5 1.5
6 9.2
7 5.5
8 8.3
9 2.7
10 7.7
11 5.9
12 4.5
13 3.3
14 1.1
15 8.9
16 2.5
17 1.9
18 6.1
19 7.4
20 2.7
21 4.8
22 3.8
23 6.9
24 7.8
Name: Hours, dtype: float64
Note that the extra values on the left are indices.

# Bring out the values for Hours without the indices
for i in df['Hours']:
    print(i)

Out: 2.5
5.1
3.2
8.5
3.5
1.5
9.2
5.5
8.3
2.7
7.7
5.9
4.5
3.3
1.1
8.9
2.5
1.9
6.1
7.4
2.7
4.8
3.8
6.9
7.8

# Let us create a blank list and put the values there
H = []
for i in df['Hours']:
    H.append(i)
print(H)
Out: [2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5,
8.3, 2.7, 7.7, 5.9, 4.5, 3.3, 1.1, 8.9, 2.5, 1.9,
6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8]
# Can we do the same in any other way? Yes. Numpy can help.
H_1 = np.array(df['Hours'])
print(H_1)
Out: [2.5 5.1 3.2 8.5 3.5 1.5 9.2 5.5 8.3 2.7
7.7 5.9 4.5 3.3 1.1 8.9 2.5 1.9 6.1 7.4 2.7 4.8
3.8 6.9 7.8]
# How about we take out those values of hours that are greater than 6
# and at most 9, and put them in a list
H_condition = []
for i in df['Hours']:
    if i > 6 and i <= 9:
        H_condition.append(i)
print(H_condition)
Out: [8.5, 8.3, 7.7, 8.9, 6.1, 7.4, 6.9, 7.8]

Let us arrange this in ascending order.

H_condition.sort()
print(H_condition)
Out: [6.1, 6.9, 7.4, 7.7, 7.8, 8.3, 8.5, 8.9]
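
The filter and the sort can also be written in one step. The sketch below works on a plain list holding the Hours values printed above, so it runs without the CSV file; the names hours and selected are assumptions made for this illustration.

```python
# The Hours values shown in the output above, typed in as a plain list.
hours = [2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.7, 5.9, 4.5,
         3.3, 1.1, 8.9, 2.5, 1.9, 6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8]

# Keep the values greater than 6 and at most 9, then sort them.
# Python allows the chained comparison 6 < h <= 9.
selected = sorted(h for h in hours if 6 < h <= 9)
print(selected)
```

Unlike the sort method, which rearranges a list in place, sorted returns a new list and leaves the original order untouched.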

Let us understand iloc in pandas. This will help us filter
the rows of a data frame using index positions. For example,
below I have asked for positions 0 up to (but not including) 3.
See what we get in turn.
df.iloc[0:3]
Out:
   Hours  Scores
0    2.5      21
1    5.1      47
2    3.2      27

Thus iloc takes in index positions and returns the corresponding
rows.
Now, instead of showing all columns, let us use iloc to show
only the Hours column values for the corresponding index values.
df.iloc[0:3]['Hours']
Out:
0    2.5
1    5.1
2    3.2
Name: Hours, dtype: float64
However, we can be more specific. Let us choose a specific
index value and get the Hours column value there. Let us choose
the first index value, which is zero.
df['Hours'][0]
Out: 2.5
How about doing the reverse? Let us give an actual Hours
column value and get the corresponding index value.
# Where is the location/index value for the hours value 3.2?
np.where(df['Hours'] == 3.2)
Out: (array([2], dtype=int64),)
# The same is done using pandas.
df[df['Hours'] == 3.2].index.values
Out: array([2], dtype=int64)


Let us delve into a tough question. You can skip it if you
don't like it. The idea is to use what we have learned above
to solve the following question:
Given a few values from the 'Hours' column, can you find
their corresponding index values and put them in a list or numpy
array?
# Let us find the index values for the Hours values in the list H_condition above
index_condition = []
for i in H_condition:
    index_condition.append(df[df['Hours'] == i].index.values)
index_condition

Out: [array([18], dtype=int64),


array([23], dtype=int64),
array([19], dtype=int64),
array([10], dtype=int64),
array([24], dtype=int64),
array([8], dtype=int64),
array([3], dtype=int64),
array([15], dtype=int64)]

EXERCISE: You can play with the Scores column as we
have done with the Hours column. This will help you refresh
the ideas we have developed above.
Let us do something interesting. Let us convert the dataframe
into a list of lists (a matrix).
Once we achieve this, we will use an interesting loop
that you might not have seen before in the book.
The idea is to have the following matrix structure:
[[Hour, Score], [Hour, Score], [Hour, Score], ...]

Thus we want each hour and its corresponding score value
in a single bracket.
Let us first understand the bad approach. In it, i and j touch all the
values from Hours and Scores respectively and make all possible
combinations of them. Well, this is weird, as we do not
want all possible combinations of the values but only those
values which are in one-to-one correspondence, as can be seen
from the df dataframe.
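
The bad code itself is not shown here, so the following is only a sketch of what such an approach might look like: two nested loops over a tiny made-up sample of three hours and three scores, compared with pairing the values position by position using zip. All names in it are assumptions made for this illustration.

```python
hours = [2.5, 5.1, 3.2]     # small sample, for illustration only
scores = [21, 47, 27]

# Nested loops: every hour is combined with every score.
all_combinations = []
for i in hours:
    for j in scores:
        all_combinations.append([i, j])

# zip: the first hour goes with the first score, the second with the second...
paired = [[h, s] for h, s in zip(hours, scores)]

print(len(all_combinations))
print(paired)
```

With 25 rows the nested version would produce 25 × 25 = 625 pairs, most of them meaningless, while the paired version keeps exactly one pair per row.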
Matrix_good = []
for i in range(0, 25):
    Matrix_good.append([df['Hours'][i], df['Scores'][i]])
print(Matrix_good)
Out: [[2.5, 21], [5.1, 47], [3.2, 27], [8.5,
75], [3.5, 30], [1.5, 20], [9.2, 88], [5.5, 60],
[8.3, 81], [2.7, 25], [7.7, 85], [5.9, 62], [4.5,
41], [3.3, 42], [1.1, 17], [8.9, 95], [2.5, 30],
[1.9, 24], [6.1, 67], [7.4, 69], [2.7, 30], [4.8,
54], [3.8, 35], [6.9, 76], [7.8, 86]]
Now we will write code as follows, where i takes all the first
entry values in [first entry, ] (note that the first entry holds
the Hours column values) and j takes all the second entry values in
[ , second entry] (note that the second entry holds the Scores
column values).
# This piece of code prints the first entry values and second entry values in the matrix.
# Note that it looks like a dataframe but without the index.
for i, j in Matrix_good:
    print(i, j)

Out:
2.5 21
5.1 47
3.2 27
8.5 75
3.5 30
1.5 20
9.2 88
5.5 60
8.3 81
2.7 25
7.7 85
5.9 62
4.5 41
3.3 42
1.1 17
8.9 95
2.5 30
1.9 24
6.1 67
7.4 69
2.7 30
4.8 54
3.8 35
6.9 76
7.8 86
