Dataframe in Pandas

You might also like

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 23

Dataframe in Pandas

Q1. Create the following dataframes:

i. CricketPlayers from a list of dictionaries containing names of five cricket


players, number of matches played and Average Score.
ii. Items from a list of dictionaries containing names of five items, cost
price, sales price, discount(if any).
iii. Result from a dictionary of series containing rollnumber of 6 students
and their percentage in last five years
iv. Monuments from a dictionary of series containing names of 10
monuments, their year of built, place and who built them
v. Countries from a list of dictionaries containing names of 10 countries, its
national animal, bird and currency.

 Q2. Consider the following dataframe RESULTSHEET:

   UT1 Half Yearly UT2 Final


Sharad 57 83 49 89
Mansi 86 67 87 90
Kanika 92 78 45 66
Ramesh 52 84 55 78
Ankita 93 75 87 69
Pranay 98 79 88 96

 Here, Names of the students are row labels and term names (UT1, Half
Yearly, UT2 and Final) are the column labels. Answer the following questions
based on the above dataframe:

a. Change the row labels from student name to roll numbers from 1 to 6.
b. Change the column labels to Term1, Term2, Term3, Term4.
c. Add a new column Grade with values ‘A’, ‘A’,’B’,’A’,’C’, ‘B’
d. Add a new row for the student with row label=7 and marks equal to
49, 56, 75,58 and grade=b.
e. Delete the first row
f. Delete the third column
g. Display 2nd row with all columns
h. Display students who have scored more than 50 in Final exam
i. Check students who have grade as A
j. Display marks in Half Yearly and Final of all students
k. Display marks of students from Mansi to Ankita
l. Display marks of Mansi to Ankita in UT1 and UT2
m. Display marks of Kanika and Ankita in Half Yearly and Final
n. Display first 3 records
o. Display last four records

 
Q3. Write a Python program to create the following dataframe DOCTOR using
the index values as 10,20,30,40, 50, 60, 70.

ID NAME DEPT EXPERIENCE


101 JOHN ENT 12
102 SMITH ORTHOPEDIC 5
103 GEORGE CARDIOLOGY 10
104 LARA SKIN 3
105 K GEORGE MEDICINE 9
106 JOHNSON ORTHOPEDIC 10
107 LUCY ENT 3

 Q4. Give commands to perform the following operations on the dataframe


DOCTOR:

a. Write code to display the details of LARA using loc.


b. Write code to display the details of LARA and LUCY using iloc.
c. Write code to display all doctor’s names.
d. Write code to display all doctor’s names along with DEPT
e. Write code to display the first 3 records from the dataframe.
f. Write code to display the last 4 records from the data frame.
g. Write code to display the department and experience of doctors with
names JOHN and SMITH.
h. Write code to display 2nd to 6th record
i. Display all the odd numbered records.
j. Write code to insert a column named “AGE” giving appropriate values
to each doctor.

 Q5. Consider the following DataFrame Flight_Fare:

FL_NO AIRLINES FARE


IC701 INDIAN AIRLINES 6500
MU499 SAHARA 9400
AM501 JET AIRWAYS 13400
IC899 INDIAN AIRLINES 8300
IC302 INDIAN AIRLINES 4300

Give the output of the following commands:

a. Flight_Fare [Flight_Fare.index>1]
b. Flight_Fare [( Flight_Fare .FARE>=4000)&( Flight_Fare
.FARE<=9000)]
c. Flight_Fare [( Flight_Fare .FL_NO== "IC701")| ( Flight_Fare
.FL_NO== "AM501")| ( Flight_Fare .FL_NO== " IC302")]
d. Flight_Fare [( Flight_Fare .FARE>=4000)&( Flight_Fare
.FARE<=9000)][[ "FL_NO", "FARE"]]
e. Flight_Fare [2:4]
f. Flight_Fare [:4]
g. Flight_Fare [::3]
h. Flight_Fare [:: -3]
i. Flight_Fare [3:]
j. Flight_Fare.loc[1:4,'FL_NO':'FARE']
k. Flight_Fare.loc[1:4,['FL_NO','FARE']]
l. Flight_Fare.iloc[[0,2,4]]
m. Flight_Fare.iloc[:,1:3]
n. Flight_Fare.iloc[1:2,1:3]
o. Flight_Fare.loc[1:3]
p. Flight_Fare.loc[:,'FL_NO':'FARE']
q. Flight_Fare ["Tax%"] = [10,8,9,5,7]
r. Flight_Fare.loc[5]=[ "MC101", "DECCAN AIRLINES", "3500",”10”]
s. Flight_Fare.loc [:,"Disc%"] = [2,3,2,4,2]
t. Flight_Fare =Flight_Fare.drop("Tax%", axis=1)
u. Flight_Fare =Flight_Fare.drop(4, axis=0)
v. Flight_Fare =Flight_Fare.drop([1,4] , axis=0)
w. Flight_Fare.loc[2]
x. Flight_Fare.loc[:,"FL_NO"]
y. Flight_Fare ["FARE"]>=6000

 Solutions

Q1    
  (i) import pandas as pd

a=[{'name':'virat','matches played':180,'avg score':4500},   

       {'name':'rohit','matches played':150,'avg score':6000},

        {'name':'ms dhoni','matches played':120,'avg score':2800}]

f=pd.DataFrame(a)

print(f)

OUTPUT

      name     matches played  avg score

0     virat             180                  4500

1     rohit             150                  6000


2     ms dhoni     120                   2800

 
import pandas as pd

m=[{'item':'charger','cost':500,'discount':'5%'},

       {'item':'books','cost':750},

        {'item':'clock','cost':1200,'discount':'10%'}]

df=pd.DataFrame(m)

print(df)

  (ii)  

OUTPUT

      item       cost   discount

0    charger   500       5%

1    books      750      NaN

2    clock       1200      10%

 
  (iii) import pandas as pd

result  = {'2015':pd.Series(('78%','56%','90%','79%','60%')),

                '2016':pd.Series(('64%','85%','72%','56%','48%')),

                 '2017':pd.Series(('45%','66%','78%','88%','73%')),

                 '2018':pd.Series(('70%','56%','38%','89%','94%')),

                 '2019':pd.Series(('66%','78%','58%','90%','83%'))}

rs=pd.DataFrame(result)

rs.index=[1,2,3,4,5]

print(rs)

OUTPUT
  2015 2016 2017 2018 2019

1  78%  64%  45%  70%  66%

2  56%  85%  66%  56%  78%

3  90%  72%  78%  38%  58%

4  79%  56%  88%  89%  90%

5  60%  48%  73%  94%  83%

 
import pandas as pd

mn={'monuments':pd.Series(['qutab minar','humayan tomb','lal qila','taj mahal',

'efiel tower']),

'year':pd.Series([1193,1572,1683,1630,1887]),

'built':pd.Series(['qutab','bega begum','shah jhan','shahjhan','gustave effiel'])}

pm=pd.DataFrame(mn)

print(pm)

 
  (iv)
OUTPUT

      monuments        year             built

0   qutab minar         1193            qutab

1   humayan tomb    1572            bega begum

2   lal qila                    1683            shah jhan

3     taj mahal            1630             shahjhan

4     efiel tower         1887             gustave effiel

 
  (v) import pandas as pd

con={'country':pd.Series(['India','Australia', 'China']),

'national animal':pd.Series(['tiger','kangaroo','Chinese dragon']),

'national bird':pd.Series(['peacock','emu','red crowned erane']),


'currency':pd.Series(['ruppee','dollar','kerminbi'])}

pp=pd.DataFrame(con)

print(pp)

OUTPUT

    country     national animal      national bird               currency

0      India           tiger                       peacock                      ruppee

1      Australia    kangaroo              emu                             dollar

2      China         Chinese dragon    red crowned erane   kerminbi


Q2.    
import pandas as pd

a=[[58,83,49,89],[86,67,87,90],[92,78,45,56],[52,84,55,78],[93,75,87,69],

[98,79,88,96]]

m=pd.DataFrame(a,index=['sharad','mansi','kanika','ramesh','ankita','pranay'],

    columns=['ut1','halfyearly','ut2','final'])

print(m)

OUTPUT
   
                ut1  halfyearly ut2 final

sharad     58         83   49     89

mansi      86         67   87     90

kanika     92         78   45     56

ramesh   52         84   55     78

ankita     93         75   87     69

pranay    98         79   88     96

 
  (a) m=m.rename({'sharad':1,'mansi':2,'kanika':3,'ramesh':4,'ankita':5,'pranay':6},
axis="index")

print(m)

OUTPUT

    ut1  halfyearly  ut2  final

1   58          83   49     89

2   86          67   87     90

3   92          78   45     56

4   52          84   55     78

5   93          75   87     69

6   98          79   88     96


m=m.rename({'ut1':"term1",'halfyearly':"term2",'ut2':"term3",'final':"term4"},

axis="columns")

print(m)

o\p

   term1  term2  term3  term4

  (b) 1     58     83     49     89

2     86     67     87     90

3     92     78     45     56

4     52     84     55     78

5     93     75     87     69

6     98     79     88     96

 
  (c) m['Grade']=['a','b','b','a','a','b']

print(m)
 

OUTPUT

   term1  term2  term3  term4 Grade

1     58     83           49     89             a

2     86     67           87     90             b

3     92     78          45     56              b

4     52     84          55     78              a

5     93     75          87     69              a

6     98     79          88     96              b

 
m.loc[7]=[49,56,75,58, 'b']

print(m)

OUTPUT

   term1  term2  term3  term4 Grade

1     58     83           49     89             a

  (d) 2     86     67           87     90             b

3     92     78          45     56              b

4     52     84          55     78              a

5     93     75          87     69              a

6     98     79          88     96              b

7     49     56          75     58              b

 
  (e) m.drop(1,axis=0)

OUTPUT
   term1  term2  term3  term4 Grade

2     86     67           87     90             b

3     92     78          45     56              b

4     52     84          55     78              a

5     93     75          87     69              a

6     98     79          88     96              b

7     49     56          75     58              b

 
m.drop('term3',axis=1)

OUTPUT

   term1  term2  term4 Grade

2     86     67          90             b

  (f) 3     92     78         56              b

4     52     84         78              a

5     93     75         69              a

6     98     79         96              b

7     49     56         58              b

 
  (g) m.loc[2]

OUTPUT

term1                  92

term2                  78

term3                  45

term4                  56
Grade                   b

Name: 2, dtype: object

 
m['term4']>50

OUTPUT

1    True

2    True

3    True
  (h)
4    True

5    True

6    True

7    True

Name: term4, dtype: bool

 
bm.loc[:,'Grade']=='a'

OUTPUT

0    True

1    False

2    False
  (i)
3    True

4    True

5    False

6    False

Name: internal assessment, dtype: bool

 
m.loc[:,["term2","term4"]]

                    or

m[["term2","term4"]]

OUTPUT

    term2  term4

  (j) 1     83     89

2     67     90

3     78     56

4     84     78

5     75     69

6     79     96

7     56     58
m.loc['2':'5']

OUTPUT

   term1  term2  term3  term4


  (k)
2     86     67     87     90

3     92     78     45     56

4     52     84     55     78

5     93     75     87     69


  (l) m.loc['2':'5',['term1','term2']]

OUTPUT

   term1  term2

2     86     67

3     92     78
4     52     84

5     93     75
m.loc[['3','5'],['term2','term4']]

OUTPUT
  (m)
   term2  term4

3     78     56

5     75     69


m.head(3)

OUTPUT

  (n)    term1  term2  term3  term4

1     58     83     49     89

2     86     67     87     90

3     92     78     45     56


m.tail(4)

OUTPUT

   term1  term2  term3  term4


  (o)
3     92     78           45     56

4     52     84           55     78

5     93     75           87     69

6     98     79           88     96


Q3.    
    import pandas as pd

a={'ID':[101,102,103,104,105,106,107],

'NAME':['JOHN','SMITH','GEORGE','LARA','K GEORGE','JOHNSON','LUCY'],

'DEPT':['ENT','ORTHOPEDIC','CARDIOLOGY','SKIN','MEDICINE','ORTHOPEDIC','ENT'],
'EXPERIENCE':[12,5,10,3,9,10,3]}

df1=pd.DataFrame(a,index=[10,20,30,40, 50, 60, 70])

print(df1)

OUTPUT

       ID      NAME        DEPT              EXPERIENCE

10  101    JOHN          ENT                          12

20  102    SMITH        ORTHOPEDIC           5

30  103    GEORGE    CARDIOLOGY          10

40  104    LARA          SKIN                           3

50  105    K GEORGE MEDICINE                 9

60  106   JOHNSON   ORTHOPEDIC          10

70  107   LUCY            ENT                            3


Q4    
df1.loc[40]

OUTPUT

ID             104
  (a)
NAME          LARA

DEPT          SKIN

EXPERIENCE       3

Name: 40, dtype: object


  (b) df1.iloc[[3,4]]

OUTPUT

     ID       NAME         DEPT             EXPERIENCE

40  104      LARA            SKIN                    3


50  105      K GEORGE  MEDICINE           9
df1.NAME

OUTPUT

10      JOHN

20      SMITH

  (c) 30      GEORGE

40      LARA

50      K GEORGE

60      JOHNSON

70      LUCY

Name: NAME, dtype: object


df1.loc[:,["NAME","DEPT"]]

OUTPUT

        NAME        DEPT

10    JOHN          ENT

  (d) 20    SMITH        ORTHOPEDIC

30    GEORGE    CARDIOLOGY

40    LARA          SKIN

50    K GEORGE MEDICINE

60   JOHNSON  ORTHOPEDIC

70   LUCY           ENT


  (e) df1.head(3)

OUTPUT

       ID        NAME        DEPT               EXPERIENCE


10  101    JOHN         ENT                         12

20  102   SMITH       ORTHOPEDIC          5

30  103  GEORGE    CARDIOLOGY          10


df1.tail(4)

OUTPUT

        ID      NAME            DEPT             EXPERIENCE


  (f)
40  104      LARA              SKIN                      3

50  105      K GEORGE    MEDICINE             9

60  106      JOHNSON     ORTHOPEDIC       10

70  107      LUCY              ENT                       3


df1.loc[[10,20],['DEPT','EXPERIENCE']]

OUTPUT
  (g)
          DEPT             EXPERIENCE

10     ENT                          12

20     ORTHOPEDIC           5
df1.loc[20:60]

OUTPUT

     ID      NAME        DEPT                    EXPERIENCE

  (h) 20  102     SMITH     ORTHOPEDIC           5

30  103    GEORGE   CARDIOLOGY          10

40  104      LARA       SKIN                          3

50  105  K GEORGE  MEDICINE                9

60  106   JOHNSON  ORTHOPEDIC          10


  (i) df1.iloc[0:6:2]
 

OUTPUT

        ID      NAME           DEPT             EXPERIENCE

10  101      JOHN           ENT                          12

30  103    GEORGE       CARDIOLOGY          10

50  105    K GEORGE    MEDICINE                9


df1["AGE"]=[50,65,45,38,45,39,52]

OUTPUT

     ID         NAME            DEPT               EXPERIENCE  AGE

10  101      JOHN            ENT                        12            50

  (j) 20  102     SMITH           ORTHOPEDIC         5             65

30  103    GEORGE        CARDIOLOGY         10          45

40  104     LARA              SKIN                         3           38

50  105     K GEORGE    MEDICINE                9           45

60  106     JOHNSON     ORTHOPEDIC          10          39

70  107     LUCY              ENT                           3            52


Flight_Fare [Flight_Fare.index>1]

 OUTPUT

       FL_NO             AIRLINES   FARE


Q5 (a)
2  AM501      JET AIRWAYS  13400

3  IC899  INDIAN AIRLINES   8300

4  IC302  INDIAN AIRLINES   4300


  (b) Flight_Fare [( Flight_Fare .FARE>=4000)&( Flight_Fare .FARE<=9000)]

OUTPUT

    FL_NO               AIRLINES  FARE


0  IC701  INDIAN AIRLINES  6500

3  IC899  INDIAN AIRLINES  8300

4  IC302  INDIAN AIRLINES  4300


Flight_Fare [( Flight_Fare .FL_NO== "IC701")| ( Flight_Fare .FL_NO== "AM501")|
( Flight_Fare .FL_NO== " IC302")]

OUTPUT
  (c)
 

    FL_NO              AIRLINES    FARE

0  IC701  INDIAN AIRLINES   6500

2  AM501      JET AIRWAYS  13400


Flight_Fare [( Flight_Fare .FARE>=4000)&( Flight_Fare .FARE<=9000)][[ "FL_NO",
"FARE"]]

OUTPUT

   FL_NO  FARE
  (d)
0  IC701  6500

3  IC899  8300

4  IC302  4300
Flight_Fare [2:4]

OUTPUT
  (e)
     FL_NO             AIRLINES   FARE

2  AM501      JET AIRWAYS  13400

3  IC899   INDIAN AIRLINES   8300


  (f) Flight_Fare [:4]

OUTPUT

     FL_NO               AIRLINES   FARE


0     IC701  INDIAN AIRLINES   6500

1  MU499                 SAHARA   9400

2  AM501        JET AIRWAYS  13400

3    IC899   INDIAN AIRLINES   8300


Flight_Fare [::3]

OUTPUT
  (g)
   FL_NO               AIRLINES  FARE

0  IC701  INDIAN AIRLINES  6500

3  IC899  INDIAN AIRLINES  8300


Flight_Fare [:: -3]

OUTPUT
  (h)
    FL_NO                 AIRLINES  FARE

4  IC302    INDIAN AIRLINES  4300

1  MU499                SAHARA  9400


Flight_Fare [3:]

OUTPUT
  (i)
   FL_NO               AIRLINES  FARE

3  IC899  INDIAN AIRLINES  8300

4  IC302  INDIAN AIRLINES  4300


  (j) light_Fare.loc[1:4,'FL_NO':'FARE']

OUTPUT

      FL_NO            AIRLINES   FARE

1  MU499             SAHARA    9400


2  AM501      JET AIRWAYS  13400

3  IC899  INDIAN AIRLINES   8300

4  IC302  INDIAN AIRLINES   4300


Flight_Fare.loc[1:4,['FL_NO','FARE']]

OUTPUT

     FL_NO   FARE
  (k)
1  MU499   9400

2  AM501  13400

3    IC899   8300

4  IC302   4300
Flight_Fare.iloc[[0,2,4]]

OUTPUT

  (l)      FL_NO                AIRLINES   FARE

0  IC701    INDIAN AIRLINES   6500

2  AM501        JET AIRWAYS  13400

4  IC302    INDIAN AIRLINES   4300


Flight_Fare.iloc[:,1:3]

OUTPUT

                  AIRLINES   FARE

  (m) 0  INDIAN AIRLINES   6500

1                 SAHARA   9400

2        JET AIRWAYS  13400

3  INDIAN AIRLINES   8300

4  INDIAN AIRLINES   4300


  (n) Flight_Fare.iloc[1:2,1:3]
 

OUTPUT

    AIRLINES  FARE

1   SAHARA  9400


Flight_Fare.loc[1:3]

OUTPUT

  (o)    FL_NO              AIRLINES   FARE

1  MU499             SAHARA   9400

2  AM501      JET AIRWAYS  13400

3  IC899  INDIAN AIRLINES   8300


Flight_Fare.loc[:,'FL_NO':'FARE']

OUTPUT

   FL_NO                 AIRLINES   FARE

  (p) 0  IC701    INDIAN AIRLINES   6500

1  MU499                SAHARA   9400

2  AM501        JET AIRWAYS  13400

3  IC899    INDIAN AIRLINES   8300

4  IC302  INDIAN AIRLINES   4300


  (q) Flight_Fare ["Tax%"] = [10,8,9,5,7]

OUTPUT

   FL_NO             AIRLINES        FARE  Tax%

0  IC701    INDIAN AIRLINES   6500    10


1  MU499               SAHARA     9400     8

2  AM501      JET AIRWAYS    13400     9

3  IC899    INDIAN AIRLINES   8300      5

4  IC302    INDIAN AIRLINES   4300      7


Flight_Fare.loc[5]=[ "MC101", "DECCAN AIRLINES", "3500",”10”]

OUTPUT

     FL_NO                 AIRLINES   FARE  Tax%

  (r) 0  IC701      INDIAN AIRLINES   6500   10

1  MU499                  SAHARA    9400    8

2  AM501         JET AIRWAYS   13400    9

3  IC899     INDIAN AIRLINES    8300     5

4  IC302     INDIAN AIRLINES    4300     7

5  MC101  DECCAN AIRLINES   3500   10


Flight_Fare.loc [:,"Disc%"] = [2,3,2,4,2,3]

OUTPUT

     FL_NO                 AIRLINES   FARE  Tax%  Disc%

0  IC701     INDIAN AIRLINES   6500    10      2


  (s)
1  MU499                    SAHARA     9400    8       3

2  AM501          JET AIRWAYS  13400     9       2

3  IC899    INDIAN AIRLINES   8300     5       4

4  IC302    INDIAN AIRLINES   4300     7       2

5  MC101  DECCAN AIRLINES   3500   10      3

 
Flight_Fare =Flight_Fare.drop("Tax%", axis=1)

OUTPUT

     FL_NO                  AIRLINES    FARE    Disc%

0  IC701      INDIAN AIRLINES   6500      2


  (t)
1  MU499                      SAHARA   9400      3

2  AM501            JET AIRWAYS  13400     2

3  IC899     INDIAN AIRLINES   8300      4

t4  IC302     INDIAN AIRLINES   4300      2

5  MC101  DECCAN AIRLINES   3500      3


Flight_Fare =Flight_Fare.drop(4, axis=0)

OUTPUT

     FL_NO                AIRLINES   FARE   Disc%

  (u) 0  IC701    INDIAN AIRLINES    6500       2

1  MU499                SAHARA     9400      3

2  AM501        JET AIRWAYS   13400      2

3  IC899     INDIAN AIRLINES    8300      4

5  MC101  DECCAN AIRLINES   3500      3


Flight_Fare =Flight_Fare.drop([1,4] , axis=0)

OUTPUT

  (v)     FL_NO                AIRLINES   FARE   Disc%

0  IC701    INDIAN AIRLINES   6500       2

2  AM501        JET AIRWAYS  13400      2

3  IC899    INDIAN AIRLINES   8300       4


  (w) Flight_Fare.loc[2]
OUTPUT

FL_NO             AM501

AIRLINES    JET AIRWAYS

FARE              13400

Name: 2, dtype: object


Flight_Fare.loc[:,"FL_NO"]

OUTPUT

0    IC701

1    MU499
  (x)
2    AM501

3    IC899

4    IC302

Name: FL_NO, dtype: object

 
Flight_Fare ["FARE"]>=6000

OUTPUT

0     True

1     True
  (y)
2     True

3     True

4    False

Name: FARE, dtype: bool

You might also like