
The apply() and applymap() functions

● apply() is a series function: it applies the given function to one row or one column of the dataframe
at a time (a single row or column of a dataframe is equivalent to a series).

● applymap() is an element function: it applies the given function to each individual element
separately, without taking the other elements into account.

Syntax:

apply(): <dataframe>.apply(<funcname>, axis=0)

applymap(): <dataframe>.applymap(<funcname>)

Parameters

<funcname>: the function to be applied to the series inside the dataframe, i.e., to its rows or columns. It
should be a function that works with a series or a similar object.

axis: 0 or 1, default 0; the axis along which the function is applied.

axis=0 applies the function to each column, and

axis=1 applies the function to each row.

Example: apply()

In [48]: sal_df.apply(np.mean)

Out[48]:
2016    46625.0
2017    51750.0
2018    55250.0
2019    61000.0
dtype: float64

Example: applymap()

In [47]: sal_df.applymap(np.mean)

Out[47]:
         2016     2017     2018     2019
Qtr1  34500.0  44900.0  54500.0  61000.0
Qtr2  56000.0  46100.0  51000.0      NaN
Qtr3  47000.0  57000.0  57000.0      NaN
Qtr4  49000.0  59000.0  58500.0      NaN


For apply(), the axis is 0 by default, so the function is applied to each individual column. To apply the
function row-wise, you may write:

<dataframe>.apply(<func>, axis=1)

Example:

In [58]: sal_df.apply(np.mean, axis=1)

Out[58]:
Qtr1    48725.000000
Qtr2    51033.333333
Qtr3    53666.666667
Qtr4    55500.000000
dtype: float64
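
The calls above can be reproduced with the sketch below. The notes do not show how sal_df is built, so the construction here is an assumption that simply reuses the values printed in the example outputs. Note also that applymap() is deprecated from pandas 2.1 onwards in favour of DataFrame.map(), so the sketch picks whichever one the installed version provides.

```python
import numpy as np
import pandas as pd

# Assumed construction of sal_df, using the values printed in the outputs above.
sal_df = pd.DataFrame(
    {
        2016: [34500.0, 56000.0, 47000.0, 49000.0],
        2017: [44900.0, 46100.0, 57000.0, 59000.0],
        2018: [54500.0, 51000.0, 57000.0, 58500.0],
        2019: [61000.0, np.nan, np.nan, np.nan],
    },
    index=["Qtr1", "Qtr2", "Qtr3", "Qtr4"],
)

# apply(): the function receives one whole column (axis=0) or row (axis=1).
col_means = sal_df.apply(np.mean)
row_means = sal_df.apply(np.mean, axis=1)

# Element-wise application: the function receives one value at a time.
elementwise = sal_df.map if hasattr(sal_df, "map") else sal_df.applymap
same = elementwise(np.mean)  # the mean of a single value is the value itself

print(col_means)
print(row_means)
print(same)
```

Because the element-wise call hands np.mean one number at a time, the result has the same values as sal_df itself, which is why the applymap() output above looks like the original table.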

The groupby() function

The groupby() function rearranges data into groups based on some criteria and stores the rearranged data
in a new GroupBy object. You can apply aggregate functions on the GroupBy object using agg().

Syntax:

<dataframe>.groupby(by=None, axis=0)

by: label, or list of labels, to be used for grouping.

axis: {0 or 'index', 1 or 'columns'}, default 0; split along rows (0) or columns (1).
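
A minimal sketch of grouping on a single column follows. The dataframe, its column names ("dept", "salary") and its values are hypothetical, invented only to illustrate the call.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {
        "dept": ["Sales", "Sales", "HR", "HR", "IT"],
        "salary": [30000, 35000, 28000, 32000, 45000],
    }
)

# groupby() returns a GroupBy object, not a dataframe.
grouped = df.groupby(by="dept")

sizes = grouped.size()              # number of rows in each group
hr_rows = grouped.get_group("HR")   # the rows that fell into the "HR" group

print(sizes)
print(hr_rows)
```

Nothing is computed until something is asked of the GroupBy object, such as size(), get_group() or an aggregate.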

Grouping on multiple columns

If you want to create groups on multiple columns, all you need to do is pass groupby() a sequence that
contains the name of the first grouping column, followed by the second grouping column, followed by the
third, and so on.
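
As a sketch of the idea, the snippet below groups first by one column and then by another. The data and the column names ("dept", "city", "salary") are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {
        "dept": ["Sales", "Sales", "Sales", "HR", "HR"],
        "city": ["Delhi", "Delhi", "Pune", "Delhi", "Pune"],
        "salary": [30000, 35000, 31000, 28000, 32000],
    }
)

# Pass a list: first grouping column, then the second.
by_dept_city = df.groupby(["dept", "city"])

# One total per (dept, city) pair; the result is indexed by both columns.
totals = by_dept_city["salary"].sum()

print(totals)
```

Each distinct combination of the listed columns becomes its own group, so the order of names in the list decides the nesting of the result's index.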

Aggregation via groupby()

Often in data science you need summary statistics in the same table. You can achieve this using the
agg() method on the GroupBy object created using the groupby() method.

The agg() method aggregates the data of the dataframe using one or more operations over the
specified axis. The syntax for using agg() is:

<dataframe>.agg(func, axis=0)

Parameters

func: function, str, list or dictionary.

axis: {0 or 'index', 1 or 'columns'}, default 0. If 0 or 'index', apply the function to each column. If 1 or
'columns', apply the function to each row.
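
The sketch below shows the list and dictionary forms of func, on a GroupBy object and on a plain dataframe. The data and the column names ("dept", "salary", "bonus") are hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {
        "dept": ["Sales", "Sales", "HR", "HR"],
        "salary": [30000, 35000, 28000, 32000],
        "bonus": [1000, 2000, 1500, 2500],
    }
)

# func as a list: several summary statistics for one column, per group.
summary = df.groupby("dept")["salary"].agg(["min", "max", "mean"])

# func as a dictionary: a different operation for each column, per group.
per_column = df.groupby("dept").agg({"salary": "sum", "bonus": "mean"})

# agg() directly on a dataframe: axis=0 (the default) works column by column.
whole = df[["salary", "bonus"]].agg(["min", "max"])

print(summary)
print(per_column)
print(whole)
```

With a list, each operation becomes a column (or a row, for the plain dataframe) of the result; with a dictionary, only the listed columns appear.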

The transform() function

The transform() function transforms the aggregated data by repeating the summary result for each row
of the group, so that the result has the same shape as the original data.
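
A minimal sketch of this behaviour, assuming a hypothetical dataframe with "dept" and "salary" columns: the group mean is computed once per group and then repeated on every row of that group.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {
        "dept": ["Sales", "Sales", "HR", "HR"],
        "salary": [30000, 35000, 28000, 32000],
    }
)

# agg() would give one row per group; transform() gives one value per original row.
df["dept_mean"] = df.groupby("dept")["salary"].transform("mean")

print(df)
```

Because the result lines up with the original rows, it can be stored as a new column and compared with each row's own value.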

Reindexing and altering labels

When you create a dataframe object, it gets its row numbers (the index) and column labels
automatically. Sometimes, however, you may not be satisfied with the row and column labels of a
dataframe. For this, pandas lets you change the row index and column labels as and when you require.

Row index refers to the labels of axis 0, i.e., the row labels, and columns refers to the labels of axis 1,
i.e., the column labels. The pandas library provides several similar methods that help you rearrange or
rename the indexes or column labels, so read the following lines carefully to understand how these
methods differ.

1. rename(): a method that simply renames the index and/or column labels in a dataframe.

2. reindex(): a method that can specify the new order of existing indexes and column labels, and
also create new indexes/column labels.

3. reindex_like(): a method for creating indexes/column labels based on another dataframe
object.

The rename() method

The rename() function renames the existing indexes/column labels in a dataframe. The old and new
index/column labels are to be provided in the form of a dictionary, where the keys are the old
indexes/labels and the values are the new names for the same, e.g.,

{'Qtr1': 1, 'Qtr2': 2, ...}

Syntax:

<dataframe>.rename(mapper=None, axis=None, inplace=False)

<dataframe>.rename(index=None, columns=None, inplace=False)

Parameters

mapper, index, columns: dict-like (dictionary-like).

axis: int (0 or 1) or str ('index' or 'columns'); the default is 0 or 'index'.

inplace: Boolean, default False, which returns a new dataframe with the renamed index
labels. If True, the changes are made in the current dataframe and a new dataframe is not returned.
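
The sketch below exercises both syntax forms and the inplace flag on a small dataframe. The two-row sal_df and the new label names ("FY2016", "FY2017", "Y16") are assumptions made for illustration.

```python
import pandas as pd

# A small assumed dataframe with quarter labels as the row index.
sal_df = pd.DataFrame(
    {2016: [34500, 56000], 2017: [44900, 46100]},
    index=["Qtr1", "Qtr2"],
)

# Keyword form: rename row labels using an {old: new} dictionary.
renamed_rows = sal_df.rename(index={"Qtr1": 1, "Qtr2": 2})

# mapper + axis form: the same kind of dictionary, aimed at the columns.
renamed_cols = sal_df.rename({2016: "FY2016", 2017: "FY2017"}, axis="columns")

# inplace=True changes sal_df itself and returns None.
result = sal_df.rename(columns={2016: "Y16"}, inplace=True)

print(renamed_rows)
print(renamed_cols)
print(sal_df)
```

Labels that do not appear as keys in the dictionary are left unchanged, as the 2017 column shows after the in-place call.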

(ii) The reindex() method

Parameters

labels: array-like, optional; new labels/index to conform the axis specified by 'axis' to.

index, columns: array-like, optional; new labels/index to conform to; should be specified using
keywords. Preferably an Index object to avoid duplicating data.

axis: int (0 or 1) or str ('index' or 'columns'), optional; axis to target. Default 0 or 'index'.

fill_value: the value to be filled in the newly added rows/columns.

(A) Reordering the existing indexes using reindex()

ndf.reindex(['Qtr4', 'Qtr1', 'Qtr3', 'Qtr2'])

ndf.reindex(['Qtr4', 'Qtr1', 'Qtr3', 'Qtr2'], axis=0)

(C) Specify fill values for new rows/columns.
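
A sketch of reordering and of filling newly created labels follows. The contents of ndf, the new row label "Qtr5", the new column label 2018 and the fill value 0 are all assumptions chosen for illustration.

```python
import pandas as pd

# Assumed contents for ndf; the notes do not show how it is built.
ndf = pd.DataFrame(
    {2016: [34500, 56000, 47000, 49000], 2017: [44900, 46100, 57000, 59000]},
    index=["Qtr1", "Qtr2", "Qtr3", "Qtr4"],
)

# Reorder the existing row labels.
reordered = ndf.reindex(["Qtr4", "Qtr1", "Qtr3", "Qtr2"])

# A label that does not exist yet ("Qtr5") creates a new row filled with NaN.
with_new_row = ndf.reindex(["Qtr1", "Qtr2", "Qtr5"])

# fill_value replaces NaN in newly added rows/columns (here, a new 2018 column).
filled = ndf.reindex(columns=[2016, 2017, 2018], fill_value=0)

print(reordered)
print(with_new_row)
print(filled)
```

reindex() returns a new dataframe each time; ndf itself keeps its original order and labels.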


The reindex_like() method

The reindex_like() function works on a dataframe and reindexes its data as per the argument
dataframe passed to it. This function does the following things:

(a) If the current dataframe has some row indexes/column labels that match those of the passed
dataframe, then retain those labels and their data.

(b) If the current dataframe has some row indexes/column labels which are not in the passed
dataframe, drop them.

(c) If the current dataframe does not have some row indexes/column labels which are in the passed
dataframe, then add them to the result with values as NaN.

(d) reindex_like() thus ensures that the resulting dataframe conforms to the same indexes/labels as
the passed dataframe on all axes.

Syntax:

<dataframe>.reindex_like(<other>)

Parameters

other: the dataframe as per which the current <dataframe> is to be reindexed.
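
Points (a) to (d) can be seen in the sketch below. Both dataframes, and their labels ("r1" to "r4", "A" to "C"), are hypothetical.

```python
import pandas as pd

# Hypothetical dataframes, invented for illustration only.
current = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6]},
    index=["r1", "r2", "r3"],
)
other = pd.DataFrame(
    {"B": [0, 0], "C": [0, 0]},
    index=["r2", "r4"],
)

# Only the labels of `other` matter; its values are ignored.
result = current.reindex_like(other)

print(result)
```

Here row "r2" and column "B" are retained with their data, rows "r1" and "r3" and column "A" are dropped, and row "r4" and column "C" are added with NaN.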
