
Indian Institute of Management Raipur

AMR Assignment
Pilgrim Bank
By
SECTION A

Group no 31

Aanchal Togarwar (19PGP268)


Shashank Kumar (19PGP285)
Praneeth Mutnuru (19PGP245)
Shasak Choudhary(19PGP213)
Mopuri Praneeth Reddy (19PGP194)
Yadavilli Sathya Venkata Bhaskar Sarma (19PGP229)
Table of Contents

Introduction
Data and Research Methodology
Results and Discussions
Introduction
The case primarily deals with the

Data and Research Methodology

Research objective 1:

Would encouraging transaction migration to lower cost channels improve customer
profitability? (Is there a difference between the profitability of on-line and off-line
customers?)

METHODOLOGY
To establish whether profitability differs between customers using the offline channel and
those using the online channel, an initial regression is performed separately on the 1999
and 2000 data. The online and offline users are then identified, and an independent T-test
and a paired T-test are performed on the groups. These tests accept or reject the null
hypothesis that profitability is equal among online and offline users.
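
The independent T-test described above can be sketched in a few lines. This is a minimal illustration, assuming the Welch (unequal-variance) form of the statistic; the function name welch_t and the profit figures are made up for the example and are not taken from the Pilgrim Bank data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return the Welch t statistic for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    # Standard error of the difference in means, without assuming equal variances.
    se = sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    return (mean(sample_a) - mean(sample_b)) / se

# Made-up profit figures, for illustration only (not the case data).
online_profit = [120.0, 95.0, 210.0, 60.0, 150.0, 130.0]
offline_profit = [110.0, 80.0, 190.0, 75.0, 140.0, 100.0]

t_stat = welch_t(online_profit, offline_profit)
# Rule of thumb used in this report: reject H0 of equal means when |t| > 2.
reject_h0 = abs(t_stat) > 2
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

With samples this small the |t| > 2 rule is only a rough guide; a statistics package would report an exact p-value.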

Research Objective 2:
Is profitability different for different types of people?

METHODOLOGY
Because it is important to find out which income groups are similar and which are
different in terms of profitability to the bank, we performed Independent-Samples T Tests
on selected pairs of income groups. The null hypothesis in each test is that the mean
profitability of the two income groups considered is equal.

Research Objective 3:
1. Whether the use of Online channel influenced
• The profitability
• Customer Retention

2. If it was possible to predict the following for individual customers,
• The future profitability
• The likelihood of retention
Data Variables used
Since age and income data were not available for some entries, the data was cleaned and
new variables 9Age_Recoded and 9Inc_Recoded were created, with missing values replaced by
the mode of each variable: category 3 for age and category 6 for income.
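
A minimal sketch of this mode imputation is shown here. The helper name impute_with_mode and the sample codes are hypothetical, and missing entries are assumed to be represented as None.

```python
from statistics import mode

def impute_with_mode(values):
    """Replace None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]

# Hypothetical age-bucket codes with gaps (None marks a missing entry).
age = [3, None, 2, 3, 5, None, 3, 4]
age_recoded = impute_with_mode(age)
print(age_recoded)
```

Filling with the mode keeps every record in the analysis, at the cost of overstating how common the modal category is.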
Two new variables were created:
@0Retained - Recoded to identify whether the customer left or was retained in 2000
Online_Change_Status - Recoded to group customers according to their 1999-2000 online
status.
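This status is classified into 5 categories:
0 – Customer has left the bank
1 – 9online – 0online (online in 1999, online in 2000)
2 – 9offline – 0online (offline in 1999, online in 2000)
3 – 9offline – 0offline (offline in 1999, offline in 2000)
4 – 9online – 0offline (online in 1999, offline in 2000)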
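
The recoding into these five categories can be sketched as a small function. This is an illustration only: the function name is hypothetical, the online flags are assumed to be coded 1 for online and 0 for offline, and a missing 2000 value (None) is assumed to mean the customer has left the bank.

```python
def online_change_status(online_1999, online_2000):
    """Map 1999/2000 online flags (1 = online, 0 = offline) to status codes 0-4.

    online_2000 is None when the customer has no 2000 record,
    which is treated here as having left the bank.
    """
    if online_2000 is None:
        return 0  # left the bank
    if online_1999 == 1 and online_2000 == 1:
        return 1  # online in both years
    if online_1999 == 0 and online_2000 == 1:
        return 2  # moved from offline to online
    if online_1999 == 0 and online_2000 == 0:
        return 3  # offline in both years
    if online_1999 == 1 and online_2000 == 0:
        return 4  # moved from online to offline
    raise ValueError("online flags must be 0 or 1")

print(online_change_status(0, 1))
```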

Methodology:
First, we checked the effect of the online services on customer profitability. This
analysis is performed with an independent-samples T test.
Null hypothesis H0 – The difference in profits between online and offline channels is not
significant
Alternate hypothesis H1 – The difference in profits between online and offline channels is
significant

Then, for the customer groups defined above, we analysed whether shifting to online status
between 1999 and 2000 affected profitability, and whether existing online customers who
stayed online the next year showed an effect on profitability.

Null hypothesis H0 – The difference in profits among the groups is not significant
Alternate hypothesis H1 – The difference in profits among the groups is significant

Next, we analyzed the effect of the Online channel on customer retention for the bank. For
this we performed the Chi-Square test of Independence using cross tabs.
Null hypothesis H0: The online channel has no effect on customer retention.
Alternate hypothesis H1: The online channel has an effect on customer retention.
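
A Chi-Square test of independence on a 2x2 cross tab can be sketched as follows. The counts are hypothetical and the function name is made up for the example; the statistic is the plain Pearson form without a continuity correction, which may differ slightly from what a statistics package reports.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = offline/online in 1999, columns = left/retained in 2000.
crosstab = [[400, 1600],
            [30, 270]]
stat = chi_square_2x2(crosstab)
CRITICAL_005_DF1 = 3.841  # chi-square critical value for df = 1, alpha = 0.05
print(f"chi-square = {stat:.2f}, reject H0: {stat > CRITICAL_005_DF1}")
```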

For prediction of future profitability and likelihood of retention, we developed a
regression model using the categorical variables 9tenure, 9online, 9inc_recoded and
9age_recoded. To validate the model, we used binary logistic regression with the
independent categorical variables on the year 2000 profitability.

Null hypothesis H0: The model with the independent variables is not a significant predictor.
Alternate hypothesis H1: The model with the independent variables is a significant predictor.

Research Objective 4:
Whether the use of Online channel and Electronic Bill Payments influenced the Customer
Retention.
Methodology
We analyzed the effect of the Online channel and electronic bill payments on customer
retention for the bank. For this we performed a regression analysis to find the
contribution of each of the two parameters.
Null hypothesis H0: Neither the online channel nor electronic bill payments has an effect
on customer retention.
Alternate hypothesis H1: At least one of the online channel and electronic bill payments
has an effect on customer retention.
Results and Discussions

Research objective 1: Would encouraging transaction migration to lower cost channels improve
customer profitability? (Is there a difference between the profitability of on-line and off-line
customers?)

To observe whether online migration has any effect on profitability, we perform a
regression with billpay, district, tenure, income level, age and online/offline status as
independent variables.

The regression equation before obtaining the co-efficients is as follows:


9Profit = b0 + b1*9Online + b2*9Age + b3*9Inc + b4*9Tenure + b5*9District + b6*9Billpay
Null Hypothesis: The difference in profitability between online and offline mode is not significant (b1 = 0)
Alternate Hypothesis: The difference is significant

The R Square value of 0.59 shows that 59% of the variance in Profitability is explained by Billpay,
District, Tenure, Inc, Age and the channel mode (Online or Offline). This will allow us to move ahead
and test which of these factors are significant.
Observation:
Anova Test: It can be observed that P < 0.05, which means the model is significant and we reject
the null hypothesis b1=b2=b3=b4=b5=b6=0. (At least one of them is not equal to zero.)
For b2, b3, b4 and b6, |t| > 2, so we reject the null hypothesis: 9Age, 9Inc, 9Tenure and 9Billpay
do have an effect on 9Profit. For b1 and b5, |t| < 2, which means that we fail to reject the null
hypothesis: 9Online and 9District do not have a significant effect on 9Profit.
So the final regression equation will be:
9Profit = -101.881 + 6.363*9Online + 18.232*9Age + 17.737*9Inc + 4.020*9Tenure +
0.009*9District + 82.885*9Billpay
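
To show how the fitted equation is read, the sketch below plugs one customer into it. The coefficients are copied from the equation above; the function name and the customer's values (including the district code 1200) are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the fitted 1999 regression equation.
COEFFS = {
    "online": 6.363,
    "age": 18.232,
    "inc": 17.737,
    "tenure": 4.020,
    "district": 0.009,
    "billpay": 82.885,
}
INTERCEPT = -101.881

def predicted_profit_1999(customer):
    """Evaluate the linear regression equation for one customer record."""
    return INTERCEPT + sum(COEFFS[name] * customer[name] for name in COEFFS)

# Hypothetical customer: online, age bucket 3, income bucket 6, 10 years of tenure.
customer = {"online": 1, "age": 3, "inc": 6, "tenure": 10, "district": 1200, "billpay": 0}
print(f"predicted 9Profit = {predicted_profit_1999(customer):.3f}")
```

Holding everything else fixed, switching the online flag from 0 to 1 changes the prediction by only 6.363, which is consistent with 9Online not being significant.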

Regression on the data for the year 2000


Initial Regression equation for this case will be 0Profit = b0 + b1*0Online + b2*0Billpay
Null Hypothesis: The difference in profitability between online and offline mode is not significant (b1 = 0)
Alternate Hypothesis: The difference is significant

Observation:
The R Square value of 0.02 shows that only 2% of the variance in Profit is explained by Billpay and
channel mode (Online or Offline).
Here P < 0.05, which means the model is significant and we reject the null hypothesis
b1=b2=0. (At least one of them is not equal to zero.)
For b2, |t| > 2, which means that we reject the null hypothesis: 0Billpay does have an effect on
0Profit. For b1, |t| < 2, which means that we fail to reject the null hypothesis: 0Online does not
have a significant effect on 0Profit.
So the final regression equation is 0Profit = 140.683 + 6.043*0Online + 98.658*0Billpay

Independent T-test
Among the 31634 customers in the sample data, 27780 are offline customers and 3854 are online
customers. The first t test here relates to the offline customers, for which:

Null Hypothesis: Profitability of online and offline channels is equal
Alternative Hypothesis: Profitability is not equal

Here |t| < 2, therefore we fail to reject the null hypothesis that the online and offline channels
have the same profitability in the 1999 data.
Now we perform the same test on the online customers, whose count is 3854, with the same null and
alternate hypotheses.
Observation:
The alternate hypothesis states that differences exist between online and offline customers. In this
test at the 95% confidence level, the t value is 1.254, which is less than 2, i.e., it is not
statistically significant. So, we fail to reject the null hypothesis.

Paired T-Test

Observation:
The absolute t value is greater than 2, therefore we reject the null hypothesis. The two means,
9Profit and 0Profit, are not equal.
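
The paired T-test compares each customer's 1999 and 2000 profit. A minimal sketch follows, assuming made-up profit values for five customers; the function name paired_t is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return the paired t statistic for matched before/after observations."""
    diffs = [a - b for b, a in zip(before, after)]
    # Mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Made-up profits for the same five customers in 1999 and 2000.
profit_1999 = [100.0, 80.0, 150.0, 60.0, 120.0]
profit_2000 = [130.0, 95.0, 160.0, 90.0, 150.0]

t_stat = paired_t(profit_1999, profit_2000)
print(f"t = {t_stat:.3f}, reject H0 of equal means: {abs(t_stat) > 2}")
```

Note that this test compares the two years for the same customers; it does not by itself compare online with offline customers.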

Objective Conclusion: The tests indicate that there is no meaningful difference
between the profitability of online and offline customers.

Research objective 2: Is profitability different for different types of people?
RESULTS

The results are presented in the table below. The null hypothesis is rejected when the absolute
t-value of the corresponding Independent-Samples t-test is more than 2. Otherwise, we fail to
reject the null hypothesis, which means that the mean profitability of the two groups is
considered equal.

INCOME GROUPS      MEAN VALUES       T-VALUE
1 vs 2             72.28, 89.45      1.695
1 vs 3             72.28, 88.96      2.356
2 vs 3             89.45, 88.96      0.05
3 vs 4             88.96, 93.38      0.630
4 vs 5             93.38, 95.02      0.226
(2,3) vs (4,5)     89.08, 94.21      0.920
5 vs 6             95.02, 119.27     3.735
6 vs 7             119.27, 138.01    3.034
7 vs 8             138.01, 157.83    2.293
8 vs 9             157.83, 233.88    7.385

KEY OBSERVATIONS:

• Profitability remained almost the same for income buckets 1 and 2, as the absolute t-value is
less than 2, but buckets 1 and 3 differ significantly, as the absolute t-value is more than 2.
• The income groups 2, 3, 4, 5 (from $15,000 to $49,999) tend to show similar profitability,
hence we can group them together.
• From the income level of $50,000 and above, profitability tends to increase significantly
with income: groups 6, 7, 8, 9 have mean profitability significantly different from one
another.
• High-income customers prove to be more profitable to Pilgrim Bank. Customers with
income levels above $125,000 are highly profitable to the bank.

[Figure: summary of key observations. Mean profit same for buckets 1 and 2; income groups 2, 3, 4, 5 show similar profitability; groups 6, 7, 8, 9 profitability different from each other; high-income customers more profitable.]

Research Objective 3:
1. Whether the use of Online channel influenced the profitability

The data contain missing values. We concentrate on the missing 0profit values, which make up
16.6% of records. Hence, we conclude that approximately 16.6% of customers were not retained by
the bank. The 16.5% of missing online values suggests the same.
Having examined the effect of online usage on 1999 profitability above, we now concentrate on
the effect of going online on profitability in 2000.
Analysis

The T-test significance is < .05, which is in the rejection region. The null hypothesis is
rejected, and the profits are statistically significantly different. However, the means are
close: 140.7 and 161.1.
Analysis

The T-test significance is < .05, which is in the rejection region, hence we reject the null
hypothesis: the difference in profits between the groups is statistically significant. Since
profits are higher for Online customers than for Offline customers, we believe that the high
profits may have been influenced by Online status.

2. Whether the use of Online channel influenced Customer Retention

To check the influence of the Online channel on customer retention, we perform the
Chi-Square test of independence for the profit in year 2000 and the independent categorical
variables in 1999.
Analysis:

The Chi-Square test significance is < .05, which is in the rejection region. The null hypothesis
is rejected, and we believe that Online status has an influence on Customer Retention.

Prediction of Future Profitability and Likelihood of retention


Analysis

The regression model is significant (P-value < .05), though its R-square is small, so it
explains only a small share of the variance.

Regression Equation to predict future profitability


Profit = -91.638+4.678*@9tenure+22.184*@9inc+15.135*@9age+31.871*@9online
Using the same independent variables, the likelihood of Customer Retention can be
estimated by Logistic Regression of customer retention against the independent variables.

Analysis:

The model significance is < .05, which is in the rejection region.
The regression model is significant. The model predicts customer retention correctly for
individual customers 83.5% of the time.
Regression equation:
Y = 0.422 + 0.043*@9tenure + 0.265*@9age - 0.037*@9inc + 0.408*@9Online
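
The sketch below shows one way to turn this equation into a retention probability. It assumes that Y is the log-odds (logit) of retention, as is standard for binary logistic regression output; the report does not state this explicitly. The function names and the customer's values are hypothetical.

```python
from math import exp

def retention_log_odds(tenure, age, inc, online):
    """Evaluate the logistic regression equation (coefficients from the report)."""
    return 0.422 + 0.043 * tenure + 0.265 * age - 0.037 * inc + 0.408 * online

def retention_probability(tenure, age, inc, online):
    """Convert log-odds to a probability, assuming Y is the logit of retention."""
    y = retention_log_odds(tenure, age, inc, online)
    return 1.0 / (1.0 + exp(-y))

# Hypothetical customer: 10 years of tenure, age bucket 3, income bucket 6.
p_online = retention_probability(10, 3, 6, online=1)
p_offline = retention_probability(10, 3, 6, online=0)
print(f"P(retained | online) = {p_online:.3f}, P(retained | offline) = {p_offline:.3f}")
```

Under this assumption, the positive coefficient on @9Online means an online customer has a higher predicted retention probability than an otherwise identical offline customer.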

Research Objective 4:
Whether the use of Online channel and Electronic Bill Payments influenced the Customer
Retention.
Analysis

The significance level for bill payments is greater than 0.05, while for the online channel
it is 0.00. Hence, we reject the null hypothesis and conclude that only the online channel
has an effect on customer retention; bill payments do not.

Regression equation:
Y = 0.832 + 0.027*@9Online + 0*@9Billpay
