Credit Balance Analysis: Saee Chaudhari


CREDIT BALANCE ANALYSIS

SAEE CHAUDHARI DADM Assignment (8/8/22)


INTRODUCTION
A statistical analysis was conducted to understand how the balance on credit cards
relates to various characteristics of the user (income, age, years of education, gender,
student status, marital status, ethnicity, and credit rating) as well as of the
credit cards themselves (total credit limit and number of cards).
This summary explains the analysis results and findings.

BUSINESS REPORT 2
A. DATA UNDERSTANDING
Data Intro

The data contain credit card and cardholder characteristics along with the credit card balance (the variable under analysis). Credit card
balance is the total amount of money currently owed by a cardholder to their credit card company.

Data dictionary

Income: Assume units are thousands of US dollars per year
Limit: Assume units are dollars and that the limit is the total limit across all the cards
Rating: Credit rating; the higher the rating, the lower the probability of defaulting
Cards: Number of cards
Age: Age in years
Education: Years of education
Gender: Gender (Male/Female)
Student: Student or not (Student/No student)
Married: Marital status (Married/ Unmarried)
Ethnicity: Ethnicity (Asian/ Caucasian/ African American)
Balance: Assume units are dollars

Descriptive statistics

1. UNDERSTANDING AVERAGE BALANCE

A company manager says that the average balance on their credit cards is $500. Do you think that
this assertion is justified? Use a one-sample t-test to draw your conclusion.

Ho: Average balance is equal to $500


Ha: Average balance is not equal to $500

T-Test: Two Sample Assuming Unequal Variances

Result: Fail to reject Ho. The data are consistent with the manager's claim that the average balance on credit cards is $500
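
A minimal sketch of the one-sample test described above, using only the Python standard library. The `balances` list is invented for illustration and is not the assignment's dataset, the function names are hypothetical, and the p-value uses a normal approximation to the t distribution, which is an assumption that is reasonable only for large samples.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for Ho: mean == mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    return t_stat, n - 1


def approx_two_sided_p(t_stat):
    """Two-sided p-value from a standard normal approximation (large n only)."""
    return 2 * (1 - statistics.NormalDist().cdf(abs(t_stat)))


# Hypothetical balances in dollars, for illustration only.
balances = [0, 120, 333, 480, 510, 640, 700, 815, 903, 1100]
t_stat, df = one_sample_t(balances, mu0=500)
p_value = approx_two_sided_p(t_stat)
print(f"t = {t_stat:.3f}, df = {df}, approximate p = {p_value:.3f}")
```

If the p-value is above 0.05, Ho (average balance equals $500) is not rejected. For small samples an exact t-distribution p-value from a statistics package would be the safer choice.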

2. UNDERSTANDING GENDER INFLUENCE ON BALANCE

Is there a difference between men and women as far as average balance is concerned? Use a two-
sample t-test to draw your conclusion.

Ho: There is no difference in balance on cards between men and women


Ha: Average balance on cards differs between men and women

Data preparation

Gender: two values – Men/Women -> Categorical data

Create balance data at men and women level as:

T-Test: Two Sample Assuming Unequal Variances

Result: There is no difference in balance on cards between men and women
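
The unequal-variance (Welch) two-sample test named above can be sketched as follows. This is a minimal illustration using only the Python standard library; the two lists are invented values rather than the assignment's data, and `welch_t` is a hypothetical helper name.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t_stat = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t_stat, df


# Hypothetical balances in dollars for two groups, for illustration only.
men = [210, 480, 515, 630, 720, 890]
women = [190, 450, 560, 610, 760, 905]
t_stat, df = welch_t(men, women)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

The same helper applies unchanged to the student/non-student and married/unmarried comparisons, since each splits balance into two groups.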

3. UNDERSTANDING STUDENT FLAG INFLUENCE ON BALANCE

Is there a difference between students and non-students as far as average balance is
concerned? Use a two-sample t-test to draw your conclusion.

Ho: There is no difference in balance between student and non-student


Ha: Average balance differs between students and non-students

Data preparation

Student: two values – Student/Non-Student -> Categorical data

Create balance data at student and non-student level as:

T-Test: Two Sample Assuming Unequal Variances

Result: Balance is influenced by whether the customer is a student or non-student

4. UNDERSTANDING #CREDIT CARDS INFLUENCE ON BALANCE

It is generally assumed that if there are more credit cards then the balance on the cards will be
more. Based on this dataset, do you think this is true? Calculate a correlation coefficient and show
a scatter plot to support your answer

Ho: Number of credit cards is not related to the balance on the cards
Ha: The more credit cards, the higher the balance on the cards

Data preparation

#Credit cards: nine values – 1,2,3,4,5,6,7,8,9 -> Here, continuous data

Create #cards and balance data as:

Correlation Analysis

Result: Balance on the cards is positively but weakly related to the number of credit cards (the correlation
coefficient is about 0.086, as also reported in Section 11). Each additional card is associated with a ~$29 higher balance.
Most people in the sample have up to 5 credit cards.
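
To make the link between the correlation coefficient and the "dollars per card" figure concrete, here is a small sketch using only the Python standard library. The data are invented for illustration, and `pearson_r` and `slope_from_r` are hypothetical helper names.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def slope_from_r(x, y):
    """Least-squares slope of y on x, written as r * (sd of y / sd of x)."""
    return pearson_r(x, y) * statistics.stdev(y) / statistics.stdev(x)


# Hypothetical number of cards and balances in dollars, for illustration only.
cards = [1, 2, 2, 3, 3, 4, 5, 6]
balance = [300, 650, 120, 900, 410, 560, 380, 820]
print(f"r = {pearson_r(cards, balance):.3f}")
print(f"slope = {slope_from_r(cards, balance):.2f} dollars per additional card")
```

The slope depends on the spread of both variables, so a small correlation can still translate into a noticeable dollar amount per card when balances vary widely.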

5. UNDERSTANDING AGE, EDUCATION AND MARITAL STATUS INFLUENCE
ON BALANCE

Examine whether the following demographic variables influence balance: (a) age, (b) years of
education, (c) marital status. For age and years of education, use scatter plots to depict their
relationship with balance and calculate the correlation coefficient. For the relationship
between marital status and balance, use a two-sample t-test to draw your conclusion

5.1. Age and Education influence on balance

Data preparation

Age and level of education -> Continuous data

Create data as:

Correlation Analysis

Results:

Age and balance analysis:


• Age and balance are positively correlated, but the correlation coefficient is negligible
• The correlation coefficient of age and balance is 0.0018
• A 1-year increase in age is associated with a ~$0.0489 increase in balance
• In practical terms, age has almost no linear relationship with balance on cards

Education and balance analysis:


• Education and balance are negatively correlated, with a negligible correlation coefficient
• The correlation coefficient of education and balance is -0.0081
• Each additional year of education is associated with a ~$1.2 decrease in the balance on cards

5.2. Marital status influence on balance

Ho: Marital status doesn’t influence balance


Ha: Marital status influences balance

5.2.1. Does marital status impact balance on cards?

Data preparation

Create data as:

Linear Regression analysis on marital status flag and balance

P value = 0.9099
P value > 0.05: Fail to reject null hypothesis

Result: Marital status doesn’t influence balance on cards

5.2.2. How much does marital status influence balance?

Data preparation

Marital Status: two values – Married/Unmarried -> Categorical data

Create data as:

T-Test: Two Sample Assuming Unequal Variances

6. UNDERSTANDING ETHNICITY INFLUENCE ON BALANCE

“Ethnicity of the cardholder does not matter as far as balance is concerned.” Carry out an analysis of
variance (ANOVA) and discuss whether this statement is supported by the data or not.

Ho: Ethnicity of the cardholder doesn’t influence balance on the cards


Ha: At least one of the ethnicities has significantly different balance on the cards than others

Data preparation

Ethnicity: three values: Asian/ Caucasian/ African American -> Categorical data

Create data as:

ANOVA: Single Factor

P value = 0.957
P value > 0.05: Fail to reject null hypothesis

Result: Ethnicity doesn’t influence balance on cards
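
The single-factor ANOVA above compares between-group and within-group variation. A minimal sketch of the F statistic, using only the Python standard library, is shown here. The three groups are invented values, not the assignment's data, and `one_way_anova_f` is a hypothetical helper name. The standard library has no F distribution, so the p-value itself would come from a statistics package or a spreadsheet.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (v - statistics.fmean(g)) ** 2 for g in groups for v in g
    )
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical balances in dollars by ethnicity, for illustration only.
asian = [420, 510, 600, 480]
caucasian = [450, 530, 560, 500]
african_american = [470, 520, 610, 490]
f_stat, df_b, df_w = one_way_anova_f([asian, caucasian, african_american])
print(f"F = {f_stat:.3f} with ({df_b}, {df_w}) degrees of freedom")
```

An F statistic near or below 1 means the group means differ no more than the scatter within groups would suggest, which corresponds to a large p-value like the one reported above.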

7. UNDERSTANDING CREDIT RATING INFLUENCE ON CREDIT LIMIT

A general principle that credit card companies often follow is to assign a higher credit limit to
people with a higher credit rating. Does the data show that this principle is being followed?

Ho: Credit rating is not related to credit limit

Ha: Higher credit rating is associated with a higher credit limit

Data preparation

Create data as:

Correlation Analysis

8. UNDERSTANDING CREDIT LIMIT INFLUENCE ON BALANCE

Run a simple linear regression of balance on the credit limit. (Here credit limit is the X and the
balance is the Y). Report the coefficients and the R-squared. Show a scatter plot. State
inference

Ho: Credit limit doesn’t influence balance on the cards


Ha: Credit limit influences balance on the cards

Data preparation

Create data as:

Simple Linear Regression Analysis

P value = 2.531E-119
P value < 0.05 : Reject null hypothesis

Equation of balance as function of credit limit:

Y = -292.79 + 0.172 (limit)

• Every dollar increase in limit is associated with a $0.172 increase in the balance

Result: Credit limit influences balance on the cards
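
For readers who want to reproduce a simple linear regression outside a spreadsheet, here is a minimal least-squares sketch using only the Python standard library. The data are invented for illustration, and `simple_ols` is a hypothetical helper name; it returns the intercept, slope, and R-squared that the question asks to report.

```python
import statistics


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; also return R-squared."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical credit limits and balances in dollars, for illustration only.
limit = [2000, 3500, 4200, 5000, 6100, 7500, 9000]
balance = [50, 310, 420, 600, 780, 1020, 1290]
b0, b1, r2 = simple_ols(limit, balance)
print(f"balance = {b0:.2f} + {b1:.4f} * limit, R-squared = {r2:.3f}")
```

The same function can be applied to any single X variable in this report, such as rating or income, by swapping the first argument.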

9. UNDERSTANDING CREDIT RATING INFLUENCE ON BALANCE

Run a simple linear regression of balance on the credit rating (X). Report the coefficients and
R-squared. Show a scatter plot. State inference

Ho: Credit rating doesn’t influence balance on the cards


Ha: Credit rating influences balance on the cards

Data preparation

Create data as:

Simple Linear Regression Analysis

P value = 1.9E-120
P value < 0.05: Reject null hypothesis

Equation of balance as function of credit rating:

Y = -390.85 + 2.566 (rating)

• Every unit increase in rating is associated with a ~$2.57 increase in the balance

Result: Credit rating influences balance on the cards

10. UNDERSTANDING RATING AND LIMIT INFLUENCE ON BALANCE

Consider your findings in questions 8-9. Discuss business mechanisms to increase or decrease
the balance on credit cards. Try to quantify your answers in this context, focus on possible
specific strategies using variables in Q8 and Q9 that the business could adopt to increase the
balance on credit cards

Balance = f(limit, rating)

Data preparation

Create data as:

Correlation analysis

Multi Linear Regression Analysis

Correlation findings:
• Limit and balance are highly correlated (coeff.: 0.862)
• Rating and balance are also highly correlated (coeff.: 0.864)
• Limit and rating are highly correlated (coeff.: 0.997)

Equation of balance as function of limit and rating:

Y = -377.573 + 0.0245 (limit) + 2.2016 (rating) (Eqn. 10)

• Every unit increase in rating, keeping the limit constant, is associated with a ~$2.2 increase in
the balance on cards
• Every dollar increase in credit limit, keeping the rating constant, is associated with a
negligible increase of ~$0.025 in the balance on cards

Also,
1. When tested for individual influence on balance: credit limit has a coefficient of 0.172
2. When tested for individual influence on balance: credit rating has a coefficient of 2.566

Taken individually, every unit increase in limit and rating is associated with a balance increase of $0.172 and $2.566
respectively. When both are tested together, the coefficients differ, as shown in Eqn. 10. This is expected because
limit and rating are almost perfectly correlated (coeff.: 0.997): the two variables carry nearly the same information,
so the model splits their shared effect between them.

Results: Business should focus on improving the credit card ratings to increase balance on cards
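
The change in coefficients between the individual and joint regressions can be illustrated with a small sketch using only the Python standard library. The data are synthetic and constructed so that two predictors move together, mimicking limit and rating; the helper names `simple_slope` and `fit_two_predictors` are hypothetical, and the numbers have no connection to the assignment's dataset.

```python
import statistics


def simple_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


def fit_two_predictors(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 by solving the centered normal equations."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


# Synthetic data: x2 tracks x1 closely, and y depends equally on both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
y = [a + b for a, b in zip(x1, x2)]
print("separate slopes:", simple_slope(x1, y), simple_slope(x2, y))
print("joint fit (b0, b1, b2):", fit_two_predictors(x1, x2, y))
```

In this synthetic case each separate slope is close to 2 because each predictor also stands in for the other, while the joint fit recovers a coefficient of about 1 for each. The same mechanism explains why the limit coefficient shrinks once rating is in the model.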

11. UNDERSTANDING LIMIT AND #CARDS INFLUENCE ON BALANCE

The credit limit is provided as a consolidated amount for all the credit cards the cardholder
has. Run a multiple linear regression of Balance (Y) on Limit and Cards as two X variables.
Report the coefficients. Discuss the effect on the balance of (a) increasing the credit limit on
the same number of cards and (b) increasing the number of cards without altering the total
credit limit.
Balance = f(limit, cards)
Data preparation

Create data as:

Correlation analysis

Multi Linear Regression Analysis

Correlation findings:
• Limit and balance are highly correlated (coeff.: 0.861)
• Cards and balance are only weakly correlated (coeff.: 0.086)
• Limit and cards are nearly uncorrelated (coeff.: 0.0102)

Equation of balance as function of limit and #cards:

Y = -369.04 + 0.1714 (limit) + 26.0338 (#cards) (Eqn. 11)

• Every dollar increase in credit limit, keeping the #cards constant, is associated with a ~$0.17
increase in the balance on cards
• Every additional credit card, keeping the limit constant, is associated with a ~$26 increase in
the balance on cards
• There is negligible correlation between limit and #cards. Limit is highly correlated with balance,
while #cards is only weakly correlated with it

Also,
1. When tested for individual influence on balance: credit limit has a coefficient of 0.172
2. When tested for individual influence on balance: #cards has a correlation coefficient of 0.0864 with balance,
which corresponds to a slope of ~$29 per card (Section 4)

Taken individually, every unit increase in limit and #cards is associated with a balance increase of $0.172 and ~$29
respectively. When both are tested together, these coefficients change only slightly (to 0.1714 and ~$26), as
shown in Eqn. 11, because limit and #cards are nearly uncorrelated.

Results:
• Adding one card, keeping the credit limit constant, is associated with a ~$26 increase in the
balance on cards
• Increasing the credit limit by one dollar, keeping the number of cards constant, is associated with
a ~$0.17 increase in the balance
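
The two scenarios in the question can be worked through directly from Eqn. 11. The sketch below only evaluates the reported equation; the function name and the example cardholder (a $4,000 limit on 3 cards) are hypothetical choices for illustration.

```python
def balance_eqn11(limit, cards):
    """Predicted balance in dollars from Eqn. 11 (limit in dollars, cards as a count)."""
    return -369.04 + 0.1714 * limit + 26.0338 * cards


base = balance_eqn11(limit=4000, cards=3)

# (a) Raise the total limit by $1,000 on the same number of cards.
effect_limit = balance_eqn11(limit=5000, cards=3) - base

# (b) Add one card without changing the total limit.
effect_card = balance_eqn11(limit=4000, cards=4) - base

print(f"baseline predicted balance: {base:.2f}")
print(f"(a) +$1,000 limit, same cards: {effect_limit:+.2f}")
print(f"(b) +1 card, same limit: {effect_card:+.2f}")
```

Because the model is linear with no interaction term, these increments are the same for any starting limit and number of cards: about $171 per extra $1,000 of limit and about $26 per extra card.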

12. UNDERSTANDING INCOME INFLUENCE ON BALANCE

Run a simple linear regression equation with Income as X and Balance as Y. Report the
coefficients. Is the coefficient of Income significantly different from zero? What does this say
about the effect of income on balance?
Ho: Income doesn’t influence balance on the cards
Ha: Income influences balance on the cards
Data preparation
Create data as:

Correlation analysis

Simple Linear Regression Analysis

• Income and balance have a correlation coefficient of 0.463

Equation of balance as function of income:

Y = 246.515 + 6.048 (income) (Eqn. 12)

• Because income is measured in thousands of dollars, every $1,000 increase in income is associated
with a ~$6 increase in the balance on cards

• The coefficients of both the intercept and income are significantly different from zero

P value of income = 1.03E-22


P value < 0.05 : Reject null hypothesis

Results: Income influences balance on the cards. More the income, more is the balance on cards

13. ESTIMATING BALANCE BASED ON INCOME

Based on the equation derived in question 12, what is the estimated balance for a person with
an income of USD 100k per year?
Y = 246.515 + 6.048 *100
Y = 851.315

The point estimate of the balance for a person with an income of $100k per year is
$851.315

Taking the lower and upper 95% confidence bounds, the estimated
balance lies between $673 and $1,030
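
The calculation above can be checked with a few lines of Python. This sketch only evaluates the equation from Section 12; the function name is hypothetical. The main thing it highlights is the unit convention: income enters in thousands of dollars, so USD 100k is passed as 100.

```python
def predicted_balance(income_thousands):
    """Predicted balance in dollars; income is in thousands of dollars per year."""
    return 246.515 + 6.048 * income_thousands


estimate = predicted_balance(100)  # USD 100k per year, so X = 100, not 100000
print(f"Estimated balance: ${estimate:.3f}")
```

Passing 100000 instead of 100 would be a units mistake and would overstate the estimate by roughly a factor of a thousand.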

14. UNDERSTANDING THE INFLUENCE OF SEVERAL FACTORS ON BALANCE
TOGETHER

Based on the dataset, explore the relationship between credit card balance (Y) and (a) Income,
(b) Age, (c) Education, (d) Limit, and (e) Rating as X variables. Estimate a multiple linear
regression model and report the statistical significance of each of these variables.

Data preparation
Create data as:

Correlation analysis

Multi Linear Regression Analysis

Equation of balance as function of income, age, education, limit and rating:

Y = -473.251 - 7.609 (income) - 0.860 (age) + 1.967 (education) + 0.079 (limit) + 2.774 (rating) (Eqn12)

• Every unit increase in rating, keeping all the other variables constant, is associated with a
~$2.8 increase in balance
• Every dollar increase in limit, keeping all the other variables constant, is associated with a
~$0.08 increase in balance
• Every additional year of education, keeping all the other variables constant, is associated with a
~$2 increase in balance
• Age and income have negative coefficients here. These coefficients describe each variable's
effect only when all the other listed factors are held constant

Looking at only the influence of age on balance, Y = 517.29 + 0.0489 (age)

The age coefficient in Eqn12, by contrast, applies when age is considered along with the other factors.
Similarly, for income, the coefficient when acting alone differs from the coefficient when acting together
with the other variables.
