
BUSINESS REPORT

1. A company manager says that the average balance on their credit cards is $500. Do you think that this assertion is
justified? Use a one-sample t-test to draw your conclusion.

• A t-test is an inferential statistical test that draws conclusions about populations from samples. It is used to determine whether
there is a significant difference between the means of two groups, which may be related in certain features.
• The t-test is one of many tests used for hypothesis testing in statistics.
• Several types of t-test can be performed, depending on the data and the type of analysis required.
[Figure: Determining the Correct T-Test to Use. Source: Google]

SUMMARY:
The assertion is acceptable. Because the absolute value of the test statistic is less than the critical value, we do not have
enough evidence to reject the assertion that the average balance is $500. The significance level is 0.05; if the p-value is
less than 0.05, the null hypothesis can be rejected. For the given data the p-value is 0.19, which is greater than 0.05, so we
cannot reject the assertion.

The null hypothesis states that there is no difference between the specified populations, any observed difference being due
to sampling or experimental error. The alternative hypothesis is the statement about the parameters that we wish to prove; it
proposes that there is a difference. In the absence of any data, the null hypothesis can stand alone, but the alternative
hypothesis cannot.
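
As a rough illustration of the one-sample t-test used for Question 1, the sketch below computes the test statistic with the Python standard library only. The function name `one_sample_t` and the sample balances are hypothetical; they are not taken from the report's Excel sheet.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)             # standard error of the mean
    return (mean - mu0) / se, n - 1

# Hypothetical balances, NOT the report's data.
balances = [420.0, 610.0, 480.0, 530.0, 390.0, 575.0]
t_stat, df = one_sample_t(balances, 500.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

The null hypothesis is rejected only if |t| exceeds the critical value for the chosen significance level and degrees of freedom.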

REFERENCE: Attached given Excel sheet

Question 2: Is there a difference between men and women as far as average balance is concerned? Use a two-sample t-test
to draw your conclusion

t-Test: Two-Sample Assuming Unequal Variances

male Balance Female Balance


Mean 509.8031088 529.5362319
Variance 213554.5652 210187.1043
Observations 193 207
Hypothesized Mean Difference 0
df 396
t Stat -0.42838443
P(T<=t) one-tail 0.334302083
t Critical one-tail 1.648710601
P(T<=t) two-tail 0.668604165
t Critical two-tail 1.965972608

if |t|>c, then reject that they are equal
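
The t Stat and df in the table above can be reproduced from the reported means, variances and observation counts. The sketch below uses the standard unequal-variance (Welch) formulas; the function name `welch_t` is an assumption for illustration, and the critical value is copied from the table rather than computed.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    v1, v2 = var1 / n1, var2 / n2                  # squared standard errors
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics copied from the table (male first, female second).
t_stat, df = welch_t(509.8031088, 213554.5652, 193,
                     529.5362319, 210187.1043, 207)

T_CRIT_TWO_TAIL = 1.965972608                      # taken from the table
reject_equal_means = abs(t_stat) > T_CRIT_TWO_TAIL

print(f"t = {t_stat:.4f}, df = {df:.1f}, reject = {reject_equal_means}")
```

Excel reports the degrees of freedom rounded to a whole number, which is why the table shows 396.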


SUMMARY:

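According to the given data there is no significant difference between the average balances of men and women. The absolute
value of the t statistic (0.43) is below the two-tail critical value (1.97) and the two-tail p-value (0.67) is greater than 0.05,
so the two-sample t-test gives no reason to reject the hypothesis that the average balances are equal.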

REFERENCE: Attachment from the given Excel sheet

Question 3: Is there a difference between students and non-students as far as average balance is concerned? Use a two-
sample t-test to draw your conclusion.

SUMMARY:

If the absolute value of the test statistic (t) is larger than the critical value of a two-sample t-test, then you can reject the
hypothesis that the two groups have equal means. For this data set we see that |t| > c, so we can reject the hypothesis that
the average balances of non-students and students are equal. Hence the average balances of non-students and students
differ.
REFERENCE: Attachment from the given Excel sheet

Question 4. It is generally assumed that if there are more credit cards then the balance on the cards will be more. Based on
this dataset, do you think this is true? Calculate a correlation coefficient and show a scatter plot to support your answer
Summary:
The assumption is incorrect. The correlation coefficient of the data shows only a very small positive relationship, so having
more credit cards does not imply a higher balance.
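
For reference, a correlation coefficient like the one used here can be computed as in the sketch below. The function name `pearson_r` and the card counts and balances are hypothetical values for illustration, not the report's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, NOT taken from the Excel sheet.
cards = [1, 2, 3, 4, 5, 6]
balance = [500.0, 300.0, 650.0, 400.0, 700.0, 450.0]
print(f"r = {pearson_r(cards, balance):.3f}")
```

A value close to 0 indicates a weak linear relationship; values near +1 or -1 indicate a strong one.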

REFERENCE: Attachment from the given Excel sheet

Question 5. Examine whether the following demographic variables influence balance: (a) age, (b) years of education, (c)
marital status. For age and years of education, use scatter plots to depict their relationship with balance and calculate the
correlation coefficient. For the relationship between marital status and balance, use a two-sample t-test to draw your
conclusion
Summary and Answers
A: Since the correlation between age and balance is positive but close to 0, we can conclude that age has almost no
influence on balance.

B: Since the correlation between years of education and balance is negative and very close to 0, we can conclude that
education has almost no influence on balance.
C: If the absolute value of the test statistic (t) is larger than the critical value of a two-sample t-test, then you can reject
the hypothesis that the two groups have equal means. Here it is not larger, so we cannot reject the hypothesis that the
unmarried and married balances are equal; marital status does not influence the balance.
Question 6: “Ethnicity of the cardholder does not matter as far as balance is concerned.” Carry out an analysis of
variance (ANOVA) and discuss whether this statement is supported by the data or not.

SUMMARY

Groups Count Sum Average Variance


CAUCASIAN Balance 199 103181 518.4974874 190922.4
AF AM Balance 99 52569 531 235839.2
ASIAN Balance 102 52256 512.3137255 231748.3

here p>0.05
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 18454.20047 2 9227.100236 0.043443 0.957492 3.018452
Within Groups 84321457.71 397 212396.6189

Total 84339911.91 399

Summary:

The p-value is 0.957, which is greater than the significance level of 0.05, so we cannot reject the hypothesis that the ethnicity of the
cardholder does not matter as far as balance is concerned. The statement is supported by the data.
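
The F value in the ANOVA table can be recomputed from the group counts, sums and variances in the summary table. The sketch below assumes the standard one-way ANOVA decomposition; the function name `anova_f` is illustrative, and F crit is copied from the table rather than computed.

```python
# (count, sum, variance) per group, copied from the summary table.
groups = {
    "CAUCASIAN": (199, 103181, 190922.4),
    "AF AM": (99, 52569, 235839.2),
    "ASIAN": (102, 52256, 231748.3),
}

def anova_f(stats):
    """One-way ANOVA F statistic from per-group (count, sum, variance)."""
    n_total = sum(n for n, _, _ in stats)
    grand_mean = sum(s for _, s, _ in stats) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (s / n - grand_mean) ** 2 for n, s, _ in stats)
    # Within-group sum of squares: sample variance times (n - 1) per group.
    ss_within = sum((n - 1) * v for n, _, v in stats)
    df_between = len(stats) - 1
    df_within = n_total - len(stats)
    return (ss_between / df_between) / (ss_within / df_within)

F_CRIT = 3.018452                     # taken from the ANOVA table
f_stat = anova_f(list(groups.values()))
print(f"F = {f_stat:.6f}, exceeds F crit: {f_stat > F_CRIT}")
```

Because F is far below F crit, the group means are not significantly different, which agrees with the p-value in the table.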

Question 7 b) A general principle that credit card companies often follow is to assign a higher credit limit to people with a higher
credit rating. Does the data show that this principle is being followed?
From the trend line in the scatter plot, we can conclude that credit card companies do assign a higher credit limit to people
with a higher credit rating. When the rating doubles, the limit increases about 1.5 times. The limit can be extended even further if a
rating is exceptionally high.
8. Run a simple linear regression of balance on the credit limit. (Here credit limit is the X and the balance is the Y). Report the
coefficients and the R-squared. Show a scatter plot
Variable Coefficients
Limit 0.171637278
R Square 0.74252218
A higher credit limit is associated with a higher balance: each additional unit of limit
adds about 0.17 to the estimated balance.

9. Run a simple linear regression of balance (Y) on credit rating (X). Report the coefficients and R-squared. Show a scatter plot
Variable Coefficients

Rating(x) 2.566240327

R Square 0.745848418

10. Consider your findings in questions 8-9. Discuss business mechanisms to increase or decrease the balance on credit cards. Try to
quantify your answers.
From the findings in questions 8 and 9, the coefficient of Limit is 0.1716: each additional unit of credit limit is associated with an
increase of about 0.17 in Balance. Although this coefficient is close to zero in absolute terms, the R-squared of 0.7425 shows that
Limit is strongly related to Balance.
The coefficient of Rating is 2.5662: each one-unit increase in credit rating is associated with an increase of about 2.57 in Balance.
The R-squared of 0.7458 means that Rating explains about 74.6% of the variation in Balance, slightly more than Limit (74.3%).
Hence, to increase the balance on credit cards we can focus on increasing credit ratings; lowering them would have the opposite
effect.

11. The credit limit is provided as a consolidated amount for all the credit cards the cardholder has. Run a multiple linear regression
of Balance (Y) on Limit and Cards as two X variables. Report the coefficients. Discuss the effect on the balance of (a) increasing the
credit limit on the same number of cards and (b) increasing the number of cards without altering the total credit limit.

[Scatter plot: Balance(y)]

Ans: When the credit limit increases, there is a correspondingly significant increase in balance.
12 . Run a simple linear regression equation with Income as X and Balance as Y. Report the coefficients. Is the coefficient of Income
significantly different from zero? What does this say about the effect of income?

[Scatter plot: Balance, with trend line y = 6.0484x + 246.51]

Ans: Balance increases with income. The trend line has a slope of 6.0484, so each one-unit increase in income is associated with an
increase of about 6.05 in balance.

13. Based on the equation derived in question 12, what is the estimated balance for a person with an income of USD 100k per year?
Regression Equation y = 6.0484x + 246.51
Estimated balance y= 851.35 USD
The regression equation shows that for each one-unit (USD 1k) increase in income, the estimated balance increases by about
6.05 USD. Regression analysis thus describes the relationship between X and Y as a straight line and lets us judge whether
that relationship is significant.
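
The estimate can be checked by substituting the income into the regression equation. In the sketch below, income is assumed to be measured in thousands of USD, as implied by using x = 100 for USD 100k; the function name is illustrative.

```python
SLOPE = 6.0484        # coefficient of Income from the regression equation
INTERCEPT = 246.51    # intercept from the regression equation

def estimated_balance(income_thousands):
    """Estimated balance for an income given in thousands of USD."""
    return SLOPE * income_thousands + INTERCEPT

print(f"{estimated_balance(100):.2f}")   # 6.0484 * 100 + 246.51
```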

14. Based on the dataset, explore the relationship between credit card balance (Y) and (a) Income (b) Age (c) Education (d) Limit, and
(e) Rating as X variables. Estimate a multiple linear regression model and report the statistical significance of each of these
variables.
Variable     P-value        Statistical Output
Income       1.37077E-61    Significant predictor, as the p-value is less than 0.05
Age          0.073165937    Not a significant predictor, as the p-value is greater than 0.05
Education    0.450516748    Not a significant predictor, as the p-value is greater than 0.05
Limit        0.078487737    Not a significant predictor, as the p-value is greater than 0.05
Rating       3.93909E-05    Significant predictor, as the p-value is less than 0.05
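
The classification above follows from comparing each p-value with the 0.05 significance level. A minimal sketch using the p-values from the table (the variable names are illustrative):

```python
ALPHA = 0.05   # significance level used throughout the report

# P-values copied from the multiple regression output.
p_values = {
    "Income": 1.37077e-61,
    "Age": 0.073165937,
    "Education": 0.450516748,
    "Limit": 0.078487737,
    "Rating": 3.93909e-05,
}

# A predictor is treated as significant when its p-value is below ALPHA.
significant = {name: p < ALPHA for name, p in p_values.items()}

for name, is_sig in significant.items():
    print(f"{name}: {'significant' if is_sig else 'not significant'}")
```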
