Multiple Regression & Correlation Matrix
2016
Submitted To:
Dewan Muktadir-Al-Mukit
Assistant Professor
Faculty of Business Administration
Eastern University
Submitted By:
Tanjina Alam Jhumur
ID: 142 200 104
Syed Sujibur Rahman
ID: 153 200 087
Sharmin Aktar Juli
ID: 133 200 012
Arifa Ahmed
ID: 143 200 056
Section: 4
Date of Submission: 18th April, 2
LETTER OF TRANSMITTAL
Assistant Professor
Faculty of Business Administration
Eastern University
Subject: Request for accepting the report.
Dear Sir,
We would like to draw your kind attention to the fact that we are submitting
our report on Multiple Regression and the Correlation Matrix. We have tried
our best to prepare this report so that it fulfills your requirements. We
believe that the ideas in this report will help us in our future
practical life.
We will be highly grateful if you would kindly accept our report and
oblige us thereby.
Thanking you,
On behalf of
the entire group,
________________________
Tanjina Alam Jhumur
ACKNOWLEDGEMENT
At first we would like to express our deepest gratitude to Allah for giving us the
strength and the composure to finish the task
within the scheduled time.
of encouragement, co-operation,
and moral support that we have
EXECUTIVE SUMMARY
The report analyzes multiple regression and the correlation matrix. We collected our data from
a primary source, namely a survey.
The introduction part of the report introduces multiple regression and the
correlation matrix. It also states the objectives of the report, the methods of data collection and
the limitations of the study.
The findings and analysis part of the report presents the data that we collected from
the survey. The analysis part interprets the following things:
Table of Contents
LETTER OF TRANSMITTAL
ACKNOWLEDGEMENT
EXECUTIVE SUMMARY
Chapter One: Introduction
1.1 Introduction
1.2
1.3
1.4
1.5
Analysis
Part One: Multiple Regression & Correlation Matrix
3.2
3.3 Interpretation
3.3.1
3.3.2
3.3.3
3.3.4
3.3.5
3.3.6
3.5 Mean
3.6
3.7 Median
3.8 Mode
3.9
3.10
3.11
Conclusion
APPENDIX
Survey Data
Chapter One
Introduction
1.1
Introduction
Multiple regression is a flexible method of data analysis that may be appropriate whenever a
quantitative variable (the dependent or criterion variable) is to be examined in relationship to
any other factors (expressed as independent or predictor variables). Relationships may be
nonlinear, independent variables may be quantitative or qualitative, and one can examine the
effects of a single variable or multiple variables with or without the effects of other variables
taken into account.
Correlation and regression analysis are related in the sense that both deal with relationships
among variables. The correlation coefficient is a measure of linear association between two
variables. Values of the correlation coefficient are always between -1 and +1. A correlation
coefficient of +1 indicates that two variables are perfectly related in a positive linear sense; a
correlation coefficient of -1 indicates that two variables are perfectly related in a negative
linear sense, and a correlation coefficient of 0 indicates that there is no linear relationship
between the two variables. For simple linear regression, the sample correlation coefficient is
the square root of the coefficient of determination, with the sign of the correlation coefficient
being the same as the sign of b1, the coefficient of x1 in the estimated regression equation.
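The last statement can be checked numerically. The following is a minimal Python sketch on a small made-up data set (the numbers are illustrative assumptions, not survey data); it fits a simple linear regression by least squares and compares the directly computed correlation coefficient with the signed square root of R2.

```python
from math import sqrt, copysign

# Hypothetical data, chosen only to give a clearly negative slope.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [11.8, 10.1, 8.3, 7.9, 5.2, 4.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Centered sums of squares and cross-products.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares estimates for y = b0 + b1 * x.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Coefficient of determination: R^2 = 1 - SSE / SST.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1.0 - sse / syy

# Correlation coefficient computed directly from the data.
r = sxy / sqrt(sxx * syy)

# The same value recovered as sqrt(R^2) carrying the sign of b1.
r_from_r2 = copysign(sqrt(r_squared), b1)

print(round(r, 6), round(r_from_r2, 6))
```

Both printed values agree, and both are negative because b1 is negative.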
Neither regression nor correlation analyses can be interpreted as establishing cause-and-effect
relationships. They can indicate only how or to what extent variables are associated with each
other. The correlation coefficient measures only the degree of linear association between two
variables. Any conclusions about a cause-and-effect relationship must be based on the
judgment of the analyst.
1.2
This report has been prepared to make a study on the Multiple Regression & Correlation
Matrix as a part of the Business Statistics II course required for the BBA program of the
Faculty of Business Administration of Eastern University.
The report was prepared under the supervision of Dewan Muktadir-Al-Mukit, Assistant
1.3
Every piece of work has objectives to fulfill, and this report is no exception.
The following are a few straightforward objectives which we have tried to fulfill in the report:
1.4
For smooth and accurate data collection we followed some rules and
regulations. The report inputs were collected from a primary source, namely a survey.
1.5
Chapter TWO
independent variables can explain the behavior of the dependent variable. Thus, R2 represents
the explanatory power of a regression model.
In simple regression with one independent variable, R2 is simply the square of the correlation
coefficient.
In multiple linear regression with more than one explanatory variable, with the intercept of
the regression equation included, R2 is the square of the multiple correlation coefficient R.
In all the standard statistical solvers including MS Excel, R2 is provided in the output. For
example, an R2 of 74% from the solver output implies the regression model can explain 74%
of the behavior of the dependent variable. This indicates we may refine our regression model
a bit.
However, a problem with R2 is that it increases as one simply plugs more independent
variables into the regression model, even if they do not increase the explanatory power of the
model. To address this problem, adjusted R2 (Ra2) is used (also provided by all statistical
solvers); it penalizes each additional independent variable and so largely corrects this bias.
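The definition of adjusted R2 did not survive in the text above. The standard textbook form is Ra2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of independent variables. The short Python sketch below applies it to the R2, n and k reported in the Summary Output of Chapter three; the function name is our own.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)

# R Square, Observations and number of predictors from the Summary Output.
adj = adjusted_r_squared(0.650553802, n=30, k=2)
print(round(adj, 6))  # 0.624669
```

The result agrees with the Adjusted R Square of 0.624668899 in the Excel output.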
2.3 Assumptions
The multiple regression technique does not test whether the data are linear. On the contrary, it
proceeds by assuming that the relationship between Y and each of the Xi's is linear. Hence, as
a rule, it is prudent to always look at the scatter plots of (Y, Xi), i = 1, 2, ..., k. If any plot
suggests nonlinearity, one may use a suitable transformation to attain linearity.
Another important assumption is the absence of multicollinearity: the independent
variables are not strongly related among themselves. At a very basic level, this can be checked by
computing the correlation coefficient between each pair of independent variables.
Other assumptions include those of homoscedasticity and normality.
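The pairwise check for multicollinearity can be sketched in Python as follows. The three predictors are made-up values used only for illustration, and the 0.80 cut-off is the rule of thumb applied later in this report; both the names and the data are assumptions of the sketch.

```python
from itertools import combinations
from math import sqrt

def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

# Hypothetical independent variables (not the survey data).
predictors = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 11],   # nearly proportional to x1
    "x3": [5, 1, 4, 2, 3],
}

THRESHOLD = 0.80  # rule-of-thumb cut-off for a worrying pairwise correlation

# Collect every pair of predictors whose |r| reaches the threshold.
flagged = [
    (a, b)
    for a, b in combinations(predictors, 2)
    if abs(pearson_r(predictors[a], predictors[b])) >= THRESHOLD
]
print(flagged)
```

Only the pair (x1, x2) is flagged here, because x2 is almost a multiple of x1.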
Multiple regression analysis is used when one is interested in predicting a continuous
dependent variable from a number of independent variables. If the dependent variable is
dichotomous, then logistic regression should be used.
Chapter three
Serial No.   No. of Rooms (X1)
1.           3
2.           4
3.           3
4.           3
5.           4
6.           4
7.           3
8.           4
9.           3
10.          3
11.          3
12.          2
13.          3
14.          3
15.          3
16.          4
17.          3
18.          3
19.          3
20.          3
21.          3
22.          3
23.          3
24.          3
25.          3
26.          3
27.          3
28.          4
29.          3
30.          3
Analysis:
Part One: Multiple Regression & Correlation Matrix
3.2
Summary Output
SUMMARY OUTPUT
Regression Statistics
Multiple R            0.806569155
R Square              0.650553802
Adjusted R Square     0.624668899
Standard Error        13352.80393
Observations          30

ANOVA
              df    SS             MS            F             Significance F
Regression    2     8962137604     4481068802    25.13255655   0.00000068
Residual      27    4814029062     178297373
Total         29    13776166667

Coefficient
                Coefficients    Standard Error   t Stat        P-value        Lower 95%     Upper 95%
Intercept       8222.577181     17614.1569       0.46681639    0.644376317    -27918.687    44363.84141
X Variable 1    -7421.32255     5470.176413      -1.35669      0.186116258    -18645.197    3802.552217
X Variable 2    33.65967748     4.747771671      7.08957376    1.26982E-07    23.918055     43.40130017
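The summary output can be reproduced without Excel. The Python sketch below is one possible way to do so, not the procedure used for the report: it fits the two-predictor least-squares model by solving the normal equations with centered sums. The records are transcribed from the Survey Data appendix; serial 17 lists 3 bedrooms, which is taken here as 3 rooms. All function and variable names are our own.

```python
# Survey records as (rent in Taka per month, rooms, area in square feet),
# serials 1 to 30, transcribed from the Survey Data appendix.
survey = [
    (11000, 3, 800), (15000, 4, 1050), (10500, 3, 750), (12000, 3, 900),
    (15000, 4, 1200), (16500, 4, 1250), (8500, 3, 950), (14000, 4, 1100),
    (30000, 3, 2500), (25000, 3, 1150), (18000, 3, 1200), (11500, 2, 650),
    (18000, 3, 1000), (13000, 3, 1066), (22000, 3, 1163), (40000, 4, 2400),
    (27000, 3, 1650), (45000, 3, 1750), (60000, 3, 1600), (55000, 3, 1560),
    (60000, 3, 2300), (40000, 3, 1600), (70000, 3, 1800), (65000, 3, 2130),
    (60000, 3, 2200), (40000, 3, 1350), (38000, 3, 1300), (80000, 4, 2300),
    (35000, 3, 1400), (70000, 3, 2000),
]
y = [row[0] for row in survey]    # house rent (Y)
x1 = [row[1] for row in survey]   # number of rooms (X1)
x2 = [row[2] for row in survey]   # area in square feet (X2)

def mean(values):
    return sum(values) / len(values)

def centered_sum(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))

s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
s1y, s2y, syy = centered_sum(x1, y), centered_sum(x2, y), centered_sum(y, y)

# Normal equations for two predictors, solved by Cramer's rule.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)

# Regression sum of squares and coefficient of determination.
ss_regression = b1 * s1y + b2 * s2y
r_squared = ss_regression / syy

print(round(b0, 2), round(b1, 2), round(b2, 4), round(r_squared, 4))
```

The coefficients and R Square obtained this way should agree with the Excel figures up to rounding.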
3.3
Interpretation
3.3.1
b1 = -7421.323 indicates that when the number of rooms increases by 1 room, the mean
house rent decreases by Taka 7421.323 while the other variable is held constant.
b2 = 33.66 indicates that when the area increases by 1 square foot, the mean
house rent increases by Taka 33.66 while the other variable is held constant.
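To make the meaning of the slope coefficients concrete, the sketch below evaluates the estimated regression equation from the coefficient table for an illustrative flat of 3 rooms and 1000 square feet (our own choice of values), and then changes one variable at a time. The function name is an assumption.

```python
# Coefficients taken from the Excel coefficient table.
B0 = 8222.577181        # intercept
B1_ROOMS = -7421.32255  # slope for number of rooms (X1)
B2_AREA = 33.65967748   # slope for area in square feet (X2)

def predicted_rent(rooms, area_sqft):
    """Estimated mean monthly rent in Taka from the fitted equation."""
    return B0 + B1_ROOMS * rooms + B2_AREA * area_sqft

base = predicted_rent(3, 1000)
one_more_room = predicted_rent(4, 1000)   # area held constant
one_more_sqft = predicted_rent(3, 1001)   # rooms held constant

print(round(base, 2))                   # 19618.29
print(round(one_more_room - base, 2))   # -7421.32
print(round(one_more_sqft - base, 2))   # 33.66
```

Each difference equals the corresponding slope coefficient, which is what "while the other variable is held constant" means in practice.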
3.3.3
The relationship among the variables in relative terms can be estimated with the help of
the coefficient of multiple correlation (R).
R = 0.81 indicates a strong linear association between house rent and the two independent
variables taken together (Number of Rooms and Area in Square Feet).
3.3.4
The explanatory power of the independent variables can be assessed with the help of
the adjusted coefficient of multiple determination (adjusted R2).
Adjusted R2 = 0.62 indicates that around 62 percent of the variation in the dependent variable
(house rent) can be explained by the variation in the independent variables (No. of Rooms and
Area in Square Feet).
3.3.5
If the p-value of a slope coefficient is less than or equal to 5% (0.05), then the relationship
between the dependent and the independent variable is statistically significant.
If the significance of F (p-value of F) is less than or equal to 5% (0.05), then the overall model is
statistically significant. [ANOVA Table]
From the coefficient table, we can say that the slope coefficient of no. of rooms is not
statistically significant at the 5% level [0.186 > 0.05].
So, we cannot reject our null hypothesis. That means there is no statistically significant
relationship between house rent and number of rooms.
From the coefficient table, we can say that the slope coefficient of area in square feet is
statistically significant at the 5% level [0.00000013 < 0.05].
So, we can reject our null hypothesis. That means there is a positive and significant relationship
between area in square feet and house rent.
From the ANOVA table, the F statistic implies that the overall regression model is statistically
significant at the 5% level (0.00000068 < 0.05). So, the regression equation as a whole has
significant explanatory power for the data.
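The figures used in this section can be cross-checked from the summary output with a few lines of arithmetic. The sketch below recomputes F as MS Regression / MS Residual, R2 as SS Regression / SS Total and each t statistic as coefficient / standard error, then applies the 5% decision rule to the reported p-values. The variable names and the dictionary layout are our own.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression = 8962137604
ss_residual = 4814029062
ss_total = 13776166667
df_regression, df_residual = 2, 27

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual    # about 25.13
r_squared = ss_regression / ss_total    # about 0.6506

# (coefficient, standard error, p-value) from the coefficient table.
coefficients = {
    "Intercept": (8222.577181, 17614.1569, 0.644376317),
    "No. of Rooms (X1)": (-7421.32255, 5470.176413, 0.186116258),
    "Area in Sq. Ft. (X2)": (33.65967748, 4.747771671, 1.26982e-07),
}

ALPHA = 0.05  # 5% significance level
t_stats = {}
verdicts = {}
for name, (coef, std_err, p_value) in coefficients.items():
    t_stats[name] = coef / std_err
    verdicts[name] = "significant" if p_value <= ALPHA else "not significant"
    print(name, round(t_stats[name], 4), verdicts[name])

print(round(f_stat, 4), round(r_squared, 4))
```

The t statistic for the number of rooms comes out at about -1.357, which is consistent with its p-value of 0.186 on 27 residual degrees of freedom.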
3.3.6
i) If the correlation coefficient between the independent variables is higher than the
coefficient between the dependent variable and any independent variable, multicollinearity is
indicated; otherwise there is no multicollinearity.
r (X1, X2) = 0.184 > r (Y, X1) = -0.00629
ii) If the correlation coefficient between the independent variables is less than 0.80,
there is no multicollinearity.
r (X1, X2) = 0.184 < 0.80
No Multicollinearity
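Correlation Matrix
                        House Rent (Y)   No. of Rooms (X1)   Area in Sq. Ft. (X2)
House Rent (Y)          1
No. of Rooms (X1)       -0.00629         1
Area in Sq. Ft. (X2)    0.791664         0.183698            1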
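With two independent variables, R2 can be recovered from the three pairwise correlation coefficients alone, which gives a useful consistency check between the correlation matrix and the regression output. The sketch below assumes that 0.791664 is r(Y, X2), the correlation between house rent and area; the other two values are quoted in section 3.3.6.

```python
# Pairwise correlation coefficients.
r_y1 = -0.00629   # house rent vs. number of rooms
r_y2 = 0.791664   # house rent vs. area (assumed reading of the matrix)
r_12 = 0.183698   # number of rooms vs. area

# R^2 for a regression of Y on two predictors, from correlations only.
r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
multiple_r = r_squared ** 0.5

print(round(r_squared, 4), round(multiple_r, 4))  # 0.6506 0.8066
```

These values match the R Square and Multiple R of the summary output, which supports the reading of the matrix assumed above.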
$2^K > n$
$2^5 = 32 > 28$, so number of classes, $K = 5$
$I \geq \frac{H - L}{K}$
or, $I \geq \frac{9 - 4}{5}$
or, $I \geq 1$
So take interval = 1
Exclusive Method:
Class    Tally Bar     Frequency   Relative Frequency   Cumulative Frequency
4-5      IIII/ IIII    9           0.32                 9
5-6      IIII/ III     8           0.29                 17 (9+8)
6-7      IIII/         5           0.18                 22
7-8      IIII/         5           0.18                 27
8-9                    0           0                    27
9-10     I             1           0.036                28
Total:                 28
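The construction of the frequency distribution can be sketched in Python from the family-size data listed in section 3.6. The class limits follow the exclusive method used above (lower limit included, upper limit excluded); the variable names are our own.

```python
# Number of family members for the 28 respondents (section 3.6).
members = [4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5,
           6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 9]
n = len(members)

# "2 to the K" rule: the smallest K with 2**K > n.
k = 1
while 2 ** k <= n:
    k += 1

# Class interval I >= (H - L) / K.
interval = (max(members) - min(members)) / k

# Exclusive classes 4-5, 5-6, ..., 9-10.
classes = [(low, low + 1) for low in range(min(members), max(members) + 1)]
frequency = [sum(1 for m in members if low <= m < high) for low, high in classes]
relative = [round(f / n, 3) for f in frequency]

cumulative = []
running_total = 0
for f in frequency:
    running_total += f
    cumulative.append(running_total)

for (low, high), f, rel, cum in zip(classes, frequency, relative, cumulative):
    print(f"{low}-{high}", f, rel, cum)
```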
3.5
Mean
Data Range =
3.6
4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 9
3.7
Median:
3.8
Mode:
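The worked calculations for the mean, median and mode did not survive in this copy of the report. As a sketch only, the three measures can be computed from the data listed in section 3.6 with Python's standard statistics module; the results below are recomputed from that list and are not quoted from the report.

```python
from statistics import mean, median, mode

# Number of family members for the 28 respondents (section 3.6).
members = [4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5,
           6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 9]

average = mean(members)   # sum of values / number of values = 150 / 28
middle = median(members)  # average of the 14th and 15th ordered values
most_common = mode(members)

print(round(average, 2), middle, most_common)
```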
3.9
Mean Deviation
Deviations are taken from 5, for example 0 = (5 - 5) and -1 = (4 - 5).

x     x - 5    |x - 5|    (x - 5)^2
5     0        0          0
4     -1       1          1
7     2        2          4
4     -1       1          1
7     2        2          4
5     0        0          0
6     1        1          1
5     0        0          0
5     0        0          0
4     -1       1          1
7     2        2          4
5     0        0          0
4     -1       1          1
7     2        2          4
5     0        0          0
4     -1       1          1
6     1        1          1
7     2        2          4
4     -1       1          1
6     1        1          1
6     1        1          1
4     -1       1          1
4     -1       1          1
5     0        0          0
6     1        1          1
4     -1       1          1
5     0        0          0
9     4        4          16
Total=         28         50
Geometric mean = 5.2
Hence, the geometric mean of the number of family members is approximately 5.
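The totals of the deviation columns and the geometric mean can be verified with a short Python sketch. The centre value 5 is the one used in the deviations above, and statistics.geometric_mean requires Python 3.8 or later; the variable names are our own.

```python
from statistics import geometric_mean

# Number of family members for the 28 respondents (section 3.6).
members = [4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5,
           6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 9]
centre = 5  # value the deviations are measured from

abs_dev_total = sum(abs(x - centre) for x in members)  # total of |x - 5|
sq_dev_total = sum((x - centre) ** 2 for x in members)  # total of (x - 5)^2
mean_deviation = abs_dev_total / len(members)

# Geometric mean: n-th root of the product of the n values.
gm = geometric_mean(members)

print(abs_dev_total, sq_dev_total, mean_deviation, round(gm, 1))
```

The two totals are 28 and 50, and the geometric mean rounds to 5.2.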
Chapter four
Conclusion
Multiple regression analysis is a powerful technique for predicting the unknown value
of a variable from the known values of two or more other variables, called the predictors.
Multiple regression is a statistical tool used to derive the value of a criterion from several
other independent, or predictor, variables. It is the simultaneous combination of multiple
factors to assess how and to what extent they affect a certain outcome.
This technique breaks down when the nature of the factors themselves is of an un-measurable
or pure-chance nature.
In regression analysis, the problem of interest is the nature of the relationship itself between
the dependent variable (response) and the (explanatory) independent variable.
The analysis consists of choosing and fitting an appropriate model, done by the method of
least squares, with a view to exploiting the relationship between the variables to help estimate
the expected response for a given value of the independent variable. For example, if we are
interested in the effect of age on height, then by fitting a regression line, we can predict the
height for a given age.
The observations are assumed to be independent. For correlation, both variables should be
random variables, but for regression only the dependent variable Y must be random. In
carrying out hypothesis tests, the response variable should follow a normal distribution and the
variability of Y should be the same for each value of the predictor variable. A scatter diagram
of the data provides an initial check of the assumptions for regression.
APPENDIX:
Survey Data
Serial No.   Rent (Taka per month)   Size (square feet)   Rooms   Area
1            11000                   800                  3       Hazaribag
2            15000                   1050                 4       Hazaribag
3            10500                   750                  3       Hazaribag
4            12000                   900                  3       Dhanmondi
5            15000                   1200                 4       Hazaribag
6            16500                   1250                 4       Hazaribag
7            8500                    950                  3       Hazaribag
8            14000                   1100                 4       Hazaribag
9            30000                   2500                 3       Dhanmondi
10           25000                   1150                 3       Dhanmondi
11           18000                   1200                 3       Hazaribag
12           11500                   650                  2       Dhanmondi
13           18000                   1000                 3       Dhanmondi
14           13000                   1066                 3       Hazaribag
15           22000                   1163                 3       Dhanmondi
16           40000                   2400                 4       114 b gikatola 3rd floor monswor road bhaka dhanmondi area
17           27000                   1650                 3       109/3/a home. 9/a road. 4/a flat. Dhanmondi dreams. west Dhanmondi (behind Dhanmondi party centre)
18           45000                   1750                 3       Banani
19           60000                   1600                 3       Banani
20           55000                   1560                 3       Banani
21           60000                   2300                 3       Banani (BAN-113)
22           40000                   1600                 3       Banani (BAN1572)
23           70000                   1800                 3       Banani (BAN1394)
24           65000                   2130                 3       Banani BAN1136
25           60000                   2200                 3       Banani
26           40000                   1350                 3       Banani
27           38000                   1300                 3       Banani
28           80000                   2300                 4       Banani (BAN-9)
29           35000                   1400                 3       Banani (BAN-59)
30           70000                   2000                 3       Banani (BAN-6)
Serial 17 rooms in full: 3 bed rooms, 1 dining room, 1 sitting room, 3 toilets + 1 servants' toilet, 1 kitchen, 2 balconies, 1 garage.