Regn & Marketing Research
Regression:
Explaining Association
and Causation
by
Tuhin Chattopadhyay
Slide 1
Application Areas: Correlation
Pre-conceived Approach
Data
1. A regression analysis requires input data on y and on each
of the x variables. These data are entered into a statistical
software package, which performs the regression analysis.
Dependent Variable
Independent Variables
Input data:
Correlation
Regression
and determine the values of a, b1, b2, b3, b4, b5, & b6.
Regression Output:
a (intercept) = -3.17298
b1 = .22685
b2 = .81938
b3 = 1.09104
b4 = -1.89270
b5 = -0.54925
b6 = 0.06594
Slide 11
Given the levels of X1, X2, X3, X4, X5, and X6 for a
particular territory, we can use the regression model
for prediction of sales.
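As a minimal sketch of this prediction step, the fitted coefficients from the regression output can be plugged into the linear form Sales = a + b1*X1 + ... + b6*X6. The function name predict_sales and the territory values below are illustrative assumptions, not data from the analysis.

```python
# Coefficients copied from the regression output above.
INTERCEPT = -3.17298
COEFFS = [0.22685, 0.81938, 1.09104, -1.89270, -0.54925, 0.06594]  # b1..b6


def predict_sales(x):
    """Return a + b1*X1 + ... + b6*X6 for one territory.

    `x` is a sequence holding the six predictor levels X1..X6.
    """
    if len(x) != len(COEFFS):
        raise ValueError("expected six predictor values (X1..X6)")
    return INTERCEPT + sum(b * xi for b, xi in zip(COEFFS, x))


# Hypothetical territory: these X values are made up for illustration only.
example_territory = [10, 5, 4, 3, 2, 20]
print("Predicted sales:", round(predict_sales(example_territory), 3))
```

With all six X values set to zero the function returns the intercept, which is a quick way to confirm the coefficients were entered correctly.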
Before we do that, we have the option of re-estimating the
regression model so that variables which are not statistically
significant are reduced in number or eliminated.
We can follow either the Forward Stepwise
Regression method, or the Backward Stepwise
Regression method, to try and eliminate the
'insignificant' variables from the full regression model
containing all six independent variables.
Fig. 5
STAT. MULTIPLE REGRESS.   Regression Summary for Dependent Variable: Sales
R = .98831786   R² = .97677220   Adjusted R² = .96748108
F(4,10) = 105.13   p<.00000   Std. Error of estimate: 3.9637

N=15        BETA        St. Err.    B           St. Err.    t(10)       p-level
                        of BETA                 of B
Intercept                           -3.74194    4.847683    -.77190     .458025
People      .390134     .115138     1.02822     .303453     3.38841     .006904
Potentl     .462686     .117988     .23905      .060959     3.92147     .002860
Dealers     .180700     .102687     .90109      .512065     1.75971     .108955
Compet      -.081195    .053434     -1.81074    1.191624    -1.51955    .159589
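A quick consistency check on this table: each t(10) value should equal the coefficient B divided by its standard error. The sketch below recomputes those ratios from the figures in Fig. 5. The dictionary layout and function name are illustrative, and the standard errors are entered as positive magnitudes, since a standard error cannot be negative.

```python
# (B, St. Err. of B, reported t) for each row of Fig. 5.
rows = {
    "Intercept": (-3.74194, 4.847683, -0.77190),
    "People": (1.02822, 0.303453, 3.38841),
    "Potentl": (0.23905, 0.060959, 3.92147),
    "Dealers": (0.90109, 0.512065, 1.75971),
    "Compet": (-1.81074, 1.191624, -1.51955),
}


def t_ratio(b, se):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return b / se


for name, (b, se, reported_t) in rows.items():
    computed = t_ratio(b, se)
    print(f"{name:<10} computed t = {computed:8.4f}   reported t = {reported_t:8.4f}")
```

The computed and reported values should agree up to the rounding of the printed coefficients.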
The 4 variables in the model are PEOPLE (no. of sales
people), POTENTL (sales potential), DEALERS (no. of
dealers) and COMPET (competitive index). Again we
notice that only two variables, PEOPLE and POTENTL, are
significant at the 90% confidence level (p-value < .10),
with p-levels of .006904 and .002860.
But DEALERS is now at a p-level of .108955, very close to
significance at the 90% confidence level. This could be the
equation that we use, instead of the one with 6 independent
variables. We would be economizing on the two variables
that are not required if we decide to use the model from
Fig. 5 instead of that from Fig. 4.
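The screening rule used here (a variable counts as significant if its p-level is below .10) can be sketched as follows. The p-levels are those reported in Fig. 5. The names ALPHA and split_by_significance are illustrative assumptions, and the final step assumes that a backward stepwise pass removes the least significant variable first.

```python
ALPHA = 0.10  # corresponds to the 90% confidence level used in the text

# p-levels reported in Fig. 5
p_levels = {
    "PEOPLE": 0.006904,
    "POTENTL": 0.002860,
    "DEALERS": 0.108955,
    "COMPET": 0.159589,
}


def split_by_significance(p_values, alpha=ALPHA):
    """Split variable names into (significant, not significant) at `alpha`."""
    significant = [name for name, p in p_values.items() if p < alpha]
    not_significant = [name for name, p in p_values.items() if p >= alpha]
    return significant, not_significant


significant, not_significant = split_by_significance(p_levels)
print("Significant at 90%:", significant)
print("Not significant:   ", not_significant)

# The variable with the largest p-level is the first removal candidate.
weakest = max(not_significant, key=p_levels.get)
print("First candidate for removal:", weakest)
```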
The F test for the model in Fig. 5 also indicates that it is highly
significant (from the top of Fig. 5, F = 105.1296, p < .000000),
and the R² value for the model is 0.9767, which is very close to
that of the 6-independent-variable model of Fig. 4. If we decide
to use the model from Fig. 5, it would be written as follows:
Sales = -3.74 + 1.03 (PEOPLE) + .24 (POTENTL) + .90 (DEALERS) - 1.81 (COMPET)
……….Equation 2
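The summary statistics quoted for Fig. 5 are linked by two standard identities for a least-squares model with an intercept: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), and F = (R²/k) / ((1 - R²)/(n - k - 1)), where n is the number of observations and k the number of independent variables. A minimal sketch that recomputes both from the reported R², with n = 15 and k = 4 taken from the figure (the function names are illustrative):

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values reported at the top of Fig. 5.
R2, N, K = 0.97677220, 15, 4

print(f"Adjusted R2 = {adjusted_r2(R2, N, K):.8f}")
print(f"F({K},{N - K - 1}) = {f_statistic(R2, N, K):.4f}")
```

Both results should match the reported adjusted R² and F values up to rounding, which confirms that the figure was transcribed consistently.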
Slide 16
Fig. 6
Additional comments