Exercise Discussion PDF
A clothing company manager wants to understand the relationship between sales and several
factors in order to predict sales and maximize profit. The dependent variable is sales
(in million rupiahs). The independent variables are promotion costs through TV (in million
rupiahs), promotion costs through flyers (in million rupiahs), the number of stores in the
district, and the number of competitors in the same district. Perform the regression
analysis and determine the best regression model using the model selection criteria
procedure. The following data were obtained from fifteen districts.
TV Flyer Store Competitor Sales
31 5.5 8 10 79
55 2.5 6 8 200
67 8 9 12 163
50 3 16 7 201
38 3 15 8 146
71 2.9 17 12 177
30 8 8 12 31
56 9 10 5 292
42 4 4 8 160
73 6.5 16 5 340
60 5.5 7 11 160
44 5 12 12 87
50 6 6 6 237
39 5 4 10 107
55 3.5 4 10 155
Answer:
In this case, we use multiple linear regression analysis because there are four
independent variables (promotion costs through TV, promotion costs through flyers,
the number of stores, and the number of competitors) and one dependent variable,
namely Sales. The procedures are as follows.
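The four-predictor model described above can be fitted by ordinary least squares. The following is a minimal sketch using NumPy; the variable names (data, X, y, coef) are illustrative and not part of the original exercise, which appears to have used a statistics package.

```python
import numpy as np

# Columns: TV, Flyer, Store, Competitor, Sales (the fifteen districts in the table).
data = np.array([
    [31, 5.5, 8, 10, 79], [55, 2.5, 6, 8, 200], [67, 8, 9, 12, 163],
    [50, 3, 16, 7, 201], [38, 3, 15, 8, 146], [71, 2.9, 17, 12, 177],
    [30, 8, 8, 12, 31], [56, 9, 10, 5, 292], [42, 4, 4, 8, 160],
    [73, 6.5, 16, 5, 340], [60, 5.5, 7, 11, 160], [44, 5, 12, 12, 87],
    [50, 6, 6, 6, 237], [39, 5, 4, 10, 107], [55, 3.5, 4, 10, 155],
], dtype=float)

# Design matrix: a column of ones for the constant, then the four predictors.
X = np.column_stack([np.ones(len(data)), data[:, :4]])
y = data[:, 4]

# Ordinary least squares: choose coef to minimise the sum of squared residuals.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

sse = float(np.sum((y - X @ coef) ** 2))      # residual sum of squares
sst = float(np.sum((y - y.mean()) ** 2))      # total sum of squares
r_squared = 1.0 - sse / sst

for name, b in zip(["Constant", "TV", "Flyer", "Store", "Competitor"], coef):
    print(f"{name:>10}: {b:.3f}")
print(f"R-squared: {r_squared:.3f}")
```

The printed coefficients can be compared with the Model 1 equation reported in the REGRESSION MODEL section.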
Assumption:
1. Normality test
● Hypothesis
H0: Sales data follows a normal distribution.
H1: Sales data does not follow a normal distribution.
● Significance level
α = 5%
● Test statistic
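The test statistic itself is not preserved in this text, and the source does not say which normality test was used. As a sketch only, assuming a Shapiro-Wilk test (a common choice for a sample of fifteen), the check on the Sales data could look like this with SciPy:

```python
from scipy import stats

# Sales (in million rupiahs) for the fifteen districts.
sales = [79, 200, 163, 201, 146, 177, 31, 292, 160, 340, 160, 87, 237, 107, 155]
alpha = 0.05

# Assumption: Shapiro-Wilk is used here; the original test is unknown.
w_stat, p_value = stats.shapiro(sales)
reject_h0 = bool(p_value < alpha)

print(f"W = {w_stat:.4f}, p-value = {p_value:.4f}")
print("H0 is rejected" if reject_h0 else "H0 is not rejected")
```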
Interpretation:
The scatter plots show the relationship between the Sales variable (Y) and
promotion costs through TV (X1), promotion costs through flyers (X2),
the number of stores (X3), and the number of competitors (X4).
Dependent Variable    Independent Variable    R-Square    Interpretation
Conclusion:
The TV and Competitor variables show a straight-line relationship with the Sales
variable. The Flyer and Store variables, however, do not show a linear relationship
with Sales, and their R-Square values are very small; in other words, each of them
explains only a very small share of the variation in the Sales variable.
Through the linearity test, therefore, only two independent variables pass the
assumption test. Because some variables do show a linear relationship, all four
independent variables are still included when forming the model.
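The per-variable R-Square values discussed above can be recomputed from the data table. The sketch below relies on the fact that, for a regression with a single predictor, R-Square equals the squared Pearson correlation; the names used are illustrative.

```python
import numpy as np

# Columns: TV, Flyer, Store, Competitor, Sales.
data = np.array([
    [31, 5.5, 8, 10, 79], [55, 2.5, 6, 8, 200], [67, 8, 9, 12, 163],
    [50, 3, 16, 7, 201], [38, 3, 15, 8, 146], [71, 2.9, 17, 12, 177],
    [30, 8, 8, 12, 31], [56, 9, 10, 5, 292], [42, 4, 4, 8, 160],
    [73, 6.5, 16, 5, 340], [60, 5.5, 7, 11, 160], [44, 5, 12, 12, 87],
    [50, 6, 6, 6, 237], [39, 5, 4, 10, 107], [55, 3.5, 4, 10, 155],
], dtype=float)
names = ["TV", "Flyer", "Store", "Competitor"]
sales = data[:, 4]

# One simple regression of Sales on each predictor, summarised by its R-Square.
r2 = {
    name: float(np.corrcoef(data[:, j], sales)[0, 1] ** 2)
    for j, name in enumerate(names)
}

for name, value in r2.items():
    print(f"Sales ~ {name:<10} R-Square = {value:.3f}")
```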
1. Model Summary
Interpretation:
● R: the multiple correlation coefficient, which measures the strength of the
relationship between the independent variables and the dependent variable,
equals 0.999. This value indicates a very close relationship between the
independent variables and the dependent variable.
● R²: indicates that 99.7% of the variation in the dependent variable can be
explained by the independent variables. The remaining 0.3% is explained by
factors outside the model.
● Adjusted R²: 0.996 means that, after correcting R² for the number of
independent variables, 99.6% of the variation is explained.
● Std. Error of the Estimate: 4.95659 is the typical size of the residuals,
that is, the spread of the observed Sales around the regression model.
● AIC (Akaike Information Criterion): the value of AIC is 51.940
● SBC (Schwarz Bayesian Criterion): the value of SBC is 55.480
● Mallows' Cp: the value of Mallows' Cp is 5, equal to the number of
parameters (including the constant)
● PRESS (prediction sum of squares): the value is 679.00
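The source does not state which formulas the software used for AIC and SBC. As an assumption, the least-squares forms AIC = n ln(SSE/n) + 2p and SBC = n ln(SSE/n) + p ln(n) are sketched below; fed with the reported standard error of the estimate, they reproduce the reported values to three decimals. The function name aic_sbc is hypothetical.

```python
import math

def aic_sbc(sse, n, p):
    """Least-squares AIC and SBC for n observations and p parameters."""
    base = n * math.log(sse / n)
    return base + 2 * p, base + p * math.log(n)

n = 15          # districts
p = 5           # constant + TV, Flyer, Store, Competitor
s = 4.95659     # reported Std. Error of the Estimate

# The standard error satisfies s^2 = SSE / (n - p), so SSE can be recovered.
sse = s ** 2 * (n - p)
aic, sbc = aic_sbc(sse, n, p)

print(f"SSE = {sse:.3f}")
print(f"AIC = {aic:.3f}")   # reported: 51.940
print(f"SBC = {sbc:.3f}")   # reported: 55.480
```

Smaller values of either criterion indicate a better trade-off between fit and number of parameters.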
2. Overall Test
● Hypothesis
H0: 𝛽i = 0, i = 1, 2, 3, 4 (regression model is not suitable for use/there is no
linear relationship between the dependent variable and the independent
variable)
H1: Not all 𝛽i = 0, i = 1, 2, 3, 4 (regression model is suitable for use/ there
is a linear relationship between the dependent variable and the
independent variable)
● Significance level
α = 5%
● Test Statistic
P-value = 0.000
● Critical region
H0 is rejected if the p-value is less than the significance level (α)
● Conclusion
Because the p-value (0.000) is less than the significance level (0.05), then
H0 is rejected. So it can be concluded that the model is suitable for use /
there is a linear relationship between the dependent variable and the
independent variable.
● Interpretation
The null hypothesis H0 states that the regression model is not feasible to
use, and the alternative hypothesis H1 states that it is feasible to use.
At the significance level α = 5%, H0 falls in the critical (rejection)
region if the value of Sig. is less than α. The obtained Sig. (0.000) is
less than the significance level (0.05), so H0 is rejected. It can be
concluded that the regression model is feasible to use, or that there is a
linear relationship between the dependent variable and the independent variables.
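The overall test above is an F test on the whole set of slopes. A minimal sketch of how the statistic and its p-value could be computed for the four-predictor model follows, using NumPy and SciPy; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Columns: TV, Flyer, Store, Competitor, Sales.
data = np.array([
    [31, 5.5, 8, 10, 79], [55, 2.5, 6, 8, 200], [67, 8, 9, 12, 163],
    [50, 3, 16, 7, 201], [38, 3, 15, 8, 146], [71, 2.9, 17, 12, 177],
    [30, 8, 8, 12, 31], [56, 9, 10, 5, 292], [42, 4, 4, 8, 160],
    [73, 6.5, 16, 5, 340], [60, 5.5, 7, 11, 160], [44, 5, 12, 12, 87],
    [50, 6, 6, 6, 237], [39, 5, 4, 10, 107], [55, 3.5, 4, 10, 155],
], dtype=float)
X = np.column_stack([np.ones(len(data)), data[:, :4]])
y = data[:, 4]
n, p = X.shape                      # 15 observations, 5 parameters

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = float(np.sum((y - X @ coef) ** 2))   # unexplained variation
sst = float(np.sum((y - y.mean()) ** 2))   # total variation
ssr = sst - sse                            # variation explained by the model

# F = (SSR / number of slopes) / (SSE / residual degrees of freedom)
f_stat = (ssr / (p - 1)) / (sse / (n - p))
p_value = float(stats.f.sf(f_stat, p - 1, n - p))

alpha = 0.05
reject_h0 = p_value < alpha
print(f"F = {f_stat:.2f}, p-value = {p_value:.6f}, reject H0: {reject_h0}")
```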
3. Partial Test
Independent variable    P-value (Sig.)
TV                      0.000
Flyer                   0.007
Store                   0.461
Competitor              0.000
● Critical region
H0 is rejected if the p-value is less than the significance level (α)
● Conclusion
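Independent variable    Conclusion
TV                      H0 is rejected (0.000 < 0.05)
Flyer                   H0 is rejected (0.007 < 0.05)
Store                   H0 is not rejected (0.461 > 0.05)
Competitor              H0 is rejected (0.000 < 0.05)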
● Interpretation
The null hypothesis H0 states that the predictor variable (Xi)
is not statistically significant in the regression model, and the
alternative hypothesis H1 states that the predictor variable (Xi) is
statistically significant in the regression model. At the significance
level α = 5%, H0 falls in the critical (rejection) region if the value of
Sig. is less than α. The obtained Sig. values of TV, Flyer, Store, and
Competitor are 0.000, 0.007, 0.461, and 0.000 respectively. This
shows that H0 is rejected for the TV, Flyer, and Competitor
variables but not for Store. In other words, the Store variable is not
statistically significant in the regression model.
From the Partial Test, we know that the Store variable is not statistically significant in
the regression model and has the largest p-value (Sig.). We therefore remove the Store
variable and run the regression analysis again.
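The partial test that leads to dropping Store is a t test on each coefficient. The sketch below shows one way to compute the t statistics and two-sided p-values for the four-predictor model and to pick the least significant predictor; it uses NumPy and SciPy, and all names are illustrative.

```python
import numpy as np
from scipy import stats

# Columns: TV, Flyer, Store, Competitor, Sales.
data = np.array([
    [31, 5.5, 8, 10, 79], [55, 2.5, 6, 8, 200], [67, 8, 9, 12, 163],
    [50, 3, 16, 7, 201], [38, 3, 15, 8, 146], [71, 2.9, 17, 12, 177],
    [30, 8, 8, 12, 31], [56, 9, 10, 5, 292], [42, 4, 4, 8, 160],
    [73, 6.5, 16, 5, 340], [60, 5.5, 7, 11, 160], [44, 5, 12, 12, 87],
    [50, 6, 6, 6, 237], [39, 5, 4, 10, 107], [55, 3.5, 4, 10, 155],
], dtype=float)
names = ["Constant", "TV", "Flyer", "Store", "Competitor"]
X = np.column_stack([np.ones(len(data)), data[:, :4]])
y = data[:, 4]
n, p = X.shape

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
mse = float(np.sum((y - X @ coef) ** 2)) / (n - p)   # residual variance estimate

# Standard errors come from the diagonal of mse * (X'X)^-1.
se = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))
t_stats = coef / se
p_values = 2 * stats.t.sf(np.abs(t_stats), n - p)    # two-sided p-values

for name, t, pv in zip(names, t_stats, p_values):
    print(f"{name:>10}: t = {t:8.3f}, p-value = {pv:.3f}")

# Candidate for removal: the predictor (not the constant) with the largest p-value.
worst = names[1 + int(np.argmax(p_values[1:]))]
print("Least significant predictor:", worst)
```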
MODEL 2
Variables included in this model are:
- Dependent variable (Y) : Sales
- Independent variable (X)
1. X1 = TV
2. X2 = Flyer
3. X3 = Competitor
- Constant
1. Model Summary
Interpretation:
● R: the multiple correlation coefficient, which measures the strength of the
relationship between the independent variables and the dependent variable,
equals 0.999. This value indicates a very close relationship between the
independent variables and the dependent variable.
● R²: indicates that 99.7% of the variation in the dependent variable can be
explained by the independent variables. The remaining 0.3% is explained by
factors outside the model.
● Adjusted R²: 0.996 means that, after correcting R² for the number of
independent variables, 99.6% of the variation is explained.
● Std. Error of the Estimate: 4.86255 is the typical size of the residuals,
that is, the spread of the observed Sales around the regression model.
● AIC (Akaike Information Criterion): the value of AIC is 50.795
● SBC (Schwarz Bayesian Criterion): the value of SBC is 53.627
● Mallows' Cp: the value of Mallows' Cp is 4, equal to the number of
parameters (including the constant)
● PRESS (prediction sum of squares): the value is 605.12
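PRESS is the sum of squared leave-one-out prediction errors. As a sketch for the three-predictor model, it can be obtained from a single fit through the diagonal of the hat matrix and cross-checked by refitting the model fifteen times with one district left out each time. The names used are illustrative.

```python
import numpy as np

# Columns: TV, Flyer, Store, Competitor, Sales.
data = np.array([
    [31, 5.5, 8, 10, 79], [55, 2.5, 6, 8, 200], [67, 8, 9, 12, 163],
    [50, 3, 16, 7, 201], [38, 3, 15, 8, 146], [71, 2.9, 17, 12, 177],
    [30, 8, 8, 12, 31], [56, 9, 10, 5, 292], [42, 4, 4, 8, 160],
    [73, 6.5, 16, 5, 340], [60, 5.5, 7, 11, 160], [44, 5, 12, 12, 87],
    [50, 6, 6, 6, 237], [39, 5, 4, 10, 107], [55, 3.5, 4, 10, 155],
], dtype=float)

# Model 2: constant, TV, Flyer, Competitor (Store is left out).
X = np.column_stack([np.ones(len(data)), data[:, [0, 1, 3]]])
y = data[:, 4]

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
sse = float(np.sum(resid ** 2))

# Shortcut: the leave-one-out error for district i is e_i / (1 - h_ii),
# where h_ii is the i-th diagonal element of X (X'X)^-1 X'.
hat_diag = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
press = float(np.sum((resid / (1.0 - hat_diag)) ** 2))

# Cross-check: drop each district in turn, refit, and predict it.
press_loo = 0.0
for i in range(len(y)):
    keep = np.arange(len(y)) != i
    b, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    press_loo += float((y[i] - X[i] @ b) ** 2)

print(f"SSE = {sse:.2f}, PRESS = {press:.2f}, PRESS (refit) = {press_loo:.2f}")
```

Because each district is predicted by a model that never saw it, PRESS is always at least as large as the residual sum of squares.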
2. Overall Test
● Hypothesis
H0: 𝛽i = 0, i = 1, 2, 3 (regression model is not suitable for use/there is no
linear relationship between the dependent variable and the independent
variable)
H1: Not all 𝛽i = 0, i = 1, 2, 3 (regression model is suitable for use/ there is a
linear relationship between the dependent variable and the independent
variable)
● Significance level
α = 5%
● Test Statistic
P-value = 0.000
● Critical region
H0 is rejected if the p-value is less than the significance level (α)
● Conclusion
Because the p-value (0.000) is less than the significance level (0.05), then
H0 is rejected. So it can be concluded that the model is suitable for use /
there is a linear relationship between the dependent variable and the
independent variable.
● Interpretation
The null hypothesis H0 states that the regression model is not feasible to
use, and the alternative hypothesis H1 states that it is feasible to use.
At the significance level α = 5%, H0 falls in the critical (rejection)
region if the value of Sig. is less than α. The obtained Sig. (0.000) is
less than the significance level (0.05), so H0 is rejected. It can be
concluded that the regression model is feasible to use, or that there is a
linear relationship between the dependent variable and the independent variables.
3. Partial Test
Independent variable    P-value (Sig.)
TV                      0.000
Flyer                   0.007
Competitor              0.000
● Critical region
H0 is rejected if the p-value is less than the significance level (α)
● Conclusion
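Independent variable    Conclusion
TV                      H0 is rejected (0.000 < 0.05)
Flyer                   H0 is rejected (0.007 < 0.05)
Competitor              H0 is rejected (0.000 < 0.05)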
● Interpretation
The null hypothesis H0 states that the predictor variable (Xi)
is not statistically significant in the regression model, and the
alternative hypothesis H1 states that the predictor variable (Xi) is
statistically significant in the regression model. At the significance
level α = 5%, H0 falls in the critical (rejection) region if the value of
Sig. is less than α. The obtained Sig. values of TV, Flyer, and Competitor
are 0.000, 0.007, and 0.000 respectively. This shows that
H0 is rejected for the TV, Flyer, and Competitor
variables. In other words, all independent variables in Model 2 are
statistically significant in the regression model.
REGRESSION MODEL
Model 1
Sales = 177.392 + 3.533 * TV + 2.184 * Flyer + 0.236 * Store − 22.187 * Competitor
Model 2
Sales = 178.893 + 3.562 * TV + 2.109 * Flyer − 22.222 * Competitor
Interpretation:
● For each additional 1 million rupiahs of promotion costs through TV, Sales
increase by 3.562 million rupiahs, holding the other variables constant.
● For each additional 1 million rupiahs of promotion costs through flyers, Sales
increase by 2.109 million rupiahs, holding the other variables constant.
● For each additional competitor in the district, Sales decrease by
22.222 million rupiahs, holding the other variables constant.
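The interpretation above can be checked by plugging numbers into the Model 2 equation. The sketch below uses the reported coefficients and the first district in the data table (TV = 31, Flyer = 5.5, Competitor = 10, observed Sales = 79); the function name predict_sales is hypothetical.

```python
def predict_sales(tv, flyer, competitor):
    """Model 2 as reported; money amounts are in million rupiahs."""
    return 178.893 + 3.562 * tv + 2.109 * flyer - 22.222 * competitor

# First district in the data table.
predicted = predict_sales(31, 5.5, 10)
residual = 79 - predicted
print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")

# Raising TV promotion by 1 million rupiahs, with the other inputs unchanged,
# changes the prediction by exactly the TV coefficient.
tv_effect = predict_sales(32, 5.5, 10) - predicted
print(f"effect of +1 TV = {tv_effect:.3f}")
```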
Conclusion:
Based on the analysis above, the best regression model obtained by using the model
selection criteria procedure is Model 2. Compared with Model 1, it has a lower AIC
(50.795 against 51.940), SBC (53.627 against 55.480), PRESS (605.12 against 679.00),
and standard error of the estimate (4.86255 against 4.95659), and all of its independent
variables are statistically significant. The manager of the clothing company can therefore
predict sales using the TV, Flyer, and Competitor variables. This model has an R-Squared
value of 0.997, which means that 99.7% of the variation in the dependent variable can be
explained by the independent variables (TV, Flyer, and Competitor), while the rest is
explained by factors outside the model.