
Backward Elimination Method

• Number of observations: 52
• 7 independent variables
• α = 0.1

Step 0:
All seven independent variables are included at the start, and the fitted model is
y = 60.55 + 0.00135x1 + 0.08727x2 + 0.00869x3 - 0.04278x4 + 0.04679x5 + 0.20921x6 + 0.00482x7

ANOVA
p-value = 0.0001

Since the p-value is less than the level of significance (0.0001 < 0.1), we
reject the null hypothesis that there is no relationship between y and the
seven independent variables, and conclude that there is a relationship
between y and at least one of the seven independent variables.

Significance of Variables
In Step 0, the table shows four variables that are insignificant at α = 0.1
(x1, x3, x6 and x7) and three that are significant (x2, x4 and x5). Comparing
the p-values with the level of significance, x7 has the largest p-value of the
four insignificant variables, so x7 is removed from the model.

R² = 0.5684

56.84% of the variation in the number of hours worked per day is explained by
the seven independent variables in the model.

Step 1:
The model was run again with variable x7 removed, giving R² = 0.5609.
56.09% of the variation in the number of hours worked per day is explained by
the six remaining independent variables once x7 is removed.
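As a cross-check that is not part of the original output, the drop in R² between Step 0 and Step 1 can be turned into a partial F statistic for the removed variable. The sketch below uses the standard partial F formula; the function name `partial_f` is hypothetical and the inputs are the values reported in the text (n = 52, seven predictors in the full model).

```python
def partial_f(r2_full, r2_reduced, n, k_full, q=1):
    """Partial F statistic for dropping q predictors from a model
    that has k_full predictors and was fitted on n observations."""
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# R-squared falls from 0.5684 (seven predictors) to 0.5609 (x7 removed).
f_drop_x7 = partial_f(0.5684, 0.5609, n=52, k_full=7)
print(round(f_drop_x7, 3))  # about 0.765
```

When a single variable is dropped, this F value equals the square of that variable's t statistic in the larger model, so a value this small is consistent with x7 being insignificant.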
Model:
y = 61.21705 + 0.00112x1 + 0.08872x2 + 0.01146x3 - 0.04383x4 + 0.04994x5 + 0.21474x6

ANOVA
p-value = 0.0001, which is less than the level of significance. We therefore
reject the null hypothesis and conclude that there is a relationship between y
and at least one of the six remaining variables.
Significance of Variables
In the table, three variables are insignificant at α = 0.1: x1, x3 and x6.

p-value of x1 = 0.2085
p-value of x3 = 0.1883
p-value of x6 = 0.1048
x1 has the largest p-value of the three insignificant variables, so x1 is
removed.
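The removal rule applied at this step can be sketched in a few lines of Python. This is only an illustration of the decision, using the three p-values reported above; the function name `variable_to_remove` is hypothetical.

```python
ALPHA = 0.1  # level of significance used throughout


def variable_to_remove(p_values, alpha=ALPHA):
    """Return the predictor with the largest p-value above alpha,
    or None when every predictor is significant."""
    candidates = {name: p for name, p in p_values.items() if p > alpha}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# p-values of the insignificant variables reported for Step 1.
step1_insignificant = {"x1": 0.2085, "x3": 0.1883, "x6": 0.1048}
print(variable_to_remove(step1_insignificant))  # x1
```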

Step 2:
Variable x1 is removed: R² = 0.5450.
54.50% of the variation in the number of hours worked per day is explained by
the five remaining independent variables once x1 is removed.
Model:
y = 68.27443 + 0.08309x2 + 0.01386x3 - 0.04345x4 + 0.04471x5 + 0.22909x6

The table now shows one variable that is insignificant at α = 0.1: x3, with
p-value = 0.1067. Because this p-value is greater than the level of
significance, x3 is removed.

Step 3:
Variable x3 is removed: R² = 0.5182.
Significance of Variables
In the table, all the remaining independent variables are significant because
all their p-values are less than α (0.1), so no further variable is removed.
We conclude that 51.82% of the variation in y is explained by the
four-predictor model.
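The whole procedure, from the full model down to the point where every remaining p-value is at or below α, can be sketched as a loop. This is a minimal outline under stated assumptions, not the software routine that produced the tables: `fit_pvalues` is a hypothetical callback that refits the regression on the given predictors and returns their p-values.

```python
def backward_elimination(variables, fit_pvalues, alpha=0.1):
    """Repeatedly drop the least significant predictor until all
    remaining p-values are <= alpha.

    fit_pvalues is a hypothetical callback: it takes a list of
    predictor names, refits the model, and returns {name: p_value}.
    Returns (remaining predictors, predictors in order of removal).
    """
    remaining = list(variables)
    removed = []
    while remaining:
        p_values = fit_pvalues(remaining)
        worst = max(remaining, key=lambda name: p_values[name])
        if p_values[worst] <= alpha:
            break  # every remaining predictor is significant
        remaining.remove(worst)
        removed.append(worst)
    return remaining, removed
```

The model is refitted after each removal because the p-values of the remaining variables change once a predictor leaves the model, as happens with x6 between Step 1 and Step 2.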

Final model: y = 70.44910 + 0.10212x2 - 0.03398x4 + 0.05075x5 + 0.25556x6
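To show how the final equation would be used for prediction, the sketch below evaluates it for given predictor values. It takes the retained predictors to be x2, x4, x5 and x6, the four variables that survive the elimination steps; the names `FINAL_COEFFICIENTS` and `predict_y` are hypothetical.

```python
# Coefficients of the final four-predictor model.
FINAL_COEFFICIENTS = {
    "intercept": 70.44910,
    "x2": 0.10212,
    "x4": -0.03398,
    "x5": 0.05075,
    "x6": 0.25556,
}


def predict_y(x2, x4, x5, x6):
    """Predicted y from the final model for the given predictor values."""
    c = FINAL_COEFFICIENTS
    return (c["intercept"] + c["x2"] * x2 + c["x4"] * x4
            + c["x5"] * x5 + c["x6"] * x6)
```

Each coefficient is the change in predicted y for a one-unit increase in that predictor with the other three held fixed; for example, `predict_y(1, 0, 0, 0) - predict_y(0, 0, 0, 0)` is 0.10212 up to floating-point rounding.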

Multicollinearity
Since the tolerance (Tol) is greater than 0.1 for x2, x4, x5 and x6, we
conclude that multicollinearity is low, meaning the coefficients are not
poorly estimated.
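For reference, the tolerance of a predictor is 1 minus the R² obtained when that predictor is regressed on the other predictors, and the variance inflation factor (VIF) is its reciprocal, so Tol > 0.1 corresponds to VIF < 10. The sketch below only encodes these definitions and the 0.1 cutoff used above; the function names are hypothetical and no tolerance values from the table are assumed.

```python
def tolerance(r2_j):
    """Tolerance of predictor j, where r2_j is the R-squared from
    regressing predictor j on all the other predictors."""
    return 1.0 - r2_j


def vif(r2_j):
    """Variance inflation factor: the reciprocal of the tolerance."""
    return 1.0 / tolerance(r2_j)


def low_multicollinearity(tolerances, cutoff=0.1):
    """True when every predictor's tolerance exceeds the cutoff."""
    return all(t > cutoff for t in tolerances.values())
```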
