Linear Regression
                      mpg    cylinders displacement   horsepower       weight acceleration         year       origin
mpg                1.0000      -0.7776      -0.8051      -0.7784      -0.8322       0.4233       0.5805       0.5652
cylinders         -0.7776       1.0000       0.9508       0.8429       0.8975      -0.5046      -0.3456      -0.5689
displacement      -0.8051       0.9508       1.0000       0.8972       0.9329      -0.5438      -0.3698      -0.6145
horsepower        -0.7784       0.8429       0.8972       1.0000       0.8645      -0.6891      -0.4163      -0.4551
weight            -0.8322       0.8975       0.9329       0.8645       1.0000      -0.4168      -0.3091      -0.5850
acceleration       0.4233      -0.5046      -0.5438      -0.6891      -0.4168       1.0000       0.2903       0.2127
year               0.5805      -0.3456      -0.3698      -0.4163      -0.3091       0.2903       1.0000       0.1815
origin             0.5652      -0.5689      -0.6145      -0.4551      -0.5850       0.2127       0.1815       1.0000
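Each entry in a correlation matrix like this one is the sample Pearson correlation between two columns. The following is a minimal pure-Python sketch of that calculation; the helper names `pearson` and `correlation_matrix` and the toy columns are illustrative assumptions, not part of the original analysis.

```python
from math import sqrt

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """columns: dict mapping a variable name to its list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Toy columns (hypothetical values, NOT the data behind the table above).
toy = {
    "u": [1.0, 2.0, 3.0, 4.0],
    "v": [2.0, 4.0, 6.0, 8.0],   # perfectly increasing with u
    "w": [8.0, 6.0, 4.0, 2.0],   # perfectly decreasing with u
}
toy_corr = correlation_matrix(toy)
```

The result is symmetric with ones on the diagonal, which is the same structure the table shows.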
Model               RSE     R squared
mpg ~ cylinders     4.914   0.6047 (Significant)
mpg ~ displacement  4.365   0.6482 (Significant)
mpg ~ horsepower    4.906   0.6059 (Significant)
mpg ~ year          6.363   0.337
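The results above show that mpg has a strong, statistically significant relationship with cylinders, displacement and horsepower: each explains roughly 60 to 65% of the variance in mpg. The relationship with year is noticeably weaker (R squared of 0.337, with the highest RSE of the four models).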
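The two summary statistics used above can be computed directly from a least-squares fit: RSE = sqrt(RSS / (n - 2)) for a model with one predictor, and R squared = 1 - RSS / TSS. The sketch below assumes a single predictor; the function name `simple_fit` and the toy data are hypothetical and are not the data behind the reported values.

```python
from math import sqrt

def simple_fit(x, y):
    """Least-squares fit of y = b0 + b1 * x, returning slope, intercept, RSE and R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    tss = sum((b - mean_y) ** 2 for b in y)
    return {
        "slope": slope,
        "intercept": intercept,
        "rse": sqrt(rss / (n - 2)),   # n - 2 degrees of freedom with one predictor
        "r_squared": 1 - rss / tss,
    }

# Toy data (hypothetical).
fit = simple_fit([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

A lower RSE and a higher R squared both indicate that the fitted line tracks the response more closely, which is how the four models in the table are being compared.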
c)
95% confidence intervals for the coefficient estimates:
β0 (intercept)  [-26.349864469, -8.087004775]
β1              [ -1.129001385,  0.142248747]
β2              [  0.005119788,  0.034671499]
β3              [ -0.044058392,  0.010156103]
β4 (weight)     [ -0.007756074, -0.005192013]
The intervals suggest a high standard error for the intercept estimate (β0), while the weight coefficient (β4) has the lowest standard error in the group. Consequently, the 95% confidence interval for β0 is much wider than the one for β4.
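The link between interval width and standard error can be checked numerically. Assuming these are the usual symmetric intervals of the form estimate ± t × SE (an assumption about how they were produced), the midpoint of each interval is the point estimate and the width is proportional to the standard error. The sketch below uses the interval bounds from part c); the labels `beta_0` to `beta_4` and the helper `summarize` are illustrative names.

```python
# 95% interval bounds copied from part c); beta_0 is the intercept, beta_4 is weight.
intervals = {
    "beta_0": (-26.349864469, -8.087004775),
    "beta_1": (-1.129001385, 0.142248747),
    "beta_2": (0.005119788, 0.034671499),
    "beta_3": (-0.044058392, 0.010156103),
    "beta_4": (-0.007756074, -0.005192013),
}

def summarize(lower, upper):
    """Midpoint, width, and whether zero lies inside the interval."""
    return {
        "estimate": (lower + upper) / 2,      # midpoint of a symmetric interval
        "width": upper - lower,               # equals 2 * t * SE under the assumption above
        "contains_zero": lower <= 0.0 <= upper,
    }

summary = {name: summarize(lo, hi) for name, (lo, hi) in intervals.items()}
widest = max(summary, key=lambda k: summary[k]["width"])
narrowest = min(summary, key=lambda k: summary[k]["width"])

for name, s in summary.items():
    print(f"{name}: estimate={s['estimate']:.6f} width={s['width']:.6f} "
          f"contains_zero={s['contains_zero']}")
```

An interval that contains zero means the corresponding coefficient is not distinguishable from zero at the 5% level, so the `contains_zero` flag is a quick check alongside the width comparison.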
Multiple Linear Regression
Residual standard error: 4.914 on 390 degrees of freedom
Multiple R-squared: 0.6047
The simple linear regressions fit worse than the multiple regression model: for cylinders, horsepower and displacement, the residual standard error is generally higher and the R-squared lower than in the multiple regression. This suggests that the multiple linear regression gives a better description of the data.
d)
The residual plot (upper left illustration) suggests slight non-linearity, but the fit is still generally acceptable. The plot also marks observations 323, 326 and 327 as outliers.
The leverage plot (lower left illustration) shows that observation 14 has unusually high leverage.
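Leverage measures how far an observation's predictor values lie from the bulk of the data. For the special case of a single predictor (an assumption made here only to keep the sketch short), it has the closed form h_i = 1/n + (x_i - mean(x))^2 / Sxx. The function names, the toy data, and the cutoff of twice the average leverage are illustrative assumptions, not taken from the original analysis.

```python
def leverages(x):
    """Leverage h_i of each observation in a one-predictor regression with intercept."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    return [1 / n + (v - mean_x) ** 2 / sxx for v in x]

def high_leverage(x, factor=2.0):
    """Indices whose leverage exceeds `factor` times the average leverage."""
    h = leverages(x)
    cutoff = factor * 2 / len(x)   # average leverage is 2/n with one predictor
    return [i for i, v in enumerate(h) if v > cutoff]

# Toy predictor values (hypothetical): the last point sits far from the rest.
toy_x = [1.0, 2.0, 3.0, 4.0, 20.0]
h = leverages(toy_x)
flagged = high_leverage(toy_x)
```

In this toy example only the last observation is flagged, because its predictor value is far from the others. A high-leverage point is not necessarily an outlier in the response, which is why the residual plot and the leverage plot are read separately.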
e)
In this case we decided to choose the pairwise combination with non-linear displacement in the form
The residuals vs fitted values plot was selected because it is the best plot for evaluating non-linearity. Illustrations 1 and 3 show that the model fit degrades and is not the best fit. Illustration 2 is still somewhat acceptable, except that its end also deviates substantially from the mean, so this is debatable.