
MATH 423/533 - ASSIGNMENT 2

Johra
October 26, 2017

(a) We can calculate the estimate, using the formula for the least squares solution:

X <- cbind(1,x1)
XTX <- t(X)%*%X
beta1 <- solve(XTX,t(X)%*%y)

print(beta1)

## [,1]
## 12129.371
## x1 3307.585

The estimates are:

$\hat{\beta}_0 = 12129.371$
$\hat{\beta}_1 = 3307.585$

These match the estimates reported in the output (12129.371, 3307.585).
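As a cross-check of the normal-equations formula, the same computation can be compared with `lm()`. The sketch below uses simulated data, since the assignment's data set is not reproduced here; the sample size, coefficients and noise level are assumptions chosen only to resemble the scale of the problem.

```r
# Simulated data for illustration only; NOT the assignment's salary data.
set.seed(1)
n  <- 51
x1 <- runif(n, 3, 8)
y  <- 12000 + 3300 * x1 + rnorm(n, sd = 2300)

# Normal-equations solution, as in part (a).
X     <- cbind(1, x1)
beta1 <- solve(t(X) %*% X, t(X) %*% y)

# Reference fit from lm(), which uses a QR decomposition internally.
fit <- lm(y ~ x1)

# The two sets of estimates should agree up to numerical tolerance.
all.equal(as.numeric(beta1), as.numeric(coef(fit)))
```

Solving the linear system with `solve(A, b)` rather than forming the inverse explicitly is generally the more stable choice.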

(b) To compute the value of the omitted entry for the Residual standard error on line 20 we can write the
following code:

fitted.y <- X %*% beta1


residual <- y - fitted.y
SS.Res <- sum(residual^2)
MS.Res <- SS.Res/(length(y)-2)
sigma1 <- sqrt(MS.Res)
print(sigma1)

## [1] 2324.779

The computed value is $\hat{\sigma} = 2324.7789$, which matches the residual standard error in the output (2324.7789).
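For reference, the code in part (b) corresponds to the usual estimator written out below. The symbols $e_i$ and $\hat{y}_i$ are notation introduced for this note; the divisor $n - 2$ reflects the two estimated coefficients.

```latex
\hat{\sigma} = \sqrt{\mathrm{MS}_{\mathrm{Res}}}
             = \sqrt{\frac{\mathrm{SS}_{\mathrm{Res}}}{n - 2}},
\qquad
\mathrm{SS}_{\mathrm{Res}} = \sum_{i=1}^{n} e_i^2
                           = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 .
```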

(c) From the table, the missing standard error is:

$\text{e.s.e.}(\hat{\beta}_0) = 1197.35$

which can be calculated as follows:

(est.cov <- sigma1^2*solve(XTX))

## x1
## 1433648.9 -359160.75
## x1 -359160.8 97159.55

(ese.vals <- sqrt(diag(est.cov)))

## x1
## 1197.3508 311.7043
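The same standard errors are reported by `summary()` on an `lm` fit, which gives a quick way to check the matrix formula. This is a minimal sketch on simulated data (an assumption; the assignment's data are not available here).

```r
# Simulated data for illustration only; NOT the assignment's salary data.
set.seed(1)
n  <- 51
x1 <- runif(n, 3, 8)
y  <- 12000 + 3300 * x1 + rnorm(n, sd = 2300)

X   <- cbind(1, x1)
fit <- lm(y ~ x1)

# Manual computation: sigma^2 * (X'X)^{-1}, then square roots of the diagonal.
sigma.hat  <- summary(fit)$sigma
est.cov    <- sigma.hat^2 * solve(t(X) %*% X)
ese.manual <- sqrt(diag(est.cov))

# Standard errors as reported in the coefficient table.
ese.lm <- summary(fit)$coefficients[, "Std. Error"]

# Should agree up to numerical tolerance.
all.equal(as.numeric(ese.manual), as.numeric(ese.lm))
```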

(d) We can confirm the Multiple R-squared as follows:

SS <- (length(y)-1)*var(y)
SS2 <- SS-SS.Res
(R2 <- SS2/SS)

## [1] 0.6967813

The calculated R-squared agrees with the value in the output.
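In symbols, the code for part (d) evaluates the identity below, where `SS` is the total sum of squares $\mathrm{SS}_T$ and `SS2` is the regression sum of squares $\mathrm{SS}_R$ (the subscripted names are notation assumed for this note).

```latex
\mathrm{SS}_T = (n - 1)\, s_y^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2,
\qquad
R^2 = \frac{\mathrm{SS}_R}{\mathrm{SS}_T}
    = \frac{\mathrm{SS}_T - \mathrm{SS}_{\mathrm{Res}}}{\mathrm{SS}_T}
    = 1 - \frac{\mathrm{SS}_{\mathrm{Res}}}{\mathrm{SS}_T}.
```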

(e) Numerically, we have:

SSR = 608555014.633

which can be calculated as follows:

(S.xy <- sum((y-mean(y))*(x1-mean(x1))))

## [1] 183987.7

and:

(coef(fit.Salary)[2]*S.xy)

## x1
## 608555015
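The product of the slope estimate and `S.xy` is one of several equivalent expressions for the regression sum of squares in simple linear regression. The sketch below checks three of them against each other on simulated data (an assumption, as the original data are not reproduced here).

```r
# Simulated data for illustration only; NOT the assignment's salary data.
set.seed(1)
n  <- 51
x1 <- runif(n, 3, 8)
y  <- 12000 + 3300 * x1 + rnorm(n, sd = 2300)

fit <- lm(y ~ x1)

S.xy   <- sum((y - mean(y)) * (x1 - mean(x1)))
SS.T   <- sum((y - mean(y))^2)
SS.Res <- sum(resid(fit)^2)

# Three equivalent expressions for the regression sum of squares.
SSR.a <- unname(coef(fit)[2]) * S.xy         # slope times S.xy
SSR.b <- SS.T - SS.Res                       # total minus residual
SSR.c <- sum((fitted(fit) - mean(y))^2)      # fitted values about the mean

c(SSR.a, SSR.b, SSR.c)
```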


(f) From the previous computations, the F statistic is confirmed as 112.5995. It can be calculated as follows:

Fstat<-(SS2/(2-1))/(SS.Res/(length(y)-2))
Fstat

## [1] 112.5995
summary(fit.Salary)$fstatistic

## value numdf dendf


## 112.5995 1.0000 49.0000
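The same value can be recovered by hand from the quantities already reported in parts (b) and (e), since the denominator of the F statistic is the residual mean square $\hat{\sigma}^2$. The arithmetic below is a rounded check, not a new result.

```latex
F = \frac{\mathrm{SS}_R / (2 - 1)}{\mathrm{SS}_{\mathrm{Res}} / (n - 2)}
  = \frac{\mathrm{SS}_R}{\hat{\sigma}^2}
  = \frac{608555014.633}{2324.7789^2}
  \approx \frac{6.0856 \times 10^{8}}{5.4046 \times 10^{6}}
  \approx 112.6 .
```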

(g) We have that:

H<-X %*% (solve(XTX) %*%t(X))


One<-cbind(rep(1,length(y)))
H1<-(One %*%t(One))/length(y)
sum(diag((diag(1,length(y))-H1)))

## [1] 50

sum(diag(H-H1))

## [1] 1

Since the trace of I - H1 equals n - 1 = 50, the required sample size is n = 51.
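The traces computed above are the degrees of freedom of the corresponding sums of squares. A minimal sketch on a simulated design (the predictor values are an assumption) shows the three traces side by side; the helper `tr()` is a hypothetical name introduced here.

```r
# Simulated predictor for illustration only; the response is not needed here.
set.seed(1)
n  <- 51
x1 <- runif(n, 3, 8)
X  <- cbind(1, x1)

# Hat matrix of the full model and of the intercept-only model.
H  <- X %*% solve(t(X) %*% X) %*% t(X)
H1 <- matrix(1 / n, nrow = n, ncol = n)

# Hypothetical helper: trace of a square matrix.
tr <- function(M) sum(diag(M))

tr(diag(n) - H1)   # total degrees of freedom, n - 1
tr(H - H1)         # regression degrees of freedom, p - 1 with p = 2
tr(diag(n) - H)    # residual degrees of freedom, n - p
```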

(h) The plot and the fitted line are:

plot(x1,y,pch=19,cex=0.75)
abline(coef(fit.Salary),col='red')

[Figure: scatter plot of y against x1, with the fitted line in red.]

We can analyze the residuals as follows:

plot(x1,residual,pch=19,cex=0.75,ylim=range(-6000,6000))
abline(h=0,lty=2)
[Figure: residuals against x1, with a dashed horizontal line at zero.]

One observation stands out as a possible outlier, but there is no evidence that the assumptions concerning the model and the residual errors are invalid.
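One way to locate such a point numerically, rather than by eye, is to look at standardized residuals via `rstandard()`. The sketch below is an illustration on simulated data with a single planted outlier; the data, the outlier's position (index 10) and its size are all assumptions and say nothing about the assignment's data.

```r
# Simulated data with one planted outlier; illustration only.
set.seed(1)
n  <- 51
x1 <- runif(n, 3, 8)
y  <- 12000 + 3300 * x1 + rnorm(n, sd = 2300)
y[10] <- y[10] + 23000   # hypothetical outlier planted at index 10

fit <- lm(y ~ x1)

# Standardized (internally studentized) residuals.
r.std <- rstandard(fit)

# Index and value of the largest absolute standardized residual.
worst <- which.max(abs(r.std))
worst
r.std[worst]
```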

We can demonstrate the orthogonality of the residuals to the constant vector as follows:

sum(residual)

## [1] -1.67347e-10

t(One) %*% residual

## [,1]
## [1,] -1.67347e-10
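The least squares residuals are orthogonal to every column of the design matrix, not only to the column of ones, and therefore also to the fitted values. A minimal sketch on simulated data (an assumption) checks all of these at once.

```r
# Simulated data for illustration only; NOT the assignment's salary data.
set.seed(1)
n  <- 51
x1 <- runif(n, 3, 8)
y  <- 12000 + 3300 * x1 + rnorm(n, sd = 2300)

X   <- cbind(1, x1)
fit <- lm(y ~ x1)
e   <- resid(fit)

# X'e: one entry per column of X, each zero up to rounding error.
crossprod(X, e)

# Residuals are also orthogonal to the fitted values.
sum(e * fitted(fit))
```

As in the output above, the results are not exactly zero because of floating-point rounding.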

(i) The prediction at x1 = 4.8 is computed from the fitted straight line as:

format(as.numeric(c(1,4.8) %*%coef(fit.Salary)),digits=8)

## [1] "28005.779"

(j) The predicted value Ynew can be written as a linear combination of the elements of the estimator vector, so its variance follows from the covariance matrix of the estimators. We then replace the unknown variance by its estimate.

x1new <- matrix(c(1,4.8),nrow=1)


pred.var<-sigma1^2*x1new %*% solve(XTX) %*% t(x1new)
pred.var

## [,1]
## [1,] 224261.7

sqrt(pred.var)

## [,1]
## [1,] 473.5628

The required estimated standard prediction error is 473.5628.
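The computation in part (j) can be written out as a short derivation. Here $x_{\text{new}} = (1, 4.8)^\top$ is taken as a column vector (the code stores it as the row vector `x1new`), and the standard result $\operatorname{Var}(\hat{\beta}) = \sigma^2 (X^\top X)^{-1}$ is assumed.

```latex
\hat{y}_{\text{new}} = x_{\text{new}}^\top \hat{\beta}
\;\Longrightarrow\;
\operatorname{Var}(\hat{y}_{\text{new}})
  = x_{\text{new}}^\top \operatorname{Var}(\hat{\beta})\, x_{\text{new}}
  = \sigma^2\, x_{\text{new}}^\top (X^\top X)^{-1} x_{\text{new}},

\widehat{\operatorname{Var}}(\hat{y}_{\text{new}})
  = \hat{\sigma}^2\, x_{\text{new}}^\top (X^\top X)^{-1} x_{\text{new}}
  = 224261.7,
\qquad
\sqrt{224261.7} \approx 473.56 .
```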

We can check the results of (i) and (j) using predict:

predict(fit.Salary,newdata = data.frame(x1=4.8),se.fit=TRUE)

## $fit
## 1
## 28005.78
##
## $se.fit
## [1] 473.5628
##
## $df

## [1] 49
##
## $residual.scale
## [1] 2324.779
