Regression

1)

X = rail distance to destination (miles); Y = transportation time (days)

     X     Y      XY        X²    Y²
   210     5    1050     44100    25
   290     7    2030     84100    49
   350     6    2100    122500    36
   480    11    5280    230400   121
   490     8    3920    240100    64
   730    11    8030    532900   121
   780    12    9360    608400   144
   850     8    6800    722500    64
   920    15   13800    846400   225
  1010    12   12120   1020100   144

  6110    95   64490   4451500   993   (totals)

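a-

$$\bar{X} = \frac{\sum X}{n} = \frac{6110}{10} = 611, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{95}{10} = 9.5$$

$$b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{64490 - 10(611)(9.5)}{4451500 - 10(611)^2} = \frac{6445}{718290} = 0.00897$$

$$a = \bar{Y} - b\bar{X} = 9.5 - 0.00897(611) = 4.019$$

$$y_c = a + bX = 4.019 + 0.00897X$$

(The regression line goes through the point $(\bar{X}, \bar{Y})$.)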
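
As a cross-check on the hand calculation, the following minimal Python sketch recomputes a and b directly from the ten data pairs. The function name `fit_line` is an illustrative choice and is not part of the original solution.

```python
# Data from the table: rail distance (miles) and transportation time (days).
X = [210, 290, 350, 480, 490, 730, 780, 850, 920, 1010]
Y = [5, 7, 6, 11, 8, 11, 12, 8, 15, 12]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Same formulas as the hand calculation: slope first, then intercept.
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(X, Y)
print(f"a = {a:.3f}, b = {b:.5f}")
```

Because the code carries b at full precision, a comes out near 4.018 rather than the 4.019 obtained by hand with b rounded to 0.00897.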

The standard error of the estimate:

$$S_e = \sqrt{\frac{\sum (Y - y_c)^2}{n-2}} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{n-2}}$$

$$S_e = \sqrt{\frac{993 - (4.019)(95) - 0.00897(64490)}{8}} = 2.02 \text{ days}$$
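
The two forms of the numerator (the sum of squared residuals and the shortcut in terms of the column totals) can be compared numerically. This is a small sketch, with variable names chosen here for illustration; it keeps a and b at full precision, so intermediate values differ slightly from the rounded hand calculation.

```python
import math

X = [210, 290, 350, 480, 490, 730, 780, 850, 920, 1010]
Y = [5, 7, 6, 11, 8, 11, 12, 8, 15, 12]
n = len(X)

# Least-squares coefficients at full precision (the text rounds them).
x_bar, y_bar = sum(X) / n, sum(Y) / n
sum_xy = sum(x * y for x, y in zip(X, Y))
b = (sum_xy - n * x_bar * y_bar) / (sum(x * x for x in X) - n * x_bar ** 2)
a = y_bar - b * x_bar

# Definition: squared residuals about the fitted line.
sse_definition = sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))

# Shortcut: sum(Y^2) - a*sum(Y) - b*sum(XY).
sse_shortcut = sum(y * y for y in Y) - a * sum(Y) - b * sum_xy

s_e = math.sqrt(sse_definition / (n - 2))
print(f"SSE (definition) = {sse_definition:.4f}")
print(f"SSE (shortcut)   = {sse_shortcut:.4f}")
print(f"S_e = {s_e:.2f} days")
```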
b-
Coefficient of determination:

$$r^2 = \frac{SSR}{SST} = \frac{\sum (y_c - \bar{Y})^2}{\sum (Y - \bar{Y})^2} = \frac{a\sum Y + b\sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2}$$

$$r^2 = \frac{(4.019)(95) + (0.00897)(64490) - (10)(9.5)^2}{993 - 10(9.5)^2} = \frac{57.78}{90.5} = 0.64$$

64% of the total variation in transportation time is explained by rail distance. The
remaining 36% is due to other variables that we have not included in the analysis.

c-
Correlation coefficient: $r = \sqrt{r^2} = \sqrt{0.64} = 0.80$ (r takes the sign of b, which is positive here).
This indicates a strong positive linear relationship between rail distance and transportation time.
Recall that the correlation coefficient varies between +1 and -1.
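
The coefficient of determination and the correlation coefficient can also be computed from deviations about the means. The sketch below is one way to do it; the names `s_xy`, `s_xx`, and `s_yy` are shorthand introduced here, not notation from the solution.

```python
import math

X = [210, 290, 350, 480, 490, 730, 780, 850, 920, 1010]
Y = [5, 7, 6, 11, 8, 11, 12, 8, 15, 12]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

# Sums of squares and cross-products about the means.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
s_xx = sum((x - x_bar) ** 2 for x in X)
s_yy = sum((y - y_bar) ** 2 for y in Y)  # this is SST

b = s_xy / s_xx
ssr = b * s_xy  # explained variation, sum of (y_c - Y_bar)^2
r_squared = ssr / s_yy

# r carries the sign of the slope b.
r = math.copysign(math.sqrt(r_squared), b)
print(f"r^2 = {r_squared:.2f}, r = {r:.2f}")
```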

d-
Suppose a customer is at a distance of 490 miles. Then

$$y_c = a + bX = 4.019 + 0.00897(490) = 8.4 \text{ days}$$

This means that it will take, on average, 8.4 days to deliver to a customer
located at a distance of 490 miles.

e-
Predict the mean transportation time that will be achieved by all shipments over a
distance of 500 miles. In this case X = 500, and the best point estimate of
$\mu_{Y:X}$ is

$$y_c = a + b(500) = 4.019 + 0.00897(500) = 8.5 \text{ days}$$

f-
Estimate the transportation time for a particular shipment of 500 miles:

$$y_c = 4.019 + 0.00897(500) = 8.5 \text{ days}$$
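
The point estimates in parts d, e, and f all come from plugging a distance into the fitted line. A minimal sketch, using the rounded coefficients from the text (the function name `predict_days` is hypothetical):

```python
# Rounded coefficients from the fitted line y_c = 4.019 + 0.00897 X.
A = 4.019
B = 0.00897


def predict_days(distance_miles):
    """Point estimate of transportation time (days) at a given rail distance."""
    return A + B * distance_miles


for distance in (490, 500):
    print(f"{distance} miles -> {predict_days(distance):.1f} days")
```

The point estimate is the same whether the target is the mean time for all shipments at that distance or the time for one particular shipment; only the width of the interval around it differs.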

g-
Find the 95% confidence interval for the mean transportation time for all
customers located at a distance of 500 miles.

$$y_c - t_{(n-2,\,\alpha/2)}\, S_{\bar{Y}} \le \mu_{Y:X} \le y_c + t_{(n-2,\,\alpha/2)}\, S_{\bar{Y}}$$

$(1-\alpha) = 95\%$, so $\alpha/2 = 0.025$ and $t_{(8,\,0.025)} = 2.306$.

$$S_{\bar{Y}} = S_e \sqrt{\frac{1}{n} + \frac{(X_g - \bar{X})^2}{\sum X^2 - n\bar{X}^2}} = 2.02 \sqrt{\frac{1}{10} + \frac{(500-611)^2}{4451500 - 10(611)^2}} = 2.02 \sqrt{\frac{1}{10} + \frac{12321}{718290}} = 0.6913$$

$$8.5 - 2.306 \times 0.6913 \le \mu_{Y:X} \le 8.5 + 2.306 \times 0.6913$$

$$8.5 - 1.594 \le \mu_{Y:X} \le 8.5 + 1.594$$

$$6.906 \le \mu_{Y:X} \le 10.094$$

We therefore conclude, with 95% confidence, that the mean transportation time for all
destinations 500 miles from the plant is somewhere between 6.906 and 10.094 days.
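
The interval for the mean can be reproduced with a few lines of Python. This sketch takes S_e = 2.02, t = 2.306, and y_c = 8.5 as given from the text rather than recomputing them; the function name `mean_confidence_interval` is an assumption made for illustration.

```python
import math

S_E = 2.02      # standard error of the estimate (from the text)
T_CRIT = 2.306  # t(8, 0.025), read from a t table (from the text)
N = 10
X_BAR = 611.0
S_XX = 4451500 - N * X_BAR ** 2  # sum(X^2) - n * X_bar^2 = 718290


def mean_confidence_interval(x_g, y_c):
    """Interval for the mean response mu_{Y:X} at distance x_g."""
    s_mean = S_E * math.sqrt(1 / N + (x_g - X_BAR) ** 2 / S_XX)
    margin = T_CRIT * s_mean
    return y_c - margin, y_c + margin


lower, upper = mean_confidence_interval(500, 8.5)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```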

h-
Predict the transportation time for the next shipment over a distance of 500 miles,
i.e. X = 500 miles.

$$y_c = 4.019 + 0.00897(500) = 8.5 \text{ days}$$

$$y_c - t_{(n-2,\,\alpha/2)}\, S_f \le Y \le y_c + t_{(n-2,\,\alpha/2)}\, S_f, \qquad t_{(8,\,0.025)} = 2.306$$

$$S_f = S_e \sqrt{1 + \frac{1}{n} + \frac{(X_g - \bar{X})^2}{\sum X^2 - n\bar{X}^2}} = 2.02 \sqrt{1 + \frac{1}{10} + \frac{(500-611)^2}{4451500 - 10(611)^2}} = 2.135$$

$$8.5 - 2.306 \times 2.135 \le Y \le 8.5 + 2.306 \times 2.135$$

$$3.577 \le Y \le 13.423$$
Note that this interval is considerably wider than the interval obtained previously
for $\mu_{Y:X}$. This is to be expected because Y is the transportation
time for a particular shipment, not a mean, and the greater width is attributable to
the added variability that would be present even if the true regression line were
available for making the prediction.
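
To see where the extra width comes from, the sketch below computes both standard errors at X = 500 from the values given in the text (S_e = 2.02, t = 2.306, y_c = 8.5). The helper name `standard_errors` is hypothetical.

```python
import math

S_E = 2.02      # standard error of the estimate (from the text)
T_CRIT = 2.306  # t(8, 0.025) (from the text)
N = 10
X_BAR = 611.0
S_XX = 4451500 - N * X_BAR ** 2  # 718290


def standard_errors(x_g):
    """Return (S_mean, S_f): standard errors for the mean and for one shipment."""
    shared = 1 / N + (x_g - X_BAR) ** 2 / S_XX
    s_mean = S_E * math.sqrt(shared)
    s_f = S_E * math.sqrt(1 + shared)  # extra 1: scatter of one shipment
    return s_mean, s_f


s_mean, s_f = standard_errors(500)
pi_lower, pi_upper = 8.5 - T_CRIT * s_f, 8.5 + T_CRIT * s_f
print(f"S_mean = {s_mean:.4f}, S_f = {s_f:.3f}")
print(f"{pi_lower:.3f} <= Y <= {pi_upper:.3f}")
```

In terms of variances, S_f² equals the variance of the estimated mean plus S_e², which is why the prediction interval cannot shrink below about ±t·S_e no matter how much data is collected.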

i-
Construct a 95% confidence interval estimate for B.

$$b - t_{(n-2,\,\alpha/2)}\, S_b \le B \le b + t_{(n-2,\,\alpha/2)}\, S_b$$

$$S_b = \frac{S_e}{\sqrt{\sum X^2 - n\bar{X}^2}} = \frac{2.02}{\sqrt{718290}} = 0.00238$$

$$0.00897 - 2.306 \times 0.00238 \le B \le 0.00897 + 2.306 \times 0.00238$$

$$0.00897 - 0.005496 \le B \le 0.00897 + 0.005496$$

$$0.00347 \le B \le 0.01447$$
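
The slope interval follows the same pattern. This sketch uses the values from the text (b = 0.00897, S_e = 2.02, t = 2.306) and keeps S_b unrounded, which is how the margin of 0.005496 arises.

```python
import math

B_HAT = 0.00897  # estimated slope (from the text)
S_E = 2.02       # standard error of the estimate (from the text)
T_CRIT = 2.306   # t(8, 0.025) (from the text)
S_XX = 4451500 - 10 * 611 ** 2  # sum(X^2) - n * X_bar^2 = 718290

# Standard error of the slope and the interval half-width.
s_b = S_E / math.sqrt(S_XX)
margin = T_CRIT * s_b
slope_lower, slope_upper = B_HAT - margin, B_HAT + margin

print(f"S_b = {s_b:.5f}, margin = {margin:.6f}")
print(f"{slope_lower:.5f} <= B <= {slope_upper:.5f}")
```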
j-
Using =5%, test whether there is a linear relationship between X and Y
Ho: B=0
Ha: B≠0
We use the confidence interval constructed for B, if it contains B=0, then we
accept Ho, if it does not contain B=0, then we reject the null hypothesis. The
interval does not contain 0, so we reject the null hypothesis and accept the
alternative. This means that rail distance does affect transportation time
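OR

$$t_c = \frac{b - 0}{S_b} = \frac{0.00897}{0.00238} = 3.769$$

Since $t_c = 3.769$ exceeds the critical value $t_{(8,\,0.025)} = 2.306$, we again reject H0.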
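
The t-statistic approach can be sketched in a few lines. The values of b, S_b, and the critical t are taken from the text; the function name `slope_t_test` and its `b_null` parameter are illustrative assumptions.

```python
B_HAT = 0.00897  # estimated slope (from the text)
S_B = 0.00238    # standard error of the slope (from the text)
T_CRIT = 2.306   # t(8, 0.025) for a two-sided test at alpha = 5%


def slope_t_test(b, s_b, t_crit, b_null=0.0):
    """Return (t_c, reject) for H0: B = b_null against Ha: B != b_null."""
    t_c = (b - b_null) / s_b
    return t_c, abs(t_c) > t_crit


t_c, reject_h0 = slope_t_test(B_HAT, S_B, T_CRIT)
print(f"t_c = {t_c:.3f}, reject H0: {reject_h0}")
```

The two approaches agree by construction: the two-sided test rejects at level α exactly when the (1 − α) confidence interval for B excludes 0.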
