Regression Chap2
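The simple linear regression model is

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \qquad (2.1)$$

where
$Y_i$ = value of the response variable for observation $i$
$\beta_0, \beta_1$ = parameters
$X_i$ = value of the predictor variable for observation $i$
$\varepsilon_i$ = random error for observation $i$, with $E(\varepsilon_i) = 0$ and $\sigma^2(\varepsilon_i) = \sigma^2$; $\varepsilon_i$ and $\varepsilon_j$ are uncorrelated, so the covariance $\sigma(\varepsilon_i, \varepsilon_j) = 0$ for $i \ne j$

Y therefore has mean $\beta_0 + \beta_1 X_i$ and variance $\sigma^2$, that is, $Y_i \sim NID(\beta_0 + \beta_1 X_i, \sigma^2)$ (Montgomery & Peck, 1992, p. 10-11).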
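$\beta_0$ and $\beta_1$ are the regression coefficients. $\beta_1$ is the slope: the change in the mean of Y when X increases by 1 unit. $\beta_0$ is the intercept: the mean of Y when X = 0, which is interpretable only when the range of X includes 0.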
2.1
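In model (2.1), $\beta_0$ and $\beta_1$ are unknown and are estimated from sample data. The estimated mean of Y is called the fitted value:

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i \qquad (2.2)$$

$\hat{\beta}_0$ and $\hat{\beta}_1$ are obtained by ordinary least squares estimation.

The residual ($e_i$) is the difference between the observed value Y and the fitted value ($\hat{Y}$) at a given X:

$$e_i = Y_i - \hat{Y}_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_i) \qquad (2.3)$$

The residual (e) is the sample counterpart of the error ($\varepsilon$).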
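2.1.1 Estimating $\beta_0$ and $\beta_1$

Let Q be the sum of squared deviations of $Y_i$ from the regression line:

$$Q = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2 \qquad (2.4)$$

The least squares estimates are the values of $\beta_0$ and $\beta_1$ that minimize Q (Abraham & Ledolter, 2006, p. 28-29). Differentiating Q,

$$\frac{\partial Q}{\partial \beta_0} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)$$

$$\frac{\partial Q}{\partial \beta_1} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i) X_i$$

Setting both derivatives equal to 0 and writing $b_0$ and $b_1$ for the minimizing values of $\beta_0$ and $\beta_1$:

$$\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) = 0$$

$$\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i) X_i = 0$$

Rearranging gives the 2 normal equations (2.5) and (2.6):

$$n b_0 + b_1 \sum_{i=1}^{n} X_i = \sum_{i=1}^{n} Y_i \qquad (2.5)$$

$$b_0 \sum_{i=1}^{n} X_i + b_1 \sum_{i=1}^{n} X_i^2 = \sum_{i=1}^{n} X_i Y_i \qquad (2.6)$$

Solving the normal equations,

$$b_0 = \bar{Y} - b_1 \bar{X} \qquad (2.7)$$

$$b_1 = \frac{\sum_{i=1}^{n} Y_i (X_i - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{S_{xy}}{S_{xx}} \qquad (2.8)$$

where

$$S_{xx} = \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n} \qquad (2.9)$$

$$S_{xy} = \sum_{i=1}^{n} Y_i (X_i - \bar{X}) = \sum_{i=1}^{n} X_i Y_i - \frac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n} \qquad (2.10)$$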
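As a rough illustration of (2.7) to (2.10), the sketch below computes $b_0$ and $b_1$ from raw data in plain Python. The function name `fit_line` and the small data set are assumptions made for this illustration; they do not come from the chapter.

```python
def fit_line(x, y):
    """Least squares estimates (b0, b1) computed as S_xy / S_xx, following (2.7)-(2.10)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n                  # (2.9)
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n   # (2.10)
    b1 = s_xy / s_xx                                                  # (2.8)
    b0 = sum_y / n - b1 * sum_x / n                                   # (2.7)
    return b0, b1


# Hypothetical data lying exactly on Y = 1 + 2X, so the fit should recover that line.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [3.0, 5.0, 7.0, 9.0, 11.0]
b0_demo, b1_demo = fit_line(x_demo, y_demo)
print(b0_demo, b1_demo)
```

Working from $S_{xx}$ and $S_{xy}$ rather than solving (2.5) and (2.6) as a linear system keeps the code in step with the hand calculation used in the examples.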
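Example 2.1 Using the data of Example 1.1, find $b_0$ and $b_1$.

$$S_{xx} = \sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n} = 7{,}733.41 - \frac{(299.41)^2}{13} = 837.54$$

$$S_{xy} = \sum_{i=1}^{n} X_i Y_i - \frac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n} = 62{,}978.59 - \frac{(299.41)(2{,}435.50)}{13} = 6{,}885.28$$

From (2.7) and (2.8),

$$b_1 = \frac{S_{xy}}{S_{xx}} = \frac{6{,}885.28}{837.54} = 8.22$$

$$b_0 = \bar{Y} - b_1 \bar{X} = -1.99$$

The fitted equation is $\hat{Y}_i = -1.99 + 8.22 X_i$.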
[Figure: scatter plot of Y against X with the fitted line $\hat{Y} = -1.99 + 8.22X$]
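Interpretation of $b_0$ and $b_1$: $b_1 = 8.22$ estimates $\beta_1$, so when X increases by 1 unit, Y increases by 8.22 units on average. $b_0$ estimates $\beta_0$, the mean of Y when X = 0. Column 3 of Table 2.1 gives the fitted values and column 4 the residuals (e). For example, at X = 10.01 the fitted value is $\hat{y} = -1.99 + 8.22(10.01) = 80.30$.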
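Table 2.1 Observed values, fitted values, and residuals (column total of $X_i$: 299.41)

  i      Y_i        fitted     e_i
  1      77.60      80.30      -2.70
  2      114.90     119.26     -4.36
  3      141.40     145.49     -4.09
  4      190.80     194.73     -3.93
  5      239.90     241.43     -1.53
  6      270.00     287.22     -17.22
  7      280.00     273.41     6.59
  8      100.50     100.36     0.14
  9      180.40     162.75     17.65
  10     145.00     150.92     -5.92
  11     210.00     244.63     -34.63
  12     300.00     261.07     38.93
  13     185.00     173.93     11.07
  Total  2,435.50   2,435.50   0.00

Steps in MINITAB:
1. Choose Stat
2. Choose Regression
3. Choose Regression
4. Fill in Response: and Predictors:
5. Options
6. Storage
7. Results, for ANOVA and ($R^2$)
8. Graphs, then OK

[Figure 2.1: MINITAB dialog for Example 2.1]

Figure 2.2 shows the Session window, which reports $b_0$ and $b_1$ with their p-values, the ANOVA table, and unusual observations (outlier). The Worksheet holds the residuals (Res1) and the fitted values (FITS1).

[Figure 2.2: MINITAB Session window output]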
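The hand calculation in Example 2.1 can be replayed from the sums quoted there. The sketch below is only a check on the arithmetic; the variable names are chosen for this illustration.

```python
# Summary sums quoted in Example 2.1 (n = 13 observations).
n = 13
sum_x, sum_y = 299.41, 2435.50
sum_x2, sum_xy = 7733.41, 62978.59

s_xx = sum_x2 - sum_x ** 2 / n        # (2.9)
s_xy = sum_xy - sum_x * sum_y / n     # (2.10)
b1 = s_xy / s_xx                      # (2.8)
b0 = sum_y / n - b1 * sum_x / n       # (2.7)

print(f"Sxx = {s_xx:.2f}, Sxy = {s_xy:.2f}")
print(f"b1 = {b1:.2f}, b0 = {b0:.2f}")
print(f"fitted value at X = 10.01: {b0 + b1 * 10.01:.2f}")
```

The intercept comes out negative, which agrees with the fitted value 80.30 at X = 10.01 in Table 2.1.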
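2.1.2 Properties of $b_0$ and $b_1$

By the Gauss-Markov theorem, $b_0$ and $b_1$ are unbiased, $E(b_0) = \beta_0$ and $E(b_1) = \beta_1$, and have the smallest variance among all unbiased linear estimators. They are therefore the best linear unbiased estimator (BLUE), where "best" means minimum variance. The variances of $b_0$ and $b_1$ are

$$V(b_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right) \qquad (2.11)$$

$$V(b_1) = \frac{\sigma^2}{S_{xx}} \qquad (2.12)$$

where $\sigma^2$ = variance of the error ($\varepsilon$).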
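2.1.3 Estimating $\sigma^2$

The error variance ($\sigma^2$) is estimated by the mean square error (MSE), which is computed from the error sum of squares (SSE):

$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \qquad (2.13)$$

$$SSE = S_{yy} - b_1 S_{xy} \qquad (2.14)$$

where

$$S_{yy} = \sum_{i=1}^{n} Y_i^2 - \frac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n}, \qquad S_{xy} = \sum_{i=1}^{n} X_i Y_i - \frac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}$$

SSE has n - 2 degrees of freedom: 2 are lost because $\beta_0$ and $\beta_1$ are estimated by $b_0$ and $b_1$.

$$MSE = \hat{\sigma}^2 = \frac{SSE}{n - 2} \qquad (2.15)$$

MSE is an unbiased estimator of $\sigma^2$. $\sqrt{MSE}$ is called the standard error of regression and is expressed in the units of Y.

Example 2.2 From Example 2.1, estimate $\sigma^2$.

$$S_{yy} = \sum_{i=1}^{n} Y_i^2 - \frac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n} = 516{,}468.79 - \frac{(2{,}435.50)^2}{13} = 60{,}187.23$$

$S_{xy} = 6{,}885.28$ (from Example 2.1)

$\sigma^2$ is then estimated by $MSE = (S_{yy} - b_1 S_{xy})/(n - 2)$.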
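The sketch below evaluates (2.14) and (2.15) from the sums quoted in Examples 2.1 and 2.2. It computes MSE twice, once with the slope at full precision and once with the slope rounded to 8.22. This appears to explain why the later examples quote two slightly different values (325.84 and 326.38), though that reading is an inference from the arithmetic and is not stated in the chapter.

```python
n = 13
sum_y, sum_y2 = 2435.50, 516468.79   # sums quoted in Example 2.2
s_xx, s_xy = 837.54, 6885.28         # from Example 2.1

s_yy = sum_y2 - sum_y ** 2 / n

b1_full = s_xy / s_xx                # slope kept at full precision
sse_full = s_yy - b1_full * s_xy     # (2.14)
sse_rounded = s_yy - 8.22 * s_xy     # (2.14) with b1 rounded to 8.22

mse_full = sse_full / (n - 2)        # (2.15)
mse_rounded = sse_rounded / (n - 2)

print(f"Syy = {s_yy:.2f}")
print(f"MSE with full-precision b1: {mse_full:.2f}")
print(f"MSE with b1 = 8.22:         {mse_rounded:.2f}")
```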
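2.1.4 Properties of the fitted regression line

The fitted values ($\hat{Y}$) and residuals (e) have the following properties.

(1) The residuals sum to 0:
$$\sum_{i=1}^{n} (Y_i - \hat{Y}_i) = \sum_{i=1}^{n} e_i = 0$$
This follows from (2.5); see column 4 of Table 2.1, whose total is 0.00.

(2) The sum of the observed Y equals the sum of the fitted values:
$$\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{Y}_i$$
See columns 2 and 3 of Table 2.1.

(3) The sum of squared residuals ($\sum_{i=1}^{n} e_i^2$) is a minimum.

(4) The regression line passes through the point $(\bar{X}, \bar{Y})$.

(5) The residuals weighted by X sum to 0:
$$\sum_{i=1}^{n} X_i e_i = 0$$

(6) The residuals weighted by the fitted values sum to 0:
$$\sum_{i=1}^{n} \hat{Y}_i e_i = 0$$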
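These properties are easy to confirm numerically. The sketch below fits a line to a small hypothetical data set (the numbers are invented for this illustration) and evaluates each quantity that should be 0, plus a crude check of property (3) by nudging the coefficients.

```python
def fit_line(x, y):
    """Least squares estimates (b0, b1) from S_xy / S_xx."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    b1 = s_xy / s_xx
    return sum_y / n - b1 * sum_x / n, b1


# Hypothetical data, not taken from the chapter.
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [3.1, 5.9, 8.2, 9.8, 14.0]

b0, b1 = fit_line(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)

# Each entry should be 0 up to floating-point error.
checks = {
    "(1) sum of e": sum(resid),
    "(2) sum of Y minus sum of fitted": sum(y) - sum(fitted),
    "(4) line at X-bar minus Y-bar": b0 + b1 * x_bar - y_bar,
    "(5) sum of X * e": sum(xi * ei for xi, ei in zip(x, resid)),
    "(6) sum of fitted * e": sum(fi * ei for fi, ei in zip(fitted, resid)),
}
for name, value in checks.items():
    print(f"{name}: {value:.2e}")


def sse(a, b):
    """Sum of squared residuals for an arbitrary line a + b * X."""
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


# (3) Moving either coefficient away from the estimate increases the SSE.
print(sse(b0, b1) < sse(b0 + 0.1, b1), sse(b0, b1) < sse(b0, b1 + 0.1))
```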
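2.1.5 The centered model

Replacing $X_i$ by its deviation from the mean, $(X_i - \bar{X})$, gives the model with centered X. Adding and subtracting $\beta_1 \bar{X}$ in (2.1):

$$Y_i = \beta_0 + \beta_1 (X_i - \bar{X}) + \beta_1 \bar{X} + \varepsilon_i$$
$$\;\;\;\; = (\beta_0 + \beta_1 \bar{X}) + \beta_1 (X_i - \bar{X}) + \varepsilon_i$$
$$\;\;\;\; = \beta_0' + \beta_1 (X_i - \bar{X}) + \varepsilon_i \qquad (2.16)$$

where $\beta_0' = \beta_0 + \beta_1 \bar{X}$.

Because the line passes through $(\bar{X}, \bar{Y})$, (2.7) gives

$$b_0' = b_0 + b_1 \bar{X} = (\bar{Y} - b_1 \bar{X}) + b_1 \bar{X} = \bar{Y} \qquad (2.17)$$

$$b_1 = \frac{\sum_{i=1}^{n} Y_i (X_i - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{S_{xy}}{S_{xx}} \qquad (2.18)$$

By (2.17) the intercept of the centered model is $\bar{Y}$ and the slope ($b_1$) is unchanged, so the fitted equation is

$$\hat{Y}_i = \bar{Y} + b_1 (X_i - \bar{X}) \qquad (2.19)$$

In the original model $b_0$ from (2.7) depends on $b_1$, whereas in the centered model $b_0'$ and $b_1$ are uncorrelated: $COV(b_0', b_1) = 0$.

Example 2.3 From Example 2.1, at x = 10.01,

$$\hat{y} = 187.35 + 8.22\,(10.01 - 23.03) = 80.33$$

which is close to the fitted value in Table 2.1 (80.30).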
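The two forms of the fitted equation are algebraically identical, so the small gap between 80.33 and 80.30 should be rounding in the hand calculation. The sketch below, using the sums from Example 2.1 with variable names chosen for this illustration, evaluates both forms at full precision.

```python
n = 13
sum_x, sum_y = 299.41, 2435.50
s_xx, s_xy = 837.54, 6885.28

x_bar, y_bar = sum_x / n, sum_y / n
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

x0 = 10.01
pred_original = b0 + b1 * x0               # fitted equation in the original form
pred_centered = y_bar + b1 * (x0 - x_bar)  # fitted equation in the centered form (2.19)

print(f"{pred_original:.4f} {pred_centered:.4f}")
```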
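2.2 Inference for $\beta_0$ and $\beta_1$

In Section 2.1, $\beta_0$ and $\beta_1$ were estimated by the point estimates (2.7) and (2.8).

2.2.1 Estimation of $\beta_0$ and $\beta_1$

2.2.1.1 Sampling distributions of $b_0$ and $b_1$

Because Y is normally distributed, $b_0$ and $b_1$ are also normally distributed.

$b_0$ has mean $\beta_0$ and variance $\sigma^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right)$:

$$b_0 \sim N\!\left( \beta_0,\; \sigma^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right) \right)$$

$b_0$ is an unbiased estimator of $\beta_0$. Its variance depends on the sample size (n) and on the X values through $\bar{X}$ and $S_{xx}$.

$b_1$ has mean $\beta_1$ and variance $\frac{\sigma^2}{S_{xx}}$:

$$b_1 \sim N\!\left( \beta_1,\; \frac{\sigma^2}{S_{xx}} \right)$$

The variance of $b_1$ depends on the spread of X: the larger $S_{xx}$, the smaller the variance.

Replacing $\sigma^2$ by MSE gives the estimated variances of $b_0$ and $b_1$:

$$S^2(b_0) = MSE \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right) \qquad (2.20)$$

$$S^2(b_1) = \frac{MSE}{S_{xx}} \qquad (2.21)$$

Their square roots are the standard errors (standard error) of $b_0$ and $b_1$, written $se(b_0)$ and $se(b_1)$.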
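2.2.1.2 Interval estimation of $\beta_0$ and $\beta_1$

The standardized $b_0$ and $b_1$ follow the t distribution with n - 2 degrees of freedom:

$$\frac{b_0 - \beta_0}{\sqrt{MSE \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right)}} \sim t_{n-2}, \qquad \frac{b_1 - \beta_1}{\sqrt{\frac{MSE}{S_{xx}}}} \sim t_{n-2}$$

The 100(1 - $\alpha$)% confidence interval for $\beta_0$ is

$$b_0 - t_{\alpha/2,\,n-2} \sqrt{MSE \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right)} \;\le\; \beta_0 \;\le\; b_0 + t_{\alpha/2,\,n-2} \sqrt{MSE \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right)} \qquad (2.22)$$

The 100(1 - $\alpha$)% confidence interval for $\beta_1$ is

$$b_1 - t_{\alpha/2,\,n-2} \sqrt{\frac{MSE}{S_{xx}}} \;\le\; \beta_1 \;\le\; b_1 + t_{\alpha/2,\,n-2} \sqrt{\frac{MSE}{S_{xx}}} \qquad (2.23)$$

Interpretation: if samples were drawn 100 times at the same values of X, about 95 of the 100 intervals computed at the 95% level would contain $\beta_0$ (and likewise for $\beta_1$).

Example 2.4 From Example 2.1, find the 95% confidence intervals for $\beta_0$ and $\beta_1$.

For $\beta_0$:

$$se(b_0) = \sqrt{MSE \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right)} = \sqrt{325.84 \left( \frac{1}{13} + \frac{23.03^2}{837.54} \right)} = 15.21$$

For $\beta_1$:

$$se(b_1) = \sqrt{\frac{MSE}{S_{xx}}} = \sqrt{\frac{325.84}{837.54}} = 0.624$$

The intervals follow by substituting into (2.22) and (2.23):

$$b_0 \pm t_{\alpha/2,\,n-2} \sqrt{MSE \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right)}, \qquad b_1 \pm t_{\alpha/2,\,n-2} \sqrt{\frac{MSE}{S_{xx}}}$$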
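As a sketch of how Example 2.4 can be finished numerically, the code below evaluates the standard errors and both intervals. It takes the rounded inputs quoted in the examples and uses 2.201 for $t_{0.025,11}$, the table value quoted in Example 2.5; the variable names are assumptions made for this illustration.

```python
import math

n, x_bar = 13, 23.03
mse, s_xx = 325.84, 837.54
b0, b1 = -1.99, 8.22
t_crit = 2.201   # t_{0.025, 11}, taken from the value quoted in Example 2.5

se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / s_xx))   # square root of (2.20)
se_b1 = math.sqrt(mse / s_xx)                          # square root of (2.21)

ci_b0 = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)     # (2.22)
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)     # (2.23)

print(f"se(b0) = {se_b0:.2f}, se(b1) = {se_b1:.3f}")
print(f"95% CI for beta0: {ci_b0[0]:.2f} to {ci_b0[1]:.2f}")
print(f"95% CI for beta1: {ci_b1[0]:.2f} to {ci_b1[1]:.2f}")
```

Because the inputs are rounded to two decimals, the printed limits may differ in the last digit from software output computed at full precision.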
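2.2.2 Hypothesis tests for $\beta_0$ and $\beta_1$

Tests for $\beta_0$ and $\beta_1$ rest on the result in 2.2.1.2 that $(b_0 - \beta_0)/se(b_0)$ and $(b_1 - \beta_1)/se(b_1)$ follow the t distribution with n - 2 degrees of freedom. The hypothesized values of $\beta_0$ and $\beta_1$ are written $\beta_{00}$ and $\beta_{10}$. The steps are as follows.

(1) State the hypotheses.

For $\beta_0$:
H0: $\beta_0 = \beta_{00}$
H1: $\beta_0 \ne \beta_{00}$ (two-sided)
H1: $\beta_0 < \beta_{00}$ (left-tailed)
H1: $\beta_0 > \beta_{00}$ (right-tailed)

For $\beta_1$:
H0: $\beta_1 = \beta_{10}$
H1: $\beta_1 \ne \beta_{10}$ (two-sided)
H1: $\beta_1 < \beta_{10}$ (left-tailed)
H1: $\beta_1 > \beta_{10}$ (right-tailed)

(2) Set the significance level ($\alpha$).

(3) Find the critical value (critical value) for $\beta_0$ or $\beta_1$: $t_{\alpha/2,\,n-2}$ for a two-sided test, $-t_{\alpha,\,n-2}$ for a left-tailed test, and $t_{\alpha,\,n-2}$ for a right-tailed test.

(4) Compute the test statistic.

For $\beta_0$:
$$t = \frac{b_0 - \beta_{00}}{\sqrt{MSE \left( \frac{1}{n} + \frac{\bar{X}^2}{S_{xx}} \right)}} \qquad (2.24)$$

For $\beta_1$:
$$t = \frac{b_1 - \beta_{10}}{\sqrt{\frac{MSE}{S_{xx}}}} \qquad (2.25)$$

(5) Decide.

Two-sided: if $|t| < t_{\alpha/2,\,n-2}$, do not reject (H0); if $|t| > t_{\alpha/2,\,n-2}$, reject (H0).
Left-tailed: if $t > -t_{\alpha,\,n-2}$, do not reject (H0); if $t < -t_{\alpha,\,n-2}$, reject (H0).
Right-tailed: if $t < t_{\alpha,\,n-2}$, do not reject (H0); if $t > t_{\alpha,\,n-2}$, reject (H0).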
1. 0
0
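2. Testing $\beta_1 = 0$ tests whether X and Y are linearly related. If H0 is not rejected, X does not help explain Y through a straight line.
3. If the variance ($\sigma^2$) were known, the statistics in (2.24) and (2.25) would follow the (Z) distribution rather than t, with $\sigma^2$ in place of MSE. In practice $\sigma^2$ is unknown, so t is used instead of Z.
4. Statistical packages report a p-value for each test. The p-value is the observed level of significance.
5. The p-value is compared with the significance level ($\alpha$): reject H0 if p-value < $\alpha$, and do not reject H0 if p-value > $\alpha$.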
0.40 t = 2.561 2.561 40%
= 0.05
6. p-value
7. A p-value printed as 0.000 has been rounded to three decimal places; the p-value is very small, not exactly 0.000.
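Example 2.5 From Example 2.1, test (1) whether $\beta_1 = 9$ and (2) whether X is linearly related to Y, at the 0.05 significance level.

(1) $\beta_{10} = 9$

H0: $\beta_1 = 9$
H1: $\beta_1 \ne 9$

With ($\alpha$) = 0.05 and n = 13, the critical value is $t_{\alpha/2,\,n-2} = t_{0.025,\,11} = 2.201$.

$$t = \frac{b_1 - \beta_{10}}{\sqrt{\frac{MSE}{S_{xx}}}} = \frac{8.22 - 9}{\sqrt{\frac{325.84}{837.54}}} = -1.251$$

Since $|t| = 1.251 < 2.201$, H0 is not rejected.

(2) H0: $\beta_1 = 0$, H1: $\beta_1 \ne 0$

$$t = \frac{b_1}{\sqrt{\frac{MSE}{S_{xx}}}} = \frac{8.22}{\sqrt{\frac{325.84}{837.54}}} = 13.179$$

Since $|t| = 13.179 > 2.201$, H0 is rejected.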
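The two test statistics in Example 2.5 can be checked with a few lines of code. The helper name `t_statistic` is an assumption made for this sketch, and the critical value 2.201 is the table value quoted in the example.

```python
import math

mse, s_xx, b1 = 325.84, 837.54, 8.22
t_crit = 2.201                    # t_{0.025, 11} as quoted in Example 2.5
se_b1 = math.sqrt(mse / s_xx)


def t_statistic(estimate, hypothesized, std_error):
    """t = (estimate - hypothesized value) / standard error, as in (2.25)."""
    return (estimate - hypothesized) / std_error


t_part1 = t_statistic(b1, 9.0, se_b1)   # H0: beta1 = 9
t_part2 = t_statistic(b1, 0.0, se_b1)   # H0: beta1 = 0

# Two-sided decision rule: reject H0 when |t| exceeds the critical value.
reject_part1 = abs(t_part1) > t_crit
reject_part2 = abs(t_part2) > t_crit

print(f"(1) t = {t_part1:.3f}, reject H0: {reject_part1}")
print(f"(2) t = {t_part2:.3f}, reject H0: {reject_part2}")
```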
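2.2.3 Prediction

Three quantities can be estimated at a given value $x = x_0$:
(1) the mean of Y (mean response), $E(y|x_0)$
(2) a single new observation of Y at $x = x_0$
(3) the mean of m new observations of Y at $x = x_0$

In all three cases the point estimate is

$$\hat{y}_0 = b_0 + b_1 x_0$$

The cases differ only in the variance attached to the estimate. A single new observation is case (3) with m = 1.

2.2.3.1 Sampling distributions

For the mean response $E(y|x_0)$, the mean is $\beta_0 + \beta_1 x_0$ and the variance is $\sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)$:

$$\hat{y}_0 \sim N\!\left( \beta_0 + \beta_1 x_0,\; \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right) \right)$$

For a single new observation of Y, the mean is $\beta_0 + \beta_1 x_0$ and the variance is $\sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)$:

$$\hat{y}_0 \sim N\!\left( \beta_0 + \beta_1 x_0,\; \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right) \right)$$

For the mean of m new observations of Y, the mean is $\beta_0 + \beta_1 x_0$ and the variance is $\sigma^2 \left( \frac{1}{m} + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)$:

$$\hat{y}_0 \sim N\!\left( \beta_0 + \beta_1 x_0,\; \sigma^2 \left( \frac{1}{m} + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right) \right)$$

The variance is smallest at $x_0 = \bar{X}$, where $(x_0 - \bar{X})^2 = 0$, and grows as $x_0$ moves away from $\bar{X}$. At $x_0 = 0$, $\hat{y}_0 = b_0 + b_1 x_0$ reduces to $\hat{y}_0 = b_0$.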
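2.2.3.2 Interval estimation

The intervals use the t distribution, with MSE as the estimate of $\sigma^2$.

The 100(1 - $\alpha$)% confidence interval for the mean of Y at $x_0$, $E(Y_0)$, is

$$\hat{y}_0 - t_{\alpha/2,\,n-2} \sqrt{MSE \left( \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)} \;\le\; E(Y_0) \;\le\; \hat{y}_0 + t_{\alpha/2,\,n-2} \sqrt{MSE \left( \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)} \qquad (2.26)$$

The 100(1 - $\alpha$)% interval for 1 new observation of Y at $x_0$, $Y_0$, is

$$\hat{y}_0 - t_{\alpha/2,\,n-2} \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)} \;\le\; Y_0 \;\le\; \hat{y}_0 + t_{\alpha/2,\,n-2} \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)} \qquad (2.27)$$

The 100(1 - $\alpha$)% interval for the mean of m new observations of Y at $x_0$, $\bar{Y}_0$, is

$$\hat{y}_0 - t_{\alpha/2,\,n-2} \sqrt{MSE \left( \frac{1}{m} + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)} \;\le\; \bar{Y}_0 \;\le\; \hat{y}_0 + t_{\alpha/2,\,n-2} \sqrt{MSE \left( \frac{1}{m} + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{S_{xx}} \right)} \qquad (2.28)$$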
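Example 2.6 From Example 2.1, find the 95% intervals at X = 12 for (1) the mean of Y, (2) a single new observation of Y, and (3) the mean of 10 new observations of Y.

At $x_0 = 12$ the point estimate is $\hat{y}_0 = 96.65$.

(1) Mean of Y, from (2.26):

$$96.65 - 2.201 \sqrt{325.84 \left( \frac{1}{13} + \frac{(12 - 23.03)^2}{837.54} \right)} \;\le\; E(Y_0) \;\le\; 96.65 + 2.201 \sqrt{325.84 \left( \frac{1}{13} + \frac{(12 - 23.03)^2}{837.54} \right)}$$

(2) A single new observation, from (2.27):

$$96.65 - 2.201 \sqrt{325.84 \left( 1 + \frac{1}{13} + \frac{(12 - 23.03)^2}{837.54} \right)} \;\le\; Y_0 \;\le\; 96.65 + 2.201 \sqrt{325.84 \left( 1 + \frac{1}{13} + \frac{(12 - 23.03)^2}{837.54} \right)}$$

(3) Mean of 10 new observations, from (2.28):

$$96.65 - 2.201 \sqrt{325.84 \left( \frac{1}{10} + \frac{1}{13} + \frac{(12 - 23.03)^2}{837.54} \right)} \;\le\; \bar{Y}_0 \;\le\; 96.65 + 2.201 \sqrt{325.84 \left( \frac{1}{10} + \frac{1}{13} + \frac{(12 - 23.03)^2}{837.54} \right)}$$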
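The three intervals in Example 2.6 differ only in one term under the square root, which the sketch below makes explicit. The function `half_width` and its `extra` argument are names invented for this illustration; the inputs are the rounded values used in the example.

```python
import math

n, x_bar = 13, 23.03
mse, s_xx = 325.84, 837.54
b0, b1 = -1.99, 8.22
t_crit = 2.201                 # t_{0.025, 11}
x0 = 12.0
y0_hat = b0 + b1 * x0          # point estimate, the same for all three cases


def half_width(extra):
    """Half-width of the interval at x0.

    extra = 0   -> mean response (2.26)
    extra = 1   -> one new observation (2.27)
    extra = 1/m -> mean of m new observations (2.28)
    """
    return t_crit * math.sqrt(mse * (extra + 1 / n + (x0 - x_bar) ** 2 / s_xx))


hw_mean = half_width(0.0)
hw_one = half_width(1.0)
hw_ten = half_width(1 / 10)

for label, hw in [("mean response", hw_mean),
                  ("one new observation", hw_one),
                  ("mean of 10 new observations", hw_ten)]:
    print(f"{label}: {y0_hat - hw:.2f} to {y0_hat + hw:.2f}")
```

The interval for the mean response is the narrowest and the interval for a single new observation the widest, with the mean of 10 new observations in between.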
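2.3 Analysis of variance

2.3.1 Partitioning the total sum of squares

The deviation of $Y_i$ from the mean, $(Y_i - \bar{Y})$, splits into a part explained by X, $(\hat{Y}_i - \bar{Y})$, and a part not explained by X, $(Y_i - \hat{Y}_i)$:

$$Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$$

Squaring and summing over all observations,

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 + 2 \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) \qquad (2.29)$$

The cross-product term is

$$2 \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) = 2 \sum_{i=1}^{n} \hat{Y}_i e_i - 2 \bar{Y} \sum_{i=1}^{n} e_i = 0$$

because, by properties 1 and 6 of the fitted line, both sums equal 0. Therefore

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \qquad (2.30)$$

SST = SSR + SSE

SST (total sum of squares) equals $S_{yy}$. SST = 0 when every $Y_i$ equals $\bar{Y}$; the more the $Y_i$ vary, the larger SST.
SSR (regression sum of squares) is the part of SST explained by the regression.
SSE (error sum of squares) is the part not explained by X. SSE = 0 when every observation falls on the fitted line.

From (2.30) and (2.14), SSR can be computed as

SSR = $b_1 S_{xy}$ (2.31)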
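2.3.2 Degrees of freedom

SST has n - 1 degrees of freedom; 1 is lost through the constraint $\sum_{i=1}^{n} (Y_i - \bar{Y}) = 0$.
SSE has n - 2 degrees of freedom; 2 are lost in estimating $\beta_0$ and $\beta_1$.
SSR has 1 degree of freedom. The degrees of freedom of SSR and SSE add up to those of SST:

n - 1 = 1 + (n - 2)

2.3.3 Mean squares

A mean square (mean square) is a sum of squares divided by its degrees of freedom:

$$MSR = \frac{SSR}{1} = SSR \qquad (2.32)$$

$$MSE = \frac{SSE}{n - 2} \qquad (2.33)$$

The expected values of MSR and MSE are

$$E(MSR) = \sigma^2 + \beta_1^2 S_{xx}$$
$$E(MSE) = \sigma^2$$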
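2.3.4 F test

The analysis of variance tests H0: $\beta_1 = 0$ against H1: $\beta_1 \ne 0$, the same hypotheses as the t test in 2.2.2, using the F statistic

$$F = \frac{MSR}{MSE} \qquad (2.34)$$

Under H0, F follows the F distribution with 1 and n - 2 degrees of freedom. A large F indicates that X helps explain Y, so H0 is rejected when F exceeds the critical value of F.

For simple linear regression the F test and the two-sided t test of $\beta_1 = 0$ are equivalent, since $F = t^2$. The t test can also be used for one-sided alternatives, which the F test cannot (Weisberg, 2005, p. 31).

Source of variation   SS               df      MS                  F
Regression            SSR = b1 Sxy     1       MSR = SSR           F = MSR/MSE
Error                 SSE              n - 2   MSE = SSE/(n - 2)
Total                 SST = Syy        n - 1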
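Example 2.7 From Example 2.1, carry out the F test at the 95% confidence level.

H0: $\beta_1 = 0$
H1: $\beta_1 \ne 0$

From Example 2.1, SST = 60,187.23 ($S_{yy}$ in Example 2.1).

SSR = $b_1 S_{xy}$ = 8.22 × 6,885.28 = 56,597.00

SSE = SST - SSR = 3,590.23

$$MSE = \frac{SSE}{n - 2} = \frac{3{,}590.23}{11} = 326.38$$

$$F = \frac{MSR}{MSE} = \frac{56{,}597.00}{326.38} = 173.41$$

Source of variation   SS          df   MS          F
Regression            56,597.00   1    56,597.00   173.41
Error                 3,590.23    11   326.38
Total                 60,187.23   12

In the MINITAB output for Example 2.1, the test can be read from the p-value: p-value = 0.000 < $\alpha$ (0.05), so H0 is rejected.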
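The ANOVA quantities of Example 2.7 can be rebuilt from SST, $b_1$ and $S_{xy}$. The sketch below follows the hand calculation, including the slope rounded to 8.22; the row layout is an assumption made for printing only.

```python
n = 13
sst = 60187.23                 # S_yy from Example 2.1
b1, s_xy = 8.22, 6885.28

ssr = b1 * s_xy                # (2.31)
sse = sst - ssr                # from SST = SSR + SSE
msr = ssr / 1                  # (2.32)
mse = sse / (n - 2)            # (2.33)
f_stat = msr / mse             # (2.34)

rows = [
    ("Regression", ssr, 1, msr),
    ("Error", sse, n - 2, mse),
    ("Total", sst, n - 1, None),
]
for source, ss, df, ms in rows:
    ms_text = "" if ms is None else f"{ms:,.2f}"
    print(f"{source:<12}{ss:>12,.2f}{df:>5}{ms_text:>12}")
print(f"F = {f_stat:.2f}")
```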
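2.4 Coefficient of determination

SST measures the variation in Y when X is ignored, and SSE measures the variation in Y that remains after X is used. The proportion of the variation in Y that is explained by X is the coefficient of determination (coefficient of determination), $R^2$:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} \qquad (2.35)$$

$R^2$ lies between 0 and 1. $R^2$ = 0 when X explains none of the variation in Y. $R^2$ = 1 when every Y falls on the fitted line, so that X explains all of the variation in Y.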
R2 ?
R2 0.60
0.70
R2
For example, $R^2$ = 0.80 means that X explains 80% of the variation in Y.
R2
(1) R2 Y
2.3 1 7
x 7
8
x 7
[Figure 2.3]
(2) R2 0
2.4
[Figure 2.4]
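(3) $R^2$ depends on the spread of X. Its size is governed by the ratio

$$\frac{b_1^2 S_{xx}}{b_1^2 S_{xx} + \sigma^2}$$

so a larger $S_{xx}$ gives a larger $R^2$ even when the relationship between X and Y is unchanged.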
(4) R2
R2
R2
(5) R2
R2
X
(6) R2 R2
(Cohen et al, 2003, p. 83)
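2.5 Coefficient of correlation

The coefficient of correlation (coefficient of correlation), r, is

$$r = \pm \sqrt{R^2} \qquad (2.36)$$

r lies between -1 and 1. A value near -1 indicates a strong negative linear relationship, a value near 1 a strong positive linear relationship, and a value near 0 little or no linear relationship.

The sign of r is not given by $R^2$; it is the sign of $b_1$. $b_1$ and r are related by

$$b_1 = r \sqrt{\frac{S_{yy}}{S_{xx}}} \qquad (2.37)$$

r can also be computed directly:

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \qquad (2.38)$$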
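Example 2.8 From Example 2.1, find $R^2$ and r.

$$R^2 = \frac{SSR}{SST} = \frac{56{,}597.00}{60{,}187.23} = 0.94$$

X explains 94% of the variation in Y; the remaining 6% is unexplained.

$$r = \sqrt{R^2} = \sqrt{0.94} = 0.97$$

Because $b_1$ in Example 2.1 is positive, r is positive.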
MINITAB 2.1 6 R-Sq
MINITAB R-Sq(adj) 5
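A short sketch of the calculation in Example 2.8: $R^2$ is the ratio SSR/SST, and r takes its sign from $b_1$. Using `math.copysign` to attach the sign is a choice made for this illustration.

```python
import math

ssr, sst, b1 = 56597.00, 60187.23, 8.22

r_squared = ssr / sst                          # (2.35)
r = math.copysign(math.sqrt(r_squared), b1)    # (2.36), with the sign of b1

print(f"R^2 = {r_squared:.2f}, r = {r:.2f}")
```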
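2.6 Regression through the origin

In some problems the line must pass through the point (x,y) = (0,0), that is, Y is 0 when X is 0. The model is then

$$Y_i = \beta_1 X_i + \varepsilon_i \qquad (2.39)$$

Least squares estimation of $\beta_1$ gives a single normal equation:

$$b_1 \sum_{i=1}^{n} X_i^2 = \sum_{i=1}^{n} Y_i X_i \qquad (2.40)$$

$$b_1 = \frac{\sum_{i=1}^{n} Y_i X_i}{\sum_{i=1}^{n} X_i^2} \qquad (2.41)$$

$b_1$ is an unbiased estimator of $\beta_1$ with variance

$$V(b_1) = \frac{\sigma^2}{\sum_{i=1}^{n} X_i^2} \qquad (2.42)$$

The fitted equation is $\hat{Y}_i = b_1 X_i$, and

$$\hat{\sigma}^2 = MSE = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 1} = \frac{\sum_{i=1}^{n} Y_i^2 - b_1 \sum_{i=1}^{n} Y_i X_i}{n - 1} \qquad (2.43)$$

MSE has n - 1 degrees of freedom because only 1 parameter is estimated. Inference about $\beta_1$ therefore uses the t distribution with n - 1 degrees of freedom. To test H0: $\beta_1 = 0$ against H1: $\beta_1 \ne 0$,

$$t = \frac{b_1}{\sqrt{\dfrac{MSE}{\sum_{i=1}^{n} X_i^2}}} \qquad (2.44)$$

is compared with $t_{\alpha/2,\,n-1}$.

For this model,

$$R^2 = \frac{\sum_{i=1}^{n} \hat{Y}_i^2}{\sum_{i=1}^{n} Y_i^2} \qquad (2.45)$$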
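$$b_1 = \frac{\sum_{i=1}^{n} Y_i X_i}{\sum_{i=1}^{n} X_i^2} = \frac{1{,}487.56}{636.26} = 2.34$$

The fitted equation is $\hat{Y}_i = 2.34 X_i$.

$$R^2 = \frac{\sum_{i=1}^{n} \hat{Y}_i^2}{\sum_{i=1}^{n} Y_i^2} = \frac{3{,}477.88}{3{,}485.01} = 0.998$$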
Yi 0.878 2.26X i
1
1
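The no-intercept estimates in (2.41), (2.43) and (2.45) can be sketched as a small function. The name `fit_through_origin` and the three-point data set are assumptions made for this illustration; the last two lines simply redo the arithmetic with the sums quoted in the example above.

```python
def fit_through_origin(x, y):
    """Slope, MSE and R^2 for the model Y = beta1 * X, following (2.41), (2.43), (2.45)."""
    n = len(x)
    sum_x2 = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_y2 = sum(yi * yi for yi in y)
    b1 = sum_xy / sum_x2                                  # (2.41)
    mse = (sum_y2 - b1 * sum_xy) / (n - 1)                # (2.43), n - 1 degrees of freedom
    r_squared = sum((b1 * xi) ** 2 for xi in x) / sum_y2  # (2.45)
    return b1, mse, r_squared


# Hypothetical data lying exactly on Y = 3X.
b1_demo, mse_demo, r2_demo = fit_through_origin([1.0, 2.0, 4.0], [3.0, 6.0, 12.0])
print(b1_demo, mse_demo, r2_demo)

# Arithmetic check using the sums quoted in the chapter's example.
b1_example = 1487.56 / 636.26
r2_example = 3477.88 / 3485.01
print(f"b1 = {b1_example:.2f}, R^2 = {r2_example:.3f}")
```

Note that $R^2$ in (2.45) is computed about the origin rather than about $\bar{Y}$, so it is not directly comparable with the $R^2$ of a model that includes an intercept.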
0 1
t
2.1
X:   21   13   20   25   19   24   16   13
Y:   13    6   12    7   19   10   24   19
2.2 2.1 t F 0.10
2.3 2.1
2.4
1    21    6
2    17    6
3    10    4
4     5    2
5    11    3
6    12    4
2.5 2.4
2.4
2.6 2.4
2.7 2.4
2.8 MINITAB 2.4
2.9
56
54
52
58
52
62
66
63
2.10 2.9
2.11 2.9
0.05 F
2.12 X Y
0.05
The regression equation is
y = - 92.8 + 0.273 x

Predictor    Coef       SE Coef    T        P
Constant     -92.81     40.11      -2.31    0.043
x            0.27275    0.06977    3.91     0.003

S = 10.83    R-Sq = 60.4%    R-Sq(adj) = 56.5%

Analysis of Variance
Source            DF    SS        MS        F        P
Regression        1     1791.0    1791.0    15.28    0.003
Residual Error    10    1172.0    117.2
Total             11    2963.0
2.13
63    61    45    53    60    42    58    62    47    56
161   217   130   148   162   144   152   140   135   122
2.14 2.13
2.13