03 - Linear Models - Part A
Quality Engineering
Bianca Maria Colosimo, PhD
biancamaria.colosimo@polimi.it
Assumptions                        Hypothesis test (to check the assumptions)    Remedy in case of violation
“independence” (random pattern)    Runs test; Bartlett's test; LBQ's test        Gapping; Batching; (Linear) regression; Time series (ARIMA)
Normal distribution                Normality test                                Transform data
Batching
The series is replaced by the means of consecutive, non-overlapping batches of size b:
$$\bar{x}_j = \frac{1}{b} \sum_{i=1}^{b} x_{(j-1)b+i}, \qquad j = 1, 2, \ldots$$
In control charting: Unweighted Batch Mean (UBM) control chart.
Ex:
1000 data from a chemical process
b = 10
Quality Engineering- BM Colosimo
Batching
Ex:
1. Initialize b=1
2. Compute the autocorrelation coefficient at the first lag
3. If the coefficient is smaller than 0.1 go to step 5
4. Set b=2*b, go to step 2
5. end
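The five steps above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the helper names (`lag1_autocorrelation`, `batch_means`, `choose_batch_size`) are hypothetical, the coefficient is computed on the batch means, the comparison uses the raw coefficient as written in step 3, and the `min_batches` guard is not part of the original steps (it only stops the doubling when too few batches would remain). The AR(1) series is simulated purely for illustration.

```python
import random


def lag1_autocorrelation(x):
    """Sample autocorrelation coefficient at the first lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    if denom == 0:
        return 0.0
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    return num / denom


def batch_means(x, b):
    """Means of consecutive, non-overlapping batches of size b."""
    n_batches = len(x) // b
    return [sum(x[j * b:(j + 1) * b]) / b for j in range(n_batches)]


def choose_batch_size(x, threshold=0.1, min_batches=4):
    b = 1                                              # step 1
    while True:
        r1 = lag1_autocorrelation(batch_means(x, b))   # step 2
        # step 3, plus the assumed guard on the number of remaining batches
        if r1 < threshold or len(x) // (2 * b) < min_batches:
            return b, r1                               # step 5
        b *= 2                                         # step 4


# Illustrative data only: a simulated AR(1) series, not the chemical process data.
random.seed(0)
series = [0.0]
for _ in range(999):
    series.append(0.7 * series[-1] + random.gauss(0.0, 1.0))

b, r1 = choose_batch_size(series)
print("selected batch size:", b, "lag-1 autocorrelation:", round(r1, 3))
```

Because b is doubled at every iteration, the selected batch size is always a power of two.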
These approaches aim to sidestep the autocorrelation issue rather than to model it.
How can we identify the appropriate model in case of nonrandom data?
-Regression
-ARIMA
We can look for the model/coefficients that minimize the Sum of Squared Errors
(Minimum Mean Squared Error – MSE – approach):
$$SSE = \sum_{t=1}^{n} (y_t - \mu_t)^2 = \sum_{t=1}^{n} \varepsilon_t^2$$
where $y_t$ is the observed data and $\mu_t$ is deterministic (assumed known).
ex: $\mu_t = \text{const}$, $\mu_t = at + b$, $\mu_t = at^2 + bt + c$, $\mu_t = a \ln t$ (with $\sigma_\varepsilon^2 = \text{const}$)
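For the sake of simplicity we can assume models that are linear with respect
to the unknown parameters: LINEAR REGRESSION
$$SSE = \sum_{t=1}^{n} (y_t - \mu_t)^2 = \sum_{t=1}^{n} \varepsilon_t^2$$
where $y_t$ is the observed data and $\mu_t$ is the unknown mean (deterministic).
ex: $\hat{\mu}_t = \hat{c}$ (constant), $\hat{\mu}_t = \hat{a}t + \hat{b}$, $\hat{\mu}_t = \hat{a}t^2 + \hat{b}t + \hat{c}$, $\hat{\mu}_t = \hat{a} \ln t$
$Y_t = \mu_t + \varepsilon_t \;\rightarrow\; \hat{y}_t = \hat{\mu}_t$: estimate of the deterministic part of the model
Assumed model: $Y_t = \mu + \varepsilon_t = \beta_0 + \varepsilon_t$
Find $b_0 = \hat{\beta}_0$ that minimizes SSE: $SSE = \sum_{t=1}^{n} (y_t - \mu_t)^2 = \sum_{t=1}^{n} \varepsilon_t^2$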
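Regression
$$SSE = \sum_{t=1}^{n} (y_t - \beta_0)^2$$
where $y_t$ is observed and $\beta_0$ is the assumed model.
$$\frac{\partial SSE}{\partial \beta_0} = -2 \sum_{t=1}^{n} (y_t - b_0) = 0 \;\Rightarrow\; \sum_{t=1}^{n} y_t - n b_0 = 0 \;\Rightarrow\; b_0 = \frac{1}{n} \sum_{t=1}^{n} y_t = \bar{y}$$
$$SSE = S(b_0) = \sum_{t=1}^{n} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{n} (y_t - b_0)^2$$
Difference: $S(\beta_0)$ is the function to be minimized, while $SSE = \min_{\beta_0} S(\beta_0)$ is its minimum, attained at $\hat{y} = \bar{y}$.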
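The result b0 = ȳ can be checked numerically. The snippet below is a small sketch on made-up data (the values in `y` and the helper name `sse` are illustrative assumptions): it evaluates S(b0) at the sample mean and at a few nearby values.

```python
def sse(y, b0):
    """S(b0): sum of squared errors for the constant model y_t = b0 + e_t."""
    return sum((v - b0) ** 2 for v in y)


y = [2.0, 4.0, 9.0]          # illustrative data only
y_bar = sum(y) / len(y)      # least-squares estimate b0

sse_min = sse(y, y_bar)
for shift in (-1.0, -0.1, 0.1, 1.0):
    # any other value of b0 gives a larger sum of squares
    print(y_bar + shift, sse(y, y_bar + shift), ">", sse_min)
```

Any shift d away from the mean increases the sum of squares by n·d², which is why the minimum is reached exactly at ȳ.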
TREND
50 sequentially produced items: elongation of a spring subject to a
force of 20 g
(Deming 1986: deming.dat)
 t  elongation    t  elongation    t  elongation    t  elongation    t  elongation
 1  0.001454     11  0.001334     21  0.001207     31  0.001108     41  0.000854
 2  0.001468     12  0.001073     22  0.001214     32  0.000995     42  0.001101
 3  0.00132      13  0.001207     23  0.001087     33  0.000988     43  0.00084
 4  0.001313     14  0.001101     24  0.001221     34  0.000981     44  0.000854
 5  0.001207     15  0.001101     25  0.001334     35  0.000974     45  0.00072
 6  0.001334     16  0.000974     26  0.001094     36  0.001094     46  0.000974
 7  0.001221     17  0.0012       27  0.000974     37  0.000981     47  0.000854
 8  0.001454     18  0.0012       28  0.001115     38  0.000861     48  0.000974
 9  0.0012       19  0.001073     29  0.000974     39  0.001087     49  0.000847
10  0.001214     20  0.0012       30  0.001101     40  0.000981     50  0.000621
[Figure: plots of the elongation data]
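Trend
Trend? “True” model: $Y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$   (1)
– In the example: $x_t = t$ (regressor or predictor)
– Observations: $y_t$, $t = 1, \ldots, n$
Setting $\partial SSE / \partial \beta_0 = 0$ gives $b_0 = \bar{y} - b_1 \bar{x}$. Substituting into $\partial SSE / \partial \beta_1 = 0$, i.e. $\sum_{t=1}^{n} x_t (y_t - b_0 - b_1 x_t) = 0$:
$$\sum_{t=1}^{n} (x_t y_t - x_t \bar{y}) + b_1 \sum_{t=1}^{n} (x_t \bar{x} - x_t^2) = 0$$
Since $\sum_{t=1}^{n} (x_t - \bar{x}) = \sum_{t=1}^{n} (y_t - \bar{y}) = 0$, two null terms can be added:
$$\sum_{t=1}^{n} (x_t y_t - x_t \bar{y}) - \bar{x} \sum_{t=1}^{n} (y_t - \bar{y}) + b_1 \sum_{t=1}^{n} (x_t \bar{x} - x_t^2) + b_1 \bar{x} \sum_{t=1}^{n} (x_t - \bar{x}) = 0$$
$$\sum_{t=1}^{n} (x_t y_t - x_t \bar{y} - \bar{x} y_t + \bar{x} \bar{y}) + b_1 \sum_{t=1}^{n} (x_t \bar{x} - x_t^2 + \bar{x} x_t - \bar{x}^2) = 0$$
$$\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y}) - b_1 \sum_{t=1}^{n} (x_t - \bar{x})^2 = 0$$
$$\Rightarrow\; b_1 = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}$$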
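Therefore $Y_t = \beta_0 + \beta_1 x_t + \varepsilon_t \;\rightarrow\; \hat{y}_t = b_0 + b_1 x_t$, with
$$b_1 = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}$$
Define
$$S_{xx} = \sum_{t=1}^{n} (x_t - \bar{x})^2, \qquad S_{xy} = \sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y}) \;\Rightarrow\; b_1 = \frac{S_{xy}}{S_{xx}}$$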
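The estimators b1 = Sxy/Sxx and b0 = ȳ − b1 x̄ translate directly into code. The following is a minimal sketch: the function name `fit_line` and the five data points are illustrative assumptions, not the spring data.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y_t = b0 + b1 * x_t + e_t."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Illustrative data only
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]

b0, b1 = fit_line(x, y)
print("b0 =", round(b0, 4), "b1 =", round(b1, 4))
```

For these points Sxx = 10 and Sxy = 19.6, so the slope is 1.96 and the intercept is 6.1 − 1.96·3 = 0.22.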
Simple linear regression
Linear regression: first order model
It is function of one single regressor (linear function)
• Pay attention:
$\beta_0, \beta_1$: deterministic values (unknown)
$b_0 = \hat{\beta}_0$, $b_1 = \hat{\beta}_1$: random variables (they are functions of the observed data)
• If the true model is equal to the assumed one:
  – Unbiased estimators: $E(b_0) = \beta_0$, $E(b_1) = \beta_1$
  – Min variance estimators (among all the unbiased estimators)
• Pay attention to common errors in regression:
  – Strong relationship among variables does not mean causal relationship
  – The identified relationship is valid only in the explored interval of x: pay attention to extrapolation
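$$\begin{bmatrix} y_1 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{i1} & \cdots & x_{ik} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_i \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
$$\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon}$$
Dimensions: $\mathbf{y}$ is $n \times 1$, $\mathbf{X}$ is $n \times (k+1)$, $\boldsymbol{\beta}$ is $(k+1) \times 1$, $\boldsymbol{\varepsilon}$ is $n \times 1$ (k regressors plus the intercept column).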
$\varepsilon_t \sim NID(0, \sigma^2) \;\Rightarrow\; y_t = \beta_0 + \beta_1 x_t + \varepsilon_t \sim NID(\beta_0 + \beta_1 x_t, \sigma^2)$
$b_0$, $b_1$ are linear combinations of the data $y_t$, hence normally distributed.
We can show that:
$$E(b_1) = \beta_1, \qquad V(b_1) = \frac{\sigma^2}{S_{xx}}$$
$$E(b_0) = \beta_0, \qquad V(b_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)$$
Please note:
$$Cov(\hat{\beta}_0, \hat{\beta}_1) = -\frac{\bar{x} \sigma^2}{S_{xx}}$$
Estimated standard deviation (estimated standard error):
$$s_{b_1} = se(b_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}}$$
$$s_{b_0} = se(b_0) = \sqrt{\hat{\sigma}^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)}$$
Predictor   Coef          SE Coef      T        P
Constant     0.00136522   0.00002914    46.84   0.000
t           -0.00001066   0.00000099   -10.72   0.000
$$\hat{\sigma}^2 = s^2 = \frac{SS_E}{n - K} = MS_E \quad \text{(Mean Squared Error, MSE)}$$
with K = number of regressors + 1
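The estimated variance and the two standard errors can be computed in a few lines. This is a minimal sketch for simple regression (K = 2); the function name `regression_summary` and the data are illustrative assumptions.

```python
import math


def regression_summary(x, y):
    """Coefficients, MSE and standard errors for y_t = b0 + b1 * x_t + e_t."""
    n = len(x)
    k_params = 2                      # K = number of regressors + 1
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - k_params)        # estimate of sigma^2
    se_b1 = math.sqrt(mse / sxx)
    se_b0 = math.sqrt(mse * (1.0 / n + x_bar ** 2 / sxx))
    return {"b0": b0, "b1": b1, "mse": mse, "se_b0": se_b0, "se_b1": se_b1, "sxx": sxx}


# Illustrative data only
res = regression_summary([1, 2, 3, 4, 5], [2.2, 4.1, 6.2, 7.9, 10.1])
for name in ("b0", "b1", "mse", "se_b0", "se_b1"):
    print(name, "=", round(res[name], 5))
```

The ratios b0/se_b0 and b1/se_b1 are the T values that a regression table reports next to each coefficient.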
$$t_0 = \frac{b_i - \beta_i}{s_{b_i}} \sim t_{n-K}$$
(Student's t with n − K degrees of freedom, dof)
!"#"$%&'()*+',&-'."&'/0*1"02&'/0
Student's t distribution with 48 DF
x P( X <= x )
-10.7700 0.0000
p-value=2(0.0000)
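$$t_0 = \frac{b_i - \beta_i}{s_{b_i}} \sim t_{n-K}$$
$$b_i - t_{\alpha/2,\,n-K}\, s_{b_i} \le \beta_i \le b_i + t_{\alpha/2,\,n-K}\, s_{b_i}$$
for $\beta_1$:
$\alpha = 5\% \;\Rightarrow\; t_{0.025,48} = 2.0106$
$-1.066 \cdot 10^{-5} - 2.0106\,(0.099 \cdot 10^{-5}) \le \beta_1 \le -1.066 \cdot 10^{-5} + 2.0106\,(0.099 \cdot 10^{-5})$
$-1.265 \cdot 10^{-5} \le \beta_1 \le -0.867 \cdot 10^{-5}$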
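The arithmetic of the test statistic and of the interval for β1 can be reproduced directly from the regression table. In this sketch the critical value 2.0106 is taken from the text rather than computed, and the variable names are illustrative.

```python
# Values taken from the regression table and from the text
b1 = -1.066e-5        # estimated slope
se_b1 = 0.099e-5      # estimated standard error of the slope
t_crit = 2.0106       # t_{0.025, 48}

# Test statistic for H0: beta1 = 0
t0 = (b1 - 0.0) / se_b1

# 95% confidence interval for beta1
lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1

print("t0 =", round(t0, 2))
print("interval (x 1e-5):", round(lower * 1e5, 3), round(upper * 1e5, 3))
```

The interval does not contain zero, in agreement with the very small p-value of the t-test on the slope.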
Confidence interval for the mean
$$\hat{\mu}_{Y|x_0} \pm t_{\alpha/2,\,n-2} \sqrt{\hat{\sigma}^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}$$
[Figure: plot of the data versus t]
Point prediction
$\hat{y}_t = 0.00136522 - 0.00001066\, t$
using t = 51 we obtain 0.00082156
Two sources of uncertainty:
$Y_0 = \mu_{Y|x_0} + \varepsilon_0$, predicted with $\hat{y}_0 = \hat{\mu}_{Y|x_0}$
1. Variability of data about the estimated model ($\sigma^2$)
2. Uncertainty of the estimated model with respect to the true one
For n reasonably large, the second source of variability is small.
$$V(y_0 - \hat{y}_0) = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)$$
Recall confidence interval for the mean:
$$V(\hat{\mu}_{Y|x_0}) = \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)$$
Prediction interval:
$$\hat{y}_0 \pm t_{\alpha/2,\,n-2} \sqrt{\hat{\sigma}^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}$$
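To see the difference between the two intervals, the sketch below evaluates both at t = 51 for the spring example. It assumes x_t = t for t = 1, ..., 50, takes the coefficients from the regression table, the MSE from the ANOVA table and the critical value t_{0.025,48} = 2.0106 from the text; variable names are illustrative.

```python
import math

n = 50
x = range(1, n + 1)                      # x_t = t, t = 1..50
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

b0, b1 = 0.00136522, -0.00001066         # regression table
mse = 1.03017e-8                         # MS of Residual Error (ANOVA table)
t_crit = 2.0106                          # t_{0.025, 48}

x0 = 51
y0_hat = b0 + b1 * x0                    # point prediction

leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
ci_half = t_crit * math.sqrt(mse * leverage)          # mean response
pi_half = t_crit * math.sqrt(mse * (1.0 + leverage))  # new observation

print("point prediction:", round(y0_hat, 8))
print("confidence interval for the mean:", y0_hat - ci_half, y0_hat + ci_half)
print("prediction interval:", y0_hat - pi_half, y0_hat + pi_half)
```

The prediction interval is always wider than the confidence interval for the mean because it also includes the variability σ² of a single new observation.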
[Figure: fitted line plot of the data versus t; S = 0.0001015, R-Sq = 70.5%, R-Sq(adj) = 69.9%]
[Figure: plots of DIM versus t]
$$\sum_{t=1}^{n} (y_t - \bar{y})^2 = \sum_{t=1}^{n} (y_t - \hat{y}_t)^2 + \sum_{t=1}^{n} (\hat{y}_t - \bar{y})^2$$
Analysis of Variance
Source           DF   SS            MS            F        P
Regression        1   1.18428E-06   1.18428E-06   114.96   0.000
Residual Error   48   4.94479E-07   1.03017E-08
Total            49   1.67876E-06
$$MS_R = \frac{SS_R}{df_R}, \qquad MS_E = \frac{SS_E}{df_E}$$
If $H_0$ is true, then
$$\frac{MS_R}{MS_E} \sim F(K - 1,\, n - K)$$
In the example MSR/MSE = 114.96:
       x   P( X <= x )
114.9600        1.0000
p-value = 1 - 1.0000
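The entries of the ANOVA table follow from the sums of squares and their degrees of freedom. The sketch below recomputes them from the values printed in the table (variable names are illustrative).

```python
# Sums of squares and degrees of freedom from the ANOVA table
ss_r, df_r = 1.18428e-6, 1       # Regression
ss_e, df_e = 4.94479e-7, 48      # Residual Error
ss_t, df_t = 1.67876e-6, 49      # Total

ms_r = ss_r / df_r
ms_e = ss_e / df_e
f0 = ms_r / ms_e                 # compared with F(K - 1, n - K) = F(1, 48)

print("SS_R + SS_E =", ss_r + ss_e, "vs SS_T =", ss_t)
print("MS_E =", ms_e)
print("F0 =", round(f0, 2))
```

The small mismatch between SS_R + SS_E and SS_T comes only from the rounding of the printed values.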
1. Relationship between the F test and the t test, with one single coefficient:
$$t^2_{\alpha/2,\,n-2} = F_{\alpha,\,1,\,n-2}$$
The two tests are identical; we can show that:
$$t_0^2 = \left( \frac{b_1}{\sqrt{MS_E / S_{xx}}} \right)^2 = \frac{b_1^2 S_{xx}}{MS_E} = \frac{MS_R}{MS_E} = F_0$$
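The equivalence between the two tests can be verified numerically. The sketch below first checks it exactly on a small made-up data set, then compares the T and F values printed in the tables of the spring example; all names and the five data points are illustrative assumptions.

```python
import math

# Exact check on illustrative data
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

ss_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ss_r = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
ms_e = ss_e / (n - 2)

t0 = b1 / math.sqrt(ms_e / sxx)   # t statistic for H0: beta1 = 0
f0 = (ss_r / 1) / ms_e            # F statistic from the ANOVA decomposition
print("t0^2 =", t0 ** 2, "F0 =", f0)

# Values printed in the tables of the spring example (rounded)
t_table, f_table = -10.72, 114.96
print("table: t^2 =", round(t_table ** 2, 2), "F =", f_table)
```

On the printed values the two numbers agree only up to rounding, since T is reported with two decimals.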
2. In order to use the test on several coefficients at the same time, consider the family error rate (it is actually better to use a different approach: the extra sum of squares).
! !!"#
!% = %%%%%%%%$% = $"###" $ # ! ! " %! % = ! !!"#
$ % =$"##" $
$SS_R = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$: variability explained by the regression model
$SS_E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$: variability not explained by the regression model
$SS_T = \sum_{i=1}^{n} (y_i - \bar{y})^2 = SS_R + SS_E$
Comments on R2
$$R^2_{adj} = 1 - \frac{SS_E / (n - K)}{SS_T / (n - 1)}$$
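As a worked example, R² and its adjusted version can be computed from the sums of squares of the ANOVA table. The sketch assumes the usual definitions R² = SS_R/SS_T and R²adj = 1 − [SS_E/(n − K)]/[SS_T/(n − 1)], with n = 50 and K = 2; variable names are illustrative.

```python
# Sums of squares from the ANOVA table of the spring example
ss_r = 1.18428e-6
ss_e = 4.94479e-7
ss_t = 1.67876e-6
n, k_params = 50, 2               # K = number of regressors + 1

r2 = ss_r / ss_t
r2_adj = 1.0 - (ss_e / (n - k_params)) / (ss_t / (n - 1))

print("R-Sq      =", round(100 * r2, 1), "%")
print("R-Sq(adj) =", round(100 * r2_adj, 1), "%")
```

The adjusted value is lower because it compares mean squares rather than sums of squares, so it penalizes each additional parameter.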
[Figure: two fitted line plots with the corresponding S, R-Sq and R-Sq(adj) values]
Residual check
[Figure: plot of the residuals RESI1]
ACF limits: $\pm 2 / \sqrt{n}$
ACF of RESI1
-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0
+----+----+----+----+----+----+----+----+----+----+
1 -0.062 XXX
2 -0.006 X
3 0.068 XXX
4 -0.047 XX
5 0.000 X
6 0.040 XX
7 0.084 XXX
8 -0.272 XXXXXXXX
9 -0.031 XX
10 -0.136 XXXX
11 0.012 X
12 0.015 X
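The check on the residual autocorrelations can be sketched as follows. The function `acf` is a hypothetical helper that computes the sample autocorrelation at lags 1..max_lag; the listed coefficients are the ones printed above, and the limits ±2/√n assume n = 50 residuals.

```python
import math


def acf(x, max_lag):
    """Sample autocorrelation coefficients at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


# ACF of RESI1 as printed in the output above (lags 1..12)
acf_resi = [-0.062, -0.006, 0.068, -0.047, 0.000, 0.040,
            0.084, -0.272, -0.031, -0.136, 0.012, 0.015]

n = 50
limit = 2.0 / math.sqrt(n)        # approximate 95% limits

outside = [(lag, r) for lag, r in enumerate(acf_resi, start=1) if abs(r) > limit]
print("limit = +/-", round(limit, 3))
print("lags outside the limits:", outside)
```

With these values no coefficient exceeds the limits (the largest in magnitude, at lag 8, is slightly inside), which is consistent with uncorrelated residuals.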
[Figure: Normal Probability Plot]