02 - Data Modelling Main Assumptions - Part B
Quality Engineering
Bianca Maria Colosimo, PhD
biancamaria.colosimo@polimi.it
How can we decide whether the standard model is appropriate?

Assumptions, hypothesis tests (to check the assumption), and remedies in case of violation:

"Independence" (random pattern)
  Hypothesis tests: runs test, Bartlett's test, LBQ's test
  Remedies in case of violation: gapping, batching, (linear) regression, time series (ARIMA)
Normal distribution
  Hypothesis test: normality test
  Remedy in case of violation: transform data
Descriptive tools
Tests
Transformations
Montgomery's guidelines for building a histogram:
- Use 4 to 20 classes (often the number of classes is approximately equal to the square root of the sample size)
- Use classes of uniform size
- Set the lower limit of the first class slightly below the smallest data value
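The guidelines above can be sketched in a few lines of Python. This is only an illustration: the function name `histogram_classes` and the 1% margin used to place the lower limit "slightly" below the smallest value are assumptions, not part of the original slides.

```python
import math

def histogram_classes(data, margin_fraction=0.01):
    """Build uniform histogram classes following the guidelines above.

    margin_fraction is a hypothetical choice for how far below the
    smallest value the first class starts.
    """
    n = len(data)
    # Number of classes: about sqrt(n), kept between 4 and 20.
    k = min(20, max(4, round(math.sqrt(n))))
    smallest, largest = min(data), max(data)
    span = largest - smallest
    margin = margin_fraction * span if span > 0 else 0.5
    lower = smallest - margin          # slightly below the smallest value
    upper = largest + margin
    width = (upper - lower) / k        # uniform class size
    edges = [lower + j * width for j in range(k + 1)]
    counts = [0] * k
    for x in data:
        j = min(int((x - lower) / width), k - 1)
        counts[j] += 1
    return edges, counts

edges, counts = histogram_classes(list(range(1, 101)))
print(len(counts), sum(counts))
```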
Median
The median is in the middle of the data: half the observations are less than or equal
to it, and half are greater than or equal to it.
Suppose you have a column containing N values. To calculate the median, first order
your data values from smallest to largest. If N is odd, the median is the value in the
middle. If N is even, the median is the average of the two middle values.
For example, when N = 5 and you have data x(1), x(2), x(3), x(4), and x(5), the median
= x(3).
When N = 6 and you have data x(1), x(2), x(3), x(4), x(5), and x(6), the median = (x(3) + x(4)) / 2,
where x(3) and x(4) are the third and fourth observations.
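The rule above (order the values, then take the middle value or the average of the two middle values) can be written as a short Python sketch. The function name `median_of` is illustrative only; the standard library's `statistics.median` gives the same result.

```python
def median_of(values):
    """Median as described above: sort, then pick the middle value(s)."""
    ordered = sorted(values)           # x(1) <= x(2) <= ... <= x(N)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]            # N odd: the middle value
    # N even: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([5, 1, 3, 2, 4]))      # N = 5: the third ordered value
print(median_of([6, 1, 3, 2, 4, 5]))   # N = 6: average of third and fourth
print(median_of([1, 2, 3, 4, 500]))    # an extreme value does not move the median
```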
The median is insensitive to outliers.

[Figure: boxplot with annotations]
- Median: the middle of the data. Half of the observations are less than or equal to it.
- Bottom of the box: by default, the first quartile (Q1); 25% of the data values are less than or equal to this value.
- Lower whisker: by default, extends to the lowest adjacent value, i.e. the lowest value within the lower limit.
- Lower limit = Q1 - 1.5 (Q3 - Q1)
Outlier (*): an unusually large or small observation. Values beyond the whiskers are outliers.

- What is the shape of the distribution?
- Outliers?
- Where is the "center" of the distribution?
- How much are the data spread?
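The boxplot quantities (quartiles, lower limit, whiskers, outliers) can be sketched as follows. Two assumptions are made here: the quartiles are computed with the (N+1)p interpolation rule provided by `statistics.quantiles` (its default `exclusive` method), and the upper limit is defined symmetrically as Q3 + 1.5 (Q3 - Q1). The function name `boxplot_summary` and the sample data are illustrative.

```python
from statistics import quantiles

def boxplot_summary(data):
    """Quartiles, limits, whiskers and outliers of a boxplot (a sketch)."""
    ordered = sorted(data)
    q1, q2, q3 = quantiles(ordered, n=4)   # default method: 'exclusive'
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr           # assumed by symmetry
    inside = [x for x in ordered if lower_limit <= x <= upper_limit]
    return {
        "Q1": q1, "median": q2, "Q3": q3,
        "lower_limit": lower_limit, "upper_limit": upper_limit,
        # Whiskers extend to the most extreme values within the limits.
        "lower_whisker": inside[0], "upper_whisker": inside[-1],
        # Values beyond the whiskers are flagged as outliers.
        "outliers": [x for x in ordered if x < lower_limit or x > upper_limit],
    }

summary = boxplot_summary([2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 30])
print(summary)
```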
[Figure: boxplots of two samples, labelled χ²(2) and N(0,1): "Boxplot of C1" (axis from 0 to 9) and "Boxplot of C2" (axis from -3 to 2)]
Goodness of fit test for distributions
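Basic idea:
$f(x)$ = probability density function, $F(x) = \int_{-\infty}^{x} f(t)\,dt$

Regardless of the specific $f(x)$, the random variable $F = F(X)$ satisfies, for $a = F(x)$ with $0 \le a \le 1$:

\[
\Pr(F \le a) = \Pr\left(F^{-1}(F) \le F^{-1}(a)\right) = \Pr(X \le x) = F(x) = a
\]

so $F \sim \mathrm{Unif}(0,1)$.

[Figure: F(x) plotted against x, with the vertical axis running from 0 to 1]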
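A quick simulation can illustrate why F ~ Unif(0,1) whatever the underlying distribution. The sketch below assumes an exponential distribution with rate 1 as the example (any continuous distribution with a known cdf would do); the names `cdf_exponential` and `fraction_below` are hypothetical.

```python
import math
import random

def cdf_exponential(x, rate=1.0):
    """Cumulative distribution function of the exponential distribution."""
    return 1.0 - math.exp(-rate * x)

rng = random.Random(0)                       # fixed seed for repeatability
xs = [rng.expovariate(1.0) for _ in range(20000)]
fs = [cdf_exponential(x) for x in xs]        # F = F(X), one value per draw

def fraction_below(values, a):
    """Empirical estimate of Pr(F <= a)."""
    return sum(v <= a for v in values) / len(values)

# If F ~ Unif(0,1), Pr(F <= a) should be close to a for every a in [0, 1].
for a in (0.1, 0.3, 0.5, 0.9):
    print(a, round(fraction_below(fs, a), 3))
```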
Normality test: Anderson-Darling

1. Order the data $x_1, x_2, \ldots, x_n$: $x_{[1]} \le x_{[2]} \le \ldots \le x_{[n]}$
2. Standardize:
   $w_{[1]}, w_{[2]}, \ldots, w_{[i]}, \ldots, w_{[n]}$, with $w_{[i]} = \dfrac{x_{[i]} - \bar{x}}{S}$
3. Compute the theoretical cdf:
   $F_{[1]}, F_{[2]}, \ldots, F_{[i]}, \ldots, F_{[n]}$, with $F_{[i]} = \Pr(z < w_{[i]}) = \Phi(w_{[i]})$

Rejection region
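The steps above can be sketched in Python. The statistic itself is not legible in the slide, so the sketch assumes the usual textbook form of the Anderson-Darling statistic, A² = -n - (1/n) Σ (2i - 1) [ln F[i] + ln(1 - F[n+1-i])]; the function name and the small clamp that guards against log(0) are also assumptions. Critical values for the rejection region are not computed here.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(sample):
    """A-squared for normality (assumed textbook form, see lead-in)."""
    x = sorted(sample)                        # ordered data x[1..n]
    n = len(x)
    xbar, s = mean(x), stdev(x)
    phi = NormalDist()                        # standard normal
    eps = 1e-12                               # guard against log(0)
    # Standardize, then evaluate the theoretical cdf: F[i] = Phi(w[i]).
    F = [min(max(phi.cdf((xi - xbar) / s), eps), 1 - eps) for xi in x]
    total = sum(
        (2 * i - 1) * (math.log(F[i - 1]) + math.log(1 - F[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n

rng = random.Random(1)
a2_normal = anderson_darling_normal([rng.gauss(10, 2) for _ in range(200)])
a2_skewed = anderson_darling_normal([rng.expovariate(1.0) for _ in range(200)])
print(a2_normal, a2_skewed)   # larger values indicate stronger departure from normality
```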
If the data are normally distributed, the points (z_i, x_i) should lie on a straight line.
[Figure: normal probability plot; vertical axis: Probability, from .001 to .999]
Often the normal probability graph is used instead (step 4 is skipped: go from F[i] to z[i]).
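A normal probability plot can be sketched without any plotting library by computing the points (z_i, x_i) and measuring how close they are to a straight line. The plotting position (i - 0.5)/n used to obtain z_i is an assumption (other conventions exist), and using the correlation coefficient as a straightness measure is an illustrative choice, not something stated in the slides.

```python
import math
import random
from statistics import NormalDist, mean

def normal_probability_points(sample):
    """Return the points (z_i, x_i) of a normal probability plot."""
    x = sorted(sample)
    n = len(x)
    phi = NormalDist()
    # Assumed plotting position: F[i] = (i - 0.5) / n, then z[i] = Phi^-1(F[i]).
    z = [phi.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, x))

def straightness(points):
    """Pearson correlation of the plotted points (1.0 means a perfect line)."""
    zs = [p[0] for p in points]
    xs = [p[1] for p in points]
    mz, mx = mean(zs), mean(xs)
    num = sum((a - mz) * (b - mx) for a, b in points)
    den = math.sqrt(sum((a - mz) ** 2 for a in zs) * sum((b - mx) ** 2 for b in xs))
    return num / den

rng = random.Random(2)
r_normal = straightness(normal_probability_points([rng.gauss(0, 1) for _ in range(200)]))
r_skewed = straightness(normal_probability_points([rng.expovariate(1.0) for _ in range(200)]))
print(r_normal, r_skewed)
```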
[Figure: histogram of C1 (data from a χ²(4) distribution); vertical axis: Frequency]
[Figure: Normal Probability Plot of C1; vertical axis: Probability, from .001 to .999. Anderson-Darling Normality Test: N = 500, P-Value = 0.000]
- Other processes:
  - Waiting time
  - Km/day for a sales representative
  - Time to repair

Solutions:
1. Use the real distribution
   Ex: exponential, Weibull, Gamma for time to failure
2. Manage the data sampling so as to work with the sample
   average instead of single data (Central
   Limit Theorem)
3. Nonparametric methods (e.g. runs test): limited use in
   SPC because they are insensitive (robust) to
   extreme points such as outliers. They are not useful when
   we want to detect outliers
4. Transform data
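Solution 2 relies on the Central Limit Theorem: averages of small subgroups are closer to normal than the single data. The simulation below is a rough illustration under stated assumptions: exponential waiting times with rate 1, subgroups of size 5, and sample skewness as a crude measure of non-normality (a normal distribution has skewness 0). All names are hypothetical.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Sample skewness: average cubed standardized deviation."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

rng = random.Random(3)
subgroup_size = 5                                   # assumed subgroup size
raw = [rng.expovariate(1.0) for _ in range(10000)]  # single data, strongly skewed

# Sample averages of consecutive subgroups of 5 observations.
means = [
    mean(raw[i:i + subgroup_size])
    for i in range(0, len(raw), subgroup_size)
]

skew_raw = skewness(raw)
skew_means = skewness(means)
print(skew_raw, skew_means)   # the averages are markedly less skewed
```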
[Figure: (a) positive skewness; (b) negative skewness]
Quality Engineering- BM Colosimo
Box-Cox transformation (in Minitab)
\[
x^{(\lambda)} =
\begin{cases}
x^{\lambda} & \lambda \neq 0 \\
\ln x & \lambda = 0
\end{cases}
\]
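A minimal sketch of a power-type Box-Cox transformation, assuming the form x^λ for λ ≠ 0 and ln x for λ = 0 (so that the square root, λ = 0.5, and the logarithm, λ = 0, are special cases). The function name `box_cox_power` is hypothetical, and the search for the best λ is not shown.

```python
import math

def box_cox_power(data, lam):
    """Apply x**lam (lam != 0) or ln(x) (lam == 0) to every value.

    Raises ValueError if any value is not strictly positive, since
    ln and fractional powers are then undefined.
    """
    if any(x <= 0 for x in data):
        raise ValueError("Box-Cox requires data greater than zero")
    if lam == 0:
        return [math.log(x) for x in data]
    return [x ** lam for x in data]

print(box_cox_power([1, 4, 9], 0.5))        # square-root transformation
print(box_cox_power([1, math.e], 0))        # natural-log transformation
```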
Problems:
data must be greater than zero (to apply ln and square root):