
Quality Engineering
Data modelling
Main assumptions - part [?]
Bianca Maria Colosimo
biancamaria.colosimo@polimi.it
How can we decide whether the standard model is
appropriate?

Assumptions              Hypothesis test                Remedy in case of
                         (to check the assumptions)     violation

“independence”           - Runs test                    - Gapping
(random pattern)         - Bartlett's test              - Batching
                         - LBQ's test                   - (Linear) regression
                                                        - Time series (ARIMA)

Normal                   Normality test                 Transform data
distribution

Quality Engineering- BM Colosimo


Distributions:

Descriptive tools
Tests
Transformations



Exploring the data- data snooping

Qualitative: Histogram, Boxplot, Run chart

Weekly semiconductor production (products per week)

week  products    week  products    week  products    week  products
   1        48      11        59      21        68      31        75
   2        53      12        54      22        65      32        85
   3        49      13        47      23        73      33        81
   4        52      14        49      24        88      34        77
   5        51      15        45      25        69      35        82
   6        52      16        64      26        83      36        76
   7        63      17        79      27        78      37        75
   8        60      18        65      28        81      38        91
   9        53      19        62      29        86      39        73
  10        64      20        60      30        92      40        92

40 data: min 45; max 92



Histogram

Montgomery:
- Use 4 - 20 classes (often the number of classes is approximately equal to the
square root of the sample size)
- Classes of uniform width
- Lower limit of the first class slightly below the smallest observation
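
The rules above can be sketched in a few lines of Python. This is only an illustration on the weekly production data: the function name `histogram_classes` and the `margin` of 0.5 used to push the lower limit below the smallest observation are assumptions, not part of Montgomery's guidelines.

```python
import math

weekly_products = [
    48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
    59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
    68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
    75, 85, 81, 77, 82, 76, 75, 91, 73, 92,
]

def histogram_classes(data, margin=0.5):
    """Return (class limits, counts) for classes of uniform width."""
    n = len(data)
    # Number of classes: about sqrt(n), kept within the 4 - 20 range.
    k = min(20, max(4, round(math.sqrt(n))))
    # Lower limit slightly below the smallest observation (margin is an assumption).
    lower = min(data) - margin
    upper = max(data) + margin
    width = (upper - lower) / k
    limits = [lower + j * width for j in range(k + 1)]
    counts = [0] * k
    for x in data:
        j = min(int((x - lower) / width), k - 1)  # index of the class containing x
        counts[j] += 1
    return limits, counts

limits, counts = histogram_classes(weekly_products)
for left, right, count in zip(limits[:-1], limits[1:], counts):
    print(f"[{left:5.1f}, {right:5.1f})  {'*' * count}")
```

With 40 data the square-root rule suggests about 6 classes, which the sketch uses.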
[Figure: histogram of the weekly production data; x-axis: production, y-axis: frequency]



Median, First and third quartile

Median

The median is in the middle of the data: half the observations are less than or equal
to it, and half are greater than or equal to it.
Suppose you have a column containing N values. To calculate the median, first order
your data values from smallest to largest. If N is odd, the median is the value in the
middle. If N is even, the median is the average of the two middle values.
For example, when N = 5 and you have data x(1), x(2), x(3), x(4), and x(5), the median
= x(3).
When N = 6 and you have data x(1), x(2), x(3), x(4), x(5), and x(6), the median
= (x(3) + x(4)) / 2,
where x(3) and x(4) are the third and fourth observations.

It is insensitive to outliers
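
A minimal sketch of the median rule described above (the function name `median` and the small data sets are illustrative only). The last call shows the insensitivity to outliers: replacing the largest value with an extreme one leaves the median unchanged.

```python
def median(values):
    """Median of a list: middle value (N odd) or mean of the two middle values (N even)."""
    ordered = sorted(values)          # order from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # N odd: the value in the middle
    return (ordered[middle - 1] + ordered[middle]) / 2  # N even

print(median([7, 1, 5, 3, 9]))        # N = 5: third ordered value
print(median([7, 1, 5, 3, 9, 11]))    # N = 6: mean of third and fourth ordered values
print(median([7, 1, 5, 3, 1000]))     # same median as the first call despite the outlier
```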



Boxplot

Based on quantiles/percentiles

Rank the data and define the rank
(tied observations share the average of their ranks)

Median = 50th percentile
If N is odd, the median is the value in the middle. If N is even, the median is the
average of the two middle values.

Median = (65+68)/2 = 66.5

week  sorted  rank      week  sorted  rank
  15      45   1.0        21      68  21.0
  13      47   2.0        25      69  22.0
   1      48   3.0        23      73  23.5
   3      49   4.5        39      73  23.5
  14      49   4.5        31      75  25.5
   5      51   6.0        37      75  25.5
   4      52   7.5        36      76  27.0
   6      52   7.5        34      77  28.0
   2      53   9.5        27      78  29.0
   9      53   9.5        17      79  30.0
  12      54  11.0        28      81  31.5
  11      59  12.0        33      81  31.5
   8      60  13.5        35      82  33.0
  20      60  13.5        26      83  34.0
  19      62  15.0        32      85  35.0
   7      63  16.0        29      86  36.0
  10      64  17.5        24      88  37.0
  16      64  17.5        38      91  38.0
  18      65  19.5        30      92  39.5
  22      65  19.5        40      92  39.5



Boxplot
[Figure: annotated boxplot of the weekly production data]

Upper whisker
By default, the upper whisker extends to this adjacent value - the highest data
value within the upper limit.
Upper limit = Q3 + 1.5 (Q3 - Q1)

Q3
By default, the top of the box is the third quartile (Q3) - 75% of the data values
are less than or equal to this value.

Median - the middle of the data. Half of the observations are less
than or equal to it.

Q1
By default, the bottom of the box is the first quartile (Q1) - 25% of the data values
are less than or equal to this value.

Lower whisker
By default, the lower whisker extends to this adjacent value - the lowest value
within the lower limit.
Lower limit = Q1 - 1.5 (Q3 - Q1)



Boxplot*

1st quartile (Q1):
Observation corresponding to the rank (n+1)/4 = 41/4 = 10.25
-> 53 + 0.25*(54-53) = 53.25

3rd quartile (Q3):
Observation corresponding to the rank (n+1)*3/4 = 30.75
-> 79 + 0.75*(81-79) = 80.5

IQR = Q3 - Q1 = interquartile range:
it is a measure of dispersion

* According to the way in which Minitab computes quartiles
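
The rank-based quartile computation above can be sketched as follows. The helper name `quantile_by_rank` is an assumption made for illustration; it interpolates linearly between the two ordered observations that bracket the rank p(n+1), as in the worked numbers on this slide.

```python
weekly_products = [
    48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
    59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
    68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
    75, 85, 81, 77, 82, 76, 75, 91, 73, 92,
]

def quantile_by_rank(data, p):
    """Quantile at rank p*(n+1), interpolating between adjacent ordered values."""
    ordered = sorted(data)
    n = len(ordered)
    rank = min(max(p * (n + 1), 1), n)   # keep the rank inside 1..n
    whole = int(rank)                    # integer part of the rank
    fraction = rank - whole              # fractional part of the rank
    if whole == n:
        return ordered[-1]
    below, above = ordered[whole - 1], ordered[whole]
    return below + fraction * (above - below)

q1 = quantile_by_rank(weekly_products, 0.25)
q3 = quantile_by_rank(weekly_products, 0.75)
print("Q1 =", q1, " Q3 =", q3, " IQR =", q3 - q1)
```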



Boxplot

[Figure: boxplot with one outlier marked by *]

Outlier
* - an unusually large or small observation. Values beyond the
whiskers are outliers.

- What is the shape of the distribution?
- Outliers?
- Where is the "center" of the distribution?
- How much are the data spread?
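
As a rough sketch of how a boxplot answers these questions numerically, the code below computes the box, the whisker limits at 1.5 IQR from the quartiles, and the points beyond them. It relies on `statistics.quantiles` from the Python standard library, whose default "exclusive" method interpolates at rank p(n+1). The function name `boxplot_summary` and the extra value 130 are hypothetical, added only to show an outlier being flagged.

```python
from statistics import quantiles

weekly_products = [
    48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
    59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
    68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
    75, 85, 81, 77, 82, 76, 75, 91, 73, 92,
]

def boxplot_summary(data):
    """Quartiles, whisker limits, adjacent values and outliers of a sample."""
    q1, median, q3 = quantiles(data, n=4)   # default method: rank p*(n+1)
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "lower_limit": lower_limit, "upper_limit": upper_limit,
        "lower_whisker": min(inside),   # lowest value within the lower limit
        "upper_whisker": max(inside),   # highest value within the upper limit
        "outliers": sorted(x for x in data if x < lower_limit or x > upper_limit),
    }

summary = boxplot_summary(weekly_products)
# 130 is a made-up observation, used only to illustrate the outlier rule.
with_outlier = boxplot_summary(weekly_products + [130])
print(summary)
print(with_outlier["outliers"])
```

For the weekly production data no value falls beyond the whiskers, so the whiskers simply reach the minimum and the maximum.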



Boxplot

39 data randomly generated from a:

χ²(2) (sample C2)        N(0,1) (sample C1)

[Figure: boxplot of C2 and boxplot of C1]

Goodness of fit test for distributions

Chi-square test: used to check whether the data in a sample are drawn from a specified
distribution.
H0: data follow a specified distribution.
Ha: data do not follow the specified distribution.

$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

i = 1,...,k classes; O_i = observed frequency in class i;
E_i = expected frequency in class i

$E_i = n \left[ F(Y_{u,i}) - F(Y_{l,i}) \right]$

n = sample size; F = cumulative distribution function of the specified distribution;
Y_{u,i} and Y_{l,i} = upper and lower limits of the i-th class

If H0 is true: $\chi_0^2 \sim \chi^2_{k-c-1}$

where c = number of estimated parameters (ex: mean, variance)

https://www.itl.nist.gov/div898/handbook/
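
A minimal sketch of the chi-square statistic for a normal distribution fitted to the weekly production data. Several choices here are assumptions made for illustration: the function name `chi_square_normal_fit`, the use of k equal-probability classes, and the degrees of freedom taken as k - c - 1 with c = 2 (mean and standard deviation estimated from the data). The statistic would then be compared with a tabulated chi-square critical value.

```python
from statistics import NormalDist, mean, stdev

weekly_products = [
    48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
    59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
    68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
    75, 85, 81, 77, 82, 76, 75, 91, 73, 92,
]

def chi_square_normal_fit(data, k):
    """Chi-square goodness-of-fit statistic against a fitted normal distribution."""
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))   # c = 2 estimated parameters
    # Class limits chosen so that each class has probability 1/k (an assumption).
    inner = [fitted.inv_cdf(j / k) for j in range(1, k)]
    limits = [float("-inf")] + inner + [float("inf")]
    observed, expected = [], []
    for lower, upper in zip(limits[:-1], limits[1:]):
        observed.append(sum(1 for x in data if lower < x <= upper))    # O_i
        expected.append(n * (fitted.cdf(upper) - fitted.cdf(lower)))   # E_i
    chi2_0 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = k - 2 - 1   # k - c - 1, assumed form of the degrees of freedom
    return chi2_0, dof, observed, expected

chi2_0, dof, observed, expected = chi_square_normal_fit(weekly_products, k=6)
print("chi2_0 =", round(chi2_0, 3), " degrees of freedom =", dof)
print("observed:", observed)
```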



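Draw random data from any CDF

Test based on the empirical cumulative distribution function (CDF)

Basic idea:
f(x): probability density function

$F(a) = \int_{-\infty}^{a} f(x)\,dx$

[Figure: CDF F(x), rising from 0 to 1, with the value F(a) marked at x = a]

Regardless of the specific f(x), F(X) ~ Unif(0,1):

U ~ Unif(0,1) if and only if

$\Pr(U \le F) = F \cdot 1 = F$, for $0 \le F \le 1$

[Figure: Unif(0,1) density of height 1 between 0 and 1; the area to the left of F is F]

With $X = F^{-1}(U)$ and $x = F^{-1}(u)$, for any increasing function $g$:

$\Pr(U \le u) = \Pr(g(U) \le g(u)) = \Pr(F^{-1}(U) \le F^{-1}(u)) = \Pr(X \le x) = F(x) = u$

F ~ Unif(0,1)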
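
The result above is what allows random data to be drawn from a given CDF: generate U ~ Unif(0,1) and return x = F^{-1}(U). The sketch below does this for an exponential distribution, chosen here only as an example because its inverse CDF has a closed form, F^{-1}(u) = -ln(1 - u)/rate. The function name `draw_exponential` and the seed are illustrative assumptions.

```python
import math
import random

def draw_exponential(rate, size, seed=0):
    """Draw exponential data by inverting F(x) = 1 - exp(-rate * x)."""
    rng = random.Random(seed)
    # rng.random() is in [0, 1), so 1 - u is in (0, 1] and the log is defined.
    return [-math.log(1.0 - rng.random()) / rate for _ in range(size)]

sample = draw_exponential(rate=2.0, size=20000)
sample_mean = sum(sample) / len(sample)
# The theoretical median is ln(2)/rate: about half of the data should lie below it.
share_below_median = sum(x <= math.log(2) / 2.0 for x in sample) / len(sample)
print("sample mean:", round(sample_mean, 3), "(theoretical 1/rate = 0.5)")
print("share below the theoretical median:", round(share_below_median, 3))
```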
Normality test: Anderson-Darling

1. Sample of n iid data $\{x_1, x_2, \ldots, x_n\}$

2. Rank (increasing order): x[1], x[2],..., x[i],..., x[n]

3. Standardize:

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

w[1], w[2],..., w[i],..., w[n], with $w_{[i]} = \frac{x_{[i]} - \bar{x}}{S}$

4. Compute theoretical cdf:

F[1], F[2],..., F[i],..., F[n], with $F_{[i]} = \Pr(z < w_{[i]}) = \Phi(w_{[i]})$



5. Compute A², A*²

$A^2 = -\frac{\sum_{i=1}^{n} (2i-1)\left[\ln F_{[i]} + \ln\left(1 - F_{[n+1-i]}\right)\right]}{n} - n$

$A^{*2} = A^2 \left(1 + \frac{0.75}{n} + \frac{2.25}{n^2}\right)$

Rejection region

α        0.25   0.2    0.15   0.1    0.05   0.025  0.01   0.005
A*²_α    0.472  0.509  0.561  0.631  0.752  0.873  1.035  1.159

(if A*² > A*²_α we can reject the null hypothesis)
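
The five steps can be sketched as below, using only the Python standard library. The function name `anderson_darling_normal` is an assumption, and S is taken as the sample standard deviation with n - 1 in the denominator. The critical value 0.752 is the α = 0.05 entry of the table above.

```python
import math
from statistics import NormalDist, mean, stdev

weekly_products = [
    48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
    59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
    68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
    75, 85, 81, 77, 82, 76, 75, 91, 73, 92,
]

def anderson_darling_normal(data):
    """Return (A2, A2_star) for the Anderson-Darling normality test."""
    n = len(data)
    x_bar, s = mean(data), stdev(data)             # S uses n - 1 (assumption)
    w = [(x - x_bar) / s for x in sorted(data)]    # steps 2-3: rank, standardize
    f = [NormalDist().cdf(v) for v in w]           # step 4: F[i] = Phi(w[i])
    total = sum(
        (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
        for i in range(1, n + 1)                   # f[n - i] is F[n+1-i] (1-based)
    )
    a2 = -total / n - n                            # step 5
    a2_star = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    return a2, a2_star

a2, a2_star = anderson_darling_normal(weekly_products)
print("A2 =", round(a2, 3), " A*2 =", round(a2_star, 3))
print("reject normality at alpha = 0.05:", a2_star > 0.752)
```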


Graphical method (to check normality)

Sample of n iid data $\{x_1, x_2, \ldots, x_n\}$

1. Sample (n>20) $\{x_1, x_2, \ldots, x_n\}$

2. Ranked in increasing order

$x_{[1]} \le x_{[2]} \le \ldots \le x_{[i]} \le \ldots \le x_{[n]}$

3. Estimate empirically the cumulative distribution function F[i]

$F_{[i]} = F(x_{[i]}) = \Pr(X \le x_{[i]}) = \frac{i - 0.5}{n}$

4. Compute the corresponding z's (assuming a standard normal
distribution)

$z_{[i]} = \Phi^{-1}(F_{[i]})$



5. Plot (z[i],x[i])

6. Check the deviation from a straight line

If data are normally distributed, the points (z[i], x[i]) should lie on a straight line
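
A minimal sketch of steps 1 to 6, assuming the common plotting position F[i] = (i - 0.5)/n for the empirical CDF. The names `normal_plot_points` and `straightness` are illustrative; the correlation between z[i] and x[i] is used here only as a rough numerical summary of how close the points are to a straight line, in place of the visual check.

```python
import math
from statistics import NormalDist, mean

weekly_products = [
    48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
    59, 54, 47, 49, 45, 64, 79, 65, 62, 60,
    68, 65, 73, 88, 69, 83, 78, 81, 86, 92,
    75, 85, 81, 77, 82, 76, 75, 91, 73, 92,
]

def normal_plot_points(data):
    """Return the points (z[i], x[i]) of a normal probability plot."""
    ordered = sorted(data)                       # step 2: increasing order
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered, start=1):
        f_i = (i - 0.5) / n                      # step 3 (assumed plotting position)
        z_i = NormalDist().inv_cdf(f_i)          # step 4: z[i] = Phi^-1(F[i])
        points.append((z_i, x))                  # step 5
    return points

def straightness(points):
    """Pearson correlation between z[i] and x[i]; near 1 for a straight line."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    mz, mx = mean(zs), mean(xs)
    num = sum((z - mz) * (x - mx) for z, x in points)
    den = math.sqrt(sum((z - mz) ** 2 for z in zs) * sum((x - mx) ** 2 for x in xs))
    return num / den

points = normal_plot_points(weekly_products)
print("correlation of (z, x):", round(straightness(points), 4))
```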

[Figure: Normal Probability Plot; y-axis: Probability (.001 to .999), x-axis: data values (0.5 to 8.5)]

Often we use the normal probability graph, whose probability axis is already on the
normal scale (step 4, from F[i] to z[i], is skipped)

Ex: 500 data from a χ²(4)

[Figure: histogram of C1 (Frequency vs C1)]

[Figure: Normal Probability Plot of C1 with Anderson-Darling Normality Test; N: 500, P-Value: 0.000]

H0: data follow a normal distribution.

If the p-value of the test is less than your α level, reject H0.



What can we do if data are not normal?
- Mixture: example of output from two different processes that
should be (in principle) characterized by the same distribution.

What can we do if data are not normal?
- It can be intrinsically due to the process
  Examples:
  - Physical processes:
    - Distortion measurements (eccentricity, roughness)
    - Electrical phenomena: capacitance, insulation resistance
    - Small levels of substances in the material: porosity,
      contaminants
    - Other physical properties (ultimate tensile stress, time
      to failure)

  - Other processes:
    - Waiting time
    - Km/day for a sales representative
    - Time to repair



Non normality

Solutions:
1. Use the real distribution
Ex: exponential, Weibull, Gamma for time to failure
2. Manage the data sampling to deal with the sample
average instead of dealing with single data (Central
Limit Theorem)
3. Nonparametric methods (ex. runs test): limited use in
SPC because they are insensitive (robust) to
extreme points such as outliers. They are not useful when
we want to detect outliers
4. Transform data
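
Solution 2 can be illustrated with a small simulation: single exponential data are strongly skewed, while averages of subgroups are closer to normal. This is only a sketch under stated assumptions: the exponential distribution, the subgroup size of 5, the seed, and the helper names `skewness` and `subgroup_means` are all illustrative choices.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Sample skewness: average cubed standardized deviation (0 for symmetric data)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def subgroup_means(data, size):
    """Averages of consecutive, non-overlapping subgroups of the given size."""
    return [mean(data[j:j + size]) for j in range(0, len(data) - size + 1, size)]

rng = random.Random(1)
single_data = [rng.expovariate(1.0) for _ in range(5000)]   # non-normal single data
averages = subgroup_means(single_data, 5)                   # sample averages, n = 5

print("skewness of single data:", round(skewness(single_data), 2))
print("skewness of averages   :", round(skewness(averages), 2))
```

The skewness of the averages is smaller, and it keeps shrinking as the subgroup size grows.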



Transform data for non-normality

Power transformation (Box-Cox transformation):

$x^{(\lambda)} = \begin{cases} x^{\lambda} & \lambda \neq 0 \\ \ln x & \lambda = 0 \end{cases}$

λ < 1: positive skewness        λ > 1: negative skewness
Box-Cox transformation (in Minitab)

$x^{(\lambda)} = \begin{cases} x^{\lambda} & \lambda \neq 0 \\ \ln x & \lambda = 0 \end{cases}$

Ex: 500 data from χ²(4)



Box-Cox transformation

Problems:
data must be greater than zero (to apply ln and square root):

- If data are negative: we can add a constant

- The transformation does not work if the most frequent
value is the smallest value in the data set
- … other transformations (e.g., Johnson transformation)
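
A minimal sketch of the power transformation with the positivity check and the optional added constant mentioned above. It assumes the simple form x^λ for λ ≠ 0 and ln x for λ = 0; the function name `box_cox` and the `shift` parameter are illustrative, and no automatic choice of λ is attempted.

```python
import math

def box_cox(data, lam, shift=0.0):
    """Power transformation: x**lam for lam != 0, ln(x) for lam == 0.

    shift is a constant added to every value first, for data that are not all positive.
    """
    shifted = [x + shift for x in data]
    if min(shifted) <= 0:
        raise ValueError("data must be greater than zero; add a larger constant")
    if lam == 0:
        return [math.log(x) for x in shifted]
    return [x ** lam for x in shifted]

print(box_cox([1.0, 4.0, 9.0], 0.5))           # square root
print(box_cox([1.0, 10.0, 100.0], 0))          # natural logarithm
print(box_cox([-1.0, 0.0, 3.0], 0, shift=2))   # negative data: add a constant first
```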
