Exam: Quality Data Analysis 2018+09+10
10/09/2018
General recommendations:
avoid (if not required) theoretical introductions or explanations covered during the course;
always state the assumptions, formulas/expressions and the final results (when using hypothesis tests provide
the numerical value of the test statistic and the test conclusion in terms of p-value);
show (qualitatively) all the plots.
300.7 305.0 296.7 263.7 210.3 233.0 247.3 265.3 287.0 380.3 341.0 340.0
326.3 342.7 280.0 242.7 273.0 252.0 268.0 278.0 279.7 274.0 338.7 328.7
391.3 405.7 389.7 358.0 356.3 330.3 263.7 273.3 229.7 241.0 244.0 160.3
182.0 203.3 221.3 251.7
a) Design a traditional control chart (assuming a NID behaviour) with run-rules and comment the result.
X Y Z
1 0,94 13,92 22,47
2 1,31 27,83 28,33
3 1,06 13,94 25,78
4 1,16 12,20 25,18
5 1,32 12,95 25,85
6 1,40 2,48 21,81
7 1,00 9,62 24,92
8 1,44 28,67 29,47
9 1,34 13,18 26,42
10 1,21 8,47 25,49
11 1,19 0,09 23,4
12 1,40 14,95 25,89
13 1,17 3,91 26,35
14 1,23 36,55 26,41
15 1,23 28,02 24,17
16 1,37 7,91 25,05
17 1,48 5,65 30,56
18 1,31 7,65 20,52
19 1,19 0,59 30,49
20 1,33 2,74 26,6
21 1,26 1,84 27,99
22 1,22 4,42 20,96
23 1,39 9,69 24,11
24 1,28 15,98 24,71
25 1,25 3,63 26,35
X Y Z
26 1,32 33,79 21,7
27 1,27 22,78 26,67
28 1,04 5,53 21,44
29 1,19 12,25 25,46
30 1,2 2,53 27,37
Assuming that an out-of-control shift of the mean is detected with a Type II error equal to 0.865, compute:
the magnitude of the shift in terms of standard deviation units;
the out-of-control mean.
Solution
Exercise 1
a)
Time-series plot:
[Figure: time-series plot of X versus observation index (1 to 40)]
The process does not appear to be NID, but let's design the traditional chart assuming NID behaviour anyway:
I-MR Chart of X
[Figure: individuals chart of X versus observation, with UCL=358,7, center line X̄=286,4, LCL=214,1 and several flagged points]
[Figure: moving range chart of X versus observation, with UCL=88,8, center line MR̄=27,2, LCL=0]
Run rules:
Test Results for I Chart of X
TEST 1. One point more than 3,00 standard deviations from center line.
Test Failed at points: 5; 10; 25; 26; 27; 36; 37; 38
Test Results for MR Chart of X
TEST 1. One point more than 3,00 standard deviations from center line.
Test Failed at points: 10
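The chart limits and the Test 1 signals above can be reproduced with a short sketch. It assumes the usual individuals/moving-range constants for a moving range of span 2 (d2 = 1.128, D4 = 3.267); the variable names are illustrative only, and only Test 1 is checked here, not the other run rules.

```python
from statistics import mean

# The 40 observations of X from the exam text.
x = [300.7, 305.0, 296.7, 263.7, 210.3, 233.0, 247.3, 265.3, 287.0, 380.3,
     341.0, 340.0, 326.3, 342.7, 280.0, 242.7, 273.0, 252.0, 268.0, 278.0,
     279.7, 274.0, 338.7, 328.7, 391.3, 405.7, 389.7, 358.0, 356.3, 330.3,
     263.7, 273.3, 229.7, 241.0, 244.0, 160.3, 182.0, 203.3, 221.3, 251.7]

D2 = 1.128  # assumed constant d2 for moving ranges of span 2
D4 = 3.267  # assumed constant D4 for moving ranges of span 2

# Moving ranges: |x_t - x_{t-1}|, defined from the second observation on.
mr = [abs(b - a) for a, b in zip(x, x[1:])]
x_bar, mr_bar = mean(x), mean(mr)
sigma_hat = mr_bar / D2

i_ucl = x_bar + 3 * sigma_hat
i_lcl = x_bar - 3 * sigma_hat
mr_ucl = D4 * mr_bar

# Test 1: points beyond the 3-sigma limits (1-based observation numbers;
# the first moving range belongs to observation 2).
i_ooc = [i for i, v in enumerate(x, start=1) if v > i_ucl or v < i_lcl]
mr_ooc = [i for i, v in enumerate(mr, start=2) if v > mr_ucl]

print(f"I chart:  LCL={i_lcl:.1f}  CL={x_bar:.1f}  UCL={i_ucl:.1f}  OOC={i_ooc}")
print(f"MR chart: CL={mr_bar:.1f}  UCL={mr_ucl:.1f}  OOC={mr_ooc}")
```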
[Figure: autocorrelation function for X, lags 1 to 10]
[Figure: Partial Autocorrelation Function for X (with 5% significance limits for the partial autocorrelations), lags 1 to 10]
[Regression output for two fitted models (Analysis of Variance, Model Summary, Coefficients, Regression Equation); the tables are not reproduced here]
Regression Equation
X = 0,9886 AR1
The model without a constant meets the assumptions (normality of residuals: p-value = 0.118; runs test: p-value = 0.294;
ACF and PACF ok; no strange pattern; lack of fit ok).
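As a rough check of the fitted model, the coefficient of an AR(1) regression without constant can be obtained by least squares through the origin, regressing x_t on x_{t-1}. The sketch below uses only the data listed in the exam text; the variable names are illustrative.

```python
# The 40 observations of X from the exam text.
x = [300.7, 305.0, 296.7, 263.7, 210.3, 233.0, 247.3, 265.3, 287.0, 380.3,
     341.0, 340.0, 326.3, 342.7, 280.0, 242.7, 273.0, 252.0, 268.0, 278.0,
     279.7, 274.0, 338.7, 328.7, 391.3, 405.7, 389.7, 358.0, 356.3, 330.3,
     263.7, 273.3, 229.7, 241.0, 244.0, 160.3, 182.0, 203.3, 221.3, 251.7]

lagged = x[:-1]   # regressor AR1: x_{t-1}
current = x[1:]   # response: x_t

# Least-squares slope for a regression through the origin.
phi = sum(a * b for a, b in zip(lagged, current)) / sum(a * a for a in lagged)

# Residuals of the model x_t = phi * x_{t-1} + e_t (39 values).
residuals = [c - phi * l for l, c in zip(lagged, current)]

print(f"phi = {phi:.4f}")
print(f"mean residual = {sum(residuals) / len(residuals):.2f}")
```

The residuals produced this way are the series that the special cause control chart is built on.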
[Figure: Autocorrelation Function for RESI_1 (with 5% significance limits for the autocorrelations), lags 1 to 10]
[Figure: Partial Autocorrelation Function for RESI_1 (with 5% significance limits for the partial autocorrelations), lags 1 to 10]
c mech)
Special cause control chart:
I-MR Chart of RESI_1
[Figure: individuals chart of RESI_1 versus observation, with UCL=108,2, center line X̄=2,0, LCL=-104,1]
[Figure: moving range chart of RESI_1 versus observation, with UCL=130,4, center line MR̄=39,9, LCL=0 and one flagged point]
One out-of-control point is signalled by the MR chart only. This may be because the MR statistic
follows a half-normal distribution rather than a normal one. Let's transform it to normality with the Box-Cox transformation:
The estimated λ is close to the known transformation value (λ=0.4). Using λ=0.4, the new MR chart is:
I Chart of C9
[Figure: individuals chart of the transformed moving range (C9) versus observation, with UCL=7,730, center line X̄=3,946, LCL=0,161]
d mech)
95% Prediction interval:
Variable Setting
AR1 251,7
Prediction
Fit SE Fit 95% CI 95% PI
248,828 4,93612 (238,836; 258,821) (175,460; 322,196)
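The prediction interval above can be approximated by hand from the no-constant AR(1) regression. This is a minimal sketch, assuming the standard formulas for a regression through the origin (residual degrees of freedom n - 1, SE Fit = s·|x0|/sqrt(Σ x_{t-1}²)) and using `scipy.stats.t` for the quantile; the variable names are illustrative.

```python
from math import sqrt
from scipy.stats import t

# The 40 observations of X from the exam text.
x = [300.7, 305.0, 296.7, 263.7, 210.3, 233.0, 247.3, 265.3, 287.0, 380.3,
     341.0, 340.0, 326.3, 342.7, 280.0, 242.7, 273.0, 252.0, 268.0, 278.0,
     279.7, 274.0, 338.7, 328.7, 391.3, 405.7, 389.7, 358.0, 356.3, 330.3,
     263.7, 273.3, 229.7, 241.0, 244.0, 160.3, 182.0, 203.3, 221.3, 251.7]

lagged, current = x[:-1], x[1:]
n = len(current)                      # 39 usable pairs
sxx = sum(a * a for a in lagged)
phi = sum(a * b for a, b in zip(lagged, current)) / sxx

# Residual variance with n - 1 degrees of freedom (one coefficient, no intercept).
sse = sum((c - phi * l) ** 2 for l, c in zip(lagged, current))
dof = n - 1
s2 = sse / dof

x0 = x[-1]                            # last observation, the AR1 setting 251.7
fit = phi * x0
se_fit = sqrt(s2 * x0 ** 2 / sxx)
t_crit = t.ppf(0.975, dof)

ci = (fit - t_crit * se_fit, fit + t_crit * se_fit)
half_pi = t_crit * sqrt(s2 + se_fit ** 2)
pi = (fit - half_pi, fit + half_pi)

print(f"Fit={fit:.3f}  SE Fit={se_fit:.3f}")
print(f"95% CI=({ci[0]:.3f}; {ci[1]:.3f})  95% PI=({pi[0]:.3f}; {pi[1]:.3f})")
```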
ONLY FOR MANAGEMENT ENGINEERING STUDENTS
b man)
The EWMA control chart for auto-correlated data to be used is the following:
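$$\hat{X}_{t+1|t} = z_t = \lambda x_t + (1-\lambda)\, z_{t-1}, \qquad z_0 = \bar{x}$$
$$\Delta_t = \alpha\,|e_t| + (1-\alpha)\,\Delta_{t-1}, \quad \alpha = 0.1, \quad \Delta_0 = \frac{1}{m}\sum_{t=1}^{m} |e_t|, \qquad \hat{\sigma}_t \approx 1.25\,\Delta_t$$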
The λ parameter can be estimated by minimizing the SSE: the result is λ=0.975.
The resulting EWMA control chart is:
There is one out-of-control observation at time t=10. We have no information about the existence of
assignable causes, thus the EWMA control chart design is complete. That OOC observation still deserves some
attention.
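One way to estimate λ by minimizing the SSE of the one-step-ahead errors is a simple grid search. The sketch below assumes z_0 equal to the sample mean and a grid step of 0.005; both are choices made here for illustration, and the helper name `ewma_sse` is hypothetical. The software used for the solution may initialize or optimize differently, so the minimizer need not coincide exactly with the value reported above.

```python
# The 40 observations of X from the exam text.
x = [300.7, 305.0, 296.7, 263.7, 210.3, 233.0, 247.3, 265.3, 287.0, 380.3,
     341.0, 340.0, 326.3, 342.7, 280.0, 242.7, 273.0, 252.0, 268.0, 278.0,
     279.7, 274.0, 338.7, 328.7, 391.3, 405.7, 389.7, 358.0, 356.3, 330.3,
     263.7, 273.3, 229.7, 241.0, 244.0, 160.3, 182.0, 203.3, 221.3, 251.7]


def ewma_sse(lam, data):
    """Sum of squared one-step-ahead errors e_t = x_t - z_{t-1}."""
    z = sum(data) / len(data)      # assumed start: z_0 = sample mean
    sse = 0.0
    for obs in data:
        err = obs - z              # forecast for time t is z_{t-1}
        sse += err * err
        z = lam * obs + (1 - lam) * z
    return sse


# Grid search over lambda in (0, 1] with step 0.005.
grid = [k / 1000 for k in range(5, 1001, 5)]
best_lambda = min(grid, key=lambda lam: ewma_sse(lam, x))

print(f"lambda minimizing SSE on the grid: {best_lambda:.3f}")
print(f"SSE at lambda=0.975: {ewma_sse(0.975, x):.1f}")
```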
c man)
In order to determine if the EWMA is a good one-step-ahead predictor we should check its residuals:
Normality can be accepted (at alpha=5%).
Runs test:
Runs test for RES-EWMA
[Figure: autocorrelation and partial autocorrelation plots of the EWMA residuals (RES-EWMA), lags 1 to 10]
EXERCISE 2
Time-series plots and scatter plots:
[Figure: time-series plots of X, Y and Z versus observation index (1 to 25)]
[Figure: Scatterplot of X vs Y; X vs Z; Y vs Z]
Runs Test: X; Y; Z
Descriptive Statistics
Variable  N   K        Obs ≤ K  Obs > K
X         25  1,2587   13       12
Y         25  11,4759  14       11
Z         25  25,5712  12       13
K = sample mean
Test
Null hypothesis H₀: The order of the data is random
Alternative hypothesis H₁: The order of the data is not random
Number of Runs
Variable Observed Expected P-Value
X 15 13,48 0,534
Y 10 13,32 0,168
Z 14 13,48 0,831
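The expected number of runs and the p-values in the table can be checked with the usual normal approximation of the runs test. The sketch below takes the counts above and below the mean and the observed number of runs directly from the table; the function name `runs_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def runs_test(n_below, n_above, observed_runs):
    """Return (expected runs, two-sided p-value) under the normal approximation."""
    n = n_below + n_above
    prod = 2 * n_below * n_above
    expected = prod / n + 1
    variance = prod * (prod - n) / (n ** 2 * (n - 1))
    z = (observed_runs - expected) / sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return expected, p_value


# (observations <= K, observations > K, observed runs) from the table.
table = {"X": (13, 12, 15), "Y": (14, 11, 10), "Z": (12, 13, 14)}
results = {name: runs_test(*counts) for name, counts in table.items()}

for name, (expected, p_value) in results.items():
    print(f"{name}: expected runs = {expected:.2f}, p-value = {p_value:.3f}")
```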
[Figure: Autocorrelation Function for X and Partial Autocorrelation Function for X (with 5% significance limits), lags 1 to 6]
[Figures: two further pairs of autocorrelation and partial autocorrelation plots, lags 1 to 6; titles not recovered, presumably for Y and Z]
The T2 control chart for the Phase I data after the transformation of Y is:
T² Chart of X; ...; Z
[Figure: T² chart for samples 1 to 25, with UCL=15,51 and Median=3,92]
b mech)
Before applying the control chart to the new observations, the descriptor Y must be transformed with
the same exponent used in Phase I (𝜆 = 0.5, i.e. a square root). The dataset after the transformation is:
X Y Z
26 1,32 5,81262 21,7
27 1,27 4,77236 26,67
28 1,04 2,3515 21,44
29 1,19 3,500465 25,46
30 1,20 1,589665 27,37
[Figure: T² chart for samples 1 to 30, with Median=3,92]
Subgroups omitted from the calculations: 26-30
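The transformation of the new Y values with λ = 0.5 is just a square root. The sketch below applies it to the Y values listed for observations 26 to 30; small differences in the last decimals with respect to the transformed table are to be expected, since the printed Y values are rounded to two decimals.

```python
from math import sqrt

# Y values of the new observations 26-30, as printed in the exam text.
y_new = {26: 33.79, 27: 22.78, 28: 5.53, 29: 12.25, 30: 2.53}

LAMBDA = 0.5  # exponent used in Phase I

# Power transformation y**lambda; for lambda = 0.5 this equals sqrt(y).
y_trans = {k: v ** LAMBDA for k, v in y_new.items()}

for k, v in y_trans.items():
    print(f"{k}: {v:.4f}")
```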
c mech)
A statistical test can be performed after checking the normality and independence of the new
observations of X. However, due to the very small sample size, we can simply assume that it was
drawn from a normal and independent population.
Then, a test to determine whether the two variances are statistically equal is needed:
Descriptive Statistics
Variable N StDev Variance 95% CI for σ
X 25 0,130 0,017 (0,096; 0,191)
Xnew 5 0,105 0,011 (0,044; 0,413)
Test
Null hypothesis H₀: σ₁ / σ₂ = 1
Alternative hypothesis H₁: σ₁ / σ₂ ≠ 1
Significance level α = 0,05
Test
Method Statistic DF1 DF2 P-Value
Bonett * 0,697
Levene 0,58 1 28 0,454
Since there is no statistically significant difference between the variances, a 2-sample t-test for the means with
equal variances can be performed. The result is the following:
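A pooled-variance 2-sample t-test on the X values can be sketched as follows, using `scipy.stats.ttest_ind` with `equal_var=True`. The two samples are the Phase I values (observations 1 to 25) and the new values (observations 26 to 30) as printed in the data table; because those values are rounded, the output may differ slightly from the software result of the original solution.

```python
from scipy import stats

# X values for observations 1-25 (Phase I) as printed in the exam text.
x_phase1 = [0.94, 1.31, 1.06, 1.16, 1.32, 1.40, 1.00, 1.44, 1.34, 1.21,
            1.19, 1.40, 1.17, 1.23, 1.23, 1.37, 1.48, 1.31, 1.19, 1.33,
            1.26, 1.22, 1.39, 1.28, 1.25]
# X values for the new observations 26-30.
x_new = [1.32, 1.27, 1.04, 1.19, 1.20]

# Two-sided test of H0: equal means, assuming equal variances (pooled estimate).
result = stats.ttest_ind(x_phase1, x_new, equal_var=True)
t_stat, p_value = result.statistic, result.pvalue
dof = len(x_phase1) + len(x_new) - 2

print(f"t = {t_stat:.3f}, DF = {dof}, p-value = {p_value:.3f}")
```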
a man)
Let's apply PCA based on the correlation matrix, as the three descriptors are defined on different
scales.
Eigenvectors
Variable PC1 PC2 PC3
X 0,662 0,302 -0,686
Ytrans 0,303 -0,945 -0,123
Z 0,686 0,126 0,717
[Figure: loading plots of PC1, PC2 and PC3 versus the variables X, Y, Z]
In order to retain at least 75% of the overall variability, the first two PCs are retained.
b man)
[Figure: T² chart on the retained PCs for samples 1 to 25, with Median=2,29]
Subgroups omitted from the calculations: 26-30
c man)
The 5 new observations must be projected onto the first two principal components but, first, the
new observations of variable Y must be transformed with the same exponent used in Phase I (𝜆 =
0.5). The dataset after the transformation is:
X Y Z
26 1,32 5,81262 21,7
27 1,27 4,77236 26,67
28 1,04 2,3515 21,44
29 1,19 3,500465 25,46
30 1,20 1,589665 27,37
Since the correlation matrix was used, the scores are computed from the standardized data.
The Phase I means and standard deviations are the following:
𝑋̅ = 1,26 𝑠𝑥 = 0,13
𝑌̅ = 3,07 𝑠𝑦 = 1,46
𝑍̅ = 25,57 𝑠𝑧 = 2,60
By standardizing the new 5 observations with respect to the above means and standard deviations and
projecting them along the first two PCs, the resulting T2 control chart is the following:
[Figure: T² chart on the retained PCs for samples 1 to 30, with Median=2,29]
Subgroups omitted from the calculations: 26-30
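The standardization and projection step can be sketched with the rounded values printed above: Phase I means and standard deviations, the PC1 and PC2 eigenvector columns, and the transformed new observations. The helper names are illustrative. The eigenvalues are not reported in the text, so this sketch stops at the scores and does not compute the T² statistic.

```python
# Transformed new observations (X, Ytrans, Z) for samples 26-30.
new_obs = [
    (1.32, 5.81262, 21.70),
    (1.27, 4.77236, 26.67),
    (1.04, 2.3515, 21.44),
    (1.19, 3.500465, 25.46),
    (1.20, 1.589665, 27.37),
]

# Phase I means and standard deviations (rounded values from the text).
means = (1.26, 3.07, 25.57)
sds = (0.13, 1.46, 2.60)

# Loadings of the first two principal components (rows X, Ytrans, Z).
pc1 = (0.662, 0.303, 0.686)
pc2 = (0.302, -0.945, 0.126)


def standardize(row):
    """Center and scale one observation with the Phase I statistics."""
    return [(v - m) / s for v, m, s in zip(row, means, sds)]


def project(z, loading):
    """Score of a standardized observation along one principal component."""
    return sum(a * b for a, b in zip(z, loading))


scores = [(project(z, pc1), project(z, pc2)) for z in map(standardize, new_obs)]

for sample, (s1, s2) in zip(range(26, 31), scores):
    print(f"{sample}: PC1 score = {s1:.3f}, PC2 score = {s2:.3f}")
```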
Exercise 3
Xbar-S control chart with known parameters:
Alpha 0,005
Alpha/2 0,0025
K=z_alpha/2 2,807
n 5
c4(5) 0,94
Chart        UCL    CL     LCL
X-bar chart  0,901  0,75   0,599
S-chart      0,228  0,113  0
H1: 𝜇𝑛𝑒𝑤 = 𝜇 + 𝛿𝜎
Xbar chart:
S chart:
$$\beta_S = P(LCL_S \le S \le UCL_S \mid H_1) = P\left(\frac{(n-1)}{\sigma^2}\,LCL_S^2 \le \frac{(n-1)}{\sigma^2}\,S^2 \le \frac{(n-1)}{\sigma^2}\,UCL_S^2\right)$$
This probability is constant with respect to 𝛿, because a shift of the mean does not change the distribution of S.
Eventually:
𝛽 = 𝑃(𝑛𝑜 𝑎𝑙𝑎𝑟𝑚|𝐻1 ) = 𝑃(𝑛𝑜 𝑎𝑙𝑎𝑟𝑚 𝑓𝑟𝑜𝑚 𝑋𝑏𝑎𝑟 𝑐ℎ𝑎𝑟𝑡|𝐻1) ∗ 𝑃(𝑛𝑜 𝑎𝑙𝑎𝑟𝑚 𝑓𝑟𝑜𝑚 𝑆 𝑐ℎ𝑎𝑟𝑡|𝐻1 )
By computing the Type II error for different values of 𝛿, we can see that 𝛽 = 0,865 when 𝛿 = 0,8
standard deviation units. Thus, the out-of-control mean is 𝜇𝑛𝑒𝑤 = 0,846.
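The search over 𝛿 can be sketched numerically. Several inputs here are assumptions rather than values stated in the text: σ = 0.12 is inferred from the S-chart center line (CL = c4·σ with c4 = 0.94), the X-bar contribution uses the standard expression Φ(K − 𝛿√n) − Φ(−K − 𝛿√n), and the chi-square CDF is written in its closed form for n − 1 = 4 degrees of freedom. The function names are illustrative, and the 𝛿 that the scan points to depends on these assumed inputs and on rounding, so it may not match the rounded figure quoted above exactly.

```python
from math import exp, sqrt
from statistics import NormalDist

K = 2.807        # z_{alpha/2} with alpha = 0.005
N = 5            # subgroup size
MU0 = 0.75       # in-control mean (X-bar chart center line)
SIGMA = 0.12     # ASSUMPTION: inferred from CL_S = c4 * sigma = 0.113
UCL_S, LCL_S = 0.228, 0.0


def chi2_cdf_df4(q):
    """CDF of a chi-square variable with 4 degrees of freedom (closed form)."""
    return 1.0 - exp(-q / 2.0) * (1.0 + q / 2.0)


def beta_xbar(delta):
    """P(no alarm from the X-bar chart) when the mean shifts by delta*sigma."""
    phi = NormalDist().cdf
    return phi(K - delta * sqrt(N)) - phi(-K - delta * sqrt(N))


def beta_s():
    """P(no alarm from the S chart); unaffected by a mean shift."""
    scale = (N - 1) / SIGMA ** 2
    return chi2_cdf_df4(scale * UCL_S ** 2) - chi2_cdf_df4(scale * LCL_S ** 2)


def beta_total(delta):
    """Overall Type II error, assuming independent X-bar and S statistics."""
    return beta_xbar(delta) * beta_s()


# Scan a grid of shifts and report beta and the corresponding shifted mean.
for step in range(0, 31):
    delta = step * 0.05
    print(f"delta={delta:.2f}  beta={beta_total(delta):.4f}  "
          f"mu_new={MU0 + delta * SIGMA:.3f}")
```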