IE380 Unit 9

IE380

Quality Control and Improvement
Unit 9
Design of Experiments
Full Factorial Modeling, Experimentation, and Calculation
Noise Estimation
Optimizing Parameter Settings
Design of Experiments
• We use DOE to help us analyze a process, to study the effects of changing factors in
our production which we believe to be fundamental to our final product's quality.

• We hope that by changing these factors in an intelligent and structured way we will be able to
remove faults and/or reduce variation in our product.
Design of Experiments
• We assume that there is some sort of underlying mathematical model for our
system; varying the inputs in a certain way will alter our response.
• We use DOE to test our model and we use our model to determine the structure
of our DOE.

• If we think that a response varies linearly with an input, how many points do we
need to determine the relationship?
Linear means they have the relationship $y = ax + b$, and we need two points to draw
a line (or we need two equations to solve for the two unknowns, $a$ and $b$).
• If it is a quadratic?
We need 3 points $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ to solve a quadratic equation $y = ax^2 + bx + c$.
Design of Experiments
a) The relationship is actually curvilinear, but this will never be detected because the chosen model
is linear and only two points are tested.

b) Thanks to the center point, model checking will reveal that a straight line is a poor model.

c) If the relationship is known to be a straight line, experimenting at many levels is unnecessary.

d) If we know a priori that the relationship is a straight line, then it is best to test two more
extreme levels and use the additional tests for replication (that is, to observe the experimental
error).
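A minimal sketch of cases (a) and (b), assuming a made-up curved response (the function `true_response` and its coefficients are hypothetical, chosen only for illustration). A line solved from the two extreme levels passes through both of them exactly, so only the center point can expose the curvature.

```python
def true_response(x):
    # Hypothetical curvilinear process, for illustration only
    return 90.0 + 2.0 * x - 5.0 * x ** 2

def fit_line(p_low, p_high):
    """Solve y = a*x + b exactly from two points."""
    (x1, y1), (x2, y2) = p_low, p_high
    a = (y2 - y1) / (x2 - x1)
    b = y1 - a * x1
    return a, b

low, high, center = -1.0, 1.0, 0.0
a, b = fit_line((low, true_response(low)), (high, true_response(high)))

# Lack of fit at the center point: observed response minus the line's prediction
center_gap = true_response(center) - (a * center + b)
print(f"line: y = {a:.2f}x + {b:.2f}, gap at center = {center_gap:.2f}")
```

A gap that is large relative to the experimental error suggests the straight-line model is inadequate; judging "large" still requires an error estimate from replication.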
Design of Experiments
• There is no point in over-specifying functions:
• If you believe two points will determine the relation, then test two.
• Additional testing can be used for replication purposes, to verify your data and quantify your
error, because there will always be error, or variation, in your experimentation.

Question: How do we decide on the experimental settings for each variable?
Ask experts / Pre-testing
The most popular experimental designs are two-level designs. What should those two levels be?
Nominal ±10% or ±15%
Design of Experiments
• Assume we have factors and their levels. What is next?
• Our task is to decide which ones to
• Include in the experiment.
• Block.
• Randomize away.
• Estimate.
• Let's look at an example.
Example
An online media merchant is developing a new web site. In order to do so they
want to test alternate configurations of the check-out operation, seeking the
configuration that yields the best user experience.
Question: What should we include, block, randomize, and estimate?
One type of variation the site experiences is in the specific purchase: the number of
items, the type, the payment method, the delivery method. To properly test the
different configurations, a series of test purchases will be made, of different types
of orders.
This is an example of randomness inherent in the use of the process, outer noise,
which the company wishes to include in their experiment.
Example
Another source of variation will be the specific user. In order to make the
comparison as fair as possible, all pairs of purchases should be made by the same
user. This is an example of blocking.
The user induces a source of variation in the operation. By using the same user for
each of the experiments, we are removing this source of variation, by standardizing
across all tests.
Note: A valid question at this point is why don't we just include different users as a
factor in the experiment? We very well could, and treat this as outer noise, to be
included in the experiment. Deciding which factors to include and which to block is
not always an easy decision.
Example
A third source of variation between the experiments could be which site was used
first and which second for each purchase. While we don't immediately see how this
should matter, it might.
To overcome this, we randomly choose (by coin flip say) which site to use first. We
do not alternate. This in itself would induce a pattern in our experiments, which we
wish to avoid.
This is an example of randomization.
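A minimal sketch of the coin flip, assuming the two configurations are simply labelled "A" and "B" (hypothetical labels) and that ten test purchases are planned. Each purchase gets its own independent draw, so no alternating pattern is built into the plan.

```python
import random

def first_site_per_purchase(n_purchases, seed=None):
    """Independently pick which site is used first for each test purchase."""
    rng = random.Random(seed)  # the seed only makes the plan reproducible
    return [rng.choice(("A", "B")) for _ in range(n_purchases)]

plan = first_site_per_purchase(10, seed=380)
print(plan)
```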
Example
Finally, after running our experiments, we would still expect to have common cause
variation in each of the test purchases.
To estimate this, we might replicate our experiments, and then use ANOVA
techniques to try to isolate how much of the relative differences between the sites
were due to the change in configuration, and how much was attributable to
random variation.
This is an example of estimation.
$2^k$ Full Factorial Experimentation
• We will now discuss, through an example, the use of $2^k$ full factorial experiments.

• These are called $2^k$ because we are testing $k$ factors, against each other, each at
two levels (off/on, or low/high). This gives rise to a total of $2^k = 2 \times 2 \times \cdots \times 2$
experiments.

• For now we will assume we will run all of the experiments, although later we will
explore how the number of experiments can be reduced, without the loss of too
much information.
Example
Thumb tacks. A company is preparing to manufacture metal thumb tacks. This is to
be done by applying a fixative to either the cap or the pin, joining them, and then
spot welding the junction.

They are currently at the stage where they need to decide the levels of three
different factors in the production:
1. Where to put the fixative (cap or pin).
2. What temperature to use to do the spot weld, low or high.
3. Which type of fixative to use (A or B).
Example
• This problem has three different factors. Two of them are discrete factors (1 and
3): there are only two choices for the company, all one way or all the other.
• The second factor, temperature, is a continuous factor: they can alter the
temperature along a range of values, as they see fit.
• We will test each of the three factors at two different levels, yielding $2^3$, or eight,
experiments.
Example
• To perform the experiments (that is, to tell the operator which settings go with which experiment)
we construct a design matrix. Run the 8 experiments in random order.

Test  X1  X2  X3  Location  Temp  Type
1     -1  -1  -1  Cap       Low   A
2     +1  -1  -1  Pin       Low   A
3     -1  +1  -1  Cap       High  A
4     +1  +1  -1  Pin       High  A
5     -1  -1  +1  Cap       Low   B
6     +1  -1  +1  Pin       Low   B
7     -1  +1  +1  Cap       High  B
8     +1  +1  +1  Pin       High  B
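A minimal sketch of how such a design matrix could be generated, assuming the coding -1 = Cap/Low/A and +1 = Pin/High/B. The names `full_factorial` and `LABELS` are illustrative, not part of the course material.

```python
import random
from itertools import product

LABELS = [
    {-1: "Cap", +1: "Pin"},    # X1: fixative location
    {-1: "Low", +1: "High"},   # X2: weld temperature
    {-1: "A", +1: "B"},        # X3: fixative type
]

def full_factorial(k):
    """All 2**k runs in standard order (X1 alternates fastest)."""
    # product() varies its last position fastest, so reverse each tuple
    return [tuple(reversed(levels)) for levels in product((-1, +1), repeat=k)]

design = full_factorial(3)
for test, run in enumerate(design, start=1):
    settings = [LABELS[j][level] for j, level in enumerate(run)]
    print(test, run, settings)

# Randomize the order in which the tests are actually performed
run_order = random.sample(range(1, len(design) + 1), k=len(design))
print("run order:", run_order)
```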
Example
This also has a geometric interpretation:

My experiments are the vertices of this cube.
Example
• So assume now that we have run our experiments (in a randomized order!), with the
following yield of good tacks (per batch of 100):

Test  X1  X2  X3  Yield
1     -1  -1  -1  99
2     +1  -1  -1  94
3     -1  +1  -1  88
4     +1  +1  -1  85
5     -1  -1  +1  98
6     +1  -1  +1  92
7     -1  +1  +1  90
8     +1  +1  +1  91

Pictorially: [figure: the eight yields at the vertices of the cube]
Example
Question: How do we expect these data points to compare to the original data? Do we
expect these points to (all) be better than the original setting? Do we expect them to be
better on average?

The average of these should be close to the old setting. Why? We are varying on either side of it.

But we do not care about the average. We are just looking for one 'very good' setting.
Calculation of Main Effects
• We first seek information on how varying each of the individual factors changed
our yield.
• How can we determine this, for fixative location (𝑋1 ), for example?

Which ones do you need to compare?
Calculation of Main Effects
• How can we determine this, for fixative location ($X_1$), for example?

So we compare the following pairs (pin minus cap, other factors held fixed):

$y_2 - y_1 = -5$
$y_4 - y_3 = -3$
$y_6 - y_5 = -6$
$y_8 - y_7 = +1$

Therefore the average effect of changing from cap to pin is

$\frac{-5 - 3 - 6 + 1}{4} = -3.25$
Calculation of Main Effects
• This is the main or location effect for the first variable, which we
denote by $E_1$. What does this mean?
$E_1 = -3.25$: moving the fixative from cap to pin reduces yield by 3.25 tacks
on average.
• Note that this is an average, so you must be careful!

• Let’s calculate the location effect for the other variables.


Calculation of Main Effects
• To calculate the location effect of temperature, we compare these planes:

So we compare the following pairs:

$y_3 - y_1 = -11$
$y_4 - y_2 = -9$
$y_7 - y_5 = -8$
$y_8 - y_6 = -1$

Therefore the average effect of changing from low to high is

$E_2 = \frac{-11 - 9 - 8 - 1}{4} = -7.25$
Raising the temperature from low to high reduces yield by 7.25 tacks on average.
Calculation of Main Effects
• And lastly, the type of fixative:

So we compare the following pairs:

$y_5 - y_1 = -1$
$y_6 - y_2 = -2$
$y_7 - y_3 = +2$
$y_8 - y_4 = +6$

Therefore the average effect of changing from A to B is

$E_3 = \frac{-1 - 2 + 2 + 6}{4} = +1.25$
Changing the type of fixative from A to B increases yield by 1.25 tacks on average.
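The three main effects can be checked with a short sketch. It uses the fact that the average of the four paired differences equals the mean yield at the +1 level minus the mean yield at the -1 level. The names `DESIGN`, `YIELDS` and `main_effect` are illustrative.

```python
# Coded settings (X1, X2, X3) and yields for tests 1..8, in standard order
DESIGN = [(-1, -1, -1), (+1, -1, -1), (-1, +1, -1), (+1, +1, -1),
          (-1, -1, +1), (+1, -1, +1), (-1, +1, +1), (+1, +1, +1)]
YIELDS = [99, 94, 88, 85, 98, 92, 90, 91]

def main_effect(design, y, j):
    """Mean response at the high level of factor j minus mean at the low level."""
    high = [yi for x, yi in zip(design, y) if x[j] == +1]
    low = [yi for x, yi in zip(design, y) if x[j] == -1]
    return sum(high) / len(high) - sum(low) / len(low)

E1, E2, E3 = (main_effect(DESIGN, YIELDS, j) for j in range(3))
print(E1, E2, E3)
```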
Calculation of Main Effects
• When looking at the main effects, both the sign, and the magnitude of the change are
important. Why?
Because the sign (+ or -) tells you which way to go, and the magnitude tells you how important the factor is.
• Does this mean we can just put the best three levels together?
• Location: -1
• Temperature: -1
• Fixative: +1
and be done with it? Why or why not?
No. The run at (-1, -1, +1) yields 98, which is less than the 99 at (-1, -1, -1).
This is due to relationships between the variables: cross and interaction effects!
Interaction Effects
• Let's look at how location and temperature affect each other, by
averaging out the fixative effect.
Interaction Effects
• This is called a two-way diagram. Averaged over the two fixative types, the mean yields are
98.5 at (cap, low), 93 at (pin, low), 89 at (cap, high) and 88 at (pin, high).
The interaction effect, $E_{12}$, is equal to:
$E_{12} = \frac{(88 - 89) - (93 - 98.5)}{2} = \frac{-1 - (-5.5)}{2} = 2.25$
In this form, what this represents is not very clear. But rewritten:
$E_{12} = \frac{88 + 98.5 - 89 - 93}{2} = \frac{-1 - (-5.5)}{2} = 2.25$
OR
$E_{12} = \frac{(+,+) + (-,-) - (-,+) - (+,-)}{2} = 2.25$
• What this term represents may be more clear. Any ideas?
• Correlation! What we are doing is (positively correlated - negatively correlated).
• On average the yield goes up by 2.25 when $X_1$ and $X_2$ 'agree' in sign.
Interaction Effects
• For the interaction effect of temperature and fixative, $E_{23}$:

$E_{23} = \frac{90.5 + 96.5 - 95 - 86.5}{2} = 2.75$
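A minimal sketch of the two-way-table calculation: average the yields in each of the four (sign, sign) cells for a pair of factors, then take ('agree' cells minus 'disagree' cells) divided by 2. The helper names are illustrative.

```python
DESIGN = [(-1, -1, -1), (+1, -1, -1), (-1, +1, -1), (+1, +1, -1),
          (-1, -1, +1), (+1, -1, +1), (-1, +1, +1), (+1, +1, +1)]
YIELDS = [99, 94, 88, 85, 98, 92, 90, 91]

def two_way_table(design, y, i, j):
    """Mean yield for each combination of the levels of factors i and j."""
    cells = {}
    for x, yi in zip(design, y):
        cells.setdefault((x[i], x[j]), []).append(yi)
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}

def interaction_effect(table):
    """(cells where the signs agree) minus (cells where they differ), over 2."""
    agree = table[(+1, +1)] + table[(-1, -1)]
    differ = table[(+1, -1)] + table[(-1, +1)]
    return (agree - differ) / 2

E12 = interaction_effect(two_way_table(DESIGN, YIELDS, 0, 1))
E23 = interaction_effect(two_way_table(DESIGN, YIELDS, 1, 2))
print(E12, E23)
```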
Interaction Effects
• We can also get the other two-way effect, $E_{13}$, and the three-way
effect, $E_{123}$, in a similar manner.

Question: Isn't there a better way of calculating these things?

Answer: Yes.
Calculation Method-Linear Regression
• We will now discuss the calculation method for DOE, which is an application of
linear regression.
Test  I   X1  X2  X3  X1X2  X1X3  X2X3  X1X2X3  Yield
1     1   -1  -1  -1  +1    +1    +1    -1      99
2     1   +1  -1  -1  -1    -1    +1    +1      94
3     1   -1  +1  -1  -1    +1    -1    +1      88
4     1   +1  +1  -1  +1    -1    -1    -1      85
5     1   -1  -1  +1  +1    -1    -1    +1      98
6     1   +1  -1  +1  -1    +1    -1    -1      92
7     1   -1  +1  +1  -1    -1    +1    -1      90
8     1   +1  +1  +1  +1    +1    +1    +1      91

How did we get the columns for this matrix? Each product column is the component-wise product of
its factor columns, e.g. $X_1 X_2 = X_1 \cdot X_2$.
Calculation Method-Linear Regression
• How do you think we calculate $E_1$ now?
$E_1 = \langle X_1 \cdot \text{yield} \rangle \div 4$
$E_{23} = \langle X_2 X_3 \cdot \text{yield} \rangle \div 4$
• What do we divide by?
• We always divide by $2^{k-1} = 2^{3-1} = 4$

$E_1 = \frac{-99 + 94 - 88 + 85 - 98 + 92 - 90 + 91}{4} = -3.25$
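A minimal sketch of the column method for all seven effects at once: build each sign column as a component-wise product, take its inner product with the yields, and divide by $2^{k-1}$. The function name `effects` and the dictionary keys ("E1", "E12", ...) are illustrative.

```python
from itertools import combinations
from math import prod

DESIGN = [(-1, -1, -1), (+1, -1, -1), (-1, +1, -1), (+1, +1, -1),
          (-1, -1, +1), (+1, -1, +1), (-1, +1, +1), (+1, +1, +1)]
YIELDS = [99, 94, 88, 85, 98, 92, 90, 91]

def effects(design, y):
    """Main and interaction effects of a 2**k full factorial design."""
    k = len(design[0])
    result = {}
    for order in range(1, k + 1):
        for idx in combinations(range(k), order):
            # Sign column: component-wise product of the chosen factor columns
            column = [prod(x[i] for i in idx) for x in design]
            contrast = sum(c * yi for c, yi in zip(column, y))
            name = "E" + "".join(str(i + 1) for i in idx)
            result[name] = contrast / 2 ** (k - 1)
    return result

for name, value in effects(DESIGN, YIELDS).items():
    print(name, value)
```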
Mathematical Model
• We started all of this with the hope of establishing a mathematical model for the
system.
• For most factorial experiments, the implicit model is a linear model with
coefficients for each of the possibly significant effects:
$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + b_{12} x_1 x_2 + b_{13} x_1 x_3 + b_{23} x_2 x_3 + b_{123} x_1 x_2 x_3 + \epsilon$
• What is $b_0$? And how do we calculate it?
$b_0$: intercept, the expected yield when all $X_i = 0$
$b_0 = \langle I \cdot \text{yield} \rangle \div 8 = \text{Avg}(Y_1, \ldots, Y_8)$
• For our example $b_0 = 92.125$
Mathematical Model
• How do we get the other $b_i$'s?
• If $X_1$ changes by 1, $Y$ changes by $b_1$.
• If $X_1$ changes from cap (-1) to pin (+1), a change of 2, $Y$ changes by $E_1$, so $b_i = \frac{E_i}{2}$.
• What is $\epsilon$? How do we estimate it?
• Noise or error.
• Assume $\epsilon \sim N(0, \sigma_\epsilon^2)$
• Estimate $\sigma_\epsilon^2$ by replication.
• Substituting in, we get:
$y = 92.125 - 1.625 x_1 - 3.625 x_2 + 0.625 x_3 + 1.125 x_1 x_2 + 0.375 x_1 x_3 + 1.375 x_2 x_3 + 0.625 x_1 x_2 x_3 + \epsilon$
• Which factors do you believe are the most important?
𝑏0 , 𝑋1 , 𝑋2 , 𝑋2 𝑋3 , 𝑋1 𝑋2
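A minimal sketch that evaluates the fitted model, with the coefficients copied from the equation above and keyed by the (zero-based) factors in each term. The representation is illustrative. Because the full model has eight coefficients for eight runs, it reproduces every observed yield exactly, which is why it tells us nothing about noise on its own.

```python
from math import prod

# Coefficients keyed by the factor indices in each term; () is the intercept b0
COEF = {(): 92.125, (0,): -1.625, (1,): -3.625, (2,): 0.625,
        (0, 1): 1.125, (0, 2): 0.375, (1, 2): 1.375, (0, 1, 2): 0.625}

DESIGN = [(-1, -1, -1), (+1, -1, -1), (-1, +1, -1), (+1, +1, -1),
          (-1, -1, +1), (+1, -1, +1), (-1, +1, +1), (+1, +1, +1)]
YIELDS = [99, 94, 88, 85, 98, 92, 90, 91]

def predict(x, coef):
    """Evaluate the model at coded settings x (noise term omitted)."""
    # prod() of an empty sequence is 1, which handles the intercept
    return sum(b * prod(x[i] for i in idx) for idx, b in coef.items())

predictions = [predict(x, COEF) for x in DESIGN]
print(predictions)
```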
Mathematical Model
Typically, effects involving three or more factors are disregarded as noise. What do you
think about this practice?
Bad: Maybe the three-factor interaction matters!
Good: Simplifies the model, usually OK, often not that telling.
Good idea, but filter with process knowledge.
• If we decide that $x_1$ and $x_2$, and their cross term, are important, a possible model
for our system is:
$y = 92.125 - 1.625 x_1 - 3.625 x_2 + 1.125 x_1 x_2 + \epsilon$
• What do we do now?
Refine and replicate
Model Refinements and Replications
• First of all, what are replicate experiments, or replications?
Multiple experiments run at exactly the same settings.
Why? To estimate noise!
• For further study we might reduce the number of variables we think are significant and
re-run the experiment, or reinterpret the data under this assumption:

Test  X1  X2  X1X2  Y1  Y2  avg   Δ
1     -1  -1  +1    99  98  98.5  -1
2     +1  -1  -1    94  92  93    -2
3     -1  +1  -1    88  90  89    +2
4     +1  +1  +1    85  91  88    +6

Why only four rows? Where did this data come from?
We dropped $X_3$: the runs at $X_3 = -1$ ($Y_1$) and $X_3 = +1$ ($Y_2$) are treated as the same setting,
so the difference Δ = Y2 - Y1 is attributed to noise.
Are these really replications? What do you think of this practice?
Yes: If $X_3$ does not matter, then they are.
No: They have different $X_3$ values, so the settings are not the same.
Model Refinements and Replications
• If we decide 𝑋3 is not significant, where should we set it? How can we
decide?
𝑋3 =-1 or 𝑋3 =+1. What are things that we should consider?
• Remember 𝐸3 = 1.25 so set to +1
• Cost
• Variability
• Should we drop 𝑋3 now, or replicate first?
• Drop: Focus on key variables, fewer experiments, saves money and time.
• Replicate: Maybe it matters!
Model Refinements and Replications
• This is one reason we favor sequential testing, so we can dynamically design our
experimental program in response to the data we are seeing.
IN PRACTICE: We fit our model to each replication to see that they tend to agree.
1. If they do not essentially agree there might be a problem with the model or this
particular replication. In this case stop doing replications and look for a special
cause affecting your experiments.
2. If they do essentially agree your final model will be based on the average of all of
your experimental results.
Noise Estimation
• It is important that we try to determine the noise, or variance, inherent in our
observations. There are different methods of doing this - the best involves replicate
experiments.
• If an experiment is replicated, we may use standard regression techniques to estimate
the noise and fit. These are summarized below, but may be done implicitly in Excel.
1. Assume that the variance of each individual response is the same, independent of the
specific combination we are considering.
2. Let 𝑛 denote the number of replications, and 𝑚 be the number of experiments in each
replication.
So for this table, 𝑛 = 2 and 𝑚 = 4.
Noise Estimation
3. For $i = 1, \ldots, m$:
$s_i^2 \triangleq \frac{\sum_{k=1}^{n} (y_{ik} - \bar{y}_i)^2}{n - 1}$
So for the example:

Test  X1  X2  X1X2  Y1  Y2  avg   Δ
1     -1  -1  +1    99  98  98.5  -1
2     +1  -1  -1    94  92  93    -2
3     -1  +1  -1    88  90  89    +2
4     +1  +1  +1    85  91  88    +6

$\bar{y}_1 = 98.5$, $s_1^2 = \frac{(99 - 98.5)^2 + (98 - 98.5)^2}{1} = 0.5$
$\bar{y}_2 = 93$, $s_2^2 = 2$
$\bar{y}_3 = 89$, $s_3^2 = 2$
$\bar{y}_4 = 88$, $s_4^2 = 18$
Noise Estimation
4. Calculate the pooled variance estimator
$s_p^2 \triangleq \frac{\sum_{i=1}^{m} s_i^2}{m}$
So for the example, with $s_1^2 = 0.5$, $s_2^2 = 2$, $s_3^2 = 2$, $s_4^2 = 18$:
$s_p^2 = \frac{0.5 + 2 + 2 + 18}{4} = 5.625$
5. Calculate the estimated variance of each effect:
$s_e^2 \triangleq \frac{4 s_p^2}{n \times m}$
For our example it is
$s_e^2 = \frac{4 \times 5.625}{4 \times 2} = 2.8125$
6. Determine the sample variance of the mean response:
$s_{ave}^2 \triangleq \frac{s_p^2}{n \times m}$ (= 0.7031 in our example)
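Steps 3 to 6 can be sketched in a few lines, assuming the four settings and two replicate columns (Y1, Y2) from the example. Variable names are illustrative.

```python
# (Y1, Y2) for each of the m = 4 settings, n = 2 replications
replicates = [(99, 98), (94, 92), (88, 90), (85, 91)]
m = len(replicates)
n = len(replicates[0])

def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

s2 = [sample_variance(r) for r in replicates]  # step 3: per-setting variances
s_p2 = sum(s2) / m                             # step 4: pooled variance
s_e2 = 4 * s_p2 / (n * m)                      # step 5: variance of an effect
s_ave2 = s_p2 / (n * m)                        # step 6: variance of the mean response

print(s2, s_p2, s_e2, s_ave2)
```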
Noise Estimation
7. We can get sample standard deviations by taking the square root of the above
values.
8. To test whether an effect is statistically significant, we can test the hypothesis
that $\mu_{\text{effect}_i} = 0$. We will include only factors which clearly appear to be
important.
$t = \frac{E_i - 0}{s_{\text{effect}}} \sim t_m$   (For $X_1$: $E_1 = -3.25$, $s_{\text{effect}} = \sqrt{2.8125} = 1.6771$, $|t| = 1.9379$)
Recall that the $t$ above implies the factor is significant if it falls outside of the
interval $(t_{m,\alpha/2}, t_{m,1-\alpha/2})$. Alternatively, we can calculate the two-tailed p-value in Excel:
TDIST(t,m,tails)=TDIST(1.9379,4,2)=0.1247
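A minimal sketch of the same test without a spreadsheet. To stay dependency-free it uses the closed-form CDF of Student's t for exactly 4 degrees of freedom, so the helper `t_cdf_df4` is valid only for this example; for other degrees of freedom a statistics library would be needed.

```python
from math import sqrt

E1 = -3.25      # main effect of X1 from the example
s_e2 = 2.8125   # estimated variance of an effect (step 5)

t = E1 / sqrt(s_e2)

def t_cdf_df4(x):
    """CDF of Student's t with exactly 4 degrees of freedom (closed form)."""
    u = x / sqrt(1 + x * x / 4)
    return 0.5 + 0.375 * u * (1 - u * u / 12)

# Two-tailed p-value, the quantity Excel's TDIST(|t|, 4, 2) reports
p_two_tailed = 2 * (1 - t_cdf_df4(abs(t)))
print(f"t = {t:.4f}, two-tailed p = {p_two_tailed:.4f}")
```

With a p-value of roughly 0.12, the location effect would not be declared significant at the usual 5% level on this evidence alone.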
Noise Estimation

• We can use a similar method to derive confidence intervals for $\mu_{E_i}$. If
such an interval included the value zero, we would assume the factor
was not significant.
THIS IS HOW WE TEST IF FACTORS ARE SIGNIFICANT.
Noise Estimation
• If experiments are not replicated, we can either:
1. Assume our data is normal, and plot it. We look for data points which are
outliers, and assume these are due to significant effects.
This is really only good for $2^4$ and larger.
2. Assume that higher order effects (three factor and greater) are solely due to
noise, and use these terms to estimate the sample variation. (Taguchi does this.
Complicated.)
• Neither of these methods is as good as replication, so if at all possible, replicate,
replicate, replicate.
Question: What if none of our factors are significant? Where should we set the
insignificant factors?
Noise Estimation
• What if none of our factors are significant?
• Try other factors / Expand testing width
• Try more experiments
• Take most significant- This is statistics.
• Where should we set the insignificant factors?
• Sign
• Cost
• Robustness
• Interaction with other factors
Model Diagnostics
• Is each variable in the model necessary?
• Is the model sufficient to describe the observed behavior?
Given an equation which can predict the performance of our system, which in our
case is
$y = 92.125 - 1.625 x_1 - 3.625 x_2 + 1.125 x_1 x_2 + \epsilon$
(for test 1, $x_1 = x_2 = -1$, this predicts $\hat{y}_1 = 98.5$),
we can calculate the residuals for our experiments, and use these to test the
validity of our model.
The residuals are calculated by:
$e_{ij} \triangleq y_{ij} - \hat{y}_i$   ($e_{11} = 0.5$, $e_{12} = -0.5$)
where $\hat{y}_i$ is the response which our model predicts, and $y_{ij}$ is the actual
response of the $i$th experimental setting on its $j$th replication. We do this for all of
the observations in the experiment.
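A minimal sketch of the residual calculation for the reduced model, using the four settings and the two replicate columns of the example. Names are illustrative.

```python
SETTINGS = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]       # (x1, x2) for tests 1..4
OBSERVED = [(99, 98), (94, 92), (88, 90), (85, 91)]       # (Y1, Y2) per setting

def predict(x1, x2):
    """Reduced model from the text, without the noise term."""
    return 92.125 - 1.625 * x1 - 3.625 * x2 + 1.125 * x1 * x2

residuals = [[y - predict(x1, x2) for y in ys]
             for (x1, x2), ys in zip(SETTINGS, OBSERVED)]

for test, row in enumerate(residuals, start=1):
    print(test, row)
```

In this example the model has four coefficients for four settings, so each prediction equals the setting's average and the residuals within a setting sum to zero; the diagnostic value lies in their spread and in any pattern against run order or factor levels.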
Model Diagnostic
• If we plot our residuals, they should be:
• Centered about zero.
• Normally distributed.
• Demonstrating no patterns or correlations.
• How can we check this?
1. Check to see if data is Normally distributed.
2. Plot the residuals against
(a) Run order
(b) $\hat{y}_i$'s
(c) $x_i$'s (significant and not)
We can also do standard hypothesis tests to determine whether the
variances are homogeneous.
Final Results
• Once we have determined a model, we can use it to try to maximize our
yield.
• If we have a discrete factor, we can only put in ±1, but for a continuous
factor we have our choice of a value anywhere in the tested range, $x_i \in [-1, 1]$.
• Why do we restrict ourselves to $[-1, 1]$? Maybe some other value we have
not tested is better!
• How do we choose our final variable values?
• Max Yield
• Min Cost
• Min Variance
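A minimal sketch of the "max yield" criterion for the reduced model, assuming $x_1$ (location) is discrete and $x_2$ (temperature) may take any coded value in $[-1, 1]$. The grid step of 0.1 is an arbitrary choice; cost and variance are not modelled here.

```python
def predict(x1, x2):
    """Reduced model from the text, without the noise term."""
    return 92.125 - 1.625 * x1 - 3.625 * x2 + 1.125 * x1 * x2

x1_levels = (-1, +1)                          # discrete: cap or pin
x2_grid = [-1 + 0.1 * i for i in range(21)]   # continuous: coded temperature

candidates = [(x1, x2) for x1 in x1_levels for x2 in x2_grid]
best = max(candidates, key=lambda point: predict(*point))
print(best, predict(*best))
```

Because this model is linear in each variable separately, the predicted optimum always falls on a corner of the tested region, which is one reason to ask whether values outside $[-1, 1]$ deserve a follow-up experiment.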
Final Results
• We can also use a contour plot, which is a graph of the expected
response versus some, or all of the input levels, to help us make our
decision.
Final Results
• What if more than one variable setting gives us our optimal (or near-
optimal yield)? Or more precisely:
How can we choose from a set of candidate points that appear very similar
on mean yield?
• Favor lower cost
• Replicate
Say the final model predicts
(-1,-1) → yield 99.2
(+1,-1) → yield 99
Run 20 batches at (-1,-1) and 20 batches at (+1,-1) and compare!
VALIDATION
Finally, ANOVA can be used to attempt to quantify the sources of our variance.
Summary of DOE
The overall procedure we would like to use is:
1. Check for control.
2. Establish control if necessary.
3. Postulate the model.
4. Initial DOE Testing
a) Replication.
b) Fitting the model to the data/Diagnostics
c) Do the experimental results agree?
d) If not, why? → special cause?
e) Drop variables?
Summary of DOE
5. (Possibly) Second round of (more focused) DOE.
a) Replication
b) Fitting the model to the data/Diagnostics
c) Do the experimental results agree?
d) If not, why?
6. VALIDATION: Test a small number of new process settings.
a) If validation successful, recommend new settings and send bill
b) If validation unsuccessful, go to 1.
Next Chapter
Question: What do we do when we lack the time, resources, or desire to
carry out a full battery of experiments?
Answer: Fractional factorial experimentation.
