Design and Analysis of Experiments

Chapter 2
SIMPLE COMPARATIVE
EXPERIMENTS
Dr. Tran Thanh Hung
Department of Automation Technology,
College of Engineering, Can Tho University
Email: tthung@ctu.edu.vn
Chapter objectives
At the end of the chapter, students can

• review basic statistical concepts
• choose sample sizes for two-sample
problems based on the t-test
• analyze results of comparative
experiments using the t-test
• review the assumptions underlying the
t-test and how to check these
assumptions
Contents
• Basic Statistical Concepts
• Simple comparative experiments
– The hypothesis testing framework
– The two-sample t-test
– Checking assumptions, validity
• Sample size determination
Basic Statistical Concepts
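• Probability Distributions:
The probability structure of a random variable.
y continuous, with probability density function f(y):
$$P(a \le y \le b) = \int_a^b f(y)\,dy, \qquad \int_{-\infty}^{\infty} f(y)\,dy = 1$$
y discrete, with probability function p(y_j):
$$P(y_a \le y \le y_b) = \sum_{j=a}^{b} p(y_j), \qquad \sum_{\text{all values of } y_j} p(y_j) = 1 \qquad (2.1)$$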

• Mean of a probability distribution: a measure of its
central tendency or location; also called the expected value, or the
long-run average value, of the random variable y:
$$\mu = E(y) = \begin{cases} \int_{-\infty}^{\infty} y\, f(y)\,dy, & y \text{ continuous} \\ \sum_{\text{all values of } y_j} y_j\, p(y_j), & y \text{ discrete} \end{cases} \qquad (2.2)$$
• Variance: the variability or dispersion of a probability
distribution:
$$V(y) = \sigma^2 = \begin{cases} \int_{-\infty}^{\infty} (y - \mu)^2 f(y)\,dy, & y \text{ continuous} \\ \sum_{\text{all values of } y_j} (y_j - \mu)^2\, p(y_j), & y \text{ discrete} \end{cases} \qquad (2.3)$$
$$\sigma^2 = E\left[(y - \mu)^2\right] \qquad (2.4)$$
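The discrete cases of Eqs. (2.2) and (2.3) can be checked by hand on a small distribution. The sketch below is only an illustration: the fair six-sided die and the helper names `discrete_mean` and `discrete_variance` are assumptions chosen for the example, not part of the course material.

```python
from fractions import Fraction


def discrete_mean(values, probs):
    """mu = sum over all values of y_j * p(y_j) (discrete case of Eq. 2.2)."""
    return sum(y * p for y, p in zip(values, probs))


def discrete_variance(values, probs):
    """sigma^2 = sum over all values of (y_j - mu)^2 * p(y_j) (discrete case of Eq. 2.3)."""
    mu = discrete_mean(values, probs)
    return sum((y - mu) ** 2 * p for y, p in zip(values, probs))


# Illustrative distribution (not from the slides): a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

# The probabilities of a discrete distribution must sum to 1 (Eq. 2.1).
assert sum(probs) == 1

mu = discrete_mean(faces, probs)
var = discrete_variance(faces, probs)
print(mu, var)  # 7/2 35/12
```

Exact fractions are used so that the result can be compared with a hand calculation without rounding error.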
 
Suppose that y1, y2, ..., yn represents a sample of size n.
• Random sample: a sample that has been randomly
selected from the population
• Sample mean:
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \qquad (2.7)$$
• Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1} \qquad (2.8)$$
• Relationship between $\bar{y}$ and $\mu$, and between $S^2$ and $\sigma^2$?
• Number of degrees of freedom: $n - 1$
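Eqs. (2.7) and (2.8) translate directly into code. The following is a minimal sketch with made-up data (the list `data` is purely illustrative); it also cross-checks the hand-written formulas against Python's standard `statistics` module, whose `variance` function uses the same n − 1 divisor.

```python
import math
import statistics


def sample_mean(ys):
    """Sample mean, Eq. (2.7)."""
    return sum(ys) / len(ys)


def sample_variance(ys):
    """Sample variance with n - 1 degrees of freedom, Eq. (2.8)."""
    ybar = sample_mean(ys)
    return sum((y - ybar) ** 2 for y in ys) / (len(ys) - 1)


# Illustrative observations (not data from the slides).
data = [2, 4, 4, 4, 5, 5, 7, 9]

ybar = sample_mean(data)
s2 = sample_variance(data)

# Cross-check against the standard library implementations.
assert math.isclose(ybar, statistics.mean(data))
assert math.isclose(s2, statistics.variance(data))
print(ybar, s2)
```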
• Normal Distribution:
$$y \sim N(\mu, \sigma^2)$$
If $\mu = 0$ and $\sigma^2 = 1$
→ Standard Normal Distribution
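Any normal variable can be reduced to the standard normal by the transformation z = (y − μ)/σ. The sketch below illustrates this with `statistics.NormalDist` from the Python standard library; the values of `mu`, `sigma`, and `y` are hypothetical and chosen only for the example.

```python
from statistics import NormalDist

# Hypothetical parameters and observation (not from the slides).
mu, sigma = 17.0, 0.3
y = 17.45

# Standardize: z follows N(0, 1) when y follows N(mu, sigma^2).
z = (y - mu) / sigma

# P(Y <= y) computed directly and via the standard normal distribution.
p_direct = NormalDist(mu, sigma).cdf(y)
p_standard = NormalDist().cdf(z)

print(round(z, 2), round(p_direct, 4), round(p_standard, 4))
```

Both probabilities agree, which is why a single table of the standard normal distribution is enough for any normal variable.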
Simple Comparative Experiment
• Two Sample Experiment:
Example: Study the formulation of a Portland cement
mortar with 2 formulations:
- Normal (unmodified)
- Polymer latex emulsion added
What do the results mean? Which formulation is better?
→ It is hard to get a sense of the data when looking only at a table
of numbers.
Graphical view of the data:
• Dot diagram / dot plot:
What can you now understand about the results?
Dot diagrams give a good sense of the distribution.
They work especially well for very small data sets.
• Box plot:
What can you now understand about the results?
A box plot gives a quick snapshot of the
distribution of the data.
Box plots are useful for both small and large data sets.
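A box plot is drawn from five numbers: the minimum, the first quartile, the median, the third quartile, and the maximum. The sketch below computes them with `statistics.quantiles` from the Python standard library. The observations are hypothetical, and quartile conventions differ slightly between software packages, so Minitab may report slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(ys):
    """Return (min, Q1, median, Q3, max), the values a box plot displays."""
    q1, median, q3 = statistics.quantiles(ys, n=4)
    return min(ys), q1, median, q3, max(ys)


# Hypothetical observations (not the mortar data from the slides).
strengths = [16.2, 16.5, 16.6, 16.8, 17.0, 17.1, 17.3, 17.6]

lo, q1, med, q3, hi = five_number_summary(strengths)
print(lo, q1, med, q3, hi)
```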
The Hypothesis Testing Framework
• Statistical hypothesis testing is a
useful framework for many experimental
situations
• Origins of the methodology date from the
early 1900s
• We will use a procedure known as the
two-sample t-test
• Sampling from a normal distribution
• Statistical hypotheses:
- Null hypothesis: $H_0: \mu_1 = \mu_2$
- Alternative hypothesis: $H_1: \mu_1 \ne \mu_2$
• Errors may be committed when testing
hypotheses:
- Type I error: $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$
- Type II error: $\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$
- Power of a test:
$$\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})$$
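The meaning of α can be made concrete by simulation: when both samples really come from the same normal population, a two-sided test at the 5% level should reject H0 in about 5% of repeated experiments. The sketch below is a rough Monte Carlo illustration, not part of the original slides. It uses the pooled t statistic defined later in this chapter and the critical value 2.101 (18 degrees of freedom), which is valid only for n1 = n2 = 10; the function names, the seed, and the number of repetitions are assumptions of the example.

```python
import math
import random


def pooled_t(y1, y2):
    """Pooled two-sample t statistic for two lists of observations."""
    n1, n2 = len(y1), len(y2)
    m1, m2 = sum(y1) / n1, sum(y2) / n2
    ss1 = sum((y - m1) ** 2 for y in y1)
    ss2 = sum((y - m2) ** 2 for y in y2)
    sp = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    return (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))


def type_i_error_rate(n=10, reps=4000, t_crit=2.101, seed=0):
    """Estimate P(reject H0 | H0 is true) by simulating equal-mean experiments.

    t_crit = 2.101 is the two-sided 5% critical value for 18 degrees of
    freedom, so it is only appropriate for n = 10 observations per sample.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Both samples come from the same N(0, 1) population, so H0 is true.
        y1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(pooled_t(y1, y2)) > t_crit:
            rejections += 1
    return rejections / reps


rate = type_i_error_rate()
print(rate)  # expected to be close to 0.05
```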
Estimation of Parameters
$$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \quad \text{estimates the population mean } \mu$$
$$S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (y_i - \bar{y})^2 \quad \text{estimates the variance } \sigma^2$$
Summary Statistics
                     Formulation 1          Formulation 2
                     "New recipe"           "Original recipe"
Sample mean          $\bar{y}_1 = 16.76$    $\bar{y}_2 = 17.04$
Sample variance      $S_1^2 = 0.100$        $S_2^2 = 0.061$
Sample std. dev.     $S_1 = 0.316$          $S_2 = 0.248$
Sample size          $n_1 = 10$             $n_2 = 10$
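How the Two-Sample t-Test Works:
Use the sample means to draw inferences about the population means
$$\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$$
Compare the difference in sample means with the standard deviation of the difference in sample means.
The variance of a sample mean is
$$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}$$
so for independent samples the variance of $\bar{y}_1 - \bar{y}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$.
This suggests a statistic:
$$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$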
How the Two-Sample t-Test Works:
Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$
The previous ratio becomes
$$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Pool the individual sample variances:
$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$
How the Two-Sample t-Test Works:
The test statistic is
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
• Values of t0 that are near zero are consistent with the
null hypothesis
• Values of t0 that are very different from zero are
consistent with the alternative hypothesis
• t0 is a "distance" measure: how far apart the averages
are, expressed in standard deviation units
• Notice the interpretation of t0 as a signal-to-noise
ratio
The Two-Sample (Pooled) t-Test
$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$$
$$S_p = 0.284$$
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284 \sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = -2.20$$
The two sample means are a little over two standard deviations apart
Is this a "large" difference?
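The pooled calculation can be reproduced from the summary statistics alone. The sketch below is one way to do it; the function name `pooled_t_from_summary` is an assumption of the example. Because the code does not round $S_p$ to 0.284 before dividing, it gives a t0 of about −2.21 rather than the −2.20 obtained from the rounded intermediate values.

```python
import math


def pooled_t_from_summary(ybar1, s1_sq, n1, ybar2, s2_sq, n2):
    """Return (Sp^2, Sp, t0) for the pooled two-sample t-test."""
    # Pooled variance: weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    sp = math.sqrt(sp_sq)
    # Difference in means divided by its estimated standard deviation.
    t0 = (ybar1 - ybar2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp_sq, sp, t0


# Summary statistics for the two mortar formulations, taken from the slides.
sp_sq, sp, t0 = pooled_t_from_summary(16.76, 0.100, 10, 17.04, 0.061, 10)
print(f"Sp^2 = {sp_sq:.4f}, Sp = {sp:.3f}, t0 = {t0:.2f}")
```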
• So far, we haven't really done any "statistics"
• We need an objective basis for deciding how
large the test statistic t0 = −2.20 really is
• In 1908, W. S. Gosset derived the reference
distribution for t0, called the t distribution
• In general, if $|t_0| > t_{\alpha/2,\, n_1 + n_2 - 2}$, we would reject $H_0$
• $\alpha$ is the probability of a type I error, sometimes called the level of significance
• A value of t0 between −2.101 and 2.101 is
consistent with equality of means
• It is possible for the means to be equal
and t0 to exceed 2.101 or fall below −2.101, but it would
be a "rare event", which leads to the
conclusion that the means are different
• Here t0 = −2.20 falls outside this range
• Could also use the P-value approach
• The P-value is the probability, computed assuming the null hypothesis of
equal means is true, of a test statistic at least as extreme as the one
observed (it measures the rareness of the event).
• P-value = smallest level of significance ($\alpha$) that would lead to
rejection of the null hypothesis.
• The P-value in our problem is P = 0.042
→ $H_0$ would be rejected at any level of significance $\alpha \ge 0.042$.
Minitab Two-Sample t-Test Results
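As a cross-check on the reported P-value without statistical software, the two-sided tail area of the t distribution can be approximated numerically. The sketch below integrates the t density with Simpson's rule using only the Python standard library; the function names and the number of integration steps are assumptions of the example, and a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t0, df, steps=2000):
    """P(|T| >= |t0|), with the area from 0 to |t0| found by Simpson's rule.

    steps must be even for Simpson's rule.
    """
    b = abs(t0)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_b = total * h / 3
    # The density is symmetric, so each tail beyond |t0| has area 0.5 - area_0_to_b.
    return 2 * (0.5 - area_0_to_b)


# t0 and degrees of freedom (n1 + n2 - 2 = 18) from the mortar example.
p = two_sided_p_value(-2.20, 18)
print(round(p, 3))  # close to the reported P = 0.042
```

Evaluating the same function at the critical value 2.101 returns approximately 0.05, which ties the P-value approach to the fixed significance level approach.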
Checking Assumptions –
The Normal Probability Plot
The t-test assumes that:
- both samples are random samples drawn from
independent populations (independence),
- both populations follow a normal distribution,
- the populations have equal standard deviations
(variances)
Importance of the t-Test
• Provides an objective framework for
simple comparative experiments
• Could be used to test all relevant
hypotheses in a two-level factorial
design, because all of these hypotheses
involve the mean response at one "side"
of the cube versus the mean response at
the opposite "side" of the cube
Confidence Intervals
• Hypothesis testing gives an objective
statement concerning the difference in
means, but it doesn't specify "how different"
they are
• General form of a confidence interval:
$$L \le \theta \le U \quad \text{where} \quad P(L \le \theta \le U) = 1 - \alpha$$
• The 100(1 − α)% confidence interval on the
difference in two means:
$$\bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\, n_1 + n_2 - 2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\, n_1 + n_2 - 2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
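Applied to the mortar example, the interval can be computed from values that already appear in these slides: the sample means 16.76 and 17.04, $S_p = 0.284$, and the critical value $t_{0.025,18} = 2.101$. The sketch below is a minimal illustration; the function name `pooled_ci` is an assumption of the example.

```python
import math


def pooled_ci(ybar1, ybar2, sp, n1, n2, t_crit):
    """Confidence interval on mu1 - mu2 based on the pooled standard deviation."""
    diff = ybar1 - ybar2
    half_width = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    return diff - half_width, diff + half_width


# 95% interval: t_crit = 2.101 is the two-sided 5% value for 18 degrees of freedom.
lower, upper = pooled_ci(16.76, 17.04, 0.284, 10, 10, 2.101)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

The interval lies entirely below zero, which agrees with rejecting the hypothesis of equal means at the 5% level.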
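Choice of Sample Size
• The length of the confidence interval is determined by its half-width:
$$t_{\alpha/2,\, n_1 + n_2 - 2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = t_{\alpha/2,\, 2n - 2}\, S_p \sqrt{\frac{2}{n}}, \quad \text{if } n_1 = n_2 = n$$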
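To see how the sample size affects precision, the half-width can be tabulated for a few values of n. The sketch below assumes $S_p = 0.284$ from the mortar example stays roughly the same as n changes, which is only a planning assumption. The critical values are copied from a standard t-table; only the value 2.101 for 18 degrees of freedom appears in these slides, and the others are assumptions of the example.

```python
import math

# Two-sided 5% critical values t_{0.025, 2n-2}, keyed by the per-sample size n.
# Taken from a standard t-table (df = 8, 18, 28, 38).
T_CRIT_025 = {5: 2.306, 10: 2.101, 15: 2.048, 20: 2.024}


def ci_half_width(n, sp):
    """Half-width of the 95% CI on mu1 - mu2 when n1 = n2 = n."""
    return T_CRIT_025[n] * sp * math.sqrt(2 / n)


# Assumed pooled standard deviation, borrowed from the mortar example.
sp = 0.284
half_widths = {n: ci_half_width(n, sp) for n in sorted(T_CRIT_025)}
for n, hw in half_widths.items():
    print(n, round(hw, 3))
```

The half-width shrinks as n grows, so the sample size can be chosen as the smallest n whose half-width meets the precision required.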
What If There Are More Than
Two Factor Levels?
• The t-test does not directly apply
• There are lots of practical situations where there are
either more than two levels of interest, or there are
several factors of simultaneous interest
• The analysis of variance (ANOVA) is the appropriate
analysis “engine” for these types of experiments –
Chapter 3
• The ANOVA was developed by Fisher in the early
1920s, and initially applied to agricultural experiments
• Used extensively today for industrial experiments
Chapter 2 Exercises
• Exercise 1: Use Minitab to analyze the experimental
results for the two mortar formulations in the Chapter 2 example.
• Exercise 2: Using the same type of paper, fold two
different types of paper airplane. Launch each
type 10 times and record the distance of each flight.
Use Minitab to analyze the results. Draw
conclusions.
