
1

Hypothesis Testing for a Single Sample

Assoc. Prof. Prapaisri Sudasna-na-Ayudthya, KU


2

Statistics and Sampling Distributions

• Statistical methods are used to make decisions about a process:
  – Is the process out of control?
  – Is the process average you were given the true value?
  – What is the true process variability?

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


3

Statistics and Sampling Distributions

• Statistics are quantities calculated from a random sample taken from a population of interest.
• The probability distribution of a statistic is called a sampling distribution.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


4

Sampling from a Normal Distribution

• Let X represent measurements taken from a normal distribution, X ~ N(µ, σ²).
• Select a sample of size n, at random, and calculate the sample mean, x̄. Then
    x̄ ~ N(µ, σ²/n)
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
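For readers who want to verify this result numerically, here is a minimal Python sketch (not part of the original slides) that simulates the sampling distribution of x̄; the values µ = 10, σ = 2, and n = 25 are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 25, 100_000   # assumed values, for illustration only

# Draw many samples of size n and compute each sample mean.
sample_means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# The sample means should be centered at mu with standard deviation sigma/sqrt(n) = 0.4.
print("mean of x-bar:", round(sample_means.mean(), 3))
print("std of x-bar :", round(sample_means.std(ddof=1), 3))
```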
5

Sampling from a Normal Distribution

• Chi-square (χ²) Distribution
  – Furthermore, the sampling distribution of
        y = Σᵢ₌₁ⁿ (xᵢ − x̄)² / σ² = (n − 1)S² / σ²
    is chi-square with n − 1 d.f. when sampling from a normal population.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


6

Sampling from a Normal Distribution

• t-distribution
  – If X is a standard normal random variable and Y is an independent chi-square random variable with k degrees of freedom, then
        t = X / √(Y/k)
    follows a t-distribution with k degrees of freedom.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
7

Sampling from a Normal Distribution

• F-distribution
  – If W and Y are two independent chi-square random variables with u and v degrees of freedom, respectively, then
        F = (W/u) / (Y/v)
    is distributed as F with u numerator d.f. and v denominator d.f.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


8

Point Estimation of Process Parameters

• Parameters are values representing the population, e.g. µ, σ².
• Parameters in reality are often unknown and must be estimated.
• Statistics are estimates of parameters, e.g. x̄, S².
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
9

Point Estimation of Process Parameters
Two properties of good point
estimators
1. The point estimator should be
unbiased.
2. The point estimator should
have minimum variance.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
10

Statistical Inference for a Single Sample

Two categories of statistical inference:
1. Parameter Estimation
2. Hypothesis Testing

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


11

Hypothesis
Testing

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


12

Statistical Inference for a Single Sample

• A statistical hypothesis is a statement about the values of the parameters of a probability distribution.
    H0: µ = µ0
    H1: µ ≠ µ0

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


13

• A hypothesis consists of two parts:
  1. Null hypothesis (H0): the statement we seek evidence against and hope to reject (disprove).
  2. Alternative hypothesis (H1): the statement we want to show is true.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


14

Statistical Inference for a Single Sample

• Steps in Hypothesis Testing
  – Identify the parameter of interest.
  – State the null hypothesis, H0, and the alternative hypothesis, H1.
  – Choose a significance level, α.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
15

  – State the appropriate test statistic.
  – (State the rejection region.)
  – Compare the value of the test statistic to the rejection region. Can the null hypothesis be rejected?
    Equivalently, reject H0 when the p-value (Sig.) < α.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
16

Statistical Inference for a Single Sample

• Example: An automobile manufacturer claims that a particular automobile averages 35 mpg (highway).
  – Suppose we are interested in testing this claim. We will sample 25 of these automobiles and, under identical conditions, calculate the average mpg for the sample.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
17

– Before actually collecting the data, we decide that if we get a sample average less than 33 mpg or more than 37 mpg, we will reject the maker's claim. (These are the critical values.)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


18

Statistical Inference for a Single Sample

• Example (continued)
  – H0: µ = 35    H1: µ ≠ 35
• From the sample of 25 cars, the average mpg was found to be 31.5. What is your conclusion?

[Figure: rejection regions on the x̄ axis: reject for x̄ below 33 or above 37; do not reject between 33 and 37 (critical values 33 and 37, hypothesized mean 35).]

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


19

[Figure: rejection regions. One-tailed tests: for H1: <, the rejection area α is in the left tail; for H1: >, α is in the right tail. Two-tailed test (H1: ≠): area α/2 in each tail, with (1 − α)100% in the non-rejection region.]
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


20

Statistical Inference for a Single Sample
Choice of Critical Values
• How are the critical values
chosen?
• Wouldn’t it be easier to decide
“how much room for error you
will allow” instead of finding the
exact critical values for every
problem you encounter?

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


21

Statistical Inference for a Single Sample

Significance Level
• The level of significance, α, determines the size of the rejection region.
• The level of significance is a probability. It is also known as the probability of a "Type I error" (we want this to be small).

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


22

• Type I error: rejecting the null hypothesis when it is true.
  How small? Usually we want α ≤ 0.10.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


23

                     Test result
Fact                 Fail to reject H0    Reject H0
H0 is "true"         1 − α                α
H1 is "true"         β                    1 − β

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


24

Statistical Inference for a Single Sample

Types of Error
• Type I error: rejecting the null hypothesis when it is true.
    Pr(Type I error) = α
• Type II error: not rejecting the null hypothesis when it is false.
    Pr(Type II error) = β
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
25

Statistical Inference for a Single Sample

Power of a Test
• The Power of a test of hypothesis
is given by 1 - β
• That is, 1 - β is the probability
of correctly rejecting the null
hypothesis

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


26

Inference on the Mean of a Population, Variance Known

Hypothesis Testing
• Hypotheses: H0: µ = µ0    H1: µ ≠ µ0
• Test Statistic:
    Z0 = (x̄ − µ0) / (σ/√n)
• Significance Level, α
• Rejection Region: Z0 < −Zα/2 or Z0 > Zα/2
• If Z0 falls into the rejection region above, reject H0.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


27

Inference on the Mean of a Population, Variance Known

Example
• Hypotheses: H0: µ = 175    H1: µ > 175
• Test Statistic: Z0 = (182 − 175) / (10/√25) = 3.50
• Significance Level, α = 0.05
• Rejection Region: Z0 > Zα = 1.645
• Since 3.50 > 1.645, reject H0 and conclude that the lot mean pressure strength exceeds 175 psi.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
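As a quick numerical check of this example, the following Python sketch reproduces the calculation with x̄ = 182, µ0 = 175, σ = 10, n = 25, and α = 0.05 (the code itself is an illustration and is not part of the original slides).

```python
from math import sqrt

from scipy.stats import norm

xbar, mu0, sigma, n, alpha = 182.0, 175.0, 10.0, 25, 0.05

z0 = (xbar - mu0) / (sigma / sqrt(n))   # test statistic, = 3.50
z_crit = norm.ppf(1 - alpha)            # right-tail critical value, about 1.645

print(f"Z0 = {z0:.2f}, critical value = {z_crit:.3f}")
if z0 > z_crit:
    print("Reject H0: the lot mean pressure strength exceeds 175 psi.")
else:
    print("Fail to reject H0.")
```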


28

Inference on the Mean of a Population, Variance Known

Confidence Intervals
• A general 100(1 − α)% two-sided confidence interval on the true population mean, µ, is given by bounds L and U such that
    P[L ≤ µ ≤ U] = 1 − α

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


29

• 100(1 − α)% one-sided confidence intervals are:
    Upper: P[µ ≤ U] = 1 − α
    Lower: P[L ≤ µ] = 1 − α

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


30

Inference on the Mean of a Population, Variance Known

Confidence Interval on the Mean with Variance Known
• Two-Sided:
    P[ x̄ − Zα/2 σ/√n ≤ µ ≤ x̄ + Zα/2 σ/√n ] = 1 − α

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
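A minimal sketch of this two-sided interval, assuming the summary values from the earlier example (x̄ = 182, σ = 10, n = 25) and α = 0.05; the code is illustrative only.

```python
from math import sqrt

from scipy.stats import norm

xbar, sigma, n, alpha = 182.0, 10.0, 25, 0.05   # assumed values, for illustration

z = norm.ppf(1 - alpha / 2)          # Z_{alpha/2}, about 1.96
half_width = z * sigma / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"{100 * (1 - alpha):.0f}% CI for mu: ({lower:.2f}, {upper:.2f})")
```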


31

The Use of P-Values in Hypothesis Testing

• If it is not enough to know whether the test statistic Z0 falls into the rejection region, a measure of just how significant the test statistic is can be computed: the P-value.
• P-values are probabilities associated with the test statistic Z0.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
32

The Use of P-Values in Hypothesis Testing

Definition
• The P-value is the smallest
level of significance that
would lead to rejection of the
null hypothesis H0.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


33

P-value
• One-sided test:
    for H1: < (left-tailed), P-value = P(test statistic ≤ calculated value)
    for H1: > (right-tailed), P-value = P(test statistic ≥ calculated value)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


34

• Two-sided test (H1: ≠):
    P-value = 2 · P(test statistic ≤ calculated value) if the calculated value lies in the left tail
    P-value = 2 · P(test statistic ≥ calculated value) if the calculated value lies in the right tail

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


35

The Use of P-Values in Hypothesis Testing

Example
• Reconsider the previous example. The test statistic was calculated to be Z0 = 3.50 for a right-tailed hypothesis test. The P-value for this problem is then
    P = 1 − Φ(3.50) = 0.00023
• Thus, H0: µ = 175 would be rejected at any level of significance α ≥ P = 0.00023, i.e., reject when P ≤ α.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
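The P-value above can be checked directly; this short computation is an illustration, not part of the slides.

```python
from scipy.stats import norm

z0 = 3.50
p_value = 1 - norm.cdf(z0)          # right-tailed test: P = 1 - Phi(3.50)
print(f"P-value = {p_value:.5f}")   # about 0.00023
```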
36

Inference on the Mean of a Population, Variance Unknown

Hypothesis Testing
• Hypotheses: H0: µ = µ0    H1: µ ≠ µ0
• Test Statistic:
    t0 = (x̄ − µ0) / (S/√n)
• Significance Level, α
• Reject H0 if |t0| > tα/2, n−1
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
37

Inference on the Mean of a Population, Variance Unknown

Confidence Interval on the Mean with Variance Unknown
• Two-Sided:
    P[ x̄ − tα/2, n−1 · S/√n ≤ µ ≤ x̄ + tα/2, n−1 · S/√n ] = 1 − α

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


38

Inference on the Mean of a Population, Variance Unknown

Computer Output
Minitab Output

Welcome to Minitab, press F1 for help.

One-Sample T: Strength

Test of mu = 50 vs mu not = 50

Variable     N     Mean   StDev   SE Mean
Strength    16   49.864   1.661     0.415

Variable   95.0% CI            T      P
Strength   (48.979, 50.750)  -0.33  0.749

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
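The Minitab results can be reproduced from the summary statistics reported in the output (n = 16, x̄ = 49.864, s = 1.661, hypothesized mean 50). The sketch below is an illustration; scipy.stats.ttest_1samp would need the raw data, so the statistic, P-value, and confidence interval are computed by hand from the formulas on the previous slides.

```python
from math import sqrt

from scipy.stats import t

n, xbar, s, mu0, alpha = 16, 49.864, 1.661, 50.0, 0.05   # values from the Minitab output

se = s / sqrt(n)                          # standard error, about 0.415
t0 = (xbar - mu0) / se                    # test statistic, about -0.33
p_value = 2 * t.cdf(-abs(t0), df=n - 1)   # two-sided P-value, about 0.749

t_crit = t.ppf(1 - alpha / 2, df=n - 1)                  # t_{0.025, 15}
lower, upper = xbar - t_crit * se, xbar + t_crit * se    # about (48.979, 50.749)

print(f"SE = {se:.3f}, t0 = {t0:.2f}, P = {p_value:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```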


39

Inference on the Variance of a Normal Distribution

Hypothesis Testing
• Hypotheses: H0: σ² = σ0²    H1: σ² ≠ σ0²
• Test Statistic:
    χ0² = (n − 1)S² / σ0²
• Significance Level, α
• Rejection Region: χ0² > χ²α/2, n−1 or χ0² < χ²1−α/2, n−1

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


40

Inference on the Variance of a Normal Distribution

Confidence Interval on the Variance
• Two-Sided:
    P[ (n − 1)s²/χ²α/2, n−1 ≤ σ² ≤ (n − 1)s²/χ²1−α/2, n−1 ] = 1 − α

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
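A minimal sketch of the χ² test on a variance and the corresponding two-sided confidence interval; the data values and the hypothesized variance σ0² = 0.10 below are hypothetical and used only to illustrate the formulas.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical measurements and hypothesized variance (illustration only).
x = np.array([16.8, 17.2, 17.4, 16.9, 16.5, 17.1, 17.3, 16.6])
sigma0_sq, alpha = 0.10, 0.05

n = len(x)
s_sq = x.var(ddof=1)                      # sample variance S^2

chi2_0 = (n - 1) * s_sq / sigma0_sq       # test statistic
lo_crit = chi2.ppf(alpha / 2, df=n - 1)
hi_crit = chi2.ppf(1 - alpha / 2, df=n - 1)
reject = chi2_0 > hi_crit or chi2_0 < lo_crit

# Two-sided confidence interval on sigma^2.
ci = ((n - 1) * s_sq / hi_crit, (n - 1) * s_sq / lo_crit)

print(f"chi2_0 = {chi2_0:.2f}, reject H0: {reject}")
print(f"95% CI for sigma^2: ({ci[0]:.4f}, {ci[1]:.4f})")
```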


41

Inference on a Population Proportion

Hypothesis Testing
• Hypotheses: H0: p = p0    H1: p ≠ p0
• Test Statistic:
    Z0 = (X − np0) / √(np0(1 − p0))
• Significance Level, α
• Rejection Region: |Z0| ≥ Zα/2
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
42

Inference on a Population Proportion

Confidence Interval on the Population Proportion
• Two-Sided:
    P[ p̂ − Zα/2 √(p̂(1 − p̂)/n) ≤ p ≤ p̂ + Zα/2 √(p̂(1 − p̂)/n) ] = 1 − α

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
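A minimal sketch of the normal-approximation test and interval for a proportion; the counts (X = 12 nonconforming units in n = 200, with p0 = 0.05) are hypothetical and chosen only to show the formulas in code.

```python
from math import sqrt

from scipy.stats import norm

x_count, n, p0, alpha = 12, 200, 0.05, 0.05   # hypothetical counts, for illustration

z0 = (x_count - n * p0) / sqrt(n * p0 * (1 - p0))   # test statistic
p_value = 2 * (1 - norm.cdf(abs(z0)))               # two-sided P-value

p_hat = x_count / n
z = norm.ppf(1 - alpha / 2)
half = z * sqrt(p_hat * (1 - p_hat) / n)

print(f"Z0 = {z0:.2f}, P = {p_value:.3f}")
print(f"95% CI for p: ({p_hat - half:.3f}, {p_hat + half:.3f})")
```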


43

CH X

Tests of Hypotheses for Two Samples

Assoc. Prof. Prapaisri Sudasna-na-Ayudthya, KU


44

Statistical Inference for Two Samples

• The previous sections presented hypothesis testing and confidence intervals for a single population parameter.
• These results are now extended to the case of two independent populations.
• We consider statistical inference on the difference in population means, µ1 − µ2.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


45

Inference For a Difference in Means, Variances Known

Assumptions
1. X11, X12, …, X1n1 is a random sample from population 1.
2. X21, X22, …, X2n2 is a random sample from population 2.
3. The two populations represented by X1 and X2 are independent.
4. Both populations are normal, or if they are not normal, the conditions of the central limit theorem apply.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
46

Inference For a Difference in Means, Variances Known

Null Hypothesis: H0: µ1 − µ2 = ∆0
Test Statistic:
    Z0 = (X̄1 − X̄2 − ∆0) / √(σ1²/n1 + σ2²/n2)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


47

Inference For a Difference in Means, Variances Known

Hypothesis Tests for a Difference in Means, Variances Known

Alternative Hypothesis       Rejection Criterion
H1: µ1 − µ2 ≠ ∆0             Z0 > Zα/2 or Z0 < −Zα/2
H1: µ1 − µ2 > ∆0             Z0 > Zα
H1: µ1 − µ2 < ∆0             Z0 < −Zα

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


48

Inference For a Difference in Means, Variances Known

Confidence Interval on a Difference in Means, Variances Known
• The 100(1 − α)% confidence interval on the difference in means is given by
    x̄1 − x̄2 − Zα/2 √(σ1²/n1 + σ2²/n2) ≤ µ1 − µ2 ≤ x̄1 − x̄2 + Zα/2 √(σ1²/n1 + σ2²/n2)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
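A minimal sketch of the two-sample Z statistic and confidence interval with known variances; every number below is hypothetical and used only to put the formulas into code.

```python
from math import sqrt

from scipy.stats import norm

# Hypothetical summary values (illustration only).
xbar1, xbar2 = 121.3, 118.7
sigma1_sq, sigma2_sq = 8.0, 10.0      # known population variances
n1, n2 = 10, 12
delta0, alpha = 0.0, 0.05

se = sqrt(sigma1_sq / n1 + sigma2_sq / n2)
z0 = (xbar1 - xbar2 - delta0) / se
p_value = 2 * (1 - norm.cdf(abs(z0)))        # two-sided P-value

z = norm.ppf(1 - alpha / 2)
ci = (xbar1 - xbar2 - z * se, xbar1 - xbar2 + z * se)

print(f"Z0 = {z0:.2f}, P = {p_value:.3f}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.2f}, {ci[1]:.2f})")
```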


49

Inference For a Difference in Means, Variances Unknown

Hypothesis Tests for a Difference in Means, Case I: σ1² = σ2² = σ²
• The point estimator for µ1 − µ2 is X̄1 − X̄2, where
    V(X̄1 − X̄2) = σ²/n1 + σ²/n2 = σ²(1/n1 + 1/n2)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


50

Inference For a Difference in Means, Variances Unknown

Hypothesis Tests for a Difference in Means, Case I: σ1² = σ2² = σ²
• The pooled estimate of σ², denoted by Sp², is defined by
    Sp² = [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


51

Inference For a Difference in Means, Variances Unknown

Hypothesis Tests for a Difference in Means, Case I: σ1² = σ2² = σ²
Null Hypothesis: H0: µ1 − µ2 = ∆0
Test Statistic:
    t0 = (X̄1 − X̄2 − ∆0) / (Sp √(1/n1 + 1/n2))

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


52

Inference For a Difference in Means, Variances Unknown

Hypothesis Tests for a Difference in Means, Variances Unknown

Alternative Hypothesis       Rejection Criterion
H1: µ1 − µ2 ≠ ∆0             t0 > tα/2, n1+n2−2 or t0 < −tα/2, n1+n2−2
H1: µ1 − µ2 > ∆0             t0 > tα, n1+n2−2
H1: µ1 − µ2 < ∆0             t0 < −tα, n1+n2−2

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
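A minimal sketch of the pooled t test (Case I) on hypothetical data, with the hand calculation checked against scipy.stats.ttest_ind(equal_var=True), which implements the same pooled-variance test.

```python
from math import sqrt

import numpy as np
from scipy import stats

# Hypothetical samples (illustration only).
x1 = np.array([92.3, 90.1, 91.8, 93.0, 89.7, 92.6, 91.2, 90.8])
x2 = np.array([89.5, 90.2, 88.7, 89.9, 91.0, 88.4, 90.5, 89.1])

n1, n2 = len(x1), len(x2)
sp_sq = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
t0 = (x1.mean() - x2.mean()) / (sqrt(sp_sq) * sqrt(1 / n1 + 1 / n2))
p_value = 2 * stats.t.sf(abs(t0), df=n1 + n2 - 2)

t_sp, p_sp = stats.ttest_ind(x1, x2, equal_var=True)   # same pooled test in SciPy

print(f"by hand: t0 = {t0:.3f}, P = {p_value:.4f}")
print(f"scipy  : t0 = {t_sp:.3f}, P = {p_sp:.4f}")
```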


53

Inference For a Difference in Means, Variances Unknown

Hypothesis Tests for a Difference in Means, Case II: σ1² ≠ σ2²
Null Hypothesis: H0: µ1 − µ2 = ∆0
Test Statistic:
    t0* = (X̄1 − X̄2 − ∆0) / √(S1²/n1 + S2²/n2)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


54

Inference For a Difference in Means, Variances Unknown

Hypothesis Tests for a Difference in Means, Case II: σ1² ≠ σ2²
• The degrees of freedom for t0* are given by
    ν = (S1²/n1 + S2²/n2)² / [ (S1²/n1)²/(n1 − 1) + (S2²/n2)²/(n2 − 1) ]
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
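A minimal sketch of Welch's approximate t test (Case II), computing the degrees-of-freedom formula above by hand and comparing with scipy.stats.ttest_ind(equal_var=False); the data are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical samples with unequal spread (illustration only).
x1 = np.array([14.2, 15.1, 13.8, 14.9, 15.4, 14.6])
x2 = np.array([12.1, 16.3, 11.8, 17.0, 13.5, 15.2, 12.9, 16.1])

n1, n2 = len(x1), len(x2)
v1, v2 = x1.var(ddof=1) / n1, x2.var(ddof=1) / n2   # the S_i^2/n_i terms

t0 = (x1.mean() - x2.mean()) / np.sqrt(v1 + v2)
nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))   # Welch d.f.
p_value = 2 * stats.t.sf(abs(t0), df=nu)

t_w, p_w = stats.ttest_ind(x1, x2, equal_var=False)   # Welch's test in SciPy

print(f"by hand: t0* = {t0:.3f}, nu = {nu:.1f}, P = {p_value:.4f}")
print(f"scipy  : t0* = {t_w:.3f}, P = {p_w:.4f}")
```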
55

Inference For a Difference in Means, Variances Unknown

Confidence Interval on a Difference in Means, Case I: σ1² = σ2² = σ²
• The 100(1 − α)% confidence interval on the difference in means is given by
    x̄1 − x̄2 − tα/2, n1+n2−2 · sp √(1/n1 + 1/n2) ≤ µ1 − µ2 ≤ x̄1 − x̄2 + tα/2, n1+n2−2 · sp √(1/n1 + 1/n2)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


56

Inference For a Difference in Means, Variances Unknown

Confidence Interval on a Difference in Means, Case II: σ1² ≠ σ2²
• The 100(1 − α)% confidence interval on the difference in means is given by
    x̄1 − x̄2 − tα/2, ν √(s1²/n1 + s2²/n2) ≤ µ1 − µ2 ≤ x̄1 − x̄2 + tα/2, ν √(s1²/n1 + s2²/n2)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


57

Paired Data
• Observations in an experiment are
often paired to prevent extraneous
factors from inflating the estimate
of the variance.
• Difference is obtained on each pair
of observations, dj = x1j – x2j,
where j = 1, 2, …, n.
• Test the hypothesis that the mean
of the difference, µd, is zero.

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


58

Paired Data
• The differences, dj, represent the "new" set of data, with summary statistics:
    d̄ = (1/n) Σⱼ₌₁ⁿ dj
    Sd² = Σⱼ₌₁ⁿ (dj − d̄)² / (n − 1)

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


59

Paired Data
Hypothesis Testing
• Hypotheses: H0: µd = 0    H1: µd ≠ 0
• Test Statistic:
    t0 = d̄ / (Sd/√n)
• Significance Level, α
• Rejection Region: |t0| > tα/2, n−1

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
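A minimal sketch of the paired t test on hypothetical before/after measurements, checked against scipy.stats.ttest_rel, which performs the same test.

```python
from math import sqrt

import numpy as np
from scipy import stats

# Hypothetical paired measurements (illustration only).
before = np.array([68.5, 71.2, 69.8, 72.4, 70.1, 69.3, 71.8, 70.6])
after = np.array([67.9, 70.5, 69.9, 71.6, 69.4, 68.8, 71.1, 70.0])

d = before - after                       # paired differences d_j
n = len(d)
t0 = d.mean() / (d.std(ddof=1) / sqrt(n))
p_value = 2 * stats.t.sf(abs(t0), df=n - 1)

t_rel, p_rel = stats.ttest_rel(before, after)   # same paired test in SciPy

print(f"by hand: t0 = {t0:.3f}, P = {p_value:.4f}")
print(f"scipy  : t0 = {t_rel:.3f}, P = {p_rel:.4f}")
```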


60

Inferences on the Variances of Two Normal Distributions

Hypothesis Testing
• Consider testing the hypothesis that the variances of two independent normal distributions are equal:
    H0: σ1² = σ2²    H1: σ1² ≠ σ2²
• Assume random samples of sizes n1 and n2 are taken from populations 1 and 2, respectively.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
61

Inferences on the Variances of Two Normal Distributions

Hypothesis Testing
• Hypotheses: H0: σ1² = σ2²    H1: σ1² ≠ σ2²
• Test Statistic:
    F0 = S1² / S2²
• Significance Level, α
• Rejection Region: F0 > Fα/2, n1−1, n2−1 or F0 < F1−α/2, n1−1, n2−1

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


62

Inferences on the Variances of Two Normal Distributions

Alternative Hypothesis    Test Statistic     Rejection Region
H1: σ1² < σ2²             F0 = S2²/S1²       F0 > Fα, n2−1, n1−1
H1: σ1² > σ2²             F0 = S1²/S2²       F0 > Fα, n1−1, n2−1

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


63

Inferences on the Variances of Two Normal Distributions

Confidence Interval on the Ratio of the Variances of Two Normal Distributions
• The 100(1 − α)% two-sided confidence interval on the ratio of variances is given by
    (S1²/S2²) F1−α/2, n2−1, n1−1 ≤ σ1²/σ2² ≤ (S1²/S2²) Fα/2, n2−1, n1−1

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
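A minimal sketch of the two-sided F test for equal variances together with the confidence interval on σ1²/σ2²; the two samples are hypothetical, and SciPy's F-distribution quantiles stand in for the tabled percentage points.

```python
import numpy as np
from scipy.stats import f

# Hypothetical samples (illustration only).
x1 = np.array([3.1, 2.8, 3.4, 3.0, 2.9, 3.3, 3.2, 2.7, 3.5, 3.0])
x2 = np.array([2.6, 3.9, 2.2, 4.1, 3.0, 2.4, 3.7, 2.8])

n1, n2 = len(x1), len(x2)
s1_sq, s2_sq = x1.var(ddof=1), x2.var(ddof=1)
alpha = 0.05

F0 = s1_sq / s2_sq
lo_crit = f.ppf(alpha / 2, dfn=n1 - 1, dfd=n2 - 1)       # slide notation: F_{1-alpha/2, n1-1, n2-1}
hi_crit = f.ppf(1 - alpha / 2, dfn=n1 - 1, dfd=n2 - 1)   # slide notation: F_{alpha/2, n1-1, n2-1}
reject = F0 > hi_crit or F0 < lo_crit

# Two-sided CI on sigma1^2/sigma2^2; note the swapped degrees of freedom in the quantiles.
ci = (F0 * f.ppf(alpha / 2, dfn=n2 - 1, dfd=n1 - 1),
      F0 * f.ppf(1 - alpha / 2, dfn=n2 - 1, dfd=n1 - 1))

print(f"F0 = {F0:.3f}, reject H0: {reject}")
print(f"95% CI for sigma1^2/sigma2^2: ({ci[0]:.3f}, {ci[1]:.3f})")
```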


64

What If We Have More Than Two Populations?

Example: Investigating the effect of one factor (with several levels) on some response.

Hardwood          Observations
Concentration    1   2   3   4   5   6   Total    Avg
5%               7   8  15  11   9  10      60   10.00
10%             12  17  13  18  19  15      94   15.67
15%             14  18  19  17  16  18     102   17.00
20%             19  25  22  23  18  20     127   21.17
Overall                                    383   15.96

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU


65

What If We Have More Than Two Populations?
Analysis of Variance
• Always a good practice to compare
the levels of the factor using
graphical methods such as boxplots.
• Comparative boxplots show the
variability of the observations
within a factor level and the
variability between factor levels.
Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
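The hardwood-concentration data above can be analyzed with a one-way analysis of variance. The sketch below uses scipy.stats.f_oneway; the data values are taken from the table, while the use of SciPy here is simply an illustration and not part of the original lecture.

```python
from scipy.stats import f_oneway

# Tensile-strength observations by hardwood concentration (values from the table above).
conc_05 = [7, 8, 15, 11, 9, 10]
conc_10 = [12, 17, 13, 18, 19, 15]
conc_15 = [14, 18, 19, 17, 16, 18]
conc_20 = [19, 25, 22, 23, 18, 20]

F0, p_value = f_oneway(conc_05, conc_10, conc_15, conc_20)
print(f"F0 = {F0:.2f}, P = {p_value:.6f}")
# A small P-value indicates that mean tensile strength differs across concentration levels.
```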
66

What If We Have More Than Two Populations?

[Figure: comparative boxplots of tensile strength (psi) versus hardwood concentration (5%, 10%, 15%, 20%).]

Assoc.Prof. Prapaisri Sudasna-na-Ayudthya, KU
