Lec 10-13


Sampling Distributions

Estimators & Confidence Intervals


Hypothesis Testing
5.1

The Concept of a Sampling


Distribution
Parameter & Statistic
A parameter is a numerical descriptive
measure of a population. Because it is
based on all the observations in the
population, its value is almost always
unknown.
A sample statistic is a numerical
descriptive measure of a sample. It is
calculated from the observations in the
sample.
Common Statistics &
Parameters
Measure               Sample Statistic   Population Parameter
Mean                  x̄                  µ
Standard deviation    s                  σ
Variance              s²                 σ²
Binomial proportion   p̂                  p
Sampling Distribution

The sampling distribution of a sample


statistic calculated from a sample of n
measurements is the probability distribution
of the statistic.
Developing
Sampling Distributions
Suppose There’s a Population ...
Population size, N = 4
Random variable, x
Values of x: 1, 2, 3, 4
Uniform distribution

© 1984-1994 T/Maker Co.


Population Characteristics

Summary measure: $\mu = \frac{\sum_{i=1}^{N} x_i}{N} = 2.5$

[Figure: population distribution, P(x) against x = 1, 2, 3, 4]
All Possible Samples
of Size n = 2
16 Samples (rows: 1st observation, columns: 2nd observation)
1st Obs    1     2     3     4
1          1,1   1,2   1,3   1,4
2          2,1   2,2   2,3   2,4
3          3,1   3,2   3,3   3,4
4          4,1   4,2   4,3   4,4

16 Sample Means (rows: 1st observation, columns: 2nd observation)
1st Obs    1     2     3     4
1          1.0   1.5   2.0   2.5
2          1.5   2.0   2.5   3.0
3          2.0   2.5   3.0   3.5
4          2.5   3.0   3.5   4.0
Sample with replacement
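
The enumeration above is small enough to reproduce by brute force. The following is a minimal sketch, assuming the population {1, 2, 3, 4} and ordered samples of size 2 drawn with replacement as on the slide; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2

# All ordered samples of size n drawn with replacement: 4**2 = 16 samples.
samples = list(product(population, repeat=n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Every sample is equally likely, so P(mean) = (number of samples with that mean) / 16.
counts = Counter(sample_means)
sampling_dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

# Mean of the sampling distribution compared with the population mean.
mean_of_means = sum(m * p for m, p in sampling_dist.items())
population_mean = Fraction(sum(population), len(population))

for m, p in sampling_dist.items():
    print(float(m), p)
print("mean of sample means:", float(mean_of_means))
print("population mean:", float(population_mean))
```

Exact fractions are used so that the equality of the two means can be checked without rounding error.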
Sampling Distribution
of All Sample Means

[Figure: sampling distribution of the sample mean, P(x̄) against x̄ = 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, built from the 16 sample means]
Summary Measure of
All Sample Means
$\mu_{\bar{x}} = \frac{\sum_{i=1}^{N} \bar{x}_i}{N} = \frac{1.0 + 1.5 + \dots + 4.0}{16} = 2.5$
Comparison

[Figure: population distribution, P(x) against x = 1, 2, 3, 4, beside the sampling distribution, P(x̄) against x̄ = 1.0 to 4.0]

µ = 2.5 and $\mu_{\bar{x}}$ = 2.5
5.2

Properties of Sampling
Distributions:
Unbiasedness and
Minimum Variance
Point Estimator

A point estimator of a population parameter is a


rule or formula that tells us how to use the
sample data to calculate a single number that
can be used as an estimate of the population
parameter.
Estimates

If the sampling distribution of a sample statistic


has a mean equal to the population parameter
the statistic is intended to estimate, the statistic
is said to be an unbiased estimate of the
parameter.
If the mean of the sampling distribution is not
equal to the parameter, the statistic is said to be
a biased estimate of the parameter.
Example

Probability Distribution

x       0     2     3
p(x)    1/3   1/3   1/3

µ = 1.667, σ² = 1.556, σ = 1.247
Example
Sampling Distribution of x̄ for n = 2
(3 possible samples, each with a sample mean)

x̄       1     3/2   5/2
p(x̄)    1/3   1/3   1/3

E(x̄) = 1.667 is the same as µ.
x̄ is an unbiased estimator of µ.
5.3

The Sampling Distribution


of a Sample Mean and the
Central Limit Theorem
Properties of the Sampling
Distribution of x

1. Mean of the sampling distribution equals mean of sampled population; that is,
$\mu_{\bar{x}} = E(\bar{x}) = \mu$.
2. Standard deviation of the sampling distribution equals
(standard deviation of sampled population) / (square root of sample size); that is,
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
Standard Error of the Mean

The standard deviation $\sigma_{\bar{x}}$ is often referred
to as the standard error of the mean.
Theorem 5.1

If a random sample of n observations is selected
from a population with a normal distribution, the
sampling distribution of x̄ will be a normal
distribution.
Sampling from
Normal Populations
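Central tendency: $\mu_{\bar{x}} = \mu$
Dispersion (sampling with replacement): $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

[Figure: population distribution with µ = 50 and σ = 10; sampling distributions of x̄ centered at $\mu_{\bar{x}}$ = 50, with $\sigma_{\bar{x}}$ = 5 for n = 4 and $\sigma_{\bar{x}}$ = 2.5 for n = 16]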
Standardizing the
Sampling Distribution of x
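$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

[Figure: sampling distribution of x̄ (mean $\mu_{\bar{x}}$, standard deviation $\sigma_{\bar{x}}$) mapped to the standardized normal distribution of z (mean 0, standard deviation 1)]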
Thinking Challenge
You’re an operations
analyst for AT&T. Long-
distance telephone calls
are normally distributed
with µ = 8 min. and σ = 2
min. If you select random
samples of 25 calls, what
percentage of the sample
means would be between
7.8 & 8.2 minutes?
Sampling Distribution
Solution*
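$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{7.8 - 8}{2 / \sqrt{25}} = -.50$

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{8.2 - 8}{2 / \sqrt{25}} = .50$

[Figure: sampling distribution of x̄ ($\sigma_{\bar{x}}$ = .4) between 7.8 and 8.2, mapped to the standardized normal distribution between z = –.50 and z = .50; area .1915 on each side of 0, total .3830]

About 38.3% of the sample means fall between 7.8 and 8.2 minutes.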
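
The table lookup can be cross-checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative. The exact normal area is about .3829, which differs from the table-based .3830 only through rounding of the table entries.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 2.0, 25

# Standard error of the mean: sigma / sqrt(n) = 0.4
std_error = sigma / sqrt(n)

# Sampling distribution of the sample mean for a normal population.
xbar_dist = NormalDist(mu, std_error)

# P(7.8 < x-bar < 8.2)
prob = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)
print(round(prob, 4))
```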


Sampling from
Non-Normal Populations
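Central tendency: $\mu_{\bar{x}} = \mu$
Dispersion (sampling with replacement): $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

[Figure: non-normal population distribution with µ = 50 and σ = 10; sampling distributions of x̄ centered at $\mu_{\bar{x}}$ = 50, with $\sigma_{\bar{x}}$ = 5 for n = 4 and $\sigma_{\bar{x}}$ = 1.8 for n = 30]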
Central Limit Theorem
Consider a random sample of n observations
selected from a population (any probability
distribution) with mean µ and standard deviation σ.
Then, when n is sufficiently large, the sampling
distribution of x̄ will be approximately a normal
distribution with mean $\mu_{\bar{x}} = \mu$ and standard
deviation $\sigma_{\bar{x}} = \sigma / \sqrt{n}$. The larger the sample size,
the better will be the normal approximation to the
sampling distribution of x̄.
Central Limit Theorem


As sample size gets large enough (n ≥ 30), the sampling distribution
becomes almost normal, with $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
Central Limit Theorem
Example
The amount of soda in cans
of a particular brand has a
mean of 12 oz and a standard
deviation of .2 oz. If you
select random samples of 50
cans, what percentage of the
sample means would be less
than 11.95 oz?
Central Limit Theorem
Solution*
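$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{11.95 - 12}{.2 / \sqrt{50}} = -1.77$

[Figure: sampling distribution of x̄ ($\sigma_{\bar{x}}$ = .03) with the area below 11.95 shaded, mapped to the standardized normal distribution below z = –1.77; shaded area .5000 – .4616 = .0384. Shaded area exaggerated.]

About 3.84% of the sample means would be less than 11.95 oz.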
5.4

The Sampling Distribution


of the Sample Proportion
Sample Proportion

Just as the sample mean is a good estimator of the
population mean, the sample proportion, denoted p̂,
is a good estimator of the population
proportion p. How good the estimator is will
depend on the sampling distribution of the statistic.
This sampling distribution has properties similar to
those of the sampling distribution of x̄.
Sample Distribution of p̂

1. Mean of the sampling distribution is equal to the
true binomial proportion, p; that is, E(p̂) = p.
Consequently, p̂ is an unbiased estimator of p.
2. Standard deviation of the sampling distribution is
equal to $\sqrt{p(1 - p)/n}$; that is,
$\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$.
3. For large samples, the sampling distribution is
approximately normal. (A sample is considered
large if np̂ ≥ 15 and n(1 – p̂) ≥ 15.)
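
The three properties translate directly into a small helper. This is a sketch only: the function name `phat_summary` and the counts in the usage lines are hypothetical, and because p is unknown in practice the standard error is estimated by substituting p̂ for p.

```python
from math import sqrt

def phat_summary(x, n):
    """Return (p_hat, estimated standard error, large-sample flag).

    x is the number of successes in a sample of size n.
    """
    p_hat = x / n
    # Estimated standard deviation of p-hat, with p-hat standing in for p.
    std_error = sqrt(p_hat * (1 - p_hat) / n)
    # Large-sample rule from the slide: n*p_hat >= 15 and n*(1 - p_hat) >= 15,
    # i.e. at least 15 successes and at least 15 failures.
    is_large = x >= 15 and (n - x) >= 15
    return p_hat, std_error, is_large

# Hypothetical data: 60 successes in 200 trials, then 5 successes in 50 trials.
p_hat, std_error, is_large = phat_summary(60, 200)
print(p_hat, round(std_error, 4), is_large)
print(phat_summary(5, 50)[2])
```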
Thinking Challenge
Suppose you’re
interested in the
average amount of
money that students
in this class (the
population) have on
them. How would
you find out?
Statistical Methods

Statistical Methods
  Descriptive Statistics
  Inferential Statistics
    Estimation
    Hypothesis Testing
6.1

Identifying and Estimating


the Target Parameter
Estimation Methods

Estimation
  Point Estimation
  Interval Estimation
Target Parameter

The unknown population parameter (e.g., mean or


proportion) that we are interested in estimating is
called the target parameter.
Target Parameter
Determining the Target Parameter

Parameter   Key Words or Phrases                     Type of Data
µ           Mean; average                            Quantitative
p           Proportion; percentage; fraction; rate   Qualitative
Point Estimator
A point estimator of a population parameter is a
rule or formula that tells us how to use the sample
data to calculate a single number that can be used
as an estimate of the target parameter.
Point Estimation

1. Provides a single value


• Based on observations from one sample
2. Gives no information about how close the
value is to the unknown population
parameter

3. Example: Sample mean x̄ = 3 is the
point estimate of the unknown
population mean
Interval Estimator

An interval estimator (or confidence interval) is


a formula that tells us how to use the sample data
to calculate an interval that estimates the target
parameter.
Interval Estimation

1. Provides a range of values


• Based on observations from one sample
2. Gives information about closeness to unknown
population parameter
• Stated in terms of probability
– Knowing exact closeness requires knowing
unknown population parameter
3. Example: Unknown population mean lies between
50 and 70 with 95% confidence
6.2

Confidence Interval for a


Population Mean:
Normal (z) Statistic
Estimation Process

[Figure: a random sample with mean x̄ = 50 is drawn from a population whose mean, µ, is unknown. Conclusion: "I am 95% confident that µ is between 40 & 60."]
Key Elements of
Interval Estimation
[Figure: a confidence interval centered on the sample statistic (point estimate) and bounded by the lower confidence limit and the upper confidence limit]

A confidence interval provides a range of


plausible values for the population parameter.
Confidence Interval
According to the Central Limit Theorem, the
sampling distribution of the sample mean is
approximately normal for large samples. Let us
calculate the interval estimator:
$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm \frac{1.96\,\sigma}{\sqrt{n}}$
That is, we form an interval from 1.96 standard
deviations below the sample mean to 1.96 standard
deviations above the mean. Prior to drawing the
sample, what are the chances that this interval will
enclose µ, the population mean?
Confidence Interval
If sample measurements yield a value of x̄ that falls
between the two lines on either side of µ (at µ ± 1.96$\sigma_{\bar{x}}$), then the
interval $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ will contain µ.
The area under the normal curve between these two boundaries
is .95. Thus, the probability that a randomly selected
interval will contain µ is equal to .95.
Confidence Coefficient
The confidence coefficient is the probability that
a randomly selected confidence interval encloses
the population parameter - that is, the relative
frequency with which similarly constructed
intervals enclose the population parameter when
the estimator is used repeatedly a very large
number of times. The confidence level is the
confidence coefficient expressed as a percentage.
95% Confidence Level
If our confidence level is 95%, then in the long run,
95% of our confidence intervals will contain µ and 5%
will not.
For a confidence coefficient of .95, the area in the
two tails is .05. To choose a different confidence
coefficient we increase or decrease the area (call it α)
assigned to the tails. If we place α/2 in each tail
and $z_{\alpha/2}$ is the z-value, the
confidence interval with
coefficient (1 – α) is
$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$.
Large-Sample 100(1 – α)% Confidence
Interval for µ

$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\left(\sigma / \sqrt{n}\right)$
where $z_{\alpha/2}$ is the z-value with an area α/2 to its right
in the standard normal distribution. The
parameter σ is the standard deviation of the
sampled population, and n is the sample size.
Note: When σ is unknown and n is large (n ≥ 30),
the confidence interval is approximately equal to

$\bar{x} \pm z_{\alpha/2}\left(s / \sqrt{n}\right)$
where s is the sample standard deviation.
Conditions Required for a Valid
Large-Sample
Confidence Interval for µ

1. A random sample is selected from the target


population.
2. The sample size n is large (i.e., n ≥ 30). Due to
the Central Limit Theorem, this condition
guarantees that the sampling distribution of x̄ is
approximately normal. Also, for large n, s will be
a good estimator of σ.
Thinking Challenge
You’re a Q/C inspector for
Gallo. The σ for 2-liter bottles
is .05 liters. A random sample
of 100 bottles showed x̄ =
1.99 liters. What is the 90%
confidence interval estimate
of the true mean amount in 2-liter
bottles?



Confidence Interval
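Solution*

$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

$1.99 - 1.645\frac{.05}{\sqrt{100}} \le \mu \le 1.99 + 1.645\frac{.05}{\sqrt{100}}$

1.982 ≤ µ ≤ 1.998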
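
The same interval can be computed without a z table. The following is a minimal sketch, assuming a known σ and using the standard library's `NormalDist().inv_cdf` for the critical value; the helper name `z_confidence_interval` is illustrative, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence):
    """Large-sample confidence interval for a mean with known sigma."""
    alpha = 1 - confidence
    # z-value with area alpha/2 to its right in the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Bottle example: x-bar = 1.99, sigma = .05, n = 100, 90% confidence.
lower, upper = z_confidence_interval(1.99, 0.05, 100, 0.90)
print(round(lower, 3), round(upper, 3))
```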
6.3

Confidence Interval for a


Population Mean:
Student’s t-Statistic
Small Sample, σ Unknown
Instead of using the standard normal statistic

$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

use the t-statistic

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

in which the sample standard deviation, s, replaces
the population standard deviation, σ.
Student’s t-Statistic
The t-statistic has a sampling distribution very
much like that of the z-statistic: mound-shaped,
symmetric, with mean 0.
The primary difference between the sampling
distributions of t and z is that the t-statistic
is more variable than the z-statistic.
Degrees of Freedom

The actual amount of variability in the sampling


distribution of t depends on the sample size n. A
convenient way of expressing this dependence is
to say that the t-statistic has (n – 1) degrees of
freedom (df).
Student’s t Distribution

[Figure: the standard normal curve (z) compared with t curves for df = 13 and df = 5, all centered at 0. The t distribution is bell-shaped and symmetric, with 'fatter' tails.]
t - Table
t-value
If we want the t-value with an area of .025 to its
right and 4 df, we look in the table under the
column t.025 for the entry in the row corresponding
to 4 df. This entry is t.025 = 2.776. The
corresponding standard normal z-score is z.025 =
1.96.
Small-Sample
Confidence Interval for µ

$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$

where $t_{\alpha/2}$ is based on (n – 1) degrees of freedom.


Conditions Required for a
Valid Small-Sample
Confidence Interval for µ

1. A random sample is selected from the target


population.
2. The population has a relative frequency
distribution that is approximately normal.
Estimation Example
Mean (σ Unknown)
A random sample of n = 25 has x̄ = 50 and s = 8.
Set up a 95% confidence interval estimate for µ.

$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$

$50 - 2.064\frac{8}{\sqrt{25}} \le \mu \le 50 + 2.064\frac{8}{\sqrt{25}}$

46.70 ≤ µ ≤ 53.30
Thinking Challenge
You’re a time study analyst
in manufacturing. You’ve
recorded the following task
times (min.):
3.6, 4.2, 4.0, 3.5, 3.8, 3.1.
What is the 90% confidence
interval estimate of the
population mean task time?
Confidence Interval Solution*

• x̄ = 3.7
• s = .38987
• n = 6, df = n – 1 = 6 – 1 = 5
• t.05 = 2.015

$3.7 - 2.015\frac{.38987}{\sqrt{6}} \le \mu \le 3.7 + 2.015\frac{.38987}{\sqrt{6}}$

3.379 ≤ µ ≤ 4.021
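
Recomputing from the raw task times is a useful check on the arithmetic: the sample standard deviation comes out at about .390, which gives an interval of roughly 3.38 to 4.02 minutes. The sketch below uses only the standard library; since it has no t distribution, the critical value 2.015 is taken from the t table (an assumption of this sketch, not a computed value).

```python
from math import sqrt
from statistics import mean, stdev

times = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]  # task times in minutes
t_crit = 2.015  # t.05 with n - 1 = 5 df, read from the t table

n = len(times)
xbar = mean(times)
s = stdev(times)  # sample standard deviation (divides by n - 1)

half_width = t_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(round(xbar, 2), round(s, 5))
print(round(lower, 3), round(upper, 3))
```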
6.4

Large-Sample Confidence
Interval for a Population
Proportion
Sampling Distribution of p̂
1. The mean of the sampling distribution of p̂ is p;
that is, p̂ is an unbiased estimator of p.
2. The standard deviation of the sampling
distribution of p̂ is $\sqrt{pq/n}$; that is, $\sigma_{\hat{p}} = \sqrt{pq/n}$,
where q = 1 – p.
3. For large samples, the sampling distribution of p̂
is approximately normal. A sample size is
considered large if both np̂ ≥ 15 and nq̂ ≥ 15.
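Large-Sample Confidence
Interval for p

$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

where $\hat{p} = x/n$ and $\hat{q} = 1 - \hat{p}$.

Note: When n is large, p̂ can approximate the
value of p in the formula for $\sigma_{\hat{p}}$.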
Conditions Required for a
Valid Large-Sample
Confidence Interval for p
1. A random sample is selected from the target
population.
2. The sample size n is large. (This condition will be
satisfied if both np̂ ≥ 15 and nq̂ ≥ 15. Note that np̂
and nq̂ are simply the number of successes and
number of failures, respectively, in the sample.)
Estimation Example
Proportion
A random sample of 400 graduates showed 32
went to graduate school. Set up a 95% confidence
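interval estimate for p.

$\hat{p} = \frac{32}{400} = .08$

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

$.08 - 1.96\sqrt{\frac{(.08)(.92)}{400}} \le p \le .08 + 1.96\sqrt{\frac{(.08)(.92)}{400}}$

.053 ≤ p ≤ .107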
Thinking Challenge
You’re a production
manager for a newspaper.
You want to find the %
defective. Of 200
newspapers, 35 had
defects. What is the 90%
confidence interval estimate
of the population
proportion defective?
Confidence Interval
Solution*

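$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

$.175 - 1.645\sqrt{\frac{.175(.825)}{200}} \le p \le .175 + 1.645\sqrt{\frac{.175(.825)}{200}}$

.1308 ≤ p ≤ .2192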
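
A short numerical version of this calculation follows. It is a sketch under the large-sample assumption (at least 15 successes and 15 failures); the function name `proportion_confidence_interval` is hypothetical, and the critical value comes from the standard library rather than the z table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    if x < 15 or n - x < 15:
        raise ValueError("large-sample condition not met: need >= 15 successes and failures")
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width

# Newspaper example: 35 defective out of 200, 90% confidence.
lower, upper = proportion_confidence_interval(35, 200, 0.90)
print(round(lower, 4), round(upper, 4))
```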
Adjusted 100(1 – α)%
Confidence Interval for a
Population Proportion, p

$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$

where $\tilde{p} = \frac{x + 2}{n + 4}$ is the adjusted sample proportion
of observations with the characteristic of interest, x
is the number of successes in the sample, and n is
the sample size.
6.5

Determining the Sample Size


Sampling Error

In general, we express the reliability associated


with a confidence interval for the population mean
µ by specifying the sampling error within which
we want to estimate µ with 100(1 – α)% confidence.
The sampling error (denoted SE), then, is equal to
the half-width of the confidence interval.
Sample Size Determination
for 100(1 – α)%
Confidence Interval for µ
In order to estimate µ with a sampling error (SE)
and with 100(1 – α)% confidence, the required
sample size is found as follows:

$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = SE$

The solution for n is given by the equation

$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$
Sample Size Example

What sample size is needed to be 90%
confident the mean is within ±5? A pilot
study suggested that the standard deviation
is 45.

$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.645)^2 (45)^2}{(5)^2} = 219.2 \approx 220$
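
The sample-size formula for a mean is easy to wrap in a function. This is a minimal sketch: the name `sample_size_for_mean` is illustrative, σ is assumed to come from a pilot study or prior estimate, and the result is always rounded up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, sampling_error, confidence):
    """Smallest n with z_(alpha/2) * sigma / sqrt(n) <= sampling_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / sampling_error) ** 2)

# Example from the slide: sigma = 45, SE = 5, 90% confidence.
n_required = sample_size_for_mean(45, 5, 0.90)
print(n_required)
```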
Sample Size Determination
for 100(1 – α)%
Confidence Interval for p
In order to estimate p with a sampling error SE and
with 100(1 – α)% confidence, the required sample
size is found by solving the following equation for n:

$z_{\alpha/2}\sqrt{\frac{pq}{n}} = SE$

The solution for n can be written as follows:

$n = \frac{(z_{\alpha/2})^2\,(pq)}{(SE)^2}$

Note: Always round n up to the nearest integer value.
Sample Size Example

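What sample size is needed to estimate p with 90% confidence,
using an interval of total width .03?

$SE = \frac{\text{width}}{2} = \frac{.03}{2} = .015$

With no prior estimate of p, use the conservative value p = q = .5:

$n = \frac{(z_{\alpha/2})^2\,(pq)}{(SE)^2} = \frac{(1.645)^2 (.5)(.5)}{(.015)^2} = 3006.69 \approx 3007$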
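
The sample-size formula for a proportion can be sketched the same way. The function name and the default p = .5 are assumptions of this sketch; p = .5 maximizes pq and so gives the most conservative (largest) sample size when no prior estimate of p is available.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(sampling_error, confidence, p=0.5):
    """Smallest n with z_(alpha/2) * sqrt(p*q/n) <= sampling_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q = 1 - p
    return ceil(z ** 2 * p * q / sampling_error ** 2)

# Example from the slide: SE = .015, 90% confidence, p = q = .5.
n_required = sample_size_for_proportion(0.015, 0.90)
print(n_required)
```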
Thinking Challenge
You work in Human
Resources at Merrill Lynch.
You plan to survey employees
to find their average medical
expenses. You want to be
95% confident that the
sample mean is within ± $50.

A pilot study showed that σ
was about $400. What
sample size do you use?
Sample Size Solution*

$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.96)^2 (400)^2}{(50)^2} = 245.86 \approx 246$
6.6

Finite Population Correction


for Simple Random Sample
Finite Population Correction Factor

In some sampling situations, the sample size n


may represent 5% or perhaps 10% of the total
number N of sampling units in the population.
When the sample size is large relative to the
number of measurements in the population (see
the next slide), the standard errors of the
estimators of µ and p should be multiplied by a
finite population correction factor.
Rule of Thumb for Finite
Population Correction Factor

Use the finite population correction factor


when n/N > .05.
Simple Random Sampling with
Finite Population of Size N
Estimation of the Population Mean

Estimated standard error:

$\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N}}$

Approximate 95% confidence interval: $\bar{x} \pm 2\hat{\sigma}_{\bar{x}}$


Simple Random Sampling with
Finite Population of Size N
Estimation of the Population Proportion

Estimated standard error:

$\hat{\sigma}_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\sqrt{\frac{N - n}{N}}$

Approximate 95% confidence interval: $\hat{p} \pm 2\hat{\sigma}_{\hat{p}}$


Finite Population Correction
Factor Example
You want to estimate a population mean, μ, where
x̄ = 115, s = 18, N = 700, and n = 60. Find an
approximate 95% confidence interval for μ.

Since n/N = 60/700 = .086 is greater than .05, use the finite
population correction factor.

$\bar{x} \pm 2\frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N}} = 115 \pm 2\frac{18}{\sqrt{60}}\sqrt{\frac{700 - 60}{700}} = 115 \pm 4.4$

Interval: (110.6, 119.4)
6.7

Confidence Interval for a


Population Variance
Confidence Interval for a
Population Variance
Conditions Required for a Valid
Confidence Interval for σ²

1. A random sample is selected from the


target population.
2. The population of interest has a relative
frequency distribution that is
approximately normal.
Thinking Challenge
You’re a marketing
manager for a 5K race. You
take a random sample of
the times of 292 runners
from the last race, with
mean of 28.5 minutes and
standard deviation of 8.3
minutes. What is the 95%
confidence interval estimate
of the population variance?
Confidence Interval
Solution*

df = 292 – 1 = 291 (use 300 df in the χ² table)

$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1 - \alpha/2}}$

$\frac{(292 - 1)(8.3)^2}{349.874} \le \sigma^2 \le \frac{(292 - 1)(8.3)^2}{253.912}$

57.30 ≤ σ² ≤ 78.95
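
The arithmetic in this solution can be checked with a few lines. The Python standard library has no chi-square quantile function, so this sketch simply reuses the two table values shown in the solution (349.874 and 253.912, for 300 df); they are inputs here, not computed values.

```python
n = 292
s = 8.3  # sample standard deviation of the race times, in minutes

# Chi-square table values for 300 df, as used in the solution above.
chi2_right_025 = 349.874  # area .025 to its right
chi2_right_975 = 253.912  # area .975 to its right

numerator = (n - 1) * s ** 2

# The larger chi-square value gives the lower limit, the smaller one the upper limit.
lower = numerator / chi2_right_025
upper = numerator / chi2_right_975
print(round(lower, 2), round(upper, 2))
```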
7.1

The Elements of
a Test of Hypothesis
Hypothesis Testing
[Figure: "I believe the population mean age is 50 (hypothesis)." A random sample drawn from the population has mean x̄ = 20. "Reject hypothesis! Not close."]
What’s a Hypothesis?

A statistical hypothesis is a statement about
the numerical value of a population parameter.

Example: "I believe the mean GPA of this class is 3.5!"



Null Hypothesis

The null hypothesis, denoted H0,


represents the hypothesis that will be
accepted unless the data provide
convincing evidence that it is false. This
usually represents the “status quo” or some
claim about the population parameter that
the researcher wants to test.
Alternative Hypothesis

The alternative (research) hypothesis,


denoted Ha, represents the hypothesis that
will be accepted only if the data provide
convincing evidence of its truth. This
usually represents the values of a
population parameter for which the
researcher wants to gather evidence to
support.
Alternative Hypothesis
1. Opposite of null hypothesis
2. The hypothesis that will be accepted
only if the data provide convincing
evidence of its truth
3. Designated Ha
4. Stated in one of the following forms
Ha: µ ≠ (some value)
Ha: µ < (some value)
Ha: µ > (some value)
Identifying Hypotheses
Example problem: Test that the population
mean is not 3
Steps:
• State the question statistically (µ ≠ 3)
• State the opposite statistically (µ = 3)
– Must be mutually exclusive & exhaustive
• Select the alternative hypothesis (µ ≠ 3)
– Has the ≠, <, or > sign
• State the null hypothesis (µ = 3)
What Are the Hypotheses?
Is the population average amount of TV
viewing 12 hours?
State the question statistically: µ = 12
State the opposite statistically: µ ≠ 12
Select the alternative hypothesis: Ha: µ ≠ 12
State the null hypothesis: H0: µ = 12
What Are the Hypotheses?
Is the population average amount of TV
viewing different from 12 hours?

State the question statistically: µ ≠ 12
State the opposite statistically: µ = 12
Select the alternative hypothesis: Ha: µ ≠ 12
State the null hypothesis: H0: µ = 12
What Are the Hypotheses?
Is the average cost per hat less than or
equal to $20?
State the question statistically: µ ≤ 20
State the opposite statistically: µ > 20
Select the alternative hypothesis: Ha: µ > 20
State the null hypothesis: H0: µ ≤ 20
What Are the Hypotheses?
Is the average amount spent in the
bookstore greater than $25?
State the question statistically: µ > 25
State the opposite statistically: µ ≤ 25
Select the alternative hypothesis: Ha: µ > 25
State the null hypothesis: H0: µ ≤ 25
Test Statistic

The test statistic is a sample statistic,


computed from information provided in the
sample, that the researcher uses to decide
between the null and alternative
hypotheses.
Test Statistic - Example
[Figure: the sampling distribution of x̄, assuming µ = 2,400]
The chance of observing x̄ more than
1.645 standard deviations above 2,400 is
only .05 – if in fact the true mean µ is 2,400.
Type I Error

A Type I error occurs if the


researcher rejects the null hypothesis
in favor of the alternative hypothesis
when, in fact, H0 is true. The
probability of committing a Type I
error is denoted by α.
Rejection Region
The rejection region of a statistical test is
the set of possible values of the test
statistic for which the researcher will reject
H0 in favor of Ha.
Type II Error

A Type II error occurs if the researcher


accepts the null hypothesis when, in fact,
H0 is false. The probability of committing a
Type II error is denoted by β.
Conclusions and
Consequences for a Test of
Hypothesis
                      True State of Nature
Conclusion            H0 True                        Ha True
Accept H0             Correct decision               Type II error (probability β)
(Assume H0 True)
Reject H0             Type I error (probability α)   Correct decision
(Assume Ha True)
Elements of a Test of
Hypothesis
1. Null hypothesis (H0): A theory about the
specific values of one or more population
parameters. The theory generally represents
the status quo, which we adopt until it is
proven false.
2. Alternative (research) hypothesis (Ha): A
theory that contradicts the null hypothesis.
The theory generally represents that which we
will adopt only when sufficient evidence exists
to establish its truth.
Elements of a Test of
Hypothesis
3. Test statistic: A sample statistic used to
decide whether to reject the null hypothesis.
4. Rejection region: The numerical values of the
test statistic for which the null hypothesis will
be rejected. The rejection region is chosen so
that the probability is α that it will contain the
test statistic when the null hypothesis is true,
thereby leading to a Type I error. The value of
α is usually chosen to be small (e.g., .01, .05,
or .10) and is referred to as the level of
significance of the test.
Elements of a Test of
Hypothesis

5. Assumptions: Clear statement(s) of any


assumptions made about the population(s)
being sampled.
6. Experiment and calculation of test statistic:
Performance of the sampling experiment and
determination of the numerical value of the
test statistic.
Elements of a Test of
Hypothesis

7. Conclusion:

a. If the numerical value of the test statistic falls


in the rejection region, we reject the null
hypothesis and conclude that the alternative
hypothesis is true. We know that the
hypothesis-testing process will lead to this
conclusion incorrectly (Type I error) only 100α%
of the time when H0 is true.
Elements of a Test of
Hypothesis

7. Conclusion:

b. If the test statistic does not fall in the rejection


region, we do not reject H0. Thus, we reserve
judgment about which hypothesis is true. We
do not conclude that the null hypothesis is
true because we do not (in general) know the
probability β that our test procedure will lead
to an incorrect acceptance of H0 (Type II
error).
Determining the
Target Parameter

Parameter   Key Words or Phrases                     Type of Data
µ           Mean; average                            Quantitative
p           Proportion; percentage; fraction; rate   Qualitative
σ²          Variance; variability; spread            Quantitative
7.2

Formulating Hypotheses and


Setting Up the Rejection
Region
Steps for Selecting the Null
and Alternative Hypotheses
1. Select the alternative hypothesis as that which
the sampling experiment is intended to establish.
The alternative hypothesis will assume one of
three forms:
a. One-tailed, upper-tailed (e.g., Ha: µ > 2,400)
b. One-tailed, lower-tailed (e.g., Ha: µ < 2,400)
c. Two-tailed (e.g., Ha: µ ≠ 2,400)
Steps for Selecting the Null
and Alternative Hypotheses
2. Select the null hypothesis as the status quo,
that which will be presumed true unless the
sampling experiment conclusively establishes
the alternative hypothesis. The null hypothesis
will be specified as that parameter value closest
to the alternative in one-tailed tests and as the
complementary (or only unspecified) value in
two-tailed tests.
(e.g., H0: µ = 2,400)
One-Tailed Test

A one-tailed test of hypothesis is one in which the


alternative hypothesis is directional and includes
the symbol “ < ” or “ >.”
Upper-tailed (>): “greater than,” “larger,” “above”
Lower-tailed (<): “less than,” “smaller,” “below”
Two-Tailed Test

A two-tailed test of hypothesis is one in which the


alternative hypothesis does not specify departure
from H0 in a particular direction and is written with
the symbol “ ≠.”
Some key words that help you identify this
nondirectional nature are:
Two-tailed (≠): “not equal to,” “differs from”
Basic Idea
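[Figure: sampling distribution of sample means centered at the hypothesized mean µ = 50 (H0), with an observed sample mean of 20 far out in the tail]
It is unlikely that we would get a sample mean of this value (20)
if in fact this (µ = 50) were the population mean;
therefore, we reject the hypothesis that µ = 50.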
Rejection Region
(One-Tail Test)
[Figure: sampling distribution of the sample statistic, centered at H0. The rejection region is the tail of area α beyond the critical value; the remaining "fail to reject" region has area 1 – α (the level of confidence).]
Rejection Regions
(Two-Tailed Test)
[Figure: sampling distribution of the sample statistic, centered at H0. There are two rejection regions, each a tail of area α/2 beyond a critical value; the "fail to reject" region between the critical values has area 1 – α (the level of confidence).]
Rejection Regions

           Alternative Hypotheses
           Lower-Tailed   Upper-Tailed   Two-Tailed
α = .10    z < –1.282     z > 1.282      z < –1.645 or z > 1.645
α = .05    z < –1.645     z > 1.645      z < –1.96 or z > 1.96
α = .01    z < –2.326     z > 2.326      z < –2.575 or z > 2.575
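
The critical values in this table can be regenerated from the standard normal distribution. The sketch below uses the standard library's `NormalDist().inv_cdf`; the function name `critical_values` is illustrative. Note that the exact two-tailed value for α = .01 is about 2.5758, which tables commonly list as 2.575 or 2.576.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def critical_values(alpha):
    """Return (one-tailed z_alpha, two-tailed z_(alpha/2)) for a given alpha."""
    one_tailed = std_normal.inv_cdf(1 - alpha)      # area alpha to its right
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 to its right
    return one_tailed, two_tailed

for alpha in (0.10, 0.05, 0.01):
    one, two = critical_values(alpha)
    print(f"alpha={alpha:.2f}  one-tailed z={one:.3f}  two-tailed z={two:.3f}")
```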


7.3

Observed Significance Levels:


p-Values
p-Value

The observed significance level, or


p-value, for a specific statistical test is the
probability (assuming H0 is true) of
observing a value of the test statistic that is
at least as contradictory to the null
hypothesis, and supportive of the alternative
hypothesis, as the actual one computed
from the sample data.
p-Value

Probability of obtaining a test statistic more
extreme (≤ or ≥) than the actual sample
value, given H0 is true
Called observed level of significance
• Smallest value of α for which H0 can be
rejected
Used to make rejection decision
• If p-value ≥ α, do not reject H0
• If p-value < α, reject H0
Steps for Calculating the p-
Value for a Test of Hypothesis

1. Determine the value of the test statistic z


corresponding to the result of the
sampling experiment.
Steps for Calculating the p-
Value for a Test of Hypothesis
2a. If the test is one-tailed, the p-value is equal to
the tail area beyond z in the same direction as the
alternative hypothesis. Thus, if the alternative
hypothesis is of the form > , the p-value is the
area to the right of, or above, the observed z-
value. Conversely, if the alternative is of the form
< , the p-value is the area to the left of, or below,
the observed z-value.
Steps for Calculating the p-
Value for a Test of Hypothesis
2b. If the test is two-tailed, the p-value is equal to
twice the tail area beyond the observed z-value
in the direction of the sign of z – that is, if z is
positive, the p-value is twice the area to the
right of, or above, the observed z-value.
Conversely, if z is negative, the p-value is twice
the area to the left of, or below, the observed
z-value.
Reporting Test Results as
p-Values: How to Decide
Whether to Reject H0
1. Choose the maximum value of α that you
are willing to tolerate.
2. If the observed significance level (p-value)
of the test is less than the chosen
value of α, reject the null hypothesis.
Otherwise, do not reject the null
hypothesis.
Two-Tailed z Test
p-Value Example
Does an average box of
cereal contain 368 grams of
cereal? A random sample of
25 boxes showed x̄ = 372.5.
The company has specified
σ to be 15 grams. Find the
p-value. How does it
compare to α = .05?
Two-Tailed z Test
p-Value Solution
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$

[Figure: standard normal curve with the observed z value of the sample statistic marked at 1.50]
Two-Tailed Z Test
p-Value Solution

p-Value is P(z ≤ –1.50 or z ≥ 1.50)

From the z table, the area between 0 and the observed z value of 1.50 is .4332,
so each tail (1/2 p-Value) has area .5000 – .4332 = .0668.
Two-Tailed z Test
p-Value Solution

p-Value is P(z ≤ –1.50 or z ≥ 1.50) = .1336

[Figure: standard normal curve with 1/2 p-Value = .0668 in each tail, below –1.50 and above 1.50]
Two-Tailed z Test
p-Value Solution
(p-Value = .1336) ≥ (α = .05).
Do not reject H0.

[Figure: each tail beyond ±1.50 has area 1/2 p-Value = .0668, which is larger than the 1/2 α = .025 area of each rejection region]
Test statistic is in ‘Do not reject’ region
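
The two-tailed calculation can be reproduced without a z table. This is a minimal sketch using the standard library's `NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sigma, n, alpha = 372.5, 368.0, 15.0, 25, 0.05

# Test statistic: z = (x-bar - mu0) / (sigma / sqrt(n)) = 1.50
z = (xbar - mu0) / (sigma / sqrt(n))

# Two-tailed p-value: twice the tail area beyond |z|.
tail_area = 1 - NormalDist().cdf(abs(z))
p_value = 2 * tail_area

reject_h0 = p_value < alpha
print(round(z, 2), round(p_value, 4), reject_h0)
```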
One-Tailed z Test
p-Value Example
Does an average box of
cereal contain more than
368 grams of cereal? A
random sample of 25
boxes showed x̄ = 372.5.
The company has specified
σ to be 15 grams. Find the
p-value. How does it
compare to α = .05?
One-Tailed z Test
p-Value Solution
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$

[Figure: standard normal curve with the z value of the sample statistic marked at 1.50]
One-Tailed z Test
p-Value Solution

p-Value is P(z ≥ 1.50)

Use the alternative hypothesis to find the direction of the tail.
From the z table, the area between 0 and the z value of 1.50 is .4332,
so the tail area is .5000 – .4332 = .0668.
One-Tailed z Test
p-Value Solution

p-Value is P(z ≥ 1.50) = .5000 – .4332 = .0668

[Figure: standard normal curve with the upper tail beyond z = 1.50 shaded; p-Value = .0668]
One-Tailed z Test
p-Value Solution
(p-Value = .0668) ≥ (α = .05).
Do not reject H0.

[Figure: the upper tail beyond z = 1.50 has area p-Value = .0668, which is larger than the α = .05 area of the rejection region]
Test statistic is in ‘Do not reject’ region
p-Value
Thinking Challenge
You’re an analyst for Ford. You
want to find out if the average
miles per gallon of Escorts is
less than 32 mpg. Similar
models have a standard
deviation of 3.8 mpg. You take
a sample of 60 Escorts &
compute a sample mean of
30.7 mpg. What is the p-value?
How does it compare to
α = .01?
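p-Value
Solution*
z value of sample statistic:
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{30.7 - 32}{3.8/\sqrt{60}} = -2.65$
Use the alternative hypothesis to find the direction: Ha is of the form <, so the p-value is the lower-tail area.
From the z table, lookup 2.65: the area between 0 and 2.65 is .4960, so the tail area is .5000 – .4960 = .0040.
p-Value is P(z ≤ –2.65) = .004.
p-Value < (α = .01). Reject H0.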
Converting a Two-Tailed
p-Value from a Printout to a
One-Tailed p-Value
$p = \frac{\text{Reported p-value}}{2}$
if Ha is of the form > and z is positive
or Ha is of the form < and z is negative
$p = 1 - \frac{\text{Reported p-value}}{2}$
if Ha is of the form > and z is negative
or Ha is of the form < and z is positive
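The conversion rule can be written as a small function. This is a minimal sketch; the name `one_tailed_p` and its argument names are hypothetical and do not come from any statistics package.

```python
def one_tailed_p(reported_two_tailed_p, z, alternative):
    """Convert a printout's two-tailed p-value to a one-tailed p-value.

    alternative is '>' or '<', the form of Ha.
    """
    if alternative not in (">", "<"):
        raise ValueError("alternative must be '>' or '<'")
    half = reported_two_tailed_p / 2
    # The sample agrees with Ha when z points in the direction Ha claims.
    agrees = (alternative == ">" and z > 0) or (alternative == "<" and z < 0)
    return half if agrees else 1 - half


# Cereal example: the two-tailed p-value is .1336 and z = +1.50
print(round(one_tailed_p(0.1336, 1.50, ">"), 4))  # 0.0668
print(round(one_tailed_p(0.1336, 1.50, "<"), 4))  # 0.9332
```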
7.4

Test of Hypotheses about a


Population Mean:
Normal (z) Statistic
Large-Sample Test of
Hypothesis about µ
One-Tailed Test
H0: µ = µ0
Ha: µ < µ0 (or Ha: µ > µ0)
Two-Tailed Test
H0: µ = µ0
Ha: µ ≠ µ0
Test Statistic (both tests):
$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Large-Sample Test of
Hypothesis about µ
One-Tailed Test
Rejection region:
$z < -z_\alpha$
(or $z > z_\alpha$ when Ha: µ > µ0)
where $z_\alpha$ is chosen so that
$P(z > z_\alpha) = \alpha$
Large-Sample Test of
Hypothesis about µ
Two-Tailed Test
Rejection region:
$|z| > z_{\alpha/2}$
where $z_{\alpha/2}$ is chosen so that
$P(z > z_{\alpha/2}) = \alpha/2$, i.e., $P(|z| > z_{\alpha/2}) = \alpha$

Note: µ0 is the symbol for the numerical value


assigned to µ under the null hypothesis.
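The rejection-region rules above can be sketched in code. This is an illustration under stated assumptions: the function name `large_sample_z_test` is invented here, `sd` stands for σ when it is known (or s as its large-sample stand-in), and the critical values come from the standard-library `statistics.NormalDist` rather than the z table.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_z_test(xbar, mu0, sd, n, alpha, alternative):
    """Return (z, critical value, reject H0?) for a large-sample test about a mean.

    alternative is '<', '>' or '!=' (the form of Ha).
    """
    z = (xbar - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()
    if alternative == "!=":
        crit = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
        reject = abs(z) > crit
    elif alternative == ">":
        crit = std_normal.inv_cdf(1 - alpha)      # z_alpha
        reject = z > crit
    elif alternative == "<":
        crit = std_normal.inv_cdf(1 - alpha)      # z_alpha; reject when z < -z_alpha
        reject = z < -crit
    else:
        raise ValueError("alternative must be '<', '>' or '!='")
    return z, crit, reject


# Electrical cords (two-tailed): x-bar = 69.7, mu0 = 70, sigma = 3.5, n = 36
z_cord, crit_cord, reject_cord = large_sample_z_test(69.7, 70, 3.5, 36, 0.05, "!=")
print(round(z_cord, 2), round(crit_cord, 2), reject_cord)  # -0.51 1.96 False

# Ford Escorts (lower-tailed): x-bar = 30.7, mu0 = 32, sigma = 3.8, n = 60
z_ford, crit_ford, reject_ford = large_sample_z_test(30.7, 32, 3.8, 60, 0.01, "<")
print(round(z_ford, 2), round(crit_ford, 2), reject_ford)  # -2.65 2.33 True
```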
Conditions Required for a
Valid Large-Sample
Hypothesis Test for µ
1. A random sample is selected from the target
population.
2. The sample size n is large (i.e., n ≥ 30). (Due to
the Central Limit Theorem, this condition
guarantees that the test statistic will be
approximately normal regardless of the shape
of the underlying probability distribution of the
population.)
Possible Conclusions for a
Test of Hypothesis

1. If the calculated test statistic falls in the
rejection region, reject H0 and conclude
that the alternative hypothesis Ha is true.
State that you are rejecting H0 at the α
level of significance. Remember that the
confidence is in the testing process, not
the particular result of a single test.
Possible Conclusions for a
Test of Hypothesis

2. If the test statistic does not fall in the
rejection region, conclude that the
sampling experiment does not provide
sufficient evidence to reject H0 at the α
level of significance. [Generally, we will
not “accept” the null hypothesis unless the
probability β of a Type II error has been
calculated.]
Two-Tailed z Test Example
Does an average box of
cereal contain 368 grams of
cereal? A random sample
of 25 boxes had x̄ = 372.5.
The company has specified
σ to be 25 grams. Test at
the .05 level of significance.
Two-Tailed z Test Solution
H0: µ = 368
Ha: µ ≠ 368
α = .05
n = 25
Critical Value(s): z = ±1.96 (‘Reject H0’ regions of area .025 in each tail)
Test Statistic:
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{372.5 - 368}{25/\sqrt{25}} = +0.9$
Decision:
Do not reject at α = .05
Conclusion:
No evidence average is not 368
Two-Tailed z Test Thinking
Challenge
You’re a Q/C inspector. You want to find
out if a new machine is making electrical
cords to customer specification: average
breaking strength of 70 lb. with σ = 3.5 lb.
You take a sample of 36 cords & compute
a sample mean of 69.7 lb. At the .05 level
of significance, is there evidence that the
machine is not meeting the average
breaking strength?
Two-Tailed z Test Solution*
H0: µ = 70
Ha: µ ≠ 70
α = .05
n = 36
Critical Value(s): z = ±1.96 (‘Reject H0’ regions of area .025 in each tail)
Test Statistic:
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{69.7 - 70}{3.5/\sqrt{36}} = -.51$
Decision:
Do not reject at α = .05
Conclusion:
No evidence average is not 70
One-Tailed z Test
Example
Does an average box of
cereal contain more than
368 grams of cereal? A
random sample of 25 boxes
showed x̄ = 372.5. The
company has specified σ to
be 15 grams. Test at the .05
level of significance.
One-Tailed z Test Solution
H0: µ = 368
Ha: µ > 368
α = .05
n = 25
Critical Value(s): z = 1.645 (‘Reject’ region of area .05 in the upper tail)
Test Statistic:
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{372.5 - 368}{15/\sqrt{25}} = +1.50$
Decision:
Do not reject at α = .05
Conclusion:
No evidence average is more than 368
One-Tailed z Test Thinking
Challenge
You’re an analyst for Ford. You
want to find out if the average
miles per gallon of Escorts is at
least 32 mpg. Similar models
have a standard deviation of 3.8
mpg. You take a sample of 60
Escorts & compute a sample
mean of 30.7 mpg. At the .01
level of significance, is there
evidence that the miles per
gallon is less than 32?
One-Tailed z Test Solution*
H0: µ = 32
Ha: µ < 32
α = .01
n = 60
Critical Value(s): z = –2.33 (‘Reject’ region of area .01 in the lower tail)
Test Statistic:
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{30.7 - 32}{3.8/\sqrt{60}} = -2.65$
Decision:
Reject at α = .01
Conclusion:
There is evidence average is less than 32
7.5

Test of Hypothesis about a


Population Mean:
Student’s t-Statistic
Small-Sample Test of
Hypothesis about µ
One-Tailed Test
H0: µ = µ0
Ha: µ < µ0 (or Ha: µ > µ0)
Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Rejection region: $t < -t_\alpha$
(or $t > t_\alpha$ when Ha: µ > µ0)
where $t_\alpha$ and $t_{\alpha/2}$ are based on (n – 1) degrees of
freedom
Small-Sample Test of
Hypothesis about µ

Two-Tailed Test
H0: µ = µ0
Ha: µ ≠ µ0
Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Rejection region: $|t| > t_{\alpha/2}$
Conditions Required for a
Valid Small-Sample
Hypothesis Test for µ
1. A random sample is selected from the
target population.
2. The population from which the sample is
selected has a distribution that is
approximately normal.
Two-Tailed t Test
Example
Does an average box of
cereal contain 368 grams
of cereal? A random
sample of 36 boxes had
a mean of 372.5 and a
standard deviation of 12
grams. Test at the .05
level of significance.
Two-Tailed t Test
Solution
H0: µ = 368
Ha: µ ≠ 368
α = .05
df = 36 – 1 = 35
Critical Value(s): t = ±2.030 (‘Reject H0’ regions of area .025 in each tail)
Test Statistic:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{372.5 - 368}{12/\sqrt{36}} = +2.25$
Decision:
Reject at α = .05
Conclusion:
There is evidence population average is not 368
Two-Tailed t Test
Thinking Challenge
You work for the FTC. A
manufacturer of detergent claims
that the mean weight of detergent
is 3.25 lb. You take a random
sample of 64 containers. You
calculate the sample average to be
3.238 lb. with a standard deviation
of .117 lb. At the .01 level of
significance, is the manufacturer
correct?
Two-Tailed t Test
Solution*
H0: µ = 3.25
Ha: µ ≠ 3.25
α = .01
df = 64 – 1 = 63
Critical Value(s): t = ±2.656 (‘Reject H0’ regions of area .005 in each tail)
Test Statistic:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{3.238 - 3.25}{.117/\sqrt{64}} = -.82$
Decision:
Do not reject at α = .01
Conclusion:
There is no evidence average is not 3.25
One-Tailed t Test
Example
Is the average capacity of
batteries less than 140
ampere-hours? A random
sample of 20 batteries had a
mean of 138.47 and a
standard deviation of 2.66.
Assume a normal
distribution. Test at the .05
level of significance.
One-Tailed t Test
Solution
H0: µ = 140
Ha: µ < 140
α = .05
df = 20 – 1 = 19
Critical Value(s): t = –1.729 (‘Reject H0’ region of area .05 in the lower tail)
Test Statistic:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{138.47 - 140}{2.66/\sqrt{20}} = -2.57$
Decision:
Reject at α = .05
Conclusion:
There is evidence population average is less than 140
One-Tailed t Test
Thinking Challenge
You’re a marketing analyst for Wal-
Mart. Wal-Mart had teddy bears on
sale last week. The weekly sales
($ 00) of bears sold in 10 stores
was:
8 11 0 4 7 8 10 5 8 3
At the .05 level of significance, is
there evidence that the average
bear sales per store is more than 5
($ 00)?
One-Tailed t Test
Solution*
H0: µ = 5
Ha: µ > 5
α = .05
df = 10 – 1 = 9
Critical Value(s): t = 1.833 (‘Reject H0’ region of area .05 in the upper tail)
Test Statistic:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{6.4 - 5}{3.373/\sqrt{10}} = +1.31$
Decision:
Do not reject at α = .05
Conclusion:
There is no evidence average is more than 5
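The sample mean, sample standard deviation, and t statistic in this solution can be reproduced from the ten store values. A minimal sketch using only Python's standard library; the critical value 1.833 is copied from the slide (upper-tail area .05, 9 degrees of freedom), not computed.

```python
from math import sqrt
from statistics import mean, stdev

sales = [8, 11, 0, 4, 7, 8, 10, 5, 8, 3]  # weekly bear sales in the 10 stores
mu0 = 5                                   # hypothesized mean under H0

n = len(sales)
xbar = mean(sales)   # sample mean
s = stdev(sales)     # sample standard deviation (divides by n - 1)
t = (xbar - mu0) / (s / sqrt(n))

t_crit = 1.833       # taken from the slide, not computed here
reject = t > t_crit  # upper-tailed test: reject H0 when t > t_alpha

print(round(xbar, 1), round(s, 3), round(t, 2), reject)  # 6.4 3.373 1.31 False
```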
7.6

Large-Sample Test of
Hypothesis about a Population
Proportion
Large-Sample Test of
Hypothesis about p
One-Tailed Test
H0: p = p0
Ha: p < p0 (or Ha: p > p0)

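Test statistic: $z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}}$ where $\sigma_{\hat{p}} = \sqrt{p_0 q_0 / n}$ and $q_0 = 1 - p_0$
Rejection region:
$z < -z_\alpha$ (or $z > z_\alpha$ when Ha: p > p0)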
Note: p0 is the symbol for the numerical value of p
assigned in the null hypothesis
Large-Sample Test of
Hypothesis about p
Two-Tailed Test
H0: p = p0
Ha: p ≠ p0
Test statistic: $z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}}$ where $\sigma_{\hat{p}} = \sqrt{p_0 q_0 / n}$ and $q_0 = 1 - p_0$
Rejection region: $|z| > z_{\alpha/2}$
Note: p0 is the symbol for the numerical value of p
assigned in the null hypothesis
Conditions Required for a
Valid Large-Sample
Hypothesis Test for p
1. A random sample is selected from a
binomial population.
2. The sample size n is large. (This condition
will be satisfied if both np0 ≥ 15 and
nq0 ≥ 15.)
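The test statistic and the sample-size condition can be combined in one small function. This is a sketch only: the name `one_proportion_z` is made up for illustration, and the inputs are the figures from the two proportion examples that follow in these slides.

```python
from math import sqrt


def one_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, after checking n*p0 >= 15 and n*q0 >= 15."""
    q0 = 1 - p0
    if n * p0 < 15 or n * q0 < 15:
        raise ValueError("sample too small for the large-sample z test")
    p_hat = successes / n            # sample proportion
    sigma_p_hat = sqrt(p0 * q0 / n)  # standard deviation of p-hat under H0
    return (p_hat - p0) / sigma_p_hat


# Cereal boxes: 11 defects in 200 boxes, p0 = .10 (lower-tailed, critical value -1.645)
z_boxes = one_proportion_z(11, 200, 0.10)
print(round(z_boxes, 2), z_boxes < -1.645)     # -2.12 True

# Audit: 25 errors in 500 transactions, p0 = .04 (two-tailed, critical values +/-1.96)
z_audit = one_proportion_z(25, 500, 0.04)
print(round(z_audit, 2), abs(z_audit) > 1.96)  # 1.14 False
```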
One-Proportion z Test
Example
The present packaging
system produces 10%
defective cereal boxes.
Using a new system, a
random sample of 200
boxes had11 defects.
Does the new system
produce fewer defects?
Test at the .05 level of
significance.
One-Proportion z Test
Solution
H0: p = .10
Ha: p < .10
α = .05
n = 200
Critical Value(s): z = –1.645 (‘Reject H0’ region of area .05 in the lower tail)
Test Statistic:
$z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{\frac{11}{200} - .10}{\sqrt{(.10)(.90)/200}} = -2.12$
Decision:
Reject at α = .05
Conclusion:
There is evidence new system < 10% defective
One-Proportion z Test
Thinking Challenge
You’re an accounting manager.
A year-end audit showed 4% of
transactions had errors. You
implement new procedures. A
random sample of 500
transactions had 25 errors. Has
the proportion of incorrect
transactions changed at the .05
level of significance?
One-Proportion z Test
Solution*
H0: p = .04
Ha: p ≠ .04
α = .05
n = 500
Critical Value(s): z = ±1.96 (‘Reject H0’ regions of area .025 in each tail)
Test Statistic:
$z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{\frac{25}{500} - .04}{\sqrt{(.04)(.96)/500}} = +1.14$
Decision:
Do not reject at α = .05
Conclusion:
There is no evidence proportion is not 4%
7.7

Test of Hypothesis about a


Population Variance
Variance

Although many practical problems involve
inferences about a population mean (or
proportion), it is sometimes of interest to
make an inference about a population
variance, σ².
Test of a Hypothesis about σ²
One-Tailed Test
H0: σ² = σ0²
Ha: σ² < σ0² (or Ha: σ² > σ0²)
Test statistic: $\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$
Rejection region: $\chi^2 < \chi^2_{(1-\alpha)}$
(or $\chi^2 > \chi^2_{\alpha}$ when Ha: σ² > σ0²)
where σ0² is the hypothesized variance and the
distribution of χ² is based on (n – 1) degrees of
freedom.
Test of a Hypothesis about σ²
Two-Tailed Test
H0: σ² = σ0²
Ha: σ² ≠ σ0²
Test statistic: $\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$
Rejection region: $\chi^2 < \chi^2_{(1-\alpha/2)}$ or $\chi^2 > \chi^2_{(\alpha/2)}$
where σ0² is the hypothesized variance and the
distribution of χ² is based on (n – 1) degrees of
freedom.
Conditions Required for a
Valid Hypothesis Test for σ²
1. A random sample is selected from the
target population.
2. The population from which the sample is
selected has a distribution that is
approximately normal.
Several χ² probability
Distributions
Critical Values of Chi Square
Finding Critical Value
Example
What is the critical χ² value given:
Ha: σ² > 0.7
n = 3
α = .05?
df = n – 1 = 2
Upper tail area = α = .05
Critical value: χ² = 5.991 (reject H0 if χ² > 5.991)
χ² Table (Portion)
      Upper Tail Area
DF    .995    …    .95     …    .05
1     ...     …    0.004   …    3.841
2     0.010   …    0.103   …    5.991
Finding Critical Value
Example
What is the critical χ² value given:
Ha: σ² < 0.7
n = 3
α = .05?
What do you do if the rejection region is on the left?
Finding Critical Value
Example
What is the critical χ² value given:
Ha: σ² < 0.7
n = 3
α = .05?
df = n – 1 = 2
Upper Tail Area for Lower Critical Value = 1 – .05 = .95
Critical value: χ² = .103 (reject H0 if χ² < .103)
χ² Table (Portion)
      Upper Tail Area
DF    .995    …    .95     …    .05
1     ...     …    0.004   …    3.841
2     0.010   …    0.103   …    5.991
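The df = 2 row of the table can be checked without a lookup, because for 2 degrees of freedom (and only for that case) the upper-tail area of the χ² distribution has the closed form exp(–x/2). The sketch below relies on that special case; the function names are invented for illustration, and other degrees of freedom still need the table or statistical software.

```python
from math import exp, log


def chi2_upper_tail_area_df2(x):
    """P(chi-square > x) when df = 2 (special closed form, not valid for other df)."""
    return exp(-x / 2)


def chi2_critical_df2(upper_tail_area):
    """Value with the given upper-tail area when df = 2 (inverse of the function above)."""
    return -2 * log(upper_tail_area)


print(round(chi2_critical_df2(0.05), 3))   # 5.991  upper critical value, alpha = .05
print(round(chi2_critical_df2(0.95), 3))   # 0.103  lower critical value, 1 - alpha = .95
print(round(chi2_critical_df2(0.995), 3))  # 0.01   (the table prints 0.010)
```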
Chi-Square (χ²) Test
Example
Is the variation in boxes
of cereal, measured by
the standard deviation, equal to 15
grams (variance 15² = 225)? A random
sample of 25 boxes had
a standard deviation of
17.7 grams. Test at
the .05 level of
significance.
Chi-Square (χ²) Test
Solution
H0: σ² = 15² = 225
Ha: σ² ≠ 225
α = .05
df = 25 – 1 = 24
Critical Value(s): χ² = 12.401 and χ² = 39.364 (area α/2 = .025 in each tail)
Test Statistic:
$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{(25 - 1)(17.7)^2}{15^2} = 33.42$
Decision:
Do not reject at α = .05
Conclusion:
There is no evidence σ² is not 225
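The arithmetic of the test statistic can be checked with a few lines. A sketch only: the function name is hypothetical, and the two critical values are copied from the slide (df = 24, area .025 in each tail) rather than computed.

```python
def chi_square_statistic(n, s, sigma0):
    """(n - 1) s^2 / sigma0^2 for a test about a population variance."""
    return (n - 1) * s**2 / sigma0**2


chi2 = chi_square_statistic(n=25, s=17.7, sigma0=15)

lower_crit, upper_crit = 12.401, 39.364  # taken from the slide, not computed here
reject = chi2 < lower_crit or chi2 > upper_crit

print(round(chi2, 2), reject)  # 33.42 False
```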
7.8

Calculating Type II Error


Probabilities: More about β
Type II Error
The Type II error probability β is calculated
assuming that the null hypothesis is false
because it is defined as the probability of
accepting H0 when it is false. The situation
corresponding to accepting the null
hypothesis, and thereby risking a Type II
error, is not generally as controllable. For
that reason, we adopted a policy of
nonrejection of H0 when the test statistic
does not fall in the rejection region, rather
than risking an error of unknown magnitude.
Steps for Calculating β for a
Large-Sample Test about µ
1. Calculate the value(s) of x̄ corresponding
to the border(s) of the rejection region.
There will be one border value for a one-
tailed test and two for a two-tailed test.
The formula is one of the following,
corresponding to a test with level of
significance α:
Upper-tailed test: $\bar{x}_0 = \mu_0 + z_\alpha \sigma_{\bar{x}} \approx \mu_0 + z_\alpha \left(\frac{s}{\sqrt{n}}\right)$
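Steps for Calculating β for a
Large-Sample Test about µ
Lower-tailed test: $\bar{x}_0 = \mu_0 - z_\alpha \sigma_{\bar{x}} \approx \mu_0 - z_\alpha \left(\frac{s}{\sqrt{n}}\right)$
Two-tailed test:
$\bar{x}_{0,L} = \mu_0 - z_{\alpha/2} \sigma_{\bar{x}} \approx \mu_0 - z_{\alpha/2} \left(\frac{s}{\sqrt{n}}\right)$
$\bar{x}_{0,U} = \mu_0 + z_{\alpha/2} \sigma_{\bar{x}} \approx \mu_0 + z_{\alpha/2} \left(\frac{s}{\sqrt{n}}\right)$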
Steps for Calculating β for a
Large-Sample Test about µ
2. Specify the value of µa in the alternative
hypothesis for which the value of β is to
be calculated. Then convert the border
value(s) of x̄0 to z-value(s) using the
alternative distribution with mean µa. The
general formula for the z-value is
$z = \frac{\bar{x}_0 - \mu_a}{\sigma_{\bar{x}}}$
Steps for Calculating β for a
Large-Sample Test about µ
3. Sketch the alternative distribution
(centered at µa) and shade the area in the
acceptance (nonrejection) region. Use the
z-statistic(s) and Table II in Appendix D to
find the shaded area, which is β.
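The three steps can be sketched for a lower-tailed test as follows. This is an illustration, not the textbook's procedure verbatim: the function name `beta_lower_tailed` is invented, σ is assumed known, and the standard-library `statistics.NormalDist` replaces Table II. The inputs are the cereal values used on the Finding Power slides (µ0 = 368, µa = 360, σ = 15, n = 25, α = .05).

```python
from math import sqrt
from statistics import NormalDist


def beta_lower_tailed(mu0, mu_a, sigma, n, alpha):
    """Return (border value of x-bar, beta) for a lower-tailed test with known sigma."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                     # standard deviation of x-bar
    z_alpha = std_normal.inv_cdf(1 - alpha)
    x_border = mu0 - z_alpha * se            # step 1: border of the rejection region
    z = (x_border - mu_a) / se               # step 2: border as a z-value under mu_a
    beta = 1 - std_normal.cdf(z)             # step 3: area in the nonrejection region
    return x_border, beta


x_border, beta = beta_lower_tailed(368, 360, 15, 25, 0.05)
print(round(x_border, 3))   # 363.065
print(round(beta, 2))       # 0.15
print(round(1 - beta, 2))   # 0.85
```

Without rounding z to two decimals, β comes out near .153; the slides' .154 reflects the table lookup at z = 1.02.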
Power of Test
Probability of rejecting false H0
• Correct decision

Equal to 1 – β
Used in determining test adequacy
Affected by
• True value of population parameter
• Significance level α
• Standard deviation & sample size n
Two-Tailed z Test Example
Does an average box of
cereal contain 368 grams of
cereal? A random sample
of 25 boxes had x̄ = 372.5.
The company has specified
σ to be 15 grams. Test at
the .05 level of significance.
Finding Power
Step 1
Hypothesis:
H0: µ ≥ 368
Ha: µ < 368
α = .05, σ/√n = 15/√25 = 3
[Figure: draw the sampling distribution of x̄ centered at µ0 = 368, with the ‘Reject H0’ region of area α = .05 in the lower tail and the ‘Do Not Reject H0’ region to its right.]
Finding Power
Steps 2 & 3
‘True’ Situation:
µa = 360 (Ha)
[Figure: specify µa = 360 and draw the sampling distribution of x̄ centered there; the area in the nonrejection region is β and the area in the rejection region is 1 – β.]
Finding Power
Step 4
Border of the rejection region:
$\bar{x}_L = \mu_0 - z_\alpha \frac{\sigma}{\sqrt{n}} = 368 - 1.645 \cdot \frac{15}{\sqrt{25}} = 363.065$
Finding Power
Step 5
Border value as a z-value under the ‘true’ mean µa = 360:
$z = \frac{\bar{x}_L - \mu_a}{\sigma/\sqrt{n}} = \frac{363.065 - 360}{15/\sqrt{25}} = 1.02$
From the z table: β = .5000 – .3461 = .154
Power = 1 – β = .846
Properties of
β and Power
1. For fixed n and α, the value of β decreases, and the
power increases as the distance between the specified
null value µ0 and the specified alternative value µa
increases.
Properties of
β and Power
2. For fixed n and values of µ0 and µa, the value of β
increases, and the power decreases as the value of α
is decreased.
Properties of β and Power
3. For fixed α and values of µ0 and µa, the value of
β decreases, and the power increases as the
sample size n is increased.
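The three properties can be seen numerically by changing one input at a time. The sketch assumes a lower-tailed test with known σ and starts from the cereal values on the Finding Power slides; the function name `power_lower_tailed` and the alternative settings (µa = 355, α = .01, n = 100) are chosen here purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tailed(mu0, mu_a, sigma, n, alpha):
    """Power (1 - beta) of a lower-tailed z test when the true mean is mu_a."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    x_border = mu0 - std_normal.inv_cdf(1 - alpha) * se  # border of rejection region
    return std_normal.cdf((x_border - mu_a) / se)        # P(x-bar < border | mu_a)


base = power_lower_tailed(368, 360, 15, 25, 0.05)           # baseline from the slides
farther_mu_a = power_lower_tailed(368, 355, 15, 25, 0.05)   # property 1: mu_a farther from mu0
smaller_alpha = power_lower_tailed(368, 360, 15, 25, 0.01)  # property 2: alpha decreased
larger_n = power_lower_tailed(368, 360, 15, 100, 0.05)      # property 3: n increased

print(farther_mu_a > base)   # True: power increases, beta decreases
print(smaller_alpha < base)  # True: power decreases, beta increases
print(larger_n > base)       # True: power increases, beta decreases
```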
