Professional Documents
Culture Documents
Lecture 4 Hypothesis Sampling
Lecture 4 Hypothesis Sampling
Lecture 4 Hypothesis Sampling
x
̂ = X n = i =1
n
n
ˆ 2 = s 2 = i =1
n −1
observable)
Sample *hat notation ^ is often used to indicate
“estitmate”
Population (observation)
parameters
N N
x
i =1
(x − )
i
2
= 2 = i =1
N N
Make guesses
about the whole
population
Two different inference-making procedures
– population mean
Example: The mean monthly cell phone bill
in this city is μ = $42
– population proportion
Example: The proportion of adults in this
city with cell phones is π = 0.68
The problem of hypotheses testing
Statement of the Problem:
• Given a population (equivalently a distribution) with a
parameter of interest, , (which could be the mean, variance,
standard deviation, proportion, etc.), we would like to
decide/choose between two complementary statements
concerning . These statements are called statistical
hypotheses.
• The choice or decision between these hypotheses is to be
based on a sample data taken from the population of
interest.
• The ideal goal is to be able to choose the hypothesis that is
true in reality based on the sample data.
Situations where Hypotheses Testing is
Relevant
• Example: A quality engineer would like to determine whether the
production process he is charged of monitoring is still producing
products whose mean response value is supposed to be 0 (process is
in-control), or whether it is producing products whose mean response
value is now different from the required value of 0 (process is out-of-
control).
8
The Hypothesis Testing Process
Population
Sample
The Hypothesis Testing Process
(continued)
• Suppose the sample mean age was X = 20.
Sampling
Distribution of X
X
20 μ = 50
If H0 is true ... then you reject
If it is unlikely that you
the null hypothesis
would get a sample
that μ = 50.
mean of this value ... ... When in fact this were
the population mean…
The Test Statistic and
Critical Values
• If the sample mean is close to the assumed
population mean, the null hypothesis is not
rejected.
• If the sample mean is far from the assumed
population mean, the null hypothesis is rejected.
• How far is “far enough” to reject H0?
• The critical value of a test statistic creates a “line
in the sand” for decision making -- it answers the
question of how far is far enough.
The Test Statistic and
Critical Values
Sampling Distribution of the test statistic
Region of Region of
Rejection Rejection
Region of
Non-Rejection
Critical Values
• Type I Error
– Reject a true null hypothesis
– Considered a serious type of error
– The probability of a Type I Error is
• Called level of significance of the test
• Set by researcher in advance
• Type II Error
– Failure to reject false null hypothesis
– The probability of a Type II Error is β
Level of Significance
and the Rejection Region
H0: μ = 3 Level of significance =
H1: μ ≠ 3
/2 /2
Critical values
Rejection Region
• Assumptions
-Independent observations
-normal or “well behaved” distributions
-populations have equal variances
17
-at least interval data measurement scale
Classes of Significance Tests
• Non parametric tests
ex Chi-Square tests or Wilcoxon or Mann-Whitney tests
• Assumptions
-normal distribution not necessary
-independent observations for some tests only
-homogeneity of variance not necessary
-appropriate for nominal and ordinal data, may be
used for interval or ratio data.
Characteristics of distribution-free or
non-parametric tests
• They do not suppose any particular distribution and
the assumptions
• They are rather quick and easy to use since the
observations are replaced by their rank order
• They are not as efficient as parametric tests
• Parametric tests cannot apply to ordinal or nominal
scale but non-parametric tests do not suffer from
any such limitation
To summarize
• Parametric methods are powerful because they
utilize all the information in the data but require
fairly strict assumptions for their application
Hypothesis
Tests for
Known Unknown
(Z test) (t test)
Z Test of Hypothesis for the
Mean (σ Known)
• Convert sample statistic ( )Xto a ZSTAT test statistic
Hypothesis
Tests for
σKnown
Known σUnknown
Unknown
(Z test) (t test)
The test statistic is:
X−μ
Z STAT =
σ
n
Critical Value
Approach to Testing
• For a two-tail test for the mean, σ known:
• Convert sample statistic (X ) to test statistic
(ZSTAT)
• Determine the critical Z values for a specified
level of significance from a table or
computer
• Decision Rule: If the test statistic falls in the
rejection region, reject H0 ; otherwise do not
reject H
Two-Tail Tests
H0: μ = 3
◼ There are two
H1: μ 3
cutoff values
(critical values),
defining the
regions of /2 /2
rejection
3 X
Reject H0 Do not reject H0 Reject H0
-Zα/2 0 +Zα/2 Z
Lower Upper
critical critical
value value
6 Steps in
Hypothesis Testing
= 0.05/2 = 0.05/2
4. Collect data and compute the value of the test statistic and the p-
value
X − μ 2.84 − 3 − .16
Z STAT = = = = − 2.0
σ 0.8 .08
n 100
p-Value Hypothesis Testing Example:
Calculating the p-value
4. (continued) Calculate the p-value.
– How likely is it to get a ZSTAT of -2 (or something further from the
mean (0), in either direction) if H0 is true?
0 Z
-2.0 2.0
p-value = 0.0228 + 0.0228 = 0.0456
p-value Hypothesis Testing Example
(continued)
• Probably not!
σKnown
Known σUnknown
Unknown
(Z test) (t test)
The test statistic is:
X−μ
t STAT =
S
n
Example: Two-Tail Test
( Unknown)
The average cost of a hotel
room in New York is said
to be $168 per night. To
determine if this is true, a
random sample of 25
hotels is taken and resulted
in an X of $172.50 and an H0: μ = 168
S of $15.40. Test the H1: μ 168
appropriate hypotheses at
= 0.05.
(Assume the population distribution is normal)
Example Solution:
Two-Tail t Test
H0: μ = 168 /2=.025 /2=.025
H1: μ 168
Data
Null Hypothesis µ= $ 168.00
Level of Significance 0.05
Sample Size 25
Sample Mean $ 172.50
Sample Standard Deviation $ 15.40
Intermediate Calculations
Standard Error of the Mean $ 3.08 =B8/SQRT(B6)
Degrees of Freedom 24 =B6-1
t test statistic 1.46 =(B7-B4)/B11
Two-Tail Test
p-value > α Lower Critical Value
Upper Critical Value
-2.0639 =-TINV(B5,B12)
2.0639 =TINV(B5,B12)
So do not reject H0 p-value 0.157 =TDIST(ABS(B13),B12,2)
Do Not Reject Null Hypothesis =IF(B18<B5, "Reject null hypothesis",
"Do not reject null hypothesis")
Example Two-Tail t Test Using
A p-value from Minitab
One-Sample T
p-value > α
So do not reject H0
One-Tail Tests
μ X
Critical value
Upper-Tail Tests
H0: μ ≤ 3
◼ There is only one
critical value, since H1: μ > 3
the rejection area is
in only one tail
Critical value
Example: Upper-Tail t Test
for Mean ( unknown)
A phone industry manager thinks that
customer monthly cell phone bills have
increased, and now average over $52 per
month. The company wishes to test this
claim. (Assume a normal population)
Form hypothesis test:
H0: μ =52 the average is $52 per month
H1: μ > 52 the average is greater than $52 per month
(i.e., sufficient evidence exists to support the
manager’s claim)
Example: Find Rejection Region
(continued)
• Suppose that = 0.10 is chosen for this test and n =
25.
Find the rejection region: Reject H0
= 0.10
= 0.10
Reject H0
= .10
0
Do not reject Reject
H0 1.318 H0
tSTAT = .55
Data
Null Hypothesis µ= 52.00
Level of Significance 0.1
Sample Size 25
Sample Mean 53.10
Sample Standard Deviation 10.00
Intermediate Calculations
Standard Error of the Mean 2.00 =B8/SQRT(B6)
Degrees of Freedom 24 =B6-1
t test statistic 0.55 =(B7-B4)/B11
• The sampling
distribution of p is Hypothesis
approximately Tests for p
normal, so the test
statistic is a ZSTAT
value: nπ 5
and
p−π n(1-π) 5
ZSTAT =
π (1 − π )
n
Example: Z Test for Proportion
A marketing company
claims that it receives 8%
responses from its mailing.
To test this claim, a
random sample of 500
were surveyed with 25
responses. Test at the = Check:
Do not reject H0
Reject H0 Reject H0 p-value = 0.0136:
/2 = .025 /2 = .025
P(Z −2.47) + P(Z 2.47)
0.0068 0.0068
= 2(0.0068) = 0.0136
-1.96 0 1.96
Z = -2.47 Z = 2.47