CH 06 Updatedround 2
Two Types of Problems
• For the remainder of the semester we will be
focusing on only two types of problems
– Hypothesis Testing
– Confidence Intervals
Example: Manufacturing
• Suppose you run a manufacturing plant
Hypothesis Testing
• The ideas on the previous slides form the basis of
hypothesis testing
• We formulate a hypothesis, and test it using what we
know about the normal distribution
• The hypothesis is that Bob cheated
• We can’t know for sure if Bob cheated
• We look at the statistical evidence
– Either Bob cheated, or something very unusual happened
• We can re-state this hypothesis as: Is Bob’s score a
sample from a distribution that is N(100,25) or is it a
score from a sample with a much higher mean?
• Example: Scores for a test are distributed N(100,25), that is, mean 100 and standard
deviation 25. We make improvements and give it to 81 students
• We want to know whether the test will still produce a population of scores that is N(100,25),
or whether the scores will go up
• Suppose the average score for the sample of 81 students is 110
• If the population distribution is N(100,25), what will be the distribution of the SET of
averages from many samples (otherwise known as the sampling distribution)?
– It is ~N(100,25/9), because the standard deviation of the averages is 25/sqrt(81) = 25/9
• The chance of seeing an average of 110 or higher is =1-NORM.DIST(110,100,25/9,TRUE) = 0.0002
• So, we believe that the POPULATION mean is not 100, but is actually a larger number
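The Excel calculation above can also be sketched in Python. This is a minimal sketch using the standard library's NormalDist; the variable names (pop_mean, pop_sd, standard_error, and so on) are illustrative and are not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

# Values from the test-score example on the slide
pop_mean = 100.0     # mean under the null hypothesis
pop_sd = 25.0        # population standard deviation
n = 81               # sample size
sample_mean = 110.0  # observed average of the sample

# Standard deviation of the sampling distribution of the average: 25/9
standard_error = pop_sd / sqrt(n)

# Sampling distribution of the average if the population is N(100,25)
sampling_dist = NormalDist(mu=pop_mean, sigma=standard_error)

# Chance of an average of 110 or higher, same as 1-NORM.DIST(110,100,25/9,TRUE)
p_value = 1.0 - sampling_dist.cdf(sample_mean)

print(round(p_value, 4))  # 0.0002
```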
Hypothesis Testing
• All we have to do now is define things more formally
• Ho (pronounced “H-naught”) is the null hypothesis. This is
the thing we wish to ‘disprove’ or reject
– We wished to stop believing that the population average for
the test is 100
Hypothesis Testing
• The level of the test (also called the alpha-level or just the
alpha) is the significance level
• This is the chance that we would reject the null hypothesis
due to random chance alone, when it is actually true. This is
usually a small number like 10%, 5%, or 1%
• We calculate the test statistic under the null hypothesis
– What are the chances that an average of the sample of test scores
would be 110 or higher if the population average is 100?
• If the chance of this is very small, smaller than the alpha level
we set for the test, then we reject the null hypothesis
– We therefore imply that we believe the alternative hypothesis
• So we conclude that the population average for the test has gone up
• If we give the test in the future, we believe the overall average is higher
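The decision rule described above can be sketched as a small function. This is a sketch under stated assumptions: the function name upper_tail_z_test is hypothetical, the population standard deviation is assumed known, and the alpha of 0.05 in the usage line is chosen only for illustration (the slide does not fix an alpha for this example).

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, null_mean, pop_sd, n, alpha):
    """Test Ho: mu = null_mean against Ha: mu > null_mean.

    Returns the p-value and whether Ho is rejected at the given alpha.
    """
    standard_error = pop_sd / sqrt(n)
    # Chance of a sample average this high or higher if Ho is true
    p_value = 1.0 - NormalDist(null_mean, standard_error).cdf(sample_mean)
    # Reject Ho only when that chance is smaller than alpha
    return p_value, p_value < alpha


# Test-score example: average of 110 from 81 students, Ho: mu = 100
p_value, reject = upper_tail_z_test(110, 100, 25, 81, alpha=0.05)
print(reject)  # True
```

With an average of 101 instead of 110, the same call would not reject Ho, because an average that close to 100 is not unusual when the population mean is 100.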
• Ha: μ>100
P-Value Definition
Hypothesis Testing
• This example demonstrated the key items of
hypothesis testing
Alpha
• We have a significance level (alpha) set up in advance of
performing the analysis
• The significance level is the threshold we establish for
how unlikely the value of the test statistic (the
average from the sample) must be under random chance alone
• Only if the probability we calculate falls below this threshold do
we conclude that the evidence favors a change in the population
parameter
– In this case the alpha level was set at 10%
– The test demonstrated that we could be confident at a 10% level
that the average processing time in the population (all orders) had
decreased
• Note that the p-value is 2.3%: if the population has not
changed, this is the probability of getting x-bar = 3.6 or lower
Hypothesis Testing
• Ho (pronounced “H-naught”) is the null hypothesis. This is
the thing we wish to ‘disprove’ or reject
– We wished to stop believing that average processing time had not
decreased
Two-Sided Test
• Sometimes we are interested in whether a
population parameter changed (e.g. went up or
down)
• In a two-sided test the alternative hypothesis is of
the form
– Ha: μ ≠ 3.8 or
– Ha: μ ≠ 100
• We would use a two-sided hypothesis test if we
made changes to an assembly process and we
didn’t know whether the average assembly time
went up or down
– More on this later
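A two-sided p-value can be sketched as follows. This is a minimal sketch, not the method from the slides (which defer the details): it assumes a known population standard deviation and reuses the test-score numbers with Ha: μ ≠ 100. The function name two_sided_p_value is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean, null_mean, pop_sd, n):
    """p-value for Ho: mu = null_mean against Ha: mu != null_mean."""
    # Number of standard errors between the sample average and the null mean
    z = (sample_mean - null_mean) / (pop_sd / sqrt(n))
    # Count results this extreme in BOTH tails of the standard normal
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# An average 10 points above or 10 points below 100 is equally extreme
p_high = two_sided_p_value(110, 100, 25, 81)
p_low = two_sided_p_value(90, 100, 25, 81)
print(round(p_high, 4), round(p_low, 4))  # 0.0003 0.0003
```

For the same data, the two-sided p-value is twice the one-sided p-value, so a two-sided test needs stronger evidence to reject Ho at the same alpha.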
• Since the mean of the sampling distribution of x-bar is the same as the
mean of the population, our best guess for the mean of the population is
x-bar
If You Don’t Know µ, Use x-bar
• Let’s say you are studying the shoe size of UIC students
• You take a random sample of 36 students and the average is 6.5.
Assume the standard deviation is 0.5
• Someone forces you to make an
estimate of the average of the
population. What would you do?
• Sensible to use the average of the
sample: 6.5
• In fact, x-bar is always the best estimate for
µ according to the statistical rules of
highly dark magic
• We can make these statements because the CLT tells us that the
distance from µ to x-bar has certain characteristics
Determining Confidence
So if x-bar is 6.5 and we estimate
that µ is between 6.36 and 6.64
then we have a 10% risk of
being wrong
So if we say that our best estimate for µ is 6.5 but that we believe that it lies somewhere
between 6.36 and 6.64 then we can be 90% confident in this statement.
So (6.36,6.64) is a 90% confidence interval for µ
Using the same logic but different values,
(6.34,6.66) is 95% confidence interval and
(6.29,6.71) is 99% confidence interval
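The three intervals above can be reproduced with a short calculation. This is a sketch that assumes the normal (z) multiplier applies, with the standard deviation of 0.5 and the sample of 36 from the shoe-size example; the function name z_confidence_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sd, n, confidence):
    """Interval sample_mean +/- z* x sd/sqrt(n) for the given confidence level."""
    # z* leaves (1 - confidence)/2 in each tail of the standard normal
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z_star * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Shoe-size example: x-bar = 6.5, standard deviation 0.5, n = 36
for level in (0.90, 0.95, 0.99):
    low, high = z_confidence_interval(6.5, 0.5, 36, level)
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})")
# 90%: (6.36, 6.64)
# 95%: (6.34, 6.66)
# 99%: (6.29, 6.71)
```

The interval widens as the confidence level rises, because a larger z* is needed to leave less probability in the tails.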
Chapter 6
• If you test the same set of data with 20 different null hypotheses all at
alpha=.05, on average about one of them will come up significant based on
random chance alone, even if every null hypothesis is true
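The arithmetic behind this warning can be sketched as follows. Note the assumption, which is not stated on the slide: the 20 tests are treated as independent and every null hypothesis is taken to be true. The variable names are illustrative.

```python
alpha = 0.05     # significance level used for every test
num_tests = 20   # number of null hypotheses tested

# Expected number of tests that come up significant by chance alone
expected_false_positives = alpha * num_tests

# Chance that at least one test comes up significant by chance alone,
# ASSUMING the tests are independent
prob_at_least_one = 1.0 - (1.0 - alpha) ** num_tests

print(round(expected_false_positives, 2))  # 1.0
print(round(prob_at_least_one, 2))         # 0.64
```

So one false alarm is the average outcome rather than a certainty: under these assumptions there is roughly a 64% chance of at least one.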
• You can
– Decrease the confidence percentage and
therefore get a narrower interval
Chapter 6
Section 4
Power
Beta
Definition of Power