
Chapter 6

Two Types of Problems
• For the remainder of the semester we will be
focusing on only two types of problems
– Hypothesis Testing
– Confidence Intervals

• Why spend rest of semester on two problems?


• Because we change the setting
– Studying a number (e.g. Average Salary)
– Studying a category (e.g. favorite candy bar)
– Etc.

The CLT Is Key


• For both types of problems we will be working on
averages
• Therefore the Central Limit Theorem (CLT) is very
important
• It tells us that a set of sample averages will follow a
bell-shaped curve and have a certain mean and
standard deviation
• This allows us to use Table A (or Excel) to estimate
how rare it would be to get a single average
– Similar to midterm problem: If exam scores are
N(600, 60), how many people score 700 or above

Example: Manufacturing
• Suppose you run a manufacturing plant

• Your computer systems can record how long it takes to
manufacture every single order

• So you can actually know µ and σ because you have data on
the entire population of order times
– Let’s say µ=3.8 and σ=1

• Your quality control system tracks the average time for
every 25 orders (so n=25)
• So you also have the……
– Sampling distribution

Apply the CLT


• If µ=3.8, σ=1 and n=25 then we KNOW
that the sampling distribution will have
– Bell shape
– Mean=3.8
– Standard deviation=1/5 (σ/√n)

Use the CLT


• You are coming back to the plant from lunch and you
overhear some of your employees saying that there is a
problem with the machinery
• You look at the figures for your last set of 25 orders and
the average time was 4.48
• Based on this, should you believe the rumor?
• Well, you can make your decision based on how rare it
would be to get this (one sample average) if everything is
working fine
– That is, if the mean of the sampling distribution=3.8 and the
standard deviation=0.2 (1/5)
• From Excel=1-NORM.DIST(4.48,3.8,0.2,TRUE)=0.0003
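
The Excel calculation above can also be reproduced with Python's standard library. This is a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

# Population values from the example (known because every order is recorded)
mu, sigma, n = 3.8, 1.0, 25
x_bar = 4.48  # average time for the last 25 orders

# CLT: sample averages are approximately N(mu, sigma / sqrt(n))
std_error = sigma / sqrt(n)  # 1/5 = 0.2
sampling_dist = NormalDist(mu, std_error)

# Upper-tail area: chance of an average of 4.48 or more if nothing changed
p_upper = 1 - sampling_dist.cdf(x_bar)
print(f"standard error = {std_error:.2f}, P(x-bar >= {x_bar}) = {p_upper:.4f}")
```

The third argument of NORM.DIST plays the same role as `std_error` here: it is the standard deviation of the sampling distribution, not of the population.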

Applying the CLT


• Either there is a problem with the equipment
Or
• Something that should only happen 3 times
out of every 10,000 has happened
• I think you should look into the rumor

Using table: Z=(4.48-3.8)/0.2=3.4
Table A for 3.4 is 0.9997
1-0.9997=0.0003 for area to right

That’s Hypothesis Testing!


• Going through those same steps forms the basis of hypothesis
testing
– One of the two main things we will be studying

• All we need to do now is fill in a number of details and notation

• But the basic idea is this


1. You have an understanding or assumption about the population
2. You suspect that the population may have changed
3. You draw some sample data from that population & do some calculations
4. You find out how unusual it would be to get that data if your assumption
about the population is still true
5. If it is very unusual, then you stop believing your assumption about the
population and start believing something else

Hypothesis Testing & Example


General Step | From Example
You have an understanding or assumption about the population | Your factory’s system measures every order, so you know that µ=3.8 & σ=1
You suspect that the population may have changed | You hear a rumor that there is trouble with the machinery
You draw some sample data from that population & do some calculations | You look at the last 25 orders. The average was 4.48
You find out how unusual it would be to get that data if your assumption about the population is still true | You use what the CLT tells us about the sampling distribution and find out that the probability of a single x-bar of 4.48 is 0.0003 if µ=3.8 & σ=1
If it is very unusual, then you stop believing your assumption about the population | Zoiks! Better get down to the manufacturing floor and find out what happened

Hypothesis Testing & Example


General Step | From Example | Hypothesis Testing
You have an understanding or assumption about the population | Your factory’s system measures every order, so you know that µ=3.8 & σ=1 | Ho (pronounced H-oh or H-not) the null hypothesis
You suspect that the population may have changed | You hear a rumor that there is trouble with the machinery | Ha (H-A) the alternative hypothesis
You draw some sample data from that population & do some calculations | You look at the last 25 orders. The average was 4.48 | Draw a sample of size n and compute x-bar
You find out how unusual it would be to get that data if your assumption about the population is still true | You use what the CLT tells us about the sampling distribution and find out that the probability of a single x-bar of 4.48 is 0.0003 if µ=3.8 & σ=1 | This probability is called the p-value
If it is very unusual, then you stop believing your assumption about the population | Zoiks! Better get down to the manufacturing floor and find out what happened | You set your risk tolerance (called alpha) to a small number (1%, 5% or 10%). If the p-value is less than alpha, then reject Ho for Ha

Hypothesis Testing & Example


General Step | From Example | Hypothesis Testing | Applied to the Example
You have an understanding or assumption about the population | Your factory’s system measures every order, so you know that µ=3.8 & σ=1 | Ho (pronounced H-oh or H-not) the null hypothesis | Ho: µ=3.8
You suspect that the population may have changed | You hear a rumor that there is trouble with the machinery | Ha (H-A) the alternative hypothesis | Ha: µ>3.8
You draw some sample data from that population & do some calculations | You look at the last 25 orders. The average was 4.48 | Draw a sample of size n and compute x-bar | x-bar=4.48 for n=25 orders
You find out how unusual it would be to get that data if your assumption about the population is still true | You use what the CLT tells us about the sampling distribution and find out that the probability of a single x-bar of 4.48 is 0.0003 if µ=3.8 & σ=1 | This probability is called the p-value | P-value=0.0003
If it is very unusual, then you stop believing your assumption about the population | Zoiks! Better get down to the manufacturing floor and find out what happened | You set your risk tolerance (called alpha) to a small number (1%, 5% or 10%). If the p-value is less than alpha, then reject Ho for Ha | P-value < any standard alpha (1%, 5%, 10%). Statistical conclusion: Reject Ho for Ha. Interpretation: Average processing time has increased in the population

Further Use of Normal


• Julie says that she thinks Bob cheated on the test
• Bob’s score was 159 where the distribution of
scores was N(100,25).
• Only 1 student in 100 should get a score this high
if indeed the distribution of scores is N(100,25)
• WHY? =1-NORM.DIST(159,100,25,TRUE)=0.01
• So, either Bob cheated, or he got an extremely
high score
• Realize that here this is the same thing as
saying that the POPULATION mean for
cheaters is much higher than the mean for non-
cheaters
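
As a rough check of the "1 student in 100" figure, the same tail area can be computed in Python. This is only a sketch; the variable names are made up for illustration. Note that this example looks at a single score, so the population standard deviation is used directly rather than σ/√n.

```python
from statistics import NormalDist

scores = NormalDist(mu=100, sigma=25)  # N(100, 25) from the slide
bobs_score = 159

z = (bobs_score - scores.mean) / scores.stdev  # 59 / 25 = 2.36
p_at_least = 1 - scores.cdf(bobs_score)        # area to the right of 159
print(f"z = {z:.2f}, P(score >= {bobs_score}) = {p_at_least:.4f}")
```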

Hypothesis Testing
• The ideas on the previous slides form the basis of
hypothesis testing
• We formulate a hypothesis, and test it using what we
know about the normal distribution
• The hypothesis is that Bob cheated
• We can’t know for sure if Bob cheated
• We look at the statistical evidence
– Either Bob cheated, or something very unusual happened
• We can re-state this hypothesis as: Is Bob’s score a
sample from a distribution that is N(100,25) or is it a
score from a distribution with a much higher mean?

Hypothesis Testing for Means


• Ordinarily, we perform hypothesis tests for groups of people and look at the mean of the
sample (x-bar)

• Example: Scores for a test are distributed N(100,25). We make improvements and give
it to 81 students
• We want to know if the test will still produce a population of scores that are N(100,25),
or will the scores go up
• Suppose the average score for the sample of 81 students is 110
• If the population distribution is N(100,25), what will be the distribution of the SET of
averages from many samples (otherwise known as the sampling distribution)
~N(100,25/9)

• So, how unusual is average score of 110?

• =1-NORM.DIST(110,100,25/9,TRUE) =0.0002

• So, we believe that the POPULATION mean is not 100, but is actually a larger number
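
A short sketch of the same calculation in Python, showing where the 25/9 comes from. The names `mu_0`, `std_error` and `p_value` are illustrative choices, not notation from the slides.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 100, 25, 81  # assumed population mean, population sd, sample size
x_bar = 110                   # observed sample average

std_error = sigma / sqrt(n)   # 25/9, about 2.78
z = (x_bar - mu_0) / std_error  # 3.6 standard errors above 100

# Area to the right of 110 under the sampling distribution N(100, 25/9)
p_value = 1 - NormalDist(mu_0, std_error).cdf(x_bar)
print(f"std error = {std_error:.2f}, z = {z:.1f}, p = {p_value:.4f}")
```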

Hypothesis Testing
• All we have to do now is define things more formally
• Ho (pronounced “H-not”) is the null hypothesis. This is
the thing we wish to ‘disprove’ or reject
– We wished to stop believing that the population average for
the test is 100

• Ha or H1 is the alternative hypothesis. This is the thing
we wish to conclude
– We wish to conclude that population average score has
increased
– Ha is set up to be the only logical alternative to the null
hypothesis. If you decide you no longer believe the null
hypothesis then you imply that the statistical evidence is more
in favor of the alternative hypothesis

Hypothesis Testing
• The level of the test (also called the alpha-level or just the
alpha) is the significance level
• This is the chance we are willing to accept of rejecting Ho when the
change in the test statistic is due to random chance alone. This is usually a small
number like 10%, 5%, or 1%
• We calculate the test statistic under the null hypothesis
– What are the chances that an average of the sample of test scores
would be 110 or higher if the population average is 100
• If the chance of this is very small– smaller than the alpha level
we set for the test– then we Reject the Null Hypothesis
– We therefore imply that we believe the alternative hypothesis
• So we conclude that the population average for the test has gone up
• If we give the test in the future, we believe the overall average is higher

Hypothesis Testing Shorthand


• Ho: μ=100

• Ha: μ>100

• Alpha=.1 (for example)


• The chance of getting a sample mean of 110 under
the null hypothesis is 0.0002– called the p-value
• Since 0.0002< 0.10, we reject the null hypothesis
and conclude that the population mean for the test
has increased
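
The shorthand above boils down to a small decision rule. Here is one way to sketch it, assuming σ is known; the function name `upper_tail_z_test` is hypothetical and not from the book or the formula sheet.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(x_bar, mu_0, sigma, n, alpha):
    """One-sided test of Ho: mu = mu_0 against Ha: mu > mu_0 (sigma known)."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return p_value, p_value < alpha    # (p-value, reject Ho?)


p_value, reject = upper_tail_z_test(x_bar=110, mu_0=100, sigma=25, n=81, alpha=0.10)
print(f"p-value = {p_value:.4f}, reject Ho: {reject}")
```

Keeping alpha as an argument makes the point that it is chosen before looking at the data, while the p-value comes from the data.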

P-Value Definition

P-Value, Alpha &


Statistical Significance

Another Example: Did


Processing Time Go Down?
• Suppose your boss wants to know whether changes he
implemented to the manufacturing process have reduced
processing time from the current average of 3.8 days

• Your boss is only willing to tell his boss that processing


time decreased if he can be 90% confident that the
decrease is not due to random chance (or alpha=0.1)

• You have data from a sample of 100 orders


– The average processing time for the sample is 3.6 days (x-bar=3.6)
– Pretend that we know the population’s standard deviation is 1 day

Did Processing Time Go Down?


• You calculate that the probability of getting a sample with
an average of 3.6 days or less (if the population mean was still 3.8
days) is 2.3% based on the calculation
• (3.6-3.8)/(1/sqrt(100))= -2
• And =normsdist(-2)=.023
– So the area under the standard normal curve to the left of the value
-2.0 is equal to .023, or 2.3%

• Your boss can therefore be more than 90% confident that


the decrease in production time was not due to random
chance because .023 is less than 0.10.
• He should tell his boss that the changes in the process have
decreased the average processing time
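
The lower-tail calculation can be sketched in Python as follows. The variable names are illustrative; `NormalDist().cdf(z)` is the standard normal area to the left of z, which is what the Excel function on the slide returns.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 3.8, 1.0, 100  # current average, assumed population sd, sample size
x_bar, alpha = 3.6, 0.10

z = (x_bar - mu_0) / (sigma / sqrt(n))  # (3.6 - 3.8) / 0.1 = -2
p_value = NormalDist().cdf(z)           # area to the LEFT, since Ha says "decreased"
reject_null = p_value < alpha
print(f"z = {z:.1f}, p-value = {p_value:.3f}, reject Ho: {reject_null}")
```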

Hypothesis Testing
• This example demonstrated the key items of
hypothesis testing

• We have a hypothesis that we wish to


evaluate (Did processing time go down?)

• We have a test statistic (Average processing


time from our sample)

Alpha
• We have a significance level (alpha) set up in advance of
performing the analysis
• The significance level is the threshold we establish for
how unlikely it is that the value of the test statistic (the
average from the sample) is due to random chance
• Only if the probability we calculate falls below this threshold do
we conclude that the evidence is better that the population
parameter has changed
– In this case the alpha level was set at 10%
– The test demonstrated that we could be confident at a 10% level
that the average processing time in the population (all orders) had
decreased
• Note that 2.3% is the probability of getting x-bar ≤ 3.6 if the population has not
changed; it is not the probability that the population has not changed

Hypothesis Testing
• Ho (pronounced “H-not”) is the null hypothesis. This is
the thing we wish to ‘disprove’ or reject
– We wished to stop believing that average processing time had not
decreased

• Ha (or H1 in other textbooks) is the alternative hypothesis.


This is the thing we wish to conclude
– We wished to conclude that average processing time had decreased
– Ha is set up to be the only logical alternative to the null hypothesis.
If you decide you no longer believe the null hypothesis then you
imply that the statistical evidence causes you to conclude that the
alternative hypothesis is true

Hypothesis Testing Shorthand


• Ho: μ=3.8
• Ha: μ<3.8
• Alpha=.1
• The chance of getting x-bar ≤ 3.6 under the null
hypothesis is 2.3%
• Since .023 < .10, we reject the null
hypothesis and conclude that average
processing time has decreased

One-Sided vs. Two Sided Tests


• Note that in the 1st and 2nd example, we wanted to
test for whether something increased
• In the 3rd example we wanted to see if processing
time went down
• These are both examples of one-sided hypothesis
tests
– We are interested in whether we think a POPULATION
mean has increased or
– We are interested in whether a POPULATION mean
has decreased

Two-Sided Test
• Sometimes we are interested in whether a
population parameter changed (e.g. went up or
down)
• In a two-sided test the alternative hypothesis is of
the form
– Ha: μ ≠ 3.8 or
– Ha: μ ≠ 100
• We would use a two-sided hypothesis test if we
made changes to an assembly process and we
didn’t know whether the average assembly time
went up or down
– More on this later
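
As a preview, here is a hedged sketch of how a two-sided p-value is commonly computed when σ is known: take the one-tail area beyond |z| and double it. The function name is hypothetical, and reusing the processing-time numbers with a two-sided alternative is an assumption made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(x_bar, mu_0, sigma, n):
    """p-value for Ho: mu = mu_0 against Ha: mu != mu_0 (sigma known)."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    one_tail = 1 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return 2 * one_tail                      # count both directions


# Illustration only: processing-time numbers with Ha: mu != 3.8
p_two_sided = two_sided_p_value(x_bar=3.6, mu_0=3.8, sigma=1.0, n=100)
print(f"two-sided p-value = {p_two_sided:.4f}")
```

With the same data, the two-sided p-value is twice the one-sided value, so a two-sided test needs stronger evidence to reject Ho at the same alpha.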

Hypothesis Testing Summary


• The summary at left shows the
very basics of how to perform a
hypothesis test for these types
of problems

• There are a number of these


‘recipes’ in the book

• They are all put together on the


‘highly suggested formula
sheet’

• Regardless of the format of the


exam, you will need that sheet

How Small a P Is Convincing


• If the significance test is contrary to many years of
accepted practice and theory, a lower p-value should be
required to convince you that a brand-new piece of
contradicting evidence has been found
• If you must make expensive changes as a result of
rejecting Ho, then again, a smaller p-value is needed
– Expensive new drug with severe side-effects

• If you simply must choose between two good options, then


using a larger p-value is acceptable
– Picking between one of two advertising campaigns
– If you pick the wrong one, still make money, just not as much

2nd Problem: Confidence Intervals


• What if we don’t have a hypothesized value for mu (the population mean)

• Again, the Central Limit Theorem comes in handy

• Since the mean of the sampling distribution of x-bar is the same as the
mean of the population, our best guess for the mean of the population is
x-bar

If You Don’t Know µ, Use x-bar
• Let’s say you are studying the shoe size of UIC students
• You take a random sample of 36 students and the average is 6.5.
Assume the standard deviation is 0.5
• Someone forces you to make an estimate of the average of the
population. What would you do?
• Sensible to use the average of the
sample: 6.5
• In fact, x-bar is always the best estimate for
µ according to the statistical rules of
highly dark magic
[Photo caption: Dr. Sparks, while studying for his qualifying exam]

Statistics Is About Rareness


• I have made reference in the last few lectures to estimating how
rare (or common) an event is
– For average manufacturing time to decrease to 3.6 if µ=3.8
– For test scores to increase if µ=100

• We can make these statements because the CLT tells us that the
distance from µ to x-bar has certain characteristics

• For confidence intervals we are simply going to work the same
idea in reverse
– If we calculate a value for x-bar, how rare would it be for µ to be bigger than a
certain number

Building Confidence Intervals


Hypothesis Test Example
• Q: If µ=3.8, what is the probability of getting an x-bar of 3.6 or less?
• A: 2.3%

Building a confidence interval
• Q: If x-bar is 6.5, σ=0.5 & n=36, what is the probability that µ≤ 6.46?
• A: 32% (approximately)

Calculation Extends to Different Potential Values of µ

• Q: If x-bar is 6.5, what is the
probability that µ≤ 6.44?
• A: 25% (approximately)
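
These "reverse" questions can be sketched in Python by centering a normal curve on x-bar instead of on µ. This assumes σ=0.5 and n=36 carry over from the shoe-size example, and the name `centered` is illustrative. The results should land close to the rounded percentages quoted on the slides.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 6.5, 0.5, 36
std_error = sigma / sqrt(n)  # 0.5 / 6, about 0.083

# Center the curve on x-bar and measure how far each candidate value is below it
centered = NormalDist(x_bar, std_error)
for candidate in (6.46, 6.44):
    print(f"P(mu <= {candidate}) is about {centered.cdf(candidate):.3f}")
```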

Also Calculate For Value Above 6.5



Determining Confidence
So if x-bar is 6.5 and we estimate
that µ is between 6.36 and 6.64
then we have a 10% risk of
being wrong

Recall that I said that risk


and confidence are two ways
of expressing the same idea.
If 10% chance of being wrong
then 90% confident that right

So if we say that our best estimate for µ is 6.5 but that we believe that it lies somewhere
between 6.36 and 6.64 then we can be 90% confident in this statement.
So (6.36,6.64) is a 90% confidence interval for µ
Using the same logic but different values,
(6.34,6.66) is 95% confidence interval and
(6.29,6.71) is 99% confidence interval
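
The three intervals can be reproduced with a short Python sketch, assuming the shoe-size numbers (x-bar=6.5, σ=0.5, n=36). The function name `confidence_interval` and the symbol `z_star` are illustrative choices, not notation from the slides.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 6.5, 0.5, 36
std_error = sigma / sqrt(n)


def confidence_interval(confidence):
    """Return (low, high) for the given confidence level, e.g. 0.90."""
    risk = 1 - confidence                        # chance of being wrong
    z_star = NormalDist().inv_cdf(1 - risk / 2)  # split the risk across both tails
    margin = z_star * std_error
    return x_bar - margin, x_bar + margin


for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(level)
    print(f"{level:.0%} confidence interval: ({low:.2f}, {high:.2f})")
```

The only thing that changes between the three intervals is `z_star`, which grows as the allowed risk shrinks.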

Form of a Confidence Interval


• If (6.36,6.64) is a 90% confidence interval for µ

• This means that our estimate for µ is that it is


somewhere between
 6.36 [low-side estimate] and
 6.64 [high-side estimate]

• The general form of a confidence interval is


(lowest estimate, highest estimate)

Confidence Interval for Mu


• That is, if we take 100 samples and calculate the
confidence interval for each one of them, then
approximately 95 out of 100 confidence intervals will
contain the population mean

• This is demonstrated at left with 25 confidence intervals
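
The "95 out of 100" idea can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with hypothetical true values µ=6.5 and σ=0.5, and the seed is arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(270)  # arbitrary seed so the run is repeatable

mu, sigma, n = 6.5, 0.5, 36           # hypothetical "true" population values
z_star = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
margin = z_star * sigma / sqrt(n)

trials = 2000
hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    # Does this sample's interval capture the true mean?
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / trials
print(f"{coverage:.1%} of {trials} intervals contained mu")
```

Each sample gives a different interval, and the fraction that capture µ should settle near 95% as the number of trials grows.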



Recipe for Confidence Interval


Additional Items Regarding Hypothesis
Tests & Confidence Intervals

Chapter 6


Data For Statistical Significance


• All of the fancy calculations of statistical
significance assume that the data has been drawn
from a Simple Random Sample in an unbiased
manner

• If this is not true, throw all the calculations out the


window

• Homework problem regarding radio call-in poll



If You Search For Something You


Are More Likely to Find It
• A hospital study that compared brain cancer patients and a similar
group without brain cancer found no statistically significant
association between cell phone use and a group of brain cancers
known as gliomas

• But when 20 types of glioma were considered separately, an


association was found between phone use and one rare form. The risk,
however, decreased rather than increased with additional mobile
phone use

• If you test the same set of data with 20 different null hypotheses all at
alpha=.05, you should expect about one of them to come up significant based on random
chance alone

• Don’t do this if you want to practice good statistics


– Technically speaking, should draw new sample for each new hypothesis test
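
The arithmetic behind the warning about 20 tests can be sketched in a few lines. The second calculation assumes the 20 tests are independent, which is a simplification when they all use the same data set.

```python
alpha = 0.05
num_tests = 20

# Expected number of "significant" results when every null hypothesis is true
expected_false_alarms = alpha * num_tests

# Chance of at least one false alarm, ASSUMING the 20 tests are independent
p_at_least_one = 1 - (1 - alpha) ** num_tests

print(f"expected false alarms: {expected_false_alarms:.1f}")
print(f"P(at least one false alarm): {p_at_least_one:.2f}")
```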

Confidence vs. Probability


• What is the probability of getting a head on one toss of a coin? ½
• So, in the probability model approach that probability was defined as
the count of the # of things of interest / count of total # of things in the
sample space
• So, if I have a box with 95 Red marbles and 5 White marbles, the
probability of drawing a red is 95 out of 100
• Using the strict application of this approach to probability, if I draw a
marble from the box, but keep it in my hand so that I cannot see the
color, then what is the probability that the marble is Red
– The answer is ½
• In the long run, however, if I repeated this experiment (and replaced
the marble after each trial) I would be more confident that I have a
Red marble in my hand than a white marble
• This is because after I draw the marble the probability is ½, but the
process will give me a Red marble 95% of the time

Confidence vs. Probability


• This applies to confidence intervals as well

• The probability that any one confidence interval contains


the population mean is ½ because there are only two possible
outcomes (either it contains the mean, or it does not)

• However, we are using a process that gives an interval that


contains the mean 95% of the time (or 90% of the time,
etc., depending on the numbers used), so we are 95%
confident that the interval is Right

• Any one confidence interval that you calculate, however,


may not contain the true population mean

Confidence Interval Width

• Realize that in order to be more confident


(e.g. have a higher confidence percentage)
the confidence interval must be wider in
order to have a better chance of capturing
the mean
• In order for the 99% confidence interval to

Decreasing Confidence Interval Width

• Your client says: “But that Confidence


Interval is too wide!”
– Example: Negative market share

• You can
– Decrease the confidence percentage and
therefore get a narrower interval

Sample Size Calculation

• One way to avoid the complaint that the


interval is too wide (or too narrow) is to set
the sample size based on the formula above
before you begin collecting data
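
The slide's formula did not survive in this text, so the sketch below assumes the standard sample-size formula for a known σ, n = (z*·σ/m)², where m is the desired margin of error. The function name and the example margin of 0.1 are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n whose z-interval has half-width <= margin (sigma known)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star * sigma / margin) ** 2)  # always round UP


# Illustration: shoe-size sd of 0.5 and a hypothetical target margin of 0.1
n_needed = required_sample_size(sigma=0.5, margin=0.1, confidence=0.95)
print(f"sample size needed: {n_needed}")
```

Rounding up matters: rounding down would give an interval slightly wider than the target.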
IDS 270

Chapter 6
Section 4


Review & Preview


• We have talked about confidence intervals
and hypothesis testing
• These are statistics that are computed in
order to make inference (an educated
estimate) about a population parameter that
is not known
– Each computed statistic has some probability
that it is incorrect

Review & Preview


• So far, we have focused on alpha
– The probability of rejecting the null hypothesis when it
is actually true or
– The probability that a confidence interval does not
contain the true population parameter
• We set alpha to be a small number (5%, 1%) so
that we are very confident that any one statistic is
correct
– That is, we are confident that we rarely reject Ho when
Ho is true or
– Rarely does the confidence interval not contain the
population mean

Different Types of Error


• However, this is not the only type of error
• We must also consider the possibility that
we “Accept Ho” when Ha is actually true

Error Types Defined



Different Types of Errors-- Definitions


• The probability of a Type II error is called beta
• This probability should also be low; however we generally accept
that it need not be as low as alpha (e.g. approximately 20% is
acceptable)
• The Power of a test is defined as 1-Beta=Power. This is the number
we work with to think about Type II error
– So we want Power to be high, around 80%
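
To make beta and power concrete, here is a sketch that reuses the processing-time test (Ho: µ=3.8, Ha: µ<3.8, alpha=0.1, n=100, σ=1). The true mean of 3.6 under Ha is an assumption chosen only for illustration; power always depends on which alternative value you plug in.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, alpha = 3.8, 1.0, 100, 0.10
mu_true = 3.6  # ASSUMED true mean under Ha; not given in the slides

std_error = sigma / sqrt(n)

# Reject Ho (in favor of Ha: mu < mu_0) when x-bar falls below this cutoff
cutoff = NormalDist(mu_0, std_error).inv_cdf(alpha)

# Power: chance x-bar lands in the rejection region when mu really is mu_true
power = NormalDist(mu_true, std_error).cdf(cutoff)
beta = 1 - power  # chance of a Type II error

print(f"cutoff = {cutoff:.3f}, power = {power:.3f}, beta = {beta:.3f}")
```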


Definition of Power

This is best demonstrated through an example problem

Please view appropriate video containing that problem
