Introduction To Probability and Statistics: Sayantan Banerjee Sessions 16 - 17

Introduction to Probability and Statistics

Sayantan Banerjee
Sessions 16 – 17
Testing of Hypothesis

• A vendor claims that his company fills any accepted order, on
the average, in at most 3 working days. You suspect that the
average is greater than that claimed. How will you go forward
with this?
• According to Fortune, on February 27, 2007, the average
stock in all U.S. exchanges fell by 3.3%. If a random sample
of 120 stocks reveals a drop of 2.8% on that day and a
standard deviation of 1.7%, are there grounds to reject the
magazine’s claim?

Testing of Hypothesis

• What is a hypothesis?
• What do we mean by ‘testing’?
• Why do we need ‘testing’?

The first step in hypothesis testing is to formalize it by specifying
the null and alternative hypotheses.
• Null hypothesis
  - An assertion about the value of a population parameter.
  - It is an assertion that we hold as true unless we have sufficient
    statistical evidence to conclude otherwise.
  - Generally denoted by H0.
• Alternative hypothesis
  - Negation of the null hypothesis.
  - Researcher’s hypothesis.
  - Generally denoted by H1.

Testing of Hypothesis

• A vendor claims that his company fills any accepted order, on
the average, in at most 3 working days. You suspect that the
average is greater than that claimed. How will you go forward
with this?

H0 : µ ≤ 3 vs. H1 : µ > 3.
• According to Fortune, on February 27, 2007, the average
stock in all U.S. exchanges fell by 3.3%. If a random sample
of 120 stocks reveals a drop of 2.8% on that day and a
standard deviation of 1.7%, are there grounds to reject the
magazine’s claim?

H0 : µ = 3.3 vs. H1 : µ ≠ 3.3.

• Although the idea of a null hypothesis is simple, determining
what the null hypothesis should be in a given situation can be
difficult.
• Generally, what the analyst aims to establish is the alternative
hypothesis; the null hypothesis stands for the status quo, the
do-nothing situation.

Examples

• A pharma company claims that 4 out of 5 doctors prescribe
their painkiller. If you wish to test this claim, how
would you set up the hypotheses?
• It is found that web surfers will lose interest in a web page if
downloading takes more than 12 seconds at a 28K baud rate. If you
wish to test the effectiveness of a newly designed web page with
regard to downloading time, how will you set up the
hypotheses?

Testing of Hypothesis

• We DO NOT know the truth.
• Based on observed data, we either support H0 or H1.
• Of course, we cannot be certain (so there is some room for
error).

Type I and Type II errors

In the context of statistical testing of hypothesis, rejecting a true
null hypothesis is known as a Type I error. Failing to reject a false
null hypothesis is known as a Type II error.

                      H0 true             H0 false
Reject H0             Type I error        Correct decision
Fail to reject H0     Correct decision    Type II error

Type I and Type II errors

Type-I error
• The maximum of P(Type-I error) over the parameter values in H0
is called the ‘Type-I error rate’.
• This maximum is denoted by α.
• α is also known as the ‘level of significance’.
• P(Type-I error) is maximised at the boundary of H0 and H1.

Type I and Type II errors

Type-II error
• P(Type-II error) is always computed at a particular θ in the
region of H1.
• P(Type-II error) at a fixed θ is denoted by β(θ).
• 1 − β(θ) is also known as the ‘power’ of the test at θ.
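
As a minimal sketch of how β(θ) and the power can be computed, assume a one-sided test of H0 : µ ≤ µ0 vs. H1 : µ > µ0 for a Normal population with known σ, where H0 is rejected when the sample mean exceeds C = µ0 + zα σ/√n. The function names and the numbers used (µ0 = 3, σ = 1, n = 25, α = 0.05) are illustrative assumptions, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist


def type2_error(mu, mu0, sigma, n, alpha):
    """beta(mu): P(fail to reject H0) when the true mean is mu.

    One-sided test H0: mu <= mu0 vs. H1: mu > mu0, sigma assumed known.
    """
    se = sigma / sqrt(n)
    # Critical value: reject H0 when the sample mean exceeds c.
    c = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Under the true mean mu, the sample mean is N(mu, se^2).
    return NormalDist(mu, se).cdf(c)


def power(mu, mu0, sigma, n, alpha):
    """Power at mu, i.e. 1 - beta(mu)."""
    return 1 - type2_error(mu, mu0, sigma, n, alpha)


# Hypothetical setting: mu0 = 3, sigma = 1, n = 25, alpha = 0.05.
for mu in (3.0, 3.2, 3.5, 4.0):
    print(f"mu = {mu}: power = {power(mu, 3.0, 1.0, 25, 0.05):.4f}")
```

At the boundary µ = µ0 the power equals α, and it increases as µ moves further into the region of H1.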

Testing of Hypothesis

• Test statistic: Statistic used to perform a specific test of
hypothesis.
• Decision rule: Rule which decides which hypothesis to favour
in light of the value of the statistic.
• Critical region: The set of values of the test statistic for
which H0 is rejected (the rejection region).

Testing of Hypothesis

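What is a ‘good’ test?

• Ideally, a test which simultaneously minimizes the probabilities
of both Type-I and Type-II errors. But is this possible? No: for a
fixed sample size, making one smaller generally makes the other
larger.
• We therefore fix an upper bound α on the probability of Type-I
error, and then make the probability of Type-II error as small as
possible.
• This guarantees that the probability of a Type-I error cannot
exceed the specified level.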

Testing population mean for Normal population

H0 : µ ≤ µ0 vs. H1 : µ > µ0 .

• Random sample X1, . . . , Xn from a N(µ, σ²) population.
• Test statistic: X̄ ∼ N(µ, σ²/n).
• Observed test statistic: X̄obs
• Decision rule: Reject H0 if X̄obs > C.
• C is chosen so that the maximum P(Type-I error) is

α = P(Reject H0 | µ = µ0)
  = P(X̄ > C | µ = µ0).

Testing population mean for Normal population

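How to find C?
• σ² known:
C = µ0 + zα · σ/√n,
where zα is the upper-α point of the N(0, 1) distribution.
• σ² unknown:
C = µ0 + tα · S/√n,
where S is the sample standard deviation and tα is the upper-α
point of the t distribution with n − 1 degrees of freedom.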
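
The known-σ case can be sketched in a few lines of Python using only the standard library. The function name and the numbers (µ0 = 3, σ = 1.2, n = 36, an observed mean of 3.4) are hypothetical and chosen only to illustrate the rule "reject H0 if X̄obs > C". For the unknown-σ case, the Normal quantile would be replaced by the corresponding t quantile with n − 1 degrees of freedom, for example from a t table.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_known_sigma(mu0, sigma, n, alpha):
    """C = mu0 + z_alpha * sigma / sqrt(n) for H0: mu <= mu0 vs. H1: mu > mu0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha point of N(0, 1)
    return mu0 + z_alpha * sigma / sqrt(n)


# Hypothetical numbers, for illustration only.
c = critical_value_known_sigma(mu0=3.0, sigma=1.2, n=36, alpha=0.05)
x_bar_obs = 3.4
reject_h0 = x_bar_obs > c
print(f"C = {c:.4f}, reject H0: {reject_h0}")
```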

What will be the decision rules for the following cases?

H0 : µ ≥ µ0 vs. H1 : µ < µ0 .

H0 : µ = µ0 vs. H1 : µ ≠ µ0 .
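
One way to answer, by mirroring the upper-tailed rule, is sketched below for the case of known σ. Here z_α denotes the upper-α point of N(0, 1); for unknown σ, replace σ by S and the Normal points by the corresponding points of the t distribution with n − 1 degrees of freedom.

```latex
% Lower-tailed test, H_0: \mu \ge \mu_0 vs. H_1: \mu < \mu_0
\text{Reject } H_0 \text{ if } \bar{X}_{\mathrm{obs}} < \mu_0 - z_{\alpha}\,\frac{\sigma}{\sqrt{n}}.

% Two-sided test, H_0: \mu = \mu_0 vs. H_1: \mu \ne \mu_0
\text{Reject } H_0 \text{ if } \left|\bar{X}_{\mathrm{obs}} - \mu_0\right| > z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}.
```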

Testing population proportions

H0 : p ≤ p0 vs. H1 : p > p0 .

• Consider a random sample from Ber(p).
• Test statistic: X, the number of successes in n trials.
• Exact distribution: X ∼ Bin(n, p).
• Approx. distribution: X ∼ N(np, npq), where q = 1 − p.
• Decision rule: Reject H0 if Xobs is too large.
• How large is large?

Testing population proportions

H0 : p ≤ p0 vs. H1 : p > p0 .

• Decision rule: Reject H0 if (based on the approx. distribution
under H0)

(Xobs − np0) / √(np0 q0) ≥ C*,   where q0 = 1 − p0.

• Given a significance level α,

C* = zα.

Testing population proportions

How to test the following:

H0 : p ≥ p0 vs. H1 : p < p0 .

H0 : p = p0 vs. H1 : p ≠ p0 .

Hypothesis testing: p-value approach

• p-value approach
1. Choose a reasonable test statistic (a point estimator for θ).
2. Find its sampling distribution under H0.
3. Find the p-value: the probability, under H0, of getting a value
at least as extreme as the one observed.
4. Reject H0 if the p-value is small (p-value ≤ α).

p-value approach

H0 : p ≤ p0 vs. H1 : p > p0 .

• Consider a random sample from Ber(p).
• Test statistic: X, the number of successes in n trials.
• Exact distribution: X ∼ Bin(n, p).
• Approx. distribution: X ∼ N(np, npq).
• Compute the probability of extreme values

P(X ≥ Xobs | H0)

which is maximised at the boundary of H0 and H1:

p-value = P(X ≥ Xobs | p = p0)
p-value approach (contd.)

H0 : p ≤ p0 vs. H1 : p > p0 .

• Decision rule: Reject H0 if p-value ≤ α.
• That is, reject H0 if the event X ≥ Xobs is sufficiently
unlikely under H0.
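
The p-value P(X ≥ Xobs | p = p0) can be computed either from the exact Bin(n, p0) distribution or from the Normal approximation. The sketch below shows both; the function names and the data (n = 50, Xobs = 32, p0 = 0.5) are hypothetical, and the Normal version uses no continuity correction, so the two values will differ somewhat for moderate n.

```python
from math import comb, sqrt
from statistics import NormalDist


def p_value_exact(x_obs, n, p0):
    """P(X >= x_obs) for X ~ Bin(n, p0), summed term by term."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x_obs, n + 1))


def p_value_normal(x_obs, n, p0):
    """P(X >= x_obs) under the N(n*p0, n*p0*q0) approximation (no continuity correction)."""
    z = (x_obs - n * p0) / sqrt(n * p0 * (1 - p0))
    return 1 - NormalDist().cdf(z)


# Hypothetical data, for illustration only.
n, x_obs, p0, alpha = 50, 32, 0.5, 0.05
exact = p_value_exact(x_obs, n, p0)
approx = p_value_normal(x_obs, n, p0)
print(f"exact p-value  = {exact:.4f}, reject H0: {exact <= alpha}")
print(f"approx p-value = {approx:.4f}, reject H0: {approx <= alpha}")
```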

p-value approach

Testing Normal means

H0 : µ ≤ µ0 vs. H1 : µ > µ0 .

H0 : µ ≥ µ0 vs. H1 : µ < µ0 .

H0 : µ = µ0 vs. H1 : µ ≠ µ0 .

Find the p-value for each test based on the test statistic X̄, both for
the case where σ is known and the case where it is unknown.
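
A sketch of the answer for known σ, written in terms of the standardised observed statistic, is given below. The symbols z_obs and Φ (the N(0, 1) distribution function) are notation introduced here for convenience. For unknown σ, replace σ by S and Φ by the distribution function of the t distribution with n − 1 degrees of freedom.

```latex
z_{\mathrm{obs}} = \frac{\bar{X}_{\mathrm{obs}} - \mu_0}{\sigma/\sqrt{n}}

% H_1: \mu > \mu_0
p\text{-value} = P(\bar{X} \ge \bar{X}_{\mathrm{obs}} \mid \mu = \mu_0) = 1 - \Phi(z_{\mathrm{obs}})

% H_1: \mu < \mu_0
p\text{-value} = P(\bar{X} \le \bar{X}_{\mathrm{obs}} \mid \mu = \mu_0) = \Phi(z_{\mathrm{obs}})

% H_1: \mu \ne \mu_0
p\text{-value} = 2\,\bigl(1 - \Phi(|z_{\mathrm{obs}}|)\bigr)
```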

Problems

According to BusinessWeek, the average market value of a biotech
company is less than $250 million. A random sample of 30 firms
reveals a mean of $235 million with a standard deviation of $85
million. Use α = 0.05 to test this claim, and state your conclusions.
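
One possible setup, sketched below, treats the magazine's claim as the alternative: H0 : µ ≥ 250 vs. H1 : µ < 250. This reading is an assumption, as is treating the population as approximately Normal. Since σ is estimated by the sample standard deviation, the statistic is compared with a t critical value; the value 1.699 (upper 5% point of t with 29 degrees of freedom) is taken from a standard t table rather than computed.

```python
from math import sqrt

# Values from the problem statement (in $ million).
n, x_bar, s, mu0 = 30, 235.0, 85.0, 250.0

# Upper 5% point of the t distribution with n - 1 = 29 degrees of freedom,
# read from a standard t table (not computed here).
T_CRIT = 1.699

t_obs = (x_bar - mu0) / (s / sqrt(n))
# Lower-tailed test: reject H0 when t_obs falls below -T_CRIT.
reject_h0 = t_obs < -T_CRIT
print(f"t_obs = {t_obs:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the observed statistic does not fall in the rejection region, so the sample does not provide sufficient evidence at α = 0.05 that the average market value is below $250 million.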

Problems

According to Fortune, on February 27, 2007, the average stock in
all U.S. exchanges fell by 3.3%. If a random sample of 120 stocks
reveals a drop of 2.8% on that day and a standard deviation of
1.7%, are there grounds to reject the magazine’s claim?
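
A sketch of the two-sided test H0 : µ = 3.3 vs. H1 : µ ≠ 3.3 using the p-value approach follows. The problem states no significance level, so α = 0.05 is an assumption; with n = 120 the Normal distribution is used in place of the t distribution, which is also an approximation.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement (in percent).
n, x_bar, s, mu0 = 120, 2.8, 1.7, 3.3
alpha = 0.05  # assumed; the problem does not state a level

z_obs = (x_bar - mu0) / (s / sqrt(n))
# Two-sided p-value under the large-sample Normal approximation.
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))
reject_h0 = p_value <= alpha
print(f"z_obs = {z_obs:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Under these assumptions the p-value is well below 0.05, so the sample gives grounds to reject the claim of a 3.3% average drop.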

Problems

In a random sample of 30 students at IIMI it was found that 22 use
non-Apple laptops. Test at α = 0.05 if there is sufficient evidence
to conclude that over 60% of the students use non-Apple laptops.
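
A sketch of the test H0 : p ≤ 0.6 vs. H1 : p > 0.6 using the Normal approximation and the critical-value rule (reject if the standardised statistic is at least zα). The variable names are illustrative. With n = 30 the approximation is rough, so an exact binomial calculation would be a reasonable cross-check.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement.
n, x_obs, p0, alpha = 30, 22, 0.60, 0.05

z_obs = (x_obs - n * p0) / sqrt(n * p0 * (1 - p0))
z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha point of N(0, 1)
reject_h0 = z_obs >= z_alpha
p_value = 1 - NormalDist().cdf(z_obs)      # approximate one-sided p-value
print(f"z_obs = {z_obs:.3f}, z_alpha = {z_alpha:.3f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Under the Normal approximation the statistic falls short of the critical value, so at α = 0.05 there is not sufficient evidence that more than 60% of students use non-Apple laptops.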

Problems

A random sample of 150 recent donations at a certain blood bank
reveals that 82 were type A blood. Does this suggest that the
actual percentage of type A donations differs from 40%, the
percentage of the population having type A blood? Carry out a
test of the appropriate hypotheses using a significance level of
0.01. Would your conclusion have been different if a significance
level of 0.05 had been used?
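
A sketch of the two-sided test H0 : p = 0.4 vs. H1 : p ≠ 0.4 using the Normal approximation and the p-value approach, evaluated at both significance levels mentioned in the problem. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement.
n, x_obs, p0 = 150, 82, 0.40

z_obs = (x_obs - n * p0) / sqrt(n * p0 * (1 - p0))
# Two-sided p-value under the Normal approximation.
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))

# Decision at each significance level mentioned in the problem.
decisions = {alpha: p_value <= alpha for alpha in (0.01, 0.05)}
print(f"z_obs = {z_obs:.3f}, p-value = {p_value:.5f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: reject H0: {reject}")
```

Under the Normal approximation H0 is rejected at the 0.01 level, and therefore also at the 0.05 level, so the conclusion would not change.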
