
Hypothesis Testing
Prof. Method Kazaura, PhD

© MRK, 2009
© Department of Epidemiology & Biostatistics, MUHAS
Inferential statistics
• Study a sample

• Conclude about the population

• Two processes:

– Estimation (Point or Interval)


– Hypothesis testing
Hypothesis testing
• Standard statistical procedure

• Aim: make a judgment based on sample estimates; conclude about unknown parameters

Hypothesis
• A tentative prediction

• A statement explaining the relationship between variables

• Something not proved yet

• Eventually leads to a conclusion or inference


Null hypothesis (Ho)
• Relates to the particular hypothesis under study

• States: NO RELATIONSHIP, NO DIFFERENCE

• Assumed to be true until the evidence suggests otherwise


Alternative hypothesis (H1)
• It disagrees with Ho
• States, there is a relationship or difference
Note: Ho and H1
• Both are concerned with the population
• Statements referring to the population
• Conclusions based on the sample
• Possibility of errors (sampling errors)
Note: Ho and H1
• You can never be 100% sure whether Ho is true or not
• You can ONLY say how likely the hypotheses are
Example
• Suppose we want to determine which of two drugs, A or B, cures the disease

• Ho: There is no difference in cure rate between drug A and B

• The alternative: one of the drugs (A or B) is better than the other


Example
• We do not state in advance which drug is better (that would be a one-tailed test)

• So, we state H1: "There is a difference in cure rate between the two drugs" (two-tailed)
Referring to the mean
• Ho: μA = μB

• H1: μA ≠ μB

• Note: The Ho must contain an = sign


Test statistics
• Measures used to reject or fail to reject the Ho

• Examples: Z-test (SND), t-test, Chi-square test, etc.
Type I and II errors
• Decision to reject Ho has errors
• One may reject the ‘correct’ hypothesis
Type I and Type II errors
Decision            In reality Ho is TRUE    In reality Ho is FALSE
REJECT Ho           Type I error (α)         Correct decision (1 – β)
DO NOT REJECT Ho    Correct decision         Type II error
Type I and II errors
• The above table has FOUR outcomes:
1) Do NOT reject Ho when in fact Ho is true (Correct decision)
2) Reject Ho when in fact Ho is true (Incorrect decision, Type I error)
3) Do NOT reject Ho when in fact Ho is false (Incorrect decision, Type II error)
4) Reject Ho when in fact Ho is false (Correct decision; its probability is known as the Power of the test)
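The four outcomes listed above can be written as a small lookup. This is only an illustrative sketch; the function name `classify_decision` and its two boolean arguments are hypothetical and are not part of the lecture.

```python
def classify_decision(reject_ho: bool, ho_is_true: bool) -> str:
    """Return which of the four outcomes in the decision table applies."""
    if reject_ho and ho_is_true:
        return "Type I error"        # rejected a true Ho
    if not reject_ho and not ho_is_true:
        return "Type II error"       # failed to reject a false Ho
    return "Correct decision"        # the two remaining cells of the table


# Walk through all four cells of the table.
for reject_ho in (True, False):
    for ho_is_true in (True, False):
        print(f"reject={reject_ho}, Ho true={ho_is_true}: "
              f"{classify_decision(reject_ho, ho_is_true)}")
```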
Type I error
• Type I error (false Positives), occurs when
you see things that are not there
Type I error
• Type I errors = false positives (in epidemiology)
Type II error
• Type II error (false Negatives), occurs when you don’t see things that are there


Type II error

• Type II errors = false negatives (in epidemiology)

[Figure caption: "You are NOT pregnant"]
Power

• β = probability of a Type II error

• β = Pr(do not reject Ho | Ho false)

• 1 – β = “Power” (probability of avoiding a Type II error)

• 1 – β = Pr(reject Ho | Ho false)
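As a rough illustration of how β and power can be computed, the sketch below evaluates the power of a two-sided one-sample z-test using only the Python standard library. The function name `power_two_sided_z` and the numbers in the usage lines (hypothesised mean 100, true mean 105, σ = 15, n = 25) are hypothetical values chosen for illustration; they do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Pr(reject Ho | true mean is mu_true) for a two-sided one-sample z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)      # about 1.96 when alpha = 0.05
    shift = (mu_true - mu0) / (sigma / sqrt(n))     # true difference in standard errors
    upper_tail = 1 - std_normal.cdf(z_crit - shift)  # SND falls above +z_crit
    lower_tail = std_normal.cdf(-z_crit - shift)     # SND falls below -z_crit
    return upper_tail + lower_tail


# Hypothetical numbers, for illustration only.
power = power_two_sided_z(mu0=100, mu_true=105, sigma=15, n=25)
beta = 1 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true mean equals the hypothesised mean, the function returns α itself, which is the Type I error rate.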
Significance level
• In everyday English, ‘significant’ means ‘important’

• Here: ‘probably real’ (unlikely to be due to chance alone)

• A finding may be real but NOT important

• The significance level (SL) sets how rare the results must be, if Ho is true, before we stop attributing them to chance

• It is the probability of rejecting Ho when Ho is in fact true


Significance level
• The probability value below which a p-value is small enough for you to reject the null hypothesis

• Normally set at 5%
• That is, a five percent chance of rejecting Ho when Ho is true
Critical region/ value
• The critical region of a hypothesis test is the set of all outcomes which, if they occur, cause the null hypothesis to be rejected and the alternative hypothesis accepted


Critical region/ value
• The value of a test statistic at or beyond which we will reject Ho

• A boundary that is “improbable” if the null hypothesis is true
Hypothesis testing
• Because the two hypotheses are CONTRADICTORY……

• Looking for EVIDENCE of which of the two hypotheses is MORE LIKELY

• Concept of the p-value


p-value
• The probability of obtaining results like those observed, or more extreme, by chance alone IF Ho is true

• It is compared with the significance level, the accepted chance of a Type I error

• IF Ho is TRUE, it is the probability of obtaining a test statistic value equal to or more extreme than the observed test statistic value


p-value
• Large p-values (p > 0.05): no evidence against Ho

• Small p-values (p < 0.05): evidence for H1 (= existence of a difference or relationship)

• p < 0.01: very strong evidence in favour of H1. There is a difference
One Sample test: Testing for the mean
Large sample size

• Sigma known or unknown

• Use one sample z-test


One sample z-test

• Remember:

• "If the sample is drawn from a normal population, then the sample means will be normally distributed"
SND or z-test

• Test statistic: SND (z-test)

z = SND = (xbar − μ) / (σ/√n)

• Shows how many standard errors the sample mean (xbar) lies below or above the population mean (μ)
What is the problem?

• “Is it OK to conclude that a sample of size n, with mean xbar, comes from a population with mean (μ) and SD (σ)?”

• Hypotheses:

– Ho: No difference between xbar and μ

– H1: There is a difference between xbar and μ


If:

• |SND| < 1.96 (p > 0.05), no evidence against Ho

• |SND| > 1.96 (p < 0.05), then evidence Ho is false

• |SND| > 2.58 (p < 0.01), then strong evidence against Ho

• 1.96 < |SND| < 2.58, then 0.01 < p < 0.05
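The rules above can be sketched in a few lines of Python using the standard library's `statistics.NormalDist`. The function names `two_sided_p` and `evidence_against_ho` are hypothetical helpers introduced here for illustration, and the test is assumed to be two-tailed.

```python
from statistics import NormalDist


def two_sided_p(snd: float) -> float:
    """Two-tailed p-value for a standard normal deviate (SND)."""
    return 2 * (1 - NormalDist().cdf(abs(snd)))


def evidence_against_ho(snd: float) -> str:
    """Translate an SND into the wording used in the rules above."""
    p = two_sided_p(snd)
    if p < 0.01:
        return "strong evidence against Ho"
    if p < 0.05:
        return "evidence against Ho"
    return "no evidence against Ho"


# Try a few SND values on either side of the 1.96 and 2.58 boundaries.
for snd in (1.5, 2.2, -3.0):
    print(snd, round(two_sided_p(snd), 4), evidence_against_ho(snd))
```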


Example Ia

Body temperature (BT) changes with altitude. BT is always expected to be about 98.6°F. Based on 100 blood donors in Arusha, the average BT was recorded as 97.89°F (SD = 3.1°F).

Test whether this temperature is normal or not (use α = 5%).


Solution
• Ho:
• H1:
• Test statistic is: SND = (xbar − μ)/(σ/√n)
• SND =
• Therefore, ……………
• Interpret the meaning of the obtained p-value
• If the mean temperature in Arusha is truly normal (98.6°F), there is a 0.022 probability (2.2%) of observing a sample mean at least as far from 98.6°F as this one
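One way to check the quoted p-value of 0.022 is to substitute the figures from Example Ia into the SND formula. The working below assumes that the hypothesised mean is μ = 98.6, that the sample SD of 3.1 stands in for σ (reasonable with n = 100), and that the test is two-tailed; Φ denotes the standard normal cumulative distribution function.

```latex
z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}
  = \frac{97.89 - 98.6}{3.1/\sqrt{100}}
  = \frac{-0.71}{0.31} \approx -2.29,
\qquad
p = 2\,\bigl(1 - \Phi(2.29)\bigr) \approx 0.022
```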
Example Ib

In the general population, IQ is normally distributed with a mean of 100 points. We want to know whether our students have the same IQ as the general population. A sample of 49 students was found to have a mean IQ of 109 (SD = 23). Use a significance level of 5% for this assessment.


Solution
• Ho:
• H1:
• Test statistic is: SND = (xbar − μ)/(σ/√n)
• SND = 2.74; p = 0.006
• Therefore, …….
• Interpret the meaning of the obtained p-value
• “If the students’ mean IQ is truly the same as that of the general population, the probability of observing a sample mean at least this far from 100 is 0.006”
Example II
In one study of a large number of patients, the mean survival time from diagnosis was 38.3 months with a standard deviation of 43.3 months. One hundred patients were treated by a new technique and had a mean survival time of 46.9 months. Is this apparent change in mean survival time associated with the new technique?
Solution
• Ho: New technique mean survival time is the same as the old one

• (Ho: xbar = μ OR xbar − μ = 0)

• H1: New technique mean survival time has changed

• (H1: xbar ≠ μ OR xbar − μ ≠ 0)

• Test statistic is: SND = (xbar − μ)/(σ/√n)

• SND = 1.99; SND > 1.96

• p < 0.05: evidence against Ho
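The arithmetic in Examples Ib and II can be reproduced with a short standard-library sketch. The helper name `one_sample_z` is hypothetical; it assumes a two-tailed test and, for Example Ib, uses the sample SD in place of σ because the sample is treated as large.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(xbar, mu, sd, n):
    """Return (SND, two-tailed p-value) for a one-sample z-test of the mean."""
    snd = (xbar - mu) / (sd / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(snd)))
    return snd, p_value


# Example Ib: IQ of 49 students against a population mean of 100.
snd_ib, p_ib = one_sample_z(xbar=109, mu=100, sd=23, n=49)

# Example II: survival time of 100 patients treated with the new technique.
snd_ii, p_ii = one_sample_z(xbar=46.9, mu=38.3, sd=43.3, n=100)

print(f"Example Ib: SND = {snd_ib:.2f}, p = {p_ib:.3f}")
print(f"Example II: SND = {snd_ii:.2f}, p = {p_ii:.3f}")
```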


One sample test for a proportion
Sampling distribution of p

• Given a sample of size n with a proportion p that comes from a large population with a proportion π

• Ho: “the observed proportion p is the same as a population proportion”
Conditions required:

• Both np and nq (where q = 1 − p) must be greater than 5


Sampling distribution of p

• Is it reasonable to conclude that a sample of n observations, in which r have a characteristic, could have been taken from a population in which the proportion with the characteristic is π?

• We use the z-test: SND = (p − π)/SE(p)

• Remember, SE(p) = √(p(1 − p)/n)


If:

• |SND| < 1.96, no strong evidence against Ho

• Therefore p > 0.05

• It is likely that the difference between p and π is due only to sampling error
If:

• |SND| > 1.96; evidence Ho is false

• Therefore p < 0.05

• It is unlikely that the difference between p and π is due only to sampling error
If:

• |SND| > 2.58; strong evidence against Ho

• Therefore, p < 0.01

• If 1.96 < |SND| < 2.58, we write 0.01 < p < 0.05
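The cut-offs 1.96 and 2.58 are the two-tailed critical values of the standard normal distribution at the 5% and 1% levels. As a quick check, they can be recovered with the standard library; the helper name `two_sided_critical_value` is hypothetical.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha: float) -> float:
    """SND value that leaves alpha/2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Recover the cut-offs for the 5% and 1% significance levels.
for alpha in (0.05, 0.01):
    print(alpha, round(two_sided_critical_value(alpha), 2))
```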


Example I

A census reported that 20% of the families in a large community lived below the poverty level. To determine if this has changed, a random sample of 400 families was studied and 72 were found to live below the level. Does this finding indicate a significant change?
Example

In a clinical trial to compare two analgesics A and B, 100 patients were each given the two drugs on different occasions. Of the 100 patients, 65 say they prefer A and 35 prefer B. Is this reasonably good evidence that more patients prefer A than B?
Solution

• If patients in general showed no preference for A or B, the proportion of A preferences, π, would be 0.5.
• Ho: The proportion of all patients who prefer A is π = 0.5.
• p = 0.65;
• SE(p) = 0.05
Solution

• Therefore SND = (0.65 − 0.50)/0.05 = 3

• Since SND > 2.58, then p < 0.01

• Therefore, strong evidence against Ho

• Conclude: Good evidence of more patients preferring drug A
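The analgesic example can be reproduced with the sketch below. The helper name `one_sample_proportion_z` is hypothetical. One assumption to flag: the standard error here is computed from π, the proportion assumed under Ho, which is a common convention for this test; the lecture writes SE(p) in terms of the sample proportion p, and for these data both versions round to the 0.05 used in the solution.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z(r, n, pi0):
    """z-test of an observed proportion r/n against a hypothesised proportion pi0."""
    p = r / n
    se = sqrt(pi0 * (1 - pi0) / n)   # assumption: SE evaluated under Ho (uses pi0)
    snd = (p - pi0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(snd)))  # two-tailed
    return p, se, snd, p_value


# 65 of 100 patients prefer drug A; Ho says the true proportion is 0.5.
p, se, snd, p_value = one_sample_proportion_z(r=65, n=100, pi0=0.5)
print(f"p = {p:.2f}, SE = {se:.3f}, SND = {snd:.2f}, p-value = {p_value:.4f}")
```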
Exercise
Suppose a clinical trial is conducted to test the efficacy of a new drug X in the treatment of disease Y. Sixty-four patients are given a 4 g daily dose of the drug and are seen one week later, at which time it is discovered that 16 of the patients still have the disease.
(a) What is the best point estimate of p, the probability of a failure with the drug?
(b) What is the 95% confidence interval for the probability of failure?
(c) Suppose it is known that drug Z at a 4.8 mega unit daily dose has a 30% failure rate. What can be said when comparing the two drugs?
Small sample size: t-test
σ unknown and/or small n

• We have been substituting s for σ

• For small samples, the SE is affected by n
• When n is small, using s increases the sampling variation

• A test statistic that adjusts for this is the t-test


t-test
• Know that SND = (xbar − μ)/(σ/√n)
• Using a t-test, we replace the SND by t, which has an extra parameter known as degrees of freedom (df)
• df is the number of scores that are free to vary
• df = n − 1
• As n increases, s approaches σ and t is very close to SND
• t is used to test whether the sample comes from a population with a specified mean but with UNKNOWN standard deviation
Table of t-test
Example
The following data are uterine weights (in
mg) for each of 20 rats drawn at random from
a large stock. Is it likely that the mean weight
for the whole stock could be 24 mg, a value
observed in some previous work?
9, 18, 21, 26, 14, 18, 22, 27, 16, 20, 15, 19,
22, 29, 15, 19, 24, 30, 24, 32
Solution

• Ho: Mean weight for whole stock is 24 mg (Ho: μ = 24).

• H1: Mean weight for whole stock is not 24 mg (H1: μ ≠ 24).

• The test statistic is t = (xbar − μ)/(s/√n), where

• n = 20, μ = 24, Σx = 420, xbar = 420/20 = 21 and s = 5.91


Solution
• So, t = (21 − 24)/1.3219 = −2.27; absolute value = 2.27

• The degrees of freedom (d.f) are 20 − 1 = 19

• From the t-table, t(0.05, 19) = 2.093

• Since 2.27 > 2.093, then p < 0.05


Solution

• Sufficient evidence to suggest the mean uterine weight of the stock is different from 24 mg

• 95% CI: xbar ± t(α, d.f) × SE(xbar)

• 21 ± 2.093 × 1.3219 = (18.2, 23.8)

• Note: the CI does NOT include the hypothesised mean of 24 mg
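The figures in this solution can be checked from the raw uterine weights with the standard library alone. In this sketch the critical value 2.093 is simply copied from the t-table reading in the solution rather than computed, and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Uterine weights (mg) of the 20 rats in the example.
weights = [9, 18, 21, 26, 14, 18, 22, 27, 16, 20,
           15, 19, 22, 29, 15, 19, 24, 30, 24, 32]
mu0 = 24          # hypothesised mean weight of the whole stock (mg)
t_crit = 2.093    # t(0.05, 19), taken from the t-table as quoted in the solution

n = len(weights)
xbar = mean(weights)
s = stdev(weights)             # sample SD, divisor n - 1
se = s / sqrt(n)               # standard error of the mean
t = (xbar - mu0) / se          # one-sample t statistic
ci = (xbar - t_crit * se, xbar + t_crit * se)   # 95% confidence interval

print(f"n = {n}, xbar = {xbar}, s = {s:.2f}, SE = {se:.4f}")
print(f"t = {t:.2f}, reject Ho at 5%: {abs(t) > t_crit}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```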


Exercise I

• A prominent professor thinks that the average score is higher than 65%. Based on a sample of 10 students, he gets the scores as 65, 65, 70, 71, 67, 63, 66, 63, 68, and 72. Perform a hypothesis test using 5% level of significance.

• (Mean = 67.0, SD = 3.2)
Exercise II

• The MNH Obstetrician thinks that the average birth weight is 1643 grams with a standard deviation of 8 grams. Analyzing 15 babies’ weights, they find an average birth weight of 1600 grams. Is this a significantly lower birth weight than that thought by the Obstetrician?


Exercise III

• Suppose the Ministry of Communication thinks that the proportion of households with two cell phones is not 30%, but Tigo believes that this proportion really is 30%. Before an advertising campaign, Tigo conducts a survey of 150 households and finds that 43% have two cell phones. Using a 5% significance level, test the hypothesis to support or refute the company’s belief.
Exercise IV

• RITA believes that 50% of the first-time brides in Tanzania are younger than their grooms. Using 100 first-time brides, they perform a hypothesis test to determine if the proportion is the same as or different from 50% and find that 53% affirm that they are younger than their grooms. Use a 5% level of significance for this test.
