
Inference and Hypothesis Testing

MUHAMMAD NIZAMUDDIN
MS in Biostatistics and Epidemiology (Dow)
BS MT (Surgical Technology)
Medical Technologist
Ophthalmology Department
Dow University Hospital
2021

Muhammad Nizamuddin BS MSBE


Objectives:
• What is hypothesis testing?
• Interpreting and selecting significance level
• Type I and Type II errors
• One tailed and two tailed tests
• Hypothesis tests for population mean
• Hypothesis tests for population proportion
• Hypothesis tests for population standard deviation



• Statistics can be viewed as a four-step process: data
production, exploratory data analysis, probability and
inference.
What is Inference?
• Statistical inference is the process through which inferences
about a population are made based on certain statistics calculated
from a sample of data drawn from that population.
Principles and Practice of Clinical Research (Third Edition), 2012.
• Statistical inference draws conclusions about a population
based on the data obtained from a sample chosen from it.
• It helps to assess the relationship between the dependent and independent
variables.
• The purpose of statistical inference is to estimate the uncertainty, i.e. the
sample-to-sample variation.
• The strength (certainty) of the inference depends largely on the
sampling.
• The sample must be representative of the population.



Sample Representative:

• It links frequency distribution to probability distribution

• Has a bell-shaped curve and is symmetric

• It is symmetric around the mean: the two halves of the curve are the same (mirror images)
How To Obtain Representative Sample:

The Central Limit Theorem:
• If a random sample of n cases is drawn from a population with mean μ and standard
deviation σ, then the sampling distribution of the mean (the distribution of all possible
means for samples of size n) has the following properties:

1) The mean of the sampling distribution equals the population mean, μ.

2) The standard deviation of the sampling distribution (the standard error) equals
the population standard deviation divided by the square root of the
sample size: σ/√n.

3) The shape of the sampling distribution of the mean approaches normal as n increases.
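
The three properties can be checked with a small simulation. This is only a sketch under stated assumptions: the population (exponential with mean 1 and s.d. 1), the sample sizes, the number of repetitions and the function name `simulate_sample_means` are all illustrative choices, not part of the original slides.

```python
import math
import random
import statistics


def simulate_sample_means(n, n_samples=5000, seed=0):
    """Draw n_samples random samples of size n from a skewed population
    (exponential with mean 1 and s.d. 1) and return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(n_samples)
    ]


mu, sigma = 1.0, 1.0  # mean and s.d. of the assumed exponential(1) population

for n in (5, 30, 100):
    means = simulate_sample_means(n)
    print(
        f"n={n:3d}  mean of sample means={statistics.fmean(means):.3f}  "
        f"s.d. of sample means={statistics.stdev(means):.3f}  "
        f"sigma/sqrt(n)={sigma / math.sqrt(n):.3f}"
    )
```

For each n, the mean of the sample means should sit close to μ and their s.d. close to σ/√n, even though the population itself is far from normal.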

Confidence Intervals
• A range of values within which we are confident (in terms of probability) that the true value of
a population parameter lies
• A 95% CI means that, if sampling were repeated many times, about 95% of the intervals
constructed this way would contain the true value of the population parameter
• i.e. about 5% of such intervals would fail to contain the true value of the population parameter
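
The repeated-sampling interpretation can be illustrated with a short simulation. This is a sketch under assumptions: the population is taken to be normal with a hypothetical mean of 100 and a known s.d. of 15, and `z_interval` and `coverage` are illustrative helper names.

```python
import math
import random
import statistics


def z_interval(sample, sigma, confidence=0.95):
    """Confidence interval for the mean when the population s.d. is known."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = statistics.fmean(sample)
    half_width = z * sigma / math.sqrt(len(sample))
    return centre - half_width, centre + half_width


def coverage(mu=100.0, sigma=15.0, n=25, repeats=2000, seed=1):
    """Fraction of repeated-sample intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma)
        hits += lower <= mu <= upper  # True counts as 1
    return hits / repeats


print(f"Share of intervals containing the true mean: {coverage():.3f}")
```

The printed share should be close to 0.95: any single interval either contains μ or does not, and the 95% describes the procedure over many samples.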
What is a Hypothesis?
• A hypothesis is an assumption about the
population parameter.
• A parameter is a population mean or proportion.
• The parameter must be identified
before analysis.
• Example: "I assume the mean GPA of this class is 3.5!"



Terms Introduce Prior:
• Population  all possible values
• Sample  a portion of the population
• Statistical inference  generalizing from a sample to a population with
calculated degree of certainty
• Two forms of statistical inference
• Hypothesis testing
• Estimation
• Parameter  a characteristic of population, e.g., population mean µ
• Statistic  calculated from data in the sample, e.g., sample mean
Hypothesis testing
• The cat is faced with a decision!
• To go out into the snow or not since it looks cold and wet.
• It has two hypotheses. Null: it won’t get its paws wet and be
cold.
• Alternative: it will get its paws wet and be cold.
• It could base its decision on data it collected in the
past: when it went out in the snow before, its paws did get
cold and wet,
• so it may decide not to go out based on that
evidence!
• Hypothesis testing pervades all of our lives all of the time
without us realising it.
Hypothesis testing Framework
What the text books might say!
•Always two hypotheses:
HA: Research (Alternative) Hypothesis
• What we aim to gather evidence of
• Typically that there is a difference/effect/relationship
etc.
H0: Null Hypothesis
• What we assume is true to begin with
• Typically that there is no difference/effect/relationship
etc.
Null Hypothesis

The null hypothesis H0 represents a theory that has been put forward
either because it is believed to be true or because it is used as a basis for
an argument and has not been proven.

For example, in a clinical trial of a new drug, the null hypothesis might be
that the new drug is no better, on average, than the current drug.
We would write
H0: there is no difference between the two drugs on average.
Alternative Hypothesis
The alternative hypothesis, HA, is a statement of what a
statistical hypothesis test is set up to establish.
For example, in the clinical trial of a new drug, the alternative
hypothesis might be that the new drug has a different effect, on
average, compared to that of the current drug.
We would write
HA: the two drugs have different effects, on average.
or
HA: the new drug is better than the current drug, on average.
The result of a hypothesis test:
‘Reject H0 in favour of HA’ OR ‘Do not reject H0’
Could try explaining things in the
context of “The Court Case”?
• Members of a jury have to decide whether a person is guilty or
innocent based on evidence

Null: The person is innocent


Alternative: The person is not innocent (i.e. guilty)

• The null can only be rejected if there is enough evidence to doubt it

• i.e. the jury can only convict if the evidence against the null of
innocence is beyond reasonable doubt
• They do not know whether the person is really guilty or innocent so
they may make a mistake
What is Hypothesis Testing?
Result Possibilities
H0: Innocent

Jury Trial

Verdict | Actual Situation: Innocent | Actual Situation: Guilty
Innocent | Correct | Error
Guilty | Error | Correct

Hypothesis Test

Decision | Actual Situation: H0 True | Actual Situation: H0 False
Do Not Reject H0 | 1 − α | Type II Error (β)
Reject H0 | Type I Error (α) | Power (1 − β)
Type I and Type II Errors
1. Type I error refers to the situation when we reject the null hypothesis
when it is true (H0 is wrongly rejected).
e.g H0: there is no difference between the two drugs on average.
Type I error will occur if we conclude that the two drugs produce
different effects when actually there isn’t a difference.
Prob(Type I error) = significance level = α
2. Type II error refers to the situation when we fail to reject the null
hypothesis when it is false.
e.g. H0: there is no difference between the two drugs on average.
Type II error will occur if we conclude that the two drugs produce
the same effect when actually there is a difference.
Prob(Type II error) = β
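
The trade-off between the two error probabilities can be sketched numerically. The example below assumes an upper-tailed z-test with known σ; the values μ0 = 300, true mean 305, σ = 20 and n = 50 are hypothetical, and `error_rates` is an illustrative helper name.

```python
import math
from statistics import NormalDist


def error_rates(mu0, mu_true, sigma, n, alpha=0.05):
    """Type I and Type II error probabilities for an upper-tailed z-test
    of H0: mu = mu0 against HA: mu > mu0, with known sigma."""
    se = sigma / math.sqrt(n)
    # Reject H0 when the sample mean exceeds this cut-off.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(sample mean <= cutoff) when the true mean is mu_true.
    beta = NormalDist(mu_true, se).cdf(cutoff)
    return {"alpha": alpha, "beta": beta, "power": 1 - beta}


# Hypothetical values chosen only for illustration.
for alpha in (0.10, 0.05, 0.01):
    rates = error_rates(mu0=300, mu_true=305, sigma=20, n=50, alpha=alpha)
    print(
        f"alpha={alpha:.2f}  beta={rates['beta']:.3f}  "
        f"power={rates['power']:.3f}"
    )
```

With everything else fixed, lowering α raises β and so lowers the power 1 − β.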
Type I and Type II Errors – Example

Your null hypothesis is that the battery for a heart pacemaker has an
average life of 300 days, with the alternative hypothesis that the average
life is more than 300 days. You are the quality control manager for the
battery manufacturer.

(a)Would you rather make a Type I error or a Type II error?


(b)Based on your answer to part (a), should you use a high or
low significance level?
Given H0 : average life of pacemaker = 300 days, and
HA: Average life of pacemaker > 300 days

(a) It is better to make a Type II error (where H0 is false, i.e. the average life is
actually more than 300 days, but we do not reject H0 and assume that the average
life is equal to 300 days).

(b) As we increase the significance level (α) we increase the chance of making
a Type I error. Since here it is better to make a Type II error, we should choose a
low α.
p Value Test

• Probability of obtaining a test statistic as extreme as or more extreme (≤ or ≥)
than the actual sample value, given H0 is true
• Called the observed level of significance
• Smallest value of α for which H0 can be rejected
• Used to make the rejection decision
• If p value ≥ α, do not reject H0
• If p value < α, reject H0
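
The decision rule can be sketched in code. This is a minimal illustration assuming a z statistic that follows the standard normal distribution under H0; the observed value 1.8 is hypothetical, and `p_value` and `decide` are illustrative helper names.

```python
from statistics import NormalDist


def p_value(z, tail="two"):
    """p-value for an observed z statistic under H0.

    tail: "two" for HA: mu != mu0, "lower" for HA: mu < mu0,
    "upper" for HA: mu > mu0.
    """
    cdf = NormalDist().cdf
    if tail == "lower":
        return cdf(z)
    if tail == "upper":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'two', 'lower' or 'upper'")


def decide(p, alpha=0.05):
    """Apply the rejection rule: reject H0 only when p < alpha."""
    return "Reject H0" if p < alpha else "Do not reject H0"


z_observed = 1.8  # hypothetical test statistic
for tail in ("two", "upper"):
    p = p_value(z_observed, tail)
    print(f"{tail}-tailed p = {p:.4f} -> {decide(p)}")
```

The same observed statistic can lead to different decisions depending on whether the alternative hypothesis is one-sided or two-sided.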
Two Tail Test
Two tailed test will reject the null hypothesis if the sample mean
is significantly higher or lower than the hypothesized mean.
Appropriate when H0 : μ = μ0 and HA: μ ≠ μ0
e.g The manufacturer of light bulbs wants to produce light bulbs
with a mean life of 1000 hours. If the lifetime is shorter he will
lose customers to the competition and if it is longer then he will
incur a high cost of production. He does not want to deviate
significantly from 1000 hours in either direction. Thus he selects
the hypotheses as H0 : μ = 1000 hours and HA: μ ≠ 1000 hours
and uses a two tail test.
One Tail Test
• A one-sided test is a statistical hypothesis test in which the values for which
we can reject the null hypothesis, H0 are located entirely in one tail of the
probability distribution.
• Lower tailed test will reject the null hypothesis if the sample mean is
significantly lower than the hypothesized mean.
Appropriate when H0 : μ = μ0 and HA: μ < μ0
• e.g A wholesaler buys light bulbs from the manufacturer in large lots and
decides not to accept a lot unless the mean life is at least 1000 hours.
H0 : μ = 1000 hours and HA: μ <1000 hours and uses a lower tail test.
i.e he rejects H0 only if the mean life of sampled bulbs is significantly
below 1000 hours. (he accepts HA and rejects the lot)
• Upper tailed test will reject the null hypothesis if the sample mean is
significantly higher than the hypothesized mean.
Appropriate when H0 : μ = μ0 and HA: μ > μ0
e.g A highway safety engineer decides to test the load bearing
capacity of a 20 year old bridge. The minimum load-bearing capacity of
the bridge must be at least 10 tons.
H0 : μ = 10 tons and HA: μ >10 tons and uses an upper tail test.
i.e he rejects H0 only if the mean load bearing capacity of the
bridge is significantly higher than 10 tons.
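
The three rejection regions can be compared side by side. The sketch below assumes a z statistic and α = 0.05; the observed value −1.8 is hypothetical, and `rejection_region` and `rejects` are illustrative helper names.

```python
from statistics import NormalDist


def rejection_region(alpha=0.05, tail="two"):
    """Critical z values (lower, upper) bounding the rejection region.
    None means there is no rejection region in that tail."""
    inv = NormalDist().inv_cdf
    if tail == "lower":  # HA: mu < mu0
        return inv(alpha), None
    if tail == "upper":  # HA: mu > mu0
        return None, inv(1 - alpha)
    if tail == "two":    # HA: mu != mu0, alpha split between both tails
        return inv(alpha / 2), inv(1 - alpha / 2)
    raise ValueError("tail must be 'two', 'lower' or 'upper'")


def rejects(z, alpha=0.05, tail="two"):
    """True if the observed z falls in the rejection region."""
    lower, upper = rejection_region(alpha, tail)
    in_lower = lower is not None and z < lower
    in_upper = upper is not None and z > upper
    return in_lower or in_upper


z_observed = -1.8  # hypothetical test statistic
for tail in ("two", "lower", "upper"):
    print(tail, rejection_region(tail=tail), rejects(z_observed, tail=tail))
```

A one-tailed test puts all of α in one tail, so its critical value is less extreme than the two-tailed one at the same α.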
COMPUTATION IN SPSS:
• BP EXAMPLE
• The Normal Systolic BP for men age 30 to 35 is said to
be 135 mm Hg.
• To check this, we collect systolic BP data from 204 men.
• After descriptive analysis we find a mean ± s.d. of 129.41 ±
18.56.
• How can this sample mean be compared with the mean BP we already
had?
HYPOTHESIS TESTING:
1) Statement of Hypothesis: H0 : µ = 135 --- HA : µ ≠ 135
2) Level of Significance:
3) Test Statistic: ------
4) Rejection Region: Reject Ho if P – value < α
5) Computation: SPSS
6) Result and Conclusion: ------
Test Statistic:

• Sample data → continuous variable

• The CLT justifies using the normal distribution when the number of
observations is large.
• Our data is small and does not represent the whole population.
• The normal-based development depended on the CLT, a known population s.d., σ, and a large
sample size, n.
• When σ is replaced by the sample s.d., s, the derivation shows that
$t = (\bar{x} - \mu) / (s / \sqrt{n})$ has a t-distribution
with n − 1 degrees of freedom (d.f.),
so we can use t instead of the
normal to estimate µ.
• The t-distribution is similar to the normal but has higher
probabilities for large values of |t|.
HYPOTHESIS TESTING:
1) Statement of Hypothesis: H0 : µ = 135 --- HA : µ ≠ 135
2) Level of Significance: α = 0.05
3) Test Statistic: One Sample t-test
4) Rejection Region: Reject Ho if P – value < α
5) Computation: SPSS
6) Result and Conclusion: ------
One-Sample Statistics

Variable | N | Mean | Std. Deviation | Std. Error Mean
Systolic BP of Men | 204 | 129.41 | 18.560 | 1.299

One-Sample Test (Test Value = 135)

Variable | t | df | Sig. (2-tailed) | Mean Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper
Systolic BP of Men | -4.300 | 203 | 0.000026 | -5.588 | -8.15 | -3.03
Result: P value = 0.000026 < α (0.05).
Conclusion: We reject H0 in favour of the alternative
hypothesis: the mean systolic BP of the men differs
significantly from the reference mean BP of 135 mm Hg.
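
The SPSS figures can be roughly cross-checked from the summary statistics alone. This is a sketch, not the SPSS computation: it uses the rounded mean 129.41, and it approximates the p-value with the standard normal distribution because the Python standard library has no t-distribution (with 203 d.f. the two are close, though the normal approximation gives a somewhat smaller p-value).

```python
import math
from statistics import NormalDist

# Summary statistics reported in the SPSS output above.
n, mean, sd, mu0 = 204, 129.41, 18.56, 135.0

se = sd / math.sqrt(n)   # standard error of the mean
t = (mean - mu0) / se    # one-sample t statistic
df = n - 1               # degrees of freedom

# Assumption: normal approximation to the t-distribution (reasonable for df = 203).
p_approx = 2 * NormalDist().cdf(-abs(t))

print(f"SE = {se:.3f}, t = {t:.3f}, df = {df}, approx. p = {p_approx:.6f}")
print("Reject H0" if p_approx < 0.05 else "Do not reject H0")
```

The standard error and t statistic agree with the SPSS table to rounding, and the decision at α = 0.05 is the same.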
EXAMPLE:

The average mobile price is supposed to be PKR 21000/-.
We think the actual price may differ from this.
We asked ten of our friends to check the price.
The data is shown above.
Example:
• A weight-reducing program that includes a strict diet and exercise
claims in its online advertisement that it can help an average
overweight person lose 10 pounds in three months. Following the
program’s method, a group of twelve overweight persons have lost
8.1, 5.7, 11.6, 12.9, 3.8, 5.9, 7.8, 9.1, 7.0, 8.2, 9.3 and 8.0 pounds
in three months.
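
As a hand check to compare against the SPSS output, the one-sample t statistic for these twelve values can be computed directly. This sketch assumes a two-sided alternative (HA: µ ≠ 10) at α = 0.05; the critical value 2.201 is the standard t-table entry for 11 degrees of freedom.

```python
import math
import statistics

# Weight losses (pounds) of the twelve participants, from the example.
losses = [8.1, 5.7, 11.6, 12.9, 3.8, 5.9, 7.8, 9.1, 7.0, 8.2, 9.3, 8.0]
mu0 = 10.0  # mean loss claimed by the advertisement

n = len(losses)
mean = statistics.fmean(losses)
sd = statistics.stdev(losses)            # sample s.d. (n - 1 in the denominator)
t = (mean - mu0) / (sd / math.sqrt(n))   # one-sample t statistic
df = n - 1

# Assumption: two-tailed test at alpha = 0.05; value taken from a t table.
t_critical = 2.201

print(f"n = {n}, mean = {mean:.3f}, s.d. = {sd:.3f}, t = {t:.3f}, df = {df}")
print("|t| exceeds the critical value:", abs(t) > t_critical)
```

SPSS reports the exact p-value for the same t statistic, so the two approaches should lead to the same decision.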
ASSIGNMENT:
1) Statement of Hypothesis: H0 : µ = --- HA : µ ≠ ---
2) Level of Significance:
3) Test Statistic: ------
4) Rejection Region:
5) Computation: SPSS
6) Result and Conclusion: ------

Write up these steps for each of these examples in MS Word, pasting in your
results together with the SPSS output. The file is to be sent Wednesday at 3:00
pm.
