
jem summer

PHARMACEUTICAL STATISTICS MIDTERMS REVIEWER

I. HYPOTHESIS TESTING

 Hypothesis
- In statistics, a hypothesis is a claim or statement about a property of a
population.
- Researchers usually have to be content with studying a random sample from a
population
- The sample should be representative of the population to ensure the validity of the conclusion
- Inferential statistics: process of drawing or generalizing conclusions
about a target population on the basis of the results obtained from a
sample
- How do we draw conclusion based on the sample population? Through
hypothesis testing
- Called “test of significance”- standard procedure/testing for
claims
- Testing of claims
- We test our hypothesis to tell whether our data supports or rejects our
idea
- Hypothesis testing keeps scientists honest
- The hypothesis testing procedure relies on using the information in a
random sample from a population of interest. If the information is
consistent with the hypothesis, we do not reject the hypothesis. If not, we
reject it. A sample can support or contradict a claim, but it cannot prove that the claim is true.
- In hypothesis testing, we have to know whether the distribution is normal or
skewed to know the formula for testing the hypothesis.
- Central limit theorem: when a large enough sample is drawn
randomly from a population, the sampling distribution of the sample mean
is approximately normal, even if the population itself is not
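The central limit theorem can be checked with a small simulation. The sketch below is only an illustration: the exponential population, the sample size of 50, and the 2000 repetitions are assumptions chosen for the demo, not values from the lecture.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential population (mean 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

# Draw 2000 samples of size 50 and keep the mean of each sample.
means = [sample_mean(50) for _ in range(2000)]

center = statistics.fmean(means)  # expected to sit near the population mean, 1.0
spread = statistics.stdev(means)  # expected to sit near 1 / sqrt(50), about 0.141

# Share of sample means within one standard deviation of the center;
# a normal curve puts roughly 68% of its area there.
within_one = sum(abs(m - center) <= spread for m in means) / len(means)
print(round(center, 3), round(spread, 3), round(within_one, 3))
```

Even though the population is skewed, the sample means cluster symmetrically around the population mean, which is why the normal-based formulas in these notes can be used for large samples.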

 Parameters Used in Hypothesis Testing


1. Mean and Std. Dev.
→ Example: to determine if there’s difference in efficacy of drugs
→ If normally distributed, we can use either the mean or the median in
describing our data because they have approximately the same value.
→ If skewed, it is not appropriate to use the mean as a way of
presenting the data. The median is used instead.
2. Proportions
→ Example: who’s more at risk, male or female?

3. Median

 Steps to Hypothesis Testing


1. State the null and alternative hypothesis
→ Null Hypothesis (H0): is a statement that the value of a
population parameter (such as proportion, mean, or standard
deviation) is equal to some claimed value. ALWAYS A
STATEMENT OF EQUALITY
 Example: if you’re comparing for the proportion of
Group 1, it should be same with the proportion of
Group 2.
 If there’s a slight difference, for example: 81%, 80%,
80.5%; although, there are slight differences between
the values of each group, these differences are
insignificant.
 Example: antioxidant activity = % scavenging activity.
Usually, the standard for antioxidant activity is ascorbic
acid. For instance, ascorbic acid has 95%, the 1% extract
has 80%, and the 2% extract has 85%. Even so, the null
hypothesis states that the activities are equal.
 If the objective is to compare, the statement for null
hypothesis is ALWAYS EQUAL
 If the objective is relationship between variables,
there is no relationship or sometimes NO
CORRELATION

→ Alternative hypothesis (H1 or HA): is the statement that the
parameter has a value that somehow differs from the null
hypothesis.
 Different value from the standard or not equal
 Can sometimes be directional (if the claim says GREATER
THAN or LESS THAN) instead of simply unequal
 If relationship, there is a relationship
 If comparison, there is a significant difference

→ Note about identifying H0 and H1
 START
 Identify the specific claim or hypothesis to be tested
and express it in symbolic form
- Mean (μ), proportion (P1, P2)
- μ1 > μ2
- P1 < P2
 Give the symbolic form that must be true when the
original claim is false
 Of the two symbolic expressions obtained so far, let the
alternative hypothesis H1 be the one that does not contain equality
(it uses >, <, or ≠), and let the null hypothesis H0 state that the
parameter equals the fixed value being considered
 Example:
a) The proportion of drivers who admit to running
red lights is greater than 0.5.
- H0: P = 0.5; the proportion of drivers who
admit to running red lights is equal to 0.5
- H1: P > 0.5
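b) The mean height of professional basketball
players is at most 7 ft.
- Claim: μ ≤ 7 ft.; the claim is false when μ > 7 ft.
- H0: μ = 7 ft.; the mean height of
professional basketball players is equal to 7
ft. (the claim contains equality, so it goes with H0)
- H1: μ > 7 ft.; the expression that does not contain equality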

2. Level of Significance
→ Avoiding decision errors
             | DO NOT REJECT H0           | REJECT H0
H0 is true   | √ (correct decision)       | TYPE 1 ERROR or ALPHA ERROR
H0 is false  | TYPE 2 ERROR or BETA ERROR | √ (correct decision)
→ Type 1 error or alpha error: we reject the null hypothesis
even if it is true
→ Type 2 error or beta error: we do not reject the null
hypothesis even if it is false
→ At what point do we reject the null and accept the alternative?
→ Before carrying out any test, determine the level of significance
→ Level of significance should already be set to prevent bias
→ Denoted as alpha (α): the alpha level of significance
→ Alpha level is the probability that the test statistic will fall
in the critical region when the null hypothesis is true
→ Common choices for the significance level are 0.05 (5%), 0.01 (1%),
and 0.10 (10%)
→ 10% is not usually used
→ 0.05 is normally used in computation of sample size; feasible
→ For clinical trials, 0.01 is used because it is stricter and leaves
only a very small chance of a Type 1 error.

3. Identify the Test Statistics
→ Based on the data from the sample
→ t-statistic: t-test; linear regression
→ F-statistic: ANOVA
→ χ²-statistic (chi-square test): used for relationships
→ z-statistic: comparison of proportions
4. Determine the critical region
→ Determines if the null will be rejected or not
→ Basis for statistical decision
→ Critical region is also known as “REJECTION REGION”
→ DECISION RULE (C.R. here is the critical value that bounds the critical region):
 If the computed value is greater than the positive C.R. = reject the null
(right side shaded part)
 If the computed value is less than the negative C.R. = also
reject the null (left side shaded part)
 If the computed value is less than the positive C.R. (and greater than
the negative C.R.) = we do not reject the null (right side unshaded part)
 If the computed value is greater than the negative C.R. (and less than
the positive C.R.) = do not reject the null (left side unshaded part)

Computed value                  | REJECT | DO NOT REJECT
greater than the positive C.R.  |   √    |
less than the positive C.R.     |        |       √
less than the negative C.R.     |   √    |
greater than the negative C.R.  |        |       √

→ DECISION RULE IF THERE’S A P-VALUE:
 If the P-value is greater than alpha (level of
significance), we do not reject the null
 If the P-value is less than alpha, we reject the null
 Example: alpha = 0.05 and p-value = 0.001, reject
the null
 Example: alpha = 0.05 and p-value = 0.1, do not
reject
5. Compute for test statistics
→ Software output (STATA)
6. Statistical decision
→ Since the P-value (<0.0001) is less than alpha = 0.005, reject
H0.
→ Since the P-value = 0.1 is greater than alpha = 0.05, do not reject
H0.
→ REJECT AND DO NOT REJECT ONLY
7. State your conclusion
→ Dependent on the null and alternative hypothesis
→ Example: If you rejected the null, then your conclusion is
whatever your alternative hypothesis is.
→ If do not reject: there is no sufficient evidence to say that
*rewrite your alternative hypothesis*

→ Example: (conduct the steps to hypothesis testing and draw a
conclusion)

Test scores in the entrance examination of the incoming
freshmen nursing students from four different schools (A, B, C,
and D) at University X are shown below. Is there a difference
in the mean scores among schools?
1. State the Null and Alternative Hypothesis:

→ H0: μ1 = μ2 = μ3 = μ4; The test scores in the entrance
examination of the incoming freshmen nursing students from
four different schools (A, B, C, and D) at University X have an
equal mean score.
→ H1: not all of μ1, μ2, μ3, and μ4 are equal; at least one of the
four schools (A, B, C, and D) at University X has a mean score in the
entrance examination that differs from the others.

2. Level of Significance:
→ α =0.05

3. Identify the Test Statistics:


→ F-statistics: ANOVA

4. Determine the Critical Region
→ Decision rule: if the p-value is less than α = 0.05, reject the null

5. Analyze Data Using Statistical Software

→ Computed value: 7.22
→ Critical F-value: 9.28 (F-table:
http://users.sussex.ac.uk/~grahamh/RM1web/F-ratio%20table%202005.pdf)

→ P-value: 0.0022

6. Statistical Decision
→ Since the P-value = 0.0022 is less than α = 0.05, reject H0

7. State your Conclusion
→ At least one of the four schools (A, B, C, and D) has a different mean
score in the entrance examination of the incoming freshmen nursing
students at University X.

 SUMMARY
1. State the null and alternative hypothesis
2. Determine the level of significance
3. Identify the test statistics depending on the objectives
4. Determine the critical region
5. Compute for the test statistics or analyze data using statistical software
6. State your statistical decision
7. State your conclusion
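
The two decision rules in the steps above (critical value and p-value) can be sketched as small helper functions. The function names and the `tail` argument are hypothetical, introduced only for this illustration.

```python
def decide_by_critical_value(computed, critical, tail="two"):
    """Decision for a z- or t-type statistic.

    critical is the positive table value; tail is "two", "right", or "left".
    """
    if tail == "two":
        reject = abs(computed) > critical
    elif tail == "right":
        reject = computed > critical
    elif tail == "left":
        reject = computed < -critical
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return "reject H0" if reject else "do not reject H0"


def decide_by_p_value(p_value, alpha):
    """Reject H0 only when the p-value is less than alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Values taken from the examples in these notes.
print(decide_by_p_value(0.0022, 0.05))          # p-value smaller than alpha
print(decide_by_p_value(0.1, 0.05))             # p-value larger than alpha
print(decide_by_critical_value(2.83, 1.96))     # two-tailed z-test
```

Both rules always lead to the same decision for the same test; the p-value rule is simply more convenient when software reports the p-value directly.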

II. PARAMETRIC TESTS

 SAMPLING DISTRIBUTION OF PROPORTIONS
1. The frequency distribution of sample proportions obtained from all
possible samples of size n.
2. The properties of the sampling distribution of proportions
when the sample size is large, i.e. both nP and n(1-P) are
greater than or equal to 5, are:
 It is approximately normally distributed, which is why sample
proportions can be converted to z-values of the standard normal distribution
 Its mean, µp, is equal to the population proportion, P; and
 Its standard deviation, σp, known as the standard error of
proportions, is equal to $\sigma_p = \sqrt{P(1-P)/n}$
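
A minimal sketch of the large-sample check and of the standard error of proportions, assuming the usual formula sqrt(P(1-P)/n). The function names are hypothetical, and the numbers (n = 50, P = 0.8) are the ones used in the first sample problem of this section.

```python
import math

def large_sample_ok(n, P):
    """True when both nP and n(1-P) are at least 5."""
    return n * P >= 5 and n * (1 - P) >= 5

def standard_error(P, n):
    """Standard error of a sample proportion: sqrt(P(1-P)/n)."""
    return math.sqrt(P * (1 - P) / n)

n, P = 50, 0.8
print(large_sample_ok(n, P))            # nP = 40 and n(1-P) = 10
print(round(standard_error(P, n), 4))   # about 0.0566
```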

 APPLICATIONS OF THE SAMPLING DISTRIBUTION


1. Determine the probability of occurrence of a sample proportion with a
pre-specified magnitude from a given population;
2. Estimate the population proportion, P; and
3. Test hypothesis regarding the population proportion, P.

 SAMPLE PROBLEM
1. If the cure rate for a new drug is 80%, what is the probability that at
most 70% of 50 patients administered with the drug will be cured?

 Formula for z-deviate: $z = \frac{x - \mu}{\sigma}$
 Proportion: $z = \frac{p - P}{\sigma_p}$, where p is the sample proportion
and P is the population proportion; therefore, $z = \frac{p - P}{\sqrt{P(1-P)/n}}$

 Claim would be the population proportion
 Actual would be the sample proportion (the value being tested)
 nP: 50 * 0.8 = 40; it is greater than 5
 n(1-P): (50)(1-0.8) = 10; greater than 5
 Since it met the standard, the sample size is large enough and the
properties of the sampling distribution of proportions that were
enumerated may be used
 Next step is transforming the sample proportion into a z-value
 Use formula: $z = \frac{p - P}{\sqrt{P(1-P)/n}}$
 Input: $z = \frac{0.7 - 0.8}{\sqrt{(0.8)(0.2)/50}}$
 z = -1.77
 Check the value in the z-table
 Plot in the graph

 z = -1.77 corresponds to 0.0384 or 3.84% (shaded part)
 There is a 3.84% chance that at most 70% will be cured, and a 96.16%
chance that more than 70% will be cured, with a drug that
has a cure rate of 80%
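
The same probability can be checked without a z-table. This is a sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

P = 0.80  # population cure rate (the claim)
p = 0.70  # sample proportion of interest
n = 50    # number of patients

se = sqrt(P * (1 - P) / n)           # standard error of proportions
z = (p - P) / se                     # z-deviate of the sample proportion
prob_at_most = NormalDist().cdf(z)   # area to the left of z

print(round(z, 2), round(prob_at_most, 4))
```

The exact area is about 0.0385; the z-table gives 0.0384 because z is rounded to -1.77 before the lookup.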

2. A post EPI implementation survey was conducted among 200
randomly selected children to determine if the aim of the MHO to
cover 80% of the target population was attained. It was found that 176
of the 200 (88%) children surveyed were immunized. Did the MHO
meet his objective?

 Step 1: make a null and alternative hypothesis. Alternative
hypothesis is the claim.
 H0: P= 80%; The proportion of the children that was
immunized is equal to 80%; The MHO did not meet the
objectives
 Ha: P ≠80%: The proportion of the children that was
immunized is not equal to 80%.; The MHO met the
objectives
 Step 2: Level of Significance
 Alpha level: 0.05
 Step 3: Test Statistics
 Z-statistics; z= p-P / square root of P(1-P) / n
 Step 4: Critical region
 Based on the alpha level of significance (0.05)
 Is the hypothesis testing directional or non-directional?
 Non-directional (if it is "equal / not equal," or you don’t know the side) =
two-tailed; directional (< or >) = one-tailed
 Because it is non-directional, 0.05 is divided by 2, which gives
0.025 in each tail
 The z-deviate for 0.025 is 1.96
 CR= +/- 1.96
 If the computed value is greater than +1.96, reject the null
 If the computed value is less than – 1.96, reject the null

 Step 5: Compute for test statistics
 $z = \frac{p - P}{\sqrt{P(1-P)/n}}$
 $z = \frac{0.88 - 0.8}{\sqrt{(0.8)(0.2)/200}}$
 z = +2.83
 Step 6: Statistical decision
 Since the computed value is greater than 1.96, reject the
null
 Step 7: Conclusion
 The proportion of the children that was immunized is not
equal to 80%; it is significantly higher
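
The immunization problem can be reproduced in a few lines. This is a sketch assuming the one-proportion z formula from Step 3; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, P0):
    """z-statistic for a sample proportion against a claimed proportion P0."""
    p_hat = successes / n
    return (p_hat - P0) / sqrt(P0 * (1 - P0) / n)

alpha = 0.05
z = one_proportion_z(176, 200, 0.80)
critical = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
decision = "reject H0" if abs(z) > critical else "do not reject H0"

print(round(z, 2), round(critical, 2), decision)
```

The critical-value rule and the p-value rule agree: z is beyond +1.96 and the p-value is below 0.05.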

3. A survey to determine the prevalence of hypertension was undertaken
in the six towns of Cavite covered by the Community Health
Development Project of the College of Public Health, GTZ and
SEAMEO. However, for purposes of illustrating confidence interval
estimation of the difference between two proportions, only the results
for the towns of Alfonso and Magallanes are given. Of the 414
respondents in Magallanes, 46 (11.1%) were hypertensive as compared
to 62 (15.1%) of the 410 respondents in Alfonso. Determine if the
prevalences of hypertension in the two towns are significantly different
from each other.

 Step 1: make a null and alternative hypothesis. Alternative
hypothesis is the claim.
 H0: Pm = Pa; The proportion of hypertensive in
Magallanes is equal to the proportion of hypertensive in
Alfonso
 Ha: Pm ≠Pa: The proportion of hypertensive in Magallanes
is not equal to the proportion of hypertensive in Alfonso
 Step 2: Level of Significance
 Alpha level: 0.05 or 5%
 Step 3: Test Statistics
 Z-statistics
 Step 4: Critical region
 Based on the alpha level of significance (0.05)
 Because it is non-directional, 0.05 is divided by 2, which gives
0.025 in each tail
 The z-deviate for 0.025 is +/- 1.96
 CR= +/- 1.96
 If the computed value is greater than +1.96, reject the null
 If the computed value is less than – 1.96, reject the null

 Step 5: Compute for test statistics
 Since we have two proportions, we use the formula for comparing two
proportions: $z = \frac{p_1 - p_2}{\sqrt{P_1 Q_1/n_1 + P_2 Q_2/n_2}}$
 But this formula needs the population proportions, so instead,
we will use this formula with the pooled sample proportion:
$z = \frac{p_1 - p_2}{\sqrt{\bar{p}\,\bar{q}\,(1/n_1 + 1/n_2)}}$
Where:
p1: Proportion of Magallanes
p2: Proportion of Alfonso
n1: sample size of Magallanes
n2: sample size of Alfonso
$\bar{p}$: pooled proportion; $\bar{q} = 1 - \bar{p}$
 How do we get the pooled proportion: divide the total number of
hypertensive respondents, 46 + 62 = 108, by the overall total, which is
414 + 410 = 824. This gives 108/824 = 0.1311.
 Input value: $z = \frac{0.1111 - 0.1512}{\sqrt{(0.1311)(0.8689)(1/414 + 1/410)}}$
 z = -1.71
 Step 6: Statistical decision
 Since the computed value = -1.71 is greater than -1.96 (and less than
+1.96), do not reject the null
 Step 7: Conclusion
 There is no sufficient evidence to say that the proportion of
hypertensive in Magallanes is not equal to the proportion of hypertensive
in Alfonso
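
For the Magallanes and Alfonso data, the pooled two-proportion z-test can be sketched as follows. The function name is hypothetical; the counts are the ones given in the problem (46 of 414 and 62 of 410).

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z-statistic for H0: P1 = P2, from counts x1 of n1 and x2 of n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # pooled proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z = two_proportion_z(46, 414, 62, 410)   # Magallanes vs. Alfonso
critical = 1.96                          # two-tailed, alpha = 0.05
decision = "reject H0" if abs(z) > critical else "do not reject H0"

print(round(z, 2), decision)
```

With these counts z is about -1.71, which lies between -1.96 and +1.96, so the data do not show a significant difference between the two prevalences at the 5% level.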

 HYPOTHESIS TESTING FOR POPULATION PROPORTIONS

 Comparing two population proportions: $z = \frac{P_1 - P_2}{\sqrt{P_1 Q_1/n_1 + P_2 Q_2/n_2}}$

Where:
P1 = proportion of the 1st sample
P2 = proportion of the 2nd sample
Q1 = 1 – P1
Q2 = 1 – P2
n1, n2 = sample sizes of the 1st and 2nd group

 HYPOTHESIS TESTING FOR SAMPLE PROPORTIONS

 Comparing two population proportions



 Comparing two population proportions (pooled estimate)

 HYPOTHESIS TESTING FOR POPULATION MEANS

 Testing the difference between large sample mean and population
mean: $z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$

Where:
z = z-test
x̄ = sample mean
µ = population mean
n = sample size
σ = population standard deviation

 To determine whether the results obtained from the samples
support the long-established norms or are consistent with what is
claimed to be the population value
 Computing for population means depends on whether or not the
population variance is known
 In population means, we can use 2 statistics: Z-statistics and T-statistics.
 IF THE POPULATION VARIANCE IS GIVEN, WE USE
THE Z-STATISTICS
 IF THE POPULATION VARIANCE IS NOT KNOWN OR
GIVEN, WE USE THE T-STATISTICS

 Comparing Two Large Sample Means: $z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{1/n_1 + 1/n_2}}$

Where:
z = z-test
x̄1 = mean of the 1st sample group
x̄2 = mean of the 2nd sample group
n1, n2 = sample sizes of the 1st and 2nd group
σ = population standard deviation

 You can also compare two population means

 HYPOTHESIS TESTING FOR SAMPLE MEANS

 Small sample hypothesis test for the mean of a normal population
(formula for T-statistics)

 Small sample hypothesis test for the mean of a normal population
(formula for T-statistics)

 The difference here is that we use the values from the sample (such as
the sample standard deviation) rather than the values from the target population.

 SAMPLE PROBLEM
1. A sports biologist claimed that female distance runners tend to be
taller on the average than women in general, who have an average
height of 64 inches. To test this claim, a random sample of 40 female
distance runners was taken and their heights were recorded, giving x̄ = 65.6
inches and a standard deviation of 3.3 inches. Test the claim at the 5%
level of significance. (Consider the value 3.3 as an estimate for σ.)

 The population standard deviation is GIVEN; since it is given, we
use Z-statistics
 Step 1: make a null and alternative hypothesis. Alternative
hypothesis is the claim.
 H0: μ = 64 inches; The mean height of female distance
runners is equal to 64 inches
 Ha: μ > 64 inches (because taller than 64 is our expected
value): The mean height of female distance runners is
greater than 64 inches.
 Step 2: Level of Significance
 Alpha level: 0.05 or 5%
 Step 3: Test Statistics
 Z-statistics
 Step 4: Critical region
 Based on the alpha level of significance (0.05)
 One-tailed; if the claim is higher, the shaded part should be
on the right side. If the claim is lower, the shaded part
should be on the left side
 The z-deviate for 0.05 in one tail is 1.645, rounded here to 1.64
 CR= +1.64 because we’re on the right side
 If the computed value is greater than +1.64, reject the null
 If the computed value is less than +1.64, do not reject the
null

 Step 5: Compute for test statistics
 Formula: $z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$
 Input value: $z = \frac{(65.6 - 64)\sqrt{40}}{3.3}$
 z = +3.07
 Step 6: Statistical decision
 Since the computed value = +3.07 is greater than +1.64,
reject null
 Step 7: Conclusion
 The mean height of female distance runners is greater than
64 inches.
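
A short sketch that reproduces this one-sample z-test with n = 40, the sample size given in the problem. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(xbar, mu0, sigma, n):
    """z-statistic for a sample mean when sigma is treated as known."""
    return (xbar - mu0) * sqrt(n) / sigma

alpha = 0.05
z = one_sample_z(65.6, 64, 3.3, 40)
critical = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value, about 1.645
decision = "reject H0" if z > critical else "do not reject H0"

print(round(z, 2), round(critical, 3), decision)
```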

2. Suppose that a journal article reports that the mean age at marriage of
Filipino women is 22.6 years in urban and 18.4 years in rural areas.
These findings are based on a sample survey of 150 urban and 180
rural women. The report did not indicate the corresponding variances
of the estimates. However, a review of past data shows that the
variances for the age at marriage of Filipino women are 7.2 and 5.8 for
urban and rural areas, respectively. Is there a significant difference
between the age at marriage of women in urban and rural areas? Use
alpha=0.01.

 The population variances are GIVEN (from past data); since they are given, we
use Z-statistics
 We want to compare the mean age: rural vs. urban
 Step 1: make a null and alternative hypothesis. Alternative
hypothesis is the claim.
 H0: Mu= Mr; The mean age at marriage of women in urban
and rural areas is equal.
 Ha: Mu ≠ Mr (because we are determining if there IS A
SIGNIFICANT DIFFERENCE): The mean age at marriage
of women in urban and rural areas is not equal.
 Step 2: Level of Significance
 Alpha level: 0.01 or 1%
 Step 3: Test Statistics
 Z-statistics because population variance is KNOWN
because we have past data
 Step 4: Critical region
 Based on the alpha level of significance (0.01)
 two-tailed; divide 0.01 by 2, so each side will be
0.005
 The z-deviate for 0.005 is 2.57 (more precisely, 2.576)
 CR= +/- 2.57
 If the computed value is greater than + 2.57, reject the null
 If the computed value is less than -2.57, reject the null
 If the computed value is less than +2.57, do not reject the
null
 If the computed value is greater than -2.57, do not reject the
null

 Step 5: Compute for test statistics
 Formula: instead of the formula with a single population standard
deviation, we use the formula below because
there are two variances:
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
Where:
x̄1 = urban
x̄2 = rural
 Input value: $z = \frac{22.6 - 18.4}{\sqrt{7.2/150 + 5.8/180}}$
 z = +14.83

 Step 6: Statistical decision
 Since the computed value = +14.83 is greater than +2.57,
reject null
 Step 7: Conclusion
 The mean age at marriage of women in urban and rural
areas is not equal.
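
The urban versus rural comparison can be sketched with the two-variance z formula. The function name is hypothetical; the means, variances, and sample sizes are those given in the problem.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, x2, var1, var2, n1, n2):
    """z-statistic for two means when both population variances are known."""
    return (x1 - x2) / sqrt(var1 / n1 + var2 / n2)

alpha = 0.01
z = two_sample_z(22.6, 18.4, 7.2, 5.8, 150, 180)   # urban vs. rural
critical = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed, about 2.576
decision = "reject H0" if abs(z) > critical else "do not reject H0"

print(round(z, 2), round(critical, 3), decision)
```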

3. A study aims to determine the relationship of salt intake to the blood
pressure of persons aged 15 years and over. The mean systolic blood
pressure (SBP) of 20 subjects with a low salt diet was compared to that
of an equal number of subjects with high salt diet. The following data
were generated:

High Salt diet mean SBP = 138 mmHg s.d. = 11.9 mmHg
Low Salt diet mean SBP = 120 mmHg s.d. = 12.2 mmHg

 We want to compare two groups with high salt diet and low salt
diet
 Step 1: make a null and alternative hypothesis. Alternative
hypothesis is the claim.
 H0: Mlow= Mhigh; The mean SBP of persons aged 15
years and over with low salt diet is equal to that with high
salt diet.
 Ha: Mlow ≠ Mhigh: The mean SBP of persons aged 15
years and over with low salt diet is not equal to that with
high salt diet.
 Step 2: Level of Significance
 Alpha level: 0.05 or 5%
 Step 3: Test Statistics
 T-statistics because population variance is UNKNOWN.
And the given standard deviation is for the 20 subjects. It is
not indicated if it is from past studies
 Step 4: Critical region
 Based not only on the alpha level of significance (0.05), but
also on the degrees of freedom
 https://www.studocu.com/en-au/document/australian-national-university/quantitative-research-methods/lecture-notes/t-table-quantitative-research-methods/1062146/view?fbclid=IwAR3LPH0eFFVc6IGSNG-66wWgsU8N3kAmFtNPtQxX9u4AvtpATT2dyK4vakU
 Df: n-1 for one sample
 Df: n1 + n2 – 2 for two samples
 Since we have two samples, use n1 + n2 – 2
 n1 = n2 = 20
 20 + 20 – 2 = 38
 Df: 38
 two-tailed; divide 0.05 by 2, so each side will be
0.025
 Look for the closest value to 38 in the T-table (df = 40), which gives
2.021
 CV= +/- 2.021
 If the computed value is greater than + 2.021, reject the null
 If the computed value is less than -2.021, reject the null
 If the computed value is less than +2.021, do not reject the
null
 If the computed value is greater than -2.021, do not reject
the null

 Step 5: Compute for test statistics
 Formula: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
Where:
x̄1 = high salt diet
x̄2 = low salt diet
Since the values are in standard deviation and we want
variance, we will square it.
 Input value: $t = \frac{138 - 120}{\sqrt{11.9^2/20 + 12.2^2/20}}$
 t = +4.72

 Step 6: Statistical decision
 Since the computed value = +4.72 is greater than +2.021,
reject null
 Step 7: Conclusion
 The mean SBP of persons aged 15 years and over with low
salt diet is not equal to that with high salt diet.
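
The salt-diet comparison can be sketched as below. The function name is hypothetical, and the critical value 2.021 is copied from the t-table lookup in Step 4 rather than computed, because the Python standard library has no t-distribution.

```python
from math import sqrt

def two_sample_t(x1, s1, n1, x2, s2, n2):
    """t-statistic for two means using the two sample standard deviations."""
    return (x1 - x2) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

t = two_sample_t(138, 11.9, 20, 120, 12.2, 20)   # high salt vs. low salt
df = 20 + 20 - 2                                 # n1 + n2 - 2
critical = 2.021                                 # table value used in the notes
decision = "reject H0" if abs(t) > critical else "do not reject H0"

print(round(t, 2), df, decision)
```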

4. The average number of persons per household for the whole country
based on the 1980 census results is 5.6. If a random sample of 25
households in a survey done lately showed a mean household size of
5.2 persons with a standard deviation of 1.56, does the result indicate
that there has been a change in the mean household size in the
Philippines since the last census? (Use alpha = 0.10)
 Step 1: make a null and alternative hypothesis. Alternative
hypothesis is the claim.
 H0: M = 5.6; The mean household size is equal to 5.6.
 Ha: M ≠ 5.6: The mean household size is not equal to 5.6.
 Step 2: Level of Significance
 Alpha level: 0.10 or 10%
 Step 3: Test Statistics
 T-statistics because population variance is UNKNOWN.
And the given standard deviation is for those who were
surveyed. The population variance and population standard
deviation is not known.
 Step 4: Critical region
 Based not only on the alpha level of significance (0.10), but
also on the degrees of freedom
 https://www.studocu.com/en-au/document/australian-national-university/quantitative-research-methods/lecture-notes/t-table-quantitative-research-methods/1062146/view?fbclid=IwAR3LPH0eFFVc6IGSNG-66wWgsU8N3kAmFtNPtQxX9u4AvtpATT2dyK4vakU
 Df: n-1 for one sample
 Df: n1 + n2 – 2 for two samples
 Since we only have one sample, use n-1
 N= 25
 25-1 = 24
 Df: 24
 two-tailed; divide 0.10 by 2, so each side will be
0.05
 Look for df = 24 in the T-table, which gives
1.711
 CR= +/- 1.711
 If the computed value is greater than + 1.71, reject the null
 If the computed value is less than -1.71, reject the null
 If the computed value is less than +1.71, do not reject the
null
 If the computed value is greater than -1.71, do not reject the
null

 Step 5: Compute for test statistics
 Formula: $t = \frac{(\bar{x} - \mu)\sqrt{n - 1}}{s}$
 Input value: $t = \frac{(5.2 - 5.6)\sqrt{25 - 1}}{1.56}$
 t = -1.26
 Step 6: Statistical decision
 Since the computed value = -1.26 is greater than -1.711, do
not reject null
 Step 7: Conclusion
 There is no sufficient evidence to say that the mean
household size is not equal to 5.6.
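
A sketch of the household-size test. The input values in these notes multiply by the square root of n - 1; many textbooks instead use the square root of n together with a standard deviation computed with an n - 1 divisor. The hypothetical `use_n_minus_1` flag shows both, and the critical value 1.711 is copied from the t-table lookup in Step 4.

```python
from math import sqrt

def one_sample_t(xbar, mu0, s, n, use_n_minus_1=True):
    """t-statistic for one sample mean; the flag selects sqrt(n-1) or sqrt(n)."""
    scale = sqrt(n - 1) if use_n_minus_1 else sqrt(n)
    return (xbar - mu0) * scale / s

critical = 1.711                                  # t-table, df = 24, 0.05 per tail
t_notes = one_sample_t(5.2, 5.6, 1.56, 25)        # form used in these notes
t_alt = one_sample_t(5.2, 5.6, 1.56, 25, use_n_minus_1=False)

for t in (t_notes, t_alt):
    decision = "reject H0" if abs(t) > critical else "do not reject H0"
    print(round(t, 2), decision)
```

Both forms give a value between -1.711 and +1.711, so the decision is the same either way.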

III. PARAMETRIC TESTS – ANOVA

 Analysis of Variance (ANOVA)
1. The extension of the t-test of two independent samples to more than two groups.
2. As its name implies, ANOVA analyzes the variance of the data to
determine whether there is a difference between the group means.
3. In ANOVA:
 Factor(s) – are just the variable(s), e.g., gender
 Levels – are the levels of the variable(s), e.g., for gender, the levels
are male and female

 Types of ANOVA
1. One-Way ANOVA
 Example 1: A researcher wants to test a new anti-anxiety
medication. They split participants into three conditions (0 mg,
50 mg, and 100 mg), then ask them to rate their anxiety level
on a scale of 1-10, with 10 being “high anxiety” and 1 being
“low anxiety”. Are there any differences between the three
conditions?

 One-Way ANOVA is an ANOVA with one factor with at least
two levels. Levels are independent.

2. Two-Way ANOVA
 Example 2: A physical therapist wished to compare three
methods for teaching patients to use a certain prosthetic device.
He felt that the rate of learning would be different for patients
of different ages and wished to design an experiment in which
the influence of age could be taken into account.
 Two-Way ANOVA is an ANOVA with two factors, each with at
least two levels. Levels are independent.
 Example 3: A study to determine the effects of 3 doses of a
new therapeutic agent on a short-term memory function was
conducted at two different centers. The subjects were
administered a single oral dose of test preparation and then
asked to recall items one hour after exposure to a list consisting
of 12 items.

3. Repeated-Measures ANOVA
 Example 4: A researcher wants to test a new anti-anxiety
medication. They measure the anxiety of 7 participants three
times: before taking the medication, one week after taking the
medication, and two weeks after taking the medication. Are
there any differences between the three time periods?
 Repeated-Measures ANOVA is an ANOVA with one factor
with at least two levels. Levels are dependent.

 COMPLETELY RANDOMIZED DESIGN


 Introduction
1. One-way analysis of variance (One-Way ANOVA) is a method used
to compare 2 or more group means simultaneously in the light of
a single variable.
2. One variable (or factor) with at least two levels; the levels are independent.
3. This test is appropriate for both equal and unequal samples from each
group.
 Assumptions
1. Each of the populations from which the samples come is normally
distributed with mean μj and variance σj^2
2. Each of the populations has the same variance.
 Statistical Methods
1. Dependent variable is normally distributed from each population.
 Shapiro-Wilk test
 H0: Data is normal
 Ha: Data is not normal
2. Variance of dependent variable is the same in each population
(homogeneity of variance)
 Breusch-Pagan/Cook-Weisberg Test
 H0: Population variances are all equal.
 Ha: Population variances are not all equal.

 STATISTICAL HYPOTHESES
1. H0: The t treatments have equal effects.
2. Ha: At least one of the t treatments is different.

 Decision Rule
 In general, the decision rule is: reject the null hypothesis if the
computed value of V.R. (the variance ratio, i.e., the F-statistic) is equal
to or greater than the critical value of F for the chosen α level.
 Conclusion
 If H0 is not rejected → there is no sufficient evidence
from the data to indicate that not all population means
are equal.
 If H0 is rejected → not all population means are equal
(i.e., at least one population mean is not
equal to the others)
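
The variance ratio (V.R.) in the decision rule above is the between-group mean square divided by the within-group mean square. The sketch below computes it by hand; the function name is hypothetical and the three groups are made-up numbers for illustration only, not data from the examples in these notes.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each value
    # from its own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical scores for three independent groups (illustration only).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_b, df_w = one_way_anova_f(groups)

critical_f = 5.14   # F-table value for (2, 6) degrees of freedom at alpha = 0.05
decision = "reject H0" if f_value >= critical_f else "do not reject H0"
print(round(f_value, 2), (df_b, df_w), decision)
```

The groups do not need to be the same size, which matches the note that one-way ANOVA works for both equal and unequal numbers of observations.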

 One-Way ANOVA with Equal Observations


 Example 1: A researcher wants to test a new anti-anxiety
medication. They split participants into three conditions (0 mg, 50
mg, and 100 mg), then ask them to rate their anxiety level on a
scale of 1-10, with 10 being “high anxiety” and 1 being “low
anxiety”. Are there any differences between the three conditions?

 CHECKING NORMALITY AND HOMOGENEITY



 One-Way ANOVA with Unequal Observations


 Example 2: Test scores in the entrance examination of the
incoming freshmen nursing students from four different schools
(A, B, C, and D) at University X are shown below. Is there a
difference in the mean scores among schools?

 CHECKING NORMALITY AND HOMOGENEITY



 MULTIPLE PAIRWISE COMPARISONS


1. LEAST SIGNIFICANT DIFFERENCE (LSD)
 Calculates the smallest significant difference between two means.
2. BONFERRONI
 A conservative test used for comparisons of a small number of pairs
of treatment means.
3. SIDAK
 Same as the Bonferroni procedure but is less conservative.
4. SCHEFFE
 Used for testing the significance of unplanned comparisons (allows
data snooping)
5. TUKEY’S HSD
 Used for testing the significance of unplanned comparisons (allows
data snooping)
6. DUNNETT
 test used when the only pairwise comparisons of interests are
comparisons with a control.
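
To see why Sidak is described as less conservative than Bonferroni, the per-comparison alpha of each procedure can be compared. This is a sketch with hypothetical function names; the choice of three groups is an assumption for the demo.

```python
def n_pairs(k):
    """Number of pairwise comparisons among k group means."""
    return k * (k - 1) // 2

def bonferroni_alpha(alpha, m):
    """Per-comparison alpha under Bonferroni: alpha / m."""
    return alpha / m

def sidak_alpha(alpha, m):
    """Per-comparison alpha under Sidak: 1 - (1 - alpha)^(1/m)."""
    return 1 - (1 - alpha) ** (1 / m)

m = n_pairs(3)                     # 3 groups give 3 pairs
bonf = bonferroni_alpha(0.05, m)   # about 0.0167
sidak = sidak_alpha(0.05, m)       # about 0.0170
print(m, round(bonf, 4), round(sidak, 4))
```

Sidak allows a slightly larger alpha per comparison, so it rejects a little more easily while still holding the overall error rate near 0.05.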

 Software Output

 RANDOMIZED COMPLETE BLOCK DESIGN

 Introduction
1. Several factors with different levels, and the respondents are randomly
assigned to each level or each group.
2. However, when you just randomly assign
respondents to each group or level, there’s a possibility of bias due
to confounding variables.
3. In RCBD, you identify your confounding variable and you group the
respondents according to that confounding variable (these groups are the blocks).
4. The aim is to compare the different levels of a certain factor, to see if there is a
significant difference between two or more groups.
5. The technique for analyzing the data from RCBD is two-way ANOVA
since each observation is characterized on the basis of two criteria,
which are the block and the treatment group to which the respondent belongs.
6. Randomized complete block design (RCBD) is a design in which the
units (called experimental units) to which the treatments are applied
are subdivided into homogeneous groups called blocks, so that the
number of experimental units in a block is equal to the number (or
some multiple of the number) of treatments being studied.
7. The treatments are then assigned at random to experimental units
within each block.
8. It should be emphasized that each treatment appears in every block,
and each block receives every treatment.
 Assumptions
1. The population from which the observations are drawn is normally
distributed.
2. The observations are independent.
3. The various effects (block effects) are additive in nature.

 Decision Rule
 In general, the decision rule is: reject the null hypothesis if the
computed value of V.R. is equal to or greater than the critical value
of F for the chosen α level.
 Conclusion
 If H0 is not rejected → there is no sufficient evidence from the
data to indicate that not all population means are equal.
 If H0 is rejected → not all population means are equal (i.e., at least
one population mean is not equal to the others).
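
For a randomized complete block design with one observation per block and treatment, the two variance ratios (treatments and blocks) can be computed by hand. The function name is hypothetical and the table holds made-up numbers for illustration only; it is not the data of any example in these notes.

```python
from statistics import fmean

def rcbd_anova_f(table):
    """table[i][j] = observation in block i under treatment j (one per cell).

    Returns (F for treatments, F for blocks).
    """
    n_blocks = len(table)
    n_treat = len(table[0])
    grand = fmean(v for row in table for v in row)
    block_means = [fmean(row) for row in table]
    treat_means = [fmean(row[j] for row in table) for j in range(n_treat)]

    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_blocks = n_treat * sum((m - grand) ** 2 for m in block_means)
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    ss_error = ss_total - ss_blocks - ss_treat   # what is left is error

    df_treat, df_blocks = n_treat - 1, n_blocks - 1
    ms_error = ss_error / (df_treat * df_blocks)
    return (ss_treat / df_treat) / ms_error, (ss_blocks / df_blocks) / ms_error

# Hypothetical data: 4 blocks (rows) by 3 treatments (columns).
table = [
    [5, 7, 9],
    [6, 7, 11],
    [8, 9, 12],
    [9, 12, 13],
]
f_treat, f_block = rcbd_anova_f(table)

critical_f = 5.14   # F-table value for (2, 6) degrees of freedom at alpha = 0.05
print(round(f_treat, 2), round(f_block, 2), f_treat >= critical_f)
```

Removing the block sum of squares from the error term is what makes the treatment comparison more sensitive than a completely randomized design on the same data.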

 Two-way ANOVA without replication


 Example 3.1: A physical therapist wished to compare three
methods for teaching patients to use a certain prosthetic device. He
felt that the rate of learning would be different for patients of
different ages and wished to design an experiment in which the
influence of age could be taken into account.
1. Here, we are only interested in one factor, the teaching
method, and in whether there is a difference between the teaching methods. We only
created the age groups to eliminate the confounding variable or possible
bias that may occur.

 Step 1: make a null and alternative hypothesis.
 H0: The mean rate of learning for the three methods for
teaching patients to use a prosthetic device is equal.
 Ha: At least one of the mean rates of learning for the three
methods for teaching patients to use a prosthetic
device is different.
 Step 2: Level of Significance
 Alpha level: 0.05 or 5%
 Step 3: Test Statistics
 F-statistics
 Step 4: Critical region/Decision Rule
 If p-value > alpha, do not reject null
 If p-value < alpha, reject the null
 Since we are only interested in the differences between the
methods, we look only at the row for the methods in the output:

 The p-value for the methods is 0.0006

 However, the p-value for the age group (0.0010) is also less
than alpha, so there is a difference in the rate of learning
between the age groups.
 If the p-value of the age group had been greater than alpha,
the age group may not really affect the rate of learning. In
that case, we may use one-way ANOVA (completely randomized
design) instead.
 Step 6: Statistical decision
 Since the p-value = 0.0006 is less than alpha, we reject null.
 Step 7: Conclusion
→ At least one of the three methods has a different mean rate
of learning.

Data for Example 3.1

 After we conduct the ANOVA, we have to check that its assumptions
have been met: that the variance is constant and that the data are
normal.

 CHECKING NORMALITY AND HOMOGENEITY

Data for example 3.1

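 For the normality test, the decision recorded is "do not reject",
which requires a p-value greater than alpha. The value written
here (0.00147) is less than 0.05 and would lead to rejection, so
verify it against the software output.
 If the null is not rejected, the data is normally distributed/normal.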
 For the test of equal population variances (p = 0.0954), the
p-value is greater than alpha, so do not reject the null.
Therefore, we treat the population variances as equal.

 SOFTWARE OUTPUT (MCP-TUKEY’S HSD PROCEDURE)

Data for Example 3.1

 Conduct multiple pairwise comparisons because the ANOVA led us to
conclude that at least one of the methods has a different mean
rate of learning.
 Since Tukey's procedure already adjusts the p-values for multiple
comparisons, we just compare each p-value to 0.05.
 For B vs A
→ Since the p-value is greater than alpha, do not reject
null. There is no sufficient evidence to say that the mean
rates of learning for methods A and B are not equal.
 For C vs A
→ Reject null. The mean rates of learning for methods A and C
are not equal, and method C is more effective than
method A.
→ To determine which method has the higher mean rate of
learning, look at the contrast.
→ Since the contrast (2.6) is positive, the method on the left
side of the comparison (C) has the higher mean.
→ Therefore, method C is more effective.
 For C vs B
→ Since the p-value is 0.003, reject the null. The mean
rates of learning for methods B and C are not equal, and
method C is more effective than method B.
 C > A=B
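
The way each Tukey comparison is read (p-value first, then the sign of the contrast) can be sketched as a small helper. Only the contrast 2.6 for C vs A and the p-value 0.003 for C vs B come from the notes; every other number below is a hypothetical placeholder, and `interpret_pair` is an illustrative name, not a function from any statistics package.

```python
def interpret_pair(label, contrast, adj_p, alpha=0.05):
    """Summarize one pairwise comparison written as 'left vs right'.

    contrast is mean(left) - mean(right); adj_p is the Tukey-adjusted p-value.
    '=' means there is no sufficient evidence that the two means differ.
    """
    left, right = label.split(" vs ")
    if adj_p >= alpha:
        return f"{left} = {right}"
    # Significant difference: a positive contrast means the left side is higher.
    return f"{left} > {right}" if contrast > 0 else f"{left} < {right}"


comparisons = [
    ("B vs A", 0.4, 0.700),  # hypothetical contrast and p-value
    ("C vs A", 2.6, 0.001),  # contrast from the notes, hypothetical p-value
    ("C vs B", 2.2, 0.003),  # hypothetical contrast, p-value from the notes
]
summary = [interpret_pair(*row) for row in comparisons]
print(summary)
```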

 FACTORIAL EXPERIMENTS
 Introduction
1. A factorial experiment studies two or more factors at the same
time and compares those factors.
2. CRD – one set of treatments (a factor) is applied to homogeneous
experimental units.
3. RCBD – one set of treatments (a factor) is applied to heterogeneous
experimental units classified by the blocks.
4. The focus of this lecture will be on designs for experiments with two
factors with at least two levels for each factor of interest.

5. Interaction
 Presence of interaction between two factors can affect the
characteristics of the data in a variety of ways.
 To illustrate the effects of interaction, consider the data shown
below.

 If the lines in the plot intersect (cross), we can say that there is
an interaction between the two factors.
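
One way to see what "intersecting lines" means numerically is to look at the gap between the two levels of one factor at each level of the other. The sketch below uses hypothetical cell means and made-up names (`b_gaps`, `describe`); it only describes the picture and does not replace the F-test for the interaction.

```python
def b_gaps(cell_means):
    """For each level of factor A, return mean(B1) - mean(B2)."""
    return {a: means["B1"] - means["B2"] for a, means in cell_means.items()}


def describe(cell_means, tol=1e-9):
    """Describe the interaction plot implied by a table of cell means."""
    gaps = list(b_gaps(cell_means).values())
    if max(gaps) - min(gaps) <= tol:
        return "parallel lines: no sign of interaction"
    if min(gaps) < 0 < max(gaps):
        return "lines cross: interaction suggested"
    return "lines not parallel: interaction possible"


# Hypothetical cell means: the gap between B1 and B2 is the same at every level of A.
parallel = {"A1": {"B1": 4.0, "B2": 6.0},
            "A2": {"B1": 5.0, "B2": 7.0},
            "A3": {"B1": 7.0, "B2": 9.0}}

# Hypothetical cell means: B1 starts below B2 and ends above it.
crossing = {"A1": {"B1": 4.0, "B2": 6.0},
            "A2": {"B1": 6.0, "B2": 6.5},
            "A3": {"B1": 9.0, "B2": 7.0}}

print(describe(parallel))
print(describe(crossing))
```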

 STATISTICAL HYPOTHESES
1. For Factors A and B, we will make null and alternative hypotheses
for each factor; the same goes for the interaction. The hypotheses
for Factors A and B are just the same as before.
2. For Factor A:
 H0: α1 = α2 = … = αa = 0
 Ha: At least one of the α's ≠ 0
3. For Factor B:
 H0: β1 = β2 = … = βb = 0
 Ha: At least one of the β's ≠ 0
4. For the interaction:
 H0: There are no interactions between the levels of Factor A
and levels of Factor B.
 Ha: There is an interaction between Factor A and Factor B.
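
The α's and β's in these hypotheses are the effects in the usual two-factor model. The notes do not write the model out, so the form below is the standard textbook one, offered as an assumption about the intended notation; the symbols μ, (αβ), ε, and n (replicates per cell) are not in the original.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,a;\quad j = 1,\dots,b;\quad k = 1,\dots,n

H_0 \text{ (interaction)}:\ (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
\qquad
H_a:\ \text{at least one } (\alpha\beta)_{ij} \neq 0
```

Here α_i is the effect of level i of Factor A, β_j is the effect of level j of Factor B, and (αβ)_ij is the extra effect of that particular combination.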

 Two-Way ANOVA with Replication


 Example 3.2: A study to determine the effects of 3 doses of a new
therapeutic agent on a short-term memory function was conducted
at two different centers. The subjects were administered a single
oral dose of test preparation and then asked to recall items one
hour after exposure to a list consisting of 12 items.

Data for Example 3.2

 First, we identify the factors. Factor A: dose group. Factor B:
Centers. Interaction: between factor A and factor B.
 FOR FACTOR A:
 Step 1: State the null and alternative hypotheses.
 H0: The mean number of items recalled by the patients is equal
for the three doses and placebo.
 Ha: At least one of the three doses and placebo has a
different mean number of items recalled.
 FOR FACTOR B:
 Step 1: State the null and alternative hypotheses.
 H0: The mean number of items recalled by the patients from
centers 1 and 2 is equal.
 Ha: The mean number of items recalled by the patients from
centers 1 and 2 is not equal/different.
 FOR INTERACTION:
 Step 1: State the null and alternative hypotheses.
 H0: There is no interaction between levels of the doses and
centers.
 Ha: There is an interaction between levels of the doses and
centers.
 Step 2: Level of Significance (FOR FACTORS A, B,
INTERACTION)
 Alpha level: 0.05 or 5%
 Step 3: Test Statistics (FOR FACTORS A, B, INTERACTION)
 F-statistics
 Step 4: Critical region/Decision Rule
 If p-value > alpha, do not reject null
 If p-value < alpha, reject the null
 For factorial experiments, we are going to take a look first
at the interaction.
 To determine the interaction: A#B which is the center#dose

 Once you have tested the interaction between Factors A and
B, interpret as follows. If there is an interaction,
interpret only the interaction. If there is no interaction,
interpret Factors A and B.
 Step 6: Statistical decision
 Since the p-value = 0.0283 is less than alpha, we reject null.
 Step 7: Conclusion
→ There is an interaction between the levels of doses and
centers.
→ Since there is an interaction, we do not need to interpret
the factors.
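
The order of interpretation used here (interaction first, main effects only when there is no interaction) can be written as a short decision helper. The interaction p-value 0.0283 is from the notes; the p-values passed for the dose and the center are hypothetical placeholders, and `factorial_report` is an illustrative name.

```python
def factorial_report(p_interaction, p_factor_a, p_factor_b, alpha=0.05):
    """Apply the reviewer's rule: read the interaction first."""
    if p_interaction < alpha:
        # Significant interaction: main effects and pairwise tests are not interpreted.
        return {"interaction": "reject H0: there is an interaction",
                "factor A": "N/A",
                "factor B": "N/A"}

    def verdict(p):
        return "reject H0" if p < alpha else "do not reject H0"

    return {"interaction": "do not reject H0: no interaction",
            "factor A": verdict(p_factor_a),
            "factor B": verdict(p_factor_b)}


# 0.0283 is the interaction p-value from Example 3.2; 0.010 and 0.200 are hypothetical.
report = factorial_report(0.0283, p_factor_a=0.010, p_factor_b=0.200)
for term, decision in report.items():
    print(term, "->", decision)
```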

 GRAPHICAL PRESENTATION OF INTERACTION

Data for Example 3.2

 In the graph of the mean response by dose for each center, center 1
(blue) increases from placebo to 50 mg, then shows a sudden
decrease in the patients' response at 75 mg.
 For center 2 (red), there is an increase from placebo to 25 mg, then
a slight decrease at 50 mg and an increase at 75 mg.

 CHECKING NORMALITY AND HOMOGENEITY

 For the normality test (p = 0.33816), since the p-value is higher
than alpha, we do not reject the null. Therefore, the data is normal.
 For the test of equal variances (p = 0.6503), do not reject the
null. The population variances are all equal.

 SOFTWARE OUTPUT (MCP-TUKEY’S HSD PROCEDURE)


 If there were no interaction between the center and the doses and
you found a difference between the doses, you would conduct
multiple pairwise comparisons and interpret all the p-values.
Since there is an interaction here, this is no longer applicable.
Just write N/A.

 REPEATED-MEASURES DESIGN
 Introduction
1. One of the most frequently used experimental designs in the health
sciences field is the repeated measures design.
2. When there is one factor with at least two levels and the levels are
dependent (measured on the same subjects), the analysis is called
repeated-measures one-way ANOVA.
 Simple Repeated-Measures Design
1. Simple repeated-measures design (SRMD) is an experimental design
in which the measurements of the same variable are made on each
subject on two or more different occasions.
2. Common for anti-inflammatory drugs, topical preparations
 Repeated-Measures ANOVA
 Example 3.3: A researcher wants to test a new antianxiety
medication. They measure the anxiety of 7 participants three times:
before taking the medication, one week after taking the
medication, and two weeks after taking the medication. Are there
any differences between the three time periods?

Data for Example 3.3

 The response variable here is the anxiety level. The factor of
interest is the time (week).
 ASSUMPTIONS
1. Sphericity
 Suppose the repeated-measures factor TIME has 3 levels: the
before, after, and follow-up scores of each individual.
2. SRMD ANOVA assumes that the 3 correlations are equal:
 r (Before – After)
 r (Before – Follow-up)
 r (After – Follow-up)
3. Correcting for deviations
 Epsilon (ε) measures the degree to which the covariance matrix
deviates from sphericity.
 If epsilon = 1, the sphericity assumption is met perfectly.
 The further epsilon falls below 1, the worse the
violation.
 The assumption here is that the covariances are equal.
Instead of checking whether the population
variances of the groups are equal, we check the
covariances.
 If <0.75, use the G-G adjusted p-value. The G-G
(Greenhouse-Geisser) epsilon is conservative.
 If >0.75, use the H-F adjusted p-value. The H-F
(Huynh-Feldt) epsilon is liberal.
 Since G-G is conservative (it tends to underestimate
epsilon) and H-F is liberal (it tends to overestimate
epsilon), we combine the two. Sometimes the H-F epsilon
goes above 1, and we then assume that it is equal to 1.
 How do we adjust?
 We get the average of the two epsilons and
compare it with 0.75 (<0.75 or >0.75).
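
The adjustment rule above (cap the H-F epsilon at 1, average the two epsilons, then compare with 0.75) can be sketched as follows. The epsilon and p-values passed in are hypothetical, chosen only so that the average comes out near the 0.63 mentioned in these notes; the notes do not say what to do at exactly 0.75, so sending that case to H-F is an assumption. `choose_adjusted_p` is an illustrative name.

```python
def choose_adjusted_p(eps_gg, eps_hf, p_gg, p_hf, cutoff=0.75):
    """Pick the G-G or H-F adjusted p-value from the average of the two epsilons."""
    eps_hf = min(eps_hf, 1.0)            # an H-F epsilon above 1 is treated as 1
    average = (eps_gg + eps_hf) / 2
    if average < cutoff:
        return "G-G", p_gg, average      # stronger violation: conservative correction
    return "H-F", p_hf, average          # assumption: exactly 0.75 also goes to H-F


# Hypothetical software output: two epsilons and their adjusted p-values.
method, p_value, average = choose_adjusted_p(
    eps_gg=0.58, eps_hf=0.68, p_gg=0.004, p_hf=0.002
)
decision = "reject H0" if p_value < 0.05 else "do not reject H0"
print(method, round(average, 2), decision)
```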

 Example 3.4: A researcher wants to test a new antianxiety
medication. They measure the anxiety of 7 participants three
times: before taking the medication, one week after taking the
medication, and two weeks after taking the medication. Are
there any differences between the three time periods?

Data for Example 3.4

 H0: There is no difference between the anxiety levels across
the three different time periods.
 Ha: At least one of the three groups/time periods has a different
anxiety level.
 Since the level of significance, test statistic (F), and decision
rule are the same as before, look at the table below:

 Get the average of the H-F and G-G epsilons, which is around 0.63.
Since 0.63 is less than 0.75, based on our adjustment rule, the
p-value that we are going to use is the G-G adjusted p-value.
 The p-values are below; we base our decision about the hypotheses
on the G-G adjusted p-value.
 Since it is less than alpha, at least one of the three time periods
has a significantly different anxiety level.
 Then determine which time period that is.

 SOFTWARE OUTPUT (MCP-BONFERRONI PROCEDURE)
