BIOstat t-test anova (Autosaved)

BIO-STATISTICS

T-TEST
ANOVA
The t-test
The t-test is a form of inferential statistics used to determine whether there is a significant
difference between the means of two groups. These groups may be related or unrelated. It is
therefore a test for comparing two data sets obtained from an observation or study, and it is one
of the tests most often used for hypothesis testing in scientific experiments. In hypothesis
testing, we test an assumption about a population parameter.
Inferential statistics
Inferential statistics tries to reach conclusions that extend beyond the immediate data alone.
For instance, we use inferential statistics to try to infer from the sample data what the population
might think. Or, we use inferential statistics to make judgments of the probability that an
observed difference between groups is a dependable one or one that might have happened by
chance in this study. Thus, we use inferential statistics to make inferences from our data to more
general conditions; we use descriptive statistics simply to describe what’s going on in our data.
Explaining the T-Test
Suppose a drug manufacturer wants to test a newly developed medicine. It follows the
standard procedure of trying the drug on one group of patients and giving a placebo to another
group, called the control group. The placebo given to the control group is a substance of no
intended therapeutic value and serves as a benchmark against which to measure how the other
group, which is given the actual drug, responds. After the drug trial, the members of the
placebo-fed control group report an increase in average life expectancy of three years, while the
members of the group prescribed the new drug report an increase in average life expectancy of
four years. At first glance the drug appears to be working, since the results are better for the
group using it. However, the observed difference may also be due to chance. A t-test helps to
decide whether the difference is statistically significant, that is, whether it is likely to hold
in the population from which the samples were drawn. Mathematically, the t-test takes a sample
from each of the two sets and sets up the problem by assuming a null hypothesis that the two
means are equal. Using the applicable formulas, a test statistic is calculated and compared
against a standard (critical) value, and the null hypothesis is rejected or not rejected
accordingly. If the null hypothesis is rejected, the observed difference is unlikely to be due to
chance alone.
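For reference, one common form of the "applicable formulas" mentioned above is the pooled two-sample t statistic. The notation below is an assumption introduced here, not taken from the handout: the sample means are written as x̄₁ and x̄₂, the sample variances as s₁² and s₂², and the group sizes as n₁ and n₂. This form also assumes the two groups have equal variances.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
\text{df} = n_1 + n_2 - 2
```

The calculated t is compared with the critical value of the t distribution for the stated degrees of freedom and the chosen alpha level.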
Types of t-test:
(1). The one-sample t-test: The one-sample t-test is used to determine whether the mean of a
single population is equal to a specified value. For instance, is the mean height of 20 college
boys greater than 5.5 feet? As another example, suppose you want to show that a new teaching
method for pupils struggling to learn English grammar can improve their grammar skills to the
national average. Your sample would be the pupils who received the new teaching method and your
population mean would be the national average score. A further example is comparing the acidity
of a group of liquids to a neutral pH of 7.
Assumptions to consider before conducting a one sample t-test
-The dependent variable should be measured at the interval or ratio level (i.e., continuous).
Examples of variables that meet this criterion include study time (measured in hours),
intelligence (measured as IQ score), exam performance (measured from 0 to 100%), weight
(measured in kg).

-The data are independent (not correlated/related), which means that there is no relationship
between the observations.

-There should be no significant outliers. Outliers are data points within a data set that do not
follow the usual pattern (e.g., in a study of 100 students' IQ scores, where the mean score was
108 with only a small variation between students, one student had a score of 156, which is very
unusual, so 156 is considered an outlier). The problem with outliers is that they can distort the
one-sample t-test, reducing the accuracy of your results.

-Your dependent variable should be approximately normally distributed. We talk about the
one-sample t-test only requiring approximately normal data because it is quite "robust" to
violations of normality, meaning that the assumption can be violated a little and the test will
still provide valid results. You can test for normality using the Shapiro-Wilk test, which is
available in SPSS Statistics.

(2). Two-sample t-test: This is also referred to as the independent samples t-test. It is used to
compare the means of two groups that are independent of one another, that is, when the
participants in each group comprise two separate groups of individuals who have no link to
particular members of the other group. Imagine a researcher collected information about
instruction quality from two separate sets of youth programs: programs whose participants
received an intervention (for example, a teaching aid) and programs that did not. Because the
participants in each group comprise separate, independent samples, the scores from each group do
not depend on each other. Thus, to compare the means of these two groups, an independent samples
t-test is appropriate.

Results of an independent samples t-test indicate whether the difference between two means (e.g.,
the mean of programs receiving the intervention and the mean of programs not receiving it) is
larger than expected by chance. Using the example above, if the instructors who received the
intervention had a significantly higher mean quality score than the group that did not, there
would be evidence that the intervention increased the quality of instruction.

(3). Paired t-test: The paired sample t-test, sometimes called the dependent sample t-test, is a
statistical procedure used to determine whether the mean difference between two sets of paired
observations is zero. It is suitable when both sets of measurements come from the same subjects,
for example when measuring before and after an experimental treatment. Here, we measure one group
at two different times or under two different conditions and compare the two means. In a paired
sample t-test, each subject or entity is measured twice, resulting in pairs of observations.
Common applications of the paired sample t-test include case-control studies and
repeated-measures designs.

The dependent t-test (also called the paired-samples t-test) compares the means between two
related groups on the same continuous, dependent variable. For example, you could use a dependent
t-test to understand whether there was a difference in smokers' daily cigarette consumption
before and after a 6-week hypnotherapy programme (i.e., your dependent variable would be "daily
cigarette consumption", and your two related groups would be the cigarette consumption values
"before" and "after" the hypnotherapy programme).

Assumptions to consider before conducting a paired sample t-test.


(1). Your dependent variable should be measured on a continuous scale (i.e., it is measured at the
interval or ratio level). Examples of variables that meet this criterion include revision time
(measured in hours), intelligence (measured using IQ score), exam performance (measured from
0 to 100), weight (measured in kg).
(2). Your independent variable should consist of two categorical, "related groups" or "matched
pairs". "Related groups" indicates that the same subjects are present in both groups. It is
possible to have the same subjects in each group because each subject has been measured on two
occasions on the same dependent variable.
(3). There should be no significant outliers in the differences between the two related groups.
Outliers are simply single data points within your data that do not follow the usual pattern.
(4). The distribution of the differences in the dependent variable between the two related groups
should be approximately normally distributed.

ANOVA
Analysis of variance (ANOVA) is a type of inferential statistical technique that is used to check
if the means of more than two (2) groups are significantly different from each other. ANOVA
checks the impact of one or more factors by comparing the means of different samples.
A researcher might, for example, test students from multiple colleges to see if students from one
of the colleges consistently outperform students from the other colleges.

Types of ANOVA
There are two main types: the one-way ANOVA and the two-way ANOVA.

One-way ANOVA
In a one-way ANOVA only a single factor is investigated. It involves the comparison of the means
of 3 or more samples. The null hypothesis (H0) states that all the population means are equal.
The alternative hypothesis (H1) states that at least one mean differs from the others.
The one-way analysis of variance (ANOVA) is used to determine whether there are any
statistically significant differences between the means of three or more independent (unrelated)
groups. For example, you could use a one-way ANOVA to understand whether exam
performance differed based on test anxiety levels amongst students, dividing students into three
independent groups (e.g., low, medium and high-stressed students). Also, it is important to
realize that the one-way ANOVA is an omnibus test statistic and cannot tell you which specific
groups were statistically significantly different from each other; it only tells you that at least two
groups were different. Since you may have three, four, five or more groups in your study design,
determining which of these groups differ from each other is important. You can do this using a
post hoc test.
Assumptions of a One-way ANOVA.
(1). Your dependent variable should be measured at the interval or ratio level (i.e., it is
continuous). Examples of variables that meet this criterion include revision time (measured in
hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100),
weight (measured in kg).
(2). Your independent variable should consist of three or more categorical, independent groups.
Typically, a one-way ANOVA is used when you have three or more categorical, independent
groups, but it can be used for just two groups (although an independent-samples t-test is more
commonly used for two groups). Example independent variables that meet this criterion include
ethnicity (e.g., 3 groups: Caucasian, African American and Hispanic), physical activity level
(e.g., 4 groups: sedentary, low, moderate and high), profession (e.g., 5 groups: surgeon, doctor,
nurse, dentist, therapist), and so forth.
(3). You should have independence of observations, which means that there is no relationship
between the observations in each group or between the groups themselves. For example, there
must be different participants in each group with no participant being in more than one group.
(4). There should be no significant outliers. Outliers are simply single data points within your
data that do not follow the usual pattern.
(5). Your dependent variable should be approximately normally distributed for each category of
the independent variable.
(6). There needs to be homogeneity of variances. You can test this assumption in SPSS Statistics
using Levene's test for homogeneity of variances.
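
Outside SPSS, the idea behind Levene's test can be sketched in a few lines of Python using only the standard library. The sketch below uses the mean-centred form of the test, which amounts to a one-way ANOVA on each value's absolute deviation from its group mean. It is applied to the three dose groups from sample question (4) of this handout. The function names and the tabulated critical value (F for 2 and 18 degrees of freedom at alpha = 0.05) are assumptions introduced for illustration, and SPSS may report additional variants of the statistic.

```python
import statistics

# Dose groups copied from sample question (4) of this handout.
groups = [
    [9, 8, 7, 8, 8, 9, 8],   # 0mg
    [7, 6, 6, 7, 8, 7, 6],   # 50mg
    [4, 3, 2, 3, 4, 3, 2],   # 100mg
]


def one_way_f(samples):
    """Return the one-way ANOVA F ratio for a list of samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((x - statistics.mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    return (ss_between / df_between) / (ss_within / df_within)


def levene_w(samples):
    """Mean-centred Levene statistic: ANOVA on absolute deviations from group means."""
    abs_devs = [[abs(x - statistics.mean(s)) for x in s] for s in samples]
    return one_way_f(abs_devs)


w_stat = levene_w(groups)

# Assumption: critical F for df = (2, 18) at alpha = 0.05, taken from a standard F table.
F_CRITICAL = 3.55
variances_differ = w_stat > F_CRITICAL

print(f"Levene W = {w_stat:.3f}, variances differ = {variances_differ}")
```

For these data W is roughly 0.13, well below the critical value, so the homogeneity of variances assumption is not rejected.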

Two-way ANOVA

The two-way ANOVA compares the mean differences between groups that have been split on
two independent variables (called factors). The primary purpose of a two-way ANOVA is to
understand if there is an interaction between the two independent variables on the dependent
variable. For example, you could use a two-way ANOVA to understand whether there is an
interaction between gender and educational level on test anxiety amongst university students,
where gender (males/females) and education level (undergraduate/postgraduate) are your
independent variables, and test anxiety is your dependent variable. Alternatively, you may want to
determine whether there is an interaction between physical activity level and gender on blood
cholesterol concentration in children, where physical activity (low/moderate/high) and gender
(male/female) are your independent variables, and cholesterol concentration is your dependent
variable.

The interaction term in a two-way ANOVA informs you whether the effect of one of your
independent variables on the dependent variable is the same for all values of your other
independent variable (and vice versa). For example, is the effect of gender (male/female) on test
anxiety influenced by educational level (undergraduate/postgraduate)?
Assumptions for a two-way ANOVA.
(1). Your dependent variable should be measured at the continuous level (i.e., it is an interval
or ratio variable). Examples of continuous variables include revision time (measured in hours),
intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight
(measured in kg).
(2). Your two independent variables should each consist of two or more categorical, independent
groups. Example independent variables that meet this criterion include gender (2 groups: male or
female), ethnicity (3 groups: Caucasian, African American and Hispanic), profession (5 groups:
surgeon, doctor, nurse, dentist, therapist), and so forth.
(3). Independence of observations, which means that there is no relationship between the
observations in each group or between the groups themselves. For example, there must be
different participants in each group with no participant being in more than one group.
(4). There should be no significant outliers. Outliers are data points within your data that do not
follow the usual pattern. The problem with outliers is that they can have a negative effect on the
two-way ANOVA, reducing the accuracy of your results.
(5). Your dependent variable should be approximately normally distributed for each
combination of the groups of the two independent variables.
(6). There needs to be homogeneity of variances for each combination of the groups of the two
independent variables. Again, whilst this sounds a little tricky, you can easily test this
assumption in SPSS Statistics using Levene’s test for homogeneity of variances.

Sample questions: the use of SPSS to conduct some inferential statistical tests.


(1) One Sample t-test.
Q. The data below represent the cholesterol levels of 10 KASU students. Using a one-sample
t-test, compare the sample mean to the known national mean of 65. Based on your output, is there
a significant difference in cholesterol level between the students and the known national
average?

Data set
Student ID Gender Cholesterol level Weight
1 1 64 128
2 2 66 124
3 2 62 136
4 1 70 153
5 1 68 144
6 2 64 116
7 2 64 122
8 1 68 138
9 1 66 151
10 2 66 118
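
As a cross-check on the SPSS output, the one-sample t statistic for this question can be computed by hand. The sketch below uses only Python's standard library. The variable names and the tabulated critical value (two-tailed t for 9 degrees of freedom at alpha = 0.05) are assumptions introduced for illustration; SPSS reports an exact p-value instead of comparing against a table.

```python
import math
import statistics

# Cholesterol levels of the 10 students, copied from the data set above.
cholesterol = [64, 66, 62, 70, 68, 64, 64, 68, 66, 66]
national_mean = 65  # known national mean stated in the question

n = len(cholesterol)
sample_mean = statistics.mean(cholesterol)
sample_sd = statistics.stdev(cholesterol)   # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)

# t = (sample mean - hypothesised mean) / standard error of the mean
t_stat = (sample_mean - national_mean) / standard_error
df = n - 1

# Assumption: two-tailed critical t for df = 9 at alpha = 0.05, from a standard t table.
T_CRITICAL = 2.262
significant = abs(t_stat) > T_CRITICAL

print(f"mean = {sample_mean:.2f}, t({df}) = {t_stat:.3f}, significant = {significant}")
```

Here the sample mean is 65.8 and t is about 1.06, which is below the critical value, so the sample mean does not differ significantly from 65.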

(2) 2 sample t-test (Independent sample t-test).


Q. The data below show the test scores of 2 groups of students: 10 students who studied via
online classes and 10 students who received physical classes, sampled from the Faculty of
Science, Kaduna State University. Using an independent sample t-test, determine if there is a
significant difference between the means of these two groups with respect to test scores.

Groups
Physical classes (1) Online classes (2)
98 120
101 114
94 117
96 121
112 130
89 127
94 119
99 134
96 123
97 125
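
The independent samples t statistic for these two groups can also be worked out by hand. This is a minimal sketch using Python's standard library and the pooled-variance (equal variances assumed) form of the test. The variable names and the tabulated critical value (two-tailed t for 18 degrees of freedom at alpha = 0.05) are assumptions introduced for illustration.

```python
import math
import statistics

# Test scores copied from the table above.
physical = [98, 101, 94, 96, 112, 89, 94, 99, 96, 97]
online = [120, 114, 117, 121, 130, 127, 119, 134, 123, 125]

n1, n2 = len(physical), len(online)
mean1, mean2 = statistics.mean(physical), statistics.mean(online)
var1, var2 = statistics.variance(physical), statistics.variance(online)

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = (mean1 - mean2) / standard_error
df = n1 + n2 - 2

# Assumption: two-tailed critical t for df = 18 at alpha = 0.05, from a standard t table.
T_CRITICAL = 2.101
significant = abs(t_stat) > T_CRITICAL

print(f"means = {mean1:.1f} vs {mean2:.1f}, t({df}) = {t_stat:.2f}, significant = {significant}")
```

The group means are 97.6 and 123.0, and t is about -9.36, far beyond the critical value, so the difference between the two groups is significant at the 0.05 level.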

(3) Paired t-test (Repeated measures t-test).


Q. An experimental plant was treated with growth-enhancing hormones and the following data for
plant height were obtained before and after treatment. Using a paired sample t-test, is there a
significant difference in the mean plant height before and after treatment? The alpha value is
set at P≤0.05.

Plant height Before treatment Plant height After treatment
99 128
86 124
89 136
90 153
100 144
85 116
99 122
94 138
100 151
101 118
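
A paired t-test is a one-sample t-test on the within-pair differences, which makes it easy to verify by hand. The sketch below uses only Python's standard library. The variable names and the tabulated critical value (two-tailed t for 9 degrees of freedom at alpha = 0.05) are assumptions introduced for illustration.

```python
import math
import statistics

# Plant heights copied from the table above, in the same row order.
before = [99, 86, 89, 90, 100, 85, 99, 94, 100, 101]
after = [128, 124, 136, 153, 144, 116, 122, 138, 151, 118]

# Work with the difference for each plant (after minus before).
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)
standard_error = sd_diff / math.sqrt(n)

# Null hypothesis: the mean difference is zero.
t_stat = mean_diff / standard_error
df = n - 1

# Assumption: two-tailed critical t for df = 9 at alpha = 0.05, from a standard t table.
T_CRITICAL = 2.262
significant = abs(t_stat) > T_CRITICAL

print(f"mean difference = {mean_diff:.1f}, t({df}) = {t_stat:.2f}, significant = {significant}")
```

The mean increase is 38.7 and t is about 8.81, so the change in plant height after treatment is significant at the 0.05 level.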

(4). One-way ANOVA.


Q. A researcher wants to test a new anti-inflammatory drug synthesized from a plant extract. The
anti-inflammatory level was measured on a scale of 1-10. Twenty-one participants were recruited
and split into 3 groups (0mg, 50mg and 100mg), with each group containing 7 participants (see
table). Using a one-way ANOVA in SPSS, determine if there are significant differences in
inflammatory level between these groups at an alpha value of P≤0.05.
Participants 0mg 50mg 100mg
1 9 7 4
2 8 6 3
3 7 6 2
4 8 7 3
5 8 8 4
6 9 7 3
7 8 6 2
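
The F ratio that SPSS reports for a one-way ANOVA is the between-group mean square divided by the within-group mean square. A minimal sketch of that calculation for the data above, using only Python's standard library, is given here. The function name and the tabulated critical value (F for 2 and 18 degrees of freedom at alpha = 0.05) are assumptions introduced for illustration.

```python
import statistics

# Scores copied from the table above, one list per dose group.
groups = {
    "0mg": [9, 8, 7, 8, 8, 9, 8],
    "50mg": [7, 6, 6, 7, 8, 7, 6],
    "100mg": [4, 3, 2, 3, 4, 3, 2],
}


def one_way_f(samples):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = statistics.mean(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((x - statistics.mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_stat, df_between, df_within = one_way_f(list(groups.values()))

# Assumption: critical F for df = (2, 18) at alpha = 0.05, from a standard F table.
F_CRITICAL = 3.55
significant = f_stat > F_CRITICAL

print(f"F({df_between}, {df_within}) = {f_stat:.2f}, significant = {significant}")
```

F is about 86.33, far above the critical value, so at least one dose group differs from the others. As noted earlier, a post hoc test is still needed to say which groups differ.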

(5). Two-way ANOVA.

Q. A biology test was given to students comprising boys and girls of different age groups. The
scores were recorded as shown in the table below. Is it gender that is causing the variation in
score, age group, or a combination of both? Here we have 2 factors, gender and age, with gender
having two levels (male and female) and age having 3 levels (10, 11 and 12 year olds). Using
SPSS, analyze these data using a two-way ANOVA and determine if there are any significant
differences at an alpha value of P≤0.05.

Gender score Age group


Boys 4 10 year olds
Boys 6 10 year olds
Boys 8 10 year olds
Girls 4 10 year olds
Girls 8 10 year olds
Girls 9 10 year olds
Boys 6 11 year olds
Boys 6 11 year olds
Boys 9 11 year olds
Girls 7 11 year olds
Girls 10 11 year olds
Girls 13 11 year olds
Boys 8 12 year olds
Boys 9 12 year olds
Boys 13 12 year olds
Girls 12 12 year olds
Girls 14 12 year olds
Girls 16 12 year olds
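
Because this design is balanced (3 scores in every gender by age cell), the two-way ANOVA sums of squares can be computed directly from the cell, row and column means. The sketch below does this with Python's standard library only; it would not be valid for unequal cell sizes. The variable names and the tabulated critical values (F at alpha = 0.05 for 1 and 12, and for 2 and 12, degrees of freedom) are assumptions introduced for illustration.

```python
import statistics

# Scores copied from the table above, keyed by (gender, age group).
scores = {
    ("Boys", 10): [4, 6, 8],    ("Girls", 10): [4, 8, 9],
    ("Boys", 11): [6, 6, 9],    ("Girls", 11): [7, 10, 13],
    ("Boys", 12): [8, 9, 13],   ("Girls", 12): [12, 14, 16],
}
genders = ["Boys", "Girls"]
ages = [10, 11, 12]

all_scores = [x for cell in scores.values() for x in cell]
grand_mean = statistics.mean(all_scores)


def pooled(position, level):
    """All scores whose key has the given level at the given position (0=gender, 1=age)."""
    return [x for key, cell in scores.items() if key[position] == level for x in cell]


# Main-effect sums of squares (each level mean compared with the grand mean).
ss_gender = sum(len(pooled(0, g)) * (statistics.mean(pooled(0, g)) - grand_mean) ** 2
                for g in genders)
ss_age = sum(len(pooled(1, a)) * (statistics.mean(pooled(1, a)) - grand_mean) ** 2
             for a in ages)

# Cell sum of squares, then interaction = cells - both main effects.
ss_cells = sum(len(c) * (statistics.mean(c) - grand_mean) ** 2 for c in scores.values())
ss_interaction = ss_cells - ss_gender - ss_age

# Within-cell (error) sum of squares.
ss_within = sum((x - statistics.mean(c)) ** 2 for c in scores.values() for x in c)

df_gender = len(genders) - 1
df_age = len(ages) - 1
df_interaction = df_gender * df_age
df_within = len(all_scores) - len(scores)
ms_within = ss_within / df_within

f_gender = (ss_gender / df_gender) / ms_within
f_age = (ss_age / df_age) / ms_within
f_interaction = (ss_interaction / df_interaction) / ms_within

# Assumption: critical F values at alpha = 0.05 from a standard F table.
F_CRIT_1_12 = 4.75   # df = (1, 12), used for gender
F_CRIT_2_12 = 3.89   # df = (2, 12), used for age and for the interaction

print(f"gender:      F = {f_gender:.2f}, significant = {f_gender > F_CRIT_1_12}")
print(f"age:         F = {f_age:.2f}, significant = {f_age > F_CRIT_2_12}")
print(f"interaction: F = {f_interaction:.2f}, significant = {f_interaction > F_CRIT_2_12}")
```

With these data, gender (F about 5.65) and age group (F about 8.21) both exceed their critical values, while the gender by age interaction (F about 0.62) does not.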
