
T test function in statistical software

Most statistical software (R, SPSS, etc.) includes a t test function. This built-in
function will take your raw data and calculate the t value. It will then compare it
to the critical value, and calculate a p-value. This way you can quickly see
whether your groups are statistically different.

In your comparison of flower petal lengths, you decide to perform your t test
using R. The code looks like this:

t.test(Petal.Length ~ Species, data = flower.data)

Interpreting test results

If you perform the t test for your flower hypothesis in R, you will receive the
following output:
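
(A sketch of that output; the group labels here are placeholders for the species in your data, and the numeric values are the ones discussed in the list below.)

	Welch Two Sample t-test

data:  Petal.Length by Species
t = -33.719, df = 30.196, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.331 -3.836
sample estimates:
mean in group A   mean in group B
          1.456             5.540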

The output provides:

1. An explanation of what is being compared, called data in the output table.


2. The t value: -33.719. Note that it’s negative; this is fine! In most cases, we
only care about the absolute value of the difference, or the distance from 0.
It doesn’t matter which direction.
3. The degrees of freedom: 30.196. Degrees of freedom is related to your
sample size, and shows how many ‘free’ data points are available in your
test for making comparisons. The greater the degrees of freedom, the
better your statistical test will work.
4. The p value: 2.2e-16 (i.e. 2.2 × 10⁻¹⁶, a decimal point followed by 15 zeros and then 22). This describes the
probability that you would see a t value as extreme as this one by chance if the null hypothesis were true.
5. A statement of the alternative hypothesis (Ha). In this test, the Ha is that
the difference is not 0.
6. The 95% confidence interval. This is the range of values estimated to contain
the true difference in means, with 95% confidence. This can be changed
from 95% if you want a wider or narrower interval, but 95% is very
commonly used.
7. The mean petal length for each group.

From the output table, we can see that the difference in means for our sample data
is −4.084 (1.456 − 5.540), and the confidence interval shows that the true difference
in means is estimated to lie between −4.331 and −3.836, a range that does not include 0.
Our p value of 2.2e-16 is much smaller than 0.05, so we can reject the null hypothesis
of no difference and say with a high degree of confidence that the true difference in
means is not equal to zero.

Presenting the results of a t test

When reporting your t test results, the most important values to include are the t
value, the p value, and the degrees of freedom for the test. These will
communicate to your audience whether the difference between the two groups is
statistically significant (i.e. that it is unlikely to have occurred by chance).

You can also include the summary statistics for the groups being compared,
namely the mean and standard deviation. In R, the code for calculating the mean
and the standard deviation from the data looks like this:
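
(A minimal sketch, assuming the flower.data data frame and variables used above:)

tapply(flower.data$Petal.Length, flower.data$Species, mean)   # mean petal length per group
tapply(flower.data$Petal.Length, flower.data$Species, sd)     # standard deviation per group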

In our example, you might report the results like this: "There was a statistically significant difference in mean petal length between the two groups (t = −33.72, df = 30.2, p < 0.001)."

Which t-test should I use?

Your choice of t-test depends on whether you are studying one group or
two groups, and whether you care about the direction of the difference in
group means.
If you are studying one group, use a paired t-test to compare the group
mean over time or after an intervention, or use a one-sample t-test to
compare the group mean to a standard value. If you are studying two
groups, use a two-sample t-test.

If you want to know only whether a difference exists, use a two-tailed
test. If you want to know if one group mean is greater or less than the
other, use a left-tailed or right-tailed one-tailed test.
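
In R, for instance, these choices map onto arguments of the t.test() function (the data objects below are hypothetical):

t.test(x, mu = 10)                      # one-sample: compare the mean of x to a standard value
t.test(before, after, paired = TRUE)    # paired: the same group measured twice
t.test(value ~ group, data = my.data)   # two-sample: two independent groups
# the default is a two-tailed test; add alternative = "greater" or "less" for a one-tailed test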

Types of t-tests

There are three t-tests to compare means: a one-sample t-test, a two-sample t-test
and a paired t-test. Briefly, a one-sample t-test compares a group mean to a standard
or hypothesized value, a two-sample t-test compares the means of two independent
groups, and a paired t-test compares means from the same group measured at two
different times or under two different conditions. Visit the individual pages for
each type of t-test for examples along with details on assumptions and calculations.

These t-tests are for population means. Another common t-test is for correlation
coefficients. You use this t-test to decide whether the correlation coefficient is
significantly different from zero.
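
In R, for instance, this can be done with cor.test() (the two variables here are hypothetical numeric columns):

cor.test(my.data$x, my.data$y)   # reports t, degrees of freedom and the p-value for the correlation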

One-tailed vs. two-tailed tests


When you define the hypothesis, you also define whether you have a one-tailed or
a two-tailed test. You should make this decision before collecting your data or
doing any calculations. You make this decision for all three of the t-tests for
means.

To explain, let’s use the one-sample t-test. Suppose we have a random sample of
protein bars, and the label for the bars advertises 20 grams of protein per bar. The
null hypothesis is that the unknown population mean is 20. Suppose we simply
want to know if the data shows we have a different population mean. In this
situation, our hypotheses are:

H0: μ = 20

Ha: μ ≠ 20

Here, we have a two-tailed test. We will use the data to see if the sample average
differs sufficiently from 20 – either higher or lower – to conclude that the
unknown population mean is different from 20.

Suppose instead that we want to know whether the advertising on the label is
correct. Does the data support the idea that the unknown population mean is at
least 20? Or not? In this situation, our hypotheses are:

H0: μ ≥ 20

Ha: μ < 20

Here, we have a one-tailed test. We will use the data to see if the sample average
is sufficiently less than 20 to reject the hypothesis that the unknown population
mean is 20 or higher.
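
In R, for instance, these two pairs of hypotheses could be tested as follows (protein is a hypothetical vector of measured protein content per bar):

t.test(protein, mu = 20)                         # two-tailed: Ha is that the mean differs from 20
t.test(protein, mu = 20, alternative = "less")   # one-tailed: Ha is that the mean is less than 20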

See the "tails for hypotheses tests" section on the t-distribution page for images
that illustrate the concepts for one-tailed and two-tailed tests.

Paired T-Test

The paired sample t-test, sometimes called the dependent sample t-test, is a
statistical procedure used to determine whether the mean difference between two
sets of observations is zero. In a paired sample t-test, each subject or entity is
measured twice, resulting in pairs of observations. Common applications of the
paired sample t-test include case-control studies or repeated-measures designs.
Suppose you are interested in evaluating the effectiveness of a company training
program. One approach you might consider would be to measure the performance
of a sample of employees before and after completing the program, and analyze
the differences using a paired sample t-test.

Hypotheses

Like many statistical procedures, the paired sample t-test has two competing
hypotheses, the null hypothesis and the alternative hypothesis. The null hypothesis
assumes that the true mean difference between the paired samples is zero. Under
this model, all observable differences are explained by random variation.
Conversely, the alternative hypothesis assumes that the true mean difference
between the paired samples is not equal to zero. The alternative hypothesis can
take one of several forms depending on the expected outcome. If the direction of
the difference does not matter, a two-tailed hypothesis is used. Otherwise, an
upper-tailed or lower-tailed hypothesis can be used to increase the power of the
test. The null hypothesis remains the same for each type of alternative hypothesis.
The paired sample t-test hypotheses are formally defined below:

 • The null hypothesis (H0) assumes that the true mean difference (μd) is equal to zero.
 • The two-tailed alternative hypothesis (H1) assumes that μd is not equal to zero.
 • The upper-tailed alternative hypothesis (H1) assumes that μd is greater than zero.
 • The lower-tailed alternative hypothesis (H1) assumes that μd is less than zero.

The mathematical representations of the null and alternative hypotheses are defined below:

 • H0: μd = 0
 • H1: μd ≠ 0 (two-tailed)
 • H1: μd > 0 (upper-tailed)
 • H1: μd < 0 (lower-tailed)

Note. It is important to remember that hypotheses are never about data; they are
about the processes which produce the data. In the formulas above, the value of
μd is unknown. The goal of hypothesis testing is to determine the hypothesis (null
or alternative) with which the data are more consistent.

Assumptions

As a parametric procedure (a procedure which estimates unknown parameters),
the paired sample t-test makes several assumptions. Although t-tests are quite
robust, it is good practice to evaluate the degree of deviation from these
assumptions in order to assess the quality of the results. In a paired sample t-test,
the observations are defined as the differences between two sets of values, and
each assumption refers to these differences, not the original data values. The
paired sample t-test has four main assumptions:
 • The dependent variable must be continuous (interval/ratio).
 • The observations are independent of one another.
 • The dependent variable should be approximately normally distributed.
 • The dependent variable should not contain any outliers.

Level of Measurement

The paired sample t-test requires the sample data to be numeric and continuous, as
it is based on the normal distribution. Continuous data can take on any value
within a range (income, height, weight, etc.). Discrete data, by contrast, can take
on only a limited set of values (Low, Medium, High, etc.).
Occasionally, discrete data can be used to approximate a continuous scale, such as
with Likert-type scales.

Independence

Independence of observations is usually not testable, but can be reasonably
assumed if the data collection process was random without replacement. In our
example, it is reasonable to assume that the participating employees are
independent of one another.

Normality

To test the assumption of normality, a variety of methods are available, but the
simplest is to inspect the data visually using a tool like a histogram (Figure 1).
Real-world data are almost never perfectly normal, so this assumption can be
considered reasonably met if the shape looks approximately symmetric and bell-
shaped. The data in the example figure below are approximately normally distributed.

Figure 1. Histogram of an approximately normally distributed variable.
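
In R, for instance, the differences can be inspected with a histogram, and a formal test can be added if desired (d is a hypothetical vector of paired differences):

hist(d)           # visual check: roughly symmetric and bell-shaped?
shapiro.test(d)   # optional Shapiro-Wilk test of normality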


Outliers

Outliers are rare values that appear far away from the majority of the data.
Outliers can bias the results and potentially lead to incorrect conclusions if not
handled properly. One method for dealing with outliers is to simply remove them.
However, removing data points can introduce other types of bias into the results,
and potentially result in losing critical information. If outliers seem to have a lot
of influence on the results, a nonparametric test such as the Wilcoxon Signed
Rank Test may be appropriate to use instead. Outliers can be identified visually
using a boxplot (Figure 2).

Figure 2. Boxplots of a variable without outliers (left) and with an outlier (right).
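
In R, for instance, the differences can be screened with a boxplot, and the Wilcoxon Signed Rank Test is available as the nonparametric alternative (before and after are hypothetical paired measurements):

boxplot(after - before)                     # points beyond the whiskers are potential outliers
wilcox.test(after, before, paired = TRUE)   # Wilcoxon Signed Rank Test as an alternative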


Procedure

The procedure for a paired sample t-test can be summed up in four steps. The
symbols to be used are defined below:

 • D = the differences between the two paired samples
 • di = the ith observation in D
 • n = the sample size
 • d̄ = the sample mean of the differences
 • σ̂ = the sample standard deviation of the differences
 • T = the critical value of a t-distribution with (n − 1) degrees of freedom
 • t = the t-statistic (t-test statistic) for a paired sample t-test
 • p = the p-value (probability value) for the t-statistic

The four steps are listed below:

1. Calculate the difference (di) between the two observations for each pair.
2. Calculate d̄, the mean of the differences, and σ̂, their standard deviation.
3. Calculate the t-statistic, t = d̄ / (σ̂ / √n), which under the null hypothesis follows a
t-distribution with (n − 1) degrees of freedom.
4. Compare t to the critical value T (or, equivalently, compare the p-value p to the chosen
significance level) to determine whether the results provide sufficient evidence to reject the null
hypothesis in favor of the alternative hypothesis.
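
As a rough sketch in R, using the training-program example from above (the before and after vectors are hypothetical performance scores for the same employees):

before <- c(72, 75, 68, 80, 77, 74)             # scores before the training program (hypothetical)
after  <- c(78, 76, 74, 85, 80, 79)             # scores after the training program (hypothetical)
d <- after - before                             # step 1: the paired differences
t.stat <- mean(d) / (sd(d) / sqrt(length(d)))   # steps 2-3: t = d-bar / (sigma-hat / sqrt(n))
t.test(after, before, paired = TRUE)            # steps 3-4: t-statistic, degrees of freedom and p-value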

Interpretation

There are two types of significance to consider when interpreting the results of a
paired sample t-test, statistical significance and practical significance.

Statistical Significance

Statistical significance is determined by looking at the p-value. The p-value gives
the probability of observing the test results under the null hypothesis. The lower
the p-value, the lower the probability of obtaining a result like the one that was
observed if the null hypothesis was true. Thus, a low p-value indicates decreased
support for the null hypothesis. However, the possibility that the null hypothesis is
true and that we simply obtained a very rare result can never be ruled out
completely. The cutoff value for determining statistical significance is ultimately
decided by the researcher, but usually a value of .05 or less is chosen. This
corresponds to a 5% (or less) chance of obtaining a result like the one that was
observed if the null hypothesis was true.

Practical Significance

Practical significance depends on the subject matter. It is not uncommon,
especially with large sample sizes, to observe a result that is statistically
significant but not practically significant. In most cases, both types of significance
are required in order to draw meaningful conclusions.
