
The Basics of Interpreting Research: A How-to Guide for Each Section


Additional Resources and Content

Reading and understanding the full text of a research study without formal training
can be a daunting task.  We’ve all been there too.  We can each remember a time
not too long ago (Eric Helms was still in his early 50s then, Greg Nuckols was 11
years old, and Mike Zourdos still had long hair) when the title of a study caught our
eye and we eagerly searched out the full text.  However, upon locating the full text,
we discovered that deciphering the scientific jargon was overwhelming.  Reading
the full text of a journal article and discovering unfamiliar terms (e.g. ANOVA,
p-value) along with a foreign writing style is frustrating when all you want to
know is "what can I take from this to increase my bench press?" Well, we’re here to
help. In addition to our monthly reviews, we want you to enhance your own skill set
to decipher research.  

This document provides a step-by-step guide through each typical section
(Introduction, Methods, Results, and Discussion) of a research study.  We cannot
possibly cover every term that you might encounter, but we will provide insight into
what to look for in each section of a scientific manuscript. Further, this document
explains basic terms and common statistical analyses used in applied physiology
research so you can interpret scientific studies in a practical way to aid your training
and performance.

READING AN INTRODUCTION
The overarching goal of an introduction is to build a clear rationale
based upon previous scientific data and then present the purpose of the study and
the hypotheses to the reader.  To accomplish this, the authors of the paper start
with the most basic concepts, which apply to the specific study, then establish a
problem and thus, the need for the present study. 

An introduction should have a clear flow, and the concepts must logically connect. 
If, as the reader, you look up and think, "wait, how did we get here," the introduction
may very well have a hole in it.  Essentially, the rationale must be clear.  In science,
research should not try to go from point "A" to point "C." Rather, data connects
from "A" to "B" because that is how a rationale is built.  Even if we are already past
point "B" in a practical sense, the research still needs to investigate the
fundamental underlying theory.  This is important as sometimes the assumptions
underlying practice are incorrect, and practice could be improved by investigating
the underlying theory.  With that said, in reality, practical experience may very well
influence the construction of a research question in applied science.  Specifically,
changes in practice occur faster than changes in science as anecdotal information
requires less time to accumulate than scientific evidence.  Thus, practice often
informs research to test the applications that already are in place under more
rigorous control. Often, occurrences drive research questions.  For example, if you
see a positive training adaptation in response to a programming change, then you
might examine the literature to see if there is a logical case to be made to
investigate why that programming change might be effective.

In the most basic sense, to interpret an introduction simply try to “talk it out” (you
can do this verbally by yourself or as a writing exercise) as if you were explaining it
to someone with absolutely no knowledge of the concept.  For example, if you are
reading a study investigating the effect of different repetition ranges with equated
volume on hypertrophy, you might talk it out as the following:

1. Resistance training improves muscle hypertrophy. 

2. Hypertrophy seems to have a direct relationship with training volume (sets x
repetitions x weight lifted). 

3. Moderate- to high-repetition sets are often utilized for high volume and thus are
recommended for hypertrophy training. 

4. However, it is not known if high repetitions are better than lower repetitions for
hypertrophy for any mechanistic reason. 

5. Rather, hypertrophy may simply be a product of training volume independent of
repetition range. 

6. Therefore, the purpose of this study is to examine high versus lower repetitions with
equated volume for muscle hypertrophy. 

7. We hypothesize that hypertrophy adaptations will be the same due to equal volume;
however, high repetitions will be more time efficient to complete.

That example is a pretty basic concept, but the flow is logical.  If the scientific
writing is difficult to understand, simply talk out the rationale as you might explain
it to someone, then refer back to the study and the scientific writing should now be
easier to understand.  Ultimately, as a reader, you should be able to easily find a
logical flow and rationale for the study.  Additionally, it is often (although not
always) the case that the best introductions accomplish all of the above in a concise
manner (e.g. 2 pages or less in a Word document).  As you read enough papers, you’ll be
able to differentiate the good introductions from the poor ones.

What to look for in the Introduction

- Background information leading to a clear rationale
- A rationale that establishes a problem
- A demonstration of how the present study is novel and needed to fill a gap in
the literature
- The study’s purpose
- The hypotheses

READING A METHODS SECTION


The first thing to understand about a methods section is that it should provide the
reader with clear enough information so the study could be reproduced.  One of
the bedrocks of science is replication, and the reader must understand how
research occurred to replicate that process.  Thus, the mark of a well-written
methods section is clear explanation of exactly what the researchers did.

Unlike an introduction, a methods section will have subsections.  The first two
subsections will be titled subjects/participants and experimental protocol (note: the
titles of these sections will vary slightly by journal), and the final subsection will
be statistical analyses.  In between the initial and final subsections, the remaining
subsections will be study-specific, each section explaining in detail the actual
training program and testing methods employed.  Sometimes there are tertiary
sections within a subsection.

Let’s use our example from the introduction and devise methods subsections:

METHODS
- Subjects
- Experimental Protocol
- Testing Protocols
  - One-Repetition Maximum Testing
  - Wilks Coefficient
  - Muscle Thickness
  - Muscular Endurance
  - Body Fat Percentage
- Training Protocol
- Dietary Log
- Training History Questionnaire
- Statistical Analyses

The above example is similar to a previous full text, which can be found here. 
Above, you can see all of the testing procedures in addition to the initial standard
sub-sections for subjects and the experimental protocol.  It is important that a
study explicitly describes or cites the testing procedures so that it is reproducible.

In the experimental protocol (or similarly titled section), you’ll find the study design,
where it is stated if the study has multiple groups or conditions.  If a study has
"groups," it is comparing different individuals divided into multiple groups.  These
groups are usually formed by random assignment or, often, counterbalancing.  For example, if
training program A is compared to program B for squat strength over 10 weeks, the
subjects will be counterbalanced so that there is no significant difference (more on
"significant" difference and statistics in just a bit) between groups for strength at
baseline (start of the study).  One thing to look for is that ideally, groups should also
be counterbalanced by relative strength at baseline in addition to absolute
strength.  On the other hand, if a study has "conditions," then it is a crossover
design, where subjects are compared to themselves under different conditions.  For
example, an individual takes 300mg of caffeine immediately prior to performing a
muscular endurance test with 60% of 1RM, then 72 hours later the same individual
ingests a placebo and repeats the protocol.

In studies that employ a training program over multiple weeks (usually a mesocycle
of about 8 weeks), details are commonly left out.  We imagine you have read a
study before, as have we, which stated something to the effect of: "Subjects trained
3 times a week for 8 weeks.  Subjects completed 3-4 sets of 8-12 repetitions between 60-
80% of 1RM each training session."  A description like this is, unfortunately, common
and leaves much to be desired.  Questions remain such as: How was load
progressed?  How many subjects did 12 reps and how many did 8 reps?  How was it
decided when to do 4 sets and when to do 3 sets?  Did some subjects do more reps
on multi-joint exercises and some more on single-joint exercises?  Obviously, all of
those factors could skew the results.  We would again point you to the study linked
above as an example of a clear methods section, which describes the details of the
training program, including the progression model, so that the protocol is clear to
the reader.

A methods section should also clearly present the predetermined objective
standards. For example, what constitutes appropriate squat depth?  Or if an
outcome measure is vertical jump, the methods might say, "to assess vertical jump
height, three jumps were recorded with three minutes between jumps, and the best of
the three trials was used for analysis."  These factors are again important for
replication, but also to establish objective criteria for carrying out data collection.

Importantly, methods can become convoluted if attempting to describe a training
program or an entire protocol.  Thus, complex methodologies will often have
accompanying tables or figures, which can illustrate a protocol with clarity. 
Additionally, the protocols used should be validated procedures.  If a 1RM test was
conducted, the authors should cite a previous protocol.  Similarly, when body
composition is assessed, the protocol (e.g. 3-site or 7-site skinfolds) should be
described, and any questionnaires used to obtain background information
(training/health history) should be described.

Statistics

Perhaps the most daunting task of reading a scientific manuscript without formal
training is tackling the statistical analysis section, which is always located at the end
of the methods section just before the results.  To decrease the barriers of
understanding a statistical analysis section, let’s cover the basic terms and statistics
you’ll see in applied exercise science research.  The below terms, definitions, and
descriptions are far from comprehensive, but they are the most common and will
provide a baseline knowledge for statistics reading.

Mean ± Standard Deviation

While you may be familiar with this, the mean (or average) is the sum of all
values divided by the total number of values.  You may also be familiar with the
term standard deviation, which quantifies how much individual values vary around
the mean.  A small or low standard deviation indicates that variation is low and the
subjects’ responses or descriptive statistics were homogeneous.  However, a high
standard deviation indicates a large variation in response.  Therefore, when a high
standard deviation exists (in response to an intervention), it is useful to see
individual responses and not just means.
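Both statistics are quick to compute; here is a minimal sketch in Python (the 1RM values are invented for illustration):

```python
from statistics import mean, stdev

# Two hypothetical groups of squat 1RMs (kg) with the same mean
homogeneous = [150, 152, 148, 151, 149]
heterogeneous = [120, 180, 135, 175, 140]

# stdev() is the sample standard deviation (n - 1 in the denominator)
print(f"{mean(homogeneous):.1f} ± {stdev(homogeneous):.2f} kg")
print(f"{mean(heterogeneous):.1f} ± {stdev(heterogeneous):.2f} kg")
```

Both groups average 150 kg, but the second group’s much larger standard deviation is exactly the situation where inspecting individual responses pays off.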

P-value

In applied exercise science, you will almost always see a statement that the level of
significance is set at p≤0.05.  Now, the "default" position is the null hypothesis,
which states that any difference between groups or conditions is due to sampling
or random error.  A p-value, generated by the statistical tests utilized, represents
the probability that any observed difference is due to sampling or random error. 
Thus, if the resultant p-value of the particular statistical model (discussed below) is
less than or equal to 0.05, the findings are said to be "statistically significant."

Specifically, this means there is a less than 5% chance the change is random; thus,
the null hypothesis can be reasonably rejected.  In other words, if a p-value is 0.01,
there is a 1% chance statistical significance is being reported mistakenly.  Also, if a
p-value falls just above the cutoff (between 0.05 and 0.10), this is commonly
highlighted and stated to be "approaching significance," meaning, despite not
reaching the threshold for significance, the authors believe there is likely a
"meaningful difference," which will
be discussed more later. Note that p-values are often misinterpreted to simply
represent the likelihood that the study’s hypothesis is true (i.e. if there’s only a 5%
probability that the observed difference would have occurred by chance, there
must be a 95% chance that there is really a meaningful difference). 

Bayesian inference is needed to assign a likelihood of the hypothesis actually being
true, but since it’s very rarely used in exercise science, we’ll spare you the details. 
But in essence, if the hypothesis is something you’d probably expect to be true in
the first place (e.g. gaining more strength in response to high load training than low
load training), or if the p-value is very small (e.g. smaller than 0.01), there’s a pretty
good chance that a single statistically significant finding is also a “true” finding. 
However, if the hypothesis seems more unlikely (e.g. “training with lower volume will
build more muscle than training with higher volume”) or the p-value is “significant”
but very close to 0.05, there’s a decent chance the study’s results are a “false
positive" – statistically significant, but not actually true.  However, once a result has
been replicated, it becomes much more likely that it’s true, even if it’s counter-
intuitive (building muscle as effectively with 30% of 1RM as with 70-80% of 1RM fell
into this category until it was replicated many times).  Long story short:  a single
study, even with “statistically significant” results and a p-value smaller than 0.05,
shouldn’t be taken as the final word on a subject.  Before you can put a lot of faith
in the results, it still needs to be replicated.
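To make the idea of a p-value concrete, here is a sketch of a permutation test, a simple non-parametric way to generate one (the two groups of strength scores are made up): shuffle the group labels thousands of times and count how often chance alone produces a difference at least as large as the one actually observed.

```python
import random
from statistics import mean

group_a = [100, 105, 98, 110, 103, 107]  # hypothetical post-study 1RMs, program A
group_b = [96, 99, 101, 95, 100, 97]     # hypothetical post-study 1RMs, program B

observed = mean(group_a) - mean(group_b)
pooled = group_a + group_b

random.seed(42)  # fixed seed so the sketch is reproducible
extreme = 0
n_shuffles = 10_000
for _ in range(n_shuffles):
    random.shuffle(pooled)
    diff = mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):])
    if abs(diff) >= abs(observed):  # two-tailed: as extreme in either direction
        extreme += 1

p_value = extreme / n_shuffles
print(f"observed difference = {observed:.2f} kg, p = {p_value:.3f}")
```

A small p-value here means only a small fraction of random relabelings produced a difference this large, which is precisely the sense in which "the observed difference is unlikely under the null hypothesis."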

R-value

An r-value is determined most commonly with a Pearson’s product moment
correlation.  This statistic simply tells you if there is a correlation between two
variables.  A correlation represents how two variables are related; it answers the
question “when one variable is high or low, or increases or decreases, what does
the other variable do?” However, a correlation does not tell us if one variable
is causing the other to change. The r-value will fall between -1 and 1; thus,
correlations can be inverse or negative as well as positive.  For example, if you ran a
Pearson’s product moment correlation between body mass and squat 1RM on all
competitive powerlifters worldwide, you would expect a positive correlation (i.e. in
general heavier lifters have an absolute greater 1RM, which is why we have weight
classes), whereas if you looked at the correlation between aerobic fitness and 10-
year mortality rate, you’d expect to see a negative correlation.

Magnitude of an R-value

The following can be used to interpret the "magnitude" of a relationship when
looking at an r-value: 0.35 or less is weak, 0.36 to 0.67 is moderate, 0.68 to 0.89 is
strong, and an r-value greater than or equal to 0.90 is a very strong relationship. 
Mathematically, a perfect correlation is 1 (or -1). This simply means that the two
variables perfectly change in concert with one another in a relative fashion. The
examples above are positive or direct relationships; the magnitude scale is the same if
the relationship is negative, which would just be an "inverse" correlation in that
case (and would have a negative sign in front of the number).
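For illustration, here is a sketch that computes Pearson’s r from scratch and classifies its magnitude with the thresholds above (the body-mass and squat numbers are invented):

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's product moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_magnitude(r):
    """Classify the strength of a correlation by its absolute value."""
    r = abs(r)
    if r >= 0.90:
        return "very strong"
    if r >= 0.68:
        return "strong"
    if r >= 0.36:
        return "moderate"
    return "weak"

# Hypothetical lifters: body mass (kg) vs. squat 1RM (kg)
body_mass = [60, 75, 83, 93, 105, 120]
squat_1rm = [140, 170, 185, 200, 215, 230]

r = pearson_r(body_mass, squat_1rm)
print(f"r = {r:.2f}, {r_magnitude(r)}")
```

Note that `r_magnitude` takes the absolute value first, so an inverse correlation of -0.75 is classified just as strongly as a positive 0.75.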

R-squared or r2

An r2 will sometimes be reported with regression analysis as well.  This value
provides the proportion of variance in the dependent variable that can be predicted
by the independent variable.  For example, if there is an r2 = 0.50 from a regression
analysis using change in muscle growth as the independent variable and change in
maximal strength as the dependent variable, then the r2 value indicates 50% of the
change in strength is explained by the change in hypertrophy.
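Since r2 is literally r multiplied by itself, the conversion is a one-liner; a quick sketch with an invented correlation:

```python
# Hypothetical correlation between change in muscle size and change in strength
r = 0.707
r_squared = r ** 2
print(f"r^2 = {r_squared:.2f}")  # about 0.50, i.e. ~50% of the variance explained
```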

T-test
A t-test is only used to examine if a difference exists between two means.  There
are two types of t-tests to discuss: a Student’s (independent) t-test and a paired
t-test. 

Student’s t-test: In applied exercise science, a Student’s t-test is used when the
individuals are unrelated.  For example, if a study is testing program A versus
program B over 8 weeks, the subjects should be counterbalanced (as discussed
above) at baseline.  Thus, when subjects are allocated to groups, a Student’s t-test
will be used to ensure those groups do not have statistically different strength
levels to start, meaning the goal is for p>0.05 in this case.  In reality, researchers
should aim to have strength levels as close as possible between groups to start, so
a p-value much higher than 0.05 is desirable.

Paired t-test: A paired t-test is used to compare two conditions when the individuals
are the same, such as in a crossover design.  For example, if we used the
hypothetical crossover study on the effects of caffeine on muscular endurance
from the methods portion of this document (caffeine prior to testing, then 72 hours
later the same test after no caffeine), a paired t-test would be used to examine if
differences exist between conditions since the subjects are the same.
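The arithmetic behind both tests is compact enough to sketch. The code below computes only the t statistic (statistical software then converts t and the degrees of freedom into a p-value); all the numbers are made up:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(cond_1, cond_2):
    """Paired t statistic: mean of per-subject differences over its standard error."""
    diffs = [a - b for a, b in zip(cond_1, cond_2)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def students_t(group_1, group_2):
    """Independent (Student's) t statistic with a pooled variance estimate."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * stdev(group_1) ** 2 +
                  (n2 - 1) * stdev(group_2) ** 2) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Hypothetical crossover data: reps to failure at 60% 1RM, same five subjects
caffeine = [30, 27, 25, 33, 26]
placebo = [27, 26, 22, 30, 25]
print(f"paired t = {paired_t(caffeine, placebo):.2f}")
```

With the same subjects in both conditions, the paired version is the right choice here; `students_t` would be used for two separate groups, such as the baseline check described above.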

Analysis of Variance (ANOVA)

Repeated Measures ANOVA

Quite commonly, a repeated measures ANOVA will appear in the statistical analyses
section of the paper you are reading.  This is used to assess differences between
two or more groups at multiple time points.  For example, if training program A is
compared to training program B over 8 weeks, and bench press 1RM was tested
pre- and post-study, then a 2 (group) x 2 (time) repeated measures ANOVA would
be used to assess changes and differences between groups for bench press 1RM. 
Now, let’s say that a mid-point measurement was also conducted for bench press
1RM.  In that case, there is an additional time point so the ANOVA would now be a 2
(group) x 3 (time) repeated measures test.  However, if we went back to only 2
time points (pre and post), but added a group (now comparing programs A, B and
C), it would be …. you guessed it, a 3 (group) x 2 (time) repeated measures. See? It’s
not hard.  Remember, in most instances, names are very descriptive: In this case,
we have "repeated measures," thus, the measures are being repeated.

Post-Hoc Test
The statistics section may also state that a "post-hoc" test was used for "pairwise
comparisons" or "multiple comparison" purposes.  In short, this means that there is
significance somewhere as detected by the original ANOVA, but the researcher
doesn’t know "where" the significance is yet.  So, a post-hoc is used to determine if
the difference is pre- to post-study in either group (a change over time within a
group) or if there is a group or interaction difference (one group changing more
than the other) being detected. 

One-Way ANOVA

Previously, we discussed using a Student’s t-test to check that two groups were
counterbalanced at baseline; however, if there are more than two groups, a t-test is
inappropriate as it can only compare two groups at one time.  So, if a study has
three or more groups at baseline, a one-way ANOVA is used for that check.

Effect Size

If you have perused a scientific paper, you are most likely familiar with the terms t-
test and ANOVA, but maybe less familiar with the term effect size (ES); fortunately,
ES is becoming much more common and almost standard to report in exercise
science research these days.  An ES will quantify the magnitude of difference
between groups and conditions or is simply used to examine the magnitude of
change in one group from pre- to post-study.  This is quite useful to help determine
a meaningful difference if the p-value does not reach statistical significance. 
Conversely, the ES might reveal that there is not much of a meaningful difference
when the p-value is indeed <0.05; thus, an ES is useful for both of those reasons. 

Let’s briefly discuss 3 ways that ES is calculated:

1) Between conditions in a crossover design,

2) Within group from pre- to post-study and

3) Between groups when a training study is done.

1) Between Conditions ES

Again, let’s use the caffeine example for max repetitions on the squat at 60% 1RM,
with and without caffeine, 72 hours apart.  We would first run a paired t-test, then
afterward, we would also calculate an ES, which would be calculated as: ES = (Mean
2 – Mean 1) / Pooled Standard Deviation.  Mean 2 in this case is the caffeine
condition because it would be hypothesized that caffeine ingestion would result in
more squat repetitions.  Thus, if that is correct, the ES would be positive.  If in fact,
the placebo or control (i.e. no caffeine) condition had better performance, the
resulting effect size would be negative, showing that the results were the opposite
of the hypothesis.  Finally, the pooled standard deviation is the weighted average of
the standard deviations.  Therefore, when calculating the pooled SD, the condition
or group with the larger sample size receives more weight in the calculation. 

Now, if a study doesn’t report a between condition ES, the good news is you can
usually do it yourself.  In this instance of an acute study crossover design, you
indeed can calculate it, so let’s do so.  For our mock calculation, we’ll make up
numbers for each condition.  Placebo = 24 ± 9 reps, and caffeine condition = 28 ± 7
reps.  Note that there are no decimals above; that is because repetitions are
recorded as whole numbers, and we have rounded the means and standard
deviations to whole numbers as well.  We would now do: 28 – 24 (our hypothesis is in favor of the
caffeine group so that number comes first) = 4.  Then, divide 4 by the pooled
standard deviation of 8 (the average of 9 and 7), which gives us an ES of 0.50.  The
good news is you don’t have to do this by hand, just plug in the means and
standard deviations at this link. https://www.socscistatistics.com/
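This is also easy to script yourself. Below is a sketch assuming the common equal-sample-size pooled SD, the square root of the mean of the two variances, which reproduces the worked numbers in this document to two decimals:

```python
from math import sqrt

def effect_size(mean_2, mean_1, sd_2, sd_1):
    """Cohen's d: mean difference divided by the pooled SD (equal n assumed)."""
    pooled_sd = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_2 - mean_1) / pooled_sd

# Mock numbers from the text: caffeine = 28 ± 7 reps, placebo = 24 ± 9 reps
es = effect_size(28, 24, 7, 9)
print(f"between-conditions ES = {es:.2f}")  # 0.50
```

Swapping the argument order flips the sign, matching the convention above that a negative ES means the result ran opposite to the hypothesis.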

2) Within Group ES

Think back to the 8-week program A versus program B comparison with two groups
in which we ran the 2 x 2 repeated measures ANOVA.  In addition to that, we would
also like to examine the magnitude of change in each individual group over time
from pre- to post- testing.  To do this, we can calculate a within group effect size
using the same formula as above: ES = (Mean 2 – Mean 1) / Pooled Standard
Deviation.  This time, Mean 2 is the post-study bench press 1RM mean in group A,
and Mean 1 is the pre-study bench press 1RM mean in group A.  Then we would do
that again, but for group B. 

This equation will simply give us the magnitude of change in each individual group;
it shouldn’t be used to compare groups (unfortunately, it’s often used to make
between-group comparisons, but that’s not how within group ESs should be used). 
The within group effect size is something you can again calculate yourself if a study
does not.  So, let’s again make up some numbers in terms of means and SDs for a
mock calculation:  Group A has a pre-study bench press 1RM of 102.75 ± 10.57kg
and a post-study 1RM of 115.50 ± 14.78.  Following the equation above, we would
simply find the difference of pre- and post-training means for the group, so group A
would be: 115.50 – 102.75 = 12.75, then divide that by the pooled SD of 10.57 and
14.78 (≈12.85; note the calculator pools the variances rather than averaging the raw
SDs).  Using the link just above in example 1 we would get an ES of 0.99.
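The same formula reproduces this number; a sketch, assuming the pooled SD is computed by pooling the variances, sqrt((SD1² + SD2²)/2), which is what yields the 0.99 reported above:

```python
from math import sqrt

# Mock group A bench press 1RMs from the text (kg)
pre_mean, pre_sd = 102.75, 10.57
post_mean, post_sd = 115.50, 14.78

pooled_sd = sqrt((pre_sd ** 2 + post_sd ** 2) / 2)
es = (post_mean - pre_mean) / pooled_sd
print(f"pooled SD = {pooled_sd:.2f}, within-group ES = {es:.2f}")  # 12.85 and 0.99
```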

3) Between Group ES

Unlike the within group, the between group ES compares the magnitude of change
between group A and group B.  To accomplish this, the means of the change
scores are used rather than pre- or post-study means or simply the means of each
condition in example 1. However, unlike the two examples above, the between
group effect size most likely cannot be calculated from simply reading a
manuscript.  This is because you need the pooled SD of the mean change, which
cannot be found by simply having pre- and post-study means.  The SD of the mean
change is calculated from each of the individual change scores themselves; thus,
you would need individual subject data for this calculation.  The second-best way to
calculate between group ES would be to use the mean change scores as described
and then divide the difference by the pooled standard deviations of both groups’
pre-study means, which could be calculated on your own.

This brings us to magnitude interpretation.  Based upon Cohen’s interpretation,
magnitude of ES is determined as follows: <0.20 = trivial, 0.20-0.49 = small,
0.50-0.79 = moderate, and ≥0.80 = large.  However, more recently in applied
exercise science, it has been suggested to interpret ES as: <0.20 = trivial,
0.20-0.59 = small (i.e. the smallest worthwhile or smallest meaningful difference),
0.60-1.19 = moderate, 1.20-1.99 = large, and ≥2.00 = very large.  Ultimately, an ES
can help to determine if there is, in fact, a meaningful change.
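These thresholds translate directly into a small helper (a sketch using the newer applied-exercise-science scale above):

```python
def es_magnitude(es):
    """Classify an effect size by the applied exercise science thresholds."""
    es = abs(es)  # the scale applies to the magnitude, regardless of direction
    if es >= 2.00:
        return "very large"
    if es >= 1.20:
        return "large"
    if es >= 0.60:
        return "moderate"
    if es >= 0.20:
        return "small"
    return "trivial"

print(es_magnitude(0.50))  # the between-conditions example: small
print(es_magnitude(0.99))  # the within-group example: moderate
```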

Percentage Change

Percentage change is obviously something we all do commonly on a daily basis
outside of scientific statistics.  To calculate the percentage change of a group from
pre- to post-study, you could use the following formula: Percentage change = [(Post
Mean – Pre Mean)/Pre Mean] X 100.  This is usually reported; however, as will be
discussed in the results, the percentage change is perhaps the least important
statistic and must be reported in conjunction with the other statistics.  Percentage
change by itself can be quite misleading.  For example, if group A has a percentage
change of 20% for bench press over 8 weeks and group B a change of 8% for bench
press, that seems impressive in favor of group A.  However, in the absence of
means, that is meaningless.  If you find out that the starting bench press in group A
was 85kg and in group B was 120kg (extreme example), then these groups were
clearly not counterbalanced at baseline, and subjects in group A had a much lower
starting strength and thus, a much greater ability to increase strength in the short
term in all likelihood.
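The formula, applied to the extreme example above, in a short sketch (the starting weights come from the text; the implied absolute gains follow from the stated percentages):

```python
def pct_change(pre_mean, post_mean):
    """Percentage change from pre- to post-study."""
    return (post_mean - pre_mean) / pre_mean * 100

# The extreme example: group A starts at 85 kg (+20%), group B at 120 kg (+8%)
gain_a = 85 * 0.20
gain_b = 120 * 0.08
print(f"A gained {gain_a:.1f} kg, B gained {gain_b:.1f} kg")

# The within-group bench press example from earlier
print(f"{pct_change(102.75, 115.50):.2f}% change")  # about 12.41%
```

The 20% vs. 8% gap partly reflects the very different starting points, which is why percentage change should never be read without the underlying means.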

What to look for in the Methods

- A clear explanation of procedures that doesn’t leave you guessing as to how
the study was carried out. 
- Detailed explanation of the subjects’ descriptive statistics (e.g. training age,
strength levels) so you know how applicable the findings are to you or
your clients. 
- Clearly written statistical analyses so you know what type of tests were used
when you move on to interpret the results. 

READING A RESULTS SECTION


A results section will have subsections similar to the methods.  Also, similar to the
methods, there is certain terminology to learn, and it is necessary to know how
data is usually presented in order to comprehend what is reported in a results
section.  Ultimately, data are presented with the most basic information first (what
are called the "main effects"), followed by "group" or "interaction" effects. 
Remember, the initial results of an ANOVA provide a main effect, then the post-hoc
test tells you where that effect is (i.e. group or group by time interaction).  In
addition to presenting main effects first and then interactions, you will find the
main statistics (i.e. ANOVA) reported first, followed by the accompanying effect size
and percentage change.  Lastly, ideally exact p-values are reported, rather than just
"p<0.05 or p>0.05," as there is a huge difference between a p-value which is 0.10
and p-value which is 0.90.  We’ll make up some exact p-values below so the mock
sentences read like a good results section.

Main Time Effect

- A main time effect is the overall effect over time.  In other words, let’s take
our program A versus program B example for bench press 1RM over 8
weeks.  If the results state, “There was a significant main time effect (p=0.02)
for bench press 1RM”, that means that across all subjects (both groups
included), bench press increased from pre- to post-study.  Again, names are
descriptive, so just break down the term and you’ll figure it out.  Main
(overall), time (pre- to post-study), effect (something changed).  Then, a post
hoc test tells us if there was a difference between groups for this overall
change.  So, main effects are always presented first, then interactions
second.

Group Effect

- Using our training program example, a group effect tells you if there is a
statistically greater bench 1RM at post-study in one group vs. the other.

Group x Time Interaction

- This will tell you if the groups are changing at different rates over time.

After the main statistics are presented, you may see effect sizes and percentage
changes.  Now, those statistics are not always applicable, so if you don’t see them,
that doesn’t mean the results are inherently poor.  However, for many studies
applicable to the readers of MASS, effect sizes will often be appropriate, and the
growing trend in applied exercise science research is to include effect sizes as part
of the standard statistical package.  Since we now understand common statistics
and how they are presented, it is our hope that by looking at the totality of
statistics, you'll be able to decipher a results section. 

It’s important to understand – especially in our field where sample sizes are small
(making null hypothesis testing more prone to error) – when a meaningful
difference is present, even in the absence of a statistically significant difference. 
However, thinking even further, an ES >0.20 in the absence of a significant p-value
does not always represent a meaningful change.  It is quite possible in this scenario
that with a larger sample size, this ES would not remain.  Therefore, an ES >0.20
(and findings in general) should be viewed with some skepticism in small sample
sizes, especially in the absence of a significant p-value.  This is not to say that a
meaningful difference doesn’t exist in this case, but rather the totality of statistical
analyses should be analyzed in each study when determining whether a meaningful
difference exists.  Finally, firm conclusions should only be made once a strong
majority of studies show similar outcomes.

Reporting: Tables and Figures

Figures are often used when data is more easily understood visually. For example, if
the authors want to display volume over the course of a training study, they might
make a figure that plots tonnage over time, as it would easily show volume in
different weeks relative to other weeks. Figures can be line graphs or bar graphs (or
other visual displays), and they typically have error bars (representing standard
deviation) but don’t list the exact values.

When the authors want to present a lot of data, including means, standard
deviations, effect sizes, and percentage changes for many groups, it is often more
efficient to do so with a table.  The text will state the effects that occur, and then will
refer to the table.  For example, "Both program A and program B increased bench
press 1RM from pre- to post-study (p=0.02), however, there was a significant group
effect (p<0.001) in favor of program A.  Specific values for bench press 1RM are
displayed in Table 1."

An important note: Authors typically don’t "double report." Meaning, data in a table
or figure won’t be written in the body of the text and vice versa. Thus, if you think
you are missing something in the text, make sure to check the tables and figures. 

Additionally, we have arrived at another concept; notice we wrote "p<0.001" above. 
When the p-value is less than 0.001, it is typical to state that, rather than the actual
value (since that could get a little unwieldy, e.g., p=0.000000000001… you get the
point). 
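That convention is easy to express in code. Here is a small sketch (a hypothetical helper, not taken from any particular journal's style guide):

```python
def format_p(p):
    """Report a p-value the way journals commonly do: an exact value
    to three decimals, or simply 'p<0.001' below that threshold."""
    if p < 0.001:
        return "p<0.001"
    return f"p={p:.3f}"

print(format_p(0.02))            # → p=0.020
print(format_p(0.000000000001))  # → p<0.001
```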

What to look for in the Results

 Results of statistical analysis reported as main effects first, and then group
and interaction effects second.
 Tables and figures to provide visuals.
 Hopefully, exact p-values and ES.

READING A DISCUSSION
Most of the time, the final main heading of a scientific manuscript is the discussion. 
The discussion is where authors interpret their findings and compare and contrast
the findings to other research.  Additionally, in our field, special importance is placed
on presenting practical applications of the findings.  You should also find limitations
that the authors themselves point out (every study has limitations) and suggestions
for future research to move the ideas forward.  A good discussion also presents the
findings cautiously.
Opening Paragraph

The opening paragraph of the discussion doesn’t actually do much
interpretation.  Rather, this paragraph serves to restate the purpose, the main
findings, and whether the hypotheses were supported.  Now that you have the skills
to interpret a results section, reading this paragraph will be a good test to see if you
understood what you read in the results.

Interpreting and Comparing the Present Findings

This makes up most of the discussion.  The authors will compare and contrast their
findings with other research.  If findings disagree with other research, then
explanations should be given. These commonly include different
methodologies (e.g., a different dosage of a supplement), different subject populations
(e.g., trained vs. untrained), different study lengths (e.g., 8 vs. 12 weeks), etc.

Limitations

Every study has limitations, and there’s nothing wrong with that.  A good discussion
acknowledges these limitations. Sometimes the authors will provide counterpoints
that help explain the novelty of the study and offset the limitations; nonetheless,
the limitations exist.  If the authors don’t point out the limitations when they submit
a study, the reviewers will often ensure that they do before allowing it to be
published.  

Conclusions and Practical Applications

The final part of a discussion is the conclusion. Its purpose is to briefly restate the
main points and then provide practical applications.  In some journals, there is a
separate heading for applications, and in some, there is not.  However, in either
case, a good discussion in applied research provides recommendations for the
athlete and coach to utilize the findings to improve their training and performance. 
But, you’re lucky: You have MASS for that, too.

A final point about discussions (perhaps the most important): A well-written
discussion is cautious when presenting the findings.  This means that the authors
do not overstate their findings or try to recommend their findings be applied when
it might be inappropriate. 

Let’s recall our crossover design caffeine example and assume the caffeine
condition produced significantly more reps at 60% of 1RM.  An overstatement
would be, "Based upon our findings, we recommend that acute caffeine ingestion can
be utilized in all populations of athletes to improve resistance training performance."
That is a poor statement for a few reasons:

1. This study did not employ all populations of athletes; thus, the authors should
only generalize to the population that was studied.  That doesn’t mean that the
intervention wouldn’t be beneficial for other populations. It just means that the
authors cannot say so based upon the present dataset. 

2.  This example demonstrates an improvement in muscular endurance
performance and not overall resistance training performance.  Based upon these
results, we do not know if this intervention is good for 1RM performance.  Again,
that doesn’t mean it isn’t good for 1RM; it just means the authors should not say
that it is based upon these results.  A good discussion is cautious; if the data are
important, people will see that and utilize the information, and the study will stand
on its own without overstatement.  Finally, remember that when you read a
discussion, it is the researcher’s interpretation of the results, which is not the same
as the results themselves. Like anyone, researchers are subject to bias. Additionally,
critical thinking is a skill, and if the researchers are writing outside their area of
expertise or are unaware of other data, they may not interpret the results
correctly.   

What to Look for in the Discussion

 The main findings and whether the hypotheses were supported.
 Comparisons of findings to other research.
 Explanations of why findings may differ with other research.
 Applications of the findings.

EXTRA POINTS
Before we conclude, let's go over just a few more points.  In any manuscript, usually
on the first page, you’ll find the corresponding author with his/her email.  If you
have any questions about the manuscript, don’t hesitate to contact the
corresponding author. If you’re respectful and patient, we bet you’ll get a response. 
You can find papers by searching on PubMed. Not all journals in the field are
indexed on PubMed, but most of the high-quality ones are.  Here is a pretty
comprehensive journal list, along with each journal’s associated impact factor (a
rough guide to its overall quality).  If you cannot find a journal on PubMed, try
Google Scholar.

When conducting a PubMed search, after you type in your search term and hit
return/enter, you will notice tabs on the left and on top of the page where you can
refine your search.  You can select "most recent" or "best match," and you can also
select "Review," which is found under "Article Types."  If a topic is new to you,
selecting "review" is useful to find a meta-analysis, systematic review, or even
narrative review to give you an overarching idea of the results and mechanisms. 
Additionally, a recent review will have many papers in the reference list; thus, you
will now have all of those papers at your disposal.  

FINAL WORDS
We have various goals with MASS. One is obvious: to disseminate the most
important and recent information related to strength sport to you in an easy-to-
understand format.  However, an additional goal is to teach and to help improve
the ability of our readers to interpret scientific research to enhance their own
knowledge and training.  It is our hope that this document will help reading the full
text of a study become a less daunting task, allowing you to be confident in your
ability to read and understand a full manuscript.  Besides, what could be a better
Friday or Saturday night than looking for science on PubMed?
