Conducting & Reading Research in Kinesiology - Ted, PhD & Weimo Zhu, PhD & Pamela Hodges Kulinna, PhD
CONDUCTING & READING
RESEARCH IN
KINESIOLOGY
SIXTH EDITION
Professor Emeritus
Department of Kinesiology
The University of Georgia
Athens, Georgia
Production Credits
VP, Product Management: Amanda Martin
Director of Product Management: Cathy L. Esperti
Product Manager: Sean Fabery
Product Assistant: Andrew LaBelle
Senior Project Specialist: Alex Schab
Director of Marketing: Andrea DeFronzo
Production Services Manager: Colleen Lamy
VP, Manufacturing and Inventory Control: Therese Connell
Composition: Exela Technologies
Project Management: Exela Technologies
Cover Design: Scott Moden
Media Development Editor: Troy Liston
Rights Specialist: Maria Leon Maimone
Cover Image (Title Page, Part Opener, Chapter Opener): ©
Stanislaw Pytel/Stone/Getty Images Plus/Getty Images
Printing and Binding: CJK Group Inc.
Cover Printing: CJK Group Inc.
Brief Contents
Preface
Acknowledgments
About the Authors
About SHAPE America
Contents
Preface
Acknowledgments
About the Authors
About SHAPE America
Summary of Objectives
References
Chapter 2 Descriptive Research
Types of Descriptive Research
Survey
Developmental
Case Study
Correlational
Normative
Observational
Action
Causal-Comparative
Survey Research
Preliminary Considerations in Planning a Survey
Survey Methods
Summary of Objectives
References
Types of Designs
Validity in Summary
Methods of Control
Common Sources of Error
Hawthorne Effect
Placebo Effect
“John Henry” Effect
Rating Effect
Experimenter Bias Effect
Participant–Researcher Interaction Effect
Post Hoc Error
Measurement in Experimental Research
Summary of Objectives
References
Action Research
What Is Action Research?
Characteristics of Action Research
The Action Research Process
Sample Action Research Study
Ethics in Research
Summary of Objectives
References
Summary of Objectives
References
Chapter 9 Reading, Evaluating, and Summarizing Research Reports
Preliminary Information
Introduction
Methods
Research Participants
Instrumentation
Procedures
Results
Discussion
Meta-Analysis
Procedures in Meta-Analysis
Computer Programs for Meta-Analysis
Meta-Analysis in Kinesiology
Summary of Objectives
References
Computer Analysis
Descriptive Values
Measures of Central Tendency
Measures of Variability
Measuring Group Position
Percentile Ranks and Percentiles
Standard Scores
Determining Relationships Between Scores
The Graphing Technique
The Correlation Technique
Summary of Objectives
References
One-Group t-test
References
Mixed-Methods Designs
Mixed-Methods Data Analysis
Interpreting, Evaluating, and Presenting Mixed-Methods Data
Summary of Objectives
References
Summary of Objectives
References
Summary of Objectives
References
Glossary
Index
Preface
Revised Organization
In revising this text, the authors’ goal was to provide an overview
of the types of research prior to discussing the specifics of the
research process. Thus, Part One introduces the reader to the
various types of research in kinesiology, and Part Two proceeds
by describing the steps involved in conducting and reading
research.
Instructor Resources
Qualified instructors can also request access to a full suite of
instructor resources, which include:
Test Bank
Slides in PowerPoint format
Weblinks
Online Resources
Each new print copy of this text is accompanied by the Navigate 2
Companion Website, which includes the following items:
Practice Quizzes
Appendices
Weblinks
Interactive Glossary
Flashcards
Acknowledgments
PART ONE
The Nature and Type of
Research
CHAPTER 1 The Nature, Purpose, and Overall Process of
Research
In Part One of this text, the nature, purpose, and overall process of
research are described in six chapters.
Chapter 1, “The Nature, Purpose, and Overall Process of
Research,” describes (a) the importance of research in the
acquisition of knowledge by the kinesiology professions, (b) how the
scientific method of solving problems fits into the research
process, (c) various types of research, and (d) the significance of
research in kinesiology.
In Chapter 2, “Descriptive Research,” we cover many different
research approaches. These are traditional and commonly
conducted types of research to describe the present situation.
In Chapter 3, “Experimental Research,” we deal with a type of
research that endeavors to find new ways of doing things in the
future.
In Chapter 4, “Qualitative and Mixed-Methods Research,” we
describe a type of research that is commonly utilized in the social
sciences. The methods used in qualitative and mixed-methods
research (e.g., interviews, field notes, and combined
methodologies) differ considerably from the methods commonly found in
experimental research.
In Chapter 5, “Historical Research and Action Research,” we
consider two other approaches to conducting research. Historical
research is conducted to document and interpret what happened in
the past. It is not a common type of research in kinesiology, but it is
slowly becoming more prevalent. Action research is a type of
applied research that is conducted to try to find an answer to a
problem that exists in a particular setting (e.g., schools). It is
conducted in the setting where it will be applied.
Finally, Chapter 6 describes “Epidemiological Research.” Using
this method, researchers study phenomena on a large scale. They
often investigate how many people have a
disease/disorder/condition, whether that number is changing over
time, and how the condition affects society and, possibly, the economy.
CHAPTER 1
The Nature, Purpose, and
Overall Process of Research
Members of the kinesiology profession have a wealth of
information upon which they base decisions. Quite frequently, this
information is passed down to us, rather than discovered through
direct observation or experiences. Many times, we do not bother to
examine the source of the information prior to using it; however,
when members of a profession engage in various aspects of the
research process, current information can be validated—and new
information acquired.
KEY TERMS
Action research
Applied research
Basic research
Causal-comparative research
Correlational research
Deductive reasoning
Descriptive research
Experimental research
Hypothesis
Imperfect induction
Induction
Mixed-methods research
Perfect induction
Qualitative research
Quantitative research
Research
Research question
Scientific method
Syllogism
Theory
Experimental Research
In experimental research, the researcher explores the cause-
and-effect relationship between variables by manipulating certain
variables (referred to as independent variables) to determine their
effect on another variable (referred to as the dependent variable).
Furthermore, the researcher attempts to control all other factors
that could possibly influence the dependent variable. Consider, for
example, a situation in which a researcher is interested in
investigating the effects of two different methods of training (with
and without cardiovascular exercise) on strength development. In
this case, the method of training represents the independent
variable and is being manipulated by the researcher. Ideally, the
researcher would be able to randomly assign research participants
to either a treatment group that utilizes weight training and
cardiovascular training or a treatment group that utilizes weight
training only. Participants would then train for a specified length of
time, and the researcher would determine if there were differences
between the two groups in terms of strength development. These
two groups would also be compared to a third group (one that did
not perform any training). In such a study, the researcher would
attempt to ensure that both experimental groups trained for the
same overall period, for the same session duration, and at the same
intensity. Moreover, the researcher would attempt to control other
factors that could reasonably affect the outcome—factors like
nutrition, other forms of physical activity, and even motivation, to
name a few. The researcher would compare the three groups
based on the research participants’ strength scores (dependent
variable) at the end of the training period. Assuming that there
were no differences in strength when the study commenced, if the
final strength scores were significantly higher for the group using
weight training only, the researcher would be able to conclude with
some confidence that training using only weight training was more
effective at increasing muscular strength than training using weight
training and cardiovascular training or simply doing no training.
At its essence, experimental research is designed to answer
the question “What if …?” by systematically manipulating one or
more variables and observing the consequences for another
variable; the researcher is primarily interested in what will happen
in the future as a result of the intervention. Experimental research methods are
quantitative in nature, typically begin with clearly stated
hypotheses to fit the research questions, and are commonly
associated with a laboratory setting. It is the most formally
structured of all the various types of research. According to Ary,
Jacobs, and Sorensen (2010), in its simplest form, experimental
research has three characteristics: (a) an independent variable
that is manipulated by the researcher, (b) control of other relevant
variables, and (c) observation of the effect of the manipulation of
the independent variable on the dependent variable. A more
thorough discussion of experimental research is presented in
Chapter 3, “Experimental Research”; related variables are
described further in Chapter 3, and in Chapter 7 “Understanding
the Research Process.”
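The logic of the three-group strength study described above can be sketched with a small simulation. Everything here is hypothetical: the group means, spreads, and sample sizes are invented for illustration, and the one-way ANOVA F statistic is simply one common way such group differences are tested, not a procedure prescribed by the text.

```python
import random
from statistics import mean

def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of scores."""
    scores = [s for g in groups for s in g]
    grand_mean = mean(scores)
    k, n = len(groups), len(scores)
    # Between-groups variability: how far group means sit from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variability: score spread around each group's own mean
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

random.seed(1)
# Hypothetical strength-gain scores (kg) after the training period;
# the means (12, 9, 1) and spread (3) are invented for illustration.
weights_only   = [random.gauss(12, 3) for _ in range(10)]  # weight training only
weights_cardio = [random.gauss(9, 3) for _ in range(10)]   # weights + cardio
control        = [random.gauss(1, 3) for _ in range(10)]   # no training

f_stat = one_way_anova_f([weights_only, weights_cardio, control])
print(f"F = {f_stat:.2f}")  # a large F suggests real group differences
```

A large F statistic relative to its critical value would support the conclusion that the training conditions differed in their effect on strength, assuming the groups started out equivalent.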
Causal-Comparative Research
Causal-comparative research is similar to experimental research
in that it seeks to investigate possible cause-and-effect
relationships. However, in this case, the independent variable is
not manipulated by the researcher, either because it cannot be
manipulated or because it would be unethical to do so. With
causal-comparative research, the independent variable is often
referred to as an attribute or organismic variable because
frequently it is some attribute or characteristic that the research
participant already possesses due to the natural course of events
(e.g., gender, ethnicity, medical condition, family background,
employment status). As a result, the participants are already
members of a particular class or group of individuals that possess
the same attribute. The researcher then compares groups that
differ on the attribute (independent variable) to determine whether
there are differences on some dependent variable of interest.
Because the researcher has no control over the independent
variable, which is either innate or has already occurred, this type
of research is also called ex post facto (after the fact) research.
One example of causal-comparative research is represented
by current studies that are investigating the effect of vaping
(independent variable) on nicotine addiction and health changes
(dependent variable) in adolescents. Because of the potential
harm to participants, it would be unethical to assign some
participants to a group and force them to vape and other
participants to a group that is not allowed to vape. Researchers
would instead compare the incidence of nicotine addiction and
brain and lung development in a group of nonvapers to a group of
longtime adolescent vapers. Clearly, the researcher did not have
control over the independent variable in this example, or in causal-
comparative research in general. The group membership, either
vaper or nonvaper, was determined before the study commenced;
the researcher merely selected research participants from each of
the preexisting groups and compared them on the dependent
variable—in this case, nicotine addiction and brain and lung
development. Because of the lack of control, especially the
capability to manipulate the independent variable, the ability of the
researcher to draw cause-and-effect conclusions is limited
compared to experimental research. This has been one of the
arguments used by the vaping industry in refuting the findings of
initial research pointing to a linkage between vaping and nicotine
addiction as well as impaired lung and brain development,
although the evidence available today shows a relationship
between vaping and nicotine addiction as well as between vaping
and impaired brain and lung development in adolescents (Fraga,
2018). Causal-comparative research generally functions to identify
group differences and to discover relationships among variables,
but stops short of establishing causality. Although there are
limitations associated with causal-comparative research, it
nevertheless is frequently used in kinesiological research.
Descriptive Research
Descriptive research, as the name suggests, attempts to gather
information from groups of research participants in order to
describe systematically, factually, and accurately specific
characteristics of interest or conditions that presently exist. Quite
simply, a descriptive study first determines and then describes the
way things are. In direct contrast to experimental designs, these
studies are nonexperimental in nature and are concerned with the
present and describing what is. It typically precedes experimental
research. There is no manipulation of an independent variable in
descriptive research because there is no intent to explore a cause-
and-effect relationship. Descriptive research utilizes a wide variety
of methodologies to collect data—surveys, interviews, life
histories, direct measurement, and observational techniques being
the most prevalent. Frequently, descriptive research is interested
in comparing relevant subgroups (for example, groups based on
gender, age, grade level, socioeconomic status, or ethnicity) with
the results being reported according to each subgroup as well as
for the total sample. An example of such a comparative study is a
recent investigation by Tremblay et al. (2018) who measured
physical literacy (and domains of daily behavior, physical
competence, knowledge and understanding, and motivation and
confidence) of 10,034 Canadian children ages 8 to 12. Literacy
findings and domain scores were compared by age and gender.
Results provide baseline information on physical literacy of youth
in Canada that can be used to monitor changes and inform future
intervention plans to increase physical literacy.
Although descriptive research is similar to qualitative research
in some regards, there are important distinctions: (a) methodology
is more structured and standardized; (b) variables of interest are
predetermined; (c) it typically involves more research participants,
often selected through randomization procedures; (d) data are
analyzed predominately through the use of statistical procedures;
and (e) there is typically less in-depth researcher interaction with
the participants.
Descriptive studies have been commonplace in kinesiology as
well as in education and the social and behavioral sciences. Best
and Kahn (2005) indicate that descriptive research is the
predominant research method used in the behavioral sciences.
Furthermore, many doctoral dissertations and master’s theses
involve descriptive research. Descriptive research methodology is
particularly suited to studies that seek to identify the attitudes or
opinions of various groups of individuals or to those that purport to
detail behaviors that may naturally occur in places such as parks,
classrooms, gymnasia, workplaces, playing fields, or homes.
Because descriptive research commonly seeks to generalize the
information collected from a sample (e.g., opinions, attitudes,
abilities, behaviors) to a target population, researchers often
employ randomization procedures in an attempt to obtain a
sample that is representative of the population of interest.
Examples of descriptive studies in kinesiology are studies to
identify the physiological characteristics of elite women distance
runners or studies to ascertain the physical activity and sedentary
behaviors of adolescents.
Some of the best examples of descriptive research are
represented by public opinion surveys in which the respondents
are asked their opinion or attitude on a multitude of topics ranging
from presidential preference to physical activity and exercise. The
Centers for Disease Control and Prevention (CDC) regularly
conducts a number of national probability studies aimed at
describing health behaviors of American youth and adults (e.g.,
Youth Risk Behavior Surveillance System [started 1990, YRBSS],
School Health Policies and Practices Study [started 1994,
SHPPS], National Health Interview Survey [started 1957, NHIS],
and Behavioral Risk Factor Surveillance System [started 1984,
BRFSS]).
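The subgroup-plus-total reporting pattern described above can be sketched in a few lines. The records, the subgroup keys, and the activity-minutes variable are hypothetical stand-ins for real survey data:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey records: minutes of daily physical activity,
# with subgroup attributes as descriptive research would record them.
records = [
    {"gender": "F", "age_group": "8-10",  "active_minutes": 45},
    {"gender": "M", "age_group": "8-10",  "active_minutes": 60},
    {"gender": "F", "age_group": "11-12", "active_minutes": 30},
    {"gender": "M", "age_group": "11-12", "active_minutes": 50},
    {"gender": "F", "age_group": "8-10",  "active_minutes": 55},
]

def summarize_by(records, key):
    """Report the mean outcome for each subgroup and for the total sample."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r["active_minutes"])
    summary = {g: mean(v) for g, v in groups.items()}
    summary["total"] = mean(r["active_minutes"] for r in records)
    return summary

print(summarize_by(records, "gender"))
print(summarize_by(records, "age_group"))
```

The same function works for any subgroup variable (gender, age group, grade level), mirroring the convention of reporting results by subgroup as well as for the total sample.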
Correlational Research
Nonexperimental in nature, correlational research is closely
related to both descriptive and causal-comparative research. It is
similar to descriptive research in that it describes currently existing
phenomena; it is similar to causal-comparative research in that it
explores relationships between or among variables. According to
Gay, Mills, and Airasian (2008), the purpose of correlational
research is to either determine whether and to what extent a
relationship exists between two or more variables or to use these
relationships to make predictions. That is, it may be either
relational or predictive in nature. If a relationship exists, the degree
of that relationship is usually expressed as some type of
correlation coefficient. An example of this type of study in
kinesiology would be an investigation to determine how self-
esteem corresponds (relates) to physical fitness among females.
Measures of self-esteem and physical fitness would be collected
on a representative group of adolescent girls, and the association
between the variables would be measured as a correlation
coefficient.
In contrast to experimental research, there is no manipulation
of variables in correlational research; therefore, it is important to
recognize that these types of studies never establish cause-and-
effect relationships between variables, even in the presence of a
high correlation. Additionally, even if a high correlation between
self-esteem and physical fitness in this example did exist, it does
not mean that self-esteem increases or decreases physical fitness
or that physical fitness leads to increases or decreases in self-
esteem. It simply means that adolescent girls who have higher
levels of self-esteem tend to have higher levels of physical fitness,
and those who have lower levels of self-esteem tend to have lower
levels of physical fitness. Although similar to causal-comparative
studies in the fact that causal relationships cannot be definitively
established, correlational research methodology differs in that
usually there is only a single group of participants from which data
on two or more variables are collected for each participant. In
contrast, causal-comparative research methodology takes into
consideration scores from two distinct groups of participants (e.g.,
females and males; youth and adults). It should be noted,
however, that the existence of a high correlation does permit more
accurate predictions based on the relationship. For example,
correlational research has established that there is a relationship
between graduate school grades and scores on the Graduate
Record Exam (GRE). As a result, we can now establish a
prediction model (equation) whereby we can predict, with a certain
degree of accuracy, a student’s graduate school success (grades)
based on her/his GRE scores. Graduate school grades correlate
more highly with GRE scores than with TOEFL scores (a measure
of English language skills) or with undergraduate grades. As this example illustrates, the results
of prediction studies may be used to facilitate decision-making or
selection processes, as in the case of admission to graduate
school.
Historical Research
Most historical research can be regarded as both qualitative and
descriptive in nature, yet relationships can be explored and
hypotheses tested in certain types of well-designed and well-
executed historical studies. In the historical approach, the
researcher endeavors to record and understand events of the past
in order to provide a better understanding of the present and
suggest possible directions for the future. For instance, a
researcher might be interested in investigating the “evolution of
attendance trends at Women’s National Basketball
Association events over the league’s first 22 years.” Such
a study would likely be based on official records and reports from
teams, as well as newspaper and magazine articles, refereed
Internet sources, and interviews with selected individuals who had
played or coached in the league.
The historical approach is oriented toward the past as the
researcher seeks to provide a new perspective on a question of
current interest by conducting an intensive study of material that
already exists. However, historical research differs from other
forms of research because its subject matter, the past, is difficult to
capture, and the researcher cannot generate new data—only
synthesize and interpret data that already exist. Moreover, locating
all the relevant information concerning a particular subject from
many, widely scattered sources is typically a difficult and tedious
process. Sources of data or evidence for historical research can
generally be classified into two main categories, primary sources
and secondary sources. Primary sources of data consist of original
materials prepared by individuals who were participants or a direct
eyewitness of an event under investigation. Secondary sources of
data, on the other hand, are those one step removed from the
historical happenings of interest and consist of materials based on
secondhand accounts of events (accounts that are prepared by
individuals who were not present or witness to the events). In
Chapter 5, “Historical Research and Action Research,” we provide
a more extensive description of historical research methods.
The Significance of Research in
Kinesiology
The body of knowledge in the various kinesiology fields is currently
expanding at a rapid rate. Tremendous progress has been made
because scholars in these fields have focused on all aspects of
the human being—physical, mental, and emotional. The whole
person is studied in relation to movement, attitude, and lifestyle.
The research has been both theoretical and practical and has
reflected a sharp increase in quality, range, and depth.
Even so, research in the kinesiology field has a relatively short
historical footprint compared to other disciplines. Most authorities
agree that systematic research in kinesiology (formerly only
referred to as physical education) in the United States originated in
the mid-1800s. Although the Association for the Advancement of
Physical Education (now the Society of Health and Physical
Educators America or SHAPE America) was established in 1885
to serve as the professional organization for the fledgling field, the
disciplinary knowledge base was just being shaped. In the 75th
anniversary issue of the Research Quarterly for Exercise and
Sport, Roberta Park (2005) traces the historical development of
research and building the body of knowledge we know today as
kinesiology. Park describes the difficulty that early scholars and
professional leaders had in reconciling the diverse interests often
expressed under the “physical education” label. It was not until the
publication of the Research Quarterly (now Research Quarterly for
Exercise and Sport) in 1930 that a professional journal devoted
exclusively to reporting scholarly research in physical education
existed. As Research Quarterly matured as a professional journal
for reporting research findings, subdisciplines of physical
education (e.g., sport psychology, sport sociology, biomechanics,
exercise physiology, pedagogy, sport history, motor behavior, etc.)
began to emerge and create their own knowledge base. Park
described the 1960s as a time of rapid growth in scientific and
scholarly endeavors, leading to the creation of several specialized
journals associated with the various subdisciplines (e.g., Medicine
and Science in Sport, 1969; Journal of Motor Behavior, 1969;
Journal of Sport History, 1974; Journal of Sport Psychology,
1979). In 1986, leading scientists representing several of the
subdisciplines gathered at Arizona State University for a
symposium on Future Directions in Exercise/Sport Research. An
edited monograph of the symposium proceedings provides a
historical perspective of exercise and sport science research and
provides recommendations for future research needs in the
various subdisciplines (Skinner, Corbin, Landers, Martin, &
Wells, 1989). Similarly, a conference held in Pittsburgh,
Pennsylvania, in 2007 brought together researchers to reflect on
research about teaching and teacher education in physical
education. The resulting conference proceedings, Historic
Traditions and Future Directions of Research on Teaching and
Teacher Education in Physical Education (Housner, Metzler,
Schempp, & Templin, 2009), provide a valuable resource for
students and young professionals in physical education pedagogy,
curriculum, and teacher preparation. It is important for leading
researchers to gather regularly to provide recommendations for
lines of research, such as recent recommendations for more
emphasis on research on physical activity patterns, health-
optimizing physical education, and student outcomes from
programming in all physical activity and sport settings.
Much has been written about the advantages and
disadvantages of the various types of research and the ways to
interpret their results. Although quantitative methods have
historically been used more frequently by researchers in
kinesiology, qualitative and mixed-methods approaches have
become nearly as common. For example, in a research review of
the coaching literature (from 2005 to 2015), quantitative designs
accounted for 48.9% of studies, qualitative designs for 43.9%, and
mixed-methods designs for 7.1% (Griffo, Jensen, Anthony,
Baghurst, & Kulinna, 2019). The body of knowledge in
kinesiology continues to expand at a rapid rate. This is partly due
to the fact that the scope of research activities within the
kinesiology discipline is broad and includes such topics as
biomedical research, social and behavioral research, educational
research, and marketing research. Over time, researchers are
becoming more adept at using better study designs, improving
instrumentation, and developing increasingly sophisticated
analysis techniques. The tremendous progress in the volume of
research has led to the creation of a number of new professional
journals in recent years. In fact, the American Kinesiology
Association now identifies more than 100 leading journals in
kinesiology on their website (American Kinesiology
Association, 2018).
The demand for research to provide answers to many of the
questions and problems facing society continues to grow. More
and more we hear the call: “Show me the evidence!” Leaders and
professionals in many fields often demand empirical evidence to
help guide decision making and professional practice. It is not
uncommon to hear of evidence-based medicine, evidence-based
education, evidence-based therapy, and so forth. The typical
kinesiology practitioner has become more active in research in
recent years while the focus of the professional researcher
appears to have become more fixed on research having utilitarian
value. The so-called gap between the researcher and the
practitioner appears to be narrowing. We are seeing more and
more cooperative efforts between the researcher and practitioner,
a trend that promises to help solve practical kinds of problems as
well as to validate theoretical concepts.
In its simplest form, research is just another way of looking at
the problems that permeate the fields of kinesiology. Research is
for everyone, and everyone can and should engage in this
process. Deep knowledge of statistics and research design is not
necessary to carry out creditable research. All that is needed is
someone who is interested and willing to undertake the activity
and has some background knowledge of the problem to be
studied. Research should not be relegated to just a few experts in
the profession, but rather, it is the responsibility of many people.
Larger numbers of individuals engaging in research are needed to
guarantee that the problems, questions, and interests of all fields
of kinesiology are represented in research investigations.
Research can help broaden the base of knowledge and improve
the practices that have long been associated with kinesiology. It is,
and should continue to be, an active and continuing ingredient of
the scholarly efforts of those fields. It is a tool that professionals
cannot do without. It is the lifeblood of a profession and must be
pursued vigorously.
Summary of Objectives
1. Explain the relationship between research and a
profession. Research is the lifeblood of a profession because
it produces information that adds to the profession’s body of
knowledge. In the kinesiology professions, research is a
growing field that has produced a variety of conceptual and
methodological approaches for dealing with all aspects of
movement and human health.
2. Explain the scientific method. The scientific method is a
systematic approach to solving problems and acquiring
knowledge that involves a series of stages: identification of the
problem, formulation of a hypothesis, development of a
research plan, collection and analysis of data, and
interpretation and usefulness of results.
3. Describe the nature of research. The research process is a
formal, systematic, and logical attack on a problem using the
scientific method as the mode of inquiry. An organized
endeavor to obtain facts about some subject and, ultimately,
answers to a specific question, research attempts to discover
truth.
4. Understand the various types of research classifications
and how they are applied. Research that is theoretical in
nature and intended for the fundamental purpose of
discovering new knowledge is called basic research. In
kinesiology, research is usually done for the purpose of
addressing a specific problem or issue and, as such, is
considered applied research. The methodology should always
follow the research question. Which type of research is
appropriate for a given study depends on the nature of the
question being asked.
References
American Kinesiology Association. (2018). Kinesiology Journals. Retrieved from
http://www.americankinesiology.org/SubPages/Pages/Kinesiology%20Journals
Ary, D., Jacobs, L. C., & Sorensen, C. (2010). Introduction to research in education (8th
ed.). Belmont, CA: Wadsworth.
Best, J. W., & Kahn, J. V. (2005). Research in education (10th ed.). Boston, MA: Allyn &
Bacon.
Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and
mixed methods approaches. Thousand Oaks, CA: Sage.
Drew, C. J., Hardman, M. L., & Hosp, J. L. (2008). Designing and conducting research in
education (2nd ed.). Needham Heights, MA: Allyn & Bacon.
Ensign, J., Mays Woods, A., Kulinna, P. H., & McLoughlin, G. (2018). The teaching
performance of first-year physical educators. Physical Education and Sport Pedagogy,
23(6), 592–608.
Fraga, J. A. (2018). The dangers of juuling. National Center for Health Research.
Retrieved from http://www.center4research.org/the-dangers-of-juuling/
Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Educational research: An introduction (7th
ed.). New York, NY: Allyn and Bacon.
Gay, L. R., Mills, G. E., & Airasian, P. (2008). Educational research (8th ed.). Upper
Saddle River, NJ: Merrill.
Greene, J. (2007). Mixed methods in social inquiry. San Francisco: Jossey-Bass.
Griffo, J. M., Jensen, M., Anthony, C. C., Baghurst, T., & Kulinna, P. H. (2019). A decade
of research literature in sport coaching (2005–2015). International Journal of Sports
Science & Coaching. Advance online publication. doi:10.1177/1747954118825058
Hall, S., & Getchell, N. (2014). Research methods in kinesiology and the health
sciences. Philadelphia, PA: Williams & Wilkins.
Housner, L. D., Metzler, M. W., Schempp, P. G., & Templin, T. J. (2009). Historic traditions
and future directions of research on teaching and teacher education in physical
education. Morgantown, WV: Fitness Information Technology.
Kulinna, P. H., Stylianou, M., Lorenz, K., Mulhearn, S., Taylor, T., Orme, S., & Everett, A.
(2019). Comprehensive school physical activity programs in rural settings. In C.
Webster & R. Carson (Eds.), Comprehensive school PA programs: Handbook of
research and practice. Reston, VA: Society of Health and Physical Educators.
Martin, J. J. (2017). Handbook of disability sport and exercise psychology. Oxford, UK:
Oxford University Press.
McMillan, J. H., & Schumacher, S. (2010). Research in education: Evidence-based
inquiry. Upper Saddle River, NJ: Pearson.
National Institutes of Health (NIH). (2018). Research involving human subjects glossary.
Retrieved from https://humansubjects.nih.gov/glossary.
Sallis, J. F., McKenzie, T. L., Beets, M. W., Beighle, A., Erwin, H., & Lee, S. (2012).
Physical education's role in public health. Research Quarterly for Exercise and
Sport, 83(2), 125–135. doi:10.1080/02701367.2012.10599842
Skinner, J. S., Corbin, C. B., Landers, D. M., Martin, P. E., & Wells, C. L. (1989). Future
directions in exercise and sport science research. Champaign, IL: Human Kinetics.
Society of Health and Physical Educators. (2018). National PE Standards. Retrieved
from https://www.shapeamerica.org/standards/pe/
Tremblay, M. S., Longmuir, P. E., Barnes, J. D., Belanger, K., Anderson, K. D., Bruner,
B., … Woodruff, S. J. (2018). Physical literacy levels of Canadian children aged 8–12
years: descriptive and normative results from the RBC Learn to Play-CAPL project.
BMC Public Health 18(2), 31–43.
Tuckman, B. W., & Harper, B. E. (2012). Conducting educational research (6th ed.).
Lanham, MD: Rowman & Littlefield.
Tuohy, D., Cooney, A., Dowling, M., Murphy, K., & Sixsmith, J. (2013). An overview of
interpretive phenomenology as a research methodology. Nurse Researcher, 20(6),
17–20.
Twietmeyer, G. (2012). What is kinesiology? Historical and philosophical insights. Quest,
64(1), 4–23.
U.S. Department of Health and Human Services (1996). Physical activity and health: A
report of the Surgeon General. Washington, DC: Author.
U.S. Department of Health and Human Services. (2018). 2018 Physical activity
guidelines advisory committee scientific report. Washington, DC: Author.
CHAPTER 2
Descriptive Research
This chapter contains information about descriptive research. You
should become familiar with the various types of descriptive
research and the techniques commonly used in conducting
descriptive research.
OBJECTIVES
KEY TERMS
Causal-comparative research
Completion item
Criterion-referenced standards
Cross-sectional approach
Focus group
Longitudinal approach
Multiple choice
Norm-referenced standards
Open-ended
Percentile ranks
Semistructured interview
Structured interview
Unstructured interview
Descriptive research is oriented toward the present; in the
next chapter you will find that experimental research focuses
on the future. Descriptive research is conducted to describe
a present situation: what people currently believe, what people are
doing at the moment, and so forth.
Descriptive research is a broad classification of research under
which many types of research are conducted. Some people
mistakenly believe that experimental research is the only
approach. Others mistakenly believe that descriptive research is
easier to conduct than experimental research. Any type of
research is good if it is conducted properly, and all research is
demanding. For a study to be conducted well, the researcher
should have some formal training and practical experience in the
research area. It is foolish to try to conduct a research study in
physics without sufficient formal training in physics. Likewise, it is
improper to use research techniques on elementary school-aged
children that are used on teenagers or adults.
Descriptive research is conducted by collecting information
and, based on that information, describing the situation.
Descriptive research can, but does not have to, include a research
hypothesis. Consider a study in which 10,000 high school seniors
are selected from a population and surveyed as to whether they
smoke tobacco, and 38% indicate they do. The researcher reports
that 38% of high school seniors in the population smoke tobacco.
Using the same research procedures, the researcher could have
hypothesized prior to conducting the study that at least 30% of
high school seniors in the population smoke tobacco. In that case,
the researcher reports the 38% and whether the research
hypothesis is accepted or rejected.
Descriptive research is commonly conducted in physical
education and may be the predominant research approach in
health, recreation, and sport management. Descriptive research is
often conducted in exercise science, but not as commonly as in
the other disciplines mentioned.
Types of Descriptive Research
The types of descriptive research are briefly described in this
section to provide a general awareness of and broad appreciation
for them. The types of descriptive research discussed are: survey,
developmental, case study, correlational, normative, observational,
action, and causal-comparative. Isaac and Michael (1995) offer a
useful overview of descriptive research and the types of
descriptive research.
Survey
Survey research is the most common type of descriptive research.
It involves determining the views or practices of a group through
interviews or by administering a questionnaire. The questionnaire
may be administered to a group by the researcher or mailed to the
members of the group for them to complete and mail back to the
researcher. Due to the rapid development of the Internet and
increased prevalence of mobile phones, more and more surveys
are now conducted using these new technologies and media.
Survey research will be discussed in detail later in this chapter.
Developmental
Developmental research usually deals with the growth and
developmental change of humans over time. For example, what
are the growth and developmental changes each year from age 6
to age 18? More accurately, what are the growth and
developmental attributes of each age group? Other developmental
research might examine how organizations or professional groups
develop over time.
There are two approaches to developmental research. One is
the longitudinal approach in which a group is measured and
observed on a regular basis for multiple years. For example, twice
a year from age 6 to age 18 a large group of children is fitness
tested. Based on this information, fitness standards are developed
for each age level. The problem with this research approach is that
it takes many years to complete, and it is often difficult to keep
track of the participants over the many years that the study lasts.
However, such studies are tremendously valuable. Many
epidemiologists now wish that fitness data had been collected on
today's middle-aged adults when they were children, because such
data would allow them to answer questions concerning how fit
children need to be in order to be fit as adults. As valuable as longitudinal studies may
be, they are not a good choice for master’s theses or doctoral
dissertations due to the vast length of time involved.
The other approach to developmental research is the cross-
sectional approach. Taking the earlier example of developing
fitness standards for age groups, with the cross-sectional
approach, large samples of each age group are tested at the same
time, and standards for each age group are developed. The
assumption underlying this research approach is that each age
group is representative of what the other age groups were, or will
be, at that age. If this assumption is true, cross-sectional research
will yield the same results as longitudinal research, but in a shorter
time period. Cross-sectional research cannot always answer
questions that can be answered by longitudinal research. The
earlier question by epidemiologists about fitness in children as an
indicator of fitness in adults can be answered only with a
longitudinal study. More information on how these approaches are
used in epidemiological research can be found in Chapter 6,
“Epidemiologic Research.” Interested readers may also refer to
Hauspie, Cameron, and Molinari (2004) for a more complete
introduction on methods in human growth research.
Case Study
Case study research typically involves studying a person or event
in great detail and describing what is found. A study of the training
techniques or performance techniques of a highly skilled athlete is
an example of a case study. It is assumed that less-skilled
performers should use the techniques of the highly skilled. A study
of the way the Boston Marathon is organized and conducted for
use as a model for how to organize and conduct a marathon is
another example of a case study.
Being highly organized and systematic in collecting the
information needed to write the report is vital in this research
approach. Essential to case study research is preparation by
looking at the literature on how to conduct case study research
and also at actual case studies for the techniques used.
Correlational
The purpose of correlational studies is to determine if a
relationship exists between variables. The statistical techniques
used in correlational research are discussed in Chapter 14
“Descriptive Data Analysis,” and Chapter 15, “Inferential Data
Analysis.” To determine if a relationship exists between two
variables, each participant must be measured on both variables.
For example, to determine if a relationship exists between time
spent practicing a task and ability in performing the task, a practice
time and task ability score must be obtained for each participant. If
the data analysis shows that the longer a participant's practice
time, the better the participant's task ability score tended to be, the
researcher can conclude that a relationship exists and that longer
practice is associated with better performance.
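The relationship in the practice-time example can be quantified with a correlation coefficient. Below is a minimal Python sketch using invented practice-time and task-ability scores; the data, variable names, and the choice of the Pearson coefficient are illustrative assumptions, not values from an actual study.

```python
# Hypothetical data: minutes of practice and a task-ability score for
# eight participants (invented for illustration, not real study data).
practice = [10, 15, 20, 25, 30, 35, 40, 45]
score = [52, 55, 61, 58, 66, 70, 69, 75]

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# A value near +1 means longer practice times go with higher scores;
# a value near 0 would indicate no linear relationship.
r = pearson_r(practice, score)
```

Chapter 14, "Descriptive Data Analysis," covers interpretation of the coefficient itself.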
Usually, the variables in a correlational study are not ones that
the researcher tries to manipulate, as is done in an experimental
study. In the previous example, the researcher just determined
how much each participant practiced and how well each
participant performed the task. Many correlational studies deal
with the relationship between a participant’s classification variable
(e.g., height, weight, gender, age, income, education) and a
variable of interest to the researcher. For example, what is the
relationship between age and beliefs concerning use of leisure
time?
Sometimes correlational studies involve three or more
variables, and the purpose of the research is to determine how
well one of the variables can be predicted by some combination of
the other variables. For example, how well can college grade-point
average be predicted by high school grade-point average and SAT
scores?
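A prediction of this kind is typically made with multiple regression: a prediction equation is fit to existing data and then applied to a new case. The sketch below, in Python with invented GPA and SAT values, solves the least-squares normal equations directly; the data and function names are hypothetical.

```python
# Hypothetical data: high school GPA, SAT score, and college GPA for six
# students (all values invented for illustration).
hs_gpa = [3.2, 3.8, 2.9, 3.5, 3.9, 3.0]
sat = [1100, 1350, 1000, 1200, 1400, 1050]
col_gpa = [2.9, 3.6, 2.7, 3.2, 3.7, 2.8]

def fit_two_predictors(x1, x2, y):
    """Least-squares coefficients (b0, b1, b2) for y = b0 + b1*x1 + b2*x2,
    found by solving the 3x3 normal equations with Gaussian elimination."""
    n = len(y)
    cols = [[1.0] * n, list(map(float, x1)), list(map(float, x2))]
    A = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(3)]
         for i in range(3)]
    c = [sum(u * yi for u, yi in zip(cols[i], y)) for i in range(3)]
    for i in range(3):  # forward elimination with partial pivoting
        p = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        c[i], c[p] = c[p], c[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            A[r] = [u - f * v for u, v in zip(A[r], A[i])]
            c[r] -= f * c[i]
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, 3))) / A[i][i]
    return b

b0, b1, b2 = fit_two_predictors(hs_gpa, sat, col_gpa)
# Predict college GPA for a new applicant with a 3.4 HS GPA and 1150 SAT.
predicted = b0 + b1 * 3.4 + b2 * 1150
```

In practice such equations are fit with statistical software; the point is only that two (or more) variables can jointly predict a third.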
Correlational studies can be conducted to try to explain why
participants differ on a variable. This is a regression approach to a
correlational study. Why don’t all dancers execute a dance move
with equal skill? Part of this difference in ability among dancers is
explained by differences in amount of training, percentage body
fat, flexibility, and leg length to body height ratio. Wouldn't it be
interesting to find that the leg length to body height ratio explained
much of the difference among dancers in executing a move and
that the differences had little to do with things like amount of training,
percentage body fat, or flexibility?
Normative
Norms are standards of performance. The purpose of normative
research is to develop performance standards. Performance
standards are developed on a large representative sample from a
population; these standards then are applied to other samples
from the population.
Standards can be norm-referenced or criterion-referenced. The
majority of standards have been norm-referenced, but this
discussion will consider normative research contributing to the
development of either type of standard. Norm-referenced
standards are designed to rank order individuals from best to
worst and are usually expressed in percentile ranks. Test scores
commonly achieved are presented in charts; for each test score
there is a percentile rank indicating the percentage of the
participants in the norming group who scored below that test
score. An example of norm-referenced standards is presented in
TABLE 2.1. The percentile rank for the test score 4 is 42,
indicating that 42% of the participants scored below the test score
of 4. Standards published with most physical fitness tests prior to
1980 and many other nationally distributed tests are norm-
referenced. Measurement books such as Baumgartner, Jackson,
Mahar, and Rowe (2016) detail how to develop norm-referenced
standards. Because a person is compared with his or her peers
(in kinesiology, often peers of the same age and sex), the norms
should be kept up to date and the norming population should be
healthy and normal; otherwise, the evaluation derived from the
norms may be misleading, and an average or above-average score
becomes meaningless (Zhu, 2012).
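The percentile-rank definition above (the percentage of the norming group scoring below a given test score) is straightforward to compute. Here is a small Python sketch with an invented norming sample; the scores are hypothetical, not taken from TABLE 2.1.

```python
# Hypothetical norming data: test scores from a norming sample
# (invented values for illustration).
norming_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 7]

def percentile_rank(score, sample):
    """Percentage of the norming sample scoring below the given score,
    matching the definition of a norm-referenced percentile rank."""
    below = sum(1 for s in sample if s < score)
    return 100 * below / len(sample)

# Here 6 of the 12 norming scores fall below 4, so the percentile
# rank of a score of 4 is 50.
pr = percentile_rank(4, norming_scores)
```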
Observational
In this type of research the data are observations of people or
programs. For example, 5 days per week for 18 weeks a
researcher observed a community recreation program and
recorded everything he observed. At the end of the 18 weeks, the
researcher wrote a report based on those recorded observations.
With this type of research, data collection and analysis are
time-consuming and involve considerable technique. Formal
training and practical experience are necessary before this type of
research can be attempted. Because of advances in technology
that make the data collection in observational research easier,
observational research has gained in popularity within fields such
as kinesiology since 1980.
Action
Action research is conducted in the natural setting where it will be
applied. Thus, it lacks some of the control possible with other
types of research, but the results of the research are certainly
correct for the setting. Action research is always conducted to try
to find an answer to a problem that exists in the natural setting.
Practitioners who constantly strive to do a better job are actually
performing an informal type of action research. An example of
action research could be the testing of a new approach to interest
students or adults in starting a fitness program. Isaac and
Michael (1995) contrast formal research, action research, and the
casual approach to solving problems; they characterize action
research as less precise and demanding than formal research, but
superior to the casual approach.
Causal-Comparative
Also called ex post facto research, causal-comparative research
uses data that were generated before the
research study was conceived. It is after-the-fact research, looking
for relationships or explanations for certain phenomena that
presently exist by looking at data from the past. For example,
looking at differences between heavy drinking and nondrinking 18-
year-olds based on information kept on file about these individuals
over the last 8 years is ex post facto research.
Isaac and Michael (1995) present a thorough discussion of
this research method, also referring to it as causal-comparative
research. Best and Kahn (2005) warn that just because two
variables are related does not mean that a cause-and-effect
relationship exists, with one variable causing an effect on the other
variable. Several research books in education (Wiersma, 1991;
Best & Kahn, 2005) discuss this research approach in detail.
Survey Research
Survey research is one of the most common types of descriptive
research performed in the kinesiology area. For that reason,
survey research is discussed further in this section of the chapter.
The information presented under “Measurement Techniques” and
“Questioning Techniques” in Chapter 10, “Developing the
Research Plan and Grant Application,” should be reviewed now,
particularly the information concerning scaling techniques
(rating scales, semantic differential scales, and Likert scales) and
structured questionnaires, because it applies to the following
discussion of questionnaire construction and use.
In survey research, information concerning opinions or
practices is obtained from a sample of people representing a
population through the use of interview or questionnaire
techniques. This information provides a basis for making
comparisons and determining trends, reveals current weaknesses
and/or strengths in a given situation, and provides information for
decision making.
As with most types of research, information obtained by a
survey has limitations. Survey information reveals, at best, what
the situation is, not what the situation should be. Surveys that deal
with behaviors or attitudes do not reveal the factors that cause or
influence the behaviors or attitudes. Furthermore, a survey cannot
be used to secure all the information sometimes needed for
decision making. Surveys are quite often limited by the sample
used and the information obtained. Finally, the information
obtained may be inaccurate or misinterpreted.
The many survey methods and the techniques used along with
them make it impossible to discuss all methods and techniques in
detail. The authors’ intent in this section of the book is to provide
an overview of the methods and to discuss some of the more
common techniques. Many books are available on the survey
method and the techniques involved with it. Bradburn, Sudman,
and Wansink (2004) have written a practical guide to
questionnaire design. Publishers such as Sage Publications and
Lawrence Erlbaum Associates often list in their catalogs many
excellent publications dealing with survey research, survey
construction, phone interviews, personal interviews, attitude
measurement, and much more. Recently, Web-based and mobile
phone-based surveys have been used (Cronk & West, 2002;
Daley, McDermott, Brown, & Kittleson, 2003; Dillman, Smyth,
& Christian, 2008; Jones, 1999; Pealer & Weiler, 2000; Riva,
Teruzzi, & Anolli, 2003; Schonlau, Fricker, & Elliott, 2002; Sills
& Song, 2002; Simsek & Veiga, 2000; Vispoel, 2000).
Survey Methods
Phone interviews are not often used by researchers in kinesiology
as the primary data collection method. Personal interviews are
becoming more prevalent. Administering a questionnaire to a
group of participants is a technique commonly used in the field.
However, a mailed or emailed questionnaire to a sample of
participants is the method used more often than any other.
Phone Interview
Marketing research, a form of survey research, and product sales
commonly take place over the phone. Why do people in marketing
research always call you at dinner time? Because that is when you
are likely to be home after being away all day. This is a
phone interview technique: Call when people are home and are
probably willing to talk to you. Don’t call at 3:37 a.m., although
people are usually home. When the phone is answered, the caller
has about 10 seconds to say something to get the person’s
attention, or the individual is likely to hang up. This is also a
technique: What questions do you ask the person, and in what
order do you ask the questions? How long can you hold a person’s
attention on the phone? Ask questions that require a short answer,
and plan how to record the person’s answers. There is
considerable planning and technique in conducting a phone
interview. Much of the planning and technique for administering a
phone interview is the same as that for doing a personal interview.
What are the advantages of a phone interview over other
survey techniques? Phone interviews are quick and inexpensive
when the sample is spread over a wide geographical area.
However, the method does not allow for very many questions to be
asked, and recording the answers may be difficult. Talk to the
survey researchers among the business and sociology students
on campus who do phone interviews. Read the literature on phone
interviews. Many educational research books address this topic.
Babbie (2004) discussed phone surveys and noted that not all
people have phones with listed numbers. That may bias phone
survey results. However, Babbie indicated that random-digit dialing
overcomes this problem and that this technique has gained
popularity. Finally, a phone list with targeted demographic
populations is often available through commercial companies.
Personal Interview
In the personal interview the researcher meets with each member
of the sample and, based on their conversations, the needed
information is obtained. If the sample is small and accessible, this
is a feasible technique. When the information the researcher
desires cannot be collected by asking a series of questions on
paper (questionnaire), the personal interview must be used. For
example, a personal interview is probably necessary to obtain
information from a senior citizen about how things were 50 years
ago.
What are some things to consider if the personal interview is
selected as the data-gathering technique? One issue is how to
contact potential participants for an interview time. Another is the
decision regarding whether to use a structured interview (asking
each participant the same specific questions), a semistructured
interview (asking each participant the same general questions), or
an unstructured interview (just letting the conversation develop).
The decision is dependent on what information the researcher
needs, whether questions can be formulated in advance, and what
the participant is comfortable with or will tolerate. Still another
decision is how to record the information provided by the
participant. Recording the interview may be best because every
word is permanently saved. An alternative is to take notes during
the interview. Both approaches may be unacceptable to the
participant or be so intimidating that the participant will not be
open or totally truthful with the researcher. A sufficiently
experienced or skilled interviewer may be able to just talk to
people, and then write everything down after the interview is over
and the participant is no longer present. The point is, personal
interviews require much more planning and technique than many
people realize.
Some of the advantages of the personal interview are
completeness of responses, ability to clear up misconceptions,
opportunity to follow up responses, and increased likelihood that
the respondent will be more conscientious with the interviewer
present. Isaac and Michael (1995) favor the structured interview
because it requires less training of the researcher and is more
objective than the unstructured or semistructured interview. They
also present some guidelines for interviews. Other useful
references for those who desire additional information are
Wiersma (1991), Rubin (1983), Babbie (2004), and Gall, Gall,
and Borg (2010).
Focus Groups
A focus group is a group interview in which a moderator or
interviewer guides a small group of participants as they discuss
the topics the interviewer raises. A focus group usually comprises
six to eight people, and what the participants say in the discussion
is audio-recorded (sometimes also video-recorded) and used as
the research data. The
interviewer should be a well-trained professional who guides the
group discussion through a predetermined set of questions or
topics and explores and follows up the new questions or topics
when they are raised. For more information on focus groups, see a
“Focus Group Kit” developed by Sage Publications (Morgan,
Krueger, & King, 1998).
Self-administered Questionnaire
For a variety of reasons, the majority of survey research
conducted in kinesiology uses questionnaires as the data-
gathering technique. A questionnaire is a series of questions or
statements on paper. Each participant is given a copy of the
questionnaire. The participant responds to the questionnaire and
then returns it to the researcher. If the researcher can meet with
the participants, the questionnaire will probably be administered to
all the participants at one time or to several groups of participants
at several times. However, the most prevalent procedure is to mail
or distribute the questionnaire to the participants, who complete
the questionnaire on their own, and then return it to the
researcher. Because the majority of procedures are the same no
matter whether the researcher administers the questionnaire or
distributes it for completion, the majority of the information about
questionnaires is discussed in the section on distributed
questionnaires. Information specific to administered
questionnaires is presented here.
If the researcher is unable to get the participants together at
one time or in several groups to administer the questionnaire, the
researcher may as well distribute the questionnaire. Often, a
researcher seeks permission to administer a questionnaire to an
intact group such as a class, school, agency, or exercise program.
The other possibility is to organize the participants so they come
together in one or more groups to complete the questionnaire. In
either case, a room or facility must be secured for administering
the questionnaire. The facility must be large enough to easily
accommodate the largest possible group and conducive to
completing a questionnaire (i.e., is quiet, contains desks or tables
providing a writing surface, has adequate seating, etc.). Some
thought should be given to whether pencils and other materials
need to be provided. Certainly, what to say to the participants
about the purpose of the questionnaire and how to complete it
needs to be planned. How to pass out and receive back the
questionnaires has to be thought through as well. The larger the
group, the more important these plans become, particularly if the
researcher has access to the room or facility for a limited amount
of time. A common pitfall is to think that a questionnaire that
typically takes 45 minutes to complete can be administered in 60
minutes. All participants will not arrive on time, giving the verbal
directions and distributing and receiving the questionnaire will take
time, and some participants will take more than 45 minutes to
complete the questionnaire.
Distributed Questionnaire
When the sample is geographically spread out or cannot be
brought together as a group, a distributed questionnaire is used.
Distribution may be by mailing or emailing it to the participants or
putting it in their mailbox at work, by handing it to participants in
class or at work or in some location where they gather, or by
having another person distribute the questionnaires in classes or
on job sites. This will include some planned procedure for having
participants return the completed questionnaires to the researcher.
Researchers distribute more questionnaires by mail or email than
by any other method.
Good questionnaire research requires considerable planning
and technique and is not something that just anyone can
accomplish quickly. Professional pollsters and marketing research
experts conduct outstanding questionnaire research. Graduate
students sometimes do a poor job with questionnaires due to a
lack of knowledge and sufficient planning.
Virtually everyone receives many questionnaires in the mail or
by email each year. The good ones are often completed and
returned; the poor ones usually go in the trash. By way of example
of the latter, the authors once received a poorly Xeroxed copy of a
questionnaire. One question asked, “Do you belong to
TAHPERD?” First, was the question referring to the Tennessee or
Texas Association of Health, Physical Education, Recreation and
Dance? Second, why would a person living in Georgia or another
state belong to either one? The questionnaire was a Xeroxed copy
of one that another researcher had developed for use in
Tennessee. The questionnaire went in the trash.
The following discussion of questionnaire research assumes
that the questionnaire is distributed and returned by mail.
Questionnaire development, format, distribution, return, and
examples are treated separately.
Questionnaire development
Questionnaires are used for a variety of purposes, and often the
purpose dictates the type of items (i.e., question or statement)
used in the questionnaire. Thus item type is one decision to be
made in questionnaire development. An item is classified as open-
ended if participants have the freedom to respond however they
choose. For example, the question “Why did you enroll in this adult
fitness program?” is open-ended. A second type of item is a
completion item; the subject fills in the blank. For example, “What
is your age in years to the current year?” is a completion item. The
third type of item is multiple choice or closed-ended; the possible
responses to the item are provided and the participant selects the
most appropriate response(s).
The following is an example of a closed-ended item:
1. Select the response below that best reflects your feelings
about the research course you are presently taking.
1. Great
2. Above average
3. Average
4. Below average
5. Terrible
One way to check the consistency of responses is to include
paired items, placed well apart in the questionnaire, whose
answers should agree. For example:
6. I like candy.
1. Agree
2. Disagree
33. I don’t like candy.
1. Agree
2. Disagree
The questionnaire is administered and the responses to paired
items are examined to see whether they are as expected. No
matter whether the questionnaire is administered on two different
days or on one day with paired items, an intraclass R can be
obtained as an estimate of reliability, either for each item
administered twice or for each pair (see Chapter 13,
“Measurement in Research,” or Baumgartner, Jackson, Mahar,
& Rowe, 2016).
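One common form of the intraclass R is based on a one-way ANOVA of the paired responses. The Python sketch below uses invented responses for two administrations of the same item; the data and function name are hypothetical, and Chapter 13 and Baumgartner et al. (2016) describe the full procedure.

```python
# Hypothetical paired responses: each participant's answer to the same
# item on two administrations (invented data, 1-5 response scale).
trial1 = [2, 4, 3, 5, 1, 4, 2, 3]
trial2 = [2, 5, 3, 4, 1, 4, 3, 3]

def intraclass_r(t1, t2):
    """One-way ANOVA intraclass correlation for k = 2 trials per person:
    R = (MSB - MSW) / (MSB + (k - 1) * MSW)."""
    n, k = len(t1), 2
    rows = list(zip(t1, t2))
    grand = sum(t1 + t2) / (n * k)            # grand mean of all responses
    means = [sum(row) / k for row in rows]    # per-person means
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(rows, means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

R = intraclass_r(trial1, trial2)  # close to 1.0 when answers are consistent
```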
The quality of the questionnaire should be determined before it
is used to collect the research data. This is always the case with
the reliability and validity of a questionnaire, as just discussed.
Some researchers determine reliability or validity using data from
the research study. The problem with this method is that if the
reliability or validity is poor, it is too late to make changes in the
questionnaire to improve reliability or validity. Unreliable or
nonvalid data are of no use and must be discarded.
A pilot study is the best solution to this possible problem. For
further information, see Chapter 10, “Developing the Research
Plan and Grant Application,” under “Instrument Development.” The
questionnaire is administered to a small number of individuals who
are similar to the participants who will take part in the research
study, and the reliability and validity for their data are estimated.
As part of the pilot study, it is a good idea to keep track of such
things as how long it takes individuals to complete the
questionnaire, whether the questionnaire appears to contain any
ambiguous items, and any other factor that could affect the
successful administration of the questionnaire. Based on the pilot
study, the questionnaire is revised as needed.
Questionnaire format
The appearance and layout of the questionnaire are as important as
the content. The questionnaire must be professional in the way it
is designed and in the quality of the paper and reproduction, or
people will not complete it. Items need to be arranged in rows or
columns for a neat appearance, ease of completing the
questionnaire, and ease of data entry into the computer, if
necessary, for analysis. Demographic information such as age,
gender, education, and income is often requested on the
questionnaire. Because questions about age and income
sometimes irritate people, demographic questions should be
placed at the end of the questionnaire. Irritating questions at the
end of a questionnaire are often left blank, but if they appear at the
beginning they may cause the entire questionnaire to go into the
trash. In fact, any controversial item (e.g., “Have you gained a lot
of weight in the last 5 years?”) should be placed at the end of the
questionnaire for the same reason.
Answers to closed-ended items that will be computer-analyzed
need to be numerically coded. Participants should be asked to
check the appropriate item or circle the appropriate number.
EXAMPLE:
Poor format
What is your sex? ______
Better format
Circle the number for your sex.
1. Female
2. Male
or
Check the number for your sex.
______ 1. Female
______ 2. Male
EXAMPLE:
Poor format
What is your income? ______
Better format
Which group most closely represents your income?
______ 1. Less than $12,000
______ 2. $12,001 to $18,000
______ 3. $18,001 to $27,000
______ 4. $27,001 to $40,000
______ 5. More than $40,000
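Numeric coding pays off at analysis time, because coded responses can be tallied directly. A minimal sketch using the income categories above (the response data are invented for illustration):

```python
from collections import Counter

# Numeric codes for the closed-ended income item
labels = {1: "Less than $12,000", 2: "$12,001 to $18,000",
          3: "$18,001 to $27,000", 4: "$27,001 to $40,000",
          5: "More than $40,000"}

# Hypothetical coded responses, one number per returned questionnaire
responses = [3, 2, 5, 3, 1, 4, 3, 2, 5, 3]

counts = Counter(responses)
for code in sorted(labels):
    pct = 100 * counts[code] / len(responses)
    print(f"{code}. {labels[code]}: {counts[code]} ({pct:.0f}%)")
```

With open-ended answers (“What is your income?”), each response would first have to be read and classified by hand before any such tabulation is possible.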
Questionnaire distribution
The biggest concerns with distribution are controlling the cost of
getting the questionnaire to the participants and trying to obtain a
high rate of questionnaire return. The researcher can influence
both. For example, the fewer the number of pages in the
questionnaire, the less expensive it is to mail, the shorter it looks
to the participant, and the more likely it is to be returned.
Generally, there is a tendency to have too many items on a
questionnaire. No matter what the number of items, reproduce the
questionnaire on both sides of the page and use the smallest
readable print size. Even the weight of the paper used can
sometimes influence the mailing return rate.
The researcher is expected to provide a self-addressed
stamped envelope for returning the questionnaire. Failure to do so
will almost certainly decrease the rate of return. To minimize the
postage on the return envelope, ask participants to respond to all
closed-ended items on a standardized answer sheet rather than
on the questionnaire and to just return the answer sheet.
Standardized answer sheets are often an efficient tool for
organizing the data to be entered into a computer. However, if
participant groups are not familiar with standardized answer
sheets, their use will decrease questionnaire return rate or result in
such poorly erased or completed answer sheets that data will be
lost.
A good mailing list is an essential component of distributing a
questionnaire. Lists are not always highly accessible. Lists with the
name of a person rather than a title (e.g., “department head”) are
generally desirable. However, if the questionnaire is going to the
person presently in the position and there is considerable turnover
in the position, a title may be better than a personal name. Check
into the advantages and disadvantages of bulk mail in comparison
to first- and second-class mail in terms of cost, speed of delivery,
and forwarding, because considerable cost reduction is possible
with bulk mail.
Researchers can do a number of things to try to improve the
percentage of questionnaires returned. To give some idea of the
extent of this problem, a typical overall return rate is 50%; a return
rate of perhaps 80% may be expected from professional
participants who have an interest in the questionnaire results, but
only about a 10% return is likely when surveying the general
public. For this reason, researchers tend to send out enough
questionnaires to ensure a desired number of returns. However, a
low return rate makes the
validity of the results questionable. Are the participants who
returned the questionnaire really a representative sample of the
target population? Would the results have been different if more
people had returned the questionnaire? It is important to motivate
people in every way possible to return the questionnaire. A cover
letter with the questionnaire explaining why the study is being
conducted and its importance to humanity will improve the return
rate. A cover letter signed by an influential person also increases
the return rate. Offering to share the results, enclosing a small
monetary incentive, or using a catchy device (“a penny for your
thoughts,” with a penny enclosed) also helps to motivate people to
return the questionnaire.
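The arithmetic behind sending out "enough questionnaires" is simple: divide the number of returns needed by the return rate expected for the group being surveyed, and round up. A small sketch using the rough return rates discussed above:

```python
import math

def mailings_needed(desired_returns, expected_return_rate):
    """Number of questionnaires to mail to expect a desired number back."""
    return math.ceil(desired_returns / expected_return_rate)

print(mailings_needed(200, 0.50))  # typical return rate -> mail 400
print(mailings_needed(200, 0.80))  # interested professionals -> mail 250
print(mailings_needed(200, 0.10))  # general public -> mail 2000
```

Note that mailing more questionnaires compensates only for the *number* of returns; it does nothing to fix the representativeness problem raised by a low return rate.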
Sending the questionnaire out at a time when people are likely
to have time to complete and return it is good strategy. Sending
questionnaires to coaches in the middle of their season or to
anyone just before Christmas is not wise. Also, one to three
follow-up letters, sent at 2- to 3-week intervals after the first
mailing to remind people to return their questionnaire, increase
the percentage of returns. Having participants put their name on the
questionnaire before they return it makes follow-up easier and is to
the advantage of the researcher. However, some participants will
not return the questionnaire or will not be totally truthful in
completing it if they have to put their name on it. If there are any
highly personal items on the questionnaire or any items dealing
with illegal activities, do not request the participant’s name on the
questionnaire. Alternatively, each questionnaire can be numbered
before it is mailed, with the number keyed to the recipient, so that
nonrespondents can be identified for follow-up.
EXAMPLE:
How often do you smoke pot?
1. Weekly
2. Monthly
3. Several times a year
4. Never
Questionnaire examples
I. Poor format
Listed are activities taught in the physical education
department. Please check the activities you have taken at the
university. If you check an activity, then please check the skill
level you presently have in the activity. “B” is beginner, “I” is
intermediate, and “A” is advanced.
Note: This is a poor questionnaire because much space is wasted and there is no
numerical coding of responses.
Improved format
Listed are activities taught in the physical education
department at the university. Please use the following codes to
indicate your status for each activity.
CODE: 0—Did not take activity
1—Took activity; presently have beginner level skill
2—Took activity; presently have intermediate level skill
3—Took activity; presently have advanced level skill
_____ Archery
_____ Bowling
_____ Golf
_____ Dance
_____ Swimming
_____ Tennis
II. May I have about five minutes of your time?
I need your assistance.
This is my dissertation research. The purpose of the study is to
determine what faculty members think are important attributes
in a department head. Using the scale shown below, please
rate the importance of each attribute of a department head.
Then return your ratings in the enclosed addressed, stamped
envelope.
Attributes
______ Friendly
______ Professional
______ Organized
______ Good looking
III. Using the key below, rate each task by putting an “X” in the
face that best represents your feeling.
IV. What are the four biggest problems in our discipline today?
Note: This is an open-ended question because the researcher
does not know all the problems and probably does not have
enough space on the questionnaire to list all the problems.
2. How much time did you usually spend doing vigorous physical activities on one
of those days?
______ hours per day
______ minutes per day
□ Don’t know/not sure
Think about all the moderate activities that you did in the last 7 days. Moderate
activities refer to activities that take moderate physical effort and make you
breathe somewhat harder than normal. Think only about those physical
activities that you did for at least 10 minutes at a time.
3. During the last 7 days, on how many days did you do moderate physical
activities like carrying light loads, bicycling at a regular pace, or doubles tennis?
Do not include walking.
______ days per week
□ No moderate physical activities → Skip to question 5
4. How much time did you usually spend doing moderate physical activities on one
of those days?
______ hours per day
______ minutes per day
□ Don’t know/not sure
Think about the time you spent walking in the last 7 days. This includes at work
and at home, walking to travel from place to place, and any other walking that
you have done solely for recreation, sport, exercise, or leisure.
5. During the last 7 days, on how many days did you walk for at least 10 minutes
at a time?
______ days per week
□ No walking → Skip to question 7
6. How much time did you usually spend walking on one of those days?
______ hours per day
______ minutes per day
□ Don’t know/not sure
The last question is about the time you spent sitting on weekdays during the
last 7 days. Include time spent at work, at home, while doing coursework, and
during leisure time. This may include time spent sitting at a desk, visiting
friends, reading, or sitting or lying down to watch television.
7. During the last 7 days, how much time did you spend sitting on a week day?
______ hours per day
______ minutes per day
□ Don’t know/not sure
Questionnaire Summary
Research using a questionnaire as the data-gathering instrument
is quite common in kinesiology. Considerable information has
been presented concerning construction and use of a
questionnaire. For people who will seldom use a questionnaire, it
may be too much information, but for people who will use a
questionnaire in their research, it is probably not enough
information. A number of excellent sources are available for those
who desire more information. All of the following are enlightening:
Isaac and Michael (1995); Wiersma (1991, chapter 7); Best and
Kahn (2005, chapter 5); and Neutens and Rubinson (2002).
Books specific to survey methods and mail surveys such as
Babbie (1990) and Weisberg and Bowen (1977) are excellent
sources.
Summary of Objectives
1. Identify the common types of descriptive research. The
most common and familiar type of descriptive research is the
survey. Other types of descriptive research include
developmental research, case studies, correlational studies,
normative research, observational research, action research,
and causal-comparative, or “after-the-fact,” research.
2. Explain several types of survey methods. Surveys can be
conducted as phone interviews, or personal interviews, which
can vary in structure from very specific to very general.
Surveys can also be conducted through the use of
administered questionnaires. Questionnaires can include
multiple choice or closed-ended questions, which are easy to
compare, as well as open-ended questions that allow more
varied responses.
3. Evaluate the quality of a research questionnaire. To be a
useful research tool, a questionnaire must meet several
criteria. The survey itself must be carefully designed and
administered to a randomly chosen but appropriate population.
It should be easy to read, complete, and submit. Questions
should be unambiguously written and should extract the
desired information in a consistent manner. The validity of a
questionnaire is typically judged by a panel of experts, and its
reliability is estimated statistically (e.g., with an intraclass R).
References
Babbie, E. (1990). Survey research methods (2nd ed.). Belmont, CA: Wadsworth.
Babbie, E. (2004). The practice of social research (10th ed.). Belmont, CA: Wadsworth.
Baumgartner, T. A., Jackson, A. S., Mahar, M. T., & Rowe, D. A. (2016). Measurement
for evaluation in kinesiology (9th ed.). Burlington, MA: Jones & Bartlett Learning.
Best, J. W., & Kahn, J. V. (2005). Research in education (10th ed.). Boston, MA: Allyn &
Bacon.
Bradburn, N. M., Sudman, S., & Wansink, B. (2004). Asking questions: The definitive
guide to questionnaire design—for market research, political polls, and social and
health questionnaires. San Francisco, CA: Wiley.
Cronk, B. C., & West, J. L. (2002). Personality research on the Internet: A comparison of
Web-based and traditional instruments in take-home and in-class settings. Behavior
Research Methods Instruments and Computers, 34, 177–180.
Daley, E. M., McDermott, R. J., Brown, K. R. M., & Kittleson, M. J. (2003). Conducting
Web-based survey research: A lesson in Internet design. American Journal of Health
Behavior, 27, 116–124.
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2008). Internet, mail, and mixed-mode
surveys: The tailored design method (3rd ed.). New York: John Wiley and Sons.
Gall, M. D., Gall, J. P., & Borg, W. R. (2003). Education research: An introduction (7th
ed.). New York: Allyn and Bacon.
Hauspie, R. C., Cameron, N., & Molinari, L. (Eds.). (2004). Methods in human growth
research. New York: Cambridge University Press.
Isaac, S., & Michael, W. B. (1995). Handbook in research and evaluation (3rd ed.). San
Diego: EdITS.
Jones, S. (Ed.). (1999). Doing internet research: Critical issues and methods for
examining the net. Thousand Oaks, CA: Sage Publications.
Morgan, D. L., & Krueger, R. A. (1998). The focus group kit. Thousand Oaks, CA: Sage
Publications.
Neutens, J. J., & Rubinson, L. (2002). Research techniques for the health sciences (3rd
ed.). New York: Benjamin Cummings.
Pealer, L. N., & Weiler, R. M. (2000). Web-based health survey research: A primer.
American Journal of Health Behavior, 24, 69–72.
Riva, G., Teruzzi, T., & Anolli, L. (2003). The use of the Internet in psychological
research: Comparison of online and offline questionnaires. CyberPsychology and
Behavior, 6, 73–80.
Rubin, H. J. (1983). Applied social research. Columbus, OH: Charles E. Merrill.
Schonlau, M., Fricker, R. D., Jr., & Elliott, M. N. (2002). Conducting research surveys via
e-mail and the web. Arlington, VA: Rand.
Sills, S. J., & Song, C. Y. (2002). Innovations in survey research: An application of Web-
based surveys. Social Science Computer Review, 20, 22–30.
Simsek, Z., & Veiga, J. F. (2000). The electronic survey technique: An integration and
assessment. Organizational Research Methods, 3, 93–115.
Vispoel, W. P. (2000). Computerized versus paper-and-pencil assessment of self-
concept: Score comparability and respondent preferences. Measurement and
Evaluation in Counseling and Development, 33, 130–143.
Weisberg, H. F., & Bowen, B. D. (1977). An introduction to survey research and data
analysis. San Francisco: W. H. Freeman.
Wiersma, W. (1991). Research methods in education (5th ed.). Boston, MA: Allyn &
Bacon.
Zhu, W. (2012). “17% at or above the 95th percentile”—What is wrong with this
statement? Journal of Sport and Health Science, 1(2), 67–69.
CHAPTER 3
Experimental Research
This chapter contains information concerning experimental
research. You should be familiar with how experimental research
is conducted and the major issues in conducting this type of
research.
OBJECTIVES
History
History refers to specific things that happen while conducting the
research study that affect the final scores of the participants in
addition to the effect of the experimental treatment. Suppose that
participants participate in an activity or program outside of the
research study but that the activity is very similar to the
experimental treatment. If this outside participation increases their
final scores, it makes the experimental treatment seem more
effective than it really is. A second example is the atypical
occurrence, such as an epidemic, local disaster, or one-time
emphasis on the research topic in the community, that occurs in
the lives of the participants during the research study and that
affects their final scores.
Maturation
Because the participants grow older during the experimental
period, their performance level changes and that change is
reflected in their final scores. For example, as young participants
grow older during the 15-week experimental period, their physical
performance level changes no matter what the effect of the
experimental treatment. Similarly, as elderly participants grow
older over the course of a 6-month fitness program, their physical
performance level decreases no matter what the effect of the
fitness program.
Seasonal changes might also be considered here. For
example, between September when the research study begins
and December when the study ends, participants become less
physically fit due to less daily activity because they are in school
all day and the weather is bad. This loss of fitness counteracts the
effect of the experimental treatment. A control group (group
receiving no treatment) can be an effective check on the
maturation threat.
Testing
The act of taking a test can affect the scores of the participants on
a second or later testing. For example, participants are pretested,
then the experimental treatment is administered, and finally the
participants are posttested. One reason why participants do better
on the posttest is because they learn from the pretest. The pretest
is like a treatment, which poses a real problem if the pretest and
posttest are the same knowledge or physical performance test.
Baumgartner (1969) tested participants with several physical
performance tests, retested them 2 days later, and found that
scores had improved from the initial test to the retest.
Instrumentation
Changes in the adjustment or calibration of the measuring
equipment or use of different standards among scorers may cause
differences among groups in final score or changes in the scores
of the participants over time. This suggests that researchers need
to check the accuracy of their measuring equipment regularly and
frequently and make sure a standard scoring procedure is used.
Any difference in the test scores of several treatment groups or
change in the test scores of the participants in a group over time
(pretest to posttest) must be due to differences among groups or a
change in the participants over time and not due to changes in the
calibration of the equipment or scoring procedure. Testing
equipment and scoring procedures must be held constant for each
participant in research environments involving multiple pieces of
the same equipment, multiple scorers, or multiple days of data
collection.
Statistical Regression
The tendency for groups with extremely high or low scores on one
measure to score closer to the mean score of the population on a
second measure is called statistical regression. This tendency
may be mistaken for experimental treatment effect. For example, a
high IQ group and a low IQ group are formed. An experimental
treatment is applied to both groups for 12 weeks. Then a physical
ability test is administered to both groups, and it is found that they
are equal in physical ability. This equality may be due to the
regression effect or to the treatment effect. Using random
sampling procedures to form groups eliminates that threat.
Research studies in which the research question is whether
groups that differ in one attribute differ in terms of another attribute
may have similar problems. For example, does a population of
individuals with disabilities differ from a population of individuals
without disabilities in their beliefs concerning use of leisure time?
No treatment is applied in this example, so random samples from
the two populations are obtained and a score for beliefs about the
use of leisure time is obtained for each participant. A finding that
the two populations do not differ in their leisure-time-use beliefs
may be due to the regression effect rather than to a true similarity
between the populations.
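Regression toward the mean is easy to demonstrate by simulation. In this sketch (all numbers are invented for illustration), a group selected for extremely low scores on a first error-prone measure scores noticeably closer to the population mean on a second, equally error-prone measure, with no treatment applied at all:

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

N = 10_000
true_ability = [random.gauss(50, 10) for _ in range(N)]
# Two independent, error-prone measurements of the same ability
measure1 = [t + random.gauss(0, 10) for t in true_ability]
measure2 = [t + random.gauss(0, 10) for t in true_ability]

# Form the "extremely low" group using measure 1 only
low_group = [i for i in range(N) if measure1[i] < 30]

m1_low = mean(measure1[i] for i in low_group)
m2_low = mean(measure2[i] for i in low_group)
print(f"low group on measure 1: {m1_low:.1f}, on measure 2: {m2_low:.1f}")
# The group's second score drifts toward the population mean (about 50)
# purely because of measurement error, not because of any treatment.
```

A researcher who attributed the group's improvement from measure 1 to measure 2 to an intervening treatment would be committing exactly the error this section describes.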
Selection
The way participants are selected or assigned to groups can be
biased. This may result in groups that are not representative of a
population, groups that are not equal in ability at the beginning of
the experiment, and/or groups that differ at the end of the
experiment for reasons other than differences in the integrity of the
experimental treatments. Random selection of participants and
random assignment of participants to groups usually controls this
threat. Thus, following the random sampling techniques and using
sample sizes that are sufficiently large for the research situation
will control this threat. (See Chapter 12, “Recruiting and Selecting
Research Participants.”) When participant selection is a threat it is
often due to small sample size per group and/or failure to use one
of the necessary alternatives to simple random sampling when
required for the research setting.
Experimental Mortality
This particular threat to internal validity is created by the
excessive loss of participants, so that experimental groups are no
longer representative of a population or similar to each other.
Some participant loss is to be expected in a research study, but if
many participants of the same type drop out of a group it can
considerably change the characteristics of the group and the
outcome of the study.
This also applies to disproportional participant loss among the
groups. Usually groups are about the same size at the start of the
research study but, if at the end of the research study they are
considerably different in size, this is a concern.
Interaction of Selection and Maturation or
History
The maturation effect or history effect is not the same for all
groups selected for the research study, and this influences final
scores. This problem may arise when each group is an intact
preexisting group rather than a randomly formed group. For
example, if the groups differ considerably in age or background,
this may affect how they respond to the experimental treatments.
Also, if groups start out unequal in ability due to maturation or
history, they may not have the same potential for improvement as
a result of the experimental treatments.
A design must control the major threats to validity and allow the
researcher to answer the research questions. However, the KISS
principle (keep it simple, stupid) should not be overlooked. The
design must be adequate, but do not make the design any more
complicated than it has to be.
In the previous section we learned that random selection and
random assignment of research participants influence how a
design is classified. So, random selection of participants should be
strived for in a design. Sometimes using an alternative to simple
random sampling will enable the researcher to achieve random
selection of research participants and random assignment of
research participants to groups for a true experimental design.
(Simple random sampling and alternatives to simple random
sampling are discussed in Chapter 12, “Recruiting and Selecting
Research Participants.”) From a design standpoint, if there are two
or more groups, the basic procedure is (1) define the target
population, (2) identify the accessible population, (3) randomly
select participants, and (4) randomly assign participants to groups.
Hawthorne Effect
This source of error is named for a research study that was
conducted at the Hawthorne plant of the Western Electric
Company in the 1920s. It was
observed that having a group of workers participate in a research
study caused them to feel special, so they acted differently from
the typical worker. The research study was conducted to
determine how changes in the working environment would affect
productivity. Results of the study showed that productivity
increased no matter whether the change in the working
environment was supposed to increase or decrease productivity.
This points out that participants in an experiment may perform in
an atypical manner due to the newness or novelty of the treatment
and because they realize that they are participating in an
experiment. This suggests that, as much as possible, the
participants should be unaware that they are participating in an
experiment and also unaware of the hypothesized outcome of the
study. Particularly in studies comparing new and traditional
methods (e.g., teaching, training), the researcher does not want
the new method to seem superior to the traditional method based
on the Hawthorne effect.
Placebo Effect
The participants in an experimental treatment group may believe
the treatment is supposed to change them, so they respond to the
treatment with a change in performance. In a typical study to
determine the effect of a drug, participants are pretested,
administered the drug, and posttested. The researcher wants any
change in the performance of the group from pretest to posttest to
be due to the drug and not due to the fact that the participants
think the drug is effective. In this example, the researcher has a
second group (control group) that goes through the same
procedure (pretest, drug, posttest) as the experimental treatment
group, except the drug the control group receives is a placebo, an
inert substance that can have no effect on the participants.
Assuming the experimental and control groups were equal in
pretest scores and that the psychological response to a drug is the
same for both groups, superiority of the experimental group in
posttest scores is due to the positive effect of the drug.
In some studies, the experimental group receives a treatment
and the control group receives nothing. This seems like a poor
research technique since the control group should be receiving a
placebo if psychological response to a treatment is to be the same
for both groups. The treatment for the control group should be one
that cannot affect the test score used to compare groups at the
end of the study.
CHAPTER 4
Qualitative and Mixed-Methods
Research
Authors’ note: We would like to thank the previous author of this
chapter, Jennifer Waldron of the University of Northern Iowa, for
her outstanding contribution with the original qualitative research
chapter.
This chapter contains information regarding qualitative and
mixed-methods research. After reading this chapter, you should
understand the distinct role of qualitative research and mixed-
methods research traditions in kinesiology. Furthermore, you
should understand data collection strategies for qualitative and
mixed-methods research. The process of interpreting qualitative
and mixed-methods data will be discussed in Chapter 16, “Qualitative
and Mixed-Methods Data Analysis Interpretations.”
OBJECTIVES
Case study
Convergent parallel mixed-methods
Corpus of data
Essence
Explanatory sequential mixed-methods
Exploratory sequential mixed-methods
Grounded theory
Interview
Life history
Mixed-methods research
Paradigm
Point of interface
Probes
Qualitative research
Narrative Strategies
To start, a narrative strategy is a design for inquiry taken from
the humanities. Specific forms include biographical,
autoethnographical, historical, and action (Denzin & Lincoln,
2018). Typically, one or more participants are asked to provide
stories about their lives, and then the researchers employ an
analysis or a specific form to interpret and share the stories related
to the phenomenon under investigation (Creswell, 2014).
Because these accounts drive the content of the narrative, a plot
line, with a beginning, middle, and end, may exist. One recent
example of this type of study examined the personal story of a
former university track and field athlete with a strong athletic
identity as she transitioned out of her university sport. Authors
conducted seven life history interviews (compilations of data that
cover the lives of individuals or that result from one or more
individuals providing stories about their lives), analyzed each using
a dialogical narrative analysis, and utilized the structure of her
sport context to explain her sport life story. Results suggested that
the extra support and services provided to her as a university
athlete made the transition out of her sport more challenging and
led to feelings of isolation and helplessness (Jewett, Kerr, &
Tamminen, 2018).
Phenomenological and
Ethnomethodological Strategies
Another common category of qualitative strategy,
phenomenological and ethnomethodological, is interested in the
mutual, shared experiences of people. With the former,
researchers assume an essence (i.e., core meaning) or presence
of shared experiences and attempt to better understand the
phenomenon of lived experiences through in-depth interviews or
observations. With the latter, ethnomethodological strategies,
researchers seek to understand an individual’s understanding and
construction of social experiences, social action, social lives, and
culture (Creswell, 2014).
During this process, researchers explore their own personal
beliefs about the phenomenon (and may bracket them in
transcripts), so that personal beliefs do not interfere with the data
collection and analysis. For example, Grace and Butryn (2019)
conducted semistructured interviews with eight marathoners and
uncovered seven themes. These were organized temporally
around their Boston marathon experience in 2013 (the year of the
attack) and subsequent experiences running in the Boston
marathon after 2013. Themes across time included achieving a
goal (unattainable in 2013), an ability to demonstrate their
resilience, and satisfaction in showing the attackers defiance.
Authors were also able to provide implications for practitioners and
suggestions to help others with trauma experiences in sport
(Grace & Butryn, 2019).
Case Studies Strategies
The third type of strategy, qualitative case studies, explores a
singular case or bounded system (e.g., person, program, event, or
activity) over time. If the answer to the question “Is there a limit
to participants?” is yes, then the phenomenon is bounded, and a
case study may be an appropriate methodology. When
undertaking this type of research, in-depth data collection typically
occurs, with multiple sources of information (Creswell, 2014). For
example, Kerr and MacKenzie (2014) conducted research on an
expert skydiver by soliciting his reflections on different skydiving
experiences using a side-by-side, cooperative process between
the researcher and participant. Findings were generated
inductively from the participant’s accounts rather than deduced
from existing theory. Results from this unique
study indicated that the participant did not engage in skydiving
simply for the thrill of taking risks but instead for personal
development and improvement (Kerr & MacKenzie, 2014).
Interviews
An interview is a purposeful conversation employed to gather
information. Interviews are the key means of collecting data when
researchers are unable to observe the phenomenon. When
interviewing, the researcher’s primary role is to allow the
participants to converse and discuss their experiences in detail.
Researchers may use individual interviews (i.e., interviewing one
person at a time) or focus group interviews, where a small group
of participants is interviewed together. In both cases, interviews
may take place in person, by video (e.g., Skype, Zoom), by
telephone, or through other electronic means (e.g., an open-ended
survey).
Interview guides and questions
Because interviews are purposeful conversations, researchers
must generate and structure questions that are based on the
scholarly literature and the purpose of the research study. After
considering these two sources, researchers should create an
outline of the broad categories relevant to the study and then
create specific questions. Further development of the interview
guides should continue from the researcher’s reflection,
understanding, and personal experiences associated with the
topic. The researcher also may consider including areas not found
in scholarly literature. Strategies for revising the
interview guide begin with asking experts in the area (e.g.,
scholars, clinicians) to review the list of questions for appropriate
content. A second strategy is to conduct a small pilot study
(utilizing the vetted interview guide) with several participants. This
will help uncover other related areas of participants’ interests that
should be added to the interview guide. A final strategy is simply to
begin collecting data and amend the interview guide as needed as
the process unfolds and researchers learn more about the
effectiveness of the interview questions. As part of the flexible and
emergent nature of the qualitative research design, researchers
often employ this last strategy as they evaluate the interview guide
and adapt it as needed throughout the data collection process
(Patton, 2014).
Interview questions
After selecting the appropriate type of interview guide, researchers
should develop and arrange the questions. Initial questions should
be mild and broad (e.g., “Describe your college coaching
experiences.”). These allow participants to become comfortable
discussing personal experiences with the researcher, and they
also lay the foundation for the process of building trust between
interviewer and participant. Rapport continues to build during
interviews and across interviews when the interviewer
demonstrates empathy and understanding without judgment. To
this point, it may be useful to limit initial questions (often
demographic in nature), so participants do not lose interest or
enthusiasm. Instead, researchers can distribute these types of
questions throughout the interview, save them for the end of the
interview, or send them separately to the participant via a short
online survey (Patton, 2014).
Good questions are open-ended, neutral, singular, and clear
(Patton, 2014). First, open-ended questions, those that
participants should not be able to answer with a yes-or-no
response, are recommended during interviews; these increase
discussion. Second, neutral questions, those that do not lead the
participant toward a desired answer, are preferred. An example of
a contraindicated leading question might be: Why was the Wingate
Anaerobic Cycle Test hard to perform? Phrased this way, the
question assumes that the test was difficult, although the
participant may not share that view. The question is
also affectively worded, and this can produce negative emotional
responses in the participant. Phrasing questions to begin with
“why” may produce an upsetting response because of its punitive
connotation. Therefore, researchers are encouraged to use
verbiage such as “how come” as an alternative.
Third, questions should be singular, so the participant can
focus on a single response. Double-barreled questions require
responses to two different issues in a single question, and
complex questions can confuse the participant. Overall, inquiries
should be kept brief and concise, so participants hear the question
in its entirety. For example, asking college students: What does it
mean to be thin and fit to you? contains two different topics of
interest (i.e., weight and fitness). It is more effective to separate
questions such as these into two stems, so participants can
respond successfully to each. Finally, questions should be clear.
Expert review and pilot testing, as discussed earlier, should help in
the clarification process. Once trust and rapport are present, it
may be possible to ask more controversial and specific questions
about the phenomenon (Patton, 2014).
Furthermore, within the interview guide, researchers should
create and employ different types of questions. Central (essential)
questions access information of primary focus to the study and
allow the participants to tell their story about a phenomenon on
their terms. Similar questions (sometimes referred to as extra
questions) may ask about the same issue in a different way to
check the reliability or consistency of participant responses. If
discrepancies arise between answers to similar questions,
researchers may need to continue questioning participants until
consistency and a deeper understanding are present.
In general, the array of questions within an interview guide can
be vast, spanning all of the following areas of inquiry:
(a) experience and behavior; (b) opinions, values, and belief
systems; (c) feelings and emotions; (d) knowledge; (e) sensory
experiences; and (f) background demographics (Patton, 2014).
Common question types address hypothetical situations (or
hypothetical cases), ideal positions, and probes, and may also, at
times, present devil’s advocate views. With the first type,
hypothetical questions, participants answer questions that support
theoretical outcomes that may or may not occur. They often begin
with a verb such as “suppose” (e.g., Suppose you were describing
the procedure for body composition testing using an underwater
weighing technique, what parts of the process would you highlight
as most important?). These help the researcher obtain a
description without having to experience the phenomenon
physically. The second type, ideal position questions, addresses
the positives and negatives of a phenomenon by probing about an
ideal situation associated with the phenomenon: What might a
positive management plan look like?
In addition, it is likely that some of the topics of the interview
will be controversial or potentially threatening to the participants.
One way to begin to address these types of topics is via asking
devil’s advocate questions. These typically start with “Some
people would say …” This mechanism depersonalizes the issue,
and the response typically reflects participants’ opinions. For
example, a question such as, “Some people would say that
parents should teach their children about nutrition and healthy
eating, while other people may say that schools can help in
teaching children nutrition and healthy eating. What do you think
the school’s role is in teaching students about nutrition?” allows
the participant to log an authentic response.
Finally, one last category of questions employed during a
typical interview process is probes (also called prompts). Probes
are interpretive questions designed to elicit more information,
detail, and elaboration from the participant. Probes for clarification,
contrast, eliciting details, elaboration, and verbal/nonverbal cues
are commonly used (Patton, 2014). Probes can be
impromptu or planned. Impromptu prompts differ
from interview to interview; they can include raising
eyebrows at the end of a comment, repeating a key word, or
asking the meaning of a specific word or phrase. Planned prompts
are part of the structured interview guide and are used to
encourage participants to discuss issues that do not readily come
to their minds. Clarification probes communicate to the participant
that the researcher desires more information. For example, the
interviewer might state, “I am not sure I understood what you were
saying. Could you please say more about that?” Researchers
typically employ clarification probes naturally as needed during
interview conversations (Patton, 2014).
Other types of probes provide contrast. For example, the
interviewer might state, “How does that compare to this?” This
helps define the boundaries of the participant’s response. Detail
information probes seek more information, such as who, when,
where, why, and how. For example, the interviewer might ask,
“Who else was present?” or “When was the first time that you were
in this situation?” Elaboration probes request more information
directly, such as “Could you please expound on that?” Overall,
however, the best type of cue to encourage participants to provide
further details of the phenomenon might be silent probes. Those
include such behaviors as the researcher nodding his or her head
in agreement or providing certain hand gestures. To finish, all
interviews should end with a question, such as “Would you like to
share anything else on this topic at this time?” or “What could I
have asked you that I may not have thought of to ask?”
Concluding questions such as these allow the participant to
expand on ideas or contribute relevant content not previously
addressed during the interview (Patton, 2014).
Observations
In addition to interviews, another common means of qualitative
data collection is observation. This technique allows researchers
to observe what is occurring in a social situation, conversation,
social interaction, or event, as well as within a pattern of behavior.
Observations are recorded in detail in the form of field notes. One
type, participant observation, involves the researcher immersing
herself or himself in the daily lives and routines of those being
studied. When observations such as these take place, it is referred
to as fieldwork. As an example, Hewitt and colleagues (2018)
conducted an observational study across three countries
(Australia, the United States, and Canada) comparing childcare
centers’ adherence to screen time
recommendations from the American Academy of Pediatrics
(AAP) and the 2011 policy statement from the Institute of Medicine
Early Childhood Obesity Prevention Policies (now National
Academy of Medicine [NAM]). Compliance with specific policies
varied by country, but overall the highest areas of compliance
among childcare centers were discouraging screen time,
encouraging indoor free movement under adult supervision,
providing tummy time for infants, and encouraging infants to participate in
physical activity. Low compliance was observed for use of
equipment that limits infant movement and in education for
families about physical activity (Hewitt et al., 2018).
Document Analyses/Photovoice-Visual
Voice/Art-Based Inquiry
Aside from interviews and observations, another method of data
collection is examining documents and audiovisual materials such
as photographs and videos (including those taken by participants),
personal letters, public documents, medical records, and
art. Genoe and Dupuis (2013), for example, utilized interviews,
participant observations, and photovoice to explore meanings of
leisure for persons with dementia. Photovoice is a participatory
action research method where participants use a camera to
communicate thoughts and reflections of their daily lives through
photography. This technique allows researchers to see the lives of
participants through their own lenses. In this study, the authors
asked two men and two women in the early stages of dementia to
participate in the photovoice experiences about their leisure. The
participants’ images were later used during interviews to help them
explain their experiences. Despite challenges (e.g., teaching the
participants how to use the cameras), photovoice aided in cueing
memories, planning for the interview, sharing stories, and
capturing meaning. In addition, participants found that as their
cognitive abilities changed due to memory loss, there were
corresponding changes in their ability to participate in leisure
activities. Leisure was still valued by the participants as a method
of coping with changes they were experiencing, and it helped slow
down the process of dementia while they created new memories
for themselves and families (Genoe & Dupuis, 2013).
Data Coding
Because coding should be an ongoing process throughout data
collection and analysis, it is briefly mentioned here. A full
discussion of data analysis and interpretation for qualitative and
mixed-methods research is available in Chapter 16, “Qualitative and Mixed-
Methods Data Analysis Interpretations.” To begin, data collection
and analysis are processes that proceed simultaneously
(informally) and sequentially. Typically, data analysis consists of
generating, developing, and verifying themes, and then building
potential theories via inductive and comparative processes
(Patton, 2014). As such, researchers approach a topic with
curiosity and strive to remain open to the unfolding of more
specific research questions. Additionally, inductive processes
move from specific, individual cases to those that are more
general in nature. By examining individual experiences and then
creating relevant themes, qualitative and mixed-methods
researchers work to understand human phenomena (Patton,
2014). Transcripts, field notes, documents, audiovisual
materials, and art should all be examined for
emerging themes as they are reviewed. Coding of these materials and ideas can be
accomplished with paper and pencil, in spreadsheets such as
Microsoft Excel, and also with the help of qualitative data software
(e.g., NVivo or HyperRESEARCH).
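As a minimal illustration of this tallying step, a few lines of Python can count how often each code appears once transcript segments have been labeled. The segments and code labels below are hypothetical, not drawn from any cited study:

```python
from collections import Counter

# Hypothetical coded transcript segments: (segment text, assigned code).
# In practice, the code labels come from the researcher's evolving codebook.
coded_segments = [
    ("I felt more confident after the training", "CONFIDENCE"),
    ("The online modules were easy to follow", "MODULE USABILITY"),
    ("I wasn't sure how to probe student thinking", "QUESTIONING"),
    ("The modules fit around my schedule", "MODULE USABILITY"),
]

# Tally how often each code was applied across the transcript.
code_counts = Counter(code for _, code in coded_segments)
for code, n in code_counts.most_common():
    print(f"{code}: {n}")
```

Counts like these are only a starting point; in qualitative analysis the frequencies guide the condensing of initial codes into broader themes rather than serving as results in themselves.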
Examples of initial coding using the software program
HyperRESEARCH are presented in FIGURE 4.1 and FIGURE 4.2.
This project investigated blended professional learning for college
tutors/mentors in the Advancement Via Individual Determination
(AVID) program at an urban school district in the Southwestern
United States (Garcia, 2018). The project augmented the current
tutor training system in the district with online modules. Figure 4.1
displays initial coding from an interview transcript from an
interview with an AVID tutor (note that initial coding often includes
many categories that are later condensed). In Figure 4.1, the
round one coding includes 23 codes. Figure 4.2, from the same
study, displays a document analysis (whiteboard lesson on gas
laws in science). The pictured lesson was facilitated by a trained
AVID tutor and presented to other AVID tutors as part of
professional development. The lesson activity was coded as
“STEPS ACC,” indicating that the steps taken by the tutor were
acceptable. The findings suggested that the blended
professional learning model may be beneficial for AVID tutors and
students.
CHAPTER 5
Historical Research and
Action Research
This chapter contains information concerning the historical
research and action research approaches. You should become
familiar with the characteristics, terms, and techniques of each.
OBJECTIVES
Action planning
External criticism
Internal criticism
Participatory action research
Practical action research
Primary source material
Secondary source material
Biographical Research
Yet another facet of historical research is the use of
biographical or autobiographical sources. Biographical historical
research offers professionals in kinesiology a further avenue
for insight into the background of the
field. Studying the life, career, and contributions of former leading
scholars, teachers, coaches, and administrators can provide a
better understanding of the philosophies and movements that are
the foundation of current thought and activity in the field of
kinesiology. Appropriately, the subjects of biographies have been
leaders with outstanding professional credentials, their legacies
preserved through careful and objective evaluation of their
contributions to their respective fields. At the same time, the
heritage of each kinesiology field is preserved and maintained.
To that end, biographical research emphasizes the use of
primary source materials and applies external and internal
criticism to evaluate the data. The biographer encounters many of
the problems typical of most historical research. The following list
represents some of the items that can become problematic in a
biographical study: (a) faulty memories of the primary source
people, (b) inability to interview a primary source, (c) failing to
discover a potentially strong primary source person or document,
and (d) responses of a living subject of the biography could be
biased and could affect those of other primary source people.
Grosshans (1975) completed an excellent biography on
Delbert Oberteuffer and his contributions to the fields of health and
physical education. In addition to extensive personal interviews
with Oberteuffer, she talked to his family members, childhood
friends, fellow students, professional colleagues, and former
employees. In determining the worth of the data obtained from
these individuals, Grosshans applied questions, such as the
following: (a) Was the position, location, or association of the
contributor favorable for observing the conditions he or she
reported?, (b) Did emotional stress, age, or health conditions
cause the contributor to make faulty observations or an inaccurate
report?, (c) Did the contributor report on direct observation,
hearsay, or borrowed source material?, (d) Did the contributor
write the document at the time of the observation, or weeks or
years later?, (e) Did the contributor have biases concerning any
person, professional body, period of history, old or new teaching
methods, educational philosophy, or activity that influenced his or
her writing?, (f) Did the contributor contradict himself or herself?,
and (g) Are there accounts by other independent, competent
observers that agree with the report of the contributor?
(Grosshans, 1975).
Ethics in Research
Ethics in All Research Studies
All research studies include ethical considerations, including
historical research. Researchers need to show that scores
produced by instruments used in studies are reliable and valid in
the samples being studied and that qualitative data being used are
trustworthy, believable, and credible. Similarly, researchers need
to show that historical data meet the criteria of external and
internal criticism and that the data have been verified as historical
facts. Researchers provide competent performance in the design,
implementation, and reporting of studies. Researchers show
respect for all people and their property (including those
individuals who are no longer with us). Respect includes security,
dignity, and self-worth of the individuals, their property, as well as
the greater community and public. Finally, integrity and honesty
are used in everything that the researcher does, from developing a
research question to publishing a historical paper or research
report (Mertens, 2015).
CHAPTER 6
Epidemiological Research
The authors would like to acknowledge Drs. Richard A. Washburn,
Rod K. Dishman, and Gregory Heath for the contribution of
content from their “Epidemiology and Physical Activity” chapter in
the book, Measurement Theory and Practice in Kinesiology, edited
by Drs. Terry M. Wood and Weimo Zhu in 2006.
While most epidemiological research has been conducted in
the public health and medical fields, it also has a long history in
kinesiology. In fact, the first major study of Dr. Jeremy Morris, widely
regarded as the founding father of physical activity epidemiology, was related to physical activity, in
which he found sedentary London bus drivers had a higher
probability of suffering coronary disease than did their more active
fellow bus conductors, who had taken approximately 750 extra
daily stair steps. More recently, epidemiological research has
become one of the most popular research approaches in kinesiology
for estimating the relationships among physical activity, fitness, and
chronic diseases. This chapter introduces epidemiological
research with a focus on its definition, its history, and commonly
used research designs and statistics.
OBJECTIVES
Attributable risk
Case-control studies
Cross-sectional survey
Incident cases
Incidence rates
Odds ratio
Prevalence rates
Prevalent cases
Prospective cohort studies
Randomized controlled trial
Relative risk
Cross-Sectional Surveys
The cross-sectional survey, sometimes called a prevalence
study, measures both risk factors and the presence or absence of
disease at the same point in time. While this approach is
expedient and relatively inexpensive, it is not possible to
determine the temporal relationship of a potential cause-and-effect
relationship. For example, the Iowa Farmers Study (Pomrehn,
Wallace, & Burmeister, 1982) used a cross-sectional design to
examine the association of physical activity and all-cause
mortality. The study examined 62,000 all-cause deaths occurring
from 1962 to 1978 in male residents of Iowa from 20 to 64 years of
age. A randomly selected group of 95 farmers was compared with
158 men who lived in a city. Farmers had a 10% lower rate of
death, and they were twice as likely to participate in strenuous
physical activity. They also were more physically fit, as determined
by lower exercise heart rate and longer endurance time on a
treadmill test. Because this was a cross-sectional survey, it is
possible that healthy and fit individuals self-selected to be farmers
(or that men living in the city were less healthy and fit to begin
with) and that the activity level of the farmers had little to do with
the observed reduction in mortality rate.
Cross-sectional surveys can be useful for generating
hypotheses regarding potential risk factors/disease associations
and also for the assessment of the prevalence of risk factors or
behaviors in a defined population. For example, the CDC, in
cooperation with state health departments, conducts the
Behavioral Risk Factor Surveillance System
(www.cdc.gov/brfss/index.html) each year to determine the
prevalence of several disease risk factors, including smoking,
physical activity, and sedentary behavior. A specific type of cross-
sectional survey, called an ecological study, is used to compare the
frequency of some risk factor of interest (e.g., sedentary behavior)
with an outcome measure like obesity in the same geographical
region. For example, surveys conducted in a particular state may
find high rates of sedentary behavior as well as high rates of
obesity. Although this type of information may suggest the
hypothesis that sedentary behavior results in obesity, drawing that
conclusion is unjustified. Data from this type of survey should
never be used to make any conclusion regarding cause and effect
because there is no information regarding whether the individuals
who are sedentary are the same individuals who are obese. That
problem is referred to as the ecological fallacy.
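The ecological fallacy can be illustrated numerically with invented individual-level data: a region can show high rates of both sedentary behavior and obesity even though, at the individual level, the sedentary people are largely not the obese people.

```python
# Hypothetical individuals in one region: (is_sedentary, is_obese).
people = [
    (True, False), (True, False), (True, False),    # sedentary, not obese
    (False, True), (False, True), (False, True),    # obese, not sedentary
    (True, True),                                   # both
    (False, False), (False, False), (False, False), # neither
]

n = len(people)
pct_sedentary = sum(s for s, _ in people) / n  # region-level rate: 40%
pct_obese = sum(o for _, o in people) / n      # region-level rate: 40%

# Region-level view: both rates are high, tempting a causal conclusion.
print(f"sedentary: {pct_sedentary:.0%}, obese: {pct_obese:.0%}")

# Individual-level view: only 1 of the 4 sedentary people is obese.
obese_and_sedentary = sum(1 for s, o in people if s and o)
print(f"sedentary AND obese: {obese_and_sedentary} of {sum(s for s, _ in people)}")
```

Aggregate rates alone cannot reveal whether the sedentary and the obese are the same individuals, which is exactly why cause-and-effect conclusions from ecological data are unjustified.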
Case-Control Studies
In case-control studies, subjects are selected based on the
presence or absence of a disease of interest. After cases and
controls are selected, a comparison of the two groups is made
relative to the frequency of past exposure to potential risk factors
for the disease. In essence, the rate of having the risk factors
between the case and control groups is compared. Risk factor
information is typically obtained by personal interview or a review
of medical records. As will be discussed later in this chapter, the
case-control design does not allow for a direct determination of the
absolute risk of the disease, because incidence rates are not
available when a group of individuals is not followed
over time. However, an estimate of the risk of disease in those
exposed to the risk factor compared to those not exposed can be
calculated. Additional disadvantages of the case-control design
include difficulty in obtaining a truly representative control group,
the inability to study more than one disease outcome at a time,
and the potential for recall bias. In order to obtain a representative
group of controls, which are generally matched with cases on age,
gender, and race, the controls are often obtained from the same
setting as the cases (i.e., the hospital where the cases were
diagnosed or from the same neighborhood where the cases
reside). Investigators often use multiple control groups to increase
the probability of obtaining a representative comparison group.
Another limitation of case-control studies is recall bias, which may
result in a spurious association between a risk factor and disease.
Recall bias is the notion that individuals who have experienced an
adverse event (cancer, heart attack, etc.) may think more about
why they had this problem than healthy individuals and thus might
be more likely to recall exposure to potential risk factors. Errors in
recall can also be a problem in mortality studies, because records
must be available in archives or must be obtained from a witness
to past behavior, e.g., a spouse. Case-control studies are relatively
quick and inexpensive to conduct, are useful for studying rare
disease outcomes, require a relatively small number of subjects,
and allow the study of multiple risk factors. These features make
the design particularly useful for the initial development and testing of
hypotheses, helping to determine whether a more
time-consuming and expensive cohort study or randomized trial is
warranted.
A number of case-control studies are available in the physical
activity epidemiology literature, particularly in the area of physical
activity and cancer. The case-control methodology is ideal for the
study of diseases like cancer, which occur rather infrequently and
have a long latent period between exposure to a risk factor and
actual manifestation of the disease. For example, a group of
investigators in the Netherlands (Verloop, Rookus, van der
Kooy, & van Leeuwen, 2000) reported on a case-control
study of physical activity as a risk factor for breast cancer in women
ages 20 to 54 years. A sample of 918 women who had been
diagnosed with invasive breast cancer between 1986 and 1989
was selected from a cancer registry. Each patient was matched
on age and region of residence with a control subject. Both cases
and controls were interviewed in their home to collect information
on lifetime physical activity and other risk factors, including
reproductive and contraceptive history, family history of breast
cancer, smoking, alcohol use, as well as premenstrual and
menstrual complaints. Control subjects were assigned a date of
pseudo-diagnosis that corresponded with the date on which she
was the same age as her matched-case subject at diagnosis. The
analysis was restricted to risk events that occurred prior to actual
or pseudo-diagnosis. Results indicated that women who were
more active than their peers at ages 10 to 12 were at significantly
reduced risk of breast cancer. Women who had ever
engaged in recreational physical activity at any point prior to
diagnosis were also at significantly reduced risk for breast cancer.
These data add support for the hypothesis that recreational
physical activity decreases the risk of breast cancer in women, but
the results must be interpreted in the light of potential problems
associated with the case-control study design, including recall bias
and nonrepresentativeness of the control group.
Case-Control Study
Recall that, in a case-control study, participants are selected
based on their disease status (present or absent). A 2 × 2 table
illustrating the organization of data from a case-control study is
shown in TABLE 6.6. The analysis in a case-control study is a
comparison of the proportion of cases exposed to a suspected risk
factor [a / (a + c)] with the proportion of controls exposed to the
same risk factor [b / (b + d)]. If exposure to the risk factor is
positively related to the disease, then the proportion of cases
exposed to the risk factor should be greater than the proportion of
controls exposed to the risk factor. In the case-control study, the
only measure of the strength of the association between the risk
factor and disease is the odds ratio [(a × d) / (b × c)]. Conceptually,
the odds ratio is an estimate of the risk of disease, given the
presence of a particular risk factor compared with the risk of
disease if the risk factor is not present. In most instances, the odds
ratio from a well conducted case-control study is a good estimate
of the relative risk that would have been derived from a
prospective cohort study, provided that the overall risk of disease
in the population is low (i.e., less than 5%).
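The 2 × 2 computation described above can be sketched in a few lines of Python; the cell counts a through d are invented for illustration:

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 case-control table.
    a: cases exposed,    b: controls exposed,
    c: cases unexposed,  d: controls unexposed.
    OR = (a * d) / (b * c)
    """
    return (a * d) / (b * c)

# Hypothetical counts: 40 of 100 cases exposed, 20 of 100 controls exposed.
a, c = 40, 60   # cases:    exposed / unexposed
b, d = 20, 80   # controls: exposed / unexposed

prop_cases_exposed = a / (a + c)     # 0.40, i.e., a / (a + c)
prop_controls_exposed = b / (b + d)  # 0.20, i.e., b / (b + d)

print(odds_ratio(a, b, c, d))  # (40 * 80) / (20 * 60) ≈ 2.67
```

Here cases are twice as likely as controls to have been exposed, and the odds ratio of about 2.67 would approximate the relative risk only if the disease is rare in the population, as noted above.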
Confounding
Even when a strong and statistically significant association
between a risk factor and disease is demonstrated, there is always
the possibility that the observed association is noncausal. An issue
referred to in epidemiology as confounding can result in noncausal
associations. Confounding is a situation in which a noncausal
association between exposure and outcome is observed as the
result of the influence of a third variable or group of variables
called confounders. The possibility of confounding is higher in
observational studies, which are more common in physical
activity/fitness epidemiology, than in randomized controlled trials.
This is because the likelihood that the comparison groups (e.g.,
exposed/not exposed) differ with regard to a potential confounding
variable is minimized in the randomized trial as a result of the
random assignment of participants to groups. For example, many
observational studies have shown increased physical activity at
baseline to be associated with lower risk of all-cause mortality.
Does increased physical activity cause the lower mortality risk
or are people with higher levels of physical activity generally in
better health and therefore at a lower risk of mortality? Or, stated
in another way: Does overall health status confound the observed
association between increased physical activity and decreased
mortality? Most well conducted observational studies use
statistical techniques to adjust, or attempt to adjust, for indicators
of health status such as blood pressure, lipid levels, obesity,
cigarette smoking, etc.; however, the possibility remains that some
confounding (e.g., by nutritional intake and/or mental stress)
still plays a role in the observed association.
It is possible that the measurement of confounders is
imprecise or that potential confounders were not considered.
Nevertheless, many studies report that the association of
physical activity and all-cause mortality still persists after
adjustment for the measured confounders. However, that
association may still be the result of confounding, as most studies
do not measure all potential confounders. For example, other
diseases that may affect both physical activity and mortality, such
as diabetes or depression, are often not taken into consideration.
That points to the fact that it is difficult for any single observational
study to account for all potential confounders and therefore
provide a definitive answer relative to causality. In the study of
physical activity and chronic disease risk, the accumulation of
results over numerous investigations, conducted in different
samples and measuring different confounding variables, adds
weight to the evidence for a causal association between physical
activity and chronic disease risk.
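One classic statistical technique for the kind of adjustment described above (offered here as a general illustration, not as the method of any cited study) is the Mantel-Haenszel summary odds ratio, which pools stratum-specific 2 × 2 tables; the stratified counts below are invented:

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across confounder strata.
    Each stratum is a 2x2 table (a, b, c, d):
    a cases exposed, b controls exposed, c cases unexposed, d controls unexposed.
    OR_MH = sum(a*d/n) / sum(b*c/n), with n = a + b + c + d per stratum.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical tables stratified by a confounder (e.g., age group).
young = (10, 20, 30, 60)   # within-stratum OR: (10*60)/(20*30) = 1.0
old   = (40, 10, 40, 10)   # within-stratum OR: (40*10)/(10*40) = 1.0

# Pooling the strata first (ignoring the confounder) would give a spurious
# crude OR of (50*70)/(30*70) ≈ 1.67, illustrating confounding.
print(mantel_haenszel_or([young, old]))  # 1.0: no association after adjustment
```

In this contrived example the crude analysis suggests an association, while the adjusted estimate of 1.0 shows the apparent effect was driven entirely by the confounder.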
Effect Modification
The issue of effect modification, or interaction, also needs to be
considered when evaluating the possibility of causal associations.
Effect modification in epidemiology describes a situation where
two or more risk factors modify the effect of each other with regard
to the occurrence of an outcome. A minimum of three factors is
required for effect modification to occur. For dichotomous
variables, effect modification means that the effect of the exposure
on the outcome depends on the presence of another variable. For
example, the effect of physical activity categories
(sedentary/active) on the presence or absence of lung cancer may
differ depending on a third factor, such as gender. If that were the
case, gender would be termed an effect modifier. Effect
modification can also be evaluated with continuous variables by
determining the effect of exposure on outcome depending on the
level of a third variable. For example, the association between physical
activity assessed by questionnaire (kcal/week) and the risk for CHD may
vary depending on the level of body mass index [BMI = body
weight (kg) / height (m²)]. If so, then BMI would be considered an
effect modifier in the association between activity and CHD.
Determining the degree to which certain factors act as effect
modifiers can provide important information relative to the
development of preventive strategies. For example, if it were
demonstrated that individuals with a high BMI are at higher risk of
lung cancer due to low physical activity compared to those with
normal or low BMI, it would suggest that strategies specifically
designed to increase physical activity in individuals with high BMI
should be developed and evaluated.
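One simple way to probe for effect modification is to compute the measure of association separately within strata of the candidate modifier. The Python sketch below is ours, not the text's, and all counts are invented for illustration (exposure = low physical activity):

```python
def relative_risk(a, b, c, d):
    """RR = risk in the exposed / risk in the unexposed.
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

# Hypothetical stratum-specific 2 x 2 tables, one per BMI stratum:
high_bmi = (30, 70, 10, 90)    # RR = 0.30 / 0.10 = 3.0
normal_bmi = (12, 88, 10, 90)  # RR = 0.12 / 0.10 = 1.2

rr_high = relative_risk(*high_bmi)
rr_normal = relative_risk(*normal_bmi)
# Markedly different stratum-specific RRs (3.0 vs. 1.2) are the
# stratified-analysis signature of effect modification by BMI.
```

If the stratum-specific estimates were similar instead, a single pooled estimate would summarize the association adequately.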
The potential for confounding and effect modification increases
the difficulty in the interpretation of epidemiological studies on
physical activity and health. For example, age can be a
confounder; when people get older, their physical activities may be
reduced and their risk for death and most chronic disease may be
increased. Therefore, to examine the association between
physical activity and health outcomes, the impact of age needs to
be removed. On the other hand, age may also change the
magnitude of risk associated with other variables and, therefore, it
may also be considered an effect modifier. For example, the risk of
lung cancer increases with increased age, increased number of
cigarettes smoked, and decreased levels of physical activity.
However, age is also associated with decreased physical activity
and an increased length of time cigarettes have been smoked, if the
subject is a smoker. Therefore, to determine if there is an
association between physical activity and lung cancer risk, the
effects of both age and cigarette-smoking history must be
controlled. In epidemiological studies, this type of control is
typically done by statistical analysis. When reading and
interpreting information relative to the association of physical
activity and health outcomes, it is important to be aware of the
potential consequences of confounding and effect modification
and to determine if those issues have been satisfactorily
addressed.
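The statistical control mentioned above can be illustrated with a stratified analysis. The sketch below is not from the text; all counts are invented. It pools stratum-specific 2 × 2 tables with the Mantel-Haenszel relative risk, one common way to adjust for a confounder such as age:

```python
def relative_risk(a, b, c, d):
    """Crude RR from one 2 x 2 table (exposed/unexposed x case/non-case)."""
    return (a / (a + b)) / (c / (c + d))

def mantel_haenszel_rr(strata):
    """Mantel-Haenszel relative risk pooled over confounder strata.
    Each stratum is (a, b, c, d): exposed cases, exposed non-cases,
    unexposed cases, unexposed non-cases."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * (c + d) / n
        den += c * (a + b) / n
    return num / den

# Invented age-stratified tables (exposure = low physical activity).
# Older subjects are both more often exposed and at higher baseline
# risk, so age confounds the crude association.
young = (2, 18, 8, 172)
old = (45, 105, 5, 45)

crude = relative_risk(2 + 45, 18 + 105, 8 + 5, 172 + 45)
adjusted = mantel_haenszel_rr([young, old])
# The adjusted RR is noticeably smaller than the crude RR, showing
# how statistical control removes part of the apparent association.
```

In published studies this adjustment is more often done with regression models, but the stratified calculation makes the logic of "removing the impact of age" visible.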
Summary of Objectives
1. Be familiar with the history of epidemiological research in
kinesiology. The field of kinesiology has a long and rich
history in conducting epidemiological research. Scholars like
Drs. Jeremy Morris and Ralph Paffenbarger, and studies like
the Harvard alumni study and Cooper Center longitudinal
study, have made significant contributions to our understanding
of physical activity/fitness and public health. Documents
such as the 1996 surgeon general's report on Physical Activity
and Health and the 2008 and 2018 Physical Activity
Guidelines provide a historical overview of
physical activity recommendations.
2. Understand basic epidemiological measures. Incident
cases and incidence rates, and prevalent cases and prevalence
rates, are the basic epidemiological measures.
3. Understand commonly used research designs in
epidemiological studies. Commonly used research designs in
epidemiological studies include cross-sectional surveys, case-
control studies, prospective cohort studies, and randomized
controlled trials, and each has its advantages and
disadvantages.
4. Be familiar with commonly used ways to summarize data
and statistics in epidemiological studies. A 2 × 2 table is
often used to summarize the epidemiological data, and
statistics, such as relative risk, odds ratio, and attributable risk,
are used to analyze the data.
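As a concrete illustration of objective 4, the sketch below computes the three statistics named above from a single hypothetical 2 × 2 table (the counts are invented, not taken from any study cited in the chapter):

```python
# One 2 x 2 table (hypothetical counts):
#                 disease   no disease
# exposed            a=40        b=160
# unexposed          c=10        d=190
a, b, c, d = 40, 160, 10, 190

risk_exposed = a / (a + b)    # 40/200 = 0.20
risk_unexposed = c / (c + d)  # 10/200 = 0.05

relative_risk = risk_exposed / risk_unexposed      # 4.0
odds_ratio = (a * d) / (b * c)                     # 4.75
attributable_risk = risk_exposed - risk_unexposed  # 0.15

print(relative_risk, odds_ratio, attributable_risk)
```

Note how the odds ratio (4.75) approximates the relative risk (4.0) only loosely here; the approximation improves as the outcome becomes rarer.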
References
Blair, S. N., Kohl, H. W., Paffenbarger, R. S., Clark, D. G., Cooper, K. H., & Gibbons, L.
W. (1989). Physical fitness and all-cause mortality: a prospective study of healthy men
and women. Journal of the American Medical Association, 262, 2395–2401.
Dishman, R. K., Heath, G., & Lee, I-M. (2013). Physical activity epidemiology (2nd ed.).
Champaign, IL: Human Kinetics.
Dishman, R. K., Washburn, R. A., & Heath, G. (2004). Physical activity epidemiology.
Champaign, IL: Human Kinetics.
Fletcher, G. F., Blair, S. N., Blumenthal, J., Caspersen, C., Chaitman, B., Epstein, S., …
Pina, I. L. (1992). AHA statement on exercise: Benefits and recommendations for
physical activity programs for all Americans. Circulation, 86, 340–344.
Lee, I-M. (Ed.) (2009). Epidemiologic methods in physical activity studies. New York:
Oxford University Press.
Morris, J. N. (1957). Uses of epidemiology. Edinburgh: Livingstone.
Morris, J. N., Heady, J. A., Raffle, P. A., Roberts, C. G., & Parks, J. W. (1953). Coronary
heart-disease and physical activity of work. Lancet, 262(6795), 1053–1057.
Morris, J. N., & Raffle, P. A. B. (1954). Coronary heart disease in transport workers: A
progress report. British Journal of Industrial Medicine 11, 260–264.
Pate, R. R., Pratt, M., Blair, S. N., Haskell, W. L., Macera, C. A., Bouchard, C., …
Wilmore, J. H. (1995). Physical activity and public health: A recommendation from the
CDC and ACSM. Journal of the American Medical Association, 273, 402–407.
Pomrehn, P. R., Wallace, R. B., & Burmeister, L. F. (1982). Ischemic heart disease
mortality in Iowa farmers. The influence of lifestyle. JAMA, 248(9), 1073–1076.
Powell, K. E., & Paffenbarger, R. J. (1985). Summary of the workshop on epidemiologic
and public health aspects of physical activity and exercise. Public Health Reports 100,
172–180.
U.S. Department of Health and Human Services. (1996). Physical activity and health: A
report of the Surgeon General. Atlanta, GA: U.S. Department of Health and Human
Services, Centers for Disease Control and Prevention, National Center for Chronic
Disease Prevention and Health Promotion.
U.S. Department of Health and Human Services. (2008). Physical Activity Guidelines
Advisory Committee report. Washington, DC: Author.
Verloop, J., Rookus, M. A., van der Kooy, R., & van Leeuwen, F. E. (2000). Physical
activity and breast cancer risk in women aged 20–54 years. Journal of the National
Cancer Institute, 92(2), 128–135. doi:10.1093/jnci/92.2.12
Washburn, R. A., Dishman, R. K., & Heath, G. (2006). Epidemiology and physical
activity. In T. M. Wood & W. Zhu (Eds.). Measurement theory and practice in
kinesiology (pp. 273–295). Champaign, IL: Human Kinetics.
PART TWO
The Research Process
CHAPTER 7 Understanding the Research Process
The second part of this text outlines and describes the research
process.
In Chapter 7, “Understanding the Research Process,” we
discuss (a) the various steps in the research process, (b) how
research questions are initiated, selected, and defined, and (c) the
concept of variables.
In Chapter 8, “Reviewing the Literature,” we detail (a) the
importance of the literature review in the research process, (b)
methods for reviewing and evaluating the literature, and (c)
sources of literature pertinent to kinesiology.
In Chapter 9, “Reading and Evaluating Research Reports,” we
describe (a) the typical format for a research report and (b)
evaluative criteria for reading and judging the quality of research
reports, as well as reading and understanding systematic reviews
and meta-analysis studies.
In Chapter 10, “Developing the Research Plan & Grant
Application,” we describe (a) the various research approaches
common in kinesiology, (b) the role of hypotheses in research, (c)
techniques for collecting data, (d) criteria for selecting or
developing data collection instruments, and (e) writing grant
applications.
In Chapter 11, “Ethical Issues in Research,” we address (a) the
history of research involving human participants that led to the
formation of ethical standards for the conduct of research, and (b)
those standards and guidelines most applicable to research in
kinesiology.
In Chapter 12, “Recruiting and Selecting Research
Participants,” we discuss (a) the importance of selecting
participants appropriate for the research, (b) the concepts of
population and sample, (c) various methods for selecting research
participants, and (d) sample size.
CHAPTER 7
Understanding the Research
Process
Beginning researchers often have a misconception of what
research entails. Through no fault of their own, they simply do not
understand the nature of research and the concepts underlying the
scientific method. A basic understanding of the types of research
questions and the different types of variables associated with
research is requisite to designing a research study. The emphasis
of this chapter is on developing an understanding of the research
process and the various steps involved.
CHAPTER OBJECTIVES
Assumptions
Delimitations
Dependent variable
Descriptive questions
Difference questions
Distilling
Extraneous variables
Independent variable
Limitations
Purpose statement
Relationship questions
Research process
Research question
Variable
Memmert, D., Almond, L., Bunker, D., Butler, J., Fasold, F.,
Griffin, L., … Furley, P. (2015). Top 10 research questions
related to teaching games for understanding. Research
Quarterly for Exercise and Sport, 86(4), 347–359.
doi:10.1080/02701367.2015.1087294
Delimitations
In research circles, delimitations refer to the scope of the study.
The delimitations of a study set the boundaries for the research, in
effect, “fencing it in.” For example, delimitations include
characteristics of a study such as (1) type of research participant,
(2) number of research participants, (3) measures to be collected,
(4) instruments utilized in the study, (5) time and duration of the
study, (6) setting, and (7) type of intervention or treatment. These
are the parameters of a study that the researcher can control.
Identifying the delimitations of a study is actually part of the
process of defining (distilling) the problem. As the researcher
narrows the question down to a researchable problem, many of
the delimitations are established. Others are identified as the
research plan and methodology are developed. In fact, an astute
reader will be able to recognize important delimitations in a well-
written purpose statement. For instance, consider the following
purpose statement:
Limitations
In research terminology, limitations refer to weaknesses of the
study. All studies have some weaknesses because compromises
frequently have to be made in order to conform to the realities of a
situation. Limitations are those things the researcher could not
control, but that may have influenced the results of the study. The
reader of a research report should always know at the outset
those conditions of the study that could reflect negatively on the
work in some way. Only those things that might affect the
acceptability of the research findings should be reported. The
researcher will, of course, try to eliminate extremely serious
weaknesses before the study commences. The process of
carefully defining (distilling) the research problem necessarily
involves identifying the major limitations associated with the
research. Additional limitations may be identified as the researcher
develops the research plan and then actually proceeds to
implement it. Among the items that typically involve limitations to a
study are the following:
The research approach, design, method(s), and techniques
Sampling problems
Uncontrolled variables
Faulty administration of tests or training programs
Generalizability of the data
Representativeness of research participants
Compromises to internal and external validity
Reliability and validity of research instruments
While the number of limitations identified will vary depending
on the nature of the study, the research design, and the
methodology utilized, it is generally recommended that the number
of limitations be less than the number of delimitations presented.
While it is important that the researcher recognize, from the outset,
potential weaknesses of the study, an exceedingly long list of
limitations raises serious questions about why so many facets of
the research were not controlled, thus questioning the overall
legitimacy of the research. In published papers, the author
frequently will discuss the limitations of the research and the
implications thereof within the “discussion” section of the
manuscript. For instance, in the study of shuttle run performance
cited earlier (Beets & Pitetti, 2004), the authors note limitations in
the “discussion” section pertaining to (1) sampling procedures, (2)
unfamiliarity with test procedures, (3) the number of different
people administering the tests, and (4) cultural differences.
Sometimes, a separate section of the manuscript titled “limitations”
will present the weaknesses of the research. In a thesis or
dissertation, it is common to see a specific subdivision of the
manuscript called “limitations” in which the author will provide a
listing of the weaknesses or perceived weaknesses of the study.
This is illustrated by the following examples of limitations taken
from the study regarding the Special Olympics:
1. Participants for the study were a convenience sample.
2. The participants may not have understood the research
instrument's statements as they were intended.
3. Participants' previous participation as Special Olympics athletes
may have influenced their responses.
4. Research participants' involvement with organized recreation
programs may have influenced the findings.
5. Participants' family setting may be a potential extraneous
variable that could influence the findings.
Assumptions
Assumptions are conditions that the researcher presumes to be
true. They are conditions that are taken for granted. Assumptions
are derived primarily from the literature, but may also come from
the researcher’s own knowledge of the problem area. In other
words, what does the literature tell the researcher that can be
assumed to be true for purposes of planning the study? The
information gleaned from the literature during the process of
defining (distilling) the problem frequently serves as the basis for
much of the development of the research project, including the
identification of assumptions that should be taken for granted. The
literature provides information about a particular behavior, and
sometimes contains factual evidence explaining the behavior. The
researcher undertakes a study with the assumption that this
information is correct. Assumptions are also made about the way
the instrumentation, procedures, methods, and techniques will
contribute to the study. The researcher may also assume that
certain things that could not be controlled or documented may
have happened in the study. In a study designed to investigate the
effects of participation in the Special Olympics on the self-esteem
of adults with mental retardation, the following assumptions were
made by the author:
1. The test instrument, as modified, was appropriate for the target
population and was a valid and reliable measure of self-
esteem.
2. The participants understood the directions as they were
intended.
3. The participants completed the self-esteem inventory to the
best of their ability.
4. The participants were a representative sample of adults with
mental retardation who reside in the state of Iowa.
5. The interviewers were sufficiently trained and capable of
utilizing the recommended survey procedures.
Additional examples of assumptions that are commonly made
include the following:
All research participants completed the questionnaire honestly
and correctly.
The research participants complied with the investigator’s
request to put forth maximal effort on all trials of the test.
The instructors (or trainers) were capable of utilizing the
prescribed teaching method.
Although published papers frequently do not explicitly state the
underlying assumptions for a research study, a thesis or
dissertation often contains a specific subdivision of the manuscript
that includes a listing of assumptions that were made. This is
illustrated here by the assumptions listed for the study on Special
Olympics participation.
The Concept of Variables
The focus of the researcher’s effort is always on the variable. A
variable is a characteristic, trait, or attribute of a person or thing
that can be classified or measured. Eye color, sex, and church
preference are classifications. Skill, self-concept, strength, heart
rate, and intelligence are variables that can be measured. The
term variable indicates that a characteristic can have more than
one value. There are two genders in the human race, male and
female, and if both appear in a research project, then gender
becomes a variable. When a characteristic does not vary, it is
referred to as a constant. In a study of the manual dexterity of
second-year nursing students, the characteristic, second-year, is a
constant; it does not vary among the student nurses.
Quantitative and Qualitative Variables
Variables can be qualitative or categorical if they are classified
according to some characteristic, attribute, or property. People are
categorized as to sex (male, female), eye color (blue, brown,
green), church preference (Catholic, Protestant, Jewish), and
political affiliation (Republican, Democrat, Independent).
Qualitative variables are usually unmeasurable. Variables are
called quantitative if they can be measured in a numerical sense.
The heights of five starters on a basketball team may be 5'10",
6'2", 6'5", 6'8", and 7', respectively. Their ages range from 18
years, 6 months to 22 years, 3 months. There are two basic types
of quantitative variables, discrete and continuous.
Discrete Variables
This type of variable is usually thought of as being a whole unit,
one that cannot be fractionated or divided into smaller parts.
Examples of discrete variables are football scores, and the
number of correct answers on a test. Football scores are recorded
in whole numbers (e.g., 3, 7, 10, 14) and cannot be divided into
smaller parts, like 7.5 or 11.75.
Continuous Variables
This type of variable can be divided into fractional amounts in
large or small degrees. Strength and endurance scores, track and
field times, height, weight, and girth measures are considered to
be continuous variables. Time in a 100-meter dash is sometimes
measured to the nearest tenth of a second, but in high-level
competition it could be one one-hundredth of a second. The height
of a person could be measured in feet and inches, in half-inches,
or in quarter-inches, depending on the precision desired and the
ability of the measuring instrument to measure height accurately.
Extraneous Variables
While the independent variable is to serve as a stimulus to evoke
a response from the dependent variable, there are usually other
factors (variables) that could possibly cause the same response.
In fact, it is quite likely that there are many possible variables that
could have an effect on the dependent variable. To conclude that
one type of physical fitness training is more effective than another,
the researcher must be confident that other factors did not
produce the same effect. Perhaps the most frequent and most
serious mistake made by the beginning researcher is the failure to
either control or account for the variables that could contribute
error in an experiment. These error-producing variables are
generally referred to as extraneous variables. Although some
authors may also describe error-producing variables as
intervening variables, modifying variables, or confounding
variables, we prefer to use the term extraneous variable to refer to
any factor that is a source of unwanted or error variance. The task
of the researcher is to eliminate or somehow control the potential
influence of extraneous variables. Methods of controlling
extraneous variables are discussed later. The following examples
illustrate the potential effect that extraneous variables could have
on research findings.
A researcher studied the effect of motivation (independent) on
the physical fitness test scores (dependent) of sixth-grade boys
and girls. The two motivating conditions were team competition
and level of aspiration (goal seeking). The subsequent finding that
the group receiving level of aspiration was significantly better than
the group receiving team competition was surprising because the
literature contains much information proclaiming the superiority of
team competition as a motivational vehicle. A detailed check into
the composition of the groups participating under each condition
revealed that they came from classes that were divided into high,
average, and low academic achievement. The sixth graders who
performed under the level of aspiration condition came from the
high achievement category and possessed a mean intelligence
quotient (IQ) of 132. The literature clearly indicates that, in
general, the more intelligent an individual is, the easier it is to get
that person “up,” or motivated. So, IQ, a variable that was not
included in the design of the study, served as an extraneous
variable that slipped into the research situation, interacting with the
dependent variable (fitness score) to make the condition (level of
aspiration) look extremely potent and effective. The researcher
presumed that IQ, while not a planned part of the study, was a
substantive contributor to its results. This variable, like all
extraneous variables, was not manipulated by the researcher.
Many extraneous variables can have a significant influence on
the dependent variable and can invalidate research conclusions.
Other such extraneous variables besides IQ, and depending on
the specific type of experiment, are gender, socioeconomic level,
teacher competence, personality, enthusiasm, physical health,
emotional health, age, and the use of volunteers and intact groups
as research participants. If a researcher conducted a study on the
effect of three different methods of teaching outdoor education
concepts on elementary school students’ appreciation and
knowledge of the outdoors, with one instructor assigned to each
method, instructor personality could play an important role in the
ultimate results of the study. Extraneous variables are often difficult,
if not impossible, to control completely, but a well-designed study
can help to minimize their effect. In the teacher personality
example, the outdoor education teachers could be screened for
personality characteristics; the one with the desired personality
would then be assigned to teach the outdoor education concepts
under the three different methods. Teacher personality would be
the same for all three groups of participants and so, in this case,
the teacher personality variable would be controlled, or
neutralized. However, it is assumed here that the teacher is
equally effective in teaching with each method.
Moderator and Mediator Variables
Moderator and mediator variables were briefly introduced in
Chapter 3, “Experimental Research.” Conceptually, these two
variables belong with the extraneous variables described earlier
because they could affect the relationship of independent and
dependent variables being studied. Baron and Kenny (1986)
provided an excellent description of these two variables in their
classical paper:
Moderator variables: “In general terms, a moderator is a
qualitative (e.g., sex, race, class) or quantitative (e.g., level of
reward) variable that affects the direction and/or strength of the
relation between an independent or predictor variable and a
dependent or criterion variable. Specifically within a
correlational analysis framework, a moderator is a third
variable that affects the zero-order correlation between two
other variables…. In the more familiar analysis of variance
(ANOVA) terms, a basic moderator effect can be represented
as an interaction between a focal independent variable and a
factor that specifies the appropriate conditions for its
operation.” (p. 1174)
Mediator variables: “In general, a given variable may be said
to function as a mediator to the extent that it accounts for the
relation between the predictor and the criterion. Mediators
explain how external physical events take on internal
psychological significance. Whereas moderator variables
specify when certain effects will hold, mediators speak to how
or why such effects occur.” (p. 1176)
Thus, a moderator variable can be considered a variable that
could influence the strength of a relationship between two other
variables, and a mediator variable is a variable that could help
explain the relationship between the two other variables. As an
example, let us consider the relationship between socioeconomic
status (SES) and physical activity participation. Age might be a
moderator variable, in that the relation between SES and physical
activity participation could be stronger for older women and less
strong or nonexistent for teenage girls. Education might be a
mediator variable in that it explains why there is a relation between
SES and physical activity participation. When you remove the
effect of education, the relation between SES and physical activity
participation may become much weaker.
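The SES example can be made concrete with a stratified correlation, the simplest version of the "zero-order correlation" framing quoted above. All data below are invented for illustration:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient, plain-Python sketch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical SES scores (x) and weekly activity minutes (y),
# split by the candidate moderator (age group):
older_x, older_y = [1, 2, 3, 4, 5], [60, 95, 120, 150, 190]
teen_x, teen_y = [1, 2, 3, 4, 5], [150, 90, 160, 100, 140]

r_older = pearson_r(older_x, older_y)  # strong positive relation
r_teen = pearson_r(teen_x, teen_y)     # near zero
# A clearly different SES-activity correlation across age groups is
# what moderation looks like in a stratified (or ANOVA-interaction)
# analysis.
```

A mediation analysis would instead ask whether the SES-activity relation shrinks once education is statistically removed, which requires partial correlation or regression rather than simple stratification.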
Control of Variables
It is critically important not only for the researcher but also for the
consumer of the research literature to be able to identify the
pertinent variables in a study. In designing a study, particularly an
experimental study, the researcher should pay particular attention
to controlling the influence of extraneous variables.
Many studies in kinesiology are conducted outside of the
laboratory, using human research participants, and there are many
variables that surround the research situation that could alter the
results in some substantive way. It becomes imperative that the
researcher maintain as much control over these variables as
possible to prevent or reduce their effect. At the same time, it must
be recognized that not every variable that could make a difference
can be controlled. The following are some of the procedures that
are frequently used in an attempt to control extraneous variables:
CHAPTER 8
Reviewing the Literature
After selecting the research question, it is common for the novice
researcher to want to jump right into developing the research plan
and be eager to start the research project. While often challenging
and time consuming, reviewing the literature is seen by many as
being something that is tangential to the real purpose of the study.
However, thoroughly reviewing the literature is an essential step in
developing the research proposal, formulating hypotheses, and
developing the research plan. This chapter discusses the
purposes of reviewing the literature and recommends strategies
for conducting a meaningful literature review.
OBJECTIVES
Primary Sources
Secondary Sources
Systematic review
Meta-analysis
PubMed
Cochrane Library
Scopus
Periodicals
Periodicals constitute a significant portion of the literary world.
There are thousands and thousands of periodicals. But exactly
what is a periodical? Quite simply, a periodical can be defined as a
publication that is issued at regular intervals. Its regularity
distinguishes it from other forms of publications such as books,
monographs, and reviews. There are many types of periodicals
that vary significantly in terms of purpose, audience, and content.
In general, periodicals can be classified as either newspapers,
popular press periodicals, trade periodicals, or scholarly
periodicals. Newspapers consist of short articles and are
nontechnical in nature. They tend to cover current events and
issues and often provide firsthand reports or accounts of events.
Popular press periodicals, or what we generally think of as simply
“magazines,” are publications that appeal to broad groups of
readers. Articles in popular press periodicals tend to be shorter in
length and do not usually contain highly specialized or technical
language. Trade periodicals refer to magazines that are targeted
at specific occupational groups and businesses. They contain
shorter articles but tend to be aimed at those who share an
occupation or business goal in common and offer insights into
trends on particular aspects of the business or occupation.
Scholarly journals are more highly specialized, contain a higher
level of technical terms, and are usually intended for a more
specific and limited segment of the population. Scholarly journals
provide firsthand reporting of research studies that serves to build
on the knowledge base in a field. Scholarly journals constitute the
type of periodical most valuable for researchers. The various
categories of periodicals, as well as sample publications in each
category, are shown here:
Newspapers
Chicago Tribune
New York Times
Washington Post
USA Today
Des Moines Register
Popular Press Magazines
Time
National Geographic
Sports Illustrated
People
Entertainment Weekly
Trade/Professional Magazines
Advertising Age
Psychology Today
Beverage Industry
American Spa
The Progressive Farmer
Scholarly Journals (in kinesiology)
Research Quarterly for Exercise and Sport
Medicine & Science in Sport & Exercise
Journal of Teaching in Physical Education
Journal of Physical Activity and Health
Pediatric Exercise Science
1. Identify topic
2. Review secondary sources and known primary reports
3. Identify key terms
4. Select database(s)
5. Conduct search
6. Identify relevant publications (particularly primary sources)
7. Locate and obtain sources
8. Read and document selected literature
Search
Article search—Search by keyword or
thesaurus/controlled vocabulary term
Field search—Author, title, source-journal title,
abstract
Boolean searches—“AND,” “OR,” and “NOT” to
combine search terms
Results
Define results—Date, language, publication type
Sort results—Date, relevance, author, journal title
Results display—Citation only, citation and
abstract, details
Related articles—Provides additional literature
closely related to a single article
Times cited—See what more recent articles cited
the one you are looking at
Note: An article citation should contain the following basic citation information: Article
name, author(s), date, journal title, volume number, issue number if applicable, page
number(s), and DOI (digital object identifier).
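The Boolean operators listed above can be sketched against a toy record set. Everything below (titles, keywords, the function name) is invented for illustration; real databases index far richer fields:

```python
# Toy record set standing in for a bibliographic database; each
# record maps a title to its set of indexed keywords.
records = {
    "Article 1": {"physical activity", "adolescents", "survey"},
    "Article 2": {"physical activity", "older adults", "intervention"},
    "Article 3": {"sedentary behavior", "adolescents", "survey"},
}

def boolean_search(records, must=(), any_of=(), must_not=()):
    """AND terms in `must`, OR terms in `any_of`, NOT terms in `must_not`."""
    hits = []
    for title, terms in records.items():
        if (all(t in terms for t in must)
                and (not any_of or any(t in terms for t in any_of))
                and not any(t in terms for t in must_not)):
            hits.append(title)
    return hits

# "physical activity" AND ("adolescents" OR "older adults")
#     NOT "intervention"
print(boolean_search(records,
                     must=["physical activity"],
                     any_of=["adolescents", "older adults"],
                     must_not=["intervention"]))
# -> ['Article 1']
```

The same three operators combine the search fields (keyword, author, title) described above in databases such as PubMed or SPORT Discus, which is why keyword choice matters so much: AND narrows results, OR broadens them, and NOT excludes.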
Many databases allow complex searches to be performed that
would be much more difficult, if not impossible, using paper
resources. These reference sources include indexes to journals
and other publications (sometimes accompanied by the full text of
the cited article), abstracts, statistical reference materials, and
compilations of useful data. Learning to do a basic computer
search for literature is a relatively simple matter. Select the
keywords, or descriptors, from the problem statement of the
proposed research topic or from your initial reading of secondary
sources. These key terms are important in that they will be used to
search computerized databases to find associated literature. Once
identified, these keywords are entered into the search fields of the
selected database software, which uses them to search for related
information from the records maintained by the particular
database. Results of the search produce a report with the titles
and citation information of published articles, research reports,
reviews, conference papers and proceedings, and other sources.
Some computerized databases include a brief abstract describing
the content of each reference, while a few provide full-text access
to articles. These reports are typically available almost
immediately. The nature of searching electronic databases often
yields many sources that are not relevant to the research problem,
requiring the researcher to repeat the search using a different
database as well as different keywords. This highlights the
importance of carefully selecting the keywords and choosing the
appropriate database.
Once you have identified the topic of interest and relevant key
terms you must identify the database(s) that will be used for your
literature search. A crucial part of a successful search strategy is
choosing an appropriate database. Databases are generally
classified as either multidisciplinary, such as Academic Search
Premier, which is a general database containing records from
many fields, or specialized, such as Medline, which emphasizes a
particular subject or discipline (TABLE 8.5). There are numerous
computerized databases. Among the many databases available,
ERIC, PsycINFO, SPORT Discus, Academic Search Premier,
Ingenta, Medline, and others, are commonly used by kinesiology
researchers. While it may be convenient to simply select a single
database, rarely will one database be sufficient in providing the
complete coverage of the topic under investigation.
TABLE 8.5 Selected Multidisciplinary and Specialized Databases

Multidisciplinary: Ingenta, JSTOR
Specialized: Medline, PsycINFO

Subject Directories
Wikipedia (www.wikipedia.org)
Yahoo! (www.yahoo.com)

Search Engines
Excite (www.excite.com)
Google (www.google.com)
Lycos (www.lycos.com)

Metasearch Engines
Dogpile (www.dogpile.com)
Ixquick (www.ixquick.com)
Mamma (www.mamma.com)
MetaCrawler (www.metacrawler.com)
Sputtr (www.sputtr.com)

Guides for evaluating internet resources:
http://guides.library.jhu.edu/evaluate/internet-resources
http://guides.lib.berkeley.edu/evaluating-resources
CHAPTER 9
Reading, Evaluating, and
Summarizing Research
Reports
A published research report is often the culmination of the
research process. The ability to read, understand, critically
evaluate and, finally, summarize research reports is an important
expectation of the professional consumer of research literature.
Although the content will differ, developing a familiarity with the
general format of a research report is the first step in becoming an
informed consumer. The astute reader should be able to judge the
content of the various sections and discern the overall quality of
the report and the underlying research. Additionally, we present
information about commonly used approaches to summarize
research reports. You should become familiar with, and be able to
use, the meta-analysis approach.
OBJECTIVES
Abstract
Acknowledgments
Background information
Conclusions
Criteria for critiquing an article
Data analysis
Discussion section
Effect size
Instrumentation
Introduction section
Meta-analysis
Methods section
Preliminary items
Procedures
Purpose statement
Recommendations
References
Research participants
Results section
Systematic review
Preliminary information
Title
Author and organizational affiliation
Acknowledgments (if any)
Abstract
Introduction
Background information and
literature review
Rationale for the study
Purpose statement
Hypotheses or research questions
Methods
Participants
Instrumentation
Procedures
Statistical analysis
Results
Discussion
Conclusions
Recommendations
References
Research Participants
Typically, the first part of the methods section is used to describe
the research participants (subjects). This should include
information about the target population, the number of participants,
and how they were selected. In addition, pertinent characteristics
of the participants, such as age, gender, grade level,
socioeconomic level, ethnicity, and any other characteristics that
may have a bearing on the problem being studied, should be
described. If appropriate to the study, the author should also
describe how participants were assigned to comparison groups or
treatment groups. It is important that the author carefully describe
how the research participants were obtained, including a
description of the population, thus enabling the reader to make a
judgment about the generalizability of the results. Were they
volunteers? Were they randomly selected? What selection criteria
were established? What sampling technique was used? Chapter
12, “Recruiting and Selecting Research Participants,” provides a
full description of various sampling techniques and the process of
selecting research participants. When sampling procedures are
not well defined, it is wise to assume that a convenience sample
was used. See EXAMPLE 9.4 for examples of excerpts from
published articles describing research participants.
Participants [1]
Children were recruited from community school or non-school
youth programs. Program instructors and parents were asked
to identify children who had minimal swimming experience,
were fearful of the water, and had low self-confidence. In all, 24
children (18 boys and 6 girls) met those criteria. Their average
age was 6.2 years (SD = 0.90), 62.5% (n = 15) had taken
formal swimming lessons, and they averaged 9.5 (SD = 11.4;
range = 0 to 40) weeks of swimming lesson experience.
Children were matched to group by age and lesson
experience, and then groups were randomly assigned to one of
three model type conditions: control, peer mastery, and peer
coping. All parents provided informed consent prior to study
procedures, and the children signed an assent form on the first
day of the study prior to any swimming skill or questionnaire
assessments.
Participants [2]
The 310 adolescents participating in this study were enrolled in
six junior high schools (n = 77 boys, n = 64 girls; M age = 13.5
years, SD = 1.5) and six senior high schools (n = 76 boys, n =
93 girls; M age = 16.5 years, SD = 1.5). All physical education
classes were coeducational and taught by 12 physical
education teachers, 3 male and 3 female teachers in junior
high schools and 3 male and 3 female teachers in senior high
schools. The schools were randomly selected from the total
number of schools in Thessaloniki, Greece, a town with over 1
million residents.
1: From Weiss, M. R., McCullagh, P., Smith, A. L., & Berlant, A. R. (1998).
Observational learning and the fearful child: Influence of peer models on
swimming skill performance and psychological responses. Research
Quarterly for Exercise and Sport, 69(4), 380–394.
2: From Papaioannou, A. (1998). Students’ perceptions of the physical
education class environment for boys and girls and the perceived
motivational climate. Research Quarterly for Exercise and Sport, 69(3), 267–
275.
Instrumentation
This subsection of a journal article should contain a description of
data-collection instruments, materials, and any equipment or
apparatus used in the study. Given the diverse nature of
kinesiology research, instrumentation may vary substantially. For
some researchers, instrumentation is largely a paper-and-pencil
variety and may include tests, checklists, scales, inventories,
questionnaires, interview schedules, or observation reports. Other
researchers, however, utilize specialized equipment for data
collection, such as skinfold calipers, timing devices, strength
gauges, metabolic measurement systems, cinematographic
equipment, and heart monitors, among others. Furthermore, some
researchers utilize standardized performance tests, such as sport
skills tests or physical fitness tests as their means of collecting
data.
While space limitations within the journal may prohibit the
actual reproduction of a written instrument in the article itself,
authors should provide a complete description of the instrument,
including pertinent psychometric properties (i.e., validity and
reliability), if available. Similarly, equipment or research apparatus
should be thoroughly described, including evidence of its technical
properties. Sometimes, illustrations or photographs of testing
equipment may be used to supplement the written description.
Researchers utilizing published instruments or commercially
available equipment often cite manufacturer’s specifications or the
work of other researchers as the basis of asserting the validity and
reliability of the chosen instrument. If the measuring instrument is
new and specifically designed for the study described in the
article, the authors would be expected to go to greater lengths in
describing the development of the instrument as well as pertinent
validity and reliability evidence. The main concern for the author(s)
is to establish that the instrumentation used in the study was
appropriate for the particular research problem and that it enabled
the researchers to collect accurate and credible information.
EXAMPLE 9.5 provides illustrations of instrumentation sections in
published articles.
Instrumentation [1]
Intrinsic motivation was assessed using a sport-oriented
version of the Intrinsic Motivation Inventory (IMI; McAuley,
Duncan, & Tammen, 1989). An original version of the IMI was
used by R. Ryan and colleagues (e.g., Plant & Ryan, 1985; R.
Ryan, Mims, & Koestner, 1983) as a multidimensional measure
of subjects’ intrinsic motivation for a specific achievement
activity. McAuley et al.’s sport-oriented version of the IMI
contains 16 items that assess four components of intrinsic
motivation, including interest-enjoyment, perceived
competence, effort-importance, and tension-pressure. McAuley
et al. reported acceptable psychometric properties for the four
subscales.
A fifth subscale was included in the current study
questionnaire based on suggestions by McAuley et al. (1989).
This additional subscale, labeled perceived choice, included
four items to assess the degree to which athletes believe they
are participating in their sport by personal choice. The four
items include the following: (a) “I participate in this sport
because I want to,” (b) “I would quit this sport if I could,” (c)
“Working hard in this sport is something I choose to do,” and
(d) “When my eligibility is up I will quit this sport.”
Each item on the IMI is followed by a 7-point Likert-type
scale, with response choices ranging from strongly disagree to
strongly agree. Subjects were asked to indicate their
agreement or disagreement with each statement by circling the
appropriate response.
Instrumentation [2]
Based on a comprehensive review of the literature, a 12-item
instrument was developed to assess subjects’ perceived self-
efficacy in performing life-saving skills and the impact of
training on subjects’ willingness to act during an emergency
and in the presence of noted barriers. The instrument
consisted of four demographic items (age, gender, race, and
education level) and five background items (level of previous
training, first-aid requirement for college major or job, reasons
for taking the first-aid course, field of study, and previous
experience in giving first aid and CPR). The instrument also
contained questions based on Bandura’s self-efficacy model.
Efficacy expectations, outcome expectations, and outcome
values subscales consisted of three items each. The remaining
questions contained three items examining subjects’
willingness to perform life-saving skills in the presence of
specified barriers.
To establish face validity, items in the survey instrument
were constructed based on a comprehensive review of the
literature. Content validity was established by distributing the
instrument to three national experts in survey research and one
national expert on the self-efficacy model. Experts were
defined as individuals who were recognized authorities on
survey research or the self-efficacy model and who had
recently published articles on these topics. Recommendations
offered by experts were taken under consideration and
appropriate revisions were made to the survey instrument.
Stability reliability was assessed utilizing a sample of 25 male
and female undergraduate college students who were given
the instrument twice, 5 days apart.
1: From Amorose, A. J., & Horn, T. S. (2000). Intrinsic motivation:
Relationships with collegiate athletes’ gender, scholarship status, and
perceptions of their coaches’ behavior. Journal of Sport and Exercise
Psychology, 22(1), 63–84.
2: From Kandakai, T. L., & King, K. A. (1999). Perceived self-efficacy in
performing life-saving skills: An assessment of the American Red Cross’s
responding to emergencies course. Journal of Health Education, 30(4), 235–
241.
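Stability (test–retest) reliability of the kind described in Instrumentation [2] is commonly quantified with a correlation between the two administrations of the instrument. A minimal sketch follows; the two sets of scores are invented for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two sets of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores from two administrations, 5 days apart.
day1 = [12, 15, 11, 14, 18, 16, 13, 17]
day2 = [13, 14, 12, 15, 17, 16, 12, 18]
print(round(pearson_r(day1, day2), 2))  # prints 0.91
```

A coefficient near 1.0 indicates that participants kept roughly the same rank order across the two administrations, which is the evidence of stability the excerpt describes.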
Procedures
The procedures portion of the methods section includes a
description of data collection procedures as well as experimental
treatments and how they were administered. Basically, this is a
description, step by step, of how the researcher conducted the
study. Where was the study conducted? What did the research
participants do? What did the researcher do? Who collected the
data? How long did it take? What safeguards were provided? The
procedures should be presented with sufficient detail and clarity
that another researcher would be able to follow the steps
described and replicate the study. You might consider this to be
the “recipe” for conducting the study.
For experimental studies, this section should also include a
description of the research design, clearly identifying the
independent and dependent variables, and describing the
experimental treatments. It is important that the research design
be appropriate to the solution of the research problem and be
relatively free of threats to the validity of the study. See Chapter 8,
“Reviewing the Literature,” for a more thorough discussion of
internal and external validity. All studies must include a description
of the data collection procedures. These procedures should be
appropriate to the research participants and should correctly use
the selected measuring instruments. Although not seen in all
published research articles, sometimes a description of the data
analysis procedures will be included here. This approach is
typical in theses and dissertations. Otherwise, the data analysis
procedures are described in the results section. Examples of
procedures sections are shown in EXAMPLE 9.6.
Results [1]
A total of 53 (43%) older adults had dropped out of the exercise
program within 4 months. There were no demographic
differences found between those who dropped out of the
exercise program and those who remained, except for
education level. Education level differed significantly between
the two groups; a greater proportion of the attrition group
reported less than a 12th-grade education (χ2 = 7.33, p < 0.05).
Health differences were found between the exercise
attrition group and those who completed the program (Table 2).
Compared with those who continued in the exercise program,
the attrition group reported poorer self-ratings of health (χ2 =
6.0, p < 0.05), a greater frequency of not having enough
energy to do the things they need to do (χ2 = 6.57, p < 0.05),
and that they climbed fewer flights of stairs per day. If the PAR-
Q scale had been used as an eligibility screening tool, a
greater proportion of attrition group individuals would have
been considered ineligible for the program, as more of them
reported having one or more problems on the revised PAR-Q
scale indicating greater health risk (χ2 = 3.58, p = 0.058).
Results [2]
Intrinsic Motivation, Gender, and Scholarship Status. To test
whether athletes’ intrinsic motivation would vary as a function
of their gender, scholarship status, and perceived percentage
of athletes on scholarship, a 2 × 3 × 2 (gender by scholarship
status by scholarship percentage) MANOVA was conducted.
The dependent variables for the analysis were the five
subscale scores from the IMI (interest-enjoyment, perceived
competence, effort-importance, perceived choice, and tension-
pressure). The independent variables were gender (male,
female), scholarship status (full, partial, none) and perceived
scholarship percentage (low: < 70%, high: > 75%). The
descriptive statistics for each group are presented in Table 3.
Due to the nonorthogonal nature of the research design, the
significance of the main and interaction effects was tested in
hierarchical fashion (Finn, 1974; Tabachnick & Fidell, 1996).
That procedure revealed a nonsignificant three-way (gender by
scholarship status by scholarship percentage) interaction
effect. In addition, the Scholarship Status × Scholarship
Percentage interaction, the Gender × Scholarship Status
interaction, and the Gender × Scholarship Percentage
interaction were all nonsignificant. The scholarship percentage
main effect was also nonsignificant. However, both the
scholarship status main effect, Wilks’s lambda = 0.94, F(10,
740) = 2.37, p < 0.01, and the gender main effect, Wilks’s
lambda = 0.97, F(5, 370) = 2.39, p < 0.04, were significant.
Results [3]
The Freshman Survey with tobacco-related questions was
completed by 544 first-semester freshmen. Both genders were
fairly equally represented, with 305 (56%) females and 232
(43%) males completing the survey (7 students did not indicate
gender). Unfortunately, the survey instrument did not include a
question about race or ethnicity; however, other sources
provided this information about the entire freshman class (n =
1,142). Eighty-three percent (n = 943) of the class was white,
7% (n = 83) was African American, 6% (n = 69) was Hispanic,
3% (n = 29) was Asian, one was Native American, and 1% (n =
17) did not indicate ethnicity.
Sixty-six percent of respondents (n = 360) reported that
they had never been smokers. Twenty-seven percent (n = 148)
reported smoking either regularly (more than 1 day per week)
or occasionally (1 day per week or less). Specifically, regular
smokers were more prevalent (n = 91) than occasional
smokers (n = 57). Among all smokers, 86 (58%) were female,
56 (38%) were male, and 6 (4%) did not indicate gender. Only
7% of respondents (n = 36) reported being former smokers.
Among current smokers, the average age of smoking initiation
was 16 years, with a range of 8–18 years. Former smokers
began smoking at age 14 and smoked for 2 years before
quitting, on average.
Results [4]
Demographic data are presented in Table 1 as a function of
Age Group, Fitness Group, and Treatment Group. Results
showed no significant difference in years of education since
high school as a function of Age Group, Fitness Group, or
Treatment Group (p > 0.05). There was a significant main
effect for Fitness Group on weight, F(1, 79) = 11.03, p < 0.001,
such that fit participants (M = 76.03 kg, SD = 9.12) weighed
significantly less than unfit participants (M = 83.86 kg, SD =
11.80). There was also a significant main effect for Age Group
on self-reported physical activity level, F(1, 79) = 13.91, p <
0.001, such that the scores for the older participants (M =
11.63, SD = 7.51) were significantly higher than those for the
younger ones (M = 7.09, SD = 1.32). For VO2max, there was a
significant main effect for Age Group, F(1, 79) = 146.14, p <
0.001, such that younger participants (M = 43.05 ml/kg/min, SD
= 9.15) were significantly more fit than older participants (M =
28.30 ml/kg/min, SD = 9.82). Examination of the time interval
between acquisition and retention trials indicated that there
was not a significant difference as a function of Age Group,
Fitness Group, Treatment Group, or the interactions of these
variables, F(1, 75) = 2.65, p > 0.05. Importantly, for the older
participants, there was no significant difference in years since
retirement as a function of Fitness Group, Treatment Group, or
their interaction (p > 0.05).
1: From Prohaska, T. R., Peters, K., & Warren, J. S. (2000). Sources of attrition
in a church-based exercise program for older African Americans. American
Journal of Health Promotion, 14(6), 380–385.
2: From Amorose, A. J., & Horn, T. S. (2000). Intrinsic motivation:
Relationships with collegiate athletes’ gender, scholarship status, and
perceptions of their coaches’ behavior. Journal of Sport and Exercise
Psychology, 22(1), 63–84.
3: From Spencer, L. (1999). College freshmen smokers versus nonsmokers:
Academic, social and emotional expectations and attitudes toward college.
Journal of Health Education, 30(5), 274–281.
4: From Etnier, J. L., & Landers, D. M. (1998). Motor performance and motor
learning as a function of age and fitness. Research Quarterly for Exercise
and Sport, 69(2), 139–146.
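The Wilks's lambda values reported in Results [2] come from a MANOVA. For the simpler one-way case, the statistic is the ratio of two determinants: the pooled within-group sums-of-squares-and-cross-products (SSCP) matrix over the total SSCP matrix. A minimal numpy sketch with invented data, not the authors' analysis:

```python
import numpy as np

def wilks_lambda(groups):
    """Wilks's lambda for a one-way MANOVA.

    groups: list of (n_i x p) arrays of scores, one array per group.
    Returns det(W) / det(T), where W is the pooled within-group SSCP
    matrix and T is the total SSCP matrix. Values near 1 indicate
    little multivariate group separation; values near 0, a lot.
    """
    data = np.vstack(groups)
    centered_total = data - data.mean(axis=0)
    T = centered_total.T @ centered_total
    W = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    return np.linalg.det(W) / np.linalg.det(T)

# Two groups measured on two dependent variables (hypothetical scores).
group_a = np.array([[5.0, 3.1], [6.2, 3.4], [5.8, 2.9], [6.5, 3.6]])
group_b = np.array([[7.9, 4.8], [8.4, 5.1], [7.6, 4.5], [8.8, 5.4]])
lam = wilks_lambda([group_a, group_b])
```

With a single dependent variable this reduces to SSwithin / SStotal, which is why small lambdas (like the 0.94 and 0.97 in the excerpt being significant) depend on sample size and the follow-up F approximation.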
Discussion
The discussion section of a journal article contains a
nontechnical explanation of the results. Basically, the discussion
section is where the author presents the outcomes of the study.
Here, the author provides his or her interpretation of the findings,
culminating with a conclusion that provides an answer to the
research problem. Note that some authors may include in the
paper a separate section labeled conclusions. The discussion
section is often the least structured part of the research report,
allowing the author considerable latitude in the writing.
However, if the study was set up to test hypotheses, the
discussion section should include a report on the results of each
hypothesis test. Tuckman (1999) suggests that the discussion
section of a research report serves six major functions:
To summarize the findings—a rather straightforward function
is to provide a summary of the major findings in the form of
conclusions, thus providing the reader with an overall picture of
the results.
To interpret the study’s findings—the author’s explanation or
interpretation of what the results mean. (Note: We would
propose that this may be the most important function of the
discussion section.)
To integrate the findings—an attempt to synthesize the
various findings, both expected and unexpected, to achieve
meaningful conclusions and generalizations.
To theorize—occasionally, a study generates a number of
interrelated findings that might serve as the basis for theory
development, either as support for existing theory or for the
establishment of an original theory.
To recommend or apply—since results from research
frequently have implications for enhancing or altering
professional practices, the discussion section provides the
venue for making such recommendations.
To suggest extensions—inevitably, the results of a research
study give rise to more questions; the discussion section of a
report often includes recommendations for future research.
While each of these functions may not be served in the
discussion section of every published research report, it is
important to realize the breadth of possible content. It is common
for the discussion section to begin with a brief restatement of the
purpose of the study, followed by a general statement
summarizing the results. This is usually followed with a more
detailed discussion of the specific findings. The discussion section
generally does not contain the statistical notation and numerical
information found in reporting the results, but usually consists of a
narrative discussion in which the author summarizes and explains
what the results mean, often attempting to explain why the results
turned out as they did. In explaining the results, the author
attempts to connect the results of the study with appropriate
theory, professional practice, and related literature. The author
should be careful to avoid reaching conclusions that are not
supported by the results of the study. Moreover, the author should
present only implications and recommendations that are based on
the results of the study, not on what the author had hoped to be
true. Sometimes the author cautions the reader about possible
misinterpretations, pointing out the limitations associated with the
research. A common error made by researchers as well as by
consumers of the research literature is to overgeneralize the
results, that is, attempt to apply the findings and resultant
conclusions to settings or populations that are not warranted by
the study. EXAMPLE 9.8 provides excerpts taken from discussion
sections in published reports. Please note that each excerpt is
only a portion of a complete section.
Criteria
The criteria for critiquing an article are the same criteria a
researcher could apply to a manuscript being developed. Many
research methods books present a framework for evaluating
research, including a list of criteria for evaluating an article or other
research publication. These lists differ to some degree but are
generally similar. The majority of the criteria on these lists usually
apply to any article or research publication, particularly ones
based on experimental research. Due to the variety of research
conducted, no article or research publication should be expected
to fulfill all the criteria on any list.
The authors of this book have developed an unreferenced list
of criteria for evaluating research in kinesiology. It is organized
similarly to the outline described earlier in the chapter: (1)
introductory section, including the review of related literature; (2)
methods; (3) results; (4) discussion; (5) references; and (6)
appendix. A series of questions for evaluating a research paper,
similar in organization to those previously presented, appears in
EXAMPLE 9.9.
Example 9.9 Checklist for Evaluating a Research
Paper
Characteristics
Procedures in Meta-Analysis
There are, in general, seven steps when conducting a meta-
analysis:
1. Compile references. Using several different literature search
techniques, databases, and so on, compile a list of all research
studies pertaining to the research topic for the meta-analysis.
(See Chapter 8, “Reviewing the Literature,” for a description of
using databases.) While searching the topic, you might find a
review article that has a large bibliography. The review article
probably contains only good research studies selected by the
knowledgeable author of the article. In addition, some earlier
meta-analysis articles on the topic and/or on a topic similar to
your topic could yield a large list of references.
2. Determine your criteria. Before you review the studies
extensively, make a basic list of things that must be in any
research study. You will likely have a large number of research
studies, and there is no reason to extensively review a
research study if you will not use it later in the meta-analysis. If
there is not enough information in the research study to
calculate effect size, there is no reason to extensively review
the research study. If the knowledge base for the meta-analysis
topic is changing rapidly, you might decide to not review any
research study that is more than 10 years old. This
might be the correct decision, but it might also be perceived as
incorrect in that it eliminates older classic studies
and/or limits the generalizability of the meta-analysis.
Research studies not published in research journals or in
books based on the research might be excluded from the
meta-analysis. Theses and dissertations are unpublished and
usually harder to obtain than research journal articles. Some
researchers believe that if the research was good, it has been
published in a research journal—which suggests that
unpublished research is not to be trusted. Again, excluding
certain research studies may or may not be the correct
decision. Research studies with small numbers of participants,
perhaps fewer than 30, might be excluded, in that the research
findings may be unique to the participants. Of course, it would
be nice to review only good research conducted by
knowledgeable people and published in good journals.
3. Review each study. Review each research study carefully,
taking good notes concerning the basic information (author,
research article title, journal name, journal volume and pages,
date the issue of the journal was published), the number and
type of participants in the research study, the type of
treatment(s) applied to the research participants, the length of
time of the study, the quality of the procedures used in the
research study, the findings, and the information needed for
calculating effect size. Moderator variables are factors that
influence the relationship between two variables or effect size.
If the relationship between two variables or effect size is not
the same for males and females, gender would be a moderator
variable. Moderator variables need to be identified and coded
during this step. You must have the expertise to evaluate
whether a research study was well conducted, because you do
not want to use weak research studies in the meta-analysis.
Keep a list of research studies reviewed and a list of research
studies not reviewed, so you do not go back to a research
study twice. It is important to get the necessary information in a
research study the first time and apply the same standards to
all research studies. A purpose of meta-analysis is to remove
subjectivity as much as possible.
To ensure study quality, methodological quality
rating scales are often employed. The approach described in
the Cochrane Handbook (Version 5.1, 2011,
training.cochrane.org/handbook) is recommended, in which
a set of reporting biases, including publication, time lag,
multiple publication, location, citation, language, and outcome
reporting biases, is examined and coded.
4. Decide which studies you’ll use. Make decisions concerning
what studies will be used in the meta-analysis and whether
moderator variables like gender, age, and physical or health
characteristics will be considered in the meta-analysis. As in
step 2, it is important to list the criteria used for not using a
research study in the meta-analysis and to apply the criteria
equally to each research study.
5. Compute effect size. Calculating the effect size (ES) for each
research study used is the most important step of the meta-
analysis. In statistics, ES is a quantitative measure of the
magnitude of a phenomenon. In meta-analysis, ES is often
calculated in an attempt to estimate whether a difference is a
practical difference. ES can be calculated in a variety of ways
depending on the research design and the intent of the
research. A common way of calculating ES (known as Cohen’s
d) is presented here:

ES = (M_E − M_C) / SD_C   or   ES = (M_E − M_C) / SD_pooled

where:
M_E = mean of the experimental group
M_C = mean of the control group
SD_C = standard deviation of the control group
SD_pooled = pooled standard deviation of the two groups
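The Cohen's d calculation named in step 5 divides a mean difference by a standard deviation. A minimal sketch of the pooled-SD version follows; the experimental and control scores are hypothetical.

```python
from math import sqrt

def cohens_d(group1, group2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    # Sample variances (n - 1 in the denominator).
    var1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    var2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    sd_pooled = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (m1 - m2) / sd_pooled

# Hypothetical experimental and control scores from one study.
experimental = [14, 16, 15, 18, 17]
control = [12, 13, 11, 15, 14]
d = cohens_d(experimental, control)
```

With an ES computed this way for every included study, the meta-analysis can then combine them, typically as a mean weighted by each study's sample size.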
CHAPTER 10
Developing the Research Plan
and Grant Application
The process of developing a research project involves many
interrelated elements. We have previously discussed various types
of research and approaches to scientific inquiry. The nature of
research problems and the importance of reviewing the related
literature have been addressed. Now, it is important to consider
the details of a research plan. These include the research
approach, methodological steps, instruments to be used, and how
the instruments will be administered to capture the data needed on
the selected variables under study. In addition, the second part of
the chapter introduces how to turn your research plan into
a basic grant application.
CHAPTER OBJECTIVES
Hypotheses
Once you have a central research question or purpose for the
study, a statement of hypothesis may be developed for
quantitative research studies or studies with a quantitative
research component. To this end, researchers may proceed with
the idea that a certain outcome will result. This idea may be stated
as “predictions the researchers make about the expected
outcomes of relationships among variables” (Creswell, 2014, p.
143). Such predictions are referred to as research hypotheses. For
example, it has been hypothesized that extensive use of anabolic
steroids by athletes leads to the deterioration of body parts and
disease. As such, many research studies have investigated the
relationship between these variables, and the results have
generally supported this hypothesis (e.g., long-term use can result
in serious illness including a heart attack or liver failure). In this
case, a relationship exists.
In general, research hypotheses possess the following
characteristics:
1. They are based on theory or previous research findings.
2. They state a relationship between at least two variables.
3. They are simple and clear statements with no vague terms
clouding the relationships.
4. They are testable; that is, the stated variables can be
measured, and theoretical or previous research knowledge
exists to permit a clear and testable hypothesis.
5. They have the capability of being refuted; the prediction can be
evaluated in terms of “yes, it occurred” or “no, it did not occur.”
6. They are stated (a) in the present tense and (b) before data are
collected on the variables.
In addition, research hypotheses can be stated using either a
directional hypothesis or a nondirectional hypothesis. If,
based on previous research, the researcher believes that a
particular relationship or difference exists between groups of
research participants, he or she will state the hypothesis
directionally, or in the direction of the expected result.
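The distinction matters when the hypothesis is tested: a directional hypothesis is evaluated with a one-tailed test, a nondirectional one with a two-tailed test. A minimal sketch of the underlying independent-samples t statistic, using invented scores (the groups and values are illustrative only):

```python
from math import sqrt
from statistics import mean, stdev

def independent_t(group1, group2):
    """Independent-samples t statistic using a pooled variance estimate."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * stdev(group1) ** 2 + (n2 - 1) * stdev(group2) ** 2
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Directional hypothesis: trained participants score HIGHER than
# untrained (compare t to a one-tailed critical value).
# Nondirectional hypothesis: the groups simply DIFFER
# (compare t to a two-tailed critical value).
trained = [24, 27, 25, 29, 26]
untrained = [21, 23, 22, 25, 20]
t = independent_t(trained, untrained)
```

The same t value is computed either way; the directional choice only changes which critical value (one-tailed or two-tailed) the statistic is compared against.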
Observation
Once IRB approval is granted, the researcher may begin data
collection. One method of data collection is observation, which
can be further divided into direct observation and participant
observation.
Measurement Techniques
Measurement textbooks in kinesiology are available for a more
detailed discussion on assessment (for example, Baumgartner,
Jackson, Mahar, & Rowe, 2016). Due to space limitations, a
general overview of measurement techniques is presented in this
chapter. Indeed, almost anything can be measured in one way
or another (e.g., attitudes, knowledge, opinions, movement
patterns, sport skills, and physiological characteristics). In fact,
humans are measured all of their lives. Measurements are taken
prior to birth (e.g., ultrasound) and throughout one’s life (e.g.,
measures of physical growth and development, intellectual
capacity, academic ability, job performance, attitudes, opinions,
and so on). Some of these qualities can be quantified directly,
whereas others, mostly affective behaviors, are measured
indirectly. In research, these types of techniques involve testing or
assessing the research participants in some way to gather the
information or data required by the research question. Research in
kinesiology often involves movement and the measurement of
various outcomes associated with those movements. With that in
mind, one primary consideration related to measurement in
kinesiology is whether the measures are authentic, alternative, or
a mix of both. Authentic assessments are designed to provide a
realistic task, activity, simulation, or problem related to the
attributes of the performance being measured. The more authentic
the assessment, the more it may increase participant motivation
and interest (e.g., assessing volleyball skills during a
volleyball game). Alternative assessments, in contrast, refer to
almost everything in kinesiological research that is not a traditional
inventory/checklist, scale, or questionnaire. An overview of
physical, cognitive, and affective measures will be presented in this section, and although these three categories by no means exhaust
the possibilities, they are ones commonly utilized in kinesiological
research.
Physical Measures
The very nature of the activity inherent in the field of kinesiology
provides countless opportunities for physical measures. Some
variables, such as height or distance jumped, can be measured
directly and the measurement procedure is usually straightforward.
Physical education and various areas of exercise science have
produced a large variety of tests, equipment, and techniques to
measure physiological characteristics, motor behavior, sport skills,
anthropometric attributes, and biomechanical variables, among
others. Depending upon the problem being investigated,
researchers in kinesiology may utilize equipment valued at
thousands to millions of dollars to obtain measures of body
composition (e.g., units for dual X-ray absorptiometry [DEXA] or
whole body air plethysmography [BOD POD]); muscular strength
(e.g., electronic isokinetic dynamometers); cardiorespiratory
endurance (e.g., metabolic measurement systems); force production
(e.g., force platforms); blood and muscle biopsy (e.g., tracking
diseases in the blood and muscle function as related to physical
activity); and countless other aspects of human movement.
Kinesiological researchers also use less-expensive field
techniques (e.g., skinfold measurements, pull-up tests, 1-mile
run/walk tests, PACER cardiovascular field tests, or medicine-ball
throw, plus many others) to obtain measures on the characteristics
of human movement. Each technique has its own advantages and
disadvantages. For research purposes, arguably the most
important considerations are reliability (repeatability) and validity
(measuring what was intended) of the scores. Without having
confidence in the information or data collected (if reliability and/or
validity are low), conclusions based on these data will inevitably be
questioned. Those concerns aside, however, the choice of which
instrument to employ in a research project often comes down to
cost, accessibility to the needed equipment, and practicality of
use. The measures available from which to choose will be highly
dependent upon the area of specialization (e.g., exercise
physiology, athletic training, and motor development) and the
setting for the research study.
Cognitive Measures
Equally important are the instruments used to assess knowledge.
As an integral component of most educational programs, cognitive
measurement plays an important role in kinesiological research. In
the past, research studies have sought to measure knowledge
about physical activity, health, fitness, recreation, and sport,
among other things. In fact, many nutrition education interventions
and health-related fitness programs include knowledge objectives.
For example, a recent study investigated the impact of a 2-year
conceptual-based physical education intervention program (i.e.,
Fitness for Life) on students’ healthy behavior knowledge in two
rural schools. Employing a reliable and valid knowledge
assessment tool, the authors showed significant differences by
school (with the schools using different approaches to teach the
content, such as more time in physical education or more time in
other subject areas). The authors ultimately concluded that
students can increase cognitive knowledge related to healthy
behaviors using a variety of approaches that focus on conceptual
knowledge across multiple grade levels (Lorenz, Kulinna, &
Stylianou, in press). In kinesiology, the development of cognitive
measures for research purposes largely follows the same
principles that guide the development of knowledge tests for
educational purposes. Item difficulty (the probability that a respondent will answer the item correctly), item discrimination (the item’s ability to distinguish respondents who do well on the assessment from those who do not), and the efficiency of responses are relevant characteristics for cognitive measures.
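As an illustration of these characteristics, item difficulty and a simple upper-lower discrimination index can be computed directly from scored responses. The following is a minimal sketch; the response matrix and the 27% grouping fraction are hypothetical choices for illustration:

```python
# Sketch: classical item analysis for a cognitive (knowledge) test.
# Rows = respondents, columns = items; 1 = correct, 0 = incorrect.
# The data below are hypothetical.

def item_difficulty(responses, item):
    """Proportion of respondents answering the item correctly."""
    return sum(r[item] for r in responses) / len(responses)

def item_discrimination(responses, item, top_frac=0.27):
    """Upper-lower discrimination index: item difficulty in the
    top-scoring group minus item difficulty in the bottom-scoring group."""
    n = max(1, round(len(responses) * top_frac))
    ranked = sorted(responses, key=sum, reverse=True)
    top, bottom = ranked[:n], ranked[-n:]
    return (sum(r[item] for r in top) - sum(r[item] for r in bottom)) / n

# Each row is one respondent's scored answers to four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
for i in range(4):
    print(f"item {i}: difficulty={item_difficulty(scores, i):.2f}, "
          f"discrimination={item_discrimination(scores, i):.2f}")
```

An item answered correctly by nearly everyone (difficulty near 1.0) or by nearly no one (near 0.0) discriminates poorly; items with moderate difficulty and a high discrimination index carry the most information.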
Affective Measures
The third category, affective measures, examines characteristics
such as opinion, attitude, personality, motivation, self-efficacy,
mood, anxiety, and frustration. These are generally more difficult
to measure accurately than physical or cognitive variables.
However, recent years have seen a tremendous growth in
research in kinesiology in which the outcome variables of interest
are affective in nature. For example, sport, exercise, and health
psychologists are interested in identifying personality
characteristics that predispose an individual to participate in
various physical activities and sports. They may also wish to
identify what psychosocial factors affect an individual’s physical
activity behavior, or they may seek to better understand “burn-out”
in sports. Physical education teachers and pedagogy
professionals may be interested in assessing the attitudes of
children toward school physical education or perhaps the
children’s level of self-efficacy in physical education. In each of
these situations, the construct of interest involves an affective
measure. Instruments developed to measure these variables
typically have been self-report scales. In fact, hundreds of
instruments have been developed that purport to measure these
affective factors. Because the willingness of the research
participant to provide accurate and truthful information is key to the
usefulness of such self-evaluation instruments, validity and
reliability considerations are paramount. Because of this,
researchers will typically try to select an instrument that has
previously been published and validated rather than construct a
new one. Unfortunately, not all of the available instruments have been scientifically developed, and some may lack acceptable reliability and validity information from studies with similar populations.
Therefore, it is incumbent on the researcher to critically examine
and carefully check the suitability of available instruments and
then to wisely choose the most appropriate instrument for the
research study.
Likert-type scales. One of the most widely used and versatile types of scales, the Likert-type scale is designed to measure the degree
to which an individual exhibits an attitude, belief, characteristic of
interest, or opinion about something along a continuum of
responses. Developed in the 1930s by Rensis Likert, these are
also called summated-rating scales because a person’s score on
the scale is computed by summing the numerical value assigned
to the various responses provided by the research participant.
Usually a Likert-type scale asks respondents to indicate whether
they agree or disagree with a statement. A 5-point Likert-type
scale is most common, but variations also include scales having 4,
6, 7, and even 9 points. For a 5-point Likert-type scale, the
continuum of responses runs from “strongly agree” (SA), to
“agree” (A), “undecided” (U), “disagree” (D), and “strongly
disagree” (SD). Researchers have debated about whether or not
to include a neutral point or category (e.g., “don’t know,”
“undecided,” or “no opinion”) on the scale, thereby forcing the
respondent to “get off the fence.” If a neutral category is included,
then the scale will consist of an odd number of points on the scale.
An example of an item from a Likert-type scale is shown here.
Example response formats include the following:
5-point agreement scale: strongly disagree, disagree, undecided, agree, strongly agree
Numerical rating scale (e.g., perceived exertion): 0 – Nothing at all; 1 – Very light; 2 – Light; 3 – Moderate; 4 – Somewhat heavy; 5 – Heavy; 7 – Very heavy
Frequency scale: never, rarely, sometimes, often, always
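The summated-rating scoring described above can be sketched as follows. The item values, number of scale points, and reverse-keyed item are hypothetical:

```python
# Sketch: scoring a summated-rating (Likert-type) scale, assuming a
# 5-point format coded 1 = strongly disagree ... 5 = strongly agree.
# Negatively worded items are reverse-coded before summing.

def score_likert(responses, reverse_items=(), n_points=5):
    """Return the summated scale score for one respondent."""
    total = 0
    for i, value in enumerate(responses):
        if i in reverse_items:
            value = (n_points + 1) - value   # e.g., 5 -> 1, 4 -> 2, ...
        total += value
    return total

# Four-item attitude scale; item index 2 is negatively worded.
answers = [4, 5, 2, 4]                            # one participant
print(score_likert(answers, reverse_items={2}))   # 4 + 5 + 4 + 4 = 17
```

Reverse coding ensures that a higher summated score consistently reflects more of the attitude or characteristic being measured, regardless of item wording.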
Questioning Techniques
Like observation and measurement, questioning techniques include a variety of methods; in contrast to those approaches, however, the research participant is simply asked to respond to questions posed by the investigator. These techniques frequently take the form of self-report questionnaires or some type of personal interview. Survey
research is used extensively in the social and behavioral sciences,
and it is the predominant form of questioning technique in
kinesiology research. In fact, it would be difficult to find an adult
who has not participated in some type of survey, and this method
is becoming more commonplace in research involving children and
adolescents. With surveys, a researcher is not required to observe
the participant’s behavior or introduce a measurement technique;
the researcher simply asks questions to participants about their
attitudes, opinions, or behaviors. For example, self-report
measures are often utilized to track individuals’ health behaviors.
One example for youth tracking their own physical activity is the
ActivityGram® instrument, a 3-day electronic tracking program for
monitoring daily physical activity patterns.
Numerous methods of collecting survey data exist. These
include but are not limited to the following: (a) social media, (b) personal interviewing, (c) electronic (e.g., Skype, Zoom) or telephone interviewing, (d) mailed or online surveys, and (e) in-person
administered surveys. The selection of the data collection method
depends on the researcher’s access to the population of interest,
the type of information sought, time requirements, costs, and
resources available. More information is available in Chapter 2,
“Descriptive Research.” Regardless of the method of data
collection, one of the keys to the success of survey research is the
development of meaningful questions.
Structured Questionnaires
Also called a closed-ended questionnaire, the structured
questionnaire instrument includes queries that can be answered
with responses of yes/no, true/false, or by selecting an answer
from a list of suggested (multiple-choice) answers. The short,
quick response format requires less time and effort from research
participants and tends to be fairly objective. The analysis of the
data from this type of instrument is relatively simple. The following
example illustrates a closed-ended question in which the adult
respondent selects only one of the possible alternatives provided:
0 days    1 day    2 days    3 days    4 days    5 days    6 days    7 days
Data-Collecting Instruments
As previously mentioned, the available array of data-collecting
instruments is vast. Foundationally, a data-collecting instrument
is any paper-and-pencil test or measure, mechanical or electronic
equipment measure, or physical performance test employed to
collect information (data) on the variable under study. The choice
of the instrument to be used during the data collection process
involves deciding whether it will be one that has already been
developed and will be used “as is,” one that already exists but will
be revised, or one that will need to be developed. This decision
will be based on the needs of the research study design and what
instruments are available. Those details will be more thoroughly
examined in the next section.
Selecting the Research Instrument
Selecting a research instrument for any project is usually the result
of an extensive review of the literature. For example, in studying
domains of teacher identity and how these impact student learning
in physical education, a review of quantitative measurement
instruments available would identify 17 existing studies and
corresponding instruments used to measure teacher identity (i.e.,
Hanna, Oostdam, Severiens, & Zijlstra, 2019). Once it is
determined that an appropriate instrument exists, the next step is
to assess its acceptability. Whether the instrument is reliable
(measures consistently), whether it is objective (free of tester
bias), and whether it is valid (measures what it is supposed to
measure) are important characteristics in this part of the process.
Frequently, a data collection tool will be selected on the basis of its
reliability and validity as reported by the researchers who have
used it in earlier studies. If these measures are not provided,
various statistical techniques can be applied to determine the
reliability and validity of the instrument. Sometimes, researchers
must make their own determination of the reliability and validity of
the instrument they plan to use. Part of determining validity is
analyzing whether or not the instrument is appropriate. For
example, can the research participants meet the demands of the
instrument, or is it too hard or too easy for them? The vocabulary
of an instrument may be appropriate for one age group and not for
another. The same can be true for certain physical performance
tests. If the instrument is not appropriate, it will not produce valid
data, and the data produced will be of no use in answering the
research question. Further information about these important
assessment instrument and tool attributes and how they are
determined is available in Chapter 13, “Measurement in
Research.”
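As one common illustration of such a statistical technique, internal-consistency reliability is often estimated with Cronbach's alpha, which can be computed from the item variances and the variance of the total scores. A minimal sketch, using hypothetical response data:

```python
# Sketch: Cronbach's alpha as an internal-consistency reliability
# estimate. Rows = respondents, columns = scale items; data are
# hypothetical. Population variance is used consistently throughout.
from statistics import pvariance

def cronbach_alpha(data):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(data[0])                      # number of items
    items = list(zip(*data))              # transpose: one tuple per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - item_var / total_var)

responses = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Values closer to 1.0 indicate that the items measure a common construct consistently; researchers often look for alpha of roughly .70 or higher, though acceptable thresholds vary by purpose.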
In selecting an instrument, a researcher must also consider
other criteria, such as how easy the test is to administer, how
much it costs in time and money, and how easy it is to score.
Ideally, the researcher selects the most reliable and valid
instrument that has been used in similar populations. In reality,
however, if that instrument is difficult, time-consuming, costly, and
demanding of the research participant, the researcher may instead
select an instrument that is a little less reliable and valid but is
easier to handle, has a shorter administrative time span, is less
costly, and is less demanding of the research participants.
Revising the Instrument
Frequently, a researcher locates an instrument that is not quite
acceptable for the intended research situation, and in this case the
instrument must be revised. Depending on the situation, it may be
necessary to contact the original instrument designers to request a
copy of the complete instrument and administrative procedures.
As a matter of professional courtesy, the researcher may need to
request permission from the original author/developer to modify a
proprietary instrument. When a standardized, copyrighted
instrument is utilized, either as-is or revised, permission from the
instrument’s publisher is required. Changes in an original
instrument may take many forms, but are usually undertaken to
better fit a particular group of research participants. For example,
self-efficacy has been studied in different age groups, sexes,
races, and individuals with disabilities. Obviously, an instrument
designed to measure this variable cannot be used satisfactorily
with all participant groups, so resources and instruments for
measuring self-efficacy have been developed for children, adults,
teachers, and individuals with disabilities (e.g., Martin, 2018) as
well as many other groups. Ultimately, however, revising an
instrument may alter its reliability and validity, and new measures
of these constructs may need to be determined, because the reliability and validity of an instrument are specific to the scores produced with a particular sample.
Instrument Development
It is difficult and time-consuming to develop an instrument, and this
procedure is best avoided by both veteran and beginning
researchers whenever possible. The task becomes necessary only
when it is obvious that no survey, either electronic or mechanical,
or no physical performance test or instrument exists to generate
the needed research data. The starting point for instrument
development is, once again, the literature. All the dynamics in the
prospective research problem must be thoroughly understood, and
what instruments are available to use as models must be known
before the instrument development process begins. Suppose a
researcher is interested in determining the motivation of adults for
participating in recreational sport. The dynamics involved in this
phenomenon are varied and comprise intellectual, sociological,
psychological, and physiological components. A thorough review
of the literature must be completed and should include related
research and conceptual articles reflecting each of those
components as well as what instruments others have employed in
certain situations. The knowledge accrued from such a review will
provide the basis for determining the content of the instrument,
that is, what type of questions and/or statements will contribute
most in the attempt to measure the motivation of adults who do or
do not participate on recreational sport teams.
The usual next step is to select the questions or statements
and produce a tentative instrument. In writing these research items, care is taken so that each item provides a reasonable estimate of a relevant component of what may motivate adults to participate in recreational sport. If, for example, one of the
components related to adult sport participation is the individual’s
perception of whether or not athletic participation will enhance the
chances of her or him being successful in life, then the
researcher’s task is to develop questions or statements that will
provide an estimate of this attribute. If the participant perceives
that athletic participation will enhance their self-efficacy in sport
and status among their adult peers, the instrument should contain
items that will provide a measurable estimate of these
components.
Once the tentative instrument has been developed, the
researcher invites selected individuals, who are considered to
have expertise in the problem area, to review the content of the
instrument. These can be people who have taught, researched, or
written in the area being studied. The task of the experts is to
review the research questions or statements and revise them as
needed. In addition, if the items are formed around a theory or
concept, the experts may also be asked to assign items to content
areas. For example, in the Comprehensive School Physical
Activity Program, items are grouped according to components
such as before-/after-school physical activity, during-school
physical activity, physical education, staff physical activity
programming, and community physical activity programming.
Experts also determine which, if any, items should be rewritten.
Sometimes, certain items should be deleted or added; some items are central to the essence of the problem, while others are relatively insignificant. Needless to say, the work of the experts helps to eliminate instrument flaws.
Submitting the proposed instrument to this vetting process is an
excellent way to increase the content validity of the proposed
measurement instrument. In most instances, the instrument
undergoes this process only once, but, on occasion, it may be
necessary to repeat the procedure two or more times.
The next step in the instrument development process is a pilot
study or preliminary investigation. The intended instrument is
administered to selected individuals from the same population that will make up the actual research sample, but who are not included in that sample. This
precursor serves as a trial run of the instrument to see if it is in
need of further revision. The objectives of a pilot study are (a) to
determine whether or not research participants are likely to
understand the content items in the instrument, (b) to determine
whether the instrument will provide the needed data, (c) to
familiarize the researcher with instrument administration
procedures, (d) to obtain a set of data for practicing proposed data
treatment techniques, and (e) to determine the reliability and the
validity of the instrument.
As an additional step, the researcher will further revise the
instrument if the pilot study shows change to be necessary. A pilot study may not always be needed, but it is a valuable research tool. Knowing from a pilot study that the instrument is sound, that it will yield reliable and valid data, that the administration procedures are appropriate, and that the research plan is a proper approach are valuable benefits of conducting a pilot study. The actual research study itself may not be completely
without problems, but the prospects of severe trouble occurring
are diminished by having performed, in essence, a test run. With
expert evaluation and a successful pilot study, the researcher may
have confidence in his or her completed instrument.
The preceding discussion specifically addressed the creation of a survey instrument; however, the same basic steps occur in the process of developing any data collection instrument. These steps are as follows:
1. Survey the literature.
2. Develop a tentative instrument.
3. Revise the instrument as needed.
4. Obtain opinions of experts concerning the instrument.
5. Revise the instrument as needed—again.
6. Conduct a pilot study.
7. Revise the instrument as needed—again.
8. Deploy the instrument.
Transforming a Research Plan into a Basic
Grant Application
Depending on the stage, a research plan can be modified fairly
simply into a grant proposal. The process also can be driven from
the opposite end of the spectrum with the grant proposal being
altered to become a research plan. Both documents are
foundationally similar, apart from two primary differences. First,
development of the grant proposal must follow the strict guidelines
of the funding agency. Second, specific requirements from the
granting agency for the grant proposal may differ from content in a
research proposal. For example, in a grant proposal, the requester
may need to show pilot data for the project in order to demonstrate
that if monies are invested in the proposal, the project will likely
produce effective results. The following section provides a brief
overview of identifying funding sources, the difference between
internal grants and external grants, the components needed for a
typical grant proposal, and what to expect following a grant
submission.
CHAPTER 11
Ethical Concerns in Research
Public acceptance of the scientific method as a source of
knowledge is founded, at least in part, on the integrity associated
with the process. While it is impossible to legislate ethics, the
scientific community has sought to develop ethical standards that
provide direction for conducting research. The effectiveness of
such standards is ultimately dependent upon the adherence and
practices of individual researchers. Because most research in
kinesiology involves the use of human research participants in one
way or another, it is important for researchers to consider the
ethical principles and regulations that guide their research
activities.
OBJECTIVES
45 CFR 46
Assent
Authorship
Beneficence
Common Rule
Confidentiality
Ethics
Helsinki Declaration
Informed consent
Institutional Review Boards (IRBs)
Justice
Nazi experimentation
Nuremberg Code
Plagiarism
Privacy
Respect for persons
Scientific misconduct
Self-plagiarism
Situational ethics
Tuskegee Syphilis Study
Malaria Experiments
Prisoners and concentration camp inmates were either purposely
exposed to mosquitoes that were known to carry the malaria parasite or were given direct injections of the parasite, in order to
investigate the effect of various antimalarial compounds. After
contracting malaria, the participants were treated with various
drugs to test their relative efficacy. Testimony revealed that victims
who did not die of malaria often suffered substantial and
debilitating side effects from the various compounds being tested.
The alleged intent of these experiments was to determine effective
immunization and treatment for malaria.
High-Altitude Experiments
Supposedly, to investigate the limits of human endurance and
existence at extremely high altitudes, Nazi physicians placed
concentration camp inmates in low-pressure chambers capable of
reproducing atmospheric conditions and pressures of extremely
high altitudes. As the simulated altitude was progressively
increased, testimony states that victims experienced excruciating
pain, spasmodic convulsions, grave injuries, and frequently, death
due to the enormous number of air embolisms that developed in
the brain, coronary vessels, and other internal organs.
A detailed account of the Nuremberg war crimes trials of Nazi
doctors and their horrific experiments in the name of science is
discussed in the book, Doctors from Hell (Spitz, 2005). Ironically,
even as the Nuremberg war crimes trials were being conducted,
the U.S. Public Health Service (PHS) was simultaneously
supporting a research project that ultimately constituted an
example of unethical behavior and blatant misconduct in the name
of science—the Tuskegee Syphilis Study.
Please note that “exempt” does not imply that the researcher
can simply ignore IRB guidelines and/or not submit required forms;
it only means that the proposed research is exempt from the IRB’s
full review process. In reality, many institutions actually extend
their requirement for IRB review beyond that specified by federal
regulations; this is another reason to become familiar with local
IRB policies.
If the research does not qualify for exempt status, it may only
require an expedited review. This pertains to research in which
there is no more than a minimal risk to the participants. In these
cases, the IRB review is conducted by the chair of the IRB or a
designated member or members of the committee. Regulations
define minimal risk to mean that “the probability and magnitude of
harm or discomfort anticipated in the research are not greater in
and of themselves than those ordinarily encountered in daily life or during the performance of routine physical or psychological
examinations or tests.” Moreover, minimal risk is determined to be
relative to the daily life of a normal, healthy person (Common
Rule, 2018).
In some cases, the research plan and design may require a full
IRB review. A full review generally requires the researcher to appear before the IRB to provide further details about the study and to address questions from the board members. When the proposed research project involves more than
minimal risk to the participants, this route will be required for
approval. Proposals for research involving animals are also
reviewed by the IRB. In addition to the care needed to work
ethically with human participants, the same care is needed in
working with animal participants in research studies. The next
section will address those concerns.
Research Involving Animals
Although the general nature of kinesiology precludes the frequent
use of animals in research endeavors, it is not uncommon to find
the results of learning studies or biomedical studies utilizing
animals published in our field’s professional journals. Animal
research has long been a part of biomedical research and has
produced significant benefits for humans. Questions have arisen,
however, about the way animals are selected and treated as
research participants.
To that end, governmental agencies and professional
associations have issued ethical guidelines and regulations for the
utilization and care of animals during testing and research.
Although various guidelines for the proper care and use of animals
in research have existed for many years, the U.S. Government
Principles for the Utilization and Care of Vertebrate Animals Used
in Testing, Research, and Training, 8th Edition (National
Research Council, 2011; originally published in 1972) provides
primary directives. In general, the guidelines relate to the
transportation, care, and use of vertebrate animals in ways that
are judged to be scientifically, technically, and humanely
appropriate. Similar regulations are included in the 2015 PHS
Policy on Humane Care and Use of Laboratory Animals
(https://olaw.nih.gov/policies-laws/phs-policy.htm). Moreover,
the Animal Welfare Act of 1985 (Public Law 99-158) established
assurance procedures to monitor compliance with federal
regulations for conducting research using animals. The APA has
also developed standards for ethical conduct in the care and use
of animals in research and testing (American Psychological
Association, 2012). For further information, readers
contemplating research involving animals are advised to consult
the sources named here as well as their local IRB. One important
component of the IRB process is informed consent for human
participants. It is considered the foundation of ethical practice.
Informed Consent
The first provision of the Nuremberg Code (and arguably the most
important pillar underlying ethical standards governing research
involving human participants) is consent. This is often seen as the
key aspect of obtaining approval for conducting research involving
human participants. In the United States, the DHHS articulates the
requirements for informed consent in the Common Rule (2018).
For more information, please refer to the U.S. Health and Human
Services website under informed consent. A sample adult
informed consent form from Arizona State University (ASU) is
shown in FIGURE 11.3.
(https://researchintegrity.asu.edu/human-subjects/forms).
Figure 11.3 Sample informed consent form.
Arizona State University, Office of Knowledge Enterprise, see https://research.asu.edu
Participant Assent
If the prospective research participants are children or youth, the
regulations require both the assent of the child or minor and the
permission of the parent or guardian (FIGURE 11.4). Please note
that federal regulations (i.e., DHHS) are minimum adherence
standards, and various institutions may be more stringent.
Obtaining parent or guardian permission and child or youth assent
to participate in a project may be one of the more challenging
issues facing researchers embarking on a new scientific project.
Confidentiality
The informed consent form should also indicate how the
researcher will protect the confidentiality of participants. In most
cases, the researcher is interested in group data and will
aggregate individual scores or information. Confidentiality is a
major factor in determining whether certain types of research
projects involving educational tests, surveys, interviews, or the
observation of public behavior may be considered exempt from
federal policy governing research. According to 45 CFR 46 (US
Department of Health and Human Services, 1974), researchers
are responsible for ensuring the confidentiality of personal information
obtained from human research participants unless express
permission to the contrary is granted by the research participant.
To that end, many good practices or procedures exist that
might aid a researcher who wants to ensure the confidentiality of
the research participants. The following are some good practice
procedures that might be used to maintain confidentiality:
Obtain anonymous information (e.g., survey research where
participants provide no identifying information).
Code the data in a way that eliminates identifying information
(e.g., identification number).
Substitute pseudonyms for names and information that could
identify the participants (e.g., school name).
Do not share or report data at the individual level.
Limit access to data that could reveal a participant’s identity
(e.g., only the principal investigator would have access to the
data).
Report the data only in aggregate form.
Use computerized methods for encrypting and storing data.
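Several of these practices can be combined in a short script. The following minimal sketch (in Python, with entirely fictitious records and an assumed salt value) illustrates coding the data with identification numbers, substituting generic labels for identifying information, and reporting only in aggregate form:

```python
import hashlib

# Hypothetical raw records; the names and scores are illustrative only.
records = [
    {"name": "Participant A", "school": "Lincoln Elementary", "score": 42},
    {"name": "Participant B", "school": "Lincoln Elementary", "score": 37},
    {"name": "Participant C", "school": "Roosevelt Middle", "score": 45},
]

def deidentify(record, salt="study-2024"):
    """Replace identifying fields with a coded ID and a generic label."""
    # A salted hash yields a stable identification number without
    # storing the participant's name alongside the data.
    coded_id = hashlib.sha256((salt + record["name"]).encode()).hexdigest()[:8]
    return {"id": coded_id, "school": "School X", "score": record["score"]}

coded = [deidentify(r) for r in records]

# Report only in aggregate form; individual-level data are not shared.
mean_score = sum(r["score"] for r in coded) / len(coded)
print(f"n = {len(coded)}, mean score = {mean_score:.1f}")
```

In practice the salt (and any key linking codes back to identities) would itself be stored securely and accessible only to the principal investigator.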
Additionally, when research involves collecting data about
sensitive issues (e.g., illegal behaviors), it is essential that the
researcher is able to provide assurances of confidentiality to the
participants. In fact, in certain instances, researchers are able to
obtain a “certificate of confidentiality” (CoC) that protects the
identities of research participants or research data even against
subpoena by law enforcement agencies. A CoC is issued
automatically for projects funded by the National Institutes of
Health (NIH). For non-NIH-funded projects, the researcher can
also apply for a CoC from the NIH. In general, the more sensitive
the data being collected, the more attentive the researcher needs
to be in protecting confidentiality.
One example of a project where the confidentiality of the
research participants was not maintained was a partnership
between the Havasupai Tribe and Arizona State University.
In what became known as the Diabetes Project (http://genetics.ncai.org/case-study/havasupai-Tribe.cfm), researchers were looking for a genetic link to type 2 diabetes. The project included testing blood samples, examining associations between genes and diabetes risk, and providing health education for community members. However, significant
concerns were later raised when community members discovered
that the blood samples were also being tested for markers of
inbreeding, migration, and schizophrenia. Ultimately, the
Havasupai Tribe filed a lawsuit against the Arizona Board of
Regents for complaints over a lack of informed consent, violation
of civil rights in the way the blood samples were handled,
inappropriate uses of data, and lack of confidentiality.
This and other similar cases bring up several important points.
First, the use of broad language in consent forms can lead to
misunderstandings with participants. The language in these
informed consent forms indicated that the blood would be used for
genetic studies on diabetes and for research on “behavioral and
medical disorders.” Second, by using the Havasupai Tribe name
on research papers, researchers put the community members’
confidentiality at risk. Another issue can be HIPAA rights and access to participants’ medical records. In the ASU case, the researchers had access to participants’ records on diabetes; however, data on other medical issues were also (inappropriately) collected. In the end, the lawsuit was settled out of court, and restitution was made to community members. In addition to the ethical care researchers must take in preparing the proposal, obtaining IRB approval, and carrying out the research project, care is also needed in how research findings are disclosed.
Disclosure of Research Findings
After gaining IRB approval and following the path of the research
plan, the culmination of a project generally results in the disclosure
of the results in one form or another. Usual methods of disclosure
include publication in a professional journal in one’s field, a
presentation at a professional conference, a report to a sponsoring
organization, a news release, and/or a completion of a thesis or
dissertation (prior to later publication). It is through the disclosure
of research findings that the scientific method furthers knowledge,
thus affecting professional practices and personal choices. In
order for research to be viewed as a credible source of knowledge,
the honest and accurate disclosure of research findings is
essential. Chapter 18, “Writing a Research Report,” provides
additional information on disseminating the results of a research
study.
Even though the pathways are often widely diverse, all
researchers have a common ethical obligation to report findings of
research studies with human participants. The dissemination of
results leads to “good science” in kinesiology. Sharing findings is
more than a personal choice—it is an ethical obligation. For
example, healthcare advances need to be grounded in knowledge
that is accurate, up-to-date, and complete. However, evidence
suggests that researchers may not publish studies promptly, or
they may neglect publication completely, for projects with
inconclusive or negative findings (Alley, Seo, & Hong, 2015).
As noted above, not reporting findings is one type of researcher negligence, but other forms of scientific misconduct,
as defined by the DHHS Office of Research Integrity
(https://ori.hhs.gov/definition-misconduct), exist. These
guidelines suggest that research misconduct means fabrication,
falsification, plagiarism, or other practices that seriously deviate
from those that are commonly accepted within the scientific
community for proposing, conducting, or reporting research. It
does not include honest errors or honest differences in
interpretations or judgments of data. As another example, the
National Science Foundation (NSF) provides a similar definition of
research misconduct that can be found in 45 CFR 689 (please see
the National Science Foundation website under key regulations;
National Science Foundation, 1982). The NSF indicates that
research misconduct means fabrication, falsification, plagiarism, or
other serious deviation from accepted practices in proposing
research, performing research, reviewing research proposals, and
reporting results from activities funded by the NSF.
It is clear from these definitions that fabrication, falsification,
and plagiarism serve as the underlying elements of research
misconduct, at least as defined by these federal agencies. While
the notion of fabricating or falsifying research findings is contrary
to the principles underlying the scientific method, a number of
reported instances in which researchers have made up data,
altered data, or otherwise misrepresented the true findings of their
study continue to draw attention. Examples of scientific
misconduct are regularly reported in publications such as The
Chronicle of Higher Education, as well as by the Office of
Research Integrity’s ORI Newsletter.
One recent example published in The Chronicle of Higher Education involved a professor and director of the Food and Brand
Lab at Cornell University. He had published some popular works,
including a best-selling book about “why we eat more than we
think,” and a study where he found people ate more if their plates
were larger. The professor also developed a successful lunch
program that is used in more than 30,000 schools. The researcher
had been giving what some thought was “questionable research
advice” on a blog, and this prompted a series of events culminating with the Journal of the American Medical Association (JAMA) retracting six of his research papers. He was charged with
research misconduct in the areas of misreporting research data,
problematic statistical techniques, and a lack of maintaining
research data and results, as well as inappropriate authorship.
The professor agreed to work with the university to review previous publications for appropriateness and to report any problems found in previous work (Bartlett, 2018).
To prevent cases of misconduct like this one, the Office of
Research Integrity (ORI; https://ori.hhs.gov/) is responsible for
developing policies, procedures, and regulations related to the
detection, investigation, and prevention of scientific misconduct in
biomedical and behavioral research conducted or supported by
the DHHS or its agencies. Considerable resources to improve the
integrity of research and reduce research misconduct are available
at the ORI website, including reports documenting incidents of
misconduct. In addition, severe penalties are usually imposed on
researchers who are found guilty of scientific misconduct.
Drowatzky (1996) provides an extensive listing of possible
penalties, including letters of reprimand, salary freeze, reduction in
professorial rank, prohibition from obtaining outside grants, fines,
dismissal from university employment, and referral to the legal
system for further action. Depending upon the regulations at the
given institution, student researchers caught falsifying research
findings could possibly receive a failing grade, be placed on
probation, suspended, or even expelled. For example, a graduate
student in kinesiology at the home institution of one of the authors
of this textbook received a grade of “F” for 9 hours of credit and
was suspended from school for fabricating data in research
undertaken for the thesis requirement. Fabricating data, however,
is just one area of scientific misconduct; another is plagiarism.
Plagiarism is frequently recognized as a problem with student
papers or reports, but plagiarism among professionals also occurs
in scientific writing. Foundationally, plagiarism is simply the
presentation of the ideas or works of others as your own without
giving proper credit. The ethical guiding principle is merely to give
proper credit to those whose work you borrow or cite. This is done
through appropriate documentation in accordance with the chosen
style manual. Sometimes plagiarism is accidental; the writer is
either unfamiliar with the rules for correctly paraphrasing or
quoting sources or is careless in recording and citing sources. A
number of universities have established websites that provide tips
on avoiding plagiarism. Two such examples are Northwestern University (see the plagiarism resources on Northwestern University’s website) and Indiana University (see “How to Recognize Plagiarism: Tutorials and Tests”; Indiana University, 2002–2019).
In addition to “borrowing” information from others, self-
plagiarism can also be a problem. Self-plagiarism is the
presentation of your own written words more than once. Here are
some common academic examples of self-plagiarism: (a)
submitting the same paper in more than one class, (b) using parts
of a previously submitted paper for a comprehensive exam
response, (c) using part of your thesis in your dissertation, and (d)
submitting similar research articles to different journals with
different titles (https://provost.asu.edu/academic-
integrity/resources/students). It is important that authors are
given credit for their words and their work—even if you are the
author.
Authorship is the primary method through which researchers
receive recognition for their research efforts. Authorship should be
limited to those who have made a significant contribution to the
research project. In fact, it is generally recommended that the
order of authorship be based on the magnitude of one’s
contribution to the research. The researcher who is primarily
responsible for conceptualizing the problem and developing the
research plan is usually listed as first author in kinesiology (note
that this differs in other subject areas—for example, in medicine
the lab director is often listed last). Additional authors are listed in
order of their contributions. Moreover, all coauthors must agree to
be listed as such. Inclusion of a coauthor strictly on the basis of
the position of authority they hold (e.g., departmental chairperson)
is also inappropriate. It is recommended that all issues about
authorship be resolved early in the research process—certainly
prior to commencing the publication/reporting phase. Publication
of student research undertaken for a thesis or dissertation
requirement poses special problems. The student should always
be listed as first author, and the faculty chair of the thesis or
dissertation committee may be included as a coauthor. Other
faculty may be included, as well, depending upon their contribution
to the research. The ethical principles published by the APA and
the AERA both include standards pertaining to authorship and
should be consulted for further information (please see the
American Psychological Association and American Educational
Research Association websites under ethics).
The financial support provided by corporations, agencies, and
foundations has provided many researchers with the means to
conduct research activities and thus further the body of
knowledge. Without this support, it is quite likely that the
magnitude and quality of research would be different from what we
see today. Yet sponsored research poses particular ethical
concerns. Who owns the data? Who controls the release of the
research findings? Are limitations placed on subsequent
publication of the results when they are unfavorable to the
sponsor? Is it appropriate to withhold sponsorship information
when obtaining informed consent? Controversy over matters such
as these can often be resolved through contractual agreement
with the research sponsor. In these cases, it is important for the
researcher to carefully read such contracts prior to signing and to
understand the rights and responsibilities articulated therein.
Violation of contractual agreements is not only unethical, but it
also constitutes a breach of law. A specific standard concerning
the ethical integrity associated with sponsored research is
included in the code of ethics of the AERA.
Summary of Objectives
1. Discuss various instances of ethical misconduct in
research involving humans (i.e., Nazi experimentation and
the Tuskegee Syphilis Study). The atrocities associated with
human experimentation by Nazi scientists during World War II
shocked the world and served as the impetus for the
development of ethical codes of conduct for biomedical and
behavioral research. Revelations of scientific misconduct in the
United States, including the Tuskegee Syphilis Study,
prompted further reforms.
2. Identify ethical codes and federal regulations that serve to
guide research in the United States. The international
outrage over Nazi experimentation led to the development of
the Nuremberg Code, a set of basic principles that govern
research involving human participants. In the United States,
federal policies and regulations now provide ethical guidelines
for human and animal participants.
3. Understand the role and function of Institutional Review
Boards in approving research activities. Institutional review
boards have been established to provide a mechanism for the
protection of research participants and to ensure compliance
with federal policy.
4. Understand the importance of informed consent in
research involving human participants. The key provision of
the Nuremberg Code demands that research participants are
made fully aware of the nature and purpose of the project in
which they will be participating, that they voluntarily give
consent, that they are legally capable of giving such consent,
and that the researcher is responsible for obtaining the
participants’ consent. Many professional associations have
also promulgated ethical standards that serve to guide
research activities of their members as well as the disclosure of
research findings.
5. Explain ethical considerations involved in the disclosure
of research results. When publishing or reporting the results
of their research studies, researchers are responsible for
ensuring that their findings have not been falsified, fabricated,
or plagiarized. Severe penalties are usually imposed on those
who are found guilty of scientific misconduct.
References
Alley, A. B., Seo, J. W., & Hong, S. T. (2015). Reporting results of research involving
human subjects: an ethical obligation. Journal of Korean Medical Science, 30(6), 673–
675.
American Educational Research Association (AERA). (2011). Code of Ethics. Retrieved
from www.aera.net/About-AERA/Key-Programs/Social-Justice/Research-Ethics
American Psychological Association (APA). (2012). Guidelines for ethical conduct in the care and use of nonhuman animals in research. Retrieved from
https://www.apa.org/science/leadership/care/care-animal-guidelines.pdf
American Psychological Association (APA). (2018) Code of Ethics. Retrieved from
www.apa.org/ethics/code/index.aspx
American Sociological Association (ASA). (2018). Code of ethics for kinesiology
sociologists. Retrieved from www.asanet.org/code-ethics
Bartlett, T. (2018). As Cornell finds him guilty of academic misconduct, food researcher
will retire. Chronicle of Higher Education, 65(4), 1–4.
Cohen, J. S. (2018). Illinois regulators are investigating a psychiatrist whose research
with children was marred by misconduct. Chronicle of Higher Education, 65(16), 1–4.
Drowatzky, J. N. (1996) Ethical decision making in physical activity research.
Champaign, IL: Human Kinetics.
Foe, G., & Larson, E. L. (2016). Reading level and comprehension of research consent forms: An integrative review. Journal of Empirical Research on Human Research Ethics, 11(1), 31–46. doi:10.1177/1556264616637483
Goliszek, A. (2003). In the name of science: A history of secret programs, medical
research and human experimentation. New York: St. Martin’s Press.
Indiana University, Bloomington. (2019). How to recognize plagiarism: Tutorials and
tests. Retrieved from https://www.indiana.edu/~academy/firstPrinciples/tutorials/
Jones, J. H. (1983). Bad blood: The Tuskegee syphilis experiment. New York: The Free
Press.
McCuen, G. E. (1998). Human experimentation: When research is evil. Hudson, WI:
GEM Publications.
National Research Council (2011). Committee for the Update of the Guide for the Care
and Use of Laboratory Animals. Guide for the Care and Use of Laboratory Animals.
8th edition. Washington (DC): National Academies Press (US); Appendix B, U.S.
Government Principles for the Utilization and Care of Vertebrate Animals Used in
Testing, Research, and Training. Available from:
https://www.ncbi.nlm.nih.gov/books/NBK54048/
National Science Foundation. (1982). 45 CFR 689: Research Misconduct. Retrieved
from https://www.nsf.gov/oig/_pdf/cfr/45-CFR-689.pdf
Reverby, S. (2000) Tuskegee’s truths: Rethinking the Tuskegee syphilis study. Chapel
Hill, NC: University of North Carolina Press.
Spitz, V. (2005). Doctors from hell: The horrific account of Nazi experiments on humans.
Sentient Publications.
“Trials of War Criminals before the Nuremberg Military Tribunals under Control Council
Law”. (1949). No. 10, Vol. 2, pp. 181–182. Washington, D.C.: U.S. Government
Printing Office.
US Department of Education. (1974). Family Educational Rights and Privacy Act
(FERPA). Retrieved from www.ed.gov/ferpa
US Department of Health and Human Services. (1996). Health Insurance Portability and
Accountability Act (HIPAA). Retrieved from www.hhs.gov/hipaa/index.html
US Department of Health and Human Services, Code of Federal Regulations. (1974). 45
CFR 46: Protection of Human Subjects under United States Law. Retrieved from
https://www.govinfo.gov/content/pkg/CFR-2016-title45-vol1/pdf/CFR-2016-
title45-vol1-part46.pdf
US Department of Health and Human Services, Regulations & Policy. (1979). The
Belmont Report: Ethical Principles and Guidelines for the Protection of Human
Subjects of Research. Retrieved from https://www.hhs.gov/ohrp/regulations-and-
policy/belmont-report/read-the-belmont-report/index.html
US Department of Health and Human Services, Regulations & Policy. (2018). Revised
Common Rule. Retrieved from https://www.hhs.gov/ohrp/regulations-and-
policy/regulations/finalized-revisions-common-rule/index.html
World Medical Assembly, Policy. (2013). Declaration of Helsinki: Ethical Principles for
Medical Research Involving Human Subjects. Retrieved from
https://www.who.int/bulletin/archives/79%284%29373.pdf
© Liashko/Shutterstock
CHAPTER 12
Identifying, Recruiting, and
Selecting Research
Participants
All steps in the research process are important, but perhaps the
most crucial to a successful investigation are the procedures used
in the identification, recruitment, and selection of research
participants. In qualitative research studies, the investigator often identifies, recruits, and selects sites and participants in a purposive way. This intentional selection allows the investigators to gain a
deep understanding of the phenomenon or situation they are
interested in studying. In quantitative research, often the goal is to
study a population. This extremely difficult task may require the
use of a randomized sampling process. In this process, the
researcher first identifies and defines the population from which
information is to be obtained. Considerations are given to such
factors as sample size, how representative the sample is of the population, and the appropriateness of various research methods and
assessments for the participants. The process of sampling in this
manner permits the researcher to generalize the results from a
study of the sample to the larger population. Sampling is used in quantitative designs, including the quantitative portion of mixed-methods designs. The mixed-methods design is unique in that it can follow a qualitative pattern, a quantitative pattern, or some combination of the two, allowing the researcher maximum flexibility in addressing the research questions.
OBJECTIVES
Cluster sampling
Element
Fishbowl technique
Multistage sampling
Nonprobability sampling
Population
Probability sampling
Purposive sampling
Random assignment
Random selection
Representative sample
Sample
Sample size
Sampling
Sampling error
Sampling frame
Sampling unit
Saturation
Simple random sample
Statistical power
Stratified random sampling
Systematic sampling
Table of random numbers
Populations and Samples
Populations
The term population, often the focus of a research effort, refers to
an entire group or aggregate of people or elements having one or
more common characteristics. A population can be composed of
people, animals, objects, organisms, institutions, attributes,
materials, or any other defined element. For example, 4-year
colleges and universities in the United States, all elementary
school teachers in Iowa, all adolescents in Battle Creek, Michigan,
all NCAA (the National Collegiate Athletic Association) Division I
baseball coaches, and all health education majors at the
University of Georgia can be referred to as populations. In short,
researchers can define or describe a population any way they
choose to suit the research purpose. The effects of physical
activity on cognitive executive function, for example, could be
studied among Americans who are older, athletes, individuals with
disabilities, various socioeconomic groups, teenagers, or any other
group.
Populations can be defined in numerous ways. They can be
quite large—theoretically infinite—as would be the case of fitness
tests of all high school students; or, they can be finite, such as the
number of dance majors at a college. As a general practice, populations are typically considered finite. From these populations, research studies generally draw smaller numbers of participants, or samples. For example, a physical activity study involving older adults may require a large number of participants; a climbing study on the relationship of nutrition and altitude may involve just a few participants; and a researcher studying the golf swing of Ladies Professional Golf Association (LPGA) women golfers on tour will need to recruit female professional golfers from the LPGA tour.
Sampling
Because of the difficulties related to studying entire populations,
the process of sampling is often used in quantitative studies.
Sampling is the process whereby a small proportion or subgroup
of the population is selected for scientific observation and analysis.
A sample is a subgroup of a population of interest that is thought
to be representative of that larger population. The researcher
observes and analyzes certain characteristics of the sample to get
a better idea of how those characteristics might relate to the entire
population. From the information (data) obtained from the sample,
the researcher then makes inferences about the population. As an
example, the mean blood pressure score for a sample of cardiac
rehabilitation participants could be used to approximate the mean
blood pressure for an entire population of cardiac rehabilitation
participants enrolled in a rehabilitation program at one hospital.
These statistics are then used to estimate the population
information (parameters). The extent to which the results of a
study can be generalized from the sample to the population of
interest is referred to as population validity. Qualitative studies
generally have much smaller sample sizes and are not as
concerned with generalizability.
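The use of sample statistics to estimate population parameters can be illustrated with a brief simulation. In this sketch, the blood pressure values are simulated for illustration only, not real participant data:

```python
import random
import statistics

random.seed(1)  # fixed seed so the simulation is reproducible

# Hypothetical population: systolic blood pressure (mmHg) for 500
# cardiac rehabilitation participants (values are simulated).
population = [random.gauss(130, 12) for _ in range(500)]

# Draw a simple random sample of 50 participants.
sample = random.sample(population, k=50)

# The sample statistic (mean) estimates the population parameter.
print(f"Population mean (parameter): {statistics.mean(population):.1f}")
print(f"Sample mean (statistic):     {statistics.mean(sample):.1f}")
```

Running this repeatedly with different seeds shows that sample means cluster around the population mean, which is the logic underlying statistical inference from samples.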
Qualitative Samples
One major reason for doing research is to obtain information to
better understand a phenomenon and report results in rich, thick
detail (the use of meticulous descriptions that may or may not
relate to any other settings). Five of the most frequently employed
strategies for qualitative inquiry include: (a) narrative strategies
(e.g., biographical, autoethnographic, historical action); (b)
phenomenological and ethnomethodological strategies; (c) case
study strategies; (d) grounded theory strategies; and (e) clinical
strategies (Denzin & Lincoln, 2018). These strategies were
discussed in greater detail in Chapter 4, “Qualitative and Mixed-
Methods Research.” The type of sample relates to the type of
inquiry strategy being used by the researcher. The investigator
may be studying individuals or small, purposively selected groups of participants (e.g., narrative strategies, phenomenological and
ethnomethodological strategies), or seeking to learn more about
processes, events, and activities (e.g., case study strategies,
grounded theory strategies), or seeking to gain unique information
about individuals or groups (e.g., clinical strategies). In general,
qualitative research designs typically include smaller numbers of
participants as opposed to quantitative samples.
Quantitative Samples
In addition to obtaining information to better understand a
phenomenon, another major reason for doing research is to gather
data that can be generalized to a population of interest. This can
be accomplished by studying an entire group, but this is often
impractical, if not impossible, to do. For example, if we really want
to determine the physical fitness status of all preadolescent
children in the United States, we could fitness test all members of
that group. This would, however, be extremely challenging in
terms of the number of researchers needed to collect the data, the
tremendous amount of money needed to finance the project, and
the exorbitant amount of time and energy the project would
require.
Sampling Process
Due to the challenges of studying entire populations, random
sampling is often used. The first step in the sampling process is
the identification of the target population. Because it is seldom
realistic to work with the entire target population, it is usually
necessary to identify that portion of the population that is available
to the researcher—the accessible population. This represents the
sampling frame, or collection of elements, from which the sample
is taken. Obviously, the accessible population should closely
resemble the target population. The researcher then determines
the desired sample size (e.g., the number of research participants)
and chooses the method of sampling that best meets the needs of
the research study in terms of both feasibility and the ability to
select an appropriate sample. Finally, the sampling plan is
implemented, and the research participants are selected for
recruitment. It is important to note that the sampling process
described here follows a logical, top-down approach, whereby the
population of interest is first determined and then the sample is
selected. It is an all-too-common mistake for the beginning
researcher to use the opposite approach—first selecting the
research participants and then trying to retrofit this sample into
some population. The recommended steps in the sampling
process are summarized as follows:
1. Identify the target population.
2. Identify the accessible population.
3. Determine the desired sample size.
4. Select the specific sampling technique.
5. Implement the sampling plan (including recruitment of
participants).
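The five steps above can be sketched in code. In this sketch, the target population, the rule defining the accessible population, and the sample size are all hypothetical, and simple random sampling stands in for whatever technique a given study would actually select:

```python
import random

random.seed(7)  # fixed seed for reproducibility

# Step 1: identify the target population (hypothetical student IDs).
target_population = range(1, 10_001)

# Step 2: identify the accessible population -- the sampling frame of
# elements the researcher can actually reach (an illustrative subset).
sampling_frame = [sid for sid in target_population if sid % 2 == 0]

# Step 3: determine the desired sample size.
n = 100

# Step 4: select the sampling technique -- here, simple random
# sampling, where every element in the frame has an equal chance.
sample = random.sample(sampling_frame, k=n)

# Step 5: implement the plan (recruitment of the selected
# participants would follow from this list).
print(f"Frame size: {len(sampling_frame)}, sample size: {len(sample)}")
```

Note that the code mirrors the recommended top-down order: the population and frame are defined first, and the sample falls out of them, not the reverse.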
To review, the most crucial element in the selection of
participants is that the sample be a representative sample of the
population in which the characteristics are being investigated. For
example, if a researcher recruits just a few pre-physical therapy
majors to better understand what may be true about a large
number of pre-physical therapy majors, then it is imperative that
the researcher be as certain as possible that the small group
(sample) of majors is representative of the total population of pre-
physical therapy majors. Said another way, whatever variable is
being studied in a particular population (e.g., attitude, behavior,
self-efficacy, fitness, physical activity, and/or motivation) should be
present in the sample drawn from that population. In order to
improve the representativeness of the sample, random sample
selection is employed whenever possible.
Random Processes
In addition to helping ensure the representativeness of the sample
to the population about which generalizations will be made,
random processes also indicate that the researcher was unbiased
as to which members of the population were selected for the
study. Random processes also equalize characteristics among
subsets in research studies requiring multiple groups, such as
experimental and comparison assignments. If each group is
representative of the population, the cohorts will start the research
study equal in ability, and the research will be safe from criticism
about which research participants were selected and placed in
each group. Furthermore, random sampling is a basic requirement
underlying inferential statistical tests.
Before discussing specific sample selection techniques in more
detail, it is important that a clear distinction be made between the
random selection of research participants and the random
assignment of research participants. Although both utilize random
processes to achieve their goals, the underlying purposes are very
different (Mertler, 2019). As we have already seen, the purpose of
random selection is to select a sample that is representative of the
population. This is important for generalizing the results of the
study and enhancing the external validity (i.e., population validity)
of the study. Meanwhile, the purpose of random assignment is to
establish group equivalence by randomly assigning research
participants to condition or comparison groups. It is especially
desirable that groups are equitable prior to the introduction of a
treatment condition, thus providing greater confidence in
attributing posttest differences to the treatment. Random
assignment is also important to the internal validity of a study,
particularly in experimental research. Further discussion of both
internal validity and external validity issues are addressed in
Chapter 3, “Experimental Research.” Because the use of random
selection and random assignment procedures increases the
researcher’s ability to generalize to a population and reduces the
chance of extraneous variables affecting results, randomization
procedures are warranted whenever possible (Creswell, 2014).
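The distinction between random selection and random assignment can be made concrete in a few lines of code; the population size, sample size, and group sizes here are arbitrary illustrations:

```python
import random

random.seed(3)  # fixed seed for reproducibility

# Hypothetical accessible population of 200 volunteer IDs.
accessible_population = list(range(200))

# Random SELECTION: draw a representative sample from the population,
# supporting generalization (external validity).
sample = random.sample(accessible_population, k=40)

# Random ASSIGNMENT: split the selected sample into equivalent
# treatment and comparison groups (internal validity).
shuffled = sample[:]
random.shuffle(shuffled)
treatment = shuffled[:20]
comparison = shuffled[20:]

print(f"Treatment n = {len(treatment)}, comparison n = {len(comparison)}")
```

The two operations use the same random machinery but serve different purposes: `random.sample` decides who is in the study at all, while the shuffle-and-split decides which condition each selected participant experiences.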
The next section will describe a variety of the sample selection
methods employed during qualitative and quantitative research
studies.
Sample Selection Methods
Overall, there are three primary types of sampling procedures. The
first, purposive sampling, is employed when the site and
participants are selected in order to address a specific purpose or
context. This procedure can be employed in either qualitative or
mixed-methods studies. Another option, probability sampling,
involves using techniques wherein the probability of selecting each
participant or element is known. These techniques rely on random
processes in the selection of the sample. In the third major
category, nonprobability sampling (convenience sampling),
random processes are not used during participant selection,
making it impossible to know the exact probability of selecting a
given participant or element from the population. As a result, it is
more difficult to claim that the sample is representative of the
population, and this limits the ability of the researcher to
generalize the results of the study. The following sections provide
greater detail about these three types of sampling procedures.
Purposive Sampling
With this method, the researcher knows that specific
characteristics exist in a certain segment of a population. Because
these traits are extremely critical to the results of the investigation,
the researcher using purposive sampling selects only those
research participants with the characteristics being investigated.
For example, biomechanics researchers often purposely select
excellent performers in a sport when they are trying to determine
correct kinetic and kinematic parameters. From that perspective,
recruited participants could be top gymnasts, swimmers,
basketball players, and other athletes.
In addition, when selecting sites and recruiting participants for
qualitative and mixed-methods studies, purposive sampling is
often employed in order to recruit participants who will help the
researcher best understand the problem or situation. For
example, questions that
might guide the selection of the site and the participants might
include the following: (a) Where will the research take place? (the
setting); (b) Who will be the participants (e.g., who will be
interviewed and/or observed?); and (c) What is the process or the
evolving nature of events that will take place? (e.g., studying elite
college rowers at an NCAA championship).
Mertler (2019) provides additional information about purposive
sampling and discusses nine sampling strategies that have direct
applications to qualitative research and that may be used in
mixed-methods approaches as well. These include the following:
(a) typical sampling, (b) extreme case sampling, (c) maximum
variation sampling, (d) theory or concept sampling, (e)
homogeneous sampling, (f) opportunistic sampling, (g) critical
sampling, (h) snowball sampling, and (i) confirming and
disconfirming sampling. These strategies are discussed in the next
section.
Typical sampling
In typical sampling, the researcher centers the study on a person
or persons, or a site that is representative (or typical) of others in
that area. For example, if you are interested in learning about
participants’ views of local fitness centers, a common fitness
center chain, such as LA Fitness, might be a better potential
partner than an extreme sport training facility (e.g., climbing,
biking, or indoor surfing).
Homogeneous sampling
This method is the opposite of the maximum variation sampling
method. In homogeneous sampling, the investigator employs
individuals or sites that have similar traits or characteristics. In this
method, the sample participants and sites are recruited due to the
presence of a similar trait or characteristic (e.g., hospitals with
cardiac rehabilitation centers or schools with comprehensive
school physical activity programs).
Opportunistic sampling
More experienced scholars may engage in emergent research
designs and studies. In the situation of opportunistic sampling, the
focus of the research may change during the study. The
researcher may then need to collect new or different data, add a
different site, or recruit different participants to answer the
emerging questions.
Critical sampling
In critical sampling, the researcher seeks participants or sites that
have been involved with the phenomenon in serious or dramatic
ways. For example, in order to study physical activity in youth
detention centers, youth currently living in detention centers might
be targeted as recruits.
Snowball sampling
In snowball sampling, the researcher may not know who to recruit
to learn more about the phenomenon of interest when they start
the study, so they begin by interviewing a few individuals within the
setting or context. Each participant who is interviewed is then
asked to suggest other individuals who may add value to the
project. For example, while studying workers who exhibit and
support healthy behaviors at injury rehabilitation facilities, in order
to investigate the link between individuals’ belief systems and
behaviors related to health and their career selections, additional
participants may be recruited for the study through coworker
referrals.
Probability Sampling
Probability sampling utilizes random processes or chance
procedures to derive the sample. A key characteristic is that every
element within the population, more precisely the sampling frame,
has a known probability of being selected for the sample.
Furthermore, when random processes are employed to select a
sample, it is possible to estimate sampling error, or the chance
variations, outside the researcher’s control, that occur in the
sampling process (Mertler, 2019). Conceptually, sampling
error is the deviation of the sample mean (or other statistic) from
the population parameter. It is not, however, indicative of mistakes
in the sampling process; it merely describes the inevitable
chance variations that occur when a number of randomly
selected sample means are computed. Although no technique can
guarantee a sample that is representative of an entire population,
probability sampling is generally recognized as an efficient method
of drawing a representative, unbiased sample from a population.
Specific techniques that use this process include the following: (a)
simple random sampling, (b) stratified random sampling, (c)
systematic sampling, and (d) cluster sampling. These four
probability sampling techniques will be discussed in the next
section.
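Before turning to those techniques, the idea of sampling error can be illustrated with a short simulation. This is an illustrative sketch only; the population of fitness scores below is invented:

```python
import random
import statistics

random.seed(1)  # reproducible illustration

# Hypothetical population: 10,000 fitness scores (invented values).
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)

# Sampling error: each random sample's mean deviates from the
# population mean by chance alone, not because of any mistake
# in the sampling procedure.
errors = [statistics.mean(random.sample(population, 100)) - mu
          for _ in range(5)]
```

With samples of 100 drawn from this population, each deviation is typically on the order of one point, reflecting chance variation rather than a flaw in the sampling process.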
School Size
Let us say that there are 400 schools in the state of Wyoming,
and the researcher decides to study funding for girls’ athletic
programs using a sample of 100 (an arbitrarily chosen number)
schools (25% of the total number of schools). Upon examination,
the number of small schools outnumbers those that fall within the
medium or large school parameters. Using only random selection,
the researcher selects a sample that contains 80 small schools, 15
medium schools, and 5 large schools. The resultant data on how
schools in the state fund girls’ athletic programs would probably be
unbalanced in favor of how that process works at small schools
and would not fairly represent the funding strategies of medium or
large schools. This problem could be overcome, or at least
minimized, by first stratifying the schools by size and then taking a
proportional random number of schools from each stratum. In this
example, there are 250 small schools, 100 medium schools, and
50 large schools. Because the researcher wants a final sample of
100 schools (25% of the total number of schools), a proportional
stratified random sampling (taking 0.25 × the number of schools in
each stratum) would yield, roughly, 63 small schools, 25 medium
schools, and 12 large schools. Thus, the final sample of 100
schools is quite representative of the total population of 400
schools. Any generalizations or inferences from the sample to the
population stand a better chance of being valid than if proportional
stratified random sampling had not been used. While not
completely without error, this procedure is an efficient way to
achieve a representative sample.
In some research studies, however, the primary focus is on the
differences among the various strata or subgroups in the
population. In these situations, nonproportional stratified random
sampling should be used to select equal-sized samples (the same
number) from each stratum, thus making group comparisons
easier. In the previous example, the researcher might use 33 each
of small, medium, and large schools to determine the differences
in funding for girls’ athletics by school level.
To summarize, if the characteristics of the entire population are
the focus, proportional stratified random sampling is more
appropriate. If, however, the focus of the research question
is on differences among the strata, the researcher should select
an equal number of participants (elements) from each stratum, or
nonproportional stratified random sampling (Mertler, 2019).
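The proportional allocation in the school example can be sketched as follows. The helper `proportional_allocation` is hypothetical; it uses a largest-remainder rule so that the stratum counts sum exactly to the desired sample size:

```python
def proportional_allocation(strata_sizes, n):
    """Allocate n sample slots across strata in proportion to their
    sizes, giving leftover slots to the strata with the largest
    fractional parts so the allocations sum exactly to n."""
    total = sum(strata_sizes.values())
    quotas = {s: n * c / total for s, c in strata_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = n - sum(alloc.values())
    for s in sorted(quotas, key=lambda s: quotas[s] - alloc[s],
                    reverse=True)[:leftover]:
        alloc[s] += 1
    return alloc

schools = {"small": 250, "medium": 100, "large": 50}
alloc = proportional_allocation(schools, 100)
# alloc -> {'small': 63, 'medium': 25, 'large': 12}
# A random sample of the allocated size would then be drawn within
# each stratum, e.g., random.sample(small_school_list, alloc["small"]).
```

For nonproportional stratified sampling, the allocation step is simply replaced by an equal count per stratum (e.g., 33 or 34 schools each).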
Systematic sampling
When some “scheme” is applied to the selection of research
participants, the procedure employed is called systematic
sampling. More precisely, this involves drawing a sample by
selecting every kth element from a list of the population, where k is
a constant representing the sampling interval. For example, in a
study of a population of those who received athletic training care
during at least one game in the fall semester, the researcher
obtained an alphabetized listing of the 200 high school athletes
who qualified. The researcher then determines there is a need to
draw a 20% sample from the population and thus, 40 athletes
(20% of 200) are selected. The sampling interval is determined by
dividing the population size by the desired sample size: k = 200/40
= 5. The starting point then would be to randomly select one
participant among the first five athletes on the list, and then, every
fifth athlete thereafter would be selected. In another situation, a
researcher may decide to select every 100th person listed in a
large health club membership directory. In this case, k = 100.
Another version of systematic sampling is illustrated by a
researcher who is situated at a location (e.g., coffee shop) and
stops every 15th person who passes by and asks each a series of
questions (e.g., asks about her or his physical activity patterns
over the last week). In each of these examples, a “system” was
applied to the selection of the sample.
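The every-kth-element procedure just described can be sketched as a short function; the athlete roster below is invented for illustration:

```python
import random

def systematic_sample(frame, n):
    """Systematic sampling: compute the interval k = N / n, pick a
    random start among the first k elements of the list, then take
    every kth element thereafter."""
    k = len(frame) // n               # sampling interval
    start = random.randrange(k)       # random start in the first k elements
    return frame[start::k][:n]

# Hypothetical alphabetized list of the 200 athletes who received care.
athletes = [f"athlete_{i:03d}" for i in range(1, 201)]
sample = systematic_sample(athletes, 40)   # k = 200 / 40 = 5
```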
If the listing of the population elements is in random order (this
is rarely the case), systematic sampling is a good approximation of
random sampling. However, systematic sampling does not strictly
satisfy the definition of random sampling and, hence, it is not
entirely bias-free. To that point, once the first element is selected,
all of the remaining cases in the sample are automatically
determined while others have no chance of being selected. Thus,
all members of the population do not have the same independent
chance of being selected for the sample. In the previous example
of the athletes who received care from an athletic trainer, if every
fifth athlete who received care from an athletic trainer is selected,
the fourth and sixth athletes on the list have no chance of being
selected. In addition, subtle biases may be present in systematic
sampling, especially when the population listing is not in random
order. For example, if you were systematically sampling schools in
Massachusetts, it is likely that more parochial schools with names
that begin with S will cluster together near the bottom of an
alphabetized listing. Because Massachusetts has the highest
percentage of Catholic churches in the United States, the
parochial schools (many of which may have names that begin with
S) may not be adequately represented in a sample chosen
through systematic sampling (Mertler, 2019).
Overall, though, systematic sampling is quick, efficient, and
saves time and energy. This is particularly true when populations
exist on a definitive list or roster. In this instance, it is simply
convenient to select every kth element of the population. The
resultant sample will provide a good estimate of the population of
interest, particularly if the sample is relatively large.
Cluster sampling
Also referred to as area sampling, cluster sampling is particularly
appropriate in situations where the researcher cannot obtain a list
of the members of a population and has little knowledge of their
characteristics, especially when the population is scattered over a
wide geographic area or in situations where it is impractical to
remove individuals from a naturally occurring group.
Foundationally, cluster sampling is a type of probability sampling in
which the sampling unit is a naturally occurring group, or cluster, of
members of the population. Thus, instead of sampling individual
members of the population, clusters are selected.
Examples of clusters particularly applicable to kinesiology
research include schools, classrooms, city blocks, clinics,
hospitals, and worksites. Let us say that a researcher wants to
survey 800 doctors out of a population of 4,600 doctors at 46
hospitals in three states in the Southwestern United States, and
each hospital has approximately 100 doctors. If the doctors were
selected at random, the chosen participants would be scattered
across the Southwestern United States. The expense in time,
money, and energy in designing a plan to get information from the
800 randomly selected doctors would be enormous. Considering
this, the researcher decides to randomly select 12 hospitals from
the total population of hospitals in three states. Then, the
researcher would survey all doctors at these 12 hospitals. The
doctors in each hospital comprise a cluster. Allowing for certain
factors of participant mortality (i.e., drop out), the researcher
reasonably expects approximately 800 doctors to respond. In this
example, the random selection is of the hospitals and not of the
doctors, who are of interest in the investigation. The hospitals are
the sampling unit (the sample size is 12, not 800) because the
sample statistics are calculated at the hospital level. Conclusions
and inferences are stated in terms of the sampling unit. Because
hospitals are the sampling unit, the mean score of the doctors in a
hospital represents the score for that hospital.
In general, cluster sampling, as in this example, is more
practical and less costly than simple random sampling, but it does
hold the possibility of greater sampling error, particularly if the
number of clusters is small. Furthermore, if clusters are not of
equal size, the possibility of introducing sample bias exists.
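The hospital example can be sketched directly; the sampling frame below is invented, with 46 clusters of 100 doctors each:

```python
import random

# Hypothetical sampling frame: 46 hospitals, each a cluster of 100 doctors.
hospitals = {f"hospital_{i:02d}": [f"doctor_{i:02d}_{j:03d}" for j in range(100)]
             for i in range(1, 47)}

# Randomly select 12 hospitals (the sampling unit), then survey
# every doctor within each chosen cluster.
chosen = random.sample(sorted(hospitals), 12)
invited = [doc for h in chosen for doc in hospitals[h]]
# 1,200 doctors are invited; allowing for attrition, roughly 800
# responses would be expected.
```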
One alternative to the cluster sampling procedure described
previously would be the process of multistage sampling (a
successive selection of clusters within clusters). In this method,
the researcher will frequently perform one or more rounds of
cluster sampling and then, in the final stage, randomly select
elements (research participants) from the chosen clusters. For
example, a researcher interested in investigating the health
behaviors of high school students in Illinois could not reasonably
expect to obtain a listing of all high school students in the state. It
is simply not feasible to utilize simple random sampling
procedures in this case. However, utilizing a multistage sampling
procedure, the researcher would first randomly select school
districts within the state, then schools within the selected districts,
and finally, classrooms within schools. If the researcher did not
want to utilize all students within the selected classrooms, in a final
sampling stage, she or he could randomly select students
(particularly because student health behaviors are the focus) from
the chosen classrooms. Essentially, this process involves a
random sample of clusters and then a random sample of
individuals from within those clusters. When it consists of cluster
random sampling followed by individual random sampling,
multistage random sampling is also called two-stage random
sampling (Mertler, 2019).
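The Illinois example can be sketched as successive rounds of cluster sampling followed by a final random selection of students; all names and sizes below are invented for illustration:

```python
import random

# Hypothetical nested frame: districts -> schools -> classrooms -> students.
state = {
    f"district_{d}": {
        f"school_{d}_{s}": {
            f"class_{d}_{s}_{c}": [f"student_{d}_{s}_{c}_{i}" for i in range(25)]
            for c in range(4)}
        for s in range(3)}
    for d in range(20)
}

districts = random.sample(sorted(state), 5)                 # stage 1: districts
schools = [(d, s) for d in districts
           for s in random.sample(sorted(state[d]), 2)]     # stage 2: schools
classes = [(d, s, c) for d, s in schools
           for c in random.sample(sorted(state[d][s]), 1)]  # stage 3: classrooms
students = [st for d, s, c in classes                       # final stage: students
            for st in random.sample(state[d][s][c], 10)]
```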
Nonprobability Sampling
In addition to the four methods of probability sampling described
previously, nonprobability sampling, or convenience sampling, is
often utilized in quantitative research studies. To review, if a
researcher is to justify generalizations about the population from
which the sample is drawn, participants must be selected at
random. The random sample is a chance sample and follows the
laws of probability. Not all samples, however, can be randomly
selected. Samples not selected at random are called
nonprobability samples. Examples are intact classes, volunteer
groups, a group of graduate students in kinesiology, participants in
a health promotion program, members of a sport team, or
coworkers at a physical therapy clinic. When these kinds of
samples are used in research, they may not accurately reflect all
of the traits of the population in which the researcher is interested;
thus, they may not be representative of the population. This can
lead to inaccurate conclusions. However, the use of convenience
samples in research, particularly in work conducted by graduate
students, is commonplace.
Another major problem with a study that uses this approach to
select research participants is that the results are not likely to be
generalizable beyond the participants in the study. This does not
mean that the study should not be conducted or that the results
are not accurate or credible; it simply means that the researcher
should be cautious in generalizing the findings. In an attempt to
justify the use of a convenience sample, researchers generally
attempt to show that the characteristics of the sample are similar
to those of the population. Nevertheless, a researcher should
recognize the limitations of utilizing a convenience sample and
disclose this methodology as part of the final written report or
presentation.
For example, suppose a researcher wants to study the
muscular strength of middle school students. The middle school
students at Harvest Middle School (pseudonym) belong to a class
made up of four sections of 30 students each for a total population
of 120 prospective research participants in this middle school
physical education setting. The researcher’s plan is to select 30
participants from the 120 students in physical education at
random. However, the school principal says, “No, the random
selection would mean that some students from each section might
be selected in the sample, and if you (as the researcher)
indiscriminately pull them out of their respective sections for
testing purposes, it will disrupt the school schedule. Instead, select
participants from just one intact section, and make arrangements
to do your testing before or after the school day.” The questions
then become: How representative of the characteristics of all the
middle school students in physical education are students in this
one section? and How representative is it of similar middle school
physical education populations at other schools? The point is,
intact or available groups impose serious restrictions on the
researcher’s ability to generalize the data obtained from the
sample to the larger population.
The use of volunteers, another example of nonprobability
sampling, can also introduce an element of bias into an
investigation. Let us say that a researcher visits four personal
health classes and asks the students to volunteer to be in a study
on the eating behaviors of undergraduate college men and
women. Of the 150 students, 25 volunteer. How representative of
the 150 are these 25? The sample volunteers may be biased
(perhaps they believe they have healthy eating behaviors). What
caused these 25 students to accept the invitation to participate in
the study and the other 125 to decline? The primary problem with
volunteer participants is that they may have significantly different
characteristics (especially related to those that are the focal point
of the study) from the nonvolunteers. In fact, they usually possess
special kinds of characteristics that permit them to volunteer, and
these tend to bias the sample. The use of volunteer participants
does not allow an equal chance for all population members to
participate; thus the law of probability, or chance, has not been
permitted to operate.
Sample Size
In addition to sampling techniques, another important
consideration related to selecting and recruiting participants for
research studies is the size of the sample. In general, sample
sizes are typically much larger for quantitative than qualitative or
mixed-methods studies. Often, the first question that a beginning
researcher asks is, “How many participants do I need?” In
qualitative studies, the question may be, “How many participants
and research sites do I need?”
Qualitative Studies
In general terms, the number of participants needed for a
qualitative study depends on the research questions as well as the
type of qualitative study and its theoretical underpinnings. More
specifically, the sample size depends on what the researcher
wants to learn. It is guided by the purpose of the study, by what is
at stake in the inquiry, and by whether the findings will be useful
to the researcher and others. The sample size is also influenced
by what brings credibility to the study, and finally, many sample-
size decisions are made based on available time and resources
(Patton, 2014). Trade-offs in
the number of participants in a qualitative study related to the
breadth and depth of the design exist, however. With limited
resources, a researcher could study a small group of participants
over a longer period and potentially gain a deep, rich
understanding of the phenomenon—thus giving the researcher
great insight into these few participants’ experiences. The
alternative, in light of finite resources, would be to investigate a
larger number of participants and perhaps gain more
understanding about the phenomenon but with less depth in the
understanding of participants’ views.
For example, three philanthropists who had their own
charitable organization (i.e., HealthWorks Foundation of Arizona)
were interviewed about why they funded projects related to health
(i.e., Fitness for Life school-wide interventions, dentists in schools
projects, and cardiac rehabilitation projects where patients had
access to expedited, aggressive rehabilitation protocols).
Research findings led to a deep understanding about why they
supported health initiatives in their community, and themes
included the following: (a) desire to see action at the grassroots
level, (b) interest in funding evidence-based projects in their local
community, (c) interest in education and health, and (d) strong
belief systems that local data were needed to develop new policies
promoting health in their community (Mulhearn, Kulinna, Yu, &
Griffo, 2019).
In another example, 30 school personnel (ranging from a
psychologist to a nutrition specialist) were interviewed after
participating in a Whole School, Whole Community, Whole Child
(www.ascd.org/programs/learning-and-health/wscc-
model.aspx) intervention in their school district. These interviews,
which took place after surveys had already been administered,
provided the researcher with a deeper understanding of the
school-wide initiative. The two data sources were combined (a
mixed-methods design!) to describe the complex situation in the
school district as it promoted school-wide, healthy behavior
change initiatives. It also supported the components of an
ecological model of health change (i.e., organization, community,
interpersonal, and individual factors); all of these influenced the
health-promoting changes observed (Kulinna, 2016).
In addition to considering the number of participants, it is also
expedient to consider the quality of potential data collected. Morse
(2000) provides some useful direction in terms of the quality of the
data and sample size. The number of interviews conducted with
each participant affects the quality of the data. She suggests an
inverse relationship may be present between the number of
participants and the amount of useful data (e.g., quality of the
data) generated per participant. In short, the more data that are
collected from each person, including the number of interviews,
artifacts, and observations, the fewer participants are needed
(Morse, 2000).
Sample size is also influenced by the type of qualitative
research strategy. If the researcher is conducting single,
semistructured interviews, then a larger sample may be needed. If,
however, the researcher is interviewing participants multiple times
and using an open interview format, then fewer participants may
be needed (Morse, 2000). Whereas narrative strategies often
include one or a small number of participants and research sites,
phenomenological and ethnomethodological strategies may target
small numbers of participants, often from around 3 to 10. Case
study strategies may involve one participant or setting or several
cases at the same time, and ethnography strategies may study
one phenomenon from a single culture that may include multiple
interviews, observations, and examinations of artifacts. Grounded
theory strategies may include up to 20 to 30 participants
(Creswell, 2013; 2014).
Saturation of the data sources is another approach used by
qualitative researchers to determine sample size. Evolving from
grounded theory literature, it suggests that when the researcher is
hearing the same messages in the interviews, seeing the same
ideas in observations, and reviewing the same ideas in
documents, the categories or qualitative research themes have
reached saturation or the point where “gathering fresh data no
longer sparks new insights or reveals new properties” (Creswell,
2014, p. 189). The small sample sizes used in qualitative studies
and often in mixed-methods studies are often seen in opposition to
the larger sample sizes used in quantitative studies.
Quantitative Studies
In quantitative studies, the optimal sample size is difficult to
determine with any degree of certainty. Ideally, it is the size that
provides adequate statistical power to detect differences that are
statistically significant. In addition, sample sizes will vary by
quantitative approach and theoretical base.
Beginning researchers, particularly graduate students, often
believe that, due to its size, a large sample will produce better data
and more valid conclusions based on the data. Regardless of size,
the crucial factor, just as in qualitative data collection, is whether
the sample is representative of the population. A well-selected and
controlled small sample is ultimately better than a poorly selected
and poorly controlled large sample. In addition, sample size is
influenced by population size. For example, 50 subjects out of a
population of 100 may be representative, but 50 subjects from a
population of one million would not be representative.
Here are two examples—each with a different outcome: During
the 2016 presidential election, the Washington Examiner (Carney,
2016) predicted on November 7 that the Democratic candidate,
Hillary Rodham Clinton, would defeat Donald John Trump the next
day on November 8, 2016. That projection (ultimately incorrect!)
was based on aggregate results of many statewide polls (total
projection = Clinton 319 vs. Trump 219). This process may have
lacked the careful stratification and proportional random sampling
needed.
In contrast, in 1988, George H.W. Bush was correctly predicted
to be the presidential victor with less than 5% of the votes counted.
The sample on which the prediction was based was obtained
through careful stratification of several characteristics known to be
related to voting practices within the voting population. This part of
the process was followed by proportional random sampling, and
ultimately the reporting sample appeared to be unbiased and
representative.
Overall, however, the size of the sample does become
important in the statistical applications used to analyze the data
produced by the sample, especially related to making inferences
from the sample to the population. From an analytical standpoint,
the statistical power of a test is the probability that the test will
reject the null hypothesis when, in fact, the null hypothesis is false.
A powerful statistical test, therefore, is one that can detect even
small differences or a small effect, and one of the factors that
influences the statistical power of a test is sample size. In general,
the larger the sample size, the more statistical power generated by
the statistic being used. A long-standing convention in
experimental research is that sample size never needs to exceed
50% of the population. Conducting a prospective power analysis
before initiating a study can help a researcher determine the
sample size needed to detect a difference that is considered
important. For more information about statistical power and
conducting a power analysis to determine sample size, the
interested reader may wish to consult other research and statistics
books, such as Baumgartner et al. (2016), Cohen (1988), and
Rosner (2011).
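As a rough illustration of how effect size drives the required sample size, the normal-approximation formula for a two-group comparison can be sketched as follows. This is a simplified approximation, not a substitute for the full power analyses described in the books cited above:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sample comparison:
    n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2, with d as Cohen's d."""
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power))
                     / effect_size) ** 2)

print(n_per_group(0.8))   # large effect (d = 0.8): 25 per group
print(n_per_group(0.5))   # medium effect (d = 0.5): 63 per group
print(n_per_group(0.2))   # small effect (d = 0.2): 393 per group
```

Note how detecting a small effect with the same alpha and power requires a far larger sample, which is why a prospective power analysis should precede data collection.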
Sample size will also depend on the quantitative research
approach utilized. Experimental studies will typically employ fewer
research participants than descriptive studies. Samples in
experimental research are usually quite “captive” (i.e., participants
are directly involved with changes in their setting) and, depending
on the data-collecting demands, result in less participant mortality
(i.e., drop out). In contrast, with a descriptive approach, especially
when sending survey instruments through email or on websites,
participant attrition can be quite large, so the accepted practice in
descriptive research is to select a sample large enough to offset
natural attrition (i.e., people who fail to complete a survey
instrument or questionnaire) and still maintain a satisfactory
sample size. To the previous point, nonresponses in questionnaire
studies must not be taken lightly. If 500 questionnaires are sent
and only 100 are returned, what does this mean? How
representative is the sample of 100 of the larger population from
which the participants were drawn? What selective or systematic
factor is operating to cause 400 people to decline the opportunity
to participate? Even with the aforementioned concerns, a 20% to
25% return rate of an electronic or website survey is considered
acceptable.
The method of selecting the sample is probably just as
important as the sample size. Random sampling, as indicated
earlier, is the best approach to sample selection because the laws
of probability operate in such a way that sampling error in both
large and small samples can be estimated. To review, though,
utilizing random sampling does not guarantee that the sample will
be representative of the population. The selection process, the
types of variables being studied, how the data on those variables
are collected, the proposed statistical procedures, and elements of
public relations or partnerships with organizations in the research
process may all operate to determine sample size.
Whenever a sample is used, it is logical to wonder what the
results might have been if all of the people in the population had
been included. An appropriate sample, however, usually provides
a good estimate of what the results would have been from the
entire population. All estimates involve some degree of error; the
goal is to minimize that error. The researcher never knows exactly
how much error is present, but it can be at least partially controlled
by obtaining a sample that is in proportion to the total number of
people in the population. For example, if a researcher wants to
survey parental opinions on whether an after-school sport program
should be added, the percentage of “yes” responses can be
determined. Ultimately, the researcher wants to be able to say with
90% certainty that the sample percentage results fall within five
percentage points of what the findings would be if all the parents
had been surveyed. Although the researcher knows there will be
some variation in the percentage of “yes” opinions from the
sample to the population, if there is high confidence (90%) that this
variation is small (± 0.05), support for the sample percentage
increases.
For example, if 3,000 parents comprised the population, the
sample would need to include 341 parents. These projections are
based on the tables developed by Krejcie and Morgan (1970) for
determining sample size, which assume 90% certainty (confidence)
that the sample and population percentages do not differ by more
than 0.05. Tables for other situations can be found in books
devoted to statistics or sampling procedures. In addition,
numerous online calculators are available for determining sample
size.
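The calculation behind such tables and calculators can be sketched directly. The formula below is the one Krejcie and Morgan (1970) used to generate their table; note that the tabled value of 341 for a population of 3,000 corresponds to the χ² value for 95% confidence (3.841):

```python
import math

def krejcie_morgan_n(population, chi_sq=3.841, p=0.5, d=0.05):
    """Required sample size per Krejcie & Morgan (1970).

    chi_sq: chi-square value for the desired confidence level
            (3.841 for 95%, 2.706 for 90%)
    p:      assumed population proportion (0.5 is most conservative)
    d:      acceptable margin of error (here, 5 percentage points)
    """
    num = chi_sq * population * p * (1 - p)
    den = d ** 2 * (population - 1) + chi_sq * p * (1 - p)
    return math.ceil(num / den)

# The published table's entry for a population of 3,000 parents:
print(krejcie_morgan_n(3000))  # 341
```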
Selecting Participants
Along with recruiting participants, specific criteria are often used to
select participants.
Equitable Participation
Studies funded by the U.S. Government often require researchers
to provide evidence of equitable participation. This includes project
samples with equal participation or representative participation by
race and sex. This may include oversampling specific populations.
For example, if you know that 20% of a school's students are
Native American but your random sample includes only 3%, you
can recruit additional Native American students until they make up
20% of the sample. The larger subsample allows you to examine
the health outcomes of this group with adequate precision. In the
analysis, the group can then be weighted back down so that the
estimates still reflect its actual share of the sample.
Summary of Objectives
1. Define the term population. The population refers to an entire
group or aggregate of people or elements having one or more
common characteristics.
2. Describe the goal of the sampling process. The goal of
sampling is to identify a subgroup of the population of interest
that is thought to be representative of that larger population.
3. Explain the importance of randomization in sample
selection for quantitative studies. Randomization is crucial
in ensuring that the sample is representative of the population
about which generalizations will be made. It also helps
eliminate researcher bias from the research process and helps
equalize characteristics among groups in studies involving
more than one group.
4. Discuss the major considerations in recruiting and
selecting participants. The challenge of site and participant
recruitment varies by the type of research being conducted.
Typically, the more time that the researcher spends at the
research site and the more the researcher asks individuals
from the partner organizations to do, the more challenging the
recruitment effort. Once participants have been recruited, the
final selection process takes place. For example, if the study is
funded by the National Institutes of Health (NIH), equal numbers
of male and female participants may be required. It is important that the
organizational partners and potential participants have a clear
understanding of the study purposes, their involvement, and
how all parties may benefit from the study.
5. Compare and contrast the various methods of selecting
samples. There are three primary types of sampling
procedures. Purposive sampling is the most common method
of sampling in qualitative studies; it is also commonly used in
mixed-methods research studies. It involves the intentional
selection of sites and individuals to participate in a study based
on their ability to help answer the research inquiry. Probability
sampling relies on random selection techniques in which the
probability of selecting each participant or element is known;
techniques include simple random sampling, stratified random
sampling, systematic sampling, and cluster sampling. The third
method, nonprobability (or convenience) sampling, is a less
reliable technique for generalization.
6. Know the various factors that are used to determine
sample size for qualitative and quantitative research
studies. For qualitative research studies, the sample size
depends on the research question and the qualitative research
strategy (e.g., narrative strategy). It also involves finding a
balance between the number of participants and time spent
with participants in order to gain a deep understanding of the
phenomenon under examination. Sample size for quantitative
studies is dependent upon factors such as research approach
utilized, population size, variability within the population,
statistical power needed, desired sampling error, as well as
other practical considerations. Regardless of sample size, the
crucial issue for qualitative, quantitative, and mixed-methods
studies alike is whether the sample is representative of the
population.
References
American College of Sports Medicine (ACSM). (2018). ACSM’s guidelines for exercise
testing and prescription (10th ed.). Baltimore, MD: Wolters Kluwer.
Baumgartner, T. A., Jackson, A. S., Mahar, M. T., & Rowe, D. A. (2016). Measurement
for evaluation in kinesiology (9th ed.). Boston, MA: Jones and Bartlett Learning.
Carney, T. P. (2016). Parsing the polls, 2016 prediction: A big electoral college win for
Hillary. Retrieved from: https://www.washingtonexaminer.com/parsing-the-polls-
2016-prediction-a-big-electoral-college-win-for-hillary
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.).
Mahwah, NJ: Lawrence Erlbaum Associates.
Creswell, J. W. (2013). Qualitative inquiry and research designs: Choosing among five
approaches (4th ed.). Thousand Oaks, CA: Sage.
Creswell, J. W. (2014). Research design: Qualitative, quantitative and mixed methods
approaches. (4th ed.). Thousand Oaks, CA: Sage.
Denzin, N. K., & Lincoln, Y. S. (2018). The Sage handbook of qualitative research.
Thousand Oaks, CA: Sage.
Gay, L. R., Mills, G. E., & Airasian, P. (2012). Educational research: Competencies for
analysis and applications (10th ed.). Upper Saddle River, NJ: Merrill.
Krejcie, R. V., & Morgan, D. W. (1970). Determining sample size for research activities.
Educational and Psychological Measurement, 30, 607–610.
Kulinna, P. H. (2016). School staff perceptions of factors influencing participation in a
whole-of-school initiative in an indigenous community. Health Education Journal,
75(7), 869–881.
Mertler, C. A. (2019). Introduction to educational research (2nd ed.). Thousand Oaks,
CA: Sage.
Morse, J. M. (2000). Determining sample size. Qualitative Health Research, 10(1), 3–5.
Mulhearn, S., Kulinna, P. H., Yu, H., & Griffo, J. (2019, April). Funders' and gatekeepers'
views of CSPAP. Paper presented at the Society for Health and Physical Education
Convention, Tampa, FL.
Patino, C. M., & Ferreira, J. C. (2018). Inclusion and exclusion criteria in research
studies: Definitions and why they matter. Jornal Brasileiro de Pneumologia, 44(2),
84. doi:10.1590/s1806-37562018000000088
Patton, M. Q. (2014). Qualitative research and evaluation methods: Integrating theory
and practice (4th ed.). Thousand Oaks, CA: Sage.
Rosner, B. (2011). Fundamentals of biostatistics (7th ed.). Boston, MA: Brooks/Cole.
Sands, C., Kulinna, P. H., van der Mars, H., & Dorantes, L. (in press). Trained peer tutors
in adaptive physical education class. Palaestra.
PART THREE
Data Analysis
CHAPTER 13 Measurement in Research
CHAPTER 13
Measurement in Research
In this chapter, we present measurement concepts and techniques
commonly used in research. You should be familiar with these
concepts and techniques in order to select the appropriate
techniques in your research and to understand the studies
reported in research journals.
OBJECTIVES
KEY TERMS
Construct
Continuous scores
Convergent evidence
Correlation coefficient
Cronbach’s alpha
Discrete scores
Discriminant (divergent) evidence
Good test
Group differences method
Internal consistency reliability
Interval scores
Intraclass correlation coefficient
Kappa coefficient
Kuder-Richardson
Modified kappa coefficient
Nominal scores
Objectivity
Objectivity coefficient
Ordinal scores
Percent agreement
Phi
Ratio scores
Reliability
Reliability coefficient
Stability reliability
Standard error of measurement
Test
Validity
Type of Scores
Scores can be classified as either continuous or discrete.
Continuous scores, as most are in kinesiology, have a potentially
infinite number of values, allowing variables to be measured with
varying degrees of accuracy. Between any two values of a
continuous score are countless other values that may be
expressed as fractions. For example, 100-yard dash scores are
usually recorded to the nearest tenth of a second, but they could
be recorded in hundredths or even thousandths of a second, if
timing equipment accurate to that level of precision was available.
Body weight, accurate to the nearest 5-pound interval, to the
whole-pound, or to the half-pound might be recorded depending
on how exact the measurement needs to be. Discrete scores are
limited to a specific number of values and usually are not
expressed as fractions. Scores on a throw or shot at a target
numbered 5–4–3–2–1–0 are discrete because whole-number
scores of 5, 4, 3, 2, 1, or 0 are the only ones possible. A score of
4.5 or 1.67 is impossible.
Most continuous scores are rounded off to the nearest unit of
measurement when they are recorded. For example, the score of
a student who runs the 100-yard dash in 10.57 seconds is
recorded as 10.6 because 10.57 is closer to 10.6 than to 10.5.
Usually, when a number is rounded off to the nearest unit of
measurement, it is increased only when the portion being dropped
is at least half of that unit (0.05 when rounding to tenths). Thus,
11.45 is rounded off to 11.5, while
11.44 is recorded as 11.4. A less-common method is to round off
to the last unit of measure, awarding the next higher score only
when that score is actually accomplished. For example, an
individual who does eight sit-ups but cannot complete the ninth
receives a score of 8.
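Both conventions can be sketched in a few lines; Decimal avoids the binary floating-point surprises that ordinary floats introduce when rounding values ending in 5:

```python
import math
from decimal import Decimal, ROUND_HALF_UP

def round_nearest_tenth(score):
    """Round to the nearest tenth, increasing only when the dropped
    portion is half a unit (0.05) or more."""
    return Decimal(score).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

def last_completed_unit(score):
    """Award the next higher score only when fully accomplished."""
    return math.floor(score)

print(round_nearest_tenth("10.57"))  # 10.6
print(round_nearest_tenth("11.45"))  # 11.5
print(round_nearest_tenth("11.44"))  # 11.4
print(last_completed_unit(8.9))      # 8 (eight full sit-ups)
```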
We can also classify scores as nominal, ordinal, interval, and
ratio based on three key features of a real number system, i.e.,
order (if a large score represents greater amounts of the attribute
being measured), common unit (if differences between pairs of
numbers are ordered), and origin (if there is an absolute zero). Let
us start with nominal scores, which have none of these three
features: they cannot be hierarchically ordered, are mutually
exclusive, and do not have an absolute zero. For example,
individuals can be classified by church preference, but we cannot
say that one religion is better than another. Gender is another
example. Ordinal scores do not have a common unit of
measurement between each score but are ordered in a way that
makes it possible to characterize one score as higher than
another. Class ranks, for example, are ordinal: If three students
receive sit-up scores of 16, 10, and 8, respectively, the first is
ranked 1; the next, 2; and the last, 3. Notice that the number of sit-
ups necessary to change the class ranks of the second and third
students differs. Thus, there is not a common unit of measurement
between consecutive scores. Interval scores are also ordered,
have a common unit of measurement between each score, but do
not have a true zero point (A score of 0 as a measure of distance
is a true zero, indicating no distance. However, a score of 0 on a
knowledge test is not a true zero because it does not indicate a
total lack of knowledge; it simply means that the respondent
answered none of the questions correctly). Most physical-
performance scores are either ratio or interval. Finally, ratio
scores are ordered, have a common unit of measurement
between each score, and a true zero point, so that statements
about equality of ratios can be made. Length and weight are
examples, because one measurement may be expressed as two
or three times that of another. In addition, how scores are
classified influences what calculations may be performed on the
data. TABLE 13.1 summarizes features of these scores,
corresponding examples, and math and statistical methods that
can apply.
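The three features can also be restated compactly in code; the sketch below is illustrative (the helper names are not from the text), but it captures which operations each scale supports:

```python
# Each scale flagged for the three real-number-system features:
# order, common unit of measurement, and absolute zero (origin).
SCALES = {
    "nominal":  {"order": False, "common_unit": False, "absolute_zero": False},
    "ordinal":  {"order": True,  "common_unit": False, "absolute_zero": False},
    "interval": {"order": True,  "common_unit": True,  "absolute_zero": False},
    "ratio":    {"order": True,  "common_unit": True,  "absolute_zero": True},
}

def can_rank(scale):          # ordering scores requires order
    return SCALES[scale]["order"]

def can_average(scale):       # a meaningful mean requires a common unit
    return SCALES[scale]["common_unit"]

def can_form_ratios(scale):   # "twice as much" requires a true zero
    return SCALES[scale]["absolute_zero"]

print(can_average("ordinal"))    # False: class ranks cannot be averaged meaningfully
print(can_form_ratios("ratio"))  # True: 20 kg is twice 10 kg
```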
Objectivity
To determine objectivity, two or more scorers, raters, or judges
(usually two) must independently score a group of people on the
test. With physical-performance tests, this would be each scorer
independently scoring each person as he or she performed the
test. For example, two individuals independently time each person
as he or she runs the mile for time. For paper-and-pencil tests or
instruments like a questionnaire, this would mean each scorer
independently scoring the test or instrument. For example, two
individuals independently grade a knowledge test. No matter what
the situation, for each person tested there will be a score from
each scorer. The score organization will be as presented in
TABLE 13.2.
TABLE 13.2 Data for Two Scorers to Determine Objectivity
Person  Scorer 1  Scorer 2
A  4*  4
B  4  3
C  3  3
D  2  3
E  3  2
F  3  3
G  5  4
H  4  3
I  1  2
J  3  2
TABLE 13.3 Means and ANOVA Summary Table for the Data in
Table 13.2
In Table 13.2, two scorers were used in the pilot study so that
objectivity could be determined. Most likely in the research study,
the researcher would like to have one person score the test with
assurance that the data obtained are basically the same data any
well-trained scorer would award. The intraclass correlation
coefficient can be calculated using several different formulas. In
this case, with two scorers, the intraclass correlation coefficient (R)
is calculated as follows:

R = (MS_A − MS_W)/(MS_A + MS_W)  (1)

where MS_A is the mean square among people and MS_W is the
mean square within people from the ANOVA summarized in Table
13.3.
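Because the Table 13.3 values are not reproduced in this excerpt, the sketch below computes the ANOVA mean squares directly from the Table 13.2 scores and applies the two-scorer intraclass formula, assuming the one-way model, R = (MS_A − MS_W)/(MS_A + MS_W); the variable names are illustrative:

```python
# Scores from Table 13.2: (scorer 1, scorer 2) for persons A-J.
scores = [(4, 4), (4, 3), (3, 3), (2, 3), (3, 2),
          (3, 3), (5, 4), (4, 3), (1, 2), (3, 2)]

n, k = len(scores), 2                      # 10 people, 2 scorers
grand = sum(a + b for a, b in scores)      # sum of all 20 scores
correction = grand ** 2 / (n * k)

ss_total = sum(x * x for pair in scores for x in pair) - correction
ss_among = sum((a + b) ** 2 for a, b in scores) / k - correction
ss_within = ss_total - ss_among

ms_among = ss_among / (n - 1)              # mean square among people
ms_within = ss_within / (n * (k - 1))      # mean square within people

R = (ms_among - ms_within) / (ms_among + ms_within)
print(round(R, 2))  # 0.62
```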
Reliability (Relative)
To determine reliability, each person must have at least two
repeated measure scores. These scores might be (1) multiple
trials (administrations) of a physical-performance test within a day,
(2) multiple administrations of a physical-performance test with
each administration on a different day, (3) multiple items on a
knowledge test, (4) two different forms (A and B) of a knowledge
test with both forms administered on different days or on the same
day, or (5) other repeated measures. If the data are collected
within a day, it allows a determination of internal consistency
reliability, defined as consistency of the test scores within a day.
If the data are collected on different days, it allows a determination
of stability reliability, defined as consistency of the test scores
across days (see Zhu, 2013 for a discussion of the need to clearly
define different types of reliability). In all cases, some correlation
coefficient is calculated to indicate the degree of relationship
between the sets of scores. This correlation coefficient is called
the reliability coefficient. Although it can theoretically range from
–1.00 to 1.00, its meaningful range is from 0.00 to 1.00, where
0.00 indicates no reliability and 1.00 indicates perfect reliability.
TABLE 13.5 Means and ANOVA Summary Table for the Data in
Table 13.4
Multiple Days
This is a typical situation with physical-performance tests where
the same test is administered to a group of people (30 or more) on
two different days, usually 1 to 7 days apart. Thus, this testing
procedure allows stability reliability to be determined. The data
arrangement is as presented in TABLE 13.6. Usually, in this
situation, the reason the test was administered on two different
days in the pilot study was to be able to determine reliability, but in
the future, the researcher would like to administer the test on one
day with assurance that the data collected during the research
study data collection have high stability reliability. An intraclass
correlation coefficient is calculated to estimate stability reliability
using formula (1); R = reliability of a score for each participant that
is collected on one day.
Participant  Day 1  Day 2
1  X  X
2  X  X
3  X  X
·  ·  ·
·  ·  ·
·  ·  ·
n  X  X
The data for trials 1 and 2 in Table 13.6 were analyzed treating
the trials as if they were day 1 and day 2 scores, so that an
example calculation of R using formula (1) is possible. Based on
the analysis of the trial 1 and trial 2 data that are presented in
TABLE 13.7, R = (7.97 – 1.63)/(7.97 + 1.63) = 0.66, which is a low
reliability coefficient.
TABLE 13.7 Means and ANOVA Summary Table for Trials One
and Two in Table 13.4
There are differences between formulas (2) and (1) in terms of
how the data are collected (within a day or between days), the
denominator of the formula, and the score for which R is reported
[sum (mean) of scores or one score]. If formula (2) was used on
the values in Table 13.7, R = (7.97 – 1.63)/(7.97) = 0.80. This is
an acceptable value for a reliability coefficient, but it requires that
the score for each participant be based on scores for two trials or
two days. Just reporting that the reliability is a value (like 0.66) is a
totally inadequate description of the test reliability. The ANOVA
model (one-way or two-way) (see Baumgartner, Jackson, Mahar,
and Rowe, 2016 for more information), criterion score (one score
or mean score), type of reliability (internal consistency or stability),
and value of R must be reported to adequately describe the
situation. Formula (1) is correct if, in the pilot study, the test was
administered on two days to determine R and in the research
study the test is administered on one day. There would be nothing
wrong with administering the test on each of two days during both
the pilot study and the research study data collection and the
score for each participant being the sum or mean of the
participant’s day 1 and day 2 scores. In fact, as long as the
number of days the test is administered is the same in both the
pilot study and the research study, the stability coefficient (R) is
determined by formula (2); R = reliability of a score for each
participant, which is the sum or mean of the participant’s day
scores.
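Using the mean square values reported above for Table 13.7 (MS among = 7.97, MS within = 1.63), both formulas can be checked in a few lines:

```python
ms_among, ms_within = 7.97, 1.63  # from the ANOVA summary for Table 13.7

# Formula (1): reliability of a single trial (or single day) score.
r_one = (ms_among - ms_within) / (ms_among + ms_within)

# Formula (2): reliability of the sum or mean of the two scores.
r_mean = (ms_among - ms_within) / ms_among

print(f"{r_one:.2f} {r_mean:.2f}")  # 0.66 0.80
```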
General Case
If the test was administered more than twice (three or more trials
or days) to determine R, so each person has more than two
scores, the formula changes, as presented by Baumgartner,
Jackson, Mahar, and Rowe (2016). With k trials or days, the
single-score and mean-score forms become, respectively,

R = (MS_A − MS_W)/[MS_A + (k − 1)MS_W] and R = (MS_A − MS_W)/MS_A

where MS_A and MS_W are the mean squares among and within
people. With k = 2, the first expression reduces to formula (1); the
second is formula (2), which has the same form for any k.
FIGURE 13.1 Types of evidence used to argue for the validity of test scores.
CHAPTER 14
CHAPTER 14
Descriptive Data Analysis
This chapter presents statistical techniques that can be applied to
evaluate a set of scores. You should be familiar with them in order
to select the appropriate one for a given situation and to
understand the research literature.
OBJECTIVES
KEY TERMS
Central tendency
Coefficient of determination
Correlation
Correlation coefficient
Curvilinear relationship
Descriptive statistic
Frequency polygon
Histogram
Leptokurtic curve
Line of best fit
Linear relationship
Mean
Median
Mode
Normal curve
Outlier
Parameters
Percentiles (percentile ranks)
Platykurtic curve
Range
Rho
Scattergram
Simple frequency distribution
Skewed curve
Standard deviation
Standard scores
Statistics
T-score
Variability
Variance
z-score
Grouping Frequency
44–46 1
47–49 3
50–52 0
53–55 2
56–58 5
59–61 2
62–64 4
65–67 6
68–70 7
71–73 4
74–76 6
77–79 4
80–82 1
83–85 3
86–88 0
89–91 2
FIGURE 14.1 is a graph of the data in Table 14.1, made using
SPSS. SPSS used different intervals from those in Table 14.3.
Test scores (sometimes midpoints of intervals) are listed along the
horizontal axis with low scores on the left to high scores on the
right. The frequency is listed on the vertical axis starting with 0 and
increasing upward. Going up from a score until the graph line is hit
and then going to the left until the vertical axis is hit, the frequency
for a score can be obtained. For example, the frequency for the
score 56 is 3. This graph was formed by connecting the plotted
frequencies for the scores. The type of graph is called a
frequency polygon.
Mode
The mode is the score most frequently received and it is used with
nominal data. In Table 14.2 the mode is 68; more participants
received a score of 68 than any other one score in the set. It is
possible to have several modes if a number of scores tie for most
frequent. The mode is not a stable measure, because the addition
or deletion of a single score can change its value considerably.
Unless the data are nominal or the most frequent score is desired,
other measures of central tendency are more appropriate.
Median
The median is the middle score; half the scores fall above the
median and half below. It requires data that are at least ordinal. It
cannot be calculated unless the scores are listed in order from
best to worst. As an example, for the nine numbers 4, 3, 1, 4, 10,
7, 7, 8, 6, ordering them gives 1, 3, 4, 4, 6, 7, 7, 8, 10, and the
median is 6, the middle score.
Mean
The mean is ordinarily the most appropriate measure of central
tendency for interval or ratio data. It is affected by both the value
and the position of each score. The mean (x̄) is the sum of the
scores divided by the number of scores. Notice that the scores
need not be ordered hierarchically to calculate the mean. For
example, the mean for the scores in Table 14.1 is 67.78, the sum
of the randomly ordered scores divided by 50.
When the graph of the scores is a normal curve, the mode,
median, and mean are equal. When the graph is positively
skewed, the mean is larger than the median; when it is negatively
skewed, the mean is less than the median. For example, the graph
for the scores: 2, 3, 1, 4, 1, 8, 10 is positively skewed with a mean
of 4.14 and a median of 3.
The mean is the most common measure of central tendency.
However, when scores are skewed or lack a common interval
between consecutive scores (e.g., with ordinal scores), the median
is the best measure of central tendency. The mode is used only
when the mean and median cannot be calculated (e.g., with
nominal scores) or when the only information wanted is the most
frequent score (e.g., most common uniform size, most frequent
error).
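Python's statistics module computes all three measures; applying it to the small positively skewed set mentioned above:

```python
import statistics

scores = [2, 3, 1, 4, 1, 8, 10]

print(statistics.mode(scores))            # 1 (most frequent score)
print(statistics.median(scores))          # 3 (middle score once ordered)
print(round(statistics.mean(scores), 2))  # 4.14 (mean > median: positive skew)
```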
Measures of Variability
A second type of descriptive value is the measure of variability,
which describes the data in terms of their spread, or heterogeneity.
For example, consider these scores for two groups:
Group 1 Group 2
9 5
1 6
5 4
For both groups, the mean and median are 5. If you simply
report that the mean and median for both groups are identical
without showing the data, another person could conclude that the
two groups have equal or similar ability. That is not true: Group 2
is more homogeneous in performance than is Group 1. A measure
of variability is the descriptive term that indicates this difference in
the spread, or heterogeneity, of the data. There are two such
measures: the range and the standard deviation.
Range
The range is the easiest measure of variability to obtain and is the
one used when the measure of central tendency is the mode or
median. The range is the difference between the highest and
lowest score. For example, the range for the data in Table 14.2 is
44 (90 – 46). The range is neither a precise nor stable measure
because it depends on only two scores and is affected by a
change in either one of them. For example, the range for the data
in Table 14.2 would have been 38 (84 – 46) if the participants who
scored 90 had, instead, scored 84 or less.
Standard Deviation
The standard deviation is the measure of variability used with the
mean. It indicates the amount that all the scores differ from the
mean. The more the scores differ from the mean, the larger the
standard deviation. The minimum value of the standard deviation
is 0, indicating that all scores were the same value. A formula
illustrating how the standard deviation is defined is presented
here:

s = √[Σ(X − x̄)²/(N − 1)]

where X is each score, x̄ is the mean, and N is the number of
scores (SPSS computes it this way, dividing by N − 1).
N Valid 50
Missing 0
Mean 67.7800
Median 68.0000
Mode 68.00
Variance 115.2771
Skewness –0.058
Kurtosis –0.392
Range 44.00
Minimum 46.00
Maximum 90.00
Sum 3389.00
Variance
A third measure of variability is the variance. It is not a descriptive
statistic like the range or standard deviation, but rather, a useful
statistic in certain other statistical procedures and interpretations
of statistics that will be discussed later. The variance is the square
of the standard deviation. If, for example, the standard deviation is
4, the variance is 16.
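The Group 1 and Group 2 scores above illustrate all three measures of variability; a quick check with the statistics module (sample formulas, dividing by n − 1, as SPSS does):

```python
import statistics

group1 = [9, 1, 5]
group2 = [5, 6, 4]

for name, g in (("Group 1", group1), ("Group 2", group2)):
    print(name,
          statistics.mean(g),        # both means are 5
          max(g) - min(g),           # range
          statistics.stdev(g),       # standard deviation
          statistics.variance(g))    # variance = stdev squared
# Both means are 5, but Group 1 spreads far more:
# range 8 vs. 2, standard deviation 4 vs. 1, variance 16 vs. 1.
```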
Measuring Group Position
Percentile Score
5 47.5500
10 54.1000
15 56.0000
20 58.0000
25 59.7500
30 62.3000
35 64.8500
40 65.4000
45 66.9500
50 68.0000
55 68.0000
60 70.6000
65 72.1500
70 75.0000
75 76.0000
80 76.8000
85 78.3500
90 82.9000
95 86.7000
100 90.0000
Standard Scores
Researchers use standard scores when they have data on
participants from several different tests, and they want all the data
in the same unit of measurement. An example of this is the sit-up,
mile-run, skin-fold, and sit-and-reach tests common to many
fitness batteries. Test scores for a test are converted to standard
scores using the following formula:

z = (X − x̄)/s

where X is a test score, x̄ is the mean, and s is the standard
deviation of the test scores.
Participant  Score  z-score
1 66 –0.166
2 56 –1.097
3 68 0.020
4 47 –1.935
5 46 –2.029
6 58 –0.911
7 75 0.672
8 71 0.300
9 60 –0.725
10 68 0.020
11 67 –0.073
12 56 –1.097
13 68 0.020
14 58 –0.911
15 68 0.020
16 49 –1.749
17 65 –0.259
18 75 0.672
19 76 0.766
20 70 0.207
21 54 –1.283
22 65 –0.259
23 76 0.766
24 68 0.020
25 68 0.020
26 62 –0.538
27 66 –0.166
28 83 1.418
29 65 –0.259
30 48 –1.842
31 63 –0.445
32 71 0.300
33 55 –1.190
34 78 0.952
35 90 2.070
36 84 1.511
37 72 0.393
38 83 1.418
39 79 1.045
40 77 0.859
41 90 2.070
42 82 1.324
43 78 0.952
44 76 0.766
45 62 –0.538
46 75 0.672
47 73 0.486
48 64 –0.352
49 56 –1.097
50 59 –0.818
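The z-scores in the table above can be reproduced from the descriptive statistics reported earlier (mean 67.78, variance 115.2771):

```python
import math

mean, variance = 67.78, 115.2771   # from the SPSS descriptive output
sd = math.sqrt(variance)           # about 10.74

def z_score(x):
    return (x - mean) / sd

print(round(z_score(66), 3))  # -0.166, matching participant 1
print(round(z_score(90), 3))  # 2.07, matching the top score (2.070)
```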
A 60 130
B 52 125
C 75 145
D 66 136
E 70 150
r = 0.91
When the scores for the two sets of scores are ranked, a
correlation coefficient called rho or Spearman’s rho or the rank
order correlation coefficient may be calculated. The formula is just
a simplification of the Pearson correlation formula and will produce
the same result when applied to the two sets of ranks, provided
that no tied ranks exist. Rho is commonly included in computer
statistical programs.
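For the five participants above, ranking each set of scores and applying the common rho formula, 1 − 6Σd²/[n(n² − 1)], gives a value close to the Pearson r of 0.91 (a minimal sketch; it assumes no tied ranks):

```python
def ranks(values):
    """Rank scores 1..n from lowest to highest (no ties here)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

x = [60, 52, 75, 66, 70]       # first test scores for A-E
y = [130, 125, 145, 136, 150]  # second test scores for A-E

d = [rx - ry for rx, ry in zip(ranks(x), ranks(y))]
n = len(x)
rho = 1 - 6 * sum(di ** 2 for di in d) / (n * (n ** 2 - 1))
print(rho)  # 0.9
```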
Measures of Association
In physical activity epidemiological research, researchers often
intend to determine if there is a cause-and-effect relationship
between an independent variable (e.g., physical activity
participation or fitness) and a dependent variable (e.g., disease,
injury, or death). Statistics such as relative risk, odds ratio, and
attributable risk are often used to determine the relationship. See
Chapter 6, “Epidemiological Research,” of this text for more
detailed information.
Summary of Objectives
1. Understand several options for organizing and graphing
scores. Researchers use frequency distributions and graphing
techniques such as scattergrams, histograms, and curves, to
show relationships and trends in data.
2. Explain three measures of central tendency. Central
tendency is a descriptive value indicating the points at which
scores tend to be concentrated. The mode is the most
frequently received score; it can be used only with nominal
data. The median represents the middle score—the point at
which, when scores are listed in order, half the scores fall
above and half fall below. The mean is the average score—the
sum of all the scores divided by the total number of scores.
The mean is most useful for interval or ratio data.
3. Explain two measures of variability. A measure of variability
is a descriptive value indicating the amount of variability or
spread in the set of scores. The range is used when the data
are not interval or ratio. It is the difference between the largest
and smallest score. The standard deviation is used when the
data are interval or ratio. It indicates the variability or spread of
the scores around the mean.
4. Explain two measures of group position. Percentile ranks
and percentiles are calculated to indicate the position, or rank,
of participants within a group.
5. Explain how to determine the relationship between scores.
The relationship between scores on two different tests can be
determined by using a graph, but the more common method is
to calculate a correlation coefficient. A correlation coefficient of
1.0 or –1.0 indicates a perfect relationship.
References
Baumgartner, T. A., Jackson, A. S., Mahar, M. T., & Rowe, D. A. (2016). Measurement
for evaluation in kinesiology (9th ed.). Burlington, MA: Jones & Bartlett Learning.
Huck, S. W. (2008). Reading statistics and research (5th ed.). Boston, MA: Pearson.
Larose, T. L. (1998). Ready, set, run! A student guide to SAS software for Microsoft
Windows. New York, NY: McGraw-Hill.
Leech, N. L., Barrett, K. C., & Morgan, G. A. (2015). IBM SPSS for intermediate
statistics: Use and interpretation (5th ed.). New York, NY: Routledge/Taylor &
Francis Group.
Moore, D. S. (2010). The basic practice of statistics (5th ed.). New York, NY: Freeman.
Pavkov, T. W., & Pierce, K. A. (2007). Ready, set, go! A student guide to SPSS 13.0 and
14.0 for Windows. Boston, MA: McGraw-Hill.
Wagner, W. E. (2019). Using IBM® SPSS® statistics for research methods and social
science statistics (7th ed.). Thousand Oaks, CA: Sage Publications.
Zhu, W. (2012). Sadly, the earth is still round (p < 0.05). Journal of Sport and Health
Science, 1, 9–11. doi:10.1016/j.jshs.2012.02.002
CHAPTER 15
Inferential Data Analysis
This chapter presents statistical tests that researchers commonly
use in analyzing their data. You should be familiar with these tests
in order to select the appropriate test or tests for your own
research and to understand the research of others, as reported in
research journals.
OBJECTIVES
KEY TERMS
Alpha level
Analysis of variance (ANOVA)
Cell
Correlation
Critical region
Critical value
Cross-validation
Degrees of freedom
Expected frequency
Experiment-wise error rate
Factorial design
Inferential statistics
Mean square (MS) value
Multiple prediction
Multivariate statistic
Multivariate tests
Nonparametric statistic
Nonparametric tests
Nonsignificant
Observed frequency
One-group t-test
One-way (goodness of fit) chi-square tests
One-way ANOVA
Parametric statistic
Parametric tests
Per-comparison error rate
Polynomial regression
Prediction
Random blocks ANOVA
Real difference
Regression
Repeated measures
Repeated measures ANOVA
Research hypothesis
Sampling error
Significant
Simple effects tests
Simple prediction
Slope
Standard error of prediction
Statistical hypothesis
Statistical power
Sums of squares (SS)
Two dependent groups t-test
Two independent groups t-test
Two-group comparisons
Two-way (contingency table) chi-square tests
Two-way ANOVA
Type I error
Type II error
Univariate statistic
Univariate tests
Y-intercept
Group A  Group B  Group C
12 7 13
15 10 14
10 11 10
11 8 9
9 9 12
14 10 11
12 12 11
13 9 15
TABLE 15.2 Output for the One-Group t-Test of Group A in
Table 15.1
N 8
Mean 12.00
t 2.83
df 7
Case ID Score
1 1 12
2 1 15
3 2 7
4 2 10
ID was the name the researcher used to identify group membership (1 = A group, 2 = B
group), and Score was the name the researcher used for the score of a subject.
Step 4: The scores for each group are analyzed and the t-
test calculated:
N 8 8
t 2.76
df 14
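Step 4 can be reproduced from the group A and group B scores with a pooled-variance t-test; the sketch below recovers the reported t = 2.76 with df = 14:

```python
import statistics

group_a = [12, 15, 10, 11, 9, 14, 12, 13]
group_b = [7, 10, 11, 8, 9, 10, 12, 9]

na, nb = len(group_a), len(group_b)
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)

# Pool the two sample variances, weighting by degrees of freedom.
pooled = ((na - 1) * statistics.variance(group_a) +
          (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
t = (mean_a - mean_b) / (pooled * (1 / na + 1 / nb)) ** 0.5

print(round(t, 2), na + nb - 2)  # 2.76 14
```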
t = (x̄I − x̄F)/(sd/√n) for repeated measures, where x̄F is the mean
for the final scores, or t = (x̄1 − x̄2)/(sd/√n) for matched pairs,
where x̄2 is the mean for group 2. In both cases, sd is the standard
deviation of the difference scores, and degrees of freedom equal
n − 1, where n is the number of participants for a repeated
measures design or the number of pairs for a matched-pairs
design.
An example of the five-step hypothesis-testing procedure with
a repeated measures design is presented next.
Suppose eight participants are selected and initially (I)
measured for the number of mistakes made when juggling balls.
The participants are then taught to juggle and finally (F) measured.
Let us say that, in Table 15.1, the group A scores are the initial
scores and the group B scores are the final scores. To determine if
the participants improved in juggling ability as a result of the
juggling instruction, a t-test for dependent groups is conducted.
H1: μI – μF > 0
Step 4: Analyze the initial and final data, and calculate the
t-test:
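Treating the group A scores as the initial trials and the group B scores as the final trials, as described above, the dependent t-test can be sketched with the standard difference-score computation (the book's own worked values are not shown in this excerpt):

```python
import math

initial = [12, 15, 10, 11, 9, 14, 12, 13]  # group A scores as initial trials
final = [7, 10, 11, 8, 9, 10, 12, 9]       # group B scores as final trials

d = [i - f for i, f in zip(initial, final)]  # difference scores
n = len(d)
d_mean = sum(d) / n
sd = math.sqrt(sum((x - d_mean) ** 2 for x in d) / (n - 1))

t = d_mean / (sd / math.sqrt(n))
print(round(t, 2), n - 1)  # 2.89 7
```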
One-Way ANOVA
One-way ANOVA is an extension of the two independent groups
design already discussed, but typically involves statistical analysis
of three or more independent groups. Some statistics books refer
to this test as a one-way ANOVA because each score has only
one classification (group membership). The placement of the
score in the group has no effect on the data analysis. Actually, a
one-way ANOVA can be used with two or more independent
groups, so it could be used in place of the t-test for two
independent groups. The null hypothesis being tested is that the
populations represented by the groups (samples) are equal in
mean performance. The alternate hypothesis is that the
population means are not equal.
Some explanation of the one-way ANOVA is necessary to
understand how the technique is used and how to interpret the
output from the computer analysis. Also, a few symbols are
necessary in order to develop formulas for determining degrees of
freedom.
Two or more independent groups each receive a different
treatment. The number of groups will be represented symbolically
by K, and the total number of scores by N. For the data in Table
15.1, K = 3 and N = 24. All the scores in all the groups are
combined into one set of scores, and the mean is calculated. The
variability of the scores from this mean is the total variability for the
set of scores and is called the sum of squares total (SST). The
sum of squares total is divided into sum of squares among groups
(SSA) and sum of squares within groups (SSW):

SST = SSA + SSW
The sums of squares values for among and within groups are
divided by their degrees of freedom to provide mean square (MS)
values, and an F statistic is calculated from the two mean square
values:

MSA = SSA / (K – 1)
MSW = SSW / (N – K)
F = MSA / MSW

with degrees of freedom (K – 1) and (N – K).
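The sums of squares, mean squares, and F statistic described above can be sketched for the Table 15.1 data. This is a hand-calculation illustration of the one-way ANOVA arithmetic, not the SPSS procedure:

```python
# One-way ANOVA for the K = 3 groups in Table 15.1.
groups = [
    [12, 15, 10, 11, 9, 14, 12, 13],  # group A
    [7, 10, 11, 8, 9, 10, 12, 9],     # group B
    [13, 14, 10, 9, 12, 11, 11, 15],  # group C
]

all_scores = [x for g in groups for x in g]
N, K = len(all_scores), len(groups)
grand_mean = sum(all_scores) / N

# Sum of squares among groups (SSA) and within groups (SSW)
ssa = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

msa = ssa / (K - 1)        # mean square among, df = K - 1
msw = ssw / (N - K)        # mean square within, df = N - K
F = msa / msw

print(round(ssa, 2), round(ssw, 2), round(F, 2))   # 31.75 74.88 4.45
```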
Step 1:
TABLE 15.8 Output for the General Linear Model with Repeated
Measures for the Data in Table 15.6
Random Blocks ANOVA
Earlier in this chapter, a matched-pairs t-test design was
presented. In that design, pairs are formed by pairing two
participants. Then, using random procedures, one participant from
each pair is assigned to one group, and the other participant is
assigned to a second group. The random blocks ANOVA design
is an extension of the matched-pairs t-test when there are three or
more groups, and is the same as the matched-pairs t-test when
there are two groups. In the random blocks design, participants
similar in terms of a variable(s) are placed together in what is
called a block, rather than a pair as in the t-test. The number of
participants in each block is equal to the number of treatment
groups in the research study. The procedures used in the
matched-pairs t-test might serve as a helpful reference for this
design.
After participants are assigned to groups, each group receives
a different treatment, and at the end of the study the data are
collected. The random block design has all the advantages and
disadvantages of the matched-pairs t-test.
TABLE 15.9 is an example data layout for a random blocks
design. Note that the same notation is used in this design as is
used for the repeated measures ANOVA design. In fact, blocks are
like participants, treatment groups are like repeated measures,
and the hypothesis tested is that the group means are equal.
Thus, the analysis for the random blocks design is the same as
the analysis for the repeated measures ANOVA design.
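Because Table 15.9 is not reproduced here, the following sketch partitions the sums of squares for a hypothetical random blocks layout with four blocks and three treatment groups. The block means and treatment means play the roles of participants and repeated measures, as the text notes:

```python
# Random blocks ANOVA sketch: 4 blocks (rows) of matched participants,
# 3 treatment groups (columns). Data are hypothetical.
data = [
    [10, 12, 14],
    [8, 9, 13],
    [11, 13, 15],
    [7, 10, 12],
]
B, K = len(data), len(data[0])        # number of blocks, treatment groups
N = B * K
scores = [x for row in data for x in row]
gm = sum(scores) / N

treat_means = [sum(row[j] for row in data) / B for j in range(K)]
block_means = [sum(row) / K for row in data]

ss_treat = B * sum((m - gm) ** 2 for m in treat_means)
ss_block = K * sum((m - gm) ** 2 for m in block_means)
ss_total = sum((x - gm) ** 2 for x in scores)
ss_error = ss_total - ss_treat - ss_block   # residual after blocks removed

# F test of equal treatment (group) means, as in repeated measures ANOVA
F = (ss_treat / (K - 1)) / (ss_error / ((B - 1) * (K - 1)))
print(round(F, 1))
```

Removing the block-to-block variability from the error term is what gives this design its advantage over a completely randomized one-way ANOVA on the same scores.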
The minimum value for each of these sum of squares is zero. The
SST is greater than zero if all N scores are not equal. The SSC is
greater than zero if the means for the columns are unequal. The
greater the inequality among the columns, the larger the SSC.
Likewise, the more the row means differ, the larger is SSR. There
will be no interaction (SSI = 0) if the pattern of differences among
column means is the same for each row, and vice versa. Finally,
SSW will equal zero if, for each cell, all scores in the cell have the
same value.
Each of these sum of squares values has an associated
degrees of freedom value:

dfC = C – 1
dfR = R – 1
dfI = (C – 1)(R – 1)
dfW = (C)(R)(n – 1)

Dividing each sum of squares by its degrees of freedom yields the
mean square values (e.g., MSC = SSC/(C – 1) and MSW =
SSW/[(C)(R)(n – 1)]).
These mean square values are used to form the F tests for the
various statistical hypotheses. All three potential hypotheses are
presented next, although a researcher might not test a hypothesis
if it is not of interest.
Typically, the first hypothesis tested is one of no interaction.
The alternate hypothesis (H1) is one of interaction. The hypothesis
is tested with F = MSI/MSW, with degrees of freedom (C – 1)(R –
1) and (C)(R)(n – 1). A significant F value could affect how
subsequent F tests are calculated and/or how the overall data
analysis is conducted. Discussion of this issue is left to books such
as Winer, Brown, and Michels (1991), Maxwell, Delaney, and
Kelley (2018), Keppel and Wickens (2004), and other
experimental design books.
A second hypothesis tested is that the column means are
equal. The alternate hypothesis is that the column means are not
equal. This test is usually with F = MSC/MSW and degrees of
freedom (C – 1) and (C)(R)(n – 1). Occasionally, the denominator
is not MSW, which will alter the second degrees of freedom value.
Remember, for any F test, the first degrees of freedom value is for
the numerator and the second degrees of freedom value is for the
denominator. In the F test just presented, the numerator will
always be MSC. Note that the numerator of any F test corresponds
to the hypothesis being tested. The hypothesis in the last example
dealt with the column means, so the numerator of the F test was
MSC.
A third hypothesis tested is that the row means are equal. The
alternate hypothesis is that the row means are not equal. This
hypothesis is seldom tested in a random blocks design. The F test
is normally F = MSR/MSW with degrees of freedom (R – 1) and (C)
(R)(n – 1).
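Since Table 15.11 is not reproduced here, the following sketch computes the four sums of squares and the F tests for a small hypothetical 2 × 2 factorial design with n = 2 scores per cell:

```python
# Two-way (factorial) ANOVA sketch with hypothetical data:
# R = 2 rows, C = 2 columns, n = 2 scores per cell.
cells = {
    (0, 0): [4, 6], (0, 1): [8, 10],
    (1, 0): [6, 8], (1, 1): [6, 8],
}
R, C, n = 2, 2, 2
N = R * C * n
scores = [x for cell in cells.values() for x in cell]
gm = sum(scores) / N

cell_mean = {rc: sum(v) / n for rc, v in cells.items()}
row_mean = [sum(cell_mean[(r, c)] for c in range(C)) / C for r in range(R)]
col_mean = [sum(cell_mean[(r, c)] for r in range(R)) / R for c in range(C)]

ssc = R * n * sum((m - gm) ** 2 for m in col_mean)      # columns
ssr = C * n * sum((m - gm) ** 2 for m in row_mean)      # rows
ssi = n * sum((cell_mean[(r, c)] - row_mean[r] - col_mean[c] + gm) ** 2
              for r in range(R) for c in range(C))      # interaction
ssw = sum((x - cell_mean[rc]) ** 2
          for rc, v in cells.items() for x in v)        # within cells

msw = ssw / (C * R * (n - 1))
F_interaction = (ssi / ((C - 1) * (R - 1))) / msw
F_columns = (ssc / (C - 1)) / msw
print(ssc, ssr, ssi, ssw, F_interaction, F_columns)
```

In this made-up data set the row means are identical (SSR = 0), but the column effect reverses across rows, which is exactly what a nonzero interaction sum of squares captures.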
The following is an example of the data analysis for a factorial
design using the data in Table 15.11. Remember, these data are
fitness test scores, so small scores are better than large scores.
H1: μ1 ≠ μ2
H1: μ1 ≠ μ2 ≠ μ3
Cohen (1988) states that an effect size less than 0.20 is small,
around 0.50 is medium, and greater than 0.80 is large. The ES
equation shows that either a large difference between groups,
which indicates a better treatment effect in an experimental study,
or small variations within groups could lead to a large effect size.
In most cases, if the effect size is large, the difference between the
two means has practical implications, whereas a small effect size
is less likely to have practical significance. Statistics books like
Winer, Brown, and Michels (1991) and Murphy, Myors, and
Wolach (2014) present an excellent discussion of statistical power
and effect size. Lipsey (1990) is totally devoted to the topic.
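As an illustration, the commonly used pooled-standard-deviation form of the effect size (Cohen's d) can be computed for groups A and B from Table 15.1. The specific ES equation the text refers to is not reproduced in this excerpt, so this is the conventional form:

```python
# Cohen's d for two independent groups (Table 15.1, groups A and B),
# using the pooled within-group standard deviation.
from math import sqrt

a = [12, 15, 10, 11, 9, 14, 12, 13]
b = [7, 10, 11, 8, 9, 10, 12, 9]

mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
ss_a = sum((x - mean_a) ** 2 for x in a)
ss_b = sum((x - mean_b) ** 2 for x in b)
pooled_sd = sqrt((ss_a + ss_b) / (len(a) + len(b) - 2))

d = (mean_a - mean_b) / pooled_sd
print(round(d, 2))        # 1.38, a large effect by Cohen's guidelines
```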
Overview of Two-Group Comparisons
Two-group comparisons are sometimes referred to as multiple
comparisons or a posteriori comparisons. These techniques are
used to compare groups two at a time following a significant F test
in ANOVA. Having rejected H0 that the group means are equal and
accepted H1 that the group means are not equal, the researcher
wants to compare pairs of means to determine if they differ
significantly. For example, if there are three groups (A, B, and C),
this involves comparing A to B, A to C, and B to C.
Several techniques have been developed specially for two-
group comparisons. It is important for beginning researchers to
know something about these techniques in case there is a need to
use them and in order to have some understanding of their
application in the research literature. Some of the techniques
commonly used are by Scheffe, Tukey, Duncan, Newman-Keuls,
and Bonferroni. Discussion of these techniques can be found in
statistics books such as Keppel and Wickens (2004) and Winer,
Brown, and Michels (1991). Which technique to use is influenced
partly by personal preference and partly by the research situation.
Two-group comparison techniques may differ in terms of three
important attributes: per-comparison error rate, experiment-wise
error rate, and statistical power. Each of those attributes should be
considered when selecting a technique for a particular research
situation. Per-comparison error rate is the probability of making
a type I error in a single two-group comparison. Experiment-wise
error rate is the probability of making a type I error somewhere in
all of the two-group comparisons conducted. Statistical power is
the probability of not making a type II error in a two-group
comparison.
Scheffe’s technique, for example, controls type I errors very
well, but is lacking in statistical power unless sample sizes are
large. Tukey’s technique does not control type I errors as well, but
has better statistical power than Scheffe’s.
Two-group comparisons are conducted after a significant F test
in any ANOVA design, provided there are three or more levels or
groups associated with the F test. For example, if for the data in
Table 15.11, there is a significant F test for columns, then two-
group comparisons are conducted; but a significant F test for rows
indicates that the two row means are not equal, making two-group
comparison unnecessary.
In statistics books, two-group comparisons are very often
discussed after one-way ANOVA and, in packages of statistical
computer programs, two-group comparisons are often part of the
one-way ANOVA program. Furthermore, computer programs may
not contain all of the two-group comparison techniques. Thus,
using a particular two-group comparison with ANOVA designs
other than a one-way may require hand calculation of the two-
group comparison. A hand calculation is not difficult with the
information provided in the computer printout for the ANOVA
conducted to justify the two-group comparisons. SPSS provides
many two-group comparison techniques with both the one-way
and two-way programs. Within these programs, click on Post Hoc
Tests to select the desired two-group comparison. With a two-way
ANOVA, you must specify whether two-group comparisons are for
rows or columns or both.
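Of the techniques listed, the Bonferroni approach is the easiest to sketch by hand, because it amounts to computing each pairwise t statistic and then testing it at an adjusted alpha (here .05 divided by the number of comparisons). The fragment below does this for the three Table 15.1 groups; looking up the critical t value for the adjusted alpha is left to tables:

```python
# Bonferroni-style two-group comparisons after a significant one-way
# ANOVA F test, using the three groups from Table 15.1.
from math import sqrt
from itertools import combinations

groups = {
    "A": [12, 15, 10, 11, 9, 14, 12, 13],
    "B": [7, 10, 11, 8, 9, 10, 12, 9],
    "C": [13, 14, 10, 9, 12, 11, 11, 15],
}

def pooled_t(x, y):
    """Two independent groups t statistic with pooled variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    sp2 = (ssx + ssy) / (len(x) + len(y) - 2)
    return (mx - my) / sqrt(sp2 * (1 / len(x) + 1 / len(y)))

pairs = list(combinations(groups, 2))
adjusted_alpha = 0.05 / len(pairs)      # Bonferroni adjustment: .05 / 3
for g1, g2 in pairs:
    t = pooled_t(groups[g1], groups[g2])
    print(f"{g1} vs {g2}: t = {t:.2f}, test at alpha = {adjusted_alpha:.4f}")
```

Testing each comparison at the smaller adjusted alpha is what holds the experiment-wise error rate near .05, at some cost in statistical power, which mirrors the trade-offs among the techniques described above.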
Review of Analysis of Covariance
Analysis of covariance (ANCOVA) is an alternative to most
ANOVA designs, and it is commonly found in the research
literature. ANCOVA was discussed briefly in Chapter 8, “Reviewing
the Literature”; it would be helpful to review that material before
proceeding. Basically, ANCOVA statistically adjusts the difference
among the group means on the criterion (data) variable to allow for
the fact that the groups differ in mean score on some other
variable(s) and then applies ANOVA to the adjusted criterion
variable. A variable used for adjusting the data is called the
covariate, and the data variable is called the variate. If there is no
adjustment in the variate, ANOVA and ANCOVA yield the same
results.
One reason for using ANCOVA is to reduce error variance
(reflected in the denominator of the F test) in a design. Another
reason is to adjust for the inequality of groups at the start of the
research study in terms of one or more variables that need to be
controlled in the research study.
ANCOVA is more complex than ANOVA in terms of the
assumptions underlying its use, data collection, and analysis. Prior
to using ANCOVA, the researcher should become familiar with the
technique by consulting statistics books like Huck (2011), Keppel
and Wickens (2004), and Winer, Brown, and Michels (1991).
Manuals for packages of statistical computer programs often
include good explanations of statistical techniques. In SPSS,
ANCOVA is in the two-way ANOVA program referenced earlier.
Overview of Nonparametric Tests
Another way in which a statistic can be classified is by whether it is
a parametric statistic or a nonparametric statistic. All of the
statistical tests discussed so far have been parametric tests, in
that interval data and normal distribution of the scores for each
population are assumed. This section discusses statistical tests
that do not require that these two assumptions be met.
Nonparametric tests are used when the data are not interval or
the data are interval but not normally distributed. For example,
small sample sizes per group (n < 10) seldom yield data that are
normally distributed. An overview of nonparametric statistical tests
is presented, and then one commonly used test, chi-square, is
discussed in detail.
Books devoted entirely to nonparametric statistical tests
include Siegel (1956), which is a classic but is out of print; and
Siegel and Castellan (1988), which is a revision of Siegel.
Nonparametric statistics courses are available at most universities.
Many statistics books have a chapter devoted to the common
nonparametric statistical tests. For each of the common
parametric statistical tests discussed in this chapter, there is an
alternative nonparametric statistical test. Where parametric
statistical tests deal with the mean, nonparametric statistical tests
deal with the median, ranks, and frequencies. Note that the
second-best measure of central tendency is the median, and the
mode is based on frequencies. Often, nonparametric statistical
tests do not have as much statistical power as parametric
statistical tests. However, random sampling is still assumed with
nonparametric statistical tests.
The Kruskal-Wallis one-way ANOVA by ranks and the
Friedman two-way ANOVA by ranks are two nonparametric
statistical tests commonly found in the research literature. Both of
these statistical tests can be found in statistical packages for the
computer. SPSS has a Nonparametric Tests program.
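To show how a rank-based test works, the Kruskal-Wallis H statistic can be sketched for small hypothetical data. The data here contain no tied scores; ties require a correction factor not shown in this sketch:

```python
# Kruskal-Wallis one-way ANOVA by ranks, hypothetical tie-free data.
groups = [[1, 4, 6], [2, 5, 8], [3, 7, 9]]

all_scores = sorted(x for g in groups for x in g)
rank = {x: i + 1 for i, x in enumerate(all_scores)}   # rank 1 = smallest
N = len(all_scores)

# H = 12 / (N(N + 1)) * sum(Rj^2 / nj) - 3(N + 1), Rj = rank sum of group j
rank_sums = [sum(rank[x] for x in g) for g in groups]
H = (12 / (N * (N + 1))) * sum(r ** 2 / len(g)
                               for r, g in zip(rank_sums, groups)) - 3 * (N + 1)
print(round(H, 3))
```

Because H is computed from rank sums rather than means, it requires only ordinal information, which is why the test is appropriate when the interval or normality assumptions of one-way ANOVA are not met.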
The chi-square (χ2) statistical test is a nonparametric test often
used by researchers conducting descriptive research. Most
general statistics books discuss the chi-square test. The one-way
(goodness of fit) and two-way (contingency table) chi-square tests
are discussed in detail in subsequent paragraphs because of their
many uses in research studies.
One-Way Chi-Square
In a research study where a one-way (goodness of fit) chi-
square test is used to analyze the data, the researcher
hypothesizes some distribution for the data of a population. Based
on the distribution of sample data, the researcher either accepts or
rejects the hypothesis. Again, the five-step hypothesis-testing
procedure is used. An example will help to clarify this test.
Suppose a researcher is interested in determining whether
freshmen at a university are satisfied with living in a dormitory. The
population of university freshmen is 6,000. A sample of 1,000
students is selected using random procedures. A four-item
questionnaire with five possible answers (strongly agree, agree,
neutral, disagree, strongly disagree) per item is sent to the
sample. Eight hundred students return the questionnaire. Prior to
mailing the questionnaires to the sample, the researcher
hypothesizes that the population distribution on items is 10%
strongly agree, 10% strongly disagree, 20% agree, 20% disagree,
and 40% neutral. These hypothesized percentages generate the
expected frequency for each answer, whereas the number of
participants in the sample selecting each answer generates the
observed frequency for each answer. Presented in TABLE 15.13
are the frequencies of response (observed frequencies) for the
sample to each item of the questionnaire. Presented in TABLE
15.14 are the expected frequencies for each answer to the
questionnaire in Table 15.13. Are the observed frequencies and
expected frequencies for a questionnaire item in close enough
agreement to accept the hypothesized distribution of responses?
TABLE 15.14 Expected Frequencies for Each Answer

Answer              Hypothesized %   Expected Frequency
Strongly Agree      10               80
Agree               20               160
Neutral             40               320
Disagree            20               160
Strongly Disagree   10               80
Step 4: Analyze the data item by item. The data for item 1
in Table 15.13 is analyzed. Using the One-Way Chi-
Square program in SPSS, the output presented in TABLE
15.15 was obtained. Both the chi-square value and p-
value indicate significance. The directions for using this
program are presented in section 9 of the Statistical
Procedures part of Appendix C.
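The chi-square arithmetic behind that SPSS output can be sketched directly. The expected frequencies below are the Table 15.14 values; the observed frequencies are hypothetical, since Table 15.13 is not reproduced in this excerpt:

```python
# One-way (goodness of fit) chi-square: chi2 = sum((O - E)^2 / E).
expected = [80, 160, 320, 160, 80]    # 10%, 20%, 40%, 20%, 10% of 800
observed = [120, 200, 280, 120, 80]   # hypothetical responses to one item

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(expected) - 1                # number of answer categories minus 1
print(chi_square, df)                 # 45.0 with 4 df
```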
Two-Way Chi-Square
In a research study where a two-way (contingency table) chi-
square test is used to analyze the data, the researcher has
hypothesized that, for a population, two variables are independent
of each other or uncorrelated. Based on the data of a sample from
the population, the researcher accepts or rejects the hypothesis.
As always, the five-step hypothesis-testing procedure is followed.
The following example is an easy way to present this chi-square
test.
A questionnaire is sent to a sample of 100 former participants
in an industrial fitness program, and 30 of them return it.
Participants are asked to indicate their gender, age, and response
to each of three items about the fitness program. The data from
the questionnaire is presented in TABLE 15.16. The researcher
hypothesizes that males and females respond in a similar manner
to item 1. This is equivalent to believing that gender and response
to item 1 are independent of each other.
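The test of that hypothesis can be sketched with a small contingency table. The frequencies below are hypothetical (Table 15.16 is not reproduced here): rows are gender, columns are two response categories to item 1, and expected frequencies come from the row and column totals:

```python
# Two-way (contingency table) chi-square test of independence.
observed = [
    [10, 5],    # males: agree, disagree (hypothetical counts)
    [4, 11],    # females
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
N = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / N   # expected if independent
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_square, 3), df)
```

A significant chi-square here would lead the researcher to reject the hypothesis that gender and response to item 1 are independent.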
Simple Prediction
Simple prediction involves predicting an unknown score (Y) for
an individual by using that person’s performance (X) on a known
measure. To develop a simple prediction, or regression formula, a
large number of participants must be measured, and a score on
the independent or predictor variable X and the dependent or
predicted variable Y must be obtained for each. Once the formula
is developed for any given relationship, only the predictor variable
X is needed to predict the performance of an individual on
measure Y. An example of simple prediction is predicting college
grade-point average (Y) with high school grade-point average (X).
One index of the accuracy of a simple prediction equation is the
correlation coefficient (r).
Because prediction is a correlational technique, a line of best
fit, or regression line, can be generated. (See graphing technique
in Chapter 14, “Descriptive Data Analysis.”) In the case of simple
prediction, this is the regression line for the graph of the X variable
on the horizontal axis and the Y variable on the vertical axis.
The general form of the simple prediction equation is:

Y′ = bX + c

Here, Y′ is a predicted Y score. The b is a constant multiplying X
and represents the slope of the regression line. The slope is the
rate at which Y′ changes with change on X, and X is the score of a
person. The constant is called the Y-intercept and is the point at
which the line crosses the vertical axis. It is the value of Y′ that
corresponds to an X of 0. Sometimes, computer programs report
the simple prediction equation in terms of slope and Y-intercept.
An individual’s predicted score Y′ does not equal the actual
score Y unless the correlation coefficient used in the formula is
perfect—a rare event. Thus, for each individual there is an error of
prediction. The standard deviation of this error is called the
standard error of prediction. The standard error of prediction is
another index of the accuracy of a simple prediction equation.
If the prediction formula and standard error seem acceptable,
the researcher should check the accuracy of the prediction formula
on a second group of individuals similar to the first. This process is
called cross-validation. If the formula works satisfactorily for the
second group, it can be used with confidence to predict score Y for
any individual who resembles the individuals used to form and
cross-validate the equation. If the formula does not work well for
the second group, it is considered to be unique to the group used
to form the equation and has little value in predicting performance
in other groups.
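The slope, Y-intercept, and standard error of prediction can be sketched with a small hypothetical data set, mirroring the GPA example above. The standard error here uses the usual n − 2 denominator for the standard deviation of the prediction errors:

```python
# Simple prediction (regression) sketch: predicting college GPA (Y)
# from high school GPA (X), with hypothetical data.
from math import sqrt

x = [2.0, 2.5, 3.0, 3.5, 4.0]   # high school GPA (predictor X)
y = [1.7, 2.3, 2.6, 3.0, 3.4]   # college GPA (predicted Y)

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))      # slope of the regression line
c = my - b * mx                              # Y-intercept

predictions = [b * xi + c for xi in x]       # Y' for each person
sse = sum((yi - p) ** 2 for yi, p in zip(y, predictions))
std_error = sqrt(sse / (n - 2))              # standard error of prediction

print(f"Y' = {b:.2f}X + {c:.2f}, standard error = {std_error:.2f}")
```

Cross-validation would then apply this same equation, with b and c fixed, to a second similar group and compare the prediction errors there with the standard error obtained here.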
Multiple Prediction
A prediction formula using a single measure X is usually not very
accurate for predicting a person’s score on measure Y. Multiple
prediction (or regression) techniques allow Y scores to be
predicted using several X scores (e.g., the prediction of college
grade-point average based on high school grade-point average
and SAT scores).
A multiple regression equation has one intercept and one b
value for each independent variable. The general form of two- and
three-predictor multiple regression equations is:

Y′ = b1X1 + b2X2 + c
Y′ = b1X1 + b2X2 + b3X3 + c
CHAPTER 16
Qualitative and Mixed-Methods
Data Analysis and
Interpretations
This chapter presents techniques for data analysis and
interpretations of qualitative and mixed-methods research.
CHAPTER OBJECTIVES
Data reduction
Data saturation
Themes
Trustworthiness
Verification
Data Reading
Another valuable step in the qualitative research process, data
reading, consists of thoroughly reviewing transcripts and other
data sources (e.g., field notes, documents, audio-visual materials,
art work). As data familiarization is occurring, researchers reflect
on the data by creating written or electronic memos or notes that
may be entered directly into qualitative software; these memos may
store analytic thoughts, ideas, and connections with theory. These
software tools allow researchers to begin to examine
and understand the information contained in the data sources as it
relates to the research purpose and question(s). Personal memos
highlight individual thoughts, feelings, and reflections about the
phenomenon, interview, or method, and summary memos work to
synthesize the content from several memos and can include short
quotes from raw data sources. Regardless of the type, record a
date and heading, source of the data, and identify the information
as a fact, quote, or interpretation. These memos may, in turn, help
provide a better understanding and description of the phenomenon
to support the data-driven themes.
CHAPTER 17
Developing the Research
Proposal
The preparation of a research proposal is a crucial step in the
overall research process. In this chapter, we will review the
essential elements of a research proposal and provide a general
framework for writing a proposal.
OBJECTIVES
KEY TERMS
Assumptions
Data collection plan
Definition of terms
Delimitations
Limitations
Purpose statement
Research proposal
Not only is the title overly long, but it also includes some
redundancy, and it is confusing. The title should be long enough to
cover the subject of the research, but short enough to be
interesting. Usually, 12 to 15 words are sufficient. The beginning
researcher should ask the following questions when contemplating
a project title:
1. Does the title precisely identify the area of the problem?
2. Does the title present the uniqueness of the study (e.g., sample
employed, method used, how the data were analyzed, etc.)?
3. Is the title clear, concise, free from jargon, and adequately
descriptive to permit indexing the study in its proper category?
4. Does the title identify the key variables and provide some
information about the scope of the study?
5. Are unnecessary words, such as “a study of,” or “an
investigation of,” or “an analysis of” avoided? (Note: Such
words and other catchy, misleading, or vague phrases lengthen
titles unnecessarily and do not add substance. They are
superfluous.)
6. Do nouns, as opposed to adjectives, serve as the key words in
the title?
7. Have words been selected that will aid computerized retrieval
systems?
8. Are the most important words placed at the beginning of the
title?
The following examples illustrate appropriate titles. Each title
meets the criteria for a streamlined, yet complete, title. Moreover,
each accurately reflects the research problem.
“Leisure Participation of Urban Chinese Adolescents” (Wang,
2000)
“Psychological Factors Related to Eating Disorders Among
Female College Cheerleaders in Iowa” (Musser, 2002)
“Effects of Trampoline Training on Balance in Functionally
Unstable Ankles” (Hess, 2004)
“Effects of Special Olympic Participation and Self-Esteem on
Adults with Mental Retardation” (Major, 1998)
“Parental Influence on Children’s Physical Activity” (Lambdin-
Abraham, 2009)
Writing Chapter One: Introduction
Chapter One of the research proposal usually begins with a brief
introduction. Customarily, the researcher includes a paragraph or
two (perhaps more, depending on the nature of the problem) to
help lead the reader into the problem. The researcher specifies the
problem to be investigated, provides the underlying rationale, and
establishes why it is important. How the researcher became
interested in the problem is usually indicated. Some references to
the literature should be made in the introduction, although it should
not be so extensive as to resemble a review of literature, which is
more properly done in Chapter Two. The literature cited in the
introduction, coupled with the researcher’s own thoughts, should
be used to stimulate interest and to establish the underlying
rationale for the study. Again, it is not necessary that this section
be long. Typically, the introduction section of a research proposal
will be one to two pages in length. Following are two examples,
both paraphrased from the completed research proposals, that
illustrate the essence of introductory sections:
Specific Aims
For some funding agencies, e.g., National Institutes of Health
(NIH), aside from the purpose statement, “specific aims” of the
study are also required. In fact, the specific aims are considered
the central, or a crucial, part of the proposal. Good specific aims
should include why, what, who, and how, e.g.:
1. Why are you going to do this study?
2. What specifics are you doing in the study?
3. Who will get involved in this study?
4. How, exactly, are you going to do the study?
The total number of specific aims usually is limited to three to
four in a proposal of a study lasting 3 to 5 years. A specific aim
itself usually consists of a few sentences, with a clear “to do”
statement, along with a short paragraph describing the hypothesis
to be examined and other related information, e.g.:
Delimitations
In research circles, delimitations refer to the scope of the study. It
is in this section of Chapter One of the research proposal that the
researcher draws a line around the study and, in effect, fences it
in. This section identifies what is included in the research.
Delimitations spell out the population studied and include those
things the researcher can control. It establishes the parameters and
the characteristics of the study, such as (1) number and kinds of
research participants; (2) number and kinds of variables; (3) tests,
measures, or instruments utilized in the study; (4) special
equipment; (5) type of training program; (6) the time and duration
of the study (e.g., date, number of weeks, time of year); and (7)
analytical procedures. These are the ingredients that the
researcher uses to attack the problem. If an item does not appear
in the list of delimitations, then it is of no concern in the research.
A particular population is targeted on which selected variables will
be studied. The variables will be measured by specific test
instruments. A certain training program may be involved, and the
study will be conducted over a specified period of time.
The section on delimitations in the research proposal does not
have to be excessively long, but it should specify those
parameters of the research proposal that the researcher can
control. Typically, a proposal would list 5 to 10 delimitations. An
example of delimitations taken from a research proposal to
ascertain the relationship between self-esteem, eating behaviors,
and eating attitudes among female collegiate swimmers is
presented as follows:
1. Twenty-four female collegiate swimmers, ages 18 to 21 years
2. A purposeful sample consisting of members of the University of
Northern Iowa’s women’s swimming team during the spring of
1998
3. The use of the Rosenberg Self-Esteem Scale to measure self-
esteem
4. The use of the Eating Attitudes Tests (EAT–26) of the National
Eating Disorders Screening Program to measure the construct
and concerns characteristic of an eating disorder
5. Administration of the data collection instruments during a single
testing session held during a team meeting of the swimming
team immediately following completion of the collegiate
swimming season (Fey, 1998)
The format for presenting the delimitations (as well as the
limitations and assumptions) in a research proposal often includes
an introductory phrase, such as, “This study is delimited to the
following” followed by a listing or enumeration of the applicable
delimitations, as illustrated previously.
Limitations
In research terminology, limitations refer to weaknesses of the
study. All studies have them, because compromises frequently
have to be made in order to conform to the realities of a situation.
Limitations are those things the researcher could not control, but
which may have an influence on the results of the study. The
reader of a research report should always know at the outset
those conditions of the study that could reflect negatively on the
work in some way. Only those things that might affect the
acceptability of the research data should be presented. The
researcher will, of course, try to eliminate extremely serious
weaknesses before the study is commenced. Among the items
that typically involve statements of limitation are the following:
1. The research approach, design, method(s), and techniques
2. Sampling problems
3. Uncontrolled variables
4. Faulty administration of tests or training programs
5. Generalizability of the data
6. Representativeness of the data
7. Compromises to internal and external validity
8. Reliability and validity of the research instruments
It is important that all statements of limitation sound like, or
imply, weakness. The statement, “The sample size was small,” is
not sufficient, because a small sample is not necessarily a
weakness of the study. However, the statement, “The
sample size of the study (N = 30) is small, necessitating caution in
extrapolation of the data to a larger obese population,” implies a
possible weakness of the study. While the number of limitations
listed will vary depending on the nature of the study, the research
design, and the methodology utilized, it is generally recommended
that the number of limitations be less than the number of
delimitations presented. While it is important that the researcher
recognize at the outset potential weaknesses of the study, an
exceedingly long list of limitations raises serious questions about
why so many facets of the research were not controlled, thus
questioning the overall veracity of the research. The following
limitations were adapted from those included in a proposal for
studying an obesity problem:
1. The sample size of this study is small (N = 30), necessitating
caution in extrapolating the data to a larger obese population.
2. Daily activities of the subjects other than the exercise program
are not controlled.
3. Although subjects are requested to stay on a diet that is well
balanced and restricted by 500 calories per day, occasional
variances from this may occur. In addition, due to the nature of
the diet analyses performed, changes in caloric intake may not
be detected.
4. The investigator is unable to personally conduct all of the
maximal graded exercise tests in pretraining and posttraining
as well as the exercise sessions. To ensure standardization in
testing and training, the investigator will meet with all research
assistants in order to discuss the testing and training
procedures.
5. Resting metabolic rate will not be measured following the
exercise sessions and lipolytic changes during this time will not
be determined.
6. Because plasma FFA, glycerol, and lipolytic hormones can be
affected by conditions other than exercise, it is possible that
such conditions may exist.
7. Plasma FFA does not reflect FFA flux in adipose or skeletal
muscle tissue and, therefore, conclusions concerning fat
metabolism must be interpreted with caution. (Sun, 1988)
Assumptions
Assumptions are derived primarily from the literature.
Assumptions then become the basis for the hypotheses, or
predictions of eventual outcomes, of the study. In other words,
what does the literature tell the researcher about what can be
assumed to be true for purposes of planning the study? The
information gleaned from the literature frequently serves as the
basis for much of the development of the research project. The
literature provides information about a particular behavior being
investigated and the various conditions influencing that behavior,
and sometimes contains factual evidence explaining the behavior.
The researcher undertakes a study with certain assumptions about
this information. Assumptions are also made about the way the
instrumentation, procedures, methods, and techniques will
contribute to the study.
The researcher may also assume that certain things that could
not be controlled or documented may occur in the study. The
following are examples of this type of assumption:
1. All research participants will answer the questionnaire honestly
and correctly.
2. The research participants will comply with the researcher’s
request to perform maximally on all trials of the test.
3. The test instrument is a reliable and valid measure of the
desired construct.
The following provides an example of assumptions made by
the author of a study designed to investigate the effects of
participation in Special Olympics on the self-esteem of adults with
mental retardation:
1. The test instrument, as modified, is appropriate for the target
population and is a valid and reliable measure of self-esteem.
2. The subjects will be able to understand the directions as they
are intended.
3. The subjects will complete the self-esteem inventory to the
best of their ability.
4. The subjects are a representative sample of adults with mental
retardation who reside in the state of Iowa.
5. The interviewers will be sufficiently trained and capable of
utilizing the recommended survey administration procedures.
(Major, 1998)
Hypotheses
This section of Chapter One permits the researcher to predict the
outcome of the study in advance and to present tentative
explanations for the solution of the problem. The assumptions
made by the researcher provide a “launching pad” for hypotheses.
A simple research hypothesis can be written as a single sentence
in which the researcher describes the expected outcome.
Consider the following examples of simple research hypotheses:
There is a positive relationship between the level of intrinsic
motivation and performance on a test of cardiorespiratory
function.
There is a difference in the self-reported alcohol usage of male
collegiate athletes compared to female collegiate athletes.
Normally, a statement of hypothesis will be declared for each
research question posed, although some descriptive questions
may not warrant hypotheses. It is important to note that, because
of space limitations in some professional journals, authors often do
not explicitly present their research hypotheses in the published
research reports, although such hypotheses may be implied in the
paper and undoubtedly have been made by the researcher. On the
other hand, graduate students, preparing their research proposal
or writing their dissertation or thesis, are normally expected to
explicitly state their research hypotheses in clear, unambiguous
terms. Stating hypotheses helps make a researcher’s thought
process about the research situation more concrete. If a prediction
of a specific outcome is made, it forces a more thorough
consideration of which research techniques, methods, test
instruments, and data-collecting procedures should be employed.
It is extremely important that the hypotheses be testable.
Hypotheses also help set up the way the data will be analyzed and
how the final report will be organized and written. Following is an
example of hypotheses presented in a research proposal to
determine the effect of sea kayak touring on the self-concept of
patients with low-level spinal cord injuries (SCIs):
1. Patients with a low level of SCI of at least 1 year’s duration
exhibit an increased self-concept after participation in a 2-week
sea kayak tour.
2. Increased self-concept changes are significantly greater for
those patients participating in the sea kayak tour than patients
participating in a 2-week camping experience or a regular
rehabilitation program.
3. Self-concept does not change for those patients participating in
a 2-week camping experience or in a regular rehabilitation
program.
4. For those patients participating in the sea kayak tour, self-
concept remains elevated for 6 months after completion of the
tour. (Pate, 1987)
Definition of Terms
Definition of terms is a necessity, because many terms and
concepts have multiple meanings. The researcher should define
terms as they will be interpreted and used throughout the study.
Always define operational and behavioral terms and concepts, and
check special terms in technical dictionaries or have them
reviewed by experts in the field. An operational definition gives
meaning to a concept or term by indicating what operation has to
be accomplished so as to be able to measure the concept. In
experimental studies, a researcher will cause a particular result to
occur by using a certain procedure or operation. An operational
definition of personality might refer to the “scores on the Cattell 16
Personality Factor Inventory” or some other personality test
selected by the researcher. Directly quoted or accurately
paraphrased time-honored definitions, if they fit the research
situation, are considered superior to definitions made up or
“coined” by the researcher. Once defined, a term should be
applied consistently throughout the study so as not to confuse the
reader. It is customary to alphabetize the terms in a list of
operational definitions. Following is an example of a partial list of
terms defined in a research proposal related to assessment
practices in school physical education:
5. Osteoporosis
a. Bone Loss
b. Types of Osteoporosis
c. Osteoporosis in the Triad
6. Summary
The introduction to Chapter Two should always contain an
opening paragraph that relates the literature to the problem being
studied and explains how the chapter is organized. Often, the
introductory paragraph will include a restatement of the problem
that was presented in Chapter One of the research proposal.
Consider the following example:
CHAPTER 18
Writing the Research Report
This chapter offers a basic overview of the research report. The
function of a research report is to communicate a set of ideas and
facts to those who are interested in the problem area in which the
research was undertaken. While there is no universally accepted
format for the research proposal and resultant research report,
most tend to follow a similar style.
OBJECTIVES
KEY TERMS
Article format
Preliminary items
Research report
Supplementary items
Thesis format
Preliminary Items
The preliminary items usually included are the title page,
approval page, acknowledgments page, table of contents, list of
tables, list of figures, and abstract. EXAMPLES 18.1 to 18.4
provide examples of these items completed by former students of
the authors of this textbook.
July 1998
Fey (1998).
Acknowledgments
I would like to express my sincere gratitude to Dr. ______, Dr.
______, and Dr. ______ for their assistance and guidance in
this investigation. The author is especially indebted to Dr.
______ whose help and encouragement were invaluable to the
completion of this study.
Abstract
The general purpose of this study was to investigate the role of
parental influence on children’s physical activity (PA)
behaviors. The specific purposes were to (1) determine the
relationship between parental attitudes and their children’s PA,
(2) determine if parental exercise habits are related to their
children’s PA participation, (3) determine if parental knowledge
of physical activity recommendations is related to their
children’s PA participation, (4) examine the nature of parenting
practices on their children’s PA participation, and (5)
investigate the mediating effect of socioeconomic status on the
influence parents have on their children’s PA participation.
Questionnaires were obtained from fifth and sixth grade
parents and children at two schools in Iowa. To assess the
physical activity behaviors of the participants, children
completed the Physical Activity Questionnaire for Children,
PAQ(C). Parents completed a questionnaire consisting of five
sections: demographics, parental exercise behavior, parenting
practices scale, physical activity knowledge scale, and exercise
benefits and barriers scale.
Data were collected on 57 fifth and sixth grade parents and
students during fall semester 2010. The study sample included
28 girls and 29 boys. Students ranged in age from 9 to 12
years, with a mean age of 10.6 years. The parents ranged from
29 to 52 years, with a mean age of 34.9 years.
The results of the study showed that parental influences do
not have a large impact on children’s physical activity. No
significance was found between parental physical activity
behaviors or attitude/beliefs and children’s physical activity
behavior. The majority of the parenting practice items did
illustrate a weak, positive correlation, while parental knowledge
results showed a weak negative correlation with children’s
physical activity behavior. Socioeconomic status did not impact
the correlations between the parental variables and children’s
physical activity behavior. There are many factors that
influence the physical activity behaviors of children outside of
parents, and future research should investigate other potential
factors. (Eshelman, 2010)
Abstract
Example 18.4 is a sample of an abstract for a master’s thesis.
Most institutions require an abstract varying in length from 250 to
350 words. The abstract is one of the preliminary items of the
thesis or dissertation. Most research journals also require an
abstract when researchers submit an article for publication.
Although the length requirement varies among the journals, 150 to
200 words is typical. The researcher should use the full abstract length allowed, because the more that can be reported to the reader, the better the research will be understood. Often, abstracts are written as a single paragraph, but sometimes they include a separate paragraph for each section.
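During revision, an author can quickly check an abstract against these word limits. The following is a minimal sketch; the function name and the sample text are illustrative, and the limits are the typical ranges mentioned above:

```python
def check_abstract_length(text: str, min_words: int, max_words: int) -> str:
    """Return a short verdict comparing an abstract's word count to a limit."""
    n = len(text.split())
    if n < min_words:
        return f"{n} words: below the {min_words}-word minimum"
    if n > max_words:
        return f"{n} words: over the {max_words}-word limit"
    return f"{n} words: within limits"

# Placeholder abstract of 200 words, checked against a typical
# thesis limit (250-350 words) and a typical journal limit (150-200 words)
abstract = "This study examined parental influence " * 40
print(check_abstract_length(abstract, 250, 350))
print(check_abstract_length(abstract, 150, 200))
```

A check like this is no substitute for careful editing, but it helps ensure the abstract fits the target outlet before submission.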
An abstract must be concise but should include enough detail
to enable the reader to determine whether or not the entire
research report, article, thesis, or dissertation needs to be read.
The abstract should be brief and contain the overall essence of the
information included in the larger work. The reader should be able
to glean from the abstract all the main ideas of the original
research investigation.
An appropriate abstract will usually include comments
concerning each of the following elements:
1. The problem that was investigated, including the rationale from
which the problem was developed.
2. The methods by which the data for the problem were collected.
Included here will be an identification of the variables studied,
the procedure used in carrying out the research and collecting
the data, and a discussion of how the data were analyzed.
3. A brief report of the findings resulting from the study. These
results are usually not given in tables, figures, and graphs due
to the limitations placed on the length of an abstract.
4. The conclusions made by the researcher based on the
findings. If room in the abstract exists, it could also include the
researcher’s interpretation of the possible implementation of
the results, or recommendations for further research in the
problem area.
Characteristics of Participants
The physical characteristics of the research participants are
shown in TABLE 1. Mean (with SD in parentheses) age was 13.1 (0.6) years, and mean BMI was 21.8 (5.2) kg/m². There were no significant differences in age or BMI by sex (p > 0.05).
Correlations
Correlational analyses indicated that all 10 psychosocial
variables were significantly related to children’s physical
activity (TABLE 5). The subscale with the single highest
correlation for boys was “likes exercise” with an r = 0.61,
whereas “likes sports/games” had the highest correlation for
girls with r = 0.44. In addition, the inter-relationships among the
10 psychosocial variables were low to moderate, ranging from
0.28 to 0.66.
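Correlation coefficients such as those reported above are normally produced by a statistics package, but the underlying Pearson r is straightforward to compute directly. The sketch below uses invented data for illustration:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical "likes exercise" scores versus weekly activity minutes
likes_exercise = [2, 3, 3, 4, 5, 5, 1, 4]
activity_min = [90, 150, 120, 200, 260, 240, 60, 180]
print(round(pearson_r(likes_exercise, activity_min), 2))
```

The coefficient ranges from −1 to +1, so values such as the 0.61 and 0.44 reported above indicate moderate positive relationships.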
Summary
The problem of the study was to determine if the psychological
factors of self-efficacy discriminate between female adults who
adhere to exercise programs and female adults who do not
adhere to exercise programs. Included in the study was an
attempt to identify the reasons exercise adherers give for
continuing an exercise program and the reasons exercise non-
adherers give for discontinuing an exercise program.
The subjects of the study were 168 females who had
enrolled in physical fitness classes at the Monroe County
YMCA, Bloomington, Indiana, during the spring of 1985. All
subjects completed a survey instrument consisting of the self-
consciousness scale, the physical self-efficacy scale, and
questions designed to determine demographic data and
exercise habits. The data for the study were collected during
the months of May and June 1986.
The data were analyzed using four statistical techniques.
Chi-square test of association was used to test the differences
in fitness class attended, education level, and marital status
between exercise adherers and non-adherers. Discriminant
analysis was used to determine if the independent variables of
private self-consciousness, perceived physical self-efficacy,
age, and income could significantly predict the dependent variable of group membership (exercise adherer or non-adherer). Multiple regression analysis was used to determine if
the independent variables of the study could significantly
predict the number of minutes per week spent in vigorous
physical activity. Finally, a frequency distribution of the reasons
exercise adherers gave for continuing an exercise program and
the reasons exercise non-adherers gave for discontinuing an
exercise program was done. The Statistical Package for the
Social Sciences (SPSS) was used for all statistical analyses
except for the frequency distributions.
The analysis of the data revealed the following significant
findings:
1. The independent variables of private self-consciousness,
public self-consciousness, social anxiety, self-
consciousness, perceived physical ability, physical self-
presentation confidence, physical self-efficacy, age, and
income significantly predicted group membership as an
exercise adherer or non-adherer.
2. The independent variables of perceived physical ability and physical self-efficacy accounted for most of the function that discriminated between exercise adherers and non-adherers.
3. The optimal linear combination of the independent variables
of private self-consciousness, social anxiety, perceived
physical ability, physical self-presentation confidence,
physical self-efficacy, age, education level, marital status,
and income, taken altogether, significantly predicted the
number of minutes per week spent in vigorous physical
activity.
4. By itself, only perceived physical ability significantly
predicted the number of minutes per week spent in vigorous
physical activity.
5. The most frequent reasons given by exercise adherers for continuing an exercise program were to maintain health, enhance social status, enhance mental health, relieve stress, and improve appearance.
6. The most frequent reasons given by exercise non-adherers
for discontinuing an exercise program were lack of time,
demands of employment, schedule conflicts, demands of
children, and illness or injury.
Conclusions
Within the limitations of the study the following conclusions are
warranted:
1. Significant psychological differences exist between female
exercise adherers and non-adherers.
2. Females who adhere to a program of exercise perceive
their physical abilities at a higher level than do females who
discontinue a program of physical exercise.
3. Self-consciousness is not an important factor in the
psychological differences between female exercise
adherers and non-adherers.
4. Females who have a higher perception of their physical
abilities exercise more than do females who have a lower
perception of their physical abilities.
Recommendations
The following recommendations are made for further research
in the area of exercise adherence:
1. The present study should be replicated using both male and
female participants.
2. The relationship of self-consciousness and physical self-
efficacy to exercise adherence should be examined in an
experimental type of study. That would permit more control over the variable of exercise adherence, since it would not rely
on the subjects’ truthfulness regarding their exercise habits.
3. A study should be conducted to determine the cause-and-
effect relationship between perceived physical ability and
exercise adherence.
4. Additional studies identifying other psychological variables
that could be related to exercise adherence are needed.
5. The present investigation should be replicated in other
communities to gain a large cross-section of the adult
population from which reliable data can be obtained.
6. The psychological variables of self-consciousness and
physical self-efficacy related to exercise adherence should
be investigated in a non-structured exercise program
setting.
7. Additional studies identifying reasons exercise adherers
give for continuing an exercise program are needed to
corroborate the findings of the present study.
The findings of the study may be applied in either a professional practice situation or a research setting in the following ways:
1. The significance of perceived physical ability to discriminate
between female exercise adherers and non-adherers
should be considered in developing psychological profiles
of exercise.
2. Adherence to an exercise program could be enhanced by
the use of intervention strategies designed to increase the
participants’ perception of their physical ability. Programs
involving the use of progression in skills and intensity of
training could increase one’s self-perception of physical
ability and thereby increase the likelihood of an individual
continuing participation in those exercise programs.
3. If exercise adherence is one of the major goals in physical
activity programs, the enhancement of one’s self-perceived
physical abilities should be a major concern in elementary
and secondary physical education programs. Exposure to a
variety of skills will increase the likelihood of finding an
activity in which the skills can be mastered, thus increasing
the chance for regular lifelong participation in that activity.
Visker (1986).
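The chi-square test of association named in the summary above can be sketched in a few lines; the contingency table below is invented for illustration (the original study used SPSS for its analyses):

```python
def chi_square(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: marital status (rows) by adherence status (columns)
table = [[30, 20],   # single:  adherers, non-adherers
         [45, 73]]   # married: adherers, non-adherers
print(round(chi_square(table), 2))
```

The statistic is then compared against the chi-square distribution with (rows − 1) × (columns − 1) degrees of freedom at the chosen alpha level to decide whether the two categorical variables are associated.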
References
The body of the text is always followed by a list of references
organized according to the prescribed style manual required by
the students’ college or university or academic unit. When the list of sources is relatively short, different types of works may be listed together. However, if the list is quite long, the sources may be grouped according to type, such as books, documents, manuscripts, newspapers, pamphlets, and periodicals.
that were referenced in the text. Occasionally, it may be necessary
to include a source that was not cited, but the usual caution is not
to pad the list of references.
Supplementary Items
Appendixes serve as the repository for supplementary items that
are unnecessary for inclusion in the body of the text, but can be
used by the reader to clarify various aspects of the thesis or
research report. They provide additional specific information if the
reader desires it. Typically appendixes include the following types
of items:
1. Copies of the instruments (e.g., questionnaires, interview
guides)
2. Instructions to participants on how to engage in physical
performance tests (e.g., fitness, skill tests)
3. Letters and similar documents
4. Human participant consent forms
5. Raw data from both a pilot study and the actual study
6. Diagrams of testing sites/settings
7. Tables from related research
8. Supplementary reference lists
9. Credentials of the members of a jury of experts or committee of
authorities
10. Interview and other data collection schedules
11. Legal codes
All such materials should be clearly labeled and lettered for
quick and easy reference. Each appendix should bear a
descriptive title of what is included.
Thesis Format Versus Published Article
Format
On the foregoing pages, the format for a thesis or dissertation was
described in some detail. The major difference between a thesis or
dissertation and a published research report is the length of the
document. Researchers who publish articles are limited by the
established publishing criteria of particular journals in the amount
of detail they can submit. A six- to eight-page article, as prescribed
by a specific publication, cannot possibly include all the
information contained in a 150-page thesis or dissertation.
Although the precise article format may vary depending on the
nature of the research, a published research report typically
includes the following elements:
1. Preliminary information
a. Title
b. Author and organizational affiliation
c. Acknowledgments (if any)
d. Abstract
2. Introduction
a. Background information and literature review
b. Rationale for the study
c. Purpose statement
d. Hypotheses or research questions
3. Methods
a. Participants
b. Instrumentation
c. Procedures
d. Statistical analysis
4. Results
a. Presentation of data
5. Discussion
a. Key findings
b. Strengths and limitations of the study
c. Recommendations
6. Conclusions
7. References
8. Appendix (if appropriate)
The style in which the published article is written will also be
determined by whatever style manual is followed by the journal.
Quite frequently, the style will be different from that found in a
thesis or dissertation. Capitalization may differ, as will punctuation,
the spelling of numbers, and abbreviations.
Theses or dissertations written by graduate students are
considered “final” manuscripts. They usually have a long life span
and will be read and referred to over a long time. A journal
manuscript tends to be a “copy” manuscript that, in turn, becomes
a typeset article. Copy manuscripts have a relatively short life.
They are read by reviewers, editors, and typeset experts. The
original content can sometimes be changed dramatically as a
result of the publication process.
Students who complete a thesis or dissertation are encouraged
to publish one or more articles derived from their study.
Preparation of an article entails transposing selected thesis
content to journal material according to the publishing restrictions
of the journal in which the article will appear. This is not a
particularly difficult task, although some people find it to be a
laborious chore. Some writers confuse transposition with creating a whole new document, and students who wrote a thesis in one format are sometimes confused by the guidelines imposed by the journal format.
In recent years, some institutions, including those of the
authors of this textbook, have adopted regulations that enable
students to write their theses and dissertations in the form required
by those journals in which the students hope to publish articles
based upon their research. This results in a document of 20 to 30 pages containing only the essential information from the literature, procedures, and results sections.
detailed information from the other thesis sections would become
part of appendices. The underlying idea is to make it quicker and
easier for the student to create a journal article for publication from
a document written to fulfill a degree requirement. Relatively few
theses or dissertations written using the traditional format end up
being published. This alternative format for a thesis or dissertation
may enhance the likelihood that research conducted to satisfy a
degree requirement will contribute new knowledge to the field.
Readers are encouraged to read the excellent article by Thomas,
Nelson, and Magill (1986), who advocate for use of the journal
format for theses and dissertations. Whether or not the movement
away from the traditional thesis format will become large scale is
yet to be seen.
Preparing a Manuscript for Publication
The decision to prepare a manuscript for publication represents a
commitment to building the body of knowledge within a given
subject area. A published manuscript serves as a primary
communication document for a profession and, as such, has the
potential to advance theory and inform professional practice.
Publishing a manuscript based upon one’s own research can be
one of the most exciting and professionally rewarding experiences
for both young scholars and seasoned veterans. The process is
often challenging and requires perseverance. In the next sections,
we have outlined several suggestions that should be considered
as you begin the task of preparing a manuscript for publication.
Before Writing
Prior to starting to prepare a manuscript for publication, the
researcher should do several things. First, the researcher needs to
select the journal to which the manuscript will be submitted. Next,
the researcher needs to examine articles published in that journal
to get a general idea of the format used and read the guidelines
for submitting manuscripts to the journal. Guidelines are published in the journal. Select the journal based primarily on its orientation and
type of reader. Some journals are geared primarily toward
researchers; others are more for practitioners. Some journals are
highly specialized, whereas others are for a general audience.
There is no use in submitting a manuscript to a journal that does
not publish the type of research described in the manuscript. Also,
the researcher should try to determine the manuscript acceptance
rate of the journal. Acceptance rates vary from 10% for very
prestigious and specialized research journals to as high as 70% or
more for practical, general-audience, state-association journals.
By looking at articles in the selected journal, the researcher
may discover a useful model to follow. Certainly, the researcher
will be able to determine whether an abstract is required as part of
the manuscript, what organization and major headings are
required in the manuscript, and the typical length of a manuscript,
based on the length of articles in the journal.
After selecting a journal, the next most important thing is to
read the author guidelines carefully. Most author guidelines include information such as aims and scope; specifics of manuscript submission (e.g., cover letter, writing style, and copyright); preparation of the manuscript (e.g., structure, length of the title, manuscript pages, and references); and illustrations, tables, figures, reprints, and issues. For detailed information, see the
links for the author guidelines for Medicine & Science in Sports &
Exercise and Research Quarterly for Exercise and Sport, two
prestigious research journals in kinesiology:
http://edmgr.ovid.com/msse/accounts/ifauth.htm
https://www.shapeamerica.org/publications/journals/rqes/authorguidelin
During Writing
Any manuscript starts with the title of the research study. The title
needs to be short, yet represent the research well and stimulate a
person to read the entire article. If an abstract is required, that
element appears next in the manuscript; it should be one to three
paragraphs long. The abstract will contain the statement of
purpose for the study, description of the research participants,
major procedures, major results, and conclusions.
The manuscript for most published articles begins with an
introduction and brief review of the most important related
literature. Somewhere, wherever appropriate, there should be a
statement of the purpose of the study. This part of the manuscript
is typically three to six paragraphs long. Next in a manuscript is a
procedures or methods section. Typically included in this section
are descriptions of research participants, how the research was
conducted, data collection techniques, and data analysis
procedures. The content and length of this section is dependent
on the type of research conducted and the complexity of
describing how the research was conducted. Tables and figures in
this section are commonly used to clearly and economically
present information. Not every small point concerning the conduct of the research can be presented if the manuscript is to be kept to a reasonable length.
A results section follows the procedures section. This section
usually makes up a relatively large portion of the manuscript
because it includes all of the major data analysis and findings of
the research study. Much of this information is presented in text
form; however, the use of tables and figures is also common. The
recommended organization when using tables and figures is to
introduce a table or figure, present it, and then discuss it. That is
not always possible, but the recommendation is a good general
rule. The discussion following the table or figure may be just a
short sentence or two with an extensive discussion in a later
section, or it may be quite extensive within that section. The
decision to present a major discussion in this section or in a later
section is left to the researcher. The majority of researchers
usually place the extensive discussion in the next section.
The discussion and conclusions section is where the results
previously reported are discussed at length and conclusions based
on the results are stated. This part of the manuscript is usually of
considerable length. The discussion may include comments on
each finding followed by a discussion of the findings as a whole.
However, if each finding was discussed at length when introduced
in the results section, then the discussion is reserved for the
findings as a whole. Often included in the discussion of findings is
a comparison of how the findings agree with those in the literature
cited at the beginning of the manuscript. Usually, in the discussion
are statements concerning the importance and application of the
findings. No matter how the discussion is structured, the
conclusions follow it. Anytime the research involves a population
from which one or more samples were drawn, conclusions consist
of statements concerning what the researcher thinks is true at the
population level based on the findings from the samples used in
the study.
The last section in a manuscript is a list of references. Included
in the list are all sources cited in the manuscript. The length of this
section is dependent on the number of sources cited. Normally, it
is not a lengthy section and will contain from 10 to 20 listings.
After Writing
After writing the first draft of the manuscript, one must spend
considerable time proofreading it. Correctness of manuscript form,
correctness of content, economy and clarity of presentation, and
correctness of spelling and grammar are important. The more prestigious the journal to which the manuscript will be submitted, the more important the proofreading.
The spell-check and grammar-check options in most word-processing programs are often quite helpful in revising
the manuscript. However, they do not catch all problems, and the
final responsibility for correctness rests with the writer. Spell-check will not catch omitted words, wrong words, confusing sentences, or homonyms. To aid authors’ proofreading, some journals even provide a checklist; use such tools when they are available (see EXAMPLE 18.11 for the checklist provided by Research Quarterly for Exercise and Sport).
SHAPE America. (2017). Research Quarterly for Exercise and Sport Style Checklist.
Retrieved from https://s3-us-west-2.amazonaws.com/clarivate-scholarone-prod-us-
west-2-s1m-
public/wwwRoot/prod1/societyimages/rqes/RQES%20style%20checklist_%202018.pdf
Glossary
A
45 CFR 46 Specific federal law, referred to as the Common Rule,
that establishes regulations governing research involving human
participants in the United States.
Abstract Preliminary item in a research report that provides a brief
summary of the research study.
Acknowledgments Preliminary item in a research report in which
the author lists the contributions of others who have assisted with
the research.
Action planning A unique stage within the action research
process whereby the researcher identifies local actions to be
taken (i.e., taking informed action) based on the findings of the
study.
Action research A distinct type of applied, practical research
often seen in education in which the focus is on local needs,
problems, or issues.
Alpha level Probability level selected that warrants rejection of the
null hypothesis.
Analysis of covariance (ANCOVA) A statistical technique to gain
control by adjusting for initial differences among groups.
Analysis of variance (ANOVA) A statistical test with two or more
groups to determine group differences.
Applied research Research whose primary aim is toward the
solution of practical problems, yet seeks to make inferences
beyond the study setting.
Article format Alternative form of a research report prepared by
graduate students whose format is based upon the style specified
by the journal in which the student hopes to publish the
manuscript.
Assent Statement of consent or agreement made by a child or
adolescent regarding his or her involvement in a research project.
Assumptions Facts or conditions presumed to be true, yet not
actually verified, that become underlying basics in the planning
and implementation of the research study.
Attributable risk The percentage of cases in the total group that
occur in the group with a risk factor.
Authorship The primary method through which researchers
receive recognition for their research efforts.
B
Background information Important literature on the topic being
investigated that is cited in the introductory section of a research
report, thus establishing the foundation for the study.
Basic research Research whose primary aim is the discovery of
new knowledge through theory development or evaluation.
Beneficence Ethical principle obligating researchers to protect
persons from harm and to maximize possible benefits and
minimize possible harms.
Block design An extension of matched pairs research design for
three or more groups.
Boolean search strategy A strategy for searching electronic
literature databases by using AND, OR, and NOT between search
words.
C
Case-control studies In case-control studies, participants are
selected based on the presence or absence of a disease of
interest.
Case study Study that provides an intensive, holistic, and in-depth
understanding of a single unit or bounded system. Also involves
studying an event, activity, program, process, or one or more
individuals.
Causal-comparative research Research that seeks to investigate
cause-and-effect relationships that explain differences that already
exist in groups or individuals; also called ex post facto, or after-
the-fact research.
Cell A row-column intersection in a two-dimensional data layout.
Central research questions Broad questions that begin the study
of a central phenomenon or concept in a study.
Central tendency A descriptive value that indicates those points
at which scores tend to be concentrated.
Central tendency error A tendency to rate most participants in
the middle of a rating scale.
Cluster sampling A type of probability sampling in which the
sampling unit is a naturally occurring group or cluster (such as
classrooms) of members of the population.
Coefficient of determination A value that indicates the amount of
variability in one measure that is explained by the other measure.
Common Rule The general name for federal law 45 CFR 46 that
establishes regulations governing research involving human
participants in the United States.
Completion item Question for which the answer is left blank and
the participant fills in the blank.
Conceptual literature The conjectural, often abstract, literature
that provides the theoretical underpinnings for a research study.
Conclusions A closing portion of a research report in which the
author provides an answer to the research problem.
Confidentiality Preventing information or data collected during a
research study from being linked to a person's identity.
Construct Something that is known to exist although it may not
be precisely defined and/or measured.
Continuous scores Scores that have a potentially infinite number
of values.
Control group In a research study, the group that does not
receive the treatment given to the experimental group(s).
Convergent evidence The degree to which two measures of
constructs that theoretically should be related are indeed related.
Convergent parallel mixed-methods The investigator merges or
converges the qualitative and quantitative data together in an
effort to gain a more complete understanding of the phenomenon.
In this type of mixed-method, the data are usually collected and
analyzed together, and findings are incorporated into the
interpretation of the results.
Corpus of data A large collection of texts or recorded speech
(e.g., a large database).
Correlation A mathematical technique for determining the
relationship between two sets of scores.
Correlation coefficient A value indicating the degree of
relationship between two sets of scores.
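As an illustration of these definitions (the function and data below are hypothetical sketches, not from the text), the Pearson correlation coefficient can be computed directly from two sets of scores, and squaring it yields the coefficient of determination defined earlier in this glossary:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two sets of scores
    (illustrative sketch; names and data are hypothetical)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

scores_a = [2, 4, 6, 8, 10]   # e.g., scores on one fitness test
scores_b = [1, 4, 5, 9, 8]    # e.g., scores on a second test

r = pearson_r(scores_a, scores_b)
print(round(r, 3))       # 0.936 (strong positive relationship)
print(round(r ** 2, 2))  # 0.88 = coefficient of determination
```

Squaring r gives the proportion of variability in one measure explained by the other, which is exactly the coefficient of determination.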
Correlational research Type of research that seeks to investigate
the extent of the relationship between two or more variables.
Counterbalanced design A method of gaining control by all
participants receiving all treatments but in different orders.
Covariate A score used to adjust for initial differences among
groups in ANCOVA.
Criteria for critiquing an article Basic standards or guidelines for
evaluating a research report or proposal.
Criterion-referenced standards Minimum proficiency or pass-fail
standards.
Critical reading The process of reading a large amount of
literature while thinking about, reflecting upon, and critically
analyzing what is being said.
Critical region The region at which all values of the statistical test
are at or beyond the critical value.
Critical value The tabled value of the statistical test needed to
reject the null hypothesis.
Cronbach’s alpha A correlation coefficient calculated as an
estimate of objectivity and reliability; also called coefficient alpha.
Cross-sectional approach Method for testing many groups and
assuming each group is representative of all other groups when
they are at the same point in time.
Cross-sectional survey Known also as a prevalence study, this
survey measures both risk factors and the presence or absence of
disease at the same point in time.
Cross-validation Method for checking the accuracy of a
prediction equation on a second group of individuals similar to the
first group.
Curvilinear relationship A relationship between two variables
best represented graphically by a curved line.
D
Data analysis The methods of manipulating and analyzing the
collected data to reveal relevant information.
Data-collecting instrument Any tool or procedure used by the
researcher to collect relevant data pertaining to the research
problem.
Data collection plan Detailed procedure for acquiring the
information needed to attack the research problem.
Data reduction A process of investigating and transforming raw
data, often using a coding technique, in order to make sense of
and identify themes within the data; part of data analysis in
qualitative research.
Data saturation When no new themes are emerging in any of the
data sources being analyzed in qualitative research, suggesting
that adequate data have been collected and analyzed to
understand the phenomenon.
Deductive reasoning Thinking that proceeds from a general
assumption to a specific logical conclusion.
Definition of terms Explanation of important terms as they are
used in a research study.
Degrees of freedom A value associated with a statistical test that
is used when finding a table value of the statistical test.
Delimitations Characteristics specified by the investigator that
define the scope of the research study, in effect, “fencing it in.”
Delphi technique A method of data collection using questioning
techniques to obtain a consensus on a specific issue from a
defined group of individuals.
Dependent variable The variable that is expected to change as a
result of the independent variable; it is the variable that is
observed or measured in a study.
Descriptive questions Type of research question that seeks to
describe phenomena or characteristics of a particular group of
subjects.
Descriptive research Research that attempts to systematically
describe specific characteristics or conditions related to a
participant group.
Descriptive statistics Statistics used to describe characteristics
of a group.
Designs The ways in which a research study may be conducted.
Difference questions Type of research question that seeks to
determine if there are differences between or within groups or
conditions.
Directional hypothesis Type of research hypothesis that is
posited when the researcher has reason to believe that a
particular relationship or difference exists.
Discrete scores Scores that are limited to a specific number of
values.
Discriminant (divergent) evidence The degree to which two
measures of constructs that theoretically are not supposed to be
related are actually unrelated.
Discussion section Section of a research report containing a
nontechnical explanation or interpretation of the results,
culminating in the researcher’s conclusions.
Distilling The process of narrowing an initial research question to
a specific problem that is amenable to investigation.
Double-blind study A study in which both the participants and
those conducting the study are unaware of the purpose of the
study and group membership of participants.
E
Effect size An estimate of the practical difference between two
means; used in meta-analysis as a standard unit of measurement.
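One widely used effect-size estimate is the standardized mean difference (Cohen's d); the sketch below is illustrative and assumes equal-variance pooling, since the text does not prescribe a particular formula:

```python
def cohens_d(group1, group2):
    """Standardized mean difference between two groups, using the
    pooled standard deviation (illustrative; data are hypothetical)."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

treatment = [10, 12, 14]   # hypothetical post-test scores
control = [7, 9, 11]
print(cohens_d(treatment, control))  # 1.5
```

Because the difference is expressed in standard-deviation units, effect sizes from separate studies can be combined in a meta-analysis.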
Element The basic unit from which data or information is collected
—normally, the research participant.
Essence Core meaning of an experience.
Ethics Moral principles that define one’s values in terms of
acceptable behaviors.
Expected frequency The frequency hypothesized or expected for
an answer in a chi-square test.
Experimental research Type of research in which an independent
variable is manipulated to observe the effect on a dependent
variable for the purpose of determining a cause-and-effect
relationship.
Experiment-wise error rate Probability of making a type I error
somewhere in all of the two-group comparisons conducted.
Explanatory sequential mixed-methods This mixed-method
technique has the researcher first conduct the quantitative data
collection and analysis, and then, this is used to expand on the
findings with the qualitative data collection and analysis phase to
further explain the phenomenon.
Exploratory sequential mixed-methods This mixed-method
technique has the researcher first conduct the qualitative data
collection and analysis in order to discover the views of the
participants or information about the context. The qualitative
results are then used to build a second phase of quantitative data
for the project (e.g., survey).
External criticism Method used in historical research to
determine the authenticity, genuineness, and validity of the source
of data (e.g., methods to authenticate documents beyond content).
External validity Validity of generalizing the findings in a research
study to other groups and situations.
Extraneous variables Error-producing variables that could
negatively affect the results of a research study if not adequately
controlled.
F
Factorial design A multiple-way ANOVA design with rows
representing a classification or treatment and columns
representing a treatment; involves two or more independent
variables that may influence the dependent variable.
Fishbowl technique Simple random sampling technique by which
the names of all members of a population are placed in a
container (such as a fishbowl) and then, randomly drawn from the
container one at a time.
Focus group Group interview with 6 to 8 participants facilitated by
a moderator or interviewer to discuss topics prepared by the
interviewer.
Focus group interview An interviewing technique where a group
of participants is interviewed together.
Frequency polygon A line graph of scores and their frequencies.
G
Good test A test for which acceptable objectivity, reliability, and
validity of the data have been determined.
Grounded theory Derivation of a theory from the views of the
participants in a study; develops paradigms that are based on real-
world experiences.
Group differences method Two or three groups are compared to
verify the construct being measured.
H
Halo effect The tendency to let initial impressions influence future
ratings or scores of a participant.
Helsinki Declaration Ethical guidelines defined by the World
Medical Association for medical research involving human
participants.
Histogram A bar graph of scores and their frequencies.
Hypothesis A tentative explanation or prediction of the eventual
outcome of a research problem.
I
Impact factor Computed index of the quality of a scholarly journal
based on the frequency with which a journal’s articles are cited by
other researchers.
Imperfect induction Conclusion derived through inductive
reasoning based on the observations of a small number of
members of a population.
Incidence rates The number of incident cases over a defined time
period divided by the population at risk over that time period.
Incident cases The new occurrences of these events, from non-
disease to disease or from one state of a health outcome to
another, in the study population during the time period of interest.
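The two definitions above reduce to simple arithmetic; the numbers below are hypothetical:

```python
# Hypothetical cohort followed for one year.
population_at_risk = 500   # disease-free members at the start of the period
incident_cases = 25        # new occurrences during the period

incidence_rate = incident_cases / population_at_risk
print(incidence_rate)      # 0.05, i.e., 5 new cases per 100 persons per year
```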
Independent variable The experimental or treatment variable in a
study; it is the variable that is purposely manipulated or selected
by the researcher in order to determine its effect on some
observed phenomenon; it is antecedent to the dependent variable.
Induction Thinking method that proceeds from the specific to the
general; basic principle of scientific inquiry in which information
gained through observations of a small number of cases leads to
generalized conclusions.
Inferential statistics Statistics used in the process of testing an
assumption from a sample to a population.
Informed consent Explicit statement informing potential research
participants of the purposes, procedures, risks, and benefits of a
research project; provides an acknowledgment that participation is
voluntary.
Institutional review boards (IRBs) Local committees established
by an institution whose purpose is to ensure the protection of
human and animal participants involved in research activities.
Instrumentation Portion of the methods section of a research
report describing the measuring instrument or equipment, such as
a questionnaire, test, inventory, or interview used to collect data
from the research participants.
Internal consistency reliability Consistency of test scores within
a single day.
Internal criticism Method used in historical research to assess
the meaning and accuracy of the source of data (e.g., methods
used to authenticate content of documents).
Internal validity Validity of the findings within the research study.
Interval scores Scores that have a common unit of measurement
between each score, but do not have a true zero point.
Interview Face-to-face, virtual interaction, or discussion with an
individual or group.
Intraclass correlation coefficient A correlation coefficient
calculated as an estimate of objectivity and reliability.
Introduction section The introductory portion of a research report
that provides background information and develops the rationale
for conducting the study.
J
Justice Ethical principle requiring that the benefits and burdens of
research be fairly distributed, thus impacting the selection of
research participants.
K
Kappa coefficient Estimate of the reliability of dichotomous data.
Kuder-Richardson A correlation coefficient calculated as an
estimate of reliability.
L
Leptokurtic curve A curve that is more sharply peaked than a
normal curve, indicating an extremely homogeneous group.
Life history Type of study that covers the lives of individuals or
that results from one or more individuals providing stories about
their lives. Also known as narrative research or biographical
research.
Likert-type scale A type of scaling technique in which
respondents are presented with a series of statements and asked
to indicate the degree to which they agree or disagree.
Limitations Aspects of a research study which the investigator
cannot control that represent weaknesses to the study and may
negatively affect the results.
Line of best fit A straight line that represents the trend or
relationship in the data; also known as the regression line.
Linear relationship A relationship between two variables best
represented graphically by a straight line.
Longitudinal approach Method by which a group is measured
and observed for years.
M
Matched pairs design A form of selective manipulation by which
participants are matched to gain control in order to obtain similar
groups.
Mean A measure of central tendency, used with interval or ratio
data, that is the sum of the scores divided by the number of
scores.
Mean square (MS) value Values used in ANOVA to calculate an F
statistic.
Measurement techniques Methods for collecting information in
which participants are directly tested or measured on the
characteristics of interest; may include physical measures,
cognitive measures, and/or affective measures.
Median A measure of central tendency used with ordinal data that
indicates the middle score.
Mediating variables Variables that describe the process by
which the intervention achieves its effects.
Meta-analysis A set of statistical procedures to combine
numerical information from multiple separate studies to reach an
overall summary of empirical knowledge on a given topic.
Metasearch engine A website that uses multiple search engines
simultaneously to perform an Internet search.
Methods section Section of a research report that describes the
research participants, instrumentation, procedures for data
collection, and data analysis.
Mixed-methods research Studies incorporating components of
both qualitative and quantitative research methodology to provide
a more comprehensive perspective.
Mode A measure of central tendency used with nominal data that
indicates the score most frequently received.
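The three measures of central tendency defined above (mean, median, and mode) can be illustrated with Python's standard library; the scores below are hypothetical:

```python
import statistics

scores = [3, 5, 5, 6, 7, 8, 8, 8, 9]   # hypothetical test scores

print(round(statistics.mean(scores), 3))  # 6.556 (sum / number of scores)
print(statistics.median(scores))          # 7 (middle score)
print(statistics.mode(scores))            # 8 (most frequent score)
```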
Moderating variables Variables whose different values are
associated with different intervention effects. They help to explain
the relationship between independent variables and dependent
(outcome) variables.
Modified kappa coefficient An estimate of the reliability of
dichotomous data.
Multiple choice Question for which responses are provided and
the participant selects the most appropriate response(s).
Multiple prediction Prediction of an individual’s score on a
measure based on the individual’s scores on several other
measures.
Multistage sampling A type of cluster sampling technique which
involves the successive selection of clusters within clusters and/or
elements within clusters.
Multivariate statistic Statistic for which each participant
contributes multiple scores to the data analysis.
Multivariate tests Tests for which each participant contributes
multiple scores to the data analysis.
N
Nazi experimentation Medical experiments conducted by
German scientists during World War II that subjected unwilling
participants to extreme cruelty and inhuman treatment.
Nominal scores Scores that cannot be hierarchically ordered.
Nondirectional hypothesis Type of research hypothesis that is
posited when the researcher has no reason to believe that a
difference or relationship exists in any direction.
Nonparametric statistic Statistic that has no requirement of
interval data and normal distribution.
Nonparametric tests Statistical tests that do not assume interval
data and normal distribution of the scores.
Nonprobability sampling A sampling technique such as
convenience sampling in which random processes are not used
and the probability of selecting a given sampling unit from the
population is not known.
Nonrefereed journal Type of scholarly journal in which the
manuscripts are not evaluated by external reviewers, although
they may be appraised by the journal editor.
Nonsignificant Value of the statistical test does not warrant
rejection of the null hypothesis; differences among groups are
sampling error.
Normal curve A symmetrical curve centered around a point with a
defined base-to-height ratio, indicating a balanced (or normal)
distribution; also called a bell-shaped curve.
Norm-referenced standards Standards to rank-order individuals
from best to worst.
Null (statistical) hypothesis Hypothesis used for statistical
testing purposes that proposes that there is no difference between
comparison groups or no relationship between variables;
hypothesis stating that the independent variable has “no effect” on
the dependent variable.
Nuremberg Code Basic principles of ethical conduct that govern
research involving human participants that were developed as a
result of trials of German scientists.
O
Objectivity The degree to which multiple scorers agree on the
values of collected measures or scores; also called rater reliability.
Objectivity coefficient A correlation coefficient indicating the
relationship between the scores of the scorers.
Observation techniques Methods for collecting information in
which the participants are observed by the researcher, either
directly or indirectly, and relevant data recorded.
Observed frequency The frequency in the sample for an answer
in a chi-square test.
Odds ratio An estimate of relative risk.
One-way ANOVA A statistical analysis used with two or more
independent groups.
One-way (goodness of fit) chi-square test A nonparametric
statistical test using frequencies arranged in columns.
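A one-way (goodness of fit) chi-square statistic compares observed frequencies against expected frequencies, as the entries above define them; the data below are hypothetical:

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E across categories (illustrative sketch)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 60 participants choosing among three activities,
# with a hypothesized (expected) even split of 20 per activity.
observed = [26, 18, 16]
expected = [20, 20, 20]
print(round(chi_square_statistic(observed, expected), 2))  # 2.8
```

The statistic is then compared to a critical value with degrees of freedom equal to the number of categories minus one.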
One-group t-test A statistical test used with one sample to look
for differences.
Open access journal Type of scholarly journal which allows
anyone to read the articles in the journal, without charge.
Open-ended Question for which no potential response is
provided; participants respond however they choose.
Ordinal scores Scores that do not have a common unit of
measurement between each score, but are ordered from high to
low.
Outlier An extremely high or low score that does not seem typical
for the group tested.
P
Paradigm A model or example that underscores the meaning
people have constructed about a phenomenon and highlights the
importance of the context while promoting equality (e.g., Feminism
promoting equal opportunities across gender identification, sexual
identification, social class, disability status, weight status, etc.).
Parameters Values or quantities calculated using information
obtained from a population.
Parametric statistic Statistic that requires interval data and a
normal distribution.
Parametric tests Statistical tests that assume interval data and
normal distribution of scores.
Participant observation An observational technique in which the
researcher or observer is in the setting and doing the same
activities as the people being observed.
Participatory action research Research with the purpose of
empowering people and groups to make a change in their lives
and bringing about social change.
Percent agreement An estimate of the reliability of dichotomous
data.
Percentiles (percentile ranks) A descriptive value that indicates
the percentage of participants below a designated score.
Per-comparison error rate Probability of making a type I error in
a single two-group comparison.
Perfect induction Conclusion derived through inductive reasoning
based on the observations of all members of a population.
Phi A correlation coefficient calculated as an estimate of validity
for dichotomous data.
Physical manipulation A method of gaining control by the
researcher physically controlling the research surroundings.
Placebo A treatment with an inert substance that can have no
effect on any dependent variable of participants in a control group.
Plagiarism The presentation of ideas or the work of others as
one’s own with the absence of proper credit.
Platykurtic curve A curve that is less sharply peaked than a
normal curve, indicating an extremely heterogeneous group.
Point of interface Where the qualitative and quantitative data are
combined in mixed-methods research designs either at data
collection, during analysis, or in the interpretation of the findings.
Polynomial regression A regression model that does not assume
a linear relationship; a curvilinear correlation coefficient is
computed.
Population An entire group or aggregate of people or elements of
interest from which a sample will be selected.
Practical action research Research undertaken with the purpose
of addressing a specific problem in a clinical setting or classroom
or other community setting and bringing about change.
Prediction Estimation of a person’s score on one measure based
on the individual’s score on one or more other measures.
Preexperimental designs Designs that have poor control, often
due to a lack of random sampling.
Preliminary items Front matter of a thesis or dissertation that
includes items such as a title page, approval page,
acknowledgments page, table of contents, list of tables and
figures, and an abstract; introductory material in a published
research report, which may include title, author, institution,
acknowledgments, and abstract.
Prevalence rates The number of participants who have a
particular disease or engage in a behavior at a certain time.
Prevalent cases The number of participants in the population who
have a particular disease or condition at some specific point in
time; prevalence of a disease is a function of both incidence and
duration.
Primary source material Original sources in historical research
(e.g., original documents).
Privacy The capacity of individuals to control when and under
what conditions others will have access to their medical/physical
data, behaviors, beliefs, and values.
Probability sampling A sampling technique based upon random
processes in which the probability of selecting each participant or
element is known.
Probes Interpretive-type questions utilized to seek more
information or detail from the participant during an interview; also
called interview prompts.
Procedures Portion of the methods section of a research report
that includes a description of data collection protocols,
experimental treatments, and how they were administered; a step-
by-step description of how the study was conducted.
Prospective cohort studies Known also as incidence or
longitudinal follow-up studies, these studies involve the selection
of a group of individuals at random from some defined population,
or selection of groups exposed or not exposed to a risk factor of
interest. After the cohort is selected, baseline information on
potential risk factors is collected and individuals are followed over
time to track the incidence of disease.
Purpose statement A specific declaration expressing the
researcher’s intent or goal for conducting the study; normally
indicates the variables or phenomenon of interest and information
regarding the participants and setting of the study.
Purposive sampling An intentional way to select research sites
and participants to help the researcher better understand the
problem or situation.
Q
Qualitative research Research based upon nonnumerical data
obtained in natural settings through extensive observations and
interviews whose primary aim is the interpretation of phenomena
and the discovery of meaning.
Quantitative research Research involving the collection of
numerical data in order to describe phenomena, investigate
relationships between variables, and explore cause-and-effect
relationships of phenomena of interest.
Quasi-experimental design An acceptable research design but
with some loss of control due to lack of random sampling.
Questioning techniques Methods for collecting information in
which the participants are asked to respond to questions posed by
the researcher; may include self-report questionnaires, personal or
group interviews, or telephone/electronic (e.g., Zoom) interviews.
R
Random assignment Technique of using random processes to
assign research participants to treatment conditions or comparison
groups.
Random blocks ANOVA A statistical analysis where participants
similar in terms of a variable are placed together in a block and
then randomly assigned to treatment groups.
Random selection Technique of using random processes for the
selection of a sample that is thought to be representative of the
population of interest.
Randomized controlled trial A study design in which subjects are
selected and randomly assigned to receive either an experimental
manipulation or a control condition. Measurements are made
before and after the intervention period in both groups to assess
the degree of change in the outcomes of interest between the
intervention and control conditions.
Range A measure of variability that is the difference between the
highest and lowest score.
Ratio scores Scores that have a common unit of measurement
between each score and a true zero point.
Real difference Difference among group means because the
groups are different and not due to sampling error.
Recommendations Portion of a research report, often included in
the discussion section, in which the researcher presents
suggestions for clinical or professional practice and/or further
research based upon the findings of the current research study.
Refereed (peer-reviewed) journal Type of scholarly journal in
which the manuscripts undergo a careful evaluation by the editor
and expert reviewers in the field.
References Lists of all books, journal articles, or other sources
cited by the author in a research report.
Regression The statistical model used to predict performance on
one variable from another.
Related research literature Scientific reports of previous
research related to the problem area being studied.
Relationship questions Type of research question that seeks to
investigate the relationship or association between two or more
variables.
Relative risk The likelihood that a group with a risk factor will
have a health characteristic in comparison to a group without a
risk factor; see also odds ratio.
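The relationship between relative risk and the odds ratio defined earlier can be sketched from a hypothetical 2 × 2 table (all counts and function names here are illustrative, not from the text):

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 table:
    a = exposed, diseased;    b = exposed, not diseased
    c = unexposed, diseased;  d = unexposed, not diseased"""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds ratio from the same table; approximates relative risk
    when the disease is rare."""
    return (a * d) / (b * c)

# Hypothetical cohort: 100 with the risk factor, 100 without.
print(round(relative_risk(30, 70, 10, 90), 2))  # 3.0
print(round(odds_ratio(30, 70, 10, 90), 2))     # 3.86
```

Here the group with the risk factor is three times as likely to have the health characteristic, and the odds ratio gives a somewhat larger estimate of that same risk.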
Reliability Degree to which a measure is consistent and
unchanged over a short period of time.
Reliability coefficient A correlation coefficient indicating the
relationship between the repeated measures or scores of a test.
Repeated measures A two dependent groups t-test design where
participants are tested before and after a treatment.
Repeated measures ANOVA A statistical analysis where each
participant is measured on two or more occasions.
Representative sample A subgroup of a population that is similar
to the population on the characteristics of interest.
Research A systematic way of finding solutions to a problem or an
answer to a question.
Research approaches General procedures that a researcher may
take in investigating the problem of interest; may include
descriptive, experimental, qualitative, historical, action,
epidemiological, and mixed-methods procedures.
Research hypothesis A tentative explanation or prediction of the
eventual outcome of a research problem; normally, this is the
outcome expected by the investigator.
Research participants People in a study; a portion of the
methods section of a research report used to present information
about the people being studied.
Research process Procedure founded upon the scientific method
for investigating research problems that involve systematic
progression through a series of necessary steps.
Research proposal Document presenting a problem including a
plan of attack and protocol for investigating the problem; often
required by a potential funding agency or thesis/dissertation
committee.
Research question The central focus of the research effort that
serves as the basis for the problem and provides direction for the
entire process.
Research report A scientific paper completed at the culmination
of a research project in which the results of the investigation are
presented to others.
Respect for persons Ethical principle proclaiming respect for
individuals involved as participants in research studies.
Results section Section of a research report where the
researcher reports the outcomes of the data collection efforts and
the various data-analysis procedures.
Rho A value indicating the degree of relationship between two
sets of ranks; also known as Spearman’s rho and rank-order
correlation coefficient.
S
Sample A subgroup of a population of interest from which data are
collected.
Sample size The number of elements or research participants in a
sample selected for scientific study.
Sampling The process whereby a small proportion or subgroup of
a population of interest is selected for scientific study.
Sampling error Difference among group means because the
samples are not 100% representative of a population; the extent to
which sample values (statistics) deviate from those that would be
obtained from the entire population (parameter).
Sampling frame The accessible population or collection of
elements from which the sample is drawn.
Sampling unit The element or group of elements (cluster) that is
selected during the sampling process.
Saturation Originating in grounded theory, the point in qualitative
data collection at which no new themes or categories are being
generated.
Scaling techniques Methods for measuring the degree to which a
research participant values or exhibits a concept or characteristic
of interest; uses a graded-response format that assigns values to
the strength or intensity of one’s responses.
Scattergram A graph that shows the relationship between two
variables.
Scholarly journals A primary source of literature that provides the
firsthand reporting of research studies that serves to build the
knowledge base in a field.
Scientific method An empirical (based on experience or
observation) approach for gaining knowledge using a systematic
process with a hypothesis (predictions) and conducting
experiments or observations to test a hypothesis/predictions.
Scientific misconduct The fabrication, falsification, plagiarism, or
other practices that seriously deviate from those commonly
accepted by the scientific community for proposing, conducting, or
reporting research.
Search engines Websites that perform automated searches of
Internet databases for specific information based on specification
of a keyword or words.
Secondary source material Secondhand sources in historical
research (e.g., newspapers).
Selective manipulation A method of gaining control by selectively
manipulating certain participants or situations.
Self-plagiarism Presenting, as if new, written words of your own
that you have used previously elsewhere.
Semantic differential scale A type of scaling technique in which
respondents are asked to make judgments about a concept of
interest using a continuum consisting of bipolar adjectives.
Semistructured interview Interview for which each participant is
asked the same general questions.
Significant Value of the statistical test that warrants rejection of
the null hypothesis; differences among groups are real differences.
Simple effects tests Tests to compare the column means for
each row or the row means for each column in a two-way ANOVA
design.
Simple frequency distribution An ordered listing of the scores
and their frequencies.
Simple prediction Prediction of an individual’s score on a
measure based on the individual’s score on another measure.
Simple random sample A type of probability sample in which
every element in the population has an equal chance of being
selected, and the selection of one element does not interfere with
the selection chances of any other element (i.e., equal and
independent).
Single-blind study A study in which participants are unaware of
the purpose of the research study and their role in the study.
Situational ethics Ethical paradigm that proposes that no general
rules can be applied in all instances and that ethics are specific to
the individual case.
Skewed curve An asymmetrical curve, indicating an unbalanced
distribution.
Slope Angle of the graphed prediction line; rate of change in the
predicted score with a change in the predictor score.
Stability reliability Consistency of test scores across several
days.
Standard deviation A measure of variability that indicates the
amount that all the scores differ from the mean.
Standard error of measurement An estimate of the amount of
measurement error to expect in the score of any person in a
group.
Standard error of prediction A type of standard deviation that
serves as an index of the accuracy of a prediction equation.
Standard scores Scores that are used to change scores from
different tests to a common unit of measurement.
Statistical hypothesis An assumption that is evaluated using a
statistical test.
Statistical power Probability that the statistical test will reject the
null hypothesis when, in fact, the null hypothesis is false;
probability of not making a type II error.
Statistical techniques A method of gaining control if other control
techniques are not possible.
Statistics Values or quantities calculated using information
obtained from a sample; used to estimate population information.
Stratified random sampling A type of probability sampling in
which the population is first divided into specific strata or
subgroups and a set number of research participants are then
randomly selected from each stratum.
Structured interview Interview for which each participant is asked
the same specific questions.
Structured questionnaire Type of measurement instrument that
includes questions along with prescribed response alternatives
from which the respondents must choose, such as yes/no or
multiple choice items; also called a closed-ended questionnaire.
Subject directories Internet “yellow pages” that enable the user
to peruse World Wide Web sources according to subject areas.
Sums of squares (SS) Variability values in ANOVA.
Supplementary items Back matter of a thesis or dissertation that
includes items such as questionnaires, consent documents,
instructions to research participants, diagrams of research
settings, and tables of raw data.
Syllogism A process of logical reasoning in which conclusions are
based on a series of propositions or assumptions.
Systematic review A research approach involving the reanalysis
of the results from a large number of research studies by following
a predetermined and specific protocol.
Systematic sampling A type of probability sampling in which the
sample is drawn by choosing every kth element (research
participant or element) from a listing of the population, where k is a
constant representing the sampling interval.
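The procedure defined above can be sketched in a few lines of Python; the population listing and sample size here are hypothetical:

```python
import random

# Systematic sampling: choose every kth element from a listing of the
# population, where k is the sampling interval (population size // sample size).
def systematic_sample(population, sample_size):
    k = len(population) // sample_size      # sampling interval
    start = random.randrange(k)             # random start within the first interval
    return [population[start + i * k] for i in range(sample_size)]

population = list(range(1, 101))            # listing of 100 participants
sample = systematic_sample(population, 10)  # k = 10: every 10th participant
print(len(sample))
```

Note that once the random starting point is chosen, every subsequent selection is determined, which is why systematic sampling requires a listing of the population that is free of cyclical ordering.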
T
Table of random numbers A table (or book) consisting of
numbers arranged in a random order that may be used to select a
random sample or to assign research participants to groups.
Test A data-gathering technique to obtain a set of scores.
Themes Key ideas found across data sources and reported in the
results of qualitative and mixed-methods studies.
Theory A belief or assumption about the causal relationship
between variables that serves to explain phenomena.
Thesis format Traditional, generic format of a research report
prepared by graduate students in which the body of the report is
organized within specific chapters, such as introduction, review of
literature, procedures or methods, results and discussion, and
conclusions.
True experimental designs The best type of design because good
control is achieved through randomization.
Trustworthiness Following procedures in qualitative and mixed-
methods research (e.g., member checking and peer debriefing) in
order to ensure that justifiable conclusions arise from confirmable
patterns in data, and the procedures employed to reach those
conclusions can be articulated.
T-score A standard score with mean 50 and standard deviation
10.
Tuskegee Syphilis Study U.S. Public Health Service research
study conducted in the mid-1900s that is infamous for the
maltreatment of the participants.
Two dependent groups t-test A statistical test used with two
dependent groups or columns of scores.
Two-group comparisons Statistical techniques to compare
groups two at a time following a significant F in ANOVA; also
called multiple comparisons and a posteriori comparisons.
Two independent groups t-test A statistical test used with two
independent samples.
Two-way ANOVA An ANOVA design with rows and columns; also
called two-dimensional ANOVA.
Two-way (contingency table) chi-square tests Nonparametric
statistical test using frequencies arranged in rows and columns.
Type I error Rejection of a true null hypothesis.
Type II error Acceptance or nonrejection of a false null
hypothesis.
U
Univariate statistic Statistic for which each participant contributes
one score to the data analysis.
Univariate tests Tests for which each participant contributes one
score to the data analysis.
Unstructured interview Interview for which no questions have
been prepared for a participant; a conversation.
Unstructured questionnaire Type of measurement instrument
that includes questions for which the response alternatives are not
listed, and respondents answer freely in their own words; also
called an open-ended questionnaire.
V
Validity The degree to which interpretations of test scores or
measures derived from a measuring instrument lead to correct
conclusions.
Variability A descriptive value that describes the data in terms of
their spread or heterogeneity.
Variable A characteristic, trait, or attribute of a person or thing that
can take on more than one value and can be classified or
measured.
Variance A measure of variability that is the square of the
standard deviation.
Variate The score adjusted in ANCOVA.
Verification Method for testing interpretations for plausibility,
sturdiness, or confirmability of research findings.
Y
Y-intercept The point at which the graphed prediction line crosses
the Y-axis; the value of the predicted score when the predictor
score is zero.
Z
z-score A standard score with mean 0 and standard deviation 1.0.
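The two standard scores defined above (T-score and z-score) can be illustrated with a short Python sketch; the raw scores are hypothetical:

```python
# Convert raw scores to z-scores (mean 0, SD 1) and T-scores (mean 50, SD 10).
def mean(xs):
    return sum(xs) / len(xs)

def stdev(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

scores = [60, 70, 80, 90, 100]
m, s = mean(scores), stdev(scores)
z_scores = [(x - m) / s for x in scores]          # z = (X - mean) / SD
t_scores = [50 + 10 * z for z in z_scores]        # T = 50 + 10z
print([round(z, 2) for z in z_scores])
print([round(t, 1) for t in t_scores])
```

Because both conversions are linear, every set of z-scores has mean 0 and standard deviation 1, and every set of T-scores has mean 50 and standard deviation 10, which is what makes scores from different tests comparable.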
Index
A
Absolute reliability, 213
Abstracts, 125, 129, 294, 298
Academic Info, 124
Academic Search Premier, 121
Acknowledgments, 129, 297
Action approach, 148
Action plan, 77
Action research, 12, 24
characteristics of, 72–73
definition, 71–72
ethics in, 79–80
sample study, 78–79
stages of, 73–77
Active variables, 109
Adjustments, 112
Administered questionnaires, 26
Advancement Via Individual Determination (AVID), 59
Aerobics Center longitudinal study (ACLS), 85
Affective measures, 153–158
Agbuga, B., 130n.
Aiken, L. S., 266
Ainsworth, B. E., 66
Airasian, P., 16, 138, 196
Alcaraz, J., 104
Algina, J., 204
Alley, A. B., 181
Alpha level, 241, 245–246
American Academy of Pediatrics (AAP), 55
American College of Sports Medicine (ACSM), 197
American Educational Research Association (AERA), 171, 179,
183, 214
American Kinesiology Association, 18, 119
American Psychological Association (APA), 171, 176, 179, 183,
294, 307
American Sociological Association (ASA), 171
Amorose, A. J., 133n.
Amrein-Beardsley, A., 278
Analysis of covariance (ANCOVA), 45, 259
Analysis of variance (ANOVA), 206–208, 247–257
one-way ANOVA, 247–248, 259
other ANOVA designs, 257
random blocks ANOVA, 252–253
repeated measures ANOVA, 248–252
two-way ANOVA, 253–257, 259
Animal research, 176
Animal Welfare Act of 1985, 176
Annotated bibliography, 125
Anolli, L., 25
Anthony, C. C., 18
Appendix, 136–138
Applied research, 11, 12
Approval page, 296
Araki, K., 284
Arizona State University (ASU), 18, 177–178
Arthur, W., 144
Article format, 304–305
Ary, D., 10, 15, 109, 116
Assent, 178–180
Association, 237
Association for the Advancement of Physical Education, 17
Assumptions, 108–109, 287
Attitudes, 10, 12, 16
Attributable risk (AR), 91–93, 237
Attribute variables, 109–110
Authorship, 182–183
“Avis effect,” 46
B
Babbie, E., 25, 26
Background information, 130
Bad Blood: The Tuskegee Syphilis Experiment (Jones), 169
Baghurst, T., 18
Banville, D., 104
Baric, R., 59
Baron, R. M., 111
Barrett, K. C., 223
Bartee, R. T., 107
Bartlett, T., 182
Basic research, 11–12
Bauman, A., 104
Baumgartner, T. A., 23, 28, 39, 47, 152, 191, 195, 204, 205, 208,
211, 214, 218, 230, 231
Bax, L., 144
Baxter-Jones, A. D., 117
Beckman Instruments, 133
Beets, M. W., 106–108
Behavioral Risk Factor Surveillance System (BRFSS), 16
Bell-shaped curve, 225
Belmont Report, 171, 179
Beltrami, F. G., 107
Beneficence, 171
Bennett, W., 144
Berlant, A. R., 131n.
Bernstein, I. H., 204, 213
Berryman, J. W., 67
Best, J. W., 16, 24, 112
Beta, 246
Billings, D., 288
Bipolar adjective scale, 155
Blair, S. N., 85, 86
Block designs, 44
Blood pressure, 272, 275
Blumenthal, J., 86
Bohnenblust, S., 240
Bookwalter, C. W., 116
Bookwalter, K. W., 116
Boolean search strategy, 121–122
Borenstein, M., 141, 145
Borg, G., 156
Borg, W. R., 14, 26, 71
Boston Marathon, 22
Bouchard, C., 86
Bowdoin, B. A., 130n.
Bowen, B. D., 33
Bradburn, N. M., 25
Britannica Internet Guide, 124
Brown, D. D., 130n.
Brown, D. R., 44, 45, 255, 257–259
Brownell, K. D., 129n.
Brown, K. R. M., 25
Brown, W. J., 104
Brundage, A., 67
Brunner, R. L., 129n.
Buckworth, J., 143, 144
Bukowski, B. J., 104
Bush, George H.W., 195
Butryn, T. M., 51, 52
C
California longshoremen study, 85
Cameron, N., 22
Campbell, D. T., 38, 40, 43, 44
Capps, G., 67
Cardinal, B. J., 119
Carney, T. P., 195
Carpenter, C., 52
Case-control studies, 88–89
Case study, 22–23, 52
Caspersen, C., 86
Castellan, N. J., 259
Causal-comparative research, 15–16, 24
Causation
confounding, 95
criteria for, 94–95
effect modification, 95–96
Cells, 248
Centers for Disease Control and Prevention (CDC), 16, 104, 163
Central tendency, 227–228
Central tendency error, 46
Cerin, R., 104
Chaitman, B., 86
Chapman, H., 71
Chen, A., 105, 157, 258
Chicago Manual of Style, 294
Christian, L. M., 25
Chronicle of Higher Education, 182
Chronic obstructive pulmonary disease (COPD), 197
Chung, H., 204, 205, 208, 211, 214, 218
Clark, D. G., 85
Clinical strategy, 52
Clinton, Bill, 169
Closed-ended questionnaires, 28, 158
Closed-loop conceptualizations, 9
Cluster sampling, 192
Cochrane Library, 145
Coefficient alpha, 208, 212
Coefficient of determination, 234–235
Cognitive measures, 153
Cohen, D. A., 152
Cohen, J. S., 169, 195, 258, 266
Cohen, P., 266
Cole, A. M., 157
Collins, M. E., 102, 289
Colorado on the Move, 106
Common Rule (45 C.F.R. 46), 171, 172, 175–177, 180
Completion items, 27–29
Comprehensive School Physical Activity Program (CSPAP), 11
Computer analysis, 222–223
Computer search, 120
Conceptual literature, 116–117
Conclusions, 9, 136, 300, 302–304
Cone, J. D., 308n.
Confidentiality, 180–181
Confirming sampling, 189
Confounders, 95
Construct, 202
Continuous scores, 203
Continuous variables, 109
Control, 44–45
analysis of covariance (ANCOVA), 45
block designs, 44
counterbalanced design, 44–45
matched pairs, 44, 244–245
mediating variables, 45
moderating variables, 45
physical manipulation, 44
selective manipulation, 44
statistical techniques, 45
threats to validity, 38–40
variable, 112
Control group, 41
Controlling for the variable, 112
Convenience sampling, 193
Convergent evidence, 216
Convergent parallel mixed-methods, 61–62, 277–278
Conway, T. L., 105
Cook, T. D., 43
Cooper Center longitudinal study (CCLS), 85
Cooper, K. H., 85
Corbin, C. B., 18, 157, 159
Corbin, J., 52
Corcoran, J., 141
Coronary heart disease (CHD), 84
Corpus of data, 57
Correlation, 232–233, 237, 265
Correlational research, 16–17, 23
Correlation coefficient, 144, 205, 232–237, 266
calculating, 207–208
intraclass, 206, 208
rank order, 234
Counterbalanced design, 44–45
Covariates, 45, 259
Creswell, D., 13
Creswell, J. W., 13, 51, 52, 56, 59–61, 149, 188, 194, 270, 271,
276
Criteria for critiquing an article, 138–139
Criterion-referenced standards, 24
Criterion validity evidence, 217
Critical reading, 125
Critical region, 242
Critical sampling, 189
Critical value, 242
Crocker, L., 204
Cronbach’s alpha, 208, 212
Cronk, B. C., 25
Cross-sectional approach, 22
Cross-sectional survey, 87–88
Cross-validation, 265
Cumming, S. P., 117
Cunningham, G. K., 47
Curvilinear relationship, 235
D
Daley, E. M., 25
Dalton, S. N., 144, 145
Damori, K. D., 107
Darracot, S. H., 213
Darwin, Charles, 7
Data analysis, 133
common units of measure, 222
computer analysis, 222–223
descriptive, 221–237
determining relationships between scores, 232–237
inferential, 239–267
measures of association, 237
measuring group position, 229–232
mixed-method designs, 277–278
organizing and graphing scores, 223–227
qualitative designs, 270–271
stages of, 222
Database searches, 120–123
Boolean search strategy, 121–122
multidisciplinary, 120–121
specialized, 121
Data coding, 57–58, 271
Data-collecting instruments, 161
Data collection and analysis, 13, 38, 133, 151–160, 289–292
Data collection process
collecting and recording data, 56
determining the sampling technique, 56
interview day suggestions, 57
locating and accessing data site, 56
Data, defined, 203
Data reading, 270–271
Data reduction, 271
Data saturation, 272
Data set, 270
Data source triangulation, 276
Day, R. A., 308n.
Deductive reasoning, 6–8
Dees, R., 308n.
Definition of terms, 288
Degrees of freedom (DF), 206, 242, 247
Delaney, H. D., 255, 257
Delimitations
nature of, 107
research proposal, 285–286
Delphi technique, 160
Denzin, N. K., 50, 51, 53, 187, 270, 271
Dependent variables, 36–37, 109–110
Descriptive data analysis, 221–237
Descriptive questions, 104
Descriptive research, 16, 21–33, 148, 195
action research, 24
case study research, 22–23
causal-comparative research, 15–16, 24
correlational research, 16–17, 23
developmental research, 22
normative research, 23–24
observational research, 24
survey research, 22
Descriptive statistic, 222
Descriptive values, 227–229
measures of central tendency, 227–228
measures of variability, 228–229
Design types, 37, 40–43
in ascending order of complexity, 42
designs, defined, 40
preexperimental designs, 40
quasi-experimental designs, 41–43
true experimental designs, 41
Determining strategy, 51–52
Developmental research, 22
Dichotomously scored tests, 212–213, 217
Difference questions, 105
Digby, K., 60
Dillman, D. A., 25
DiMatteo, M. R., 144
Directional hypothesis, 150
Direct observation, 151–152
Directory of Psychological Tests in the Sport and Exercise
Sciences (Ostrow), 154
Disconfirming sampling, 189
Discrete scores, 203
Discrete variables, 109
Discriminant (divergent) evidence, 216
Discussion section, 136, 137, 299–302
Dishman, R. K., 83, 86, 143–145
Dissertation Abstracts, 121
Dissertation Express, 121
Distributed questionnaires, 27–31
development, 27–28
distribution methods, 29–30
examples, 30–32
format, 28–29
return, 30
Doctors from Hell (Spitz), 168
Dogpile, 123, 124
Domains, 124
Domangue, E., 105
Dorantes, L., 196
Double-blind study, 47
Dowling, M., 10
Drew, C. J., 9, 13, 104
Drowatzky, J. N., 169, 170, 182
DSTAT program, 144
Dupuis, S., 55
Dzialoznski, T. M., 130n.
E
Eating Attitudes Test (EAT-26), 286
Ecological studies, 88
Effect size, 142–143
defined, 258
in inferential data analysis, 258
Eisenmann, J. C., 107, 117, 130n.
Electroencephalogram (EEG), 11
Element, 186
Elliott, M. N., 25
EndNote, 125
Engels, H. J., 133n.
Ensign, J., 13, 61
Epidemiological and Public Health Aspects of Physical Activity and
Exercise, 85–86
Epidemiological approach, 148
Epidemiological research, 83–96
cause in, 94–96
history, 84–86
measurement, 86–87
methods in, 86–96
relative risk and odds ratios, 94
research design, 87–91
statistics, 91–94
Epstein, S., 86
Equating criterion, 112
Equitable participation, 197
ERIC, 121
Error
experimenter bias effect, 47
Hawthorne effect, 46
“John Henry” effect, 46
participant-researcher interaction effect, 47
placebo effect, 46
post hoc error, 47
rating effect, 46–47
sampling, 189, 195, 240
Type I error, 246
Type II error, 246
Escamilla, R. F., 105
Eshelman, M., 298
Essence, 51
Ethics, 167–184
in action research, 79–80
animal research, 176
basis in research, 168–169
defined, 169–170
disclosure of research findings, 181–183
informed consent, 177–181
institutional review boards (IRBs), 171–176, 179
Nazi experimentation, 168–170
privacy and confidentiality, 180–181
protection of research participants, 170–171
regulation of research, 170–171
Tuskegee Syphilis Study, 168–169, 171
Ethnomethodological strategy, 51–52
Evaluation
critiquing a research article, 138–139
information on the Web, 124–125
of research report, 139–140
Excite, 123, 124
Exercise and Sport Science Reviews, 117
Expected frequency, 260
Experimental mortality, as threat to internal validity, 39–40
Experimental research, 13–15, 17, 35–48, 148
common sources of error, 46–47
controlling threats to validity, 38–40
design types, 40–43
external validity, 38, 40, 43–44
interaction effect of selection bias and, 40
internal validity, 38–40, 43–44
measurement, 47
methods of control, 44–45
procedure in, 36–38
stages of, 36–38
threats to validity, 43–44
Experimenter bias effect, 47
Experiment-wise error rate, 259
Explanatory sequential mixed-methods, 277
Exploratory sequential mixed-methods, 59–60, 62, 277
Ex post facto research, 24
External criticism, 70
External grants, 163–164
External validity
defined, 38
in summary, 43–44
threats to, 40
Extraneous variables, 110–111
Extreme case sampling, 188–189
F
FACSM (Fellow, American College of Sports Medicine), 128
Factorial ANOVA, 255
Factorial design, 253–255
Family Educational Rights and Privacy Act (FERPA), 180
Farias, C., 72
Faucette, N., 104
Federal Policy for the Protection of Human Subjects, 171
Fellow, American College of Sports Medicine (FACSM), 128
Ferreira, J. C., 197
Fey, M. A., 105, 286, 291, 292, 295, 296
Firetto, C. M., 272
Fishbowl technique, 190
Five-step hypothesis-testing procedure, 266
Flesch-Kincaid Scale, 179
Fletcher, G. F., 86
Fligner, M. A., 240
Focus group, 26
Focus group interviews, 160
Foe, G., 179
Food Safety and Inspection Service (U.S. Department of
Agriculture), 176
Forced ranking scale, 156
Form and Style (Slade and Perrin), 294
Forzano, L. B., 155
Foster, S. L., 308n.
45 C.F.R. 46 (Common Rule), 171, 172, 175–177, 180
Four-way ANOVA, 257
Fox, K. R., 157
Fraga, J. A., 16
French, K. E., 144
Frequency distributions, 223–225
Frequency polygon, 225
Freshman Survey, 135
Frey, G. C., 105
Fricker, R. D. Jr., 25
Friedman two-way ANOVA, 260
F test, 247–249
Funding sources, 163
G
Gall, J. P., 14, 26, 71
Gall, M. D., 14, 26, 71
Garcia, M. B., 59
Gastel, B., 308n.
Gay, L. R., 16, 138, 196
General Linear Model with Repeated Measures, 251–252
Genoe, R., 55
Getchell, N., 6, 50, 163
Gibbons, L. W., 85
Gill, D. L., 66
Gleaves, J., 68
Glew, D., 291
Golinelli, D., 152
Goliszek, A., 169
Good tests, 203–204
Google, 121, 122, 124
Google Guide, 124
Google Scholar, 122
Grace, A., 51, 52
Graduate Record Exam (GRE), 17
Grant application, 163–164
Grant proposal, 164
Grant submission, 164
Graphing
grouping scores for, 225–227
techniques, 232–233
Gravetter, F. J., 134, 155
Greene, J., 13, 14
Green, L., 47
Green, S. B., 240
Griffo, J. M., 18, 194
Grosshans, I. R., 71
Grounded theory, 52
Group differences method, 216
Grouping scores for graphing, 225–227
Group position, 229–232
H
Hagen E. P., 47
Hales, P., 204, 205, 208, 211, 214, 218
Hall, S., 6, 50, 163
Halo effect, 46
Hamdan, S. M., 104
Hamill, J., 66
Hanna, F., 161
Hannus, A., 107
Hardman, M. L., 9, 104
Harris, M. B., 134
Harvard alumni study, 85
Haskell, W. L., 86
Hassenkamp, A., 71
Hasson, F., 160
Hastie, P. A., 72
Hauspie, R. C., 22
Hawthorne effect, 46
Haymes, E. M., 66
Heady, J. A., 84
Health Insurance Portability and Accountability Act (HIPAA),
180
Health screenings, 197
Heath, G., 83, 86
Hedges, L. V., 141, 145
Heelan, K. A., 130n.
Heller, Jean, 169
Helsinki Declaration, 170
Henry, F. M., 66
Hensler, P., 67
Herring, M. P., 145
Hess, D., 283, 284
Hewitt, L., 55
Higgins, J. P. T., 141, 145
Histogram, 226
Historical approach, 148
Historical research, 17
biographical research, 70–71
characteristics of, 67
data control and interpretation, 69
evaluating historical data, 70
hypotheses in, 69–70
kinesiology, 66–67
oral history, 68
process of, 67–68
reports, 71
sources of historical data, 68
Historic Traditions and Future Directions of Research on Teaching
and Teacher Education in Physical Education (Housner et al.),
18
History, as threat to internal validity, 38, 40
History, epidemiological research
CCLS, 85
CHD, 84
Harvard alumni study, 85
longshoremen study, 85
physical activity, 85–86
Tecumseh community health study, 84–85
Homogeneous sampling, 189
Hong, S. T., 181
Horn, T. S., 133n.
Hosp, J. L., 9, 104
HotBot, 123
Housner, L. D., 18
Huberty, C. J., 264
Huck, S. W., 45, 129, 222, 259
Huffcutt, A. I., 144
Hult, J. S., 67
Human Experimentation (McCuen), 169
HyperRESEARCH software, 272–274
Hypothesis, 149–151
decision on alternate hypothesis, 245–246
defined, 8
formulating, 8
hypothesis-testing procedure, 241–242
null, 150, 241–242, 245–246, 248
research, 36–37, 149–151, 241
research proposal, 287
Hyun Hwang, J., 66
I
Identification, recruitment, and selection of research participants,
185–198
Ikeda, N., 144
Impact factors, 119
Imperfect induction, 7
Incidence rates, 87
Incident cases, 86
Inclusion and exclusion criteria, 197
Independent variables, 36, 109–110
Indiana University, 182
Induction, 7
Inferential data analysis, 239–267
analysis of variance, 206–208, 247–257
assumptions underlying statistical tests, 257–258
decision on alternate hypothesis and alpha level, 245–246
effect size, 258
hypothesis-testing procedure, 241–242
inference, 240
one-group t test, 242–243
overview of multivariate tests, 264–265
overview of nonparametric tests, 259–264
overview of prediction-regression analysis, 265–266
overview of testing correlation coefficients, 266
overview of two-group comparisons, 258–259
review of analysis of covariance, 259
selecting the statistical test, 266–267
two dependent groups t test, 244–245
two independent groups t test, 243–244
Inferential statistics, 240
Informed consent, 177–181
Ingenta, 121, 122
IngentaConnect, 126
Institute for Scientific Information (ISI), 119
Institutional review boards (IRBs), 56, 171–176, 179
Instrumentation, 132–133, 151–163
data-collection methods and techniques, 151–160
instrument development, 162–163
revising the instrument, 161
selecting appropriate method and instrument, 160–163
as threat to internal validity, 39
Interaction effect
Hawthorne effect, 46
participant-researcher interaction effect, 47
of selection bias and experimental treatment as threat to external
validity, 40
of testing, as threat to external validity, 40
Internal consistency reliability, 208
Internal criticism, 70
Internal grants, 164
Internal validity, 38–40
defined, 38
in summary, 43–44
threats to, 38–40
Internet searches, 123–125
evaluating information on the Web, 124–125
utilities, 123–124
Interpreting results, 9
Interval scores, 203
Intervening variables, 37
Interview, 52–55
guides and questions, 52–53
guide types, 53
questions, 53–55
Interview day suggestions, 57
Interviewer bias, 160
Interviews, 25–26, 159–160
In the Name of Science (Goliszek), 169
Intraclass correlation coefficient, 206, 208
Intrinsic Motivation Inventory (IMI), 132
Introduction section
research proposal, 283–288
research report, 130–131, 298
Inventory, 154
Isaac, S., 22, 24, 26, 36, 38
Ixquick, 123, 124
J
Jackson, A. S., 23, 28, 47, 152, 191, 195, 204, 205, 208, 211, 214,
218, 230, 231
Jacobs, L. C., 10, 15, 109, 116
Jekel, J. F., 129n.
Jensen, M., 18
Jewett, R., 51
“John Henry” effect, 46
Johns Hopkins University, 125
Johnson, B. J., 144
Johnson, L., 144
Jones, J. H., 169
Jones, S., 25
Jose, P. E., 45
Journal Citation Reports (JCR), 119
Journal of Motor Behavior, 18
Journal of Sport History, 18
Journal of Sport Psychology, 18
Justice, 171
K
Kahn, J. V., 16, 24, 112
Kais, K., 107
Kandakai, T. L., 133n.
Kappa coefficient, 212
Katz, D. L., 129n.
Keeney, S., 160
Kelley, K., 255, 257
Kenny, D. A., 111
Kenyon, G. S., 216
Keppel, G., 44, 45, 255, 257–259
Kerr, G., 51, 52
Kerr, J. H., 52
Kinesiology
CCLS, 85
Harvard alumni study, 85
longshoremen study, 85
physical activity, 85–86
Tecumseh community health study, 84–85
King, K. A., 133n.
Kittleson, M. J., 25
Knowledge
as basis of profession, 5–6
research as knowledge pipeline, 6
search for, 6–7
Knowledge test, 211–212
Kohl, H. W., 85
Kolody, B., 104
Kono, S., 60
Koro-Ljungberg, M., 278
Kraft, K. P., 50
Krejcie, R. V., 196
Krueger, R. A., 26
Kruskal-Wallis one-way ANOVA, 260
Kuder-Richardson coefficient, 212
Kulinna, P. H., 11, 13, 18, 61, 79, 129n., 153, 159, 194, 196, 197,
278
Kuzma, J. W., 240
Kwon, J. Y., 278
L
Laasko, L., 105
Lambdin-Abraham, R., 283
Landers, D. M., 18
Langley, K., 79
Larose, T. L., 223
Larson, E. L., 179
Lawrence Erlbaum Associates, 25
Lee, A. M., 66
Lee, C., 104
Leech, N. L., 223
Lee, I.-M., 86
Leptokurtic curve, 226
Lerner Self-Esteem Scale, 291
Leslie, E., 104
Lewis, F. M., 47
Life history, 51
Likert scale, 154–155
Limitations
nature of, 107–108
research proposal, 286
Lincoln, Y. S., 50, 51, 53, 187, 270, 271
Linear relationship, 235
Line of best fit, 233
Lipsey, M. W., 144, 258
Literature review
importance of, 116–117
in meta-analysis studies, 141–142
purposes of, 115–116
reading and documenting research literature, 125–126
research proposal, 288–289
research report, 115–126, 298
searching the literature, 120–125
sources of literature, 117–119
steps in, 120
Littell, J. H., 141
Liu, Y., 250
Lochbaum, M., 59
Locke, L. F., 50, 106, 137, 283, 298, 308n.
Logical validity evidence, 214
Longitudinal approach, 22
Longshoremen study, 85
Lorenz, K., 153
Lycos, 124
M
Macera, C. A., 86
Mackenzie, S. H., 52
MacKinnon, D. P., 45
Macmillan, F., 50
Magazines, 118
Magill, R. A., 305
Mahar, M. T., 23, 28, 47, 152, 191, 195, 204, 205, 208, 211, 214,
218, 230, 231
Major, M. J., 283, 287
Malina, R. M., 117
Mamma, 123, 124
Manual for Writers of Term Papers, Theses, and Dissertations, A
(Turabian), 294
Manuscript preparation, 305–307
after writing, 306–307
before writing, 305–306
during writing, 306
Marathon, Boston, 51
Morgan, D. W., 196
Marshall, S. J., 105
Martinez, R., 104
Martin, J. J., 10, 11, 161
Martin, P. E., 10, 11, 18
Matched pairs design, 44, 244–245
Matching criterion, 112
Maturation, as threat to internal validity, 38–40
Mauchly’s Test of Sphericity, 251–252
Maximum variation sampling, 189
Maxwell, S. E., 255, 257
Mays Woods, A., 13
Mbambo, Z., 133n.
McBride, R., 130n.
McCaw, S. T., 130n.
McCuen, G. E., 169
McCullagh, P., 131n.
McDermott, R. J., 25, 47
McKenna, H., 160
McKenzie, T. L., 104, 105, 152
McMillan, J. H., 6, 72, 116, 155
McMullen, J., 52
Mean, 228
Mean effect, 144
Mean square (MS) value, 206, 207, 247, 249, 255
Measurement in epidemiology, 86–87
Measurement in research, 202–219
data characteristics, 203–218
other considerations, 218–219
Measurement techniques, 47, 151–158
affective measures, 153–158
cognitive measures, 153
physical measures, 153
Measurement Theory and Practice in Kinesiology, 86
Median, 227–228
Mediator variables, 45, 112
Medicine and Science in Sports, 18
Medicine and Science in Sports and Exercise, 144, 179
Medline, 121
Member checking, 276
Merom, D., 50
Mertens, D. M., 79
Mertler, C. A., 72, 187–189, 191, 192
Mesquita, I., 72
Meta-analysis studies, 117, 118, 140–145
computer program, 144
criticisms of, 145
defined, 141
effect size, 142–143
effect size in, 143
examples, 144–145
in kinesiology, 144–145
procedures in, 141–144
stages of, 141–144
Metacrawler, 123, 124
Metasearch engines, 123–124
Methods of data collection, 52–55
interviews, 52–55
observations, 55
Methods section, 131–133
instrumentation, 132–133
procedures, 133, 298–299
research participants, 131
Metzler, M. W., 18
Michael, W. B., 22, 24, 26, 36, 38
Michels, K. N., 44, 45, 255, 257–259
Mills, G. E., 16, 72, 74–77, 138, 196
Mintah, J. K., 288, 291
Mishra, G., 104
MIX 2.0, 144
Mixed-methods designs, 276–278
convergent parallel mixed-methods, 277–278
explanatory sequential mixed-methods, 277
exploratory sequential mixed-methods, 277
interpreting, evaluating, and presenting, 278
trustworthiness, 276
verification, 276
Mixed-methods research
convergent parallel mixed-methods, 61–62
data collection, 60–61
exploratory sequential mixed-methods, 59–60
MLA Handbook for Writers of Research Papers (Gibaldi), 294
Mode, 227
Moderator variables, 45, 111, 142
The Modern Language Association of America, 294
Modified kappa coefficient, 212
Moffat, R. J., 133n.
Molinari, L., 22
Moons, K., 144
Moore, D. S., 226, 236, 240
Morgan, D. L., 26
Morgan, G. A., 223
Morris, J. N., 84
Morrow, J. R., Jr., 144
Morse, J. M., 194
Morss, G., 104
Mulhearn, S. C., 194, 197, 272
Multiple choice items, 27
Multiple comparisons, 258–259
Multiple prediction, 265–266
Multiple-treatment interference, as threat to external validity, 40
Multistage sampling, 192
Multivariate statistics, 264–265
Multivariate tests, 264
Murphy, K. R., 10, 258
Murray, L., 71
Musser, K. A., 283, 284, 289
Myburgh, K. H., 133n.
Myors, B., 258
N
Nader, P. R., 152
Narrative strategy, 51
National Academy of Medicine (NAM), 55
National Commission for the Protection of Human Subjects of
Biomedical and Behavioral Research, 171
National Eating Disorders Screening Program, 286
National Health Interview Survey (NHIS), 16
National Heart, Lung, and Blood Institute, 129
National Institutes of Health (NIH), 6
National Research Act, 171
National Science Foundation, 181–182
Nazi experimentation, 168–170
Nelson, J. K., 164, 305
Neutens, J. J., 33, 70
New Language of Qualitative Method, The (Gubrium and
Holstein), 291
Newspapers, 118
Noble, E. G., 130n.
Nominal scores, 203
Nondirectional hypothesis, 150
Nonlinear regression, 266
Nonparametric statistics, 259–260
Nonparametric tests, 259–264
Nonprobability sampling, 188, 193
Nonresponse bias, 160
Nonsignificant, 241
Normal curve, 225–226, 228, 235
Normative research, 23–24
Norm-referenced standards, 23
Norris, J., 278
Northern Light, 123, 124
Northwestern University, 182
Notz, W. I., 240
Nugent, P., 104
Null (statistical) hypothesis, 150, 241–242, 245–246, 248
Numerical rating scale, 155
Nunnally, J. C., 155, 204, 213
Nuremberg Code, 170, 177
Nuremberg War Crime Trials, 168, 170, 177
O
Objectivity, 204, 206–208
acceptable, 219
defined, 151, 204
documenting, 205
establishing, 205
estimating, 205
nature of, 6, 204
scientific method and, 8–10
Observational research, 24
Observations, 55
Observation techniques, 151–152
Observed frequency, 260
O’Connor, P. J., 145
Odds ratio (OR), 91, 94, 237
Office of Human Research Protections (OHRP), 171
Office of Research Integrity (ORI), 182
Oh, S., 204, 205, 208, 211, 214, 218
Olejnik, S., 264
Olson, R., 50
One-group t test, 242–243
O’Neill, D. E. T., 130n.
One-Sample T Test, 242
One-way ANOVA, 247–248, 257, 259
One-way chi square test, 260–261
Oostdam, R., 161
Open-ended items, 27
Opportunistic sampling, 189
Ordinal scores, 203
ORI Newsletter, 182
Osgood, C. E., 155
Ostrow, A. C., 154
Outliers, 228
Owen, N., 104
Owens, D., 67
P
Paffenbarger, R. J., 86
Paffenbarger, R. S., 85
Palmer, E., 125
Papaioannou, A., 131n.
Paradigm, 50
Parameters, 222
Parametric statistics, 259–260
Parametric tests, 259–260
Park, R. J., 18, 67
Parks, J. W., 84
Participant observation, 152
Participant recruitment, 196–197
Participant-researcher interaction effect, 47
Participatory action research, 71
Passive consent, 178
Pate, D. J., 287
Pate, R. R., 86
Patino, C. M., 197
Patterson, D., 71
Patton, M. Q., 53–55, 57, 61, 160, 189, 194, 271
Pavkov, T. W., 223, 250
Payne, V. G., 144
PCPFS Research Digest, 117–118
Pealer, L. N., 25
Pedhazur, E. J., 266
Pediatric Exercise Science, 117
Peer debriefing, 276
Peer review, 276
Perceived Competence Scale for Children, 157
Percent agreement, 212
Percentile ranks, 23, 229–230
Percentiles, 229
Per-comparison error rate, 259
Perfect induction, 7
Periodicals, 118–119
Perrin, R., 294
Personal interviews, 25–26
Personal opinions, 10
Phases of qualitative research, 50–59
case study, 52
clinical strategy, 52
data coding, 57–59
data collection, 52–55
data collection process, 55–57
ethnomethodological strategy, 51–52
grounded theory, 52
narrative strategy, 51
observations, 55
paradigms/theories, 50–51
phenomenological strategy, 51–52
researcher processing, 50
Phenomenological strategy, 51–52
Phi, 217
Phone interviews, 25
Photovoice-visual voice, 55
PHS Policy on Humane Care and Use of Laboratory Animals, 176
Physical Activity and Health (U.S. Department of Health and
Human Services), 5, 10
Physical Activity Epidemiology (Dishman et al.), 86, 237
Physical Activity Guidelines for Americans, 86
Physical Activity Questionnaire for Children (PAQ-C), 298
Physical education teacher education programs (PETE programs),
278
Physical manipulation, 44
Physical measures, 153
Physical Self-Concept Subscale (PSCS), 291
Physical Self-Perception Profile (PSPP), 157
Pierce, K. A., 223, 250
Pillai, V., 141
Pilot study, 162–163
Pina, I. L., 86
Pitetti, K. H., 106–108
Placebo effect, 46
Plagiarism, 182
Platykurtic curve, 226
Point of interface, 59
Polynomial regression, 266
Poon, P. P. L., 105
Popular press periodicals, 118
Population
defined, 186
sampling, 186–187
validity, 186
Posteriori comparisons, 258–259
Post hoc error, 47
Powell, K. E., 86
Powerjog EG30, 133
Practical action research, 72
Pratt, M., 86
Prediction
defined, 265
multiple, 265–266
simple, 265
Preexperimental designs, 40
Preliminary investigation, 162
Preliminary items, 128–129, 294–298
abstracts, 294, 298
acknowledgments, 129, 297
approval page, 296
title page, 295
typical contents, 128
Premises, 7
Preplanning, 9
President’s Council on Physical Fitness and Sports, 118
Pretest/posttest experimental design, 41
Prevalence rates, 87
Prevalence study, 87–88
Prevalent cases, 86
Primary source material, 68
Primary sources, 17, 68, 117
Privacy, 180
Probability sampling, 188–192
Probes, 54–55
Problem statement, 130–131, 284
Procedures section
research proposal, 289–292
research reports, 133, 298–299
ProCite, 125
Profession
knowledge as basis of, 5–6
research as, 6
Project Muse, 122
Prolonged engagement, 276
ProQuest Dissertations and Theses (PQDT), 121
Prosoli, R., 59
Prospective cohort studies, 89–90
Psychological Review, 117
PsycINFO, 121
Publication Manual of the American Psychological Association,
294
Purpose statement, 106–107, 130, 284
Purposive sampling, 188–189
Q
Qualitative approach, 148, 152
Qualitative designs, 269–279
data analysis, 270–271
data coding, 271
data reading, 270–271
data reduction, 271
example, 272–276
interpreting, evaluating, and presenting findings, 271–272
mixed-methods designs, 276–278
themes, 270
trustworthiness, 276
types of, 270
verification, 276
Qualitative purpose statement, 107
Qualitative research, 12–14, 16, 18, 50–59
case study, 52
clinical strategy, 52
data coding, 57–59
data collection, 52–55
data collection process, 55–57
ethnomethodological strategy, 51–52
grounded theory, 52
narrative strategy, 51
observations, 55
sampling procedures, 186–187
Qualitative studies, 193–194
Qualitative variables, 109
Quantitative purpose statement, 107
Quantitative research, 12–14, 187
Quantitative studies, 194–196
Quantitative variables, 109
Quasi-experimental designs, 41–43
Questioning techniques, 151, 158–160
Questionnaires, 26–31, 154, 158–159, 195
administered, 26
distributed, 27–31
reliability, 212
R
Raffle, P. A. B., 84
Random assignment, 39, 43, 187
Random blocks ANOVA, 252–253
Randomized controlled trial, 90–91
Random selection, 39, 43, 112, 187
with replacement, 190
without replacement, 190
Range, 228, 236
Rank order correlation coefficient, 234
Rater reliability, 204
Rating effect, 46–47
Rating scale, 155–157
Ratio scores, 203
Raudsepp, L., 107
Reactive effects of experimental setting, 40
Reactivity, 152
Readability statistics, 179
Real difference, 240
Recommendations, 136, 300, 302–304
Reeve, G. T., 66
References, 136–138, 304
RefWorks, 125
Regression, 265–266
defined, 265
nonlinear, 266
polynomial, 266
Regression line, 233
Related research, 116
Relationship questions, 104–105
Relative reliability, 208–212
multiple trials within a day, 208–209
Relative risk (RR), 91, 94, 237
Reliability, 27–28, 204, 208–212, 235, 249–250
absolute, 213
acceptable, 219
defined, 151, 204–205
dichotomously scored test, 212–213, 217
documenting, 205
establishing, 205
estimating, 205
general case, 211
knowledge test, 211–212
multiple days, 210–211
other evidence, 217–218
questionnaire, 212
rater, 204
Reliability coefficient, 208
RENO Diet Heart Study, 129
Repeated measures, 244–245
Repeated measures ANOVA, 248–252
Representative sample, 187
Research
defined, 6
descriptions of, 6
objectivity in, 6
scientific method, 8–11
as search for knowledge, 6–7
significance in kinesiology, 17–19
types of, 11–17
Research approaches, 148–151
Research bias, 276
Research design
advantages and disadvantages, 87–88
case-control studies, 88–89
cross-sectional survey, 87–88
goal of, 87
longitudinal/prospective, 87
prospective cohort studies, 89–90
randomized controlled trial, 90–91
retrospective, 87
Researcher processing, 50
Research hypothesis, 36–37, 149–151, 241
Research participants, 131
populations, 186
recruitment, 196–197
sampling procedures, 186–196
Research plan, 8–9, 147–165
data-collection methods and techniques, 151–160
development, 148
grant application, 163–164
hypotheses, 149–151
selecting appropriate method and instruments, 160–163
selecting the research approach, 148–151
Research process, 100–113
assumptions, 108–109
defined, 101
defining the problem, 105–109
delimitations, 107
limitations, 107–108
purpose of study, 106–107
research question in, 101–105
steps in, 101
variables, 109–112
Research proposal, 282–292
chapter one: introduction section, 283–288
chapter three: procedures section, 289–292
chapter two: literature review section, 288–289
defined, 283
title, 283
Research Quarterly for Exercise and Sport, 18, 119, 144, 179
Research question, 8, 101–105
defined, 101–102
types of, 104–105
Research Randomizer, 191
Research reports, 38, 127–146, 293–308
appendix, 136–138
body or text, 294–304
critiquing, 138–139
defined, 286
discussion section, 136, 137, 299–302
divisions of thesis or dissertation, 294–304
format, 294, 304–305
guides to preparation, 294, 308
introduction section, 130–131, 298
literature review, 115–126, 298
for meta-analysis studies, 143–144
methods section, 131–133
nature of, 128
preliminary items, 128–129
preparing manuscript for publication, 305–307
references, 136–138, 304
results section, 134–136, 299–302
summarizing, 140–141
Respect for persons, 171
Results section, 134–136, 299–302
Reverby, S., 169
Rho, 234
Rikard, L., 104
Rinehart, R., 56
Riva, G., 25
Roberts, C. G., 84
Rodgers, W. M., 105
Rogers, D. E., 66
Rosenberg Self-Esteem Scale, 107, 283, 291
Rosenthal, R., 141, 144, 145
Rosner, B., 195
Rothstein, H. R., 141, 145
Rowe, D. A., 23, 28, 47, 152, 191, 195, 204, 205, 208, 211, 214,
218, 230, 231
RPE Scale (Ratings of Perceived Exertion), 156
Rubin, H. J., 26
Rubinson, L., 33, 70
S
Safrit, M. J., 214
Sage Publications, 25, 26
St. Jeor, S. T., 129n.
Salkind, N. J., 240
Sallis, J. F., 104, 105, 152
Sample size, 193, 194
Sample Size Calculator, 196
Sampling error, 189, 195, 240
Sampling frame, 187
Sampling procedures
population, 186–187
qualitative research, 186–187
quantitative research, 187
random processes, 187–188
sample, defined, 186
sample selection methods, 188–189
sample size, 193
sampling, defined, 186
Sampling unit, 192
Sands, C., 196
Sarvela, P. D., 47
Sarzynski, M. A., 130n.
SAS, 144, 222, 223
Saturation, 194
Scaling techniques, 154–158
Scattergram, 232
Schaer, C. E., 107
Schempp, P. G., 18
Scholarly journals, 118
Schonlau, M., 25
School Health Policies and Practices Study (SHPPS), 16
Schumacher, S., 6, 155
Schwarzer, R., 144
Scientific method, 8–10
attitudes and personal opinions versus, 10
defined, 8
framework for, 9
stages of, 8–9
strict application, 9, 11
Scientific misconduct, 171, 182
Scott, B., 129n.
Scott Kretchmar, R., 66
Search engines, 123, 124
Search Engine Showdown, 124
Secondary sources, 17, 117
material, 68
Sehgal, A., 152
Selecting statistical tests, 266–267
Selection, as threat to internal validity, 39
Selection bias, interaction effect of experimental treatment and, 40
Selective manipulation, 44
Self-administered questionnaire, 26–27
Semantic differential scale, 155
Semi-structured interviews, 26, 53
Seo, J. W., 181
Severiens, S. E., 161
SHAPE America, 7, 17–18
Sherar, L. B., 117
Siegel, S., 259
Significance of study, 285
Significant, 241
Sills, S. J., 25
Silverman, S. J., 106, 137, 164, 283, 298, 308n.
Simple effects tests, 257
Simple frequency distribution, 223–225
Simple prediction, 265
Simple random sample, 190–191
Simsek, Z., 25
Single-blind study, 47
Situational ethics, 170
Sixsmith, J., 10
Skewed curve, 226
Skinner, B. F., 18
Skinner, J. S., 18
Slade, C., 294
Slope, 265
Smith, A. L., 131n.
Smyth, J. D., 25
Snowball sampling, 189
Social desirability bias, 160
Society of Health and Physical Educators America (SHAPE
America), 7, 17–18
Solmon, M. A., 66, 105
Song, C. Y., 25
Sonstroem’s Physical Activity Model, 301
Sorensen, C., 10, 15, 109, 116
Spearman’s rho, 234
Special Olympics, 107–109, 287
Specific aims, research proposal, 284–285
Spengler, C. M., 107
Spirduso, W. W., 106, 137, 283, 298, 308n.
Spitz, V., 168
Sponsored research, 183
SPORT Discus, 121
SPSS, 191, 206, 208, 209, 211, 218, 222, 223, 225, 226,
228–231, 233, 236, 240, 242–243, 245, 248–250, 255,
259–261, 263, 264, 266, 292, 303
Sputtr, 123
Stability reliability, 208
Standard deviation, 228–229
Standard error of measurement (SEM), 213
Standard error of prediction, 265
Standard scores, 230–231
Standards for Educational and Psychological Testing (American
Educational Research Association), 214
Stanley, J., 38, 40, 43, 44
Statistical analysis, 128
Statistical control, 112
Statistical hypothesis, 241
Statistical power, 195, 246
Statistical regression, as threat to internal validity, 39
Statistical techniques, 45
Statistics, 222
attributable risk, 91–92
case-control study, 93–94
dependent variable, 91
disease probability, 91
examples, 91
independent variable, 91
interpreting relative risk, 94
issues, 91
odds ratio, 91, 94
prospective cohort study, 92–93
relative risk, 91
risk factor, 91
Steel, K. A., 50
Stinson, A. D., 104
Stratified random sampling, 191
Strauss, A., 52
Structured-alternative scale, 157–158
Structured interviews, 26, 159
Structured questionnaires, 158
Stylianou, M., 153
Subject directories, 123, 124
Subject Guides A to Z, 124
Suci, G. J., 155
Sudman, S., 25
Sugiyama, T., 104
Summary, 300, 302, 303
Sums of squares (SS), 206, 247, 249, 254
Sun, D. C., 284, 286
Sun, J., 105
Supplementary items, 304
Survey research, 22, 24–25
preliminary considerations, 25
survey methods, 25–31
Suto, M., 52
Syllogism, 7
Systematic review, 140
Systematic sampling, 191–192
Systematic variables, 37
T
Table of random numbers, 190
Tamminen, K., 51
Tannenbaum, P. H., 155
Taylor, A. W., 130n.
Tecumseh community health study, 84–85
Telama, R., 105
Telephone interviews, 25
Templin, T. J., 18
Tennessee Self-Concept Scale, 291
Teruzzi, T., 25
Testing
interaction effects as threat to external validity, 40
as threat to internal validity, 39
Tests
defined, 203–204
good tests, 203–204
Tests of Within-Subjects Effects, 251–252
Thayer, R. E., 130n.
Themes, 270
Theory, 10–11
Theory/concept sampling, 189
Thesis format, 294, 304–305
Thomas, D. Q., 130n.
Thomas, J. R., 119, 144, 164, 305
Thomson, D., 71
Thomson Reuters, 119
Thorndike, R. L., 47
Thorndike, R. M., 47
Three-way ANOVA, 257
Titles
research proposal, 283
research report, 295
Toner, M., 71
Trade/professional magazines, 118
Trekell, M., 67
Tremblay, M. S., 16
True experimental designs, 41
Trump, Donald John, 195
Trustworthiness, 276
T-score, 231
Tsuruta, H., 144
t test
one-group, 242–243
two dependent groups, 244–245
two independent groups, 243–244
Tucker, J., 130n.
Tuckman, B. W., 6, 110, 134, 136
Tudor-Locke, C., 66
Tuohy, D., 10
Turabian, K. L., 294
Tuskegee’s Truths (Reverby), 169
Tuskegee Syphilis Study, 168–169, 171
Two dependent groups t test, 244–245
Two-group comparisons, 258–259
Two independent groups t test, 243–244
Two-way ANOVA, 253–257, 259
Two-way chi square tests, 260–264
Two-way (contingency table) tests, 260–264
Type I error, 246
Type II error, 246
Typical sampling, 188
U
Ullrich-French, S., 157
Ulrich, B. D., 66
United States Public Health Service, 168
Univariate statistics, 264–265
Univariate tests, 264
University of California-Berkeley, 125
University of California Internet Resources, 124
University of Georgia, 186
Unstructured interviews, 26, 53, 159
Unstructured questionnaires, 159
U.S. Army, 169
U.S. Department of Agriculture (USDA), 176
U.S. Department of Education, 180
U.S. Department of Health and Human Services (DHHS), 5, 10,
86, 171, 177, 179, 180
U.S. Department of Health, Education, and Welfare (HEW), 171
U.S. Environmental Protection Agency (EPA), 176
U.S. Food and Drug Administration (FDA), 171, 176
U.S. Government Principles for the Utilization and Care of
Vertebrate Animals Used in Testing, Research, and Training
(Interagency Research Animal Committee), 176
V
Validity, 27–28, 204, 213–218
convergent evidence, 216
criterion test, 215
defined, 151, 205
documenting, 205
establishing, 205
estimating, 205
external, 38, 40, 43–44
internal, 38–40, 43–44, 216
logical validity evidence, 214
processes, 216–217
reporting, 218
in summary, 43–44
test content, 214
testing, 217
threats to, 38–40
Validity coefficient, 217
van der Mars, H., 196, 197, 278
Vander Werff, A. R., 284, 285
Variability, 228–229
Variables, 109–112
control of, 112
defined, 109
dependent, 36–37, 109–110
excluding, 112
extraneous, 110–111
independent, 36, 109–110
mediator, 112
moderator, 111
qualitative, 109
quantitative, 109
Variance, 229
Variates, 45, 259
Veiga, J. F., 25
Verbal frequency scale, 156
Verification, 276
Vilkari, J., 105
Visker, T. L., 299n., 302n., 304n.
Vispoel, W. P., 25
Vogt, M., 291
W
Wagner, W. E., 223
Walker, G. J., 60
Wallnau, L. B., 134
Wang, J., 283, 284
Wansink, B., 25
Washburn, R. A., 83, 86
Weiler, R. M., 25
Weisberg, H. F., 33
Weiss, M. R., 66, 131n.
Welk, G. J., 104
Wells, C. L., 18
West, J. L., 25
West, S. G., 266
Weston, A. R., 133n.
Wickens, T., 44, 45, 255, 257–259
Wiersma, W., 24, 26, 33
Wiggins, D. K., 67
Wikipedia, 124
Williamson, S., 152
Wilmore, J. H., 86
Wilson, D. P., 144
Winer, B. J., 44, 45, 255, 257–259
Wolach, A., 258
Wood, K., 104
Wood, T. M., 214
Woods, A., 61
Working bibliography, 125
World Medical Association (WMA), 170
Wrynn, A., 67
Wüthrich, T. U., 107
WWW Virtual Library, 123
Wyatt, H. R., 106
Wye, L., 60
X
Xiang, P., 130n.
Y
Yahoo!, 123, 124
Yahoo Search, 124
Yang, X., 105
Y-intercept, 265
Youth Risk Behavior Surveillance System (YRBSS), 16, 104
Yu, H., 159, 194
Yu, L., 144
Z
Zhu, W., 23, 129n., 133n., 157, 208, 234, 258
Zieff, S. G., 67
Zijlstra, B. J., 161
Z-score, 230