Clarifying Measurement and Data Collection in Quantitative Research
Copyright © 2015, 2011, 2007, 2003, 1999, 1995 by Saunders, an imprint of Elsevier Inc. 10
What is an example of
nominal data?
Gender
1 = Male
2 = Female
Nominal-Level Measurement
Nominal-level measurement is the lowest of the four
measurement categories.
It is used when data can be organized into categories of a
defined property but the categories cannot be rank-ordered.
For example, you may decide to categorize potential study
subjects by diagnosis. However, the category “kidney stone,”
for example, cannot be rated higher than the category “gastric
ulcer”; similarly, across categories, “ovarian cyst” is no closer to
“kidney stone” than to “gastric ulcer.”
The categories differ in quality but not quantity. Therefore, it
is not possible to say that subject A possesses more of the
property being categorized than subject B.
(RULE: The categories must not be orderable.)
Categories must be established in such a way that each
datum will fit into only one of the categories.
(RULE: The categories must be exclusive.)
All the data must fit into the established categories.
(RULE: The categories must be exhaustive.)
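The exclusive and exhaustive rules can be checked mechanically. The following is a minimal sketch, not a standard procedure: the function name `check_nominal_rules`, the subject IDs, and the sample assignments are hypothetical and chosen only for illustration.

```python
# Hypothetical diagnosis categories, taken from the example in the text.
CATEGORIES = {"kidney stone", "gastric ulcer", "ovarian cyst"}


def check_nominal_rules(assignments, categories):
    """Return (exclusive, exhaustive) for a dict of subject -> list of labels.

    exclusive:  no subject is placed in more than one category.
    exhaustive: every subject has at least one label, and every label
                belongs to the established categories.
    """
    exclusive = all(len(labels) <= 1 for labels in assignments.values())
    exhaustive = all(
        len(labels) >= 1 and all(label in categories for label in labels)
        for labels in assignments.values()
    )
    return exclusive, exhaustive


# Hypothetical subjects, each assigned to exactly one diagnosis.
assignments = {
    "A": ["kidney stone"],
    "B": ["gastric ulcer"],
    "C": ["ovarian cyst"],
}
rules_ok = check_nominal_rules(assignments, CATEGORIES)
```

Note that the check never compares one category with another; nominal categories carry no order, so only membership is tested.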
Ordinal-Level Measurement
With ordinal-level measurement, data are assigned
to categories that can be ranked.
(RULE: The categories can be ranked.) To rank data,
one category is judged to be (or is ranked) higher or
lower, or better or worse, than another category.
Rules govern how the data are ranked. As with
nominal data, the categories must be exclusive
(each datum fits into only one category) and
exhaustive (all data fit into at least one category).
With ordinal data, the quantity also can be identified
(Stevens, 1946). For example, if you are measuring intensity
of pain, you may identify different levels of pain. You
probably will develop categories that rank these different
levels of pain, such as excruciating, severe, moderate, mild,
and no pain. However, in using categories of ordinal
measurement, you cannot know with certainty that the
intervals between the ranked categories are equal. A
greater difference may exist between mild and moderate
pain, for example, than between excruciating and severe
pain. Therefore ordinal data are considered to have
unequal intervals.
Many scales used in nursing research are ordinal levels
of measurement. For example, it is possible to rank
degrees of coping, levels of mobility, ability to provide
self-care, or levels of dyspnea on an ordinal scale. For
dyspnea with activities of daily living (ADLs), the scale
could be:
0-no shortness of breath with ADLs
1-minimal shortness of breath with ADLs
2-moderate shortness of breath with ADLs
3-extreme shortness of breath with ADLs
4-shortness of breath so severe the person is unable to
perform ADLs without assistance
The measurement is ordinal because it is not
possible to claim that equal distances exist
between the rankings. A greater difference
may exist between the ranks of 1 and 2 than
between the ranks of 2 and 3.
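As an illustration, the dyspnea scale above can be coded so that ranks are compared but the distances between them are never used. This is a sketch under assumptions: the ratings are made up, and the median is shown because it depends only on rank order, not because the source prescribes it.

```python
from statistics import median

# The dyspnea-with-ADLs scale from the text, keyed by rank.
DYSPNEA_SCALE = {
    0: "no shortness of breath with ADLs",
    1: "minimal shortness of breath with ADLs",
    2: "moderate shortness of breath with ADLs",
    3: "extreme shortness of breath with ADLs",
    4: "unable to perform ADLs without assistance",
}

# Hypothetical ratings for five subjects.
ratings = [0, 1, 1, 2, 4]

# Ranking is meaningful: a rating of 4 is worse than a rating of 0.
last_is_worse = ratings[4] > ratings[0]

# The median uses only the order of the ranks, so it does not assume
# that the step from 1 to 2 equals the step from 2 to 3.
median_rank = median(ratings)
median_label = DYSPNEA_SCALE[median_rank]
```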
What is an interval scale?
Numerical distances between intervals
Absence of a zero point
Likert scale scores
1 = Strongly disagree
2 = Disagree
3 = Neutral
4 = Agree
5 = Strongly agree
Interval-Level Measurement
What is an example of ratio data?
Test scores
1 = Lowest third percentile
2 = Middle third percentile
3 = Top third percentile
Ratio-Level Measurement
Ratio-level measurement is the highest form of
measurement and meets all the rules of other forms
of measurement—mutually exclusive categories,
exhaustive categories, ordered ranks, equally
spaced intervals, and a continuum of values.
Interval- and ratio-level data can be added,
subtracted, multiplied, and divided because of the
equal intervals and continuum of values of these
data.
Interval and ratio data can be analyzed with statistical
techniques of greater precision and strength to
determine significant relationships and differences.
Ratio-level measures have absolute zero points. (RULE:
The data must have an absolute zero.) Weight, length, and
volume are commonly used as examples of ratio scales.
All three have absolute zeros, at which a value of zero
indicates the absence of the property being measured;
zero weight means the absence of weight.
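The rules stated for the four levels build on one another, which can be sketched as a small lookup. The names `LEVELS`, `ADDED_RULES`, and `rules_for` are hypothetical, and the rule wording is paraphrased from the text above rather than quoted from a summary table.

```python
# Levels of measurement, ordered from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Rules that each level adds on top of the levels below it.
ADDED_RULES = {
    "nominal": ["exclusive categories", "exhaustive categories"],
    "ordinal": ["ranked categories"],
    "interval": ["equal intervals", "continuum of values"],
    "ratio": ["absolute zero"],
}


def rules_for(level):
    """Return every rule that data at the given level must satisfy."""
    rules = []
    for name in LEVELS:
        rules.extend(ADDED_RULES[name])
        if name == level:
            return rules
    raise ValueError(f"unknown level of measurement: {level}")


ratio_rules = rules_for("ratio")
```

For example, `rules_for("ordinal")` returns the two nominal rules plus "ranked categories", while only `rules_for("ratio")` includes "absolute zero".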
Summary of the Rules for Levels of
Measurement
Nominal Ordinal Interval Ratio
A. Ordinal
B. Interval
C. Nominal
D. Ratio
What is reference measurement?
Norm-referenced testing
Tests performance standards that have been
carefully developed over years with large,
representative samples using a standardized
test with extensive reliability and validity
Measurement Error
Measurement error is the difference between the true
measure and what is actually measured (Grove, Burns, &
Gray, 2013).
The amount of error in a measure varies from considerable
error in one measurement to very little in another.
Measurement error exists with direct and indirect
measures.
With direct measures, both the object and measurement
method are visible. Direct measures, which generally are
expected to be highly accurate, are subject to error. For
example, a weight scale may be inaccurate by 0.5 pound.
With indirect measures, the element being
measured cannot be seen directly. For example,
you cannot see pain. You may observe behaviors or
hear words that you think represent pain, but pain
is a sensation that is not always clearly recognized
or expressed by the person experiencing it.
The measurement of pain is usually conducted
with a scale but can also include observation and
physiological measures.
Efforts to measure concepts such as pain usually result
in measuring only part of the concept. Sometimes
measures may identify some aspects of the concept but
may include other elements that are not part of the
concept. For example, measurement methods for pain
might be measuring aspects of anxiety and fear in
addition to pain. However, using multiple methods to
measure a concept or variable usually decreases the
measurement error and increases the understanding
of the concept being measured.
Types of error:
Random error
Systematic error
The difference between random and systematic
error is in the direction of the error.
In random measurement error, the difference
between the measured value and the true value is
without pattern or direction (random).
In one measurement, the actual value obtained
may be lower than the true value, whereas in the
next measurement, the actual value obtained may
be higher than the true value.
A number of chance situations or factors can
occur during the measurement process that
can result in random error. For example, the
person taking the measurements may not
use the same procedure every time, a subject
completing a paper and pencil scale may
accidentally mark the wrong column, or the
person entering the data into a computer
may punch the wrong key.
The purpose of measuring is to estimate the
true value, usually by combining a number of
values and calculating an average. An
average value, such as the mean, is a closer
estimate of the true measurement. As the
number of random errors increases, the
precision of the estimate decreases.
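A small simulation can make the role of the average concrete. This is a sketch under assumptions: the true weight, the spread of the error, and the use of a normal distribution to model random error are all hypothetical choices, not values from the source.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

TRUE_WEIGHT = 150.0  # hypothetical true value, in pounds
ERROR_SD = 1.0       # hypothetical spread of the random error, in pounds

# Each reading is the true value plus an error with no pattern or direction.
readings = [TRUE_WEIGHT + random.gauss(0, ERROR_SD) for _ in range(1000)]
errors = [reading - TRUE_WEIGHT for reading in readings]

# Some readings fall below the true value and some above it.
has_low_readings = any(error < 0 for error in errors)
has_high_readings = any(error > 0 for error in errors)

# Combining the readings into a mean gives the estimate of the true value.
estimate = mean(readings)
```

Because the errors fall on both sides of the true value, they largely cancel in the mean, so `estimate` lands much closer to `TRUE_WEIGHT` than a typical single reading does.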
In systematic measurement error, the variation in
measurement values from the calculated average is
primarily in the same direction. For example, most of
the variation may be higher or lower than the
average that was calculated.
Systematic error occurs because something else is
being measured in addition to the concept. For
example, a paper and pencil rating scale designed to
measure hope may actually also be measuring
perceived support.
When measuring subjects’ weights, a scale that shows
weights that are 2 pounds over the true weights will
give measures with systematic error. All the measured
weights will be high, and as a result the mean will be
higher than if an accurate weight scale were used.
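The weight scale example can be worked through in a few lines. The subject weights below are hypothetical; only the 2-pound bias comes from the text.

```python
from statistics import mean

BIAS = 2.0  # the scale reads 2 pounds over the true weight

# Hypothetical true weights for three subjects, in pounds.
true_weights = [120.0, 150.0, 180.0]

# Every measurement is shifted in the same direction by the same amount.
measured_weights = [weight + BIAS for weight in true_weights]

# The shift carries through to the mean instead of averaging out.
mean_shift = mean(measured_weights) - mean(true_weights)
```

Here `mean_shift` equals the bias, which is why taking more readings on the same scale does not correct a systematic error.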
Some systematic error occurs in almost any measure.
Because of the importance of this type of error in a
study, researchers spend considerable time and effort
refining their instruments to minimize systematic
measurement error (Waltz et al., 2010).
The measurement errors for BP readings can
be minimized by checking the BP cuff and
sphygmomanometer for accuracy and
recalibrating them periodically during data
collection, obtaining three BP readings and
averaging them to determine one BP reading
for each subject, and having a trained nurse
using a protocol to take the BP readings.
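The averaging step for BP readings might be sketched as follows. The function name `average_bp`, the tuple layout, and the sample readings are hypothetical; a real study protocol would also specify timing, positioning, and rounding.

```python
from statistics import mean


def average_bp(readings):
    """Average three (systolic, diastolic) readings into one BP value.

    Raises ValueError unless exactly three readings are supplied,
    mirroring the three-reading procedure described in the text.
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three BP readings")
    systolic = mean([s for s, _ in readings])
    diastolic = mean([d for _, d in readings])
    return systolic, diastolic


# Hypothetical readings for one subject, in mm Hg.
subject_bp = average_bp([(118, 76), (122, 80), (120, 78)])
```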
If a checklist of pain behaviors is developed for
observation, less error occurs than if the observations
for pain are unstructured. Measurement will also be
more precise if researchers use a well-developed,
reliable, and valid scale, such as the FACES Pain Scale,
instead of developing a new pain scale for their study.
In published studies, look for the steps that
researchers have taken to decrease measurement
error and increase the quality of their study findings.
Audience Response Question
Reliability
DETERMINING THE QUALITY OF MEASUREMENT METHODS
Quality indicator: Validity
Description: Evidence of validity from convergence: Two scales measuring the same
concept are administered to a group at the same time, and the subjects’ scores
on the scales should be positively correlated. For example, subjects completing
two scales to measure depression should have positively correlated scores.
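Checking for a positive correlation between two scales can be sketched with a hand-written Pearson correlation. The scores below are invented for illustration, and Pearson's r is one common choice of coefficient; the source does not name a specific statistic.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y):
        raise ValueError("both scales need one score per subject")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations.
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


# Hypothetical scores for five subjects on two depression scales.
scale_a = [10, 14, 18, 22, 30]
scale_b = [12, 15, 21, 24, 33]

r = pearson_r(scale_a, scale_b)
```

A value of `r` well above zero would count as evidence of validity from convergence; a value near zero or below would not.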
Evaluation responses ask the respondent for
an evaluative rating along a bad-good
dimension, such as negative to positive or
terrible to excellent.
Frequency responses may include statements
such as never, rarely, sometimes,
frequently, and all the time. The terms used
are versatile and are selected based on the
content of the questions or items in the scale.
Sometimes seven options are given on a response scale,
sometimes only four.
When the response scale has an odd number of options,
the middle option is usually an uncertain or neutral
category.
Using a response scale with an odd number of options is
controversial because it allows the subject to avoid making
a clear choice of positive or negative statements.
To avoid this, researchers may choose to provide only four
or six options, with no middle point or uncertain category.
This type of scale is referred to as a forced choice version.
A Likert scale usually consists of 10 to 20 items,
each addressing an element of the concept being
measured.
Usually, the values obtained from each item in
the instrument are summed to obtain a single
score for each subject.
Although the values of each item are technically
ordinal-level data, the summed score is often
analyzed as interval-level data.
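Summing item values into one score per subject can be sketched as below. The function name `likert_total`, the five-option response range, and the sample responses are assumptions for illustration; actual instruments define their own scoring rules.

```python
def likert_total(item_responses, n_options=5):
    """Sum one subject's item responses into a single scale score.

    Each response must be a whole number from 1 to n_options.
    """
    for value in item_responses:
        if value not in range(1, n_options + 1):
            raise ValueError(f"response {value!r} is outside 1..{n_options}")
    return sum(item_responses)


# Hypothetical responses from one subject on a 10-item scale.
responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 4]
total_score = likert_total(responses)
```

With 10 items scored 1 to 5, the summed score can range from 10 to 50.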
Visual Analog Scales
These end anchors must include the entire range of
sensations possible for the phenomenon being
measured (e.g., all and none, best and worst, no
pain and most severe pain possible).
Subjects are asked to place a mark through the line
to indicate the intensity of the sensation or feeling.
Then researchers use a ruler to measure the
distance between the left end of the line (on a
horizontal scale) and the subject’s mark. This
measure is the value of the sensation.
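The ruler step can be expressed as a short calculation. This sketch assumes a 100 mm horizontal line and positions measured in millimeters from the edge of the page; both are assumptions for illustration, not details given in the source.

```python
LINE_LENGTH_MM = 100.0  # assumed line length


def vas_score(left_end_mm, mark_mm, line_length_mm=LINE_LENGTH_MM):
    """Distance from the left end of the line to the subject's mark."""
    distance = mark_mm - left_end_mm
    if not 0 <= distance <= line_length_mm:
        raise ValueError("mark falls outside the line")
    return distance


# Hypothetical sheet: the line starts 20 mm from the page edge
# and the subject's mark crosses it at 83 mm.
score = vas_score(left_end_mm=20.0, mark_mm=83.0)
```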
The VAS has been used to measure pain, mood,
anxiety, alertness, craving for cigarettes, quality of
sleep, attitudes toward environmental conditions,
functional abilities, and severity of clinical
symptoms.
The reliability of the VAS is usually determined by
the test-retest method. The correlations between
the two administrations of the scale need to be
moderate or strong to support the reliability of the
scale.
Because these scales are used to measure
phenomena that are dynamic or erratic over time,
test-retest reliability is sometimes not
appropriate, and the low correlation is then
caused by the change in sensation versus a
problem with the scale.
The validity of the VAS is usually determined by
correlating the VAS scores with other measures,
such as rating or Likert scales, that measure the
same phenomenon, such as pain.
CRITICAL APPRAISAL GUIDELINES Scales