
CHAPTER 3
Key Principles Underlying Statistical Inference: Probability and the Normal Distribution

OBJECTIVES
After studying this chapter, you should be able to:
1. Explain the importance of probability theory for statistical inference.
2. Define the characteristics of a probability measure, and explain the difference between a theoretical probability distribution and an empirical probability distribution (a priori vs. a posteriori).
3. Compute marginal, joint, and conditional probabilities from a cross-tabulation table and correctly interpret their meaning.
4. Define and derive sensitivity, specificity, predictive value, and efficiency from a cross-tabulation table.
5. Identify and describe the characteristics of a normal distribution.
6. Use a standard normal distribution to obtain z-scores and percentiles.
7. Explain the importance of the central limit theorem.

FUNDAMENTAL CONCEPTS IN RESEARCH

One of the main objectives of research is to draw meaningful conclusions about a population, based on data collected from a sample of that population. Sometimes researchers focus on what a sample can tell us about a whole population, sometimes they focus on how a sample compares with the whole population, and at other times they make comparisons between different groups. Researchers may also compare measurements on the same group that are taken over time (e.g., to test how a weight-loss program is performing). In all these cases, researchers use statistical inference as their tool for obtaining information from a sample of data about the population from which the sample is drawn. Statistical inference uses probability to help researchers assess the meaning of their findings. A particularly important probability distribution for statistical inference in health-related research is the normal (or Gaussian) distribution. This chapter focuses on the fundamental concepts of probability and the normal distribution, both of which are crucial to understanding the statistical techniques contained in the subsequent chapters of this book.

Estimating Population Probabilities Using Research

One example of using statistical inference to draw conclusions about a population from a sample comes to us from a recent population-based study of an urban area using the New York City Health and Nutrition Examination Survey (NYC HANES). In this study, the authors found that the prevalence of diabetes (diagnosed and undiagnosed combined) among adults aged 20 and above was 12.5% (Thorpe et al., 2009). Population-based studies have also been used to estimate the baseline prevalence of an exposure or disease. For example, in order to determine the efficacy of the human papillomavirus (HPV) vaccine, the baseline population prevalence of HPV must be determined. Using biological samples collected as part of the 2003 to 2004 NHANES cycle, it was estimated that the overall prevalence of HPV was 28.6% among women aged 14 to 59 years (Dunne et al., 2007).

PROBABILITY: THE MATHEMATICS THAT UNDERLIES STATISTICS

An understanding of probability is critical to an understanding of statistical inference. Competency in probability is needed to comprehend statistical significance (e.g., interpreting p-values), read cross-tabulation tables, and understand frequency distributions, all of which are used extensively in health care research (see Chapters 2 and 10). In particular, correctly reading cross-tabulation tables (commonly referred to as "cross-tabs") is a critical skill for researchers and for clinicians and administrators who need to understand the research literature. Using cross-tabs requires an understanding of joint, conditional, and marginal probabilities. Thus, after a brief discussion of definitions and concepts necessary to an understanding of probability, we will illustrate the principles of probability with examples from cross-tabulation tables.

Defining Probability

The general concept of objective probability can be categorized under two areas: a priori (theoretical or classical) probability and a posteriori (empirical or relative frequency) probability (Daniel, 2008; Mood, Graybill, & Boes, 1974). In theoretical probability, the distribution of events can be inferred without collecting data. For example, we can compute the probability of getting "heads" or "tails" on a coin flip without actually flipping the coin. In empirical probability, data must be collected by some process, and the probability that each event will occur must be estimated from the data. In health care research, empirical probability is used when reporting characteristics of a sample (e.g., 35% of the sample was female), and classical probability (e.g., theoretical probability distributions) is used when making statistical inferences about the data.

Probability provides a numerical measure of uncertainty by giving a precise measurement of the likelihood that an event will occur. An event can be as simple as a single outcome (e.g., one card is picked out of a deck of cards), or it can be composed of a set of outcomes (e.g., five cards are picked out of a deck of cards). It can be an event from which results are inferred (e.g., a coin flip) or an event for which data need to be collected (e.g., the percentage of premature births in a hospital). Events that are uncertain are those that may or may not occur. For example, there is a small chance that a lottery ticket will be a winner (i.e., the "event"), but there is a much larger chance that it will not be a winner. People who purchase lottery tickets are essentially willing to pay for the uncertainty (the probability) of winning.

Several definitions, notations, and formulas that are used throughout this chapter are useful for an understanding of probability (Table 3-1). It is particularly critical to understand two of these ideas, sample space and probability distribution, when using statistics. Simply put, the sample space is the set of all possible outcomes of a study. For example, if a coin is flipped, the sample space has two possible outcomes: heads and tails; if a six-sided die is rolled, the sample space has six possible outcomes. Similarly, the sample space for gender has two outcomes: female and male.

A probability distribution is the set of probabilities associated with each possible outcome in the sample space. The probability distribution of a variable can be expressed as a table, graph, or formula; the key is that it specifies all the possible values of a random variable and the respective probabilities (Daniel, 2008). The probability distributions computed in Chapter 2 are examples of empirical probability distributions. The probability distributions used in inferential statistics (e.g., the normal distribution, binomial distribution, chi-square distribution, and Student's t distribution) are examples of theoretical probability distributions.

Probability theory is based on the three axioms stated by Kolmogorov (1956). These axioms are illustrated by examples in the next section:
1. The probability that each event will occur must be greater than or equal to 0 and less than or equal to 1.
2. The sum of the probabilities of all the mutually exclusive outcomes of the sample space is equal to 1. Mutually exclusive outcomes are those that cannot occur at the same time (e.g., on any given flip, a coin can be either heads or tails but not both).
3. The probability that either of two mutually exclusive events, A or B, will occur is the sum of their individual probabilities.
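These axioms can be checked directly for any probability distribution. The short Python sketch below (an illustration added here, not part of the original text; the variable names are arbitrary) verifies all three for the theoretical distribution of a fair six-sided die:

    # Theoretical probability distribution for a fair six-sided die.
    die = {face: 1 / 6 for face in range(1, 7)}

    # Axiom 1: every probability lies between 0 and 1.
    assert all(0 <= p <= 1 for p in die.values())

    # Axiom 2: the probabilities of all mutually exclusive outcomes sum to 1.
    assert abs(sum(die.values()) - 1.0) < 1e-9

    # Axiom 3: for mutually exclusive events, p(A or B) = p(A) + p(B).
    p_even = sum(die[f] for f in (2, 4, 6))            # event A: an even roll
    p_one = die[1]                                     # event B: rolling a 1
    p_even_or_one = sum(die[f] for f in (1, 2, 4, 6))
    assert abs(p_even_or_one - (p_even + p_one)) < 1e-9
    print(round(p_even_or_one, 4))                     # 0.6667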
Table 3-1  PROBABILITY SYMBOLS AND DEFINITIONS

Symbol or Term: Meaning
Sample space: The set of all possible outcomes of a study
Probability distribution: The set of probabilities associated with each event in the sample space
p(A): The marginal probability that event A will occur
p(not A): The probability that event A will not occur
p(A|B): The conditional probability that event A will occur if event B occurs
p(A∩B): The joint probability that both events A and B will occur; also called the intersection of A and B
p(A∪B): The probability that event A will happen and/or event B will happen; also called the union of A and B
Addition rule: p(A∪B) = p(A) + p(B) − p(A∩B)
Multiplication rule: p(A∩B) = p(A) × p(B|A)
Independence of events A and B: If p(A) = p(A|B), then A and B are independent

Using Probability: Interpreting a Cross-Tabulation Table

In this section, we illustrate the different types of probability using a cross-tabulation table from a study of emergency department (ED) services for sexual assault victims in Virginia (Table 3-2) (Plichta, Vandecar-Burdin, Odor, Reams, & Zhang, 2007). This study examined the question: Does having a forensic nurse trained to assist victims of sexual violence on staff in an ED affect the probability that a hospital will have a relationship with a rape crisis center? Some experts thought that having a forensic nurse on staff might actually reduce the chance that an ED would have a connection with a rape crisis center. The two variables of interest here are "forensic nurse on staff" (yes/no) and "relationship with rape crisis center" (yes/no). A total of 53 EDs provided appropriate responses to these two questions, and it turned out that 33 EDs had a forensic nurse on staff and 41 EDs had a relationship with a rape crisis center.

Table 3-2  CROSS-TABULATION TABLE: FORENSIC NURSE ON EMERGENCY DEPARTMENT STAFF BY EMERGENCY DEPARTMENT LINKAGE TO A RAPE CRISIS CENTER
Marginal Probability

The marginal probability is simply the number of times the event occurred divided by the total number of times that it could have occurred. When using relative frequency probability, the probability of an event is the number of times the event occurred divided by the total number of trials. This is expressed mathematically as

p(A) = #Times_A_occurs / N   (3-1)

where N is the total number of trials. In health care research, the number of trials (N) is typically the number of subjects in the study. "Subjects" may refer to individual human beings, individual institutions (e.g., EDs), or even individual laboratory samples.

First, the simple probability, also called the marginal probability, of each of the two variables is computed. The probability that the ED will have a forensic nurse on staff is

p(A) = 33/53 = .6226

In other words, 62.26% of the EDs have a forensic nurse on staff. The probability of not having a forensic nurse on staff, p(not A), is

p(not A) = 20/53 = .3774

Because having a nurse on staff and not having a nurse on staff are mutually exclusive and exhaustive events, their probabilities add up to 1 (.6226 + .3774 = 1.0). Similarly, the probability that the ED will have a relationship with a rape crisis center is

p(B) = 41/53 = .7736

In other words, 77.36% of the hospitals have such a relationship, and the probability that the ED will not have a relationship with a rape crisis center is

p(not B) = 12/53 = .2264
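These marginal probabilities are easy to reproduce in code. The following Python sketch is an added illustration; the four cell counts are those implied by the totals and probabilities reported in this section and the ones that follow, and the variable names are hypothetical:

    # Cross-tabulation counts for the 53 EDs: 29 have a forensic nurse and a
    # rape-crisis-center relationship, 4 have a nurse only, 12 have a
    # relationship only, and 8 have neither.
    nurse_rel, nurse_only, rel_only, neither = 29, 4, 12, 8
    n = nurse_rel + nurse_only + rel_only + neither      # 53

    p_nurse = (nurse_rel + nurse_only) / n               # p(A)     = 33/53
    p_no_nurse = (rel_only + neither) / n                # p(not A) = 20/53
    p_rel = (nurse_rel + rel_only) / n                   # p(B)     = 41/53
    p_no_rel = (nurse_only + neither) / n                # p(not B) = 12/53
    print(round(p_nurse, 4), round(p_no_nurse, 4), round(p_rel, 4), round(p_no_rel, 4))
    # 0.6226 0.3774 0.7736 0.2264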
Conditional Probability

Conditional probability is the probability that one event will occur given that another event has occurred. In mathematical notation, conditional probability is written as p(B|A), the probability of event B occurring given that event A has occurred. In practice, using conditional probabilities means that only a subset of the data is being studied. It is very important to use the correct denominator when computing conditional probability.

Conditional probabilities are often compared in cross-tabulation tables. For example, in the ED study, the research question is: Does having a forensic nurse on staff affect the probability that the ED will have a relationship with the local rape crisis center? To answer this question, we need to compare two conditional probabilities: (a) given that the ED does not have a forensic nurse on staff, what is the probability that the ED will have a relationship with a rape crisis center, and (b) given that the ED does have a forensic nurse on staff, what is the probability that the ED will have a relationship with a rape crisis center?

The correct denominator for the first probability is 20 because 20 EDs reported no forensic nurse on staff, and the correct denominator for the second probability is 33 because 33 EDs reported that they do have a forensic nurse on staff. The conditional probabilities are computed as follows:

p(relationship with rape crisis center | no forensic nurse) = p(B|not A) = 12/20 = .6000

p(relationship with rape crisis center | forensic nurse) = p(B|A) = 29/33 = .8788

In other words, 60% of the EDs without a forensic nurse on staff have a relationship with a rape crisis center compared with 87.88% of EDs with a forensic nurse on staff. Although it certainly looks like the presence of a forensic nurse increases the chances of having such a relationship, inferential statistics (in this case, a chi-square test, discussed in Chapter 10) must be used to see whether this difference is statistically significant or simply attributable to chance.
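The same comparison can be carried out in code by restricting the denominator to the conditioning subset. A minimal Python sketch, continuing the illustrative counts used above:

    # Conditional probabilities: the denominator is the subset being conditioned on.
    nurse_rel, nurse_only = 29, 4     # EDs with a forensic nurse on staff (33 total)
    rel_only, neither = 12, 8         # EDs without a forensic nurse on staff (20 total)

    p_rel_given_no_nurse = rel_only / (rel_only + neither)      # 12/20 = 0.6
    p_rel_given_nurse = nurse_rel / (nurse_rel + nurse_only)    # 29/33 ~ 0.8788
    print(round(p_rel_given_no_nurse, 4), round(p_rel_given_nurse, 4))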
Joint Probability

Joint probability is the co-occurrence of two or more events. The key to understanding joint probability is to know that the words "both" and "and" are usually involved. For example, if the research question asked about the probability of an ED in our sample having both a forensic nurse on staff and a relationship with a rape crisis center, a joint probability would be computed. In mathematical notation, this probability is written as

p(A∩B)   (3-2)

In this example, the probability would be computed as

p(forensic nurse on staff ∩ relationship with rape crisis center) = p(A∩B) = 29/53 = .547

In other words, 54.7% of the EDs have both a forensic nurse on staff and a relationship with a local rape crisis center. In this case, the denominator is the entire sample and the numerator is the number of EDs with both conditions.
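In code, the joint probability uses the whole sample as the denominator and the count of EDs meeting both conditions as the numerator (a sketch using the same illustrative counts):

    # Joint probability: both conditions together, out of the entire sample.
    nurse_and_rel = 29
    n = 53

    p_joint = nurse_and_rel / n
    print(round(p_joint, 4))   # 0.5472, reported in the text as .547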
Addition Rule

The addition rule is used to compute the probability that either one of two events will occur; this means that one, the other, or both will occur. Usually, the term and/or indicates this type of probability. For example, if one wanted to know how many hospitals had a forensic nurse on staff and/or a relationship with a rape crisis center, the addition rule would be used. The general rule is expressed mathematically as shown in Equation 3-3:

p(A∪B) = p(A) + p(B) − p(A∩B)   (3-3)

The reason that the joint probability is subtracted is that if the events A and B are not mutually exclusive (i.e., they have some overlap), the probability of the overlap would otherwise be added twice. In the example above, the two marginal probabilities and the joint probability of these two outcomes are used to compute the probability of either occurring:

p(A) = p(forensic_nurse_on_staff) = .6226

p(B) = p(relationship_with_rape_crisis_center) = .7736

p(A∩B) = p(forensic_nurse ∩ rape_crisis_center) = .5471

Thus,

p(A∪B) = .6226 + .7736 − .5471 = .8491

In other words, 84.9% of the EDs have a forensic nurse on staff and/or a relationship with a rape crisis center.

Another version of the addition rule is useful when computing the probability that either of two mutually exclusive events will occur. When two events are mutually exclusive, they never occur together, and thus, their joint probability is 0. Therefore, the addition rule for mutually exclusive events is reduced to

p(A∪B) = p(A) + p(B)   (3-4)

since p(A∩B) = 0.
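A short Python sketch of the addition rule, reusing the probabilities above (added for illustration only):

    # General addition rule: p(A or B) = p(A) + p(B) - p(A and B)
    p_nurse = 33 / 53       # p(A)
    p_rel = 41 / 53         # p(B)
    p_both = 29 / 53        # p(A and B)

    p_either = p_nurse + p_rel - p_both
    print(round(p_either, 4))   # 0.8491 -> 84.9% of EDs

    # Special case for mutually exclusive events (their joint probability is 0):
    p_heads, p_tails = 0.5, 0.5
    print(p_heads + p_tails)    # 1.0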
Multiplication Rule

The multiplication rule in probability allows certain types of probabilities to be computed from other probabilities. This is particularly useful when only the probabilities that the events will occur, not the raw data, are available. These probabilities can be used to compute joint probabilities. The general multiplication rule is

p(A∩B) = p(A) × p(B|A)   (3-5)

For example, if only marginal and conditional probabilities from the ED study were available, the joint probability could be computed as shown in Equation 3-5 using the marginal and conditional probabilities as such:

p(A) = p(forensic_nurse_on_staff) = .6226

p(B|A) = p(relationship_with_rape_crisis_center | forensic_nurse_on_staff) = .8788

p(A∩B) = .6226 × .8788 = .5471

This is the same result achieved when the joint probability was computed directly from the table (29/53).
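The rule is a one-line computation once the marginal and conditional probabilities are known (an illustrative Python sketch):

    # Multiplication rule: p(A and B) = p(A) * p(B|A), using only reported probabilities.
    p_nurse = 0.6226               # p(A)
    p_rel_given_nurse = 0.8788     # p(B|A)

    p_both = p_nurse * p_rel_given_nurse
    print(round(p_both, 4))        # 0.5471, matching the joint probability from the table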
Independent Events

Two events are independent when the occurrence of either one does not change the probability that the other will occur. In mathematical terms, this is defined by saying that events A and B are independent if p(A|B) = p(A). In this case, the multiplication rule reduces to

p(A∩B) = p(A) × p(B)   (3-6)

It is important to understand that independent events are not mutually exclusive events; they can and do co-occur. Mutually exclusive events are not independent insofar as the occurrence of one depends on the other one's not occurring.

An example of two independent events is rolling a 3 on a die and flipping a tail on a coin. Rolling a 3 does not affect the probability of flipping the tail. If events are independent, then the joint probability that both will occur is simply the product of the probabilities that each will occur. This is written as

p(A∩B) = p(A) × p(B)   (3-7)

which means the same as

p(A and B) = p(A) × p(B)   (3-8)

For example, the probability of rolling a 3 on a die is .17, and the probability of flipping a tail on a coin is .5. Then the probability of both rolling a 3 on the die and flipping a tail on the coin is .17 × .5 = .085.
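For independent events the joint probability is just the product of the two marginal probabilities, as the following sketch shows (note that the .085 in the text comes from rounding 1/6 to .17):

    # Independence: p(A and B) = p(A) * p(B)
    p_roll_3 = 1 / 6          # ~ .17
    p_tail = 1 / 2            # .5

    p_both = p_roll_3 * p_tail
    print(round(p_both, 3))   # 0.083 (~ .085 when 1/6 is rounded to .17)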
Sensitivity, Specificity, Predictive Value, and Efficiency

Cross-tabulation tables can also provide us with information on the quality of diagnostic tests. Clinicians routinely order tests to screen patients for the presence or absence of disease. There are four possible outcomes to diagnosing and testing a particular patient: True Positive (TP), where the screening test correctly identifies someone who has the disease as having the disease (i.e., both diagnosis and screening test are positive for the disease); True Negative (TN), where the screening test correctly identifies someone who does not have the disease as not having the disease (i.e., both diagnosis and screening test are negative); False Positive (FP), where the screening test incorrectly identifies someone who does not have the disease as having the disease (i.e., the diagnosis is negative and the screening test is positive for the disease); and False Negative (FN), where the screening test incorrectly classifies someone as not having the disease when, in fact, that person does have the disease (i.e., the diagnosis is positive for the disease and the screening test is negative for it) (Kraemer, 1992).

Essex-Sorlie (1995) notes that a type I error resembles an FP outcome, occurring when a screening test result incorrectly indicates disease presence. A type II error is comparable to an FN outcome, in which a screening test result incorrectly points to disease absence. The following 2 × 2 table is often used as a way to depict the relationship between the various outcomes.

                        DIAGNOSIS
Screening test          Condition Present       Condition Absent
Test Positive           True Positive (TP)      False Positive (FP)
Test Negative           False Negative (FN)     True Negative (TN)

The terms used to define the clinical performance of a screening test are sensitivity, specificity, positive predictive value, negative predictive value, and efficiency. Test sensitivity (Sn) is defined as the probability that the test is positive when given to a group of patients who have the disease. It is determined by the formula

Sn = (TP/[TP + FN]) × 100   (3-9)

and is expressed as the percentage of those with the disease who are correctly identified as having the disease. In other words, sensitivity can be viewed as 1 − the false negative rate, expressed as a percentage.

For example, Harvey, Roth, Yarnold, Durham, and Green (1992) undertook a study to assess the use of plasma D-dimer levels for diagnosing deep venous thrombosis (DVT) in 105 patients hospitalized for stroke rehabilitation. Plasma samples were drawn from patients within 24 hours of a venous ultrasound screening for DVT. Of the 105 patients in the study, 14 had DVTs identified by ultrasound. The optimal cutoff for predicting DVT was a D-dimer level of greater than 1,591 ng/ml. Test results showed the following:

                           Positive Ultrasound    Negative Ultrasound
D-Dimer > 1,591 ng/ml      13 (TP)                19 (FP)
D-Dimer ≤ 1,591 ng/ml      1 (FN)                 72 (TN)
Total                      14 (TP + FN)           91 (FP + TN)

Using the above formula, Sn = (TP/[TP + FN]) × 100 = 13/14 × 100 = 93, so the sensitivity of the D-dimer test for diagnosing DVTs is 93%. The larger the sensitivity, the better the test is at detecting the disease when it is present: the D-dimer test correctly identifies 93% of the patients who actually have a DVT.

The specificity (Sp) of a screening test is defined as the probability that the test will be negative among patients who do not have the disease. Its formula is

Sp = (TN/[TN + FP]) × 100   (3-10)

and can be understood as 1 − the FP rate, expressed as a percentage.

In the same example, the specificity of the D-dimer test was 79% (Sp = (72/[72 + 19]) × 100 = 72/91 × 100 = 79). A large Sp means that a negative screening test result can rule out the disease. The D-dimer test's specificity of 79% indicates that the test is fairly good at ruling out the presence of DVTs in rehabilitation stroke patients: 79% of those who do not have DVTs are correctly identified as not having DVTs on the D-dimer screening test.

The positive predictive value (PPV) of a test is the probability that a patient who tested positive for the disease actually has the disease. The formula for PPV is

PPV = (TP/[TP + FP]) × 100   (3-11)

Again using the D-dimer test for predicting DVT, its PPV is calculated as PPV = (13/[13 + 19]) × 100 = 13/32 × 100 = 40.6, or 41%. This means that only 41 out of every 100 people who test positive for DVTs on the D-dimer screening test actually have DVTs, and 59 out of 100 of those who test positive do not actually have DVTs (they are false positives).

The negative predictive value (NPV) of a test is the probability that a patient who tested negative for a disease really does not have the disease. It is calculated as

NPV = (TN/[TN + FN]) × 100   (3-12)

Using this formula in the above D-dimer test example, NPV = (72/[72 + 1]) × 100 = 72/73 × 100 = 98.6, or 99%. This value indicates that 99 out of 100 patients who screen negative for DVTs on the D-dimer test truly do not have DVTs. Thus, the D-dimer test is outstanding at ruling out DVTs in rehabilitation stroke patients who test negative for their presence.

The efficiency (EFF) of a test is the probability that the test result and the diagnosis agree (Kraemer, 1992) and is calculated as

EFF = ([TP + TN]/[TP + TN + FP + FN]) × 100   (3-13)

In the D-dimer test example, EFF = ([13 + 72]/[13 + 72 + 19 + 1]) × 100 = 85/105 × 100 = 80.9. Thus, the efficiency of the D-dimer test in diagnosing rehabilitation stroke patients with DVTs is almost 81%.
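All five measures follow directly from the four cell counts. A compact Python sketch using the D-dimer counts from the table above (an added illustration):

    # Screening-test quality measures from the D-dimer example.
    TP, FP, FN, TN = 13, 19, 1, 72

    sensitivity = TP / (TP + FN) * 100                   # 13/14  ~ 92.9
    specificity = TN / (TN + FP) * 100                   # 72/91  ~ 79.1
    ppv = TP / (TP + FP) * 100                           # 13/32  ~ 40.6
    npv = TN / (TN + FN) * 100                           # 72/73  ~ 98.6
    efficiency = (TP + TN) / (TP + TN + FP + FN) * 100   # 85/105 ~ 81.0
    print(round(sensitivity, 1), round(specificity, 1), round(ppv, 1),
          round(npv, 1), round(efficiency, 1))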
NORMAL DISTRIBUTION

The normal distribution, also referred to as the Gaussian distribution, is one of the most important theoretical distributions in statistics. It is a theoretically perfect frequency polygon in which the mean, median, and mode all coincide in the center, and it takes the form of a symmetrical bell-shaped curve (Fig. 3-1). It is a continuous frequency distribution that is expressed by the following formula:

f(x) = (1 / (σ√(2π))) × e^(−(x − μ)² / (2σ²))   (3-14)

where μ is the overall mean of the distribution and σ² is the variance. Although knowing the actual formula is not important when using a normal distribution, it is helpful to see the formula to understand that a normal distribution is a theoretical distribution whose shape is determined by two key parameters: the population mean (shown as μ) and the population standard deviation (SD) (shown as σ). The usefulness of this is that, for phenomena that are normally distributed, if we know the mean and SD of a population, we can infer the distribution of the variable for the entire population without collecting any data.

In practice, this distribution is among the most important distributions in statistics for three reasons (Vaughan, 1998). First, although most distributions are not exactly normal, many biological and population-level variables (such as height and weight) tend to have approximately normal distributions. Second, many inferential statistics assume that the populations are distributed normally. Third, the normal curve is a probability distribution and is used to answer questions about the likelihood of getting various particular outcomes when sampling from a population. For example, when we discuss hypothesis testing, we will talk about the probability (or the likelihood) that a given difference or relationship could have occurred by chance alone. Understanding the normal curve prepares you for understanding the concepts underlying hypothesis testing.
Useful Characteristics of the Normal Distribution

A normal distribution is displayed in Figure 3-1. It possesses several key properties: (a) it is bell shaped; (b) the mean, median, and mode are equal; (c) it is symmetrical about the mean; and (d) the total area under the curve above the x-axis is equal to 1.

The baseline of the normal curve is measured off in SD units. These are indicated by the Greek letter for SD, σ, in Figure 3-1. A score that is 1 SD above the mean is symbolized by +1σ, and −1σ indicates a score that is 1 SD below the mean. For example, the Graduate Record Exam verbal test has a mean of 500 and an SD of 100. Thus, 1 SD above the mean (+1σ) is determined by adding the SD to the mean (500 + 100 = 600), and 1 SD below the mean (−1σ) is found by subtracting the SD from the mean (500 − 100 = 400). A score that is 2 SD above the mean is 500 + 100 + 100 = 700; a score that is 2 SD below the mean is 500 − (100 + 100) = 300.

These properties have a very useful result: the percentage of the data that lies between the mean and a given SD is a known quantity. Specifically, 68% of the data in a normal distribution lie within ±1 SD of the mean, 95% of the data lie within ±2 SD of the mean, and 99.7% of the data lie within ±3 SD of the mean.

Many distributions are nonnormal in practice. When the distribution is shaped so that the mean is larger than the median or mode, the distribution is said to be positively skewed. Similarly, when the distribution is shaped so that the mean is smaller than the median or mode, it is said to be negatively skewed (Fig. 3-2).

FIGURE 3-1  Normal distribution with standard deviation units.

FIGURE 3-2  Skewed distributions.
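The 68/95/99.7 figures can be reproduced from the normal curve itself. The Python sketch below (an added illustration) uses the error function to compute the areas, which are approximately 68.3%, 95.4%, and 99.7%; the commonly quoted 95% corresponds to roughly ±1.96 SD:

    from math import erf, sqrt

    def area_within(k):
        """Proportion of a normal distribution lying within +/- k SD of the mean."""
        return erf(k / sqrt(2))

    for k in (1, 2, 3):
        print(k, round(area_within(k) * 100, 1))   # 68.3, 95.4, 99.7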
STANDARD NORMAL DISTRIBUTION AND UNDERSTANDING Z-SCORES

The standard normal distribution is a particularly useful form of the normal distribution in which the mean is 0 and the SD is 1. Data points in any normally distributed data set can be converted to a standard normal distribution by transforming the data points into z-scores. The z-score shows how many SDs a given score is above or below the mean of the distribution. These z-scores are very useful because they follow a known probability distribution, and this allows for the computation of percentile ranks and the assessment of the extent to which a given data point is different from the rest of a data set.

Understanding and Using z-Scores

A z-score measures the number of SDs that an actual value lies from the mean; it can be positive or negative. A data point with a positive z-score has a value that is above (to the right of) the mean, and a negative z-score indicates that the value is below (to the left of) the mean. Knowing the probability that someone will score below, at, or above the mean on a test can be very useful. In health care, the criteria that typically define laboratory tests (e.g., glucose, thyroid, and electrolytes) as abnormal are based on the standard normal distribution, flagging scores that fall outside the range within which 95% of values occur. In particular, values with a z-score of ±2 or greater (representing very large and very small values) are defined as abnormal (Seaborg, 2007).

Using z-Scores to Compute Percentile Ranks

Percentiles allow us to describe a given score in relation to other scores in a distribution. A percentile tells us the relative position of a given score and allows us to compare scores on tests that have different means and SDs. A percentile rank is calculated as

Percentile rank = (Number of scores less than a given score / Total number of scores) × 100   (3-15)

Suppose you received a score of 90 on a test given to a class of 50 people. Of your classmates, 40 had scores lower than 90. Your percentile rank would be

(40/50) × 100 = 80

You achieved a higher score than 80% of the people who took the test, which also means that almost 20% of those who took the test did better than you.

To compute the percentile rank of a single value (data point) of a variable from a normal distribution, a z-score is first computed, and then that z-score is located in a z-table to obtain the percentile rank. The equation to calculate z-scores is

z = (x − μ) / σ   (3-16)

where x is the value of the data point, μ is the mean, and σ is the SD.
For example, the ages of the 62 young women who participated in a program run by an adolescent health clinic are listed in Table 3-3. The ages range from 10 to 22 years and are roughly normally distributed. The average age is 16 years, with an SD of 2.94. To find out what percentage of the girls are 14 years of age or younger, the percentile rank of 14 is found by computing the z-score of 14 using Equation 3-16:

z = (14 − 16) / 2.94 = −0.6802

After this value is computed, we take the absolute value (|−0.6802| = 0.6802) and look up this z-score in the z-table, which is also called the "Table of the Area under the Normal Curve" (see Appendix B). By looking up the first digit and first decimal place in the column labeled "z," the value 0.6 can be found in the column. Then, across the top of the table, the hundredth decimal place can be found; in this case, .08. The number located at the intersection of the row and column (25.17) is the percentage of the area under the curve between the z-score of 0.68 and the mean (recall that the mean of the standard normal distribution is 0). A positive z-score is above the mean, whereas a negative z-score falls below the mean. The z-table provides values only for the positive side of the distribution, but because the distribution is symmetrical, we can look up the absolute value of a negative z-score in the table. So, we can calculate the percentage of girls who are 14 years old or younger as 50 − 25.17 = 24.83. In other words, according to the computed percentile rank, 24.83% of the girls were 14 years old or younger. Looking at the other side of the distribution, 50 + 25.17 = 75.17% of the girls were older than 14 years. Some commonly computed percentile ranks are the 25th percentile, 50th percentile (also known as the median), and 75th percentile.

TABLE 3-3  AGES OF PARTICIPANTS IN A PROGRAM FOR ADOLESCENT GIRLS

Age (Years)
10  20  18
11  11  19
12  12  12
13  13  13
14  14  14
15  15  15
15  15  15
15  15  15
16  16  16
16  16  16
17  17  17
17  17  17
18  18  18
18  18  13
19  19  14
19  10  15
20  20  15
21  21  16
22  22  16
17  18
20  11
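The z-table lookup can be checked against a direct computation of the normal cumulative probability. The Python sketch below (an added illustration) reproduces the percentile rank of age 14, matching the table-based value of 24.83% up to rounding:

    from math import erf, sqrt

    def normal_cdf(x, mu, sigma):
        """Cumulative probability below x for a normal distribution."""
        return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

    mu, sigma = 16, 2.94              # mean and SD of the ages in Table 3-3
    z = (14 - mu) / sigma
    print(round(z, 4))                                  # -0.6803
    print(round(normal_cdf(14, mu, sigma) * 100, 2))    # ~24.82% aged 14 or younger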
