Professional Documents
Culture Documents
Essential Statistics 2E: William Navidi and Barry Monk
Essential Statistics 2E: William Navidi and Barry Monk
McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or further distribution permitted without the prior written consent of McGraw-Hill Education.
Measures of Position
Section 3.3
McGraw-Hill Education.
Objectives
1. Compute and interpret -scores
2. Compute the quartiles of a data set
3. Compute the percentiles of a data set
4. Compute the five-number summary for a data set
5. Understand the effects of outliers
6. Construct boxplots to visualize the five-number summary
and outliers
McGraw-Hill Education.
Objective 1
Compute and interpret -scores
McGraw-Hill Education.
-Score
Who is taller, a man 73 inches tall or a woman 68 inches tall? The
obvious answer is that the man is taller. However, men are taller
than women on the average.
McGraw-Hill Education.
Interpreting -Scores
The -score of an individual data value tells how many standard
deviations that value is from its population mean.
McGraw-Hill Education.
Example: -Score
A National Center for Health Statistics study states that the mean
height for adult men in the U.S. is = 69.4 inches, with a standard
deviation of = 3.1 inches. The mean height for adult women is
= 63.8 inches, with a standard deviation of = 2.8 inches. Who
is taller relative to their gender, a man 73 inches tall, or a woman 68
inches tall?
We compute:
7369.4
Mans Height = = = 1.16
3.1
6863.8
Womans Height = = = 1.50
2.8
McGraw-Hill Education.
-Scores & The Empirical Rule
Since the -score is the number of standard deviations from the mean,
we can easily interpret the -score for bell-shaped populations using
The Empirical Rule.
McGraw-Hill Education.
Objective 2
Compute the quartiles of a data set
McGraw-Hill Education.
Quartiles
In a previous section, we learned how to compute the mean and
median of a data set as measures of the center. Sometimes, it is
useful to compute measures of position other than the center to get
a more detailed description of the distribution. Quartiles divide a
data set into four approximately equal pieces.
McGraw-Hill Education.
Computing Quartiles
There are several methods for computing quartiles, all of which give similar
results. The following procedure is one fairly straightforward method.
The data are already in increasing order. There are = 45 values. For
the first quartile we compute = 0.25 45 = 11.25. Since 11.25 is not
a whole number, we round it up to 12. The first quartile is the number in
the 12th position, which is 0.92. = 1st Quartile = 0.92
For the second quartile, we compute the median using the methods
previously presented. The median is 3.12. = 2nd Quartile = 3.12
For the third quartile we compute = 0.75 45 = 33.75. Since 33.75 is
not a whole number, we round it up to 34. The third quartile is the
number in the 34th position, which is 4.94. = 3rd Quartile = 4.94
McGraw-Hill Education.
Quartiles on the TI-84 PLUS
The 1-Var Stats command in the TI-84 PLUS
Calculator displays a list of the most common
parameters and statistics for a given data set.
This command is accessed by pressing STAT and
then highlighting the CALC menu.
The 1-Var Stats command returns the following quantities.
The mean
The sample standard deviation
The population standard deviation
minX The minimum data value
Q1 The first quartile
Med The median
Q3 The third quartile
maxX The maximum data value
McGraw-Hill Education.
Example: Computing Quartiles on the TI-84
The following table presents the annual rainfall, in inches, in Los Angeles
during the month of February over several years. Compute the quartiles for
the data.
McGraw-Hill Education.
Objective 3
Compute the percentiles of a data set
McGraw-Hill Education.
Percentiles
Quartiles describe the shape of a distribution by dividing it into
fourths. Sometimes it is useful to divide a data set into a greater
number of pieces to get a more detailed description of the
distribution.
Percentiles divide a data set into hundredths. For a number p
between 1 and 99, the pth percentile separates the lowest p% of the
data from the highest (100 p)%.
McGraw-Hill Education.
Computing Percentiles
The following procedure computes the pth percentile of a data set:
Step 2: Let be the number of values in the data set. For the pth
percentile, compute = .
100
The data are already in increasing order. There are = 45 values. For the
60
60th percentile we compute = 45 = 27. Since 27 is a whole number,
100
the 60th percentile is the average of the numbers in the 27th and 28th
positions. We see that the 60th percentile is
.+.
60th Percentile = = .
McGraw-Hill Education.
Computing a Percentile from a Given Data Value
Sometimes we are given a value from a data set and wish to compute
the percentile corresponding to that value. Following is the
procedure for doing this:
+0.5
Percentile = 100
+. +.
Percentile = = = .
We round the result to 39. The value 1.90 corresponds to the 39th
percentile.
McGraw-Hill Education.
Objective 4
Compute the five-number summary for a data
set
McGraw-Hill Education.
Five-Number Summary
The five-number summary of a data set consists of the median, the
first quartile, the third quartile, the smallest value, and the largest
value. These values are generally arranged in order.
The five-number summary of a data set consists of the following
quantities.
McGraw-Hill Education.
Example: Five Number Summary
The following table presents the annual rainfall, in inches, in Los
Angeles during the month of February over several years. Compute
the five-number summary.
McGraw-Hill Education.
Objective 5
Understand the effects of outliers
McGraw-Hill Education.
Outliers
An outlier is a value that is considerably larger or
considerably smaller than most of the values in a data set.
Some outliers result from errors; for example a misplaced
decimal point may cause a number to be much larger or
smaller than the other values in a data set. Some outliers
are correct values, and simply reflect the fact that the
population contains some extreme values.
McGraw-Hill Education.
Example: Outliers
The temperature in a downtown location is measured for
eight consecutive days during the summer. The readings, in
Fahrenheit, are
81.2 85.6 89.3 91.0 83.2 8.45 79.5 87.8
Which reading is an outlier? Is the outlier an error or is it
possible that it is correct?
Solution:
The outlier is 8.45. It certainly is an error, likely resulting
from a misplaced decimal point. The outlier should be
corrected if possible.
McGraw-Hill Education.
Interquartile Range
One method for detecting outliers involves a measure
called the Interquartile Range.
McGraw-Hill Education.
IQR Method for Detecting Outliers
The most frequent method used to detect outliers in a data set is the
IQR Method.
Step 1: Find the first quartile 1 , and the third quartile 3 .
Step 4: Any data value that is less than the lower outlier boundary
or greater than the upper outlier boundary is considered to
be an outlier.
McGraw-Hill Education.
Example: Identifying Outliers
The following table presents the number of students absent in a middle
school in northwestern Montana for each school day in January. Identify any
outliers.
65 67 71 57 51 49 44 41 59 49 42
56 45 77 44 42 45 46 100 59 53 51
We may use the TI-84 PLUS or other technology to
compute the quartiles. 1 = 45 3 = 59
There are no values less than the lower boundary of 24. The value 100 is
greater than the upper boundary. Therefore, the value 100 is an outlier.
McGraw-Hill Education.
Objective 6
Construct boxplots to visualize the five-number
summary and outliers
McGraw-Hill Education.
Boxplot
A boxplot is a graph that presents the five-number summary
along with some additional information about a data set.
There are several different kinds of boxplots. The one we
describe here is sometimes called a modified boxplot.
McGraw-Hill Education.
Example: Boxplot
The following table presents the number of students absent in a middle
school in northwestern Montana for each school day in January. Construct a
boxplot.
65 67 71 57 51 49 44 41 59 49 42
56 45 77 44 42 45 46 100 59 53 51
Step 1:
We may use the TI-84 PLUS or other technology to
compute the quartiles. 1 = 45, Med = 51, and 3 = 59.
Step 2:
We draw vertical lines at 45, 51, and 59, then
horizontal lines to complete the box.
McGraw-Hill Education.
Example: Boxplot (Continued 1)
Step 3:
We compute the outlier boundaries:
Lower Outlier Boundary = 1 1.5(IQR) = 24
Upper Outlier Boundary = 3 + 1.5(IQR) = 80
Step 4:
The largest data value that is less than the upper boundary is 77. We
draw a horizontal line from 59 up to 77.
McGraw-Hill Education.
Example: Boxplot (Continued 2)
Step 5:
The smallest data value that is greater than the lower boundary is 41.
We draw a horizontal line from 45 down to 41.
Step 6:
The data value 100 lies outside of the outlier boundaries. Therefore,
100 is an outlier. We plot this point separately.
McGraw-Hill Education.
Boxplots on the TI-84 PLUS
The following steps will create a boxplot for
the student absences data on the TI-84 PLUS.
McGraw-Hill Education.
The Empirical Rule
When a data set has a bell-shaped histogram, it is often possible to
use the standard deviation to provide an approximate description of
the data using a rule known as The Empirical Rule.
Approximately 68% of the data will be within one standard
deviation of the mean.
Approximately 95% of the data will be within two standard
deviations of the mean.
All, or almost all, of the data will be within three standard
deviations of the mean.
McGraw-Hill Education.
Boxplots and Shape of a Data Set (Skewed Right)
Boxplots can be used to determine skewness in a data set.
McGraw-Hill Education.
Boxplots and Shape of a Data Set (Skewed Left)
McGraw-Hill Education.
Boxplots and Shape of a Data Set (Symmetric)
McGraw-Hill Education.
You Should Know . . .
How to compute and interpret -scores
How to compute the quartiles of a data set
How to compute a percentile of a data set
How to compute the percentile corresponding to a given data value
How to find the five-number summary for a data set
How to determine outliers using the IQR method
How to construct a boxplot and use it to determine skewness
McGraw-Hill Education.