Worksheets - Chapter04 KEY

Statistics: Histograms, Stem and Leaf Plots, and Dot Plots – KEY

These four graphs all display the same data--the Kentucky Derby winning times. The dot plot is a
horizontal version of the graph shown in your book.

1. Describe fully in context what each lettered part represents about the data.
A: 6 Kentucky Derby winning times were 157 seconds.
B: 32 Kentucky Derby winning times were between 125 and 130 seconds.
C: This represents 40 Kentucky Derby winning times.
D: A winning time in the Kentucky Derby was 158 seconds.
E: This represents the group of winning times between 125 and 129 seconds.

17 : 2
16 : 5 D
16 : 000113
15 : 5777777888899
15 :
14 :
14 :
E 13 : 5
13 : 00111233
12 : 555555555555556667777888888999999
12 : 0001111111111222222222222222222222222222223333333333333334444444444444
11 : 9
(13:5 means a winning time of 135 seconds)

2. What are the similarities and differences of each graph?


Similarities – Each display shows two modes and an overall right skew. Differences – The gap between
times of 135 and 155 seconds is obscured as bin width increases.

3. When might you choose to display data using a stem and leaf plot instead of a histogram?
Use the stem and leaf plot when it is important to know the actual winning times, or when making a quick
display by hand to organize the data.
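
As a rough illustration of how a split stem and leaf plot keeps the actual values visible, here is a minimal Python sketch. The function name `split_stem_and_leaf` and the short list `sample_times` are assumptions made up for this example; the list is a hypothetical handful of times, not the full set of 135 winning times.

```python
from collections import defaultdict

def split_stem_and_leaf(values):
    """Return the rows of a split stem and leaf display, highest stem first.

    Each stem (the tens and above) gets two rows: leaves 5-9 first,
    then leaves 0-4, matching the layout of the plot in this worksheet.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[(stem, leaf >= 5)].append(str(leaf))

    top, bottom = max(rows), min(rows)
    lines = []
    stem, upper = top
    while (stem, upper) >= bottom:
        leaves = "".join(rows.get((stem, upper), []))
        lines.append(f"{stem} : {leaves}".rstrip())
        # Step down: upper half -> lower half of the same stem -> next stem.
        stem, upper = (stem, False) if upper else (stem - 1, True)
    return lines

# Hypothetical sample of winning times in seconds (not the full data set).
sample_times = [119, 121, 122, 122, 124, 125, 127, 130, 135,
                157, 157, 158, 165, 172]

for line in split_stem_and_leaf(sample_times):
    print(line)
```

Empty rows such as `14 :` are printed on purpose, so the gap between the two groups of times stays visible.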

Copyright © 2012 Pearson Education, Inc. Publishing as Addison-Wesley.


Statistics: Frequency Tables – KEY

1. Here is a frequency table showing how nuclear power plants are distributed across the United States.

# plants   # states
0          19
1          12
2          11
3           4
4           2
5           1
6           1

a. Explain what the pair (2, 11) in this table represents in the context of the problem.

11 states have 2 nuclear power plants.

b. Find the mean and median and describe these values in context.

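Mean: (0(19) + 1(12) + 2(11) + 3(4) + 4(2) + 5(1) + 6(1)) / 50 = 65 / 50 = 1.3

In the United States, the mean number of nuclear power plants is 1.3 per state.

Median: with 50 states, the median is the average of the 25th and 26th ordered values, which are both 1.

In the United States, roughly half of states have 1 or fewer nuclear power plants, and roughly half of
states have 1 or more nuclear power plants.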
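
The mean and median above can be checked directly from the frequency table. The following Python sketch is one way to do it; the names `plants_to_states` and `median_from_frequencies` are invented for this example.

```python
# Frequency table from the worksheet: number of plants -> number of states.
plants_to_states = {0: 19, 1: 12, 2: 11, 3: 4, 4: 2, 5: 1, 6: 1}

n_states = sum(plants_to_states.values())
total_plants = sum(k * f for k, f in plants_to_states.items())
mean_plants = total_plants / n_states

def median_from_frequencies(freq):
    """Median of a frequency table, found by walking the cumulative counts."""
    n = sum(freq.values())
    # 1-based positions of the middle value (odd n) or middle pair (even n).
    positions = [(n + 1) // 2, (n + 2) // 2]
    middle = []
    for pos in positions:
        running = 0
        for value in sorted(freq):
            running += freq[value]
            if running >= pos:
                middle.append(value)
                break
    return sum(middle) / 2

print(n_states, total_plants, mean_plants)
print(median_from_frequencies(plants_to_states))
```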

2. Here is a frequency table of the ages of U.S. Presidents when they were first inaugurated into
the office.

Age at inauguration   Number of Presidents
40-44                  2
45-49                  7
50-54                 13
55-59                 12
60-64                  7
65-69                  3

a. Which interval contains the median age of inauguration?
The median age at inauguration is between the 50-54 year interval and the 55-59 year interval.

b. Estimate the mean age of inauguration of U.S. Presidents (use the center of each interval)

The mean age of inauguration of U.S. Presidents is approximately 54.7 years old.
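
Both answers can be reproduced from the table with a short Python sketch. It assumes the center of each interval is the average of its two printed endpoints (42 for 40-44, and so on); the names `age_table` and `interval_of` are made up for this example.

```python
# Age intervals (inclusive endpoints as printed) and counts from the table.
age_table = [((40, 44), 2), ((45, 49), 7), ((50, 54), 13),
             ((55, 59), 12), ((60, 64), 7), ((65, 69), 3)]

n = sum(count for _, count in age_table)

# Estimated mean: weight each interval's center by its count.
weighted_sum = sum((lo + hi) / 2 * count for (lo, hi), count in age_table)
estimated_mean = weighted_sum / n

def interval_of(position):
    """Return the interval holding the value at a 1-based ordered position."""
    running = 0
    for interval, count in age_table:
        running += count
        if running >= position:
            return interval
    raise ValueError("position is larger than the number of values")

print(n, round(estimated_mean, 1))
# With an even count, the median sits between positions n/2 and n/2 + 1.
print(interval_of(n // 2), interval_of(n // 2 + 1))
```

With 44 presidents, the two middle positions fall in different intervals, which is why the key places the median between them.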

3. Estimate the mean and median of the data shown in this histogram.
The mean percent body fat of these 252 men is approximately 25%, with the median a little lower at
approximately 20%.

Statistics: The 5 Number Summary – KEY
1. Pediatricians use percentiles to monitor the height and weight of their patients over time.
According to the Centers for Disease Control, the distribution of weights of 5 year old males
has the following quartiles:
1st quartile: 37 pounds
2nd quartile: 40 pounds
3rd quartile: 42 pounds
a. For each quartile, write a sentence that describes what that quartile means about the
weights of 5 year old males.
Q1: Approximately 25% of 5 year old males weigh less than 37 pounds.

Q2: Approximately 50% of 5 year old males weigh less than 40 pounds.

Q3: Approximately 75% of 5 year old males weigh less than 42 pounds.

b. What percent of 5 year old males weigh…


between 37 and 40 pounds 25%
less than 40 pounds 50%
more than 40 pounds 50%
more than 37 pounds 75%
between 40 and 42 pounds 25%
less than 37 pounds 25%

c. Mark is a 5 year old of average height who weighs 38 pounds. Is his weight also average?
Explain your thinking.
Mark’s weight is slightly below average, since he weighs less than the median of 40 pounds. However,
his weight is not unusually low, since he is within the middle 50% of weights. At least 25% of 5 year
olds weigh less than Mark.

2. Here again is the dot plot of the 135 Kentucky Derby winning times. Use the dot plot to
separate the dots into quartiles, then draw a box and whisker plot above the dots on the same
axis. On the back, describe the shape, center, and spread of the distribution of winning times
in context.
The distribution of Kentucky Derby winning times is divided into two distinct groups. The larger,
lower group is skewed to the right, with winning times between 119 and 135 seconds. A typical winning
time in this group is around 122 seconds. The second group is smaller, and has generally higher
winning times. The distribution of this group is also skewed to the right, with a typical winning time
of around 158 seconds. Winning times in this group range from 155 seconds to 165 seconds. The slowest
winning time is set apart from the group, at 172 seconds.

Statistics: Chapter 4 Review A – KEY
1. Describe when it is more accurate to use the median and IQR to measure the center and spread
of a set of data, as opposed to using the mean and standard deviation.
If a distribution is skewed or has outliers, use the median and IQR instead of mean and standard deviation.

2. Below are four histograms from www.shodor.org/interactivate. For each display, choose
whether it is more accurate to use the median and IQR, the mean and standard deviation, or
either. Also estimate the mean and median for each using the histograms.

College Math SAT scores
Summarize with median and IQR.
The median is approximately 500, with the mean slightly higher, perhaps approximately 520.

NBA Payrolls
Summarize with median and IQR.
The median is approximately $24 million, with the mean higher, perhaps $30 million.

Body Fat of 252 men
Summarize with either median and IQR or mean and standard deviation.
The median is approximately 18% body fat, with the mean slightly higher, perhaps 19% body fat.

Horsepower of Cars
Summarize with median and IQR.
The median is approximately 90 horsepower, with the mean higher, perhaps 130 horsepower.

3. Imagine in the last histogram, if the car with the highest horsepower was “souped” up so that it
now has 300 horsepower. How would each of the following summary statistics change: mean,
median, range, IQR, and standard deviation?
The mean, range, and standard deviation would all increase. The median and IQR would not change.
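
To see why only some of the statistics move, here is a small Python sketch. The horsepower values in `before` are hypothetical (they are not read from the histogram), and the helper `summarize` is invented for this example. Quartiles are computed with `statistics.quantiles` using the "exclusive" method, which is only one of several common conventions.

```python
import statistics

def summarize(data):
    """Return the five summary statistics discussed in the question."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(data),
    }

# Hypothetical horsepower values, sorted from lowest to highest.
before = [60, 70, 75, 85, 90, 95, 110, 130, 150, 200]
# "Soup up" the most powerful car: replace the maximum with 300.
after = before[:-1] + [300]

stats_before = summarize(before)
stats_after = summarize(after)
for name in stats_before:
    print(name, stats_before[name], "->", stats_after[name])
```

Only the largest value changes, so the statistics built from the middle of the ordered data (median and IQR) stay put, while those that use every value or the extremes (mean, standard deviation, range) grow.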

4. Here are the number of floors in the 16 tallest buildings of the world. On the back make a
histogram of this data, find the five number summary, and write a full shape, center, spread
description in context (use the W's!!!)

160   88   88   69
101   66   88  102
101  110   96   78
 88  103   80   70

Min = 66
Q1 = 78.5
Median = 88
Q3 = 101.75
Max = 160

[Histogram: Number of Floors in 16 Tallest Buildings. Horizontal axis: Number of Floors (80 to 160).
Vertical axis: Frequency (0 to 5).]

The distribution of the number of floors in the world's 16 tallest buildings is roughly unimodal and
slightly skewed to the left. The median number of floors is 88 and the IQR is 23.25 floors. The
distribution contains an outlier (Burj Khalifa in Dubai, UAE) that has 160 floors. This is 50 floors
more than the next highest number of floors.
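
The five number summary can be recomputed from the 16 listed values with a short Python sketch. The variable names are made up for this example. The quartiles in the key match the "exclusive" method of `statistics.quantiles`; other quartile conventions can give slightly different values for Q1 and Q3.

```python
import statistics

# The 16 values as listed in the question.
floors_16 = [160, 88, 88, 69, 101, 66, 88, 102,
             101, 110, 96, 78, 88, 103, 80, 70]

q1, median, q3 = statistics.quantiles(floors_16, n=4, method="exclusive")
five_number = (min(floors_16), q1, median, q3, max(floors_16))
iqr = q3 - q1

# Gap between the tallest building and the next highest value.
ordered = sorted(floors_16)
gap = ordered[-1] - ordered[-2]

print(five_number, iqr, gap)
```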

Statistics: Chapter 4 Review B – KEY
1. Describe when it is more accurate to use the median and IQR to measure the center and spread
of a set of data, as opposed to using the mean and standard deviation.
If a distribution is skewed or has outliers, use the median and IQR instead of mean and standard deviation.

2. Here is a histogram of the horsepower of cars (www.shodor.org/interactivate). Which
measures of center and spread are appropriate to use for this data and why?
Since the distribution is skewed to the right, use the median and IQR instead of mean and standard deviation.
The median and IQR are more resistant to the effects of skewness than the mean and standard deviation.

3. If a new car with 300 horsepower was added to the histogram above, how would each of the
following summary statistics change: mean, median, range, IQR, and standard deviation?

The mean, range, and standard deviation would all increase. The median and IQR would change very
little, if at all, since they are resistant to extreme values.

4. Here are the number of floors in the 20 tallest buildings of the world. Make a histogram of this
data, find the five number summary, and write a complete shape, center, spread description.

160   88   88   69   54
101   66   88  102   68
101  110   96   78   54
 88  103   80   70   85

Min = 54
Q1 = 69.25
Median = 88
Q3 = 101
Max = 160

[Histogram: Number of Floors in 20 Tallest Buildings. Horizontal axis: Number of Floors (60 to 160).
Vertical axis: Frequency (0 to 9).]

The distribution of the number of floors in the world's 20 tallest buildings is roughly unimodal and
symmetric. The median number of floors is 88 and the IQR is 31.75 floors. The distribution contains an
outlier (Burj Khalifa in Dubai, UAE) that has 160 floors. This is 50 floors more than the next highest
number of floors.
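
The quartiles for the 20 listed values can be traced by hand using positions in the ordered list. The Python sketch below assumes the convention that a quartile sits at position p(n + 1) in the ordered data, interpolating between neighbors when the position is not a whole number; the helper name `value_at_position` is invented for this example, and other conventions exist.

```python
# The 20 values as listed in the question.
floors_20 = [160, 88, 88, 69, 54, 101, 66, 88, 102, 68,
             101, 110, 96, 78, 54, 88, 103, 80, 70, 85]

def value_at_position(sorted_data, p):
    """Value at 1-based position p * (n + 1), interpolating between neighbors."""
    pos = p * (len(sorted_data) + 1)
    k = int(pos)      # whole part of the position
    frac = pos - k    # fractional part, used to interpolate
    if k < 1:
        return sorted_data[0]
    if k >= len(sorted_data):
        return sorted_data[-1]
    below, above = sorted_data[k - 1], sorted_data[k]
    return below + frac * (above - below)

ordered = sorted(floors_20)
summary = (
    ordered[0],                        # minimum
    value_at_position(ordered, 0.25),  # Q1 at position 5.25
    value_at_position(ordered, 0.50),  # median at position 10.5
    value_at_position(ordered, 0.75),  # Q3 at position 15.75
    ordered[-1],                       # maximum
)
print(summary, summary[3] - summary[1])
```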

Statistics: Chapter 4 Review C – KEY
1. An elementary class examined 31 snack size bags of M&M™ candies and posted their results
shown below on their class webpage.

[Bar chart: Total Candies per bag. Vertical axis: Students (0 to 10). Horizontal axis: Candies (21 to 28).]

mode: 23, 25    median: 24    mean: 23.8

Accessed at http://score.kings.k12.ca.us/lessons/mandm/mmoct.html on September 10, 2010

a) Make suggestions to improve the graph.


The students should label the horizontal axis, and the bars should not have space between them.
b) Fully describe these data in context. Use appropriate measures of center and spread.
The distribution of the number of candies per bag is roughly unimodal and symmetric, with median 24
and IQR = 2. This means that the middle 50% of bags had between 23 and 25 candies in them. One
lucky student got 28 candies, while 2 students got the minimum number, 21 candies.
c) Why is the mean slightly less than the median?
The distribution has a slight tail to the left, despite the large value of 28 candies on the right.
d) Is it appropriate to use mean and standard deviation to summarize this data? Why or why
not?
A student may choose to use the mean and standard deviation, since the distribution is roughly
unimodal and symmetric.
e) Sketch a histogram of these data using intervals of 2.

[Histogram: Total Candies per bag. Horizontal axis: Number of Candies (21 to 29). Vertical axis:
Frequency (0 to 14).]

f) If a 32nd bag containing 35 M&M™ candies was added to this data, how would all the different
measures of center and spread discussed in this chapter change?
The mean, standard deviation, and range would all increase, while the median and interquartile range
would remain the same.
