Solutions To Practice Problems - Ch1
Problem 8: (d) Both the median and the IQR are insensitive to outliers. The mean and the
standard deviation, by contrast, are very sensitive to outlying values: every observation
enters their sums, so a single extreme value can distort them greatly.
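The contrast can be checked numerically. The sketch below uses a small made-up data set (not taken from the problem) and replaces its largest value with an extreme one. Note that `statistics.quantiles` uses its own quartile convention, which may differ slightly from the percentile rule used elsewhere in these solutions.

```python
import statistics

def summary(values):
    """Return mean, SD, median and IQR for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # library's default quartile convention
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical data, chosen only for illustration.
base = [10, 12, 13, 14, 15, 16, 18]
with_outlier = base[:-1] + [180]  # replace the largest value with an outlier

before = summary(base)
after = summary(with_outlier)

for name in ("mean", "sd", "median", "iqr"):
    print(name, before[name], after[name])
```

In this example the median and IQR are identical before and after, while the mean and standard deviation both increase substantially.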
Problem 9: (a) is true because s is defined as a square root, which cannot be negative.
(b) is false: s can actually be zero, when all the data points have the same value.
(c) is false: in the presence of outliers, s is not a good measure of spread. The
interquartile range (IQR) is a better measure of spread in that case, as it depends on the
middle half of the data points only.
(c) Range = 60, $\bar{x} = (4 + 60)/8 = 8$, $\sum_{i=1}^{n} x_i^2 = 4^2 + 60^2 = 3616$,
$s = \sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)/(n-1)} = \sqrt{(3616 - 8(8)^2)/7} = 21.058$
Both the range and standard deviation are sensitive to outliers, with the presence of the
data point 60 increasing the values for both substantially.
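The arithmetic in (c) can be reproduced from the three summary quantities alone (n, the sum of the values, and the sum of their squares). The function name `sd_from_sums` is a hypothetical helper introduced here for illustration.

```python
import math

def sd_from_sums(n, total, total_sq):
    """Sample SD from n, sum(x_i) and sum(x_i^2), via the shortcut formula."""
    mean = total / n
    variance = (total_sq - n * mean**2) / (n - 1)
    return math.sqrt(variance)

# Values from part (c): n = 8, sum = 4 + 60, sum of squares = 4^2 + 60^2.
s = sd_from_sums(8, 4 + 60, 4**2 + 60**2)
print(round(s, 3))  # 21.058
```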
Problem 12: Note that the mean and mode are changed by both translation and rescaling,
while the standard deviation and range are changed by rescaling only (they are unaffected
by translation). Hence,
$\bar{y} = 10\bar{x} + 2 = 10(10) + 2 = 102$, $\text{mode}_y = 10\,\text{mode}_x + 2 = 10(12) + 2 = 122$,
$s_y = |10|\,s_x = 10(4) = 40$, $\text{range}_y = |10|\,\text{range}_x = 10(13) = 130$.
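The same rules can be written as a small helper for any linear transformation y = a x + b. The function name `transform_summary` is hypothetical; the inputs are the values used in Problem 12.

```python
def transform_summary(a, b, mean_x, mode_x, sd_x, range_x):
    """Summary statistics of y = a*x + b, given those of x."""
    return {
        "mean": a * mean_x + b,      # location measures: scaled and shifted
        "mode": a * mode_x + b,
        "sd": abs(a) * sd_x,         # spread measures: scaled by |a| only
        "range": abs(a) * range_x,
    }

result = transform_summary(10, 2, mean_x=10, mode_x=12, sd_x=4, range_x=13)
print(result)  # {'mean': 102, 'mode': 122, 'sd': 40, 'range': 130}
```

Using `abs(a)` matters when a is negative: the spread measures stay nonnegative even though the data are reflected.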
Problem 13: Sample mean = 10/10 = 1. In this case, the sample variance can be computed
through the alternative formula, given by
$s^2 = \left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)/(n-1) = (3300 - 10[1]^2)/9 = 365.56$.
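For reference, the alternative formula follows from expanding the usual definition of the sample variance and using $\sum_{i=1}^{n} x_i = n\bar{x}$:

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2,
\qquad\text{so}\qquad
s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}.
\end{align*}
```

With the values of Problem 13 this gives $3290/9 \approx 365.56$.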
Problem 15: (a) np/100 = 16(10)/100 = 1.6, which is not an integer, so round up to k = 2.
Hence the 10th percentile of body fat percentage is the 2nd smallest observation of body fat
percentage = 25.9%.
(b) np/100 = 16(81.25)/100 = 13, which is an integer. Hence the 81.25th percentile of age is
the average of the 13th and 14th smallest observations of age = (57+58)/2 = 57.5 years.
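The percentile rule applied in (a) and (b) can be sketched as follows. The function names are hypothetical, the rule is assumed to apply for 0 < p < 100, and the data set in the usage lines is made up (the ranks 1 to 16), not the body fat or age data.

```python
import math

def percentile_positions(n, p):
    """Return the 1-based rank(s) whose average gives the p-th percentile."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    pos = n * p / 100
    nearest = round(pos)
    if math.isclose(pos, nearest):
        # np/100 is an integer k: average the k-th and (k+1)-th smallest values
        return (nearest, nearest + 1)
    # otherwise round up to the next integer k and take the k-th smallest value
    return (math.ceil(pos),)

def percentile(data, p):
    """p-th percentile of data under the rule above."""
    ordered = sorted(data)
    ranks = percentile_positions(len(ordered), p)
    return sum(ordered[r - 1] for r in ranks) / len(ranks)

print(percentile_positions(16, 10))     # (2,)
print(percentile_positions(16, 81.25))  # (13, 14)

# Hypothetical data: the integers 1..16, so each value equals its own rank.
ranks_as_data = list(range(1, 17))
print(percentile(ranks_as_data, 81.25))  # 13.5
```

Other textbooks and software packages use different percentile conventions, so results from library routines may not match this rule exactly.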
Problem 16: (a) IQR = Q3 - Q1 = 245 - 145 = 100, hence the thresholds that define the outliers
are Q3 + 1.5IQR = 245 + 150 = 395 and Q1 - 1.5IQR = 145 - 150 = -5.
Since all the data are within the two thresholds, there are no outliers in the data.
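The 1.5 × IQR rule can be expressed as a short helper. The function names `outlier_fences` and `find_outliers` are hypothetical; the quartiles in the usage line are those of Problem 16.

```python
def outlier_fences(q1, q3, k=1.5):
    """Lower and upper thresholds Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3):
    """Values lying strictly outside the two thresholds."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

print(outlier_fences(145, 245))  # (-5.0, 395.0)
```

Any observation below the lower threshold or above the upper threshold is flagged; if `find_outliers` returns an empty list, the data contain no outliers under this rule.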
Problem 17: (b) From the boxplots, the IQR is roughly equal to 80 - 60 = 20. Hence the
vertical bars (whiskers) cannot be longer than 1.5 × IQR = 30, and (ii) violates this criterion.
(i) is a valid boxplot, with the absence of a vertical line suggesting that the largest 25% of the
data points all have the same value (namely 80). (iii) is a valid boxplot with two outliers in the data.
Problem 18:
Q1: np/100 = (11)(25)/100 = 2.75, rounded up to k = 3, so Q1 = 3rd smallest observation = 11.
Q3: np/100 = (11)(75)/100 = 8.25, rounded up to k = 9, so Q3 = 9th smallest observation = 15.
Median = (11+1)/2 th smallest observation = 6th smallest observation = 13.
IQR = 15 - 11 = 4. The thresholds for the outliers are therefore
Q3 + 1.5IQR = 15 + 1.5(4) = 21 and Q1 - 1.5IQR = 11 - 1.5(4) = 5.
Based on the thresholds, the data values 2 and 25 are outliers, and the smallest and
largest non-outlying values are 9 and 18, respectively.