Practice Problems in Chapter 1

Problem 1: Identify each of the following as a categorical variable or a quantitative variable.

a) Number of children in a family
b) Amount of time in a football game before the first points are scored
c) College major (English, history, chemistry, ...)
d) Type of music (rock, jazz, classical, folk, other)

Problem 2: Which of the following variables are continuous?

a) Age of mother
b) Number of children in a family
c) Cooking time for preparing dinner
d) Latitude and longitude of a city
e) Population size of a city

Problem 3: According to the Census Bureau report in 2002, the number of times U.S. residents
aged 20 to 24 had been married is summarized in the table below:

a) Compute the median and the sample mean of the number of times married, separately for women and for men.
b) Do you think the median is a good measure of location for the number of times the women
above had been married? Explain.

Problem 4: Consider a sample of 10 data points x1, x2, …, x10 as follows: -1, -1, 0, 0, 0, 1, 1, 2, 2, 4

(a) If we transform the data through yi = -2xi + 1, what is the sample mean of y1, y2, …, y10?

(b) If we transform the data through yi = 2xi^3, what is the mode of y1, y2, …, y10?

(c) If we transform the data through yi = 3xi^2, what is the mode of y1, y2, …, y10?
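
One way to check answers to Problem 4 is to apply each transformation to the data and summarize the result. The sketch below assumes the transformations in (b) and (c) are a cube and a square (yi = 2xi^3 and yi = 3xi^2), which is a reading of the flattened exponents rather than something the text states outright; the helper name summarize is hypothetical.

```python
from statistics import mean, multimode

x = [-1, -1, 0, 0, 0, 1, 1, 2, 2, 4]

def summarize(transform):
    """Apply a transformation to every data point, then return (mean, list of modes)."""
    y = [transform(v) for v in x]
    return mean(y), multimode(y)

# Assumed readings of the three transformations in parts (a), (b) and (c).
linear = summarize(lambda v: -2 * v + 1)
cubic = summarize(lambda v: 2 * v ** 3)
square = summarize(lambda v: 3 * v ** 2)

for name, (m, modes) in [("linear", linear), ("cubic", cubic), ("square", square)]:
    print(f"{name}: mean = {m}, mode(s) = {modes}")
```

The square is the interesting case: it is not one-to-one, so distinct x values (here -1 and 1) collapse onto the same y value and the mode of y need not be the image of the mode of x.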

Problem 5: For each of the data sets below (10 data points each), determine whether the sample mean, the
median and the mode are good measures of location.

Data Set A: Age (in years) of males at first marriage:

23, 23, 25, 28, 29, 30, 31, 32, 35, 70

Data Set B: Number of children in a family:

0, 1, 1, 2, 2, 2, 2, 3, 3, 5

Data Set C: Whether a student is taking STAT1012 (1 = taking, 0 = not taking):

0, 0, 1, 1, 1, 1, 1, 1, 1, 1

Problem 6: Suppose 15 families were visited, and the number of children in each family was recorded
as follows:

2, 0, 6, 1, 1, 3, 0, 1, 3, 4, 2, 0, 2, 1, 2

(a) Which graphical method would you use to summarize the data: a bar graph or a histogram?

(b) Sketch the graph you chose in part (a).

(c) Is the distribution symmetric, left-skewed, or right-skewed?

(d) How many modes are there in the distribution?
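
A frequency count is a convenient starting point for parts (b) to (d) of Problem 6. The following is a minimal sketch using only the standard library; the text-based chart is just a stand-in for a hand-drawn graph, and the variable names are illustrative.

```python
from collections import Counter

children = [2, 0, 6, 1, 1, 3, 0, 1, 3, 4, 2, 0, 2, 1, 2]
counts = Counter(children)

# One row per possible value, including values that never occur.
for value in range(min(children), max(children) + 1):
    print(f"{value} | {'#' * counts[value]}")

# Every value that attains the highest frequency is a mode.
top = max(counts.values())
modes = sorted(v for v, c in counts.items() if c == top)
print("mode(s):", modes)
```

Printing a row for every integer between the minimum and the maximum keeps empty categories visible, which matters when judging the skewness of the distribution.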

Problem 7: Consider the frequency table on the right:

a) What are the sample mean, median and mode of the data?

b) Is the distribution symmetric, left-skewed or right-skewed?

c) What is the sample standard deviation of the data?

Problem 8: If an outlier exists in the data set, which of the following is/are NOT recommended as a
measure of location or spread? (i) Sample mean (ii) Median (iii) IQR (iv) Sample standard
deviation

a) (i) only

b) (i) and (ii) only

c) (i) and (iii) only

d) (i) and (iv) only


Problem 9: Which of the following statements about the sample standard deviation s is/are false?

a) s can never be negative.

b) s can never be zero.

c) s is a good measure of spread in the presence of outliers.

Problem 10: Consider 10 sample points x1, x2, …, x10, with ∑_{i=1}^{10} xi = 100 and ∑_{i=1}^{10} xi^2 = 1294.

(a) Compute the sample mean and the sample standard deviation of the data.

(b) If a new data point x11 = 10 is added to the sample, what are the new sample mean and sample
standard deviation of the 11 data points?
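
When only the sums are known, the sample variance can be obtained from the shortcut formula s^2 = (∑ xi^2 - (∑ xi)^2 / n) / (n - 1). The sketch below applies it to Problem 10, assuming the two given totals are the sum of the xi (100) and the sum of the xi^2 (1294); the function name mean_and_sd is hypothetical.

```python
from math import sqrt

def mean_and_sd(n, total, total_sq):
    """Sample mean and sample standard deviation from n, sum of x and sum of x squared."""
    mean = total / n
    variance = (total_sq - total ** 2 / n) / (n - 1)  # shortcut formula, divisor n - 1
    return mean, sqrt(variance)

# Part (a): the original 10 points.
mean10, sd10 = mean_and_sd(10, 100, 1294)

# Part (b): adding x11 = 10 raises the sum by 10 and the sum of squares by 10 squared.
mean11, sd11 = mean_and_sd(11, 100 + 10, 1294 + 10 ** 2)

print(mean10, sd10)
print(mean11, sd11)
```

Updating the two running totals is enough to handle the new data point; the individual values are never needed.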

Problem 11: A company wants to investigate the amount of sick leave taken by its employees. A sample
of 8 employees yields the following amounts of sick leave (in days) taken over the past year:
0, 0, 4, 0, 0, 0, 6, 0

a) What is the range of the data?

b) Compute the sample standard deviation of the data.

c) It turns out that the data value 6 was recorded incorrectly; the correct value is 60.
Redo parts a) and b) using the corrected data, and describe the effect of the outlier.
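
The effect of the recording error in Problem 11 can be seen by computing both measures of spread before and after the correction. This is a minimal sketch; data_range is a hypothetical helper, and stdev from the standard library uses the n - 1 divisor.

```python
from statistics import stdev

recorded = [0, 0, 4, 0, 0, 0, 6, 0]
# Replace the mis-recorded value 6 with the correct value 60.
corrected = [60 if days == 6 else days for days in recorded]

def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

for label, data in [("recorded", recorded), ("corrected", corrected)]:
    print(f"{label}: range = {data_range(data)}, s = {stdev(data):.4f}")
```

Both measures depend directly on the extreme values, so a single corrected observation changes them substantially.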

Problem 12: Consider a sample of 10 data points x1, x2, …, x10, with sample mean, mode, sample
standard deviation and range given by 10, 12, 4 and 13 respectively.

If we transform the data through the formula yi = 10xi + 2, what are the sample mean, mode, sample
standard deviation and range of y1, y2, …, y10?
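
For a linear transformation yi = a*xi + b, measures of location (mean, mode) are scaled by a and shifted by b, while measures of spread (standard deviation, range) are scaled by |a| and unaffected by b. The sketch below encodes these rules and checks them numerically on a small made-up sample that is not part of the problem set; transform_summary is a hypothetical helper name.

```python
from math import isclose
from statistics import mean, multimode, stdev

def transform_summary(a, b, mean_x, mode_x, sd_x, range_x):
    """Summary statistics of y = a*x + b, given the same statistics of x."""
    return {
        "mean": a * mean_x + b,    # location: scaled and shifted
        "mode": a * mode_x + b,
        "sd": abs(a) * sd_x,       # spread: scaled by |a|, shift has no effect
        "range": abs(a) * range_x,
    }

# Numerical check on a made-up sample (an assumption for illustration only).
x = [1, 2, 2, 5]
a, b = -2, 5
y = [a * v + b for v in x]
predicted = transform_summary(a, b, mean(x), multimode(x)[0], stdev(x), max(x) - min(x))
observed = {"mean": mean(y), "mode": multimode(y)[0], "sd": stdev(y), "range": max(y) - min(y)}
rules_hold = all(isclose(predicted[k], observed[k], abs_tol=1e-12) for k in predicted)
print(rules_hold)
```

For the values given in Problem 12, the call would be transform_summary(10, 2, 10, 12, 4, 13). Since the variance is the square of the standard deviation, it is scaled by a squared.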

Problem 13: Compute the sample variance of the 10 data points:

-20, 20, -30, 10, -20, -10, 20, 10, 20, 10

Problem 14: It is known that the sample variance and range of x1, x2, …, x100 are 21 and 10 respectively. If
we transform the data through yi = -2xi + 5, what are the sample variance and the range of y1, y2, …, y100?

Problem 15: The data on the right show the body fat percentage of 16
females with ages ranging from 23 to 61 years.

(a) Compute the 10th percentile of the body fat percentage.

(b) Compute the 81.25th percentile of age.

Problem 16: Consider a data set with quartiles given by the following:

Q0 = 0, Q1 = 145, Q2 = 200, Q3 = 245 and Q4 = 290

How many outliers are there in the data?

a) Zero

b) One

c) Two

d) At least one
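
Problem 16 does not say which outlier rule to use. The sketch below assumes the common 1.5 × IQR rule, under which a point is an outlier if it lies below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR; if the course uses a different definition, the fences change accordingly.

```python
# Quartiles as given in the problem; Q0 and Q4 are the minimum and the maximum.
q0, q1, q2, q3, q4 = 0, 145, 200, 245, 290

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # assumed 1.5 x IQR rule
upper_fence = q3 + 1.5 * iqr

# If neither the minimum nor the maximum falls outside the fences, no point does.
min_is_outlier = q0 < lower_fence
max_is_outlier = q4 > upper_fence

print(lower_fence, upper_fence, min_is_outlier, max_is_outlier)
```

Checking only the two extremes is sufficient here because every other data point lies between them.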

Problem 17: Which of the following is/are NOT a possible shape of the boxplot?

a) (i) only

b) (ii) only

c) (iii) only

d) (i) and (iii) only

Problem 18: Sketch the boxplot of the following sample:

2, 9, 11, 12, 13, 13, 14, 14, 15, 18, 25
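
Before sketching the boxplot in Problem 18, it helps to compute the quartiles, the fences and the whisker ends. Quartile conventions differ between textbooks and software, so the sketch below is only one possibility: it assumes the "exclusive" method of Python's statistics.quantiles and the 1.5 × IQR rule for outliers, neither of which is specified in the problem.

```python
from statistics import quantiles

sample = sorted([2, 9, 11, 12, 13, 13, 14, 14, 15, 18, 25])

# Assumed quartile convention: positions at (n + 1) * p in the sorted sample.
q1, q2, q3 = quantiles(sample, n=4, method="exclusive")
iqr = q3 - q1

# Assumed 1.5 x IQR rule for flagging outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [v for v in sample if v < low_fence or v > high_fence]

# Whiskers extend to the most extreme points that are not outliers.
inside = [v for v in sample if low_fence <= v <= high_fence]
whiskers = (min(inside), max(inside))

print("quartiles:", q1, q2, q3)
print("fences:", low_fence, high_fence)
print("whiskers:", whiskers, "outliers:", outliers)
```

With a different quartile convention (for example method="inclusive"), Q1 and Q3 may shift slightly, so the result should be compared against the definition used in the course.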

Problem 19: CO2 emissions from fossil fuel combustion result from the generation of electricity,
heating, industrial processes, and gas consumption in automobiles. The table on the right shows
the CO2 emissions (in metric tons per person) of 8 selected countries:

(a) What are the sample mean and median of the data?

(b) Is there any outlier in the data set?

Problem 20: The table on the right shows the birth weights (in grams) of 20 live-born infants.

(a) Draw a stem-and-leaf plot, using the first two digits as the stem and the last two digits as the leaf.

(b) What are the first and third quartiles of the distribution?
