Chapter6 Stats
74 87 60 77 67 68 69 73 87 81 ...
The full data set has the following relative frequency table:
Chapter 6 Notes The Normal Distribution D. Skipper, p 2
These lecture notes are intended to be used with the open source textbook “Introductory
Statistics” by Barbara Illowsky and Susan Dean (OpenStax College, 2013).
A common error that researchers make is to assume a sample arises from a normal
distribution when in fact it does not. Because the sample in the last example is
fairly large, its histogram has a very clear bell shape. For smaller data sets, the
bell shape may not be as clearly apparent. There are methods to help us decide
if it is “safe” to assume that data arise from a normally distributed population,
such as a normal quantile plot, but ultimately this is a gray area and relies on the
discretion of the researcher.
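As a rough illustration of what a normal quantile plot compares, the sketch below pairs each sorted data value with the standard normal quantile at a matching position. It uses only the ten values shown at the top of these notes (the rest of the data set is not reproduced here), and the plotting position (i − 0.5)/n and the helper names are assumptions made for this example, not something the notes prescribe. If the pairs fall close to a straight line, a normal model looks more plausible.

```python
from statistics import NormalDist, mean

# Only the first ten values shown in the notes; the full data set is elided there.
sample = [74, 87, 60, 77, 67, 68, 69, 73, 87, 81]

def normal_quantile_pairs(data):
    """Pair each sorted data value with a standard normal quantile.

    Uses the plotting position (i - 0.5) / n, one common convention
    (an assumption for this sketch).
    """
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    ordered = sorted(data)
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def pearson_r(pairs):
    """Correlation between the theoretical quantiles and the data values."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

pairs = normal_quantile_pairs(sample)
r = pearson_r(pairs)
for z, x in pairs:
    print(f"{z:6.3f}  {x}")
print(f"correlation of the plotted points: {r:.3f}")
```

A correlation near 1 only suggests that the points are roughly linear; with ten values it is weak evidence, which is consistent with the point above that this remains a judgment call.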
Suppose X ∼ N(µ, σ). If we need to find the z score associated with the data
value x, we use the formula
z = (x − µ)/σ.
If we need to find the data value that has a particular z score, we use the formula
x = µ + zσ.
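The two formulas are inverses of each other, which a short sketch can make concrete. The function names and the numbers µ = 50, σ = 10, x = 65 are hypothetical, chosen only for illustration.

```python
def z_score(x, mu, sigma):
    """z score of the data value x when X ~ N(mu, sigma)."""
    return (x - mu) / sigma

def data_value(z, mu, sigma):
    """Data value x that has z score z when X ~ N(mu, sigma)."""
    return mu + z * sigma

# Hypothetical values, not taken from the notes.
z = z_score(65, 50, 10)      # 65 is 1.5 standard deviations above 50
x = data_value(z, 50, 10)    # converting back recovers the data value
print(z, x)
```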
Example 2. Comparing data values using z scores. SAT scores and ACT
scores are both normally distributed. SAT scores have a mean of 1026 and a
standard deviation of 209. ACT scores have a mean of 20.8 and a standard deviation
of 4.8. A student takes both tests and scores 1130 on the SAT and 25 on the ACT.
(1) Suppose X = score on the SAT. Then X ∼ ( , ).
(2) Suppose Y = score on the ACT. Then Y ∼ ( , ).
(3) Compare the test scores using z scores.
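One way to check the comparison in part (3) is to compute both z scores from the means and standard deviations given in the example. This is a sketch of the arithmetic only; the variable names are made up for the illustration.

```python
# Values stated in Example 2.
sat_mean, sat_sd = 1026, 209
act_mean, act_sd = 20.8, 4.8

z_sat = (1130 - sat_mean) / sat_sd   # z score of the SAT score 1130
z_act = (25 - act_mean) / act_sd     # z score of the ACT score 25

print(f"SAT z score: {z_sat:.3f}")
print(f"ACT z score: {z_act:.3f}")
print("Higher relative score:", "ACT" if z_act > z_sat else "SAT")
```

The score with the larger z score lies more standard deviations above its own mean, so it is the stronger result relative to other test takers.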
If X is a random variable and has a normal distribution with mean µ and standard
deviation σ, then the Empirical Rule says the following:
• About 68% of the x values lie between −1σ and +1σ of the mean µ (within
one standard deviation of the mean).
• About 95% of the x values lie between −2σ and +2σ of the mean µ (within
two standard deviations of the mean).
• About 99.7% of the x values lie between −3σ and +3σ of the mean µ (within
three standard deviations of the mean).
The empirical rule is also known as the 68-95-99.7 rule.
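The three percentages can be checked numerically. The sketch below assumes the standard identity P(−k < Z < k) = erf(k/√2) for a standard normal Z and uses Python's `math.erf`; the function name `prob_within` is made up for this example.

```python
import math

def prob_within(k):
    """P(-k < Z < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The computed areas are close to 0.6827, 0.9545, and 0.9973, which is why the rule is stated with "about" 68%, 95%, and 99.7%.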
Section 6.1.
Example 3. The 68-95-99.7 Rule. From 1984 to 1985, the mean height of 15- to
18-year-old males from Chile was 172.36 cm, and the standard deviation was
6.34 cm. Let Y = the height of 15- to 18-year-old males from Chile from 1984 to
1985. Then Y ∼ N(172.36, 6.34).
(1) About 68% of the y values lie between what two values?
(2) About 95% of the y values lie between what two values?
(3) About 99.7% of the y values lie between what two values?
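Each answer is an interval of the form µ − kσ to µ + kσ with k = 1, 2, 3. A minimal sketch of that arithmetic, using the mean and standard deviation from the example (the variable names are illustrative):

```python
mu, sigma = 172.36, 6.34   # mean and standard deviation from Example 3, in cm

# Interval within k standard deviations of the mean, for k = 1, 2, 3.
intervals = {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

for k, (low, high) in intervals.items():
    print(f"within {k} standard deviation(s): {low:.2f} cm to {high:.2f} cm")
```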
Example 6.6.
STRATEGY: For each question, take a minute to figure out what is given and
what is unknown. Then sketch a normal diagram, shade the relevant area, and
mark what is given and what is unknown.
• Finding a probability: cutoff data value(s) x given; area unknown.
• Finding a percentile/quartile: area given; cutoff data value x unknown.
• Given a percentage: area given; cutoff data value(s) unknown.
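The three cases differ only in which direction the calculation runs: from cutoff to area, or from area to cutoff. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with the height distribution from Example 3; the cutoffs 170 cm and 180 cm and the areas 0.90 and 0.95 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

heights = NormalDist(mu=172.36, sigma=6.34)   # distribution from Example 3

# Finding a probability: cutoff(s) given, area unknown.
p_below_180 = heights.cdf(180)                        # area to the left of 180
p_between = heights.cdf(180) - heights.cdf(170)       # area between 170 and 180

# Finding a percentile: area given, cutoff unknown.
x_90 = heights.inv_cdf(0.90)                          # 90th percentile

# Given a percentage: area given, cutoffs unknown.
# Middle 95% leaves 2.5% in each tail.
low, high = heights.inv_cdf(0.025), heights.inv_cdf(0.975)

print(f"P(Y < 180) = {p_below_180:.4f}")
print(f"P(170 < Y < 180) = {p_between:.4f}")
print(f"90th percentile = {x_90:.2f} cm")
print(f"middle 95%: {low:.2f} cm to {high:.2f} cm")
```

Here `cdf` turns a cutoff into a left-tail area and `inv_cdf` turns a left-tail area into a cutoff, which mirrors the "given versus unknown" distinction in the strategy.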