TOPIC9. The Normal Distribution and Z Scores
Topic 9
The Normal Distribution and Z scores
In Topic 5, the shapes of distributions were introduced. One particular shape is the bell-shaped
distribution. This distribution is symmetric, and the mean, median, and mode are all equal (located at
the center). The distribution looks like this:
This distribution is also called the normal distribution. Data like heights of people, quiz scores, weights
of canned goods, and the like are often approximately normally distributed. Most of the scores cluster around a
central value, the mean, which is why the peak is at the center. The parameters
of the normal distribution are the mean 𝜇 and the variance 𝜎². The total area under the curve is equal to 1;
the variance measures how spread out the values are around the mean.
The standard deviation 𝜎, the square root of the variance, describes the typical distance of the values to the
left and to the right of the mean. The relationship between these parameters can be described by the Empirical
Rule:
68% of the observations lie within the interval [𝝁 − 𝟏𝝈, 𝝁 + 𝟏𝝈] or within 1 standard
deviation of the mean
95% of the observations lie within the interval [𝝁 − 𝟐𝝈, 𝝁 + 𝟐𝝈] or within 2 standard
deviations of the mean
99.7% of the observations lie within the interval [𝝁 − 𝟑𝝈, 𝝁 + 𝟑𝝈] or within 3 standard
deviations of the mean
https://andymath.com/normal-distribution-empirical-rule/
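The percentages in the Empirical Rule are rounded areas under the normal curve. As a quick check, here is a minimal Python sketch. It assumes Python 3.8 or later for `statistics.NormalDist`, and the helper name `area_within` is made up for illustration. It computes the area within k standard deviations of the mean:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas round to the 68%, 95%, and 99.7% quoted in the rule.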
hvvvalle
Example
Given an approximately normal distribution with a mean of 60 and a standard deviation of 10, draw
a bell-shaped curve and label it with the appropriate values according to the Empirical Rule.
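Answer: The curve is centered at 𝜇 = 60, and the horizontal axis is labeled in steps of one standard
deviation from 𝜇 − 3𝜎 to 𝜇 + 3𝜎:
30 40 50 60 70 80 90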
Example
The scores in the Biostatistics midterm exam are normally distributed with mean of 45 and variance
equal to 49. Answer the following questions.
1. Use the Empirical Rule to find the intervals where 68.3%, 95.4%, and 99.7% of the scores lie.
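Answer: Given 𝜇 = 45, 𝜎² = 49 → 𝜎 = 7
68.3% of the scores lie within [45 − 1(7), 45 + 1(7)] = [38, 52]
95.4% of the scores lie within [45 − 2(7), 45 + 2(7)] = [31, 59]
99.7% of the scores lie within [45 − 3(7), 45 + 3(7)] = [24, 66]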
Answer: 84%
Answer on your own:
3. What percent of the scores are greater than 52?
4. What percent of the scores are greater than 31?
5. What percent of the scores are less than 24?
6. What percent of the scores are between 31 and 66?
7. What percent of the scores are less than the mean?
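The interval arithmetic for item 1 can also be sketched in Python. The function name `empirical_intervals` is hypothetical, and the sketch only builds the three intervals [𝜇 − 𝑘𝜎, 𝜇 + 𝑘𝜎]. The remaining items can then be worked out by hand by combining these intervals with the symmetry of the curve.

```python
import math

def empirical_intervals(mean, variance):
    """Return the intervals [mean - k*sd, mean + k*sd] for k = 1, 2, 3."""
    sd = math.sqrt(variance)  # the standard deviation is the square root of the variance
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

# Biostatistics midterm: mean 45, variance 49 (so sd = 7).
intervals = empirical_intervals(mean=45, variance=49)
for k, (low, high) in intervals.items():
    print(f"k = {k}: [{low:g}, {high:g}]")
```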
Example
A normal distribution has 𝜇 = 5 and 𝜎 = 2. What percent of the values are within the interval
[−1, 11]?
Answer: We know that the interval is [𝜇 − 𝑘𝜎, 𝜇 + 𝑘𝜎] for 𝑘 = 1,2,3.
𝐼𝑓 𝑘 = 1: [𝜇 − 1𝜎, 𝜇 + 1𝜎] = [5 − 1(2), 5 + 1(2)] = [3, 7]
𝐼𝑓 𝑘 = 2: [𝜇 − 2𝜎, 𝜇 + 2𝜎] = [5 − 2(2), 5 + 2(2)] = [1, 9]
𝐼𝑓 𝑘 = 3: [𝜇 − 3𝜎, 𝜇 + 3𝜎] = [5 − 3(2), 5 + 3(2)] = [−1, 11] → 99.7%
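Instead of trying 𝑘 = 1, 2, 3 in turn, one can solve for 𝑘 directly from the endpoints. The sketch below is only an illustration. The names `k_for_interval` and `EMPIRICAL_PERCENT` are assumptions, not part of the lesson.

```python
import math

# Rounded Empirical Rule percentages for k = 1, 2, 3.
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}

def k_for_interval(low, high, mu, sigma):
    """Return k if [low, high] equals [mu - k*sigma, mu + k*sigma], else None."""
    k_left = (mu - low) / sigma    # standard deviations from the mean down to low
    k_right = (high - mu) / sigma  # standard deviations from the mean up to high
    return k_left if math.isclose(k_left, k_right) else None

k = k_for_interval(-1, 11, mu=5, sigma=2)
print(k, EMPIRICAL_PERCENT.get(k))
```

Both endpoints are 3 standard deviations from the mean, which matches the 𝑘 = 3 line of the answer.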
Z scores
A 𝒁 score is computed if one wants to determine how a data value 𝑥 relates to the mean 𝜇 of the
observations in a group in terms of standard deviations 𝜎 from the mean. The 𝑍 score is a
standardized value which can be used for comparison from one data set to another. This is given by
the formula
𝒁 = (𝒙 − 𝝁) / 𝝈
Example
The scores in the quiz (which are approximately normal) of 500 students have a mean of 52 and
standard deviation equal to 2. Maliah, Ben, and Jean’s scores are 58, 47, and 54, respectively.
Compute their 𝑍 scores and interpret.
Maliah: 𝒁 = (𝒙𝑴𝒂𝒍𝒊𝒂𝒉 − 𝝁) / 𝝈 = (𝟓𝟖 − 𝟓𝟐) / 𝟐 = 𝟔 / 𝟐 = 𝟑
Ben: 𝒁 = (𝒙𝑩𝒆𝒏 − 𝝁) / 𝝈 = (𝟒𝟕 − 𝟓𝟐) / 𝟐 = −𝟓 / 𝟐 = −𝟐.𝟓
Jean: 𝒁 = (𝒙𝑱𝒆𝒂𝒏 − 𝝁) / 𝝈 = (𝟓𝟒 − 𝟓𝟐) / 𝟐 = 𝟐 / 𝟐 = 𝟏
Maliah’s score is 3 standard deviations higher than the mean score of the quiz, whereas Ben’s score is 2.5
standard deviations lower than the mean score of the quiz. Jean’s score is 1 standard deviation higher than the
mean. We can say that Maliah performed better than both Ben and Jean, since her 𝑍 score is the highest.
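The same computation can be written as a short Python sketch. The function name `z_score` is chosen here for illustration. It applies 𝑍 = (𝑥 − 𝜇) / 𝜎 to each student's score.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

scores = {"Maliah": 58, "Ben": 47, "Jean": 54}
z_scores = {name: z_score(x, mu=52, sigma=2) for name, x in scores.items()}

for name, z in z_scores.items():
    side = "above" if z > 0 else "below"
    print(f"{name}: Z = {z:+.1f} ({abs(z):.1f} standard deviations {side} the mean)")
```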
We are now going to find areas under the standard normal curve, whose total area is equal to
1. That is, we want to find the probabilities bounded by the indicated 𝑍 values. A 𝑍 table or a standard
normal table is needed for this. (A 𝑍 table is provided as a separate file at the VLE.)
𝑷(𝒁 ≤ 𝒛)
Example: Find 𝑷(𝒁 < 𝟏.𝟏𝟒). Take note that we can use < and ≤, or > and ≥, interchangeably, because 𝑍 is
continuous and 𝑃(𝑍 = 𝑧) = 0. From the 𝑍 table, 𝑷(𝒁 < 𝟏.𝟏𝟒) = 𝟎.𝟖𝟕𝟐𝟖𝟔.
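If a 𝑍 table is not at hand, the same left-tail area can be computed in software. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` provides the cumulative distribution function `cdf`.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# cdf(z) returns P(Z <= z), which equals P(Z < z) for a continuous variable.
p = Z.cdf(1.14)
print(round(p, 5))
```

The result can be compared with the entry for 1.14 in the table.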
𝑷(𝟎.𝟒𝟑 < 𝒁 < 𝟏.𝟎𝟔) = 𝑷(𝒁 < 𝟏.𝟎𝟔) − 𝑷(𝒁 < 𝟎.𝟒𝟑)
𝑷(−𝟑.𝟐𝟓 < 𝒁 < −𝟐.𝟗𝟎) = 𝑷(𝒁 < −𝟐.𝟗𝟎) − 𝑷(𝒁 < −𝟑.𝟐𝟓)
𝑷(−𝟑.𝟎𝟎 < 𝒁 < 𝟏.𝟖𝟗) = 𝑷(𝒁 < 𝟏.𝟖𝟗) − 𝑷(𝒁 < −𝟑.𝟎𝟎)
= 𝟎.𝟗𝟕𝟎𝟔𝟐 − 𝟎.𝟎𝟎𝟏𝟑𝟓
= 𝟎.𝟗𝟔𝟗𝟐𝟕
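The "area between two 𝑍 values" pattern is always the larger left-tail area minus the smaller one. A minimal sketch follows, again assuming Python 3.8 or later. The helper name `prob_between` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def prob_between(a, b):
    """P(a < Z < b) = P(Z < b) - P(Z < a) for the standard normal curve."""
    return Z.cdf(b) - Z.cdf(a)

# The three intervals used in the examples above.
for a, b in [(0.43, 1.06), (-3.25, -2.90), (-3.00, 1.89)]:
    print(f"P({a:.2f} < Z < {b:.2f}) = {prob_between(a, b):.5f}")
```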
Case 6. 𝑷(𝒁 ≥ 𝒛)
𝑷(𝒁 > −𝟐.𝟏𝟐) = 𝟏 − 𝑷(𝒁 < −𝟐.𝟏𝟐) = 𝟏 − 𝟎.𝟎𝟏𝟕𝟎𝟎 = 𝟎.𝟗𝟖𝟑𝟎𝟎
𝑷(𝒁 > −𝟐.𝟏𝟐) = 𝑷(𝒁 < −(−𝟐.𝟏𝟐)) = 𝑷(𝒁 < 𝟐.𝟏𝟐) = 𝟎.𝟗𝟖𝟑𝟎𝟎
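Both routes to a right-tail area, the complement and the symmetry of the curve, can be checked side by side. This is a small sketch assuming Python 3.8 or later. The variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution
z = -2.12

by_complement = 1 - Z.cdf(z)  # 1 - P(Z < z)
by_symmetry = Z.cdf(-z)       # P(Z < -z), the mirror image of the right tail

print(round(by_complement, 5), round(by_symmetry, 5))
```

The two values agree up to floating-point rounding, which is why either method may be used with the table.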
Examples
• What is the percentage of the examinees having scores lower than Maliah’s score in the quiz?
Given: 𝑛 = 500 𝜇 = 52 𝜎=2 𝑥𝑀𝑎𝑙𝑖𝑎ℎ = 58
Find 𝑃(𝑋 < 58). Transform Maliah’s score into a 𝑍 score first.
𝑃(𝑋 < 58) = 𝑃((𝑋 − 52)/2 < (58 − 52)/2) = 𝑃(𝑍 < 3) = 0.99865 → 99.865% of the 500 examinees have scores lower
than Maliah’s score.
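• How many examinees have scores higher than Ben’s score?
Given: 𝑛 = 500 𝜇 = 52 𝜎 = 2 𝑥𝐵𝑒𝑛 = 47
Find 𝑃(𝑋 > 47). Transform Ben’s score into a 𝑍 score first.
𝑃(𝑋 > 47) = 𝑃((𝑋 − 52)/2 > (47 − 52)/2) = 𝑃(𝑍 > −2.5) = 𝑃(𝑍 < 2.5) = 0.99379
Number of examinees having scores larger than Ben’s score = 0.99379 ∗ 500 ≈ 497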
• How many examinees have scores between the mean and Jean’s score?
Given: 𝑛 = 500 𝜇 = 52 𝜎=2 𝑥𝐽𝑒𝑎𝑛 = 54
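Find 𝑃(52 < 𝑋 < 54). Transform the values into 𝑍 scores first.
𝑃(52 < 𝑋 < 54) = 𝑃((52 − 52)/2 < 𝑍 < (54 − 52)/2) = 𝑃(0 < 𝑍 < 1) = 𝑃(𝑍 < 1) − 𝑃(𝑍 < 0) = 0.84134 − 0.50000 = 0.34134
Number of examinees having scores between the mean and Jean’s score = 0.34134 ∗ 500 ≈ 171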
Find 𝑃(𝑋 ≥ 50). Transform the passing score into a 𝑍 score first.
𝑃(𝑋 ≥ 50) = 𝑃((𝑋 − 52)/2 ≥ (50 − 52)/2) = 𝑃(𝑍 ≥ −1) = 𝑃(𝑍 < 1) = 0.84134
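The quiz examples can be cross-checked with a short Python sketch. It assumes Python 3.8 or later. Creating `NormalDist(mu=52, sigma=2)` lets the library do the 𝑍 transformation internally, so raw scores can be passed directly. The variable names are illustrative only.

```python
from statistics import NormalDist

n = 500                            # number of examinees
quiz = NormalDist(mu=52, sigma=2)  # distribution of the quiz scores

below_maliah = quiz.cdf(58)                 # P(X < 58)
above_ben = 1 - quiz.cdf(47)                # P(X > 47)
mean_to_jean = quiz.cdf(54) - quiz.cdf(52)  # P(52 < X < 54)
at_least_50 = 1 - quiz.cdf(50)              # P(X >= 50)

print(f"P(X < 58)      = {below_maliah:.5f}")
print(f"P(X > 47)      = {above_ben:.5f} -> about {round(above_ben * n)} examinees")
print(f"P(52 < X < 54) = {mean_to_jean:.5f} -> about {round(mean_to_jean * n)} examinees")
print(f"P(X >= 50)     = {at_least_50:.5f}")
```

Counts are rounded to the nearest whole examinee, since a probability times 500 is rarely a whole number.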
Locate 0.66276 in the body of your 𝑍 table. It sits in the row labeled 0.4 and the column labeled 0.02,
so add the two: 0.4 + 0.02 = 0.42. Then 𝒂 = 𝟎.𝟒𝟐.
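Reading the table "backwards" like this corresponds to the inverse of the cumulative distribution function. The sketch below assumes that the lookup is for the value 𝑎 with 𝑃(𝑍 < 𝑎) = 0.66276, and that Python 3.8 or later is available for `NormalDist.inv_cdf`.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# Assumption: we want the value a such that P(Z < a) = 0.66276.
a = Z.inv_cdf(0.66276)
print(round(a, 2))
```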
Since we are dealing with the area 𝑷(−𝒂 ≤ 𝒁 ≤ 𝒂) = 𝟎.𝟗𝟓, the remaining area is 𝟏 − 𝟎.𝟗𝟓 =
𝟎.𝟎𝟓. We divide 𝟎.𝟎𝟓 by 𝟐 since there are two tails to account for, resulting in 𝟎.𝟎𝟐𝟓 per tail. Next, we
need to locate the probability 𝟎.𝟎𝟐𝟓 in the 𝑍 table such that 𝑷(𝒁 ≤ −𝒂) = 𝟎.𝟎𝟐𝟓.
We find that −𝒂 = −𝟏.𝟗𝟔; therefore, 𝒂 = 𝟏.𝟗𝟔. Try checking if indeed 𝑷(−𝟏.𝟗𝟔 ≤ 𝒁 ≤ 𝟏.𝟗𝟔) =
𝟎.𝟗𝟓.
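The steps above (take the leftover area, split it between the two tails, then look up the left tail) can be sketched as a small function. The name `central_bound` is hypothetical, and the sketch assumes Python 3.8 or later.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def central_bound(area):
    """Return a > 0 such that P(-a <= Z <= a) equals the given central area."""
    tail = (1 - area) / 2    # leftover area, split equally between the two tails
    return -Z.inv_cdf(tail)  # inv_cdf(tail) gives -a, so flip the sign

a = central_bound(0.95)
check = Z.cdf(a) - Z.cdf(-a)  # should recover the central area
print(round(a, 2), round(check, 4))
```

The final line doubles as the suggested check that the area between −𝑎 and 𝑎 is 0.95.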
On your own,