
Status check:

• Completed Assessments: Activity 1, THT, Activity 2, Activity 3


• Incomplete Assessments: Activity 4, Activity 5, Test & Exam
• We are at Week 5 (Mid-Semester break April 7th – 14th)
• Chapter 16 – The Normal distribution
• Monday 17th - Resume classes
• Chapter 17
Business Statistics

Chapter 16: The Normal Distribution


Chapter 16 Activity 3
The Normal Distribution Covering chapter 16
Learning Outcomes
16.1 Identify the properties of the normal distribution and normal curve
16.2 Identify the characteristics of the standard normal curve
16.3 Understand examples of normally distributed data
16.4 Read z-score tables and find areas under the normal curve
16.5 Find the z-score, given the area under the normal curve
16.6 Compute proportions
16.7 Check whether data follow a normal distribution
16.8 Understand and apply the Central Limit Theorem
16.9 Solve business problems that can be represented by a normal distribution
16.10 Calculate estimates and their standard errors
16.11 Calculate confidence intervals for the population mean
16.13 Calculate confidence intervals for the population proportion
What is a normal distribution in statistics?

“A probability distribution that is
symmetric about the mean, showing
that data near the mean are more
frequent in occurrence than data far
from the mean. In graph form,
normal distribution will appear as
a bell curve.”
Properties of the normal distribution and
normal curve
• Chapter 11 (Pg.263)

• Raw data – the observations as collected
• Array – the raw data arranged in some sort of order


• Frequency distribution – the number of times each observation occurs
is recorded, giving a listing of the different observations (in
order of magnitude) with the corresponding frequency alongside each.
• The total of the frequencies should always equal the number of
observations in the array and the raw data, which in this case is 80.

• This distribution can be further sorted into class intervals.


• Class Intervals = grouped frequency distribution
• Observations are grouped into intervals or a range of values.
• Grouped frequency distribution
- Advantages: - Reduces complex data, making it easier to read
              - Smooths out irregularities in the distribution
- Disadvantage: some data can be lost.
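A minimal sketch of how raw observations might be grouped into class intervals. The data values and the function name `grouped_frequency` are hypothetical and are not taken from the textbook; the point is only that the frequencies must add up to the number of raw observations.

```python
from collections import Counter


def grouped_frequency(observations, width):
    """Count observations per class interval [k*width, (k+1)*width)."""
    # Each observation is assigned to the interval whose lower limit is
    # the largest multiple of `width` not exceeding it.
    counts = Counter(int(x // width) for x in observations)
    return {
        (k * width, (k + 1) * width): counts[k]
        for k in sorted(counts)
    }


# Hypothetical raw data, for illustration only
raw = [1.2, 3.5, 4.1, 4.8, 5.0, 5.9, 6.3, 2.7, 4.4, 7.1]
table = grouped_frequency(raw, width=2)

for (low, high), f in table.items():
    print(f"{low} - under {high}: {f}")

# The total frequency must equal the number of raw observations
print("Total frequency:", sum(table.values()))
```

Once grouped, the individual values can no longer be recovered from the table, which is the loss of data noted as the disadvantage.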

Grouped frequency distribution
Histogram (p.265)
• Histogram - is a method of presenting a graphical picture of a
grouped frequency distribution.
• Frequency Polygon – a line graph of a grouped frequency
distribution.
• If class intervals were continuously reduced in size, and if there was a
larger number of observations, the frequency polygon would resemble
a more and more smooth bell-shaped curve.
• The Normal Distribution – many naturally occurring distributions
have this shape. It provides a sufficiently accurate approximation to
enable adequate conclusions to be drawn.
• Several commonly occurring shapes in distribution
Where do we use the normal distribution in real
life? Some variables…
• Height of adults (basketball)
• Blood pressure (diet related/stress)
• Cholesterol level (triggered by lack of exercise, smoking, obesity, diabetes)
• IQ (fair representation)
• Crop yield per year (farmers)
• Radiation exposure per area (Radon - radioactive gas in 1 in 15 houses)
• Head circumference (Archaeologists - Ancient Egypt)
• Oxygen consumption (higher altitudes)
• Temperature at a given time of the year (production & sales/exports)
• Exam scores (fair correlation of effects of covid)
• SMEs in e-commerce in PNG
• Normal curve – When the frequency polygons have a normal
distribution, they are then made into a smooth curve which is known
as the normal curve.

a.k.a….Normal Curve
• A specific normal curve is characterised by its mean (𝝁) and standard
deviation (𝝈).

Normal curve with same mean but different standard deviation values
• A specific normal curve is characterised by its mean (𝝁) and standard
deviation (𝝈).

Normal curve with same standard deviation but different mean values
Main features of a normal distribution or curve:
1. It is bell shaped.
2. It is symmetric, hence mean and median are equal.
3. It is asymptotic to the horizontal axis, meaning the curve never
touches the horizontal axis as it moves outward.

[Figure: normal curve with the vertical and horizontal axes labelled]
4. Approximately 68% of a normal distribution lies within 1 standard
deviation of the mean; about 95% lies within 2 standard deviations of
the mean and about 99.7% lies within 3 standard deviations of the
mean.
5. The centre and variation will depend on the values of 𝝁 and 𝝈.
6. The total area under any normal curve is 1, therefore as noted in
point 4;
• The area under the curve within 1 standard deviation of the mean is
0.68 approx.
• The area within 2 standard deviations of the mean is 0.95 approx.
• The area within 3 standard deviations of the mean is 0.997 approx.
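The three areas quoted above can be checked numerically. This is a small sketch using only Python's standard library; the function name `area_within` is our own, and the identity used (area within k standard deviations equals erf(k/√2)) is a standard property of the normal distribution rather than something stated in the slides.

```python
import math


def area_within(k):
    """Area under any normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The result does not depend on the particular values of μ and σ, which is why the same percentages apply to every normal curve.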
Example: weight of watermelon (kg)

[Figure: watermelons grouped by weight: 0-2 kg, 2-4 kg, 4-6 kg, 6-8 kg, 8-10 kg]

Class intervals (kg)    Frequency (f)
0 – under 2             50
2 – under 4             200
4 – under 6             500
6 – under 8             120
8 – under 10            20
                        Σf = 890
Intervals decreased:

Class intervals (kg)    Frequency (f)
0 – under 0.5           5
0.5 – under 1           5
1 – under 1.5           10
1.5 – under 2           30
2 – under 2.5           25
2.5 – under 3           25
3 – under 3.5           50
3.5 – under 4           100
4 – under 4.5           110
4.5 – under 5           120
5 – under 5.5           130
5.5 – under 6           140
6 – under 6.5           …..
Etc……                   Σf = 3000

…so this is a normal distribution of the watermelons on the school bus…..
The areas under the normal curve
• The proportions of observations that take on certain values are
represented by areas under the distribution curve
• The proportion of observations that take on a value between a and b
is the area under the curve between two vertical lines erected at a
and b
• We could calculate these areas and thus obtain values for the
proportions
• Examples 16.1 (page 453)
Watermelons had a mean (μ) = 4 and a standard deviation (σ) = 1

[Figure: normal curve, horizontal axis marked 1 2 3 4 5 6 7]
Shade the relevant regions (μ = 4, σ = 1)
Shade the region that represents the proportion of watermelons weighing
between 2 and 6 kg.

[Figure: normal curve, horizontal axis marked 1 2 3 4 5 6 7]
Shade the region that represents the proportion of watermelons weighing
more than 5 kg.

[Figure: normal curve, horizontal axis marked 1 2 3 4 5 6 7]
Shade the region that represents the proportion of watermelons weighing
less than 1 kg.

[Figure: normal curve, horizontal axis marked 1 2 3 4 5 6 7]
Shade the region that represents the proportion of watermelons weighing
between 5 and 7 kg.

[Figure: normal curve, horizontal axis marked 1 2 3 4 5 6 7]
Standard Scores (z–scores)
• Different kinds of items are measured on very different scales. For
instance, the weights of watermelons and guavas differ greatly, so a
separate normal curve would have to be created for each fruit, each
with its own mean and standard deviation.
• This tedious process can be avoided by standardising the
measurements: all measurements taken are converted into standard
scores, or z-scores.
• The z-score of a measurement is defined as the number of standard
deviations the measurement is away from the mean
• If the measurement is above the mean, the corresponding z-score is
positive; but if the measurement is below the mean, the
corresponding z-score is negative
$$\text{Standard score} = z = \frac{\text{observed value} - \text{mean}}{\text{standard deviation}}$$

Thus, if a distribution has a mean of μ and a standard deviation of σ, the
corresponding z-score of an observation x is:

$$z = \frac{x - \mu}{\sigma}$$

The mean itself has a z-score of zero and a value exactly 1 standard deviation from
the mean has z = + 1 or – 1

The larger a positive z-value, the further the given x is above the mean, while
large negative z-values correspond to extreme values below the mean
Formula:

$$\text{Standard score} = z = \frac{\text{observed value} - \text{mean}}{\text{standard deviation}}$$

$$\therefore\ z = \frac{x - \mu}{\sigma}$$
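As a quick illustration of the formula, here is a small sketch that computes z-scores for the watermelon distribution used earlier in these slides (μ = 4, σ = 1). The function name `z_score` and the chosen weights are our own, for illustration only.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma


mu, sigma = 4, 1  # watermelon mean and standard deviation from the slides

for weight in (1, 4, 6):  # illustrative weights
    print(f"x = {weight}: z = {z_score(weight, mu, sigma):+.2f}")
```

A weight below the mean gives a negative z-score, the mean itself gives zero, and a weight above the mean gives a positive z-score.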

Using the table of the standard normal curve.


• Table 6 as referred to by the text book (Pg.784)
• Table 6 gives the area under the standard normal curve between the
mean (zero) and any positive number z

• Using Table 6, determine the area under the standard
normal curve:
(a) between z = 0 and z = 1.50
(b) between z = – 2.10 and z = 0
(c) between z = 0.60 and z = 1.80
(d) between z = – 0.30 and z = 2.25
(e) to the right of z = 1.95
Solution:
(a) From Table 6: The area between z = 0 and z = 1.50 is 0.4332
Solution:

(b) In order to find the area to the left of z = 0, the symmetry of the normal
curve can be used. In this case, the area between z = –2.10 and z = 0 is the
same as the area between z = 0 and z = + 2.10. From Table 6: The area is
0.4821
Solution:

(c) The area between z = 0.60 and z = 1.80 may be found by subtracting the area
between z = 0 and z = 0.60 from the area between z = 0 and z = 1.80.
From Table 6:
The area between z = 0 and z = 1.80 is 0.4641
The area between z = 0 and z = 0.60 is 0.2257
Therefore, the required area is 0.4641 – 0.2257 = 0.2384
Expanded version of the previous slide, just to get the mental image:

First, find the area for the bigger z-score, i.e., between z = 0 and z = 1.8:
= 0.4641
Then, find the area for the smaller z-score, i.e., between z = 0 and z = 0.6:
= 0.2257
Finally, subtract the smaller area from the bigger area:
0.4641 – 0.2257 = 0.2384
(d) The area between z = –0.30 and z = 2.25 may be found by adding the area
between z = – 0.30 and z = 0 to the area between z = 0 and z = 2.25. The area
between z = – 0.30 and z = 0 equals the area between z = 0 and z = + 0.30 (by
symmetry).
From Table 6:
The area between z = 0 and z = 0.30 is 0.1179
The area between z = 0 and z = 2.25 is 0.4878
Therefore, the required area is 0.1179 + 0.4878 = 0.6057
(e) From the facts that the normal curve is symmetrical about the mean and the total
area under the curve is 1, it follows that the area to the right of z = 0 is 0.5 and that the
area to the left of z = 0 is also 0.5. Thus, to find the area to the right of z = 1.95, we
subtract the area between z = 0 and z = 1.95 from 0.5.
From Table 6:
The area between z = 0 and z = 1.95 is 0.4744.
Therefore, the required area is 0.5000 – 0.4744 = 0.0256
Expanded version of the previous slide:

The entire area to the right of 0 is always 0.5, and likewise the area to
the left of 0 is 0.5. Note: the z-score is not 0.5; the entire area to
the right of 0 (or the mean) is 0.5.
Find the area for z = 1.95, i.e., between z = 0 and z = 1.95:
= 0.4744
To find the area to the right of z = 1.95, subtract the smaller area from
the bigger area:
0.5 – 0.4744 = 0.0256
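The five Table 6 answers can be cross-checked with a short sketch. The helper `table_area` is our own stand-in for a Table 6 lookup: it computes the area between 0 and |z| from the error function instead of reading a printed table. Because the table entries are rounded to four decimal places, a computed answer may differ from the table arithmetic in the last digit.

```python
import math


def table_area(z):
    """Area under the standard normal curve between 0 and |z|.

    Plays the role of a Table 6 lookup; the sign of z is ignored
    because the curve is symmetric about zero.
    """
    return 0.5 * math.erf(abs(z) / math.sqrt(2))


a = table_area(1.50)                     # (a) between 0 and 1.50
b = table_area(-2.10)                    # (b) between -2.10 and 0, by symmetry
c = table_area(1.80) - table_area(0.60)  # (c) same side of the mean: subtract
d = table_area(-0.30) + table_area(2.25) # (d) opposite sides: add
e = 0.5 - table_area(1.95)               # (e) right tail: subtract from 0.5

for label, area in zip("abcde", (a, b, c, d, e)):
    print(f"({label}) {area:.4f}")
```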
Conversion to raw scores
• In order to determine appropriate areas under any normal curve, the z-score
(or standard score) may be calculated

• The z-scores express the given problem in ‘standard form’ so that the
standard normal curve can be used
• To convert a raw score of x (from a distribution with mean μ and standard deviation σ) to
a z-score, subtract the mean from x and divide by the standard deviation
• To convert a z-score to a raw score x, multiply the z-score by the standard deviation and
add this product to the mean

• In equation form:

$$x = \mu + z\sigma$$
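Conversion to raw score
• To find x:

$$z = \frac{x - \mu}{\sigma}$$

$$z \times \sigma = \frac{x - \mu}{\sigma} \times \sigma$$

Re-arrange:

$$z\sigma + \mu = x - \mu + \mu$$

$$x = z\sigma + \mu$$

$$\therefore\ x = \mu + z\sigma$$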
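A minimal sketch of the conversion back to a raw score. The distribution used here (mean 50, standard deviation 10) is hypothetical and chosen only for easy arithmetic; the function name `to_raw` is likewise our own.

```python
def to_raw(z, mu, sigma):
    """Convert a z-score back to a raw score: x = mu + z * sigma."""
    return mu + z * sigma


mu, sigma = 50, 10  # hypothetical distribution, for illustration only

x = to_raw(1.5, mu, sigma)
print("z = 1.5 corresponds to x =", x)

# Round trip: standardising x should give back the original z-score
z_back = (x - mu) / sigma
print("standardising x again gives z =", z_back)
```

The round trip shows that the two conversions undo each other.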
Computation of proportions
A proportion within a given interval

• If a set of data has a normal distribution, the proportion of
observations that lie in a particular interval can be found using
the following procedure:
1. Determine the z-score for each end point of the interval

2. Find the area (from Table 6) for each z-value (If the z-value is negative,
ignore the sign)

3. If the end points of the interval lie on opposite sides of the mean, add the
two areas found in Step 2. If the two end points lie on the same side of
the mean, subtract the smaller area from the larger one
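The three steps above can be sketched as follows, using the watermelon distribution from earlier in the slides (μ = 4, σ = 1). The helper `table_area` stands in for a Table 6 lookup and `proportion_between` is our own name; neither comes from the textbook.

```python
import math


def table_area(z):
    """Stand-in for Table 6: area between 0 and |z| (sign ignored)."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))


def proportion_between(a, b, mu, sigma):
    """Proportion of a normal population lying between a and b."""
    # Step 1: z-score for each end point of the interval
    za = (a - mu) / sigma
    zb = (b - mu) / sigma
    # Step 2: table area for each z-value, ignoring the sign
    area_a = table_area(za)
    area_b = table_area(zb)
    # Step 3: add if the end points straddle the mean, otherwise subtract
    if za * zb < 0:
        return area_a + area_b
    return abs(area_a - area_b)


mu, sigma = 4, 1  # watermelon mean and standard deviation from the slides
print(f"between 2 and 6 kg: {proportion_between(2, 6, mu, sigma):.4f}")
print(f"between 5 and 7 kg: {proportion_between(5, 7, mu, sigma):.4f}")
```

The first interval has end points on opposite sides of the mean, so the areas are added; the second lies entirely above the mean, so the smaller area is subtracted from the larger.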
The proportion greater or less than a given value

• Table 6 can also be used to determine what proportion of a
normal population is greater than a certain value or less than a
certain value
• To determine the proportion of a normal distribution greater than a value
of x, calculate the z-score corresponding to x and find the area to the right
of this score

• To determine the proportion of a normal distribution less than a value of
x, calculate the z-score corresponding to x and find the area to the left of
this score
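A sketch of the two tail calculations, again for the watermelon distribution (μ = 4, σ = 1). The function names are our own, and `table_area` is a stand-in for a Table 6 lookup. The approach relies on the fact that each half of the curve has area 0.5.

```python
import math


def table_area(z):
    """Stand-in for Table 6: area between 0 and |z| (sign ignored)."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))


def proportion_greater(x, mu, sigma):
    """Area to the right of the z-score corresponding to x."""
    z = (x - mu) / sigma
    if z >= 0:
        # x is at or above the mean: take the table area off the right half
        return 0.5 - table_area(z)
    # x is below the mean: the whole right half plus the table area
    return 0.5 + table_area(z)


def proportion_less(x, mu, sigma):
    """Area to the left of the z-score corresponding to x."""
    return 1.0 - proportion_greater(x, mu, sigma)


mu, sigma = 4, 1  # watermelon mean and standard deviation from the slides
print(f"more than 5 kg: {proportion_greater(5, mu, sigma):.4f}")
print(f"less than 1 kg: {proportion_less(1, mu, sigma):.4f}")
```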
Summary
• We looked at identifying the properties of the normal distribution and
normal curve

• We identified the characteristics of the standard normal curve

• We understood examples of normally distributed data

• We read z-score tables and found areas under the normal curve

• We found the z-score, given the area under the normal curve

• We computed proportions
