
Normal Probability Distributions
Normal Distributions
• This pdf is the most popular distribution
for continuous random variables
• First described by de Moivre in 1733
• Elaborated in 1812 by Laplace
• Describes some natural phenomena
• More importantly, describes sampling
characteristics of totals and means
Normal Probability Density
Function
• Recall: continuous random variables are described with probability density function (pdf) curves
• Normal pdfs are recognized by their typical bell shape
Figure: Age distribution of a pediatric population with overlying Normal pdf
Area Under the Curve
• pdfs should be viewed
almost like a histogram
• Top Figure: The darker
bars of the histogram
correspond to ages ≤ 9
(~40% of distribution)
• Bottom Figure: shaded
area under the curve
(AUC) corresponds to
ages ≤ 9 (~40% of area)
7: Normal Probability Distributions 4
Parameters μ and σ
• Normal pdfs have two parameters
μ - expected value (mean “mu”)
σ - standard deviation (sigma)

μ controls location; σ controls spread



Mean and Standard Deviation
of Normal Density

Standard Deviation σ
• Points of inflection lie one σ below and one σ above μ
• Practice sketching Normal curves
• Feel the inflection points (where the curvature changes direction)
• Label the horizontal axis with σ landmarks



Two types of means and standard
deviations
• The mean and standard deviation from
the pdf (denoted μ and σ) are
parameters
• The mean and standard deviation from
a sample (“xbar” and s) are statistics
• Statistics and parameters are related,
but are not the same thing!



68-95-99.7 Rule for
Normal Distributions
• 68% of the AUC within ±1σ of μ
• 95% of the AUC within ±2σ of μ
• 99.7% of the AUC within ±3σ of μ
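
The rule can be checked numerically. The following is a minimal sketch using the `NormalDist` class from Python's standard `statistics` module; the helper name `area_within` is made up for this illustration.

```python
from statistics import NormalDist

def area_within(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Area under the Normal curve within k standard deviations of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# The area depends only on k, not on the particular mu and sigma.
for k in (1, 2, 3):
    print(f"within ±{k}σ of μ: {area_within(k):.4f}")
```

The exact areas are about 0.6827, 0.9545 and 0.9973, which the rule rounds to 68%, 95% and 99.7%.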



Example: 68-95-99.7 Rule
Wechsler adult intelligence scores: Normally distributed with μ = 100 and σ = 15; X ~ N(100, 15)
• 68% of scores within μ ± σ = 100 ± 15 = 85 to 115
• 95% of scores within μ ± 2σ = 100 ± (2)(15) = 70 to 130
• 99.7% of scores within μ ± 3σ = 100 ± (3)(15) = 55 to 145
Symmetry in the Tails
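Because the Normal curve is symmetrical and the total AUC is exactly 1, we can easily determine the AUC in the tails.
For example, when the middle 95% of the AUC is known, the remaining 5% is split evenly: 2.5% in each tail.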
Example: Male Height
• Male height: Normal with μ = 70.0˝ and σ = 2.8˝
• 68% within μ ± σ = 70.0 ± 2.8 = 67.2 to 72.8
• 32% in tails (below 67.2˝ and above 72.8˝)
• 16% below 67.2˝ and 16% above 72.8˝ (symmetry)
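
The tail areas in this example can also be computed directly rather than read off the rule. This is a sketch only, using `statistics.NormalDist` from the Python standard library with the μ and σ given on the slide; the variable names are illustrative.

```python
from statistics import NormalDist

height = NormalDist(mu=70.0, sigma=2.8)   # inches, parameters from the slide

below = height.cdf(67.2)          # lower tail: Pr(X <= mu - sigma)
above = 1.0 - height.cdf(72.8)    # upper tail: Pr(X >= mu + sigma)
middle = 1.0 - below - above      # central area within mu ± sigma

print(f"below 67.2: {below:.3f}, above 72.8: {above:.3f}, middle: {middle:.3f}")
```

Both tails come out equal (about 0.159 each), which is the symmetry argument in numbers; the slide rounds them to 16%.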



Reexpression of Non-Normal
Random Variables
• Many variables are not Normal but can be
reexpressed with a mathematical
transformation to be Normal
• Examples of mathematical transforms used for this purpose:
– logarithmic
– exponential
– square roots
• Review logarithmic transformations…
Logarithms
• Logarithms are exponents of their base
• Common log (base 10)
– log(10^0) = 0
– log(10^1) = 1
– log(10^2) = 2
• Natural ln (base e)
– ln(e^0) = 0
– ln(e^1) = 1
Figure: Base 10 log function
Example: Logarithmic Reexpression
• Prostate Specific Antigen (PSA) is used to screen for prostate cancer
• In non-diseased populations, it is not Normally distributed, but its logarithm is:
• ln(PSA) ~N(−0.3, 0.8)
• 95% of ln(PSA) within
= μ ± 2σ
= −0.3 ± (2)(0.8)
= −1.9 to 1.3
Take exponents of “95% range”
⇒ e^−1.9 and e^1.3 = 0.15 and 3.67
⇒ Thus, 2.5% of non-diseased population have values greater than 3.67
⇒ use 3.67 as screening cutoff
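
The back-transformation in this example can be reproduced in a few lines. This is a minimal sketch using only the `math` module; the variable names are chosen for illustration, and the parameters are the ones quoted on the slide.

```python
import math

mu, sigma = -0.3, 0.8            # parameters of ln(PSA) from the slide

low_log = mu - 2 * sigma         # lower end of the 95% range on the log scale
high_log = mu + 2 * sigma        # upper end of the 95% range on the log scale

# Exponentiate to return from the log scale to the original PSA scale.
low, high = math.exp(low_log), math.exp(high_log)
print(f"95% range for PSA: {low:.2f} to {high:.2f}")
```

The upper end, about 3.67, is the value suggested as the screening cutoff.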
§7.2: Determining Normal
Probabilities
When values do not fall directly on σ landmarks:

1. State the problem
2. Standardize the value(s) (z score)
3. Sketch, label, and shade the curve
4. Use Table B
Step 1: State the Problem
• What percentage of gestations are
less than 40 weeks?
• Let X ≡ gestational length
• We know from prior research:
X ~ N(39, 2) weeks
• Pr(X ≤ 40) = ?
Step 2: Standardize
• Standard Normal variable ≡ “Z” ≡ a Normal random variable with μ = 0 and σ = 1
• Z ~ N(0,1)
• Use Table B to look
up cumulative
probabilities for Z
Example: A Z variable
of 1.96 has cumulative
probability 0.9750.



Step 2 (cont.)
Turn value into z score:

z = (x − μ) / σ

z-score = no. of σ-units above (positive z) or below (negative z) distribution mean μ

For the gestation example: z = (40 − 39) / 2 = 0.5



Steps 3 & 4: Sketch & Table B
3. Sketch
4. Use Table B to look up Pr(Z ≤ 0.5) = 0.6915
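
The gestation example can be traced end to end in code. The sketch below uses `statistics.NormalDist` from the Python standard library in place of Table B; the variable names are illustrative and the parameters are those stated in Step 1.

```python
from statistics import NormalDist

mu, sigma = 39.0, 2.0      # X ~ N(39, 2) weeks, from Step 1
x = 40.0                   # value of interest: Pr(X <= 40)

z = (x - mu) / sigma       # Step 2: standardize
p = NormalDist().cdf(z)    # Step 4: cumulative probability for Z (stand-in for Table B)

print(f"z = {z}, Pr(Z <= z) = {p:.4f}")
```

The result agrees with the table value of 0.6915, i.e. about 69% of gestations are less than 40 weeks.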



Probabilities Between Points
a represents a lower boundary
b represents an upper boundary
Pr(a ≤ Z ≤ b) = Pr(Z ≤ b) − Pr(Z ≤ a)
Between Two Points
Pr(-2 ≤ Z ≤ 0.5) = Pr(Z ≤ 0.5) − Pr(Z ≤ -2)
.6687 = .6915 − .0228

Figure: shaded areas under the standard Normal curve: .6687 between -2 and 0.5, .6915 below 0.5, and .0228 below -2
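
The subtraction rule for an interval is easy to express as a small function. This is a sketch assuming the standard Normal distribution from `statistics.NormalDist`; the helper name `prob_between` is hypothetical.

```python
from statistics import NormalDist

def prob_between(a: float, b: float) -> float:
    """Pr(a <= Z <= b) = Pr(Z <= b) - Pr(Z <= a) for a standard Normal Z."""
    z = NormalDist()               # mu = 0, sigma = 1
    return z.cdf(b) - z.cdf(a)

print(f"Pr(-2 <= Z <= 0.5) = {prob_between(-2, 0.5):.4f}")
```

The same function works for any pair of boundaries with a ≤ b.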
Values Corresponding to Normal
Probabilities
1. State the problem
2. Find Z-score corresponding to
percentile (Table B)
3. Sketch
4. Unstandardize: x = μ + zσ
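
Going from a probability back to a value reverses the earlier steps. The sketch below assumes the usual inverse of the z-score formula, x = μ + zσ, and uses `NormalDist.inv_cdf` in place of a reverse Table B lookup. The 97.5th percentile and the reuse of the gestation parameters are illustrative choices, not taken from this slide.

```python
from statistics import NormalDist

def unstandardize(z: float, mu: float, sigma: float) -> float:
    """Convert a z score back to the original scale: x = mu + z * sigma."""
    return mu + z * sigma

p = 0.975                                  # illustrative percentile (assumption)
z = NormalDist().inv_cdf(p)                # z score with cumulative probability p
x = unstandardize(z, mu=39.0, sigma=2.0)   # gestation parameters used earlier

print(f"z = {z:.2f}, x = {x:.2f} weeks")
```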
SKEWNESS & KURTOSIS
Concept of Skewness

A distribution is said to be skewed when the mean, median and mode fall at different positions in the distribution and the balance (or center of gravity) is shifted to one side or the other, i.e. to the left or to the right.
Therefore, the concept of skewness helps us to understand the relationship between three measures:
•Mean
•Median
•Mode
Symmetrical Distribution

• A frequency distribution is said to be symmetrical if the frequencies are equally distributed on both sides of the central value.
• A symmetrical distribution may be either bell-shaped or U-shaped.
• In a symmetrical distribution, the values of mean, median and mode are equal, i.e. Mean=Median=Mode
Skewed Distribution

• A frequency distribution is said to be skewed if the frequencies are not equally distributed on both sides of the central value.
• A skewed distribution may be:
• Positively Skewed
• Negatively Skewed
Skewed Distribution

• Negatively Skewed
• In this, the distribution is skewed to the left (negative)
• Here, Mode exceeds Mean and Median.
Mean<Median<Mode

• Positively Skewed
• In this, the distribution is skewed to the right (positive)
• Here, Mean exceeds Mode and Median.
Mode<Median<Mean
Tests of Skewness

In order to ascertain whether a distribution is skewed or not, the following tests may be applied. Skewness is present if:
•The values of mean, median and mode do not coincide.
•When the data are plotted on a graph they do not give the normal bell shaped
form i.e. when cut along a vertical line through the center the two halves are not
equal.
•The sum of the positive deviations from the median is not equal to the sum of the
negative deviations.
•Quartiles are not equidistant from the median.
•Frequencies are not equally distributed at points of equal deviation from the
mode.
Graphical Measures of Skewness

• Measures of skewness help us to know to what degree and in which direction (positive or negative) the frequency distribution departs from symmetry.
• Positive or negative skewness can be detected graphically, depending on whether the right tail or the left tail is longer, but a graph gives no idea of the magnitude.
• Hence some statistical measures are required to find the magnitude of the lack of symmetry.

Figure: three frequency curves. Symmetrical: Mean=Median=Mode. Skewed to the Left: Mean<Median<Mode. Skewed to the Right: Mean>Median>Mode.
Statistical Measures of Skewness

Absolute Measures of Skewness
Following are the absolute measures of skewness:
•Skewness (Sk) = Mean – Median
•Skewness (Sk) = Mean – Mode
•Skewness (Sk) = (Q3 - Q2) - (Q2 - Q1)

Relative Measures of Skewness
There are four measures of skewness:
•β and γ Coefficient of skewness
•Karl Pearson's Coefficient of skewness
•Bowley’s Coefficient of skewness
•Kelly’s Coefficient of skewness
β and γ Coefficient of Skewness


Karl Pearson's Coefficient of Skewness……01

• This method is most frequently used for measuring skewness. The formula for measuring coefficient of skewness is given by

SKP = (Mean – Mode) / σ

Where,
SKP = Karl Pearson's Coefficient of skewness,
σ = standard deviation.

Normally, this coefficient of skewness lies between -3 and +3.


Karl Pearson's Coefficient of Skewness…..02
In case the mode is indeterminate, it is replaced using the empirical relation Mode = 3 Median – 2 Mean, and the coefficient of skewness is:

SKP = [Mean – (3 Median – 2 Mean)] / σ

Now this formula is equal to

SKP = 3(Mean – Median) / σ

The value of coefficient of skewness is zero when the distribution is symmetrical.
The value of coefficient of skewness is positive when the distribution is positively skewed.
The value of coefficient of skewness is negative when the distribution is negatively skewed.
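
Both forms of Karl Pearson's coefficient can be computed with the Python standard library. This is a sketch under stated assumptions: σ is taken as the population standard deviation (`pstdev`), the function names are made up, and the data set is hypothetical.

```python
from statistics import mean, median, mode, pstdev

def pearson_skew_mode(data):
    """SKP = (Mean - Mode) / sigma."""
    return (mean(data) - mode(data)) / pstdev(data)

def pearson_skew_median(data):
    """SKP = 3 (Mean - Median) / sigma, used when the mode is indeterminate."""
    return 3 * (mean(data) - median(data)) / pstdev(data)

marks = [2, 3, 3, 4, 5, 7, 11]   # hypothetical data with a long right tail

print(f"mode form:   {pearson_skew_mode(marks):.3f}")
print(f"median form: {pearson_skew_median(marks):.3f}")
```

Both values are positive for this data set, matching a positively skewed distribution. The two forms generally differ, because Mode = 3 Median – 2 Mean is only an empirical approximation.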
Bowley’s Coefficient of Skewness……01

Bowley developed a measure of skewness, which is based on quartile values.
The formula for measuring skewness is:

SKB = [(Q3 – Q2) – (Q2 – Q1)] / (Q3 – Q1)

Where,
SKB = Bowley’s Coefficient of skewness,
Q1 = first quartile,
Q2 = second quartile (the median),
Q3 = third quartile.
Bowley’s Coefficient of Skewness…..02

The above formula can be converted to:

SKB = (Q3 + Q1 – 2 Median) / (Q3 – Q1)

The value of coefficient of skewness is zero if it is a symmetrical distribution.
If the value is greater than zero, it is a positively skewed distribution.
And if the value is less than zero, it is a negatively skewed distribution.
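
Bowley's coefficient needs only the three quartiles. In the sketch below, the quartiles are taken from `statistics.quantiles` with its default "exclusive" method; other quartile conventions can give slightly different values. The function names and the data set are illustrative.

```python
from statistics import quantiles

def bowley_skew(q1: float, q2: float, q3: float) -> float:
    """SKB = (Q3 + Q1 - 2 * Median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)

def bowley_from_data(data) -> float:
    """Compute the quartiles from raw data, then apply Bowley's formula."""
    q1, q2, q3 = quantiles(data, n=4)   # three cut points: Q1, Q2, Q3
    return bowley_skew(q1, q2, q3)

marks = [2, 3, 3, 4, 5, 7, 11]          # hypothetical data with a long right tail
print(f"Bowley's coefficient: {bowley_from_data(marks):.2f}")
```

Because the denominator is the interquartile range, this coefficient always lies between -1 and +1.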
Kelly’s Coefficient of Skewness…..01

Kelly developed another measure of skewness, which is based on percentiles and deciles.
The formula for measuring skewness based on percentiles is as follows:

SKK = (P90 – 2P50 + P10) / (P90 – P10)

Where,
SKK = Kelly’s Coefficient of skewness,
P90 = ninetieth percentile,
P50 = fiftieth percentile,
P10 = tenth percentile.
Kelly’s Coefficient of Skewness…..02

The formula for measuring skewness based on deciles is as follows:

SKK = (D9 – 2D5 + D1) / (D9 – D1)

Where,
SKK = Kelly’s Coefficient of skewness,
D9 = ninth decile,
D5 = fifth decile,
D1 = first decile.
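
Since D1 = P10, D5 = P50 and D9 = P90, one function covers both forms of Kelly's coefficient. This is a minimal sketch with deciles from `statistics.quantiles` (default "exclusive" method); the function names and the example values are hypothetical.

```python
from statistics import quantiles

def kelly_skew(low: float, mid: float, high: float) -> float:
    """SKK = (P90 - 2*P50 + P10) / (P90 - P10), with low=P10, mid=P50, high=P90."""
    return (high - 2 * mid + low) / (high - low)

def kelly_from_data(data) -> float:
    """Compute D1, D5 and D9 from raw data, then apply Kelly's formula."""
    deciles = quantiles(data, n=10)     # nine cut points: D1 .. D9
    d1, d5, d9 = deciles[0], deciles[4], deciles[8]
    return kelly_skew(d1, d5, d9)

# Made-up percentiles: the median sits closer to P10 than to P90.
print(f"Kelly's coefficient: {kelly_skew(10, 20, 40):.3f}")
```

A positive value, as here, indicates that the upper tail is stretched out relative to the lower one.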
Example:
Homework:
• Ques: The following are the marks of 150 students in an examination. Calculate Karl Pearson’s coefficient of skewness.

Marks No. of Students
0-10 20
10-20 10
20-30 40
30-40 0
40-50 15
50-60 20
60-70 15
70-80 10
80-90 30
Moments:

•In Statistics, moments are used to indicate peculiarities of a frequency distribution.
•The utility of moments lies in the sense that they indicate different aspects of a given distribution.
•Thus, by using moments, we can measure the central tendency of a series, dispersion or variability, skewness and the peakedness of the curve.
Moments:
Moments around Mean
Moments around any Arbitrary No.
Conversion formula for Moments

1st moment: (Mean)
2nd moment: (Variance)
3rd moment: (Skewness)
4th moment: (Kurtosis)

Two important constants calculated from μ2, μ3 and μ4 are:
• β1 (read as beta one)
• β2 (read as beta two)
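
The sketch below illustrates converting moments about an arbitrary origin A into moments about the mean. It assumes the standard textbook conversion formulas (μ2 = μ′2 − μ′1², μ3 = μ′3 − 3μ′2μ′1 + 2μ′1³, μ4 = μ′4 − 4μ′3μ′1 + 6μ′2μ′1² − 3μ′1⁴) and the usual definitions β1 = μ3²/μ2³ and β2 = μ4/μ2². The data, the origin A and all function names are made up for illustration.

```python
def raw_moments(data, origin):
    """First four moments about an arbitrary origin A."""
    n = len(data)
    return [sum((x - origin) ** r for x in data) / n for r in (1, 2, 3, 4)]

def central_from_raw(m1, m2, m3, m4):
    """Convert moments about A (m1..m4) into moments about the mean."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4

data = [2, 3, 3, 4, 5, 7, 11]   # hypothetical observations
A = 4                           # arbitrary origin (assumption)

m1, m2, m3, m4 = raw_moments(data, A)
mu2, mu3, mu4 = central_from_raw(m1, m2, m3, m4)

mean = A + m1                   # the first raw moment recovers the mean
beta1 = mu3 ** 2 / mu2 ** 3     # constant related to skewness
beta2 = mu4 / mu2 ** 2          # constant related to kurtosis
print(f"mean={mean:.2f} variance={mu2:.2f} beta1={beta1:.2f} beta2={beta2:.2f}")
```

A quick check on the conversion is to call `raw_moments(data, mean)`: the moments about the mean computed directly should match `mu2`, `mu3` and `mu4`.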
Kurtosis

•Kurtosis is another measure of the shape of a frequency curve. It is a Greek word, which means bulginess.

•While skewness signifies the extent of asymmetry, kurtosis measures the degree
of peakedness of a frequency distribution.

•Karl Pearson classified curves into three types on the basis of the shape of their
peaks. These are:-

–Leptokurtic
–Mesokurtic
–Platykurtic
Kurtosis

• When the peak of a curve becomes relatively high, then that curve is called Leptokurtic.
• When the curve is flat-topped, then it is called Platykurtic.
• Since the normal curve is neither very peaked nor very flat-topped, it is taken as a basis for comparison.
• This normal curve is called Mesokurtic.
Measure of Kurtosis

• There are two measures of Kurtosis:

• Karl Pearson’s Measures of Kurtosis

• Kelly’s Measure of Kurtosis


Karl Pearson’s Measures of Kurtosis

Formula Result:
• •
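
As a hedged illustration of the moment-based measure, the sketch below assumes the standard definition β2 = μ4/μ2², for which a Normal curve gives β2 = 3. Values above 3 are read as Leptokurtic and values below 3 as Platykurtic. The tolerance `tol` and the function names are hypothetical choices for this example.

```python
import math

def beta2(data) -> float:
    """beta2 = mu4 / mu2**2, using moments about the mean."""
    n = len(data)
    mean = sum(data) / n
    mu2 = sum((x - mean) ** 2 for x in data) / n
    mu4 = sum((x - mean) ** 4 for x in data) / n
    return mu4 / mu2 ** 2

def classify_kurtosis(b2: float, tol: float = 0.05) -> str:
    """Compare beta2 with 3, the value for a Normal curve (tol is an assumption)."""
    if math.isclose(b2, 3.0, abs_tol=tol):
        return "Mesokurtic"
    return "Leptokurtic" if b2 > 3.0 else "Platykurtic"

flat = [1, 2, 3, 4, 5]                 # evenly spread, flat-topped
peaked = [0] * 8 + [10, -10]           # mostly at the center, two far outliers

for name, sample in (("flat", flat), ("peaked", peaked)):
    b2 = beta2(sample)
    print(f"{name}: beta2 = {b2:.2f} -> {classify_kurtosis(b2)}")
```

The excess over the Normal value, γ2 = β2 − 3, is often reported instead, so that a Mesokurtic curve scores zero.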
Kelly’s Measure of Kurtosis

Formula Result:
• •
Example:
