Chapter 3 Slides #3 zscore-Empirical-Chebyshev
variability
By integrating the three main features of a data set, we can learn not only
about its overall characteristics but also about particular observations
within it.
To that end, we introduce the z-score (or simply z), the Empirical Rule, and
Chebyshev’s Theorem, all of which build on these three main aspects of
data.
z-scores
The z-score (or simply z) measures the location of a particular value in the
data relative to the mean, expressed in units of standard deviations.
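The definition can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `z_scores` and the data values are made up, and the sample standard deviation (n - 1 divisor) is an assumption, since the slides do not say which divisor is intended.

```python
import statistics

def z_scores(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 divisor).
    # Use statistics.pstdev for the population version.
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative data, mean = 5
zs = z_scores(data)

for x, z in zip(data, zs):
    print(f"x = {x}: z = {z:+.2f}")
```

A value equal to the mean gets z = 0, values above the mean get positive z, and values below the mean get negative z.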
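[Figure: bell-shaped frequency distribution of X. About 68% of values lie within 1 standard deviation of the mean, about 95% within 2, and almost 100% within 3.]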
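For a normal distribution, these percentages can be checked numerically. The sketch below uses the standard relationship between the normal distribution and the error function, P(|Z| < k) = erf(k / sqrt(2)). The helper name `normal_share_within` is hypothetical and chosen only for this illustration.

```python
import math

def normal_share_within(k):
    """Share of a normal distribution lying within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

shares = {k: normal_share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The three shares come out at roughly 68%, 95%, and 99.7%, which is why the last figure is usually read as "almost all" rather than exactly 100%.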
Sometimes a data set will have one or more observations with unusually large or
unusually small values.
These extreme values are often called outliers: values that are out of
the ordinary, or not typical.
How can we tell whether a value is an outlier? This is a difficult question
to answer, even for an experienced statistician.
But if one is willing to assume the data are normal or close to normal, any value
with a z-score below -3 or above 3 may be considered an outlier. The
justification is that, per the Empirical Rule, almost all values should be within 3
standard deviations of the mean. A value outside that range is unusual enough to be treated as an outlier.
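The |z| > 3 rule of thumb can be sketched as follows. The function name `flag_outliers`, the default threshold argument, and the data are all made up for illustration, and the sample standard deviation is again an assumption. The rule is only reasonable when the data are roughly normal.

```python
import statistics

def flag_outliers(values, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return [x for x in values if abs((x - mean) / sd) > threshold]

# Made-up data: 30 values between 50 and 54, plus one extreme value.
data = [50 + (i % 5) for i in range(30)] + [120]
outliers = flag_outliers(data)
print(outliers)
```

Note that an extreme value also inflates the mean and standard deviation used to judge it, so in very small data sets no value can reach a z-score of 3 at all.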
What if the data is not normal? We will come back to this later.
Can we use the Empirical Rule for the following
distributions? The answer is NO, NO, NO.
Do not use Empirical rule - when