Fundamentals of Statistics With MS Excel
Presented By:
Engr. Raniel B. Taripe, CIE MBA-FM
Analytics is everywhere…
Statistics is everywhere…
OUTLINE:
In this session, we will try to
answer the following questions:
$\bar{x} = \frac{\sum x}{n}$
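The mean is the sum of the values divided by the number of values. A minimal sketch in Python, used here only for illustration alongside the Excel exercises; the `scores` list is hypothetical sample data, and Excel's AVERAGE function is expected to return the same figure.

```python
from statistics import mean

# Hypothetical sample data, chosen only for illustration.
scores = [4, 8, 15, 16, 23, 42]

# Sum of the values divided by the number of values.
manual_mean = sum(scores) / len(scores)

# The standard library computes the same quantity.
library_mean = mean(scores)

print(manual_mean, library_mean)
```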
EXCEL TIME!!!
MEDIAN
• A dataset arranged in ascending or descending order is called a
data array. The median is the midpoint of that data array. The
median is best suited to ordinal data.
PROPERTIES OF MEDIAN
• 1. The median is unique; there is only one median for a set of data.
• 2. The median is found by arranging the set of data from lowest to
highest (or vice versa) and taking the value of the middle observation.
• 3. The median is not affected by extremely small or large values.
• 4. The median can be computed for an open-ended frequency distribution.
• 5. The median can be applied to ordinal, interval and ratio data.
• 6. The median is the most appropriate measure for skewed data.
$\text{Rank value of the median} = \frac{n+1}{2}$
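The rank value (n + 1) / 2 gives the position of the median in the ordered data array. The sketch below is one way to apply it; the helper name `median_by_rank` and both samples are hypothetical. It assumes the usual convention that, when n is even and the rank falls between two positions, the two middle observations are averaged.

```python
import math
from statistics import median

def median_by_rank(values):
    """Locate the median through the rank value (n + 1) / 2."""
    ordered = sorted(values)   # build the data array
    n = len(ordered)
    rank = (n + 1) / 2         # 1-based position of the median
    lower = ordered[math.floor(rank) - 1]
    upper = ordered[math.ceil(rank) - 1]
    # For an odd n both positions coincide; for an even n the two
    # middle observations are averaged.
    return (lower + upper) / 2

odd_sample = [7, 3, 9, 1, 5]   # hypothetical data, n = 5, rank = 3
even_sample = [7, 3, 1, 5]     # hypothetical data, n = 4, rank = 2.5

print(median_by_rank(odd_sample), median(odd_sample))
print(median_by_rank(even_sample), median(even_sample))
```

Both calls agree with `statistics.median`, which is the function one would normally use directly.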
EXCEL TIME!!!
MODE
• The mode is the value in a dataset that appears most frequently.
Like the median and unlike the mean, the mode is not affected by
extreme values in the dataset.
• A dataset may have no mode if none of the values is “most
typical”.
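A short sketch of finding the mode by counting occurrences. The helper `modes` is hypothetical, and it adopts one possible convention as an assumption: when every distinct value occurs equally often, the dataset is treated as having no mode.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] when there is no mode."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if all distinct values are equally frequent,
    # none of them is "most typical", so there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return [value for value, c in counts.items() if c == highest]

print(modes([2, 3, 3, 5, 8, 3]))   # one mode
print(modes([1, 2, 2, 3, 3]))      # two modes
print(modes([1, 2, 3, 4]))         # no mode
print(modes([2, 3, 3, 5, 1000]))   # the extreme value does not change the mode
```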
$AD = \frac{\sum |x - \bar{x}|}{N}$
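The average deviation (AD) is the mean of the absolute distances between each value and the mean. A minimal sketch with hypothetical data follows; Excel's AVEDEV function is expected to give the same figure.

```python
def average_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

data = [2, 4, 6, 8, 10]   # hypothetical data; the mean is 6

# Absolute deviations are 4, 2, 0, 2, 4, so AD = 12 / 5.
ad = average_deviation(data)
print(ad)
```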
EXCEL TIME!!!
VARIANCE AND
STANDARD DEVIATION
• The standard deviation is the most widely used measure of
dispersion. The more spread out the data points, the higher the
standard deviation.
• The variance is the square of the standard deviation. It is the
average of the squared deviations from the mean.
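The relationship between the two measures can be checked numerically. The sketch below uses hypothetical data and shows both the population versions (divide by N) and the sample versions (divide by n - 1), since the slides do not say which one is intended. In Excel the counterparts are VAR.P and STDEV.P for a population, and VAR.S and STDEV.S for a sample.

```python
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data; the mean is 5

# Population measures: squared deviations are averaged over N.
pop_var = pvariance(data)
pop_sd = pstdev(data)

# Sample measures: squared deviations are divided by n - 1.
sample_var = variance(data)
sample_sd = stdev(data)

print(pop_var, pop_sd)
print(sample_var, sample_sd)
```

In both cases the variance equals the standard deviation squared.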
EXCEL TIME!!!
MEASURES OF
LOCATION
QUARTILES, DECILES
AND PERCENTILES
QUARTILES AND
PERCENTILES
• When presenting or analyzing a dataset, it is sometimes helpful to
group subjects into several equal groups. For example, to create
four equal groups, we need the values that split the data so that
25% of the observations fall in each group. These cut-off points are
called quartiles. Cut-off points that split the dataset into 10 equal
parts are called deciles, and into 100 equal parts, percentiles.
The general term for such cut-off points is quantiles.
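Cut-off points of this kind can be computed with `statistics.quantiles` from the Python standard library. This is a sketch with hypothetical data; the "inclusive" method is chosen on the assumption that it is the closest counterpart to Excel's QUARTILE.INC and PERCENTILE.INC, and other interpolation rules give slightly different values.

```python
from statistics import quantiles

data = [2, 4, 6, 8, 10, 12, 14, 16, 18]   # hypothetical data

# n equal groups need n - 1 cut-off points.
quartiles = quantiles(data, n=4, method="inclusive")      # 3 values
deciles = quantiles(data, n=10, method="inclusive")       # 9 values
percentiles = quantiles(data, n=100, method="inclusive")  # 99 values

q1, q2, q3 = quartiles
p90 = percentiles[89]   # the 90th percentile sits at index 89

print(quartiles)
print(p90)
```

The second quartile, the fifth decile and the 50th percentile all coincide with the median.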
EXCEL TIME!!!
MIDHINGE,
INTERQUARTILE
RANGE, & QUARTILE
DEVIATION
MIDHINGE
• The midhinge is the mean of the first and third quartiles in the
dataset. It is used to overcome potential problems introduced by
extreme values (or outliers) in the dataset.
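A small sketch of why the midhinge resists outliers, using hypothetical data with one extreme value. The quartiles are taken with the "inclusive" method of `statistics.quantiles`, which is an assumption; other quartile conventions may shift the result slightly.

```python
from statistics import mean, quantiles

# Hypothetical data with one extreme value (180).
data = [2, 4, 6, 8, 10, 12, 14, 16, 180]

q1, _, q3 = quantiles(data, n=4, method="inclusive")

# The midhinge is the mean of the first and third quartiles.
midhinge = (q1 + q3) / 2

print(midhinge)    # stays near the bulk of the data
print(mean(data))  # pulled upward by the extreme value
```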
EXCEL TIME!!!
INTERQUARTILE
RANGE
• The interquartile range (IQR), also called the midspread or middle
fifty, is the difference between the third and first quartiles.
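The name "middle fifty" can be illustrated by counting how many observations fall between the two quartiles. The sketch uses the integers 1 to 100 as hypothetical data and assumes the "inclusive" quartile method of `statistics.quantiles`.

```python
from statistics import quantiles

data = list(range(1, 101))   # hypothetical data: 1, 2, ..., 100

q1, _, q3 = quantiles(data, n=4, method="inclusive")

# IQR = Q3 - Q1
iqr = q3 - q1

# Observations lying between the first and third quartiles.
middle = [x for x in data if q1 <= x <= q3]

print(q1, q3, iqr)
print(len(middle) / len(data))   # share of the data inside the IQR
```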
EXCEL TIME!!!
QUARTILE DEVIATION
• The quartile deviation (QD), which is half of the interquartile
range, is a slightly better measure of absolute dispersion than the
range. However, it ignores the observations in the tails.
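The contrast with the range can be sketched by changing a single tail value. The quartile deviation is computed here as half of the interquartile range, QD = (Q3 - Q1) / 2; the helper name, both datasets and the "inclusive" quartile method are assumptions made for illustration.

```python
from statistics import quantiles

def quartile_deviation(values):
    """Half of the interquartile range: (Q3 - Q1) / 2."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / 2

def value_range(values):
    """Difference between the largest and smallest observation."""
    return max(values) - min(values)

# Hypothetical datasets that differ only in the largest value.
regular = [2, 4, 6, 8, 10, 12, 14, 16, 18]
with_extreme = [2, 4, 6, 8, 10, 12, 14, 16, 180]

print(value_range(regular), value_range(with_extreme))
print(quartile_deviation(regular), quartile_deviation(with_extreme))
```

The range jumps when the tail value changes, while the quartile deviation stays the same because it uses only Q1 and Q3.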
EXCEL TIME!!!
COEFFICIENT OF
VARIATION
• For any two samples measured in the same unit, the variance and
standard deviation of each can be compared directly.
• To compare the dispersion of two (or more) datasets measured in
different units, the coefficient of variation can be applied.
$CV = \frac{s}{\bar{x}}(100)$
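Because the coefficient of variation divides the standard deviation by the mean, the unit cancels and the result is a percentage. The sketch below compares two hypothetical samples in different units; it assumes s is the sample standard deviation.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

# Hypothetical samples measured in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(round(cv_heights, 2), round(cv_weights, 2))
```

Both samples have the same standard deviation in their own units, yet the weights vary more relative to their mean.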
EXCEL TIME!!!
KURTOSIS
• Kurtosis comes from the Greek word kyrtos or kurtos, meaning bulging.
• In statistics, kurtosis (or excess) is a measure used to describe
how observed data are distributed around the mean.
• It measures the relative peakedness or flatness of a distribution
compared with the normal distribution, which has an excess kurtosis of 0.
3 TYPES OF KURTOSIS
• Leptokurtic (kurtosis > 0)
• Mesokurtic (kurtosis = 0)
• Platykurtic (kurtosis < 0)
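The three types can be sketched with a population formula for excess kurtosis, the fourth central moment divided by the squared second central moment, minus 3. This is an assumption made for simplicity: Excel's KURT function applies a small-sample correction, so its output will differ from the values here. The helper names and datasets are hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

def kurtosis_type(kurtosis):
    """Classify a kurtosis value relative to the normal distribution (0)."""
    if kurtosis > 0:
        return "leptokurtic"
    if kurtosis < 0:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5]             # evenly spread, no pronounced peak
peaked = [0] * 8 + [-10, 10]       # sharp peak with two far tail values

for sample in (flat, peaked):
    k = excess_kurtosis(sample)
    print(round(k, 2), kurtosis_type(k))
```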
EXCEL TIME!!!
SKEWNESS
• The coefficient of skewness measures the lack of symmetry in the shape of a distribution.
• The range of possible skewness values is theoretically unbounded, but it normally runs
from -3 to +3. The coefficient relates the difference between the mean and the median to the
standard deviation. The long tail of the distribution points in the direction of
the skewness.
• Skewness values above +3 or below -3 are possible and indicate significant
skew, meaning the distribution has a long tail to the right or to the left.
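One coefficient that relates the difference between the mean and the median to the standard deviation is Pearson's median skewness, 3(mean - median) / s. The sketch below assumes that this is the kind of coefficient described, and uses the sample standard deviation with hypothetical data. Excel's SKEW function is based on a different, moment-based formula, so its values will not match these.

```python
from statistics import mean, median, stdev

def pearson_median_skewness(values):
    """Pearson's median skewness: 3 * (mean - median) / s."""
    return 3 * (mean(values) - median(values)) / stdev(values)

# Hypothetical datasets.
right_tail = [1, 2, 3, 4, 100]      # long tail to the right
symmetric = [1, 2, 3, 4, 5]         # no skew
left_tail = [-100, -4, -3, -2, -1]  # long tail to the left

for sample in (right_tail, symmetric, left_tail):
    print(round(pearson_median_skewness(sample), 4))
```

The sign of the result follows the direction of the long tail: positive for a right tail, negative for a left tail, and zero for symmetric data.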
EXCEL TIME!!!
OUTLIERS
• An outlier is an observation that is distant from other observations, or an observation
that lies outside the overall pattern of a distribution. A dataset should be checked for
extremely high or extremely low values, which are called outliers.
• Outliers can strongly affect the mean and standard deviation of a variable.
• Mild outlier: x < Q1 - 1.5(IQR) or x > Q3 + 1.5(IQR)
• Extreme outlier: x < Q1 - 3(IQR) or x > Q3 + 3(IQR)
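The two rules can be applied as a pair of fences around the quartiles. This is a sketch under stated assumptions: the helper `classify_outliers` and the data are hypothetical, quartiles use the "inclusive" method of `statistics.quantiles`, and a value beyond the 3(IQR) fence is reported only as extreme even though it also satisfies the mild rule.

```python
from statistics import quantiles

def classify_outliers(values):
    """Split values into mild and extreme outliers using IQR fences."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    mild_low, mild_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    extreme_low, extreme_high = q1 - 3 * iqr, q3 + 3 * iqr

    mild, extreme = [], []
    for x in values:
        if x < extreme_low or x > extreme_high:
            extreme.append(x)   # beyond the outer fences
        elif x < mild_low or x > mild_high:
            mild.append(x)      # between the inner and outer fences
    return mild, extreme

# Hypothetical data: Q1 = 13, Q3 = 18, IQR = 5.
data = [10, 12, 13, 14, 15, 16, 18, 30, 60]

mild, extreme = classify_outliers(data)
print(mild, extreme)
```

With these numbers the inner fences are 5.5 and 25.5 and the outer fences are -2 and 33.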
EXCEL TIME!!!
BOXPLOT
• A boxplot, or box-and-whisker plot, is a graph of a dataset obtained by drawing a horizontal
line from the minimum data value to Q1, drawing a horizontal line from Q3 to the
maximum data value, and drawing a box whose vertical sides pass through Q1 and Q3, with a
vertical line inside the box passing through the median or second quartile (Q2).
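The five values that this construction relies on (minimum, Q1, median, Q3, maximum) can be gathered before drawing anything. The sketch below only computes them; the helper name and data are hypothetical, and the quartiles assume the "inclusive" method of `statistics.quantiles`.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a boxplot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))

data = [2, 4, 6, 8, 10, 12, 14, 16, 18]   # hypothetical data

minimum, q1, q2, q3, maximum = five_number_summary(data)

# Whiskers run from the minimum to Q1 and from Q3 to the maximum;
# the box spans Q1 to Q3 with a line at the median (Q2).
print(minimum, q1, q2, q3, maximum)
```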
EXCEL TIME!!!
Descriptive Statistics
with Microsoft Excel
DataSense Analytics