Professional Documents
Culture Documents
Statistics For Social Science
Statistics For Social Science
Lecture 6
L6
2
Describing Data
A Symmetric Distribution
It is a symmetric distribution, which is also
mound-shaped.
This distribution has the same shape on either
side of the center. If the polygon were folded in
half, the two halves would be identical.
For any symmetric distribution, the mode,
median, and mean are located at the center and
are always equal.
L6
4
The Relative Positions of the Mean, Median, and Mode
A measure of location, such as the mean or the median, only describes the
center of the data. It is valuable from that standpoint, but it does not tell us
anything about the spread of the data.
A small value for a measure of dispersion indicates that the data are clustered
closely, say, around the arithmetic mean. The mean is therefore considered
representative of the data.
Conversely, a large measure of dispersion indicates that the mean is not
reliable.
L6
7
Measures of Dispersion
A defect of the range is that it is based on only two values, the highest
and the lowest; it does not take into consideration all of the values.
L6
9
Mean Deviation
The arithmetic mean of the absolute values of the deviations from the
arithmetic mean. As not like range, it does take into consideration all of the
values. It measures the mean amount by which the values in a population, or
sample, vary from their mean.
L6
10
Mean Deviation
The mean deviation is 16 cups of coffee. That is, the number of cups of coffee
sold deviates, on average, by 16 from the mean of 50 cups.
L6
14
Mean Deviation
The mean deviation is 12.4 cups of coffee. That is, the number of cups of
coffee sold deviates, on average, by 12.4 from the mean of 50 cups.
L6
15
Mean Deviation
Let’s interpret and compare the results of our measures for the two
shops. The mean and median of the two locations are exactly the
same, 50 cups sold. These measures of location indicate the two
distributions are the same. The range for both locations is also the
same, 60. However, recall that the range provides limited
information about the dispersion, because it is based on only two
of the observations.
L6
16
Mean Deviation
The mean deviations are not the same for the two shops. The mean
deviation is based on the differences between each observation and the
arithmetic mean. It shows the closeness or clustering of the data
relative to the mean or center of the distribution.
Compare the mean deviation for Banani of 16 to the mean deviation
for Gulshan of 12.4. Based on the mean deviation, we conclude that
the dispersion for the sales distribution of the shop at Gulshan is more
concentrated—that is, nearer the mean of 50—than the shop at Banani.
L6
17
Variance and Standard Deviation
The variance and standard deviation are also based on the deviations from
the mean. However, instead of using the absolute value of the deviations, the
variance and the standard deviation square the deviations.
The variance is the arithmetic mean of the squared deviations from the mean.
The standard deviation is the square root of the variance.
L6
18
Population Variance
The formulas for the population variance and the sample variance are slightly
different.
The Population Variance is considered first.
L6
19
Population Variance
The formulas for the population variance and the sample variance are slightly
different.
The Population Variance is considered first.
=
L6
23
Population Variance
Both the range and the mean deviation are easy to interpret. The range is the
difference between the high and low values of a set of data, and the mean
deviation is the mean of the deviations from the mean. However, the variance
is difficult to interpret for a single set of observations. The variance of 124 for
the number of sales issued is not in terms of sales, but sales squared.
L6
27
Population Standard Deviation
There is a way out of this difficulty. By taking the square root of the
population variance, we can transform it to the same unit of measurement
used for the original data. The square root of 124 is 11.14. The units are now
simply sales. The square root of the population variance is the Population
Standard Deviation.
Population Standard Deviation:
L6
28
Sample Variance
2
2 ∑ ( 𝑋 − ´
𝑋 )
𝑠 =
𝑛 −1
L6
29
Sample Variance
The hourly wages for a sample of part-time employees at Home Depot are:
12 20 16 18 19
=
L6
32
Sample Variance
Wage (X) X - Xbar (𝑋−Xbar)^2
12 -5 25
20 3 9
16 -1 1
18 1 1
19 2 4
85 0 40
L6
35
Sample Standard Deviation
End of Lecture 6
Thank you