
L6

Statistics for Social Science

Lecture 6
Describing Data

 The Relative Positions of the Mean, Median, and Mode


 Dispersion
 Range
 Mean Deviation
 Variance and Standard Deviation
 Population Variance
 Population Standard Deviation
 Sample Variance
 Sample Standard Deviation
The Relative Positions of the Mean, Median, and Mode

A Symmetric Distribution
 The distribution considered here is symmetric and also
mound-shaped.
 This distribution has the same shape on either
side of the center. If the polygon were folded in
half, the two halves would be identical.
 For a symmetric, mound-shaped distribution, the mode,
median, and mean are all located at the center and
are equal.
The Relative Positions of the Mean, Median, and Mode

A Positively Skewed Distribution


 If a distribution is nonsymmetrical, or skewed, the
relationship among the three measures changes.
 In a positively skewed distribution, the arithmetic
mean is the largest of the three measures.
 The mean is influenced more than the median or
mode by a few extremely high values.
 The median is generally the next largest measure in
a positively skewed frequency distribution.
 The mode is the smallest of the three measures.
The Relative Positions of the Mean, Median, and Mode

A Negatively Skewed Distribution


 If a distribution is negatively skewed, the
mean is the lowest of the three measures.
 The mean is, of course, influenced by a few
extremely low observations.
 The median is greater than the arithmetic
mean, and the modal value is the largest of the
three measures.
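
The three cases described above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module. The three small data sets and the helper name `location_measures` are hypothetical, invented only for illustration, and are not taken from the lecture.

```python
import statistics

# Hypothetical data sets, invented for illustration only.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
positively_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]         # a few very high values
negatively_skewed = [1, 7, 11, 12, 13, 13, 14, 14, 14, 15]  # a few very low values


def location_measures(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


for name, data in [
    ("symmetric", symmetric),
    ("positively skewed", positively_skewed),
    ("negatively skewed", negatively_skewed),
]:
    mean, median, mode = location_measures(data)
    print(f"{name}: mean={mean}, median={median}, mode={mode}")
```

For these particular data sets, the three measures coincide in the symmetric case, the mean is the largest in the positively skewed case, and the mean is the smallest in the negatively skewed case.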
Dispersion

 A measure of location, such as the mean or the median, only describes the
center of the data. It is valuable from that standpoint, but it does not tell us
anything about the spread of the data.
 A small value for a measure of dispersion indicates that the data are clustered
closely, say, around the arithmetic mean. The mean is therefore considered
representative of the data.
 Conversely, a large value for a measure of dispersion indicates that the data are
widely spread around the mean, so the mean is less representative of the data.
Measures of Dispersion

 A measure of dispersion can be used to evaluate the reliability of two or more
measures of location.
 Range
 Mean Deviation
 Variance and Standard Deviation
Range

 The simplest measure of dispersion is the range. It is the difference
between the largest and the smallest values in a data set. In the form of
an equation:

 Range = Largest value − Smallest value

 A defect of the range is that it is based on only two values, the highest
and the lowest; it does not take into consideration all of the values.
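
The defect noted above can be seen in a short sketch. The function name `value_range` and both data sets are hypothetical, chosen only to illustrate that two very different sets of values can share the same range.

```python
def value_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)


# Hypothetical observations, invented for illustration only.
spread_out = [7, 9, 12, 18, 25]
clustered = [7, 7, 7, 7, 25]

print(value_range(spread_out))
print(value_range(clustered))
```

Both calls give the same result because only the smallest value (7) and the largest value (25) enter the calculation; the values in between are ignored.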
Mean Deviation

 Mean deviation: the arithmetic mean of the absolute values of the deviations from the
arithmetic mean. Unlike the range, it takes all of the values into consideration.
It measures the mean amount by which the values in a population, or
sample, vary from their mean.
Mean Deviation

 In terms of a formula, the mean deviation, designated MD, is computed for a
sample by:

\[ MD = \frac{\sum |X - \bar{X}|}{n} \]

 \(X\) is the value of each observation.
 \(\bar{X}\) is the arithmetic mean of the values.
 \(n\) is the number of observations in the sample.
 \(|\;|\) indicates the absolute value.
Mean Deviation

 The table shows the number of cups of coffee sold at two coffee shops, at
Banani and at Gulshan, between 4pm and 5pm for a sample of five days last
month.

Banani   Gulshan
20       20
40       49
50       50
60       51
80       80

 Determine the mean, median, range, and mean deviation for each location.
Mean Deviation

 Notice that all three of the measures are exactly the same. Does this indicate
that there is no difference in the two sets of data?
 Now, determine the mean deviation for each location.

         Banani   Gulshan
         20       20
         40       49
         50       50
         60       51
         80       80
Mean     50       50
Median   50       50
Range    60       60
Mean Deviation

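 First, for the shop at Banani:

\[ MD = \frac{\sum |X - \bar{X}|}{n} = \frac{|20-50| + |40-50| + |50-50| + |60-50| + |80-50|}{5} = \frac{30 + 10 + 0 + 10 + 30}{5} = \frac{80}{5} = 16 \]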

 The mean deviation is 16 cups of coffee. That is, the number of cups of coffee
sold deviates, on average, by 16 from the mean of 50 cups.
Mean Deviation

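 And, for the shop at Gulshan:

\[ MD = \frac{\sum |X - \bar{X}|}{n} = \frac{|20-50| + |49-50| + |50-50| + |51-50| + |80-50|}{5} = \frac{30 + 1 + 0 + 1 + 30}{5} = \frac{62}{5} = 12.4 \]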

 The mean deviation is 12.4 cups of coffee. That is, the number of cups of
coffee sold deviates, on average, by 12.4 from the mean of 50 cups.
Mean Deviation

 Let’s interpret and compare the results of our measures for the two
shops. The mean and median of the two locations are exactly the
same, 50 cups sold. Judged by these measures of location alone, the two
distributions appear to be the same. The range for both locations is also the
same, 60. However, recall that the range provides limited
information about the dispersion, because it is based on only two
of the observations.
Mean Deviation

 The mean deviations are not the same for the two shops. The mean
deviation is based on the differences between each observation and the
arithmetic mean. It shows the closeness or clustering of the data
relative to the mean or center of the distribution.
 Compare the mean deviation for Banani of 16 to the mean deviation
for Gulshan of 12.4. Based on the mean deviation, we conclude that
the sales of the shop at Gulshan are more concentrated (that is,
nearer the mean of 50) than the sales of the shop at Banani.
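
The comparison of the two shops can be reproduced with a short sketch. It uses the coffee sales figures from the lecture and Python's standard `statistics` module. The helper name `mean_deviation` is an assumption of this sketch, since the standard library has no built-in function for the mean deviation.

```python
import statistics


def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    mean = statistics.mean(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Cups of coffee sold between 4pm and 5pm on five sampled days.
banani = [20, 40, 50, 60, 80]
gulshan = [20, 49, 50, 51, 80]

for name, cups in (("Banani", banani), ("Gulshan", gulshan)):
    print(
        name,
        "mean:", statistics.mean(cups),
        "median:", statistics.median(cups),
        "range:", max(cups) - min(cups),
        "mean deviation:", mean_deviation(cups),
    )
```

The mean, median, and range come out identical for the two shops, while the mean deviation is smaller for Gulshan, which matches the conclusion drawn above.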
Variance and Standard Deviation

 The variance and standard deviation are also based on the deviations from
the mean. However, instead of using the absolute value of the deviations, the
variance and the standard deviation square the deviations.
 The variance is the arithmetic mean of the squared deviations from the mean.
 The standard deviation is the square root of the variance.
Population Variance

 The formulas for the population variance and the sample variance are slightly
different.
 The Population Variance is considered first.

\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]

 \(\sigma^2\) is the population variance. It is read as “sigma squared.”
 \(X\) is the value of an observation in the population.
 \(\mu\) is the arithmetic mean of the population.
 \(N\) is the number of observations in the population.
Population Variance

 The number of cars sold last year by month at ABC Car Center,
Dhaka, is reported below.

Month      Sales   Month      Sales
January    19      July       45
February   17      August     39
March      22      September  38
April      18      October    44
May        28      November   34
June       34      December   10

 Determine the population variance.
Population Variance

 We begin by determining the arithmetic mean of the population.

\[ \mu = \frac{\sum X}{N} = \frac{19 + 17 + \cdots + 10}{12} = \frac{348}{12} = 29 \]
Population Variance

Month      Sales (X)   X - μ   (X - μ)^2
January    19          -10     100
February   17          -12     144
March      22          -7      49
April      18          -11     121
May        28          -1      1
June       34          5       25
July       45          16      256
August     39          10      100
September  38          9       81
October    44          15      225
November   34          5       25
December   10          -19     361
SUM        348         0       1488
Population Variance

 Dividing the sum of the squared deviations, 1488, by the number of
months, 12, gives the population variance:

\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1488}{12} = 124 \]
Population Standard Deviation

 Both the range and the mean deviation are easy to interpret. The range is the
difference between the high and low values of a set of data, and the mean
deviation is the mean of the absolute deviations from the mean. However, the variance
is difficult to interpret for a single set of observations. The variance of 124 for
the number of cars sold is not in terms of sales, but sales squared.
Population Standard Deviation

 There is a way out of this difficulty. By taking the square root of the
population variance, we can transform it to the same unit of measurement
used for the original data. The square root of 124 is 11.14. The units are now
simply sales. The square root of the population variance is the Population
Standard Deviation.
 Population Standard Deviation:

\[ \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \]
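
The car sales example can be worked step by step in code. The sketch below follows the same steps as the table (mean, deviations, squared deviations, division by N) and then cross-checks the result against `statistics.pvariance` and `statistics.pstdev` from Python's standard library. The variable names are assumptions of this sketch.

```python
import math
import statistics

# Monthly car sales at ABC Car Center, January through December.
sales = [19, 17, 22, 18, 28, 34, 45, 39, 38, 44, 34, 10]

N = len(sales)                                  # number of observations in the population
mu = sum(sales) / N                             # population mean
squared_deviations = [(x - mu) ** 2 for x in sales]
population_variance = sum(squared_deviations) / N
population_std = math.sqrt(population_variance)

print("mean:", mu)
print("sum of squared deviations:", sum(squared_deviations))
print("population variance:", population_variance)
print("population standard deviation:", round(population_std, 2))

# Cross-check against the standard library's population formulas.
assert math.isclose(population_variance, statistics.pvariance(sales))
assert math.isclose(population_std, statistics.pstdev(sales))
```

The variance is in squared units, while the standard deviation is in the same units as the original sales figures.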
Sample Variance

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]
Sample Variance

 \(s^2\) is the sample variance.
 \(X\) is the value of each observation in the sample.
 \(\bar{X}\) is the mean of the sample.
 \(n\) is the number of observations in the sample.
Sample Variance

 The hourly wages for a sample of part-time employees at Home Depot are:

12 20 16 18 19

 What is the sample variance?


Sample Variance

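 The sample mean is:

\[ \bar{X} = \frac{\sum X}{n} = \frac{12 + 20 + 16 + 18 + 19}{5} = \frac{85}{5} = 17 \]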
Sample Variance

Wage (X)   X - Xbar   (X - Xbar)^2
12         -5         25
20         3          9
16         -1         1
18         1          1
19         2          4
--------   --------   ------------
85         0          40
Sample Variance

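 The sample variance is the sum of the squared deviations, 40, divided by
one less than the number of observations:

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{40}{5 - 1} = 10 \]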
Sample Standard Deviation

 The sample standard deviation is used as an estimator of the
population standard deviation. As noted previously, the population
standard deviation is the square root of the population variance.
Likewise, the sample standard deviation is the square root of the
sample variance.
Sample Standard Deviation

 Sample Standard Deviation:

\[ s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]

 The sample standard deviation is $3.16, found by \(\sqrt{10}\). Note again that
the sample variance is in terms of dollars squared, but taking the
square root of 10 gives us $3.16, which is in the same units
(dollars) as the original data.
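
The hourly wage example can be checked in the same way. The sketch below computes the sample variance and sample standard deviation by hand, using the n − 1 divisor, and cross-checks them against `statistics.variance` and `statistics.stdev`. It also prints the result of dividing by n instead, purely to show how the two formulas differ on the same data. The variable names are assumptions of this sketch.

```python
import math
import statistics

# Hourly wages (in dollars) for the sample of five part-time employees.
wages = [12, 20, 16, 18, 19]

n = len(wages)
x_bar = sum(wages) / n                                    # sample mean
sum_squared_deviations = sum((x - x_bar) ** 2 for x in wages)

sample_variance = sum_squared_deviations / (n - 1)        # sample formula: divide by n - 1
sample_std = math.sqrt(sample_variance)
divided_by_n = sum_squared_deviations / n                 # population formula, for contrast only

print("sample mean:", x_bar)
print("sample variance:", sample_variance)
print("sample standard deviation:", round(sample_std, 2))
print("same data divided by n:", divided_by_n)

# Cross-check against the standard library's sample formulas.
assert math.isclose(sample_variance, statistics.variance(wages))
assert math.isclose(sample_std, statistics.stdev(wages))
```

Dividing by n − 1 rather than n gives a slightly larger value, which is the only difference between the sample and population formulas shown in this lecture.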
Topics covered in the Lecture

 The Relative Positions of the Mean, Median, and Mode


 Dispersion
 Range
 Mean Deviation
 Variance and Standard Deviation
 Population Variance
 Population Standard Deviation
 Sample Variance
 Sample Standard Deviation

End of Lecture 6

Thank you
