Descriptive Statistics Part 1


Descriptive Statistics

Module II
Learning Objectives

• Distinguish between measures of central tendency, measures of variability, measures of shape, and measures of association.
• Understand the meanings of mean, median, mode, quartile, percentile, and range.
• Compute mean, median, mode, percentile, quartile, range, variance, standard deviation, and mean absolute deviation on ungrouped data.
• Differentiate between sample and population variance and standard deviation.
Measures of Central Tendency and Dispersion
Measures of Central Tendency
• Mean
  -- Arithmetic Mean
  -- Weighted Mean
• Median
• Mode
• Quartiles, Percentiles, Deciles


Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved.
Measures of Variation

• Range
• Mean Deviation
• Standard Deviation (Variance)
• Interquartile Range
• Coefficient of Variation
• Measures of Skewness and Kurtosis
• Standardised Variables and Scores
Mean / Average
There are three types of means, viz.:

• Arithmetic Mean
• Harmonic Mean
• Geometric Mean
Arithmetic Mean

• The mean is the average of a group of numbers
• Applicable for interval and ratio data
• Not applicable for nominal or ordinal data
• Affected by each value in the data set, including extreme values
• Computed by summing all values in the data set and dividing the sum by the number of values in the data set
Arithmetic Mean
Ungrouped (Raw) Data

$$\bar{x} = \frac{\text{Sum of Observations}}{\text{Number of Observations}} = \frac{\sum x_i}{n}$$
Illustration 4.1

Table 4.1: Equity Holdings of 20 Indian Billionaires (Rs. in Millions)

2717 2796 3098 3144 3527

3534 3862 4186 4310 4506

4745 4784 4923 5034 5071

5424 5561 6505 6707 6874


For the above data, the A.M. is

$$\bar{x} = \frac{2717 + 2796 + \dots + 4745 + \dots + 5424 + \dots + 6874}{20} = \frac{91308}{20} = 4565.4$$

i.e. Rs. 4565.4 Millions.
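The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the data are the 20 values of Table 4.1, while the names `holdings` and `arithmetic_mean` are illustrative and not part of the original slides.

```python
# Equity holdings of 20 Indian billionaires (Rs. in millions), from Table 4.1.
holdings = [
    2717, 2796, 3098, 3144, 3527,
    3534, 3862, 4186, 4310, 4506,
    4745, 4784, 4923, 5034, 5071,
    5424, 5561, 6505, 6707, 6874,
]


def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


total = sum(holdings)
mean_holdings = arithmetic_mean(holdings)
print(f"Sum = {total}, n = {len(holdings)}, mean = {mean_holdings}")
```

Because every value enters the sum, a single very large holding would pull the result upward, which is the sensitivity to extreme values noted earlier.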


Arithmetic Mean
Grouped Data

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
Illustration 4.2
The calculation is illustrated with the data relating to equity holdings of the group of 20 billionaires given in Table 3.1.

Class Interval   Frequency    Mid Value of Class Interval   f_i x_i
(1)              (f_i) (2)    (x_i) (3)                     Col. (4) = Col. (2) x Col. (3)
2000 – 3000      2            2500                          5000
3000 – 4000      5            3500                          17500
4000 – 5000      6            4500                          27000
5000 – 6000      4            5500                          22000
6000 – 7000      3            6500                          19500
Sum              Σ f_i = 20                                 Σ f_i x_i = 91000
Substituting the values of Σ f_i and Σ f_i x_i in the formula:

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{91000}{20} = 4550$$
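The grouped-data calculation can be sketched in Python as follows. The class intervals and frequencies are those of Illustration 4.2; the function name `grouped_mean` and the list-of-tuples layout are assumptions made for this example only.

```python
# Class intervals and frequencies from Illustration 4.2 (Rs. in millions).
classes = [
    ((2000, 3000), 2),
    ((3000, 4000), 5),
    ((4000, 5000), 6),
    ((5000, 6000), 4),
    ((6000, 7000), 3),
]


def grouped_mean(class_freqs):
    """Approximate mean: sum(f_i * x_i) / sum(f_i), with x_i the class mid value."""
    total_f = 0
    total_fx = 0.0
    for (lower, upper), f in class_freqs:
        mid_value = (lower + upper) / 2
        total_f += f
        total_fx += f * mid_value
    if total_f == 0:
        raise ValueError("total frequency is zero")
    return total_fx / total_f


mean_grouped = grouped_mean(classes)
print(mean_grouped)
```

The grouped result (4550) differs slightly from the raw-data mean (4565.4) because every observation in a class is represented by the class mid value.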
Statistics for Business and Economics (13e)

Weighted Mean
• In some instances the mean is computed by giving each observation a weight
that reflects its relative importance.
• The choice of weights depends on the application.
• The weights might be the number of credit hours earned for each grade, as in
GPA.
• In other weighted mean computations, quantities such as pounds, dollars, or
volume are frequently used.

© 2017 Cengage Learning. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or
otherwise on a password-protected website or school-approved learning management system for classroom use.

Weighted Mean

$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$

where: x_i = value of observation i
       w_i = weight for observation i

Numerator: sum of the weighted data values
Denominator: sum of the weights

If the data are from a population, μ replaces x̄.


Weighted Mean
• Example: Construction Wages
Ron Butler, a home builder, is looking over the expenses he incurred for a
house he just built. For the purpose of pricing future projects, he would like to
know the average wage ($/hour) he paid the workers he employed. Listed
below are the categories of worker he employed, along with their respective
wage and total hours worked.


Weighted Mean
• Example: Construction Wages

$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i} = 20.0464$$

i.e. approximately $20.05 per hour.

FYI, equally-weighted (simple) mean = $21.21

Illustration 4.3

Item    Monthly Consumption   Weight (w_i)   Rise in Price, Percentage (p_i)   w_i p_i
Sugar   5                     5              20                                100
Rice    20                    20             10                                200
Therefore, the average price rise can be evaluated as

$$\bar{p} = \frac{\sum w_i p_i}{\sum w_i} = \frac{100 + 200}{5 + 20} = \frac{300}{25} = 12$$

Thus, the average price rise is 12%.
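A short Python sketch of the weighted mean, using the sugar and rice figures from Illustration 4.3. The helper name `weighted_mean` and the dictionary layout are illustrative choices, not something taken from the slides.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights is zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Illustration 4.3: percentage price rise (p_i) and consumption weight (w_i).
price_rise = {"Sugar": 20, "Rice": 10}
weight = {"Sugar": 5, "Rice": 20}

items = list(price_rise)
average_rise = weighted_mean(
    [price_rise[item] for item in items],
    [weight[item] for item in items],
)
simple_rise = sum(price_rise.values()) / len(price_rise)

print(f"Weighted average price rise: {average_rise}%")
print(f"Unweighted average price rise: {simple_rise}%")
```

The unweighted average of the two price rises is 15%, but rice carries four times the weight of sugar, so the weighted figure (12%) sits closer to the rise in the price of rice.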



Trimmed Mean
• Another measure, sometimes used when extreme values are present, is the
trimmed mean.
• It is obtained by deleting a percentage of the smallest and largest values from a
data set and then computing the mean of the remaining values.
• For example, the 5% trimmed mean is obtained by removing the smallest 5%
and the largest 5% of the data values and then computing the mean of the
remaining values.
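As a rough illustration, a 5% trimmed mean could be computed as below. Applying it to the Table 4.1 equity holdings is a choice made for this sketch only; the function name `trimmed_mean` and the rule of rounding the trimmed count down to a whole number of observations are assumptions, and other conventions for fractional counts exist.

```python
import math


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    # Assumption: trim a whole number of observations, rounding down.
    # The small epsilon guards against floating-point error in n * proportion.
    k = math.floor(len(ordered) * proportion + 1e-9)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Equity holdings from Table 4.1 (Rs. in millions).
holdings = [
    2717, 2796, 3098, 3144, 3527,
    3534, 3862, 4186, 4310, 4506,
    4745, 4784, 4923, 5034, 5071,
    5424, 5561, 6505, 6707, 6874,
]

trimmed = trimmed_mean(holdings, 0.05)
print(round(trimmed, 1))
```

With 20 observations, 5% corresponds to one value at each end, so the smallest (2717) and largest (6874) holdings are dropped and the mean of the remaining 18 values is reported.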

Median
• Whenever there are extreme values in the data, calculating the A.M. is not desirable.
• Further, whenever exact values of some observations are not available, the A.M. cannot be calculated.
• In both situations, another measure of location, called the median, is used.

Median
• The median of a data set is the value in the middle when the data items are
arranged in ascending order.
• Whenever a data set has extreme values, the median is the preferred measure
of central location.
• The median is the measure of location most often reported for annual income
and property value data.
• A few extremely large incomes or property values can inflate the mean.


Median
• For an odd number of observations:

26 18 27 12 14 27 19 (7 observations)

12 14 18 19 26 27 27 (in ascending order)

The median is the middle value. Median = 19


Median
• For an even number of observations:

26 18 27 12 14 27 30 19 (8 observations)

12 14 18 19 26 27 27 30 (in ascending order)

The median is the average of the middle two values.


Median = (19 + 26)/2 = 22.5


Median
• Example: Apartment Rents
Averaging the 35th and 36th data values:
Median = (575 + 575)/2 = 575

525 530 530 535 535 535 535 535 540 540
540 540 540 545 545 545 545 545 550 550
550 550 550 550 550 560 560 560 565 565
565 570 570 572 575 575 575 580 580 580
580 585 590 590 590 600 600 600 600 610
610 615 625 625 625 635 649 650 670 670
675 675 680 690 700 700 700 700 715 715

Note: Data is in ascending order.

Median

• Median: the middle value in an ordered array of numbers
• Half the data are above it, half the data are below it
• Mathematically, it is the ((n+1)/2)th ordered observation
• For an array with an odd number of terms, the median is the middle number
  -- n = 11 => ((n+1)/2)th = (12/2)th = 6th ordered observation
• For an array with an even number of terms, the median is the average of the middle two numbers
  -- n = 10 => ((n+1)/2)th = (11/2)th = 5.5th = average of the 5th and 6th ordered observations
Median - Ungrouped Data

First, the data are arranged in ascending or descending order.

• In the earlier example relating to the equity holdings of 20 billionaires given in Table 4.1, the data arranged in ascending order are as follows:

2717 2796 3098 3144 3527 3534 3862 4186 4310 4506
4745 4784 4923 5034 5071 5424 5561 6505 6707 6874

Here, the number of observations is 20 (an even number), so there is no single middle observation. The two middle observations are the 10th and 11th, with values 4506 and 4745. Therefore, the median is their average.

$$\text{Median} = \frac{4506 + 4745}{2} = \frac{9251}{2} = 4625.5$$

Thus, the median equity holding of the 20 billionaires is Rs. 4625.5 Millions.
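The odd and even cases can be sketched in one small Python function. The three data sets are the ones used in the slides (the 7-observation and 8-observation examples and Table 4.1); the function name `median` is simply an illustrative choice.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th ordered observation.
        return ordered[middle]
    # Even n: average of the (n / 2)th and (n / 2 + 1)th ordered observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_sample = [26, 18, 27, 12, 14, 27, 19]
even_sample = [26, 18, 27, 12, 14, 27, 30, 19]
holdings = [
    2717, 2796, 3098, 3144, 3527, 3534, 3862, 4186, 4310, 4506,
    4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874,
]

print(median(odd_sample))   # 19
print(median(even_sample))  # 22.5
print(median(holdings))     # 4625.5
```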
Median - Grouped Data

The median for grouped data is defined as the value corresponding to the (n/2)th observation, and is calculated from the following formula:

$$\text{Median} = L_m + \frac{(n/2) - f_c}{f_m} \times w_m$$

where,
• L_m is the lower limit of the median class interval, i.e. the interval which contains the (n/2)th observation
• f_m is the frequency of the median class interval
• f_c is the cumulative frequency of the classes before the median class interval
• w_m is the width of the median class interval
• n is the total number of observations.
Illustration 4.2

Class Interval   Frequency   Cumulative Frequency
2000-3000        2           2
3000-4000        5           7
4000-5000        6           13
5000-6000        4           17
6000-7000        3           20
Here, n = 20, and the median class interval is 4000 to 5000, as the (n/2)th = 10th observation lies in this interval.
Further,
L_m = 4000
f_m = 6
f_c = 7
w_m = 1000
Therefore,

$$\text{Median} = 4000 + \frac{(20/2) - 7}{6} \times 1000 = 4000 + \frac{3}{6} \times 1000 = 4000 + 500 = 4500$$
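The same steps (locate the median class, then interpolate within it) can be sketched in Python. The frequencies are those of the illustration; `grouped_median` and the data layout are hypothetical names chosen for this example.

```python
# Class intervals and frequencies of the equity holding data (Rs. in millions).
classes = [
    ((2000, 3000), 2),
    ((3000, 4000), 5),
    ((4000, 5000), 6),
    ((5000, 6000), 4),
    ((6000, 7000), 3),
]


def grouped_median(class_freqs):
    """Median = L_m + ((n/2 - f_c) / f_m) * w_m for grouped data."""
    n = sum(f for _, f in class_freqs)
    if n == 0:
        raise ValueError("total frequency is zero")
    target = n / 2
    f_c = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f_m in class_freqs:
        if f_m > 0 and f_c + f_m >= target:
            width = upper - lower
            return lower + (target - f_c) * width / f_m
        f_c += f_m
    raise ValueError("median class not found")


median_grouped = grouped_median(classes)
print(median_grouped)
```

The loop stops at the first class whose cumulative frequency reaches n/2 = 10, which is 4000 to 5000 with f_c = 7, as in the hand calculation.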

Mode
• The mode of a data set is the value that occurs with greatest frequency.
• The greatest frequency can occur at two or more different values.
• If the data have exactly two modes, the data are bimodal.
• If the data have more than two modes, the data are multimodal.


Mode
• Example: Apartment Rents
550 occurred most frequently (7 times)
Mode = 550
525 530 530 535 535 535 535 535 540 540
540 540 540 545 545 545 545 545 550 550
550 550 550 550 550 560 560 560 565 565
565 570 570 572 575 575 575 580 580 580
580 585 590 590 590 600 600 600 600 610
610 615 625 625 625 635 649 650 670 670
675 675 680 690 700 700 700 700 715 715

Note: Data is in ascending order.
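The count behind this result can be reproduced with a frequency tally. The sketch below uses the 70 apartment rents listed above and Python's standard `collections.Counter`; the function name `modes` is illustrative, and it returns every value tied for the greatest frequency so that bimodal or multimodal data would also be reported.

```python
from collections import Counter

# The 70 apartment rents from the example, in ascending order.
rents = [
    525, 530, 530, 535, 535, 535, 535, 535, 540, 540,
    540, 540, 540, 545, 545, 545, 545, 545, 550, 550,
    550, 550, 550, 550, 550, 560, 560, 560, 565, 565,
    565, 570, 570, 572, 575, 575, 575, 580, 580, 580,
    580, 585, 590, 590, 590, 600, 600, 600, 600, 610,
    610, 615, 625, 625, 625, 635, 649, 650, 670, 670,
    675, 675, 680, 690, 700, 700, 700, 700, 715, 715,
]


def modes(values):
    """Return (sorted list of the most frequent values, their frequency)."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest), highest


mode_values, frequency = modes(rents)
print(mode_values, frequency)  # [550] 7
```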

Mode (Grouped Data)

$$\text{Mode} = L_m + \frac{f_m - f_0}{2 f_m - f_0 - f_2} \times w_m$$

where,
• L_m is the lower limit of the modal class interval
• f_m is the frequency of the modal class interval
• f_0 is the frequency of the interval just before the modal interval
• f_2 is the frequency of the interval just after the modal interval
• w_m is the width of the modal class interval
Equity Holding Data

The modal interval, i.e. the class interval with the maximum frequency (6), is 4000 to 5000. Further,
L_m = 4000
w_m = 1000
f_m = 6
f_0 = 5
f_2 = 4

Therefore,

$$\text{Mode} = 4000 + \frac{6 - 5}{2 \times 6 - 5 - 4} \times 1000 = 4000 + \frac{1000}{3} = 4000 + 333.3 = 4333.3$$

Thus, the modal equity holding of the billionaires is Rs. 4333.3 Millions.
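A minimal Python sketch of the grouped-mode formula, applied to the equity holding classes. The name `grouped_mode` is hypothetical, and two behaviours are assumptions of this sketch rather than anything stated in the slides: a missing neighbouring class is treated as having frequency 0, and when several classes tie for the maximum frequency the first one is used.

```python
# Class intervals and frequencies of the equity holding data (Rs. in millions).
classes = [
    ((2000, 3000), 2),
    ((3000, 4000), 5),
    ((4000, 5000), 6),
    ((5000, 6000), 4),
    ((6000, 7000), 3),
]


def grouped_mode(class_freqs):
    """Mode = L_m + ((f_m - f_0) / (2 f_m - f_0 - f_2)) * w_m."""
    freqs = [f for _, f in class_freqs]
    m = freqs.index(max(freqs))  # assumption: first class with the maximum frequency
    (lower, upper), f_m = class_freqs[m]
    f_0 = freqs[m - 1] if m > 0 else 0               # assumption: 0 if no class before
    f_2 = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumption: 0 if no class after
    denominator = 2 * f_m - f_0 - f_2
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring classes match the modal frequency")
    return lower + (f_m - f_0) * (upper - lower) / denominator


mode_grouped = grouped_mode(classes)
print(round(mode_grouped, 1))
```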
Empirical Relationship among Mean, Median and Mode
In moderately skewed distributions, the following relationship generally holds:

Mean – Mode = 3 (Mean – Median)

From this relationship, if the values of any two of the three measures are given, the value of the third can be found.
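Rearranging the relationship gives Mode = 3 × Median – 2 × Mean, and similarly for the other two measures. The sketch below applies this to the grouped-data results obtained earlier (mean 4550, median 4500); the function names are illustrative only.

```python
def mode_from(mean, median):
    """Mode = 3 * Median - 2 * Mean (rearranged empirical relationship)."""
    return 3 * median - 2 * mean


def median_from(mean, mode):
    """Median = (2 * Mean + Mode) / 3."""
    return (2 * mean + mode) / 3


def mean_from(median, mode):
    """Mean = (3 * Median - Mode) / 2."""
    return (3 * median - mode) / 2


estimated_mode = mode_from(mean=4550, median=4500)
print(estimated_mode)  # 4400
```

The estimate of 4400 is close to, but not the same as, the value of 4333.3 obtained from the grouped-mode formula, which is expected since the relationship is empirical and only approximate.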
Quartiles
• The median divides the data into two parts such that 50% of the observations are less than it and 50% are more than it.
• Similarly, there are three quartiles, viz. Q1, Q2 and Q3. These are referred to as the first, second and third quartiles.
• The first quartile, Q1, divides the data into two parts such that 25% (a quarter) of the observations are less than it and 75% are more than it.
• The second quartile, Q2, is the same as the median.
• The third quartile, Q3, divides the data into two parts such that 75% of the observations are less than it and 25% are more than it.
Quartiles

• Quartiles: measures of central tendency that divide a group of data into four subgroups
  -- Q1: 25% of the data set is below the first quartile
  -- Q2: 50% of the data set is below the second quartile
  -- Q3: 75% of the data set is below the third quartile

25% | Q1 | 25% | Q2 | 25% | Q3 | 25%


Quartiles

For ungrouped and grouped data, Q1, Q2 and Q3 are defined as the values corresponding to the observations given below:

                    Ungrouped Data                    Grouped Data
                    (arranged in ascending order)
Lower Quartile Q1   {(n + 1) / 4}th                   (n / 4)th
Median Q2           {(n + 1) / 2}th                   (n / 2)th
Upper Quartile Q3   {3 (n + 1) / 4}th                 (3n / 4)th
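Quartiles

$$Q_1 = L_{Q_1} + \frac{(n/4) - f_c}{f_{Q_1}} \times w_{Q_1}$$

$$Q_3 = L_{Q_3} + \frac{(3n/4) - f_c}{f_{Q_3}} \times w_{Q_3}$$

Here, as for the median, L is the lower limit, f the frequency and w the width of the class interval containing the quartile, and f_c is the cumulative frequency of the classes before that interval.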
Equity Holding Data

Class Interval   Frequency   Cumulative Frequency
2000-3000        2           2
3000-4000        5           7
4000-5000        6           13
5000-6000        4           17
6000-7000        3           20
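For Q1, n/4 = 20/4 = 5, so the 5th observation lies in the class interval 3000 to 4000, and the cumulative frequency before this class is 2.

$$Q_1 = 3000 + \frac{(20/4) - 2}{5} \times 1000 = 3000 + \frac{5 - 2}{5} \times 1000 = 3000 + \frac{3000}{5} = 3000 + 600 = 3600$$

The interpretation of this value of Q1 is that 25% of the billionaires have equity holdings of less than Rs. 3600 Millions.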

 
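For Q3, 3n/4 = 15, so the 15th observation lies in the class interval 5000 to 6000, and the cumulative frequency before this class is 13.

$$Q_3 = 5000 + \frac{15 - 13}{4} \times 1000 = 5000 + \frac{2}{4} \times 1000 = 5500$$

The interpretation of this value of Q3 is that 75% of the billionaires have equity holdings of less than Rs. 5500 Millions.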
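Both quartile calculations follow the same pattern as the grouped median, differing only in the target position (n/4 or 3n/4). A minimal sketch, assuming a hypothetical helper `grouped_quantile` that takes the fraction of observations lying below the required value:

```python
# Class intervals and frequencies of the equity holding data (Rs. in millions).
classes = [
    ((2000, 3000), 2),
    ((3000, 4000), 5),
    ((4000, 5000), 6),
    ((5000, 6000), 4),
    ((6000, 7000), 3),
]


def grouped_quantile(class_freqs, fraction):
    """Value below which `fraction` of the observations lie (grouped data).

    Uses L + ((fraction * n - f_c) / f) * w for the class containing
    the (fraction * n)th observation.
    """
    n = sum(f for _, f in class_freqs)
    if n == 0:
        raise ValueError("total frequency is zero")
    target = fraction * n
    f_c = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in class_freqs:
        if f > 0 and f_c + f >= target:
            return lower + (target - f_c) * (upper - lower) / f
        f_c += f
    raise ValueError("quantile class not found")


q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
print(q1, q3, q3 - q1)
```

The difference Q3 – Q1 printed at the end is the interquartile range listed among the measures of variation.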
 Find the median, lower quartile and upper
quartile of the following numbers.

12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25
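One way to work this exercise is to apply the ungrouped-data positions from the quartile table, {(n + 1)/4}th, {(n + 1)/2}th and {3(n + 1)/4}th, to the sorted values. The sketch below does so; the function names are illustrative, and the linear interpolation used when a position is fractional is an assumption (it is not needed for these 11 values, whose positions are whole numbers).

```python
import math


def value_at(ordered, position):
    """Value at a 1-based position in sorted data; interpolates if fractional."""
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    # Assumption: linear interpolation between the two neighbouring observations.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


def quartiles(values):
    """Q1, Q2, Q3 at positions (n + 1)/4, (n + 1)/2 and 3(n + 1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 observations for these positions")
    return tuple(value_at(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


numbers = [12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25]
q1, q2, q3 = quartiles(numbers)
print(sorted(numbers))
print(q1, q2, q3)  # 12 22 36
```

With n = 11 the positions are the 3rd, 6th and 9th ordered observations, giving a lower quartile of 12, a median of 22 and an upper quartile of 36.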
