02 QuantTech Mean Variance
Varsha Varde
Course Coverage
Quantitative Methods
Preliminary Analysis of Data
This Group
This Group of Participants:
Mode of age is
years
Median is
years,
Arithmetic Mean is
years
RoI (%)    Amount    Weight    RoI x Weight
10         400       0.20      2.00
30         200       0.10      3.00
5          900       0.45      2.25
20         500       0.25      5.00
Total 65   2000      1.00      12.25
Wt. Av. = 12.25
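The weighted average in the RoI table can be checked with a short sketch. It assumes the four RoI figures are 10, 30, 5 and 20 (the 5 is inferred from the product 2.25 = 5 x 0.45 and from the column total of 65) and that the amounts 400, 200, 900 and 500 act as the weights. The function name is illustrative only.

```python
# RoI figures (%) and the amounts used as weights, as read from the table.
# The RoI of 5 in the third row is an inference from 2.25 / 0.45 = 5.
roi = [10, 30, 5, 20]
amount = [400, 200, 900, 500]


def weighted_average(values, weights):
    """Sum of value x weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


simple_av = sum(roi) / len(roi)            # ignores how much is invested
weighted_av = weighted_average(roi, amount)

print("Simple average  :", simple_av)
print("Weighted average:", weighted_av)
```

The simple average treats every RoI figure alike, while the weighted average lets the larger amounts count for more, which is why it is the appropriate way to average ratios.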
A Comparison
Mode: Easiest, At A Glance, Crude
Median: Disregards Magnitude of Observations,
Only Counts Number of Observations
Arithmetic Mean: Outliers Vitiate It.
Weighted Av. Useful for Averaging Ratios
Symmetrical Distribution: Mode = Median = Mean
Positively Skewed Distribution: Mode < Mean
Negatively Skewed Distribution: Mode > Mean
Orders   SEs   |Orders-Mean|   |Orders-Mean| x SEs
00       04    9.82            9.82 x 4 = 39.28
01       01    8.82            8.82 x 1 = 8.82
02       03    7.82            7.82 x 3 = 23.46
03       03    6.82            6.82 x 3 = 20.46
04       03    5.82            5.82 x 3 = 17.46
05       03    4.82            4.82 x 3 = 14.46
06       04    3.82            3.82 x 4 = 15.28
07       03    2.82            2.82 x 3 = 8.46
08       04    1.82            1.82 x 4 = 7.28
09       05    0.82            0.82 x 5 = 4.10
10       02    0.18            0.18 x 2 = 0.36
11       03    1.18            1.18 x 3 = 3.54
12       01    2.18            2.18 x 1 = 2.18
13       00    3.18            3.18 x 0 = 0
14       01    4.18            4.18 x 1 = 4.18
15       01    5.18            5.18 x 1 = 5.18
16       01    6.18            6.18 x 1 = 6.18
17       01    7.18            7.18 x 1 = 7.18
18       00    8.18            8.18 x 0 = 0
19       01    9.18            9.18 x 1 = 9.18
20       00    10.18           10.18 x 0 = 0
21       01    11.18           11.18 x 1 = 11.18
22       00    12.18           12.18 x 0 = 0
23       00    13.18           13.18 x 0 = 0
24       01    14.18           14.18 x 1 = 14.18
25       00    15.18           15.18 x 0 = 0
26       00    16.18           16.18 x 0 = 0
27       00    17.18           17.18 x 0 = 0
28       01    18.18           18.18 x 1 = 18.18
29       00    19.18           19.18 x 0 = 0
30       01    20.18           20.18 x 1 = 20.18
31       00    21.18           21.18 x 0 = 0
32       00    22.18           22.18 x 0 = 0
33       00    23.18           23.18 x 0 = 0
34       01    24.18           24.18 x 1 = 24.18
35       00    25.18           25.18 x 0 = 0
36       00    26.18           26.18 x 0 = 0
37       00    27.18           27.18 x 0 = 0
38       00    28.18           28.18 x 0 = 0
39       00    29.18           29.18 x 0 = 0
40       00    30.18           30.18 x 0 = 0
41       00    31.18           31.18 x 0 = 0
42       00    32.18           32.18 x 0 = 0
43       01    33.18           33.18 x 1 = 33.18
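The Orders/SEs table can be totalled with a few lines of Python. This is only a sketch: the dictionary holds the non-zero frequencies read from the table (orders booked mapped to number of SEs), and the variable names are illustrative.

```python
# Orders booked -> number of SEs, taken from the table (zero-frequency rows omitted).
freq = {0: 4, 1: 1, 2: 3, 3: 3, 4: 3, 5: 3, 6: 4, 7: 3, 8: 4, 9: 5, 10: 2,
        11: 3, 12: 1, 14: 1, 15: 1, 16: 1, 17: 1, 19: 1, 21: 1,
        24: 1, 28: 1, 30: 1, 34: 1, 43: 1}

n = sum(freq.values())                                   # number of SEs
mean = sum(x * f for x, f in freq.items()) / n           # arithmetic mean of orders
# Mean deviation: frequency-weighted average of |orders - mean|.
mean_dev = sum(abs(x - mean) * f for x, f in freq.items()) / n

print(n, round(mean, 2), round(mean_dev, 2))
```

The mean reproduces the 9.82 used in the deviation column, and the products in the last column add up to the numerator of the mean deviation.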
Mean Deviation
Downtime (Minutes)   No. of Days
00 - 10              20
10 - 20              40
20 - 30              20
30 - 40              10
40 - 50              10
Total                100
Downtime Midpoints   No. of Days
05                   20
15                   40
25                   20
35                   10
45                   10
Total                100
Arithmetic Mean
Downtime Midpoints   No. of Days   Product
05                   20            05 x 20 = 100
15                   40            15 x 40 = 600
25                   20            25 x 20 = 500
35                   10            35 x 10 = 350
45                   10            45 x 10 = 450
Total                100           2000
Arithmetic Mean
Arithmetic Mean is the Average of the
Observed Downtimes.
Arithmetic Mean= Total Observed
Downtime/ total number of days
Arithmetic Mean= 2000 / 100 = 20 Minutes
Average Machine Downtime is 20 Minutes.
Mean Deviation
Downtime Midpoints   No. of Days   Deviation from Mean
05                   20            |05 - 20| = 15
15                   40            |15 - 20| = 05
25                   20            |25 - 20| = 05
35                   10            |35 - 20| = 15
45                   10            |45 - 20| = 25
Total                100
Mean Deviation
Downtime Midpoints   No. of Days   Deviation from Mean   Products
05                   20            |05 - 20| = 15        15 x 20 = 300
15                   40            |15 - 20| = 05        05 x 40 = 200
25                   20            |25 - 20| = 05        05 x 20 = 100
35                   10            |35 - 20| = 15        15 x 10 = 150
45                   10            |45 - 20| = 25        25 x 10 = 250
Total                100                                 1000
Mean Deviation
Definition: Mean Deviation is the Mean of the
Absolute Deviations (Disregarding the Negative Sign) of
the Observed Values from the Average.
In this Example, Mean Deviation is the
Weighted Average(weights as
frequencies) of the Deviations of the
Observed Downtimes from the Average
Downtime.
Mean Deviation = 1000 / 100 = 10 Minutes
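The downtime calculation can be reproduced in a few lines. This is a minimal sketch using the class midpoints and day counts from the example; the variable names are illustrative.

```python
# Class midpoints (minutes of downtime) and number of days in each class.
midpoints = [5, 15, 25, 35, 45]
days = [20, 40, 20, 10, 10]

total_days = sum(days)
# Arithmetic mean: frequency-weighted average of the midpoints.
mean = sum(x * f for x, f in zip(midpoints, days)) / total_days
# Mean deviation: frequency-weighted average of the absolute deviations.
mean_dev = sum(abs(x - mean) * f for x, f in zip(midpoints, days)) / total_days

print(mean, mean_dev)
```

Because grouped data are represented by class midpoints, both figures are approximations to the values that the raw daily downtimes would give.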
Variance
Definition: Variance is the average of the
Squares of the Deviations of the Observed
Values from the mean.
Standard Deviation
Definition: Standard Deviation Measures the
Typical Amount by which the Values
Differ from the Mean. Squaring the Deviations
Removes the Sign of the Difference.
Formula: Positive Square Root of the
Variance.
Variance
Downtime Midpoints   No. of Days   Difference from Mean   Square   Products
05                   20            05 - 20 = -15          225      225 x 20 = 4500
15                   40            15 - 20 = -05          25       25 x 40 = 1000
25                   20            25 - 20 = 05           25       25 x 20 = 500
35                   10            35 - 20 = 15           225      225 x 10 = 2250
45                   10            45 - 20 = 25           625      625 x 10 = 6250
Total                100                                           14500
Variance = 14500 / 100 = 145
Standard Deviation = Square Root of 145 = 12.04 Minutes
Downtime Midpoints   No. of Days   Squares   Products
05                   20            25        25 x 20 = 500
15                   40            225       225 x 40 = 9000
25                   20            625       625 x 20 = 12500
35                   10            1225      1225 x 10 = 12250
45                   10            2025      2025 x 10 = 20250
Total                100                     54500
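The two downtime tables (squared deviations totalling 14500, squared midpoints totalling 54500) lead to the same variance. The sketch below checks this, assuming the divisor is the total number of days (100), as in the mean deviation example. Variable names are illustrative.

```python
import math

midpoints = [5, 15, 25, 35, 45]      # downtime class midpoints (minutes)
days = [20, 40, 20, 10, 10]          # number of days in each class

n = sum(days)
mean = sum(x * f for x, f in zip(midpoints, days)) / n

# Definition: average of the squared deviations from the mean.
variance_def = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, days)) / n

# Shortcut: average of the squares minus the square of the average.
variance_short = sum(f * x ** 2 for x, f in zip(midpoints, days)) / n - mean ** 2

std_dev = math.sqrt(variance_def)
print(variance_def, variance_short, round(std_dev, 2))
```

The shortcut form avoids computing each deviation, which is convenient for hand calculation when the mean is not a round number.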
Downtime (Minutes)   No. of Days
00 - 10              20
10 - 20              40
20 - 30              20
30 - 40              10
40 - 50              10
Total                100
Earlier Example
Orders: SEs   Orders: SEs   Orders: SEs   Orders: SEs
00: 04        11: 03        22: 00        33: 00
01: 01        12: 01        23: 00        34: 01
02: 03        13: 00        24: 01        35: 00
03: 03        14: 01        25: 00        36: 00
04: 03        15: 01        26: 00        37: 00
05: 03        16: 01        27: 00        38: 00
06: 04        17: 01        28: 01        39: 00
07: 03        18: 00        29: 00        40: 00
08: 04        19: 01        30: 01        41: 00
09: 05        20: 00        31: 00        42: 00
10: 02        21: 01        32: 00        43: 01
Bienaymé-Chebyshev Rule
For any distribution, the percentage of
observations lying within +/- k standard
deviations of the mean is at least
(1 - 1/k^2) x 100, for k > 1
For k = 2, at least (1 - 1/4) x 100 = 75% of
observations are contained within 2
standard deviations of the mean
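The rule can be illustrated on the machine downtime data. The sketch below computes the guaranteed lower bound for a given k and compares it with the share of days whose class midpoint lies within k standard deviations of the mean. Using midpoints in place of raw observations is a simplifying assumption, and the function name is illustrative.

```python
import math

midpoints = [5, 15, 25, 35, 45]      # downtime class midpoints (minutes)
days = [20, 40, 20, 10, 10]          # number of days in each class


def chebyshev_bound(k):
    """Minimum percentage of observations within k standard deviations (k > 1)."""
    return (1 - 1 / k ** 2) * 100


n = sum(days)
mean = sum(x * f for x, f in zip(midpoints, days)) / n
sd = math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(midpoints, days)) / n)

k = 2
within = sum(f for x, f in zip(midpoints, days) if abs(x - mean) <= k * sd)
observed = within / n * 100

print(chebyshev_bound(k), observed)
```

The observed share is higher than the bound, as expected: the rule gives a guaranteed minimum for any distribution, not an estimate.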
Coefficient of Variation
Std. Deviation and Dispersion have Units
of Measurement.
To Compare Dispersion in Many Sets of
Data (Absenteeism, Production, Profit),
We Must Eliminate Unit of Measurement.
Otherwise it's Apple vs. Orange vs. Mango.
Coefficient of Variation is the Ratio of
Standard Deviation to Arithmetic Mean.
CoV is Free of Unit of Measurement.
Coefficient of Variation
In Our Machine Downtime Example,
Coefficient of Variation is 12.04 / 20 = 0.6
or 60%
In Our Sales Orders Example, Coefficient
of Variation is 6.36 / 9.82 = 0.65 or 65%
The series for which CV is greater is said
to be more variable or less consistent ,
less uniform, less stable or less
homogeneous.
Example
The mean and SD of dividends on equity stocks of
TOMCO and Tinplate for the past six years are as
follows:
Tomco: Mean = 15.42%, SD = 4.01%
Tinplate: Mean = 13.83%, SD = 3.19%
CV: Tomco = 26.01%, Tinplate = 23.07%
Since the CV of Tinplate's dividend is lower, the
return on Tinplate stock is more
stable.
For an investor seeking stable returns, it is better to
invest in scrips of Tinplate.
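The comparison can be recomputed directly from the means and standard deviations quoted in the example. A minimal sketch, with an illustrative function name:

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the arithmetic mean."""
    return sd / mean * 100


cv_tomco = coefficient_of_variation(15.42, 4.01)
cv_tinplate = coefficient_of_variation(13.83, 3.19)

print(round(cv_tomco, 2), round(cv_tinplate, 2))
more_stable = "Tinplate" if cv_tinplate < cv_tomco else "Tomco"
print("More stable dividend:", more_stable)
```

Evaluated this way, 3.19 / 13.83 comes to about 23.07%, and the conclusion that Tinplate is the steadier of the two holds.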
Exercise
List Ratios Commonly used in Cricket.
Study Individual Scores of Indian Batsmen
at the Last One Day Cricket Match.
Are they Nominal, Ordinal or Cardinal
Numbers? Discrete or Continuous?
Find Median & Arithmetic Mean.
Compute Range, Mean Deviation,
Variance, Standard Deviation & CoV.
Steps in Constructing a
Frequency Distribution
(Histogram)
1. Determine the number of classes
2. Determine the class width
3. Locate class boundaries
4. Use Tally Marks for Obtaining
Frequencies for each class
Rule of thumb
Not too few to lose information content
and not too many to lose pattern
The number of classes chosen is usually
between 6 and 15.
Subject to above the number of classes
may be equal to the square root of the
number of data points.
The more data one has the larger is the
number of classes.
Rule of thumb
Every item of data should be included in
one and only one class
Adjacent classes should not have interval
in between
Classes should not overlap
Class intervals should be of the same
width to the extent possible
Illustration
Frequency and relative frequency distributions
(Histograms):
Example
Weight Loss Data
20.5 19.5 15.6 24.1 9.9
15.4 12.7 5.4 17.0 28.6
16.9 7.8 23.3 11.8 18.4
13.4 14.3 19.2 9.2 16.8
8.8 22.1 20.8 12.6 15.9
Objective: Provide a useful summary of the available
information
Illustration
Class   Interval      Frequency   Relative Frequency
1       5.0-9.0       3           3/25 (.12)
2       9.0-13.0      5           5/25 (.20)
3       13.0-17.0     7           7/25 (.28)
4       17.0-21.0     6           6/25 (.24)
5       21.0-25.0     3           3/25 (.12)
6       25.0-29.0     1           1/25 (.04)
Totals                25          1.00
Let
k = # of classes
max = largest measurement
min = smallest measurement
n = sample size
w = class width
Formulas
k = Square Root of n
w = (max - min)/k
Square Root of 25 = 5. But we used k = 6
w = (28.6 - 5.4)/6 = 3.87, rounded up to
w = 4.0
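The class width and the frequency counts for the weight loss data can be reproduced as follows. The sketch assumes the first class starts at 5.0 and that each class includes its lower boundary but not its upper one (so 17.0 falls in 17.0-21.0), which matches the frequencies in the illustration. Variable names are illustrative.

```python
data = [20.5, 19.5, 15.6, 24.1, 9.9,
        15.4, 12.7, 5.4, 17.0, 28.6,
        16.9, 7.8, 23.3, 11.8, 18.4,
        13.4, 14.3, 19.2, 9.2, 16.8,
        8.8, 22.1, 20.8, 12.6, 15.9]

k = 6                                        # number of classes used in the illustration
raw_width = (max(data) - min(data)) / k      # (max - min) / k before rounding
width = 4.0                                  # rounded class width
start = 5.0                                  # assumed lower boundary of the first class

# Tally each observation into the class [start + i*width, start + (i+1)*width).
counts = [0] * k
for x in data:
    counts[int((x - start) // width)] += 1

relative = [c / len(data) for c in counts]
for i, (c, r) in enumerate(zip(counts, relative)):
    low = start + i * width
    print(f"{low:4.1f}-{low + width:4.1f}  {c:2d}  {r:.2f}")
```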
Numerical methods
Measures of Central Tendency
1. Mean (Arithmetic, Geometric, Harmonic)
2. Median
3. Mode
Measures of Dispersion (Variability)
1. Range
2. Mean Absolute Deviation (MAD)
3. Variance
4. Standard Deviation
Example: Given a sample of 5 test grades
(90, 95, 80, 60, 75)
Then n = 5; x1 = 90, x2 = 95, x3 = 80, x4 = 60, x5 = 75
AM of x = (90 + 95 + 80 + 60 + 75)/5 = 400/5 = 80
GM of x = (90 * 95 * 80 * 60 * 75)^(1/5)
= (3078000000)^(1/5) = 79 (approx.)
Weighted average; w1 = 1, w2 = 2, w3 = 2, w4 = 3, w5 = 2
WM of x = (1*90 + 2*95 + 2*80 + 3*60 + 2*75)/10
= 770/10 = 77
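The three averages of the test grades can be verified with the standard library. A minimal sketch; variable names are illustrative.

```python
import math

grades = [90, 95, 80, 60, 75]
weights = [1, 2, 2, 3, 2]

n = len(grades)
am = sum(grades) / n                                   # arithmetic mean
gm = math.prod(grades) ** (1 / n)                      # geometric mean: n-th root of the product
wm = sum(w * x for w, x in zip(weights, grades)) / sum(weights)   # weighted mean

print(am, round(gm, 2), wm)
```

The geometric mean comes out slightly below the arithmetic mean, as it always does when the values are positive and not all equal.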
Sample Median
The median of a sample (data set) is the middle number when the measurements are
arranged in ascending order.
Note:
If n is odd, the median is the middle number.
If n is even, the median is the average of the middle two numbers.
Example (n odd):
Step 1: 2, 7, 9, 11, 14
Step 2: med = 9.
Example (n even):
Step 1: 2, 6, 7, 9, 11, 14
Step 2: med = (7 + 9)/2 = 8.
The median is a measure of location (position).
3. Mode
The mode is the value of x (observation) that occurs with the greatest frequency.
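Python's statistics module follows the same odd/even rule for the median. The sketch below applies it to the two small samples above. The list used for the mode is hypothetical, made up only to show a repeated value.

```python
import statistics

odd_sample = [2, 7, 9, 11, 14]        # n = 5: the middle value is the median
even_sample = [2, 6, 7, 9, 11, 14]    # n = 6: average of the two middle values

med_odd = statistics.median(odd_sample)
med_even = statistics.median(even_sample)

# Hypothetical data for the mode: 3 occurs most often.
mode_sample = [2, 3, 3, 5, 7]
mode_value = statistics.mode(mode_sample)

print(med_odd, med_even, mode_value)
```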
Choosing Appropriate
Measure of Location
If data are symmetric, the mean, median,
and mode will be approximately the same.
If data are multimodal, report the mean,
median and/or mode for each subgroup.
If data are skewed, report the median.
The AM is the most commonly used and is
preferred unless precluding circumstances
are present
Measures of Variation
Sample range
Sample variance
Sample standard deviation
Sample interquartile range
Sample Range
R = largest obs. - smallest obs.
or, equivalently
R = xmax - xmin
Coefficient of Range
CR = (largest obs. - smallest obs.) / (largest obs. + smallest obs.)
or, equivalently
CR = (xmax - xmin) / (xmax + xmin)
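Sample Variance
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
Sample Standard Deviation
s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}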
IQR = Q3 - Q1
Quartile Deviation
Q.D = (third quartile - first quartile)/2
= (Q3 - Q1)/2
(Median - Q.D) to (Median + Q.D)
covers only approximately 50% of the observations,
because economic or business data are
seldom perfectly symmetrical
Measures of Variability
2. Mean Absolute Deviation (MAD)
x        x - AM   |x - AM|
90       10       10
85       5        5
65       -15      15
75       -5       5
70       -10      10
95       15       15
Totals
480      0        60
MAD = 60/6 = 10
Remarks:
(i) MAD is a good measure of variability
(ii) It is difficult for mathematical manipulations
Measures of Variability
3. Standard Deviation
Example: Same sample as before (AM of x = 80; n = 6)
x        x - AM   (x - AM)^2
90       10       100
85       5        25
65       -15      225
75       -5       25
70       -10      100
95       15       225
Totals
480      0        700
Therefore
Variance of x = 700 / 5 = 140
Standard Deviation of x = (140)^(1/2) = 11.83
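The six-value sample (90, 85, 65, 75, 70, 95) can be run through the standard library to check the mean absolute deviation and the sample variance. A sketch only: the MAD here divides by n = 6, and statistics.variance divides by n - 1 = 5, matching 700 / 5.

```python
import statistics

x = [90, 85, 65, 75, 70, 95]

n = len(x)
mean = sum(x) / n
# Mean absolute deviation: average of |x - mean| over all n observations.
mad = sum(abs(v - mean) for v in x) / n
# Sample variance and standard deviation (divisor n - 1).
variance = statistics.variance(x)
std_dev = statistics.stdev(x)

print(mean, mad, variance, round(std_dev, 2))
```

The deviations themselves sum to zero, which is why either absolute values or squares are needed before averaging.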
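Finite Populations
Let N = population size.
Data: {x1, x2, ..., xN}
Population mean: \mu = \frac{1}{N} \sum_{i=1}^{N} x_i
Population variance: \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
Population standard deviation: \sigma = \sqrt{\sigma^2}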
4. Percentiles
Percentiles are useful if the data are badly
skewed.
Let x1, x2, . . . , xn be a set of measurements
arranged in increasing order.
Definition. Let 0 < p < 100. The pth percentile is
a number x such that p% of all measurements
fall below it and (100 - p)% fall
above it.
Special Cases.
1. Lower Quartile (25th percentile)
Example.
(1) position = .25(n + 1) = .25(9) = 2.25
(2) Q1 = 5 + .25(8 - 5) = 5 + .75 = 5.75
2. Median (50th percentile)
Example.
(1) position = .5(n + 1) = .5(9) = 4.5
(2) median: Q2 = 10 + .5(11 - 10) = 10.5
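The position rule p(n + 1)/100 followed by linear interpolation can be sketched as below. The eight-value data set is hypothetical: the original observations are not shown here, so the list was chosen only so that its 2nd to 5th ordered values are 5, 8, 10 and 11, as the worked example implies. The function name is illustrative.

```python
def percentile(sorted_values, p):
    """p-th percentile using position = p/100 * (n + 1) with linear interpolation."""
    n = len(sorted_values)
    position = p / 100 * (n + 1)          # 1-based, possibly fractional
    lower = int(position)                 # rank of the observation just below
    fraction = position - lower
    if lower < 1:                         # position falls before the first value
        return sorted_values[0]
    if lower >= n:                        # position falls at or after the last value
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


# Hypothetical ordered sample with n = 8 (only ranks 2-5 are implied by the example).
sample = [2, 5, 8, 10, 11, 14, 17, 20]

q1 = percentile(sample, 25)
q2 = percentile(sample, 50)
print(q1, q2)
```

Other textbooks and software use slightly different position rules, so quartiles computed elsewhere may differ a little from these.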
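Class   Interval      Midpoint x   f    xf    x^2 f
1       5.0-9.0       7            3    21    147
2       9.0-13.0      11           5    55    605
3       13.0-17.0     15           7    105   1,575
4       17.0-21.0     19           6    114   2,166
5       21.0-25.0     23           3    69    1,587
6       25.0-29.0     27           1    27    729
Totals                             25   391   6,809
Let k = number of classes.
Formulas.
AM = (x1f1 + x2f2 + ... + xkfk)/(f1 + f2 + ... + fk) = 391/25 = 15.64
Variance = [6809 - 25 x (15.64)^2]/24 = (6809 - 6115.24)/24 = 693.76/24 = 28.91
SD = (28.91)^(1/2) = 5.38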
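The grouped-data totals of 391 and 6,809 can be recomputed from the class midpoints and frequencies of the weight loss distribution. The sketch below assumes midpoints 7, 11, 15, 19, 23 and 27 (inferred from xf / f) and applies the n - 1 divisor to the whole sum of squared deviations, as in the sample variance formula. Variable names are illustrative.

```python
import math

midpoints = [7, 11, 15, 19, 23, 27]   # class midpoints, inferred from xf / f
freqs = [3, 5, 7, 6, 3, 1]

n = sum(freqs)
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))

mean = sum_xf / n
# Sample variance for grouped data: (sum of x^2 f - n * mean^2) / (n - 1).
variance = (sum_x2f - n * mean ** 2) / (n - 1)
std_dev = math.sqrt(variance)

print(sum_xf, sum_x2f, round(mean, 2), round(variance, 2), round(std_dev, 2))
```

Dividing only the first term by n - 1 and then subtracting the squared mean mixes two divisors and overstates the variance, so the subtraction is done before the division here.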
Mode = Lmo + [(f - f1)/(2f - f1 - f2)] x w
Lmo = Lower limit of Modal Class
f1, f2 = Frequencies of classes preceding
and succeeding modal class
f = Frequency of modal class
w = Width of class interval
Lmo=13
f1=5
f2=6
f=7
w=4
Mode=13+{(7-5)/(14-5-6)}X4=13+8/3
=15.67
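The grouped-data mode formula translates directly into a small function. A sketch with illustrative parameter names, evaluated with the values from the example:

```python
def grouped_mode(lower_limit, f_modal, f_before, f_after, width):
    """Mode of grouped data: Lmo + (f - f1) / (2f - f1 - f2) x w."""
    return lower_limit + (f_modal - f_before) / (2 * f_modal - f_before - f_after) * width


# Modal class 13.0-17.0 with f = 7, preceded by f1 = 5 and followed by f2 = 6.
mode = grouped_mode(lower_limit=13, f_modal=7, f_before=5, f_after=6, width=4)
print(round(mode, 2))
```

The formula pulls the mode towards whichever neighbouring class has the higher frequency, which is why the result lies above the middle of the modal class here.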
Q3 = Lq + {[3(N+1)/4 - (F+1)]/fq} x w
Where, Lq = Lower limit of quartile class
N = Total frequency
F = Cumulative frequency of the classes preceding the quartile class
fq = Frequency of quartile class
w = Width of the class interval
Third quartile class is that which includes observation
no. 3(N+1)/4
Q3 = Lq + {[3(N+1)/4 - (F+1)]/fq} x w
Where, Lq = Lower limit of quartile class = 17
N = Total frequency = 25
F = Cumulative frequency of the classes preceding the quartile class = 15
fq = Frequency of quartile class = 6
w = Width of the class interval = 4
Third quartile class is that which includes observation
no. 3(N+1)/4 = 19.5
Q3 = 17 + [{(19.5 - 16)/6} x 4] = 17 + 2.33 = 19.33
Q2 = Lq + {[2(N+1)/4 - (F+1)]/fq} x w
Where, Lq = Lower limit of quartile class
N = Total frequency
F = Cumulative frequency of the classes preceding the quartile class
fq = Frequency of quartile class
w = Width of the class interval
Second quartile class is that which includes observation
no. (N+1)/2
Q2 = Lq + {[2(N+1)/4 - (F+1)]/fq} x w
Where, Lq = Lower limit of quartile class = 13
N = Total frequency = 25
F = Cumulative frequency of the classes preceding the quartile class = 8
fq = Frequency of quartile class = 7
w = Width of the class interval = 4
Second quartile class is that which includes observation
no. (N+1)/2 = 13
Q2 = 13 + [{(13 - 9)/7} x 4] = 13 + 2.29 = 15.29
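The quartile formula for grouped data can be sketched as one function that locates the quartile class and interpolates inside it. The class limits and frequencies are those of the weight loss distribution; F is taken as the cumulative frequency of the classes before the quartile class. Function and parameter names are illustrative.

```python
def grouped_quartile(lower_limits, freqs, width, q):
    """q-th quartile (q = 1, 2, 3) of grouped data: Lq + [q(N+1)/4 - (F+1)] / fq x w."""
    total = sum(freqs)
    position = q * (total + 1) / 4        # observation number that defines the quartile
    cumulative = 0                        # F: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= position:    # this class contains the required observation
            return lower + (position - (cumulative + 1)) / f * width
        cumulative += f
    raise ValueError("position lies beyond the last class")


lower_limits = [5, 9, 13, 17, 21, 25]
freqs = [3, 5, 7, 6, 3, 1]

q2 = grouped_quartile(lower_limits, freqs, width=4, q=2)
q3 = grouped_quartile(lower_limits, freqs, width=4, q=3)
print(round(q2, 2), round(q3, 2))
```

For Q2 the interpolation term is (13 - 9)/7 x 4, about 2.29, so the median falls inside its class of 13.0 to 17.0, as it must.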
Empirical mode
Where mode is ill defined its value may be
ascertained by using the following formula
Mode = 3 Median - 2 Mean