Measure of Central Tendency: Chapter Three
CHAPTER - THREE
Measure of Central Tendency
o When we want to compare groups of numbers, it is good to have a single value that is considered a good representative of each group.
o This single value is called the average of the group.
o Averages are also called measures of central tendency.
o An average that is representative is called a typical average; an average that is not representative and has only a theoretical value is called a descriptive average.
o A typical average should possess the following properties:
o It should be strictly defined.
o It should be based on all observations under investigation.
o It should be affected as little as possible by extreme observations.
o It should be capable of further algebraic treatment.
o It should be easy to calculate and simple to understand.
Objectives:
o To comprehend the data easily.
o To facilitate comparison.
o To make further statistical analysis.
Properties of Summation
Introduction to statistics
i) $\sum_{i=1}^{n} k = nk$, where $k$ is any constant.
ii) $\sum_{i=1}^{n} k x_i = k \sum_{i=1}^{n} x_i$, where $k$ is any constant.
iii) $\sum_{i=1}^{n} (a + k x_i) = na + k \sum_{i=1}^{n} x_i$, where $a$ and $k$ are any constants.
iv) $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$
For an ungrouped frequency distribution:
$$\bar{x} = \frac{1}{\sum_{i=1}^{n} f_i} \sum_{i=1}^{n} f_i x_i$$
xi fi xifi
2 2 4
3 1 3
7 3 21
8 1 8
Total 7 36
$$\bar{x} = \frac{1}{\sum_{i=1}^{n} f_i} \sum_{i=1}^{n} f_i x_i = \frac{36}{7} \approx 5.14$$
Grouped data:
$$\bar{x} = \frac{1}{\sum_{i=1}^{k} f_i} \sum_{i=1}^{k} f_i x_i$$
Where: xi is the class mark of the ith class and fi is the ith class frequency.
Example: Calculate the mean for the following age distribution.
class Frequency
6-10 35
11-15 23
16-20 15
21-25 12
26-30 9
31-35 6
o Solutions:
class fi xi xifi
6-10 35 8 280
11-15 23 13 299
16-20 15 18 270
21-25 12 23 276
26-30 9 28 252
31-35 6 33 198
total 100 1575
$$\bar{x} = \frac{1}{\sum_{i=1}^{k} f_i} \sum_{i=1}^{k} f_i x_i = \frac{1}{100}(1575) = 15.75$$
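The grouped-mean calculation above can be sketched in Python. This is a minimal illustration, not part of the original notes; the function name `grouped_mean` is an assumption, and the class limits and frequencies are the ones from the age-distribution example.

```python
# Class limits and frequencies taken from the age-distribution example.
classes = [(6, 10), (11, 15), (16, 20), (21, 25), (26, 30), (31, 35)]
freqs = [35, 23, 15, 12, 9, 6]


def grouped_mean(classes, freqs):
    """Return sum(f_i * x_i) / sum(f_i), where x_i is the class mark."""
    marks = [(low + high) / 2 for low, high in classes]  # class midpoints
    return sum(f * x for f, x in zip(freqs, marks)) / sum(freqs)


mean_age = grouped_mean(classes, freqs)
print(mean_age)  # 15.75
```

The class marks are computed from the class limits rather than typed in, so the same function works for any table of class limits and frequencies.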
Marks Frequency
40-44 7
45-49 10
50-54 22
55-59 f1
60-64 f2
65-69 6
70-74 3
$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$
Example: In a class there are 30 females and 70 males. If the females averaged 60 in an examination and the males averaged 72, find the mean of the entire class.
Solution:
Females: $\bar{x}_1 = 60$, $n_1 = 30$
Males: $\bar{x}_2 = 72$, $n_2 = 70$
$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k} = \frac{60 \times 30 + 72 \times 70}{30 + 70} = \frac{1800 + 5040}{100} = 68.4$$
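As a quick check of the combined-mean formula, here is a small Python sketch. The helper name `combined_mean` is hypothetical; the group means and sizes are those of the example.

```python
def combined_mean(means, sizes):
    """Mean of the pooled groups: sum(n_j * mean_j) / sum(n_j)."""
    return sum(n * m for m, n in zip(means, sizes)) / sum(sizes)


# Females: mean 60, n = 30; males: mean 72, n = 70.
class_mean = combined_mean(means=[60, 72], sizes=[30, 70])
print(class_mean)  # 68.4
```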
iii. If a wrong figure has been used in calculating the mean, the correct mean can be obtained without repeating the whole process using:
$$\text{Correct Mean} = \text{Wrong Mean} + \frac{\text{Correct Value} - \text{Wrong Value}}{n}$$
where $n$ is the total number of observations.
Example: The average weight of 10 students was calculated to be 65 kg. Later it was discovered that one weight had been misread as 40 kg instead of 80 kg. Calculate the correct average weight.
$$\text{Correct Mean} = \text{Wrong Mean} + \frac{\text{Correct Value} - \text{Wrong Value}}{n} = 65 + \frac{80 - 40}{10} = 69 \text{ kg}$$
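The correction can be verified in Python by comparing the shortcut formula with a recomputation from the total. This is a sketch; `corrected_mean` is a name chosen here for illustration.

```python
def corrected_mean(wrong_mean, n, wrong_value, correct_value):
    """Shortcut: wrong mean plus (correct value - wrong value) / n."""
    return wrong_mean + (correct_value - wrong_value) / n


fixed = corrected_mean(wrong_mean=65, n=10, wrong_value=40, correct_value=80)

# Cross-check: rebuild the total, swap the misread value, divide again.
total = 65 * 10 - 40 + 80
print(fixed, total / 10)  # 69.0 69.0
```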
iv) Weighted arithmetic mean
o When different importance is to be given to different data values, a weighted mean is appropriate.
o Weights are assigned to each item in proportion to its relative importance.
o Let $x_1, x_2, \dots, x_n$ be the values of the items of a series and $w_1, w_2, \dots, w_n$ their corresponding weights. The weighted mean, denoted by $\bar{x}_w$, is defined as:
$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \dots + w_n x_n}{w_1 + w_2 + \dots + w_n} = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i x_i$$
• Example: A student obtained the following percentages in an examination: English 60, biology 75, mathematics 63, physics 59, and chemistry 55. Find the student's weighted arithmetic mean if weights 1, 2, 1, 3, 3 respectively are allotted to the subjects.
• Solution:
$$\bar{x}_w = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i x_i = \frac{60 \times 1 + 75 \times 2 + 63 \times 1 + 59 \times 3 + 55 \times 3}{1 + 2 + 1 + 3 + 3} = \frac{615}{10} = 61.5$$
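The same weighted mean can be computed with a few lines of Python. This is a sketch under the data of the example; the function name `weighted_mean` and the dictionary layout are illustrative choices.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Percentages and weights from the example, in the same subject order.
scores = {"English": 60, "Biology": 75, "Mathematics": 63,
          "Physics": 59, "Chemistry": 55}
weights = [1, 2, 1, 3, 3]

result = weighted_mean(list(scores.values()), weights)
print(result)  # 61.5
```

With all weights equal, the function reduces to the ordinary arithmetic mean.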
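$$G.M = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n}$$
o For grouped data
$$G = \sqrt[N]{x_1^{f_1} \, x_2^{f_2} \, x_3^{f_3} \cdots x_n^{f_n}}, \quad \text{where } N = \sum_{i=1}^{n} f_i$$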
- If the number of observations is three or more, computing the nth root is very tedious. To simplify the computation, logarithms are used:
$$\log G.M = \frac{1}{n} \sum_{i=1}^{n} \log x_i$$
$$\text{Antilog}(\log G.M) = \text{Antilog}\left[\frac{1}{n} \sum_{i=1}^{n} \log x_i\right]$$
$$G.M = \text{Antilog}\left[\frac{1}{n} \sum_{i=1}^{n} \log x_i\right] \quad \text{for raw data}$$
$$G = \text{Antilog}\left(\frac{1}{N} \sum_{i=1}^{n} f_i \log x_i\right) \quad \text{for grouped data}$$
Note: - The geometric mean is useful and appropriate for finding averages of ratios or growth rates.
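The logarithm method can be sketched in Python as follows. The function name `geometric_mean` and the sample values are illustrative assumptions, not data from these notes; base-10 logarithms are used, so the antilog is `10 ** (...)`.

```python
import math


def geometric_mean(values, freqs=None):
    """G.M = antilog[(1/N) * sum(f_i * log10(x_i))]; every f_i is 1 for raw data."""
    if freqs is None:
        freqs = [1] * len(values)
    total_freq = sum(freqs)
    mean_log = sum(f * math.log10(x) for x, f in zip(values, freqs)) / total_freq
    return 10 ** mean_log


# Hypothetical values chosen so the answers are easy to check by hand.
gm_raw = geometric_mean([2, 8])                     # sqrt(2 * 8) = 4
gm_grouped = geometric_mean([2, 8], freqs=[3, 1])   # (2**3 * 8) ** (1/4)
print(round(gm_raw, 6), round(gm_grouped, 6))
```

All values must be positive, since the logarithm is undefined for zero or negative numbers.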
Harmonic Mean (H.M)
The harmonic mean of x1, x2, x3… xn is denoted by H.M
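o For raw data:
$$H.M = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
o For grouped data:
$$H.M = \frac{1}{\frac{1}{n} \sum_{i=1}^{k} \frac{f_i}{x_i}} = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}}, \quad n = \sum_{i=1}^{k} f_i$$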
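- If $x_1, x_2, x_3, \dots, x_n$ are the values of the items of a series and $w_1, w_2, \dots, w_n$ their corresponding weights, the weighted harmonic mean is given by:
$$H.M_w = \frac{1}{\frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} \frac{w_i}{x_i}} = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{x_i}}$$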
Note:- The Harmonic Mean is useful and appropriate in finding average speeds and average rates.
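N.B. a) For positive values, A.M ≥ G.M ≥ H.M, with equality only when all the values are equal.
b) For two positive values, $\sqrt{A.M \times H.M} = G.M$, where A.M, G.M and H.M are the usual abbreviations.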
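The two relations in the note can be checked numerically for a pair of positive values. This sketch uses Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later); the values 4 and 9 are illustrative only.

```python
import math
import statistics

values = [4, 9]  # illustrative pair of positive values

am = statistics.mean(values)            # (4 + 9) / 2
gm = statistics.geometric_mean(values)  # sqrt(4 * 9)
hm = statistics.harmonic_mean(values)   # 2 / (1/4 + 1/9)

print(am >= gm >= hm)                        # ordering of the three means
print(math.isclose(math.sqrt(am * hm), gm))  # relation b) for two values
```

Relation b) is exact for two values because A.M × H.M then equals the product of the two values.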
The Mode ($\hat{x}$)
►The mode is the value that occurs most frequently in a set of values.
►The mode may not exist, and even if it does exist, it may not be unique.
►In the case of a discrete distribution, the mode is the value having the maximum frequency.
Examples:
►Find the mode of 5, 3, 5, 8, 9.
The mode is 5. The data are unimodal.
V 2 3 4 5
f 5 8 12 1
Mode is 4.
Continuous series (class frequency distribution):
Demerits
o It is not rigidly defined.
o It is not based on all observations.
o It is not suitable for further mathematical treatment.
o It is not a stable average, i.e. it is affected by sampling fluctuations to some extent.
o Often its value is not unique, i.e. it may not be uniquely defined.
Example: X={1,1,2,2,3,4}, Mode(X)=1 and 2
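The mode examples above can be reproduced with a short Python sketch. The helper `modes` is a name chosen here; it returns every value that attains the highest frequency, so a bimodal set yields two values.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency, sorted."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([5, 3, 5, 8, 9]))     # [5]
print(modes([1, 1, 2, 2, 3, 4]))  # [1, 2]

# Frequency table from the discrete example: value -> frequency.
table = {2: 5, 3: 8, 4: 12, 5: 1}
table_mode = max(table, key=table.get)
print(table_mode)  # 4
```

Note that when every value occurs exactly once, this helper returns all of them, which corresponds to the case where no meaningful mode exists.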
The Median ($\tilde{x}$)
o In a distribution, the median is the value of the variable that divides it into two equal parts.
o One part comprises all the values greater than the median and the other all the values less than the median.
o The median can be defined as the middle value of a set of data values when they are arranged in ascending or descending order.
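For ungrouped data, with the $n$ values arranged in order:
$$\tilde{x} = \begin{cases} x_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd} \\ \dfrac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2} & \text{if } n \text{ is even} \end{cases}$$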
a) 2, 1,8,3,5
b) 6, 5, 2,8,9,4
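For the two data sets a) and b), the median can be found by sorting and taking the middle value (or the average of the two middle values when the count is even). A minimal Python sketch, with `median` as an illustrative function name:

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 1, 8, 3, 5]))     # sorted: 1 2 3 5 8     -> 3
print(median([6, 5, 2, 8, 9, 4]))  # sorted: 2 4 5 6 8 9   -> 5.5
```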
►Quartiles
o Are the three values that divide the given data into four equal parts. They are denoted by Q1, Q2 and Q3.
►Deciles
o Are the nine values that divide the series into 10 equal parts. They are denoted by D1, D2, D3, ..., D9.
D1 covers the first 10% of the distribution.
►Percentiles
o Are the 99 values that divide the series into 100 equal parts. They are denoted by P1, P2, …, P99.
Note that
Reading assignments:
Quartiles, Deciles and Percentiles from the raw data
For grouped data (from class frequency distribution)
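$$Q_i = L_{cb} + \frac{cw}{f}\left(\frac{iN}{4} - cf\right) \quad \text{N.B. the } Q_i \text{ class is the first class with Lcf} \ge iN/4$$
$$D_i = L_{cb} + \frac{cw}{f}\left(\frac{iN}{10} - cf\right) \quad \text{N.B. the } D_i \text{ class is the first class with Lcf} \ge iN/10$$
$$P_i = L_{cb} + \frac{cw}{f}\left(\frac{iN}{100} - cf\right) \quad \text{N.B. the } P_i \text{ class is the first class with Lcf} \ge iN/100$$
where $L_{cb}$ is the lower class boundary of the quantile class, $cw$ its class width, $f$ its frequency, $cf$ the cumulative frequency of the classes before it, and Lcf the less-than cumulative frequency.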
E.g. For the data given below, compute the quartiles, D3, D7, P15 and P88, and interpret the results.
$$Q_2 = L_{cb} + \frac{cw}{f}\left(\frac{2N}{4} - cf\right) = 20 + \frac{20}{25}(50 - 25) = 40$$
The marks of half of the students are below 40.
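$D_3$: size of the $\left(\frac{3N}{10}\right)$th item = 30th item, so 20-40 is the decile class.
L = 20, cw = 20, f = 25, cf = 25
$$D_3 = L_{cb} + \frac{cw}{f}\left(\frac{3N}{10} - cf\right) = 20 + \frac{20}{25}(30 - 25) = 24$$
The marks of 30% of the students are below 24.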
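$D_7$: size of the $\left(\frac{7N}{10}\right)$th item = 70th item, so 40-60 is the decile class.
L = 40, cw = 20, f = 30, cf = 50
$$D_7 = L_{cb} + \frac{cw}{f}\left(\frac{7N}{10} - cf\right) = 40 + \frac{20}{30}(70 - 50) = 53.33$$
The marks of 70% of the students are below 53.33.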
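$P_{15}$: size of the $\left(\frac{15N}{100}\right)$th item = 15th item, so 10-20 is the percentile class.
L = 10, cw = 10, f = 15, cf = 10
$$P_{15} = L_{cb} + \frac{cw}{f}\left(\frac{15N}{100} - cf\right) = 10 + \frac{10}{15}(15 - 10) = 13.33$$
The marks of 15% of the students are below 13.33.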
$P_{88}$: size of the $\left(\frac{88N}{100}\right)$th item = 88th item, so 60-80 is the percentile class.
L = 60, cw = 20, f = 14, cf = 80
$$P_{88} = L_{cb} + \frac{cw}{f}\left(\frac{88N}{100} - cf\right) = 60 + \frac{20}{14}(88 - 80) = 71.43$$
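All of the quantile computations above use one interpolation formula, which can be sketched as a single Python function. The name `grouped_quantile` is hypothetical; the arguments are the L, cw, f and cf values stated in each worked step, and N = 100 is inferred from those steps (for example, 3N/10 = 30).

```python
def grouped_quantile(lcb, cw, f, cf, position):
    """Interpolate inside the quantile class: Lcb + (cw / f) * (position - cf)."""
    return lcb + (cw / f) * (position - cf)


N = 100  # total frequency implied by the worked steps

q2 = grouped_quantile(lcb=20, cw=20, f=25, cf=25, position=2 * N / 4)
d3 = grouped_quantile(lcb=20, cw=20, f=25, cf=25, position=3 * N / 10)
d7 = grouped_quantile(lcb=40, cw=20, f=30, cf=50, position=7 * N / 10)
p15 = grouped_quantile(lcb=10, cw=10, f=15, cf=10, position=15 * N / 100)
p88 = grouped_quantile(lcb=60, cw=20, f=14, cf=80, position=88 * N / 100)

for name, value in [("Q2", q2), ("D3", d3), ("D7", d7), ("P15", p15), ("P88", p88)]:
    print(name, round(value, 2))
```

The function does not locate the quantile class itself; that step still relies on the cumulative frequency column of the table.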
CHAPTER - FOUR
MEASURE OF DISPERSION (VARIATION)
The degree to which numerical data tend to spread about an average is called the dispersion or variation of the data.
Objectives of Measuring Variation or Dispersion
o To judge the reliability of a measure of central tendency,
o To compare two or more groups of numbers in terms of their variability, and
o To facilitate further statistical analysis.
1. Range (R)
R = X max – X min
o The range is easy and quick to compute, but it is not a good measure of variability: it fails to take into account how the data are distributed, and it is greatly affected by extreme values.
o The following two distributions have the same range, 13, yet appear to differ greatly in the amount of variability.
Distribution 1: 32 35 36 36 37 38 40 42 42 43 43 45
Distribution 2: 32 32 33 33 33 34 34 34 34 34 35 45
Example:
1. If the range and relative range of a series are 4 and 0.25 respectively, then what is the value of:
a) the smallest observation?
Q.D =? , C.Q.D =?
Standard Deviation
o There is a problem with variances. Recall that the deviations were squared. That means the units were also squared.
o To get the units back the same as the original data values, the square root must be taken.
o $\sigma = \sqrt{\sigma^2}$ and $s = \sqrt{s^2}$
Examples: Find the variance and standard deviation of the following sample data: (1) 5, 17, 12, 10; (2) the data given in the form of a frequency distribution.
Solutions: $\bar{x} = 11$
$x_i$ 5 10 12 17 Total
$(x_i - \bar{x})^2$ 36 1 1 36 74
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{74}{3} = 24.67, \quad s = \sqrt{s^2} = \sqrt{24.67} = 4.97$$
class frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
$\bar{x} = 55$
$x_i$ (C.M) 42 47 52 57 62 67 72 Total
$f_i (x_i - \bar{x})^2$ 1183 640 198 60 588 864 867 4400
$$s^2 = \frac{1}{n-1} \sum f_i (x_i - \bar{x})^2 = \frac{4400}{74} = 59.46, \quad s = \sqrt{59.46} = 7.71$$
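Both variance calculations above can be reproduced with one Python function that treats raw data as a frequency distribution in which every frequency is 1. This is a sketch; `sample_variance` is an illustrative name, and the class marks and frequencies are those of the marks table.

```python
import math


def sample_variance(values, freqs=None):
    """s^2 = sum(f_i * (x_i - mean)^2) / (n - 1), with n = sum(f_i)."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return squared_dev / (n - 1)


# (1) Raw sample data.
var_raw = sample_variance([5, 17, 12, 10])

# (2) Class marks and frequencies of the marks distribution.
marks = [42, 47, 52, 57, 62, 67, 72]
freqs = [7, 10, 22, 15, 12, 6, 3]
var_grouped = sample_variance(marks, freqs)

print(round(var_raw, 2), round(math.sqrt(var_raw), 2))          # 24.67 4.97
print(round(var_grouped, 2), round(math.sqrt(var_grouped), 2))  # 59.46 7.71
```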
Coefficient of Variation (CV)
o The coefficient of variation is defined as the ratio of the standard deviation to the mean, usually expressed as a percentage.
o $CV = \dfrac{S}{\bar{x}} \times 100\%$
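To illustrate how the CV puts two data sets on a common scale, the sketch below computes it for the two data sets used in the standard deviation examples of this chapter. It relies on Python's standard `statistics` module; the function name `coefficient_of_variation` is an assumption, and the grouped data are expanded from their class marks, which is an approximation of the original marks.

```python
import statistics


def coefficient_of_variation(values):
    """CV = (sample standard deviation / mean) * 100, in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100


sample = [5, 17, 12, 10]

# Marks distribution: repeat each class mark according to its frequency.
marks = [42, 47, 52, 57, 62, 67, 72]
freqs = [7, 10, 22, 15, 12, 6, 3]
expanded = [m for m, f in zip(marks, freqs) for _ in range(f)]

cv_sample = coefficient_of_variation(sample)
cv_marks = coefficient_of_variation(expanded)
print(round(cv_sample, 2), round(cv_marks, 2))
```

The data set with the smaller CV is the more consistent one relative to its mean, even though its standard deviation may be larger in absolute terms.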
Examples:
1. An analysis of the monthly wages paid (in birr) to workers in two firms A and B belonging to the same industry gives the following results:
Exercise:-
1. A meteorologist interested in the consistency of temperatures in three cities during a given week collected the following data. The temperatures for five days of the week in the three cities were:
City-1 25 24 23 26 17
City-2 22 21 24 22 20
City-3 32 27 35 24 28
o Which city do you think has the most consistent temperature, based on these data?
2. Two groups of people were trained to perform a certain task and tested to find out which group is faster to learn the task. For the two groups the following
information was given:
Moments
The rth moment about the mean is:
$$M_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n} = \frac{n-1}{n} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n-1}$$
For the case of frequency distribution this is expressed as:
o $M_r = \dfrac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^r}{n}$
o If r = 2, this is the population variance; it is called the second central moment.
o If we assume n − 1 ≈ n, it is also the sample variance.
Examples:
1. Find the first two moments for the following set of numbers 2,3,7
2. Find the first three central moments of the numbers in problem 1.
Solutions:
1. Use the rth moment formula.
$$\overline{x^r} = \frac{1}{n} \sum_{i=1}^{n} x_i^r: \quad \overline{x^1} = \frac{2 + 3 + 7}{3} = 4, \quad \overline{x^2} = \frac{2^2 + 3^2 + 7^2}{3} = 20.67$$
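Both example problems (moments about the origin and central moments of 2, 3, 7) can be checked with a short Python sketch. The function names `raw_moment` and `central_moment` are illustrative; both divide by n, as in the moment formulas of this section.

```python
def raw_moment(values, r):
    """rth moment about the origin: sum(x_i ** r) / n."""
    return sum(x ** r for x in values) / len(values)


def central_moment(values, r):
    """rth moment about the mean: sum((x_i - mean) ** r) / n."""
    mean = raw_moment(values, 1)
    return sum((x - mean) ** r for x in values) / len(values)


data = [2, 3, 7]
print([round(raw_moment(data, r), 2) for r in (1, 2)])         # [4.0, 20.67]
print([round(central_moment(data, r), 2) for r in (1, 2, 3)])  # [0.0, 4.67, 6.0]
```

The first central moment is always zero, which restates the fact that deviations from the mean sum to zero.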
Measure of Shapes
Skewness
Skewness is concerned with the shape of the curve, not its size.
o Skewness is the degree of asymmetry or departure from symmetry of a distribution.
o A skewed frequency distribution is one that is not symmetrical.
o If the frequency curve (smoothed frequency polygon) of a distribution has a longer tail to the right of the central maximum than to the left, the distribution is said to be skewed to the right, or to have positive skewness.
o If it has a longer tail to the left of the central maximum than to the right, it is said to be skewed to the left, or to have negative skewness.
For a moderately skewed distribution, the following relation holds among the three commonly used measures of central tendency: Mean − Mode = 3 × (Mean − Median)
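Coefficient of skewness based on quartiles:
$$\alpha_3 = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$$
Moment coefficient of skewness:
$$\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{M_3}{\sigma^3}$$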
Examples:
1. Suppose the mean, the mode, and the standard deviation of a certain distribution are 32, 30.5 and 10 respectively. What is the shape of the curve representing
the distribution?
2. In a frequency distribution, the coefficient of the skewness based on the quartiles is given to be 0.5. If the sum of the upper and lower quartiles is 28 and the
median is 11, find the values of the upper and the lower quartiles.
Solutions:
Given:
α3 =0.5, median =Q2=11
Q1+Q3= 28....................................... (*)
Required Q1 and Q3
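$$\alpha_3 = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1} = 0.5$$
Substituting the given values: $\dfrac{28 - 2(11)}{Q_3 - Q_1} = 0.5$, so
Q3 − Q1 = 12 ………………………… (**)
Solving (*) and (**): Q1 = 8, Q3 = 20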
3. For a moderately skewed frequency distribution, the mean is 10 and the median is 8.5. If the coefficient of variation is 20%, find the Pearsonian coefficient of
skewness and the probable mode of the distribution.
4. The sum of fifteen observations, whose mode is 8, was found to be 150, with a coefficient of variation of 20%. Calculate the Pearsonian coefficient of skewness and give an appropriate conclusion.
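Examples 1, 3 and 4 can be worked through with the sketch below. It assumes the usual definition of the Pearsonian coefficient of skewness, (mean − mode) / standard deviation, which is not spelled out in the surviving text of these notes; the function names are illustrative. The probable mode in Example 3 comes from the relation Mean − Mode = 3 × (Mean − Median) stated earlier.

```python
def pearson_skewness(mean, mode, sd):
    """Assumed definition: (mean - mode) / standard deviation."""
    return (mean - mode) / sd


def empirical_mode(mean, median):
    """From mean - mode = 3 * (mean - median) for moderately skewed data."""
    return mean - 3 * (mean - median)


def sd_from_cv(cv_percent, mean):
    """Invert CV = (sd / mean) * 100%."""
    return cv_percent / 100 * mean


# Example 1: mean 32, mode 30.5, standard deviation 10.
sk1 = pearson_skewness(32, 30.5, 10)

# Example 3: mean 10, median 8.5, CV 20%.
mode3 = empirical_mode(10, 8.5)
sk3 = pearson_skewness(10, mode3, sd_from_cv(20, 10))

# Example 4: 15 observations summing to 150, mode 8, CV 20%.
mean4 = 150 / 15
sk4 = pearson_skewness(mean4, 8, sd_from_cv(20, mean4))

print(round(sk1, 2), mode3, round(sk3, 2), round(sk4, 2))
```

A positive coefficient indicates a longer right tail (positive skewness), a negative one a longer left tail.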
Kurtosis
o Kurtosis is the degree of peakedness of a distribution, usually taken relative to a normal distribution.
o A distribution having a relatively high peak is leptokurtic.
o A distribution whose curve is flat-topped is platykurtic.
o The normal distribution, which is neither very peaked nor flat-topped, is mesokurtic.
Measure of Kurtosis
The moment coefficient of kurtosis, denoted by $\alpha_4$:
$$\alpha_4 = \frac{M_4}{M_2^2} = \frac{M_4}{\sigma^4}$$
Where:
$M_4$ is the 4th moment about the mean,
$M_2$ is the 2nd moment about the mean,
$\sigma$ is the population standard deviation.
The peakedness of a distribution depends on the value of $\alpha_4$:
If α4 > 3 then the curve is leptokurtic.
If α4 = 3 the curve is Mesokurtic
If α4 < 3 then the curve is Platykurtic.
a) $\alpha_3 = \dfrac{M_3}{M_2^{3/2}} = \dfrac{-60}{16^{3/2}} = -0.94 < 0$, so the distribution is negatively skewed.
b) $\alpha_4 = \dfrac{M_4}{M_2^2} = \dfrac{162}{16^2} = 0.63 < 3$, so the curve is platykurtic.
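The two coefficients can be computed and classified with a short Python sketch. The moment values M2 = 16, M3 = −60 and M4 = 162 are the ones appearing in the calculation above; the function names are illustrative.

```python
def moment_skewness(m2, m3):
    """alpha_3 = M3 / M2 ** (3/2)."""
    return m3 / m2 ** 1.5


def moment_kurtosis(m2, m4):
    """alpha_4 = M4 / M2 ** 2."""
    return m4 / m2 ** 2


def kurtosis_type(alpha4):
    """Classify the curve by comparing alpha_4 with 3."""
    if alpha4 > 3:
        return "leptokurtic"
    if alpha4 < 3:
        return "platykurtic"
    return "mesokurtic"


alpha3 = moment_skewness(m2=16, m3=-60)
alpha4 = moment_kurtosis(m2=16, m4=162)
print(alpha3, alpha4, kurtosis_type(alpha4))  # -0.9375 0.6328125 platykurtic
```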