
A Report on

Study of Estimation of the Coefficient of Variation for Normal Distribution
submitted in partial fulfillment of the
requirements of the degree of
Master of Science
in
Mathematics
submitted by
Mr. Mantu Kumar
Roll No.: S18MA006
4th Semester
under the supervision of
Dr. Adarsha Kumar Jena

Department of Mathematics
National Institute of Technology, Meghalaya
Bijni Complex, Laitumkhrah
Shillong-793003, Meghalaya, India
4th July 2020
National Institute of Technology, Meghalaya
Bijni Complex, Laitumkhrah
Shillong-793003

Certificate
This is to certify that the thesis entitled A Report on Study of Estimation of Coefficient of
Variation for Normal Distribution, submitted by Mr. Mantu Kumar in partial fulfillment
of the requirements for the award of the degree of Master of Science in Mathematics,
is a record of research work carried out by him under our supervision and guidance.

All help received by him from various sources has been duly acknowledged.

No part of this thesis has been submitted elsewhere for the award of any other degree.

Dr. Adarsha Kumar Jena


Assistant Professor & Supervisor
Department of Mathematics, NIT Meghalaya

Dr. Manideepa Saha


Assistant Professor & Head of the Department
Department of Mathematics, NIT Meghalaya
Declaration

I hereby declare that the thesis entitled A Report on Study of Estimation of Coefficient
of Variation for Normal Distribution, submitted by me to the Department of Mathematics,
NIT Meghalaya in partial fulfillment of the requirements for the award of the degree of
Master of Science in Mathematics, is the record of bona fide project work carried out by me
under the guidance of Dr. Adarsha Kumar Jena, NIT Meghalaya.

The matter presented in this project has not been submitted by me for the award of
any other degree elsewhere.

Mantu Kumar
(S18MA006)
Acknowledgement
I would like to express my deep sense of gratitude and sincere thanks to my project supervisor
Dr. Adarsha Kumar Jena, Assistant Professor, NIT Meghalaya, for being helpful
and a great source of inspiration. He has provided me with great insight and feedback
at every step of the way, and as a result I have learned and grown personally and
professionally. His keen interest and constant encouragement gave me the confidence to
complete my work.

I am also grateful to Dr. Manideepa Saha, Assistant Professor and Head of the
Department of Mathematics, NIT Meghalaya, for providing the necessary facilities in
the department.

I am also grateful to all the faculty members of the National Institute of Technology Meghalaya
for their useful suggestions and encouragement. I would like to thank and acknowledge the
Government of India for continued assistance throughout the course of my studies at the
National Institute of Technology Meghalaya.

I would also like to extend my heartfelt gratitude to my classmates, who have encouraged
me in every possible way towards the completion of this thesis. Last but not least, I
would like to thank my parents for being my constant source of motivation. This dissertation
would not have been possible without the help of the several individuals who extended
their support in the completion of this study.

Mantu Kumar
S18MA006
Contents

Certificate

Declaration

Acknowledgement

Abstract

1 Introduction

2 Confidence Interval
2.1 Confidence Interval for the Mean
2.2 Confidence Interval for the Variance, Standard Deviation and Coefficient of Variation
2.3 Mean and Variance of the Coefficient of Variation for the Sampling Distribution
2.4 Asymptotic Unbiased Estimator

3 Point Estimation
3.1 Definition
3.2 Maximum Likelihood Estimator
3.3 Method of Moment Estimator
3.4 Unbiased Estimator

4 Conclusion
Abstract
In this report, we have studied the coefficient of variation for the normal and lognormal
distributions. In this regard, we have studied point estimates and confidence intervals for the
coefficient of variation. The point estimate of the coefficient of variation has been studied in
three cases: (i) the maximum likelihood estimator, (ii) the method of moments estimator and
(iii) the unbiased estimator. From the study it is observed that the maximum likelihood
estimator gives more precise values than the others. Although all the estimators have their
own characteristics, each becomes more precise as the sample size increases.
Chapter 1

Introduction
The coefficient of variation (CV) is a dimensionless measure of the dispersion of a probability
distribution. More specifically, it is a measure of variability relative to the mean. This
measure can be used to make comparisons across several populations that have different
units of measurement. The zoologist uses it in biometry. In chemical experiments,
the CV is often used as a yardstick of the precision of measurements; two measurement
methods may be compared on the basis of their respective CVs. In finance, the CV can be used
as a measure of relative risk, and a test of equality of the CVs of two stocks can help determine
whether the two stocks possess the same risk.

A point estimator draws inferences about a population by estimating the value of an
unknown parameter using a single value, or point. A confidence interval is in fact an interval
estimate: an interval estimator draws inferences about a population by estimating the
value of an unknown parameter using an interval.

What is the coefficient of variation?

Let X be a random variable following a certain distribution with mean µ
and variance σ². The coefficient of variation is defined as the ratio σ/µ,
where σ and µ are the standard deviation and the mean, respectively.
Example 1 (Poisson distribution). The Poisson distribution is given by

$$f(x;\lambda)=\frac{\lambda^{x}e^{-\lambda}}{x!},\qquad x=0,1,2,\dots$$

The mean of f is

$$E(X)=\sum_{x=0}^{\infty}x\,\frac{\lambda^{x}e^{-\lambda}}{x!}=\sum_{x=1}^{\infty}\frac{\lambda^{x}e^{-\lambda}}{(x-1)!}=\lambda\sum_{x=1}^{\infty}\frac{\lambda^{x-1}e^{-\lambda}}{(x-1)!}=\lambda e^{-\lambda}\sum_{y=0}^{\infty}\frac{\lambda^{y}}{y!}=\lambda e^{-\lambda}e^{\lambda}=\lambda,$$

where y = x − 1.

The variance of f is

$$\mathrm{Var}(X)=\sum_{x=0}^{\infty}x^{2}\,\frac{\lambda^{x}e^{-\lambda}}{x!}-\mu^{2}=E[X(X-1)]+E(X)-\mu^{2}=\lambda^{2}+\lambda-\lambda^{2}=\lambda,$$

since

$$E[X(X-1)]=\sum_{x=2}^{\infty}\frac{\lambda^{x}e^{-\lambda}}{(x-2)!}=\lambda^{2}e^{-\lambda}\sum_{y=0}^{\infty}\frac{\lambda^{y}}{y!}=\lambda^{2}.$$

The standard deviation is therefore $\sqrt{\lambda}$. Hence the coefficient of variation is $\sqrt{\lambda}/\lambda=1/\sqrt{\lambda}$.
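This closed form is easy to check by simulation. The following is a minimal sketch, assuming NumPy is available; the rate and sample size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 4.0                              # Poisson rate; theoretical CV = 1/sqrt(4) = 0.5
sample = rng.poisson(lam, size=100_000)

cv_empirical = sample.std() / sample.mean()
cv_theoretical = 1.0 / np.sqrt(lam)
print(cv_empirical, cv_theoretical)    # both should be close to 0.5
```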

Chapter 2

Confidence Interval
2.1 Confidence Interval for the Mean

A random variable X is said to be normally distributed if its density function is

$$f_X(x;\mu,\sigma^2)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac12\left(\frac{x-\mu}{\sigma}\right)^{2}},\qquad-\infty<x<\infty,$$

where the parameters µ and σ satisfy −∞ < µ < ∞ and σ > 0. If we define a random variable
Z as Z = (X − µ)/σ, with z = (x − µ)/σ at X = x, then

$$P(-z_{\alpha/2}<Z<z_{\alpha/2})=1-\{P(Z<-z_{\alpha/2})+P(Z>z_{\alpha/2})\}=1-(\alpha/2+\alpha/2)=1-\alpha$$

(because α/2 is the area of the region to the left of −z_{α/2} and α/2 is the
area of the region to the right of z_{α/2}, as in Figure 1).

If there is a sample of normally distributed random variables X1, X2, ..., Xn,
then the standardized sample mean is

$$z=\frac{\bar x-\mu}{\sigma/\sqrt n} \tag{2.1}$$

where x̄ and µ are the sample mean and the population mean, respectively, σ is the
standard deviation and n is the sample size. Equation (2.1) can be written as

$$\mu=\bar x\pm z\,\frac{\sigma}{\sqrt n} \tag{2.2}$$

Therefore,

$$\bar x-z\frac{\sigma}{\sqrt n}<\mu<\bar x+z\frac{\sigma}{\sqrt n} \tag{2.3}$$

Now, since z_{α/2} is the z-value leaving an area of α/2 to its right, −z_{α/2} is
the z-value leaving an area of α/2 to its left. Therefore, the above inequality
can be written as

$$\bar x-z_{\alpha/2}\frac{\sigma}{\sqrt n}<\mu<\bar x+z_{\alpha/2}\frac{\sigma}{\sqrt n} \tag{2.4}$$

This is known as the 100(1 − α)% confidence interval for µ.
Let θ̂_L = x̄ − z_{α/2}σ/√n and θ̂_U = x̄ + z_{α/2}σ/√n; then θ̂_L and θ̂_U are known as the
lower and upper confidence limits, respectively.

2.2 Confidence Interval for the Variance, Standard Deviation and Coefficient of Variation

Let X = (X1, X2, ..., Xn) be a sample of random variables from a lognormal
distribution, i.e. Y = ln(X) ∼ N(µ, σ²). The probability density function of the
lognormal LN(µ, σ²) is

$$f(x;\mu,\sigma^2)=\begin{cases}\dfrac{1}{x\sigma\sqrt{2\pi}}\exp\!\left(-\dfrac12\left(\dfrac{\ln x-\mu}{\sigma}\right)^{2}\right), & x>0;\\[2mm]0, & \text{elsewhere.}\end{cases} \tag{2.5}$$

The mean and variance of the lognormal distribution are E(X) = exp(µ + σ²/2)
and var(X) = e^{σ²+2µ}(e^{σ²} − 1), respectively.
Hence, the coefficient of variation of the lognormal distribution is

$$CV=\frac{\sqrt{e^{\sigma^{2}+2\mu}\left(e^{\sigma^{2}}-1\right)}}{e^{\mu+\sigma^{2}/2}}=\sqrt{e^{\sigma^{2}}-1}. \tag{2.6}$$

Now, with µ and σ² unknown, we know that

$$\frac{(n-1)S^{2}}{\sigma^{2}}=\frac{\sum_{i=1}^{n}(Y_i-\bar Y)^{2}}{\sigma^{2}}$$

has a χ² distribution with (n − 1) degrees of freedom, where S² is computed from the
log-transformed observations Yi = ln Xi; it can therefore be used as a pivot.
Choosing the chi-square critical values χ²_{α/2,n−1} and χ²_{1−α/2,n−1}, we have

$$P\!\left(\frac{(n-1)S^{2}}{\chi^{2}_{\alpha/2,n-1}}<\sigma^{2}<\frac{(n-1)S^{2}}{\chi^{2}_{1-\alpha/2,n-1}}\right)=1-\alpha. \tag{2.7}$$

It is straightforward to see that the 100(1 − α)% confidence interval for σ² is

$$\frac{(n-1)S^{2}}{\chi^{2}_{\alpha/2,n-1}}\le\sigma^{2}\le\frac{(n-1)S^{2}}{\chi^{2}_{1-\alpha/2,n-1}} \tag{2.8}$$

where α ∈ (0, 1) and χ²_{α/2,n−1} is the critical value of the chi-square distribution
with (n − 1) degrees of freedom leaving an area of α/2 to its right. Using (2.8), the
confidence interval for the standard deviation is obtained as

$$\sqrt{\frac{(n-1)S^{2}}{\chi^{2}_{\alpha/2,n-1}}}\le\sigma\le\sqrt{\frac{(n-1)S^{2}}{\chi^{2}_{1-\alpha/2,n-1}}} \tag{2.9}$$

We see from (2.6) that the CV of the lognormal distribution depends only on
σ². Hence, because of (2.6) and (2.8), the confidence interval (CI) for the
coefficient of variation of the lognormal distribution is obtained as

$$CI=[L,U]=\left[\sqrt{\exp\!\left(\frac{(n-1)S^{2}}{\chi^{2}_{\alpha/2,n-1}}\right)-1},\;\sqrt{\exp\!\left(\frac{(n-1)S^{2}}{\chi^{2}_{1-\alpha/2,n-1}}\right)-1}\,\right] \tag{2.10}$$
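Equations (2.8)-(2.10) translate directly into code. Below is a minimal sketch, assuming SciPy is available; the data are simulated and the parameters are illustrative. Note that chi2.ppf(1 - alpha/2, n - 1) gives the critical value with area α/2 to its right, matching the convention above.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
n, alpha = 30, 0.05
mu, sigma = 1.0, 0.5                       # true CV = sqrt(e^{0.25} - 1) ~ 0.533
y = rng.lognormal(mean=mu, sigma=sigma, size=n)
s2 = np.log(y).var(ddof=1)                 # S^2 computed from ln of the data

chi_hi = chi2.ppf(1 - alpha / 2, n - 1)    # chi^2_{alpha/2, n-1}
chi_lo = chi2.ppf(alpha / 2, n - 1)        # chi^2_{1-alpha/2, n-1}

var_lo, var_hi = (n - 1) * s2 / chi_hi, (n - 1) * s2 / chi_lo        # (2.8)
sd_lo, sd_hi = np.sqrt(var_lo), np.sqrt(var_hi)                      # (2.9)
cv_lo, cv_hi = np.sqrt(np.expm1(var_lo)), np.sqrt(np.expm1(var_hi))  # (2.10)
print((cv_lo, cv_hi))
```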

Example 2. The average zinc concentration recovered from a sample
of measurements taken at 36 different locations in a river is found to be 2.6
grams per milliliter. Find the 95% confidence interval for the mean zinc
concentration in the river. Assume the standard deviation is 0.3 gram per
milliliter.
Solution. The point estimate of µ is x̄ = 2.6. The z-value leaving an area of
0.025 to the right, and therefore an area of 0.975 to the left, is z_{0.025} = 1.96.
Therefore the 95% confidence interval is

$$\bar x-z_{\alpha/2}\frac{\sigma}{\sqrt n}<\mu<\bar x+z_{\alpha/2}\frac{\sigma}{\sqrt n}\implies 2.6-(1.96)\frac{0.3}{\sqrt{36}}<\mu<2.6+(1.96)\frac{0.3}{\sqrt{36}}\implies 2.502<\mu<2.698.$$
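The same computation in code, a minimal sketch assuming SciPy for the normal quantile:

```python
from scipy.stats import norm

x_bar, sigma, n, alpha = 2.6, 0.3, 36, 0.05
z = norm.ppf(1 - alpha / 2)                     # z_{0.025} ~ 1.96
half_width = z * sigma / n ** 0.5
print(x_bar - half_width, x_bar + half_width)   # ~ (2.502, 2.698)
```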

2.3 Mean and Variance of the Coefficient of Variation for the Sampling Distribution

Theorem. If X̄ and S are the mean and the standard deviation of a random sample
X1, X2, ..., Xn of size n taken from a normal population N(µ, σ²), then the
probability density function of the coefficient of variation C = S/X̄ of the
sample is

$$f(z)=\begin{cases}kz^{n-1}\displaystyle\int_{0}^{\infty}x^{n}e^{-\frac{n}{2\sigma^{2}}[z^{2}x^{2}+(x-\mu)^{2}]}\,dx, & z>0;\\[2mm]-kz^{n-1}\displaystyle\int_{-\infty}^{0}x^{n}e^{-\frac{n}{2\sigma^{2}}[z^{2}x^{2}+(x-\mu)^{2}]}\,dx, & \text{elsewhere,}\end{cases} \tag{2.11}$$

where

$$k=\frac{2(n/2)^{(n+1)/2}}{\sigma^{n+1}\,\Gamma(n/2)\Gamma(1/2)},$$

and S is normalized so that nS²/σ² ∼ χ²(n), as assumed in the proof below.
2
Proof. Let ξ² = nS²/σ² ∼ χ²(n). According to the sampling distribution theorem
for χ², ξ² has the following density function:

$$f_{\xi^{2}}(x)=\begin{cases}\dfrac{1}{2^{n/2}\Gamma(n/2)}\,x^{\frac n2-1}e^{-x/2}, & x>0;\\[1mm]0, & \text{otherwise.}\end{cases} \tag{2.12}$$

Then, by transformation of random variables, the following density function
of S² can be obtained:

$$f_{S^{2}}(y)=\begin{cases}\dfrac{n}{\sigma^{2}}\,\dfrac{1}{2^{n/2}\Gamma(n/2)}\left(\dfrac{ny}{\sigma^{2}}\right)^{\frac n2-1}e^{-\frac{ny}{2\sigma^{2}}}, & y>0;\\[1mm]0, & \text{otherwise,}\end{cases} \tag{2.13}$$

from which we conclude that the standard deviation of the sample has probability
density

$$f_{S}(y)=2y\,f_{S^{2}}(y^{2})=\begin{cases}\dfrac{2(n/2)^{n/2}}{\sigma^{n}\Gamma(n/2)}\,y^{n-1}e^{-\frac{ny^{2}}{2\sigma^{2}}}, & y>0;\\[1mm]0, & \text{otherwise.}\end{cases} \tag{2.14}$$

As the mean of the sample has the normal distribution X̄ ∼ N(µ, σ²/n), its
density function is given by

$$f_{\bar X}(x)=\frac{\sqrt n}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{n(x-\mu)^{2}}{2\sigma^{2}}}.$$

As the mean and the standard deviation of the sample are independent random variables, the
density function of the coefficient of variation C = S/X̄ is

$$f_{c}(z)=\int_{0}^{\infty}xf_{S}(zx)f_{\bar X}(x)\,dx-\int_{-\infty}^{0}xf_{S}(zx)f_{\bar X}(x)\,dx=\begin{cases}\displaystyle\int_{0}^{\infty}xf_{S}(zx)f_{\bar X}(x)\,dx, & z>0;\\[2mm]-\displaystyle\int_{-\infty}^{0}xf_{S}(zx)f_{\bar X}(x)\,dx, & \text{otherwise,}\end{cases} \tag{2.15}$$

since f_S(zx) vanishes unless zx > 0. Substituting (2.14) and the density of X̄ into (2.15) yields (2.11). So the
theorem is established.

2.4 Asymptotic Unbiased Estimator

The sample coefficient of variation can be expanded in powers of (X̄ − µ) as

$$cv=\frac{S}{\bar X}=S\sum_{k=1}^{\infty}\frac{(-1)^{k-1}(\bar X-\mu)^{k-1}}{\mu^{k}}. \tag{2.16}$$

It is known that X̄ and S are independent in a normal distribution. Taking
the expectation of (2.16),

$$E(cv)=E(S)\sum_{k=1}^{\infty}\frac{(-1)^{k-1}E\big[(\bar X-\mu)^{k-1}\big]}{\mu^{k}}. \tag{2.17}$$

Moreover,

$$E(S)=c_{n}\sigma,\qquad E\big[(\bar X-\mu)^{2k}\big]=\frac{2^{k}\,\Gamma(k+1/2)}{\sqrt\pi}\left(\frac{\sigma^{2}}{n}\right)^{k},\qquad E\big[(\bar X-\mu)^{2k+1}\big]=0, \tag{2.18}$$

where k is a natural number and $c_{n}=\sqrt{\dfrac{2}{n-1}}\,\dfrac{\Gamma(n/2)}{\Gamma\!\left(\frac{n-1}{2}\right)}$.

Using (2.18) in (2.17), it follows that

$$\begin{aligned}
E(cv)&=c_{n}\sigma\sum_{k=0}^{\infty}\frac{2^{k}\Gamma(k+1/2)}{\sqrt\pi\,\mu^{2k+1}}\left(\frac{\sigma^{2}}{n}\right)^{k}\\
&=c_{n}\frac{\sigma}{\mu}\left\{1+\sum_{k=1}^{\infty}\frac{2^{k}\Gamma(k+1/2)}{\sqrt\pi\,\mu^{2k}}\left(\frac{\sigma^{2}}{n}\right)^{k}\right\}\\
&=c_{n}\tau+c_{n}\tau\sum_{k=1}^{\infty}\frac{2^{k}\Gamma(k+1/2)}{\sqrt\pi}\left(\frac{\tau^{2}}{n}\right)^{k}\\
&=c_{n}\tau+c_{n}\tau\sum_{k=1}^{\infty}(2k-1)(2k-3)(2k-5)\cdots1\left(\frac{\tau^{2}}{n}\right)^{k}\\
&=c_{n}\tau+c_{n}\tau\sum_{k=1}^{\infty}\frac{(2k-1)!}{2^{k-1}(k-1)!}\left(\frac{\tau^{2}}{n}\right)^{k}\\
&=c_{n}\tau+c_{n}\tau\sum_{k=1}^{\infty}\frac{(2k)!}{2^{k}\,k!}\left(\frac{\tau^{2}}{n}\right)^{k}
\end{aligned} \tag{2.19}$$

where τ = σ/µ. (Only the even central moments contribute in (2.17), since the odd ones vanish by (2.18).)
Now,

$$c_{n}=\sqrt{\frac{2}{n-1}}\,\frac{\Gamma(n/2)}{\Gamma\!\left(\frac{n-1}{2}\right)}, \tag{2.20}$$

and by Stirling's approximation Γ(n/2)/Γ((n−1)/2) ≈ √((n−1)/2) for large n.

Hence $\lim_{n\to\infty}c_{n}=1$ and $\lim_{n\to\infty}c_{n}\tau\sum_{k=1}^{\infty}\frac{(2k)!}{2^{k}k!}\left(\frac{\tau^{2}}{n}\right)^{k}=0$,

and therefore $\lim_{n\to\infty}E(cv)=\lim_{n\to\infty}c_{n}\tau=\tau$; the sample coefficient of variation is an asymptotically unbiased estimator of τ.
The probability density function of the coefficient of variation, from (2.11), is

$$f(z)=\begin{cases}kz^{n-1}\displaystyle\int_{0}^{\infty}x^{n}e^{-\frac{n}{2\sigma^{2}}[z^{2}x^{2}+(x-\mu)^{2}]}\,dx, & z>0;\\[2mm]-kz^{n-1}\displaystyle\int_{-\infty}^{0}x^{n}e^{-\frac{n}{2\sigma^{2}}[z^{2}x^{2}+(x-\mu)^{2}]}\,dx, & \text{elsewhere,}\end{cases} \tag{2.21}$$

where $k=\dfrac{2(n/2)^{(n+1)/2}}{\sigma^{n+1}\Gamma(n/2)\Gamma(1/2)}$. The mean of the sampling distribution of z is

$$E(z)=\int_{-\infty}^{+\infty}zf(z)\,dz=-k\int_{-\infty}^{0}x^{n}e^{-\frac{n(x-\mu)^{2}}{2\sigma^{2}}}\int_{-\infty}^{0}z^{n}e^{-\frac{nz^{2}x^{2}}{2\sigma^{2}}}\,dz\,dx+k\int_{0}^{\infty}x^{n}e^{-\frac{n(x-\mu)^{2}}{2\sigma^{2}}}\int_{0}^{\infty}z^{n}e^{-\frac{nz^{2}x^{2}}{2\sigma^{2}}}\,dz\,dx. \tag{2.22}$$

Let $u=\dfrac{nz^{2}x^{2}}{2\sigma^{2}}$, so that $dz=\dfrac{\sigma\,u^{-1/2}}{\sqrt{2n}\,|x|}\,du$; the inner integrals reduce to Gamma integrals,

$$\int_{0}^{\infty}z^{n}e^{-\frac{nz^{2}x^{2}}{2\sigma^{2}}}\,dz=\frac12\left(\frac{2\sigma^{2}}{nx^{2}}\right)^{\frac{n+1}{2}}\Gamma\!\left(\frac{n+1}{2}\right), \tag{2.23}$$

and collecting the constants,

$$E(z)=\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma(n/2)}\sqrt{\frac2n}\,\sigma\int_{-\infty}^{\infty}\frac1x\,f_{\bar X}(x)\,dx, \tag{2.24}$$

where f_X̄ is the density of the sample mean. Expanding 1/x about µ by Taylor's series
(the first-order term vanishes because E(X̄ − µ) = 0),

$$\int_{-\infty}^{\infty}\frac1x f_{\bar X}(x)\,dx=\int_{-\infty}^{\infty}\frac1\mu f_{\bar X}(x)\,dx+\int_{-\infty}^{\infty}\frac{(x-\mu)^{2}}{\mu^{3}}f_{\bar X}(x)\,dx+\dots\approx\frac1\mu+\frac{\sigma^{2}}{\mu^{3}n}. \tag{2.25}$$

Substituting c = σ/µ, the above reduces to

$$E(z)=\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma(n/2)}\sqrt{\frac2n}\left(c+\frac{c^{3}}{n}\right).$$

Taking n → ∞,

$$\left[\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma(n/2)}\right]^{2}\approx\frac n2-\frac14, \tag{2.26}$$

hence

$$E(z)\approx\sqrt{\frac2n}\sqrt{\frac n2-\frac14}\left(c+\frac{c^{3}}{n}\right)=\sqrt{1-\frac{1}{2n}}\left(c+\frac{c^{3}}{n}\right)\approx c,$$

i.e. E(z) ≈ c = σ/µ.
Similarly,

$$E(z^{2})=\int_{-\infty}^{+\infty}z^{2}f(z)\,dz=-k\int_{-\infty}^{0}x^{n}e^{-\frac{n(x-\mu)^{2}}{2\sigma^{2}}}\int_{-\infty}^{0}z^{n+1}e^{-\frac{nz^{2}x^{2}}{2\sigma^{2}}}\,dz\,dx+k\int_{0}^{\infty}x^{n}e^{-\frac{n(x-\mu)^{2}}{2\sigma^{2}}}\int_{0}^{\infty}z^{n+1}e^{-\frac{nz^{2}x^{2}}{2\sigma^{2}}}\,dz\,dx. \tag{2.27}$$

With the same substitution $u=\dfrac{nz^{2}x^{2}}{2\sigma^{2}}$, the inner integrals again reduce to Gamma integrals and the constants collapse, giving

$$E(z^{2})=\sigma^{2}\int_{-\infty}^{\infty}\frac{1}{x^{2}}\,f_{\bar X}(x)\,dx. \tag{2.28}$$

Expanding 1/x² about µ by Taylor's series,

$$E(z^{2})=\sigma^{2}\left\{\int_{-\infty}^{\infty}\frac{1}{\mu^{2}}f_{\bar X}(x)\,dx+\int_{-\infty}^{\infty}\frac{3(x-\mu)^{2}}{\mu^{4}}f_{\bar X}(x)\,dx+\dots\right\}\approx c^{2}+\frac{3c^{4}}{n}. \tag{2.29}$$

Therefore

$$\mathrm{var}(z)=E(z^{2})-(E(z))^{2}\approx c^{2}+\frac{3c^{4}}{n}-\left(1-\frac{1}{2n}\right)\left(c+\frac{c^{3}}{n}\right)^{2}\approx\frac{c^{4}}{n}+\frac{c^{2}}{2n}. \tag{2.30}$$
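Both approximations, E(z) ≈ c and var(z) ≈ c⁴/n + c²/(2n), can be checked by Monte Carlo simulation. A minimal sketch follows; the parameters are illustrative, and the sample CV here uses the n − 1 divisor, which agrees with the approximations to the stated order:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 10.0, 2.0, 50, 100_000
c = sigma / mu                                  # population CV

x = rng.normal(mu, sigma, size=(reps, n))
cv = x.std(axis=1, ddof=1) / x.mean(axis=1)     # sample CV per replication

print(cv.mean(), c)                             # E(z) ~ c
print(cv.var(), c**4 / n + c**2 / (2 * n))      # var(z) ~ c^4/n + c^2/(2n)
```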
(2.30)
Theorem 1. If X̄ and S are the mean and standard deviation of a random
sample X1, X2, ..., Xn of size n taken from a normal population N(0, σ²),
then the probability density function of CV = S/X̄ of the sample is

$$f(x)=\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma(1/2)\Gamma(n/2)}\,x^{-2}\left(\frac{x^{2}}{1+x^{2}}\right)^{\frac{n+1}{2}}. \tag{2.31}$$

Proof. Let ξ² = nS²/σ² ∼ χ²(n). According to the sampling distribution of χ², ξ² has
the following density function:

$$f_{\xi^{2}}(x)=\begin{cases}\dfrac{1}{2^{n/2}\Gamma(n/2)}\,x^{\frac n2-1}e^{-\frac x2}, & x>0;\\[1mm]0, & \text{elsewhere.}\end{cases} \tag{2.32}$$

Then, by transformation of random variables, the density function of S² is
obtained as

$$f_{S^{2}}(y)=\begin{cases}\dfrac{n}{\sigma^{2}}\,\dfrac{1}{2^{n/2}\Gamma(n/2)}\left(\dfrac{ny}{\sigma^{2}}\right)^{\frac n2-1}e^{-\frac{ny}{2\sigma^{2}}}, & y>0;\\[1mm]0, & \text{elsewhere,}\end{cases} \tag{2.33}$$

from which it can be concluded that the standard deviation of the sample
has probability density

$$f_{S}(x)=2x\,f_{S^{2}}(x^{2})=\begin{cases}\dfrac{2(n/2)^{n/2}}{\sigma^{n}\Gamma(n/2)}\,x^{n-1}e^{-\frac{nx^{2}}{2\sigma^{2}}}, & x>0;\\[1mm]0, & \text{elsewhere.}\end{cases} \tag{2.34}$$

As the population is normal, the mean of the sample X̄ also has a normal
distribution, N(0, σ²/n), and its density function is given by

$$f_{\bar X}(y)=\frac{\sqrt n}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{ny^{2}}{2\sigma^{2}}}. \tag{2.35}$$

Since the mean and the standard deviation of the sample are two independent
random variables, the density function of CV = S/X̄ is

$$f_{c}(z)=\int_{-\infty}^{\infty}|y|\,f_{S}(zy)f_{\bar X}(y)\,dy. \tag{2.36}$$

Case 1: when z > 0,

$$f_{c}(z)=\int_{0}^{+\infty}yf_{S}(zy)f_{\bar X}(y)\,dy-\int_{-\infty}^{0}yf_{S}(zy)f_{\bar X}(y)\,dy=\frac{2(n/2)^{\frac{n+1}{2}}z^{n-1}}{\sigma^{n+1}\Gamma(n/2)\sqrt\pi}\int_{0}^{+\infty}y^{n}e^{-\frac{ny^{2}(1+z^{2})}{2\sigma^{2}}}\,dy, \tag{2.37}$$

the second integral vanishing because f_S(zy) = 0 for y < 0. Writing $t=\dfrac{ny^{2}(1+z^{2})}{\sigma^{2}}$, by transformation we get $y^{2}=\dfrac{\sigma^{2}t}{n(1+z^{2})}$ and $y\,dy=\dfrac{\sigma^{2}}{2n(1+z^{2})}\,dt$; then

$$\int_{0}^{\infty}y^{n}e^{-\frac{ny^{2}(1+z^{2})}{2\sigma^{2}}}\,dy=\left(\frac{\sigma^{2}}{n(1+z^{2})}\right)^{\frac{n+1}{2}}\frac12\int_{0}^{+\infty}t^{\frac{n-1}{2}}e^{-\frac t2}\,dt=\left(\frac{\sigma^{2}}{n(1+z^{2})}\right)^{\frac{n+1}{2}}2^{\frac{n-1}{2}}\Gamma\!\left(\frac{n+1}{2}\right). \tag{2.38}$$

Using (2.38), (2.37) reduces to

$$f_{c}(z)=\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma(n/2)\Gamma(1/2)}\,\frac{1}{z^{2}}\left(\frac{z^{2}}{1+z^{2}}\right)^{\frac{n+1}{2}}. \tag{2.39}$$

Case 2: when z ≤ 0, (2.36) becomes

$$f_{c}(z)=\int_{0}^{+\infty}yf_{S}(zy)f_{\bar X}(y)\,dy-\int_{-\infty}^{0}yf_{S}(zy)f_{\bar X}(y)\,dy=0-\int_{-\infty}^{0}yf_{S}(zy)f_{\bar X}(y)\,dy. \tag{2.40}$$

Let zy = t, so that y = t/z and dy = dt/z; then

$$f_{c}(z)=\frac{1}{z^{2}}\int_{0}^{\infty}t\,f_{S}(t)\,f_{\bar X}\!\left(\frac tz\right)dt=\frac{1}{z^{2}}\,\frac{2(n/2)^{\frac{n+1}{2}}}{\sigma^{n+1}\Gamma(n/2)\sqrt\pi}\int_{0}^{\infty}t^{n}e^{-\frac{nt^{2}(1+\frac{1}{z^{2}})}{2\sigma^{2}}}\,dt=\frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma(n/2)\Gamma(1/2)}\,\frac{1}{z^{2}}\left(\frac{z^{2}}{1+z^{2}}\right)^{\frac{n+1}{2}}. \tag{2.41}$$

Hence from (2.39) and (2.41) we get the density function (2.31), and the theorem
is established.
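The density (2.31) can be verified numerically by simulating S and X̄ from the distributions assumed in the proof (nS²/σ² ∼ χ²(n) and X̄ ∼ N(0, σ²/n), independent) and comparing a histogram of S/X̄ with f. A minimal sketch, assuming NumPy and SciPy; the window [1, 3] and the parameters are illustrative:

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(3)
n, sigma, reps = 8, 1.0, 500_000

# Simulate under the theorem's assumptions:
# n*S^2/sigma^2 ~ chi2(n) and X_bar ~ N(0, sigma^2/n), independent.
s = sigma * np.sqrt(rng.chisquare(n, reps) / n)
x_bar = rng.normal(0.0, sigma / np.sqrt(n), reps)
cv = s / x_bar

def f(x, n):
    # Density (2.31) of CV = S/X_bar for a N(0, sigma^2) population.
    log_const = gammaln((n + 1) / 2) - gammaln(0.5) - gammaln(n / 2)
    return np.exp(log_const) * x**-2.0 * (x**2 / (1 + x**2)) ** ((n + 1) / 2)

# Compare the empirical and theoretical densities on the window [1, 3].
counts, edges = np.histogram(cv[(cv > 1) & (cv < 3)], bins=20, range=(1, 3))
mids = 0.5 * (edges[:-1] + edges[1:])
empirical = counts / (reps * (edges[1] - edges[0]))
print(np.c_[mids, empirical, f(mids, n)][:5])   # last two columns should roughly agree
```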

Chapter 3

Point Estimation
3.1 Definition

A point estimate is an approximate value for a parameter in the distribution
of X, obtained from a sample. The sample mean

$$\bar x=\frac1n\sum_{i=1}^{n}x_{i} \tag{3.1}$$

is an estimator of the mean µ of X, and the sample variance

$$S^{2}=\frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\bar x)^{2}=\frac{1}{n-1}\left[(x_{1}-\bar x)^{2}+\dots+(x_{n}-\bar x)^{2}\right] \tag{3.2}$$

is an estimator of the variance σ² of X.
A point estimate of a parameter θ is a single number that can be regarded
as a sensible value of θ. A point estimate is obtained by selecting a suitable
statistic and computing its value from the given sample data. The selected
statistic is called the point estimator of θ.

An Estimator of the Coefficient of Variation using Point Estimators

If X is a normal random variable with mean µ and variance σ², then the
parameter

$$\kappa=\frac{\sigma}{\mu} \tag{3.3}$$

is called the population coefficient of variation. Let Xi, for i = 1, ..., n, be
an independent random sample with Xi ∼ N(µ, σ²) for each i. In terms of the
usual sample estimates of the normal parameters, with the sample mean

$$\bar X=\sum_{i=1}^{n}\frac{X_{i}}{n} \tag{3.4}$$

and the sample variance

$$S^{2}=\sum_{i=1}^{n}\frac{(X_{i}-\bar X)^{2}}{n-1}, \tag{3.5}$$

an estimator of the coefficient of variation in this case is given by

$$K\equiv\frac{S}{\bar X}. \tag{3.6}$$

Example 3. Suppose that X, the reaction time to a certain stimulus, has a
uniform distribution on the interval from 0 to an unknown upper limit θ.
It is desired to estimate θ on the basis of a random sample X1, X2, ..., Xn
of reaction times. Since θ is the largest possible time in the entire population
of reaction times, consider as an estimator the largest sample reaction
time, θ̂ = max(X1, X2, ..., Xn). If n = 5 and x1 = 4.2, x2 = 1.7, x3 = 2.4, x4 =
3.9, x5 = 1.3, the point estimate of θ is
θ̂ = max(4.2, 1.7, 2.4, 3.9, 1.3) = 4.2, where x1, x2, ..., x5 are the observed
values of X1, X2, ..., X5.

3.2 Maximum Likelihood Estimator

3.2.1 Definition

Given independent observations x1, x2, ..., xn from a population density
function (continuous case) or probability mass function (discrete case) f(x; θ),
the maximum likelihood estimator θ̂ is the value that maximizes the likelihood
function

$$L(x_{1},x_{2},\dots,x_{n};\theta)=f(x_{1};\theta)f(x_{2};\theta)\cdots f(x_{n};\theta). \tag{3.7}$$

That is to say, let f be a function defined on independent and identically distributed
random variables and let θ be a parameter; among the values of θ satisfying

$$\frac{\partial L(x,\theta)}{\partial\theta}=0 \tag{3.8}$$

and

$$\frac{\partial^{2}L(x,\theta)}{\partial\theta^{2}}<0, \tag{3.9}$$

the one giving the maximum of L is called the maximum likelihood estimate. L(θ) and
log L(θ) have their maxima at the same values of θ, and hence it is sometimes
easier to find the maximum of the logarithm of the likelihood function.

3.2.2 An Estimator of the Coefficient of Variation using the MLEs of µ and σ

A random sample of size n from the normal distribution has the density
function

$$f(x;\mu,\sigma)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac12\left(\frac{x-\mu}{\sigma}\right)^{2}},\qquad-\infty<x<\infty. \tag{3.10}$$

If µ = 0 and σ = 1, f(x; µ, σ) is known as the standard normal density.
The likelihood function from (3.10) is given by

$$\prod_{i=1}^{n}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}}=\frac{1}{(2\pi\sigma^{2})^{n/2}}\exp\!\left(-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\mu)^{2}\right).$$

The logarithm of the likelihood function is

$$L=-\frac n2\log2\pi-\frac n2\log\sigma^{2}-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\mu)^{2},$$

where σ > 0 and −∞ < µ < ∞. To find the location of its maximum, we compute

$$\frac{\partial L}{\partial\mu}=\frac{1}{\sigma^{2}}\sum(x_{i}-\mu) \tag{3.11}$$

$$\frac{\partial L}{\partial\sigma^{2}}=-\frac{n}{2\sigma^{2}}+\frac{1}{2\sigma^{4}}\sum_{i=1}^{n}(x_{i}-\mu)^{2}. \tag{3.12}$$

Setting these derivatives equal to zero and solving the resulting equations
for µ and σ²:

$$\frac{1}{\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\mu)=0\implies\sum_{i=1}^{n}x_{i}-n\mu=0\implies\hat\mu=\frac1n\sum_{i=1}^{n}x_{i}; \tag{3.13}$$

using (3.12),

$$-\frac{n}{2\sigma^{2}}+\frac{1}{2\sigma^{4}}\sum_{i=1}^{n}(x_{i}-\mu)^{2}=0 \tag{3.14}$$

$$\implies-n+\frac{1}{\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\bar x)^{2}=0\quad(\text{using }(3.13))\implies\hat\sigma^{2}=\frac1n\sum_{i=1}^{n}(x_{i}-\bar x)^{2}.$$

Hence an estimator of the CV using the maximum likelihood estimators of µ
and σ is

$$\frac{\hat\sigma}{\hat\mu}=\frac{\sqrt{n\sum_{i=1}^{n}(x_{i}-\bar x)^{2}}}{\sum_{i=1}^{n}x_{i}}.$$
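In code, the MLE-based CV estimator is the 1/n-divisor standard deviation divided by the sample mean; a minimal sketch (the simulated data are illustrative):

```python
import numpy as np

def cv_mle(x):
    # sqrt((1/n) * sum (x_i - x_bar)^2) / x_bar, i.e. sqrt(n * SS) / sum(x)
    x = np.asarray(x, dtype=float)
    return x.std(ddof=0) / x.mean()

rng = np.random.default_rng(4)
sample = rng.normal(10.0, 2.0, size=200)   # true CV = 2/10 = 0.2
print(cv_mle(sample))
```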
Example 4. Suppose that 10 rats are used in a biomedical study in which
they are injected with cancer cells and then given a cancer drug that is
designed to increase their survival rate. The survival times, in months, are
14, 17, 27, 18, 12, 8, 22, 13, 19 and 12. Assume that an exponential distribution
applies. Give an MLE of the mean survival time.
Solution.

$$f(x;\beta)=\begin{cases}\frac1\beta e^{-x/\beta}, & x>0;\\[1mm]0, & \text{elsewhere.}\end{cases} \tag{3.15}$$

The likelihood function for the given data, with n = 10, is

$$L(x_{1},x_{2},\dots,x_{10};\beta)=\frac1\beta e^{-\frac{x_{1}}{\beta}}\cdot\frac1\beta e^{-\frac{x_{2}}{\beta}}\cdots\frac1\beta e^{-\frac{x_{10}}{\beta}}=\frac{1}{\beta^{10}}\,e^{-\frac{1}{\beta}\sum_{i=1}^{10}x_{i}},$$

so the log-likelihood is

$$\ln L=-10\ln\beta-\frac1\beta\sum_{i=1}^{10}x_{i},\qquad\frac{\partial\ln L}{\partial\beta}=-\frac{10}{\beta}+\frac{1}{\beta^{2}}\sum_{i=1}^{10}x_{i}=0$$

$$\implies-10\beta+\sum_{i=1}^{10}x_{i}=0\implies\hat\beta=\frac{1}{10}\sum_{i=1}^{10}x_{i}=\frac{1}{10}(14+17+27+18+12+8+22+13+19+12)=\frac{162}{10}=16.2.$$

Since the first derivative of ln L vanishes at only one value of β, and ln L attains
a maximum there, β̂ = 16.2 is the MLE for f(x; β).
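The closed form β̂ = x̄ can also be confirmed by maximizing the log-likelihood numerically; a minimal sketch using scipy.optimize.minimize_scalar (the bounds are illustrative):

```python
import numpy as np
from scipy.optimize import minimize_scalar

x = np.array([14, 17, 27, 18, 12, 8, 22, 13, 19, 12], dtype=float)

def neg_log_lik(beta):
    # -ln L = n*ln(beta) + sum(x)/beta for the Exp(mean beta) model
    return len(x) * np.log(beta) + x.sum() / beta

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 100.0), method="bounded")
print(res.x, x.mean())   # both ~ 16.2
```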

3.3 Method of Moment Estimator

3.3.1 Moment

The rth moment of a RV X about the mean µ (also called the rth central
moment) is defined as µ_r = E[(X − µ)^r], where r = 0, 1, 2, .... It follows that
µ₀ = 1, µ₁ = 0 and µ₂ = σ², and

$$\mu_{r}=\sum_{i}(x_{i}-\mu)^{r}f(x_{i})\quad\text{(discrete case)},\qquad\mu_{r}=\int_{-\infty}^{\infty}(x-\mu)^{r}f(x)\,dx\quad\text{(continuous case)}.$$

For the moments about zero, µ′₀ = 1 and µ′₁ = µ, and

$$\mu_{2}=\mu_{2}'-\mu^{2},\qquad\mu_{3}=\mu_{3}'-3\mu_{2}'\mu+2\mu^{3},\qquad\text{and so on.}$$

Let (X1, X2, ..., Xn) be a random sample from the density function f(·). Then
the rth sample moment about 0 is denoted by

$$M_{r}'=\frac1n\sum_{i=1}^{n}X_{i}^{r}. \tag{3.16}$$

In particular, if r = 1 we get the sample mean, which is usually denoted by
X̄ or X̄_n, i.e. X̄_n = (1/n)Σᵢ₌₁ⁿ Xᵢ; also, the rth sample moment about X̄_n, denoted
by M_r, is defined to be

$$M_{r}=\frac1n\sum_{i=1}^{n}(X_{i}-\bar X_{n})^{r}. \tag{3.17}$$
3.3.2 Method of Moment

Let f(·; θ1, θ2, ..., θk) be a density of a random variable which has k
parameters θ1, θ2, ..., θk. Let µ′_r denote the rth moment about 0, that is, µ′_r =
E[X^r]. In general, µ′_r will be a known function of the k parameters θ1, θ2, ..., θk;
denote this by writing µ′_r = µ′_r(θ1, θ2, ..., θk). Let X1, X2, ..., Xn be a random
sample from the density f(·; θ1, θ2, ..., θk) and, as before, let M′_j be
the jth sample moment, i.e. M′_j = (1/n)Σᵢ₌₁ⁿ xᵢ^j. Form the k equations

$$M_{j}'=\mu_{j}'(\theta_{1},\theta_{2},\dots,\theta_{k}),\qquad j=1,2,\dots,k,$$

in the k unknowns θ1, ..., θk, and let θ̂1, ..., θ̂k be their respective solutions.
We say that the estimator (θ̂1, ..., θ̂k), where θ̂_j estimates θ_j, is the estimator
of (θ1, θ2, ..., θk) obtained by the method of moments. The estimators are
obtained by replacing population moments by sample moments.

3.3.3 An Estimator of the Coefficient of Variation using Moment Estimators

If r = 1, the first sample moment is

$$\bar X=\frac1n\sum_{i=1}^{n}X_{i}. \tag{3.18}$$

Let X1, X2, ..., Xn be a random sample from a density function f(·); then

$$S^{2}=\frac1n\sum_{i=1}^{n}(x_{i}-\bar x)^{2} \tag{3.19}$$

for n > 1 is the sample variance obtained from the second central moment.
Hence an estimator of the CV using moment estimators is

$$\frac{\sqrt{\frac1n\sum_{i=1}^{n}(x_{i}-\bar x)^{2}}}{\frac1n\sum_{i=1}^{n}x_{i}}. \tag{3.20}$$
Example 5. Let X1, X2, ..., Xn be a random sample from a normal distribution
with mean µ and variance σ². Let (θ1, θ2) = (µ, σ). Estimate the parameters
µ and σ by the method of moments.
Solution. Here µ = µ′₁ and σ² = µ′₂ − (µ′₁)². The moment equations reduce to

$$M_{1}'=\mu_{1}'(\mu,\sigma)=\mu,\qquad M_{2}'=\mu_{2}'(\mu,\sigma)=\sigma^{2}+\mu^{2},$$

and their solution is the following: the method-of-moments estimator of µ
is M′₁ = X̄, and the method-of-moments estimator of σ is

$$\sqrt{M_{2}'-\bar X^{2}}=\sqrt{\frac1n\sum_{i=1}^{n}X_{i}^{2}-\bar X^{2}}=\sqrt{\frac{\sum_{i=1}^{n}(X_{i}-\bar X)^{2}}{n}}.$$
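In code, the method-of-moments estimates come straight from the first two sample moments; a minimal sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(10.0, 2.0, size=500)

m1 = x.mean()                      # M'_1
m2 = np.mean(x**2)                 # M'_2
mu_hat = m1                        # mu = mu'_1
sigma_hat = np.sqrt(m2 - m1**2)    # sigma = sqrt(mu'_2 - mu'^2_1)
print(mu_hat, sigma_hat)           # ~ (10, 2)
```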

3.4 Unbiased Estimator

Definition. The bias of an estimator θ̂ is defined as

$$\mathrm{Bias}_{\theta}[\hat\theta]=E_{\theta}[\hat\theta]-\theta=E_{\theta}[\hat\theta-\theta]. \tag{3.21}$$

An estimator is said to be unbiased if its bias is equal to zero for all values
of the parameter θ. An estimator δ(X) is an unbiased estimator of a function
g(θ) of the parameter θ if E_θ[δ(X)] = g(θ) for every possible value of θ.
An estimator that is not unbiased is called a biased estimator. The difference
between the expectation of an estimator and g(θ) is called the bias of the
estimator; that is, the bias of δ as an estimator of g(θ) is E_θ[δ(X) − g(θ)], and
δ is unbiased iff

$$E_{\theta}[\delta(X)-g(\theta)]=0,\qquad\forall\theta. \tag{3.22}$$

In the case of a sample from a normal distribution with unknown mean θ, X̄_n
is an unbiased estimator of θ because

$$E_{\theta}(\bar X_{n})=\theta,\qquad-\infty<\theta<\infty. \tag{3.23}$$

Theorem. If S² is the variance of a random sample from a population with
mean µ and variance σ², then S² is an unbiased estimator of σ².
Proof. Let X1, X2, ..., Xn be the sample random variables, with variance σ² < ∞. We
have

$$\sum_{i=1}^{n}(X_{i}-\bar X)^{2}=\sum_{i=1}^{n}(X_{i}-\mu)^{2}-n(\bar X-\mu)^{2}. \tag{3.24}$$

Now,

$$E(S^{2})=E\!\left[\frac{1}{n-1}\sum_{i=1}^{n}(X_{i}-\bar X)^{2}\right]=\frac{1}{n-1}\left[\sum_{i=1}^{n}E(X_{i}-\mu)^{2}-nE(\bar X-\mu)^{2}\right]=\frac{1}{n-1}\left(\sum_{i=1}^{n}\sigma_{X_{i}}^{2}-n\sigma_{\bar X}^{2}\right). \tag{3.25}$$

However, σ²_{Xᵢ} = σ² for i = 1, 2, ..., n, and σ²_{X̄} = σ²/n. Therefore

$$E(S^{2})=\frac{1}{n-1}\left(n\sigma^{2}-n\,\frac{\sigma^{2}}{n}\right)=\sigma^{2}.$$

Hence, S² is an unbiased estimator of σ².


Example 6. Let X1, X2, ..., Xn be a random sample from a population with
finite mean µ. Show that the sample mean X̄ and (1/3)X̄ + (2/3)X1 are both
unbiased estimators of µ.
Solution. By (3.23), X̄ is unbiased. Now,

$$E\!\left[\frac13\bar X+\frac23X_{1}\right]=\frac13E(\bar X)+\frac23E(X_{1})=\frac13\cdot\frac1n\sum_{i=1}^{n}E(X_{i})+\frac23\mu=\frac13\cdot\frac1n(n\mu)+\frac23\mu=\mu,$$

since E(Xᵢ) = µ for each i. Hence (1/3)X̄ + (2/3)X1 is also an unbiased estimator
of µ. Thus there may be more than one unbiased estimator of µ.

3.4.1 An Estimator of the Coefficient of Variation using Unbiased Estimators

The unbiased estimator of the variance is

$$S^{2}=\frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\bar x)^{2}. \tag{3.26}$$

The denominator n − 1 in the sample variance is necessary to ensure unbiasedness
of the variance estimator. The sample mean, itself an unbiased estimator of µ, is

$$\bar X=\frac1n\sum_{i=1}^{n}X_{i}. \tag{3.27}$$

Hence an estimator of the CV using unbiased estimators is

$$\frac{\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\bar x)^{2}}}{\frac1n\sum_{i=1}^{n}x_{i}}. \tag{3.28}$$
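The estimators can be compared side by side by simulation; a minimal sketch (the Monte Carlo setup is illustrative). For the normal model, the moment estimator (3.20) coincides with the MLE form, so only two distinct formulas appear:

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, n, reps = 10.0, 2.0, 25, 100_000
x = rng.normal(mu, sigma, size=(reps, n))
mean = x.mean(axis=1)

cv_mle = x.std(axis=1, ddof=0) / mean   # MLE / moment estimator: 1/n divisor
cv_unb = x.std(axis=1, ddof=1) / mean   # unbiased S^2: 1/(n-1) divisor

true_cv = sigma / mu
for name, est in (("MLE/moment", cv_mle), ("unbiased-S", cv_unb)):
    print(name, "bias:", est.mean() - true_cv, "sd:", est.std())
```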

Chapter 4

Conclusion
When a sample of random variables has a finite number of points, the moment
estimator, the maximum likelihood estimator and the unbiased estimator can all be
obtained. The maximum likelihood estimator gives more precise values than the
others. Although all the estimators have their own characteristics, each
becomes more precise as the sample size increases.

