
Moments and the moment generating function
Math 217 Probability and Statistics
Prof. D. Joyce, Fall 2014

There are various reasons for studying moments and the moment generating function. One of them is that the moment generating function can be used to prove the central limit theorem.

Moments, central moments, skewness, and kurtosis. The $k$th moment of a random variable $X$ is defined as $\mu_k = E(X^k)$. Thus, the mean is the first moment, $\mu = \mu_1$, and the variance can be found from the first and second moments:
$$\sigma^2 = \mu_2 - \mu_1^2.$$

The $k$th central moment is defined as $E((X - \mu)^k)$. Thus, the variance is the second central moment. The higher moments have more obscure meanings as $k$ grows.
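For instance (a minimal numerical sketch of my own, not part of the original notes), the identity $\sigma^2 = \mu_2 - \mu_1^2$ can be checked for a fair six-sided die:

```python
import numpy as np

# Fair six-sided die: outcomes 1..6, each with probability 1/6.
outcomes = np.arange(1, 7)
probs = np.full(6, 1 / 6)

mu1 = np.sum(outcomes * probs)       # first moment, E(X) = 3.5
mu2 = np.sum(outcomes**2 * probs)    # second moment, E(X^2) = 91/6

variance = mu2 - mu1**2              # sigma^2 = mu_2 - mu_1^2
print(mu1, variance)                 # 3.5 and 35/12, about 2.9167
```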
A third central moment of the standardized random variable $X^* = (X - \mu)/\sigma$,
$$\beta_3 = E((X^*)^3) = \frac{E((X - \mu)^3)}{\sigma^3},$$
is called the skewness of $X$. A distribution that's symmetric about its mean has 0 skewness. (In fact, all the odd central moments are 0 for a symmetric distribution.) But if it has a long tail to the right and a short one to the left, then it has a positive skewness, and a negative skewness in the opposite situation.

A fourth central moment of $X^*$,
$$\beta_4 = E((X^*)^4) = \frac{E((X - \mu)^4)}{\sigma^4},$$
is called kurtosis. A fairly flat distribution with long tails has a high kurtosis, while a short-tailed distribution has a low kurtosis. A bimodal distribution has a very high kurtosis. A normal distribution has a kurtosis of 3. (The word kurtosis was made up in the early 19th century from the Greek word for curvature.) Kurtosis is not a particularly important concept, but I mention it here for completeness.

It turns out that the whole distribution for $X$ is determined by all the moments, at least when the moment generating function converges on an interval around 0; in that case, different distributions can't have identical moments. That's what makes moments important.
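To see these numbers in practice (my addition, with sample sizes and distributions chosen for illustration), one can estimate skewness and kurtosis from large samples; an exponential distribution has skewness 2 and kurtosis 9, while a normal sample should give roughly 0 and 3:

```python
import numpy as np

rng = np.random.default_rng(0)

def standardized_moment(x, k):
    z = (x - x.mean()) / x.std()   # standardize the sample
    return np.mean(z**k)           # k-th moment of the standardized values

normal = rng.standard_normal(1_000_000)
expo = rng.exponential(size=1_000_000)

print(standardized_moment(normal, 3), standardized_moment(normal, 4))  # about 0 and 3
print(standardized_moment(expo, 3), standardized_moment(expo, 4))      # about 2 and 9
```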
The moment generating function. There is a clever way of organizing all the moments into one mathematical object, and that object is called the moment generating function. It's a function $m(t)$ of a new variable $t$ defined by
$$m(t) = E(e^{tX}).$$

Since the exponential function $e^t$ has the power series
$$e^t = \sum_{k=0}^{\infty} \frac{t^k}{k!} = 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^k}{k!} + \cdots,$$
we can rewrite $m(t)$ as follows:
$$\begin{aligned}
m(t) &= E(e^{tX}) \\
&= E\left(1 + tX + \frac{(tX)^2}{2!} + \cdots + \frac{(tX)^k}{k!} + \cdots\right) \\
&= 1 + tE(X) + \frac{t^2 E(X^2)}{2!} + \cdots + \frac{t^k E(X^k)}{k!} + \cdots \\
&= 1 + \mu_1 t + \frac{\mu_2}{2!} t^2 + \cdots + \frac{\mu_k}{k!} t^k + \cdots
\end{aligned}$$

Theorem. The $k$th derivative of $m(t)$ evaluated at $t = 0$ is the $k$th moment $\mu_k$ of $X$.

In other words, the moment generating function generates the moments of $X$ by differentiation.
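Here is a small symbolic sketch of the theorem in action (my addition); it differentiates the moment generating function $m(t) = \lambda/(\lambda - t)$ of an exponential distribution, which is derived later in these notes:

```python
import sympy as sp

t, lam = sp.symbols('t lambda', positive=True)
m = lam / (lam - t)   # MGF of an exponential distribution (derived below)

# The k-th derivative of m at t = 0 is the k-th moment mu_k.
for k in range(1, 4):
    mu_k = sp.diff(m, t, k).subs(t, 0)
    print(k, sp.simplify(mu_k))   # 1/lambda, 2/lambda**2, 6/lambda**3
```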
The primary use of moment generating functions is to develop the theory of probability. For instance, the easiest way to prove the central limit theorem is to use moment generating functions.
For discrete distributions, we can also compute the moment generating function directly in terms of the probability mass function $f(x) = P(X = x)$ as
$$m(t) = E(e^{tX}) = \sum_x e^{tx} f(x).$$
For continuous distributions, the moment generating function can be expressed in terms of the probability density function $f_X$ as
$$m(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx.$$

Properties of moment generating functions.

Translation. If $Y = X + a$, then
$$m_Y(t) = e^{at} m_X(t).$$
Proof: $m_Y(t) = E(e^{Yt}) = E(e^{(X+a)t}) = E(e^{Xt} e^{at}) = e^{at} E(e^{Xt}) = e^{at} m_X(t)$. q.e.d.

Scaling. If $Y = bX$, then
$$m_Y(t) = m_X(bt).$$
Proof: $m_Y(t) = E(e^{Yt}) = E(e^{(bX)t}) = E(e^{X(bt)}) = m_X(bt)$. q.e.d.
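Both properties are easy to sanity-check by Monte Carlo (a sketch I have added, with illustrative values of $a$, $b$, and $t$), comparing the empirical $E(e^{tY})$ against the transformed MGF; here $X$ is standard normal, whose MGF $e^{t^2/2}$ is computed at the end of these notes:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(2_000_000)
a, b = 1.5, 0.7

def mgf_empirical(sample, t):
    return np.mean(np.exp(t * sample))   # estimate of E(e^{tY})

m_X = lambda t: np.exp(t**2 / 2)         # exact MGF of a standard normal

for t in (0.3, 0.8):
    # Translation: Y = X + a has MGF e^{at} m_X(t).
    print(mgf_empirical(x + a, t), np.exp(a * t) * m_X(t))
    # Scaling: Y = bX has MGF m_X(bt).
    print(mgf_empirical(b * x, t), m_X(b * t))
```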

Standardizing. From the last two properties, if
$$X^* = \frac{X - \mu}{\sigma}$$
is the standardized random variable for $X$, then
$$m_{X^*}(t) = e^{-\mu t/\sigma}\, m_X(t/\sigma).$$
Proof: First translate by $-\mu$ to get
$$m_{X-\mu}(t) = e^{-\mu t} m_X(t).$$
Then scale that by a factor of $1/\sigma$ to get
$$m_{(X-\mu)/\sigma}(t) = m_{X-\mu}(t/\sigma) = e^{-\mu t/\sigma} m_X(t/\sigma).$$
q.e.d.

Convolution. If $X$ and $Y$ are independent variables, and $Z = X + Y$, then
$$m_Z(t) = m_X(t)\, m_Y(t).$$
Proof: $m_Z(t) = E(e^{Zt}) = E(e^{(X+Y)t}) = E(e^{Xt} e^{Yt})$. Now, since $X$ and $Y$ are independent, so are $e^{Xt}$ and $e^{Yt}$. Therefore, $E(e^{Xt} e^{Yt}) = E(e^{Xt}) E(e^{Yt}) = m_X(t) m_Y(t)$. q.e.d.
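A quick Monte Carlo check of the convolution property (my addition): with $X$ and $Y$ independent exponentials of rate 1, each has MGF $1/(1-t)$ for $t < 1$ (see the exponential example below), so $m_Z(t)$ should be its square:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(size=2_000_000)
y = rng.exponential(size=2_000_000)

t = 0.4
m_exp = 1 / (1 - t)                  # exact rate-1 exponential MGF at t

print(np.mean(np.exp(t * (x + y))))  # empirical m_Z(t)
print(m_exp**2)                      # m_X(t) m_Y(t), about 2.778
```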
Note that this property of convolution on moment generating functions implies that for a sample sum $S_n = X_1 + X_2 + \cdots + X_n$, the moment generating function is
$$m_{S_n}(t) = (m_X(t))^n.$$
We can couple that with the standardizing property to determine the moment generating function for the standardized sum
$$S_n^* = \frac{S_n - n\mu}{\sigma\sqrt{n}}.$$
Since the mean of $S_n$ is $n\mu$ and its standard deviation is $\sigma\sqrt{n}$, when it's standardized we get
$$\begin{aligned}
m_{S_n^*}(t) &= e^{-n\mu t/(\sigma\sqrt{n})}\, m_{S_n}\!\left(\frac{t}{\sigma\sqrt{n}}\right) \\
&= e^{-\sqrt{n}\,\mu t/\sigma}\, m_{S_n}\!\left(\frac{t}{\sigma\sqrt{n}}\right) \\
&= e^{-\sqrt{n}\,\mu t/\sigma} \left(m_X\!\left(\frac{t}{\sigma\sqrt{n}}\right)\right)^n
\end{aligned}$$
We'll use this result when we prove the central limit theorem.
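Numerically, this formula already hints at the central limit theorem (my addition; the rate-1 exponential, with $\mu = \sigma = 1$ and $m_X(t) = 1/(1-t)$, is just a convenient test case): as $n$ grows, $m_{S_n^*}(t)$ approaches $e^{t^2/2}$, the standard normal MGF computed at the end of these notes.

```python
import numpy as np

# Rate-1 exponential summands: mu = 1, sigma = 1, m_X(t) = 1/(1 - t) for t < 1.
mu, sigma = 1.0, 1.0
m_X = lambda t: 1 / (1 - t)

t = 1.0
for n in (10, 100, 10_000):
    s = t / (sigma * np.sqrt(n))                        # rescaled argument
    m_std = np.exp(-np.sqrt(n) * mu * t / sigma) * m_X(s)**n
    print(n, m_std)                                     # about 1.894, 1.709, 1.654

print(np.exp(t**2 / 2))                                 # limit e^{1/2}, about 1.6487
```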

Some examples. The same definitions for these apply to discrete distributions and continuous ones. That is, if $X$ is any random variable, then its $n$th moment is
$$\mu_n = E(X^n)$$
and its moment generating function is
$$m_X(t) = E(e^{tX}) = \sum_{k=0}^{\infty} \frac{\mu_k}{k!}\, t^k.$$
The only difference is that when you compute them, you use sums for discrete distributions but integrals for continuous ones. That's because expectation is defined in terms of sums or integrals in the two cases. Thus, in the continuous case
$$\mu_n = E(X^n) = \int_{-\infty}^{\infty} x^n f_X(x)\,dx$$
and
$$m(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx.$$

We'll look at one example of a moment generating function for a discrete distribution and three for continuous distributions.
The moment generating function for a geometric distribution. Recall that when independent Bernoulli trials are repeated, each with probability $p$ of success, the time $X$ it takes to get the first success has a geometric distribution:
$$P(X = j) = q^{j-1} p, \quad \text{for } j = 1, 2, \ldots,$$
where $q = 1 - p$. Let's compute the generating function for the geometric distribution.
$$m(t) = \sum_{j=1}^{\infty} e^{tj} q^{j-1} p = \frac{p}{q} \sum_{j=1}^{\infty} e^{tj} q^j$$
The series $\sum_{j=1}^{\infty} e^{tj} q^j = \sum_{j=1}^{\infty} (e^t q)^j$ is a geometric series with sum $\dfrac{e^t q}{1 - e^t q}$. Therefore,
$$m(t) = \frac{p}{q} \cdot \frac{e^t q}{1 - e^t q} = \frac{p e^t}{1 - q e^t}.$$
From this generating function, we can find the moments. For instance, $\mu_1 = m'(0)$. The derivative of $m$ is
$$m'(t) = \frac{p e^t}{(1 - q e^t)^2},$$
so $\mu_1 = m'(0) = \dfrac{p}{(1 - q)^2} = \dfrac{1}{p}$. This agrees with what we already know, that the mean of the geometric distribution is $1/p$.
From this generating function, we can find the


moments. For instance, 1 = m0 (0). The derivative
The moment generating function for an ex-
of m is
ponential distribution with parameter .
0 pet
m (t) = , Recall that when events occur in a Poisson pro-
(1 qet )2
cess uniformly at random over time at a rate of
p 1 events per unit time, then the random variable
so 1 = m0 (0) = = . This agrees with
(1 q)2 p X giving the time to the first event has an expo-
what we already know, that the mean of the geo- nential distribution. The density function for X is
metric distribution is 1/p. fX (x) = ex , for x [0, ).

Let's compute its moment generating function.
$$\begin{aligned}
m(t) &= \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx \\
&= \int_0^{\infty} e^{tx} \lambda e^{-\lambda x}\,dx \\
&= \lambda \int_0^{\infty} e^{(t-\lambda)x}\,dx \\
&= \lambda \left.\frac{e^{(t-\lambda)x}}{t - \lambda}\right|_0^{\infty} \\
&= \lambda \left(\lim_{x \to \infty} \frac{e^{(t-\lambda)x}}{t - \lambda} - \frac{e^0}{t - \lambda}\right)
\end{aligned}$$
Now if $t < \lambda$, then the limit in the last line is 0, so in that case
$$m(t) = \frac{\lambda}{\lambda - t}.$$
This is a minor, yet important point. The moment generating function doesn't have to be defined for all $t$. We only need it to be defined for $t$ near 0 because we're only interested in its derivatives evaluated at 0.
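The restriction $t < \lambda$ is easy to see numerically (my addition, with $\lambda = 2$ as an arbitrary choice): below $\lambda$ the defining integral converges to $\lambda/(\lambda - t)$, while at $t \ge \lambda$ the integrand no longer decays and the integral diverges.

```python
import numpy as np
from scipy.integrate import quad

lam = 2.0
integrand = lambda x, t: np.exp(t * x) * lam * np.exp(-lam * x)

for t in (0.5, 1.0, 1.9):
    val, _ = quad(integrand, 0, np.inf, args=(t,))
    print(t, val, lam / (lam - t))   # numeric integral matches lambda/(lambda - t)
```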
The moment generating function for the standard normal distribution. Let $Z$ be a random variable with a standard normal distribution. Its probability density function is
$$f_Z(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$
Its moments can be computed from the definition, but it takes repeated applications of integration by parts to compute
$$\mu_n = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^n e^{-x^2/2}\,dx.$$
We won't do that computation here, but it turns out that when $n$ is odd, the integral is 0, so $\mu_n$ is 0 if $n$ is odd. On the other hand, when $n$ is even, say $n = 2m$, then it turns out that
$$\mu_{2m} = \frac{(2m)!}{2^m m!}.$$
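These values are easy to confirm symbolically (my addition), by integrating directly against the density:

```python
import sympy as sp

x = sp.symbols('x')
density = sp.exp(-x**2 / 2) / sp.sqrt(2 * sp.pi)   # standard normal pdf

for n in range(1, 7):
    mu_n = sp.integrate(x**n * density, (x, -sp.oo, sp.oo))
    print(n, mu_n)   # 0, 1, 0, 3, 0, 15: odd moments vanish; mu_2m = (2m)!/(2^m m!)
```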

From these values of all the moments, we can compute the moment generating function.
$$\begin{aligned}
m(t) &= \sum_{n=0}^{\infty} \frac{\mu_n}{n!}\, t^n \\
&= \sum_{m=0}^{\infty} \frac{\mu_{2m}}{(2m)!}\, t^{2m} \\
&= \sum_{m=0}^{\infty} \frac{(2m)!}{2^m m!} \cdot \frac{1}{(2m)!}\, t^{2m} \\
&= \sum_{m=0}^{\infty} \frac{1}{2^m m!}\, t^{2m} \\
&= \sum_{m=0}^{\infty} \frac{(t^2/2)^m}{m!} \\
&= e^{t^2/2}
\end{aligned}$$
Thus, the moment generating function for the standard normal distribution $Z$ is
$$m_Z(t) = e^{t^2/2}.$$
More generally, if $X = \sigma Z + \mu$ is a normal distribution with mean $\mu$ and variance $\sigma^2$, then the moment generating function is
$$m_X(t) = \exp(\mu t + \sigma^2 t^2 / 2).$$

Math 217 Home Page at
http://math.clarku.edu/~djoyce/ma217/