
EE501 Stochastic Processes
Semester 191
Week 3, Lecture 1

Mohamed Abdul Haleem
Room B14-S-345
Tel: x2572
Email: m.haleem@uoh.edu.sa
Functions of Random Variables (Ch. 5, 6)

Let X be a r.v. defined on the model (\Omega, F, P), and suppose g(x) is a function of the variable x. Define

Y = g(X).

Is Y necessarily a r.v.? If so, what are its PDF F_Y(y) and pdf f_Y(y)?

F_Y(y) = P(Y(\xi) \le y) = P(g(X(\xi)) \le y) = P(X(\xi) \in g^{-1}((-\infty, y])).

Thus the distribution function, as well as the density function, of Y can be determined in terms of that of X.
We shall consider some of the following functions Y = g(X) to illustrate the technical details:

aX + b,  X^2,  \sin X,  1/X,  |X|,  \log X,  e^X,  |X|\,U(x).
Example 1: Y = aX + b.

[Figure: the line y = ax + b for a > 0 and for a < 0, with the solution x = (y - b)/a marked.]

Solution: Suppose a > 0. Then

F_Y(y) = P(Y(\xi) \le y) = P(aX(\xi) + b \le y) = P\left(X(\xi) \le \frac{y-b}{a}\right) = F_X\!\left(\frac{y-b}{a}\right),

and

f_Y(y) = \frac{1}{a} f_X\!\left(\frac{y-b}{a}\right).

On the other hand, if a < 0, then

F_Y(y) = P(Y(\xi) \le y) = P(aX(\xi) + b \le y) = P\left(X(\xi) > \frac{y-b}{a}\right) = 1 - F_X\!\left(\frac{y-b}{a}\right),

so that

f_Y(y) = -\frac{1}{a} f_X\!\left(\frac{y-b}{a}\right).

In either case,

f_Y(y) = \frac{1}{|a|} f_X\!\left(\frac{y-b}{a}\right).
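The a < 0 branch of this result can be checked by simulation. The sketch below (not part of the original lecture; the sample size, seed, and test point y0 are arbitrary choices) draws X ~ N(0,1), forms Y = aX + b with a < 0, and compares the empirical CDF of Y against 1 - F_X((y - b)/a):

```python
import math
import random

def normal_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(1)
a, b = -2.0, 1.0          # a < 0 branch of the derivation
xs = [random.gauss(0.0, 1.0) for _ in range(200_000)]
ys = [a * x + b for x in xs]

y0 = 0.5
empirical = sum(y <= y0 for y in ys) / len(ys)
# a < 0: F_Y(y) = 1 - F_X((y - b)/a)
theoretical = 1.0 - normal_cdf((y0 - b) / a)
assert abs(empirical - theoretical) < 0.01
```

For a > 0 the same check applies with F_X((y - b)/a) in place of the complement.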
Example 2: Y = X^2.

[Figure: the parabola y = x^2, with the two solutions x_1 = -\sqrt{y} and x_2 = +\sqrt{y} for a given y > 0.]

F_Y(y) = P(Y(\xi) \le y) = P(X^2(\xi) \le y).

If y < 0, the event \{X^2(\xi) \le y\} = \emptyset, and hence

F_Y(y) = 0,  y < 0.

For y > 0, as the figure shows, the event \{Y(\xi) \le y\} = \{X^2(\xi) \le y\} is equivalent to \{x_1 < X(\xi) \le x_2\}, where x_1 = -\sqrt{y} and x_2 = +\sqrt{y}.
Hence

F_Y(y) = P(x_1 < X(\xi) \le x_2) = F_X(x_2) - F_X(x_1) = F_X(\sqrt{y}) - F_X(-\sqrt{y}),  y > 0.

By direct differentiation, we get

f_Y(y) = \frac{1}{2\sqrt{y}} \left[ f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right],  y > 0,

and f_Y(y) = 0 otherwise. If f_X(x) is an even function, this reduces to

f_Y(y) = \frac{1}{\sqrt{y}} f_X(\sqrt{y})\,U(y).
In particular, if X \sim N(0, 1), so that

f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2},

then substituting this, we obtain the p.d.f of Y = X^2 to be

f_Y(y) = \frac{1}{\sqrt{2\pi y}} e^{-y/2}\,U(y).

This represents a Chi-square r.v. with n = 1, since \Gamma(1/2) = \sqrt{\pi}. Thus, if X is a Gaussian r.v. with \mu = 0, then Y = X^2 represents a Chi-square r.v. with one degree of freedom (n = 1).
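This can be confirmed by squaring standard normal samples. The following sketch (an illustration added here, with arbitrary seed and sample size) checks two consequences of the Chi-square(1) density: its mean is 1, and P(Y \le 1) = P(-1 \le X \le 1) = erf(1/\sqrt{2}):

```python
import math
import random

random.seed(2)
ys = [random.gauss(0.0, 1.0) ** 2 for _ in range(200_000)]

# a Chi-square r.v. with one degree of freedom has mean 1
mean_y = sum(ys) / len(ys)
assert abs(mean_y - 1.0) < 0.02

# P(Y <= 1) = P(-1 <= X <= 1) = erf(1/sqrt(2)) for X ~ N(0,1)
empirical = sum(y <= 1.0 for y in ys) / len(ys)
assert abs(empirical - math.erf(1.0 / math.sqrt(2.0))) < 0.01
```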
Note: As a general approach, given Y = g(X), first sketch the graph y = g(x) and determine the range space of y. Suppose a < y < b is the range space of y = g(x). Then clearly F_Y(y) = 0 for y < a and F_Y(y) = 1 for y > b, so that F_Y(y) can be nonzero only in a < y < b. Next, determine whether there are discontinuities in the range space of y; if so, evaluate P(Y(\xi) = y_i) at these discontinuities. In the continuous region of y, use the basic approach F_Y(y) = P(g(X(\xi)) \le y) and determine the appropriate events in terms of the r.v. X for every y. Finally, we must have F_Y(y) for -\infty < y < \infty, and we obtain

f_Y(y) = \frac{dF_Y(y)}{dy}  in  a < y < b.
However, if y = g(x) is a continuous function, it is easy to establish a direct procedure to obtain f_Y(y).

[Figure: a graph of g(x) cut by the horizontal strip (y, y + \Delta y], meeting it over the three intervals (x_1, x_1 + \Delta x_1], (x_2 + \Delta x_2, x_2] and (x_3, x_3 + \Delta x_3].]

P(y < Y(\xi) \le y + \Delta y) = \int_y^{y+\Delta y} f_Y(u)\,du \simeq f_Y(y)\,\Delta y.

When y < Y(\xi) \le y + \Delta y, the r.v. X could be in any one of the three mutually exclusive intervals

\{x_1 < X(\xi) \le x_1 + \Delta x_1\},  \{x_2 + \Delta x_2 < X(\xi) \le x_2\}  or  \{x_3 < X(\xi) \le x_3 + \Delta x_3\}.

Hence

P(y < Y(\xi) \le y + \Delta y) = P\{x_1 < X(\xi) \le x_1 + \Delta x_1\} + P\{x_2 + \Delta x_2 < X(\xi) \le x_2\} + P\{x_3 < X(\xi) \le x_3 + \Delta x_3\}.
For small \Delta y and \Delta x_i, this gives

f_Y(y)\,\Delta y = f_X(x_1)\,\Delta x_1 + f_X(x_2)(-\Delta x_2) + f_X(x_3)\,\Delta x_3.

In this case, \Delta x_1 > 0, \Delta x_2 < 0 and \Delta x_3 > 0, so that the above can be rewritten as

f_Y(y) = \sum_i f_X(x_i)\,\frac{|\Delta x_i|}{\Delta y} = \sum_i \frac{1}{|\Delta y / \Delta x_i|} f_X(x_i),

and as \Delta y \to 0,

f_Y(y) = \sum_i \frac{1}{|dy/dx|_{x_i}} f_X(x_i) = \sum_i \frac{1}{|g'(x_i)|} f_X(x_i).

For example, if Y = X^2, then for all y > 0, x_1 = -\sqrt{y} and x_2 = +\sqrt{y} represent the two solutions for each y. Moreover, dy/dx = 2x, so that |dy/dx|_{x_i} = 2\sqrt{y}. Therefore

f_Y(y) = \frac{1}{2\sqrt{y}} \left[ f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right],  y > 0,

and f_Y(y) = 0 otherwise, in agreement with the earlier result.
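The root-sum formula can be checked deterministically for this example. The sketch below (added for illustration) evaluates \sum_i f_X(x_i)/|g'(x_i)| at the two roots of x^2 = y for X ~ N(0,1), and compares it with the closed-form Chi-square(1) density derived earlier:

```python
import math

def phi(x):
    # standard normal pdf
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def f_Y_root_sum(y):
    # sum over the roots x_i of g(x) = y, with g(x) = x^2 and g'(x) = 2x
    roots = (math.sqrt(y), -math.sqrt(y))
    return sum(phi(x) / abs(2.0 * x) for x in roots)

def f_Y_chi2(y):
    # closed-form Chi-square(1) density from the earlier derivation
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)

for y in (0.1, 0.5, 1.0, 2.5, 5.0):
    assert abs(f_Y_root_sum(y) - f_Y_chi2(y)) < 1e-12
```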
Mean, Variance, Moments and Characteristic Functions

For a r.v. X, its p.d.f f_X(x) represents complete information about it, and for any Borel set B on the x-axis

P(X(\xi) \in B) = \int_B f_X(x)\,dx.

Note that f_X(x) represents very detailed information, and quite often it is desirable to characterize the r.v. in terms of its average behavior. In this context, we will introduce two parameters - mean and variance - that are universally used to represent the overall properties of the r.v. and its p.d.f.
Mean or the Expected Value of a r.v. X is defined as

\mu_X = \bar{X} = E(X) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx.

If X is a discrete-type r.v., then using f_X(x) = \sum_i p_i\,\delta(x - x_i) we get

\mu_X = \bar{X} = E(X) = \int x \sum_i p_i\,\delta(x - x_i)\,dx = \sum_i x_i p_i \int \delta(x - x_i)\,dx = \sum_i x_i p_i = \sum_i x_i P(X = x_i).

The mean represents the average value of the r.v. in a very large number of trials. For example, if X \sim U(a, b), then

E(X) = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{b-a}\,\frac{x^2}{2}\Big|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}

is the midpoint of the interval (a, b).
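A quick simulation confirms the uniform mean. The sketch below (arbitrary seed, endpoints, and sample size, chosen for illustration) averages uniform samples on (a, b):

```python
import random

random.seed(3)
a, b = 2.0, 10.0
xs = [random.uniform(a, b) for _ in range(200_000)]

# the sample mean should approach the midpoint (a + b) / 2 = 6.0
mean_x = sum(xs) / len(xs)
assert abs(mean_x - (a + b) / 2.0) < 0.05
```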
If X is exponential with parameter \lambda, then

E(X) = \int_0^{\infty} \frac{x}{\lambda}\,e^{-x/\lambda}\,dx = \lambda \int_0^{\infty} y\,e^{-y}\,dy = \lambda.

If X is Poisson with parameter \lambda, then

E(X) = \sum_{k=0}^{\infty} k\,P(X = k) = \sum_{k=0}^{\infty} k\,e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!} = \lambda\,e^{-\lambda} \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = \lambda\,e^{-\lambda}\,e^{\lambda} = \lambda.

If X is binomial, then

E(X) = \sum_{k=0}^{n} k\,P(X = k) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k} = \sum_{k=1}^{n} \frac{n!}{(n-k)!\,(k-1)!}\,p^k q^{n-k} = np \sum_{i=0}^{n-1} \frac{(n-1)!}{(n-i-1)!\,i!}\,p^i q^{n-i-1} = np\,(p + q)^{n-1} = np.
For the normal r.v. X \sim N(\mu, \sigma^2),

E(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} x\,e^{-(x-\mu)^2/2\sigma^2}\,dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} (y + \mu)\,e^{-y^2/2\sigma^2}\,dy
= \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} y\,e^{-y^2/2\sigma^2}\,dy + \mu\,\frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} e^{-y^2/2\sigma^2}\,dy = 0 + \mu = \mu,

where the first integral vanishes because its integrand is odd, and the second equals 1.

Given X with p.d.f f_X(x), suppose Y = g(X) defines a new r.v. with p.d.f f_Y(y). Then the new r.v. Y has a mean \mu_Y given by

\mu_Y = E(Y) = \int_{-\infty}^{\infty} y\,f_Y(y)\,dy.

It can be shown that

E(Y) = E(g(X)) = \int_{-\infty}^{\infty} y\,f_Y(y)\,dy = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx.

Note that f_Y(y) is not required to evaluate E(Y) for Y = g(X).


In the discrete case, E(Y) = \sum_i g(x_i)\,P(X = x_i).

Example: Y = X^2, where X is a Poisson r.v. Then

E(X^2) = \sum_{k=0}^{\infty} k^2\,P(X = k) = \sum_{k=0}^{\infty} k^2\,e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} k\,\frac{\lambda^k}{(k-1)!} = \lambda\,e^{-\lambda} \sum_{i=0}^{\infty} (i+1)\,\frac{\lambda^i}{i!}
= \lambda\,e^{-\lambda} \left( \sum_{i=0}^{\infty} i\,\frac{\lambda^i}{i!} + \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} \right) = \lambda\,e^{-\lambda} \left( \lambda \sum_{m=0}^{\infty} \frac{\lambda^m}{m!} + e^{\lambda} \right) = \lambda\,e^{-\lambda} (\lambda\,e^{\lambda} + e^{\lambda}) = \lambda^2 + \lambda.

In general, E(X^k) is known as the k-th moment of the r.v. X.
Mean alone will not be able to truly represent the p.d.f of any r.v. To illustrate this, consider two Gaussian r.v.s X_1 \sim N(0, 1) and X_2 \sim N(0, 10). Both have the same mean \mu = 0. However, as the figure shows, their p.d.f.s are quite different: one is concentrated around the mean, whereas the other (X_2) has a wider spread. Clearly, we need at least an additional parameter to measure this spread around the mean.

[Figure: (a) f_{X_1}(x_1) with \sigma_1^2 = 1; (b) f_{X_2}(x_2) with \sigma_2^2 = 10.]
For a r.v. X with mean \mu, X - \mu represents the deviation of the r.v. from its mean. Since this deviation can be either positive or negative, consider the quantity (X - \mu)^2; its average value E[(X - \mu)^2] represents the mean square deviation of X around its mean. Define

\sigma_X^2 = E[(X - \mu)^2] > 0.

With g(X) = (X - \mu)^2 and using E(g(X)) = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx,

\sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu)^2\,f_X(x)\,dx > 0.

\sigma_X^2 is known as the variance of the r.v. X, and its square root \sigma_X = \sqrt{E[(X - \mu)^2]} is known as the standard deviation of X. Note that the standard deviation represents the root mean square spread of the r.v. X around its mean \mu.
Expanding the above and using the linearity of the integrals, we get

Var(X) = \sigma_X^2 = \int_{-\infty}^{\infty} (x^2 - 2x\mu + \mu^2)\,f_X(x)\,dx = \int_{-\infty}^{\infty} x^2\,f_X(x)\,dx - 2\mu \int_{-\infty}^{\infty} x\,f_X(x)\,dx + \mu^2
= E(X^2) - \mu^2 = \overline{X^2} - \bar{X}^2 = E(X^2) - [E(X)]^2.

Thus, for example, returning to the Poisson r.v., we get

\sigma_X^2 = \overline{X^2} - \bar{X}^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.

Thus for a Poisson r.v., mean and variance are both equal to its parameter \lambda.
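The equality of the Poisson mean and variance can be seen in simulation. Since the standard library has no Poisson sampler, the sketch below uses Knuth's multiplication-of-uniforms method (an implementation choice made here, not taken from the lecture):

```python
import math
import random

def poisson_sample(lam, rng):
    # Knuth's method: multiply uniforms until the product drops below e^{-lam}
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(4)
lam = 4.0
xs = [poisson_sample(lam, rng) for _ in range(100_000)]

mean_x = sum(xs) / len(xs)
var_x = sum((x - mean_x) ** 2 for x in xs) / len(xs)
assert abs(mean_x - lam) < 0.05   # mean ~ lambda
assert abs(var_x - lam) < 0.1     # variance ~ lambda as well
```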
Variance of the normal r.v. N(\mu, \sigma^2):

Var(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2\,\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2}\,dx.

To simplify, we can make use of the identity

\int_{-\infty}^{\infty} f_X(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2}\,dx = 1

for a normal p.d.f. This gives

\int_{-\infty}^{\infty} e^{-(x-\mu)^2/2\sigma^2}\,dx = \sqrt{2\pi}\,\sigma.

Differentiating both sides with respect to \sigma, we get

\int_{-\infty}^{\infty} \frac{(x-\mu)^2}{\sigma^3}\,e^{-(x-\mu)^2/2\sigma^2}\,dx = \sqrt{2\pi},

or

\int_{-\infty}^{\infty} (x-\mu)^2\,\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2}\,dx = \sigma^2,

so that Var(X) = \sigma^2.

Note: In some cases, the mean and variance may not exist, e.g., for the Cauchy r.v.
Moments: m_n = \overline{X^n} = E(X^n), n \ge 1, are known as the moments of the r.v. X, and \mu_n = E[(X - \mu)^n] are known as the central moments of X. Clearly, the mean \mu = m_1 and the variance \sigma^2 = \mu_2. It is easy to relate m_n and \mu_n.

In general, the quantities E[(X - a)^n] are known as the generalized moments of X about a, and E[|X|^n] are known as the absolute moments of X.
Characteristic Function

The characteristic function of a r.v. X is defined as

\Phi_X(\omega) = E(e^{j\omega X}) = \int_{-\infty}^{\infty} e^{j\omega x}\,f_X(x)\,dx.

Thus \Phi_X(0) = 1 and |\Phi_X(\omega)| \le 1 for all \omega.

For discrete r.v.s the characteristic function reduces to

\Phi_X(\omega) = \sum_k e^{j\omega k}\,P(X = k).

Thus, for example, if X \sim P(\lambda) (Poisson), then its characteristic function is given by

\Phi_X(\omega) = \sum_{k=0}^{\infty} e^{j\omega k}\,e^{-\lambda}\,\frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^{j\omega})^k}{k!} = e^{-\lambda}\,e^{\lambda e^{j\omega}} = e^{\lambda(e^{j\omega} - 1)}.
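The Poisson characteristic function can be checked against its defining series. The sketch below (\lambda and \omega are arbitrary illustration values) truncates the series at k = 100 and compares it with the closed form:

```python
import cmath
import math

lam, omega = 2.0, 0.7

# defining series: sum_k e^{j omega k} e^{-lam} lam^k / k!
series = sum(cmath.exp(1j * omega * k) * math.exp(-lam) * lam**k / math.factorial(k)
             for k in range(100))
# closed form derived above: e^{lam (e^{j omega} - 1)}
closed_form = cmath.exp(lam * (cmath.exp(1j * omega) - 1.0))
assert abs(series - closed_form) < 1e-10
```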
Similarly, if X is a binomial r.v., its characteristic function is given by

\Phi_X(\omega) = \sum_{k=0}^{n} e^{j\omega k} \binom{n}{k} p^k q^{n-k} = \sum_{k=0}^{n} \binom{n}{k} (p\,e^{j\omega})^k q^{n-k} = (p\,e^{j\omega} + q)^n.
Characteristic functions are useful in computing the moments of a r.v.:

\Phi_X(\omega) = E(e^{j\omega X}) = E\left( \sum_{k=0}^{\infty} \frac{(j\omega X)^k}{k!} \right) = \sum_{k=0}^{\infty} j^k\,E(X^k)\,\frac{\omega^k}{k!}
= 1 + j\,E(X)\,\omega + j^2\,\frac{E(X^2)}{2!}\,\omega^2 + \cdots + j^k\,\frac{E(X^k)}{k!}\,\omega^k + \cdots.

Taking the first derivative with respect to \omega and setting \omega = 0, we get

\frac{\partial \Phi_X(\omega)}{\partial \omega}\Big|_{\omega=0} = j\,E(X)  or  E(X) = \frac{1}{j}\,\frac{\partial \Phi_X(\omega)}{\partial \omega}\Big|_{\omega=0}.

Similarly, the second derivative gives

E(X^2) = \frac{1}{j^2}\,\frac{\partial^2 \Phi_X(\omega)}{\partial \omega^2}\Big|_{\omega=0}.
Repeating this procedure k times, we obtain the k-th moment of X to be

E(X^k) = \frac{1}{j^k}\,\frac{\partial^k \Phi_X(\omega)}{\partial \omega^k}\Big|_{\omega=0},  k \ge 1.

We can use this technique to compute the mean, variance and other higher order moments of any r.v. X. For example, if X \sim P(\lambda), then \Phi_X(\omega) = e^{\lambda(e^{j\omega} - 1)}, so that

\frac{\partial \Phi_X(\omega)}{\partial \omega} = \Phi_X(\omega)\,\lambda j\,e^{j\omega},

and

\frac{\partial^2 \Phi_X(\omega)}{\partial \omega^2} = \Phi_X(\omega)\,(\lambda j\,e^{j\omega})^2 + \Phi_X(\omega)\,\lambda j^2\,e^{j\omega},

so that E(X) = \lambda and E(X^2) = \lambda^2 + \lambda, in agreement with the earlier results.
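The derivative formulas can be exercised numerically without any symbolic work. The sketch below (an illustration; \lambda and the step size h are arbitrary choices) approximates the first two derivatives of the Poisson characteristic function at \omega = 0 by central differences and recovers the moments:

```python
import cmath

lam = 3.0

def cf(omega):
    # Poisson characteristic function e^{lam (e^{j omega} - 1)}
    return cmath.exp(lam * (cmath.exp(1j * omega) - 1.0))

h = 1e-5
# central differences for the first and second derivative at omega = 0
d1 = (cf(h) - cf(-h)) / (2.0 * h)
d2 = (cf(h) - 2.0 * cf(0.0) + cf(-h)) / (h * h)

mean = (d1 / 1j).real           # E(X)   = (1/j)   dPhi/domega   at omega = 0
second = (d2 / (1j ** 2)).real  # E(X^2) = (1/j^2) d2Phi/domega2 at omega = 0
assert abs(mean - lam) < 1e-5
assert abs(second - (lam**2 + lam)) < 1e-3
```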
Chebychev Inequality

We seek a bound that estimates the dispersion of the r.v. beyond a certain interval centered around its mean. Since \sigma^2 measures the dispersion of the r.v. X around its mean \mu, we expect this bound to depend on \sigma^2 as well.

Consider an interval of width 2\epsilon symmetrically centered around the mean \mu, as in the figure. What is the probability that X falls outside this interval? We need

P(|X - \mu| \ge \epsilon) = ?
P| X   |   ?
X
   
X

Chebychev Inequality 2

To compute this probability, we can start with the definition of  2 .


  E ( X   )   

2 2
( x   ) 2 f X ( x )dx   ( x   ) 2 f X ( x )dx
 |x   |

   2 f X ( x )dx  2  f X ( x )dx   2 P  | X   |   .
|x   | |x   |

From above, we obtain the desired probability to be


2
P  | X   |    2 ,

and this is known as the Chebychev inequality. Interestingly, to
compute the above probability bound the knowledge of f X (x )is not
necessary. We only need  2
,the variance of the r.v. In particular with

P  | X   | k   2 .
1
  k we obtain
k 26
P| X   |   ?

Chebychev Inequality
Thus with k  3, we get the probability of X being outside the 3
interval around its mean to be 0.111 for any r.v. Obviously this cannot
be a tight bound as it includes all r.v.s. For example, in the case of a
Gaussian r.v., from table of error function, (   0,   1)
P | X | 3   0.0027.
which is much tighter than that given by Chebychev inequality
Chebychev inequality always underestimates the exact probability.
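Both claims, that the bound holds and that it is loose for the Gaussian, can be checked by simulation. The sketch below (arbitrary seed and sample size) estimates the 3\sigma tail of a standard Gaussian and compares it with the Chebychev bound 1/k^2 and with the exact value 2(1 - F_X(3)):

```python
import math
import random

random.seed(5)
k = 3.0
xs = [random.gauss(0.0, 1.0) for _ in range(1_000_000)]

# empirical tail probability P(|X| >= 3) for a standard Gaussian
tail = sum(abs(x) >= k for x in xs) / len(xs)

# the Chebychev bound 1/k^2 must hold...
assert tail <= 1.0 / k**2
# ...but the exact Gaussian tail 2(1 - Phi(3)) ~ 0.0027 is far smaller
exact = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(k / math.sqrt(2.0))))
assert abs(tail - exact) < 5e-4
```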
