Distribution Theory I
… generating function, weak and strong laws of large numbers, and the central limit theorem.
COURSE REQUIREMENTS:
This is a compulsory course for all statistics students. Students are expected to have
a minimum of 75% attendance to be able to write the final examination.
READING LIST:
(1) Introduction to the Theory of Statistics by Mood, A. M., Graybill, F. A. and Boes, D. C.
(3) Probability and Statistics by Spiegel, M. R., Schiller, J. and Srinivasan, R. A.
LECTURE NOTES
Definition 1:
Given a random experiment with a sample space C, a function X which assigns to each element c ∈ C one and only one real number X(c) = x is called a random variable.
Example: Let the random experiment be the tossing of a single coin and let the sample space be C = {c : c is Tail or c is Head}. Define X such that
X(c) = 0 if c is Tail
X(c) = 1 if c is Head.
Definition 2:
Given a random experiment with the sample space C, consider two random variables X1 and X2 which assign to each element c of C one and only one ordered pair of numbers: X1(c) = x1, X2(c) = x2.
Definition 3:
Given a random experiment with the sample space C, let the random variable Xi assign to each element c ∈ C one and only one real number Xi(c) = xi, i = 1, . . ., n. The space of these random variables is the set of ordered n-tuples {(x1, x2, . . ., xn)}.
Let X denote a r.v. with space 𝒜 and let A ⊂ 𝒜. We can compute P(A) = P(X ∈ A) for each A under consideration; that is, we can see how the probability is distributed over the various subsets of 𝒜.
Let X denote a r.v. with one-dimensional space A. Suppose the space A is a set of points such that there is at most a finite number of points of A in any finite interval; then such a set A is called a set of discrete points, and the r.v. X is also referred to as a discrete r.v.
Note that X takes distinct values x1, x2, . . ., xn, and its density function is denoted by f(x), where
f(x) = P(X = xi) if x = xi
f(x) = 0 if x ≠ xi,
i.e. f(x) ≥ 0 for all x ∈ A and Σ_{x ∈ A} f(x) = 1.
Let A be the one-dimensional space of a r.v. X; then X is called continuous if there exists a function f(x) such that
(1) f(x) ≥ 0 and ∫_A f(x) dx = 1;
(2) f(x) has at most a finite number of discontinuities in every finite interval (subset of A).
http://www.unaab.edu.ng
Equivalently, if 𝒜 is the space of the r.v. X and the probability set function P(A), A ⊂ 𝒜, can be expressed as P(A) = ∫_A f(x) dx, then X is said to be a r.v. of the continuous type with a continuous density function f(x).
Let the r.v. X have a one-dimensional space with probability set function P(A). For a real number x, taking A to be the interval from −∞ to x (which includes the point x itself), we have
F(x) = P(X ≤ x).
The function F(x) is called the cumulative distribution function (CDF), or simply the distribution function, of X, and
F(x) = ∫_{−∞}^{x} f(w) dw (for a continuous r.v. X).
b. F(x) is a monotonic, non-decreasing function, i.e. F(a) ≤ F(b) for a < b.
c. F(x) is continuous from the right, i.e. lim F(x + h) = F(x) for h > 0 and h small.
Note: in F(x) = P(X ≤ x), the equality makes F continuous from the right, while F(x) = P(X < x) would make it continuous from the left.
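These properties can be checked numerically. The sketch below (Python; the density f(x) = e^(−x), x ≥ 0, is a hypothetical choice used only for illustration) builds F(x) by numerical integration and confirms it is non-decreasing and bounded by 1:

```python
import math

# Hypothetical density for illustration: f(x) = e^(-x) for x >= 0
# (a unit-rate exponential), f(x) = 0 otherwise.
def f(x):
    return math.exp(-x) if x >= 0 else 0.0

def F(x, steps=10_000):
    """Approximate F(x) = P(X <= x) by trapezoid-rule integration of f.
    Since f vanishes for x < 0, it is enough to integrate over [0, x]."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = 0.5 * (f(0.0) + f(x))
    for i in range(1, steps):
        total += f(i * h)
    return total * h

values = [F(x) for x in (0.5, 1.0, 2.0, 5.0)]
monotone = all(u <= v for u, v in zip(values, values[1:]))  # property (b)
```

For this density F(x) = 1 − e^(−x), so e.g. F(1) ≈ 0.632, and the computed values increase toward 1.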
Assignment
1. The probability distribution function of the time between successive customer arrivals at a petrol station is given by
f(x) = 0, x < 0
f(x) = 10e^{−10x}, 0 ≤ x < ∞.
Find:
b. P(X < 1)
Definition 1: Joint Discrete Density Function: If (X1, X2, . . ., Xk) is a k-dimensional discrete r.v., then the joint discrete density function of (X1, X2, . . ., Xk), denoted by fX1,X2,...,Xk(x1, x2, . . ., xk), satisfies
Σ fX1,X2,...,Xk(x1, x2, . . ., xk) = 1,
where the summation is over all possible values of (X1, X2, . . ., Xk).
Definition 2: Marginal Discrete Density Function: If X and Y are joint discrete r.v., then fX(x) and fY(y) are called marginal discrete density functions, where
fX(x) = Σy fX,Y(x, y) and fY(y) = Σx fX,Y(x, y).
Also let (X, Y) be joint continuous r.v. with joint probability density function fX,Y(x, y); then
P[(X, Y) ∈ A] = ∫∫_A fX,Y(x, y) dx dy
and, in particular,
P[a1 < X ≤ b1; a2 < Y ≤ b2] = ∫_{a2}^{b2} [ ∫_{a1}^{b1} fX,Y(x, y) dx ] dy.
Assignment
Given
If X and Y are joint continuous r.v., then fX(x) and fY(y) are called marginal density functions, where
fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy and fY(y) = ∫_{−∞}^{∞} fX,Y(x, y) dx.
(b) Find the marginal density function of Y and hence obtain P(Y = 2).
Conditional discrete density function: Let X and Y be joint discrete r.v. with joint discrete density function fX,Y(x, y); then the conditional discrete density function of Y given X = x, denoted by fY|X(y|x), is defined as
fY|X(y|x) = fX,Y(x, y) / fX(x)
and similarly
fX|Y(x|y) = fX,Y(x, y) / fY(y) = P[X = x, Y = y] / P(Y = y) = P(X = x | Y = y).
Note that
Σy fY|X(y|x) = Σy fX,Y(x, y) / fX(x) = fX(x) / fX(x) = 1
and, in the continuous case,
∫_{−∞}^{∞} fY|X(y|x) dy = (∫_{−∞}^{∞} fX,Y(x, y) dy) / fX(x) = fX(x) / fX(x) = 1.
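A minimal sketch of these identities, assuming a small hypothetical joint discrete table f(x, y) (values chosen only so they sum to 1): each conditional slice fY|X(·|x) should sum to 1.

```python
# A toy joint discrete density (hypothetical values, chosen to sum to 1):
# f(x, y) for x in {0, 1} and y in {0, 1, 2}.
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

# Marginal f_X(x) = sum over y of f(x, y)
f_x = {}
for (x, y), p in joint.items():
    f_x[x] = f_x.get(x, 0.0) + p

# Conditional f_{Y|X}(y|x) = f(x, y) / f_X(x); each x-slice must sum to 1
cond_sums = {}
for (x, y), p in joint.items():
    cond_sums[x] = cond_sums.get(x, 0.0) + p / f_x[x]
```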
Definition: Let X1, X2,. . . , Xk be a k- dimensional continuous (or discrete) r.v. with joint
density function
fX1,X2,...,Xk(x1, x2, . . ., xk) and marginal density functions fXi(xi); then X1, X2, . . ., Xk are said to be (stochastically) independent iff
fX1,X2,...,Xk(x1, x2, . . ., xk) = Π_{i=1}^{k} fXi(xi) ∀ xi.
For example, the r.v.s X1 and X2 with joint density function fX1,X2(x1, x2) and marginal density functions fX1(x1) and fX2(x2) are independent iff fX1,X2(x1, x2) = fX1(x1) fX2(x2).
Note that
f2(x2) = ∫_{−∞}^{∞} f(x1, x2) dx1 = ∫_{−∞}^{∞} f(x2|x1) f1(x1) dx1,
and when f(x2|x1) does not depend on x1 (as under independence),
f2(x2) = f(x2|x1) ∫_{−∞}^{∞} f1(x1) dx1 = f(x2|x1).
Exercise
Theorem: Let the r.v. X1 and X2 have the joint density function f(x1, x2); then X1, X2 are said to be stochastically independent iff f(x1, x2) can be written as the product of non-negative functions of x1 alone and of x2 alone, i.e. f(x1, x2) = g(x1)h(x2).
Proof:
If X1 and X2 are S.I., then f(x1, x2) = f1(x1)f2(x2), where f1(x1) and f2(x2) are the (non-negative) marginal density functions.
Conversely
If f(x1,x2) = g(x1)h(x2), then for the r.v. of the continuous type, we have
f1(x1) = ∫_{−∞}^{∞} g(x1)h(x2) dx2 = g(x1) ∫_{−∞}^{∞} h(x2) dx2 = c1 g(x1)
f2(x2) = ∫_{−∞}^{∞} g(x1)h(x2) dx1 = h(x2) ∫_{−∞}^{∞} g(x1) dx1 = c2 h(x2)
But
∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x1, x2) dx1 dx2 = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x1)h(x2) dx1 dx2 = 1, since f is a pdf,
⇒ [∫_{−∞}^{∞} g(x1) dx1][∫_{−∞}^{∞} h(x2) dx2] = c1 c2
⇒ c1 c2 = 1.
i.e. f(x1, x2) = g(x1)h(x2) = c1 c2 g(x1)h(x2) = [c1 g(x1)][c2 h(x2)] = f1(x1) f2(x2), i.e. X1 and X2 are S.I.
Theorem 2: If X1 and X2 are S.I. with marginal pdf f1(x1) and f2(x2) respectively, then
P(a<x1<b, c<x2<d) = p(a<x1<b)p(c<x2<d) for a<b and c<d and a, b, c, d are constants.
Since f(x1, x2) = f1(x1) f2(x2),
P(a < X1 < b; c < X2 < d) = ∫_{c}^{d} ∫_{a}^{b} f(x1, x2) dx1 dx2
= ∫_{c}^{d} ∫_{a}^{b} f1(x1) f2(x2) dx1 dx2
= [∫_{a}^{b} f1(x1) dx1][∫_{c}^{d} f2(x2) dx2]
= P(a < X1 < b) P(c < X2 < d).
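A numerical sketch of Theorem 2, assuming two illustrative densities f1(x) = 2x and f2(y) = 3y² on (0, 1) for independent X1 and X2 (hypothetical choices): the rectangle probability computed from the joint density equals the product of the one-dimensional probabilities.

```python
# Sketch: hypothetical densities f1(x) = 2x and f2(y) = 3y^2 on (0, 1) for
# independent X1 and X2, so the joint density is f1(x) * f2(y).
def f1(x):
    return 2.0 * x

def f2(y):
    return 3.0 * y * y

def integral(g, lo, hi, steps=2000):
    """Trapezoid-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    s = 0.5 * (g(lo) + g(hi)) + sum(g(lo + i * h) for i in range(1, steps))
    return s * h

a, b, c, d = 0.2, 0.7, 0.1, 0.9

p_x1 = integral(f1, a, b)                       # P(a < X1 < b)
# Iterated (double) integral of the joint density over the rectangle
joint_prob = integral(lambda y: p_x1 * f2(y), c, d)
product = p_x1 * integral(f2, c, d)             # P(a < X1 < b) P(c < X2 < d)
```

Here P(0.2 < X1 < 0.7) = 0.7² − 0.2² and P(0.1 < X2 < 0.9) = 0.9³ − 0.1³, so both computations give the same number.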
Exercise:
DERIVED DISTRIBUTIONS
Y = a + bX -------------------------------------- (1)
Since X is a r.v., so is Y.
Suppose we wish to find the density function of Y. Let f(x) be the density function of X and F(x) its distribution function, and take b > 0. Then
P(Y ≤ y) = P(a + bX ≤ y)
= P(X ≤ (y − a)/b) ......................................... (2)
so that
G(y) = F((y − a)/b) ............................................ (3)
and
g(y) = dG(y)/dy = (d/dy) ∫_{−∞}^{(y−a)/b} f(x) dx = (1/b) f((y − a)/b).
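A quick Monte Carlo check of (2) and (3), under assumed illustrative choices X ~ Exp(1), a = 2, b = 3 (so b > 0): the simulated P(Y ≤ y) should match F((y − a)/b).

```python
import math
import random

# Monte Carlo check of (2)-(3), under assumed choices X ~ Exp(1),
# a = 2, b = 3 (so b > 0): P(Y <= y) should match F((y - a)/b),
# where F is the exponential CDF, F(x) = 1 - e^(-x).
random.seed(0)
a, b = 2.0, 3.0
n = 200_000
ys = [a + b * random.expovariate(1.0) for _ in range(n)]

y0 = 5.0  # check the distribution function at one point
empirical = sum(1 for y in ys if y <= y0) / n
analytic = 1.0 - math.exp(-(y0 - a) / b)  # F((y0 - a)/b)
```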
Generalization of (1):
Let Y = φ(X) ................................................ (4)
and let the inverse transformation be denoted by
X = ψ(Y) = φ⁻¹(Y) .......................................... (5)
The transformations in equations (4) and (5) are said to be 1-1 if for any value of X, φ(X) yields one and only one value of Y, and if for any value of Y, ψ(Y) yields one and only one value of X.
Suppose X has pdf f(x) for α < x < β, with f(x) = 0 elsewhere. Then the pdf of Y is
g(y) = f(ψ(y)) |dψ(y)/dy|, α1 < y < α2,
where α1 = min[φ(α), φ(β)] and α2 = max[φ(α), φ(β)].
For an increasing transformation,
G(y) = P(Y ≤ y) = P(φ(X) ≤ y) = P(X ≤ ψ(y)) = F(ψ(y)) = ∫_{−∞}^{ψ(y)} f(x) dx,
so
g(y) = dG(y)/dy = (d/dy) ∫_{−∞}^{ψ(y)} f(x) dx = f(ψ(y)) (dψ(y)/dy), φ(α) ≤ y ≤ φ(β).
For a decreasing transformation,
G(y) = P(Y ≤ y) = P(φ(X) ≤ y) = P(X ≥ ψ(y)) = 1 − F(ψ(y)),
so
g(y) = dG(y)/dy = −(dψ(y)/dy) f(ψ(y)), φ(β) ≤ y ≤ φ(α).
Since dψ(y)/dy < 0 in this case, g(y) ≥ 0; in both cases,
g(y) = |dψ(y)/dy| f(ψ(y)), α1 < y < α2.
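As a concrete sketch of the change-of-variable formula, assume for illustration that X has pdf f(x) = 2(1 − x) on (0, 1) and Y = X², an increasing 1-1 map there. The inverse is ψ(y) = √y with dψ/dy = 1/(2√y), so g(y) = (1 − √y)/√y, which should integrate to 1:

```python
import math

# Sketch: X has pdf f(x) = 2(1 - x) on (0, 1) and Y = X^2, which is an
# increasing 1-1 map there. Inverse: x = psi(y) = sqrt(y), with
# dpsi/dy = 1/(2 sqrt(y)), so
#   g(y) = f(sqrt(y)) * 1/(2 sqrt(y)) = (1 - sqrt(y)) / sqrt(y),  0 < y < 1.
def g(y):
    return (1.0 - math.sqrt(y)) / math.sqrt(y)

# g must integrate to 1 on (0, 1); the midpoint rule avoids the
# integrable singularity at y = 0.
steps = 200_000
h = 1.0 / steps
total = sum(g((i + 0.5) * h) for i in range(steps)) * h
```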
Let X be a r.v. of the discrete type with pdf f(x). Let A denote the set of discrete points for which f(x) > 0, and let Y = ν(X) be a 1-1 transformation that maps A onto β, with inverse x = ω(y). Thus
g(y) = f(ω(y)), y ∈ β, and g(y) = 0 elsewhere.
ASSIGNMENT
Let f(x1, x2) be the joint pdf of two discrete r.v.s X1 and X2, with A the set of points at which f(x1, x2) > 0. Let y1 = u1(x1, x2) and y2 = u2(x1, x2) define a 1-1 transformation with joint pdf g(y1, y2). From the joint pdf g(y1, y2), we then obtain the marginal pdf of Y1 by summing over y2, and vice versa.
Let X be a r.v. of the continuous type with pdf f(x). Let A be the one-dimensional space for which f(x) > 0. Consider a 1-1 transformation which maps the set A onto the set β. Let the inverse of Y = ν(X) be denoted by x = ω(y), and let the derivative dx/dy = ω′(y) be continuous and non-vanishing for all points y ∈ β. Then the pdf of Y = ν(X) is given by
g(y) = f(ω(y)) |ω′(y)|, y ∈ β
= 0 elsewhere.
Exercise
(a)
(b)
find pdf of
(c)
=0 elsewhere
find pdf of
The method of finding the pdf of a function of one r.v. can be extended to two or more r.v.s of the continuous type.
Let y1 = u1(x1, x2) and y2 = u2(x1, x2) define a 1-1 transformation which maps a 2-dimensional set A in the x1, x2 plane into a set β in the y1, y2 plane, with inverse x1 = w1(y1, y2), x2 = w2(y1, y2). The Jacobian of the transformation is
J = | ∂x1/∂y1  ∂x1/∂y2 |
    | ∂x2/∂y1  ∂x2/∂y2 |
It is assumed that these first-order partial derivatives are continuous and that J does not vanish identically in β.
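A small sketch of the Jacobian computation for a hypothetical 1-1 linear transformation y1 = x1 + x2, y2 = x1 − x2 (an assumed example, not one from the exercises):

```python
# Sketch: for the hypothetical 1-1 transformation
#   y1 = x1 + x2,  y2 = x1 - x2,
# the inverse is x1 = (y1 + y2)/2, x2 = (y1 - y2)/2, so the matrix of
# first-order partial derivatives dxi/dyj is constant.
def det2(m):
    """Determinant of a 2x2 matrix [[a, b], [c, d]]."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

J = [[0.5, 0.5],    # dx1/dy1, dx1/dy2
     [0.5, -0.5]]   # dx2/dy1, dx2/dy2
abs_det = abs(det2(J))  # |J| = 1/2 multiplies the joint density of (Y1, Y2)
```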
Exercise 1:
Let X1 and X2 denote a random sample from the distribution. Obtain the marginal density functions, respectively.
Exercise 2:
Given
Mathematical Expectation
Let X be a r.v. with pdf f(x) and let V(X) be a function of X such that ∫_{−∞}^{∞} V(x) f(x) dx exists. This integral (or the corresponding summation, as the case may be) is called the mathematical expectation or expected value of V(X), and it is denoted by E[V(X)]. It is required that the integral or sum converge absolutely. More generally, let X1, X2, . . ., Xn be r.v.s with joint pdf f(x1, x2, . . ., xn) and let V(x1, x2, . . ., xn) be a function of these variables such that the n-fold integral
∫_{−∞}^{∞} . . . ∫_{−∞}^{∞} V(x1, x2, . . ., xn) f(x1, x2, . . ., xn) dx1 dx2 . . . dxn
exists, if the r.v.s are of the continuous type. The n-fold integral (or the n-fold summation) is called the mathematical expectation of V(X1, X2, . . ., Xn).
Example: Let f(x) = 2(1 − x), 0 < x < 1, and 0 elsewhere. Then
E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_{0}^{1} 2x(1 − x) dx = 1/3
E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫_{0}^{1} 2x²(1 − x) dx = 1/6
V(X) = E(X²) − [E(X)]² = 1/6 − (1/3)² = 1/6 − 1/9 = 1/18
E(6X + 3X²) = 6E(X) + 3E(X²) = 6(1/3) + 3(1/6) = 2 + 1/2 = 2½.
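The worked example can be verified numerically (midpoint-rule integration in Python):

```python
# Numerical check of the worked example: f(x) = 2(1 - x) on (0, 1).
def f(x):
    return 2.0 * (1.0 - x)

def expect(g, steps=20_000):
    """Midpoint-rule approximation of E[g(X)] = integral of g(x) f(x) on (0, 1)."""
    h = 1.0 / steps
    return sum(g((i + 0.5) * h) * f((i + 0.5) * h) for i in range(steps)) * h

e_x = expect(lambda x: x)          # ~ 1/3
e_x2 = expect(lambda x: x * x)     # ~ 1/6
var = e_x2 - e_x ** 2              # ~ 1/18
e_combo = 6 * e_x + 3 * e_x2       # ~ 2 1/2
```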
For the discrete r.v. with
f(x) = x/6, x = 1, 2, 3,
E(X³) = Σ_{x=1,2,3} x³ f(x) = (1/6)(1 + 16 + 81) = 98/6.
E(XY²) = ? For the joint pdf f(x, y) = x + y, 0 < x < 1, 0 < y < 1,
E(XY²) = ∫∫ xy² f(x, y) dx dy = ∫_{0}^{1} ∫_{0}^{1} xy²(x + y) dx dy = 17/72.
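A numerical check of E(XY²) = 17/72 for f(x, y) = x + y on the unit square:

```python
# Numerical check: f(x, y) = x + y on the unit square, E(XY^2) = 17/72.
steps = 400
h = 1.0 / steps
total = 0.0
for i in range(steps):
    x = (i + 0.5) * h  # midpoint in x
    for j in range(steps):
        y = (j + 0.5) * h  # midpoint in y
        total += x * y * y * (x + y) * h * h
```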
Let X1, X2, … be a set of independent r.v.s distributed in the same form with mean μ and variance σ², and let X̄n denote the sample mean,
i.e. X̄n = (1/n) Σ_{i=1}^{n} Xi.
The weak law of large numbers states that X̄n becomes more and more narrowly concentrated about μ,
i.e. lim_{n→∞} P{|X̄n − μ| > ε} = 0, ε > 0.
Since V(X̄n) = σ²/n, the Chebyshev inequality gives
P{|X̄n − μ| > ε} ≤ σ²/(nε²)
or
P{(X̄n − μ)² > ε²} ≤ σ²/(nε²).
Equivalently, for a r.v. X with mean μ and variance σ²,
P{|X − μ| > εσ} = P{(X − μ)² ≥ ε²σ²} ≤ 1/ε²
or
P{|X − μ| < εσ} ≥ 1 − 1/ε².
Note:
P[g(X) ≥ k] ≤ E(g(X))/k, ∀ k > 0.
Theorem: Let g(X) be a non-negative function of a r.v. X. If E(g(X)) exists, then for any positive constant ε,
P[g(X) ≥ ε] ≤ E(g(X))/ε.
Proof:
E(g(X)) = ∫_{−∞}^{∞} g(x) f(x) dx
= ∫_A g(x) f(x) dx + ∫_{A^c} g(x) f(x) dx,
where A = {x : g(x) ≥ ε} and A^c = {x : g(x) < ε}. Hence
E(g(X)) ≥ ∫_A g(x) f(x) dx
≥ ∫_A ε f(x) dx, since g(x) ≥ ε on A,
= ε ∫_A f(x) dx = ε P[g(X) ≥ ε].
i.e. E[g(X)]/ε ≥ P[g(X) ≥ ε]
or P[g(X) ≥ ε] ≤ E[g(X)]/ε, ∀ ε > 0.
Replacing ε by a constant K > 0,
P[g(X) ≥ K] ≤ E(g(X))/K.
Let g(X) = (X − μ)² and K = ε²σ²;
i.e. P[(X − μ)² ≥ ε²σ²] ≤ σ²/(ε²σ²) = 1/ε².
Let g(X) = (X̄n − μ)² and K = ε²; then
P(|X̄n − μ| > ε) = P[(X̄n − μ)² > ε²] ≤ E[(X̄n − μ)²]/ε² = σ²/(nε²)
and
lim_{n→∞} P[|X̄n − μ| > ε] ≤ lim_{n→∞} σ²/(nε²) = 0.
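A Monte Carlo sketch of the weak law, assuming Xi ~ Uniform(0, 1) for illustration (so μ = 1/2 and σ² = 1/12): the exceedance probability shrinks with n and stays below the Chebyshev bound σ²/(nε²).

```python
import random

# Monte Carlo sketch (assumed illustration): Xi ~ Uniform(0, 1), so
# mu = 1/2 and sigma^2 = 1/12. Estimate P(|Xbar_n - mu| > eps) for two
# sample sizes and compare with the Chebyshev bound sigma^2 / (n eps^2).
random.seed(1)
mu, var, eps = 0.5, 1.0 / 12.0, 0.05
reps = 2000  # Monte Carlo repetitions per sample size

def exceed_prob(n):
    count = 0
    for _ in range(reps):
        xbar = sum(random.random() for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            count += 1
    return count / reps

p_small = exceed_prob(20)    # larger for small n
p_large = exceed_prob(200)   # shrinks as n grows
bound_large = var / (200 * eps * eps)  # Chebyshev bound at n = 200
```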
Suppose we require
P{|X̄n − μ| < ε} ≥ 1 − δ
or
P[−ε < X̄n − μ < ε] ≥ 1 − δ.
Again using the Chebyshev inequality,
P[−ε < X̄n − μ < ε] = P[|X̄n − μ| < ε] = P[(X̄n − μ)² < ε²] ≥ 1 − E[(X̄n − μ)²]/ε²
= 1 − (1/n)σ²/ε² ≥ 1 − δ,
where δ ≥ σ²/(nε²), or n ≥ σ²/(δε²).
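The bound n ≥ σ²/(δε²) can be wrapped in a small helper; the numbers below are illustrative assumptions, not those of the exercises that follow.

```python
import math

# From the derivation above: Chebyshev guarantees
#   P(|Xbar_n - mu| < eps) >= 1 - delta  whenever  n >= sigma^2/(delta eps^2).
def chebyshev_sample_size(sigma2, eps, delta):
    """Smallest integer n satisfying the Chebyshev bound."""
    return math.ceil(sigma2 / (delta * eps * eps))

# Illustrative (hypothetical) numbers: sigma^2 = 4, eps = 0.5, delta = 0.1
n = chebyshev_sample_size(sigma2=4.0, eps=0.5, delta=0.1)  # 4/(0.1*0.25) = 160
```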
Exercise 1:
Suppose that a sample is drawn from some distribution with an unknown mean and variance equal to unity. How large a sample must be taken in order that the probability will be at least 0.95 that the sample mean X̄n will lie within 0.5 of the population mean?
Exercise 2:
How large a sample must be taken in order that you are 99% certain that X̄n is within 0.56 of μ?
The weak law of large numbers states a limiting property of sums of r.v.s, but the strong law of large numbers states something about the behavior of the sequence Sn = Σ_{i=1}^{n} Xi for all n:
Sn/n → μ almost surely (a.s.) as n → ∞.
Let X1, X2, . . ., Xn be a random sample from a distribution with mean μ and finite variance σ² (and whose mgf exists), and let
Zn = (X̄n − μ)/(σ/√n).
Then
Zn → N(0, 1) as n → ∞.
Proof
Note that the moment generating function of a standard normal distribution is given as
M(t) = e^{t²/2}.
It therefore suffices to show that
M_{Zn}(t) → M(t) as n → ∞.
M_{Zn}(t) = E(e^{tZn}) = E[exp(t(X̄n − μ)/(σ/√n))]
= E[exp(t√n(X̄n − μ)/σ)]
= E[exp((t/(σ√n)) Σ_{i=1}^{n}(Xi − μ))]
= E[exp((t/√n) Σ_{i=1}^{n}(Xi − μ)/σ)]
= E[Π_{i=1}^{n} exp((t/√n)(Xi − μ)/σ)].
Let yi = (Xi − μ)/σ. Then
M_{Zn}(t) = E[Π_{i=1}^{n} exp((t/√n)yi)] = Π_{i=1}^{n} E[exp((t/√n)yi)] (by independence)
= Π_{i=1}^{n} M_{yi}(t/√n) = [M_Y(t/√n)]^n.
Note
M_Y(t) = E(e^{tY}) = Σ_{j=0}^{∞} (t^j/j!) E(Y^j) = 1 + tE(Y) + (t²/2)E(Y²) + . . .
Since E(Y) = 0 and E(Y²) = 1, M_Y(t/√n) = 1 + t²/(2n) + terms of order smaller than 1/n, so
[M_Y(t/√n)]^n → e^{t²/2} as n → ∞.
This is the same mgf as that of a standard normal distribution; hence Zn → N(0, 1) as n → ∞.
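A Monte Carlo sketch of the theorem, assuming Xi ~ Exp(1) for illustration (so μ = 1, σ = 1): the standardized sample means should look approximately N(0, 1).

```python
import math
import random

# Monte Carlo sketch (assumed illustration): Xi ~ Exp(1), so mu = 1 and
# sigma = 1. For each repetition, standardize the sample mean:
#   Z_n = (Xbar_n - mu) / (sigma / sqrt(n))
random.seed(7)
n, reps = 400, 4000
zs = []
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    zs.append((xbar - 1.0) * math.sqrt(n))

mean_z = sum(zs) / reps                               # should be near 0
var_z = sum(z * z for z in zs) / reps - mean_z ** 2   # should be near 1
within1 = sum(1 for z in zs if abs(z) <= 1) / reps    # ~0.683 for N(0, 1)
```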
Assignment
(a) Find the mgf of the following distributions and hence the mean and variance:
Chi-square.
(b) Suppose
(c) Assume X1, X2, . . ., Xn are i.i.d. and distributed as exponential. Find the distribution of Sn = Σ_{i=1}^{n} Xi.