
STA 2200: PROBABILITY AND STATISTICS II

August 6, 2014

COURSE OUTLINE

PROBABILITY DISTRIBUTION OF CONTINUOUS RANDOM VARIABLES

Consider a continuous random variable assuming a particular value. Because there are infinitely many values (even in a small interval) that a random variable can assume, the probability of occurrence of any one value x_i is

P(X = x_i) = 0

(intuitively, the relative frequency µ_i/n, where µ_i is the number of occurrences of the value x_i, tends to zero). The fact that this probability is zero does not mean that the occurrence is impossible.
This suggests that we need another way of expressing probability. This is done through the cumulative distribution function (CDF):

F(µ) = ∫_{−∞}^{µ} f(x) dx

where f(x) is a continuous, differentiable function called the probability density function (pdf) of the continuous random variable X.
For f(x) to be a pdf it must satisfy the following conditions:

1. f(x) ≥ 0

2. ∫_{−∞}^{∞} f(x) dx = 1

With the above definition of the distribution function, we can derive probabilities for a continuous random variable.
Suppose f(x) = 3x², 0 ≤ x ≤ 1; 0 elsewhere.

1. Find a such that p(X ≤ a) = p(X ≥ a)

∫_0^a f(x) dx = ∫_a^1 f(x) dx
∫_0^a 3x² dx = ∫_a^1 3x² dx
x³ |_0^a = x³ |_a^1
a³ − 0 = 1 − a³
2a³ = 1
⇒ a = 0.5^{1/3} ≈ 0.794

2. Find b such that p(X > b) = 0.05

∫_b^1 f(x) dx = 0.05
∫_b^1 3x² dx = 0.05
x³ |_b^1 = 0.05
1 − b³ = 0.05
b³ = 0.95
⇒ b = 0.95^{1/3} ≈ 0.983
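The two answers above can be checked numerically using the CDF of this pdf, F(x) = x³. The names F, a and b below are ours, introduced only for this sketch:

```python
# Sanity check for f(x) = 3x^2 on [0, 1], whose CDF is F(x) = x^3.
F = lambda x: x**3

# (1) p(X <= a) = p(X >= a)  =>  F(a) = 1/2  =>  a = 0.5**(1/3)
a = 0.5 ** (1 / 3)
print(round(a, 4))                       # about 0.7937

# (2) p(X > b) = 0.05  =>  1 - F(b) = 0.05  =>  b = 0.95**(1/3)
b = 0.95 ** (1 / 3)
print(round(b, 3))                       # 0.983
print(abs(F(a) - (1 - F(a))) < 1e-12)    # the two tail probabilities agree
```
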

3. A cumulative distribution function is given by

F(x) = 1 − e^{−2x}, x > 0; 0 elsewhere

(a) Derive the pdf from the CDF

(b) Show that the derived function is actually a pdf

Solution

i. By definition

F(µ) = ∫_{−∞}^{µ} f(x) dx

i.e. the CDF is obtained by integrating the pdf. Given a CDF, we therefore derive the pdf by differentiating F(x) with respect to x:

f(x) = d/dx F(x)
     = d/dx [1 − e^{−2x}]
     = 2e^{−2x}

⇒ f(x) = 2e^{−2x}, x > 0; 0 elsewhere

ii. Now f(x) = 2e^{−2x} ≥ 0 for all x > 0, and

∫_0^∞ 2e^{−2x} dx = −e^{−2x} |_0^∞
                  = −[e^{−∞} − 1]
                  = 1

so both conditions for a pdf hold.

4. Let X be a continuous random variable with pdf

f(x) = x/2, 0 ≤ x ≤ 2; 0 elsewhere

Obtain the cdf of X and sketch its graph.

Solution

F(µ) = ∫_{−∞}^{µ} f(x) dx
     = 0.5 ∫_0^µ x dx
     = (1/4) x² |_0^µ
     = µ²/4

⇒ F(µ) = µ²/4 for 0 ≤ µ ≤ 2, with F(µ) = 0 for µ < 0 and F(µ) = 1 for µ > 2.

µ      <0   0    1    2   >2
F(µ)    0   0   1/4   1    1

5. Let X be a continuous random variable with pdf

f(x) = ax,         0 ≤ x ≤ 1
       a,          1 ≤ x ≤ 2
       −ax + 3a,   2 ≤ x ≤ 3
       0,          elsewhere

(a) Determine the constant a

(b) Compute p(X ≤ 1.5)

Solution

(a) Since f(x) is a pdf,

∫_0^3 f(x) dx = ∫_0^1 ax dx + ∫_1^2 a dx + ∫_2^3 (−ax + 3a) dx = 1
⇒ (ax²/2) |_0^1 + ax |_1^2 + (−ax²/2 + 3ax) |_2^3 = 1
a/2 + a − (5/2)a + 3a = 1
2a = 1
⇒ a = 1/2

(b) p(X ≤ 1.5)

∫_0^{1.5} f(x) dx = ∫_0^1 ax dx + ∫_1^{1.5} a dx
= (1/2) ∫_0^1 x dx + (1/2) ∫_1^{1.5} dx
= (1/2)(x²/2) |_0^1 + (1/2) x |_1^{1.5}
= 1/4 + (1/2)(3/2 − 1)
= 1/4 + 1/4 = 0.5
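As a rough numerical cross-check of a = 1/2 and p(X ≤ 1.5), one can integrate the piecewise pdf with a midpoint Riemann sum. The helper `integrate` is our own sketch, not a library routine:

```python
# Midpoint-rule check of the piecewise pdf with a = 1/2.
def f(x, a=0.5):
    if 0 <= x <= 1:
        return a * x
    if 1 < x <= 2:
        return a
    if 2 < x <= 3:
        return -a * x + 3 * a
    return 0.0

def integrate(g, lo, hi, n=60_000):
    # simple midpoint Riemann sum (crude but fine for a piecewise-linear f)
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

print(round(integrate(f, 0, 3), 4))     # total probability, about 1.0
print(round(integrate(f, 0, 1.5), 4))   # p(X <= 1.5), about 0.5
```
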

Exercise
Which of the following functions can represent a pdf?

1. f(x) = 2 − x, 1 ≤ x ≤ 2; 0 elsewhere

2. f(x) = 2 − x, 0 ≤ x ≤ 3; 0 elsewhere

3. f(x) = 2x/3,        0 ≤ x ≤ 1
          (2 − x)/3,   1 < x ≤ 2
          x − 2,       2 < x ≤ 3
          0,           elsewhere

4. For any of 1, 2, 3 where you find it is a pdf, find the CDF.

MEASURES OF CENTRAL TENDENCY

The pth quantile of a random variable X (or of its corresponding distribution), denoted ξ_p, is defined as the smallest number ξ_p such that F_X(ξ_p) ≥ p, 0 < p < 1.

MEDIAN

The median of a random variable X, denoted med(X) or ξ_{0.5}, is the 0.5th quantile. Equivalently, the median of a random variable X is a value x such that p(X ≤ x) ≥ 1/2 and p(X ≥ x) ≥ 1/2. If X is a continuous random variable then the median m of X satisfies

∫_{−∞}^{m} f(x) dx = ∫_{m}^{∞} f(x) dx = 0.5

Let 0 < p < 1. A (100p)th percentile (quantile of order p) of the distribution of a random variable X is a value ξ_p such that
p(X ≤ ξ_p) ≥ p and p(X ≥ ξ_p) ≥ 1 − p.
Examples

1. Find the median and the 25th percentile of the following pdf:

f(x) = 3(1 − x)², 0 < x < 1; 0 elsewhere

Solution

(a) Let the median be m. By definition

∫_0^m f(x) dx = ∫_m^1 f(x) dx = 1/2
3 ∫_0^m (1 − x)² dx = 0.5

Let (1 − x) = z, so −dx = dz, i.e. dx = −dz:

−3 ∫_1^{1−m} z² dz = 0.5
−z³ |_1^{1−m} = 0.5
−[(1 − m)³ − 1] = 0.5
(1 − m)³ = 0.5
1 − m = 0.5^{1/3}
⇒ m = 1 − 0.5^{1/3} ≈ 0.206

(b) Let ξ be the 25th percentile:

∫_0^ξ f(x) dx = 0.25
3 ∫_0^ξ (1 − x)² dx = 0.25

With the same substitution,

−[(1 − ξ)³ − 1] = 0.25
(1 − ξ)³ = 0.75
1 − ξ = 0.75^{1/3}
⇒ ξ = 1 − 0.75^{1/3} ≈ 0.091
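Both answers can be confirmed against the CDF of this pdf, F(x) = 1 − (1 − x)³. The names F, m and q25 are ours for this check:

```python
# Check the median and 25th percentile of f(x) = 3(1-x)^2 on (0, 1).
F = lambda x: 1 - (1 - x) ** 3   # CDF of 3(1-x)^2

m = 1 - 0.5 ** (1 / 3)           # median
q25 = 1 - 0.75 ** (1 / 3)        # 25th percentile: (1 - q25)^3 = 0.75

print(round(m, 4), round(q25, 4))   # about 0.2063 and 0.0914
print(abs(F(m) - 0.5) < 1e-12, abs(F(q25) - 0.25) < 1e-12)
```
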

2. Find the median of the following distribution:

f(x) = p(1 − p)^x, x = 0, 1, 2, . . . ; 0 elsewhere

Solution

Let the median be m. Since X is discrete, m is the smallest value with F_X(m) ≥ 1/2. Taking p = 0.25 (the value behind the CDF figures computed below), F_X(x) = 1 − (0.75)^{x+1}:

F_X(0) = 0.25
F_X(1) = 0.4375
F_X(2) = 0.5781

Therefore the median m is 2, since p(X ≤ 2) = Σ_{x=0}^{2} p_X(x) = 0.5781 > 0.5 while p(X ≤ 1) = 0.4375 < 0.5.

MODE

The mode of the distribution of a random variable X is a value x that maximizes the pdf (pmf). If X is a continuous random variable, we often differentiate the pdf to find the mode.
Example

1. Find the mode of the following distributions.

(a) p_X(x) = 0.5x, x = 1, 2; 0 elsewhere
Solution
The value of x for which p_X(x) is largest is x = 2, so the mode is 2.

(b) f(x) = 0.5x²e^{−x}, 0 < x < ∞; 0 elsewhere

Solution

f(x) = 0.5x²e^{−x}. Differentiating by the product rule, d(uv)/dx = u(dv/dx) + v(du/dx); at a maximum f′(x) = 0:

0.5[2xe^{−x} − x²e^{−x}] = 0
2xe^{−x} = x²e^{−x}
x = 2

Next, if f″(x) < 0 at x = 2, then x = 2 is a maximum:

f″(x) = 0.5[2(e^{−x} − xe^{−x}) − 2xe^{−x} + x²e^{−x}]

At x = 2:

f″(2) = 0.5[2(e^{−2} − 2e^{−2}) − 4e^{−2} + 4e^{−2}]
      = 0.5(−2e^{−2})
      ≈ −0.135 < 0

implying that x = 2 is the maximizer, i.e. the mode of the above distribution.
2. Find the value of c, the median and the mode of the distribution

f(x) = cx²(1 − x), 0 < x < 1; 0 elsewhere

(Ans: c = 12, mode = 2/3)
3. The random variable X has the pdf

f(x) = cx,         0 ≤ x ≤ 1
       c(2 − x),   1 ≤ x ≤ 2
       0,          elsewhere

Determine the value of c. Find the cdf, the median and the mode of f(x).
Solution

(a) Since f(x) is a pdf,

∫_{−∞}^{∞} f(x) dx = 1
∫_0^1 cx dx + ∫_1^2 c(2 − x) dx = 1
(cx²/2) |_0^1 + (2cx − cx²/2) |_1^2 = 1
c/2 + (4c − 2c) − (2c − c/2) = 1
c = 1

Therefore

f(x) = x,       0 ≤ x ≤ 1
       2 − x,   1 ≤ x ≤ 2
       0,       elsewhere

The cdf:
F(x) = 0 for x < 0.
For 0 ≤ x ≤ 1:
F(x) = ∫ x dx = x²/2 + c₁
Since F(0) = 0 ⇒ c₁ = 0
⇒ F(x) = x²/2, 0 ≤ x ≤ 1
⇒ F(1) = 1/2

(b) Next, for 1 ≤ x ≤ 2:

F(x) = ∫ (2 − x) dx = 2x − x²/2 + c₂
But F(1) = 1/2 ⇒ 2(1) − 1/2 + c₂ = 1/2 ⇒ c₂ = −1

F(x) = x²/2,           0 ≤ x ≤ 1
       2x − x²/2 − 1,  1 ≤ x ≤ 2

(c) Median: the median m satisfies F(m) = 1/2. Since F(1) = 1/2, the median is 1. The mode is also 1, since f increases on [0, 1] and decreases on [1, 2].

4. The pdf of a random variable is

f(x) = (3/64)x²(4 − x), 0 ≤ x < 4; 0 elsewhere

Determine the mode.
Solution

At a maximum f′(x) = 0:

⇒ (3/64)·2x(4 − x) − (3/64)x² = 0
⇒ (3/64)[8x − 2x² − x²] = 0
⇒ (3/64)[8x − 3x²] = 0
⇒ (3/64)x[8 − 3x] = 0
x = 0 or x = 8/3

Next,

f″(x) = (3/64)(−3x) + (3/64)(8 − 3x) = (3/64)(8 − 6x)

At x = 0: f″(0) = 24/64 > 0 ⇒ x = 0 gives a minimum.
At x = 8/3: f″(8/3) = (3/64)(8 − 16) = −3/8 < 0.

Hence x = 8/3 gives the mode.

5. A continuous random variable X has the pdf

f(x) = 2x/3,        0 ≤ x ≤ 1
       (2 − x)/3,   1 < x ≤ 2
       x − 2,       2 < x ≤ 3
       0,           elsewhere

Determine the cdf and the median of this distribution.

Solution

(a) The cdf:
F(x) = 0 for x < 0.
For 0 ≤ x ≤ 1:
F(x) = ∫ (2x/3) dx = x²/3 + c₁
Since F(0) = 0 ⇒ c₁ = 0
⇒ F(1) = 1/3

(b) Next, for 1 < x ≤ 2:

F(x) = ∫ (1/3)(2 − x) dx = (1/3)(2x − x²/2) + c₂
But F(1) = 1/3 ⇒ (1/3)(2 − 1/2) + c₂ = 1/3
⇒ c₂ = −1/6
(so F(2) = (1/3)(4 − 2) − 1/6 = 1/2)

Next, for 2 < x ≤ 3:
F(x) = ∫ (x − 2) dx = x²/2 − 2x + c₃
But F(2) = 1/2 ⇒ 4/2 − 4 + c₃ = 1/2
⇒ c₃ = 5/2
∴ F(x) = x²/2 − 2x + 5/2, 2 < x ≤ 3

Altogether,

F(x) = x²/3,                   0 ≤ x ≤ 1
       (2/3)x − x²/6 − 1/6,    1 < x ≤ 2
       x²/2 − 2x + 5/2,        2 < x ≤ 3

with F(x) = 0 for x < 0 and F(x) = 1 for x > 3.

(c) Median: solve F(x) = 1/2.
Now F(1) = 1/3, and F(x) = x²/3 only when 0 ≤ x ≤ 1, so the median satisfies x > 1. Try 1 < x ≤ 2:

(2/3)x − (1/6)x² − 1/6 = 1/2
⇒ 4x − x² − 1 = 3
x² − 4x + 4 = 0
(x − 2)² = 0
⇒ x = 2

The median is 2.
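The derived piecewise CDF can be checked for continuity at the breakpoints and for the median value. The function F below is our transcription of the result above:

```python
# The CDF derived above, transcribed piecewise.
def F(x):
    if x < 0:
        return 0.0
    if x <= 1:
        return x * x / 3
    if x <= 2:
        return 2 * x / 3 - x * x / 6 - 1 / 6
    if x <= 3:
        return x * x / 2 - 2 * x + 2.5
    return 1.0

# Continuity at the breakpoints and the median:
print(F(1), F(2), F(3))   # 1/3, 1/2 (the median point), 1
```
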

EXPECTATION OF A RANDOM VARIABLE

Expectation gives the average value of a random variable and hence is regarded as the mean of X.
DEFINITION
Let X be a random variable. The expectation (mean) of X, denoted E(X), is given by

1. E(X) = Σ_{all x} x_i p(X = x_i) if X is discrete with values x₁, x₂, . . . , x_i and corresponding probabilities p(X = x_i)

2. E(X) = ∫_{−∞}^{∞} x f(x) dx if X is continuous with probability density function f(x)

Examples

1. What is the expected value (mean) of the number of points obtained in a single throw of an ordinary die?
SOLUTION

Let the random variable X = number of points on the die. The probability distribution of X is

x_i         1    2    3    4    5    6
p(X = x_i)  1/6  1/6  1/6  1/6  1/6  1/6

By definition

E(X) = Σ_{all x} x p(X = x)
     = (1/6)(1) + (1/6)(2) + (1/6)(3) + (1/6)(4) + (1/6)(5) + (1/6)(6)
     = (1/6)(1 + 2 + · · · + 6)
     = (1/6) · 6(1 + 6)/2       (arithmetic series, S_n = n(a + l)/2)
     = 21/6
     = 7/2

2. Let X be a discrete random variable with pmf

p(X = x) = x/21, x = 1, 2, 3, 4, 5, 6; 0 elsewhere

Compute the mean/expectation of X.

SOLUTION
E(X) = Σ_{all x} x p(X = x)

x_i         1     2     3     4     5     6
p(X = x_i)  1/21  2/21  3/21  4/21  5/21  6/21

E(X) = 1(1/21) + 2(2/21) + 3(3/21) + · · · + 6(6/21)
     = (1 + 4 + 9 + 16 + 25 + 36)/21
     = 91/21 = 13/3 ≈ 4.33

3. A continuous random variable has the pdf

f(x) = (1/9)(3 − x)(1 + x), 0 ≤ x ≤ 3; 0 elsewhere

Find the mean of X.

Solution

E(X) = ∫_{−∞}^{∞} x f(x) dx
     = ∫_0^3 (x/9)(3 − x)(1 + x) dx
     = (1/9) ∫_0^3 (3x + 2x² − x³) dx
     = (1/9) [3x²/2 + 2x³/3 − x⁴/4] |_0^3
     = (1/9)(27/2 + 18 − 81/4)
     = (1/9)(45/4)
     = 5/4

4. A bowl contains 10 chips, of which 8 are marked USD 2 each and 2 are marked USD 5 each. A person chooses, at random and without replacement, 3 chips from this bowl. If the person is to receive the sum of the resulting amounts, find his expectation.
SOLUTION

Let E be the event of picking a chip marked USD 2 and T the event of picking a chip marked USD 5. Let X = the random variable representing the sum.
Probability distribution:

x         6      9      12     15
P(X = x)  0.467  0.467  0.067  0

(the sum 15 would require three USD 5 chips, but only two exist)

E(X) = Σ_{all x} x p(X = x)
     = 6(0.467) + 9(0.467) + 12(0.067) + 15(0)
     = 7.8
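The expectation can be recomputed exactly by enumerating all C(10, 3) = 120 equally likely draws; the variable names below are ours:

```python
# Exact expectation of the chip sum by brute-force enumeration.
from fractions import Fraction
from itertools import combinations

chips = [2] * 8 + [5] * 2              # 8 chips worth $2, 2 worth $5
draws = list(combinations(chips, 3))   # all 120 unordered 3-chip draws
EX = Fraction(sum(sum(d) for d in draws), len(draws))
print(EX, float(EX))                   # 39/5 = 7.8
```
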

EXERCISE

1. In a gambling game, a man is paid sh 500 if he gets all heads or all tails in 4 tosses of a fair coin. He pays out sh 150 if he gets either 1, 2 or 3 heads. Would you play the game?
SOLUTION

No, since E(X) = 500(2/16) − 150(14/16) = −550/8 = −68.75 < 0.

2. Let X have the pdf

f(x) = (x + 2)/18, −2 < x < 4; 0 elsewhere

Find E(X).
SOLUTION: E(X) = 2

PROPERTIES OF EXPECTATION

Let X be a continuous random variable with pdf f(x).Let g(x)=ax+b.(A


function of random variable X) where a and b are constants.then

1. E[g(x)] = aE(x) + b.The constant is not aected by expectation.


Proof
By denition
´∞
E[g(x)] = −∞
g(x)f (x)dx
but g(x) = ax + b
´∞
=⇒ E[g(x)] = −∞ (ax + b)f (x)dx
´∞ ´∞
= a −∞ xf (x)dx + b −∞ f (x)dx
= aE(X) + b

2. Let g(x) and h(x) be two functions of X then for any constants a and
b
E[g(x) ± bh(x)] = aE[g(x) ± bE[h(x)]

Proof

By denition
´∞
E[g(x) ± bh(x)] = −∞ (ag(x) ± bh(x)f (x)dx
´∞ ´∞
= −∞ (ag(x)f (x)dx ± −∞ (bh(x)f (x)dx
´∞ ´∞
= a −∞ g(x)f (x)dx ± b −∞ h(x)f (x)dx
= aE[g(x)] ± bE[h(x)]

EXAMPLE

1. The random variable X has pmf

P(X = x) = x/10, x = 1, 2, 3, 4; 0 elsewhere

Compute E(5X³ − 2X²).
SOLUTION

E(5X³ − 2X²) = 5E(X³) − 2E(X²)

But E(X^r) = Σ_{all x} x^r P(X = x), so

5E(X³) = 5[1(1/10) + 8(2/10) + 27(3/10) + 64(4/10)]
       = 5[0.1 + 1.6 + 8.1 + 25.6]
       = 5(35.4)
       = 177
Next
2E(X²) = 2[1(1/10) + 4(2/10) + 9(3/10) + 16(4/10)]
       = 2[0.1 + 0.8 + 2.7 + 6.4]
       = 20
∴ 5E(X³) − 2E(X²) = 177 − 20 = 157
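The computation above can be reproduced directly from the pmf; the helper E below is ours:

```python
# E(5X^3 - 2X^2) for the pmf P(X = x) = x/10, x = 1..4.
xs = [1, 2, 3, 4]
E = lambda g: sum(g(x) * x / 10 for x in xs)   # expectation of g(X)

Ex3 = E(lambda x: x**3)   # 35.4
Ex2 = E(lambda x: x**2)   # 10.0
print(round(5 * Ex3 - 2 * Ex2, 6))   # 157.0
```
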

VARIANCE OF A RANDOM VARIABLE

Let X be a random variable.

DEFINITION
The variance of a random variable X is defined as

var(X) = E[(X − E[X])²]

i.e. the expectation of the square of the deviations of the values of X from its mean.

More precisely, if X is a discrete random variable then

var(X) = Σ_{all x} [x − E(X)]² P(X = x)

Similarly, if X is a continuous random variable then

var(X) = ∫_{−∞}^{∞} [x − E(X)]² f(x) dx

NB

var(X) = E[X − E(X)]²
       = E[X² − 2XE(X) + (E(X))²]
       = E(X²) − 2[E(X)]² + [E(X)]²
       = E[X²] − (E[X])²

Remarks

1. Normally E(X) = µ and var(X) = σ².

2. The standard deviation is given by σ = sqrt(var(X)).

3. The variance measures the average dispersion from the mean. If it is small, most of the values of X are concentrated near the mean; if it is large, most values are spread far from the mean.

Variance is normally calculated as follows:

var(X) = σ² = E(X²) − [E(X)]²
            = E(X²) − µ²

where E(X²) = Σ_{all x} x² P(X = x) if X is a discrete random variable, and E(X²) = ∫_{−∞}^{∞} x² f(x) dx if X is a continuous random variable.
Proof
By definition

var(X) = E[(X − µ)²]

but (X − µ)² = X² − 2µX + µ²

σ² = E(X² − 2µX + µ²)
   = E(X²) − 2µE(X) + µ²
   = E(X²) − 2µ² + µ²
   = E(X²) − µ²

Remarks

1. Suppose X is a random variable and g(x) is a function of X; then the variance of this new function is

var(g(x)) = E[g(x) − E(g(x))]²
          = E[g²(x)] − [E(g(x))]²

2. In particular, if g(x) = ax where a is a constant, then

var(ax) = a²var(X)

Proof

var(ax) = E[(ax)²] − [E(ax)]²
        = a²E(X²) − [aE(X)]²
        = a²[E(X²) − [E(X)]²]
        = a²var(X)

Similarly, if g(x) = ax + b where a and b are constants, then var(ax + b) = a²var(X).
Proof

var(ax + b) = E[(ax + b)²] − [E(ax + b)]²
            = E[a²X² + 2abX + b²] − [aE(X) + b]²
            = a²E(X²) + 2abE(X) + b² − [a²(E(X))² + 2abE(X) + b²]
            = a²E(X²) − a²[E(X)]²
            = a²var(X)
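The identity var(aX + b) = a²var(X) can be illustrated empirically on an arbitrary sample (the sample, a and b below are our arbitrary choices):

```python
# Empirical illustration that var(aX + b) = a^2 var(X).
import random
random.seed(0)
xs = [random.random() for _ in range(10_000)]   # arbitrary sample

def var(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / len(v)

a, b = 3.0, 7.0
ys = [a * x + b for x in xs]
print(abs(var(ys) - a * a * var(xs)) < 1e-9)   # True
```
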

EXAMPLES

1. Let f(x) be the pdf of a random variable X defined as

f(x) = (x + 3)/18, −3 ≤ x ≤ 3; 0 elsewhere

Find the variance of X.

E[X] = ∫_{−∞}^{∞} x f(x) dx
⇒ E[X] = ∫_{−3}^{3} x (x + 3)/18 dx
       = (1/18) ∫_{−3}^{3} (x² + 3x) dx
       = (1/18) [x³/3 + 3x²/2] |_{−3}^{3}
       = (1/18) [(9 + 27/2) − (−9 + 27/2)]
       = 1

Next

E(X²) = ∫_{−∞}^{∞} x² f(x) dx
      = ∫_{−3}^{3} x² (x + 3)/18 dx
      = (1/18) ∫_{−3}^{3} (x³ + 3x²) dx
      = (1/18) [x⁴/4 + x³] |_{−3}^{3}
      = (1/18) [(81/4 + 27) − (81/4 − 27)]
      = 54/18 = 3

∴ var(X) = E(X²) − [E(X)]² = 3 − 1 = 2

2. Consider the experiment of tossing two dice. Let X₁ and X₂ denote the outcomes of the first and second die respectively, and let Y be the absolute difference, i.e. Y = |X₁ − X₂|. Find the variance of Y.
SOLUTION

Values of Y over the sample space (columns X₁, rows X₂):

      1  2  3  4  5  6
  1   0  1  2  3  4  5
  2   1  0  1  2  3  4
  3   2  1  0  1  2  3
  4   3  2  1  0  1  2
  5   4  3  2  1  0  1
  6   5  4  3  2  1  0

The possible values of Y are 0, 1, 2, 3, 4, 5.
Probability distribution:

y         0     1      2     3     4     5
P(Y = y)  6/36  10/36  8/36  6/36  4/36  2/36

Now var(Y) = E(Y²) − [E(Y)]². But

E(Y) = Σ_{all y} y P(Y = y)
     = 0(6/36) + 1(10/36) + 2(8/36) + 3(6/36) + 4(4/36) + 5(2/36)
     = 70/36
     ≈ 1.94

Next

E(Y²) = Σ_{all y} y² P(Y = y)
      = 0(6/36) + 1(10/36) + 4(8/36) + 9(6/36) + 16(4/36) + 25(2/36)
      = 210/36 = 35/6

So

var(Y) = E(Y²) − [E(Y)]²
       = 35/6 − (70/36)²
       ≈ 2.05
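The whole calculation can be redone exactly by enumerating the 36 outcomes; names below are ours:

```python
# Exact mean and variance of Y = |X1 - X2| for two fair dice.
from fractions import Fraction
from itertools import product

ys = [abs(a - b) for a, b in product(range(1, 7), repeat=2)]  # 36 outcomes
EY = Fraction(sum(ys), 36)                  # 35/18, about 1.94
EY2 = Fraction(sum(y * y for y in ys), 36)  # 35/6
print(EY, EY2, EY2 - EY**2)                 # variance 665/324, about 2.05
```
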

MOMENTS

In addition to the expectation and variance of a random variable, we can compute expectations of higher powers of the random variable with respect to a given distribution. These expectations are useful in determining various characteristics of the corresponding distributions.
DEFINITION
If X is a random variable, the rth moment of X, denoted µ′_r, is defined as µ′_r = E(X^r), if the expectation exists, where r = 1, 2, . . .

These higher-order moments are computed about the zero origin, i.e. E[X^r] = E[(X − 0)^r]. Therefore they are referred to as raw moments about the origin, or uncorrected moments.
Moments can also be computed about another value, e.g. about the mean of the probability distribution. In this case they are referred to as central moments.
DEFINITION
If X is a random variable, the rth central moment about the mean µ is defined as µ_r = E[(X − µ)^r] (the corrected moment).
Remarks

1. The first raw moment is the expectation of the random variable X, i.e. if r = 1, then µ′₁ = E(X¹) = mean of X.

2. The first central moment is zero:

µ₁ = E[(X − µ)]
   = E(X) − µ
   = µ − µ
   = 0

3. The second central moment is always the variance:

µ₂ = E[(X − µ)²]
   = E(X²) − 2µE(X) + µ²
   = E(X²) − 2µ² + µ²
   = E(X²) − µ²
   = E(X²) − [E(X)]²
   = var(X)

4. Let X be a discrete random variable with pmf P(X = x). Then

(a) The rth raw moment (moment about the origin) is defined as

µ′_r = E(X^r) = Σ_{all x} x^r P(X = x)

(b) The rth central moment (moment about the mean) is defined as

µ_r = E[(X − µ)^r] = Σ_{all x} (x − µ)^r P(X = x)

Let X be a continuous random variable with pdf f(x). Then

• The rth raw moment (moment about the origin) is defined as

µ′_r = E(X^r) = ∫_{−∞}^{∞} x^r f(x) dx

• The rth central moment (moment about the mean) is defined as

µ_r = E[(X − µ)^r] = ∫_{−∞}^{∞} (x − µ)^r f(x) dx

RELATIONSHIP BETWEEN RAW AND CENTRAL MOMENTS

By definition

µ₂ = E[(x − x̄)²], where x̄ = E(x) = µ′₁
   = E[(x − µ′₁)²]

but (x − µ′₁)² = x² − 2µ′₁x + (µ′₁)²

µ₂ = E[x² − 2µ′₁x + (µ′₁)²]
   = E(x²) − 2µ′₁E(x) + (µ′₁)²
   = µ′₂ − 2(µ′₁)(µ′₁) + (µ′₁)²
   = µ′₂ − (µ′₁)²

Again, for the third central moment:

µ₃ = E[(x − x̄)³], where x̄ = E(x) = µ′₁
   = E[(x − µ′₁)³]

but (x − µ′₁)³ = x³ − 3µ′₁x² + 3x(µ′₁)² − (µ′₁)³

µ₃ = E[x³ − 3µ′₁x² + 3x(µ′₁)² − (µ′₁)³]
   = E(x³) − 3µ′₁E(x²) + 3(µ′₁)²E(x) − (µ′₁)³
   = µ′₃ − 3µ′₁µ′₂ + 3(µ′₁)²µ′₁ − (µ′₁)³
   = µ′₃ − 3µ′₁µ′₂ + 2(µ′₁)³

EXERCISE

Show that µ₄ = µ′₄ − 4µ′₃µ′₁ + 6µ′₂(µ′₁)² − 3(µ′₁)⁴

SOLUTION

µ₄ = E[(x − x̄)⁴], where x̄ = E(x) = µ′₁
   = E[(x − µ′₁)⁴]

but (x − µ′₁)⁴ = x⁴ − 4x³µ′₁ + 6x²(µ′₁)² − 4x(µ′₁)³ + (µ′₁)⁴

µ₄ = E[x⁴ − 4x³µ′₁ + 6x²(µ′₁)² − 4x(µ′₁)³ + (µ′₁)⁴]
   = E(x⁴) − 4E(x³)µ′₁ + 6E(x²)(µ′₁)² − 4E(x)(µ′₁)³ + (µ′₁)⁴
   = µ′₄ − 4µ′₃µ′₁ + 6µ′₂(µ′₁)² − 4µ′₁(µ′₁)³ + (µ′₁)⁴
   = µ′₄ − 4µ′₃µ′₁ + 6µ′₂(µ′₁)² − 4(µ′₁)⁴ + (µ′₁)⁴
   = µ′₄ − 4µ′₃µ′₁ + 6µ′₂(µ′₁)² − 3(µ′₁)⁴

FACTORIAL MOMENTS

DEFINITION
If X is a random variable, the rth factorial moment of X is defined as
µ_F = E[X(X − 1)(X − 2) . . . (X − r + 1)]
For discrete random variables, the factorial moments are easier to compute than the raw moments.
Note that the 1st factorial moment is always the mean E(X).
EXAMPLE

1. A die is thrown once. The possible outcomes are 1, 2, 3, 4, 5, 6, which are at unit intervals. Find the 1st, 2nd and 3rd factorial moments.
SOLUTION

(a) 1st factorial moment = E(X)
(b) 2nd factorial moment = E[X(X − 1)]
(c) 3rd factorial moment = E[X(X − 1)(X − 2)]

Now

x                1    2    3    4    5    6
P(X = x)        1/6  1/6  1/6  1/6  1/6  1/6
x − 1             0    1    2    3    4    5
x(x − 1)          0    2    6   12   20   30
x − 2            −1    0    1    2    3    4
x(x − 1)(x − 2)   0    0    6   24   60  120

Therefore

(a) 1st:

E(X) = Σ_{all x} x P(X = x)
     = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6)
     = 7/2

(b) 2nd factorial moment:

E[X(X − 1)] = Σ_{all x} x(x − 1) P(X = x)
            = 0(1/6) + 2(1/6) + 6(1/6) + 12(1/6) + 20(1/6) + 30(1/6)
            = 70/6 = 35/3

(c) 3rd factorial moment:

E[X(X − 1)(X − 2)] = Σ_{all x} x(x − 1)(x − 2) P(X = x)
                   = 0(1/6) + 0(1/6) + 6(1/6) + 24(1/6) + 60(1/6) + 120(1/6)
                   = 210/6 = 35
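The three factorial moments of the die can be computed exactly in a few lines (names are ours):

```python
# Factorial moments of a fair die, computed exactly.
from fractions import Fraction

xs = range(1, 7)
p = Fraction(1, 6)
m1 = sum(x * p for x in xs)                        # E[X] = 7/2
m2 = sum(x * (x - 1) * p for x in xs)              # E[X(X-1)] = 35/3
m3 = sum(x * (x - 1) * (x - 2) * p for x in xs)    # E[X(X-1)(X-2)] = 35
print(m1, m2, m3)
```
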

MOMENT GENERATING FUNCTIONS

Computing all the moments of a probability distribution is normally a tedious task. It is therefore desirable to have a function from which all the moments can be derived at will. Such a function is called a moment generating function (m.g.f.).
When this function exists for a particular distribution it is unique, a property which enables one to identify a distribution from its m.g.f.
DEFINITION
Let X be a random variable. The m.g.f., denoted m_x(t), is defined as
m_x(t) = E(e^{tx}), where t is a small number such that −h < t < h for some h > 0.

If X is a discrete random variable with pmf P(X = x), then

m_x(t) = E(e^{tx}) = Σ_{all x} e^{tx} P(X = x)     (1)

Similarly, if X is a continuous random variable with pdf f(x), then

m_x(t) = E(e^{tx}) = ∫_{−∞}^{∞} e^{tx} f(x) dx     (2)

Note that the mgf in (1) and (2) exists if the sum and the integral, respectively, are finite.

DERIVATION OF MOMENTS FROM THE M.G.F.

We normally make use of the mgf to derive higher moments of a given random variable.
Let X be a discrete random variable. Then

m_x(t) = E(e^{tx}) = Σ_{all x} e^{tx} P(X = x)

By the series expansion, e^{tx} can be expressed as

e^{tx} = 1 + tx + t²x²/2! + t³x³/3! + . . .

⇒ m_x(t) = Σ_{all x} (1 + tx + t²x²/2! + t³x³/3! + . . .) P(X = x)
         = Σ P(X = x) + t Σ x P(X = x) + (t²/2!) Σ x² P(X = x) + (t³/3!) Σ x³ P(X = x) + . . .

Differentiating both sides with respect to t:

m′_x(t) = Σ x P(X = x) + t Σ x² P(X = x) + (t²/2!) Σ x³ P(X = x) + . . .     (3)

Setting t = 0 in (3):

m′_x(0) = Σ_{all x} x P(X = x) = E(X) = µ′₁

Differentiating (3) once more with respect to t:

m″_x(t) = Σ x² P(X = x) + t Σ x³ P(X = x) + (t²/2!) Σ x⁴ P(X = x) + . . .

m″_x(0) = Σ_{all x} x² P(X = x) = E(X²) = µ′₂

In general,

µ′_r = E(X^r) = m_x^{(r)}(t) |_{t=0}

i.e. the rth raw moment is obtained by differentiating the m.g.f. r times with respect to t and setting t = 0.
Note that the same results can also be derived for continuous random variables.

For a continuous random variable,

e^{tx} = 1 + tx + t²x²/2! + t³x³/3! + . . .

⇒ m_x(t) = ∫_{−∞}^{∞} e^{tx} f(x) dx
         = ∫_{−∞}^{∞} (1 + tx + t²x²/2! + t³x³/3! + . . .) f(x) dx
         = ∫ f(x) dx + t ∫ x f(x) dx + (t²/2!) ∫ x² f(x) dx + (t³/3!) ∫ x³ f(x) dx + . . .
         = 1 + tµ′₁ + (t²/2!)µ′₂ + (t³/3!)µ′₃ + (t⁴/4!)µ′₄ + . . .

m′_x(t) = ∫ x f(x) dx + t ∫ x² f(x) dx + . . .     (1)

m′_x(0) = ∫_{−∞}^{∞} x f(x) dx = E(X) = µ′₁

Differentiating (1) once more with respect to t:

m″_x(t) = ∫ x² f(x) dx + t ∫ x³ f(x) dx + (t²/2!) ∫ x⁴ f(x) dx + . . .

m″_x(0) = ∫_{−∞}^{∞} x² f(x) dx = E(X²) = µ′₂

In general, the rth raw moment is given by (d^r/dt^r) m_x(t) |_{t=0}.
EXAMPLE

1. Let X be a continuous random variable with pdf f(x) given by

f(x) = λe^{−λx}, x > 0; 0 elsewhere

Find its m.g.f. if it exists.

(a) Derive the expected value of X and the variance of X from the m.g.f.
(b) Verify the results by computing the above quantities directly from the definition.

SOLUTION

(a) By definition

m_x(t) = ∫_{−∞}^{∞} e^{tx} f(x) dx
       = ∫_0^∞ e^{tx} λe^{−λx} dx
       = λ ∫_0^∞ e^{(t−λ)x} dx
       = λ ∫_0^∞ e^{−(λ−t)x} dx
       = [λ/(−(λ − t))] e^{−(λ−t)x} |_0^∞
       = [λ/(−(λ − t))] [e^{−(λ−t)∞} − e^0]
       = [λ/(−(λ − t))] (0 − 1)       (for t < λ)
       = λ/(λ − t)
       = λ(λ − t)^{−1}

E(X) = m′_x(t) |_{t=0}

m′_x(t) = λ/(λ − t)²
⇒ E(X) = λ/λ² = 1/λ

Similarly, var(X) = E(X²) − (E(X))² = m″_x(t)|_{t=0} − (m′_x(t)|_{t=0})², and since m″_x(t) = 2λ/(λ − t)³ we get E(X²) = 2/λ², hence var(X) = 2/λ² − 1/λ² = 1/λ².
(b) We need to verify directly that E(X) = ∫_0^∞ xλe^{−λx} dx = 1/λ and var(X) = ∫_0^∞ x²λe^{−λx} dx − (1/λ)² = 1/λ².

E(X) = λ ∫_0^∞ x e^{−λx} dx

Using integration by parts, ∫ u dv = uv − ∫ v du, with u = x and dv = e^{−λx} dx,
⇒ du = dx and v = −e^{−λx}/λ:

E(X) = λ [(−xe^{−λx}/λ) |_0^∞ + (1/λ) ∫_0^∞ e^{−λx} dx]
     = [−xe^{−λx}]_0^∞ + [−e^{−λx}/λ]_0^∞
     = 0 − (1/λ)(e^{−∞} − 1)
     = 1/λ

Next,

var(X) = E(X²) − (1/λ)²

but E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫_0^∞ x² λe^{−λx} dx = λ ∫_0^∞ x² e^{−λx} dx

Let u = x² and dv = e^{−λx} dx
⇒ du = 2x dx and v = −e^{−λx}/λ:

E(X²) = λ [(−x²e^{−λx}/λ) |_0^∞ + (2/λ) ∫_0^∞ x e^{−λx} dx]
      = [−x²e^{−λx}]_0^∞ + 2 ∫_0^∞ x e^{−λx} dx
      = 0 + 2(1/λ²)          (since ∫_0^∞ x e^{−λx} dx = 1/λ², from part (b) above)
      = 2/λ²

∴ var(X) = E(X²) − (E(X))²
         = 2/λ² − (1/λ)²
         = 1/λ²
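As a numeric sanity check of the exponential mgf m(t) = λ/(λ − t), one can recover E(X) ≈ 1/λ and var(X) ≈ 1/λ² by finite-difference differentiation at t = 0 (λ = 2 and the step h are our arbitrary choices):

```python
# Finite-difference check: derivatives of m(t) = lam/(lam - t) at t = 0
# give the exponential moments E(X) = 1/lam and E(X^2) = 2/lam^2.
lam = 2.0
m = lambda t: lam / (lam - t)   # mgf, valid for t < lam
h = 1e-5

mean = (m(h) - m(-h)) / (2 * h)              # ~ m'(0) = 1/lam = 0.5
second = (m(h) - 2 * m(0) + m(-h)) / h**2    # ~ m''(0) = 2/lam^2 = 0.5
print(round(mean, 4), round(second - mean**2, 4))   # 0.5 and 0.25
```
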

SOME THEORETICAL PROBABILITY DISTRIBUTIONS

These are theoretical distributions which are not obtained by actual observations/experiments. They are derived mathematically on the basis of certain assumptions. These distributions are broadly classified into two categories:

• Discrete probability distributions

• Continuous probability distributions

0.0.1 Discrete probability distributions

BERNOULLI DISTRIBUTION

A Bernoulli trial is a random experiment with only two mutually exclusive outcomes: Success (occurrence of an event) or Failure (non-occurrence of an event), e.g. tossing a coin, where you can get H or T, or testing a manufactured item (which can be either defective or non-defective).
Let X be a random variable such that

X = 1, if the outcome is a success
    0, if the outcome is a failure

This simple distribution is completely defined and characterized by a single parameter p, where p = p(success):

⇒ p(X = 1) = p
  p(X = 0) = 1 − p = q

Therefore a Bernoulli distribution can be expressed as

P(X = x) = p^x (1 − p)^{1−x}, x = 0, 1; 0 elsewhere

The moments of this distribution are very easy to compute, i.e.

µ′_r = E[X^r]
     = Σ_{x=0}^{1} x^r p(X = x)
     = 0 + 1^r p¹(1 − p)^{1−1}
     = p

The variance of X is given by

var(X) = E(X²) − [E(X)]²
       = p − p²
       = p(1 − p)
       = pq

MOMENT GENERATING FUNCTION

m_x(t) = E[e^{tx}]
       = Σ_{x=0}^{1} e^{tx} p(X = x)
       = Σ_{x=0}^{1} e^{tx} p^x (1 − p)^{1−x}
       = e^0 (1 − p) + e^t p
       = (1 − p) + pe^t

which exists for all t.

BINOMIAL PROBABILITY DISTRIBUTION

This is an extension of the Bernoulli distribution. In this case we have more than one trial.

CHARACTERISTICS OF A BINOMIAL EXPERIMENT

1. It consists of n identical and independent trials.
2. There are only two possible outcomes on each trial, i.e. Success or Failure.
3. The probability of an outcome is the same from trial to trial.
4. The random variable is the number of favorable outcomes.

Let a trial result in success with constant probability p and in failure with probability (1 − p) = q. Then the probability of x successes in n independent trials is given by

p(X = x) = (n choose x) p^x q^{n−x}, x = 0, 1, 2, . . . , n; 0 elsewhere

which is the Binomial distribution with parameters n and p, i.e. X ∼ Bin(n, p).

EXAMPLE

1. It is expected that 10% of production from a continuous process will be defective. Find the probability that in a sample of 10 units chosen at random, exactly 2 will be defective, and at least 2 will be defective.
SOLUTION

(a) Let X = the number of defective items. What is required is p(X = 2) and p(X ≥ 2).
Now,

p(X = x) = (n choose x) p^x q^{n−x}

p(X = 2) = (10 choose 2)(0.1)²(0.9)^{10−2}
         = [10!/(8! 2!)] (0.1)²(0.9)⁸
         = (10 × 9 / 2)(0.1)²(0.9)⁸
         = 0.1937

(b) p(X ≥ 2)

= p(X = 2) + p(X = 3) + · · · + p(X = 10)
= 1 − p(X < 2)
= 1 − [(10 choose 0)(0.1)⁰(0.9)^{10} + (10 choose 1)(0.1)¹(0.9)⁹]
= 1 − [0.3487 + 0.3874]
= 0.264
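The two binomial probabilities above are easy to reproduce with the standard pmf formula (the helper `binom_pmf` is our own name):

```python
# Binomial check: n = 10, p = 0.1.
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p2 = binom_pmf(2, 10, 0.1)                                  # p(X = 2)
p_ge2 = 1 - binom_pmf(0, 10, 0.1) - binom_pmf(1, 10, 0.1)   # p(X >= 2)
print(round(p2, 4), round(p_ge2, 4))   # 0.1937 and 0.2639
```
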

EXERCISE

1. The probability that a computer drawn at random from a batch of computers is defective is 0.1. If a sample of 6 computers is taken, find the probability that it will contain

(a) No defective computer
(b) 5 or 6 defective computers
(c) More than 2 defective computers
(d) Less than 3 defective computers

MGF OF A BINOMIAL DISTRIBUTION

The mgf of a Binomial random variable X with parameters n and p is given by

m_x(t) = E(e^{tx})
       = Σ_{x=0}^{n} e^{tx} (n choose x) p^x q^{n−x}
       = Σ_{x=0}^{n} (n choose x) (pe^t)^x q^{n−x}

But (a + b)^n = Σ_{x=0}^{n} (n choose x) a^x b^{n−x}, the binomial expansion; let pe^t = a and q = b. Then

m_x(t) = (pe^t + q)^n

which is the mgf of a Binomial distribution with parameters n and p.

Now,

m′_x(t) = n(pe^t + q)^{n−1} pe^t
        = npe^t (pe^t + q)^{n−1}

m′_x(0) = E(X)
        = np(p + q)^{n−1}
        = np · 1^{n−1}
        = np

which is the mean of a Binomial distribution with parameters n and p.

Next, Var(X):

Var(X) = E(X²) − [E(X)]²
       = E(X²) − n²p²

but E(X²) = m″_x(0):

m″_x(t) = npe^t (pe^t + q)^{n−1} + (n − 1)(pe^t + q)^{n−2} npe^t · pe^t

m″_x(0) = np(p + q)^{n−1} + n(n − 1)p²(p + q)^{n−2}
        = np + n(n − 1)p²
        = np + n²p² − np²

∴ Var(X) = E(X²) − n²p²
         = np + n²p² − np² − n²p²
         = np − np²
         = np(1 − p)
         = npq
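The results E(X) = np and Var(X) = npq can be verified by summing over the pmf directly (n = 10, p = 0.1 are our arbitrary choices):

```python
# Verify the binomial mean np and variance npq by direct summation.
from math import comb

n, p = 10, 0.1
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(pmf))
var = sum(x * x * q for x, q in enumerate(pmf)) - mean**2
print(round(mean, 6), round(var, 6))   # np = 1.0, npq = 0.9
```
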

0.0.2 NEGATIVE BINOMIAL DISTRIBUTION

Consider a sequence of independent repetitions of a random experiment with constant probability of success p. Let k be the number of successes and n − k the number of failures in a sample of size n. In this case sampling stops after exactly k successes, so the last draw must be a success. The number of ways this can happen is (n−1 choose k−1).
One such sequence is p·p · · · p (the first k − 1 successes) together with (1 − p)(1 − p) · · · (1 − p) (the n − k failures), followed by the final success.
Therefore the probability of obtaining k − 1 successes in the first n − 1 trials, with the nth trial a success, is

f(n) = (n−1 choose k−1) p^{k−1} (1 − p)^{n−k} · p
     = (n−1 choose k−1) p^k (1 − p)^{n−k}, n = k, k + 1, k + 2, . . . ; 0 elsewhere

This distribution is what we call the negative Binomial distribution.

THE MGF OF THE NEGATIVE BINOMIAL DISTRIBUTION
In the derivation below, X counts the failures, X = n − k; this is the version whose moments E[X] = kq/p and var(X) = kq/p² are obtained next. With q = 1 − p,

m_x(t) = E(e^{tX})
       = Σ_{n=k}^{∞} (n−1 choose k−1) p^k q^{n−k} e^{t(n−k)}
       = p^k Σ_{j=0}^{∞} (k+j−1 choose j) (qe^t)^j       (putting j = n − k)
       = p^k (1 − qe^t)^{−k}        (negative binomial series, for qe^t < 1)
       = [p/(1 − qe^t)]^k

MEAN(X)

m′_x(t) = p^k (−k)(1 − qe^t)^{−k−1}(−qe^t)
        = kp^k qe^t (1 − qe^t)^{−k−1}

⇒ E[X] = m′_x(t) |_{t=0}
       = kp^k q p^{−k−1}
       = kq/p

VAR(X)

var(X) = E(X²) − [E(X)]² = E(X²) − k²q²/p²

E(X²) = m″_x(t) |_{t=0}

m″_x(t) = kp^k qe^t (1 − qe^t)^{−k−1} + kp^k qe^t (k + 1) qe^t (1 − qe^t)^{−k−2}

m″_x(0) = kp^k q p^{−k−1} + k(k + 1) p^k q² p^{−k−2}

E(X²) = kq/p + k²q²/p² + kq²/p²

var(X) = E(X²) − k²q²/p²
       = kq/p + kq²/p²
       = kq(p + q)/p²
       = kq/p²

If k = 1 then the resulting distribution is

f(n) = p(1 − p)^{n−1}, n = 1, 2, 3, . . . ; 0 elsewhere

which is the geometric distribution.

The mgf of this distribution (again counting failures, k = 1 above) is m_x(t) = p/(1 − qe^t). Then

m′_x(t) = (−1)p(1 − qe^t)^{−2}(−qe^t)
        = pqe^t (1 − qe^t)^{−2}

E[X] = m′_x(t) |_{t=0}
     = pq(1 − q)^{−2}
     = pq/p²
     = q/p

E(X²) = m″_x(t) |_{t=0}

m″_x(t) = pqe^t (1 − qe^t)^{−2} + 2pqe^t · qe^t (1 − qe^t)^{−3}

m″_x(0) = pq(1 − q)^{−2} + 2pq²(1 − q)^{−3}
        = pq/p² + 2pq²/p³
        = q/p + 2q²/p²

Now VAR(X):

var(X) = E(X²) − [E(X)]²
       = q/p + 2q²/p² − q²/p²
       = (qp + 2q² − q²)/p²
       = (qp + q²)/p²
       = q(p + q)/p²
       = q/p²

Note that p + q = 1.
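The geometric moments q/p and q/p² can be checked by summing the failure-counting pmf P(X = j) = pq^j, j = 0, 1, 2, . . . (p = 0.3 and the truncation point are our arbitrary choices; the tail is negligible):

```python
# Check E(X) = q/p and Var(X) = q/p^2 for the failures-before-first-success
# geometric distribution, by truncated direct summation.
p = 0.3
q = 1 - p
pmf = lambda j: p * q**j

js = range(0, 2000)                       # tail beyond this is ~ 0
mean = sum(j * pmf(j) for j in js)
var = sum(j * j * pmf(j) for j in js) - mean**2
print(round(mean, 6), round(var, 6))      # q/p = 2.333333, q/p^2 = 7.777778
```
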

POISSON DISTRIBUTION

The Poisson distribution is derived from the Binomial distribution: it is the limiting form of a Binomial distribution under the following conditions.

1. The number of trials n is large, i.e. n → ∞
2. The probability of success p in each trial is small, i.e. p → 0
3. np = λ is finite and a positive real number.

Let n = number of trials and m = average number of successes per given time span (m = np). Then for a random variable X,

p(X = x) = [n!/((n−x)! x!)] (m/n)^x (1 − m/n)^{n−x}

lim_{n→∞} p(X = x) = lim_{n→∞} [n!/((n−x)! x!)] (m/n)^x (1 − m/n)^{n−x}
= lim_{n→∞} [n!/((n−x)! x!)] (m^x/n^x) (1 − m/n)^n (1 − m/n)^{−x}
= (m^x/x!) · lim_{n→∞} [n!/((n−x)! n^x)] · lim_{n→∞} (1 − m/n)^n · lim_{n→∞} (1 − m/n)^{−x}

But

lim_{n→∞} n!/((n−x)! n^x) = lim_{n→∞} n(n−1)(n−2) . . . (n−x+1)/n^x
= lim_{n→∞} (1 − 1/n)(1 − 2/n) . . . (1 − (x−1)/n)
= 1 × 1 × · · · × 1 = 1

lim_{n→∞} (1 − m/n)^n = e^{−m}

since lim_{n→∞} (1 + 1/n)^n = e and, more generally, lim_{n→∞} (1 + a/n)^n = e^a,

and lim_{n→∞} (1 − m/n)^{−x} = 1^{−x} = 1

⇒ lim_{n→∞} p(X = x) = (m^x/x!)(1)e^{−m}

= λ^x e^{−λ}/x!, x = 0, 1, 2 . . . ; 0 elsewhere

which is the Poisson distribution with parameter λ.

THE MGF OF A POISSON DISTRIBUTION

m_x(t) = E(e^{tx})
       = Σ_{x=0}^{∞} e^{tx} λ^x e^{−λ}/x!
       = e^{−λ} Σ_{x=0}^{∞} (λe^t)^x / x!
       = e^{−λ} [1 + λe^t + (λe^t)²/2! + (λe^t)³/3! + . . .]
       = e^{−λ} e^{λe^t}
       = e^{λ(e^t − 1)}

MEAN(X)

m′_x(t) = λe^t e^{λ(e^t − 1)}

E[X] = m′_x(t) |_{t=0} = λ

VAR(X)

E(X²) = m″_x(t) |_{t=0}

m″_x(t) = λe^t e^{λ(e^t − 1)} + (λe^t)² e^{λ(e^t − 1)}

E(X²) = m″_x(0) = λ + λ²

var(X) = E(X²) − [E(X)]²
       = λ + λ² − λ²

Var(X) = λ
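That the Poisson mean and variance both equal λ can be checked by direct summation of the pmf (λ = 2.4 is our arbitrary choice; the tail past x = 60 is negligible):

```python
# Check E(X) = Var(X) = lambda for the Poisson pmf by direct summation.
import math

lam = 2.4
pmf = lambda x: math.exp(-lam) * lam**x / math.factorial(x)
mean = sum(x * pmf(x) for x in range(60))
var = sum(x * x * pmf(x) for x in range(60)) - mean**2
print(round(mean, 6), round(var, 6))   # both 2.4
```
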

EXAMPLE

1. The number of calls per to minutes received at a telephone switch board


follows a Poisson distribution with mean 0.6.Find the probability that

(a) No call will be received in the rst 10 minutes


(b) More than 2 calls will be received in a period of 40 minutes.

SOLUTION

(a) Mean = 0.6. Let X be the random variable representing the number of calls received per 10 minutes.

X ∼ P₀(0.6)

p(X = x) = e^(−0.6)(0.6)^x / x! , x = 0, 1, 2, . . .

p(X = 0) = e^(−0.6)(0.6)⁰/0! = e^(−0.6) ≈ 0.55

(b) Let X₁ be the number of calls received in 40 minutes. Since X ∼ P₀(0.6) per 10 minutes,

X₁ ∼ P₀(4 × 0.6) = P₀(2.4)

p(X₁ = x) = e^(−2.4)(2.4)^x / x! , x = 0, 1, 2, . . .

p(X₁ > 2) = 1 − [p(X₁ = 0) + p(X₁ = 1) + p(X₁ = 2)]

= 1 − [e^(−2.4)(2.4)⁰/0! + e^(−2.4)(2.4)¹/1! + e^(−2.4)(2.4)²/2!]

= 1 − 6.28e^(−2.4)

= 1 − 0.5697

= 0.430
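The two probabilities in this example are easy to reproduce; the short check below is ours, not part of the notes:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return lam**x * exp(-lam) / factorial(x)

# (a) no calls in the first 10 minutes, X ~ Po(0.6)
p_a = poisson_pmf(0, 0.6)
# (b) more than 2 calls in 40 minutes, X1 ~ Po(4 * 0.6) = Po(2.4)
p_b = 1 - sum(poisson_pmf(x, 2.4) for x in range(3))
print(round(p_a, 4), round(p_b, 4))
```

This reproduces 0.5488 (≈ 0.55) for part (a) and 0.4303 for part (b).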

2. A certain hospital usually admits 50 patients per day. On average 3 patients in 100 require rooms provided with special facilities. On a certain day it is found that there are 3 such rooms available. Assuming that 50 patients will be admitted, find the probability that more than 3 patients will require such special rooms.
SOLUTION
p = 3/100 = 0.03
n = 50
λ = np = 0.03 × 50 = 1.5
Let the random variable X be the number of patients that require rooms with special facilities.

X ∼ P₀(1.5)

p(X = x) = e^(−1.5)(1.5)^x / x! , x = 0, 1, 2, . . .

p(X > 3) = 1 − [p(X = 0) + p(X = 1) + p(X = 2) + p(X = 3)]

= 1 − [e^(−1.5)(1.5)⁰/0! + e^(−1.5)(1.5)¹/1! + e^(−1.5)(1.5)²/2! + e^(−1.5)(1.5)³/3!]

= 1 − e^(−1.5)[1 + 1.5 + (1.5)²/2! + (1.5)³/3!]

= 1 − (0.2231 × 4.1875)

= 1 − 0.934

= 0.066
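Since n = 50 is only moderately large, it is worth comparing the Poisson approximation against the exact Binomial(50, 0.03) tail. This comparison is our own addition, not part of the notes:

```python
from math import comb, exp, factorial

n, p = 50, 0.03
lam = n * p  # 1.5

# exact binomial tail P(X > 3)
exact = 1 - sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(4))
# Poisson approximation used in the solution above
approx = 1 - sum(lam**x * exp(-lam) / factorial(x) for x in range(4))
print(round(exact, 4), round(approx, 4))
```

The two tails differ only in the third decimal place, which is why the approximation is acceptable here.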

3. An insurance company attends to 50 clients per day. On average 3 in 100 require special services. On a certain day it is found that there are 3 special service providers available. Assuming that 50 clients will be attended to, find the probability that more than 3 clients require special services.

4. A factory packs bolts in boxes of 500. The probability that a bolt is defective is 0.002. Find the probability that a box contains 2 defective bolts.
SOLUTION
Since n is large and p is small we can use a Poisson approximation.

λ = np = 500(0.002) = 1

p(X = 2) = e^(−1)(1)²/2! = 0.184

5. The mean number of bacteria per milliliter of liquid is known to be 4. Assuming that the number of bacteria follows a Poisson distribution, find the probability that

(a) in 1 ml of liquid there will be no bacteria;

(b) in 3 ml of liquid there will be less than 2 bacteria;

(c) in 1/2 ml of liquid there will be more than 2 bacteria.

SOLUTION

(a) X ∼ P₀(4)

p(X = x) = e^(−4)(4)^x / x! , x = 0, 1, 2, . . .

p(X = 0) = e^(−4)(4)⁰/0! = e^(−4) ≈ 0.0183

(b) X₁ ∼ P₀(3 × 4) = P₀(12)

p(X₁ = x) = e^(−12)(12)^x / x! , x = 0, 1, 2, . . .

p(X₁ < 2) = p(X₁ = 0) + p(X₁ = 1)

= e^(−12)(12)⁰/0! + e^(−12)(12)¹/1!

= e^(−12)(1 + 12)

= 0.000080

(c) X₂ ∼ P₀((1/2) × 4) = P₀(2)

p(X₂ = x) = e^(−2)(2)^x / x! , x = 0, 1, 2, . . .

p(X₂ > 2) = 1 − [p(X₂ = 0) + p(X₂ = 1) + p(X₂ = 2)]

= 1 − [e^(−2)(2)⁰/0! + e^(−2)(2)¹/1! + e^(−2)(2)²/2!]

= 1 − e^(−2)(1 + 2 + 2)

= 0.3233
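All three parts of the bacteria example can be computed with one small helper; this sketch (with names of our own choosing) is added for checking, not part of the notes:

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    # P(X <= k) for X ~ Po(lam)
    return sum(lam**x * exp(-lam) / factorial(x) for x in range(k + 1))

p_a = exp(-4)                 # (a) no bacteria in 1 ml, X ~ Po(4)
p_b = poisson_cdf(1, 12)      # (b) fewer than 2 bacteria in 3 ml, X1 ~ Po(12)
p_c = 1 - poisson_cdf(2, 2)   # (c) more than 2 bacteria in 1/2 ml, X2 ~ Po(2)
print(round(p_a, 4), round(p_b, 6), round(p_c, 4))
```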

HYPERGEOMETRIC DISTRIBUTION

We can derive the hypergeometric distribution as follows:

Let N be the population size from which a sample of size n is to be drawn. Let the proportion of individuals in this finite population who possess a certain property of interest be denoted by p. If X is a random variable corresponding to the number of individuals in the sample possessing the property of interest, the problem is to find the distribution function of X.
Since x individuals must come from the NP individuals in the population with the property of interest and the remaining n − x individuals come from the N − NP who do not possess the property of interest, then by using the idea of combinations the distribution of X is given by

67
p(X = x) = C(NP, x) C(N − NP, n − x) / C(N, n)

where C(a, b) denotes the binomial coefficient "a choose b". This is the hypergeometric distribution.


Suppose then that NP = k; then

p(X = x) = C(k, x) C(N − k, n − x) / C(N, n) , x = 0, 1, . . . , n
         = 0 , elsewhere
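This pmf is easy to implement with exact integer binomial coefficients; the snippet below is an illustrative sketch of ours, not part of the notes:

```python
from math import comb

def hypergeom_pmf(x, N, k, n):
    # P(X = x): x of the k marked items in a sample of size n drawn
    # without replacement from a population of size N
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# the pmf sums to 1 over its support (here x = 0..6 for N=20, k=6, n=10)
total = sum(hypergeom_pmf(x, 20, 6, 10) for x in range(7))
print(round(total, 10))
```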

MEAN(X)

E[X] = Σ_{all x} x P(X = x)

= Σ_{x=0}^n x C(k, x) C(N − k, n − x) / C(N, n)

= (nk/N) Σ_{x=1}^n C(k − 1, x − 1) C(N − k, n − x) / C(N − 1, n − 1)

let x − 1 = y ⟹ n − x = n − 1 − y and x = y + 1

E[X] = (nk/N) Σ_{y=0}^{n−1} C(k − 1, y) C(N − 1 − (k − 1), n − 1 − y) / C(N − 1, n − 1)

The sum is the total probability of a hypergeometric distribution with parameters N − 1, k − 1, n − 1, so it equals 1, and

E[X] = nk/N

VAR(X)

Var(X) = E[X²] − [E(X)]²

but E[X(X − 1)] = Σ_{x=0}^n x(x − 1) C(k, x) C(N − k, n − x) / C(N, n)

= [n(n − 1)k(k − 1)/(N(N − 1))] Σ_{x=2}^n C(k − 2, x − 2) C(N − k, n − x) / C(N − 2, n − 2)

let x − 2 = y ⟹ n − x = n − 2 − y and x = y + 2

E[X(X − 1)] = [n(n − 1)k(k − 1)/(N(N − 1))] Σ_{y=0}^{n−2} C(k − 2, y) C(N − 2 − (k − 2), n − 2 − y) / C(N − 2, n − 2)

= n(n − 1)k(k − 1)/(N(N − 1))

since the remaining sum is again a total probability and equals 1. Then

Var(X) = E[X²] − [E(X)]²

= E[X(X − 1)] + E(X) − [E(X)]²

= n(n − 1)k(k − 1)/(N(N − 1)) + nk/N − (nk/N)²

= (nk/N)[(n − 1)(k − 1)/(N − 1) + 1 − nk/N]

= (nk/N) × (N − k)(N − n)/(N(N − 1))
NB: The mgf of the hypergeometric distribution is not useful.
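Since the mgf is not tractable, the mean and variance formulas just derived are best checked by direct summation; the check below is ours, not part of the notes:

```python
from math import comb

def hypergeom_pmf(x, N, k, n):
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

N, k, n = 20, 15, 4
mean = sum(x * hypergeom_pmf(x, N, k, n) for x in range(n + 1))
var = sum(x**2 * hypergeom_pmf(x, N, k, n) for x in range(n + 1)) - mean**2
print(round(mean, 6), round(var, 6))
# closed forms derived above: nk/N and nk(N - k)(N - n) / (N^2 (N - 1))
print(n * k / N, n * k * (N - k) * (N - n) / (N**2 * (N - 1)))
```

Both the summed values and the closed forms give mean 3 and variance 4800/7600 ≈ 0.6316.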
EXAMPLES
1. A box of 20 spare parts for a certain type of machine contains 15 good items and 5 defective items. If 4 parts are selected at random from the box, what is the probability that exactly 3 of them will be good?
SOLUTION
Let the random variable X = number of good items. Using the hypergeometric distribution

p(X = x) = C(k, x) C(N − k, n − x) / C(N, n)

N = 20, k = 15, N − k = 5, n = 4, x = 3, n − x = 1

p(X = 3) = C(15, 3) C(5, 1) / C(20, 4)

= [15!/(12! 3!)] × [5!/(4! 1!)] ÷ [20!/(16! 4!)]

= [(15)(14)(13)/((3)(2)(1))] × 5 × [(4)(3)(2)(1)/((20)(19)(18)(17))]

= (455 × 5)/4845

= 455/969 = 0.4696

2. Suppose a population consists of 100 individuals of whom 10% have high blood pressure. What is the probability of getting at most two individuals with high blood pressure when choosing 8 individuals?
SOLUTION

Let the random variable X be the number of individuals with high blood pressure.

p(X = x) = C(k, x) C(N − k, n − x) / C(N, n)

N = 100, k = 10, N − k = 90, n = 8

p(X ≤ 2) = Σ_{x=0}^2 C(10, x) C(90, 8 − x) / C(100, 8)

= C(10, 0)C(90, 8)/C(100, 8) + C(10, 1)C(90, 7)/C(100, 8) + C(10, 2)C(90, 6)/C(100, 8)

= 0.97

3. A JKUAT messenger is asked to deliver 10 out of 16 letters to the Institute of Computer Science department and the remaining to the Engineering department. He gets the letters mixed up and on arrival in the Computer Science department delivers 10 letters at random to the department. What is the probability that only six of the letters to be delivered to the Computer Science department actually arrive there?
SOLUTION
Let the random variable X = number of letters delivered to the Computer Science department.

p(X = x) = C(k, x) C(N − k, n − x) / C(N, n)

N = 16, k = 10, N − k = 6, n = 10, x = 6, n − x = 4

p(X = 6) = C(10, 6) C(6, 4) / C(16, 10)

= (210 × 15)/8008

= 0.393

4. In a school there are 20 students. 6 are compulsive smokers and keep cigarettes in their lockers all the time. One day prefects decide to check at random on 10 lockers. What is the probability that they will find cigarettes in at least 3 of the lockers?
SOLUTION
Let the random variable X be the number of lockers with cigarettes.

p(X = x) = C(k, x) C(N − k, n − x) / C(N, n)

N = 20, k = 6, N − k = 14, n = 10

p(X ≥ 3) = 1 − Σ_{x=0}^2 C(6, x) C(14, 10 − x) / C(20, 10)

= 1 − [C(6, 0)C(14, 10)/C(20, 10) + C(6, 1)C(14, 9)/C(20, 10) + C(6, 2)C(14, 8)/C(20, 10)]

= 1 − 0.314

= 0.686
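The complement trick used here ("at least 3" = 1 minus "at most 2") is quick to verify; this check is ours, not part of the notes:

```python
from math import comb

def hypergeom_pmf(x, N, k, n):
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

N, k, n = 20, 6, 10
p_at_least_3 = 1 - sum(hypergeom_pmf(x, N, k, n) for x in range(3))
print(round(p_at_least_3, 3))
```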

CONTINUOUS PROBABILITY DISTRIBUTIONS

0.1 UNIFORM OR RECTANGULAR DISTRIBUTION

The uniform distribution is a very simple distribution and is particularly useful in theoretical statistics because it is convenient to deal with mathematically. A continuous random variable X has a uniform distribution over the interval [a, b] with pdf given by

f(x) = 1/(b − a) , a ≤ x ≤ b (−∞ < a < b < ∞)
     = 0 , elsewhere

If the random variable X is uniformly distributed over [a, b] then

E[X] = ∫_{−∞}^{∞} x f(x) dx

= ∫_a^b x · [1/(b − a)] dx

= [1/(b − a)] [x²/2]_a^b

= [1/(b − a)] (b² − a²)/2

= (a + b)/2

VAR(X)

Var(X) = E[X²] − (E[X])²

= E[X²] − ((a + b)/2)²

E[X²] = ∫_a^b x² · [1/(b − a)] dx

= [1/(b − a)] [x³/3]_a^b

= (b³ − a³)/(3(b − a))

∴ VAR(X) = (b³ − a³)/(3(b − a)) − ((a + b)/2)²

= (b³ − a³)/(3(b − a)) − (a + b)²/4

= [4(b³ − a³) − 3(a + b)²(b − a)] / [12(b − a)]

= [4(b − a)(b² + ab + a²) − 3(a + b)²(b − a)] / [12(b − a)]

= [4(b² + ab + a²) − 3(a + b)²]/12

= [4b² + 4ab + 4a² − 3a² − 6ab − 3b²]/12

= (b² − 2ab + a²)/12

∴ VAR(X) = (b − a)²/12
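The closed forms (a + b)/2 and (b − a)²/12 can be confirmed by numerically integrating against the pdf; the midpoint-rule sketch below, with an arbitrary interval of our choosing, is not part of the notes:

```python
a, b = 2.0, 7.0  # arbitrary example interval
n = 10000        # midpoint-rule integration steps
h = (b - a) / n
xs = [a + (i + 0.5) * h for i in range(n)]
pdf = 1 / (b - a)

mean = sum(x * pdf * h for x in xs)
var = sum(x**2 * pdf * h for x in xs) - mean**2
print(round(mean, 4), round(var, 4))
print((a + b) / 2, (b - a)**2 / 12)  # closed forms derived above
```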

MGF OF X

m_x(t) = E[e^(tX)]

= ∫_a^b e^(tx) · [1/(b − a)] dx

= [1/(b − a)] [e^(tx)/t]_a^b

= [1/(b − a)] (e^(bt) − e^(at))/t

= (e^(bt) − e^(at)) / (t(b − a)) , t ≠ 0

0.2 THE GAMMA AND CHI SQUARE DISTRIBUTION

0.2.1 THE GAMMA AND BETA DISTRIBUTIONS

The gamma function is denoted by Γα and is defined by

Γα = ∫_0^∞ y^(α−1) e^(−y) dy

for α > 0. If α = 1 then

Γ1 = ∫_0^∞ e^(−y) dy = 1

If α > 1, an integration by parts shows that

Γα = (α − 1) ∫_0^∞ y^(α−2) e^(−y) dy = (α − 1) Γ(α − 1)

Therefore if α is a positive integer then

Γα = (α − 1)(α − 2) . . . (3)(2)Γ1 = (α − 1)!

⟹ Γ(α + 1) = α!

Γ(α + 2) = (α + 1)α! = (α + 1)! etc.
The beta function with parameters α, β, denoted B(α, β), is defined as

B(α, β) = ∫_0^1 y^(α−1) (1 − y)^(β−1) dy , α > 0, β > 0

Note that B(α, β) = B(β, α):

let 1 − y = u, so du = −dy and y = 1 − u; as y runs from 0 to 1, u runs from 1 to 0, hence

B(α, β) = ∫_0^1 (1 − u)^(α−1) u^(β−1) du

= ∫_0^1 u^(β−1) (1 − u)^(α−1) du = B(β, α)

The beta function is related to the gamma function according to the formula

B(α, β) = ΓαΓβ / Γ(α + β)
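This relation can be checked by comparing a numerical evaluation of the defining integral against `math.gamma`; the sketch below (our own check, not part of the notes) uses arbitrary parameter values:

```python
from math import gamma

def beta_numeric(a, b, n=100000):
    # midpoint-rule approximation of the defining integral of B(a, b)
    h = 1.0 / n
    return sum(((i + 0.5) * h)**(a - 1) * (1 - (i + 0.5) * h)**(b - 1) * h
               for i in range(n))

a, b = 2.5, 3.0
lhs = beta_numeric(a, b)
rhs = gamma(a) * gamma(b) / gamma(a + b)
print(round(lhs, 6), round(rhs, 6))
```

Both sides agree to several decimal places.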

We derive the gamma distribution from the gamma function as follows:

Γα = ∫_0^∞ y^(α−1) e^(−y) dy

Introduce a new variable x by letting y = x/β, where β > 0; then

Γα = ∫_0^∞ (x/β)^(α−1) e^(−x/β) (1/β) dx

⟹ ∫_0^∞ [x^(α−1) / (Γα β^α)] e^(−x/β) dx = 1

Now since α > 0, β > 0 and Γα > 0,

f(x) = x^(α−1) e^(−x/β) / (Γα β^α) , 0 < x < ∞
     = 0 , elsewhere

is a pdf of a random variable of the continuous type; such a distribution is called a gamma distribution with parameters α and β.

0.2.2 MGF OF A GAMMA DISTRIBUTION

m_x(t) = E(e^(tX))

= ∫_0^∞ e^(tx) [x^(α−1) / (Γα β^α)] e^(−x/β) dx

= ∫_0^∞ [x^(α−1) / (Γα β^α)] e^(−x(1−βt)/β) dx

let y = x(1 − βt)/β , t < 1/β

⟹ dy = [(1 − βt)/β] dx , dx = β dy/(1 − βt) , and x = βy/(1 − βt)

∴ m_x(t) = [1/(1 − βt)]^α ∫_0^∞ [y^(α−1)/Γα] e^(−y) dy

but ∫_0^∞ [y^(α−1)/Γα] e^(−y) dy = 1 (the integral of a pdf)

∴ m_x(t) = [1/(1 − βt)]^α = (1 − βt)^(−α) , t < 1/β

MEAN(X)

E(X) = m'_x(t)|_{t=0}

but m'_x(t) = −α(−β)(1 − βt)^(−α−1) = αβ(1 − βt)^(−α−1)

E(X) = m'_x(t)|_{t=0} = αβ


VAR(X)

Var(X) = E[X²] − (E[X])² = E[X²] − α²β²

m''_x(t) = αβ(−α − 1)(−β)(1 − βt)^(−α−2) = α(α + 1)β²(1 − βt)^(−α−2)

E[X²] = m''_x(t)|_{t=0} = αβ(αβ + β) = α²β² + αβ²

∴ VAR(X) = E[X²] − α²β²

= α²β² + αβ² − α²β²

= αβ²
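The gamma moments αβ and αβ² can be confirmed by numerically integrating against the pdf; this check, with parameters of our own choosing, is not part of the notes:

```python
from math import gamma, exp

def gamma_pdf(x, alpha, beta):
    # gamma density x^(alpha-1) e^(-x/beta) / (Gamma(alpha) beta^alpha)
    return x**(alpha - 1) * exp(-x / beta) / (gamma(alpha) * beta**alpha)

alpha, beta = 3.0, 2.0
n, upper = 100000, 80.0  # truncate the integral; the tail beyond 80 is negligible here
h = upper / n
xs = [(i + 0.5) * h for i in range(n)]
mean = sum(x * gamma_pdf(x, alpha, beta) * h for x in xs)
var = sum(x**2 * gamma_pdf(x, alpha, beta) * h for x in xs) - mean**2
print(round(mean, 4), round(var, 4))  # closed forms: alpha*beta = 6, alpha*beta^2 = 12
```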

The chi-square distribution is a special case of the gamma distribution with α = r/2, r > 0, and β = 2:

f(x) = x^(r/2 − 1) e^(−x/2) / (Γ(r/2) 2^(r/2)) , 0 < x < ∞
     = 0 , elsewhere

Likewise the mgf of a chi-square distribution is

m_x(t) = (1 − 2t)^(−r/2) , t < 1/2

The mean and variance of a chi-square distribution are αβ = (r/2)(2) = r and αβ² = (r/2)(4) = 2r respectively.

The exponential distribution is also a special case of the gamma distribution, with α = 1 and β = θ = 1/λ:

f(x) = (1/θ) e^(−x/θ) , 0 < x < ∞
     = 0 , elsewhere

or

f(x) = λe^(−λx) , 0 < x < ∞
     = 0 , elsewhere

The mgf of an exponential distribution is 1/(1 − θt) or, equivalently, λ/(λ − t).

E(X) = θ or 1/λ

VAR(X) = θ² or 1/λ²

0.3 BETA DISTRIBUTIONS

A continuous random variable X is said to have a beta distribution with parameters α and β if its pdf is given by

f(x) = x^(α−1)(1 − x)^(β−1) / B(α, β) , 0 ≤ x ≤ 1, α > 0, β > 0
     = 0 , elsewhere

or

f(x) = [Γ(α + β)/(ΓαΓβ)] x^(α−1)(1 − x)^(β−1) , 0 ≤ x ≤ 1, α > 0, β > 0
     = 0 , elsewhere
The mgf of a beta distribution is not useful, but we can still get the mean and variance of the distribution.

E(X)

E(X) = ∫_0^1 x f(x) dx

= ∫_0^1 x · x^(α−1)(1 − x)^(β−1) / B(α, β) dx

= [∫_0^1 x^α (1 − x)^(β−1) dx] / B(α, β)

but B(α, β) = ∫_0^1 x^(α−1)(1 − x)^(β−1) dx, so by the same convention

∫_0^1 x^α (1 − x)^(β−1) dx = B(α + 1, β)

⟹ E(X) = B(α + 1, β)/B(α, β)

= [Γ(α + 1)Γβ / Γ(α + β + 1)] × [Γ(α + β) / (ΓαΓβ)]

= [αΓαΓβ / ((α + β)Γ(α + β))] × [Γ(α + β) / (ΓαΓβ)]

= α/(α + β)

Next VAR(X):

E(X²) = ∫_0^1 x² · x^(α−1)(1 − x)^(β−1) / B(α, β) dx

= [∫_0^1 x^(α+2−1)(1 − x)^(β−1) dx] / B(α, β)

= B(α + 2, β)/B(α, β)

= [Γ(α + 2)Γβ / Γ(α + β + 2)] × [Γ(α + β) / (ΓαΓβ)]

= [(α + 1)Γ(α + 1)Γβ × Γ(α + β)] / [(α + β + 1)Γ(α + β + 1)ΓαΓβ]

= [(α + 1)αΓαΓβ × Γ(α + β)] / [(α + β + 1)(α + β)Γ(α + β)ΓαΓβ]

= (α + 1)α / [(α + β + 1)(α + β)]

E[X²] − (E[X])² = (α + 1)α/[(α + β + 1)(α + β)] − (α/(α + β))²

= [(α² + α)(α + β) − α²(α + β + 1)] / [(α + β + 1)(α + β)²]

= [α³ + α² + α²β + αβ − α³ − α²β − α²] / [(α + β + 1)(α + β)²]

= αβ / [(α + β + 1)(α + β)²]
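The beta-distribution moments just derived can be checked against a numerical integration of the pdf; this sketch, with example parameters of our own, is not part of the notes:

```python
from math import gamma

def beta_pdf(x, a, b):
    B = gamma(a) * gamma(b) / gamma(a + b)
    return x**(a - 1) * (1 - x)**(b - 1) / B

a, b = 2.0, 5.0
n = 100000
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]
mean = sum(x * beta_pdf(x, a, b) * h for x in xs)
var = sum(x**2 * beta_pdf(x, a, b) * h for x in xs) - mean**2
print(round(mean, 5), round(var, 5))
print(a / (a + b), a * b / ((a + b + 1) * (a + b)**2))  # closed forms derived above
```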

0.4 NORMAL DISTRIBUTION

It is the most popular and commonly used distribution. A continuous random variable X is said to have a normal distribution if its pdf is given by

f(x) = [1/(σ√(2π))] e^(−(x−µ)²/(2σ²)) , −∞ < x < ∞, −∞ < µ < ∞, σ > 0
     = 0 , elsewhere

Sometimes we use the notation X ∼ N(µ, σ²).

From the above pdf we observe that the normal distribution depends on two unknown parameters µ and σ. Later we will show that these parameters µ and σ are the mean and standard deviation respectively.

Once the parameters µ and σ are specified, the distribution is completely determined; hence the values of f(x) can be evaluated from the values of X and the normal curve plotted, which is bell-shaped and symmetric about x = µ.

Note that the total area under the curve is always 1:

∫_{−∞}^{∞} f(x) dx = ∫_{−∞}^{∞} [1/(σ√(2π))] e^(−(x−µ)²/(2σ²)) dx = 1

The probability that X lies between a and b is given by

p(a < X < b) = ∫_a^b [1/(σ√(2π))] e^(−(x−µ)²/(2σ²)) dx

It is not easy to evaluate the above integral and so we make use of the standard normal variable.
i.e if µ =0and σ = 1 the density function for X becomes
 √1 e− 2σ12 x2 , −∞ < x < ∞

f (x) =
0 , elsewhere
In this situation the random variable is referred to as a standard normal
variable and its distribution is referred to as normal distribution i.e X ∼
N (0, 1) .
To simplify the computation of probabilities involving the normal variable
we rst of all standardize the random variable.i.e we make it possess the
standard normal density.This is done by the following transformation

z = (X − µ)/σ

In this case z is a standardized variable. Note that if X ∼ N(µ, σ²) then the standardized variable

z = (X − µ)/σ ∼ N(0, 1).

E(Z) = E[(X − µ)/σ] = [E(X) − µ]/σ = (µ − µ)/σ = 0

VAR(Z) = Var[(X − µ)/σ] = Var(X)/σ² = σ²/σ² = 1

Note that Var(µ/σ) = variance of a constant = 0.

If z ∼ N(0, 1) then

p(Z < z) = ∫_{−∞}^z [1/√(2π)] e^(−z²/2) dz

These probabilities have been evaluated and given in the form of a table.

Now a probability such as p(X ≤ b) can be evaluated by

p(X ≤ b) = p((X − µ)/σ ≤ (b − µ)/σ)

⟹ p(X ≤ b) = p(Z ≤ a*)    (1)

where a* = (b − µ)/σ.
The probability on the RHS of (1) can be read from tabulated values of the CDF of the standard normal distribution.
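In code, the standard normal CDF is available through the error function, so standardizing and "table lookup" can be sketched as follows (a convenience of ours, not part of the notes):

```python
from math import erf, sqrt

def phi(z):
    # standard normal CDF via the error function:
    # P(Z <= z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(b, mu, sigma):
    # standardize first, then use the standard normal CDF
    return phi((b - mu) / sigma)

print(round(phi(0), 4), round(phi(1), 4), round(phi(1.96), 4))
```

The printed values reproduce the familiar table entries 0.5, 0.8413 and 0.9750.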

USE OF STANDARD NORMAL TABLES

1. If z is a standard normal variable, find the following probabilities:

(a) p(z < 0)

(b) p(−1 < z < 1)

(c) p(z > 2.54)

SOLUTION

(a) If we sketch the curve and indicate the region specified by the probability: by symmetry p(z < 0) = p(z > 0), and since the two probabilities sum to 1, p(z < 0) = 0.5.

(b) p(−1 < z < 1)

= p(z < 1) − [1 − p(z < 1)]

= 0.8413 − (1 − 0.8413)

= 0.6826

(c) p(z > 2.54)

= 1 − 0.9945
= 0.0055

2. A random variable X has a normal distribution with mean 9 and S.D. 3. Find p(5 < X < 11).
SOLUTION
We need to standardize X, i.e. z = (X − µ)/σ.
For X = 5: z = (5 − 9)/3 = −1.333
For X = 11: z = (11 − 9)/3 = 0.667
⟹ p(−1.333 < z < 0.667)
= p(z < 0.667) − [1 − p(z < 1.333)]
= 0.7486 − 0.0918
= 0.6568

3. If X ∼ N(10, σ²) and p(X > 12) = 0.1587, find p(9 < X < 11).
SOLUTION
p(X > 12) ≡ p(Z > (12 − 10)/σ) = 0.1587
From the tables, the value of Z which corresponds to the upper-tail probability 0.1587 is 1 (i.e. 1 − 0.1587 = 0.8413, which the tables give at z = 1), so 2/σ = 1 and σ = 2.
Next p(9 < X < 11) = p(−0.5 < z < 0.5)

= p(z < 0.5) − [1 − p(z < 0.5)]

= 0.6915 − 0.3085
= 0.383

4. The time required to perform a certain job is a random variable having a normal distribution with mean 55 minutes and s.d. of 10 minutes. Find the probability that

(a) the job will take more than 75 min

(b) the job will take less than 60 min

(c) the job will take between 45 and 60 min

SOLUTION

(a) Let the random variable X = time required to do the job. What is required is

p(X > 75) = p(Z > (75 − 55)/10) = p(Z > 2)

= 1 − 0.9772

= 0.0228

(b) We need to compute

p(X < 60) ≡ p(Z < (60 − 55)/10)

= p(z < 0.5) = 0.6915

(c) p(45 ≤ X ≤ 60)

≡ p((45 − 55)/10 ≤ z ≤ (60 − 55)/10)

= p(−1 ≤ z ≤ 0.5)

= p(z ≤ 0.5) − p(z < −1)

= p(z ≤ 0.5) − [1 − p(z < 1)]

= 0.6915 − 0.1587

= 0.5328
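All three parts of this job-time example can be reproduced with an erf-based standard normal CDF; the check below is ours, not part of the notes:

```python
from math import erf, sqrt

def phi(z):
    # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 55, 10
p_a = 1 - phi((75 - mu) / sigma)                       # more than 75 min
p_b = phi((60 - mu) / sigma)                           # less than 60 min
p_c = phi((60 - mu) / sigma) - phi((45 - mu) / sigma)  # between 45 and 60 min
print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```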

EXERCISE

1. Let X be normally distributed with mean µ and variance σ². Suppose that p(X < 69) = 0.9 and p(X < 74) = 0.95. Determine µ and σ.
Ans: σ = 13.88 and µ = 51.2

2. X ∼ N(µ, σ²). Find the constant b so that

p(−b < z < b) = p(z < b) − [1 − p(z < b)] = 0.95

let p(z < b) = y

⟹ y − [1 − y] = 0.95

2y − 1 = 0.95

y = 0.9750

The value of z that corresponds to the probability 0.9750 is 1.96.
∴ if p(z < b) = 0.975 then b = 1.96.
PROPERTIES OF THE NORMAL DISTRIBUTION
1. The normal density curve is bell-shaped and symmetric about the value x = µ, since f(x) satisfies f(µ + a) = f(µ − a).
⟹ p(X < µ) = p(X > µ) = 0.5, so µ is the median of the density.
From the curve, f(x) is a maximum when x = µ.

2. As x → ∞, f(x) → 0, and as x → −∞, f(x) → 0.
All three measures of location (mean, median and mode) coincide at x = µ.

3. Since f(x) is a pdf, the area under the normal curve is 1, i.e.

∫_{−∞}^{∞} [1/√(2π)] e^(−z²/2) dz = 1 if z ∼ N(0, 1)

or ∫_{−∞}^{∞} [1/(σ√(2π))] e^(−(x−µ)²/(2σ²)) dx = 1

Note that the normal distribution is widely used in statistics despite the fact that populations hardly follow the exact normal distribution. This is because:

1. If a variable does not follow a normal distribution it can be made to follow it after making suitable transformations.

2. Whatever the original distribution, the distribution of the sample mean can in most cases be approximated by a normal distribution so long as the sample size is sufficiently large (CLT, i.e. the central limit theorem).

THE MGF OF A NORMAL DISTRIBUTION

By definition

m_x(t) = ∫_{−∞}^{∞} e^(tx) [1/(σ√(2π))] e^(−(x−µ)²/(2σ²)) dx

= ∫_{−∞}^{∞} [1/(σ√(2π))] e^(−(x−µ)²/(2σ²) + tx) dx    (1)

but −(1/(2σ²))(x − µ)² + tx = −(1/(2σ²))[x² + µ² − 2µx − 2σ²tx]

= −(1/(2σ²))[x² − 2x(µ + σ²t)] − µ²/(2σ²)    (2)

We complete the square of x² − 2x(µ + σ²t):

x² − 2x(µ + σ²t) = [x − (µ + σ²t)]² − (µ + σ²t)²

From (2):

−(1/(2σ²))[x − (µ + σ²t)]² + (µ + σ²t)²/(2σ²) − µ²/(2σ²)

= −(1/(2σ²))[x − (µ + σ²t)]² + [µ² + 2µσ²t + σ⁴t² − µ²]/(2σ²)

= −(1/(2σ²))[x − (µ + σ²t)]² + µt + σ²t²/2

From (1):

m_x(t) = ∫_{−∞}^{∞} [1/(σ√(2π))] e^(−(1/(2σ²))[x−(µ+σ²t)]² + µt + σ²t²/2) dx

= e^(µt + σ²t²/2) ∫_{−∞}^{∞} [1/(σ√(2π))] e^(−(1/(2σ²))[x−(µ+σ²t)]²) dx

but ∫_{−∞}^{∞} [1/(σ√(2π))] e^(−(1/(2σ²))[x−(µ+σ²t)]²) dx = 1, since the integrand is a normal density with mean µ + σ²t and variance σ².

∴ m_x(t) = e^(µt + σ²t²/2)

Note that if we replace µ with 0 and σ² with 1 we get the mgf of the standard normal distribution, i.e.

m_z(t) = e^(0 + t²/2) = e^(t²/2)

MEAN(X), where X ∼ N(µ, σ²)

E(X) = m'_x(t)|_{t=0}

m'_x(t) = (µ + σ²t) e^(µt + σ²t²/2)

E(X) = m'_x(t)|_{t=0} = µ

VAR(X) = E(X²) − (E[X])²

E[X²] = m''_x(t)|_{t=0}

m''_x(t) = σ² e^(µt + σ²t²/2) + (µ + σ²t)² e^(µt + σ²t²/2)

m''_x(t)|_{t=0} = σ² + µ²

VAR(X) = E(X²) − (E[X])² = σ² + µ² − µ² = σ²

For z ∼ N(0, 1) we can also obtain the mgf directly, i.e.

m_Z(t) = E(e^(tz))

= ∫_{−∞}^{∞} e^(tz) [1/√(2π)] e^(−z²/2) dz

= ∫_{−∞}^{∞} [1/√(2π)] e^(−(z² − 2tz)/2) dz

but z² − 2tz = (z − t)² − t²

⟹ m_Z(t) = ∫_{−∞}^{∞} [1/√(2π)] e^(−(z−t)²/2 + t²/2) dz

= e^(t²/2) ∫_{−∞}^{∞} [1/√(2π)] e^(−(z−t)²/2) dz

= e^(t²/2)

which is the mgf of a standard normal variable.

Alternatively we can get the mgf of the standard normal distribution from the mgf of a normal density, i.e.

m_x(t) = e^(µt + σ²t²/2)    (1)

Since z ∼ N(0, 1) we can replace µ and σ with 0 and 1 respectively in (1):

⟹ m_z(t) = e^(0 + t²/2) = e^(t²/2)
MEAN(Z)

E(z) = m'_z(t)|_{t=0}

m'_z(t) = t e^(t²/2)

m'_z(t)|_{t=0} = 0

VAR(Z)

Var(z) = E(z²) − (E[z])²

but E[z²] = m''_z(t)|_{t=0}

m''_z(t) = e^(t²/2) + t² e^(t²/2)

now m''_z(t)|_{t=0} = 1 + 0 = 1

∴ Var(z) = 1 − 0² = 1

0.5 STATISTICAL INFERENCE

STATISTIC: A statistic is a function of one or more random variables that does not involve any unknown parameter, e.g. x̄ = (1/n)Σxᵢ or s².

A researcher often has fixed ideas about a certain population based on prior experiments, surveys or experience. There is a need to ascertain whether these ideas/claims are correct or not by collecting information in the form of data.
A statistical test is a procedure, governed by certain rules, which leads to a decision about a hypothesis (its acceptance or rejection) on the basis of sample values.

Tests of hypotheses play an important role in industry.
A hypothesis is an assertion or conjecture about the parameters of population distributions.
We have two hypotheses:

1. Null hypothesis: the hypothesis which is actually tested for acceptance or rejection. It is denoted by H₀. It expresses the idea that an observed difference is due to chance.

2. Alternative hypothesis: a statement about the population parameter or parameters which gives an alternative to the null hypothesis H₀ within the range of pertinent values of the parameter. It is denoted by H_A or H₁. It says the difference is real.

P-VALUE OF A TEST: the probability of getting a test statistic at least as extreme as the one observed, assuming the null hypothesis to be right. P is not the chance of the null hypothesis being right.
If the statistical hypothesis completely specifies the distribution, it is called a simple hypothesis; otherwise it is called a composite hypothesis.

To make a test of significance you need to:
Set up the null hypothesis.
Pick a test statistic.
Compute the observed significance level p.
The choice of test statistic depends on the model and the hypothesis being considered.
TYPES OF ERRORS

1. TYPE I ERROR: Reject H₀ when it is true.

2. TYPE II ERROR: Accept H₀ when it is false.

LEVEL OF SIGNIFICANCE: the quantity of risk of the type I error which we are ready to tolerate in making a decision about H₀.
It is conventionally chosen as 5% or 1%:
5% gives moderate precision,
1% gives high precision.
STUDENT'S T DISTRIBUTION
We have discussed the normal distribution; we need µ and σ to define it.
z = (X̄ − µ)/(σ/√n) is a normal variate with mean 0 and variance 1, i.e. z ∼ N(0, 1).
In practice σ is not known, and in such a case the only option is to use the sample estimate s of the s.d.
√n(X̄ − µ)/s is approximately normal if n is large. If n is not large, then √n(X̄ − µ)/s is distributed as t:

t = (X̄ − µ)/(s/√n) , where s² = [1/(n − 1)] Σ(Xᵢ − X̄)²

t is widely used, and the distribution of t is called the t distribution.
The density function of the variable t with k = n − 1 degrees of freedom is

f(t) = 1/[√k B(1/2, k/2)] × 1/(1 + t²/k)^((k+1)/2) , −∞ < t < ∞

Degrees of freedom are the number of independent observations in a set of observations.

where B(1/2, k/2) = Γ(1/2)Γ(k/2)/Γ((k + 1)/2) = √π Γ(k/2)/Γ((k + 1)/2)
PROPERTIES OF THE T DISTRIBUTION
It is bell-shaped just like the normal, with its tails a bit higher than the normal.
It is uni-modal.
The probability distribution is symmetrical about t = 0.
It tends to the normal as k increases.

T-TEST

The Student t is the deviation of the estimated mean from its population mean, expressed in units of standard deviation. E.g. if we want to test
H₀ : µ = µ₀
vs
H₁ : µ ≠ µ₀
where µ₀ is an assumed value considered to fit µ, the test statistic is

t_{n−1} = (X̄ − µ₀)/(s/√n)

= √n(X̄ − µ₀)/s

= (X̄ − µ₀) √[n(n − 1)/Σ(X − X̄)²]

Whatever value we get we compare it with a tabulated value. If the calculated value is greater (in absolute value) than the tabulated value, we reject the null hypothesis.
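The statistic above translates directly into code; the sketch below (our own, with hypothetical toy data chosen only for illustration) computes the one-sample t value:

```python
from math import sqrt

def one_sample_t(data, mu0):
    # t = sqrt(n) (xbar - mu0) / s, with s the sample standard deviation
    n = len(data)
    xbar = sum(data) / n
    s2 = sum((x - xbar)**2 for x in data) / (n - 1)
    return sqrt(n) * (xbar - mu0) / sqrt(s2)

# hypothetical toy data, not from the notes
t = one_sample_t([4.8, 5.1, 5.3, 4.9, 5.2], 5.0)
print(round(t, 3))
```

The resulting t would then be compared against the tabulated critical value at n − 1 degrees of freedom.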
EXAMPLE

1. The life expectancy of people in the year 1970 in Brazil was expected to be 50 years. A survey was conducted in 11 regions of Brazil and the following data obtained. Do the data confirm the expected view?
Life expectancy (yrs): 54.2, 50.4, 44.2, 49.7, 55.4, 57.0, 58.2, 56.6, 61.9, 57.5, 53.4
SOLUTION
We wish to test
H₀ : µ = 50
vs
H_A : µ ≠ 50
The test statistic is

t_{n−1} = √n(X̄ − µ₀)/s

X̄ = (54.2 + 50.4 + . . . + 53.4)/11 = 598.5/11 = 54.41

s² = [1/(n − 1)] Σ(X − X̄)²

= (1/10)[32799.91 − (598.5)²/11]

= 23.607

s = √23.607 = 4.859

t = √11 (54.41 − 50)/4.859

= 3.01

From tables, the value of t at α = 5% and 10 d.f. is 2.228.

Since t calculated = 3.01 > t tabulated = 2.228, we reject H₀. This means that the life expectancy differs from 50 years (in fact, it exceeds 50 years).
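The arithmetic in this example is easy to reproduce from the eleven data values; the check below is ours, not part of the notes:

```python
from math import sqrt

ages = [54.2, 50.4, 44.2, 49.7, 55.4, 57.0, 58.2, 56.6, 61.9, 57.5, 53.4]
n = len(ages)
xbar = sum(ages) / n
s = sqrt(sum((x - xbar)**2 for x in ages) / (n - 1))
t = sqrt(n) * (xbar - 50) / s
print(round(xbar, 2), round(s, 3), round(t, 2))
```

This reproduces X̄ = 54.41, s = 4.859 and t = 3.01 as in the worked solution.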
EXERCISE
1. A breeder claims that his variety of cotton contains at most 40% lint in seed cotton. 18 samples of 100 grams each were taken and after ginning the following quantity of lint was found in each sample:

36.3 37.0 36.6 37.5 37.5 37.9 37.9 36.9 36.7
38.5 37.9 38.8 37.5 37.1 37.0 36.8 36.7 35.7

Check the breeder's claim. Use the 1/100 level of significance.
SOLUTION

We wish to test
H₀ : µ = 40
vs
H_A : µ < 40
The test statistic is

t_{n−1} = √n(X̄ − µ₀)/s

X̄ = (36.3 + 37.0 + 36.6 + . . . + 36.7 + 35.7)/18 = 669.7/18 = 37.206

s² = [1/(n − 1)][ΣXᵢ² − (ΣXᵢ)²/n]

= (1/17) Σ(Xᵢ − X̄)² = 0.633

s = √0.633 = 0.796

t = √18 (37.206 − 40)/0.796

≈ −14.89

From tables, the value of t at α = 1% and 17 d.f. is 2.567.

Since the calculated t lies far in the lower (rejection) tail, i.e. |t| = 14.89 > 2.567, we reject H₀. This means that the average percentage of lint in this cotton is less than 40%.
