EE3110: Probability Foundations for Electrical Engineers

Feb-May 2021
Tutorial 9 Solutions

1 Review Questions:
Refer to the textbook and reference materials for the answers.

2 Bayesian Inference and a-Posteriori Distribution (8.1)


2.1
(a) Let K be the event that Nefeli knew the answer to the first question, and let C be the event
that she answered the question correctly. Then, from the given data, P(K) = P(K̄) = 1/2. Also,
P(C | K) = 1, as she knows the correct answer, and P(C | K̄) = 1/3.
Using Bayes' rule, we have

P(K | C) = P(K)P(C | K) / [P(K)P(C | K) + P(K̄)P(C | K̄)]
         = (0.5 × 1) / (0.5 × 1 + 0.5 × (1/3))
         = 3/4.
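As a quick numerical sanity check, the posterior above can be reproduced directly from Bayes' rule (a minimal Python sketch; the only inputs are the prior 1/2 and the likelihoods 1 and 1/3 given in the problem):

```python
# Posterior probability that Nefeli knew the answer, given a correct answer.
p_know = 0.5             # prior P(K)
p_correct_know = 1.0     # P(C | K): she knows the answer
p_correct_guess = 1 / 3  # P(C | K_bar): probability of a correct guess, as given

p_correct = p_know * p_correct_know + (1 - p_know) * p_correct_guess
posterior = p_know * p_correct_know / p_correct
print(posterior)         # approximately 0.75 = 3/4
```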

(b) Given that Nefeli answered 6 of the 10 questions correctly, the probability that she knows
the answer to a question she answered correctly is 3/4, by part (a). Since each question is
independent of the others, the posterior PMF of the number of questions she knew is binomial with n = 6 and p = 3/4.
Alternatively:
(b) Let the random variable M be the number of problems to which Nefeli knew the answer. Let
E be the event that 6 of her answers are correct. Then M is binomially distributed with
parameters (n, p) = (10, 1/2), so

P(M = m) = C(10, m) (1/2)^m (1/2)^(10−m) = C(10, m) (1/2)^10,   for 0 ≤ m ≤ 10.
Given M = m, the conditional probability of E is

P(E | M = m) = 0,   for m > 6,
P(E | M = m) = C(10 − m, 6 − m) (1/3)^(6−m) (2/3)^(10−m−(6−m)) = C(10 − m, 6 − m) (1/3)^(6−m) (2/3)^4,   for 0 ≤ m ≤ 6.

So, using the total probability theorem, we have

P(E) = Σ_{m=0}^{6} P(E | M = m) P(M = m) = C(10, 4) (2/3)^6 (1/3)^4.

Then, using Bayes' rule, we have

P(M = m | E) = P(E | M = m) P(M = m) / P(E),

which gives

P(M = m | E) = 0,   for m > 6,
P(M = m | E) = C(6, m) 3^m / 4^6 = C(6, m) (3/4)^m (1/4)^(6−m),   for 0 ≤ m ≤ 6,

which is a binomial distribution with parameters n = 6 and p = 3/4.
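The same conclusion can be checked numerically by applying Bayes' rule over all values of m and comparing the result with the Binomial(6, 3/4) PMF (a minimal sketch using only the prior and likelihood defined above):

```python
from math import comb

# Prior PMF of M ~ Binomial(10, 1/2), likelihood P(E | M = m) as above,
# and the posterior P(M = m | E) via Bayes' rule.
def prior(m):
    return comb(10, m) * 0.5**10

def likelihood(m):      # P(6 correct | she knew m of the 10)
    if m > 6:
        return 0.0
    return comb(10 - m, 6 - m) * (1 / 3)**(6 - m) * (2 / 3)**4

p_E = sum(likelihood(m) * prior(m) for m in range(11))
posterior = [likelihood(m) * prior(m) / p_E for m in range(11)]

# Compare with the Binomial(6, 3/4) PMF (zero for m > 6).
target = [comb(6, m) * 0.75**m * 0.25**(6 - m) if m <= 6 else 0.0
          for m in range(11)]
print(all(abs(a - b) < 1e-12 for a, b in zip(posterior, target)))  # True
```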

3 MAP Rule (8.2)


3.1
(a) Let H1 and H2 be the hypotheses that box 1 or 2 , respectively, was chosen. Let X = 1 if
the drawn ball is white, and X = 2 if it is black. We introduce a parameter/random variable
Θ, taking values θ1 and θ2 , corresponding to H1 and H2 , respectively. We have the following
prior distribution for Θ :
pΘ (θ1 ) = p, pΘ (θ2 ) = 1 − p
where p is given. Using Bayes’ rule, we have
pΘ|X(θ1 | 1) = pΘ(θ1) pX|Θ(1 | θ1) / [pΘ(θ1) pX|Θ(1 | θ1) + pΘ(θ2) pX|Θ(1 | θ2)]
             = (2p/3) / (2p/3 + (1 − p)/3)
             = 2p / (1 + p).

Similarly, we calculate the other conditional probabilities of interest:

pΘ|X(θ2 | 1) = (1 − p) / (1 + p),   pΘ|X(θ1 | 2) = p / (2 − p),   pΘ|X(θ2 | 2) = (2 − 2p) / (2 − p).
If a white ball is drawn (X = 1), the MAP rule selects box 1 if

pΘ|X (θ1 | 1) > pΘ|X (θ2 | 1) ,

that is, if
2p / (1 + p) > (1 − p) / (1 + p),

or p > 1/3, and selects box 2 otherwise. If a black ball is drawn (X = 2), the MAP rule selects
box 1 if

pΘ|X(θ1 | 2) > pΘ|X(θ2 | 2),

that is, if

p / (2 − p) > (2 − 2p) / (2 − p),

or p > 2/3, and selects box 2 otherwise. Suppose now that the two boxes have equal prior
probabilities (p = 1/2). Then, the MAP rule decides on box 1 if X = 1 (since p = 1/2 > 1/3) and
box 2 if X = 2 (since p = 1/2 < 2/3). Given an initial choice of box 1 (Θ = θ1), the
probability of error is

e1 = P(X = 2 | θ1) = 1/3.

Similarly, for an initial choice of box 2 (Θ = θ2), the probability of error is

e2 = P(X = 1 | θ2) = 1/3.

The overall probability of error of the MAP decision rule is obtained using the total probability
theorem:

P(error) = pΘ(θ1) e1 + pΘ(θ2) e2 = (1/2) · (1/3) + (1/2) · (1/3) = 1/3.

Thus, whereas prior to knowing the data (the value of X) the probability of error for either
decision was 1/2, after knowing the data and using the MAP rule, the probability of error is
reduced to 1/3. This is in fact a general property of the MAP rule: with more data, the probability
of error cannot increase, regardless of the observed value of X.
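A short sketch of the MAP decision and its error probability for the equal-prior case. The likelihood values P(X = 1 | θ1) = 2/3 and P(X = 1 | θ2) = 1/3 are the ones implied by the calculation above; treat the exact box contents behind them as an assumption of the sketch:

```python
# MAP decision and error probability for the two-box problem with equal priors.
likelihood = {                 # likelihood[x][theta] = P(X = x | Theta = theta)
    1: {1: 2 / 3, 2: 1 / 3},   # X = 1: white ball drawn
    2: {1: 1 / 3, 2: 2 / 3},   # X = 2: black ball drawn
}
prior = {1: 0.5, 2: 0.5}

def map_decision(x):
    return max(prior, key=lambda theta: prior[theta] * likelihood[x][theta])

# Overall error probability via the total probability theorem.
p_error = sum(prior[theta] * likelihood[x][theta]
              for theta in prior for x in (1, 2)
              if map_decision(x) != theta)
print(map_decision(1), map_decision(2), p_error)   # 1 2 0.333...
```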

3.2 Digital Communication Problem


(a) The given conditional pdf’s are shown in figure 1. The MAP rule says: decide the class s1
if P (s1 |z) > P (s2 |z) else decide class s2 . Using Bayes Theorem, we have

P(s1 | z) = P(z | s1) P(s1) / [P(z | s1) P(s1) + P(z | s2) P(s2)]

and

P(s2 | z) = P(z | s2) P(s2) / [P(z | s1) P(s1) + P(z | s2) P(s2)].

Using the above two equations, the MAP rule becomes:
decide class 1 if P(z | s1) P(s1) > P(z | s2) P(s2), and since P(s1) = P(s2) = 1/2, decide class 1 if
P(z | s1) > P(z | s2). Now, from the given figure, at z = za we have P(za | s1) = 0.5 and P(za | s2) = 0.3.

Since P(za | s1) > P(za | s2), the sample za belongs to the class s1 according to the MAP rule.
Similarly, at z = zb, P(zb | s1) = 0.7 and P(zb | s2) = 0.1. Since P(zb | s1) > P(zb | s2), the sample zb
also belongs to the class s1 according to the MAP rule.
Note: For equal priors (P(s1) = P(s2) = 1/2), the threshold point zT is the point of intersection
of the two conditional density functions P(z | s1) and P(z | s2), and the decision rule is:
decide s1 if z > zT, else decide s2.
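The MAP comparison for part (a) can be written as a one-line rule; the density values 0.5, 0.3, 0.7, 0.1 below are the figure readings quoted in the solution:

```python
# MAP classification from likelihood values read off the conditional densities.
# With equal priors the rule reduces to comparing P(z | s1) and P(z | s2).
def map_class(p_z_s1, p_z_s2, prior_s1=0.5):
    return "s1" if p_z_s1 * prior_s1 > p_z_s2 * (1 - prior_s1) else "s2"

print(map_class(0.5, 0.3))   # s1  (sample z_a)
print(map_class(0.7, 0.1))   # s1  (sample z_b)
```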
(b) Given that z(T ) = α1 + n0 or z(T ) = α2 + n0 . Here assume α1 < α2 . The conditional
density function of z when z(T ) = α1 + n0 is p(z|α1 ) and when z(T ) = α2 + n0 is p(z|α2 ). The
noise n0 is Gaussian, with density function

p(n0) = (1 / (σ0 √(2π))) exp(−n0² / (2σ0²)).

Then p(z | α1) and p(z | α2) are also Gaussian, with means α1 and α2 respectively. That is,

p(z | α1) = (1 / (σ0 √(2π))) exp(−(z − α1)² / (2σ0²))

and

p(z | α2) = (1 / (σ0 √(2π))) exp(−(z − α2)² / (2σ0²)).
Now, assume the prior probabilities P(α1) = p and P(α2) = 1 − p. The MAP rule says:
decide α1 if p(α1 | z) > p(α2 | z), where

p(α1 | z) = p(z | α1) P(α1) / [p(z | α1) P(α1) + p(z | α2) P(α2)]

and

p(α2 | z) = p(z | α2) P(α2) / [p(z | α1) P(α1) + p(z | α2) P(α2)].

Since the denominators are equal, the MAP rule reduces to: decide α1 if p(z | α1) P(α1) > p(z | α2) P(α2).
That is,

p · (1 / (σ0 √(2π))) exp(−(z − α1)² / (2σ0²)) > (1 − p) · (1 / (σ0 √(2π))) exp(−(z − α2)² / (2σ0²))

⟹ p exp(−(z − α1)² / (2σ0²)) > (1 − p) exp(−(z − α2)² / (2σ0²))

⟹ exp(−[(z − α1)² − (z − α2)²] / (2σ0²)) > (1 − p) / p

⟹ −[(z − α1)² − (z − α2)²] / (2σ0²) > ln((1 − p) / p)

⟹ −(z − α1)² + (z − α2)² > 2σ0² ln((1 − p) / p)

⟹ α2² − α1² − 2z(α2 − α1) > 2σ0² ln((1 − p) / p)

⟹ α2² − α1² − 2σ0² ln((1 − p) / p) > 2z(α2 − α1)

⟹ z < (α2 + α1)/2 − (σ0² / (α2 − α1)) ln((1 − p) / p).

Figure 2: Conditional density functions of z for problem 3.2(b) (when α1 < α2 )

So the MAP decision rule is:

decide α1 if z ≤ (α2 + α1)/2 − (σ0² / (α2 − α1)) ln((1 − p) / p),
decide α2 if z > (α2 + α1)/2 − (σ0² / (α2 − α1)) ln((1 − p) / p).

From figure 2 and the decision rule it is evident that when α1 and α2 have equal priors,
the threshold is the mean of α1 and α2, that is (α1 + α2)/2 (the point of intersection of the two
conditional density functions). When α1 has the larger prior probability, the threshold will
be higher than (α1 + α2)/2; similarly, if α2 has the larger prior probability, the threshold will be
lower than (α1 + α2)/2.
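The threshold in the decision rule above is easy to compute directly; the numeric values of α1, α2 and σ0 in the example calls are arbitrary illustrations, not values from the problem:

```python
import math

def map_threshold(alpha1, alpha2, sigma0, p):
    """Threshold z_T of the MAP rule above (alpha1 < alpha2, prior P(alpha1) = p):
    decide alpha1 if z <= z_T, else decide alpha2."""
    return (alpha1 + alpha2) / 2 - sigma0**2 / (alpha2 - alpha1) * math.log((1 - p) / p)

# Equal priors: the threshold is the midpoint of alpha1 and alpha2.
print(map_threshold(alpha1=0.0, alpha2=2.0, sigma0=1.0, p=0.5))   # 1.0
# A larger prior on alpha1 pushes the threshold above the midpoint.
print(map_threshold(alpha1=0.0, alpha2=2.0, sigma0=1.0, p=0.7))   # about 1.42
```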

3.3
(a) Let X denote the random variable representing the number of questions answered correctly.
For each value θ ∈ {θ1 , θ2 , θ3 }, we have using Bayes’ rule,
pΘ|X(θ | k) = pΘ(θ) pX|Θ(k | θ) / Σ_{i=1}^{3} pΘ(θi) pX|Θ(k | θi).

The conditional PMF pX|Θ is binomial with n = 10 and probability of success pi equal to the
probability of answering a question correctly, given that the student is of category i, i.e.,

pi = θi + (1 − θi) · (1/3) = (2θi + 1) / 3.

Thus we have

p1 = 1.6/3,   p2 = 2.4/3,   p3 = 2.9/3.

For a given number of correct answers k, the MAP rule selects the category i for which the
corresponding binomial probability C(10, k) pi^k (1 − pi)^(10−k) is maximized.
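A sketch that evaluates the posterior over the three categories for a given k, assuming equal priors over the categories (an assumption, but consistent with the posterior values quoted in part (b)); the θi values 0.3, 0.7, 0.95 are those implied by p1 = 1.6/3, p2 = 2.4/3, p3 = 2.9/3:

```python
from math import comb

# Posterior over the three student categories given k correct answers out of 10,
# assuming equal priors over the categories.
thetas = [0.3, 0.7, 0.95]               # theta_i implied by p_i = (2*theta_i + 1)/3
p = [(2 * t + 1) / 3 for t in thetas]   # 1.6/3, 2.4/3, 2.9/3
prior = [1 / 3, 1 / 3, 1 / 3]

def category_posterior(k, n=10):
    lik = [comb(n, k) * pi**k * (1 - pi)**(n - k) for pi in p]
    num = [pr * l for pr, l in zip(prior, lik)]
    return [v / sum(num) for v in num]

post = category_posterior(5)
print(post)                                       # roughly [0.901, 0.099, 0.000]; cf. part (b)
print(1 + max(range(3), key=lambda i: post[i]))   # MAP category for k = 5: category 1
```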
(b) The posterior PMF of M is given by

pM|X(m | X = k) = Σ_{i=1}^{3} pΘ|X(θi | X = k) P(M = m | X = k, Θ = θi).

The probabilities pΘ|X(θi | X = k) were calculated in part (a) of the question, and the probabilities P(M = m | X = k, Θ = θi) are binomial and can be calculated in the manner described
in Problem 2.1(b). For k = 5, the posterior PMF can be explicitly calculated for m = 0, . . . , 5,
and the MAP and LMS estimates can be obtained from it. The probabilities
pΘ|X(θi | X = k) as calculated in part (a) when k = 5 are

pΘ|X(θ1 | X = 5) ≈ 0.9010,   pΘ|X(θ2 | X = 5) ≈ 0.0989,   pΘ|X(θ3 | X = 5) ≈ 0.0001.

The probability that the student knows the answer to a question that she answered correctly is

qi = θi / (θi + (1 − θi)/3)

for i = 1, 2, 3. The probabilities P(M = m | X = k, Θ = θi) are binomial and are given by

P(M = m | X = k, Θ = θi) = C(k, m) qi^m (1 − qi)^(k−m).

For k = 5, the posterior PMF can be explicitly calculated for m = 0, . . . , 5:

pM|X(0 | X = 5) ≈ 0.0145,
pM|X(1 | X = 5) ≈ 0.0929,
pM|X(2 | X = 5) ≈ 0.2402,
pM|X(3 | X = 5) ≈ 0.3173,
pM|X(4 | X = 5) ≈ 0.2335,
pM|X(5 | X = 5) ≈ 0.1015.

It follows that the MAP estimate is m̂ = 3. The conditional expectation estimate (LMS estimate) is

E[M | X = 5] = Σ_{m=0}^{5} m pM|X(m | X = 5) ≈ 2.9668 ≈ 3.
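The posterior PMF of M and the two estimates can be reproduced with a few lines (same equal-prior assumption over the categories as above):

```python
from math import comb

# Posterior PMF of M (number of questions the student knew) given X = 5, mixing
# the three categories with their posterior weights from part (a).
thetas = [0.3, 0.7, 0.95]
p = [(2 * t + 1) / 3 for t in thetas]
lik = [comb(10, 5) * pi**5 * (1 - pi)**5 for pi in p]
w = [l / sum(lik) for l in lik]                # posterior category weights
q = [t / (t + (1 - t) / 3) for t in thetas]    # P(knew it | answered correctly)

pmf = [sum(wi * comb(5, m) * qi**m * (1 - qi)**(5 - m) for wi, qi in zip(w, q))
       for m in range(6)]
map_est = max(range(6), key=lambda m: pmf[m])
lms_est = sum(m * pmf[m] for m in range(6))
print([round(v, 4) for v in pmf])   # close to the values listed above
print(map_est, round(lms_est, 2))   # 3 2.97
```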

4 Bayesian Least Mean Squares Estimation (8.3)


4.1
Let Θ be the car speed and let X be the radar’s measurement. The PDF of Θ is given by
fΘ(θ) = 1/20 for 55 ≤ θ ≤ 75, and 0 otherwise.

Similarly, the conditional PDF of X for a given value of Θ = θ is uniformly distributed over
the interval [θ, θ + 5]:

fX|Θ(x | θ) = 1/5 for θ ≤ x ≤ θ + 5, and 0 otherwise.
[Figure: sketches of the uniform densities fΘ(θ), of height 1/20 on [55, 75], and fX|Θ(x | θ), of height 1/5 on [θ, θ + 5].]
So, the joint PDF of Θ and X is uniform over the set of pairs (θ, x) that satisfy 55 ≤ θ ≤ 75
and θ ≤ x ≤ θ + 5. That is,

fΘ,X(θ, x) = fΘ(θ) fX|Θ(x | θ) = (1/20) · (1/5) = 1/100

if 55 ≤ θ ≤ 75 and θ ≤ x ≤ θ + 5, and is zero for all other values of (θ, x).
Note: For any given value x of X, E[(Θ − θ̂)² | X = x] is minimized when θ̂ = E[Θ | X = x].
That is, θ̂ = E[Θ | X = x] is the LMS estimate of Θ for a given value of X = x.

[Figure: the parallelogram-shaped support of fΘ,X in the (x, θ)-plane, with the piecewise-linear least squares estimate E[Θ | X = x] drawn through it.]

The parallelogram in the above figure is the set of (θ, x) for which fΘ,X(θ, x) is nonzero. Given
that X = x, the posterior PDF fΘ|X is uniform on the corresponding vertical section of the
parallelogram. Thus E[Θ | X = x] is the midpoint of that section, which happens to be a piecewise linear function of x (shown in red in the figure). So the LMS estimate of the car's speed Θ
based on the radar's measurement x is

E[Θ | X = x] = x/2 + 27.5, if 55 ≤ x ≤ 60,
E[Θ | X = x] = x − 2.5,    if 60 ≤ x ≤ 75,
E[Θ | X = x] = x/2 + 35,   if 75 ≤ x ≤ 80.
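Since the conditional PDF of Θ given X = x is uniform on the intersection of [55, 75] and [x − 5, x], the LMS estimate is simply the midpoint of that intersection; a small sketch checking the three branches of the formula:

```python
# LMS estimate for the radar problem: given X = x, Theta is uniform on the
# intersection of [55, 75] and [x - 5, x], so E[Theta | X = x] is its midpoint.
def lms_estimate(x):
    lo = max(55.0, x - 5.0)
    hi = min(75.0, x)
    return (lo + hi) / 2

for x in (57.0, 68.0, 78.0):        # one point in each branch of the formula
    print(x, lms_estimate(x))       # 56.0, 65.5, 74.0 respectively
```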

5 Bayesian Linear Least Mean Squares Estimation (8.4)


5.1
Note:
1. The linear LMS estimator Θ̂ of Θ based on X is

Θ̂ = E[Θ] + (cov(Θ, X) / var(X)) (X − E[X]) = E[Θ] + ρ (σΘ / σX) (X − E[X]),

where

ρ = cov(Θ, X) / (σΘ σX)

is the correlation coefficient.

2. The resulting mean squared estimation error is equal to

(1 − ρ²) σΘ².

In our problem, the random variable Θ is uniformly distributed in the interval [4, 10] and

X = Θ + W,

where W is uniformly distributed in the interval [−1, 1] and is independent of Θ. The linear
LMS estimator of Θ given X is

Θ̂ = E[Θ] + (cov(Θ, X) / σX²) (X − E[X]).

Since Θ and W are independent, we have

E[X] = E[Θ] + E[W] = E[Θ],
σX² = σΘ² + σW²,
cov(Θ, X) = E[(Θ − E[Θ])(X − E[X])] = E[(Θ − E[Θ])²] = σΘ²,

where the last relation follows from the independence of Θ and W. Using the formulas for the
mean and variance of the uniform PDF, we have

E[Θ] = 7,   σΘ² = 3,
E[W] = 0,   σW² = 1/3.
Thus, the linear LMS estimator is

Θ̂ = 7 + (3 / (3 + 1/3)) (X − 7),

or

Θ̂ = 7 + (9/10)(X − 7).

The mean squared error is (1 − ρ²) σΘ². We have

ρ² = (cov(Θ, X) / (σΘ σX))² = (σΘ² / (σΘ σX))² = σΘ² / σX² = 3 / (3 + 1/3) = 9/10.

Hence the mean squared error is

(1 − ρ²) σΘ² = (1 − 9/10) · 3 = 3/10.

5.2
(a) Using the figure below and following a procedure similar to problem 4.1, we can obtain
the LMS estimator as

g(X) = E[Θ | X] = X/2, if 0 ≤ X < 1,
g(X) = E[Θ | X] = X − 1/2, if 1 ≤ X ≤ 2.
[Figure: the support of fX,Θ in the (x, θ)-plane, with the piecewise-linear least squares estimate E[Θ | X = x] drawn through it.]

(b) We first derive the conditional variance E[(Θ − g(X))² | X = x]. If x ∈ [0, 1], the conditional PDF of Θ is uniform over the interval [0, x], and

E[(Θ − g(X))² | X = x] = x²/12.

Similarly, if x ∈ [1, 2], the conditional PDF of Θ is uniform over the interval [x − 1, x], and

E[(Θ − g(X))² | X = x] = 1/12.

We now evaluate the expectation and variance of g(X). Note that (Θ, X) is uniform over a
region with area 3/2, so that the constant c must be equal to 2/3. We have

E[g(X)] = E[E[Θ | X]] = E[Θ]
        = ∬ θ fX,Θ(x, θ) dθ dx
        = ∫_0^1 ∫_0^x (2/3) θ dθ dx + ∫_1^2 ∫_{x−1}^x (2/3) θ dθ dx
        = 7/9.
Furthermore,

var(g(X)) = var(E[Θ | X])
          = E[(E[Θ | X])²] − (E[E[Θ | X]])²
          = ∫_0^2 (E[Θ | X = x])² fX(x) dx − (E[Θ])²
          = ∫_0^1 (x/2)² (2x/3) dx + ∫_1^2 (x − 1/2)² (2/3) dx − (7/9)²
          = 103/648
          ≈ 0.159,

where

fX(x) = 2x/3, if 0 ≤ x ≤ 1,
fX(x) = 2/3,  if 1 ≤ x ≤ 2.
(c) The expectations E[(Θ − g(X))²] and E[var(Θ | X)] are equal because, by the law of
iterated expectations,

E[(Θ − g(X))²] = E[E[(Θ − g(X))² | X]] = E[var(Θ | X)].

Recall from part (b) that

var(Θ | X = x) = x²/12, if 0 ≤ x < 1,
var(Θ | X = x) = 1/12,  if 1 ≤ x ≤ 2.

It follows that

E[var(Θ | X)] = ∫_0^2 var(Θ | X = x) fX(x) dx = ∫_0^1 (x²/12)(2x/3) dx + ∫_1^2 (1/12)(2/3) dx = 5/72.
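A numerical cross-check of E[g(X)], var(g(X)) and E[var(Θ | X)], using the marginal fX and the piecewise expressions derived above (the grid size is arbitrary):

```python
# Numerical check of E[g(X)], var(g(X)) and E[var(Theta | X)] for problem 5.2.
def f_X(x):               # marginal of X, from part (b)
    return 2 * x / 3 if x <= 1 else 2 / 3

def g(x):                 # E[Theta | X = x], from part (a)
    return x / 2 if x <= 1 else x - 0.5

def cond_var(x):          # var(Theta | X = x), from part (b)
    return x**2 / 12 if x <= 1 else 1 / 12

def integrate(h, a=0.0, b=2.0, n=100_000):    # simple midpoint rule
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

e_g = integrate(lambda x: g(x) * f_X(x))                 # about 7/9
var_g = integrate(lambda x: g(x)**2 * f_X(x)) - e_g**2   # about 103/648 ~ 0.159
e_cond_var = integrate(lambda x: cond_var(x) * f_X(x))   # about 5/72 ~ 0.069
print(e_g, var_g, e_cond_var)
```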

(d) By the law of total variance, we have

var(Θ) = E[var(Θ | X)] + var(E[Θ | X]).

Using the results from parts (b) and (c), we have


var(Θ) = E[var(Θ | X)] + var(E[Θ | X]) = 5/72 + 103/648 = 37/162.

An alternative approach to calculating the variance of Θ is to first find the marginal PDF fΘ(θ)
and then apply the definition

var(Θ) = ∫_0^2 (θ − E[Θ])² fΘ(θ) dθ.

(e) The linear LMS estimator is

Θ̂ = E[Θ] + (cov(X, Θ) / σX²) (X − E[X]).

We have

E[X] = ∫_0^1 ∫_0^x (2/3) x dθ dx + ∫_1^2 ∫_{x−1}^x (2/3) x dθ dx = 2/9 + 1 = 11/9,

E[X²] = ∫_0^1 ∫_0^x (2/3) x² dθ dx + ∫_1^2 ∫_{x−1}^x (2/3) x² dθ dx = 1/6 + 14/9 = 31/18,

var(X) = E[X²] − (E[X])² = 31/18 − 121/81 = 37/162,

E[Θ] = ∫_0^1 ∫_0^x (2/3) θ dθ dx + ∫_1^2 ∫_{x−1}^x (2/3) θ dθ dx = 1/9 + 2/3 = 7/9,

E[XΘ] = ∫_0^1 ∫_0^x (2/3) xθ dθ dx + ∫_1^2 ∫_{x−1}^x (2/3) xθ dθ dx = 1/12 + 19/18 = 41/36,

cov(X, Θ) = E[XΘ] − E[X]E[Θ] = 41/36 − (11/9) · (7/9) = 61/324.

Thus, the linear LMS estimator is

Θ̂ = 7/9 + ((61/324) / (37/162)) (X − 11/9) = 7/9 + (61/74)(X − 11/9) ≈ 0.8243X − 0.2297.

Its mean squared error is

E[(Θ − Θ̂)²] = E[(Θ − (0.8243X − 0.2297))²],

which can be evaluated by expanding the square and using the moments above. Alternatively, we can use the values of var(X), var(Θ), and cov(X, Θ) found earlier to calculate the correlation coefficient ρ, and then use the fact that the mean squared error is equal to

(1 − ρ²) var(Θ).

Either way, the mean squared error is approximately 0.0732, slightly larger than the mean squared error 5/72 ≈ 0.0694 of the unconstrained LMS estimator from part (c), as it must be.
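A numerical cross-check of the moments and the resulting linear LMS estimator, integrating directly over the support of the joint PDF (the grid sizes are arbitrary):

```python
# Numerical cross-check of the moments used in part (e), integrating over the
# support {max(0, x - 1) <= theta <= x, 0 <= x <= 2} with joint density c = 2/3.
def integrate2d(h, nx=1000, ntheta=200):      # midpoint rule on the support
    dx = 2.0 / nx
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * dx
        lo, hi = max(0.0, x - 1.0), x
        dth = (hi - lo) / ntheta
        total += sum(h(x, lo + (j + 0.5) * dth) for j in range(ntheta)) * (2 / 3) * dth * dx
    return total

E_X  = integrate2d(lambda x, t: x)        # about 11/9
E_X2 = integrate2d(lambda x, t: x * x)    # about 31/18
E_T  = integrate2d(lambda x, t: t)        # about 7/9
E_T2 = integrate2d(lambda x, t: t * t)    # about 5/6
E_XT = integrate2d(lambda x, t: x * t)    # about 41/36
var_X, var_T = E_X2 - E_X**2, E_T2 - E_T**2     # both about 37/162
cov = E_XT - E_X * E_T                          # about 61/324
slope = cov / var_X
intercept = E_T - slope * E_X
mse = var_T - cov**2 / var_X                    # equals (1 - rho^2) var(Theta)
print(round(slope, 4), round(intercept, 4), round(mse, 4))   # ~0.8243 -0.2297 0.0732
```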
