HW1 Solutions


EE278 Statistical Signal Processing Stanford, Autumn 2023

Homework 1
Due: Thursday, October 5, 2023, 1pm on Gradescope

Please upload your answers to Gradescope on time. Start a new page for every problem. For
the programming/simulation questions you may use any reasonable programming language.
Comment your source code and include the code and a brief overall explanation with your
answers.

1. Exercise 1.1 in textbook. Let A1 and A2 be arbitrary events and show that

Pr{A1 ∪ A2 } + Pr{A1 ∩ A2 } = Pr{A1 } + Pr{A2 }

Explain which parts of the sample space are being double counted on both sides of this
equation and which parts are being counted once.

Answer:

The identity can be seen from a Venn diagram. The region A1 ∩ A2 is part of A1 ∪ A2
and is thus counted twice on the left side of the equation (once inside Pr{A1 ∪ A2 } and
once as Pr{A1 ∩ A2 }). It is also counted twice on the right side, since its sample points
lie in both A1 and A2 . Every other part of A1 ∪ A2 is counted exactly once on each side.

2. Exercise 1.12 in textbook, parts (a), (b). Let X be a random variable with CDF
F (x). Find the CDF of the following random variables.
(a) The maximum of n i.i.d. rvs, each with CDF F (x).

Answer:

Let M+ be the maximum of the n rvs X1 , . . . , Xn . Note that for any real x, M+ is
less than or equal to x if and only if Xj ≤ x for each j, 1 ≤ j ≤ n. Thus

Pr{M+ ≤ x} = Pr{X1 ≤ x, X2 ≤ x, . . . , Xn ≤ x} = ∏_{j=1}^{n} Pr{Xj ≤ x},

where we have used the independence of the Xj 's. Finally, since Pr{Xj ≤ x} =
FX (x) for each j, we have FM+ (x) = Pr{M+ ≤ x} = (FX (x))^n .

(b) The minimum of n i.i.d. rvs, each with CDF F (x).

Answer:

Let M− be the minimum of X1 , . . . , Xn . Then, in the same way as in (a), M− > y if and
only if Xj > y for 1 ≤ j ≤ n, for every choice of y. We could make the
same statement using greater than or equal in place of strictly greater than, but the
strict inequality is what is needed for the CDF. Thus,

Pr{M− > y} = Pr{X1 > y, X2 > y, . . . , Xn > y} = ∏_{j=1}^{n} Pr{Xj > y},

where we again use independence. It follows that 1 − FM− (y) = (1 − FX (y))^n and therefore FM− (y) = 1 − (1 − FX (y))^n .
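As a sanity check on both formulas, the following sketch (our addition, not part of the original solution) estimates the CDFs of the maximum and minimum of n i.i.d. Unif[0, 1] rvs by Monte Carlo, using the fact that F(x) = x on [0, 1]:

```python
import numpy as np

# Sanity-check sketch (not part of the original solution): for X_j ~ Unif[0, 1],
# F(x) = x on [0, 1], so F_max(x) = x^n and F_min(x) = 1 - (1 - x)^n.
rng = np.random.default_rng(0)
n, trials, x = 5, 200_000, 0.7

samples = rng.random((trials, n))             # each row holds X_1, ..., X_n
emp_max = np.mean(samples.max(axis=1) <= x)   # empirical Pr{M+ <= x}
emp_min = np.mean(samples.min(axis=1) <= x)   # empirical Pr{M- <= x}

print(emp_max, x**n)              # both near 0.7^5 = 0.16807
print(emp_min, 1 - (1 - x)**n)    # both near 1 - 0.3^5 = 0.99757
```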

3. Exercise 1.21 in textbook, parts (a)-(e).


(a) Show that, for uncorrelated rvs, the expected value of the product is equal to
the product of the expected values (by definition, X and Y are uncorrelated if
E[(X − X̄)(Y − Ȳ )] = 0).

Answer:

This follows from the linearity of expectation and the fact that X̄ and Ȳ are constants:

0 = E[(X − X̄)(Y − Ȳ )]
= E[XY ] − E[X Ȳ ] − E[X̄Y ] + E[X̄ Ȳ ]
= E[XY ] − X̄ Ȳ − X̄ Ȳ + X̄ Ȳ
= E[XY ] − X̄ Ȳ ,

so E[XY ] = X̄ Ȳ = E[X]E[Y ], as desired.

(b) Show that if X and Y are uncorrelated, then the variance of X + Y is equal to the
variance of X plus the variance of Y.

Answer:

Var(X + Y ) = E[(X + Y − (X̄ + Ȳ ))²]
= E[((X − X̄) + (Y − Ȳ ))²]
= E[(X − X̄)² + 2(X − X̄)(Y − Ȳ ) + (Y − Ȳ )²]
= E[(X − X̄)²] + 2E[(X − X̄)(Y − Ȳ )] + E[(Y − Ȳ )²]
= Var(X) + 0 + Var(Y ).

(c) Show that if X1 . . . , Xn are uncorrelated, then the variance of the sum is equal to
the sum of the variances.

Answer:

If X1 , . . . , Xn are uncorrelated, then X1 + · · · + Xn−1 and Xn are uncorrelated. This
is because

E[(∑_{i=1}^{n−1} (Xi − X̄i ))(Xn − X̄n )] = ∑_{i=1}^{n−1} E[(Xi − X̄i )(Xn − X̄n )] = 0.

Then, using part (b),


Var(X1 + . . . + Xn ) = Var(X1 + . . . + Xn−1 ) + Var(Xn )
Using this and n = 2 as the basis for the induction, we see that Var(X1 +. . .+Xn ) =
Var(X1 ) + . . . + Var(Xn ).
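A quick numerical check of this additivity (a sketch under the assumption of independent, hence uncorrelated, rvs; the distribution choice is arbitrary):

```python
import numpy as np

# Sketch: for independent (hence uncorrelated) columns, the variance of the
# sum should match the sum of the variances. Exp(scale=2) has variance 4.
rng = np.random.default_rng(1)
X = rng.exponential(scale=2.0, size=(500_000, 3))

var_of_sum = X.sum(axis=1).var()    # Var(X_1 + X_2 + X_3)
sum_of_vars = X.var(axis=0).sum()   # Var(X_1) + Var(X_2) + Var(X_3)
print(var_of_sum, sum_of_vars)      # both near 3 * 4 = 12
```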

(d) Show that independent rvs are uncorrelated.

Answer:

If X and Y are independent, then E[XY ] = E[X]E[Y ]. Then using part a), we get
that X and Y are uncorrelated. Alternatively, if X and Y are independent, then
X − X̄ and Y − Ȳ are also independent and therefore
E[(X − X̄)(Y − Ȳ )] = E[X − X̄]E[Y − Ȳ ] = 0.

(e) Let X, Y be identically distributed ternary valued rvs with the PMF pX (−1) =
pX (1) = 1/4; pX (0) = 1/2. Find a simple joint probability assignment such that X
and Y are uncorrelated but dependent.

Answer:

This becomes easier to understand if we express X as |X| · Xs , where |X| is the
magnitude of X and Xs is ±1 with equal probability. From the given PMF, we see
that |X| and Xs are independent. Similarly, let Y = |Y | · Ys . For reasons soon to
be apparent, we construct a joint PMF by taking |Y | = |X| and by taking Xs , Ys ,
and |X| to be statistically independent. Since X and Y are zero mean, they are
uncorrelated if E[XY ] = 0. For the given joint PMF, we have

E[XY ] = E[|X| · Xs · |Y | · Ys ] = E[|X|2 ]E[Xs Ys ] = 0,

where we have used the fact that Xs and Ys are zero mean and independent. Thus X
and Y are uncorrelated. On the other hand, X and Y are certainly not independent
since |X| and |Y | are dependent and in fact identical. This should not be surprising
since the correlation of two rv’s is a single number, whereas many numbers are
needed to specify the joint distribution. This example was constructed through
the realization that the type of symmetry here between X and Y is sufficient to
guarantee uncorrelatedness, but not enough to cause independence.

Answer:

An alternative solution is the following (note the solution is not unique).

Let pX,Y (0, 1) = pX,Y (0, −1) = pX,Y (1, 0) = pX,Y (−1, 0) = 1/4, and pX,Y (x, y) = 0
otherwise. Then E[XY ] = 0 because one of X and Y is always zero, and E[X] =
E[Y ] = 0, so E[XY ] = E[X]E[Y ] and X and Y are therefore uncorrelated by part (a). However,
pX (1)pY (1) = (1/4)² = 1/16 ≠ 0 = pX,Y (1, 1), so X and Y are dependent.
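The claims about this joint assignment can be verified directly from the PMF table; the following sketch (our own helper code, not part of the original solution) does exactly that:

```python
# Sketch verifying the alternative joint assignment: mass 1/4 on each of
# (0, 1), (0, -1), (1, 0), (-1, 0), and 0 elsewhere.
pmf = {(0, 1): 0.25, (0, -1): 0.25, (1, 0): 0.25, (-1, 0): 0.25}

E_XY = sum(p * x * y for (x, y), p in pmf.items())   # 0: one factor is always 0
E_X = sum(p * x for (x, y), p in pmf.items())        # 0 by symmetry
E_Y = sum(p * y for (x, y), p in pmf.items())        # 0 by symmetry
pX1 = sum(p for (x, y), p in pmf.items() if x == 1)  # marginal pX(1) = 1/4
pY1 = sum(p for (x, y), p in pmf.items() if y == 1)  # marginal pY(1) = 1/4

print(E_XY - E_X * E_Y)                  # 0.0 -> uncorrelated
print(pX1 * pY1, pmf.get((1, 1), 0.0))   # 1/16 vs 0 -> dependent
```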

4. Consider the binary symmetric channel model mentioned in class. This is a channel
we can use to transmit bits and each bit is flipped independently with probability p
regardless of the value of the bit. We want to estimate the flip probability p of the
channel by sending a known training bit sequence over the channel and counting the
number of bit flips at the receiver side.

a) Suppose p = 0.1 and we want to estimate it to within accuracy plus or minus 0.01.
We want to know the minimum length n∗ of the training sequence (i.e. the number
of bits you need to send over the channel) such that:

Pr{|p̂n − p| > 0.01} < 0.05,

where p̂n is the fraction of bits flipped. Do this in two ways and compare your
answers:
i. Bound using Chebyshev bound

Answer:

As defined in class, X1 , . . . , Xn are i.i.d., taking the value 1 with probability p and 0 with
probability 1 − p, so each has mean p and variance σ² = p(1 − p). For a given
n, Chebyshev's inequality gives us

Pr{|p̂n − p| > 0.01} ≤ Var(p̂n )/(0.01)².

As derived in class, Var(p̂n ) = σ²/n = p(1 − p)/n. If we take n > p(1 − p)/((0.05)(0.01)²), then

Pr{|p̂n − p| > 0.01} ≤ p(1 − p)/(n(0.01)²) < 0.05.

Therefore, Chebyshev's inequality tells us that we need n > (0.1)(0.9)/((0.05)(0.01)²) = 18000.
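The final arithmetic can be reproduced in a couple of lines (a sketch; the variable names are ours):

```python
# Sketch of the Chebyshev sample-size calculation for p = 0.1.
p, eps, delta = 0.1, 0.01, 0.05     # flip prob, accuracy, allowed error prob
n_star = p * (1 - p) / (delta * eps**2)
print(n_star)                       # 18000 (up to floating point)
```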

ii. Estimate using the Central-limit approximation

Answer:
Using the CLT, we know that p̂n ∼ N (p, σ²/n). This means that p̂n − p ∼ N (0, σ²/n),
where σ² = p(1 − p) = 0.09. Then

Pr{|p̂n − p| > 0.01} ≈ 1 − Φ(0.01/(σ/√n)) + Φ(−0.01/(σ/√n)),

where Φ(·) is the CDF of the standard normal distribution.
By iterating over n and calculating Pr{|p̂n − p| > 0.01} for each n, we can find
the smallest n for which Pr{|p̂n − p| > 0.01} < 0.05. This gives us n ≥ 3458.
A sample code for doing this in Python is given below:

import numpy as np
from scipy.stats import norm

p = 0.1
var = p * (1 - p)    # sigma^2 = p(1 - p)
eps = 0.01           # target accuracy
max_prob = 0.05      # allowed error probability
n = 1
prob = 1.0
while prob >= max_prob:
    n += 1
    # Two-sided tail probability under the CLT approximation
    prob = 1 - norm.cdf(eps / np.sqrt(var / n)) + \
        norm.cdf(-eps / np.sqrt(var / n))
print("n = ", n)
print("Error probability = ", prob)

If we approximate the Q-function with the exponential expression from class,
we instead continue by

Pr{|p̂n − p| > 0.01} ≈ 1 − Φ(0.01/(σ/√n)) + Φ(−0.01/(σ/√n))
= 2Q(0.01/(σ/√n))
≤ 2 · (1/√(2π)) exp(−n/(2σ²·10⁴)),

where the inequality is valid for n > 2σ²·10⁴, so it suffices to choose

n ≥ −2 · 10⁴ σ² log(0.05√(2π)/2) ≈ 4986.
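The closing numerical step can be checked directly (a sketch with our own variable names):

```python
import math

# Sketch: evaluate n >= -2 * 10^4 * sigma^2 * log(0.05 * sqrt(2*pi) / 2)
# with sigma^2 = p(1 - p) = 0.09.
sigma2 = 0.09
n_bound = -2 * 1e4 * sigma2 * math.log(0.05 * math.sqrt(2 * math.pi) / 2)
print(n_bound)   # approximately 4986
```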

b) In practice p is unknown. (That’s the whole point!) Repeat part (a) to come up with
sensible answers using both ways. You may do the calculations either analytically
or using a short computer program.
Hint: Can you identify the worst-case value of p that will make the bounds you
derived in part (a) largest?

Answer:

i. Observe that the Chebyshev bound depends on p only through σ² = p(1 −
p). Also observe that p(1 − p) ≤ 1/4, where the maximum is achieved at
p = 1/2. Therefore, we have

Pr{|p̂n − p| > 0.01} ≤ p(1 − p)/(n(0.01)²) ≤ 0.25/(n(0.01)²).

By choosing n > 0.25/((0.05)(0.01)²) = 50000, Pr{|p̂n − p| > 0.01} < 0.05 for any value
of p.
ii. Using the CLT, the probability

Pr{|p̂n − p| > 0.01} ≈ 1 − Φ(0.01/(σ/√n)) + Φ(−0.01/(σ/√n))

only depends on p through σ. As σ increases, Pr{|p̂n − p| > 0.01} increases
as the Gaussian distribution becomes more spread out. Thus, repeating the
calculation of part a) with σ = 1/2 (i.e. p = 1/2), we get a value of n that satisfies
Pr{|p̂n − p| > 0.01} < 0.05 for any value of p.
In this case, the CLT gives n ≥ 9604. You can use the same code as in part a) by
setting p = 0.5. The Q-function bound with p = 0.5 gives n ≥ 13850.

5. Two Envelopes
A fixed amount a is placed in one envelope and an amount 5a is placed in the other.
Suppose one of the two envelopes is chosen uniformly at random, and let X be the
amount in the opened envelope. Let Y be the amount in the other envelope.

Answer:

By the design of the experiment,

(X, Y ) = (a, 5a) with probability 1/2,
(X, Y ) = (5a, a) with probability 1/2.

We use this joint pmf to find the required expectations.

a) Find E[Y /X].

Answer:

E[Y /X] = (1/2)(5a/a) + (1/2)(a/(5a)) = 5/2 + 1/10 = 13/5.

b) Find E[X/Y ].

Answer:

Symmetrically, E[X/Y ] = 13/5.

c) Find E[Y ]/E[X].

Answer:

E[Y ]/E[X] = (5a/2 + a/2)/(a/2 + 5a/2) = 3a/3a = 1. Obvious in retrospect.
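All three expectations can also be checked by simulating the experiment (a Monte Carlo sketch; a = 1 is an arbitrary choice of ours):

```python
import numpy as np

# Monte Carlo sketch of the two-envelope experiment; a = 1 is arbitrary.
rng = np.random.default_rng(2)
a = 1.0
pick_first = rng.random(400_000) < 0.5   # which envelope is opened
X = np.where(pick_first, a, 5 * a)       # amount in the opened envelope
Y = np.where(pick_first, 5 * a, a)       # amount in the other envelope

print(np.mean(Y / X))             # near 13/5 = 2.6
print(np.mean(X / Y))             # near 13/5 = 2.6
print(np.mean(Y) / np.mean(X))    # near 1
```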

6. Computing probability of events


(a) Uniform r.v. Consider a random variable X ∼ Unif[2, 10], i.e., uniformly chosen
from the interval [2, 10]. What is the probability that X 2 − 12X + 35 > 0?

Answer:
The probability of any event is the area under the pdf over that event. We note that X² − 12X + 35 =
(X − 5)(X − 7). Thus X² − 12X + 35 > 0 iff X < 5 or X > 7. Since X ∈ [2, 10],

Pr{X² − 12X + 35 > 0} = Pr{({X < 5} ∪ {X > 7}) ∩ {2 ≤ X ≤ 10}}
= Pr{{2 ≤ X < 5} ∪ {7 < X ≤ 10}}
= Pr{2 ≤ X < 5} + Pr{7 < X ≤ 10}
= (1/8) × 3 + (1/8) × 3 = 0.75.
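This probability is easy to confirm numerically (a Monte Carlo sketch, our addition):

```python
import numpy as np

# Sketch: estimate Pr{X^2 - 12X + 35 > 0} for X ~ Unif[2, 10].
rng = np.random.default_rng(3)
X = rng.uniform(2, 10, size=500_000)
prob = np.mean(X**2 - 12 * X + 35 > 0)
print(prob)   # near 3/8 + 3/8 = 0.75
```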

(b) Laplacian r.v. Let X be a continuous random variable with pdf fX (x) = (1/2)e−|x|
for −∞ < x < ∞. Find the probability of each of the following events.
i. {X ≤ 2 or X ≥ 0}.

Answer:

Pr{X ≤ 2 or X ≥ 0} = Pr{−∞ < X < ∞} = 1.

ii. {|X| + |X − 3| ≤ 3}.

Answer:

We first find the CDF by considering two cases. If x ≤ 0, then

FX (x) = ∫_{−∞}^{x} (1/2)e^t dt = (1/2)e^t |_{t=−∞}^{x} = (1/2)e^x .

If x ≥ 0, then

FX (x) = ∫_{−∞}^{0} (1/2)e^t dt + ∫_{0}^{x} (1/2)e^{−t} dt = 1 − ∫_{x}^{+∞} (1/2)e^{−t} dt = 1 − (1/2)e^{−x} .

We then simplify the expression |x| + |x − 3| on each interval:

|x| + |x − 3| = x + (x − 3) = 2x − 3 > 3 for x > 3,
|x| + |x − 3| = x + (3 − x) = 3 for 0 ≤ x ≤ 3,
|x| + |x − 3| = −x + (3 − x) = 3 − 2x > 3 for x < 0.

Therefore

Pr{|X| + |X − 3| ≤ 3} = Pr{0 ≤ X ≤ 3} = FX (3) − FX (0⁻) = (1 − (1/2)e^{−3}) − 1/2 = (1/2)(1 − e^{−3}).
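The result can be verified against samples from a standard Laplace distribution, whose pdf with location 0 and scale 1 is exactly fX (x) = (1/2)e^{−|x|} (a Monte Carlo sketch, our addition):

```python
import numpy as np

# Sketch: f_X(x) = (1/2) e^{-|x|} is Laplace(loc=0, scale=1), so we can
# compare the closed form (1/2)(1 - e^{-3}) against a Monte Carlo estimate.
rng = np.random.default_rng(4)
X = rng.laplace(loc=0.0, scale=1.0, size=1_000_000)
emp = np.mean(np.abs(X) + np.abs(X - 3) <= 3)
exact = 0.5 * (1 - np.exp(-3))
print(emp, exact)   # both near 0.475
```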
