Solution - Probability and Stochastics - Cinlar

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 71

Byeongho Ban bban@umass.

edu

STAT 605 : Probability Homework Solutions


:Taught by Luc Rey-Bellet

Mathematics & Statistics


University of Massachusetts, Amherst
Byeong Ho Ban

1
STAT 605 Editor : ByeongHo Ban

Due date : September 13th, 2019 ByeongHo Ban

Exercise 1.9
(a) E = {∅, A, B, C, A ∪ B, A ∪ C, B ∪ C, A ∪ B ∪ C = E}.
Since {A, B, C} is a partition of E, it is clear that E ⊆ σC. Conversely, since it is clear that E is a sigma algebra containing C,
σC ⊆ E. Thus, σC = E.

(b) Let E be a collection of all sets that are countable unions of elements taken from C. Since σC contains C and is closed under
countable unions, it is clear that E ⊆ σC. Now, suppose that A ∈ E, then, for some subcollection S ⊆ C,
[
A= B.
B∈S
c
Note that S = C \ S is countable and C is a partition of E so
[
Ac = B ∈ E.
B∈S c

Thus, E is closed under complement. Also, by definition of E, it is clearly closed under countable union. (Countable union of
countable unions is also a countable union.) Now, note that E is a sigma algebra containing C since, for any B ∈ C, B itself is a
countable union of element in C. Therefore, σC ⊆ E, so E = σC. Therefore, any element in σC is a countable union of elements in C.

(c) Let E = {A ⊆ R : A is countable or Ac is countable.}. Clearly, E is not empty since ∅ ∈ E. Suppose B ∈ E, then B is either
countable or cocountable where ’cocountable’ means B c is countable. In other words, B c is cocountable or countable. Therefore,

B c ∈ E which means E is closed under complement. Now,
S∞ suppose that {Bi }i=1 ⊆ E is given. If Bi is countable for all i, then,
since countable union of countable sets is countable, i=1 Bi is countable. On the other hands, if at least one of Bi , let Bk , is
cocountable, then observe that

!c ∞
[ \
Bi = Bic ⊆ Bkc .
i=1 i=1
S∞
Since Bkc is countable, i=1 Bi is cocountable so is in E. Therefore, E is closed under countable union which implies that it is
sigma algebra.

Now, note that every elements in C is singleton so is countable set which means C ⊆ E. Since E is a sigma algebra, σC ⊆ E.
S
Now, suppose that A ∈ E, then A is either countable or cocountable. If A is countable, then A = a∈A {a} ∈ σC since sigma
algebra is closed under countable union. If A is cocountable, then Ac is countable so, by same argument, Ac ∈ σC. Since sigma
algebra is closed under complement, we know A = (Ac )c ∈ σC. Therefore, E ⊆ σC which means E = σC. Thus, every element in
σC is either countable or cocountable.
Exercise 1.10
(a) Suppose that C ⊂ D. Since D ⊂ σD, it follows that C ⊂ σD. Since σC is the smallest sigma algebra containing C and σD is a
sigma algebra containing C, we can deduce that σC ⊂ σD.

(b) Suppose that C ⊂ σD. Since σD is a sigma algebra containing C and σC is the smallest sigma algebra containing C, it follows
that σC ⊂ σD.

(c) Suppose that C ⊂ σD and D ⊂ σC. Then by (b), σC ⊂ σD and σD ⊂ σC. Thus, σC = σD.

(d) Suppose that C ⊂ D ⊂ σC. From the fact C ⊂ D and (a), we deduce that σC ⊂ σD. And from the fact D ⊂ σC and by (b), it
follows that σD ⊂ σC. Therefore, σC = σD.
Exercise 1.11
Let C be a collection of all open intervals. Since each open intervals is open sets and BR is generated by open sets, we know that
C ⊂ BR . Furthermore, since σC is the smallest sigma algebra containing C and BR is a sigma algebra containing C, σC ⊂ BR .
Now, let O be a set of all open sets in R. Using the facts that any open set in R is a countable union of open intervals and that
σC contains all countable union of open intervals, it follows that O ⊂ σC. Since BR = σO is the smallest sigma algebra containing
O and σC is a sigma algebra containing O, we can conclude that BR ⊂ σC. Therefore, BR = σC which means BR is generated by
the collection of all open intervals.

Page 2
STAT 605 Editor : ByeongHo Ban

Exercise 1.12
Let a, b ∈ R with, without loss of generality, a < b be given. It is clear that (−∞, a), (a, ∞), (a, b) are open intervals so, by
Exercise 1.11, they are all Borel sets. Also, observe that
∞  
\ 1
(−∞, a] = −∞, a + ∈ BR
n=1
n
∞  
\ 1
[a, ∞) = a − , ∞ ∈ BR
n=1
n
∞  
\ 1
(a, b] = a, b + ∈ BR
n=1
n
∞  
\ 1
[a, b) = a − , b ∈ BR
n=1
n
∞  
\ 1 1
[a, b] = a − ,b + ∈ BR
n=1
n n
∞  
\ 1 1
{a} = a − ,a + ∈ BR .
n=1
n n

Page 3
STAT 605 Editor : ByeongHo Ban

Exercise 1.13
Let
Ea = {(−∞, a] : a ∈ R}
Eb = {(a, b] : a, b ∈ R, a ≤ b}
Ec = {[a, b] : a, b ∈ R, a ≤ b}
Ed = {(a, ∞) : a ∈ R}
By Exercise 1.12, Ek ⊂ BR for k = a, b, c, d which means σEk ⊂ BR for k = a, b, c, d. Now, let an open interval (x, y) be given.
Then observe that
∞  !
[ 1
(x, y) = (−∞, y) ∩ (x, ∞) = −∞, y − ∩ (−∞, x]c ∈ σEa
n=1
n
∞  
[ 1
(x, y) = x, y − ∈ σEb
n=1
n
∞  
[ 1 1
(x, y) = x + ,y − ∈ σEc
n=1
n n
∞  !c
c
\ 1
(x, y) = (−∞, y) ∩ (x, ∞) = [y, ∞) ∩ (x, ∞) = y − ,∞ ∩ (x, ∞) ∈ σEd
n=1
n
Therefore, bring the notation from Exercise 1.11, C ⊂ σEk for k = a, b, c, d. Since BR is the smallest sigma algebra containing C
by Exercise 1.11 and since σEk is a sigma algebra containing C, we conclude that BR ⊂ σEk for k = a, b, c, d. Therefore, σEk = BR
for k = a, b, c, d so BR can be generated by the given different four collections.

If we restrict our sight only to rational numbers, it is also true. Let


Ea0 = {(−∞, a] : a ∈ Q}
Eb0 = {(a, b] : a, b ∈ Q, a ≤ b}
Ec0 = {[a, b] : a, b ∈ Q, a ≤ b}
Ed0 = {(a, ∞) : a ∈ Q}
Now, let x, y ∈ R \ Q with x < y be given. Then note that there is a increasing sequence, {yn } ⊂ Q, converging to y and decreasing
sequence, {xn } ⊂ Q, converging to x. Then note that


! ∞
!c
[ \
(x, y) = (−∞, y) ∩ (x, ∞) = (−∞, yn ] ∩ (−∞, xn ] ∈ σEa0
n=1 n=1

[
(x, y) = (xn , yn ] ∈ σEb0
n=1
[∞
(x, y) = [xn , yn ] ∈ σEc0
n=1

!c ∞
!
\ [
(x, y) = (−∞, y) ∩ (x, ∞) = [y, ∞) ∩ (x, ∞) = c
(yn , ∞) ∩ (xn , ∞) ∈ σEd0
n=1 n=1
Therefore, by same reasoning, we conclude that σEk0 = BR for k = a, b, c, d. Then we are done.
Exercise 1.14
We need to prove that
D̂ = {A ∈ D : A ∩ D ∈ D}.
Note that, since D ∈ D, observe that
D ∩ E = E \ Dc = E \ (E \ D) ∈ D
since E, D ∈ D and D ⊂ E so E \ D ∈ D. Thus, E ∈ D̂.
Let A, B ∈ D̂ with A ⊂ B be given. Then observe that
(B \ A) ∩ D = (B ∩ Ac ) ∩ D = B ∩ (Ac ∪ Dc ) ∩ D = (B ∩ D) ∩ (A ∩ D)c = (B ∩ D) \ (A ∩ D) ∈ D
since B ∩ D, A ∩ D ∈ D with (A ∩ D) ⊂ (B ∩ D). Thus, (B \ A) ∈ D̂.
Lastly, let (An ) ⊂ D̂ and An % A. Then we know that (An ∩ D) % (A ∩ D) so A ∩ D ∈ D. Thus, A ∈ D.
Therefore, D̂ is a d−system.

Page 4
STAT 605 Editor : ByeongHo Ban

Exercise 1.15
We need to show that
D = {A ∩ D : A ∈ E}
is a sigma algebra on D for fixed D ⊂ E with measurable space (E, E). Since ∅ = ∅ ∩ D ∈ D, D =
6 ∅. Now, let A ∩ D ∈ D be
given. Now, note that
D \ (A ∩ D) = (Ac ∩ D) ∈ D
since E is closed under complement. Thus, D is closed under complement in D.
Now, suppose that (An ∩ D) ⊂ D is a given sequence. Then observe that
∞ ∞
!
[ [
(An ∩ D) = An ∩ D ∈ D
n=1 n=1
since E is closed under countable union. Therefore, D is also closed under countable union.

In conclusion, D is a sigma algebra on D.


Exercise 1.16
Let σE be the sigma algebra on E generated E. Since E ⊂ E, we know that σE ⊂ E. Now, we will show that E ⊂ E. So let B ∈ E be
given. Since it is trivial that E ⊂ σE, without loss of generality, suppose that B = A ∪ {∆} ∈ E for some A ∈ E. Since E is closed
under complement in E, we have E \A ∈ E. Now, since σE is closed under complement in E, note that E \(E \A) = A∪{∆} ∈ σE.
Therefore, E ⊂ σE so E = σE. Thus, E is a sigma algebra on E generated by E.
Exercise 2.20 c
Firstly, note that ∅ = f −1 ∅ ∈ f −1 F so f −1 F is not emptyset. Let A ∈ F be given. Then observe that f −1 A = f −1 (Ac ) ∈ f −1 F
since F is closed under complement so since Ac ∈ E. Now, let (An ) ⊂ F be a sequence of measurable sets. Then observe that
∞ ∞
!
[ [
f −1 An = f −1 An ∈ f −1 F
n=1 n=1
S∞
since F is closed under countable union so since n=1 An ∈ F. Therefore, f −1 F is closed under complement and countable union
so it is a sigma algebra.
Exercise 2.21
Recall that F ⊗ G is generated by the collection of all measurable rectangles. Let A × B ∈ F ⊗ G be a given measurable rectangle.
Then observe that
h−1 (A × B) = {x ∈ E : h(x) = (f (x), g(x)) ∈ A × B} = {x ∈ E : f (x) ∈ A and g(x) ∈ B} = f −1 A ∩ g −1 B ∈ E
since by measurability of f and g we have f −1 A, g −1 B ∈ E and so f −1 A ∩ g −1 B ∈ E due to closed under intersection of sigma
algebra. Since A × B was arbitrary measurable rectangle, we conclude that h is measurable relative to E and F ⊗ G.
Exercise 2.22
Following the given hint, note that h = f ◦ g where g : F → E × F is defined by g(y) = (x0 , y). Let A × B ∈ E ⊗ F be a given
measurable rectangle. Then observe that
g −1 (A × B) = {y ∈ F : (x0 , y) ∈ A × B} = {y ∈ F : x0 ∈ A and y ∈ B}.
Note that the latter set is empty if x0 6∈ A or is B if x0 ∈ A. In either way, the latter set is in F. Therefore, g is measurable
relative to F and E ⊗ F. Since f is measurable relative to E ⊗ F and G, we can conclude that h = f ◦ g is measurable relative to
F and G.
Exercise 2.23
Suppose that f is E−measurable. Note that zero function, z, is measurable since z −1 (−∞, a) is either E or ∅ which are in E.
Then note that f − = −(f ∧ 0) = − inf{0, f } and f + = (f ∨ 0) = sup{0, f } are measurable by Thm 2.15 since {0, f } = (fn ) where
f1 = f and fk = 0 for k > 1 is a sequence of E−measurable functions.

Conversely, suppose that f + and f − are E− measurable. Note that they are positive functions. Thus, there are sequences of
positive simple functions (fn+ ) and (fn− ) such that
lim fn+ = f +
n→∞
lim fn− = f − .
n→∞

Note that fn+ − fn− is also simple functions so (fn+ − fn− ) is a sequence of measurable functions converging to f + − f − = f . Since
a limit of a sequence of measurable functions is measurable function, f is measurable.
Exercise 2.24
Let f : (E, E) → (F, F) be given function. Then, for any A ∈ F, f −1 (A) ⊂ E so f −1 (A) ∈ E so f is measurable. Since f was
arbitrary function with domain E, any function with domain E is E measurable.

Page 5
STAT 605 Editor : ByeongHo Ban

Exercise 2.25
Let {En } be the countable partition of E generating E.
Suppose that a numerical function f : E → R is E− measurable. Recall that from Exercise 1.12, we prove that {a} is Borel set for
any a ∈ R. Thus, for given a ∈ R, f −1 ({a}) ∈ E. Also, recall that, from Exercise 1.9, we proved that E is a collection of countable
union of elements in {En }. Thus,
[
f −1 {a} = En
some n
which means that
[
f (x) = a ∀x ∈ En .
some n

Now, assume that f is not constant in Ek for some k. Then a, b ∈ f −1 (Ek ) for some a, b ∈ R with a 6= b. Then by the argument
above, we know
[
f −1 {a} = En .
some n

Since a ∈ f −1 Ek , we have Ek ∩ ( some n En ) =


6 ∅. Since {En } is a partition of E, then Ek ⊂ ( some n En ) = f −1 {a}. It implies
S S
b 6∈ Ek which is a contradiction. Therefore, f should be constant on each component of the partition.

Conversely, suppose that f is constant over each member of that partition. Let a ∈ R be given and observe that S there are
countably many b < a such that f (x) = b for all x ∈ En for some n. Let’s call the set of b < a, W. Then f −1 {b} = some n En , a
countable union of the member of partition. Let’s call is E b . Then observe that
[
f −1 (−∞, a) = f −1 W = E b ∈ E.
b∈W
Therefore, f is E measurable.

Page 6
STAT 605 Editor : ByeongHo Ban

Due date : September 23th, 2019 Byeongho Ban

2.28
Suppose that a function f on E is E-measurable. Then we can decompose f = f + − f − where f + = f ∨ 0 and f − = −f ∧ 0.
Since f + and f − are positive measurable functions, there are two sequences fn+ = dn ◦ f + and fn− = dn ◦ f − which are increasing
to f + and f − respectively. Then observe that
f = f + − f − = lim fn+ − lim fn− = lim (fn+ − fn− ) = lim fn
n→∞ n→∞ n→∞ n→∞

since {fn+ }
and {fn− }
are convergent and where fn = − fn+ fn− .
Since {fn+ }
and {fn− }
are sequences of simple functions, {fn } is
also a sequence of simple functions. Therefore, f is a limit of a sequence of simple functions.

Conversely, suppose that f = limn→∞ fn where {fn } is a sequence of simple functions. Note that lim inf n→∞ fn and lim supn→∞ fn
are measurable since fn is measurable for any n. Then since
lim inf fn = lim fn = lim sup fn
n→∞ n→∞ n→∞
we can conclude that f = limn→∞ fn is measurable. So we are done.
2.29

Suppose that f and g are measurable. Then, by the Exercise 2.28, there are two sequence of simple functions {fn } and {gn } such
that fn → f and gn → g as n → ∞. Then observe that
lim (fn + gn ) = lim fn + lim gn = f + g
n→∞ n→∞ n→∞
lim (fn − gn ) = lim fn − lim gn = f − g
n→∞ n→∞ n→∞
  
lim fn · gn = lim fn lim gn = f · g
n→∞ n→∞ n→∞
Since {fn + gn }, {fn − gn } and {fn gn } are sequence of simple functions, f ± g and f · g are measurable. Since fn gn is simple
function, in order to prove {fn /gn } is a sequence of simple functions, we only need to show that 1/gn is simple function. Suppose
that
Xm
gn = ak 1Ak ,
k=1
Now, clearly, observe that
m
1 X 1
= 1A
gn ak k
k=1
Since we can observe that
fn limn→∞ fn f
lim = =
n→∞ gn limn→∞ gn g
f f
it turns out g is a limit of a sequence of simple functions. Therefore, similarly, by the exercise 2.28, g is measurable.
2.30

Let f : E → R be a given continuous function. Since B(E) is a sigma algebra generated by all open sets in of E, in order to
show f is measurable, it suffices to show that f −1 (O) ∈ B(E) for any open set O. However, since f −1 (O) is open whenever f is
continuous and O is open, and since every open set is in B(E), we conclude that f −1 (O) ∈ B(E) for any open set O. Therefore,
f is measurable.

Page 7
STAT 605 Editor : ByeongHo Ban

2.31

Let {tn } is such sequence. Clearly, letting


n
X
φn = ak 1[tk−1 ,tk )
k=1
for some constants {ak }, {φn } is clearly a sequence of simple function converging to
X∞
φ= ak 1tk−1 ,tk
k=1
so φ is Borel-measurable(each [tk−1 , tk ) is Borel set.).
Now, let f be a right continuous function defined in the problem. By given hint, note that f = limn→∞ fn where fn = f ◦ dn for
each n and

X k
dn (r) = 1 k−1 k (r), r ∈ R+ .
2n [ 2n , 2n )
k=1
Note that fn is a right continuous simple function and so, by the first argument, fn is Borel measurable. Since f is a limit of a
sequence of Borel measurable functions {fn }, f should be Borel measurable.

In the case when f : R → R, note that f = f + +f − where f + = limn→∞ f ◦dn with f + (x) = 0 if x < 0 and f − = limn→∞ f ◦ −dn


with f − (x) = 0 if x > 0. Then by the same argument with above, f + and f − are Borel measurable. Then by Exercsie 2.29,
f = f + + f − is Borel measurable.
2.32

Let (rn ) be a sequence of real numbers decreasing to r. Then observe that, since f is increasing,
lim f (rn ) = inf f (rn ) = f (r)
n→∞ n
Thus, f is right continuous as defined in previous exercise. Therefore, f is Borel measurable.

If you are not sure inf n f (rn ) = f (r).

Note that, for any given  > 0, there is an positive integer N such that f (r + ) > f (rn ) ∀n > N . Thus, f (r + ) > inf n f (rn ).
Since  > 0 is arbitrary, f (r) ≥ inf n f (rn ). Since it is clear that f (r) ≤ f (rn ) for all n so f (r) ≤ inf n f (rn ), we conclude that
f (r) = inf n f (rn ).
2.33

Suppose that f and g are E measurable. Then, by the exercise 2.29, f − g is E-measurable. Then observe that
{f > g} = {f − g > 0} = (f − g)−1 (0, ∞] ∈ E
{f < g} = {f − g < 0} = (f − g)−1 [−∞, 0) ∈ E
{f 6= g} = {f − g 6= 0} = (f − g)−1 (R \ {0}) ∈ E
{f = g} = {f − g = 0} = (f − g)−1 ({0}) ∈ E
{f ≥ g} = {f − g ≥ 0} = (f − g)−1 ([0, ∞]) ∈ E
{f ≤ g} = {f − g ≤ 0} = (f − g)−1 ([−∞, 0]) ∈ E
since (0, ∞], [−∞, 0), R \ {0}, {0} = [0, ∞] \ (0, ∞], [0, ∞] and [−∞, 0] are all Borel sets. So we are done.
2.34

Let D = {A : 1A ∈ M+ }. Firstly, note that, by (a), 1 = 1E ∈ M+ so E ∈ D. Suppose that A, B ∈ D such that B ⊂ A. Then
note that, by (b), 1A\B = 1A − 1B ≥ 0 so 1A\B ∈ M+ so A \ B ∈ D. Now, suppose that (An ) ⊂ D is a sequence such that
An % A. Then Note that 1An % 1A in M+ so by (c), 1A ∈ M+ so A ∈ D. From these three argument, we conclude that D is a
d-system. Then by Monotone class theorem, since a p-system C is in D, we have E = σ(C) ⊂ D which means that 1A ∈ M+ for
any A ∈ E.

Now, suppose that f is a positive E measurable function. Then, dn ◦ f = fn % f with (fn ) ⊂ M+ since fn is a simple function
on E which is E measurable so, by (b), fn ∈ M+ for each n. Then by (c), f ∈ M+ . Therefore, M+ contains every positive
E-measurable functions.

Page 8
STAT 605 Editor : ByeongHo Ban

3.11

(a) Observe that


ν(∅) = µ(∅ ∩ D) = µ(∅) = 0
and that, for any given sequence of disjoint sets {An } ⊂ E,
! !
[ [ X X
ν An = µ An ∩ D = µ(An ∩ D) = ν(An ).
n n n n
Therefore, ν is a measure.

(b) Since ν(∅) = 0 by the argument in (a), we only need to show the countable additivity. Suppose (An ∩ D) ⊂ D be a given
sequence of disjoints sets. Then observe that
! !
[ [ X X
ν (An ∩ D) = µ (An ∩ D) = µ(An ∩ D) = ν(An ∩ D)
n n n n
since ∪n (An ∩ D) ∈ D due to the fact that D is a sigma algebra. Therefore, ν is a measure on (D, D).
3.12

Observe that
ν(∅) = µ(∅ ∩ D) = µ(∅) = 0.
Furthermore, observe that, for a given sequence of disjoint sets (An ) ⊂ E,
! !
[ [ X X
ν An = µ (An ∩ D) = µ(An ∩ D) = ν(An )
n n n n
so ν is countably additive. Therefore, ν is a measure on (E, E).
3.13
S
(a) Let (En ) ⊂ E be a sequence of measurable sets such that µ(En ) < ∞ for each n and Ei ∩ Ej = ∅ with n En = E. If it was
not disjoint, we can replace with Bn = En \ ∪n−1
k=1 Ek and rename Bn = En . Then let (En , En ) be the trace of E on En . Then
define the measure such that µn (A) = µ(A ∩ En ) for each A ∈ E and each n. By the previous exercise, µn is a measure. Now we
will show that
X
µ= µn
n
Let A ∈ E be given. Then observe that, by the countable additivity,
!! !
[ [ X X
µ(A) = µ(A ∩ E) = µ A ∩ En =µ (A ∩ En ) = µ(A ∩ En ) = µn (A)
n n n n
since (En ) is disjoint. Therefore, we are done.

(b) Let En = [n, n + 1) then (En ) is a Lebesgue measurable partition of R. Then note that Leb(En ) = n + 1 − n = 1 < ∞.
Therefore, Leb is σ-finite on R.

(c) Suppose that µ is σ−finite. Then there is (En ), a measurable partition of E such that µ(En ) < ∞ for each n. Thus,
X
µ(En ) = m(x)δx (En )
x∈D

If m(x ) = ∞ for some x ∈ D ⊂ E, then there is En such that x0 ∈ En . Then we have


0 0
X
µ(En ) = m(x)δx (En ) ≥ m(x0 ) = ∞
x∈D

which is a contradiction. Therefore, m(x) < ∞ for each x ∈ D.

Conversely, suppose that m(x) < ∞ for each x ∈ D. If D was finitely countable, then, since µ(E) < ∞, µ is σ-finite (finite sum
of finite numbers is finite). If D was infinitely countable, let D = (xn )∞n=1 . And let (En ) ⊂ E be a sequence of sets such that
En = {xn } (since (E, E) is discrete so every set is measurable.). By setting E0 = E \ D, we can create disjoint sequence (En )∞n=0
which is a partition of E. Since, in each n, µ(En ) = m(xn ) < ∞ we have proved that µ is σ-finite. Thus, in this case µ is σ-finite
so by (a), µ is Σ-finite.

Page 9
STAT 605 Editor : ByeongHo Ban

3.13
(d) If µ is sigma finite, then there is a measurable partition (En ) ⊂ E of E such that µ(En ) < ∞ which means that, for any A ∈ E,
Leb(En ) = 0. However, then
X
1 = Leb[0, 1] = Leb(En ) = 0
n
which is a contradiction. Therefore, µ cannot be σ-finite. Now, define a measure
X
µ(A) = Leb(A + n)
n∈Z

Note that if Lab(A) > 0, then µ(A) = ∞ since it is infinite sum of same numbers. If Leb(A) = then µ(A) = 0 since it is sum of a
bunch of zeros. Therefore, the equality holds. Also, it is clear that Leb is finite measure on [0, 1]. Thus, it is clear that µ is Σ-finite.

(e) Let (En ) ⊂ E be a given sequence of measurable partition of [0, 1] = E. Then since the cardinality of (En ), (the number of
En ’s), is countable and E is uncountable, at least one of Ek ∈ (En ) should be uncountable since countable union of countable
sets is countable. Thus, it implies µ(Ek ) = ∞. Since (En ) was arbitrary, there does not exists a measurable partition (En ) such
that µ(En ) < ∞ for each n so µ cannot be σ-finite.

Furthermore, assume that µ is Σ-finite so that


X
µ= µn
n
where µn is finite measure for each n. ;(µn (E) < ∞). For any x ∈ E, since µ({x}) = 1, there exists µn such that µn ({x}) > 0.
(Singleton set is measurable since {a} = [a, b) \ (a, b)) Since there are uncountably many x ∈ E such that µ({x}) = 1 and there
are countably many µn , there should be m such that µm ({x}) > 0 for uncountably many x. Let the set of such x is K. Then
X
µm (E) = µm ({x}) = ∞
x∈K
which contradicts to the fact that µm is finite. Therefore, µ cannot be Σ-finite.
3.14

Let (E, E, µ) be a given measure space. And let Card(A) be a number of elements in a set A. Now, letting D be a set of atoms,
let
 
1
Dn = x ∈ D : µ({x}) >
n
Then observe that, for each n, since Dn ⊂ E,
X Card(Dn )
µ(E) ≥ µ(Dn ) = µ({x}) > .
n
x∈Dn

From here, we conclude that


Card(Dn ) < nµ(E)
S
for each n. If D is uncountable, since D = n Dn , at least one of Dn should be uncountable which means there exists k such
that Card(Dk ) = ∞. It is contradiction since ∞ = Card(Dk ) < kµ(E) < ∞. Therefore, D is at most countable.
P
Now, let µ be a Σ−finite measure such that µ = n µn with finite measures µn . Let An be aSset of atoms of µn . Then note
n, An is at most countable. Now, further noteSthat, clearly An ⊂ D for each
that, for each P S n so n An ⊂ D. Also, if x ∈ D then
0 < µ({x}) = n µn ({x}) so µn ({x}) > 0 for some n so x ∈ n An . It implies that D = n An . If D was uncountable, then there
should be m such that Am is uncountable since countable union of countable sets is countable. It is contradiction since An is a
set of atoms of a finite measure and by previous argument, Am should be at most countable. So we are done.
3.15

Assume that λ({x}) > 0 for some x ∈ E. Since λ({x}) = µ({x} \ D) > 0 it should be x 6∈ D since µ(∅) = 0. However, then
λ({x}) = µ({x} \ D) = µ({x}) > 0 so x ∈ D which is a contradiction. Therefore, λ({x}) = 0 for each x ∈ E which proves that λ
is diffuse measure.

Now, note that ν({x}) = µ({x} ∩ D) so ν({x}) > 0 if and only if x ∈ D. Thus, D is the set of atoms of ν. Then observe that
ν(E \ D) = µ((E \ D) ∩ D) = µ(∅) = 0.
Therefore, ν is purely atomic. Also, note that
µ(A) = µ(D ∩ A) + µ(A \ D) = ν(A) + λ(A) ∀A ∈ E.
Therefore, we proves Proposition 3.9.

Page 10
STAT 605 Editor : ByeongHo Ban

3.16

Let F = {A ∪ N : A ∈ E, N ∈ N }. Suppose that A ∪ N ∈ F. Then note that N ⊂ M for some M ∈ E such that µ(M ) = 0. Then
observe that
(A ∪ N )c = (A ∪ M )c ∪ (M \ (A ∪ N ))
Since A ∪ M ∈ E, (A ∪ M )c ∈ E. Also, since µ(M ) = 0 and M \ (A ∪ N ) ⊂ M , we have M \ (A ∪ N ) ∈ N . Therefore, (A ∪ N )c ∈ F.
So F is closed under complement.

Now, suppose that {An ∪ Nn } ⊂ F be given. Then observe that


! !
[ [ [
(An ∪ Nn ) = An ∪ Nn
n n n
S
It is clear that n An ∈ E since E is closed under countable union. Now, note that there exists {Mn } ⊂ E such that Nn ⊂ Mn
and µ(Mn ) = 0 for each n. Then by subadditivity,
!
[ X [ [
µ Mn ≤ µ(Mn ) = 0, Nn ⊂ Mn .
n n n n
S S
So n Nn ∈ N which implies that n (An ∪ Nn ) ∈ F so F is closed under countable union. Therefore, F is a σ-algebra.

Now, note that, since ∅ ∈ N , E = E ∪ ∅ ∈ F for any E ∈ E so E ⊂ F. Similarly, since ∅ ∈ E, N = ∅ ∪ N ∈ F for each N ∈ N so
N ⊂ F. Since E ∪ N ⊂ F and F is a σ−algebra, E ⊂ F. And clearly, A ∪ N ∈ E for any A ∈ E and N ∈ N . Thus, F = E which
proves (a).

Observe that, for any A0 ∪ N 0 and A ∪ N in F stated in the problem,


µ(A ∪ N ) = µ(A) ≤ µ(A ∪ M ) = µ(A0 ∪ M 0 ) ≤ µ(A0 ) + µ(M 0 ) = µ(A0 ) = µ(A0 )
µ(A0 ∪ N 0 ) = µ(A0 ) ≤ µ(A0 ∪ M 0 ) = µ(A ∪ M ) ≤ µ(A) + µ(M ) = µ(A) = µ(A)
Thus, µ(A ∪ N ) = µ(A0 ∪ N 0 ) which means µ is well-defined.

Now, further note that


µ(∅) = µ(∅) = 0
and that, for given sequence of countable sets {An ∪ Nn } ⊂ F,
! ! !! !
[ [ [ [ X X
µ (An ∪ Nn ) = µ An ∪ Nn =µ An = µ(An ) = µ(An ∪ Nn ).
n n n n n n
Thus, µ is countably additive so it is a measure on F.

Page 11
STAT 605 Editor : ByeongHo Ban

Due date : September 27th, 2019 Byeongho Ban

4.24

Let A = {f = ∞}. Assume that µ(A) > 0, then observe that


Z
∞=∞· dµ = µ(f 1A ) ≤ µ(f ) < ∞
A
and it is a contradiction. Therefore, out assumption, µ(A) > 0 is not true so it should be zero.

Suppose that µ(f ) < ∞. Then it should be µ(f + ) < ∞ and µ(f − ) < ∞. Since f + , f − ∈ E+ , by the previous argument,
µ(A+ ) = 0 = µ(A− ) where A± = {f ± = ∞}. Now, observe that
A = {f = f + − f − = ±∞} ⊆ A+ ∪ A− =⇒ µ(A) ≤ µ(A+ ) + µ(A− ) = 0.
Therefore, f is real-valued almost everywhere.
4.26
PN P∞
Note that 1 fn % 1 fn in E+ . Thus, by MCT, observe that
∞ ∞
! N
! N
! N
X X X X X
µ fn = µ lim fn = lim µ fn = lim µfn = µfn
N →∞ N →∞ N →∞
1 1 1 1 1
where the second equality is from linearity of µ.
4.27
PN
Note that, by positivity, 1 µn ≤ µ for all N , so
N N
!
X X
µn (f ) = µn (f ) ≤ µ(f ) ∀N
1 1
Then by taking N → ∞, we have
X
µn (f ) ≤ µ(f ).
PN P 
N
Now, note that, for any  > 0, there exists N such that 1 µn (f ) = 1 µn (f ) > µ(f ) − . Then observe that, by the
positivity,
X N
X
µn (f ) ≥ µn (f ) > µ(f ) − .
1
Since  > 0 was arbitrary, we have
X
µn (f ) ≥ µ(f )
which implies the equality combining with the previous result.
4.28

Observe that
|µ(f )| = |µ(f + ) − µ(f − )| ≤ |µ(f + )| + |µ(f − )| = µ(f + ) + µ(f − ) = µ(f + + f − ) = µ(|f |)
where third last equality is from positivity of µ since f ± ∈ E+ . So we are done.
4.29


We also assume that µ(A) < ∞ and f is integrable otherwise, a ≤ ≤ b or a ≤ ∞ ≤ b. Now, observe that

Z
a1A ≤ 1A f ≤ b1A =⇒ aµ(A) = µ(a1A ) ≤ f dµ ≤ µ(b1A ) = bµ(A)
A
Then by dividing µ(A) from each sides, we get
Z
1
a≤ f dµ ≤ b.
µ(A) A

Page 12
STAT 605 Editor : ByeongHo Ban

4.30

In order to say µfn , we need to suppose fn are measurable or equal to a measurable function almost every where.
Suppose that fn % f and fn ≥ g ∀n for some integrable function g. Then note that fn − g are positive measurable functions
increasing to f − g. Therefore, by the MCT,
lim µ(fn − g) = µ(f − g).
n→∞
Since g is integrable, we can subtract µ(g) from both sides and can get
µ(g) + lim µ(fn − g) = lim µ(fn − g + g) = lim µ(fn )
n→∞ n→∞ n→∞
µ(f − g) + µ(g) = µ(f − g + g) = µ(f )
thus,
lim µ(fn ) = µ(f ).
n→∞
and we are done.

Now, suppose that fn & f and fn ≤ g ∀n for some integrable function g. Then note that (g − fn ) ⊂ E+ and g − fn % g − f .
Then by MCT,
lim µ(g − fn ) = µ(g − f )
n→∞
⇐⇒ lim µ(−fn ) = lim (g − fn ) − µ(g) = µ(g − f ) − µ(g) = µ(−f )
n→∞ n→∞
=⇒ lim µ(fn ) = µ(f ).
n→∞

4.32

Suppose that µ is σ-finite. Then there is a partition (En )∞


1 of E such that µ(En ) < ∞. Define a function

X 1 1
f= 1
2 En
n=1
µ(E n ) n
Then observe that
∞ ∞
X 1 1 X 1
µ(f ) = µ(E n ) = < ∞.
n=1
µ(En ) n2 1
n2
Since f is strictly positive, we are done.
1
Conversely, suppose that there exists a strictly positive function f in E such that µ(f ) < ∞. Now, let En = {f > n} and then
note that En % E since f is positive. Then observe that
 
1 1
∞ > µ(f ) > µ(f 1En ) > µ 1En = µ(En ) ∀n
n n
Since n1 < ∞, µ(En ) < ∞ for all n. Then by the nature of En increasing to E, ∪∞
1 En = E, and because µ(En ) < ∞ for all n, we
proved that µ is σ-finite.
5.13

(a) For convenience, let Au = {t ∈ R+ : c(t) > u}. Then observe that, if u ≥ w, then Au ⊆ Aw which means
a(u) = inf Au ≥ inf Aw = a(w) so a is increasing function.

Now, let (un ) be a sequence such that un & u. Then Aun ⊆ Au for all n. Thus, a(un ) ≥ a(u) for all n which means that
a =let limn→∞ a(un ) ≥ a(u). (The limit should exist since a(un ) is monotonically decreasing sequence which is bounded below.).
Now, observe that as n → ∞, since c is right continuous,
∞ ∞ ∞
!
[ [ [
−1 −1
Aun % Au n = c ((un , ∞]) = c (un , ∞] = c−1 ((u, ∞]) = {t ∈ R+ : c(t) > u} = Au .
1 1 1
Thus, since inf Aun is monotonically decreasing,

[
lim a(un ) = lim inf Aun = inf Aun = inf Au = a(u)
n→∞ n→∞
1
Thus, a is right continuous.

Now, let U = inf{u ∈ R+ : a(u) > t}. And note that, since c is right continuous and increasing, t = inf{t0 ∈ R+ : c(t0 ) > c(t)}
which means a(c(t)) = t. Similarly, since a is increasing and right continuous, U = inf{u ∈ R+ : a(u) > t = a(c(t))} = c(t). Thus,
we are done.

Page 13
STAT 605 Editor : ByeongHo Ban

5.13

(b) Suppose c(t) < ∞. Now, observe that


a(c(t)) = inf{t0 ∈ R+ : c(t0 ) > c(t)} = inf Ac(t) .
and note that t 6∈ Ac(t) . Assume that a(c(t)) < t. Then note that there exists a decreasing sequence (sn ) ⊂ Ac(t) such that
sn & a(c(t)). Now, also note that there exists N ∈ N such that a(c(t)) ≤ sn < t for all n ≥ N . Since c is increasing function,
c(sn ) ≤ c(t) which is contradiction since (sn ) ⊂ Ac(t) . Therefore, a(c(t)) ≥ t.

Now, suppose that a(c(t)) = t and let  > 0 be given. Assume that there exists δ > 0 such that c(t + δ) = c(t). Then, since c is
increasing, c(t) = c(s) ∀s ∈ [t, t + δ]. Now, observe that
t = a(c(t)) = inf{t0 ∈ R+ : c(t0 ) > c(t) = c(t + δ)} = a(c(t + δ)) ≥ t + δ
which is a contradiction. Therefore, there does not exists such δ > 0 so c(t + ) > c(t) since c is increasing.

Conversely, suppose that c(t + ) > c(t) for any  > 0. Then observe that, since t +  ∈ {t0 ∈ R+ : c(t0 ) > c(t)},
a(c(t)) = inf{t0 ∈ R+ : c(t0 ) > c(t)} ≤ t +  ∀ > 0
Thus, a(c(t)) ≤ t. However, since a(c(t)) ≥ t from previous argument, we have the equality.
5.14

(a) Suppose t < s then [0, t] ⊆ [0, s] so by the monotonicity of measure, c(t) = µ([0, t]) ≤ µ([0, s]) = c(s) which implies that c is
increasing function. Now, let (sn ) be a given sequence such that sn & s. Then observe that
lim c(sn ) = lim µ([0, sn ]) = µ([0, s]) = c(s)
n→∞ n→∞
since [0, sn ] % [0, s]. Therefore, c is right continuous.

(b) Let t ∈ [0, b) be given. We define a(u) as


a(u) = inf{t0 ∈ R+ : µ[0, t0 ] = c(t0 ) > u}
Then from Exercise 5.13 (a),
µ[0, t] = c(t) = inf{u ∈ R+ : a(u) > t} = inf a−1 (t, ∞] = u0 ∈ a−1 (t)
since a is right continuous and increasing so a−1 is increasing function when we consider a representative of the set.( if it is
decreasing, when t < s we should have us ≥ ut where us ∈ a−1 (s) and ut ∈ a−1 (t). Then observe that s = a(a−1 (s)) = a(us ) ≥
a(ut ) = a(a−1 (t)) = t which is a contradiction). Also, note that u0 = λ[0, u0 ]. Then observe that
a−1 [0, t] = {u ∈ [0, b) : 0 ≤ a(u) ≤ t} = [0, u0 ]
The second equality holds by following argument. Suppose w ∈ [0, u0 ] then
0 = inf{t0 ∈ R+ : µ[0, t0 ] = c(t0 ) > 0} = a(0) ≤ a(u) ≤ a(u0 ) = a(a−1 (t)) = t =⇒ w ∈ {u ∈ [0, b) : 0 ≤ a(u) ≤ t}.
Conversely suppose that w ∈ {u ∈ [0, b) : 0 ≤ a(u) ≤ t}. Then assume that w 6∈ [0, u0 ] then u0 < w which implies a(u0 ) < a(w)
(it is strict inequality. If a(w) = a(u0 ) then a(w) = a(u0 ) = t so w ∈ a−1 (t). It means w could have been chosen as u0 and it will
contradicts to u0 < w.). Thus, 0 ≤ a(u0 ) < a(w) ≤ t, especially, a(u0 ) < t which is a contradiction to the setting of u0 . Therefore,
it should have been u ∈ [0, u0 ].

Now, so we have µ[0, t] = u0 = λ[0, u0 ] = λ(a−1 [0, t]) = λ ◦ a−1 [0, t]. Since B[0,b) can be generated by {[0, t] : t ∈ [0, b)}, µ = λ ◦ a−1
on B[0,b) .

Page 14
STAT 605 Editor : ByeongHo Ban

5.17

Suppose that µ(B) = 0 for some B ∈ E. It means that B ∩ D = ∅. Since nu is purely atomic note that
ν(B) ≤ ν(Dc ) = 0
Thus, ν(B) = 0. Since B ∈ E was arbitrary µ-zero, ν is absolutely continuous with respect to µ.

Let g(x) = ν({x}). It is measurable since g −1 ((a, ∞]) = {x : ν({x}) > a} is E if a ≤ 0Pand is a subset ofP
D which is a countable
union of singleton sets which are measurable. Now, observe that ν(1A ) = ν(1A∩D ) = x∈A∩D ν({x}) = x∈A∩D g(x) = µ(g1A ).
PN
Now, suppose that f = 1 an 1An then observe that
N N N
!
X X X
ν(f ) = an ν(1An ) = an µ(g1An ) = µ g an 1An = µ(gf ).
1 1 1
Now, let f ∈ E+ be given then there exists a sequence of simple functions (fn ) such that fn % f . Then observe that, by MCT
ν(f ) = lim ν(fn ) = lim µ(gfn ) = µ(gf )
n→∞ n→∞
since g is positive measurable so gfn % gf .
Now, suppose f = f + − f − such that f + , f − ∈ E+ . Then observe that
ν(f ) = ν(f + − f − ) = ν(f + ) − ν(f − ) = µ(gf + ) − µ(gf − ) = µ(g(f + − f − )) = µ(gf ).

Therefore, ν(f ) = µ(gf ) for any measurable function f so p = dµ = g.

5.21

Suppose that ν is purely atomic and µ is diffuse. Let D be the set of atoms of ν. Since ν is purely atomic, ν(E \ D) = 0. Now,
assume that µ(D) > 0. Then observe that
X
0 < µ(D) = µ({x}) = 0
x∈D

since µ is diffuse so there is no atom. And it is a contradiction. Therefore, µ(D) = 0 which implies that ν is singular with respect
to µ.
5.22

(a) Observe that


 i

∞ [
2 ∞ ∞
[ X X 2i 1/3
Leb(D) = Leb  Di,j  = 2i Leb(Di,j ) = = =1
i=0 j=1 i=0 i=0
3i+1 1 − 23
1 2
since (Di,j ) is disjoint and Di,1 = ( 3i+1 , 3i+1 ) and Leb(Di,1 ) = Leb(Di,j ) ∀j.

Then clearly,
Leb(C) = Leb(E \ D) = Leb(E) − Leb(D) = 1 − 1 = 0.

(b) Observe that since a and c are inverse each other, c(Dij ) is singleton and λ(A) = 0 when A is countable,
  i
  i
  i

∞ [
[ 2 ∞ [
[ 2 ∞ [
[ 2
ν(D) = λ a−1  Di,j  = λ  a−1 (Di,j ) = λ  c (Di,j ) = 0.
i=0 j=1 i=0 j=1 i=0 j=1

Since [0, 1] is a disjoint union of C and D,


ν(C) = 1 − ν(D) = 1.

(c)
Note that a and c are inverse each other so the range of a is a domain of c. Note that the domain of c is D where [0, 1] \ D is
countable.

Page 15
STAT 605 Editor : ByeongHo Ban

Due date : October 7th, 2019 Byeongho Ban

6.29

Firstly, we will show that K(x, B) defined is a kernel. Let B ∈ F be given. Then note that h is measurable relative to E and
F and 1B is F-measurable. Since K(x, B) is a composition of the two measurable functions, it should be a measurable function.
Now, let x ∈ E be given. Then note that K(x, ∅) = 1∅ ◦ h(x) ≡ 0. And now, for disjoint sequence of sets (Fn ) ⊂ F, observe that
!
 [  X X X
K x, Fn = 1Fn ◦ h(x) = 1Fn ◦ h(x) = K(x, Fn )
n n n
Thus, B 7→ K(x, B) is a measure so K is a kernel. Now, observe that
K(x, F ) = 1F ◦ h(x) = 1F (h(x)) = 1 ∀x ∈ E
since h(x) ∈ F for all x ∈ E. Therefore, it is a transition probability kernel.

Firstly, observe that, for any B ∈ F,


Z
K1B (x) = K(x, dy)1B (y) = K(x, B) = 1B ◦ h(x).
F
P
So, for simple function f = an 1Fn , by linearity of integral,
Z !
X Z X X X
Kf (x) = K(x, dy)f = an K(x, dy)1Fn = an K(x, Fn ) = an 1Fn ◦ h(x) = an 1Fn ◦ h(x) = f ◦ h(x).
F n F n n n
Observe that, if f ∈ E+ , there is a sequence of simple functions (fn ) increases to f so, by monotone convergence theorem,
Z Z Z
Kf (x) = K(x, dy)f (y) = K(x, dy) lim fn (y) = lim K(x, dy)fn (y)
F F n→∞ n→∞ F
= lim fn ◦ h(x) = lim fn (h(x)) = f (h(x)) = f ◦ h(x)
n→∞ n→∞
by the last argument for simple function since fn is simple for each n. If f ∈ E, then
Z Z Z
Kf (x) = K(x, dy)f (y) = K(x, dy)f+ (y) − K(x, dy)f− (y) = f+ ◦ h(x) − f− ◦ h(x) = (f+ − f− ) ◦ h(x) = f ◦ h(x).
F F F
So we are done for first equality.

Now, we want to show µK = µ ◦ h−1 . Let B ∈ F be given. Then observe that


Z Z
µK(B) = dµ(x)K(x, B) = dµ(x)(1B ◦ h(x)) = µ(1B ◦ h) = (µ ◦ h−1 )1B = µ ◦ h−1 (B)
E E
since 1B ∈ F+ by theorem 5.2.

Now, we want to show that µKf = µ(f ◦ h). Observe that, for any f ∈ E,
µKf = µ(Kf ) = µ(f ◦ h) =
by the first equality in this problem. Therefore, we are done.
6.30

We define
Z
K(x, B) = dν(y)k(x, y)
B
and we want to show that K is a kernel. Firstly, fix x ∈ E. Then, we note that
Z
K(x, ∅) = dν(y)k(x, y) = 0.

Now, let (Fn ) ⊂ F be a disjoint sequence of sets. Then observe that, by monotone convergence theorem,
! Z
[ X XZ X
K x, Fn = dν(y)k(x, y) 1Fn (y) = dν(y)k(x, y) = K(x, Fn )
n F n n Fn n
since k is positive. Thus, B 7→ K(x, B) is measure for each x. Now, fix B ∈ F and observe that
Z
K(x, B) = dν(y)k(x, y)
F
is measurable by Proposition 6.9. Therefore, K is a kernel.

Page 16
STAT 605 Editor : ByeongHo Ban

6.31

Letting K be a transition matrix from E to F , we interpret K is a m × n matrix. And f ∈ F is a column vector with n entries.
And µ is a row vector with m entries. Then now note that Kf is a multiplication of m × n matrix with n × 1 so result would be
m × 1 column vector which means a function in E. Since Kf ∈ E+ , this notation is consistent.

As for µK, it is a multiplication of 1 × m vector and m × n matrix which gives us 1 × n row vector. Then it can be represented
as a measure on F. Since µK is a measure on F, it is consistent.

For µKf , it is a multiplication of 1 × m row vector, m × n matrix and n × 1 column vector which gives us a number. Since matrix
multiplication is associative, (µK)f = µ(Kf ) is consistent.

Lastly, let L is a transition kernel from F to G, then L would be represented as n × ρ matrix. Then since K is interpreted as m × n
matrix, the multiplication of m × n matrix and n × ρ matrix is consistent and it gives m × ρ matrix which can be represented as
a transition kernel from E to G which is consistent.
6.32

Firstly, recall that K(x, F ) ≥ 0 since for each x, B 7→ K(x, B) is a measure. Now, note that B = {x : K(x, F ) > 0} is measurable.
And note that
h(x) = K(x, F )1B (x) + 1B c (x).
c
Since B is measurable, the addition of measurable function h should be measurable. Also, since h(x) ≥ 0 since 1 > 0 and
K(x, F ) ∈ E+ so h ∈ E+ .

Note that h(x) > 0 for all x ∈ E so we can solve it for H by dividing h as
K(x, B)
H(x, B) = .
h(x)
Let A ∈ F fixed. Then note that K(x, A) and h(x) are measurable so the division of them, H(x, A) should be measurable.

Now, let x ∈ E be fixed. Then note that H(x, ∅) = K(x, ∅)/h(x) = 0/h(x) = 0 since A 7→ K(x, A) is a measure. Also, letting
(Fn ) ⊂ F be a sequence of disjoint sets, observe that
! S
[ K (x, n Fn ) 1 X X K(x, Fn ) X
H x, Fn = = K(x, Fn ) = = H(x, Fn )
n
h(x) h(x) n h(x) n
thus, A 7→ H(x, A) is a measure so is a kernel.

Now, note that


(
K(x, F ) K(x, F ) 1 x∈B
H(x, F ) = = =
h(x) K(x, F )1B (x) + 1B c 0 x ∈ Bc
which means H(x, F ) ≤ 1 for all x ∈ E so it is a bounded kernel.
6.33
(Hint: Use the preceding exercise to extend the proof from the bounded kernels to finite ones, and finally extend it to Σ-finite
kernels.)

Now, suppose that K is finite. Let f ∈ (E ⊗ F)+ be given. Then by using same notation for H and h in preceding exercise,
observe that
Z Z
T f (x) = K(x, dy)f (x, y) = h(x) H(x, dy)f (x, y) = h(x)Hf (x).
F F
Since H is bounded kernel, by the argument in the proof, Hf ∈ E+ . Since T f is a multiplication of two positive E-measurable
functions, T f ∈ E+ .
P
Now, let’s assume that K is Σ-finite. Then there exists a sequence of finite kernels (Kn ) such that K = n Kn . Then observe
that, for any f ∈ (E ⊗ F)+ ,
Z Z X Z XN XZ X
T f (x) = K(x, dy)f (x, y) = Kn (x, dy)f (x, y) = lim Kn (x, dy)f (x, y) = Kn (x, dy)f (x, y) = Kn f (x)
F F N →∞ F n=1 F
n n n
by monotone convergence theorem and linearity of integration. Since any sum of functions in E+ is in E+ and limit of sequence of
functions in E+ is in E+ , T f ∈ E+ . So we are done.

Page 17
STAT 605 Editor : ByeongHo Ban

6.34

Let E = F = [0, 1] with Borel σ-algebra and let µ be a Lebesgue measure on E and ν be a counting measure on F . And let
(
1 x=y
f (x, y) = .
0 x 6= y
Now, observe that
Z Z Z
µ(dx) ν(dy)f (x, y) = µ(dx) = µ(E) = 1
E F E
and that
Z Z Z
ν(dy) µ(dx)f (x, y) = ν(dy)0 = 0
F E F
since f (x, y) = 0 µ-almost everywhere. Therefore the two integral gives us different values. The reason for this is that ν is not
Σ-finite. In order to have Fubini theorem, µ and ν should be Σ-finite.

Page 18
STAT 605 Editor : ByeongHo Ban

7.
(a) Consider a two dimensional array of numbers aij with i, j = 0, 1, 2, . . . . Show that if either (a) aij ≥ 0 for all i, j or (b) There
PN PN
exists B ≥ 0 such that i=1 j=1 |aij | ≤ B for all N ≥ 0 then
∞ X
X ∞ ∞ X
X ∞
aij = aij .
i=1 j=1 j=1 i=1

P∞ P∞ P∞ P∞
(b) Consider the array aij with aii = 1, ai+1,i = −1 and aij = 0 otherwise. Compute i=1 j=1 aij and j=1 i=1 aij . Explain
your result.
Proof. (ByeongHo Ban)
.

(a)
Firstly, suppose the condition (a). Note that, when ν is a counting measure, (Z≥0 , P(Z≥0 ), ν) is σ-finite since {{j} : j ∈ Z≥0 } is
a partition of Z≥0 with ν({j}) = 1 < ∞ for each j. Thus, we can represent ν as a countable sum of finite measure
X
ν(E) = ν({n} ∩ E)
n≥0

which means ν is Σ-finite. Therefore, considering aij as a function a(i, j) on Z≥0 × Z≥0 , by part (a) of Fubini’s theorem
XX Z Z Z Z XX
aij = dν(i) dν(j)a(i, j) = dν(j) dν(i)a(i, j) = aij .
i j Z≥0 Z≥0 Z≥0 Z≥0 j i

Now, suppose condition (b). It means, again considering aij as a function a(i, j) on Z≥0 × Z≥0 ,
Z Z XX
dν(i) dν(j)|a(i, j)| = |aij | ≤ B
Z≥0 Z≥0 i j

which automatically means


Z
dν(j)|a(i, j)| ≤ B ν − a.e.i.
Z≥0
If
X Z
|aij | = dν(i)|a(i, j)| = ∞,
i Z≥0

then, since
X XX
∞= |aij | ≤ |aij | ≤ B
i i j

it is contradiction. Thus, a(i, j) is also ν-integrable almost every j. Then by the part (b) of the Fubini theorem, since ν is Σ-finite,
XX Z Z Z Z XX
aij = dν(i) dν(j)a(i, j) = dν(j) dν(i)a(i, j) = aij .
i j Z≥0 Z≥0 Z≥0 Z≥0 j i

(b) Observe that


XX ∞
X ∞
X ∞
X
aij = a1,1 + (aii + ai,i−1 ) = 1 + (1 − 1) = 1 + 0 = 1.
i j i=2 i=2 i=2

Also, observe that


XX ∞
X ∞
X
aij = (ajj + aj+1,j ) = (1 − 1) = 0.
j i j=1 j=1

Thus, we cannot exchange the summation order which means the Fubini theorem does not work. It is because the condition (a)
and (b) in the first part of this problem fails. Condition (a) failes because ai+1,i = −1 < 0 and condition (b) fails because
XX ∞
X ∞
X
|aij | = |a11 | + (|aii | + |ai,i−1 |) = 1 + 2 = ∞.
i j i=2 i=1

Page 19
STAT 605 Editor : ByeongHo Ban

x2 −y 2
8. Consider the product space [0, 1] × [0, 1] with the product of the Lebesgue measure µ × µ and the function f (x, y) = (x2 +y 2 )2 .
Show that
Z Z Z Z
µ(dx) µ(dy)f (x, y) and µ(dy) µ(dx)f (x, y)
[0,1] [0,1] [0,1] [0,1]
∂ x
exists and are finite but are unequal. (Hint: ∂x x2 +y 2 ?)

Proof. (ByeongHo Ban)

Firstly, note that


Z Z
x y
f (x, y)µ(dx) = − 2 + C1 f (x, y)µ(dy) = 2 + C2
x + y2 x + y2
with constants C1 and C2 since
∂ −x ∂ y
2 2
= f (x, y) and = f (x, y).
∂x x + y ∂y x + y 2
2

Now, observe that


! Z
Z Z Z 1
y 1 1 π
µ(dx) µ(dy)f (x, y) = µ(dx) 2 + y2
= µ(dx) 2 = arctan(x) x=0
=
[0,1] [0,1] [0,1] x y=0 [0,1] x +1 4
and that
!
1
−x −1
Z Z Z Z
1 π
µ(dy) µ(dx)f (x, y) = µ(dy) = µ(dx) = − arctan(x) =− .
[0,1] [0,1] [0,1] x + y2
2
x=0 [0,1] y2+1 x=0 4
Thus, each integral does exist and finite but unequal to each other.


Page 20
STAT 605 Editor : ByeongHo Ban

9. Write a 2-page summary of the most important concepts/results of Chapter 1.

Page 21
STAT 605 Editor : ByeongHo Ban

Proof. (ByeongHo Ban)


Summary

There are two main concepts we have dealt with in chapter 1 which are measure and integration.

Let’s talk about measure first. Measure, µ, is a function. The domain of measure is called σ-algebra. When you have a set
E, σ−algebra , E, is a subcollection of power set of E, P(E), satisfying two conditions. The two conditions are closedness under
complement and countable union. And we say a pair (E, E) a measurable space and (E, E, µ) a measure space. In the case
when it is closed under finite union, we call it just an ’algebra’. From the definition of σ-algebra, we know that E, ∅ ∈ E because,
for any A ∈ E, Ac ∈ E so ∅ = A ∩ Ac ∈ E and E = A ∪ Ac ∈ E. And the codomain of µ is a non-negative real number. In order
for a function µ to be a measure, it should satisfy two condition. The one condition is that µ(∅) = 0 and
!
[ X
µ En = µ(En )
n n
for any (En ) ⊂ E, a sequence of disjoint sets. This property is called countable additivity. Thus, in summary, measure is a
function from a σ-algebra to a non-negative real number such that its value at ∅ is 0 and have the property countable additivity.
A measure, µ, satisfies several nice and important properties. Firstly, µ(A) ≤ µ(B) whenever A ⊆ B. It is because
µ(A) ≤ µ(A) + µ(B \ A) = µ(B).
Secondly, for any (En ) ⊂ E,
! n−1
!
[ X [ X
µ En = µ En \ Ek ≤ µ(En )
n n k=1 n

by first property. And, by using above two property, when (En ) ⊂ E such that En ⊆ En+1 for each n, we have
!
[
µ En = lim µ(En ).
n→∞
n
We also have a classification of measure. A measure is finite if µ(E) < ∞. And measure is σ-finite if there is a partition
(En ) ⊂ E Pof E such that µ(En ) < ∞ for each n. Lastly, a measure is Σ-finite if there is a sequence of finite measures (µn ) such
that µ = µn . Also, there are some special concepts called Atomic and diffuse measure. An atom is a singleton set {x} ∈ E
such that µ({x}) > 0. We say µ is diffuse if it does not have any atom. On the other hands, we say µ is purely atomic if the
set of atom of µ is countable and the value of µ is always zero out side of the set of atoms. With given measure, the concept of
’Almost everywhere’ is also important concept. It says a statement is true almost everywhere if it is true except at the set of
point where the measure is zero.
Addition some words for σ-algebra, we first note that, if we have two arbitrary different σ-algebras, then the intersection of
them is also σ-algebra. Actually, any arbitrary many intersections of σ−algebras is also a σ-algebra. From here, we can define
a σ-algebra generated by a subcollection G ⊂ P(E). It is defined by any intersection of σ-algebras containing G. As for
generating a σ-algebra, we have important concepts called p−system and d-system. A p-system is a subcollection of the power set
of E closed under intersection. And d-system, D, is a also a subcollection of the power set of E such that E ∈ D and A, B ∈ D
with A ⊂ B, we have B \ A ∈ D. From here, an important theorem called ’Monotone class theorem’ arise. This theorem
states that ’If a d-system contains a p-system then it also contains a σ-algebra generated by the p-system.’ This theorem can be
extended to a ’Monotone class theorem for functions’. Before we state this, we should defined an appropriate class of function for
your space which is called measurable function. A measurable function relative to E and F is a function f : (E, E) → (F, F)
such that f −1 (A) ∈ E for any A ∈ F. Usually, we are interested in a class of measurable function relative to E and BR where BR
is a Borel σ-algebra. A Borel σ-algebra BE is a σ-algebra generated by all the open sets in E with given topology. In the case
when the codomain of a measurable function is BR , we simply say f is measurable function on (E, E).

Integration of measurable function is another important concept in this chapter. In order to define integration of a measurable
function, we first define some simple types of function called simple function. When 1A is a characteristic function of set
A, a simple function on a measurable space (E, E) is a finite R-linear combination of characteristic functions of sets in E with
coefficient in R. And we define the integration of simple function with respect to a measure µ as
Z X N XN
an 1An dµ = an µ(An ).
E n=1 n
And then we define the integration of a positive measurable function f . We should remind that good integration should satisfy
three conditions, one is positivity(saying that integration of positive function should be positive) and second is linearity(saying
integration of linear combination of two function is a linear combination of two integration of the functions) and third is continuity
(saying that limit of integration of sequence of increasing functions is the integration of the limit function)For simplicity, we call
the class of positive measurable functions as E+ . Before defining this, we note that simple function is measurable and if all the
coefficients are positive real then the simple function is in E+ .


Page 22
STAT 605 Editor : ByeongHo Ban

And also it should be noted that, for any positive simple function f , there exists a sequence of positive simple functions
R (fnR) such
that fn % f . Thus, it seems reasonable if we define the integration of positive measurable function f as limn→∞ E fn dµ = E f dµ
where (fn ) is as defined above. Then as for usual measurable
R Rfunction f , Rwe define f+ = max(0, f ) ∈ E+ and f− = − min(0, f ) ∈ E+
and say f = f+ − f− . By this way, we can say E f dµ = E f+ dµ − E f− dµ. This is usual and reasonable way to define the
integration of measurable functions. When defining integration, three important theorem about limit of integration arise. One is
monotone convergence theorem(MCT), others are dominated convergence theorem and Fatou’s lemma. Monotone
convergence theorem says that if (fn ) ⊂ E+ and fn % f µ-a.e then
Z Z
lim fn dµ(x) = f dµ(x)
n→∞ E E
which is interchange of integral and limit. And Fatou’s lemma says, for any sequence of positive measurable functions (fn ),
Z Z
lim inf fn dµ ≤ lim inf f dµ
E n→∞ n→∞ E
. Lastly, dominated convergence theorem states that, for any sequence of measurable functions (fn ) is dominated by an integrable
positve function g, i.e. |fn | ≤ g for each n, then we have
Z Z
lim fn dµ = f dµ
n→∞ E E
which is also interchange of integral and limit. It helps us to compute given integration by using some sequence of functions with
known integral values.

Now, we can state the Monotone class theorem for measurable functions. Firstly, we define monotone class of functions,
M, by a class of measurable functions such that 1 ∈ M and if f, g ∈ M then af + bg ∈ M for any a, b ∈ R and with (fn ) ⊂ M+
and fn % f , f ∈ M. Then the monotone class theorem for function states that, when M is a monotone class of functions on
(E, E) and C is a p-systems generating E with 1A ∈ M with A ∈ C, then E+ , Eb ⊂ M where Eb is a collection of all bounded
measurable functions.

On top of the integration, we now want to make other usual properties of integration such as change of variable. This can be made
by a measurable function g : (E, E) → (F, F). If µ is a measure on (E, E), we can create a new measure on (F, F) by composing
g and µ such that µ ◦ g −1 (B) = µ(g −1 (B)) for each B ∈ F since g −1 (B) ∈ E. And we note that (µ ◦ g −1 )(f ) = µ(f ◦ g) for all
f ∈ F+ . In other words, we have
Z Z
f d(µ ◦ g −1 ) = (f ◦ g)dµ
F E
which is a change of variable. For p ∈ E+ , we can define another measure ν such that
Z
ν(A) = µ(p1A ) = dµp1A .
E
From here, one important theorem called ’Radon-Nikodym theorem’ comes up. It states that, when µ is σ-finite and ν is
absolutely continuous with respect to µ, i.e. ν(A) = 0 whenever µ(A) = 0, then there exists p ∈ E+ such that ν(f ) = µ(pf ). And

we notate p = dµ .

The last section of this chapter is about Kernel and Product space. As for product space of (E, E) and (F, F), we say E ⊗ F
is a σ-algebra generated by collection of measurable rectangles {A × B : A ∈ E, B ∈ F}. We can generalize this notion to
n-dimensional space. So (E × F, E ⊗ F) is the product measurable space.

When we have two measurable spaces (E, E) and (F, F), a transition kernel from E to F is a map K : E × F → R+
such that x → K(x, A) is a positive measurable map for each fixed A ∈ F and B 7→ K(x, B) is a measure for each fixed
x ∈ E. Since the kernel is related to measurable function and a measure, we can define similar notions. K is finite kernel if
K(x, F ) <P ∞ for each x. K is σ-finite B 7→ K(x, B) is σ-finite. K is Σ-finite if there is a sequence of finite kernels (Kn ) such
that K = n Kn . K is bounded if there is R > 0 such that K(x, F ) ≤ R for all P x ∈ E. K is σ−bounded if there is a partition
(Fn ) of F such that x 7→ K(x, Fn ) is bounded. Lastly, K is Σ-bounded if K = n Kn for some sequence of bounded kernels (Kn ).

The most important theorem in last section is Fubini-Toneli theorem. It says that, once we have Σ-finite measures µ and ν on
(E, E) and (F, F), there exists a unique Σ-finite measure π on (E × F, E ⊗ F) such that
Z Z Z Z
π(f ) = µ(dx) ν(dy)f (x, y) = ν(dy) µ(dx)f (x, y)
E F F F
for any f ∈ (E ⊗ F)+ .
The more stronger theorem which applies to f ∈ E ⊗ F. The formula above also hold for f ∈ E ⊗ F if x 7→ f (x, y) and y 7→ f (x, y)
are µ and ν integrable ν and µ almost everywhere respectively.

I think these are pretty much compactly important things in chapter 1 which can be summarized in two pages.

Page 23
STAT 605 Editor : ByeongHo Ban

Due date : October 14th, 2019 Byeongho Ban

1.17
(a)
Observe that
!
[
c(−x) = lim c(y) = lim µ[−∞, y] = µ [−∞, y] = µ[−∞, x) = P{X < x}
y%x y%x
y<x
c(x) − c(x−) = µ[−∞, x] − µ[−∞, x) = µ{x} = P{X = x}
!
\
c(−∞) = lim c(x) = lim µ[−∞, x] = µ [−∞, x] = µ{−∞} = P{X = −∞} ∵ intersecton of closed sets is closed.
x&−∞ x&−∞
−∞<x
!
[
c(+∞) = lim c(x) = lim µ[−∞, x] = µ [−∞, x] = µ[−∞, ∞) = P{X < ∞}
x%∞ x%∞
x<∞
= P{X ≤ ∞} − P{X = ∞} = 1 − P{X = ∞}

(b)
Observe that, by the results of (a),
X X
a(x) = c(−∞) + [c(y) − c(y−)] = µ{−∞} + µ{y} = µ ([−∞, x] ∩ D) = µa [−∞, x]
y∈Dx y∈Dx

for any x ∈ R. It implies that a is the distribution function of µa . And then observe that
b(x) = c(x) − a(x) = µ[−∞, x] − µa [−∞, x] = (µ − µa )[−∞, x] = µb [−∞, x]
for any x ∈ R. Thus, b is the distribution function of measure µb .

Note that, if B is measurable set disjoint with D, then µa (B) = µ(B ∩ D) = µ(∅) = 0. So µa is purely atomic. Assume that µb is
not diffuse, then there is x ∈ R such that 0 < µb {x} = µ{x} − µa {x} but then µ{x} > 0 so x ∈ D so µa {x} = µ{x} so µb {x} = 0
which is a contradiction. Therefore, µb does not have atom so it is diffuse.

1.18

Note that, since X does not take ±∞ (because it is real valued), c(−∞) = {P }{X = −∞} = P(∅) = 0 and
c(+∞) = P = 1 − P{X = ∞} = 1 − P(∅) = 1.

Let µX and µY be distributions of the random variables X and Y respectively. Then observe that, for any x ∈ R,

µY [−∞, x] = P{Y ≤ x} = P{q ◦ U ≤ x} = P{U ≤ c(x)} = Leb(0, c(x)] = c(x) = µ[−∞, x]


since c(x) ∈ (0, 1). And the reason why P{q ◦ U ≤ x} = P{U ≤ c(x)} is that, letting q ◦ U (ω) = y, we have U (ω) = c(y) because
U (ω) ∈ (0, 1).
y ≤ x ⇐⇒ c(y) ≤ c(x) ⇐⇒ U (ω) ≤ c(x).
Since {[−∞, x] : x ∈ R} generates B(R), µX and µY are same on B(R).
1.21

Let µ be a probability measure on R.

Now let Ω = (0, 1), H = B(0,1) , P = Leb, and define X(ω) = q(ω) for ω in Ω, where q is the quantile function corresponding to
the measure µ via the cumulative distribution function c defined by c(x) = µ[−∞, x].

Now, observe that, for any x,


P{X ≤ x} = Leb{ω : q(ω) ≤ x} = Leb{ω : ω ≤ c(x)} = c(x) = µ[−∞, x].
Therefore, there exists a probability space (Ω, H, P) and a random variable X : Ω → R such that µ is the distribution of X.

Page 24
STAT 605 Editor : ByeongHo Ban

2.14

Firstly, note that


Z X(ω) Z ∞
p p−1
X (ω) = dxpx = dxpxp−1 1{X>x} (ω).
0 0
Then, observe that
Z Z Z ∞  Z ∞ Z  Z ∞
p p p−1 p−1
EX = P(dω)X (ω) = P(dω) dxpx 1{X>x} (ω) = dx P(dω)px 1{X>x} (ω) = dxpxp−1 P{X > x}
Ω Ω 0 0 Ω 0
by Fubini’s theorem since Lebesgue measure and P are Σ-finite.

In particular, if X takes values in N, then, by using the formula we found above,


Z ∞ ∞ Z n+1
X X∞ ∞
X
EX = dxP{X > x} = dxP{X > x} = (n + 1 − n)P{X > n} = P{X > n}
0 n=0 n n=0 n=0
by monotone convergence theorem.
Also, observe that
Z ∞ Z ∞ ∞ Z
X n+1 ∞
X Z n+1
2
EX = dx2xP{X > x} = 2 dxxP{X > x} = 2 dxxP{X > x} = 2 P{X > n} dxx
0 0 n=0 n n=0 n
∞ ∞   ∞
X 1 X 1 X
=2 P{X > n} (2n + 1) = 2 P{X > n} n + =2 nP{X > n} + EX.
n=0
2 n=0
2 n=0

2.16

Let a, b ∈ R be given. Then observe that


V ar(a + bX) = E(a + bX)2 − (E(a + bX))2 = E(a2 + 2abX + b2 X 2 ) − (E(a + bX))2
Z Z 2
= (a2 + 2abX + b2 X 2 )dP − (a + bX)dP
Ω Ω
Z Z Z  Z Z 2
2 2 2
=a dP + 2ab XdP + b X dP − a dP + b XdP
Ω Ω Ω Ω Ω
= a2 P(Ω) + 2abEX + b2 EX 2 − (aP(Ω) + bEX)2
= a2 + 2abEX + b2 EX 2 − a2 − 2abEX − b2 (EX)2
= b2 (EX 2 − (EX)2 )
= b2 V arX.
2.17

Suppose that X ≥ 0 and let b > 0 be given. Firstly, observe that


X(ω) ≥ b1{X>b} (ω)
since, if ω ∈ {X > b} then X(ω) > b = b1{X>b} (ω) and otherwise, X(ω) ≥ 0 = b1{X>b} . Therefore, observe that
Z Z Z
X 1 EX
P{X > b} = 1{X>b} dP ≤ dP = XdP =
Ω Ω b b Ω b
since X ≥ 0.
2.18

Suppose that X has finite mean. Then, by applying Markov’s inequality of previous exercise, observe that, for given  > 0,
1 1
P{|X − EX| > } = P{(X − EX)2 > } ≤ 2 E(X − EX)2 = 2 E X 2 − 2XEX + (EX)2

Z  Z Z Z 
1 2 2 1
X 2 dP − 2EX XdP + (EX)2

= 2 X − 2XEX + (EX) dP = 2 dP
 Ω  Ω Ω Ω
1  1 1
= 2 EX 2 − 2(EX)(EX) + (EX)2 = 2 EX 2 − (EX)2 = 2 V arX.
  
  

Page 25
STAT 605 Editor : ByeongHo Ban

2.19

Let X be real valued and let f : R → R+ be increasing and let b ∈ R be given. Firstly, observe that, since f is increasing,
X(ω) > b =⇒ f ◦ X(ω) = f (X(ω)) ≥ f (b).
Then observe that
Z Z Z
f (b)P{X > b} ≤ f (b) 1{X>b} dP ≤ f (b) 1{f ◦X≥f (b)} dP ≤ f ◦ XdP = Ef ◦ X
Ω Ω Ω
so, since f (b) ∈ R+ ,
1
P{X > b} ≤ Ef ◦ X.
f (b)
2.21

Let X and Y be independent, let X have distribution γa,c and Y the distribution γb,c . Let the distribution of X + Y be µ and that
X
of X+Y be ν. Let π = µ × ν. Suppose that π is the joint distribution of X and Y . Let a positive Borel function f on R+ × [0, 1]
be given. Then observe that
 
X
πf = Ef X + Y,
X +Y
Z ∞ a a−1 −cx Z ∞
cb y b−1 e−cy
 
c x e x
= dx dy f x + y,
0 Γ(a) 0 Γ(b) x+y
Z ∞ Z ∞ a a−1 −cx b b−1 −cy
 
c x e c y e x
= dx dy f x + y,
0 0 Γ(a) Γ(b) x+y
Z ∞ Z 1  a a−1 −cuz b b−1 −c[(1−u)z]

c (uz) e c [(1 − u)z] e
= dz du|J| f (z, u) ∵ y = (1 − u)z and x = uz
0 0 Γ(a) Γ(b)
Z ∞ Z 1
ca+b z a+b−2 e−cz Γ(a + b) a−1
   
= dz duz u (1 − u)b−1 f (z, u)
0 0 Γ(a + b) Γ(a)Γ(b)
Z ∞ Z 1  a+b a+b−1 −cz   
c z e Γ(a + b) a−1
= dz du u (1 − u)b−1 f (z, u)
0 0 Γ(a + b) Γ(a)Γ(b)
Z ∞  a+b a+b−1 −cz  Z 1
c z e Γ(a + b) a−1
= dz du u (1 − u)b−1 f (z, u)
0 Γ(a + b) 0 Γ(a)Γ(b)
= (γa+b,c × βa,b )(f )
where J is the Jacobian matrix and its value is z. Since X and Y are independent, note that γa,c × γb,c = π = γa+b,c × βa,b . Since
X
γa+b,c and βa,b are the marginals, X + Y and X+Y are independent. Therefore, we have µ = γa+b,c and ν = βa,b .
2.24

Let µ be the standard distribution on (0, 1). Note that µ(A) = P{U ∈ A}. Since µ is a measure on (0, 1), A ⊂ (0, 1). Let
c : [−∞, ∞] → [0, 1] where c is the distribution function of U . Note that, after we restrict the domain of c to (0, 1), we can defined
the inverse function q = c−1 since c is strictly increasing so it is one to one and onto. Now, observe that, for any open A ∈ B(0,1) ,
q −1 (A) = c(A) ∈ B(0,1) since q is continuous so q −1 (A) is open. Since B(0,1) is generated by open sets in (0, 1), q is measurable.
Therefore, q ◦ U is measurable so it is a random variable.

Note that the distribution function d of q ◦ U is defined by


d(x) = ν[−∞, x] = P{q ◦ U ≤ x}
where ν is the distribution of q ◦ U . Then, by the result of Exercise 1.18,
d(x) = ν[−∞, x] = µ[−∞, x] = c(x)
which implies that the quantile function of q ◦ U and U are same since their distribution functions are same.
2.25(Not homework)

Let f be any positive measurable function. And let µ be the distribution of X. Then observe that
Z 1   Z 0 Z ∞
1
µ(f ) = E(f ◦ X) = E((f ◦ g) ◦ U ) = Leb(f ◦ g1(0,1) ) = f − log x dx = f (u) (−ce−cu )du = f (u)ce−cu du
0 c ∞ 0

where g(x) = − 1c log x = u since U is uniformly distributed on (0, 1). Therefore, X has the exponential distribution.

Page 26
STAT 605 Editor : ByeongHo Ban

2.26
(Use this result to write down an algorithm which generates 2 independent Gaussian random variables X1 and X2 with parameters
a1 , b1 and a2 , b2 using two random numbers (this is called the Box-Müller algorithm.) )

First of all, by the previous exercise, R2 has the exponential distribution with scale factor 21 .

Also, observe that, letting µ be the distribution of R2 , for any positive measurable function f ,
Z ∞ Z ∞
1 y2 1 x2
E(f ◦ (S12 + S12 )) = dy √ e− 2 f (x2 + y 2 ) √ e− 2 dx
−∞ 2π −∞ 2π
Z ∞ Z ∞
1 x 2 +y 2
= dy dxf (x2 + y 2 ) e− 2
−∞ −∞ 2π
Z 2π Z ∞
1 r2
= dθ drrf (r2 ) e− 2 ∵ the Jacobian is r.

Z0 ∞ 0
r2
= rdrf (r2 )e− 2
Z0 ∞
1 1
= du f (u)e− 2 u ∵ u = r2
0 2
= E(f ◦ R2 ) = µ(f ).
where S1 and S2 are two independent random variables with standard Gaussian distribution.

Let X = R cos 2πV and Y = R sin 2πV and let π be the joint distribution of X and Y . Let f be positive Borel measurable
function on R × R. Then observe that
πf = Ef (R cos 2πV , R sin 2πV )
Z ∞
1 −r 1
Z
= dr e 2 dvf (r cos 2πv, r sin 2πv)
0 2 0
Z ∞ Z 2π
1 r 1
= dr e− 2 dθf (r cos θ, r sin θ) ∵ θ = 2πv
0 2 2π 0
Z ∞ Z 2π
u2 1
= duue− 2 dθf (u2 cos θ, u2 sin θ) ∵ u2 = r
0 2π 0
Z ∞ Z ∞   
1 − x2 1 − y2
= dx dy √ e 2 √ e 2 f (x, y)
−∞ −∞ 2π 2π
= (µ1 × µ2 )(f )
Thus, X and Y are independent standard gaussian variables.

Conversely, suppose that X and Y are independent standard Gaussian variables. Then the variables of R and A of polar coordinate
would be
 
p Y
R = X2 + Y 2 and A = arctan .
X
Then, by going back through the equations in the previous argument, we can see that R and A are independent variables. And
from the relation of third and second lines in the sequence of equations above, we know that R2 has exponential distribution with
scale parameter 21 and A has uniform distribution in [0, 2π].

Page 27
STAT 605 Editor : ByeongHo Ban

2.27
(Hint: One way is to compute first the joint distribution of (U, V ) = (Y /X, X) and then compute the marginal..)

Let f be given positive borel measurable function on R. Then observe that


   Z ∞ Z ∞  
X 1 x2 1 y2 x
µ(f ) = E f ◦ = dx √ e− 2 dy √ e− 2 f
Y −∞ 2π −∞ 2π y
Z ∞ Z 2π
1 r2
= dr dθr e− 2 f (tan(θ))
0 0 2π
Z ∞ Z π2 Z 3π Z 2π !
1 − r2 2
= rdr e 2 dθf (tan θ) + dθf (tan θ) + dθf (tan θ)
0 2π 0 π
2

2
Z ∞ Z ∞ Z ∞ Z ∞ 
1 r2 1 1 1
= rdr e− 2 dz f (z) + dz f (z) + dz f (z) ∵ arctan z = θ,
0 2π 0 1 + z2 −∞ 1 + z2 −∞ 1 + z2
Z ∞ Z ∞
r2 1 1
= drre− 2 dz f (z)
0 −∞ π 1 + z2
Z ∞
1 1
= dz f (z)
−∞ π 1 + z2

R∞ r2 r2
since 0
drre− 2 = −e− 2 = 0 − (−1) = 1. Therefore,
r=0
1
µ(dx) = dz z ∈ R.
π(1 + z 2 )
Suppose that a random variable Z has the Cauchy distribution. Then, letting ν be the distribution of Z1 , observe that
Z ∞   Z 0   Z ∞  
1 1 1 1 1 1 1
ν(f ) = E(f ◦ ) = dz 2)
f = dz 2
f + dz 2
f
Z −∞ π(1 + z z −∞ 1 + z z 0 1 + z z
Z −∞ Z 0
1 1 1 1 1
= du − 2 1 f (u) + du − 2 1 f (u) ∵z=
0 u 1 + u2 ∞ u 1 + u2 u
Z ∞
1
= du f (u)
−∞ 1 + u2
1
so Z has the Cauchy distribution.

Now, suppose that A has the uniform distribution on (0, 2π). Then, for any positive measurable function f on R, letting η be the
distribution of tan A, observe that
Z 2π
1
η(f ) = E ((f ◦ tan) ◦ A) = dxf (tan x)
2π 0
Z π2 Z 3π Z 2π
1 2
= dxf (tan x) + dxf (tan x) + dxf (tan x)
2π 0 π
2

2
Z ∞ Z ∞ Z 0 
1 1 1 1
= dz f (z) + dz f (z) + dz f (z) ∵ x = arctan z
2π 0 1 + z2 −∞ 1 + z2 −∞ 1 + z2
Z ∞
1 2
= dz f (z)
2π −∞ 1 + z 2
Z ∞
1
= dz f (z)
−∞ π(1 + z 2 )
1
Thus, tan A has the Cauchy distribution. Also, by the previous argument, cot A = tan A also has Cauchy distribution.

Page 28
STAT 605 Editor : ByeongHo Ban

Due date : October 28th, 2019 Byeongho Ban

2.22
Suppose that X has the Gaussian distribution with mean a and variance b which means
Z
1 (x−a)2
P(f ◦ X) = dx √ e− 2b f (x) ∀f ∈ E+ .
R 2πb
Then, let Z = X−a

b
and we will show that it is a random variable with standard Gaussian distribution. Let f ∈ E+ be given and
observe that
  Z  
X −a 1 (x−a)2
− 2b x−a
P(f ◦ Z) = P f ◦ √ = dx √ e f √
b R 2πb b
Z √ 1 z2 x−a
= dz b √ e− 2 f (z) ∵z= √
2πb b
ZR
1 − z2
= dz √ e 2 f (z)
R 2π
Thus, Z has the standard Gaussian distribution.

Conversely, suppose that X = a + bZ with Z having standard Gaussian distribution. Then observe that
 √ 
P(f ◦ X) = P f ◦ (a + bZ)
Z
1 z2
 √ 
= dz √ e− 2 f a + bz

ZR
1 1 − (x−a)2 x−a
= dx √ √ e 2b f (x) ∵z= √
R b 2π b
Z
1 (x−a)2
= dx √ e− 2b f (x)
R 2πb
Thus, X has the Gaussian distribution with mean a and variance b.

Now, note that


Z Z Z
1 (x−a)2 1 (x−a)2 a (x−a)2
EX = P(X) = dx √ e− 2b x = dx √ (x − a)e− 2b + √ e− 2b

R 2πb R 2πb 2πb R


 −∞ Z
1 (x−a)2 1 (x−a)2
=√ b e− 2b +a dx √ e− 2b = 1
2πb ∞ R 2πb
2
=a ∵ lim e−x /2
= 0.
x→±∞

And note that EZ = E( X−a √ )= √


b
1
b
(E(X) − a) = 0.
Also, note that, by the integration by parts,
Z
1 z2
EZ 2 = dz √ e− 2 z 2
R 2π
" #
∞ Z
1 z2 z2
=√ −ze− 2 + e− 2
2π −∞ R
Z
1 z2 z2
=√ dze− 2 = 1 ∵ lim ze− 2 = 0 by L’Hospital.
2π R z→±∞

Now, then
√ √
EX 2 = E(a + bZ)2 = a2 + 2a bEZ + bEZ 2 = a2 + 0 + b = a2 + b.
Then we have
V ar(Z) = EZ 2 − (EZ)2 = 1 − 0 = 1
V ar(X) = EX 2 − (EX)2 = a2 − (a2 − b) = b.
Lastly, observe that, by the change of variable y = z + ir,
Z Z Z
1 z2 r2 1 −(z+ir)2 r2 1 y2 r2
EeirZ = dz √ e− 2 +irz = e− 2 dz √ e 2 = e− 2 dy √ e− 2 = e− 2 .
R 2π R 2π R+ir 2π
√ √ √
(r b)2 r2 b
Thus, EeirX = Eeir(a+ bZ)
= e−ira Eei( br)Z
= eira e− 2 = eira− 2 .

Page 29
STAT 605 Editor : ByeongHo Ban

2.28

Observe that, supposing that µ and ν are distributions of X and Y respectively, since X and Y are independent random variables,
Z Z Z Z
Ee−rX Ee−rY d = µ(dx)e−rx ν(dy)e−ry = µ(dx) ν(dy)e−r(x+y) = Ee−r(X+Y ) .
R+ R+ R+ R+

2.29

Observe that
∞ ∞ ∞
ca xa−1 e−cx irx ca xa−1 e−cx ca xa−1 e−cx
Z Z Z
EeirX = dx e = dx cos x + i dx sin x.
0 Γ(a) 0 Γ(a) 0 Γ(a)
Recall that
∞ ∞
X (−1)n X (−1)n
sin rx = (rx)2n+1 cos rx = (rx)2n
n=0
(2n + 1)! n=0
(2n)!
and they converges uniformly. Thus,
Z ∞ Z ∞ ∞
ca xa−1 e−cx ca xa−1 e−cx X (−1)n
dx cos rx = dx (rx)2n
0 Γ(a) 0 Γ(a) n=0
(2n)!
∞ n Z ∞
X (−1)
= dxca xa−1 e−cx (rx)2n
n=0
(2n)!Γ(a) 0
∞ n 2n Z ∞
X (−1) r
= 2n
dxc2n+a x2n+a−1 e−cx
n=0
(2n)!Γ(a)c 0

X (−1)n r2n
= Γ(2n + a)
n=0
(2n)!Γ(a)c2n

X (−1)n r2n (2n + a − 1)!
= 2n
Γ(a)
n=0
(2n)!Γ(a)c (a − 1)!

X (−1)n r2n (2n + a − 1)!
=
n=0
(2n)!c2n (a − 1)!

where (2n+1−a)!
(a−1)! is nothing but (2n + a − 1)(2n + a − 1 − 1)(2n + a − 1 − 2) · · · (2n + a − 1 − 2n)(2n + a − 1 − (2n − 1)).(it is not
consistent with the mathematical notation of factorial precisely but let’s use it for convenience.)

Also, observe that


∞ ∞ ∞
ca xa−1 e−cx ca xa−1 e−cx X (−1)n
Z Z
dx sin rx = dx (rx)2n+1
0 Γ(a) 0 Γ(a) n=0
(2n + 1)!
∞ Z ∞
X (−1)n
= dxca xa−1 e−cx (rx)2n+1
n=0
(2n + 1)!Γ(a) 0
∞ Z ∞
X (−1)n r2n+1
= 2n+1
dxc2n+a+1 x2n+a e−cx
n=0
(2n + 1)!Γ(a)c 0

X (−1)n r2n+1
= Γ(2n + a + 1)
n=0
(2n + 1)!Γ(a)c2n+1

X (−1)n r2n+1 (2n + a)!
= 2n+!
Γ(a)
n=0
(2n + 1)!Γ(a)c (a − 1)!

X (−1)n r2n+1 (2n + a)!
=
n=0
(2n + 1)!c2n+1 (a − 1)!
Thus, we have
∞ ∞
X (−1)n r2n (2n + a − 1)! X (−1)n r2n+1 (2n + a)!
EeirX = + i
n=0
(2n)!c2n (a − 1)! n=0
(2n + 1)!c2n+1 (a − 1)!
∞  n
X ir (n + a − 1)!
=
n=0
c (n!)(a − 1)!
(will be continued in next page)

Page 30
STAT 605 Editor : ByeongHo Ban

2.29(Continued)

Recall that

X (a + n − 1)! n
(1 − x)−a = x .
n=0
(n!)(a − 1)!
This can be verified by mathematical induction with the properties that, letting f (x) = (1 − x)−a ,
f (0) = 1 f 0 (0) = −a(1 − 0)−a−1 = −a f 00 (0) = −a(−a − 1)(1 − 0)−a−2 = a(a + 1)
Our claim is that f n (0) = (−1)n (a+n−1)!
(a−1)! . Now, assume that f (n−1) (0) = (−1)n−1 (a+n−2)!
(a−1)! which means f (n−1) (x) =
(−1)n−1 (a+n−2)!
(a−1)! (1 − x)
−(a+n−1)
, note that
(a + n − 2)! (a + n − 1)!
f (n) (0) = (−1)n−1 (−1)(a + n − 1)(1 − x)−(a+n) = (−1)n (1 − x)−(a+n) .
(a − 1)! (a − 1)!
Thus, f (n) (0) = (−1)n (a+n−1)!
(a−1)! . Therefore,
∞ ∞ ∞
X f (n) (0) n X (a + n − 1)! 1 n X (a + n − 1)! n
(1 − x)−a = f (x) = (−1)n x = (−1)n (−1)n x = x .
n=0
n! n=0
(a − 1)! n! n=0
(n!)(a − 1)!
By using this, we get
∞  n  −a
X i (n + a − 1)! ir
EeirX = = 1− .
n=0
c (n!)(a − 1)! c
Since there is no difference between the distributions X and Y except the mean b for Y instead the mean for X, we get
 −b
ir
EeirY = 1 − .
c
Now, note that X + Y has gamma distribution with shape index a + b and the scale c, just by the previous result,

 −(a+b)
ir
Eeir(X+Y ) = 1− .
c
Or we can directly compute,

 −(a+b)  −a  −b Z Z Z Z


ir ir ir
1− = 1− 1− = EeirX EeirY = µ(dx)eirx ν(dy)eiry = µ(dx) ν(dy)eir(x+y) = Eeir(X+Y ) .
c c c R R R R
Now, observe that
 −b  −b
ir(−Y ) i(−r)Y i(−r) ir
Ee = Ee = 1− = 1+ .
c c
Then observe that
Z Z Z  Z   −a  −b
ir(X−Y ) ir(X−Y ) irx −iry irX −irY ir ir
Ee = µ(dx) ν(dy)e = µ(dx)e ν(dy)e = Ee Ee = 1− 1+ .
R R R R c c
2.37

Observe that
2
V ar(X + Y ) = E(X + Y )2 − (E(X + Y ))
= E(X 2 + 2XY + Y 2 ) − (EX)2 + 2(EX)(EY ) + (EY )2


= (EX 2 + 2EXY + EY 2 ) − (EX)2 + 2(EX)(EY ) + (EY )2




= (EX 2 − (EX)2 ) + (EY 2 − (EY )2 ) + 2[E(XY ) − (EX)(EY )]


= V ar(X) + V ar(Y ) + 2Cov(X, Y ).
Lastly, suppose that X and Y are independent, then letting µ and ν be the distribution of X and Y respectively,
Z Z Z  Z 
E(XY ) = µ(dx) ν(dy)xy = µ(dx)x ν(dy)y = EXEY.
R R R R
Thus, we get Cov(X, Y ) = E(XY ) − EXEY = 0.

Page 31
STAT 605 Editor : ByeongHo Ban

2.38

Suppose that (Xi ) be the sequence of pairwise orthogonal random variables. Then by previous exercise(2.37), we have
V ar(X1 , X2 ) = V ar(X1 ) + V ar(X2 ) + 2Cov(X1 , X2 ) = V ar(X1 ) + V ar(X2 ) since X1 and X2 are orthogonal.
Assume that
n−1
! n−1
X X
V ar Xk = V ar(Xk ).
k=1 k=1
Then, note that
n−1
! ! n−1
! n−1
!
X X X
Cov Xk , Xn =E Xn Xk −E Xk EXn
k=1 k=1 k=1
n−1
X n−1
X
= E(Xn Xk ) − EXk EXn
k=1 k=1
n−1
X
= [E(Xk Xn ) − EXk EXn ]
k=1
=0 ∵ Xn is orthogonal to Xk for all k 6= n.
Thus, by the argument above, observe that
n
! n−1
! n−1
! !
X X X
V ar Xk = V ar Xk + V ar(Xn ) + 2Cov Xk , Xn
k=1 k=1 k=1
n−1
X
= V ar(Xk ) + V ar(Xn ) + 0
k=1
Xn
= V ar(Xk ).
k=1
So we are done.

Page 32
STAT 605 Editor : ByeongHo Ban

2.39

(a)
Observe that
d
! d
X X
a(r) = E(r · X) = E rk Xk = rk E(Xk ) = r · E(X) = r · a = a · r.
k=1 k=1
Also, observe that, by the argument in previous exercise (Cov is bilinear )(5th line of exercise 2.38),
d
! d−1
! d−1
!
X X X
b(r) = V ar(r · X) = V ar rk Xk = V ar(rd Xd ) + V ar rk Xk + 2Cov rk Xk , rd Xd
k=1 k=1 k=1
d−1
! d−1
X X
= V ar(rd Xd ) + V ar rk Xk +2 Cov(rk Xk , rd Xd )
k=1 k=1
d−1 d−2
! d−2
X X X
= V ar(rd Xd ) + 2 Cov(rk Xk , rd Xd ) + V ar(rd−1 Xd−1 ) + V ar rk Xk +2 Cov(rk Xk , rd−1 Xd−1 )
k=1 k=1 k=1
= ···
d
X X
= Cov(rk Xk , rk Xk ) + 2 Cov(rk Xk , rj Xj )
k=1 k<j
d
X X
= Cov(rk Xk , rk Xk ) + Cov(rk Xk , rj Xj )
k=1 k6=j
d X
X d
= Cov(Xj , Xk )rj rk
j=1 k=1
d
!
X
=r· Cov(Xi , Xk )rk
k=1 i
= r · vr
because V ar(aX) = E((aX)2 ) − (E(aX))2 = Cov(aX, aX) and because Cov(aX, bY ) = E(abXY ) − E(aX)E(bY ) =
ab(E(XY ) − EXEY ) = abCov(X, Y ).

(b) Observe that


vij = Cov(Xi , Xj ) = E(Xi Xj ) − EXi EXj = E(Xj Xi ) − EXj EXi = Cov(Xj , Xi ) = vji
so v is symmetric. Note that r · vr = V ar(r · X) ≥ 0 for all r ∈ Rd since V ar(X) ≥ 0 for any random variable. Therefore, v is
positive definite.
2.40

Suppose that Xi and Xj are independent, then, by Exercise 2.37, we have vij = Cov(Xi , Xj ) = 0.

Conversely, suppose that Cov(Xi , Xj ) = vij = 0. Since X is Gaussian random vector, Xi and Xj have Gaussian distributions by
using ei and ej as r where ei is a d dimensional vector with 1 in ith slot and 0 in any other slots. Let µi and µj be the distributions
of Xi and Xj respectively. Also, let πij be the joint distribution of µi and µj . Now, note that ri Xi + rj Xj also has Gaussian
distribution for any ri , rj ∈ R. Then, for any ri , rj ∈ R, observe that, by the information given in the previous problem,
ri2 Cov(Xi ,Xi )+rj
2 Cov(X ,X )
j j
bij (ri , rj ) = Eei(ri Xi +rj Xj ) = ei(ai ri +aj rj )−
π 2

V ar(Xi ) V ar(Xj )
iai ri −ri2 iaj rj −rj2
=e 2 e 2 = Eeiri Xi Eeirj Xj = µbi (ri )b
µj (rj )
since Cov(Xi , Xj ) = 0. Also, observe that
Z Z Z  Z 
i(ri Xi +rj Xj )
i × µj (ri , rj ) = Ee
µ\ = µi (dx) µj (dy)ei(ri x+rj y) = µi (dx)eiri x µj (dy)eirj y = µ
bi (ri )b
µj (rj ).
R R R R

Thus, π i × µj (ri , rj ). Since characteristic function determines the distribution uniquely, we have piij = µi × µj which
bij = µ\
implies that Xi and Xj are independent.

(will be continued in the next page)

Page 33
STAT 605 Editor : ByeongHo Ban

2.40(Continues)

Let I and J be given disjoint subsets of {1, 2, . . . , d}.

Firstly, suppose that the random vectors (Xi )i∈I and (Xj )j∈J are independent. Then, by same argument with above, vij = 0 if
(i, j) ∈ I × J since then Xi and Xj are independent.

Conversely, suppose that vij = 0 for every pair (i, j) ∈ I × J. Then, by same argument, any Xi and Xj with (i, j) ∈ I × J are
independent, so (Xi )i∈I and (Xj )j∈J are independent.
2.41
It is clear that, by exercise 2.39,
Eeir·X = eia·r−(r·vr)/2 .
3.22
Firstly, observe that, since 1Hn ≤ 1, by dominated convergence theorem,
Z Z
0 = lim P(Hn ) = lim P(1Hn ) = lim dP1Hn = dP lim 1Hn =⇒ lim 1Hn = 0 a.e.
n→∞ n→∞ n→∞ Ω Ω n→∞ n→∞

Then, note that |X1Hn | ≤ X and that X is integrable. Thus, by the dominated convergence theorem,
lim EX1Hn = E( lim X1Hn ) = E(0) = 0
n→∞ n→∞
since X1Hn → 0 a.e.
3.23

First, note that


lim sup E|Xi |1{|X|>b} = 0 and lim sup E|Yi |1{|X|>b} = 0.
b→∞ i b→∞ i
and that

0 ≤ sup E|Xi + Yi |1{|X|>b} ≤ sup E|Xi |1{|X|>b} + sup E|Yi |1{|X|>b}


i i i
0 ≤ sup E|Xi ∨ Yi |1{|X|>b} ≤ sup E|Xi |1{|X|>b} + sup E|Yi |1{|X|>b}
i i i
Since limit preserves the inequality, as we send b to infinity, we get

lim sup E|Xi ∨ Yi |1{|X|>b} = 0


b→∞ i
lim sup E|Xi + Yi |1{|X|>b} = 0
b→∞ i
Thus, (Xi + Yi ) and (Xi ∨ Yi ) are uniformly integrable.
4.14

Let t ∈ R+ and A ∈ Et be given then observe that Xt−1 A is one of Ω, {ω : t ≥ T (ω)} = T −1 (0, t] or {ω : t < T (ω)} = T −1 (t, ∞)
which are all in σT since T is measurable. Therefore, X belongs to σT so T determines X.

Conversely, let (t, ∞) ⊂ R+ be given. Then observe that T −1 (t, ∞) = {ω : t < T (ω)} = Xt−1 (−∞, 1/2) which is in σX. Since
(t, ∞) generates the σ algebra on R+ , T belongs to σX so X determines T .

Therefore, X and T determines each other.


4.16

Page 34
STAT 605 Editor : ByeongHo Ban

Due date : November 8th, 2019 ByeongHo Ban

5.20

Observe that, for any A, B ∈ B,


Z
E[1A (X)1B (Y )] = 1A (X(ω))1B (Y (ω))dP
ZΩZ
= 1A (f (ω1 ))1B (g(ω2 ))λ(dω1 )λ(dω2 )
B2
Z Z Z Z Z
= 1A (f (ω1 ))λ(dω1 ) λ(dω2 ) 1B (g(ω2 ))λ(dω2 ) λ(dω1 ) ∵ λ(dωi ) = 1 and Fubini
B
Z  ZB B
 B B

= 1A (X(ω))dP 1B (Y (ω))dP
Ω Ω
= E1A (X)E1B (Y ).
Then, by linearity, E[h1 (X)h2 (Y )] = E(h1 (X))E(h2 (Y )) hold for any simple functions h1 and h2 . Then by monotone convergence
theorem, since any positive measurable functions has a sequence of simple functions increasing to it, the equality hold for any
positive measurable functions. Therefore, X and Y are independent random variables.
Therefore, X and Y are independent.
5.21
Suppose that X and Y are independent random variables taking values from some measurable spaces A and B respectively. Let
µ and ν are distributions of X and Y respectively. Then observe that, letting π be the joint distribution, since X and Y are
independent,
Z Z Z
−pX−qY −px−qy −px
Ee = π(dx, dy)e = µ(dx)e ν(dy)e−qy = Ee−pX Ee−qY .
A×B A B

Conversely, suppose that the equality Ee−pX−qY = Ee−pX Ee−qY holds.


Then observe that, letting π be the joint distribution of µ and ν,
Z
π
b((p, q)) = π(dx, dy)e−(p,q)·(x,y)
A×B
−px−qy
= Ee
= Ee−px Ee−qy
Z  Z 
= µ(dx)e−px ν(dy)e−qy
Z A B

= µ × ν(dx, dy)e−px−qy
A×B


\ × ν(p, q).
Recall that Laplace transformation uniquely determine the distribution measure, it implies that π = µ × ν so the joint measure
of X and Y is the product of their marginals which implies X and Y are independent.

Page 35
STAT 605 Editor : ByeongHo Ban

5.22

Suppose that X and Y are independent random variables with distribution µ and ν respectively. And let their joint distribution
be π. Also, suppose that X and Y are taking values from (A, A) and (B, B) respectively. Then observe that
Z Z Z Z Z
Eeir(X+Y ) = π(dx, dy)eir(x+y) = µ(dx) ν(dy)eirx eiry = µ(dx)eirx ν(dy)eiry = EeirX EeirY .
A×B A B A B
Z Z Z Z Z
Eer(X+Y ) = π(dx, dy)er(x+y) = µ(dx) ν(dy)erx eiry = µ(dx)erx ν(dy)ery = EerX EerY
A×B A B A B

(a) Note that


X an e−a ∞
X (aeir )n −a ir
Ee irX
= irn
e = e = e−a eae
n! n=0
n!
n∈N
X bn e−b ∞
X (beir )n −b ir
EeirY = eirn = e = e−b ebe
n! n=0
n!
n∈N
Then by the result we got in the beginning,
ir
Eeir(X+Y ) = EeirX EeirY = e−(a+b) e(a+b)e
which implies that the joint distribution of X + Y would be Poisson distribution with mean a + b since characteristic function of
a random variable determines the distribution of it.

(b) From previous homework 2.29, we found that


 −a  −b
irX ir irY ir
Ee = 1− and Ee = 1−
c c
And also, note that
−a   −b  −(a+b)
ir ir ir
Eeir(X+Y ) = EeirX EeirY = 1− 1− = 1−
c c c
Since characteristic function of a random variable determines its distribution, the distribution of X + Y would be gamma
distribution with shape index a + b and scale factor c.

(c) From previous homework 2.22, we found that


r2 b r2 d
EeirX = eira− 2 and EeirY = eirc− 2

Then observe that


r2 b r2 d r 2 (b+d)
Eeir(X+Y ) = EeirX EeirY = eira− 2 eirc− 2 = eir(a+c)− 2

Therefore, by same reason with (a) and (b), the distribution of X + Y would be Gaussian distribution with mean a + c and
variance b + d.

Page 36
STAT 605 Editor : ByeongHo Ban

5.23

Here, X and Y are random variables having µ and ν as their distribution respectively.

(a) Observe that, for given B ∈ BR , with the change of variables z = x + y,


π(B) = π(1B ) = µ ∗ ν(1B ) = E(1B (X + Y ))
Z Z Z Z Z
= µ(dx) ν(dy)1B (x + y) = µ(dx) ν(dz)1B−x (z) = µ(dx)ν(B − x)
R R R R R

(b) Here, we use dx for λ(dx) for convenience. Observe that, by the change of variables z = x + y, for any positive measurable
function f ,
Z Z Z Z
π(f ) = µ ∗ ν(f ) = E(f (X + Y )) = µ(dy) ν(dx)f (x + y) = dyp(y) dxq(x)f (x + y)
R
Z Z Z ZR R
Z R
Z 
= dy dxp(y)q(x)f (x + y) = dy dzp(y)q(z − y)f (z) = dz dyp(y)q(z − y) f (z)
R R R R R R
where the last equality is from Fubini since (R, L, λ) is σ-finite. Therefore,
Z 
π(dx) = λ(dx) dyp(y)q(x − y)
R
which is desired result.

(c) If µ and ν is carried by R+ , p and q vanish outside of R+ almost everywhere. Otherwise, there is a set of nonzero measure A
outside of R+ and
Z Z
0< dxp(x) = µ(dx) = 0.
A A∩R+
which is a contradiction. This result is valid for ν without loss of generality. Thus, p and q vanish outside of R+ almost
everywhere. If we want to make them vanishing point wisely outside of the half real line, we can just define it 0 outside of the
half real line since change of function in the set of measure does not change the result.

Then observe that, by the change of variables z = x + y, for any positive measurable function f ,
Z Z Z Z
π(f ) = µ ∗ ν(f ) = E(f (X + Y )) = µ(dy) ν(dx)f (x + y) = dyp(y) dxq(x)f (x + y)
R+ R+ R+ R+
Z ∞ Z ∞ Z ∞ Z ∞ Z ∞ Z z 
= dy dxp(y)q(x)f (x + y) = dy dzp(y)q(z − y)f (z) = dz dyp(y)q(z − y) f (z)
0 0 0 y 0 0

where the last equality is from Fubini since (R, L, λ) is σ-finite. Therefore,
Z x 
π(dx) = λ(dx) dyp(y)q(x − y)
0
which is desired result.
(For the change of domain, draw the line y = z in yz-plane and note the change of order of domain of integration as we change
the integral.)
5.24
Note that X is indicator function, so X = 1H for some H ∈ H. Then, since we have
Z Z Z
p = P{X = 1} = 1{X=1} dP = 1H dP = XdP = P(X) = EX,
Ω Ω Ω
for any n, observe that
Z Z
n n
EX = (1H ) dP = 1H dP = EX = p.
Ω Ω
Also, by the result above, observe that
2
V arX = EX 2 − (EX) = p − p2 = p(1 − p) = pq.
Lastly, observe that
Z Z Z
Ez X = z 1H dP = zdP + z 0 dP = zP(H) + P(Ω \ H) = zP{X = 1} + P{X = 0} = pz + q.
Ω H Ω\H

Page 37
STAT 605 Editor : ByeongHo Ban

5.25

Firstly, observe that, since X1 and X2 are independent,


Z Z Z Z
Ez X1 +X2 = µ(dx) ν(dy)z x+y = z x µ(dx) ν(dy)z y = Ez X1 Ez X2
R R R R
where µ and ν are distributions of X1 and X2 respectively. Thus, by mathematical induction and binomial expansion, for any n,
n  
X n!
Ez Sn = Ez X1 Ez X2 · · · Ez Xn = (q + pz)n = pk q n−k z k
k!(n − k)!
k=0
since their success probabilities are identical. It implies that the distribution of Sn would be counting measure with mass
n! k n−k
k!(n−k)! p q at k for k = 0, 1, . . . , n. Therefore, observe that
n!
P{Sn = k} = pk q n−k
k!(n − k)!
for k = 0, 1, . . . , n as we desired.
5.26

Note that Tk takes value in N∗ with discrete σ-algebra. Since the collection of An = {k : k ≤ n} for any n ∈ N∗ generates the
σ-algebra, showing Tk−1 (An ) for arbitrary n is measurable is enough to show Tk is measurable for any k. Observe that
Tk−1 (An ) = {ω : Tk (ω) ≤ n} ⊆ {ω : Sn (ω) ≥ k} = {ω : n ≥ Sn (ω) ≥ k}
since n ∈ {m ≥ 1 : Sn (ω) ≥ k} if Tk (ω) ≤ n. The last equality is since Sn ≤ n is always true. Now, assume that ω 6∈ Tk−1 An so
Tk (ω) > n. If Tk (ω) > n, then clearly ω 6∈ {ω : Sn (ω) ≥ k}. Therefore, we have
Tk−1 (An ) = Sn−1 (N ∩ [k, n])
which is measurable. Therefore, Tk is measurable so is random variable since positivity is obvious.

From the observation above and previous exercise 5.25, we have


n
X n!
P{Tk ≤ n} = P{Sn ≥ k} = pj q n−j .
j!(n − j)!
j=k

Now, we claim that A := Tk−1 {n} = Sn−1 {k} := B. Firstly, suppose that ω ∈ A, then Tk (ω) = n so Sn (ω) = k so ω ∈ B.
Now, suppose that ω ∈ B, then Sn (ω) = k so Tk (ω) ≤ n. But it should be equal to n, otherwise, Tk (ω) < n so Sm (ω) = k for
some m < n which implies Tk (ω) = m < n a contradiction. Then, observe that, since the collection of random variables Xi is
independent, from 5.25,
(n − 1)!
P{Tk = n} = P{Xn = 1}P{Sn−1 = k − 1} = pP{Sn−1 = k − 1} = pk q n−k
(k − 1)!(n − k)!
since if Xn (ω) = 0, then Tk (ω) < n a contradiction.

Note that Tk < ∞ is obvious since {Tk < n} & {Tk = ∞} so


k−1
X n!
P{Tk = ∞} = lim P{Tk > n} ≤ lim pj q n−j = 0
n→∞ n→∞
j=0
j!(n − j)!
n
since limn→∞ nq = 0 if 0 < q < 1. Therefore, by the definition of Tk , lim Sn = ∞ almost surely.

Page 38
STAT 605 Editor : ByeongHo Ban

5.27

Note that Wj = ij means we wait for ij seconds to success Pm from last success. In other words, Xj = 1 and X` = 0 for
` ∈ {j − 1, j − 2, . . . , j − i + 1}. For convenience, let sm = j=1 ij . Therefore, observe that
{W1 = i1 , . . . , Wk = ik } = {Xj = 1 if j = sn for some n ∈ {1, . . . , k} and, otherwise,Xj = 0. for , 1 ≤ j ≤ sk .}
Therefore, since grouping does not kill the independency and (Xi ) is independency, observe that
P{W1 = i1 , . . . , Wk = ik } = P {Xj = 1 if j = sn for some n ∈ {1, . . . , k} and, otherwise,Xj = 0. for , 1 ≤ j ≤ sk .}
= P{X1 = 0, . . . , Xs1 = 1}P{Xs1 +1 = 0 . . . , Xi2 = 1} · · · P{Xsk−1 +1 = 0, . . . , Xsk = 1}
= P{W1 = i1 }P{W2 = i2 } · · · P{Wk = ik }
which implies that {W1 , . . . , Wk } is independency. Also, we have
P{Wk = i} = P{Xk+1 = 0, . . . , Xk+i = 1} = P{Xk+1 = 0} · · · P{Xk+i−1 = 0}P{Xk+i = 1} = q i−1 p.
Now, observe that
∞ ∞ ∞
X X ∂ X i ∂ q 1 1
EWk = iP{Wk = i} = p iq i−1 = p q =p =p =
i=1 i=1
∂q i=1 ∂q 1 − q (1 − q)2 p
and that
∞ ∞ ∞ ∞
X X ∂ X i ∂ ∂ X i ∂ ∂ q ∂ q 1 − q2 1+q
EWk2 = i2 P{Wk = i} = i2 pq i−1 = p iq = p q q =p q =p 2
= p 4
=
i=1 i=1
∂q i=1 ∂q ∂q i=1 ∂q ∂q 1 − q ∂q (1 − q) (1 − q) p2
Therefore,
1+q 1 q 1−p
V arWk = EWk2 − (EWk )2 = − 2 = 2 = .
p2 p p p2
Observe that
" k
# " k
# k
X X X k
ETk = E (Tk − Tk−1 ) = E Wn = EWn = .
n=1 n=1 n=1
p
Also, recalling that Cov is bilinear, observe that
k k
! k X
k k k
X X X X X k(1 − p)
V arTk = Cov(Tk , Tk ) = Cov Wn , Wn = Cov(Wi , Wj ) = Cov(Wn , Wn ) = V arWn =
n=1 n=1 i=1 j=1 n=1 n=1
p2
since {W1 , . . . , Wk } is independency.
5.28 P
Note that, in order to make Sn (x) = k(x) for all x ∈ D, it should be x∈D k(x) = n. For convenience, let’s say D = {a1 , . . . , am }.
And, since {Sn (ai )} is independency,
P{Sn (a1 ) = k(a1 ), . . . , Sn (am ) = k(am )} = P{Sn−k(am ) (a1 ) = k(a1 ), . . . , Sn−k(am ) (am−1 ) = k(am−1 )}P{Sn (am ) = k(am )}
n!
= P{Sn−k(am ) (a1 ) = k(a1 ), . . . , Sn−k(am ) (am−1 ) = k(am−1 )} p(am )k(am )
k(am )!(n − k(am ))!
n!
= P{Sn−k(am )−k(am−1 ) (a1 ) = k(a1 ), . . . , Sn−k(am )−k(am−1 ) (am−2 )}P{Sn−k(am ) (am−1 ) = k(am−1 )} p(am )k(am )
k(am )!(n − k(am ))!
(n − k(am ))! (n)!
= P{Sn−k(am )−k(am−1 ) (a1 ) = k(a1 ), . . . , Sn−k(am )−k(am−1 ) (am−2 )} p(am−1 )k(am−1 ) p(am )k(am
k(am−1 )!(n − k(am−1 ))! k(am )!(n − k(am ))!
n!
= P{Sn−k(am )−k(am−1 ) (a1 ) = k(a1 ), . . . , Sn−k(am )−k(am−1 ) (am−2 )} p(am−1 )k(am−1 ) p(am )k(am )
k(am )!k(am−1 )!(n − k(am−1 ))!
= ···
n!
= p(a1 )k(a1 ) · p(am )k(am )
k(a1 )! · · · k(am )!
It makes sense since the left hand side is same with the probability choosing k(a) of Xi ’s, . . . , k(d) of Xi ’s from n collection of
identically distributed random variables. Therefore, we are done.

Page 39
STAT 605 Editor : ByeongHo Ban

5.29
So we define
n
X
Sn (ω, A) = 1A ◦ Xi (ω), n ∈ N, ω ∈ Ω, A ∈ E.
i=1
This problem is just similar problem with previous problem. So similarly, since (Sn (Ai ))m i=1 is independency,
n!
P{Sn (A1 ) = k1 , . . . , Sn (Am ) = km } = P{Xi ∈ A1 }k1 · P{Xm ∈ Am }km
k1 ! · · · km !
n!
= µ(A1 )k1 · · · µ(Am )km
k1 ! · · · km !
for every measurable partition (A1 , . . . , Am ) of E and integers k1 , . . . , km ≥ 0 summing up to n.
5.32

Let a finite subset K ⊂ I and binary numbers bi for i ∈ K be given. For convenience, let K = {1, 2, . . . , m} and let A = {i ∈ K :
bi = 1} = {1, . . . , n} with n ≤ m. And, for some finite subset J ⊂ I. observe that
Y Y Z Y
E= Xi = P Xi = dP Xi = P{Xi = 1, i ∈ J}
i∈J i∈J i∈J
since Xi is 0 or 1 and the product is zero if at least one of them is zero. Therefore, we need to know when some of bi = 0 so
n < m.
Supposing that we know the expectations, observe that
P{Xi = bi , i ∈ A, Xn+1 = 0} = P{Xi = bi , i ∈ A} − P{Xi = bi , i ∈ A, Xn+1 = 1}
where we know all values from right hand side since we know the expectations. Also, similarly,
P{Xi = 1, i ∈ A, Xn+1 = 0, Xn+2 = 0} = P{Xi = bi , i ∈ A, Xn+1 = 0} − P{Xi = bi , i ∈ A, Xn+1 = 0, Xn+2 = 1}.
By keep doing it, by the mathematical induction, at the end we get the value of P{Xi = bi , i ∈ K}. Therefore, knowing the
expectation is enough to compute the probabilities.
5.33
Recall that, if X is independency, then, for any finite set J ⊆ I,
Y Y
E fi ◦ Xi = Efi ◦ Xi
i∈J i∈J
for any numerical measurable functions fi . Thus, by setting each fi as identity functions, we get
Y Y
E Xi = EXi .
i∈J i∈J
Now, suppose that, for any finite subset J ∈ I, we have
Y Y
E Xi = EXi .
i∈J i∈J
But, it implies that the joint distribution of Xi ’s is just a product of marginals of each of Xi . Therefore, X is independency.

Page 40
STAT 605 Editor : ByeongHo Ban

Due date : November 15th, 2019 Byeongho Ban

2.15

(a)-1 ω ∈ lim inf Hn ⇐⇒ ∃m such that ω ∈ Hn ∀n ≥ m.


S T T
Suppose that ω ∈ lim inf Hn , then, by the definition, ω ∈ m n≥m Hn which means ω ∈ n≥m Hn for some m. It implies that
ω ∈ Hn for all n ≥ m for some m. T
Conversely, suppose that ∃m such that ω ∈ Hn for all n ≥ m. It implies that ω ∈ n≥m Hn . Thus, naturally,
S T
ω ∈ m n≥m Hn = lim inf Hn .

(a)-2 ω ∈ lim sup Hn ⇐⇒ ω ∈ Hn for infinitely many n.


Observe that
\ [ [
ω ∈ lim sup Hn = Hn ⇐⇒ ω ∈ Hn ∀m
m n≥m n≥m

By replacing the first statement with the new statement, I willSuse the contrapositive to prove this equivalent.
If ω ∈ Hn finitely many, say M = max{n : ω ∈ Hn } then ω 6∈ n≥M +1 Hn .
S
Conversely, if ω 6∈ n≥m Hn for some m, then max{n : ω ∈ Hn } ≤ m so ω ∈ Hn for finitely many n.

(b)-1 1lim inf Hn = lim inf 1Hn


Note that both of them has only 1 or 0 as their output. Thus, it suffices to show that LHS is 1 if and only if RHS is 1 at ω.
Observe that LHS is 1 at ω if and only if ω ∈ lim inf Hn if and only if, by (a), ∃m such that ω ∈ Hn ∀n ≥ m if and only if
1Hn (ω) = 1 ∀n ≥ m if and only if inf j≤n 1Hn (ω) = 1 for all j if and only if lim inf 1Hn (ω) = 1. Therefore, RHS=LHS .

(b)-2 1lim sup Hn = lim sup 1Hn .


We will show this with similar way of (b)-1.
Observe that 1lim sup Hn (ω) = 1 if and only if ω ∈ lim sup Hn if and only if, by (a), ω ∈ Hn for infinitely many n if and only if
1Hn for infinitely many n if and only if supn≥j 1Hn (ω) = 1 for all j if and only if lim sup 1Hn (ω) = 1. Therefore, LHS=RHS.

(c)
Observe that
ω ∈ lim sup Hn ⇐⇒ ω ∈ Hn for infinitely many n ∵ (a)
⇐⇒ 1Hn (ω) = 1 for infinitely many n
X
⇐⇒ 1Hn (ω) = ∞
n
( )
X
⇐⇒ ω ∈ 1Hn = ∞
n
and observe that
ω ∈ lim inf Hn ⇐⇒ ω ∈ Hn ∀n ≥ m for some m
⇐⇒ 1Hn (ω) = 1 ∀n ≥ m for some m
⇐⇒ (1 − 1Hn )(ω) = 0 ∀n ≥ m for some m
⇐⇒ (1 − 1Hn )(ω) = 1 ∀n < m for some m
X
⇐⇒ (1 − 1Hn )(ω) ≤ m < ∞
n
( )
X
⇐⇒ ω ∈ (1 − 1Hn ) < ∞
n
So we are done.

Page 41
STAT 605 Editor : ByeongHo Ban

2. Let Xi be independent random variable with P {Xi = 1} = p and P {Xi = 0} = 1 − p and


cn (ω) = X1 (ω) + · · · Xn (ω) .
X
n
(a) Show that Xbn converges to p in L2 .
bn − p) = 1 Pn Yi where Yi = Xi − p satisfies E[Yi ] = 0.
Hint. Write (X n i=1

(b) Show that X


bn converges to p in probability.
Hint. Bound P {|X
bn − p| ≥ } using Chebyshev inequality.

bn2 = 12 (X1 + · · · + Xn2 ) converges to p almost surely.


(c) Show that X n
Hint. Use Borel-Cantelli Lemma with Hn = {|X bn2 − p| ≥ }.

(d) Consider the set A = {ω; Xj (ω) = 1 for finitely many j}. Show that if ω ∈ A then X
bn (ω) does not converge to p but show
that P (A) = 0.
Hint. Consider the set An = {ω : Xn = 1} and apply Borel-Cantelli lemma.
Proof. (ByeongHo Ban)

(a) Following the hint, observe that


     
n 2 n 2 n n
 
bn − p|2 ] = E 
X Xi − p 1 X
 = 1
X
2
 1 X σ
E[|X  =
2
E  (Xi − p) 2
E[|Xi − p| ] = 2
σ=
i=1
n n i=1
n i=1 n i=1 n

where V ar(Xi ) = σ and the third inequality is because E[Yi ] = 0 and E[(Xi − p)(Xj − p)] = E[(Xi − p)]E[(Xj − p)] = 0 if
i 6= j due to independency of the random variables. Note that, by taking limit as n → ∞ on both sides, we can show that
limn→∞ E[|X bn − p|2 ] = 0 so X
bn converges to p in L2 .

(b)
Let  > 0 be given. Observe that, by using Chebyshev,
2
bn − p| > } ≤ E[|Xn − p] ]
b
P {|X
2
Then by (a), RHS converges to 0, so LHS should converges to 0 since probability measure is positive measure.

(c)
Following the notation in the hint, letting  > 0 be given, observe that
X X E[|X bn2 − p]2 ] X σ σ X 1
P (Hn ) ≤ ≤ = <∞
n n
2 n
2 n 2 2 n n 2

by Chebyshev and a part of the proof in (a). Therefore, by Borel Cantelli, X


bn2 converges to p almost surely.

(d)
Suppose that ω ∈ A. Then there are only finitely many n, let’s say N , such that Xn (ω) = 1. Therefore, there exists M > 0 such
bn (ω) = N for all n ≥ M . Thus, by taking limit, we get X
that X bn (ω) → 0 which means that it does not converge to p.
n
Taking the notation from the hint, note that, by the previous problem,
     
\ [ [ \ ∞
Y
P (A) = P (lim sup An ) = P  An  = lim P  An  = 1 − lim P  Acn  = 1 − lim (1 − p)
m→∞ m→∞ m→∞
m n≥m n≥m n≥m i=m
n−m
=1− lim (1 − p) = 1 − 1 = 0.
n,m→∞

Page 42
STAT 605 Editor : ByeongHo Ban

3. A Cauchy random variable X with parameter β has distribution


1 1
µ(dx) =
βπ 1 + βx22
(a) Show that E[X] does not exist.

(b) One can show that with a complex analysis argument that if β = 1, then X has characteristic function (Fourier transform)
f (r) = E[eirX ] = e−|r|
Use this fact (and prove it if you want!) to show that if Xi are independent Cauchy random variables with parameter β then
bn = X1 +···+Xn is a Cauchy random variable.
X n

(c) Show that there exists number γ such that limn→∞ X


bn = γ in probability. Deduce from this there is no number γ such that
limn→∞ Xn = γ almost surely.
b

(d) Does X
bn converge in distribution?

Proof. (ByeongHo Ban)

(a) Observe that, if it exists,


Z Z t
1 x 1 x
E[X] = dx = lim dx = 0
R βπ 1 + x2 /β 2 t→∞ −t βπ 1 + x2 /β 2
since the integrand is odd function and the range is symmetric. However, observe that
Z Z 2t Z 2t
1 x 1 x 1 x
E[X] = 2 2
dx = lim 2 2
dx = lim dx > 0
R βπ 1 + x /β t→∞ −t βπ 1 + x /β t→∞ t βπ 1 + x2 /β 2
since the integrand is strictly positive in right half real line. These observation makes a contradiction. Therefore, E[X] does not
exist.

(b) By using the hint, firstly, observe that, with the change of variable y = βx ,
eirx 1 eiβry
Z Z
1
E[eirX ] = 2 2
dx = 2
dy = e−|βr| .
R βπ 1 + x /β R π1+y

Since Xi are independent, letting the distribution of X bn be νn , observe that


Z n Z n r Y n
x1 +···+xn Y irxi Y |βr| Pn 1
νbn = E[eirXn ] = eir µ(dx1 ) . . . µ(dxn ) = e n µ(dxi ) = µ = e− n = e−|βr| i=1 = e−|βr| .
b
n b n

Rn i=1 R i
n i=1

Therefore, X
bn is a Cauchy random variable.

(c)
[I guess he meant there is no such γ.]
If it does converges, then there is a subsequence (Xnk ) which converges to γ almost surely. Then E[Xn ] = γ which contradicts to
(a). If Xbn → γ a.s, then every then X bn → γ in probability as well which contradicts to our previous argument. Therefore, Xbn
does not converge almost surely to any number.

(d)
Yes, it converges to a Cauchy random variable in distribution. Since the density function of all X
bn are same by (b), it converges
to same density function. Then by problem 6 in this homework set, Xn converges to a Cauchy random variable in distribution.
b


Page 43
STAT 605 Editor : ByeongHo Ban

4. Exercise 4.13 p.109


p ∈ [1, ∞); Show that the following are equivalent for every sequence (Xn ):
(a) The sequence converges in Lp
(b) The sequence is Cauchy in Lp , that is, E|Xm − Xn |p → 0 as m, n → ∞.
(c) The sequence converges in probability and (Xnp ) is uniformly integrable.
Hint. Follow the proof of the basic theorem and use the generalization |x + y|p ≤ 2p−1 (|x|p + |y|p ) of the triangular inequality.
Proof. (ByeongHo Ban)
We will show that (a) =⇒ (b) =⇒ (c) =⇒ (a).
Suppose (a) and let X be the limit in Lp . Then observe that
E|Xm − Xn |p ≤ 2p−1 (E|Xm − X| + E|Xn − X|) → 0
as m, n → ∞. Thus, (b) holds.

Suppose (b). For given  > 0, by Chebyshev,


E|Xm − Xn |p
P {|Xm − Xn | > } ≤ →0
p
as m, n → ∞. Thus, theorem 3.9 applies, and the sequence converges in probability. To show that the sequence is uniformly
integrable we use the  − δ characterization of Theorem II.3.14: Fix  > 0. Since the sequence is Cauchy in Lp , there exists an
integer k = k() such that E|Xm − Xn |p ≤ /2p−1 for all m, n ≥ k. Thus, for every event H,
E|Xn |p 1H ≤ 2p−1 E|Xn − Xk |p 1H + 2p−1 E|Xk |p 1H ≤  + 2p−1 E|Xk |p 1H
for all n ≥ k, and consequently,
sup E|Xn |p 1H ≤  + 2p−1 sup E|Xn |p 1H .
n n≤k

On the right side, the finite collection {X1 , . . . , Xk } is uniformly integrable since the Xn are integrable; see Remark II.3.13.
Hence, by TheoremII.3.14, there exists δ > 0 such that P (H) ≤ δ implies that the supremum on the left side is bounded by 2p .
Finally, taking H = Ω, we see that sup E|Xn | < ∞. Thus, the sequence is uniformly integrable and the implication (b) =⇒ (c)
is proved.

Assume (c). Let X be the limit. By Theorem 3.3, then, there is a subsequence (Xn0 ) that converges to X almost surely, and
Fatou’s lemma yields
E|Xn |p = E lim inf |Xn0 |p ≤ lim inf E|Xn0 |p ≤ sup E|Xn |p .
n
The supremum is finite by the assumed uniform integrability. Hence X is in Lp . To show that Xn → X in Lp , fix  > 0 and let
Hn = {|Xn − X| > }. Now, obviously,
E|Xn − X|p ≤ p + E|Xn − X|p 1Hn .
Since |X|p is integrable and (Xnp ) is uniformly so, |Xn − X|p is uniformly integrable. Thus, there is δ > 0 such that P (Hn ) ≤ δ
implies that the expectation on the right is at most p and E|Xn − X|p ≤ 2p . Since P (Hn ) → 0, by the assumed convergence in
probability, this completes the proof that the sequence converges in Lp .


Page 44
STAT 605 Editor : ByeongHo Ban

5. Suppose Xn is a sequence of random variable such that supn E[(Xn )2 ] < ∞. Show that the sequence µn of the distribution of
Xn is tight.
Hint. Use Chebyshev.
Proof. (ByeongHo Ban)

Observe that, by Chebyshev, for given ε, eta > 0,


E|Xn |2 supn E|Xn |2
1 − µn [−η, η] = µn ((−∞, η) ∩ (η, ∞)) = P {|Xn | > η} ≤ 2
≤ ∀n.
η η2
Now, by increasing η enough, since supn E|Xn |2 < ∞, we can make the right most term < . Then we get 1 − µn (−η, η) <  where
[−η, η] is compact. Since the inequality is independent on n, we get µn [−η, η] > 1 − ε for all n which implies that µn is tight.


Page 45
STAT 605 Editor : ByeongHo Ban

6. In this problem we consider random variables Xn and X with densities fn and f and we assume that fn converges pointwise
to f (x). Show then that Xn converges in distribution to X.
Hint. Given a bounded and continuous functions h with a = sup |h(x)| consider the nonnegative functions h1 = h + a and
h2 = a − h. Apply now Fatou’s lemma to both sequence fn h1 and fn h2 .
Proof. (ByeongHo Ban)

Let a bounded and continuous function h be given. The observe that, letting a = sup |h(x)|, define nonnegative functions h1 : h+a
and h2 := a − h. Then observe that
h2 = (a − h) ≤ a ≤ (h + a) = h1
Now, observe that, by the Fatou’s lemma,
Z Z Z Z
a − E[h ◦ X] = E[h2 (X)] = h2 f dx ≤ f (a − h)dx = lim inf fn (a − h)dx ≤ lim inf fn (a − h)dx
 Z 
= lim inf a µn (dx) − E[h ◦ Xn ] = lim inf(a − E[h ◦ Xn ]) = a − lim sup E[h ◦ Xn ].

Thus, since a < ∞, E[h ◦ X] ≥ lim sup E[h ◦ Xn ].


Conversely, observe that, by Fatou’s lemma,
Z Z Z Z Z
E[h ◦ X] + a = hf dx + aµ(dx) = (h + a)f dx = lim inf(h + a)fn dx ≤ lim inf (h + a)fn dx
Z Z 
= lim inf hfn dx + aµn (dx) = lim inf [E[h ◦ Xn ] − a] = lim inf E[h ◦ Xn ] − a.

Thus, similarly, we have E[h ◦ X] ≤ lim inf E[h ◦ Xn ]. Therefore,


lim sup E[h ◦ Xn ] ≤ E[h ◦ X] ≤ lim inf E[h ◦ Xn ].
Since it is always true that lim inf E[h ◦ Xn ] ≤ lim sup E[h ◦ Xn ], all the inequalities above get equality. Therefore,
lim E[h ◦ Xn ]
n→∞
which implies that Xn converges to X in distribution.


Page 46
STAT 605 Editor : ByeongHo Ban

7.
(a) Show that if Xn is binomial with parameters n and pn and npn → λ then Xn converges in distribution to a Poisson random
variable with parameter λ.

(b) Suppose that Xn are normal random variable with mean µn and variance σn2 . Suppose that µn converges to µ and σn2 converges
to σ 2 ≥ 0. Show that Xn converges to a normal random variable with mean µ and variance σ 2 .
Proof. (ByeongHo Ban)

(a)
Let µn be the distribution of Xn and ν be the Poission distribution. Note that, by previous problem, it suffices to prove that
λk
lim µn {k} = e−λ
n→∞ k!
k
for every k. (here, µn ({k}) and e−λ λk!
are the density functions.) Let k be given and observe that
n
n! λk n!  np k 
n −npn
µn ({k}) = k
p (1 − pn ) n−k
= 1+ (1 − pn )−k .
k!(n − k)! n k! (n − k)!nk λ n
Note that pn → 0 since npn → λ. Thus, observe that
 n
n!  np k
n −npn
lim = 1, lim = 1, lim 1 + = e−λ , lim (1 − pn )−k = 1
n→∞ (n − k)!nk n→∞ λ n→∞ n n→∞

since npn → λ. Therefore,


λk −λ
lim µn ({k}) = e
n→∞ k!
so we are done.

(b)
[I guess, he meant Xn converges to a normal random variable with mean µ and variance σ 2 in distribution.]
Again, by previous problem, we only need to check if the density functions converges to the density function, that is
1 2 2 1 2 2
lim √ e−(x−µn ) /2σn = √ e−(x−µ) /2σ
n→∞ 2πσn 2πσ
(x−µn )2 (x−µ)2
but it is true since the exponential function is continuous and limn→∞ 2
2σn = 2σ 2 . Therefore, the problem 6, Xn converges
to the normal distribution with mean µ and variance σ.


Page 47
STAT 605 Editor : ByeongHo Ban

8. Suppose Xn is uniform on [−n, n]. In which sense does Xn converge to a random variable X?
Proof. (ByeongHo Ban)

Note that the density function fn for Xn would be


(
1
2n x ∈ [−n, n]
f (x) =
0 Otherwise.
Since this function converges to 0 as n → ∞, by problem 6, Xn → 0 in distribution.


Page 48
STAT 605 Editor : ByeongHo Ban

Due date : December 2nd, 2019 Byeongho Ban

1. Let (Yn ) be i.i.d. random variables with Yn ≥ a > 0 almost surely. Show that
N
! N1
Y
YN = Yn
n=1
converges to a certain constant α almost surely.
Proof. (ByeongHo Ban)

Observe that
PN
n=1 ln Yn
ln YN =
N
and it is well defined a.s. because Yn ≥ a > 0 a.s. Then by the strong law of large numbers, we have
lim ln YN = lim ln YN = α a.s.
N →∞ N →∞
where α = E[Yn ] for all n. Therefore, since ln is continuous function, YN → eα a.s.


Page 49
STAT 605 Editor : ByeongHo Ban

2. Let (Xn ) be i.i.d. random variables with E[Xn ] = µ and let (Yn ) be i.i.d. random variables with E[Yn ] = ν 6= 0. Show that
almost surely
PN
Xn µ
lim Pn=1N
=
N →∞ Y
n=1 n
ν

Proof. (ByeongHo Ban)

By the strong law of large numbers, we have


lim XN = µ
N →∞
and
lim YN = ν 6= 0.
N →∞
almost surely, that is, the two statements hold except in the sets Ω1 and Ω2 of measure zero respectively. Then observe that
PN
n=1 Xn
PN
n=1 Xn XN limN →∞ XN µ
lim PN = lim PNN = lim = =
N →∞
n=1 Yn
N →∞ n=1 Yn N →∞ YN lim Y
N →∞ N ν
N
excepts the set of Ω1 ∪ Ω@ which has measure zero(P(Ω1 ∪ Ω2 ) ≤ P(Ω1 ) + P(Ω2 ) = 0), thus it is true almost surely.

2
3. Let (Xn ) be i.i.d. random variables with E[Xn ] = µ and V ar[Xn ] = σ . Show that
N N
!2
1 X 1 X
lim Xn − Xm = σ2
N →∞ N N
n=1 m=1
almost surely.
Proof. (ByeongHo Ban)

By the strong laws of large numbers, we know


N
1 X
lim Xm = µ a.s.
N →∞ N
m=1
Thus, observe that, again, by the strong laws of large numbers,
N N
!2 N
1 X 1 X 1 X 2
lim Xn − Xm = lim (Xn − µ) = E[(Xn − µ)2 ] = V ar(Xn ) = σ 2 a.s.
N →∞ N N N →∞ N
n=1 m=1 n=1

4. Let (Xn ) be i.i.d. random variables with E[|Xn |] < ∞. Show that if E[Xn ] > 0 (or E[Xn ] < 0) then
N
X
lim SN = lim Xn = ∞
N →∞ N →∞
n=1
almost surely.
Proof. (ByeongHo Ban)

Note that, since E[|Xn |] < ∞, it is clear that E[Xn ] < ∞. Suppose that E[Xn ] > 0. Observe that, by the strong laws of
large numbers, XN → E[Xn ] > 0 almost surely, that is P{XN → E[Xn ]} = 1. Then clearly we have SN (ω) = N XN (ω) → ∞,
otherwise, since we can do division operation between finite limits, N does not diverge which is a contradiction. Therefore,
{XN → E[Xn ]} ⊆ {SN → ∞} so P{SN → ∞} = 1 which means SN → ∞ almost surely.

If E[Xn ] < 0, then SN → −∞ almost surely.




Page 50
STAT 605 Editor : ByeongHo Ban

5. Suppose (Xn ) are i.i.d. random variables with density 12 e−|x| , −∞ < x < ∞. Show that
PN
√ Xn
lim N Pn=1 N
=Z
N →∞ 2
n=1 Xn
in distribution, where Z is normal with mean 0 and variance σ 2 equal to?
Proof. (ByeongHo Ban)

Observe that
Z ∞
1
E[Xn ] = x e−|x| dx = 0
−∞ 2
since the integrand is odd function and the domain in symmetric. Now, observe that
Z ∞ Z ∞
1
V ar(Xn ) = E[Xn2 ] = x2 e−|x| dx = x2 e−x dx = Γ(3) = 2! = 2
−∞ 2 0
since the integrand is even function with symmetric domain and where Γ is gamma function. Thus, by central limit theorem and
Slutsky’s theorem,
PN PN
n=1 Xn
√ n=1 Xn

√ = 2 √ −→ 2Z in distribution.
N 2N
where Z is the standard normal random variable.
And observe that, by the strong laws of large numbers,
PN 2
n=1 Xn
−→ E[Xn2 ] = 2 a.s.
N
by the calculation above. Since almost sure convergence implies convergence in probability, again, by the Slutsky’s theorem,
observe that
PN
PN
Xn √
√ n=1

Z
n=1 Xn 2Z
N PN = PN N 2 −→ = √ := Z.
n=1 Xn
2 n=1 Xn 2 2
N
Then observe that
E[Z] 0
E[Z] = √ = √ = 0
2 2
and that
V ar(Z) 1
V ar(Z) = = .
2 2
1
Thus, σ 2 = 2 and we are done.

2
PN
6. Let (Xn ) be nonnegative i.i.d random variables with E[Xn ] = 1 and V ar[Xn ] = σ < ∞ and SN = n=1 Xn . Show that
2 p √ 
SN − N
σ
converges in distribution to a standard normal random variable.
Hint:
√ √
SN − N SN + N p √ 
√ = √ SN − N .
N N

Page 51
STAT 605 Editor : ByeongHo Ban

Proof. (ByeongHo Ban)

Let Z be the standard normal random variable. Then, observe that, by the central limit theorem,
SN − N
√ →Z in distribution.
N σ2
And also, observe that, by the strong laws of large numbers,
√ √ "r # q 
SN + N SN
√ = +1 = XN + 1 −→ E[Xn ] + 1 = 2 almost surely.
N N
Since almost sure convergence implies convergence in probability, it converges to a constant 2 in probability.
Then, observe that, by Slutsky’s theorem,

2 p √ SN − N 2 N 2
( SN − N ) = √ √ √ −→ Z = Z in distribution.
σ 2
σ N SN + N 2
Therefore, we are done.

7.
(a) Suppose (Xn ) are i.i.d Poisson random variables with parameters λ = 1 and SN = X1 + · · · + XN . Show that limN →∞ √−N
SN
N
converges in distribution to a standard normal random variable.
(b) Show that
N
X Nk 1
lim e−N =
N →∞ k! 2
k=0

Proof. (ByeongHo Ban)

(a) Note that the Poisson distribution with parameter 1 has mean 1 and variance 1, then, by the central limit theorem,
SN − N
√ −→ Z in distribution
N
where Z is the standard normal random variable.

n −n
S√
(b) Let µn be the distribution of n
. Observe that
X 1 Nk
P{SN = k} = e−N = e−N
k1 ! · · · kN ! k!
k1 +···+kN =k

by the identity from the multinomial theorem


X k!
N k = (1 + · · · + 1)k = .
k1 ! · · · kN !
k1 +···+kN =k

Therefore, we have
N
X Nk
P(0 ≤ SN ≤ N ) = e−N
k!
k=0
but also observe that

   
SN − N SN − N
P(0 ≤ SN ≤ N ) = P − N ≤ √ ≤ 0 = P −∞ < √ ≤ 0 = µN (−∞, 0]
N N
since SN cannot be negative due to positivity of Poisson random variable. Note that (−∞, 0] is Borel set and µ{0} = 0 where µ
is the distribution of Z, a standard normal random variable. Therefore, since SN√−N
N
converges to Z in distribution,
N 0 ∞
Nk
Z Z
−N
X 1 x2 1 1 x2 1
lim e = lim µN (−∞, 0] = µ(−∞, 0] = √ e− 2 dx = √ e− 2 dx = .
N →∞ k! N →∞ −∞ 2π 2 −∞ 2π 2
k=0


Page 52
STAT 605 Editor : ByeongHo Ban

8.
(a) Construct a sequence of independent random variable Xn such that Xn converge to 1 in probability but E[Xn2 ] ≥ n diverges.
Hint : Take Xn such that P (Xn = n) = n1 and P (Xn = 1) = 1 − n1 .

(b) Let Zn = Y Xn where Y is a standard normal random variable independent of Xn . Prove that E[Zn ] = 0 and
limn→∞ V ar[Zn ] = +∞ but
lim Zn = Z
n→∞
in distribution where Z is standard normal.
Hint : Slutsky Theorem.
Proof. (ByeongHo Ban)

(a) Following the hint, let’s define Xn as having distribution with P (Xn = n) = n1 and P (Xn = 1) = 1 − n1 . Now, let  > 0 be
given and observe that
1
P (|Xn − 1| > ) ≤ P (Xn = n) = −→ 0
n
since if ω ∈ {|Xn − 1| > }, then Xn (ω) 6= 1. Therefore, Xn converges in probability to 1.
Now, observe that
 
2 21 2 1 1
E[Xn ] = n +1 1− = n − ≥ n → ∞.
n n n
Thus, the (Xn ) we constructed satisfies the conditions.

(b) Since Xn −→ 1 in probability and Yn −→ Y in distribution where Yn = Y for all n, by Slutsky’s theorem, Zn = Y Xn =
Xn Yn −→ Y . Since Y is the standard normal, Y is the Z we were looking for.
Observe that, since Xn and Y are independent,
E[Zn ] = E[Xn Y ] = E[Xn ]E[Y ] = 0
since E[Y ] = 0. Then observe that

lim V ar[Zn ] = lim V ar[Xn Y ] = lim E[Xn2 Y 2 ] = E[Y 2 ] lim E[Xn2 ] ≥ lim n = ∞
n→∞ n→∞ n→∞ n→∞ n→∞
since E[Y ] = 1 and by (a). Then we are done.


Page 53
STAT 605 Editor : ByeongHo Ban

Due date : December 9th, 2019 Byeongho Ban

1. Suppose Y ∈ L1 (Ω, H, P) or Y ∈ L+ (Ω, H, P) and that G ⊂ H is a sub σ-algebra. Show the following properties of conditional
expectation.

(a) Show that |E[Y |G]| ≤ E[|Y ||G].


(b) If K ⊂ G is a sub σ- algebra, show that
E[E[Y |G]|K] = E[Y |K]
(c) Show that E[Y |Y ] = Y a.s.
(d) Show that if Y ≤ c a.s. then E[Y |G] ≤ c a.s.
(e) Show that if Y = α a.s. with α a constant then E[Y |G] = α a.s.
(f) If Y is positive, show that {E[Y |G] = 0} ⊂ {Y = 0} and {Y = ∞} ⊂ {E[Y |G] = +∞}
Proof. (ByeongHo Ban)

(a) Observe that, since conditional expectation is linear and E[Y |G] ≥ 0 if Y ≥ 0,
|E[Y |G]| = |E[Y + |G] − E[Y − |G]| ≤ |E[Y + |G]| + |E[Y − |G]| = E[Y + |G] + E[Y − |G] = E[Y + + Y − |G] = E|Y |G|.
(b) Let ΠG be a projection onto G such that ΠG Y = E[Y |G] and ΠK be a projection onto K such that ΠK Y = E[Y |K]. Then
observe that
E[E[Y |G]|K] = ΠK (ΠG (Y )) = (ΠK ΠG )Y = ΠK Y = E[Y |K]
since K ⊂ G so since ΠK ΠG = ΠK .

(c) Note that Y is measurable with respect to σ(Y ) so E[Y |Y ] = E[E|Y |σ(Y )] = Y a.s. is trivial.

(d) Suppose that Y ≤ c a.s and assume that A := {E[Y |G] > c} has positive measure. Then observe that
cE[1A ] < E[E[Y |G]1A ] = E[E[Y 1A |G]] = E[Y 1A ] ≤ cE[1A ]
which give us c < c, a contradiction. Therefore, A should have zero measure and so E[Y |G] ≤ c a.s.
(e) Note that α is measurable with respect to G. Then observe that
αE[Z] = E[Y Z] = E[E[Y |G]Z]
for any G measurable function Z. Then by uniqueness, E[Y |G] = α a.s.

(f) Observe that {E[Y |G] = 0} = E[Y |G]−1 ({0}) ⊂ Y −1 ({0}) = {Y = 0} since E[Y |G] is G measurable. Or setting A = {E[Y |G] =
0}, observe that, by the definition,
0 = E[E[Y |G]1A ] = E[Y 1A ]
which implies that Y = 0 over A since Y is positive. Therefore, we proved the first inclusion.

Let B = {Y = ∞}. Then observe that


∞ = E[Y 1B ] = E[E[Y |G]1B ]
which implies that E[Y |G] = ∞ over B. Therefore, we proved the second inclusion.


Page 54
STAT 605 Editor : ByeongHo Ban

2. If Y is exponential with P (Y > t) = e−t for t > 0, compute E[Y |Y ∧ t].

Proof. (ByeongHo Ban)

We note that Y = 1Y ≥t Y + 1Y <t Y. Then observe that, for any a > 0, we have
{1Y <t Y ∈ [0, a)} = {Y ∈ [0, a ∧ t)} = {Y ∧ t ∈ [0, a ∧ t)} ∈ σ(Y ∧ t)
and it implies that 1{Y <t} Y is measurable with respect to σ(Y ∧ t). Thus, we have E(Y |Y ∧ t) = 1Y <t Y + E(1Y ≥t Y |Y ∧ t).
Now, observe that
E[Y 1{Y ≥t} |Y ∧ t] = E[1{Y ∧t=t} Y |Y ∧ t] = 1{Y ∧t=t} E[Y ]
and that
Z ∞
E[Y ] = te−t dt = 1
0
by the integration by parts. Therefore, we have
E[Y |Y ∧ t] = 1Y <t Y + 1{Y ∧t=t} .


Page 55
STAT 605 Editor : ByeongHo Ban

3. Show that if Y ∈ L2 (Ω, H, P) and a > 0 then


E[Y 2 |G]
P (|Y | > a|G) ≤
a2
where P (|Y | > a|G) = E[1{|Y |>a} |G].
Proof. (ByeongHo Ban)

Observe that, for any Z ∈ L+ (G), since 1A ∈ L+ where A := {|Y | > a},

|Y |2
  2  
E[|Y |2 |G]
   
|Y |
E[P (|Y | > a)|G)Z] = E[E[1(|Y |>a) |G]Z] = E[1{|Y |>a} Z] ≤ E Z =E E G Z =E Z .
a2 a2 a2
n 2
o
If A := E[Ya2 |G] < P {|Y | > a|G} has positive measure, then we have
E[|Y |2 |G]
 
E[P (|Y | > a|G)1A ] > E 1A
a2
which contradicts our result. Therefore, we have
E[Y 2 |G]
P (|Y | > a|G) ≤ a.s.
a2


Page 56
STAT 605 Editor : ByeongHo Ban

4. Prove that for X, Y ∈ L2 (Ω, H, P ), we have


E[XY |G]2 ≤ E[X 2 |G]E[Y 2 |G].

Proof. (ByeongHo Ban)

Observe that, for any t ∈ R, we have


E[(X + tY )2 |G] ≥ 0.
Then, by expanding, we have
E[X 2 |G] + 2tE[XY |G] + t2 E[Y 2 |G] ≥ 0
Thus, the polynomial in t above have one solution or zero solution, which means our discriminant of the polynomial satisfies
4E[XY |G]2 − 4E[X 2 |G]E[Y 2 |G] ≤ 0.
Therefore, we have
E[XY |G]2 ≤ E[X 2 |G]E[Y 2 |G]
which has been desired.


Page 57
STAT 605 Editor : ByeongHo Ban

5. Prove the law of total variance: if Y ∈ L2 (Ω, H, P ) then


V ar[Y ] = E[V ar[Y |X]] + V ar[E[Y |X]]
where V ar[Y |X] is defined as
V ar[Y |X] = E[(Y − E[Y |X])2 |X].

Proof. (ByeongHo Ban)

Observe that
E[V ar[Y |X]] + V ar[E[Y |X]] = E[E[(Y − E[Y |X])2 |X]] + V ar[E[Y |X]]
= E[E[(Y 2 − 2Y E[Y |X] + E[Y |X]2 )|X]] + E[E[Y |X]2 ] − E[E[Y |X]]2
= E[E[Y 2 |X]] − E[E[2Y E[Y |X]|X]] + E[E[E[Y |X]2 |X]] + E[E[Y |X]2 ] − E[E[Y |X]]2
= E[Y 2 ] − E[2E[Y |X]E[Y |X]] + E[E[Y |X]2 ] + E[E[Y |X]2 ] − E[Y ]2
= V ar[Y ] − 2E[E[Y |X]2 ] + E[E[Y |X]2 ] + E[E[Y |X]2 ]
= V ar[Y ]
since E[Y |X] is measurable with respect to σ(X) and so is E[Y |X]2 .


Page 58
STAT 605 Editor : ByeongHo Ban

6. Conditional independence Let G, G1 , and G2 be sub σ-algebras of H. We say that G1 , G2 are conditionally independent
given G if
E[V1 V2 |G] = E[V1 |G]E[V2 |G]
for all non-negative V1 measurable with respect to G1 and all non-negative V2 measurable with respect to G∈ . Show that the
following are equivalent

(a) G1 , G2 are conditionally independent given G.

(b) E[V2 |G ∨ G1 ] = E[V2 |G] for every non-negative V2 measurable with respect to G2 .

(c) E[V2 |G ∨ G1 ] is measurable with respect to G for every non-negative V2 measurable with respect to G2 .

This should be interpreted as follows: if G1 and G2 are conditionally independent given G, then given the information in G the
additional information provided by G1 is useless if we are interested in quantities determined by G2 .
Proof. (ByeongHo Ban)

(b) =⇒ (c)
Suppose (b), then for any non negative V2 measurable with respect to G2 , we have E[V2 |G ∨ G1 ] = E[V2 |G] is measurable with
respect to G by the definition of conditional expectation.

(c) =⇒ (b)
Suppose (c), then for any non-negative V2 measurable with respect to G2 , E[V2 |G ∨ G1 ] is measurable with respect to G. Then
observe that
E[V2 |G] = E[E[V2 |G ∨ G1 ]|G] = E[V2 |G ∨ G1 ].

(b) =⇒ (a)
Let V1 and V2 are non-negative measurable with respect to G1 and G2 respectively. Then, assuming (b), observe that
E[V1 V1 |G] = E[E[V1 V2 |G1 ∨ G2 ∨ G]|G] ∵ by the result of problem 1-(b)
= E[V1 E[V2 |G1 ∨ G2 ∨ G]|G] ∵ V1 is measurable w.r.t G1 ∨ G2 ∨ G
= E[V1 E[V2 |G ∨ G1 ]|G] ∵ V2 is measurable w.r.t G2 so by the argument below
= E[V1 E[V2 |G]|G] ∵ by (b)
= E[V1 |G]E[V2 |G] ∵ E[V2 |G] is measurable w.r.t G
where the third inequality is because E[V |G ∨ G1 ] is already G2 measurable since V2 is G2 measurable already, so E[V2 |G ∨ G1 ] is
G ∨ G1 ∨ G2 measurable and
E[V2 |G ∨ G1 ∨ G2 ] = E[E[V2 |G ∨ G1 ]|G ∨ G1 ∨ G2 ] = E[V2 |G ∨ G1 ].

(a) =⇒ (c)
Observe that, for any V1 and V2 as stated in the problem,
V1 E[V2 |G] = E[V1 E[V2 |G]|G ∨ G1 ] ∵ V1 E[V2 |G] is measurable w.r.t G ∨ G1
= E[V2 |G]E[V1 |G ∨ G1 ] ∵ E[V2 |G] is G ∨ G1 measurable
= E[V2 |G]E[V1 |G] ∵ by the last argument in (b) =⇒ (a)
= E[V1 V2 |G] ∵ by our assumption.
Therefore, V1 is G measurable for any V1 measurable with respect to G1 . It implies G1 ⊂ G so G ∨ G1 = G. Therefore,
E[V2 |G ∨ G∞ ] = E[V2 |G]
which is desired.


Page 59
STAT 605 Editor : ByeongHo Ban

7. Let (D, D) and (E, E) be measurable spaces, let µ be a probability measure on (D, D) and K(x, B) a probability kernel from
(D, D) into (E, E) (in particular K(x, E) = 1 for all X.) Let π be the product space (D × E, D ⊗ E) given by
Z Z Z
f (x, y)dπ(x, y) = µ(dx) f (x, y)K(x, dy).
E×D E D
Show that
Z
E[f (Y )|X] = K(X, dy)f (y).
F

Proof. (ByeongHo Ban)

Observe that, for any Z,


Z
E[E[f (Y )|X]Z] = E[f (y)|x]z(x, y)dπ(x, y)
E×D
Z Z
= µ(dx) E[f (y)|x]z(x, y)K(x, dy)
ZE D

= K(X, dy)E[f (X)|X]Z(X, y).


D
So we are done.


Page 60
STAT 605 Editor : ByeongHo Ban

8. Suppose that Y and Z are independent standard Gaussian random variables and X = Y + Z.

(a)Find the joint distribution π(dx, dy) of X and Y .


(b) Compute the decomposition π(dx, dy) = µ(dx)K(x, dy).
(c) Compute E[Y |X].
Proof. (ByeongHo Ban)

(a)
Recall that the Fourier transform of standard Gaussian is
r2 b
E[eirY ] = eira− 2

where a is mean and b is variance. Since the mean and variance of the distributions of Y and Z are samely 0 and 1 respectively,
r2
we have E[eirY ] = E[eirZ ] = e− 2 . And since Y and Z are independent, we have
r2 r2 2
E[eir(X) ] = E[eir(Y +Z) ] = E[eirY ]E[eirZ ] = e− 2 e− 2 = e−r
which means the distribution of X is a Gaussian distribution with mean 0 and variance 2.
Now, observe that, since Y and Z are independent,
2
 
(r+s)2 r2 − r 2 +rs+ s2
b(r, s) = E[ei(r,s)·(X,Y ) ] = E[ei(r,s)·(Y +Z,Y ) ] = E[ei(r(Y +Z)+sY ) ] = E[ei(r+s)Y ]E[eirZ ] = e−
π 2 e− 2 =e .
Since the characteristic function uniquely specify the distribution, it determines the distribution.
s2 +2rs
(b) Notice the last second decompositon the characteristic function. It says K is corresponding to e− 2 and µ corresponds to
2
e−r . So it is clear.

(c) Observe that


1
E[Y |X] = E[Y |Y + Z] =
(Y + Z)
2
by the result of next problem since Y and Z are both standard Gaussian random variable(so in L1 ) and independent.


Page 61
STAT 605 Editor : ByeongHo Ban

9. Suppose X1 , . . . , Xn are i.i.d random variables and in L1 . Show that for 1 ≤ j ≤ n,


n
1X
E[Xj |X1 + · · · + Xn ] = Xi .
n i=1
Hint: Use problem 1(b) and the symmetry coming from i.i.d hypothesis.
Proof. (ByeongHo Ban)
Let j be given as stated in the problem. For convenience, let Y = ∑_{k=1}^n Xk . Then observe that
Y = E[Y|Y] = ∑_{k=1}^n E[Xk |Y] = nE[Xj |Y],
since the (Xk )_{k=1}^n are i.i.d., so that each pair (Xk , Y) has the same joint distribution and hence E[Xi |Y] = E[Xk |Y] for any i
and k. Therefore, we have
E[Xj |X1 + · · · + Xn ] = Y/n = (1/n) ∑_{k=1}^n Xk .
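Remark. The symmetry E[Xi |Y] = E[Xk |Y] can be watched by simulation. A minimal Python sketch (assuming NumPy; the exponential distribution, n = 4, the target sum s0 = 4 and the bin half-width 0.05 are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(2)
    n, reps = 4, 10**6
    X = rng.exponential(1.0, size=(reps, n))
    S = X.sum(axis=1)
    m = np.abs(S - 4.0) < 0.05               # condition on the sum being near 4
    print(X[m, 0].mean(), X[m, 1].mean(), 4.0 / n)  # all approximately 1.0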



Takehome Final Exam

Due date : December 19th, 2019 Byeongho Ban

Problem 1. Let Xn , n = 1, 2, . . . be independent identically distributed random variables taking values in the measurable space
(E, E) and let µ denote their common distribution. Define for A ∈ E
F_N (A) = (1/N) ∑_{n=1}^N 1_A (Xn ).
It is called the empirical measure for the sequence of random variables Xn . Show the following

1. For each ω ∈ Ω,
F_N (A)(ω) = (1/N) ∑_{n=1}^N 1_A (Xn (ω))
defines a probability measure on (E, E).
2. For every A ∈ E, FN (A) converges to µ(A) almost surely.
3. Show that for every non-negative measurable f : E → R we have
lim_{N→∞} (1/N) ∑_{n=1}^N f(Xn ) = ∫_E f dµ a.s.

Proof. (ByeongHo Ban)

Problem 1 - 1

Let ω ∈ Ω be given, and note that 1_∅ (Xn (ω)) = 0 for any n = 1, 2, . . . , N . Thus, we have
F_N (∅)(ω) = (1/N) ∑_{n=1}^N 1_∅ (Xn (ω)) = 0/N = 0.

Next, we show countable additivity. Suppose that (Ak ) is a sequence of disjoint sets in E. Then observe that
1_A (Xn (ω)) = ∑_{k=1}^∞ 1_{Ak} (Xn (ω)) ∀n,
where A = ∪_{k=1}^∞ Ak . Note that at most one of the summands above is 1 (according to whether Xn (ω) ∈ A and, if so, in which Ak it lies), so the series is convergent.
And this implies that
F_N (A)(ω) = (1/N) ∑_{n=1}^N ∑_{k=1}^∞ 1_{Ak} (Xn (ω)) = ∑_{k=1}^∞ (1/N) ∑_{n=1}^N 1_{Ak} (Xn (ω)) = ∑_{k=1}^∞ F_N (Ak )(ω),
where the interchange of sums is justified because the outer sum over n is finite. This gives us countable additivity.

Lastly, observe that, since Xn (ω) ∈ E ∀n, we have 1_E (Xn (ω)) = 1 ∀n. Therefore, we have
F_N (E)(ω) = (1/N) ∑_{n=1}^N 1_E (Xn (ω)) = (1/N) ∑_{n=1}^N 1 = N/N = 1.
Therefore, we have proven that FN (·)(ω) defines a probability measure on (E, E).
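Remark. A minimal Python sketch of part 1 (assuming NumPy; the standard-normal sample and the three-set partition are arbitrary illustrative choices): the empirical masses of a finite partition are non-negative and sum to 1.

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.standard_normal(1000)            # one realization X_1(w), ..., X_N(w)
    masses = [np.mean(X <= -1),              # F_N((-inf, -1])
              np.mean((X > -1) & (X <= 1)),  # F_N((-1, 1])
              np.mean(X > 1)]                # F_N((1, inf))
    print(masses, sum(masses))               # masses sum to 1 (up to floating point)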



Proof. (ByeongHo Ban)

Problem 1 - 2

For given A ∈ E, note that the 1_A (Xn ) are independent and identically distributed because the Xn are independent and
identically distributed. Also observe that
E[1_A (Xn )] = ∫ 1_A (x) µ(dx) = µ(A).
Therefore, by the strong law of large numbers,
F_N (A) = (1/N) ∑_{k=1}^N 1_A (Xk )
converges to E[1_A (X1 )] = µ(A) almost surely. (The law of large numbers applies because the indicator is bounded, so its
expectation is finite; indeed µ(A) ≤ 1 since µ is a probability measure.)

Proof. (ByeongHo Ban)

Problem 1 - 3

Let f : E → R be a given non-negative measurable function. Then note that the f(Xn ) are i.i.d. Also, observe that
E[f(Xn )] = ∫_E f(x) µ(dx) = ∫_E f dµ.
If this expectation is finite, then by the strong law of large numbers,
lim_{N→∞} (1/N) ∑_{n=1}^N f(Xn ) = ∫_E f dµ a.s.
If the expectation is infinite, apply the previous case to the truncations f ∧ M (which are bounded, hence integrable): for every M,
lim inf_{N→∞} (1/N) ∑_{n=1}^N f(Xn ) ≥ lim_{N→∞} (1/N) ∑_{n=1}^N (f ∧ M)(Xn ) = ∫_E (f ∧ M) dµ a.s.,
and letting M → ∞ (monotone convergence) gives
lim_{N→∞} (1/N) ∑_{n=1}^N f(Xn ) = ∞ = ∫_E f dµ a.s.
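Remark. Parts 2 and 3 can be watched numerically. A minimal Python sketch (assuming NumPy; the N(0, 1) distribution, the set A = (−1, 1] and f(x) = x² are arbitrary illustrative choices): F_N (A) should approach µ(A) ≈ 0.6827, and the average of f(Xn ) should approach ∫ f dµ = 1, as N grows.

    import numpy as np

    rng = np.random.default_rng(4)
    for N in (10**2, 10**4, 10**6):
        X = rng.standard_normal(N)
        F_N = np.mean((X > -1) & (X <= 1))   # empirical measure of A = (-1, 1]
        avg_f = np.mean(X**2)                # empirical average of f(x) = x^2
        print(N, F_N, avg_f)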



Problem 2. (Random Harmonic Series) It is well known that the harmonic series 1 + 1/2 + 1/3 + · · · diverges while the alternating
series 1 − 1/2 + 1/3 − · · · converges. What happens if we pick the signs at random? Let the Xj be independent random variables with
P(Xj = 1) = P(Xj = −1) = 1/2. Let M0 = 0 and
Mn = ∑_{j=1}^n Xj /j.

1. Show that Mn is a martingale with respect to the filtration Fn = σ(X1 , . . . , Xn ).

2. Show that Mn is uniformly integrable. Hint: Show that supn E[Mn2 ] < ∞.

3. Show that the series Mn converges almost surely.


Proof. (ByeongHo Ban)

Problem 2 - 1

Let a positive integer n be given.

First, observe that
E[|Mn |] = E[|∑_{j=1}^n Xj /j|] ≤ ∑_{j=1}^n (1/j) E[|Xj |] = ∑_{j=1}^n (1/j) ((1/2) · 1 + (1/2) · |−1|) = ∑_{j=1}^n 1/j < ∞.

Second, Mn is Fn measurable: each of X1 , . . . , Xn is Fn measurable by definition, and Mn is a linear combination of Fn measurable functions.

Third, observe that, for any m ≤ n,

E[Mn |Fm ] = E[∑_{j=1}^n Xj /j | Fm ]
= E[Mm + ∑_{j=m+1}^n Xj /j | Fm ]
= E[Mm |Fm ] + ∑_{j=m+1}^n (1/j) E[Xj |Fm ] ∵ linearity of conditional expectation
= Mm + ∑_{j=m+1}^n (1/j) E[Xj |Fm ] ∵ Mm is Fm measurable
= Mm + ∑_{j=m+1}^n (1/j) E[Xj ] ∵ Xj with j = m + 1, . . . , n are independent of Fm = ∨_{i=1}^m σ(Xi )
= Mm ,
where the last equality is because
E[Xn ] = 1/2 − 1/2 = 0 ∀n.
Therefore, Mn is a martingale with respect to the filtration Fn .


Proof. (ByeongHo Ban)

Problem 2 - 2

Firstly, for a given positive integer n, observe that

E[Mn²] = E[∑_{j=1}^n Xj²/j² + ∑_{1≤j<i≤n} (2/(ij)) Xi Xj ]
= ∑_{j=1}^n (1/j²) E[Xj²] + ∑_{1≤j<i≤n} (2/(ij)) E[Xi Xj ] ∵ linearity of E
= ∑_{j=1}^n (1/j²) E[Xj²] + ∑_{1≤j<i≤n} (2/(ij)) E[Xi ]E[Xj ] ∵ the Xi are independent
= ∑_{j=1}^n (1/j²) E[Xj²] ∵ E[Xi ] = 0 ∀i by an argument in Problem 2-1
= ∑_{j=1}^n 1/j²,
where the last equality is from
E[Xj²] = 1² P(Xj = 1) + (−1)² P(Xj = −1) = 1/2 + 1/2 = 1.
Therefore, we have
L := sup_n E[Mn²] = ∑_{n=1}^∞ 1/n² < ∞.
Now, for given c > 0, observe that
c E[|Mn |1_{(|Mn |>c)} ] = E[c|Mn |1_{(|Mn |>c)} ] ≤ E[|Mn |² 1_{(|Mn |>c)} ] ≤ E[Mn²] ≤ L,
thus
lim_{c→∞} sup_n E[|Mn |1_{(|Mn |>c)} ] ≤ lim_{c→∞} L/c = 0.

Therefore, by the definition of uniform integrability, (Mn ) is uniformly integrable.



Proof. (ByeongHo Ban)

Problem 2 - 3

Note that the martingale convergence theorem tells us that Mn converges almost surely if Mn is a submartingale with
sup_n E[Mn⁺] < ∞. We know that Mn is a martingale from part 1 of this problem, so it is both a sub- and a supermartingale.
Furthermore, since Mn is uniformly integrable, there is a finite c > 0 such that
E[|Mn |1_{(|Mn |>c)} ] ≤ sup_n E[|Mn |1_{(|Mn |>c)} ] < 1,
so we have
E[Mn⁺] ≤ E[Mn⁺] + E[Mn⁻] = E[|Mn |] = E[|Mn |1_{(|Mn |>c)} ] + E[|Mn |1_{(|Mn |≤c)} ] < 1 + c ∀n,
which implies that
sup_n E[Mn⁺] ≤ 1 + c < ∞.
Therefore, by the martingale convergence theorem, Mn converges almost surely.
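Remark. A minimal Python sketch of the almost-sure convergence (assuming NumPy; the path count and path length are arbitrary illustrative choices): individual paths of Mn settle down to random limits, and the empirical second moment stays below ∑ 1/j² = π²/6.

    import numpy as np

    rng = np.random.default_rng(5)
    n, paths = 5000, 500
    signs = rng.choice([-1.0, 1.0], size=(paths, n))
    M = np.cumsum(signs / np.arange(1, n + 1), axis=1)
    print(M[:3, -1])                          # three (random) near-limit values
    print(np.mean(M[:, -1]**2), np.pi**2 / 6) # E[M_n^2] vs its supremum pi^2/6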



Problem 3. In this problem, we consider independent and identically distributed random variables X1 , . . . , Xn with a common
distribution function F (x) = P (Xj ≤ x) and we are interested in understanding
Yn = max(X1 , . . . , Xn )
that is, Yn is the maximal value of X1 , . . . , Xn .

1. Prove that the distribution function of Yn is equal to F (x)n .

2. Assume that for any finite x, we have F (x) < 1 (this means that the Xj are unbounded). Show that the increasing sequence
Yn satisfies
lim Yn = +∞ a.s.
n→∞
Hint: Fix an arbitrary R and consider the events An = {Yn ≤ R}. Then apply the Borel–Cantelli lemma.

3. Suppose that Xj is an exponential random variable with distribution function F(x) = 1 − e^{−x}. From part 2, we know that Yn
diverges to infinity almost surely. In order to characterize this divergence, show that
Yn − log n converges in distribution to Z,
where Z has distribution function
P(Z ≤ x) = e^{−e^{−x}}.
Z is called the Gumbel distribution.
Proof. (ByeongHo Ban)

Problem 3 - 1

Let the distribution function of Yn be Gn (x). Since the Xn are independent, and since max(X1 , . . . , Xn ) ≤ x holds if and only if
Xi ≤ x for all i ∈ {1, 2, . . . , n}, we observe that
Gn (x) = P(Yn ≤ x) = P(max(X1 , . . . , Xn ) ≤ x) = P(X1 ≤ x)P(X2 ≤ x) · · · P(Xn ≤ x) = ∏_{i=1}^n P(Xi ≤ x) = ∏_{i=1}^n F(x) = F(x)ⁿ.
Thus, we are done.
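Remark. A one-line check in Python (assuming NumPy; Uniform(0, 1) variables, where F(x) = x, and the point x = 0.8 with n = 5 are arbitrary illustrative choices): the empirical P(Y5 ≤ 0.8) should be close to 0.8⁵ = 0.32768.

    import numpy as np

    rng = np.random.default_rng(6)
    Y5 = rng.random((10**6, 5)).max(axis=1)  # 10^6 samples of Y_5 = max of 5 uniforms
    print(np.mean(Y5 <= 0.8), 0.8**5)        # empirical CDF at 0.8 vs F(0.8)^5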

Proof. (ByeongHo Ban)

Problem 3 - 2

Let R > 0 be given. Consider the events An = (Yn ≤ R); the 1_{An} form a sequence of Bernoulli variables and, by part 1 of this problem,
E[1_{An} ] = P(An ) = F(R)ⁿ.
Then, since F(R) < 1, we have
∑_{n=1}^∞ E[1_{An} ] = ∑_{n=1}^∞ F(R)ⁿ = F(R)/(1 − F(R)) < ∞.
Then, by the Borel–Cantelli lemma (convergence part), we have
∑_{n=1}^∞ 1_{An} < ∞ a.s.
It implies that for almost every ω ∈ Ω there is N > 0 such that 1_{An} (ω) = 0, i.e. Yn (ω) > R, for all n > N . Applying this to
R = 1, 2, 3, . . . and intersecting the resulting countably many almost-sure events, we conclude that, almost surely, for every R > 0
there is N > 0 such that Yn (ω) > R for all n > N . In other words,
lim_{n→∞} Yn = ∞ a.s.



Proof. (ByeongHo Ban)

Problem 3 - 3

Let r ∈ R be given, and let µn and µ be the distribution measures of Yn − log n and Z respectively. Then observe that, by
the result of part 1 of this problem, for n large enough that r + log n ≥ 0,
µn (−∞, r] = P(Yn − log n ≤ r) = P(Yn ≤ r + log n) = (1 − e^{−r−log n})ⁿ = (1 − e^{−r}/n)ⁿ
−→_{n→∞} e^{−e^{−r}} = P(Z ≤ r) = µ(−∞, r].
Since the limit distribution function x ↦ e^{−e^{−x}} is continuous at every point of R, convergence of the distribution functions
P(Yn − log n ≤ r) → P(Z ≤ r) at every r ∈ R is exactly the definition of convergence in distribution.

Therefore, Yn − log n converges to Z in distribution.
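Remark. A minimal Python sketch of the Gumbel limit (assuming NumPy; n = 1000, the number of replications and the evaluation points are arbitrary illustrative choices): the empirical distribution of Yn − log n is compared with e^{−e^{−x}}.

    import numpy as np

    rng = np.random.default_rng(7)
    n, reps = 10**3, 10**4
    Y = rng.exponential(1.0, size=(reps, n)).max(axis=1) - np.log(n)
    for x in (-1.0, 0.0, 1.0, 2.0):
        print(x, np.mean(Y <= x), np.exp(-np.exp(-x)))  # empirical vs Gumbel CDF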





Problem 4. Assume that X and Y are two independent non-negative random variables on a probability space (Ω, H, P ) and
assume that X has an exponential distribution with parameter λ. Show that
P(X > Y) = E[e^{−λY}].

Proof. (ByeongHo Ban)

Observe that, since Y is a non-negative random variable and writing µY for the distribution of Y,

P(X > Y) = ∫₀^∞ P(X > Y |Y = y) µY (dy)
= ∫₀^∞ P(X > y) µY (dy) ∵ X and Y are independent
= ∫₀^∞ e^{−λy} µY (dy) ∵ P(X ≤ y) = 1 − e^{−λy}
= E[e^{−λY}].

When X and Y are not independent, the statement can fail. Consider Y = X. Then, if the statement were true, we would have
0 = P(X > X) = P(X > Y) = E[e^{−λX}] = ∫₀^∞ e^{−λx} λe^{−λx} dx = ∫₀^∞ λe^{−2λx} dx = 1/2,
which is a contradiction. Therefore, the independence assumption cannot be dropped.
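Remark. A Monte Carlo sketch in Python (assuming NumPy; λ = 2 and Y ~ Exponential(1) are arbitrary illustrative choices, for which E[e^{−2Y}] = 1/3):

    import numpy as np

    rng = np.random.default_rng(8)
    n, lam = 10**6, 2.0
    X = rng.exponential(1 / lam, n)          # Exponential with rate lam (scale 1/lam)
    Y = rng.exponential(1.0, n)              # independent of X
    print(np.mean(X > Y), np.mean(np.exp(-lam * Y)))  # both approximately 1/3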



Problem 5.
1. Consider independent identically distributed random variables Xn , n = 1, 2, 3, . . . with E[Xn ] = 1, and with M0 = 1 set
Mn = M0 X1 X2 X3 · · · Xn .
Show that Mn is a martingale with respect to the filtration Fn = σ(X1 , . . . , Xn ). What is E[Mn ]?

2. Assume that Xn is such that P (Xn = 3/2) = 1/2 and P (Xn = 1/2) = 1/2. Compute the almost sure limit limn→∞ Mn . Hint:
Consider the limit of Kn = log Mn and use the law of large numbers. You do not need anything else.

3. Use Jensen’s inequality to show that your result in 2 is true for any distribution of Xi (except Xi = 1 a.s.).
Proof. (ByeongHo Ban)

Problem 5 - 1

Let a positive integer n be given.

(1) Note that the assumption E[Xn ] = 1, i.e. E[Xn⁺] − E[Xn⁻] = 1 with both parts finite, gives us that E[|Xn |] < ∞. Then observe that
E[|Mn |] = E[|M0 X1 X2 X3 · · · Xn |]
= E[|X1 X2 X3 · · · Xn |] ∵ M0 = 1
= E[|X1 |]E[|X2 |] · · · E[|Xn |] ∵ the Xn are independent
< ∞,
where the last inequality is from the fact that a finite product of finite numbers is finite.

(2) It is clear that Mn = M0 X1 · · · Xn = X1 · · · Xn is Fn = σ(X1 , . . . , Xn ) measurable because Mn is a finite product of Fn
measurable functions.

(3) For any m < n, observe that

E[Mn |Fm ] = E[M0 X1 · · · Xn |Fm ]
= Mm E[Xm+1 · · · Xn |Fm ] ∵ Mm is Fm measurable and Mn , Mm , Xm+1 · · · Xn ∈ L¹
= Mm E[Xm+1 · · · Xn ] ∵ Xm+1 , . . . , Xn are all independent of σ(X1 , . . . , Xm )
= Mm E[Xm+1 ] · · · E[Xn ] ∵ Xm+1 , . . . , Xn are mutually independent
= Mm ∵ E[Xi ] = 1 ∀i.

Therefore, M is a martingale. And, by the independence of X1 , . . . , Xn and the fact that M0 = 1, observe that
E[Mn ] = E[M0 X1 X2 · · · Xn ] = E[X1 ]E[X2 ] · · · E[Xn ] = 1,
since E[Xj ] = 1 for all j.

Proof. (ByeongHo Ban)

Problem 5 - 2

Consider Kn = log Mn . Then observe that, since log M0 = log 1 = 0,
Kn = log Mn = ∑_{i=1}^n log Xi ,
and that
E[log Xi ] = P(Xi = 3/2) log(3/2) + P(Xi = 1/2) log(1/2) = (1/2) log(3/2) + (1/2) log(1/2) = (1/2) log(3/4) < 0.
Then, since the log Xi are i.i.d., by the law of large numbers,
(log Mn )/n = Kn /n = (1/n) ∑_{i=1}^n log Xi → E[log Xi ] < 0 as n → ∞ a.s.
Therefore, log Mn = n · (Kn /n) → −∞ as n → ∞, which implies that Mn → 0 a.s.
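Remark. A minimal Python sketch (assuming NumPy; the path count and path length are arbitrary illustrative choices): every simulated path of Mn collapses towards 0, even though E[Mn ] = 1 for every n. The expectation is checked at n = 10, where the Monte Carlo error is still small; for large n the expectation is carried by astronomically rare paths, so a naive sample mean no longer resolves it.

    import numpy as np

    rng = np.random.default_rng(9)
    n, paths = 1000, 10**4
    X = rng.choice([1.5, 0.5], size=(paths, n))
    M = np.cumprod(X, axis=1)
    print(np.mean(M[:, 9]))                  # empirical E[M_10]: close to 1
    print(np.max(M[:, -1]))                  # largest M_1000 over all paths: tiny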



Proof. (ByeongHo Ban)

Problem 5 - 3

Note that the specific distribution of Xi in part 2 was used only to conclude that E[log Xi ] < 0. Therefore, excluding the case
Xi = 1 a.s., it suffices to show that for i.i.d. random variables (Xn ) of any distribution with E[Xi ] = 1 we still have
E[log Xi ] < 0.

Note that
(d²/dx²) log x = −1/x² < 0 for x > 0,
which implies that log is a strictly concave function. Therefore, by Jensen's inequality,
E[log Xi ] ≤ log E[Xi ] = 0.
Furthermore, we observe that E[log Xi ] cannot be zero. Since log is strictly concave, equality in Jensen's inequality
E[log Xi ] = log E[Xi ] forces Xi to be almost surely constant; together with E[Xi ] = 1 this would give Xi = 1 a.s., which is
exactly the case we have excluded. (If E[log Xi ] = −∞, the conclusion below only becomes easier.)

Therefore, we have
E[log Xi ] < 0,
so we can apply the same argument as in part 2 to prove that Mn → 0 a.s. In other words, by the law of large numbers,
(log Mn )/n = (1/n) ∑_{i=1}^n log Xi → E[log Xi ] < 0 as n → ∞ a.s.,
which implies that log Mn → −∞ as n → ∞, so Mn → 0 as n → ∞.


