
STAT4406

March 12, 2024

1 Measure Theory
1.1 Defining σ-algebras
σ-algebras are a special kind of set algebra. A set algebra over some set S is a field-like structure
defined to be a family of subsets F such that:

1. F is closed under complementation.

2. F contains ∅. Along with 1., this implies that S ∈ F .

3. F is closed under finite unions. That is:

∀i ∈ {1, 2, 3, . . . , n}, Ai ∈ F ⇒ ∪ni=1 Ai ∈ F

Proof: F is closed under finite unions ⇔ F is closed under finite intersections.

Consider some sequence of sets Ai ∈ F , i ∈ {1, 2, 3, . . . , n}.

As F is closed under complements, Aci ∈ F .

As F is closed under unions,

∪ni=1 Aci ∈ F

Again, as F is closed under complements,

(∪ni=1 Aci )c ∈ F

By De Morgan's law:

(∪ni=1 Aci )c = ∩ni=1 Ai

Hence,

∩ni=1 Ai ∈ F

A σ-algebra Σ over some set S is a set algebra such that condition 3. now states that Σ is closed under
countable unions. That is, for some infinite enumerable sequence Ai ∈ Σ,

∪∞i=1 Ai ∈ Σ

The proof that this implies closure under countable intersection is the same as above.

Simple examples of σ-algebras over a set S:


1. P(S) is a σ-algebra.
2. {S, ∅} is a σ-algebra. It is the simplest σ-algebra.

σ-algebras lend themselves well to intuitive notions of measurability. For some set S and subset A, if
you know how 'big' A is and how 'big' S is, you should be able to work out how 'big' Ac is. If you
know how 'big' each set in some sequence of subsets of S is, you should be able to work out how 'big'
their union or intersection is.

1.1.1 Generating σ-algebras from a family of sets


Given a family of sets F , ∃ a unique smallest σ-algebra containing F .

More specifically, this is the intersection of every σ-algebra that contains F . To prove that this
intersection is itself a σ-algebra and is minimal, consider the following:

Proof: If Ci is a set of σ-algebras on X, then the intersect ∩i Ci , denoted ∧i Ci , is also a σ-algebra:

If e ∈ Ci ∀ i, then it is guaranteed that ec ∈ Ci ∀ i, as all the sets in question are σ-algebras. Hence,

e ∈ ∧i Ci → ec ∈ ∧i Ci

That is, the intersect is closed under complement. As all the sets in question are σ-algebras,

∅ ∈ ∧i Ci , X ∈ ∧i Ci

Consider some sequence of sets An ∈ ∧i Ci . As each Ci is a σ-algebra,

An ∈ Ci ∀ i → ∪n An ∈ Ci ∀ i → ∪n An ∈ ∧i Ci

That is, the intersect is closed under countable union. Hence the intersect of σ-algebras is a σ-algebra.
Proof: If Ci is the set of all σ-algebras on X containing a family of sets in X called F , the intersect is
the minimal σ-algebra containing F .

As F ⊆ P(X) and P(X) is itself a σ-algebra containing F , the collection Ci is not empty.

Therefore,

F ⊆ ∧i Ci

Moreover, the intersect is minimal in the sense that, for any Σ ∈ Ci ,

∧i Ci ⊆ Σ

This minimal σ-algebra is called the σ-algebra generated by F and is denoted σF
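On a finite set, the generated σ-algebra can be computed directly by iterating the closure operations to a fixed point; a minimal sketch (the ground set S and family F below are illustrative assumptions):

```python
def generate_sigma_algebra(S, F):
    """Compute σ(F): the smallest σ-algebra on finite S containing family F.

    Iterates closure under complement and pairwise union (which, on a finite
    set, suffices for countable unions) until a fixed point is reached.
    """
    sigma = {frozenset(), frozenset(S)} | {frozenset(A) for A in F}
    while True:
        new = set(sigma)
        new |= {frozenset(S) - A for A in sigma}        # complements
        new |= {A | B for A in sigma for B in sigma}    # pairwise unions
        if new == sigma:
            return sigma
        sigma = new

S = {1, 2, 3, 4}
F = [{1}, {1, 2}]                 # illustrative generating family
sigma = generate_sigma_algebra(S, F)
```

Here the atoms are {1}, {2}, {3, 4}, so σ(F) has 2³ = 8 elements.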

1.1.2 Borel algebra


EXPAND

Borel subsets are sets that can be formed from the open sets (or, equivalently, the closed sets) through
countable operations of union, intersection and complement. The general notion of 'open sets' requires
the set to be a topological space.

The collection of all Borel sets forms a σ-algebra, called the Borel algebra.

1.1.3 Dynkin π-λ theorem


π-system:

A family of sets P belonging to parent set X form a π-system if they are closed under finite intersections.

λ-system:

A family of sets D belonging to parent set X form a λ-system if:

1. X ∈ D
2. (A, B ∈ D) ∧ (B ⊆ A) → A \ B ∈ D
3. For a monotone increasing sequence An , (∀n, An ∈ D) → ∪n An ∈ D

Proof: Σ is a σ-algebra ⇔ (Σ is a π-system) ∧ (Σ is a λ-system)

⇒: σ-algebras are closed under countable (hence finite) intersections and therefore are π-systems.
Moreover, σ-algebras are closed under countable unions, contain the entire set and are closed under
(relative) complements. Hence, they are λ-systems.

⇐: X ∈ Σ as Σ is a λ-system. Hence ∅ = X \ X ∈ Σ. Moreover, as λ-systems are closed under
relative complement and contain the parent set, Ac = X \ A ∈ Σ, so Σ is closed under complements.

Next, Σ is closed under finite unions: A ∪ B = (Ac ∩ B c )c ∈ Σ, using closure under complements
and the π-system property. To show that Σ is closed under countable unions (and therefore countable
intersections, by De Morgan), consider some sequence An ∈ Σ. To convert it to a monotone increasing
sequence, set B1 = A1 and define recursively Bn = B(n−1) ∪ An . Each Bn ∈ Σ by closure under finite
unions, and Bn is monotone increasing by definition. As Σ is a λ-system, ∪i Bi ∈ Σ. Since ∪i Bi = ∪i Ai ,
∪i Ai ∈ Σ. Therefore Σ is a σ-algebra.

Proof: Dynkin π-λ theorem: Given π-system P on X and λ-system D on X,

P ⊂ D → σP ⊂ D
SKIPPING FOR NOW

1.1.4 Measurable space


A measurable space is a space (X, Σ) where Σ is a σ-algebra on X. Subsets of X in Σ are called
Σ-measurable.

1.1.5 Product space


Effectively defines how to take a Cartesian product of measurable spaces. Given measurable spaces
(E, E) and (F, F), the product space is denoted by:

(E × F, E ⊗ F)

Where E ⊗ F := σ({A × B | A ∈ E, B ∈ F}), the σ-algebra generated by the measurable rectangles.

1.2 Measurable functions


1.2.1 Definition
A function f : (E, E) → (F, F) is considered an E/F measurable function if ∀S ∈ F, the preimage
f −1 (S) ∈ E, where the preimage is precisely defined to be:

f −1 (S) := {x ∈ E | f (x) ∈ S}
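On finite spaces the preimage definition can be checked mechanically; a small sketch, with hand-picked spaces and σ-algebras as assumptions:

```python
def preimage(f, E, S):
    """f^{-1}(S) = {x ∈ E | f(x) ∈ S}."""
    return frozenset(x for x in E if f(x) in S)

def is_measurable(f, E, sigma_E, sigma_F):
    """f is E/F measurable iff every S in sigma_F pulls back into sigma_E."""
    return all(preimage(f, E, S) in sigma_E for S in sigma_F)

# Illustrative finite example (spaces and σ-algebras chosen by hand):
E = {1, 2, 3, 4}
sigma_E = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset(E)}
F = {'a', 'b'}
sigma_F = {frozenset(), frozenset({'a'}), frozenset({'b'}), frozenset(F)}

f = lambda x: 'a' if x <= 2 else 'b'   # measurable: preimages are {1,2}, {3,4}
g = lambda x: 'a' if x == 1 else 'b'   # not measurable: preimage {1} ∉ sigma_E
```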
1.2.2 Properties of measurable functions
Proof: f : (E, E) → (F, F) is E/F measurable ⇔ ∀S ∈ F0 , where F = σF0 , f −1 (S) ∈ E

⇒: As F0 ⊂ F, the measurability definition guarantees this directly.

⇐: Consider the set G := {X ∈ F | f −1 (X) ∈ E}.

Now prove that G is a σ-algebra:

1. F ∈ G as E = f −1 (F )

2. If S ∈ G, then S c ∈ G as f −1 (S c ) = f −1 (S)c

3. If Sn ∈ G ∀n, then ∪n Sn ∈ G as f −1 (∪n Sn ) = ∪n f −1 (Sn )

Suppose F0 ⊆ G. Then F ⊆ G, as G was shown to be a σ-algebra and F is generated by F0 . Moreover,
as G is built from elements of F, G ⊆ F. Hence G = F. That is, F is precisely the set whose every
element has an inverse image in E, which is the definition of measurability.

Proof: Composition of measurable functions. For some measurable spaces (E, E), (F, F), and (G, G)
and functions f : E → F and g : F → G,

(f is E/F measurable) ∧ (g is F/G measurable) → g ◦ f is E/G measurable

Consider some set S ∈ G. As g is F/G measurable, g −1 (S) ∈ F. Moreover, as f is E/F measurable,
f −1 (g −1 (S)) ∈ E. As (g ◦ f )−1 (S) = f −1 (g −1 (S)), g ◦ f is E/G measurable.

1.2.3 Numerical functions


The reals with the Borel algebra (R, B) form a measurable space.

The extended reals R̄ and the corresponding Borel algebra (R̄, B̄) form a measurable space.

For some measurable space (E, E),

f : (E, E) → (R, B) is called a real valued function on E.

f : (E, E) → (R̄, B̄) is called a numerical function on E.

An E/B̄-measurable numerical function can more succinctly be called an E-measurable function, or,
most succinctly, f ∈ E.
1.2.4 Building measurable numerical functions
Consider measurable space (E, E)

Indicator functions: Simplest measurable numerical functions. For some A ∈ E, of the form:

1A (x) = 1 if x ∈ A, 0 if x ∉ A

Simple functions: A simple function f : E → R is one such that for some sets Ai ∈ E and
n ∈ N it can be written as a linear combination of indicator functions:

f = Σni=1 ai 1Ai

Canonically, the sequence Ai must be disjoint with ∪ni=1 Ai = E

For simple functions f and g, the below properties hold:

α1 f + α2 g is simple for any α1 , α2 ∈ R

f g is simple

For g such that g(x) ̸= 0 ∀x ∈ E, f /g is simple

min (f, g), denoted f ∧ g, is simple

max (f, g), denoted f ∨ g, is simple
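The indicator and simple-function definitions translate directly into code; the sets and coefficients below are illustrative assumptions:

```python
def indicator(A):
    """1_A(x): the simplest measurable numerical function."""
    return lambda x: 1 if x in A else 0

def simple(coeffs, sets):
    """f = Σ a_i 1_{A_i}: a linear combination of indicator functions."""
    return lambda x: sum(a * indicator(A)(x) for a, A in zip(coeffs, sets))

f = simple([2.0, 5.0], [{0, 1}, {2, 3}])   # f = 2·1_{0,1} + 5·1_{2,3}
g = simple([1.0], [{1, 2, 4}])

# Closure properties, checked pointwise: αf + βg and f·g are again simple.
h = lambda x: 3 * f(x) - 2 * g(x)
fg = lambda x: f(x) * g(x)
```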

Through a limiting process of sequences of simple functions a large class of E-measurable functions
can be built. Necessary tools to derive limits of sequences:

For some sequence an ∈ R,

inf (an ) is defined to be the greatest lower bound of an

sup (an ) is defined to be the smallest upper bound of an

lim sup (an ) is (very loosely) the infimum of the set of eventual supremums. Specifically,

lim sup (an ) = inf {sup {am | m ≥ n} | n ≥ 0}


lim inf (an ) is very loosely the supremum of the set of eventual infimums. Specifically,

lim inf (an ) = sup {inf {am | m ≥ n} | n ≥ 0}
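The lim sup / lim inf definitions can be approximated numerically by truncating the sequence; a sketch assuming a concrete oscillating sequence (the truncation bounds N and n0 are arbitrary choices):

```python
def lim_sup(a, N=10_000, n0=1_000):
    """Approximate lim sup a_n = inf_n sup_{m≥n} a_m by truncating at N terms."""
    return min(max(a(m) for m in range(n, N)) for n in range(n0, n0 + 10))

def lim_inf(a, N=10_000, n0=1_000):
    """Approximate lim inf a_n = sup_n inf_{m≥n} a_m."""
    return max(min(a(m) for m in range(n, N)) for n in range(n0, n0 + 10))

a = lambda n: (-1) ** n * (1 + 1 / (n + 1))  # oscillates: lim sup = 1, lim inf = -1
```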


To build the larger class of E-measurable functions, limits are taken pointwise. For each x the function
is defined on, define an := fn (x). By definition, lim inf (an ) ≤ lim sup (an ). If lim inf (an ) = lim sup (an )
for every x, then fn is said to converge to f , denoted fn → f .

If every sequence an is increasing, that is to say that fn increases pointwise, then the limit always
exists and is denoted by fn ↑ f .

If every sequence an is decreasing, that is to say that fn decreases pointwise, then the limit always
exists and is denoted by fn ↓ f .

Proof: For some sequence of E-measurable functions fn , the pointwise defined functions inf (fn ),
sup (fn ), lim inf (fn ), and lim sup (fn ) are all E-measurable. Hence the limit, if it exists i.e. lim inf (fn ) =
lim sup (fn ), also is E-measurable.

1. sup (fn ):

Let s := sup (fn ). The Borel σ-algebra on R̄ can be generated by the set of intervals of the form [−∞, r].
As f is measurable iff the preimages of a generating family lie in E (proven above), it is sufficient to
prove that the preimage under s of every set of the form [−∞, r] lies in E. To do this consider that ∀r:

s−1 ([−∞, r]) = ∩n fn−1 ([−∞, r])

As each fn is measurable, fn−1 ([−∞, r]) ∈ E. Moreover, as E is a σ-algebra it is closed under countable
intersections. Hence ∀r, s−1 ([−∞, r]) ∈ E, and s is E-measurable.

2. inf (fn )

Let i := inf (fn ). It must be noted that i = − sup (−fn ). Hence by the above proof i is E-measurable.

3. lim inf (fn ), lim sup (fn )

lim inf (fn ) := sup_{n≥0} inf_{m≥n} fm

lim sup (fn ) := inf_{n≥0} sup_{m≥n} fm

These are E-measurable by combining steps 1 and 2.

Hence, it has been established that limits of E-measurable functions are E-measurable.

The functions at our disposal to build all E-measurable functions are the simple functions previously
defined. To define a broad class of E-measurable functions, we will decompose them into positive and
negative parts. We will first show that all positive (and by sign change negative) E-measurable
functions can be built from simple functions. We will then show that a numerical function is
measurable if and only if its positive and negative parts are measurable. Hence simple functions are
sufficient to build all E-measurable functions.

It is intuitive that any numerical f can be written as

f := f + + f −

where f + := f ∨ 0 is the positive part and f − := f ∧ 0 is the negative part.
Proof: f + ∈ E+ ⇔ f + is the pointwise limit of positive simple functions fn

⇐ sufficiency:

If f + is the pointwise limit of simple functions then it is E-measurable, as it was previously proven
that limits of measurable functions are measurable.

⇒ necessity:

Goal is to chop up R̄+ into smaller and smaller pieces. Based on these pieces, create simple functions
that are lower bounds of f + on each piece. As the pieces keep getting smaller, "resolution" increases
and the sequence of simple functions converges pointwise to f + .

First cut R̄+ into [0, 2^n ] and (2^n , ∞] for some integer n. Now subdivide [0, 2^n ] into 2^{2n}
pieces, each of width 1/2^n . These intervals look like:

Ik,n = [ k/2^n , (k + 1)/2^n ), k ∈ {0, 1, 2, . . . , 2^{2n} − 1}
Define:

Ek,n := (f + )−1 (Ik,n )

and the remnant

Rn := (f + )−1 ((2^n , ∞])

As f + is measurable, ∀k, n: Ek,n , Rn ∈ E.

Now we want to define fn as a lower bound of f + . Hence, set:

fn = Σ_{k=0}^{2^{2n} −1} (k/2^n ) 1Ek,n + 2^n 1Rn

That is, on the inverse image of each interval Ik,n we take an indicator function scaled by the lower
endpoint of that interval, creating a lower bound of f + there. As n grows the partition refines and
the sequence fn increases pointwise, with limn→∞ fn = f +
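The dyadic construction above can be sketched directly; the test function f(x) = x² is an illustrative assumption:

```python
import math

def dyadic_approx(f, n):
    """n-th simple-function approximation of a positive function f:
    on [0, 2^n] take the lower endpoint of the dyadic interval I_{k,n}
    containing f(x); on the remnant (2^n, ∞] take the value 2^n."""
    def fn(x):
        y = f(x)
        if y > 2**n:
            return 2**n
        return math.floor(y * 2**n) / 2**n
    return fn

f = lambda x: x * x          # illustrative positive measurable function
approx = [dyadic_approx(f, n)(1.7) for n in (2, 4, 8)]  # increases toward f(1.7)
```

Each refinement is pointwise above the previous one and below f, matching fn ↑ f⁺.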

Hence every positive and negative E-measurable function can be written as a limit of simple functions.
Now we will extend this to the entire class of E-measurable functions.

f ∈ E ⇔ f + ∈ E+ ∧ f − ∈ E−

Proof sketch: for r ≥ 0, the inverse images of sets of the form [r, ∞] determine f + , and those of
[−∞, −r] determine f − .

Hence all E-measurable functions are limits of simple functions, and properties of simple functions
that survive pointwise limits extend to all E-measurable functions.
Monotone class of functions M on (E, E):

1. 1E ∈ M

2. ∀ bounded f, g ∈ M, ∀a, b ∈ R, af + bg ∈ M

3. For some increasing sequence of positive functions fn ∈ M that increases to some f , f ∈ M.

Properties of monotone class of functions:

1.3 Measures
For some measurable space (E, E), a measure µ is a set function:

µ : E → [0, ∞]

such that it is countably additive. That is, for some sequence of disjoint sets An ∈ E:

µ (∪n An ) = Σn µ(An )

With µ(∅) = 0.

A measure is called σ-finite if ∃ a sequence An ∈ E with ∪n An = E such that for each n:

µ(An ) < ∞

A measure is a probability measure if µ(E) = 1

Properties of measures:

A ⊆ B ⇒ µ(A) ≤ µ(B)

Proof:

∀n, An ⊆ An+1 ⇒ limn→∞ µ(An ) = µ (∪n An )

Proof:

µ (∪n An ) ≤ Σn µ(An )

Proof:
A space (E, E, µ) is called a measure space.

A pre-measure is similar to a measure but is defined on some algebra E0 . The below allows its extension
to the entire σ-algebra generated by it:

Carathéodory extension theorem: For some algebra E0 on E, a pre-measure µ0 can be extended to a
measure on the entire space (E, σE0 )
1.3.1 Dirac measure
Assigns measure 1 to sets ∈ E containing some x ∈ E.

δx (A) := 1A (x)

1.3.2 Counting measure


Counts how many elements of a countable set D ∈ E lie in a set A ∈ E:

v(A) := Σx∈D δx (A)

1.3.3 Discrete measure


Assigns mass m(x) to each point x in some countable set D ∈ E. Defined as:

µ(A) := Σx∈D m(x)δx (A)

For countable E with E = 2^E , any measure must be of this form.
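The Dirac, counting and discrete measures can be sketched as higher-order functions on finite sets (a toy model; real supports may be countably infinite):

```python
def dirac(x):
    """Dirac measure δ_x: 1 if the set contains x, else 0."""
    return lambda A: 1 if x in A else 0

def counting(D):
    """Counting measure: v(A) = Σ_{x∈D} δ_x(A)."""
    return lambda A: sum(dirac(x)(A) for x in D)

def discrete(D, m):
    """Discrete measure: µ(A) = Σ_{x∈D} m(x) δ_x(A)."""
    return lambda A: sum(m(x) * dirac(x)(A) for x in D)
```

For instance, counting({1, 2, 3}) applied to {2, 3, 5} counts the 2 shared points.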

1.3.4 Lebesgue measure


Consider E := (0, 1]. Consider the algebra E0 generated by finite unions of sets of the form (a, b] where
0 ≤ a < b ≤ 1. The natural length is defined as:

µ0 ((a, b]) = b − a

Moreover, for disjoint intervals the finite additivity property is satisfied:

µ0 (∪ni=1 (ai , bi ]) = Σni=1 (bi − ai )

With some work it can be shown that µ0 is countably additive.

By Carathéodory's extension theorem, ∃ a measure µ on B(0,1] := σE0 that extends µ0 . It can be
proven that this extension is unique; it is called the Lebesgue measure.

Proof: Uniqueness of the Lebesgue measure.

Definition of product measure:

For two measure spaces (E, E, µE ) and (F, F, µF ), the product measure on the product space
(E × F, E ⊗ F, µE⊗F ) is defined on rectangles by:

µE⊗F (A × B) := µE (A) µF (B)
1.4 Integrals
Denoted µf , ∫E f dµ, or ∫E µ(dx)f (x).
E E

Define for indicator functions. Define for simple functions. Define then for all E-measurable functions
by taking limit of simple functions.

For indicator functions:

µ1A = µ(A)

For simple functions:

µ (Σni=1 ai 1Ai ) = Σni=1 ai µ(Ai )
For any positive E-measurable function f + , define the simple approximations fn as previously. Hence
define:

µf + = limn→∞ µfn

To extend to all E-measurable functions, as earlier define:

µf = µf + + µf −

In this case we also impose that either µf + or µf − is finite; this avoids undefined ∞ − ∞ situations.
If µ|f | < ∞, then f is called µ-integrable.

1.4.1 Integrals over discrete measure


µf = Σx∈D m(x)f (x)
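For a concrete check, the discrete-measure integral reduces to a weighted sum; a small sketch (the support D, mass m and integrand f are illustrative):

```python
def integrate_discrete(f, D, m):
    """µf = Σ_{x∈D} m(x) f(x): the integral w.r.t. a discrete measure."""
    return sum(m(x) * f(x) for x in D)

# Uniform mass 1 on D = {0, 1, 2} with f(x) = x^2 gives 0 + 1 + 4 = 5.
result = integrate_discrete(lambda x: x * x, {0, 1, 2}, lambda x: 1)
```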

1.4.2 Lebesgue integrals


For measurable space (R, B) and for some f ∈ B, the Lebesgue integral over some set A ∈ B is most
generally denoted as:

∫A f (x)dx = ∫ 1A f (x)dx

The limiting process as above can be taken. This can also be extended to (R^n , B^n ) for all
B^n -measurable functions.

1.4.3 Negligible sets


For some (E, E, µ), the class of measure 0 sets is extended: a set A ⊆ E is said to be negligible if
∃B ∈ E such that A ⊆ B and µ(B) = 0.

1.4.4 µ-almost everywhere


Two functions f and g are said to be equal µ-almost everywhere if the set of points such that f ̸= g
is negligible.
1.4.5 Properties of the integral
For some measure space (E, E, µ)

Proof: Insensitivity of the integral

f = g µ-almost everywhere ⇒ µf = µg
Proof: Positivity

f ≥ 0 → µf ≥ 0; moreover, for f ≥ 0, µf = 0 ⇔ f = 0 µ-almost everywhere
Proof: Monotonicity

f ≤ g → µf ≤ µg
Proof: Linearity

µ(af + bg) = a · µf + b · µg
Proof: Monotone Convergence

fn ↑ f → µfn ↑ µf

1.4.6 Functionals to measures


On a measurable space (E, E), for some functional L : E+ → [0, ∞], there exists a corresponding
measure µ such that:

L(f ) = µf
⇐⇒
1. f = 0 =⇒ L(f ) = 0
2. L(af + bg) = aL(f ) + bL(g)
3. fn ↑ f =⇒ L(fn ) ↑ L(f )

1.4.7 Densities
Consider measure space (E, E, µ). Establish a function p ∈ E+ . Consider then that for any A ∈ E the
mapping:

A ↦ ∫A p dµ

forms a measure v, which is said to have density p w.r.t. µ, denoted v = pµ and called the indefinite
integral of p.

Proof: pµ is a measure

This lets us do:

vf = µ(f · p)

That is:

∫ f dv = ∫ f p dµ

Proof:
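The identity vf = µ(f · p) can be checked on a discrete measure; the support D, mass m, density p and integrand f below are all illustrative assumptions:

```python
# µ: discrete measure on D with mass m; ν = pµ is the measure with density p.
D = [0, 1, 2, 3]                 # illustrative countable support
m = lambda x: 0.25               # µ puts mass 1/4 on each point
p = lambda x: x                  # an illustrative positive density

mu = lambda f: sum(m(x) * f(x) for x in D)          # µf
nu = lambda f: sum(p(x) * m(x) * f(x) for x in D)   # νf, with ν = pµ

f = lambda x: x + 1
lhs = nu(f)                          # νf
rhs = mu(lambda x: f(x) * p(x))      # µ(f·p)
```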

Absolute continuity: A measure v is said to be absolutely continuous w.r.t. measure µ, denoted v ≪ µ,
if:

µ(A) = 0 =⇒ v(A) = 0

It is called absolute continuity because, for finite measures, there is a relation to the definition of
absolute continuity of functions.

Radon-Nikodym theorem:

For measures v and µ on (E, E) such that µ is σ-finite and v ≪ µ, ∃p ∈ E+ such that v = pµ, and said
p is unique up to µ-almost everywhere equality.

Effectively, v vanishes wherever µ vanishes. What the above theorem states is that there is a positive
E-measurable function p, unique up to µ-almost everywhere equality, such that for any set A ∈ E:

v(A) = ∫A p dµ
A
Consider the familiar formula:

F ([a, b]) = ∫[a,b] f (x) dx

where dF/dx = f .

By the above's similarity to this, the function p is called the Radon-Nikodym derivative of v w.r.t. µ
and is denoted heuristically as:

dv/dµ = p

This is not a true derivative. It is rather just a measure of how much the v measure changes w.r.t. the
µ measure. The notation is also in support of the identity

∫ f dv = ∫ f p dµ

as

∫ f dv = ∫ f (dv/dµ) dµ

feels intuitive.
1.4.8 Image/pushforward measures and integrating w.r.t image measures
Consider a function f : E → F between measurable spaces (E, E) and (F, F) such that f is E/F-
measurable. Endow (E, E) with measure µ. Hence a measure v on (F, F), called the image measure,
can be established by defining:

v := µ ◦ f −1
That is for any set A ∈ F,

v(A) = µ(f −1 (A))


This works as f was defined to be E/F-measurable.

1.5 Integration w.r.t. image measure


For the same measurable spaces, consider the integral

∫F f dv

for pushforward measure v := µ ◦ h−1 , where h is some E/F-measurable function. It can be proven
that for every positive F-measurable function f ,

∫F f dv = ∫E f ◦ h dµ
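Both the pushforward definition and the integration formula can be verified on a finite discrete example (the measure, map h and ground sets are assumptions):

```python
# µ: discrete measure on E with mass m; h: E → F; pushforward ν = µ ∘ h^{-1}.
E = [0, 1, 2, 3]
m = lambda x: 0.25              # illustrative masses
h = lambda x: x % 2             # a measurable map into F = {0, 1}

def nu(A):
    """ν(A) = µ(h^{-1}(A))."""
    return sum(m(x) for x in E if h(x) in A)

def integral_nu(f):
    """∫_F f dν, computed directly over F = {0, 1}."""
    return sum(f(y) * nu({y}) for y in (0, 1))

def integral_mu(f):
    """∫_E f∘h dµ: the image-measure integration formula."""
    return sum(m(x) * f(h(x)) for x in E)
```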

1.5.1 Transition kernels


2 Probability
Random experiments: experiments whose results cannot be known in advance.

Modelled as a measure space (Ω, H, P). Ω is called the sample space. The σ-algebra H is called the
event hoard. P is a measure such that:

P : H → [0, 1]

With P(Ω) = 1 and P(∅) = 0.

Can specify the measure either by a direct formula for each A ∈ H, or on a π-system Π such that
H = σΠ.

2.1 Properties of P
Along with the basic properties of measures and finite measures, the below hold for any A, B and An
∈ H:

P(Ac ) = 1 − P(A)
Follows directly from P(Ω) = 1 and finite additivity.

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)


Follows from finite additivity.
∀n, An+1 ⊆ An =⇒ limn→∞ P(An ) = P (∩n An )

This is continuity from above. It is proved by taking complements, creating a sequence Acn that
increases to ∪n Acn . The property that:

limn→∞ P(Acn ) = P (∪n Acn )

is guaranteed by the continuity from below property of measures.

Now, consider that:

P(Acn ) = 1 − P(An )

limn→∞ P(Acn ) = 1 − limn→∞ P(An )

P (∪n Acn ) = 1 − limn→∞ P(An )

limn→∞ P(An ) = 1 − P (∪n Acn ) = P ((∪n Acn )c )

limn→∞ P(An ) = P (∩n An )

2.2 Random variables


A random variable X from (Ω, H, P) taking values in (E, E) is an H/E-measurable function.

2.2.1 Stochastic process


A stochastic process is a collection of random variables:

X = {Xt | t ∈ T}

If T is finite then X is called a random vector.

There are two views of a stochastic process:

One is that it is a distribution over sample paths, with sample space Ω = ×t∈T Et and hoard
H = ⊗t∈T Et .

The second is that a stochastic process is a sequence of random variables Xt taking values in (Et , Et ).

2.3 Distribution and density


2.3.1 Distribution
The distribution of a random variable X from (Ω, H, P) taking values in (E, E) is the pushforward
measure established on (E, E) by P under X, that is:

µ := P ◦ X −1
It is often denoted PX . For some set T ∈ E, the probability that X ∈ T is defined as:

P(X ∈ T ) := P ◦ X −1 (T )

2.3.2 Cumulative Distribution function


For some numerical random variable X : (Ω, H, P) → (R, B), the CDF F : R → [0, 1] is defined as:

F (x) := P(X ≤ x)

2.3.3 Probability Density Function


Now consider the measure space (E, E, v). The PDF or probability density function f ∈ E+ is the
Radon-Nikodym derivative:

f = dµ/dv

where µ is the distribution measure. That is:

µ(A) = ∫A f dv
By the RN theorem this exists only when µ ≪ v. Hence a density only sometimes exists. Taking more
familiar notation,

P(X ∈ A) = ∫A f dv

Often X would take values in (R, B, λ) where λ is the Lebesgue measure. Hence:

P(X ∈ A) = ∫A f dx
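The Lebesgue-density formula P(X ∈ A) = ∫A f dx can be approximated with a Riemann sum; here the Exp(1) density is an illustrative choice, and a midpoint rule stands in for the actual Lebesgue integral:

```python
import math

def prob_via_density(f, a, b, n=10_000):
    """Approximate P(X ∈ [a, b]) = ∫_[a,b] f dx with a midpoint Riemann sum."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

density = lambda x: math.exp(-x)           # Exp(1) density w.r.t. Lebesgue measure
p = prob_via_density(density, 0.0, 1.0)    # ≈ 1 - e^{-1}
```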

2.3.4 Discrete case


For random variable X taking values from (Ω, H, P) to (E, E), the distribution is called discrete if its
measure is discrete:

P(X ∈ A) = Σx∈D f (x)δx (A)

That is, for some countable D ∈ E, if x ∈ D and x ∈ A, mass f (x) is assigned to that point. Clearly
the distribution has density f w.r.t. the counting measure on D. The function f is called the probability
mass function rather than density function.

2.4 Independent Random Variables


Two random variables X and Y from (Ω, H, P) to (E, E) and (F, F) are called independent if for any
A ∈ E, B ∈ F:

P(X ∈ A, Y ∈ B) = P(X ∈ A) · P(Y ∈ B)

More generally, a collection of random variables {Xt }T taking values in (Et , Et ) is considered
independent if, for every finite subcollection t1 , . . . , tn and Ati ∈ Eti :

P(Xt1 ∈ At1 , . . . , Xtn ∈ Atn ) = Πni=1 P(Xti ∈ Ati )

For some independent {Xt }T taking values in (E, E, λ):

Marginal distribution: the individual distribution of each Xt . Denote these as µt and their densities
w.r.t. λ as ft .

Consider the product space ×T E. As each distribution is a measure, the joint distribution is the
product measure µ := ⊗T µt and its density w.r.t. ⊗T λ is:

f := ΠT ft

If these independent {Xt }T are also identically distributed, it is denoted:

{Xt }T ∼ Dist (i.i.d.)

or, in the case that the distributions have density f w.r.t. λ,

{Xt }T ∼ f (i.i.d.)

2.4.1 Constructing a probability space


Want to construct (Ω, H, P) and Xi for a sequence of fair coin tosses. That is,

X1 , X2 , X3 , . . . , Xn ∼ Ber(1/2) (i.i.d.)

To do so, let (Ω, H, P) = ([0, 1), B[0,1) , λ).

Now want Xi : ([0, 1), B[0,1) , λ) → {0, 1} such that λ(Xi−1 (A)) = Ber(A) for all A ∈ 2^{0,1}
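The notes leave Xi unspecified; one standard choice (an assumption here, not stated above) takes Xi(ω) to be the i-th binary digit of ω, and the Lebesgue measure of each preimage can then be computed exactly on a dyadic grid:

```python
from fractions import Fraction

def X(i, w):
    """X_i(ω): the i-th binary digit of ω ∈ [0, 1)."""
    return int(w * 2**i) % 2

def measure_of_preimage(i, value, n=4096):
    """λ(X_i^{-1}({value})), computed exactly: X_i is constant on each dyadic
    interval [k/2^i, (k+1)/2^i), so for i ≤ 12 a grid of 4096 cells resolves it."""
    return sum(Fraction(1, n) for k in range(n) if X(i, Fraction(k, n)) == value)
```

Each Xi lands on 0 and 1 with λ-measure 1/2, matching Ber(1/2), and distinct digits are independent.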

2.5 Expectation
For some numerical random variable X : (Ω, H, P) → (R, B), the expectation is defined as:

EX := PX

That is,

EX := ∫ X dP

Constructing this is exactly the same as constructing the integral of numerical measurable functions.
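EX = ∫ X dP can be estimated by sampling ω from ([0, 1), B, λ); a Monte Carlo sketch with an illustrative X(ω) = ω², whose true expectation is 1/3:

```python
import random

def expectation_mc(X, n=100_000, seed=0):
    """Monte Carlo estimate of EX = ∫ X dP, drawing ω uniformly from [0, 1)."""
    rng = random.Random(seed)
    return sum(X(rng.random()) for _ in range(n)) / n

X = lambda w: w * w   # illustrative random variable on ([0,1), B, λ); EX = 1/3
```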

2.5.1 Properties of E
Follows all the properties of the integral:

Proof: Monotonicity

X1 ≤ X2 → EX1 ≤ EX2
Proof: Linearity

E(aX1 + bX2 ) = a · EX1 + b · EX2


Proof: Monotone Convergence

Xn ↑ X → EXn ↑ EX
Fatou's lemma: For some sequence Xn of numerical random variables,

E lim inf Xn ≤ lim inf EXn

Proof:

Consider some Xn such that lim Xn = X exists. Fatou's lemma allows us to say the below:

Dominated convergence: (∀n, |Xn | ≤ Y ) ∧ (EY < ∞) =⇒ EXn → EX

Proof:

Bounded convergence: (∃c ∈ R s.t. ∀n, |Xn | ≤ c) =⇒ EXn → EX

Proof:

2.6 Expectation of a function of X:


For X : (Ω, H, P) → (E, E, v) with distribution

µ = P ◦ X −1

and E-measurable numerical function h, the below details how to calculate Eh(X).

First, use integration w.r.t. the image measure (section 1.5), with µ as the pushforward of P under X:

∫E h dµ = ∫Ω h ◦ X dP

That is,

Eh(X) = µh

Effectively, the expected value of a function of a random variable is the integral of that function w.r.t.
the probability distribution, or w.r.t. its density if one exists.

This also generalises to random vectors.

Consider some n-dimensional real (R^n , B^n ) valued random vector X. Suppose it has density fX
w.r.t. the Lebesgue measure. Consider invertible g : R^n → R^n . Define random variable Z := g(X).
Hence, the probability density of Z w.r.t. the Lebesgue measure is:

fZ = (fX ◦ g −1 ) / |det(∂Z/∂X)|, evaluated at g −1

By applying invertibility,

fZ = (fX ◦ g −1 ) · |det(∂X/∂Z)|
2.7 Integral transforms
The MGF of a numerical random variable X : (Ω, H, P) → (R, B) with distribution µ := P ◦ X −1 is an
integral transform defined as:

M (s) := EesX = ∫ esX dP

Supposing derivatives can be passed under the integral,

M (k) (0) = EX k

This simplifies working with sums of independent Xi :

MΣi Xi (s) = Πi MXi (s)
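The moment identity M⁽ᵏ⁾(0) = EXᵏ can be checked numerically for a discrete distribution; the Ber(1/2) example and finite-difference step are illustrative choices:

```python
import math

def mgf(s, D, m):
    """M(s) = E e^{sX} for a discrete distribution with pmf m on support D."""
    return sum(m(x) * math.exp(s * x) for x in D)

# Illustrative: Ber(1/2), so M(s) = (1 + e^s)/2 and M'(0) = EX = 1/2.
D, m = (0, 1), lambda x: 0.5
ds = 1e-5
deriv_at_0 = (mgf(ds, D, m) - mgf(-ds, D, m)) / (2 * ds)  # central difference ≈ EX
```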

2.8 Information and independence


The σ-algebra generated by random variable X taking values from (Ω, H, P) to (E, E) is the set
σX := {X −1 (A) | A ∈ E}.

Proof: σX is a σ-algebra

The σ-algebra generated by stochastic process {Xt }T is the smallest σ-algebra containing the union
of the individual σXt , denoted

∨T σXt

Proof?:

Characterizing σX-measurable functions, with X taking values in (E, E):

f ∈ σX ⇐⇒ f = h ◦ X for some h ∈ E
