SSP 1.1 - Stochastic 1

IN5340 / IN9340 Lecture 1
Random variables, vectors and sequences

Roy Edgar Hansen
January 2022

Outline

1 Introduction
2 Random numbers
  - Probability
  - Ensemble averages
  - Moments
  - Useful random variables
3 Random vectors
  - Joint distribution
  - Sum of random variables
  - Central limit theorem
  - Joint moments
4 Summary
Probability distribution function

Also called cumulative distribution function (CDF) or probability distribution. This gives

F_x(\alpha) = \int_{-\infty}^{\alpha} f_x(u)\,du

Probability density function

The relative likelihood for a random variable to take on a given value. The probability of the random variable falling within a particular range of values is given by the integral of the variable's density over that range.

Example: Coin flipping

\Pr\{x = -1\} = 0.5 (Tails)
\Pr\{x = +1\} = 0.5 (Heads)

CDF:

F_x(\alpha) = \begin{cases} 0 & \alpha < -1 \\ 0.5 & -1 \le \alpha < 1 \\ 1 & 1 \le \alpha \end{cases}
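As a quick illustration (not part of the slides), here is a minimal Python/NumPy sketch of this piecewise CDF; the function name coin_cdf is ours:

```python
import numpy as np

def coin_cdf(alpha):
    """Piecewise CDF of the fair-coin random variable x taking values -1 or +1."""
    alpha = np.asarray(alpha, dtype=float)
    return np.where(alpha < -1, 0.0, np.where(alpha < 1, 0.5, 1.0))

# F jumps by Pr{x = -1} = 0.5 at alpha = -1 and by Pr{x = +1} = 0.5 at alpha = +1
print(coin_cdf([-2.0, -1.0, 0.0, 1.0, 3.0]))  # [0.  0.5 0.5 1.  1. ]
```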
Expectation

Expectation:
- The expected value of x
- The mathematical expectation of x
- The statistical average of x
- The mean value of x

Expected value of a discrete random variable:

E\{x\} = \sum_k \alpha_k \Pr\{x = \alpha_k\}

Expected value of a function of a random variable

Assume a random variable x with a known PDF f_x. The expectation of any function g(x) becomes

E\{g(x)\} = \int_{-\infty}^{\infty} g(\alpha) f_x(\alpha)\,d\alpha

Sometimes called the law of the unconscious statistician (LOTUS). Excellent description on wikipedia.org.
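A small Python/NumPy sketch (ours, not from the lecture) of both formulas for the coin-flip variable, with a Monte-Carlo cross-check:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fair coin: x takes the value -1 (tails) or +1 (heads) with probability 0.5 each
values = np.array([-1.0, 1.0])
probs = np.array([0.5, 0.5])

# Discrete expectation: E{x} = sum_k alpha_k * Pr{x = alpha_k}
E_x = np.sum(values * probs)            # exact: 0.0

# LOTUS with g(x) = x**2: E{g(x)} = sum_k g(alpha_k) * Pr{x = alpha_k}
E_x2 = np.sum(values**2 * probs)        # exact: 1.0

# Sample averages converge to the expectations
samples = rng.choice(values, size=100_000, p=probs)
print(E_x, samples.mean())              # ~0.0
print(E_x2, (samples**2).mean())        # ~1.0
```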
Moments

m_1 is the mean value, also denoted \bar{x} or m_x.
m_2 is also an important statistical average, referred to as the mean squared value.

The second central moment \mu_2 is a very important statistical average referred to as the variance. The square root of the variance is called the standard deviation of the random variable:

\sigma_x = \sqrt{\mu_2}
Central moments cont.

The third central moment is a measure of the asymmetry of the probability density function f_x(\alpha), called the skew:

\mu_3 = E\{(x - \bar{x})^3\}

The normalized third central moment is known as the skewness of the density function:

\gamma_1 = \mu_3 / \sigma_x^3

The fourth central moment,

\mu_4 = E\{(x - \bar{x})^4\}

normalized as \mu_4 / \sigma_x^4 and known as the kurtosis, is a measure of the heaviness of the tail of the distribution.

Uniform distribution

Equal probability of all values within bounds. Matlab function rand.

Probability density function:

f_x(\alpha) = \begin{cases} 0 & \alpha < a \\ 1/(b-a) & a \le \alpha \le b \\ 0 & \alpha > b \end{cases}
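The slide points to Matlab's rand; here is the equivalent check in Python/NumPy (our sketch). The theoretical values quoted in the comments are standard results for the uniform distribution, not stated on the slide: mean (a+b)/2, variance (b-a)^2/12, skewness 0, normalized fourth moment 9/5.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 5.0

# rng.random() draws from uniform(0, 1), like Matlab's rand; rescale to [a, b]
x = a + (b - a) * rng.random(1_000_000)

mean = x.mean()                               # theory: (a + b) / 2 = 3.5
mu2 = np.mean((x - mean)**2)                  # variance, theory: (b - a)**2 / 12 = 0.75
sigma = np.sqrt(mu2)                          # standard deviation
gamma1 = np.mean((x - mean)**3) / sigma**3    # skewness, theory: 0 (symmetric PDF)
mu4_norm = np.mean((x - mean)**4) / sigma**4  # normalized 4th moment, theory: 1.8

print(mean, mu2, gamma1, mu4_norm)
```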
Random vectors

Assume two random variables X and Y defined on a sample space S, with the specific values x and y. Any ordered pair (x, y) may be considered a random point in the xy-plane, or a vector random variable, or a random vector.

Note: Although the random vector can contain several random variables, we first consider the two-element case.

Joint probability distribution function

Consider the events A = \{X \le \alpha\} and B = \{Y \le \beta\}. The probability distribution functions are

F_x(\alpha) = \Pr\{X \le \alpha\}
F_y(\beta) = \Pr\{Y \le \beta\}

New concept: the probability of the joint event \{X \le \alpha, Y \le \beta\} is described by a joint probability distribution function

F_{x,y}(\alpha, \beta) = \Pr\{X \le \alpha, Y \le \beta\}

In Norwegian: simultan.
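A joint distribution function can be estimated directly from samples as a relative frequency. A minimal Python/NumPy sketch (ours; the choice of two independent standard normals is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two independent standard normal variables, for illustration only
X = rng.standard_normal(n)
Y = rng.standard_normal(n)

def joint_cdf(alpha, beta):
    """Empirical F_xy(alpha, beta): relative frequency of {X <= alpha, Y <= beta}."""
    return np.mean((X <= alpha) & (Y <= beta))

# For this independent pair the joint CDF factors: F_xy = F_x * F_y
print(joint_cdf(0.0, 0.0))                    # ~0.25 = 0.5 * 0.5
print(np.mean(X <= 0.0) * np.mean(Y <= 0.0))  # ~0.25
```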
Example conditional probability: coin flipping with uneven coins

Consider three different coins:
1. a coin with equal probability of H and T
2. a coin which is more likely to produce H
3. a coin which is more likely to produce T

Flip the first coin (random variable X).
If the first coin results in H, then flip coin 2 (random variable Y).
If the first coin results in T, then flip coin 3 (random variable Y).

The random variable Y is then statistically dependent on the random variable X. We specifically see that if X is H, it is more likely that Y also is H, and vice versa.

Joint probability density function

The joint probability density function (PDF) is

f_{x,y}(\alpha, \beta) = \frac{\partial^2 F_{x,y}(\alpha, \beta)}{\partial \alpha\, \partial \beta}

and the joint cumulative distribution function (CDF) is

F_{x,y}(\alpha, \beta) = \int_{-\infty}^{\alpha} \int_{-\infty}^{\beta} f_{x,y}(u, v)\,dv\,du
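The dependence in the uneven-coin example is easy to see by simulation. A Python/NumPy sketch (ours); the slide gives no numbers, so the biases 0.8 and 0.2 are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed biases (each value is Pr{heads}): coin 1 fair,
# coin 2 favours heads, coin 3 favours tails
p1, p2, p3 = 0.5, 0.8, 0.2

X = rng.random(n) < p1                  # flip coin 1 (True = heads)
# flip coin 2 where X is heads, coin 3 where X is tails
Y = rng.random(n) < np.where(X, p2, p3)

# Conditional relative frequencies reveal the dependence
print("Pr{Y=H | X=H} ~", Y[X].mean())   # ~0.8
print("Pr{Y=H | X=T} ~", Y[~X].mean())  # ~0.2
print("Pr{Y=H}       ~", Y.mean())      # ~0.5
```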
Properties of the joint density

f_{x,y}(\alpha, \beta) \ge 0

\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{x,y}(\alpha, \beta)\,d\alpha\,d\beta = 1

The marginal densities follow by integrating out one variable:

f_x(\alpha) = \int_{-\infty}^{\infty} f_{x,y}(\alpha, v)\,dv

f_y(\beta) = \int_{-\infty}^{\infty} f_{x,y}(u, \beta)\,du

Statistical independence

Two random variables X and Y are said to be statistically independent if (and only if)

F_{x,y}(\alpha, \beta) = F_x(\alpha) F_y(\beta)

From the definition of the density function, this gives

f_{x,y}(\alpha, \beta) = f_x(\alpha) f_y(\beta)

Statistical independence: the occurrence of one random event does not affect the probability of occurrence of the other random event.
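A numerical sanity check of the marginalization, sketched in Python/NumPy (ours). The product-of-Gaussians joint PDF is assumed purely for illustration; the slide states the marginal integrals in general:

```python
import numpy as np

# Grid over the (alpha, beta) plane
alpha = np.linspace(-6.0, 6.0, 601)
beta = np.linspace(-6.0, 6.0, 601)
da, db = alpha[1] - alpha[0], beta[1] - beta[0]
A, B = np.meshgrid(alpha, beta, indexing="ij")

gauss = lambda t: np.exp(-t**2 / 2.0) / np.sqrt(2.0 * np.pi)
f_xy = gauss(A) * gauss(B)        # joint PDF sampled on the grid

# Marginal f_x(alpha): integrate the joint density over beta (Riemann sum)
f_x = f_xy.sum(axis=1) * db

print(np.max(np.abs(f_x - gauss(alpha))))  # ~0: matches the true marginal
print(f_xy.sum() * da * db)                # ~1: total probability
```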
Sums of independent random variables

Let Z = X + Y. Finding the CDF is often a well defined problem, while finding the PDF might be more difficult. The CDF of Z is the probability of the event \{X + Y \le \zeta\}:

F_z(\zeta) = \int_{\alpha=-\infty}^{\infty} \int_{\beta=-\infty}^{\zeta-\alpha} f_{x,y}(\alpha, \beta)\,d\beta\,d\alpha

Material inspired by John Buck from UMass Dartmouth on YouTube and Introduction to Probability by Grinstead and Snell.
Sums of independent random variables 3

The PDF is the derivative:

f_z(\zeta) = \frac{dF_z(\zeta)}{d\zeta} = \frac{d}{d\zeta} \int_{\alpha=-\infty}^{\infty} \left( \int_{\beta=-\infty}^{\zeta-\alpha} f_{x,y}(\alpha, \beta)\,d\beta \right) d\alpha

In order to solve this, we turn to calculus and find the Leibniz rule:

\frac{d}{dx} \int_{a(x)}^{b(x)} f(x, t)\,dt = f(x, b(x)) \cdot \frac{d}{dx} b(x) - f(x, a(x)) \cdot \frac{d}{dx} a(x) + \int_{a(x)}^{b(x)} \frac{\partial}{\partial x} f(x, t)\,dt

See Wikipedia or a suitable book in calculus.

Sums of independent random variables 4

Insert into the Leibniz rule, with a(\zeta) = -\infty and b(\zeta) = \zeta - \alpha, and consider each term:

\frac{d}{d\zeta} a(\zeta) = \frac{d}{d\zeta}(-\infty) = 0

\frac{d}{d\zeta} b(\zeta) = \frac{d}{d\zeta}(\zeta - \alpha) = 1

\frac{\partial}{\partial \zeta} f_{x,y}(\alpha, \beta) = 0

Hence, only the upper limit contributes. From Leibniz:

f(x, b(x)) \cdot \frac{d}{dx} b(x) = f_{x,y}(\alpha, \zeta - \alpha) \cdot 1

which gives

\frac{d}{d\zeta} \int_{\beta=-\infty}^{\zeta-\alpha} f_{x,y}(\alpha, \beta)\,d\beta = f_{x,y}(\alpha, \zeta - \alpha)
Sums of independent random variables 5

The PDF of Z then becomes

f_z(\zeta) = \int_{\alpha=-\infty}^{\infty} f_{x,y}(\alpha, \zeta - \alpha)\,d\alpha

If X and Y are statistically independent, the joint PDF is the product of the marginals, f_{x,y}(\alpha, \beta) = f_x(\alpha) f_y(\beta), so

f_z(\zeta) = \int_{\alpha=-\infty}^{\infty} f_x(\alpha) f_y(\zeta - \alpha)\,d\alpha

This is recognized as a convolution: f_z(\zeta) = f_x(\zeta) * f_y(\zeta).

The sum again using the characteristic function

Let W be a random variable equal to the sum of two statistically independent random variables X and Y:

W = X + Y

Consider the characteristic function, defined as

\Phi_x(\omega) = E\{e^{j\omega x}\} = \int_{-\infty}^{\infty} f_x(u) e^{j\omega u}\,du

Recall the expected value of a function of a random variable, E\{g(x)\} = \int_{-\infty}^{\infty} g(\alpha) f_x(\alpha)\,d\alpha, here applied with g(x) = e^{j\omega x}.
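The convolution result is easy to verify numerically. A Python/NumPy sketch (ours): for two independent uniform(0, 1) variables, f_z is the well-known triangular density on [0, 2]:

```python
import numpy as np

# PDFs of two independent uniform(0, 1) variables, sampled on a grid
d = 0.001
u = np.arange(0.0, 1.0, d)
f_x = np.ones_like(u)   # f_x = 1 on [0, 1)
f_y = np.ones_like(u)   # f_y = 1 on [0, 1)

# f_z = f_x * f_y (convolution); the grid spacing d plays the role of d(alpha)
f_z = np.convolve(f_x, f_y) * d
zeta = np.arange(len(f_z)) * d

print(f_z.max())        # ~1.0: triangle peak f_z(1) = 1
print(f_z.sum() * d)    # ~1.0: total probability

# Monte-Carlo cross-check of the triangular shape
rng = np.random.default_rng(0)
z = rng.random(100_000) + rng.random(100_000)
print(np.histogram(z, bins=50, density=True)[0].max())  # ~1.0, near zeta = 1
```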
The sum again using the characteristic function 2

The characteristic function of W becomes

\Phi_w(\omega) = E\{e^{j\omega(x+y)}\} = E\{e^{j\omega x}\} E\{e^{j\omega y}\} = \Phi_x(\omega) \Phi_y(\omega)

where the factorization follows from the statistical independence of X and Y. From the convolution property of the Fourier transform,

f_w(\alpha) = f_x(\alpha) * f_y(\alpha) = \int_{-\infty}^{\infty} f_x(u) f_y(\alpha - u)\,du

See any suitable textbook or Wikipedia for the properties of the Fourier transform.

Central limit theorem

The central limit theorem, loosely defined, states that the sum of a large number of independent random variables tends toward a normal distribution, regardless of the distributions of the individual variables. An equivalent statement holds for the probability density functions: repeated convolution of PDFs approaches a Gaussian shape. Variants and proof of the classical CLT can be found on Wikipedia (using the characteristic function). Test on the computer/blackboard.
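Following the slide's suggestion to test this on the computer, a minimal Python/NumPy sketch (ours): sum n uniform variables, standardize, and compare the empirical density with the standard normal PDF. The standardization uses the standard facts that uniform(0, 1) has mean 1/2 and variance 1/12:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sum n independent uniform(0, 1) variables per trial
n, trials = 30, 100_000
s = rng.random((trials, n)).sum(axis=1)

# Standardize to zero mean, unit variance: sum has mean n/2, variance n/12
s = (s - n * 0.5) / np.sqrt(n / 12.0)

# Compare the empirical density with the standard normal PDF
hist, edges = np.histogram(s, bins=60, range=(-4, 4), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
normal = np.exp(-centers**2 / 2.0) / np.sqrt(2.0 * np.pi)
print(np.max(np.abs(hist - normal)))  # small: the standardized sum is close to Gaussian
```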
Correlation

The second-order moment m_{11} is called the correlation:

R_{xy} = E\{xy\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \alpha \beta\, f_{x,y}(\alpha, \beta)\,d\alpha\,d\beta

If the correlation can be written

R_{xy} = E\{x\} E\{y\}

where \bar{x} = E\{x\} and \bar{y} = E\{y\}, then x and y are said to be uncorrelated.

Statistically independent random variables are uncorrelated; this is easy to prove by inserting f_{x,y}(\alpha, \beta) = f_x(\alpha) f_y(\beta). The converse does not hold: uncorrelated random variables are not necessarily independent.

If the correlation is R_{xy} = 0, the two random variables are said to be orthogonal.

Joint central moments

The joint central moments are defined as

\mu_{nk} = E\{(x - \bar{x})^n (y - \bar{y})^k\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (\alpha - \bar{x})^n (\beta - \bar{y})^k f_{x,y}(\alpha, \beta)\,d\alpha\,d\beta

The second-order moments

\mu_{20} = E\{(x - \bar{x})^2\} = \sigma_x^2
\mu_{02} = E\{(y - \bar{y})^2\} = \sigma_y^2

are simply the variances of the random variables x and y.
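The classic counterexample for "uncorrelated but not independent" (ours, not from the slides) is a symmetric x together with y = x^2, sketched here in Python/NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# x symmetric around zero, y = x**2: y is completely determined by x
# (strongly dependent), yet x and y are uncorrelated
x = rng.standard_normal(1_000_000)
y = x**2

R_xy = np.mean(x * y)                 # E{xy} = E{x**3} = 0 by symmetry
print(R_xy, np.mean(x) * np.mean(y))  # both ~0: R_xy = E{x}E{y}, uncorrelated
```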
Summary of lecture 1

- random variable, random vector
- probability, joint probability
- probability distribution, statistical dependence
- probability density function, characteristic function
- expectation, central limit theorem
- moments, joint moments
- central moments, correlation and covariance
- mean and variance, uncorrelated and orthogonal
- standard deviation, correlation coefficient
- normal/uniform distribution

In Norwegian

English             Norwegian
probability         sannsynlighet
distribution        fordeling
density             tetthet
expectation         forventning
standard deviation  standardavvik
variance            varians
joint               simultan
correlation         korrelasjon
covariance          kovarians

https://folk.ntnu.no/bakke/ordliste.pdf