Professional Documents
Culture Documents
Elements of Probability Theory
Elements of Probability Theory
Elements of Probability Theory
density function (pdf ) f (x). The nonnegative function f (x) is such that or joint probability density function, f (x, y). The nonnegative function
Z x2 f (x, y) is such that
P (x1 ≤ X ≤ x2 ) = f (x)dx. Z x2 Z y2
x1
P (x1 ≤ X ≤ x2 , y1 ≤ Y ≤ y2 ) = f (x, y)dydx
x1 y1
defines the probability that X takes a value in the interval [x1 , x2 ]. Note
that there is no chance that X takes exactly the value x, P (X = x) = 0. defines the probability that X and Y take values in the interval [x1 , x2 ]
The probability that X takes any value on the real line is and [y1 , y2 ], respectively. Note that
Z ∞ Z ∞Z ∞
f (x)dx = 1. f (x, y)dydx = 1.
−∞ −∞ −∞
The distribution of a univariate random variable X is alternatively The marginal density function or marginal probability density function
described by the cumulative distribution function (cdf ) is given by Z ∞
f (x) = f (x, y)dy
F (x) = P (X < x). −∞
such that
The cdf of a discrete random variable X is Z x2
X X P (x1 ≤ X ≤ x2 ) = P (x1 ≤ X ≤ x2 , −∞ ≤ Y ≤ ∞) = f (x)dx.
F (x) = P (X = xk ) = pk . x1
xk ≤x xk ≤x
The conditional density function or conditional probability density func-
and of a continuous random variable X tion with respect to the event {Y = y} is given by
Z x
F (x) = f (t)dt f (x, y)
f (y|x) =
−∞ f (x)
F (x) has the following properties: provided that f (x) > 0. Note that
Z ∞
• F (x) is monotonically nondecreasing
f (y|x)dy = 1.
−∞
• F (−∞) = 0 and F (∞) = 1.
Two random variables X and Y are called independent, if and only if
• F (x) is continuous to the left
f (x, y) = f (x) · f (y)
1.2 Bivariate Random Variables and Distributions
If X and Y are independent, then:
A bivariate continuous random variable is a variable that takes a contin-
• f (y|x) = f (y)
uum of values in the plane. The distribution of a bivariate continuous
random variable (X, Y ) can be characterized by a joint density function • P (x1 ≤ X ≤ x2 , y1 ≤ Y ≤ y2 ) = P (x1 ≤ X ≤ x2 ) · P (y1 ≤ Y ≤ y2 )
5 Short Guides to Microeconometrics Elements of Probability Theory 6
More generally, if a finite set of n continuous random variables X1 , X2 , X3 , ..., Xn The following rules hold in general, i.e. for discrete, continuous and
are mutually independent, then mixed types of random variables:
The correlation coefficient between two random variables X and Y is The law of iterated means or law of iterated expecations holds in gen-
defined as: eral, i.e. for discrete, continuous or mixed random variables:
Cov[X, Y ]
ρX,Y =
σX σY
EX E[Y |X] = E[Y ].
where σX and σY denote the corresponding standard deviations. The
correlation coefficient has the following property: The conditional variance of Y given X is given by
• −1 ≤ ρX,Y ≤ 1 2
V[Y |X] = E (Y − E[Y |X])2 X = E Y 2 X − (E[Y |X]) .
The following rule holds:
• ραX+γ,βY +µ = ρX,Y The law of total variance is
• ρX,Y = 0
if X and Y are independent V[Y ] = EX V [Y |X] + VX E[Y |X] .
where α ∈ R and β ∈ R are constants.
9 Short Guides to Microeconometrics Elements of Probability Theory 10
In this section we denote matrices (random or non-random) by bold cap- • V[a0 x] = a0 V[x] a
ital letters, e.g. X and vectors by small letters, e.g. x.
• V[Ax] = A V[x] A0
Let x = (x1 , . . . , xn )0 be a (n × 1)-dimensional vector such that each
element xi is a random variable. Let X be a (n × k)-dimensional matrix where the (m × n) dimensional matrix A with m ≤ n has full row rank.
such that each element xij is a random variable. Let a = (a1 , . . . , an )0 If the variance-Covariance matrix V[x] is positive definite (p.d.) then
be a n × 1-dimensional vector of constants and A a (m × n) matrix of all random elements and all linear combinations of its random elements
constants. have strictly positive variance:
The expectation of a random vector, E[x), and of a random matrix,
V [a0 x] = a0 V [x] a > 0 for all a 6= 0.
E[X), summarize the expected values of its elements, respectively:
E[x1 ] E[x11 ] E[x12 ] . . . E[x1k ] 4 Important Distributions
E[x2 ] E[x21 ] E[x22 ] . . . E[x2k ]
E[x] = . and E[X] = .
.. .. .. .
.. .. . . . 4.1 Univariate Normal Distribution
E[xn ] E[xn1 ] E[xn2 ] . . . E[xnk ] The density of the univariate normal distribution is given by:
The following rules hold: 1 1 x−µ 2
f (x) = √ e− 2 ( σ ) .
σ 2π
• E[a0 x] = a0 E[x]
The normal distribution is characterized by the two parameters µ and
• E[Ax] = AE[x]
σ. The mean of the normal distribution is E[X] = µ and the variance
• E[AX] = AE[X] V[X] = σ 2 . We write X ∼ N(µ, σ 2 ).
The univariate normal distribution with mean µ = 0 and variance
• E[tr(X)] = tr(E[X]) for X a quadratic matrix 2
σ = 1 is called the standard normal distribution N(0, 1).
The variance-Covariance matrix of a random vector, V(x), summarizes
all variances and Covariances of its elements: 4.2 Bivariate Normal Distribution
0 0
V[x] = E (x − E[x]) (x − E[x]) = E[xx0 ] − (E[x])([E[x]) The density of the bivariate normal distribution is
V[x1 ] Cov[x1 , x2 ] . . . Cov[x1 , xn ] 1
f (x, y) = p
Cov[x2 , x1 ] V[x2 ] . . . Cov[x2 , xn ] 2πσX σY 1 − ρ2
= .. .. .. .. . ( " 2 2 #)
.
x − µX y − µY x − µX y − µY
. . . 1
exp − − 2ρ
+ .
Cov[xn , x1 ] Cov[xn , x2 ] . . . V[xn ] 2(1 − ρ2 ) σX σY σX σY
11 Short Guides to Microeconometrics Elements of Probability Theory 12
If (X, Y ) follows a bivariate normal distribution, then: A n-dimensional random variable x is multivariate normally distributed
with mean µ and variance-Covariance matrix Σ, x ∼ N(µ, Σ) if its density
• The marginal densities f (x) and f (y) are univariate normal.
is:
• The conditional densities f (x|y) and f (y|x) are univariate normal.
−n/2 −1/2 1 0 −1
f (x) = (2π) (det Σ) exp − (x − µ) Σ (x − µ) .
2
• E[X] = µX , V[X] = σX
2
, E[Y ] = µY , V[Y ] = σY2 .
Let x ∼ N(µ, Σ) and A a (m × n) matrix with m ≤ n and m linearly
• The correlation coefficient between X and Y is ρX,Y = ρ.
independent rows then we have
• E[Y |X] = µY + ρ σσX
Y
(X − µX ) and V[Y |X] = σY2 2
(1 − ρ ).
Ax ∼ N(Aµ, AΣA0 ).
The above properties characterize the normal distribution. It is the only
distribution with all these properties. References
Further important properties:
Amemiya, Takeshi (1994), Introduction to Statistics and Econometrics,
• If (X, Y ) follows a bivariate normal distribution, then aX + bY is Cambridge: Harvard University Press.
also normally distributed: Hayashi, Fumio (2000), Econometrics, Princeton: Princeton University
2 Press. Appendix A.
aX + bY ∼ N(µX + µY , σX + σY2 + 2ρσX σY ).
Stock, James H. and Mark W. Watson (2020), Introduction to Economet-
The reverse implication is not true. rics, 4th Global ed., Pearson. Chapter 2.