Arun K. Tangirala
Random Variable
Informal definition
A random variable (RV) is one whose value set contains at least two elements, i.e., it
draws one value from at least two possibilities. The space of possible values is known as
the outcome space or sample space.
2. Derivative of the c.d.f.: f(x) = dF(x)/dx
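As a quick numerical check (a sketch using base R's pnorm and dnorm for the standard Gaussian), a finite-difference derivative of the c.d.f. recovers the density:

    # differentiate the Gaussian c.d.f. numerically and compare with the density
    x  <- seq(-4, 4, by = 1e-3)
    dF <- diff(pnorm(x)) / diff(x)        # finite-difference derivative of F(x)
    xm <- (x[-1] + x[-length(x)]) / 2     # midpoints where dF is evaluated
    max(abs(dF - dnorm(xm)))              # ~1e-8: f(x) = dF(x)/dx holds numerically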
The type of distribution for a random phenomenon depends on its nature.

[Figure: cumulative distribution functions F(x) (top row) and the corresponding density/mass functions (bottom row) for three cases: Gaussian density, Chi-square (df = 10) density and Binomial mass (n = 10, p = 0.5).]
It turns out that for linear processes, prediction of random signals and estimation of model parameters require only the knowledge of the mean, variance and covariance (for the multivariate case), i.e., it is sufficient to know the first- and second-order moments of the p.d.f.
Variance
The variance of a RV measures the average spread of outcomes around its mean,
σ²_X = E[(X − µ_X)²] = ∫_{−∞}^{∞} (x − µ_X)² f(x) dx    (3)
Note: The integrations in (2) and (3) are across the outcome space and NOT across time.
Remarks
- Prediction perspective: the mean is the best prediction X̂ of the RV X in the minimum mean square error sense, i.e.,

      µ_X = arg min_c E[(X − X̂)²]  s.t.  X̂ = c
- Applying the expectation operator E to a RV or its function produces its “average” or expected value:

      E(g(X)) = ∫_{−∞}^{∞} g(x) f(x) dx    (4)
- As (3) suggests, σ²_X is the second central moment of f(x). Further,

      σ²_X = E(X²) − µ²_X    (5)

  (Both this identity and the prediction remark above are checked numerically in the sketch below.)
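A small simulation (a sketch in base R) illustrates both remarks: the empirical mean square error E[(X − c)²] is minimized at c = µ_X, and the variance equals E(X²) − µ²_X:

    set.seed(1)
    x   <- rnorm(1e5, mean = 2, sd = 3)           # sample of X ~ N(2, 9)
    mse <- function(c) mean((x - c)^2)            # empirical E[(X - c)^2]
    optimize(mse, interval = c(-10, 10))$minimum  # ~2, the mean is the best constant predictor
    mean(x^2) - mean(x)^2                         # ~9, matching (5)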
Associated with the joint probability density function f(x, y) of two RVs X and Y, we can ask two questions:
Marginal density
The marginal density of a RV X (resp. Y) with respect to another RV Y (resp. X) is given by

    f_X(x) = ∫_{−∞}^{∞} f(x, y) dy    likewise,    f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx    (6)
Conditional density
The conditional density of Y given X = x (strictly, between x and x + dx) is
f_{Y|X=x}(y) = f(x, y) / f_X(x)    (7)
The conditional expectation E(Y | X = x) is the best predictor of Y given X among all predictors, in the minimum mean square prediction error sense (frequentist school).
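As a standard illustration (the bivariate Gaussian case, stated here without proof), the conditional density of Y given X = x is again Gaussian,

    f_{Y|X=x}(y) = N( µ_Y + (σ_XY/σ²_X)(x − µ_X),  σ²_Y − σ²_XY/σ²_X )

where σ_XY is the covariance defined next. The best predictor E(Y | X = x) is therefore linear in x, which is one reason linear predictors suffice for Gaussian processes.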
Covariance
The statistic that measures the co-variation of two RVs is the second-order central moment of the joint p.d.f.:

    σ_XY = E[(X − µ_X)(Y − µ_Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µ_X)(y − µ_Y) f(x, y) dx dy    (11)
The correlation coefficient is the covariance normalized by the individual standard deviations:

    ρ_XY = σ_XY / (σ_X σ_Y)    (14)
Boundedness
For all bivariate distributions with finite second-order moments,

    |ρ_XY| ≤ 1    (15)

with equality (unity correlation) if, with probability 1, there is a linear relationship between X and Y.
Denote the optimal estimates of the slopes in Ŷ = bX and X̂ = b̃Y, i.e., those that minimize E(ε²) = E[(Y − Ŷ)²] and E(ε²) = E[(X − X̂)²], as b⋆ and b̃⋆, respectively. Then,

    ρ²_XY = b⋆ b̃⋆    where    b⋆ = σ_XY / σ²_X ;  b̃⋆ = σ_XY / σ²_Y    (18)
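Relation (18) is easy to verify numerically; in this sketch using base R's lm and cor (the data-generating model is illustrative), the product of the two regression slopes equals the squared correlation:

    set.seed(2)
    x  <- rnorm(500); y <- 0.8 * x + rnorm(500)  # a correlated pair
    b  <- coef(lm(y ~ x))[2]                     # slope of y on x: s_xy / s_x^2
    bt <- coef(lm(x ~ y))[2]                     # slope of x on y: s_xy / s_y^2
    c(b * bt, cor(x, y)^2)                       # identical, as (18) asserts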
Remarks, limitations, . . .
- High values of estimated correlation may indicate a linear relationship over the experimental conditions, even when the relationship is non-linear over a wider range.
- Absence of correlation only implies that no linear model can be fit. Even if the true relationship is linear, noise can be high, thus limiting the ability of correlation to detect linearity.
- Correlation measures only linear dependencies. Zero correlation merely means E(XY) = E(X)E(Y). In contrast, independence implies f(x, y) = f(x)f(y).

Despite its limitations, correlation and its variants remain among the most widely used measures in data analysis.
Partial correlation
The partial correlation between X and Y given Z measures their direct link after removing the influence of Z:

    ρ_{XY·Z} = corr( X − X̂⋆(Z), Y − Ŷ⋆(Z) )    (19)

where X̂⋆(Z) and Ŷ⋆(Z) are the optimal predictions of X and Y using Z.

For example, let X = 2Z + 3W and Y = Z + V, where W, V and Z are mutually independent zero-mean RVs. Then σ_XY = 2σ²_Z ≠ 0, implying X and Y are correlated. However, applying (19), it is easy (with a bit of effort) to see that

    ρ_{YX·Z} = 0
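To spell out the step (a short derivation under the example model above): the optimal predictors are X̂⋆(Z) = 2Z and Ŷ⋆(Z) = Z, so the residuals X − X̂⋆(Z) = 3W and Y − Ŷ⋆(Z) = V are independent by construction; hence ρ_{YX·Z} = corr(3W, V) = 0 even though ρ_XY ≠ 0.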
- In time-series analysis, partial correlations are used in devising what are known as partial auto-correlation functions, which are used in determining the order of auto-regressive models.
- The partial cross-correlation function and its frequency-domain counterpart, known as the partial coherency function, find good use in time-delay estimation.
- In model-based control, partial correlations are used in quantifying the impact of model-plant mismatch in model-predictive control applications.
Semi-partial correlation
It examines the correlation between X1 and a residualized X2 (i.e., X2 with the influence of X3 removed).
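A minimal sketch in base R (the variables x1, x2, x3 and the data-generating model are illustrative): residualize x2 on x3 with lm, then correlate x1 with the residuals.

    set.seed(3)
    x3 <- rnorm(300); x2 <- x3 + rnorm(300); x1 <- x2 + rnorm(300)
    r23 <- resid(lm(x2 ~ x3))  # part of x2 unexplained by x3
    cor(x1, r23)               # semi-partial correlation of x1 with x2 (w.r.t. x3)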
Commands             Utility
mean, var, sd        sample mean, variance and standard deviation
colMeans, rowMeans   means of columns and rows
median, mad          sample median and median absolute deviation
cov, cor, cov2cor    covariance, correlation and covariance-to-correlation conversion
lm, summary          linear regression and summary of fit
- pcor: computes the partial correlation for each pair of variables given the others (provided by the ppcor package).
  Syntax: pcormat <- pcor(X)
  where X is a matrix with variables in columns. The default is Pearson correlation, but it can also compute Kendall and Spearman (partial) correlations. The result is given in pcormat$estimate.
Sample usage
    library(ppcor)                                     # provides pcor()
    w <- rnorm(200); v <- rnorm(200); z <- rnorm(200)  # generate w, v and z
    x <- 2*z + 3*w; y <- z + v                         # generate x and y
    cor(cbind(x, y))                                   # correlation matrix of x and y
    pcor(cbind(x, y, z))$estimate                      # partial correlation matrix