IN5340 / IN9340 Lecture 3
Random variables, vectors and sequences

Roy Edgar Hansen
January 2022

Outline

1 Introduction
2 Random processes revisited
3 Correlation and covariance matrix
4 Spectral characteristics
5 Example: sonar data continued
6 Magnitude Squared Coherence (MSC)
7 Summary

What do we learn

Correlation and spectra. Covariance and correlation matrix.
Chapter 3.3 in Hayes. Chapters 2, 3 and 4 in Therrien. Chapter 4 in Bendat & Piersol.
Wikipedia: Covariance matrix, Spectral density, Coherence.
Youtube, material from other schools and books.

M. H. Hayes. Statistical Digital Signal Processing and Modeling. John Wiley & Sons, 1996.
C. W. Therrien. Discrete Random Signals and Statistical Signal Processing. Prentice Hall, 1992.
J. S. Bendat and A. G. Piersol. Engineering Applications of Correlation and Spectral Analysis. John Wiley & Sons, 1980.

Definition of random process

A discrete time random process x(n) is an indexed sequence of random variables, and a straightforward extension of the concept of a random variable.
Each random variable has a probability distribution

    Fx(α, n) = Pr{x(n) ≤ α}

and a probability density function

    fx(α, n) = dFx(α, n) / dα
A complete statistical characterisation requires the joint probability distribution

    Fx(α1, ..., αk, n1, ..., nk) = Pr{x(n1) ≤ α1, ..., x(nk) ≤ αk}

or the joint probability density function

    fx(α1, ..., αk, n1, ..., nk) = ∂^k Fx(α1, ..., αk, n1, ..., nk) / (∂α1 ... ∂αk)

Sometimes it is sufficient to describe the random process with the first and second order distribution (or density).

Stationarity and ergodicity

At a fixed time, a random process reduces to a random variable. This random variable has statistical properties such as a probability density function, mean value, variance, moments etc.

Stationarity: if none of the statistical properties change with time, the random process is said to be stationary.

Ergodicity, stated loosely: if the time averages equal the corresponding statistical (ensemble) averages, the process is said to be ergodic.
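A minimal matlab sketch of the ergodicity idea (an assumed illustration, not from the lecture): for a stationary and ergodic process, the time average over a single realization and the ensemble average at a single time should agree.

% Ergodicity in the mean, loosely: time average vs ensemble average
K = 2000;                       % number of realizations
N = 5000;                       % number of time samples
X = 3 + randn(K, N);            % one realization per row, true mean is 3

time_avg     = mean(X(1, :));   % average over time, single realization
ensemble_avg = mean(X(:, 1));   % average over the ensemble, single time
disp([time_avg, ensemble_avg])  % both should be close to 3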

Autocorrelation

The autocorrelation of a random process x(n) is

    Rxx(n1, n2) = E{x(n1)x(n2)}

If the random process is second order stationary, we can introduce the lag k = n2 − n1, which gives

    Rxx(n1, n2) = Rxx(k)

Written in continuous-time notation,

    Rxx(t1, t2) = Rxx(τ) where τ = t2 − t1

The autocorrelation contains information about the history of the random process.

Wide sense stationary processes

A random process is said to be wide sense stationary (WSS) if:

    The expected value (mean value) of the process is constant: E{x(n)} = x̄
    The autocorrelation is only a function of the lag: Rxx(n1, n2) = Rxx(k)
    The variance of the random process is finite: μ2 < ∞
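A minimal matlab sketch (an assumed illustration, not from the lecture) of the WSS property that the autocorrelation depends only on the lag: estimate E{x(n1)x(n2)} over an ensemble and compare entries with the same lag k = n2 − n1.

% Ensemble of a real WSS process: white noise through a moving-average
% filter, one realization per row
K = 20000; N = 64;
W = randn(K, N);
X = filter(ones(1,4)/4, 1, W, [], 2);   % filter along the time dimension

% Ensemble estimate of Rxx(n1, n2) = E{x(n1)x(n2)}
Rhat = (X' * X) / K;

% Entries with the same lag n2 - n1 = 3 should be approximately equal
disp([Rhat(20,23), Rhat(40,43), Rhat(60,63)])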
Cross correlation

Assume two random processes x(t) and y(t). The cross correlation function is

    Rxy(t1, t2) = E{x(t1)y(t2)}

We say they are jointly wide sense stationary if the cross correlation is only a function of the time difference and not of absolute time:

    Rxy(t, t + τ) = E{x(t)y(t + τ)} = Rxy(τ)

Autocovariance and cross covariance

Similar to the covariance, the autocovariance is

    Cxx(t, t + τ) = E{(x(t) − mx)(x(t + τ) − mx)}

The cross covariance is

    Cxy(t, t + τ) = E{(x(t) − mx)(y(t + τ) − my)}

The normalized cross covariance, or correlation coefficient, is

    ρxy(t, t + τ) = Cxy(t, t + τ) / (σx σy)

Due to the normalization, 0 ≤ |ρxy| ≤ 1.

Correlation and Covariance cont.

If the correlation can be written

    Rxy = E{x}E{y}

then x and y are said to be uncorrelated.

Statistically independent random variables are uncorrelated. This is easy to prove by inserting fx,y(α, β) = fx(α)fy(β). The converse is not true in general.

If the correlation is Rxy = 0, the two random variables are said to be orthogonal.

If the random variables x and y are independent or uncorrelated, then Rxy = E{x}E{y} and Cxy = 0.

Cross correlation of complex sequences

Assume two WSS complex random processes x(t) and y(t). The autocorrelation is

    Rxx(τ) = E{x*(t)x(t + τ)} = ∫ x*(t)x(t + τ) dt

and the cross correlation is

    Rxy(τ) = E{x*(t)y(t + τ)} = ∫ x*(t)y(t + τ) dt

There are different definitions of how to write these and where to place the conjugate.

Wide sense stationarity gives the symmetry property

    Rxy(τ) = E{x*(t)y(t + τ)} = E{x*(t − τ)y(t)} = E{y*(t)x(t − τ)}* = Ryx*(−τ)
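A minimal matlab sketch (assuming the Signal Processing Toolbox for xcorr) checking the symmetry property Rxy(τ) = Ryx*(−τ) on finite complex sequences. Note that xcorr places the conjugate on its second argument, as discussed on the next slide.

% Check Rxy(tau) = conj(Ryx(-tau)) using finite-sample estimates
N = 256;
x = complex(randn(N,1), randn(N,1));
y = 0.5*x + complex(randn(N,1), randn(N,1));   % correlated with x

Rxy = xcorr(x, y);        % lags -(N-1) ... N-1
Ryx = xcorr(y, x);

% flip reverses the lag axis, so conj(flip(Ryx)) is Ryx*(-tau)
disp(max(abs(Rxy - conj(flip(Ryx)))))          % should be ~ 0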
Cross correlation example

In matlab, xcorr and xcov use the convention Rxy(τ) = E{x(t + τ)y*(t)}. Note the difference from our chosen definition: in our notation it should really be interpreted as Ryx(τ). See the matlab manual for xcov and xcorr.

Cross correlation example code

matlab code:

xx = complex( randn(100,1), randn(100,1) );

% x contains xx at samples 1-100; y contains the same xx delayed by 50 samples
x = [xx ; complex(zeros(200,1))] + 1;
y = [complex(zeros(50,1)) ; xx ; complex(zeros(150,1))] - 1;

% add independent complex noise to both sequences
n1 = complex(randn(size(x)), randn(size(x)))/10;
n2 = complex(randn(size(x)), randn(size(x)))/10;
x = x + n1;
y = y + n2;

% normalized cross covariance; |Cv| should peak near lag 50, the delay between x and y
[Cv,lag] = xcov( y, x, 'coeff' );

figure
subplot(1,2,1);
plot( real(x) ); hold on; plot( real(y) );
subplot(1,2,2);
plot(lag,abs(Cv));

Correlation matrix

The autocorrelation and autocovariance sequences are important second-order statistical characterisations.

It is convenient to represent a sequence of random variables in vector form

    x = [x1, x2, x3, ..., xN]^T

where all xn are random variables. The correlation matrix is defined as

    Rx = E{xx^T} = [ E{x1^2}    E{x1 x2}   ...  E{x1 xN} ]
                   [ E{x2 x1}   E{x2^2}    ...  E{x2 xN} ]
                   [   ...        ...      ...    ...    ]
                   [ E{xN x1}   E{xN x2}   ...  E{xN^2}  ]

Covariance matrix

The covariance matrix is defined as

    Cx = E{(x − mx)(x − mx)^T}

where

    mx = [E{x1}, E{x2}, ..., E{xN}]^T

Similarly to the scalar case,

    Cx = Rx − mx mx^T
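A minimal matlab sketch (an assumed example) estimating the correlation and covariance matrices from an ensemble of realizations, and checking the relation Cx = Rx − mx mx^T against matlab's cov.

% N-dimensional random vector, K realizations (one per column)
N = 8; K = 10000;
X = randn(N, K) + 2;                % nonzero mean so that Rx and Cx differ

Rx_hat = (X * X') / K;              % sample correlation matrix E{x x^T}
mx_hat = mean(X, 2);                % sample mean vector
Cx_hat = Rx_hat - mx_hat*mx_hat';   % covariance via Cx = Rx - mx mx^T

% cov expects observations in rows; the flag 1 normalizes by K (not K-1)
disp(norm(Cx_hat - cov(X', 1)))     % should be close to zero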
Cross correlation and covariance matrix

Assume two random vectors x and y. In a completely similar fashion, we write the cross correlation matrix

    Rxy = E{xy^T}

and the cross covariance matrix

    Cxy = E{(x − mx)(y − my)^T}

and again

    Cxy = Rxy − mx my^T

Autocorrelation matrix

For the random process x(t) = x(n), we write

    x = [x(0), x(1), x(2), ..., x(N − 1)]^T

where all x(n) are time samples of the random process (and therefore random variables). The autocorrelation matrix is defined as

    Rx = E{xx^T} = [ E{x(0)^2}        E{x(0)x(1)}      ...  E{x(0)x(N−1)}  ]
                   [ E{x(1)x(0)}      E{x(1)^2}        ...  E{x(1)x(N−1)}  ]
                   [     ...              ...          ...       ...       ]
                   [ E{x(N−1)x(0)}    E{x(N−1)x(1)}    ...  E{x(N−1)^2}    ]

Autocorrelation matrix

For a wide sense stationary random process, the autocorrelation is only a function of the lag, Rx(k), and the autocorrelation matrix can be written as

    Rx = [ Rx(0)     Rx(−1)    Rx(−2)    ...  Rx(−N+1) ]
         [ Rx(1)     Rx(0)     Rx(−1)    ...  Rx(−N+2) ]
         [ Rx(2)     Rx(1)     Rx(0)     ...  Rx(−N+3) ]
         [  ...       ...       ...      ...    ...    ]
         [ Rx(N−1)   Rx(N−2)   Rx(N−3)   ...  Rx(0)    ]

This is a Toeplitz (diagonal-constant) matrix with 2N − 1 degrees of freedom. It is Hermitian for complex valued random sequences, and symmetric for real valued ones.

Autocovariance matrix

Similarly, the autocovariance matrix is defined as

    Cx = E{(x − mx)(x − mx)^T} = [ Cx(0, 0)      Cx(0, 1)      ...  Cx(0, N−1)   ]
                                 [ Cx(1, 0)      Cx(1, 1)      ...  Cx(1, N−1)   ]
                                 [    ...           ...        ...     ...       ]
                                 [ Cx(N−1, 0)    Cx(N−1, 1)    ...  Cx(N−1, N−1) ]
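A minimal matlab sketch (assuming the Signal Processing Toolbox for xcorr) building the Toeplitz autocorrelation matrix of a WSS process from an estimated autocorrelation sequence.

% Estimate Rx(0), ..., Rx(N-1) and build the N x N Toeplitz matrix
N = 6;
x = filter(ones(1,3)/3, 1, randn(100000,1));   % a simple real WSS test sequence

[r, lags] = xcorr(x, N-1, 'biased');   % lags -(N-1) ... N-1
rpos = r(lags >= 0);                   % Rx(0), Rx(1), ..., Rx(N-1)

Rx = toeplitz(rpos);                   % symmetric Toeplitz for real data
% for complex data, use toeplitz(rpos, rpos') to obtain the Hermitian version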
Autocovariance matrix

For a wide sense stationary random process, the autocovariance is only a function of the lag, and the autocovariance matrix becomes

    Cx = E{(x − mx)(x − mx)^T} = [ Cx(0)     Cx(−1)    ...  Cx(−N+1) ]
                                 [ Cx(1)     Cx(0)     ...  Cx(−N+2) ]
                                 [  ...       ...      ...    ...    ]
                                 [ Cx(N−1)   Cx(N−2)   ...  Cx(0)    ]

Note that for WSS random processes, the mean value does not change with time:

    mx = [mx, mx, mx, ..., mx]^T

The Fourier transform

For a deterministic signal x(t), the Fourier transform is defined as

    X(ω) = ∫ x(t) e^{−jωt} dt

and is sometimes called simply the spectrum. Here ω is understood as angular frequency (if t is time), ω = 2πf, where f is the frequency in Hz. The inverse Fourier transform is

    x(t) = (1/2π) ∫ X(ω) e^{jωt} dω

The power spectrum

A random process is an ensemble of discrete-time signals, so we cannot compute the Fourier transform of the random process itself. The autocorrelation of a WSS random process is, however, a deterministic function of the delay. The Fourier transform of the autocorrelation function is called the power spectrum or the power spectral density:

    Pxx(ω) = ∫ Rxx(τ) e^{−jωτ} dτ

    Rxx(τ) = (1/2π) ∫ Pxx(ω) e^{jωτ} dω

This is called the Wiener-Khintchine theorem, the Einstein-Wiener-Khintchine theorem, or the Khinchin-Kolmogorov theorem.
(Pictured: Norbert Wiener, mathematician, the father of cybernetics.)

Properties of the power spectrum

For a WSS random process, the power spectral density has a number of important properties.

Positivity:

    Pxx(ω) ≥ 0

Symmetry: if x(n) is real, then the spectrum is symmetric,

    Pxx(ω) = Pxx(−ω)

Total power:

    (1/2π) ∫ Pxx(ω) dω = E{x^2(t)} = Rxx(0)
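A minimal matlab sketch (an assumed numerical check, using xcorr from the Signal Processing Toolbox) of the discrete-time counterpart of this relation: the DFT of the biased autocorrelation estimate equals the periodogram |X(ω)|^2 / N.

% Discrete Wiener-Khintchine check: DFT of the biased ACF = periodogram
N = 512;
x = filter([1 0.5], 1, randn(N,1));   % a simple WSS test sequence

r = xcorr(x, 'biased');  r = r(:);    % lags -(N-1) ... N-1, as a column
r = [r(N:end); r(1:N-1)];             % reorder to lags 0..N-1, -(N-1)..-1

P1 = real(fft(r));                    % DFT of the autocorrelation estimate
P2 = abs(fft(x, 2*N-1)).^2 / N;       % periodogram on the same frequency grid

disp(max(abs(P1 - P2)))               % should be at numerical precision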
Power spectrum - example

The power spectral density reveals frequency selective information.
Example: a random sequence with its corresponding ACF and PSD.

Cross power spectral density

Similar to the power spectral density (or auto PSD), the cross spectral density is

    Pxy(ω) = ∫ Rxy(τ) e^{−jωτ} dτ

    Rxy(τ) = (1/2π) ∫ Pxy(ω) e^{jωτ} dω

It is used later on in the calculation of the Magnitude Squared Coherence.
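A minimal matlab sketch (assuming the Signal Processing Toolbox) estimating a cross power spectral density with cpsd; the signals and parameters are illustrative assumptions only.

% Cross PSD between x and a delayed, noisy copy of x
N = 1e4;
x = randn(N, 1);
y = [zeros(10,1); x(1:end-10)] + 0.2*randn(N, 1);   % 10 sample delay plus noise

[Pxy, om] = cpsd(x, y, hann(256), 128, 256);        % Welch-type estimate
plot(om/pi, abs(Pxy));                              % magnitude of the cross spectrum
xlabel('\omega / \pi'); ylabel('|P_{xy}|');
% the 10 sample delay shows up as a linear trend in angle(Pxy)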

White noise

An important and fundamental discrete time random process is white noise. A wide sense stationary random process, real or complex, is said to be white if the autocovariance is

    Cxx(τ) = σx^2 δ(τ)

That is, the autocovariance is zero everywhere except at zero lag. The Fourier transform of a delta function is a constant (only one time lag contributes). This gives the following power spectral density for white noise:

    Pxx(ω) = σx^2

White noise derives its name from white light, which contains all visible light frequencies in its spectrum.

White noise cannot really exist, since the average power becomes infinite:

    ∫ Pxx(ω) dω = ∞

Band limited white noise has constant power within a frequency band:

    Pxx(ω) = Pπ/W for −W < ω < W
    Pxx(ω) = 0 elsewhere

This gives the autocorrelation

    Rxx(τ) = P sin(Wτ)/(Wτ)

which is a sinc function.
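A minimal matlab sketch (assuming the Signal Processing Toolbox for xcov and pwelch) illustrating the delta-like autocovariance and the flat spectrum of white noise.

% White Gaussian noise with variance sigma^2 = 4
sigma = 2;
w = sigma * randn(1e5, 1);

[c, lags] = xcov(w, 20, 'biased');   % autocovariance estimate, lags -20 ... 20
figure; stem(lags, c);               % approx sigma^2 at lag 0, near zero elsewhere

figure; pwelch(w);                   % PSD estimate: approximately flat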
Example: real data collected by a sonar

The HUGIN autonomous underwater vehicle carries a wideband interferometric synthetic aperture sonar: a transmitter that insonifies the seafloor with an LFM pulse, and an array of receivers that collects the echoes from the seafloor. The signal scattered from the seafloor is considered to be random. The recorded signal consists of a signal part and additive noise.

Example: Single recorded pulse-echo timeseries

The random sequence is clearly non-stationary. We divide it into “similar” regions before we continue our statistical analysis:

    Region 1: backscattered signal from the seafloor
    Region 2: additive noise

Example: Power spectral density

How about the PSD of the sonar data from regions 1 and 2? The PSD in its simplest form in matlab:

Pxx = fftshift( fft( xcov( data, 'coeff' ) ) );

We can improve the estimate by performing ensemble averaging (over all channels, or elements in the random vector), as sketched below. Spectral estimation will be a separate topic in this course.
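A minimal matlab sketch of such ensemble averaging. The variable name data, its layout (one channel per column), and the synthetic stand-in below are assumptions, not from the lecture.

% Stand-in for the multichannel recording: one channel per column
data = complex(randn(1000, 32), randn(1000, 32));

[Nsamp, Nchan] = size(data);
Pxx = zeros(2*Nsamp - 1, 1);
for ch = 1:Nchan
    Pxx = Pxx + abs(fftshift(fft(xcov(data(:, ch), 'coeff'))));
end
Pxx = Pxx / Nchan;   % ensemble-averaged PSD estimate (up to normalization)
plot(Pxx);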
Sonar data example: Conclusion

The real and imaginary parts of the signal are uncorrelated.
The PDF is OK in both regions.
The individual channels (receiver elements) are correlated. This may be OK in region 1, but not in region 2.
The channels are strongly correlated in region 2 (the noise region).
The spectrum is contaminated with spectral lines in region 1.
The spectrum contains strong unwanted lines (tones) in region 2.
The spectral lines are due to self-noise and should be reduced as much as possible.

Power spectral density and cross spectrum

Power spectrum recap (assuming WSS random processes):

    Pxx(ω) = ∫ Rxx(τ) e^{−jωτ} dτ

Cross spectrum recap (assuming jointly WSS random processes):

    Pxy(ω) = ∫ Rxy(τ) e^{−jωτ} dτ

Spectral coherence, or normalized cross spectrum:

    Γxy(ω) = Pxy(ω) / ( √(Pxx(ω)) √(Pyy(ω)) )

Magnitude Squared Coherence (MSC)

The magnitude squared coherence (MSC) is

    |Γxy(ω)|^2 = |Pxy(ω)|^2 / (Pxx(ω) Pyy(ω))

It requires Pxx(ω) > 0 and Pyy(ω) > 0, and is bounded: 0 ≤ |Γxy(ω)|^2 ≤ 1.
The MSC is the frequency domain analog of the correlation coefficient, and an important tool to assess the relationship between two signals in the frequency domain. It can be used to prove linear dependence (example later on).
Reference: Bendat & Piersol 1980. Barry Van Veen on youtube: Coherence and the Cross Spectrum.

MSC example: The LTI filter

Assume a zero mean WSS complex random process x(t). Construct a new random process

    y(τ) = h(τ) ∗ x(τ) = ∫ h(t) x(τ − t) dt

where h(t) is a Linear Time Invariant (LTI) filter and ∗ is the convolution operator. Then y(t) also becomes a WSS random process. In the frequency domain (remembering the properties of the Fourier transform),

    Y(ω) = H(ω) X(ω)
MSC example: The LTI filter 2

The autocorrelation is

    Rxx(τ) = ∫ x*(t) x(t + τ) dt

The cross correlation is

    Rxy(τ) = ∫ x*(t) y(t + τ) dt
           = ∫ x*(t) ∫ h(s) x(t + τ − s) ds dt
           = ∫ h(s) Rxx(τ − s) ds

This is directly recognized as a convolution:

    Rxy(τ) = h(τ) ∗ Rxx(τ)

MSC example: The LTI filter 3

The cross spectrum (again remembering the properties of the Fourier transform) is

    Pxy(ω) = H(ω) Pxx(ω)

The (auto) power spectrum for y(t) is

    Pyy(ω) = |H(ω)|^2 Pxx(ω)

Inserting into the magnitude squared coherence (MSC):

    |Γxy(ω)|^2 = |Pxy(ω)|^2 / (Pxx(ω) Pyy(ω))
               = |H(ω) Pxx(ω)|^2 / (Pxx(ω) |H(ω)|^2 Pxx(ω))
               = |H(ω)|^2 Pxx^2(ω) / (|H(ω)|^2 Pxx^2(ω)) = 1
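A minimal matlab sketch (assuming the Signal Processing Toolbox) checking this result numerically with mscohere; the filter coefficients are an arbitrary illustrative choice.

% y is x passed through an LTI filter with no noise: MSC should be close to 1
N = 1e5;
x = randn(N, 1);
y = filter([1 0.5 0.2], 1, x);

nfft = 512;
[Cxy, om] = mscohere(x, y, hann(nfft), nfft/2, nfft);

plot(om/pi, Cxy); ylim([0 1.1]);
xlabel('\omega / \pi'); ylabel('|\Gamma_{xy}|^2');   % close to 1 at all frequencies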

MSC example: The LTI filter 4

If y(t) can be described as x(t) convolved with a Linear Time-Invariant (LTI) filter,

    y(τ) = h(τ) ∗ x(τ)

then the Magnitude Squared Coherence (MSC) is

    |Γxy(ω)|^2 = 1 for all ω

The Magnitude Squared Coherence: The MSC can be used to examine the relation between two signals or data sets. It is commonly used to estimate the power transfer between input and output of a linear system. If the signals are ergodic, and the system function is linear, it can be used to estimate the causality between the input and output.

MSC example 2: The LTI filter again

Consider a slightly better example where y(t) can be described as x(t) convolved with an LTI filter plus noise:

    y(t) = h(t) ∗ x(t) + w(t)

where w(t) is a zero mean WSS complex random process. The signals x(t) and w(t) are assumed statistically independent, giving Rxw = 0 and Rwx = 0. The cross correlation of x(t) and y(t) then becomes (again)

    Rxy(τ) = h(τ) ∗ Rxx(τ)
MSC example 2: The LTI filter again 2

The cross power spectral density becomes (again)

    Pxy(ω) = H(ω) Pxx(ω)

The auto power spectral density for y(t) becomes (remembering that the autocorrelation of a sum of two uncorrelated random processes is equal to the sum of the autocorrelations)

    Pyy(ω) = |H(ω)|^2 Pxx(ω) + Pww(ω)

The MSC:

    |Γxy(ω)|^2 = |Pxy(ω)|^2 / (Pxx(ω) Pyy(ω))
               = |H(ω) Pxx(ω)|^2 / (Pxx(ω) (|H(ω)|^2 Pxx(ω) + Pww(ω)))
               = |H(ω)|^2 Pxx(ω) / (|H(ω)|^2 Pxx(ω) + Pww(ω))

MSC example 3: The LTI filter again 3

Single input-output system with noise: x(t), y(t) = h(t) ∗ x(t) + w(t). The MSC is

    |Γxy(ω)|^2 = |H(ω)|^2 Pxx(ω) / (|H(ω)|^2 Pxx(ω) + Pww(ω))

If |H(ω)|^2 Pxx(ω) ≫ Pww(ω), then |Γxy(ω)|^2 ≈ 1.
If |H(ω)|^2 Pxx(ω) ≪ Pww(ω), then |Γxy(ω)|^2 ≈ 0.
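A minimal matlab sketch (assuming the Signal Processing Toolbox) comparing the estimated MSC with the closed-form expression above for white x(t) and white w(t); the filter and noise level are illustrative assumptions.

% Single input-output system with additive noise: y = h * x + w
N = 2e5;
b = fir1(32, 0.4);                 % an arbitrary LTI filter h
x = randn(N, 1);                   % white input, Pxx = 1
w = 0.5 * randn(N, 1);             % white noise,  Pww = 0.25
y = filter(b, 1, x) + w;

nfft = 512;
[Cxy, om] = mscohere(x, y, hann(nfft), nfft/2, nfft);

H = freqz(b, 1, om);               % filter response on the same frequency grid
theory = abs(H).^2 ./ (abs(H).^2 + 0.25);

plot(om/pi, Cxy, om/pi, theory);
legend('estimated MSC', '|H|^2 P_{xx} / (|H|^2 P_{xx} + P_{ww})');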

MSC summary

MSC is used when studying two signals, recorded simultaneously, in one system. It is typically used when there are periodicities in the data, and it is the frequency domain analog of the correlation coefficient.

Applications in biomedicine (cardiology / neurology):
    studying heart rate and function in electrocardiography (ECG, EKG)
    studying brain function in electroencephalography (EEG)

Applications in the natural sciences:
    climate research: estimating the dependency between solar activity and global temperature. See Sverre Holm's papers on how to estimate the MSC and the accuracy of the estimates.

The main application in engineering is in system identification. Care must be taken in the estimation of the MSC.

Summary of lecture 3

random processes
autocorrelation
cross correlation
autocovariance
cross covariance
wide sense stationary
ergodicity
correlation matrix
covariance matrix
Fourier transform
power spectral density
Einstein-Wiener-Khintchine relation
cross spectral density
white noise
magnitude squared coherence