ISVR6130
Paul White
Contents
• Delta functions
• Fourier Series for non-periodic signals
• Definition
• Examples
• Properties
Dirac Delta Functions
Paul Dirac
1902-84
δ(t) = lim_{ε→0} R_ε(t), where
  R_ε(t) = 1/ε,  |t| ≤ ε/2
  R_ε(t) = 0,    |t| > ε/2
Note R_ε(t) always has an area of 1 = ε × (1/ε).
[Figure: rectangle of height 1/ε between −ε/2 and ε/2.]
Properties of a Delta Function
• Zero everywhere, except at t = 0, where it is infinite:
  δ(t) = 0,  t ≠ 0
• Symmetric:
  δ(t) = δ(−t)  ∀t   (∀ means “for all”)
• Unit area:
  ∫_a^b δ(t) dt = 1,  b > 0, a < 0
• Sifting property:
  ∫_a^b δ(t − τ) x(t) dt = x(τ),  a < τ < b
[Figure: δ(t − τ) multiplied by x(t) leaves a single impulse at t = τ whose area is x(τ).]
Examples of the Sifting Property
  ∫_{−∞}^{∞} t³ δ(t − 2) dt = 2³ = 8
  ∫_{−∞}^{∞} 9 δ(t − 3) dt = 9
  ∫_{−4}^{2} t² δ(t − 3) dt = 0   (the impulse at t = 3 lies outside the limits)
  ∫_{−∞}^{∞} e^{−t²} cos(t) δ(t) dt = e^{0²} cos(0) = 1
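The sifting property can be checked numerically by approximating δ(t − τ) with a narrow unit-area rectangle, as in the limiting definition above. A minimal sketch in Python (the pulse width `eps` is an arbitrary choice):

```python
import numpy as np

# Approximate delta(t - tau) by a rectangle of width eps and height 1/eps
# (unit area), then evaluate the sifting integral for x(t) = t^3, tau = 2.
eps = 1e-4
tau = 2.0
t = np.linspace(tau - eps / 2, tau + eps / 2, 1001)  # fine grid over the pulse
delta_approx = np.full_like(t, 1.0 / eps)            # height 1/eps, area 1

x = t ** 3
integral = np.trapz(delta_approx * x, t)

print(integral)  # close to tau**3 = 8
```

As eps shrinks, the integral converges to x(τ), exactly as the sifting property predicts.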
Alternative (useful) Definition
• In Fourier analysis one frequently encounters delta functions.
• Consider the following integral:
  I(f) = ∫_{−∞}^{∞} e^{−2πift} dt
• Consider how this integral behaves as a function of f:
  – When f ≠ 0, the integrand e^{−2πift} oscillates. When it is integrated over the whole of the t axis you get 0.
  – When f = 0, the integrand is 1. When that is integrated over all t then the result is infinite.
  – So the integral is equal to 0 for all f ≠ 0 and infinity when f = 0, like a delta function:
  I(f) = ∫_{−∞}^{∞} e^{−2πift} dt = δ(f)
This is strictly not a proof that the integral is a delta function, but it does show that the assertion is not unreasonable.
Fourier Series Revisited
• FS of a signal with a period Tp = 0.1 s (fp = 1/Tp = 10 Hz).
[Figure: the magnitudes |dn| form discrete components separated by fp.]
Non-Periodic Signals
• What is the frequency domain representation of a non-periodic signal? (Answer: the Fourier Integral!)
• One can consider a non-periodic signal as a signal which is periodic, but whose period (Tp) is ∞!
• Consider the definition of the (complex) FS coefficients:
  dn = (1/Tp) ∫_{−Tp/2}^{Tp/2} x(t) e^{−2πint/Tp} dt
  Note the limits are −Tp/2 to Tp/2, rather than 0 to Tp: this makes no difference as long as the integral covers one period of the signal.
• Now consider
  lim_{Tp→∞} dn = lim_{Tp→∞} (1/Tp) ∫_{−Tp/2}^{Tp/2} x(t) e^{−2πint/Tp} dt = lim_{Tp→∞} (1/Tp) ∫_{−Tp/2}^{Tp/2} x(t) e^{−2πi(nfp)t} dt
…. in the limit
  lim_{Tp→∞} (1/Tp) ∫_{−Tp/2}^{Tp/2} x(t) e^{−2πi(nfp)t} dt = lim_{fp→0} fp ∫_{−1/(2fp)}^{1/(2fp)} x(t) e^{−2πi(nfp)t} dt
  The factor fp = 1/Tp tends to zero.
• Assume x(t) has finite support, say it is zero for |t| > u; then
  lim_{Tp→∞} ∫_{−Tp/2}^{Tp/2} x(t) e^{−2πi(nfp)t} dt = ∫_{−u}^{u} x(t) e^{−2πi(nfp)t} dt
  which is finite.
• So that dn → 0 as Tp → ∞.
• Not much use! (In the limit all the FS coefficients tend to zero.)
Alternative Approach
• Instead of dn, how about considering dnTp = dn/fp?
• Recall that fp is the spacing between harmonics (as well as the fundamental frequency), so we could denote it Δf.
  lim_{Tp→∞} dnTp = lim_{Tp→∞} dn/fp = lim_{Tp→∞} ∫_{−Tp/2}^{Tp/2} x(t) e^{−2πi(nfp)t} dt
  In the limit nfp becomes a continuous frequency variable f, and we obtain the Fourier integral:
  F{x(t)} = X(f)
Properties of the Fourier Integral
• Time-reversal:  F{x(−t)} = X(−f) = X(f)*  (for real x(t))
• Time-shifts:  F{x(t − τ)} = e^{−2πifτ} X(f)
• Conjugate symmetry (real signals):  X(−f) = X(f)*, so
  |X(−f)| = |X(f)|,  arg X(−f) = −arg X(f)
Signal Symmetries
• For symmetric signals, i.e. x(−t) = x(t), X(f) is real, since from the above
  X(−f) = X(f)  and  X(−f) = X(f)*  ⇒  X(f) = X(f)*
• Show that, if x(t) is real, then X(−f) = X(f)*:
  X(f)* = [∫ x(t) e^{−2πift} dt]* = ∫ x(t)* e^{2πift} dt = ∫ x(t) e^{−2πi(−f)t} dt = X(−f)
  (using x(t)* = x(t) for a real signal).
Example 1: Rectangular Function
  x(t) = 1, |t| ≤ T/2;  0 elsewhere
  X(f) = ∫ x(t) e^{−2πift} dt                         (definition of Fourier Integral)
  X(f) = ∫_{−T/2}^{T/2} 1·e^{−2πift} dt               (using the expression for x(t))
  X(f) = [e^{−2πift}/(−2πif)]_{−T/2}^{T/2} = (e^{πifT} − e^{−πifT})/(2πif)
Using Euler’s equation, sin(θ) = (e^{iθ} − e^{−iθ})/(2i):
  X(f) = sin(πfT)/(πf) = T sin(πfT)/(πfT)
What does this function look like? It is called a sinc function:
  X(f) = T sin(πfT)/(πfT)
It is helpful to consider this in terms of U = πfT:
  X(f) = T sin(U)/U
At U = 0, sin(U)/U looks to be 1, so X(0) = T.
[Figure: T sin(U)/U for T = 1.]
As T → ∞ it becomes more like a Dirac delta – Dirac deltas have infinite height and zero width. So it is clear that for this example
  lim_{T→∞} X(f) = δ(f)
Or, to put it another way, the Fourier transform of x(t) = 1, for all t, is δ(f).
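The closed form for the rectangular pulse can be verified by direct numerical integration of the Fourier integral. A quick check assuming nothing beyond numpy (note `np.sinc(u)` is the normalised sinc, sin(πu)/(πu)):

```python
import numpy as np

# Numerically approximate X(f) = integral of x(t) e^{-2 pi i f t} dt for a
# rectangular pulse of width T = 1, and compare with T sin(pi f T)/(pi f T).
T = 1.0
t = np.linspace(-T / 2, T / 2, 20001)   # integrate only where x(t) = 1

def X_numeric(f):
    return np.trapz(np.exp(-2j * np.pi * f * t), t)

def X_exact(f):
    return T * np.sinc(f * T)           # np.sinc(u) = sin(pi u)/(pi u)

f_test = 0.7
print(X_numeric(f_test), X_exact(f_test))
```

The imaginary part of the numerical result is negligible, as expected for a symmetric signal.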
Example 2: Cosine Wave
  x(t) = A cos(2πf₀t),  ∀t
Note f₀ is a constant; be careful not to confuse it with f, the variable in the Fourier transform X(f).
[Figure: cosine wave of amplitude A and period T = 1/f₀.]
  X(f) = ∫ x(t) e^{−2πift} dt                         (definition of Fourier Integral)
  X(f) = ∫ A cos(2πf₀t) e^{−2πift} dt                 (using the expression for x(t))
Here is where you can get f and f₀ confused! f₀ is the frequency of the cosine wave being analysed and f is the variable in the Fourier transform.
To evaluate this integral one simplifies using cos(θ) = (e^{iθ} + e^{−iθ})/2:
  X(f) = (A/2) ∫ (e^{2πif₀t} + e^{−2πif₀t}) e^{−2πift} dt
       = (A/2) ∫ e^{2πif₀t} e^{−2πift} dt + (A/2) ∫ e^{−2πif₀t} e^{−2πift} dt
       = (A/2) ∫ e^{−2πi(f−f₀)t} dt + (A/2) ∫ e^{−2πi(f+f₀)t} dt
Both the integrals have the form ∫ e^{−2πiνt} dt. Note that the integrals have infinite limits and the argument of the exponential is imaginary. Consequently the result is a Dirac delta:
  ∫ e^{−2πiνt} dt = δ(ν)
so
  X(f) = (A/2) δ(f − f₀) + (A/2) δ(f + f₀)
[Figure: two impulses of weight A/2 at f = −f₀ and f = f₀.]
Link to Example 1
Taking A = 1 and f₀ = 0, the Fourier transform is
  X(f) = (A/2) δ(f − f₀) + (A/2) δ(f + f₀) = (1/2) δ(f) + (1/2) δ(f) = δ(f)
and the signal in the time domain is
  x(t) = A cos(2πf₀t) = cos(0) = 1,  ∀t
which agrees with the T → ∞ limit of Example 1.
Example 3: Decaying Exponential
  x(t) = e^{−at}, t ≥ 0;  0, t < 0
  X(f) = ∫ x(t) e^{−2πift} dt                         (definition of Fourier Integral)
  X(f) = ∫_0^∞ e^{−at} e^{−2πift} dt                  (using the expression for x(t))
  X(f) = ∫_0^∞ e^{−(a+2πif)t} dt                      (combining the exponential terms)
  X(f) = [−e^{−(a+2πif)t}/(a + 2πif)]_0^∞             (using the standard integral ∫ e^{−bt} dt = −e^{−bt}/b)
For large t, |e^{−(a+2πif)t}| = e^{−at} |e^{−2πift}| → 0 because e^{−at} → 0, so
  X(f) = 0 − (−1/(a + 2πif)) = 1/(a + 2πif)
Discussion
• In this case the Fourier integral is complex valued (previous examples have all been real valued).
• To explore this we typically consider the magnitude and sometimes the phase of X(f).
• To compute the magnitude we commonly use the fact that |z|² = zz*, and for the phase arg(z) = tan⁻¹(z_i/z_r):
  |X(f)|² = 1/(a + 2πif) × 1/(a − 2πif) = 1/(a² + 4π²f²)
  |X(f)| = 1/√(a² + 4π²f²) ≈ 1/(2πf)  as f → ∞
Example 4: Double Exponential
  x(t) = e^{−a|t|},  ∀t
Note when t is negative then |t| = −t and e^{−a|t|} is equal to e^{−a(−t)} = e^{at}.
We shall take a different approach to this Fourier Integral. Write
  x(t) = y(t) + y(−t)
identifying y(t) as the same function as used in Example 3 (a decaying exponential). So we already know that:
  Y(f) = 1/(a + 2πif)
and, by the time-reversal property of the Fourier integral, F{y(−t)} = Y(f)*.
This means that
  x(t) = y(t) + y(−t)  ⇒  X(f) = Y(f) + Y(f)*
  X(f) = 1/(a + 2πif) + 1/(a − 2πif) = 2a/(a² + 4π²f²)
Symmetry in the Examples
• Note Examples 1, 2 and 4 all involve signals which are
symmetric and their Fourier integrals are real.
– We just look at X(f) since it is real.
Laurel and Hardy
• Example 1 – as T varies:
  – As T increases the window gets broader in the time domain, but its Fourier transform becomes narrower (first intercept on the frequency axis is 1/T).
• Examples 3, 4 – as a varies (Example 4 is easier to deal with…):
  – For small a the time domain function is broad, but the Fourier transform is narrow. Specifically, for Example 4:
    X(0) = 2a/(a² + 0) = 2/a
    X(a/2π) = 2a/(a² + a²) = 1/a = X(0)/2
    i.e. at f = a/2π the Fourier transform is half its peak value, so a/π represents the 6 dB bandwidth.
Example 4 (in detail)
• Functions as a varies. [Figure: time domain and frequency domain curves for several values of a.]
• Time scaling property:
  F{x(at)} = (1/a) X(f/a),  a > 0
• Thus
  – For a > 1, x(at) is compressed but X(f/a) is stretched out along the frequency axis.
    • Shorter duration signals have broader bandwidths.
  – For a < 1, x(at) is stretched but X(f/a) is compressed along the frequency axis.
    • Longer duration signals have narrower bandwidths.
Examples Summarised
• Continuity in the time domain of the four examples:
  – Example 1 (Square pulse) is discontinuous.
  – Example 2 (Cosine wave) is completely continuous.
  – Example 3 (Exponential decay) is discontinuous.
  – Example 4 (Double exponential) is continuous, but is discontinuous in its derivative.
  Example 2:  X(f) = (A/2) δ(f − f₀) + (A/2) δ(f + f₀)
  Example 4:  X(f) = 2a/(a² + 4π²f²)
Asymptotic Roll-Off
• What happens to the Fourier transform for large f?
• Example 1: AT sin(πfT)/(πfT) ~ A/(πf) ∝ 1/f            Discontinuous
• Example 2: for large f (in fact for any f > f₀) X(f) = 0   Smooth
• Example 3: for large f, 1/(a + 2πif) ~ 1/(2πif) ∝ 1/f      Discontinuous
• Example 4: for large f, 2a/(a² + 4π²f²) ~ 2a/(4π²f²) ∝ 1/f²   Continuous, discontinuous derivative
(The labels on the right are the continuity properties in the time domain.)
General Rule
• Consider a signal x(t) which is continuous up to its nth derivative
  – i.e. it is continuous, as is its first derivative and all those up to the (n−1)th, but the nth derivative is discontinuous.
• The Fourier transform of this signal will satisfy:
  lim_{f→∞} |X(f)| ∝ 1/f^{n+1}
• This means in terms of decibels (20 log₁₀|X(f)|) that for large f the Fourier transform reduces by 20(n+1) dB per decade increase in frequency (or 6(n+1) dB per octave).
Parseval’s Theorem
• Defining a signal’s energy as:  E = ∫ |x(t)|² dt
• One can show:
  E = ∫ |x(t)|² dt = ∫ |X(f)|² df
Proof in summary: substitute x(t) = ∫ X(f₁) e^{2πif₁t} df₁ into E = ∫ x(t)² dt and use ∫ e^{2πi(f₁−f₂)t} dt = δ(f₁ − f₂).
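Parseval's theorem has a discrete counterpart that is easy to check with the FFT, treating the DFT as a sampled Fourier integral. A sketch (the signal is arbitrary random data; `dt` is an assumed sampling interval):

```python
import numpy as np

# Discrete check of Parseval's theorem:
#   sum |x[n]|^2 * dt  ==  sum |X(f_k)|^2 * df,
# where X(f_k) ~ dt * FFT(x)[k] approximates the Fourier integral.
rng = np.random.default_rng(0)
N, dt = 1024, 0.01
x = rng.standard_normal(N)

E_time = np.sum(np.abs(x) ** 2) * dt

X = dt * np.fft.fft(x)          # sampled Fourier integral at f_k = k/(N dt)
df = 1.0 / (N * dt)
E_freq = np.sum(np.abs(X) ** 2) * df

print(E_time, E_freq)           # the two energies agree
```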
Examples for you to try
1) Compute the Fourier transform for:
   a) x(t) = 1, ∀t
   b) x(t) = δ(t)
   c) x(t) = sin(2πf₀t), ∀t
   d) x(t) = te^{−at}, t > 0;  = 0, t < 0
2) For each Fourier transform you found in 1), consider the continuity of x(t) and show that the asymptotic rate of X(f) conforms with the general rule relating these two (see the General Rule above).
3) Prove that:
   a) F{x(at)} = (1/a) X(f/a), a > 0
   b) F{x(−t)} = X(f)*  (for real x(t))
Fourier Integrals: Part 2
Multiplication in the Time Domain
• The Fourier transform of a product of two signals:
  F{x(t) y(t)} = ∫_t [∫_{f₁} X(f₁) e^{2πif₁t} df₁] [∫_{f₂} Y(f₂) e^{2πif₂t} df₂] e^{−2πift} dt
  = ∫_t ∫_{f₁} ∫_{f₂} X(f₁) Y(f₂) e^{2πif₁t} e^{2πif₂t} e^{−2πift} df₂ df₁ dt
  = ∫_{f₁} ∫_{f₂} X(f₁) Y(f₂) [∫_t e^{−2πi(f−f₁−f₂)t} dt] df₂ df₁
  = ∫_{f₁} ∫_{f₂} X(f₁) Y(f₂) δ(f − f₁ − f₂) df₂ df₁ = ∫_{f₁} X(f₁) Y(f − f₁) df₁
… in the Frequency Domain
• Similarly, multiplication in the frequency domain corresponds to convolution in time:
  F⁻¹{X(f) Y(f)} = ∫ X(f) Y(f) e^{2πift} df = ∫ x(τ) y(t − τ) dτ = x(t) * y(t)
• Linear:  x(t) * [a y(t) + b z(t)] = a x(t) * y(t) + b x(t) * z(t)
• Time shifts:  x(t − τ) * y(t) = x(t) * y(t − τ) = w(t − τ), where w(t) = x(t) * y(t)
• Identity:  x(t) * δ(t) = x(t) — the delta function is the identity for convolution, the same way as 1 is the identity for multiplication (anything multiplied by 1 does not change) and zero is the identity for addition (add zero to anything and it does not change).
Linear Time Invariant Systems
• Consider a Linear Time Invariant (LTI) system with impulse response h(t): an impulse applied at time t₁ produces the response h(t − t₁), whatever t₁ is.
General Result
• The sifting property of a delta function can be used to express an arbitrary input:
  x(t) = ∫ x(τ) δ(t − τ) dτ
so, by linearity and time invariance, the output is
  y(t) = h(t) * x(t)
LTI Systems in the Frequency Domain
• Since
  y(t) = h(t) * x(t) = ∫ x(τ) h(t − τ) dτ = ∫ h(τ) x(t − τ) dτ
taking Fourier transforms of both sides gives Y(f) = H(f) X(f).
Windowing
• Define the rectangular window
  r(t) = 1, |t| ≤ T/2;  0 elsewhere
….. Cont’d
• Since the windowed signal x̃(t) = r(t) x(t) is related to x(t) via multiplication, in the frequency domain
  X̃(f) = R(f) * X(f)
[Figure: windowing functions – blue: rectangular or boxcar; green: triangular; light blue: Gaussian; red: Hanning.]
Common Choices for Windowing Functions
• Rectangular window:
  w(t) = 1, |t| ≤ T/2;  0 elsewhere
• Hanning window (raised cosine):
  w(t) = [1 + cos(2πt/T)]/2, |t| ≤ T/2;  0 elsewhere
• Hamming window (raised cosine on a pedestal):
  w(t) = 0.54 + 0.46 cos(2πt/T), |t| ≤ T/2;  0 elsewhere
• Blackman window:
  w(t) = 0.42 + 0.5 cos(2πt/T) + 0.08 cos(4πt/T), |t| ≤ T/2;  0 elsewhere
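The four window formulas above, written for a window centred on t = 0 as on the slide, translate directly into code. A sketch (the slide gives no implementation; this is one straightforward rendering):

```python
import numpy as np

# The four common windows, centred on t = 0 with support |t| <= T/2.
def rectangular(t, T):
    return np.where(np.abs(t) <= T / 2, 1.0, 0.0)

def hanning(t, T):
    return np.where(np.abs(t) <= T / 2, 0.5 * (1 + np.cos(2 * np.pi * t / T)), 0.0)

def hamming(t, T):
    return np.where(np.abs(t) <= T / 2, 0.54 + 0.46 * np.cos(2 * np.pi * t / T), 0.0)

def blackman(t, T):
    return np.where(np.abs(t) <= T / 2,
                    0.42 + 0.5 * np.cos(2 * np.pi * t / T)
                         + 0.08 * np.cos(4 * np.pi * t / T), 0.0)

T = 1.0
print(hanning(0.0, T), hanning(T / 2, T), hamming(T / 2, T))
```

Note the defining difference: the Hanning window falls to exactly zero at the edges, while the Hamming window sits on a small pedestal of 0.54 − 0.46 = 0.08.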
[Figure: the four windows in the time domain – Rectangular, Hanning, Hamming, Blackman.]
Properties of the Common Windows
[Figure: Fourier transforms of the windowed signals; lines show the true signal frequencies.]
Sampling
[Figure: a continuous signal and its samples x[0], x[1], …, x[4], taken at sampling interval Δ.]
• If only x[n] is known, then this says nothing about the value of x(t) between samples, (n−1)Δ < t < nΔ.
The Rectangle Rule
[Figure: f(x) approximated by strips of width δx and height f(xₙ).]
  ∫_a^b f(x) dx ≈ Σ_n f(xₙ) δx
  – You might be more familiar with the trapezoidal rule, which is actually a little more accurate than the rectangle rule. (The areas of the two half strips outside of [a, b] are subtracted when using the trapezoidal rule.)
Approximating the Fourier Integral
• We can use the rectangle method to approximate the Fourier integral as follows:
  ∫ x(t) e^{−2πift} dt ≈ Σ_n x(tₙ) e^{−2πiftₙ} Δ = Δ Σ_n x[n] e^{−2πifnΔ}
• Infinite sums: for the geometric series
  S = a + ar + ar² + ar³ + … = Σ_{n≥0} arⁿ = a/(1 − r),  |r| < 1
As N → ∞, the remainder |r^N| → 0 as long as |r| < 1, in which case the sum converges. If |r| > 1 the sum diverges, i.e. S → ∞ as N → ∞.
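The geometric-series limit is easy to check numerically, including for a complex ratio of modulus less than one, which is exactly the kind of term (e^{−2πifΔ} scaled) that arises when summing sampled exponentials:

```python
import numpy as np

# Partial sums of S = a + a r + a r^2 + ... converge to a/(1 - r) when
# |r| < 1; here r is complex with |r| = 0.9.
a = 1.0
r = 0.9 * np.exp(2j * np.pi * 0.3)      # |r| = 0.9 < 1

N = 500
partial = a * np.sum(r ** np.arange(N))
closed = a / (1 - r)

print(abs(partial - closed))            # ~ |r|^N / |1 - r|, essentially zero
```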
Periodicity of Xs(f)
• With Xs(f) = Σ_n x[n] e^{−2πifnΔ}, consider Xs(f + 1/Δ) = Xs(f + fs):
  Xs(f + 1/Δ) = Σ_n x[n] e^{−2πi(f+1/Δ)nΔ} = Σ_n x[n] e^{−2πifnΔ} e^{−2πin}
  Since e^{−2πin} = 1 for every integer n,
  Xs(f + 1/Δ) = Xs(f + fs) = Xs(f)
• Consider Xs(1/Δ − f) = Xs(fs − f):
  Xs(1/Δ − f) = Σ_n x[n] e^{−2πi(1/Δ−f)nΔ} = Σ_n x[n] e^{2πifnΔ} e^{−2πin}
  = Σ_n x[n] e^{2πifnΔ} = Xs(f)*   (for real x[n])
  so Xs(fs − f) = Xs(f)*.
Implications of Periodicity
• The observations above mean that Xs(f) is periodic in frequency, regardless of the signal x[n].
[Figure: an example |Xs(f)| and arg Xs(f), both repeating with period fs.]
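Both identities — periodicity with period fs and conjugate symmetry about fs — can be checked numerically for an arbitrary real sequence. A minimal sketch:

```python
import numpy as np

# Xs(f) = sum_n x[n] e^{-2 pi i f n Delta} is periodic with period
# fs = 1/Delta, and Xs(fs - f) = Xs(f)* for real x[n].
rng = np.random.default_rng(1)
x = rng.standard_normal(32)
delta = 0.25
fs = 1.0 / delta
n = np.arange(len(x))

def Xs(f):
    return np.sum(x * np.exp(-2j * np.pi * f * n * delta))

f = 0.37
print(abs(Xs(f + fs) - Xs(f)))              # zero up to rounding
print(abs(Xs(fs - f) - np.conj(Xs(f))))     # zero up to rounding
```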
Impulse Train (or Dirac Comb)
• Consider the function i(t):
  i(t) = Σ_n δ(t − nΔ)
[Figure: unit impulses at t = Δ, 2Δ, 3Δ, 4Δ, ….]
• We shall now compute the Fourier series coefficients, dn, for the Dirac comb. Recall the definition of the complex Fourier Series:
  x(t) = Σ_n dn e^{2πint/Tp},   dn = (1/Tp) ∫_{−Tp/2}^{Tp/2} x(t) e^{−2πint/Tp} dt
Fourier Series of a Dirac Comb
• The comb has period Δ, so
  dn = (1/Δ) ∫_{−Δ/2}^{Δ/2} i(t) e^{−2πint/Δ} dt
Over the region −Δ/2 to Δ/2, only one delta function in i(t) is present, i.e. the one at t = 0. So in this region we can write i(t) = δ(t):
  dn = (1/Δ) ∫_{−Δ/2}^{Δ/2} δ(t) e^{−2πint/Δ} dt = 1/Δ
hence
  i(t) = (1/Δ) Σ_n e^{2πint/Δ}
— an alternative representation of an impulse train.
Fourier Transform of an Impulse Train
• The Fourier transform of i(t) can be computed as
  I(f) = F{i(t)} = (1/Δ) Σ_n ∫ e^{2πint/Δ} e^{−2πift} dt
  = (1/Δ) Σ_n ∫ e^{−2πi(f−n/Δ)t} dt = (1/Δ) Σ_n δ(f − n/Δ) = (1/Δ) Σ_n δ(f − nfs)
[Figure: i(t), unit impulses spaced Δ apart, transforms to I(f), impulses of weight 1/Δ spaced fs = 1/Δ apart.]
i.e. the Fourier transform of a Dirac comb is another Dirac comb (scaled and with reciprocal spacing).
Alternative Definition of Xs(f)
• Consider F{x(t) i(t)}, where i(t) is an impulse train (Dirac comb) with spacing Δ:
  F{x(t) i(t)} = ∫ x(t) Σ_n δ(t − nΔ) e^{−2πift} dt = Σ_n ∫ x(t) δ(t − nΔ) e^{−2πift} dt
  = Σ_n x(nΔ) e^{−2πifnΔ} = Σ_n x[n] e^{−2πifnΔ} = Xs(f)
Aliasing
• Suppose x(t) is band limited, such that
  X(f) = 0,  |f| > f₀
• If fs/2 > f₀ then
  Xs(f) = (1/Δ) X(f)  for −fs/2 ≤ f ≤ fs/2
So, with the exception of the 1/Δ scaling factor, in this band the transforms are the same.
• If fs/2 < f₀ then
  Xs(f) ≠ (1/Δ) X(f)
— the shifted copies of X(f) overlap and the spectrum is aliased.
[Figure: a component at frequency q > fs also appears at q − 2fs, q − fs, fs + q, 2fs + q, ….]
Example 3
• Cosine wave:  x[n] = cos(2πqnΔ)
• We break this up using cos(2πqnΔ) = (e^{2πiqnΔ} + e^{−2πiqnΔ})/2
• From the last example:
  F{e^{2πiqnΔ}} = Σ_n δ(f − q − nfs)
• It is a simple step to then show
  F{e^{−2πiqnΔ}} = Σ_n δ(f + q − nfs)
• Thus
  Xs(f) = F{cos(2πqnΔ)} = (1/2) Σ_n [δ(f − q − nfs) + δ(f + q − nfs)]
[Figure: Xs(f) for q < fs – pairs of impulses repeated around 0, fs, 2fs, 3fs.]
So that
  x(t) = Δ ∫_{−fs/2}^{fs/2} Xs(f) e^{2πift} df = Δ ∫_{−fs/2}^{fs/2} Σ_n x[n] e^{−2πifnΔ} e^{2πift} df
  = Σ_n x[n] Δ ∫_{−fs/2}^{fs/2} e^{2πif(t−nΔ)} df = Σ_n x[n] sin(π(t − nΔ)/Δ) / (π(t − nΔ)/Δ)
Shannon’s Reconstruction Formula
  x(t) = Σ_n x[n] sin(π(t − nΔ)/Δ) / (π(t − nΔ)/Δ)
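Shannon's reconstruction formula can be tried out directly on a band-limited signal; with a finite number of samples the sum is truncated, so the recovery is very good but not exact. A sketch (the 2 Hz cosine and 10 Hz sampling rate are arbitrary choices, with f₀ well below fs/2):

```python
import numpy as np

# Rebuild a band-limited signal between its samples x[n] = x(n Delta)
# using shifted sinc functions (np.sinc(u) = sin(pi u)/(pi u)).
delta = 0.1                      # fs = 10 Hz, so the usable band is 5 Hz
n = np.arange(-200, 201)         # generous sample range; truncation is mild
f0 = 2.0                         # 2 Hz cosine, well below fs/2
xn = np.cos(2 * np.pi * f0 * n * delta)

def reconstruct(t):
    return np.sum(xn * np.sinc((t - n * delta) / delta))

t = 3.5 * delta                  # a point halfway between two samples
print(reconstruct(t), np.cos(2 * np.pi * f0 * t))
```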
Closer Look at Reconstructed Signal
• In this case the reconstructed signal and the original signal are not exactly the same.
• This is because the original signal (one cycle of a sine wave) has energy above fs/2, i.e. there is some aliasing.
Problems with Shannon’s Reconstruction
  x(t) = Σ_n x[n] sin(π(t − nΔ)/Δ) / (π(t − nΔ)/Δ)
DFT
[Figure: pictorial representation of the DFT.]
Inverse DFT (IDFT)
• One can show that:
  Σ_{n=0}^{N−1} e^{2πiqn/N} = 0 for 0 < q ≤ N − 1;  = N for q = 0
Zero Padding
• Take x[n], 0 ≤ n < N, and append zeroes up to some value M > N:
  x̃[n] = x[n], 0 ≤ n < N;  0, N ≤ n < M
• For this padded signal
  X̃s(f) = Xs(f)
• But it is sampled more finely, since the frequency spacing is
  δf = fs/M < fs/N
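The effect of zero padding is easy to demonstrate with the FFT: padding to twice the length leaves the underlying spectrum unchanged and simply samples it twice as finely, so every second bin of the padded DFT equals a bin of the original DFT:

```python
import numpy as np

# Zero padding a length-N signal to length 2N samples Xs(f) twice as
# finely; the even-numbered bins reproduce the original DFT exactly.
rng = np.random.default_rng(2)
N = 16
x = rng.standard_normal(N)

X = np.fft.fft(x)               # N frequency samples, spacing fs/N
X_pad = np.fft.fft(x, n=2 * N)  # 2N samples, spacing fs/(2N)

print(np.max(np.abs(X_pad[::2] - X)))   # essentially zero
```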
Example of Zero Padding
[Figure: a single cycle of a sine wave (n = 0 … N−1), and the same signal with zeroes added.]
Shifting a sampled signal raises two problems:
  – What do we do with the space at the start of the signal?
  – What do we do with the sample beyond the end of the measurement interval?
Option 1: Circular Shifts
• If we assume x[n] is periodically extended then the problems on the preceding slide have a “natural” solution.
[Figure: x[n] and its circular shift x[n−1], both on 0 … N−1.]
Option 2: Linear Shift
• Assuming the signal is zero outside of the measurement regime, a shift is defined so that the signal simply moves along the axis.
[Figure: x[n] on 0 … N−1 and x[n−1] on 1 … N.]
Circular Convolution
• In the DFT context, shifts like y[n−m] are defined as circular shifts (not linear shifts).
• The result of circularly convolving two signals of length N is also of length N.
Convolution in the Frequency Domain
• For the Fourier transforms considered up to now then, in some form, we have:
  F⁻¹{X(f) Y(f)} = x(t) * y(t)
• Time domain (discrete):
  Σ_m x[n − m] y[m]
  For each n, reverse one of the signals, shift it by n, multiply the shifted and unshifted signals together and then add the products up. This can be computationally demanding.
– Frequency domain:
  See the next slide for details, but basically Fourier transform the signals, multiply them in frequency and then apply the inverse transform. Because of the Fast Fourier Transform (see next lecture) this can be computationally efficient.
Computing Linear Convolution using the DFT
• Zero pad both signals, take DFTs, multiply, and take the IDFT.
Note: because of the zero padding the IDFT will be long enough (no shorter than M + N − 1) to accommodate the linear convolution.
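The procedure can be sketched directly: zero pad both sequences to at least M + N − 1 points so that the circular convolution implied by the DFT product reproduces the linear one, then compare against direct convolution:

```python
import numpy as np

# Linear convolution of a length-20 and a length-13 signal via the DFT:
# pad both to L >= N + M - 1 = 32, multiply the DFTs, invert.
rng = np.random.default_rng(3)
x = rng.standard_normal(20)
y = rng.standard_normal(13)
L = len(x) + len(y) - 1              # 32 points needed

X = np.fft.fft(x, n=L)
Y = np.fft.fft(y, n=L)
z_fft = np.fft.ifft(X * Y).real      # inputs are real, so take the real part

z_direct = np.convolve(x, y)         # direct linear convolution for comparison
print(np.max(np.abs(z_fft - z_direct)))
```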
The Fourier Integral and its Relatives
[Figure: the original continuous time signal (upper frame) and its Fourier transform (lower frame); F{x(t)} = X(f); X_T(f) = F{w(t)x(t)}.]
• Fourier integral (continuous time):
  X(f) = ∫ x(t) e^{−2πift} dt,   x(t) = ∫ X(f) e^{2πift} df
• Fourier series (continuous time, periodic, period T):
  d_k = (1/T) ∫_{−T/2}^{T/2} x(t) e^{−2πikt/T} dt
• Sampled-time Fourier transform (discrete time):
  Xs(f) = Σ_n x[n] e^{−2πifnΔ},   x[n] = (1/fs) ∫_{−fs/2}^{fs/2} Xs(f) e^{2πifnΔ} df
• DFT (discrete time, finite length):
  X[k] = Σ_{n=0}^{N−1} x[n] e^{−2πink/N},   x[n] = (1/N) Σ_{k=0}^{N−1} X[k] e^{2πink/N}
The Fast Fourier Transform
Introduction
• Computation of the DFT
• FFT Algorithm
  • Divide and conquer approach
  • Mathematical “trick”
  • Diagrammatic representation
• Speed of the FFT
Computing the DFT
• A DFT can be computed directly using the following recipe (algorithm):
  – For each frequency k (of which there are N):
    • Multiply the signal x[n] by the complex exponential e^{−2πink/N}
    • Sum this product to form X[k]
[Diagram: a “butterfly” takes two inputs a and b and produces the outputs a + b and a − b.]
Decimation in Time
• The FFT starts by dividing a sequence into two, assuming N is even, by considering the even and odd numbered samples in a signal as two different sequences:
  X[k] = Σ_{n=0}^{N−1} x[n] W_N^{nk},   where W_N = e^{−2πi/N}
  x₁[n] = x[2n],     n = 0, 1, 2, …, N/2 − 1
  x₂[n] = x[2n+1],   n = 0, 1, 2, …, N/2 − 1
Resulting Simplification
• The DFT can be written in terms of these two sequences:
  X[k] = Σ_{n=0}^{N/2−1} x[2n] W_N^{2nk} + Σ_{n=0}^{N/2−1} x[2n+1] W_N^{(2n+1)k}
       = Σ_{n=0}^{N/2−1} x₁[n] W_N^{2nk} + W_N^k Σ_{n=0}^{N/2−1} x₂[n] W_N^{2nk}
• Hence, since W_N² = W_{N/2}:
  X[k] = Σ_{n=0}^{N/2−1} x₁[n] W_{N/2}^{nk} + W_N^k Σ_{n=0}^{N/2−1} x₂[n] W_{N/2}^{nk}
Summary so far
• Thus we have
  X[k] = Σ_{n=0}^{N/2−1} x₁[n] W_{N/2}^{nk} + W_N^k Σ_{n=0}^{N/2−1} x₂[n] W_{N/2}^{nk}
  X[k + N/2] = Σ_{n=0}^{N/2−1} x₁[n] W_{N/2}^{nk} − W_N^k Σ_{n=0}^{N/2−1} x₂[n] W_{N/2}^{nk}
• But
  X₁[k] = Σ_{n=0}^{N/2−1} x₁[n] W_{N/2}^{nk},   X₂[k] = Σ_{n=0}^{N/2−1} x₂[n] W_{N/2}^{nk}
i.e. these are the DFTs of the two sequences (the even and odd numbered samples).
The Saving
• To compute X[k] one can compute X₁[k] and X₂[k] and combine them as follows:
  X[k] = X₁[k] + W_N^k X₂[k]
  X[k + N/2] = X₁[k] − W_N^k X₂[k]
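One stage of this decimation can be verified numerically: compute the two half-length DFTs, combine them with the twiddle factors, and compare against a full-length DFT:

```python
import numpy as np

# One stage of decimation in time:
#   X[k]       = X1[k] + W_N^k X2[k]
#   X[k + N/2] = X1[k] - W_N^k X2[k]
rng = np.random.default_rng(4)
N = 16
x = rng.standard_normal(N)

X1 = np.fft.fft(x[0::2])             # DFT of even-numbered samples
X2 = np.fft.fft(x[1::2])             # DFT of odd-numbered samples
k = np.arange(N // 2)
W = np.exp(-2j * np.pi * k / N)      # twiddle factors W_N^k

X_combined = np.concatenate([X1 + W * X2, X1 - W * X2])
print(np.max(np.abs(X_combined - np.fft.fft(x))))
```

Applying this split recursively (for N a power of two) is exactly what reduces the DFT's cost from O(N²) to O(N log N).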
Probability Density Functions
• All values must lie between −∞ and ∞, i.e. Pr{−∞ < x < ∞} = 1, hence for any pdf
  ∫_{−∞}^{∞} p(x) dx = 1
[Figures: example pdfs with their moments, e.g. m = 50, s = 10, M₃ = 0, M₄ = 30,000; m = 10, s = 3.2, M₃ = 0, M₄ = 300; m = 1, s = 1, M₃ = 2, M₄ = 9; m = 0, s = 2, M₃ = 0, M₄ = 24.]
• Two example pdfs:
  p(x) = e^{−x}, x ≥ 0;  = 0, x < 0      (exponential)
  p(x) = (1/2) e^{−|x|}                  (double-sided exponential)
Examples (cont’d)
• Consider the pdf:  p(x) = e^{−x}, x ≥ 0;  = 0, x < 0
• First moment (mean):
  E[X] = ∫_0^∞ x e^{−x} dx = [−x e^{−x}]_0^∞ + ∫_0^∞ e^{−x} dx = 1
• Second central moment (variance):
  E[(X − 1)²] = ∫_0^∞ (x − 1)² e^{−x} dx = ∫_0^∞ (x² − 2x + 1) e^{−x} dx
  Using ∫_0^∞ x e^{−x} dx = 1 and ∫_0^∞ e^{−x} dx = 1:
  E[(X − 1)²] = ∫_0^∞ x² e^{−x} dx − 1
  ∫_0^∞ x² e^{−x} dx = [−x² e^{−x}]_0^∞ + 2 ∫_0^∞ x e^{−x} dx = 2
  so  E[(X − 1)²] = 2 − 1 = 1
• You can prove the 3rd and 4th moment results yourself……
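The analytic results (mean 1, variance 1 for the unit exponential) can be checked with a quick Monte Carlo simulation; with a million samples the sample moments land very close to the theoretical values:

```python
import numpy as np

# Monte Carlo check of the unit-exponential moments derived above:
# E[X] = 1 and E[(X - 1)^2] = 1 for p(x) = e^{-x}, x >= 0.
rng = np.random.default_rng(5)
samples = rng.exponential(scale=1.0, size=1_000_000)

mean = samples.mean()
var = np.mean((samples - 1.0) ** 2)

print(mean, var)    # both close to 1
```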
Example: Using Data
• Consider 1000 samples from the exponential pdf.
• We can compute the average of these samples, which happens to be 1.0158.
• We can compute the average of (x − 1.0158)², which is 1.01.
• Similarly we can compute the 3rd central moment:
  (1/N) Σ_{k=1}^N (x_k − 1.0158)³ = 1.973
• The fourth central moment is
  (1/N) Σ_{k=1}^N (x_k − 1.0158)⁴ = 8.17
• A bivariate pdf integrates to one:
  ∫_{x=−∞}^{∞} ∫_{y=−∞}^{∞} p(x, y) dx dy = 1
Further Properties of Bivariate PDFs: Expectations
• One can use bivariate pdfs to compute expectations; in general
  E[f(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) p(x, y) dx dy
• To estimate the correlation from data:
  – Compute the sample means:
    x̄ = (1/N) Σ_{k=1}^N x_k,   ȳ = (1/N) Σ_{k=1}^N y_k
  – Subtract these means from the samples:
    x̃_k = x_k − x̄,   ỹ_k = y_k − ȳ
  – Then compute the mean of the product x̃_k ỹ_k:
    R̂_xy = (1/N) Σ_{k=1}^N x̃_k ỹ_k
• Notes:
  – This is not the most computationally efficient method.
  – Strictly the 1/N factor in the last step should be 1/(N−1) – don’t worry why; it makes little difference unless N is very small.
Positive and Negative Correlations
• The height vs weight data is an example of a positive
correlation: an increase in weight is associated with an
increase in height.
– In the graph in the preceding slide the slope of a line through the data
is positive.
• Data for which an increase in one variable is associated with a
reduction in the other is called a negative correlation.
– For example, if one were to plot lifetime against number of cigarettes smoked, one would expect to see a graph in which the average lifetime reduces as the number of cigarettes smoked increases – a negative correlation.
Examples
Effect of Scale
• The height and weight data were expressed in cm and kg.
• If height were expressed in metres, then all of the x values
would be multiplied by 1/100.
• Thus the value of Rxy would be similarly scaled by 1/100 if we
used m instead of cm.
• Clearly this does not mean that the correlation is less because
of the units used to express values!
• To avoid such dependencies on the units used one can define
the correlation coefficient.
(Pearson’s) Correlation Coefficient
• The correlation coefficient, ρ, is defined as:
  ρ = E[(X − m_x)(Y − m_y)] / (s_x s_y)
  where s_x = √E[(X − m_x)²],  s_y = √E[(Y − m_y)²]
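The sample version of this definition follows the estimation recipe above (centre, multiply, average) and agrees with numpy's built-in `corrcoef`. A sketch on simulated positively correlated data:

```python
import numpy as np

# Sample Pearson correlation coefficient
#   r = E[(X - m_x)(Y - m_y)] / (s_x s_y),
# checked against numpy's built-in estimator.
rng = np.random.default_rng(6)
x = rng.standard_normal(10_000)
y = 2.0 * x + rng.standard_normal(10_000)   # positively correlated with x

xc, yc = x - x.mean(), y - y.mean()
r = np.mean(xc * yc) / (np.sqrt(np.mean(xc ** 2)) * np.sqrt(np.mean(yc ** 2)))

print(r, np.corrcoef(x, y)[0, 1])
```

The 1/N versus 1/(N−1) factors cancel in the ratio, which is why the two results agree to machine precision.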
• Basic principles
– Ensemble averages
– Stationary signals
– Ergodicity
• Correlation function
• Cross correlation functions
What is a Random Time Series?
• [Throughout this section we shall consider continuous time
signals – a parallel set of concepts can be applied to discrete
signals, with very little modification]
• A random time series is a signal that is random …..
• If we make multiple measurements of the same process we
obtain different signals which have the same “structure”.
• The expected value at time t is defined across realisations:
  E[X(t)] = μ_x(t) = lim_{N→∞} (1/N) Σ_{k=1}^N x_k(t)
  and with a finite number of realisations we can use
  E[X(t)] ≈ (1/N) Σ_{k=1}^N x_k(t)
• Averages computed over multiple realisations are called
Ensemble averages.
• In many applications making a large number of measurements
is not feasible.
Time Averages
• For an ergodic signal the ensemble average can be replaced by a time average:
  E_t[X(t)] = (1/T) ∫_0^T x(t) dt
• At zero lag the correlation equals the signal power:
  r(0) = E[X(t)²] = σ²
• The correlation never exceeds the value at τ = 0:
  |r(τ)| ≤ r(0) = σ²  ∀τ
  This can be thought of as defining the idea that no point can be better correlated with X(t) than X(t) itself.
Time Averages
• The autocorrelation as a time average:
  r_xx(τ) = E[X(t − τ) X(t)] = lim_{T→∞} (1/T) ∫_0^T x(t − τ) x(t) dt
• If both X(t) and Y(t) are stationary then, as before, it is not the absolute times t₁ and t₂ that matter but the time difference, i.e.
  r_xy(τ) = E[X(t − τ) Y(t)]
• Energy:
  Energy = ∫_{−T/2}^{T/2} x(t)² dt → ∞ as T → ∞
Fourier Transform of a Random Signal
• Power:
  Power = (1/T) ∫_{−T/2}^{T/2} x(t)² dt → σ² = E_t[X(t)²] as T → ∞
  i.e. the power tends to the signal’s variance.
Fourier Transform in the Limit
• For the truncated signal x_T(t), Parseval’s theorem gives
  (1/T) ∫_{−T/2}^{T/2} x_T(t)² dt = (1/T) ∫_{−∞}^{∞} |X_T(f)|² df
• Which is, as noted above, a random quantity.
[Figures: example |X_T(f)|² on a linear scale and on a logarithmic (dB) scale.]
Average Fourier Transform
• To remove the randomness we average, i.e. define
  S_xx(f) = lim_{T→∞} E[|X_T(f)|²] / T
Power Spectral Density (PSD)
• The PSD describes how a signal’s power is distributed over frequency; for example, the power in the band 2000–3000 Hz is
  ∫_{−3000}^{−2000} S_xx(f) df + ∫_{2000}^{3000} S_xx(f) df = 2 ∫_{2000}^{3000} S_xx(f) df
Properties of PSDs
• It is symmetrical in frequency:
  S_xx(−f) = S_xx(f)
• It integrates to give the signal’s power:
  ∫_{−∞}^{∞} S_xx(f) df = σ²
Wiener-Khinchin Theorem
• The PSD is the Fourier transform of the autocorrelation function:
  S_xx(f) = ∫_{−∞}^{∞} r_xx(τ) e^{−2πifτ} dτ
• Example: r_xx(τ) = e^{−λ|τ|}
  S_xx(f) = ∫_{−∞}^{∞} e^{−λ|τ|} e^{−2πifτ} dτ
  = ∫_0^∞ e^{−λτ} e^{−2πifτ} dτ + ∫_{−∞}^0 e^{λτ} e^{−2πifτ} dτ = 2λ/(λ² + 4π²f²)
Example: Results
[Figure: S_xx(f) for λ = 100, λ = 10 and λ = 1.]
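The closed-form transform of the exponential correlation function can be checked by numerical integration (the value λ = 10 is an arbitrary choice):

```python
import numpy as np

# Numerical check of S(f) = 2 lam / (lam^2 + 4 pi^2 f^2) as the Fourier
# transform of the correlation function r(tau) = e^{-lam |tau|}.
lam = 10.0
tau = np.linspace(-5.0, 5.0, 200001)     # e^{-50} is negligible at the ends

def S_numeric(f):
    integrand = np.exp(-lam * np.abs(tau)) * np.exp(-2j * np.pi * f * tau)
    return np.trapz(integrand, tau).real

def S_exact(f):
    return 2 * lam / (lam ** 2 + 4 * np.pi ** 2 * f ** 2)

f_test = 3.0
print(S_numeric(f_test), S_exact(f_test))
```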
Example: White Noise
• Consider a signal whose correlation function is a Dirac delta function:
  r_xx(τ) = δ(τ)
  S_xx(f) = ∫_{−∞}^{∞} δ(τ) e^{−2πifτ} dτ = 1
Cross-Spectral Density
• If one measures two random signals X(t) and Y(t) then one can define a cross-spectrum as:
  S_xy(f) = lim_{T→∞} E[X_T(f)* Y_T(f)] / T
• It satisfies
  S_xy(−f) = S_xy(f)* = S_yx(f)
  |S_xy(f)|² ≤ S_xx(f) S_yy(f)
• Consider the case where the two processes (X(t) and Y(t)) are the input and output of a linear system with FRF H(f):
  Input, X(t) → [Linear System, H(f)] → Output, Y(t)
• Cross-spectral density:
  S_xy(f) = lim_{T→∞} E[X_T(f)* Y_T(f)] / T = lim_{T→∞} E[X_T(f)* H(f) X_T(f)] / T
  = H(f) lim_{T→∞} E[|X_T(f)|²] / T = H(f) S_xx(f)
• Using the last two results (and S_yy(f) = |H(f)|² S_xx(f)), the coherence is
  γ²_xy(f) = |S_xy(f)|² / (S_xx(f) S_yy(f)) = |H(f)|² S_xx(f)² / (S_xx(f) |H(f)|² S_xx(f)) = 1
Principles of Estimation Theory
Outline
• Estimation theory
– What is an estimator?
– What is a good estimator?
– Bias/Variance/Mean squared error
– Consistency
• Estimation of PSDs
– Periodograms
– Segment averaging
– Bias – Variance trade off
• Estimation of Cross Spectral Densities
General Estimation Problem
• Consider data, e.g. a digital time series x[n], from which one seeks to estimate the value of some parameter θ.
  – Example problems might be:
    • From a set of data consisting of a sine wave in noise, estimate the frequency.
    • For noisy data which is thought to lie on a straight line, estimate the slope of that line.
    • From a transfer function, near a mode, what is the damping coefficient for that mode?
• The goal is to use the data to generate a number, θ̂, which approximates θ.
* Note there are an infinite number of methods that could be considered; nearly all of those would have no logical basis. The three illustrated here are chosen to have some reasoning behind them.
Example on One Data Set
• Consider one data set of 10 measurements from which we can construct
an estimate.
x y
0.4036 0.0939 * Data points
0.8765 0.1837 Red dotted line is the “true”
0.6154 0.1707 curve, which is only known
0.0636 -0.0067 because I simulated the data.
0.4610 0.0541 Blue dotted line is the best fit
0.4201 0.0862 straight line to the data set.
0.5578 0.1676
0.7780 0.1806
0.9371 0.2348
0.0692 0.0321
θ̂_a: maximum of x is 0.9371, maximum of y is 0.2348, so the estimate of the slope is 0.2505.
θ̂_b: mean of x is 0.5182, mean of y is 0.1197, so the estimate for the slope is 0.2310.
θ̂_c: best fit straight line is y = 0.2456x − 0.0076, so the estimate for the slope is 0.2456.
Comments on Results
• From the analysis of the one data set in the last slide we see each
estimator produces a different estimate for the slope:
– Estimator a) suggests 0.2505
– Estimator b) suggests 0.2310
– Estimator c) suggests 0.2456
• Recall the correct answer is 0.25, so it looks like estimator a) is the best
………
• Or is it? The previous data set was generated using one set of 10 random
numbers representing the data.
• If we generate a new set of random numbers to simulate the data and
recompute the estimators we get the values of 0.2802, 0.2430, 0.2884 for
the three estimators a), b) and c)….. so now b) looks best being closest to
the true answer 0.25.
Repeated Testing
• One can run lots of tests for different sets of 10 random numbers and for
each set calculates the values for each of the 3 estimators:
Using 10,000 realisations of our simulated data we can look at the bias, variance
and mean squared errors for our 3 candidate estimators.
Estimator | Bias                  | Variance | MSE
θ̂_a       | 0.2603 − 0.25 = 0.0103  | 0.00056  | 0.00067
θ̂_b       | 0.2499 − 0.25 = −0.0001 | 0.00028  | 0.00028
θ̂_c       | 0.2504 − 0.25 = 0.0004  | 0.00095  | 0.00095
Both θ̂_b and θ̂_c have biases very close to zero, so they appear to be unbiased, whereas θ̂_a has a comparatively large bias, i.e. it is biased.
It is estimator θ̂_b which has the lowest MSE, so it is the best estimator of the three considered here.
Note that whilst θ̂_a has the largest bias, it has a relatively low variance and its MSE is better than that of θ̂_c.
In summary we would rank the estimators, best to worst, as θ̂_b, θ̂_a and then θ̂_c.
Consistency
• An estimator is said to be consistent if the mse tends to zero as the data
length increases.
– Estimators which are not consistent are considered to be poor.
For the example of a slope we can look at the behaviours of the bias, variance and MSE for different data lengths N.
[Figure: bias, variance and MSE of the three estimators as functions of the data length N.]
Comments on Consistency
• The results from our example show a couple of points:
– The estimators θ̂_b and θ̂_c are both consistent – as the data length increases the MSE for both estimators reduces and tends towards zero.
– For all the data lengths considered, the best performing estimator is always θ̂_b.
– The estimator θ̂_a is very poor.
• As the data length increases its performance decreases (mse increases)!
• It was only the fact that we originally looked at short data lengths (N=10) that
meant it seemed to do reasonably (it was our second best estimator).
Optimal Estimators
• How does one find the overall best (optimal) estimator?
• We could just consider all the estimators we can think of, measure the MSE for them all and choose the best.
• For our slope example there are lots more “sensible” estimators one might think of, for example (…and even more nonsensical ones):
  θ̂_d = (1/N) Σ_{n=1}^N yₙ/xₙ;   θ̂_e = median(yₙ)/median(xₙ);   m = arg max(xₙ), θ̂_f = y_m/x_m
Least Squares
• A good estimate of θ will make the estimates of the data, ŷₙ = θ̂ xₙ, close to the measured values, so minimise
  L = Σ_{n=1}^N (yₙ − ŷₙ)² = Σ_{n=1}^N (yₙ − θ̂ xₙ)²
• Setting the derivative to zero:
  dL/dθ̂ = −2 Σ_{n=1}^N xₙ (yₙ − θ̂ xₙ) = 0   ⇒   θ̂* = Σ_{n=1}^N xₙ yₙ / Σ_{n=1}^N xₙ²
• Equivalently
  θ̂* = [(1/N) Σ_{n=1}^N xₙ yₙ] / [(1/N) Σ_{n=1}^N xₙ²]
  where the denominator is (for zero-mean x) the variance of x.
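The least-squares slope estimator is two lines of numpy. A sketch comparing it with the "ratio of means" estimator on simulated data with true slope 0.25, mirroring the lecture's example (the noise level is an arbitrary choice):

```python
import numpy as np

# Least-squares slope for a line through the origin:
#   theta* = sum(x_n y_n) / sum(x_n^2),
# compared with the ratio-of-means estimator mean(y)/mean(x).
rng = np.random.default_rng(7)
theta_true = 0.25
x = rng.uniform(0, 1, 10_000)
y = theta_true * x + 0.02 * rng.standard_normal(10_000)

theta_ls = np.sum(x * y) / np.sum(x ** 2)
theta_mean = np.mean(y) / np.mean(x)

print(theta_ls, theta_mean)     # both near 0.25 for long data
```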
Estimating PSDs
• Recall
  S_xx(f) = lim_{T→∞} E[|X_T(f)|²] / T
• One might initially consider |X(f)|²/T computed from the whole record (the periodogram).
[Figure: PSD estimates for N = 4096 – few long segments give low bias but high variance; many short segments give low variance but high bias.]
Estimating Cross-Spectra
• One can use segment averaging to compute the cross-spectrum in a manner similar to that used to estimate the PSD.
• Given two time series x(t) and y(t) one aims to estimate the cross-spectrum S_xy(f), where
  S_xy(f) = lim_{T→∞} (1/T) E[X_T(f)* Y_T(f)]
• The basic steps are:
  – The two signals are segmented and a window may be applied.
  – For each segment the Fourier transforms, X_k(f) and Y_k(f), are computed, where k is the segment number.
  – The product X_k(f)* Y_k(f) is formed.
  – These products are averaged across the segments:  (1/K) Σ_{k=1}^K X_k(f)* Y_k(f)
  – This is then scaled by 1/T to produce an estimate of the cross-spectrum.
Illustration of Cross-Spectrum Estimation
[Figure: segmented records of x(t) and y(t).]
Errors in Cross-Spectrum
• Consider the two signals x(t) and y(t) which are correlated with a delay of
t0 seconds between them.
• If the window length Ts is small compared to t0 then features in one signal,
say x(t), will not appear in the corresponding segment in y(t).
• Delays between x(t) and y(t) can result in a significant reduction in the
estimated cross-correlation, i.e. can lead to strong biases.
Example
[Figure: two segmented signals of segment length Ts, with a delay t₀ between them.]
Example: Effect of Delay
• Cross-spectrum computed for the input and output from a digital filter.
• An extra delay is incorporated in the output.
• The cross-spectrum is computed using a 256-point Hanning window.
Example: Effect of FFT Size
• Example as previous slide, but with fixed delay of 128 samples.
• In this case the FFT size is varied.
Estimating Frequency Response
Functions (FRFs)
Outline
• Problem definition
• H1 and H2 estimators
• Relationship to the coherence function
• Biases in H1 and H2
• Reasons for lack of coherence
Problem Definition
• A common task in engineering is to estimate the frequency response
function for a system.
• This is commonly achieved using a controlled (and measured) input, x(t),
and measuring the response, y(t).
• So the problem is to estimate H(f) from the measurements x(t) and y(t),
in this case assuming x(t) (and thus y(t)) are random signals.
Relationship between Spectra
• We have already seen that if x(t) and y(t) are the input and output of a linear system, with a frequency response H(f), then the PSDs and cross-spectra are related by:
  S_yy(f) = |H(f)|² S_xx(f)
  S_xy(f) = H(f) S_xx(f)
  S_yx(f) = H(f)* S_xx(f)      (* means conjugate)
• These can be rearranged so that H(f) is the subject of the equations, in (at least) two different ways:
  H(f) = S_xy(f) / S_xx(f)
  H(f) = S_yy(f) / S_yx(f)
Estimators of FRFs
• The two formulations for H(f) are identical if the theoretical spectra are
considered.
• In practice all of Sxx(f), Syy(f) and Sxy(f) have to be estimated from the
available data – which means that the two formulations for H(f), on the
last slide, will not be the same.
• These estimated quantities are given the names H1(f) and H2(f) and are
defined as:
H1(f) = Ŝxy(f)/Ŝxx(f)
H2(f) = Ŝyy(f)/Ŝyx(f)
Comments
• The estimator H1(f) is defined as
H1(f) = Ŝxy(f)/Ŝxx(f) = Σ_n Xn(f)* Yn(f) / Σ_n |Xn(f)|^2
where Xn(f) is the Fourier transform of the nth segment of x(t).
• Compare this to the optimal estimator for the slope of a line constrained to pass through (0,0) – as per the example in an earlier lecture:
θ* = Σ_k xk yk / Σ_k xk^2
• The FRF estimation problem deals with complex valued data, whereas
the slope problem only considers real valued data. With the exception
of that difference then these two solutions are the same.
• Estimating a FRF can be viewed as finding the slope for complex valued
data.
Relationship Between Estimators
• It is pretty simple to show that:
H1(f)/H2(f) = (Sxy(f)/Sxx(f)) / (Syy(f)/Syx(f)) = Sxy(f) Syx(f) / (Sxx(f) Syy(f)) = |Sxy(f)|^2 / (Sxx(f) Syy(f)) = γ_xy^2(f)
• where γ_xy^2(f) is the coherence function and 0 ≤ γ_xy^2(f) ≤ 1.
• This means that |H2(f)| ≥ |H1(f)|.
‒ The above considers theoretical spectra.
‒ When these spectra are estimated from data this inequality is still
guaranteed to hold.
Measurements with Output Noise
• Consider the problem of estimating the frequency response function
when the output signal is corrupted by additive noise, n(t).
(Block diagram: x(t) → system H(f) → u(t); the measured output is y(t) = u(t) + n(t).)
• The noise is assumed to be uncorrelated with the signal x(t), so will also
be uncorrelated with u(t).
• What happens if one now uses x(t) and y(t) to compute the FRF?
Observations for this Model
• From the previous slide, we can express the Fourier transform of the output as: Y(f) = H(f)X(f) + N(f).
• We use simplified definitions of the PSD and cross-spectra, in which the division by T and the limit as T → ∞ are not included.
‒ This is just to keep the equations simpler: the factors appear in all terms and simply carry through the expressions.
• The cross-spectrum Sxy(f) is thus:
Sxy(f) = E[X(f)* Y(f)] = E[X(f)* (H(f)X(f) + N(f))]
= H(f) E[X(f)* X(f)] + E[X(f)* N(f)] = H(f) Sxx(f)
• Thus the noise does not affect the cross-spectrum.
• This development uses the fact that N(f) and X(f) are uncorrelated, so that E[X(f)* N(f)] = 0.
Observations for this Model (cont’d)
• The PSD Syy(f) is thus:
Syy(f) = E[|Y(f)|^2] = E[(H(f)X(f) + N(f))* (H(f)X(f) + N(f))]
= |H(f)|^2 E[|X(f)|^2] + H(f)* E[X(f)* N(f)] + H(f) E[N(f)* X(f)] + E[|N(f)|^2]
and, since the cross terms vanish,
Syy(f) = |H(f)|^2 Sxx(f) + Snn(f)
• Using the expressions for the FRF estimators one has:
H1(f) = Sxy(f)/Sxx(f) = H(f)
H2(f) = Syy(f)/Syx(f) = (|H(f)|^2 Sxx(f) + Snn(f)) / (H(f)* Sxx(f)) = H(f) (1 + Snn(f)/Suu(f))
where Suu(f) = |H(f)|^2 Sxx(f) is the PSD of the noise-free output u(t).
‒ The factor (1 + Snn(f)/Suu(f)) is real and positive, so it does not affect the phase of H2(f).
‒ The degree of over-estimation depends on the reciprocal of the signal-to-noise ratio (SNR) at the output:
o High SNR (little noise): the estimator is good.
o Low SNR (more noise): the estimator is poor.
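The output-noise bias can be checked numerically. The sketch below (my own illustration, not from the notes) passes white noise through a flat gain of 2 and adds independent output noise at an SNR of 4; H1 stays near the true gain, while H2 is inflated by roughly the factor (1 + Snn/Suu) = 1.25:

```python
import numpy as np

rng = np.random.default_rng(1)
nfft, nseg = 256, 200
g = 2.0                               # true, frequency-flat FRF: H(f) = 2
Sxx = Syy = Sxy = 0.0
for _ in range(nseg):
    x = rng.standard_normal(nfft)
    n = rng.standard_normal(nfft)     # output noise, uncorrelated with x
    y = g * x + n                     # output SNR = g^2 var(x)/var(n) = 4
    X, Y = np.fft.rfft(x), np.fft.rfft(y)
    Sxx = Sxx + np.abs(X)**2
    Syy = Syy + np.abs(Y)**2
    Sxy = Sxy + np.conj(X) * Y
H1 = Sxy / Sxx                        # ~ 2 at every bin (unbiased)
H2 = Syy / np.conj(Sxy)               # ~ 2*(1 + 1/4) = 2.5 (biased upward)
```

Averaging over segments and frequency bins, H1 converges to 2 while H2 converges to about 2.5, as the analysis above predicts.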
Measurements with Input Noise
• In this case the measured input is contaminated: the true input v(t) drives the system H(f), while the measurement is x(t) = v(t) + n(t). More generally both measurements may be noisy: x(t) = v(t) + n1(t) and y(t) = u(t) + n2(t), with u(t) the noise-free output. (Block diagrams omitted.)
• Then
|H1(f)| ≤ |H(f)| ≤ |H2(f)|
Arg{H1(f)} = Arg{H(f)} = Arg{H2(f)}
• So that H1(f) and H2(f) bracket the true FRF.
‒ Since |H1(f)| = γ_xy^2(f) |H2(f)|, if the coherence is close to one then the bracket is tight.
The Coherence Function
• In practice before concerning oneself with the estimated FRF it is wise
to first consider the coherence function.
• If the coherence is small then the estimated FRF is possibly unreliable.
• There are several reasons why the coherence function may be less than
unity:
‒ Noise (as discussed in this lecture)
‒ The system may be non-linear
‒ There may be other inputs, which are not being measured, that contribute to y(t).
‒ Estimation errors, i.e. the estimate of the coherence could be at fault.
o In particular the cross-spectrum estimate can be biased in frequency bands where the transfer function's phase varies rapidly … like at a resonance.
Example
• Coherence function estimated using segment average for different
segment (FFT) sizes.
• The system is an AR system with a resonance at 2 kHz (as used
previously).
Chuang Shi
November 2023
174-730-538
Today’s Learning Outcomes
By the end of the session the students should be able to:
1. Recall different properties of continuous-time systems and describe their
basic physical meanings.
2. Identify a linear ordinary differential equation or a continuous-time linear
time-invariant system based on the form of a given differential equation.
3. Determine the transfer function of a continuous-time linear time-invariant
system based on its linear ordinary differential equation or impulse
response.
4. Sketch the frequency response of a continuous-time linear time-invariant
system based on its pole-zero diagram.
5. Explain the stability and causality of a continuous-time linear time-
invariant system based on a mathematical representation.
What is a System?
• A “system” is anything which takes one (or more) inputs and creates one (or more) outputs.
• Examples:
– A loudspeaker: the input being the voltage driving
the speaker and the output being the “sound”.
– A filter: the input is the unfiltered signal and the output is the filtered
signal (!)
• If one knows the system and the output y(t), can one estimate
the input, x(t)?
– i.e. to remove the effect of the system from y(t), e.g. removing the
effect of distortions.
Linear Systems
• An important subclass of systems are the linear systems.
• Consider a system whose response to the input u(t) is w(t) and
whose response to v(t) is z(t).
• Then, for a linear system, the response, y(t), to a new
combined input, x(t):
x(t) = c1 u(t) + c2 v(t)
is given by
y(t) = c1 w(t) + c2 z(t)
where c1 and c2 are scalar constants.
• This is a form of the principle of super-position.
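The superposition property can be demonstrated numerically. A minimal sketch (the 3-tap moving-average filter is an illustrative choice of mine; any LTI system would do):

```python
import numpy as np

# An example linear system: a 3-tap moving-average filter.
def system(x):
    return np.convolve(x, [0.5, 0.3, 0.2])

rng = np.random.default_rng(0)
u = rng.standard_normal(32)
v = rng.standard_normal(32)
c1, c2 = 2.0, -3.0
lhs = system(c1 * u + c2 * v)           # response to the combined input
rhs = c1 * system(u) + c2 * system(v)   # combination of the two responses
# lhs and rhs agree (to rounding error): superposition holds
```

The same check would fail for a non-linear operation such as squaring the input.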
Time-Invariant Systems
• A system which does not evolve with time is said to be time-
invariant.
• Formally, if x(t) is an input at time t which elicits the response
y(t), then x(t-T), which is the same input occurring T seconds
later, elicits the response y(t-T), i.e. the same response delayed
by T seconds.
• Causality
– The output, y(t), at any time t, can only depend
on inputs that have occurred before t. i.e. the
system is NOT magic, it cannot predict what is
to come!
(Pictured examples of systems: loudspeaker, microphone, transformer.)
Forms of Model
• There are 4 basic representations for continuous time LTI
systems.
1) Linear Ordinary Differential Equations (ODEs). Often
derived by consideration of the underlying physics.
2) Laplace domain, i.e. transfer functions, which are good tools
for assessing stability.
3) Frequency domain, i.e. frequency response functions, which
are easy to measure.
4) Time-domain, i.e. impulse responses, good for predicting the
output of a system given an arbitrary input.
• These are not the exclusive uses of each representation; they are simply the tasks to which each is (arguably) best suited.
Linear Ordinary Differential
Equations
• An ODE of the form
a_p d^p y/dt^p + ... + a_1 dy/dt + a_0 y = b_q d^q x/dt^q + ... + b_1 dx/dt + b_0 x
where the a’s and b’s are constants, represents an LTI system.
(The slide also shows counter-examples: a linear but time-varying equation, and a non-linear, time-invariant equation with each term on the LHS being non-linear.)
Transfer Functions
• The Laplace transform, L{·}, has the extremely useful property of changing an LTI ODE into an algebraic equation, because
if L{y(t)} = Y(s) then L{dy/dt} = sY(s)
(assuming zero initial conditions).
• So applying the Laplace transform to
a_p d^p y/dt^p + ... + a_1 dy/dt + a_0 y = b_q d^q x/dt^q + ... + b_1 dx/dt + b_0 x
leads to
a_p s^p Y(s) + ... + a_1 sY(s) + a_0 Y(s) = b_q s^q X(s) + ... + b_1 sX(s) + b_0 X(s)
Transfer Functions (cont’d)
• Which we rearrange as:
(a_p s^p + ... + a_1 s + a_0) Y(s) = (b_q s^q + ... + b_1 s + b_0) X(s)
giving the transfer function
H(s) = Y(s)/X(s) = (b_q s^q + ... + b_1 s + b_0) / (a_p s^p + ... + a_1 s + a_0)
• For the systems we shall consider the transfer function has the form of the ratio of two polynomials in the Laplace variable, s.
• Recall s is a complex valued variable.
Poles and Zeros
• The transfer function H(s) is a complex valued function of the
complex valued variable s.
• Visualising such a function is not easy.
• It turns out that if H(s) is the ratio of 2 polynomials, i.e.
H(s) = Q(s)/P(s)
• Then the roots of the two polynomials convey all of the
important information regarding the system H(s).
• The poles are roots of P(s), i.e. the values of s such that P(s)=0
and hence the values of s where H(s)=∞.
• The zeros are roots of Q(s), i.e. the values of s such that Q(s)=0
and hence the values of s where H(s)=0.
Pole-Zero Diagram
• A transfer function is plotted as a pole-zero diagram, in which
just the positions of the poles and zeros are shown:
– Shows the complex plane of s (the s-plane) on which
• Poles are marked with an “×”
• Zeros are marked with an “o”
Stability
• The stability of a system is usually assessed using the transfer
function.
• A system is unstable if any of the poles have positive real parts, i.e. if any poles lie in the right half-plane (to the right of the imaginary axis).
• In general, stability is not affected by the position of the zeros.
• Thus, simple inspection of the pole-zero diagram reveals
whether a system is stable or not.
Frequency Response (Analytic)
• The frequency response, H(f), of a system can be obtained
analytically if the transfer function is known.
• Specifically
H(f) = H(s)|_{s=2πif}
(Figures: magnitude-frequency sketches showing that a pole near a given frequency enhances it, while a zero near a frequency eliminates it.)
Frequency Response (Measurement)
• The frequency response of a system is something that can
commonly be measured.
• If the input to an LTI system is a sine wave at frequency f and amplitude A, then the output will also be a sine wave at frequency f but with amplitude A|H(f)|. The phase change in that sine wave will be Arg{H(f)}.
• So one way to measure a frequency response is to probe the
system with a set of sine waves at different frequencies:
– The amplitude of the output relative to the input, gives the magnitude
of the frequency response
– The phase change between the input and output gives the phase of the
frequency response.
Frequency Response
Impulse Response
• The final representation of an LTI system is its impulse response, h(t).
• This is the output of the system when the input, x(t), is a Dirac delta function, δ(t) – an “idealised” impulse.
– A Dirac delta function is a mathematical construct, not something that can exist in the real world.
• One can seek to measure the impulse response of the system
using a “tap test” – applying a real impulse (as opposed to δ(t))
and measuring the response.
Properties of the Impulse Response
• For a stable system, the impulse response decays away.
(Figures: example impulse responses for a stable system, which decays, and an unstable system, which grows.)
Objectives
• This part of the course:
1) Defines what is meant by a digital system
2) Reconstructs the edifice on the previous slide for a digital system
• For a digital system parallels to all the concepts just discussed
exist.
• This needs one to define digital equivalents of:
– ODEs (difference equations)*
– Fourier transforms (discrete Fourier transform (DFT))
– Laplace transform (z-transform)
– Convolution (digital convolution or convolution sum)
– Dirac delta function (Kronecker delta function)
– A criterion for stability
* Terms given in brackets are the names of these digital equivalents – for later reference.
Worked Example - Physical Model
• Model of car suspension: obtain an ODE by considering the physics of the problem.
• Damping force: c d/dt (x(t) − y(t)) = cẋ − cẏ
• Spring force: k (x(t) − y(t)) = kx − ky
• Newton’s laws give: mÿ = cẋ − cẏ + kx − ky
⇒ ÿ + (c/m)ẏ + (k/m)y = (c/m)ẋ + (k/m)x
Note: Slides with the black block in the top right-hand corner are “additional” material - appendices
• Compute the transfer function, by Laplace transforming the ODE:
L{ÿ + (c/m)ẏ + (k/m)y = (c/m)ẋ + (k/m)x} ⇒ (s^2 + (c/m)s + k/m) Y(s) = ((c/m)s + k/m) X(s)
H(s) = Y(s)/X(s) = ((c/m)s + k/m) / (s^2 + (c/m)s + k/m)
• Zeros: solution of (c/m)s + k/m = 0 ⇒ s = −k/c
• Poles: solution of s^2 + (c/m)s + k/m = 0 ⇒ s = (−c ± √(c^2 − 4km)) / (2m)
• Stable as long as c, k > 0.
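The pole/zero expressions can be checked numerically. The parameter values below are assumed for illustration (they are not given in the notes), chosen to be quarter-car-like:

```python
import numpy as np

m, c, k = 250.0, 1000.0, 16000.0    # mass [kg], damping [N s/m], stiffness [N/m]
poles = np.roots([1.0, c/m, k/m])   # roots of s^2 + (c/m)s + (k/m)
zero = -k / c                       # root of (c/m)s + (k/m)
# Here c^2 - 4km < 0, so the poles form a complex-conjugate pair with
# real part -c/(2m) = -2: an underdamped but stable suspension.
```

Since both poles lie in the left half-plane, the suspension is stable, exactly as the condition c, k > 0 predicts.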
• Frequency Response, computed from the transfer function:
H(f) = H(s)|_{s=2πif} = ((c/m)2πif + k/m) / (−4π^2 f^2 + (c/m)2πif + k/m)
|H(f)|^2 = ((k/m)^2 + 4π^2 f^2 (c/m)^2) / ((k/m − 4π^2 f^2)^2 + 4π^2 f^2 (c/m)^2)
(The magnitude peaks near the undamped natural frequency, √(k/m)/2π.)
• Impulse response, computed from the transfer function.
Defining: s1 = (−c + √(c^2 − 4km))/(2m), s2 = (−c − √(c^2 − 4km))/(2m)
h(t) = L^−1{ ((c/m)s + k/m) / ((s − s1)(s − s2)) } = A e^{s1 t} + B e^{s2 t}
where, by partial fractions, A = ((c/m)s1 + k/m)/(s1 − s2) and B = ((c/m)s2 + k/m)/(s2 − s1).
• This system might be regarded as a sine wave generator, since the solution of the equation is:
y(t) = A sin(ω0 t + θ)
• The values of A and θ depend on the initial conditions supplied to the system.
More on Laplace Transforms
• The Laplace transform is usually defined as the unilateral
transform:
L{x(t)} = X(s) = ∫_0^∞ x(t) e^{−st} dt
• There are some subtleties in this result that you may, or may
not, have considered previously.
Further Consideration
• Let us temporarily consider s as a real variable, and consider:
“Does the previous Laplace transform hold for all s?”
• Consider the integral
X1(s) = ∫_0^∞ e^{−(s+a)t} dt = [−e^{−(s+a)t}/(s + a)]_0^∞ = 1/(s + a)
• On the previous slide it was assumed that e^{−(s+a)t} → 0 as t → ∞, which holds only when s + a > 0.
x[n] = x(t)|_{t=nΔ}
in which case x[n] represents the nth sample of the signal and Δ is the sampling interval, with fs = 1/Δ the sampling frequency.
* Note on notation: I shall use square brackets for digital signals/systems, x[n], and round brackets for continuous signals, x(t); this emphasises the form of the signals, but it is only a device to help clarify the difference between continuous and digital processes. In many texts you will find x(n) used.
Difference Equations
• Analogue/continuous systems might be represented as
ordinary differential equations (ODEs).
• The idea of a derivative does not apply directly to a digital
signal, so ODEs cannot be used for digital systems.
• An alternative form of equations is necessary for digital
systems.
• For a digital system we consider difference equations (or
recurrence equations), instead of ODEs.
Listen to http://www.bbc.co.uk/programmes/b008ct2j to find out more about Fibonacci sequences.
Note we need to use a different analysis for the case g=1 to avoid division by zero.
Steady State Solutions
• Often we might be interested in knowing if a difference
equation reaches a constant value, i.e. a steady state.
• These solutions, if they exist, can be found easily.
• If it does reach a constant value, then the values of P[n] do not
depend on n, we can write them as a constant, C, i.e.
P[n−1] = P[n] = P[n+1] = P[n+2] = .... = C
• Thus to find the steady state solution, replace every occurrence
of P[ ], regardless of the value of n, by C.
• For example: P[n] − gP[n−1] = M
• Replace P[n] and P[n−1] by C and solve to find C:
C − gC = M ⇒ C = M/(1 − g)
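The steady state can be confirmed by simply iterating the difference equation. A short sketch (the values of g and M are my own illustrative choices):

```python
# Iterate P[n] = g*P[n-1] + M from an arbitrary start and compare
# with the predicted steady state C = M/(1 - g).
g, M = 0.5, 10.0
P = 0.0                 # arbitrary starting value
for _ in range(100):
    P = g * P + M       # one year of the model
C = M / (1 - g)         # predicted steady state: 20.0
```

After 100 iterations the difference between P and C is negligible, since the transient decays like g^n.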
Comments
• The steady state solution identified matches that found by
solving the equation and seeing what happens as n increases
(for g<1).
• The existence of a steady state solution does not mean that the solution will reach that value. For example, in the case where g>1 the solution tends to ∞, because it is unstable, albeit that a steady state solution exists.
• This method works even when the difference equation is non-
linear.
Comments on the Solution
• The solution to this forced equation consists of 2 parts (g≠1):
P[n] = g^n N + M(1 − g^n)/(1 − g) = (N − M/(1 − g)) g^n + M/(1 − g)
* Note the general form of the solution of the unforced system was Ag^n; the first term has exactly that form, just with the arbitrary constant, A, rather more complicated.
More Complex Penguin Models
• Our first order linear difference equation is a poor population
model.
• Can we make it better by making it more complicated?
• Let us try:
Consider a more realistic model based on splitting the
population into parts:
– Adult penguins, older than 1 year who can breed
– Juvenile penguins, younger than a year, who cannot breed
• The number of adult penguins is denoted Pa[n] and the
juvenile population is Pj[n] – the total population P[n] is the
sum of these two sub-populations.
Rules for the new Model
• Juvenile and adults die at the same rate: still a constant
proportion, a, of their populations.
• The number of penguins born each year is a fixed proportion of
only the adult population.
• The surviving juveniles from the previous year become adults.
(Figure: stem plot of a sampled signal x[n].)
This form of graph is called a “stem plot” and is commonly used to represent sampled signals; the vertical lines are just for “effect”.
Applying a Difference Equation
• We can compute “by hand” the output if we apply this input
to a difference equation.
• For example, consider the difference equation:
y[n] = x[n] − 0.5y[n−1] + 0.3y[n−2]
• And so on ….
The Full Output
• Repeating this for all of the inputs, one gets:
y[n]={0, 0.9477, -0.6285, -0.2337, 0.2041, 0.5359, -0.5719, -0.1344, 0.3209, 0.2556
, -0.4908, -0.0159, 0.3314, 0.0585, -0.3931, 0.0826, 0.2815, -0.0689, -0.2876, 0.1469
, 0.2042, -0.1389, -0.1853, 0.1758, 0.1221, -0.1646, -0.0959, 0.1750, 0.0493, -0.1588
, -0.0251, 0.1533, -0.0070, -0.1336, 0.0247, 0.1198, -0.0444, -0.0990, 0.0545, 0.0822,
-0.0641}
or even
y[n−1] = x[n−1] − 0.5y[n−2] + 0.3y[n−3]   (adding −1 to n)
or even
y[n+37] = x[n+37] − 0.5y[n+36] + 0.3y[n+35]   (adding 37 to n)
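The “by hand” computation above is easy to automate. A minimal sketch (the helper name is mine; the plotted input from the slides is not reproduced here, so a unit impulse is used as an illustrative input):

```python
def diff_eq(x):
    """Apply y[n] = x[n] - 0.5*y[n-1] + 0.3*y[n-2], zero initial conditions."""
    y = []
    for n in range(len(x)):
        y1 = y[n-1] if n >= 1 else 0.0   # y[-1] = 0
        y2 = y[n-2] if n >= 2 else 0.0   # y[-2] = 0
        y.append(x[n] - 0.5*y1 + 0.3*y2)
    return y

# For a unit-impulse input the first outputs are 1, -0.5, 0.55, -0.425, ...
y = diff_eq([1.0, 0.0, 0.0, 0.0])
```

Applying the same function to any recorded time series reproduces the kind of output tabulated below.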
• r = 3.841: period-3 solution
• r = 3.844: period-6 solution
• r = 3.86: chaos
Bifurcation Diagram
• Consider running the logistic equation for a long time for
various values of r.
• Then for each run form a histogram of the points generated.
• Stacking those histograms next to each other gives you this plot.
• Some example values of r considered here are marked a) to e).
(Block diagram: continuous input → C/D conversion → DT system → D/C conversion → continuous output.)
Outline
• Order of a digital system
• Classes of digital systems
– Moving Average (MA) systems
– Auto-Regressive (AR) systems
– Auto-Regressive Moving Average (ARMA) systems
• Kronecker (digital) delta function
• Sifting property for sequences
• Digital impulse response
– Finite Impulse Response (FIR) systems
– Infinite Impulse Response (IIR) systems
• Digital convolution
Opening Comments
• Difference equations are the digital equivalents of differential
equations.
• They relate an input sequence x[n] to an output sequence y[n].
• Difference equations are not normally developed by
consideration of the physics.
• We can easily compute the output (or “solve”) a difference
equation for a given input manually, or by computer, without
using analytic tools.
– Contrast this with differential equations which require the use of
“University level” mathematics.
• Difference equations can be applied to any time series: the
input does not have to be expressed as a mathematical
function.
Moving Average (MA) Difference
Equations
• These are difference equations in which y[n] depends only on
values of the input (x[n] for various n) and NOT on other
values of y[n].
• For a causal system:
y[n] = b0 x[n] + b1 x[n−1] + b2 x[n−2] + .... + bL x[n−L] = Σ_{k=0}^{L} bk x[n−k]
(depends only on past inputs)
• For an acausal system:
y[n] = Σ_{k=−L1}^{L2} bk x[n−k]
(depends on past and future inputs)
Comments
• MA digital systems are usually the easiest to work with.
– As we shall see, later in the course, they can be configured to have
useful properties.
• For the causal system on the previous slide the order of the filter is L (although there are L+1 coefficients, bk).
• For the acausal system the order is L1 + L2.
• The output is formed as the weighted sum of input values.
• The system coefficients bk define the system’s behaviour.
• Most of the following will consider only causal systems.
MA Systems
MA Example
• Input signal x[n] = {1, −2, 2, 3, −1, 0, −3}
• Applied to the MA system
y[n] = 0.5x[n] − 0.25x[n−1] − 0.25x[n−2]
• Assuming x[n] = 0 for n < 0:
y[0] = 0.5 × 1 = 0.5
y[1] = 0.5 × (−2) − 0.25 × 1 = −1.25
y[2] = 0.5 × 2 − 0.25 × (−2) − 0.25 × 1 = 1.25
y[3] = 0.5 × 3 − 0.25 × 2 − 0.25 × (−2) = 1.5
y[4] = −1.75, y[5] = −0.5, y[6] = −1.25,
y[7] = 0.75, y[8] = 0.75, y[9] = 0, y[10] = 0, ...
Input stops at this sample
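Because an MA system is just a weighted sum of input samples, the whole example can be reproduced with a convolution (a check of the hand calculation above):

```python
import numpy as np

x = [1, -2, 2, 3, -1, 0, -3]
b = [0.5, -0.25, -0.25]        # the MA coefficients
y = np.convolve(x, b)          # full output, length 7 + 3 - 1 = 9
# y = [0.5, -1.25, 1.25, 1.5, -1.75, -0.5, -1.25, 0.75, 0.75]
```

The values agree with the sample-by-sample working, including the two “tail” samples produced after the input stops.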
“The larger the window, the smoother the line becomes, but we’re also
shrinking the line from both ends. ”
The COVID-19 chart I wish I didn’t have to make - Datawrapper Blog
• A general (causal) ARMA difference equation has the form:
y[n] = b0 x[n] + ... + bp x[n−p] − a1 y[n−1] − ... − aq y[n−q]
⇒ y[n] + a1 y[n−1] + ... + aq y[n−q] = b0 x[n] + ... + bp x[n−p]
⇒ Σ_{k=0}^{q} ak y[n−k] = Σ_{k=0}^{p} bk x[n−k],  with a0 ≡ 1
• If the equation is written with a general coefficient a0 multiplying y[n], dividing through by a0 gives
Σ_{k=0}^{q} (ak/a0) y[n−k] = Σ_{k=0}^{p} (bk/a0) x[n−k]
so setting ak/a0 → ak and bk/a0 → bk one can always normalise the equation so that a0 ≡ 1.
ARMA Example
• Input signal x[n] = {1, −2, 3}
• Consider the system
y[n] = 0.5x[n] + 0.5x[n−1] − 0.25y[n−1]
• To compute the output, assuming y[−1] = 0, then
y[0] = 0.5x[0] + 0.5x[−1] − 0.25y[−1] = 0.5
y[1] = 0.5x[1] + 0.5x[0] − 0.25y[0] = −1 + 0.5 − 0.25 × 0.5 = −0.625
y[2] = 0.5x[2] + 0.5x[1] − 0.25y[1] = 1.5 − 1 + 0.25 × 0.625 = 0.6563
y[3] = 0.5x[3] + 0.5x[2] − 0.25y[2] = 0 + 1.5 − 0.25 × 0.6563 = 1.3359
y[4] = 0.5x[4] + 0.5x[3] − 0.25y[3] = 0 + 0 − 0.25 × 1.3359 = −0.3340
y[5] = 0.0835, y[6] = −0.0209, y[7] = 0.0052, y[8] = −0.0013, y[9] = 0.0003
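The recursion above translates directly into a loop (a check of the hand calculation; the zero-padding of the input is my addition, to show the output decaying after the input ends):

```python
x = [1, -2, 3] + [0]*7     # input, zero-padded to watch the decay
y = []
for n in range(len(x)):
    xm1 = x[n-1] if n >= 1 else 0   # x[-1] = 0
    ym1 = y[n-1] if n >= 1 else 0   # y[-1] = 0
    y.append(0.5*x[n] + 0.5*xm1 - 0.25*ym1)
# y = [0.5, -0.625, 0.65625, 1.3359375, -0.333984375, ...]
```

The first five values match the worked example (to the 4-decimal rounding used on the slide).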
Discussion of ARMA Output
After the input has passed through the MA terms, such that x[n−p] = 0 for all of them, the system behaves like an AR system. In this case the system is stable and the output rapidly approaches zero; in this regime, for this example, y[n] = −0.25y[n−1].
Delta Functions
• Continuous delta function – the Dirac delta
– Denoted δ(t), the Dirac delta function is strictly not a function, but a distribution or a generalised function. (Paul Dirac, 1902-84)
δ(t) = lim_{ε→0} Rε(t)
• Digital delta function – the Kronecker delta. (Leopold Kronecker, 1823-91: “God made the integers; all else is the work of man.”)
• Sifting properties:
x[n] = Σ_{k=−∞}^{∞} x[k] δ[n−k]
x(t) = ∫_{−∞}^{∞} x(u) δ(t−u) du
Example
• Consider the signal x[n] = {1, −2, 3}. It can be written as
x[n] = δ[n] − 2δ[n−1] + 3δ[n−2]
Example
Digital Impulse Response
• The digital impulse response of a system, h[n], is defined as
the response you elicit from a (digital) system when it is
excited by a Kronecker delta, δ[n].
h[n] = {0.25, 0.4, −0.3, 0.1, 0, 0, ...}
Recall δ[n] is zero everywhere except when n=0, e.g. δ[−2] = δ[−1] = δ[1] = δ[2] = 0.
Comments on the MA Example
• Our MA system
y[n] = 0.25x[n] + 0.4x[n−1] − 0.3x[n−2] + 0.1x[n−3]
is characterised by the coefficients
bk = {0.25, 0.4, −0.3, 0.1}
which is also the impulse response of the system, h[n].
• It is generally true that, for an MA system, the impulse
response is equal to its coefficients.
h [ n ] = bn
• This is an important result which makes MA systems easy to
design and manipulate.
• For an MA system of order L the impulse response, h[n], is
identically zero for n>L.
Example ARMA System
• Consider (again) the system:
y[n] = 0.5x[n] + 0.5x[n−1] − 0.25y[n−1]
• The impulse response is given by:
n=0: y[0] = h[0] = 0.5δ[0] + 0.5δ[−1] − 0.25y[−1] = 0.5
n=1: y[1] = h[1] = 0.5δ[1] + 0.5δ[0] − 0.25 × 0.5 = 0.5 − 0.125 = 0.375
n=2: y[2] = h[2] = 0.5δ[2] + 0.5δ[1] − 0.25 × 0.375 = −0.0938
n=3: y[3] = h[3] = 0.5δ[3] + 0.5δ[2] − 0.25 × (−0.0938) = 0.0234
n=4: y[4] = h[4] = 0.5δ[4] + 0.5δ[3] − 0.25 × 0.0234 = −0.0059
h[5] = 0.0015, h[6] = −0.0004, h[7] = 0.0001, ...
Comments on the ARMA Example
• The impulse response in this case is not related in a transparent
manner to the system coefficients (ak and bk).
• The impulse response decays towards zero, but, in theory at
least, it never reaches zero.
h [ n ] → 0 as n → ∞
• This is in contrast to MA systems, whose impulse response
becomes identically zero for sufficiently large n.
• This is generally true: in nearly all ARMA systems* the
impulse response continues ad infinitum.
– For a stable system the impulse response decays towards zero.
– For an unstable system the impulse response grows without bound.
*We shall shortly discuss the exceptions: ARMA systems whose impulse response
becomes identically zero after some point n – such systems are very rare.
Classification of Digital Systems
• When talking of a “system” we generally class digital systems
according to whether they are MA, AR or ARMA.
• When considering “filters” (which are just digital systems*) we
generally use an alternative classification.
• Filters are classified according to whether their impulse
response becomes zero after a finite time (as in an MA
system), or only approaches zero (as in most ARMA systems).
• The two classes are:
– Finite Impulse Response (FIR) filters
– Infinite Impulse Response (IIR) filters
* The concepts of a filter and a system are not separate – the distinction is artificial.
Finite Impulse Response
Filters/Systems
• Any filter (or system) whose impulse response satisfies
h[n] = 0 ∀ n > M
for some M, is said to be FIR.
• All MA systems are FIR
• No AR systems are FIR
• A few (a very few) ARMA system are also FIR
Typical FIR system response
x[n] = Σ_{k=−∞}^{∞} x[k] δ[n−k]
y[n] = Σ_{k=−∞}^{∞} x[k] h[n−k]
Digital Convolution
• To compute the output of a digital system for any input, x[n],
given that system’s impulse response, h[n], one uses the
convolution sum:
y[n] = Σ_{k=−∞}^{∞} x[k] h[n−k] = Σ_{k=−∞}^{∞} h[k] x[n−k]
• Denoted as y[n] = x[n] * h[n] = h[n] * x[n]
• For a causal system one has:
y[n] = Σ_{k=−∞}^{n} x[k] h[n−k] = Σ_{k=0}^{∞} h[k] x[n−k]
• For an MA system this becomes:
y[n] = Σ_{k=0}^{L} bk x[n−k] = b0 x[n] + b1 x[n−1] + ... + bL x[n−L]
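The convolution sum can be evaluated directly from its definition; the sketch below (function name and test sequences are my own) agrees with NumPy’s library routine:

```python
import numpy as np

def conv_sum(x, h):
    """Directly evaluate y[n] = sum_k x[k]*h[n-k] for finite-length sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):   # only terms where h[n-k] is defined
                y[n] += x[k] * h[n - k]
    return y

y = conv_sum([1.0, 2.0, 3.0], [0.25, 0.5, -0.25])
# agrees with np.convolve([1, 2, 3], [0.25, 0.5, -0.25])
```

The double loop makes the “weighted, shifted copies” structure of convolution explicit, at the cost of O(N²) work; library implementations are far faster.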
(Summary diagram: the transfer function H(s) = (bq s^q + ... + b1 s + b0)/(ap s^p + ... + a1 s + a0) is obtained from the ODE via the Laplace transform; setting s = 2πif gives the measurable FRF H(f) = (bq (2πif)^q + ... + b1 2πif + b0)/(ap (2πif)^p + ... + a1 2πif + a0); the poles (P(s) = 0) and zeros (Q(s) = 0) summarise H(s); the inverse transform of H(s) is the impulse response, and convolution with it gives the response to any input. The diagram asks which digital equivalents, via the z-transform, fill the corresponding boxes.)
• The z-transform of a sequence is defined as:
X(z) = Σ_{n=−∞}^{∞} x[n] z^−n
• For the unit step u[n]:
U(z) = Σ_{n=−∞}^{∞} u[n] z^−n = Σ_{n=0}^{∞} z^−n = 1/(1 − z^−1),  valid for |z^−1| < 1, i.e. |z| > 1
• For the finite step (L = 6 in the illustration):
uL[n] = 1, 0 ≤ n ≤ L−1; = 0 elsewhere
UL(z) = Σ_{n=−∞}^{∞} uL[n] z^−n = Σ_{n=0}^{L−1} z^−n
• For the geometric sequence x[n] = α^n, n ≥ 0:
X(z) = Σ_{n=0}^{∞} α^n z^−n = Σ_{n=0}^{∞} (αz^−1)^n = 1/(1 − αz^−1),  valid for |αz^−1| < 1
Note the step function is just a special case of this example with α = 1.
Properties of the Z-Transform:
Linearity
• The z-transform is linear.
Z{ax[n]} = aX(z)  and  Z{x[n] + y[n]} = X(z) + Y(z)
• Shifting: consider y[n] = x[n−1]. Then
Y(z) = Z{y[n]} = Σ_{n=−∞}^{∞} y[n] z^−n = Σ_{n=−∞}^{∞} x[n−1] z^−n
Substituting m = n−1:
Y(z) = Σ_{m=−∞}^{∞} x[m] z^−(m+1) = z^−1 Σ_{m=−∞}^{∞} x[m] z^−m = z^−1 X(z)
Z-Transform Properties:
Shifting (cont’d)
• So the z-transform of a signal shifted by 1 sample is the
original z-transform multiplied by z-1.
• Repeatedly applying this we have the general result:
Z{x[n−m]} = z^−m X(z)
uL[n] = u[n] − u[n−L]
• Time reversal: consider y[n] = x[−n]. Then
Y(z) = Σ_{n=−∞}^{∞} y[n] z^−n = Σ_{n=−∞}^{∞} x[−n] z^−n
Now replace n by m, where m = −n:
Y(z) = Σ_{m=−∞}^{∞} x[m] z^m = Σ_{m=−∞}^{∞} x[m] (z^−1)^−m = X(z^−1)
• Define the reversed step function u−[n] = 1 for n ≤ 0, = 0 for n > 0.
• The z-transform of this function is:
U−(z) = Σ_{n=−∞}^{0} u−[n] z^−n = Σ_{n=−∞}^{0} z^−n = Σ_{m=0}^{∞} z^m = 1/(1 − z),  valid for |z| < 1
(again m = −n and the limits have been swapped)
• Also note that u−[n] = u[−n], so that from the preceding theorem
U−(z) = U(z^−1)
• So that since
U(z) = 1/(1 − z^−1), then U−(z) = 1/(1 − (z^−1)^−1) = 1/(1 − z)  (as also shown above)
Example: Finite Step Function
(yet again)
• We can express the finite step function as the difference of two reversed step functions as follows:
uL[n] = u−[n−L+1] − u−[n+1]   (L = 6 in the illustration)
• Z-transforming gives
UL(z) = z^−L+1 U−(z) − z U−(z) = (z^−L+1 − z)/(1 − z)
Multiplying numerator and denominator by z^−1:
UL(z) = (z^−L − 1)/(z^−1 − 1) = (1 − z^−L)/(1 − z^−1)
This is valid for |z| < 1, since that is the condition on U−(z).
This result is valid on |z| < 1, whereas the result obtained using the step function was valid for |z| > 1; combined, these cover almost all values of z.
Displaying the Z-Transform
• The variable z is complex valued and the function X(z) is also
complex valued.
• The z-transforms we encounter have the form of a ratio of two
polynomials in z.
• Specifically: X(z) = Q(z)/P(z)
• Like the Laplace transform we only consider the points at
which
– X(z)=0, which occurs when Q(z)=0, these are called the zeros.
– X(z)=∞, which are the points for which P(z)=0, these are called the
poles.
Example: Unit Step Function
• The z-transform of the unit step function is:
U(z) = 1/(1 − z^−1) = z/(z − 1),  |z| > 1
• The pole for this function is given by z − 1 = 0 ⇒ z = 1.
• The zero occurs when z = 0. However, poles and zeros at the origin (z = 0) are not normally considered.
• Z-transforming the general ARMA difference equation term by term, using Z{x[n−m]} = z^−m X(z), gives:
Y(z) = b0 X(z) + b1 z^−1 X(z) + ... + bp z^−p X(z) − a1 z^−1 Y(z) − ... − aq z^−q Y(z)
Y(z) (1 + a1 z^−1 + ... + aq z^−q) = X(z) (b0 + b1 z^−1 + ... + bp z^−p)
H(z) = Y(z)/X(z) = (b0 + b1 z^−1 + ..... + bp z^−p) / (1 + a1 z^−1 + .... + aq z^−q)
Digital Transfer Functions
• The digital transfer function (sometimes called the
characteristic function) is defined as the ratio of Y(z) and X(z).
• A causal (LTI) ARMA system has a transfer function of the
form
H(z) = Y(z)/X(z) = (b0 + b1 z^−1 + ..... + bp z^−p) / (1 + a1 z^−1 + .... + aq z^−q)
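Evaluating H(z) on the unit circle gives the digital FRF directly from the coefficients. A sketch for the earlier ARMA example (the frequency grid and variable names are my own choices):

```python
import numpy as np

# H(z) = (0.5 + 0.5 z^-1) / (1 + 0.25 z^-1), from the earlier ARMA example,
# evaluated on the unit circle z = e^{2*pi*i*f/fs}.
b = [0.5, 0.5]
a = [1.0, 0.25]
f = np.linspace(0.0, 0.5, 256)     # frequency as a fraction of fs
w = np.exp(-2j * np.pi * f)        # w = z^{-1} on the unit circle
H = np.polyval(b[::-1], w) / np.polyval(a[::-1], w)
# |H| = 0.8 at f = 0 and 0 at the Nyquist frequency, since the zero of
# b0 + b1*z^{-1} sits at z = -1: a low-pass-like response.
```

Note that np.polyval expects highest-order coefficients first, hence the reversal of the coefficient lists in powers of w = z^−1.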
x[n] * y[n] = Σ_{m=−∞}^{∞} x[m] y[n−m]
Minor note: the above uses “*” for both digital and continuous-time convolution; strictly they are different operations, but the context should always make it clear which is implied, and so no confusion should follow.
Input Output Relationships
• For a digital system the input/output relationship based on the
impulse response is:
y n = h n * x n
• Z-transforming gives:
Y ( z) = H ( z) X ( z)
where H(z) is the z-transform of the impulse response h[n].
• Manipulating the above, the transfer function, H(z), can be
equated to the ratio Y(z)/X(z), hence
H(z) = Y(z)/X(z) = Z{h[n]} = (b0 + b1 z^−1 + ..... + bp z^−p) / (1 + a1 z^−1 + .... + aq z^−q)
(from previous slides we know this ratio of polynomials is equal to H(z))
Combining Digital Systems
• The basic rules for combining digital systems mirror those for continuous-time systems:
• For two systems in series their transfer functions are
multiplied.
(Figures: the stable region of the z-plane is the interior of the unit circle |z| = 1; the nature of the poles depends on their position relative to the unit circle.)
General Properties
• The distance from the unit circle controls the rate of growth or decay of the signal/system.
• The angle around the circle controls the rate of oscillation.
(Figure: pole positions on the z-plane. Poles on the unit circle give oscillations that neither decay nor grow; poles slightly inside decay slowly and slightly outside grow slowly; poles well inside decay rapidly and well outside grow rapidly. The angle around the circle sets the frequency, from low near the positive real axis to high near the negative real axis.)
• Such a transfer function has q zeros. The only poles are at z=0
(such poles are not significant).
• So that MA systems have transfer functions that contain only
zeros.
Transfer Functions for AR Systems
• A general AR system has the difference equation:
y[n] = b0 x[n] − a1 y[n−1] − a2 y[n−2] − .... − ap y[n−p]
• The denominator and numerator polynomials can be written in factorised form:
P(z) = Π_{k=1} (z − rk),  Q(z) = b0 Π_{k=1} (z − zk)
Alternative Representations
(Cont’d)
• Combining this one can write:
H(z) = b0 (z − z1)(z − z2) .... (z − zq) / ((z − r1)(z − r2) .... (z − rp)) = b0 Π_{k=1}^{max(p,q)} (z − zk)/(z − rk)
where it is assumed that zk and rk are zero for k > q and k > p respectively.
• Hence knowing just the poles and zeros one can compute the transfer function, with the exception of the value of b0.
• Based on this, any ARMA system can be expressed as a sequence of ARMA{1,1} systems in series:
Z-Transform of Real Signals
• If a signal is real-valued, then the poles and zeros correspond
to the roots of polynomials with real coefficients.
• Such roots are either:
– Real valued
– Complex valued and occur in conjugate pairs
• So if zp is, say, a complex pole then so is zp*
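This is easy to see numerically. A sketch with an illustrative real-coefficient denominator of my own choosing:

```python
import numpy as np

# Denominator 1 + 0.9 z^-1 + 0.81 z^-2: its poles are the roots of
# z^2 + 0.9 z + 0.81 (real coefficients).
poles = np.roots([1.0, 0.9, 0.81])
# The two roots are complex conjugates of each other, with |z| = 0.9,
# i.e. a conjugate pole pair just inside the unit circle (a stable resonance).
```

Because the radius is common to the pair, a conjugate pole pair contributes a single decaying oscillation to the impulse response.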
Xs(f) = Σ_{n=−∞}^{∞} x[n] e^{−2πifnΔ} = Σ_{n=−∞}^{∞} x[n] e^{−2πin(f/fs)}
H(f) = F{h[n]} = Σ_{n=−∞}^{∞} h[n] e^{−2πifnΔ}   (definition of the Fourier transform of a sequence)
Note that the notation for the FRF is ambiguous: no distinction is made between the digital and continuous FRFs; both are referred to as H(f). The context should make it clear which is appropriate.
Note that frequency f
on this slide has been
normalized by
sample rate fs
[Figure: magnitude responses plotted against frequency, alongside the corresponding pole positions relative to the unit disc in the z-plane.]
Periodicity in the FT of a Sequence
• The FT of a sequence is inherently periodic:
Xs(f) = Xs(f + fs) = Xs(fs − f)*
[Figure: the frequency axis Im{s}=0 of the s-plane maps onto the unit circle |z|=1, the frequency axis of the z-plane.]
Frequency Axes
• In each case the boundary between the stable and unstable
regions corresponds to the frequency axis.
• This is because sine/cosine waves are signals that are neither
stable nor unstable – they do not grow or decay.
• Consider the Laplace transform of a cosine wave:
L{cos(2πf0 t)} = s / (s² + (2πf0)²) = s / [(s − 2πif0)(s + 2πif0)]
with two poles s = ±2πif0 on the frequency axis.
X(z) = Σ_{n=−∞}^{∞} x[n] z^-n = (1/2) Σ_{n=0}^{∞} e^{2πif0 nΔ} z^-n + (1/2) Σ_{n=0}^{∞} e^{−2πif0 nΔ} z^-n
= (1/2) Σ_{n=0}^{∞} (e^{2πif0 Δ} z^-1)^n + (1/2) Σ_{n=0}^{∞} (e^{−2πif0 Δ} z^-1)^n
= 1 / [2(1 − e^{2πif0 Δ} z^-1)] + 1 / [2(1 − e^{−2πif0 Δ} z^-1)]
= (e^{i3ω/2} + e^{iω/2} + e^{−iω/2} + e^{−i3ω/2})(e^{i3ω/2} + e^{iω/2} + e^{−iω/2} + e^{−i3ω/2})*
= (2cos(3ω/2) + 2cos(ω/2))² = 4(cos(3ω/2) + cos(ω/2))²
Σ_{n=1}^{L} a r^{n−1} = a + ar + ar² + ar³ + .... + ar^{L−1} = S_L
S_L = a(1 − r^L) / (1 − r),  r ≠ 1
Sum of an Infinite GP
• To obtain the sum of an infinite length GP, take the previous
result and let L→∞.
lim_{L→∞} S_L = lim_{L→∞} a(1 − r^L)/(1 − r) = a/(1 − r)  if |r| < 1
• The sum only exists when |r|<1.
• For |r|>1 each term in the GP is bigger than the preceding
one, so as you add more and more terms together the sum
continues to grow and never approaches a finite limit.
lim_{L→∞} S_L = lim_{L→∞} a(1 − r^L)/(1 − r) = ∞  if |r| > 1
Summary of Sums of GPs
• For a finite GP, i.e. a sum which has L terms and L<∞:
S_L = a(1 − r^L) / (1 − r),  r ≠ 1
You can easily deal with the special case of r=1 separately. Try it!
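These formulas are easy to sanity-check numerically; a minimal sketch, where the values of a and r are arbitrary illustration choices:

```python
def gp_sum(a, r, L):
    # closed-form sum of the finite GP a + ar + ... + ar^(L-1), r != 1
    return a * (1 - r**L) / (1 - r)

# compare the closed form with the term-by-term sum
direct = sum(0.5 * 0.8**n for n in range(20))
closed = gp_sum(0.5, 0.8, 20)

# for |r| < 1 the partial sums approach a / (1 - r)
inf_direct = sum(0.5 * 0.8**n for n in range(2000))
inf_closed = 0.5 / (1 - 0.8)
```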
Multiplying through by r and summing:
ar + 2ar² + 3ar³ + 4ar⁴ + .... = ar / (1 − r)² = S,  |r| < 1
Proof for Z-Transform of
Convolution
Z{x[n] * y[n]} = Z{ Σ_{k=−∞}^{∞} x[k] y[n−k] } = Σ_{n=−∞}^{∞} ( Σ_{k=−∞}^{∞} x[k] y[n−k] ) z^-n
Swapping order of summations:
= Σ_{k=−∞}^{∞} x[k] Σ_{n=−∞}^{∞} y[n−k] z^-n
Definition of z-transform (with its shift property):
= Σ_{k=−∞}^{∞} x[k] Z{y[n−k]} = Σ_{k=−∞}^{∞} x[k] Y(z) z^-k
= Y(z) Σ_{k=−∞}^{∞} x[k] z^-k = X(z) Y(z)
Z-Transform of a Sine Wave
• Express the sampled sine as a sum of complex exponentials:
sin(2πf0 nΔ) = sin(Ω0 n) = (e^{iΩ0 n} − e^{−iΩ0 n}) / (2i),  Ω0 = 2πf0 Δ
Ω0 is the normalised angular frequency – introduced for compactness.
• Z-transforming:
Z{sin(Ω0 n)} = Σ_{n=0}^{∞} sin(Ω0 n) z^-n = (1/2i) Σ_{n=0}^{∞} (e^{iΩ0 n} − e^{−iΩ0 n}) z^-n
= (1/2i) [ Σ_{n=0}^{∞} (z^-1 e^{iΩ0})^n − Σ_{n=0}^{∞} (z^-1 e^{−iΩ0})^n ]
= (1/2i) [ 1/(1 − z^-1 e^{iΩ0}) − 1/(1 − z^-1 e^{−iΩ0}) ]
= z^-1 sin(Ω0) / (1 − 2cos(Ω0) z^-1 + z^-2) = z sin(Ω0) / (z² − 2z cos(Ω0) + 1)
Example of Repeat Roots
• Z-transform of x[n] = n a^n for n ≥ 0, = 0 for n < 0:
X(z) = Σ_{n=1}^{∞} n a^n z^-n = az^-1 + 2a²z^-2 + 3a³z^-3 + 4a⁴z^-4 + .....
= (az^-1 + a²z^-2 + a³z^-3 + a⁴z^-4 + .....) + (a²z^-2 + 2a³z^-3 + 3a⁴z^-4 + ....)
• The first bracket is the infinite sum of a GP, equal to az^-1/(1 − az^-1); the second bracket is az^-1 (az^-1 + 2a²z^-2 + 3a³z^-3 + ....) = az^-1 X(z), so
X(z) = az^-1/(1 − az^-1) + az^-1 X(z),  |z| > |a|
The condition is necessary because we have used the sum of an infinite GP. Solving for X(z):
X(z) = az^-1 / (1 − az^-1)² = az / (z − a)²,  |z| > |a|
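A quick numerical spot-check of the closed form az/(z − a)²; a and z below are arbitrary values inside the ROC:

```python
a = 0.6
z = complex(1.2, 0.5)   # |z| > |a|, inside the ROC

# long partial sum of n a^n z^-n
partial = sum(n * a**n * z**(-n) for n in range(1, 3000))

# closed form az / (z - a)^2
closed = a * z / (z - a)**2
```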
• For a left-sided geometric sequence (setting m = −n):
X(z) = Σ_{n=−∞}^{0} a^n z^-n = Σ_{m=0}^{∞} a^-m z^m = 1 / (1 − a^-1 z),  valid for |a^-1 z| < 1, i.e. |z| < |a|
• Z-transforming the two-sided sequence x[n] = a^{|n|}:
X(z) = Σ_{n=−∞}^{−1} a^-n z^-n + Σ_{n=0}^{∞} a^n z^-n = Σ_{m=1}^{∞} (az)^m + Σ_{n=0}^{∞} (az^-1)^n
= az/(1 − az) + 1/(1 − az^-1) = (1 − a²) / (1 + a² − a(z + z^-1))   Assuming |a| < 1
The first sum requires |az| < 1; the second requires |az^-1| < 1.
• Sequences may be two-sided, anti-causal (acausal) or left-sided.
• Example: the left-sided sequence x[n] = 2^n for n ≤ 9, = 0 for n > 9; setting m = −n:
X(z) = Σ_{n=−∞}^{9} 2^n z^-n = Σ_{m=−9}^{∞} (2^-1 z)^m = (2^-1 z)^{−9} Σ_{m=0}^{∞} (2^-1 z)^m = 2^9 z^{−9} / (1 − z/2)
ROC: |z| < 2
Example: Acausal Geometric
Sequence (Again)
• Recall the example x[n] = a^n (for n < 0) and the general expansion
X(z) = Σ_{n=−∞}^{∞} x[n] z^-n = ..... + x[−2]z² + x[−1]z¹ + x[0] + x[1]z^-1 + x[2]z^-2 + .....
• The transform to invert is
X(z) = 1/(1 − az^-1) = (1 − az^-1)^-1,  ROC |z| < |a|
• One can not directly apply the expansion of (1 − x)^-1, which
requires that |x| < 1, since |z| < |a| ⇒ 1 < |az^-1|.
• However, to expand (1 − x)^-1 for the case |x| > 1 we can use the
following "trick":
1/(1 − x) = −1/(x(1 − x^-1)) = −x^-1 (1 − x^-1)^-1
= −x^-1 (1 + x^-1 + x^-2 + .....) = −x^-1 − x^-2 − x^-3 − ......,  valid for |x^-1| < 1, i.e. |x| > 1
Example: Geometric Sequence
(Anti-causal case) (Cont'd)
• So for |z| < |a|:
X(z) = (1 − az^-1)^-1 = −a^-1 z − a^-2 z² − a^-3 z³ − a^-4 z⁴ − .....
• Comparing to
X(z) = Σ_{n=−∞}^{∞} x[n] z^-n = ..... + x[−3]z³ + x[−2]z² + x[−1]z¹ + x[0] + x[1]z^-1 + .....
• So that
x[−1] = −a^-1; x[−2] = −a^-2; x[−3] = −a^-3; .....
x[n] = −a^n for n < 0, = 0 for n ≥ 0, since there are no
powers of z^-k, k ≥ 0, in the series expansion.
Inverse Transform of More
Complicated Cases
• In general to invert more complicated functions X(z) one can
express X(z) in terms of simpler (first order) functions and
invert each of those.
• This simplification is usually achieved via partial fractions:
X(z) = Q(z)/P(z) = Q(z) / [(z − r1)(z − r2) ... (z − rp)]
= c1/(z − r1) + c2/(z − r2) + .... + cp/(z − rp) + d(z)
where the rk are the roots of P(z), i.e. the poles of X(z).
= 3(1 + 2z^-1 + 2²z^-2 + 2³z^-3 + ...) − 2(1 − z^-1/3 + z^-2/3² − z^-3/3³ + ....)
= 1 + z^-1(3·2 + 2/3) + z^-2(3·2² − 2/3²) + z^-3(3·2³ + 2/3³) + .....
x[n] = 3(2)^n − 2(−1/3)^n for n ≥ 0
= 0 for n < 0.  Notice this is unstable: 2^n → ∞ as n → ∞.
Example: Part 4
• Find the stable sequence with the z-transform
X(z) = (1 + 5z^-1) / (1 − (5/3)z^-1 − (2/3)z^-2) = 3/(1 − 2z^-1) − 2/(1 + z^-1/3)
• The ROC which corresponds to a stable sequence is 1/3 < |z| < 2.
The first term can't be expanded directly, since inside the ROC |2z^-1| > 1; rewriting it as −(3/2)z(1 − z/2)^-1 can be expanded, since inside the ROC |z/2| < 1:
X(z) = 3(1 − 2z^-1)^-1 − 2(1 + z^-1/3)^-1 = −(3/2)z(1 − z/2)^-1 − 2(1 + z^-1/3)^-1
= −(3/2)z(1 + z/2 + z²/2² + z³/2³ + ...) − 2(1 − z^-1/3 + z^-2/3² − z^-3/3³ + ....)
= −3(z/2 + z²/2² + z³/2³ + z⁴/2⁴ + ...) − 2(1 − z^-1/3 + z^-2/3² − z^-3/3³ + ....)
x[n] = −3·2^n for n < 0
= −2(−1/3)^n for n ≥ 0
Stable, note that 2^n → 0 as n → −∞ and (1/3)^n → 0 as n → ∞.
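The partial-fraction decomposition used in Parts 4 and 5 can be spot-checked numerically at an arbitrary point; here w stands for z^-1:

```python
# arbitrary evaluation point for the identity check
w = complex(0.3, 0.2)

lhs = (1 + 5 * w) / (1 - (5 / 3) * w - (2 / 3) * w**2)
rhs = 3 / (1 - 2 * w) - 2 / (1 + w / 3)
```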
Example: Part 5
• Find the left-sided sequence with the z-transform
X(z) = (1 + 5z^-1) / (1 − (5/3)z^-1 − (2/3)z^-2) = 3/(1 − 2z^-1) − 2/(1 + z^-1/3)
• For a left-sided sequence the ROC is the interior of a circle; in
this case this has to be |z| < 1/3.
Inside this ROC neither term can be expanded directly, so both are rewritten using the "trick":
X(z) = 3(1 − 2z^-1)^-1 − 2(1 + z^-1/3)^-1 = −(3/2)z(1 − z/2)^-1 − 6z(1 + 3z)^-1
= −(3/2)z(1 + z/2 + z²/2² + z³/2³ + ...) − 6z(1 − 3z + 3²z² − 3³z³ + ....)
= −3(z/2 + z²/2² + z³/2³ + ...) − 2(3z − 3²z² + 3³z³ − 3⁴z⁴ + ....)
x[n] = −3·2^n + 2(−3)^{−n} for n < 0
= 0 for n ≥ 0
Left-sided (in this case anti-causal) and unstable, since 2^n → 0 as n → −∞ BUT
(−1/3)^n → ∞ as n → −∞.
Computing the Impulse Response of
a Difference Equation
• One can compute the impulse response of a difference
equation using the IZT.
• Specifically, one can compute the transfer function, H(z), from
the difference equation.
• The impulse response, h[n], is then obtained by inverse z-
transforming, H(z).
• We normally assume that the system is causal, so that the ROC
used is the exterior of the circle containing the outer-most
pole.
Example
• Consider the ARMA difference equation (which we have seen
several times before):
y[n] = 0.5x[n] + 0.5x[n−1] − 0.25y[n−1]
H(z) = 0.5(1 + z^-1) / (1 + 0.25z^-1)
Assuming causality, the ROC is |z| > 1/4.
= 0.5(1 + z^-1)(1 − z^-1/4 + z^-2/4² − z^-3/4³ + z^-4/4⁴ − ...)
= 1/2 + (z^-1/2)(1 − 1/4) + (z^-2/2)(−1/4 + 1/4²) + .....
h[n] = 0 for n < 0,  h[0] = 1/2,
h[n] = (1/2)(−1/4)^n + (1/2)(−1/4)^{n−1} = −(3/2)(−1/4)^n,  n ≥ 1
h[0] = 1/2, h[1] = 3/8 = 0.375, h[2] = −0.09375, h[3] = 0.0234, h[4] = −0.00586, ...
Compare with impulse response calculated in "Basics of Digital Systems" notes.
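The closed form can be cross-checked by iterating the difference equation directly with a unit impulse input; a minimal sketch:

```python
N = 8
x = [1.0] + [0.0] * (N - 1)   # unit impulse input
y = []
for n in range(N):
    xm1 = x[n - 1] if n >= 1 else 0.0
    ym1 = y[n - 1] if n >= 1 else 0.0
    # y[n] = 0.5 x[n] + 0.5 x[n-1] - 0.25 y[n-1]
    y.append(0.5 * x[n] + 0.5 * xm1 - 0.25 * ym1)

# closed form: h[0] = 1/2, h[n] = -(3/2)(-1/4)^n for n >= 1
closed = [0.5] + [-(3 / 2) * (-0.25)**n for n in range(1, N)]
```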
Algebraic Long Division
• An alternative method for computing the series expansion of a
ratio of the form
Q( z)
f ( z) =
P( z)
is based on long division.
• The polynomial P(z) is divided into Q(z) using the standard
rules of long division.
• The resulting series expansion can be used to identify the
coefficients of z-n corresponding to x[n].
• The result is usually not a general expression for x[n] but
allows one to calculate the first few terms.
Example (again)
X(z) = (1 + 5z^-1) / (1 − (5/3)z^-1 − (2/3)z^-2)
• Divide the denominator into the numerator using long division:
1 + 5z^-1 ÷ (1 − (5/3)z^-1 − (2/3)z^-2) = 1 + (20/3)z^-1 + (106/9)z^-2 + ...
– Subtract 1 × (1 − (5/3)z^-1 − (2/3)z^-2), leaving (20/3)z^-1 + (2/3)z^-2.
– Subtract (20/3)z^-1 × (1 − (5/3)z^-1 − (2/3)z^-2) = (20/3)z^-1 − (100/9)z^-2 − (40/9)z^-3, leaving (106/9)z^-2 + (40/9)z^-3.
– Subtract (106/9)z^-2 × (1 − (5/3)z^-1 − (2/3)z^-2) = (106/9)z^-2 − (530/27)z^-3 − (212/27)z^-4, and so on.
• Equating powers of z^-n in the series expansion of X(z) we get
x[0] = 1, x[1] = 20/3, x[2] = 106/9, ….
• Recall from Part 3, the IZT was x[n] = 3(2)^n − 2(−1/3)^n for n ≥ 0, = 0 for n < 0, which agrees:
n = 0: x[0] = 3 − 2 = 1
n = 1: x[1] = 3·2 − 2(−1/3) = 20/3
n = 2: x[2] = 3·2² − 2(−1/3)² = 106/9
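The long-division recursion can be sketched in a few lines of code; the coefficient lists below are those of this example, written in powers of z^-1:

```python
def series_coeffs(num, den, nterms):
    """First nterms power-series coefficients of num(w)/den(w), w = z^-1,
    computed by long division (den[0] must be non-zero)."""
    rem = num + [0.0] * max(0, nterms + len(den) - len(num))
    out = []
    for k in range(nterms):
        ck = rem[k] / den[0]
        out.append(ck)
        # subtract ck * den, shifted by k, from the remainder
        for j, dj in enumerate(den):
            rem[k + j] -= ck * dj
    return out

coeffs = series_coeffs([1.0, 5.0], [1.0, -5 / 3, -2 / 3], 4)
```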
Summary:
• Continuous systems: the underlying physics gives an ODE,
ap d^p y/dt^p + ... + a1 dy/dt + a0 y = bq d^q x/dt^q + ... + b1 dx/dt + b0 x
• Laplace transforming gives the transfer function; substituting s = 2πif (equivalently f = s/(2πi)) gives the FRF, which can be measured:
H(s) = (bq s^q + ... + b1 s + b0) / (ap s^p + ... + a1 s + a0)
H(f) = (bq (2πif)^q + ... + b1 (2πif) + b0) / (ap (2πif)^p + ... + a1 (2πif) + a0)
• The zeros and poles are the roots of Q(s) = 0 and P(s) = 0; inverse transforming gives the impulse response, and convolution gives the response to any input:
y(t) = ∫_{−∞}^{∞} x(u) h(t − u) du
• Digital systems mirror this structure, with the z-transform of a sequence in place of the Laplace transform (the Fourier transform is not discussed here; as before):
H(z) = (b0 + b1 z^-1 + ...) / (1 + a1 z^-1 + ... + ap z^-p)
H(f) = (b0 + b1 e^{−2πif/fs} + ...) / (1 + a1 e^{−2πif/fs} + ... + ap e^{−2πipf/fs})
with the two related by f = fs log(z)/(2πi); the zeros and poles are the roots of Q(z) = 0 and P(z) = 0, and the convolution sum gives the response to any input:
y[n] = Σ_{m=−∞}^{∞} h[m] x[n−m]
• The pole positions answer: Is the system stable? Is the system causal?
IZT Example 1
• A z-transform to invert:
X(z) = (2z + 1) / (3z − 1),  ROC |z| > 1/3
• One can't expand X(z) = (2z + 1)(3z − 1)^-1 directly in powers of z; instead note that the ROC condition implies |3z| > 1, so write
X(z) = (2z + 1)(3z)^-1 (1 − (3z)^-1)^-1 = (2/3 + (1/3)z^-1)(1 + 3^-1 z^-1 + 3^-2 z^-2 + ....)
giving x[0] = 2/3 and x[n] = (2/3)3^-n + (1/3)3^-(n−1) = (5/3)3^-n for n ≥ 1.
IZT Example 2
X(z) = z^-1 (1 − z^-1)^-1 (1 − z^-1)^-1,  ROC |z| > 1
(1 − z^-1)^-1 = 1 + z^-1 + z^-2 + z^-3 + z^-4 + z^-5 + ......  Expansion is valid since |z^-1| < 1.
(1 − z^-1)^-1 (1 − z^-1)^-1 = (1 + z^-1 + z^-2 + z^-3 + ......)(1 + z^-1 + z^-2 + z^-3 + ......)
= 1 + z^-1 + z^-2 + z^-3 + z^-4 + z^-5 + ......
+ z^-1 (1 + z^-1 + z^-2 + z^-3 + z^-4 + z^-5 + ......)
+ z^-2 (1 + z^-1 + z^-2 + z^-3 + z^-4 + z^-5 + ......) + .......
= 1 + 2z^-1 + 3z^-2 + 4z^-3 + 5z^-4 + .....
X(z) = z^-1 (1 − z^-1)^-1 (1 − z^-1)^-1 = z^-1 + 2z^-2 + 3z^-3 + 4z^-4 + 5z^-5 + .....
x[n] = 0 for n ≤ 0, x[n] = n for n > 0
IZT Example 3
X(z) = (6z² + 3z − 1) / (6z² + 5z + 1),  ROC |z| > 1/2
Use partial fractions to break X(z) into smaller, easy to compute, parts:
X(z) = 1 + 2/(2z + 1) − 4/(3z + 1) = 1 + z^-1/(1 + z^-1/2) − (4z^-1/3)/(1 + z^-1/3)
= 1 + z^-1 (1 + z^-1/2)^-1 − (4/3) z^-1 (1 + z^-1/3)^-1
Note both expansions are valid since in the ROC |z^-1|/2 < 1 and |z^-1|/3 < 1.
= 1 + z^-1 (1 − z^-1/2 + z^-2/2² − z^-3/2³ + ....) − (4/3) z^-1 (1 − z^-1/3 + z^-2/3² − z^-3/3³ + ....)
= 1 + z^-1 (1 − 4/3) + z^-2 (−1/2 + 4/3²) + z^-3 (1/2² − 4/3³) + ......
x[n] = 0 for n < 0,  x[0] = 1
x[n] = (−1/2)^{n−1} − (4/3)(−1/3)^{n−1} = 4(−1/3)^n − 2(−1/2)^n,  n > 0
General Principles of Filter Design
Chuang Shi
• Prediction
– Forward or backward prediction
• Hilbert transformers
• Optimal filters
– For example, estimating one sequence from another.
• Tracking/state estimation filters
Types of Frequency Selective Filter
• Low-pass
– Frequencies above a cut-off frequency are rejected
• High-pass
– Frequencies below a cut-on frequency are rejected
• Band-pass
– Frequencies between two specified frequencies are passed
• Band-stop
– Frequencies between two specified frequencies are rejected
• Notch filter
– A narrow form of band-stop filter
• Comb filter
– A filter consisting of a series of notches
Examples of Frequency Selective Filters
Forms of Digital Filter
• When designing a filter, the first decision to be made is
whether the filter is to be Finite Impulse Response (FIR) or
Infinite Impulse Response (IIR).
• FIR filters are (always ?) implemented as a moving average
(MA) system.
• IIR filters are implemented as ARMA systems.
• There are very different design methodologies for FIR and IIR
filters.
• There are advantages and disadvantages to both (these will be
examined later).
Steps in Filter Design
• Designing a filter consists of the following general stages:
1. Specifying the required filter response, e.g. cut-on/-off frequencies.
2. Defining the type of filter you need, i.e. FIR or IIR
3. Deciding upon the number of coefficients
4. Designing a filter
5. Comparing the response with the specified response
• If the filter fulfils the specification either:
– Consider reducing the model order and redesigning (can you meet the
specification with a shorter filter?)
– Stop
• If the filter fails to meet the specification either:
– Consider increasing the model order, return to step 3.
– Reconsider your choice of filter type, return to step 2.
– Modify the specification (!), return to step 1.
The Effect of the Number of
Coefficients
• Choosing the number of coefficients in a filter is usually a
compromise.
• Filters normally have a better response if longer filters (ones
with more coefficients) are used.
• Longer filters require greater computational loads – which
may be an issue in real-time systems.
• Longer filters may introduce greater delays to the system (or
increase “phase distortion” – see later notes).
• Also, longer filters are more affected by rounding errors in
their coefficients (also see next slide).
– Because of finite precision, the filter coefficients are rounded before
they are implemented. The impact of these rounding errors tends to be
greater in longer filters than in shorter ones.
An Example of the Effects of
Rounding Coefficients
Coefficients have been
rounded to 3 d.p., which is
rather more dramatic than
normal (it is roughly equivalent
to 10 bit computation) – so the
effect is magnified here.
• The output from such an ideal filter has no energy above foff
but energy below foff is preserved by the filter.
• Note the filter’s FRF is discontinuous at foff and, as such, no
realisable filter can be designed that exactly has this FRF.
Practical Designs
• In practice we must accept that a filter will only approximate
the ideal FRF.
• Loosely (for a low-pass filter) we would like a filter to have an
FRF whose magnitude is:
– Close to 1 for frequencies below the cut-off.
– Close to 0 for frequencies above the cut-off.
– Near the cut-off frequency we expect the response to rapidly change
from 1 to 0.
• In practice, the response tends to:
– Oscillate around 1 in the pass-band (a phenomenon called pass-band ripple).
– Be small (but generally not zero) in the stop-band.
– Take a finite time to transit between the two bands – this
region, where the filter response changes from close to 1 to
close to 0, is called the transition zone.
Examples of Practical Designs
[Figure: the same filter response on a linear scale and a dB scale, with the transition zone marked in each.]
A linear scale usually shows the pass-band ripple more effectively
than a dB scale does. The stop-band behaviour is more
clearly assessed using a dB representation.
Filter Specifications
• A filter specification should be realisable.
• The ideal filter is not a specification, since no practical filter
can have such a response.
• A practical filter specification (for a low-pass filter) should
define:
– The end of the pass-band
– The width of the transition zone (which along with the above defines
the start of the stop-band).
– The permissible level of rippling in the pass-band.
– The maximum gain in the stop-band, sometimes called the stop-band
ripple.
• An FIR filter is an MA system with the difference equation:
y[n] = Σ_k bk x[n−k]
• Recall:
– These are all-zero filters and there is no issue with stability.
– The impulse response of the system is equal to the filter coefficients,
h[n] = bn.
– From the preceding statement one can see that the filter's FRF is equal
to the FT of the coefficients, i.e. H(f) = F{h[n]} = F{bn}
This last point, in particular, makes the design of FIR filters relatively straightforward.
Windowing Design Method
• The windowing design method is based on the idealised filter.
• Since we have
H(f) = F{bn}  ⇒  bn = h[n] = F^-1{H(f)}
• Hence the "ideal" filter coefficients are the inverse FT of the
ideal FRF:
H̃(f) = 1 for |f| ≤ fc
= 0 for |f| > fc
In this notation a "~" is used to indicate quantities that relate to the ideal filter.
• Note that for an ideal filter we actually only require that its
magnitude is 1 in the pass band; in the above we consider the
special case where it actually equals one.
Ideal Impulse Response
• The inversion of the ideal filter can be performed analytically
(with frequency normalised by the sample rate):
h̃[n] = b̃n = ∫_{−fs/2}^{fs/2} H̃(f) e^{2πifn} df = ∫_{−fc}^{fc} e^{2πifn} df
= [ e^{2πifn} / (2πin) ]_{f=−fc}^{fc} = sin(2πfc n) / (πn)
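The whole windowed-sinc recipe can be sketched in a few lines; the cut-off, half-length and Hamming window below are illustrative choices, not values from the notes:

```python
import math

fc, K = 0.125, 32   # normalised cut-off and half-length (arbitrary choices)

def ideal(n):
    # ideal coefficients sin(2 pi fc n) / (pi n), with the n = 0 limit 2 fc
    return 2 * fc if n == 0 else math.sin(2 * math.pi * fc * n) / (math.pi * n)

# Hamming-windowed coefficients b[-K] ... b[K]
b = [ideal(n) * (0.54 + 0.46 * math.cos(math.pi * n / K)) for n in range(-K, K + 1)]

def frf(f):
    # H(f) = sum_n b[n] e^{-2 pi i f n}
    return sum(bn * complex(math.cos(2 * math.pi * f * n), -math.sin(2 * math.pi * f * n))
               for n, bn in zip(range(-K, K + 1), b))
```

Evaluating frf in the pass-band and stop-band shows the gain close to 1 below fc and small above it.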
Ideal filter's FRF * Window's FT = Designed filter's FRF
Window Features which Affect the Filter Design
[Figure: the window's main lobe sets the width of the transition zone; the window's side-lobes set the ripple in the pass-band and the attenuation in the stop-band.]
• The width of the window's main lobe defines the width of the
filter's transition zone.
• The window's side-lobes define the filter's pass-band and
stop-band behaviour.
Various Choices of Windowing
Functions (Linear Plots)
[Figure: a window giving a narrow transition zone gives poor attenuation in the stop-band; a window giving a wide transition zone gives good attenuation in the stop-band.]
Truncated Filter Coefficients
• As described so far the FIR filter coefficients are
b−K, b−K+1, ...., b−1, b0, b1, ...., bK, corresponding to the difference
equation:
y[n] = b−K x[n+K] + b−K+1 x[n+K−1] + .... + b−1 x[n+1] + b0 x[n] + .... + bK x[n−K]
• This filter is acausal, as y[n] depends on future values of the
input, x[n].
• This can be rectified by waiting K samples before computing
the output, i.e. not computing y until x[n+K] has occurred.
• This is equivalent to the difference equation
y[n+K] = b−K x[n+K] + b−K+1 x[n+K−1] + .... + b−1 x[n+1] + b0 x[n] + .... + bK x[n−K]
Shifting
• The difference equation
y[n+K] = b−K x[n+K] + b−K+1 x[n+K−1] + .... + b−1 x[n+1] + b0 x[n] + .... + bK x[n−K]
• Is equivalent to
y[n] = b−K x[n] + b−K+1 x[n−1] + .... + b−1 x[n−K+1] + b0 x[n−K] + .... + bK x[n−2K]
(replacing n+K by n throughout); clearly this is causal.
• In this form the coefficients b are now numbered strangely; it
is sensible to renumber them using b̂k = b(k−K), leading to
y[n] = b̂0 x[n] + b̂1 x[n−1] + .... + b̂K−1 x[n−K+1] + b̂K x[n−K] + .... + b̂2K x[n−2K]
Summary
Design Principles for IIR Filters
Chuang Shi
Bessel Filters
• Bessel filters have transfer functions based on the Bessel polynomials qn(s); for example, for n = 3:
H(s) = 15 / (s³ + 6s² + 15s + 15)
Friedrich Bessel (1784-1846)
Bessel Filters (Cont’d)
• Bessel filters have very good phase responses, they introduce
minimal levels of phase distortion (see later notes).
• This means they have frequently been used for cross-over
systems.
Stephen Butterworth
(1885-1958)
Butterworth Filters
• Butterworth filters are a widely used class of analogue filters.
• The squared magnitude of the filter's FRF has the form:
|H(w)|² = 1 / (1 + w^{2n})
where n is the filter order.
• The corresponding transfer functions are:
H(s) = 1 / (s + 1)   n = 1
= 1 / (s² + √2 s + 1)   n = 2
= 1 / [(s + 1)(s² + s + 1)]   n = 3
Butterworth Filter (Cont’d)
• Butterworth filters are characterised by having a gain of -3 dB
at w=1 for all orders n.
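That the factored transfer function reproduces |H(w)|² = 1/(1 + w^{2n}) on the frequency axis is easy to check; a sketch for n = 3:

```python
def butter3_mag_sq(w):
    # |H(iw)|^2 for the factored n = 3 filter 1/((s + 1)(s^2 + s + 1))
    s = complex(0.0, w)
    return abs(1 / ((s + 1) * (s * s + s + 1)))**2

# compare with 1/(1 + w^6) at a few frequencies
checks = [(butter3_mag_sq(w), 1 / (1 + w**6)) for w in (0.0, 0.5, 1.0, 2.0)]
```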
Chebyshev Filters (Type I)
• There are two forms of Chebyshev filters: type I &
type II filters.
• Type I filters are characterised by an equi-ripple in
the pass-band and monotonic decay in the stop- Pafnuty
Chebyshev
band. (1821-1894)
|H(w)|² = 1 / (1 + ε² Tn(w)²)
where Tn are the Chebyshev polynomials:
T1(x) = x
T2(x) = 2x² − 1
T3(x) = 4x³ − 3x
T4(x) = 8x⁴ − 8x² + 1
T5(x) = 16x⁵ − 20x³ + 5x
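The polynomials listed above satisfy the recursion T_{n+1}(x) = 2x T_n(x) − T_{n−1}(x) and the identity T_n(cos θ) = cos(nθ) — standard properties, not stated in the notes — which makes them easy to generate and check:

```python
import math

def cheb(n, x):
    # T_n(x) via the recursion T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x)
    t0, t1 = 1.0, x
    if n == 0:
        return t0
    for _ in range(n - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1
```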
Chebyshev Filters (Type I) (Again)
[Figure: effect of changing ε, altering the degree of rippling in the pass-band.]
|H(w)|² = 1 / (1 + ε² Tn(w)²)
• Elliptic filters replace the Chebyshev polynomial with the elliptic rational function Rn(ξ, w):
|H(w)|² = 1 / (1 + ε² Rn(ξ, w)²)
• A high-pass example: substituting s → 100/s into the 3rd order low-pass Butterworth prototype gives a high-pass filter with cut-on at w = 100:
Hhp(s) = 1 / [(100/s + 1)((100/s)² + 100/s + 1)] = s³ / [(100 + s)(10000 + 100s + s²)]
Example 3rd Order Butterworth
Filters
Summary
• So far we have described methods for designing analogue
filters: these filters have a transfer function H(s).
• We now consider how to convert that analogue filter to a
digital version, with a transfer function H(z).
• This can be regarded in several different ways:
– Mapping the variable s to the variable z.
– Creating an equivalent difference equation from a differential equation.
• In fact from this stand-point this process shares much with the problem of
the numerical solution of differential equations, such as Runge-Kutta
methods.
What is Required of such a
Mapping?
1. Stability should be maintained, i.e. if H(s) is stable, then after
the mapping the digital system H(z) should also be stable.
2. For an analogue filter, the frequency response H(f) has been
"carefully" selected; one wants to preserve the character of this
response in the digital domain.
• To do this the mapping should take points on the analogue
frequency axis (s = iw) to the digital frequency axis (z = e^{iw}).
3. The mapping should be one-to-one.
• This means that there can be no aliasing, since each point in the
analogue (s) domain is mapped to a different point in the digital (z)
domain.
Method of Mapping Differentials
• This is equivalent to Euler's method for solving differential
equations.
• It is based on a finite difference approximation for derivatives:
dx/dt ≈ (x(t) − x(t − h)) / h
for small h.
• If h is selected to be the sampling interval, Δ, then
dx/dt |_{t=nΔ} ≈ (x(t) − x(t − Δ)) / Δ = (x[n] − x[n−1]) / Δ
Method of Mapping Differentials in
the Transform Domain
• Since in the Laplace domain
L{dx/dt} = sX(s)
and in the z-domain
Z{(x[n] − x[n−1]) / Δ} = (1 − z^-1) X(z) / Δ
matching
sX(s) ↔ (1 − z^-1) X(z) / Δ
gives the substitution
s → (1 − z^-1) / Δ
This substitution constitutes the method of mapping differentials.
Example
• Consider the simple system dy/dt + y = x, i.e.
H(s) = 1 / (s + 1)
• Using the method of mapping differentials:
H(z) = 1/(s + 1) |_{s=(1−z^-1)/Δ} = 1 / [(1 − z^-1)/Δ + 1] = Δ / (1 + Δ − z^-1)
• Consider the poles of these two systems:
– H(s) has one pole at s = −1.
– H(z) has one pole at z = (1 + Δ)^-1.
• In difference equation form:
(1 + Δ) y[n] − y[n−1] = Δ x[n]
y[n] = y[n−1]/(1 + Δ) + Δ x[n]/(1 + Δ)
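The difference equation can be checked by simulating a unit-step input and comparing with the exact step response 1 − e^{−t} of dy/dt + y = x; the value Δ = 0.01 below is an arbitrary choice:

```python
import math

D = 0.01                      # sampling interval Delta (arbitrary, small)
y, out = 0.0, []
for n in range(int(5 / D)):   # simulate 5 seconds of a unit-step input
    # (1 + D) y[n] = y[n-1] + D x[n], with x[n] = 1
    y = (y + D) / (1 + D)
    out.append(y)

exact = 1 - math.exp(-5)      # exact continuous step response at t = 5
```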
General Properties
• Using
s = (1 − z^-1)/Δ  ⇔  z = 1/(1 − Δs)
where do points in the s-plane map to in the z-plane?
• Note that:
– s = 0 → z = 1
– |s| → ∞ → z → 0
– s = iw → z = 1/(1 − iwΔ) = (1/2)(1 + (1 + iwΔ)/(1 − iwΔ)) = (1 + e^{iq})/2
where q = 2 tan^-1(wΔ)
The frequency axis in the s-plane maps to a circle in the z-plane, but NOT the unit circle, i.e.
not the frequency axis in the z-plane.
Graphical Representation of
Mapping Differentials
Summary of Mapping Differentials
1. The method of mapping differentials is a one-to-one mapping
• It does not introduce aliasing
2. It preserves stability.
• The left half of the s-plane is mapped to the interior of the circle
defined by (1+e-iq)/2 (see grey regions in previous plot).
3. Points on the frequency axis in the s-plane do NOT map to
the frequency axis in the z-plane.
• This means that the frequency response of the digital system will not
be equivalent to that of the analogue system/filter.
• At low frequencies this mapping approximately preserves the FRF.
4. This method is not well suited to filter design
• Conceivably it might be used in the case of non-linear systems
• It could be used to design low-pass (or band pass filters) if the cut-off
frequency is very much smaller than fs/2.
Example
• Designing a 3rd order high pass filter with cut on at 100 Hz,
with a sample rate of 1 kHz using method of mapping
differentials.
Impulse Invariance
• This is the second approach to computing a digital system
from an analogue one.
• It consists of 3 steps:
1. Compute the inverse Laplace transform of H(s), i.e. compute the
impulse response h(t).
2. Sample this impulse response to create h[n]=h(t) for t=nD.
3. Compute the transfer function, H(z), of the digital system with impulse
response h[n] (using the z-transform).
• Note that step 2 is a sampling process that possibly introduces
aliasing.
Example
• Again consider the system
H(s) = 1 / (s + 1)
h(t) = L^-1{1/(s + 1)} = e^{−t},  t ≥ 0
• Sample the impulse response:
h[n] = h(t)|_{t=nΔ} = e^{−nΔ},  n ≥ 0
• Z-transform to compute H(z):
H(z) = Σ_{n=0}^{∞} h[n] z^-n = Σ_{n=0}^{∞} e^{−nΔ} z^-n = Σ_{n=0}^{∞} (e^{−Δ} z^-1)^n = 1 / (1 − e^{−Δ} z^-1)
• The corresponding difference equation is
y[n] = x[n] + e^{−Δ} y[n−1]
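A numerical comparison of the two FRFs at a low frequency; the factor of Δ accounts for the 1/Δ gain that sampling introduces into the FRF, and Δ and the test frequency are arbitrary choices:

```python
import cmath, math

D = 0.01                         # sampling interval (arbitrary, small)

def Ha(f):
    # analogue FRF of H(s) = 1/(s + 1)
    return 1 / (complex(0.0, 2 * math.pi * f) + 1)

def Hd(f):
    # digital FRF of H(z) = 1/(1 - e^-D z^-1), evaluated on the unit circle
    zinv = cmath.exp(complex(0.0, -2 * math.pi * f * D))
    return 1 / (1 - math.exp(-D) * zinv)

# with the 1/D gain compensated, the digital response tracks the analogue one
err = abs(D * Hd(1.0) - Ha(1.0))
```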
Impulse Invariance as a Mapping
• It can be shown that, using impulse invariance, the transfer
functions in the s- and z-planes are related via
H(z)|_{z=e^{sΔ}} = (1/Δ) Σ_{k=−∞}^{∞} Ha(s + 2πikfs)
[Figure: horizontal strips of the s-plane with boundaries at ±πfs, ±3πfs, ±5πfs, ..., each of which maps onto the whole z-plane.]
Summary of Impulse Invariance
1. Impulse invariance preserves stability.
• All the points in the left-half of the s-plane are mapped to the interior
of the unit disc in the z-plane.
2. The frequency axis in the s-plane is mapped to the frequency
axis in the z-plane.
3. The mapping is not one-to-one.
• The process of sampling can introduce aliasing.
• The analogue frequency response must be zero for frequencies above fs/2.
• This is feasible as long as the filter has a specific final cut-off frequency,
meaning that impulse invariance can be successfully used to design low-
pass or band-pass filters, but not high-pass or band-stop.
The Bilinear Transform
• The bilinear transform is a mapping method in which H(z) is
obtained from H(s) by making the substitution:
s → (2/Δ) (1 − z^-1) / (1 + z^-1)
where Δ is the sampling interval.
• This is the most widely used method for obtaining a digital
system from an analogue one.
• What follows is not strictly a "proof" of the utility of the
bilinear transform: it is more a verification.
Verification of the Bilinear
Transform (I)
• We shall consider a first order linear system*:
a dy/dt + by = cx  ⇔  dy/dt = (c/a) x − (b/a) y
• This system has a transfer function:
H(s) = c / (as + b)
• Also consider the following integral:
∫_{(n−1)Δ}^{nΔ} (dy/dt) dt = y(t)|_{t=nΔ} − y(t)|_{t=(n−1)Δ} = y[n] − y[n−1]
*Note that higher order linear systems can be constructed by putting first order systems in series,
so this assumption is not as restrictive as one might, at first, expect.
Verification of the Bilinear
Transform (II)
• Recall the trapezoidal rule for approximating an integral:
∫_{x0}^{x1} f(x) dx ≈ ((x1 − x0)/2) (f(x0) + f(x1))
Verification of the Bilinear
Transform (III)
• Using the trapezoidal approximation one has:
∫_{(n−1)Δ}^{nΔ} (dy/dt) dt = y[n] − y[n−1] ≈ (Δ/2) (y′[n−1] + y′[n])
where y′[n] = dy/dt |_{t=nΔ}
• Using the linear system we also have that dy/dt = (c/a)x − (b/a)y, so
y[n] − y[n−1] ≈ (Δ/2) [ (c/a)x[n] − (b/a)y[n] + (c/a)x[n−1] − (b/a)y[n−1] ]
Verification of the Bilinear
Transform (IV)
• Z-transforming gives:
(2/Δ) Y(z) (1 − z^-1) ≈ (c/a) X(z) (1 + z^-1) − (b/a) Y(z) (1 + z^-1)
H(z) = Y(z)/X(z) = c / [ a (2/Δ)(1 − z^-1)/(1 + z^-1) + b ]
which is H(s) = c/(as + b) with the substitution s → (2/Δ)(1 − z^-1)/(1 + z^-1).
• For the example H(s) = 1/(s + 1):
H(z) = 1 / [ (2/Δ)(1 − z^-1)/(1 + z^-1) + 1 ] = Δ(1 + z^-1) / [ 2 − 2z^-1 + Δ(1 + z^-1) ]
= (Δ + Δz^-1) / [ 2 + Δ − (2 − Δ) z^-1 ]
• Inverting the substitution shows that the mapping of the planes is
z = (1 + Δs/2) / (1 − Δs/2)
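Two properties of this mapping can be checked directly: the frequency axis s = iw lands exactly on the unit circle, and a stable analogue pole lands inside it. The sampling interval below is an arbitrary choice:

```python
D = 0.05   # sampling interval (arbitrary)

# the frequency axis s = iw maps onto the unit circle |z| = 1
on_circle = [abs((1 + D * complex(0, w) / 2) / (1 - D * complex(0, w) / 2))
             for w in (0.1, 1.0, 10.0, 100.0)]

# the stable pole of H(s) = 1/(s + 1) at s = -1 maps inside the unit circle
z_pole = (1 + D * (-1) / 2) / (1 - D * (-1) / 2)
```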
Chuang Shi
Group Delay and Phase Delay
• For a modulated input pulse with envelope A(t), the output is approximately
y(t) ≈ |H(f0)| A(t − tg) sin(2πf0 (t − tp) + φ)
where tg is the group delay (the delay of the envelope) and tp the phase delay (the delay of the carrier).
[Figure: input pulse and output pulse, with the group delay marked between them.]
Example Phase Response
• When computing the phase response for a system one has to
take some care.
• There is an ambiguity in the phase of a complex value:
z = r e^{i(φ+2πk)} for any integer k
• For each frequency arg(H(f)) takes a value in the range (−π, π].
• This means that if the phase function strays outside of the
region (−π, π] then it is folded back into that region.
• This has to be undone in a process called phase unwrapping.
Example of Phase Unwrapping
Butterworth Filter
• 6th Order Butterworth low-pass filter at 0.125
Elliptic Filter
• 6th Order Elliptic low-pass filter at 0.125
Phase Delay
• The phase delay is defined by
tp = −φ(f) / (2πf)
so that a pure delay, with phase response φ(f) = −2πf tp, has a constant phase delay tp.
Comments
• Recall the group delays for the elliptic and Butterworth filters.
• The elliptic filter's phase response shows a much stronger
frequency dependence than does the Butterworth filter's.
• So the elliptic filter has a greater degree of phase distortion.
• It does, generally, have a better magnitude response.
• This is usually the case: filters with rapid transitions normally
introduce larger phase distortion.
Linear Phase
• In order to avoid phase distortion one requires that the FRF's
phase is linear, i.e.
φ(f) = θ − 2πf τ
• For a real system the FRF at f=0 is real, i.e. φ(0) = 0 or π; thus
θ = 0 or π (we shall largely assume θ = 0).
• Note the slope, τ, of the phase response defines the delay
(both the group and phase delay) of the system, and this delay
is constant for all frequencies, i.e. no phase distortion occurs.
• Linear phase systems are desirable; further, we would generally
also like to minimise the delay τ.
Linear Phase (Cont'd)
• Recall that the Fourier transform of a delayed signal satisfies
X(f) = F{x(t)}  ⇒  F{x(t − τ)} = e^{−2πif τ} X(f)
[Figure: z-plane showing a zero at ζ = r e^{iφ} and a second zero at 1/ζ* = (1/r) e^{iφ}.]
FRF of 2 Zero System
• This system has a transfer function (assuming causality)
H(z) = (1 − z^-1 ζ)(1 − z^-1/ζ*) = (1 − z^-1 r e^{iφ})(1 − z^-1 e^{iφ}/r)
= 1 − z^-1 e^{iφ}(r + 1/r) + z^-2 e^{2iφ}
All-Pass Filters
• Now pair a zero at r e^{iθ} with a pole at (1/r) e^{iθ}. The FRF can be written
H(ω) = (r e^{−iω} − e^{−iθ}) / (1 − r e^{−iθ} e^{iω}) = e^{−iω} (r − e^{i(ω−θ)}) / (1 − r e^{i(ω−θ)})
|H(ω)|² = [(r − e^{i(ω−θ)})(r − e^{−i(ω−θ)})] / [(1 − r e^{i(ω−θ)})(1 − r e^{−i(ω−θ)})]
= (r² + 1 − 2r cos(ω−θ)) / (1 + r² − 2r cos(ω−θ)) = 1
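The unit-magnitude property is easy to confirm numerically; the sketch below uses the standard causal all-pass form (z^-1 − a*)/(1 − a z^-1) with pole a = r e^{iθ}, and the values of r and θ are arbitrary:

```python
import cmath

r, theta = 0.7, 1.2   # arbitrary pole radius and angle
a = r * cmath.exp(complex(0, theta))

def H(w):
    # first order all-pass section (z^-1 - a*)/(1 - a z^-1) on the unit circle
    zinv = cmath.exp(complex(0, -w))
    return (zinv - a.conjugate()) / (1 - a * zinv)

mags = [abs(H(w)) for w in (0.0, 0.5, 1.0, 2.0, 3.0)]
```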
All-Pass Filters (Cont’d)
• This system has a FRF which has a gain of unity for all
frequencies.
• This filter only affects the phase of the input – it delays
frequency components but does not attenuate or amplify any
component.
• Effectively such filters just introduce phase distortion.
• All-pass filters can be applied in series, without losing their
all-pass character.
FIR vs IIR Filters
• The choice of whether to use an FIR or an IIR filter is critical.
• The advantages and disadvantages of the two types can be
summarised as:
– IIR filters are generally more efficient: they require fewer coefficients
to achieve a given level of performance.
– FIR filters can be designed to avoid phase distortion: in most instances
this is the main reason for selecting FIR filters.
– IIR filters can suffer from the effects of rounding errors on the
coefficients: there is a maximum length of filter which can effectively
be designed.
– Stability needs to be considered with IIR filters.
Complex Numbers and their
Reciprocals
• Consider the complex number z = r e^{iφ}.
• Its conjugate is z* = r e^{−iφ}.
• Its reciprocal is 1/z = (1/r) e^{−iφ}.
• The conjugate of the reciprocal is 1/z* = (1/r) e^{iφ}.
[Figure: z, z*, 1/z and 1/z* plotted in the complex plane, at radii r and 1/r.]