Adaptive Filters and Equalization
(johns@eecg.toronto.edu)
(www.eecg.toronto.edu/~johns)
slide 1 of 70
University of Toronto
© D.A. Johns, 1997
Adaptive Filter Introduction
• A typical adaptive system consists of the following two-input, two-output system:
[Block diagram: input u(n) drives an adjustable filter H(z); its output y(n) is subtracted from the reference δ(n) to form the error e(n), which the adaptive algorithm uses to update H(z)]
Noise (and Echo) Cancellation
[Block diagram: the reference δ(n) is signal + H1(z) × noise; the noise pickup (through H2(z)) forms the adaptive-filter input u(n); once converged, y(n) = H1(z) × noise and e(n) = signal]
Adaptation Algorithm
• Optimization might be performed as follows:
• Perturb one coefficient in H(z) and check whether the power of the error signal increased or decreased.
• If it decreased, go on to the next coefficient.
• If it increased, switch the sign of the coefficient change and go on to the next coefficient.
• Repeat this procedure until the error signal is minimized.
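The perturb-and-check procedure above can be sketched in a few lines. This is an illustrative sketch only — the filter structure, data, and names below are invented for the example, not taken from the notes:

```python
import numpy as np

def perturbation_search(filter_out, d, p, step=0.05, passes=40):
    """Perturb one coefficient at a time; keep the direction that
    lowers the error power E[e^2], otherwise flip the step's sign."""
    def err_power(p):
        e = d - filter_out(p)            # e(n) = delta(n) - y(n)
        return np.mean(e ** 2)

    delta = np.full(len(p), step)        # signed step per coefficient
    for _ in range(passes):
        for i in range(len(p)):
            before = err_power(p)
            p[i] += delta[i]             # perturb coefficient i
            if err_power(p) > before:    # error power increased:
                p[i] -= 2 * delta[i]     # undo, and switch the sign
                delta[i] = -delta[i]     # of the coefficient change
    return p

# Toy use: adapt a 2-tap filter to mimic an unknown 2-tap channel
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
d = np.convolve(u, [0.8, -0.3])[:len(u)]      # reference delta(n)
p = perturbation_search(lambda q: np.convolve(u, q)[:len(u)],
                        d, np.zeros(2))
# p approaches (0.8, -0.3), to within the perturbation step size
```

Note that this search only resolves each coefficient to within one step size — the gradient-based methods on the following slides do much better.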
Steepest-Descent Algorithm
p_i(n+1) = p_i(n) − µ · ∂E[e²(n)]/∂p_i
Steepest Descent Algorithm
• In the one-dimensional case
[Plot: performance surface E[e²(n)] versus p_i — a convex curve (∂²E[e²(n)]/∂p_i² > 0); successive estimates p_i(0), p_i(1), p_i(2) descend toward the optimum p_i*]
Steepest-Descent Algorithm
• In the two-dimensional case
[Plot: contours of E[e²(n)] (axis out of page) over the (p1, p2) plane; the steepest-descent trajectory converges to the optimum (p1*, p2*)]
LMS Algorithm
• Replace the expected squared error with the instantaneous squared error, and let the adaptation process smooth out the result.
p_i(n+1) = p_i(n) − µ · ∂e²(n)/∂p_i

p_i(n+1) = p_i(n) − 2µ e(n) · ∂e(n)/∂p_i
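For an adaptive linear combiner (where ∂y(n)/∂p_i = x_i(n) and e(n) = δ(n) − y(n), so the update becomes p_i(n+1) = p_i(n) + 2µe(n)x_i(n)), LMS is only a few lines of code. A minimal sketch, assuming a tapped-delay-line (FIR) structure and an invented system-identification setup:

```python
import numpy as np

def lms_fir(u, d, ntaps, mu):
    """LMS-adapt an FIR (tapped-delay-line) filter so its output tracks d."""
    p = np.zeros(ntaps)            # coefficients p_i(n)
    x = np.zeros(ntaps)            # states: delayed inputs x_i(n)
    e_hist = np.empty(len(u))
    for n in range(len(u)):
        x = np.roll(x, 1)          # shift the delay line
        x[0] = u[n]
        y = p @ x                  # y(n) = sum_i p_i(n) x_i(n)
        e = d[n] - y               # e(n) = delta(n) - y(n)
        p += 2 * mu * e * x        # instantaneous-gradient (LMS) update
        e_hist[n] = e
    return p, e_hist

# System identification: match an unknown 3-tap channel
rng = np.random.default_rng(1)
u = rng.standard_normal(2000)
d = np.convolve(u, [0.3, 1.0, -0.2])[:len(u)]
p, e = lms_fir(u, d, 3, mu=0.01)
# p converges to (0.3, 1.0, -0.2); e decays toward zero
```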
Obtaining Gradient Signals
[Block diagram: coefficient p_i is a multiplier between two LTI sections — h_um(n) is the impulse response from the input u to the multiplier input m, and h_ny(n) from the multiplier output n to the filter output y]

φ_i(n) = ∂y(n)/∂p_i = h_ny(n) ⊗ h_um(n) ⊗ u(n)   (⊗ denotes convolution)
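The gradient formula can be checked numerically: for a filter that is linear in p_i, the cascade convolution h_ny ⊗ h_um ⊗ u must equal the finite-difference gradient. A sketch with made-up impulse responses for the two sections:

```python
import numpy as np

# Illustrative (invented) impulse responses for the two LTI sections
h_um = np.array([1.0, 0.5])        # input u -> multiplier input m
h_ny = np.array([1.0, -0.25])      # multiplier output n -> filter output y

rng = np.random.default_rng(2)
u = rng.standard_normal(50)

# Gradient signal: phi_i(n) = h_ny(n) conv h_um(n) conv u(n)
phi = np.convolve(np.convolve(u, h_um), h_ny)

# Cross-check against a finite-difference gradient of the actual filter
def filter_out(pi):
    m = np.convolve(u, h_um)           # signal at the multiplier input
    return np.convolve(pi * m, h_ny)   # scale by p_i, then output section

eps = 1e-6
fd = (filter_out(1.0 + eps) - filter_out(1.0 - eps)) / (2 * eps)
# fd matches phi sample-for-sample
```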
Gradient Example
[Circuit: a state-variable filter with gains G1, G2, G3 and states v_lp(t), v_bp(t); a duplicate filter generates the remaining gradient ∂y(t)/∂G1]

∂y(t)/∂G2 = −v_lp(t)        ∂y(t)/∂G3 = −v_bp(t)
Adaptive Linear Combiner
[Block diagram: a state generator derives states x_1(n) … x_N(n) from u(n); each state is weighted by coefficient p_i(n) and summed to give y(n), which is subtracted from δ(n) to form e(n)]

y(n) = Σ_i p_i(n) x_i(n)

∂y(n)/∂p_i = x_i(n)

H(z) = Y(z)/U(z) — often, the state generator is a tapped delay line
Performance Surface
• Correlation of two states is determined by multiplying
the two signals together and averaging the output.
• Uncorrelated (and equal power) states result in a
“hyper-paraboloid” performance surface — good
adaptation rate.
• Highly-correlated states imply an ill-conditioned
performance surface — more residual mean-square
error and longer adaptation time.
[Plot: contours of E[e²(n)] (axis out of page) over the (p1, p2) plane, with the minimum at (p1*, p2*)]
Adaptation Rate
• Quantify the performance surface with the state-correlation matrix

      | E[x1x1]  E[x1x2]  E[x1x3] |
R  ≡  | E[x2x1]  E[x2x2]  E[x2x3] |
      | E[x3x1]  E[x3x2]  E[x3x3] |
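The eigenvalue spread of R is what quantifies the conditioning: near-equal eigenvalues give the hyper-paraboloid surface, while a large spread gives an ill-conditioned one. A sketch estimating R and its spread for a tapped delay line driven by white versus heavily-filtered (correlated) input — the lowpass filter here is an arbitrary choice for illustration:

```python
import numpy as np

def state_corr(X):
    """Estimate R = E[x x^T] from rows of state samples X (one row per n)."""
    return X.T @ X / len(X)

def delay_states(u, N):
    """States of an N-tap delay line: row n is (u[n], u[n-1], ..., u[n-N+1])."""
    return np.stack([u[N - 1 - i:len(u) - i] for i in range(N)], axis=1)

rng = np.random.default_rng(3)
N = 3
u_white = rng.standard_normal(10000)
# Strong lowpass (8-sample moving average) -> highly correlated states
u_corr = np.convolve(u_white, np.ones(8) / 8)[:10000]

spread = lambda R: np.linalg.eigvalsh(R).max() / np.linalg.eigvalsh(R).min()
s_white = spread(state_corr(delay_states(u_white, N)))
s_corr = spread(state_corr(delay_states(u_corr, N)))
# s_white is close to 1; s_corr is much larger: correlated states
# imply an ill-conditioned performance surface
```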
[Block diagram: tapped delay line — u(n) passes through a chain of z⁻¹ delays giving states x_1(n) … x_N(n); each state is weighted by p_i(n) and summed into y(n), which is subtracted from δ(n) to form e(n)]

y(n) = Σ_i p_i(n) x_i(n)

∂y(n)/∂p_i = x_i(n)
FIR Adaptive Filters
Equalization — Training Sequence
[Block diagram: known ±1 input data u(n) passes through the channel H_tc(z) and the FFE H(z) to give y(n) and the regenerated ±1 output data; the error e(n) is formed against a delayed copy δ(n) of the known input data. FFE = Feed-Forward Equalizer]
FFE Example
• Suppose the channel, H_tc(z), has impulse response 0.3, 1.0, −0.2, 0.1, 0.0, 0.0, …
FFE Example
y(1) = 0 = 1.0p 1 + 0.3p 2 + 0.0p 3
FFE Example
• Although ISI is reduced around the peak, slight ISI is introduced at other points (still better overall)
• The above is a “zero-forcing” equalizer — it usually boosts noise too much
• An LMS adaptive equalizer instead minimizes the mean squared error signal (i.e. finds both low ISI and low noise)
• In other words, it will leave some residual ISI rather than boost the noise
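The zero-forcing solve behind the example is a small linear system — the slide's equation y(1) = 0 = 1.0p1 + 0.3p2 + 0.0p3 is one of its rows. A numpy sketch (the tap count and cursor position are chosen here for illustration):

```python
import numpy as np

h = np.array([0.3, 1.0, -0.2, 0.1])   # example channel impulse response
ntaps = 3

# Convolution matrix: combined response conv(h, p) = H @ p
L = len(h) + ntaps - 1
H = np.zeros((L, ntaps))
for j in range(ntaps):
    H[j:j + len(h), j] = h

# Zero-forcing: pick as many combined-response samples as taps and force
# them to (0, 1, 0) -- zero ISI on either side of the cursor at index 2.
rows = [1, 2, 3]                       # row 1 is the slide's equation
p = np.linalg.solve(H[rows], np.array([0.0, 1.0, 0.0]))
combined = H @ p
# combined[1..3] are exactly (0, 1, 0), but slight ISI remains at the
# unconstrained samples combined[0], combined[4], combined[5]
```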
Equalization — Decision-Directed
Equalization — Decision-Feedback
[Block diagram: ±1 input data passes through the channel H_tc(z); the DFE output y_DFE(n) = H_2(z) applied to past decisions is subtracted from y(n) before the slicer, which produces the ±1 output data; e(n) drives adaptation. DFE = Decision-Feedback Equalizer]
DFE Example
• Assume signals 0 and 1 rather than −1 and +1 (this makes the examples easier to explain)
• Suppose the channel, H_tc(z), has impulse response 0.0, 1.0, −0.2, 0.1, 0.0, 0.0, …
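With this channel the DFE feedback filter simply holds the postcursor taps (−0.2, 0.1): subtracting their contribution, predicted from past decisions, restores the transmitted symbol exactly whenever the decisions are correct. An illustrative sketch (the random data and names are invented here):

```python
import numpy as np

h = np.array([0.0, 1.0, -0.2, 0.1])   # channel: unit cursor + postcursor ISI
fb = h[2:]                            # DFE feedback taps = postcursor tail

rng = np.random.default_rng(4)
a = rng.integers(0, 2, 200).astype(float)   # {0, 1} symbols, as on the slide
r = np.convolve(h, a)[:len(a)]              # channel output (noise-free)

decisions = np.zeros(len(a))
for n in range(len(a)):
    # Subtract the ISI predicted from past decisions, then slice
    isi = sum(fb[k] * decisions[n - 1 - k]
              for k in range(len(fb)) if n - 1 - k >= 0)
    z = r[n] - isi
    decisions[n] = 1.0 if z > 0.5 else 0.0
# decisions reproduce the transmitted symbols (one-sample channel delay)
```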
FFE and DFE Combined
Model as:

[Block diagram: ±1 input data x(n) passes through the channel H_tc(z), noise(n) is added, and the FFE H_1(z) filters the result; the DFE branch H_2(z) filters past decisions and combines with the FFE output y(n) before the slicer produces the ±1 output data]

Y/N = H_1   (5)

Y/X = H_tc · H_1 + H_2   (6)

• When H_tc is small, make H_2 = 1 (rather than H_1 → ∞)
Fractionally-Spaced FFE
• The feed-forward filter is often an FFE sampled at 2 or 3 times the symbol rate — fractionally spaced (i.e. sampled at T ⁄ 2 or at T ⁄ 3 )
• Advantages:
— Allows the matched filter to be realized digitally and also to adapt to channel variations (not possible with symbol-rate sampling)
— Also allows simpler timing-recovery schemes (the FFE can take care of phase recovery)
• Disadvantage:
— Costly to implement — full and higher-speed multiplies, and a higher-speed A/D needed.
dc Recovery (Baseline Wander)
• Wired channels are often ac coupled
• This reduces the dynamic range of the front-end circuitry and also requires some correction if not accounted for in the transmission line code
[Waveform: ac coupling makes the baseline wander — pulse levels drift away from the nominal +1/−1 values (e.g. toward +2)]
DFE feedback filter: (1/2)z⁻¹ + (1/4)z⁻² + (1/8)z⁻³ + …

input: 0 1 0 0 0 0 …  →  DFE output: 0 0 0.5 0.25 0.125 0.06 …
Baseline Wander Correction #1
DFE Based
DFE feedback filter: (1/2)z⁻¹ + (1/4)z⁻² + (1/8)z⁻³ + …

input: 0 1 1 1 1 1 …  →  DFE output: 0 0 0.5 0.75 0.875 0.938 …
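The sequences above can be reproduced by noting that the geometric feedback series folds into a one-pole recursion. A short sketch:

```python
def wander_estimate(x):
    """DFE feedback (1/2)z^-1 + (1/4)z^-2 + (1/8)z^-3 + ... applied to
    past decisions x. The geometric series folds into a one-pole IIR:
    w(n) = 0.5*w(n-1) + 0.5*x(n-1)."""
    w, out = 0.0, []
    for xn in x:
        out.append(w)              # w holds the weighted sum of past samples
        w = 0.5 * w + 0.5 * xn     # absorb x(n) for the next step
    return out

# Reproduces the slide's sequences:
print(wander_estimate([0, 1, 0, 0, 0, 0]))  # 0, 0, 0.5, 0.25, 0.125, 0.0625
print(wander_estimate([0, 1, 1, 1, 1, 1]))  # 0, 0, 0.5, 0.75, 0.875, 0.9375
```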
Baseline Wander Correction #3
Error Feedback
Analog Equalization
Analog Filters
Switched-capacitor filters
+ Accurate transfer-functions
+ High linearity, good noise performance
- Limited in speed
- Requires anti-aliasing filters
Continuous-time filters
- Moderate transfer-function accuracy (requires
tuning circuitry)
- Moderate linearity
+ High-speed
+ Good noise performance
[Block diagram: a state generator derives states x_1(t) … x_N(t) from u(t); each state is weighted by coefficient p_i(t) and summed to give y(t), which is subtracted from δ(t) to form e(t)]

y(t) = Σ_i p_i(t) x_i(t)

∂y(t)/∂p_i = x_i(t)

H(s) = Y(s)/U(s)
Adaptive Linear Combiner
• The gradient signals are simply the state signals
• If the coefficients are updated in discrete time:

p_i(n+1) = p_i(n) + 2µ e(n) x_i(n)   (7)
Analog Adaptive Filters
Analog Equalization Advantages
• Can eliminate A/D converter
• Reduce A/D specs if partial equalization done first
• If continuous-time, no anti-aliasing filter needed
• Typically consumes less power and silicon for high-frequency, low-resolution applications.
Disadvantages
• Long design time (difficult to “shrink” to new process)
• More difficult testing
• DC offsets can result in large MSE (discussed later).
Orthonormal Ladder Structure
[Circuit: orthonormal ladder — an integrator chain with cross-coupled coefficients ±α1, ±α2, ±α3, states x_1(t) … x_4(t), and input scaling β_in applied to u(t)]
Analog Adaptive Filters
• Usually digital control is desired — one can switch in capacitor and/or transconductance values
• Overlap of the digital control ranges is better than missed values
[Diagrams: overlapping coefficient ranges give a hysteresis effect (better); gaps between ranges cause potentially large coefficient jitter (worse)]
[Block diagram: analog LMS update — the product of state x_i and error e(k) is integrated to form coefficient w_i(k); m_i and m_e denote dc offsets at the multiplier inputs]
Analog Adaptive Filters — DC Offsets
• It is sufficient to zero the offsets in either the error or the state signals (easier with the error, since there is only one error signal)
• To overcome the integrator's own offset, a high gain is needed on the error signal
• Use median-offset cancellation — slice the error signal and force the median of the output to zero
• For most error signals, the mean equals the median
[Block diagram: the error plus offset drives a comparator; an up/down counter and D/A feed back to subtract the offset, leaving an offset-free error]
[Plot: SE-LMS residual error versus step size µ, for µ from 10⁻⁶ to 10⁻¹]
Coax Cable Equalizer
• An analog adaptive filter is used to equalize up to 300m of coax cable
• It is a cascade of two third-order filters with a single tuning control
[Block diagram: three parallel highpass branches w_k · s/(s + p_k), k = 1, 2, 3, summed with weight α between input and output]
• The variable α is tuned to account for cable length
Analog Adaptive Equalization Simulation
[Block diagram: {1, 0} data is precoded to {±1, 0} and passes through the 1-D channel; noise is added to give r(t); the analog equalizer output y(t) feeds a PR4 detector producing â(k). The error e(k) is formed against a PR4 generator driven by the known data a(k) (switch S1, for training) or against the detector decisions (switch S2, for tracking)]
[Plot: equalizer output y(t) versus time (0–160 ns)]
Equalizer Simulation — Decision Directed
• Switch S2 closed (S1 open); all poles fixed and 5 zeros adapted using:

e(k) = 1 − y(t)   if y(t) > 0.5
e(k) = 0 − y(t)   if −0.5 ≤ y(t) ≤ 0.5
e(k) = −1 − y(t)   if y(t) < −0.5
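The three-level decision-directed error above is directly codable. A minimal sketch:

```python
def slicer_error(y):
    """Three-level decision-directed error for targets {-1, 0, +1}."""
    if y > 0.5:
        return 1.0 - y     # decision +1
    if y < -0.5:
        return -1.0 - y    # decision -1
    return 0.0 - y         # decision 0

# e.g. slicer_error(0.3) -> -0.3 (decision 0)
```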
[Plots: equalizer output versus continuous time (ns) and versus discrete time k; gain responses versus normalized radian frequency for the channel, equalizer, overall system, and the ideal PR4 target]
Equalizer Simulation — Decision Directed
• Channel changed to a 7MHz Bessel filter
• Keep the same fixed poles (i.e. non-optimum pole placement) and adapt the 5 zeros.
[Plots: equalizer output y(t) versus time; gain responses versus normalized radian frequency for the channel, equalizer, overall system, and the PR4 target]
[Plots: equalizer output y(t) versus time; gain responses versus normalized radian frequency for the channel, equalizer, overall system, and the PR4 target]
BiCMOS Analog Adaptive Filter Example
• Demonstrates a method for adaptively tuning the pole frequency and Q-factor of a 100MHz analog filter
• The application is a pulse-shaping filter for data transmission.
• One of the fastest reported integrated adaptive filters — a Gm-C filter in a 0.8µm BiCMOS process
• Makes use of a MOS input stage and a translinear multiplier for tuning
• Large tuning range (approx. 10:1)
• All analog components integrated (digital left off-chip)
BiCMOS Transconductor
[Circuit: BiCMOS transconductor — MOS differential input stage M1–M8 over a bipolar core Q1–Q4 with control voltages V_C1, V_C2 and V_BIAS, forming a tunable G_m. Two styles implemented: “2-quadrant” tuning (F-Cell), and “4-quadrant” tuning by cross-coupling the top input stage (Q-Cell)]
Biquad Filter
[Circuit: fully-differential Gm-C biquad — transconductors G_mi, G_m21, G_m22, G_m12 and damping stage G_mb, with 2C integrating capacitors and states X2, X1; input U]
Adaptive Pulse Shaping Algorithm
[Plot: adaptive pulse-shaping waveforms — ideal input pulse (not to scale), lowpass output, and bandpass output; sampling instants spaced ∆ = 2.5ns around the nominal zero-crossing]
• Fo control: sample the output pulse shape at the nominal zero-crossing and decide if it is early or late (cutoff frequency too high or too low, respectively)
• Q control: sample the bandpass output at the lowpass nominal zero-crossing and decide if the peak is too high or too low (Q too large or too small)
Experimental Setup
[Block diagram: a data generator drives the pulse-shaping filter chip (Gm-C biquad filter); the LP and BP outputs are compared against V_ref by clocked comparators; off-chip tuning logic with up/down counters and DACs closes the loops controlling f_o and Q]
Pulse Shaper Responses
[Oscilloscope plots: pulse-shaper convergence from two initial conditions — left: initial low-freq. high-Q; right: initial low-freq. low-Q]
Summary
• Adaptive filters are relatively common
• LMS is the most widely used algorithm
• Adaptive linear combiners are almost always used
• Use combiners that do not have poor performance surfaces
• The most common digital combiner is the tapped-delay-line FIR
Digital adaptive:
• more robust and well suited to programmable filtering
Analog adaptive:
• best suited to high-speed, low-dynamic-range applications
• less power
• very good at realizing arbitrary coefficients with frequency-only change
• be aware of dc offset effects