Numerical Analysis of Electronic Circuits
Amit Mehrotra
September 27, 2002
Contents

1 Formulation of Circuit Equations: Modified Nodal Analysis
  1.1 Conventions
  1.2 Fundamental Laws
  1.3 Widely Used Circuit Elements
  1.4 Modified Nodal Analysis
  1.5 MNA Stamps of Common Devices

2 Solution of Linear Systems

3 Krylov Subspace Methods

4 Solution of Nonlinear Equations

5 Transient Analysis

6 Small-Signal Analysis of Circuits

7 Steady-State Analysis of Circuits
Chapter 1

Formulation of Circuit Equations: Modified Nodal Analysis

1.1 Conventions

Current is always treated as positive going from the positive node to the negative node of an element, as shown below.

[Figure: a two-terminal element connected between nodes e+ and e-.]

Currents are denoted by i, branch voltages are denoted by v and node voltages are denoted by e. For the above element

v = e+ - e-

Also, at a given node, currents going out of the node are considered positive.
1.2 Fundamental Laws

Kirchhoff's Current Law (KCL) states that the algebraic sum of all currents leaving a node is zero.

[Figure: example circuit with nodes 1-4 and ground; branches 1-8 are R1, the VCCS G2 v3, R3, R4, the current source IS5, the voltage source ES6, the VCVS E7 v3, and R8.]

For the example circuit, KCL at the four nodes can be written as

\[
\begin{bmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & -1 & 1 & -1 & -1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{bmatrix}
\begin{bmatrix} i_1\\ i_2\\ i_3\\ i_4\\ i_5\\ i_6\\ i_7\\ i_8 \end{bmatrix} = 0
\]

or

A i = 0

where A is the (reduced) nodal incidence matrix of the circuit and i the vector of branch currents. Kirchhoff's Voltage Law (KVL) states that every branch voltage is the difference of the node voltages at its terminals. For the same circuit, KVL can be written as

\[
\begin{bmatrix} v_1\\ v_2\\ v_3\\ v_4\\ v_5\\ v_6\\ v_7\\ v_8 \end{bmatrix}
-
\begin{bmatrix}
1 & 0 & 0 & 0\\
1 & 0 & 0 & 0\\
1 & -1 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & -1 & 0 & 0\\
0 & -1 & 1 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 1 & -1
\end{bmatrix}
\begin{bmatrix} e_1\\ e_2\\ e_3\\ e_4 \end{bmatrix} = 0
\]

or

v - Aᵀ e = 0
1.3 Widely Used Circuit Elements

We now describe branch equations of widely used linear elements. Nonlinear elements will be considered later.

Resistors are characterized by an algebraic relationship between their current and voltage. In general, the relationship can be nonlinear:

i_R = i(v_R)

Linear resistors have a linear relationship between voltage and current:

i_R = G v_R = v_R / R

Capacitors are characterized by an algebraic relationship between their charge and voltage:

q_C = q(v_C),   i_C = dq_C/dt = dq(v_C)/dt

Inductors are characterized by an algebraic relationship between their flux and current:

φ_L = φ(i_L),   v_L = dφ_L/dt = dφ(i_L)/dt

Independent Voltage Sources are ideal elements which maintain a given voltage v(t) across their terminals and deliver any amount of current.

Independent Current Sources are ideal elements which generate a given current i(t) independent of the voltage across them.

Voltage Controlled Current Sources are elements of the type i = g v_c, where v_c is the voltage across a controlling branch; Voltage Controlled Voltage Sources produce v = e v_c; Current Controlled Current Sources produce i = f i_c, where i_c is the current through a controlling branch.

The element connected between the controlling nodes can be arbitrary. However, for ease of implementation, many circuit simulators require that for current controlled sources the element between the controlling nodes be a voltage source or an inductor.

Current Controlled Voltage Sources are elements of the type v = h i_c.
1.4 Modified Nodal Analysis

This is a compact and efficient way of generating the circuit matrix from the circuit description. It is best illustrated with an example.

[Figure: example circuit with nodes 1-4 and ground; elements R1, G2 v3, R3, R4, IS5, ES6, E7 v3 and R8, as in Section 1.2.]

The modified nodal analysis proceeds as follows:

1. Write KCL:

i1 + i2 + i3 = 0
-i3 + i4 - i5 - i6 = 0
i6 + i8 = 0
i7 - i8 = 0

2. Use the branch equations to eliminate as many branch currents as possible:

v1/R1 + G2 v3 + v3/R3 = 0
-v3/R3 + v4/R4 - i6 = IS5
i6 + v8/R8 = 0
i7 - v8/R8 = 0

3. Write down the unused branch equations:

v6 = ES6
v7 - E7 v3 = 0

4. Use KVL to eliminate branch voltages:

e1/R1 + G2 (e1 - e2) + (e1 - e2)/R3 = 0
-(e1 - e2)/R3 + e2/R4 - i6 = IS5
i6 + (e3 - e4)/R8 = 0
i7 - (e3 - e4)/R8 = 0
e3 - e2 = ES6
e4 - E7 (e1 - e2) = 0

In matrix form

\[
\begin{bmatrix}
\frac{1}{R_1}+G_2+\frac{1}{R_3} & -G_2-\frac{1}{R_3} & 0 & 0 & 0 & 0\\
-\frac{1}{R_3} & \frac{1}{R_3}+\frac{1}{R_4} & 0 & 0 & -1 & 0\\
0 & 0 & \frac{1}{R_8} & -\frac{1}{R_8} & 1 & 0\\
0 & 0 & -\frac{1}{R_8} & \frac{1}{R_8} & 0 & 1\\
0 & -1 & 1 & 0 & 0 & 0\\
-E_7 & E_7 & 0 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix} e_1\\ e_2\\ e_3\\ e_4\\ i_6\\ i_7 \end{bmatrix}
=
\begin{bmatrix} 0\\ IS_5\\ 0\\ 0\\ ES_6\\ 0 \end{bmatrix}
\]

In general, the MNA system has the block structure

\[
\begin{bmatrix} Y_n & B\\ C & 0 \end{bmatrix}
\begin{bmatrix} e\\ i \end{bmatrix} = S
\]
1.5 MNA Stamps of Common Devices

Each device adds entries to the MNA matrix; these entries are called the stamp of the device. The MNA stamps of some common devices are shown below.

Resistor: let a resistor of resistance R be connected between nodes i and j. Then

KCL at node i: (other currents) + (e_i - e_j)/R = 0
KCL at node j: (other currents) - (e_i - e_j)/R = 0

so the stamp, in rows i, j and columns i, j, is

\[
\begin{bmatrix} 1/R & -1/R\\ -1/R & 1/R \end{bmatrix}
\]

Voltage controlled current source of value G between nodes i and j, with controlling voltage the potential difference of nodes k and l: rows i, j and columns k, l receive

\[
\begin{bmatrix} G & -G\\ -G & G \end{bmatrix}
\]

Independent current source of value IS driving current from node i to node j: rows i and j of the right-hand side vector receive -IS and +IS, respectively.

Independent voltage source of value V between nodes i and j at branch k: rows i and j receive +1 and -1 in column k (the branch current); row k enforces e_i - e_j = V, i.e., entries 1 and -1 in columns i and j, with V in row k of the right-hand side.

Voltage controlled voltage source of value E between nodes i and j at branch k, with controlling voltage the potential difference of nodes l and m: rows i and j receive +1 and -1 in column k; row k enforces e_i - e_j - E(e_l - e_m) = 0, i.e., entries 1, -1, -E, E in columns i, j, l, m, respectively.

Current controlled current source of value F between nodes i and j, with controlling current at branch k: rows i and j receive F and -F in column k.

Current controlled voltage source of value H between nodes i and j at branch k, with controlling current at branch l: rows i and j receive +1 and -1 in column k; row k enforces e_i - e_j - H i_l = 0, i.e., entries 1, -1 in columns i, j and -H in column l.
The MNA stamps of nonlinear devices are somewhat complicated and will be addressed later.
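To make the stamping procedure concrete, here is a minimal sketch in Python (illustrative only: dense numpy matrix, node index 0 is ground, and the function names are invented for this example, not taken from any simulator). It stamps a resistor and an independent current source and solves a two-node divider:

    import numpy as np

    def stamp_resistor(Y, i, j, R):
        """Add the conductance stamp of a resistor R between nodes i and j.
        Node 0 stands for ground and is not stamped."""
        G = 1.0 / R
        if i: Y[i-1, i-1] += G
        if j: Y[j-1, j-1] += G
        if i and j:
            Y[i-1, j-1] -= G
            Y[j-1, i-1] -= G

    def stamp_current_source(rhs, i, j, Is):
        """Stamp an independent current source driving Is from node i to node j."""
        if i: rhs[i-1] -= Is
        if j: rhs[j-1] += Is

    # Example: 1 A injected into node 1, 1 ohm from node 1 to 2, 2 ohm to ground.
    n = 2
    Y = np.zeros((n, n)); rhs = np.zeros(n)
    stamp_resistor(Y, 1, 2, 1.0)
    stamp_resistor(Y, 2, 0, 2.0)
    stamp_current_source(rhs, 0, 1, 1.0)
    print(np.linalg.solve(Y, rhs))   # node voltages [3. 2.]

Controlled sources and voltage sources extend the same idea with extra rows/columns for branch currents, exactly as in the stamps listed above.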
Chapter 2

Solution of Linear Systems

2.1 Gaussian Elimination
Consider solving Ax = b. From the first equation,

x_1 = (b_1 - Σ_{i=2}^n a_{1i} x_i)/a_{11}

Substituting this into the remaining equations eliminates x_1 from them. In matrix form

\[
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & \dots & a_{1n}\\
0 & a_{22}-\frac{a_{21}}{a_{11}}a_{12} & a_{23}-\frac{a_{21}}{a_{11}}a_{13} & \dots & a_{2n}-\frac{a_{21}}{a_{11}}a_{1n}\\
\vdots & \vdots & \vdots & & \vdots\\
0 & a_{n2}-\frac{a_{n1}}{a_{11}}a_{12} & a_{n3}-\frac{a_{n1}}{a_{11}}a_{13} & \dots & a_{nn}-\frac{a_{n1}}{a_{11}}a_{1n}
\end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1\\ b_2-\frac{a_{21}}{a_{11}}b_1\\ \vdots\\ b_n-\frac{a_{n1}}{a_{11}}b_1 \end{bmatrix}
\]

or

\[
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & \dots & a_{1n}\\
0 & a^{(2)}_{22} & a^{(2)}_{23} & \dots & a^{(2)}_{2n}\\
\vdots & \vdots & \vdots & & \vdots\\
0 & a^{(2)}_{n2} & a^{(2)}_{n3} & \dots & a^{(2)}_{nn}
\end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1\\ b^{(2)}_2\\ \vdots\\ b^{(2)}_n \end{bmatrix}
\]
where

a^{(2)}_{ij} = a^{(1)}_{ij} - l^{(1)}_{i1} a^{(1)}_{1j},   b^{(2)}_i = b^{(1)}_i - l^{(1)}_{i1} b^{(1)}_1,   l^{(1)}_{i1} = a^{(1)}_{i1}/a^{(1)}_{11}

In matrix form we can write this as A^{(2)} x = b^{(2)}. The new coefficient matrix A^{(2)} is related to the original matrix as follows:

\[
A = L_1 A^{(2)},\qquad
L_1 = \begin{bmatrix}
1 & 0 & 0 & \dots & 0\\
l^{(1)}_{21} & 1 & 0 & \dots & 0\\
l^{(1)}_{31} & 0 & 1 & & \vdots\\
\vdots & & & \ddots & \\
l^{(1)}_{n1} & 0 & \dots & 0 & 1
\end{bmatrix}
\]

Also

b^{(2)} = L_1^{-1} b

Note that

L_1^{-1} = I - l^{(1)} e_1ᵀ

where e_1 = [1, 0, ..., 0]ᵀ and l^{(1)} = [0, l^{(1)}_{21}, l^{(1)}_{31}, ..., l^{(1)}_{n1}]ᵀ.
At the kth step of Gaussian elimination, we multiply the kth row by l^{(k)}_{ik} and subtract it from the ith row (i > k), where

l^{(k)}_{ik} = a^{(k)}_{ik}/a^{(k)}_{kk},   a^{(k+1)}_{ij} = a^{(k)}_{ij} - l^{(k)}_{ik} a^{(k)}_{kj}

In matrix form, A^{(k+1)} = L_k^{-1} A^{(k)}, where

L_k^{-1} = I - l^{(k)} e_kᵀ

e_k is the kth unit vector and

l^{(k)} = [0, ..., 0, l^{(k)}_{k+1,k}, ..., l^{(k)}_{nk}]ᵀ
After n - 1 steps

a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n = b_1
a^{(2)}_{22} x_2 + ... + a^{(2)}_{2n} x_n = b^{(2)}_2
...
a^{(n)}_{nn} x_n = b^{(n)}_n

or in matrix form

U x = [b_1, b^{(2)}_2, ..., b^{(n)}_n]ᵀ = b̃

Note that

U = L_n^{-1} L_{n-1}^{-1} ... L_1^{-1} A

Further,

L_k = (I - l^{(k)} e_kᵀ)^{-1} = I + l^{(k)} e_kᵀ

and e_iᵀ l^{(j)} = 0 for i < j. Therefore

L_1 L_2 ... L_n = L = lower triangular matrix

and

A = LU
Back substitution: the system U x = b̃ is solved as follows

x_n = b̃_n / u_{nn}
x_{n-1} = (b̃_{n-1} - u_{n-1,n} x_n) / u_{n-1,n-1}
...
x_i = (b̃_i - Σ_{j=i+1}^n u_{i,j} x_j) / u_{i,i}
2.1.1 LU Decomposition

To solve Ax = b:

- factor A = LU
- let y = Ux; then Ly = b; solve for y (forward substitution)
- solve Ux = y for x (backward substitution)

Writing A = LU entrywise (with l_{ii} = 1),

a_{ij} = Σ_{k=1}^{i} l_{ik} u_{kj} if i ≤ j,   a_{ij} = Σ_{k=1}^{j} l_{ik} u_{kj} if i > j

which can be solved for the factors:

u_{ij} = a_{ij} - Σ_{k=1}^{i-1} l_{ik} u_{kj}   if i ≤ j

l_{ij} = (a_{ij} - Σ_{k=1}^{j-1} l_{ik} u_{kj}) / u_{jj}   if i > j
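A minimal Python sketch of the complete solve (LU factorization plus the forward and back substitutions above). Partial pivoting, discussed in Section 2.2, is included, since the factorization is otherwise unstable for matrices like the small-pivot example used there. This is illustrative code, not a sparse production solver:

    import numpy as np

    def lu_factor(A):
        """LU factorization with partial pivoting: PA = LU.
        L (unit lower triangular) is stored below the diagonal of the
        returned matrix, U on and above it; piv records the row order."""
        A = A.astype(float).copy()
        n = A.shape[0]
        piv = np.arange(n)
        for k in range(n - 1):
            p = k + np.argmax(np.abs(A[k:, k]))      # partial pivoting
            if p != k:
                A[[k, p]] = A[[p, k]]
                piv[[k, p]] = piv[[p, k]]
            A[k+1:, k] /= A[k, k]                    # multipliers l_ik
            A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
        return A, piv

    def lu_solve(LU, piv, b):
        """Forward substitution Ly = Pb, then back substitution Ux = y."""
        n = LU.shape[0]
        y = b[piv].astype(float)
        for i in range(1, n):
            y[i] -= LU[i, :i] @ y[:i]
        x = y
        for i in range(n - 1, -1, -1):
            x[i] = (x[i] - LU[i, i+1:] @ x[i+1:]) / LU[i, i]
        return x

    A = np.array([[1.25e-4, 1.25], [12.5, 12.5]])
    b = np.array([6.25, 75.0])
    LU, piv = lu_factor(A)
    print(lu_solve(LU, piv, b))    # [1. 5.], stable thanks to pivoting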
2.2 Pivoting

Even when A is nonsingular, Gaussian elimination can fail because a zero appears in the pivot position. MNA matrices contain structural zeros on the diagonal (for example, the rows and columns contributed by voltage sources contain only ±1 and 1/R entries), and after a few elimination steps the pivot entry can become zero, so that a multiplier such as

l^{(4)}_{i4} = a^{(4)}_{i4} / a^{(4)}_{44}

is undefined because a^{(4)}_{44} = 0.

Solution: interchange rows and/or columns to bring a nonzero element into position (k, k).

This would be a problem even if an exact-arithmetic computer were used, because it is caused by the structure of the matrix. Most of the time, however, problems occur due to the finite precision of the computer. Consider the following example:

\[
\begin{bmatrix} 1.25\times 10^{-4} & 1.25\\ 12.5 & 12.5 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2 \end{bmatrix}
=
\begin{bmatrix} 6.25\\ 75 \end{bmatrix}
\]

Assume finite arithmetic with 3-digit floating point. After the first step of GE

\[
\begin{bmatrix} 1.25\times 10^{-4} & 1.25\\ 0 & -1.25\times 10^{5} \end{bmatrix}
\begin{bmatrix} x_1\\ x_2 \end{bmatrix}
=
\begin{bmatrix} 6.25\\ -6.25\times 10^{5} \end{bmatrix}
\]

Back substitution gives x_2 = 5 and then, in 3-digit arithmetic, x_1 = 0, whereas the true solution (to three digits) is x_1 = 1, x_2 = 5. If the rows are interchanged first,

\[
\begin{bmatrix} 12.5 & 12.5\\ 1.25\times 10^{-4} & 1.25 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2 \end{bmatrix}
=
\begin{bmatrix} 75\\ 6.25 \end{bmatrix}
\]

the same 3-digit elimination and back substitution yield x_2 = 5, x_1 = 1.
2.2.1 Pivoting Strategies

1. Partial pivoting (row interchange only): choose l as the smallest integer such that

|a^{(k)}_{lk}| = max_{j=k,...,n} |a^{(k)}_{jk}|

and interchange rows l and k.
2.3 Error Mechanism

The Gaussian elimination algorithm works well if exact arithmetic with pivoting is used. However, in finite precision arithmetic, errors may occur even with pivoting. The two main reasons for this are

- a numerically singular matrix
- the numerical stability of the method

Consider the following example:

x - y = 0
x + y = 2

The solution x = 1, y = 1 can be computed accurately. Now consider the following system:

x - y = 0
x - 1.01y = 0.01

The two equations are nearly parallel; this system is called ill-conditioned and its solution cannot be computed very accurately. We need a way to detect this.
2.3.1 Detecting Ill-Conditioning

Define the vector norms

L1: ||x||₁ = Σ_{i=1}^n |x_i|
L2: ||x||₂ = (Σ_{i=1}^n |x_i|²)^{1/2}
L∞: ||x||∞ = max_i |x_i|

and the induced matrix norm

||A|| = max_{x≠0} ||Ax|| / ||x||

From the above definition, various matrix norms can be computed as:

L1: ||A||₁ = max_j Σ_{i=1}^n |a_{ij}|
L2: ||A||₂ = sqrt(largest eigenvalue of AᵀA)
L∞: ||A||∞ = max_i Σ_{j=1}^n |a_{ij}|

When b is perturbed, b → b + δb, the solution is perturbed, x → x + δx, with A δx = δb. Since ||b|| ≤ ||A|| ||x|| and ||δx|| ≤ ||A⁻¹|| ||δb||,

||δx||/||x|| ≤ ||A|| ||A⁻¹|| ||δb||/||b|| = κ(A) ||δb||/||b||

When A is perturbed, A → A + δA,

(A + δA)(x + δx) = b   ⇒   A δx + δA (x + δx) = 0

so ||δx|| ≤ ||A⁻¹|| ||δA|| ||x + δx|| and

||δx||/||x + δx|| ≤ ||A|| ||A⁻¹|| ||δA||/||A|| = κ(A) ||δA||/||A||

κ(A) = ||A|| ||A⁻¹|| is called the condition number of A. Large κ(A) implies that A is close to being singular, i.e., it is ill-conditioned.
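The condition number is readily computed numerically; a short sketch (Python/numpy, illustrative) applied to the two 2x2 systems above:

    import numpy as np

    A_good = np.array([[1.0, -1.0], [1.0, 1.0]])     # x - y = 0, x + y = 2
    A_bad  = np.array([[1.0, -1.0], [1.0, -1.01]])   # x - y = 0, x - 1.01y = 0.01
    for A in (A_good, A_bad):
        print(np.linalg.cond(A, 2))   # kappa(A) = ||A|| * ||A^-1|| in the L2 norm

The well-conditioned system has κ(A) = 1, the ill-conditioned one κ(A) ≈ 400: a relative perturbation of the data can be amplified by roughly that factor in the solution.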
2.3.2 Effect of Finite Precision

Let u denote the unit roundoff of a number in a computer: e.g., on Linux, u = 10⁻¹⁶ for double precision or 10⁻³² for long double precision. Gaussian elimination in finite precision computes the exact solution of a nearby problem (A + E)x̂ = b. Define the growth factor

ρ = max_{i,j,k} |a^{(k)}_{ij}| / ||A||

It follows that

||E|| ≤ 8 n³ ρ ||A|| u + O(u²)

For partial pivoting ρ ≤ 2^{n-1} and for complete pivoting ρ ≤ 1.8 n^{(ln n)/4}. However, typically ρ < 10.
2.3.3 Scaling and Equilibration

Rescale the unknowns, x̃ = D₁⁻¹ x, and equilibrate the equations:

D₂ A D₁ x̃ = D₂ b

D₁ and D₂ can be chosen such that the condition number of D₂AD₁ is much smaller compared to the condition number of A. In circuit applications, one can rescale unknowns to reflect the difference in units, e.g. volts and microamperes. Choose D₂ such that, in each row i,

max_j |ã_{ij}| ≈ 1

Scaling and equilibration only affect the choice of pivot elements in a pivoting scheme.
2.4 Sparse Matrices

Typically in a circuit, the number of elements connected to a node is limited to 5 or 6. Therefore, for a large circuit, the number of zero elements in the circuit matrix is very large. Such matrices are called sparse. The computational and storage cost for sparse matrices can be reduced by the following optimizations:

- avoid storing zeros: use a data structure of linked lists or pointers
- avoid trivial operations: 0 · x = 0, 0 + x = x
- avoid losing sparsity: minimize fill-ins

2.4.1 Fill-in and Ordering

When LU factorization is applied to a sparse matrix, each elimination step can create new nonzeros (fill-ins), and the amount of fill-in depends strongly on the pivot order. A common heuristic (the Markowitz criterion) selects at step k the pivot minimizing (r_i - 1)(c_j - 1), where r_i and c_j are the numbers of nonzero entries in row i and column j of the remaining submatrix.
Chapter 3

Krylov Subspace Methods

3.1 Krylov Subspace
Definition: Given a matrix A and a vector v, the ith order Krylov subspace is defined as

K_i(v, A) = span{v, Av, A²v, ..., A^{i-1}v}

Obviously i cannot be made arbitrarily large. If the rank of the matrix A is n then i ≤ n. More precisely, i is the order of the annihilating polynomial for the matrix A and vector v.

Definition: The annihilating polynomial is the polynomial

p(x) = xⁱ + a_{i-1}x^{i-1} + ... + a₁x + a₀

of minimum degree i such that

p(A)v = 0

It can be shown that the annihilating polynomial is unique for a given A and v.

A generic Krylov subspace algorithm for solving Ax = b can be described as follows:

1. Guess a solution x⁽⁰⁾, let r⁽⁰⁾ = b - Ax⁽⁰⁾, and set i = 0.
2. While ||r⁽ⁱ⁾|| > ε, where ε is some predefined error tolerance:
   (a) i ← i + 1
   (b) generate K_i(r⁽⁰⁾, A)
   (c) generate x⁽ⁱ⁾ ∈ x⁽⁰⁾ + K_i(r⁽⁰⁾, A) such that ||r⁽ⁱ⁾|| is minimized in some sense.
Krylov subspace methods differ from each other in (a) how they generate the Krylov subspace in step 2b and (b) how they minimize the residue in step 2c. As will be seen in the following section, generation of the Krylov subspace involves only matrix-vector products. Therefore Krylov subspace methods can be easily applied to situations where the matrix may not be directly available and generating, storing and multiplying with that matrix involves significant overhead. For instance, in the harmonic balance method, the coefficient matrix is available as a sequence of transforms which can be efficiently applied to a vector. However, the coefficient matrix generated from the product of those transforms is dense, which involves significant overhead in storage and multiplication; factoring it is obviously out of the question.

3.2 Generating the Krylov Subspace

We now describe some methods for generating the set of vectors which span the Krylov subspace K_i(v, A). The obvious way of doing this is to successively generate the vectors Aⁱv. This is numerically a bad way of generating the Krylov subspace. To understand why, let A be diagonalized as

A = W Λ W⁻¹

Then

Aⁱ = W Λⁱ W⁻¹

On a finite precision computer, as i increases, Aⁱ only retains information about the dominant eigenvalues of A and all other eigenvalues disappear. In other words, as the dimension of K_i(v, A) is increased, the new basis vector, which was supposed to increase the dimension of this subspace by 1, has a component in the new dimension which is numerically insignificant compared to the components which point in the previously generated dimensions. Therefore, numerically, the dimension of the subspace has not increased.
3.2.1 Arnoldi Process

One obvious way of circumventing this problem is to remove the components in the previously generated directions and renormalize the resulting vector. At the ith step, let

K_i(v, A) = span{b₀, b₁, ..., b_{i-1}}

with b_jᵀ b_k = 0 for j ≠ k. To increase the dimension by 1, form

d_i = A b_{i-1}

orthogonalize it against b₀, b₁, ..., b_{i-1},

c_i = d_i - Σ_{j=0}^{i-1} α_j b_j,   α_j = (b_jᵀ A b_{i-1}) / (b_jᵀ b_j)

and set b_i = c_i / ||c_i||. This orthogonalization procedure is the so-called modified Gram-Schmidt orthogonalization procedure. The obvious disadvantage of this method is that as i increases, the computational cost of each iteration increases. The method itself is O(n²).
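A sketch of the process in Python (illustrative only: dense matrices, no breakdown handling when ||c_i|| is tiny):

    import numpy as np

    def arnoldi(A, v, m):
        """Build an orthonormal basis of K_m(v, A) by the Arnoldi process
        with modified Gram-Schmidt; returns Q (n x m) with orthonormal columns."""
        n = v.size
        Q = np.zeros((n, m))
        Q[:, 0] = v / np.linalg.norm(v)
        for i in range(1, m):
            d = A @ Q[:, i-1]                  # new direction A b_{i-1}
            for j in range(i):                 # remove previous components one by one
                d -= (Q[:, j] @ d) * Q[:, j]
            Q[:, i] = d / np.linalg.norm(d)
        return Q

    A = np.diag([1.0, 2.0, 3.0, 4.0]) + 0.01 * np.ones((4, 4))
    Q = arnoldi(A, np.ones(4), 3)
    print(np.round(Q.T @ Q, 12))   # ~ identity: the basis is orthonormal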
3.2.2 Lanczos Process

Another method for generating the Krylov subspace is the Lanczos process, which was originally proposed for matrix tridiagonalization, used in eigenvalue computation. Consider the following set of equations:

A b₁ = b₁ r₁,₁ + b₂
A b₂ = b₁ r₁,₂ + b₂ r₂,₂ + b₃
A b₃ = b₁ r₁,₃ + b₂ r₂,₃ + b₃ r₃,₃ + b₄
...
A b_{k-1} = b₁ r₁,k-1 + b₂ r₂,k-1 + ... + b_{k-1} r_{k-1,k-1} + b_k
A b_k = b₁ r₁,k + b₂ r₂,k + ... + b_k r_{k,k}          (3.1)

for some r_{i,j}. Similarly we can write these equations for the c_i and A*:

A* c₁ = c₁ s₁,₁ + c₂
A* c₂ = c₁ s₁,₂ + c₂ s₂,₂ + c₃
...
A* c_{k-1} = c₁ s₁,k-1 + c₂ s₂,k-1 + ... + c_{k-1} s_{k-1,k-1} + c_k
A* c_k = c₁ s₁,k + c₂ s₂,k + ... + c_k s_{k,k}          (3.2)

Here k is the minimum of the degrees of the annihilating polynomials for (b₁, A) and (c₁, A*). It follows from the above equations that b₁, ..., b_i span the Krylov subspace K_i(b₁, A). We can rewrite (3.1) as AB = B(J + R), where B = [b₁ b₂ ... b_k], R is an upper triangular matrix and J = [e₂ e₃ ... e_k 0], i.e.,

\[
J = \begin{bmatrix}
0 & 0 & 0 & \dots & 0 & 0\\
1 & 0 & 0 & \dots & 0 & 0\\
0 & 1 & 0 & \dots & 0 & 0\\
 & & & \ddots & & \\
0 & 0 & 0 & \dots & 1 & 0
\end{bmatrix}
\]

Similarly we can form A*C = C(J + S), where C = [c₁ c₂ ... c_k] and S is another upper triangular matrix. We select r₁,₁ and s₁,₁ such that c₁*b₂ = 0 and b₁*c₂ = 0; r₁,₂, r₂,₂, s₁,₂ and s₂,₂ such that c₁*b₃ = c₂*b₃ = b₁*c₃ = b₂*c₃ = 0; and so on, such that C*B = D is diagonal.
Now we will show that the elements of R and S can be chosen such that C*B = D is a nonsingular diagonal matrix. Requiring c₁*b₂ = 0 and b₁*c₂ = 0 gives

c₁*(A b₁ - r₁,₁ b₁) = 0   or   r₁,₁ = c₁*A b₁ / c₁*b₁
b₁*(A* c₁ - s₁,₁ c₁) = 0   or   s₁,₁ = b₁*A* c₁ / b₁*c₁ = r̄₁,₁

Proceeding in the same way, using the biorthogonality conditions imposed so far (e.g. r₁,₂ = c₁*A b₂ / c₁*b₁ since c₁*b₂ = 0, r₂,₂ = c₂*A b₂ / c₂*b₂ since c₂*b₁ = 0, and so on for r₁,₃, r₂,₃, r₃,₃ and the corresponding s_{i,j}), we obtain in general

r_{i,j} = c_i*A b_j / c_i*b_i   and   s_{i,j} = b_i*A* c_j / b_i*c_i

Hence we have to choose c₁ and b₁ such that c_i*b_i ≠ 0. Consider, for i < j - 1,

r_{i,j} = c_i*A b_j / c_i*b_i = (s₁,ᵢ c₁ + s₂,ᵢ c₂ + ... + s_{i,i} c_i + c_{i+1})* b_j / c_i*b_i = 0

since all of c₁, ..., c_{i+1} are orthogonal to b_j for i < j - 1. Similarly s_{i,j} = 0 for i < j - 1. Hence R and S have nonzero entries along the diagonal and superdiagonal only. This also implies that the c_i are A-orthogonal to the b_j for i < j - 1 and the b_i are A*-orthogonal to the c_j for i < j - 1.

Consider now i = j - 1. Substituting the expansion of A*c_{j-1} as above, we obtain

r_{j-1,j} = c_{j-1}*A b_j / c_{j-1}*b_{j-1} = c_j*b_j / c_{j-1}*b_{j-1}

Similarly we can show that

s_{j-1,j} = b_j*c_j / b_{j-1}*c_{j-1} = r̄_{j-1,j}
This leads to the following algorithm. For i = 0, 1, ... do:

1. Compute c_i*b_i. If c_i*b_i = 0, then stop.
2. Set

b_{i+1} = A b_i - (c_i*A b_i / c_i*b_i) b_i - (c_i*b_i / c_{i-1}*b_{i-1}) b_{i-1}
c_{i+1} = A* c_i - (b_i*A* c_i / b_i*c_i) c_i - (b_i*c_i / b_{i-1}*c_{i-1}) c_{i-1}

This algorithm terminates if c_i = 0 or b_i = 0. It can be shown that b_i = 0 occurs iff K_i(b₁, A) is an A-invariant subspace, in which case the vectors b₁, ..., b_i form a basis of the subspace. This is the regular termination of the algorithm. However, the procedure breaks down if c_i ≠ 0 and b_i ≠ 0 but c_i*b_i = 0. In finite precision arithmetic exact cancellations are unlikely, but we can have c_i*b_i ≈ 0 with c_i and b_i far from zero; this causes numerical instability in subsequent iterations.
3.3 Conjugate Gradient

Conjugate gradient is a Krylov subspace method for solving Ax = b where A is symmetric (Hermitian if complex) and positive definite. It uses the Lanczos process for a Hermitian matrix and at each step minimizes the A⁻¹-norm of the residue, i.e., rᵀA⁻¹r. Since A is Hermitian and positive definite, rᵀA⁻¹r is a well-defined norm. Before proceeding, we first point out the relevant properties of the Lanczos process for Hermitian matrices. If A* = A, then:

1. in the Lanczos process, if one chooses c₁ = b₁, then c_i = b_i for all i;
2. since c_i = b_i and {b_i} and {c_i} are biorthogonal, it follows that the {b_i} are themselves orthogonal;
3. since c_i is A-orthogonal to b_j for i < j - 1, it follows that b_i is A-orthogonal to b_j for i < j - 1.

Now consider the choice of x⁽ⁱ⁾ ∈ x⁽⁰⁾ + K_i(r⁽⁰⁾, A), i.e.,

x⁽ⁱ⁾ = x⁽⁰⁾ + B_i y,   B_i = [b₁ b₂ ... b_i]

such that (r⁽ⁱ⁾)ᵀ A⁻¹ r⁽ⁱ⁾ is minimized. Consider

(r⁽ⁱ⁾)ᵀ A⁻¹ r⁽ⁱ⁾ = (b - Ax⁽ⁱ⁾)ᵀ A⁻¹ (b - Ax⁽ⁱ⁾)
 = (r⁽⁰⁾ - A B_i y)ᵀ A⁻¹ (r⁽⁰⁾ - A B_i y)
 = (r⁽⁰⁾)ᵀ A⁻¹ r⁽⁰⁾ - 2 (r⁽⁰⁾)ᵀ B_i y + yᵀ B_iᵀ A B_i y
Minimizing over y gives B_iᵀ A B_i y = B_iᵀ r⁽⁰⁾. Because the b_i are A-orthogonal except for adjacent pairs, B_iᵀ A B_i is tridiagonal and can be factored as B_iᵀ A B_i = L_i D_i L_iᵀ with

\[
L_i = \begin{bmatrix}
1 & & & \\
\mu_1 & 1 & & \\
 & \ddots & \ddots & \\
 & & \mu_{i-1} & 1
\end{bmatrix},\qquad
D_i = \mathrm{diag}(d_1, d_2, \dots, d_i)
\]

where μ_{k-1} = b_kᵀ b_k / d_{k-1}. Let

G_i = B_i L_i^{-T}   (i.e., G_i L_iᵀ = B_i)
p_i = D_i⁻¹ L_i⁻¹ B_iᵀ r⁽⁰⁾   (i.e., L_i D_i p_i = B_iᵀ r⁽⁰⁾)

Then

x⁽ⁱ⁾ = x⁽⁰⁾ + G_i p_i

Let G_i = [g₁ g₂ ... g_i]. From G_i L_iᵀ = B_i,

g₁ = b₁,   μ_{k-1} g_{k-1} + g_k = b_k

so each new search direction g_k is obtained from b_k with one vector update. Similarly, writing p_i = [π₁, ..., π_i]ᵀ and noting that L_{i-1} D_{i-1} p_{i-1} = B_{i-1}ᵀ r⁽⁰⁾, the first i - 1 components of p_i coincide with p_{i-1} and only the last one is new:

p_i = [p_{i-1}; π_i],   π_i = (b_iᵀ r⁽⁰⁾ - μ_{i-1} d_{i-1} π_{i-1}) / d_i

so the iterates can be updated as x⁽ⁱ⁾ = x⁽ⁱ⁻¹⁾ + π_i g_i.
The resulting algorithm is:

1. Guess x⁽⁰⁾, set r⁽⁰⁾ = b - Ax⁽⁰⁾, b₁ = r⁽⁰⁾, i = 1, and compute x⁽¹⁾.
2. While ||r⁽ⁱ⁾|| > ε:
   (a) i ← i + 1
   (b) increase the dimension of the Krylov subspace by setting

   b_i = A b_{i-1} - (b_{i-1}ᵀ A b_{i-1} / b_{i-1}ᵀ b_{i-1}) b_{i-1} - (b_{i-1}ᵀ b_{i-1} / b_{i-2}ᵀ b_{i-2}) b_{i-2}

   (c) compute μ_{i-1} = b_iᵀ b_i / d_{i-1} and update g_i, π_i and x⁽ⁱ⁾ = x⁽ⁱ⁻¹⁾ + π_i g_i.

It can be shown that the error decreases at least geometrically:

||e⁽ⁱ⁾|| ≤ ((√κ(A) - 1)/(√κ(A) + 1))ⁱ ||e⁽⁰⁾||
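In practice the Lanczos recurrence and the updates of g_i and π_i are folded into a few vector operations per iteration. A sketch (Python, illustrative; this is the usual algebraic rearrangement of conjugate gradient, not a line-by-line transcription of the derivation above):

    import numpy as np

    def conjugate_gradient(A, b, x0, tol=1e-10, maxit=200):
        """Conjugate gradient for symmetric positive definite A."""
        x = x0.astype(float).copy()
        r = b - A @ x                     # residual r^(0)
        p = r.copy()                      # first search direction
        rho = r @ r
        for _ in range(maxit):
            if np.sqrt(rho) < tol:
                break
            Ap = A @ p
            alpha = rho / (p @ Ap)        # step length along p
            x += alpha * p
            r -= alpha * Ap
            rho_new = r @ r
            p = r + (rho_new / rho) * p   # keep p A-orthogonal to the previous direction
            rho = rho_new
        return x

    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(conjugate_gradient(A, b, np.zeros(2)))   # [1/11, 7/11]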
3.4 MINRES

When the matrix is symmetric but indefinite, the Lanczos process can still be used to generate the Krylov subspace. However, xᵀAx is no longer a valid norm and therefore cannot be used for minimization. The MINRES method overcomes this limitation by minimizing the L2 norm of the residue. Eliminating the search direction g_i from the conjugate gradient equations we get

A r⁽ⁱ⁾ = r⁽ⁱ⁺¹⁾ t_{i+1,i} + r⁽ⁱ⁾ t_{i,i} + r⁽ⁱ⁻¹⁾ t_{i-1,i}

Recall that the above equation is indeed the Lanczos process for a symmetric matrix. This can be written in matrix form as

A R_i = R_{i+1} T̃_i

where R_i = [r⁽⁰⁾ r⁽¹⁾ ... r⁽ⁱ⁻¹⁾] and T̃_i is an (i + 1) × i tridiagonal matrix. Since xᵀAx is no longer a valid norm, we minimize the residue in the L2 norm. First choose

x⁽ⁱ⁾ ∈ x⁽⁰⁾ + K_i(r⁽⁰⁾, A) = x⁽⁰⁾ + span{r⁽⁰⁾, r⁽¹⁾, ..., r⁽ⁱ⁻¹⁾}

i.e., x⁽ⁱ⁾ = x⁽⁰⁾ + R_i y, such that ||Ax⁽ⁱ⁾ - b|| is minimized. Let

D_{i+1} = diag(||r⁽⁰⁾||, ||r⁽¹⁾||, ..., ||r⁽ⁱ⁾||)

Then R_{i+1} D_{i+1}⁻¹ is an orthonormal transformation with respect to the current Krylov subspace, and

||Ax⁽ⁱ⁾ - b|| = ||R_{i+1} T̃_i y - r⁽⁰⁾|| = ||D_{i+1} T̃_i y - ||r⁽⁰⁾|| e₁||

where e₁ is the first unit vector. This final expression is a minimum-norm least squares problem. The element in the (i + 1, i) position can be removed by a Givens rotation and the resulting bidiagonal system can be easily solved. This method is known as the Minimum Residual (MINRES) method.
3.5 GMRES

When the matrix is not symmetric, the Krylov subspace cannot be implicitly formed by the residues; it needs to be formed explicitly. One option is to use the Arnoldi process to form the Krylov subspace. One can use the exact same minimization procedure as above, i.e., choose

x⁽ⁱ⁾ ∈ x⁽⁰⁾ + K_i(r⁽⁰⁾, A) = x⁽⁰⁾ + span{r⁽⁰⁾, r⁽¹⁾, ..., r⁽ⁱ⁻¹⁾}

i.e., x⁽ⁱ⁾ = x⁽⁰⁾ + R_i y, such that ||Ax⁽ⁱ⁾ - b|| is minimized. Again

A R_i = R_{i+1} H̃_i

but H̃_i is an upper Hessenberg matrix instead of a simple tridiagonal matrix. Therefore the computational complexity of this algorithm is quadratic in the number of iterations. This method is known as the Generalized Minimum Residual (GMRES) method.
3.6 QMR

Instead of using the (expensive) Arnoldi process to generate the Krylov subspace, one can use the Lanczos process for this purpose. Using a least squares minimization of the same form yields the so-called Quasi-Minimum Residual (QMR) method. In order to prevent breakdown in the underlying Lanczos process, look-ahead Lanczos can be used instead.
3.7 Preconditioning

Recall that the convergence rate of a Krylov subspace method strongly depends on the condition number or spectral properties of the coefficient matrix. Therefore, one may speed up convergence by transforming the original system into another one which has the same solution but more favourable spectral properties or condition number. This process is called preconditioning. For instance, if the matrix M approximates the coefficient matrix in some way, then the transformed system

M⁻¹ A x = M⁻¹ b

has the same solution as the original system but the spectral properties of M⁻¹A may be more favourable. The successful use of Krylov subspace methods in most situations critically hinges on the ability to form an appropriate preconditioner. Obviously, solving Mx = y should be easy.
Chapter 4

Solution of Nonlinear Equations

4.1 Introduction
Recall that a linear system of equations Ax = b either has exactly one solution (when A is nonsingular) or an entire continuum of solutions (when A is rank deficient). This is not the case if the equations are nonlinear. Consider the following circuit: a voltage source E in series with a 1 Ω resistor driving a tunnel diode, whose current is related to the diode voltage by the nonlinear relation

i = f(v) = 17.76v - 103.79v² + 229.62v³ - 226.31v⁴ + 83.72v⁵

Let us investigate the solutions of this circuit when the source voltage E is increased from 0. The intersection point(s) of the diode characteristic i = f(v) and the load line i = (E - v)/1 are the solutions of the circuit.

[Figure: diode current f(v) together with load lines for several values of E.]

In the ranges 0 ≤ E < 0.6 and E > 1.15, the circuit has one solution. In the range 0.6 < E < 1.15, there are three distinct solutions. At E = 0.6 and E = 1.15, there are two solutions, but one of the solutions is degenerate or non-isolated. Consider the E = 0.6 load line and the solution close to v = 0.47. If E is increased slightly, two solutions appear, and if E is decreased slightly, both solutions disappear. Such points are known as bifurcation points, and systems with such devices can be chaotic.
4.2 Solution Methods

We wish to solve

f(x) = 0,   f: Rⁿ → Rⁿ

Iterative methods are used:

- start from an initial guess x⁽⁰⁾
- generate successive approximations x⁽¹⁾, x⁽²⁾, ... to the solution x* using an iteration function φ:

x⁽ⁱ⁺¹⁾ = φ(x⁽ⁱ⁾)

- stop when ||x⁽ⁱ⁺¹⁾ - x⁽ⁱ⁾|| ≤ ε₁ and/or ||f(x⁽ⁱ⁺¹⁾)|| ≤ ε₂.

If x̂ is a fixed point, i.e., x̂ = φ(x̂), if all fixed points are zeros of f, and if φ is continuous in the neighbourhood of each fixed point, then the limit point of the sequence x⁽ⁱ⁾ is a fixed point of φ and hence a zero of f.
4.2.1 Convergence of Fixed-Point Iterations

Theorem: If

||φ(x) - φ(y)|| ≤ γ ||x - y||   for all x, y ∈ D₀

for some 0 < γ < 1, then φ has a unique fixed point x* ∈ D₀ and the sequence x⁽ᵏ⁺¹⁾ = φ(x⁽ᵏ⁾) converges to x* for all x⁽⁰⁾ ∈ D₀.

Theorem: Let J(x) = ∂φ/∂x exist in a set

D₀ = {x : ||x - x*|| < r}

Then, if ||J(x)|| ≤ m < 1 for all x ∈ D₀, the iteration converges to x* for all x⁽⁰⁾ ∈ D₀.

Proof:

x* - x⁽ᵏ⁺¹⁾ = φ(x*) - φ(x⁽ᵏ⁾)   ⇒   ||x* - x⁽ᵏ⁺¹⁾|| ≤ m ||x* - x⁽ᵏ⁾||
4.3 Newton-Raphson Method

If x* is a zero of f(x): R → R and f is sufficiently differentiable in a neighbourhood N of x*, then from the Taylor series expansion about x⁽⁰⁾

f(x*) = 0 = f(x⁽⁰⁾) + (x* - x⁽⁰⁾) f′(x⁽⁰⁾) + ((x* - x⁽⁰⁾)²/2!) f″(x⁽⁰⁾) + ...

Keeping only the first-order terms,

0 ≈ f(x⁽⁰⁾) + (x* - x⁽⁰⁾) f′(x⁽⁰⁾)   ⇒   x* ≈ x⁽⁰⁾ - f(x⁽⁰⁾)/f′(x⁽⁰⁾)

Therefore the iteration function is

φ(x) = x - f(x)/f′(x)

[Figure: successive Newton iterates x⁽⁰⁾, x⁽¹⁾, x⁽²⁾ obtained by following tangents of f(x).]

In n dimensions, expanding f about the iterate x⁽ᵏ⁾,

f(x) ≈ f(x⁽ᵏ⁾) + ∂f/∂x|_{x⁽ᵏ⁾} (x - x⁽ᵏ⁾) + ...

Newton Iteration: defining the Jacobian matrix J with entries J_{ij} = ∂f_i/∂x_j, the iteration is

x⁽ᵏ⁺¹⁾ = x⁽ᵏ⁾ - J⁻¹(x⁽ᵏ⁾) f(x⁽ᵏ⁾)

In practice the inverse is not formed; one solves

J(x⁽ᵏ⁾) Δx⁽ᵏ⁺¹⁾ = -f(x⁽ᵏ⁾)

or

J(x⁽ᵏ⁾) x⁽ᵏ⁺¹⁾ = J(x⁽ᵏ⁾) x⁽ᵏ⁾ - f(x⁽ᵏ⁾)
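As an illustration, here is Newton-Raphson applied to the tunnel-diode circuit of Section 4.1 (1 Ω source resistance, so the scalar circuit equation is f(v) + (v - E)/1 = 0; Python sketch, tolerance chosen arbitrarily):

    def fd(v):      # tunnel diode I-V characteristic from the example above
        return 17.76*v - 103.79*v**2 + 229.62*v**3 - 226.31*v**4 + 83.72*v**5

    def fd_prime(v):
        return 17.76 - 207.58*v + 688.86*v**2 - 905.24*v**3 + 418.60*v**4

    def newton(E, v=0.0, tol=1e-12, maxit=100):
        """Solve g(v) = fd(v) + (v - E) = 0 by Newton-Raphson."""
        for _ in range(maxit):
            g  = fd(v) + (v - E)        # KCL at the diode node
            gp = fd_prime(v) + 1.0      # scalar Jacobian
            dv = -g / gp
            v += dv
            if abs(dv) < tol:
                return v
        raise RuntimeError("Newton did not converge")

    print(newton(E=0.2))    # the unique operating point for E = 0.2

In the multi-solution range 0.6 < E < 1.15, which of the operating points the iteration converges to depends on the initial guess.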
4.3.1 Convergence of Newton-Raphson

An iteration x⁽ᵏ⁾ is said to converge with order q if there exists a vector norm such that

||x⁽ᵏ⁺¹⁾ - x*|| ≤ C ||x⁽ᵏ⁾ - x*||^q

Theorem: If

- J(x) is Lipschitz continuous, and
- J(x*) is nonsingular,

then

1. x⁽ᵏ⁾ → x* provided x⁽⁰⁾ is sufficiently close to x*;
2. the convergence is quadratic:

||x⁽ᵏ⁺¹⁾ - x*|| ≤ C ||x⁽ᵏ⁾ - x*||²
Newton-Raphson is not a fool-proof method and can run into convergence problems. Consider a function for which NR keeps oscillating between two points without ever converging to the right solution; similarly, consider the case when the derivative is incorrectly computed. In all these cases, NR will not converge to the solution. These observations have some important implications:

1. Device model equations must be continuous with continuous derivatives.
2. Derivative calculations must be accurate.
3. Nodes must not be left floating (J singular).
4. Good initial guesses x⁽⁰⁾ must be provided.
5. Most model computations produce errors in function values and derivatives. We want a convergence criterion ||x⁽ᵏ⁺¹⁾ - x⁽ᵏ⁾|| < ε such that ε is larger than the model errors/numerical precision.
4.4 Newton-Raphson Applied to Circuits

Consider a diode with branch equation i = f(v). Linearizing around the iterate v⁽ᵏ⁾ gives

i ≈ f(v⁽ᵏ⁾) + G⁽ᵏ⁾(v - v⁽ᵏ⁾) = G⁽ᵏ⁾ v + I⁽ᵏ⁾,   G⁽ᵏ⁾ = df/dv|_{v⁽ᵏ⁾},  I⁽ᵏ⁾ = f(v⁽ᵏ⁾) - G⁽ᵏ⁾ v⁽ᵏ⁾

Therefore the diode can be replaced by a conductance of value G⁽ᵏ⁾ in parallel with a current source I⁽ᵏ⁾ at every iteration.

Exponential nonlinearities such as diodes and BJTs present a special challenge to NR. Consider a diode with characteristic f(v) = I_O (exp(v/V_T) - 1) driven by a source through a resistor R. If NR is started from 0, the very next iterate v⁽¹⁾ can be so large that the computed current overflows. In such cases, for better numerical behaviour, the current at the next iterate is first computed using the tangent to the I-V curve at v⁽⁰⁾; the corrected voltage v_lim is then calculated from the original I-V curve at that current, and the Newton step is limited such that the new voltage is v_lim.
4.5 Improving the Convergence of Newton's Method

4.5.1 The Newton Step as a Descent Direction

The Newton step is

Δx⁽ᵏ⁾ = -[J(x⁽ᵏ⁾)]⁻¹ f(x⁽ᵏ⁾)          (4.1)

where J(x⁽ᵏ⁾) is the Jacobian matrix. A reasonable strategy to use when deciding whether to accept the Newton step Δx⁽ᵏ⁾ is to require that the step decrease |f|² = fᵀf. Let

g = (1/2) fᵀf

Note that the Newton step is always a descent direction for g, since ∇gᵀΔx = (fᵀJ)(-J⁻¹f) = -fᵀf < 0.
4.5.2 Backtracking

Define h(λ) = g(x⁽ᵏ⁾ + λΔx⁽ᵏ⁾), so that

h′(λ) = ∇gᵀ Δx

If we need to backtrack, then we model h with the most current information we have and choose λ to minimize h(λ). We start with h(0) and h′(0) available. The first step is always the Newton step, λ = 1. If this step is not acceptable, we have h(1) available as well. We can therefore model h(λ) as a quadratic, whose minimizer is

λ = -h′(0) / (2[h(1) - h(0) - h′(0)])

If subsequent backtracks are required, we model h as a cubic in λ, using the previous value h(λ₁) and the second most recent value h(λ₂):

h(λ) = aλ³ + bλ² + h′(0)λ + h(0)

where a and b are chosen such that the above cubic gives the correct values of h at λ₁ and λ₂, i.e.,

\[
\begin{bmatrix} a\\ b \end{bmatrix}
= \frac{1}{\lambda_1 - \lambda_2}
\begin{bmatrix} 1/\lambda_1^2 & -1/\lambda_2^2\\ -\lambda_2/\lambda_1^2 & \lambda_1/\lambda_2^2 \end{bmatrix}
\begin{bmatrix} h(\lambda_1) - h'(0)\lambda_1 - h(0)\\ h(\lambda_2) - h'(0)\lambda_2 - h(0) \end{bmatrix}
\]

The minimum of this cubic is at

λ = (-b + sqrt(b² - 3a h′(0))) / (3a)

The new λ is restricted to lie between 0.1λ₁ and 0.5λ₁.
4.5.3 Line Search Along the Newton Direction

Alternatively, the step length can be constrained to λ_min ≤ λ ≤ λ_max for some bounds. The intuition for the upper limit is that if the slope of the function is small at the current iterate, Newton-Raphson will take a large step and we need to limit it more. We can choose λ in many ways. One possible method is the following: recall that g is minimized along the search direction when dg/dλ = 0, i.e.,

0 = dg/dλ = [dg/dx|_{x⁽ᵏ⁾+λΔx⁽ᵏ⁾}]ᵀ Δx⁽ᵏ⁾ = fᵀ(x⁽ᵏ⁾ + λΔx⁽ᵏ⁾) J(x⁽ᵏ⁾ + λΔx⁽ᵏ⁾) Δx⁽ᵏ⁾

A binary search algorithm can be used to estimate the value of λ. Note that this method requires the evaluation of the Jacobian at each binary search step and is therefore expensive compared to the previous method; however, no matrix inversion is required. In general the previous method is preferred, but if the line search takes many iterations, or Newton-Raphson and the line search start opposing each other, this line search can be invoked.
4.6 Continuation Methods

Sometimes even the above line search methods turn out to be insufficient. Continuation methods are used in such cases to obtain a solution.

4.6.1 Path Following

Consider the parameterized family of problems

f(x, λ) = 0,   f, x ∈ Rⁿ,  λ ∈ R          (4.2)

We can view the solution process of (4.1) as a particular case of the solution of (4.2) for a given λ.

Now consider the problem of obtaining the family of solutions x(λ) as λ is varied over a range, as shown in Figure 4.1. In many circuits, the solution manifold may fold around itself. Consider the point λ_SNB. For λ > λ_SNB there are no solutions and for λ < λ_SNB there are two solutions. Therefore there is no close neighbourhood around λ_SNB where a unique solution exists. Such points are called saddle node bifurcation (SNB) points.

A somewhat naive method of obtaining this manifold is to solve for a sequence of λ values in the given range; the set of solutions then describes the manifold. However, NR may not converge at many of the points and the computational complexity can be prohibitive. We will consider some alternatives below.

[Figures 4.1-4.3: the solution manifold folding at λ_SNB, and continuation along it.]

A better alternative is a predictor-corrector scheme which extrapolates from the previous solution point to predict the next one and corrects it with Newton iterations; the factors of ∂f/∂x are already available from the previous corrector solve. The advantage of this method over the previous one is that the step size is much larger and therefore the computational complexity is much smaller. However, this method fails to follow the manifold around λ_SNB because ∂f/∂x is singular at λ_SNB and is ill-conditioned around λ_SNB. There are two ways around this problem.
Reparameterization: Instead of fixing a value of λ, one component of x can be fixed, and λ solved for. Let x_i be the component that is fixed. Define

y = [x₁, ..., x_{i-1}, x_{i+1}, ..., x_n, λ]ᵀ

Then

df = 0 = (∂f/∂y) dy + (∂f/∂x_i) dx_i   ⇒   dy = -(∂f/∂y)⁻¹ (∂f/∂x_i) dx_i

It can be shown that this matrix is generally nonsingular. The only drawback is that the Jacobian ∂f/∂y is different from the Jacobian in NR and needs to be refactored. The method is shown in Figure 4.4.
Euler Homotopy (EH): This can be used to follow the folding of the manifold around λ_SNB. This method uses the unit tangent vector v as the predictor, but moves a specified distance δ along v rather than a specified distance along the λ axis (as in the previous methods). Let

z = [x; λ],   v = [v_x; v_λ]

The tangent satisfies (∂f/∂x) v_x + (∂f/∂λ) v_λ = 0. Rearranging,

v_x = -(∂f/∂x)⁻¹ (∂f/∂λ) v_λ

Therefore, to compute v, choose an arbitrary v_λ, compute v_x, and then normalize v such that vᵀv = 1. This does not work around λ_SNB, where the Jacobian is ill-conditioned; in such cases, arbitrarily choose v_{x_j} = 1 for some j and proceed.

Now, if the starting point is z_i = [x_i; λ_i], then the predictor point is

z_p = z_i + δv

The corrector uses the hyperplane passing through z_p and normal to v, given by

(z - z_p)ᵀ v = 0   ⇔   (z - z_i)ᵀ v = δ

The corrector finds the intersection of the solution manifold and this hyperplane:

f(z) = 0
(z - z_i)ᵀ v - δ = 0

This new set of n + 1 nonlinear equations (in n + 1 unknowns) can be solved using NR. The Jacobian for the above system is

\[
\begin{bmatrix} \partial f/\partial x & \partial f/\partial\lambda\\ v_x^T & v_\lambda \end{bmatrix}
\]

Around tight corners, the step size δ may need to be reduced. This approach does not solve directly for a final solution corresponding to a given λ, though it can be used for that purpose; it is mainly used for path following. The method is shown in Figure 4.5.
4.6.2 Direct Computation of the Saddle Node Bifurcation Point

With the continuation methods, one can determine the approximate location of the saddle node bifurcation point (x_SNB, λ_SNB). However, a more accurate determination is also possible. Recall that the saddle node bifurcation point has to satisfy the nonlinear equation, i.e.,

f(x_SNB, λ_SNB) = 0

Furthermore, the Jacobian of f with respect to x is singular at that point, i.e., there exists v ≠ 0 such that

J(x_SNB, λ_SNB) v = 0

where, as before, J = ∂f/∂x. If v satisfies the above equation then αv satisfies it for any α ≠ 0. Therefore, v ≠ 0 can be enforced by insisting that

vᵀv = 1

Together these form 2n + 1 equations in the 2n + 1 unknowns (x, λ, v), with Jacobian

\[
\begin{bmatrix}
\partial f/\partial x & \partial f/\partial\lambda & 0\\
(\partial^2 f/\partial x^2)\,v & (\partial^2 f/\partial x\,\partial\lambda)\,v & \partial f/\partial x\\
0 & 0 & 2v^T
\end{bmatrix}
\]

The obvious disadvantage of this formulation is that the device equations need to be twice differentiable and the device models need to supply the entries of the Hessian ∂²f/∂x². Note that ∂²f/∂x∂λ is still a matrix, since λ ∈ R. The approximate values of the bifurcation points obtained from the continuation curves can be used as initial guesses for the above system. The vector v is the unit tangent vector to the manifold at the saddle node bifurcation point.
4.6.3 Circuit Implementation

One can detect whether we are close to an SNB point by monitoring the condition number of the circuit Jacobian ∂f/∂x. When using EH, one might be tempted to view the Jacobian as a 2 × 2 block matrix and use block LU decomposition to factor it; this would save the explicit formation of an (n + 1) × (n + 1) sparse Jacobian. However, since EH is to be used close to the SNB point, where the Jacobian ∂f/∂x is ill-conditioned and v_λ ≈ 0, this natural block decomposition is not well suited for our purpose. Therefore, one needs to explicitly form the Jacobian.
Chapter 5

Transient Analysis

Consider a circuit in which a current source IS drives a nonlinear charge-based element, e.g. a diode junction with stored charge q(v) = q₀[exp(v/V_T) - 1]. KCL gives

dq/dt = IS,   q = q₀[exp(v/V_T) - 1]          (5.1)

These are differential equations, and their solution is a function (of time) rather than just a number.
5.1 Initial Value Problems

In its most general form, the problem of solving a first-order differential equation is as follows: given a function

F: Rⁿ × Rⁿ × R → Rⁿ

and initial values

x₀ ∈ Rⁿ,  t₀ ∈ R

find a vector-valued function

x(t) ∈ Rⁿ,  t ≥ t₀

such that

F(ẋ(t), x(t), t) = 0,  t ≥ t₀
x(t₀) = x₀

Restricting attention to first-order differential equations is not a limitation, since any higher-order differential equation can be reformulated as a first-order one. Since x(t₀) is given, the above system is called an initial value problem. An initial value problem may have no solutions or it may have multiple solutions.

An important special case is a differential equation for which the function F is such that

F(ẋ(t), x(t), t) = f(x, t) - ẋ

In this case, the initial value problem can be written in the following explicit form

ẋ = f(x, t),   x(t₀) = x₀

Theorem: Let f(x, t) be continuous for all x ∈ Rⁿ and all t ≥ t₀, and in addition let f be Lipschitz continuous with respect to x, i.e., there exists L (independent of x and t) such that

||f(x, t) - f(y, t)|| ≤ L ||x - y||

for all x, y ∈ Rⁿ and all t ≥ t₀. Then, for any x₀ ∈ Rⁿ, there exists a unique function x(t), t ≥ t₀, such that

ẋ = f(x, t),   x(t₀) = x₀
5.2 Linear Multistep Methods

One way to solve (5.1) is to think of dq/dt as a new variable. Then (5.1) has three variables, v, q and dq/dt, but only two equations; we need one more relationship between dq/dt and the rest of the variables. This is provided by so-called linear multistep (LMS) methods. For

dx/dt = f(x)

a k-step linear multistep method imposes

Σ_{i=0}^k α_i x_{n-i} = h_n Σ_{i=0}^k β_i ẋ_{n-i}

Here

x_{n-i} ≈ x(t_{n-i}),   ẋ_{n-i} = dx/dt(t_{n-i}),   t_n = t_{n-1} + h_n

Examples:

1. Forward Euler (α₀ = 1, α₁ = -1, β₁ = 1):

x_n - x_{n-1} - h_n ẋ_{n-1} = 0   ⇔   ẋ_{n-1} = (x_n - x_{n-1})/h_n

2. Backward Euler (α₀ = 1, α₁ = -1, β₀ = 1):

x_n - x_{n-1} - h_n ẋ_n = 0   ⇔   ẋ_n = (x_n - x_{n-1})/h_n

3. Trapezoidal (α₀ = 1, α₁ = -1, β₀ = 1/2, β₁ = 1/2):

x_n - x_{n-1} - (h_n/2)(ẋ_n + ẋ_{n-1}) = 0   ⇔   (ẋ_n + ẋ_{n-1})/2 = (x_n - x_{n-1})/h_n

Linear multistep methods with β₀ = 0 are called explicit methods, as opposed to implicit methods where β₀ ≠ 0. Most often α₀ = 1.
Therefore, given a differential equation of the form

dq(x)/dt + f(x) + b(t) = 0

discretize the time scale into a number of time steps and use an appropriate linear multistep method to eliminate q̇ as follows:

(α₀/(h_n β₀)) q(x_n) + Σ_{i=1}^k (α_i/(h_n β₀)) q_{n-i} - Σ_{i=1}^k (β_i/β₀) q̇_{n-i} + f(x_n) + b(t_n) = 0

This is a nonlinear equation with x_n as the unknown (the q_{n-i} and q̇_{n-i} for i ≥ 1 are known). Therefore it can be solved using Newton-Raphson. The Jacobian for this system is

(α₀/(h_n β₀)) dq/dx + df/dx

As discussed earlier, it is crucial to have a good initial guess x_n⁽⁰⁾ for x_n in order to guarantee quadratic convergence of Newton's method. Such an initial guess can be generated by an explicit k-step method

x_n⁽⁰⁾ = -Σ_{i=1}^k α̂_i x_{n-i} + h_n Σ_{i=1}^k β̂_i ẋ_{n-i}

where the α̂_i and β̂_i are the parameters of the explicit linear multistep method. This approach of combining an implicit k-step method with an explicit k-step method for generating initial values for Newton's method is called a predictor-corrector method.
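A scalar sketch of the resulting transient loop (backward Euler corrector with an inner Newton iteration; Python, illustrative; for simplicity the trivial predictor x_n⁽⁰⁾ = x_{n-1} is used instead of an explicit LMS predictor):

    import numpy as np

    def transient_be(q, dq, f, df, b, x0, t0, T, h):
        """Integrate dq(x)/dt + f(x) + b(t) = 0 with backward Euler, solving
        F(x_n) = (q(x_n) - q(x_{n-1}))/h + f(x_n) + b(t_n) = 0 at each step."""
        ts = np.arange(t0, T + h/2, h)
        xs = [x0]
        for tn in ts[1:]:
            x, q_prev = xs[-1], q(xs[-1])     # predictor: previous value
            for _ in range(50):               # Newton-Raphson corrector
                F = (q(x) - q_prev)/h + f(x) + b(tn)
                J = dq(x)/h + df(x)           # Jacobian of F
                dx = -F / J
                x += dx
                if abs(dx) < 1e-12:
                    break
            xs.append(x)
        return ts, np.array(xs)

    # Test: x' = -x written as d(x)/dt + x + 0 = 0; exact solution exp(-t).
    ts, xs = transient_be(q=lambda x: x, dq=lambda x: 1.0,
                          f=lambda x: x, df=lambda x: 1.0,
                          b=lambda t: 0.0, x0=1.0, t0=0.0, T=0.5, h=0.1)
    print(xs[-1], np.exp(-0.5))   # 0.6209... vs 0.6065...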
5.3 Accuracy of Linear Multistep Methods

Sources of error:

- local error, due to the finite time-step taken at each point
- global error, the accumulation of the local errors over the whole interval
A linear multistep method computes an approximate solution of an initial value problem, and it is desirable that the approximation error be as small as possible. Local truncation error (LTE) measures the error introduced in taking one time-step of the linear multistep method, assuming that all the values computed at previous time points are exact. Let x(t_n) denote the exact solution and x_n the computed (approximate) solution. Then the local truncation error measures how close x_n is to x(t_n) in the following sense:

LTE ≜ x(t_n) - x_n

Consider a test problem ẋ = f(x). Then x_n satisfies the following equation:

α₀ x_n + Σ_{i=1}^k α_i x(t_{n-i}) - h_n [β₀ f(x_n) + Σ_{i=1}^k β_i ẋ(t_{n-i})] = 0
As a numerical example, consider ẋ = -x, x(0) = 1. The exact solution of this equation is x(t) = exp(-t). Consider integrating this equation with backward Euler with time-steps of 0.1. At time t_n, for this test problem, the following relationship holds:

(x_n - x_{n-1})/h_n = ẋ_n = -x_n   ⇒   x_n = x_{n-1}/(1 + h_n)

Now consider the solution at time t = 0.5. If x_{n-1} is assumed perfect, i.e., x₄ = exp(-0.4) = 0.67032, then

x₅ = 0.67032/(1 + 0.1) = 0.60938

whereas x(0.5) = exp(-0.5) = 0.60653, so the LTE of this step is 0.60653 - 0.60938 = -0.00285.
Define the LMS error functional

E[x, h] ≜ Σ_{i=0}^k [α_i x(t_{n-i}) - h_n β_i ẋ(t_{n-i})]

Subtracting the defining equation of x_n from E[x, h] gives

E[x, h] = x(t_n) - x_n + h_n β₀ [f(x_n) - f(x(t_n))]

so that, if f is Lipschitz with constant l,

||LTE|| ≤ ||E|| / (1 - h_n l |β₀|)

i.e., for small enough h_n the LTE is essentially the value of the error functional. Expanding x(t_{n-i}) and ẋ(t_{n-i}) in Taylor series about t_n,

E[x, h] = E⁰[x, 0] + E¹[x, 0] h + E²[x, 0] h²/2! + ... + E^{(k+1)}[x, 0] h^{k+1}/(k+1)! + O(h^{k+2})

A method is said to be exact of order p if E[q(t), h] = 0 for every polynomial q(t) of degree at most p. To verify this it suffices to test the monomials

q(t) = ((t_n - t)/h_n)^l,   l = 0, 1, ..., p

for which

E[q(t), h] = Σ_{i=0}^k [α_i ((t_n - t_{n-i})/h_n)^l + l β_i ((t_n - t_{n-i})/h_n)^{l-1}]
Examples (uniform step size, so (t_n - t_{n-i})/h_n = i):

1. Forward Euler (α₀ = 1, α₁ = -1, β₁ = 1):
   Σ α_i = 0 (l = 0),   Σ (i α_i + β_i) = 0 (l = 1),   Σ (i² α_i + 2i β_i) ≠ 0

2. Backward Euler (α₀ = 1, α₁ = -1, β₀ = 1):
   Σ α_i = 0 (l = 0),   Σ (i α_i + β_i) = 0 (l = 1),   Σ (i² α_i + 2i β_i) ≠ 0

3. Trapezoidal (α₀ = 1, α₁ = -1, β₀ = β₁ = 1/2):
   Σ α_i = 0 (l = 0),   Σ (i α_i + β_i) = 0 (l = 1),   Σ (i² α_i + 2i β_i) = 0 (l = 2),   Σ (i³ α_i + 3i² β_i) ≠ 0

Therefore Forward Euler and Backward Euler are first order methods while Trapezoidal is a second order method. Usually α₀ = 1, so a k-step method has 2k + 1 free coefficients while exactness of order p imposes p + 1 conditions; hence p ≤ 2k.
5.3.1 The Local Truncation Error Coefficient

If the method is exact of order p, the first nonvanishing term of the Taylor expansion of E gives

LTE = ε_{p+1} h_n^{p+1} x^{(p+1)}(t_n) + O(h_n^{p+2})

where (up to sign conventions)

ε_{p+1} = (1/(p+1)!) Σ_{i=0}^k [α_i ((t_n - t_{n-i})/h_n)^{p+1} + (p+1) β_i ((t_n - t_{n-i})/h_n)^p]

For uniform steps:

1. Forward Euler (p = 1): ε₂ = (1/2!)(-1 + 2) = 1/2
2. Backward Euler (p = 1): ε₂ = (1/2!)(-1) = -1/2
3. Trapezoidal (p = 2): ε₃ = (1/3!)(-1 + 3/2) = 1/12
Returning to the circuit equation

dq(x)/dt + f(x) + b(t) = 0

choose a linear multistep method:

0 = (1/(h_n β₀)) [Σ_{i=0}^k α_i q(x_{n-i}) - h_n Σ_{i=1}^k β_i q̇(x_{n-i})] + f(x_n) + b(t_n)

Then the differential equation becomes a nonlinear algebraic equation

0 = F(x_n) ≜ (α₀/(h_n β₀)) q(x_n) + f(x_n) + c_n + b(t_n)

where

c_n = (1/(h_n β₀)) Σ_{i=1}^k [α_i q(x_{n-i}) - h_n β_i q̇(x_{n-i})]

is known. Newton-Raphson applied to F(x_n) uses the Jacobian

J(x) = (α₀/(h_n β₀)) dq/dx (x) + df/dx (x)

If dq/dx and df/dx are Lipschitz continuous with constants l_q and l_f, then

||J(x₁) - J(x₂)|| ≤ || df/dx(x₁) - df/dx(x₂) || + (α₀/(h_n β₀)) || dq/dx(x₁) - dq/dx(x₂) ||
 ≤ (l_f + (α₀/(h_n β₀)) l_q) ||x₁ - x₂||

Thus J(x) is Lipschitz continuous. By the NR convergence theorem, the iteration converges if x_n⁽⁰⁾ is close to x_n. As pointed out earlier, we can use a predictor to generate x_n⁽⁰⁾ from previous time points.
5.4 Stability

A method is consistent if the local truncation error vanishes faster than the step size, i.e., LTE/h → 0 as h → 0. A method is convergent if, over a fixed interval [0, T],

max_{0≤m≤M} ||x_m - x(t_m)|| → 0 as h → 0

where x_m is the computed solution, x(t_m) the true solution, t_m = mh and M = T/h.

A method is stable if there exist h₀ and k < ∞ such that, for any two different initial conditions x₀ and x₀′ and any h = T/M < h₀,

||x_m - x_m′|| ≤ k ||x₀ - x₀′||,  0 ≤ m ≤ M

Classical theorem: consistency + stability ⇔ convergence.
5.5 Absolute Stability

Here we examine the stability properties of various linear multistep methods, i.e., the range of parameters where the computed solution is stable and where it is unstable. Ideally we want the linear multistep method to have the same stability properties as the original system: in the linear case, if the eigenvalues are in the left half plane, the approximate solution generated by the LMS method should be stable, and if the eigenvalues are in the right half plane, the approximate solution should be unstable. Prior to formally introducing the concept, consider the test problem ẋ = -x, x(0) = 1, with exact solution e⁻ᵗ. Let us try to solve this using the explicit midpoint method

ẋ_{n-1} = (x_n - x_{n-2})/(2h_n)

with h = 0.1:

[Plot: explicit midpoint solution, 0 ≤ t ≤ 20, oscillating with growing amplitude.]
This is obviously unstable and undesirable. Now let us try to solve this by reducing the time step. For h = 0.01 the solution looks like the following:

[Plot: explicit midpoint with h = 0.01, again oscillating and growing.]

Again the solution is unstable while the original system was stable. It can be shown that this method is unstable for all h! Now consider Forward Euler with h = 0.1:

[Plot: forward Euler with h = 0.1, close to e⁻ᵗ.]

The computed approximation is reasonably close to the solution. Let us increase the step size to h = 1:

[Plot: forward Euler with h = 1, noticeably different from e⁻ᵗ.]

Now the calculated solution is quite different from the exact solution. Let us try to increase the time-step even more. Let h = 3:

[Plot: forward Euler with h = 3, oscillating with growing amplitude.]

Now the method becomes unstable. Therefore Forward Euler is conditionally stable. Let us try Backward Euler and Trapezoidal for h = 0.1, 1, 3:

[Plots: backward Euler with h = 0.1, 1, 3 and trapezoidal with h = 0.1, 1, 3; all six computed solutions decay stably.]
Therefore Backward Euler and Trapezoidal appear to be stable for all h. Also, for a given h, Trapezoidal seems to be the most accurate of the three. Let us now formalize this notion of absolute stability of a linear multistep method. Consider the following test problem

ẋ = λx,   x(0) = 1,   λ complex

One might be tempted to use a more complicated test problem, but it turns out that this problem suffices because

1. it is simple;
2. the local behaviour of nonlinear systems can be approximated by ẋ = A(t)x, the linearization around the current operating point;
3. a system ẋ = Ax can often be diagonalized to x̃˙ = D x̃, where D is the diagonal matrix of eigenvalues of A.
Applying the LMS method

Σ_{i=0}^k [α_i x_{n-i} - h_n β_i ẋ_{n-i}] = 0

to the test problem (ẋ = λx), we have

Σ_{i=0}^k (α_i - qβ_i) x_{n-i} = 0,   q ≜ hλ

This can be treated as a difference equation, and we can use the discrete-time transform variable z to rewrite the above equation as

Σ_{i=0}^k (α_i - qβ_i) z^{k-i} = 0

Since the degree of this polynomial is k, it will have k roots r_i, and the generic solution for distinct roots is

x_n = Σ_{i=1}^k c_i r_iⁿ

If the roots are not distinct, a root r_i of multiplicity m_i contributes terms nʲ r_iⁿ, 0 ≤ j < m_i, with Σ m_i = k. This system is stable if |r_i| < 1 for all i, or if the roots with |r_i| = 1 have multiplicity 1 and all other roots satisfy |r_i| < 1. Otherwise this system is unstable.
5.5.1 Region of Absolute Stability

The region of absolute stability of an LMS method is the set of q = hλ such that all solutions of the difference equation

Σ_{i=0}^k (α_i - qβ_i) x_{n-i} = 0

remain bounded, i.e., all roots of

Σ_{i=0}^k (α_i - qβ_i) z^{k-i} = 0

are inside or on the complex unit circle (|z| ≤ 1) and roots on the unit circle have multiplicity 1. The roots of the above equation change as q changes; the region of absolute stability is the set of all values of q where the necessary and sufficient conditions are satisfied.

As an example, consider Backward Euler:

x_n = x_{n-1} + h ẋ_n   ⇒   (1 - q)z - 1 = 0   ⇒   z = 1/(1 - q)

|z| ≤ 1   ⇔   |1 - q| ≥ 1

The region of absolute stability is the entire q-plane outside the disk |1 - q| < 1.
[Figure: q-plane for backward Euler; only the disk |1 - q| < 1 is unstable.]

Obviously Backward Euler is a very stable method. In fact, even for some differential equations whose actual solutions are unstable, BE will produce a stable solution. Consider Forward Euler:

x_n = x_{n-1} + h ẋ_{n-1}   ⇒   z = 1 + q

|z| ≤ 1   ⇔   |1 + q| ≤ 1

The region of absolute stability is the disk of radius 1 centered at q = -1.

[Figure: q-plane for forward Euler; everything outside the disk |1 + q| ≤ 1 is unstable.]

As we have seen before in an example, Forward Euler is a conditionally stable method and can become unstable for large time steps.
Now consider Trapezoidal:

x_n = x_{n-1} + (h/2)(ẋ_{n-1} + ẋ_n)   ⇒   z = (1 + q/2)/(1 - q/2)

|z| ≤ 1   ⇔   Re(q) ≤ 0

[Figure: q-plane for trapezoidal; the right half plane is unstable.]

The desirable property of the Trapezoidal method is that its region of absolute stability is the same as that of the original system.
5.5.2 Determining the Region of Absolute Stability

Now consider the problem of finding the region of absolute stability for a given linear multistep method whose corresponding difference equation is

Σ_{i=0}^k (α_i - qβ_i) z^{k-i} = 0

We need to find the q for which all roots satisfy the stability condition. There are several ways to achieve this. Define

ρ(z) = Σ_{i=0}^k α_i z^{k-i},   σ(z) = Σ_{i=0}^k β_i z^{k-i}

One could look at

S ≜ { q : q = ρ(z)/σ(z), |z| ≤ 1 }

i.e., let z wander around |z| ≤ 1 and record all values of q seen. This method is not very useful because we might get some q values for two or more different z's, one with |z| ≤ 1 and one with |z| > 1.

The most efficient method is to use the concept of conformal mapping from complex variable theory. Let C(q) be the contour defined by

q = ρ(z)/σ(z)   with z = exp(jθ)

[Figure: the unit circle in the z-plane and its image C(q) in the q-plane.]

Then we can use some basic results from the theory of complex variables:

1. the mapping ρ(z)/σ(z) is conformal;
2. the q-plane is separated by C(q) into disjoint sets, and in each set the number of roots outside the unit circle is constant;
3. the boundary of the stability region is a subset of C(q).
Examples:

FE: q(z = exp(jθ)) = exp(jθ) - 1. The unit circle in the z-plane just gets shifted by (-1, 0); the area outside this shifted circle is the unstable region.

BE: q(z = exp(jθ)) = 1 - 1/exp(jθ) = 1 - exp(-jθ). In this transformation the unit circle shifts by (1, 0) and its orientation is reversed; hence the area inside the circle is unstable.

Trapezoidal: x_n = x_{n-1} + (q/2)(x_{n-1} + x_n) gives

q = 2(exp(jθ) - 1)/(exp(jθ) + 1)

In this case the unit circle is mapped onto the imaginary axis. Hence the area to the left of the imaginary axis is the region of absolute stability.
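The boundary locus is easy to evaluate numerically. A sketch (Python, illustrative) that maps points of the unit circle through q = ρ(z)/σ(z) for forward Euler and trapezoidal:

    import numpy as np

    def boundary_locus(alpha, beta, ntheta=7):
        """Map z = exp(j*theta) through q = rho(z)/sigma(z); the resulting
        contour is the boundary of the region of absolute stability.
        An odd ntheta avoids z = -1, the pole of the trapezoidal map."""
        theta = np.linspace(0, 2*np.pi, ntheta, endpoint=False)
        z = np.exp(1j * theta)
        rho   = np.polyval(alpha, z)    # sum over i of alpha_i z^(k-i)
        sigma = np.polyval(beta, z)     # sum over i of beta_i  z^(k-i)
        return rho / sigma

    # Forward Euler: rho(z) = z - 1, sigma(z) = 1  ->  unit circle shifted to -1
    print(np.round(boundary_locus([1, -1], [0, 1]), 3))
    # Trapezoidal: rho(z) = z - 1, sigma(z) = (z + 1)/2  ->  imaginary axis
    print(np.round(boundary_locus([1, -1], [0.5, 0.5]), 3))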
As a less trivial example, consider a 7-step method with α₀ = 1, α₁ = -1, α₂ = ... = α₇ = 0, so that

ρ(z) = z⁷ - z⁶,   σ(z) = β₁ z⁶ + ... + β₇

The characteristic polynomial is 0 = ρ(z) - qσ(z). For q = 0, ρ(z) = z⁷ - z⁶ has roots z = 0 with multiplicity 6 and z = 1 with multiplicity 1, so the method is stable at q = 0.

[Figure: boundary locus C(q) of this method; the closed region around the origin is the region of absolute stability.]

The region just outside the stability region has one unstable root, and so on.
5.5.3 A-Stable Methods

A method is A-stable if the region of absolute stability includes the entire left-half q-plane. Examples: backward Euler, trapezoidal.

Dahlquist's Theorem:

1. An A-stable LMS method cannot exceed second order of accuracy.
2. The most accurate A-stable method (smallest local truncation error) is the trapezoidal rule.
5.6 Stiff Problems

Consider the problem

ẋ = λ₁(x - s(t)) + ds/dt,   x(0) = x₀,   s(t) = 1 - exp(λ₂t)

with λ₁ = -10⁶ and λ₂ = -1. The exact solution of this system is

x(t) = x₀ exp(λ₁t) + 1 - exp(λ₂t)

[Figure (not to scale): the fast component x₀ exp(λ₁t) dies out by t ≈ 5×10⁻⁶, while 1 - exp(λ₂t) approaches 1 only for t ≈ 5.]

For t ≥ 5×10⁻⁶, x₀ exp(λ₁t) ≈ 0. For t ≥ 5, 1 - exp(λ₂t) ≈ 1. The interval of interest is [0, 5]. If a uniform step size is used in the numerical integration of this set of equations, then for accuracy purposes h ≈ 10⁻⁶. This would imply that we need to take 5×10⁶ steps!!
A more economical strategy is to take 5 steps of size 10⁻⁶ for accuracy during the initial phase and then 5 steps of size 1. Trying this with Forward Euler, the initial phase converges, but after switching to h = 1 the error is multiplied by |1 + hλ₁| ≈ 10⁶ at every step:

t:  1,  2,        3,        4,        5
x: ≈1, 3.7×10⁵, 3.7×10¹¹, 3.7×10¹⁷, 3.7×10²³

Forward Euler is obviously not suited for this problem because it has a very small stability region, |1 + q| ≤ 1. Therefore, for solving practical differential equations, we need

1. variable time steps,
2. methods with a large region of stability,
3. methods which remain stable for variable time steps.

Stiff problems occur when

1. natural time constants,
2. input time constants, and
3. the interval of interest

are widely separated.
5.6.1 Backward Differentiation Formulas (Gear's Methods)

The requirements on the method are:

1. For stability when Re(λ) < 0, the region of absolute stability should contain the entire left half of the q-plane.
2. For accuracy when Re(λ) ≥ 0, we want the method to be accurate near Re(q) ≈ 0.
3. Also, we want to take larger time steps when the interval of interest is much longer than the time constants, no matter what λ is. Thus the stability region should include q = ∞.

Recall that

q = ρ(z)/σ(z)

For q = ∞, σ(z) = 0. If β₁ = ... = β_k = 0, then σ(z) = β₀ z^k, so as q → ∞ all roots go to 0 with multiplicity k. Therefore such a method includes ∞ in its region of stability.

This class of methods, where β₁ = ... = β_k = 0, is called Backward Differentiation Formulas (BDF) or Gear's methods. For these methods the order of accuracy is p = k (the step length), β₀ ≠ 0, and

Σ_{i=0}^k α_i x_{n-i} - h_n β₀ ẋ_n = 0   ⇔   ẋ_n = (1/(h_n β₀)) Σ_{i=0}^k α_i x_{n-i}

k = 1 corresponds to backward Euler. Note that, with α₀ = 1, there are k + 1 free coefficients (α₁, ..., α_k and β₀), and to get accuracy p we need to satisfy the p + 1 exactness conditions; hence the number of exactness conditions equals the number of unknowns. The region of absolute stability for Gear methods of various orders with uniform step size is shown below.
[Figure: stability-region boundaries for Gear 1 through Gear 6; the higher-order methods exclude progressively larger lobes around the positive real axis.]
The main difference between the trapezoidal method and the 2nd-order Gear method is that the trapezoidal method requires knowledge of only the previous time step whereas Gear's method requires the knowledge of the previous two time steps. Therefore the coefficients are functions of the time steps. The exactness conditions are

0 = α₀ + α₁ + α₂
0 = α₁ (t_n - t_{n-1})/h_n + α₂ (t_n - t_{n-2})/h_n + β₀ = α₁ + α₂(1 + h_{n-1}/h_n) + β₀
0 = α₁ + α₂(1 + h_{n-1}/h_n)²

Let r = h_{n-1}/h_n. Then, with α₀ = 1,

α₂ = 1/(r(r + 2)),   α₁ = -(1 + r)²/(r(r + 2)),   β₀ = (r + 1)/(r + 2)

Note, however, that the coefficients depend only on the step size ratio and not on the absolute step values.
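A small helper (Python, illustrative) evaluating these coefficient formulas, with the uniform-step case as a check:

    def gear2_coeffs(h_n, h_nm1):
        """Variable-step Gear-2 (BDF2) coefficients from the formulas above.
        For r = 1 (uniform steps) this reduces to the familiar
        alpha = (1, -4/3, 1/3), beta0 = 2/3."""
        r = h_nm1 / h_n
        a0 = 1.0
        a1 = -(1 + r)**2 / (r * (r + 2))
        a2 = 1.0 / (r * (r + 2))
        b0 = (r + 1) / (r + 2)
        return a0, a1, a2, b0

    print(gear2_coeffs(0.1, 0.1))   # (1.0, -1.333..., 0.333..., 0.666...)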
5.6.2 Time Step Control

For efficiency, the goal is to take the fewest time steps consistent with the error bounds. Recall that the local error is given by

LE_n = ε h_n^{p+1} x^{(p+1)}(t_n)/(p+1)! + O(h_n^{p+2})

where

ε = Σ_{i=1}^k [α_i ((t_n - t_{n-i})/h_n)^{p+1} + (p+1) β_i ((t_n - t_{n-i})/h_n)^p]

At each time step we want the local error to be less than a given error bound E_n. This implies

h_n ≤ ( (p+1)! E_n / ( |ε| ||x^{(p+1)}(t_n)|| ) )^{1/(p+1)}

For a given multistep method we have a formula for ε. If we knew x^{(p+1)} we would want to take h_n equal to this bound. One way to estimate x^{(p+1)}(t_n) is to use divided differences:

DD₁ = (x_n - x_{n-1})/h_n ≈ ẋ_n
DD₂ = (DD₁(t_n) - DD₁(t_{n-1}))/(h_n + h_{n-1}) ≈ x⁽²⁾(t_n)/2!
DD_{k+1} = (DD_k(t_n) - DD_k(t_{n-1})) / Σ_{i=0}^k h_{n-i} ≈ x^{(k+1)}(t_n)/(k+1)!

In principle, if we have a choice of different methods with different accuracy p, we would choose the method which gives the largest step size

h_n = ( E_n / (ε DD_{p+1}) )^{1/(p+1)}

However,

1. DD is error prone,
2. DD is expensive to compute,
3. it is expensive to switch p.

Therefore, typical circuit simulators follow some heuristic rules:

1. don't change the step h_n too often,
2. change the order k only if doing so improves the step size by about a factor of 2,
3. change step size and order only if LE < E_n for k + 1 steps after the last change, or if the error is too large.
5.6.3 Nonlinear Capacitors and Charge Conservation

Consider a nonlinear capacitor with charge q(v) = q₀ exp(v/V_T). Its current is

i = dq/dt = (q₀/V_T) exp(v/V_T) dv/dt

Applying backward Euler to dv/dt,

i_n = (q₀/V_T) exp(v_n/V_T) (v_n - v_{n-1})/h_n
Therefore the capacitor looks like a nonlinear voltage-dependent resistor in parallel with a current source. This can be solved using Newton-Raphson: linearizing in v_n around the iterate v_n⁽ᵏ⁾,

i_n⁽ᵏ⁺¹⁾ = G v_n⁽ᵏ⁺¹⁾ + I

where

G = (q₀/(h_n V_T)) exp(v_n⁽ᵏ⁾/V_T) [1 + (v_n⁽ᵏ⁾ - v_{n-1})/V_T]
I = -G v_n⁽ᵏ⁾ + (q₀/V_T) exp(v_n⁽ᵏ⁾/V_T) (v_n⁽ᵏ⁾ - v_{n-1})/h_n
Another way of doing this is to apply the LMS method directly to the charge:

i_n = (q_n - q_{n-1})/h_n

Linearizing q(v_n) around v_n⁽ᵏ⁾,

i_n⁽ᵏ⁺¹⁾ = [q(v_n⁽ᵏ⁾) + q′(v_n⁽ᵏ⁾)(v_n⁽ᵏ⁺¹⁾ - v_n⁽ᵏ⁾) - q(v_{n-1})]/h_n = G v_n⁽ᵏ⁺¹⁾ + I

where

G = q′(v_n⁽ᵏ⁾)/h_n
I = [-q′(v_n⁽ᵏ⁾) v_n⁽ᵏ⁾ + q(v_n⁽ᵏ⁾) - q(v_{n-1})]/h_n

and q′ is calculated from the nonlinear expression for the charge. The question now is which one is better. Consider the two ways of writing the circuit equations again:

dq(x)/dt + f(x) + b(t) = 0
C(x) dx/dt + f(x) + b(t) = 0,   C(x) = dq/dx
Now, the total charge in the circuit should be conserved, i.e.,

Σ_{i=1}^{m+1} q_i(x) = K   and   Σ_{i=1}^{m+1} [f_i(x) + b_i(t)] = 0

(the second identity holds because every branch current leaves one node and enters another). What happens when we apply a numerical integration method to the two forms? We hope that charge is conserved. Apply backward Euler to the second form:

0 = C(x_n)(x_n - x_{n-1})/h_n + f(x_n) + b(t_n)

Since C(x_n)(x_n - x_{n-1}) = q(x_n) - q(x_{n-1}) + O(h_n²), summing over all nodes gives

K = K + 0 + Σ_{i=1}^{m+1} O(h_n²)

so charge is conserved only to within the local truncation error, and these errors accumulate over time. Applying backward Euler to the charge form instead,

Σ_{i=1}^{m+1} q_i(x_n) = Σ_{i=1}^{m+1} q_i(x_{n-1}) - h_n Σ_{i=1}^{m+1} [f_i(x_n) + b_i(t_n)]

and the last sum vanishes, so

K = K + 0

Thus charge is conserved.

Theorem: Any consistent multistep method conserves charge when applied to

dq(x)/dt + f(x) + b(t) = 0
Chapter 6

Small-Signal Analysis of Circuits

6.1 General Formulation

Let x_s(t) be the steady-state solution of the circuit equations, and let a small perturbation ξ(t) be applied through the matrix D, producing the response x_s + x_p with x_p small:

0 = dq(x_s + x_p)/dt + f(x_s + x_p) + b + D(x_s + x_p) ξ(t)          (6.1)

Expanding to first order in x_p,

0 ≈ dq(x_s)/dt + f(x_s) + b + d/dt[(∂q/∂x)|_{x_s} x_p(t)] + (∂f/∂x)|_{x_s} x_p(t) + (D(x_s) + (∂D/∂x)|_{x_s} x_p(t)) ξ(t)          (6.2)

The first three terms vanish because x_s satisfies the unperturbed circuit equations, and the term (∂D/∂x) x_p ξ is second order in the perturbation. Let

G = (∂f/∂x)|_{x_s},   C = (∂q/∂x)|_{x_s}

Then

0 = C dx_p(t)/dt + G x_p(t) + D(x_s) ξ(t)          (6.3)

This small-signal analysis can be used for the so-called AC and noise analyses in circuits.
6.2 AC Analysis

In AC analysis, D(x)ξ(t) is a small sinusoidal source A exp(j2πft) whose frequency f is swept over a range. If the circuit is nonoscillatory, x_p(t) will also be a sinusoid at the same frequency, i.e.,

x_p(t) = X_p exp(j2πft),   X_p ∈ Cⁿ

Substituting the above form in (6.3), we get

[(j2πf C + G) X_p + A] exp(j2πft) = 0   ⇒   (j2πf C + G) X_p = -A

X_p can now be obtained using a complex linear solver.
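A sketch of the resulting frequency sweep (Python/numpy, illustrative; the one-node RC values and the form of the source vector A are invented for this example):

    import numpy as np

    def ac_sweep(G, C, A, freqs):
        """Solve (j*2*pi*f*C + G) Xp = -A at each analysis frequency.
        G and C come from linearizing around the DC operating point."""
        return [np.linalg.solve(1j*2*np.pi*f*C + G, -A) for f in freqs]

    # Hypothetical example: RC low-pass, R = 1 kOhm, C = 1 uF, driven by a
    # 1 V source through R (Norton equivalent: 1 mA into the node).
    G = np.array([[1e-3]]); C = np.array([[1e-6]])
    A = np.array([-1e-3])
    freqs = [10.0, 159.15, 10e3]
    for f, Xp in zip(freqs, ac_sweep(G, C, A, freqs)):
        print(f, abs(Xp[0]))   # ~1 at low f, ~0.707 at the corner, small at 10 kHz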
6.3 Noise Analysis

In noise analysis, D(x_s) → D(x_s, f) and ξ(t) is a vector of unit, uncorrelated white and flicker noise sources. Equation (6.3) can be viewed as a linear time-invariant system with impulse response h(t), whose Fourier transform is H(f) = (j2πfC + G)⁻¹. From stochastic differential equation theory

x_p(t) = ∫ h(t - s) D(x_s) ξ(s) ds

Typically in circuit simulation we are interested in the second-order statistics (power spectral density, total noise power, etc.) of one component of x_p(t). Let e_i be the ith unit vector, where i is the index of the component of x_p(t) which is of interest. Then

E[e_iᵀ x_p(t₁) x_pᵀ(t₂) e_i] = E[ ∫∫ e_iᵀ h(t₁-s₁) D(x_s) ξ(s₁) ξᵀ(s₂) Dᵀ(x_s) hᵀ(t₂-s₂) e_i ds₁ ds₂ ]

where E[·] denotes the expectation operator. Interchanging the order of expectation and integration, and using the fact that

E[ξ(s₁) ξᵀ(s₂)] = I δ(s₁ - s₂)

we have

E[e_iᵀ x_p(t₁) x_pᵀ(t₂) e_i]
 = ∫∫ e_iᵀ h(t₁-s₁) D(x_s) Dᵀ(x_s) hᵀ(t₂-s₂) e_i δ(s₁-s₂) ds₁ ds₂
 = ∫∫ e_iᵀ H(f₁) D(x_s) Dᵀ(x_s) Hᵀ(f₂) e_i exp[j2π(f₁t₁+f₂t₂)] δ(f₁+f₂) df₁ df₂
 = ∫ e_iᵀ H(f₁) D(x_s) Dᵀ(x_s) Hᵀ(-f₁) e_i exp[j2πf₁(t₁-t₂)] df₁

Therefore, the autocorrelation function of x_p is a function only of t₁ - t₂, i.e., x_p is a wide-sense stationary stochastic process. The Fourier transform of the autocorrelation function, the power spectral density, is therefore given by (using Hᵀ(-f) = Hᴴ(f) for real C and G)

S_{x_p,i x_p,i}(f) = e_iᵀ H(f) D(x_s, f) Dᵀ(x_s, f) Hᴴ(f) e_i

The power spectral density S_{x_p,i x_p,i}(f) can be calculated by solving the adjoint system (j2πfC + G)ᴴ y = e_i, multiplying the result by Dᵀ(x_s, f), and taking the squared magnitude of the resulting vector.
Chapter 7

Steady-State Analysis of Circuits

Consider again the circuit equations

dq(x)/dt + f(x) + b(t) = 0          (7.1)

The independent sources are assumed to be periodic with period T. Since the circuit is nonautonomous, the circuit steady-state response x_s(t) will also be periodic with period T. A trivial method for determining x_s(t) is to run a transient analysis and wait for all the waveforms to settle to their steady state. However, this may take too long, so we will discuss methods which compute the steady-state response x_s(t) over one period directly. The first two methods are in the time domain while the last method is in the frequency domain.
7.1.1 Finite Difference

First discretize the time period $[0, T]$ into $n$ steps $t_0, t_1, \ldots, t_n$ where $t_0 = 0$ and $t_n = T$. Further, define
\[
h_i = t_i - t_{i-1}
\]
Note that these steps need not be equal. We rewrite (7.1) at each of these time steps by discretizing the differential operator using Backward Euler (for example):
\[
\frac{q(x_1) - q(x_0)}{h_1} + f(x_1) + b(t_1) = 0
\]
\[
\frac{q(x_2) - q(x_1)}{h_2} + f(x_2) + b(t_2) = 0
\]
\[
\vdots
\]
\[
\frac{q(x_n) - q(x_{n-1})}{h_n} + f(x_n) + b(t_n) = 0
\]
Periodicity of the solution requires that $x_0 = x_n$. Then the above equations become
\[
F_{fd} =
\begin{pmatrix}
\frac{q(x_1) - q(x_n)}{h_1} + f(x_1) + b(t_1)\\
\frac{q(x_2) - q(x_1)}{h_2} + f(x_2) + b(t_2)\\
\vdots\\
\frac{q(x_n) - q(x_{n-1})}{h_n} + f(x_n) + b(t_n)
\end{pmatrix}
= 0
\]
This is a system of $nm$ nonlinear equations in the $nm$ unknowns $x_1, \ldots, x_n$, where $m$ is the circuit size. Therefore these equations can be solved using Newton's method. The Jacobian for the above system of equations is
\[
J_{fd} =
\begin{pmatrix}
\frac{C_1}{h_1} + G_1 & 0 & \cdots & -\frac{C_n}{h_1}\\
-\frac{C_1}{h_2} & \frac{C_2}{h_2} + G_2 & & 0\\
& \ddots & \ddots & \\
0 & & -\frac{C_{n-1}}{h_n} & \frac{C_n}{h_n} + G_n
\end{pmatrix}
\]
where as usual
\[
C_i = \frac{\partial q}{\partial x}\Big|_{x_i} \qquad G_i = \frac{\partial f}{\partial x}\Big|_{x_i}
\]
Instead of solving the above system of equations by direct factorization, we will use Krylov subspace methods. Recall that the success of a Krylov subspace method depends critically on the choice of a good preconditioner. For this case, let us write the Jacobian as
\[
J_{fd} = L + B
\]
where
\[
L =
\begin{pmatrix}
\frac{C_1}{h_1} + G_1 & & & 0\\
-\frac{C_1}{h_2} & \frac{C_2}{h_2} + G_2 & & \\
& \ddots & \ddots & \\
0 & & -\frac{C_{n-1}}{h_n} & \frac{C_n}{h_n} + G_n
\end{pmatrix}
\qquad
B =
\begin{pmatrix}
0 & \cdots & 0 & -\frac{C_n}{h_1}\\
0 & & & 0\\
\vdots & & & \vdots\\
0 & \cdots & & 0
\end{pmatrix}
\]
Instead of solving
\[
J_{fd}X = F_{fd}
\]
we solve
\[
L^{-1}J_{fd}X = (I + L^{-1}B)X = L^{-1}F_{fd}
\]
Since $L$ is a block lower bidiagonal matrix, solving linear equations of the sort $Lx = y$ is very cheap ($O(nm^{1.3})$ for typical circuit matrices). The preconditioned system can then be solved using Krylov subspace methods. Recall that the only computation involved is matrix-vector products, and multiplication with $I + L^{-1}B$ can be performed very efficiently, as sketched below.
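A sketch of this matrix-vector product, using dense per-timepoint blocks for illustration (a real implementation would factor each $C_i/h_i + G_i$ once and reuse the sparse factors):

```python
import numpy as np

def precond_matvec(v, Cs, Gs, hs):
    """Compute (I + L^{-1} B) v for the finite difference Jacobian.

    Cs[i], Gs[i] are the m x m matrices C_{i+1}, G_{i+1} at the n time
    points; hs are the step sizes (all values hypothetical here).
    """
    n, m = len(hs), Cs[0].shape[0]
    v = v.reshape(n, m)
    # w = B v: only the first block row of B is nonzero (-C_n / h_1 coupling)
    w = np.zeros_like(v)
    w[0] = -(Cs[-1] @ v[-1]) / hs[0]
    # z = L^{-1} w by block forward substitution down the bidiagonal L
    z = np.zeros_like(v)
    z[0] = np.linalg.solve(Cs[0] / hs[0] + Gs[0], w[0])
    for i in range(1, n):
        rhs = w[i] + (Cs[i - 1] @ z[i - 1]) / hs[i]
        z[i] = np.linalg.solve(Cs[i] / hs[i] + Gs[i], rhs)
    return (v + z).ravel()

# Tiny usage example with random data (m = 2 unknowns, n = 4 time points)
rng = np.random.default_rng(0)
n, m = 4, 2
Cs = [np.eye(m) + 0.1 * rng.random((m, m)) for _ in range(n)]
Gs = [np.eye(m) + 0.1 * rng.random((m, m)) for _ in range(n)]
hs = np.full(n, 0.01)
print(precond_matvec(rng.random(n * m), Cs, Gs, hs))
```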
7.1.2 Shooting Method

Recall that transient analysis is the solution of an initial value problem: solve (7.1) given an initial condition $x(t_0)$. Shooting methods are used for solving so-called boundary value problems, where a desired solution $x(t_n)$ at some time point $t_n$ is given and the problem is to obtain an initial condition $x(t_0)$ and (optionally) the trajectory $x(t)$. We can use shooting to determine the steady-state solution of (7.1). For this problem, we need to find $x_0$ such that at time $T$, $x_0 = x(T)$. The solution trajectory can be viewed as a function of both time $t$ and the initial condition $x_0$, i.e.,
\[
x(t) = \phi(t, x_0)
\]
Therefore the shooting equation can be written as
\[
F_{sh} = \phi(T, x_0) - x_0 = 0
\]
The above equation can be viewed as a nonlinear equation in the $m$ variables $x_0$, and can therefore be solved using Newton's method. The Jacobian for this system is
\[
J_{sh} = \frac{\partial\phi(T, x_0)}{\partial x_0} - I
\]
In order to use Newton's method, we need to be able to evaluate $F_{sh}$ and $J_{sh}$ for a given $x_0$. $\phi(T, x_0)$, and therefore $F_{sh}$, can be evaluated by running a transient analysis with initial condition $x_0$ for time $T$. To evaluate $J_{sh}$, note that
\[
\frac{\partial\phi(T, x_0)}{\partial x_0} = \frac{\partial x_n}{\partial x_0}
\]
Using the chain rule,
\[
\frac{\partial x_n}{\partial x_0} = \prod_{i=1}^{n}\frac{\partial x_i}{\partial x_{i-1}}
\]
To evaluate $\frac{\partial x_i}{\partial x_{i-1}}$, recall that
\[
\frac{q(x_i) - q(x_{i-1})}{h_i} + f(x_i) + b(t_i) = 0
\]
Differentiating the above equation with respect to $x_{i-1}$,
\[
\frac{C_i}{h_i}\frac{\partial x_i}{\partial x_{i-1}} - \frac{C_{i-1}}{h_i} + G_i\frac{\partial x_i}{\partial x_{i-1}} = 0
\]
which yields
\[
\frac{\partial x_i}{\partial x_{i-1}} = \left(\frac{C_i}{h_i} + G_i\right)^{-1}\frac{C_{i-1}}{h_i}
\]
Therefore
\[
\frac{\partial x_n}{\partial x_0} = \prod_{i=1}^{n}\left(\frac{C_i}{h_i} + G_i\right)^{-1}\frac{C_{i-1}}{h_i}
\]
Note that the matrices
\[
\frac{C_i}{h_i} + G_i
\]
are already factored during the transient solution phase. The computational cost is therefore dominated by factoring the shooting Jacobian, which is a dense $m \times m$ matrix, so this method becomes impractical for large circuits.
However, the following observation facilitates the use of Krylov subspace methods for shooting. Recall that the preconditioned coefficient matrix for the finite difference method is
\[
I + L^{-1}B
\]
The last block column of $L^{-1}B$ is given by
\[
\begin{pmatrix}
-\left(\frac{C_1}{h_1} + G_1\right)^{-1}\frac{C_n}{h_1}\\[4pt]
-\left(\frac{C_2}{h_2} + G_2\right)^{-1}\frac{C_1}{h_2}\left(\frac{C_1}{h_1} + G_1\right)^{-1}\frac{C_n}{h_1}\\
\vdots\\
-\prod_{i=1}^{n}\left(\frac{C_i}{h_i} + G_i\right)^{-1}\frac{C_{i-1}}{h_i}
\end{pmatrix}
\]
with $C_0 = C_n$. Note that the last entry is $-\frac{\partial x_n}{\partial x_0}$. This suggests that instead of solving
\[
J_{sh}\,\Delta x_0 = F_{sh}
\]
we solve the following system
\[
(I + L^{-1}B)X = \begin{pmatrix}0\\ \vdots\\ 0\\ F_{sh}\end{pmatrix}
\]
whose last block component yields the shooting Newton update.
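A compact sketch of the whole shooting loop on a hypothetical scalar test problem ($q(x) = x$, $f(x) = x + 0.3x^3$, all values made up), accumulating the sensitivity product along the transient exactly as derived above:

```python
import numpy as np

T, n = 1.0, 200
h = T / n
b = lambda t: np.array([np.cos(2 * np.pi * t)])    # T-periodic source

def transient(x0):
    """BE transient over one period; returns x(T) and dx(T)/dx0."""
    x, sens = x0.copy(), np.eye(1)
    for k in range(1, n + 1):
        xp = x.copy()
        for _ in range(50):                        # Newton at each time step
            F = (x - xp) / h + x + 0.3 * x**3 + b(k * h)
            J = np.eye(1) / h + np.eye(1) * (1 + 0.9 * x**2)
            dx = np.linalg.solve(J, -F)
            x = x + dx
            if np.linalg.norm(dx) < 1e-13:
                break
        # dx_k/dx_{k-1} = (C_k/h_k + G_k)^{-1} C_{k-1}/h_k, with C = I here
        sens = np.linalg.solve(J, np.eye(1) / h) @ sens
    return x, sens

x0 = np.zeros(1)
for it in range(20):                               # shooting Newton loop
    xT, Phi = transient(x0)
    F_sh = xT - x0
    if np.linalg.norm(F_sh) < 1e-12:
        break
    dx0 = np.linalg.solve(Phi - np.eye(1), -F_sh)  # J_sh = Phi - I
    x0 = x0 + dx0
print("steady-state initial condition:", x0, "after", it + 1, "iterations")
```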
7.1.3 Harmonic Balance

Unlike shooting and finite difference, which solve (7.1) in the time domain, harmonic balance solves it in the frequency domain. Since the circuit is nonoscillatory, if the input signal is $T$-periodic, the steady-state solution $x(t)$ and the waveforms $q(x(t))$ and $f(x(t))$ are $T$-periodic. Since these signals are $T$-periodic, they can be expanded in Fourier series as follows:
\[
b(t) = \sum_{i=-\infty}^{\infty}B_i\exp(j2\pi i f t)
\]
\[
x(t) = \sum_{i=-\infty}^{\infty}X_i\exp(j2\pi i f t)
\]
\[
f(t) = \sum_{i=-\infty}^{\infty}F_i\exp(j2\pi i f t)
\]
\[
q(t) = \sum_{i=-\infty}^{\infty}Q_i\exp(j2\pi i f t)
\]
where $f = \frac{1}{T}$. Substituting into (7.1),
\[
\sum_{i=-\infty}^{\infty}\big[j2\pi i f\,Q_i + F_i + B_i\big]\exp(j2\pi i f t) = 0
\]
Truncating the series to the first $k$ harmonics, this becomes
\[
j2\pi f\,\Omega Q + F + B = 0 \qquad (7.2)
\]
where
\[
\Omega = \operatorname{diag}(-k, \ldots, 0, \ldots, k)\otimes I_m \qquad
Q = \begin{pmatrix}Q_{-k}\\ \vdots\\ Q_0\\ \vdots\\ Q_k\end{pmatrix}
\]
and $B$, $F$ and $X$ are similarly defined. (7.2) represents a system of $m(2k+1)$ equations in the $m(2k+1)$ unknowns $X$, which can be solved using Newton's method. The Jacobian for (7.2) is given by
\[
J_{hb} = j2\pi f\,\Omega\frac{\partial Q}{\partial X} + \frac{\partial F}{\partial X}
\]
Therefore, given an $X$, one needs to evaluate $Q$ and $F$ and the Jacobian of the above system.
The relationship between the various frequency domain quantities and time domain quantities is best illustrated by an example. Consider a circuit of size $2$ and $k = 1$. Then (7.2) is written as
\[
j2\pi f
\begin{pmatrix}
-1 & & & & & \\
& -1 & & & & \\
& & 0 & & & \\
& & & 0 & & \\
& & & & 1 & \\
& & & & & 1
\end{pmatrix}
\begin{pmatrix}
Q_{-1}^{(1)}\\ Q_{-1}^{(2)}\\ Q_0^{(1)}\\ Q_0^{(2)}\\ Q_1^{(1)}\\ Q_1^{(2)}
\end{pmatrix}
+
\begin{pmatrix}
F_{-1}^{(1)}\\ F_{-1}^{(2)}\\ F_0^{(1)}\\ F_0^{(2)}\\ F_1^{(1)}\\ F_1^{(2)}
\end{pmatrix}
+
\begin{pmatrix}
B_{-1}^{(1)}\\ B_{-1}^{(2)}\\ B_0^{(1)}\\ B_0^{(2)}\\ B_1^{(1)}\\ B_1^{(2)}
\end{pmatrix}
= 0
\]
where the superscript denotes the circuit variable and the subscript the harmonic. Let $P$ be the permutation matrix which regroups the unknowns by variable instead of by harmonic, i.e.,
\[
P =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0\\
0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\qquad
P\begin{pmatrix}
Q_{-1}^{(1)}\\ Q_{-1}^{(2)}\\ Q_0^{(1)}\\ Q_0^{(2)}\\ Q_1^{(1)}\\ Q_1^{(2)}
\end{pmatrix}
=
\begin{pmatrix}
Q_{-1}^{(1)}\\ Q_0^{(1)}\\ Q_1^{(1)}\\ Q_{-1}^{(2)}\\ Q_0^{(2)}\\ Q_1^{(2)}
\end{pmatrix}
\]
Let $D$ denote the three point DFT matrix mapping the time samples at $t_1, t_2, t_3$ to the harmonics $-1, 0, 1$, i.e.,
\[
D = \frac{1}{3}
\begin{pmatrix}
1 & \omega & \omega^2\\
1 & 1 & 1\\
1 & \omega^{-1} & \omega^{-2}
\end{pmatrix}
\qquad
\omega = \exp\left(\frac{j2\pi}{3}\right)
\]
Recall that
\[
D^{-1} =
\begin{pmatrix}
1 & 1 & 1\\
\omega^{-1} & 1 & \omega\\
\omega^{-2} & 1 & \omega^2
\end{pmatrix}
\]
Let
\[
\bar{D} = \begin{pmatrix}D & 0\\ 0 & D\end{pmatrix}
\]
The Fourier coefficients of each circuit variable are the DFT of its time samples $q_{t_1}, q_{t_2}, q_{t_3}$:
\[
\begin{pmatrix}
Q_{-1}^{(1)}\\ Q_0^{(1)}\\ Q_1^{(1)}\\ Q_{-1}^{(2)}\\ Q_0^{(2)}\\ Q_1^{(2)}
\end{pmatrix}
= \bar{D}
\begin{pmatrix}
q_{t_1}^{(1)}\\ q_{t_2}^{(1)}\\ q_{t_3}^{(1)}\\ q_{t_1}^{(2)}\\ q_{t_2}^{(2)}\\ q_{t_3}^{(2)}
\end{pmatrix}
\]
i.e., $PQ = \bar{D}Pq$, where $q$ is the vector of time samples ordered by time point. Multiplying the above by $P^{-1}$, we have
\[
Q = P^{-1}\bar{D}P
\begin{pmatrix}
q_{t_1}^{(1)}\\ q_{t_1}^{(2)}\\ q_{t_2}^{(1)}\\ q_{t_2}^{(2)}\\ q_{t_3}^{(1)}\\ q_{t_3}^{(2)}
\end{pmatrix}
= P^{-1}\bar{D}Pq
\]
This relationship can be used to compute the Jacobian $J_{hb}$. Since $Q = P^{-1}\bar{D}Pq$ and, by the same relation, $X = P^{-1}\bar{D}Px$ where $x$ is the vector of time samples of $x(t)$,
\[
\frac{\partial Q}{\partial X} = P^{-1}\bar{D}P\,\frac{\partial q}{\partial x}\,P^{-1}\bar{D}^{-1}P
\]
Let $C(t)$ be given by
\[
C(t) = \frac{\partial q}{\partial x}\Big|_{x(t)}
\]
and define
\[
\mathcal{C} =
\begin{pmatrix}
C(t_1) & 0 & 0\\
0 & C(t_2) & 0\\
0 & 0 & C(t_3)
\end{pmatrix}
\]
Note that
\[
\mathcal{C} = \frac{\partial q}{\partial x}
\]
Therefore, with $\mathcal{G}$ defined analogously from $G(t) = \frac{\partial f}{\partial x}\big|_{x(t)}$,
\[
J_{hb} = j2\pi f\,\Omega\,P^{-1}\bar{D}P\,\mathcal{C}\,P^{-1}\bar{D}^{-1}P + P^{-1}\bar{D}P\,\mathcal{G}\,P^{-1}\bar{D}^{-1}P
\]
This Jacobian is large and dense, and therefore storing, multiplying or factoring it is extremely inefficient. However, if Krylov subspace methods are used to solve
\[
J_{hb}X = F_{hb}
\]
the only computation involved is the matrix-vector product. This can be computed using permutations (no cost), Fourier transforms ($O(m(2k+1)\log(2k+1))$ with the FFT) and sparse matrix-vector multiplications ($O(m(2k+1))$). Therefore a properly preconditioned Krylov subspace method can quickly find the solution. For an appropriate choice of preconditioner, assume that $C(t)$ and $G(t)$ are constant. In this case the Jacobian reduces to the block diagonal matrix
\[
\begin{pmatrix}
-j2\pi k f C + G & & & 0\\
& -j2\pi(k-1)f C + G & & \\
& & \ddots & \\
0 & & & j2\pi k f C + G
\end{pmatrix}
\]
Therefore, for problems with mild nonlinearity, the above matrix is a very good preconditioner for the harmonic balance Jacobian. For strongly nonlinear problems, harmonic balance runs into convergence problems.
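The following sketch forms the harmonic balance Jacobian-vector product for a scalar circuit with assumed periodic $C(t)$ and $G(t)$ waveforms, using an explicit DFT matrix instead of an FFT so that the $-k \ldots k$ harmonic ordering stays visible, and gauges how close the constant-coefficient block-diagonal preconditioner is:

```python
import numpy as np

k = 8                                   # number of harmonics kept
N = 2 * k + 1                           # time samples / Fourier coefficients
f0 = 1.0                                # fundamental frequency (Hz), T = 1
t = np.arange(N) / N                    # sample points over one period
Ct = 1.0 + 0.3 * np.cos(2 * np.pi * t)  # hypothetical periodic capacitance
Gt = 2.0 + 0.5 * np.sin(2 * np.pi * t)  # hypothetical periodic conductance

# Explicit DFT matrix mapping time samples to harmonics -k..k; a real code
# would use the FFT with reindexing instead.
i = np.arange(-k, k + 1)
Dmat = np.exp(-2j * np.pi * np.outer(i, t)) / N
Dinv = np.exp(+2j * np.pi * np.outer(t, i))
Omega = np.diag(i.astype(float))

def jhb_matvec(V):
    x = Dinv @ V                        # frequency -> time samples
    return (2j * np.pi * f0 * Omega @ (Dmat @ (Ct * x))   # capacitive part
            + Dmat @ (Gt * x))                            # resistive part

# Block-diagonal preconditioner built from averaged (constant) C and G
Cbar, Gbar = Ct.mean(), Gt.mean()
P = np.diag(2j * np.pi * f0 * i * Cbar + Gbar)
V = np.random.default_rng(1).standard_normal(N) + 0j
# Relative distance between J_hb v and the preconditioner applied to v
print(np.linalg.norm(jhb_matvec(V) - P @ V) / np.linalg.norm(V))
```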
7.2 Oscillators

7.2.1 Finite Difference
Let the time period be discretized into $n$ steps $h_1, \ldots, h_n$. Discretizing the differential operator using Backward Euler (for example), we have
\[
\frac{q(x_1) - q(x_0)}{h_1} + f(x_1) + b = 0
\]
\[
\frac{q(x_2) - q(x_1)}{h_2} + f(x_2) + b = 0
\]
\[
\vdots
\]
\[
\frac{q(x_n) - q(x_{n-1})}{h_n} + f(x_n) + b = 0
\]
Enforcing $x_0 = x_n$, we have
\[
\frac{q(x_1) - q(x_n)}{h_1} + f(x_1) + b = 0
\]
\[
\frac{q(x_2) - q(x_1)}{h_2} + f(x_2) + b = 0
\]
\[
\vdots
\]
\[
\frac{q(x_n) - q(x_{n-1})}{h_n} + f(x_n) + b = 0
\]
The above equations can be solved using Newton-Raphson to obtain the steady-state response of the oscillator. However, unlike the nonautonomous case, the period $T$ is unknown and therefore $h_1, \ldots, h_n$ are also unknown. One way to fix this problem is to insist that the ratios
\[
\frac{h_i}{T} = \alpha_i
\]
are fixed throughout the Newton iteration. The $\alpha_i$ can be predetermined by running an initial transient. This still leaves us with $nm$ equations in the $nm + 1$ unknowns $(x_1, \ldots, x_n, T)$. This implies that there is a continuum of solutions for this problem, and Newton-Raphson will not work because the solutions are nonisolated. In terms of equations, the Jacobian for the above equations is singular at the solution, with a rank deficiency of $1$. This observation is physically consistent: for the oscillator steady state, if $x_s(t)$ is a solution, then $x_s(t + \tau)$ is also a solution for any fixed $\tau$. In order to rectify this, one of the variables is assigned a fixed value, which fixes the phase; then the equations can be solved.
Therefore the system of equations is
\[
\frac{q(x_1) - q(x_n)}{h_1} + f(x_1) + b = 0
\]
\[
\frac{q(x_2) - q(x_1)}{h_2} + f(x_2) + b = 0
\]
\[
\vdots
\]
\[
\frac{q(x_n) - q(x_{n-1})}{h_n} + f(x_n) + b = 0
\]
\[
x_n^{(i)} - x_{n0} = 0
\]
where $x_n^{(i)}$ is the $i$th component of $x_n$ and $x_{n0}$ is its prescribed value, with unknowns
\[
\begin{pmatrix}x_1\\ x_2\\ \vdots\\ x_n\\ T\end{pmatrix}
\]
The Jacobian for this system of equations is
\[
\begin{pmatrix}
\frac{C_1}{h_1} + G_1 & 0 & \cdots & -\frac{C_n}{h_1} & -\frac{q(x_1) - q(x_n)}{h_1 T}\\[4pt]
-\frac{C_1}{h_2} & \frac{C_2}{h_2} + G_2 & & 0 & -\frac{q(x_2) - q(x_1)}{h_2 T}\\
& \ddots & \ddots & & \vdots\\
0 & & -\frac{C_{n-1}}{h_n} & \frac{C_n}{h_n} + G_n & -\frac{q(x_n) - q(x_{n-1})}{h_n T}\\
0 & \cdots & 0 & e_i^T & 0
\end{pmatrix}
\]
(the last column uses $h_i = \alpha_i T$, so the derivative of each residual with respect to $T$ is $-1/T$ times its charge-difference term). The linear system can be efficiently solved using Krylov subspace methods, with the lower triangular portion of the Jacobian as a preconditioner. The Krylov subspace method is then guaranteed to converge in at most $m + 2$ iterations, where $m$ is the circuit size.
7.2.2 Shooting

For the oscillator, the shooting unknowns are the initial condition $x_0$ and the period $T$, together with a phase condition. Evaluating the shooting function again requires a transient analysis over one period; evaluating its Jacobian requires the sensitivities $\frac{dx_n}{dx_0}$ and $\frac{dx_n}{dT}$. Differentiating the backward Euler equations with respect to $x_0$ gives the block lower bidiagonal system
\[
\begin{pmatrix}
\frac{C_1}{h_1} + G_1 & & & \\
-\frac{C_1}{h_2} & \frac{C_2}{h_2} + G_2 & & \\
& \ddots & \ddots & \\
& & -\frac{C_{n-1}}{h_n} & \frac{C_n}{h_n} + G_n
\end{pmatrix}
\begin{pmatrix}
\frac{dx_1}{dx_0}\\[2pt] \frac{dx_2}{dx_0}\\ \vdots\\ \frac{dx_n}{dx_0}
\end{pmatrix}
=
\begin{pmatrix}
\frac{C_0}{h_1}\\ 0\\ \vdots\\ 0
\end{pmatrix}
\]
and differentiating with respect to $T$ (using $h_i = \alpha_i T$) gives
\[
\begin{pmatrix}
\frac{C_1}{h_1} + G_1 & & & \\
-\frac{C_1}{h_2} & \frac{C_2}{h_2} + G_2 & & \\
& \ddots & \ddots & \\
& & -\frac{C_{n-1}}{h_n} & \frac{C_n}{h_n} + G_n
\end{pmatrix}
\begin{pmatrix}
\frac{dx_1}{dT}\\[2pt] \frac{dx_2}{dT}\\ \vdots\\ \frac{dx_n}{dT}
\end{pmatrix}
=
\begin{pmatrix}
\frac{q(x_1) - q(x_0)}{h_1 T}\\[2pt]
\frac{q(x_2) - q(x_1)}{h_2 T}\\
\vdots\\
\frac{q(x_n) - q(x_{n-1})}{h_n T}
\end{pmatrix}
\]
The shooting Jacobian is built from the blocks $\frac{dx_n}{dx_0} - I$ and $\frac{dx_n}{dT}$, bordered by the phase condition. As in the nonautonomous case, forming these blocks explicitly produces a dense matrix. Therefore, instead of directly solving with the dense Jacobian, the linearized time-step equations, the periodicity condition and the phase condition are assembled into the following sparse bordered system, which is solved using a Krylov subspace method:
\[
\begin{pmatrix}
\frac{C_1}{h_1} + G_1 & & & & -\frac{C_0}{h_1} & -\frac{q(x_1) - q(x_0)}{h_1 T}\\[4pt]
-\frac{C_1}{h_2} & \frac{C_2}{h_2} + G_2 & & & 0 & -\frac{q(x_2) - q(x_1)}{h_2 T}\\
& \ddots & \ddots & & \vdots & \vdots\\
& & -\frac{C_{n-1}}{h_n} & \frac{C_n}{h_n} + G_n & 0 & -\frac{q(x_n) - q(x_{n-1})}{h_n T}\\
& & & -I & I & 0\\
& & & e_i^T & 0 & 0
\end{pmatrix}
\begin{pmatrix}
dx_1\\ dx_2\\ \vdots\\ dx_n\\ dx_0\\ dT
\end{pmatrix}
=
\begin{pmatrix}
0\\ 0\\ \vdots\\ 0\\ -F_{sh}\\ 0
\end{pmatrix}
\]
The initial condition and the period are the last $m + 1$ variables of this system of equations, and all the others are discarded.
7.2.3 Harmonic Balance

Let $\omega_0 = \frac{2\pi}{T}$. Then
\[
x(t) = \sum_{i=-\infty}^{\infty}X_i\exp(j i\omega_0 t)
\]
Assume that the Fourier series is truncated to the $k$th harmonic, i.e.,
\[
x(t) \approx \sum_{i=-k}^{k}X_i\exp(j i\omega_0 t)
\]
Since $T$ (equivalently $\omega_0$) is unknown, the harmonic balance equations are augmented with a phase condition and solved for the unknowns $X$ and $\omega_0$, where $P$, $\bar{D}$, $C$ and $G$ are as defined before. The matrix equation can be solved efficiently using a Krylov subspace method. The preconditioner for the regular harmonic balance, appended with an extra row and column with $1$ as the diagonal entry in the last column/row, works well.
Chapter 8

Consider again the circuit equations
\[
\frac{dq(x)}{dt} + f(x) + b(t) = 0 \qquad (8.1)
\]
where $b(t)$ is assumed to be $T$-periodic. Let $x_s(t)$ be the steady-state $T$-periodic solution of this system. Now consider that a small input signal $D(x)u(t)$ is added to the above equation, i.e.,
\[
\frac{dq(x)}{dt} + f(x) + b(t) + D(x)u(t) = 0 \qquad (8.2)
\]
We would like to find the solution of the above equation. From linear perturbation analysis, the solution of the above system is $x_s(t) + x_p(t)$, where $x_p(t)$ is small. Substituting this form of the solution in (8.2), we have
\[
0 = \frac{dq(x_s + x_p)}{dt} + f(x_s + x_p) + b + D(x_s + x_p)u(t)
\]
Expanding $q$, $f$ and $D$ in Taylor series around $x_s$ and ignoring second order terms in the expansion, we have
\[
0 = \frac{d\Big(q(x_s) + \frac{\partial q}{\partial x}\big|_{x_s}x_p(t)\Big)}{dt} + f(x_s) + \frac{\partial f}{\partial x}\Big|_{x_s}x_p(t) + b + \Big(D(x_s) + \frac{\partial D}{\partial x}\Big|_{x_s}x_p(t)\Big)u(t)
\]
Using (8.1) to cancel the steady-state terms, dropping the second order term and writing $C(t) = \frac{\partial q}{\partial x}\big|_{x_s(t)}$ and $G(t) = \frac{\partial f}{\partial x}\big|_{x_s(t)}$,
\[
0 = \frac{d\big(C(t)x_p(t)\big)}{dt} + G(t)x_p(t) + D(x_s)u(t) \qquad (8.3)
\]
where $C$ and $G$ are as defined before, but are now evaluated along $x_s(t)$; note that all the coefficients in the above equation are $T$-periodic. This small signal analysis can be used for so-called periodic AC and periodic noise analyses in circuits.
8.1 Periodic AC Analysis

Just as in AC analysis, $D(x)u(t)$ is a small sinusoidal source $A\exp(j2\pi f_1 t)$ whose frequency $f_1$ is swept over a range. Given that the system described by (8.3) is linear periodic time-varying, we will first establish the generic form of the response $x_p(t)$. Recall that
\[
x_p(t) = \int h(t, s)D(x_s(s))u(s)\,ds
\]
where $h(t, s)$ is the periodically time-varying impulse response. Since $h(t + T, s + T) = h(t, s)$, it can be expanded as
\[
h(t, s) = \sum_{i=-\infty}^{\infty}h_i(t - s)\exp(j2\pi i f_0 s)
\]
where $f_0 = 1/T$ and each $h_i$ has a Fourier transform $H_i(f)$:
\[
h_i(t) = \int H_i(f)\exp(j2\pi f t)\,df
\]
For the sinusoidal input,
\[
x_p(t) = \int h(t, s)A\exp(j2\pi f_1 s)\,ds = \sum_{i=-\infty}^{\infty}\int h_i(t - s)\exp\big(j2\pi(f_1 + i f_0)s\big)A\,ds
= \sum_{i=-\infty}^{\infty}H_i(f_1 + i f_0)A\exp\big(j2\pi(f_1 + i f_0)t\big)
\]
Therefore
\[
x_p(t) = \sum_{i=-\infty}^{\infty}X_{p_i}\exp\big(j2\pi(f_1 + i f_0)t\big) \qquad (8.4)
\]
i.e., the response contains the input frequency $f_1$ and all its sidebands $f_1 + i f_0$.
Discretizing (8.3) in time over one period with Backward Euler, exactly as in the finite difference method, and using $x_{p}(t_0) = x_{p}(t_n)\exp(-j2\pi f_1 T)$ to enforce the form (8.4), we obtain
\[
\begin{pmatrix}
\frac{C_1}{h_1} + G_1 & 0 & \cdots & -\frac{C_n}{h_1}\exp(-j2\pi f_1 T)\\[4pt]
-\frac{C_1}{h_2} & \frac{C_2}{h_2} + G_2 & & 0\\
& \ddots & \ddots & \\
0 & & -\frac{C_{n-1}}{h_n} & \frac{C_n}{h_n} + G_n
\end{pmatrix}
\begin{pmatrix}
x_{p_1}\\ x_{p_2}\\ \vdots\\ x_{p_n}
\end{pmatrix}
= -A
\begin{pmatrix}
\exp(j2\pi f_1 t_1)\\ \exp(j2\pi f_1 t_2)\\ \vdots\\ \exp(j2\pi f_1 t_n)
\end{pmatrix}
\]
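A sketch assembling and solving this system for a scalar ($m = 1$) circuit; the sampled $C(t)$, $G(t)$ waveforms, the frequency $f_1$ and the amplitude $A$ are hypothetical:

```python
import numpy as np

n, T = 64, 1e-6
hs = np.full(n, T / n)
t = np.cumsum(hs)                         # time points t_1 ... t_n
Cs = 1e-9 * (1.0 + 0.3 * np.cos(2 * np.pi * t / T))   # periodic C(t) samples
Gs = 1e-3 * (1.0 + 0.5 * np.sin(2 * np.pi * t / T))   # periodic G(t) samples
f1 = 3e5                                  # sideband frequency being analyzed
A = 1e-3                                  # source amplitude

M = np.zeros((n, n), dtype=complex)
for i in range(n):
    M[i, i] = Cs[i] / hs[i] + Gs[i]
    M[i, i - 1] = -Cs[i - 1] / hs[i]      # i = 0 wraps to the corner block ...
M[0, n - 1] *= np.exp(-2j * np.pi * f1 * T)   # ... with the exp(-j2*pi*f1*T) factor
rhs = -A * np.exp(2j * np.pi * f1 * t)
xp = np.linalg.solve(M, rhs)
print(np.abs(xp[:4]))
```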
The above equation can be solved using Krylov subspace methods with $L^{-1}$ as the preconditioner. The Fourier series expansion of the resulting $x_{p_i}$ yields $X_{p_i}(i f_0 + f_1)$.

This analysis is done at a range of frequencies $f_1$. A naive method would be to solve the above linear equation repeatedly for each $f_1$. However, note that the preconditioned coefficient matrix in the above equation is of the form
\[
I + \alpha(f_1)E
\]
where
\[
E = L^{-1}B
\]
and $\alpha(f_1)$ is a scalar. It can be shown that the Krylov subspace for the family of matrices of the type $I + \alpha E$ is invariant with respect to $\alpha$. Furthermore,
\[
(I + \alpha_2 E)v = \frac{\alpha_2}{\alpha_1}(I + \alpha_1 E)v + \left(1 - \frac{\alpha_2}{\alpha_1}\right)v
\]
Therefore, as $\alpha(f_1)$ is swept, the Krylov subspace vectors need not be regenerated using matrix-vector products: they can be obtained by algebraic manipulations of the basis vectors computed at the previous value of $\alpha$. As $f_1$ varies, the dimension of the Krylov subspace may need to be increased, but in spite of this the reuse results in large savings, especially if the frequency range is large. This is called Krylov subspace recycling.
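The recycling identity is easy to verify numerically ($E$ and $v$ below are random stand-ins):

```python
import numpy as np

# Products with I + a2*E can be formed from already-computed products with
# I + a1*E without touching E again.
rng = np.random.default_rng(3)
E = rng.standard_normal((5, 5))
v = rng.standard_normal(5)
a1, a2 = 0.7, 2.3

lhs = v + a2 * (E @ v)                       # (I + a2*E) v, direct
rhs = (a2 / a1) * (v + a1 * (E @ v)) + (1 - a2 / a1) * v
print(np.linalg.norm(lhs - rhs))             # ~1e-15: the identity holds
```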
8.2 Periodic Noise Analysis

Just as in noise analysis, $D(x_s(t)) \equiv D(x_s(t), f)$ and $u(t)$ are unit uncorrelated white and flicker noise sources, and the autocorrelation of the component of interest is
\[
R_{x_{p_i}}(t_1, t_2) = \int e_i^T h(t_1, s_1)D(x_s(s_1))D^T(x_s(s_1))h^T(t_2, s_1)e_i\,ds_1
\]
Using the fact that $R(t_1, t_2) = R(t_1 + T, t_2 + T)$, rewrite the above equation as
\[
R_{x_{p_i}}(\tau, t_2) = \int e_i^T h(\tau + t_2, s_1)D(x_s(s_1))D^T(x_s(s_1))h^T(t_2, s_1)e_i\,ds_1
\]
Since $R_{x_{p_i}}(\tau, t_2)$ is periodic in $t_2$, let the stationary component of $R_{x_{p_i}}(\tau, t_2)$ be denoted by $R^0_{x_{p_i}}(\tau)$. Then
\[
R^0_{x_{p_i}}(\tau) = \frac{1}{T}\int_0^T R_{x_{p_i}}(\tau, t_2)\,dt_2
= \frac{1}{T}\int_0^T\!\!\int e_i^T h(\tau + t_2, s_1)D(x_s(s_1))D^T(x_s(s_1))h^T(t_2, s_1)e_i\,ds_1\,dt_2 \qquad (8.5)
\]
The corresponding power spectral density is the Fourier transform
\[
S_{x_{p_i}}(f) = \frac{1}{T}\int_0^T\!\!\iint e_i^T h(\tau + t_2, s_1)D(x_s(s_1))D^T(x_s(s_1))h^T(t_2, s_1)e_i\exp(-j2\pi f\tau)\,ds_1\,dt_2\,d\tau
\]
Expand $D$ in a Fourier series:
\[
D(x_s(s_1)) = \sum_{k=-\infty}^{\infty}D_k\exp(j2\pi k f_0 s_1)
\]
Furthermore,
\[
h(t_1, t_2) = h\big((t_1 - t_2) + t_2, t_2\big) = \sum_{k=-\infty}^{\infty}h_k(t_1 - t_2)\exp(j2\pi k f_0 t_2)
\]
with $h_k(t) = \int H_k(f)\exp(j2\pi f t)\,df$ as before. Substituting the expansions of $h$ and $D$ into the expression for $S_{x_{p_i}}(f)$: the integral over $\tau$ produces $\delta(f_1 - f)$, the integral over $s_1$ produces $\delta\big(f_1 + f_2 - (k + l + m + n)f_0\big)$, and the integral of $\exp\big(j2\pi(f_1 + f_2)t_2\big)$ over $t_2 \in [0, T]$ produces the factor
\[
\frac{\exp\big(j2\pi(f_1 + f_2)T\big) - 1}{j2\pi(f_1 + f_2)T}
\]
Carrying out the $f_1$ and $f_2$ integrations against the delta functions,
\[
S_{x_{p_i}}(f) = \sum_{k,l,m,n=-\infty}^{\infty} e_i^T H_k(f)\,D_l D_m^T\,H_n^T\big((k + l + m + n)f_0 - f\big)e_i\;
\frac{\exp\big(j2\pi(k + l + m + n)f_0 T\big) - 1}{j2\pi(k + l + m + n)f_0 T}
\]
The last factor vanishes unless $k + l + m + n = 0$, in which case it equals $1$. Setting $n = -(k + l + m)$,
\[
S_{x_{p_i}}(f) = \sum_{k,l,m=-\infty}^{\infty} e_i^T H_k(f)\,D_l D_m^T\,H^T_{-(k+l+m)}(-f)\,e_i
= \sum_{k,l,m=-\infty}^{\infty} e_i^T H_k(f)\,D_l D_m^T\,H^*_{k+l+m}(f)\,e_i
\]
where $*$ denotes conjugate transpose (for real $h$, $H^T_{-k}(-f) = H^*_k(f)$). $H_k^T e_i$ can be computed using the recycled Krylov subspace method, as in the periodic AC analysis case.
Chapter 9

9.1 Padé Approximation

Consider a transfer function of the form
\[
H(s_0 + \sigma) = l^T(I - \sigma A)^{-1}r
\]
If $A$ were diagonalized, this could be written in terms of its eigenvalues $\lambda_i$ as
\[
H(s_0 + \sigma) = \sum_{i=1}^{n}\frac{f_i g_i}{1 - \lambda_i\sigma}
\]
This is impractical because, as the size of $A$ increases, its diagonalization becomes very expensive. Therefore the above transfer function is approximated by
\[
H_p(\sigma) = \frac{b_0 + b_1\sigma + \ldots + b_{p-1}\sigma^{p-1}}{1 + a_1\sigma + \ldots + a_p\sigma^p}
\]
such that the Taylor series of $H_p(\sigma)$ matches the Taylor series of $H(s_0 + \sigma)$ in at least the first $2p$ terms. The Taylor series expansion of $H(s_0 + \sigma)$ is given by
\[
H(s_0 + \sigma) = l^T(I + \sigma A + \sigma^2 A^2 + \ldots)r = \sum_{i=0}^{\infty}l^T A^i r\,\sigma^i = \sum_{i=0}^{\infty}m_i\sigma^i
\]
where $m_i = l^T A^i r$ are the moments. The matching condition is
\[
\left(\sum_{i=0}^{\infty}m_i\sigma^i\right)\left(1 + \sum_{i=1}^{p}a_i\sigma^i\right) = \sum_{i=0}^{p-1}b_i\sigma^i + O(\sigma^{2p})
\]
Equating the powers $\sigma^p, \ldots, \sigma^{2p-1}$ gives the system determining the $a_i$:
\[
\begin{pmatrix}
m_0 & m_1 & \cdots & m_{p-1}\\
m_1 & m_2 & \cdots & m_p\\
\vdots & & & \vdots\\
m_{p-1} & m_p & \cdots & m_{2p-2}
\end{pmatrix}
\begin{pmatrix}
a_p\\ a_{p-1}\\ \vdots\\ a_1
\end{pmatrix}
= -
\begin{pmatrix}
m_p\\ m_{p+1}\\ \vdots\\ m_{2p-1}
\end{pmatrix}
\qquad (9.1)
\]
and equating the powers $\sigma^0, \ldots, \sigma^{p-1}$ gives the $b_i$:
\[
\begin{pmatrix}
m_0 & 0 & \cdots & 0\\
m_1 & m_0 & \cdots & 0\\
\vdots & & \ddots & \\
m_{p-1} & m_{p-2} & \cdots & m_0
\end{pmatrix}
\begin{pmatrix}
1\\ a_1\\ \vdots\\ a_{p-1}
\end{pmatrix}
=
\begin{pmatrix}
b_0\\ b_1\\ \vdots\\ b_{p-1}
\end{pmatrix}
\]
Once the coefficients are known, $H_p(\sigma)$ can be written in pole-residue form
\[
H_p(\sigma) = \sum_{i=1}^{p}\frac{r_i}{\sigma - p_i}
\]
where $r_i$ is the residue corresponding to the $i$th pole $p_i$.
The problem with this approach is that the explicit computation of the $m_i$ suffers from the same numerical problem we discussed earlier in the generation of Krylov subspaces: the vectors $A^i r$ converge to the dominant eigenvector, and the higher moments quickly lose information in finite precision. Clearly, one should instead obtain the quantities $A^i r$ from the basis vectors of the Krylov subspace generated either by the Arnoldi process or the Lanczos process.
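At toy scale, where explicit moments are still harmless, (9.1) can be exercised directly; $A$, $l$, $r$ below are random stand-ins, and the final line checks the $2p$-term moment match:

```python
import numpy as np

rng = np.random.default_rng(4)
nsz, p = 8, 3
A = rng.standard_normal((nsz, nsz)) / (2 * np.sqrt(nsz))
l, r = rng.standard_normal(nsz), rng.standard_normal(nsz)

# Explicit moments m_i = l^T A^i r (fine here, ill-conditioned for large p)
m = [l @ np.linalg.matrix_power(A, i) @ r for i in range(2 * p)]
Hank = np.array([[m[i + j] for j in range(p)] for i in range(p)])
x = np.linalg.solve(Hank, -np.array(m[p:2 * p]))   # x = (a_p, ..., a_1)
a = x[::-1]                                        # a_1, ..., a_p
# b_i = m_i + sum_{j=1}^{i} a_j m_{i-j}  (lower triangular system above)
b = [m[i] + sum(a[j] * m[i - 1 - j] for j in range(i)) for i in range(p)]

sigma = 1e-2
Hp = (sum(b[i] * sigma**i for i in range(p))
      / (1 + sum(a[i] * sigma**(i + 1) for i in range(p))))
Hs = sum(m[i] * sigma**i for i in range(2 * p))
print(f"Hp - Taylor = {Hp - Hs:.2e}  (should be O(sigma^{2 * p}))")
```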
9.2 Padé Via Lanczos

1. Initialization: set
\[
\beta_1 = \|r\|_2 \qquad \xi_1 = \|l\|_2
\]
\[
b_1 = \frac{r}{\beta_1} \qquad c_1 = \frac{l}{\xi_1}
\]
\[
b_0 = 0 \qquad c_0 = 0 \qquad \delta_0 = 1
\]
2. For $n = 1, 2, \ldots$:

(a), (b) compute the recurrence coefficients $\alpha_n$, $\beta_n$ and $\gamma_n$, exactly as in the Lanczos algorithm described earlier;
(c) set
\[
\tilde{b} = Ab_n - \alpha_n b_n - \beta_n b_{n-1}
\]
\[
\tilde{c} = A^T c_n - \alpha_n c_n - \gamma_n c_{n-1}
\]
(d) set
\[
\beta_{n+1} = \|\tilde{b}\|_2 \qquad \xi_{n+1} = \|\tilde{c}\|_2
\]
\[
b_{n+1} = \frac{\tilde{b}}{\beta_{n+1}} \qquad c_{n+1} = \frac{\tilde{c}}{\xi_{n+1}}
\]
Note that the only difference in the above algorithm compared to the one described earlier is that the Krylov subspace basis vectors are normalized at every step. The reason for this normalization will be explained later on.
Recall that

1. $\{c_n\}$ and $\{b_n\}$ are biorthogonal: $c_i^T b_j = 0$ for $i \neq j$; let $\delta_i = c_i^T b_i$.

2. if $T_p$ denotes the tridiagonal matrix of recurrence coefficients
\[
T_p =
\begin{pmatrix}
\alpha_1 & \beta_2 & & 0\\
\gamma_2 & \alpha_2 & \ddots & \\
& \ddots & \ddots & \beta_p\\
0 & & \gamma_p & \alpha_p
\end{pmatrix}
\]
and $\tilde{T}_p$ its counterpart from the $c$-recurrence, then
\[
AB_p = B_p T_p + \begin{pmatrix}0 & \cdots & 0 & b_{p+1}\beta_{p+1}\end{pmatrix}
\]
\[
A^T C_p = C_p\tilde{T}_p + \begin{pmatrix}0 & \cdots & 0 & c_{p+1}\xi_{p+1}\end{pmatrix}
\]
where as before
\[
B_p = \begin{pmatrix}b_1 & b_2 & \cdots & b_p\end{pmatrix} \qquad
C_p = \begin{pmatrix}c_1 & c_2 & \cdots & c_p\end{pmatrix}
\]

3. since the basis vectors are normalized at every step, the two tridiagonal matrices are not equal (or conjugate in the complex case) but are related to each other by
\[
\tilde{T}_p^T = D_p T_p D_p^{-1}
\]
where
\[
D_p = \operatorname{diag}(\delta_1, \ldots, \delta_p)
\]
Now consider the evaluation of $A^i r$:
\[
A^i r = \beta_1 A^i b_1 = \beta_1 A^i B_p e_1 = \beta_1 B_p T_p^i e_1
\]
(for $i < p$, since the powers of $A$ applied to $b_1$ remain in the span of $B_p$). Similarly,
\[
l^T A^i = \big((A^T)^i l\big)^T = \xi_1\big((A^T)^i c_1\big)^T = \xi_1\big(C_p\tilde{T}_p^i e_1\big)^T = \xi_1 e_1^T\big(\tilde{T}_p^T\big)^i C_p^T = \xi_1 e_1^T D_p T_p^i D_p^{-1}C_p^T
\]
Splitting a moment as $m_i = l^T A^{i'}\cdot A^{i''}r$ with $i = i' + i''$ and using the biorthogonality $C_p^T B_p = D_p$,
\[
m_i = \xi_1\beta_1\,e_1^T D_p T_p^{i'}D_p^{-1}\,C_p^T B_p\,T_p^{i''}e_1 = \xi_1\beta_1\delta_1\,e_1^T T_p^i e_1
\]
For the Lanczos process without normalization, the corresponding diagonal matrix is arbitrary and need not cancel in this fashion, so one cannot in general write $m_i = k\,e_1^T T_p^i e_1$ for some constant $k$, which is critical for the Padé approximation.
Therefore, after running the Lanczos process for $p$ steps, the moments $m_0, \ldots, m_{2p-1}$ are reproduced. However, we need not explicitly solve (9.1). To see this, first note that
\[
l^T r\,e_1^T(I - \sigma T_p)^{-1}e_1 = \sum_{i=0}^{\infty}l^T r\,e_1^T T_p^i e_1\,\sigma^i
= \sum_{i=0}^{2p-1}l^T A^i r\,\sigma^i + O\big(\sigma^{2p}\big)
= \sum_{i=0}^{2p-1}m_i\sigma^i + O\big(\sigma^{2p}\big)
\]
Therefore
\[
H_p(\sigma) = l^T r\,e_1^T(I - \sigma T_p)^{-1}e_1
\]
Let $T_p$ be diagonalized as
\[
T_p = W_p\Lambda_p W_p^{-1}
\]
Then
\[
H_p(\sigma) = l^T r\,e_1^T W_p(I - \sigma\Lambda_p)^{-1}W_p^{-1}e_1 = l^T r\,\mu^T(I - \sigma\Lambda_p)^{-1}\nu
\]
where obviously
\[
\mu = W_p^T e_1 \qquad \nu = W_p^{-1}e_1
\]
Therefore
\[
H_p(\sigma) = \sum_{j=1}^{p}\frac{l^T r\,\mu_j\nu_j}{1 - \sigma\lambda_{p,j}}
= -\sum_{j=1}^{p}\frac{l^T r\,\mu_j\nu_j}{\lambda_{p,j}}\cdot\frac{1}{\sigma - 1/\lambda_{p,j}}
\]
Therefore the poles and residues of the approximate transfer function are readily available: the poles are $1/\lambda_{p,j}$ and the residue at the $j$th pole is $-l^T r\,\mu_j\nu_j/\lambda_{p,j}$. If the expansion is to be accurate over some frequency range $-f_{max} \leq f \leq f_{max}$, then it is recommended that
\[
s_0 = 2\pi f_{max}
\]
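A minimal Padé-via-Lanczos sketch is given below; the text only outlines the recurrence coefficients, so the oblique biorthogonal projection used here is one plausible realization, and $A$, $l$, $r$ are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(5)
nsz, p = 30, 6
A = rng.standard_normal((nsz, nsz)) / (2 * np.sqrt(nsz))
r, l = rng.standard_normal(nsz), rng.standard_normal(nsz)

# Build 2-norm-normalized, biorthogonal bases with a three-term recurrence;
# in exact arithmetic only the last two dual vectors need to be projected out.
B = [r / np.linalg.norm(r)]
C = [l / np.linalg.norm(l)]
for j in range(p - 1):
    w, v = A @ B[j], A.T @ C[j]
    for i in ([j - 1, j] if j > 0 else [j]):
        d = C[i] @ B[i]                     # delta_i = c_i^T b_i
        w = w - B[i] * (C[i] @ w) / d
        v = v - C[i] * (B[i] @ v) / d
    B.append(w / np.linalg.norm(w))
    C.append(v / np.linalg.norm(v))
Bp, Cp = np.column_stack(B), np.column_stack(C)

Dp = np.diag(np.diag(Cp.T @ Bp))
Tp = np.linalg.solve(Dp, Cp.T @ A @ Bp)     # T_p = D_p^{-1} C_p^T A B_p
e1 = np.zeros(p); e1[0] = 1.0
lam, W = np.linalg.eig(Tp)
mu, nu = W.T @ e1, np.linalg.solve(W, e1)

# Hp(sigma) from poles/residues versus the exact transfer function
sigma = 0.05
Hp = (l @ r) * np.sum(mu * nu / (1 - sigma * lam))
Hex = l @ np.linalg.solve(np.eye(nsz) - sigma * A, r)
print(f"Hp = {Hp.real:+.8f}   exact = {Hex:+.8f}")
```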