Notes On The Matrix Exponential and Logarithm
Howard E. Haber
Santa Cruz Institute for Particle Physics
University of California, Santa Cruz, CA 95064, USA
May 2, 2023
Abstract
In these notes, we summarize some of the most important properties of the matrix
exponential and the matrix logarithm. Nearly all of the results of these notes are well
known and many are treated in textbooks on Lie groups. A few advanced textbooks on
matrix algebra also cover some of the topics of these notes. Some of the results concerning
the matrix logarithm are less well known. These include a series expansion representation
of d ln A(t)/dt (where A(t) is a matrix that depends on a parameter t), which is derived
here but does not seem to appear explicitly in the mathematics literature.
The matrix exponential of an n × n complex matrix A can be defined by its Taylor series,

e^A ≡ Σ_{n=0}^∞ A^n/n! = I + A + ½A² + ··· , (1)

where I is the n × n identity matrix. The radius of convergence of the above series is infinite.
Consequently, eq. (1) converges for all matrices A. In these notes, we discuss a number of
key results involving the matrix exponential and provide proofs of three important theorems.
First, we consider some elementary properties.
Property 1: If [A, B] ≡ AB − BA = 0, then

e^{A+B} = e^A e^B = e^B e^A . (2)
This result can be proved directly from the definition of the matrix exponential given by eq. (1).
The details are left to the ambitious reader.
Remarkably, the converse of Property 1 is FALSE. One counterexample is sufficient. Consider the 2 × 2 complex matrices

A = [0 0 ; 0 2πi] ,  B = [0 0 ; 1 2πi] . (3)

A direct computation then yields

e^A = e^B = e^{A+B} = I , (4)
where I is the 2 × 2 identity matrix. Hence, eq. (2) is satisfied. Nevertheless, it is a simple
matter to check that AB ≠ BA, i.e., [A, B] ≠ 0.
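The counterexample of eqs. (3) and (4) is easy to confirm numerically. The following sketch (not part of the original notes) uses scipy.linalg.expm; the matrix names and tolerances are illustrative choices:

```python
# Numerical check of the counterexample in eqs. (3)-(4).
import numpy as np
from scipy.linalg import expm

A = np.array([[0, 0], [0, 2j * np.pi]])
B = np.array([[0, 0], [1, 2j * np.pi]])
I = np.eye(2)

# e^A, e^B and e^{A+B} all equal the 2x2 identity matrix ...
assert np.allclose(expm(A), I)
assert np.allclose(expm(B), I)
assert np.allclose(expm(A + B), I)

# ... and yet A and B do not commute: [A, B] != 0.
assert not np.allclose(A @ B, B @ A)
```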
Indeed, one can use the above counterexample to construct a second counterexample that
employs only real matrices. Here, we make use of the well known isomorphism between the
complex numbers and real 2 × 2 matrices, which is given by the mapping
z = a + ib ↦ [a b ; −b a] . (5)
It is straightforward to check that this isomorphism respects the multiplication law of two
complex numbers. Using eq. (5), we can replace each complex number in eq. (3) with the
corresponding real 2 × 2 matrix,
A = [0 0 0 0 ; 0 0 0 0 ; 0 0 0 2π ; 0 0 −2π 0] ,  B = [0 0 0 0 ; 0 0 0 0 ; 1 0 0 2π ; 0 1 −2π 0] .
One can again check that eq. (4) is satisfied, where I is now the 4 × 4 identity matrix, whereas
AB ≠ BA as before.
It turns out that a small modification of Property 1 is sufficient to avoid any such coun-
terexamples.
Property 2: If e^{t(A+B)} = e^{tA} e^{tB} = e^{tB} e^{tA} for all t within some open interval (a, b) of the real line, then it follows that [A, B] = 0.
The above result can be derived simply by making use of the Taylor series definition [cf. eq. (1)]
for the matrix exponential.
Property 4 can be verified by employing the matrix logarithm, which is treated in Sections 4
and 5 of these notes.
This result is self-evident, since it replicates the well-known result for ordinary (commuting)
functions. Note that Theorem 2 below generalizes this result to the case of [A(t), dA/dt] ≠ 0.
Property 6: If [A, [A, B]] = 0, then e^A B e^{−A} = B + [A, B].
To prove this result, we define

B(t) ≡ e^{tA} B e^{−tA} ,

and compute

dB(t)/dt = A e^{tA} B e^{−tA} − e^{tA} B e^{−tA} A = [A, B(t)] ,

d²B(t)/dt² = A² e^{tA} B e^{−tA} − 2A e^{tA} B e^{−tA} A + e^{tA} B e^{−tA} A² = [A, [A, B(t)]] .
By assumption, [A, [A, B]] = 0, which must also be true if one replaces A → tA for any
number t. Hence, it follows that [A, [A, B(t)]] = 0, and we can conclude that d²B(t)/dt² = 0.
It then follows that B(t) is a linear function of t, which can be written as

B(t) = B(0) + t (dB(t)/dt)|_{t=0} .
By setting t = 1, we arrive at the desired result. If the double commutator does not vanish,
then one obtains a more general result, which is presented in Theorem 1 below.
If [A, B] ≠ 0, then e^A e^B ≠ e^{A+B} in general. The general result is called the Baker–Campbell–Hausdorff
formula, which will be proved in Theorem 4 below. Here, we shall prove a somewhat simpler
version.
Property 7: If [A, [A, B]] = [B, [A, B]] = 0, then

e^A e^B = exp(A + B + ½[A, B]) . (8)
To prove eq. (8), define F(t) ≡ e^{tA} e^{tB}. We shall now derive a differential equation for F(t). Taking the derivative of F(t) with respect
to t yields

dF/dt = A e^{tA} e^{tB} + e^{tA} e^{tB} B = A F(t) + e^{tA} B e^{−tA} F(t) = (A + B + t[A, B]) F(t) , (9)
after noting that B commutes with e^{tB} and employing eq. (7). By assumption, both A and
B, and hence their sum, commute with [A, B]. Thus, in light of Property 5 above, it follows
that the solution to eq. (9) is

F(t) = exp(t(A + B) + ½t²[A, B]) F(0) .

Setting t = 1 and noting that F(0) = I then yields eq. (8).
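Property 7 can be spot-checked numerically with nilpotent matrices whose double commutators vanish. The following sketch (an illustration of ours, not from the notes) uses multiples of the elementary matrices E₁₂ and E₂₃:

```python
# Numerical check of Property 7 / eq. (8), using 3x3 strictly upper-triangular
# matrices for which [A,[A,B]] = [B,[A,B]] = 0.
import numpy as np
from scipy.linalg import expm

A = np.array([[0., 2., 0.], [0., 0., 0.], [0., 0., 0.]])  # 2*E_12
B = np.array([[0., 0., 0.], [0., 0., 3.], [0., 0., 0.]])  # 3*E_23
comm = A @ B - B @ A                                      # [A, B] = 6*E_13

# Both double commutators vanish, so eq. (8) applies.
assert np.allclose(A @ comm - comm @ A, 0)
assert np.allclose(B @ comm - comm @ B, 0)

# e^A e^B = exp(A + B + [A,B]/2), whereas e^{A+B} alone differs from e^A e^B.
assert np.allclose(expm(A) @ expm(B), expm(A + B + comm / 2))
assert not np.allclose(expm(A) @ expm(B), expm(A + B))
```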
However, not all matrices are diagonalizable. One can modify the above derivation by
employing the Jordan canonical form. But, here I prefer another technique that is applicable
to all matrices, whether or not they are diagonalizable. The idea is to consider the function det e^{At} and compute

det e^{A(t+δt)} = det(e^{At} e^{Aδt}) = det e^{At} det e^{Aδt} = det e^{At} det(I + Aδt) , (11)
Taking the limit as δt → 0 yields the differential equation,

(d/dt) det e^{At} = Tr A det e^{At} . (12)
The solution to this equation is

ln det e^{At} = t Tr A , (13)
where the constant of integration has been determined by noting that (det e^{At})|_{t=0} = det I = 1.
Exponentiating eq. (13), we end up with

det e^{At} = exp(t Tr A) .

More generally, if A(t) is a non-singular matrix that depends on the parameter t, then

(d/dt) det A(t) = det A(t) Tr(A^{−1} dA/dt) . (14)

Note that for A(t) = e^{At}, eq. (14) reduces to eq. (12) derived above. Another result related to
eq. (14) is

(d/dt) det(A + tB)|_{t=0} = det A Tr(A^{−1}B) .
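Both determinant identities above, det e^{tA} = exp(t Tr A) and (d/dt) det(A + tB)|_{t=0} = det A Tr(A^{−1}B), are easy to verify numerically. A sketch with an arbitrary random matrix (seed, size, and tolerances are our own illustrative choices):

```python
# Numerical spot-check of the determinant identities.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
t = 0.7

# det e^{tA} = exp(t Tr A)
assert np.isclose(np.linalg.det(expm(t * A)), np.exp(t * np.trace(A)))

# d/dt det(A + tB)|_{t=0} = det A * Tr(A^{-1} B), via a central difference
h = 1e-6
num = (np.linalg.det(A + h * B) - np.linalg.det(A - h * B)) / (2 * h)
exact = np.linalg.det(A) * np.trace(np.linalg.inv(A) @ B)
assert np.isclose(num, exact, rtol=1e-4, atol=1e-6)
```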
The nth power of the adjoint operator, (ad_A)^n(B) ≡ [A, [A, ··· , [A, B] ··· ]], involves n nested commutators. We also introduce two auxiliary functions f(z) and g(z) that are defined by their power series:¹

f(z) = (e^z − 1)/z = Σ_{n=0}^∞ z^n/(n + 1)! , |z| < ∞ , (17)

g(z) = ln z/(z − 1) = Σ_{n=0}^∞ (1 − z)^n/(n + 1) , |1 − z| < 1 . (18)

¹See also Chapters 2 and 5 of Ref. [2] and Chapter 3 of Ref. [3].
Theorem 1:

e^A B e^{−A} = exp(ad_A)(B) ≡ Σ_{n=0}^∞ (1/n!)(ad_A)^n(B) = B + [A, B] + ½[A, [A, B]] + ··· . (21)
Proof: Define

B(t) ≡ e^{tA} B e^{−tA} , (22)
and compute the Taylor series of B(t) around the point t = 0. A simple computation yields
B(0) = B and
dB(t)/dt = A e^{tA} B e^{−tA} − e^{tA} B e^{−tA} A = [A, B(t)] = ad_A(B(t)) . (23)
Higher derivatives can also be computed. It is a simple exercise to show that:
d^n B(t)/dt^n = (ad_A)^n(B(t)) . (24)
Theorem 1 then follows by substituting t = 1 in the resulting Taylor series expansion of B(t).
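Theorem 1 is straightforward to check numerically by truncating the nested-commutator series. A sketch (the 30-term truncation and random matrices are our own illustrative choices):

```python
# Numerical check of Theorem 1: e^A B e^{-A} = sum_n (ad_A)^n(B) / n!
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = 0.5 * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

lhs = expm(A) @ B @ expm(-A)

# Build the truncated series; `term` holds (ad_A)^n(B) / n! at step n.
rhs = np.zeros_like(B)
term = B.copy()
for n in range(30):
    rhs += term
    term = (A @ term - term @ A) / (n + 1)

assert np.allclose(lhs, rhs)
```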
Theorem 2:

e^{A(t)} (d/dt) e^{−A(t)} = −f(ad_A)(dA/dt) = −dA/dt − (1/2!)[A, dA/dt] − (1/3!)[A, [A, dA/dt]] − ··· , (25)
where f(z) is defined via its Taylor series in eq. (17). Note that in general, A(t) does not
commute with dA/dt. A simple example, A(t) = A + tB where A and B are independent of
t and [A, B] ≠ 0, illustrates this point. In the special case where [A(t), dA/dt] = 0, eq. (25)
reduces to

e^{A(t)} (d/dt) e^{−A(t)} = −dA/dt , if [A(t), dA/dt] = 0 . (26)
Proof: Define

B(s, t) ≡ e^{sA(t)} (d/dt) e^{−sA(t)} , (27)
and compute the Taylor series of B(s, t) around the point s = 0. It is straightforward to verify
that B(0, t) = 0 and
(d^n B(s, t)/ds^n)|_{s=0} = −(ad_{A(t)})^{n−1}(dA/dt) , (28)
for all positive integers n. Assembling the Taylor series for B(s, t) and inserting s = 1 then
yields Theorem 2. Note that if [A(t), dA/dt] = 0, then (dn B(s, t)/dsn )s=0 = 0 for all n ≥ 2,
and we recover the result of eq. (26).
There are two additional forms of Theorem 2, which we now state for completeness.
Theorem 2(a):

(d/dt) e^{A(t)} = e^{A(t)} f̃(ad_A)(dA/dt) , (29)

where f̃(z) is defined via its Taylor series,

f̃(z) = (1 − e^{−z})/z = Σ_{n=0}^∞ ((−1)^n/(n + 1)!) z^n , |z| < ∞ . (30)
Theorem 2(b):

(d/dt) e^{A+tB}|_{t=0} = e^A f̃(ad_A)(B) , (31)
where f̃(z) is defined via its Taylor series in eq. (30) and A and B are independent of t.
Eq. (31) defines the Gâteaux derivative of e^A (also called the directional derivative of e^A along
the direction of B).²
Proof: Define A(t) ≡ A + tB, and use eq. (29).
Corollary:

exp(A + εB) = e^A (1 + ε f̃(ad_A)(B)) + O(ε²) . (32)
Proof: Starting from Theorem 2(b), let us denote the right-hand side of eq. (31) by

F(A, B) ≡ e^A f̃(ad_A)(B) . (33)
Then, using the definition of the derivative, it follows that

(d/dt) e^{A+tB}|_{t=0} = lim_{ε→0} (e^{A+(t+ε)B} − e^{A+tB})/ε |_{t=0} = lim_{ε→0} (e^{A+εB} − e^A)/ε = F(A, B) . (34)
²In the present application, the Gâteaux derivative exists, is a linear function of B, and is continuous in A,
in which case it coincides with the Fréchet derivative [11].
In particular, eq. (34) implies that,

e^{A+tB} = e^A + t e^A f̃(ad_A)(B) + O(t²) . (35)
Theorem 3:

(d/dt) e^{−A(t)} = −∫_0^1 e^{−sA} (dA/dt) e^{−(1−s)A} ds . (37)
This integral representation is an alternative version of Theorem 2.
Proof: Consider

(d/ds)(e^{−sA} e^{−(1−s)B}) = −A e^{−sA} e^{−(1−s)B} + e^{−sA} e^{−(1−s)B} B = e^{−sA} (B − A) e^{−(1−s)B} . (38)
Finally, we note that the definition of the derivative can be used to write:
(d/dt) e^{−A(t)} = lim_{h→0} (e^{−A(t+h)} − e^{−A(t)})/h . (43)
Using

A(t + h) = A(t) + h dA/dt + O(h²) , (44)
it follows that:
(d/dt) e^{−A(t)} = lim_{h→0} ( exp[−(A(t) + h dA/dt)] − exp[−A(t)] ) / h . (45)
Thus, we can use the result of eq. (42) with B = dA/dt to obtain
(d/dt) e^{−A(t)} = −∫_0^1 e^{−sA} (dA/dt) e^{−(1−s)A} ds , (46)
which is the result quoted in Theorem 3.
As in the case of Theorem 2, there are two additional forms of Theorem 3, which we now
state for completeness.
Theorem 3(a):

(d/dt) e^{A(t)} = ∫_0^1 e^{(1−s)A} (dA/dt) e^{sA} ds . (47)
This follows immediately from eq. (37) by taking A → −A and s → 1 − s. In light of eqs. (34)
and (36), it follows that,
Theorem 3(b):

(d/dt) e^{A+tB}|_{t=0} = ∫_0^1 e^{(1−s)A} B e^{sA} ds . (48)
Second proof of Theorems 2 and 2(b): One can now derive Theorem 2 directly from
Theorem 3. First, we multiply eq. (37) by eA(t) and change the integration variable, s → 1 − s.
Then, we employ eq. (21) to obtain,
e^{A(t)} (d/dt) e^{−A(t)} = −∫_0^1 e^{sA} (dA/dt) e^{−sA} ds = −∫_0^1 exp(ad_{sA})(dA/dt) ds

= −∫_0^1 exp(s ad_A)(dA/dt) ds = −Σ_{n=0}^∞ (1/n!)(ad_A)^n(dA/dt) ∫_0^1 s^n ds

= −Σ_{n=0}^∞ (1/(n + 1)!)(ad_A)^n(dA/dt) = −f(ad_A)(dA/dt) , (49)
Likewise, starting from eq. (48) and making use of eq. (21), it follows that,

(d/dt) e^{A+tB}|_{t=0} = e^A ∫_0^1 e^{−sA} B e^{sA} ds = e^A ∫_0^1 exp(ad_{−sA})(B) ds

= e^A ∫_0^1 exp(−s ad_A)(B) ds = e^A Σ_{n=0}^∞ ((−1)^n/n!)(ad_A)^n(B) ∫_0^1 s^n ds

= e^A Σ_{n=0}^∞ ((−1)^n/(n + 1)!)(ad_A)^n(B) = e^A f̃(ad_A)(B) , (50)
Theorem 4 (the Baker–Campbell–Hausdorff formula):

ln(e^A e^B) = B + ∫_0^1 g(exp(t ad_A) exp(ad_B))(A) dt , (51)

where g(z) is defined via its Taylor series in eq. (18). Since g(z) is only defined for |1 − z| < 1, it
follows that the BCH formula for ln(e^A e^B) converges provided that ‖e^A e^B − I‖ < 1, where I is
the identity matrix and ‖ ··· ‖ is a suitably defined matrix norm. Expanding the BCH formula,
using the Taylor series definition of g(z), yields:

e^A e^B = exp(A + B + ½[A, B] + (1/12)[A, [A, B]] + (1/12)[B, [B, A]] + ···) , (52)
assuming that the resulting series is convergent. An example where the BCH series does not
converge occurs for the following elements of SL(2,R):

M = [−e^{−λ} 0 ; 0 −e^{λ}] = exp(−λ [1 0 ; 0 −1]) exp(π [0 1 ; −1 0]) , (53)

where λ is any nonzero real number. It is easy to prove³ that no matrix C exists such that
M = exp C. Nevertheless, the BCH formula is guaranteed to converge in a neighborhood of
the identity of any Lie group.
³The characteristic equation of any 2 × 2 matrix A is given by λ² − (Tr A)λ + det A = 0.
Hence, the eigenvalues of any 2 × 2 traceless matrix A ∈ sl(2,R) [that is, A is an element of the Lie algebra of
SL(2,R)] are given by λ± = ±(−det A)^{1/2}. Then,

Tr e^A = exp(λ₊) + exp(λ₋) = { 2 cosh(|det A|^{1/2}) , if det A ≤ 0 ; 2 cos((det A)^{1/2}) , if det A > 0 } .

Thus, if det A ≤ 0, then Tr e^A ≥ 2, and if det A > 0, then −2 ≤ Tr e^A < 2. It follows that for any A ∈ sl(2,R),
Tr e^A ≥ −2. For the matrix M defined in eq. (53), Tr M = −2 cosh λ < −2 for any nonzero real λ. Hence,
no matrix C exists such that M = exp C. Further details can be found in Section 3.4 of Ref. [7] and in Section
10.5(b) of Ref. [8].
Two corollaries of the BCH formula are noteworthy.
Corollary 1: The Trotter product formula,

lim_{k→∞} (e^{A/k} e^{B/k})^k = e^{A+B} . (54)
The proofs of eqs. (54) and (55) can be found, e.g., in Section 3.4.1 of Ref. [3].
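The Trotter product formula, eq. (54), is easy to illustrate numerically: the error of the k-step product shrinks roughly like 1/k. A sketch (the random matrices, scale factor, and error threshold are our own illustrative choices):

```python
# Numerical illustration of the Trotter product formula, eq. (54).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
A = 0.5 * rng.standard_normal((3, 3))
B = 0.5 * rng.standard_normal((3, 3))

target = expm(A + B)

def trotter(k):
    """Return (e^{A/k} e^{B/k})^k."""
    step = expm(A / k) @ expm(B / k)
    return np.linalg.matrix_power(step, k)

errs = [np.linalg.norm(trotter(k) - target) for k in (10, 100, 1000)]
assert errs[0] > errs[1] > errs[2]   # the error decreases with k ...
assert errs[2] < 1e-2                # ... and is small for k = 1000
```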
or equivalently,

e^{C(t)} = e^{tA} e^B . (57)

Using Theorem 1, it follows that for any complex n × n matrix H,

exp(ad_{C(t)})(H) = e^{C(t)} H e^{−C(t)} = e^{tA} e^B H e^{−B} e^{−tA} = exp(t ad_A) exp(ad_B)(H) ,

after noting that exp(ad_{tA}) = exp(t ad_A). That is, we have the operator equation

exp(ad_{C(t)}) = exp(t ad_A) exp(ad_B) . (59)

Next, we use Theorem 2 to write:
e^{C(t)} (d/dt) e^{−C(t)} = −f(ad_{C(t)})(dC/dt) . (60)
However, we can compute the left-hand side of eq. (60) directly:

e^{C(t)} (d/dt) e^{−C(t)} = e^{tA} e^B (d/dt)(e^{−B} e^{−tA}) = e^{tA} (d/dt) e^{−tA} = −A , (61)

since B is independent of t, and tA commutes with (d/dt)(tA). Combining the results of eqs. (60)
and (61),
A = f(ad_{C(t)})(dC/dt) . (62)
Multiplying both sides of eq. (62) by g(exp(ad_{C(t)})) and using eq. (20) yields:

dC/dt = g(exp(ad_{C(t)}))(A) . (63)
Employing the operator equation, eq. (59), one may rewrite eq. (63) as:

dC/dt = g(exp(t ad_A) exp(ad_B))(A) , (64)
which is a differential equation for C(t). Integrating from t = 0 to t = T, one easily solves
for C. The end result is

C(T) = B + ∫_0^T g(exp(t ad_A) exp(ad_B))(A) dt , (65)

where the constant of integration, B, has been obtained by setting T = 0. Finally, setting
T = 1 in eq. (65) yields the BCH formula.
It is instructive to use eq. (51) to obtain the terms exhibited in eq. (52). In light of the
series definition of g(z) given in eq. (18), we need to compute

I − exp(t ad_A) exp(ad_B) = −(t ad_A + ad_B + ½t² ad_A² + t ad_A ad_B + ½ ad_B²) , (66)

and

[I − exp(t ad_A) exp(ad_B)]² = ad_B² + t ad_A ad_B + t ad_B ad_A + t² ad_A² , (67)

after dropping cubic terms and higher. Hence, using eq. (18), one can expand g(exp(t ad_A) exp(ad_B))(A) to second order in the ad operators; inserting the result into eq. (65) and integrating over t then confirms the result of eq. (52).
The Zassenhaus formula for matrix exponentials is sometimes referred to as the dual of the
Baker–Campbell–Hausdorff formula. It provides an expression for exp(A + B) as an infinite
product of matrix exponentials. It is convenient to insert a parameter t into the argument of
the exponential. Then, the Zassenhaus formula is given by

exp(t(A + B)) = e^{tA} e^{tB} exp(−½t²[A, B]) exp((1/6)t³(2[B, [A, B]] + [A, [A, B]])) ··· . (70)
where the C_n are defined recursively as

C₂ = (1/2) (∂²/∂t²)(e^{−tB} e^{−tA} e^{t(A+B)})|_{t=0} = −½[A, B] , (72)

C₃ = (1/3!) (∂³/∂t³)(e^{−t²C₂} e^{−tB} e^{−tA} e^{t(A+B)})|_{t=0} = −(1/3)[A + 2B, C₂] , (73)

and in general

C_n = (1/n!) (∂^n/∂t^n)(e^{−t^{n−1}C_{n−1}} ··· e^{−t²C₂} e^{−tB} e^{−tA} e^{t(A+B)})|_{t=0} . (74)
We can now rederive eq. (32), which we repeat here for the reader's convenience.
Corollary:

exp(A + εB) = e^A (1 + ε f̃(ad_A)(B)) + O(ε²) . (75)

Proof: In eq. (71), replace A → A/t and t → ε. Then it follows immediately that

t^{n+1} C_{n+1} = −(1/(n + 1)) ε [A, C_n] + O(ε²) = ((−1)^n/(n + 1)!) ε (ad_A)^n(B) + O(ε²) . (78)
3 Properties of the Matrix Logarithm
The matrix logarithm should be an inverse function to the matrix exponential. However, in
light of the fact that the complex logarithm is a multi-valued function, the concept of the
matrix logarithm is not as straightforward as was the case of the matrix exponential. Let A
be a complex n × n matrix with no real negative or zero eigenvalues. Then, there is a unique
logarithm, denoted by ln A, all of whose eigenvalues lie in the strip, −π < Im z < π of the
complex z-plane. We refer to ln A as the principal logarithm of A, which is defined on the cut
complex plane, where the cut runs from the origin along the negative real axis. If A is a real
matrix (subject to the conditions just stated), then its principal logarithm is real.⁴
For an n × n complex matrix A, we can define ln A via its Taylor series expansion, under
the assumption that the series converges. The matrix logarithm is then defined as,
ln A = Σ_{m=1}^∞ (−1)^{m+1} (A − I)^m/m , (80)
whenever the series converges, where I is the n × n identity matrix. The series converges
whenever ‖A − I‖ < 1, where ‖ ··· ‖ indicates a suitable matrix norm.⁵ If the matrix A
satisfies (A − I)^m = 0 for all integers m > N (where N is some fixed positive integer), then
A − I is called nilpotent and A is called unipotent. If A is unipotent, then the series given
by eq. (80) terminates, and ln A is well defined independently of the value of ‖A − I‖. For
later use, we also note that if ‖A − I‖ < 1, then A is non-singular, and A^{−1} can be
expressed as an infinite geometric series,

A^{−1} = Σ_{m=0}^∞ (I − A)^m . (81)
One can also define the matrix logarithm by employing the Gregory series,⁶

ln A = −2 Σ_{m=0}^∞ (1/(2m + 1)) [(I − A)(I + A)^{−1}]^{2m+1} , (82)
which converges under the assumption that all eigenvalues of A possess a positive real part. In
particular, eq. (82) converges for any Hermitian positive definite matrix A. Hence, the region
of convergence of the series in eq. (82) is considerably larger than the corresponding region of
convergence of eq. (80).
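Both series can be compared numerically against a library implementation of the principal matrix logarithm. A sketch using scipy.linalg.logm (the test matrix and truncation orders are our own illustrative choices):

```python
# Compare the Taylor series, eq. (80), and the Gregory series, eq. (82),
# for ln A against scipy.linalg.logm.
import numpy as np
from scipy.linalg import logm

I = np.eye(2)
A = np.array([[1.2, 0.3], [0.1, 0.9]])   # ||A - I|| < 1 for this choice

# Taylor series, eq. (80): sum_m (-1)^{m+1} (A-I)^m / m
log1 = np.zeros_like(A)
P = I.copy()                             # P holds (A - I)^m
for m in range(1, 60):
    P = P @ (A - I)
    log1 += (-1) ** (m + 1) * P / m
assert np.allclose(log1, logm(A))

# Gregory series, eq. (82), with X = (I - A)(I + A)^{-1}
X = (I - A) @ np.linalg.inv(I + A)
log2 = np.zeros_like(A)
for m in range(60):
    log2 += -2.0 * np.linalg.matrix_power(X, 2 * m + 1) / (2 * m + 1)
assert np.allclose(log2, logm(A))
```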
Before discussing a number of key results involving the matrix logarithm, we first list some
elementary properties without proofs. Many of the proofs can be found in Chapter 2.3 of
Ref. [2]. A number of properties of the matrix logarithm not treated in Ref. [2] are discussed
in Ref. [11].
⁴For further details, see Sections 1.5–1.7 of Ref. [11].
⁵One possible choice is the Hilbert–Schmidt norm, which is defined as ‖X‖ = [Tr(X†X)]^{1/2}, where the positive square root is chosen.
⁶See, e.g., Section 11.3 of Ref. [11].
Property 1: For all A with ‖A − I‖ < 1, exp(ln A) = A.
Property 2: The matrix logarithm possesses the integral representation,

ln A = (A − I) ∫_0^1 [s(A − I) + I]^{−1} ds . (83)

A derivation of eq. (83) can be found on pp. 136–137 of Ref. [13]. It is straightforward to check
that if ‖A − I‖ < 1, then one can expand the integrand of eq. (83) in a Taylor series in s
[cf. eq. (81)]. Integrating over s term by term then yields eq. (80). Of course, eq. (83) applies
to a much broader class of matrices, A.
Property 3: Employing the extended definition of the matrix logarithm given in eq. (83), if
A is a complex n × n matrix with no real negative or zero eigenvalues, then exp(ln A) = A.
To prove Property 3, we follow the suggestion of Problem 9 on pp. 31–32 of Ref. [14] and
define a matrix-valued function f of a complex variable z,

f(z) = z(A − I) ∫_0^1 [sz(A − I) + I]^{−1} ds .

It is straightforward to show that f(z) is analytic in a complex neighborhood of the real interval
between z = 0 and z = 1. In a neighborhood of the origin, one can verify by expanding in z
and dropping terms of O(z²) that

exp f(z) = I + z(A − I) . (84)

Using the analyticity of f(z), we can insert z = 1 in eq. (84) to conclude that exp(ln A) = A.
Property 5: If A(t) is a complex n × n matrix with no real negative or zero eigenvalues that
depends on a parameter t, and A commutes with dA/dt, then

(d/dt) ln A(t) = A^{−1} dA/dt = (dA/dt) A^{−1} .
Property 7: Suppose that X and Y are complex n × n matrices such that XY = Y X.
Moreover, if |arg λ_j + arg µ_j| < π for every eigenvalue λ_j of X and the corresponding eigenvalue
µ_j of Y, then ln(XY) = ln X + ln Y.
Note that if X and Y do not commute, then the corresponding formula for ln(XY) is quite
complicated. Indeed, if the matrices X and Y are sufficiently close to I, so that exp(ln X) = X
and ln(e^X) = X (and similarly for Y), then we can apply eq. (52) with A = ln X and B = ln Y
to obtain,

ln(XY) = ln X + ln Y + ½[ln X, ln Y] + ··· .
Theorem 5: If B(t) is a non-singular matrix that depends on a parameter t, then

(d/dt) B^{−1} = −B^{−1} (dB/dt) B^{−1} . (86)

Proof: Eq. (86) is easily derived by taking the derivative of the equation B^{−1}B = I. It follows
that

0 = (d/dt)(B^{−1}B) = (dB^{−1}/dt) B + B^{−1} dB/dt . (87)

Multiplying eq. (87) on the right by B^{−1} yields

dB^{−1}/dt + B^{−1} (dB/dt) B^{−1} = 0 ,

which immediately yields eq. (86).
A second form of eq. (86) employs the Gâteaux (or equivalently the Fréchet) derivative. In
light of eqs. (34) and (36), it follows that,

(d/dt)(A + tB)^{−1}|_{t=0} = −A^{−1}BA^{−1} .
Theorem 6:

(d/dt) ln A(t) = ∫_0^1 [sA + (1 − s)I]^{−1} (dA/dt) [sA + (1 − s)I]^{−1} ds . (88)
We provide here two different proofs of Theorem 6. The first proof is my own derivation,
although I expect that others must have produced something similar. The second
proof was inspired by a memo by Stephen L. Adler, which is given in Ref. [15].
Proof 1: Employing the integral representation of ln A given in eq. (83), it follows that

(d/dt) ln A = (dA/dt) ∫_0^1 [s(A − I) + I]^{−1} ds + (A − I) ∫_0^1 (d/dt)[s(A − I) + I]^{−1} ds . (89)
We now make use of eq. (86) to evaluate the integrand of the second integral on the right-hand
side of eq. (89), which yields

(d/dt) ln A = (dA/dt) ∫_0^1 [s(A − I) + I]^{−1} ds − (A − I) ∫_0^1 s [s(A − I) + I]^{−1} (dA/dt) [s(A − I) + I]^{−1} ds . (90)
We can rewrite eq. (90) as follows,

(d/dt) ln A = ∫_0^1 [s(A − I) + I] [s(A − I) + I]^{−1} (dA/dt) [s(A − I) + I]^{−1} ds
− ∫_0^1 s(A − I) [s(A − I) + I]^{−1} (dA/dt) [s(A − I) + I]^{−1} ds , (91)
which simplifies, after noting that [s(A − I) + I] − s(A − I) = I, to

(d/dt) ln A = ∫_0^1 [s(A − I) + I]^{−1} (dA/dt) [s(A − I) + I]^{−1} ds .

Since s(A − I) + I = sA + (1 − s)I, we have established eq. (88).
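Theorem 6 can be verified numerically by evaluating the integral in eq. (88) with Gauss–Legendre quadrature and comparing against a finite difference of scipy.linalg.logm. A sketch (the matrix near I, node count, step size, and tolerances are our own choices):

```python
# Numerical check of Theorem 6 / eq. (88).
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(4)
A0 = np.eye(3) + 0.2 * rng.standard_normal((3, 3))   # A(t) at t = 0
dA = rng.standard_normal((3, 3))                     # dA/dt at t = 0
I = np.eye(3)

# Right-hand side of eq. (88), via Gauss-Legendre quadrature on [0,1].
nodes, weights = np.polynomial.legendre.leggauss(40)
rhs = np.zeros((3, 3))
for x, w in zip(0.5 * (nodes + 1), 0.5 * weights):
    K = np.linalg.inv(x * A0 + (1 - x) * I)
    rhs += w * K @ dA @ K

# Left-hand side: central difference of t -> ln(A0 + t*dA) at t = 0.
h = 1e-6
lhs = (logm(A0 + h * dA) - logm(A0 - h * dA)) / (2 * h)

assert np.allclose(lhs, rhs, atol=1e-5)
```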
For infinitesimal h, we have

[A + h dA/dt + uI]^{−1} = [(A + uI)(I + h(A + uI)^{−1} dA/dt)]^{−1}
= (I + h(A + uI)^{−1} dA/dt)^{−1} (A + uI)^{−1}
= (I − h(A + uI)^{−1} dA/dt)(A + uI)^{−1} + O(h²)
= (A + uI)^{−1} − h(A + uI)^{−1} (dA/dt)(A + uI)^{−1} + O(h²) . (94)
Theorem 7:⁸

A(t) (d/dt) ln A(t) = Σ_{n=0}^∞ (1/(n + 1)) (A^{−1} ad_A)^n (dA/dt)
= dA/dt + ½A^{−1}[A, dA/dt] + (1/3)A^{−2}[A, [A, dA/dt]] + ··· . (98)
Proof: A matrix inverse has the following integral representation,

B^{−1} = ∫_0^∞ e^{−sB} ds , (99)
if the eigenvalues of B lie in the region, Re z > 0, of the complex z-plane. If we perform a
formal differentiation of eq. (99) with respect to B, it follows that
B^{−n−1} = (1/n!) ∫_0^∞ s^n e^{−sB} ds . (100)
⁸I have not seen Theorem 7 anywhere in the literature, although it is difficult to believe that such an
expression has never been derived elsewhere.
Thus, starting with eq. (95), we shall employ eq. (99) to write,

(A + uI)^{−1} = ∫_0^∞ e^{−v(A+uI)} dv . (101)
= Σ_{n=0}^∞ (1/n!) (1/(n + 1)) ∫_0^∞ x^n e^{−xA} dx (ad_A)^n (dA/dt) . (104)
Finally, using eq. (100), we end up with

(d/dt) ln A(t) = Σ_{n=0}^∞ (1/(n + 1)) A^{−n−1} (ad_A)^n (dA/dt) . (105)
A(t) (d/dt) ln A(t) = dA/dt + ½A^{−1}[A, dA/dt] + (1/3)A^{−2}[A, [A, dA/dt]] + ··· . (106)
Note that if [A, dA/dt] = 0, then eq. (106) yields:

(d/dt) ln A(t) = A^{−1} dA/dt = A^{−1} (dA/dt) A A^{−1} = A^{−1} A (dA/dt) A^{−1} = (dA/dt) A^{−1} , (107)

which coincides with Property 5 given in the previous section.
which coincides with Property 5 given in the previous section.
One can rewrite eq. (105) in a more compact form by defining the function,

h(x) ≡ −x^{−1} ln(1 − x) = Σ_{n=0}^∞ x^n/(n + 1) . (108)

Then, eq. (105) takes the form

(d/dt) ln A(t) = A^{−1} h(A^{−1} ad_A)(dA/dt) , (109)

after noting that A^{−1} commutes with ad_A when acting on any matrix B, in light of the identity

A^{−1} ad_A(B) − ad_A(A^{−1}B) = A^{−1}(AB − BA) − (AA^{−1}B − A^{−1}BA) = 0 .
A second form of Theorem 7 employs the Gâteaux (or equivalently the Fréchet) derivative.
In light of eqs. (34) and (36), it follows that,

Theorem 7(a):

(d/dt) ln(A + Bt)|_{t=0} = A^{−1} h(A^{−1} ad_A)(B) , (110)

where the function h is defined by its Taylor series given in eq. (108). In particular,

(d/dt) ln(A + Bt)|_{t=0} = A^{−1}B + ½A^{−2}[A, B] + (1/3)A^{−3}[A, [A, B]] + ··· .
Note that if [A, B] = 0, then (d/dt) ln(A + Bt)|_{t=0} = A^{−1}B = BA^{−1}, which is also a consequence
of Property 5 given in the previous section in the special case of A(t) ≡ A + Bt for t-independent
commuting matrices A and B.
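The series expansion of Theorem 7(a) can be spot-checked numerically by summing its first terms and comparing against a finite difference of scipy.linalg.logm. A sketch (the matrix near I, truncation order, and tolerances are our own illustrative choices; a matrix close to I keeps the series convergent):

```python
# Numerical check of the series in Theorem 7(a):
# d ln(A + Bt)/dt|_{t=0} = sum_n A^{-n-1} (ad_A)^n(B) / (n+1).
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(5)
A = np.eye(3) + 0.05 * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
Ainv = np.linalg.inv(A)

series = np.zeros((3, 3))
C = B.copy()                     # C holds (ad_A)^n(B)
Apow = Ainv.copy()               # Apow holds A^{-n-1}
for n in range(25):
    series += Apow @ C / (n + 1)
    C = A @ C - C @ A
    Apow = Apow @ Ainv

# Finite-difference value of d ln(A + Bt)/dt at t = 0.
h = 1e-6
fd = (logm(A + h * B) - logm(A - h * B)) / (2 * h)

assert np.allclose(series, fd, atol=1e-4)
```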
References
[1] Willard Miller Jr., Symmetry Groups and Their Applications (Academic Press, New York,
1972).
[2] Brian Hall, Lie Groups, Lie Algebras, and Representations (Second Edition), (Springer
International Publishing, Cham, Switzerland, 2015).
[3] Joachim Hilgert and Karl-Hermann Neeb, Structure and Geometry of Lie Groups
(Springer Science+Business Media, New York, NY, 2012).
[4] Rajendra Bhatia, Positive Definite Matrices (Princeton University Press, Princeton, NJ,
2007).
[5] Howard Georgi, Weak Interactions and Modern Particle Theory (Dover Publications,
Mineola, NY, 2009).
[6] R.M. Wilcox, J. Math. Phys. 8, 962 (1967).
[7] Andrew Baker, Matrix Groups: An Introduction to Lie Group Theory
(Springer-Verlag, London, UK, 2002).
[8] J.F. Cornwell, Group Theory in Physics, Volume 2 (Academic Press, London, UK, 1984).
[10] F. Casas, A. Muruab, and M. Nadinic, Comput. Phys. Commun. 183, 2386 (2012).
[11] Nicholas J. Higham, Functions of Matrices (Society for Industrial and Applied Mathe-
matics (SIAM), Philadelphia, PA, 2008).
[13] Willi-Hans Steeb, Problems and Solutions in Introductory and Advanced Matrix Calculus
(World Scientific Publishing Company, Singapore, 2006).
[14] Jacques Faraut, Analysis on Lie Groups (Cambridge University Press, Cambridge, UK,
2008).
[15] Stephen L. Adler, Taylor Expansion and Derivative Formulas for Matrix Logarithms,
https://www.ias.edu/sites/default/files/sns/files/1-matrixlog_tex(1).pdf
[16] L. Dieci, B. Morini and A. Papini, SIAM J. Matrix Anal. Appl. 17, 570 (1996).