Lecture 3

Maximal tori

Recall from last time: Given X ∈ g there exists a unique group homomorphism
γX : R → G s.t. γX(0) = 1 and γX′(0) = X. We defined exp(X) = γX(1). Then
γX(t) = exp(tX), and hence d exp(0) = id. In particular, t ↦ exp(tX) is a group
homomorphism R → G.

Lemma 1. Let G be a topological group. If H ⊂ G is an open subgroup, then it is also
closed. Thus H = G if G is connected.

Proof. If {1} ⊔ I ⊂ G is a set of representatives for the cosets G/H, then the
complement of H in G is the open subset ⋃_{g∈I} gH.

Lemma 2. Let G be a connected topological group, U ⊂ G a nbd of the identity. Then
G is generated by U .

Proof. Since U ∩ U^{-1} is also open, we may assume wlog U = U^{-1}. Write U^n for the
set of n-fold products of elements of U. It is the image of U × · · · × U → G and
is thus an open subset (if g1 g2 · · · gn is in the image, then so is U g2 · · · gn, which
is open around it). Let H = ⋃_n U^n. It is open, and a subgroup. By Lemma 1 it
equals G.

Lemma 3. Let G be a Lie group. Then exp : g → G is a diffeomorphism locally
around 0 and G◦ is generated by exp(g).

Proof. The first statement follows from the inverse function theorem and the
fact that d exp(0) = id. The second statement follows from lemma 2.

Corollary 4. Let G be a connected Lie group. Then g ∈ G lies in the center if and
only if Ad(g) is the identity on g.

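Proof. If g ∈ Z(G), then conjugation by g is the identity on G, and so is its
differential Ad(g) on g. Conversely, if Ad(g) is the identity on g, then
conjugation by g is the identity on the image of exp, because
g exp(X) g^{-1} = exp(Ad(g)X). This image generates G by Lemma 3 and conjugation
by g is a homomorphism, so g commutes with all of G.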

The proofs of the following two theorems will be postponed.

Theorem 5. For any X ∈ g consider the two linear maps d(exp)_X : g → T_{exp(X)}G
and d(R_{exp(X)})_1 : g → T_{exp(X)}G. Then d(R_{exp(X)})_1^{-1} ∘ d(exp)_X : g → g is given
by
\[ \int_0^1 \exp_{GL(\mathfrak{g})}(s\,\mathrm{ad}(X))\,ds. \]
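
For matrix groups the formula of Theorem 5 can be sanity-checked numerically. The following Python sketch assumes G = GL(2, R), where d(R_{exp(X)})_1^{-1} is right multiplication by exp(−X) and exp_{GL(g)}(s ad(X))Y = exp(sX) Y exp(−sX). The helper names (`expm`, `lhs`, `rhs`) and the sample matrices are illustrative choices, not part of the notes. It compares a central difference for d(exp)_X(Y) · exp(−X) with a Simpson approximation of the integral.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def expm(A, terms=40):
    """Matrix exponential via a truncated power series (adequate for small A)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))  # term is now A^k / k!
        result = add(result, term)
    return result

def lhs(X, Y, h=1e-5):
    """Central difference for d(exp)_X(Y), right-translated back to the identity."""
    plus = expm(add(X, scale(h, Y)))
    minus = expm(add(X, scale(-h, Y)))
    dexp = scale(1.0 / (2 * h), add(plus, scale(-1.0, minus)))
    return matmul(dexp, expm(scale(-1.0, X)))

def rhs(X, Y, steps=200):
    """Simpson's rule for the integral of exp(sX) Y exp(-sX) over s in [0, 1]."""
    n = len(X)
    total = [[0.0] * n for _ in range(n)]
    for i in range(steps + 1):
        s = i / steps
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        conj = matmul(matmul(expm(scale(s, X)), Y), expm(scale(-s, X)))
        total = add(total, scale(weight, conj))
    return scale(1.0 / (3 * steps), total)

# Arbitrary sample data (hypothetical, chosen only for the check).
X = [[0.3, 1.0], [-0.5, 0.2]]
Y = [[1.0, -2.0], [0.5, 0.7]]
gap = max(abs(a - b)
          for ra, rb in zip(lhs(X, Y), rhs(X, Y))
          for a, b in zip(ra, rb))
print(gap)  # should be tiny, limited by the finite-difference step
```

The two sides agree only up to discretization error, so the check is evidence for the formula, not a proof of it.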

Theorem 6. Let U ⊂ g be an open neighborhood of 0 on which exp is a diffeomorphism.
For X, Y ∈ U consider the differential equation for Z : [0, 1] → g
\[ \frac{dZ}{dt}(t) = \frac{\mathrm{ad}\,Z(t)}{\exp_{GL(\mathfrak{g})}(\mathrm{ad}\,Z(t)) - 1}(X), \qquad Z(0) = Y. \]

After possibly shrinking U , this differential equation has a (unique) solution for all
X, Y ∈ U and all t ∈ [0, 1]. Define µ(X, Y ) = Z(1). Then we have

exp(X) exp(Y ) = exp(µ(X, Y )).

Corollary 7. The collection of maps κx : U → G, κx(Y) = x exp(Y), is a smooth
atlas for the manifold G.

Theorem 8. Let G be a Lie group with Lie algebra g. Then H ↦ h is a bijection
between the set of connected Lie subgroups and the set of Lie subalgebras.

Proof. Clearly H ↦ h is a map as claimed. To show injectivity, we note that
exp_H is the restriction of exp_G to h. By Lemma 3, H is generated by the image
of exp_H, thus H is uniquely determined as the subgroup of G generated by
exp_G(h).

This argument can in fact also give surjectivity. Indeed, for any subalgebra
h ⊂ g we can consider H to be the subgroup of G generated by exp_G(h). By the
uniqueness of the solution of the differential equation of Theorem 6, we know
that if X, Y ∈ U ∩ h, then µ(X, Y) ∈ h. It follows that (κx)_{x∈H} is a smooth atlas
for H realizing H as a submanifold of G.

Lemma 9. Let G be a compact Lie group. Then there exists a G-invariant inner
product on g. It satisfies

([X, Y ], Z) = −(Y, [X, Z]).

Proof. Let β be an arbitrary scalar product on g. For any v, w ∈ g the function
g ↦ β(Ad(g)v, Ad(g)w) is a continuous function G → R, in fact smooth. Define
\[ (v, w) = \int_G \beta(\mathrm{Ad}(g)v, \mathrm{Ad}(g)w)\,dg. \]

This is clearly G-invariant. Its bilinearity follows from the linearity of Ad(g)
and the integral. Finally, positive-definiteness follows from the positivity of
the Haar measure.

Now, given any G-invariant scalar product (−, −), we differentiate the identity

(Ad(g)v, Ad(g)w) = (v, w).

On the right side we get 0, while on the left side we consider the map G → g × g
given by g ↦ (Ad(g)v, Ad(g)w) composed with (−, −) : g × g → R. We now use
the chain rule. The differential of the first map at g = 1 in the direction of X is
([X, v], [X, w]). The differential of (−, −) at (v, w) in direction (A, B) is computed
from (v + A, w + B) − (v, w) = (v, B) + (A, w) + (A, B), where the term (A, B) is of
second order in (A, B) and does not contribute. Thus the differential of the
composed map is ([X, v], w) + (v, [X, w]). Setting this equal to 0 gives
([X, v], w) = −(v, [X, w]), which is the claimed identity.
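
As a concrete instance (an assumption made for illustration, not taken from the notes), on so(3) the form (X, Y) = −tr(XY) is an invariant inner product for G = SO(3). The short Python sketch below checks the identity ([X, Y], Z) = −(Y, [X, Z]) exactly on a basis; the names `inner`, `bracket` and the basis E1, E2, E3 are hypothetical helpers.

```python
from itertools import product

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

def inner(A, B):
    """(A, B) = -tr(AB); positive definite on skew-symmetric matrices."""
    AB = matmul(A, B)
    return -sum(AB[i][i] for i in range(3))

# Standard basis of so(3): infinitesimal rotations about the three axes.
E1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
E2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
E3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
basis = [E1, E2, E3]

# ([X, Y], Z) + (Y, [X, Z]) should vanish for every triple of basis vectors.
defects = [inner(bracket(X, Y), Z) + inner(Y, bracket(X, Z))
           for X, Y, Z in product(basis, repeat=3)]
gram = [[inner(A, B) for B in basis] for A in basis]
print(defects)
print(gram)
```

Since both sides of the identity are linear in X, Y and Z, checking it on a basis is enough.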

1 Connected Abelian Lie groups

Fact 10. Let G be a Lie group. If X, Y ∈ g satisfy [X, Y ] = 0, then exp(X) and
exp(Y ) commute.

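Proof. We have
\[
\begin{aligned}
\mathrm{Ad}(\exp_G(X))(\exp_G(Y)) &= \exp_G(\mathrm{Ad}(\exp_G(X))Y) \\
&= \exp_G(\exp_{GL(\mathfrak{g})}(\mathrm{ad}(X))Y) \\
&= \exp_G\Big(\sum_{n \ge 0} \frac{\mathrm{ad}(X)^n}{n!}(Y)\Big) \\
&= \exp_G(Y).
\end{aligned}
\]
The last step uses ad(X)Y = [X, Y] = 0, so only the n = 0 term survives. Hence
exp(X) exp(Y) exp(X)^{-1} = exp(Y).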
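
A small numerical illustration of Fact 10 in Python, assuming G = GL(2, R) so that exp is the matrix exponential. The matrices and helper names are illustrative choices, not from the notes. The first pair commutes, and so do their exponentials; the second pair does not commute, and neither do the exponentials.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential via a truncated power series (adequate for small A)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes A^k / k!
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def commutator_gap(P, Q):
    """Largest absolute entry of PQ - QP."""
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return max(abs(a - b) for ra, rb in zip(PQ, QP) for a, b in zip(ra, rb))

# [X, Y] = 0: both are of the form a*I + b*N for the same nilpotent N.
X = [[1.0, 2.0], [0.0, 1.0]]
Y = [[3.0, 5.0], [0.0, 3.0]]
# [X2, Y2] != 0: the two elementary nilpotent matrices.
X2 = [[0.0, 1.0], [0.0, 0.0]]
Y2 = [[0.0, 0.0], [1.0, 0.0]]

gap_commuting = commutator_gap(expm(X), expm(Y))
gap_noncommuting = commutator_gap(expm(X2), expm(Y2))
print(gap_commuting, gap_noncommuting)
```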

Definition 11. 1. A Lie group is called abelian if it is so as a group.

2. A Lie algebra is called abelian if its bracket is zero.

Fact 12. A connected Lie group G is abelian if and only if g is abelian.

Proof. If G is abelian, then conjugation by g is the identity for every g, so its
differential Ad(g) is the identity, so g ↦ Ad(g)Y is constant and its differential
ad(X)(Y) is zero. Conversely, if g is abelian, then all elements of exp(g) commute
by Fact 10, but G is generated by this subset by Lemma 3.

Proposition 13. Let G be a connected abelian Lie group. Then exp : g → G is a
surjective group homomorphism with discrete kernel Γ and induces an isomorphism
g/Γ → G of Lie groups.

Proof. Consider c(t) = exp(tX) exp(tY). This is a group homomorphism R → G
because G is abelian, with c(0) = 1 and c′(0) = X + Y. By uniqueness it follows
that exp(tX) exp(tY) = exp(t(X + Y)). Evaluating at 1 we see that exp is a group
homomorphism, as claimed.

The image of exp is thus a subgroup of G containing a neighborhood of the
identity, and hence open, and equal to G by Lemma 1. Since exp is submersive
at 0, its kernel is a closed Lie subgroup of dimension 0, hence a collection of
points, i.e. a discrete subgroup.

I skip the last statement for now.

Lemma 14. A discrete subgroup of R^n is of the form gZ^k for some g ∈ GL_n(R).

Proof. Homework.

Corollary 15. A connected abelian Lie group is isomorphic to R^{n−k} × (S^1)^k.

2 Maximal tori

Definition 16. Let G be a compact Lie group with Lie algebra g. A torus in G is a
connected abelian Lie subgroup. It is called maximal if it is not properly contained in
another such. A Cartan subalgebra of g is a maximal abelian subalgebra.

Clearly both of these exist for dimension reasons. Note that a proper closed
submanifold without boundary must have strictly lower dimension.

Fact 17. T ⊂ G is a maximal torus if and only if t ⊂ g is a Cartan subalgebra.

Proposition 18. Let t ⊂ g be a Cartan subalgebra. There exists Z ∈ t s.t. t =
Cent(Z, g). In fact, the set of such Z is dense and open.

Proof. Being a maximal abelian subalgebra means that for all X, Y ∈ t we have
[X, Y] = 0 and for all X ∉ t there exists Y ∈ t s.t. [X, Y] ≠ 0. Using this, we see
that if X1, . . . , Xk ∈ t is a basis as a real vector space, then t = ⋂_i ker(ad(Xi)).
We may thin out this basis to a linearly independent set by throwing away
those Xi whose kernels don't contribute.

We will now show that there exists t ∈ R s.t. ker(ad(tX1 + X2)) = ker(ad(X1)) ∩ ker(ad(X2)).
Replacing X1, . . . , Xk by tX1 + X2, X3, . . . , Xk we obtain another linearly
independent set with the same property, and then induction finishes the proof.

To that end, take a G-invariant scalar product on g as in Lemma 9. Consider
kX = ker(ad(X)). Clearly this is an ad(X)-invariant subspace of g. Define
rX = kX^⊥, the orthogonal complement of kX. Since the scalar product is
skew-invariant under ad(X), rX is also ad(X)-invariant. We have g = kX ⊕ rX, an
ad(X)-invariant decomposition, and ad(X) is injective (and hence an
isomorphism) when restricted to rX.

Recall that the endomorphisms ad(X) and ad(Y) of g commute if [X, Y] = 0,
since ad(X)ad(Y) − ad(Y)ad(X) = ad([X, Y]) by the Jacobi identity. Thus kX1 and
rX1 are ad(X2)-invariant. Decomposing in the same way for X2 we obtain
g = (kX1 ∩ kX2 ) ⊕ (rX1 ∩ kX2 ) ⊕ (kX1 ∩ rX2 ) ⊕ (rX1 ∩ rX2 ).
This decomposition is invariant under both ad(X1) and ad(X2), hence under
ad(tX1 + X2). The restriction of this endomorphism to the first factor is zero, to
the second invertible (when t ≠ 0), to the third also invertible. If it has no kernel
when restricted to the fourth factor, we'd have ker(ad(tX1 + X2)) = kX1 ∩ kX2 as
desired. If rX1 ∩ rX2 = {0} we are done, otherwise consider
det(ad(tX1 + X2)|rX1 ∩ rX2). This is a polynomial in t whose value at t = 0 is
non-zero, so there exists some t ≠ 0 where the value is again non-zero.

Definition 19. An element Z ∈ t s.t. t = Cent(Z, g) is called regular.

Proposition 20. Let G be a compact Lie group and T a maximal torus. For any X ∈ g
there exists g ∈ G s.t. Ad(g)X ∈ t.

Proof. Let (−, −) be an invariant inner product on g as in Lemma 9 and Z ∈ t a
regular element as in Proposition 18. We are looking for g ∈ G s.t. [Ad(g)X, Z] =
0. This is equivalent to ([Ad(g)X, Z], Y) = 0 for all Y ∈ g, i.e. (Z, [Y, Ad(g)X]) =
0. We consider the continuous function g ↦ (Z, Ad(g)X). By compactness of
G it takes a maximum at some g ∈ G. Thus the function R → R given by
t ↦ (Z, Ad(exp(tY))Ad(g)X) takes its maximum at t = 0. We differentiate at
t = 0 to see (Z, [Y, Ad(g)X]) = 0, as required.
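
For G = SO(3) the proposition can be made explicit. The Python sketch below assumes the standard identification of so(3) with R^3 via the "hat" map, under which Ad(g) corresponds to the rotation g itself, and it takes t to be the line spanned by the generator of rotations about the z-axis. These identifications and all helper names are assumptions for illustration, not part of the notes. Given X = hat(x), it builds a rotation g with Ad(g)X ∈ t.

```python
import math

def hat(x):
    """Identify x in R^3 with the skew-symmetric matrix y -> x cross y."""
    a, b, c = x
    return [[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def normalize(u):
    norm = math.sqrt(sum(c * c for c in u))
    return [c / norm for c in u]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def rotation_to_z_axis(x):
    """Return g in SO(3) with g x = |x| e3, for nonzero x."""
    w = normalize(x)
    # Pick a helper vector that is not parallel to w.
    helper = [1.0, 0.0, 0.0] if abs(w[0]) < 0.9 else [0.0, 1.0, 0.0]
    u = normalize(cross(helper, w))
    v = cross(w, u)
    return [u, v, w]  # rows form a right-handed orthonormal frame

x = [1.0, 2.0, 2.0]                      # |x| = 3
g = rotation_to_z_axis(x)
conjugated = matmul(matmul(g, hat(x)), transpose(g))  # Ad(g) hat(x) = g hat(x) g^T
target = hat([0.0, 0.0, 3.0])            # element of t with the same length
error = max(abs(a - b)
            for ra, rb in zip(conjugated, target)
            for a, b in zip(ra, rb))
print(error)
```

Here g is found by linear algebra; the proof above instead obtains g by maximizing g ↦ (Z, Ad(g)X) over the compact group.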

Corollary 21. Let G be a compact Lie group. Then it acts transitively on the set of all
Cartan subalgebras and on the set of all maximal tori.

Proof. Since Lie(Ad(g)T) = Ad(g)Lie(T), the second statement follows from
the first. The first follows from the fact that Ad(g) is an automorphism of g and
Propositions 20 and 18.

Theorem 22. Let G be a compact connected Lie group. The exponential map is
surjective.

Proof. We induct on dim(G). If dim G = 1, then g is abelian and we are done.

It is enough to show that exp(g) is open and closed. We have g = ⋃_{g∈G} Ad(g)t
by Proposition 20 and Corollary 21, and hence exp(g) = ⋃_{g∈G} Ad(g)T using
Proposition 13. The latter is the image of G × T → G, (g, t) ↦ gtg^{-1}, and since
G × T is compact, so is its image, which is then closed.

It remains to show that the image of exp is also open. To that end, take X0 ∈ g
and put g0 = exp(X0). Fix an invariant scalar product (−, −) as in Lemma 9.
Define A = Cent(g0, G)◦ and let a be its Lie algebra, which is Cent(g0, g). Let
b = a^⊥. Then g = a ⊕ b is an Ad(g0)-invariant decomposition. The endomorphism
(Ad(g0) − 1) of g respects this decomposition and its kernel is precisely
a, so its restriction to b is invertible.

We define the map

ϕ : a ⊕ b → G, (X, Y) ↦ g0^{-1} exp(Y) g0 exp(X) exp(−Y).

It maps 0 to 1 and is smooth. We can compute its differential

dϕ0(X, 0) = X, dϕ0(0, Y) = (Ad(g0^{-1}) − 1)Y = −Ad(g0^{-1})(Ad(g0) − 1)Y,

and see that it is an isomorphism, since Ad(g0) − 1 is invertible on b. Hence ϕ is
submersive at 0 and its image contains a neighborhood of 1. Since L_{g0} is a
diffeomorphism, the set

{exp(Y) g0 exp(X) exp(−Y) | X ∈ a, Y ∈ b}

contains a neighborhood of g0. Now X, X0 ∈ a, so exp(X), g0 ∈ A, so

⋃_{g∈G} gAg^{-1}

contains a neighborhood of g0.

We distinguish two cases. First assume g0 ∉ Z(G). Then dim A < dim G, and
by inductive hypothesis A = exp(a), so the above union lies in exp(g).

Now assume g0 ∈ Z(G). Since exp(g) contains a neighborhood of 1, g0 exp(g)
contains a neighborhood of g0 and it is enough to show that g0 exp(g) ⊂ exp(g).

First we claim that X0 is contained in every Cartan subalgebra t. Indeed, since
Ad(g0) = 1 we have [X0, −] = 0. Choose a regular X ∈ t as in Proposition 18
and observe [X0, X] = 0, hence X0 ∈ t.

Now let X ∈ g. There exists a Cartan subalgebra t s.t. X ∈ t by Proposition 20.
Then g0 exp(X) = exp(X0) exp(X) = exp(X0 + X) ∈ exp(g), as claimed.

Corollary 23. Let G be a compact connected Lie group, and T a maximal torus.

1. For every g ∈ G there exists h ∈ G s.t. hgh^{-1} ∈ T. In other words, G =
⋃_{h∈G} hTh^{-1}.
2. ZG(T) = T.
3. Z(G) ⊂ T.

Proof. The third point follows from either of the two previous ones.

For the first, write g = exp(X) by Theorem 22. Choose h s.t. Ad(h)X ∈ t by
Proposition 20. Then Ad(h)g = exp(Ad(h)X) ∈ exp(t) = T.

For the second, let g0 ∈ ZG(T). Using Theorem 22 we write g0 = exp(X0). Then
t ↦ exp(tX0) is a path within ZG(g0)◦ connecting 1 to g0, so g0 ∈ ZG(g0)◦. At
the same time, T ⊂ ZG(g0)◦. Now ZG(g0)◦ is a closed Lie subgroup of G, hence
compact, and by definition connected. It follows from the first point that g0 is
conjugate to an element of T by an element of ZG(g0)◦. Since g0 is central in
ZG(g0)◦, this conjugation fixes g0, hence g0 ∈ T.

3 Proofs of Theorems 5 and 6

Recall that given a manifold M, a point p ∈ M, and a smooth vector field v,
there exists an open nbd I ⊂ R of 0 and a curve c = c_{p,v} : I → M satisfying c(0) = p
and c′(t) = v(c(t)) for all t ∈ I, and that this curve is unique.

We can package these curves for all p ∈ M together and obtain the concept
of flow: Φv (t, p) = cp,v (t). The flow is defined on an open neighborhood of
{0} × M in R × M .

Recall the following convention. If f : A × B → C is a smooth map of manifolds,
and a ∈ A, b ∈ B, then d_a f(a, b) is the linear map T_a A → T_{f(a,b)} C
obtained by taking the differential at a of the smooth map f(−, b) : A → C. If
A happens to be R, then we identify this linear map with its value at 1 and thus
obtain an element of T_{f(a,b)} C.

Lemma 24. Φ(t + s, p) = Φ(t, Φ(s, p)).

Proof. Fix s and vary t. Define c1(t) = Φ(t + s, p) and c2(t) = Φ(t, Φ(s, p)). Then
c1(0) = Φ(s, p) = Φ(0, Φ(s, p)) = c2(0). Furthermore
\[ \frac{d}{dt} c_1(t_0) = \frac{d}{dt}\Phi(t_0 + s, p) = v(\Phi(t_0 + s, p)) = v(c_1(t_0)) \]
and
\[ \frac{d}{dt} c_2(t_0) = \frac{d}{dt}\Phi(t_0, \Phi(s, p)) = v(\Phi(t_0, \Phi(s, p))) = v(c_2(t_0)). \]
Thus both c1 and c2 satisfy the same differential equation with the same initial
value, implying c1 = c2.

We will now be interested in vector fields v that depend smoothly on a real
parameter ε. These are simply smooth maps v : M × I → TM satisfying
π(v(p, ε)) = p. Let Φ(t, p, ε) be the corresponding flow, so Φ is defined on a
tube around {0} × M × {0} in R × M × R and valued in M.

Theorem 25. We have
\[ \frac{d}{d\varepsilon}\Phi(t_0, p_0, \varepsilon_0) = \int_0^{t_0} \frac{d}{dp}\Phi(t_0 - s, \Phi(s, p_0, \varepsilon_0), \varepsilon_0) \cdot \frac{d}{d\varepsilon} v(\Phi(s, p_0, \varepsilon_0), \varepsilon_0)\,ds. \]

Let's untangle the statement. The LHS is the differential at ε0 of the curve ε ↦
Φ(t0, p0, ε). This is a linear map R → T_{Φ(t0,p0,ε0)}M which we identify with its
value at 1, thus an element of T_{Φ(t0,p0,ε0)}M.

On the RHS we differentiate at Φ(s, p0, ε0) the smooth map M → M given by p ↦
Φ(t0 − s, p, ε0). We get a linear map from the tangent space at Φ(s, p0, ε0) to
the tangent space at Φ(t0 − s, Φ(s, p0, ε0), ε0) = Φ(t0, p0, ε0). We feed that linear
map the following argument: We have the curve R → T_{Φ(s,p0,ε0)}M sending ε ↦
v(Φ(s, p0, ε0), ε). Its differential at ε0 is a linear map R → T_{Φ(s,p0,ε0)}M, which
we identify with its value at 1, thus an element of T_{Φ(s,p0,ε0)}M. The integrand
is thus an element of T_{Φ(t0,p0,ε0)}M for all s, thus a curve [0, t0] → T_{Φ(t0,p0,ε0)}M,
which we can integrate to produce an element of T_{Φ(t0,p0,ε0)}M.

We now take G to be a Lie group.

Lemma 26. Let X ∈ g. If v is the right-invariant vector field v(p) = dRp X then
Φ(t, p) = Φ(t, 1)·p. If v is the left-invariant vector field v(p) = dLp X, then Φ(t, p) =
p · Φ(t, 1).

Proof. The proof of both statements is analogous. Let v be right-invariant. Fix
p and define the two curves c1(t) = Φ(t, p) and c2(t) = Φ(t, 1) · p = R_p ∘ Φ(t, 1).
Then c1(0) = Φ(0, p) = p and
\[ \frac{d}{dt} c_1(t_0) = \frac{d}{dt}\Phi(t_0, p) = v(\Phi(t_0, p)) = v(c_1(t_0)). \]
On the other hand, c2(0) = Φ(0, 1) · p = p and
\[
\begin{aligned}
\frac{d}{dt} c_2(t_0) &= \frac{d}{dt}\Big|_{t=t_0} (\Phi(t, 1) \cdot p) = \frac{d}{dt}\Big|_{t=t_0} (R_p \circ \Phi(t, 1)) \\
&= d(R_p)_{\Phi(t_0,1)} \circ \frac{d}{dt}\Big|_{t=t_0} \Phi(t, 1) = (dR_p)_{\Phi(t_0,1)}(v(\Phi(t_0, 1))) \\
&= (dR_p)_{\Phi(t_0,1)} \circ (dR_{\Phi(t_0,1)})_1 X = d(R_{\Phi(t_0,1) \cdot p})_1 X \\
&= v(\Phi(t_0, 1) \cdot p) = v(c_2(t_0)).
\end{aligned}
\]
Again both curves satisfy the same diff eq and are thus equal.

Proof of Theorem 5. We want to compute d(exp)_X(Y) = d/dε|_{ε=0} exp(X + εY). We
consider the left-invariant vector field v(g, ε) = (dL_g)_1(X + εY). Lemma 26 tells
us that the corresponding flow is given by Φ(s, g, ε) = g exp(s(X + εY)). Thus
we are computing d/dε Φ(1, 1, 0), which by Theorem 25 is
\[ \int_0^1 \frac{d}{dp}\Phi(1 - s, \Phi(s, 1, 0), 0) \cdot \frac{d}{d\varepsilon} v(\Phi(s, 1, 0), 0)\,ds. \]
We compute d/dp Φ(1 − s, Φ(s, 1, 0), 0). The map p ↦ Φ(1 − s, p, 0) is simply
R_{exp((1−s)X)} = R_{exp(X)} ∘ R_{exp(−sX)}, and we are differentiating at Φ(s, 1, 0) =
exp(sX), thus we obtain from the chain rule

d(R_{exp((1−s)X)})_{exp(sX)} = d(R_{exp(X)})_1 ∘ d(R_{exp(−sX)})_{exp(sX)}.

Next we compute d/dε v. We have ε ↦ v(Φ(s, 1, 0), ε) = d(L_{exp(sX)})_1(X + εY), so
differentiating at ε = 0 gives d(L_{exp(sX)})_1(Y). We get
\[ \int_0^1 \big[ d(R_{\exp(X)})_1 \circ d(R_{\exp(-sX)})_{\exp(sX)} \circ d(L_{\exp(sX)})_1 \big](Y)\,ds. \]

All factors are linear, and the first does not depend on s, so we pull it out of
the integral. The other two we can integrate as operators before applying to Y,
and get
\[ d(R_{\exp(X)})_1 \circ \Big( \int_0^1 d(R_{\exp(-sX)})_{\exp(sX)} \circ d(L_{\exp(sX)})_1\,ds \Big)(Y). \]
Reversing the chain rule, the integrand becomes

d(Ad(exp(sX)))_1 = Ad(exp_G(sX)) = exp_{GL(g)}(ad(sX)).

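Finally, we check that
\[ d(R_{\exp(X)})_1 \circ \int_0^1 \exp_{GL(\mathfrak{g})}(\mathrm{ad}(sX))\,ds = d(L_{\exp(X)})_1 \circ \int_0^1 \exp_{GL(\mathfrak{g})}(\mathrm{ad}(-sX))\,ds. \]
For this we move d(R_{exp(X)})_1 to the other side and compute
\[
\begin{aligned}
d(R_{\exp(X)})_1^{-1} \circ d(L_{\exp(X)})_1 \circ \int_0^1 \exp_{GL(\mathfrak{g})}(\mathrm{ad}(-sX))\,ds
&= \mathrm{Ad}(\exp(X)) \circ \int_0^1 \exp_{GL(\mathfrak{g})}(\mathrm{ad}(-sX))\,ds \\
&= \exp_{GL(\mathfrak{g})}(\mathrm{ad}(X)) \circ \int_0^1 \exp_{GL(\mathfrak{g})}(\mathrm{ad}(-sX))\,ds \\
&= \int_0^1 \exp_{GL(\mathfrak{g})}(\mathrm{ad}((1 - s)X))\,ds.
\end{aligned}
\]
Substituting u = 1 − s we obtain the result.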

Fact 27. We have for any A ∈ End(V)
\[ \int_0^1 \exp(sA)\,ds = \sum_{n=0}^{\infty} \frac{1}{(n+1)!} A^n = A^{-1}(\exp(A) - 1), \]
where the final term makes sense when A is invertible.

Note that ad(X) is never an invertible transformation on g. We define
A^{-1}(exp(A) − 1) by the above equation even for non-invertible A.

Fact 28. For any A ∈ End(V), the eigenvalues of ∫_0^1 exp(sA) ds are
\[ \int_0^1 e^{s\lambda}\,ds = \frac{e^{\lambda} - 1}{\lambda}, \]
where λ are the eigenvalues of A, and this expression is by convention equal to 1 for
λ = 0.

Corollary 29. The linear map d(exp)_X is invertible for precisely those X for which
ad(X) has no eigenvalues of the form 2πik, 0 ≠ k ∈ Z.
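
A quick numerical illustration of Facts 27 and 28 and of Corollary 29, written as a Python sketch under the following assumption (not stated in the notes): for g = so(3) and X the generator of rotations about the z-axis scaled by θ, the matrix of ad(X) in the standard basis is θ times that same generator, with eigenvalues 0 and ±iθ. By Fact 28 the determinant of ∫_0^1 exp(s ad(X)) ds should then be (2 − 2 cos θ)/θ², which vanishes exactly at θ = 2πk. Function names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def integral_of_exp(A, terms=60):
    """Partial sum of sum_{n>=0} A^n / (n+1)!, the series from Fact 27."""
    term = [[float(i == j) for j in range(3)] for i in range(3)]  # n = 0 term
    result = [row[:] for row in term]
    for n in range(1, terms):
        # term goes from A^(n-1)/n! to A^n/(n+1)!
        term = [[x / (n + 1) for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def ad_of_rotation_generator(theta):
    """Assumed matrix of ad(X) on so(3) for X = theta * (rotation about z)."""
    return [[0.0, -theta, 0.0], [theta, 0.0, 0.0], [0.0, 0.0, 0.0]]

det_at_pi = det3(integral_of_exp(ad_of_rotation_generator(math.pi)))
det_at_2pi = det3(integral_of_exp(ad_of_rotation_generator(2 * math.pi)))
print(det_at_pi, 4 / math.pi ** 2)  # the two numbers should agree closely
print(det_at_2pi)                   # close to 0: d(exp)_X is singular here
```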

Consider the open dense set ge where d(exp)_X is invertible, equivalently where
X ↦ (ad(X))^{-1}(exp(ad(X)) − 1) is invertible.

Proof of Theorem 6. We show that for all X, Y ∈ ge we have exp(X) exp(Y) =
exp(µ(X, Y)). Using Theorem 5 and the differential equation for Z, we compute
\[ \frac{d}{dt}\exp(Z(t)) = d(\exp)_{Z(t)} \circ \frac{d}{dt}Z(t) = d(R_{\exp(Z(t))})_1(X) = v(\exp(Z(t))), \]
where v is the right-invariant vector field associated to X. Thus, if Φ is the flow
of this right-invariant vector field, we have exp(Z(t)) = Φ(t, exp(Z(0))). By
Lemma 26 we see

exp(Z(t)) = Φ(t, 1) exp(Z(0)) = exp(tX) exp(Y).

Apply this to t = 1 to get the result.

4 Baker-Campbell-Hausdorff

In the proof of Theorem 6 we verified the equality

exp(Z(t)) = exp(tX) exp(Y ).

We apply the adjoint action to this equation and obtain

expGL(g) (ad(Z(t))) = expGL(g) (ad(tX)) ◦ expGL(g) (ad(Y )).

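For any A ∈ End(V) with exp(A) sufficiently close to 1 for the series to converge,
we have the expression
\[ A = \log(1 + (\exp(A) - 1)) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} (\exp(A) - 1)^{k+1}, \]
from which we derive
\[ \frac{A}{\exp(A) - 1} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} (\exp(A) - 1)^k. \]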

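Using this in the differential equation for Z(t) we obtain
\[
\begin{aligned}
\frac{d}{dt} Z(t) &= \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} \big(\exp(\mathrm{ad}\,Z(t)) - 1\big)^k (X) \\
&= \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} \big(\exp(t\,\mathrm{ad}(X)) \circ \exp(\mathrm{ad}(Y)) - 1\big)^k (X) \\
&= \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} \Big( \sum_{\substack{l, m \ge 0 \\ l + m > 0}} t^{l}\, \frac{\mathrm{ad}(X)^{l}}{l!} \circ \frac{\mathrm{ad}(Y)^{m}}{m!} \Big)^{k} (X) \\
&= \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} \sum_{\substack{l_1, \dots, l_k \ge 0 \\ m_1, \dots, m_k \ge 0 \\ l_1 + m_1 > 0 \\ \vdots \\ l_k + m_k > 0}} t^{l_1 + \cdots + l_k}\, \frac{\mathrm{ad}(X)^{l_1}}{l_1!} \circ \frac{\mathrm{ad}(Y)^{m_1}}{m_1!} \circ \cdots \circ \frac{\mathrm{ad}(X)^{l_k}}{l_k!} \circ \frac{\mathrm{ad}(Y)^{m_k}}{m_k!} (X).
\end{aligned}
\]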

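Since Z(0) = Y and Z(1) = µ(X, Y) we have
\[
\mu(X, Y) = Y + \int_0^1 \frac{d}{dt} Z(t)\,dt
= X + Y + \sum_{k=1}^{\infty} \frac{(-1)^k}{k+1} \sum_{\substack{l_1, \dots, l_k \ge 0 \\ m_1, \dots, m_k \ge 0 \\ l_1 + m_1 > 0 \\ \vdots \\ l_k + m_k > 0}} \frac{1}{l_1 + \cdots + l_k + 1}\, \frac{\mathrm{ad}(X)^{l_1}}{l_1!} \circ \frac{\mathrm{ad}(Y)^{m_1}}{m_1!} \circ \cdots \circ \frac{\mathrm{ad}(X)^{l_k}}{l_k!} \circ \frac{\mathrm{ad}(Y)^{m_k}}{m_k!} (X).
\]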
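
The series can be evaluated exactly in a simple case. The Python sketch below assumes the Lie algebra of strictly upper triangular 3×3 matrices (the Heisenberg algebra), in which every iterated bracket of length two or more vanishes, so a truncation of the series is already exact. It evaluates the truncated series for µ(X, Y) with rational arithmetic and compares exp(µ(X, Y)) with exp(X) exp(Y). The function names and truncation parameters (`bch_truncated`, `max_k`, `max_deg`) are illustrative choices, not from the notes.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def ad_power(A, n, V):
    """Apply ad(A) = [A, -] to V exactly n times."""
    for _ in range(n):
        V = add(matmul(A, V), scale(-1, matmul(V, A)))
    return V

def bch_truncated(X, Y, max_k=2, max_deg=2):
    """Series for mu(X, Y), keeping k <= max_k and all l_i, m_i <= max_deg."""
    pairs = [(l, m) for l in range(max_deg + 1) for m in range(max_deg + 1)
             if l + m > 0]
    total = add(X, Y)
    for k in range(1, max_k + 1):
        for combo in product(pairs, repeat=k):
            coeff = Fraction((-1) ** k, k + 1) / (sum(l for l, _ in combo) + 1)
            term = X
            # Operators compose right to left: the k-th pair acts on X first,
            # and within a pair ad(Y)^m acts before ad(X)^l.
            for l, m in reversed(combo):
                term = ad_power(X, l, ad_power(Y, m, term))
                coeff /= factorial(l) * factorial(m)
            total = add(total, scale(coeff, term))
    return total

def exp_nilpotent(N):
    """exp(N) = 1 + N + N^2/2 for a strictly upper triangular 3x3 matrix N."""
    identity = [[int(i == j) for j in range(3)] for i in range(3)]
    return add(add(identity, N), scale(Fraction(1, 2), matmul(N, N)))

X = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
Y = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
mu = bch_truncated(X, Y)
print(mu)  # X + Y + (1/2)[X, Y]
print(exp_nilpotent(mu) == matmul(exp_nilpotent(X), exp_nilpotent(Y)))
```

In this example only the k = 1 term with (l, m) = (0, 1) contributes, giving the familiar leading correction ½[X, Y]. For a general Lie algebra the truncation is only an approximation.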

