
Notes On Advanced Quantum Field Theory

The Theory of Elementary Interactions


A Course Given By Dr. Tobias Osborne
Transcribed by Dr. Alexander V. St. John
August 11, 2021

Contents
1 Lecture 1: Introduction
2 Lecture 2: Gaussian Path Integrals
3 Lecture 3: Correlation Functions and Path Integrals
4 Lecture 4: Functional Quantization of the Scalar Field
5 Lecture 5: Functional Derivatives and Generating Functionals
6 Lecture 6: Grassmann Numbers
7 Lecture 7: Functional Quantization of the Dirac Field
8 Lecture 8: Renormalization
9 Lecture 9: Renormalizability (of φ4 Theory)
10 Lecture 10: Abelian Gauge Theory (Quantum Electrodynamics)
11 Lecture 11: Nonabelian Gauge Theory (Yang-Mills)
12 Lecture 12: Quantization of Gauge Theories
13 Lecture 13: Quantization of Nonabelian Gauge Theory
14 Lecture 14: Quantization of Nonabelian Gauge Theory, Cont.
15 Lecture 15: Hamiltonian Lattice Gauge Theory
16 Lecture 16: Spontaneous Symmetry Breaking
17 Lecture 17: The Higgs Mechanism
18 Lecture 18: SSB with Gauge Theories and Next Steps

1 Lecture 1: Introduction
The goal of this course is to explain the current "standard model" of particle physics. That is too lofty a goal for a single course, so we focus on the building blocks of the standard model, so that we understand the origin and purpose of each term of the Lagrangian.

Topics covered include

1. Path integral quantization
Via Gaussian integrals.
2. Review of perturbation theory via path integrals
Includes familiar tools for calculating correlation functions such as Wick's theorem, Feynman rules, etc.
3. Renormalization
Allows us to discuss effective QFT (e.g., eliminating infinities) in more detail.
4. Abelian and nonabelian classical gauge theories
Uses path integrals to deduce quantizations from classical field theories.
5. Quantization of nonabelian gauge theories
Employs path integrals for perturbative calculations and lattices for nonperturbative calculations.
6. Spontaneous symmetry breaking mechanisms

Path Integrals
Let's first suppose that the quantization is already done, and we have a quantum system with a Hilbert space $\mathcal{H}$, a Hamiltonian $\hat{H}$, and a propagator obtained by integrating the Schroedinger equation, $U(t) = e^{-i\hat{H}t}$.

Now work out a representation of the propagator by splitting it into $N$ short time steps and Taylor expanding each step to first order

$$U(t) = e^{-i\hat{H}t} = \left(e^{-\frac{it}{N}\hat{H}}\right)^{N} = \lim_{N\to\infty}\left(I - \frac{it}{N}\hat{H}\right)^{N} \tag{1}$$
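The limit in this formula can be checked numerically on a toy system. The sketch below is an illustration only: it assumes a two-level system with Hamiltonian $\hat{H} = \sigma_x$ (a choice made here, not taken from the lecture), for which $e^{-i\hat{H}t} = \cos t\, I - i \sin t\, \sigma_x$ is known in closed form, and compares it with the $N$-step product.

```python
import math


def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def product_formula(h, t, n):
    """Return (I - i*t*h/n)^n for a 2x2 Hamiltonian h."""
    step = [[(1.0 if i == j else 0.0) - 1j * t / n * h[i][j]
             for j in range(2)] for i in range(2)]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul2(result, step)
    return result


def exact_propagator_sigma_x(t):
    """Closed form exp(-i*t*sigma_x) = cos(t) I - i sin(t) sigma_x."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]


def max_abs_error(a, b):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Toy Hamiltonian (assumption for illustration): the Pauli matrix sigma_x.
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]

for n_steps in (10, 100, 1000):
    err = max_abs_error(product_formula(SIGMA_X, 1.0, n_steps),
                        exact_propagator_sigma_x(1.0))
    print(n_steps, err)
```

The printed error should shrink roughly like $1/N$, which is the rate expected from dropping the second-order term in each step.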
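Let $\{|j\rangle\}$ be a basis for the Hilbert space $\mathcal{H}$ and consider the transition amplitude of evolving from an eigenstate $|\phi_i\rangle$ to another eigenstate $|\phi_f\rangle$

$$\langle\phi_f|\,U(t)\,|\phi_i\rangle = \langle\phi_f|\left(e^{-\frac{it}{N}\hat{H}}\right)^{N}|\phi_i\rangle \tag{2}$$
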
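Insert $N-1$ completeness relations for the Hilbert space basis, one between each pair of the $N$ exponentials, and note that the result is a sum, over all paths from the initial state to the final state, of a function of the path

$$\langle\phi_f|\,U(t)\,|\phi_i\rangle = \sum_{j_1,\dots,j_{N-1}} \langle\phi_f|\,e^{-\frac{it}{N}\hat{H}}\,|j_{N-1}\rangle\langle j_{N-1}|\,e^{-\frac{it}{N}\hat{H}}\cdots|j_1\rangle\langle j_1|\,e^{-\frac{it}{N}\hat{H}}\,|\phi_i\rangle \tag{3}$$

$$\equiv \sum_{\text{paths}} f(j_1,\dots,j_{N-1}). \tag{4}$$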

So, the transition amplitude of this state evolution is a sum over all of the
paths through the basis states j1 , . . . , jN −1 . An example schematic of a path is
visualized below.

To work out the function of the path $f(j_1,\dots,j_{N-1})$, we need to calculate the individual transition amplitudes between successive states $|j_{k-1}\rangle$ and $\langle j_k|$ along the path, and we seek to write each of them as an exponential of some function of the states $L(j_{k-1}, j_k)$, the Lagrangian density

$$\langle j_k|\left(I - \frac{it}{N}\hat{H}\right)|j_{k-1}\rangle \simeq e^{\frac{it}{N}L(j_{k-1},j_k)}. \tag{5}$$

So, the full transition amplitude will be a sum over paths of products of exponentials of Lagrangian densities, well-known classical quantities

$$\langle\phi_f|\,U(t)\,|\phi_i\rangle = \sum_{\text{paths}} f(j_1,\dots,j_{N-1}) = \sum_{j_1,\dots,j_{N-1}} e^{\frac{it}{N}\sum_{k=2}^{N-1}L(j_{k-1},j_k)}. \tag{6}$$

Before moving forward, what makes the path integral so interesting?

1. It allows the calculation of quantum quantities, the transition amplitudes,
via well-understood classical solutions and methods for handling highly
oscillatory integrals such as the saddle point method.
2. It can also be used to build nonperturbative approximation schemes, such
as Monte Carlo sampling over paths.

Example: General Nonrelativistic Quantum Mechanical System
Let's assume a little bit more about our quantum system. Suppose the quantum system is inspired by a classical system with pairs of canonical coordinates and momenta and the Hamiltonian $H(\{q^j\}, \{p^j\}) = H(q, p)$.

Turn the sum over paths around to guess a quantum Hamiltonian and Hilbert space from this classical Hamiltonian via

$$U(q_i, q_f, T) = \langle q_f|\,U(T)\,|q_i\rangle = \langle q_f|\,e^{-iT\hat{H}}\,|q_i\rangle \tag{7}$$

Proceed as before, multiplying the exponentials and inserting $N-1$ completeness relations between the $N$ copies of the exponential. The completeness relation for the canonical position basis, a continuous variable, is

$$I = \left(\prod_j \int dq_k^j\right)|q_k\rangle\langle q_k| \tag{8}$$

So, with $\epsilon \equiv \delta t = \frac{T}{N}$, the transition amplitude in this case is

$$\langle q_f|\,U(T)\,|q_i\rangle = \sum_{k_1,\dots,k_{N-1}} \langle q_f|\,e^{-i\epsilon\hat{H}}\,|q_{k_{N-1}}\rangle\langle q_{k_{N-1}}|\cdots|q_{k_1}\rangle\langle q_{k_1}|\,e^{-i\epsilon\hat{H}}\,|q_i\rangle, \tag{9}$$

where the sum over the labels $k_1,\dots,k_{N-1}$ stands for the integrals of the completeness relation (8).
There are three cases for the dependence of the quantum Hamiltonian on the
canonical coordinates in the expression for the propagator. It can depend purely
on position, purely on momenta, or most realistically, it can depend on both.

In the case that the Hamiltonian is a function purely dependent on canonical position, such that $\hat{H} = g(\hat{q})$, we easily calculate the transition amplitude, which relates the quantum and classical canonical positions, since the $|q_k\rangle$ are energy eigenstates of the position-dependent Hamiltonian

$$\langle q_{k+1}|\,g(\hat{q})\,|q_k\rangle = g(q_k)\prod_j \delta(q_k^j - q_{k+1}^j) \tag{10}$$

$$= g\!\left(\frac{q_{k+1}+q_k}{2}\right)\left(\prod_j \int\frac{dp_k^j}{2\pi}\right) e^{i\sum_j p_k^j(q_{k+1}^j - q_k^j)}. \tag{11}$$

Here we used the Dirac delta identity $\delta(q) = \int_{-\infty}^{\infty}\frac{dp}{2\pi}\,e^{ip\cdot q}$ to introduce the canonical momenta into the transition amplitude. Also note that the Dirac delta function forces $q_{k+1} = q_k$, such that $g(q_k) = g\!\left(\frac{q_{k+1}+q_k}{2}\right)$, and we write it in this fashion for later use.

Next, in the case that the Hamiltonian is a function purely dependent on canonical momenta, such that $\hat{H} = h(\hat{p})$, the transition amplitude is calculated by inserting the completeness relation for the momentum eigenbasis.

$$\langle q_{k+1}|\,h(\hat{p})\,|q_k\rangle = \langle q_{k+1}|\,h(\hat{p})\cdot\prod_j\int dp_k^j\,|p_k\rangle\langle p_k|q_k\rangle \tag{12}$$

$$= \left(\prod_j\int\frac{dp_k^j}{2\pi}\right) h(p_k)\,e^{i\sum_j p_k^j(q_{k+1}^j - q_k^j)} \tag{13}$$

Here the inner product of the position and momentum eigenstates is a Fourier phase factor $\langle p|q\rangle = \frac{1}{2\pi}e^{ip\cdot q}$, and we get the sum over $j$ in the exponent because each label $k$ stands for the full set of canonical coordinate pairs.

The more realistic situation is when the Hamiltonian depends on both position and momenta, $\hat{H} = \hat{H}(\hat{q},\hat{p}) = g(\hat{q}) + h(\hat{p})$. Suppose, as written, that the two dependencies are additively separable in the quantum Hamiltonian. Then we may translate between classical position and momenta via the Taylor expansion to first order in $\epsilon$

$$e^{-i\epsilon\hat{H}} \simeq I - i\epsilon\hat{H} = I - i\epsilon\,(g(\hat{q}) + h(\hat{p})). \tag{14}$$

Using this linearity, we can combine the two previous cases into the formula

$$\langle q_{k+1}|\,e^{-i\epsilon\hat{H}(\hat{q},\hat{p})}\,|q_k\rangle = \left(\prod_j\int\frac{dp_k^j}{2\pi}\right) e^{-i\epsilon H\left(\frac{q_{k+1}+q_k}{2},\,p_k\right)}\, e^{i\sum_j p_k^j(q_{k+1}^j - q_k^j)} \tag{15}$$

Putting all this together into the propagator, which is really the transition amplitude for a nonrelativistic quantum system,

$$U(q_i, q_f; T) = \left(\prod_{j,k}\int dq_k^j\int\frac{dp_k^j}{2\pi}\right) e^{i\sum_k\left(\sum_j p_k^j(q_{k+1}^j - q_k^j) - \epsilon H\left(\frac{q_{k+1}+q_k}{2},\,p_k\right)\right)}. \tag{16}$$

Take note that there is nothing quantum on the RHS: no hats! We have used purely classical data to define the quantum propagator, or transition amplitude, on the LHS, such that $U(q_i, q_f; T) \propto e^{-iH(q_i,q_f;T)}$.

A few other remarkable points:

Using the saddle point method, we can build an approximation scheme for $U$. This is useful for highly oscillatory integrals such as the transition amplitude above (note the $e^{i\dots}$), since such integrals can be approximated by the contributions of their saddle points (or critical points), which correspond to classical paths of the system.

Monte Carlo sampling of the system can also be used to approximate the transition amplitude by building an estimator for the RHS, sampling over classical configurations, and summing up the estimator.

Now, the expression for $U$ was not so pretty, but imagine continuous time variables and integrals when you see $\sum_k$ and $\epsilon$ above. In the limit as $N\to\infty$ (with $N-1$ the number of completeness relations inserted), the quantum propagator is expressed in a continuous form with strange new "integrals".

$$U(q_i, q_f; T) = \left(\int\mathcal{D}q\int\mathcal{D}p\right) e^{i\int_0^T dt\,\left(\sum_j p^j\dot{q}^j - H(q,p)\right)} \tag{17}$$

Do not think of these as literal integrals, as we do not have a proper measure space to integrate over. Think of them as algorithms for now, something totally new that will be applied to evaluate the expression above.

2 Lecture 2: Gaussian Path Integrals
Recall the propagator, or transition amplitude, for a nonrelativistic quantum system

$$U(q_i, q_f; T) = \left(\prod_j\int\mathcal{D}q^j(t)\int\mathcal{D}p^j(t)\right) e^{i\int_0^T dt\,L(q^j,\dot{q}^j)}. \tag{18}$$

To work with this, we often discretize $q(t)\to q_k^j$

$$U(q_i, q_f; T) = \left(\prod_{j,k}\int dq_k^j\int\frac{dp_k^j}{2\pi}\right) e^{i\sum_k\left(\sum_j p_k^j(q_{k+1}^j - q_k^j) - \epsilon H\right)} \tag{19}$$

Evaluate these very many integrals to get an answer that depends on $\epsilon = \frac{T}{N}$, since we discretized, then take the limit as $\epsilon\to 0$ and deal with any infinities encountered.

Key Example
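Consider the classical Hamiltonian

$$H = \frac{p^2}{2m} + V(q). \tag{20}$$

Calculate the transition amplitude (Exercise)

$$U(q_i, q_f; T) = \left(\prod_{j,k}\int dq_k^j\int\frac{dp_k^j}{2\pi}\right) e^{i\sum_k\left(\sum_j p_k^j(q_{k+1}^j - q_k^j) - \epsilon H\right)} \tag{21}$$

$$= \left(\prod_k\int dq_k\int\frac{dp_k}{2\pi}\right) e^{i\sum_k\left(p_k(q_{k+1}-q_k) - \epsilon\left(\frac{p_k^2}{2m} + V(q)\right)\right)} \tag{22}$$

$$= \left(\prod_k\int dq_k\,\sqrt{\frac{-im}{2\pi\epsilon}}\right) e^{i\sum_k\left(\frac{m}{2\epsilon}(q_{k+1}-q_k)^2 - \epsilon V\left(\frac{q_{k+1}+q_k}{2}\right)\right)}. \tag{23}$$
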
We may also write this in the following notation, using the fact that the argument of the exponential is $i$ times the discretized version of the action, now without the $p$-integral

$$\lim_{\epsilon\to 0} U(q_i, q_f; T) = \int\mathcal{D}q(t)\,e^{iS[q(t)]} \tag{24}$$

where the action is

$$S[q(t)] = \int_0^T dt\left(\frac{m}{2}\sum_j(\dot{q}^j)^2 - V(q)\right). \tag{25}$$

Note that if our system is a harmonic oscillator, $V(q) = \frac{1}{2}m\omega^2q^2$, we can do the full integral.

Path Integrals for Scalar Fields


Recall the classical scalar field with Lagrangian density and Hamiltonian

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - V(\phi) \tag{26}$$

$$H = \int d^3x\left(\frac{1}{2}\pi^2(x) + \frac{1}{2}(\nabla\phi(x))^2 + V(\phi(x))\right). \tag{27}$$

To obtain the transition amplitude for quantum scalar fields, we blindly apply the path integral prescription above and conjecture that

$$\langle\phi_b|\,e^{-i\hat{H}T}\,|\phi_a\rangle = \left(\int\mathcal{D}\phi\int\mathcal{D}\pi\right) e^{i\int_0^T d^4x\,\left(\pi\dot{\phi} - \mathcal{H}(\phi)\right)} \tag{28}$$

where the boundary conditions are $\phi(t=0,\mathbf{x}) = \phi_a(\mathbf{x})$ and $\phi(t=T,\mathbf{x}) = \phi_b(\mathbf{x})$.

As explained above, to make sense of this quantity, we must discretize, evaluate, and take the continuum limit as $\epsilon\to 0$. When we discretize, note that we only discretize space, as discretizing time in this way will cause trouble with the conjugate momenta.

The field operators are discretized over a "grid" of points $x_j$, each of width $\epsilon$, such that

$$\phi(t, x)\to\phi(t, x_j)\equiv q^j(t). \tag{29}$$

Then discretize the integral by turning it into a sum over the grid

$$\int d^3x\to\epsilon^3\sum_{j\in\mathbb{Z}^3}. \tag{30}$$

Next the derivative can be discretized via a finite difference. Note that there are more computationally efficient symmetric differences that can be used to discretize the derivative, but the forward difference works well for demonstration

$$\nabla_\mu\phi(x)\to\frac{\phi(x_j+\epsilon_\mu) - \phi(x_j)}{|\epsilon_\mu|} \tag{31}$$

where $\epsilon_\mu$ denotes the four directions in which to calculate the derivative

$$\epsilon_\mu \in \epsilon\left\{\begin{pmatrix}1\\0\\0\\0\end{pmatrix},\begin{pmatrix}0\\1\\0\\0\end{pmatrix},\begin{pmatrix}0\\0\\1\\0\end{pmatrix},\begin{pmatrix}0\\0\\0\\1\end{pmatrix}\right\}. \tag{32}$$

Lastly, the potential is simply evaluated at each $x_j$

$$V(\phi(x))\to V(\phi(x_j)). \tag{33}$$
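To see what is traded away by using the forward difference, the following sketch compares it with the symmetric difference on a test function. The function $\sin x$, the point $x = 1$, and the spacing are arbitrary choices made for illustration, not values from the lecture.

```python
import math


def forward_difference(f, x, eps):
    """Forward difference, the discretization used in (31)."""
    return (f(x + eps) - f(x)) / eps


def symmetric_difference(f, x, eps):
    """Symmetric (central) difference, accurate to one order higher in eps."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


eps = 1e-2  # illustrative lattice spacing
exact = math.cos(1.0)  # derivative of sin at x = 1
forward_error = abs(forward_difference(math.sin, 1.0, eps) - exact)
symmetric_error = abs(symmetric_difference(math.sin, 1.0, eps) - exact)
print(forward_error, symmetric_error)
```

The forward-difference error scales like $\epsilon$ and the symmetric one like $\epsilon^2$; both vanish in the continuum limit, which is why the simpler choice is good enough for the demonstration.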


Then the Lagrangian is discretized to a sum over a bunch of terms, but the only term relevant to the construction of the Hamiltonian is the one with the time derivative of the field operator $\dot{\phi}$ (Exercise)

$$L = \int d^3x\,\mathcal{L}\to\epsilon^3\sum_j\frac{1}{2}(\dot{\phi}_j)^2 \tag{34}$$

And the discretized conjugate momentum becomes

$$\pi_j = \frac{\partial L}{\partial\dot{q}^j} = \frac{\partial L}{\partial\dot{\phi}_j}\to\epsilon^3\dot{q}^j. \tag{35}$$
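Finally, we have the discretized Hamiltonian, where we display the $\epsilon$ terms to show that if we did not add the $\epsilon^3$ term to the discretized Lagrangian, we would be stuck with an extra $\epsilon^{-3}$ on the discretized Hamiltonian

$$H = \sum_j\epsilon^3\left(\frac{1}{2}\epsilon^{-3}\pi_j^2 + \left(\frac{q_{j+\epsilon_\mu} - q_j}{\epsilon}\right)^2\right) + V(q). \tag{36}$$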

In summary, the discretization of the scalar field gives us a nonrelativistic lattice system such that the discretized Hamiltonian is the sum of a kinetic energy term and a potential energy term. The second step is to evaluate the (nonrelativistic) path integral, and the third step is to take the continuum limit as $\epsilon\to 0$, which will later be re-branded as renormalization.

The most important case of the scalar field is the quadratic potential, which corresponds to the Klein-Gordon field (discretizing Klein-Gordon theory yields the quadratic potential below),

$$V(q) = \frac{1}{2}q^TAq. \tag{37}$$

Gaussian Integrals
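Consider the following integral

$$I = \int_{-\infty}^{\infty}dx\,e^{-x^2} = \sqrt{\pi}. \tag{38}$$

Proof:

$$I = \int_{-\infty}^{\infty}dx\,e^{-x^2} \tag{39}$$

$$I^2 = \left(\int_{-\infty}^{\infty}dx\,e^{-x^2}\right)\left(\int_{-\infty}^{\infty}dy\,e^{-y^2}\right) \tag{40}$$

$$I^2 = \int_0^{\infty}r\,dr\int_0^{2\pi}d\theta\,e^{-r^2} \tag{41}$$

$$I^2 = 2\pi\int_0^{\infty}\frac{d}{dr}\left(-\frac{1}{2}e^{-r^2}\right)dr = \pi. \tag{42}$$

This is actually a special case of the more general forms of the Gaussian integral

$$\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}ax^2+bx} = \sqrt{\frac{2\pi}{a}}\,e^{\frac{b^2}{2a}} \tag{43}$$

$$\int_{-\infty}^{\infty}dx\,e^{\frac{i}{2}ax^2+ibx} = \sqrt{\frac{2\pi i}{a}}\,e^{-\frac{ib^2}{2a}}. \tag{44}$$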
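The real Gaussian formula $\int dx\, e^{-\frac{1}{2}ax^2+bx} = \sqrt{2\pi/a}\; e^{b^2/2a}$ is easy to sanity-check numerically. The sketch below uses a plain trapezoid rule on a truncated interval; the values of $a$ and $b$, the interval half-width, and the step count are arbitrary illustrative choices.

```python
import math


def trapezoid(f, lo, hi, steps):
    """Composite trapezoid rule for f on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def gaussian_integral_numeric(a, b, half_width=12.0, steps=4800):
    """Numerically integrate exp(-a x^2 / 2 + b x) over a wide interval."""
    return trapezoid(lambda x: math.exp(-0.5 * a * x * x + b * x),
                     -half_width, half_width, steps)


def gaussian_integral_closed_form(a, b):
    """Closed form sqrt(2 pi / a) * exp(b^2 / (2 a)), valid for a > 0."""
    return math.sqrt(2.0 * math.pi / a) * math.exp(b * b / (2.0 * a))


print(gaussian_integral_numeric(2.0, 0.5), gaussian_integral_closed_form(2.0, 0.5))
```

Setting $a = 2$, $b = 0$ reproduces the basic integral $\int dx\, e^{-x^2} = \sqrt{\pi}$.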
We will later need the moments generated by the Gaussian integral

$$\langle x^n\rangle = \frac{\int_{-\infty}^{\infty}dx\,x^n e^{-\frac{1}{2}ax^2}}{\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}ax^2}}. \tag{46}$$

Note that if $n$ is odd, then the moment is zero. For even $n$ we write the exponent of $x$ as $2m$, where $m$ is a non-negative integer, and we have the relation (Exercise)

$$\langle x^{2m}\rangle = \frac{(2m-1)!!}{a^m}. \tag{47}$$

Note that the double factorial $(2m-1)!!$ represents the number of ways to join $2m$ points in pairs. "All science should in linear algebra or combinatorics."
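As a quick check of the moment relation $\langle x^{2m}\rangle = (2m-1)!!/a^m$, the sketch below evaluates the ratio of integrals on a uniform grid and compares it with the double-factorial formula. The value of $a$ and the grid parameters are arbitrary illustrative choices.

```python
import math


def double_factorial(n):
    """Return n!! = n (n-2) (n-4) ..., with (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def gaussian_moment_numeric(power, a, half_width=15.0, steps=6000):
    """Ratio of grid sums approximating <x^power> with weight exp(-a x^2 / 2)."""
    h = 2.0 * half_width / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = math.exp(-0.5 * a * x * x)
        numerator += x ** power * weight
        denominator += weight
    return numerator / denominator


a = 1.5  # illustrative value
for m in range(4):
    print(2 * m, gaussian_moment_numeric(2 * m, a),
          double_factorial(2 * m - 1) / a ** m)
```

The grid spacing cancels in the ratio, so a plain sum is enough; odd powers come out at the level of rounding error.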

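Another closed form of this moment, in terms of derivatives, is

$$\langle x^{2m}\rangle = \left.\left(\frac{d}{db}\right)^{2m}\frac{\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}ax^2+bx}}{\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}ax^2}}\right|_{b=0} \tag{48}$$

$$\langle x^{2m}\rangle = \left.\left(\frac{d}{db}\right)^{2m}e^{\frac{b^2}{2a}}\right|_{b=0}. \tag{49}$$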

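To evaluate the multivariable Gaussian integrals, where $x\in\mathbb{R}^n$, consider

$$I(A, B) = \int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,e^{-x^TAx+B^Tx} \tag{50}$$

where $A$ is an $n\times n$ real symmetric matrix and $B$ is an $n\times 1$ real vector. Since $A$ is real and symmetric, it can be diagonalized by an orthogonal matrix $O$, with $O^TO = I$, and a diagonal matrix $D$ whose entries are the eigenvalues $\lambda_j$ of $A$ (taken to be positive so that the integral converges)

$$O^TDO = A \tag{51}$$

Assume that $B = 0$ and define $y = Ox$. Then

$$I(A, B=0) = \int_{-\infty}^{\infty}dy_1\cdots\int_{-\infty}^{\infty}dy_n\,e^{-y^TDy} \tag{52}$$

$$= \prod_{j=1}^{n}\int_{-\infty}^{\infty}dy_j\,e^{-y_j^2\lambda_j} \tag{53}$$

$$= \prod_{j=1}^{n}\sqrt{\frac{\pi}{\lambda_j}} \tag{54}$$

$$I(A, B=0) = \sqrt{\frac{\pi^n}{\det(A)}}. \tag{55}$$

The $B\neq 0$ case (Exercise) follows by completing the square and results in

$$I(A, B) = \sqrt{\frac{\pi^n}{\det(A)}}\,e^{\frac{1}{4}B^TA^{-1}B}. \tag{56}$$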
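For a concrete check of the multivariable result, the sketch below treats the case $n = 2$ by brute-force summation on a grid and compares it with the closed form $\sqrt{\pi^n/\det A}\; e^{\frac{1}{4}B^TA^{-1}B}$, which is what completing the square gives for the exponent $-x^TAx + B^Tx$. The matrix $A$, the vector $B$, and the grid parameters are arbitrary illustrative choices.

```python
import math


def grid_integral_2d(integrand, half_width=8.0, steps=320):
    """Sum integrand(x1, x2) over a uniform square grid times the cell area."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x1 = -half_width + i * h
        for j in range(steps + 1):
            x2 = -half_width + j * h
            total += integrand(x1, x2)
    return total * h * h


def gaussian_2d_numeric(a, b):
    """Grid approximation of the integral of exp(-x^T A x + B^T x) for n = 2."""
    def integrand(x1, x2):
        quad = a[0][0] * x1 * x1 + 2.0 * a[0][1] * x1 * x2 + a[1][1] * x2 * x2
        return math.exp(-quad + b[0] * x1 + b[1] * x2)
    return grid_integral_2d(integrand)


def gaussian_2d_closed_form(a, b):
    """sqrt(pi^2 / det A) * exp(B^T A^{-1} B / 4) for symmetric positive A."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    # B^T A^{-1} B written out with the explicit 2x2 inverse.
    b_ainv_b = (a[1][1] * b[0] ** 2 - 2.0 * a[0][1] * b[0] * b[1]
                + a[0][0] * b[1] ** 2) / det
    return math.pi / math.sqrt(det) * math.exp(0.25 * b_ainv_b)


A = [[2.0, 0.5], [0.5, 1.0]]  # illustrative symmetric positive-definite matrix
B = [0.3, -0.2]               # illustrative source vector
print(gaussian_2d_numeric(A, B), gaussian_2d_closed_form(A, B))
```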

3 Lecture 3: Correlation Functions and Path Integrals
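Recall the generating function for a single-variable Gaussian probability distribution $e^{-\frac{1}{2}ax^2}$ and its even moments

$$\langle x^{2n}\rangle = \frac{\int_{-\infty}^{\infty}dx\,x^{2n}e^{-\frac{1}{2}ax^2}}{\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}ax^2}} = \frac{(2n-1)!!}{a^n}. \tag{57}$$

We also derived the identity with the generating function for the multivariable Gaussian probability distribution

$$\int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,e^{-\frac{1}{2}x^TAx+J^Tx} = \sqrt{\frac{(2\pi)^n}{\det(A)}}\,e^{\frac{1}{2}J^TA^{-1}J}. \tag{58}$$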
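The 2-point correlation function for the $n$-variable Gaussian is (Exercise), for $j\neq k$

$$\langle x_jx_k\rangle\equiv\frac{\int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,x_jx_k\,e^{-\frac{1}{2}x^TAx}}{\int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,e^{-\frac{1}{2}x^TAx}} = [A^{-1}]_{jk}. \tag{59}$$

Note that this is also equal to the second derivative with respect to the vector $J$, evaluated at $J = 0$

$$\langle x_jx_k\rangle = \left.\frac{\frac{\partial^2}{\partial J_j\partial J_k}\int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,e^{-\frac{1}{2}x^TAx+J^Tx}}{\int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,e^{-\frac{1}{2}x^TAx+J^Tx}}\right|_{J=0}. \tag{60}$$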
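The identification of the 2-point function with $[A^{-1}]_{jk}$ can be tested directly for $n = 2$. The sketch below evaluates the ratio of integrals on a grid and compares each entry with the explicit $2\times 2$ inverse; the matrix $A$ and the grid parameters are arbitrary illustrative choices.

```python
import math


def two_point_numeric(a, j, k, half_width=10.0, steps=200):
    """Grid estimate of <x_j x_k> with weight exp(-x^T A x / 2), n = 2."""
    h = 2.0 * half_width / steps
    numerator = 0.0
    denominator = 0.0
    for p in range(steps + 1):
        x1 = -half_width + p * h
        for q in range(steps + 1):
            x2 = -half_width + q * h
            x = (x1, x2)
            quad = a[0][0] * x1 * x1 + 2.0 * a[0][1] * x1 * x2 + a[1][1] * x2 * x2
            weight = math.exp(-0.5 * quad)
            numerator += x[j] * x[k] * weight
            denominator += weight
    return numerator / denominator


def inverse_2x2(a):
    """Explicit inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


A2 = [[2.0, 0.6], [0.6, 1.0]]  # illustrative symmetric positive-definite matrix
A2_INV = inverse_2x2(A2)
for j in range(2):
    for k in range(2):
        print(j, k, two_point_numeric(A2, j, k), A2_INV[j][k])
```

The agreement holds for $j = k$ as well, so the same formula gives the variances on the diagonal.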

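Higher Order Moments: l-point Correlation Functions

The $l$-point correlation function has a similar form

$$\langle x_{j_1}\cdots x_{j_l}\rangle\equiv\frac{\int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,x_{j_1}\cdots x_{j_l}\,e^{-\frac{1}{2}x^TAx}}{\int_{-\infty}^{\infty}dx_1\cdots\int_{-\infty}^{\infty}dx_n\,e^{-\frac{1}{2}x^TAx}}. \tag{61}$$

By Wick's theorem (proof by induction), for even $l$, the $l$-point correlation function is proportional to a sum of products over the permutation group on $l$ symbols, the "Wick sum". We write "proportional to" because the symmetry of $A^{-1}$ makes many of these terms redundant, and we will soon eliminate the redundant terms.

$$\langle x_{j_1}\cdots x_{j_l}\rangle\propto\sum_{\pi\in S_l}[A^{-1}]_{j_{\pi^{-1}(1)}j_{\pi^{-1}(2)}}\cdots[A^{-1}]_{j_{\pi^{-1}(l-1)}j_{\pi^{-1}(l)}}. \tag{62}$$

Example: 4-point Correlation

To calculate the 4-point correlation function, we begin by considering the $4! = 24$ total permutations on 4 symbols. Since $A^{-1}$ is symmetric,

$$[A^{-1}]_{jk} = [A^{-1}]_{kj}, \tag{63}$$

and the order of the two factors in each product does not matter, there are only $\frac{24}{2!\,2!\,2!} = 3$ unique terms (products of two matrix elements), which are (Exercise)

$$\begin{aligned}\langle x_{j_1}x_{j_2}x_{j_3}x_{j_4}\rangle &= [A^{-1}]_{j_1j_2}[A^{-1}]_{j_3j_4} + [A^{-1}]_{j_1j_3}[A^{-1}]_{j_2j_4} + [A^{-1}]_{j_1j_4}[A^{-1}]_{j_2j_3}\\ &= \langle x_{j_1}x_{j_2}\rangle\langle x_{j_3}x_{j_4}\rangle + \langle x_{j_1}x_{j_3}\rangle\langle x_{j_2}x_{j_4}\rangle + \langle x_{j_1}x_{j_4}\rangle\langle x_{j_2}x_{j_3}\rangle\end{aligned}$$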
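The unique terms of the Wick sum are exactly the ways of splitting the $l$ indices into unordered pairs. The sketch below enumerates those pairings recursively and sums the corresponding products of $[A^{-1}]$ entries; the function names and the sample matrix are illustrative choices, not notation from the lecture.

```python
def pairings(items):
    """Yield every way to split the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick_moment(indices, a_inv):
    """Sum over pairings of products of a_inv entries (zero for odd length)."""
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for j, k in pairing:
            term *= a_inv[j][k]
        total += term
    return total


# Illustrative symmetric matrix standing in for A^{-1}.
A_INV = [[2.0, 0.5], [0.5, 1.0]]

print(len(list(pairings([1, 2, 3, 4]))))        # number of 4-point terms
print(len(list(pairings([1, 2, 3, 4, 5, 6]))))  # number of 6-point terms
print(wick_moment((0, 1, 0, 1), A_INV))         # <x_0 x_1 x_0 x_1>
```

Four points give 3 pairings and six points give 15, matching the double factorials $3!!$ and $5!!$. For a single variable with $[A^{-1}] = 1/a$, `wick_moment((0, 0, 0, 0), [[1 / a]])` reduces to $3/a^2$, in agreement with the one-dimensional moment formula.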

With the appropriate choice of $A$, in the context of path integrals and perturbative field theory, these products of correlation functions correspond exactly to Feynman propagators and, in turn, to Feynman diagrams, just as we studied in Lecture 9 of the last lecture series (Quantum Field Theory) for the 4-particle Wick contraction.

Figure 1: Feynman diagram correspondence of the 4-point correlation function

Keeping only unique terms, the proportionality relation becomes an equality

$$\langle x_{j_1}\cdots x_{j_l}\rangle = \sum_{\text{unique }\pi^{-1}}[A^{-1}]_{j_{\pi^{-1}(1)}j_{\pi^{-1}(2)}}\cdots[A^{-1}]_{j_{\pi^{-1}(l-1)}j_{\pi^{-1}(l)}} \tag{64}$$

(Exercise) Calculate the 6-point correlation function with $\frac{6!}{2!\,2!\,2!\,3!} = 15$ unique terms

$$\langle x_{j_1}x_{j_2}x_{j_3}x_{j_4}x_{j_5}x_{j_6}\rangle = [A^{-1}]_{j_1j_2}[A^{-1}]_{j_3j_4}[A^{-1}]_{j_5j_6} + \dots \tag{65}$$
In short summary,

• We can calculate all moments for the Gaussian probability distribution.

• We have a diagrammatic calculus to calculate the $l$-point correlation functions, which, with an appropriate choice of $A$, end up being exactly the Feynman propagators/diagrams. This is the direct connection between quantum field theory and Gaussian integrals.

The Matrix A for Path Integrals
Let the potential $V$ be quadratic in the canonical position coordinate per particle $q_k$ (e.g., a one-dimensional chain of oscillators), such that the transition amplitude from some state $q_a$ to another $q_b$, which will be discretized, evaluated, and taken to the limit $\epsilon\to 0$, is

$$U(q_a, q_b; T) = \left(\prod_k\int\frac{dq_k}{c(\epsilon)}\right)e^{\frac{1}{2}iq^TAq} \tag{66}$$

where we know the quadratic form contains a kinetic energy term plus a potential energy term

$$q^TAq = \sum_k\left(m\frac{(q_{k+1}-q_k)^2}{\epsilon} - \epsilon V\!\left(\frac{q_{k+1}+q_k}{2}\right)\right). \tag{67}$$

This results in $A$ being a tridiagonal matrix for the kinetic energy term plus a matrix for the potential energy term, whose elements come from the part of $V$ quadratic in $q$

$$A = \begin{pmatrix}\frac{2m}{\epsilon} & -\frac{m}{\epsilon} & 0 & \cdots\\ -\frac{m}{\epsilon} & \frac{2m}{\epsilon} & -\frac{m}{\epsilon} & \cdots\\ 0 & -\frac{m}{\epsilon} & \frac{2m}{\epsilon} & \cdots\\ \vdots & \vdots & \vdots & \ddots\end{pmatrix} + [V(q^2)] \tag{68}$$
The transition amplitude is then calculated similarly to last lecture as

$$U(q_a, q_b; T) = \frac{\infty\ \text{const.}}{\sqrt{\det(A)}} \tag{69}$$

The infinite constant will not be a problem since the $l$-point correlation is normalized: the exact same infinite constant appears in the denominator and cancels. So, in terms of $q_k$, the $l$-point correlation reads

$$\langle q_{j_1}\cdots q_{j_l}\rangle\equiv\frac{\left(\prod_k\int\frac{dq_k}{c(\epsilon)}\right)q_{j_1}\cdots q_{j_l}\,e^{\frac{1}{2}iq^TAq}}{\left(\prod_k\int\frac{dq_k}{c(\epsilon)}\right)e^{\frac{1}{2}iq^TAq}}. \tag{70}$$

Note that with periodic boundary conditions, the elements of $A$ follow a modulo relation $A_{jk} = f((j-k)\bmod n)$, where $n$ is the number of sites/oscillators in the chain.

Since $A$ is real and symmetric (and assuming that it is invertible), there exists an orthogonal matrix $Q$, such that $Q^TQ = I$ and $Q^TAQ = D$, with $D$ a diagonal matrix with the eigenvalues of $A$ along the diagonal. Then the determinant of $A$ is easy to calculate, since

$$\det(A) = \prod_{j=1}^{n}\lambda_j(A). \tag{71}$$
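With periodic boundary conditions the matrix is circulant, so its eigenvalues are the discrete Fourier transform of its first row, and the determinant follows as their product. The sketch below checks this against Gaussian elimination. It assumes, for illustration only, that the potential contributes a constant shift `v` on the diagonal; the parameter names and values are hypothetical.

```python
import cmath


def chain_matrix(n, m, eps, v):
    """Periodic tridiagonal matrix: 2m/eps + v on the diagonal, -m/eps beside it."""
    a = [[0.0] * n for _ in range(n)]
    for j in range(n):
        a[j][j] = 2.0 * m / eps + v
        a[j][(j + 1) % n] += -m / eps
        a[j][(j - 1) % n] += -m / eps
    return a


def circulant_eigenvalues(first_row):
    """Eigenvalues of a circulant matrix: the DFT of its first row."""
    n = len(first_row)
    return [sum(first_row[j] * cmath.exp(2j * cmath.pi * j * k / n)
                for j in range(n)) for k in range(n)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


CHAIN = chain_matrix(n=6, m=1.0, eps=0.5, v=0.3)  # illustrative parameters
eigenvalue_product = 1.0 + 0.0j
for lam in circulant_eigenvalues(CHAIN[0]):
    eigenvalue_product *= lam
print(eigenvalue_product.real, determinant(CHAIN))
```

For this matrix the eigenvalues are $\frac{2m}{\epsilon} + v - \frac{2m}{\epsilon}\cos\frac{2\pi k}{n}$, which makes the role of the potential term visible: without it the $k = 0$ mode has eigenvalue zero and $A$ is not invertible.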

Correlation Functions and Quantum Observables
Consider the transition amplitude with a 2-point insertion of the field

$$U(q_a, q_b; T)\propto\int\mathcal{D}\phi(x)\,\phi(x_1)\phi(x_2)\,e^{i\int_{-T}^{T}d^4x\,\mathcal{L}(\phi)} \tag{72}$$

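With an expression like this, always discretize by sending $\phi(x_j)\to q_j$, evaluate the integral, and take the continuum limit with the boundary conditions

$$\phi(-T, \mathbf{x}) = \phi_a(\mathbf{x}) \tag{73}$$

$$\phi(T, \mathbf{x}) = \phi_b(\mathbf{x}). \tag{74}$$

Apply the following decomposition, exploiting the boundary conditions, to factor the full field "integral" into integrals over the fields $\phi_1$ and $\phi_2$ on the two time slices $x_1^0$ and $x_2^0$, and an integral over the rest of the field with those boundary values held fixed

$$\int\mathcal{D}\phi(x) = \int\mathcal{D}\phi_1(\mathbf{x})\int\mathcal{D}\phi_2(\mathbf{x})\int_{\partial\phi}\mathcal{D}\phi(x) \tag{75}$$

where the boundary $\partial\phi = \partial\phi_1 + \partial\phi_2$ is defined by

$$\phi_1(\mathbf{x}) = \phi(x_1^0, \mathbf{x}) \tag{76}$$

$$\phi_2(\mathbf{x}) = \phi(x_2^0, \mathbf{x}) \tag{77}$$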
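So, after performing the boundary integral, we introduce quantum stuff to the expression from the classical 2-point function above, for $x_2^0 > x_1^0$

$$U(q_a, q_b; T)\propto\int\mathcal{D}\phi_1(\mathbf{x})\int\mathcal{D}\phi_2(\mathbf{x})\,\phi(x_1)\phi(x_2)\,\langle\phi_b|\,e^{-i\hat{H}(T-x_2^0)}\,|\phi_2\rangle\langle\phi_2|\,e^{-i\hat{H}(x_2^0-x_1^0)}\,|\phi_1\rangle\langle\phi_1|\,e^{-i\hat{H}(x_1^0+T)}\,|\phi_a\rangle$$

Now, apply the Schroedinger-picture field operator to write the classical fields in terms of quantum field operators. The formula is

$$\hat{\phi}_S(\mathbf{x}_1)\,|\phi_1\rangle = \phi(x_1)\,|\phi_1\rangle \tag{78}$$

So, the purely quantum expression for the 2-point function is

$$U(q_a, q_b; T)\propto\int\mathcal{D}\phi_1(\mathbf{x})\int\mathcal{D}\phi_2(\mathbf{x})\,\langle\phi_b|\,e^{-i\hat{H}(T-x_2^0)}\,\hat{\phi}_S(\mathbf{x}_2)\,|\phi_2\rangle\langle\phi_2|\,e^{-i\hat{H}(x_2^0-x_1^0)}\,\hat{\phi}_S(\mathbf{x}_1)\,|\phi_1\rangle\langle\phi_1|\,e^{-i\hat{H}(x_1^0+T)}\,|\phi_a\rangle$$

This is called the time-ordered expectation value of the field operators in the Heisenberg picture. The equation holds for $x_2^0 < x_1^0$ as well, and we can write it as

$$U(q_a, q_b; T)\propto\langle\phi_b|\,e^{-i\hat{H}T}\,T[\hat{\phi}_H(x_1)\hat{\phi}_H(x_2)]\,e^{-i\hat{H}T}\,|\phi_a\rangle \tag{79}$$

Now, take the limit $T\to\infty(1-i\epsilon)$, which brings in the interacting vacuum state $|\Omega\rangle$, divide by the normalization for the full transition amplitude (Exercise), and arrive at the most important formula for this course

$$\langle\Omega|\,T[\hat{\phi}_H(x_1)\hat{\phi}_H(x_2)]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\,e^{i\int_{-T}^{T}d^4x\,\mathcal{L}(\phi)}}{\int\mathcal{D}\phi\,e^{i\int_{-T}^{T}d^4x\,\mathcal{L}(\phi)}}. \tag{80}$$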
So, the LHS is built of purely quantum observables, and it is equal to the classical expression of path integrals on the RHS!

This expression will end up being the propagator, which is also the inverse of the Klein-Gordon operator, which is what we call $A$ in the scalar quantum field theory.

The solution to this is well-known for the case when $\mathcal{L}$ is quadratic in the field operators, and one can easily discretize, evaluate the Gaussian integral, and take the limit as $\epsilon\to 0$.

(Exercise) Calculate the $l$-point formula for the time-ordered expectation value of the field operators in the Heisenberg picture.

4 Lecture 4: Functional Quantization of the Scalar Field
The path integral formalism for quantization of fields is an incredibly efficient tool, but one must learn when, and when not, to use it. Through the lectures and many examples, we'll develop an intuition for when to trust quantization via path integrals.

Recall the action $S_0$ of the scalar field with classical field operators $\phi$

$$S_0 = \int d^4x\,\mathcal{L}_0 = \int d^4x\left(\frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2\right). \tag{81}$$

We will (1) discretize, tantamount to imposing a cutoff $\Lambda$, (2) evaluate the integrals, and (3) take the continuum limit $\epsilon\to 0$. Start the discretization by putting the field on a lattice (a Lorentz manifold) with spacing $\epsilon$ and then compactify the space onto a torus for periodic boundary conditions.

Mathematically, we are transforming from a four-dimensional Minkowski space $M^4$ to a four-dimensional torus $(\mathbb{Z}/N\mathbb{Z})^4$, where $N = \frac{L}{\epsilon}$ is the number of sites in each direction, and $L$ is the total size of the grid.
Continue discretization with the field operators

$$\phi(x)\to\phi(x_j)\equiv q_j, \tag{82}$$

$$x_j = \frac{L}{N}\,j,\qquad j\in(\mathbb{Z}/N\mathbb{Z})^4. \tag{83}$$

And the partial derivatives are replaced by the forward difference, which is not the best method, but it's "good enough for government work"

$$\partial_\mu\phi(x)\to\frac{\phi(x_j+\epsilon e_\mu) - \phi(x_j)}{\epsilon}. \tag{84}$$

And the space-time integral becomes a sum over the sites on the torus

$$\int d^4x\to\epsilon^4\sum_{j\in(\mathbb{Z}/N\mathbb{Z})^4} \tag{85}$$

Now following the path integral quantization recipe, consider the transition amplitude in terms of the discretized action

$$\langle\phi_f|\,U(q_i, q_f; T)\,|\phi_i\rangle = \int\mathcal{D}\phi\,e^{iS_0} \tag{86}$$

where we follow the "algorithm" of the integral-differential operator and discretize it to a product, over the torus sites, of integrals ($N^4$ total integrals) over the field operators, the canonical position variables

$$\int\mathcal{D}\phi\to\prod_j\int d\phi(x_j)\equiv\prod_j\int dq_j. \tag{87}$$
Discretization in Momentum Space


Thus far we have worked entirely in real (position) space. Let's Fourier transform over into momentum space to continue discretization. The Fourier transform is a unitary transformation with Jacobian equal to 1 (Exercise). First, the field operators transform as

$$\phi(x_j) = \frac{1}{V}\sum_n e^{-ik_n\cdot x_j}\phi(k_n) \tag{88}$$

where $V = L^4$ is the volume of the 4D torus. Notationally, the $k$ argument to the field operator in momentum space $\phi(k)$ will denote the Fourier transform of the field operator in real space $\phi(x)$. The wavenumber $k_n$ is discretized over the torus as

$$k_n^\mu = \frac{2\pi n^\mu}{L},\qquad n^\mu\in\mathbb{Z}/N\mathbb{Z}\quad\text{and}\quad|k^\mu| < \frac{\pi}{\epsilon} \tag{89}$$

Note that the Fourier space field operator is complex, such that $\phi(-k) = \phi^*(k)$, and we therefore have two independent variables per field operator in momentum space: the real part $\Re\,\phi(k_n)$ and the imaginary part $\Im\,\phi(k_n)$ for positive time-component $k_n^0 > 0$.

So, in momentum space, the discretized integral-differential operator is (Exercise)

$$\int\mathcal{D}\phi = \prod_{n:\,k_n^0>0}\int d\,\Re\,\phi(k_n)\int d\,\Im\,\phi(k_n). \tag{90}$$
And the discretized action for the scalar field in momentum space is (Exercise)

$$S_0 = -\frac{1}{V}\sum_{k_n^0>0}(m^2-k_n^2)\left((\Re\,\phi_n)^2 + (\Im\,\phi_n)^2\right) \tag{91}$$

where $\phi_n\equiv\phi(k_n)$, and the following relation for the Kronecker delta is used to obtain this expression

$$\delta_{k,0} = \frac{1}{n}\sum_{j=0}^{n-1}e^{\frac{2\pi ijk}{n}}. \tag{92}$$
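The Kronecker delta relation is the orthogonality of the discrete Fourier modes, and it can be checked in a few lines. In the sketch below the lattice size `n = 8` is an arbitrary choice; note that on a periodic lattice the sum equals 1 whenever $k$ is a multiple of $n$, not only for $k = 0$.

```python
import cmath


def kronecker_sum(k, n):
    """Evaluate (1/n) * sum_{j=0}^{n-1} exp(2 pi i j k / n)."""
    return sum(cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n


n_sites = 8  # illustrative lattice size
for k in range(-2, 10):
    print(k, round(abs(kronecker_sum(k, n_sites)), 12))
```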
Our expression for the path integral for the Klein-Gordon field discretized on a lattice (four-dimensional with periodic boundary conditions) is comprised of Gaussian integrals over a finite number of degrees of freedom

$$I_0 = \int\mathcal{D}\phi\,e^{iS_0} = \left(\prod_{k_n^0>0}\int d\,\Re\,\phi_n\int d\,\Im\,\phi_n\right)e^{-i\frac{1}{V}\sum_{k_n^0>0}(m^2-k_n^2)|\phi_n|^2}. \tag{93}$$

Now, onto evaluating this integral: it's just a bunch of Gaussian integrals, and we know how to solve those. We get the following, and lift the restriction on $k_n$ to get the second line (Exercise)

$$I_0 = \prod_{k_n^0>0}\sqrt{\frac{-i\pi V}{m^2-k_n^2}}\cdot\sqrt{\frac{-i\pi V}{m^2-k_n^2}} \tag{94}$$

$$I_0 = \prod_{k_n}\sqrt{\frac{-i\pi V}{m^2-k_n^2}} \tag{95}$$

Note that $k_n$ is bounded, but we have an infinity when $V\to\infty$ (continuum limit). This integral does not yet have full operational meaning, however: it is only proportional to the transition amplitude, $I_0\propto\langle\phi_f|\,U(q_i, q_f; T)\,|\phi_i\rangle$, and the infinities will cancel and drop out in the full expression.

Heuristic Argument for I0


As the "surface area of knowledge" we need to remember for the path integral formalism is small, there is a heuristic way to obtain this result without formal discretization, etc., using the aforementioned intuition.

Recall the Gaussian integral whose argument is quadratic in its independent variable

$$\int dx\,e^{-x^TAx} = \sqrt{\frac{\pi^n}{\det(A)}}\propto(\det(A))^{-\frac{1}{2}} \tag{96}$$

For the Klein-Gordon field, consider the path integral with the Klein-Gordon operator and field operators substituted

$$\int\mathcal{D}\phi\,e^{iS}\sim\int\mathcal{D}\phi\,e^{\frac{i}{2}\int d^4x\,\phi(x)(-\partial^2-m^2)\phi(x)} \tag{97}$$

So, we are boldly extrapolating to say that $A$ is like the Klein-Gordon operator and $x$ is like the field operator

$$\int d^4x\,\phi(x)(-\partial^2-m^2)\phi(x)\sim x^TAx. \tag{98}$$

Furthermore, we say that the path integral is proportional to a power of the determinant of the Klein-Gordon operator

$$\int\mathcal{D}\phi\,e^{iS}\propto(\det(-\partial^2-m^2))^{-\frac{1}{2}}. \tag{99}$$

Operationally Well-Defined Quantities
As mentioned, $I_0$ cancels for operationally well-defined quantities, such as the 2-point correlation function, a time-ordered expectation value of products of the field operators. For example, using the path integral formalism

$$\langle\Omega|\,T[\phi(x_1)\phi(x_2)]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\,e^{iS}}{\int\mathcal{D}\phi\,e^{iS}} \tag{100}$$

To check our bold extrapolations, calculate the discretized field operator product

$$\phi(x_1)\phi(x_2) = \frac{1}{V^2}\sum_m e^{-ik_m\cdot x_1}\phi_m\sum_l e^{-ik_l\cdot x_2}\phi_l \tag{101}$$

So, the discretized RHS numerator of the time-ordered expectation value above is just a bunch of independent Gaussian integrals, quadratic in their independent variables (Exercise)

$$\begin{aligned}\text{numerator} &= \frac{1}{V^2}\sum_{l,m}e^{-i(k_m\cdot x_1+k_l\cdot x_2)}\left(\prod_{k_n^0>0}\int d\,\Re\,\phi_n\int d\,\Im\,\phi_n\right)\\ &\qquad\times(\Re\,\phi_m+i\,\Im\,\phi_m)(\Re\,\phi_l+i\,\Im\,\phi_l)\,e^{-i\frac{1}{V}\sum_{k_n^0>0}(m^2-k_n^2)\left((\Re\,\phi_n)^2+(\Im\,\phi_n)^2\right)}\\ &= \frac{1}{V^2}\sum_m e^{-ik_m\cdot(x_1-x_2)}\left(\prod_{k_n^0>0}\frac{-i\pi V}{m^2-k_n^2}\right)\frac{-iV}{m^2-k_m^2-i\epsilon}\\ &= \frac{1}{V^2}\sum_m e^{-ik_m\cdot(x_1-x_2)}\cdot I_0\cdot\frac{-iV}{m^2-k_m^2-i\epsilon}\end{aligned}$$

Here we drastically cut down the number of integrals to evaluate, since any integrals involving products like $\Re\,\phi_m\cdot\Im\,\phi_l$ or $\Im\,\phi_m\cdot\Re\,\phi_l$ have odd integrands and evaluate to zero. The integral is also zero for terms where $k_l\neq\pm k_m$, and for terms where $k_m = k_l$ the contributions of the real and imaginary parts cancel. Only the integrals where $k_m = -k_l$ are not zero.

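Bringing this together, the RHS of the time-ordered expectation value has boiled down to

$$\langle\Omega|\,T[\phi(x_1)\phi(x_2)]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\,e^{iS}}{\int\mathcal{D}\phi\,e^{iS}} \tag{102}$$

$$= \lim_{V\to\infty}\frac{1}{V}\sum_n\frac{-i\,e^{-ik_n\cdot(x_1-x_2)}}{m^2-k_n^2-i\epsilon} \tag{103}$$

$$= \int\frac{d^4k}{(2\pi)^4}\frac{i\,e^{-ik\cdot(x_1-x_2)}}{k^2-m^2+i\epsilon} \tag{104}$$

$$\langle\Omega|\,T[\phi(x_1)\phi(x_2)]\,|\Omega\rangle = D_F(x_1-x_2) \tag{105}$$
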
So, the path integral formalism gives us exactly the propagator we wish to see. Note that if we were to just boldly extrapolate, without discretization, etc., we would get the same result!

For example,

$$\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\,e^{iS}}{\int\mathcal{D}\phi\,e^{iS}} = \frac{(\det(-\partial^2-m^2))^{-\frac{1}{2}}\,D_F(x_1-x_2)}{(\det(-\partial^2-m^2))^{-\frac{1}{2}}} \tag{106}$$

since if $A\sim(-\partial^2-m^2)$, then, up to factors of $i$, $[A^{-1}]_{jk}\sim\left[\frac{1}{-\partial^2-m^2}\right]_{x_1x_2} = D_F(x_1-x_2)$, with $(-\partial^2-m^2)D_F(x-y) = i\delta^{(4)}(x-y)$.
Example: 4-point Correlation Function


Note that all 3-point correlations are zero, since they all have odd integrands. The 4-point correlation function starts off as

$$\langle\Omega|\,T[\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\,e^{iS}}{\int\mathcal{D}\phi\,e^{iS}}. \tag{107}$$

The numerator contains products of the form $(\Re\,\phi_m+i\,\Im\,\phi_m)\cdots(\Re\,\phi_l+i\,\Im\,\phi_l)$, and most terms will vanish as before, leaving us with terms where $k_l = -k_m$ and $k_q = -k_p$, and we end up with, after applying Wick's theorem and sending $V\to\infty$ (Exercise), something like

$$\sum_{k_l,k_q}e^{-i\dots}\int\dots\phi_{k_l}\phi_{-k_l}\phi_{k_q}\phi_{-k_q}\,e^{\dots}$$

$$= D_F(x_1-x_2)D_F(x_3-x_4) + D_F(x_1-x_3)D_F(x_2-x_4) + D_F(x_1-x_4)D_F(x_2-x_3)$$

Interacting QFT via Path Integrals


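Consider the action with a free part and an interacting part, namely the phi-fourth interaction,

$$S = S_0 + S_{\text{int}} = S_0 + \frac{\lambda}{4!}\int d^4x\,\phi^4(x). \tag{108}$$

Then the time-ordered expectation value for 2-point correlations can be Taylor expanded, since $\lambda$ is small,

$$\langle\Omega|\,T[\phi(x_1)\phi(x_2)]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\,e^{i(S_0+S_{\text{int}})}}{\int\mathcal{D}\phi\,e^{i(S_0+S_{\text{int}})}} \tag{109}$$

$$= \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\,e^{iS_0}\left(1+S_{\text{int}}+\frac{1}{2}S_{\text{int}}^2+\dots\right)}{\int\mathcal{D}\phi\,e^{iS_0}\left(1+S_{\text{int}}+\frac{1}{2}S_{\text{int}}^2+\dots\right)} \tag{110}$$

where, in the expansion, $S_{\text{int}} = \frac{i\lambda}{4!}\int d^4z\,\phi^4(z)$ has absorbed the factor of $i$ from the exponent, and each term above is an integral of powers of time-ordered quantum field operators which end up as Feynman diagrams, for example, of the form

$$\frac{\lambda^m}{4!^m}\int d^4z_1\cdots\int d^4z_m\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\phi^4(z_1)\cdots\phi^4(z_m)\,e^{iS_0} \tag{111}$$

$$= \frac{\lambda^m}{4!^m}\int d^4z_1\cdots\int d^4z_m\,\langle\Omega|\,T[\hat{\phi}(x_1)\hat{\phi}(x_2)\hat{\phi}^4(z_1)\cdots\hat{\phi}^4(z_m)]\,|\Omega\rangle \tag{112}$$

$$= \text{Sum of Feynman diagrams} \tag{113}$$
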
5 Lecture 5: Functional Derivatives and Generating Functionals
Here we will finish the functional quantization of the scalar field.

Recall that we can compute time-ordered correlation functions for the quantum scalar field entirely in terms of classical quantities, which is equivalent to a sum over all diagrams,

$$\langle\Omega|\,T[\hat{\phi}(x_1)\cdots\hat{\phi}(x_n)]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\cdots\phi(x_n)\,e^{iS[\phi(x_1),\dots,\phi(x_n)]}}{\int\mathcal{D}\phi\,e^{iS[\phi(x_1),\dots,\phi(x_n)]}} \tag{114}$$

For example, the 2-point correlation function for the Klein-Gordon field is the Feynman propagator

$$\langle\Omega|\,T[\hat{\phi}(x_1)\hat{\phi}(x_2)]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\int\mathcal{D}\phi\,\phi(x_1)\phi(x_2)\,e^{iS}}{\int\mathcal{D}\phi\,e^{iS}} = D_F(x_1-x_2) \tag{115}$$

More elegantly, and analogous to multivariate Gaussian integrals, we found the Feynman propagator $D_F(x-y)$, which is the inverse of the Klein-Gordon operator $-\partial^2-m^2$, to be similar to the inverse of a matrix $A$, with the Klein-Gordon operator playing the role of the matrix $A$.

$$D_F(x_j-x_k)\sim[A^{-1}]_{jk} = \frac{\int dx_1\dots dx_n\,x_jx_k\,e^{-\frac{1}{2}x^TAx}}{\int dx_1\dots dx_n\,e^{-\frac{1}{2}x^TAx}} \tag{116}$$

To compute these $n$-point correlation functions, or elements of this "inverse matrix", we used derivatives of the generating functional, which is what we generalize in this lecture. Recall the multivariate Gaussian generating functional, written up to a $J$-independent normalization,

$$Z[J] = \int dx_1\dots dx_n\,e^{-\frac{1}{2}x^TAx-J^Tx}\propto e^{\frac{1}{2}J^TA^{-1}J}. \tag{117}$$

Functional Derivatives
The functional derivative is a tool from the calculus of variations that we now define by an example that is the continuum analog of the standard partial derivative

$$\frac{\delta}{\delta J(x)}J(y) = \delta^{(4)}(x-y). \tag{118}$$

There is a one-to-one mapping from the discrete representation to the continuous with correspondences

$$\begin{aligned}x\in\mathbb{R}\quad&\to\quad j\in\mathbb{Z}\\ J(x)\in C(\mathbb{R})\quad&\to\quad J_j\in L^2(\mathbb{Z})\\ \frac{\delta}{\delta J(x)}F[J(y)]\quad&\to\quad\frac{\partial}{\partial J_j}F[J_1, J_2,\dots]\end{aligned} \tag{119}$$

where $C(\mathbb{R})$ is a continuous function space and $L^2(\mathbb{Z})$ is ...

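Example 1

$$\begin{aligned}\frac{\delta}{\delta J(x)}e^{i\int d^4y\,J(y)\phi(y)} &= i\,e^{i\int d^4y\,J(y)\phi(y)}\,\frac{\delta}{\delta J(x)}\left(\int d^4y\,J(y)\phi(y)\right)\\ &= i\,e^{i\int d^4y\,J(y)\phi(y)}\int d^4y\,\frac{\delta J(y)}{\delta J(x)}\phi(y)\\ &= i\,e^{i\int d^4y\,J(y)\phi(y)}\int d^4y\,\delta^{(4)}(x-y)\phi(y)\\ \frac{\delta}{\delta J(x)}e^{i\int d^4y\,J(y)\phi(y)} &= i\phi(x)\,e^{i\int d^4y\,J(y)\phi(y)}\end{aligned}$$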

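Example 2: Derivatives of Delta functions

\begin{align*}
\frac{\delta}{\delta J(x)} \left( \int d^4y\, (\partial_\mu J(y))\, v^\mu(y) \right) &= \text{boundary term} - \frac{\delta}{\delta J(x)} \left( \int d^4y\, J(y)\, \partial_\mu v^\mu(y) \right) \\
&= -\partial_\mu v^\mu(x)
\end{align*}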

Note that the boundary term is almost always zero, except for topologically
interesting theories.
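The correspondence between $\delta/\delta J(x)$ and $\partial/\partial J_j$ can be made concrete on a lattice. The sketch below is an illustration under stated assumptions: spacetime is replaced by a one-dimensional lattice with spacing `dx`, and `phi` and `J` are hypothetical sample configurations. The functional derivative becomes $(1/dx)\,\partial/\partial J_j$, where the factor $1/dx$ turns the Kronecker delta into a lattice delta function.

```python
import numpy as np

# One-dimensional lattice standing in for spacetime (illustrative simplification).
n, length = 200, 10.0
dx = length / n
y = np.arange(n) * dx
phi = np.exp(-(y - 5.0) ** 2)   # hypothetical field configuration
J = 0.1 * np.sin(y)             # hypothetical source

def F(J):
    """Discretized exp(i * integral dy J(y) phi(y))."""
    return np.exp(1j * np.sum(J * phi) * dx)

def functional_derivative(F, J, j, h=1e-3):
    """(1/dx) * dF/dJ_j, approximated by a central difference of step h."""
    e = np.zeros_like(J)
    e[j] = h
    return (F(J + e) - F(J - e)) / (2 * h * dx)

j = 90
numeric = functional_derivative(F, J, j)
exact = 1j * phi[j] * F(J)      # lattice version of the result of Example 1

print(numeric, exact)
```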

The Generating Functional


Define the generating functional as
\[
Z[J] = \lim_{T\to\infty(1-i\epsilon)} \int \mathcal{D}\phi\; e^{iS + i\int d^4x\, J(x)\phi(x)}. \tag{120}
\]

This expression is obviously useful, since correlation functions are directly related to derivatives of $Z[J]$
\[
\langle\Omega|\, T[\hat\phi(x)\hat\phi(y)]\,|\Omega\rangle = \frac{-\dfrac{\delta}{\delta J(x)} \dfrac{\delta}{\delta J(y)} Z[J] \Big|_{J=0}}{Z[J] \Big|_{J=0}} \tag{121}
\]
So, if you can compute the generating functional $Z[J]$, you have all of the n-point correlation functions via derivatives for your field theory.

In free field theories, such as the Klein-Gordon field, the action is quadratic in the field, and the argument of the exponential in $Z[J]$ is

\[
i\left(S_0 + \int d^4x\, J(x)\phi(x)\right) = i \int d^4x \left[ \frac{1}{2}\phi(x)(-\partial^2 - m^2 + i\epsilon)\phi(x) + J(x)\phi(x) \right]. \tag{122}
\]

To homogenize quadraticity, complete the square by introducing the shift (with Jacobian = 1)
\[
\phi'(x) = \phi(x) - i \int d^4y\, D_F(x - y) J(y). \tag{123}
\]

This is analogous to the shift $x' = x + A^{-1}J$ that completes the square in the exponent of (117), and works because the Feynman propagator is the inverse of the Klein-Gordon operator, such that

\[
(-\partial^2 - m^2) D_F(x - y) = i\delta^{(4)}(x - y). \tag{124}
\]


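With the variable change, the exponential argument becomes

\begin{align*}
i\left(S_0 + \int d^4x\, J(x)\phi(x)\right) &= i \int d^4x \left[ \frac{1}{2}\phi'(x)(-\partial^2 - m^2 + i\epsilon)\phi'(x) \right] \\
&\quad - i \int d^4x \int d^4y \left[ \frac{1}{2} J(x)\,(-i D_F(x - y))\, J(y) \right].
\end{align*}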

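So, the generating functional is then
\[
Z[J] = Z_0\, e^{-\frac{1}{2} \int d^4x\, d^4y\, J(x) D_F(x - y) J(y)}, \tag{125}
\]
where the free field contribution, independent of $J$, is
\[
Z_0 = \int \mathcal{D}\phi'\; e^{i \int d^4x \left( \frac{1}{2}\phi'(x)(-\partial^2 - m^2 + i\epsilon)\phi'(x) \right)}. \tag{126}
\]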

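Examples of Free Theory Correlation Functions

Example 1: The 2-point correlation function, with the $Z_0$ cancelled out,
\[
\langle\Omega|\, T[\hat\phi(x)\hat\phi(y)]\,|\Omega\rangle = -\frac{\delta}{\delta J(x)} \frac{\delta}{\delta J(y)}\, e^{-\frac{1}{2} \int d^4x'\, d^4y'\, J(x') D_F(x' - y') J(y')} \Big|_{J=0} = D_F(x - y). \tag{127}
\]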

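Example 2: The 4-point correlation function, with notation $\hat\phi_i = \hat\phi(x_i)$, $J_i = J(x_i)$, $J_x = J(x)$, $D_{xi} = D_F(x - x_i)$, and $D_{ij} = D_F(x_i - x_j)$,

\begin{align}
\langle 0|\, T[\hat\phi_1 \hat\phi_2 \hat\phi_3 \hat\phi_4]\,|0\rangle &= \frac{\dfrac{\delta}{\delta J_1} \dfrac{\delta}{\delta J_2} \dfrac{\delta}{\delta J_3} \dfrac{\delta}{\delta J_4} Z[J] \Big|_{J=0}}{Z[J = 0]} \tag{128} \\
&= \frac{\delta}{\delta J_1} \frac{\delta}{\delta J_2} \frac{\delta}{\delta J_3} \left[ -\int d^4x'\, J_{x'} D_{x'4} \right] e^{-\frac{1}{2} \int d^4x \int d^4y\, J_x D_{xy} J_y} \Big|_{J=0} \tag{129} \\
&= \frac{\delta}{\delta J_1} \frac{\delta}{\delta J_2} \left[ -D_{34} + \int d^4x' \int d^4y'\, J_{x'} D_{x'3}\, J_{y'} D_{y'4} \right] e^{\dots} \Big|_{J=0} \tag{130} \\
&= \frac{\delta}{\delta J_1} \left[ D_{34} \int d^4x'\, J_{x'} D_{x'2} + D_{24} \int d^4y'\, J_{y'} D_{y'3} + D_{23} \int d^4z'\, J_{z'} D_{z'4} + O(J^2) \right] e^{\dots} \Big|_{J=0} \tag{131} \\
&= D_{34} D_{12} + D_{24} D_{13} + D_{23} D_{14} \tag{132}
\end{align}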
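The pattern in (132) is a sum over all ways of pairing up the external points, which is the content of Wick's theorem for the free theory. A minimal sketch that enumerates these pairings (the function names `pairings` and `wick_terms` are illustrative, not from the lectures):

```python
def pairings(points):
    """Yield every way of splitting `points` into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def wick_terms(points):
    """Free n-point function as a list of products of propagators D_ab."""
    return [" ".join(f"D{a}{b}" for a, b in p) for p in pairings(points)]

terms = wick_terms([1, 2, 3, 4])
print(" + ".join(terms))   # D12 D34 + D13 D24 + D14 D23
```

For $2n$ points the number of terms is $(2n-1)!!$, so the 6-point function has 15 pairings.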

Interacting Fields
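The time-ordered expectation value for the (classical) phi-fourth interacting theory, which reduces to derivatives of the free generating functional after Taylor expansion in $S_{\text{int}}$, is

\[
\langle\Omega|\, T[\hat\phi_1 \cdots \hat\phi_n]\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)} \frac{\int \mathcal{D}\phi\; e^{i(S_0 + S_{\text{int}})}\, \phi(x_1) \cdots \phi(x_n)}{\int \mathcal{D}\phi\; e^{i(S_0 + S_{\text{int}})}}, \tag{133}
\]

where $S_{\text{int}} = -\frac{\lambda}{4!} \int d^4x\, \phi^4(x)$.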

Fermionic Fields
For the (classical) fermionic field $\psi$, the 2-point correlation function, or vacuum expectation value, is

\[
\langle\Omega|\, T[\hat\psi(x)\hat\psi(y)]\,|\Omega\rangle = \frac{\int \mathcal{D}\psi\; e^{iS}\, \psi(x)\psi(y)}{\int \mathcal{D}\psi\; e^{iS}} \tag{134}
\]
Rule number one for this expression is not to think about it operationally, and rule number two is to think in analogy to complex numbers, which can provide a clearer representation and make things easier.

The Fermi fields obey the relations

ψ 2 (x) = 0 (135)
ψ(x)ψ(y) = −ψ(y)ψ(x) (136)

Vignette: Anticommuting Numbers (Grassmann Numbers)
Let $V$ be an n-dimensional vector space with basis $\theta_a \in V$, $a = 1, \dots, n$. Thus, elements of the vector space $v \in V$ have the form
\[
v = \sum_{a=1}^{n} v_a \theta_a. \tag{137}
\]

To build a bigger vector space G(V ) from V , we first endow V with the product
operation denoted by concatenation (e.g., θa · θb · θc = θa θb θc ).

This gives us an infinite dimensional vector space with span

S ∞ (V ) = span{θa , θa θb , θa θb θc , . . . } (138)
Now restrict the basis to obey the following suggestive relations

θa θb = −θb θa (139)
θa2 = 0. (140)

Then the new vector space has dimension $\dim(G(V)) = 2^n$, and is the infinite
dimensional span modulo the elements of the underlying vector space

G(V ) = S ∞ (V )/v. (141)


(Exercise) Check that this structure is well-defined. Note that this is exactly
the space of differential forms for a tangent space V (also known as the space
of ”classical fermions”).

The basis of G(V ) is now

{1, θa , θa θb , θa θb θc , . . . } (142)
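with $a = 1, \dots, n$, followed by $1 \le a < b \le n$, $a < b < c$, etc.

Then a general element of the new vector space $f \in G(V)$ is

\[
f = \alpha + \sum_{p=1}^{n} \sum_{1 \le j_1 < \cdots < j_p \le n} \alpha_{j_1 j_2 \dots j_p}\, \theta_{j_1} \theta_{j_2} \cdots \theta_{j_p}, \tag{143}
\]
\[
\text{where } \alpha_{j_1 j_2 \dots j_p} \in \mathbb{C}. \tag{144}
\]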
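The construction above can be sketched in code. This is an illustrative representation, not the lecturer's: an element of $G(V)$ is stored as a dictionary mapping an increasing tuple of indices (a basis monomial) to its coefficient, and products are reordered with one sign flip per transposition.

```python
from itertools import combinations

def mono_mul(a, b):
    """Product of two basis monomials (increasing index tuples) as (sign, monomial)."""
    if set(a) & set(b):
        return 0, ()                      # a repeated generator squares to zero
    seq, sign = list(a + b), 1
    for i in range(len(seq)):             # bubble sort, one sign flip per swap
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def mul(f, g):
    """Product of two elements stored as {monomial: coefficient}."""
    out = {}
    for a, ca in f.items():
        for b, cb in g.items():
            s, m = mono_mul(a, b)
            if s:
                out[m] = out.get(m, 0) + s * ca * cb
    return {m: c for m, c in out.items() if c != 0}

def theta(j):
    """The generator theta_j."""
    return {(j,): 1}

n = 3
basis = [c for p in range(n + 1) for c in combinations(range(1, n + 1), p)]
print(len(basis))                    # number of basis monomials for n = 3
print(mul(theta(2), theta(1)))       # theta_2 theta_1 = -theta_1 theta_2
```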

6 Lecture 6: Grassmann Numbers
We left off with an object meant to be the classical version of a fermion: a Grassmann number. It is an element of a vector space $G_n(V)$, generated by $\{1, \theta_1, \dots, \theta_n \in V\}$. Imposing the anticommutation relation

\[
\{\theta_j, \theta_k\} = 0, \quad \forall j, k = 1, \dots, n, \tag{145}
\]

this is a $2^n$-dimensional vector space with monomial basis

\[
\{1, \theta_1, \theta_2, \dots, \theta_n, \theta_1\theta_2, \dots, \theta_{n-1}\theta_n, \dots, \theta_1\theta_2 \cdots \theta_n\}. \tag{146}
\]


An arbitrary element $f \in G_n(V)$ is
\[
f = f_0 + \sum_{j_1 < \cdots < j_p} f_p(j_1, \dots, j_p)\, \theta_{j_1} \cdots \theta_{j_p}. \tag{147}
\]

We now define functions, linear and nonlinear, representations, complex numbers, calculus, derivatives, and integrals of Grassmann numbers.

Functions of Grassmann Numbers

Think of $\theta_j$ as anticommuting numbers/variables.

Note: A "wrong" definition is to let $f$ be an infinitely differentiable function $f \in C^\infty(\mathbb{R})$, adjoin the symbol $f(\theta_j)$, and impose the anticommutation relation $\{f(\theta_j), \theta_k\} = 0$, $\forall\, j, k$. Then each $f$ will produce a new anticommuting object, which leads to an uncountably infinite number of objects, and is not what we expect from a function.

Linear functions should be linear maps from the vector space to itself, $F \in M_{2^n}(\mathbb{C})$, the space of $2^n \times 2^n$ matrices,

\[
F : G_n(V) \to G_n(V). \tag{148}
\]
Nonlinear functions will be defined by analogy to the functional calculus of matrices; consider a matrix $M \in M_m(\mathbb{C})$. As long as $M$ is diagonalizable, define the nonlinear function $f(M) = S^{-1} f(D) S$, where $M$ is diagonalized by $S$, such that $M = S^{-1} D S$. Since diagonal matrices form a commutative algebra, we know how to define the function on $D$ and then rotate from $D$ back to $M$ via $S$.
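The functional calculus $f(M) = S^{-1} f(D) S$ can be sketched with NumPy. The matrix `M` below is a hypothetical diagonalizable example, and the result for $f = \exp$ is compared against a truncated Taylor series as an independent check.

```python
import numpy as np

def matrix_function(f, M):
    """f(M) = S^{-1} f(D) S for a diagonalizable M, with M = S^{-1} D S."""
    eigvals, V = np.linalg.eig(M)   # NumPy convention: M = V D V^{-1}, so S = V^{-1}
    return V @ np.diag(f(eigvals)) @ np.linalg.inv(V)

M = np.array([[1.0, 2.0], [0.5, -1.0]])   # hypothetical diagonalizable matrix
exp_M = matrix_function(np.exp, M)

# Independent check: truncated Taylor series of exp(M).
series = np.zeros_like(M)
term = np.eye(2)
for k in range(1, 30):
    series = series + term        # adds M^(k-1) / (k-1)!
    term = term @ M / k

print(exp_M)
print(series)
```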

In summary so far, one strategy to define functions of Grassmann numbers is to represent $G_n(V)$ as matrices and use functional calculus. Working with representations, we now need matrices to represent the Grassmann numbers.

Representations of Gn (V )
Consider a map from our vector space to a concrete space of matrices

\[
\pi : G_n(V) \to M_d(\mathbb{C}), \tag{149}
\]
which must obey the anticommutation relation

\[
\{\pi(\theta_j), \pi(\theta_k)\} = 0, \quad \forall j, k. \tag{150}
\]


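Construct the "Jordan-Wigner representation", which will look very familiar. Begin with a Hilbert space $\mathcal{H} = \mathbb{C}^{2^n}$, with dimension $2^n$, and Pauli operators

\[
\sigma^+ = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{151}
\]
\[
\sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{152}
\]
Recall that these Pauli operators obey the relations

\[
\{\sigma^+, \sigma^z\} = 0 \tag{153}
\]
\[
(\sigma^+)^2 = 0. \tag{154}
\]
Construct the representation of $G_n(V)$

\begin{align}
\pi(\theta_1) &= \sigma^+_1 \otimes I_2 \otimes \cdots \otimes I_{n-1} \otimes I_n \tag{155} \\
\pi(\theta_2) &= \sigma^z_1 \otimes \sigma^+_2 \otimes \cdots \otimes I_{n-1} \otimes I_n \tag{156} \\
&\;\;\vdots \tag{157} \\
\pi(\theta_n) &= \sigma^z_1 \otimes \sigma^z_2 \otimes \cdots \otimes \sigma^z_{n-1} \otimes \sigma^+_n \tag{158}
\end{align}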
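The representation (155)-(158) can be built explicitly with Kronecker products and checked against the required relation (150). A minimal sketch, using the matrices (151) and (152) as given; the helper names are illustrative:

```python
import numpy as np
from functools import reduce

sigma_plus = np.array([[0.0, 0.0], [1.0, 0.0]])   # as in (151)
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])     # as in (152)
identity = np.eye(2)

def jordan_wigner(n):
    """Return [pi(theta_1), ..., pi(theta_n)] as 2^n x 2^n matrices."""
    reps = []
    for j in range(n):
        # j copies of sigma^z, then sigma^+, then identities on the remaining sites.
        factors = [sigma_z] * j + [sigma_plus] + [identity] * (n - j - 1)
        reps.append(reduce(np.kron, factors))
    return reps

def anticommutator(a, b):
    return a @ b + b @ a

thetas = jordan_wigner(3)
all_anticommute = all(
    np.allclose(anticommutator(a, b), 0) for a in thetas for b in thetas
)
print(thetas[0].shape, all_anticommute)
```

The check includes the case $j = k$, which encodes $\pi(\theta_j)^2 = 0$.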

Single variable representation (n = 1)

Let $F \in C^\infty(\mathbb{R}, \mathbb{R})$ and $G_1(V) \simeq \{a + b\theta : a, b \in \mathbb{C}\}$. This should be consistent with the functional calculus of the representation $\pi(\cdot)$, such that $F(\pi(\cdot)) = S^{-1} F(D) S$.
To evaluate the function $F$, we write out the Taylor series, evaluate at $x = \theta$, and impose the defined relations (e.g., $\theta_j^2 = 0$, $\forall j$)

\begin{align}
F(x) &= \sum_{j=0}^{\infty} \frac{F^{(j)}(x = 0)\, x^j}{j!} \tag{159} \\
F(\theta) &= \sum_{j=0}^{\infty} \frac{F^{(j)}(\theta = 0)\, \theta^j}{j!} \tag{160} \\
F(\theta) &= F^{(0)}(\theta = 0) + F^{(1)}(\theta = 0) \cdot \theta \tag{161}
\end{align}

Example 1:

\begin{align}
F(x) &= \sin(x) \tag{162} \\
F(\theta) &= \left( x - \frac{x^3}{3!} + \dots \right) \Big|_{x=\theta} \tag{163} \\
F(\theta) &= \theta \tag{164}
\end{align}

Example 2:

F (x) = x + x3 → F (θ) = θ (165)

Multiple variable representation


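Let $F \in C^\infty(\mathbb{R}^n, \mathbb{R})$, with the Taylor expansion

\[
F(\theta_1, \dots, \theta_n) = F^{(0)}(\theta_1 = 0, \dots, \theta_n = 0) + \sum_{j=1}^{n} \frac{\partial F(0, \dots, 0)}{\partial \theta_j}\, \theta_j + O(\theta^2) \tag{166}
\]

Example 1: n = 2

\begin{align}
F(x, y) &= e^{-\lambda x y} \tag{167} \\
F(\theta_1, \theta_2) &= \left( 1 - \lambda x y + \tfrac{1}{2}\lambda^2 x^2 y^2 + \dots \right) \Big|_{x=\theta_1,\, y=\theta_2} \tag{168} \\
F(\theta_1, \theta_2) &= 1 - \lambda \theta_1 \theta_2 \tag{169}
\end{align}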

Example 2: n = 2

F (x, y) = e−λ1 x−λ2 y (170)


F (θ1 , θ2 ) = (1 − λ1 θ1 )(1 − λ2 θ2 ) (171)
F (θ1 , θ2 ) = 1 − λ1 θ1 − λ2 θ2 + λ1 λ2 θ1 θ2 (172)

Note that even though we lose higher order terms such as θj2 , nonlinear features
are preserved in the multivariable case.
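The truncation of the Taylor series can be checked in a matrix representation, where nilpotency does the work automatically. The sketch below assumes the $n = 2$ Jordan-Wigner matrices and a hypothetical value of $\lambda$; it evaluates the series for $\sin$ and for $e^{-\lambda x y}$ on the matrices and compares with (164) and (169).

```python
import numpy as np
from math import factorial

sp = np.array([[0.0, 0.0], [1.0, 0.0]])
sz = np.diag([1.0, -1.0])
t1 = np.kron(sp, np.eye(2))   # pi(theta_1) for n = 2
t2 = np.kron(sz, sp)          # pi(theta_2) for n = 2

def series(coeffs, X):
    """Evaluate sum_j coeffs[j] * X^j for a square matrix X."""
    out = np.zeros_like(X)
    power = np.eye(X.shape[0])
    for c in coeffs:
        out = out + c * power
        power = power @ X
    return out

lam = 0.7  # hypothetical coupling, for illustration only
exp_coeffs = [(-lam) ** j / factorial(j) for j in range(10)]
sin_coeffs = [0.0 if j % 2 == 0 else (-1) ** ((j - 1) // 2) / factorial(j)
              for j in range(10)]

sin_theta = series(sin_coeffs, t1)        # Taylor series of sin evaluated at theta_1
exp_term = series(exp_coeffs, t1 @ t2)    # Taylor series of exp(-lam * theta_1 theta_2)

print(np.allclose(sin_theta, t1))
print(np.allclose(exp_term, np.eye(4) - lam * t1 @ t2))
```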

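Complex Grassmann Numbers

Let $\theta_1, \theta_2 \in G_2(V)$, a 4-dimensional vector space with basis $\{1, \theta_1, \theta_2, \theta_1\theta_2\}$, and define the quantities $\theta$ and $\theta^* \in G_2(V)$ as

\begin{align}
\theta &= \frac{\theta_1 + i\theta_2}{\sqrt{2}} \tag{173} \\
\theta^* &= \frac{\theta_1 - i\theta_2}{\sqrt{2}}. \tag{174}
\end{align}

To extend to the multivariable case, define
\[
\theta_j^* = \frac{\theta_{j1} - i\theta_{j2}}{\sqrt{2}}, \tag{175}
\]
where $\theta_{j1}, \theta_{j2} \in G_{2n}(V)$.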

Grassman Derivatives
Define the derivative with respect to Grassman variables as the map

∂θj : Gn (V ) → Gn (V ) (176)
Which obeys the relations

1. ∂θj (θk ) = δjk


2. ∂θj (θk1 . . . θkp ) = δjk1 θk2 . . . θkp −δjk2 θk1 θk3 . . . θkp +· · ·+(−1)p−1 δjkp θk1 . . . θkp−1 .

The second relation follows from the anticommutation relation, and can be
thought of as bringing the corresponding θj to the front via anticommutations
and then differentiating.
For example,

∂θ2 (θ1 θ2 ) = −∂θ2 (θ2 θ1 ) = −θ1 . (177)


(Exercise) These relations can be extended by linearity to any Grassmann number. This definition of $\partial_{\theta_j}$ obeys a (graded) product rule and the chain rule.
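The two defining relations of $\partial_{\theta_j}$ translate directly into code. In this illustrative sketch a Grassmann number is stored as a dictionary from increasing index tuples to coefficients (a representation chosen here, not taken from the lectures); the sign $(-1)^{\text{pos}}$ counts how many generators $\theta_j$ must be anticommuted past to reach the front.

```python
def d_theta(j, f):
    """Left derivative with respect to theta_j of f = {increasing index tuple: coeff}."""
    out = {}
    for mono, coeff in f.items():
        if j in mono:
            pos = mono.index(j)               # generators standing in front of theta_j
            rest = mono[:pos] + mono[pos + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** pos * coeff
    return out

f = {(1, 2): 1.0}                  # theta_1 theta_2
print(d_theta(2, f))               # reproduces (177): -theta_1
print(d_theta(1, d_theta(2, f)))   # derivatives anticommute with each other
print(d_theta(2, d_theta(1, f)))
```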

Grassmann Integrals
Following the analogy of the common definite integral, the Grassmann integral should be a linear map which obeys shift invariance under $\theta \to \theta + \eta$,
\[
\int d\theta : G_n(V) \to \mathbb{C}. \tag{178}
\]

The only consistent definition to satisfy these two constraints for a single Grassmann variable is
\[
\int d\theta\, (a + b\theta) = b, \quad a, b \in \mathbb{C}. \tag{179}
\]

Note the very interesting property here that, by this definition, the integral and the derivative are the exact same thing
\[
\int d\theta\, (a + b\theta) = \partial_\theta (a + b\theta). \tag{180}
\]

This also holds for the multivariable case, where the highest order term is picked off from the Taylor series

\[
\int d\theta_n \cdots d\theta_1 \left( f_0 + \sum_{j_1 < \cdots < j_p} f_p(j_1, \dots, j_p)\, \theta_{j_1} \cdots \theta_{j_p} \right) = f_n(1, \dots, n). \tag{181}
\]

This is a weird definition, but it works, behaves correctly under change of vari-
ables, and does what we need it to do.

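Example: n = 2 independent Grassmann numbers

\begin{align}
\int d\theta^* d\theta\; e^{-\lambda \theta^* \theta} &= \int d\theta^* d\theta\, (1 - \lambda \theta^* \theta) \tag{182} \\
&= \partial_{\theta^*} \partial_\theta (1 - \lambda \theta^* \theta) \tag{183} \\
\int d\theta^* d\theta\; e^{-\lambda \theta^* \theta} &= \lambda \tag{184}
\end{align}

Note that the order of integration ($\theta$ first, $\theta^*$ second) is by convention.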

Multivariable Gaussian Integrals


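Consider the multivariable Gaussian integral
\[
I = \int d\theta_1^* d\theta_1 \cdots d\theta_n^* d\theta_n\; e^{-\sum_{j,k} \theta_j^* B_{jk} \theta_k}, \tag{185}
\]
where $B^\dagger = B$ and we diagonalize via $U^\dagger B U = D$.

Make a change of variables $\theta'_j = U_{jk}\theta_k$, where $U$ is a unitary matrix, such that the product of these new Grassmann variables is (Exercise)

\begin{align}
\theta'_1 \theta'_2 \cdots \theta'_n &= \sum_{k_1, \dots, k_n} U_{1k_1} U_{2k_2} \cdots U_{nk_n}\, \theta_{k_1} \theta_{k_2} \cdots \theta_{k_n} \tag{186} \\
&= \sum_{\pi \in S_n} U_{1\pi(1)} \cdots U_{n\pi(n)}\, \mathrm{sgn}(\pi)\, \theta_1 \theta_2 \cdots \theta_n \tag{187} \\
\theta'_1 \theta'_2 \cdots \theta'_n &= \det(U)\, \theta_1 \theta_2 \cdots \theta_n \tag{188}
\end{align}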

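So the Gaussian integral becomes

\begin{align}
I &= \int d\theta'^*_1 d\theta'_1 \cdots d\theta'^*_n d\theta'_n\; e^{-\sum_{j,k} \theta'^*_j (U^\dagger B U)_{jk} \theta'_k}\, \det(U) \det(U^\dagger) \tag{189} \\
&= \int d\theta'^*_1 d\theta'_1 \cdots d\theta'^*_n d\theta'_n\; e^{-\lambda_1 \theta'^*_1 \theta'_1} \cdots e^{-\lambda_n \theta'^*_n \theta'_n} \cdot (1) \cdot (1) \tag{190} \\
&= \lambda_1 \lambda_2 \cdots \lambda_n \tag{191} \\
I &= \det(B) \tag{192}
\end{align}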
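The result $I = \det(B)$ can be verified by brute force for a small matrix. The following is a sketch under stated assumptions: Grassmann numbers are stored as dictionaries from increasing index tuples to coefficients, generator $2j$ stands for $\theta_j^*$ and generator $2j+1$ for $\theta_j$ (indices starting at 0), the integral is taken as the iterated derivative as in (180) and (183), and `B` is a hypothetical Hermitian matrix.

```python
import numpy as np

def mono_mul(a, b):
    """Product of two basis monomials (increasing index tuples) as (sign, monomial)."""
    if set(a) & set(b):
        return 0, ()
    seq, sign = list(a + b), 1
    for i in range(len(seq)):                 # bubble sort, one sign flip per swap
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def mul(f, g):
    out = {}
    for a, ca in f.items():
        for b, cb in g.items():
            s, m = mono_mul(a, b)
            if s:
                out[m] = out.get(m, 0) + s * ca * cb
    return out

def d(idx, f):
    """Left derivative with respect to generator idx."""
    out = {}
    for mono, c in f.items():
        if idx in mono:
            pos = mono.index(idx)
            rest = mono[:pos] + mono[pos + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** pos * c
    return out

def grassmann_gaussian(B):
    """Integral of exp(-sum_jk theta_j^* B_jk theta_k) over all theta_j^*, theta_j."""
    n = B.shape[0]
    quad = {}
    for j in range(n):
        for k in range(n):
            s, m = mono_mul((2 * j,), (2 * k + 1,))
            quad[m] = quad.get(m, 0) - s * B[j, k]
    # The exponential series stops after n terms because quad^(n+1) = 0.
    result, term = {(): 1.0}, {(): 1.0}
    for p in range(1, n + 1):
        term = {m: c / p for m, c in mul(term, quad).items()}
        for m, c in term.items():
            result[m] = result.get(m, 0) + c
    # Measure d theta_1^* d theta_1 ... d theta_n^* d theta_n as derivatives,
    # with the rightmost (innermost) one applied first.
    for idx in reversed(range(2 * n)):
        result = d(idx, result)
    return result.get((), 0.0)

B = np.array([[2.0, 1.0 + 1.0j, 0.0],
              [1.0 - 1.0j, 3.0, 0.5],
              [0.0, 0.5, 1.0]])
integral = grassmann_gaussian(B)
print(integral, np.linalg.det(B))
```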

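Recall that for the normal Gaussian integral case, we got $I = (\det(B))^{-1}$.

(Exercise) The generating functional for the Grassmann calculus is, where $J$ is a vector of Grassmann numbers,

\begin{align}
Z[J] &= \int d\theta_1^* d\theta_1 \cdots d\theta_n^* d\theta_n\; e^{-\theta^\dagger B \theta + J^\dagger \theta + \theta^\dagger J} \tag{193} \\
&= \det(B)\, e^{J^\dagger B^{-1} J}. \tag{194}
\end{align}

The moments are (Exercise)

\[
\int d\theta_1^* d\theta_1 \cdots d\theta_n^* d\theta_n\; \theta_j \theta_k^*\, e^{-\theta^\dagger B \theta} = \det(B) \cdot [B^{-1}]_{jk} \tag{195}
\]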

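Side: The mixing of regular and Grassmann numbers provides the basis for the supersymmetric method. Consider the Gaussian integral

\begin{align}
\int d\Phi\; e^{-\Phi^\dagger \mathcal{M} \Phi} &= \int dx_1 \cdots dx_n\, dx_1^* \cdots dx_n^* \int d\theta_1^* d\theta_1 \cdots d\theta_n^* d\theta_n\; e^{-\Phi^\dagger \mathcal{M} \Phi} \tag{196} \\
&= (\det A)(\det A)^{-1} \tag{197} \\
\int d\Phi\; e^{-\Phi^\dagger \mathcal{M} \Phi} &= 1 \tag{198}
\end{align}

where $\Phi = (x_1, \dots, x_n, \theta_1, \dots, \theta_n)$ and $\mathcal{M}$ is a $2n \times 2n$ matrix

\[
\mathcal{M} = \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix} \tag{199}
\]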

7 Lecture 7: Functional Quantization of the Dirac Field
We now employ Grassmann numbers/variables to build a path integral-like ob-
ject that provides the n-point correlation functions for the Dirac (spinor) field.

Consider the Grassmann integral of the complex Grassmann variables $\theta$ and $\theta^*$, the Grassmann Gaussian generating function
\[
Z[J] = \left( \prod_{j=1}^{n} \int d\theta_j^* d\theta_j \right) e^{-\sum_{j,k} \theta_j^* B_{jk} \theta_k + \sum_j (J_j^* \theta_j + \theta_j^* J_j)} \tag{200}
\]
j=1

Where the Grassmann variables and auxiliary fields J and J ∗ obey the anti-
commutation relations

{θj , θk∗ } = {θj , θk } = {θj∗ , θk∗ } = 0 (201)


{Jj , θk } = {Jj∗ , θk } = {Jj , Jk∗ } = 0. (202)

Calculating these Gaussian integrals, the generating functional becomes, up to the J-independent factor $\det(B)$,

\[
Z[J] = e^{\sum_{j,k} J_j^* [B^{-1}]_{jk} J_k}. \tag{203}
\]

Since the matrix $B$ is Hermitian, such that $B^\dagger = B$, the generating functional $Z[J]$ is Hermitian.

Note that the generating functional takes a vector of Grassmann numbers, an-
ticommuting objects, as input and yields an expression quadratic in the Grass-
mann numbers which evaluates to a real number, a commuting object, since
observables correspond to Hermitian operators and real numbers as their eigen-
values; the expectation value must always be a real number, not a Grassmann
number.

Recall the Dirac spinor field, which is what we mean by classical fermions. The classical fermion is represented by a 4-component field that transforms as

\[
\psi(x) \to M(\Lambda)\psi(\Lambda^{-1}x) \tag{204}
\]

with the representation of the Lorentz group
\[
M = e^{-\frac{i}{2}\omega_{\mu\nu}S^{\mu\nu}}, \tag{205}
\]
where $S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu]$ and $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. An object that transforms according to the transformation law with this representation we call a Dirac spinor. The representation, and thus the generators, of the Poincaré group easily follows.

The equation of motion that follows for this spinor object is the Dirac equation $(i\partial_\mu\gamma^\mu - m)\psi = 0$. Recall that we defined the conjugate-like object $\bar\psi = \psi^\dagger\gamma^0$ to induce Lorentz invariance for the Lagrangian density $\mathcal{L} = \bar\psi(i\not\partial - m)\psi$, where the slash notation denotes $\not A = A_\mu\gamma^\mu$.

Thus far, we’ve been thinking about ψ = (ψ1 , ψ2 , ψ3 , ψ4 ) as the classical spinor-
valued Dirac field, but it is really the single-particle component that is needed
to build the classical Dirac field with anticommuting objects, since we guess to
employ the anticommutation relation of quantum Dirac field operators

{ψ̂(x), ψ̂ † (y)} = δ (4) (x − y). (206)


So, to build a quantum theory, recall that we, in the case of creation and anni-
hilation operators, for example,

1. Pick a classical single-particle theory


2. Put hats on the field operators to quantize them

3. Make an algebra for the quantized field operators to obey


4. Find representations of that algebra.

Alternatively, we can find some classical Dirac field that we quantize via the
path integral, and use the path integral as a tool to guess the quantum theory.

Guess 1 (Wrong):

Let the classical field $\psi(x)$ consist of real numbers. Then the corresponding path integral $\int \mathcal{D}\psi \mathcal{D}\bar\psi\; e^{iS}$ will not yield the n-point correlation functions for the quantum Dirac (fermionic) field, but will yield the n-point correlation functions for the bosonic field.

Guess 2 (Correct):

Let the classical field $\psi(x)$ consist of Grassmann numbers, a Grassmann-valued field as the classical Dirac field. Then the path integral will yield the quantum Dirac field.

A Grassmann-valued field (4-vectors) can be understood via sheaf theory and ringed spaces of Grassmann numbers on manifolds.

An alternative way to make sense of a Grassmann-valued field is by discretization to a lattice of spacing $\epsilon$, compactified to a torus. Consider a map from spacetime to the 4-dimensional torus

\[
\mathbb{R}^{1,3} \to (\mathbb{Z}/N\mathbb{Z})^{\otimes 4} \tag{207}
\]

Consider the (0 + 1)-dimensional case, mapping continuous spacetime coordi-
nates to discrete coordinates: xj → j, where j ∈ Z/N Z, and discretize the
Grassmann numbers to the lattice to define the classical Dirac field

ψj ≡ ψ(xj ) ≡ ψ(j) (208)


ψj† ≡ ψ † (xj ) ≡ ψ † (j). (209)

So, the (discrete) classical Dirac field is a list of $8 \cdot N^4$ Grassmann numbers, since $N^4$ is the number of lattice sites and each of the two fields, $\psi$ and $\psi^\dagger$, contains 4 components.

Note that if we work in momentum space, the Fourier coefficients will be made
to be Grassmann numbers.

The continuous classical Dirac field comes from the limit of the lattice spacing vanishing $\epsilon \to 0$, the number of sites tending to infinity $N \to \infty$, and the size of the torus tending to infinity $L = N\epsilon \to \infty$.

Now, to build the quantum theory corresponding to these classical objects via the path integral formalism, we require an action, beginning with the discretization of the Lagrangian density $\mathcal{L} = \bar\psi(i\not\partial - m)\psi$. For the Dirac field, the discretized Lagrangian density is
\[
L(\psi_j, \bar\psi_j) = \sum_{j \in (\mathbb{Z}/N\mathbb{Z})^{\otimes 4}} \left[ i\bar\psi_j \gamma^\mu \left( \frac{\psi_{j+\hat\mu} - \psi_j}{\epsilon} \right) - m\bar\psi_j \psi_j \right], \tag{210}
\]

where $\psi_j$ and $\bar\psi_j$ are 4D spinors of Grassmann numbers, we've employed the forward difference to represent the partial derivative, and $\hat\mu$ is a unit vector in the $\mu$th direction.

With the Lagrangian density, we can calculate the action $S = \int_{-T}^{T} dt\, L(\psi_j, \bar\psi_j)$ and the n-point correlation functions for the Grassmann-valued quantum field operators.

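For example, for the Grassmann variables, define the 2-point correlation function

\[
\langle 0|\, T[\hat\psi(x)\hat{\bar\psi}(y)]\,|0\rangle \equiv \lim_{T\to\infty(1-i\epsilon)} \frac{\int \mathcal{D}\bar\psi \mathcal{D}\psi\; \psi(x)\bar\psi(y)\, e^{iS}}{\int \mathcal{D}\bar\psi \mathcal{D}\psi\; e^{iS}} \tag{211}
\]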
To calculate the 2-point correlation function, we continue to follow the prescrip-
tion

1. Discretize the field and the action


2. Evaluate the path integral
3. Take the continuous limit

Evaluate the path integral to find that the 2-point function is

\[
\langle 0|\, T[\hat\psi(x)\hat{\bar\psi}(y)]\,|0\rangle = S_F(x - y) = \int \frac{d^4k}{(2\pi)^4} \frac{i e^{-ik\cdot(x-y)}}{\not k - m + i\epsilon} \tag{212}
\]
Side note (topic of ongoing research): Fermion doubling is a topological artifact of incorrectly placing fermions on a lattice and taking the continuous limit. Extra fermions, called doublers, appear in the calculation, as the dispersion relation $\omega(k)$ becomes nonlinear and crosses the k-axis more than once. The expected dispersion relation is linear, $\omega(k) = ak$, $a > 0$. In discretization, we must accept this effect and learn how to work around it. This is done for convenience, since without discretization, evaluating the 2-point function requires many more tricks.
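The extra zero crossing can be seen in a one-dimensional toy model. The sketch below makes an assumption not stated in the notes: the derivative is discretized by the symmetric difference $(\psi_{j+1} - \psi_{j-1})/(2\epsilon)$, for which a plane wave $e^{ikj\epsilon}$ gives the lattice dispersion $\sin(k\epsilon)/\epsilon$ in place of the continuum $k$. The function names and the value of `eps` are illustrative.

```python
import numpy as np

def continuum_dispersion(k, a=1.0):
    """Expected linear dispersion omega(k) = a k."""
    return a * k

def lattice_dispersion(k, eps):
    """Dispersion of the symmetric-difference derivative on a lattice of spacing eps."""
    return np.sin(k * eps) / eps

eps = 0.5
zone_edge = np.pi / eps

# Near k = 0 the lattice reproduces the continuum dispersion ...
print(lattice_dispersion(1e-3, eps), continuum_dispersion(1e-3))
# ... but it vanishes again at the edge of the Brillouin zone: the doubler.
print(lattice_dispersion(zone_edge, eps), continuum_dispersion(zone_edge))
```

In this toy model the slope at the zone edge is $\cos(\pi) = -1$, so the doubler moves in the opposite direction to the mode at $k = 0$.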

Generating Functional for the Dirac Field


Define the generating functional for the Dirac field in terms of two independent Grassmann-valued functions
\[
Z[J(x), \bar J(y)] \equiv \int \mathcal{D}\bar\psi \mathcal{D}\psi\; e^{i \int d^4x\, (\bar\psi(i\not\partial - m)\psi + \bar J \psi + \bar\psi J)}, \tag{213}
\]

Where J and J¯ are Grassmann-valued (auxiliary) source fields that will be set
to zero after differentiation. Calculating the generating functional will yield all
n-point functions via functional derivatives, made possible by the employment of
Grassmann numbers and functional quantization versus canonical quantization.
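By completing the square and simplifying the expression for the generating functional we get (Exercise)
\[
Z[J, \bar J] = Z_0\, e^{-\int d^4x\, d^4y\, \bar J(x) S_F(x - y) J(y)}, \tag{214}
\]
where $Z_0 = Z[J = 0, \bar J = 0]$. Recall that for Grassmann numbers, the rules of differentiation include sign-switching and go like
\[
\frac{d}{d\eta}\theta\eta = -\theta\frac{d}{d\eta}\eta = -\theta. \tag{215}
\]
So the n-point correlation function is then

\[
\langle 0|\, T[\psi^{(\alpha_1)}(x_1) \cdots \psi^{(\alpha_n)}(x_n)]\,|0\rangle = Z_0^{-1} \left( i(-1)^{\alpha_1 + 1} \frac{\delta}{\delta J^{\alpha_1}} \right) \cdots \left( i(-1)^{\alpha_n + 1} \frac{\delta}{\delta J^{\alpha_n}} \right) Z[J, \bar J], \tag{216}
\]
where $\psi^{(\alpha)}(x) = \psi(x)$ for $\alpha = 0$ and $\psi^{(\alpha)}(x) = \bar\psi(x)$ for $\alpha = 1$.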

Check (Exercise) that the quantum 2-point correlation function comes out to be the expected

\[
\langle 0|\, T[\hat\psi(x)\hat{\bar\psi}(y)]\,|0\rangle = S_F(x - y). \tag{217}
\]

Interactions of Fermions and Bosons
The path integral is a great tool for guessing Feynman rules as well, since we can expand in a Taylor series and recognize patterns that represent certain symmetries and diagrams. Without needing to introduce too much gauge theory, we introduce massive quantum electrodynamics (QED), a quantum field theory that models the interaction of fermions and (massive) bosons. We expect the photon (boson) mass to be zero, but consider it massive for now, and note that the upper bounds on the mass of the photon have been calculated to be nonzero ($10^{-20}$).

Consider the Lagrangian density

\[
\mathcal{L} = \bar\psi(i\gamma^\mu(\partial_\mu - ieA_\mu) - m_f)\psi - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2}m_b^2 A_\mu A^\mu. \tag{218}
\]
There are six fields represented here: the fermion fields ψ and ψ̄, the 4 boson
fields Aµ , and the tensor F µν = ∂ µ Aν − ∂ ν Aµ .

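Following the path integral quantization, the classical 2-point correlation function is

\[
\langle 0|\, T[\psi(x)\bar\psi(y)]\,|0\rangle = \lim_{T\to\infty(1-i\epsilon)} \frac{\int \mathcal{D}\psi \mathcal{D}\bar\psi \mathcal{D}A\; \psi(x)\bar\psi(y)\, e^{iS}}{\int \mathcal{D}\psi \mathcal{D}\bar\psi \mathcal{D}A\; e^{iS}}. \tag{219}
\]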

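Write the action in terms of the free theory and the interacting theory, where the interaction term follows from expanding the covariant derivative in (218),

\begin{align*}
S &= S_0 + S_{\text{int}} \\
S &= \int d^4x \left( \bar\psi(i\not\partial - m_f)\psi - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2}m_b^2 A_\mu A^\mu \right) \\
&\quad + \left( e \int d^4x\, \bar\psi A_\mu \gamma^\mu \psi \right).
\end{align*}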

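Then the quantized 2-point correlation function for massive QED with the Taylor expansion of $e^{iS_{\text{int}}}$ is an infinite series of n-point correlation functions for the Grassmann-valued Gaussian path integrals

\[
\langle 0|\, T[\hat\psi(x)\hat{\bar\psi}(y)]\,|0\rangle = \lim_{T\to\infty(1-i\epsilon)} \frac{\int \mathcal{D}\psi \mathcal{D}\bar\psi \mathcal{D}A\; e^{iS_0}\, \psi(x)\bar\psi(y) \left( 1 + ie \int d^4z\, \bar\psi A_\mu \gamma^\mu \psi + \dots \right)}{\int \mathcal{D}\psi \mathcal{D}\bar\psi \mathcal{D}A\; e^{i(S_0 + S_{\text{int}})}}. \tag{220}
\]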
The Feynman rules for massive QED, which can be deduced via patterns from the Taylor series, are

1. Draw a straight line from $a$ to $b$ with momentum $p$ for each fermion and associate $\left( \frac{i}{\not p - m_f + i\epsilon} \right)_{ab}$

2. Draw a squiggly line from $\alpha$ to $\beta$ with momentum $q$ for each boson and associate $\frac{-i}{q^2 - m_b^2 + i\epsilon}\, \delta_{\alpha\beta}$

3. To each vertex associate $ie\gamma^\mu$

4. Enforce momentum conservation at vertices: $(2\pi)^4 \delta^{(4)}(\Sigma_{\text{in}}\, p - \Sigma_{\text{out}}\, q)$

5. Integrate over undetermined momenta

6. Amputate external lines

7. Incoming fermions $\eta^a(p)$ and outgoing fermions $\bar\eta^b(p)$

8. $(-1)$ for each closed fermion loop.

This prescription creates infinities, and we will visit renormalization to tame these infinities in the following lectures.

8 Lecture 8: Renormalization
This is an incredibly important lecture where we ask, “How do we do physics?”
and “How do we make progress in physics?”

1. Observation
→ Empirical data
2. Explanation
→ Data compression
3. Understanding
→ Models
E.g., Hamiltonian with Hilbert space, neural network, list of data
4. Prediction
→ New observation
5. Repeat from Step 1

This algorithm yields an increasingly smaller list of plausible models that corre-
spond with the observed data. To the physicist, a model is often a Hamiltonian
(or Lagrangian), with an associated Hilbert space HΛ , that depends on some
list of unknown parameters zj ∈ R, for j = 1, . . . , n

Ĥ → ĤΛ = ĤΛ (z1 , . . . , zn ) (221)


Where Λ is a list of the degrees of freedom that we wish to explain. The degrees
of freedom may be finite or infinite, and hopefully there are tricks to tame the
infinite ones.

Some additional observations


When we speak of empirical data, we mean expectation values of Hermitian
operators hAj i = αj , which can be precisely defined such that αj = δαj or can
have a spread of uncertainty.

For each model ĤΛ (z1 , . . . , zn ) that we make a prediction for hAj i, if the pa-
rameters zj do not yield the correct expectation value, then we reject that set
of parameters for the model, and end up with a map

hAj i = fj (z1 , . . . , zn ; Λ). (222)


This map is the exact solution or our prediction for that model. Note that this
map is many-to-one, and is, thus, not invertible; there are many sets of param-
eters that can yield the same expectation value.

We say that a model $\hat H_\Lambda(z_1, \dots, z_n)$ is "simpler" than another model $\hat H_{\Lambda'}(z'_1, \dots, z'_{n'})$ if one or both of the following conditions are satisfied: $n < n'$ and/or $|\Lambda| > |\Lambda'|$.

The parameters z1 , . . . , zn are essentially coupling constants, and are not di-
rectly observable and not operationally well-defined.

All predictions fj = hAj i must be finite and real.

It is possible to give finite predictions in terms of infinite parameters.

Renormalization and QFT


In the context of quantum field theory, consider the interacting $\phi^4$ theory with Lagrangian
\[
\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4 \tag{223}
\]
We have spent many weeks approximately computing the map fj (m, λ), for fixed
m and λ, and have encountered infinity many times already in these calculations.

The degrees of freedom that we wish to explain in this context Λ are the mo-
mentum modes (of interacting bosons).

To tame these infinities, we first impose a cutoff |Λ| < ∞, where Λ is an arbi-
trary parameter. So, predictions will change with respect to the chosen cutoff,
since hAj i depends on Λ, such that hAj i = fj (z1 , . . . , zn ; Λ).

We can declare victory if we can invert the prediction fj (z1 , . . . , zn ; Λ) and move
the Λ-dependence onto the parameters, such that zj = zj (Λ).

A theory which allows hAj i = fj (z1 (Λ), . . . , zn (Λ); Λ), ∀Λ and fixed n, is called
renormalizable.

Side note about parameters: Note that the mass of an electron is the measured
value, but in the model it is defined by the imposed cutoff. For a different
cutoff, the coupling constant m may be different than the actual mass of the
electron, but in the “correct”, or “most correct”, model, we call it the “mass of
the electron”.

Scattering Amplitude in φ4 Theory


In φ4 theory, we focus on one prediction in particular, the scattering amplitude.

Note on things to come: the combinatorial proof that φ4 , and other models, is
renormalizable.

Recall that the scattering S-matrix for $\phi^4$ theory with no cutoff, or $|\Lambda| = \infty$, blows up to infinity

\[
\langle p_1 p_2|\, S\, |p_A p_B\rangle = \text{(sum of Feynman diagrams; figure not reproduced)} \tag{224}
\]

We can eliminate most of the infinite terms/diagrams by redefining the ground state energy $E_0$ and rest mass of the electron $m_0$ and rescaling $\lambda$, such that we don't have to set it to zero to make the theory work.

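The first term/diagram that rescaling $\lambda$ does not work for introduces a logarithmic divergence and has the form

\[
I = \frac{(-i\lambda)^2}{2} \int \frac{d^4k}{(2\pi)^4} \frac{i}{k^2 - m^2 + i\epsilon}\, \frac{i}{(k_1 + k_2 - k)^2 - m^2 + i\epsilon}. \tag{225}
\]

Therefore, we must impose a cutoff $|\Lambda| \neq \infty$ in order to calculate the S-matrix for $\phi^4$ theory.

Impose a cutoff on the momenta $|k| < k_c \in \mathbb{R}$. Then the integral above becomes (Exercise)

\begin{align}
I &= \frac{(-i\lambda)^2\, i^2}{2} \int_\Lambda \frac{d^4k}{(2\pi)^4} \frac{1}{k^2 - m^2 + i\epsilon}\, \frac{1}{(k_1 + k_2 - k)^2 - m^2 + i\epsilon} \tag{226} \\
I &= 2iC \log\left( \frac{k_c^2}{(k_1 + k_2)^2} \right). \tag{227}
\end{align}
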
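Then the scattering amplitude to order $O(\lambda^2)$ is

\[
\mathcal{M} = \mathcal{M}(k_c) = -i\lambda + iC\lambda^2 \left[ \log\left( \frac{k_c^2}{(k_1 + k_2)^2} \right) + \log\left( \frac{k_c^2}{(k_1 - k_3)^2} \right) + \log\left( \frac{k_c^2}{(k_1 - k_4)^2} \right) \right]. \tag{228}
\]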
The parameter z2 = λ can be fit to the experiment and model, as we do not
accept that it is fixed “at the beginning of the Universe”, and we can declare
victory by allowing z2 = z2 (kc ) = λ(kc ).

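To solve for $\lambda(k_c)$, let the scattering amplitude be the experimental value $\mathcal{M}(k_c, \lambda) = \mathcal{M}_{\text{exp}}$ and solve the differential equation that allows $\lambda$ to vary with respect to $k_c$ and match up to $\mathcal{M}_{\text{exp}}$

\[
k_c \frac{d\lambda}{dk_c} = 6C\lambda^2 + O(\lambda^3). \tag{229}
\]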
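Dropping the $O(\lambda^3)$ terms, the flow equation (229) can be solved in closed form: in the variable $t = \log(k_c/k_0)$ it reads $d\lambda/dt = 6C\lambda^2$, with solution $\lambda(k_c) = \lambda_0 / (1 - 6C\lambda_0 \log(k_c/k_0))$. The sketch below compares this with a direct numerical integration; the values of `C`, `lam0`, `k0`, and `kc` are hypothetical and chosen only for illustration.

```python
import numpy as np

C = 0.01   # hypothetical value of the constant C in (228)

def running_coupling_exact(lam0, k0, kc, C):
    """Closed-form solution of kc * d(lambda)/d(kc) = 6 C lambda^2."""
    return lam0 / (1.0 - 6.0 * C * lam0 * np.log(kc / k0))

def running_coupling_rk4(lam0, k0, kc, C, steps=1000):
    """Integrate d(lambda)/dt = 6 C lambda^2 in t = log(kc) with classical RK4."""
    def beta(lam):
        return 6.0 * C * lam**2
    h = np.log(kc / k0) / steps
    lam = lam0
    for _ in range(steps):
        s1 = beta(lam)
        s2 = beta(lam + 0.5 * h * s1)
        s3 = beta(lam + 0.5 * h * s2)
        s4 = beta(lam + h * s3)
        lam += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
    return lam

lam_exact = running_coupling_exact(1.0, 1.0, 100.0, C)
lam_numeric = running_coupling_rk4(1.0, 1.0, 100.0, C)
print(lam_exact, lam_numeric)
```

For positive $C$ the coupling grows with the cutoff, and the closed form diverges when $6C\lambda_0 \log(k_c/k_0) = 1$, which lies outside the range where truncating at $O(\lambda^2)$ can be trusted.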

9 Lecture 9: Renormalizability (of φ4 Theory)
This topic covers results of Bogoliubov, Parasiuk, Hepp, Zimmermann, or the
BPHZ renormalization scheme.

A renormalizable theory is a cutoff field theory determined by a finite number of parameters $\hat H_\Lambda = \hat H(z_1, \dots, z_n; \Lambda)$ such that for all observables $\hat A_\alpha$, the expectation value of $\hat A_\alpha$ can be matched to experimentally determinable quantities for any choice of cutoff $\Lambda$ by redefining the parameters $z_j = z_j(\Lambda)$

\[
\langle \hat A_\alpha \rangle_{z_1, \dots, z_n, \Lambda} = \langle \hat A_\alpha \rangle_{z_1(\Lambda), \dots, z_n(\Lambda)} = \text{Experiment}(\alpha). \tag{230}
\]

There is a weak form of renormalizability that retains dependence of the expectation value on the cutoff
\[
\langle \hat A_\alpha \rangle_{z_1(\Lambda), \dots, z_n(\Lambda)} = \text{Experiment}(\alpha) + O\left( \frac{1}{k_c} \right). \tag{231}
\]
This dependence can be worked around, since as $k_c \to \infty$, the inverse goes to zero.
In the φ4 interaction, we have three parameters with the Hamiltonian

Ĥ(m, λ, z; Λ) (232)
Where z is the field strength renormalization parameter.

Is φ4 theory, by this definition, renormalizable?

If this is true, then we are allowed to fit an infinite number of quantities to experimentally determinable quantities by fitting only three parameters: $m$, $\lambda$, and $z$. Very cool!

Degree of Divergence
Consider a diagram with $B_E$ external lines. The diagram has a superficial degree of divergence $D$ if it diverges with the cutoff as $k_c^D$. For $D = 0$, we say that the diagram has logarithmic divergence: $\log(k_c)$.

Theorem: The degree of divergence is equal to the number of spacetime dimensions minus the number of external lines

\[
D = 4 - B_E. \tag{233}
\]
Examples:

1. $B_E = 2 \implies D = 2 \implies \sim k_c^2$
2. $B_E = 4 \implies D = 0 \implies \sim \log(k_c)$
3. $B_E = 6 \implies D = -2 \implies \sim \frac{1}{k_c^2}$

Note that as $B_E$ and $k_c$ increase, the divergences become increasingly less observable. Each pair of incoming/outgoing particles contributes a propagator proportional to $k_c^{-2}$ to the 4D momentum integral $\sim \int \frac{d^4k}{(2\pi)^4} \left( \frac{i}{k^2 - m^2 + i\epsilon} \right)^{B_E/2}$.

Some more notation: $B_I$ is the number of internal lines, $V$ is the number of vertices, and $L$ is the number of closed loops.

1. $B_I = 3$, $V = 2$, $L = 2$, $D = 2$
2. $B_I = 3$, $V = 2$, $L = 2$, $D = 2$
3. $B_I = 5$, $V = 4$, $L = 2$, $D = -2$

Proof:

The number of loops corresponds directly to the number of undetermined momenta (4D momentum space integrals). Morally, we say that $\int \frac{d^4k}{(2\pi)^4} \sim k_c^4$.

It seems that there are $B_I$ such integrals, but momentum conservation reduces the total number of loop integrals to

\[
L = B_I - (V - 1). \tag{234}
\]
Each vertex has four lines, and each internal line connects two vertices while each external line attaches to one, such that

\[
4V = B_E + 2B_I. \tag{235}
\]
Now, recall that for each loop there is a factor $\sim k_c^4$ from the integral, and for each internal line there is a factor $k_c^{-2}$ from the propagator. Then

\[
D = 4L - 2B_I = 4 - B_E. \tag{236}
\]
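The counting in the proof is simple enough to automate. A minimal sketch (the function name `superficial_degree` is illustrative) that applies (234), (235), and the first equality of (236) to a connected $\phi^4$ diagram in four dimensions, and checks the result against the theorem $D = 4 - B_E$:

```python
def superficial_degree(internal_lines, vertices):
    """Return (loops, external_lines, D) for a connected phi^4 diagram in 4D."""
    loops = internal_lines - (vertices - 1)               # (234)
    external_lines = 4 * vertices - 2 * internal_lines    # (235)
    degree = 4 * loops - 2 * internal_lines               # (236), first equality
    return loops, external_lines, degree

for b_i, v in [(3, 2), (5, 4), (2, 2)]:
    loops, b_e, degree = superficial_degree(b_i, v)
    print(f"B_I={b_i}, V={v}: L={loops}, B_E={b_e}, D={degree}, 4-B_E={4 - b_e}")
```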
Exercise: Prove this result for n-dimensional spacetime.

Physical or Renormalized Perturbation Theory


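Consider the parameterized Lagrangian

\[
\mathcal{L} = \mathcal{L}(z_1 = m, z_2 = \lambda, z_3 = z) = \frac{1}{2}\left( z^2 (\partial_\mu\phi)^2 - z^2 m^2 \phi^2 \right) - \frac{\lambda}{4!} z^4 \phi^4. \tag{237}
\]
Rewrite this in terms of the physical Lagrangian, the one that has been successful in corresponding with experimental data, and counter terms dependent on three new parameters $A$, $B$, and $C$,

\[
\mathcal{L} = \mathcal{L}_{\text{phys}} + \text{counter terms} \tag{238}
\]
\[
\mathcal{L} = \left[ \frac{1}{2}\left( z^2 (\partial_\mu\phi)^2 - m_{\text{phys}}^2 \phi^2 \right) - \frac{\lambda_{\text{phys}}}{4!} z^4 \phi^4 \right] + A(\partial_\mu\phi)^2 + B\phi^2 + C\phi^4 \tag{239}
\]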

Now, think of these parameters as additional interactions that can be shifted to eliminate the dependence on the cutoff. These parameters are determined iteratively by the constraint that physically observable quantities do not depend on the cutoff $k_c$.

The Feynman rules for the renormalized $\phi^4$ theory are the same rules plus two more due to an additional type of vertex that depends on the "additional interaction" parameters.

Top-left: $\frac{i}{k^2 - m_{\text{phys}}^2 + i\epsilon}$
Top-right: $-i\lambda_{\text{phys}}$
Bottom-left: $2i(Ak^2 + B)$
Bottom-right: $4! \cdot iC$
So, counter terms are added to the Lagrangian as "additional interactions", which introduce new Feynman diagrams. The parameters are determined iteratively to order $\lambda_{\text{phys}}^N$, at which we call them $A_N$, $B_N$, and $C_N$, and nothing depends on the cutoff.

The next iteration $N + 1$ is determined by requiring that the new propagator (bottom-left diagram above) at $\lambda_{\text{phys}}^{N+1}$ has a pole at $m_{\text{phys}}$ with residue equal to one. We also require that the scattering amplitude to order $O(\lambda_{\text{phys}}^{N+1})$ is equal to $-i\lambda_{\text{phys}}$ (bottom-right diagram above).

Non-Renormalizable Theories
A non-renormalizable theory requires an infinite number of parameters to en-
sure that operationally-defined quantities do not depend on the cutoff.

How do these theories appear in renormalized perturbation theory?

For instance, consider the Lagrangian

\[
\mathcal{L} = \mathcal{L}^0_{\text{phys}} + \mathcal{L}^{\text{int}}_{\text{phys}}(\lambda) + (\text{counter terms}). \tag{240}
\]
Calculate the scattering amplitude to order $O(\lambda_{\text{phys}}^N)$, and we will see that we need counter terms to eliminate dependence on the cutoff $k_c$, but as we go to higher and higher order in $\lambda_{\text{phys}}$, we need more and more counter terms, and this will continue and diverge, requiring an infinite number of counter terms and associated parameters to eliminate the cutoff dependence.

10 Lecture 10: Abelian Gauge Theory (Quantum Electrodynamics)
Why study abelian gauge theory?
Quantum electrodynamics, the theory of the electromagnetic field, is an example of an abelian gauge theory, and it prepares the ground for the nonabelian gauge theories that describe the SU(2) and SU(3) gauge bosons.

In developing a gauge theory, we will follow the same route of specifying sym-
metries, giving rise to invariants, of the theory and then, via quantization, look
for (projective) unitary representations of the group which are local.

As opposed to other field theories, the gauge theory should be symmetric under
a local gauge group G which acts independently at each location in spacetime.

For example, consider the circle group U(1), which consists of all complex
numbers with absolute value equal to 1, under multiplication

$$U(1) = \{e^{i\theta} : \theta \in [0, 2\pi)\}. \tag{241}$$


This symmetry group gives rise to the local gauge group G as the group of
functions from (1 + 3)-dimensional spacetime, Minkowski space, to the
circle group, with pointwise multiplication

$$G = \{g : M^{1+3} \to U(1)\}. \tag{242}$$


How does the local gauge group act on fields?

Consider the Dirac field, where we want a U(1) gauge-invariant quantum field
theory of electrons. Elements of G act independently at each spacetime location
$x \in M^{1+3}$. Equivalently, there is a copy of U(1) attached to each x, and
the copies act independently of each other,

$$g : \psi(x) \to \pi(g(x))\psi(x) = e^{i\alpha(x)}\psi(x), \tag{243}$$

where $\alpha(x) \in [0, 2\pi)$ is a phase.

Which theories are invariant under the Poincaré group and the local
gauge group G?

As it stands, we have an empty set of theories that are invariant under the
local gauge group. Begin populating it by building a Lagrangian density, with
a classical, continuous spacetime, and find which kinds of terms will be invariant.

Terms of the Lagrangian density for the Dirac field that we already know to be
invariant are $\bar\psi\psi$ and $(\bar\psi\gamma^\mu\psi)^2$ (contracted with itself).

Note that the quantity $\bar\psi\,\not{\partial}\psi$ is not invariant, as it is not well-defined! Why?

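The differential operator acting on the field, $\partial_\mu\psi$, is a limit

$$\partial_\mu\psi \equiv \lim_{\epsilon\to 0}\frac{\psi(x + \epsilon n^\mu) - \psi(x)}{\epsilon}. \tag{244}$$

The local gauge group acting on this quantity results in oscillatory terms
that do not converge as $\epsilon \to 0$ (e.g., $\sim e^{i\ldots}/\epsilon$). For a general gauge
transformation the limit does not exist, since the two terms pick up independent phases,

$$g(\partial_\mu\psi) = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(e^{i\alpha(x+\epsilon n^\mu)}\psi(x + \epsilon n^\mu) - e^{i\alpha(x)}\psi(x)\right). \tag{245}$$
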
This theory is boring at this moment, as the Lagrangian has only two gauge-
invariant terms. To introduce dynamics and impose the desired symmetry, we
need the derivative, but there is no way to build it from a single fermion (or boson)
field. Therefore, we introduce an auxiliary field which transforms nontrivially
under the local gauge group (cf. adding a catalyst in chemistry, which opens new
paths for reactions to take place: the catalyst is used, but not consumed, in the
reaction, and it lowers the activation energy of the reaction).

Introduce the parallel transporter, a recipe to compare a field at two independent
spacetime locations, which is dependent on the path $\gamma$

$$U_\gamma(y, x) \in U(1), \quad \forall\, x, y \in M^{1+3}. \tag{246}$$


To ensure that U(y, x) is a gauge-invariant comparator, allowing us to compare
the field at two spacetime locations in a gauge-invariant way, we require that
the local gauge group act as

$$g : U(y, x) \to e^{i\alpha(y)}\, U(y, x)\, e^{-i\alpha(x)}. \tag{247}$$


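So, the field $\psi(x)$ parallel transported to y, namely $U(y, x)\psi(x)$, transforms as

$$g : U(y, x)\psi(x) \to e^{i\alpha(y)}\, U(y, x)\, e^{-i\alpha(x)}\, e^{i\alpha(x)}\psi(x) \tag{248}$$

$$= e^{i\alpha(y)}\, U(y, x)\psi(x), \tag{249}$$

which is the same way that $\psi(y)$ transforms.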

Does such an object U (y, x) exist?

Yes, they exist and are made rigorous in the formalism of fibre bundles and
principal bundles. Note that U(y, x) is not a field and is a nonlocal object, but
it is expressible in terms of local objects and local data.

So, $\psi(x + \epsilon n^\mu)$ and $U(x + \epsilon n^\mu, x)\psi(x)$ transform the same way under the local
gauge group G, and we can introduce dynamics to the theory and define the
covariant derivative as

$$D_\mu\psi(x) \equiv \lim_{\epsilon\to 0}\frac{\psi(x + \epsilon n^\mu) - U(x + \epsilon n^\mu, x)\psi(x)}{\epsilon}. \tag{250}$$

What about the parallel transporter U (y, x)?

Suppose that $U(x + \epsilon n^\mu, x)$ is continuous and differentiable near x, and apply
the Taylor series expansion

$$U(x + \epsilon n^\mu, x) = U(x, x) + \epsilon n^\mu\partial_\mu U(x, x) + \ldots \tag{251}$$

$$= 1 + \epsilon n^\mu\partial_\mu U(x, x) + \ldots \tag{252}$$

$$= 1 - ie\epsilon n^\mu A_\mu(x) + \ldots \tag{253}$$

Where we tried U(x, x) = 1, since 1 transforms correctly under the gauge group,
and it is traditional to call $\partial_\mu U(x, x) = -ieA_\mu(x)$, where e is the coupling
constant (the electric charge). Now, $A_\mu(x)$ is not arbitrary, and must satisfy some constraints.

How does Aµ (x) transform under G?

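Apply a local gauge transformation g ∈ G to the last line above

$$e^{i\alpha(x+\epsilon n^\mu)}\, U(x + \epsilon n^\mu, x)\, e^{-i\alpha(x)} = 1 - ie\epsilon n^\mu A_\mu(x) + i\epsilon n^\mu\partial_\mu\alpha(x) + \ldots \tag{254}$$

Thus, the auxiliary field transforms under the local gauge group as

$$g : A_\mu(x) \to A_\mu(x) - \frac{1}{e}\partial_\mu\alpha(x). \tag{255}$$
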
And this gives us a definition of Aµ (x) in one gauge, and this transformation
law allows us to change gauge, or basis. Finding the form of Aµ (x) that satisfies
this local gauge transformation is tantamount to having an infinitesimal method
for building the parallel transporter object.

Put this all together into the covariant derivative (Exercise)

$$D_\mu\psi(x) = \partial_\mu\psi(x) + ieA_\mu(x)\psi(x). \tag{256}$$


Which is now a purely local object, dependent on only one spacetime location x.

Furthermore, the covariant derivative of the field transforms under the local
gauge group by introducing the phase factor, the same as the action of G on the
field ψ itself

$$g : D_\mu\psi(x) \to e^{i\alpha(x)} D_\mu\psi(x). \tag{257}$$
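
The covariance property can be checked numerically. The following is a minimal sketch in one dimension, not part of the lecture: the sample field `psi`, gauge field `A`, gauge function `alpha`, and the value of the coupling `E` are all arbitrary assumptions, the coupling is written as e with the convention $D = \partial + ieA$, and derivatives are taken by central finite differences.

```python
import cmath
import math

E = 0.3  # hypothetical value of the coupling constant e


def psi(x):
    """Sample matter field (an arbitrary smooth choice)."""
    return cmath.exp(2.0j * x) * (1.0 + 0.5 * x)


def A(x):
    """Sample gauge field component along x (an arbitrary smooth choice)."""
    return math.sin(x)


def alpha(x):
    """Sample local gauge function (an arbitrary smooth choice)."""
    return 0.7 * x * x


def deriv(f, x, h=1e-5):
    """Central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def covariant_derivative(field, gauge, x):
    """D psi = d psi/dx + i e A psi."""
    return deriv(field, x) + 1j * E * gauge(x) * field(x)


def psi_g(x):
    """Gauge-transformed matter field: psi -> exp(i alpha) psi."""
    return cmath.exp(1j * alpha(x)) * psi(x)


def A_g(x):
    """Gauge-transformed gauge field: A -> A - (1/e) d alpha/dx."""
    return A(x) - deriv(alpha, x) / E


def covariance_defect(x):
    """|D'psi' - exp(i alpha) D psi|, which should vanish up to discretization error."""
    lhs = covariant_derivative(psi_g, A_g, x)
    rhs = cmath.exp(1j * alpha(x)) * covariant_derivative(psi, A, x)
    return abs(lhs - rhs)


def plain_derivative_defect(x):
    """Same comparison for the ordinary derivative, which is not covariant."""
    lhs = deriv(psi_g, x)
    rhs = cmath.exp(1j * alpha(x)) * deriv(psi, x)
    return abs(lhs - rhs)
```

Evaluating `covariance_defect` at any point should give a value at the level of the finite-difference error, while `plain_derivative_defect` stays of order $|\partial\alpha\,\psi|$, which is the term the gauge field is introduced to cancel.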


So, we’ve built the parallel transporter in terms of the local field Aµ (x), and we
have a derivative object that transforms correctly under the local gauge group.
Now, build the Lagrangian density

$$\mathcal{L} = \bar\psi\, i\not{D}\psi - m\bar\psi\psi + \text{auxiliary field term(s)}. \tag{258}$$

To include the auxiliary field Aµ (x) in the quantization we need to endow it
with dynamics.

How do we give Aµ (x) dynamics?

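To first order,

$$U(x + \epsilon n^\mu, x) \sim e^{-ie\epsilon n^\mu A_\mu(x + \frac{1}{2}\epsilon n^\mu)}. \tag{259}$$

Use the parallel transporter to build a plaquette operator, which transports an
object around a square of side $\epsilon$

$$U_\square(x) = U(x, x + \epsilon\hat{2}) \cdot U(x + \epsilon\hat{2}, x + \epsilon(\hat{1} + \hat{2})) \cdot U(x + \epsilon(\hat{1} + \hat{2}), x + \epsilon\hat{1}) \cdot U(x + \epsilon\hat{1}, x). \tag{260}$$
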
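The plaquette operator $U_\square$ is gauge invariant, and, working out the Taylor
series, we can write it in terms of the auxiliary field (Exercise)

$$U_\square(x) = \exp\Big(-i\epsilon e\big[-A_2(x + \tfrac{1}{2}\epsilon\hat{2}) - A_1(x + \tfrac{1}{2}\epsilon\hat{1} + \epsilon\hat{2}) + A_2(x + \epsilon\hat{1} + \tfrac{1}{2}\epsilon\hat{2}) + A_1(x + \tfrac{1}{2}\epsilon\hat{1})\big] + O(\epsilon^3)\Big). \tag{261}$$

Expand in $\epsilon$ (Exercise)

$$U_\square(x) = 1 - i\epsilon^2 e(\partial_1 A_2 - \partial_2 A_1) + O(\epsilon^3). \tag{262}$$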


The choice of direction in this derivation is arbitrary, leaving 16 possible choices;
construct the 2-tensor

Fµν (x) = ∂µ Aν − ∂ν Aµ . (263)
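
The plaquette expansion and its gauge invariance can be checked numerically. This is a rough sketch under stated assumptions: the two-dimensional sample field `A1`, `A2`, the coupling value `E`, and the gauge function passed as `alpha` are arbitrary illustrative choices, and each link uses the midpoint rule of the first-order transporter.

```python
import cmath
import math

E = 0.5  # hypothetical value of the coupling e


def A1(x, y):
    """Sample gauge field component along direction 1 (arbitrary choice)."""
    return math.sin(y) + 0.3 * x


def A2(x, y):
    """Sample gauge field component along direction 2 (arbitrary choice)."""
    return x * x - 0.2 * y


def F12(x, y):
    """d_1 A_2 - d_2 A_1 for the sample field, differentiated by hand."""
    return 2.0 * x - math.cos(y)


def link(start, end, alpha=None):
    """Transporter U(end, start) along a straight link, midpoint rule.

    If a gauge function alpha(x, y) is given, the link is gauge transformed as
    U -> exp(i alpha(end)) U exp(-i alpha(start)).
    """
    (x0, y0), (x1, y1) = start, end
    xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
    u = cmath.exp(-1j * E * ((x1 - x0) * A1(xm, ym) + (y1 - y0) * A2(xm, ym)))
    if alpha is not None:
        u = cmath.exp(1j * alpha(*end)) * u * cmath.exp(-1j * alpha(*start))
    return u


def plaquette(x, y, eps, alpha=None):
    """Product of four links around the square with corner (x, y), side eps.

    The loop runs (x, y) -> +1 -> +2 -> -1 -> -2, back to (x, y).
    """
    p0, p1, p2, p3 = (x, y), (x + eps, y), (x + eps, y + eps), (x, y + eps)
    return (link(p3, p0, alpha) * link(p2, p3, alpha)
            * link(p1, p2, alpha) * link(p0, p1, alpha))
```

For small `eps`, `plaquette(x, y, eps)` should agree with `1 - 1j * eps**2 * E * F12(x, y)` up to terms of order `eps**3`, and passing any `alpha` should leave the result unchanged because the phases at the corners cancel around the closed loop.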


Note that $F_{\mu\nu}$ is locally gauge invariant, but with free spacetime indices it
transforms as a tensor under Lorentz transformations and is not a Poincaré-
invariant scalar. Construct the gauge, Lorentz, and Poincaré invariant object
from $F_{\mu\nu}$ by contracting the indices

$$-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}. \tag{264}$$

Then the first nontrivial Lagrangian density we can construct is the exact
Lagrangian for quantum electrodynamics: an electromagnetic field minimally
coupled to the Dirac field,

$$\mathcal{L} = \bar\psi(i\not{D} - m)\psi - \frac{1}{4}F^{\mu\nu}F_{\mu\nu}. \tag{265}$$

The term “minimally coupled” means that the gauge field couples to the Dirac
field only through the covariant derivative. The resulting theory is renormalizable.

Alternative Derivation of F µν
Since $D_\mu\psi$ transforms covariantly under the gauge group, so does the
commutator $[D_\mu, D_\nu]\psi$. So, under the local gauge group, the commutator transforms as

$$g : [D_\mu, D_\nu]\psi(x) \to e^{i\alpha(x)}[D_\mu, D_\nu]\psi(x). \tag{266}$$


Plugging in the expression for the covariant derivative and working out the
commutator, we get

[Dµ , Dν ]ψ(x) = [∂µ , ∂ν ]ψ + ie([∂µ , Aν ] − [∂ν , Aµ ])ψ − e2 [Aµ , Aν ]ψ. (267)

The commutators of ∂ and A with themselves are zero. Therefore,

[Dµ , Dν ]ψ(x) = ie(∂µ Aν − ∂ν Aµ )ψ(x) (268)


The commutator of the covariant derivative is proportional to the curvature tensor (the field strength)

$$[D_\mu, D_\nu]\psi(x) = ieF_{\mu\nu}(x)\psi(x). \tag{269}$$

11 Lecture 11: Nonabelian Gauge Theory (Yang-Mills)
As a recap of abelian gauge theory, a gauge theory is a theory that is invariant
under a group G of local symmetry transformations, which act independently
at each point in spacetime M1,3 .

In contrast to local symmetry groups, a global symmetry group that we have
dealt with extensively is the Poincaré group, which acts on all of spacetime at
once, so that the same transformation is applied at every spacetime location.

In the context of the Dirac spinor and fermionic field theories, consider the local
phase transformation

$$\psi(x) \to e^{i\alpha(x)}\psi(x), \tag{270}$$

where we assumed that the phase function $\alpha(x)$, $x \in M^{1,3}$, is differentiable,
and maps Minkowski space to the interval of angles, $\alpha : M^{1,3} \to [0, 2\pi)$.

The new, larger, more constraining symmetry group that we are building an
invariant theory under is the Poincaré group plus the local gauge group G. The
only terms invariant under this new symmetry group that we found fit for the
Lagrangian density were

$$\bar\psi\psi \quad\text{and}\quad (\bar\psi\gamma^\mu\psi)^2. \tag{271}$$


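In order to compare two independent points of spacetime in this theory, we
introduced dynamics in the form of the covariant derivative. The covariant derivative
was defined in terms of an auxiliary gauge field $A_\mu$

$$\partial_\mu \to \partial_\mu - ieA_\mu \tag{272}$$

(the sign of the coupling is a convention, and is flipped here relative to Lecture 10).
Call $D_\mu = \partial_\mu - ieA_\mu$, and then we have the additional invariant term $\bar\psi\not{D}\psi$ to
include in the Lagrangian density under this new representation of the derivative.
Schematically, dropping the mass term and normalization factors,

$$\mathcal{L} = \bar\psi\not{D}\psi + F^{\mu\nu}F_{\mu\nu}, \tag{273}$$

where $F^{\mu\nu}$ is the curvature tensor (the field strength) that includes derivatives of the
gauge field $A_\mu$.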

Nonabelian Gauge Theory


Consider the gauge group to be the special unitary group of 2 × 2 matrices
SU(2). Note that for full generality, we should consider arbitrary connected Lie
groups, but SU(2) will get us almost all of the story, and the representation
theory of SU(2) can be used to find the representation theory of general
Lie groups.

What is the local gauge group of SU (2)?

This theory must be invariant under group transformations V(x) ∈ SU(2),
where $x \in M^{1,3}$. An element V(x) of this unitary group has the following form
and constraints

$$V(x) = \begin{pmatrix} v_{00}(x) & v_{01}(x) \\ v_{10}(x) & v_{11}(x) \end{pmatrix} \tag{274}$$

$$\sum_{j,k=0}^{1} |v_{jk}|^2 = 2$$

$$V^\dagger(x)V(x) = I$$

$$\det(V(x)) = 1$$
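
As an illustration of these constraints, here is a small sketch that builds an SU(2) element and checks all three of them. The axis-angle parameterization $V = \cos(\theta/2)\, I + i\sin(\theta/2)\, \hat{n}\cdot\sigma$ is a standard one but is an assumption here, since the notes have not yet chosen a parameterization; the helper names are hypothetical.

```python
import math


def su2_element(theta, nx, ny, nz):
    """V = cos(theta/2) I + i sin(theta/2) n.sigma, with n normalized to unit length."""
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[complex(c, s * nz), complex(s * ny, s * nx)],
            [complex(-s * ny, s * nx), complex(c, -s * nz)]]


def dagger(v):
    """Conjugate transpose of a 2 x 2 matrix."""
    return [[v[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(v):
    """Determinant of a 2 x 2 matrix."""
    return v[0][0] * v[1][1] - v[0][1] * v[1][0]


def sum_abs_squared(v):
    """Sum over all entries of |v_jk|^2, which equals Tr(V^dagger V)."""
    return sum(abs(v[i][j]) ** 2 for i in range(2) for j in range(2))
```

The four real numbers $(\cos(\theta/2), \sin(\theta/2)\,\hat{n})$ have unit length, which is the statement that SU(2) can be identified with the unit quaternions.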

We need to choose how V(x) acts on a field. Introduce two independent spinor
fields $\psi_0(x)$ and $\psi_1(x)$, which V(x) mixes into a new basis under which the new
theory must be invariant,

$$\psi_j(x) \to \sum_{k=0}^{1} v_{jk}(x)\psi_k(x). \tag{275}$$

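Build the doublet field as an 8 × 1 vector that Poincaré-transforms like two
independent spinors

$$\Psi(x) = \begin{pmatrix} \psi_0(x) \\ \psi_1(x) \end{pmatrix}. \tag{276}$$

How does the local gauge group, with element g ∈ G, act on the doublet field?

$$g : \Psi(x) \to \begin{pmatrix} v_{00}(x) \cdot I_{4\times 4} & v_{01}(x) \cdot I_{4\times 4} \\ v_{10}(x) \cdot I_{4\times 4} & v_{11}(x) \cdot I_{4\times 4} \end{pmatrix} \begin{pmatrix} \psi_0(x) \\ \psi_1(x) \end{pmatrix}. \tag{277}$$

The invariant terms we can construct from the doublet field are

$$\bar\Psi\Psi = \sum_{j=0}^{1}\bar\psi_j\psi_j \quad\text{and}\quad (\bar\Psi\Gamma^\mu\Psi)^2, \quad\text{where}\quad \Gamma^\mu = \begin{pmatrix} \gamma^\mu & 0 \\ 0 & \gamma^\mu \end{pmatrix}. \tag{278}$$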

As in the abelian case, build the covariant derivative by introducing the parallel
transporter U (y, x) ∈ SU (2) and going to a representation of the local gauge
group. Under the local gauge transformation

U (y, x) → V (y)U (y, x)V † (x). (279)

The covariant derivative is then defined to be

$$n^\mu D_\mu\Psi \equiv \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(\Psi(x + \epsilon n) - U(x + \epsilon n, x)\Psi(x)\right), \tag{280}$$

where we note that $U(x + \epsilon n, x)$ is a 2 × 2 matrix depending on two different
spacetime locations. To ensure locality of the theory, we only need to know
U(y, x) for $y \simeq x$.

Stepping back, suppose that we have some element of the gauge group U ∈
SU (2), which we assume is differentiable. Note that this U is not the parallel
transporter yet.

Since U is unitary, and recalling that group elements close to the identity are
exponentials of elements of the underlying Lie algebra, let

$$U = e^{iA}, \tag{281}$$

where A is a Hermitian, traceless matrix, such that $A = A^\dagger$ and Tr(A) = 0.

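Though it is not necessary, we anticipate computation in the future, and choose
a basis. Namely, we choose the 2 × 2 Pauli spin matrices as a basis, noting that
any 2 × 2 Hermitian traceless matrix can be written as a real linear combination
of the three Pauli matrices. In the Pauli basis, the Hermitian matrix is

$$A = \sum_{j=1}^{3}\alpha^j\frac{\sigma^j}{2}. \tag{282}$$

The generators $\sigma^j/2$ are

$$\frac{\sigma^1}{2} = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \frac{\sigma^2}{2} = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad\text{and}\quad \frac{\sigma^3}{2} = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{283}$$

and obey the Lie bracket

$$\left[\frac{\sigma^j}{2}, \frac{\sigma^k}{2}\right] = i\epsilon^{jkl}\frac{\sigma^l}{2}. \tag{284}$$

So, to specify the Hermitian matrix A, we need three real numbers $\alpha^j \in \mathbb{R}$.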
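
The algebra of the generators is easy to verify by brute force. The sketch below assumes the standard (unnormalized) Pauli matrices and takes the generators to be $\sigma^j/2$; the function names are hypothetical. It checks the bracket $[\sigma^j/2, \sigma^k/2] = i\epsilon^{jkl}\sigma^l/2$ and the trace identity $\text{Tr}(\sigma^j\sigma^k) = 2\delta^{jk}$.

```python
# Standard Pauli matrices, indexed 0, 1, 2 for sigma^1, sigma^2, sigma^3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def levi_civita(j, k, l):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (j - k) * (k - l) * (l - j) // 2


def generator(j):
    """Lie algebra generator sigma^j / 2."""
    return [[0.5 * entry for entry in row] for row in SIGMA[j]]


def commutator_defect():
    """Largest entry of [T_j, T_k] - i eps_jkl T_l over all j, k."""
    worst = 0.0
    for j in range(3):
        for k in range(3):
            tj, tk = generator(j), generator(k)
            left, right = matmul(tj, tk), matmul(tk, tj)
            for r in range(2):
                for c in range(2):
                    expected = sum(1j * levi_civita(j, k, l) * generator(l)[r][c]
                                   for l in range(3))
                    worst = max(worst, abs(left[r][c] - right[r][c] - expected))
    return worst


def trace_of_product(j, k):
    """Tr(sigma^j sigma^k) for the unnormalized Pauli matrices."""
    p = matmul(SIGMA[j], SIGMA[k])
    return p[0][0] + p[1][1]
```

The trace identity is what fixes the numerical factor when the gauge-invariant term $\text{Tr}(F_{\mu\nu}F^{\mu\nu})$ is written in components.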

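Consider some other Hermitian, traceless matrix $B(x; n, \epsilon)$, which vanishes to
zeroth order in $\epsilon$, so that $e^{iB}$ is equal to the identity at that order, and which
to first order is proportional to the gauge field. Then
the parallel transporter constructed from B has the form

$$U(x + \epsilon n, x) = e^{iB(x; n, \epsilon)} = I_{2\times 2} + ig\epsilon n^\mu\sum_{j=1}^{3}A^j_\mu\frac{\sigma^j}{2} + O(\epsilon^2). \tag{285}$$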

Then, in the chosen Pauli basis, the covariant derivative is an 8 × 8 matrix and
is defined as

$$n^\mu D_\mu\Psi \equiv \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(\Psi(x + \epsilon n) - U(x + \epsilon n, x)\Psi(x)\right), \tag{286}$$

where

$$D_\mu = \partial_\mu - igA^j_\mu(x)\frac{\sigma^j}{2}. \tag{287}$$

The coefficient field $A^j_\mu(x)$ is not arbitrary and must obey transformation laws
of the local gauge group, as well as give a representation of the local gauge group
determined by the action of the local gauge group on the parallel transporter

$$V : U(x + \epsilon n, x) \to V(x + \epsilon n)\, U(x + \epsilon n, x)\, V^\dagger(x) \tag{288}$$

$$= V(x + \epsilon n)\left(I + ig\epsilon n^\mu A^j_\mu\frac{\sigma^j}{2} + O(\epsilon^2)\right)V^\dagger(x). \tag{289}$$

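Calculating the action of $V(x + \epsilon n)$ and $V^\dagger(x)$, the first order term in $\epsilon$ becomes
(Exercise)

$$\text{L.G.} : A^j_\mu(x)\frac{\sigma^j}{2} \to V(x)\left(A^j_\mu(x)\frac{\sigma^j}{2} + \frac{i}{g}\partial_\mu\right)V^\dagger(x). \tag{290}$$

Hint: $V(x + \epsilon n)V^\dagger(x) = [(1 + \epsilon n^\mu\partial_\mu)V(x)]\, V^\dagger(x) + O(\epsilon^2)$.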
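
One way the exercise can be worked is sketched below. It uses only the hint and the identity $(\partial_\mu V)V^\dagger = -V\partial_\mu V^\dagger$, which follows from differentiating $VV^\dagger = I$; the first-order form of the transporter, $U = I + ig\epsilon n^\mu A^j_\mu\sigma^j/2$, is assumed.

```latex
\begin{align*}
V(x+\epsilon n)\,U(x+\epsilon n, x)\,V^\dagger(x)
  &= \big[V(x) + \epsilon n^\mu \partial_\mu V(x)\big]
     \Big[I + ig\epsilon n^\mu A^j_\mu \tfrac{\sigma^j}{2}\Big] V^\dagger(x) + O(\epsilon^2) \\
  &= I + \epsilon n^\mu (\partial_\mu V) V^\dagger
       + ig\epsilon n^\mu\, V A^j_\mu \tfrac{\sigma^j}{2} V^\dagger + O(\epsilon^2) \\
  &= I + ig\epsilon n^\mu \Big[ V A^j_\mu \tfrac{\sigma^j}{2} V^\dagger
       + \tfrac{i}{g}\, V \partial_\mu V^\dagger \Big] + O(\epsilon^2).
\end{align*}
```

Comparing the bracket in the last line with the first-order form of the transporter identifies the transformed gauge field, with the derivative acting on $V^\dagger(x)$.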

Next, to compute $V(x)\partial_\mu V^\dagger(x)$, we will take the infinitesimal approach. Recall
that for the abelian case, we just had the phase factors $e^{i\alpha(x)}\partial_\mu e^{-i\alpha(x)}$, which was
just a number, but now with V(x) we have a 2 × 2 matrix with an 8 × 8 representation.

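When V(x) is infinitesimally close to the identity, we know that it is some
exponential factor in the Pauli sigma matrices

$$V(x) = e^{i\alpha^j(x)\frac{\sigma^j}{2}}, \tag{291}$$

where $\alpha^j(x)$ are small numbers. Applying the Taylor expansion with respect
to $\alpha(x)$, the action of V(x) on the partial derivative is

$$V(x)\partial_\mu V^\dagger(x) = \left(I + i\alpha^j\frac{\sigma^j}{2}\right)\partial_\mu\left(I - i\alpha^j\frac{\sigma^j}{2}\right) \tag{292}$$

$$= -i\frac{\partial\alpha^j}{\partial x^\mu}\frac{\sigma^j}{2} + O(\alpha^2). \tag{293}$$

Then, under the local gauge transformation, the gauge field and sigma matrices
transform as

$$\text{L.G.} : A^j_\mu(x)\frac{\sigma^j}{2} \to A^j_\mu(x)\frac{\sigma^j}{2} + \frac{1}{g}(\partial_\mu\alpha^j(x))\frac{\sigma^j}{2} + i\left[\alpha^j(x)\frac{\sigma^j}{2}, A^k_\mu(x)\frac{\sigma^k}{2}\right]. \tag{294}$$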

Now we can see what the infinitesimal local gauge transformation does to the covariant
derivative of the doublet spinor field $\Psi(x)$

$$\text{L.G.} : \Psi(x) \to \left(I + i\alpha^j(x)\frac{\sigma^j}{2}\right)\Psi(x) \tag{295}$$

$$\text{L.G.} : D_\mu\Psi(x) \to \left(\partial_\mu - igA^j_\mu(x)\frac{\sigma^j}{2} - i(\partial_\mu\alpha^j(x))\frac{\sigma^j}{2} + g\left[\alpha^j(x)\frac{\sigma^j}{2}, A^k_\mu(x)\frac{\sigma^k}{2}\right]\right)\left(1 + i\alpha^j(x)\frac{\sigma^j}{2}\right)\Psi(x) \tag{296}$$

To first order in $\alpha$, the right-hand side of the infinitesimal transformation becomes

$$\text{L.G.} : D_\mu\Psi(x) \to \left(1 + i\alpha^j(x)\frac{\sigma^j}{2}\right)D_\mu\Psi(x) \tag{297}$$

$$= V(x)D_\mu\Psi(x). \tag{298}$$

Where, for the physicist, ignoring issues of connectivity of the local gauge
group (will need gauge fixing), we make the “big” gauge transformation (last
line) by exponentiating (e.g., $(1 + \frac{x}{n})^n \to e^x$ as $n \to \infty$).

Now we need to build a Lagrangian density term that gives dynamics to the
gauge field Ajµ (x). Recall the commutator [Dµ , Dν ], the curvature of the SU (2)
fibre bundle, which is local gauge invariant, and involves derivatives of Ajµ . This
transforms under local gauge as

L.G. : [Dµ , Dν ]Ψ(x) → V (x)[Dµ , Dν ]Ψ(x). (299)


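In the nonabelian case, the commutator has the form, using the fact that mixed
partial derivatives commute (Exercise)

$$[D_\mu, D_\nu] = -igF^j_{\mu\nu}\frac{\sigma^j}{2} \tag{300}$$

$$= -ig\left(\partial_\mu A^j_\nu\frac{\sigma^j}{2} - \partial_\nu A^j_\mu\frac{\sigma^j}{2} - ig\left[A^j_\mu\frac{\sigma^j}{2}, A^k_\nu\frac{\sigma^k}{2}\right]\right), \tag{301}$$

where the overall sign follows from $D_\mu = \partial_\mu - igA^j_\mu\frac{\sigma^j}{2}$.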

So, in the nonabelian case, the curvature tensor Fµν depends quadratically on
Ajµ , whereas in the abelian case it was linearly dependent. Therefore, the invari-
ant term ∼ F µν Fµν yields cubic and quartic terms in the Lagrangian density,
making an “interacting” theory.

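Under the local gauge group, the curvature tensor transforms by a similarity transformation

$$\text{L.G.} : F^j_{\mu\nu}\frac{\sigma^j}{2} \to V(x)\left(F^j_{\mu\nu}\frac{\sigma^j}{2}\right)V^\dagger(x). \tag{302}$$

By the similarity transformation, we can build an invariant Lorentz scalar with
the trace of the invariant term (Exercise, and note that $\text{Tr}(\sigma^j\sigma^k) = 2\delta^{jk}$ for the Pauli matrices)

$$\text{Tr}\left[\left(F^j_{\mu\nu}\frac{\sigma^j}{2}\right)\left(F^{\mu\nu,k}\frac{\sigma^k}{2}\right)\right] = \frac{1}{2}(F^j_{\mu\nu})^2. \tag{303}$$

The (classical) Lagrangian density for the nonabelian gauge theory, invariant
under local gauge and Poincaré transformations is

$$\mathcal{L} = \bar\Psi(i\not{D} - m)\Psi - \frac{1}{4}(F^j_{\mu\nu})^2. \tag{304}$$
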
Note that in the abelian case (theory of QED), the dynamics in the Lagrangian
density of the gauge field were quadratic. Considering only the gauge field term,
with no matter (fermions), essentially results in the wave equation,

$$F^2_{\mu\nu} = (\partial_\mu A_\nu - \partial_\nu A_\mu)^2. \tag{305}$$

In the nonabelian case, there are also cubic and quartic terms, since the curvature
contains the commutator of the gauge fields.

The Lagrangian density can be quantized in two ways:

1. Perturbatively via path integrals


2. Non-perturbatively via lattice discretization.

12 Lecture 12: Quantization of Gauge Theories
Peskin and Schroeder, page 294

Recall the classical Lagrangian density $\mathcal{L} = \bar\psi(i\not{D} - m)\psi - \frac{1}{4}F^a_{\mu\nu}F^{\mu\nu,a}$, where a
labels the three components of the gauge field, that we crafted to be invariant
under the local gauge symmetry group SU(2). We introduced the “helper” gauge field
$A^a_\mu$, which manifests in the terms of the curvature tensor of the SU(2) fibre bundle,
defined by $[D_\mu, D_\nu] = -igF^a_{\mu\nu}\frac{\sigma^a}{2}$, where $D_\mu = \partial_\mu - igA^a_\mu\frac{\sigma^a}{2}$ is the covariant
derivative.

This Lagrangian density represents a nontrivial dynamical system that is
invariant under the local gauge symmetry group and endows fermions, as well as
other fields, with dynamics. Unlike many other effective classical theories, this
one is not quadratic in its fields and yields nonlinear equations of motion (e.g.,
instanton and soliton solutions).

We now quantize this gauge theory to build a quantum theory that is invariant
under the Poincaré and local gauge symmetry groups by finding the correct
representation that has this Lagrangian density as its effective classical limit.

Two problems that arise for the gauge theories are (1) the classical theory (L0 )
is already nonlinear, and (2) there are lots of symmetries, global and local. Lo-
cal symmetries are represented by copies of SU (2) acting independently of each
other at each spacetime location.

Two approaches of quantization that we will explore include (1) an analytic,
but naive path integral quantization $\int \mathcal{D}A\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\, e^{iS}$ that is good for high-energy
scattering processes, but not for calculating ground state correlation functions.
The mathematical rigor of this approach is a current topic of research. And
(2) a computational route of lattice quantization, which makes dealing with
nonlinearity “easy”, but it loses Poincaré invariance in the process. Note that
there is also the route of canonical quantization, but we will not bother with
that here.

Path Integral Quantization of Gauge Theories


First, some words on the space of all gauge fields $A^a_\mu$: note that $A^a_\mu\frac{\sigma^a}{2}$ and
$A^a_\mu\frac{\sigma^a}{2} + \frac{1}{g}(\partial_\mu\alpha^a)\frac{\sigma^a}{2} + i\left[\alpha^b\frac{\sigma^b}{2}, A^c_\mu\frac{\sigma^c}{2}\right]$ are gauge equivalent, meaning that there
exists an infinite number of $A^a_\mu$'s with the same path integrand $e^{iS}$, causing the
path integral to result in infinity, since we are redundantly integrating over a
continuous infinity of physically equivalent field configurations.

Consider the gauge field $A^a_\mu$, which is a list of 12 numbers at each point of
(3 + 1)-dimensional spacetime. To simplify, put it on (0 + 1)-dimensional spacetime
$M^{0+1}$ (e.g., a line), where $A^a_\mu$ is just a list of three numbers. Now, SU(2) acts
independently on $A^a_\mu$ at each point in $M^{0+1}$. This action is tantamount to
multiplying by a phase on a sphere $S^3$ at each spacetime location, since SU(2) is
parameterized by four numbers, the coefficients of the quaternions with norm equal
to one. Below is a schematic of the space of all $A^a_\mu$'s.

Schematic of gauge field configurations, independent copies of SU(2) acting on
(0 + 1)-dimensional spacetime.

So $A^a_\mu$ is like a vector on $S^3$ at each spacetime location and is a possible
configuration in $M^{0+1}$. Configurations related by the local gauge group G form an
equivalence class $[A^a_\mu]$. Addressing the problem of very many symmetries,
global and local, in the path integral approach, there are very many gauge-
equivalent configurations to sum over in each equivalence class, but we'd like to
just choose one configuration per class.

Another, more common schematic to demonstrate the action of gauge groups is
to consider rotations in SO(2), where our theory is a zero-dimensional theory
invariant under SO(2). Configurations in this schematic are just points in a
two-dimensional space with equivalence classes defined by circles centered at
the origin. See schematic below.

Schematic of gauge field configurations and equivalence classes represented by
rotations in SO(2).

In choosing a representative point per equivalence class, we enforce the gauge
fixing condition. A good choice of representative may be the point where the circle
crosses the positive horizontal axis. We then integrate over the space of the chosen
representatives. This effectively reduces the size of the configuration space and makes the
path integral much more tractable.
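
The zero-dimensional SO(2) picture can be turned into a small numerical experiment. The sketch below is an illustration under stated assumptions, not the lecture's construction: the "action" $S = x^2 + y^2$ and the gauge condition $G = y = 0$ are arbitrary choices. For this condition the analogue of the Faddeev-Popov determinant is $|\partial G/\partial\theta| = |x|$, and the line $y = 0$ crosses every circle twice, so each class is counted twice.

```python
import math


def action(r):
    """Rotation-invariant toy action S(r) = r^2 (an arbitrary choice)."""
    return r * r


def full_integral(half_width=6.0, n=400):
    """Midpoint-rule estimate of the ungauged integral of exp(-S) over the plane."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += math.exp(-action(math.hypot(x, y)))
    return total * h * h


def gauge_fixed_integral(half_width=6.0, n=4000):
    """Integral over the gauge slice y = 0, weighted by the determinant |x|."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += abs(x) * math.exp(-action(abs(x)))
    return total * h


GROUP_VOLUME = 2.0 * math.pi  # volume of SO(2), the factored-out gauge orbit
GRIBOV_COPIES = 2             # the slice y = 0 meets each circle twice
```

With these choices `full_integral()` should agree with `GROUP_VOLUME / GRIBOV_COPIES * gauge_fixed_integral()`: the redundant integral is the gauge-fixed one times the volume of the group, corrected for the number of times the gauge condition picks a representative on each orbit.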

Note that we want to pick only one representative per equivalence class, but there
can be more than one depending on the choice of gauge fixing condition, for example
when the gauge fixing condition is a function that crosses the equivalence class
circle more than once. This is called the Gribov ambiguity, and commonly happens
in the Coulomb gauge.

Example schematic of Gribov ambiguity, where more than one representative of
the equivalence class is chosen by the gauge fixing condition.

Gauge Fixing Condition


Recall that the path integral approach is made difficult by the fact that we are
redundantly integrating over a continuous infinity of physically equivalent field
configurations. By applying a gauge fixing condition, we isolate the interesting
part of the integral and count each distinct physical configuration only once.
Finding the right gauge fixing function G allows us to separate out this over-
counting in the path integral and throw it away. Note that we are free to “throw
it away” since we are guessing a quantum theory.

Choosing the gauge fixing function to be $G(A^a_\mu) = \partial^\mu A^a_\mu - \omega^a$, where $\omega^a$ is any
scalar field, is a generalization of the Lorentz gauge, and setting this equal to
zero is a generalization of the Lorentz gauge condition

$$G(A^a_\mu) = \partial^\mu A^a_\mu - \omega^a = 0. \tag{306}$$

With the gauge fixing function, the way we separate the overcounting is by
inserting unity

$$1 = \int\mathcal{D}\alpha\; \delta(G(A^\alpha))\det\left(\frac{\delta G(A^\alpha)}{\delta\alpha}\right). \tag{307}$$

Breaking this equation down, we are performing a path integral over all possible
gauge transformations, and picking out only the $A^\alpha$ for which $G(A^\alpha)$ equals zero,
obeying the gauge fixing condition, choosing a single representative of the equivalence
class. The determinant is called the Faddeev-Popov determinant. The notation
$A^\alpha$ indicates the locally gauge transformed gauge field

$$(A^\alpha)^a_\mu = A^a_\mu + \frac{1}{g}\partial_\mu\alpha^a + f^{abc}A^b_\mu\alpha^c \tag{308}$$

$$= A^a_\mu + \frac{1}{g}D_\mu\alpha^a, \tag{309}$$

where $f^{abc}$ are the structure constants from the Pauli spin matrix commutation
relations

$$\left[\frac{\sigma^a}{2}, \frac{\sigma^b}{2}\right] = if^{abc}\frac{\sigma^c}{2}. \tag{310}$$

This way of writing 1 is the continuous, functional generalization of that for
discrete, many-variable n-dimensional vectors

$$1 = \left(\prod_{j=1}^{n}\int da_j\right)\delta^{(n)}(g(a))\det\left(\frac{\partial g_j}{\partial a_k}\right), \tag{311}$$

where the determinant here is the Jacobian of the change of variables. By the
change of variables b = g(a), this can be written as (Exercise)

$$1 = \left(\prod_{j=1}^{n}\int db_j\right)\delta^{(n)}(b). \tag{312}$$

Inserting the continuous, functional version of one into the path integral, we now
have an expression that integrates over the equivalence classes but “sucks out”
the overcounting to just one representative of the class

$$\int\mathcal{D}\alpha\int\mathcal{D}A\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\; \delta(G(A^\alpha))\det\left(\frac{\delta G(A^\alpha)}{\delta\alpha}\right)e^{iS}. \tag{313}$$

Evaluating the Faddeev-Popov determinant in our choice of the Lorentz gauge
(Exercise)

$$\det\left(\frac{\delta G(A^\alpha)}{\delta\alpha}\right) = \det\left(\frac{1}{g}\partial^\mu D_\mu\right) \tag{314}$$

$$= \int\mathcal{D}c\,\mathcal{D}\bar{c}\; e^{\frac{i}{g}\int d^4x\; \bar{c}(-\partial^\mu D_\mu)c}, \tag{315}$$

where, recalling the study of fermions, we have used auxiliary Grassmann-
valued, scalar (spin-0) fields $c$ and $\bar{c}$. These fields are non-physical and must
disappear from the final results. They are called Faddeev-Popov ghosts, or simply ghosts.

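Inserting the ghost expression for the determinant into the path integral, we
now have

$$\int\mathcal{D}\alpha\int\mathcal{D}A\int\mathcal{D}c\,\mathcal{D}\bar{c}\int\mathcal{D}\psi\,\mathcal{D}\bar\psi\; \delta(\partial^\mu A^a_\mu - \omega^a)\, e^{i\left(S + \frac{1}{g}\int d^4x\; \bar{c}(-\partial^\mu D_\mu)c\right)}. \tag{316}$$

Integrate out the delta functional, since $\omega^a$ is arbitrary, using a Gaussian
integral over $\omega^a$ with coefficient $\xi \in [0, 1]$, and calling $S' = S + \frac{1}{g}\int d^4x\; \bar{c}(-\partial^\mu D_\mu)c$,

$$\int\mathcal{D}\omega\; e^{-i\int d^4x\, \frac{1}{2}\xi(\omega^a)^2}\left(\int\mathcal{D}\ldots\; \delta(\partial^\mu A^a_\mu - \omega^a)\, e^{iS'}\right). \tag{317}$$

Since $\omega^a$ is an arbitrary, gauge-fixing, scalar function that takes one
representative of each independent equivalence class, the full path integral will now be
independent of $\omega^a$, and we have the form

$$N(\xi)\int\mathcal{D}\alpha\int\mathcal{D}A\int\mathcal{D}c\,\mathcal{D}\bar{c}\int\mathcal{D}\psi\,\mathcal{D}\bar\psi\; e^{i\int d^4x\, \mathcal{L}'}. \tag{318}$$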

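The Lagrangian $\mathcal{L}'$ has the form

$$\mathcal{L}' = \bar\psi(i\not{D} - m)\psi - \frac{1}{4}(F^a_{\mu\nu})^2 - \frac{1}{2}\xi(\partial^\mu A^a_\mu)^2 + \frac{1}{g}\bar{c}^a(-\partial^\mu D^{ab}_\mu)c^b, \tag{319}$$

where the sign of the gauge-fixing term follows from the Gaussian weight in the $\omega^a$ integral.
Note that the integral over $\alpha$ blows up to infinity, but in correlation functions
we always have ratios of the path integrals and the $N(\xi) \cdot \infty$'s will cancel out.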

Remaining questions include

• Does the path integral above even define a quantum theory, and is it
invariant under Poincaré and local gauge symmetry group transformations?
See the work of 't Hooft.
• Do the ghosts $c$ and $\bar{c}$ vanish from the processes?
Feynman diagrams will clear up this concern.
• Is the Lagrangian density (theory) $\mathcal{L}'$ renormalizable?
Also see the work of 't Hooft.

13 Lecture 13: Quantization of Nonabelian Gauge Theory
In the path integral quantization of gauge theories, we started by guessing a
quantum theory by integrating $e^{iS}$, with S the action functional, over fermion
fields $\psi$, $\bar\psi$ and gauge boson fields A to calculate transition amplitudes.

We noticed that there are many gauge-equivalent configurations of the gauge
field that cause naive divergences, due to overcounting, in the Feynman diagram
expansion. To tame this infinity, we fixed the gauge by choosing a single
representative from each gauge equivalence class by inserting unity in terms of
the delta functional. This introduced a nonlinear Jacobian term $\det\left(\frac{\delta G(A^\alpha)}{\delta\alpha}\right)$,
where $A^\alpha$ denotes the gauge-transformed gauge field.

To compute this determinant, we introduced the auxiliary (classical, scalar,
spinless) Grassmann-valued fields c(x), called Faddeev-Popov ghosts. These fields
are unphysical, and must drop out during calculation. Note that the PCT
theorem does not apply to ghosts. The transition amplitude is then defined
as

$$\langle\Phi_f|\, U\, |\Phi_i\rangle \equiv \int\mathcal{D}A\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}c\,\mathcal{D}\bar{c}\; e^{iS}. \tag{320}$$

Where the ghosts obey the anticommutation relation of Grassmann-valued fields,
$\{c(x), c(y)\} = 0$, and the Lagrangian density is

$$\mathcal{L} = -\frac{1}{4}(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 - \frac{1}{2}\xi(\partial^\mu A^a_\mu)^2 + \bar\psi(i\not{\partial} - m)\psi + \bar{c}^a(-\partial^\mu\partial_\mu)c^a + \mathcal{L}_{\text{int}}(g). \tag{321}$$

Note that the interacting bits, including the bits from the covariant derivative
$D_\mu$, are absorbed into $\mathcal{L}_{\text{int}}(g)$, and everything else in the expression above is the
free theory with partial derivatives $\partial_\mu$.

We now push forward under the belief that the perturbatively defined theory
above is representative of a quantum theory, and expand in powers of the
coupling g to define processes and Feynman rules of the Yang-Mills
theory.

Propagators (free theory, g = 0 part)


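The fermion propagator contributes

$$\langle\hat\psi_{j\alpha}(x)\hat{\bar\psi}_{l\beta}(y)\rangle = \delta_{jl}\int\frac{d^4k}{(2\pi)^4}\left(\frac{i}{\not{k} - m}\right)_{\alpha\beta}e^{-ik\cdot(x-y)}. \tag{322}$$

The gauge boson propagator contributes

$$\langle\hat{A}^a_\mu(x)\hat{A}^b_\nu(y)\rangle = \int\frac{d^4k}{(2\pi)^4}\frac{e^{-ik\cdot(x-y)}}{k^2 + i\epsilon}\left(\eta_{\mu\nu} - (1 - \xi)\frac{k_\mu k_\nu}{k^2}\right)\delta_{ab}. \tag{323}$$

The ghost propagator contributes

$$\langle\hat{c}^a(x)\hat{\bar{c}}^b(y)\rangle = \delta_{ab}\int\frac{d^4k}{(2\pi)^4}\frac{i}{k^2}e^{-ik\cdot(x-y)}. \tag{324}$$

Recall that a, b = 1, 2, 3 and µ, ν = 0, 1, 2, 3.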

Vertices (interacting theory, g ≠ 0 part)


The interaction of two fermions and one gauge boson contributes

$$ig\gamma^\mu\frac{\sigma^a}{2}. \tag{325}$$

The interaction of three gauge bosons contributes

$$gf^{abc}\left(\eta^{\mu\nu}(k - p)^\rho + \eta^{\nu\rho}(p - q)^\mu + \eta^{\rho\mu}(q - k)^\nu\right). \tag{326}$$

The interaction of four gauge bosons contributes

$$-ig^2\big(f^{abe}f^{cde}(\eta^{\mu\rho}\eta^{\nu\sigma} - \eta^{\mu\sigma}\eta^{\nu\rho}) + f^{ace}f^{bde}(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\sigma}\eta^{\nu\rho}) + f^{ade}f^{bce}(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma})\big). \tag{327}$$

The interaction of one gauge boson with two ghosts contributes

$$gf^{abc}p^\mu. \tag{328}$$

(Exercise) Work out the rest of the Feynman rules for Yang-Mills theory: symmetry
factors, signs, and conservation laws.

Big Question 1
Can we interpret this as a quantum theory? In other words, does this theory
implicitly define the Hermitian operators Âα , ψ̂, and ĉ?

Yes, it is a quantum theory, subject to a cutoff.

In more detail, what are the possible ways in which the theory could fail to be
a valid quantum theory?

The first mode of failure that can occur is the time evolution operator not
being unitary, or time translation not being a unitary process. To confirm that
time translation in this theory is a unitary process, check that the correlation
functions are symmetric under the group of Poincaré transformations (Exercise).
This also implies that probability is conserved.

The second mode of failure occurs if, after building the Fock space, we get
negative norm states, such that the inner product of a state with itself is less
than zero, $\langle\Psi|\Psi\rangle < 0$. If this happens, then the inner product is not positive
definite and we do not have a proper Hilbert space.

Note that there is a type of exception to this rule, which is a topic of current
research. Negative norm states may be able to be modded out by their subspaces to
produce an effective quantum theory from the remaining subspaces with positive
inner products. These remaining states are the physical, operationally defined
states, and they form a convex cone from an operator algebra. The inner product
in this state subspace forms a “Hilbert subspace”, where $\langle\Psi_{\text{phys}}|\Psi_{\text{phys}}\rangle > 0$.

Big Question 2
Is this theory, as a quantum theory, renormalizable?

Yes, thanks to the work of 't Hooft and Veltman in “Regularization and
Renormalization of Gauge Fields”. They found a smart cutoff
to impose that renders infinite integrals finite and retains Lorentz and gauge
invariance through the technique called dimensional regularization.

Note that different cutoffs can reveal different hypotheses about reality, and it
is generally believed that physics is Lorentz and Poincaré invariant. So, a good
cutoff should retain these invariances while also taming infinities of the Feyn-
man diagram integral contributions.

Dimensional Regularization: An Example


Peskin and Schroeder, page 249.

In $\phi^4$ theory, as well as others, we encounter loop integrals that diverge because
we are imposing too strong of a hypothesis on how the degrees of freedom of
the theory behave. For example take the following loop integral that shows up
in the Feynman expansion

$$I_n(m) = \int\frac{d^dk}{(2\pi)^d}\frac{1}{(k^2 - m^2 + i\epsilon)^n}, \tag{329}$$

where $n \in \mathbb{Z}$ is the number of propagators carrying the loop momentum. This
produces a naive divergence of the form $\Lambda^d/\Lambda^{2n}$.

Note that in lower dimensions, these divergences are easier to handle. By power
counting, the integral converges when 2n > d and diverges logarithmically when
2n = d. For example, if d = 2, then n = 1 is only logarithmically divergent and
n ≥ 2 is finite. If d = 3, n must be greater than one to tame infinities of the
cutoff. In four dimensions, such as our usual spacetime, n = 2 is still
logarithmically divergent and n ≥ 3 is required for convergence (e.g., need more
propagators).

Now to evaluate that integral, we separately write out the time component
integral and note that there are two poles at $k_0 = \pm(\sqrt{\mathbf{k}^2 + m^2} - i\epsilon)$, where $\mathbf{k}$
is just the spatial components of the k-vector (e.g., three spatial components of
spacetime), and the $i\epsilon$ is Taylor expanded out of the square root,

$$I_n(m) = \int\frac{d^{d-1}k}{(2\pi)^d}\left(\int_{-\infty}^{\infty}dk_0\frac{1}{(k_0^2 - \mathbf{k}^2 - m^2 + i\epsilon)^n}\right). \tag{330}$$

Now, rotate the contour from running along the real axis $\text{Re}(k_0)$ to the
imaginary axis $\text{Im}(k_0)$, effectively changing variables $k_0 \to ik_0$. The rotation avoids
the poles, the integrand is analytic in the region swept out, and so we can do this rotation.

Contour rotation from real-axis of $k_0$ to the imaginary axis of $k_0$.

Our integral is now



dd−1 k
Z Z
1
In (m) = −i dk0 . (331)
(2π)d −∞ (k02 + ω 2 )n
Where ω 2 = k 2 + m2 . Reabsorb the time-component integral into the rest of
the integrals, which we can interpret as d Euclidean integrals, with the Eu-
clidean metric, that will now exhibit spherical symmetry. Enter spherical polar
coordinates

69
dd k
Z
1
In (m) = −i d (k 2 + m2 )n
(332)
Euclidean (2π)
Z ∞
rd−1
Z
i
=− d
dΩ d dr 2 (333)
(2π) 0 (r + m2 )n
d
i (2π) 2 ∞ rd−1
Z
In (m) = − dr . (334)
(2π)d Γ( d2 ) 0 (r2 + m2 )n
m2
Make the change of variables x = r 2 +m2

1
md−2n
Z
d d
In (m) = d dx xn− 2 −1 (1 − x) 2 −1 . (335)
(2π) 2 Γ( d2 ) 0

This integral is a Beta function!

md−2n Γ(n − d2 )Γ( d2 ) md−2n Γ(n − d2 )


In (m) = d = d (336)
(2π) 2 Γ( d2 ) Γ(n) (2π) 2 Γ(n)
The gamma function has poles at negative integers, and this has divergences
at n = 0, −1, −2, . . . . The Feynman diagram expansion gives us n, and we are
stuck with it.

What if $d$ is not an integer? This is the trick of dimensional regularization,
as this renders the divergences finite.

Let $d = 4 - \epsilon$, $\epsilon > 0$, and study the behavior of the integral
solutions as $\epsilon \to 0$.

As an example, consider the $n = 2$ case (Exercise)

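$$I_2(m) = \frac{m^{-\epsilon}}{(2\pi)^{2 - \frac{\epsilon}{2}}} \Gamma\left(\frac{\epsilon}{2}\right) \tag{337}$$
$$= \frac{1 - \epsilon \log(m)}{4\pi^2 \left(1 - \frac{\epsilon}{2} \log(2\pi)\right)} \left(\frac{2}{\epsilon} - \gamma + O(\epsilon)\right) \tag{338}$$
where $\gamma$ is the Euler-Mascheroni constant. This diverges as
$\epsilon \to 0$, as expected, with the “bad bit” of $\frac{2}{\epsilon}$.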
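The pole structure quoted above, $\Gamma(\frac{\epsilon}{2}) \approx \frac{2}{\epsilon} - \gamma$, can be checked directly with the standard library. This is a small sketch for illustration; the function name and the sample values of $\epsilon$ are assumptions, not part of the lecture.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def pole_approximation_error(eps):
    """Gamma(eps / 2) minus its leading Laurent terms 2 / eps - gamma."""
    return math.gamma(eps / 2.0) - (2.0 / eps - EULER_GAMMA)


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, math.gamma(eps / 2.0), pole_approximation_error(eps))
```

The remainder shrinks in proportion to $\epsilon$, so all of the divergence sits in the $\frac{2}{\epsilon}$ term.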

If we compare this with an integral that we’ve seen before, we can see what
kind of cutoff the free parameter $\epsilon$ is. Recall the diagram where we
applied the momentum cutoff $|k| < \Lambda$. So, $\frac{1}{\epsilon} \sim \Lambda$.

14 Lecture 14: Quantization of Nonabelian Gauge Theory, Cont.
Working with nonabelian gauge theories, we’ve written down a Lagrangian
density for Yang-Mills theory, and after gauge-fixing the Lagrangian density
via path integrals and the tricks of Faddeev and Popov, we are ready to do
some calculations.

The first topic to discuss is the beta function or renormalization group equation,
which tells us how theories behave at low and high energies. The second topic
to discuss is the departure from path integrals and gauge fixing to methods of
lattice regulators, or cutoffs, for doing calculations in nonabelian gauge theory.

Renormalization of Nonabelian Gauge Theories


We begin by reviewing renormalizability of quantum theories and the renormal-
ization group equation, which tells us how coupling constants depend on the
cutoff and how to adjust the coupling constants to match low and high energy
predictions.

Recall that a quantum theory $\hat{H}(z_1, \dots, z_n; \Lambda)$, where the
$z_i$ are the coupling constants and $\Lambda$ is the cutoff data, defined as a
list of all the degrees of freedom of the theory, is renormalizable if it
leads to finite predictions for all operationally well-defined observables.
The expectation values of these observables must produce the same predictions
for different choices of the cutoff
$$\langle \hat{A}_j \rangle (z_1, \dots, z_n; K_c) \equiv f_j(z_1, \dots, z_n; K_c) = \alpha_j^{\text{obs.}} \tag{339}$$
where $K_c$ is a particular choice of cutoff: the length of the list $\Lambda$,
for example. This relationship of expectation values with different cutoffs
yielding the same observed quantities is achieved in a renormalizable theory
by allowing the coupling constants to depend on the cutoff, $z_i = z_i(K_c)$.

Apply the above equation to the Green’s functions, the n-point correlation
functions, where the dependence on the coupling constants and cutoff is
implicit in the vacuum state $|\Omega\rangle$ and the dynamics of the field
operators
$$G^{(n)}(x_1, \dots, x_m; K_c) \equiv \langle\Omega| T[\hat{\phi}(x_1) \dots \hat{\phi}(x_m)] |\Omega\rangle, \tag{340}$$
$$\text{where } |\Omega\rangle = |\Omega(z_1, \dots, z_n; K_c)\rangle \tag{341}$$
$$\text{and } \hat{\phi}(x) \equiv e^{-i\hat{H}(z_1, \dots, z_n; K_c)t} \, \hat{\phi}(0, x) \, e^{i\hat{H}(z_1, \dots, z_n; K_c)t}. \tag{342}$$

So, we can equivalently say that a quantum theory
$\hat{H}(z_1, \dots, z_n; \Lambda)$ is renormalizable if the correlation
functions produce finite predictions for time-ordered quantities.

Now, to compute the relationship of G(n) to the coupling constants and the cut-
off, consider changing the (usually) continuous parameter Kc by an infinitesimal
amount. Note that a common choice of cutoff could be Kc = |pmax |.

$$dG^{(n)} = \frac{\partial G^{(n)}}{\partial K_c} \delta K_c + \frac{\partial G^{(n)}}{\partial z_j} \delta z_j. \tag{343}$$
So, the coupling constants $z_j(K_c)$ are chosen to fix $G^{(n)}$ with respect
to transformations of the cutoff of the form $K_c \to K_c + \delta K_c$:
$G^{(n)}$ should not depend on $K_c$, and the above expression is equal to zero.

To derive the renormalization group equation and the beta function, set the
differential $dG^{(n)}$ to zero, divide by $\delta K_c$, and multiply by $K_c$
$$\left(K_c \frac{\partial}{\partial K_c} + K_c \frac{dz_j}{dK_c} \frac{\partial}{\partial z_j}\right) G^{(n)} = 0 \tag{344}$$
$$\left(K_c \frac{\partial}{\partial K_c} + \beta(z_j) \frac{\partial}{\partial z_j}\right) G^{(n)} = 0. \tag{345}$$

This is the infinitesimal form of the statement, for a renormalizable theory,
that the Green’s function shouldn’t depend on the cutoff as it is changed, where
$$\beta(z_j) = K_c \frac{dz_j}{dK_c} = \frac{dz_j}{d \ln K_c} \tag{346}$$
is the renormalization group function, or beta function, which allows us to
compute the coupling constants in terms of the cutoff, $z_j = z_j(K_c)$. The
behavior of $\beta(z_j(K_c))$ is often used to describe the dependence of
$z_j$ on $K_c$.

Aside: Massless Theories


When we fix the Green’s function to be equal to observable quantities, for any
choice of cutoff, in a massive theory, we usually demand that the 2-point
correlation function $\langle\Omega| \hat{\phi}(p) \hat{\phi}(-p) |\Omega\rangle$
has a pole at the physical mass of the particle $m_{\text{phys}}$.

Complications arise in massless theories, as this leads to divergences, in the
sense that we will end up with expressions like $\infty = \infty$. For massless
theories, we instead insist that the 2-point correlation function has a pole
at a negative, spacelike momentum $p^2 = -K_c^2 \equiv M^2$ with residue equal
to one. This leads to finite predictions when we let $K_c \to \infty$.

Examples of Beta Functions
(1) In $\phi^4$ theory, there is one coupling constant $\lambda$, the
interaction strength, and the beta function is
$$\beta(\lambda) = \frac{3\lambda^2}{16\pi^2} + O(\lambda^3). \tag{347}$$
(2) In quantum electrodynamics (QED), which is an abelian gauge theory, to
one-loop order, the beta function is
$$\beta(e) = \frac{e^3}{12\pi^2} + O(e^4) \tag{348}$$
where $e$ is the electric charge of the particle.

(3) In Yang-Mills theory, a nonabelian gauge theory, there is more work to do
in analyzing the divergences in the propagation of gauge bosons, as well as the
propagation of fermions (quarks). The path integrals derived from applying the
Yang-Mills theory Feynman rules produce divergences from a number of terms in
the Feynman expansion. See Peskin and Schroeder, section 16.5, One-Loop
Divergences of Non-Abelian Gauge Theory, pages 521-544, for reference.

To tame the infinities that arise from these diagrams, apply dimensional reg-
ularization to maintain Lorentz invariance and insist that each diagram leads
to finite quantities. This then tells us what kinds of counter terms we need to
add to the Lagrangian density for our theory, and, in turn, how the coupling
constants change with respect to the cutoff.

This leads to the beta function for the $SU(N)$ local gauge group
$$\beta(g) = -\frac{g^3}{16\pi^2} \left(\frac{11N}{3} - \frac{2n_f}{3}\right) \tag{349}$$
where $n_f$ is the number of fermion families. Note that if $n_f$ is small,
$\beta(g)$ becomes negative, so that $g$ gets smaller as $K_c \to \infty$,
meaning that our theory approaches being a free theory as $g \to 0$, and we can
use perturbation theory for high-energy processes. This is called asymptotic
freedom of nonabelian gauge theories.
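To see what the sign of the beta function means in practice, the flow equation $\frac{dz}{d \ln K_c} = \beta(z)$ can be integrated numerically. The sketch below is an illustration under stated assumptions: it keeps only the leading terms of (347) and (349), uses a hand-written fourth-order Runge-Kutta step, and the function names, starting couplings, and the choice $N = 3$, $n_f = 6$ are hypothetical example values rather than anything fixed by the lecture.

```python
import math


def beta_phi4(lam):
    """Leading term of the phi^4 beta function, Eq. (347)."""
    return 3.0 * lam ** 2 / (16.0 * math.pi ** 2)


def make_beta_yang_mills(n_colors, n_f):
    """Leading SU(N) beta function of Eq. (349) for fixed N and n_f."""
    b0 = 11.0 * n_colors / 3.0 - 2.0 * n_f / 3.0

    def beta(g):
        return -b0 * g ** 3 / (16.0 * math.pi ** 2)

    return beta


def run_coupling(beta, z0, log_ratio, steps=2000):
    """RK4 integration of dz/dt = beta(z) from t = 0 to t = ln(Kc / K0)."""
    h = log_ratio / steps
    z = z0
    for _ in range(steps):
        k1 = beta(z)
        k2 = beta(z + 0.5 * h * k1)
        k3 = beta(z + 0.5 * h * k2)
        k4 = beta(z + h * k3)
        z += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return z


if __name__ == "__main__":
    t = math.log(100.0)  # raise the cutoff by a factor of 100
    print("phi^4 :", run_coupling(beta_phi4, 1.0, t))                   # grows
    print("SU(3) :", run_coupling(make_beta_yang_mills(3, 6), 1.0, t))  # shrinks
```

With these leading-order beta functions the flow can also be solved in closed form, $\lambda(t) = \lambda_0 / (1 - \frac{3\lambda_0}{16\pi^2} t)$ and $g(t) = g_0 / \sqrt{1 + \frac{b_0 g_0^2}{8\pi^2} t}$ with $b_0 = \frac{11N}{3} - \frac{2n_f}{3}$, which gives a convenient cross-check on the integrator.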

Low-Energy Physics of Nonabelian Gauge Theories


Results here are thanks to Wilson’s Confinement of Quarks (1974).

To quantize an abelian or nonabelian gauge theory on a discrete lattice, Wilson
proposed the use of a lattice regulator. This is very challenging, since the
gauge group acts on gauge fields like
$$A^j_\mu \frac{\sigma^j}{2} \to A^j_\mu \frac{\sigma^j}{2} + \frac{1}{g} (\partial_\mu \alpha^j) \frac{\sigma^j}{2} + i \left[\alpha^k \frac{\sigma^k}{2}, A^l_\mu \frac{\sigma^l}{2}\right] \tag{350}$$
and there are many complications in discretizing the derivative
$\partial_\mu \alpha^j$ on the lattice.

Wilson recognized that the parallel transporter is the object that allowed us to
do derivatives in the first place, and treated the parallel transporters
$$U(j, j + \hat{e}_\mu) \in SU(2) \tag{351}$$
as the fundamental degrees of freedom of the nonabelian gauge theory
discretized to the lattice $\mathbb{Z}^4$, instead of the gauge field.

Bigger parallel transporters are built via multiplication. For example,

U (j, j + 2êµ ) = U (j + êµ , j + 2êµ )U (j, j + êµ ) (352)

These parallel transporters are 2 × 2 unitary matrices which populate the list
of degrees of freedom, one for each link, or edge, in the classical lattice.

Aside: Wilson ran calculations for the dynamics of a gauge field on a
4 × 4 × 4 × 4 lattice, which requires $4^4$ lattice sites × 3 spatial
coordinates per lattice site = 768 single-precision floating point numbers per
unit time, requiring less than 4 kilobytes of RAM. Note that a gigabyte
($\simeq 2^{30}$ bytes) of RAM is capable of storing ∼ 250,000,000
single-precision floating point numbers, corresponding to almost a
100 × 100 × 100 × 100 lattice.
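The storage estimate above is simple arithmetic and can be reproduced in a few lines. This sketch follows the counting used in the notes (three numbers per lattice site, 4 bytes per single-precision number); the function names are hypothetical and the counting convention is taken from the aside, not from Wilson's paper.

```python
BYTES_PER_FLOAT32 = 4


def lattice_floats(side, dims=4, numbers_per_site=3):
    """Numbers stored for a hypercubic lattice with `side` sites per direction."""
    return side ** dims * numbers_per_site


def largest_side(memory_bytes, dims=4, numbers_per_site=3):
    """Largest lattice side whose single-precision data fits in memory_bytes."""
    side = 0
    while (lattice_floats(side + 1, dims, numbers_per_site) * BYTES_PER_FLOAT32
           <= memory_bytes):
        side += 1
    return side


if __name__ == "__main__":
    print(lattice_floats(4), "numbers,", lattice_floats(4) * BYTES_PER_FLOAT32, "bytes")
    print(2 ** 30 // BYTES_PER_FLOAT32, "single-precision numbers per gigabyte")
    print("largest side fitting in one gigabyte:", largest_side(2 ** 30))
```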

To quantize the nonabelian gauge theory, Wilson proposed a way to build an
action that summed over the plaquettes of the lattice, called Wilson loops. In
terms of the parallel transporters, the action has the form
$$S[U] = \sum_{\square} \text{Wilson loops} = \sum_{\square} \operatorname{tr}(U_1 \cdot U_2 \cdot U_3 \cdot U_4) \tag{353}$$

Where travelling around a plaquette may look like

With this action, build the path integral $\int \mathcal{D}U \, e^{-iS[U]}$,
and Wick rotate into the path integral $\int \mathcal{D}U \, e^{-S[U]}$ to work
in imaginary time. Lastly, Monte Carlo sample the path integral.
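The reason the plaquette trace is a sensible building block for the action is that it is unchanged by independent $SU(2)$ rotations at each vertex. The following is a minimal sketch of that statement for a single plaquette, written with plain nested lists. The ordering convention (links oriented around the loop, later links multiplying from the left as in (352)), the transformation rule $U(i \to i+1) \to g_{i+1} U g_i^\dagger$, and all function names are assumptions made for this illustration.

```python
import math
import random


def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-conj(b), conj(a)]] with |a|^2 + |b|^2 = 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in x))
    a = complex(x[0], x[1]) / norm
    b = complex(x[2], x[3]) / norm
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def loop_trace(links):
    """Trace of the ordered product of transporters around a closed loop.

    links[i] transports from vertex i to vertex i + 1 (mod the loop length).
    """
    product = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for U in links:
        product = matmul(U, product)
    return product[0][0] + product[1][1]


def gauge_transform(links, gauges):
    """Rotate each vertex i by gauges[i]: U(i -> i+1) -> g[i+1] U g[i]^dagger."""
    n = len(links)
    return [matmul(matmul(gauges[(i + 1) % n], links[i]), dagger(gauges[i]))
            for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    links = [random_su2(rng) for _ in range(4)]
    gauges = [random_su2(rng) for _ in range(4)]
    print(loop_trace(links))
    print(loop_trace(gauge_transform(links, gauges)))
```

The two printed traces agree up to rounding, and the trace is real because a product of $SU(2)$ matrices is again in $SU(2)$.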

This approach is actually the best way to get nonperturbative results in
quantum field theory, but it has its downsides:

Downside (1) is that calculating processes in imaginary time is just like doing
statistical mechanics with gauge theories at some defined temperature, and this
is not good for time-ordered processes (e.g., scattering). Downside (2) is that
this is not a quantum theory, as the Wilson loop is a classical configuration.

Hamiltonian Lattice Gauge Theory


An alternative approach to quantization was introduced by Kogut and Susskind
in 1975. They proposed a lattice quantum gauge theory, a quantum theory with
a Hamiltonian and Hilbert space in which the degrees of freedom live on a
lattice, which they argued yields Yang-Mills theory as the lattice spacing goes
to zero. This is not proven, but if you can prove it, as well as that the
low-energy limit has a mass gap, you can get a cool $1M!

Recall that classically, each link, or edge, of the lattice is associated to a
2 × 2 unitary matrix $U \in SU(2)$. In this lattice quantum gauge theory, each
link or edge $e$ is associated to a wavefunction $\psi : SU(2) \to \mathbb{C}$,
such that the wavefunction $\psi$ belongs to the space of square-integrable
functions on $SU(2)$, which is itself a Hilbert space $h_e$, one per link
$$\psi \in L^2(SU(2)) \simeq h_e. \tag{354}$$

Recall that $SU(2)$ is diffeomorphic to $S^3$, which conjures memories and
solutions of wavefunctions on a sphere: spherical harmonics.

The total Hilbert space of this quantum theory is a tensor product over all the
edges $e$ in the lattice $E$ of the individual Hilbert spaces $h_e$
$$\mathcal{H} = \bigotimes_{e \in E} h_e = \bigotimes_{e \in E} L^2(SU(2)). \tag{355}$$


To build the Hamiltonian, introduce some operations on the Hilbert space. First,
write the states as kets in the position basis
$$|\psi\rangle = \int_{SU(2)} dU \, \psi(U) \, |U\rangle \tag{356}$$
where $|U\rangle$ is the position eigenvector defined by the three spatial
coordinates. These states obey the inner product
$\langle U | V \rangle = \delta(U - V)$.

Introduce the operators
$$L_U |\psi\rangle \equiv \int dV \, \psi(V) \, |UV\rangle \quad \text{and} \quad R_U |\psi\rangle \equiv \int dV \, \psi(V) \, |VU^\dagger\rangle. \tag{357}$$

These operators are an analog of the shift operator $e^{ix\hat{p}}$. Note that
$UV$ is still a member of $SU(2)$, so $|UV\rangle$ is again a position
eigenket. Differentiating these operators with respect to $U$ will yield
momentum operators: dynamics.

15 Lecture 15: Hamiltonian Lattice Gauge Theory
Continuing the introduction to Hamiltonian lattice gauge theory as a means
of quantization of gauge fields, we will build a microscopic formulation of
gauge theory based on the real-space lattice. In contrast to the usual way of
working on the Euclidean, Wick-rotated lattices, we will begin our theory with
a Hamiltonian of classical degrees of freedom: namely, the parallel transporter
$U$, a 2 × 2 unitary matrix with determinant one, such that $U \in SU(2)$.
Since we are working in 4D spacetime, we will have a 4D discretized lattice
with lattice spacing $a \propto \frac{1}{K_c}$. In other words, 4D spacetime is
discretized up to the cutoff $K_c$.

As introduced before, each link, or edge, of the lattice $e \in E$, where $E$
is the set of all links of the lattice, has an associated parallel transporter,
corresponding to the shortest, rectilinear path between neighboring vertices,
$U_e \in SU(2) \simeq S^3$. Note that the parallel transporter is not a local
object, as it is path dependent, implicitly depending on more than one
coordinate.

The classical configuration space for this lattice gauge theory is a Cartesian
product of SU (2) per link in the lattice

C = SU (2) × SU (2) × · · · × SU (2). (358)


To quantize this classical lattice gauge theory, we’ll begin by taking the
simplest guess possible and introduce a wavefunction $\psi : C \to \mathbb{C}$
on the configuration space $C$, so that the states are kets
$|\psi\rangle \in \mathcal{H} \simeq \bigotimes_{e \in E} h_e$, where $h_e$ is
the Hilbert space of each link.

Since $SU(2)$ is diffeomorphic to $S^3$, we are hinted towards defining
wavefunctions as square-integrable functions on a sphere, where each point on
the sphere is associated to a complex number. Therefore, we define
$h_e \equiv L^2(SU(2))$, where $L^2(SU(2))$ is an infinite-dimensional
separable Hilbert space, since there are arbitrarily many orthogonal
wavefunctions defined on the sphere.

Recall the two operators defined for a unitary matrix with unit determinant
$U \in SU(2)$, which define a left- and a right-acting transformation
$$L_U : L^2(SU(2)) \to L^2(SU(2)) \tag{359}$$
$$R_U : L^2(SU(2)) \to L^2(SU(2)) \tag{360}$$
where $L_U$ and $R_U$ commute, such that $[L_U, R_U] = 0$, and form the
representation defined by the relations
$$L_U^\dagger L_U = R_U^\dagger R_U = I \quad \text{and} \quad L_{UV} = L_U L_V. \tag{361}$$
To understand how the infinite-dimensional Hilbert space $L^2(SU(2))$ breaks up
into a direct sum of irreducible representations of $SU(2)$, we invoke the third
part of the Peter-Weyl theorem, which states that the Hilbert space over
$SU(2)$ consisting of square-integrable functions may be regarded as a
representation of a direct product of left- and right-acting operators, and the
Hilbert space decomposes into an orthogonal direct sum of all the irreducible
unitary representations, with multiplicity of each irreducible representation
equal to its degree, the dimension of the underlying space of that
representation (see the Wikipedia page on the Peter-Weyl theorem for an
overview). We write this all as
$$h_e \equiv L^2(SU(2)) \simeq \bigoplus_{l \in \frac{1}{2}\mathbb{Z}^+} V_l \otimes V_l^* \simeq \bigoplus_{l \in \frac{1}{2}\mathbb{Z}^+} \mathbb{C}^{2l+1} \otimes \mathbb{C}^{2l+1} \tag{362}$$
where $V_l \simeq \mathbb{C}^{2l+1}$ is the $(2l+1)$-dimensional vector space
furnishing the irreducible representation of $SU(2)$ of spin, or angular
momentum, $l$, and
$\frac{1}{2}\mathbb{Z}^+ = \{0, \frac{1}{2}, 1, \frac{3}{2}, 2, \dots\}$.

Now to find a representation of SU (2) on this vector space Vl , we’ll use a piece
of Lie group representation theory not found in any textbook. Note that the
action of SU (2) generates a representation

Πl (SU (2)) : Vl → Vl . (363)


In the procedure to calculate the matrix $\Pi_l(U)$, we will use two pieces of
machinery

• The spin-$\frac{1}{2}$ fundamental representation of $SU(2)$
$$\Pi_{\frac{1}{2}}(U) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

• Tensor products of the spin-$\frac{1}{2}$ 2D vector spaces furnishing the
fundamental representation of $SU(2)$,
$V_{\frac{1}{2}} \simeq \mathbb{C}^2 \equiv \operatorname{span}\{|0\rangle, |1\rangle\}$.

To begin the procedure to get the matrix representation for any spin $l$, take
the tensor product of $n$ of the spin-$\frac{1}{2}$ fundamental vector spaces
$$V_{\frac{1}{2}} \otimes V_{\frac{1}{2}} \otimes \cdots \otimes V_{\frac{1}{2}}. \tag{364}$$
Note that, for quantum computer fans, this is the vector space of $n$ qubits
$$V_{\frac{1}{2}} \otimes V_{\frac{1}{2}} \otimes \cdots \otimes V_{\frac{1}{2}} \simeq \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2 = \mathbb{C}^{2^n}. \tag{365}$$
The spin-$l$ representation $\Pi_l(U)$, generated from the $SU(2)$ action on
$V_l$, lives in this tensor product space of $n$ copies of $V_{\frac{1}{2}}$,
as long as $n = 2l$ or $l = \frac{n}{2}$. Therefore, we will build $V_l$ as a
subspace of this tensor product space, most of which will be thrown away once
we find our subspace of interest, by building a set of $n + 1$ orthonormal
vectors

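$$|w_{\frac{n}{2}}\rangle = |11 \dots 1\rangle \tag{366}$$
$$|w_{\frac{n}{2} - 1}\rangle = \frac{1}{\sqrt{n}} \left(|11 \dots 10\rangle + |11 \dots 101\rangle + \cdots + |01 \dots 1\rangle\right) \tag{367}$$
$$\vdots \tag{368}$$
$$|w_{\frac{n}{2} - k}\rangle = \frac{1}{\sqrt{\binom{n}{k}}} \left(|1 \dots 10 \dots 0\rangle + (\text{all permutations of } k \text{ zeros and } n - k \text{ ones})\right) \tag{369}$$
$$\vdots \tag{370}$$
$$|w_{-\frac{n}{2}}\rangle = |00 \dots 0\rangle. \tag{371}$$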

Then the matrix elements of the representation $\Pi_l(U)$ are simply given by
the matrix elements of $n$ copies of $U$ in this orthonormal basis
$$[\Pi_l(U)]_{jk} = \langle w_j | U \otimes \cdots \otimes U | w_k \rangle \tag{372}$$
where $j, k \in \{-\frac{n}{2}, \dots, \frac{n}{2}\}$.

Note that this method is good and fast for low spin representations, but clearly
gets unwieldy for working with the Hilbert space of $n = 1000$ qubits. That is
where the efficiency and value of the methods of addition of angular momentum
and Clebsch-Gordan coefficients, with the raising and lowering operators, the
highest-weight vectors, etc., become apparent.

Sticking to low spin representations for our purposes, we can start to extract
matrix representations for $l = 0$, $l = \frac{1}{2}$, and $l = 1$.

For $l = 0$, the matrix representation is just the identity
$$\Pi_0(U) = I. \tag{373}$$
For $l = \frac{1}{2}$, we have the fundamental matrix representation
$$[\Pi_{\frac{1}{2}}(U)]_{jk} = [U]_{jk}. \tag{374}$$

In the $l = 1$ subspace, there are three orthonormal basis vectors
$$\left\{|w_1\rangle = |11\rangle, \ |w_0\rangle = \frac{1}{\sqrt{2}}(|10\rangle + |01\rangle), \ |w_{-1}\rangle = |00\rangle\right\}. \tag{375}$$
The matrix elements are then obtained from the expression above, pulling
values from the fundamental representation
$$[\Pi_1(U)]_{11} = \langle 11 | U \otimes U | 11 \rangle = \langle 1 | U | 1 \rangle \langle 1 | U | 1 \rangle = a^2 \tag{376}$$
$$[\Pi_1(U)]_{00} = \frac{1}{2} \left(\langle 10 | U \otimes U | 10 \rangle + \langle 10 | U \otimes U | 01 \rangle + \langle 01 | U \otimes U | 10 \rangle + \langle 01 | U \otimes U | 01 \rangle\right) \tag{377}$$
$$= ad + bc. \tag{378}$$
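The $l = 1$ construction above is small enough to carry out explicitly. The sketch below builds $U \otimes U$ for a random $SU(2)$ matrix, sandwiches it between the three symmetric basis vectors, and checks the two matrix elements quoted in (376) and (378) as well as the representation property $\Pi_1(UV) = \Pi_1(U)\Pi_1(V)$. The basis ordering (matrix index 0 for $|1\rangle$, index 1 for $|0\rangle$, so that $\langle 1|U|1\rangle = a$) and all helper names are assumptions of this illustration.

```python
import math
import random


def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-conj(b), conj(a)]]."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in x))
    a = complex(x[0], x[1]) / norm
    b = complex(x[2], x[3]) / norm
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(B)
    size = len(A) * m
    return [[A[r // m][c // m] * B[r % m][c % m] for c in range(size)]
            for r in range(size)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


S = 1.0 / math.sqrt(2.0)
# Two-qubit index: 0 <-> |11>, 1 <-> |10>, 2 <-> |01>, 3 <-> |00>.
W = [
    [1.0, 0.0, 0.0, 0.0],  # |w_1>  = |11>
    [0.0, S, S, 0.0],      # |w_0>  = (|10> + |01>) / sqrt(2)
    [0.0, 0.0, 0.0, 1.0],  # |w_-1> = |00>
]


def spin1_rep(U):
    """3 x 3 matrix with entries <w_j| U (x) U |w_k>, rows ordered w_1, w_0, w_-1."""
    UU = kron(U, U)
    return [[sum(W[j][r] * UU[r][c] * W[k][c]
                 for r in range(4) for c in range(4))
             for k in range(3)] for j in range(3)]


if __name__ == "__main__":
    rng = random.Random(1)
    U = random_su2(rng)
    (a, b), (c, d) = U
    P = spin1_rep(U)
    print(P[0][0], a * a)          # [Pi_1]_{11} against a^2
    print(P[1][1], a * d + b * c)  # [Pi_1]_{00} against ad + bc
```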

Recall that the Peter-Weyl theorem tells us that the Hilbert space of
square-integrable functions on $SU(2)$ is isomorphic to the direct sum,
infinite-dimensional until we truncate, of $(2l+1)$-dimensional vector spaces
furnishing the representations of $SU(2)$
$$L^2(SU(2)) \simeq \bigoplus_{l \in \frac{1}{2}\mathbb{Z}^+} \mathbb{C}^{2l+1} \otimes \mathbb{C}^{2l+1} \tag{379}$$
and $SU(2)$ acts on $L^2(SU(2))$ via the operator
$L_U : L^2(SU(2)) \to L^2(SU(2))$ with action
$$L_U \simeq \bigoplus_{l \in \frac{1}{2}\mathbb{Z}^+} \Pi_l(U) \otimes I. \tag{380}$$

The elements of the matrix representation $[\Pi_l(U)]_{jk} \equiv t^l_{jk}(U)$
are square-integrable functions from $SU(2)$ to the complex numbers
$\mathbb{C}$, since $SU(2)$ is compact, and form an orthogonal (not orthonormal)
basis for $L^2(SU(2))$, where $-l \le j, k \le l$. So, we can expand the
wavefunction ket in the orthogonal basis of the matrix elements
$$|\psi\rangle = \sum_l \sum_{j,k} \psi^l_{jk} \, |j\rangle_l |k\rangle_l = \sum_l \sum_{j,k} \psi^l_{jk} \sqrt{2l+1} \, |t^l_{jk}\rangle. \tag{381}$$
The inner product of the Hilbert space $L^2(SU(2))$ is defined with the Haar
measure $\int dU$ by
$$(\psi, \phi) \equiv \int dU \, \psi^*(U) \phi(U) \tag{382}$$
where we use the inner product of the orthogonal basis vectors, where integrals
over $SU(2)$ with the Haar measure behave like
$\int dU \, U \otimes U^\dagger \propto I \otimes I + (\text{swap operations})$,
and (Exercise)
$$(t^l_{jk}, t^{l'}_{j'k'}) = \delta^{ll'} \delta_{jj'} \delta_{kk'} \frac{1}{2l+1}. \tag{383}$$
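Therefore, a general state of the total Hilbert space,
$|\Psi\rangle \in \mathcal{H}$, can be expanded in the product basis as
$$|\Psi\rangle = \sum_{l_1 l_2 \dots} \sum_{j_1 j_2 \dots} \sum_{k_1 k_2 \dots} \Psi^{l_1 l_2 \dots}_{j_1 j_2 \dots k_1 k_2 \dots} \, |j_1\rangle_{l_1} |k_1\rangle_{l_1} |j_2\rangle_{l_2} |k_2\rangle_{l_2} \dots. \tag{384}$$
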
To give dynamics to the Hilbert space, we define some observables. Consider
an element of $SU(2)$, $U = e^{c_j \tau^j}$, where $c_j \tau^j$ are elements of
the Lie algebra of $SU(2)$, $c_j \in \mathbb{R}^3$, and $\tau^j$ are the Pauli
spin matrices multiplied by $i$ and divided by 2 for normalization. With this
normalization the spin matrices obey the commutation relations
$[\tau^j, \tau^k] = -\epsilon^{jkl} \tau^l$. We can recover the spin matrix
from the group element via differentiation
$$\frac{dU}{dc_j}\bigg|_{c_j = 0} = \tau^j. \tag{385}$$

The first observable we define is the anti-Hermitian left angular momentum
operator
$$\hat{l}^j_L \equiv \frac{d}{ds} L_{U = e^{s\tau^j}} \bigg|_{s=0}. \tag{386}$$
Note that this is a factor of $i$ away from being Hermitian, and is analogous to
$i\hat{p}$, where $\hat{p}$ is the Hermitian linear momentum operator. Also,
notice that we begin with a unitary operator $L_U$, apply the anti-Hermitian
operator $\frac{d}{ds}$, and kill off the unitarity by setting $s = 0$ after
the derivative.

The second observable we define, similar to the first, is the right angular
momentum operator
$$\hat{l}^j_R \equiv \frac{d}{ds} R_{U = e^{s\tau^j}} \bigg|_{s=0}. \tag{387}$$

Next, define the position observable $\hat{U}_{jk}$, which is also a map
$L^2(SU(2)) \to L^2(SU(2))$ and yields the fundamental representation matrix
elements when it acts on the position eigenkets $|U\rangle$ of $SU(2)$
$$\hat{U}_{jk} |U\rangle = [\Pi_{\frac{1}{2}}(U)]_{jk} |U\rangle. \tag{388}$$

Acting with the position operator on the Hilbert space wavefunctions
$|\psi\rangle = \int dU \, \psi(U) |U\rangle$, we get the Haar integral over
position eigenkets
$$\hat{U}_{jk} |\psi\rangle = \int dU \, \psi(U) [\Pi_{\frac{1}{2}}(U)]_{jk} |U\rangle = \int dU \, \psi(U) \, t^{l = \frac{1}{2}}_{jk}(U) |U\rangle. \tag{389}$$

To build the Hamiltonian, kinetic energy plus potential energy, we use the
plaquette operator to define parallel transport on the 4D lattice
$$\hat{U}_{\square} : L^2(SU(2)) \otimes L^2(SU(2)) \otimes L^2(SU(2)) \otimes L^2(SU(2)) \to L^2(SU(2)) \otimes L^2(SU(2)) \otimes L^2(SU(2)) \otimes L^2(SU(2)).$$

We choose the convention to set arrows on each link running “left-to-right” and
“down-to-up” in the plane of the “paper”, and then walk around the plaquette
counterclockwise (CCW), taking the Hermitian conjugate of the parallel
transporter if we are traveling against the arrow. The parallel transporter for
each link is considered the observable for that link when traveling around the
plaquette. The links are labeled by $e_i$, $i = 1, 2, 3, 4$.

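Build an operator that acts on the space of the four links of the plaquette,
as a sum over the internal indices of the tensor products of parallel
transporters
$$\hat{M}_{\square; j_1, k_4} = \sum_{k_1, k_2, k_3} \hat{U}_{j_1 k_1} \otimes \hat{U}_{k_1 k_2} \otimes \hat{U}^\dagger_{k_2 k_3} \otimes \hat{U}^\dagger_{k_3 k_4}. \tag{390}$$

This is then turned into an observable by setting $j_1 = k_4$ and summing over
the last, initially-fixed index $k_4$
$$\operatorname{tr}(\hat{U}_{\square}) \equiv \sum_{k_4} \hat{M}_{\square; k_4, k_4}. \tag{391}$$

Note that $\operatorname{tr}(\hat{U}_{\square})$ is a trace operator that acts
on $L^2(SU(2)) \otimes L^2(SU(2)) \otimes L^2(SU(2)) \otimes L^2(SU(2))$, and
not a number!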

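Put the defined observables together into the Kogut-Susskind Hamiltonian
$$\hat{H}_{KS} = -\frac{g_H^2}{2a} \sum_{e \in E} \hat{l}^j_L(e) \hat{l}^j_L(e) + \frac{1}{2 g_H^2 a} \sum_{\text{plaquettes } \square} \left(\operatorname{tr}(\hat{U}_{\square}) + \text{Hermitian conjugate}\right) \tag{392}$$
where $g_H$ is the coupling constant.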

This model has a huge group of gauge symmetries.

Consider a vertex in the lattice and build the operator
$$M_x \equiv R_x(e_1) \otimes R_x(e_2) \otimes L_x(e_3) \otimes L_x(e_4) \tag{393}$$
where $x \in SU(2)$. These operators obey the following commutation relations
$$[M_x(v), M_y(w)] = 0 \quad \text{and} \quad [M_x(v), \hat{H}_{KS}] = 0, \quad \text{for all } x, y, v, w. \tag{394}$$

In summary, Wilson’s formulation received much more attention at the time for
its ease of discretization, translation to computer programs, and use of Monte
Carlo sampling, which classical computers are good at. Kogut and Susskind
argued that the plaquette operator becomes the curvature term, as expected,
in the limit of small lattice spacing $a$ in their theory, as well as that the
kinetic energy term becomes the kinetic energy in the timelike direction of the
curvature term. The Kogut-Susskind formulation may be very promising for quantum
simulations done by quantum computers, which are not very good at sampling
techniques, but excel in simulating the dynamics of local lattice models.

16 Lecture 16: Spontaneous Symmetry Breaking
This is the last topic that we need to complete our description of the stan-
dard model of elementary particle interactions. Spontaneous symmetry breaking
(SSB) is an observed behavior within field theories, and we will begin with some
examples of classical SSB.

Particle in a double well example


Consider a classical particle confined to a double-well potential with
Hamiltonian
$$H = \frac{p^2}{2m} + V(x). \tag{395}$$
This system possesses $\mathbb{Z}_2$ symmetry in its solutions, since
$x \to -x$ is a symmetry operation. Note that the particle in this potential at
$x = 0$ must choose a positive or negative local minimum, meaning that the
ground state is degenerate and breaks the $\mathbb{Z}_2$ symmetry.

Potential exhibiting Z2 symmetry.

The quantum analog of this classical theory does not exhibit SSB, since the
ground state wavefunction is symmetric, while the first excited state is
antisymmetric.

Symmetric ground state wavefunction.

Antisymmetric first excited state wavefunction.

Ising model (statistical physics) example


The Ising model is a model of ferromagnetic materials, where spins can point up
or down, corresponding to a spin value $s_j = \pm 1$, with a Hamiltonian
containing the pairwise summation over neighboring spins
$$H = -\sum_{\langle jk \rangle} s_j s_k. \tag{396}$$

This theory contains two ground states: all spins pointing up or all pointing
down, and possesses $\mathbb{Z}_2$ symmetry by the operation $s_j \to -s_j$.

Depending on the temperature, the thermal state of this system can be in one
of two regimes: critical or non-critical. Consider the Gibbs, or mixed, state
density operator describing the system at any temperature
$$\rho = \frac{e^{-\beta H}}{Z} \tag{397}$$
where $Z$ is the partition function and $\beta = \frac{1}{k_B T}$ is the
inverse temperature. Note that the Gibbs state is always $\mathbb{Z}_2$
symmetric, meaning that it does not break any symmetries. Now there exists a
critical temperature, with inverse $\beta_c$, below which the system becomes
ordered due to small external magnetic fluctuations and symmetry is broken.
Above the critical temperature, the system is disordered with random thermal
fluctuations. As $\beta \to \infty$,
$$\rho = \left(\frac{1}{2} - \epsilon\right) (\text{all up states}) + \left(\frac{1}{2} - \epsilon\right) (\text{all down states}). \tag{398}$$
2 2

Classical field theory example


Consider the Lagrangian density, much like the $\phi^4$ interacting theory we’ve
encountered, but the “mass” term is made negative and $m \to \mu$
$$\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)(\partial^\mu \phi) + \frac{1}{2} \mu^2 \phi^2 - \frac{\lambda}{4!} \phi^4. \tag{399}$$
Note that measurable quantities are usually functions of the parameters, making
this a perfectly fine theory. We then have the Hamiltonian
$$H = \int d^3x \left[\frac{1}{2} \pi^2 + \frac{1}{2} (\nabla\phi)^2 - \frac{1}{2} \mu^2 \phi^2 + \frac{\lambda}{4!} \phi^4\right]. \tag{400}$$
To uncover the $\mathbb{Z}_2$ symmetry of this theory, minimize the Hamiltonian
with respect to the field $\phi$, making all derivatives of $\phi$ equal to
zero, and calculate the configuration with the smallest energy. The extremum
condition for the remaining potential energy terms
$$\frac{\partial V(\phi)}{\partial \phi} = -\mu^2 \phi + \frac{\lambda}{6} \phi^3 = 0 \tag{401}$$
yields three configurations that extremize the energy of the system. Namely,
$\phi = 0$ and $\phi = \pm\sqrt{\frac{6\mu^2}{\lambda}}$. The quantum analog of
this theory also exhibits SSB.
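The three extrema, and which of them are minima, can be confirmed with a few lines of arithmetic. This is a small sketch using the potential $V(\phi) = -\frac{1}{2}\mu^2\phi^2 + \frac{\lambda}{4!}\phi^4$ read off from (400); the function names and the sample values of $\mu$ and $\lambda$ are hypothetical.

```python
import math


def potential(phi, mu, lam):
    """V(phi) = -mu^2 phi^2 / 2 + lam phi^4 / 4!."""
    return -0.5 * mu ** 2 * phi ** 2 + lam / 24.0 * phi ** 4


def d_potential(phi, mu, lam):
    """First derivative, the left-hand side of Eq. (401)."""
    return -mu ** 2 * phi + lam / 6.0 * phi ** 3


def d2_potential(phi, mu, lam):
    """Second derivative: positive at a minimum, negative at a maximum."""
    return -mu ** 2 + lam / 2.0 * phi ** 2


def extrema(mu, lam):
    """The three solutions of Eq. (401)."""
    v = math.sqrt(6.0 * mu ** 2 / lam)
    return [-v, 0.0, v]


if __name__ == "__main__":
    mu, lam = 1.3, 0.7  # example values only
    for phi in extrema(mu, lam):
        print(phi, potential(phi, mu, lam), d2_potential(phi, mu, lam))
```

The curvature is $-\mu^2$ at $\phi = 0$ (a local maximum) and $2\mu^2$ at $\phi = \pm\sqrt{6\mu^2/\lambda}$, so the two symmetric minima are the degenerate ground states.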

Quantum field theory example: transverse Ising model


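Consider the quantum theory of a 1D lattice of spins, which also exhibits
$\mathbb{Z}_2$ symmetry, with the Hamiltonian
$$\hat{H} = -\sum_j \sigma^x_j \sigma^x_{j+1} + h \sum_j \sigma^z_j \tag{402}$$
where the first term is the neighboring interaction and the second term is the
magnetic interaction with $h$ as the magnetic field strength, and the Pauli spin
matrices
$$\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{403}$$
The Hamiltonian exhibits $\mathbb{Z}_2$ symmetry, since
$$[\Phi, \hat{H}] = 0 \quad \text{for} \quad \Phi = \dots \sigma^z_j \sigma^z_{j+1} \dots. \tag{404}$$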
The basis for the $h = 0$ ground state can be written as
$$|\Omega_+\rangle \equiv |+\rangle \otimes |+\rangle \otimes \cdots \otimes |+\rangle \tag{405}$$
$$|\Omega_-\rangle \equiv |-\rangle \otimes |-\rangle \otimes \cdots \otimes |-\rangle \tag{406}$$
where the individual spin states are
$$|+\rangle = \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle) \quad \text{and} \quad |-\rangle = \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle). \tag{407}$$
The $h = 0$ ground state eigenspace is twofold degenerate with the above basis.
A good, non-degenerate ground state, for example, could be
$\frac{1}{\sqrt{2}} (|\Omega_+\rangle \pm |\Omega_-\rangle)$, but these states
are never seen in experiment, since decoherence destroys any superpositions of
states. Neither state by itself exhibits the symmetry, but one state must be
chosen by measurement. The only information we have about the two states’
relationship is $\Phi |\Omega_+\rangle = |\Omega_-\rangle$.
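These statements can be verified by brute force on a very short chain. The sketch below builds the Hamiltonian (402) for three spins as an 8 × 8 matrix and checks that $\Phi$ commutes with it, that $\Phi|\Omega_+\rangle = |\Omega_-\rangle$, and that both product states are eigenstates at $h = 0$. The open boundary conditions, the chain length, the value of $h$, and all helper names are assumptions made for this illustration.

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]
X = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]


def kron(A, B):
    """Kronecker product of square matrices stored as nested lists."""
    m = len(B)
    size = len(A) * m
    return [[A[r // m][c // m] * B[r % m][c % m] for c in range(size)]
            for r in range(size)]


def chain_operator(ops_by_site, n):
    """Tensor product over n sites; sites not listed carry the identity."""
    out = [[1.0]]
    for j in range(n):
        out = kron(out, ops_by_site.get(j, I2))
    return out


def matmul(A, B):
    size = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def hamiltonian(n, h):
    """H = -sum_j X_j X_{j+1} + h sum_j Z_j on an open chain of n spins."""
    dim = 2 ** n
    H = [[0.0] * dim for _ in range(dim)]
    terms = [(-1.0, {j: X, j + 1: X}) for j in range(n - 1)]
    terms += [(h, {j: Z}) for j in range(n)]
    for coeff, ops in terms:
        M = chain_operator(ops, n)
        for r in range(dim):
            for c in range(dim):
                H[r][c] += coeff * M[r][c]
    return H


def parity(n):
    """Phi: sigma^z on every site."""
    return chain_operator({j: Z for j in range(n)}, n)


def product_state(single, n):
    """n-fold tensor product of one single-spin state vector."""
    vec = [1.0]
    for _ in range(n):
        vec = [a * b for a in vec for b in single]
    return vec


def apply(M, v):
    return [sum(row[c] * v[c] for c in range(len(v))) for row in M]


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    omega_plus = product_state([s, s], 3)
    print(apply(parity(3), omega_plus))   # equals |Omega_->
    print(product_state([s, -s], 3))
```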

Continuous SSB Example: Linear σ-model


Consider the Lagrangian density for an effective model of pions that exhibits
SSB of a continuous symmetry
$$\mathcal{L} = \frac{1}{2} (\partial_\mu \phi^j)^2 + \frac{1}{2} \mu^2 (\phi^j)^2 - \frac{\lambda}{4} \left[(\phi^j)^2\right]^2 \tag{408}$$
where $(\phi^j)^2 = \sum_{j=1}^N \phi^j \phi^j$. The dynamics of the $N$
(Klein-Gordon) scalar fields are invariant under orthogonal rotations
$O \in O(N)$, a continuous group of symmetries, such that
$$O : \phi^j \to [O]_{jk} \phi^k. \tag{409}$$
What is the lowest energy configuration? Minimize the potential energy
$V(\phi^j) = -\frac{1}{2} \mu^2 (\phi^j)^2 + \frac{\lambda}{4} \left[(\phi^j)^2\right]^2$
with respect to $\phi^j$, showing that the minimum occurs for any constant
configuration of the fields that satisfies the equation (Exercise)
$$(\phi^j_0)^2 = \frac{\mu^2}{\lambda}. \tag{410}$$
There is more than one field configuration that satisfies this condition,
including the following vectors and any rotation of them
$$\begin{pmatrix} \frac{\mu}{\sqrt{\lambda}} \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ \frac{\mu}{\sqrt{\lambda}} \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots. \tag{411}$$
In the case of $N = 2$, the potential energy forms the “wine bottle” or
“Mexican hat” potential. The minimum energy configurations are points on this
potential and form a circle around the indent of the “bottom of the bottle”. In
other words, any configuration that lands on the circle is a minimum energy
configuration.

Mexican hat potential of the linear sigma model.

Low Energy Dynamics for SSB


To study the dynamics as we start to depart from the minimal energy configu-
rations, the ground state, consider the Z2 -symmetry case. Effectively, the low
energy dynamics for a classical Z2 -symmetric system are that of a harmonic
oscillator with an effective mass, a restoring force.

Small fluctuations in the energy behave much like the harmonic oscillator.

In the continuous case, we choose coordinates by rotating such that
$$\phi^j_0 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ \nu \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ \frac{\mu}{\sqrt{\lambda}} \end{pmatrix} \tag{412}$$

and see how this behaves with small energy fluctuations. Define shifted fields
in terms of some new coordinates where the vector of fields $\phi$ is now
defined as
$$\phi(x) \equiv (\pi^k(x), \nu + \sigma(x)), \quad \text{where } k = 1, \dots, N - 1. \tag{413}$$

Note that $\pi$ is not the conjugate momentum density, but a new classical
field, which will be, hence the notation, the pion, and $k$ denotes the vector
index. Rewrite $\mathcal{L}$ in terms of the shifted fields (Exercise)
$$\mathcal{L} = \frac{1}{2} (\partial_\mu \pi^k)^2 + \frac{1}{2} (\partial_\mu \sigma)^2 \tag{414}$$
$$- \frac{1}{2} (2\mu^2) \sigma^2 - \sqrt{\lambda} \mu \sigma^3 - \sqrt{\lambda} \mu (\pi^k)^2 \sigma - \frac{\lambda}{4} \sigma^4 - \frac{\lambda}{2} (\pi^k)^2 \sigma^2 - \frac{\lambda}{4} \left[(\pi^k)^2\right]^2. \tag{415}$$
There are N − 1 massless π k fields and one massive σ field. The second and
third terms in L correspond to an effective massive Klein-Gordon scalar field.
The N − 1 π k fields are effectively massless, as all of the other terms above
contain λ and are interaction terms.

It costs energy to move transverse in the potential, perpendicular to the
circle of minima, corresponding to the effective mass of the $\sigma$ field. To
move tangentially to the manifold of minima, the circle, it costs no energy,
corresponding to the massless $\pi$ fields.
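The mass spectrum can be read off from the matrix of second derivatives of the potential at the chosen vacuum. The following sketch evaluates that matrix by central finite differences for $N = 3$, assuming the quartic coupling normalized as $\frac{\lambda}{4}\left[(\phi^j)^2\right]^2$, which is the normalization consistent with the vacuum value $\nu = \mu/\sqrt{\lambda}$ and with the $-\frac{1}{2}(2\mu^2)\sigma^2$ term of the shifted Lagrangian. The function names, the step size, and the sample values of $\mu$ and $\lambda$ are assumptions for illustration.

```python
import math


def potential(phi, mu, lam):
    """V = -mu^2 |phi|^2 / 2 + lam |phi|^4 / 4 for a list of N real fields."""
    r2 = sum(x * x for x in phi)
    return -0.5 * mu ** 2 * r2 + 0.25 * lam * r2 ** 2


def hessian(f, point, step=1e-3):
    """Matrix of second derivatives of f at point, by central differences."""
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            def shifted(sa, sb):
                p = list(point)
                p[a] += sa * step
                p[b] += sb * step
                return f(p)
            H[a][b] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * step ** 2)
    return H


if __name__ == "__main__":
    mu, lam = 1.2, 0.5            # example values only
    nu = mu / math.sqrt(lam)
    vacuum = [0.0, 0.0, nu]       # N = 3, vacuum along the last axis
    for row in hessian(lambda p: potential(p, mu, lam), vacuum):
        print(["%.5f" % x for x in row])
```

Up to finite-difference error the result is diagonal, with entries $0, 0, 2\mu^2$: two flat $\pi$ directions and one massive $\sigma$ direction.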

Goldstone’s Theorem
In the $O(N)$ linear σ-model there are $\binom{N}{2} = \frac{N(N-1)}{2}$
independent continuous symmetries, the dimension of the rotation group $O(N)$.
After SSB, there are $\binom{N-1}{2}$ remaining symmetries, the dimension of
$O(N-1)$, corresponding to rotations of the $\pi^k$ fields. The number of broken
symmetries is equal to $\binom{N}{2} - \binom{N-1}{2} = N - 1$, which is also
the number of massless fields. In other words, each broken symmetry causes a
massless excitation: the Goldstone modes or Goldstone bosons.

Theorem: For every broken continuous symmetry, there is a corresponding
massless bosonic particle.

Proof:

Consider a classical field theory with fields $\phi^a(x)$, $a = 1, 2, \dots$,
and the general Lagrangian density
$\mathcal{L} = (\text{derivatives}) - V(\phi^a)$.

Let $\phi^a_0$ be a constant field configuration that minimizes the potential
(an extremum), such that
$$\frac{\partial V(\phi^a)}{\partial \phi^a}\bigg|_{\phi^a = \phi^a_0} = 0. \tag{416}$$

Then expand the potential, a function of the vector of fields, near the minimum
$\phi^a_0 \equiv \phi_0$
$$V(\phi^a) = V(\phi^a_0) + \frac{1}{2} (\phi^a - \phi^a_0)(\phi^b - \phi^b_0) \left(\frac{\partial^2 V}{\partial \phi^a \partial \phi^b}\right)_{\phi = \phi_0} + \cdots. \tag{417}$$

Call the Hessian matrix $[m^2]_{ab}$, which is symmetric and real, and whose
eigenvalues give the masses of the effective fields.

With an orthogonal rotation, we can diagonalize a symmetric, real matrix
$m^2 \to O^T D O$. Redefine the fields $\pi^a = [O]_{ab} \phi^b$, and rewrite the
Lagrangian density
$$\mathcal{L}(\pi^a) = (\text{derivatives}) - \sum_a D_a^2 (\pi^a)^2 \tag{418}$$
where the eigenvalues correspond to the masses of the $\pi$ particles, and we
must now show that there are eigenvalues equal to zero. In other words, every
broken continuous symmetry leads to an eigenvalue equal to zero.

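A general, global, continuous symmetry of the fields has the form
$$\phi^a \to \phi^a + \alpha \Delta^a(\phi) \tag{419}$$
where $\alpha$ is infinitesimal and $\Delta^a$ is a shift function of the
fields. Restricting to constant fields, the derivative terms vanish, so the
potential alone must be invariant
$$V(\phi^a) = V(\phi^a + \alpha \Delta^a(\phi)). \tag{420}$$

This implies, by expanding and equating first order terms, that the directional
derivative of the potential is zero
$$\Delta^a(\phi) \frac{\partial}{\partial \phi^a} V(\phi) = 0. \tag{421}$$
Differentiate this with respect to $\phi^b$ to get
$$\frac{\partial \Delta^a(\phi)}{\partial \phi^b} \frac{\partial V(\phi)}{\partial \phi^a} + \Delta^a(\phi) \frac{\partial^2 V}{\partial \phi^a \partial \phi^b} = 0. \tag{422}$$
Evaluate at $\phi^a = \phi^a_0$, where the first term vanishes because $\phi_0$
is an extremum of $V$, to get
$$\sum_a \Delta^a(\phi_0) [m^2]_{ab} = 0 \tag{423}$$
where the vector $\Delta^T$ with components $\Delta^a(\phi_0)$ is the zero
eigenvector, and the $\Delta^a(\phi_0)$ are linearly independent for each
continuous symmetry, which follows by definition of the general, global,
continuous symmetry imposed above. If a symmetry leaves $\phi_0$ unchanged, then
$\Delta^a(\phi_0) = 0$ and the relation is trivial; only a broken symmetry gives
a nonzero $\Delta^a(\phi_0)$, and hence a zero eigenvalue of $m^2$.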

17 Lecture 17: The Higgs Mechanism
Here we discuss what happens when Goldstone’s theorem related to spontaneous
symmetry breaking (SSB) and local gauge invariance are combined.

Consider an $SU(2)$ gauge theory
$$\mathcal{L} = \bar{\psi} (i \not{D} - m) \psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu} \tag{424}$$
where the second term represents the auxiliary gauge boson fields that are
needed to endow the theory with local gauge invariance.

Massive gauge bosons are observed in experiment, and we are inclined to add a
bosonic mass term to this theory. Naively, we can add a term of the form
$\frac{1}{2} m^2 A^j_\mu(x) A^{j\mu}(x)$. Unfortunately, this will break local
gauge invariance, since it is not invariant under the local gauge
transformation of the gauge boson fields, e.g.,
$A^j_\mu(x) \to A^j_\mu(x) - \frac{1}{e} \partial_\mu \alpha(x)$.

The local gauge transformation is a constraint, and constraints in physics can
be solved by adding new degrees of freedom. For example, if we have $N$
equations (constraints) and $M$ unknowns (degrees of freedom), but $N \gg M$,
then there are no solutions! Therefore, we introduce new degrees of freedom
until we get solutions, and, in the context of high energy physics, new degrees
of freedom are fields/particles/excitations!

Abelian case
We consider the abelian case, a U(1) gauge theory combined with SSB, as the
“toy” model for the SU(2) theory. Consider a theory of a complex scalar
field φ = φR + iφI (which can also be regarded as a doublet) coupled to a U(1)
gauge field Aµ

$$\mathcal{L} = |D_\mu \phi|^2 - \frac{1}{4} F_{\mu\nu} F^{\mu\nu} - V(\phi) \tag{425}$$

Where we have the covariant derivative $D_\mu = \partial_\mu + ieA_\mu$ obeying the local
gauge condition.

L is indeed a gauge theory: each of its terms is invariant under the local
gauge transformation of the fields

$$A_\mu(x) \to A_\mu(x) - \frac{1}{e}\partial_\mu \alpha(x) \tag{426}$$

$$\phi(x) \to e^{i\alpha(x)}\phi(x). \tag{427}$$
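
The invariance under (426) and (427) rests on the covariant derivative transforming like the field itself. A minimal numerical sketch of that statement follows, restricted to one coordinate x for simplicity. The sample functions `phi`, `gauge_field`, `alpha` and the coupling value `E` are arbitrary assumptions chosen only for illustration.

```python
import cmath
import math

E = 0.3  # gauge coupling e; arbitrary illustrative value


def phi(x):
    return complex(math.cos(x), 0.5 * math.sin(2 * x))


def gauge_field(x):
    return math.sin(3 * x)


def alpha(x):
    return x * x


def deriv(f, x, h=1e-5):
    """Central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def covariant_derivative(field, a_field, x):
    """D phi = (d/dx + i e A) phi, the one-component version of D_mu."""
    return deriv(field, x) + 1j * E * a_field(x) * field(x)


def phi_transformed(x):
    return cmath.exp(1j * alpha(x)) * phi(x)


def gauge_field_transformed(x):
    return gauge_field(x) - deriv(alpha, x) / E


def covariance_defect(x):
    """|D'phi' - exp(i alpha) D phi|; zero if D phi transforms like phi."""
    lhs = covariant_derivative(phi_transformed, gauge_field_transformed, x)
    rhs = cmath.exp(1j * alpha(x)) * covariant_derivative(phi, gauge_field, x)
    return abs(lhs - rhs)


def plain_derivative_defect(x):
    """Same comparison for the ordinary derivative, which is not covariant."""
    return abs(deriv(phi_transformed, x)
               - cmath.exp(1j * alpha(x)) * deriv(phi, x))


for x in (0.1, 0.7, 1.3):
    print(x, covariance_defect(x), plain_derivative_defect(x))
```

The first defect stays at the level of finite-difference error, so $|D_\mu\phi|^2$ is unchanged, while the ordinary derivative picks up the extra term proportional to the derivative of α.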

For the interacting part of the theory, consider the potential

$$V(\phi) = -\mu^2 \phi^*\phi + \frac{\lambda}{2}(\phi^*\phi)^2, \qquad \mu^2 > 0. \tag{428}$$

Note that this potential is quartic in φ, leaving no hope for solving this model
straight away. To find a solution, we follow the same steps as for SSB: (1) Find
a field value φ = φ0 that minimizes the potential V(φ), and (2) expand the
Lagrangian density around the minimum and study small fluctuations in the field.

Following the same procedure as before, minimize V(φ) to find the family
of minima at (Exercise) $\phi_0 = e^{i\varphi}\sqrt{\mu^2/\lambda}$, and choose the
$\varphi = 0$ point, such that our minimum occurs at

$$\phi_0 = \sqrt{\frac{\mu^2}{\lambda}}. \tag{429}$$
To study small fluctuations about the minimum, write the field as

$$\phi(x) = \phi_0 + \frac{1}{\sqrt{2}}\left(\phi_1(x) + i\phi_2(x)\right) \tag{430}$$
Where φ1 and φ2 are small. Substituting into the potential, it becomes (Exercise)

$$V(\phi) = -\frac{\mu^4}{2\lambda} + \mu^2\phi_1^2 + \cdots + O(\phi_j^3) \tag{431}$$

Where the ··· account for all of the other terms from substituting φ(x), including
φ2 terms and cross-terms of φ1 and φ2. Now expand L around the minimum
φ0 to get

$$\mathcal{L} = \frac{1}{2}|\partial_\mu\phi_1|^2 + \frac{1}{2}|\partial_\mu\phi_2|^2 - \mu^2\phi_1^2 + \cdots + O(\phi_j^3). \tag{432}$$
Note that other terms exist in L, such as cross-terms, indicated by ···, and
that we have also shifted away the constant terms in L. The first two terms are
kinetic terms. The third is a mass term for φ1, with mass $\sqrt{2}\mu$, while φ2
has no mass term and is the massless Goldstone boson. Other terms from the
covariant derivative term include

$$|D_\mu\phi|^2 = \frac{1}{2}|\partial_\mu\phi_1|^2 + \frac{1}{2}|\partial_\mu\phi_2|^2 + \sqrt{2}\,e\phi_0 A_\mu\partial^\mu\phi_2 + e^2\phi_0^2 A_\mu A^\mu + \cdots. \tag{433}$$
We interpret the last term above as the bosonic mass term

$$\Delta\mathcal{L} = e^2\phi_0^2 A_\mu A^\mu \equiv \frac{1}{2} m_A^2 A_\mu A^\mu. \tag{434}$$
This result is in the classical, low-energy limit, since we studied only small fluc-
tuations of the field.
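
The expansion steps of this section can be checked numerically. The sketch below is only an illustration with arbitrary assumed values for µ², λ and e. It evaluates the potential (428) on the parametrization (430), estimates the curvature in the φ1 and φ2 directions by finite differences, and computes the gauge boson mass implied by (434).

```python
import math

MU2, LAM, E = 1.3, 0.7, 0.4  # mu^2, lambda, e: arbitrary illustrative values
PHI0 = math.sqrt(MU2 / LAM)  # the minimum (429)


def potential(phi1, phi2):
    """V of (428) evaluated on phi = phi0 + (phi1 + i phi2) / sqrt(2)."""
    phi = complex(PHI0 + phi1 / math.sqrt(2), phi2 / math.sqrt(2))
    rho = abs(phi) ** 2
    return -MU2 * rho + 0.5 * LAM * rho ** 2


def second_derivative(f, h=1e-3):
    """Central finite-difference second derivative of f at 0."""
    return (f(h) - 2 * f(0.0) + f(-h)) / (h * h)


v_min = potential(0.0, 0.0)
curvature_phi1 = second_derivative(lambda t: potential(t, 0.0))
curvature_phi2 = second_derivative(lambda t: potential(0.0, t))
m_a = math.sqrt(2) * E * PHI0  # from (1/2) m_A^2 = e^2 phi0^2

print("V(phi0)      :", v_min, "vs", -MU2 ** 2 / (2 * LAM))
print("d2V/dphi1^2  :", curvature_phi1, "vs", 2 * MU2)
print("d2V/dphi2^2  :", curvature_phi2, "vs 0")
print("m_A          :", m_a)
```

The curvature 2µ² along φ1 matches the coefficient µ² of φ1² in (431), and the vanishing curvature along φ2 is the Goldstone direction that the gauge field absorbs.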

To outline the quantization of this theory, we use the path integral approach,
although lattice quantization also works for this theory. So, break the Lagrangian
density into free and interacting parts $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{int}$, where

$$\mathcal{L}_0 = \frac{1}{2}|\partial_\mu\phi_1|^2 + \frac{1}{2}|\partial_\mu\phi_2|^2 - \mu^2\phi_1^2 - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} \tag{435}$$
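And Lint is the rest of the terms: perturbations leading to vertices in the
Feynman rules. For example, in momentum space, the vertex from the cross-term
$\sqrt{2}\,e\phi_0 A_\mu\partial^\mu\phi_2$, which mixes the gauge boson with φ2, looks like

$$[\text{diagram: } A_\mu \text{ to } \phi_2 \text{ mixing vertex}] = i\sqrt{2}\,e\phi_0(-ik^\mu) \equiv m_A k^\mu \tag{436}$$

At the quantum level, the mass of the gauge boson fields manifests in the poles
of the Green's functions. For example,

$$[\text{diagrams: mass vertex plus } \phi_2 \text{ exchange}] = i m_A^2 g^{\mu\nu} + (m_A k^\mu)\,\frac{i}{k^2}\,(m_A k^\nu) \sim i m_A^2 \tag{437}$$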

Nonabelian case
For the nonabelian case of SSB in SU(2) gauge theory, consider a general,
continuous gauge group G and a set of scalar fields
$\phi = \begin{pmatrix}\phi_1\\ \vdots\\ \phi_d\end{pmatrix}$ that transform
under the gauge group G as

$$\phi_j \to (I + i\alpha^a t^a)_{jk}\phi_k \tag{438}$$

Where α is small, such that we have the representation of an element g ∈ G

$$\pi(g) = e^{i\alpha^a t^a} \tag{439}$$

And the $\{t^a\}$ are purely imaginary matrices from the associated Lie algebra
that depend on the representation.

Now build a gauge theory based on the gauge group G by promoting it to a
local symmetry that acts locally on the scalar fields as

φ(x) → π(g(x))φ(x). (440)


Derivatives of the scalar fields in this gauge theory will be the covariant
derivative

$$D_\mu\phi = (\partial_\mu + gA^a_\mu\tau^a)\phi \tag{441}$$

Where the matrix $\tau^a = it^a$. Build the kinetic energy terms by squaring and
halving the covariant derivative acting on the field to obtain

$$\frac{1}{2}(\partial_\mu\phi_j)^2 + gA^a_\mu(\partial^\mu\phi_j\,\tau^a_{jk}\phi_k) + \frac{1}{2}g^2 A^a_\mu A^{b\mu}(\tau^a\phi)_j(\tau^b\phi)_j \tag{442}$$
Now introduce the aspect of SSB to this gauge theory by combining this with
the wine bottle potential

V (φ) = −ν|φ|2 + λ|φ|4 . (443)


So, there is a classical minimum of these fields at φ0, such that φ(x) = φ0 + φ1(x),
where φ1(x) are small fluctuations about the minimum, which we substitute and
expand about in the Lagrangian density to find terms like (Exercise)

$$\Delta\mathcal{L} \sim \frac{1}{2}(m^2)_{ab}A^a_\mu A^{b\mu} \tag{444}$$

Where we have a negative of a positive semidefinite matrix (if diagonable, then
eigenvalues are ≥ 0) with indices running from a, b = 1, . . . , dim(G)

$$(m^2)_{ab} \equiv g^2(\tau^a\phi_0)_j(\tau^b\phi_0)_j. \tag{445}$$


For G = SU(2), choose the fundamental (spin-1/2) representation in terms of the
Pauli spin matrices

τ 1 = iσ x , τ 2 = iσ y , τ 3 = iσ z . (446)
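Then the fields (eigenvectors) in this basis are

$$\phi = \begin{pmatrix}\phi_1\\ \phi_2\end{pmatrix} \quad\text{and}\quad \phi_0 = \begin{pmatrix}1\\ 0\end{pmatrix}. \tag{447}$$

The action of the generators $\tau^a$ on the minimum, the vacuum state, is (Exercise)

$$\tau^1\phi_0 = i\begin{pmatrix}0\\ 1\end{pmatrix} \tag{448}$$

$$\tau^2\phi_0 = -\begin{pmatrix}0\\ 1\end{pmatrix} \tag{449}$$

$$\tau^3\phi_0 = i\begin{pmatrix}1\\ 0\end{pmatrix} \tag{450}$$

Giving matrix elements

$$[m^2] = g^2\begin{pmatrix}-1 & -i & 0\\ -i & -1 & 0\\ 0 & 0 & -1\end{pmatrix}. \tag{451}$$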
If there is a generator τ a , in a different representation, that leaves the vacuum
state invariant such that

τ a φ0 = 0 (452)
Then the corresponding entries of the mass matrix vanish (zero eigenvalues),
and the associated gauge boson stays massless. The gauge bosons whose
generators do not annihilate the vacuum, which were initially massless, become
massive when the symmetry is broken! Specifically, in the theory of electroweak
interactions, these massive particles are the W± and Z bosons.
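
The generator actions (448) to (450) are quick to verify by direct matrix multiplication. The following is a small sketch using only the definitions $\tau^a = i\sigma^a$ and $\phi_0 = (1, 0)^T$ given above. The helper name `apply` is hypothetical.

```python
# Pauli matrices as nested lists.
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

# Generators tau^a = i sigma^a, as in (446).
TAU = {a: [[1j * entry for entry in row] for row in SIGMA[a]] for a in SIGMA}

PHI0 = [1, 0]  # the vacuum of (447)


def apply(matrix, vector):
    """Multiply a 2x2 matrix into a 2-component column vector."""
    return [sum(matrix[r][c] * vector[c] for c in range(2)) for r in range(2)]


actions = {a: apply(TAU[a], PHI0) for a in TAU}
for a, vec in actions.items():
    print("tau^%s phi0 =" % a, vec)

# A generator leaves the vacuum invariant only if tau^a phi0 vanishes.
unbroken = [a for a, vec in actions.items() if all(c == 0 for c in vec)]
print("generators annihilating the vacuum:", unbroken)
```

For this representation no generator annihilates the vacuum, so condition (452) is not met by any of the three and no gauge boson is left massless.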

18 Lecture 18: SSB with Gauge Theories and Next Steps
Here we complete our discussion of SSB in the context of gauge theories (e.g.,
the Higgs mechanism), and give a broad overview of open questions and where
to go next.

Recall from the previous lecture we began working with a local gauge theory of
bosons with a tuple of independent scalar fields
$\begin{pmatrix}\phi_1\\ \vdots\\ \phi_d\end{pmatrix}$. Each field transforms
according to the local gauge symmetry

$$\phi_j \to (1 + i\alpha^a t^a)_{jk}\phi_k \tag{453}$$

Where the $t^a$ is a purely imaginary representation.

Next, we added a potential to the Lagrangian density that is minimized by some
configuration $\phi_0 = \arg\min_\phi V(\phi)$. If this minimal configuration is not
equal to zero, $\phi_0 \neq 0$, then we can spontaneously generate mass. For example,
we choose, via gauge invariance, an eigenvector with just one nonzero value to be
the minimal configuration

$$\phi_0 = \begin{pmatrix}0\\ \vdots\\ \nu\end{pmatrix} \equiv \begin{pmatrix}0\\ \nu\end{pmatrix}. \tag{454}$$
To understand how the gauge bosons acquire mass, consider the kinetic energy
term built with the covariant derivative, $|D_\mu\phi|^2$ (a mod square, since the
representation is purely imaginary), which is a long, complicated expression.
Expand the fields in this expression around the minimal configuration by writing
φ = φ0 + φ1. This yields another long, complicated expression, but let’s focus
on the “mass term”, which allows the spontaneous generation of mass in gauge
theory,

$$|D_\mu\phi|^2 = \cdots + \frac{1}{2}g^2 m^2_{ab}A^a_\mu A^{b\mu} \tag{455}$$

Where the mass matrix $m^2_{ab}$ is equal to

$$m^2_{ab} = \begin{pmatrix}0 & \nu\end{pmatrix}\tau^a\tau^b\begin{pmatrix}0\\ \nu\end{pmatrix} \tag{456}$$
And can be positive semidefinite or negative semidefinite due to the metric.
Consider the form of the gauge boson fields in this mass term

Aaµ Abµ = Aa0 Ab0 − Aa1 Ab1 − Aa2 Ab2 − Aa3 Ab3 . (457)

The only non-negative term is the zeroth, timelike, longitudinal element, and
will cause the entire term to not look like a mass term. The three spatial de-
grees of freedom in this expression, if m2ab is positive semidefinite, will lead to a
mass term in the Lagrangian density L, since mass terms have a minus sign in L.

The longitudinal term does not look right and can be cancelled off since, for
the photon, momentum only has transverse components: no longitudinal com-
ponents. Consider the vacuum polarization diagram like we sketched in the
previous lecture for the Abelian Higgs model.

The contributing diagram in the propagation of a gauge boson, which cancels
off the longitudinal term in the perturbation expansion, is a second order term
involving the exchange of a massless scalar (the Goldstone boson), as in the
Abelian Higgs model.

This diagram contributes the purely transverse term, with longitudinal
contributions cancelled off,

$$i m^2_{ab}\left(\eta^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\right). \tag{458}$$
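
The statement that (458) is purely transverse means that contracting it with the momentum gives zero. A short numerical sketch of that property follows, assuming the mostly-minus metric used in (457) and an arbitrary sample momentum. The function names are hypothetical.

```python
# Minkowski metric with signature (+, -, -, -), as in (457).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def lower(k_up):
    """Lower the index of a four-vector: k_mu = eta_{mu nu} k^nu."""
    return [sum(ETA[m][n] * k_up[n] for n in range(4)) for m in range(4)]


def projector(k_up):
    """eta^{mu nu} - k^mu k^nu / k^2 for a momentum with k^2 != 0."""
    k_down = lower(k_up)
    k2 = sum(k_up[m] * k_down[m] for m in range(4))
    return [[ETA[m][n] - k_up[m] * k_up[n] / k2 for n in range(4)]
            for m in range(4)]


def contract_with_momentum(k_up):
    """k_mu P^{mu nu}, which should vanish for every nu."""
    k_down = lower(k_up)
    p = projector(k_up)
    return [sum(k_down[m] * p[m][n] for m in range(4)) for n in range(4)]


sample_k = [2.0, 0.3, -0.5, 1.1]  # arbitrary momentum with k^2 != 0
print(contract_with_momentum(sample_k))

# In the rest frame the time-time entry drops out and three spatial
# components survive.
print(projector([1.0, 0.0, 0.0, 0.0]))
```

In the rest frame the projector reduces to diag(0, -1, -1, -1), which is the sense in which the timelike piece of (457) is removed and only the three spatial degrees of freedom remain.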
In the previous lecture, we worked out the fundamental representation of SU (2)
for the gauge theory. In this representation, work out the kinetic energy ∼
|Dµ φ|2 and find that

m2ab ∝ Iab . (459)


With the mass term proportional to the identity, this means that all three gauge
bosons in this gauge theory have the same mass. Now, how do we build a gauge
theory that gives mass to some gauge bosons but not others?
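
The proportionality (459) can be checked directly from (456). The sketch below makes two labeled assumptions: the generators are taken as $\tau^a = i\sigma^a$ (the choice made in the previous lecture) and only the part of $m^2_{ab}$ symmetric in a and b is kept, since it multiplies the symmetric product of gauge fields. The value of `NU` is arbitrary.

```python
NU = 2.0  # vacuum value nu; arbitrary illustrative number

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Assumed convention: tau^a = i sigma^a.
TAU = [[[1j * entry for entry in row] for row in s] for s in SIGMA]
PHI0 = [0, NU]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sandwich(m):
    """(0 nu) m (0 nu)^T, the structure of (456)."""
    return sum(PHI0[i] * m[i][j] * PHI0[j] for i in range(2) for j in range(2))


raw = [[sandwich(matmul(TAU[a], TAU[b])) for b in range(3)] for a in range(3)]

# Only the symmetric part contributes when contracted with A^a A^b.
mass = [[(raw[a][b] + raw[b][a]) / 2 for b in range(3)] for a in range(3)]

for row in mass:
    print(row)
```

Under these assumptions the symmetrized matrix is a multiple of the identity (here with a negative coefficient, reflecting the anti-Hermitian choice of generators), so all three gauge bosons share one mass.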

Glashow-Weinberg-Salam (GWS) Theory of Weak Interactions
The GWS theory of weak interactions describes the weak gauge bosons and the
electromagnetic field (photon) in one theory. The local gauge group
which this theory is invariant under is SU(2) × U(1). The gauge group symmetry
transformations act on the scalar field φ as

$$\phi \to e^{i\alpha^a\tau^a}\,e^{i\beta/2}\,\phi \tag{460}$$

Where $\tau^a = \sigma^a/2$ are built from the Pauli spin matrices for the SU(2)
(gauge boson) part of the group, and β is a scalar from the U(1) (photon) part
of the group. Think of the action of U(1) as giving charge to the field φ.

Now add the potential V(φ) to the Lagrangian density and minimize the potential
such that

$$\phi = \phi_0 + \phi_1, \quad\text{where}\quad \phi_0 = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\ \nu\end{pmatrix}. \tag{461}$$
Expand the covariant derivative around the minimum φ0, and in the fundamental
representation for this gauge group SU(2) × U(1), we get (Exercise)

$$D_\mu\phi = \left(\partial_\mu - igA^a_\mu\tau^a - \frac{i}{2}g'B_\mu\right)\phi \tag{462}$$

With the connection gauge field
$\begin{pmatrix}A^a_\mu\\ B_\mu\end{pmatrix} \sim \begin{pmatrix}SU(2)\\ U(1)\end{pmatrix}$.
Putting everything together into a locally gauge invariant Lagrangian density

$$\mathcal{L}_{GWS} = |D_\mu\phi|^2 - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \cdots + \text{fermion term} + \cdots - V(\phi). \tag{463}$$
The fermion term will be discussed below.

Mass Generation
Due to the potential having a minimum configuration, we can generate mass
in this gauge theory to endow gauge bosons with mass. We want to do this,
because massive bosons are experimentally observed. Expand the Lagrangian
density around the minimum φ0 and acquire a term from the kinetic energy term
that looks like (Exercise)

$$\text{“mass term”} \equiv \frac{\nu^2}{8}\left[g^2(A^1_\mu)^2 + g^2(A^2_\mu)^2 + (-gA^3_\mu + g'B_\mu)^2\right]. \tag{464}$$

So, $A^1_\mu$, $A^2_\mu$, and the combination of fields $A^3_\mu$/$B_\mu$ each act like
they have a mass, interpreted as massive gauge bosons: three degrees of freedom
from SU(2). The fourth degree of freedom is missing from this mass term, and we,
therefore, conclude that it does not have a mass, and we interpret this as the
photon: one degree of freedom from U(1).

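It is convenient to rename the fields

$$W^\pm_\mu = \frac{1}{\sqrt{2}}\left(A^1_\mu \mp iA^2_\mu\right) \tag{465}$$

$$Z^0_\mu = \frac{1}{\sqrt{g^2 + g'^2}}\left(gA^3_\mu - g'B_\mu\right) \tag{466}$$

With masses

$$m_W = \frac{1}{2}g\nu \tag{467}$$

$$m_Z = \frac{1}{2}\sqrt{g^2 + g'^2}\,\nu. \tag{468}$$

The fourth field (degree of freedom), with zero mass, we call

$$A_\mu = \frac{1}{\sqrt{g^2 + g'^2}}\left(g'A^3_\mu + gB_\mu\right). \tag{469}$$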
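
The field redefinitions (465) to (469) amount to diagonalizing a 4 × 4 mass matrix. A minimal sketch of that check follows, assuming the mass term has the form $\frac{\nu^2}{8}[g^2(A^1)^2 + g^2(A^2)^2 + (gA^3 - g'B)^2]$, which is the form consistent with (466) to (468). The numerical values of `G`, `G_PRIME` and `NU` are illustrative only.

```python
import math

# Illustrative values only; these are not fitted to data.
G, G_PRIME, NU = 0.65, 0.35, 246.0

# (1/2) V^T M V reproduces the assumed mass term for V = (A1, A2, A3, B).
SCALE = NU ** 2 / 4
MASS_SQUARED = [
    [SCALE * G * G, 0, 0, 0],
    [0, SCALE * G * G, 0, 0],
    [0, 0, SCALE * G * G, -SCALE * G * G_PRIME],
    [0, 0, -SCALE * G * G_PRIME, SCALE * G_PRIME ** 2],
]


def apply(matrix, vector):
    return [sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4)]


norm = math.sqrt(G ** 2 + G_PRIME ** 2)
z_direction = [0, 0, G / norm, -G_PRIME / norm]       # Z of (466)
photon_direction = [0, 0, G_PRIME / norm, G / norm]   # A of (469)

m_w = 0.5 * G * NU     # (467)
m_z = 0.5 * norm * NU  # (468)

print("M . photon =", apply(MASS_SQUARED, photon_direction))
print("M . Z      =", apply(MASS_SQUARED, z_direction))
print("m_Z^2 . Z  =", [m_z ** 2 * c for c in z_direction])
print("m_W, m_Z   =", m_w, m_z)
```

The photon combination is a null vector of the matrix, the Z combination is an eigenvector with eigenvalue $m_Z^2$, and the two upper-left entries equal $m_W^2$.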

Coupling to Fermions
Gauge bosons couple differently to left-handed and right-handed fermions.
Recall that the chirality of fermions arose in the development of the gamma (γ)
matrices and Weyl spinors and the kinetic energy term decouples into

$$\bar{\psi}(i\slashed{\partial})\psi = \bar{\psi}_L(i\slashed{\partial})\psi_L + \bar{\psi}_R(i\slashed{\partial})\psi_R. \tag{470}$$

Similarly, in the gauge theory

$$\bar{\psi}(i\slashed{D})\psi = \bar{\psi}_L(i\slashed{D})\psi_L + \bar{\psi}_R(i\slashed{D})\psi_R. \tag{471}$$
To recognize this difference in chirality and coupling of gauge bosons to fermions,
recall that we have the choice of representation of the gauge group SU (2), and
the left- and right-handed fermions can be separated into different representa-
tions.
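
The decoupling in (470) and (471) follows from the projectors $P_{L,R} = (1 \mp \gamma^5)/2$: since $\bar{\psi}_L = \bar{\psi}P_R$, the terms that would mix chiralities contain $P_L\gamma^\mu P_L$ or $P_R\gamma^\mu P_R$, which vanish. The sketch below checks this with explicit matrices, assuming the chiral (Weyl) basis for the gamma matrices. The basis choice and all helper names are assumptions made for illustration.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def negate(m):
    return [[-x for x in row] for row in m]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(m):
    return all(x == 0 for row in m for x in row)


# Chiral (Weyl) basis, an assumed convention: gamma^0, then gamma^1..3.
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, negate(s), Z2) for s in SIGMA]
GAMMA5 = block(negate(I2), Z2, Z2, I2)
P_L = block(I2, Z2, Z2, Z2)  # (1 - gamma5) / 2
P_R = block(Z2, Z2, Z2, I2)  # (1 + gamma5) / 2

# Terms with the same projector on both sides of gamma^mu must vanish.
same_projector = [matmul(matmul(p, g), p) for g in GAMMA for p in (P_L, P_R)]
# Terms with opposite projectors are the ones that survive in (470).
opposite_projector = [matmul(matmul(P_R, g), P_L) for g in GAMMA]

print(all(is_zero(m) for m in same_projector))
print(any(is_zero(m) for m in opposite_projector))
```

Because the left- and right-handed pieces never mix in the kinetic term, each can be assigned to its own representation of the gauge group.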

Next Steps
There are three ways to go from here:

1. Practical calculations for experiments,
2. Physics-focused research,
3. Mathematics-focused research.

Physics
• Supersymmetry (SUSY)
Upgrading symmetries of groups (e.g., Poincaré → Super Poincaré).
• Quantum gravity via field theory tools (e.g., superstring theory, loop quan-
tum gravity).
Superstring theory offers a correct effective theory to explain the high
energy physics of black holes.
• Linear quantum gravity
Does not lead to strings.
Non-renormalizable theory.
Treats the metric g µν = η µν + δg µν as the degree of freedom.
• Linear quantum gravity with the Standard Model
Only works up to some cutoff Λ, making an effective, low energy
theory of gravity.
Best theory we have to date for describing all experiments, but we also
believe that it is not fundamental, since black holes exist, which conflicts
with the imposed cutoff, and this theory fails at high energy predictions.
• AdS/CFT Correspondence
Explains quantum gravity with field theory alone, no strings.
It states that a strongly interacting conformal quantum field theory
(CFT) on the boundary of Anti-de Sitter spacetime (AdS) is dual to a
quantum gravity theory of Anti-de Sitter spacetime (solutions to Einstein’s
equations) in its semiclassical limit, which implies that quantum gravity
is itself a quantum field theory.
We’d prefer to develop this theory in de Sitter spacetime, as that is
the spacetime that we find ourselves in (see the work of Strominger).
• Quantum information theory is becoming helpful to study the kinematics
of quantum systems (e.g., photon entanglement).

Mathematics
There are many rigorous formulations of quantum field theory:
• Axiomatic QFT,
• Constructive QFT,
• Functional integration, expansions, and probabilistic approaches,
• Vertex operator algebras,
• Chiral & factorization algebras,
• Topological QFT (TQFT) & n-categories.
Why is QFT so difficult to make rigorous?

This is largely due to perturbation theory working so darn well, and
it is used as the only standard tool in QFT at large, but it is wrong, and we
know why it is wrong. For example, consider a Gaussian path integral in a
(0 + 1)-dimensional QFT with a quartic interaction
$Z = \int_{-\infty}^{\infty} dx\, e^{-x^2 - gx^4}$. As
prescribed, we assume that g is small, do perturbation theory, get lots of terms,
and end up with a zero radius of convergence! The interaction term does improve
convergence, but perturbation theory can’t see that. Even after Wick
rotation of the path integral $Z = \int \mathcal{D}\phi\, e^{-S}$, which also improves
convergence, perturbation theory can’t calculate a finite value for the path
integral. Also note that for spacetime dimensions d > 6, all QFTs are trivial
(e.g., Gaussians).
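
The zero radius of convergence can be seen explicitly for the integral above. Expanding $e^{-gx^4}$ in powers of g and integrating term by term gives coefficients $\frac{(-g)^n}{n!}\Gamma(2n + \tfrac{1}{2})$, which grow factorially. The sketch below compares partial sums of that series with a direct numerical estimate of Z. The value g = 0.1, the integration range and the step count are arbitrary assumptions.

```python
import math


def z_numeric(g, half_width=8.0, steps=40000):
    """Midpoint-rule estimate of the integral of exp(-x^2 - g x^4)."""
    dx = 2 * half_width / steps
    points = (-half_width + (i + 0.5) * dx for i in range(steps))
    return dx * sum(math.exp(-(x * x) - g * x ** 4) for x in points)


def series_term(g, n):
    """n-th term of the expansion in g: (-g)^n / n! * Gamma(2n + 1/2)."""
    return (-g) ** n / math.factorial(n) * math.gamma(2 * n + 0.5)


def partial_sum(g, order):
    return sum(series_term(g, n) for n in range(order + 1))


COUPLING = 0.1
exact = z_numeric(COUPLING)
print("numerical Z:", exact)
for order in (0, 1, 2, 3, 4, 6, 10, 20, 30):
    print(order, partial_sum(COUPLING, order))
```

The low orders bracket the numerical value, but the terms stop shrinking after a few orders and the partial sums then run away, which is the behavior of an asymptotic series rather than a convergent one.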

• Axiomatic QFT
Wightman: fields are distribution-valued objects (unbounded opera-
tors) acting on a Hilbert space.
Haag-Kastler: C ∗ -algebras.
Osterwalder-Schrader: statistical mechanical foundation (Wick-rotated).
Reconstruction theorems: With all n-point correlation functions well-
behaved, we can reconstruct full Hilbert space and the unitary represen-
tations of the Poincaré symmetry group.
There are problems with local gauge theories (current research).
• Constructive QFT
Cluster expansions take the Wick-rotated, Euclidean path integral
and trade off the low energy perturbation series expansion for estimates
of large values of the degrees of freedom that suppress bad parts of the
series to get finite results.
Successful in (1 + 1)- and (2 + 1)-dimensional spacetime, as well as
local gauge theory, but it is very intricate and has become very difficult
to communicate and check results.

• Algebraic QFT
Start with some axioms and abstract what the observables should be
from there.
Quantifies locality in the algebra, leading to observable (C ∗ ) algebras.
There is difficulty in finding the states (n-point function) to match
the C ∗ algebras.
• Functional integration, expansions, and probabilistic approaches
Nelson’s axioms are stronger than the Osterwalder-Schrader axioms.
• Vertex operator algebras
Very successful with conformal field theories (CFTs), but is stuck in
(1 + 1) dimensions.
• TQFT & n-categories
Exactly solvable, strongly interacting theories built on n-categories.

• Chiral & factorization algebras
See the work of Costello on perturbation expansions.
• Mathematical theory of effective QFT
Follows the work of Wilson.
