Classical Theory of Fields Queen Mary University of London
Preface
This work is part of my exchange programme in the United Kingdom,
written during the summer of 2014, on my year abroad at Queen Mary,
University of London. The topics covered are brief introductions to Special
Relativity and the covariant formulation of Electromagnetism, including
the consequences of the postulates of Special Relativity, and a step-by-step
explanation of Tensor Calculus and Relativistic Mechanics.
Acknowledgements
To my family, which has never stopped believing in me, especially my
mother Tania. To my advisor at Queen Mary, University of London, Dr.
Alston Misquitta, and to my advisor and counselor at Universidade Federal
de Santa Catarina, Dr. Marco Kneipp. To my friends, in particular
Augusto, Luciano, Madlene, Deborah, Antonio and Maique, who, from
Brazil, have been supporting me at all times. To my friends who
helped me through my quick journey in the United Kingdom, Fernanda,
Cos, Vasily, Vladimir, Ivan, Elena, Marieta, Sophia, Kristina and Manetou.
To my sponsor, CNPq. To Dr. Brian Wecht from Queen Mary, who
not only taught me Statistical Physics, but also what a lecturer must be
like.
1. The laws of physics are the same in all inertial frames of reference;
2. The speed of light in free space has the same value c in all inertial frames
of reference.
An inertial frame of reference is a frame in which a freely moving body
proceeds with constant velocity, that is, a frame in which Newton’s first law of
motion holds or, in other words, in which the velocity of any particle remains
constant unless there is a net force acting on it. If a system moves with constant
velocity with respect to an inertial reference system, then it is also inertial.
Ordinary mechanics assumes that the propagation of interactions of material
particles is instantaneous. Experiments show, however, that there is no instan-
taneous interaction in nature: there is a finite maximum speed of propagation of
interaction, which implies that motions of bodies with greater speed are impos-
sible, for if such a motion could occur, then by means of it one could realise an
interaction with a speed exceeding the maximum possible speed of propagation
of interaction. From the second postulate, it follows that this maximum speed
is the same in all inertial systems of reference. This universal constant, which
is also the speed of light in free space, is designated by c and is exactly given by¹
$$c = 299\,792\,458\ \mathrm{m/s}. \tag{1}$$
1.2 Intervals
An event is described by the place where it occurred and the time when it occurred.
It is useful to use a four-dimensional space, whose spatial axes are x, y, z and
whose temporal axis is ct. In this space, events are points $(ct_1, x_1, y_1, z_1)$ called world
points, and there corresponds to each particle a line, called world line.²
Consider two inertial reference systems K and K′, with axes (ct, x, y, z) and
(ct′, x′, y′, z′) respectively, moving relative to each other with constant velocity.
Suppose that the frames coincide at t = t′ = 0, and consider a flash of light
emanating from their common origin at the instant they coincide. Therefore,
1 Originally, one metre was intended to be one ten-millionth of the distance from the Earth’s
equator to the North Pole, but since 1983 it has been defined as the length of the path travelled
by light in vacuum during a time interval of 1/299,792,458 of a second.
2 It is easy to show that to a particle in uniform rectilinear motion there corresponds a
remembering that the distance travelled by the wave is given by the product of
its speed and the interval of time, the spherical wave front described in K by
$$x^2 + y^2 + z^2 = (ct)^2 \tag{2}$$
will be described in K′ by
1.3 Proper time
The proper time of an object is defined as the time read by a clock moving with
this object. The proper time interval between two events will be therefore the
interval of time measured in a reference frame in which the two events occur at
the same point in space. Let us use the Greek letter τ to describe the proper
time.
Consider the same reference systems K and K′ moving relative to each
other with constant velocity v, and suppose there is a clock at rest in K′. The
infinitesimal interval in K is given by
$$ds^2 = c^2 d\tau^2, \tag{11}$$
and so
$$\frac{dx^2 + dy^2 + dz^2}{dt^2} = v^2, \tag{14}$$
where $v = |\mathbf{v}|$ is the relative speed between the reference systems K and K′.
Let us now define the velocity coefficient
$$\beta \equiv \frac{v}{c} \tag{15}$$
and the Lorentz factor or Lorentz term
$$\gamma \equiv \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{1}{\sqrt{1 - \beta^2}}, \tag{16}$$
$$\frac{d\tau}{dt} = \frac{1}{\gamma}. \tag{18}$$
Supposing v is constant, one can integrate (18) and obtain the time interval
indicated by the moving clock in K:
$$\int_{t_1}^{t_2} \frac{d\tau}{dt}\,dt = \tau(t_2) - \tau(t_1) \equiv \tau_2 - \tau_1 = \int_{t_1}^{t_2} \frac{1}{\gamma}\,dt = \frac{1}{\gamma}(t_2 - t_1) \tag{19}$$
or
$$\Delta\tau = \frac{1}{\gamma}\Delta t \le \Delta t, \tag{20}$$
since v is always less than c, so that 0 < 1/γ ≤ 1. Therefore, we
conclude that the proper time interval of a moving object is never greater than
the corresponding interval in the rest system. In other words, moving clocks
run slow.
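As a numerical illustration of (16) and (20), the following minimal Python sketch evaluates the Lorentz factor and the proper time elapsed on a clock moving at constant speed. The function names `lorentz_factor` and `proper_time_interval` are illustrative choices made here, not notation from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def lorentz_factor(v, c=C):
    """Return gamma = 1/sqrt(1 - v^2/c^2), eq. (16), for a speed |v| < c."""
    beta = v / c
    if abs(beta) >= 1.0:
        raise ValueError("speed must be strictly less than c")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def proper_time_interval(dt, v, c=C):
    """Proper time shown by a clock moving at constant speed v, eq. (20)."""
    return dt / lorentz_factor(v, c)


if __name__ == "__main__":
    for frac in (0.0, 0.1, 0.6, 0.99):
        g = lorentz_factor(frac * C)
        print(f"v = {frac:4.2f} c   gamma = {g:8.4f}   dtau/dt = {1.0 / g:6.4f}")
```

For everyday speeds the ratio dτ/dt is indistinguishable from 1, which is why the effect is not noticed in ordinary mechanics.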
According to (11), we also have dτ = ds/c, so the time interval read by the
clock in K′ is also given by
$$\tau_2 - \tau_1 = \frac{1}{c}\int_a^b ds, \tag{21}$$
taken along the world line of the clock. But, since the clock at rest always
indicates a greater time interval than the moving one, we conclude that
$$\int_a^b ds \tag{22}$$
has its maximum value if it is taken along the straight world line joining the
points a and b.
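$$x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t, \tag{23}$$
or, in matrix form,
$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ -v & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}, \tag{24}$$
which is called the Galilean transformation and is clearly inconsistent with the
principle of relativity, since it does not leave the interval invariant.
Since the interval can be regarded as the distance between two world points
in our four-dimensional space, the transformation we seek must be expressible
mathematically as a rotation in this space. Let us consider a rotation in the tx
plane, so that $c^2t^2 - x^2$ must be invariant. In the most general case, we have
or simply
and
$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} \gamma & 0 & 0 & -\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta\gamma & 0 & 0 & \gamma \end{pmatrix}
\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{40}$$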
$$t_1 = \gamma\left(\tau_1 + \frac{v x'_1}{c^2}\right), \quad t_2 = \gamma\left(\tau_2 + \frac{v x'_2}{c^2}\right), \tag{44}$$
which implies
$$\Delta t = t_2 - t_1 = \gamma\left(\tau_2 + \frac{v x'_2}{c^2} - \tau_1 - \frac{v x'_1}{c^2}\right) \tag{45}$$
or
9 One may use direct transformation, remembering that, in this case, $x_2 - x_1 = v(t_2 - t_1)$
in K.
$$\Delta t = \gamma\Delta\tau, \tag{46}$$
since $x'_1 = x'_2$ in K′, for it was assumed that the clock is at rest there. This is
called time dilation, since the time interval measured in the frame in which the clock
moves is greater than the one measured in the rest frame of the clock. Note that (46)
agrees with the result found in (20).
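The contrast between the Galilean transformation (23) and a Lorentz boost can be checked numerically. The sketch below assumes a boost along the x-axis written in the variables (ct, x), with β = v/c; the helper names `boost_x`, `galilean_x` and `interval2` are hypothetical and chosen only for this illustration.

```python
import math


def boost_x(ct, x, beta):
    """Lorentz boost along x with beta = v/c; returns (ct', x')."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)


def galilean_x(ct, x, beta):
    """Galilean transformation (23) written in the variables (ct, x)."""
    return ct, x - beta * ct


def interval2(ct, x):
    """Squared interval c^2 t^2 - x^2 between the origin and the event."""
    return ct * ct - x * x


if __name__ == "__main__":
    event = (5.0, 3.0)
    print("original interval :", interval2(*event))
    print("after Lorentz     :", interval2(*boost_x(*event, 0.6)))
    print("after Galilean    :", interval2(*galilean_x(*event, 0.6)))
    # A clock at rest in K' (x' = 0) that reads c*tau = 4 is seen in K at
    # c*t = gamma * c*tau, which is the time dilation formula (46).
    ct, _ = boost_x(4.0, 0.0, -0.6)  # inverse boost: beta -> -beta
    print("c*t for c*tau = 4 :", ct)
```

The Lorentz boost keeps the interval fixed while the Galilean map changes it, and the inverse boost of an event on the clock's world line reproduces Δt = γΔτ.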
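$$\mathbf{u}' = (u'_x, u'_y, u'_z) = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right). \tag{48}$$
Using the Lorentz transformation, we obtain
$$dx' = \gamma(dx - v\,dt), \quad dy' = dy, \quad dz' = dz, \quad dt' = \gamma\left(dt - \frac{v\,dx}{c^2}\right), \tag{49}$$
where $v = |\mathbf{v}|$. Dividing the first three equations by the fourth, we get
$$u'_x = \frac{u_x - v}{1 - \frac{u_x v}{c^2}}, \quad
u'_y = \frac{u_y}{\gamma\left(1 - \frac{u_x v}{c^2}\right)}, \quad
u'_z = \frac{u_z}{\gamma\left(1 - \frac{u_x v}{c^2}\right)}, \tag{50}$$
which are the transformation of velocity. The inverse transformation is obtained
by changing v to −v. Note that, setting c → ∞ or v ≪ c, we have the classical
transformation of velocity
$$u'_x = u_x - v, \quad u'_y = u_y, \quad u'_z = u_z. \tag{51}$$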
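The velocity transformation (50) can be explored with a short Python sketch. It assumes relative motion along the x-axis, as in the text; the function names `transform_velocity` and `speed` are illustrative only.

```python
import math


def transform_velocity(u, v, c=1.0):
    """Velocity (ux, uy, uz) as seen from a frame moving with speed v along x, eq. (50)."""
    ux, uy, uz = u
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - ux * v / c**2
    return ((ux - v) / denom, uy / (gamma * denom), uz / (gamma * denom))


def speed(u):
    """Magnitude of a three-velocity."""
    return math.sqrt(sum(comp * comp for comp in u))


if __name__ == "__main__":
    # A light ray along y in K still travels at c in K' (units with c = 1).
    print(speed(transform_velocity((0.0, 1.0, 0.0), 0.6)))
    # Two speeds of 0.5c combine to 0.8c, not to c.
    print(transform_velocity((0.5, 0.0, 0.0), -0.5)[0])
    # Everyday speeds: the classical rule u' = u - v is recovered.
    print(transform_velocity((30.0, 0.0, 0.0), 10.0, c=3.0e8)[0])
```

The first case illustrates the second postulate, and the last one the classical limit v ≪ c.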
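$$\mathbf{r} = \mathbf{r}_\perp + \mathbf{r}_\parallel, \tag{52}$$
so that
$$\mathbf{r}\cdot\mathbf{v} = \mathbf{r}_\perp\cdot\mathbf{v} + \mathbf{r}_\parallel\cdot\mathbf{v} = r_\parallel v. \tag{53}$$
This way, only the time and the component $\mathbf{r}_\parallel$ will transform, so, according to
(35),
$$t' = \gamma\left(t - \frac{r_\parallel v}{c^2}\right), \quad
\mathbf{r}' = \mathbf{r}_\perp + \gamma\left(\mathbf{r}_\parallel - \mathbf{v}t\right). \tag{54}$$
By substituting $\mathbf{r}_\perp = \mathbf{r} - \mathbf{r}_\parallel$ into the above expression for $\mathbf{r}'$, we get
$$\mathbf{r}' = \mathbf{r} + (\gamma - 1)\,\mathbf{r}_\parallel - \gamma\mathbf{v}t. \tag{55}$$
Since $\mathbf{r}_\parallel$ and $\mathbf{v}$ are parallel, we have¹⁰
$$\mathbf{r}_\parallel = r_\parallel\frac{\mathbf{v}}{v} = \left(\frac{\mathbf{r}\cdot\mathbf{v}}{v}\right)\frac{\mathbf{v}}{v}, \tag{56}$$
and substituting now for $\mathbf{r}'$ gives
$$\mathbf{r}' = \mathbf{r} + (\gamma - 1)\left(\frac{\mathbf{r}\cdot\mathbf{v}}{v}\right)\frac{\mathbf{v}}{v} - \gamma\mathbf{v}t. \tag{57}$$
Factoring $\mathbf{v}$ in the above expression and substituting $r_\parallel v = \mathbf{r}\cdot\mathbf{v}$ in the expression
for t′ in (54), our transformation becomes
$$t' = \gamma\left(t - \frac{\mathbf{r}\cdot\mathbf{v}}{c^2}\right), \quad
\mathbf{r}' = \mathbf{r} + \left(\frac{\gamma - 1}{v^2}\,\mathbf{r}\cdot\mathbf{v} - \gamma t\right)\mathbf{v}, \tag{58}$$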
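which is the Lorentz transformation in 3 dimensions. We wish now to find the
transformation matrix. If we define
$$\boldsymbol{\beta} = \frac{\mathbf{v}}{c} \equiv \begin{pmatrix} \beta_x \\ \beta_y \\ \beta_z \end{pmatrix}
= \frac{1}{c}\begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} \tag{59}$$
and its transpose
$$\boldsymbol{\beta}^T = \frac{\mathbf{v}^T}{c} = (\beta_x \ \ \beta_y \ \ \beta_z) = \frac{1}{c}(v_x \ \ v_y \ \ v_z), \tag{60}$$
then we can rewrite (58) as
$$ct' = \gamma ct - \gamma\boldsymbol{\beta}^T\cdot\mathbf{r}, \quad
\mathbf{r}' = -\gamma\boldsymbol{\beta}\,ct + \left(I + (\gamma - 1)\frac{\boldsymbol{\beta}\boldsymbol{\beta}^T}{\beta^2}\right)\mathbf{r}, \tag{61}$$
where I is the 3 × 3 identity matrix, such that $I\mathbf{r} = \mathbf{r}$, and $\beta^2 = \beta_x^2 + \beta_y^2 + \beta_z^2$.
In block matrix form, this can be written as
$$\begin{pmatrix} ct' \\ \mathbf{r}' \end{pmatrix} =
\begin{pmatrix} \gamma & -\gamma\boldsymbol{\beta}^T \\ -\gamma\boldsymbol{\beta} & I + (\gamma - 1)\frac{\boldsymbol{\beta}\boldsymbol{\beta}^T}{\beta^2} \end{pmatrix}
\begin{pmatrix} ct \\ \mathbf{r} \end{pmatrix}, \tag{62}$$
10 Geometrically and algebraically, $\mathbf{v}/v$ is a dimensionless unit vector pointing in the same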
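or, if we define
$$\alpha_{ij} = (\gamma - 1)\frac{\beta_i\beta_j}{\beta^2}, \tag{63}$$
for i, j = x, y, z, then in a more explicitly stated way we have
$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix}
\gamma & -\gamma\beta_x & -\gamma\beta_y & -\gamma\beta_z \\
-\gamma\beta_x & 1 + \alpha_{xx} & \alpha_{xy} & \alpha_{xz} \\
-\gamma\beta_y & \alpha_{yx} & 1 + \alpha_{yy} & \alpha_{yz} \\
-\gamma\beta_z & \alpha_{zx} & \alpha_{zy} & 1 + \alpha_{zz}
\end{pmatrix}
\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{64}$$
Note that this is a transformation between two frames whose axes are parallel
and whose origins coincide. The most general Lorentz transformation also
contains a rotation of the three axes, since the composition of two boosts is not a
pure boost, but a boost followed by a rotation.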
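The matrix (64) can be built and tested numerically. The sketch below, in plain Python, constructs the boost for an arbitrary β and checks that it leaves the interval invariant, which in matrix language is the condition ΛᵀgΛ = g with g = diag(1, −1, −1, −1). The helper names (`boost_matrix`, `matmul`, `transpose`, `preserves_interval`) are assumptions made for this example.

```python
import math

METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def boost_matrix(beta):
    """4x4 matrix of eq. (64) for a boost with beta = (bx, by, bz) in units of c."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    gamma = 1.0 / math.sqrt(1.0 - b2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            alpha = (gamma - 1.0) * beta[i] * beta[j] / b2  # eq. (63)
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + alpha
    return L


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def preserves_interval(L, tol=1e-12):
    """True if L^T g L = g, i.e. the transformation keeps the interval invariant."""
    G = matmul(transpose(L), matmul(METRIC, L))
    return all(abs(G[i][j] - METRIC[i][j]) < tol for i in range(4) for j in range(4))


if __name__ == "__main__":
    print(preserves_interval(boost_matrix((0.3, -0.4, 0.5))))
    # Boost along x followed by a boost along y: still a Lorentz
    # transformation, but its matrix is no longer symmetric.
    P = matmul(boost_matrix((0.0, 0.5, 0.0)), boost_matrix((0.5, 0.0, 0.0)))
    print(preserves_interval(P), P[1][2], P[2][1])
```

Every pure boost matrix of the form (64) is symmetric, so the asymmetry of the product is a quick numerical sign that two non-collinear boosts do not compose into a pure boost.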
2 Tensor Calculus
2.1 Four-vectors
The coordinates (ct, x, y, z) of an event can be considered as the components of
a four-dimensional radius vector. We shall use the following notation:
$$x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z. \tag{65}$$
Note that the quantity
$$V_0 = V^0, \quad V_1 = -V^1, \quad V_2 = -V^2, \quad V_3 = -V^3. \tag{69}$$
In matrix form, the column vector $V^\mu$ is¹¹
11 $V^\mu$ and $V_\mu$ will be used to indicate either column and row vectors, respectively, or sets
of four components. To remember the matrix form of each four-vector, use the mnemonic
"upper indices go up to down; lower indices go left to right".
$$V^\mu = \begin{pmatrix} V^0 \\ V^1 \\ V^2 \\ V^3 \end{pmatrix}_{4\times 1}, \tag{70}$$
and, on the other hand,
$$(V^0)^2 - (V^1)^2 - (V^2)^2 - (V^3)^2, \tag{72}$$
which, according to (69), can be written as
$$V_0V^0 + V_1V^1 + V_2V^2 + V_3V^3 = \sum_{\mu=0}^{3} V_\mu V^\mu. \tag{73}$$
From now on, we will use Einstein summation convention, in which one sums
over any repeated index (also called summing index or dummy index, and one
is always contravariant and the other covariant), and omits the summation sign,
remembering that Greek letters run from 0 to 3 and Latin letters from 1 to 3.
This way, we have
$$V_\mu V^\mu = V_0V^0 + V_1V^1 + V_2V^2 + V_3V^3 \tag{74}$$
as the expression for the square magnitude of a four-vector. Analogously, the
Lorentz scalar product of two different four-vectors is given by¹²
$$V_\mu U^\mu = V_0U^0 + V_1U^1 + V_2U^2 + V_3U^3, \tag{75}$$
which is invariant under rotations of the four-dimensional coordinate system.
Just like the interval between two events, this scalar product can be positive
(timelike vectors), negative (spacelike) or zero (null or lightlike). In particular,
the interval, which can be written as
$$ds^2 = dx_\mu dx^\mu = dx_0dx^0 + dx_1dx^1 + dx_2dx^2 + dx_3dx^3, \tag{76}$$
is invariant, as stated before.
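The index lowering rule (69) and the scalar product (75) translate directly into code. The following is a minimal sketch with the repeated-index sum written out explicitly; `lower`, `dot` and `boost_x4` are illustrative names, and the boost is assumed to be along the x-axis.

```python
import math

SIGNS = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+,-,-,-)


def lower(V):
    """Covariant components V_mu obtained from contravariant V^mu, eq. (69)."""
    return [s * comp for s, comp in zip(SIGNS, V)]


def dot(V, U):
    """Lorentz scalar product V_mu U^mu, eq. (75): a sum over the repeated index."""
    return sum(v_low * u_up for v_low, u_up in zip(lower(V), U))


def boost_x4(V, beta):
    """Boost of a contravariant four-vector along x with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    V0, V1, V2, V3 = V
    return [gamma * (V0 - beta * V1), gamma * (V1 - beta * V0), V2, V3]


if __name__ == "__main__":
    A = [2.0, 1.0, 0.0, 3.0]
    B = [1.0, -2.0, 4.0, 0.5]
    print(dot(A, B), dot(boost_x4(A, 0.8), boost_x4(B, 0.8)))  # same scalar in both frames
    print(dot([5.0, 3.0, 0.0, 0.0], [5.0, 3.0, 0.0, 0.0]))    # positive: timelike
    print(dot([1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]))    # zero: lightlike
```

The individual components change under the boost, but the contracted quantity does not, which is the practical meaning of "Lorentz scalar".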
The three space components of the four-vector $V^\mu$ form the three-dimensional
vector $\mathbf{V}$, so we will use the notation
$$V_\mu V^\mu = (V^0)^2 - (\mathbf{V})^2. \tag{78}$$
12 Note that the expression $V^\mu U_\mu$ is equal to $V_\mu U^\mu$ when there is a sum over µ, but only
the latter gives this sum when it comes to matrix multiplication.
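In particular, we have
$$\frac{\partial x'^0}{\partial x^0} = \gamma, \quad \frac{\partial x'^0}{\partial x^1} = -\beta\gamma, \quad \ldots, \tag{82}$$
so that $\partial x'^\mu/\partial x^\nu$ are the entries of our transformation matrix¹³. We define
$$\Lambda^\mu{}_\nu \equiv \frac{\partial x'^\mu}{\partial x^\nu}, \tag{83}$$
and hence we can write
$$\Lambda^\mu{}_\nu = \begin{pmatrix}
\gamma & -\beta\gamma & 0 & 0 \\
-\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \tag{84}$$
The Lorentz transformation for the four-dimensional radius vector, as in (81), can
now be written as
$$x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu, \tag{85}$$
or simply
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{86}$$
using Einstein's convention. In conclusion, by definition, a contravariant
four-vector is a set of four quantities $V^\mu$ which transform according to¹⁴
$$V'^\mu = \Lambda^\mu{}_\nu V^\nu. \tag{87}$$
Remember now that, if $\phi = \phi(x^1, \ldots, x^n)$ is a scalar function, then the differential
of φ is given by
13 It is also important to note that
$\partial x^\mu/\partial x^\nu = \partial x'^\mu/\partial x'^\nu = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases}$. It turns
out that the quantity on the right-hand side is a very special kind of four-dimensional tensor,
which will be defined later.
14 Equation (87) makes sense either as matrix multiplication or as a system of equations
when µ, ν = 0, 1, 2, 3.
$$d\phi = \sum_{\mu=1}^{n} \frac{\partial\phi}{\partial x^\mu}dx^\mu = \frac{\partial\phi}{\partial x^\mu}dx^\mu = \partial_\mu\phi\,dx^\mu, \tag{88}$$
where
$$\partial_\mu\phi \equiv \frac{\partial\phi}{\partial x^\mu}. \tag{89}$$
This partial derivative transforms as some sort of vector, but not as a contravariant
one. From the chain rule, we know that
$$\partial'_\mu\phi = \frac{\partial\phi}{\partial x'^\mu} = \frac{\partial\phi}{\partial x^\nu}\frac{\partial x^\nu}{\partial x'^\mu} = \partial_\nu\phi\,\frac{\partial x^\nu}{\partial x'^\mu}. \tag{90}$$
This new transformation is, by definition, the transformation of a covariant
four-vector $V_\mu$:
$$V'_\mu = V_\nu\frac{\partial x^\nu}{\partial x'^\mu}. \tag{91}$$
In the case of the Lorentz transformation, $\partial x^\nu/\partial x'^\mu$ are the components of the
inverse transformation matrix, and we write
$$\frac{\partial x^\nu}{\partial x'^\mu} = \left(\Lambda^{-1}\right)^\nu{}_\mu, \tag{92}$$
so that the transformation of a covariant four-vector can be written as¹⁵
$$V'_\mu = V_\nu\left(\Lambda^{-1}\right)^\nu{}_\mu. \tag{93}$$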
2.2 Four-tensors
Either kind of vector is an example of a more general object called a tensor,
which has a linear transformation rule. The simplest kind of tensor S is the one
unchanged under transformation, that is, S′ = S, and this is the characteristic of
a scalar. The rank is defined as the number of indices carried. This way, scalars are
tensors of rank 0 and vectors are tensors of rank 1.
A four-dimensional tensor of the second rank $V^{\mu\nu}$, also called a four-tensor, is
a set of 4 × 4 = 16 quantities which transform like the products of components
of two four-vectors under coordinate transformations. It is worth recalling that
the transformation of a contravariant four-vector is given by
$$V'^\mu = \Lambda^\mu{}_\nu V^\nu, \tag{94}$$
and a covariant one transforms like
$$V'_\mu = V_\nu\left(\Lambda^{-1}\right)^\nu{}_\mu, \tag{95}$$
15 Note that it makes sense to write $V'_\mu = \left(\Lambda^{-1}\right)^\nu{}_\mu V_\nu$ as a relation between the components, but not as
matrix multiplication.
and therefore, by definition, our tensor of rank 2 transforms like
$$V'^{\mu\nu} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta V^{\alpha\beta}. \tag{96}$$
The components of a second-rank tensor, however, can be written as $V^{\mu\nu}$
(contravariant), $V_{\mu\nu}$ (covariant) or $V^\mu{}_\nu$ (mixed). Therefore, the contravariant one
transforms like (96), and the covariant and mixed ones transform, respectively,
like
$$V'_{\mu\nu} = V_{\alpha\beta}\left(\Lambda^{-1}\right)^\alpha{}_\mu\left(\Lambda^{-1}\right)^\beta{}_\nu \tag{97}$$
and
$$V'^\mu{}_\nu = \Lambda^\mu{}_\alpha V^\alpha{}_\beta\left(\Lambda^{-1}\right)^\beta{}_\nu. \tag{98}$$
Raising or lowering a space index (1, 2, 3) changes the sign of the component, and
raising or lowering the time index (0) does not change the sign, so that
$$V^{\mu\nu} = V^{\nu\mu}, \tag{100}$$
then the tensor $V^{\mu\nu}$ is called symmetric. Similarly, a tensor is called
antisymmetric or skew-symmetric if¹⁶
$$V^{\mu\nu} = -V^{\nu\mu}. \tag{101}$$
Clearly, the diagonal components $V^{\mu\mu}$ (no sum here) of an antisymmetric tensor
are zero since $V^{\mu\mu} = -V^{\mu\mu}$. For a symmetric mixed tensor, we have $V^\mu{}_\nu = V_\nu{}^\mu$,
so that we will simply write $V^\mu_\nu$.
From a mixed tensor $V^\mu{}_\nu$, we can form a scalar by an operation called
contraction:
$$V^\mu{}_\mu = V^0{}_0 + V^1{}_1 + V^2{}_2 + V^3{}_3. \tag{102}$$
This scalar is called the trace of the tensor. Note that the formation of a scalar
product of two vectors is also a contraction operation.
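To see that the contraction (102) really produces a scalar, one can transform a mixed tensor with the rule (98) and compare traces. The sketch below assumes a boost along the x-axis, for which the inverse matrix is simply the boost with the opposite velocity; the function names and the sample components are arbitrary choices for illustration.

```python
import math


def boost_x_matrix(beta):
    """Matrix (84) for a boost along x with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_mixed(V, beta):
    """V'^mu_nu = Lambda^mu_alpha V^alpha_beta (Lambda^-1)^beta_nu, eq. (98)."""
    L = boost_x_matrix(beta)
    Linv = boost_x_matrix(-beta)  # the inverse boost has the opposite velocity
    return [[sum(L[m][a] * V[a][b] * Linv[b][n] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]


def trace(V):
    """Contraction V^mu_mu, eq. (102)."""
    return sum(V[m][m] for m in range(4))


if __name__ == "__main__":
    V = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 17.0]]
    Vp = transform_mixed(V, 0.6)
    print(V[0][0], Vp[0][0])     # individual components change
    print(trace(V), trace(Vp))   # the contraction does not
```

In matrix language the rule (98) is a similarity transformation, so the invariance of the trace is the familiar statement that similar matrices have equal traces.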
We define similarly four-tensors of higher rank. For example, a fourth-rank
mixed tensor $V^{\mu\nu}{}_{\alpha\beta}$ is a set of $4^4 = 256$ quantities which transform according
to
$$V'^{\mu\nu}{}_{\rho\sigma} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta V^{\alpha\beta}{}_{\gamma\delta}\left(\Lambda^{-1}\right)^\gamma{}_\rho\left(\Lambda^{-1}\right)^\delta{}_\sigma. \tag{103}$$
From a tensor with at least one contravariant and one covariant index,
one can perform a contraction as before, and each contraction decreases
the rank of the tensor by 2. For instance, examples of contractions of the
16 The definition is the same if $V_{\mu\nu} = \pm V_{\nu\mu}$, $V^\mu{}_\nu = \pm V_\nu{}^\mu$, etc. Note that the matrices
associated to these tensors are symmetric/antisymmetric themselves.
fourth-rank tensor $V^{\mu\nu}{}_{\alpha\beta}$ are the second-rank tensors $V^{\mu\nu}{}_{\mu\beta}$ and $V^{\mu\beta}{}_{\alpha\beta}$, or
even the scalar $V^{\mu\nu}{}_{\mu\nu}$.
In a tensor equation, the two sides must contain identical free indices of
the same type (contravariant or covariant). For example, $V^\mu{}_\nu = U^\mu W_\nu$ makes
sense, while $V^\mu = U_\mu$ does not. The repeated indices may be replaced by any
other Greek or Latin letter (and remember that they are of different types).
For example, $V^{\mu\nu}{}_{\mu\nu} = V^{\nu\mu}{}_{\nu\mu} = V^{\alpha\beta}{}_{\alpha\beta}$, while $V_\mu U^\mu$ and $V_iU^i$ are completely
different expressions.
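$$\frac{\partial x^\mu}{\partial x^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \delta^\mu_\nu. \tag{106}$$
Also, remembering that repeated indices are summed, one should note that
$$\delta^\mu_\nu V^\nu = V^\mu \tag{107}$$
and
$$\delta^\mu_\nu V_\mu = V_\nu, \tag{108}$$
so the transformation law for $\delta^\mu_\nu$ will be
$$\delta'^\mu{}_\nu = \Lambda^\mu{}_\alpha\,\delta^\alpha_\beta\left(\Lambda^{-1}\right)^\beta{}_\nu = \Lambda^\mu{}_\alpha\left(\Lambda^{-1}\right)^\alpha{}_\nu = \delta^\mu_\nu, \tag{109}$$
since
$$\Lambda^\mu{}_\alpha\left(\Lambda^{-1}\right)^\alpha{}_\nu = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \delta^\mu_\nu, \tag{110}$$
and so $\delta'^\mu{}_\nu = \delta^\mu_\nu$, and it is therefore an invariant tensor.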
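By raising the one index or lowering the other in $\delta^\mu_\nu$, we can define the metric
tensors $g_{\mu\nu} \equiv \delta_{\mu\nu}$ and $g^{\mu\nu} \equiv \delta^{\mu\nu}$. Considering the relations in (88), it is trivial
that
$$g_{\mu\nu} = g^{\mu\nu} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}. \tag{111}$$
We have then¹⁷
$$g_{\mu\nu}V^\nu = V_\mu, \quad g^{\mu\nu}V_\nu = V^\mu, \tag{112}$$
so that the metric tensor $g_{\mu\nu}$ can be used to lower an index and $g^{\mu\nu}$ can be used
to raise an index.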
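The completely antisymmetric unit tensor of fourth rank, $\epsilon^{\mu\nu\rho\sigma}$, is the tensor
whose components change sign under interchange of any pair of indices, and
whose nonzero components are ±1. Since $\epsilon^{\mu\nu\rho\sigma}$ is antisymmetric, it vanishes if
two indices are the same. We set
$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\beta\rho\sigma} = -2\left(\delta^\mu_\alpha\delta^\nu_\beta - \delta^\mu_\beta\delta^\nu_\alpha\right), \quad
\epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\nu\rho\sigma} = -6\,\delta^\mu_\alpha. \tag{116}$$
Also, the product $\epsilon_{ijk}\epsilon_{lmn}$, which is a true three-dimensional tensor of rank 6,
is given by
$$\epsilon_{ijk}\epsilon_{lmk} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}, \quad
\epsilon_{ijk}\epsilon_{ljk} = 2\delta_{il}, \quad
\epsilon_{ijk}\epsilon_{ijk} = 6. \tag{118}$$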
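The contraction identities (116) and (118) can be verified by brute force over all index values. The sketch below assumes the normalisation $\epsilon^{0123} = +1$ (so that lowering all four indices with the metric (111) gives $\epsilon_{0123} = -1$) and treats the three-dimensional symbol as Euclidean; both are assumptions of this example, since the defining equation is not reproduced in the passage above.

```python
from itertools import product

SIGN = (1, -1, -1, -1)  # diagonal of the metric (111)


def parity(p):
    """+1 for an even permutation of 0..n-1, -1 for an odd one."""
    sign, p = 1, list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def eps(indices):
    """Levi-Civita symbol: 0 for repeated indices, otherwise the permutation parity."""
    if len(set(indices)) < len(indices):
        return 0
    return parity(indices)


def eps_up(m, n, r, s):
    """epsilon^{mu nu rho sigma}, assuming epsilon^{0123} = +1."""
    return eps((m, n, r, s))


def eps_down(a, b, r, s):
    """epsilon_{alpha beta rho sigma}: every lowered index contributes a metric sign."""
    return SIGN[a] * SIGN[b] * SIGN[r] * SIGN[s] * eps((a, b, r, s))


def delta(i, j):
    return 1 if i == j else 0


def contract_two(m, n, a, b):
    """Left-hand side of the first identity in (116)."""
    return sum(eps_up(m, n, r, s) * eps_down(a, b, r, s)
               for r in range(4) for s in range(4))


def check_identities():
    """Check the first identity of (116) and the first identity of (118)."""
    ok4 = all(
        contract_two(m, n, a, b)
        == -2 * (delta(m, a) * delta(n, b) - delta(m, b) * delta(n, a))
        for m, n, a, b in product(range(4), repeat=4)
    )
    ok3 = all(
        sum(eps((i, j, k)) * eps((l, m, k)) for k in range(3))
        == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
        for i, j, l, m in product(range(3), repeat=4)
    )
    return ok4 and ok3


if __name__ == "__main__":
    print(check_identities())
```

The overall minus signs in (116) come entirely from the three space indices that are lowered, which is why they are absent from the purely three-dimensional identities (118).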
2.4 Differentiation
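We define the four-vector operator $\partial_\mu$ as
$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left(\frac{\partial}{\partial x^0}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right). \tag{119}$$
Using previous notation, we can write
$$\partial_\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right) \tag{120}$$
and
$$\partial^\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\nabla\right), \tag{121}$$
where
$$\nabla \equiv \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right). \tag{122}$$
Let φ be a scalar function. The four-gradient of φ is the four-vector given by
$$\partial_\mu\phi = \left(\frac{1}{c}\frac{\partial\phi}{\partial t}, \nabla\phi\right). \tag{123}$$
Using this definition, the differential of the scalar φ, which is given by
$$d\phi = \frac{\partial\phi}{\partial x^\mu}dx^\mu, \tag{124}$$
is a scalar, given by the Lorentz scalar product of two four-vectors. Let now
$V^\mu = (V^0, V^1, V^2, V^3) = (V^0, \mathbf{V})$ be a four-vector, then
$$\partial_\mu V^\mu = \frac{1}{c}\frac{\partial V^0}{\partial t} + \frac{\partial V^1}{\partial x} + \frac{\partial V^2}{\partial y} + \frac{\partial V^3}{\partial z}
= \frac{1}{c}\frac{\partial V^0}{\partial t} + \nabla\cdot\mathbf{V} = \partial^\mu V_\mu. \tag{125}$$
In particular, the operator $\Box = \partial_\mu\partial^\mu = \partial^\mu\partial_\mu$ is given by
$$\Box \equiv \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2, \tag{126}$$
also known as the D'Alembertian.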
3 Relativistic Mechanics
3.1 Four-velocity and Four-acceleration
The ordinary three-dimensional velocity is given by
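$$\mathbf{v} = \frac{d\mathbf{r}}{dt} \tag{127}$$
or
$$v^i = \frac{dx^i}{dt}, \quad i = 1, 2, 3. \tag{128}$$
From this, one can form a four-vector. Since $dx^\mu$ is a four-vector and the
quantity dτ (not dt) is a scalar, we can define
$$U^\mu = \frac{dx^\mu}{d\tau}. \tag{129}$$
From the chain rule, it follows that
$$U^\mu = \frac{dx^\mu}{dt}\frac{dt}{d\tau}. \tag{130}$$
Since dt/dτ = γ, we have
$$U^\mu = \gamma\frac{dx^\mu}{dt}, \tag{131}$$
but since $x^\mu = (ct, \mathbf{r})$, we have
$$\frac{dx^\mu}{dt} = \frac{d}{dt}(ct, \mathbf{r}) = (c, \mathbf{v}), \tag{132}$$
so that
$$U^\mu = (U^0, \mathbf{U}) = (\gamma c, \gamma\mathbf{v}) \tag{133}$$
and therefore
$$U_\mu = (U^0, -\mathbf{U}) = (\gamma c, -\gamma\mathbf{v}). \tag{134}$$
The contraction $U_\mu U^\mu$ must be a scalar. In fact,
$$U^\mu U_\mu = \gamma^2c^2 - \gamma^2v^2 = \gamma^2c^2\left(1 - \frac{v^2}{c^2}\right) = \gamma^2c^2\frac{1}{\gamma^2}, \tag{135}$$
or simply
$$U^\mu U_\mu = c^2. \tag{136}$$
Geometrically, $U^\mu$ is a four-vector tangent to the world line of the particle. In
a similar way, one can define the four-acceleration as
$$A^\mu = \frac{d^2x^\mu}{d\tau^2} = \frac{dU^\mu}{d\tau} \tag{137}$$
and, analogously,
$$A^\mu = \frac{dU^\mu}{dt}\frac{dt}{d\tau} = \gamma\frac{d}{dt}(\gamma c, \gamma\mathbf{v})
= \left(\gamma c\frac{d\gamma}{dt}, \gamma\frac{d\gamma}{dt}\mathbf{v} + \gamma^2\mathbf{a}\right) \tag{138}$$
or
$$A^\mu = \left(\gamma\dot\gamma c, \gamma\dot\gamma\mathbf{v} + \gamma^2\mathbf{a}\right), \tag{139}$$
where $\mathbf{a} = d\mathbf{v}/dt$ is the ordinary three-dimensional acceleration of the particle.
One may evaluate $\dot\gamma$ and find
$$\dot\gamma = \frac{d}{dt}\left(1 - \frac{v^2}{c^2}\right)^{-\frac{1}{2}} = \frac{\mathbf{a}\cdot\mathbf{v}}{c^2}\gamma^3, \tag{140}$$
so that
$$A^\mu = \left(\gamma^4\frac{\mathbf{a}\cdot\mathbf{v}}{c}, \gamma^4\frac{\mathbf{a}\cdot\mathbf{v}}{c^2}\mathbf{v} + \gamma^2\mathbf{a}\right). \tag{141}$$
Finally, differentiating (136) with respect to τ, we find
$$U_\mu A^\mu = 0, \tag{142}$$
and this means that the four-velocity and four-acceleration are mutually
perpendicular in our four-dimensional space.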
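The results (136) and (142) are easy to confirm numerically from the explicit components (133) and (141). The following is a minimal sketch; the function names are illustrative and the sample velocity and acceleration are arbitrary.

```python
import math


def minkowski_dot(a, b):
    """Scalar product a_mu b^mu with signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def four_velocity(v, c=1.0):
    """U^mu = (gamma c, gamma v), eq. (133)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)
    return [gamma * c] + [gamma * x for x in v]


def four_acceleration(v, a, c=1.0):
    """A^mu from eq. (141), given the three-velocity v and three-acceleration a."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c**2)
    av = sum(x * y for x, y in zip(a, v))
    time_part = gamma**4 * av / c
    space_part = [gamma**4 * av / c**2 * vi + gamma**2 * ai for vi, ai in zip(v, a)]
    return [time_part] + space_part


if __name__ == "__main__":
    v = (0.3, 0.4, 0.1)   # in units of c
    a = (1.0, -2.0, 0.5)  # arbitrary three-acceleration
    U = four_velocity(v)
    A = four_acceleration(v, a)
    print(minkowski_dot(U, U))  # c^2 (here c = 1)
    print(minkowski_dot(U, A))  # zero up to rounding
```

Whatever three-acceleration is supplied, the time component of (141) adjusts so that the four-acceleration stays orthogonal to the four-velocity.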
where L is the Lagrange function of the mechanical system, then using the
results in (11) and (18) we can write
$$S = -\int_{t_1}^{t_2}\frac{\alpha c}{\gamma}\,dt, \tag{145}$$
and comparing with (144), the Lagrangian of the free particle is
$$L = -\frac{\alpha c}{\gamma} = -\alpha c\sqrt{1 - v^2/c^2}. \tag{146}$$
The constant α characterises the particle, but in classical mechanics each particle
is characterised by its mass m. To find a relation between α and m, we
note that in the limit c → ∞ we must recover the classical expression $L = mv^2/2$.
We can then expand L in powers of v/c and keep the lowest-order terms,
$$L \approx -\alpha c + \frac{\alpha v^2}{2c}, \tag{147}$$
and note that constant terms do not affect the equation of motion, so that −αc
can be omitted. Comparing with $L = mv^2/2$, we have
$$\alpha = mc, \tag{148}$$
so that
$$S = -mc\int_a^b ds \tag{149}$$
and
$$L = -\frac{mc^2}{\gamma}. \tag{150}$$
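$$\mathbf{p} = \gamma m\mathbf{v}. \tag{153}$$
One should note that if v ≪ c or c → ∞ then γ ≈ 1, so that we recover $\mathbf{p} = m\mathbf{v}$.
Also, if v → c then $|\mathbf{p}| \to \infty$. The force acting on the particle is given
by $d\mathbf{p}/dt$. If one supposes that the force is directed perpendicular to $\mathbf{v}$, so that
$v^2$ is a constant, then
$$\frac{d\mathbf{p}}{dt} = \gamma m\frac{d\mathbf{v}}{dt}. \tag{154}$$
On the other hand, if the force is parallel to $\mathbf{v}$, then the velocity changes only
in magnitude, so that the unit vector $\hat{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$ is constant. If we write
$$\mathbf{p} = \frac{mv}{\sqrt{1 - \frac{v^2}{c^2}}}\,\hat{\mathbf{v}}, \tag{155}$$
then
$$\frac{d\mathbf{p}}{dt} = m\frac{d}{dt}\left(\frac{v}{\sqrt{1 - \frac{v^2}{c^2}}}\right)\hat{\mathbf{v}}
= m\left[\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\frac{dv}{dt} + v\left(-\frac{1}{2}\right)\left(1 - \frac{v^2}{c^2}\right)^{-3/2}\left(-\frac{2v}{c^2}\right)\frac{dv}{dt}\right]\hat{\mathbf{v}} \tag{156}$$
or simply
$$\frac{d\mathbf{p}}{dt} = m\frac{dv}{dt}\left(\gamma + \frac{v^2}{c^2}\gamma^3\right)\hat{\mathbf{v}}
= \gamma^3m\mathbf{a}\left(\frac{1}{\gamma^2} + \frac{v^2}{c^2}\right) = \gamma^3m\mathbf{a}, \tag{157}$$
and this means that the ratio of the force to acceleration is different in the two
cases. The energy E of the particle is the quantity
$$E = \mathbf{p}\cdot\mathbf{v} - L, \tag{158}$$
and using the expressions for L and $\mathbf{p}$, we find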
$$E = \gamma mv^2 + \frac{mc^2}{\gamma} = \gamma mc^2\left(\frac{v^2}{c^2} + \frac{1}{\gamma^2}\right), \tag{159}$$
or simply
$$E = \gamma mc^2. \tag{160}$$
This expression shows that if v = 0 then the energy of the free particle is
$E = mc^2$, which is defined as the rest energy. Also, for small velocities v/c ≪ 1 one
can expand the expression for the energy and find
$$E = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}} \approx mc^2 + \frac{mv^2}{2}, \tag{161}$$
and this result was expected since the term $mv^2/2$ is the classical expression
for the kinetic energy of the particle. Squaring now equations (153) and (160),
we have, respectively, $p^2 = \gamma^2m^2v^2$ and $E^2 = \gamma^2m^2c^4$. Comparing these two
equations, we have the relation between the energy and the momentum of the
particle,
$$\frac{E^2}{c^2} = p^2 + m^2c^2. \tag{162}$$
The energy expressed as a function of the momentum is called the Hamiltonian H
of the system, so that in our case
$$H = \sqrt{p^2c^2 + m^2c^4}. \tag{163}$$
Note that, if p ≪ mc, then the Hamiltonian is approximately given by
$$H \approx mc^2 + \frac{p^2}{2m}, \tag{164}$$
which, except for the rest energy, is the classical expression for the Hamiltonian.
Knowing now that $\mathbf{p} = \gamma m\mathbf{v}$ and $E = \gamma mc^2$, we find the relation between
the energy, velocity and momentum of the particle,
$$\mathbf{p} = \frac{E}{c^2}\mathbf{v}. \tag{165}$$
From the equations for the momentum and energy, if v = c then both of them
are infinite, so that a particle with mass different from zero cannot move with
velocity v = c. On the other hand, from the expression relating the momentum
and energy above, particles of zero mass can exist and for such particles we have
$$p = \frac{E}{c}. \tag{166}$$
In four-dimensional form, according to the principle of least action we have
$$\delta S = -mc\,\delta\int_a^b ds = -mc\,\delta\int_a^b\sqrt{dx_\mu dx^\mu} = 0 \tag{167}$$
$$\frac{dU_\mu}{ds} = 0. \tag{171}$$
Now, let us consider the point a as fixed, so that $(\delta x^\mu)_a = 0$, and point b as
variable, so that we find
$$P^\mu = mU^\mu, \tag{176}$$
where $U^\mu$ is the four-velocity of the particle. The expression $P_\mu P^\mu$ must be a
scalar. In fact, we have
$$P_\mu P^\mu = m^2U_\mu U^\mu = m^2c^2. \tag{177}$$
The force four-vector or four-force is, by analogy, defined as the derivative
$$F^\mu = \frac{dp^\mu}{ds} = mc\frac{dU^\mu}{ds}, \tag{178}$$
and its components satisfy $F_\mu U^\mu = 0$. In terms of the three-dimensional force
$\mathbf{f} = d\mathbf{p}/dt$, this can be written as
$$F^\mu = \left(\frac{\gamma\,\mathbf{f}\cdot\mathbf{v}}{c^2}, \frac{\gamma\mathbf{f}}{c}\right), \tag{179}$$
where the time component is related to the work done by the force.
4 Charges in Electromagnetic Fields
4.1 Four-potential of a Field
In an electromagnetic field, the action function of a particle is given by the action
$S = -mc\int_a^b ds$ for the free particle and a term describing the interaction of the
particle with the field, determined by the charge of the particle. The properties
of the field are characterised by the four-potential $A^\mu$. The components of
this four-vector are functions of the spatial coordinates and time. The space
components of $A^\mu$ form the three-dimensional vector $\mathbf{A}$, called vector potential,
and the time component is denoted as $A^0 \equiv \phi$, called scalar potential. This way,
we have
$$S = \int_a^b\left(-mc\,ds - \frac{q}{c}A_\mu dx^\mu\right) = \int_a^b\left(-mc\,ds + \frac{q}{c}\mathbf{A}\cdot d\mathbf{r} - q\phi\,dt\right), \tag{182}$$
using the expression for $A^\mu$ above and for the infinitesimal four-dimensional
radius vector $dx^\mu = (c\,dt, d\mathbf{r})$. Substituting now $d\mathbf{r} = \mathbf{v}\,dt$ and $ds = c\,d\tau = c\,dt/\gamma$
above, we can change the integral above to an integration over t and obtain
$$S = \int_{t_1}^{t_2}\left(-\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi\right)dt, \tag{183}$$
and so the Lagrangian for a charge in an electromagnetic field is given by
$$L = -\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi. \tag{184}$$
One can now find the components of the generalised momentum of the particle,
$$P_i = \frac{\partial L}{\partial v^i} = -mc^2\,\frac{1}{2}\left(1 - \frac{v^2}{c^2}\right)^{-\frac{1}{2}}\left(-\frac{2v^i}{c^2}\right) + \frac{q}{c}A_i, \tag{185}$$
or, in other words,
$$\mathbf{P} = \gamma m\mathbf{v} + \frac{q}{c}\mathbf{A} = \mathbf{p} + \frac{q}{c}\mathbf{A}. \tag{186}$$
The Hamiltonian function can be found using the expression $H = \mathbf{v}\cdot\frac{\partial L}{\partial\mathbf{v}} - L$,
so that for a particle in a field we have
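$$H = \mathbf{v}\cdot\left(\gamma m\mathbf{v} + \frac{q}{c}\mathbf{A}\right) + \frac{mc^2}{\gamma} - \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + q\phi
= \gamma mv^2 + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + \frac{mc^2}{\gamma} - \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + q\phi \tag{187}$$
or simply
$$L = -\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi. \tag{191}$$
As seen before, we have
$$\frac{\partial L}{\partial\mathbf{v}} = \mathbf{P} = \gamma m\mathbf{v} + \frac{q}{c}\mathbf{A} \tag{192}$$
and furthermore
$$\frac{\partial L}{\partial\mathbf{r}} = \nabla L = \frac{q}{c}\nabla(\mathbf{A}\cdot\mathbf{v}) - q\nabla\phi. \tag{193}$$
Using for $\mathbf{A}$ and $\mathbf{v}$ the identity
$\nabla(\mathbf{a}\cdot\mathbf{b}) = (\mathbf{a}\cdot\nabla)\mathbf{b} + (\mathbf{b}\cdot\nabla)\mathbf{a} + \mathbf{b}\times(\nabla\times\mathbf{a}) + \mathbf{a}\times(\nabla\times\mathbf{b})$
for arbitrary vectors $\mathbf{a}$ and $\mathbf{b}$, we find
$$\frac{\partial L}{\partial\mathbf{r}} = \frac{q}{c}\left[(\mathbf{A}\cdot\nabla)\mathbf{v} + (\mathbf{v}\cdot\nabla)\mathbf{A} + \mathbf{v}\times(\nabla\times\mathbf{A}) + \mathbf{A}\times(\nabla\times\mathbf{v})\right] - q\nabla\phi. \tag{194}$$
Now, note that $\mathbf{v}$ is constant in this differentiation, so that we simply have
$$\frac{\partial L}{\partial\mathbf{r}} = \frac{q}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}) - q\nabla\phi. \tag{195}$$
This way, the Lagrange equation in (190) becomes
$$\frac{d}{dt}\left(\mathbf{p} + \frac{q}{c}\mathbf{A}\right) - \frac{q}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} - \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}) + q\nabla\phi = 0. \tag{196}$$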
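The components of the vector potential $\mathbf{A}$ are functions of the spatial
coordinates and time, so that
$$d\mathbf{A} = \frac{\partial\mathbf{A}}{\partial t}dt + \frac{\partial\mathbf{A}}{\partial\mathbf{r}}d\mathbf{r}, \tag{197}$$
or
$$d\mathbf{A} = \frac{\partial\mathbf{A}}{\partial t}dt + (d\mathbf{r}\cdot\nabla)\mathbf{A}, \tag{198}$$
which gives
$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A}. \tag{199}$$
Finally, equation (196) gives us
$$\frac{d\mathbf{p}}{dt} = -\frac{q}{c}\frac{\partial\mathbf{A}}{\partial t} - q\nabla\phi + \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}). \tag{200}$$
The derivative of the momentum with respect to time, on the left-hand side
of the above equation, is the force exerted on the charge in an electromagnetic
field. The terms $(q/c)\,\partial\mathbf{A}/\partial t$ and $q\nabla\phi$ do not depend on $\mathbf{v}$. We denote this
force per unit charge as the electric field intensity $\mathbf{E}$, so that
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi. \tag{201}$$
The term $(q/c)\,\mathbf{v}\times(\nabla\times\mathbf{A})$ depends on the velocity and is proportional and
perpendicular to it. The factor of $\mathbf{v}/c$ per unit charge is called magnetic field
intensity $\mathbf{B}$, so that
$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{202}$$
Using these definitions, we can write the equation of motion of a charge in an
electromagnetic field as
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{203}$$
whose right-hand side is called the Lorentz force.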
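$$A'_\mu = A_\mu - \partial_\mu\chi. \tag{204}$$
With this change, there appears in the action integral the term
$$\frac{q}{c}\frac{\partial\chi}{\partial x^\mu}dx^\mu = d\left(\frac{q}{c}\chi\right), \tag{205}$$
and the last term is a total differential and hence it has no effect on the equations
of motion. Using the vector and scalar potentials, the transformation in (204)
is the same as
$$\mathbf{A}' = \mathbf{A} + \nabla\chi, \quad \phi' = \phi - \frac{1}{c}\frac{\partial\chi}{\partial t}. \tag{206}$$
This way, the field $\mathbf{B} = \nabla\times\mathbf{A}$ does not change, since $\nabla\times(\nabla\chi) = 0$ for any
well-behaved scalar field χ, and the field $\mathbf{E} = -(1/c)\,\partial\mathbf{A}/\partial t - \nabla\phi$ does not change either,
since the two terms containing χ cancel. This way, the potentials are not uniquely defined. The
transformation in (206) is called a gauge transformation. As an example, it is
always possible to choose the potentials so that the scalar potential φ is zero.
$$\phi = -\mathbf{E}\cdot\mathbf{r} \tag{208}$$
since constant $\mathbf{E}$ implies $\nabla(\mathbf{E}\cdot\mathbf{r}) = (\mathbf{E}\cdot\nabla)\mathbf{r} = \mathbf{E}$. On the other hand, the vector
potential can be expressed as
$$\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r}, \tag{209}$$
since constant $\mathbf{B}$ implies $\nabla\times(\mathbf{B}\times\mathbf{r}) = \mathbf{B}(\nabla\cdot\mathbf{r}) - (\mathbf{B}\cdot\nabla)\mathbf{r} = 3\mathbf{B} - \mathbf{B} = 2\mathbf{B}$.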
field $\mathbf{E}$. The direction of $\mathbf{E}$ can be taken along the x-axis. Now, we know that
the equation of motion is
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{210}$$
so that in our case the equation of motion becomes only
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E}, \tag{211}$$
which is a set of two equations $dp_x/dt = qE$ and $dp_y/dt = 0$. Solving these
differential equations we have, respectively,
$$p_x = qEt, \quad p_y = p_0, \tag{212}$$
where the time reference point has been chosen at the moment when $p_x = 0$, and
$p_0$ is the momentum of the particle at that moment. The kinetic energy, which
is given by $E_k = \sqrt{p^2c^2 + m^2c^4}$, will be, in our case,
$$E_k = \sqrt{p_0^2c^2 + (cqEt)^2 + m^2c^4} = \sqrt{E_0^2 + (cqEt)^2}, \tag{213}$$
where
$$E_0 = \sqrt{p_0^2c^2 + m^2c^4} \tag{214}$$
is the energy at time t = 0. Since the velocity of the particle is given by
$$\mathbf{v} = \frac{\mathbf{p}c^2}{E_k}, \tag{215}$$
we have
$$v_x = \frac{dx}{dt} = \frac{p_xc^2}{E_k} = \frac{c^2qEt}{\sqrt{E_0^2 + (cqEt)^2}}, \tag{216}$$
and hence
$$x = \int\frac{c^2qEt}{\sqrt{E_0^2 + (cqEt)^2}}\,dt = \frac{1}{qE}\sqrt{E_0^2 + (cqEt)^2}. \tag{217}$$
On the other hand, we have
$$v_y = \frac{dy}{dt} = \frac{p_yc^2}{E_k} = \frac{p_0c^2}{\sqrt{E_0^2 + (cqEt)^2}}, \tag{218}$$
so that
$$y = \frac{p_0c}{qE}\sinh^{-1}\left(\frac{cqEt}{E_0}\right). \tag{219}$$
Now, from the above equation we have
$$t = \frac{E_0}{cqE}\sinh\left(\frac{qEy}{p_0c}\right), \tag{220}$$
and substituting this in equation (217), and using $\sqrt{1 + \sinh^2 u} = \cosh u$, gives us
$$x = \frac{E_0}{qE}\cosh\left(\frac{qEy}{p_0c}\right), \tag{221}$$
which is a catenary curve.
The second kind of motion we shall deal with is the
motion in a constant uniform magnetic field. Consider the charge q in a uniform
magnetic field $\mathbf{B}$, defined to be in the direction of the z-axis. The equation of
motion given by $d\mathbf{p}/dt = q\mathbf{E} + (q/c)\,\mathbf{v}\times\mathbf{B}$ simply becomes
$$\frac{d\mathbf{p}}{dt} = \frac{q}{c}\mathbf{v}\times\mathbf{B}. \tag{222}$$
Since we have $\mathbf{v} = \mathbf{p}c^2/E$, the above equation can be written as
$$\frac{E}{c^2}\frac{d\mathbf{v}}{dt} = \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{223}$$
which is a set of three equations
$$\frac{dv_x}{dt} = \omega v_y, \quad \frac{dv_y}{dt} = -\omega v_x, \quad \frac{dv_z}{dt} = 0, \tag{224}$$
where ω = qcB/E. Multiplying the equation for $v_y$ by i and adding to the
equation for $v_x$ gives us
$$\frac{d}{dt}(v_x + iv_y) = -i\omega(v_x + iv_y), \tag{225}$$
which is a first order differential equation whose solution is
Also, dvz /dt = 0 gives us vz = v0z = constant and hence z = z0 + v0z t. These
three equations for x, y, z combined show us that the charge moves along a helix
having its axis along the direction of the magnetic field B. In particular, if
v0z = 0 then the charge moves along a circle in the plane perpendicular to the
field.
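The helical motion can be illustrated numerically. The sketch below takes the standard exponential solution of the first-order equation (225), $v_x + iv_y = v_0e^{-i(\omega t + \alpha)}$, and its time integral; the constants $v_0$ (transverse speed), α (initial phase) and the axis position are integration constants introduced here as assumptions, and the function names are illustrative.

```python
import cmath
import math


def transverse_velocity(t, v0, omega, alpha=0.0):
    """Solution of eq. (225): vx + i vy = v0 * exp(-i (omega t + alpha))."""
    return v0 * cmath.exp(-1j * (omega * t + alpha))


def position(t, v0, v0z, omega, alpha=0.0, origin=(0.0, 0.0, 0.0)):
    """Integrated motion: a circle of radius v0/omega in the xy plane plus uniform drift in z."""
    r = v0 / omega
    x0, y0, z0 = origin
    return (x0 + r * math.sin(omega * t + alpha),
            y0 + r * math.cos(omega * t + alpha),
            z0 + v0z * t)


if __name__ == "__main__":
    v0, v0z, omega = 2.0, 0.5, 3.0
    for t in (0.0, 0.5, 1.0):
        x, y, z = position(t, v0, v0z, omega)
        # The distance from the axis stays equal to v0/omega while z grows linearly.
        print(f"t = {t:3.1f}  r = {math.hypot(x, y):.4f}  z = {z:.2f}")
```

Setting `v0z = 0` reduces the helix to a circle in the plane perpendicular to the field, as stated in the text.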
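$$\delta S = \int_a^b\left(mc\,dU_\mu\,\delta x^\mu + \frac{q}{c}\delta x^\mu\,dA_\mu - \frac{q}{c}\delta A_\mu\,dx^\mu\right)
- \left[\left(mcU_\mu + \frac{q}{c}A_\mu\right)\delta x^\mu\right]_a^b = 0. \tag{233}$$
Now, note that
$$\left[\left(mcU_\mu + \frac{q}{c}A_\mu\right)\delta x^\mu\right]_a^b = 0 \tag{234}$$
since the integral is varied with fixed coordinate values at the limits. Also, we
have
$$\delta A_\mu = \frac{\partial A_\mu}{\partial x^\nu}\delta x^\nu \tag{235}$$
and
$$dA_\mu = \frac{\partial A_\mu}{\partial x^\nu}dx^\nu, \tag{236}$$
and hence the expression for δS becomes
$$\delta S = \int_a^b\left(mc\,dU_\mu\,\delta x^\mu + \frac{q}{c}\frac{\partial A_\mu}{\partial x^\nu}dx^\nu\,\delta x^\mu - \frac{q}{c}\frac{\partial A_\mu}{\partial x^\nu}\delta x^\nu\,dx^\mu\right) = 0. \tag{237}$$
Now, let us use the fact that $dU_\mu = (dU_\mu/ds)\,ds$ and $dx^\mu = U^\mu ds$, and also the
fact that summed indices can be exchanged, so that we can write
$$\delta S = \int_a^b\left[mc\frac{dU_\mu}{ds} - \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu\right]\delta x^\mu\,ds = 0. \tag{238}$$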
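Since $\delta x^\mu$ is arbitrary, the integrand must be equal to zero, so
$$mc\frac{dU_\mu}{ds} - \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu = 0, \tag{239}$$
or
$$mc\frac{dU_\mu}{ds} = \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu. \tag{240}$$
We define then the electromagnetic field tensor $F_{\mu\nu}$ as
$$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} \tag{241}$$
so that we can write the four-dimensional equation of motion as
$$mc\frac{dU_\mu}{ds} = \frac{q}{c}F_{\mu\nu}U^\nu. \tag{242}$$
Setting the free index µ = i = 1, 2, 3 in the above equation, we have the equation of motion
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{243}$$
while setting µ = 0 gives us the known equation
$$\frac{dE_k}{dt} = q\mathbf{E}\cdot\mathbf{v}. \tag{244}$$
In matrix notation, the electromagnetic field tensor is
$$F_{\mu\nu} = \begin{pmatrix}
0 & E_x & E_y & E_z \\
-E_x & 0 & -B_z & B_y \\
-E_y & B_z & 0 & -B_x \\
-E_z & -B_y & B_x & 0
\end{pmatrix} \tag{245}$$
in covariant form, and
$$F^{\mu\nu} = \begin{pmatrix}
0 & -E_x & -E_y & -E_z \\
E_x & 0 & -B_z & B_y \\
E_y & B_z & 0 & -B_x \\
E_z & -B_y & B_x & 0
\end{pmatrix} \tag{246}$$
in contravariant form. Note that all the diagonal components are zero, as
expected since $F_{\mu\nu}$ is clearly antisymmetric.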
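One could now note that $F^{\mu\nu}$ transforms in each index as a four-vector.
Expressing the components of this tensor in terms of the components of the
electric and the magnetic field, the formulas for the transformations are
$$E'_x = E_x, \quad E'_y = \gamma\left(E_y - \frac{v}{c}B_z\right), \quad E'_z = \gamma\left(E_z + \frac{v}{c}B_y\right) \tag{247}$$
and
$$B'_x = B_x, \quad B'_y = \gamma\left(B_y + \frac{v}{c}E_z\right), \quad B'_z = \gamma\left(B_z - \frac{v}{c}E_y\right). \tag{248}$$
As a particular case, if v ≪ c then
$$E'_x = E_x, \quad E'_y = E_y - \frac{v}{c}B_z, \quad E'_z = E_z + \frac{v}{c}B_y \tag{249}$$
and
$$B'_x = B_x, \quad B'_y = B_y + \frac{v}{c}E_z, \quad B'_z = B_z - \frac{v}{c}E_y, \tag{250}$$
which can be written as
$$\mathbf{E}' = \mathbf{E} - \frac{1}{c}\mathbf{B}\times\mathbf{v} \tag{251}$$
and
$$\mathbf{B}' = \mathbf{B} + \frac{1}{c}\mathbf{E}\times\mathbf{v}. \tag{252}$$
$$F_{\mu\nu}F^{\mu\nu}, \tag{253}$$
which can be easily computed remembering that $F_{\mu\nu} = (\mathbf{E}, \mathbf{B})$ and $F^{\mu\nu} = (-\mathbf{E}, \mathbf{B})$,
giving
$$F_{\mu\nu}F^{\mu\nu} = 2(B^2 - E^2). \tag{254}$$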
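The contraction of the field tensor with itself can be evaluated directly from the matrices (245) and (246). The sketch below builds $F_{\mu\nu}$, raises both indices with the diagonal metric (111), and sums over all components; it reproduces the value $2(B^2 - E^2)$ that the text later uses to pass from (281) to (282). It also applies a boost along x to the fields, with the transverse signs chosen to match the vector relations (251) and (252), and checks that the contraction is unchanged. All function names are illustrative.

```python
import math

SIGN = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric (111)


def field_tensor_lower(E, B):
    """Covariant field tensor F_{mu nu} of eq. (245)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, -Bz, By],
            [-Ey, Bz, 0.0, -Bx],
            [-Ez, -By, Bx, 0.0]]


def raise_indices(F_low):
    """F^{mu nu} = g^{mu alpha} g^{nu beta} F_{alpha beta} for a diagonal metric."""
    return [[SIGN[m] * SIGN[n] * F_low[m][n] for n in range(4)] for m in range(4)]


def invariant(E, B):
    """Full contraction F_{mu nu} F^{mu nu}."""
    F_low = field_tensor_lower(E, B)
    F_up = raise_indices(F_low)
    return sum(F_low[m][n] * F_up[m][n] for m in range(4) for n in range(4))


def boost_fields_x(E, B, beta):
    """Fields in a frame moving with beta = v/c along x (signs as in (251), (252))."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, g * (Ey - beta * Bz), g * (Ez + beta * By))
    B_new = (Bx, g * (By + beta * Ez), g * (Bz - beta * Ey))
    return E_new, B_new


if __name__ == "__main__":
    E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
    print(invariant(E, B))                         # 2 (B^2 - E^2)
    print(invariant(*boost_fields_x(E, B, 0.6)))   # same value in the boosted frame
```

Raising both indices flips the sign of the time-space components only, which is how (246) follows from (245).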
The second quantity we can form is given by
5 The Electromagnetic Field Equations
5.1 The First Pair of Maxwell’s Equations
We already know that
1 ∂A
B = ∇ × A, E=− − ∇φ, (256)
c ∂t
so, using the fact that ∇ × (∇f ) = 0 for all scalar function f , we have
1 ∂(∇ × A) 1 ∂(∇ × A)
∇×E=− − ∇ × (∇φ) = − (257)
c ∂t c ∂t
and now using the fact that ∇ · (∇ × V) = 0 for all vector field V gives us
∇ · B = ∇ · (∇ × A) = 0, (258)
These equations are the first pair of Maxwell's equations, which are homogeneous. In four-dimensional notation, let us first remember that
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{259} $$
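so that, since partial derivatives commute, the first pair of Maxwell's equations can be written as
$$ \partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} = 0. \tag{260} $$
For a system of point charges, the charge density can be written as
$$ \rho = \sum_a q_a\,\delta(\mathbf{r} - \mathbf{r}_a), \tag{261} $$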
where $\mathbf{r}_a$ is the radius vector of the charge $q_a$. Multiplying now the quantity $dQ = \rho\,dV$ by $dx^\mu$ gives us
$$ dQ\,dx^\mu = \rho\,dV\,dx^\mu = \rho\,dV\,dt\,\frac{dx^\mu}{dt}. \tag{262} $$
Now, note that $dQ\,dx^\mu$ is a four-vector, since $dQ$ is a scalar, and hence the quantity $\rho\,dV\,dt\,dx^\mu/dt$ must also be a four-vector. Since the quantity $dV\,dt$ is a scalar, we conclude that the quantity $\rho\,dx^\mu/dt$ is a four-vector. We define this vector as $J^\mu$, which is called the current four-vector or four-current, and hence
$$ J^\mu = \rho\,\frac{dx^\mu}{dt}. \tag{263} $$
We can now evaluate
$$ J^0 = \rho\,\frac{dx^0}{dt} = \rho\,\frac{d(ct)}{dt} = c\rho \tag{264} $$
and
$$ J^i = \rho\,\frac{dx^i}{dt} = \rho v^i = j^i, \tag{265} $$
where $\mathbf{j}$ is the current density vector. This way, we can write
$$ J^\mu = (c\rho, \mathbf{j}). \tag{266} $$
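As a concrete illustration of the components $J^0 = c\rho$ and $J^i = \rho v^i$, the sketch below assembles the four-current for a charge density moving with a given velocity. The numerical values are made up, Gaussian units are assumed, and the function names are hypothetical.

```python
import math

C = 2.99792458e10  # speed of light in cm/s (Gaussian units, an assumption here)


def four_current(rho, v):
    """Four-current J^mu = (c rho, rho v) for charge density rho and velocity v."""
    return (C * rho, rho * v[0], rho * v[1], rho * v[2])


def minkowski_square(a):
    """a^mu a_mu with the metric signature (+, -, -, -)."""
    return a[0] ** 2 - a[1] ** 2 - a[2] ** 2 - a[3] ** 2


rho = 3.0                 # sample charge density
v = (1.0e9, 2.0e9, 0.0)   # sample velocity in cm/s
J = four_current(rho, v)

square = minkowski_square(J)
expected = rho ** 2 * (C ** 2 - sum(vi * vi for vi in v))
print(J)
print(math.isclose(square, expected, rel_tol=1e-12))
```

The scalar $J^\mu J_\mu = \rho^2 (c^2 - v^2)$ computed at the end follows directly from the components and is the same in every inertial frame.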
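Conservation of charge can then be expressed in integral form as
$$ \int\left(\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j}\right) dV = 0, \tag{272} $$
which allows us to write
$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \tag{273} $$
since the volume of integration is arbitrary. This is the equation of continuity in differential form. Let us now write the above equation in the form
$$ \frac{1}{c}\frac{\partial(c\rho)}{\partial t} + \frac{\partial J^1}{\partial x^1} + \frac{\partial J^2}{\partial x^2} + \frac{\partial J^3}{\partial x^3} = 0, \tag{274} $$
and now note that $c\rho = J^0$ and $\partial/\partial(ct) = \partial/\partial x^0$, so that we can write
$$ \partial_\mu J^\mu = 0, \tag{275} $$
which is the equation of continuity in four-dimensional form.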
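The continuity equation can be illustrated with a charge distribution that moves rigidly with constant velocity, for which $\mathbf{j} = \rho\mathbf{v}$. The following one-dimensional sketch uses a Gaussian profile and a drift velocity that are both arbitrary choices, and estimates the derivatives with central differences.

```python
import math

V = 0.5  # drift velocity of the charge distribution (arbitrary units)


def rho(x, t):
    """Charge density: a Gaussian profile moving rigidly with velocity V."""
    return math.exp(-(x - V * t) ** 2)


def j(x, t):
    """Current density j = rho v along x."""
    return rho(x, t) * V


def continuity_residual(x, t, h=1e-5):
    """Central-difference estimate of d(rho)/dt + d(j)/dx."""
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2.0 * h)
    dj_dx = (j(x + h, t) - j(x - h, t)) / (2.0 * h)
    return drho_dt + dj_dx


residual = continuity_residual(0.8, 1.5)
print(residual)
```

The residual is zero up to discretization and rounding error: the charge that leaves a small region is exactly accounted for by the current through its boundary.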
The action for the system of field and particles consists of three parts,
$$ S = S_f + S_m + S_{mf}, \tag{276} $$
where $S_f$ depends on the properties of the field itself, in the absence of charges, $S_m$ depends only on the properties of the particles, and $S_{mf}$ depends on the interaction between the particles and the field. If there are many particles, the total action $S_m$ is the sum of the actions for each free particle:
$$ S_m = -\sum\int mc\,ds. \tag{277} $$
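For the field itself we take an action of the form
$$ S_f = a\int F_{\mu\nu}F^{\mu\nu}\,dx\,dy\,dz\,dt, \tag{279} $$
where $a$ is a constant. Taking $a = -1/(16\pi)$ and introducing the four-volume element
$$ d\Omega = c\,dt\,dx\,dy\,dz, \tag{280} $$
we have then
$$ S_f = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega, \tag{281} $$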
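or, using the fact that $F_{\mu\nu}F^{\mu\nu} = 2(B^2 - E^2)$,
$$ S_f = \frac{1}{8\pi}\int (E^2 - B^2)\,dV\,dt, \tag{282} $$
and hence the total action for the field and particles is
$$ S = S_f + S_m + S_{mf} = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega - \sum\int mc\,ds - \sum\frac{q}{c}\int A_\mu\,dx^\mu. \tag{283} $$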
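Writing the interaction term as an integral over the four-current, the action becomes
$$ S = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega - \sum\int mc\,ds - \frac{1}{c^2}\int A_\mu J^\mu\,d\Omega. \tag{287} $$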
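Now, note that the variation of the term $-\sum\int mc\,ds$ vanishes when only the potentials $A_\mu$ are varied and the motion of the charges is kept fixed, and so, using the fact that $F^{\mu\nu}\delta F_{\mu\nu} = F_{\mu\nu}\delta F^{\mu\nu}$, we have
$$ \delta S = -\int\left[\frac{1}{8\pi c}\,F^{\mu\nu}\,\delta F_{\mu\nu} + \frac{1}{c^2}\,\delta A_\mu J^\mu\right] d\Omega = 0. \tag{288} $$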
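Using now $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ gives us
$$ \delta S = -\int\left[\frac{1}{8\pi c}\,F^{\mu\nu}\,\partial_\mu\delta A_\nu - \frac{1}{8\pi c}\,F^{\mu\nu}\,\partial_\nu\delta A_\mu + \frac{1}{c^2}\,\delta A_\mu J^\mu\right] d\Omega = 0. \tag{289} $$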
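We can now interchange the indices $\mu$ and $\nu$ in the expression $F^{\mu\nu}\,\partial_\mu\delta A_\nu$ and then replace $F^{\nu\mu}$ by $-F^{\mu\nu}$, which gives us
$$ \delta S = -\int\left[-\frac{1}{4\pi c}\,F^{\mu\nu}\,\partial_\nu\delta A_\mu + \frac{1}{c^2}\,\delta A_\mu J^\mu\right] d\Omega = 0. \tag{290} $$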
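Integrating the first term of this integral by parts, remembering that the limits of integration are at infinity, where the field is zero, gives us the expression
$$ \delta S = -\int\left[\frac{1}{4\pi c}\,\partial_\nu F^{\mu\nu} + \frac{1}{c^2}\,J^\mu\right]\delta A_\mu\,d\Omega = 0, \tag{291} $$
or simply
$$ -\frac{1}{c}\int\left[\frac{1}{4\pi}\,\partial_\nu F^{\mu\nu} + \frac{1}{c}\,J^\mu\right]\delta A_\mu\,d\Omega = 0. \tag{292} $$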
Since $\delta A_\mu$ is arbitrary, its coefficient must be zero, so that we have
$$ \partial_\nu F^{\mu\nu} = -\frac{4\pi}{c}\,J^\mu, \tag{293} $$
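which is a set of four equations. If we set $\mu = 1$ then we have the equation
$$ \frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z} - \frac{1}{c}\frac{\partial E_x}{\partial t} = \frac{4\pi}{c}\,j_x, \tag{294} $$
and similarly for $\mu = 2, 3$, so that, in vector form, we have
$$ \nabla\times\mathbf{B} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\,\mathbf{j}. \tag{295} $$
On the other hand, if $\mu = 0$, we have
$$ \nabla\cdot\mathbf{E} = 4\pi\rho, \tag{296} $$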
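and these equations are called the second pair of Maxwell's equations, or the inhomogeneous pair. It is easy to obtain the continuity equation from the Maxwell equations. Taking the divergence of equation (295) gives us
$$ \nabla\cdot(\nabla\times\mathbf{B}) = \frac{1}{c}\frac{\partial(\nabla\cdot\mathbf{E})}{\partial t} + \frac{4\pi}{c}\,(\nabla\cdot\mathbf{j}). \tag{297} $$
Since $\nabla\cdot(\nabla\times\mathbf{B}) = 0$ and $\nabla\cdot\mathbf{E} = 4\pi\rho$, we have then
$$ 0 = \frac{1}{c}\frac{\partial(4\pi\rho)}{\partial t} + \frac{4\pi}{c}\,(\nabla\cdot\mathbf{j}), \tag{298} $$
or simply
$$ \nabla\cdot\mathbf{j} + \frac{\partial\rho}{\partial t} = 0, \tag{299} $$
which is the equation of continuity.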
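The four-dimensional equation $\partial_\nu F^{\mu\nu} = -(4\pi/c)J^\mu$ can be checked numerically in the simplest case, a plane wave in vacuum, where $J^\mu = 0$. The sketch below assumes units with c = 1 (so that $x^0 = t$), a wave travelling along $x$ with $E_y = B_z$, and an arbitrary wave number. The derivatives are central differences.

```python
import math

K = 2.0  # wave number of the sample wave (arbitrary units, c = 1)


def fields(x0, x1, x2, x3):
    """Plane wave along x: E along y and B along z, both cos(K (x - ct))."""
    a = math.cos(K * (x1 - x0))
    return (0.0, a, 0.0), (0.0, 0.0, a)


def F_upper(point):
    """Contravariant field tensor F^{mu nu} at a space-time point."""
    (Ex, Ey, Ez), (Bx, By, Bz) = fields(*point)
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]


def divergence_of_F(point, h=1e-5):
    """Central-difference estimate of d_nu F^{mu nu} for mu = 0, 1, 2, 3."""
    result = []
    for mu in range(4):
        total = 0.0
        for nu in range(4):
            plus, minus = list(point), list(point)
            plus[nu] += h
            minus[nu] -= h
            total += (F_upper(plus)[mu][nu] - F_upper(minus)[mu][nu]) / (2.0 * h)
        result.append(total)
    return result


div_F = divergence_of_F([0.7, 0.2, -1.0, 0.5])
print(div_F)
```

All four components vanish up to rounding, as expected in the absence of charges and currents. For $\mu = 2$ the cancellation is between $\partial_0 F^{20} = \partial E_y/\partial(ct)$ and $\partial_1 F^{21} = \partial B_z/\partial x$.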
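To obtain the energy balance of the field, we take the scalar product of $\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}$ with $\mathbf{B}$ and of equation (295) with $\mathbf{E}$, and subtract the results, which gives
$$ \mathbf{B}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{B}) = -\frac{1}{c}\,\mathbf{B}\cdot\frac{\partial\mathbf{B}}{\partial t} - \frac{1}{c}\,\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t} - \frac{4\pi}{c}\,\mathbf{j}\cdot\mathbf{E}, \tag{301} $$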
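and, using the formula $\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b})$, we can write
$$ \nabla\cdot(\mathbf{E}\times\mathbf{B}) = -\frac{4\pi}{c}\,\mathbf{j}\cdot\mathbf{E} - \frac{1}{2c}\frac{\partial}{\partial t}\left(E^2 + B^2\right), \tag{302} $$
or simply
$$ \frac{\partial W}{\partial t} = -\mathbf{j}\cdot\mathbf{E} - \nabla\cdot\mathbf{S}, \tag{303} $$
where the Poynting vector $\mathbf{S}$ is defined as
$$ \mathbf{S} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{B} \tag{304} $$
and the energy density $W$ as
$$ W = \frac{E^2 + B^2}{8\pi}, \tag{305} $$
which is the energy per unit volume of the field.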
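The energy balance $\partial W/\partial t = -\mathbf{j}\cdot\mathbf{E} - \nabla\cdot\mathbf{S}$ can be illustrated for a vacuum plane wave, where $\mathbf{j} = 0$ and the balance reduces to $\partial W/\partial t + \nabla\cdot\mathbf{S} = 0$. The sketch below assumes Gaussian units with c = 1 and a wave along $x$ with $E_y = B_z$; the wave number, evaluation point and step size are arbitrary choices.

```python
import math

C = 1.0  # speed of light (units with c = 1, an assumption of this sketch)
K = 2.0  # wave number of the sample wave


def amplitude(x, t):
    """Common profile of E_y and B_z for a plane wave moving along +x."""
    return math.cos(K * (x - C * t))


def energy_density(x, t):
    """W = (E^2 + B^2) / (8 pi)."""
    a = amplitude(x, t)
    return (a * a + a * a) / (8.0 * math.pi)


def poynting_x(x, t):
    """x component of S = (c / 4 pi) E x B; here (E x B)_x = E_y B_z."""
    a = amplitude(x, t)
    return C / (4.0 * math.pi) * a * a


def balance_residual(x, t, h=1e-5):
    """Central-difference estimate of dW/dt + dS_x/dx."""
    dW_dt = (energy_density(x, t + h) - energy_density(x, t - h)) / (2.0 * h)
    dS_dx = (poynting_x(x + h, t) - poynting_x(x - h, t)) / (2.0 * h)
    return dW_dt + dS_dx


residual = balance_residual(0.4, 1.3)
print(residual)
```

For this wave $|\mathbf{S}| = cW$, so the energy density is simply carried along $x$ at the speed of light and the residual vanishes up to rounding.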
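Consider a system whose action has the form
$$ S = \frac{1}{c}\int \Lambda(q, \partial_\mu q)\,d\Omega, \tag{306} $$
where $\Lambda$ is a function of the quantities $q$ describing the system and of their first derivatives. Varying the action gives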
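$$ \delta S = \frac{1}{c}\int\left[\frac{\partial\Lambda}{\partial q}\,\delta q + \frac{\partial\Lambda}{\partial(\partial_\mu q)}\,\delta(\partial_\mu q)\right] d\Omega \tag{307} $$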
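or
$$ \delta S = \frac{1}{c}\int\left[\frac{\partial\Lambda}{\partial q}\,\delta q + \partial_\mu\left(\frac{\partial\Lambda}{\partial(\partial_\mu q)}\,\delta q\right) - \delta q\,\partial_\mu\frac{\partial\Lambda}{\partial(\partial_\mu q)}\right] d\Omega = 0. \tag{308} $$
The term
$$ \int \partial_\mu\left(\frac{\partial\Lambda}{\partial(\partial_\mu q)}\,\delta q\right) d\Omega \tag{309} $$
vanishes by Gauss's theorem, since $\delta q$ is zero on the boundary of the region of integration. Since $\delta q$ is otherwise arbitrary, the equation of motion is then
$$ \frac{\partial\Lambda}{\partial q} - \partial_\mu\frac{\partial\Lambda}{\partial(\partial_\mu q)} = 0. \tag{310} $$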
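As an illustration of how equation (310) is used, the following worked sketch applies it to a free real scalar field. The Lagrangian density below, with its parameter $m$, is an assumption chosen for the example and does not appear in the text.

```latex
% Assumed example Lagrangian density (not from the text): a free scalar field q
\Lambda = \frac{1}{2}\,\partial_\mu q\,\partial^\mu q - \frac{1}{2}\,m^2 q^2

% The two derivatives that enter equation (310)
\frac{\partial\Lambda}{\partial q} = -m^2 q,
\qquad
\frac{\partial\Lambda}{\partial(\partial_\mu q)} = \partial^\mu q

% Substituting into (310) and multiplying by -1 gives the field equation
\partial_\mu\partial^\mu q + m^2 q = 0
```

The same two steps, differentiating with respect to the field and with respect to its derivatives, apply to any $\Lambda$ of this type.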
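Write now
$$ \frac{\partial\Lambda}{\partial x^\mu} = \frac{\partial\Lambda}{\partial q}\frac{\partial q}{\partial x^\mu} + \frac{\partial\Lambda}{\partial(\partial_\nu q)}\frac{\partial(\partial_\nu q)}{\partial x^\mu}, \tag{311} $$
and, using the equation of motion and the fact that $\partial_\nu\partial_\mu q = \partial_\mu\partial_\nu q$, we have
$$ \frac{\partial\Lambda}{\partial x^\mu} = \partial_\mu q\,\frac{\partial}{\partial x^\nu}\left(\frac{\partial\Lambda}{\partial(\partial_\nu q)}\right) + \frac{\partial\Lambda}{\partial(\partial_\nu q)}\frac{\partial(\partial_\mu q)}{\partial x^\nu} = \frac{\partial}{\partial x^\nu}\left(\partial_\mu q\,\frac{\partial\Lambda}{\partial(\partial_\nu q)}\right). \tag{312} $$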
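For the electromagnetic field, the quantities $q$ are the components $A_\rho$ of the four-potential, and the energy-momentum tensor takes the form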
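$$ T_\mu{}^\nu = \frac{\partial A_\rho}{\partial x^\mu}\,\frac{\partial\Lambda}{\partial\left(\dfrac{\partial A_\rho}{\partial x^\nu}\right)} - \delta_\mu^\nu\,\Lambda. \tag{317} $$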