Classical Theory of Fields Queen Mary University of London

Relativity and Covariant Electromagnetism

Ismael Rodrigues Silva


Queen Mary, University of London
August 2014

Preface
This work is part of my exchange programme in the United Kingdom,
written during the summer of 2014, on my year abroad at Queen Mary,
University of London. The topics covered are brief introductions to Special
Relativity and the covariant formulation of Electromagnetism, including
the consequences of the postulates of Special Relativity, and a step-by-step
explanation of Tensor Calculus and Relativistic Mechanics.

Acknowledgements
To my family, who have never stopped believing in me, especially my
mother Tania. To my advisor at Queen Mary, University of London, Dr.
Alston Misquitta, and to my advisor and counselor at Universidade Fed-
eral de Santa Catarina, Dr. Marco Kneipp. To my friends, in particular
Augusto, Luciano, Madlene, Deborah, Antonio and Maique, who, from
Brazil, have been supporting me in all moments. To my friends who
helped me through my quick journey in the United Kingdom, Fernanda,
Cos, Vasily, Vladimir, Ivan, Elena, Marieta, Sophia, Kristina and Mane-
tou. To my sponsor, CNPq. To Dr. Brian Wecht from Queen Mary, who
not only taught me Statistical Physics, but also what a lecturer must be
like.

1 The Theory of Relativity


1.1 Historical Background
The basis on which Einstein built the special theory of relativity was the fact
that Maxwell’s equations predict that the speed of propagation of the electro-
magnetic waves is a universal constant, independent of the motion of the source
or of the detector of the waves. The two postulates of special relativity, formu-
lated by Einstein in 1905, are:

1. The laws of physics are the same in all inertial frames of reference;
2. The speed of light in free space has the same value c in all inertial frames
of reference.

An inertial frame of reference is a frame in which a freely moving body
proceeds with constant velocity, that is, a frame in which Newton’s first law of
motion holds or, in other words, in which the velocity of any particle remains
constant unless there is a net force acting on it. If a system moves with constant
velocity with respect to an inertial reference system, then it is also inertial.
Ordinary mechanics assumes that the propagation of interactions of material
particles is instantaneous. Experiments show, however, that there is no instan-
taneous interaction in nature: there is a finite maximum speed of propagation of
interaction, which implies that motions of bodies with greater speed are impos-
sible, for if such a motion could occur, then by means of it one could realise an
interaction with a speed exceeding the maximum possible speed of propagation
of interaction. From the second postulate, it follows that this maximum speed
is the same in all inertial systems of reference. This universal constant, which
is also the speed of light in free space, is designated by c and is exactly given by[1]

$$c = 2.99792458 \times 10^{8}\ \mathrm{m/s}. \tag{1}$$


The mechanics based on the principle of relativity stated above is said to
be relativistic. If the speeds involved are much less than c, the mechanics is
called classical or Newtonian. Time is absolute in classical mechanics, so
there is one time for all reference frames, which makes simultaneity an
absolute concept. This contradicts special relativity, though: if we apply
the classical law of combination of velocities to the propagation of interactions,
then the speed of propagation would be different in different inertial frames of
reference, in conflict with the second postulate. Since time is not absolute,
events that are simultaneous in one frame may not be simultaneous in other frames.
The principle of relativity thus introduces drastic and fundamental changes
in basic physical concepts. The notions of space and time which we have are
only approximations, valid because the speeds with which we deal daily
are very small compared to the speed of light.

1.2 Intervals
An event is described by the place where it occurred and the time when it occurred.
It is useful to use a four-dimensional space, whose spatial axes are x, y, z and
whose temporal axis is ct. In this space, events are points $(ct_1, x_1, y_1, z_1)$ called world
points, and to each particle there corresponds a line, called its world line.[2]
Consider two inertial reference systems $K$ and $K'$, with axes $(ct, x, y, z)$ and
$(ct', x', y', z')$ respectively, moving relative to each other with constant velocity.
Suppose that the frames coincide at $t = t' = 0$, and consider a flash of light
emanating from their common origin at the instant they coincide. Therefore,
remembering that the distance travelled by the wave is given by the product of
its speed and the interval of time, the spherical wave front described in $K$ by

$$x^2 + y^2 + z^2 = (ct)^2 \tag{2}$$

will be described in $K'$ by

$$x'^2 + y'^2 + z'^2 = (ct')^2. \tag{3}$$

In other words,

$$c^2t^2 - x^2 - y^2 - z^2 = 0 \iff c^2t'^2 - x'^2 - y'^2 - z'^2 = 0. \tag{4}$$

[1] Originally, one metre was intended to be one ten-millionth of the distance from the Earth's
equator to the North Pole, but since 1983 it has been defined as the length of the path travelled
by light in vacuum during a time interval of 1/299,792,458 of a second.
[2] It is easy to show that to a particle in uniform rectilinear motion there corresponds a
straight world line.


Homogeneity of space and time and isotropy of space require that the relationship
between $(ct, x, y, z)$ and $(ct', x', y', z')$ is linear. In fact, a general linear
relation can be used to find the equations for the transformation of the coordinates,
but this will be done later in a different, simpler way.
Relation (4) motivates us to define the interval $s_{12}$ between two events as
the scalar

$$s_{12}^2 = c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2, \tag{5}$$

where $(ct_1, x_1, y_1, z_1)$ are the coordinates of the first event and $(ct_2, x_2, y_2, z_2)$
are the coordinates of the second event. The interval can be regarded as the
distance between two world points in our four-dimensional space. If the events
are infinitely close to each other, the infinitesimal interval $ds$ between them is
given by[3]

$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2. \tag{6}$$


Expression (5) allows us to have either $s_{12}^2 = 0$, $s_{12}^2 > 0$ or $s_{12}^2 < 0$. If

$$c^2(t_2 - t_1)^2 > (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2, \tag{7}$$

then $s_{12}^2 > 0$, so $s_{12}$ is real and the interval is said to be timelike, and there exists a
coordinate system in which the two events occur at the same point in space. If

$$c^2(t_2 - t_1)^2 < (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2, \tag{8}$$

then $s_{12}^2 < 0$, so $s_{12}$ is imaginary and the interval is said to be spacelike, and there exists a
coordinate system in which the two events occur simultaneously. Finally, if

$$c^2(t_2 - t_1)^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2, \tag{9}$$

then the interval is equal to zero and is said to be null or lightlike.
The equivalence in (4) implies that, if the interval in $K$ is null, then so is
the interval in $K'$. In other words, it is invariant in this case. It turns out that
the interval, which is a scalar, is always invariant, as we shall see later using the
concept of four-vectors. Until we get there, assume the interval is invariant.

[3] This geometry was introduced by H. Minkowski and is called pseudo-Euclidean, and the
four-dimensional space mentioned is called Minkowski space.
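The classification in (7) to (9) can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names, the (t, x, y, z) tuple layout and the tolerance parameter `tol` are illustrative assumptions.

```python
def interval_squared(event1, event2, c=299_792_458.0):
    """Return s_12^2 as in equation (5); each event is a tuple (t, x, y, z)."""
    t1, x1, y1, z1 = event1
    t2, x2, y2, z2 = event2
    return ((c * (t2 - t1)) ** 2
            - (x2 - x1) ** 2 - (y2 - y1) ** 2 - (z2 - z1) ** 2)


def classify_interval(event1, event2, c=299_792_458.0, tol=0.0):
    """Label the interval as timelike, spacelike or lightlike.

    tol is a hypothetical tolerance on |s_12^2| for floating-point input.
    """
    s2 = interval_squared(event1, event2, c)
    if abs(s2) <= tol:
        return "lightlike"
    return "timelike" if s2 > 0 else "spacelike"


# Units with c = 1 keep the arithmetic exact for small integers.
origin = (0, 0, 0, 0)
print(classify_interval(origin, (2, 1, 0, 0), c=1))
print(classify_interval(origin, (1, 2, 0, 0), c=1))
print(classify_interval(origin, (1, 1, 0, 0), c=1))
```

With real measurements one would normally pass a nonzero `tol`, since an exactly null interval is rarely reproduced in floating-point arithmetic.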
1.3 Proper time
The proper time of an object is defined as the time read by a clock moving with
this object. The proper time interval between two events will be therefore the
interval of time measured in a reference frame in which the two events occur at
the same point in space. Let us use the Greek letter τ to describe the proper
time.
Consider the same reference systems $K$ and $K'$ moving relative to each
other with constant velocity $\mathbf{v}$, and suppose there is a clock at rest in $K'$. The
infinitesimal interval in $K$ is given by

$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2. \tag{10}$$

On the other hand, the clock is at rest in $K'$, so that $dx' = dy' = dz' = 0$, and
the time measured is the proper time. Therefore, since the interval is invariant,
we also have

$$ds^2 = c^2d\tau^2, \tag{11}$$

and so

$$c^2d\tau^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 \tag{12}$$

or

$$d\tau = dt\sqrt{1 - \frac{dx^2 + dy^2 + dz^2}{c^2dt^2}} = dt\sqrt{1 - \frac{v^2}{c^2}}, \tag{13}$$

since

$$\frac{dx^2 + dy^2 + dz^2}{dt^2} = v^2, \tag{14}$$

where $v = |\mathbf{v}|$ is the relative speed between the reference systems $K$ and $K'$.
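Let us now define the velocity coefficient

$$\boldsymbol{\beta} \equiv \frac{\mathbf{v}}{c} \tag{15}$$

and the Lorentz factor or Lorentz term

$$\gamma \equiv \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{1}{\sqrt{1 - \beta^2}}, \tag{16}$$

where $\beta = |\boldsymbol{\beta}| = v/c$. This way, equation (13) can be written as

$$d\tau = \frac{1}{\gamma}\,dt, \tag{17}$$

and we have the important relation[4]

$$\frac{d\tau}{dt} = \frac{1}{\gamma}. \tag{18}$$

[4] Mathematically, just compare equation (17) with the equation $d\tau = \frac{d\tau}{dt}\,dt$ for the
differential $d\tau$.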
Supposing $v$ is constant, one can integrate (18) and obtain the time interval
indicated by the moving clock in terms of the time $t$ of $K$:

$$\int_{t_1}^{t_2}\frac{d\tau}{dt}\,dt = \tau(t_2) - \tau(t_1) \equiv \tau_2 - \tau_1 = \int_{t_1}^{t_2}\frac{1}{\gamma}\,dt = \frac{1}{\gamma}(t_2 - t_1) \tag{19}$$

or

$$\Delta\tau = \frac{1}{\gamma}\Delta t \le \Delta t, \tag{20}$$

since $v$ is always less than $c$, so that $0 < 1/\gamma \le 1$. Therefore, we
conclude that the proper time interval of a moving object is never greater than
the corresponding interval in the rest system, and is strictly smaller whenever
$v \neq 0$. In other words, moving clocks run slow.
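To get a feeling for the size of the effect in (20), one may evaluate the Lorentz factor (16) numerically. The snippet below is only a sketch under stated assumptions: the names `lorentz_factor` and `proper_time` and the sample speed of 0.6c are chosen for illustration and do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s, equation (1)


def lorentz_factor(v, c=C):
    """Lorentz factor of equation (16) for a speed v with |v| < c."""
    if abs(v) >= c:
        raise ValueError("speed must be strictly less than c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def proper_time(delta_t, v, c=C):
    """Proper time interval of equation (20) for a clock moving at speed v."""
    return delta_t / lorentz_factor(v, c)


# A clock moving at 0.6 c: gamma is close to 1.25, so about 8 s of proper
# time elapse during 10 s measured in the frame where the clock moves.
print(lorentz_factor(0.6 * C))
print(proper_time(10.0, 0.6 * C))
```

The explicit check on `v` mirrors the remark that $1/\gamma$ is only real and positive for speeds below c.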
According to (11), we also have $d\tau = ds/c$, so the time interval read by the
clock in $K'$ is also given by

$$\tau_2 - \tau_1 = \frac{1}{c}\int_a^b ds, \tag{21}$$

taken along the world line of the clock. But, since the clock at rest always
indicates a greater time interval than the moving one, we conclude that

$$\int_a^b ds \tag{22}$$

has its maximum value if it is taken along the straight world line joining the
points $a$ and $b$.

1.4 The Lorentz Transformation


We wish now to derive the formulae for the transformation of coordinates from
one inertial system to another. Consider the same inertial frames $K$ and $K'$
with axes $ct, x, y, z$ and $ct', x', y', z'$ respectively, and suppose that the axes $x$
and $x'$ are coincident. Let $v$ be the speed of $K'$ relative to $K$, and suppose that
the origins of the two systems coincide at times $t = t' = 0$. We will define this
situation as a boost in the x-direction. According to classical mechanics, for the
boost described we would have

$$x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t, \tag{23}$$

or, in matrix form,

$$\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -v & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}, \tag{24}$$

which is called the Galilean transformation and is clearly inconsistent with the
principle of relativity, since it does not leave the interval invariant.
Since the interval can be regarded as the distance between two world points
in our four-dimensional space, the transformation we seek must be expressible
mathematically as a rotation in this space. Let us consider a rotation in the tx
plane, so that $c^2t^2 - x^2$ must be invariant. In the most general case, we have

$$ct' = ct\cosh\zeta - x\sinh\zeta, \quad x' = -ct\sinh\zeta + x\cosh\zeta, \tag{25}$$

where $\zeta$ is called the rapidity.[5] In matrix notation, this is written as

$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cosh\zeta & -\sinh\zeta & 0 & 0 \\ -\sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}, \tag{26}$$

which is analogous to an ordinary rotation about the z-axis,

$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}, \tag{27}$$

so that $\zeta$ can be interpreted as a four-dimensional angle of rotation in the tx
plane.
We wish now to determine $\zeta$, which depends on $v$. This is straightforward: just
consider the motion, in $K$, of the origin of $K'$. We have then $x' = 0$, so the
second equation in (25) gives us

$$ct\sinh\zeta = x\cosh\zeta \tag{28}$$

or

$$x = (c\tanh\zeta)\,t, \tag{29}$$

so that

$$v = \frac{dx}{dt} = c\tanh\zeta \tag{30}$$

or

$$\tanh\zeta = \frac{v}{c} = \beta. \tag{31}$$

One can now easily find

$$\sinh\zeta = \frac{\beta}{\sqrt{1 - \beta^2}}, \quad \cosh\zeta = \frac{1}{\sqrt{1 - \beta^2}}, \tag{32}$$

or simply

$$\sinh\zeta = \gamma\beta, \quad \cosh\zeta = \gamma. \tag{33}$$

[5] One can easily check, using the identity $\cosh^2\zeta - \sinh^2\zeta = 1$, that (25) preserves
the equation $c^2t^2 - x^2 = c^2t'^2 - x'^2$.


Using this result in (25) we find our transformation of coordinates, which is
called the Lorentz transformation:[6]

$$ct' = \gamma(ct - \beta x), \quad x' = \gamma(x - \beta ct), \quad y' = y, \quad z' = z, \tag{34}$$

or, in terms of $v$,

$$t' = \frac{t - \frac{vx}{c^2}}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad x' = \frac{x - vt}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad y' = y, \quad z' = z. \tag{35}$$

Note that, if $v > c$, then $x'$ and $t'$ are imaginary, which is physically meaningless.
If $v \ll c$, we recover the equations of classical mechanics, which also happens when
one supposes $c \to \infty$. The formulae expressing the coordinates from $K$ as a
function of the ones from $K'$, called the inverse transformation,[7] are obtained from
(35) simply by changing $v$ to $-v$:

$$t = \frac{t' + \frac{vx'}{c^2}}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad x = \frac{x' + vt'}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad y = y', \quad z = z'. \tag{36}$$
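The statements that the boost leaves the interval unchanged and that the inverse is obtained by changing the sign of the velocity can be verified numerically. The following sketch implements (34) directly; the function names and the sample event are illustrative assumptions, and units with $ct$ as the time coordinate are used so that only $\beta$ appears.

```python
import math


def boost_x(ct, x, y, z, beta):
    """Direct transformation (34): coordinates in K' from coordinates in K."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)


def inverse_boost_x(ct_p, x_p, y_p, z_p, beta):
    """Inverse transformation (36): the same formula with beta -> -beta."""
    return boost_x(ct_p, x_p, y_p, z_p, -beta)


def interval_squared(ct, x, y, z):
    """Squared interval between the event and the common origin."""
    return ct ** 2 - x ** 2 - y ** 2 - z ** 2


event = (5.0, 3.0, 1.0, -2.0)   # (ct, x, y, z) in K, arbitrary sample values
beta = 0.8

primed = boost_x(*event, beta)
recovered = inverse_boost_x(*primed, beta)

print(interval_squared(*event), interval_squared(*primed))
print(event, recovered)
```

Up to rounding, both printed intervals agree and the round trip returns the original event.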

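In matrix notation, we can write the Lorentz transformation in (34) as

$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}, \tag{37}$$

and for the inverse transformation

$$\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix}. \tag{38}$$

The transformation matrix is often called the Lorentz matrix or boost matrix. If the
boost is in the y-direction or z-direction, we would have, respectively,[8]

$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & 0 & -\beta\gamma & 0 \\ 0 & 1 & 0 & 0 \\ -\beta\gamma & 0 & \gamma & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} \tag{39}$$

and

$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & 0 & 0 & -\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\beta\gamma & 0 & 0 & \gamma \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{40}$$

[6] The Lorentz transformation is in accordance with special relativity, but was derived before
special relativity. We will refer to the transformation in (34), in which the coordinates from $K'$
are functions of the ones from $K$, as the direct transformation.
[7] Some authors define (36) as the direct transformation.
[8] In both cases, $v$, and consequently $\beta$, change sign for the inverse transformation.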

1.5 Length Contraction and Time Dilation


Similarly to the definition of the proper time, the proper length of an object is
its length in a reference system in which the body is at rest. The proper length
between two events is the length measured in a reference frame in which the
two events occur simultaneously.
Consider a boost in the x-direction, and suppose there is a rod at rest in $K'$
with ends at points $x'_1$ and $x'_2 > x'_1$. The length of the rod in $K'$ is then
$L_0 = x'_2 - x'_1$, which is the proper length. In $K$, we have $L = x_2 - x_1$. Using
the Lorentz transformation, we find

$$x'_1 = \gamma(x_1 - vt_1), \quad x'_2 = \gamma(x_2 - vt_2), \tag{41}$$

so that

$$L_0 = x'_2 - x'_1 = \gamma(x_2 - vt_2 - x_1 + vt_1) \tag{42}$$

or

$$L = \frac{L_0}{\gamma}, \tag{43}$$

since $t_2 = t_1$ in $K$, because the positions of the two ends of the rod must be
measured simultaneously. This means that the greatest length of the rod is measured in
the system in which it is at rest, and the length decreases in a system in which
it moves with speed $v$. This is called length contraction or Lorentz-Fitzgerald
contraction.
Similarly, suppose once more there is a clock at rest in $K'$. The proper time
interval in $K'$ is then given by $\Delta\tau = \tau_2 - \tau_1$. In $K$, the time interval is given
by $\Delta t = t_2 - t_1$. Using the inverse transformation,[9]

$$t_1 = \gamma\left(\tau_1 + \frac{vx'_1}{c^2}\right), \quad t_2 = \gamma\left(\tau_2 + \frac{vx'_2}{c^2}\right), \tag{44}$$

which implies

$$\Delta t = t_2 - t_1 = \gamma\left(\tau_2 + \frac{vx'_2}{c^2} - \tau_1 - \frac{vx'_1}{c^2}\right) \tag{45}$$

or

$$\Delta t = \gamma\Delta\tau, \tag{46}$$

since $x'_1 = x'_2$ in $K'$, for it was assumed that the clock is at rest there. This is
called time dilation, since the time interval measured in a frame in which the clock
moves is greater than the one measured in the rest frame of the clock. Note that (46)
agrees with the result found in (20).

[9] One may use the direct transformation, remembering that, in this case, $x_2 - x_1 = v(t_2 - t_1)$
in $K$.

1.6 Transformation of Velocity


Two consecutive Lorentz transformations depend, in general, on their order,
just like the result of two rotations about different axes depends on the order
in which they are carried out.
Consider a boost in the x-direction, letting $v$ be the velocity of $K'$ with respect
to $K$, and consider a particle moving in $K$ with velocity

$$\mathbf{u} = (u_x, u_y, u_z) = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right). \tag{47}$$

In $K'$, we have

$$\mathbf{u}' = (u'_x, u'_y, u'_z) = \left(\frac{dx'}{dt'}, \frac{dy'}{dt'}, \frac{dz'}{dt'}\right). \tag{48}$$

Using the Lorentz transformation, we obtain

$$dx' = \gamma(dx - v\,dt), \quad dy' = dy, \quad dz' = dz, \quad dt' = \gamma\left(dt - \frac{v\,dx}{c^2}\right), \tag{49}$$

where $v = |\mathbf{v}|$. Dividing the first three equations by the fourth, we get

$$u'_x = \frac{u_x - v}{1 - \frac{u_xv}{c^2}}, \quad u'_y = \frac{u_y}{\gamma\left(1 - \frac{u_xv}{c^2}\right)}, \quad u'_z = \frac{u_z}{\gamma\left(1 - \frac{u_xv}{c^2}\right)}, \tag{50}$$

which is the transformation of velocity. The inverse transformation is obtained
by changing $v$ to $-v$. Note that, setting $c \to \infty$ or $v \ll c$, we have the classical
transformation of velocity

$$u'_x = u_x - v, \quad u'_y = u_y, \quad u'_z = u_z, \tag{51}$$

which is obtained by differentiating the first three equations in (23) with respect
to $t$.
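A short numerical check of (50) shows how it differs from the classical rule (51). The sketch below assumes units in which $c = 1$ by default; the function name and the sample velocities are illustrative choices, not taken from the text.

```python
import math


def transform_velocity(u, v, c=1.0):
    """Apply (50): velocity u = (ux, uy, uz) in K seen from K', boost speed v along x."""
    ux, uy, uz = u
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - ux * v / c ** 2
    return ((ux - v) / denom, uy / (gamma * denom), uz / (gamma * denom))


def speed(u):
    return math.sqrt(sum(component ** 2 for component in u))


# Two speeds of 0.5 c in opposite directions combine to 0.8 c, not c.
print(transform_velocity((0.5, 0.0, 0.0), -0.5))

# A light ray keeps speed c in K', whatever its direction in K.
print(speed(transform_velocity((1.0, 0.0, 0.0), 0.5)))
print(speed(transform_velocity((0.0, 1.0, 0.0), 0.6)))
```

The last two lines illustrate the second postulate: the transformed light ray changes direction in general, but not speed.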

1.7 Lorentz Transformation in 3 Dimensions


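For a boost in an arbitrary direction with velocity $\mathbf{v}$, it is convenient to decompose
the spatial column vector $\mathbf{r} = (x, y, z)$ into components perpendicular and
parallel to $\mathbf{v}$,

$$\mathbf{r} = \mathbf{r}_\perp + \mathbf{r}_\parallel, \tag{52}$$

so that

$$\mathbf{r}\cdot\mathbf{v} = \mathbf{r}_\perp\cdot\mathbf{v} + \mathbf{r}_\parallel\cdot\mathbf{v} = r_\parallel v. \tag{53}$$

This way, only the time and the component $\mathbf{r}_\parallel$ will transform, so, according to
(35),

$$t' = \gamma\left(t - \frac{r_\parallel v}{c^2}\right), \quad \mathbf{r}' = \mathbf{r}_\perp + \gamma\left(\mathbf{r}_\parallel - \mathbf{v}t\right). \tag{54}$$

By substituting $\mathbf{r}_\perp = \mathbf{r} - \mathbf{r}_\parallel$ into the above expression for $\mathbf{r}'$, we get

$$\mathbf{r}' = \mathbf{r} + (\gamma - 1)\,\mathbf{r}_\parallel - \gamma\mathbf{v}t. \tag{55}$$

Since $\mathbf{r}_\parallel$ and $\mathbf{v}$ are parallel, we have[10]

$$\mathbf{r}_\parallel = r_\parallel\frac{\mathbf{v}}{v} = \left(\frac{\mathbf{r}\cdot\mathbf{v}}{v}\right)\frac{\mathbf{v}}{v}, \tag{56}$$

and substituting this into the expression for $\mathbf{r}'$ gives

$$\mathbf{r}' = \mathbf{r} + (\gamma - 1)\left(\frac{\mathbf{r}\cdot\mathbf{v}}{v}\right)\frac{\mathbf{v}}{v} - \gamma\mathbf{v}t. \tag{57}$$

Factoring $\mathbf{v}$ in the above expression and substituting $r_\parallel v = \mathbf{r}\cdot\mathbf{v}$ in the expression
for $t'$ in (54), our transformation becomes

$$t' = \gamma\left(t - \frac{\mathbf{r}\cdot\mathbf{v}}{c^2}\right), \quad \mathbf{r}' = \mathbf{r} + \left(\frac{\gamma - 1}{v^2}\,\mathbf{r}\cdot\mathbf{v} - \gamma t\right)\mathbf{v}, \tag{58}$$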
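which is the Lorentz transformation in 3 dimensions. We wish now to find the
transformation matrix. If we define

$$\boldsymbol{\beta} = \frac{\mathbf{v}}{c} \equiv \begin{pmatrix} \beta_x \\ \beta_y \\ \beta_z \end{pmatrix} = \frac{1}{c}\begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} \tag{59}$$

and its transpose

$$\boldsymbol{\beta}^T = \frac{\mathbf{v}^T}{c} = (\beta_x \ \ \beta_y \ \ \beta_z) = \frac{1}{c}(v_x \ \ v_y \ \ v_z), \tag{60}$$

then we can rewrite (58) as

$$ct' = \gamma ct - \gamma\boldsymbol{\beta}^T\mathbf{r}, \quad \mathbf{r}' = -\gamma\boldsymbol{\beta}ct + \left(I + (\gamma - 1)\frac{\boldsymbol{\beta}\boldsymbol{\beta}^T}{\beta^2}\right)\mathbf{r}, \tag{61}$$

where $I$ is the 3 × 3 identity matrix, such that $I\mathbf{r} = \mathbf{r}$, and $\beta^2 = \beta_x^2 + \beta_y^2 + \beta_z^2$.
In block matrix form, this can be written as

$$\begin{pmatrix} ct' \\ \mathbf{r}' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\boldsymbol{\beta}^T \\ -\gamma\boldsymbol{\beta} & I + (\gamma - 1)\frac{\boldsymbol{\beta}\boldsymbol{\beta}^T}{\beta^2} \end{pmatrix} \begin{pmatrix} ct \\ \mathbf{r} \end{pmatrix}, \tag{62}$$

or, if we define

$$\alpha_{ij} = (\gamma - 1)\frac{\beta_i\beta_j}{\beta^2}, \tag{63}$$

for $i, j = x, y, z$, then in a more explicitly stated way we have

$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta_x & -\gamma\beta_y & -\gamma\beta_z \\ -\gamma\beta_x & 1 + \alpha_{xx} & \alpha_{xy} & \alpha_{xz} \\ -\gamma\beta_y & \alpha_{yx} & 1 + \alpha_{yy} & \alpha_{yz} \\ -\gamma\beta_z & \alpha_{zx} & \alpha_{zy} & 1 + \alpha_{zz} \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{64}$$

Note that this is a transformation between two frames whose axes are parallel
and whose origins coincide. The most general Lorentz transformation also contains
a rotation of the three axes, since the composition of two boosts is not a
pure boost, but a boost followed by a rotation.

[10] Geometrically and algebraically, $\mathbf{v}/v$ is a dimensionless unit vector pointing along
the direction of $\mathbf{r}_\parallel$, and $r_\parallel = \mathbf{r}\cdot\mathbf{v}/v$ is the projection of $\mathbf{r}$ onto the direction of $\mathbf{v}$.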
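The matrix in (64) is easy to build numerically, which also gives a simple way to see that two boosts along different axes do not compose into a pure boost. The sketch below uses plain nested lists; the helper names `boost_matrix`, `matmul` and `is_symmetric` are illustrative assumptions. It relies on the observation that every pure boost matrix of the form (64) is symmetric, so a non-symmetric product cannot be a pure boost.

```python
import math


def boost_matrix(beta):
    """Return the 4x4 matrix of (64) for beta = (beta_x, beta_y, beta_z).

    Rows and columns are ordered as (ct, x, y, z).
    """
    b2 = sum(b * b for b in beta)
    if b2 >= 1.0:
        raise ValueError("|beta| must be strictly less than 1")
    m = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    if b2 == 0.0:
        return m  # no relative motion: the transformation is the identity
    gamma = 1.0 / math.sqrt(1.0 - b2)
    m[0][0] = gamma
    for i in range(3):
        m[0][i + 1] = m[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            # alpha_ij of (63) added to the 3x3 identity block
            m[i + 1][j + 1] += (gamma - 1.0) * beta[i] * beta[j] / b2
    return m


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def is_symmetric(m, tol=1e-12):
    return all(abs(m[i][j] - m[j][i]) <= tol for i in range(4) for j in range(4))


bx = boost_matrix((0.6, 0.0, 0.0))   # reduces to the matrix in (37)
by = boost_matrix((0.0, 0.6, 0.0))   # reduces to the matrix in (39)
composed = matmul(by, bx)            # boost along x first, then along y

print(is_symmetric(bx), is_symmetric(by), is_symmetric(composed))
```

The first two matrices are symmetric and the product is not, in line with the remark that the composition of two boosts contains a rotation.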

2 Tensor Calculus
2.1 Four-vectors
The coordinates $(ct, x, y, z)$ of an event can be considered as the components of
a four-dimensional radius vector. We shall use the following notation:

$$x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z. \tag{65}$$

Note that the quantity

$$(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2, \tag{66}$$

which is the interval, does not change under Lorentz transformation. From now
on, Greek letters will take on the values 0, 1, 2, 3 and Latin letters will take on
the values 1, 2, 3. This way, the components of our four-dimensional vector can
be denoted by $x^\mu$, $\mu = 0, 1, 2, 3$, and they transform according to the system of
equations

$$\begin{pmatrix} x'^0 \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}. \tag{67}$$
A contravariant four-vector $V^\mu$ is, by definition, a set of four quantities
$V^0, V^1, V^2, V^3$, which transform like the components of $x^\mu$ under transformations
of the four-dimensional coordinate system. Its components will transform,
therefore, according to the system

$$\begin{pmatrix} V'^0 \\ V'^1 \\ V'^2 \\ V'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} V^0 \\ V^1 \\ V^2 \\ V^3 \end{pmatrix}. \tag{68}$$
Components with index 0 are called time components, while the ones with
index 1, 2 or 3 are called space components. Contravariant four-vectors are
always written with superscripts. A four-vector $V_\mu$, written with subscripts,
which will be defined later, is said to be covariant. The components of these
two kinds of four-vectors are related by

$$V_0 = V^0, \quad V_1 = -V^1, \quad V_2 = -V^2, \quad V_3 = -V^3. \tag{69}$$

In matrix form, the column vector $V^\mu$ is[11]

$$V^\mu = \begin{pmatrix} V^0 \\ V^1 \\ V^2 \\ V^3 \end{pmatrix}_{4\times 1}, \tag{70}$$

and, on the other hand,

$$V_\mu = (V_0 \ \ V_1 \ \ V_2 \ \ V_3)_{1\times 4} = (V^0 \ \ -V^1 \ \ -V^2 \ \ -V^3)_{1\times 4}. \tag{71}$$

[11] $V^\mu$ and $V_\mu$ will be used to indicate either column and row vectors, respectively, or sets
of four components. To remember the matrix form of each four-vector, use the mnemonic
"upper indices go up to down; lower indices go left to right".


The square magnitude of a four-vector, in comparison with (66), is given by

$$(V^0)^2 - (V^1)^2 - (V^2)^2 - (V^3)^2, \tag{72}$$

which, according to (69), can be written as

$$V_0V^0 + V_1V^1 + V_2V^2 + V_3V^3 = \sum_{\mu=0}^{3} V_\mu V^\mu. \tag{73}$$

From now on, we will use the Einstein summation convention, in which one sums
over any repeated index (also called summing index or dummy index; one occurrence
is always contravariant and the other covariant), and omits the summation sign,
remembering that Greek letters run from 0 to 3 and Latin letters from 1 to 3.
This way, we have

$$V_\mu V^\mu = V_0V^0 + V_1V^1 + V_2V^2 + V_3V^3 \tag{74}$$

as the expression for the square magnitude of a four-vector. Analogously, the
Lorentz scalar product of two different four-vectors is given by[12]

$$V_\mu U^\mu = V_0U^0 + V_1U^1 + V_2U^2 + V_3U^3, \tag{75}$$

which is invariant under rotations of the four-dimensional coordinate system.
Just like the interval between two events, the scalar product of a four-vector
with itself can be positive (timelike vectors), negative (spacelike) or zero (null or
lightlike). In particular, the interval, which can be written as

$$ds^2 = dx_\mu dx^\mu = dx_0dx^0 + dx_1dx^1 + dx_2dx^2 + dx_3dx^3, \tag{76}$$

is invariant, as stated before.
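The invariance of the scalar product (75) under the boost (68) can be checked with a few lines of Python. This is a sketch under stated assumptions: four-vectors are stored as tuples of contravariant components, and the names `lower`, `scalar_product` and `boost_x` as well as the sample vectors are illustrative only.

```python
import math


def lower(v_upper):
    """Covariant components from contravariant ones, following (69)."""
    return (v_upper[0], -v_upper[1], -v_upper[2], -v_upper[3])


def scalar_product(v_upper, u_upper):
    """Lorentz scalar product V_mu U^mu of equation (75)."""
    v_lower = lower(v_upper)
    return sum(v_lower[mu] * u_upper[mu] for mu in range(4))


def boost_x(v_upper, beta):
    """Transform contravariant components with the matrix in (68)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (v_upper[0] - beta * v_upper[1]),
            gamma * (v_upper[1] - beta * v_upper[0]),
            v_upper[2],
            v_upper[3])


V = (3.0, 1.0, 2.0, 0.0)
U = (1.0, 4.0, 0.0, -1.0)
beta = 0.6

print(scalar_product(V, U), scalar_product(boost_x(V, beta), boost_x(U, beta)))
print(scalar_product(V, V), scalar_product(boost_x(V, beta), boost_x(V, beta)))
```

Each printed pair agrees up to rounding, although the individual components of the vectors change under the boost.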
The three space components of the four-vector $V^\mu$ form the three-dimensional
vector $\mathbf{V}$, so we will use the notation

$$V^\mu \equiv (V^0, \mathbf{V}), \quad V_\mu = (V_0, -\mathbf{V}) = (V^0, -\mathbf{V}), \tag{77}$$

so that the square magnitude of $V^\mu$ may be given by

$$V_\mu V^\mu = (V^0)^2 - (\mathbf{V})^2. \tag{78}$$

In particular, we have

$$x^\mu = (ct, \mathbf{r}), \quad x_\mu = (ct, -\mathbf{r}), \quad s^2 = x_\mu x^\mu = c^2t^2 - \mathbf{r}^2, \tag{79}$$

where

$$\mathbf{r} = (x, y, z) = (x^1, x^2, x^3). \tag{80}$$

[12] Note that the expression $V^\mu U_\mu$ is equal to $V_\mu U^\mu$ when there is a sum over $\mu$, but only
the latter gives this sum when it comes to matrix multiplication.


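Let us now rewrite the Lorentz transformation in (34) as

$$x'^0 = \gamma x^0 - \beta\gamma x^1, \quad x'^1 = -\beta\gamma x^0 + \gamma x^1, \quad x'^2 = x^2, \quad x'^3 = x^3, \tag{81}$$

in order to note that

$$\frac{\partial x'^0}{\partial x^0} = \gamma, \quad \frac{\partial x'^0}{\partial x^1} = -\beta\gamma, \quad \ldots, \tag{82}$$

so that $\partial x'^\mu/\partial x^\nu$ are the entries of our transformation matrix.[13] We define

$$\Lambda^\mu{}_\nu \equiv \frac{\partial x'^\mu}{\partial x^\nu}, \tag{83}$$

and hence we can write

$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{84}$$

The Lorentz transformation for the four-dimensional radius vector, as in (81), can
now be written as

$$x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu, \tag{85}$$

or simply

$$x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{86}$$

using Einstein's convention. In conclusion, by definition, a contravariant four-vector
is a set of four quantities $V^\mu$ which transform according to[14]

$$V'^\mu = \Lambda^\mu{}_\nu V^\nu. \tag{87}$$

Remember now that, if $\phi = \phi(x^1, \ldots, x^n)$ is a scalar function, then the differential
of $\phi$ is given by

$$d\phi = \sum_{\mu=1}^{n}\frac{\partial\phi}{\partial x^\mu}dx^\mu = \frac{\partial\phi}{\partial x^\mu}dx^\mu = \partial_\mu\phi\,dx^\mu, \tag{88}$$

where

$$\partial_\mu\phi \equiv \frac{\partial\phi}{\partial x^\mu}. \tag{89}$$

[13] It is also important to note that $\frac{\partial x^\mu}{\partial x^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases}$. It turns
out that the quantity on the right-hand side is a very special kind of four-dimensional tensor,
which will be defined later.
[14] Equation (87) makes sense either as matrix multiplication or as a system of equations
when $\mu, \nu = 0, 1, 2, 3$.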
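This partial derivative transforms as some sort of vector, but not as a contravariant
one. From the chain rule, we know that

$$\partial'_\mu\phi = \frac{\partial\phi}{\partial x'^\mu} = \frac{\partial\phi}{\partial x^\nu}\frac{\partial x^\nu}{\partial x'^\mu} = \partial_\nu\phi\,\frac{\partial x^\nu}{\partial x'^\mu}. \tag{90}$$

This new transformation is, by definition, the transformation of a covariant
four-vector $V_\mu$:

$$V'_\mu = V_\nu\frac{\partial x^\nu}{\partial x'^\mu}. \tag{91}$$

In the case of the Lorentz transformation, $\partial x^\nu/\partial x'^\mu$ are the components of the
inverse transformation matrix, and we write

$$\frac{\partial x^\nu}{\partial x'^\mu} = (\Lambda^{-1})^\nu{}_\mu, \tag{92}$$

so that the transformation of a covariant four-vector can be written as[15]

$$V'_\mu = V_\nu(\Lambda^{-1})^\nu{}_\mu. \tag{93}$$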

2.2 Four-tensors
Either kind of vector is an example of a more general object called a tensor,
which has a linear transformation rule. The simplest kind of tensor $S$ is the one
unchanged under transformation, that is, $S' = S$, and this is the characteristic of
a scalar. The rank is defined as the number of indices carried. This way, scalars are
tensors of rank 0 and vectors are tensors of rank 1.
A four-dimensional tensor of the second rank $V^{\mu\nu}$, also called a four-tensor, is
a set of 4 × 4 = 16 quantities which transform like the products of components
of two four-vectors under coordinate transformations. It is worth recalling that
the transformation of a contravariant four-vector is given by

$$V'^\mu = \Lambda^\mu{}_\nu V^\nu, \tag{94}$$

and a covariant one transforms like

$$V'_\mu = V_\nu(\Lambda^{-1})^\nu{}_\mu, \tag{95}$$

and therefore, by definition, our tensor of rank 2 transforms like

$$V'^{\mu\nu} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta V^{\alpha\beta}. \tag{96}$$

[15] Note that it makes sense to write $V'_\mu = (\Lambda^{-1})^\nu{}_\mu V_\nu$ as a relation between the components,
but not as a matrix multiplication.
The components of a second-rank tensor, however, can be written as $V^{\mu\nu}$ (contravariant),
$V_{\mu\nu}$ (covariant) or $V^\mu{}_\nu$ (mixed). Therefore, the contravariant one
transforms like (96), and the covariant and mixed ones transform, respectively,
like

$$V'_{\mu\nu} = V_{\alpha\beta}(\Lambda^{-1})^\alpha{}_\mu(\Lambda^{-1})^\beta{}_\nu \tag{97}$$

and

$$V'^\mu{}_\nu = \Lambda^\mu{}_\alpha V^\alpha{}_\beta(\Lambda^{-1})^\beta{}_\nu. \tag{98}$$

Raising or lowering a space index (1, 2, 3) changes the sign of the component, and
raising or lowering the time index (0) does not change the sign, so that

$$V_0{}^i = V^{0i}, \quad V^i{}_j = -V^{ij}, \quad V_{ij} = V^{ij}, \quad \ldots, \tag{99}$$

where $i, j = 1, 2, 3$. If

$$V^{\mu\nu} = V^{\nu\mu}, \tag{100}$$

then the tensor $V^{\mu\nu}$ is called symmetric. Similarly, a tensor is called antisymmetric
or skew symmetric if[16]

$$V^{\mu\nu} = -V^{\nu\mu}. \tag{101}$$

Clearly, the diagonal components $V^{\mu\mu}$ (no sum here) of an antisymmetric tensor
are zero since $V^{\mu\mu} = -V^{\mu\mu}$. For a symmetric mixed tensor, we have $V^\mu{}_\nu = V_\nu{}^\mu$,
so that we will simply write $V^\mu_\nu$.
From a mixed tensor $V^\mu{}_\nu$, we can form a scalar by an operation called
contraction:

$$V^\mu{}_\mu = V^0{}_0 + V^1{}_1 + V^2{}_2 + V^3{}_3. \tag{102}$$

This scalar is called the trace of the tensor. Note that the formation of a scalar
product of two vectors is also a contraction operation.
We define similarly four-tensors of higher rank. For example, a fourth-rank
mixed tensor $V^{\mu\nu}{}_{\alpha\beta}$ is a set of $4^4 = 256$ quantities which transform according
to

$$V'^{\mu\nu}{}_{\rho\sigma} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta V^{\alpha\beta}{}_{\gamma\delta}(\Lambda^{-1})^\gamma{}_\rho(\Lambda^{-1})^\delta{}_\sigma. \tag{103}$$

From a tensor with at least one contravariant and one covariant index,
one can perform a contraction as before, and each contraction decreases
the rank of the tensor by 2. For instance, examples of contractions of the
fourth-rank tensor $V^{\mu\nu}{}_{\alpha\beta}$ are the second-rank tensors $V^{\mu\nu}{}_{\mu\beta}$ and $V^{\mu\beta}{}_{\alpha\beta}$, or
even the scalar $V^{\mu\nu}{}_{\mu\nu}$.
In a tensor equation, the two sides must contain identical free indices of
the same type (contravariant or covariant). For example, $V^\mu{}_\nu = U^\mu W_\nu$ makes
sense, while $V^\mu = U_\mu$ does not. The repeated indices may be replaced by any
other letter of the same alphabet, Greek for Greek and Latin for Latin (and remember
that the two occurrences are of different types). For example,
$V^{\mu\nu}{}_{\mu\nu} = V^{\nu\mu}{}_{\nu\mu} = V^{\alpha\beta}{}_{\alpha\beta}$, while $V_\mu U^\mu$ and $V_iU^i$ are completely
different expressions, since the first sum runs from 0 to 3 and the second from 1 to 3.

[16] The definition is the same if $V_{\mu\nu} = \pm V_{\nu\mu}$, $V^\mu{}_\nu = \pm V_\nu{}^\mu$, etc. Note that the matrices
associated to these tensors are symmetric/antisymmetric themselves.

2.3 Special Tensors


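Let us define the unit four-tensor $\delta^\mu_\nu$, also known as the Kronecker delta, as

$$\delta^\mu_\nu = \begin{cases} 1 & \text{if } \mu = \nu \\ 0 & \text{if } \mu \neq \nu \end{cases}. \tag{104}$$

The matrix form of $\delta^\mu_\nu$ is the identity matrix,

$$\delta^\mu_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{105}$$

and now we are able to affirm that

$$\frac{\partial x^\mu}{\partial x^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \delta^\mu_\nu. \tag{106}$$

Also, remembering that repeated indices are summed, one should note that

$$\delta^\mu_\nu V^\nu = V^\mu \tag{107}$$

and

$$\delta^\mu_\nu V_\mu = V_\nu, \tag{108}$$

so the transformation law for $\delta^\mu_\nu$ will be

$$\delta'^\mu_\nu = \Lambda^\mu{}_\alpha\delta^\alpha_\beta(\Lambda^{-1})^\beta{}_\nu = \Lambda^\mu{}_\alpha(\Lambda^{-1})^\alpha{}_\nu = \delta^\mu_\nu, \tag{109}$$

since

$$\Lambda^\mu{}_\alpha(\Lambda^{-1})^\alpha{}_\nu = \frac{\partial x'^\mu}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial x'^\nu} = \frac{\partial x'^\mu}{\partial x'^\nu} = \delta^\mu_\nu, \tag{110}$$

and so $\delta'^\mu_\nu = \delta^\mu_\nu$, and it is therefore an invariant tensor.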
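By raising the one index or lowering the other in $\delta^\mu_\nu$, we can define the metric
tensors $g^{\mu\nu} \equiv \delta^{\mu\nu}$ and $g_{\mu\nu} \equiv \delta_{\mu\nu}$. Considering the sign rules for raising
and lowering indices in (99), it follows immediately that

$$g_{\mu\nu} = g^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{111}$$

We have then[17]

$$g_{\mu\nu}V^\nu = V_\mu, \quad g^{\mu\nu}V_\nu = V^\mu, \tag{112}$$

so that the metric tensor $g_{\mu\nu}$ can be used to lower an index and $g^{\mu\nu}$ can be used
to raise an index.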
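The role of the metric tensor in (112) and the covariant transformation rule (97) can both be tried out numerically. The sketch below uses nested lists for the matrices; all function names are illustrative assumptions. It also checks a standard property that the text does not prove: the components of $g_{\mu\nu}$ come out unchanged when (97) is applied with the inverse boost matrix of (38).

```python
import math

# Metric tensor of (111); the same array serves for g_{mu nu} and g^{mu nu}.
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def lower_index(v_upper):
    """V_mu = g_{mu nu} V^nu, first relation in (112)."""
    return [sum(G[m][n] * v_upper[n] for n in range(4)) for m in range(4)]


def raise_index(v_lower):
    """V^mu = g^{mu nu} V_nu, second relation in (112)."""
    return [sum(G[m][n] * v_lower[n] for n in range(4)) for m in range(4)]


def inverse_boost_x(beta):
    """Inverse Lorentz matrix of (38) for a boost in the x-direction."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, beta * gamma, 0.0, 0.0],
            [beta * gamma, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_covariant_rank2(t, beta):
    """Apply (97): T'_{mu nu} = T_{alpha beta} (L^-1)^alpha_mu (L^-1)^beta_nu."""
    inv = inverse_boost_x(beta)
    return [[sum(t[a][b] * inv[a][m] * inv[b][n]
                 for a in range(4) for b in range(4))
             for n in range(4)]
            for m in range(4)]


V = [5.0, 1.0, 2.0, 3.0]
print(lower_index(V))
print(raise_index(lower_index(V)))
print(transform_covariant_rank2(G, 0.6))
```

Lowering and then raising an index returns the original components, and the transformed metric agrees with (111) up to rounding.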
The completely antisymmetric unit tensor of fourth rank, $\epsilon^{\mu\nu\rho\sigma}$, is the tensor
whose components change sign under interchange of any pair of indices, and
whose nonzero components are ±1. Since $\epsilon^{\mu\nu\rho\sigma}$ is antisymmetric, it vanishes if
two indices are the same. We set

$$\epsilon^{0123} = +1, \quad \epsilon_{0123} = -1, \tag{113}$$

and all the nonvanishing components can be brought to the arrangement 0, 1, 2, 3
by an even or odd number of transpositions. Since there are 4! = 24 nonvanishing
components, we have

$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\mu\nu\rho\sigma} = -24. \tag{114}$$

Strictly speaking, $\epsilon^{\mu\nu\rho\sigma}$ is not a tensor, but rather a pseudotensor: if we
change the sign of one or three of the coordinates, then the components $\epsilon^{\mu\nu\rho\sigma}$
do not change, whereas some of the components of a tensor should change
sign; on the other hand, with respect to rotations of the coordinate system, the
quantities $\epsilon^{\mu\nu\rho\sigma}$ behave like the components of a tensor.
The product $\epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\beta\gamma\delta}$ forms a tensor of rank 8, which is a true tensor.
We can contract one or more pairs of indices and obtain tensors of rank 6, 4, 2, 0
(a tensor of rank 0 is a scalar). Since all these tensors have the same form in
all coordinate systems, their components must be expressed as combinations of
products of components of the unit tensor $\delta^\mu_\nu$.
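The following equation and its particular cases, which will not be proved
here, can be found by starting from the symmetries that the quantities must
possess under permutation of indices:

$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\beta\gamma\delta} = -\begin{vmatrix} \delta^\mu_\alpha & \delta^\mu_\beta & \delta^\mu_\gamma & \delta^\mu_\delta \\ \delta^\nu_\alpha & \delta^\nu_\beta & \delta^\nu_\gamma & \delta^\nu_\delta \\ \delta^\rho_\alpha & \delta^\rho_\beta & \delta^\rho_\gamma & \delta^\rho_\delta \\ \delta^\sigma_\alpha & \delta^\sigma_\beta & \delta^\sigma_\gamma & \delta^\sigma_\delta \end{vmatrix}. \tag{115}$$

In particular,

$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\beta\rho\sigma} = -2(\delta^\mu_\alpha\delta^\nu_\beta - \delta^\mu_\beta\delta^\nu_\alpha), \quad \epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\nu\rho\sigma} = -6\delta^\mu_\alpha. \tag{116}$$

Also, the product $\epsilon_{ijk}\epsilon_{lmn}$, which is a true three-dimensional tensor of rank 6,
is given by

$$\epsilon_{ijk}\epsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix}, \tag{117}$$

so, in particular, we have

$$\epsilon_{ijk}\epsilon_{lmk} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}, \quad \epsilon_{ijk}\epsilon_{ljk} = 2\delta_{il}, \quad \epsilon_{ijk}\epsilon_{ijk} = 6. \tag{118}$$

[17] Equations $g_{\mu\nu}V^\nu = V_\mu$ and $g^{\mu\nu}V_\nu = V^\mu$ do not make sense as matrix multiplication,
only as a system of equations relating the components.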
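Although these identities are not proved here, they are easy to confirm by brute force. The sketch below computes the components of the antisymmetric tensors from the parity of a permutation; the function names are illustrative assumptions, and the three-dimensional indices are labelled 0, 1, 2 instead of 1, 2, 3 for convenience. The covariant components are obtained from the contravariant ones with an overall minus sign, consistent with (113), because every nonvanishing component carries exactly three space indices.

```python
from itertools import product


def parity(indices):
    """Return +1 or -1 for an even or odd permutation (inversion count)."""
    n = len(indices)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if indices[i] > indices[j])
    return -1 if inversions % 2 else 1


def epsilon_upper(mu, nu, rho, sigma):
    """Contravariant components, normalised as in (113)."""
    idx = (mu, nu, rho, sigma)
    return parity(idx) if len(set(idx)) == 4 else 0


def epsilon_lower(mu, nu, rho, sigma):
    """Covariant components: three lowered space indices give a factor -1."""
    return -epsilon_upper(mu, nu, rho, sigma)


def epsilon3(i, j, k):
    """Three-dimensional antisymmetric symbol with indices 0, 1, 2."""
    idx = (i, j, k)
    return parity(idx) if len(set(idx)) == 3 else 0


def delta(a, b):
    return 1 if a == b else 0


def full_contraction():
    """Left-hand side of (114)."""
    return sum(epsilon_upper(*idx) * epsilon_lower(*idx)
               for idx in product(range(4), repeat=4))


def check_3d_identity():
    """Check the first relation in (118) for all index values."""
    return all(
        sum(epsilon3(i, j, k) * epsilon3(l, m, k) for k in range(3))
        == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
        for i, j, l, m in product(range(3), repeat=4))


print(full_contraction())
print(check_3d_identity())
```

The same brute-force approach extends to the relations in (116) by summing over the contracted indices only.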

2.4 Differentiation
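We define the four-vector operator $\partial_\mu$ as

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left(\frac{\partial}{\partial x^0}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right). \tag{119}$$

Using previous notation, we can write

$$\partial_\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, \nabla\right) \tag{120}$$

and

$$\partial^\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\nabla\right), \tag{121}$$

where

$$\nabla \equiv \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right). \tag{122}$$

Let $\phi$ be a scalar function. The four-gradient of $\phi$ is the four-vector given by

$$\partial_\mu\phi = \left(\frac{1}{c}\frac{\partial\phi}{\partial t}, \nabla\phi\right). \tag{123}$$

Using this definition, the differential of the scalar $\phi$, which is given by

$$d\phi = \frac{\partial\phi}{\partial x^\mu}dx^\mu, \tag{124}$$

is a scalar, given by the Lorentz scalar product of two four-vectors. Let now
$V^\mu = (V^0, V^1, V^2, V^3) = (V^0, \mathbf{V})$ be a four-vector; then

$$\partial_\mu V^\mu = \frac{1}{c}\frac{\partial V^0}{\partial t} + \frac{\partial V^1}{\partial x} + \frac{\partial V^2}{\partial y} + \frac{\partial V^3}{\partial z} = \frac{1}{c}\frac{\partial V^0}{\partial t} + \nabla\cdot\mathbf{V} = \partial^\mu V_\mu. \tag{125}$$

In particular, the operator $\Box = \partial_\mu\partial^\mu = \partial^\mu\partial_\mu$ is given by

$$\Box \equiv \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2, \tag{126}$$

also known as the D'Alembertian.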
3 Relativistic Mechanics
3.1 Four-velocity and Four-acceleration
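The ordinary three-dimensional velocity is given by

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} \tag{127}$$

or

$$v^i = \frac{dx^i}{dt}, \quad i = 1, 2, 3. \tag{128}$$

From this, one can form a four-vector, but since $dx^\mu$ is a four-vector and the
quantity $d\tau$ is a scalar (not $dt$), we can define

$$U^\mu = \frac{dx^\mu}{d\tau}. \tag{129}$$

From the chain rule, it follows that

$$U^\mu = \frac{dx^\mu}{dt}\frac{dt}{d\tau}. \tag{130}$$

Since $dt/d\tau = \gamma$, we have

$$U^\mu = \gamma\frac{dx^\mu}{dt}, \tag{131}$$

but since $x^\mu = (ct, \mathbf{r})$, we have

$$\frac{dx^\mu}{dt} = \frac{d}{dt}(ct, \mathbf{r}) = (c, \mathbf{v}), \tag{132}$$

so that

$$U^\mu = (U^0, \mathbf{U}) = (\gamma c, \gamma\mathbf{v}) \tag{133}$$

and therefore

$$U_\mu = (U^0, -\mathbf{U}) = (\gamma c, -\gamma\mathbf{v}). \tag{134}$$

The contraction $U_\mu U^\mu$ must be a scalar. In fact,

$$U_\mu U^\mu = \gamma^2c^2 - \gamma^2v^2 = \gamma^2c^2\left(1 - \frac{v^2}{c^2}\right) = \gamma^2c^2\,\frac{1}{\gamma^2}, \tag{135}$$

or simply

$$U_\mu U^\mu = c^2. \tag{136}$$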
Geometrically, U µ is a four-vector tangent to the world line of the particle. In
a similar way, one can define the four-acceleration as

21
d2 xµ dU µ
Aµ = 2
= (137)
dτ dτ
and, analogously,

dU µ dt
 
d dγ dγ
Aµ = = γ (γc, γv) = γc , γ v + γ2a (138)
dt dτ dt dt dt
or

Aµ = γ γ̇c, γ γ̇v + γ 2 a ,

(139)
where a = dv/dt is the ordinary three-dimensional acceleration of the particle.
One may evaluate γ̇ and find
− 12
v2

d a·v 3
γ̇ = 1− 2 = γ , (140)
dt c c2
so that
 a·v a·v 
Aµ = γ 4 , γ4 2 v + γ2a . (141)
c c
Finally, differentiating (136) with respect to τ , we find

Uµ Aµ = 0, (142)
and this means that the four-velocity and four-acceleration are mutually per-
pendicular in our four-dimensional space.
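The normalisation of the four-velocity and its orthogonality to the four-acceleration can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the metric signature is taken as (+, -, -, -), and units with c = 1 are an assumption.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch

def dot3(a, b):
    """Ordinary three-dimensional scalar product."""
    return sum(x * y for x, y in zip(a, b))

def gamma(v):
    """Lorentz factor for a three-velocity v."""
    return 1.0 / math.sqrt(1.0 - dot3(v, v) / C**2)

def minkowski_dot(X, Y):
    """Scalar product X_mu Y^mu with signature (+, -, -, -)."""
    return X[0] * Y[0] - dot3(X[1:], Y[1:])

def four_velocity(v):
    """U^mu = (gamma c, gamma v)."""
    g = gamma(v)
    return [g * C] + [g * vi for vi in v]

def four_acceleration(v, a):
    """A^mu = (gamma^4 a.v / c, gamma^4 (a.v / c^2) v + gamma^2 a)."""
    g = gamma(v)
    av = dot3(a, v)
    space = [g**4 * av / C**2 * vi + g**2 * ai for vi, ai in zip(v, a)]
    return [g**4 * av / C] + space

v = [0.3, 0.4, 0.1]    # sample velocity with |v| < c
a = [1.0, -2.0, 0.5]   # sample three-acceleration
U = four_velocity(v)
A = four_acceleration(v, a)
norm_U = minkowski_dot(U, U)      # should equal c^2 up to rounding
U_dot_A = minkowski_dot(U, A)     # should vanish up to rounding
```

For any velocity below c and any acceleration, `norm_U` stays at c squared and `U_dot_A` stays at zero within floating-point error.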

3.2 Principle of Least Action


The principle of least action asserts that for each mechanical system there exists
an integral S, defined as the action, which has minimum value for the actual
motion, so that the variation δS is zero. This integral must be invariant under
Lorentz transformation, since it must not depend on the choice of reference
system, and so it depends on a scalar. Furthermore, this scalar is proportional
to ds since this is the only scalar that one can construct for a free particle. The
action is then
$$S = -\alpha\int_a^b ds, \tag{143}$$
where the integral is along the world line of the particle between two events, and $\alpha$ is some constant which must be positive since $\int_a^b ds$ has its maximum value along a straight world line. If we represent the action as
$$S = \int_{t_1}^{t_2} L\,dt, \tag{144}$$
where $L$ is the Lagrange function of the mechanical system, then using the results in (11) and (18) we can write
$$S = -\int_{t_1}^{t_2}\frac{\alpha c}{\gamma}\,dt, \tag{145}$$
and comparing with (144), the Lagrangian of the free particle is
$$L = -\frac{\alpha c}{\gamma} = -\alpha c\sqrt{1 - v^2/c^2}. \tag{146}$$
The constant $\alpha$ characterises the particle, but in classical mechanics each particle is characterised by its mass $m$. To find the relation between $\alpha$ and $m$, note that in the limit $c \to \infty$ we must recover the classical expression $L = mv^2/2$. Expanding $L$ in powers of $v/c$ and keeping the first two terms,
$$L \approx -\alpha c + \frac{\alpha v^2}{2c}, \tag{147}$$
and noting that constant terms do not affect the equation of motion, the term $-\alpha c$ can be omitted. Comparing with $L = mv^2/2$, we have
$$\alpha = mc, \tag{148}$$
so that
$$S = -mc\int_a^b ds \tag{149}$$
and
$$L = -\frac{mc^2}{\gamma}. \tag{150}$$
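As a quick numerical illustration of the low-velocity limit, one can compare the relativistic Lagrangian, with the constant $-mc^2$ removed, against $mv^2/2$. This is a sketch with hypothetical function names and arbitrarily chosen sample values.

```python
import math

def free_lagrangian(m, v, c):
    """Relativistic free-particle Lagrangian L = -m c^2 sqrt(1 - v^2/c^2)."""
    return -m * c**2 * math.sqrt(1.0 - v**2 / c**2)

def classical_lagrangian(m, v):
    """Newtonian free-particle Lagrangian L = m v^2 / 2."""
    return 0.5 * m * v**2

m, v, c = 1.0, 1.0, 1000.0                    # sample values with v/c = 1e-3
shifted = free_lagrangian(m, v, c) + m * c**2  # drop the constant term -m c^2
difference = shifted - classical_lagrangian(m, v)
```

The leftover `difference` is of order $mv^4/c^2$, which is the next term of the expansion.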

3.3 Four-momentum and Energy


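The three components of the momentum of a particle are given by the derivatives of $L$ with respect to the corresponding components of $\mathbf{v}$. In other words,
$$p_i = \frac{\partial L}{\partial v^i} \tag{151}$$
or, knowing that $L = -mc^2/\gamma = -mc^2\sqrt{1 - v^2/c^2}$,
$$\mathbf{p} = \frac{\partial L}{\partial\mathbf{v}} = -mc^2\,\frac{1}{2}\left(1 - \frac{v^2}{c^2}\right)^{-1/2}\left(\frac{-2\mathbf{v}}{c^2}\right), \tag{152}$$
or
$$\mathbf{p} = \gamma m\mathbf{v}. \tag{153}$$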

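One should note that if $v \ll c$ or $c \to \infty$ then $\gamma \approx 1$, so that we recover $\mathbf{p} = m\mathbf{v}$ above. Also, if $v \to c$ then $|\mathbf{p}| \to \infty$. The force acting on the particle is given by $d\mathbf{p}/dt$. If one supposes that the force is directed perpendicular to $\mathbf{v}$, so that $v^2$ (and hence $\gamma$) is constant, then
$$\frac{d\mathbf{p}}{dt} = \gamma m\frac{d\mathbf{v}}{dt}. \tag{154}$$
On the other hand, if the force is parallel to $\mathbf{v}$, then the velocity changes only in magnitude, so that the unit vector $\hat{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$ is constant. If we write
$$\mathbf{p} = \frac{mv}{\sqrt{1 - \frac{v^2}{c^2}}}\,\hat{\mathbf{v}}, \tag{155}$$
then, for a force parallel to $\mathbf{v}$, we have
$$\frac{d\mathbf{p}}{dt} = m\frac{d}{dt}\left(\frac{v}{\sqrt{1 - \frac{v^2}{c^2}}}\right)\hat{\mathbf{v}} = m\left[\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\frac{dv}{dt} + v\left(-\frac{1}{2}\right)\left(1 - \frac{v^2}{c^2}\right)^{-3/2}\left(\frac{-2v}{c^2}\right)\frac{dv}{dt}\right]\hat{\mathbf{v}} \tag{156}$$
or simply
$$\frac{d\mathbf{p}}{dt} = m\frac{dv}{dt}\left(\gamma + \frac{v^2}{c^2}\gamma^3\right)\hat{\mathbf{v}} = \gamma^3m\mathbf{a}\left(\frac{1}{\gamma^2} + \frac{v^2}{c^2}\right) = \gamma^3m\mathbf{a}, \tag{157}$$
and this means that the ratio of the force to the acceleration is different in the two cases. The energy $E$ of the particle is the quantity
$$E = \mathbf{p}\cdot\mathbf{v} - L, \tag{158}$$
and using the expressions for $L$ and $\mathbf{p}$, we find
$$E = \gamma mv^2 + \frac{mc^2}{\gamma} = \gamma mc^2\left(\frac{v^2}{c^2} + \frac{1}{\gamma^2}\right), \tag{159}$$
or simply
$$E = \gamma mc^2. \tag{160}$$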
This expression shows that if $v = 0$ then the energy of the free particle is $E = mc^2$, which is defined as the rest energy. Also, for small velocities $v/c \ll 1$ one can expand the expression for the energy and find
$$E = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}} \approx mc^2 + \frac{mv^2}{2}, \tag{161}$$
and this result was expected since the term $mv^2/2$ is the classical expression for the kinetic energy of the particle. Squaring now equations (153) and (160), we have, respectively, $p^2 = \gamma^2m^2v^2$ and $E^2 = \gamma^2m^2c^4$. Comparing these two equations, we have the relation between the energy and the momentum of the particle,
$$\frac{E^2}{c^2} = p^2 + m^2c^2. \tag{162}$$
The energy expressed as a function of the momentum is called the Hamiltonian $H$ of the system, so that in our case
$$H = \sqrt{p^2c^2 + m^2c^4}. \tag{163}$$
Note that, if $p \ll mc$, then the Hamiltonian is approximately given by
$$H \approx mc^2 + \frac{p^2}{2m}, \tag{164}$$
which, except for the rest energy, is the classical expression for the Hamiltonian.
Knowing now that $\mathbf{p} = \gamma m\mathbf{v}$ and $E = \gamma mc^2$, we find the relation between the energy, velocity and momentum of the particle,
$$\mathbf{p} = \frac{E}{c^2}\mathbf{v}. \tag{165}$$
From the equations for the momentum and energy, if $v = c$ then both of them are infinite, so that a particle with mass different from zero cannot move with velocity $v = c$. On the other hand, from the expression relating the momentum and energy above, particles of zero mass can exist, and for such particles we have
$$p = \frac{E}{c}. \tag{166}$$
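The relations between energy, momentum and velocity derived above can be verified for a sample particle. The snippet below is a minimal sketch with hypothetical function names and sample values chosen so that $\gamma = 1.25$.

```python
import math

def gamma(v, c):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - v * v / (c * c))

def momentum(m, v, c):
    """p = gamma m v, equation (153)."""
    return gamma(v, c) * m * v

def energy(m, v, c):
    """E = gamma m c^2, equation (160)."""
    return gamma(v, c) * m * c * c

def hamiltonian(m, p, c):
    """H = sqrt(p^2 c^2 + m^2 c^4), equation (163)."""
    return math.sqrt(p * p * c * c + m * m * c**4)

m, v, c = 2.0, 0.6, 1.0          # sample values (units with c = 1 assumed)
p = momentum(m, v, c)
E = energy(m, v, c)
H = hamiltonian(m, p, c)         # should coincide with E
p_from_E = E * v / (c * c)       # equation (165), should coincide with p
```

Here the energy computed from the velocity agrees with the Hamiltonian computed from the momentum, as (162) requires.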
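In four-dimensional form, according to the principle of least action we have
$$\delta S = -mc\,\delta\int_a^b ds = -mc\,\delta\int_a^b\sqrt{dx_\mu dx^\mu} = 0 \tag{167}$$
since $ds^2 = dx_\mu dx^\mu$. In other words,
$$\delta S = -mc\int_a^b\frac{dx_\mu\,\delta dx^\mu}{ds} = -mc\int_a^b U_\mu\,d\delta x^\mu, \tag{168}$$
where in this derivation $U_\mu = dx_\mu/ds$ denotes the four-velocity of (129) divided by $c$. Integrating by parts, we easily get
$$\delta S = -mc\,U_\mu\delta x^\mu\Big|_a^b + mc\int_a^b\delta x^\mu\frac{dU_\mu}{ds}\,ds. \tag{169}$$
The first term of this equation is zero since $(\delta x^\mu)_a = (\delta x^\mu)_b = 0$, so that we have
$$\delta S = mc\int_a^b\delta x^\mu\frac{dU_\mu}{ds}\,ds = 0, \tag{170}$$
and hence
$$\frac{dU_\mu}{ds} = 0. \tag{171}$$
Now, let us consider the point $a$ as fixed, so that $(\delta x^\mu)_a = 0$, and the point $b$ as variable, restricting ourselves to actual trajectories, so that we find
$$\delta S = -mc\,U_\mu\delta x^\mu, \tag{172}$$
where $\delta x^\mu$ replaces $(\delta x^\mu)_b$. The momentum four-vector or four-momentum is the four-vector given by
$$P_\mu = -\frac{\partial S}{\partial x^\mu}. \tag{173}$$
From classical mechanics, we know that $p_i = \partial S/\partial x^i$ are the three components of the momentum vector. Also, the derivative $-\partial S/\partial t$ is the energy $E$ of the particle. Remembering that $x^0 = ct$, we can now write
$$P_\mu = (E/c, -\mathbf{p}) \tag{174}$$
and then
$$P^\mu = (E/c, \mathbf{p}). \tag{175}$$
One can note that this can also be written as
$$P^\mu = mU^\mu, \tag{176}$$
where $U^\mu = (\gamma c, \gamma\mathbf{v})$ is the four-velocity of the particle. The expression $P_\mu P^\mu$ must be a scalar. In fact, we have
$$P_\mu P^\mu = m^2U_\mu U^\mu = m^2c^2. \tag{177}$$
The force four-vector or four-force is, by analogy, defined as the derivative
$$F^\mu = \frac{dp^\mu}{ds} = mc\frac{dU^\mu}{ds}, \tag{178}$$
with $U^\mu = dx^\mu/ds$ on the right-hand side, and its components satisfy $F_\mu U^\mu = 0$. In terms of the three-dimensional force $\mathbf{f} = d\mathbf{p}/dt$, this can be written as
$$F^\mu = \left(\frac{\gamma\,\mathbf{f}\cdot\mathbf{v}}{c^2},\ \frac{\gamma\mathbf{f}}{c}\right), \tag{179}$$
where the time component is related to the work done by the force.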

4 Charges in Electromagnetic Fields
4.1 Four-potential of a Field
In an electromagnetic field, the action function of a particle is given by the action $S = -mc\int_a^b ds$ for the free particle plus a term describing the interaction of the particle with the field, determined by the charge of the particle. The properties of the field are characterised by the four-potential $A^\mu$. The components of this four-vector are functions of the spatial coordinates and time. The space components of $A^\mu$ form the three-dimensional vector $\mathbf{A}$, called the vector potential, and the time component is denoted as $A^0 \equiv \phi$, called the scalar potential. This way, we have
$$A^\mu = (\phi, \mathbf{A}), \qquad A_\mu = (\phi, -\mathbf{A}). \tag{180}$$


The components of $A_\mu$ appear in the action function in the term
$$-\frac{q}{c}\int_a^b A_\mu\,dx^\mu, \tag{181}$$
where $q$ is the charge of the particle. This way, the action has the form
$$S = \int_a^b\left(-mc\,ds - \frac{q}{c}A_\mu\,dx^\mu\right) = \int_a^b\left(-mc\,ds + \frac{q}{c}\mathbf{A}\cdot d\mathbf{r} - q\phi\,dt\right), \tag{182}$$
using the expression for $A_\mu$ above and for the infinitesimal four-dimensional radius vector $dx^\mu = (c\,dt, d\mathbf{r})$. Substituting now $d\mathbf{r} = \mathbf{v}\,dt$ and $ds = c\,d\tau = c\,dt/\gamma$ above, we can change the integral above to an integration over $t$ and obtain
$$S = \int_{t_1}^{t_2}\left(-\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi\right)dt, \tag{183}$$
and so the Lagrangian for a charge in an electromagnetic field is given by
$$L = -\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi. \tag{184}$$
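One can now find the components of the generalised momentum of the particle,
$$P_i = \frac{\partial L}{\partial v^i} = -mc^2\,\frac{1}{2}\left(1 - \frac{v^2}{c^2}\right)^{-1/2}\left(-\frac{2v^i}{c^2}\right) + \frac{q}{c}A_i, \tag{185}$$
or, in other words,
$$\mathbf{P} = \gamma m\mathbf{v} + \frac{q}{c}\mathbf{A} = \mathbf{p} + \frac{q}{c}\mathbf{A}. \tag{186}$$
The Hamiltonian function can be found using the expression $H = \mathbf{v}\cdot\frac{\partial L}{\partial\mathbf{v}} - L$, so that for a particle in a field we have
$$H = \mathbf{v}\cdot\left(\gamma m\mathbf{v} + \frac{q}{c}\mathbf{A}\right) + \frac{mc^2}{\gamma} - \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + q\phi = \gamma mv^2 + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + \frac{mc^2}{\gamma} - \frac{q}{c}\mathbf{A}\cdot\mathbf{v} + q\phi \tag{187}$$
or simply
$$H = \gamma mc^2 + q\phi, \tag{188}$$
since $\gamma mv^2 + mc^2/\gamma = \gamma mc^2(1/\gamma^2 + v^2/c^2) = \gamma mc^2$. One can now express $H$ as a function of the generalised momentum $\mathbf{P}$ and find
$$H = \sqrt{m^2c^4 + c^2\left(\mathbf{P} - \frac{q}{c}\mathbf{A}\right)^2} + q\phi. \tag{189}$$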

4.2 Equations of Motion of a Charge in a Field


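The equations of motion of a particle with small charge $q$ can be found using the Lagrange equations
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) - \frac{\partial L}{\partial\mathbf{r}} = 0, \tag{190}$$
where
$$L = -\frac{mc^2}{\gamma} + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - q\phi. \tag{191}$$
As seen before, we have
$$\frac{\partial L}{\partial\mathbf{v}} = \mathbf{P} = \gamma m\mathbf{v} + \frac{q}{c}\mathbf{A} \tag{192}$$
and furthermore
$$\frac{\partial L}{\partial\mathbf{r}} = \nabla L = \frac{q}{c}\nabla(\mathbf{A}\cdot\mathbf{v}) - q\nabla\phi. \tag{193}$$
Using for $\mathbf{A}$ and $\mathbf{v}$ the identity $\nabla(\mathbf{a}\cdot\mathbf{b}) = (\mathbf{a}\cdot\nabla)\mathbf{b} + (\mathbf{b}\cdot\nabla)\mathbf{a} + \mathbf{b}\times(\nabla\times\mathbf{a}) + \mathbf{a}\times(\nabla\times\mathbf{b})$ for arbitrary vectors $\mathbf{a}$ and $\mathbf{b}$, we find
$$\frac{\partial L}{\partial\mathbf{r}} = \frac{q}{c}\left[(\mathbf{A}\cdot\nabla)\mathbf{v} + (\mathbf{v}\cdot\nabla)\mathbf{A} + \mathbf{v}\times(\nabla\times\mathbf{A}) + \mathbf{A}\times(\nabla\times\mathbf{v})\right] - q\nabla\phi. \tag{194}$$
Now, note that $\mathbf{v}$ is held constant in this differentiation, so that we simply have
$$\frac{\partial L}{\partial\mathbf{r}} = \frac{q}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}) - q\nabla\phi. \tag{195}$$
This way, the Lagrange equation in (190) becomes
$$\frac{d}{dt}\left(\mathbf{p} + \frac{q}{c}\mathbf{A}\right) - \frac{q}{c}(\mathbf{v}\cdot\nabla)\mathbf{A} - \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}) + q\nabla\phi = 0. \tag{196}$$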

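The components of the vector potential $\mathbf{A}$ are functions of the spatial coordinates and time, so that
$$d\mathbf{A} = \frac{\partial\mathbf{A}}{\partial t}dt + \frac{\partial\mathbf{A}}{\partial\mathbf{r}}d\mathbf{r}, \tag{197}$$
or
$$d\mathbf{A} = \frac{\partial\mathbf{A}}{\partial t}dt + (d\mathbf{r}\cdot\nabla)\mathbf{A}, \tag{198}$$
which gives
$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A}. \tag{199}$$
Finally, equation (196) gives us
$$\frac{d\mathbf{p}}{dt} = -\frac{q}{c}\frac{\partial\mathbf{A}}{\partial t} - q\nabla\phi + \frac{q}{c}\mathbf{v}\times(\nabla\times\mathbf{A}). \tag{200}$$
The derivative of the momentum with respect to time, on the left-hand side of the above equation, is the force exerted on the charge in an electromagnetic field. The terms $-(q/c)\partial\mathbf{A}/\partial t$ and $-q\nabla\phi$ do not depend on $\mathbf{v}$. We denote this force per unit charge as the electric field intensity $\mathbf{E}$, so that
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi. \tag{201}$$
The term $(q/c)\mathbf{v}\times(\nabla\times\mathbf{A})$ depends on the velocity and is proportional and perpendicular to it. The factor multiplying $\mathbf{v}/c$, per unit charge, is called the magnetic field intensity $\mathbf{B}$, so that
$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{202}$$
Using these definitions, we can write the equation of motion of a charge in an electromagnetic field as
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{203}$$
whose right-hand side is called the Lorentz force.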
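The Lorentz force in Gaussian units is simple to evaluate component by component. The following is a minimal sketch; the helper names `cross` and `lorentz_force` are hypothetical, and the sample field values are arbitrary.

```python
def cross(a, b):
    """Three-dimensional vector product a x b."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def lorentz_force(q, E, B, v, c):
    """dp/dt = q E + (q/c) v x B in Gaussian units."""
    v_cross_B = cross(v, B)
    return [q * Ei + (q / c) * wi for Ei, wi in zip(E, v_cross_B)]

# A charge moving along x in a purely magnetic field along z (sample values).
force = lorentz_force(q=1.0, E=[0.0, 0.0, 0.0], B=[0.0, 0.0, 1.0],
                      v=[1.0, 0.0, 0.0], c=1.0)
# Power delivered by the force, f . v, which vanishes for a purely magnetic field.
power = sum(fi * vi for fi, vi in zip(force, [1.0, 0.0, 0.0]))
```

In this example the force points along the negative y-axis and does no work on the charge, consistent with the magnetic term being perpendicular to the velocity.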

4.3 Gauge Invariance


To one and the same field, there may correspond different potentials. Let us add to the components of the potential $A_\mu$ the quantity $-\partial_\mu\chi$, where $\chi = \chi(t, x, y, z)$ is an arbitrary function. Our new potential four-vector is then
$$A'_\mu = A_\mu - \partial_\mu\chi. \tag{204}$$
With this change, there appears in the action integral the additional term
$$\frac{q}{c}\frac{\partial\chi}{\partial x^\mu}dx^\mu = d\left(\frac{q}{c}\chi\right), \tag{205}$$
which is a total differential and hence has no effect on the equations of motion. Using the vector and scalar potentials, the transformation in (204) is the same as
$$\mathbf{A}' = \mathbf{A} + \nabla\chi, \qquad \phi' = \phi - \frac{1}{c}\frac{\partial\chi}{\partial t}. \tag{206}$$
The fields $\mathbf{E} = -(1/c)\partial\mathbf{A}/\partial t - \nabla\phi$ and $\mathbf{B} = \nabla\times\mathbf{A}$ do not change: $\mathbf{B}$ is unaffected since $\nabla\times(\nabla\chi) = 0$ for any well behaved scalar field $\chi$, and in $\mathbf{E}$ the two new terms $-(1/c)\partial(\nabla\chi)/\partial t$ and $+(1/c)\nabla(\partial\chi/\partial t)$ cancel. This way, the potentials are not uniquely defined. The transformation in (206) is called a gauge transformation. As an example, it is always possible to choose the potentials so that the scalar potential $\phi$ is zero.

4.4 Constant Electromagnetic Field


An electromagnetic field is said to be constant when it does not depend on the
time. This way, we have E = −∇φ and B = ∇ × A, so that a constant electric
field is determined only by φ and a constant magnetic field is determined only by
A. Let us now determine the energy of a charge in a constant electromagnetic
field. First of all, if the field is constant, then the Lagrangian also does not
depend on the time explicitly, and in this case the energy is conserved and
coincides with the Hamiltonian. For a charge q in an electromagnetic field, we
have

E = γmc2 + qφ, (207)


so the presence of the field adds to the energy the term $q\phi$, which is the potential energy of the charge in the field. The magnetic field does not affect the energy of the charge since the vector potential $\mathbf{A}$ does not appear in the above expression for $E$. If the field intensities are the same at all points in space, then the field is called uniform. If the electric field is uniform, the scalar potential can be expressed as
$$\phi = -\mathbf{E}\cdot\mathbf{r}, \tag{208}$$
since constant $\mathbf{E}$ implies $\nabla(\mathbf{E}\cdot\mathbf{r}) = (\mathbf{E}\cdot\nabla)\mathbf{r} = \mathbf{E}$. On the other hand, the vector potential can be expressed as
$$\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r}, \tag{209}$$
since constant $\mathbf{B}$ implies $\nabla\times(\mathbf{B}\times\mathbf{r}) = \mathbf{B}(\nabla\cdot\mathbf{r}) - (\mathbf{B}\cdot\nabla)\mathbf{r} = 3\mathbf{B} - \mathbf{B} = 2\mathbf{B}$.
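That the potential $\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r}$ reproduces a uniform field can be checked with a finite-difference curl. This is a sketch under stated assumptions: the helpers `cross`, `vector_potential` and `curl` are hypothetical, and the step size `h` is an arbitrary choice.

```python
def cross(a, b):
    """Three-dimensional vector product a x b."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def vector_potential(B, r):
    """A = (1/2) B x r for a uniform field B."""
    return [0.5 * w for w in cross(B, r)]

def curl(field, r, h=1e-5):
    """Central-difference approximation of the curl of `field` at point r."""
    def partial(component, axis):
        r_plus, r_minus = list(r), list(r)
        r_plus[axis] += h
        r_minus[axis] -= h
        return (field(r_plus)[component] - field(r_minus)[component]) / (2 * h)
    return [
        partial(2, 1) - partial(1, 2),
        partial(0, 2) - partial(2, 0),
        partial(1, 0) - partial(0, 1),
    ]

B_uniform = [0.3, -1.2, 2.0]     # sample uniform field
point = [0.7, 0.1, -0.4]         # sample evaluation point
B_recovered = curl(lambda r: vector_potential(B_uniform, r), point)
```

Because this potential is linear in the coordinates, the central difference is exact up to rounding, so `B_recovered` matches `B_uniform` at any point.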

4.5 Motions in Constant Uniform Electromagnetic Field


The first kind of motion with which we will be dealing is the motion in a constant uniform electric field. Suppose there is a charge $q$ in a uniform constant electric field $\mathbf{E}$. The direction of $\mathbf{E}$ can be taken along the $x$-axis, and the motion takes place in the $xy$-plane. Now, we know that the equation of motion is
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{210}$$
so that in our case the equation of motion becomes only
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E}, \tag{211}$$
which is a set of two equations, $dp_x/dt = qE$ and $dp_y/dt = 0$. Solving these differential equations we have, respectively,
$$p_x = qEt, \qquad p_y = p_0, \tag{212}$$
where the time reference point has been chosen at the moment when $p_x = 0$, and $p_0$ is the momentum of the particle at that moment. The kinetic energy, which is given by $E_k = \sqrt{p^2c^2 + m^2c^4}$, will be, in our case,
$$E_k = \sqrt{p_0^2c^2 + (cqEt)^2 + m^2c^4} = \sqrt{E_0^2 + (cqEt)^2}, \tag{213}$$
where
$$E_0 = \sqrt{p_0^2c^2 + m^2c^4} \tag{214}$$
is the energy at time $t = 0$. Since the velocity of the particle is given by
$$\mathbf{v} = \frac{\mathbf{p}c^2}{E_k}, \tag{215}$$
we have
$$v_x = \frac{dx}{dt} = \frac{p_xc^2}{E_k} = \frac{c^2qEt}{\sqrt{E_0^2 + (cqEt)^2}}, \tag{216}$$
and hence
$$x = \int\frac{c^2qEt}{\sqrt{E_0^2 + (cqEt)^2}}\,dt = \frac{1}{qE}\sqrt{E_0^2 + (cqEt)^2}. \tag{217}$$
On the other hand, we have
$$v_y = \frac{dy}{dt} = \frac{p_yc^2}{E_k} = \frac{p_0c^2}{\sqrt{E_0^2 + (cqEt)^2}}, \tag{218}$$
so that
$$y = \frac{p_0c}{qE}\sinh^{-1}\left(\frac{cqEt}{E_0}\right). \tag{219}$$
Now, from the above equation we have
$$t = \frac{E_0}{cqE}\sinh\left(\frac{qEy}{p_0c}\right), \tag{220}$$
and substituting this in equation (217), together with $1 + \sinh^2 u = \cosh^2 u$, gives us
$$x = \frac{E_0}{qE}\cosh\left(\frac{qEy}{p_0c}\right), \tag{221}$$
which is a catenary curve. The second kind of motion we shall deal with is the motion in a constant uniform magnetic field. Consider the charge $q$ in a uniform magnetic field $\mathbf{B}$, defined to be in the direction of the $z$-axis. The equation of motion given by $d\mathbf{p}/dt = q\mathbf{E} + (q/c)\mathbf{v}\times\mathbf{B}$ simply becomes
$$\frac{d\mathbf{p}}{dt} = \frac{q}{c}\mathbf{v}\times\mathbf{B}. \tag{222}$$
Since $\mathbf{v} = \mathbf{p}c^2/E$, and the energy $E$ is constant in a magnetic field, the above equation can be written as
$$\frac{E}{c^2}\frac{d\mathbf{v}}{dt} = \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{223}$$
which is a set of three equations
$$\frac{dv_x}{dt} = \omega v_y, \qquad \frac{dv_y}{dt} = -\omega v_x, \qquad \frac{dv_z}{dt} = 0, \tag{224}$$
where $\omega = qcB/E$. Multiplying the equation for $v_y$ by $i$ and adding to the equation for $v_x$ gives us
$$\frac{d}{dt}(v_x + iv_y) = -i\omega(v_x + iv_y), \tag{225}$$
which is a first order differential equation whose solution is
$$v_x + iv_y = ae^{-i\omega t}, \tag{226}$$
where $a$ is a complex constant. If we set $a = v_re^{-i\theta}$, with $v_r$ and $\theta$ real, then the above equation becomes
$$v_x + iv_y = v_re^{-i(\omega t + \theta)}, \tag{227}$$
which can be easily separated into real and imaginary parts, giving
$$v_x = v_r\cos(\omega t + \theta), \qquad v_y = -v_r\sin(\omega t + \theta). \tag{228}$$
Squaring both equations for $v_x$ and $v_y$ and adding them gives us
$$v_r^2 = v_x^2 + v_y^2, \tag{229}$$
and this means that the magnitude of the velocity of the particle in the $xy$-plane remains constant. Integrating now the equations for $v_x = dx/dt$ and $v_y = dy/dt$, we have
$$x = x_0 + \frac{v_r}{\omega}\sin(\omega t + \theta), \qquad y = y_0 + \frac{v_r}{\omega}\cos(\omega t + \theta). \tag{230}$$
Also, $dv_z/dt = 0$ gives us $v_z = v_{0z} = \text{constant}$ and hence $z = z_0 + v_{0z}t$. These three equations for $x, y, z$ combined show us that the charge moves along a helix having its axis along the direction of the magnetic field $\mathbf{B}$. In particular, if $v_{0z} = 0$ then the charge moves along a circle in the plane perpendicular to the field.
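Both motions of this section can be checked numerically. The sketch below, with hypothetical function names and arbitrary sample parameters, evaluates the electric-field trajectory of (217) and (219) and compares it with the catenary $x = (E_0/qE)\cosh(qEy/p_0c)$, then verifies by finite differences that the magnetic-field velocities of (228) satisfy $dv_x/dt = \omega v_y$ and $dv_y/dt = -\omega v_x$.

```python
import math

def electric_trajectory(t, q, E, p0, m, c):
    """Position (x, y) at time t in a uniform electric field along x."""
    E0 = math.sqrt(p0**2 * c**2 + m**2 * c**4)
    x = math.sqrt(E0**2 + (c * q * E * t)**2) / (q * E)
    y = p0 * c / (q * E) * math.asinh(c * q * E * t / E0)
    return x, y, E0

def magnetic_velocity(t, v_r, omega, theta):
    """Velocity components (v_x, v_y) in a uniform magnetic field along z."""
    return (v_r * math.cos(omega * t + theta),
            -v_r * math.sin(omega * t + theta))

# Electric case: the point (x, y) should lie on the catenary.
q, E, p0, m, c, t = 1.0, 2.0, 1.5, 1.0, 1.0, 0.8    # sample values
x, y, E0 = electric_trajectory(t, q, E, p0, m, c)
x_catenary = E0 / (q * E) * math.cosh(q * E * y / (p0 * c))

# Magnetic case: central differences of the velocity against equation (224).
v_r, omega, theta, t0, h = 0.5, 3.0, 0.2, 1.3, 1e-6  # sample values
vx, vy = magnetic_velocity(t0, v_r, omega, theta)
vx_p, vy_p = magnetic_velocity(t0 + h, v_r, omega, theta)
vx_m, vy_m = magnetic_velocity(t0 - h, v_r, omega, theta)
dvx_dt = (vx_p - vx_m) / (2 * h)
dvy_dt = (vy_p - vy_m) / (2 * h)
speed = math.hypot(vx, vy)   # should equal v_r at all times
```

The quantities `x` and `x_catenary` agree, `dvx_dt` matches `omega * vy`, and `dvy_dt` matches `-omega * vx`, up to rounding and discretisation error.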

4.6 The Electromagnetic Field Tensor


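In four-dimensional notation, the principle of least action states that
$$\delta S = \delta\int_a^b\left(-mc\,ds - \frac{q}{c}A_\mu\,dx^\mu\right) = 0. \tag{231}$$
Using the fact that $ds = \sqrt{dx_\mu dx^\mu}$, we have then
$$\delta S = -\int_a^b\left(mc\frac{dx_\mu\,d\delta x^\mu}{ds} + \frac{q}{c}A_\mu\,d\delta x^\mu + \frac{q}{c}\delta A_\mu\,dx^\mu\right) = 0. \tag{232}$$
Using now $U_\mu = dx_\mu/ds$ and integrating the first two terms by parts gives us
$$\delta S = \int_a^b\left(mc\,dU_\mu\,\delta x^\mu + \frac{q}{c}\delta x^\mu\,dA_\mu - \frac{q}{c}\delta A_\mu\,dx^\mu\right) - \left(mc\,U_\mu + \frac{q}{c}A_\mu\right)\delta x^\mu\Big|_a^b = 0. \tag{233}$$
Now, note that
$$\left(mc\,U_\mu + \frac{q}{c}A_\mu\right)\delta x^\mu\Big|_a^b = 0 \tag{234}$$
since the integral is varied with fixed coordinate values at the limits. Also, we have
$$\delta A_\mu = \frac{\partial A_\mu}{\partial x^\nu}\delta x^\nu \tag{235}$$
and
$$dA_\mu = \frac{\partial A_\mu}{\partial x^\nu}dx^\nu, \tag{236}$$
and hence the expression for $\delta S$ becomes
$$\delta S = \int_a^b\left(mc\,dU_\mu\,\delta x^\mu + \frac{q}{c}\frac{\partial A_\mu}{\partial x^\nu}dx^\nu\,\delta x^\mu - \frac{q}{c}\frac{\partial A_\mu}{\partial x^\nu}\delta x^\nu\,dx^\mu\right) = 0. \tag{237}$$
Now, let us use the fact that $dU_\mu = (dU_\mu/ds)\,ds$ and $dx^\mu = U^\mu ds$, and also the fact that summed indices can be renamed (exchanging $\mu$ and $\nu$ in the last term), so that we can write
$$\delta S = \int_a^b\left[mc\frac{dU_\mu}{ds} - \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu\right]\delta x^\mu\,ds = 0. \tag{238}$$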

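Since $\delta x^\mu$ is arbitrary, the integrand must be equal to zero, so
$$mc\frac{dU_\mu}{ds} - \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu = 0, \tag{239}$$
or
$$mc\frac{dU_\mu}{ds} = \frac{q}{c}\left(\frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}\right)U^\nu. \tag{240}$$
We define then the electromagnetic field tensor $F_{\mu\nu}$ as
$$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}, \tag{241}$$
so that we can write the four-dimensional equation of motion as
$$mc\frac{dU_\mu}{ds} = \frac{q}{c}F_{\mu\nu}U^\nu. \tag{242}$$
Setting $\mu = i = 1, 2, 3$ in the above equation, we have the equation of motion
$$\frac{d\mathbf{p}}{dt} = q\mathbf{E} + \frac{q}{c}\mathbf{v}\times\mathbf{B}, \tag{243}$$
while setting $\mu = 0$ gives us the known equation
$$\frac{dE_k}{dt} = q\mathbf{E}\cdot\mathbf{v}. \tag{244}$$
In matrix notation, the electromagnetic field tensor is
$$F_{\mu\nu} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix} \tag{245}$$
in covariant form, and
$$F^{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix} \tag{246}$$
in contravariant form. Note that all the diagonal components are zero, as expected since $F_{\mu\nu}$ is clearly antisymmetric.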
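The relation between the covariant and contravariant forms can be made concrete by raising both indices with the metric. The following is a small sketch; the function names are hypothetical and the metric is assumed to be diag(1, -1, -1, -1), as elsewhere in these notes.

```python
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu nu}, signature (+, -, -, -)

def field_tensor_covariant(E, B):
    """F_{mu nu} built from E and B as in equation (245)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0,    Ex,  Ey,  Ez],
        [-Ex,  0,  -Bz,  By],
        [-Ey,  Bz,  0,  -Bx],
        [-Ez, -By,  Bx,  0],
    ]

def raise_indices(F):
    """F^{mu nu} = g^{mu alpha} g^{nu beta} F_{alpha beta} for a diagonal metric."""
    return [[METRIC[m] * METRIC[n] * F[m][n] for n in range(4)] for m in range(4)]

def is_antisymmetric(F):
    """True when F[m][n] == -F[n][m] for every pair of indices."""
    return all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))

E_field = [1, 2, 3]   # sample components
B_field = [4, 5, 6]
F_down = field_tensor_covariant(E_field, B_field)
F_up = raise_indices(F_down)
```

Raising the indices flips the sign of the electric components in the first row and column and leaves the magnetic block unchanged, which is the difference between (245) and (246).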
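One could now note that $F^{\mu\nu}$ transforms in each index as a four-vector. Expressing the components of this tensor in terms of the components of the electric and the magnetic field, the formulas for the transformations, for a boost with velocity $v$ along the $x$-axis, are
$$E'_x = E_x, \qquad E'_y = \gamma\left(E_y - \frac{v}{c}B_z\right), \qquad E'_z = \gamma\left(E_z + \frac{v}{c}B_y\right) \tag{247}$$
and
$$B'_x = B_x, \qquad B'_y = \gamma\left(B_y + \frac{v}{c}E_z\right), \qquad B'_z = \gamma\left(B_z - \frac{v}{c}E_y\right). \tag{248}$$
As a particular case, if $v \ll c$ then
$$E'_x = E_x, \qquad E'_y = E_y - \frac{v}{c}B_z, \qquad E'_z = E_z + \frac{v}{c}B_y \tag{249}$$
and
$$B'_x = B_x, \qquad B'_y = B_y + \frac{v}{c}E_z, \qquad B'_z = B_z - \frac{v}{c}E_y, \tag{250}$$
which can be written as
$$\mathbf{E}' = \mathbf{E} - \frac{1}{c}\mathbf{B}\times\mathbf{v} \tag{251}$$
and
$$\mathbf{B}' = \mathbf{B} + \frac{1}{c}\mathbf{E}\times\mathbf{v}. \tag{252}$$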

4.7 Invariants of the Field


We can form scalars from the electric and magnetic field intensities. The first invariant quantity one can form is
$$F_{\mu\nu}F^{\mu\nu}, \tag{253}$$
which can be easily computed remembering that $F_{\mu\nu} = (\mathbf{E}, \mathbf{B})$ and $F^{\mu\nu} = (-\mathbf{E}, \mathbf{B})$, giving
$$F_{\mu\nu}F^{\mu\nu} = 2(B^2 - E^2). \tag{254}$$
The second quantity we can form is given by
$$\epsilon^{\mu\nu\alpha\beta}F_{\mu\nu}F_{\alpha\beta} \propto \mathbf{E}\cdot\mathbf{B}, \tag{255}$$
where the numerical factor depends on the sign convention chosen for $\epsilon^{\mu\nu\alpha\beta}$. The equation $\mathbf{E}\cdot\mathbf{B} = \text{constant}$ means that if the electric and the magnetic fields are perpendicular in one system, that is, $\mathbf{E}\cdot\mathbf{B} = 0$, then they are perpendicular in any other system. The equation $B^2 - E^2 = \text{constant}$ implies that if $E < B$ (or $E > B$) in one system, then $E < B$ (respectively $E > B$) in any other system. Also, one can note that if $\mathbf{E}\cdot\mathbf{B} = 0$ then we can always find a reference system in which $\mathbf{E} = 0$ or $\mathbf{B} = 0$; in other words, the field is purely magnetic or purely electric.
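The invariance of $\mathbf{E}\cdot\mathbf{B}$ and $B^2 - E^2$ can be illustrated by boosting a sample field. The sketch below is not from the original notes: the function names are hypothetical, and it assumes the standard transformation for a boost along $x$, in which $E'_z = \gamma(E_z + \beta B_y)$ and $B'_z = \gamma(B_z - \beta E_y)$ with $\beta = v/c$.

```python
import math

def boost_fields(E, B, beta):
    """Fields seen from a frame moving with velocity beta*c along x (assumed standard form)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_prime = [Ex, g * (Ey - beta * Bz), g * (Ez + beta * By)]
    B_prime = [Bx, g * (By + beta * Ez), g * (Bz - beta * Ey)]
    return E_prime, B_prime

def dot(a, b):
    """Three-dimensional scalar product."""
    return sum(x * y for x, y in zip(a, b))

def invariants(E, B):
    """Return the pair (E.B, B^2 - E^2)."""
    return dot(E, B), dot(B, B) - dot(E, E)

E_field = [1.0, 2.0, 3.0]      # sample field components
B_field = [-0.5, 0.25, 2.0]
E_boosted, B_boosted = boost_fields(E_field, B_field, beta=0.6)
before = invariants(E_field, B_field)
after = invariants(E_boosted, B_boosted)
```

The individual field components change under the boost, while both entries of `before` and `after` agree up to rounding.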

5 The Electromagnetic Field Equations
5.1 The First Pair of Maxwell’s Equations
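We already know that
$$\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi, \tag{256}$$
so, using the fact that $\nabla\times(\nabla f) = 0$ for every scalar function $f$, we have
$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial(\nabla\times\mathbf{A})}{\partial t} - \nabla\times(\nabla\phi) = -\frac{1}{c}\frac{\partial(\nabla\times\mathbf{A})}{\partial t} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \tag{257}$$
and now using the fact that $\nabla\cdot(\nabla\times\mathbf{V}) = 0$ for every vector field $\mathbf{V}$ gives us
$$\nabla\cdot\mathbf{B} = \nabla\cdot(\nabla\times\mathbf{A}) = 0, \tag{258}$$
and these equations are the first pair of Maxwell's equations, which are homogeneous. In four-dimensional notation, let us first remember that
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{259}$$
so that
$$\partial_\rho F_{\mu\nu} + \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} = \partial_\rho\partial_\mu A_\nu - \partial_\rho\partial_\nu A_\mu + \partial_\mu\partial_\nu A_\rho - \partial_\mu\partial_\rho A_\nu + \partial_\nu\partial_\rho A_\mu - \partial_\nu\partial_\mu A_\rho = 0 \tag{260}$$
since the derivatives commute. The quantity $\partial_\rho F_{\mu\nu} + \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu}$ is antisymmetric in all three indices, and the only non-zero components are those with $\mu \neq \nu \neq \rho \neq \mu$, so they form a set of four equations, which are $\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}$ and $\nabla\cdot\mathbf{B} = 0$.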

5.2 The Four-dimensional Current Vector and Equation of Continuity
The charge density $\rho = \rho(x, y, z, t)$ is defined so that $\rho\,dV$ is the charge contained in the volume $dV$, which allows us to treat charge as a continuously distributed quantity in space. Charges, however, are pointlike, which allows us to write
$$\rho(\mathbf{r}) = \sum_a q_a\,\delta(\mathbf{r} - \mathbf{r}_a), \tag{261}$$
where $\mathbf{r}_a$ is the radius vector of the charge $q_a$. Multiplying now the quantity $dQ = \rho\,dV$ by $dx^\mu$ gives us
$$dQ\,dx^\mu = \rho\,dV\,dx^\mu = \rho\,dV\,dt\,\frac{dx^\mu}{dt}. \tag{262}$$
Now, note that $dQ\,dx^\mu$ is a four-vector, since $dQ$ is a scalar, and hence the quantity $\rho\,dV\,dt\,dx^\mu/dt$ must also be a four-vector. Since the quantity $dV\,dt$ is a scalar, we conclude that the quantity $\rho\,dx^\mu/dt$ is a four-vector. We define this vector as $J^\mu$, which is called the current four-vector or four-current, and hence
$$J^\mu = \rho\frac{dx^\mu}{dt}. \tag{263}$$
We can now evaluate
$$J^0 = \rho\frac{dx^0}{dt} = \rho\frac{d(ct)}{dt} = c\rho \tag{264}$$
and
$$J^i = \rho\frac{dx^i}{dt} = \rho v^i = j^i, \tag{265}$$
where $\mathbf{j} = \rho\mathbf{v}$ is the current density vector. This way, we can write
$$J^\mu = (c\rho, \mathbf{j}). \tag{266}$$
Finally, the total charge, which is equal to $\int\rho\,dV$, can also be written in four-dimensional form as
$$\int\rho\,dV = \frac{1}{c}\int J^0\,dV = \frac{1}{c}\int J^\mu\,dS_\mu, \tag{267}$$
taken over the four-dimensional hyperplane perpendicular to the $x^0$-axis.
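Let us now consider the change with time of the total charge, that is, the quantity
$$\frac{\partial}{\partial t}\int\rho\,dV. \tag{268}$$
First, one should note that the quantity of charge which passes in unit time through the element $d\mathbf{S}$ of the surface bounding our volume is given by $\rho\mathbf{v}\cdot d\mathbf{S}$, where $\mathbf{v}$ is the velocity of the charge where $d\mathbf{S}$ is located. The quantity $\rho\mathbf{v}\cdot d\mathbf{S}$ is positive if charge leaves the volume and negative otherwise. Remembering that $\mathbf{j} = \rho\mathbf{v}$, we can write then
$$\frac{\partial}{\partial t}\int\rho\,dV = -\oint\mathbf{j}\cdot d\mathbf{S}, \tag{269}$$
where the integral on the right-hand side extends over the whole boundary of the volume. This is called the equation of continuity, in integral form. Applying now Gauss' theorem on the right-hand side gives us
$$\oint\mathbf{j}\cdot d\mathbf{S} = \int(\nabla\cdot\mathbf{j})\,dV, \tag{270}$$
and hence we have
$$\frac{\partial}{\partial t}\int\rho\,dV = -\int(\nabla\cdot\mathbf{j})\,dV \tag{271}$$
or
$$\int\left(\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j}\right)dV = 0, \tag{272}$$
which allows us to write
$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \tag{273}$$
since the volume of integration is arbitrary. This is the equation of continuity in differential form. Let us write now the above equation in the form
$$\frac{1}{c}\frac{\partial(c\rho)}{\partial t} + \frac{\partial J^1}{\partial x^1} + \frac{\partial J^2}{\partial x^2} + \frac{\partial J^3}{\partial x^3} = 0, \tag{274}$$
and now note that $c\rho = J^0$ and $\partial/\partial(ct) = \partial/\partial x^0$, so that we can write
$$\partial_\mu J^\mu = 0, \tag{275}$$
which is the equation of continuity in four-dimensional form.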

5.3 The Action Function for the Electromagnetic Field


Considering the electromagnetic field and the particles, the action function consists of three parts,
$$S = S_f + S_m + S_{mf}, \tag{276}$$
where $S_f$ depends on the properties of the field itself, in the absence of charges, $S_m$ depends only on the properties of the particles, and $S_{mf}$ depends on the interaction between the particles and the field. This way, if there are many particles, the total action $S_m$ is the sum of the actions for each free particle:
$$S_m = -\sum mc\int ds. \tag{277}$$
On the other hand, for a system of particles we will also have
$$S_{mf} = -\sum\frac{q}{c}\int A_\mu\,dx^\mu. \tag{278}$$
Now, we wish to establish the form of the action $S_f$. In order to do that, let us use the fact that the electromagnetic field satisfies the principle of superposition, which asserts that the field produced by a system of charges is the result of a simple composition of the fields produced by each of the particles individually. As we know, a linear differential equation has this property: any linear combination of solutions is also a solution. For the field equations to be linear, under the integral sign of $S_f$ there must stand an expression quadratic in the field, and $S_f$ must be the integral of some function of the field tensor $F_{\mu\nu}$. In order to have a scalar as the action, the quantity we look for is $F_{\mu\nu}F^{\mu\nu}$, so that
$$S_f = a\int F_{\mu\nu}F^{\mu\nu}\,dx\,dy\,dz\,dt, \tag{279}$$
where $a$ is a constant which depends on the choice of units. In the Gaussian system of units, we have $a = -1/16\pi$. If we define now
$$d\Omega = c\,dt\,dx\,dy\,dz, \tag{280}$$
we have then
$$S_f = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega, \tag{281}$$
or, using the fact that $F_{\mu\nu}F^{\mu\nu} = 2(B^2 - E^2)$,
$$S_f = \frac{1}{8\pi}\int(E^2 - B^2)\,dV\,dt, \tag{282}$$
and hence the total action for the field and particles is
$$S = S_f + S_m + S_{mf} = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega - \sum\int mc\,ds - \sum\frac{q}{c}\int A_\mu\,dx^\mu. \tag{283}$$

5.4 The Second Pair of Maxwell’s Equations


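In the expression (283) for the action function, we can introduce the current four-vector. In place of the point charges $q$, let us introduce the continuous distribution of charge with density $\rho$, and write the interaction term as
$$-\frac{1}{c}\int\rho A_\mu\,dx^\mu\,dV, \tag{284}$$
replacing the sum by an integral. We can rewrite this as
$$-\frac{1}{c}\int\rho\frac{dx^\mu}{dt}A_\mu\,dV\,dt, \tag{285}$$
or simply
$$-\frac{1}{c^2}\int A_\mu J^\mu\,d\Omega. \tag{286}$$
The action then will be of the form
$$S = -\frac{1}{16\pi c}\int F_{\mu\nu}F^{\mu\nu}\,d\Omega - \sum\int mc\,ds - \frac{1}{c^2}\int A_\mu J^\mu\,d\Omega. \tag{287}$$
Now, we vary the potentials while the motion of the charges is held fixed, so the variation of the term $-\sum\int mc\,ds$ is zero, and so, using the fact that $F^{\mu\nu}\delta F_{\mu\nu} = F_{\mu\nu}\delta F^{\mu\nu}$, we have
$$\delta S = -\int\left(\frac{1}{8\pi c}F^{\mu\nu}\delta F_{\mu\nu} + \frac{1}{c^2}\delta A_\mu J^\mu\right)d\Omega = 0. \tag{288}$$
Using now $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ gives us
$$\delta S = -\int\left(\frac{1}{8\pi c}F^{\mu\nu}\partial_\mu\delta A_\nu - \frac{1}{8\pi c}F^{\mu\nu}\partial_\nu\delta A_\mu + \frac{1}{c^2}\delta A_\mu J^\mu\right)d\Omega = 0. \tag{289}$$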

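We can now interchange the indices $\mu$ and $\nu$ in the expression $F^{\mu\nu}\partial_\mu\delta A_\nu$ and then replace $F^{\nu\mu}$ by $-F^{\mu\nu}$, which gives us
$$\delta S = -\int\left(-\frac{1}{4\pi c}F^{\mu\nu}\partial_\nu\delta A_\mu + \frac{1}{c^2}\delta A_\mu J^\mu\right)d\Omega = 0. \tag{290}$$
Integrating by parts the first term of this integral, remembering that the limits of integration are at infinity, where the field is zero, gives us the expression
$$\delta S = -\int\left(\frac{1}{4\pi c}\partial_\nu F^{\mu\nu} + \frac{1}{c^2}J^\mu\right)\delta A_\mu\,d\Omega = 0, \tag{291}$$
or simply
$$-\frac{1}{c}\int\left(\frac{1}{4\pi}\partial_\nu F^{\mu\nu} + \frac{1}{c}J^\mu\right)\delta A_\mu\,d\Omega = 0. \tag{292}$$
Since $\delta A_\mu$ is arbitrary, its coefficients must be zero, so that we have
$$\partial_\nu F^{\mu\nu} = -\frac{4\pi}{c}J^\mu, \tag{293}$$
which is a set of four equations. If we set $\mu = 1$ then we have the equation
$$\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z} - \frac{1}{c}\frac{\partial E_x}{\partial t} = \frac{4\pi}{c}j_x, \tag{294}$$
and similarly if $\mu = 2, 3$, so that, as a vector equation, we have
$$\nabla\times\mathbf{B} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{j}. \tag{295}$$
On the other hand, if $\mu = 0$, we have
$$\nabla\cdot\mathbf{E} = 4\pi\rho, \tag{296}$$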
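and these equations are called the second pair of Maxwell's equations, or the inhomogeneous pair. It is easy to obtain the continuity equation from the Maxwell equations. Taking the divergence of equation (295) gives us
$$\nabla\cdot(\nabla\times\mathbf{B}) = \frac{1}{c}\frac{\partial(\nabla\cdot\mathbf{E})}{\partial t} + \frac{4\pi}{c}(\nabla\cdot\mathbf{j}). \tag{297}$$
Since $\nabla\cdot(\nabla\times\mathbf{B}) = 0$ and $\nabla\cdot\mathbf{E} = 4\pi\rho$, we have then
$$0 = \frac{1}{c}\frac{\partial(4\pi\rho)}{\partial t} + \frac{4\pi}{c}(\nabla\cdot\mathbf{j}), \tag{298}$$
or simply
$$\nabla\cdot\mathbf{j} + \frac{\partial\rho}{\partial t} = 0, \tag{299}$$
which is the equation of continuity.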

5.5 Energy Density


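Let us consider the equations
$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{B} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\mathbf{j}, \tag{300}$$
multiply them respectively by $\mathbf{B}$ and $\mathbf{E}$ and subtract the second from the first, getting
$$\mathbf{B}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{B}) = -\frac{1}{c}\mathbf{B}\cdot\frac{\partial\mathbf{B}}{\partial t} - \frac{1}{c}\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t} - \frac{4\pi}{c}\mathbf{j}\cdot\mathbf{E}, \tag{301}$$
and, using the formula $\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b})$, we can write
$$\nabla\cdot(\mathbf{E}\times\mathbf{B}) = -\frac{4\pi}{c}\mathbf{j}\cdot\mathbf{E} - \frac{1}{2c}\frac{\partial}{\partial t}(E^2 + B^2), \tag{302}$$
or simply
$$\frac{\partial W}{\partial t} = -\mathbf{j}\cdot\mathbf{E} - \nabla\cdot\mathbf{S}, \tag{303}$$
where the Poynting vector $\mathbf{S}$ is defined as
$$\mathbf{S} = \frac{c}{4\pi}\mathbf{E}\times\mathbf{B} \tag{304}$$
and the energy density $W$ as
$$W = \frac{E^2 + B^2}{8\pi}, \tag{305}$$
which is the energy per unit volume of the field.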
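The energy density and the Poynting vector are straightforward to evaluate in Gaussian units. The sketch below uses hypothetical function names and, as a sample case, mutually perpendicular fields of equal magnitude, as in a plane wave, for which the energy flux equals $c$ times the energy density.

```python
import math

def cross(a, b):
    """Three-dimensional vector product a x b."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def dot(a, b):
    """Three-dimensional scalar product."""
    return sum(x * y for x, y in zip(a, b))

def energy_density(E, B):
    """W = (E^2 + B^2) / (8 pi), Gaussian units."""
    return (dot(E, E) + dot(B, B)) / (8.0 * math.pi)

def poynting_vector(E, B, c):
    """S = (c / 4 pi) E x B, Gaussian units."""
    return [c / (4.0 * math.pi) * w for w in cross(E, B)]

c = 1.0                       # units with c = 1 (assumption)
E_wave = [1.0, 0.0, 0.0]      # sample plane-wave fields with |E| = |B|
B_wave = [0.0, 1.0, 0.0]
W = energy_density(E_wave, B_wave)
S = poynting_vector(E_wave, B_wave, c)
flux = math.sqrt(dot(S, S))   # magnitude of the energy flux
```

For these sample fields the Poynting vector points along $z$, the direction of $\mathbf{E}\times\mathbf{B}$, and `flux` equals `c * W`.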

5.6 Energy-momentum tensor of the Electromagnetic Field


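Consider a system whose action integral is
$$S = \int\Lambda(q, \partial_\mu q)\,dV\,dt = \frac{1}{c}\int\Lambda\,d\Omega, \tag{306}$$
where $\Lambda$ is a function of the quantities $q$ and their first derivatives. The integral $\int\Lambda\,dV$ is the Lagrangian of the system, so that $\Lambda$ can be interpreted as the Lagrangian density. The equations of motion are obtained by varying $S$:
$$\delta S = \frac{1}{c}\int\left(\frac{\partial\Lambda}{\partial q}\delta q + \frac{\partial\Lambda}{\partial(\partial_\mu q)}\delta(\partial_\mu q)\right)d\Omega \tag{307}$$
or
$$\delta S = \frac{1}{c}\int\left[\frac{\partial\Lambda}{\partial q}\delta q + \partial_\mu\left(\frac{\partial\Lambda}{\partial(\partial_\mu q)}\delta q\right) - \delta q\,\partial_\mu\frac{\partial\Lambda}{\partial(\partial_\mu q)}\right]d\Omega = 0. \tag{308}$$
The term
$$\int\partial_\mu\left(\frac{\partial\Lambda}{\partial(\partial_\mu q)}\delta q\right)d\Omega \tag{309}$$
vanishes, since it can be converted into a surface integral on which $\delta q = 0$, and by the arbitrariness of $\delta q$, the equation of motion is then
$$\frac{\partial\Lambda}{\partial q} - \partial_\mu\frac{\partial\Lambda}{\partial(\partial_\mu q)} = 0. \tag{310}$$
Write now
$$\frac{\partial\Lambda}{\partial x^\mu} = \frac{\partial\Lambda}{\partial q}\frac{\partial q}{\partial x^\mu} + \frac{\partial\Lambda}{\partial(\partial_\nu q)}\frac{\partial(\partial_\nu q)}{\partial x^\mu}, \tag{311}$$
and, using the equation of motion and the fact that $\partial_\nu\partial_\mu q = \partial_\mu\partial_\nu q$, we have
$$\frac{\partial\Lambda}{\partial x^\mu} = \frac{\partial}{\partial x^\nu}\left(\frac{\partial\Lambda}{\partial(\partial_\nu q)}\right)\partial_\mu q + \frac{\partial\Lambda}{\partial(\partial_\nu q)}\frac{\partial(\partial_\mu q)}{\partial x^\nu} = \frac{\partial}{\partial x^\nu}\left(\partial_\mu q\,\frac{\partial\Lambda}{\partial(\partial_\nu q)}\right). \tag{312}$$
Also, we can write
$$\frac{\partial\Lambda}{\partial x^\mu} = \delta_\mu^\nu\frac{\partial\Lambda}{\partial x^\nu}, \tag{313}$$
and hence, if we define
$$T_\mu^{\ \nu} = \partial_\mu q\,\frac{\partial\Lambda}{\partial(\partial_\nu q)} - \delta_\mu^\nu\Lambda, \tag{314}$$
then we can write
$$\frac{\partial T_\mu^{\ \nu}}{\partial x^\nu} = 0. \tag{315}$$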
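We wish to apply now these relations to the electromagnetic field. First of all, remember that for the electromagnetic field we have
$$\Lambda = -\frac{1}{16\pi}F_{\nu\rho}F^{\nu\rho}, \tag{316}$$
so, using relation (314), the tensor of the electromagnetic field is
$$T_\mu^{\ \nu} = \frac{\partial A_\rho}{\partial x^\mu}\frac{\partial\Lambda}{\partial\left(\frac{\partial A_\rho}{\partial x^\nu}\right)} - \delta_\mu^\nu\Lambda, \tag{317}$$
which gives, after finding the variation $\delta\Lambda$,
$$T_\mu^{\ \nu} = -\frac{1}{4\pi}\frac{\partial A_\rho}{\partial x^\mu}F^{\nu\rho} + \frac{1}{16\pi}\delta_\mu^\nu F_{\rho\sigma}F^{\rho\sigma}, \tag{318}$$
or, in contravariant form,
$$T^{\mu\nu} = -\frac{1}{4\pi}\frac{\partial A^\rho}{\partial x_\mu}F^\nu_{\ \rho} + \frac{1}{16\pi}g^{\mu\nu}F_{\rho\sigma}F^{\rho\sigma}, \tag{319}$$
which is not, however, a symmetric tensor, so that we add the quantity
$$\frac{1}{4\pi}\frac{\partial A^\mu}{\partial x_\rho}F^\nu_{\ \rho} = \frac{1}{4\pi}\frac{\partial}{\partial x^\rho}\left(A^\mu F^{\nu\rho}\right), \tag{320}$$
where the equality holds in the absence of charges, and hence we have the final expression for the energy-momentum tensor of the electromagnetic field,
$$T^{\mu\nu} = \frac{1}{4\pi}\left(-F^{\mu\rho}F^\nu_{\ \rho} + \frac{1}{4}g^{\mu\nu}F_{\rho\sigma}F^{\rho\sigma}\right), \tag{321}$$
which is symmetric and whose trace vanishes, $T^\mu_{\ \mu} = 0$.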
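The stated properties of the energy-momentum tensor can be checked by assembling it from sample fields. This is a minimal sketch with hypothetical function names, assuming the metric diag(1, -1, -1, -1) and the contravariant field tensor with $F^{0i} = -E_i$ used in these notes.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal metric, signature (+, -, -, -)

def field_tensor_contravariant(E, B):
    """F^{mu nu} with F^{0i} = -E_i and the magnetic block as in equation (246)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex,  0.0, -Bz,  By],
        [Ey,  Bz,  0.0, -Bx],
        [Ez, -By,  Bx,  0.0],
    ]

def energy_momentum_tensor(E, B):
    """T^{mu nu} = (1/4pi) (-F^{mu rho} F^nu_rho + (1/4) g^{mu nu} F_{rho sigma} F^{rho sigma})."""
    F_up = field_tensor_contravariant(E, B)
    # F_{rho sigma} F^{rho sigma}: lowering both indices multiplies by two metric factors.
    invariant = sum(METRIC[r] * METRIC[s] * F_up[r][s] ** 2
                    for r in range(4) for s in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            # F^{mu rho} F^nu_rho = sum over rho of F^{mu rho} F^{nu rho} g_{rho rho}
            contraction = sum(F_up[m][r] * F_up[n][r] * METRIC[r] for r in range(4))
            g_mn = METRIC[m] if m == n else 0.0
            T[m][n] = (-contraction + 0.25 * g_mn * invariant) / (4.0 * math.pi)
    return T

E_field = [1.0, 2.0, 3.0]        # sample field components
B_field = [-0.5, 0.25, 2.0]
T = energy_momentum_tensor(E_field, B_field)
trace = sum(METRIC[m] * T[m][m] for m in range(4))   # T^mu_mu
W = (sum(x * x for x in E_field) + sum(x * x for x in B_field)) / (8.0 * math.pi)
```

With these conventions `T[0][0]` reproduces the energy density $W$, the components `T[0][i]` are the components of $\mathbf{E}\times\mathbf{B}/4\pi$, and `trace` vanishes up to rounding.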

