Electromagnetics History PDF

CHAPTER 1

The Phenomenon of Light

THE ORIGIN of the special theory of relativity lies in a dilemma concerned with the
nature and velocity of light. Appreciation of this dilemma adds purpose and meaning
to relativity, and it is for this reason that the present chapter is concerned with light
and its properties. The first two sections trace the evolution of thought with respect
to whether light is corpuscular or wavelike, and whether its velocity is finite or infinite;
present-day views of these properties culminate both developments. Light and sound
(the latter being representative of wave phenomena requiring a tangible medium)
are compared in the third section and their essential similarities and differences are
highlighted; the resulting contrast prepares the way for the introduction, in Chapter 2,
of the aforementioned dilemma.
More than usual space is given in this chapter to the historical aspects of the subject.
An explanation of the decision to do this may be found in the Preface. The reader wishing
to concentrate his efforts on the technical development may prefer to limit his
attention to the Bradley aberration experiment in Section 1.2 and the comparison of
light and sound in Section 1.3.

1.1* HISTORICAL SURVEY-THE NATURE OF LIGHT

Speculation about the nature of light can be traced back to antiquity. The Sicilian
Empedocles (c. 490-c. 435 B.C.) was credited with the view¹ that light consists of small
particles emitted from a visible body. These particles were presumed to enter the eyes
and were then returned to the visible body (a conservation law!) with the resulting
streams of particles being responsible for the sensations of shape and color. Unfortunately,
only fragments of the writings of this extraordinary man have survived, and
the direct evidence of his view is merely suggestive, being contained in the lyrical
passage²
As when a man, about to sally forth,
Prepares a light and kindles him a blaze
Of flaming fire against the wintry night,

* Throughout this book the content of sections marked with an asterisk is primarily historical. The
reading of these sections can be omitted without materially affecting the technical exposition.
1 Plato, Meno. (See, e.g., the W. R. M. Lamb translation, Vol. 165 of the Loeb Classical Library, p.
285, Harvard University Press, 1962.)
2 W. E. Leonard, The Fragments of Empedocles, pp. 42-43, The Open Court Publishing Company,
Chicago, 1908.


In horny lantern shielding from all winds:
Though it protect from breath of blowing winds,
Its beam darts outward, as more fine and thin,
And with untiring rays lights up the sky:
Just so the Fire primeval once lay hid
In the round pupil of the eye, enclosed
In films and gauzy veils, which through and through
Were pierced with pores divinely fashioned,
And thus kept off the watery deeps around,
Whilst Fire burst outward, as more fine and thin.

Empedocles was a close observer of nature, the apparent originator of the longstanding and influential notion that all things are composed of the four elements: air,
fire, water, and earth. He was a poet of stature whose wide-ranging opinions exerted a
strong influence on later Greek scholars. Aristotle (384-322 B.C.) quotes him frequently, often contentiously, and in De Sensu says³
Empedocles at times seems to hold that vision is to be explained as above-stated, by light
issuing forth from the eye; e.g., in the following passage: [The 13 lines given above are then
quoted.] Sometimes he accounts for vision thus, but at other times he explains it by emanations from the visible objects.

Aristotle states his own opinion about the nature of light in De Anima:⁴
Now there clearly is something which is transparent, and by "transparent" I mean what
is visible, and yet not visible in itself, but rather owing its visibility to the color of something
else; of this character are air, water, and many solid bodies. Neither air nor water is transparent because it is air or water; they are transparent because each of them has contained
in it a certain substance which is the same in both and is also found in the eternal body
which constitutes the uppermost shell of the physical cosmos. Of this substance light is the
activity-the activity of what is transparent so far forth as it has in it the determinate
power of becoming transparent; where this power is present, there is also the potentiality
of the contrary, viz. darkness. Light is as it were the proper color of what is transparent,
and exists whenever the potentially transparent is excited to actuality by the influence of
fire or something resembling "the uppermost body"; for fire too contains something which is
one and the same with the substance in question.
We have now explained what the transparent is and what light is; light is neither fire
nor any kind whatsoever of body nor an efflux from any kind of body (if it were, it would
again itself be a kind of body)-it is the presence of fire or something resembling fire in
what is transparent. It is certainly not a body, for two bodies cannot be present in the same
place. The opposite of light is darkness; darkness is the absence from what is transparent
of the corresponding positive state above characterized; clearly therefore, light is just the
presence of that.

Aristotle's influence was greater with later cultures than with his own, and thus one
finds most ancient Greek scholars preferring to accept a simpler view similar to that of
3 Aristotle, De Sensu, 437 b, 23, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.
4 Aristotle, De Anima, 418 b, 4, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.


Empedocles; for example, both Euclid and Ptolemy held the opinion that light consists
of rays which originate in the eye, illuminate the object seen, and then return to the eye.
In contrast to the richness of Greek speculation about light, Roman scholars do not
appear to have been interested in this problem. Indeed, all of Roman science was
essentially derivative in character and of distinctly low order, contributing little that was
original, and nothing worthy of note in the present survey. Arabic science, on the other
hand, while also being derivative, was of a rather high order, being based on the finest
products of Greek scientific achievement. The successors of Mohammed evinced a
great interest in the ideas of the western people whom they conquered, and far from
being the destroyers of Western literature, they were its chief preservers. The Arabs
came into contact with the Greeks in Egypt as well as western Asia, and became their
virtual successors in carrying forward the torch of learning. Although inclined to be
conservative and traditional, thus accepting most Greek ideas as authoritative, the
Arabian scholars did make several independent discoveries of significance. An important example is the Arabic numbering system in use today, which evolved during this
period.
In the specific field of light, many accomplishments can be credited to Ibn al-Haitham
(c. 965-c. 1039), known to the Western world by the Latin name Alhazen. He was the
true physicist of medieval Islam, just as Archimedes had been in the Grecian period,
for he combined with rare skill both the experimental investigation of natural phenomena and the analysis of results by mathematics.⁵ Alhazen was one of the ablest
students of optics of all times and published a seven-volume treatise on this subject
which had great celebrity throughout the medieval period and strongly influenced
Western thought, notably that of Roger Bacon and Kepler.⁶ This treatise discussed
concave and convex mirrors in both cylindrical and spherical geometries, anticipated
Fermat's law of least time, and considered refraction and the magnifying power of
lenses. It contained a remarkably lucid description of the optical system of the eye,
which study led Alhazen to the belief that light consists of rays which originate in the
object seen, and not in the eye, a view contrary to that of Euclid and Ptolemy.
Ibn Sina, or Avicenna (980-1037), the most famous of the Islamic scientists, whose
immense medical encyclopedia, the Quanun, made him the greatest name in medicine for four centuries, was also a perceptive student of various physical questions
-motion, contact, force, vacuum, infinity, light, and heat. He shared Alhazen's
view that light originated in the luminous source and felt that it must consist of some
type of particles.⁷
Roger Bacon (1214-1294), a learned scholar who stressed the value of reading works
in their original languages, was well-versed in the teaching of Aristotle, St. Augustine,
and the Muslim scientists Alhazen and Avicenna. During a sojourn in Paris, he so
impressed the future Clement IV that the latter, upon elevation to the Papacy in 1265,
requested Bacon to transmit copies of all his writings without delay. Up to that time,
Bacon had written but little; however, in the span of one year, he composed the Opus
Majus, the Opus Minus, and the Opus Tertium, a stupendous undertaking, the fruits
of which exerted a great influence on Western thought for centuries. In his masterpiece,
5 H. J. J. Winter, Eastern Science, John Murray Publishers, Ltd., London, 1952.
6 G. Sarton, Introduction to the History of Science, Vol. 1, p. 721, Williams and Wilkins Company,
Baltimore, Md., 1927.
7 Ibid., Vol. 1, p. 710.


the Opus Majus, Bacon appears to endow Alhazen and Avicenna with an ambivalent
position by saying⁸
If, moreover, Alhazen and Avicenna, in the third book on the Soul . . . are cited as
opposed to this view, I reply that they are not opposed to the generation of the species of
vision, nor to the part it plays in producing sight; but they are opposed to those who have
maintained that some material substance as a visible or similar species is extended from the
sight to the object, in order that vision may perceive the object itself, and that it may
seize upon the species of the object seen and carry it back to the sight.

Bacon's own view coincided with the opinion of many of the ancients, that light
consists of emanations which originate in the eye, and he defends this view in the
passage⁹
The reason for this position is that everything in nature completes its action through its
own force and species alone, as, for example, the sun and the other celestial bodies through
their forces sent to the things of the world cause the generation and corruption of things;
and in a similar manner inferior things, as, for example, fire by its own force dries and consumes and does many things. Therefore vision must perform the act of seeing by its own
force. But the act of seeing is the perception of a visible object at a distance, and therefore
vision perceives what is visible by its own force multiplied to the object . . . it is clear to
him who gives it due consideration that vision must take place by means of its species
emitted to the visible object.

As for the species of light itself, Bacon says, in an explanation which has the interesting
tinge of wave motion, that¹⁰
. . . the species is not a body, nor is it changed as regards itself as a whole from one place
to another, but that which is produced in the first part of the air is not separated from that
part, since form cannot be separated from the matter in which it is, unless it be soul, but
the species forms a likeness to itself in the second position of the air, and so on. Therefore
it is not a motion as regards place, but is a propagation multiplied through the different
parts of the medium; nor is it a body which is there generated, but a corporeal form, without, however, dimensions per se, but it is produced subject to the dimensions of the air . . . .

The passage of three centuries marks the interval between the death of Roger Bacon
and the birth of René Descartes (1596-1650), whose intellect and creative genius
were to stir scientific imagination, and whose prolific pen was to prove even more
influential than Bacon's. Descartes lived at a time in which the world was ripe for a
new conception of the nature of things. Major changes in attitude about man's surroundings were being culminated; Galileo and Kepler were advocating the overthrow
of the geocentric hypothesis of Ptolemy, the Magellan expedition had circumnavigated
the globe, the invention of the telescope was leading to expanded knowledge of the
skies, and Aristotelian scholasticism was under attack at all its weakest points. Montaigne's skepticism had paved the way for a break with tradition, and Descartes set
for himself the task of erecting a new structure to replace the old. In the words of
8 R. Bacon, Opus Majus, Part 5, 7th Distinction, Chap. 3, the R. B. Burke translation, University of
Pennsylvania Press, Philadelphia, 1928.
9 Ibid., 7th Distinction, Chap. 4; Dth Distinction, Chap. 1.
10 Ibid., 9th Distinction, Chap. 4.


Whittaker,¹¹ "His aim was nothing less than to create from the beginning a theory
of the universe, worked out as far as possible in every detail."
To understand Descartes' position on the particular subject of the nature of light,
one must first appreciate the major features of his grand design of the universe and the
attitudes which shaped this design. His philosophy was essentially dualistic; he believed
the physical world to be mechanistic and divorced from the mind, the only connection
between the two being through God's intervention. In science, he supported the inductive method of Francis Bacon, but with emphasis on rationalization and logic, rather
than upon experiences. Mathematics was Descartes' greatest interest and he is widely
called the father of analytic geometry. Under Kepler's influence, he became convinced
that the precision and universality of mathematics set it apart from all other fields of
study. This admiration of the clarity of mathematical expression serves to explain
why, as the first rule in the Discourse on Method, Descartes vowed
never to accept anything as true if I had not evident knowledge of its being so; that is, to
accept only what presented itself to my mind so clearly and distinctly that I had no occasion
to doubt it.

This attitude led Descartes to the decision that, since effects produced by means of
contacts and collisions were the simplest and most comprehensible phenomena in the
physical world, he would accept no other causes. Such a decision implies that bodies can
act on each other only when they are contiguous, and thus Descartes ruled out action
at a distance. To account for such phenomena as the lunar influence on tides, Descartes
assumed that space is not a void but is a plenum,† being populated by transparent
particles capable of transmitting force. He actually went further than this, postulating
that all matter was in one of three distinct forms, the luminous matter of the sun, the
transparent matter of interplanetary space, and the opaque matter of the earth, giving
as his reason¹²
For, seeing that the sun and the fixed stars emit light, that the heavens transmit it, and
that the earth, the planets, and the comets reflect it, it appears to me that there is ground
for using these three qualities of luminosity, transparency, and opacity to distinguish the
three elements of the visible world.

Descartes assumed that the luminous matter of the sun consisted of particles which
were in continuous motion. Since there was no empty space for the particles to move
into, he argued that they took the places vacated by other particles which were also in
motion, and thus developed the notion of closed chains of moving particles. The
motions of these closed chains constituted vortices, an important concept in his explanation of the universe. Thus, according to Descartes' theory,¹³ the sun consists of an
enormous vortex composed of the first or subtlest kind of matter. The luminous particles of this vortex, due to centrifugal action, constantly strain away from their centers
of rotation and thus press against the transparent particles of the ether. The ether

† Thus did the concept of an ether enter science for the first time. The word is of Greek extraction and
originally meant blue sky.
11 E. Whittaker, A History of the Theories of Aether and Electricity, Vol. 1, p. 4, Thomas Nelson and
Sons, Ltd., London, 1951.
12 R. Descartes, Principes de la Philosophie, 4th ed., Part 3, Sec. 52, Chez Theodore Girard, Paris, 1681.
13 Ibid., Sec. 55-64.


Descartes imagined to consist of a closely packed assemblage of globules, of a size
intermediate between that of the luminous matter of the sun and the opaque matter
of the earth. The pressure of the vortex against these ether particles causes them to
tend to move, thus exerting a pressure on their neighbors, which in turn tend to move,
and in this manner the force exerted by the vortex is passed along through the ether
particles, from layer to layer. In Descartes' view, the transmission of this pressure constitutes light, a thought he summarizes in the passage¹⁴
. . . the force of light . . . does not consist in the duration of some motion but only in the
fact that these small globules (of the ether) are pressed and tend to move toward some new
location, although they do not actually move.

Descartes also provided the first theoretical derivation of the law of refraction, discovered experimentally somewhat earlier (1621) by Willebrord Snell. This derivation is
important because it contains a consequence which later loomed as a decisive factor in
settling the controversy as to the true nature of light. In the Descartes derivation, a
light ray is assumed to be incident on a plane interface between two media at an angle $i$
with respect to the normal, traveling at a velocity $v_i$ in the first medium, and departing
from the interface at a velocity $v_r$ in the second medium, in a direction making an angle
$r$ with respect to the normal. Descartes then assumed that the component of velocity
parallel to the interface was unaffected, obtaining

$$v_i \sin i = v_r \sin r$$

from which Snell's law

$$\frac{\sin i}{\sin r} = \frac{v_r}{v_i} = n$$

follows immediately. However, if the second medium is denser, so that $i > r$, it follows
that $v_r > v_i$. Thus Descartes' derivation leads to the conclusion that light must travel
faster in a denser medium, a conclusion which was later shown to be in contradiction
with experiment.
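The consequence just described can be checked numerically. The following is a minimal sketch, assuming an air-to-water interface with refractive index n = 1.33 and an angle of incidence of 45 degrees; both values, and the function names, are chosen here purely for illustration. It computes the refraction angle from Snell's law and then the velocity ratio implied by Descartes' assumption that the velocity component parallel to the interface is unchanged.

```python
import math

def refraction_angle(i_deg: float, n: float) -> float:
    """Refraction angle r in degrees, from Snell's law sin i / sin r = n."""
    return math.degrees(math.asin(math.sin(math.radians(i_deg)) / n))

def velocity_ratio_descartes(i_deg: float, r_deg: float) -> float:
    """Ratio v_r / v_i implied by v_i sin i = v_r sin r (parallel component conserved)."""
    return math.sin(math.radians(i_deg)) / math.sin(math.radians(r_deg))

n_water = 1.33   # assumed index for illustration (air to water)
i_deg = 45.0     # assumed angle of incidence, in degrees

r_deg = refraction_angle(i_deg, n_water)
ratio = velocity_ratio_descartes(i_deg, r_deg)

# With i > r, the ratio exceeds 1: light would travel faster in the denser medium.
print(f"r = {r_deg:.2f} deg, v_r / v_i = {ratio:.3f}")
```

Under these assumptions the ratio simply reproduces n, so any medium with n > 1 yields a larger velocity in the second medium, which is the prediction that experiment later contradicted.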
Descartes' opinions were vigorously attacked by Robert Hooke (1635-1703), whose
views mark a significant turning point in conjectures about the nature of light. Noted
for Hooke's law, he was an able mechanician who devised many improvements in
clocks and astronomical instruments, and was the first to formulate a theory of planetary movements as a mechanical problem. He was responsible for the development of
microscopy as a science in England, and his interest in this subject led him to many
experiments concerned with light itself. Hooke became convinced that light was an
undulatory phenomenon, and his reasons are lucidly expressed in the passage¹⁵
And first for Light, it seems very manifest, that there is no luminous Body but has the
parts of it in motion more or less . . . . It would be somewhat too long . . . to examine,
and positively to prove, what particular kind of motion it is that must be the efficient of
Light . . . . I found it ought to be exceeding quick . . . that in all extreamly hot shining
bodies, there is a very quick motion that causes Light, as well as a more robust that causes
Heat, may be argued from the celerity wherewith the bodies are dissolv'd.
14 Ibid., Sec. 63.
15 R. Hooke, Micrographia, or Some Physiological Descriptions of Minute Bodies Made by Magnifying
Glasses, 1st ed., pp. 54-56, published by the Royal Society of London, reproduced by Dover Publications, Inc., New York, 1961.


Next, it must be a Vibrative motion. And for this the newly mention'd Diamond affords
us a good argument: since if the motion of the parts did not return, the Diamond must
after many rubbings decay and be wasted . . . .
And Thirdly, That it is a very short vibrating motion, I think the instances drawn from
the shining of Diamonds will also make probable. For a Diamond being the hardest body
we yet know in the World, and consequently the least apt to yield or bend, must consequently also have its vibrations exceeding short.

Having proposed an explanation for the sources of light, Hooke then suggested
That the motion is propagated every way through an Homogeneous medium by direct or
straight lines extended every way like Rays from the center of a sphere . . . in an Homogeneous medium this motion is propagated every way with equal velocity, whence necessarily
every pulse or vibration of the luminous body will generate a Sphere, which will continually
increase, and grow bigger, just after the same manner (though indefinitely swifter) as the
waves or rings on the surface of the water do swell into bigger and bigger circles about a
point of it, where, by the sinking of a stone the motion was begun, whence it necessarily
follows, that all the parts of these Spheres undulated through an Homogeneous medium
cut the Rays at right angles.

Thus Hooke paralleled Descartes in postulating a medium as the vehicle of light.
However, he replaced Descartes' notion that light was a statical pressure in the medium
with the notion that it is a rapid undulatory motion of small amplitude. Hooke then
went on to replace the Descartes analysis of refraction with one of his own, based on the
tilting of a wavefront at the interface of two media, but he failed to notice that it would
be necessary to assume the velocity to be slower in the denser medium in order to be
consistent with Snell's law.
The issue of whether light was wavelike or particlelike was firmly joined with the
emergence on the scientific scene of Isaac Newton (1642-1727). Renowned for his discoveries in mechanics, Newton also made many significant contributions in the field of
light. His most notable discovery was that white light is made up of the spectral colors,
which led him to propound a theory of prismatic colors directly opposed to an earlier
theory put forward by Hooke. This precipitated a bitter controversy in which Hooke
displayed considerable vexation and accused Newton of favoring the doctrine that light
is a material substance. Newton gave his answer in a communication to the Royal
Society in 1675 in which he said¹⁶
Were I to assume an hypothesis, it should be this, if propounded more generally, so as
not to determine what light is, farther than that it is something or other capable of exciting
vibrations in the aether: for thus it will become so general and comprehensive of other
hypotheses, as to leave little room for new ones to be invented. And therefore, because I
have observed the heads of some great virtuosos to run much upon hypotheses, as if my discourses wanted an hypothesis to explain them by, and found, that some, when I could not
make them take my meaning, when I spake of the nature of light and colours abstractedly,
have readily apprehended it, when I illustrated my discourse by an hypothesis; for this
reason I have here thought fit to send you a description of the circumstances of this hypothesis as much tending to the illustration of the papers I herewith send you. And though I shall
not assume either this or any other hypothesis, not thinking it necessary to concern myself,
16 I. Newton, Papers and Letters on Natural Philosophy, edited by I. Bernard Cohen, p. 179, Harvard
University Press, 1958.


whether the properties of light, discovered by me, be explained by this, or Mr. Hooke's,
or any other hypothesis capable of explaining them; yet while I am describing this, I shall
sometimes, to avoid circumlocution, and to represent it more conveniently, speak of it,
as if I assumed it, and propounded it to be believed. This I thought fit to express, that no
man may confound this with my other discourses, or measure the certainty of one by the
other, or think me obliged to answer objections against this script: for I desire to decline
being involved in such troublesome and insignificant disputes.

Newton's lifelong distaste for controversy is clearly evident here, but equally evident
is his refreshing lack of dogmatism about rigid hypotheses. He thoroughly disliked
highly imaginative suppositions, such as Descartes had invoked for his grand scheme of
the universe, and was much more interested in the formulation of the laws which govern
natural phenomena. Despite this, he found it impossible to give coherence to the
observed facts about light without resorting to some speculation about its nature.
Thus in this same communication, after an exhaustive and detailed discussion of the
possible composition of an ether, Newton goes on to suppose that
Light is neither aether, nor its vibrating motion, but something of a different kind propagated from lucid bodies. They, that will, may suppose it an aggregate of various peripatetic
qualities. Others may suppose it multitudes of unimaginable small and swift corpuscles of
various sizes, springing from shining bodies at great distances one after another; but yet
without any sensible interval of time, and continually urged forward by a principle of
motion, which in the beginning accelerates them, till the resistance of the aethereal medium
equal the force of that principle, much after the manner that bodies let fall in water are
accelerated till the resistance of the water equals the force of gravity.

In Newton's lifetime, all the facts known about light could not be harmonized with
either the corpuscular or wave theories then being proposed. However, he leaned
toward a corpuscular hypothesis, and near the end of his life summed up his objections
to the wave theory in a query at the conclusion of a revised edition of his Opticks¹⁷
Are not all Hypotheses erroneous, in which Light is supposed to consist in Pression or
motion, propagated through a fluid Medium? . . . If Light consisted only in Pression propagated without actual Motion, it would not be able to agitate and heat the Bodies which
refract and reflect it . . . . And if it consisted in Pression or Motion, propagated either in
an instant or in time, it would bend into the Shadow. For Pression or Motion cannot be
propagated in a Fluid in right Lines, beyond an obstacle which stops part of the Motion,
but will bend and spread every way into the quiescent Medium which lies beyond the
Obstacle . . . . The Waves on the Surface of stagnating Water, passing by the sides of a
broad Obstacle which stops part of them, bend afterwards . . . . But Light is never known
to follow crooked Passages nor to bend into the Shadow.

Newton goes on, in this query, to add the further objection that the wave theory (as it
then existed) could not account for the recently discovered phenomenon of the polarization of light.
The discoverer of this phenomenon of polarization was Christiaan Huygens (1629-1695), a contemporary of both Hooke and Newton, who sided with Hooke in favoring a
wave theory of light. Inventor of the pendulum clock, perceptive and influential critic
of Descartes' cosmological theories, Huygens is known principally for his work in optics.
17 I. Newton, Opticks, 4th ed., pp. 362-370, William Innys, Publisher, London, 1730. (Reprinted by
Whittlesey House, McGraw-Hill Book Company, New York, 1931.)


He greatly extended and improved the wave theory first enunciated by Hooke and
subscribed wholeheartedly to Hooke's hypothesis that light consists of some form of
motion. Witness the passage¹⁸
It is inconceivable to doubt that light consists in the motion of some sort of matter. For
whether one considers its production, one sees that here upon the Earth it is chiefly engendered by fire and flame which contain without doubt bodies that are in rapid motion, since
they dissolve and melt many other bodies, even the most solid; or whether one considers its
effects, one sees that when light is collected, as by concave mirrors, it has the property of
burning as a fire does, that is to say it disunites the particles of bodies. This is assuredly
the mark of motion, at least in the true Philosophy, in which one conceives the causes of all
natural effects in terms of mechanical motions. This, in my opinion, we must necessarily do,
or else renounce all hopes of ever comprehending anything in Physics.
And as, according to this Philosophy, one holds as certain that the sensation of sight is
excited only by the impression of some movement of a kind of matter which acts on the
nerves at the back of our eyes, there is here yet one reason more for believing that light
consists in a movement of the matter which exists between us and the luminous body.

Huygens next addresses himself to the question as to whether the motion is that of a
medium, as assumed by Hooke, or whether it is a stream of particles, as favored by
Newton. He says
Further, when one considers the extreme speed with which light spreads on every side,
and how, when it comes from different regions, even from those directly opposite, the rays
traverse one another without hindrance, one may well understand that when we see a
luminous object, it cannot be by any transport of matter coming to us from this object,
in the way in which a shot or an arrow traverses the air; for assuredly that would too greatly
impugn these two properties of light, especially the second of them.

Huygens shared with Newton the inclination to picture an ethereal medium in which
light propagated. Whereas Newton favored the idea that this medium was set into
vibration by the passage of light corpuscles through it, Huygens preferred to imagine
a process analogous to sound, in which the vibrating particles of the luminous source
would excite the contiguous portion of the medium into vibration, which would in turn
transfer this excitation on to the next portion, etc. This mechanical model of light
propagation led him to his most important contribution, ever since known as Huygens'
principle, and explained in the passage¹⁹
There is the further consideration in the emanation of these waves, that each particle of
matter in which a wave spreads, ought not to communicate its motion only to the next
particle which is in the straight line drawn from the luminous point, but that it also imparts
some of it necessarily to all the others which touch it and which oppose themselves to its
movement. So it arises that around each particle there is made a wave of which that particle
is the centre.

Using this principle, Huygens was able to show how all the points in one wavefront
could be treated as secondary sources which created the next wavefront, and thus provided satisfactory explanations for propagation and reflection. By assuming that the
velocity of light was slower in a denser medium he was also able to explain refraction.
18 C. Huygens, Traité de la Lumière, pp. 3-4, first published in Leyden in 1690; English translation by
S. P. Thompson, London, 1912; reprinted by University of Chicago Press.
19 Ibid., p. 19.


This proved to be a pivotal point a century and a half later in deciding between a
corpuscular or wave theory, since it has already been observed that the corpuscular
theory requires a faster velocity in a denser medium in order to be consistent with
the law of refraction.
Huygens was unsuccessful in explaining interference effects, such as the colored rings
of thin films and sharp shadows past obstacles, partly because it was not then appreciated how short the wavelengths of visible light are. He also confessed his inability to
explain his own discovery of polarization, but this is easily understood when one remembers that in 1700 it was not recognized that light consisted of transverse vibrations.
Similarly, Newton had difficulty in explaining the colors of thin films under the corpuscular theory and the noninterference of beams of light whose paths crossed. Although neither theory was adequate, the esteem in which Newton was held by his
contemporaries and followers was so great that the wave theory was rejected and
allowed to remain unnourished for over a century. If the fact that Newton found the
corpuscular hypothesis more acceptable retarded the growth of the theory of light, as
some have claimed, the fault lay with those who blindly espoused all his views. It has
already been noted that Newton himself did not hold rigidly to any one hypothesis but
rather gave tentative acceptance to that theory which appeared to him to fit most of
the facts.
Although most scientists of the eighteenth century accepted the corpuscular hypothesis, the wave theory was not totally without advocates. Franklin (1706-1790)
favored it, and Euler (1707-1783) took the same position, being persuaded by the
notion that particle emission from a luminous source would cause a diminution in its
mass, an effect not observed, whereas the emission of waves did not involve such a
consequence. However, the wave theory did not make any serious headway until a
new champion arose when Thomas Young (1773-1829) turned his attention to the
subject. A man of diverse and considerable talent, Young was a practicing physician
on the staff of St. George's Hospital. He was also a physicist, whose lectures at the
Royal Institution of London introduced the modern physical concept of energy. He
was a prodigy at two, an accomplished linguist while still in his boyhood, a musician,
and an archeologist who participated in the deciphering of the Rosetta stone. He made
contributions to the theory of tides, explained capillarity, and established the coefficient
of elasticity known as Young's modulus.
Drawing upon an earlier explanation by Newton in connection with tides, Young
introduced the concept of interference by saying²⁰
Suppose a number of equal waves of water to move upon the surface of a stagnant lake,
with a certain constant velocity, and to enter a narrow channel leading out of the lake.
Suppose then another similar cause to have excited another equal series of waves, which
arrive at the same channel, with the same velocity, and at the same time with the first.
Neither series of waves will destroy the other, but their effects will be combined: if they
enter the channel in such a manner that the elevations of one series coincide with those of
the other, they must together produce a series of greater joint elevations; but if the elevations of one series are so situated as to correspond to the depressions of the other, they
must exactly fill up those depressions, and the surface of the water must remain smooth:
at least I can discover no alternative, either from theory or from experiment.
Now I maintain that similar effects take place whenever two portions of light are thus
mixed; and this I call the general law of the interference of light.

20 T. Young, Miscellaneous Works, edited by George Peacock, Vol. 1, pp. 202-203, John Murray
Publishers, Ltd., London, 1855.

Young demonstrated this concept in an experiment performed before the Royal
Society of London in 1803. Using a distant source of a single color, he permitted light
to pass through two tiny holes placed close together in one screen, and to fall on a second
screen. The second screen showed a pattern of fine bands, alternately light and dark.
Young explained this pattern by recourse to a law he had enunciated²¹ in 1802:
Wherever two portions of the same light arrive at the eye by different routes, either
exactly or very nearly in the same direction, the light becomes most intense when the difference of the routes is any multiple of a certain length, and least intense in the intermediate state of the interfering portions; and this length is different for light of different
colours.
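Young's law can be illustrated numerically. The sketch below is not from the text: it assumes two beams of equal strength, for which the standard result is that the combined intensity varies as the squared cosine of pi times the route difference divided by the "certain length" (the wavelength). The wavelength, hole separation, and screen distance are illustrative values, and the fringe spacing uses the usual small-angle approximation.

```python
import math

def relative_intensity(route_difference, wavelength):
    """Intensity of two equal beams relative to the maximum: cos^2(pi * d / lambda)."""
    return math.cos(math.pi * route_difference / wavelength) ** 2

def fringe_spacing(wavelength, hole_separation, screen_distance):
    """Small-angle distance between neighbouring bright bands on the second screen."""
    return wavelength * screen_distance / hole_separation

WAVELENGTH = 600e-9        # assumed wavelength, m
HOLE_SEPARATION = 0.5e-3   # assumed distance between the two holes, m
SCREEN_DISTANCE = 1.0      # assumed distance from holes to second screen, m

for multiple in (0.0, 0.5, 1.0, 1.5, 2.0):
    value = relative_intensity(multiple * WAVELENGTH, WAVELENGTH)
    print(f"route difference = {multiple:.1f} wavelengths, relative intensity = {value:.3f}")

spacing = fringe_spacing(WAVELENGTH, HOLE_SEPARATION, SCREEN_DISTANCE)
print(f"bright bands are about {spacing * 1e3:.2f} mm apart")
```

Whole multiples of the wavelength give the brightest bands and the intermediate (half-integer) states give the dark ones, as the law states. Because the spacing scales with wavelength, each color produces bands of a different width.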

He also used this law to give the first satisfactory explanation of the colors of light
reflected from thin plates, arguing that the incident light causes two beams to reach
the eye: the first of these beams has been reflected from the first surface of the thin
plate, and the other from the second. These two beams produce the colors in the
reflected light due to their interference. Indeed, Young used the measured thickness
of thin plates to determine for the first time the characteristic lengths, or wavelengths,
of the various colors of visible light, publishing²² a table of values which is remarkably
accurate by today's standards.
Despite a bitter attack on Young by the followers of Newton, support for the wave
theory accumulated rapidly. Fresnel (1788-1827) satisfactorily explained diffraction
past a sharp edge in terms of mutual interference of the secondary "Huygens" waves
generated by those portions of the original wavefront not obstructed by the diffracting
obstacle. Sharp shadows beyond obstacles big in terms of wavelengths thus became
understood, a point about the wave theory which had always bothered Newton. Fresnel also demonstrated light interference by employing two mirrors, and in a brilliant
experiment confirmed an hypothesis by Young that light consisted of transverse vibrations by showing that two cross-polarized beams of light do not interfere with each
other. This permitted an explanation under the wave theory of the phenomenon of
light polarization in crystals, which had earlier been a stumbling block for Huygens.
Kirchhoff (1824-1887), starting from the wave equation, developed a diffraction formula in which Huygens' secondary sources were revealed, thus putting that principle
on a much firmer foundation.²³
Finally, the coup de grace was delivered to the corpuscular theory in 1850 when
Foucault²⁴ (1819-1868) and Fizeau²⁵ (1819-1896) measured the velocity of light in
air and water, finding that it was slower in the latter. This result was consistent with
the wave theory, whereas the reverse had been predicted by the corpuscular hypothesis.
With this experiment, all sensible objection to the wave theory of light had disappeared.

21 T. Young, "An Account of Some Cases of the Production of Colours," Phil Trans Roy Soc (London),
92, 387-397; July 1802.
22 T. Young, "On the Theory of Light and Colours," Phil Trans Roy Soc (London), 92, 12-48; November 1801.
23 Kirchhoff summarized his work in the textbook Vorlesungen über mathematische Optik, Zweite Vorlesung, Sec. 2, Berlin, 1891.
24 M. L. Foucault, "General Method for Measuring the Speed of Light in Air and Transparent Media.
Relative Speeds of Light in Air and Water," Compt Rend, 30, 551-560; May 1850.
25 H. Fizeau and L. Breguet, "Note on an Experiment Relative to the Comparative Velocities of
Light in Air and in Water," Compt Rend, 30, 562-563; May 1850.
At about this time Maxwell (1831-1879) began formulating his theory of electromagnetism, culminating in the celebrated equations which bear his name. Wavelike
solutions to these equations indicated that electromagnetic fields would propagate
through a vacuum at the same speed as light. This led Maxwell to the important
conjecture that light is an electromagnetic phenomenon and further strengthened the
belief that light is basically wavelike in nature.
In 1887, Heinrich Hertz (1857-1894) provided the first successful demonstration of
the generation and propagation of electromagnetic waves, using separate spark gap
coils to transmit and receive. This achievement was hailed immediately by his contemporaries as the crowning victory of physics, the first experimental verification of
the validity of Maxwell's theory. Ironically, a side effect of this experiment was destined
to contribute to a great revolution in scientific thought. Hertz noticed that the sparks
produced in the gap of his receiving coil were influenced by the light falling on this gap
from the sparks in the transmitting coil. Further investigation led Hertz to conclude
that it was the ultraviolet portion of the light which was responsible for the effect,
and that the effect was greatest if the light were incident on the negative point of the
gap. Hertz reported these observations but carried the investigation no further. However, his discovery intrigued many others, and significant contributions were made by
Hallwachs, who showed that the photoelectric effect, as it came to be called, consisted
of the emission of negative charges, and by Lenard, who measured the charge to mass
ratio of the emitted charges and concluded that they were electrons.
A variety of materials was found to be photosensitive, but the characteristics of
the emission were surprising. The number of electrons emitted per unit time was proportional to the intensity of the incident light, which seemed reasonable. However,
the maximum kinetic energy of the emitted electrons was dependent on the frequency
of the light used, but independent of its intensity. A classical argument, assuming a
collision-like process, would anticipate that the greater the intensity of the incident
wave, the greater would be the energy of the electrons which were torn loose from the
surface.
Albert Einstein (1879-1955) offered an explanation of the photoelectric effect in 1905,
the same year he received his doctorate from Zurich and published his first paper on
relativity. Drawing on an hypothesis made several years earlier by Planck, who had been
concerned with the spectral distribution of black-body radiation, Einstein assumed²⁶
. . . that the incident light is composed of quanta of energy $(R/N_A)\beta\nu$ . . . . The quanta
of energy penetrate the surface of the material and their respective energies are at least
in part changed into the kinetic energy of electrons. The simplest process conceivable is
that a quantum of light gives up all its energy to a single electron . . . . Upon reaching the
surface, an electron originally inside the body will have lost a part of its kinetic energy.
Furthermore, one may assume that each electron in leaving the body does an amount of work
$W$, which is characteristic of the material. Those electrons which are ejected normal to and
from the immediate surface will have the greatest velocities. The kinetic energy of these
electrons is

$$\frac{R}{N_A}\beta\nu - W$$

26 A. Einstein, "An Heuristic Viewpoint Concerned with the Generation and Transformation of Light,"
Ann Phys, 322, 132-148; 1905.
Einstein thus hypothesized that the incident light was composed of quanta, or photons,
whose energy was proportional to the frequency $\nu$ of the light. His proportionality
factor consisted of a parameter $\beta$ multiplied by the Boltzmann constant, $k = R/N_A$,
with $R$ the ideal gas constant and $N_A$ Avogadro's number. Einstein then argued that,
if the photoelectric material were raised to a potential $V$ above a surrounding grounded
electrode, then even the most energetic emitted electrons would not reach the grounded
electrode if $V$ were of such magnitude that

$$Ve = \frac{R}{N_A}\beta\nu - W$$

in which e is the electronic charge. He then went on to say


If the formula derived is correct, it would follow that V, if plotted in cartesian coordinates as a function of the frequency of the exciting photons, would yield a straight line
whose slope is independent of the material under investigation . . . . If each quantum of
light were to give its energy to the electrons independently of all the others then the velocity
distribution . . . will be independent of the intensity of the exciting radiation; on the other
hand the numbers of electrons leaving the body under equal conditions will be directly proportional to the intensity of the incident radiation.

Einstein's formula and explanation are notable for their simplicity and fit all the
observed facts. At the time he proposed this explanation he had at his disposal only
qualitative data, but his equation received final and thorough experimental verification
through the precise work of Millikan in 1916.²⁷ Working with a circuit shown in simplified form in Figure 1.1, Millikan varied the reverse bias until it reached a value $V$
such that the ammeter read no current. Since this voltage was just enough to prevent
the most energetic electrons from reaching the second electrode, one could argue that
$Ve$ was the maximum kinetic energy any of the electrons had upon being emitted
from the photosensitive electrode. When Millikan varied $\nu$, the frequency of the incident light, and recorded $V$ for each frequency, he obtained a curve such as shown
in Figure 1.2. This experimental result was consistent with Einstein's equation
$Ve = (R/N_A)\beta\nu - W$, and the experimental significance of the intercept $\nu_0$ is that
light at a lower frequency cannot cause photoelectric emission from the metal concerned. The quantity $\nu_0$ was found to be characteristic of the photosensitive material
forming the electrode, but the slope of the curve was the same for all electrodes. The
slope, which is Einstein's proportionality constant $(R/N_A)\beta$, proved to be identical
with the constant $h$ which Planck employed to explain black-body radiation. Thus
Einstein's quantum of light, or photon, was found to have an energy $E = h\nu$.
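The straight-line behavior that Millikan observed can be sketched numerically from the relation $Ve = h\nu - W$. The example below is an illustration under stated assumptions and does not reproduce Millikan's data: the constants are rounded modern values, and the two work functions (2.3 eV and 4.5 eV) are hypothetical electrode materials chosen only to show that the intercept changes while the slope does not.

```python
H = 6.626e-34         # Planck's constant, J*s (rounded modern value, assumed)
E_CHARGE = 1.602e-19  # electronic charge, C (rounded modern value, assumed)

def max_kinetic_energy(frequency_hz, work_function_j):
    """Einstein's relation Ve = h*nu - W, set to zero below the threshold frequency."""
    return max(0.0, H * frequency_hz - work_function_j)

def stopping_potential(frequency_hz, work_function_j):
    """Reverse bias V, in volts, that just stops the most energetic electrons."""
    return max_kinetic_energy(frequency_hz, work_function_j) / E_CHARGE

def threshold_frequency(work_function_j):
    """Intercept nu_0 = W / h; light of lower frequency ejects no electrons."""
    return work_function_j / H

def energy_slope(work_function_j, f1=1.5e15, f2=2.0e15):
    """Slope of Ve against nu, taken between two frequencies above threshold."""
    rise = max_kinetic_energy(f2, work_function_j) - max_kinetic_energy(f1, work_function_j)
    return rise / (f2 - f1)

W_A = 2.3 * E_CHARGE  # hypothetical material A, work function in joules
W_B = 4.5 * E_CHARGE  # hypothetical material B, work function in joules

for name, w in (("A", W_A), ("B", W_B)):
    print(f"material {name}: nu_0 = {threshold_frequency(w):.3e} Hz, "
          f"slope = {energy_slope(w):.3e} J*s, "
          f"V at 2.0e15 Hz = {stopping_potential(2.0e15, w):.2f} V")
```

The two materials give different threshold frequencies but the same slope, equal to the value of $h$ supplied, which is the feature of the curve the text describes.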
However, the concept that light consists of discrete energy bundles, or photons,
smacks strongly of the earlier corpuscular theories. Is light wavelike or corpuscular?
The best current answer appears to be that it has a dual personality, exhibiting one
set of characteristics or the other, depending on how it is interacting with its environment. If the process being considered is at the microscopic level, the quantized nature
of light will most likely have to be considered; if it is a macroscopic process, the wave
nature of light should account successfully for the interaction.
It would seem that just about everybody was right all along.

27 R. A. Millikan, "A Direct Photoelectric Determination of Planck's h," Phys Rev, 7, 355-388; 1916.

FIGURE 1.1 Photoelectric diode.

FIGURE 1.2 Maximum electron energy vs. light frequency.

1.2 *

HISTORICAL SURVEY-THE VELOCITY OF LIGHT

Whereas a determination of the nature of light is not totally decisive, such ambivalence
does not exist when the discussion turns to the conception of the velocity of light.
Whether light is thought of as a stream of photons or a propagating wave, the transfer
of energy occurs at a speed which, today, can be measured with extraordinary precision.
Yet this speed is so great that it is not surprising to find earlier debates as to whether
the velocity of light is finite or infinite.

* The reader solely interested in the technical presentation may wish to omit this section except for
the discussion of Bradley's experiment.

The direct evidence is lost to us, but Empedocles apparently felt that the velocity is
finite, for Aristotle disputes with him in the passage²⁸
Empedocles (and with him all others who used the same forms of expression) was wrong
in speaking of light as 'traveling' or being at a given moment between the earth and its
envelope, its movement being unobservable by us; that view is contrary both to the clear
evidence of argument and to the observed facts; if the distance traversed were short, the
movement might have been unobservable, but where the distance is from extreme East to
extreme West, the draught upon our powers of belief is too great.

Heron of Alexandria, whose life span has variously been placed in the period from the
second century B.C. to the third century A.D., and who is noted for his invention of
many contrivances operated by water, steam, or compressed air, believed with Euclid
and Ptolemy that light rays originated in the eye. This belief led him to an interesting
argument as proof that the velocity of light is infinite:²⁹
That the sight rays emanating from our eyes move with infinite velocity can also be seen
from the following. Namely if, after having closed our eyes, we look again upward to the
heavens, these rays reach the heavens without any time interval having elapsed (i.e., immediately). For in the same instant in which we open our eyes, we see the stars, even though
we may say that the distance is practically infinite. Also, if this distance were even greater,
the same occurrence would be repeated in any case, and thus it results that the rays emanating
from our eyes propagate with infinite velocity. They therefore suffer in their propagation
no interruption in their motion, nor do they make a detour, nor follow a broken-line path,
but rather move along the shortest line, namely the straight one.

Alhazen believed otherwise, and in his treatise on optics stated:³⁰


And we shall see that color will not be perceived in that which is color by the sight, nor
light in that which is light, except in time . . . the arrival of the sensation (of light) to the
hollow of the optic nerve is like the arrival of light from holes . . . the passing of light
from a hole to an object opposite the hole will not be possible except in time, even though
this fact is concealed from the mind.
The passing of light from a hole to an object opposite the hole cannot escape being in one
of the two following ways, namely, that either light will come to that part of the air which is
near the hole, before it can arrive to another following point, and thereafter it will come
to another point, and so to another, until it arrives at the object opposite the hole, or light
will arrive at the entire intermediate atmosphere between the hole and the object opposite
the hole, and to the very object, all at the same time. If the air received light in a successive fashion, the light would not arrive at the object opposite the hole, except through
movement. But movement does not exist except in time; thus, if the whole atmosphere
receives light at the same time, even the arrival of light to the atmosphere does not exist,
since it was not in the atmosphere before . . . .
If the hole through which the light enters becomes blocked, and then the blockage is removed, the instant during which the blockage is removed . . . is different from the instant
during which the light reaches the contiguous atmosphere . . . . Therefore this is done by
a movement; but a movement does not exist except in time . . . . However, this time
element is strongly concealed from the mind due to the rapidity of the perception of the
sensation of light by the air.

28 Aristotle, De Anima, 418b, 20, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.
29 Heronis Alexandrini, Catoptrica, Vol. 2, pp. 320-323, translated into German by L. Nix and W.
Schmidt, von B. G. Teubner, Leipzig, 1900. (Private English translation.)
30 Alhazen, Opticae Thesaurus, edited by Risner, Vol. 2, Chap. 2, Article 21, Basel, 1572. (Private
translation.)

Avicenna agreed with Alhazen, basing his opinion on the belief that light consisted of
the motion of finite particles which therefore could not have an infinite velocity. Roger
Bacon also sided with Alhazen, although he did not like the reasons advanced above
and preferred the argument Alhazen put forth in his seventh volume that "from the
same terminus the perpendicular ray reaches more quickly the terminus of the space
than the ray that is not perpendicular." However, Bacon was very gentle in his disagreement with Aristotle, drawing a fine distinction between perceptible and imperceptible intervals of time. His principal reason for believing in a finite velocity is contained in the passage³¹
. . . an instant has the same relation to time as a point to a line. Therefore, interchanging
terms, an instant has the same relation to a point as time has to a line; but the passage
through a point is in an instant. Therefore the passage through the whole line is in time.
Therefore species [of light] passing through linear space, however small, will pass through
in time . . . . If, therefore, the multiplication of light is instantaneous, and not in time,
there will be an instant without time; because time does not exist without motion. But it
is impossible that there should be an instant without time, just as there cannot be a point
without a line. It remains, then, that light is multiplied in time, and likewise all species of
a visible thing and of vision. But nevertheless the multiplication does not occupy a sensible
time and one perceptible by vision, but an imperceptible one, since anyone has experience
that he himself does not perceive the time in which light travels from east to west.

Francis Bacon (1561-1626), an English philosopher credited with the formulation
and introduction of the inductive method of modern science, struggled with the question of the velocity of light in the absence of experimental information, as is evident in
this excerpt.³²
Even in sight, whereof the action is most rapid, it appears that there are required certain
moments of time for its accomplishment . . . . (It is not surprising that we do not see the
actual passage of light, for there are things which by reason of the velocity of their motion
cannot be seen-as when a ball is discharged from a musket . . . . ) This fact, with others
like it, has at times suggested to me a strange doubt, viz. whether the face of a clear and starlight sky be seen at the instant at which it really exists, and not a little later; and whether
or not, as regards our sight of heavenly bodies, [there is] a real time and an apparent time,
just like the real place and apparent place which is taken account of by astronomers in the
correction for parallaxes . . . [whether or not] the images or rays of heavenly bodies . . .
take a perceptible time in travelling to us. But this suspicion as to any considerable interval
between the real time and the apparent afterwards vanished entirely . . . what had most
weight of all with me was, that if any perceptible interval of time were interposed between
the reality and the sight, it would follow that the images would oftentimes be intercepted
and confused by clouds rising in the meanwhile, and similar disturbances of the medium.

31 R. Bacon, Opus Majus, Part 5: 9th Distinction, Chap. 3, the R. B. Burke translation, University
of Pennsylvania Press, Philadelphia, 1928.
32 Francis Bacon, Philosophical Works, edited by J. M. Robertson from the edition of Ellis and Spedding, p. 363, London, 1905. (As quoted in I. B. Cohen, Roemer, p. 11, The Burndy Library, Inc.,
New York, 1944.)

A contrast to all this metaphysical speculation is found in the attitude of Galileo
Galilei (1564-1642). Widely regarded as the father of modern physics, Galileo was a
champion of the experimental method. At the age of twenty-six, while professor of
mathematics at Pisa, he began a systematic investigation of the mechanical doctrines
of Aristotle. Having convinced himself by experiment of the error in many of Aristotle's
assertions, Galileo invoked the enmity of the Church by loudly proclaiming his dissensions. These included the question of whether or not a heavy body falls faster than a
light one, and later the profound question of whether the Ptolemaic or Copernican view
of the universe was the proper one.
Galileo was the first to observe that a simple pendulum has a natural period. He
properly deduced the formulas of uniformly accelerated motion, and his contributions
to mechanics were an important precursor to the generalizations made by Newton a
century later. He constructed the first astronomical telescope and with it discovered the
satellites of Jupiter, the crescent phases of Venus, sunspots and the rotation of the sun,
and the libration of the moon. Galileo became interested in the question of light velocity
and, believing it to be finite, undertook to establish this experimentally. His approach
was logical but doomed to failure because of the great velocity involved. In the famous
Dialogues, published in Leyden in 1638, Galileo proposed that³³
Each of two persons take a light contained in a lantern, or other receptacle, such that by
the interposition of the hand, the one can shut off or admit the light to the vision of the
other. Next let them stand opposite each other at a distance of a few cubits and practice
until they acquire such skill in uncovering and occulting their lights that the instant one
sees the light of his companion he will uncover his own. After a few trials the response will
be so prompt that without sensible error the uncovering of one light is immediately followed by the uncovering of the other, so that as soon as one exposes his light he will instantly
see that of the other. Having acquired skill at this short distance let the two experimenters,
equipped as before, take up positions separated by a distance of two or three miles and let
them perform the same experiment at night, noting carefully whether the exposures and
occultations occur in the same manner as at short distances; if they do, we may safely
conclude that the propagation of light is instantaneous; but if time is required at a distance of three miles which, considering the going of one light and the coming of the other,
really amounts to six, then the delay ought to be easily observable. If the experiment is to
be made at still greater distances, say eight or ten miles, telescopes may be employed, each
observer adjusting one for himself at the place where he is to make the experiment at night;
then although the lights are not large and are therefore invisible to the naked eye at so great
a distance, they can readily be covered and uncovered since by aid of the telescopes, once
adjusted and fixed, they will become easily visible.

Later he comments,
In fact I have tried the experiment only at a short distance, less than a mile, from which
I have not been able to ascertain with certainty whether the appearance of the opposite
light was instantaneous or not; but if not instantaneous it is extraordinarily rapid-I should
call it momentary; . . . .
33 Galileo Galilei, Dialogues Concerning Two New Sciences, p. 43, reprinted by Dover Publications, Inc.,
New York.


Galileo's experiment was repeated by scientists of the Florentine Academy but with
inconsistent results. The human reaction times were much too great, the separation of
the lanterns was only a few miles, and the timepieces of that era were extremely crude.
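A rough calculation shows why the lantern experiment could not succeed. The sketch below is an illustration, not part of the historical record: it uses a rounded modern value for the speed of light and assumes a human reaction time of about 0.2 second, a figure chosen only for comparison.

```python
C = 2.998e8              # speed of light, m/s (rounded modern value, assumed)
METERS_PER_MILE = 1609.344
REACTION_TIME_S = 0.2    # assumed human reaction time, seconds

def round_trip_delay(separation_miles):
    """Time for light to go to the second observer and come back, in seconds."""
    return 2.0 * separation_miles * METERS_PER_MILE / C

def separation_for_delay(delay_s):
    """One-way separation, in miles, at which the round trip takes delay_s seconds."""
    return delay_s * C / (2.0 * METERS_PER_MILE)

delay = round_trip_delay(3.0)
ratio = REACTION_TIME_S / delay
needed = separation_for_delay(REACTION_TIME_S)

print(f"round trip over 3 miles each way: {delay * 1e6:.1f} microseconds")
print(f"assumed reaction time is {ratio:.0f} times longer")
print(f"separation needed to match the reaction time: {needed:.0f} miles")
```

Under these assumptions the delay over Galileo's proposed six miles of travel is a few tens of microseconds, thousands of times shorter than the reaction time of the observers, so no amount of practice with the lanterns could have revealed it.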
In the continuing absence of decisive experimental results, the speculation continued.
Kepler (1571-1630) held an Aristotelian view,³⁴ maintaining that light can be propagated an infinite distance in zero time. He based this view on the argument that light
is not matter and thus cannot offer resistance to the force which moves it. In Aristotelian mechanics, this requires that light attain an infinite velocity.
Descartes, as has already been noted, believed that light consisted of a transmission
of pressure through the tightly packed globules of the ether. However, in his conception,
light was not a motion because the globules only tended to move, being restrained in
position by their neighbors. Thus each globule was capable of transmitting force
instantaneously, which led Descartes to conclude³⁵
Thus, we shall have no trouble in realizing why such an effect, which I attribute to light,
extends in a spherical fashion all around the sun . . . and why such light propagates instantaneously to all distances.

It is interesting to observe that Descartes could believe both that the velocity of light
was infinite and that the velocity of light was not the same in different media, an
assumption he made in deriving Snell's law (see Section 1.2).
In a correspondence with the Dutch physicist Beekman (1570-1637), Descartes was
hard pressed to defend his metaphysical arguments in favor of an infinite light velocity,
and hit upon an argument which is scientifically sound, and which seemed to him to be
a complete proof that his position was the only correct one. Descartes proposed consideration of a lunar eclipse, caused by the earth being interposed between the sun and
the moon. He then supposed that it requires an hour for light to travel from the earth
to the moon, which would mean that the moon did not grow dark until an hour after
the instant of collinearity of the three bodies. People on earth would not be aware of
this darkening for an additional hour, or until the earth and moon had moved in their
orbits an additional two hours beyond the position of collinearity. But, argued
Descartes, this is clearly contrary to experience, for the eclipsed moon is always
observed at a point in the ecliptic opposite to the sun. Thus the light must travel
instantaneously.
Huygens challenged this proof at its only weak point, saying³⁶
But it must be noted that the speed of light in this argument has been assumed such that
it takes a time of one hour to make the passage from here to the Moon. If one supposes
that for this . . . it requires only ten seconds of time . . . then it will not be easy to perceive anything of it in observations of the Eclipse; nor, consequently, will it be permissible
to deduce from it that the movement of light is instantaneous.
It is true that we are here supposing a strange velocity that would be a hundred thousand
times greater than that of Sound . . . . But this supposition ought not to seem to be an
impossibility; since it is not a question of the transport of a body with so great a speed,
but of a successive movement which is passed on from some bodies to others. I have then
made no difficulty, in meditating on these things, in supposing that the emanation of light
is accomplished with time . . . .

34 J. Kepler, Ad Vitellionem paralipomena quibus astronomiae pars optica traditur, Frankfurt, 1604.
35 R. Descartes, Principes de la Philosophie, 4th ed., Part 3, Sec. 64, Chez Theodore Girard, Paris, 1681.
36 C. Huygens, Traité de la Lumière, pp. 6-7, first published in Leyden in 1690; English translation by
S. P. Thompson, London, 1912; reprinted by University of Chicago Press.
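The disagreement between Descartes and Huygens can be put in rough numbers. The sketch below is a simplified illustration, not a reconstruction of either man's calculation: it assumes the eclipsed moon appears displaced from the point opposite the sun by the angle the moon moves, relative to the sun, during the round trip of light, and it takes that motion from the synodic month of about 29.53 days. The modern Earth-Moon distance and speed of light are rounded values.

```python
SYNODIC_MONTH_S = 29.53 * 86400.0  # assumed period of the moon relative to the sun, s
EARTH_MOON_M = 3.844e8             # mean Earth-Moon distance, m (rounded modern value)
C = 2.998e8                        # speed of light, m/s (rounded modern value)

def displacement_deg(one_way_time_s):
    """Angle the moon moves relative to the sun while light goes out and back."""
    round_trip_s = 2.0 * one_way_time_s
    return 360.0 * round_trip_s / SYNODIC_MONTH_S

descartes_deg = displacement_deg(3600.0)          # Descartes' supposed one hour each way
huygens_arcsec = displacement_deg(10.0) * 3600.0  # Huygens' ten seconds each way
modern_one_way_s = EARTH_MOON_M / C
modern_arcsec = displacement_deg(modern_one_way_s) * 3600.0

print(f"one hour each way:    {descartes_deg:.2f} degrees")
print(f"ten seconds each way: {huygens_arcsec:.1f} seconds of arc")
print(f"modern one-way time:  {modern_one_way_s:.2f} s, {modern_arcsec:.2f} seconds of arc")
```

With these assumptions, a travel time of one hour gives a displacement of about one degree, roughly twice the apparent diameter of the moon and readily noticed, whereas ten seconds gives only about ten seconds of arc. This is the sense in which Huygens held that the eclipse observations could rule out a slow light but not a finite one.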

Hooke also appreciated the weakness in Descartes' argument, and in speaking of the
propagation of light through a transparent body or medium, he asserted³⁷ that the
light
may be communicated or propagated through it to the greatest imaginable distance in
the least imaginable time; though I see no reason to affirm, that it must be in an instant:
For I know not any one Experiment or observation that does prove it. And, whereas it may
be objected, That we see the Sun risen at the very instant when it is above the sensible
Horizon, and that we see a Star hidden by the body of the Moon at the same instant, when
the Star, the Moon, and our Eye are all in the same line; and the like Observations, or
rather suppositions, may be urg'd. I have this to answer, That I can as easily deny as they
affirm; for I would fain know by what means any one can be assured any more of the
Affirmative, than I of the Negative. If indeed the propagation were very slow, 'tis possible
something might be discovered by Eclypses of the Moon; but though we should grant the
progress of the light from the Earth to the Moon, and from the Moon back to the Earth
again to be full two Minutes in performing, I know not any possible means to discover
it . . . .

The distinction for having performed the first decisive determination of the velocity
of light goes to Ole Roemer (1644-1710). Born in Denmark, and educated under the
Bartholins at the University of Copenhagen, he then went to Paris as a young astronomer for the Académie Royale des Sciences, which at that time was undertaking a
project to prepare more accurate maps. A technique had been proposed whereby the
longitude of any place could be determined relative to the longitude of Paris by simultaneous observation of an astronomical phenomenon from the two positions. What was
needed was a celestial occurrence of reasonable frequency, and a tentative selection was
made of the eclipses of the satellites of Jupiter, a phenomenon which had been discovered earlier in the same century by Galileo.
In choosing Roemer to work on this project, the Académie picked a man who was to
prove to be one of the greatest practical astronomers of all time. He built the first good
transit instrument and the earliest transit circle, greatly improved on the construction
of micrometers, and showed that the epicycloid is the best shape for gear teeth, incorporating this discovery into the design of all his astronomical instruments; in his
later years he supervised the erection of an excellent observatory near Copenhagen.
While in Paris at the beginning of his career, and upon launching into a study of the
eclipses of Jupiter's moons, Roemer was struck by a surprising observation. Since one
would expect that the period of a moon would remain constant, knowing the time at
which one eclipse occurred, it was then a simple matter to predict a sequence of later
times at which a given moon would be eclipsed by Jupiter. But when Roemer did this,
he predicted a time sequence which did not agree with later eclipse measurements. He
attributed this disparity to the changed distance between Earth and Jupiter, which,
if the velocity of light were finite, would explain the irregularity in eclipse occurrences.
Accordingly, in September 1676, Roemer announced to members of the Paris
Académie that the next eclipse of the innermost satellite of Jupiter, expected on
37 R. Hooke, Micrographia, 1st ed., p. 56, published by the Royal Society of London, reproduced by
Dover Publications, Inc., New York, 1961.


November 9, would occur exactly ten minutes later than the time computed on the
basis of previous eclipses. When observation had confirmed this startling prediction,
Roemer again addressed the Académie, saying 38
The necessity of this new equation of the retardation of light, is established by all the
observations that have been made by the Académie Royale and by the Observatory during
the last eight years, and it has been confirmed anew by the emersion of the first satellite,
observed at Paris last November 9th at 5h 35m 45s at night, 10 minutes later than had been
expected . . . .

From his knowledge of the relative positions of the earth and Jupiter, Roemer deduced
that this retardation was such that light should take 22 minutes to cross the diameter
of the earth's orbit, which translates into a velocity of light of approximately 140,000
mi/sec. Roemer's value was thus about 25 percent low,† but his accomplishment was
nevertheless impressive. For the first time in history man had been able to measure
a velocity which was so great that many had thought it to be infinite.
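
Roemer's arithmetic can be retraced in a few lines. The sketch below is only an illustration: the mean earth-sun distance and the modern speed of light are assumed present-day values (they are not given in the text), and the function name is hypothetical.

```python
# Rough check of Roemer's figure: light crosses the diameter of the
# earth's orbit in 22 minutes.
MILES_PER_AU = 92.96e6        # assumed modern mean earth-sun distance, miles
MODERN_C_MI_PER_S = 186_282   # assumed modern speed of light, mi/sec


def speed_from_crossing_time(minutes, diameter_miles=2 * MILES_PER_AU):
    """Speed implied by crossing `diameter_miles` in `minutes` minutes."""
    return diameter_miles / (minutes * 60.0)


roemer_speed = speed_from_crossing_time(22)
shortfall = 1.0 - roemer_speed / MODERN_C_MI_PER_S
print(f"{roemer_speed:,.0f} mi/sec, {shortfall:.0%} below the modern value")
```

With these assumed constants the result falls close to 140,000 mi/sec, roughly a quarter below the modern value.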
Roemer's assertion was accepted promptly by Huygens and Newton, and many of
his colleagues were quick to rectify the error in his calculations. Thus Newton, in the
first edition of his Opticks (1704), introduces the proposition that 39
Light is propagated from luminous Bodies in time, and spends about seven or eight minutes of an Hour in passing from the Sun to the Earth.

adding that this effect was first observed by Roemer. However, no such acceptance was
found among the Cartesians, and such was the influence of Descartes' ideas that the
Continent remained unconvinced until the brilliant confirming experiments of Bradley
a half century later.
Bradley (1693-1762) was born in Gloucestershire and educated at Oxford. His
interest in astronomy was aroused early by an uncle whose home contained an excellent
amateur observatory, and he became an acute observer through having engaged in a
regular series of observations extending from boyhood. He was elected a member of the
Royal Society in 1718 and three years later was appointed Savilian Professor of
Astronomy at Oxford. He succeeded Halley as Astronomer Royal in 1742 and devoted
the remainder of his life to the Greenwich observatory.
In addition to the discovery of stellar aberration, to be discussed below, Bradley's
minute observations led him to the detection of the nutation of the earth's axis. In an
action so characteristic of his painstaking nature, Bradley refrained from announcing
the discovery of nutation until February 1748, after he had assured himself of its
certainty by careful measurements extending over an entire revolution (18.6 years).
Bradley's discovery and interpretation of the phenomenon of stellar aberration came

† His principal source of error was an oversight. Roemer had used eclipse data from the years 1671-1673 to predict the retardation time, because he had at his disposal many observations from that
period, and also because Jupiter at that time had been making an aphelion passage and thus was at a
nearly constant distance from the sun. However, in 1676 Jupiter was no longer in such a position, and
Roemer failed to account for its changed distance from the sun between eclipses, thus obtaining an
incorrect value for the change in the distance between Earth and Jupiter.
38 O. Roemer, "Demonstration Concerning the Movement of Light," J des Scavans, 233-236; December 7, 1676. (Reprinted in Phil Trans Roy Soc (London), 12, 893-894; June 25, 1677.)
39 I. Newton, Opticks, 4th ed., Book 2, Part 3, Proposition 11, William Innys, Publisher, London, 1730.
(Reprinted by Whittlesey House, McGraw-Hill Book Company, New York, 1931.)


as the result of an effort to detect stellar parallax, which he began in 1725. The absence
of any measurable parallax had long been a stumbling block for adherents of the
Copernican system. Tycho (1546-1601) had recognized earlier that, when viewed from
opposite sides of the earth's orbit, stars should show a displacement in direction, but his
careful observations convinced him that no such displacement so great as one minute of
arc existed. Later observers also had sought this effect in vain, and stellar parallax had
become one of the outstanding problems in astronomy.
Working with improved instruments, Bradley attacked this problem by systematically recording the position of γ Draconis, a bright star in the constellation Draco, at
various times during the year. As shown in Figure 1.3a, what he was seeking was a
difference in the angles $\alpha$ and $\beta$, which certainly should be evident if $r_1$ and $r_2$ were not
too much greater than the diameter of the earth's orbit. It is obvious from the figure
that this parallax effect should be greatest for stars near the ecliptic pole,† and thus
γ Draconis was an ideal choice. The plane containing the ecliptic axis and γ Draconis
cuts the earth's orbit in points the earth occupies in June and December. Thus Bradley
expected to find γ Draconis making its smallest angle to the ecliptic plane in December
and its greatest angle in June. To his surprise, he found that γ Draconis lies closest to
the ecliptic in March and is most elevated in September, the difference in these angles
being about 40 sec of arc.
Bradley checked his findings by observing other stars over a three-year period, always
with similar results. Finally satisfied that the effect was real, he reported 40 his observations in 1728. After carefully eliminating other possible explanations for the effect, he
said
At last I conjectured, that all the Phenomena hitherto mentioned, proceeded from the progressive Motion of Light and the Earth's annual Motion in its Orbit. For I perceived, that,
if Light was propagated in Time, the apparent Place of a fixt Object would not be the same
when the Eye is at Rest, as when it is moving in any other Direction, than that of the Line
passing through the Eye and Object; and that, when the Eye is moving in different Directions, the apparent Place of the Object would be different.

Bradley then proceeded to explain the apparent shift in position of the stars under this
hypothesis. His reasoning can be understood with reference to Figure 1.3b, in which
Cartesian axes have been chosen fixed in the sun, with the Z axis pointing toward the
ecliptic pole and γ Draconis in the XZ plane, close to the Z axis. In March the orbital
velocity of the earth is toward γ Draconis, whereas in September it is away from γ
Draconis. Neglecting the diurnal rotational motion of the earth (which is only about
1 percent of the orbital motion), Bradley reasoned in effect that in March the velocity
components of the light entering his telescope from γ Draconis were $(c_x + v, 0, c_z)$,
whereas in September they were $(c_x - v, 0, c_z)$, with $v$ the orbital speed and $c_x$, $c_z$ the
velocity components of the light relative to the sun. Thus in March he needed to point
his telescope at an angle $\alpha$ above the ecliptic plane given by $\tan\alpha = c_z/(c_x + v)$, and in
September he needed to point his telescope at a slightly higher angle $\beta$ above the

† The earth's orbit lies in the plane of the ecliptic, and the ecliptic pole is the axis perpendicular to this
plane and piercing it at the center of the earth's orbit.
40 J. Bradley, "An Account of a New Discovered Motion of the Fix'd Stars," Phil Trans Roy Soc
(London), 35, 637-660; December 1728.

ecliptic given by $\tan\beta = c_z/(c_x - v)$. Since $\beta - \alpha$ is small, and $c^2 = c_x^2 + c_z^2$,

$$\frac{\tan\beta - \tan\alpha}{1 + \tan\beta\,\tan\alpha} = \frac{2vc_z}{c^2 - v^2} = \tan(\beta - \alpha) \approx \beta - \alpha$$

from which, because $v \ll c$, it follows that

$$\beta - \alpha \approx 2\,\frac{v}{c}\,\frac{c_z}{c} \qquad (1.1)$$

FIGURE 1.3 Stellar aberration. [Figure not reproduced. Panel (a) shows the sun, the earth in the ecliptic plane at its June and December positions, the ecliptic pole, and the directions to γ Draconis; panel (b) shows the March and September positions and the direction to γ Draconis.]


Upon inserting measured values for $\alpha$, $\beta$, and $v$ into Equation (1.1), Bradley was able to
deduce a value for the velocity of light $c$, since he knew the direction cosine $c_z/c$. In
his own words,
. . . the Velocity of Light [is] to the Velocity of the Eye (which in this Case may be supposed
the same as the Velocity of the Earth's annual Motion in its Orbit) as 10,210 to One, from
whence it would follow, that Light moves, or is propagated as far as from the Sun to the
Earth in 8'12".
It is well known, that Mr. Romer, who first attempted to account for an apparent Inequality in the Times of the Eclipses of Jupiter's Satellites, by the Hypothesis of the progressive
Motion of Light, supposed that it spent about 11 Minutes of Time in its Passage from the
Sun to us: but it hath since been concluded by others from the like Eclipses, that it is propagated as far in about 7 Minutes. The Velocity of Light therefore deduced from the foregoing
Hypothesis, is as it were a Mean betwixt what had at different times been determined
from the Eclipses of Jupiter's Satellites.

Bradley's value for the time of passage of light from the sun to the earth translates into a light velocity of 189,000 mi/sec, a value in close agreement with modern
measurements.
Bradley termed this effect which shifts the apparent position of a star aberration.
When his findings became widely known, all sensible objection to the view that the
velocity of light is great, but finite, ceased to exist.
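
Bradley's numbers can be cross-checked with a short calculation. The sketch below assumes a circular orbit, a year of 365.25 days, and a direction cosine $c_z/c$ close to unity (γ Draconis lying near the ecliptic pole); none of these values is quoted in the text, and the function names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400.0   # assumed length of the year, seconds


def sun_to_earth_light_time(c_over_v):
    """Time for light to travel one orbital radius, given the ratio c/v.

    For a circular orbit of radius r, v = 2*pi*r / year, so
    r / c = year / (2*pi * (c/v)).
    """
    return SECONDS_PER_YEAR / (2.0 * math.pi * c_over_v)


def c_over_v_from_shift(shift_arcsec, direction_cosine=1.0):
    """Invert Equation (1.1): beta - alpha = 2 (v/c)(c_z/c)."""
    shift_rad = math.radians(shift_arcsec / 3600.0)
    return 2.0 * direction_cosine / shift_rad


t = sun_to_earth_light_time(10_210)       # Bradley's quoted ratio
minutes, seconds = divmod(t, 60.0)
print(f"{int(minutes)} min {seconds:.0f} s")
print(f"c/v for a 40-arcsec shift: {c_over_v_from_shift(40.0):,.0f}")
```

Under these assumptions the ratio 10,210 gives a transit time of about 492 seconds, in line with Bradley's 8'12", and a total shift of 40 seconds of arc gives a ratio of the same order as his.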
The first attempt to measure the velocity of light using a purely terrestrial method
was made by Fizeau in 1849. He employed a large toothed wheel as a light chopper
and selective receiver, sending light pulses to a remote mirror at a known distance.
Upon their return, the pulses would be unable to get past a tooth which had moved
over to replace a space, if the rotational speed of the wheel were a critical value; this
fact was used to deduce the time taken for a pulse to travel from the wheel to the
distant mirror and back, from which the velocity of light followed immediately.
A lifelong resident of Paris, Fizeau (1819-1896) devoted his long and productive
career to scientific research. With Foucault, he conducted an extensive series of experiments on interference of both light rays and heat rays. He explained the Doppler effect,
made valuable discoveries related to the polarization of light, and applied the principle
of light interference to the measurement of the dilatation of crystals. He is best remembered for determinations of the velocity of light in air and in moving water. The latter
determination played a significant role in the development of the special theory of
relativity and will be discussed in Chapter 2. Fizeau's determination of the velocity
of light in air was accomplished earlier, in 1849, with an apparatus which is suggested
in simplified form by Figure 1.4.
In this experiment, light from a source S was focused at f by means of the lens $L_1$
and the half-silvered mirror P. The principal focus of the lens $L_2$ was made to coincide
with f so that a parallel beam of light emerged from the apparatus and traveled to a
distant station consisting of the lens $L_3$ and the spherical mirror M. This beam was
focused by $L_3$ on M, whose center of curvature was chosen to lie in $L_3$. Thus the reflected
beam emerged from $L_3$ in a parallel pencil and was brought to a focus at f, from whence
it diverged to fall upon the half-silvered mirror P and be partially transmitted to the
eyepiece V.

FIGURE 1.4 Fizeau's apparatus.

When a toothed wheel W was inserted in the light path at f, an image of the source S
could be seen at V unless f were blocked by the presence of a tooth. Fizeau used a
wheel with 720 teeth separated by spaces congruent to the teeth, and connected the
wheel to a clockwork driven by weights, thus using the wheel to pulse the light. With
the wheel rotating very slowly, the image of S would appear and disappear successively
as the spaces and teeth passed before f. However, if the speed were increased to the point
that several teeth per second passed f, the persistence of vision would render a permanent image at half the intensity which had been seen with the wheel at rest and two
teeth straddling f.
When the speed of the toothed wheel was increased further, because of the finite
velocity of light, a sensible part of the light transmitted through a space toward M
would, upon returning, fall upon the adjacent tooth and be intercepted, thus decreasing
the intensity of the image. If the rotational speed became great enough so that, when
the light returned, the tooth had just moved into the position previously occupied
by the space, then all the returning light was intercepted and the image at V was
totally extinguished.
What occurred, therefore, was that at first a bright image was observed, which
faded away as the rotational speed increased to a value just sufficient to replace a space
by a tooth in the time T it took light to travel from f to M and back. When the rotational
speed was increased further, the image returned, increasing in brightness until a maximum was reached corresponding to one space replacing another in time T. Having
thus reached a maximum, the image would fade away again, and so on in succession
for higher and higher speeds.
From his knowledge of the wheel geometry and a measurement of the rotational


speed during image eclipse, Fizeau was able to deduce T and thus the velocity of light,
since he knew the distance from f to M. In reporting this experiment, 41 he said
. . . the result turned out very well, and one was able to observe, depending on whether
the speed of rotation was more or less, a bright point of light or a total eclipse. Under the
conditions in which the experiment was performed, the first eclipse occurred for 12.6 rotations per second. For double that speed, a new bright point; for triple, a second eclipse . . .
and so forth.
The first station was placed in the belvedere of a house situated at Suresnes, the second
on the top of Montmartre, at a distance of approximately 8633 meters . . . .
These first attempts furnished a value for the velocity of light which differs but little from
that which has been obtained by astronomers. The mean deduced from twenty-eight observations made so far give for its value 70,948 leagues† . . . .
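
The figures quoted by Fizeau are enough to retrace his result. The following is a minimal sketch, using only the 720 teeth, the first eclipse at 12.6 rotations per second, and the 8633-meter separation given above; the function name is hypothetical.

```python
def fizeau_speed(distance_m, teeth, eclipse_rev_per_s):
    """Speed of light from the first eclipse of a toothed-wheel experiment.

    The wheel has `teeth` teeth and as many equal spaces, so a tooth
    replaces a space after 1/(2*teeth) of a revolution. At the first
    eclipse that interval equals the round-trip time T of the light.
    """
    round_trip_time = 1.0 / (2 * teeth * eclipse_rev_per_s)
    return 2.0 * distance_m / round_trip_time


c = fizeau_speed(distance_m=8633.0, teeth=720, eclipse_rev_per_s=12.6)
print(f"{c:.3e} m/sec")
```

The value obtained this way is about 3.13 × 10^8 m/sec.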

Fizeau's technique was limited in its accuracy because it was difficult to judge
just when the image had reached maximum or minimum intensity. Foucault devised
a modification of the apparatus which overcame this limitation by replacing the toothed
wheel with a rotating mirror. This mirror caused a measurable displacement of the
image, thus providing a determination of the velocity of light. In 1850 Foucault used
this apparatus to measure the relative velocities of light in air and water, and in 1862
he used an improved version to make an absolute determination of the velocity of light
in air.
Foucault (1819-1868) was also a Parisian, the son of a publisher. He originally
studied for a medical career but then abandoned it for physical science. With Fizeau
he carried on a series of investigations on the intensity of the light of the sun, as well
as the above-mentioned interference experiments. He established that the velocity of
light is inversely proportional to the refractive index of the medium, thus contributing
to the overthrow of the corpuscular theory. In 1851 he demonstrated the diurnal
motion of the earth via what has come to be known as the Foucault pendulum, and in
1852 he invented the gyroscope; for these two achievements he received the Copley
medal in 1855.
The 1862 determination of the velocity of light was achieved with the apparatus
shown in Figure 1.5. Foucault let solar light, transmitted from a rectangular aperture S,
pass through a half-silvered mirror P and fall upon the achromatic lens L. The light
then proceeded to a rotatable plane mirror R, which was initially fixed at the proper
angular position to bring the rays to a focus at the point M. A concave mirror fixed at
M, with a radius of curvature equal to RM, then reflected the light along a return
path such that half of the light came to a focus at A, to be viewed by a micrometer
eyepiece. A fine grating was stretched over the slit at S, so that the image at A was
crossed by dark lines, above which a cross-hair of the eyepiece could be positioned
accurately.
When the mirror R was rotated, it acted as a light chopper, in that only when R

† The league is an itinerary measure of distance which varies from country to country but is usually
estimated at about 3 mi. Fizeau used it in a precise sense such that his result was equivalent to a light
velocity of $3.13 \times 10^8$ m/sec or 194,000 mi/sec.
41 A. H. Fizeau, "On an Experiment Relative to the Speed of Propagation of Light," Compt Rend, 29,
90-92; July 1849.

FIGURE 1.5 Foucault's apparatus.

was in the proper angular position to deliver light to M would an image be seen at the
eyepiece. However, during the time T light takes to travel from R to M and back, the
mirror would rotate an additional angular amount $\alpha = \omega T$ in which $\omega$ was the angular
velocity of the mirror. This caused the reflected beam to be deflected an angle $2\alpha$, thus
shifting the image from A to A'. By measuring the displacement AA' and the rotational speed $\omega$, since he knew the relative positions of the components of his apparatus,
Foucault was able to determine T and thus the velocity of light.
Foucault placed the mirrors R and M an equivalent distance of 20 m apart through
the use of multiple reflections, and turned the mirror R at speeds up to 1,000 revolutions per second, obtaining image displacements in the order of 1 mm. Of his results
he said 42
Definitively, the velocity of light has been found to be noticeably diminished. Earlier data
had indicated that the velocity was 308 millions of meters per second, and this new experiment with the turning mirror gives a value, in round numbers, of 298 millions.
One is able, it seems to me, to count on the exactness of this number, in the sense that the
corrections it would have to suffer should not change its value more than 500,000 meters.
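
The rotating-mirror relation (mirror rotation $\omega T$, beam deflection twice that) can be sketched numerically. The example below uses the 20 m path, the upper rotation rate of 1,000 revolutions per second, and Foucault's round figure of 298 million m/sec as illustrative inputs; the function names are hypothetical, and the lever arm that converts the angle into the displacement AA' is not given in the text, so only angles are computed.

```python
import math


def rotation_during_round_trip(rev_per_s, path_m, c_m_per_s):
    """Angle omega*T turned by mirror R while light travels R -> M -> R."""
    omega = 2.0 * math.pi * rev_per_s            # angular velocity, rad/sec
    round_trip_time = 2.0 * path_m / c_m_per_s   # T
    return omega * round_trip_time


def speed_from_deflection(rev_per_s, path_m, beam_deflection_rad):
    """Recover c from a measured beam deflection of 2*omega*T."""
    omega = 2.0 * math.pi * rev_per_s
    round_trip_time = beam_deflection_rad / (2.0 * omega)
    return 2.0 * path_m / round_trip_time


alpha = rotation_during_round_trip(1000.0, 20.0, 2.98e8)
c_back = speed_from_deflection(1000.0, 20.0, 2.0 * alpha)
print(f"mirror rotation = {alpha:.3e} rad, beam deflection = {2 * alpha:.3e} rad")
print(f"recovered c = {c_back:.3e} m/sec")
```

The deflection is of the order of a milliradian even at the highest mirror speed, which suggests why the image displacement was so small.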

Despite the confidence expressed by Foucault in this determination, his apparatus


also suffered from a serious limitation. The distance RM could not be increased significantly without diminishing the intensity of the image at A', since the intensity
of the light reflected from M was attenuated as $(RM)^{-2}$ before returning to R. But with
RM at 20 m and extremely high speeds for the rotating mirror, the displacement AA'
was still small enough to be subject to considerable error.
Michelson eliminated this drawback by placing the lens L between R and M so that
S lay at its principal focus, thus providing a parallel beam to travel to M. The mirror M
could then be made plane and placed at a much larger distance from R, thus enhancing
the displacement AA'; indeed, Michelson was able to achieve such great image displacements that he eliminated the half-silvered mirror P. His simplified version of the
42 J. B. L. Foucault, "Experimental Determination of the Velocity of Light," Compt Rend, 55, 501-503; September 1862.

FIGURE 1.6 Michelson's apparatus.

apparatus is shown in Figure 1.6. About this apparatus and his measurements, Michelson said 43
In the following experiments the distance between the mirrors was nearly 2000 feet
and the speed of the mirror was about 257 revolutions per second. The deflection exceeded
133 millimeters, being about 200 times as great as that obtained by Foucault. If it were
necessary it could be still further increased. This deflection was measured within three
or four hundredths of a millimeter in each observation; and it is safe to say that the result,
so far as it is affected by this measurement, is correct to within one ten-thousandth part.
The revolving mirror was actuated by a current of air . . . . To regulate and measure
the speed of rotation a tuning fork, bearing on one prong a steel mirror, was employed. This
was kept in vibration by a current of electricity. The fork was so placed that the light from
the revolving mirror was reflected to a piece of plane glass in front of the eye-piece, and
thence reflected to the eye. When fork and mirror are both at rest, an image of the revolving
mirror is perceived. When the fork vibrates, this image is drawn out into a band of light.
When the mirror commences to revolve, this band breaks up into a number of moving
images of the mirror; and when, finally the mirror makes as many turns as the fork makes
vibrations, or any multiple . . . of this number, the images become stationary . . . .
The electric fork made about 128 vibrations per second. No dependence was placed upon
this rate, however, but at each set of observations it was compared with a standard Ut3
fork, the temperature being noted at the time.

Being thus assured of great accuracy in both of the critical measurements (image
displacement and mirror velocity), Michelson listed 200 data points, each of which was
the mean of 10 separate observations, and concluded that the velocity of light in air was
299,740 km/sec, being thus 299,820 km/sec in vacuo. In 1882 he repeated the experiment
and announced a new value for the velocity of light in vacuo, 299,853 km/sec. This was
to remain the accepted figure for forty-five years, and when it was replaced by a more
precise figure, Michelson was once again involved in the determination.
Albert A. Michelson (1852-1931) was born in Poland but emigrated to America
with his parents at the age of two. They settled in the West following the gold rush
and he was raised in a mining town. A rare presidential appointment as midshipman
at the Naval Academy insured his college education and stimulated his interest in
science. Upon graduation he became an instructor at Annapolis and embarked on his
43 A. A. Michelson, "Experimental Determination of the Velocity of Light," Am J Sci, 18, 390-393;
November 1879.


first determination of the velocity of light, described above. There followed a period of
study in Europe during which he invented the interferometer and with it performed
the first ether drift experiment. Upon returning to the United States, he teamed with
Professor Morley to improve the interferometer and repeat this celebrated experiment
which has so influenced the subject of relativity. They also collaborated in a precise
repetition of Fizeau's moving-water experiment and in the establishment of the wavelength of sodium light as a standard of length.
Michelson's ingenuity at optical instrumentation also led to the development of an
echelon spectroscope, to a determination of the rigidity of the earth, and to measurements of the distances and diameters of giant stars. In recognition of his many contributions to physics, he was awarded the Nobel prize in 1907, the first American
scientist so honored.
In 1923 Michelson was asked to go to Pasadena to make another determination of
the speed of light, and this he accomplished with the apparatus shown in Figure 1.7.

FIGURE 1.7 Michelson's improved apparatus. [From Michelson and the Speed of Light by Bernard Jaffe (Science Study Series). Copyright 1960 by Educational Services Incorporated. Reprinted by permission of Doubleday & Company, Inc.] Labeled components: arc light source, slit, lenses, prism, rotating octagonal prism on Mt. Wilson, mirror on Mt. Wilson, observer, and fixed mirror on Mt. San Antonio.

The principle of operation was still the same, although many refinements of the original
apparatus are evident. An eight-sided rotating prism of nickel-steel, with its mirror
surfaces polished true to one part in a million, was used in place of the single rotating
mirror. Once again, an air blast was used to actuate the mirror system, and a tuning-fork stroboscope to measure its rotational speed. The two stations were considerably
farther apart, being placed on Mt. Wilson and Mt. San Antonio. The United States
Coast and Geodetic Survey established the distance between these stations within a
fraction of an inch in 22 miles. The intensity of the image was enhanced by using large
parabolic mirrors at both stations. Many observations yielded a mean value for the
velocity of light of 299,798 km/sec.
But Michelson was not yet through. He wanted to measure the velocity of light in as
near perfect a vacuum as possible, free from the obstruction of haze or smoke. A mile-long tube of corrugated steel was constructed and evacuated down to a pressure of
½ mm, with a version of the apparatus of Figure 1.7 enclosed. Unfortunately, Michelson did not live to see the end of this experiment, succumbing two years before its


completion. His colleagues made almost 3,000 independent observations, reporting 44 a
mean figure for the velocity of light in vacuum to be 299,774 km/sec.
The value 299,792.5 km/sec in vacuo has been adopted as the velocity of light by the
International Union of Geodesy and Geophysics and by the International Scientific
Radio Union. This fundamental constant is within the limits of error of Michelson's
final figure.

1.3

SOUND WAVES AND LIGHT WAVES

The previous two sections have indicated that light as a wave phenomenon has characteristics common to those of all other types of waves. These include a wavelength, a
frequency, and their product the wave velocity, as well as a variety of interference
effects. However, light has one characteristic which makes it unique: it can propagate
in the absence of a tangible medium. This feature will prove to be of fundamental
significance.
It is instructive to contrast the properties of light with those of other wave phenomena. A comparison of the behavior of sound waves and light waves in air is a good
illustrative example, because the air can be permitted to become increasingly rarefied,
approaching in the limit the absence of a tangible medium.
The Acoustic Wave Equation.
Sound waves in air consist of longitudinal molecular
vibrations, resulting in alternate compression and rarefaction of the air. If one considers the case in which sound is propagating in the positive X direction, the molecules
which (on the average) lie in a plane x = constant will (on the average) oscillate in the
X direction. As seen in Figure 1.8, their instantaneous average position will be $x + \xi(x,t)$
in which $\xi(x,t)$ is the time-varying displacement around the average position $x$. Similarly,
the average position of molecules at an adjacent cross section will be $x + dx + \xi(x + dx, t)$.
For unit transverse area, the instantaneous volume between these two planes of molecules is

$$[x + dx + \xi(x + dx, t)] - [x + \xi(x,t)] = \left(1 + \frac{\partial\xi}{\partial x}\right) dx \qquad (1.2)$$

and thus the fractional change in volume is $\partial\xi/\partial x$. Since the average number of molecules in this volume is a constant, it follows that the density is fluctuating. If the
instantaneous density is designated by $\rho_0 + \rho_1(x,t)$, then

$$[\rho_0 + \rho_1(x,t)]\left(1 + \frac{\partial\xi}{\partial x}\right) dx = \text{constant} = \rho_0\,dx \qquad (1.3)$$

When it is assumed that the density fluctuation $\rho_1(x,t)$ is small compared to the average
value $\rho_0$ and that the fractional change in volume $\partial\xi/\partial x$ is small compared to unity,
Equation (1.3) yields the first-order result

$$\rho_1(x,t) = -\rho_0\,\frac{\partial\xi}{\partial x} \qquad (1.4)$$
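
The quality of this first-order approximation can be illustrated numerically. In the sketch below the symbol `strain` stands for the fractional change in volume $\partial\xi/\partial x$, the air density is an assumed round value, and the function names are hypothetical.

```python
def exact_density_change(rho0, strain):
    """Exact rho_1 from Equation (1.3): (rho0 + rho1) * (1 + strain) = rho0."""
    return rho0 / (1.0 + strain) - rho0


def first_order_density_change(rho0, strain):
    """First-order rho_1 from Equation (1.4): rho1 = -rho0 * strain."""
    return -rho0 * strain


rho0 = 1.2  # assumed air density, kg/m^3
for strain in (1e-2, 1e-4, 1e-6):
    exact = exact_density_change(rho0, strain)
    approx = first_order_density_change(rho0, strain)
    relative_error = abs(exact - approx) / abs(exact)
    print(f"strain={strain:g}  exact={exact:.6e}  "
          f"first order={approx:.6e}  relative error={relative_error:.2e}")
```

The relative error of the linearized result is of the same order as the strain itself, so the approximation improves as the fluctuations become smaller.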

44 A. A. Michelson, F. G. Pease, and F. Pearson, "Measurement of the Velocity of Light in a Partial
Vacuum," Astrophys J, 82, 26-61; July 1935.

FIGURE 1.8 Average behavior of layers of air molecules in presence of sound waves. [Figure not reproduced. It shows a layer of molecules in its average position $x$ and in its instantaneous displaced position $x + \xi(x,t)$, together with the adjacent layer in its average position $x + dx$ and in its instantaneous displaced position $x + dx + \xi(x + dx, t)$.]

The fluctuations in density of the air as the sound waves pass through are so rapid
that the air does not transfer heat. The compressions and rarefactions are thus adiabatic, and the process conforms to the gas law equation

$$pV^{\gamma} = \text{constant} \qquad (1.5)$$

in which $p$ is the pressure, $V$ the volume, and $\gamma$ is the ratio of specific heats at constant
pressure and constant volume.


Since it has been observed that the volume occupied by a fixed number of molecules
is fluctuating, it follows from (1.5) that the total pressure is varying also. Thus one
may write

$$p = p_0 + p_1(x,t) \qquad (1.6)$$

in which $p_1(x,t)$ is the small fluctuation around the relatively large constant average
pressure $p_0$.
Taking the total differential of (1.5) and then dividing by (1.5) itself, one obtains

$$\frac{dp}{p} = -\gamma\,\frac{dV}{V}$$

which yields the first-order result

$$\frac{p_1(x,t)}{p_0} = -\gamma\,\frac{\partial\xi}{\partial x} \qquad (1.7)$$

because it has been noted, in connection with Equation (1.2), that $\partial\xi/\partial x$ is the fractional change in volume.
Newton's force law can be applied to the segment of air between the two adjacent
cross sections. The net force per unit transverse area acting on the molecules is
$-[p_1(x + dx, t) - p_1(x,t)]$. Since to first order the mass is $\rho_0\,dx$, one may write

$$-\frac{\partial p_1}{\partial x} = \rho_0\,\frac{\partial^2\xi}{\partial t^2} \qquad (1.8)$$

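Combination of (1.8) with the spatial derivative of (1.7) yields the wave equation

$$\frac{\partial^2\xi}{\partial x^2} = \frac{1}{c_s^2}\,\frac{\partial^2\xi}{\partial t^2} \qquad (1.9)$$

in which

$$c_s = \left(\frac{\gamma p_0}{\rho_0}\right)^{1/2} \qquad (1.10)$$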

The reader will have little difficulty convincing himself that the general solution of
(1.9) is

$$\xi(x,t) = f(x - c_s t) + g(x + c_s t) \tag{1.11}$$

in which $f$ and $g$ are arbitrary functions. At a time $t_1$ the spatial distribution of $f$ is
$f(x - c_s t_1)$, as illustrated in Figure 1.9. At a later time $t_2$ it is

$$f(x - c_s t_2) = f(\{x - c_s(t_2 - t_1)\} - c_s t_1)$$

FIGURE 1.9 Traveling sound waves.

and is therefore the same spatial distribution as earlier, but shifted along the X axis a
distance $c_s(t_2 - t_1)$. For this reason $f(x - c_s t)$ represents a wave of arbitrary but constant spatial shape, traveling in the +X direction at speed $c_s$. Similarly, $g(x + c_s t)$
represents an arbitrary wave traveling in the -X direction at speed $c_s$. The speed of
these waves is seen, from Equation (1.10), to depend on the conditions of the medium,
namely, the pressure and density of the air. If the air is sufficiently well approximated
by the ideal gas law†

$$pV = nRT$$

in which $n$ is the number of moles, then

$$c_s = \left(\gamma \frac{nRT}{\rho_0 V}\right)^{1/2} \sim T^{1/2} \tag{1.12}$$

since $n/\rho_0 V$ is a constant. Therefore this first-order theory yields the result that the
propagation velocity of sound waves in air depends only on the temperature of the air.
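The temperature dependence stated above can be checked numerically. The following is a minimal sketch, assuming dry air with $\gamma = 1.4$ and a molar mass of about 0.029 kg/mol; neither value comes from the text, and the function name `sound_speed` is purely illustrative. It evaluates $c_s = (\gamma R T / M)^{1/2}$, which is the ideal-gas form of the sound speed with $M = \rho_0 V / n$ the mass per mole.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J/(mol K)


def sound_speed(temperature_k, gamma=1.4, molar_mass=0.0289647):
    """Ideal-gas sound speed c_s = sqrt(gamma * R * T / M).

    gamma and molar_mass are assumed values for dry air (not from the text).
    M = rho_0 * V / n is the mass per mole, so n / (rho_0 * V) = 1 / M.
    """
    return math.sqrt(gamma * R_GAS * temperature_k / molar_mass)


if __name__ == "__main__":
    # Print the speed at a few temperatures to show the T**0.5 scaling.
    for t in (273.15, 293.15, 313.15):
        print(f"T = {t:7.2f} K  ->  c_s = {sound_speed(t):6.1f} m/s")
```

Quadrupling the absolute temperature doubles the computed speed, which is the $T^{1/2}$ behavior; pressure and density do not enter separately.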
Propagation Independent of Source. A significant feature of Equation (1.10) is
its suggestion that $c_s$ is independent of the motion of the source of the sound waves and
is governed solely by the properties of the medium. This suggestion is confirmed by
experiment and is reasonable when one considers that only the air molecules in the
proximity of the source make contact with it, all others depending for their excitation
on somewhat-ordered collisions with their neighbors.
The fact that sound waves have a velocity controlled only by the medium and independent of the motion of the source can be used to explain the Doppler effect. This
effect is familiar through the common example of an approaching locomotive. As shown
in Figure 1.10, at an instant when the diaphragm of the locomotive's horn is in its most
forward position, the air adjacent to the diaphragm suffers a compression, and this
compression travels forward at a velocity $c_s$. If $\tau$ is the period of oscillation of the
diaphragm, then $\tau$ seconds later the next compression of air is about to be launched
from the horn. At this moment, the earlier compression is a distance $\lambda = (c_s - v)\tau$ in
front of the horn, with $v$ the speed of the locomotive. $\lambda$ is the separation between points
in the wave train representing positions of successive maximum compression and is
thus the wavelength. The frequency of the sound wave is therefore

$$\nu = \frac{c_s}{\lambda} = \frac{c_s}{c_s - v}\,\nu_0 \tag{1.13}$$

in which $\nu_0 = 1/\tau$ is the frequency the sound wave would have if the locomotive were
at rest ($\nu_0$ is also the frequency of oscillation of the diaphragm). Equation (1.13) has
been amply confirmed by experiment.
Thus the motion of the source of a sound wave affects both its frequency and wavelength but in such a way that their product remains constant at the value $c_s$ given by
(1.10).
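The locomotive argument can be turned into a short calculation. The sketch below is illustrative only: the function name `doppler_shift` and the default sound speed of 343 m/s are assumptions, not values from the text. It follows the same two steps as the derivation, first the shortened wavelength $(c_s - v)\tau$ and then the frequency $c_s/\lambda$.

```python
def doppler_shift(nu0, v_source, c_s=343.0):
    """Return (frequency, wavelength) heard ahead of an approaching source.

    Follows the text's steps: lambda = (c_s - v) * tau, then nu = c_s / lambda.
    c_s = 343 m/s is an assumed room-temperature value, not from the text.
    """
    if not 0 <= v_source < c_s:
        raise ValueError("this sketch assumes a subsonic approaching source")
    tau = 1.0 / nu0                      # period of the diaphragm
    wavelength = (c_s - v_source) * tau  # spacing of successive compressions
    frequency = c_s / wavelength
    return frequency, wavelength


if __name__ == "__main__":
    # Hypothetical horn at 440 Hz on a locomotive moving at 10% of c_s.
    nu, lam = doppler_shift(440.0, 34.3)
    print(f"frequency = {nu:.1f} Hz, wavelength = {lam:.4f} m, product = {nu * lam:.1f} m/s")
```

Whatever source speed is chosen, the product of the returned frequency and wavelength stays equal to `c_s`, which is the point made in the closing sentence above.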
Acoustic Power. The rate at which energy is being transmitted by the sound wave,
per unit transverse area of the wavefront, is called the intensity, and will be denoted
by $\mathcal{T}$. Consider a column of air of unit cross section, extending to infinity from the layer
of molecules whose average position is $x$. The net force on this column is $p_1(x,t)$ and

† This approximation becomes better as the air is rarefied.

FIGURE 1.10 The Doppler effect in sound waves. [Figure labels: horn moving at velocity $v$; sound disturbance moving at velocity $c_s$; diaphragm in forward position; diaphragm in forward position one period $\tau$ later; distances $v\tau$ and $(c_s - v)\tau$.]

during a time interval $dt$ the column is compressed an amount $(\partial \xi/\partial t)\,dt$ so that the
work done on the column during this interval is $p_1(\partial \xi/\partial t)\,dt$. With the aid of (1.7), the
rate of energy flow into the column can thus be written

$$\mathcal{T} = p_1 \frac{\partial \xi}{\partial t} = -\gamma p_0 \frac{\partial \xi}{\partial x} \frac{\partial \xi}{\partial t} \tag{1.14}$$

For a simple harmonic wave traveling in the positive X direction one can write

$$\xi = A \cos \frac{2\pi}{\lambda}(x - c_s t) \tag{1.15}$$

which is a special case of (1.11). In this equation, $A$ is a constant (the amplitude of
molecule oscillation) and $\lambda$ is the wavelength of the sound disturbance. Since $c_s = \lambda\nu$,
introducing the wave number $k = 2\pi/\lambda$ and the angular frequency $\omega = 2\pi\nu$ enables
one to rewrite Equation (1.15) in the form

$$\xi = A \cos(\omega t - kx) \tag{1.16}$$

Substitution of (1.16) into (1.14) gives

$$\mathcal{T} = \rho_0 c_s \omega^2 A^2 \sin^2(\omega t - kx) \tag{1.17}$$

At any cross section the time average flow is therefore

$$\bar{\mathcal{T}} = \tfrac{1}{2} \rho_0 c_s \omega^2 A^2 \tag{1.18}$$
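The step from the instantaneous flow to its time average is simply the statement that $\sin^2$ averages to one half over a period. A small numerical sketch can confirm this; the air density, tone frequency, and oscillation amplitude used here are assumed illustrative values, and the function names are hypothetical. It samples $\rho_0 c_s \omega^2 A^2 \sin^2(\omega t - kx)$ over one period and compares the mean with $\tfrac{1}{2}\rho_0 c_s \omega^2 A^2$.

```python
import math


def instantaneous_intensity(t, x, rho0, c_s, omega, amplitude):
    """rho0 * c_s * omega^2 * A^2 * sin^2(omega t - k x) for a harmonic wave."""
    k = omega / c_s  # wave number, from c_s = lambda * nu
    return rho0 * c_s * omega**2 * amplitude**2 * math.sin(omega * t - k * x) ** 2


def time_average(rho0, c_s, omega, amplitude, x=0.0, samples=10_000):
    """Average the instantaneous intensity over one period (midpoint sampling)."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for i in range(samples):
        t = (i + 0.5) * period / samples
        total += instantaneous_intensity(t, x, rho0, c_s, omega, amplitude)
    return total / samples


if __name__ == "__main__":
    # Assumed values: air near sea level, a 1 kHz tone, 10 nm molecular amplitude.
    rho0, c_s, omega, amplitude = 1.2, 343.0, 2.0 * math.pi * 1000.0, 1.0e-8
    numeric = time_average(rho0, c_s, omega, amplitude)
    closed_form = 0.5 * rho0 * c_s * omega**2 * amplitude**2
    print(numeric, closed_form)
```

Because the closed form is linear in the density, halving `rho0` at fixed `c_s` halves the result.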


Equation (1.18) reveals that, if the air is increasingly rarefied, the intensity of a
sound wave diminishes. This occurs because the density $\rho_0$ decreases, whereas, if the
temperature remains constant, $c_s$ is unaffected (cf. Equation (1.12)); the amplitude of
molecule oscillation $A$ is limited by the finite amplitude of oscillation of the source. In
the limit, with no molecules to transfer the oscillations to their neighbors, no acoustic
power can be transmitted, and the sound wave ceases to exist.
This discussion can be summarized by saying that sound waves cannot exist without
the presence of a tangible medium, but that they are characterized by a wave velocity
which depends on the properties of the medium but not on the motion of the source.
These remarks are equally true of water waves, elastic waves in solids, etc.
Comparison. Does light share these characteristics? With respect to the requirement of a tangible medium, the answer is no. Light can propagate in gaseous, liquid,
and solid media, but it does not require the presence of these media to exist. Indeed, it
can propagate in the almost complete vacuum which separates the stars from each
other, and many times has been shown to traverse man-made vacua with an intensity
no less than it had when air was present. For example, Michelson's last experiments on
the determination of the speed of light were performed in a huge evacuated tunnel. In
this respect light† as a wave phenomenon is unique in not requiring a tangible medium
for its existence.
Does light share the second characteristic, that is, does it possess a wave velocity
which is independent of the motion of the source? An indication that it does was provided when Maxwell discovered that wavelike solutions to his equations described
electromagnetic fields which would propagate through space at the velocity of light,
leading him to assert that light is an electromagnetic phenomenon. But the equation he
used to obtain these wavelike solutions was similar to (1.9), the wave equation for
sound. Thus just as in the case of acoustic disturbances, Maxwell's analysis suggested
that the velocity of light should be completely independent of its source.
There is also strong experimental evidence to support this view. W. de Sitter⁴⁵ has
analyzed with great care the dynamics of eclipsing binary stars. Were the velocity of
light dependent on the motion of the source, it is apparent that the time for light to
reach the earth from the approaching star of a binary would be different from the time
for light to reach the earth from the receding star. de Sitter deduced that this would
introduce apparent eccentricities in their orbits as they circled each other, but such
eccentricities have never been observed. Some binary stars are at such a distance from
the earth and have sufficiently high orbital velocities that this effect could scarcely
escape observation. Because of this evidence the postulate will be accepted that light,
in common with all other wave phenomena, has a velocity which does not depend on the
motion of the source. (Many successful Doppler radar systems have been built under
this assumption.)
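To see why the effect "could scarcely escape observation," one can estimate the arrival-time difference that a source-dependent light velocity would produce. The following is a rough sketch under stated assumptions: the distance of 100 light-years and orbital speed of 30 km/s are hypothetical, the function name is illustrative, and the model is simply light travelling a distance $L$ at $c + v$ from the approaching star and at $c - v$ from the receding one.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s
LIGHT_YEAR = 9.4607e15   # metres


def arrival_time_spread(distance_m, orbital_speed):
    """Travel-time difference L/(c - v) - L/(c + v) = 2 L v / (c^2 - v^2).

    This models the hypothetical source-dependent case that the
    binary-star argument rules out; it is not an observed quantity.
    """
    return 2.0 * distance_m * orbital_speed / (C_LIGHT**2 - orbital_speed**2)


if __name__ == "__main__":
    # Hypothetical binary: 100 light-years away, 30 km/s orbital speed.
    spread = arrival_time_spread(100 * LIGHT_YEAR, 30.0e3)
    print(f"{spread / 86400.0:.1f} days")
```

For these assumed numbers the spread comes out at several days, comparable to the orbital periods of close binaries, so the apparent orbit would be badly distorted if the velocity of light depended on the source.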
The Ether. It has been noted earlier, in Section 1.1, that light was not really accepted as being wavelike in nature until the middle of the nineteenth century. By that
time many other wave phenomena were well understood. Since these other wave
phenomena all required a medium for transmission, it was natural to believe that light

† The term "light" is used here in the broad sense to include the nonvisible portions of the
electromagnetic spectrum.
45 W. de Sitter, "An Astronomical Argument for the Constancy of the Velocity of Light," Z Phys, 14,
429; May 15, 1913.


did also, even after it was appreciated that light could propagate in a vacuum. Thus an
intangible medium was hypothesized to provide the support for light waves. The ether,
as this medium was called, being intangible, was endowed with extraordinary properties not shared by any other known medium. These included the ability to pass through
all substances without frictional resistance and the property of being mass-less and thus
unaffected by gravitation. Despite the mystical aspects of this hypothesis, most nineteenth-century scientists firmly believed in the existence of the ether and many serious
scientific experiments were undertaken to prove the validity of the ether concept. The
quest for the ether served to sharpen a dilemma concerned with the velocity of light, a
subject which will be explored in Chapter 2.

REFERENCES

1. Cohen, I. B., Roemer and the First Determination of the Velocity of Light, The Burndy Library, Inc., New York, 1944.

2. Drude, P., The Theory of Optics, translation by Mann and Millikan, Longmans, Green and Company, London, 1917.

3. Jaffe, B., Michelson and the Speed of Light, Anchor Books, Doubleday and Company, Inc., New York, 1960.

4. Morse, P. M., Vibration and Sound, 2nd ed., Chap. 6, McGraw-Hill Book Company, New York, 1948.

5. Preston, T., The Theory of Light, 5th ed., Macmillan and Company, Ltd., London, 1928.

6. Reymond, A., History of the Sciences in Greco-Roman Antiquity, Methuen and Company, Ltd., London, 1927.

7. Richtmyer, F. K., E. H. Kennard, and T. Lauritsen, Introduction to Modern Physics, 5th ed., Chaps. 1 and 2, McGraw-Hill Book Company, New York, 1955.

8. Whittaker, E., A History of the Theories of the Aether and Electricity, Thomas Nelson and Sons, Ltd., London, 1951.

9. Williams, H. S., A History of Science, Vol. 1 and 2, Harper and Brothers, New York, 1904.

CHAPTER

The Special Theory of Relativity


RELATIVITY THEORY is usually divided into two categories, the special or "restricted"
theory, and the general theory. The special theory is concerned with phenomena as they
appear to different observers who have a constant velocity relative to each other. The
general theory removes this restriction and considers phenomena as they appear to
different observers who are in arbitrary relative motion. As one would expect, the
general theory is considerably more difficult. Only the special theory will be needed as a
foundation for the electromagnetics to be developed in the remaining chapters of this
text.
The concepts underlying the special theory of relativity are sometimes puzzling on
first consideration because they lead to predictions about space, time, and matter which
are contrary to common experience. However, once these concepts are grasped, and
it is recognized that common experience need not be rejected (because it consists of
phenomena in which relativistic effects are too small to be detected), the path is opened
to an understanding of important new relationships. Fortunately, the mathematical
tools required to comprehend the special theory do not extend beyond algebra and
some elementary calculus, so that in approaching this subject much of one's attention
can be concentrated on the concepts themselves.
It is difficult to appreciate fully the need for the special theory of relativity and its
accomplishments without first recognizing the impasse in physics which it solved.
For this reason an essentially dual chronological presentation of subject matter will be
followed in this chapter. In the first (or classical) chronology, the principle of relativity
is introduced and then its consequences in terms of classical mechanics are considered.
In order to be consistent with this principle, Newton's Law of Inertia is shown to
require the Galilean transformation as the connection between different inertial coordinate systems. This development requires the assumptions that distance intervals
and time intervals are invariants. When the additional assumption is made that mass
is an invariant, Newton's general force law also is seen to transform properly via the
Galilean equations. A by-product of this proof is the familiar classical law of velocity
transformation. Application of this velocity law to the case of sound waves yields a
result in agreement with observation; however, when this law is applied to the velocity of light, such agreement with observation is lacking, thus posing a fundamental
dilemma. This disagreement between classical prediction and observation is discussed
in terms of the Fizeau experiment involving light propagation in moving water, and the
Michelson-Morley ether drift experiment, the null result of which raises questions about
the existence of a light medium.


After various classical explanations of this dilemma are considered and rejected as
unsatisfactory, the second chronology† begins with a reexamination of the fundamental definitions of space and time. Einstein's two postulates of special relativity
are then used as the basis for a resolution of the impasse and a derivation of the Lorentz
transformation. This transformation is seen to be consistent with the principle of
relativity in the case of light velocity and provides a convincing explanation of the
Fizeau experiment. The view of Einstein that the concept of a luminiferous ether is
superfluous automatically explains the null result of ether drift experiments such as
those of Michelson and Morley and the more recent test by Cedarholm and Townes
using masers.
Application of the Lorentz equations to the transformation of the laws of mechanics
is found to have several significant consequences. These include the dependence of
length on motion, time dilatation, variation of mass, and the equivalence of mass and
energy. These effects combine to yield transformation laws for mass and force. The
latter is used in Chapter 4 to derive all the results of magnetostatics via a relativistic
transformation of Coulomb's electric force law.
A second transformation is used in Chapter 5 to derive Maxwell's equations for the
case of a source system consisting of steady currents and charges, as seen by one
observer, with respect to whom a second observer is in constant translational motion.
The second observer detects time-varying electromagnetic fields due to sources of
a restricted class; upon superimposing a set of such fields, one can establish Maxwell's equations for the general case of accelerated sources. In this manner all the
basic relations of electromagnetics are derived by using the special theory of relativity
to enlarge upon the single experimental postulate of Coulomb's law, without the need
to invoke the general theory.

2.1 *

HISTORICAL SURVEY

Bradley's discovery of stellar aberration in 1728 has already been recounted in the
previous chapter. At that time the corpuscular theory of light held sway, and thus
Bradley's explanation of the effect was based on the mechanistic law of addition of
velocities. A century later, when the wave theory of light had been revived successfully
by Young and his followers, the need existed to reexamine all optical phenomena,
including aberration, on a wave basis. The concept of a luminiferous ether through
which light propagates became a natural part of the wave theory, in analogy with all
other known wavelike disturbances, each of which requires a medium for transmission.
Thus it was Young himself who employed the ether concept to explain Bradley's
discovery of aberration in terms of a wave picture. In addressing the Royal Society in
1803 he remarked that¹

† The two chronologies overlap because Einstein's explanation of the dilemma, though offered in 1905,
did not gain universal acceptance immediately; many classical attempts at an explanation were still
to be forthcoming for several decades.
* This section may be omitted without loss in continuity of the technical presentation.
1 T. Young, Miscellaneous Works, edited by George Peacock, Vol. 1, p. 188, John Murray Publishers,
London, 1855.


Upon considering the phenomena of the aberration of the stars, I am disposed to believe
that the luminiferous ether pervades the substance of all material bodies with little or no
resistance, as freely perhaps as the wind passes through a grove of trees.

In this conception the earth glides through the ether, and the light from a distant
star is unaffected by the earth's motion. Thus the light waves, during the interval of
time they traverse the tube of a telescope, suffer a displacement equal to the displacement of the earth through the ether in the same time interval. This displacement can be
compensated for by making a small angular correction in the position of the telescope,
thus accounting for aberration.
The notion of an ether which pervaded the entire universe, being everywhere at
rest in some particular frame of reference, gained favor for the additional reason that it
lent support to the idea of an absolute frame of reference, with respect to which the
absolute position and velocity of all bodies could be specified.
Young enlarged on this ether concept with the suggestion that²

For explaining the phenomena of partial and total reflection, refraction, and inflection,
nothing more is necessary than to suppose all refracting media to retain, by their attraction,
a greater or less quantity of the luminous ether, so as to make its density greater than that
which it possesses in a vacuum, without increasing its elasticity.

Fresnel put this idea into a precise form by postulating that the density of ether in any
body is proportional to the square of its refractive index. The excess of ether density
over that in vacuo was assumed to be dragged along with the body, the remainder
staying at rest as part of a uniform background ether. With this model Fresnel was
able to derive an expression for the velocity of light $v$ in a moving body, namely,

$$v = v_0 + \left(1 - \frac{1}{n^2}\right) u$$

in which $n$ is the refractive index of the body, $u$ is the velocity of the body with respect to the ether and $v_0$ is the velocity
light would have in the body if the body were stationary in the ether; all velocities are
measured with respect to the frame of reference in which the ether is at rest.
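Fresnel's result, commonly written $v = v_0 + (1 - 1/n^2)\,u$ with $n$ the refractive index, is easy to evaluate for a case like moving water. The sketch below is illustrative: the refractive index 1.333 and the flow speed of 7 m/s are assumed values, the function name is hypothetical, and $v_0 = c/n$ is taken from the usual definition of refractive index.

```python
def fresnel_dragged_velocity(v0, u, n):
    """Light velocity in a body moving at u through the ether, per Fresnel.

    v0 is the velocity of light in the stationary body and n its refractive
    index; only the fraction (1 - 1/n**2) of the body's velocity is added.
    """
    drag_coefficient = 1.0 - 1.0 / n**2
    return v0 + drag_coefficient * u


if __name__ == "__main__":
    c = 299_792_458.0  # m/s
    n_water = 1.333    # assumed refractive index of water
    u = 7.0            # hypothetical flow speed, m/s
    v0 = c / n_water   # velocity of light in stationary water
    print(fresnel_dragged_velocity(v0, u, n_water) - v0)
```

For water the coefficient is a little under one half, so the light is carried along at less than half the speed of the flow; for $n = 1$ there is no drag at all.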
With this formula, Fresnel was successful in explaining refraction effects under the
wave theory, for bodies in motion as well as at rest with respect to the ether. His
theory was consistent with Arago's result that the apparent refraction in a moving
prism is equal to the absolute refraction in a stationary prism, and it further predicted
that if observations were made with a water-filled telescope, the aberration would be
unaffected by the presence of the water. This prediction was verified by Airy in 1871.
Fizeau, in a significant experiment, passed light through tubes of moving water, and
used an interference technique to substantiate the above Fresnel formula, this being
done in 1851.
Maxwell, who possessed a physical imagination akin to that of Faraday, firmly
believed in the existence of an ether. In the classic paper³ which introduced his theory
of the electromagnetic field, one finds the passage
2 Ibid., p. 80.
3 J. C. Maxwell, "A Dynamical Theory of the Electromagnetic Field," Phil Trans Roy Soc (London),
155, 450; 1865. (See also J. C. Maxwell Scientific Papers, Vol. 1, pp. 526-597, Dover Publications,
Inc., New York.)


It appears therefore that certain phenomena in electricity and magnetism lead to the
same conclusion as those of optics, namely, that there is an aethereal medium pervading all
bodies, and modified only in degree by their presence; that the parts of this medium are
capable of being set in motion by electric currents and magnets; that this motion is communicated from one part of the medium to another by forces arising from the connexions
of those parts; that under the action of these forces there is a certain yielding depending
on the elasticity of these connexions; and that therefore energy in two different forms may
exist in the medium, the one form being the actual energy of motion of its parts, and the
other being the potential energy stored up in the connexions, in virtue of their elasticity.

Several years earlier Maxwell had devised a mechanical conception of the electromagnetic field and had been led by analogy to the conclusion that electromagnetic
waves are propagated at the velocity of light. He therefore felt that light was an electromagnetic disturbance and made the assertion
We can scarcely avoid the inference that light consists in the transverse undulations of
the same medium which is the cause of electric and magnetic phenomena.

Thus an answer was provided to speculation as to whether or not several ethers existed
for the separate support of light, heat and electricity.
Interest in the detection of this luminiferous ether grew, but it was several decades
before an experiment of sensitivity sufficient to be definitive was performed. In 1881
Michelson invented an interferometer capable of measuring second-order effects in the
assumed velocity of the earth relative to the ether. His technique for determining
ether drift was analogous to the detection of a river current through comparison of
the round trip times of rowers who follow courses parallel to and perpendicular to the
flow. This first experiment gave a null result but the sensitivity was marginal, so the
apparatus was improved and the experiment repeated by Michelson and Morley in
1887. A null result was again obtained; it was as though the earth were at rest in the
ether.
This experiment caught the attention of the Dutch physicist H. A. Lorentz (1853-1928), who became convinced that the null result was a real effect and sought a reason
to explain it. In 1892 he hypothesized that a material body suffers a contraction in its
longitudinal dimension, due to its motion through the ether, just sufficient to prevent
the ether's detection with the Michelson interferometer. This same explanation had
been put forth verbally by G. F. FitzGerald (1851-1901) several years earlier and is
often referred to as the Lorentz-FitzGerald contraction hypothesis.
Lorentz attempted to develop a complete electron theory which would explain this
contraction in terms of a readjustment of electrical forces between molecules, due to
absolute motion through the ether. In a succession of papers he ultimately formulated
a theory in which Maxwell's equations would transform from one set of variables to
another without a change in form.⁴ The two sets of variables were related to each other
by what have come to be known as the Lorentz transformation equations; the representation is that of the connection between two coordinate systems in different states
of constant motion through the ether. To obtain this transformation Lorentz assumed
that spherical electrons were flattened into ellipsoids due to their motion through the
4 H. A. Lorentz, "Electromagnetic Phenomena in a System Moving With Any Velocity Less Than
That of Light," Proc Amst Acad, 6, 809; 1904. (Reprinted in English in The Principle of Relativity,
pp. 11-34, Dover Publications, New York.)


ether and introduced what he called "local time" in one frame of reference which
depended on both time and distance in the other frame. The physical meaning of this
local time was not elaborated.
In 1932 R. J. Kennedy devised an ingenious modification of the Michelson-Morley
experiment which showed the Lorentz-FitzGerald contraction hypothesis to be untenable. Meanwhile, the intervening years had seen a series of repetitions of the original
Michelson experiment by a number of investigators. Though the sensitivity and accuracy increased, no change from the null result was noted and a variety of explanations
based on an ether theory proved unsatisfactory.
Concomitantly, a new approach to the problem had been evolving. In 1900 while
addressing the International Congress of Physics at Paris, Poincare reviewed the implications raised by the null result of the Michelson experiment and asked, "Our ether, does it really exist? I do not believe that more precise observations ever could reveal
anything more than relative displacements." Poincare became convinced that it was
impossible to determine the earth's absolute motion (that is, its velocity through the
ether), and embraced this belief in the enunciation of a Principle of Relativity. Speaking in St. Louis in 1904, he said⁵

According to the Principle of Relativity, the laws of physical phenomena must be the
same for a "fixed" observer as for an observer who has a uniform motion of translation
relative to him; so that we have not, and cannot possibly have, any means of discerning
whether we are, or are not, carried along in such a motion.

In 1905 Einstein made a complete break with the ether concept, discarding it as
superfluous. He adopted the principle of relativity as a postulate and added as another
that light is always propagated in empty space with a definite velocity c which is independent of the state of motion of the emitting body. Upon careful reexamination of the
concepts of the measurements of space and time, he concluded that neither was an
invariant. In satisfying his second postulate, Einstein was led to the same transformation equations derived earlier by Lorentz. However, the derivation was on an entirely different basis, and one which has stood the test of time.
The noninvariance of spatial and temporal intervals has caused a major reinterpretation of the concepts of mechanics. In 1906 Max Planck determined the modifications
which would be needed in the Newtonian equations of motion to place them in accord
with the new relativity theory, and then developed expressions for the kinetic energy
and momentum of a material particle. It was recognized that the concept of mass as
an invariant must also be abandoned. This variability of mass was clearly illustrated
by G. N. Lewis and R. C. Tolman in 1909, when they considered the collision of two
similar balls as viewed from different coordinate systems, and found that either momentum was not conserved or mass depended on speed. The relativistic expression for
kinetic energy led Einstein, Lewis, and others to suggest that energy and mass were
related by the now-celebrated equation $E = mc^2$. Transformation laws based on the
Lorentz equations were worked out for velocity, mass, and force, from which emerged
the result that the velocity of light is the upper limit for motion of matter and energy.
Experimental evidence in support of Einstein's special theory of relativity is positive
and abundant. With the advent of atomic clocks, greatly increased precision in the
measurement of time intervals has made possible a variety of terrestrial experiments
5 An English translation of this address by G. B. Halstead can be found in The Monist, January 1905.


which verify all the major predictions of the theory, including the dependence on speed
of distance, time, and mass. A variety of nuclear processes has confirmed the relation
$E = mc^2$. Much of this evidence will be presented in the sections to follow, together
with the principal developments of the theory which have been enumerated above.

2.2

THE PRINCIPLE OF RELATIVITY AND ITS CLASSICAL IMPLICATIONS

The principle of relativity in science is an old idea whose origins are difficult to trace.
Simply stated, it expresses the belief that all the laws of nature should operate in the
same manner everywhere in the universe.⁶ This idea was given specific articulation by
Poincare at a meeting of the International Congress of Physics at Paris in 1900 and
was raised to the status of a formal postulate by Einstein in 1905. Despite the apparent
simplicity and self-evident logic of this principle, it has deep-seated consequences.
Consider first the implications of the relativity principle with respect to Newton's
laws of motion. The First Law, or Law of Inertia, states: A body at rest or in uniform
motion will remain at rest or in uniform motion unless some external force is applied to it.
Implicit in this law is the notion of an observer who can determine that the motion
of the body is unaccelerated. But not all observers will make such an observation, for if
two observers are in accelerated motion with respect to each other, they cannot both
perceive the body to have an unaccelerated motion. Thus the Law of Inertia as stated
above is not applicable for all observers in all coordinate systems. Those systems in
which it is applicable are said to be inertial systems.
Let XYZ be a Cartesian coordinate system in which the Law of Inertia is valid.
By this one means that an observer O who is stationary in XYZ will determine that
any body which is removed from interaction with all other bodies will be at rest or
traveling in a straight line at constant speed. In the Newtonian conception of space,
O can imagine the X, Y, and Z axes extending as straight lines in three perpendicular
directions to the limits of the universe and can imagine a one-to-one correspondence
between the points in physical space and the triplets of numbers (x,y,z). As seen by O,
the instantaneous position of a particle can be described by its three coordinate variables x(t), y(t), z(t). If this particle is force-free, O can write

$$\ddot{x} = 0 \qquad \ddot{y} = 0 \qquad \ddot{z} = 0 \tag{2.1}$$

in which each dot signifies a time derivative. Integration once with respect to time gives

$$\dot{x} = v_x \qquad \dot{y} = v_y \qquad \dot{z} = v_z \tag{2.2}$$

with $\mathbf{v} = \mathbf{1}_x v_x + \mathbf{1}_y v_y + \mathbf{1}_z v_z$ the constant straight-line velocity which O observes the
particle to have, in conformance with the First Law.†
But clearly the frame of reference XYZ is not unique in the sense of being the
only inertial system, and one can readily imagine another observer O', at rest in a
different coordinate system X'Y'Z', for whom the same body also seems unaccelerated.
Since O and O' are observing the same phenomenon, they should be able to deduce
each other's measurements from a knowledge of the relative position and motion of

† In this text, unit vectors will be designated by the symbols $\mathbf{1}_x$, $\mathbf{1}_r$, $\mathbf{1}_\phi$, etc. See the Mathematical
Supplement.
6 Anaxagoras (c.500-430 B.C.) apparently held this belief. See D. E. Gershenson and D. A. Greenberg,
Anaxagoras and the Birth of the Scientific Method, Blaisdell Publishing Company, New York, 1964.


their two frames of reference. Stated differently, if observer O knows the triplet (x,y,z)
which establishes the instantaneous position of the particle as determined by himself,
he should be able to deduce the corresponding triplet (x',y',z') and thus know the
instantaneous position of the particle as seen by O'.
This connection is accomplished through the coordinate transformation equations†

$$x' = g_1(x,y,z,t) \qquad y' = g_2(x,y,z,t) \qquad z' = g_3(x,y,z,t) \tag{2.3}$$
Several restrictions can be invoked to determine a specific form of this transformation.
First, if it is assumed that O and O' agree in their measurement of distance, then the
functions $g_1$, $g_2$, and $g_3$ must be linear and commensurate in the spatial variables.
Second, if O and O' are both to observe the particle to have a straight line trajectory,
only a translational motion of X'Y'Z' with respect to XYZ is permitted; rotational
motion is excluded. Third, if it is assumed that O and O' agree in their measurement
of time intervals, then the functions $g_1$, $g_2$, and $g_3$ must also be linear in the temporal
variable, for otherwise one would not obtain $\ddot{x}' = \ddot{y}' = \ddot{z}' = 0$ when $\ddot{x} = \ddot{y} = \ddot{z} = 0$.
With these restrictions, the most general suitable solution of (2.3) is the Galilean
transformation discussed in the Mathematical Supplement (Example V.16); namely,

$$\begin{aligned}
x' &= (x - x_0 - u_x t)\cos xx' + (y - y_0 - u_y t)\cos yx' + (z - z_0 - u_z t)\cos zx' \\
y' &= (x - x_0 - u_x t)\cos xy' + (y - y_0 - u_y t)\cos yy' + (z - z_0 - u_z t)\cos zy' \\
z' &= (x - x_0 - u_x t)\cos xz' + (y - y_0 - u_y t)\cos yz' + (z - z_0 - u_z t)\cos zz'
\end{aligned} \tag{2.4}$$

In (2.4), $\mathbf{u} = \mathbf{1}_x u_x + \mathbf{1}_y u_y + \mathbf{1}_z u_z$ is the constant velocity of O' with respect to O;
cos xx', cos yx', etc., are the cosines of the constant angles between the X and X' axes,
the Y and X' axes, etc.; $(x_0,y_0,z_0)$ is the position of the origin of the X'Y'Z' system as
seen by O at t = 0.
Equations (2.4) are known as the most general Galilean transformation. Their physical interpretation is that the primed system is in translative motion relative to the
unprimed system at a speed $u = (u_x^2 + u_y^2 + u_z^2)^{1/2}$. This motion is in an arbitrary direction with respect to the XYZ axes. Furthermore, the X'Y'Z' axes are tilted arbitrarily
relative to the XYZ axes and the primed origin is in an arbitrary position relative to
the unprimed origin at t = 0.
It is a simple matter to show that Newton's First Law applies in one of these systems if it applies in the other. If x(t), y(t), and z(t) are the time-varying coordinates
of a particle as seen by O, then differentiation of (2.4) gives

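$$\begin{aligned}
\frac{dx'}{dt} &= (\dot{x} - u_x)\cos xx' + (\dot{y} - u_y)\cos yx' + (\dot{z} - u_z)\cos zx' \\
\frac{dy'}{dt} &= (\dot{x} - u_x)\cos xy' + (\dot{y} - u_y)\cos yy' + (\dot{z} - u_z)\cos zy' \\
\frac{dz'}{dt} &= (\dot{x} - u_x)\cos xz' + (\dot{y} - u_y)\cos yz' + (\dot{z} - u_z)\cos zz'
\end{aligned} \tag{2.5}$$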

Since it has been assumed that observers O and O' agree in their measurement of time
intervals, so that it is proper to write dt' = dt, it follows that the left sides of Equations (2.5) are the instantaneous velocity components of the particle as seen by O'.

† See the Mathematical Supplement, Sec. V.II.


Under this assumption, Equations (2.5) are known as the velocity transformation
equations, and one additional differentiation gives the acceleration transformation,
namely,

$$\begin{aligned}
\ddot{x}' &= \ddot{x}\cos xx' + \ddot{y}\cos yx' + \ddot{z}\cos zx' \\
\ddot{y}' &= \ddot{x}\cos xy' + \ddot{y}\cos yy' + \ddot{z}\cos zy' \\
\ddot{z}' &= \ddot{x}\cos xz' + \ddot{y}\cos yz' + \ddot{z}\cos zz'
\end{aligned} \tag{2.6}$$

Thus if O is observing an unaccelerated particle, so that $\ddot{x} = \ddot{y} = \ddot{z} = 0$, then Equations (2.6) give $\ddot{x}' = \ddot{y}' = \ddot{z}' = 0$, indicating that the particle also appears unaccelerated to observer O'.
Cartesian coordinate systems linked by a Galilean transformation have several other
important properties. One of these is the invariance of distance. Suppose that $(x_2,y_2,z_2)$
are the coordinates of one particle and $(x_1,y_1,z_1)$ are the coordinates of another particle
at a common time t, as noted by observer O, who is stationary in XYZ. O then says
that the instantaneous distance separating the two particles is

$$d = [(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]^{1/2} \tag{2.7}$$

Similarly, observer O', who is stationary in X'Y'Z', finds that the instantaneous positions are $(x_2',y_2',z_2')$ and $(x_1',y_1',z_1')$ and concludes that the distance of separation is

$$d' = [(x_2' - x_1')^2 + (y_2' - y_1')^2 + (z_2' - z_1')^2]^{1/2} \tag{2.8}$$

If the two coordinate systems are connected by a Galilean transformation, Equations
(2.4) can be used to deduce that

$$\begin{aligned}
x_2' - x_1' &= (x_2 - x_1)\cos xx' + (y_2 - y_1)\cos yx' + (z_2 - z_1)\cos zx' \\
y_2' - y_1' &= (x_2 - x_1)\cos xy' + (y_2 - y_1)\cos yy' + (z_2 - z_1)\cos zy' \\
z_2' - z_1' &= (x_2 - x_1)\cos xz' + (y_2 - y_1)\cos yz' + (z_2 - z_1)\cos zz'
\end{aligned} \tag{2.9}$$

Substitution of (2.9) in (2.8), recognizing that terms of the type $\cos^2 xx' + \cos^2 xy' + \cos^2 xz'$
are unity, whereas terms of the type $\cos xx' \cos yx' + \cos xy' \cos yy' + \cos xz' \cos yz'$
are zero,† makes it apparent that

$$d' = d \tag{2.10}$$

Therefore the Galilean transformation leaves distance an invariant.
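The invariance of distance under (2.4) can also be checked numerically. The following is a minimal sketch, not part of the original development: the helper names `rotation_matrix`, `galilean`, and `distance`, the two tilt angles, and all numerical values are illustrative assumptions. The direction cosines are generated from two successive rotations so that they automatically satisfy the orthogonality relations used above.

```python
import math

def rotation_matrix(alpha, beta):
    """Direction cosines for a tilt of alpha about Z followed by beta about X.

    Row i holds the cosines of the angles between the i-th primed axis
    and the X, Y, Z axes (an assumed, illustrative parametrization).
    """
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    rz = [[ca, sa, 0.0], [-sa, ca, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, cb, sb], [0.0, -sb, cb]]
    return [[sum(rx[i][k] * rz[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def galilean(point, t, origin, u, cosines):
    """Equations (2.4): subtract the moving origin, then project on the primed axes."""
    shifted = [point[j] - origin[j] - u[j] * t for j in range(3)]
    return [sum(cosines[i][j] * shifted[j] for j in range(3)) for i in range(3)]

def distance(p, q):
    """Equations (2.7) and (2.8): separation of two points at a common time."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

cosines = rotation_matrix(0.7, -0.4)   # arbitrary tilt of the primed axes
origin = [1.0, -2.0, 0.5]              # (x0, y0, z0)
u = [3.0, 0.0, -1.0]                   # velocity of O' relative to O
t = 2.5                                # common time of observation
p1 = [0.0, 1.0, 2.0]
p2 = [4.0, -1.0, 3.0]

d = distance(p1, p2)
d_primed = distance(galilean(p1, t, origin, u, cosines),
                    galilean(p2, t, origin, u, cosines))
print(d, d_primed)  # the two values should agree to rounding error
```

Changing `t`, `origin`, or `u` leaves the agreement intact, since those quantities cancel in the coordinate differences of (2.9).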


† The term $\cos^2 xx' + \cos^2 xy' + \cos^2 xz'$ is seen to be a unit vector parallel to the X axis,
resolved into components along the primed axes, and dotted with itself. The term $\cos xx' \cos yx' + \cos xy' \cos yy' + \cos xz' \cos yz'$
is seen to be the dot product of two perpendicular unit vectors, one
parallel to the X axis, the other parallel to the Y axis, both resolved into their primed components.
‡ If the form of a law is unchanged by a certain coordinate transformation, that is, if the law has the
same functional form in terms of either set of coordinates, the law is said to be covariant with respect
to the transformation considered.

This invariance of distance permits a simple proof of the most important property
of a Galilean transformation: the fact that Newton's general force law is invariant
(actually covariant‡) under such a transformation. It has been shown above that if
the First Law (concerning unaccelerated bodies) is valid in the unprimed system, a
Galilean transformation renders it valid in the primed system as well. But for accelerated bodies, if $\mathbf{f} = m\mathbf{a}$ is a valid relation in XYZ, and it is transformed using (2.4),
does one obtain $\mathbf{f}' = m'\mathbf{a}'$? To see that under suitable assumptions this does occur,
consider the result of Example V.22 in the Mathematical Supplement. In that example, a mass m experiences gravitational forces due to an assemblage of other masses
$m_1, \ldots, m_N$. The total force on m is found to be expressible as the negative of the
gradient of the scalar potential function

$$\Phi(x,y,z,x_1,y_1,z_1,\ldots,t) = -\sum_{i=1}^{N} G\frac{mm_i}{r_i} \tag{2.11}$$

in which G is the universal gravitational constant and

$$r_i = [(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2]^{1/2}$$

is the instantaneous distance between m and $m_i$. Through use of (2.11), Newton's force
law for the case of mass particles can be written in the form

$$m\mathbf{a} = \nabla \sum_{i=1}^{N} G\frac{mm_i}{r_i} = -\nabla\Phi \tag{2.12}$$

Because distance is an invariant, and because of the form of (2.11), if it is assumed that
mass is also an invariant, then

$$\Phi(x',y',z',x_1',y_1',z_1',\ldots,t) = \Phi(x,y,z,x_1,y_1,z_1,\ldots,t) \tag{2.13}$$

In other words, for a given set of relative positions of the masses, observers O and O'
will agree on the value of the potential function. Formation of partial derivatives
of (2.13) gives

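$$\begin{aligned}
\frac{\partial\Phi}{\partial x'} &= \frac{\partial\Phi}{\partial x}\cos xx' + \frac{\partial\Phi}{\partial y}\cos yx' + \frac{\partial\Phi}{\partial z}\cos zx' \\
\frac{\partial\Phi}{\partial y'} &= \frac{\partial\Phi}{\partial x}\cos xy' + \frac{\partial\Phi}{\partial y}\cos yy' + \frac{\partial\Phi}{\partial z}\cos zy' \\
\frac{\partial\Phi}{\partial z'} &= \frac{\partial\Phi}{\partial x}\cos xz' + \frac{\partial\Phi}{\partial y}\cos yz' + \frac{\partial\Phi}{\partial z}\cos zz'
\end{aligned} \tag{2.14}$$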

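in which it has been recognized, through differentiation of the inverse of Equations
(2.4), that $\partial x/\partial x' = \cos xx'$, $\partial y/\partial x' = \cos yx'$, etc.
Substitution of the three components of (2.12) into (2.14) gives

$$\begin{aligned}
-\frac{\partial\Phi}{\partial x'} &= m\ddot{x}\cos xx' + m\ddot{y}\cos yx' + m\ddot{z}\cos zx' \\
-\frac{\partial\Phi}{\partial y'} &= m\ddot{x}\cos xy' + m\ddot{y}\cos yy' + m\ddot{z}\cos zy' \\
-\frac{\partial\Phi}{\partial z'} &= m\ddot{x}\cos xz' + m\ddot{y}\cos yz' + m\ddot{z}\cos zz'
\end{aligned} \tag{2.15}$$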

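Upon comparing (2.15) with (2.6), one can conclude that

$$-\frac{\partial\Phi}{\partial x'} = m\ddot{x}' \qquad -\frac{\partial\Phi}{\partial y'} = m\ddot{y}' \qquad -\frac{\partial\Phi}{\partial z'} = m\ddot{z}'$$

and thus that

$$m\mathbf{a}' = \nabla' \sum_{i=1}^{N} G\frac{mm_i}{r_i} = -\nabla'\Phi \tag{2.16}$$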


which is Newton's general force law in the same form as (2.12). Thus, under the
assumptions that time and mass are invariants (plus the consequence of the Galilean
transformation that distance is an invariant), the general Galilean transformation
leaves all of Newton's laws of mechanics for free mass particles unaltered in form. It is
for this reason that considerable importance attaches to the transformation (2.4).
Other branches of mechanics, including hydrodynamics, elasticity, and the mechanics
of rigid bodies, can be treated as extensions of the mechanics of free mass particles,
through the introduction of suitable interaction energies in the form of potential functions whose gradients give forces. It is thus clear, without entering into a detailed
treatment of these branches of mechanics, that the laws which govern them also transform properly via the Galilean Equations (2.4), under the same assumptions which
were made in the preceding development. Therefore the two inertial systems XYZ and
X'Y'Z' appear to be equivalent for the description of all the phenomena of mechanics.
This belief is often referred to as the Galilean principle of relativity.
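The covariance argument can be illustrated with a small numerical sketch. It rests on assumptions that are not in the text: the helper names `to_primed` and `gravity_force`, a tilt about the Z axis only, and arbitrary masses and positions. The gravitational force on m is computed once from unprimed coordinates and resolved along the primed axes, and once directly from the transformed coordinates; if the force law keeps its form, the two results should coincide.

```python
import math

G = 6.674e-11  # universal gravitational constant, SI units

def to_primed(p, t, origin, u, cosines):
    """General Galilean transformation (2.4) applied to one point."""
    shifted = [p[j] - origin[j] - u[j] * t for j in range(3)]
    return [sum(cosines[i][j] * shifted[j] for j in range(3)) for i in range(3)]

def gravity_force(m, pos, others):
    """Force on m at pos due to (mass, position) pairs: minus the gradient of (2.11)."""
    force = [0.0, 0.0, 0.0]
    for m_i, pos_i in others:
        sep = [pos_i[j] - pos[j] for j in range(3)]
        r_i = math.sqrt(sum(s * s for s in sep))
        for j in range(3):
            force[j] += G * m * m_i * sep[j] / r_i ** 3
    return force

theta = 0.6  # assumed tilt of the primed axes about Z
cosines = [[math.cos(theta), math.sin(theta), 0.0],
           [-math.sin(theta), math.cos(theta), 0.0],
           [0.0, 0.0, 1.0]]
origin, u, t = [1.0, 2.0, -1.0], [4.0, -3.0, 2.0], 1.5
m, pos = 5.0, [0.0, 0.0, 0.0]
others = [(1.0e3, [1.0, 2.0, 2.0]), (2.0e3, [-3.0, 0.5, 1.0])]

# Force found by O, then resolved along the primed axes as in (2.15).
force = gravity_force(m, pos, others)
force_on_primed_axes = [sum(cosines[i][j] * force[j] for j in range(3))
                        for i in range(3)]

# Force found by O' from primed coordinates alone, as in (2.16).
force_primed = gravity_force(
    m,
    to_primed(pos, t, origin, u, cosines),
    [(m_i, to_primed(p_i, t, origin, u, cosines)) for m_i, p_i in others],
)
print(force_on_primed_axes)
print(force_primed)
```

The relative velocity `u` and the offset `origin` drop out of every separation vector, which is the numerical counterpart of the statement that the potential depends only on invariant distances.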
One special case of the general Galilean transformation proves particularly useful.
Assume the situation of Figure 2.1 in which the primed and unprimed axes are respectively parallel, the origins having coincided at t = 0, and in which the X and X' axes
are sliding along each other at a relative speed u. It is seen readily that for this case (2.4)
reduces to

$$x' = x - ut \qquad y' = y \qquad z' = z \tag{2.17}$$

Similarly, the velocity transformation equations (2.5) reduce to

$$\dot{x}' = \dot{x} - u \qquad \dot{y}' = \dot{y} \qquad \dot{z}' = \dot{z} \tag{2.18}$$

a result which depends on the assumption that time is an invariant. Equations (2.18)
also could have been deduced directly by a time differentiation of (2.17).

FIGURE 2.1 Cartesian coordinate systems in constant translative relative motion.


The usefulness of the transformation (2.17) extends beyond its simplicity. Imagine a
third coordinate system X*Y*Z* connected to the system XYZ by a static rotation and
also imagine a fourth system X*'Y*'Z*' connected to the system X'Y'Z' by a static rotation plus a static translation.† Then X*'Y*'Z*' is moving relative to X*Y*Z* at a speed u.
This motion is in an arbitrary direction with respect to the X*Y*Z* axes and is also in
an arbitrary direction with respect to the X*'Y*'Z*' axes. Furthermore, the two origins
are in an arbitrary relative position at t = 0.
But this is the description of two Cartesian frames connected by the general Galilean
Equations (2.4). Therefore one can obtain the most general Galilean transformation,
connecting X*Y*Z* and X*'Y*'Z*', via two static transformations of Equations (2.17).
Since two observers, each at rest in separate Cartesian systems of coordinates (but
systems connected by a static Galilean transformation), are in complete agreement
about measurements of motion, it follows that the observations of O and O* are
equivalent, and that the observations of O' and O*' are equivalent. Any deductions
based on (2.4) are also obtainable from (2.17). For this reason no loss in generality is
suffered if, for brevity and clarity, all the remaining discussion is presented in terms
of the simple Galilean Equations (2.17).
To summarize the ideas of this section, one can say that any two inertial frames, connected by a Galilean transformation of the type (2.17) (possibly through the intermediary of two static transformations), are equally suitable as references in which to
express the general laws of mechanics. This conclusion requires the assumptions that
distance, time, and mass are invariants.
In the nineteenth century, mechanics was such a highly developed branch of science,
and there was such a satisfactory agreement between Newton's laws and experiment,
that mechanics enjoyed a greater confidence and trust than any other area of physical
knowledge. Since the principle of relativity seemed so logical and natural, and since
the Galilean equations transformed all the laws of mechanics in conformance with the
principle of relativity, the greatest confidence also reposed in the belief that the Galilean
transformation was correct. A test of its correctness arose with the question whether
or not all the other (nonmechanical) laws of physics also transform properly via the
Galilean equations, as the principle of relativity in its broadest sense requires. The
next several sections are concerned with this question.

2.3 APPLICATIONS OF THE CLASSICAL VELOCITY TRANSFORMATION LAW

A simple mechanical example will serve to illustrate the reasonableness of the velocity
transformation equations (2.18). Consider the case of an observer O standing beside a
highway as a sedan goes by traveling at 50 mph relative to the ground. At the same
time observer O' is in a second car which is traveling at 70 mph relative to the ground
and in the process of passing the sedan. If the XYZ system is attached to the ground
with the X axis parallel to the highway, the situation is suggested by Figure 2.2.
Observer O will say that the sedan has a velocity given by $\dot{x} = 50$ mph.
If the X'Y'Z' system is attached to the car in which O' is riding, then u = 70 mph is

† By a static rotation plus a static translation, one means that X*'Y*'Z*' and X'Y'Z' have no relative
motion, but their axes are tilted with respect to each other and their origins do not coincide.


the relative speed of the two coordinate systems, and (2.18) gives $\dot{x}' = 50 - 70 = -20$
mph as the speed of the sedan relative to O'. This is a result completely consistent with
common sense, and is typical of many similar applications of (2.18) which can be
encountered in everyday experience.

FIGURE 2.2 Relative speed.

Next consider an observer O, stationary with respect to the average motion of the
air which surrounds him. An acoustic source is generating sound waves which pass O
at a speed $c_s$, governed solely by the properties of the air. Imagine also an observer O'
traveling at a speed u relative to O, in the direction of the wave motion. Equations
(2.18) predict that the sound waves will pass O' at a speed $c_s - u$, and experimental
observations are consistent with this prediction.
The motion of the acoustic source will be different as observed by O and O' but this
has no effect on the wave velocity (cf. Section 1.3). The reason why O and O' observe a
different value for the speed of the sound waves is that the medium is at rest with respect
to O but is in motion at a speed u relative to O'.
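Both examples are direct applications of (2.18) and can be written out in a few lines. The sketch below is only illustrative: the function name `primed_velocity`, the sound speed of 343 m/s, and the observer speed of 30 m/s are assumptions, while the 50 mph and 70 mph figures are those of the highway example.

```python
def primed_velocity(velocity, u):
    """Equations (2.18): only the x component changes, by the frame speed u."""
    vx, vy, vz = velocity
    return (vx - u, vy, vz)

# Highway example: sedan at 50 mph, observer O' riding at 70 mph.
sedan_seen_by_o_prime = primed_velocity((50.0, 0.0, 0.0), 70.0)

# Sound example: assumed sound speed in air and assumed observer speed, in m/s.
SOUND_SPEED = 343.0
OBSERVER_SPEED = 30.0
sound_seen_by_o_prime = primed_velocity((SOUND_SPEED, 0.0, 0.0), OBSERVER_SPEED)

print(sedan_seen_by_o_prime[0])   # speed of the sedan relative to O', in mph
print(sound_seen_by_o_prime[0])   # speed of the sound wave relative to O', in m/s
```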
Finally, consider an observer O, stationary in XYZ, past whom a light wave is
propagating at speed c. If a second observer O' is traveling at a speed u relative to O
in the direction of the wave motion, then Equations (2.18) predict that the light waves
will pass O' at a speed c' = c - u. This result should be valid even without the presence
of a tangible medium, since it is known that light can propagate through a vacuum.
But in this extremity of the absence of a tangible medium, two possibilities need to be
considered:

1. There is a detectable intangible medium, call it ether, which supports the light
waves, and in which light propagates at a speed c governed by the properties of
the ether.
2. There is not a detectable ether, and a vacuum is a region to which no physical
properties can be ascribed.


Under the first possibility, the existence of a detectable ether, the situation is completely analogous to the case of sound waves. For convenience in this discussion
observer O can be assumed to be at rest in the ether, so that the light waves do pass
him at a speed c. These light waves then pass O' at a speed c' = c - u because the
medium is in motion at speed u relative to O'. The motion of the light source does not
affect these values of c and c' because only the ether governs the velocity of propagation (cf. Section 1.3).
Under the second possibility, the nonexistence of a detectable ether, it is illogical to
write

$$c' = c - u \neq c \tag{2.19}$$

This point can be appreciated by recognizing that O is in a vacuum, observing a light
source with some particular motion, and that this source is emitting light waves whose
velocity relative to himself O can measure. But O' is also in a vacuum, observing the
same light source with some particular motion, and this source is emitting light waves
whose velocity relative to himself O' can measure. The only difference in the situation
for observers O and O' is the motion of the source. But if the velocity of light is independent of the motion of the source (and the experimental evidence indicates that
light does share this characteristic with sound), then O and O' should measure the same
velocity for light, and conclude that

$$c' = c \tag{2.20}$$

which is in violation of the classical law of velocity transformation.
Thus the two possibilities lead to different predictions, and one should be able to
design experiments which will test the validity of each possibility. Several such
experiments have been performed, two of which will be described in the sections to
follow. However, before a discussion of these experiments is presented, it is significant
to point out the implications of a choice between the two possibilities (1) and (2) listed
above. If the principle of relativity is applicable to all the laws of physics, and if the
Galilean transformation equations (2.17) are consistent with this principle, then since
the velocity transformation law (2.18) is a direct consequence of (2.17), the presence of
a detectable ether is required; without an ether, the relation c' = c - u is meaningless.
Alternatively, if there is not a detectable ether, then either the Galilean transformation
equations are incorrect or the principle of relativity holds for mechanics but not for
light. A decision between possibilities (1) and (2) is of fundamental importance.
When the need to make this decision was first appreciated, there was every confidence
that an ether would be detected, that the Galilean transformation was correct, and that
the principle of relativity embraced all branches of physics. The actual detection of the
ether was eagerly awaited, and there were philosophical overtones to the scientific
interest evinced in this imminent discovery. Without an ether, no single inertial frame
of reference could in any way be preferred over any other. However, if the ether could
be detected, presumably it would not consist of different portions in relative motion,
but would be everywhere at rest in one Galilean coordinate system. It would then seem
logical to take this preferred frame as the absolute reference. The instantaneous position
of every body in the universe with respect to this preferred frame could then be designated as its absolute instantaneous position.
An absolute reference frame for the entire universe had long been an appealing idea.
(Newton, for example, had believed in absolute motion, defining it as translation of a

body from one absolute place to another absolute place.) Detection of the ether was
therefore not only expected to settle an outstanding question about light, but also to
establish a means for defining absolute position and motion.

2.4 FIZEAU'S EXPERIMENT WITH MOVING WATER

In 1859, in a classic paper,7 Fizeau described an experiment he had performed to determine the influence upon the velocity of light of the motion of the tangible medium
through which it passes. The result of this experiment has a strong bearing on the
question of the existence of a detectable ether, and was later credited by Einstein as
being of primary importance in his formulation of the special theory.
Fizeau divided a beam of light, which issued from a slit S placed at the principal focus
of a lens, into two parallel beams, which he then passed through two parallel tubes
(Figure 2.3). At the end of these tubes, the two beams impinged upon a second lens
and were reunited at its focus, where Fizeau had placed a plane mirror. Upon reflection the rays crossed and were each returned through the other tube, to be reunited
once again by the first lens and brought to a focus at the point F, through the interposition of the half-silvered mirror M.

FIGURE 2.3 Fizeau's moving water apparatus.
With both tubes filled with water, and the water at rest, transverse interference
fringes could be observed at F with a bright central fringe corresponding to equal paths.
If then the water were put in motion with equal speeds, but in opposite directions in the
two tubes, and if the velocity of light were affected by this motion, one would predict
that the central fringe would be displaced. This would be so because one beam of light
would be traveling with the water flow, both out and back, whereas the other beam
would be traveling against it. A simple measurement of the shift in position of the central fringe would yield the difference in times along the two paths and thus the dependence of light velocity on motion of the water.
When Fizeau performed this experiment, he did note a fringe shift which depended
on the rate of flow of the water, and his data fitted the formula

$$v = v_0 + u\left(1 - \frac{1}{n^2}\right) \tag{2.21}$$

in which v is the velocity of light in water when the water is moving at a speed u relative
to the laboratory, $v_0$ is the velocity of light in stationary water, and n is the index of
refraction. Both v and $v_0$ were measured relative to the laboratory.
7 A. H. Fizeau, "On Hypotheses Relative to a Luminous Ether," Ann de Chimie et de Phys, Ser. III,
57, 385-404; May 1859.
This result is at variance with a simple classical prediction. An observer O' at rest
relative to the water should measure a velocity of light in the water of value $v_0$. An observer O at rest in the laboratory, seeing O' go by at speed u, can invoke Equations (2.18)
to predict that $v = v_0 + u$. Since the index of refraction of water is approximately 1.3,
the difference between (2.21) and the classical prediction is too great to be ascribed to
experimental errors. Strong reinforcement of Fizeau's findings has been provided by
the precise work of later investigators8,9 who repeated his experiments, also obtaining
agreement with the formula (2.21).
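The size of the disagreement can be made concrete with a short calculation. This sketch takes formula (2.21) in the form $v = v_0 + u(1 - 1/n^2)$, which is the form the derivation in this section leads to; the function names, the refractive index 1.33, and the flow speed of 7 m/s are illustrative assumptions rather than Fizeau's recorded values.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def light_speed_fresnel(n, u):
    """Formula (2.21): only the fraction (1 - 1/n**2) of u is added to v0."""
    v0 = C / n
    return v0 + u * (1.0 - 1.0 / n ** 2)

def light_speed_classical(n, u):
    """Classical prediction from (2.18): the full flow speed is added to v0."""
    return C / n + u

n_water = 1.33   # assumed index of refraction of water
flow = 7.0       # assumed flow speed, m/s

drag_fraction = 1.0 - 1.0 / n_water ** 2
difference = light_speed_classical(n_water, flow) - light_speed_fresnel(n_water, flow)

print(drag_fraction)  # fraction of the flow speed that appears in (2.21)
print(difference)     # classical minus (2.21), in m/s; equals flow / n_water**2
```

With these assumed numbers less than half of the flow speed shows up in the measured velocity, which is why the two predictions can be told apart by the fringe shift.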
This formula had actually been derived earlier by Fresnel on theoretical grounds
through a complication of the ether concept. At the time he was interested in explaining
an observation by Arago that the apparent refraction of light in a moving prism was
equal to the absolute refraction in a fixed prism. However, the argument of Fresnel's
derivation is equally applicable to the Fizeau experiment, and in that context proceeds
as follows:
Assume that the ethereal density in any body is proportional to the square of its index
of refraction. Then if c is the velocity of light in the ether in the absence of any tangible
matter, and if $v_0$ is the velocity of light in the given material body when it is at rest, so
that $n = c/v_0$ is the refractive index, it follows that

$$\frac{\rho'}{\rho} = n^2 = \frac{c^2}{v_0^2}$$

in which $\rho$ is the density of the ether in free space and $\rho'$ is its density in the material
body.
Fresnel made the additional assumption that when the material body was in motion
at speed u, part of the ether was carried along with it, namely, that part which constitutes the excess of its density over the density of ether in free space. The rest of the ether
within the space occupied by the body was assumed to remain stationary. In this manner the density of the ether carried along by the body could be computed as

$$\rho' - \rho = (n^2 - 1)\rho$$

while a density $\rho$ remained at rest. The motion of the center of gravity of the ether
within the body was therefore

$$\frac{n^2 - 1}{n^2}\,u$$

Since this is the average motion, relative to the observer, of the ether associated with
the body, he should add this term to $v_0$, the velocity of light in the body when it is at
rest, in order to obtain v, the velocity of light in the body when it is in motion at
speed u. This addition yields the formula (2.21).
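Fresnel's center-of-gravity argument amounts to a density-weighted average of two velocities, which the following sketch reproduces. The function name `ether_drift_speed` and the numerical inputs are assumptions made for illustration only.

```python
def ether_drift_speed(n, u, rho=1.0):
    """Density-weighted mean velocity of the ether inside a body moving at speed u.

    rho is the free-space ether density; the body holds density n**2 * rho,
    of which the excess moves at u and the remainder stays at rest.
    """
    rho_body = n ** 2 * rho
    carried = rho_body - rho   # part dragged along with the body
    at_rest = rho              # part left stationary
    return (carried * u + at_rest * 0.0) / rho_body

n_water, flow = 1.33, 7.0      # assumed illustrative values
drift = ether_drift_speed(n_water, flow)
print(drift)                            # weighted-average ether velocity
print(flow * (1.0 - 1.0 / n_water ** 2))  # the same quantity from (n**2 - 1) / n**2 * u
```

The free-space density cancels from the average, so the result depends only on n and u, and it vanishes when n equals 1.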
Fresnel's derivation is seen to require further hypotheses about the behavior of the
ether. No longer is the ether simply an intangible medium which is everywhere at rest
in some absolute reference frame with all the material bodies of the universe gliding
through it without interaction. The ether becomes more dense inside a material body
and part of it is dragged along by the body's motion. Furthermore, Fresnel's derivation
would be valid only for an observer at rest in the absolute reference frame, since he
assumed that part of the ethereal density was stationary with respect to the observer.
An ether hypothesis adequate to explain Fizeau's result is thus seen to be rather
complicated.
It is interesting to note that if formula (2.21) is valid for all material bodies, and if
a succession of material bodies is considered whose indices of refraction are successively closer to unity, in the limit as $n \to 1$, (2.21) reduces to c = c'.
8 A. A. Michelson and E. W. Morley, Am J Sci, 31, 377; 1886.
9 P. Zeeman, Proc Amst Acad, 17, 445; 1914. Also 18, 398; 1915.

2.5 THE MICHELSON-MORLEY EXPERIMENT

A definitive experiment designed expressly to detect the presence of an ether was first
performed by Michelson in 1881 and repeated with improved accuracy by Michelson
and Morley in 1887. The essence of the approach is precisely analogous to Example V.3
in the Mathematical Supplement, in which two rowers determine a river's flow by
noting the difference in their elapsed round-trip times, when one man rows across the
river and the other parallel to the bank. The reader is urged to familiarize himself with
that problem and to convince himself of the soundness of the logic underlying the
analysis.
The apparatus employed in the ether experiment was an interferometer invented by
Michelson and shown in schematic form in Figure 2.4. Light from a source is split into
two parts by a half-silvered mirror P. One part travels over path $l_1$, is reflected by
mirror $M_1$, and upon returning to P is partially reflected toward the viewing telescope F.
The light which thus reaches F has gone through the plate P three times.
The other part of the original light beam travels over path $l_2$, through the equalizing
plate, is reflected by $M_2$, and upon returning to P is partially transmitted toward the
viewing telescope F. The light which thus reaches F has gone through the plate P once
and the equalizing plate twice. Since these two plates are identical except for silvering,
the paths in glass along the two routes are the same. When the light source is monochromatic, the relative phase of the two light components reaching F depends on the
difference in round-trip times for light to travel along the two paths $P$-$M_1$-$P$ and
$P$-$M_2$-$P$. This relative phase manifests itself by interference effects in the field of the
viewing telescope.
Imagine that the half-silvered mirror P is set precisely at 45 deg, and that the tilts
of $M_1$ and $M_2$ are adjusted so that the two light components which reach F via $M_1$ and
$M_2$ travel parallel paths as they approach F. In this case the light intensity is essentially
uniform over the central region of a transverse plane AA', and the level of intensity
depends on the relative phase of the two light components. However, if the tilt of P is
now shifted slightly away from 45 deg, the two light components reaching F via $M_1$ and
$M_2$ no longer travel parallel paths as they approach F. Thus in the transverse plane AA'
there will be alternate regions which are light and dark due to the constructive and
destructive interference of the two light components. The positions of these light and
dark regions, or interference fringes as they are called, depend on the relative phase of
the two light components. If this relative phase changes, the interference fringes will
shift transversely, and there will be a shift of one fringe for every 360 deg change in
relative phase of the two light components. Upon focusing the viewing telescope and
adjusting the tilt of P so that this transverse field of fringes is distinct, the operator
of the interferometer has an extremely sensitive indication of a change in relative phase
of the two light components arriving from $M_1$ and $M_2$, through an observed shift in the
fringe pattern in his field of view.

FIGURE 2.4 The Michelson interferometer.
With this experimental technique in mind, assume that the earth, and with it the
apparatus of Figure 2.4, are moving at a speed u relative to the ether in a direction that
would take $M_2$ into P. According to the ether hypothesis, the speed of light is c in any
direction in the ether. If $t_2'$ is the time for light to travel from P to $M_2$, then $ct_2'$ is the
distance this light traveled through the ether. But this distance must also equal $l_2 - ut_2'$
due to the motion of the apparatus through the ether. Thus

$$t_2' = \frac{l_2}{c + u}$$

Similarly, if $t_2''$ is the time for light to travel the return path from $M_2$ to P, then

$$t_2'' = \frac{l_2}{c - u}$$

The total time for light to travel the path P to $M_2$ to P is therefore

$$t_2 = t_2' + t_2'' = \frac{2l_2/c}{1 - u^2/c^2} \tag{2.22}$$
To compute the time for light to travel along the path to $M_1$ and back, one must
account for the fact that, while the light travels from P to $M_1$, the whole apparatus
moves a distance $\delta$ in the $M_2$-$P$ direction, as shown in Figure 2.5. The actual distance
traveled by the light through the ether is therefore $(l_1^2 + \delta^2)^{1/2}$. If $t_1'$ is the time it takes
for light to get from P to $M_1$, then

$$ct_1' = (l_1^2 + \delta^2)^{1/2} \qquad \delta = ut_1'$$

Upon eliminating $\delta$, one obtains

$$t_1' = \frac{l_1/c}{(1 - u^2/c^2)^{1/2}}$$

Since the time light takes along the return path from $M_1$ to P is the same, the total time
for light to travel the path P to $M_1$ to P is

$$t_1 = \frac{2l_1/c}{(1 - u^2/c^2)^{1/2}} \tag{2.23}$$

FIGURE 2.5 Ray path from P to $M_1$ to P.
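The two round-trip times can be checked against the geometric conditions from which they follow. In the sketch below the transverse time is found by iterating the condition that light covers the hypotenuse of length $(l_1^2 + (ut)^2)^{1/2}$ at speed c, and the parallel time is the sum of the outbound and return legs. The function names and the toy values c = 1, u = 0.6, and unit arm lengths are assumptions chosen only to make the comparison easy to read.

```python
import math

def parallel_round_trip(l2, u, c):
    """Out to M2 against the apparatus motion, back with it."""
    return l2 / (c + u) + l2 / (c - u)

def transverse_one_way(l1, u, c, iterations=200):
    """Solve c*t = sqrt(l1**2 + (u*t)**2) for t by fixed-point iteration."""
    t = l1 / c  # starting guess: apparatus at rest in the ether
    for _ in range(iterations):
        t = math.sqrt(l1 ** 2 + (u * t) ** 2) / c
    return t

c, u, arm = 1.0, 0.6, 1.0      # toy units, assumed for illustration
t2 = parallel_round_trip(arm, u, c)
t1 = 2.0 * transverse_one_way(arm, u, c)

# Closed forms with 1 - u**2/c**2 in the denominator (parallel arm)
# and its square root in the denominator (transverse arm).
t2_closed = (2.0 * arm / c) / (1.0 - u ** 2 / c ** 2)
t1_closed = (2.0 * arm / c) / math.sqrt(1.0 - u ** 2 / c ** 2)

print(t1, t1_closed)
print(t2, t2_closed)
```

For equal arms the parallel trip always takes longer than the transverse one, just as in the rower analogy of the Mathematical Supplement.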


The difference in phase (assuming monochromatic light) of the two light components
arriving at F is therefore

$$\delta = 2\pi\nu(t_1 - t_2) = \frac{4\pi\nu/c}{(1 - u^2/c^2)^{1/2}}\left[l_1 - \frac{l_2}{(1 - u^2/c^2)^{1/2}}\right] \tag{2.24}$$

in which $\nu$ is the frequency of the light being used. As the apparatus is rotated through
90 deg, this difference in phase should steadily change, until at the end of the 90-deg
rotation, the roles of $l_1$ and $l_2$ are interchanged. At this position, the difference in phase
is

$$\delta' = \frac{4\pi\nu/c}{(1 - u^2/c^2)^{1/2}}\left[\frac{l_1}{(1 - u^2/c^2)^{1/2}} - l_2\right] \tag{2.25}$$

If an observer continually notes the interference fringes as the apparatus is rotated
through 90 deg, he should see a total shift of n fringes, where n is given by

$$n = \frac{\delta' - \delta}{2\pi} = \frac{2\nu}{c}\,\frac{l_1 + l_2}{(1 - u^2/c^2)^{1/2}}\left[\frac{1}{(1 - u^2/c^2)^{1/2}} - 1\right] \tag{2.26}$$

If u/c is small, a series expansion (cf. Mathematical Supplement, Part I) gives

$$n = \frac{l_1 + l_2}{\lambda}\,\frac{u^2}{c^2} \tag{2.27}$$

in which $\lambda = c/\nu$ is the wavelength of the light, whereas, if u/c is not small, n is larger than the value given by (2.27). Thus (2.27)
is the most pessimistic prediction for fringe shift and is seen to be a second-order
expression in u/c.
If an observer were willing to perform this experiment every day for six months, he
would expect to encounter a value for u at least as great as the orbital velocity of the
earth around the sun, namely 30 km/sec, this minimum value occurring if the sun were
at rest in the ether. Upon inserting u = 30 km/sec in (2.27) one finds that $l_1 + l_2$ needs
to be approximately 50 m in order to assure the observation of one fringe shift. It has
proved possible to build an interferometer of this type capable of detecting as little as
1/1000th of a fringe shift, putting a reasonable requirement on the size of the apparatus.
Thus the sensitivity needed is well within the capabilities of construction.
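The 50 m estimate can be reproduced with a short calculation. The sketch assumes that the second-order result (2.27) has the form $n = (l_1 + l_2)u^2/(\lambda c^2)$ and that the light has a wavelength of 500 nm; the wavelength, the function names, and the rewritten exact expression are assumptions introduced here, not values taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def fringe_shift_approx(total_arm_length, u, wavelength):
    """Second-order estimate: (l1 + l2) / wavelength * (u / c)**2."""
    return (total_arm_length / wavelength) * (u / C) ** 2

def fringe_shift_exact(total_arm_length, u, wavelength):
    """Unexpanded fringe count for a 90-deg rotation.

    The factor 1/s - 1, with s = sqrt(1 - u**2/c**2), is rewritten as
    beta2 / (s * (1 + s)) to avoid loss of precision when u/c is tiny.
    """
    beta2 = (u / C) ** 2
    s = math.sqrt(1.0 - beta2)
    return 2.0 * (total_arm_length / wavelength) * beta2 / (s * s * (1.0 + s))

orbital_speed = 30.0e3   # m/s, orbital velocity of the earth
wavelength = 500.0e-9    # m, assumed visible wavelength
total_arms = 50.0        # m, l1 + l2

n_approx = fringe_shift_approx(total_arms, orbital_speed, wavelength)
n_exact = fringe_shift_exact(total_arms, orbital_speed, wavelength)
print(n_approx, n_exact)  # both close to one fringe for these inputs
```

Because the shift scales linearly with the total arm length, an instrument able to resolve a small fraction of a fringe can use correspondingly shorter arms.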
Additional factors affecting the accuracy are apparent. The relative positions of different parts of the apparatus must remain constant within a small fraction of a wavelength during operation. The stability of the light source is important, and the frequency bandwidth of the "monochromatic" light must be as small as possible. However,
if great care is taken in the assembly of the apparatus, with due consideration given to
these possible sources of error, one should expect to be able to measure the ether drift
regardless of how slowly the sun might be moving through the ether.
The first satisfactory trials by Michelson in 1881 indicated a null result, but the sensitivity was marginal. In Michelson's words:¹⁰

In the first experiment one of the principal difficulties encountered was that of revolving
the apparatus without producing distortion; and another was its extreme sensitiveness to
vibration. This was so great that it was impossible to see the interference fringes except at
brief intervals when working in the city, even at two o'clock in the morning. Finally . . .
the quantity to be observed; namely, a displacement of something less than a twentieth of
the distance between the interference fringes may have been too small to be detected when
masked by experimental errors.

10 A. A. Michelson and E. W. Morley, "On the Relative Motion of the Earth and the Luminiferous
Ether," Am J Sci, ser. III, 34, 333-345; November 1887.

Accordingly, the apparatus underwent a major redesign before the 1887 trials. The
interferometer was mounted on a massive stone 1.5 m square and 0.3 m thick. The stone
rested on an annular wooden float whose outside diameter was 1.5 m, with an inside
diameter of 0.7 m and a thickness of 0.25 m. The wooden float rested on liquid mercury
(which Morley had collected and purified), contained in a cast iron trough 1.0 cm
thick, and of such dimensions as to leave a clearance of approximately one centimeter
around the float. A central pin was used to keep the float concentric with the trough.
The annular iron trough rested on a bed of concrete on a low brick pier built in the form
of a hollow octagon. An excavation was made down to bedrock to set the supporting
column for the apparatus.

FIGURE 2.6 Perspective view of the Michelson-Morley apparatus. Labeled parts: brick pier, cast iron trough, wooden float, stone slab, guiding pin, Argand lamp, half-silvered mirror, equalizing plate, banks of four mirrors, viewing telescope.

A bank of four mirrors was placed at each corner of the stone and multiple reflections
were utilized to increase the effective lengths of the two legs of the interferometer to
about 11 m. An Argand lamp was used as the light source and a wooden cover was
placed over the interferometer to prevent air currents and rapid changes in temperature. A perspective view of the complete equipment is shown in Figure 2.6.


It is demonstrated in Appendix A that if the two legs of the interferometer are equal
(as was the case in the Michelson-Morley experiment), and if the ether drift velocity
u is small compared to c, then the number of fringes shifted, n, is a function of the rotational angle θ of the apparatus, and is given by

$$n = \frac{l\,u^2}{\lambda c^2}\cos 2\theta \qquad (2.28)$$

Using l = 11 m and the wavelength of yellow light, and choosing the minimum value
u = 30 km/sec, one obtains

$$n = 0.2\cos 2\theta \qquad (2.29)$$

as the minimum predicted fringe shift versus rotation angle of the apparatus. Equation
(2.29) assumes, in effect, that at some point in its orbit about the sun, the earth must
have a motion through the ether at least as great as 30 km/sec. If, while the earth is
in this orbital position, an experiment is performed in which the apparatus is rotated
through 360 deg, the fringes in the field of the viewing telescope should undergo a
cyclical displacement whose amplitude is at least as great as four-tenths the distance
between adjacent fringes.
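The coefficient 0.2 and the four-tenths swing can be reproduced with a short calculation. This is a minimal sketch under one labeled assumption: "yellow light" is taken as 590 nm, a value the text does not give. The function name `fringe_shift` is illustrative.

```python
import math

C = 2.998e8  # speed of light, m/s


def fringe_shift(theta_deg, arm_length=11.0, u=30e3, wavelength=5.9e-7):
    """Fringe shift n = (l u^2 / (lambda c^2)) cos(2 theta), as in (2.28).

    arm_length and u are the values quoted in the text; the wavelength
    (590 nm for yellow light) is an ASSUMED value.
    """
    amplitude = arm_length * u**2 / (wavelength * C**2)
    return amplitude * math.cos(2.0 * math.radians(theta_deg))


if __name__ == "__main__":
    for theta in (0, 45, 90, 135, 180):
        print(f"theta = {theta:3d} deg   n = {fringe_shift(theta):+.3f}")
    # Swing between the extreme fringe positions over one rotation
    print("peak-to-peak:", fringe_shift(0) - fringe_shift(90))
```

The swing between extremes is twice the coefficient of the cosine, which is why an amplitude near 0.2 corresponds to a total displacement near four-tenths of a fringe.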
Michelson and Morley conducted trials during the period July 8-12, 1887, and
plotted their data against one-eighth of the minimum predictable fringe shift given by (2.29).
Their curves for daytime and nighttime observations are reproduced in Figure 2.7.
FIGURE 2.7 Michelson-Morley data for July 1887. Daytime and nighttime curves are shown against one-eighth of the minimum predicted fringe shift.

They estimated that the second harmonic of their experimental data was no greater
than 0.005 fringes and thus the maximum detected fringe shift was less than 1/40th of the
minimum predicted fringe shift. Of course, the possibility existed that in July, 1887, the
earth was nearly at rest in the ether, and thus Michelson and Morley concluded:

It is just possible that the resultant velocity (of the earth relative to the ether) at the
time of the observations was small though the chances are much against it. The experiment
will therefore be repeated at intervals of three months, and thus all uncertainty will be
avoided.

However, after completing the July 1887 trials, Michelson and Morley did not
return to this problem. The completion of the ether drift experiment for all epochs was
finally accomplished by Dayton C. Miller, first in Cleveland and then at Mount Wilson,
during the years 1921 through 1926. The Cleveland data gave a null result comparable
in level to what had been obtained by Michelson and Morley, but considerable discussion was caused by the Mount Wilson data, because they seemed to indicate a small ether
drift, in that the observed fringe displacements were down to only about
one-thirteenth of the value predicted by the ether theory for a 30 km/sec velocity of
the earth in its orbit.
Miller's harmonic analysis of the data not only yielded a slight amplitude but also
a phase; however, the latter was incapable of being fitted into any logical relationship
corresponding to an oscillation of the north point during the course of a sidereal day.
This anomaly cast some doubt on the interpretation of the results and led to a critical
review of the data using statistical methods. The conclusion was reached that the small
observed second harmonic in the Mount Wilson experiment was not due to ether drift
but rather could be accounted for by temperature effects.¹¹

TABLE 2.1
TRIALS OF THE MICHELSON-MORLEY EXPERIMENT

Observer                     Year      Place                     l, cm   2l/λ (u/c)², fringe   A, fringe   Ratio
Michelson [a]                1881      Potsdam                     120   0.04                  0.01            2
Michelson and Morley [b]     1887      Cleveland                 1,100   0.40                  0.005          40
Morley and Miller [c]        1902-04   Cleveland                 3,220   1.13                  0.0073         80
Miller [d]                   1921      Mt. Wilson                3,200   1.12                  0.04           15
Miller [e]                   1923-24   Cleveland                 3,200   1.12                  0.015          40
Miller (sunlight) [f]        1924      Cleveland                 3,200   1.12                  0.007          80
Tomaschek (starlight) [g]    1924      Heidelberg                  860   0.3                   0.01           15
Miller [h]                   1925-26   Mt. Wilson                3,200   1.12                  0.044          13
Kennedy [i]                  1926      Pasadena and Mt. Wilson     200   0.07                  0.001          35
Illingworth [j]              1927      Pasadena                    200   0.07                  0.0002        175
Piccard and Stahel [k]       1927      Mt. Rigi                    280   0.13                  0.003          20
Michelson et al. [l]         1929      Mt. Wilson                2,590   0.9                   0.005          90
Joos [m]                     1930      Jena                      2,100   0.75                  0.001         375

a A. A. Michelson, Am. J. Sci. 22, 120 (1881); Phil. Mag. 13, 236 (1882).
b A. A. Michelson and E. W. Morley, Am. J. Sci. 34, 333 (1887); Phil. Mag. 24, 449 (1887).
c E. W. Morley and D. C. Miller, Phil. Mag. 9, 680 (1905); Proc. Am. Acad. Arts Sci. 41, 321 (1905).
d D. C. Miller, Data sheets of Observations December 9 to 11, 1921 (unpublished).
e D. C. Miller, Observations, August 23 to September 4, 1923; June 27 to July 26, 1924 (unpublished).
f D. C. Miller, "Observations with Sunlight on July 8 to 9, 1924," Proc. Natl. Acad. Sci. 11, 311 (1925).
g R. Tomaschek, Ann. d. Physik 73, 105 (1924).
h D. C. Miller, Revs. Modern Phys. 5, 203 (1933).
i R. J. Kennedy, Proc. Natl. Acad. Sci. 12, 621 (1926); Astrophys. J. 68, 367 (1928).
j K. K. Illingworth, Phys. Rev. 30, 692 (1927).
k A. Piccard and E. Stahel, Compt. rend. 183, 420 (1926); 184, 152, 451 (1927); 185, 1198 (1927); J. phys. radium 8, 56 (1927).
l Michelson, Pease, and Pearson, Nature 123, 88 (1929); J. Opt. Soc. Am. 18, 181 (1929).
m G. Joos, Ann. Physik 7, 385 (1930); Naturwiss. 38, 784 (1931).
Many other investigators have repeated the Michelson-Morley experiment, and a
summary of the various trials is given in Table 2.1 with the appropriate
journal references listed underneath.¹² In response to various objections to the original
experiment, several of the parameters were varied; sunlight and starlight were substituted for the terrestrial source, mountain-top installations were used to minimize a
possible "ether drag" over the surface of the earth, and one experiment was even
performed in a balloon.
In all these trials, the two arms of the interferometer were equal, the length being as
listed in Column 4 of Table 2.1. Column 5 gives the minimum predicted shift at some
time of year, based on the earth's orbital speed of 30 km/sec, and is twice the amplitude
of the corresponding second harmonic. Column 6 lists the amplitude A of the second
harmonic of fringe shift actually found by each observer. The last column gives the
ratio of the minimum predicted second harmonic to that actually observed. For many
of the trials this ratio is large enough that clearly a null result for the Michelson-Morley
experiment can be accepted with confidence.
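The relation among the last three columns of Table 2.1 can be made explicit with a short check. The sketch below takes the predicted shift (Column 5) and the observed amplitude A (Column 6) for each trial, halves the predicted shift to obtain the predicted second-harmonic amplitude, and compares the resulting ratio with the one listed. The helper name and the row layout are illustrative. The listed ratios appear to be rounded, so only approximate agreement should be expected.

```python
# Rows transcribed from Table 2.1:
# (observer, predicted shift [Column 5], observed amplitude A [Column 6], listed ratio)
TABLE_2_1 = [
    ("Michelson 1881",              0.04, 0.01,   2),
    ("Michelson and Morley 1887",   0.40, 0.005,  40),
    ("Morley and Miller 1902-04",   1.13, 0.0073, 80),
    ("Miller 1921",                 1.12, 0.04,   15),
    ("Miller 1923-24",              1.12, 0.015,  40),
    ("Miller (sunlight) 1924",      1.12, 0.007,  80),
    ("Tomaschek (starlight) 1924",  0.3,  0.01,   15),
    ("Miller 1925-26",              1.12, 0.044,  13),
    ("Kennedy 1926",                0.07, 0.001,  35),
    ("Illingworth 1927",            0.07, 0.0002, 175),
    ("Piccard and Stahel 1927",     0.13, 0.003,  20),
    ("Michelson et al. 1929",       0.9,  0.005,  90),
    ("Joos 1930",                   0.75, 0.001,  375),
]


def predicted_to_observed(predicted_shift, observed_amplitude):
    """Predicted second-harmonic amplitude (half of Column 5) divided by observed amplitude A."""
    return (predicted_shift / 2.0) / observed_amplitude


if __name__ == "__main__":
    for name, predicted, observed, listed in TABLE_2_1:
        computed = predicted_to_observed(predicted, observed)
        print(f"{name:30s} computed {computed:7.1f}   listed {listed}")
```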

2.6 ETHER DRAG

The negative result of the Michelson-Morley experiment was totally unexpected and
very perplexing. If one presumes that there is a luminiferous ether in which the velocity
of light is c in all directions, the results of this experiment suggest that the light from a
distant star sweeps past an observer on earth at this velocity c regardless of where the
earth happens to be in its orbit, and thus regardless of the earth's velocity relative to
the ether. But this contradicts all common-sense knowledge of the law of addition of
velocities, embodied in Equations (2.18).
To phrase this problem more specifically, suppose that when the earth is at a certain
point in its orbit, Cartesian axes XYZ are constructed so that the earth is instantaneously at rest in XYZ and so that the X axis lies along the earth's orbit. Then for the
incoming starlight,

$$\dot{x} = c_x \qquad \dot{y} = c_y \qquad \dot{z} = c_z$$

and

$$c = (c_x^2 + c_y^2 + c_z^2)^{1/2}$$

is the speed of the starlight in XYZ. Six months later, another Cartesian frame can be
constructed whose X' axis slides along the original X axis at speed 2v, with v the earth's
orbital velocity. The earth will be instantaneously at rest in X'Y'Z', and according to
(2.18)

$$\dot{x}' = c_x - 2v \qquad \dot{y}' = c_y \qquad \dot{z}' = c_z$$
11 R. S. Shankland, S. W. McCuskey, F. C. Leone, and G. Kuerti, "New Analysis of the Interferometer Observations of Dayton C. Miller," Rev Mod Phys, 27, 167-178; April 1955.
12 This table is reproduced with the kind permission of Messrs. Shankland, McCuskey, Leone, and
Kuerti and is taken from their paper, Ibid.; 168.

so that

$$c' = \left[(c_x - 2v)^2 + c_y^2 + c_z^2\right]^{1/2} \neq c$$

Thus the velocity of the starlight should be different at the two orbital positions. But
the results of the Michelson-Morley experiment do not reveal this difference. It is as
though the ether were caught up by the earth's atmosphere and dragged along with it,
thus accounting for the apparent constancy of the velocity of light in all directions
within the earth's atmosphere for all positions in the earth's orbit.
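The size of the difference predicted by the Galilean velocity addition law can be illustrated numerically. This is a minimal sketch: the function `galilean_speed` and the two sample directions of incoming starlight are illustrative choices, not taken from the text.

```python
import math

C = 2.998e8      # speed of light in the ether frame XYZ, m/s
V_ORBIT = 30e3   # orbital speed of the earth, m/s (value quoted in the text)


def galilean_speed(cx, cy, cz, frame_speed):
    """Speed of light in a frame sliding along X at frame_speed, by Galilean addition."""
    return math.sqrt((cx - frame_speed) ** 2 + cy**2 + cz**2)


if __name__ == "__main__":
    # Starlight travelling along the X axis (illustrative direction)
    along = galilean_speed(C, 0.0, 0.0, 2 * V_ORBIT)
    # Starlight travelling perpendicular to the X axis (illustrative direction)
    across = galilean_speed(0.0, C, 0.0, 2 * V_ORBIT)
    print("c' along X  minus c (m/s):", along - C)
    print("c' across X minus c (m/s):", across - C)
```

Along the orbit the predicted change is first order in v/c, whereas across it the change is only second order. In neither case does the Galilean law predict the constancy suggested by the experiment.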
However, this concept of "ether drag" suffers a fatal objection when Bradley's discovery of aberration is recalled. If the ether were dragged along by the earth's atmosphere, then the setting of a telescope would not have to be altered to compensate for
the earth's orbital velocity and there would be no aberration in the position of any star.

2.7 THE LORENTZ-FITZGERALD CONTRACTION HYPOTHESIS

Another attempt to preserve the ether hypothesis but remain consistent with the
negative result of the Michelson-Morley experiment was made by Lorentz.¹³ He
postulated that, as a result of its speed u through the ether, a material body is contracted by the factor $(1 - u^2/c^2)^{1/2}$ in the direction of its motion. This means, in the
Michelson-Morley experiment, that if $L_1$ and $L_2$ are the lengths of the arms of the interferometer when it is at rest in the ether, then under the conditions depicted by Figure 2.4, $l_2 = L_2(1 - u^2/c^2)^{1/2}$ and $l_1 = L_1$. After the apparatus has been rotated 90 deg,
$l_2 = L_2$ and $l_1 = L_1(1 - u^2/c^2)^{1/2}$. Making these substitutions in (2.24) and (2.25) gives

$$n = \frac{\Delta' - \Delta}{2\pi} = 0 \qquad (2.30)$$

which would account neatly for why no fringe shift was observed by Michelson and
Morley.
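The cancellation can be verified numerically. The sketch below is illustrative only. It assumes the classical round-trip travel times in the ether picture (a factor 1/(1 - u²/c²) for the arm parallel to the drift and 1/(1 - u²/c²)^{1/2} for the perpendicular arm), equal rest lengths of 11 m, and a wavelength of 590 nm. All names and the numerical values other than 11 m and 30 km/sec are assumptions.

```python
import math

C = 2.998e8  # speed of light, m/s


def round_trip_cycles(length, u, wavelength, parallel):
    """Number of wavelengths in a round trip along one arm, in the ether picture."""
    b2 = (u / C) ** 2
    stretch = 1.0 / (1.0 - b2) if parallel else 1.0 / math.sqrt(1.0 - b2)
    return 2.0 * length * stretch / wavelength


def fringe_shift_on_rotation(rest_1, rest_2, u, wavelength, contraction):
    """Fringe shift for a 90-deg rotation, with or without the contraction hypothesis."""
    k = math.sqrt(1.0 - (u / C) ** 2) if contraction else 1.0
    # Before rotation: arm 1 perpendicular to the drift, arm 2 parallel to it.
    before = (round_trip_cycles(rest_1, u, wavelength, parallel=False)
              - round_trip_cycles(rest_2 * k, u, wavelength, parallel=True))
    # After rotation: arm 1 parallel to the drift, arm 2 perpendicular to it.
    after = (round_trip_cycles(rest_1 * k, u, wavelength, parallel=True)
             - round_trip_cycles(rest_2, u, wavelength, parallel=False))
    return after - before


if __name__ == "__main__":
    args = (11.0, 11.0, 30e3, 5.9e-7)  # rest lengths (m), u (m/s), ASSUMED wavelength (m)
    print("no contraction:  ", fringe_shift_on_rotation(*args, contraction=False))
    print("with contraction:", fringe_shift_on_rotation(*args, contraction=True))
```

With the contraction applied, only floating-point rounding separates the result from zero, for any choice of the two rest lengths.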
In a footnote Lorentz acknowledges that this possibility had occurred independently
to FitzGerald, who apparently had limited his discussion of the idea to lectures to his
students and had not published his speculations. The length contraction hypothesis is
customarily identified with the names of both these men.
This proposal was sternly criticized by Poincaré, who objected to an ad hoc hypothesis, without experimental basis, designed to explain why one cannot detect the presence
of something else which has been hypothesized, namely the ether. Nevertheless, the proposal
was taken seriously by others, and Kennedy devised a modification of the Michelson-Morley experiment capable of testing the Lorentz-FitzGerald contraction hypothesis.¹⁴
Assuming that (2.30) is correct, if $l_1 = l_2$ (as was intended by Michelson and Morley
in the original experiment), Δ would remain constant at zero even if u, the speed of the
earth through the ether, were to change. Kennedy, with the assistance of Thorndike,
constructed an interferometer in which $l_1 - l_2$ was as great as the coherence of the
source would permit, attaining a value $l_1 - l_2$ = 318 mm. Then, instead of rotating the
13 H. A. Lorentz, "Michelson's Interference Experiment," Versuch einer Theorie der elektrischen und
optischen Erscheinungen in bewegten Körpern, Sections 89-92, Leyden, 1895. (An English translation
appears in The Principle of Relativity, Dover Publications, Inc., New York, 1958.)
14 R. J. Kennedy and E. M. Thorndike, "Experimental Establishment of the Relativity of Time,"
Phys Rev, 42, 400-418; November 1932.


apparatus, he held it fixed to see if there were any variation in Δ as the earth's speed
through the ether changed.
If it is assumed that the sun is gliding through the ether at a velocity $\mathbf{v}_s$, that the
center of the earth is moving along its orbit at a velocity $\mathbf{v}_e$ relative to the sun, and that
a point on the surface of the earth has an instantaneous velocity $\mathbf{v}_r$ relative to the
earth's center, then

$$u_1^2 = (\mathbf{v}_s + \mathbf{v}_e + \mathbf{v}_r)^2$$

is the square of the instantaneous speed of this terrestrial point through the ether.
Twelve hours later it has changed to

$$u_2^2 = (\mathbf{v}_s + \mathbf{v}_e - \mathbf{v}_r)^2$$

whereas six months later it becomes

$$u_3^2 = (\mathbf{v}_s - \mathbf{v}_e + \mathbf{v}_r)^2$$

The fringe shift noted by not rotating the apparatus, but taking readings 12 hours
apart should be, according to (2.30),

$$n_{21} = \frac{\Delta_2 - \Delta_1}{2\pi} = \frac{2(L_1 - L_2)}{\lambda}\left[(1 - u_2^2/c^2)^{-1/2} - (1 - u_1^2/c^2)^{-1/2}\right]$$

Similarly, the fringe shift noted by not rotating the apparatus but taking readings six
months apart should be

$$n_{31} = \frac{\Delta_3 - \Delta_1}{2\pi} = \frac{2(L_1 - L_2)}{\lambda}\left[(1 - u_3^2/c^2)^{-1/2} - (1 - u_1^2/c^2)^{-1/2}\right]$$

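If $u_1^2$, $u_2^2$, and $u_3^2$ are all small relative to $c^2$, these expressions reduce to

$$n_{21} = \frac{(L_1 - L_2)(u_2^2 - u_1^2)}{\lambda c^2} \qquad (2.31)$$

$$n_{31} = \frac{(L_1 - L_2)(u_3^2 - u_1^2)}{\lambda c^2} \qquad (2.32)$$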

By using a precise photographic technique, Kennedy and Thorndike were able to


detect a fringe shift as small as 1 0100 th of the spacing between adjacent fringes. This
was almost two orders of magnitude more sensitive than the original Michelson and
Morley technique, and compensated for the lowered sensitivity in the length factor.
Thus expressions (2.31) and (2.32) for the case of the Kennedy-Thorndike experiment
give an overall sensitivity comparable to that arising from (2.27) in the case of the
Michelson-Morley experiment.
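To give a feeling for the magnitudes involved, the small-velocity limit of the two fringe-shift expressions can be evaluated for sample velocities. The sketch below rests on several labeled assumptions that are not in the text: the three velocities are taken as collinear, the sun is taken to be at rest in the ether, the rotational speed of the laboratory is taken as 0.4 km/sec, and the wavelength is taken as 546.1 nm. The function names are illustrative.

```python
C = 2.998e8  # speed of light, m/s


def speed_squared(v_sun, v_orbit, v_rot):
    """Square of the net speed through the ether, ASSUMING collinear velocities (m/s)."""
    return (v_sun + v_orbit + v_rot) ** 2


def kt_fringe_shift(arm_difference, u_first_sq, u_second_sq, wavelength):
    """Small-velocity fringe shift between two readings taken with a fixed apparatus."""
    return arm_difference * (u_second_sq - u_first_sq) / (wavelength * C**2)


ARM_DIFFERENCE = 0.318   # m, value quoted in the text
WAVELENGTH = 5.461e-7    # m, ASSUMED
V_ORBIT = 30e3           # m/s, value quoted in the text
V_ROT = 0.4e3            # m/s, ASSUMED rotational speed of the laboratory
V_SUN = 0.0              # m/s, HYPOTHETICAL: sun at rest in the ether

u1_sq = speed_squared(V_SUN, V_ORBIT, V_ROT)
u2_sq = speed_squared(V_SUN, V_ORBIT, -V_ROT)   # twelve hours later
u3_sq = speed_squared(V_SUN, -V_ORBIT, V_ROT)   # six months later

n21 = kt_fringe_shift(ARM_DIFFERENCE, u1_sq, u2_sq, WAVELENGTH)
n31 = kt_fringe_shift(ARM_DIFFERENCE, u1_sq, u3_sq, WAVELENGTH)

if __name__ == "__main__":
    print("n21 (12 hours apart):", n21)
    print("n31 (six months apart):", n31)
```

A nonzero speed of the sun through the ether would enlarge both shifts, so the values obtained under these assumptions represent the least favorable case for detection.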
The result of the analysis of 300 exposures of the fringe pattern photographed at the
viewing telescope of the Kennedy-Thorndike apparatus once again gave a null result
within the limit of experimental error. As a result of this experiment, it is reasonable
to conclude that the Lorentz-FitzGerald contraction hypothesis put forth as a means
of explaining the null result of the Michelson-Morley experiment, while still preserving
the ether concept, is invalid.


2.8 EMISSION THEORIES


Several other explanations of the Michelson-Morley null result were attempted,
involving the assumption that the velocity of light was the same in all directions in a
coordinate system in which the source was at rest. These emission theories, as they were
called, differed from each other in that they predicted different results when the light
was reflected from a moving mirror. After reflection the three alternatives were that the
velocity of the light (1) remain c relative to the source, (2) become c relative to the
mirror, or (3) become c relative to the mirror image of the source.
The first alternative predicts complications in the interference pattern in the
Michelson-Morley experiment when extraterrestrial sources are used, but the results
of Miller using sunlight did not reveal any such effects. The second and third alternatives lead to coherence difficulties with reflected light, and all three emission theories
are inconsistent with the findings of de Sitter, previously mentioned, that the velocity
of light is independent of the motion of the source. Thus these attempted explanations
had to be rejected along with the Lorentz-FitzGerald contraction hypothesis and the
assumption of ether drag.
Classical physics had reached an impasse. The laws of mechanics seemed to obey a
relativity principle via the Galilean transformation. The velocity of light also seemed
to obey a relativity principle in that it appeared to be the same in all coordinate systems in vacuo. Still this was incompatible with the velocity addition law (2.18) arising
from the Galilean transformation. The ether hypothesis, which had at first seemed so
promising, had not been established after several decades of brilliant experimental
research. Clearly a new approach to the problem was needed. This was provided
by Einstein, who concentrated his attention on a reexamination of the basic principles involved in velocity determinations, namely, the measurement of space intervals
and time intervals. Realization that neither was an invariant led to a resolution of the
impasse and to a satisfactory modification of the Galilean transformation equations.
2.9 THE INTERDEPENDENCE OF SPACE AND TIME

In the introduction to his first paper on this subject Einstein said:¹⁵


. . . The same laws of electrodynamics and optics will be valid for all frames of reference for which the equations of mechanics hold good. We will raise this conjecture (the
purport of which will hereafter be called the 'Principle of Relativity') to the status of a
postulate, and also introduce another postulate, which is only apparently irreconcilable
with the former, namely, that light is always propagated in empty space with a definite
velocity c which is independent of the state of motion of the emitting body. These two
postulates suffice for the attainment of a simple and consistent theory . . . . The introduction of a 'luminiferous ether' will prove to be superfluous inasmuch as the view here to
be developed will not require an 'absolutely stationary space' provided with special properties . . . .

Einstein thus accepted the principle of relativity in its broadest sense, postulating
that all the laws of physics take the same form in every inertial system of coordinates.
15 A. Einstein, "On the Electrodynamics of Moving Bodies," Ann Phys, 17, 891-921; 1905. (An English translation can be found in The Principle of Relativity, Dover Publications, Inc., New York,
1958.)


He further adopted the view that light waves, like sound waves, have a propagation
velocity which is unaffected by the motion of the source.† In discarding as superfluous
the concept of an ether, Einstein thus also accepted the notion that a light wave traveling through empty space will pass two different observers at the same speed c even
if these observers are in motion relative to each other.
Acceptance of the second postulate together with elimination of the ether concept
automatically explains the null result of the Michelson-Morley experiment. It also
leads to a modification of the Galilean transformation equations and therefore to a
modification of the velocity transformation law, and thus ultimately to a satisfactory
explanation of the Fizeau experiment. However, before proceeding to these developments it is desirable to consider the implications of the second postulate with respect
to the nature of space and time. The conclusions to be drawn will appear surprising
on first inspection because they are contrary to common experience, and it is this facet
of special relativity which often causes the greatest initial difficulty. Common experience
develops the ingrained belief that time and space are totally different and unconnected;
once this belief is successfully challenged, the remainder of the special theory of relativity follows logically and without great difficulty.
It takes only a simple example to challenge this belief. The one to be presented here
consists of a sequence of experiments designed to establish the lengths of rulers under
various conditions of motion. Some aspects of these experiments are not completely
practical, but could perhaps be made so by a modest amount of elaboration. However,
the experiments are completely logical, which is all that is essential.
That which follows will be developed in what might seem to be overly great detail.
However, it is concerned with the crux of the dilemma involving light velocity and the
velocity transformation law, and a thorough understanding at this stage will greatly
facilitate all subsequent developments.
Two Rulers at Rest.
Imagine two long slender rulers, R and R', perfectly straight
and rigid and laid out side by side on the ground, at rest in an inertial coordinate
system. Three observers, who will be designated as O, O', and O'', are in the process of
determining if these rulers are precisely the same length. They do this by lining up the
rulers so that they are parallel and flush at one end, and then seeing if they are flush
at the other end also. Having satisfied themselves that such is the case, the three
observers then establish midpoints on each of the rulers, by the use of standard techniques such as the construction of the perpendicular bisector or the employment of
an auxiliary third ruler of half-length. They have no difficulty doing all this because
the two rulers are at rest side by side.
The Length of a Moving Ruler.
Next imagine that one of these rulers, say R, is
parallel to the ground and just above it, but is now in motion with respect to the ground
at a constant velocity. Let this velocity be parallel to the ground, but at an arbitrary
angle with respect to the long dimension of the ruler R. This situation is depicted in
Figure 2.8. Is the length of the moving ruler R the same as when it was at rest on the
ground?
To decide this question, one must first establish an operational definition of length
which is applicable to situations involving motion. A suitable definition is embodied in

† It is interesting to note that Einstein postulated this eight years before de Sitter's confirming
observations of the light arriving from binary stars.


FIGURE 2.8 The moving ruler R. Ruler R' is stationary on the ground; ruler R moves just above the ground at constant velocity v; points P_A and P_B are on the ground, underneath the two ends of R at a common time.

the following technique of measurement: Let an observer stationary with respect to
the ground determine a fixed point $P_A$ on the ground directly under one end A of the
moving ruler at a specific time t. Similarly, let him determine a fixed point $P_B$ on the
ground directly under the other end B of the moving ruler at the same time t. These
two points fixed on the ground become a permanent record, and at their leisure, ground
observers can reposition the other ruler R' so that one of its ends coincides with $P_A$; if
its second end coincides with $P_B$, the moving ruler R may be said to have a length
which is unchanged from the value it had when the two rulers were at rest side by side.
The crucial feature in this technique of dynamic length measurement is the requirement that the two fixed points $P_A$ and $P_B$ on the ground be determined at precisely the
same time. To ensure this the three observers O, O', and O'' can equip opposite ends of
each ruler with small, insulated charged probes of unlike electrical charge. Thus if one
refers again to Figure 2.8, it can be imagined that the ruler ends labeled A and A' contain positively electrified probes and the ruler ends labeled B and B' contain negatively
electrified probes. A detail of one of these probes is suggested in Figure 2.9a.
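Why the two ground marks must be made at the same time can be seen from a simple kinematic sketch. It assumes, for illustration only, that the ruler moves along its own length (the text allows an arbitrary angle) and that the two ends are marked at times that may differ. The function name and the sample numbers are hypothetical.

```python
def marked_separation(ruler_length, speed, t_mark_a, t_mark_b):
    """Distance between ground marks P_A and P_B, all quantities in the ground frame.

    ASSUMES the ruler moves along its own length at constant speed, with
    end B trailing at x = speed * t and end A leading at x = ruler_length + speed * t.
    """
    x_b = speed * t_mark_b                  # position of end B when P_B is marked
    x_a = ruler_length + speed * t_mark_a   # position of end A when P_A is marked
    return x_a - x_b


if __name__ == "__main__":
    # Marks made at the same instant recover the ruler's length in the ground frame.
    print(marked_separation(2.0, 10.0, t_mark_a=1.0, t_mark_b=1.0))
    # A timing mismatch of one millisecond misplaces the marks by speed * mismatch.
    print(marked_separation(2.0, 10.0, t_mark_a=1.001, t_mark_b=1.0))
```

Any timing mismatch enters the result multiplied by the ruler's speed, which is why the probes and light pulses are introduced to certify that the two coincidences are simultaneous.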
Ruler R' can then be placed on the ground, with its charged probes pointing up, in
approximately the position above which ruler R is expected to pass, as indicated in
Figure 2.9b. If the two rulers are still of equal length, and R' has been properly positioned
on the ground,† as R passes by with its probes pointing down, there is an instant at
which the negatively charged B probe is directly above the positively charged A' probe,
as suggested by Figure 2.9c. This coincidence causes an intense local field, resulting in
an arc of short duration, which is the source of a light pulse. At this same instant, the
positively charged A probe is directly above the negatively charged B' probe, and this
coincidence is the source of another light pulse. If O' is stationed at the midpoint of the
stationary ruler R', these two light pulses will reach him simultaneously. Conversely,
from the single observation that two light pulses reach him at the same time, O' will
deduce that A and B' did momentarily coincide, that A' and B did also momentarily
coincide, that these coincidences occurred at the same time, and thus that the two
rulers are still of equal length, even though R is now moving, whereas R' is stationary.

FIGURE 2.9 Details of the ruler experiment. (a) Probe construction. (b) Moving ruler R approaching coincidence with stationary ruler R'. (c) Coincidence of two probes.
If the two rulers are no longer of equal length, the probes at A' and B' may be displaced
equal amounts away from (or toward) the midpoint of R'. The ruler R' can then be
placed in a variety of positions on the ground in the hope of determining a position
which will cause, for some pass of the ruler R, two light pulses to reach the midpoint
of R' simultaneously. Ultimately, a placement of R' and a separation of its two probes
will be found such as to cause the simultaneous arrival of two light pulses at the midpoint of R'. The distance separating the two probes on R' is then the length of the
moving ruler R.
So far, a technique for determining the length of the moving ruler R has been developed, but the question still has not been answered as to whether R has a dynamic
length which is the same as its stationary length. To settle this question, the three
observers O, O', and O'' devise two symmetrical experiments.
Relative Motion Perpendicular to Length.
In the first of these symmetrical experiments, O stations himself at the midpoint of the ruler R and takes off on a space
journey. Similarly, O' stations himself at the midpoint of the ruler R' and takes off on
a space journey. The flight paths and positions of the two rulers are mirror images
with respect to a vertical plane through the position occupied by O'', who will be
assumed to remain at rest in the original inertial coordinate system X''Y''Z''. These
flight paths will be such that O and O' arrange to encounter each other in outer space
in a region remote from all other bodies. They coast past each other on immediately
adjacent and parallel paths at a constant relative speed u, during a period of time when
neither ruler is accelerating and each ruler is perpendicular to the path.
Figure 2.10 indicates a sequence of positions of the two rulers as "seen" by O''. By
symmetry, O'' must observe that the two rulers are still of equal length and thus that
A and B' will coincide, as will A' and B, these coincidences occurring simultaneously.
When A and B' are precisely opposite each other, the positively charged probe at A and
the negatively charged probe at B' interact to generate a light pulse. Similarly, an arc as
the ends A' and B pass each other gives rise to another light pulse. The generation of
these light pulses is evidence that the rulers are still of equal length. Yet O'' cannot conclude from this that either ruler is now the same length that it was when at rest relative
to him because now both rulers are in motion relative to him.
However, either O or O' is in a position to judge this question, since each is stationary
with respect to one of the rulers. For example, during the period of encounter, observer
O senses no acceleration and can consider himself and the ruler R to be at rest in an
inertial coordinate system XYZ. Under Einstein's first postulate, all of the laws of
nature should be equally applicable in XYZ and the coordinate system X''Y''Z'' which
O formerly shared with O''. In particular, O has no reason to believe that the length of R
is now any different from what it was when he and R were both at rest in X''Y''Z''.
Therefore since O detects the two pulses caused by the passage of R', he concludes that
not only is R' still the same length as R, but also that it is still the same length as it had
been when at rest relative to himself. Similar remarks can be made about the observations of O'.

† It may take many trials of the experiment to determine this position, with R always making the
same pass.

FIGURE 2.10 Moving ruler experiment: transverse motion. Positions (a) and (b) of rulers R and R', with observers O and O' at their midpoints, as seen by O''.

This result may be obtained in another way. Assume that when the rulers pass each
other, no light pulses occur, indicating that the rulers are now of dissimilar length. Then
let O move each of his probes the same small amount closer to the midpoint of R; let O'


extend each of his probes the same small amount further from the midpoint of R'; and
let the experiment be re-run. If this procedure is repeated until positions of the probes
are found which cause light pulses, then both O and O' can say, for example, that R is
the longer of the two rulers. But this result is impossible. The experiment is completely
symmetrical. If O thinks R' is longer, O' must think R is longer. The only symmetrical
answer possible is that no adjustment of the probe positions was needed, and that they
both think both rulers are the same length, unchanged from the value when each was
at rest relative to O''.
Therefore the conclusion is reached that when a body is in motion relative to an
observer, the measurement of length transverse to that motion is unaffected by the
motion, being the same as when the body is at rest relative to the observer.
Relative Motion Parallel to Length.
Now let the space journeys of 0 and 0' on
the rulers Rand R', respectively, be repeated in all original details except that during
the period of encounter the t\VO rulers are oriented parallel to the paths. In addition,
and 0' each now takes along a clock, the two clocks having been determined to be
identical when at rest relative to 0".
Figure 2.11 indicates a sequence of positions of the two rulers as "seen" by O". When
the ends A and A' pass each other, nothing happens. But as A and B' are precisely opposite
each other, the positively charged probe at A and the negatively charged probe at
B' interact to generate a light pulse. Similarly an arc as the ends A' and B pass each
other gives rise to another light pulse.
As O" views this sequence, the two rulers are moving in opposite directions at equal
speeds, and by symmetry the AB' coincidence occurs at the same time as the A'B
coincidence. The two pulses of light originate simultaneously and spread uniformly
thereafter as spherical wavefronts traveling at the velocity c. The centers of these
spherical wavefronts are the fixed points P1 and P2, indicated in position (c) of Figure 2.11.
Because the velocity of light is finite, O" "sees" the two rulers separating as
the wavefronts grow and thus finds that the AB' pulse passes O before it passes O',
whereas the A'B pulse passes O' before it passes O.
Although O" can conclude that the two rulers are still the same length as each other,
he cannot conclude that either of them is still the length it was when at rest relative to
him because once again they are both in motion relative to him.
However, either O or O' is in a position to judge this question, since each is stationary
relative to one of the rulers. For example, during the period of encounter, observer O
senses no acceleration and can consider himself and the ruler R to be at rest in an
inertial coordinate system XYZ. Using Einstein's second postulate, O knows that the
velocity of light is independent of the motion of the source and equal to a constant c in
all directions in XYZ. Thus since he knows himself to be at the midpoint, that one pulse
of light originated at one fixed end of his ruler and that the other pulse of light originated
at the other fixed end of his ruler, if these two pulses reach him simultaneously,
he will conclude that the arcs occurred simultaneously. This would mean to O that the
coincidences of AB' and A'B were simultaneous and thus that the ruler R' was still the
same length as his own.
Does O receive the two light pulses at the same time? Although this would be contrary
to the observation made by O", let it nevertheless be assumed that he does. This
would mean that O concludes that the two rulers are still the same length and therefore
that O' was directly opposite O at the instant of origination of the two light pulses. Due
to the finite velocity of light, at the later instant when O receives these two pulses, O' is
already beyond O and the AB' pulse has not yet reached O' whereas the A'B pulse has
already passed him.
But this result is patently impossible. The experiment is completely symmetrical
with respect to O and O' and observer O cannot receive the pulses simultaneously unless
O' does also. Thus the two light pulses do not arrive at O at the same time and O no
longer thinks the two rulers are the same length.

FIGURE 2.11 Moving ruler experiment-longitudinal motion. [Panels (a), (b), and (c) show successive positions of the two rulers as "seen" by O"; panel (c) includes the spherical wavefront of the A'B pulse.]

In what order do the two light pulses reach O'? If one takes the observations of O"
as a hint, one can assume that the AB' pulse reaches O an interval of time δt ahead of
the A'B pulse, as measured by the clock O brought along with him. This implies that
the A'B pulse reaches O' sooner than the AB' pulse, say by an interval of time δt', as
measured by the clock of O'. When this assumption is adjusted so that δt = δt', the
result is completely symmetrical, as required.
An important conclusion has been reached. Observer O now feels that the ruler R' is
shorter than the ruler R by an amount u δt. Since he is still at rest relative to R, he has
no reason to believe that the length of his own ruler is any different from what it had
been originally when at rest in the coordinate system X"Y"Z". Thus he concludes that
the length of R' depends on its motion relative to him, when that motion is in a direction
parallel to its length.
In like manner observer O' now feels that the ruler R is shorter than the ruler R' by
an equal amount u δt', and therefore that the length of R depends on its motion relative
to him, when that motion is in a direction parallel to its length.
It is clear that the two observers O and O' are no longer in agreement about measurements
of distance, and that this is occasioned by their relative motion. Furthermore, the
two observers are not in agreement about the measurement of time intervals, for O
thinks that the AB' coincidence occurred first, whereas O' thinks that the A'B coincidence
occurred first. Neither observer has any reason to believe that his own clock is
behaving in a different manner from when they were together at rest in X"Y"Z". Thus
each observer concludes that the other's clock has been affected by its motion relative
to him.
The temptation exists to raise the protest that the true picture of what is happening
is the sequence shown in Figure 2.11, as "seen" by O". If O and O' would only take their
motions into account, they could readily explain the time differentials in the arrival of
the two pulses and deduce that the two rulers are really still the same length. But this
point of view puts O" and his coordinate system in a privileged status. Why shouldn't
O and O' each have the right to consider himself at rest in a coordinate system in which
all the laws of physics are valid in the same form they have in the coordinate system
of O"? If this first postulate by Einstein is accepted as reasonable, then O' can properly
consider himself and his ruler to be at rest in a reference frame X'Y'Z' and to be measuring
the length of R as it drifts by; he legitimately concludes that the measurement
of the length of R reveals a smaller value due to its relative motion.
These remarks can be made with equal validity when discussing O and his right to
consider himself at rest in a reference frame XYZ. Both observers conclude that the
other ruler is shorter, and both are correct, this surprising result being a consequence of
the operational definition of length enunciated earlier.
It has been noted previously that O" concludes that the two rulers are still the same
length as each other, and he is also correct; his conclusion is due to the fact that both
rulers have the same speed relative to him. O" cannot say that either ruler is the same
length it was when back on the ground at rest in front of him; to decide this, he would
have to perform an experiment of the type just concluded by O and O'. Were he to
perform such an experiment, he would find that the length of each ruler was now less,
and by the same amount, thus accounting for the fact that the two lengths are still
equal.
Thus the conclusion is reached that when a body is in motion relative to an observer,
the measurement of length parallel to that motion is affected by the motion, being
shorter than when the body is at rest relative to the observer.
General Remarks.
The previous set of hypothetical experiments, or Gedankenexperimente, reveal that
space and time, upon being considered on an operational basis
involving measurements, are not invariant concepts. If a material body has a constant
velocity relative to an observer O, and he measures its longitudinal and transverse
dimensions, only his transverse results will agree with those obtained by an observer O'
who is stationary with respect to the body.
What makes this conclusion seem suspect is that in everyday experience one does
not perceive objects apparently shrinking as they take on a relative motion. Airplanes
are not noticeably shorter as they race down a runway, nor does a train seem to extend
its length as it draws to a stop at a station. The reason for this apparent invariance of
length can be traced to the fact that the velocity of light is so great compared to the
velocities of all material objects in one's common experience. In the ruler experiment
just discussed, if c ≫ u, the time intervals δt and δt' are so small as to escape detection
when ruler sizes consistent with one's "common sense" are assumed. Thus the change
in length u δt is normally so small as to be unobservable; however in considering astronomical
distances or great velocities this is no longer true and the effect is detectable
and significant.
Likewise, when one motors past a tower clock, the movement of its hands does not
appear altered by the relative motion. Here again this is due to the great disparity
between the velocity of light and the normal velocities of motor vehicles. Thus in the
ruler experiment just discussed, the size of δt and δt' is an index of the difference between
the readings of the clocks of O and O' as they record two events, the coincidences of
ends of the rulers. But for normal velocities u, and normal ruler lengths, δt and δt' are
negligible and the two clocks appear to agree. It is only when great velocities or distances
come into play that the differences in the readings of these clocks become
important.
For this reason one can appreciate that the discovery that measurements of time and
distance depend on relative motion in no way upsets the large body of common experience
built up during one's lifetime. Nevertheless, it is of the utmost scientific importance
to recognize that these effects exist. This can only be done by considering situations
beyond common experience in which such effects are significant. The purpose of
much of the remainder of this chapter will be to explore such situations.
The reader may wonder why light signals were chosen in these ruler experiments
rather than some other means of communication. The reason for this is that only light
signals can propagate in a vacuum and if there were any other medium surrounding the
two rulers it would have a relative motion with respect to each ruler. In general these
relative motions would not be the same, thus destroying the complete symmetry of the
experiment, a crucial point in the argument.
From consideration of the ruler experiment involving longitudinal motion, it is evident
that the change in length of a moving ruler, given by u δt, is dependent on clock
readings. The interval of time δt between the AB' and A'B coincidences is in turn
dependent on the length of the rulers, so that measurements of time intervals and
space increments are interdependent.
Although two observers in relative motion will not, in general, agree about the distance
between two points nor about the time interval between two events, this does not
mean that each cannot predict the values of the other's measurements from a knowledge
of his own. Such predictions are accomplished via the coordinate transformation
equations which link the two frames of reference in which the two observers are stationary.
It is now apparent that the Galilean Equations (2.4), which assume a time
invariance and predict a length invariance, are only approximate, and will need to be
modified in order to find the proper way to link the observations of O and O' so as to
be consistent with situations such as the two foregoing ruler experiments.

2.10 THE LORENTZ TRANSFORMATION

A satisfactory modification of the Galilean transformation can be accomplished by
returning to the ruler experiment involving longitudinal motion. Upon referring again
to the situation of Figure 2.11, one can let the origin of an XYZ coordinate system be
affixed to the tip of the probe at B and can let the origin of an X'Y'Z' coordinate system
be placed at the tip of the probe at A', as shown in Figure 2.12. These two coordinate
systems then have a relative speed u; the axes can be aligned so that X' and X
slide along each other, with the Y' and Y axes and the Z' and Z axes respectively parallel,
thus duplicating the situation of Figure 2.1. Let it further be assumed that
observer O, who is stationary in XYZ, selects his time origin so that t = 0 corresponds
to the A'B coincidence; in like manner O', who is stationary in X'Y'Z', will
be assumed to have chosen his time origin so that t' = 0 corresponds to the A'B coincidence,
thus causing the two origins to coincide when t = t' = 0.

FIGURE 2.12 Coordinate systems fixed on each ruler.

It will be imagined that conceptually O determines a unique triplet of numbers
(x,y,z) for every point in XYZ space by laying out identical scales (e.g., in meters)
along his three axes, and that similarly O' determines a unique triplet of numbers
(x',y',z') for every point in X'Y'Z' space by laying out identical scales along his
three axes. It is further assumed that O and O' lay out these scales using the same
standard of length (e.g., a meter stick). By this it is meant that if O measures lengths
in terms of a ruler R marked in meters and at rest in XYZ, and if O' measures lengths
in terms of a ruler R' marked in meters and at rest in X'Y'Z', then if these two rulers
were brought to rest side by side, markings one meter apart on R would coincide with
markings one meter apart on R'.
Additionally, it will be necessary for each observer, O and O', to measure time
unambiguously at every point in his coordinate system. To this end it will be conceived
that O has an inexhaustible supply of identical clocks, such that he has been
able to station one clock permanently at each point in XYZ. To ascertain that all
of these clocks are set properly and running at the same rate, O can then select one of
them as the reference and perform the following experiment: O places himself at the
reference clock and stations an auxiliary observer O1 at the clock to be synchronized.
O sends out a pulse of light at time t_a on the reference clock, directing it toward O1, who
reflects it back by means of a mirror. The returned pulse of light reaches O at time t_b.
The clock where O1 is stationed was set properly if it read (t_a + t_b)/2 at the instant the
light pulse reached the mirror. It is running at the proper rate if it proves to be set
properly every time O and O1 choose to perform this experiment.
In this manner every clock in XYZ can be synchronized to the reference clock, and
thus to every other clock in XYZ. It will be assumed that this has been done, and this
will be the conception of time in the frame of reference XYZ.
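The synchronization test just described reduces to a single comparison, and a short numerical sketch may make it concrete. It is illustrative only: the function names, the 300 m separation, and the sample clock readings are assumptions introduced here and do not come from the text.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def expected_reading(t_a, t_b):
    """Reading a properly set remote clock shows when the pulse hits the mirror.

    t_a: emission time on the reference clock
    t_b: return time on the reference clock
    """
    return (t_a + t_b) / 2.0


def clock_error(t_a, t_b, remote_reading):
    """Amount by which the remote clock is ahead (+) or behind (-)."""
    return remote_reading - expected_reading(t_a, t_b)


# Hypothetical numbers: the remote clock sits 300 m from the reference clock.
distance = 300.0
t_a = 0.0
t_b = t_a + 2.0 * distance / C  # round trip at speed c

# A remote clock reading distance / C at reflection is set properly ...
error_good = clock_error(t_a, t_b, distance / C)
# ... while one reading 5 nsec more than that is 5 nsec ahead.
error_fast = clock_error(t_a, t_b, distance / C + 5e-9)
```

Repeating the same check at later times, as the text requires, would test the rate of the remote clock as well as its setting.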
Likewise, it can be conceived that O' has an inexhaustible supply of identical clocks
which he has arrayed at fixed points in X'Y'Z' and which he has synchronized by the
same procedure. It will be further assumed that if these two sets of clocks were brought
to rest relative to each other, they would be found to be identical and running at the
same rate.
With these concepts of spatial position and time, let an event be defined for
observer O as something which happens at a point P(x,y,z) at time t, or more briefly
at the "point" P(x,y,z,t). The same event will occur for observer O' at the "point"
P'(x',y',z',t').
Returning now to a consideration of the pulse of light caused by the coincidence of
A' and B, imagine that O has stationed an auxiliary observer O1 at the fixed point
(x,y,z) and that O1 records the event that this light pulse passes him as having occurred
at time t. Then it follows that O can characterize this event by the equation

$$x^2 + y^2 + z^2 = (ct)^2 \tag{2.33}$$

Imagine further that O' has stationed an auxiliary observer O1' at the fixed point
(x',y',z') and that O1 and O1' just happen to coincide at the instant the light pulse
passes. O1' records the event as having occurred at a time t', and O' can write

$$(x')^2 + (y')^2 + (z')^2 = (ct')^2 \tag{2.34}$$

The transformation equations which link the observations in XYZ to those in X'Y'Z'
must be such that O can derive (2.34) from (2.33) and such that O' can derive (2.33)
from (2.34), since they are describing the same event.
The discussions of the previous section have already provided much information
about this transformation. For example, observers O and O' agree about distances in
the transverse directions and can write

$$y' = y \qquad z' = z \tag{2.35}$$

Further, time intervals and spatial increments were found to be interdependent when
considering measurements in the longitudinal direction. Since every motion that
is uniform and rectilinear in XYZ must also appear uniform and rectilinear in X'Y'Z',
the transformation from (x,t) to (x',t') takes straight lines into straight lines
and is therefore linear. It follows that

$$x' = \alpha_1 x + \alpha_2 t \tag{2.36}$$

$$t' = \alpha_3 x + \alpha_4 t \tag{2.37}$$

The absence of constant terms in these two equations is due to the fact that
(x' = 0, t' = 0) corresponds to (x = 0, t = 0). The problem now remains to evaluate
the four coefficients α1, α2, α3, and α4.
First of all, note that if a point P'(x',y',z') is fixed with respect to observer O', this
point appears to be moving in the positive X direction at speed u when observed by O.
For such a point, taking differentials of (2.36) gives

$$dx' = 0 = \alpha_1\,dx + \alpha_2\,dt$$

But in this case dx/dt = u so that

$$\alpha_2 = -\alpha_1 \frac{dx}{dt} = -u\alpha_1$$

and (2.36) can be rewritten

$$x' = \alpha_1 (x - ut) \tag{2.38}$$

The remaining three constants, α1, α3, and α4, can be determined by requiring that
(2.33) and (2.34) transform into each other. Thus if Equations (2.35), (2.37), and (2.38)
are substituted into (2.34), one obtains

$$\alpha_1^2 x^2 - 2\alpha_1^2 uxt + \alpha_1^2 u^2 t^2 + y^2 + z^2 = \alpha_3^2 c^2 x^2 + 2\alpha_3\alpha_4 c^2 xt + \alpha_4^2 c^2 t^2$$

Since this must agree with (2.33) for all values of x, y, z, and t, it follows that

$$\alpha_1^2 - \alpha_3^2 c^2 = 1$$

$$2\alpha_1^2 u + 2\alpha_3\alpha_4 c^2 = 0$$

$$\alpha_4^2 c^2 - \alpha_1^2 u^2 = c^2$$

Solution of these three equations gives

$$\alpha_1^2 = \alpha_4^2 = (1 - u^2/c^2)^{-1} \qquad \alpha_3 = -\frac{\alpha_1 u}{c^2}$$

which yields the result

$$x' = \kappa(x - ut) \qquad y' = y \qquad z' = z \qquad t' = \kappa(t - ux/c^2) \tag{2.39}$$

with

$$\kappa = (1 - u^2/c^2)^{-1/2}$$

These important equations were derived by Einstein in his 1905 paper using an argument
which has been reproduced in its essentials. They are commonly called the
Lorentz transformation equations, so named by Poincare in honor of H. Lorentz,
who had derived them earlier (1903) under a different set of hypotheses.†
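The requirement that the transformation carry the light-sphere condition (2.33) into (2.34) can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are invented for illustration, and the sample event and the speed u = 0.6 are arbitrary values expressed in units where c = 1.

```python
import math


def kappa(u, c=1.0):
    """The factor (1 - u^2/c^2)^(-1/2) of the Lorentz transformation."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def lorentz(event, u, c=1.0):
    """Map an event (x, y, z, t) in XYZ to (x', y', z', t') in X'Y'Z'."""
    x, y, z, t = event
    k = kappa(u, c)
    return (k * (x - u * t), y, z, k * (t - u * x / c**2))


def light_sphere_residual(event, c=1.0):
    """x^2 + y^2 + z^2 - (ct)^2, which vanishes for events on the wavefront."""
    x, y, z, t = event
    return x * x + y * y + z * z - (c * t) ** 2


# Assumed sample values, in units where c = 1: the event (3, 4, 12, 13)
# lies on the wavefront because 3^2 + 4^2 + 12^2 = 13^2.
event = (3.0, 4.0, 12.0, 13.0)
event_primed = lorentz(event, u=0.6)

residual = light_sphere_residual(event)
residual_primed = light_sphere_residual(event_primed)
```

With these values the transformed event is approximately (-6, 4, 12, 14), which again satisfies the light-sphere condition, as the derivation requires.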
If u and the range of the variable x are both small compared to the velocity of light c,
Equations (2.39) reduce to

$$x' \approx x - ut \qquad y' = y \qquad z' = z \qquad t' \approx t$$

which is essentially the Galilean transformation (2.17). Thus for velocities and distances
encountered in common experience the Lorentz transformation can be approximated
with negligible error by the Galilean transformation, a conclusion which is consistent
with the discussion of the previous section.

† These equations actually had been used even earlier by Voigt (1887) in connection with vibrating
motion. Lorentz in his development assumed the existence of an ether, the physical contraction of
bodies due to their motion through the ether, and required that Maxwell's equations transform
properly.
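How small the difference between the two transformations is at everyday speeds can be seen from a quick numerical comparison. This is only a sketch: the function names and the sample values (a vehicle at 30 m/sec, an event 1 km away at t = 10 sec, and a contrasting speed of 0.6c) are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def lorentz_xt(x, t, u, c=C):
    """Longitudinal part of the Lorentz transformation: returns (x', t')."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return k * (x - u * t), k * (t - u * x / c**2)


def galilean_xt(x, t, u):
    """Galilean counterpart: x' = x - ut, t' = t."""
    return x - u * t, t


# Assumed everyday values: relative speed 30 m/s, event at x = 1 km, t = 10 s.
x, t, u = 1000.0, 10.0, 30.0
x_lor, t_lor = lorentz_xt(x, t, u)
x_gal, t_gal = galilean_xt(x, t, u)
dx_slow = abs(x_lor - x_gal)  # far below anything measurable
dt_slow = abs(t_lor - t_gal)

# The same event viewed at u = 0.6c shows a large disagreement in x'.
u_fast = 0.6 * C
dx_fast = abs(lorentz_xt(x, t, u_fast)[0] - galilean_xt(x, t, u_fast)[0])
```

At 30 m/sec the two descriptions agree to well within a nanometer and a nanosecond, while at 0.6c the Galilean value of x' is off by a large fraction of the distance involved.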
Equations (2.39) can be inverted to give the Lorentz transformation proceeding the
other way, namely

$$x = \kappa(x' + ut') \qquad y = y' \qquad z = z' \qquad t = \kappa(t' + ux'/c^2) \tag{2.40}$$

Note that the only difference between (2.40) and (2.39) is the sign of u. But this is to
be expected; if X'Y'Z' is advancing along the X axis at velocity u, then XYZ is receding
along the X' axis at velocity -u.
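The statement that the inverse transformation differs only in the sign of u lends itself to a direct check. The sketch below uses hypothetical function names and an arbitrary sample event, in units where c = 1; it applies the forward transformation and then undoes it in two equivalent ways.

```python
import math


def kappa(u, c=1.0):
    """The factor (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def to_primed(event, u, c=1.0):
    """Forward transformation: XYZ coordinates to X'Y'Z' coordinates."""
    x, y, z, t = event
    k = kappa(u, c)
    return (k * (x - u * t), y, z, k * (t - u * x / c**2))


def to_unprimed(event_primed, u, c=1.0):
    """Inverse transformation: X'Y'Z' coordinates back to XYZ coordinates."""
    xp, yp, zp, tp = event_primed
    k = kappa(u, c)
    return (k * (xp + u * tp), yp, zp, k * (tp + u * xp / c**2))


# Assumed sample event and speed, in units where c = 1.
event = (2.0, -1.0, 5.0, 7.0)
u = 0.8

round_trip = to_unprimed(to_primed(event, u), u)
# Reusing the forward formula with -u gives the same inverse.
sign_flip = to_primed(to_primed(event, u), -u)
```

Both `round_trip` and `sign_flip` recover the original event to within rounding error.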
A more general form of the Lorentz equations could be obtained by introducing
a third Cartesian coordinate system X*Y*Z* at rest with respect to XYZ but with
its axes tilted in an arbitrary way with respect to those of XYZ. This has the effect
of letting X'Y'Z' move through X*Y*Z* in an arbitrary direction. Even greater
generality could then be obtained by selecting a fourth frame X'*Y'*Z'* arbitrarily
tilted and displaced (statically) with respect to X'Y'Z'. The result would be that the
equations connecting X*Y*Z* and X'*Y'*Z'* form the most general Lorentz transformation,
corresponding to the most general Galilean transformation (2.4). However, no loss
in generality will occur from confining one's attention to the simpler Lorentz transformation
(2.39), since the transformations from XYZ to X*Y*Z* and from X'Y'Z' to
X'*Y'*Z'* are static, and therefore Galilean. This discussion parallels the remarks of
Section 2.2.
The Lorentz transformation equations can be looked upon as the means whereby one
links the two quartets of numbers P(x,y,z,t) and P'(x',y',z',t') which identify the
same event. This process has wide applicability since many physical phenomena can be
expressed in terms of events. For example, the progression of a mass particle along a
path can be thought of as a continuous sequence of events. P(x,y,z,t) traces this progression
as seen by O, with the spatial variables continuous functions of the temporal
variable. The progression of this same particle as seen by O' can be deduced through use
of the Lorentz equations.
It should be noted that the transformations (2.39) and (2.40) are nonphysical for
u ≥ c.

2.11 LENGTH AND TIME UNDER THE LORENTZ TRANSFORMATION

It is now possible to give a quantitative interpretation of the second ruler experiment
of Section 2.9 in terms of the Lorentz transformation. Let the two ends of the ruler R'
be at x1' and x2'. These spatial coordinates are independent of t' and observer O' can say
that the length of R' is

$$l^0_{R'} = x_2' - x_1'$$

If observer O wishes to measure the length of R', since it is in motion with respect to
him, he should measure its end coordinates x2 and x1 at a common time t. Using the
first of Equations (2.39), one can then write

$$x_1' = \kappa(x_1 - ut) \qquad x_2' = \kappa(x_2 - ut)$$

$$x_2' - x_1' = \kappa(x_2 - x_1)$$

from which

$$l_{R'} = x_2 - x_1 = l^0_{R'}\,(1 - u^2/c^2)^{1/2} \tag{2.41}$$

in which l_R' is the length of the ruler R', as determined by O. One could similarly
investigate the length of the ruler R using the first of Equations (2.40) and conclude
that R appears contracted to O' by the same factor.
Equation (2.41) is seen to be exactly the Lorentz-FitzGerald contraction formula.
However, it is to be remembered that the Lorentz contraction hypothesis included an
ether-filled space which did not contract, there being rather a physical contraction of
material bodies as they moved through the ether. Experiment proved this hypothesis
to be untenable. The interpretation to be placed on (2.41) is that the distance between
two points in one coordinate system appears to be contracted to an observer in relative
motion parallel to the line connecting these two points, whether a material body is present
or not. This is not an apparent contraction of material bodies alone, but of all of space;
as mentioned earlier, it is an effect caused by the operational definition of the measurement
of length. If u ≪ c, this contraction is insignificant unless the length itself is very
great. Two widely different examples serve to point up this effect.
EXAMPLE 2.1

The vehicular tunnel under Mont Blanc connecting France and Italy is 11.2 km long.
How much shorter does this tunnel appear to a motorist driving through it at 100 kph?
Equation (2.41) is applicable to this situation, and the first two terms of a power series
expansion give

$$l_{R'} = l^0_{R'}\left(1 - \frac{1}{2}\frac{u^2}{c^2} + \cdots\right)$$

$$\Delta l = l^0_{R'} - l_{R'} \approx \frac{l^0_{R'}}{2}\frac{u^2}{c^2} = \frac{11.2}{2}\left(\frac{10^5/3600}{3 \times 10^8}\right)^2 = 4.8 \times 10^{-14}\ \text{km} = 0.000000048\ \text{mm}$$
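The arithmetic of Example 2.1 can be reproduced in a few lines. This is a sketch with assumed names; it also illustrates why the series form is convenient, since subtracting two nearly equal lengths directly in floating point would lose most of the significant digits. The function below therefore uses the algebraically equivalent form 1 - sqrt(1 - b) = b / (1 + sqrt(1 - b)).

```python
import math

C = 3.0e8  # speed of light in m/s, rounded as in the example


def shortening(rest_length, u, c=C):
    """Return l0 - l0*sqrt(1 - u^2/c^2) without subtracting nearly equal numbers.

    Uses the identity 1 - sqrt(1 - b) = b / (1 + sqrt(1 - b)) with b = u^2/c^2.
    """
    b = (u / c) ** 2
    return rest_length * b / (1.0 + math.sqrt(1.0 - b))


tunnel_km = 11.2
u = 100.0e3 / 3600.0  # 100 kph expressed in m/s

delta_km = shortening(tunnel_km, u)
delta_mm = delta_km * 1.0e6  # 1 km = 1e6 mm
series_km = 0.5 * tunnel_km * (u / C) ** 2  # two-term series estimate
```

The exact and series values agree to many digits here, both giving roughly 4.8 x 10^-14 km.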
EXAMPLE 2.2

Sirius, the brightest star in the heavens, is estimated to be 8.5 light years from Earth. If a
group of space travelers were to journey from Earth to Sirius, having achieved a velocity
of 0.90c relative to the solar system before cutting out their rocket motors, how far away
from Earth would Sirius seem to them?
To these observers the distance would appear contracted, being given by

$$d' = 8.5\,[1 - (0.90)^2]^{1/2} = 3.7\ \text{light years}$$

which is far from being an insignificant contraction.
Since this segment of length is going past them at 0.90c, the space travelers compute
that the journey will take them a period of time

$$T' = \frac{3.7}{0.90} = 4.1\ \text{years}$$

An observer back on Earth will estimate that the journey will consume an amount of time

$$T = \frac{8.5}{0.90} = 9.44\ \text{years}$$

This disparity is due to the fact that the two sets of observers also disagree about time
intervals because of their relative motion. The disparity is large because of the high velocity
and the great distance involved.
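The numbers in Example 2.2 follow from two divisions and one square root, as the short sketch below shows. The variable names are assumptions made for illustration; distances are in light years and times in years, so speeds appear as fractions of c.

```python
import math

distance_ly = 8.5  # Earth-Sirius distance in the solar-system frame, light years
beta = 0.90        # ship speed as a fraction of c

# Distance as judged by the travelers, contracted along the line of motion.
contracted_ly = distance_ly * math.sqrt(1.0 - beta**2)

t_travelers = contracted_ly / beta  # years elapsed on the ship clocks
t_earth = distance_ly / beta        # years elapsed on Earth clocks

# The two journey times differ by the factor (1 - beta^2)^(-1/2).
ratio = t_earth / t_travelers
```

The ratio of the two journey times is the same factor that governs the length contraction, which anticipates the time dilatation result derived next.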

The preceding example indicated a situation in which two observers in relative
motion would disagree about the time interval between two events. This phenomenon
can be treated more generally by considering a particular clock in X'Y'Z' which
remains at the fixed coordinates (x',y',z') and is thus being passed by a sequence of
XYZ clocks. One can define a first event when the hands of this single X'Y'Z' clock
indicate the time t1' and a second event when its hands indicate the time t2'.
In XYZ, the first event will occur at the spatial position

$$x_1 = \kappa(x' + ut_1') \qquad y = y' \qquad z = z'$$

these equations resulting from the application of (2.40). The XYZ clock at this position
registers the time of the first event as

$$t_1 = \kappa\left(t_1' + \frac{u}{c^2}x'\right)$$

Similarly, in XYZ, the second event will occur at the spatial position

$$x_2 = \kappa(x' + ut_2') \qquad y = y' \qquad z = z'$$

and the XYZ clock at this position registers the time of the second event as

$$t_2 = \kappa\left(t_2' + \frac{u}{c^2}x'\right)$$

From this it follows that

$$t_2 - t_1 = \frac{t_2' - t_1'}{(1 - u^2/c^2)^{1/2}} > t_2' - t_1' \tag{2.42}$$

Consider this result first from the viewpoint of O', who is stationary beside the single
X'Y'Z' clock. He watches a succession of XYZ clocks go by and can take only a single
reading of each of them. However, he notices that they seem to be set progressively
further and further ahead, thus accounting for the inequality in (2.42).
On the other hand, observer O can take a sequence of readings of the X'Y'Z' clock
as it passes a succession of XYZ clocks. Since he knows these clocks are all synchronized,
he concludes that the rate of the X'Y'Z' clock is slowed by its relative motion.
These results are symmetrical and the same conclusions could be reached if a single
XYZ clock were considered to be passing a succession of X'Y'Z' clocks. Thus it can be
concluded that when the readings of a succession of moving clocks are compared with
those of a single stationary clock, successive moving clocks appear to be set further and
further ahead; when the readings of a single moving clock are compared with those of
a family of stationary clocks, the moving clock appears to be running slow. This effect
is known as time dilatation and is given quantitatively by (2.42).
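The two-event construction used to reach (2.42) can be traced numerically. In the sketch below the function name and all sample values are assumptions (units with c = 1, a single X'Y'Z' clock fixed at x' = 5 and read at t' = 0 and t' = 4); the inverse transformation gives the position and reading of the XYZ clock that is adjacent at each event.

```python
import math


def xyz_reading(x_primed, t_primed, u, c=1.0):
    """Position and time in XYZ of an event at (x', t'), via the inverse transformation."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return k * (x_primed + u * t_primed), k * (t_primed + u * x_primed / c**2)


# Assumed values, units with c = 1: one X'Y'Z' clock fixed at x' = 5,
# read when it shows t1' = 0 and again when it shows t2' = 4.
u, x_p = 0.6, 5.0
x1, t1 = xyz_reading(x_p, 0.0, u)
x2, t2 = xyz_reading(x_p, 4.0, u)

elapsed_xyz = t2 - t1  # read from two different XYZ clocks
elapsed_by_2_42 = (4.0 - 0.0) / math.sqrt(1.0 - u**2)  # right side of (2.42)
```

The two events occur at different XYZ positions (x2 exceeds x1), so the interval of 5 units is necessarily read from two separate XYZ clocks, while the single moving clock advanced by only 4 units.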
EXAMPLE 2.3

Direct experimental evidence of the time dilatation effect exists. For example, the lifetimes
of π mesons have been studied both for the case of mesons at rest in the laboratory,
and for the case when they are in motion relative to the laboratory. π mesons are unstable
and they decay into a μ meson and a neutrino, obeying the exponential law

$$N = N_0 e^{-t/\tau} \tag{2.43}$$

when at rest in the laboratory. In Equation (2.43), N0 is the number of π mesons existing
at time t = 0 and N is the number surviving at a later time t; e is the base for natural
logarithms and τ is the characteristic lifetime of the decay process. Several experimenters [16, 17]
have established the average value τ = 2.56 × 10^-8 sec for π+ mesons at rest.
The decay in a beam of π+ mesons traveling at 0.755c relative to the laboratory has
also been studied. [18] By passing the beam through a series of counters, and noting the relative
numbers of counts in successive counters as a function of the separation distance between
counters, it was established that the separation distance needed to be 8.43 m in order to
have the fourth counter register the passage of only 1/e as many π+ mesons as did the
third counter.
In the laboratory frame of reference, it takes the meson beam

$$t = \frac{8.43}{0.755c} = 3.72 \times 10^{-8}\ \text{sec}$$

to travel this distance. If this value for t is inserted in (2.43), it yields the prediction that
the fourth counter should be down from the third by 1/1.57e, a value which is 36 percent
lower than the experimental results.
The difficulty lies in the fact that (2.43) is valid only in a frame of reference in which the
mesons are at rest. For an observer traveling along with the meson beam, the time interval
between passage of the third and fourth counters is only

$$t' = 3.72 \times 10^{-8}\,[1 - (0.755)^2]^{1/2} = 2.44 \times 10^{-8}\ \text{sec}$$

If this value is inserted for t in (2.43), excellent agreement between prediction and experiment results.
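The comparison made in Example 2.3 can be reproduced directly from the quoted figures. The sketch below uses assumed variable names and the rounded value c = 3 x 10^8 m/sec; it evaluates the decay law once with the laboratory travel time and once with the time elapsed in the meson frame, and compares each prediction with the observed ratio 1/e.

```python
import math

C = 3.0e8       # m/s, rounded
TAU = 2.56e-8   # sec, characteristic lifetime for mesons at rest
BETA = 0.755    # beam speed as a fraction of c
SPACING = 8.43  # m, counter separation giving a 1/e drop

t_lab = SPACING / (BETA * C)                # travel time in the laboratory frame
t_meson = t_lab * math.sqrt(1.0 - BETA**2)  # same interval in the meson frame

surviving_naive = math.exp(-t_lab / TAU)      # decay law applied with lab time
surviving_dilated = math.exp(-t_meson / TAU)  # decay law applied with meson time
observed = math.exp(-1.0)                     # measured ratio, 1/e

naive_over_observed = surviving_naive / observed
dilated_over_observed = surviving_dilated / observed
```

With these inputs the laboratory-time prediction comes out near 0.64 of the observed ratio, matching the 36 percent shortfall quoted above, while the meson-frame prediction lies within a few percent of it.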
EXAMPLE

2.4

The Mossbauer effect.' 9 has also provided a graphic illustration of time dilatation. A source
consisting of the radioactive isotope cobalt 57, which has a convenient half-life of 280 days,
was plated on to the surface of a 0.8-cnl diameter iron cylinder as shown in the figure. 2o
This cylinder was rigidly mounted between t\VO aluminum plates which also held a cylindrical shell of lucite. The latter was 13.28 ern in diam, 0.31 em thick, and concentric with
the iron cylinder. An iron foil enriched in Fe 57 was glued to the inside surface of the lucite
shell. This assembly was mounted on a shaft and rotated at angular velocities as great as
3,000 rad/sec. A xenon-filled proportional counter was placed near the assembly, just
beyond an intervening lead shield, as shown in the diagram.
As the cobalt 57 nuclei decay, they change into excited nuclei of iron 57. These iron
nuclei emit gamma rays at a frequency Vo = 3 X 10 18, and these rays can be directed into
16 1\1. Jakobson, A. Schulz, and J. Steinberger, "Detection of Positive 7r Mesons by 7r+ Decay," Phys
Rev, 81, 894-895; l\1arch 1, 1951.
17 C. E. Wiegand, "Measurement of the Positive 7r Meson Lifetime," Phys Rev, 83, 1085-1090; September 15, 1951.
18 R. P. Durbin, H. II. Loar, and \V. W. Havens, Jr., "The Lifetimes of the 1r+ and 7r- Mesons," Phys
Rev, 88, 179-183; October 15, 1952.
191l. L. Mossbauer, "Fluorescent Nuclear Resonance of Gamma Radiation in Iridium 191," Z. Phys.
151, 124-143; 1958. (For an excellent explanation of the Mossbauer effect, see the article by Sergio de
Benedetti, Sci Amer, 202, 72-80; April 1960.)
20 H. J. Hay, J. P. Schiffer, T. E. Cranshaw, and P. A. Egelstaff, "Measurement of the Red Shift in an
Accelerated System Using the Mossbauer Effect in Fe S7 , " Phys Rev L, 4, 165; February 15, 1960.

SECTION

Length and Time Under the Lorentz Transformation

11

77

[After Hay, Schiffer, Cranshaw, and


Egelstaff, Phys Rev L, 4, 165; 1960.]

a beam aimed at the counter. However, the iron foil glued to the lucite shell, being enriched
with Fe 5 7 , can absorb these gamma rays and then reradiate them isotropically. This absorption effect is greatest when the source-absorber assembly is at rest, for then the quantum
energy levels have the same separation hlJo in source and absorber.
However. if the assembly is rotated, the source and absorber travel at different speeds
relative to the laboratory and thus the counting of time occurs at different rates in two
coordinate systems, in one of which the source is at rest, and in the other of which the absorber is at rest. An oscillation of period i in the source frame will appear to take a greater
time r' as measured by a clock in the absorber frame, the connection being

Since

T =

'

(1 -

) }2~ r'

(1 - ~ ~)
2 c2

1/ Vo it follows that
2

v' ~ lJo

U )
1(1 - 2 c2

in which Vi is the frequency of the photons from the cobalt source, as determined in a frame
at rest with respect to the Fe 57 absorber.
Since u is the relative speed of source and absorber, it follows that

w(R 2

R 1)

-~-----

w(6.64 - 0.4)
3 X 10 1 0

in which $\omega$ is the angular velocity of the assembly. Thus the change in frequency of the
gamma rays is approximately

$$\frac{\Delta\nu}{\nu_0} = \frac{\nu_0 - \nu'}{\nu_0} = \frac{1}{2}\frac{u^2}{c^2} = 0.065\,\omega^2$$

[Graph: counter reading versus angular velocity (rps). After Hay, Schiffer, Cranshaw, and
Egelstaff, Phys Rev L, 4, 165; 1960.]

But the absorption spectrum of Fe⁵⁷ is so sharp that the width of the resonance, or the line
width, is only one part in 10¹². In other words, if the incoming photons differ in frequency
from $\nu_0$ by as little as one part in 10¹², the absorption falls off markedly, and a lowered
absorption is evident at even smaller changes in frequency.
This lowered absorption plus isotropic scattering by the Fe⁵⁷ foil manifests itself by an
increased reading in the counter, since more of the original directed beam of gamma rays
gets through to the counter if less is absorbed and scattered. A plot of counter reading
versus angular velocity of the assembly is shown in the graph, and a theoretical curve
based on the absorption spectrum is included for comparison. The agreement between
theory and experiment is seen to be excellent. It is to be noted that this effect would not
be predicted by a theory which assumed time to be an invariant.

2.12

PROPER TIME AND PROPER DISTANCE

One of the cardinal Newtonian beliefs is the invariance of distance, and it has already
been seen in Section 2.2 that a Galilean transformation preserves this invariance.
Another ingrained belief is the invariance of time, and this assumption was necessary
to preserve the form of Newton's force law under a Galilean transformation. However,
it has just been noted that under a Lorentz transformation neither time nor distance is
an invariant.
Nevertheless, time intervals and space intervals may be combined to form a quantity
which is invariant with respect to a Lorentz transformation. Let $\tau_{12}^2$ and $(\tau'_{12})^2$ be
defined by the relations

$$\tau_{12}^2 = (t_2 - t_1)^2 - \frac{1}{c^2}\left[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2\right] \tag{2.44}$$

$$(\tau'_{12})^2 = (t'_2 - t'_1)^2 - \frac{1}{c^2}\left[(x'_2 - x'_1)^2 + (y'_2 - y'_1)^2 + (z'_2 - z'_1)^2\right] \tag{2.45}$$

By using either of the transformations (2.39) or (2.40) one can show that

$$\tau_{12}^2 = (\tau'_{12})^2 \tag{2.46}$$

and thus this quantity is an invariant.
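
The invariance can be checked numerically. The following is a minimal sketch, assuming the
usual form of the Lorentz transformation for relative motion along X, namely
$x' = \kappa(x - ut)$ and $t' = \kappa(t - ux/c^2)$ with $\kappa = (1 - u^2/c^2)^{-1/2}$. The function
names and the sample events are illustrative choices and do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def lorentz(x, y, z, t, u):
    """Transform an event from XYZ to X'Y'Z', where X'Y'Z' slides along X at speed u."""
    kappa = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return kappa * (x - u * t), y, z, kappa * (t - u * x / C**2)


def tau_squared(e1, e2):
    """(t2 - t1)^2 - [dx^2 + dy^2 + dz^2] / c^2 for two events given as (x, y, z, t)."""
    dx, dy, dz, dt = (b - a for a, b in zip(e1, e2))
    return dt**2 - (dx**2 + dy**2 + dz**2) / C**2


# Two arbitrary events (metres, seconds), chosen only for illustration.
event_1 = (0.0, 0.0, 0.0, 0.0)
event_2 = (4.0e8, 1.0e8, 0.0, 3.0)
u = 0.6 * C

before = tau_squared(event_1, event_2)
after = tau_squared(lorentz(*event_1, u), lorentz(*event_2, u))
print(before, after)  # the two values agree to rounding error
```

Repeating the comparison for other values of u below c gives the same agreement, which is
what the invariance of the combined quantity means in practice.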


To appreciate the physical significance of $\tau_{12}$ (the positive square root of $\tau_{12}^2$), it can
be recognized that if there exists a Lorentzian frame of reference XYZ in which two
events take place at the same spatial point, then $\tau_{12}$ is the time interval between these
two events as recorded by a single clock at rest at this spatial point in XYZ. For this
reason $\tau_{12}$ is called the proper time interval. In another frame of reference X'Y'Z',
$t'_2 - t'_1$ is measured by two different clocks because the events are not at the same
spatial point, and thus $t'_2 - t'_1$ is sometimes called the nonproper time interval. The
interdependence of space and time is clearly illustrated by (2.44) and (2.45).
It is not always possible to find a Lorentzian frame of reference in which two events
take place at the same spatial point. For imagine that in XYZ they take place at
$(x_1,y_1,z_1)$ and at $(x_2,y_2,z_2)$ at times $t_1$ and $t_2$ respectively, and that upon inserting
these values in (2.44) one finds that $\tau_{12}^2$ is negative. Then $\tau_{12}$ is imaginary for all
Lorentzian frames since it is an invariant. The trouble is that the two points $(x_1,y_1,z_1)$
and $(x_2,y_2,z_2)$ are widely enough separated in XYZ that even an X'Y'Z' frame going
at a relative speed $u \to c$ cannot cover the distance between $(x_1,y_1,z_1)$ and $(x_2,y_2,z_2)$
in so small a time interval as $t_2 - t_1$. Since Lorentzian frames of reference are physically
restricted to relative velocities u < c (for otherwise x' and t' would be imaginary) it
follows that only when $\tau_{12}$ is real is it possible to find a Lorentzian frame in which the
two events occur at the same spatial point. Whenever the value of $\tau_{12}$ is real, the
interval between the two events will be called timelike.
To accommodate situations in which $\tau_{12}$ is imaginary, a new quantity $\epsilon_{12}$ can be
defined by the relation

$$\epsilon_{12}^2 = -c^2\tau_{12}^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2(t_2 - t_1)^2 \tag{2.47}$$

from which it follows immediately that $\epsilon_{12}^2$ is an invariant, for

$$\epsilon_{12}^2 = (\epsilon'_{12})^2 \tag{2.48}$$

by virtue of (2.46). $\epsilon_{12}$ (the positive square root of $\epsilon_{12}^2$) is called the proper space interval,
because in a Lorentzian frame in which two events take place at the same time, $\epsilon_{12}$ is
the Cartesian distance between the two events. In another reference frame X'Y'Z',
$t'_2 - t'_1 \neq 0$, and the distance

$$\left[(x'_2 - x'_1)^2 + (y'_2 - y'_1)^2 + (z'_2 - z'_1)^2\right]^{1/2}$$

is sometimes called the nonproper space interval.


Whenever the value of $\epsilon_{12}$ is real, the interval between the two events will be called
spacelike. Except when $\tau_{12} = \epsilon_{12} = 0$, it is always possible to carry out a Lorentz
transformation to a new frame of reference in which either the two events occur at the same
spatial point ($\tau_{12}$ real) or at the same time ($\epsilon_{12}$ real) but not both. $\tau_{12} = \epsilon_{12} = 0$ is the
boundary between these possibilities, and corresponds to the situation in which the two
events can just be connected by a light ray which leaves the site of one event as it
occurs and arrives at the site of the other event as it occurs.
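
For events confined to the X axis, the classification and the frame speed that realizes it
can be sketched as follows. The sketch assumes the standard transformation
$x' = \kappa(x - ut)$, $t' = \kappa(t - ux/c^2)$; the function name and the label "lightlike" for the
boundary case are illustrative and are not used in the text.

```python
C = 299_792_458.0  # speed of light, m/s


def connecting_frame(dx, dt):
    """Classify two events separated by dx (along X) and dt, and return the frame speed u.

    timelike:  u = dx / dt is the speed of the frame in which both events share a position.
    spacelike: u = c^2 dt / dx is the speed of the frame in which both events are simultaneous.
    lightlike: the boundary case, for which no such frame exists (u is None).
    """
    tau2 = dt**2 - dx**2 / C**2
    if tau2 > 0:
        return "timelike", dx / dt
    if tau2 < 0:
        return "spacelike", C**2 * dt / dx
    return "lightlike", None


print(connecting_frame(1.0e8, 1.0))  # separation small enough to be covered at u < c
print(connecting_frame(6.0e8, 1.0))  # too far apart for any frame with u < c to cover
print(connecting_frame(C, 1.0))      # exactly connected by a light ray
```

In both the timelike and the spacelike case the returned speed is smaller than c, in line with
the statement that one of the two frames can always be found, but never both.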
These ideas can be given a simple pictorial representation if attention is confined to
events which happen along the X axis. Let an observer O be at $x_1$ at time $t_1$, and let
light signals traveling along the X axis in each direction pass through $x_1$ at $t_1$. The
tracks of these light signals in the XT plane are shown in Figure 2.13 as the lines AB
and CD. The equations for these lines arise from the condition $\tau_{12}^2 = \epsilon_{12}^2 = 0$, and are

$$x - x_1 = \pm c\,(t - t_1) \tag{2.49}$$
These two lines are thus the boundaries between spacelike intervals and timelike intervals.
For events which occur in the areas marked Future and Past, the interval between
such an event and the event $(x_1,t_1)$ is timelike. For events which occur in the areas
marked Present, the interval between such an event and the event $(x_1,t_1)$ is spacelike.
An event $P_3$ anywhere in the Future region is such that the observer O at $x_1$ still has
the opportunity to influence it causally, since he can send a signal over the distance
$|x_3 - x_1|$ at a velocity less than c and have it arrive there in less time than $t_3 - t_1$. An
event $P_4$ anywhere in the region labeled Past happened long enough ago that the
observer O could have learned about it via a signal traveling the distance $|x_4 - x_1|$ at
a velocity less than c and requiring a time interval less than $t_1 - t_4$.

FIGURE 2.13 The divisions of space-time.
However, an event $P_5$ anywhere in the region marked Present could be occurring
without observer O being aware of it, for a signal could not be sent in either direction
over the distance $|x_5 - x_1|$, traveling at a velocity no greater than c, and cover this
distance in so small a time interval as $|t_5 - t_1|$.
Newtonian mechanics can be viewed as a theory in which the velocity of light is
infinite, for then the Lorentz transformation is seen to reduce to the Galilean transformation.
This would have the effect on Figure 2.13 of making the lines AB and CD horizontal and
coincident. The Present would then be reduced to a single line of events
occurring at all positions x, but at the single time Now ($t_1$). There would be no spacelike
intervals, just timelike intervals. Special relativity has the consequence of enlarging
the domain of the Present at the expense of the Past and the Future.
EXAMPLE 2.5

Imagine that the X axis is selected to be pointing at the star Betelgeuse, and that this
star is at the position $x_5$. At the present time $t_1$ observer O, stationed at $x_1$, sees Betelgeuse
as it was at the earlier time $t_6$; that is, he sees the event $P_6$. Imagine that at a later time $t_5$,
Betelgeuse undergoes a supernova explosion, this being the event $P_5$. At time $t_1$ observer O
is not yet aware of this occurrence. However, as time goes on the crossed lines AB and CD
move vertically upward in the diagram of Figure 2.13. When they have shifted an amount
$t_5 - t_6$, the line CD will cross the event $P_5$ and observer O will become aware of the supernova.

Another enlightening geometrical construction results when one conceptually plots
events in the four-dimensional space (x,y,z,t). For example, the projection on the
XT plane of the history of a moving point might be the sequence of events shown as
the line PQ in Figure 2.14.
In this same plane one can show the axes X' and T'; the equations for these axes can
be obtained by setting t' and x' equal to zero in (2.40). There is no reason why X and T
should be shown orthogonal; if they are so shown, X' and T' most decidedly are not
orthogonal. The line PQ is known as the world line of the moving point and it has the
property of being the same for all Lorentzian frames, the latter differing in the directions
of their space and time axes on Figure 2.14. The locus of all the time axes is the
Past-Future area of Figure 2.13, whereas the locus of all space axes is the Present area.
A world line can follow one of the four axes in Figure 2.14, in which case length
contraction and the slowing of clocks can be deduced geometrically.

FIGURE 2.14 World lines.

2.13

VELOCITY

The general motion of a point, in which the spatial variables are continuous functions
of the temporal variable, can be traced in terms of differentials. From (2.39) and (2.40)
these are

$$dx' = \kappa(dx - u\,dt) \qquad dx = \kappa(dx' + u\,dt')$$

$$dy' = dy \qquad dy = dy'$$

$$dz' = dz \qquad dz = dz'$$

$$dt' = \kappa\left(dt - \frac{u}{c^2}\,dx\right) \qquad dt = \kappa\left(dt' + \frac{u}{c^2}\,dx'\right)$$

Ratios of these differentials may be formed to yield velocity components. For example

$$v'_x = \frac{dx'}{dt'} = \frac{dx - u\,dt}{dt - (u/c^2)\,dx} = \frac{dx/dt - u}{1 - (u/c^2)\,dx/dt} = \frac{v_x - u}{1 - uv_x/c^2}$$

Proceeding in this manner, one can derive the Lorentz velocity transformation equations, namely,

$$v'_x = \frac{v_x - u}{1 - uv_x/c^2} \qquad v_x = \frac{v'_x + u}{1 + uv'_x/c^2}$$

$$v'_y = \frac{v_y}{\kappa(1 - uv_x/c^2)} \qquad v_y = \frac{v'_y}{\kappa(1 + uv'_x/c^2)} \tag{2.50}$$

$$v'_z = \frac{v_z}{\kappa(1 - uv_x/c^2)} \qquad v_z = \frac{v'_z}{\kappa(1 + uv'_x/c^2)}$$


It may be noticed that the transformation one way differs from the transformation the
other way only in the sign of u. If u and v are small compared to c, Equations (2.50)
are approximated quite well by the Galilean velocity law (2.18).
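
The two properties just mentioned, the sign symmetry and the Galilean limit, can be checked
with a short sketch of the velocity transformation. It assumes $\kappa = (1 - u^2/c^2)^{-1/2}$;
the function name and the sample velocities are illustrative and are not taken from the text.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def transform_velocity(v, u):
    """Map velocity components (vx, vy, vz) in XYZ to X'Y'Z', which moves at speed u along X."""
    vx, vy, vz = v
    kappa = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    denom = 1.0 - u * vx / C**2
    return (vx - u) / denom, vy / (kappa * denom), vz / (kappa * denom)


v = (0.5 * C, 0.3 * C, 0.0)
u = 0.4 * C

v_primed = transform_velocity(v, u)
v_back = transform_velocity(v_primed, -u)  # the inverse differs only in the sign of u
print(v_primed)
print(v_back)  # reproduces v to rounding error

# Everyday speeds: the result is indistinguishable from the Galilean difference vx - u.
print(transform_velocity((30.0, 5.0, 0.0), 10.0))
```

The round trip recovers the starting components, and for speeds of tens of metres per second
the result matches simple subtraction of velocities to many significant figures.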
EXAMPLE 2.6

Let there be two particles moving along the X axis. As seen from the XYZ frame of reference,
let one particle have a velocity $v_x = v$ and let the other particle have a velocity $v_x = -v$.
What is their relative velocity?
To answer this question, let X'Y'Z' ride along with one particle by setting u = v. Then
from (2.50), the velocity of the other particle in X'Y'Z' is

$$v'_x = \frac{v_x - u}{1 - uv_x/c^2} = \frac{-v - v}{1 + v^2/c^2} = -\frac{2v}{1 + v^2/c^2}$$

For v small this yields the classic result $v'_x = -2v$. However, as $v \to c$, $v'_x \to -c$. Thus even
though in XYZ the two particles might be going in opposite directions with speeds each
of which approaches c relative to XYZ, their recessional velocities relative to each other
are still less than c.
For $v \geq c$ the entire analysis is improper because one cannot then put u = v in (2.50),
since the Lorentz transformation is nonphysical for $u \geq c$.
It can be concluded from the foregoing that if a particle is traveling at a velocity less
than that of light in one Lorentzian frame, it travels at a velocity less than that of light in
all Lorentzian frames.
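
A few numerical values make the trend of this example concrete. The sketch below evaluates
the relative velocity $-2v/(1 + v^2/c^2)$ with speeds expressed as fractions of c; the function
name and the sample speeds are illustrative choices.

```python
def relative_velocity(beta):
    """Velocity (in units of c) of the particle moving at -beta, seen from the one moving at +beta."""
    return -2.0 * beta / (1.0 + beta**2)


for beta in (0.001, 0.5, 0.9, 0.999):
    print(beta, relative_velocity(beta))
```

For the smallest speed the result is very nearly the classical value of minus twice the speed,
while for speeds close to c the magnitude approaches, but never reaches, unity.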

2.14

RELATIVISTIC INTERPRETATION OF THE FIZEAU EXPERIMENT

It will be recalled from Section 2.4 that Fizeau found the velocity of light in water to
be dependent on the motion of the water, this dependency being expressed by Equation (2.21). An explanation of this result based on an ether hypothesis had been made
earlier by Fresnel, who assumed part of the ether to be dragged along by the water.
A simpler explanation of Fizeau's data is possible in terms of the Lorentz velocity
transformation. Let XYZ and X'Y'Z' be two frames of reference such that X and X'
are aligned with the flow and X' is sliding along X at speed u. Then X'Y'Z' can be
chosen to be at rest relative to the water, resulting in XYZ being at rest relative to
the laboratory. In X'Y'Z' the velocity of the light waves as they pass through the
water is

$$v'_x = v_0 = \frac{c}{n} \tag{2.51}$$

in which n is the index of refraction.


If the appropriate equation from (2.50) is used, this velocity, as viewed from XYZ, is

$$v_x = \frac{c/n + u}{1 + u/cn} \tag{2.52}$$

Expansion of the denominator of (2.52) in a power series (cf. Mathematical Supplement) gives

$$v_x = \left(\frac{c}{n} + u\right)\left(1 - \frac{u}{cn} + \frac{u^2}{c^2n^2} - \cdots\right) = \frac{c}{n} + u - \frac{u}{n^2} - \frac{u^2}{cn} + \frac{u^2}{cn^3} + \cdots \tag{2.53}$$

Retaining only terms containing c in powers above $c^{-1}$ gives

$$v_x = v_0 + u\left(1 - \frac{1}{n^2}\right)$$

which is in agreement with (2.21). Neither Fizeau nor his followers had sufficient experimental sensitivity to detect the effect of the higher order terms in (2.53). Thus the
Fizeau experiment is completely consistent with the Lorentz velocity transformation.
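
The size of the neglected higher order terms can be estimated with a short sketch. Subtracting
c/n from the exact relativistic sum $(c/n + u)/(1 + u/cn)$ and simplifying gives the exact drag
$u(1 - 1/n^2)/(1 + u/cn)$, which is used below because it avoids subtracting two nearly equal
large numbers. The index of refraction and flow speed are assumed illustrative values, not
Fizeau's data, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def drag_exact(n, u):
    """Exact change in the speed of light caused by the flow: (c/n + u)/(1 + u/(c n)) - c/n."""
    return u * (1.0 - 1.0 / n**2) / (1.0 + u / (C * n))


def drag_first_order(n, u):
    """First-order result u (1 - 1/n^2), the form obtained when higher order terms are dropped."""
    return u * (1.0 - 1.0 / n**2)


n_water = 1.33  # assumed index of refraction of water
u_flow = 7.0    # assumed flow speed, m/s

exact = drag_exact(n_water, u_flow)
approx = drag_first_order(n_water, u_flow)
print(exact, approx)
print((approx - exact) / approx)  # fractional error, close to u / (c n)
```

For a flow of a few metres per second the fractional difference is of order u/cn, a few parts
in a hundred million, which illustrates why the higher order terms were beyond experimental reach.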

2.15

THE CEDARHOLM-TOWNES MASER EXPERIMENT

With the advent of very precise clocks based on the maser principle,²¹ it has recently
become possible to perform an even more sensitive test of the presence of an ether than
that afforded by the Michelson interferometer. This has been accomplished by pointing
the beams of ammonia molecules comprising two masers in opposite directions and
measuring the difference in their oscillating frequencies.
The operation of one of these masers is suggested in Figure 2.15. Ammonia gas is
emitted through an opening in a source S and sprays out into a region containing a
cylinder of electrostatically charged rods. The ammonia molecules normally exist as a
gas in a balance between two energy states, there being a greater population in the
lower state. However, the charged rods repel the ammonia molecules in the higher
state, whereas they attract those in the lower state. As the ammonia gas drifts through
the cylinder of charged rods, the two states start to separate. Those molecules in the
lower state (represented by black dots) diverge whereas those in the upper state
(represented by grey dots) converge. The latter then enter a cavity where, due to the
unbalanced population, some of them spontaneously revert to the lower energy state, emitting
photons of characteristic frequency in the process. As these photons bounce around
in the cavity, a field builds up; if the dimensions of the cavity are properly chosen
to resonate this effect, self-sustaining oscillations can occur, and an electromagnetic
signal of great purity at a stable, precise frequency can be extracted from the cavity by
means of a probe.

FIGURE 2.15 The ammonia beam maser. [After Gordon, Sci Amer, 199, 42; 1958.]

21 J. P. Gordon, H. J. Zeiger, and C. H. Townes, "The Maser-New Type of Microwave Amplifier,
Frequency Standard, and Spectrometer," Phys Rev, 99, 1264-1274; August 15, 1955.
If two such ammonia beam masers are placed back to back, and the signals coupled
out of their respective cavities are compared, a detectable beat will occur if the signal
frequencies are not the same. That an ether theory would predict the presence of a
beat can be seen from the following argument:
Assume that the two masers are back to back and at rest in a coordinate system
X'Y'Z'. Their ammonia beams are presumed to have velocities v and -v with respect
to X'Y'Z' and this entire system is presumed to be traveling with respect to the ether
at a velocity u. Under these conditions Møller has shown²² that photons emitted from
the first maser beam in the direction characterized by the unit vector e' have a
frequency $\nu_1$ in the laboratory frame of reference given by

$$\nu_1 = \nu_0\left[1 + \frac{\mathbf{v}\cdot\mathbf{e}'}{c} + \frac{(\mathbf{v}\cdot\mathbf{e}')^2}{c^2} + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right] \tag{2.54}$$

in which $\nu_0$ is the photon frequency as determined by an observer at rest relative to
the ammonia molecules. A derivation of Equation (2.54) can be found in Appendix B.
In the cavity of the maser oscillator the ammonia molecules emit photons in all
directions, and as a result the signal coupled out of the cavity will have a mean frequency given by
(2.55)
in which $d\Omega$ is an element of solid angle and $f(\mathbf{e}')$ is a weighting function dependent on
the geometrical arrangement of the cavity. Upon introducing (2.54) into (2.55), one
gets for the mean frequency

$$\bar{\nu}_1 = \nu_0\left[1 + g(v) + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right] \tag{2.56}$$

in which g(v) is the mean value of $\dfrac{\mathbf{v}\cdot\mathbf{e}'}{c} + \dfrac{(\mathbf{v}\cdot\mathbf{e}')^2}{c^2}$ and is thus a function only of the
magnitude of v.
If this argument is repeated for the second maser beam, for which v is replaced by -v,
the mean value $\bar{\nu}_2$ can similarly be found. The difference in these two mean frequencies
is therefore

$$\bar{\nu}_1 - \bar{\nu}_2 = \frac{2\nu_0}{c^2}\,(\mathbf{u}\cdot\mathbf{v}) \tag{2.57}$$

It has proved possible to achieve a precision of one part in 10¹² in this frequency
comparison. Thus since v = 0.6 km/sec for each ammonia beam, with $\nu_0$ = 23,870 Mc/sec,
an ether drift u as small as 1/1000th of the orbital velocity of the earth should be
detectable with this apparatus. This is 50 times more sensitive than the apparatus of
Joos, which incorporated a Michelson interferometer and was used in 1930.
22 C. Møller, "On the Possibility of Terrestrial Tests of the General Theory of Relativity," Nuovo
Cimento, 6, Suppl, 381-398; 1957.

Back to back maser oscillators have been constructed by Cedarholm and Townes²³
and used in the manner just described. The outputs from the cavities of the two masers
were compared in frequency as the entire apparatus was rotated through 180 deg, thus
ensuring in (2.57) a maximum value of u · v for some position of the apparatus. In the
words of the investigators:
The experiment . . . was carefully done for the first time on September 20, 1958. No
proper effect (in the frequency difference) so large as 1/50 cps was found. Hence, since the
orbital velocity of the Earth of 30 km/s would have given an effect of 20 cps, the ether
drift could not have been larger than 1/1000 of this value, or 30 m/s. It is, of course, possible
for the motion of the earth to be just cancelled by the motion of the solar system through
the ether at some particular time of the year. The experiment has now been repeated at the
Watson Laboratory during 24-hr. runs at approximately three-month intervals throughout
the year. In none of these runs was any effect so large as 1/50 cps found.

This null result makes the case against an ether theory even more compelling.
Einstein's formulation, which treats the ether as superfluous, predicts a null result in
the Cedarholm-Townes experiment.
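
The magnitudes quoted in this section can be reproduced approximately from the frequency
difference $2\nu_0 uv/c^2$ that an ether theory would predict. The sketch below assumes u and v
parallel (the most favorable orientation) and takes the beam speed, maser frequency and
orbital velocity from the text; the function name is hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s


def beat_frequency(nu0, v_beam, u_drift):
    """Predicted difference of the two mean maser frequencies, for u parallel to v."""
    return 2.0 * nu0 * u_drift * v_beam / C**2


nu0 = 23_870e6    # maser frequency, Hz (23,870 Mc/sec)
v_beam = 0.6e3    # ammonia beam speed, m/s
u_orbital = 30e3  # orbital velocity of the earth, m/s

full = beat_frequency(nu0, v_beam, u_orbital)
smallest = beat_frequency(nu0, v_beam, u_orbital / 1000.0)

print(full)            # a little under 10 cps for one orientation
print(2.0 * full)      # swing when the sign of u . v is reversed by a 180 deg rotation
print(smallest / nu0)  # fractional shift for a drift of 1/1000 of the orbital velocity
```

Under these assumptions the swing on reversal comes out near the 20 cps quoted by the
investigators, and a drift of one thousandth of the orbital velocity corresponds to a fractional
shift of a few parts in 10¹³, comparable to the stated precision of the comparison.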

2.16

THE VARIATION OF MASS

Since it has been established that the Lorentz transformation affords a satisfactory
explanation of phenomena involving the velocity of light, it now becomes necessary to
reexamine the laws of mechanics. If the principle of relativity is to hold for all physical
laws, and if the Lorentz transformation is the proper link between inertial coordinate
systems, then the laws of mechanics, if properly expressed, should transform satisfactorily
via the Lorentz equations. In a sense this reexamination has already been
started in that the concepts involving the measurement of distance intervals and time
intervals form an integral part of all mechanical laws. A critical study of the
operational definitions of these measurements for moving systems has revealed
that both distance intervals and time intervals are dependent on relative
motion. This reexamination will now be continued by the introduction of another
hypothetical experiment²⁴ whose symmetry raises a question about the invariance of
mass.
Imagine that two exactly similar elastic balls suffer a collision which in the X'Y'Z'
frame appears as shown in Figure 2.16a. They are seen to approach each other along
parallel lines, collide, and then recede from each other along parallel lines. Their
approach speeds are equal and by symmetry so too are their recessional speeds. (A
perfectly elastic collision is assumed with no loss of energy, thus causing the recessional
speed to equal the speed of approach.) This experiment can be assumed to take place
either in a region free from gravitational attraction, or on a level frictionless table
over which the balls are sliding.†
Now imagine this same collision as viewed from an XYZ frame which is moving in
the direction of the -X' axis at a speed $u = v'_x$. To an observer O stationary in XYZ,
ball A is moving parallel to the Y axis, and ball B makes a more grazing incidence to
the X axis.

FIGURE 2.16 The collision of two balls, panels (a) and (b).

† A rolling motion would complicate the discussion unnecessarily.
23 J. P. Cedarholm and C. H. Townes, "A New Experimental Test of Special Relativity," Nature,
184, 1350-1351; October 31, 1959.
24 This hypothetical experiment and the ensuing analysis were first offered by G. N. Lewis and R. C.
Tolman in the paper "The Principle of Relativity and Non-Newtonian Mechanics," Phil Mag, 18,
510-523; 1909.
As seen in X'Y'Z', each ball has its y' component of velocity reversed by the collision
but its x' component of velocity is unchanged. As seen in XYZ, ball B has its y
component of velocity reversed by the collision but its x component is unaffected. In
XYZ ball A does not have an x component of velocity either before or after the collision;
however, it does have a y component which suffers a reversal.
Classical mechanics would yield the result for this experiment that $v_y = v'_y$ for ball B
and that in the XYZ frame the velocity of ball A is of magnitude $v_y$. In terms of a Lorentz
transformation, one would be ill-advised to assume this without checking. Therefore, let
$w_y$ represent the velocity of ball A in XYZ before and after the collision. Using (2.50)
one finds that for ball B

$$v'_y = \frac{v_y}{\kappa(1 - uv_x/c^2)}$$

whereas for ball A

$$v'_y = \frac{w_y}{\kappa}$$

The ratio gives

$$\frac{w_y}{v_y} = \frac{1}{1 - uv_x/c^2} \tag{2.58}$$

and thus $v_y < w_y$. Viewed from XYZ, ball A has a greater y component of velocity
than does ball B. (For ordinary velocities the difference is exceedingly small.)

Equation (2.58) requires the abandonment of one or the other of two principles of
classical mechanics. If mass is an invariant, then the principle of conservation of linear
momentum is violated in the y direction in XYZ. If the momentum principle is valid,
then mass cannot be an invariant. The latter assumption has proved to be the one which
is consistent with experiment and will be the basis for what follows.
Let $m'_A = m'_B$ be the two masses in the X'Y'Z' frame (they are equal by symmetry)
and let $m_A \neq m_B$ be the two masses in the XYZ frame. Then

$$m_A w_y = m_B v_y$$

so that

$$\frac{m_B}{m_A} = \frac{w_y}{v_y} = \frac{1}{1 - uv_x/c^2}$$

This result can be rephrased entirely in terms of XYZ quantities by using (2.50) to
substitute for $v'_x$. This gives

$$u = v'_x = \frac{v_x - u}{1 - uv_x/c^2}$$

from which

$$\frac{2uv_x}{c^2} - \frac{u^2v_x^2}{c^4} = \frac{v_x^2}{c^2}$$

and thus

$$\frac{m_B}{m_A} = \left(1 - \frac{uv_x}{c^2}\right)^{-1} = \left(1 - \frac{2uv_x}{c^2} + \frac{u^2v_x^2}{c^4}\right)^{-1/2} = \left(1 - \frac{v_x^2}{c^2}\right)^{-1/2} \tag{2.59}$$

This relation is seen not to depend on $v_y$ and should hold even when $v_y = 0$. But then
$w_y = 0$ also, and as seen from X'Y'Z' the two balls approach each other along the X'
axis and barely touch as they pass. As seen from XYZ, ball A is at rest and ball B
passes by, barely touching A as it travels parallel to the X axis. With $m_0$ the mass
of ball A when it is at rest, Equation (2.59) can be rewritten

$$m_B = \frac{m_0}{(1 - v_x^2/c^2)^{1/2}} \tag{2.60}$$

One can now argue that it no longer matters whether ball A is present or not. Further,
the rest mass of ball B should also be $m_0$, since in X'Y'Z' one started with a symmetrical
experiment using identical balls. With only ball B left, in constant rectilinear
motion, the subscripts can be dropped on $m_B$ and $v_x$, giving

$$m = \frac{m_0}{(1 - v^2/c^2)^{1/2}} \tag{2.61}$$

In Equation (2.61), $m_0$ is the rest mass of ball B in the Lorentz frame XYZ, and m is
its dynamic mass when going at a speed v relative to XYZ.
It is inferred from this result that the mass of any material body depends on its relative
motion, increasing with speed according to the relation (2.61).
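
The algebraic step that turns the velocity ratio into a function of ball B's speed alone can be
checked numerically. The sketch below works in units of c and assumes, following the setup
of the collision, that ball B has x' velocity u in X'Y'Z', so that its x velocity in XYZ is
$2u/(1 + u^2/c^2)$. It then compares $1/(1 - uv_x/c^2)$ with $(1 - v_x^2/c^2)^{-1/2}$. The function
name is hypothetical.

```python
import math


def mass_ratios(beta_u):
    """Return the mass ratio m_B / m_A computed two ways, with speeds in units of c."""
    # x velocity of ball B in XYZ, from the velocity transformation with v'_x = u.
    beta_x = 2.0 * beta_u / (1.0 + beta_u**2)
    from_velocity_ratio = 1.0 / (1.0 - beta_u * beta_x)
    from_speed_of_b = 1.0 / math.sqrt(1.0 - beta_x**2)
    return from_velocity_ratio, from_speed_of_b


for beta_u in (0.1, 0.5, 0.9):
    print(beta_u, mass_ratios(beta_u))  # the two entries of each pair coincide
```

The agreement of the two columns for every frame speed is the numerical counterpart of the
identity $(1 - uv_x/c^2)^2 = 1 - v_x^2/c^2$ used in the derivation.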

EXAMPLE 2.7

A clear confirmation of the variability of mass has been given by Zahn and Spees.²⁵ Employing
a radioactive source S to generate high-speed electrons, they selected a small velocity
range of these electrons through the use of a velocity filter; with a Geiger counter as
detector, they were able to determine the dynamic mass.
As indicated by the figure of the apparatus, C is a parallel plate condenser with extremely
small spacing between the plates (d = 0.4663 mm) and a length of 12 cm. Any electron
which reaches the Geiger counter must pass between these plates. Helmholtz coils (not
shown) create a uniform, constant magnetic field B = 120.85 × 10⁻⁴ webers/m² in a
direction perpendicular to the plane of the paper. The electrons which leave S have
trajectories in the plane of the paper which are circles. Only those electrons whose center of
curvature is at P will enter the condenser traveling parallel to the plates. The radius of
curvature r can be found from the geometry, which yields the formula

$$r^2 = (r - a)^2 + b^2 \qquad r = \frac{a^2 + b^2}{2a}$$

in which a is the distance that the source S is below the mouth of the condenser, and b is
the distance that S is to the right of the mouth. In one run of the apparatus, these distances
had the values a = 0.02992 m and $(a^2 + b^2)^{1/2}$ = 0.0977 m, yielding r = 0.1595 m.

[Figure: the apparatus. After Zahn and Spees, Phys Rev, 53, 365; 1938.]

The force law can be invoked to determine the velocity of the electrons traveling this
particular trajectory. One obtains

$$\frac{mv^2}{r} = evB$$

and using (2.61) for the dynamic mass, this becomes

$$\frac{v}{(1 - v^2/c^2)^{1/2}} = \frac{e}{m_0}\,rB$$

in which e is the electronic charge (1.6 × 10⁻¹⁹ coul) and $m_0$ is the rest mass (9.1 × 10⁻³¹ kg).
Solving for v, one obtains

$$v = 0.749c = 2.25 \times 10^8 \text{ m/sec}$$

Under the action of the magnetic field, these electrons would continue in their circular
path, thus striking the bottom plate of the condenser, if it were not for the electrostatic
voltage between the plates. When this voltage V is properly adjusted, a balancing
electrostatic force results and the electrons are able to travel between the plates, emerging
from the other end to be detected by the Geiger counter. It is clear that regardless of the
value of V, no other electrons, traveling in any other orbit as they leave S, can pass through
the condenser, and those electrons traveling at the speed 2.25 × 10⁸ m/sec will get through
only if V has a value such that

$$\frac{eV}{d} = evB$$

$$V = dvB = (0.4663 \times 10^{-3})(2.25 \times 10^{8})(120.85 \times 10^{-4}) = 1{,}270 \text{ volts}$$

25 C. T. Zahn and A. H. Spees, "The Specific Charge of Disintegration Electrons from Radium E,"
Phys Rev, 53, 365-373; March 1, 1938.

The experimental data, showing counts per minute versus condenser voltage, are given
in the graph. The agreement between theory and experiment is seen to be very good.
[Graph: counts per minute versus condenser voltage (volts). After Zahn and Spees, Phys Rev, 53, 365; 1938.]


Had the rest mass rather than the dynamic mass been used above in the force equation,
the velocity of the electrons getting through the filter would have been computed to be 3.4 X
10 8 m/sec (greater than c), and the predicted condenser voltage to permit passage would
have been 50 percent higher.
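
The arithmetic of this example can be retraced in a few lines. The sketch below uses the
rounded constants quoted in the example (c = 3 × 10⁸ m/sec, e = 1.6 × 10⁻¹⁹ coul,
rest mass 9.1 × 10⁻³¹ kg); the variable names are illustrative. It solves
$v/(1 - v^2/c^2)^{1/2} = (e/m_0)rB$ for v in closed form and then evaluates V = dvB.

```python
import math

C = 3.0e8           # speed of light, m/sec (rounded, as in the example)
E_CHARGE = 1.6e-19  # electronic charge, coul
M_REST = 9.1e-31    # electron rest mass, kg

a = 0.02992         # m, source below the mouth of the condenser
chord = 0.0977      # m, the quantity (a^2 + b^2)^(1/2)
B = 120.85e-4       # webers/m^2
d = 0.4663e-3       # m, plate spacing

r = chord**2 / (2.0 * a)                     # radius of curvature
k = E_CHARGE * r * B / M_REST                # equals v / sqrt(1 - v^2/c^2)
v = k / math.sqrt(1.0 + (k / C) ** 2)        # solve that relation for v
V = d * v * B                                # balancing condenser voltage

print(r)              # about 0.1595 m
print(v / C, v)       # about 0.749 c
print(V)              # within rounding of the 1,270 volts quoted
print(k, d * k * B)   # rest-mass-only treatment: speed above c, voltage about 50% higher
```

The quantity k is what the speed would be if the rest mass were used in the force equation;
it exceeds c, which is the inconsistency pointed out at the end of the example.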

2.17 MOMENTUM AND ENERGY


Now let attention be turned to the more general case of a body whose velocity is not
necessarily a constant with time, in either magnitude or direction. Let it be assumed
that Equation (2.61) is applicable to this general case and define momentum by the
relation

$$\mathbf{p} = m\mathbf{v} = \frac{m_0\mathbf{v}}{(1 - v^2/c^2)^{1/2}} \tag{2.62}$$

It is apparent that for modest velocities this reduces to the classical definition of
momentum.
Further let Newton's force law be defined by the relation

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.63}$$

Since v is now permitted to be time-dependent, it follows that the dynamic mass m is
a function of time and thus that

$$\frac{d\mathbf{p}}{dt} = m\frac{d\mathbf{v}}{dt} + \mathbf{v}\frac{dm}{dt} \tag{2.64}$$

Expressions (2.63) and (2.64) also are seen to reduce to their classical forms for normal
velocities.
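The kinetic energy T of a moving body still can be defined as the work supplied to
bring it from rest to its state of motion. Thus

$$T = \int \mathbf{F}\cdot d\mathbf{r} = \int \frac{d\mathbf{p}}{dt}\cdot\mathbf{v}\,dt = \int \mathbf{v}\cdot d\mathbf{p} \tag{2.65}$$

From (2.62)

$$d\mathbf{p} = \frac{m_0\,d\mathbf{v}}{(1 - v^2/c^2)^{1/2}} + \frac{m_0\,\mathbf{v}\,(\mathbf{v}\cdot d\mathbf{v})}{c^2(1 - v^2/c^2)^{3/2}}$$

so that

$$\mathbf{v}\cdot d\mathbf{p} = \frac{m_0\,\mathbf{v}\cdot d\mathbf{v}}{(1 - v^2/c^2)^{1/2}} + \frac{m_0\,v^2\,(\mathbf{v}\cdot d\mathbf{v})}{c^2(1 - v^2/c^2)^{3/2}}$$

However since d(v · v) = 2v dv, this can be rewritten

$$\mathbf{v}\cdot d\mathbf{p} = \frac{m_0\,v\,dv}{(1 - v^2/c^2)^{3/2}}$$

Therefore, with $w = v^2/c^2$, (2.65) becomes

$$T = \frac{m_0c^2}{2}\int_0^{v^2/c^2}\frac{dw}{(1 - w)^{3/2}} = m_0c^2\left[\left(1 - \frac{v^2}{c^2}\right)^{-1/2} - 1\right] \tag{2.66}$$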

Equation (2.66) can be expanded in a power series (cf. Mathematical Supplement) to
give

$$T = \frac{1}{2}m_0v^2 + \frac{3}{8}m_0\frac{v^4}{c^2} + \cdots \tag{2.67}$$

If $v \ll c$, (2.67) is approximated quite well by the conventional expression for kinetic
energy.
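
How quickly the series converges can be seen from a short comparison. The sketch below
expresses the kinetic energy in units of the rest energy, so that the exact form is
$(1 - \beta^2)^{-1/2} - 1$ with $\beta = v/c$, the classical form is $\beta^2/2$, and the two-term
series adds $3\beta^4/8$. The function names are illustrative.

```python
import math


def t_exact(beta):
    """Relativistic kinetic energy in units of the rest energy."""
    return 1.0 / math.sqrt(1.0 - beta**2) - 1.0


def t_classical(beta):
    """Conventional kinetic energy (first term of the series) in the same units."""
    return 0.5 * beta**2


def t_two_terms(beta):
    """First two terms of the power series."""
    return 0.5 * beta**2 + 0.375 * beta**4


for beta in (0.01, 0.1, 0.5, 0.9):
    print(beta, t_exact(beta), t_classical(beta), t_two_terms(beta))
```

At one tenth the speed of light the classical expression is already low by somewhat under one
percent, and the second term removes nearly all of that discrepancy; at nine tenths of c
neither truncated form is adequate.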
Equation (2.66) can be written in the interesting form

$$T = \left[\frac{m_0}{(1 - v^2/c^2)^{1/2}} - m_0\right]c^2 = (m - m_0)c^2 \tag{2.68}$$

This suggests that the kinetic energy can be interpreted as the square of the velocity
of light times the change in mass. If the increase in energy is thought of as the cause
of the increase in mass, it becomes an attractive hypothesis to imagine that even the
rest mass $m_0$ is due to an internal amount of energy $m_0c^2$. If $m_0c^2$ is called the rest
energy of the body, then the total energy E, being the sum of the rest energy and
the kinetic energy, is given by

$$E = m_0c^2 + T = mc^2 \tag{2.69}$$

This celebrated equation is one of the most important results of the special theory and
has been amply substantiated by a wide variety of atomic and nuclear experiments. It
lies at the heart of the explanation of fission and fusion bombs and has led to a
satisfactory explanation of stellar energy processes. Verification of (2.69) provides convincing
support of the soundness of the generalized definitions of momentum and the force
law given earlier, on which the derivation of (2.69) was based.
EXAMPLE 2.8

The dynamic balance within a stable star can be explained by arguing that the great
mass causes a high gravitational pressure at the core. This intense pressure serves to
elevate the temperature of the core to millions of degrees and thus permit fusion processes to
occur. The most likely of these processes is the conversion of hydrogen to helium. Four
hydrogen atoms, each consisting of a proton and an electron, can be transformed into a
single helium atom consisting of two protons, two neutrons, and two electrons, as suggested
by the diagram. A charge balance is achieved because two positively charged protons plus
two negatively charged electrons are replaced by two uncharged neutrons. However a
mass balance is not achieved. Since the atomic weights of hydrogen and helium are 1.008
and 4.003, respectively, it follows that 4.032 units of mass are replaced by only 4.003
units. The loss in mass, multiplied by c², represents the energy radiated during the
transformation. As this energy streams outward from the core it causes a radiation pressure
which balances the gravitational pressure, causing the star to maintain a stable size. This
stability ensures that the rate of the fusion process will remain essentially constant over a
long period of time (billions of years). This in turn makes the solar power available to a
planetary system a constant, a desirable factor in evolutionary processes.
When radiation pressure is computed on the basis ΔE = c² Δm, theoretical calculations
yield values for stellar diameters and surface temperatures which are in satisfactory agreement
with observations. Spectrographic studies of our own sun indicate hydrogen is its most
abundant element, with helium next. The relative abundance of these two elements suggests
that the process has been going on for about five billion years, a figure which is in good
agreement with geological data. It also suggests that the process can continue in stable fashion
for another five billion years.
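
The mass bookkeeping of this example translates into energy as follows. The sketch uses the
atomic weights quoted above; the conversion from atomic mass units to kilograms and the
value of c are standard constants supplied here as assumptions, and the variable names are
illustrative.

```python
C = 2.998e8           # speed of light, m/sec
AMU_KG = 1.6605e-27   # kilograms per atomic mass unit (assumed conversion factor)

m_hydrogen = 1.008    # atomic weight of hydrogen
m_helium = 4.003      # atomic weight of helium

mass_in = 4.0 * m_hydrogen            # four hydrogen atoms
defect = mass_in - m_helium           # mass lost in forming one helium atom
fraction = defect / mass_in           # fraction of the original mass radiated away
energy_joules = defect * AMU_KG * C**2

print(mass_in, defect)   # about 4.032 and 0.029 units
print(fraction)          # a little under one percent
print(energy_joules)     # energy released per helium atom formed
```

Under these assumptions each helium atom formed releases a few times 10⁻¹² joule, which
corresponds to roughly 0.7 percent of the mass of the hydrogen consumed.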

2.18

THE TRANSFORMATION LAW FOR MASS

Equation (2.61) is not, of course, the transformation law for mass because it relates
the rest mass in one coordinate system to the dynamic mass in the same coordinate


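system. However, it can be used to relate the dynamic mass in two different coordinate
systems as follows:
Let a body of rest mass $m_0$ have a velocity $\mathbf{v}(x,y,z,t)$ in XYZ and a velocity $\mathbf{v}'(x',y',z',t')$
in X'Y'Z'. Then

$$m = \frac{m_0}{(1 - v^2/c^2)^{1/2}} \qquad \text{and} \qquad m' = \frac{m_0}{[1 - (v')^2/c^2]^{1/2}}$$

are the expressions for the dynamic mass in the two coordinate systems. Thus

$$m' = \left[\frac{1 - v^2/c^2}{1 - (v')^2/c^2}\right]^{1/2} m$$

From the velocity transformation Equations (2.50),

$$(v')^2 = \frac{(v_x - u)^2 + \left(1 - u^2/c^2\right)\left(v^2 - v_x^2\right)}{\left(1 - u v_x/c^2\right)^2}$$

so that

$$m' = \kappa\left(1 - \frac{u v_x}{c^2}\right) m \tag{2.70}$$

with $\kappa = (1 - u^2/c^2)^{-1/2}$ as before.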

Equation (2.70) is the transformation law for mass. In using it one should remember
that in general both m and m' are functions of time.
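A numerical check of (2.70) may be helpful. The sketch below assumes that the velocity transformation (2.50) has the standard form for relative motion u along the X axis, works in units for which c = 1, and uses hypothetical function names; it compares the dynamic mass computed directly in X'Y'Z' with the value given by the transformation law.

```python
import math

C = 1.0  # units in which c = 1 (an assumption made for convenience)

def kappa(u):
    """Lorentz factor for the relative speed u of the two frames."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)

def transform_velocity(v, u):
    """Velocity components in X'Y'Z'; standard form assumed for (2.50)."""
    vx, vy, vz = v
    d = 1.0 - u * vx / C**2
    k = kappa(u)
    return ((vx - u) / d, vy / (k * d), vz / (k * d))

def dynamic_mass(m0, v):
    """Dynamic mass m0 / sqrt(1 - v^2/c^2) for velocity components v."""
    speed_sq = sum(c * c for c in v)
    return m0 / math.sqrt(1.0 - speed_sq / C**2)

def transform_mass(m, v, u):
    """Equation (2.70): m' = kappa * (1 - u*v_x/c^2) * m."""
    return kappa(u) * (1.0 - u * v[0] / C**2) * m

m0, v, u = 1.0, (0.3, 0.4, -0.2), 0.6
direct = dynamic_mass(m0, transform_velocity(v, u))   # computed in X'Y'Z'
via_law = transform_mass(dynamic_mass(m0, v), v, u)   # computed from (2.70)
print(direct, via_law)
```

The two printed values should agree to within rounding error for any admissible choice of v and u.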

2.19

THE TRANSFORMATION LAW FOR FORCE

On the presumption that the Lorentz equations properly transform all the laws of
physics (as required by the relativity principle) one can write
$$\mathbf{F} = \frac{d}{dt}(m\mathbf{v}) \tag{2.71}$$

$$\mathbf{F}' = \frac{d}{dt'}(m'\mathbf{v}') \tag{2.72}$$


and inquire what the force transformation law must be in order to derive either of these
equations from the other via the Lorentz equations.
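With the help of (2.50) and (2.70), Equation (2.72) can be expanded to give

$$
\begin{aligned}
F'_x &= \frac{dt}{dt'}\,\frac{d}{dt}\left[\kappa(v_x - u)m\right] \\
F'_y &= \frac{dt}{dt'}\,\frac{d}{dt}\left[m v_y\right] \\
F'_z &= \frac{dt}{dt'}\,\frac{d}{dt}\left[m v_z\right]
\end{aligned}
\tag{2.73}
$$

Formation of the differentials of the last of (2.39) yields

$$\frac{dt}{dt'} = \frac{1}{\kappa(1 - u v_x/c^2)}$$

so that Equations (2.73) become

$$
\begin{aligned}
F'_x &= \left(1 - \frac{u v_x}{c^2}\right)^{-1}\frac{d}{dt}(m v_x - m u) = \frac{F_x - u\,(dm/dt)}{1 - u v_x/c^2} \\
F'_y &= \frac{1}{\kappa(1 - u v_x/c^2)}\,\frac{d}{dt}(m v_y) = \frac{F_y}{\kappa(1 - u v_x/c^2)} \\
F'_z &= \frac{1}{\kappa(1 - u v_x/c^2)}\,\frac{d}{dt}(m v_z) = \frac{F_z}{\kappa(1 - u v_x/c^2)}
\end{aligned}
\tag{2.74}
$$
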

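From (2.61)

$$\frac{dm}{dt} = \frac{m_0/c^2}{(1 - v^2/c^2)^{3/2}}\;\mathbf{v}\cdot\frac{d\mathbf{v}}{dt} = \frac{1}{c^2}\,\frac{\mathbf{v}}{1 - v^2/c^2}\cdot\left(m\,\frac{d\mathbf{v}}{dt}\right)$$

With the help of (2.64) this can be rewritten

$$(c^2 - v^2)\,\frac{dm}{dt} = \mathbf{v}\cdot\left(\mathbf{F} - \mathbf{v}\,\frac{dm}{dt}\right)$$

so that

$$\frac{dm}{dt} = \frac{\mathbf{v}\cdot\mathbf{F}}{c^2} \tag{2.75}$$
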

Finally

$$
\begin{aligned}
F'_x &= F_x - \frac{u v_y/c^2}{1 - u v_x/c^2}\,F_y - \frac{u v_z/c^2}{1 - u v_x/c^2}\,F_z \\
F'_y &= \frac{F_y}{\kappa(1 - u v_x/c^2)} \\
F'_z &= \frac{F_z}{\kappa(1 - u v_x/c^2)}
\end{aligned}
\tag{2.76}
$$

Equations (2.76) are known as the force transformation law. It is evident that if u and v
are small compared with c, $\mathbf{F}' \cong \mathbf{F}$, indicating that in such cases the classical expression, which equates
these forces, is a valid approximation.
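The force transformation law can be tested numerically against its defining relation (2.72). The sketch below is illustrative only: it assumes units in which c = 1, takes the force to be constant over a very short time interval, and uses hypothetical function names. The primed momentum m'v' is formed from the bracketed quantities in (2.73), differentiated by central differences, and compared with (2.76).

```python
import math

C = 1.0  # units in which c = 1 (an assumption made for convenience)

def kappa(u):
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)

def transform_force(F, v, u):
    """Equations (2.76): force in X'Y'Z' from force F and velocity v in XYZ."""
    Fx, Fy, Fz = F
    vx, vy, vz = v
    d = 1.0 - u * vx / C**2
    k = kappa(u)
    Fx_p = Fx - (u * vy / C**2) / d * Fy - (u * vz / C**2) / d * Fz
    return (Fx_p, Fy / (k * d), Fz / (k * d))

def velocity_from_momentum(p, m0):
    """Return (v, m) with v = p/m and dynamic mass m = sqrt(m0^2 + p^2/c^2)."""
    m = math.sqrt(m0**2 + sum(c * c for c in p) / C**2)
    return tuple(c / m for c in p), m

def primed_momentum(p, m0, u):
    """m'v' as in (2.73): (kappa*(v_x - u)*m, m*v_y, m*v_z)."""
    v, m = velocity_from_momentum(p, m0)
    return (kappa(u) * (v[0] - u) * m, m * v[1], m * v[2])

def numerical_primed_force(p, F, m0, u, h=1e-6):
    """Central-difference estimate of d(m'v')/dt' = (dt/dt') d(m'v')/dt."""
    p_plus = tuple(pi + Fi * h for pi, Fi in zip(p, F))    # momentum at t + h
    p_minus = tuple(pi - Fi * h for pi, Fi in zip(p, F))   # momentum at t - h
    a = primed_momentum(p_plus, m0, u)
    b = primed_momentum(p_minus, m0, u)
    v, _ = velocity_from_momentum(p, m0)
    dt_dtp = 1.0 / (kappa(u) * (1.0 - u * v[0] / C**2))
    return tuple(dt_dtp * (ai - bi) / (2.0 * h) for ai, bi in zip(a, b))

p, F, m0, u = (0.3, 0.4, -0.2), (1.0, -0.5, 0.25), 1.0, 0.6
v, _ = velocity_from_momentum(p, m0)
print("from (2.76):       ", transform_force(F, v, u))
print("finite differences:", numerical_primed_force(p, F, m0, u))
print("low-speed limit:   ", transform_force(F, (1e-6, 1e-6, 0.0), 1e-6))
```

The first two printed lines should agree closely, and the last should be nearly equal to F itself, in keeping with the classical approximation.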
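It is significant that Equations (2.76) are linear in the force components. Recalling
that $\mathbf{F}$ or $\mathbf{F}'$ is the total force acting on the body of rest mass $m_0$, if $\mathbf{F}$ is composed of
partial forces such that

$$\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 + \cdots + \mathbf{F}_N$$

then each of these partial forces has a counterpart such that

$$\mathbf{F}' = \mathbf{F}'_1 + \mathbf{F}'_2 + \cdots + \mathbf{F}'_N$$

Equations (2.76) can then be written in the expanded form

$$
\begin{aligned}
(F'_{1x} + F'_{2x} + \cdots + F'_{Nx}) &= (F_{1x} + F_{2x} + \cdots + F_{Nx}) - \frac{u v_y/c^2}{1 - u v_x/c^2}\,(F_{1y} + F_{2y} + \cdots + F_{Ny}) \\
&\qquad - \frac{u v_z/c^2}{1 - u v_x/c^2}\,(F_{1z} + F_{2z} + \cdots + F_{Nz}) \\
(F'_{1y} + F'_{2y} + \cdots + F'_{Ny}) &= \frac{(F_{1y} + F_{2y} + \cdots + F_{Ny})}{\kappa(1 - u v_x/c^2)} \\
(F'_{1z} + F'_{2z} + \cdots + F'_{Nz}) &= \frac{(F_{1z} + F_{2z} + \cdots + F_{Nz})}{\kappa(1 - u v_x/c^2)}
\end{aligned}
$$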


Since the partial forces are in general independent, it follows that

$$
\begin{aligned}
F'_{nx} &= F_{nx} - \frac{u v_y/c^2}{1 - u v_x/c^2}\,F_{ny} - \frac{u v_z/c^2}{1 - u v_x/c^2}\,F_{nz} \\
F'_{ny} &= \frac{F_{ny}}{\kappa(1 - u v_x/c^2)} \\
F'_{nz} &= \frac{F_{nz}}{\kappa(1 - u v_x/c^2)}
\end{aligned}
\tag{2.77}
$$

in which $\mathbf{F}_n$ and $\mathbf{F}'_n$ represent the nth partial force as determined in XYZ or X'Y'Z',
with $1 \le n \le N$. Thus the partial forces transform according to the same law as the
total forces. However, it should be recognized that, whereas (2.77) contains partial
forces, it does not contain partial velocities. The terms $v_x$, $v_y$, and $v_z$ occurring in (2.77)
refer to the total instantaneous motion of the mass m, resulting from the action of all
the forces.
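The linearity just described is easily illustrated. In the sketch below (units with c = 1 and hypothetical function names, both assumptions made here), three arbitrary partial forces are transformed one at a time with the total velocity of the body, and their sum is compared with the transform of the total force.

```python
import math

C = 1.0  # units in which c = 1 (an assumption made for convenience)

def transform_force(F, v, u):
    """Equations (2.76) or (2.77); v is the total velocity of the body."""
    Fx, Fy, Fz = F
    vx, vy, vz = v
    k = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    d = 1.0 - u * vx / C**2
    Fx_p = Fx - (u * vy / C**2) / d * Fy - (u * vz / C**2) / d * Fz
    return (Fx_p, Fy / (k * d), Fz / (k * d))

partial_forces = [(0.2, -0.1, 0.4), (-0.5, 0.3, 0.1), (0.7, 0.0, -0.6)]
v_total = (0.3, 0.2, -0.1)   # total instantaneous velocity, not a partial one
u = 0.5

total_force = tuple(sum(c) for c in zip(*partial_forces))
transform_of_sum = transform_force(total_force, v_total, u)
sum_of_transforms = tuple(
    sum(c) for c in zip(*(transform_force(Fn, v_total, u) for Fn in partial_forces))
)
print(transform_of_sum)
print(sum_of_transforms)
```

The agreement of the two results depends on using the same total velocity in every term; there is no meaningful way to assign a separate velocity to each partial force.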
This important transformation law will be central to the development of the field
of a moving charge, a topic to be considered in Chapter 4. The results there obtained
will serve as additional evidence for the validity of this reconstitution of the laws of
mechanics in keeping with the Lorentz transformation.
REFERENCES

1. Bergmann, P. G., Introduction to the Theory of Relativity, Prentice-Hall, Inc., New York, 1947.

2. Dingle, H., The Special Theory of Relativity, 3rd ed., a Methuen Monograph, John Wiley and Sons, Inc., New York, 1950.

3. Einstein, A., H. A. Lorentz, H. Minkowski, and H. Weyl, The Principle of Relativity, a collection of original papers, Dover Publications, Inc., New York.

4. Leighton, R. B., Principles of Modern Physics, McGraw-Hill Book Company, New York, 1959.

5. Møller, C., The Theory of Relativity, Oxford at the Clarendon Press, London, 1952.

6. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley, Inc., Boston, 1955.

7. Richtmyer, F. R., E. H. Kennard, and T. Lauritsen, Introduction to Modern Physics, 5th ed., McGraw-Hill Book Company, New York, 1955.

8. Sherwin, C. W., Basic Concepts of Physics, Holt, Rinehart and Winston, New York, 1961.

9. Whittaker, E., A History of the Theories of Aether and Electricity, Vols. 1 and 2, Thos. Nelson and Sons, Ltd., London, 1953.

10. Whittaker, E., From Euclid to Eddington, Dover Publications, Inc., New York, 1958.

PROBLEMS

2.1

Assume that two plane waves of light are propagating almost parallel to the Y axis, such
that they are given by

$$\psi_1 = K\cos(2\pi\nu t - k_x x - k_y y)$$

$$\psi_2 = K\cos(2\pi\nu t + k_x x - k_y y + \alpha)$$

in which K is the constant amplitude of both waves and $\alpha$ is their relative phase at the
origin. Show that in any transverse plane y = constant these waves interfere so as to
give alternate regions of light and dark. What is the spacing of these interference fringes?
How does the position of these fringes depend on $\alpha$? (Note that this effect is used in the
Michelson interferometer.)

2.2

In the Michelson interferometer how does fringe shift depend on the rotational position
of the apparatus if the two arms are not equal? (Cf. Appendix A.)

2.3

In Section 2.10 of the text, the Lorentz transformation equations were derived using the
pulse of light which occurred at the coincidence of ends of the two rulers. Show that the
Lorentz equations can also be derived by requiring that O and O' obtain symmetrical
results, thus giving an analytic parallel to the literal arguments of Section 2.9.

2.4

Show that two Lorentz transformations carried out one after the other are equivalent to
one Lorentz transformation for which the relative velocity is

$$u = \frac{u_1 + u_2}{1 + u_1 u_2/c^2}$$

with $u_1$ and $u_2$ the relative velocities of the two transformations. Thus show that it is
impossible to combine a sequence of Lorentz transformations into one yielding a relative
velocity greater than c.

2.5

In Section 2.9 of the text, a literal argument was used to show that observers O and O'
each concluded that the other ruler had shrunk when relative motion occurred. Use the
Lorentz transformation equations to demonstrate that the events A opposite B' and A'
opposite B occur in reverse time sequence for the two observers.

2.6

The result (2.41) was obtained when observer O found the positions of the two ends of
the ruler R' at a common time t. Show that the same result is obtained if O determines
how long it takes for R' to pass a fixed point in XYZ and then multiplies this time interval
by the speed u.

2.7

Show that the time dilatation effect may also be obtained by determining the distance
in XYZ between two events which occur at the same point in X'Y'Z' and dividing this
distance by u to get the time interval in XYZ.

2.8

A jet passenger airplane 150 ft long is cruising at a ground speed of 600 mph. By how much
does the plane appear shortened to a ground observer? How long would the pilot need to
fly at this speed before his clock appeared to a ground observer to have lost 1 sec?

2.9

Use the time dilatation formula to check the results of Example 2.2.

2.10

A space vehicle whose rest length is 100 m is traveling away from the earth at a constant
velocity v = 0.8c. A pulse of light is sent from the earth toward the spacecraft. As the light
pulse passes the rear of the vehicle it triggers a clock. It then continues to the front of the
vehicle where it is reflected by a mirror and returns to the clock. What time interval does
the clock record between the two passages of the light pulse? What time interval would
earth clocks record between the same two events?

2.11

An electronic clock is shown in the figure, and consists of a flashtube F and a photoelectric
cell P shielded from each other by a baffle, plus a mirror M rigidly mounted a fixed distance D above the assembly. A circuit in the box B is arranged so that when P receives
a light pulse from F via M, it causes the flashtube to emit another pulse of light with
negligible delay. This clock thus "ticks" once every 2D/c sec when at rest. Now suppose
that this clock is moving at a constant velocity v relative to the laboratory frame and
determine its period. Does your answer depend on the direction of v?

2.12

A cosmic ray μ meson enters the earth's atmosphere vertically at a speed v = 0.98c. In
its own rest system the μ meson decays into an electron and 2 neutrinos with a mean
lifetime of $2.2 \times 10^{-6}$ sec. What is its mean life expectancy as determined by an earth
observer? How far will a shower of these μ mesons penetrate the earth's atmosphere before
half of them have decayed?

2.13

Suppose that the frequency of a ray of light is $\nu$, as determined by O who is stationary
in XYZ, and that this light ray is traveling at an angle $\theta$ with respect to the X axis. Show
that an observer O', stationary in X'Y'Z', will find that the frequency of the light ray is

$$\nu' = \frac{\nu\,[1 - (u/c)\cos\theta]}{(1 - u^2/c^2)^{1/2}}$$

This result is known as the relativistic Doppler formula. Note that the numerator is the
classical expression.

2.14

A distant galaxy is receding from Earth with a radial velocity component of 1,000 km/sec.
By how many angstrom units will the Hβ line (4,861 Å) be shifted? Is the shift toward the
red or toward the violet end of the spectrum?

2.15

A straight line fixed in the XZ plane makes an angle $\theta$ with the X axis. What angle does
this line appear to make with the X' axis to an observer O' stationary in X'Y'Z'?

2.16

A small particle of mass m is moving at a constant speed v in a straight-line path in the
XZ plane. This path makes an angle $\theta$ with the X axis. Find the velocity components of
this particle in the X'Y'Z' frame. What angle does the particle's path make with the X'
axis? Is this answer consistent with the result of the previous problem?

2.17

A particle of rest mass $M_0$, moving through XYZ at the constant velocity $\mathbf{v}$, collides
inelastically with a second particle of rest mass $m_0$. If the second particle were initially at
rest in XYZ, find the speed of the composite particle.

2.18

Explain the aberration of light from a distant star in terms of the Lorentz transformation.

2.19

To what speed must a particle of rest mass $m_0$ be accelerated in order to quadruple its
mass? What is its kinetic energy at this speed? How does this answer compare with
$\tfrac{1}{2} m_0 v^2$?
2.20

Find a formula connecting the momentum p and the kinetic energy T of a particle of rest
mass mo.

2.21

A Compton collision occurs when a photon strikes an electron and is thereby scattered.
Find the change in frequency of the photon as a function of the angle $\theta$ through which it is
deflected. What is the change in energy of the electron?

2.22

An excited atom, at rest in XYZ, drops to a quantum state whose energy level is lower
by $\Delta E$. A photon is emitted and the atom recoils. Therefore the frequency of the photon
will not be precisely $\nu = \Delta E/h$, but rather will be

$$\nu = \frac{\Delta E}{h}\left(1 - \frac{1}{2}\,\frac{\Delta E}{M_0 c^2}\right)$$

in which $M_0$ is the rest mass of the atom and h is Planck's constant. Show this result.
2.23

Consider the collision of a particle of initial energy E and rest energy Eo with a like
particle which is at rest. Show that the maximum energy available in the zero momentum
frame is $(2EE_0 + 2E_0^2)^{1/2}$.

2.24

Find the Lorentz transformation law for acceleration and express your answer in terms of
acceleration components which are perpendicular to and parallel to the velocity.

2.25

In an XYZ frame a particle is moving in the XY plane and has instantaneous velocity
components $v_x = v_y = \tfrac{1}{2}c$. At this same instant the two force components are equal. In
what Lorentzian frame will the force appear to be entirely Y directed at this instant?
What will be the magnitude of this force?

2.26

Show that the force defined by (2.71) is parallel to the acceleration only if the acceleration
is either parallel to or perpendicular to the velocity.

2.27

An electron and a positron can combine at rest, annihilating each other with the result
that two γ rays are emitted. Assuming that energy and momentum are conserved, calculate the wavelength of the γ rays.

2.28

Consider a rocket ship in which mass can be converted to energy which provides a thrust.
Find the terminal velocity of this rocket ship relative to a frame in which it was initially
at rest, as a function of the percent of original mass converted.


CHAPTER

Electrostatics in Free Space


THIS CHAPTER and the two which follow will be concerned with the establishment of
an electrical theory in the absence of dielectric and permeable materials. Conductors
will be considered in a limited way, but only as supporting structures for the distribution or transport of charge; attention will be focused on the fields set up by these
charges and not on any interaction with their conducting environment. The conductors themselves will be treated as an electrically neutral background, consisting locally
of equal amounts of positive and negative charge in a vacuum. In this way a simplified theory of electromagnetic fields caused by charges in free space can be developed. Subsequent chapters will then be concerned with the extension of this theory to
situations which include the effects of materials.
The present chapter begins with a formulation of the electric field due to a static
assemblage of charges and then proceeds to the introduction of the electrostatic potential. Electric flux density is defined, following which Gauss' law and its applications are
discussed, including the use of flux maps. The relationship between field and charge at
a conductor-vacuum interface is established and the method of images is then developed and applied to several cases. Poisson's and Laplace's equations are derived and
a variety of boundary-value problems considered. The concept of capacitance is defined
and generalized to a system of conductors, and the chapter closes with a discussion of
the energy stored in an electrostatic field.
At this point a dipole theory of the behavior of dielectric materials could have been
introduced, but it would perforce be limited to static stresses. For this reason it was
felt desirable to defer the discussion on dielectrics until the general time-varying case
could be considered. Similarly, the next chapter, which deals with magnetic fields due
to time-independent currents, could logically contain sections on d.c. conductivity and
static effects in magnetic materials; these topics too have been postponed so that time-varying effects could be included for completeness.

3.1 *

HISTORICAL SURVEY

* This section may be omitted without loss in continuity of the technical presentation.

Electrostatic theory is based on the single experimental postulate that electric charges
exert forces on each other which vary directly as the product of the strengths of the
charges and inversely as the square of their distance of separation. Thus if q and q' are
chosen to represent the strengths of two point charges, and r is the distance between
them, the force which one charge exerts on the other may be expressed in the form

$$\mathbf{f} \propto \frac{qq'}{r^2}\,\mathbf{1}_r \tag{3.1}$$

in which $\mathbf{1}_r$ is a unit vector along their connecting line. The two electric charges can be
alike or opposite, causing the force to be repulsive or attractive; this feature is accommodated mathematically by permitting the symbols q and q' to have an intrinsic algebraic sign which is positive or negative.
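The way in which the algebraic signs of q and q' settle the direction of the force in (3.1) can be sketched in a few lines. The constant of proportionality k is set to unity here purely for illustration, and the function name and argument layout are hypothetical.

```python
import math

def force_on_q(q, q_prime, r_q, r_q_prime, k=1.0):
    """Force on the charge q (at r_q) due to q' (at r_q_prime), following (3.1).

    k is an assumed constant of proportionality; the unit vector points
    from q' toward q, so a positive product q*q' gives repulsion.
    """
    d = [a - b for a, b in zip(r_q, r_q_prime)]    # vector from q' to q
    r = math.sqrt(sum(c * c for c in d))           # distance of separation
    unit = [c / r for c in d]                      # unit vector 1_r
    magnitude = k * q * q_prime / r**2
    return tuple(magnitude * c for c in unit)

origin = (0.0, 0.0, 0.0)
print(force_on_q(+1.0, +1.0, (1.0, 0.0, 0.0), origin))   # like charges
print(force_on_q(+1.0, -1.0, (1.0, 0.0, 0.0), origin))   # unlike charges
print(force_on_q(+1.0, +1.0, (2.0, 0.0, 0.0), origin))   # doubled separation
```

For like charges the force on q is directed away from q', for unlike charges toward it, and doubling the separation reduces the magnitude by a factor of four.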
This inverse square law, as it is usually called, has a curious history of discovery and
rediscovery. As is true with respect to most major scientific principles, its establishment
cannot be wholly credited to the efforts of one man. Perhaps the first significant contribution to the realization of this law was made by Benjamin Franklin (1706-1790).
Writing to Dr. John Lining of Charlestown, South Carolina, on March 18, 1755,
Franklin described an experiment he had performed in the following words:¹
. . . I electrified a silver pint cann, on an electric stand, and then lowered into it a cork-ball,
of about an inch diameter, hanging by a silk string, till the cork touched the bottom of the
cann. The cork was not attracted to the inside of the cann as it would have been to the outside, and though it touched the bottom, yet, when drawn out, it was not found to be electrified by that touch, as it would have been touching the outside. The fact is singular. You
require the reason; I do not know it. Perhaps you may discover it, and then you will be so
good as to communicate it to me. I find a frank acknowledgment of one's ignorance is not
only the easiest way to get rid of a difficulty, but the likeliest way to obtain information,
and therefore I practice it: I think it an honest policy. Those who affect to be thought to
know every thing, and so undertake to explain every thing, often remain long ignorant of
many things that others could and would instruct them in, if they appeared less conceited.

Later, upon editing a collection of his letters for publication, Franklin added the
footnote
. . . Mr. F. has since thought, that, possibly the mutual repulsion of the inner opposite
sides of the electrised cann, may prevent the accumulating an electric atmosphere upon
them and occasion it to stand chiefly on the outside. But recommends it to the farther
examination of the curious.

Very little progress was made with this idea until Franklin described the above-mentioned experiment to his good friend Joseph Priestley and asked Priestley to repeat
the investigation and verify his results. Priestley (1733-1804), better known as the discoverer of oxygen, undertook experiments beginning in December 1766. He suspended
two pith balls from threads which were entirely inside an electrically charged cup. Like
Franklin, Priestley found² that the balls
. . . remained just where they were placed, without being in the least affected by the
electricity; but that, if a finger, or any conducting substance communicating with the earth,
touched them, or was even presented towards them, near the mouth of the cup, they
immediately separated, being attracted to the sides; as they also were in raising them up,
the moment that the threads appeared above the mouth of the cup.
¹ Bernard Cohen, ed., Benjamin Franklin's Experiments, pp. 331-338, Harvard University Press, 1941.

² J. Priestley, The History and Present State of Electricity with Original Experiments, p. 732, printed for
J. Dodsley, London, 1767.

Based on the results of this experiment, Priestley then made the observation
May we not infer from this experiment, that the attraction of electricity is subject to
the same laws with that of gravitation, and is therefore according to the square of the
distances; since it is easily demonstrated that were the earth in the form of a shell, a body
in the inside of it would not be attracted to one side more than another.

Despite the fact that Priestley was prompt to publish these experimental findings
and his inference of the inverse square relation, the scientific community of his day
failed to appreciate the significance. Indeed, Priestley himself apparently did not
regard this accomplishment as a sufficiently rigorous proof and did not champion his
deductions.
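Priestley's inference rests on the property that a uniform shell exerts no net force on an interior body only when the force law is exactly the inverse square. This can be illustrated by a simple numerical integration over rings of charge. The sketch below is an illustration added here, not a reconstruction of any calculation by Priestley; it assumes a shell of unit radius carrying unit total charge, and the function name and step count are hypothetical choices.

```python
import math

def axial_force_inside_shell(a, n, R=1.0, steps=20000):
    """Net force on a unit test charge at distance a < R from the centre of a
    uniformly charged shell (unit total charge) for a repulsion varying as
    1/s**n. Positive values point away from the centre."""
    total = 0.0
    dtheta = math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta                       # midpoint of the ring
        s = math.sqrt(R * R + a * a - 2.0 * a * R * math.cos(theta))
        dq = 0.5 * math.sin(theta) * dtheta              # charge on the ring
        # axial component of the repulsion exerted by this ring
        total += dq * (a - R * math.cos(theta)) / s ** (n + 1)
    return total

print("exponent 2.00:", axial_force_inside_shell(0.5, 2.00))
print("exponent 2.06:", axial_force_inside_shell(0.5, 2.06))
```

For the exponent 2 the computed force vanishes to within the accuracy of the integration, while for an exponent differing slightly from 2 a residual force remains. It is this sensitivity that makes an interior null observation a test of the force law.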
Two years later in 1769, Dr. John Robison (1739-1805), of Edinburgh, undertook
the task of determining the law of force between electric charges by direct experiment.
Little attention has been given to the historical priority of his discovery, since Robison
made scant attempt at the time to publicize his findings. This is unfortunate, because
he was an accomplished investigator of wide interests, whose discoveries could have
benefited the progress of science. His lectures and scientific researches were published
posthumously in Edinburgh in 1822 and are clearly and engagingly presented in an
extensive four-volume treatise entitled Mechanical Philosophy. Commencing on page 73
of the fourth volume of this treatise, Robison describes in detail an electrometer which
he constructed for the purpose of determining the force law between electrified particles.
Figure 3.1 is a reproduction of Robison's sketch of the electrometer, a device which
balances gravitational and electrical forces, thus giving a mechanical equivalent
of electrical attraction or repulsion. Robison's lengthy description of the apparatus
and its method of operation can be paraphrased by noting that A and B are metallic
balls which, in the course of the experiment, will be electrified. B is attached to an
insulating stalk and counterpoised by the ball D, with the stalk freely hinged at C.
A is insulated by the glass arm FEL to which is attached the hinge C. With the two
balls A and B uncharged, the apparatus is adjusted so that, when BD hangs vertically,
A and B barely touch. The shaft FI is then rotated until
. . . the line LA is horizontal, and so is CB; and the movable ball B is resting on A and is
carried by it. Now electrify the balls, and gently turn the handle I backwards . . . noticing
carefully the two balls. It will happen that, in some particular position of the index, they
will be observed to separate. Bring them together again, and again cause them to separate,
till the exact position at separation is ascertained. This will shew their repulsive force in
contact, or at the distance of their centres, equal to the sum of their radii. Having determined this point, turn the instrument still more toward the vertical position. The balls will
now separate more and more . . . this electrometer . . . will give absolute measures:
for . . . by laying some grains weight on the cork-ball D, till it becomes horizontal and
perfectly balanced, and computing for the proportional lengths of BC and DC, we know
exactly the number of grains with which the balls must repel each other (when the stalk is
in a horizontal position) in order merely to separate. Then a very simple computation will
tell us the grains of repulsion when they separate in any oblique position of the stalk; and
another computation, by the resolution of forces, will shew us the repulsion exerted between
them when AL is oblique, and BC makes any given angle with it.

FIGURE 3.1 Robison's apparatus.

After revealing his talent for careful instrumentation by instructing the reader in the
proper construction and care of the critical components of the electrometer, Robison
moved on to a discussion of the results he had obtained. Noting that he had made
many hundreds of measurements with different instruments, he concluded that
the mutual repulsion of two spheres, electrified positively or negatively, was very nearly in
the inverse proportion of the squares of the distance of their centres, or rather in a proportion somewhat greater, approaching to $1/r^{2.06}$.

By rotating the apparatus so that B was under A, Robison was able to make measurements of the attractive force between unlike charges. The results were similar and
he concluded that the force law was probably the inverse square of distance for both
attraction and repulsion. He failed to recognize the importance of this result, perhaps
because of the subordinate position in which he tended to place experimental work
relative to mathematics.
Another definitive demonstration of the inverse square law was achieved by Henry
Cavendish (1731-1810) in 1773. His experiment had the same basic form as the


approach used earlier by Franklin and Priestley, although it is not clear that Cavendish
was aware of their efforts. He went far beyond their accomplishments, however, and
obtained a quantitative result for the law of force, including an estimate of the precision of his data.
The laboratory technique displayed by Cavendish in all his researches would earn
the admiration of any modern experimenter. In his earlier work with electricity, he had
developed the concept of "degree of electrification" (now called potential), and had
then convinced himself that when two charged conductors are connected by a wire
they redistribute charge in order to attain the same potential. He incorporated this
result into many experiments designed to compare the charge on two bodies which had
been brought to a common potential.
In one of these experiments, Cavendish showed that the charges on similar bodies at
the same potential are in the ratio of their linear dimensions. Using this knowledge, he
expressed the charge on any body in terms of the diameter of a sphere which, when at
the same potential, would have an equal charge. This, in modern language, is the concept of capacitance, and when Cavendish spoke of the charge of a body as "globular
inches" or simply "inches of electricity" he meant that the capacitance of the body
in question was equal to that of a sphere whose diameter in inches was the value quoted.
Cavendish took as his standard a conducting spherical shell whose diameter was 12.1 in.
and he then ascertained, by a well-arranged series of measurements, the relative capacitances of a great number of bodies of many shapes.
His electric force experiment had the intention³
. . . to find out whether, when a hollow globe is electrified, a smaller globe inclosed within
it and communicating with the outer one by some conducting substance is rendered at all
over or undercharged; and thereby to discover the law of the electric attraction and repulsion.

To this end, Cavendish constructed an apparatus consisting of a 12.1 in. diam inner
globe, mounted on a glass rod, and surrounded by two hemispheres of diameter 13.3 in.
He then
. . . made a communication between them by a piece of wire run through one of the hemispheres and touching the inner globe, a piece of silk string being fastened to the end of the
wire, by which I could draw it out at pleasure.

Cavendish next charged the outer globe, withdrew the connecting wire, removed the
two hemispheres, and tested for charge on the inner globe by touching to it an electrometer consisting of two pith balls suspended by fine linen threads. However, he was not
satisfied with the first form of his apparatus, and went to an improved design, about
which he says
For the more convenient performing this operation, I made use of the following apparatus.
It is more complicated, indeed, than was necessary, but as the experiment was of great
importance to my purpose, I was willing to try it in the most accurate manner.
ABCDEF and AbcDef [Figure 3.2] are two frames of wood of the same size and shape,
supported by hinges at A and D in such manner that each frame is moveable on the horizontal
line AD as an axis. H is one of the hemispheres, fastened to the frame ABCD by the four
sticks of glass, Mm, Nn, Pp, and Rr, covered with sealing-wax, h is the other hemisphere
fastened in the same manner to the frame AbcD. G is the inner globe, suspended by the horizontal stick of glass Ss, the frame of wood by which Ss and the hinges at A and D are supported being not represented in the figure to avoid confusion.
Tt is a stick of glass with a slip of tinfoil bound round it at x, the place where it is intended
to touch the globe, and the pith balls are suspended from the tinfoil.

³ J. Clerk Maxwell, ed., The Scientific Papers of the Honourable Henry Cavendish, revised by J. Larmor,
vol. 1, p. 118, et seq., Cambridge University Press, 1921.

FIGURE 3.2 The Cavendish apparatus. (a) Cavendish's original sketch. (b) Maxwell's drawing.

Cavendish describes how the inner globe and hemispheres were coated with tinfoil to
make them good conductors, and how the frame was adjusted so that the hemispheres
would fit accurately together and concentrically around the inner globe. He then goes
on to explain that
It was also so contrived, by means of different strings, that the same motion of the hand
which drew away the wire by which the hemispheres were electrified, immediately after
that was done, drew out the wire which made the communication between the hemispheres
and the inner globe, and immediately after that was drawn out, separated the hemispheres
from each other and approached the stick of glass Tt to the inner globe. It was also contrived
so that the electricity of the hemispheres and of the wire by which they were electrified was
discharged as soon as they were separated from each other, as otherwise their repulsion
might have made the pith balls to separate, though the inner globe was not at all overcharged.

Upon electrifying the outer shell and following the procedure just described, Cavendish brought his pith-ball electrometer into contact with the inner globe, and observed
The result was, that though the experiment was repeated several times, I could never
perceive the pith balls to separate or shew any signs of electricity.

These experiments were performed on December 18-24, 1772 and April 4, 1773. On
the later date Cavendish improved on the detectability of his electrometer by first precharging the pith balls positively or negatively. In this situation, a small like charge on
the inner globe would have slightly altered the separation of the pith balls, whereas
Cavendish observed in both cases that, upon contact with the inner globe, the pith
balls collapsed toward each other, assuming a position in which they were barely
separated. This indicated that the greater capacitance of the inner globe was draining
most of the precharge off the pith balls, and thus that the charge which had been on
the inner globe was much less than the charge with which he had pre-electrified the
electrometer.
Cavendish next turned his attention to the question of the accuracy of his measurements. At issue was the minimum charge on the inner globe which his pith-ball electrometer could detect. To make an estimate of this minimum detectable charge, Cavendish totally discharged the condenser which had been used to charge the outer sphere
in the electric force experiment. He then recharged this condenser with 1/60th of its
original charge, being sure of this value through his use of a set of calibrated condensers.
Upon connecting the recharged condenser to the inner globe (with the hemispheres
removed), he was certain that the charge transferred to the inner globe was less than
1/60th of the charge which had been transferred to the outer sphere in the original electric force experiment. Now, upon bringing his electrometer in contact with the inner
globe, Cavendish found a sensible effect on the separation of the pith balls. Thus he
was led to the conclusion
It appears, therefore, that if a globe 12.1 inches in diameter is inclosed within a hollow globe
13.3 inches in diameter, and communicates with it by some conducting substance, and the
whole is positively electrified, the quantity of redundant fluid lodged in the inner globe is
certainly less than 1/60th of that lodged in the outer globe, and that there is no reason to
think from any circumstance of the experiment that the inner globe is at all overcharged.

Cavendish then proceeded to argue that the law of electric attraction and repulsion
must be inversely as the square of the distance. But he was not yet satisfied. He wanted
. . . to form some estimate how much the law of the electric attraction and repulsion may
differ from that of the inverse duplicate ratio of the distances without its having been
perceived in this experiment . . .

Reasoning as had Newton in the case of gravitational attraction, Cavendish assumed
that the electric charge would spread uniformly over a sphere, and that each element
of charge would exert forces on all other elements according to the same law, with the
principle of superposition applying. He then assumed that the force law had an inverse
distance dependency of the 2 ± 1/50th power and computed the amount of charge which
would have to reside on the inner globe so that the net force on a charge located at the
midpoint of the wire connecting the two globes was zero. This amount of charge turned
out to be 1/57th of the charge on the outer sphere. Since 1/57th is larger than the detectable
1/60th amount, Cavendish concluded that
. . . the electric attraction and repulsion must be inversely as some power of the distance
between that of the 2 + 1/50th and that of the 2 − 1/50th, and there is no reason to think
that it differs at all from the inverse duplicate ratio.

Cavendish also investigated the manner in which the electric force law is dependent
on the amount of charge. Once again, he ingeniously contrived a quasi-null experiment,
the apparatus for which is depicted in Figure 3.3. In Cavendish's own words:4
CD is a wooden rod 43 inches long, covered with tinfoil and supported horizontally by
non-conductors. At the end C is suspended, as in the figure, the electrometer described 5 in
Article 249, and at the other end D is suspended a similar electrometer, only the straws
reached to the bottom of the cork balls A and B, but not beyond them, and were left open so
as to put in pieces of wire, and thereby increase their weight and the force with which they
endeavoured to close.

4 Ibid., pp. 189-193.
5 This electrometer consisted of two wheaten straws, suspended by pin bearings from a brass block,
and terminated by gilded cork balls. Ibid., 131.

The two Leyden jars E and F were approximately equal in capacitance and each
exceeded the capacitance of the bar and electrometers together by a factor of about
one hundred. The outer coatings of both jars were grounded and the inner coating of E
was permanently attached to the bar CD. With the wire weights in the straws A and B,
the system consisting of the jar E, the bar CD, and the two electrometers was charged
until A and B were separated by a measurable amount. The jar F was then connected
to the system, essentially halving the charge on the electrometers. The electrometer
at C was then observed to have a separation almost equal to the separation which
A and B had previously experienced with double the charge. Since Cavendish had
determined that the two electrometers were almost identical and since he had chosen
the wire weights so that they would quadruple the force tending to close AB (the
actual factor was 3.9), he was able to conclude that the electric force was directly
proportional to the amount of each charge.

FIGURE 3.3 Cavendish method for determining relation between force and charge intensity.

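
Cavendish's inference about the charge dependence can be checked with a line of arithmetic. The sketch below is illustrative only: it assumes the repulsion between the balls scales as some power m of the charge on the system, and the variable names are not Cavendish's. Since halving the system charge halves the charge on each ball, m = 2 corresponds to a force directly proportional to the amount of each charge.

```python
import math

# Figures quoted in the text; the variable names are illustrative only.
weight_factor = 3.9   # wire weights multiplied the closing force on A and B by 3.9
charge_ratio = 0.5    # connecting jar F roughly halved the charge on the system

# Assumption: the repulsion between the balls scales as q**m, where q is the
# charge on the system. Equal separations before and after then require
#     charge_ratio**m == 1 / weight_factor
exponent = math.log(1.0 / weight_factor) / math.log(charge_ratio)

print(f"inferred exponent m = {exponent:.3f}")
```

The exponent comes out slightly below 2, which is consistent with the "almost equal" separations that Cavendish reported.
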
The results of these highly original and definitive experiments were unknown to the
scientific community for almost a century for, like Robison, Cavendish chose not to
publicize his findings. By the time a general awareness had developed that each of
these men had established the inverse square law, the credit and fame had been bestowed properly on someone else.
That someone else was Charles Augustin de Coulomb (1736-1806) who, in 1785, also
demonstrated the law of electric force, using a technique totally different from those
employed by any of his predecessors. Coulomb's procedure involved the use of a torsion
balance which he had invented. With it, he measured the repulsive force between two
like charges, balancing this force by the torsion in a wire from which a bar containing
one of the charges was suspended. His celebrated First Memoir on Electricity and Magnetism contains a preliminary section in which the torsion balance is clearly described
as follows:6

6 C. A. de Coulomb, "Première Mémoire sur l'Électricité et Magnétisme," Histoire de l'Académie
Royale des Sciences, 569; 1785. For an English translation of excerpts, see W. F. Magie, A Source Book
in Physics, McGraw-Hill Book Company, New York, 1935.

FIGURE 3.4 Coulomb's apparatus.

On a glass cylinder ABCD [Figure 3.4, sub-Figure 1] . . . we place a glass plate . . .
which completely covers the cylinder; this plate is pierced by two holes . . . one at the
center f, upon which is erected a glass tube; this tube is bonded over the hole f; at the upper
end h of this tube, a torsion micrometer is placed which we see in detail in sub-Figure 2.
The upper part, No. 1, carries the milled head b, the index io, and the chuck q; this piece fits
into the hole G of piece No. 2; piece No. 2 consists of a circle ab divided along its girth into
360 degrees, and a copper tube Φ which fits into the tube H, No. 3, which in turn is sealed to
the interior of the upper end of the glass tube fh of sub-Figure 1. The chuck q is shaped much
like the end of a solid pencil holder, and is closed by means of the ring q. In this chuck is
clamped the end of a very fine silver wire: the other end of the silver wire [sub-Figure 3] is
held at P in a clamp made of a cylinder Po of copper or iron . . . whose upper end P is split
so as to form a clamp which is closed by means of the sliding piece Φ. This small cylinder is
enlarged and pierced at C in order to permit the needle ag to pass through; it is necessary
that the weight of the small cylinder be sufficient to stretch the silver wire without breaking
it. The needle, ag, is seen [sub-Figure 1] to be suspended horizontally, and about half-way
up in the large cylinder which encloses it, and is formed either of a silk thread or straw soaked
in Spanish wax and finished off from q to a for eighteen lines† of its length by a cylindrical
rod of shellac; at the extremity a of this needle is found a small pith ball two or three lines
in diameter; at g there is a little vertical piece of paper soaked in turpentine, which serves
as a counterbalance to the ball a and retards oscillations.
We have said that the cover AC was pierced by a second hole at m; into this second hole
is introduced a slender rod mΦt, whose lower portion Φt is made of shellac; at t is another pith
ball; around the perimeter of the glass cylinder ABCD, at the height of the needle, is
described a circle zQ divided into 360 degrees; for greater simplicity I use a strip of paper
divided into 360 degrees which is pasted around the cylinder at the height of the needle.

Coulomb then goes on to explain how the instrument is prepared for the experiment
by securing the pith ball t in place and adjusting the micrometer head so that the two
pith balls are just touching. His description of the actual experiment follows:
We electrify a small conductor [Figure 3.4, sub-Figure 4] which is simply a large-headed
pin insulated by plunging its point into the end of a rod of Spanish wax; this pin is introduced
through the hole m and permitted to touch the ball t, which is in contact with the ball a;
upon withdrawing the pin, the two balls are left electrified with the same kind of electricity
and they repel each other to a distance which is measured by looking beyond the suspension
wire and the center of the ball a to the corresponding division of the circle zOQ; then by
rotating the index of the micrometer in the direction pno, we twist the suspension wire lP
and exert a force proportional to the angle of torsion, which tends to bring the ball a closer
to the ball t. In this way one can observe the distance through which different angles of
torsion bring the ball a toward the ball t; comparison of the forces of torsion with the
corresponding distances of the two balls determines the law of repulsion. I shall here only
present some trials which are easy to repeat and which will at once make evident this law
of repulsion.

Coulomb then indicated an initial separation of the two balls of 36 deg, causing an
initial torsional twist of 36 deg in the suspending wire. He next turned the micrometer
head through 126 deg, causing the balls to reduce their separation to 18 deg. Finally,
by turning the micrometer head through 567 deg, he observed that the separation had
been reduced to 8½ deg. Since the force of torsion is proportional to the angle of twist,
these data can be tabulated as shown in Table 3.1.

† Before adoption of the metric system in France, one line equalled 1/12 in.

TABLE 3.1
COULOMB'S EXPERIMENTAL DATA FOR THE LAW OF REPULSION

Angular separation of the two pith balls, deg    Angular measure of the force of torsion, deg
36                                               36
18                                               144
8½                                               575½

The values of wire twist in the second column are composed of the rotation of the
micrometer head and the angular displacement of ball a. The distance of separation of
the two pith balls is proportional to the sine of one-half their angle of separation, but
since all these angles are small, the distance of separation is essentially proportional, in
this experiment, to the angle of separation. One notes from the first column of the table
that the angles of separation are almost in the ratio 4:2:1. The second column of the
table lists quantities proportional to the restoring force, and these figures are essentially
in the ratio 1:4:16. In analyzing the data, Coulomb was led to the conclusion
It results then from these three trials that the repulsive action which the two balls exert
on each other when they are electrified similarly is in the inverse ratio of the square of the
distance.
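
The three trials of Table 3.1 can be tested against the inverse square law directly. The sketch below is an illustration, not Coulomb's own reduction: it takes the angular separation as a stand-in for distance (the small-angle approximation described above), fits the exponent implied by pairs of trials, and compares the third measured separation with the one an exact inverse square law would give.

```python
import math

# Table 3.1: angular separation of the balls and total torsion, in degrees.
separation_deg = [36.0, 18.0, 8.5]
torsion_deg = [36.0, 144.0, 575.5]

def fitted_exponent(i, j):
    """Exponent n in F ~ 1/distance**n implied by trials i and j,
    using angular separation as the measure of distance."""
    force_ratio = torsion_deg[j] / torsion_deg[i]
    distance_ratio = separation_deg[i] / separation_deg[j]
    return math.log(force_ratio) / math.log(distance_ratio)

n_12 = fitted_exponent(0, 1)
n_13 = fitted_exponent(0, 2)

# Separation that an exact inverse square law would give for the third torsion.
predicted_third = separation_deg[0] * math.sqrt(torsion_deg[0] / torsion_deg[2])
shortfall_percent = 100.0 * (predicted_third - separation_deg[2]) / predicted_third

print(f"exponent from trials 1 and 2: {n_12:.3f}")
print(f"exponent from trials 1 and 3: {n_13:.3f}")
print(f"third separation: predicted {predicted_third:.2f} deg, measured 8.5 deg "
      f"({shortfall_percent:.1f} percent short)")
```

The first two trials give an exponent of 2, while the third falls about 6 percent short in separation.
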

It is interesting to note the great change which has taken place in the method of
reporting in scientific journals since Coulomb's time. Whereas Coulomb went into
great detail in describing his apparatus, the experimental procedure, and possible
sources of error, when it came to reporting data he listed only three points, one of
which deviated from the inverse square law by 6 percent. His statement in the First
Memoir just prior to introduction of the data clearly suggests that other trials had
been undertaken and one can only surmise that Coulomb felt a small sample of his
data would be sufficiently convincing.
Upon turning his attention to an investigation of the law of electric force for the
attraction between oppositely charged bodies, Coulomb encountered a new difficulty:7
I wished to use the same method to determine the attractive force between two pith balls
charged with opposite natures of electricity, but by using this same balance to measure the
attractive force, I found an experimental difficulty which did not occur during the measurement of repulsive force. The experimental difficulty arises when the two balls are drawn near
to each other. The attractive force . . . frequently increases at a greater rate than the
torsional force, which increases only directly as the angle of twist; as a consequence, if several
readings are desired, the balls must be prevented from touching each other by means of an
insulating stop in the path of the needle. Since the balance is often required to measure forces
of less than one thousandth of a grain, the collision of the needle with the insulating stop
influences the results and causes part of the electric charge to be lost.

7 C. A. de Coulomb, "Seconde Mémoire sur l'Électricité et Magnétisme," Histoire de l'Académie
Royale des Sciences, 579; 1785.

Coulomb displayed his ingenuity in circumventing this difficulty by devising an
experiment in which he related the period of a pendulum to the spatial dependence of
the force of electric attraction. He suspended a horizontal needlelike insulator from a
thin silk thread (Figure 3.5) and attached to it a tinsel disc at the end l and a counterbalance at g. Nearby was placed a globe G, and in Coulomb's words

. . . we adjust the globe G in such a way that its horizontal diameter Gr is opposite the
center of the tinsel disc l, which is some inches away from it. We give an electric spark to
the globe from a Leyden jar [condenser]; we then ground the disc l with a conductor, and
the action of the electrified globe on the electric fluid of the unelectrified tinsel disc gives to
the disc a charge of the opposite type from that of the globe; so that when the ground is
removed, the globe and disc act on each other by attraction.

FIGURE 3.5 Coulomb's apparatus for unlike charges.

Designating by d the distance from the needle's center c to the center of the globe G,
Coulomb varied d and, after setting the needle into oscillation, recorded the time it
took for the needle to perform a specified number of oscillations. He listed three trials,
the recorded data for which is reproduced in Table 3.2.
TABLE 3.2
COULOMB'S EXPERIMENTAL DATA FOR THE LAW OF ATTRACTION

d, in.    No. of oscillations    Elapsed time, sec
9         15                     20
18        15                     41
24        15                     60

FIGURE 3.6 Composition of forces in Coulomb experiment.
To analyze the data, it is necessary to determine the relationship between oscillation
time and disposition of the parts of the apparatus. Figure 3.6 shows the needle in an
arbitrary angular position and indicates the attractive electric force $F_e$ resolved into
tangential and radial components. If r is the distance from the tinsel disc l to the
center of the needle c, then

$$\frac{d^2\theta}{dt^2} = -\frac{F_e\, r \cos\varphi}{I}$$

in which I is the moment of inertia of the needle about the axis containing its suspending thread. Under the assumption that the law of attraction is the same as the
law of repulsion, $F_e = K(d')^{-2}$, in which K is a constant and d' is the distance from l
to the center of the globe G. Then

$$\frac{d^2\theta}{dt^2} = -\frac{K r \cos\varphi}{I\,(d')^2}$$

Since d' was considerably greater than r in Coulomb's experiment, d' can be replaced
by d in the above equation. Further, since $\varphi = 90^\circ - \theta - \alpha$, if $90^\circ - \theta$ is very much
greater than $\alpha$ (small oscillations) then $\varphi \approx 90^\circ - \theta$, and $\cos\varphi \approx \sin\theta \approx \theta$. Making
these substitutions gives

$$\frac{d^2\theta}{dt^2} \approx -\frac{K r}{I d^2}\,\theta$$

which is the equation for simple harmonic motion. The period T is therefore

$$T \approx 2\pi d \sqrt{\frac{I}{K r}}$$

Thus for small oscillations, if the law of attraction is the inverse square, the period
should be proportional to the separation distance d. This analysis would predict periods
in the ratio 20: 40: 54 whereas Table 3.2 indicates that the measured periods were in
the ratio 20: 41 : 60.
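
The comparison between predicted and measured times can be reproduced from Table 3.2. The sketch below is illustrative: it assumes only that the period is proportional to d, as derived above, and scales the first trial to predict the other two. No correction for charge leakage is applied.

```python
# Table 3.2: globe-to-needle distance (in.) and time for 15 oscillations (sec).
distances_in = [9.0, 18.0, 24.0]
measured_s = [20.0, 41.0, 60.0]

# Period proportional to d: scale every trial from the first one.
scale = measured_s[0] / distances_in[0]
predicted_s = [scale * d for d in distances_in]

# Percentage by which each measured time exceeds the inverse-square prediction.
excess_percent = [100.0 * (m - p) / p for m, p in zip(measured_s, predicted_s)]

for d, p, m, e in zip(distances_in, predicted_s, measured_s, excess_percent):
    print(f"d = {d:4.0f} in.  predicted {p:5.1f} s  measured {m:5.1f} s  excess {e:4.1f} %")
```

Scaling the first trial linearly gives 20, 40 and roughly 53 sec, which matches the 20:40:54 ratio quoted above to within rounding. The uncorrected third trial runs about 12 percent long.
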
Coulomb made some measurements of the rate of dissipation of electric charge and
then corrected his data (the entire experiment took 4 minutes and he found that 1/40th
of the charge was dissipated per minute). After correction, the lack of agreement was
negligible for the second trial and only 5 percent for the third trial, which led him to
conclude:
We have thus come, by a method absolutely different from the first, to a similar result;
we may therefore conclude that the mutual attraction of the electric fluid which is called
positive on the electric fluid which is customarily called negative is in the inverse ratio of the
square of the distances; just as we have found in the first memoir, that the mutual action of the electric
fluid of the same type is in the inverse ratio of the square of the distance.

Coulomb also investigated the manner in which the amount of electric charge affected
the electric force. To do this, he replaced the stationary pith ball t (Figure 3.4) by a
small iron circle and proceeded in the following manner:8
He electrified these two bodies [the pith ball a and the small iron circle] simultaneously by
means of the head of a pin, and the repulsive force separated the needle from the iron circle;
when it was brought back and placed at a distance of 30 degrees the index pointed to 110
degrees; the repulsive force therefore was [proportional to] 140 degrees. He then touched the
little iron circle with another of the same substance and same diameter; the needle immediately approached the circle, and to bring it back to the distance of 30 degrees, it was
found necessary to untwist the wire till the index stood at 40 degrees; therefore the repulsive
force was reduced to 40 + 30 or 70 degrees, the half of 140 degrees, the measure of its
former intensity.

Arguing that when the charged iron circle was touched by a similar uncharged circle,
its charge was reduced to half, Coulomb then concluded that the electric force is linearly
proportional to the charge on each body.
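
The arithmetic behind this conclusion is short enough to spell out. The sketch below simply adds the micrometer reading to the fixed 30 degree separation to obtain the total twist before and after the iron circle shared its charge. The variable names are illustrative.

```python
separation_deg = 30.0        # separation at which both readings were taken
index_before_deg = 110.0     # micrometer index with the full charge
index_after_deg = 40.0       # micrometer index after sharing the charge

# Total twist of the wire is the index reading plus the angular separation.
torsion_before = index_before_deg + separation_deg
torsion_after = index_after_deg + separation_deg

force_ratio = torsion_after / torsion_before
print(f"torsion before = {torsion_before:.0f} deg, after = {torsion_after:.0f} deg, "
      f"ratio = {force_ratio:.2f}")
```

Halving the charge on one body at a fixed separation halves the force, which is what a linear dependence on that charge requires.
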
These discoveries by Coulomb formed the first quantitative basis for a mathematical
statement of the law of electric force. Although his method lacked the degree of accuracy of the approach used by Cavendish, it was direct, it was quantitative, and it was
easy to comprehend. The scientific world readily accepted Coulomb's results, the first
of a substantive nature to be published and widely distributed.
This acceptance was greatly furthered by the theoretical contributions of Simeon
Denis Poisson (1781-1840) who, in two brilliant memoirs 9 presented in the years 1812
and 1813, lifted electrostatic theory to a mature state of development. He accomplished
this by accepting Coulomb's inverse square law as a fundamental postulate and making
rich use of the analogy to gravitational theory, a subject already highly advanced at
that time.
In an article in the Mémoires de Berlin in 1777, Lagrange had shown that if a function
ψ(x,y,z) be formed by adding together the masses of all the particles of an attracting
system, each divided by its distance from (x,y,z), then the derivatives of this function
were equal to the components of the attractive force at (x,y,z). Laplace later demonstrated 10 that this function ψ satisfies the equation

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = 0$$

at all points not occupied by masses.
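
Both statements can be checked numerically. The following is a minimal sketch under stated assumptions: the three point masses and the field point are arbitrary illustrative values, the gravitational constant is set to one, and the derivatives are estimated by central differences rather than by any method Lagrange or Laplace used.

```python
import math

# Hypothetical point masses as (mass, position); values are for illustration only.
MASSES = [
    (1.0, (0.0, 0.0, 0.0)),
    (2.5, (1.0, -0.5, 0.3)),
    (0.7, (-0.8, 0.4, 1.1)),
]

def psi(x, y, z):
    """Lagrange's function: each mass divided by its distance from (x, y, z)."""
    return sum(m / math.dist((x, y, z), p) for m, p in MASSES)

def gradient(f, x, y, z, h=1e-3):
    """Central-difference estimate of the three first derivatives of f."""
    return (
        (f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
        (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
        (f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
    )

def laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of the sum of the three second derivatives."""
    neighbours = (f(x + h, y, z) + f(x - h, y, z) + f(x, y + h, z)
                  + f(x, y - h, z) + f(x, y, z + h) + f(x, y, z - h))
    return (neighbours - 6.0 * f(x, y, z)) / h**2

def attraction(x, y, z):
    """Inverse-square attraction on a unit mass at (x, y, z), summed directly."""
    fx = fy = fz = 0.0
    for m, (px, py, pz) in MASSES:
        d = math.dist((x, y, z), (px, py, pz))
        fx += m * (px - x) / d**3
        fy += m * (py - y) / d**3
        fz += m * (pz - z) / d**3
    return fx, fy, fz

FIELD_POINT = (2.0, 1.5, -1.0)   # a point not occupied by any mass

grad_psi = gradient(psi, *FIELD_POINT)
force = attraction(*FIELD_POINT)
lap_psi = laplacian(psi, *FIELD_POINT)

print("derivatives of psi:", grad_psi)
print("direct attraction: ", force)
print("Laplacian of psi:  ", lap_psi)
```

At the chosen field point the derivatives of ψ agree with the directly summed attraction, and the Laplacian vanishes to within the finite-difference error.
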
8 J. Farrar, Elements of Electricity, Magnetism, and Electromagnetism, Hilliard and Metcalf, Boston,
1826. (Notes selected from Biot's Précis Élémentaire de Physique, compiled for the use of students of
the University at Cambridge, New England.)
9 S. D. Poisson, "On the Distribution of Electricity at the Surface of Conducting Bodies." First
Memoir read to the French Academy on May 19 and August 3, 1812. Printed in Mém. de l'Institut,
part 1, 1-92; 1811. Second Memoir read on September 6, 1813. Printed in Mém. de l'Institut, part 2,
164-274; 1811.
10 P. S. Laplace, "Theory of Attractions of Spheroids," Mém. de l'Académie Royale, 113-196; 1782
(published in 1785).

In laying the groundwork for a similar formulation involving electric charge, Poisson
opened his First Memoir by remarking
The theory of electricity which is most generally admitted is that which attributes all the
phenomena to two different fluids, distributed within all bodies of nature. It is supposed
that the molecules of one fluid repel each other and that they attract the molecules of the
other; these forces of attraction and repulsion obey the inverse square law of distance; at
the same distance the attractive power is equal to the repulsive power; from which it follows
that when all the parts of a body contain equal amounts of the two fluids, the latter do not
exercise any influence on the fluids contained in neighboring bodies, and as a consequence
no signs of electricity are manifest. This equal and uniform distribution of the two fluids is
called the natural state; when this state is disturbed for any reason, the body becomes electrified, and the various phenomena of electricity begin to take place.
All the bodies of nature do not behave the same way with respect to the electric fluid:
some, such as the metals, do not appear to exert any influence on it, but permit it to move
about freely in their interior: for this reason they are called conductors. Others, on the contrary, very dry air for example, oppose the passage of the electric fluid in their interior; in
this way they serve to prevent dissipation throughout space of the fluid accumulated in
conducting bodies. The phenomena associated with electrified conductors, whether these
conductors be considered singly, or whether they be considered in conjunction and exerting a mutual influence, are the objectives of this Memoir, in which I propose to apply
the calculus to this important branch of physics. But before entering into these matters,
I wish to state in some detail the principles which serve as the basis for my analysis, and
to make known the most remarkable results to which they have led me.

Poisson's central principle, of course, was the assumption of Coulomb's inverse square
law, on the basis of which he introduced a function† Φ(x,y,z), composed of the sum of
the charges of an electrical system, each divided by its distance from (x,y,z). He then
argued, as had Lagrange in the case of gravitational attraction, that the derivatives

$$\frac{\partial\Phi}{\partial x} \qquad \frac{\partial\Phi}{\partial y} \qquad \frac{\partial\Phi}{\partial z}$$

would yield the components of electric force‡ at (x,y,z).


Turning his attention to conducting bodies, Poisson assumed that an excess of one
electric fluid had been placed on a conductor, and reasoned:
By virtue of the repulsive force between these [excess] particles, and because the metal does
not oppose their movement, one can imagine that the added fluid is transported to the
surface of the body, where it will remain because of the air environment. Coulomb has
proved in effect, by direct experiments, that no atoms of electricity reside in the interior of
an electrified conductor except the natural electricity of the body: all the added fluid distributes itself over the surface . . . it exerts neither attraction nor repulsion at any interior
point of the body; for if this condition were not satisfied, the action of the surface layer of
electricity on interior points would decompose a new quantity of the natural electricity of
the body, and its electric state would be changed.

† Fifteen years later, in generalizing Poisson's work on electric and magnetic phenomena, George
Green (1793-1841) gave to this function the name potential, and this appellation has been universally
adopted ever since.
‡ Poisson's original notation has been altered to be consistent with the remainder of this chapter.

As a consequence of this argument, Poisson adopted the principle that he could find
the manner in which the excess charge distributed itself over the outer surface of a
conductor, by imposing the condition that this distribution must lead to no net electric force at any interior point of the conductor. In terms of the potential function Φ,
this meant that if P were a point in the interior of an electrified conductor, then
. . . the value of Φ is independent of the coordinates of the point P; because then the
partial derivatives of this function being null, the force at the interior point P will be also.

Thus was the concept formulated that a conducting body in electrostatic equilibrium
is an equipotential.
Poisson next turned his attention to conditions at the surface of an electrified conductor and argued, following a suggestion by Laplace, that the electric force at a point
immediately outside the conductor is proportional to the local concentration of surface
charge density. He did this by dividing the force into a part f due to the element of
charged surface immediately adjacent to the point, and a part F due to the rest of the
surface. At a neighboring point just inside the conductor, F will be unchanged but f will
have to be reversed to give a null force. Therefore the resultant force at the exterior
point must be 2f. But if the exterior point is extremely close to the surface, the immediately adjacent surface element looks like an infinite plane, uniformly charged, for
which case Poisson showed the force f to be proportional to the charge per unit area of
the surface, thus completing the theorem.
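
The step in which the adjacent surface element "looks like an infinite plane" can be illustrated with the standard on-axis field of a uniformly charged disc. The sketch below is an assumption-laden illustration and not Poisson's derivation: it uses modern SI-style notation with `sigma` and `eps0` both set to one in arbitrary units, so that the infinite-plane value of f is 0.5 and the total exterior force 2f is 1.0.

```python
import math

def disc_axial_field(z, a, sigma=1.0, eps0=1.0):
    """On-axis field at height z above a uniformly charged disc of radius a.

    Standard result: (sigma / (2 * eps0)) * (1 - z / sqrt(z**2 + a**2)).
    """
    return sigma / (2.0 * eps0) * (1.0 - z / math.sqrt(z * z + a * a))

plane_value = 0.5   # sigma / (2 * eps0) for an infinite plane, in these units

# As the field point approaches the surface (z much smaller than a), the
# local patch contributes essentially the infinite-plane value f.
heights = [1.0, 0.1, 0.01, 0.001]
for z in heights:
    f = disc_axial_field(z, a=1.0)
    print(f"z/a = {z:6.3f}   f = {f:.4f}   exterior force 2f = {2 * f:.4f}")

f_close = disc_axial_field(0.001, a=1.0)
```

Very close to the surface f tends to half of the surface-charge value, so the resultant 2f is proportional to the local charge per unit area, as the theorem states.
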
Using the principle that a charged conductor must be an equipotential, Poisson
deduced the surface distribution for several simple shapes, including an ellipsoid, and
then enlarged his analysis to the study of two charged spheres placed at any distance
from each other. This was a classic and difficult problem to which he devoted over
three-quarters of the space occupied by these two lengthy memoirs. The solution
involves single or double gamma functions, depending on whether or not the two
spheres are in contact. Poisson laboriously computed the values of his integrals for a
variety of conditions and exhibited very satisfactory agreement with the earlier
experimental results of Coulomb.
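The year 1813 recorded another significant contribution by Poisson when, in a brief
note,11 he extended Laplace's equation to include points occupied by matter, obtaining

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = -4\pi\rho$$

in which ρ is the volume density of mass. The same connection exists, of course,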
between electric potential and charge density. Poisson's proof of the validity of this
important differential equation, which bears his name, has a simple elegance which will
fully reward a decision to consult the original paper. An alternative derivation will be
offered in section 3.9.
The admiration invoked by recounting these achievements of Poisson perhaps has
been expressed best by Whittaker:12
The rapidity with which Poisson passed from the barest elements of the subject to
such recondite problems as those just mentioned may well excite admiration. His success is,
no doubt, partly explained by the high state of development to which analysis had been
advanced by the great mathematicians of the eighteenth century; but even after allowance
has been made for what is due to his predecessors, Poisson's investigation must be accounted
a splendid memorial of his genius.

11 S. D. Poisson, "Remarks on an Equation Which Occurs in the Theory of Attractions of Spheroids,"
Bull. de la Soc. Philomathique, 3, 388-392; 1813.
12 E. Whittaker, A History of the Theories of Aether and Electricity, vol. 1, p. 62, Thomas Nelson and
Sons, Ltd., 1951.

Poisson's differential equation, linking spatial derivatives of the electrostatic potential to charge distribution, found its integral counterpart through a discovery by Karl
Friedrich Gauss (1777-1855). In 1813 Gauss established 13 the famous divergence
theorem

$$\oint_S \mathbf{D}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{D}\,dV$$

connecting a volume integral throughout V to a surface integral over S, with S being
the closed surface bounding the volume V, and D being any vector function possessing
continuous first derivatives in a region containing V. If D is a radial field which varies
inversely with the square of the distance from some point O, that is, if $\mathbf{D} = \mathbf{1}_r/r^2$, then the surface integral of Gauss' divergence theorem yields the simple result

$$\oint_S \mathbf{D}\cdot d\mathbf{S} = 4\pi$$

if O is inside V; otherwise the result is zero. This special result is known as Gauss'
integral. When D is properly related to Coulomb's inverse square law, $\oint_S \mathbf{D}\cdot d\mathbf{S}$ equals
the net charge enclosed by S. This result, coupled with the divergence theorem, yields
an integral form of Poisson's equation. These deductions will be elaborated in the sections to follow.
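
Gauss' integral lends itself to a direct numerical check. The sketch below is illustrative only: it takes S to be a sphere (an assumption made for convenience, since the result holds for any closed surface), places the point O at the coordinate origin, and sums the outward flux of the inverse-square radial field by a midpoint rule. The sphere centres and grid sizes are arbitrary choices.

```python
import math

def flux_through_sphere(centre, radius, n_theta=200, n_phi=400):
    """Outward flux of D = (unit radial vector) / r**2, with O at the origin,
    through a sphere of the given centre and radius (midpoint rule)."""
    cx, cy, cz = centre
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the sphere at this patch.
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            # Position of the patch relative to O.
            x, y, z = cx + radius * nx, cy + radius * ny, cz + radius * nz
            r = math.sqrt(x * x + y * y + z * z)
            d_dot_n = (x * nx + y * ny + z * nz) / r**3
            total += d_dot_n * radius**2 * sin_t * d_theta * d_phi
    return total

flux_enclosing = flux_through_sphere(centre=(0.3, 0.2, -0.1), radius=1.0)
flux_excluding = flux_through_sphere(centre=(3.0, 0.0, 0.0), radius=1.0)

print(f"O inside S:  flux = {flux_enclosing:.4f}  (4*pi = {4 * math.pi:.4f})")
print(f"O outside S: flux = {flux_excluding:.4f}")
```

The flux is close to 4π when the sphere encloses O, even though O is off centre, and close to zero when it does not.
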
Another great advance in electrostatic theory, though it was not so recognized at
the time, was made by Michael Faraday (1791-1867). His keen sense of physical visualization led him to picture all force functions in terms of flux lines. This technique
first suggested itself to Faraday because of the common custom of illustrating magnetic power by strewing iron filings on a sheet of paper and noticing the curves along
which they arranged themselves when a magnet was placed underneath the paper.
From this Faraday evolved the idea of lines of magnetic force, whose direction at every
point coincided with the direction of the magnetic intensity.
It was a simple extension to apply this concept of flux lines to gravitational effects
and to electric intensity. About the latter, Faraday said 14
The lines of force of the static condition of electricity are present in all cases of induction.
They terminate at the surfaces of the conductors under induction, or at the particles of nonconductors, which, being electrified, are in that condition.

13 K. F. Gauss, "Theoria Attractionis Corporum Sphaeroidicorum Ellipticorum Homogeneorum,"
reprinted in his Werke, vol. 5, pp. 1-22, published by the Royal Society of Science, Göttingen, 1870.
14 M. Faraday, Experimental Researches in Electricity, vol. 3, art. 3249, published by Bernard Quaritch,
London, 1855.

This conception permitted Faraday to replace action at a distance with a local interaction of charge and a field of force, a viewpoint which had great appeal for Maxwell
(1831-1879). In the Preface to the first edition of his celebrated Treatise on Electricity
and Magnetism, Maxwell wrote
. . . before I began the study of electricity I resolved to read no mathematics on the subject
till I had first read through Faraday's Experimental Researches in Electricity. I was aware
that there was supposed to be a difference between Faraday's way of conceiving phenomena
and that of the mathematicians, so that neither he nor they were satisfied with each other's
language. I had also the conviction that this discrepancy did not arise from either party
being wrong.

Maxwell found, as he proceeded with the study of Experimental Researches, that it was
possible to couch Faraday's ideas in mathematical terms and thus compare them with
the formulations preferred by mathematicians. As part of the contrast, he noted
For instance, Faraday, in his mind's eye, saw lines of force traversing all space where the
mathematicians saw centres of force attracting at a distance: Faraday saw a medium where
they saw nothing but distance: Faraday sought the seat of the phenomena in real actions
going on in the medium, they were satisfied that they had found it in a power of action at a
distance impressed on the electric fluids.

Maxwell's skillful mathematical exposition of Faraday's ideas led him to conclude that
the results of the two methods coincided, but that Faraday's viewpoint was much
richer. Thus he adopted it and furthered it with many ideas of his own. It was Maxwell
who introduced the concept of the electric flux density function D (he called it the
displacement), a concept which becomes especially meaningful in the study of dielectrics. Using Green's Theorem, he obtained an expression for the energy stored in an
electrostatic system in the form

$$W = \frac{\epsilon_0}{2}\int_V E^2(x,y,z)\,dx\,dy\,dz$$

which highlights the interpretation of the electric field E as the seat of the phenomena.
Maxwell also solved a variety of boundary-value problems, obtaining both the potential and electric field, and displaying these for the first time as precise field maps, to
illustrate Faraday's idea of lines of force. The plates appended to both volumes of his
Treatise include some of the most beautiful flux maps which have ever been prepared.
The field approach of Faraday and Maxwell, with strong emphasis on the local interaction of a charge and a field, will be found to have permeated the remainder of this
text.
Maxwell also contributed to the establishment of the inverse square law. His interest
in this problem had been aroused by his reading of the unpublished manuscripts of
Cavendish. These manuscripts had been brought to the attention of Lord Kelvin after
Cavendish's death. Recognizing their importance and desiring that they be published,
Kelvin urged the Duke of Devonshire, to whom the manuscripts belonged, to entrust
them to Maxwell. This he did in 1874.
The Cavendish experiment which particularly caught Maxwell's admiration was the
one concerned with the determination of the law of electric force, and he resolved to
repeat it. Accordingly, with Sir Donald McAlister, he devised an apparatus which
improved in several particulars on Cavendish's original design. The principal innovations were the use as detector of a more sensitive quadrant electrometer and the adoption of a technique which did not require the dismantling of the outer spherical shell.
Maxwell provided a thorough analysis of the accuracy of this method which, coupled
with McAlister's data, led him to conclude that the force law was bounded by $r^{-(2+\delta)}$
in which $|\delta| < 1/21{,}600$.† The McAlister experiment and Maxwell's analysis will be considered in Section 3.20.
By far the most sensitive investigation of the electric force law which has ever been
undertaken was accomplished by S. J. Plimpton and W. E. Lawton at the Worcester
Polytechnic Institute in 1936. Together they skillfully applied all the advantages of
modern technology and electronic instrumentation to a repetition of the Cavendish
experiment and were able to show that the distance dependency in the electric force
law deviated from the inverse square by less than two parts in one billion. This remarkable achievement stands as the most compelling reason for basing an electrical theory
on the inverse square law. The Plimpton-Lawton experiment and an analysis of the
accuracy of their results will be considered in Section 3.21.

3.2 MATHEMATICAL FORMULATION OF THE INVERSE SQUARE LAW


The preceding section of this chapter has shown how Coulomb's experiments led to the
formulation of the law of force for electric charges; namely
$$F \propto \frac{q q'}{r^{2}} \tag{3.1}$$
The increasing accuracy of the experiments performed by Cavendish, by Maxwell and
McAlister, and by Plimpton and Lawton has raised the confidence level in this law
almost to the point of certainty. Yet, it seems appropriate to point out certain limitations in all these experiments and to circumscribe the limits of validity of this force law.
First, none of these experiments was undertaken with the charges extremely close
to each other, nor excessively far apart. Thus, the question can be raised as to the
limits of r within which the law is valid. As yet, there is little direct evidence at very
large distances. However, if one accepts the premise that an entire electromagnetic
theory can be based on Coulomb's inverse square law, then the indirect evidence supports the belief that this law operates at astronomical distances. Concerning short
distances of separation, Rutherford's experiments, in which he bombarded atomic
nuclei with $\alpha$ particles, have shown that the law holds at distances as small as $10^{-14}$ m.
It may be valid at closer distances, but nuclear forces then come into play and partially
mask the effect.
Second, note that the law as stated presumes point charges and it is clear that this
is an approximation which can be good only when the extent of each charge is small
compared to r. For example, Coulomb's experiments involved, not point charges, but
rather charge distributions on balls of finite size. Induction effects caused these distributions to be nonuniform. (In the case of repulsion, the remote sides of the two pith balls
attained a heavier charge density.) Thus it became somewhat uncertain what to use
for the true spacing r.

† Maxwell was apparently being conservative in using as the bound one part in 21,600, for in the
Introduction to The Scientific Papers of the Honourable Henry Cavendish he states "We can now use
Thomson's Quadrant electrometer, and thereby detect a deviation from the law of the inverse square
not exceeding one in 72,000."

118

Electrostatics in Free Space

CHAPTER

In the case of the Cavendish method, the nonconcentration of charge at a single


point is even more evident, and another assumption entered heavily into the experiments. It was assumed that the law of superposition of forces holds for electric charges
so that, if q' were replaced by a distribution of charges, one could write for the force on q
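
$$\mathbf{F} \propto q \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} = q \sum_{n=1}^{N} \frac{q_n}{\mathfrak{x}_n^{2}}\, \mathbf{1}_{\mathfrak{x}_n} \tag{3.2}$$

in which $\boldsymbol{\mathfrak{x}}_n$ is drawn from the charge $q_n$ to the charge $q$, and $\mathbf{1}_{\mathfrak{x}_n} = \boldsymbol{\mathfrak{x}}_n/\mathfrak{x}_n$ is a unit vector
in the same direction as $\boldsymbol{\mathfrak{x}}_n$.†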
Quite obviously, none of the Cavendish-type experiments proved the validity of the
assumption of superposition; nevertheless, superposition is important for many applications of the theory. The validity of this assumption is accepted by virtue of the fact that
results predicated on it are consistent with experiment, but it should be recognized that
the principle of superposition for the forces among electric charges was not directly
demonstrated.
Third, the force law for electric charges, (3.1), includes the implication that the line
of action of the force coincides with the straight line connecting the two charges.
Coulomb's experiments did not reveal any transverse component of force, but they
were hardly sensitive enough to be definitive on this point. The Cavendish approach
requires the assumption that this is so, in that symmetry is used to cancel out certain
components of force in computing the net force on a charge between the two spheres.
Here again, the assumption that the force acts along the line joining the charges gains
its strength not from the original experiments, but rather from the accuracy of predictions based on making this assumption.
Fourth, the law (3.1) states that the force varies directly as the algebraic product of
the two quantities of charge. Coulomb was able to show that like charges repel and
unlike charges attract according to the same function of distance, and also showed that
halving one charge reduced the force by a factor of two. Cavendish demonstrated that
doubling each charge quadrupled the force. But neither showed the general validity
of using the product qq', and this is accepted by inference.
Fifth, the law (3.1) states nothing explicitly about the medium in which the charges
q and q' are immersed. The approach to be adopted here will be that only a vacuum is
particle-free, and that any other medium can be viewed at the atomic level as an
assemblage of particles, some of which may be electrified. In the cases of such media,
the generalized form (3.2) can then be used to find the force on the charge q, with some
of the charges qn belonging to the particles which constitute the medium.

† The distance symbol $\mathfrak{x}$ is actually a German lower-case x, but it has the semantic advantage of
looking like an r. It will be used throughout this text to designate the distance from a source point (the
position of $q_n$) to a field point (the position of $q$). The symbol r will be reserved for distances measured from the origin. Other authors have achieved this distinction by using r' for the distance between
source point and field point. Unfortunately that notation is not convenient here, since many of the
developments in Chapters 4 and 5 will involve two coordinate systems XYZ and X'Y'Z'. r and r'
(or $\mathfrak{x}$ and $\mathfrak{x}'$) will then mean the distance between the same two points as measured in the two different
coordinate systems. The reader may find it convenient to call the symbol $\mathfrak{x}$ by the name r-sub-c or
r-cedilla.

Finally, the law (3.1) also states nothing explicitly about whether or not the distance
between q and q' is time-dependent. The experiments were all essentially static. However, to develop a useful theory, one must be able to let charges move. It will be seen
in retrospect that a satisfactory theory can be based on Equation (3.2) if one assumes
that it is valid for the case that $q_1, \ldots, q_N$ are all fixed in position, but that motion of q
is permitted.
There is ample experimental evidence to support this assumption. Many modern
electronic instruments and particle accelerators employ static distributions of charges
to create steady electric fields and thus accelerate charged particles which move
through these fields. When a theoretical trajectory for these accelerated particles
is computed based on (3.2), excellent agreement with the experimental trajectory is
obtained. This has been found to be true even when the trajectory speed is high enough
to require inclusion of relativistic effects.
It has already been seen in Chapter 2 that mass is a function of velocity and the question can be raised at this point whether charge is not also a function of velocity. It will
be seen in Chapter 4 that it is convenient to define charge to be an invariant. However,
a general answer to this question can be deferred. For the present it will be assumed
merely that the inverse square law is valid whether or not q moves, with the symbol
q which occurs in the formulas always referring to the value of charge when q is at rest.
With all the foregoing limitations and implicit assumptions in mind, the experiments
described earlier in this chapter will be taken as justification for acceptance of the
inverse square law as a fundamental postulate. Since this will be the only purely electrical postulate needed to develop a complete theory of electromagnetics, space will now
be taken to recapitulate and construct a concise mathematical formulation of this law:
As suggested by Figure 3.7, let there be a static assemblage of N charged particles,
containing respectively charges $q_1, q_2, \ldots, q_N$, arbitrarily arranged in otherwise
empty space. The quantities $q_i$ are real numbers which can be either positive or negative. The positions of these charged particles can be described in a coordinate system
XYZ so that the nth particle is identified by the coordinates $(x_n, y_n, z_n)$ or by the position vector $\mathbf{r}_n = \mathbf{1}_x x_n + \mathbf{1}_y y_n + \mathbf{1}_z z_n$. These coordinates are not functions of time.

FIGURE 3.7 Notation for Coulomb's law.

Additionally, let a particle containing a charge q be instantaneously at the point
(x,y,z) described by the position vector $\mathbf{r} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z z$. This charge will be permitted to move, so that the coordinates x, y, and z can be general functions of time. The
total electrical force exerted on q is then

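$$\mathbf{F} = \sum_{n=1}^{N} \mathbf{F}_n \tag{3.3}$$

in which

$$\mathbf{F}_n = \frac{1}{4\pi\epsilon_0}\, \frac{q q_n}{\mathfrak{x}_n^{2}}\, \mathbf{1}_{\mathfrak{x}_n} = \frac{q q_n}{4\pi\epsilon_0}\, \frac{\mathbf{r} - \mathbf{r}_n}{|\mathbf{r} - \mathbf{r}_n|^{3}} \tag{3.4}$$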

The proportionality constant $(4\pi\epsilon_0)^{-1}$ serves to convert (3.1) to an equality. Inclusion
of the factor $4\pi$ is known as rationalization and is done so that a factor of $4\pi$ will not
appear in the more often used Maxwell's equations to be derived subsequently from
(3.3). The factor $\epsilon_0$ can be looked upon as a units-adjusting parameter. It is called the
permittivity of free space, and in the MKS system of units to be used in this text has
the measured value of $8.854 \times 10^{-12}$ farads/m. This choice for $\epsilon_0$ permits charge to be
measured in coulombs when distance is measured in meters and force in newtons.
The dimensions of $\epsilon_0$ will become clear later in this chapter when the concept of
capacitance is introduced. The newton is a unit of force equal to $10^{5}$ dynes. Thus 1 lb
(force) equals 4.4482 newtons, so that 1 newton can be remembered conveniently as
being slightly less than a quarter of a pound. One coulomb, the unit of charge, is defined
as the quantity of electricity passing a cross section when a current of 1 amp is flowing.
(The primary definition of an ampere will be discussed in Chapter 4 after Ampere's law
has been derived.) One coulomb is also the quantity of electricity required to deposit
0.001118 g of silver from an aqueous solution of silver nitrate.
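Combining (3.3) and (3.4), one can write for the total electrical force on q,

$$\mathbf{F} = \frac{q}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \tag{3.5}$$

in which

$$\boldsymbol{\mathfrak{x}}_n = \mathbf{r} - \mathbf{r}_n = \mathbf{1}_x (x - x_n) + \mathbf{1}_y (y - y_n) + \mathbf{1}_z (z - z_n) \tag{3.6}$$

is the vector drawn from $q_n$ to $q$.†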
Equation (3.5) is a mathematical statement of the inverse square law, and will hereafter be referred to as Coulomb's law. This equation will prove to be the cornerstone
of all the theory which is to follow, and thus can equally well be called the fundamental
postulate of electricity.
It should be noted that Equation (3.5) is being adopted as a postulate only for
discrete charges which are in otherwise empty space. Although on the face of it this
appears impractical, it will be seen later in this chapter that many electrostatic systems of charges exist in the presence of electrically neutral and unpolarized material
bodies which can be treated as though they were empty space. In Chapter 6 material
bodies which cannot be so treated will be introduced and the theory will be enlarged to
take them into account.

† The force F may be implicitly a function of time if x, y, and z are functions of time.

3.3 THE ELECTRIC FIELD

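If a static assemblage of charges $q_n$ exists at the points $(x_n, y_n, z_n)$ and a small test
charge $\Delta q$ is placed at the point (x,y,z), Equation (3.5) gives for the force on $\Delta q$

$$\Delta\mathbf{F} = \frac{\Delta q}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \tag{3.7}$$

in which $\boldsymbol{\mathfrak{x}}_n$ is drawn from $q_n$ to $\Delta q$.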


When it is assumed that the charge $\Delta q$ is small enough so that its presence or absence
does not affect the spatial distribution of the other charges, the vector

$$\mathbf{E} = \frac{\Delta\mathbf{F}}{\Delta q} = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \tag{3.8}$$

is defined as the force per unit charge at (x,y,z). E can be expressed in the units of
newtons per coulomb and is variously called the electric force, the electric intensity, or
the electric field strength.
By implication, if a charge q of arbitrary size is placed at (x,y,z), it experiences a
force

$$\mathbf{F} = q\mathbf{E} \tag{3.9}$$

However, one must be careful in using (3.9) to ascertain that the presence of q has not
disturbed the positions of the other charges. For example, if the assemblage of charges
$q_n$ is distributed over the surface of a conductor and the charge q is placed in the vicinity, the charges $q_n$, being free to move, will redistribute themselves to new positions of
equilibrium.
Equation (3.8) indicates that the electric force depends on the charges $q_n$ and their
positions relative to the point (x,y,z) but that it does not depend on $\Delta q$. An intensity
E(x,y,z) can thus be associated with the point (x,y,z) whether $\Delta q$ is there or not. If the
vector function E is interpreted in this manner, it can be taken as a fundamental subject of investigation. This is the field viewpoint of Maxwell and Faraday, which differs
from the action-at-a-distance theories of their predecessors. In this view, the source
charges qn set up an electric field at the point P(x,y,z); the field in turn will exert a
force on any charge which might be introduced at P.
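
The field viewpoint can be made concrete with a short numerical sketch. The Python below is an illustration only, not part of the original text: it evaluates the sum in (3.8) for a list of static point charges and then applies (3.9). The function names `electric_field` and `force_on_charge` and the data layout are assumptions chosen for this example.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space in farads/m, the value quoted in the text


def electric_field(charges, field_point):
    """Return (Ex, Ey, Ez) in newtons per coulomb at field_point, following (3.8).

    charges: list of (q_n, (x_n, y_n, z_n)), with q_n in coulombs and positions in meters.
    """
    ex = ey = ez = 0.0
    for qn, (xn, yn, zn) in charges:
        # Vector drawn from the source charge q_n to the field point.
        dx = field_point[0] - xn
        dy = field_point[1] - yn
        dz = field_point[2] - zn
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        k = qn / (4.0 * math.pi * EPS0 * dist ** 3)
        ex += k * dx
        ey += k * dy
        ez += k * dz
    return ex, ey, ez


def force_on_charge(q, charges, field_point):
    """F = qE, valid only while q does not disturb the positions of the sources."""
    return tuple(q * component for component in electric_field(charges, field_point))
```

The field is computed without reference to any test charge, which mirrors the statement that E can be associated with a point whether or not a charge is placed there.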
With this interpretation, E as defined by (3.8) is an electrostatic field, since the source
points $(x_n, y_n, z_n)$ are static and the field point (x,y,z) has coordinates which are not
connected to the possible motion of any particle. This functional dependence is usually
indicated by writing

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\boldsymbol{\mathfrak{x}}_n}{\mathfrak{x}_n^{3}} \tag{3.10}$$

The dependence of E on the sources and their positions is not usually explicitly indicated in the left side of (3.10), but is nevertheless understood implicitly.
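In many problems it will be appropriate to consider the total charge $\rho\, dV$ in a volume
element dV in lieu of the discrete charge $q_n$. In such cases (3.10) can be written

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_V \rho(\xi,\eta,\zeta)\, \frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}}\, d\xi\, d\eta\, d\zeta \tag{3.11}$$

in which $\rho(\xi,\eta,\zeta)$ is the volume charge density function, expressed in coul/m³, and

$$\boldsymbol{\mathfrak{x}} = \mathbf{1}_x (x - \xi) + \mathbf{1}_y (y - \eta) + \mathbf{1}_z (z - \zeta) \tag{3.12}$$

is the positional vector drawn from the volume element centered at $(\xi,\eta,\zeta)$ to the field
point (x,y,z). The volume region V is sufficient to encompass all the sources $\rho\, dV$.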
Similarly, there will be occasions when it is useful to consider the total charge $\sigma\, dS$
on a surface element dS, or the total charge $\chi\, d\ell$ on a line element $d\ell$, in place of the
discrete charge $q_n$. For example, in the case of surface distributions, (3.10) becomes

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_S \sigma(\xi,\eta,\zeta)\, \frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}}\, dS \tag{3.13}$$

in which $\sigma$ is the surface charge density function, given in coul/m², and $\boldsymbol{\mathfrak{x}}$ is drawn
from the surface element dS, centered at $(\xi,\eta,\zeta)$, to the field point (x,y,z).

EXAMPLE 3.1


To gain some appreciation for the effects caused by 1 coul of electric charge, recall that the
charge on a single electron is $1.6 \times 10^{-19}$ coul. Therefore, it takes $6.25 \times 10^{18}$ electrons to
comprise 1 coul of charge. If these electrons were to be arranged in a cubical lattice 3 Å on
centers, the resulting cube would be approximately 500 microns on a side. An exterior
electron of this assemblage would be an average distance of perhaps 250 microns from the
remainder of the charge and would thus experience a repulsive force in the order of

$$F = eE \approx \frac{1.6 \times 10^{-19}}{4\pi (8.854 \times 10^{-12}) (250 \times 10^{-6})^{2}} \approx 0.02 \text{ newton}$$

Although this does not seem to be a great force, when it is remembered that the mass of
an electron is only $9.1 \times 10^{-31}$ kg, the initial acceleration of a free electron experiencing this
force would be approximately $10^{27}$ g. In the absence of compensating forces, this assemblage
of electrons is highly unstable.
Suppose that on a macroscopic scale this entire coulomb of charge is essentially concentrated at a point. Then the electric field due to it is radial and given by

$$\mathbf{E}(r) = \mathbf{1}_r \frac{1}{4\pi\epsilon_0 r^{2}} \approx \mathbf{1}_r \frac{10^{10}}{r^{2}}$$

This field is so great that if another charge of 1 coul were placed 3 m distant, it would experience a force in excess of 100 kilotons. Further, in the presence of normal air, this field
intensity would cause breakdown out to a distance r = 100 m. In every respect 1 coul
represents an enormous amount of charge.

Alternatively, if one asks what amount of charge exerts normal forces at normal distances, a feeling for this can be gained by considering two equal charges 1 m apart which
exert a force of 4 newtons (approximately one pound) on each other. Solving Coulomb's
force equation gives q = 21 micro-coul.
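
The arithmetic of this example can be retraced in a few lines. The sketch below is an illustrative check rather than part of the original example; the variable names are invented here, and the value of standard gravity (9.81 m/s²) is an assumption used only to express the acceleration in units of g.

```python
import math

EPS0 = 8.854e-12    # farads/m, value quoted in the text
E_CHARGE = 1.6e-19  # coul, magnitude of the electron charge used in the text
E_MASS = 9.1e-31    # kg, electron mass used in the text
G = 9.81            # m/s^2, standard gravity (assumed here, not given in the text)

k = 1.0 / (4.0 * math.pi * EPS0)  # proportionality constant of Coulomb's law

# Number of electrons that make up 1 coul.
n_electrons = 1.0 / E_CHARGE

# Side of a cubical lattice of that many electrons on 3-angstrom centers, in meters.
cube_side = n_electrons ** (1.0 / 3.0) * 3e-10

# Force on one electron 250 microns from 1 coul treated as a point charge.
force_on_electron = k * E_CHARGE * 1.0 / (250e-6) ** 2

# Initial acceleration of that electron, expressed in multiples of g.
accel_in_g = force_on_electron / E_MASS / G

# Force between two 1-coul charges 3 m apart, in newtons.
force_at_3m = k * 1.0 * 1.0 / 3.0 ** 2

# Equal charges that exert 4 newtons on each other at 1 m.
q_for_4_newtons = math.sqrt(4.0 / k)
```

With these inputs the cube side comes out a little over 500 microns and the charge for a 4-newton force comes out near 21 micro-coul, in line with the figures quoted above.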
EXAMPLE 3.2

Imagine that all of space is populated with electric charge, but in such a way that the
charge density varies in only one direction. Let X be this direction, so that the situation can
be pictured in terms of plane layers of charge, stacked one next to the other, all transverse
to the X axis. The volume charge density $\rho(\xi)$ varies from one layer to the next, but is constant throughout any layer. Let it be desired to find the electric field distribution due to
this charge system.

[Figure: plane layers of charge transverse to the X axis, with the field point P(x,y,z).]

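The layer of charge contained between the planes $\xi$ and $\xi + d\xi$ is pictured, together with
the field point (x,y,z) at which the electric field is to be determined. By symmetry, the
charges in the four volume elements shown exert a net effect at (x,y,z) which is X directed.
On the basis of the contributions from all volume elements in this double-paired fashion,
Equation (3.11) gives

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta \int_{-\infty}^{\infty} \frac{\rho(\xi)\, \mathbf{1}_x (x - \xi)}{[(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2]^{3/2}}\, d\zeta$$

$$\mathbf{E}(x,y,z) = \frac{\mathbf{1}_x}{\pi\epsilon_0} \int_{-\infty}^{\infty} \rho(x - \xi')\, \xi'\, d\xi' \int_{0}^{\infty} d\eta' \int_{0}^{\infty} \frac{d\zeta'}{[(\xi')^2 + (\eta')^2 + (\zeta')^2]^{3/2}}$$

in which $\xi' = x - \xi$, $\eta' = y - \eta$, and $\zeta' = z - \zeta$. Integration gives

$$\mathbf{E}(x,y,z) = \frac{\mathbf{1}_x}{\pi\epsilon_0} \int_{-\infty}^{\infty} \rho(x - \xi')\, \xi'\, d\xi' \int_{0}^{\infty} \frac{d\eta'}{(\xi')^2 + (\eta')^2}$$

$$\mathbf{E}(x,y,z) = \frac{\mathbf{1}_x}{2\epsilon_0} \int_{-\infty}^{\infty} (\pm)\, \rho(x - \xi')\, d\xi'$$

the plus or minus sign being taken according to whether $\xi'$ is greater or less than
zero. Thus the electric field is independent of y and z and is given by

$$\mathbf{E}(x) = \frac{\mathbf{1}_x}{2\epsilon_0} \left[ \int_{-\infty}^{x} \rho(\xi)\, d\xi - \int_{x}^{\infty} \rho(\xi)\, d\xi \right]$$
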

This result has a simple physical interpretation. If a column of unit cross-sectional area
extending from $\xi = -\infty$ to $\xi = +\infty$ is considered, the first integral is all the charge in
that part of the column to the left of $\xi = x$, whereas the second integral is all the charge in
that part of the column to the right of $\xi = x$. Thus the electric field at any cross section is
uniform over that cross section, normal to it, and equal to $1/2\epsilon_0$ times the difference in the
total charge per unit area found to the left and right of the cross section.
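
The left-minus-right interpretation lends itself to a direct numerical sketch. The Python below is illustrative only: `layered_field` and `slab` are names invented here, the midpoint rule is one arbitrary choice of quadrature, and the density is assumed to vanish outside the integration limits.

```python
EPS0 = 8.854e-12  # farads/m, value quoted in the text


def layered_field(rho, x, x_min, x_max, steps=20000):
    """X-directed field (volts/m) at x for a density rho(xi) that varies only along X.

    Computes (1 / 2 eps0) * (charge per unit area to the left of x
                             - charge per unit area to the right of x),
    with both integrals approximated by the midpoint rule on [x_min, x_max].
    rho is assumed to be zero outside [x_min, x_max].
    """
    h = (x_max - x_min) / steps
    left = right = 0.0
    for i in range(steps):
        xi = x_min + (i + 0.5) * h
        if xi < x:
            left += rho(xi) * h
        else:
            right += rho(xi) * h
    return (left - right) / (2.0 * EPS0)


def slab(xi, rho0=1e-6, half_thickness=0.01):
    """Uniform slab of charge occupying |xi| <= half_thickness (illustrative values)."""
    return rho0 if abs(xi) <= half_thickness else 0.0
```

For the slab, the field vanishes on the midplane, where equal charge lies on either side, and is uniform everywhere outside the slab, with opposite signs on the two sides.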
EXAMPLE 3.3

Consider a spherical conducting† shell of outer radius a which contains a net charge Q.
What is the electric field distribution for this system?
Because of repulsion, the charge Q will distribute itself uniformly over the outer surface
of the shell with a density $\sigma = Q/4\pi a^{2}$. By symmetry, the field E at any distance r from the
center of the shell will be independent of the direction of that distance. With the use of
spherical coordinates, (3.13) can be written for this example in the form

$$\mathbf{E}(r,0,0) = \frac{1}{4\pi\epsilon_0} \int_{0}^{2\pi}\!\int_{0}^{\pi} \frac{Q}{4\pi a^{2}}\, \frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}}\, a^{2} \sin\theta\, d\theta\, d\phi$$

in which, for convenience, the electric field is being evaluated at the point (r,0,0).
Referring to the figure, one sees that the charge contained within the band of area
$2\pi a^{2} \sin\theta\, d\theta$ exerts a net effect at (r,0,0) which is entirely radial, and thus the above integral
can be written

$$\mathbf{E}(r) = \frac{\mathbf{1}_r Q}{8\pi\epsilon_0} \int_{0}^{\pi} \frac{\cos\alpha\, \sin\theta\, d\theta}{\mathfrak{x}^{2}}$$

† For the purposes of electrostatics, materials may be divided into two categories: conductors of
electricity and insulators (dielectrics). A conductor may be viewed as an aggregation of charged particles occupying a region which, on the atomic level, is mostly vacuous; conductors are thus brought
within the purview of the present analysis. A large number of these charged particles (usually electrons) are free to wander throughout the confines of the conductor. These mobile charges respond
readily to an electric field and will continue to move as long as they experience a field. Thus whenever
the mobile carriers are arranged in a spatial distribution whose statistical time-average value is zero,
there is no net static electric field anywhere within the conductor. Only situations in which this is the
case will be considered in this chapter, the treatment of dynamic situations being reserved to Chapter 8. The discussion of dielectric materials will be deferred until Chapter 6.

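[Figure: the charged spherical shell and the field point P(r,0,0).]

Using the law of cosines, one can convert this to the form

$$\mathbf{E}(r) = \frac{\mathbf{1}_r Q}{16\pi\epsilon_0 a r^{2}} \int_{|r-a|}^{r+a} \frac{\mathfrak{x}^{2} + r^{2} - a^{2}}{\mathfrak{x}^{2}}\, d\mathfrak{x}$$

which gives

$$\mathbf{E}(r) = \begin{cases} \mathbf{1}_r \dfrac{Q}{4\pi\epsilon_0 r^{2}} & r > a \\[4pt] 0 & r < a \end{cases}$$
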

Thus charge which is uniformly distributed over a spherical shell creates an electric field
at all external points as though the charge were concentrated at the center of the shell; at
all internal points it exerts no electric field whatsoever.
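
The shell result can be checked numerically from the one-dimensional integral obtained with the law of cosines. The sketch below is an illustration under stated assumptions: `shell_field` is a name invented here, the integration variable `s` stands for the source-to-field distance, and the midpoint rule is an arbitrary choice of quadrature.

```python
import math

EPS0 = 8.854e-12  # farads/m, value quoted in the text


def shell_field(r, a, Q, steps=20000):
    """Radial field (volts/m) at distance r from the center of a uniformly charged shell.

    Evaluates Q / (16 pi eps0 a r^2) times the integral, from |r - a| to r + a,
    of (s^2 + r^2 - a^2) / s^2 ds, using the midpoint rule.
    """
    lower, upper = abs(r - a), r + a
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        s = lower + (i + 0.5) * h
        total += (s * s + r * r - a * a) / (s * s) * h
    return Q * total / (16.0 * math.pi * EPS0 * a * r * r)
```

Outside the shell the quadrature reproduces the point-charge value, and inside it returns a value that is zero to within the discretization error.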
EXAMPLE 3.4

An academic problem which later can be extended to a practical situation concerns two
infinitely long parallel line charges. As shown in the figure (see next page), the upper line
contains a uniform charge density of $\chi$ coul/m and the lower line contains the opposite uniform charge density of $-\chi$ coul/m. The electric field at any point in space can be deduced
by first noting that by symmetry the answer will be independent of z. Thus the value of E
will be sought at the point P(x,Y,O).
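In analogy with what has been said previously for volume and surface distributions, this
lineal distribution of charge gives rise to an electric field expressible as

$$\mathbf{E} = \frac{\chi}{4\pi\epsilon_0} \int \frac{\boldsymbol{\mathfrak{x}}_1}{\mathfrak{x}_1^{3}}\, d\ell - \frac{\chi}{4\pi\epsilon_0} \int \frac{\boldsymbol{\mathfrak{x}}_2}{\mathfrak{x}_2^{3}}\, d\ell$$

in which $\boldsymbol{\mathfrak{x}}_1$ is drawn from a charge element in the upper line to P and $\boldsymbol{\mathfrak{x}}_2$ is drawn from a
charge element in the lower line to P. Thus

$$\mathbf{E}(x,y,0) = \frac{\chi}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \frac{\mathbf{1}_x x + \mathbf{1}_y (y - d/2) - \mathbf{1}_z \zeta}{[x^{2} + (y - d/2)^{2} + \zeta^{2}]^{3/2}}\, d\zeta - \frac{\chi}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \frac{\mathbf{1}_x x + \mathbf{1}_y (y + d/2) - \mathbf{1}_z \zeta}{[x^{2} + (y + d/2)^{2} + \zeta^{2}]^{3/2}}\, d\zeta$$

$$\mathbf{E}(x,y) = \frac{\chi}{2\pi\epsilon_0} \left[ \frac{\mathbf{1}_x x + \mathbf{1}_y (y - d/2)}{x^{2} + (y - d/2)^{2}} - \frac{\mathbf{1}_x x + \mathbf{1}_y (y + d/2)}{x^{2} + (y + d/2)^{2}} \right]$$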

[Figure: two infinitely long parallel line charges of density $+\chi$ and $-\chi$, with the field point P(x,y,0).]

The Dirac delta function can be used to advantage in the formulation of many field
problems. Written $\delta(x - a)$, the one-dimensional delta function is defined as having the
properties:

$$\delta(x - a) = 0 \quad \text{for all } x \neq a$$

$$\int \delta(x - a)\, dx = 1 \quad \text{if } x = a \text{ is included in the region of integration; otherwise the integral is zero}$$


The delta function can be pictured conveniently as the limit of a Gaussian curve, or
some other similarly peaked distribution, when in the limit the curve narrows and
heightens indefinitely, but in such a way that the area under the curve remains unity.
This area is considered to be dimensionless.
From these definitions it follows that, if f(x) is any arbitrary function,

$$\int f(x)\, \delta(x - a)\, dx = f(a) \tag{3.14}$$

$$\int f(x)\, \delta'(x - a)\, dx = -f'(a) \tag{3.15}$$

in which the prime denotes differentiation. The first result comes readily from the mean
value theorem whereas the second can be derived from the differential of a product. In
both formulas a is assumed to lie within the integration interval.
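
The Gaussian picture of the delta function gives a simple way to see (3.14) and (3.15) at work numerically. The following sketch is illustrative only: the function names, the test function cos x, the width of the Gaussian, and the midpoint-rule quadrature are all choices made for this example.

```python
import math


def gaussian_delta(x, a, width):
    """Unit-area Gaussian centered at a; it approaches delta(x - a) as width shrinks."""
    return math.exp(-((x - a) / width) ** 2 / 2.0) / (width * math.sqrt(2.0 * math.pi))


def gaussian_delta_prime(x, a, width):
    """Derivative of gaussian_delta with respect to x."""
    return -(x - a) / width ** 2 * gaussian_delta(x, a, width)


def integrate(g, lower, upper, steps=200000):
    """Midpoint-rule approximation to the integral of g over [lower, upper]."""
    h = (upper - lower) / steps
    return sum(g(lower + (i + 0.5) * h) for i in range(steps)) * h


a, width = 0.7, 1e-3

# Counterpart of (3.14): should be close to f(a) = cos(0.7).
sift = integrate(lambda x: math.cos(x) * gaussian_delta(x, a, width), 0.0, 2.0)

# Counterpart of (3.15): should be close to -f'(a) = sin(0.7).
sift_prime = integrate(lambda x: math.cos(x) * gaussian_delta_prime(x, a, width), 0.0, 2.0)
```

Narrowing `width` further brings both integrals closer to their limiting values, provided the step size stays well below the width.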
Delta functions in multidimensional space can be fabricated by forming products of
one-dimensional delta functions. In three dimensions

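$$\delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - a)\, \delta(y - b)\, \delta(z - c)$$

in which $\mathbf{r}$ and $\mathbf{r}_0$ are the position vectors drawn respectively to the points (x,y,z) and
(a,b,c). As a consequence of the earlier definitions, $\delta(\mathbf{r} - \mathbf{r}_0)$ vanishes except at $\mathbf{r} = \mathbf{r}_0$
and, if $dV = dx\, dy\, dz$,

$$\int_V \delta(\mathbf{r} - \mathbf{r}_0)\, dV = \begin{cases} 1 & \text{if } V \text{ contains } (a,b,c) \\ 0 & \text{if } V \text{ does not contain } (a,b,c) \end{cases}$$
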

Through the use of the delta function, an assemblage of discrete charges can be represented by a volume charge density function in the form

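$$\rho(\xi,\eta,\zeta) = \sum_{n=1}^{N} q_n\, \delta(\mathbf{r}' - \mathbf{r}_n) \tag{3.16}$$

in which $\mathbf{r}'$ is the position vector drawn to the point $(\xi,\eta,\zeta)$. This representation can be
verified by inserting (3.16) in (3.11), which gives

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_V \sum_{n=1}^{N} q_n\, \delta(\mathbf{r}' - \mathbf{r}_n)\, \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^{3}}\, d\xi\, d\eta\, d\zeta$$

Use of the result (3.14) reduces this to (3.10).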


Similarly, a surface charge distribution can be represented in terms of a volume
charge density through the use of a one-dimensional delta function along the direction
of the normal to the surface; a lineal charge distribution can be represented by a volume
charge density through the use of a two-dimensional delta function in a transverse surface.
By means of these representations a general discussion in terms of $\rho$-type distributions
has much wider applicability.

3.4 ELECTROSTATIC POTENTIAL

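Use of the del operator $\nabla = \mathbf{1}_x \dfrac{\partial}{\partial x} + \mathbf{1}_y \dfrac{\partial}{\partial y} + \mathbf{1}_z \dfrac{\partial}{\partial z}$ to form the gradient of inverse distance (cf. Mathematical Supplement) gives

$$\nabla\left(\frac{1}{\mathfrak{x}}\right) = -\frac{\mathbf{1}_x (x - \xi) + \mathbf{1}_y (y - \eta) + \mathbf{1}_z (z - \zeta)}{[(x-\xi)^{2} + (y-\eta)^{2} + (z-\zeta)^{2}]^{3/2}}$$

so that

$$\frac{\boldsymbol{\mathfrak{x}}}{\mathfrak{x}^{3}} = -\nabla\left(\frac{1}{\mathfrak{x}}\right)$$

from which it follows that (3.11) can be written

$$\mathbf{E}(x,y,z) = -\frac{1}{4\pi\epsilon_0} \int_V \rho(\xi,\eta,\zeta)\, \nabla\left(\frac{1}{\mathfrak{x}}\right) d\xi\, d\eta\, d\zeta$$

Since neither $\rho(\xi,\eta,\zeta)$ nor the limits of integration are functions of the field point (x,y,z),
the order of integration and differentiation can be interchanged, giving

$$\mathbf{E}(x,y,z) = -\nabla \int_V \frac{\rho(\xi,\eta,\zeta)\, d\xi\, d\eta\, d\zeta}{4\pi\epsilon_0\, \mathfrak{x}} \tag{3.17}$$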

Thus the electric field is expressible as the negative of the gradient of the scalar function

$$\Phi(x,y,z) = \int_V \frac{\rho(\xi,\eta,\zeta)\, d\xi\, d\eta\, d\zeta}{4\pi\epsilon_0\, \mathfrak{x}} \tag{3.18}$$

$\Phi$ is called the scalar electrostatic potential function and it has several important properties which will now be developed.
Since

$$\mathbf{E} = -\nabla\Phi \tag{3.19}$$

use of the vector identity (V.112) gives

$$\nabla \times \mathbf{E} = 0 \tag{3.20}$$

and thus the static electric field is irrotational. Application of Stokes' theorem then
yields

$$\oint_C \mathbf{E} \cdot d\boldsymbol{\ell} = \int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{S} = 0 \tag{3.21}$$

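in which C is any closed contour, forming the sole boundary of an open surface S.
Consider a contour C which is arbitrarily divided into two segments $C_1$ and $C_2$ by the
points $P_1$ and $P_2$, as shown in Figure 3.8. As a consequence of (3.21)

$$\int_{P_1\,(C_1)}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} = \int_{P_1\,(C_2)}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} \tag{3.22}$$

FIGURE 3.8 A segmented contour.

In words, the line integral of the longitudinal component of E from $P_1$ to $P_2$ is independent of the path and therefore the static electric field is said to be conservative. But
from (3.19)

$$\int_{P_1}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} = -\int_{P_1}^{P_2} \nabla\Phi \cdot d\boldsymbol{\ell} = -\int_{P_1}^{P_2} d\Phi$$

or

$$\Phi(P_2) - \Phi(P_1) = -\int_{P_1}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} \tag{3.23}$$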

The difference in the value of the scalar electrostatic potential function at any two points
is thus the negative of the line integral of longitudinal E along any path connecting the
two points.
If the total charge in the system is finite and occupies a finite volume, it follows from
(3.18) that the value of $\Phi$ at infinity is zero. If in (3.23) the point $P_2$ is allowed to
approach infinity, one obtains

$$\Phi(P_1) = \int_{P_1}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell} \tag{3.24}$$
This result has a clear physical interpretation. If there is a distribution of charges
$\rho(\xi,\eta,\zeta)\, dV$ which create a static electric field E, and if a charge q is placed at $P_1$, it will
experience a force qE (providing its presence does not alter the charge distribution).
If q is then displaced an amount $d\boldsymbol{\ell}$ along an arbitrary path, the charge system does an
amount of work on q equal to $q\mathbf{E} \cdot d\boldsymbol{\ell}$. If this process is continued and q is permitted to
recede to infinity, the total amount of energy extracted from the charge system is

$$W = \int_{P_1}^{\infty} q\mathbf{E} \cdot d\boldsymbol{\ell} = q\Phi(P_1) \tag{3.25}$$

Therefore W is the amount of energy potentially available when q is at $P_1$: energy which
can be extracted from the charge system by removing q to a point remote from all of the
charges in the system. For this reason $\Phi(P_1) = W/q$ can be interpreted as the potential
energy available per coulomb at the point $P_1$ due to the charge distribution $\rho\, dV$. $\Phi$ is
expressed in units of joules per coulomb, or volts. For this reason E is customarily given
in volts per meter, a unit which can be understood by virtue of (3.23).
Of course, the value of $\Phi(P_1)$ can be negative just as well as positive. If it is positive,
the net action of the charge system on a positive charge q as it moves away is repulsive.
If on the other hand $\Phi(P_1)$ is negative, the net action of the charge system on a positive
charge q as it moves away is attractive. In this latter case it takes an external force and
an external supply of energy to pull the positive charge q away. Instead of the charge
system having provided energy to remove q, it has required the addition of energy at the
expense of external sources in order to effect the removal. These arguments are inverted
if q is a negative charge, but then the signs of W and $\Phi$ are opposite, as seen from (3.25),
so the interpretation with respect to the supply or removal of energy is the same.
$\Phi(P_1)$ as given by (3.24) is called the absolute potential, whereas $\Phi(P_2) - \Phi(P_1)$ as
expressed in (3.23) is customarily given the name of potential difference.
The result (3.21) is consistent with the law of conservation of energy. Since the integrand can be thought of as the work done by the charge system on a positive unit charge
as it undergoes a displacement $d\boldsymbol{\ell}$, the closed line integral is the work done on a positive
unit charge as it moves from an initial point around any path and ultimately back to the
initial point. Upon its return the original situation is reproduced. If there had been a net
value for the work done, this cycle could be repeated endlessly and would constitute a
perpetual-motion machine.
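
The vanishing of the closed line integral can be observed numerically for a simple case. The sketch below is only an illustration: it assumes a single point charge as the source and a circular contour parallel to the XY plane, and the names `point_charge_field` and `circulation` are invented for this example.

```python
import math

EPS0 = 8.854e-12  # farads/m, value quoted in the text


def point_charge_field(q, source, p):
    """E (volts/m) at point p due to a point charge q at source, from the inverse square law."""
    d = [p[i] - source[i] for i in range(3)]
    dist = math.sqrt(sum(c * c for c in d))
    k = q / (4.0 * math.pi * EPS0 * dist ** 3)
    return [k * c for c in d]


def circulation(q, source, center, radius, steps=2000):
    """Midpoint-rule estimate of the closed line integral of E . dl around a circle.

    The circle lies in the plane z = center[2] and is traversed once.
    """
    total = 0.0
    dphi = 2.0 * math.pi / steps
    for i in range(steps):
        phi = (i + 0.5) * dphi
        p = (center[0] + radius * math.cos(phi),
             center[1] + radius * math.sin(phi),
             center[2])
        ex, ey, _ = point_charge_field(q, source, p)
        # Path element: dl = radius * dphi * (-sin(phi), cos(phi), 0).
        total += (-ex * math.sin(phi) + ey * math.cos(phi)) * radius * dphi
    return total
```

For a charge placed off the axis of the circle, the individual contributions along the path are far from zero, yet their sum cancels to within rounding error, in keeping with (3.21).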
Because of the independence of path, the result (3.23) can be written

$$\Phi(P_2) - \Phi(P_1) = \int_{P_2}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell} - \int_{P_1}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell}$$

which is consonant with (3.24) and indicates that the potential difference is the difference in absolute potentials at $P_2$ and $P_1$, as well as being a measure of the energy which
can be extracted from the system if a positive unit charge is moved from $P_1$ to $P_2$. One
may interpret (3.23) by saying that, if E on the average is oriented from $P_1$ to $P_2$,
energy is extracted from the charge system as a positive unit charge moves from $P_1$ to
$P_2$. This means that the energy potentially available at $P_2$ is less than at $P_1$, thus
explaining the minus sign on the right side of (3.23).
The scalar function $\Phi(x,y,z)$, defined by (3.18), can be set equal to a constant, thereby
prescribing an equipotential surface. The discussion of gradient in the Mathematical
Supplement establishes the facts that $\nabla\Phi$ is perpendicular to the equipotential surface
and has a magnitude and direction synonymous with the maximum spatial rate of
increase of $\Phi$. Because of (3.19), E(x,y,z) is therefore normal to the equipotential surface
which contains (x,y,z), and has a magnitude and direction which give the maximum
spatial rate of decrease of $\Phi$. The negative feature of this interpretation is in accord with
the discussion of the negative sign in (3.23). More will be said in Section 3.6 about the
orthogonality between E and the equipotential surfaces, in connection with flux maps.
To summarize the results of this section, a scalar electrostatic potential function
$\Phi(x,y,z)$ has been introduced, with (3.18) the defining relation. $\Phi(x,y,z)$ has the physical
significance of being the potential energy available if a positive unit charge is placed at
(x,y,z), this energy being extractable through removal of the unit charge to infinity. $\Phi$ is
related to the electric field by Equations (3.19) and (3.24) and is related to the sources by
(3.18). $\Phi$ is an exact function because the system is conservative, the line integral in
(3.24) being independent of the path.
Since surface distributions of charge $\sigma\,dS$, lineal distributions $\chi\,d\ell$, and discrete
charge distributions $q_n$ can be represented by volume distributions $\rho\,dV$ through use
of the Dirac delta function (cf. Section 3.3), all the results just obtained apply to these
types of distributions as well. Appropriate forms for the potential function include

$$\Phi(x,y,z) = \frac{1}{4\pi\epsilon_0}\int_S \frac{\sigma(\xi,\eta,\zeta)\,dS}{\mathfrak{r}} \tag{3.26}$$

$$\Phi(x,y,z) = \frac{1}{4\pi\epsilon_0}\sum_{n=1}^{N}\frac{q_n}{\mathfrak{r}_n} \tag{3.27}$$


The second of these results can be obtained by inserting (3.16) into (3.18) and the first
can be found by a similar procedure. Alternatively, Equations (3.8) and (3.13) can be
rephrased in terms of the gradient of inverse distance and the procedure which led
to (3.18) can be repeated for these two cases.
Because (3.18) is a scalar integral, for many volume distributions it is much easier
to evaluate than the vector integral (3.11). When such is the case, it probably will
prove to be simpler to find $\mathbf{E}$ by first finding $\Phi$ and then forming $-\nabla\Phi$, rather than
attempting to find $\mathbf{E}$ directly.
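The route "potential first, then gradient" can be sketched numerically. The example below assumes a handful of discrete charges, so that (3.27) gives the potential; the charge values, positions and function names are hypothetical. It forms $-\nabla\Phi$ by central differences and compares the result with the direct vector sum of Coulomb fields.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m
K = 1.0 / (4.0 * math.pi * EPS0)


def potential(charges, p):
    """Scalar sum (3.27): charges is a list of (q, position) pairs."""
    return K * sum(q / math.dist(src, p) for q, src in charges)


def field_from_potential(charges, p, h=1e-5):
    """E = -grad(Phi), with each partial derivative taken by central difference."""
    e = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        e.append(-(potential(charges, hi) - potential(charges, lo)) / (2.0 * h))
    return e


def field_direct(charges, p):
    """Vector sum of the Coulomb fields, for comparison."""
    e = [0.0, 0.0, 0.0]
    for q, src in charges:
        r = math.dist(src, p)
        for i in range(3):
            e[i] += K * q * (p[i] - src[i]) / r**3
    return e


charges = [(2e-9, (0.0, 0.0, 0.0)),     # illustrative values only
           (-1e-9, (1.0, 0.0, 0.0)),
           (3e-9, (0.0, 1.0, 1.0))]
point = (2.0, 1.5, -0.5)

e_grad = field_from_potential(charges, point)
e_sum = field_direct(charges, point)
print(e_grad)
print(e_sum)
```

The scalar sum needs one distance per charge, whereas the vector sum needs three components per charge; for continuous distributions the same saving applies to the integrals.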
EXAMPLE 3.5

An important special case of an electric charge system is the doublet, or dipole, consisting
of two charges q and -q separated by a small distance d. It is desired to find the potential
and electric field of this charge system a distance $\mathfrak{r}$ from its center, with $\mathfrak{r} \gg d$. This result
derives some of its importance from a model of dielectric materials, whose behavior can be
explained in terms of atomic and molecular dipoles (see Chapter 6).
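On the basis of the figure and Equation (3.27), the potential at a remote point $P(\mathfrak{r},\theta,\phi)$
is given by

$$\Phi = \frac{q}{4\pi\epsilon_0}\left(\frac{1}{\mathfrak{r}_1} - \frac{1}{\mathfrak{r}_2}\right) = \frac{q}{4\pi\epsilon_0}\,\frac{\mathfrak{r}_2 - \mathfrak{r}_1}{\mathfrak{r}_1\mathfrak{r}_2}$$

But, by the law of cosines,

$$\mathfrak{r}_1 = \left[\mathfrak{r}^2 + (d/2)^2 - \mathfrak{r}d\cos\theta\right]^{1/2} \qquad \mathfrak{r}_2 = \left[\mathfrak{r}^2 + (d/2)^2 + \mathfrak{r}d\cos\theta\right]^{1/2}$$

and, since $\mathfrak{r} \gg d$, these expressions may be expanded by the binomial theorem (cf. Mathematical Supplement) into rapidly converging series. Retaining only dominant terms gives

$$\mathfrak{r}_2 - \mathfrak{r}_1 \approx d\cos\theta \qquad \mathfrak{r}_1\mathfrak{r}_2 \approx \mathfrak{r}^2$$

and thus

$$\Phi(\mathfrak{r},\theta,\phi) \approx \frac{qd\cos\theta}{4\pi\epsilon_0\mathfrak{r}^2}$$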

The product qd is called the dipole moment, a phrase borrowed from mechanics. It is useful
to introduce a vector $\mathbf{p}$ whose magnitude is qd and whose direction is from the charge -q to
the charge +q. If $\boldsymbol{\mathfrak{r}}$ is taken to mean the directed distance from the center of the dipole to
the remote point P, the above result can be written

$$\Phi = \frac{\mathbf{p}\cdot\boldsymbol{\mathfrak{r}}}{4\pi\epsilon_0\mathfrak{r}^3} \tag{3.28}$$

This form of expression of the potential of a static electric dipole will prove convenient in
later discussions.

The electric field due to the dipole can be found by employing the gradient operator for
spherical coordinates centered at the dipole, namely,

$$\mathbf{E}(\mathfrak{r},\theta,\phi) = -\nabla\left(\frac{qd\cos\theta}{4\pi\epsilon_0\mathfrak{r}^2}\right) = \frac{qd}{4\pi\epsilon_0\mathfrak{r}^3}\left(\mathbf{1}_r\,2\cos\theta + \mathbf{1}_\theta\sin\theta\right) \tag{3.29}$$

It can be observed that the electric field and potential of a doublet diminish with distance
as the inverse cube and square, respectively, whereas Equations (3.8) and (3.27) indicate
that for a single charge the electric field and potential diminish with distance only to the
inverse second and first powers. The explanation lies in the fact that the doublet consists of
two equal and opposite charges close enough together so that they partly neutralize each
other's effect.
Field plots proportional to Equations (3.28) and (3.29) can be found with Example 3.11.
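How quickly the dipole approximation becomes accurate can be seen from a short numerical comparison. The sketch below assumes the charge +q at z = +d/2 and -q at z = -d/2, with $\theta$ measured from the z axis; the function names and sample values are illustrative. It evaluates the exact two-charge potential and the approximate form $qd\cos\theta/4\pi\epsilon_0\mathfrak{r}^2$ at two distances.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m
K = 1.0 / (4.0 * math.pi * EPS0)


def exact_potential(q, d, r, theta):
    """Two-charge potential; distances follow from the law of cosines."""
    r1 = math.sqrt(r * r + (d / 2) ** 2 - r * d * math.cos(theta))  # to +q
    r2 = math.sqrt(r * r + (d / 2) ** 2 + r * d * math.cos(theta))  # to -q
    return K * q * (1.0 / r1 - 1.0 / r2)


def dipole_potential(q, d, r, theta):
    """Leading term retained in the text for r >> d."""
    return K * q * d * math.cos(theta) / r**2


def relative_error(q, d, r, theta):
    exact = exact_potential(q, d, r, theta)
    return abs(dipole_potential(q, d, r, theta) - exact) / abs(exact)


q, d, theta = 1e-9, 0.01, 0.5    # illustrative values
err_near = relative_error(q, d, 10 * d, theta)
err_far = relative_error(q, d, 100 * d, theta)
print(err_near, err_far)
```

The relative error falls roughly as $(d/\mathfrak{r})^2$, which is the order of the first neglected term in the binomial expansion.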
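EXAMPLE 3.6

Consider again a spherical conducting shell of outer radius a which contains a net charge Q.
The electric potential distribution for this system can be found with the aid of (3.26) in
the form

$$\Phi(r,0,0) = \frac{1}{4\pi\epsilon_0}\int \frac{Q}{4\pi a^2}\,\frac{a^2\sin\theta\,d\theta\,d\phi}{\mathfrak{r}}$$

in which the geometry of the figure of Example 3.3 applies. Since

$$\mathfrak{r}^2 = a^2 + r^2 - 2ar\cos\theta \qquad \mathfrak{r}\,d\mathfrak{r} = ar\sin\theta\,d\theta$$

the integral can be rewritten

$$\Phi(r) = \frac{Q}{8\pi\epsilon_0 ar}\int_{|r-a|}^{r+a} d\mathfrak{r}$$

which yields

$$\Phi(r) = \frac{Q}{4\pi\epsilon_0 r} \quad r \ge a \qquad\qquad \Phi(r) = \frac{Q}{4\pi\epsilon_0 a} \quad r \le a$$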

Therefore, charge which is uniformly distributed over a spherical surface creates an electric
potential at all external points as though the charge were concentrated at the center; at all
internal points it causes a constant potential.
From these potential expressions the electric field can be found through use of the gradient
operator for spherical coordinates. The result is

$$\mathbf{E}(r) = -\nabla\Phi = \mathbf{1}_r\,\frac{Q}{4\pi\epsilon_0 r^2} \quad r > a \qquad\qquad \mathbf{E}(r) = 0 \quad r < a$$

which agrees with Example 3.3.
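The two-branch result can be confirmed by evaluating the surface integral numerically instead of changing variables. This is a sketch only: it uses a midpoint rule in the polar angle (the azimuthal integration contributes a factor $2\pi$), and the function name and sample values are assumptions for illustration.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m


def shell_potential(Q, a, r, n=20000):
    """Potential on the polar axis at radius r from a uniformly charged shell.

    Evaluates (Q / 8 pi eps0) * integral of sin(theta) / distance over theta,
    with distance^2 = a^2 + r^2 - 2 a r cos(theta).
    """
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        dist = math.sqrt(a * a + r * r - 2.0 * a * r * math.cos(theta))
        total += math.sin(theta) / dist * dtheta
    return Q / (8.0 * math.pi * EPS0) * total


Q, a = 1e-9, 0.1                                  # illustrative values
outside = shell_potential(Q, a, 2.0 * a)
inside = shell_potential(Q, a, 0.3 * a)
expected_outside = Q / (4.0 * math.pi * EPS0 * 2.0 * a)
expected_inside = Q / (4.0 * math.pi * EPS0 * a)
print(outside, expected_outside)
print(inside, expected_inside)
```

Any interior radius returns the same value, which is the constant potential of the shell itself.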


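EXAMPLE 3.7

As an extension of Example 3.4, the potential due to two parallel line charges of opposite
sign can be deduced. Referring again to that figure, one sees that the potential is independent of z and given by

$$\Phi(x,y) = \frac{1}{4\pi\epsilon_0}\int_{-\infty}^{\infty}\frac{\chi\,d\zeta}{\mathfrak{r}_1} - \frac{1}{4\pi\epsilon_0}\int_{-\infty}^{\infty}\frac{\chi\,d\zeta}{\mathfrak{r}_2} = \frac{\chi}{4\pi\epsilon_0}\ln\frac{x^2 + (y + d/2)^2}{x^2 + (y - d/2)^2} \tag{3.30}$$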

Equipotential surfaces for this distribution occur when


$$\frac{x^2 + (y - d/2)^2}{x^2 + (y + d/2)^2} = k^2$$

in which k is a constant, for then

$$\Phi(x,y) = -\frac{\chi}{2\pi\epsilon_0}\ln k$$

Rearrangement of terms gives

$$x^2 + \left(y - \frac{d}{2}\,\frac{1 + k^2}{1 - k^2}\right)^2 = \frac{d^2k^2}{(1 - k^2)^2} \tag{3.31}$$

which is the equation of a right circular cylinder parallel to the z axis. The equipotential surfaces are a family of nonconcentric nesting cylinders with centers at $(0,\ d(1 + k^2)/2(1 - k^2))$
and radii $d[k/(1 - k^2)]$. A few contours of these equipotentials are sketched in the figure.
Formation of the gradient of (3.30) gives

$$\mathbf{E} = -\nabla\Phi = \frac{\chi}{2\pi\epsilon_0}\left[\frac{\mathbf{1}_x x + \mathbf{1}_y(y - d/2)}{x^2 + (y - d/2)^2} - \frac{\mathbf{1}_x x + \mathbf{1}_y(y + d/2)}{x^2 + (y + d/2)^2}\right]$$

which agrees with the result of Example 3.4.
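The statement that the equipotentials are nonconcentric circles in the xy plane can be spot-checked. The sketch assumes the positive line at (0, d/2) and the negative line at (0, -d/2), and takes k to be the ratio of the distance from the positive line to the distance from the negative line; these conventions, like the function names, are assumptions made for the illustration. Points are sampled on the circle with center $(0,\ d(1+k^2)/2(1-k^2))$ and radius $dk/|1-k^2|$, and the distance ratio is evaluated at each.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m


def distance_ratio(x, y, d):
    """Distance to the +chi line at (0, d/2) over distance to the -chi line at (0, -d/2)."""
    return math.hypot(x, y - d / 2) / math.hypot(x, y + d / 2)


def potential(x, y, d, chi):
    """Potential of the line pair under the sign convention assumed above."""
    return -chi / (2.0 * math.pi * EPS0) * math.log(distance_ratio(x, y, d))


def equipotential_circle(k, d):
    """Center ordinate and radius of the circle on which the ratio equals k."""
    center_y = 0.5 * d * (1.0 + k * k) / (1.0 - k * k)
    radius = d * k / abs(1.0 - k * k)
    return center_y, radius


d, chi = 2.0, 1e-9               # illustrative values
ratios = {}
for k in (0.5, 2.0):
    yc, rad = equipotential_circle(k, d)
    ratios[k] = [distance_ratio(rad * math.cos(t), yc + rad * math.sin(t), d)
                 for t in (0.1, 1.0, 2.5, 4.0, 5.5)]
print(ratios)
```

With k < 1 the circle encloses the positive line and the potential on it is positive; with k > 1 the circle is the mirror image about the median plane.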

3.5 GAUSS' LAW

Let a new static vector field be introduced by the relation

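$$\mathbf{D}_0(x,y,z) = \epsilon_0\mathbf{E} = \frac{1}{4\pi}\int_V \rho(\xi,\eta,\zeta)\,\frac{\boldsymbol{\mathfrak{r}}}{\mathfrak{r}^3}\,d\xi\,d\eta\,d\zeta \tag{3.32}$$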

in which, through use of the delta function, $\rho$ can represent volume, surface, lineal, or
discrete charge distributions. V is a volume large enough to encompass all the charges
of the system. $\mathbf{D}_0$ is called the electric flux density function and the zero subscript is a
reminder that the present discussion is limited to charges in a vacuum. The units of
$\mathbf{D}_0$ are coul/m², as can be seen by inspection of (3.32).
Consider evaluation of the surface integral

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \oint_{S_G}\left[\frac{1}{4\pi}\int_V \rho\,\frac{\boldsymbol{\mathfrak{r}}}{\mathfrak{r}^3}\,dV\right]\cdot d\mathbf{S}$$

in which $S_G$ is a closed surface bounding a volume $V_G$. $V_G$ and V can bear any general
relation to each other: either may totally include the other, they may have a subvolume in common, or they may be nonintersecting. Since the extents of these two
volumes are independent, the order of integration can be reversed, giving

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \frac{1}{4\pi}\int_V \rho(\xi,\eta,\zeta)\left[\oint_{S_G}\frac{\boldsymbol{\mathfrak{r}}\cdot d\mathbf{S}}{\mathfrak{r}^3}\right]d\xi\,d\eta\,d\zeta \tag{3.33}$$

In (3.33), $\rho(\xi,\eta,\zeta)\,d\xi\,d\eta\,d\zeta$ is a source element of charge in the volume V, and $\boldsymbol{\mathfrak{r}}$ is
drawn from the source element to the surface element $d\mathbf{S}$ in $S_G$. This situation is
depicted in Figure 3.9.

FIGURE 3.9 Geometry for establishment of Gauss' law.

Consider the evaluation of the surface integral on the right side of (3.33) for the
case of a point $(\xi,\eta,\zeta)$ interior to $S_G$. Let $dS_A$ be a surface element with a central point
$P_A$ as shown. If lines are drawn from every point on the perimeter of $dS_A$ to $(\xi,\eta,\zeta)$,
the cone thus formed includes a solid angle $d\Omega_A$. The projection of $dS_A$ onto a sphere
through $P_A$ with $(\xi,\eta,\zeta)$ as center therefore has an area equal to $\mathfrak{r}^2\,d\Omega_A$. Thus

$$\frac{\boldsymbol{\mathfrak{r}}\cdot d\mathbf{S}_A}{\mathfrak{r}^3} = \frac{\boldsymbol{\mathfrak{r}}\cdot\mathbf{1}_n}{\mathfrak{r}}\,\frac{dS_A}{\mathfrak{r}^2} = \frac{\boldsymbol{\mathfrak{r}}\cdot\mathbf{1}_n}{\mathfrak{r}}\,\frac{d\Omega_A}{\cos\theta_A}$$

in which $\mathbf{1}_n$ is the outward-drawn unit vector normal to $dS_A$ and $\theta_A$ is the acute
angle of intersection of the sphere and $dS_A$. By consideration, in this fashion, of all the
elements of solid angle centered at $(\xi,\eta,\zeta)$, every surface element in $S_G$ is included
and one can write

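$$\oint_{S_G}\frac{\boldsymbol{\mathfrak{r}}\cdot d\mathbf{S}}{\mathfrak{r}^3} = \int_{4\pi}\frac{\boldsymbol{\mathfrak{r}}\cdot\mathbf{1}_n}{\mathfrak{r}}\,\frac{d\Omega}{\cos\theta}$$

But $\boldsymbol{\mathfrak{r}}\cdot\mathbf{1}_n/\mathfrak{r} = -\cos\theta$ or $+\cos\theta$ depending on whether or not $\boldsymbol{\mathfrak{r}}$ and $\mathbf{1}_n$ are oppositely
directed. For the solid angle $d\Omega_A$ there is one intersection with the surface $S_G$ and the
contribution to the above integral is therefore $+d\Omega_A$. For the solid angle $d\Omega_B$ there are
three intersections. The contributions at $P_B$ and $P_B''$ are each $+d\Omega_B$ whereas the contribution at $P_B'$ is $-d\Omega_B$, for a net contribution of $+d\Omega_B$. It is apparent that since
$(\xi,\eta,\zeta)$ is inside $S_G$, for each element of solid angle $d\Omega$ there must be an odd number of
intersections and hence a net contribution of $+d\Omega$. Thus

$$\oint_{S_G}\frac{\boldsymbol{\mathfrak{r}}\cdot d\mathbf{S}}{\mathfrak{r}^3} = \int_{4\pi} d\Omega = 4\pi \tag{3.34}$$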

When an exterior point $(\xi',\eta',\zeta')$ is considered, each element of solid angle $d\Omega$ makes
an even number of intersections with $S_G$, half of which give a contribution $+d\Omega$ and
half of which give a contribution $-d\Omega$, for a net contribution which is null. Thus the
result (3.34) applies for any point $(\xi,\eta,\zeta)$ interior to $S_G$ but must be replaced by zero
for any point $(\xi',\eta',\zeta')$ exterior to $S_G$. Therefore (3.33) becomes

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \int_{V_G}\rho(\xi,\eta,\zeta)\,dV \tag{3.35}$$

In words, the integral of the normal component of $\mathbf{D}_0$ over a closed surface $S_G$ is equal
to the net charge in the volume $V_G$ enclosed by $S_G$. This is Gauss' law.
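Gauss' law can be illustrated with a brute-force surface integration. The sketch below assumes a single point charge and a cubical gaussian surface, chosen only because its faces are easy to grid; the function name, grid density and charge positions are illustrative. The flux of $\mathbf{D}_0 = q\,\boldsymbol{\mathfrak{r}}/4\pi\mathfrak{r}^3$ is summed over the six faces for a charge inside the cube and for one outside it.

```python
import math


def flux_through_cube(q, src, half=1.0, n=200):
    """Outward flux of D0 = q r / (4 pi r^3) through the cube |x|, |y|, |z| <= half.

    Each face is divided into n x n cells and the integrand is sampled at cell centers.
    """
    h = 2.0 * half / n
    total = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):
            for i in range(n):
                u = -half + (i + 0.5) * h
                for j in range(n):
                    v = -half + (j + 0.5) * h
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half
                    p[(axis + 1) % 3] = u
                    p[(axis + 2) % 3] = v
                    d = [p[m] - src[m] for m in range(3)]
                    r = math.sqrt(sum(c * c for c in d))
                    # The outward normal is sign * (unit vector along axis).
                    total += sign * q * d[axis] / (4.0 * math.pi * r**3) * h * h
    return total


q = 1e-9                                              # illustrative value
inside = flux_through_cube(q, (0.3, -0.2, 0.1))       # charge enclosed
outside = flux_through_cube(q, (2.5, 0.0, 0.0))       # charge not enclosed
print(inside / q, outside / q)
```

The first ratio is close to one and the second close to zero, independent of where the charge sits within its region, as the solid-angle argument requires.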
Several useful corollaries to Gauss' law follow readily. First, if the closed surface $S_G$
is constructed so that every point of $S_G$ is occupied by conducting material, the total
charge within $S_G$ is zero. This is a consequence of the fact that in electrostatic equilibrium, $\mathbf{E} \equiv 0$ within a conductor† and thus $\mathbf{D}_0 \equiv 0$ also. Therefore $\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = 0$.
Second, in electrostatic equilibrium, there is no net charge at any interior point of a
conductor. This follows because such a point can be surrounded by a surface $S_G$ which
lies wholly within the conductor. Thus the excess charge of a conductor resides on its
outer surface in electrostatic equilibrium.‡
Third, if any number of arbitrarily charged bodies be placed inside a hollow closed
conductor, the charge on its inside surface will be equal and opposite to the total charge
on the enclosed bodies. This can be shown by constructing $S_G$ to consist only of interior
points of the hollow conductor, at all of which points $\mathbf{D}_0 = 0$.
EXAMPLE 3.8

Consider a static electric system consisting of a single charge of strength q located at the
origin. For this case (3.32) gives

$$\mathbf{D}_0 = \frac{q}{4\pi}\,\frac{\mathbf{r}}{r^3}$$

If a gaussian surface $S_G$ is constructed, consisting of a sphere of radius r centered on the
charge, then over this sphere $\mathbf{D}_0$ is everywhere normal with a magnitude $q/4\pi r^2$. Thus

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \frac{q}{4\pi r^2}\,(4\pi r^2) = q$$

since $4\pi r^2$ is the surface area of the sphere. This result is seen to agree with Gauss' law.
If a charge Q resides on a spherical conducting shell of outer radius a, as in Example 3.3,
by the second corollary to Gauss' law, this charge is found on the outer surface. By symmetry it has a uniform surface density $\sigma = Q/4\pi a^2$. If a gaussian surface $S_G$ is drawn, which
consists of a concentric sphere of radius b, then if b > a,

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = Q$$

from which, by symmetry

$$D_0 = \frac{Q}{4\pi r^2}$$

However, if b < a, then

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = 0$$

and thus, again invoking symmetry

$$\mathbf{D}_0 \equiv 0$$

Upon division of these results by $\epsilon_0$ to obtain $\mathbf{E}$, agreement is found with Example 3.3.

† To say that $\mathbf{E} \equiv 0$ requires some elaboration. In a metallic body, for example, at the atomic level
one can imagine an array of positive ions which can vibrate about their lattice sites, plus a cloud of
electrons which are free to wander throughout the body. The vibrations of the ions and the wanderings
of the free electrons are both random thermal effects, and local fluctuating electric fields exist at any
atom site, varying with the motions of the nearby ions and electrons. However, the time-average value
of this electric field is zero unless there is a drift of the electron cloud (current) through the metallic
body. In electrostatic equilibrium no such macroscopic drifts occur. The statement that $\mathbf{E}(x,y,z) \equiv 0$
implies this, and $\mathbf{E}(x,y,z)$ can be interpreted properly as the time-independent component of the electric
field within the conductor. Since the conductor is being viewed as an assemblage of charges in a
vacuum, the defining relation $\mathbf{D}_0 = \epsilon_0\mathbf{E}$ is applicable and therefore, within the conductor, $\mathbf{D}_0 \equiv 0$ also.
‡ The net result is that a charged conductor can be viewed internally as an electrically neutral body,
but one possessing a charge distribution over its exterior surface. The interior of the body is equivalent
to a vacuum once equilibrium is established, its role having been properly to distribute the surface
charge. This proper distribution is such that if the conducting body were removed, leaving the surface
charges intact in a vacuum, $\mathbf{E}$ would still be zero at all points formerly occupied by conductor.

EXAMPLE 3.9

Another academic problem, from which several practical results can be derived in due
course, involves two concentric conducting right circular cylinders of infinite length. A short
length of this geometry is shown in the figure. If it is assumed that a lineal charge density of
$+\chi$ coul/m exists on the inner conductor, the second corollary to Gauss' law indicates that
this charge resides on the surface r = a; by symmetry it is uniformly distributed over this
surface. The third corollary leads to the conclusion that the surface r = b contains a charge
of $-\chi$ coul/m; by symmetry this is also uniformly distributed.
The field that exists in the vacuous region between these coaxial cylinders can be deduced
with the aid of Gauss' law. Let a gaussian surface $S_G$ be erected, composed of a concentric
cylinder of radius r, with a < r < b, and two end caps at the positions $z = \pm L$. By symmetry, $\mathbf{D}_0$ is entirely radial so the integrals over the end caps vanish, and

$$2L\chi = \oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \int_{-L}^{L} D_0(r)\,2\pi r\,dz = 4\pi rL\,D_0(r)$$

From this it follows that

$$D_0(r) = \frac{\chi}{2\pi r} \qquad \mathbf{E}(r) = \mathbf{1}_r\,\frac{\chi}{2\pi\epsilon_0 r} \qquad \Phi(r) = \Phi(a) - \frac{\chi}{2\pi\epsilon_0}\ln\frac{r}{a}$$

If the outer cylinder is grounded and the inner cylinder maintained at a potential $V_b$, this
last result becomes

$$\Phi(r) = V_b\,\frac{\ln(r/b)}{\ln(a/b)}$$
As one would expect, the equipotential surfaces are concentric cylinders.


These results can be put to practical use in the treatment of tubular condensers and
coaxial transmission lines.
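As a quick consistency check on the coaxial results, the following sketch evaluates the potential between the cylinders and compares its radial derivative with the field obtained from Gauss' law. The lineal charge density is eliminated by setting r = b in the potential expression, which gives $\chi = 2\pi\epsilon_0 V_b/\ln(b/a)$; the dimensions, the voltage and the function names are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m


def coax_potential(r, a, b, v_inner):
    """Potential for a <= r <= b, outer cylinder grounded, inner cylinder at v_inner."""
    return v_inner * math.log(r / b) / math.log(a / b)


def line_density(a, b, v_inner):
    """Lineal charge density on the inner conductor that produces v_inner."""
    return 2.0 * math.pi * EPS0 * v_inner / math.log(b / a)


def coax_field(r, a, b, v_inner):
    """Radial field from Gauss' law: chi / (2 pi eps0 r)."""
    return line_density(a, b, v_inner) / (2.0 * math.pi * EPS0 * r)


a, b, v_inner = 1e-3, 5e-3, 100.0          # illustrative values
r, h = 2.5e-3, 1e-8
numeric_field = -(coax_potential(r + h, a, b, v_inner)
                  - coax_potential(r - h, a, b, v_inner)) / (2.0 * h)
print(coax_potential(a, a, b, v_inner), coax_potential(b, a, b, v_inner))
print(numeric_field, coax_field(r, a, b, v_inner))
```

The potential takes the values $V_b$ and zero on the two conductors, and its negative radial derivative reproduces the 1/r field between them.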

3.6 ELECTRIC FLUX

A graphic method for displaying any vector field is described in Example V.21 of the
Mathematical Supplement. This technique can be applied to the field $\mathbf{D}_0$ with the
advantage that it frequently provides conceptual help in the understanding of problems.
The spatial vector function $\mathbf{D}_0(x,y,z)$ has a value at the point P(x,y,z) given by
Equation (3.32). Imagine that lines are constructed at P parallel to $\mathbf{D}_0$ and marked
with arrows pointing in the direction of $\mathbf{D}_0$. If the number of lines per square meter
which pass through a small area erected at P transverse to $\mathbf{D}_0$ is chosen to be numerically
equal to the magnitude of $\mathbf{D}_0$ at P, and if this is done for all points P, a field map of
$\mathbf{D}_0$ results.
The lines which represent $\mathbf{D}_0$ are known as electric flux lines, and since their density
gives the value of $\mathbf{D}_0$, one can see why $\mathbf{D}_0$ is known as the electric flux density function.
For any closed surface $S_G$

$$\psi = \oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} \tag{3.36}$$

is the net number of electric flux lines emerging from $S_G$. Hence an alternative statement
of Gauss' law is that the net number of electric flux lines emerging from a closed surface
$S_G$ is numerically equal to the total charge enclosed by $S_G$. Since $S_G$ may be chosen
small enough so as to enclose only one discrete charge, it follows that the number of
electric flux lines originating on a positive charge $q_1$ is numerically equal to $q_1$, and that
the number of electric flux lines terminating on a negative charge $q_2$ is numerically
equal to $q_2$. If $S_G$ encloses no charge at all, $\psi = 0$, which means that as many flux lines
enter $S_G$ as leave it. Thus electric flux lines are continuous except at points where there
is charge. All these deductions may be summed up by the two statements:
1. All lines of electric flux originate on positive charge and terminate on negative
charge.†
2. The net efflux $\psi$ at a point P is numerically and algebraically equal to the electric
charge at P.

† Some charge may have to be considered to be placed at infinity in order to "complete" the charge
system.

EXAMPLE 3.10

Consider again the case of a single charge q placed at the origin. If q is positive, then $\psi = q$
lines of electric flux emerge from the origin; by symmetry, they are uniformly distributed
in three dimensions, as suggested by the figure. At any radius r, the density of these lines is

$$D_0 = \frac{\psi}{4\pi r^2}$$

so that

$$\mathbf{D}_0 = \frac{q}{4\pi}\,\frac{\mathbf{r}}{r^3}$$

which agrees with Example 3.8.


In the absence of all other charge, these flux lines would extend radially to infinity, there
to be terminated by a total charge -q, uniformly distributed over an infinite sphere. If the
charge q at the origin is negative, the directions of the arrows on the flux lines are reversed.

EXAMPLE 3.11

Next, reconsider the doublet of Example 3.5, consisting of charges q and -q a distance d
apart. All q of the lines of electric flux leaving the positive charge terminate on the negative
charge. Since for this system

$$\mathbf{D}_0 = \frac{q}{4\pi}\left(\frac{\boldsymbol{\mathfrak{r}}_1}{\mathfrak{r}_1^3} - \frac{\boldsymbol{\mathfrak{r}}_2}{\mathfrak{r}_2^3}\right)$$

it follows that for points very close to the charge +q,

$$\mathbf{D}_0 \approx \frac{q}{4\pi}\,\frac{\boldsymbol{\mathfrak{r}}_1}{\mathfrak{r}_1^3}$$

so that the flux lines start out radially and uniformly from +q and then bend around toward
the charge -q, where they enter radially and uniformly. This is enough information to
allow a rapid and informative sketching of the field, with the result indicated by the heavy
solid lines in the figure.
Through use of the field expression developed in Example 3.5, if $\mathfrak{r} \gg d$,

$$\mathbf{D}_0 = \frac{qd}{4\pi\mathfrak{r}^3}\left(\mathbf{1}_r\,2\cos\theta + \mathbf{1}_\theta\sin\theta\right)$$

which is seen to be consistent with the flux plot.

[Figure: flux lines (heavy solid) and equipotentials of the doublet.]

Equipotential surfaces can also be added to a flux map and they are everywhere
perpendicular to the electric flux lines. In the preceding simple example of a single
charge, the equipotential surfaces are concentric spheres. In the case of the doublet,
they are figures of rotation about the line connecting the charges; the profiles of several
equipotentials are shown in the figure.

3.7 A CONDUCTOR-VACUUM INTERFACE


The relation between flux lines and the charge which resides on conductors is of special
interest. Consider the case of the electrified conductor of exterior surface S shown in
Figure 3.10. It already has been established, through the second corollary to Gauss'
law, that all the excess charge resides on the exterior surface S. Further, it has been seen
that in electrostatic equilibrium, $\mathbf{E} \equiv 0$ throughout the conductor, for if this were not
so, charges would be flowing, in denial of the assumption of equilibrium. Thus between
any two points in the conductor $\int\mathbf{E}\cdot d\boldsymbol{\ell} \equiv 0$, because these points can be connected
by a path which lies wholly in the conductor. It follows that the conductor is an
equipotential.
If the conducting body is viewed as an assemblage of charges (some mobile) in a
vacuum, it follows that $\mathbf{D}_0 = \epsilon_0\mathbf{E} \equiv 0$ within the conductor, and therefore all the
electric flux lines associated with the excess charge are external to the exterior surface S.
Further, these flux lines must be normal to S. If they were not, this would imply a
tangential component of $\mathbf{D}_0$, and thus of $\mathbf{E}$, in S; this would give rise to surface charge
flow, violating the premise of static equilibrium. This conclusion that the flux lines
must be normal to S is also consistent with the fact that S is an equipotential surface.

FIGURE 3.10 An electrified conductor: panels (a), (b), (c).

If the total excess charge in a surface element dS of S is $\sigma\,dS$, Gauss' law requires
that $d\psi = \sigma\,dS$ be the total number of flux lines just outside of dS associated with the
charge in dS. But $d\psi = D_0\,dS$, in which $D_0$ is the electric flux density immediately
outside of dS. Therefore, at any point on the exterior surface of an electrified conductor

$$\sigma = D_0 \tag{3.37}$$

In general $\sigma$ and $D_0$ are functions of position on the conductor surface. A possible
distribution is indicated by the flux lines in Figure 3.10a.
All of the foregoing conclusions apply whether the conductor whose exterior surface
is S is solid or hollow. Let it be assumed that it is a hollow body, with an interior surface S', as shown in Figure 3.10b. A gaussian surface $S_G$ can be constructed which lies
as close to S' as one pleases, with all points of $S_G$ lying in conductor. Gauss' law then
yields the result that S', considered as a whole, must be electrically neutral. This does
not prove that S' must everywhere be locally neutral. One could imagine a charge
distribution over S', somewhat as suggested by Figure 3.10c, with the flux lines running
through the hollow interior from the positive cluster of charge to the negative cluster
of charge.
But such a charge distribution on S' is not in static equilibrium. To appreciate this,
one need only form the integral $\int\mathbf{E}\cdot d\boldsymbol{\ell}$ along two paths from $P_1$ to $P_2$, one path being
entirely in the conductor, the other through the hollow interior along a flux line. Since
$\mathbf{E} \equiv 0$ along the first path, the integral vanishes there, and it must vanish along the second path also because
potential difference is independent of path. Along a flux line $\mathbf{E}$ is everywhere parallel to $d\boldsymbol{\ell}$, so the integrand cannot change sign, and $\mathbf{E}$ must be identically zero along the second path. Therefore, the interior surface S' must be
everywhere locally neutral and the hollow interior region is field-free. This is a feature
which can be used to advantage when it is desired to shield equipment from external
electric fields.
EXAMPLE 3.12

Previous examples have established that the electric flux density external to a conducting
spherical shell of outer radius a, possessing an excess charge Q, is $\mathbf{D}_0 = (Q/4\pi)(\mathbf{r}/r^3)$.
Right at the surface

$$\sigma = D_0 = \frac{Q}{4\pi a^2}$$

which is consistent with the fact that the surface area is $4\pi a^2$.

3.8 THE METHOD OF IMAGES

Certain problems in electrostatics may be simplified by the application of an image
technique which can be established through the use of Gauss' law. To this end, consider an arbitrary complete system of discrete fixed charges.† Figure 3.11a is intended
to suggest this general situation, with flux lines drawn from positive charges to negative charges, and equipotentials shown lighter and transverse to the $\mathbf{D}_0$ field.
Let $\Phi_0$ be a closed equipotential surface with $\mathbf{1}_n$ an outward-drawn unit vector
normal to $\Phi_0$ at the point P. The surface $\Phi_0$ divides the system of charges into an
internal part and an external part such that

$$\sum_{n=1}^{N} q_n = Q_{int} + Q_{ext} = 0$$

In words, the algebraic sum of the external charges equals minus the algebraic sum
of the internal charges.
If a surface charge density $\sigma = -\mathbf{D}_0\cdot\mathbf{1}_n$ coul/m² is placed at each point P on the
surface $\Phi_0$, at the same time that all charges exterior to $\Phi_0$ are removed, the result
will be as shown in Figure 3.11b. The $\mathbf{D}_0$ field inside $\Phi_0$ will be unaltered, whereas the field
outside will be completely erased. The surface $\Phi_0$ will still be an equipotential.
Suppose next that an extremely thin, electrically neutral conductor, shaped in the
form of the surface $\Phi_0$, is slipped into the position of $\Phi_0$ in such a way as not to disturb
any of the interior charge. Suppose further that the charges which make up the distribution $\sigma$ become attached to the conducting surface. Since these charges were in
transverse equilibrium before the insertion of the conducting surface ($E_{tan} \equiv 0$ over an
equipotential surface), they will not move even though they are now on a conductor and
free to do so. The important conclusion is reached that the field inside $\Phi_0$ is the same
in the presence of the conductor, charged with the distribution $\sigma$, as it was originally
when the external charges were present.
For any gaussian surface $S_G$ erected outside the conducting shell, $\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = 0$ by
virtue of the fact that now $\mathbf{D}_0 \equiv 0$ outside $\Phi_0$. This means that the total charge inside
$S_G$ is zero, or that

$$Q_{int} + \oint_{\Phi_0}\sigma\,dS = 0$$

so that

$$\oint_{\Phi_0}\sigma\,dS = Q_{ext}$$

† By a complete system one means N charges of values $q_n$, placed at arbitrary positions, but such that
$\sum q_n = 0$. If charges are imagined to exist at infinity, every system can be considered a complete
system.

FIGURE 3.11 The method of images: panels (a), (b), (c).

A vivid way to picture what has been done is to imagine that all flux lines which
have extremities external to $\Phi_0$ have unit charges of appropriate sign attached to those
extremities, these charges making up $Q_{ext}$. These flux lines are permitted to contract,
pulling the unit charges with them until all of $Q_{ext}$ has collapsed onto a conductor placed
at $\Phi_0$, thus forming the distribution $\sigma$. This erases the external field but leaves the
internal field intact.
Alternatively, if a surface charge density $\sigma' = \mathbf{D}_0\cdot\mathbf{1}_n = -\sigma$ is placed at all points
on the surface $\Phi_0$ at the same time that all interior charges are removed, the result
will be as shown in Figure 3.11c. The field outside $\Phi_0$ will be unaltered whereas the
interior field will be completely erased. If the shaped conducting shell is put in place
and the charges comprising $\sigma'$ are allowed to become attached to it, no change in their
distribution will occur. Thus the field outside $\Phi_0$ is the same in the presence of the
conductor containing the surface charge distribution $\sigma'$ as it was when the original
discrete internal charges were present. One can conclude that

$$\oint_{\Phi_0}\sigma'\,dS = Q_{int}$$

and say that if $Q_{int}$ is allowed to collapse along its flux lines onto a conductor placed
at $\Phi_0$, thus forming the distribution $\sigma'$, then the internal field will be erased whereas
the external field will remain intact.
Once the situation of either Figure 3.11b or Figure 3.11c is achieved, it is no longer
necessary that the conductor be an extremely thin shell. It can assume any thickness
which encroaches only on the field-free region and can even be a solid conductor which
completely fills this region.
The above procedure turned around is the method of images. If one wishes to find the
field due to fixed charges and/or charged conducting bodies, a simple solution is available if the conductors can be replaced by properly positioned equivalent charges. For
simple conductor shapes the proper equivalence is often easily recognizable.
EXAMPLE 3.13

Previous examples have been concerned with the field and potential due to an electric dipole
(doublet) and a flux map for this system can be found with Example 3.11. The equipotential
contours which have been added to that flux map show that the plane which forms the perpendicular bisector of the line connecting the two charges is an equipotential surface of
value $\Phi = 0$. Therefore, the system consisting of a discrete charge q a distance d/2 above
a grounded conducting plane is equivalent (in a half-space) to a doublet. The image charge
-q, a distance d/2 behind the plane, can replace the plane and all the surface charge it
contains, for the purpose of computing the field anywhere above the plane. With reference to
the figure, the $\mathbf{D}_0$ field at any point $P(r,\phi,z)$ above the plane is therefore

$$\mathbf{D}_0 = \frac{q}{4\pi}\left\{\frac{\mathbf{1}_r r + \mathbf{1}_z(z - d/2)}{[r^2 + (z - d/2)^2]^{3/2}} - \frac{\mathbf{1}_r r + \mathbf{1}_z(z + d/2)}{[r^2 + (z + d/2)^2]^{3/2}}\right\}$$

in which cylindrical coordinates are being employed, with the origin selected as that point
in the conductor closest to q.
At the plane z = 0

$$\sigma(r) = \mathbf{D}_0\cdot\mathbf{1}_z = -\frac{qd}{4\pi}\left[r^2 + \left(\frac{d}{2}\right)^2\right]^{-3/2}$$

a distribution which is shown as the dotted area in the figure. The image technique is thus
seen to be additionally useful in yielding the charge distribution in the conductor. As a
check,

$$\int_0^{\infty}\sigma(r)\,2\pi r\,dr = -\frac{qd}{2}\int_0^{\infty}\frac{r\,dr}{[r^2 + (d/2)^2]^{3/2}} = -q$$

EXAMPLE 3.14

The case of two parallel line charges of opposite polarity has been treated in Example 3.7 in
which the equipotential surfaces were found to be nesting right circular cylinders. Combining the image technique with this result facilitates the solution of a problem of considerable practical importance.
Consider now the case of two straight parallel circular conductors, each of diameter 2a,
with a center-to-center spacing D, as shown in the figure. Let the upper and lower conductors
contain lineal charge densities of $+\chi$ and $-\chi$ coul/m, respectively. Since it is already known
that the equipotential surfaces for parallel line charges are cylindrical, it follows that the
system shown in the figure is equivalent to two line charges at an appropriate spacing d.
This spacing d must be selected so as to give equipotentials for the two surfaces occupied
by the outer skins of the two conductors. By use of Equation (3.31) this means that

$$\frac{D}{2} = \frac{d}{2}\,\frac{1 + k^2}{1 - k^2} \qquad a = d\,\frac{k}{1 - k^2}$$

which, solving for d, gives

$$d = (D^2 - 4a^2)^{1/2}$$

Upon inserting this value for d in (3.30) one obtains the expression for the potential anywhere in the space surrounding the two parallel conductors, namely

$$\Phi(x,y) = \frac{\chi}{4\pi\epsilon_0}\ln\frac{x^2 + \left[y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}{x^2 + \left[y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}$$

In particular, if (x,y) be chosen to correspond to any point on the surface of the upper conductor, such as (0, D/2 - a), the potential of the upper cylinder is found to be

$$\Phi_+ = \frac{\chi}{2\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}$$

Since the median plane is at zero potential, it follows that the lower conductor is at a
potential $\Phi_- = -\Phi_+$. This can also be deduced by inserting an appropriate point, such as
(0, -D/2 + a), into (3.30). The potential difference between the two conductors is therefore

$$V = 2\Phi_+ = \frac{\chi}{\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\} = \frac{\chi}{\pi\epsilon_0}\cosh^{-1}\frac{D}{2a}$$

If $D \gg a$, as is often the case in practice, then to first order

$$V = \frac{\chi}{\pi\epsilon_0}\ln\frac{D}{a}$$

These results are useful in a discussion of two-wire transmission lines.
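The closed forms above lend themselves to a short numerical check. The sketch below, with illustrative dimensions and hypothetical function names, computes the equivalent line spacing d, confirms that the logarithmic and inverse hyperbolic cosine forms of the potential difference coincide, and shows how close the $D \gg a$ approximation is for a widely spaced pair.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m


def equivalent_spacing(D, a):
    """Spacing d of the two line charges that replace conductors of radius a."""
    return math.sqrt(D * D - 4.0 * a * a)


def voltage_log_form(chi, D, a):
    u = D / (2.0 * a)
    return chi / (math.pi * EPS0) * math.log(u + math.sqrt(u * u - 1.0))


def voltage_acosh_form(chi, D, a):
    return chi / (math.pi * EPS0) * math.acosh(D / (2.0 * a))


def voltage_wide_spacing(chi, D, a):
    """First-order form valid when D is much larger than a."""
    return chi / (math.pi * EPS0) * math.log(D / a)


chi, a = 1e-9, 1e-3                       # illustrative values
D_close, D_wide = 3e-3, 100e-3

d_close = equivalent_spacing(D_close, a)
v_log = voltage_log_form(chi, D_close, a)
v_acosh = voltage_acosh_form(chi, D_close, a)
v_exact_wide = voltage_acosh_form(chi, D_wide, a)
v_approx_wide = voltage_wide_spacing(chi, D_wide, a)
print(d_close, v_log, v_acosh)
print(v_exact_wide, v_approx_wide)
```

For closely spaced conductors the equivalent line charges sit noticeably inside the conductor axes (d < D), which is why the simple logarithm is reliable only when the spacing is large compared with the radius.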

The problem of a uniform line charge parallel to a conducting cylinder is quite evidently
within the scope of this analysis and is left as an exercise.

EXAMPLE 3.15

The method of images can be applied to the problem of calculating the field due to a discrete
charge in the presence of a spherically shaped conductor, upon recognition of the fact that
any two discrete particles, statically separated and bearing charges of opposite sign but
arbitrary magnitude, give rise to one equipotential surface which is a sphere. To see this,
refer to the figure, in which $P_1$ and $P_2$ are the positions of the two charges $q_1$ and $q_2$, and O
is a point on the line which connects them. The potential at the point P is

$$\Phi(P) = \frac{1}{4\pi\epsilon_0}\left(\frac{q_1}{\mathfrak{r}_1} + \frac{q_2}{\mathfrak{r}_2}\right)$$

A zero potential surface is defined by the condition

$$\frac{\mathfrak{r}_2}{\mathfrak{r}_1} = -\frac{q_2}{q_1}$$

and this surface will be a sphere if the constant ratio $\mathfrak{r}_2/\mathfrak{r}_1$ permits the distance a from O to
P to be a constant also. This will be true if the triangles $OPP_1$ and $OP_2P$ are similar, for then

$$\frac{r_2}{a} = \frac{\mathfrak{r}_2}{\mathfrak{r}_1} = \frac{a}{r_1}$$

in which $r_1$ and $r_2$ are the fixed distances from O to the two charges. Thus an equipotential
surface exists which is spherical, with center at O and with a radius a which is the geometric
mean of the distances from O to each charge.
As can be seen by turning this problem around, if a grounded spherical conductor of
radius a is in the presence of an external discrete charge $q_1$, placed a distance $r_1$ from the
center of the sphere, the field for this system can be computed by replacing the sphere with
a discrete charge

$$q_2 = -\frac{a}{r_1}\,q_1$$

this equivalent charge being positioned a distance

$$r_2 = \frac{a^2}{r_1}$$

from O, in the direction toward $q_1$.


This solution can be generalized to permit any value $\Phi$ for the potential of the spherical
conductor by placing an additional equivalent charge

$$q_3 = 4\pi\epsilon_0 a\Phi$$

at the point $O$. Letting $q_3 = 0$ gives the case just discussed, that of a grounded sphere. Letting $q_3 = -q_2$ gives the case of an electrically neutral sphere.
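The image construction of Example 3.15 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the sample geometry (a = 0.10 m, r1 = 0.25 m, q1 = 1 nC), and the 50-volt sphere potential are all illustrative assumptions. It places the image charge $-q_1 a/r_1$ at distance $a^2/r_1$ from the center and samples the potential over the sphere surface.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def image_charge(q1, r1, a):
    """Image of a charge q1 at distance r1 > a from the center of a sphere of radius a.

    Returns (q2, r2): the image charge and its distance from the center.
    """
    return -q1 * a / r1, a * a / r1


def potential(point, charges):
    """Potential at `point` from a list of (charge, position) pairs."""
    return sum(q / math.dist(point, pos) for q, pos in charges) / (4.0 * math.pi * EPS0)


def sphere_points(a, n_theta=9, n_phi=8):
    """A coarse grid of sample points on the sphere r = a centered at the origin."""
    points = []
    for i in range(n_theta + 1):
        theta = math.pi * i / n_theta
        for k in range(n_phi):
            phi = 2.0 * math.pi * k / n_phi
            points.append((a * math.sin(theta) * math.cos(phi),
                           a * math.sin(theta) * math.sin(phi),
                           a * math.cos(theta)))
    return points


# Illustrative values (assumptions, not taken from the text).
a, r1, q1 = 0.10, 0.25, 1.0e-9
q2, r2 = image_charge(q1, r1, a)

# Grounded sphere: real charge plus image charge only.
grounded = [(q1, (r1, 0.0, 0.0)), (q2, (r2, 0.0, 0.0))]
max_grounded = max(abs(potential(p, grounded)) for p in sphere_points(a))

# Sphere held at a chosen potential: add q3 = 4*pi*eps0*a*Phi at the center.
sphere_potential = 50.0
q3 = 4.0 * math.pi * EPS0 * a * sphere_potential
raised = grounded + [(q3, (0.0, 0.0, 0.0))]
max_error_raised = max(abs(potential(p, raised) - sphere_potential)
                       for p in sphere_points(a))
```

Under these assumptions both `max_grounded` and `max_error_raised` should be at the level of floating-point rounding, which is the numerical statement that the sphere is an equipotential in each case.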

3.9 POISSON'S EQUATION

If a satisfactory model of the electrostatic system in question results from assuming
that the volume charge density function $\rho$ is well-behaved (i.e., has continuous first
derivatives), then it follows that $\mathbf{D}_0$ is well-behaved also. In this event, the divergence
theorem can be applied to (3.35) giving

$$\int_{V_G} \nabla\cdot\mathbf{D}_0\,dV = \oint_{S_G} \mathbf{D}_0\cdot d\mathbf{S} = \int_{V_G} \rho\,dV \tag{3.38}$$

in which $V_G$ is the volume bounded by the closed surface $S_G$. Since $V_G$ is completely
arbitrary, the integrands of the two volume integrals in (3.38) must equate point by
point; thus

$$\nabla\cdot\mathbf{D}_0 = \rho \tag{3.39}$$

But one can also write

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} = \epsilon_0(-\nabla\Phi)$$

so that (3.39) can be rewritten

$$\nabla^2\Phi = -\frac{\rho}{\epsilon_0} \tag{3.40}$$

This important differential equation is due to Poisson. It relates the spatial derivatives
of electrostatic potential at a point to the volume charge density at the point and can
be viewed as the differential form of Gauss' law. The voltage distributions inside
vacuum tubes are solutions to (3.40) and it is the basis for analysis of electron beam
compositions, for design problems in electron beam shaping, and for the determination
of electron densities in plasmas.
The formal solution to this differential equation has already been found and is given
by (3.18). However, Equation (3.18) is principally useful in problems for which the
charge distribution is known beforehand and one desires to find the potential function.
There are also problems in which one begins by knowing neither the potential function
nor the charge distribution and wishes to determine both; in such cases it is often
advantageous to begin with (3.40) and seek a particular solution.
EXAMPLE 3.16

Consider two parallel plates composed of conducting material. As shown in the figure, one
plate has its interior surface situated at x = 0 and it is to be supposed that this plate is
heated so that it emits electrons into the interspace. The other plate has its interior surface
at x = l. A constant potential difference is applied between the plates by a battery so that
the unheated plate is Vb volts above the heated plate, thus attracting the emitted electrons.
A steady time-independent current results and this device can be recognized as a rudimentary model of a diode. It is desired to find the distribution of electrons and potential in the
interspace and also the connection between current and plate voltage.
One may wonder why this problem, which involves moving charges, is treated in a
chapter on electrostatics. However, since the current is time-independent, at any point
between the plates there always must be as much charge arriving per unit time as leaving, so
the amount of charge at the point remains constant. Thus, even though the identity of the
charges in a volume element $dV$ keeps changing, the amount of charge $\rho\,dV$ does not.
Therefore, $\rho$ may be a function of space but not of time and Poisson's Equation (3.40) may
be used to deduce $\Phi$, which also will be a function of space but not of time.

[Figure: parallel-plate diode; heated plate (cathode) at $x = 0$ with $\Phi = 0$, second plate at $x = l$.]

For simplicity the plate dimensions will be taken large compared to $l$ so that variations
of voltage and charge density in the transverse directions may be ignored. Then (3.40)
becomes

$$\frac{d^2\Phi}{dx^2} = -\frac{\rho(x)}{\epsilon_0}$$

It will be assumed that the electrons, under the repulsive action of the electron cloud already
in the interspace, are barely able to get out of the cathode, emerging with negligible
initial energy. Then if $v(x)$ is the electron velocity at a distance $x$ from the heated plate
(cathode),

$$\tfrac{1}{2}mv^2 = e\Phi(x)$$

relates the kinetic energy of the electron to the work that has been done on it by the field,
as it moves from the cathode through a distance $x$. In the above expression $-e$ and $m$ are the
electronic charge and mass and the cathode has been assumed grounded.
Further, since $\rho$ is the volume density of charge, $\rho v\,dA\,dt$ is the charge in a tube of cross section $dA$ and of directed length $v\,dt$. All this charge will pass through the tube in time $dt$.
Defining current as charge/sec passing a cross section, one can write

$$\iota\,dA = \frac{\rho v\,dA\,dt}{dt}$$

in which $\iota$ is the current density, expressed in amp/m$^2$. Thus

$$\iota = \rho v$$

is the relation between current and the flow of charge. In this problem, $\iota$ is a constant, being
independent of both time and space. $\rho$ and $v$ are both functions of $x$ but their product is not.
With the aid of these two auxiliary relations, Poisson's equation can be rewritten

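$$\frac{d^2\Phi}{dx^2} = -\frac{\iota}{\epsilon_0}\left(\frac{m}{2e}\right)^{1/2}\Phi^{-1/2}$$

which is readily integrated to give

$$\Phi(x) = V_b\left(\frac{x}{l}\right)^{4/3}$$

a solution which satisfies $\Phi(0) = 0$ and $\Phi(l) = V_b$ as well as the condition $d\Phi/dx = 0$ at
$x = 0$, imposed by the assumption that the electrons barely get out of the cathode.
From this it follows that

$$v(x) = \left(\frac{2e}{m}V_b\right)^{1/2}\left(\frac{x}{l}\right)^{2/3}$$

$$\rho(x) = -\tfrac{4}{9}\epsilon_0 V_b\,(xl^2)^{-2/3}$$

so that

$$\iota = -KV_b^{3/2} \tag{3.41}$$

in which

$$K = \frac{4}{9}\left(\frac{2e}{m}\right)^{1/2}\frac{\epsilon_0}{l^2} = \frac{2.33\times10^{-6}}{l^2}$$

with $l$ expressed in meters.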


Equation (3.41) is known as the Child-Langmuir law and is obtained for any geometry
of cathode and plate; the only factor affected by a change in geometry is the parameter
$K$, and its variability is not great. This nonlinear relation between current and voltage in a
diode is a most vital characteristic, being responsible, as an example, for a useful technique
in signal detection. Equation (3.41) has been verified extensively by experiments employing
a variety of geometries.
The presence of a minus sign in (3.41) may seem surprising but it can be traced to the
equation $\iota = \rho v$. The electrons have velocities in the positive X direction but their charge
density $\rho$ is negative so that $\iota$ is negative; that is, it constitutes a current in the negative X
direction.
Plots of $\Phi$, $\rho$, and $v$ versus $x$ can be found in the graph.

[Graph: $\Phi$, $\rho$, and $v$ plotted against $x$.]
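The relations of Example 3.16 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names, the plate spacing of 1 cm, and the plate voltage of 100 volts are assumptions introduced here, and the physical constants are the standard SI values. It evaluates the constant $Kl^2$ and confirms that the product $\rho v$ is the same at several positions between the plates, as the steady-current argument requires.

```python
import math

EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
E_CHARGE = 1.602176634e-19   # magnitude of the electronic charge, C
E_MASS = 9.1093837015e-31    # electron mass, kg


def k_times_l_squared():
    """The product K * l**2 appearing in (3.41), in SI units."""
    return (4.0 / 9.0) * EPS0 * math.sqrt(2.0 * E_CHARGE / E_MASS)


def phi(x, l, vb):
    """Space-charge-limited potential, Vb * (x / l)**(4/3)."""
    return vb * (x / l) ** (4.0 / 3.0)


def velocity(x, l, vb):
    """Electron speed from the energy relation (1/2) m v**2 = e * phi."""
    return math.sqrt(2.0 * E_CHARGE * phi(x, l, vb) / E_MASS)


def rho(x, l, vb):
    """Charge density from Poisson's equation applied to phi."""
    return -(4.0 / 9.0) * EPS0 * vb * (x * l * l) ** (-2.0 / 3.0)


def current_density(l, vb):
    """Child-Langmuir current density (3.41); negative for electron flow along +x."""
    return -k_times_l_squared() * vb ** 1.5 / l ** 2


# Illustrative geometry and voltage (assumptions).
l, vb = 0.01, 100.0
k_l2 = k_times_l_squared()
products = [rho(x, l, vb) * velocity(x, l, vb) for x in (0.002, 0.005, 0.009)]
iota = current_density(l, vb)
```

With these assumed values each entry of `products` should agree with `iota`, and `k_l2` should reproduce the quoted figure of about 2.33e-6.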

3.10 LAPLACE'S EQUATION

For those electrostatic problems in which the charge distribution is known completely,
Equation (3.18) can be used to determine the potential function and the relation
$\mathbf{E} = -\nabla\Phi$ can then be used to deduce the field intensity. It has been noted earlier,
however, that problems in which the charge distribution is not known beforehand arise
frequently. When this is the case the potential function in regions containing charge
can be obtained by solving Poisson's equation. Unfortunately, solutions to this equation are very difficult to obtain except for a limited class of relatively simple situations.
However, an extensive number of practical problems exists in which the charges are
confined to the surfaces of conductors, or otherwise constrained to occupy a limited
region. Under such conditions it is often advantageous first to determine the potential
distribution in the adjacent charge-free regions. For such regions, Poisson's equation
reduces to
$$\nabla^2\Phi = 0 \tag{3.42}$$
which is known as the Laplace equation after its discoverer.
Solutions to Laplace's equation must be chosen to match the conditions at the
boundaries of the charge-free regions, which is the link whereby the charges of the
system affect the potential distributions. For example, if all the boundaries of the
charge-free regions are conducting surfaces, then the constant potential over each
of these surfaces might be specified. This is called a Dirichlet problem. Solving Laplace's
equation for the potential subject to these boundary conditions, and forming the
gradient, permits one to determine Do at all boundary points; by this means the charge
distributions over the conducting surfaces are deduced.
Alternatively, the charge distributions over all the boundary surfaces may be specified, which is equivalent to stating the normal derivative of potential as a boundary
condition. This is called a Neumann problem. Solving Laplace's equation for the
potential, subject to these boundary conditions, yields not only the potential distribution throughout the charge-free region but also the potential of each conducting surface
forming a boundary.
It is of value to know that the solutions to Laplace's equation so obtained are unique.
To see that this is the case, imagine that two functions $\Phi_1$ and $\Phi_2$ have been found,
each of which satisfies Laplace's equation in the charge-free regions, and each of which
satisfies the boundary conditions. Then, since Laplace's equation is linear, their difference $\Phi_1 - \Phi_2$ is also a solution and one can use the divergence theorem to write

$$\oint_S (\Phi_1 - \Phi_2)\nabla(\Phi_1 - \Phi_2)\cdot d\mathbf{S} = \int_V \nabla\cdot\left[(\Phi_1 - \Phi_2)\nabla(\Phi_1 - \Phi_2)\right]dV \tag{3.43}$$

in which $S$ is the totality of boundary surfaces of the charge-free regions and $V$ is the
entire charge-free volume.
But the integrand in the surface integral of (3.43) is the product of $\Phi_1 - \Phi_2$ and the
normal derivative of $\Phi_1 - \Phi_2$. One or the other of these factors is zero on any surface
element $dS$ because both $\Phi_1$ and $\Phi_2$ are assumed to satisfy the boundary conditions. Hence

$$0 = \int_V \left[(\Phi_1 - \Phi_2)\nabla^2(\Phi_1 - \Phi_2) + |\nabla(\Phi_1 - \Phi_2)|^2\right]dV$$

in which use has been made of the vector identity for the divergence of the product of a scalar and a vector, $\nabla\cdot(f\mathbf{A}) = f\,\nabla\cdot\mathbf{A} + \mathbf{A}\cdot\nabla f$. Because $\Phi_1 - \Phi_2$ satisfies
Laplace's equation everywhere in $V$ this reduces to

$$\int_V |\nabla(\Phi_1 - \Phi_2)|^2\,dV = 0 \tag{3.44}$$

Since the integrand of (3.44) can nowhere be negative, it follows that

$$\nabla(\Phi_1 - \Phi_2) \equiv 0$$

Thus $\Phi_1$ and $\Phi_2$ can differ from each other at most by an additive constant. This
constant will have no influence on potential differences between points in $V$ and
disappears in taking the gradient. Thus, $\Phi_1$ and $\Phi_2$ yield the same electric field distribution and the solution is unique.
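The uniqueness result has a simple numerical counterpart. The sketch below is an illustration under stated assumptions, not a method given in the text: it replaces Laplace's equation by the standard five-point finite-difference average on a 13-by-13 square grid (a hypothetical discretization), fixes Dirichlet values on the boundary (one side at unit potential, the others grounded), and relaxes two very different starting guesses.

```python
def make_grid(n, interior_guess, top_potential=1.0):
    """Square grid with grounded sides, one side at `top_potential`, and a uniform interior guess."""
    grid = [[interior_guess] * n for _ in range(n)]
    for k in range(n):
        grid[0][k] = 0.0
        grid[n - 1][k] = top_potential
        grid[k][0] = 0.0
        grid[k][n - 1] = 0.0
    return grid


def relax(grid, sweeps=500):
    """Gauss-Seidel sweeps: each interior value is replaced by the mean of its four neighbors."""
    n = len(grid)
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                     + grid[i][j - 1] + grid[i][j + 1])
    return grid


n = 13
solution_a = relax(make_grid(n, interior_guess=5.0))
solution_b = relax(make_grid(n, interior_guess=-3.0))

# Largest disagreement between the two relaxed solutions.
max_diff = max(abs(solution_a[i][j] - solution_b[i][j])
               for i in range(n) for j in range(n))
center = solution_a[n // 2][n // 2]
```

Both starting guesses should settle to the same interior values, so `max_diff` is expected to be negligible. By superposition of the four rotated problems, the value at the center of the square should be one quarter of the potential applied to the single energized side.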

3.11 SOLUTIONS TO LAPLACE'S EQUATION IN RECTANGULAR COORDINATES

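When the electrostatic potential $\Phi$ is expressed as a function of the three Cartesian
coordinates $x$, $y$, $z$, Laplace's equation takes the form†

$$\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} = 0 \tag{3.45}$$

Using the method of separation of variables, one can assume a product solution of the
form

$$\Phi(x,y,z) = f_1(x)\,f_2(y)\,f_3(z) \tag{3.46}$$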
Since $\Phi$ is a real function, it is convenient to assume that the functions $f_i$ are real also.
Upon substituting (3.46) into (3.45) one obtains

$$\frac{1}{f_3}\frac{d^2f_3}{dz^2} = -\frac{1}{f_1}\frac{d^2f_1}{dx^2} - \frac{1}{f_2}\frac{d^2f_2}{dy^2} \tag{3.47}$$

Since the right side of (3.47) is at most a function of $x$ and $y$, whereas the left side
can be a function only of $z$, it follows that both sides must be equal to the same constant.
For convenience this constant, which may have any real value, will be designated by
$k_z^2$. Then

$$\frac{d^2f_3}{dz^2} - k_z^2 f_3 = 0 \tag{3.48}$$

$$-\frac{1}{f_1}\frac{d^2f_1}{dx^2} = \frac{1}{f_2}\frac{d^2f_2}{dy^2} + k_z^2 \tag{3.49}$$

The left side of (3.49) is at most a function of $x$ whereas the right side can be a
function only of $y$. Consequently, both sides must equal a constant, say $k_x^2$, so that

$$\frac{d^2f_1}{dx^2} + k_x^2 f_1 = 0 \tag{3.50}$$

$$\frac{d^2f_2}{dy^2} + k_y^2 f_2 = 0 \tag{3.51}$$

in which

$$k_x^2 + k_y^2 = k_z^2 \tag{3.52}$$

† Cf. Mathematical Supplement, Sec. V.16.

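If no one of the three constants $k_x$, $k_y$, $k_z$ is zero, appropriate solutions of (3.48),
(3.50), and (3.51) give

$$\Phi(x,y,z) = \{e^{jk_xx}\}\{e^{jk_yy}\}\{e^{k_zz}\} \tag{3.53}$$

in which the brace notation is intended to signify

$$\{e^{jk_xx}\} = a\,e^{jk_xx} + b\,e^{-jk_xx}$$

with $a$ and $b$ constants; etc.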


If any one of the separation constants is zero, for example, if $k_x = 0$, then

$$f_1(x) = cx + d \tag{3.54}$$

and the appropriate factor in (3.53) is replaced by this linear solution.


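Since Laplace's equation is linear, any sum of solutions of the type (3.53) is also a
valid representation for $\Phi(x,y,z)$. A particularly useful combination, applicable when
the potential is repetitive in intervals $L_x$ and $L_y$ in the X and Y directions is

$$\Phi(x,y,z) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left[a_{mn}e^{k_{mn}z} + b_{mn}e^{-k_{mn}z}\right]\exp\left[j2\pi\left(\frac{mx}{L_x} + \frac{ny}{L_y}\right)\right], \qquad k_{mn} = 2\pi\left[\left(\frac{m}{L_x}\right)^2 + \left(\frac{n}{L_y}\right)^2\right]^{1/2} \tag{3.55}$$

The complex constant coefficients $a_{mn}$ and $b_{mn}$ may be determined from the boundary
conditions by the usual Fourier techniques. This formulation can be extended to nonrepetitive geometries by replacing (3.55) with a Fourier integral.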
EXAMPLE 3.17

Consider again the case of two parallel conducting plates, as first treated in Example 3.16,
but now assume that neither plate is heated, although they still differ in potential by $V_b$
volts, the plate at $x = l$ being at the higher potential. This is now a parallel plate capacitance
problem, and with no electron emission occurring, Laplace's equation applies for the region
between the plates. Assuming transverse dimensions large compared to $l$, $\Phi$ can be taken
independent of $y$ and $z$ and Equation (3.54) is an appropriate solution. Inserting the boundary conditions that $\Phi(0) = 0$ and $\Phi(l) = V_b$ gives

$$\Phi(x) = V_b\frac{x}{l} \tag{3.56}$$

and thus the potential increases linearly from one plate to the other. This should be contrasted to the heated cathode case of Example 3.16 in which the space charge caused the
potential to increase as the four-thirds power of distance. The two potential distributions
are shown in the figure.

[Figure: $\Phi(x)$ versus $x$ for the two cases, with and without space charge.]

From (3.56) one can deduce that the electric field is

$$\mathbf{E} = -\nabla\Phi = -\mathbf{1}_x\frac{d\Phi}{dx} = -\mathbf{1}_x\frac{V_b}{l} \tag{3.57}$$

Therefore the electric field is uniform between the plates and

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} = -\mathbf{1}_x\frac{\epsilon_0V_b}{l} \tag{3.58}$$

The plate at $x = l$ has a uniform surface charge density given by

$$\sigma_0 = D_0 = \frac{\epsilon_0V_b}{l} \tag{3.59}$$

there being an equal and opposite distribution on the plate at $x = 0$. The subscript on $\sigma$ is
a reminder that these plates are in a vacuum; later this configuration will be reconsidered in
the presence of a dielectric.
EXAMPLE 3.18

As an illustration of the applicability of harmonic solutions, consider an array of thin conducting strips lying in the z = 0 plane. Each strip is a/2 units wide in the X direction and
infinitely long in the Y direction. Alternate strips are at potentials of V and - V volts,
and insulated from each other by negligibly thin spacings b. This geometry is suggested by
the figure. It is desired to find the potential distribution in the upper half space z > 0.
First of all it is evident by applying (3.55) that one should select $n = 0$ to be compatible
with no variations of potential in the Y direction. Also, $a_{m0}$ should be set equal to zero for
all $m$ since the associated exponential term in $z$ diverges as $z \to \infty$. With these simplifications,
(3.55) reduces to

$$\Phi(x,z) = \sum_{m=-\infty}^{\infty} b_m\exp\left(j2\pi\frac{mx}{a}\right)\exp\left(-2\pi\frac{|m|z}{a}\right)$$

and this solution is subject to the boundary condition that the potential is a square wave in
$x$ when $z = 0$. Thus

$$\Phi(x,0) = \sum_{m=-\infty}^{\infty} b_m\exp\left(j2\pi\frac{mx}{a}\right)$$

must agree with the potential plot indicated by the graph.

[Graph: $\Phi(x,0)$ versus $x$, a square wave alternating between $V$ and $-V$; the strips lie in the $z = 0$ plane and extend along Y.]

This is a well-known problem
in Fourier series¹⁵ and the coefficients are given by
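$$b_m = \frac{1}{a}\int_{-a/2}^{a/2}\Phi(x,0)\exp\left(-j\frac{2\pi mx}{a}\right)dx = \frac{V}{j\pi m}(1 - \cos m\pi)$$

for $m \neq 0$ ($b_0 = 0$). Therefore

$$\begin{aligned}
\Phi(x,z) &= \sum_{m=1}^{\infty} 2jb_m\sin\frac{2\pi mx}{a}\exp\left(-\frac{2\pi mz}{a}\right)\\
&= \frac{2V}{\pi}\sum_{m=1}^{\infty}\frac{1 - \cos m\pi}{m}\sin\frac{2\pi mx}{a}\exp\left(-\frac{2\pi mz}{a}\right)\\
&= \frac{4V}{\pi}\left[\sin\frac{2\pi x}{a}\exp\left(-\frac{2\pi z}{a}\right) + \frac{1}{3}\sin\frac{6\pi x}{a}\exp\left(-\frac{6\pi z}{a}\right) + \frac{1}{5}\sin\frac{10\pi x}{a}\exp\left(-\frac{10\pi z}{a}\right) + \cdots\right]
\end{aligned}$$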

This is a series in which the higher harmonic terms decay very rapidly in the + Z direction.
One does not need to be very far above the plane Z = 0 in order to find a potential distribution which is almost a pure sinusoid in the X direction.
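The rapid decay of the higher harmonics in Example 3.18 is easy to see numerically. The following sketch sums the odd-harmonic series for the strip array; the function name, the number of retained terms, and the sample values V = 1 and a = 1 are assumptions made for illustration.

```python
import math


def strip_potential(x, z, v, a, n_terms=200):
    """Partial sum of the odd-harmonic series for the alternating strip array.

    Sums (4V/pi) * sin(2*pi*m*x/a) * exp(-2*pi*m*z/a) / m over the first
    `n_terms` odd values of m.
    """
    total = 0.0
    for m in range(1, 2 * n_terms, 2):
        total += math.sin(2.0 * math.pi * m * x / a) * math.exp(-2.0 * math.pi * m * z / a) / m
    return 4.0 * v / math.pi * total


v, a = 1.0, 1.0

# On the plane z = 0, at the middle of a +V strip, the series should return roughly V.
on_strip = strip_potential(a / 4.0, 0.0, v, a)

# Half a period above the plane, compare the full sum with its first harmonic alone.
full = strip_potential(a / 4.0, a / 2.0, v, a)
fundamental = 4.0 * v / math.pi * math.exp(-math.pi)
ratio = full / fundamental
```

Under these assumptions `on_strip` approaches V slowly, as expected for a square-wave Fourier series, while at a height of only half a period `ratio` differs from unity by less than a tenth of a percent.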

3.12 SOLUTIONS TO LAPLACE'S EQUATION IN CYLINDRICAL COORDINATES

For problems involving boundaries which are coordinate surfaces in a cylindrical geometry, it is advantageous to express Laplace's equation in terms of the cylindrical variables $(r,\phi,z)$. For this case Equation (V.86) of the Mathematical Supplement becomes

$$\frac{\partial^2\Phi}{\partial r^2} + \frac{1}{r}\frac{\partial\Phi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\Phi}{\partial\phi^2} + \frac{\partial^2\Phi}{\partial z^2} = 0 \tag{3.60}$$

15 See, e.g., I. S. Sokolnikoff and R. M. Redheffer, Mathematics of Physics and Modern Engineering,
pp. 180-181, McGraw-Hill Book Company, New York, 1958.


Once again if the method of separation of variables is used, a product of three real functions can be assumed in the form

$$\Phi(r,\phi,z) = f_1(r)\,f_2(\phi)\,f_3(z) \tag{3.61}$$

which leads to the three ordinary differential equations

$$\frac{d^2f_3}{dz^2} - k^2f_3 = 0 \tag{3.62}$$

$$\frac{d^2f_2}{d\phi^2} + \nu^2f_2 = 0 \tag{3.63}$$

$$\frac{d^2f_1}{dr^2} + \frac{1}{r}\frac{df_1}{dr} + \left(k^2 - \frac{\nu^2}{r^2}\right)f_1 = 0 \tag{3.64}$$

in which the separation constants $k^2$ and $\nu^2$ are arbitrary real numbers. Some of this
arbitrariness is removed upon writing the solution for (3.63) as

$$f_2(\phi) = \{e^{j\nu\phi}\}$$

and thus recognizing that $\nu$ must be an integer $n$ if the range of $\phi$ is unrestricted and the
potential is to be single-valued. Imposing this condition, one may write

$$f_2(\phi) = \{e^{jn\phi}\} \tag{3.65}$$

Ordinarily, for the special case $n = 0$, the linear solution

$$f_2(\phi) = c\phi + d \tag{3.66}$$

would be indicated. However, the requirement that $\Phi$ be single-valued imposes the constraint that the constant $c$ in (3.66) be zero. The remaining constant $d$ can be accommodated in (3.65) by permitting that solution to apply also for the case $n = 0$. Thus (3.65)
will be used for $n = 0, 1, 2, \ldots$.
The solution of (3.62) proceeds in similar fashion, giving

$$f_3(z) = \{e^{kz}\} \qquad k \neq 0 \tag{3.67}$$

$$f_3(z) = c'z + d' \qquad k = 0 \tag{3.68}$$

However, no integer restriction exists on the allowable values of $k$; indeed, since $k^2$ can
be any real constant, $k$ can be any pure real number or any pure imaginary number.
If both $k$ and $n$ are zero, (3.64) has the simple solution

$$f_1(r) = a\ln r + b \qquad n = k = 0 \tag{3.69}$$

whereas if only $k$ equals zero

$$f_1(r) = \{r^n\} = ar^n + br^{-n} \qquad k = 0 \tag{3.70}$$

which can be verified by substitution.
For $k \neq 0$ it is best to proceed by introducing the substitution variable $v = kr$,
which converts (3.64) to

$$\frac{d^2f_1}{dv^2} + \frac{1}{v}\frac{df_1}{dv} + \left(1 - \frac{n^2}{v^2}\right)f_1 = 0 \tag{3.71}$$

This can be recognized as Bessel's differential equation.


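It will be assumed that the reader is familiar with the details of the method used for
solving (3.71) and only the principal results will be stated here.¹⁶ The assumption of a
power series representation for $f_1(v)$ leads to the conclusion that (3.71) has two independent solutions given by

$$J_n(kr) = \sum_{m=0}^{\infty}\frac{(-1)^m(kr/2)^{n+2m}}{m!\,(n+m)!} \tag{3.72}$$

$$\begin{aligned}
Y_n(kr) = {} & \frac{2}{\pi}\left(\gamma + \ln\frac{kr}{2}\right)J_n(kr) - \frac{1}{\pi}\sum_{m=0}^{n-1}\frac{(n-m-1)!}{m!}\left(\frac{2}{kr}\right)^{n-2m}\\
& - \frac{1}{\pi}\sum_{m=0}^{\infty}\frac{(-1)^m(kr/2)^{n+2m}}{m!\,(n+m)!}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{m} + 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n+m}\right)
\end{aligned} \tag{3.73}$$

where $\gamma = 0.5772\ldots$ is Euler's constant.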


These solutions are known as Bessel functions of the first and second kind. They
appear formidable but are rarely needed in these forms in practice, since both functions
are tabulated.¹⁷ For $k$ real, the two functions are oscillatory, this feature being shown in
Figure 3.12 for the first few values of $n$.
For large arguments ($kr \gg 1, n$),

$$J_n(kr) \to \sqrt{\frac{2}{\pi kr}}\cos\left(kr - \frac{n\pi}{2} - \frac{\pi}{4}\right) \tag{3.74}$$

$$Y_n(kr) \to \sqrt{\frac{2}{\pi kr}}\sin\left(kr - \frac{n\pi}{2} - \frac{\pi}{4}\right) \tag{3.75}$$
For this reason it is convenient to introduce particular linear combinations of $J_n$ and
$Y_n$ through the definitions

$$H_n^{(1)}(kr) = J_n(kr) + jY_n(kr) \tag{3.76}$$

$$H_n^{(2)}(kr) = J_n(kr) - jY_n(kr) \tag{3.77}$$

These combinations also form a fundamental set of solutions to Bessel's equation and
are known as Bessel functions of the third kind, or more commonly as Hankel functions.
For large arguments they have the asymptotic forms $(2/\pi kr)^{1/2}\exp[\pm j(kr - n\pi/2 - \pi/4)]$ and thus, when combined with the harmonic time function $e^{j\omega t}$, represent incoming
and outgoing cylindrical waves. Their principal utility will arise later in the discussion
of time-varying fields.
16 Many excellent discussions of Bessel's equation and the properties of its solutions exist in the literature. For example, the interested reader is referred to J. Irving and N. Mullineux, Mathematics in
Physics and Engineering, pp. 75-82, 128-174, Academic Press, New York, 1959.
17 See, e.g., the tables appended to G. N. Watson, A Treatise on the Theory of Bessel Functions, 2d ed.,
pp. 666-752, Cambridge Press, London, 1952.

FIGURE 3.12 Lower order Bessel functions of the first and second kind.

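For small arguments ($kr \ll 1$),

$$J_n(kr) \to \frac{1}{n!}\left(\frac{kr}{2}\right)^n \tag{3.78}$$

$$Y_n(kr) \to \begin{cases}\dfrac{2}{\pi}\left(0.5772 + \ln\dfrac{kr}{2}\right) & n = 0\\[2ex] -\dfrac{(n-1)!}{\pi}\left(\dfrac{2}{kr}\right)^n & n \neq 0\end{cases} \tag{3.79}$$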

Regardless of the value of the index n, the Bessel functions of the second kind are seen
to possess a singularity at r = 0 and thus they must be excluded from the solutions to
physical problems in regions containing the Z axis, unless a line source exists at r = O.


If $k$ is imaginary, the series solutions (3.72) and (3.73) are still valid, but for convenience a pair of modified Bessel functions is employed. Letting $k = j\ell$, with $\ell$ real,
since the series (3.72) is even or odd according to the nature of $n$, it is possible to define a
real function by the relation

$$I_n(\ell r) = j^{-n}J_n(j\ell r) \tag{3.80}$$

If one attempts similarly to modify $Y_n$, a study of the series (3.73) reveals that a comparably simple definition will yield a complex function, a result which is unwieldy
when one recalls that an effort is being made to represent a real potential function $\Phi$.
However, this difficulty can be avoided by introducing the function

$$K_n(\ell r) = \frac{\pi}{2}\,j^{n+1}\left[J_n(j\ell r) + jY_n(j\ell r)\right] \tag{3.81}$$

Inspection of the complex sum of the two series (3.72) and (3.73) for imaginary argument reveals that $K_n(\ell r)$ is a real function; further, $I_n$ and $K_n$ comprise an independent
set of solutions to Bessel's equation and their asymptotic forms for large argument are
the simple expressions

$$I_n(\ell r) \to \frac{e^{\ell r}}{\sqrt{2\pi\ell r}} \tag{3.82}$$

$$K_n(\ell r) \to \sqrt{\frac{\pi}{2\ell r}}\;e^{-\ell r} \tag{3.83}$$

where $\ell r \gg 1, n$.
The functions $I_n$ and $K_n$ do not oscillate and only $K_n$ is well-behaved at infinity. The
graphical forms for the first few modified Bessel functions are shown in Figure 3.13.
For low values of $n$ they are widely tabulated.¹⁸
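In summary of these results, $\Phi(r,\phi,z)$ may be composed of a suitable product of three
factors from among:

$$\begin{aligned}
f_1(r) &= a\ln r + b && n = k = 0\\
&= \{r^n\} && k = 0\\
&= a_nJ_n(kr) + b_nY_n(kr) && k \text{ real, nonzero}\\
&= a_nI_n(\ell r) + b_nK_n(\ell r) && k = j\ell \text{ imaginary, nonzero}\\
f_2(\phi) &= \{e^{jn\phi}\} && n = 0, 1, 2, \ldots\\
f_3(z) &= c'z + d' && k = 0\\
&= \{e^{kz}\} && k \text{ real, nonzero}\\
&= \{e^{j\ell z}\} && k = j\ell \text{ imaginary, nonzero}
\end{aligned} \tag{3.84}$$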

One can observe from (3.84) that oscillatory functions in r combine with nonoscillatory
functions in z and vice versa.
Since Laplace's equation is linear, products formed from the factors in (3.84) can be
summed to give more general solutions. Of particular value are the summations of
products which comprise complete orthogonal sets. For example, if the potential is repetitive in the Z direction with a characteristic length $L_z$, then an appropriate formulation is

$$\Phi(r,\phi,z) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left[a_{mn}I_n\left(\frac{2\pi|m|r}{L_z}\right) + b_{mn}K_n\left(\frac{2\pi|m|r}{L_z}\right)\right]e^{jn\phi}\exp\left(j\frac{2\pi mz}{L_z}\right) \tag{3.85}$$

This can be recognized as a double Fourier series in $\phi$ and $z$. The complex coefficients
$a_{mn}$ and $b_{mn}$ can be determined in the usual way through knowledge of the boundary
conditions. The terms for $n = 0$, $k = 0$, must be treated individually in that the linear
forms in (3.84) should be substituted where needed.

FIGURE 3.13 Lower order modified Bessel functions.

18 See, e.g., G. N. Watson, loc. cit.
An orthogonal set of functions can also be generated in the radial direction. The procedure can be illustrated with the function $J_1(kr)$ which is plotted in Figure 3.14a out to
its first root $\gamma_{11}$, in Figure 3.14b out to its second root $\gamma_{12}$, and in Figure 3.14c out
to its third root $\gamma_{13}$. If each of these curves is stretched (or compressed) to a common
length $r_0$ and multiplied by $\sqrt{r}$, the result is the family of curves shown in Figures
3.14d-f. These curves are plots of the functions $\sqrt{r}\,J_1(\gamma_{1m}r/r_0)$ in the interval
$0 \le r \le r_0$. The functions form part of an orthogonal set, and this procedure can be
followed for any value of $n$ since

$$\int_0^{r_0} r\,J_n\left(\gamma_{nm}\frac{r}{r_0}\right)J_n\left(\gamma_{np}\frac{r}{r_0}\right)dr = \frac{r_0^2}{2}\,J_{n+1}^2(\gamma_{nm})\,\delta_{mp} \tag{3.86}$$

for all integral values of $n$, and for any integral values of $m \ge 1$, $p \ge 1$. In (3.86) the
symbol $\delta_{mp}$ is the Kronecker delta and has the value unity if $m = p$, but is otherwise
zero. Equation (3.86) is known as the orthogonality relation for the Bessel functions of
the first kind. Its derivation can be found in Appendix C together with a collection of
the more useful recursion relations and differential and integral formulas which connect
different Bessel functions.

FIGURE 3.14 Construction of orthogonal Bessel functions.

When this type of expansion is appropriate, the potential function can be expressed in
the form

$$\Phi(r,\phi,z) = \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}J_n\left(\gamma_{nm}\frac{r}{r_0}\right)\left[a_{mn}\exp\left(\gamma_{nm}\frac{z}{r_0}\right) + b_{mn}\exp\left(-\gamma_{nm}\frac{z}{r_0}\right)\right]e^{jn\phi} \tag{3.87}$$

The complex coefficients $a_{mn}$ and $b_{mn}$ can be evaluated for the given boundary conditions
with the aid of (3.86) and the usual Fourier formula associated with the series in $\phi$.
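The orthogonality of Bessel functions scaled to a common interval, with weight $r$, can be verified by direct numerical integration. The sketch below is an illustration only: the function names, the root search by sign scan and bisection, the Simpson's-rule quadrature, and the choice $n = 1$, $r_0 = 2$ are all assumptions made here. The diagonal value is compared with the standard normalization $(r_0^2/2)J_{n+1}^2(\gamma)$, where $\gamma$ is a root of $J_n$.

```python
import math


def bessel_j(n, x, terms=40):
    """J_n(x) from its power series, adequate for the moderate arguments used here."""
    term = (x / 2.0) ** n / math.factorial(n)
    total = term
    for m in range(1, terms):
        term *= -(x / 2.0) ** 2 / (m * (n + m))
        total += term
    return total


def bessel_roots(n, count, start=0.5, step=0.1):
    """First `count` positive roots of J_n, found by a sign scan followed by bisection."""
    roots, lo = [], start
    while len(roots) < count:
        hi = lo + step
        if bessel_j(n, lo) * bessel_j(n, hi) < 0.0:
            a, b = lo, hi
            for _ in range(60):
                mid = 0.5 * (a + b)
                if bessel_j(n, a) * bessel_j(n, mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
        lo = hi
    return roots


def overlap(n, gamma_m, gamma_p, r0, panels=400):
    """Simpson's-rule estimate of the weighted overlap integral over 0 <= r <= r0."""
    h = r0 / panels
    total = 0.0
    for i in range(panels + 1):
        r = i * h
        weight = 1.0 if i in (0, panels) else (4.0 if i % 2 else 2.0)
        total += weight * r * bessel_j(n, gamma_m * r / r0) * bessel_j(n, gamma_p * r / r0)
    return total * h / 3.0


n, r0 = 1, 2.0
gammas = bessel_roots(n, 3)                      # first three roots of J_1
off_diagonal = overlap(n, gammas[0], gammas[1], r0)
diagonal = overlap(n, gammas[1], gammas[1], r0)
expected_diagonal = 0.5 * r0 ** 2 * bessel_j(n + 1, gammas[1]) ** 2
```

Under these assumptions `off_diagonal` should vanish to within the quadrature error, while `diagonal` should match `expected_diagonal`.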
EXAMPLE 3.19

Consider the case of two concentric conducting cylinders with the figure of Example 3.9
once again applicable. With the inner cylinder maintained at a potential $V_b$ volts above the
outer cylinder, let it be required to find the potential distribution in the space between
cylinders.
By symmetry, the answer should be independent of $\phi$ and $z$, so the proper selection from
(3.84) is

$$\Phi(r) = a\ln r + b$$

Use of the boundary conditions $\Phi(b) = 0$, $\Phi(a) = V_b$ gives

$$\Phi(r) = V_b\frac{\ln(r/b)}{\ln(a/b)}$$

From this

$$\mathbf{D}_0 = -\epsilon_0\nabla\Phi = \mathbf{1}_r\frac{\epsilon_0V_b}{r\ln(b/a)}$$

so that

$$\sigma = D_0(a) = \frac{\epsilon_0V_b/a}{\ln(b/a)}$$

and the charge per unit length on the inner cylinder is

$$2\pi a\sigma = \frac{2\pi\epsilon_0V_b}{\ln(b/a)}$$

all of which is harmonious with the results of Example 3.9.

EXAMPLE 3.20

As indicated by the figure, a grounded conducting cylinder is immersed in what had been a
uniform electric field $\mathbf{E}_0 = \mathbf{1}_xE_0$. The cylinder axis coincides with the Z axis and its radius
is $r_0$. Find the potential distribution external to the cylinder.

[Figure: grounded cylinder of radius $r_0$ on the Z axis in the applied field $\mathbf{E}_0$, with field point $P(r,\phi)$.]

Since the problem has no $z$ dependence, $k = 0$ and the proper selection from (3.84) is

$$\Phi(r,\phi) = \sideset{}{'}\sum_{n=-\infty}^{\infty}(a_nr^n + b_nr^{-n})(c_ne^{jn\phi} + d_ne^{-jn\phi}) + a\ln r + b$$

in which $\sum'$ signifies that the term $n = 0$ is excluded from the summation. Symmetry conditions indicate that the solution should be even in $\phi$ and therefore the above expression
reduces to

$$\Phi(r,\phi) = \sum_{n=1}^{\infty}(a_nr^n + b_nr^{-n})\cos n\phi + a\ln r + b$$

The boundary condition at large $r$ is such that there should still be a uniform field $E_0$, since
the effect of the cylinder is local. Therefore the potential at large $r$ should be

$$\Phi = -E_0x = -E_0r\cos\phi$$

For this reason $a = b = 0$ and $a_n = 0$, $n \neq 1$, with $a_1 = -E_0$. Thus

$$\Phi(r,\phi) = -E_0r\cos\phi + \sum_{n=1}^{\infty}b_nr^{-n}\cos n\phi$$

The constants $b_n$ are determined by the boundary condition that $\Phi(r_0,\phi) = 0$. This gives

$$0 = -E_0r_0\cos\phi + \sum_{n=1}^{\infty}b_nr_0^{-n}\cos n\phi$$

and therefore $b_n = 0$, $n \neq 1$, whereas $b_1 = E_0r_0^2$. The final form of the solution is

$$\Phi(r,\phi) = \left(-E_0r + \frac{E_0r_0^2}{r}\right)\cos\phi$$
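The closed-form result of Example 3.20 can be spot-checked against its three requirements: zero potential on the cylinder, a uniform field far away, and Laplace's equation in between. The sketch below is illustrative; the function names, the central-difference step, and the sample values $E_0 = 100$ and $r_0 = 0.5$ are assumptions.

```python
import math


def cylinder_potential(r, phi, e0, r0):
    """Potential outside a grounded cylinder of radius r0 in an applied field e0 along x."""
    return (-e0 * r + e0 * r0 * r0 / r) * math.cos(phi)


def laplacian_polar(f, r, phi, h=1e-3):
    """Central-difference estimate of the two-dimensional Laplacian in polar coordinates."""
    d2_r = (f(r + h, phi) - 2.0 * f(r, phi) + f(r - h, phi)) / h ** 2
    d_r = (f(r + h, phi) - f(r - h, phi)) / (2.0 * h)
    d2_phi = (f(r, phi + h) - 2.0 * f(r, phi) + f(r, phi - h)) / h ** 2
    return d2_r + d_r / r + d2_phi / r ** 2


e0, r0 = 100.0, 0.5

# Boundary condition on the cylinder surface, sampled at several angles.
on_surface = max(abs(cylinder_potential(r0, 0.3 * k, e0, r0)) for k in range(20))

# Laplace's equation at an arbitrary exterior point.
residual = laplacian_polar(lambda r, p: cylinder_potential(r, p, e0, r0), 2.0, 0.7)

# Far from the cylinder the potential should approach -E0 * r * cos(phi).
r_far, phi_far = 1000.0 * r0, 0.4
far_ratio = cylinder_potential(r_far, phi_far, e0, r0) / (-e0 * r_far * math.cos(phi_far))
```

With these assumptions `on_surface` and `residual` should be negligible and `far_ratio` should be very close to unity, which reflects the statement that the effect of the cylinder is local.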

EXAMPLE 3.21

Consider a grounded hollow cylindrical can of radius $r_0$ and height $h$, as shown in the figure.
Its lid has been removed just sufficiently to be insulated from the can, and raised to a potential $V_0$. Find the potential distribution within the cylinder.

[Figure: cylindrical can of height $h$ with its axis along Z.]

This problem can be simplified from the outset by recognizing that the geometry is
$\phi$ symmetric and therefore that $n = 0$. Also, since the potential must vanish at $r = r_0$ and
be finite at $r = 0$, only the $J_0$ functions will suffice. Thus the representation (3.87) is
appropriate in the form

$$\Phi(r,z) = \sum_{m=1}^{\infty}A_m\sinh\left(\gamma_{0m}\frac{z}{r_0}\right)J_0\left(\gamma_{0m}\frac{r}{r_0}\right)$$

in which the exponentials in $z$ have been combined to give the hyperbolic sine in accordance
with the requirement that $\Phi \equiv 0$ over the bottom of the can.
The coefficients $A_m$ can be evaluated with the aid of (3.86) and the boundary conditions
at $z = h$. Since

$$\Phi(r,h) = V_0 = \sum_{m=1}^{\infty}A_m\sinh\left(\gamma_{0m}\frac{h}{r_0}\right)J_0\left(\gamma_{0m}\frac{r}{r_0}\right)$$

it follows, upon multiplying both sides by $rJ_0(\gamma_{0p}r/r_0)$ and integrating, that

$$\int_0^{r_0}rV_0J_0\left(\gamma_{0p}\frac{r}{r_0}\right)dr = \sum_{m=1}^{\infty}A_m\sinh\left(\gamma_{0m}\frac{h}{r_0}\right)\int_0^{r_0}rJ_0\left(\gamma_{0m}\frac{r}{r_0}\right)J_0\left(\gamma_{0p}\frac{r}{r_0}\right)dr = A_p\sinh\left(\gamma_{0p}\frac{h}{r_0}\right)\frac{r_0^2}{2}J_1^2(\gamma_{0p})$$

Use of Equation (C.14) from Appendix C gives

$$\frac{d}{d(\gamma_{0p}r/r_0)}\left[\gamma_{0p}\frac{r}{r_0}J_1\left(\gamma_{0p}\frac{r}{r_0}\right)\right] = \gamma_{0p}\frac{r}{r_0}J_0\left(\gamma_{0p}\frac{r}{r_0}\right)$$

so that the above integral yields

$$A_p = \frac{2V_0}{\gamma_{0p}J_1(\gamma_{0p})\sinh(\gamma_{0p}h/r_0)}$$

and the potential is given by

$$\Phi(r,z) = 2V_0\sum_{m=1}^{\infty}\frac{J_0\left(\gamma_{0m}\dfrac{r}{r_0}\right)\sinh\left(\gamma_{0m}\dfrac{z}{r_0}\right)}{\gamma_{0m}J_1(\gamma_{0m})\sinh\left(\gamma_{0m}\dfrac{h}{r_0}\right)}$$

For specific values of $r_0$ and $h$ the table of roots in Appendix C may be used to determine the
relative richness of harmonics in the above series.
EXAMPLE 3.22

In a variation of the preceding problem, the cylinder is insulated from the bottom lid as
well as the top lid. If the two lids are grounded and the cylinder is kept at a potential $V_0$,
the internal field can be determined by utilizing the representation (3.85). This choice is
dictated by the need to have the potential vanish at $z = 0, h$, a condition which cannot be
satisfied by hyperbolic functions of $z$. Once again there is $\phi$ symmetry, making $n = 0$, and
the functions $K_0$ cannot be used because the potential is finite at $r = 0$. Therefore

$$\Phi(r,z) = \sum_{m=1}^{\infty}A_m\sin\frac{m\pi z}{h}\,I_0\left(\frac{m\pi r}{h}\right)$$

Since $\Phi(r_0,z) = V_0$,

$$\int_{-h}^{h}V_0\sin\frac{p\pi z}{h}\,dz = \sum_{m=1}^{\infty}A_mI_0\left(\frac{m\pi r_0}{h}\right)\int_{-h}^{h}\sin\frac{m\pi z}{h}\sin\frac{p\pi z}{h}\,dz$$

in which the mirror potential $-V_0$ has been assumed for convenience in the range $-h < z < 0$. Thus

$$A_p = \frac{2V_0(1 - \cos p\pi)}{p\pi\,I_0(p\pi r_0/h)}$$

and the potential is given by the expression

$$\Phi(r,z) = \frac{4V_0}{\pi}\sideset{}{'}\sum_{m=1}^{\infty}\frac{I_0(m\pi r/h)}{m\,I_0(m\pi r_0/h)}\sin\left(\frac{m\pi z}{h}\right)$$

in which $\sum'$ denotes odd values of $m$ only.
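The series of Example 3.22 converges quickly at interior points because the ratio of modified Bessel functions decays roughly exponentially with the harmonic number. The sketch below evaluates it directly; the function names, the power-series evaluation of $I_0$, the cutoff at $m = 61$, and the sample dimensions ($V_0 = 10$, $r_0 = 1$, $h = 2$) are assumptions made for illustration.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I_0(x) from its power series (all terms positive)."""
    term, total, m = 1.0, 1.0, 0
    while term > 1e-17 * total:
        m += 1
        term *= (x / 2.0) ** 2 / (m * m)
        total += term
    return total


def can_potential(r, z, v0, r0, h, max_m=61):
    """Partial sum over odd m of the series for the can with grounded lids."""
    total = 0.0
    for m in range(1, max_m + 1, 2):
        k = m * math.pi / h
        total += bessel_i0(k * r) / (m * bessel_i0(k * r0)) * math.sin(k * z)
    return 4.0 * v0 / math.pi * total


v0, r0, h = 10.0, 1.0, 2.0
mid_height = can_potential(0.5, 1.0, v0, r0, h)
lower = can_potential(0.5, 0.5, v0, r0, h)
upper = can_potential(0.5, 1.5, v0, r0, h)
on_lid = can_potential(0.5, 0.0, v0, r0, h)
```

Under these assumptions the potential vanishes on the lid, is symmetric about the mid-height plane, and lies strictly between zero and $V_0$ in the interior, as the boundary conditions require.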

3.13 SOLUTIONS TO LAPLACE'S EQUATION IN SPHERICAL COORDINATES

When the potential problem of interest involves boundaries which are coordinate
surfaces in a spherical geometry, Laplace's equation is best written in terms of the
spherical variables $(r,\theta,\phi)$; Equation (V.86) of the Mathematical Supplement then
becomes

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\Phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\Phi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\Phi}{\partial\phi^2} = 0 \tag{3.88}$$

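A product solution in the form

$$\Phi(r,\theta,\phi) = f_1(r)\,f_2(\theta)\,f_3(\phi) \tag{3.89}$$

can be assumed in which the functions $f_i$ are real. This leads to the separation

$$\frac{\sin^2\theta}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{\sin\theta}{f_2}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + \frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = 0 \tag{3.90}$$

Since the last term is a function only of $\phi$ whereas the first two terms are not, it follows
that

$$\frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = -m^2 \tag{3.91}$$

in which $m$ must be an integer if the potential is to be single-valued. Thus

$$f_3(\phi) = \{e^{jm\phi}\} \qquad m = 0, 1, 2, \ldots \tag{3.92}$$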

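Replacement of $(1/f_3)\,d^2f_3/d\phi^2$ by $-m^2$ in (3.90) and division by $\sin^2\theta$ gives

$$\frac{1}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{1}{f_2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) - \frac{m^2}{\sin^2\theta} = 0 \tag{3.93}$$

Because only the first term of (3.93) is a function of $r$, this term must equal a real
constant which shall be designated $n(n + 1)$, for reasons which will emerge shortly.
Thus

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) - n(n+1)f_1 = 0 \tag{3.94}$$

$$\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + \left[n(n+1)\sin\theta - \frac{m^2}{\sin\theta}\right]f_2 = 0 \tag{3.95}$$

Equation (3.94) is readily solved and gives

$$f_1(r) = ar^n + br^{-(n+1)} \tag{3.96}$$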

Solution of (3.95) is facilitated by making the substitution $u = \cos\theta$ which leads to

$$(1 - u^2)\frac{d^2f_2}{du^2} - 2u\frac{df_2}{du} + \left[n(n+1) - \frac{m^2}{1 - u^2}\right]f_2 = 0 \tag{3.97}$$

The functions which satisfy this equation are the associated Legendre functions and
the two independent solutions are normally designated $P_n^m(u)$ and $Q_n^m(u)$. The latter
have singularities at the poles $\theta = 0, \pi$ and must be excluded if the polar axis is part
of the region of interest. In all that follows this will be assumed to be the case. Appendix
D includes a discussion of the manner in which (3.97) is solved, together with a development of the major properties of the functions $P_n^m(u)$, and only the principal results
will be stated here.

The assumption of a power series solution of (3.97) for the case m = 0 leads to the conclusion that, if n is an integer, $f_2(u)$ can be expressed as a polynomial which is well-behaved in the entire region $-1 \le u \le 1$ and given by

$$P_n(u) = \frac{1}{2^n n!}\frac{d^n}{du^n}(u^2-1)^n \tag{3.98}$$

The first few of these polynomials are

$$P_0(u) = 1 \qquad P_1(u) = u \qquad P_2(u) = \tfrac{1}{2}(3u^2-1) \qquad P_3(u) = \tfrac{1}{2}(5u^3-3u)$$

and these functions are plotted in Figure 3.15.

FIGURE 3.15 Legendre polynomials: $P_n(u)$ plotted against $u$ for $-1 \le u \le 1$.

If all positive integral values of n are included, the Legendre polynomials generated by (3.98) constitute a complete orthogonal set in the interval [-1, 1], and for this reason nonintegral values of n normally are not considered.
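The Rodrigues formula (3.98) and the orthogonality claim can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `legendre_rodrigues` and `inner_product` are illustrative and do not come from the text.

```python
import math

import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.legendre import Legendre, leggauss


def legendre_rodrigues(n):
    """Build P_n(u) from (3.98): (1 / (2^n n!)) d^n/du^n (u^2 - 1)^n."""
    base = Polynomial([-1.0, 0.0, 1.0]) ** n  # (u^2 - 1)^n
    return base.deriv(n) / (2.0 ** n * math.factorial(n))


def inner_product(p, q, order=32):
    """Integrate p(u) q(u) over [-1, 1] by Gauss-Legendre quadrature."""
    nodes, weights = leggauss(order)
    return float(np.sum(weights * p(nodes) * q(nodes)))


if __name__ == "__main__":
    u = np.linspace(-1.0, 1.0, 5)
    for n in range(4):
        # Compare against NumPy's built-in Legendre basis polynomial.
        print(n, np.allclose(legendre_rodrigues(n)(u), Legendre.basis(n)(u)))
    p2, p3 = legendre_rodrigues(2), legendre_rodrigues(3)
    print(inner_product(p2, p3), inner_product(p3, p3))
```

The second printed pair should be close to 0 and to 2/(2n + 1) = 2/7, which is the m = 0 case of the normalization integral.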

For $m \neq 0$, the associated Legendre function $P_n^m(u)$ satisfies (3.97) and is given by

$$P_n^m(u) = (1-u^2)^{m/2}\,\frac{d^mP_n(u)}{du^m} \tag{3.99}$$
Since $P_n(u)$ is an nth-order polynomial, m cannot exceed n in value. A variety of recurrence formulas connecting associated Legendre functions and/or their derivatives for different values of the indices can be found in Appendix D together with a list of the specific functions generated from (3.99) for low values of m and n.
The associated Legendre functions are also orthogonal in [-1, 1], the normalization integral being

$$\int_{-1}^{1} P_n^m(u)\,P_l^m(u)\,du = \frac{2}{2n+1}\,\frac{(n+m)!}{(n-m)!}\,\delta_{ln} \tag{3.100}$$

When (3.89) is expanded in terms of the solutions which have been found for the constituent functions, one obtains

$$\Phi(r,\theta,\phi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}\left[a_nr^n + b_nr^{-(n+1)}\right]P_n^m(\cos\theta)\left[c_m\cos m\phi + d_m\sin m\phi\right] \tag{3.101}$$

The combination $P_n^m(\cos\theta)[c_m\cos m\phi + d_m\sin m\phi]$ is called a spherical harmonic. Being orthogonal in both $\cos\theta$ and $\phi$, it is suitable for the expansion of arbitrary functions of $\theta$ and $\phi$ in spherical coordinates in exactly the same way that a double Fourier series is used in two dimensions in rectangular coordinates.
EXAMPLE 3.23

Imagine a uniform electric field of strength $E_0$ into which an insulated conducting sphere of radius $r_0$ is placed. For convenience take the polar axis parallel to the original field and accept the resulting potential of the sphere as the zero reference. What is the field distribution in the region exterior to the sphere?
Equation (3.101) can be used with the simplification that m = 0 because of $\phi$ symmetry. Then

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}\left[a_nr^n + b_nr^{-(n+1)}\right]P_n(\cos\theta)$$

$$\Phi(r_0,\theta) = 0 = \sum_{n=0}^{\infty}\left[a_nr_0^n + b_nr_0^{-(n+1)}\right]P_n(\cos\theta)$$

Multiplying both sides of the second of these equations by $P_l(\cos\theta)$ and integrating gives

$$0 = \sum_{n=0}^{\infty}\left[a_nr_0^n + b_nr_0^{-(n+1)}\right]\int_{-1}^{1}P_n(\cos\theta)\,P_l(\cos\theta)\,d(\cos\theta) = \left[a_lr_0^l + b_lr_0^{-(l+1)}\right]\frac{2}{2l+1}$$

this second result arising from the normalization integral (D.24) in Appendix D. Thus

$$b_l = -a_lr_0^{2l+1}$$

Since the electric field must be $\mathbf{1}_zE_0$ at points remote from the sphere,

$$\lim_{r\to\infty}\Phi(r,\theta) = \lim_{r\to\infty}\sum_{n=0}^{\infty}a_nr^nP_n(\cos\theta) = \lim_{r\to\infty}(-E_0r\cos\theta)$$

Therefore only $a_1$ is nonzero and $a_1 = -E_0$. The potential distribution is then

$$\Phi(r,\theta) = -E_0\left[1 - \left(\frac{r_0}{r}\right)^3\right]r\cos\theta$$

The field intensity is given by

$$\mathbf{E} = -\nabla\Phi = \mathbf{1}_rE_0\left[1 + 2\left(\frac{r_0}{r}\right)^3\right]\cos\theta - \mathbf{1}_\theta E_0\left[1 - \left(\frac{r_0}{r}\right)^3\right]\sin\theta$$

The surface density of charge induced on the sphere is

$$\sigma(\theta) = \epsilon_0E_r(r_0) = 3\epsilon_0E_0\cos\theta$$

Flux lines and equipotentials are shown in the figure. It can be observed that the sphere
exerts little influence at distances larger than one radius from its surface.
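The result of this example lends itself to a quick numerical check. The sketch below, assuming NumPy and SI units, evaluates the potential $\Phi = -E_0[1 - (r_0/r)^3]\,r\cos\theta$ and estimates the radial field by a central difference; the function names and the step size `h` are illustrative choices, not part of the text.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def potential(r, theta, E0, r0):
    """Potential outside a conducting sphere placed in a uniform field E0."""
    return -E0 * (1.0 - (r0 / r) ** 3) * r * np.cos(theta)


def radial_field(r, theta, E0, r0, h=1e-6):
    """E_r = -dPhi/dr, estimated with a central difference of step h."""
    dphi = potential(r + h, theta, E0, r0) - potential(r - h, theta, E0, r0)
    return -dphi / (2.0 * h)


def surface_charge(theta, E0, r0):
    """Induced surface charge density eps0 * E_r evaluated at r = r0."""
    return EPS0 * radial_field(r0, theta, E0, r0)


if __name__ == "__main__":
    E0, r0, theta = 100.0, 1.0, 0.3
    print(potential(r0, theta, E0, r0))  # zero on the sphere
    print(surface_charge(theta, E0, r0), 3 * EPS0 * E0 * np.cos(theta))
    # Perturbation of the uniform-field potential falls off as (r0/r)^3:
    for r in (2.0, 4.0):
        print(r, potential(r, theta, E0, r0) / (-E0 * r * np.cos(theta)))
```

At one radius from the surface (r = 2 r_0) the potential already differs from the undisturbed value by only one part in eight, which is the sense in which the sphere "exerts little influence" there.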


EXAMPLE 3.24

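The potential of a point charge at a distance $r_1$ from the origin of coordinates has $\phi$ symmetry if the polar axis is chosen to pass through the charge. With reference to the figure, the potential at the point $P(r,\theta,\phi)$ is

$$\Phi = \frac{q_1}{4\pi\epsilon_0\ell}$$

in which, in accordance with the cosine law,

$$\frac{1}{\ell} = \frac{1}{r_1}\left[\left(\frac{r}{r_1}\right)^2 + 1 - 2\,\frac{r}{r_1}\cos\theta\right]^{-1/2} = \frac{1}{r}\left[\left(\frac{r_1}{r}\right)^2 + 1 - 2\,\frac{r_1}{r}\cos\theta\right]^{-1/2}$$

But since

$$(1 - 2ut + t^2)^{-1/2} = \sum_{n=0}^{\infty}t^nP_n(u)$$

is a generating function for Legendre polynomials if $|t| < 1$ (cf. Appendix D), it follows that

$$\frac{1}{\ell} = \begin{cases}\dfrac{1}{r_1}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^nP_n(\cos\theta) & r < r_1\\[2ex] \dfrac{1}{r}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^nP_n(\cos\theta) & r > r_1\end{cases}$$

The potential of the point charge can therefore be expressed as

$$\Phi(r,\theta) = \begin{cases}\dfrac{q_1}{4\pi\epsilon_0r_1}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^nP_n(\cos\theta) & r < r_1\\[2ex] \dfrac{q_1}{4\pi\epsilon_0r}\displaystyle\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^nP_n(\cos\theta) & r > r_1\end{cases}$$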


This formulation can be used to find the potential distribution due to a point charge outside a grounded conducting sphere. If the polar axis is taken through the point charge $q_1$, which is a distance $r_1$ from the center of the sphere, it follows that the charge induced on the sphere has a $\phi$-symmetric distribution. Using (3.101) with m = 0 to represent the part of the potential due to the charges on the sphere, and using the above series to represent the potential due to $q_1$, one can write

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}b_nr^{-(n+1)}P_n(\cos\theta) + \frac{q_1}{4\pi\epsilon_0r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^nP_n(\cos\theta) \qquad a \le r < r_1$$

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}b_nr^{-(n+1)}P_n(\cos\theta) + \frac{q_1}{4\pi\epsilon_0r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^nP_n(\cos\theta) \qquad r > r_1$$

in which a is the radius of the sphere. The radial functions have been chosen appropriately to satisfy the finiteness condition at infinity.
Since $\Phi(a,\theta) = 0$, the orthogonality of the Legendre polynomials leads to

$$b_na^{-(n+1)} = -\frac{q_1}{4\pi\epsilon_0r_1}\left(\frac{a}{r_1}\right)^n$$

from which it follows that

$$\sum_{n=0}^{\infty}b_nr^{-(n+1)}P_n(\cos\theta) = \frac{q_2}{4\pi\epsilon_0r}\sum_{n=0}^{\infty}\left(\frac{r_2}{r}\right)^nP_n(\cos\theta)$$

with

$$q_2 = -\frac{a}{r_1}\,q_1 \qquad r_2 = \frac{a^2}{r_1}$$

Thus the grounded sphere is equivalent to a second point charge at an interior point on the polar axis. This result is in agreement with Example 3.15.
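Both the Legendre expansion of the inverse distance and the equivalent image charge can be confirmed numerically. The sketch below assumes NumPy and works in units where $4\pi\epsilon_0 = 1$; the function names and the truncation `terms=80` are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.legendre import legval


def inverse_distance_series(r, r1, theta, terms=80):
    """Truncated Legendre series for 1/l between (r, theta) and a point on the axis at r1."""
    inner, outer = min(r, r1), max(r, r1)
    coeffs = (inner / outer) ** np.arange(terms) / outer
    return float(legval(np.cos(theta), coeffs))  # sum of coeffs[n] * P_n(cos theta)


def inverse_distance_exact(r, r1, theta):
    """1/l from the law of cosines."""
    return 1.0 / np.sqrt(r * r + r1 * r1 - 2.0 * r * r1 * np.cos(theta))


def grounded_sphere_potential(r, theta, q1, r1, a):
    """4*pi*eps0 times the potential of q1 plus its image q2 = -q1*a/r1 at r2 = a^2/r1."""
    q2 = -q1 * a / r1
    r2 = a * a / r1
    return (q1 * inverse_distance_exact(r, r1, theta)
            + q2 * inverse_distance_exact(r, r2, theta))


if __name__ == "__main__":
    print(inverse_distance_series(1.0, 2.5, 0.7), inverse_distance_exact(1.0, 2.5, 0.7))
    print(grounded_sphere_potential(1.0, 0.7, 1.0, 2.5, 1.0))  # vanishes on r = a
```

The series converges geometrically in the ratio of the smaller to the larger radius, so the truncation needs more terms as r approaches $r_1$.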
EXAMPLE 3.25

A conducting spherical shell of radius a is divided into four equal sectors as shown in the figure. These sectors are insulated from each other by small gaps and are alternately at the potentials $V_0$ and $-V_0$. Find the potential distribution in the region external to the sphere.

The general expansion (3.101) may be used once again with $a_n = 0$ to satisfy conditions at infinity. Then since the potential must be an odd function of $\phi$,

$$\Phi(r,\theta,\phi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}d_{mn}\sin m\phi\;r^{-(n+1)}P_n^m(\cos\theta)$$

At r = a, the potential is independent of $\theta$ and is a square wave function in $\phi$, and thus

$$\int_{-1}^{1}\int_0^{2\pi}\Phi(a,\phi)\,P_l^p(u)\sin p\phi\;du\,d\phi = \int_{-1}^{1}\int_0^{2\pi}P_l^p(u)\sin p\phi\sum_{n=0}^{\infty}\sum_{m=0}^{n}d_{mn}a^{-(n+1)}\sin m\phi\,P_n^m(u)\;du\,d\phi = \pi a^{-(l+1)}\,\frac{2}{2l+1}\,\frac{(l+p)!}{(l-p)!}\,d_{pl}$$

by virtue of (D.30) and the Fourier normalization formula.
The left side of the above equation reduces to

$$V_0\int_{-1}^{1}P_l^p(u)\,du\left[4\int_0^{\pi/2}\sin p\phi\,d\phi\right] = \frac{8V_0}{p}\int_{-1}^{1}P_l^p(u)\,du$$

in which p is restricted to the values 2(2s + 1) with s = 0, 1, 2, . . . . Therefore

$$d_{pl} = \frac{4a^{l+1}V_0}{\pi}\,\frac{(2l+1)(l-p)!}{p\,(l+p)!}\,F_l^p$$

in which F] = f:tPf(u) duo The dominant term in this series is Fi = 457T' /8. Thus
<fJ(r,8,)

for r

3.14

~HVo (~r4 sin 2 P~(cos 8) ~ 5Vo(~r4 sin 2

(cos 8 - cos 38)

a.

GREEN'S FUNCTIONS

A technique for solving boundary-value problems which has the virtue of systematization arises from the use of Green's second integral theorem. With reference to Section V.21 of the Mathematical Supplement, if $\Phi(\xi,\eta,\zeta)$ and $\Psi(\xi,\eta,\zeta)$ are well-behaved functions in a volume V which is enclosed by a surface S, this theorem leads to the identity

$$\int_V\left(\Phi\nabla_S^2\Psi - \Psi\nabla_S^2\Phi\right)dV = \oint_S\left(\Phi\nabla_S\Psi - \Psi\nabla_S\Phi\right)\cdot d\mathbf{S} \tag{3.102}$$

in which

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial\xi} + \mathbf{1}_y\frac{\partial}{\partial\eta} + \mathbf{1}_z\frac{\partial}{\partial\zeta}$$

Let $\Phi$ be the potential function being sought and let $\Psi = G$ be the Green's function defined by

$$G = \frac{1}{4\pi\epsilon_0\ell} + \chi \tag{3.103}$$

with $\nabla_S^2\chi = 0$ throughout V and $\ell = [(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2]^{1/2}$. G can be interpreted as the potential due to a unit charge placed at (x,y,z) plus the potential due to a source system exterior to V. Because of the singularity in G, if (x,y,z) lies within V, a small sphere of radius $\epsilon$ can be erected around (x,y,z) and its volume excluded from V, or alternatively the singularity can be represented by the Dirac delta function. The latter approach gives

$$\nabla_S^2G = -\frac{\delta(\mathbf{r} - \mathbf{r}_S)}{\epsilon_0} \tag{3.104}$$

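in which $\mathbf{r}$ and $\mathbf{r}_S$ are drawn from the origin to the point (x,y,z) and to the point $(\xi,\eta,\zeta)$ respectively.
Since $\nabla_S^2\Phi = -\rho/\epsilon_0$ in V, (3.102) becomes

$$-\int_V\left[\Phi\,\frac{\delta(\mathbf{r} - \mathbf{r}_S)}{\epsilon_0} - G\,\frac{\rho}{\epsilon_0}\right]dV = \oint_S\left(\Phi\frac{\partial G}{\partial n} - G\frac{\partial\Phi}{\partial n}\right)dS$$

wherein n is a spatial variable in the direction of the outward-drawn normal. This yields the general result

$$\Phi(x,y,z) = \int_VG\rho\,dV + \epsilon_0\oint_S\left(G\frac{\partial\Phi}{\partial n} - \Phi\frac{\partial G}{\partial n}\right)dS \tag{3.105}$$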

if (x,y,z) is within V; otherwise the left side of (3.105) is zero. If the Green's function G is known, Equation (3.105) gives the solution for the wanted potential function $\Phi$ in terms of its sources within V and the values of $\Phi$ and its normal derivative on S.
Several classes of problems which can be solved by this formulation are worthy of mention:

1. If G is chosen so that $\chi = 0$, if S consists of a single surface which goes to infinity, and if $\Phi$ decreases with $\ell$ as $\ell \to \infty$ (as it will for any finite source distribution), then the surface integrals in (3.105) vanish and the familiar result (3.18) is obtained.
2. If $\Phi$ has no sources in V, so that $\nabla_S^2\Phi = 0$ in V, then (3.105) reduces to

$$\Phi(x,y,z) = \epsilon_0\oint_S\left(G\frac{\partial\Phi}{\partial n} - \Phi\frac{\partial G}{\partial n}\right)dS \tag{3.106}$$

However, it already has been noted in Section 3.10, in connection with the uniqueness proof for solutions to Laplace's equation, that if $\nabla_S^2\Phi = 0$ in V, knowledge of $\Phi$ or $\partial\Phi/\partial n$ on S is sufficient to determine $\Phi$ everywhere in V. Thus (3.106) as it stands is over-determined, and additional conditions may be imposed. The most commonly imposed conditions are (a) G = 0 on S (the Dirichlet problem) and (b) $\partial G/\partial n = 0$ on S (the Neumann problem).¹⁹
In the Dirichlet problem the Green's function may be considered to arise from a unit charge placed at (x,y,z) plus a system of "image" charges so positioned outside S as to cause G = 0 on S. Another interpretation is to imagine (for the purpose of determining G) that S is a grounded conducting surface which contains an induced surface charge distribution due to the presence of a unit charge at (x,y,z). For any Dirichlet problem

$$\Phi(x,y,z) = -\epsilon_0\oint_S\Phi\,\frac{\partial G}{\partial n}\,dS \tag{3.107}$$

in which $\epsilon_0\,\partial G/\partial n$ is the induced surface charge distribution.


Similarly, in the Neumann problem, the Green's function may be considered to be due to a unit charge placed at (x,y,z), plus a distribution of charges exterior to V such that $\partial G/\partial n = 0$ on S. For any Neumann problem

$$\Phi(x,y,z) = \epsilon_0\oint_SG\,\frac{\partial\Phi}{\partial n}\,dS \tag{3.108}$$

¹⁹ See J. D. Jackson, Classical Electrodynamics, p. 19, John Wiley and Sons, Inc., New York, 1962, for a less restrictive Neumann condition.


The advantage of both formulas (3.107) and (3.108) is that if G is once found for a particular geometry (i.e., a particular shape of surface S) an entire class of problems is formally solved.
3. Since Poisson's equation is linear, solutions may be superimposed so that, if $\Phi$ has sources in V, a generalized Dirichlet problem gives

$$\Phi(x,y,z) = \int_VG\rho\,dV - \epsilon_0\oint_S\Phi\,\frac{\partial G}{\partial n}\,dS \tag{3.109}$$

in which G = 0 on S. Similarly, a generalized Neumann problem gives

$$\Phi(x,y,z) = \int_VG\rho\,dV + \epsilon_0\oint_SG\,\frac{\partial\Phi}{\partial n}\,dS \tag{3.110}$$

with $\partial G/\partial n = 0$ on S.
4. Finally, mixed boundary conditions are possible, with G = 0 over S', a part of S, and $\partial G/\partial n = 0$ over S'', the other part of S. The appropriate expressions for $\Phi$ are then natural extensions of the results already given.
EXAMPLE 3.26

If an arbitrary potential distribution $\Phi(a,\vartheta,\varphi)$ is established over a spherical surface of radius a by means of a source system external to the sphere, the potential anywhere within the sphere can be determined with the aid of (3.107). On the basis of the results of Example 3.15, G will be due to a unit charge placed at the point $(r,\theta,\phi)$ and an image charge of strength $-a/r$ placed at the point $(a^2/r,\theta,\phi)$. Then

$$\Phi(r,\theta,\phi) = -\frac{1}{4\pi}\int_0^{\pi}\int_0^{2\pi}\Phi(a,\vartheta,\varphi)\,\frac{\partial}{\partial a}\left(\frac{1}{\ell_2} - \frac{a/r}{\ell_1}\right)a^2\sin\vartheta\;d\vartheta\,d\varphi$$

If $\ell_1$ and $\ell_2$ are expressed as functions of a and r through use of the law of cosines, the
integrand may be arranged in a form which is suitable for evaluation.
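The defining property of the Dirichlet Green's function used here, G = 0 on S, can be checked directly for the sphere. The sketch below assumes NumPy and Cartesian position vectors; `sphere_green` and its argument names are illustrative, and the unit charge is assumed not to sit at the center (where the image recedes to infinity).

```python
import numpy as np

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def sphere_green(field_pt, charge_pt, a):
    """Green's function for the interior of a sphere of radius a centered at the origin.

    A unit charge sits at charge_pt (radius r, 0 < r < a); its image of strength
    -a/r sits on the same ray at radius a^2/r.
    """
    s = np.asarray(field_pt, dtype=float)
    p = np.asarray(charge_pt, dtype=float)
    r = np.linalg.norm(p)
    image_pt = p * (a / r) ** 2
    dist_charge = np.linalg.norm(s - p)
    dist_image = np.linalg.norm(s - image_pt)
    return (1.0 / dist_charge - (a / r) / dist_image) / (4.0 * np.pi * EPS0)


if __name__ == "__main__":
    a = 2.0
    charge = [0.3, -0.4, 0.5]
    surface = a * np.array([np.sin(1.1) * np.cos(0.4), np.sin(1.1) * np.sin(0.4), np.cos(1.1)])
    free_space = 1.0 / (4.0 * np.pi * EPS0 * np.linalg.norm(surface - np.array(charge)))
    # Ratio of G on the surface to the free-space term alone: should be ~1e-16.
    print(sphere_green(surface, charge, a) / free_space)
```

With G vanishing on S, only the $\Phi\,\partial G/\partial n$ term of (3.106) survives, which is the content of (3.107).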

3.15 SOLUTIONS TO LAPLACE'S EQUATION IN TWO DIMENSIONS WITH THE USE OF CONFORMAL MAPPING

A large class of two-dimensional potential problems fits the condition that, with proper orientation of the coordinate axes, $\Phi$ is independent of z and Laplace's equation reduces to

$$\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} = 0 \tag{3.111}$$

When this is the case, a powerful method of attack utilizing the theory of functions of a complex variable may be brought to bear on the problem. Using the real variables x and y, one may define the complex variable $\mathfrak{z} = x + jy$, in which $j = \sqrt{-1}$. It is then convenient to associate any given value of $\mathfrak{z}$ with a point in the xy plane, as shown in Fig. 3.16a, and thus refer to this as a representation in the complex $\mathfrak{z}$ plane.

FIGURE 3.16 Functions in a complex plane: (a) $\mathfrak{z}$ plane; (b) $\mathfrak{w}$ plane.

The coordinates may also be expressed in polar form through the transformation equations

$$R = (x^2 + y^2)^{1/2} \qquad \phi = \tan^{-1}\left(\frac{y}{x}\right) \tag{3.112}$$

and then

$$\mathfrak{z} = Re^{j\phi} \tag{3.113}$$
Imagine that additionally there is a different complex variable defined by the relations

$$\mathfrak{w} = u + jv = \mathcal{R}e^{j\varphi} \tag{3.114}$$

$\mathfrak{w}$ can similarly be represented by points in a complex plane, either in terms of rectangular coordinates u, v, or polar coordinates $\mathcal{R}$, $\varphi$. (See Figure 3.16b.) If $\mathfrak{w}$ is some function of $\mathfrak{z}$, such that for each assigned value of $\mathfrak{z}$ there is some rule which specifies a corresponding value of $\mathfrak{w}$, this relationship can be symbolized by writing

$$\mathfrak{w} = f(\mathfrak{z}) \tag{3.115}$$

Then if $\mathfrak{z}$ is permitted to vary continuously, its representative point in the $\mathfrak{z}$ plane may trace out a curve C, as shown. In general, the corresponding values of $\mathfrak{w}$ will trace out a curve $\mathcal{C}$ in the $\mathfrak{w}$ plane.
A small change $\Delta\mathfrak{z}$ in the independent complex variable $\mathfrak{z}$ will occasion a corresponding change $\Delta\mathfrak{w}$ in the dependent complex variable $\mathfrak{w}$. The derivative of $f(\mathfrak{z})$ may then be defined in the usual way as the limit of $\Delta\mathfrak{w}/\Delta\mathfrak{z}$, namely:

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{\Delta\mathfrak{z}\to0}\frac{f(\mathfrak{z} + \Delta\mathfrak{z}) - f(\mathfrak{z})}{\Delta\mathfrak{z}} \tag{3.116}$$

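However, unlike the case of the derivative of a function of a real variable, there is no unique path in the $\mathfrak{z}$ plane along which $\mathfrak{z} + \Delta\mathfrak{z}$ must approach $\mathfrak{z}$, which is to say that $\Delta\mathfrak{z}$ may have any direction. If the derivative $d\mathfrak{w}/d\mathfrak{z}$ has a unique value at $\mathfrak{z}$, regardless of the path along which $\Delta\mathfrak{z}$ is chosen, the function $\mathfrak{w}(\mathfrak{z})$ is said to be analytic, or regular, at $\mathfrak{z}$.
If the value of the derivative is to be independent of the direction of $\Delta\mathfrak{z}$, the same result must be obtained if $\mathfrak{z}$ is changed solely in the x direction, or solely in the y direction. Since $\mathfrak{w} = u(x,y) + jv(x,y)$, in the former case

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{\Delta x\to0}\frac{u(x+\Delta x,y) + jv(x+\Delta x,y) - u(x,y) - jv(x,y)}{\Delta x} = \frac{\partial u}{\partial x} + j\frac{\partial v}{\partial x} \tag{3.117}$$

whereas in the latter case

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{j\Delta y\to0}\frac{u(x,y+\Delta y) + jv(x,y+\Delta y) - u(x,y) - jv(x,y)}{j\,\Delta y} = \frac{\partial v}{\partial y} - j\frac{\partial u}{\partial y} \tag{3.118}$$

If $\mathfrak{w}$ is to be analytic at $\mathfrak{z}$, it is therefore a necessary condition that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \tag{3.119}$$

$$\frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \tag{3.120}$$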

These are known as the Cauchy-Riemann equations. Since any path through $\mathfrak{z}$ can be expressed as the linear sum of the displacements $\Delta x$ and $j\,\Delta y$, it follows that the Cauchy-Riemann equations are also a sufficient condition that $\mathfrak{w}(\mathfrak{z})$ be analytic at $\mathfrak{z}$.

EXAMPLE 3.27

Consider the function $\mathfrak{w} = \mathfrak{z}^3$ which gives

$$u + jv = (x + jy)^3 = (x^3 - 3xy^2) + j(3x^2y - y^3)$$

so that

$$u = x^3 - 3xy^2 \qquad v = 3x^2y - y^3$$

$$\frac{\partial u}{\partial x} = 3x^2 - 3y^2 \qquad \frac{\partial v}{\partial x} = 6xy \qquad \frac{\partial u}{\partial y} = -6xy \qquad \frac{\partial v}{\partial y} = 3x^2 - 3y^2$$

The Cauchy-Riemann equations are seen to be satisfied for all values of x and y and thus $\mathfrak{w} = \mathfrak{z}^3$ is an analytic function for all points in the complex $\mathfrak{z}$ plane.
It is left as an exercise to show that $\mathfrak{z}^n$ is analytic ($n \ge 0$) and thus that the series $\sum_{n=0}^{\infty}a_n\mathfrak{z}^n$
is analytic and can therefore be used to represent a general class of regular functions.
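The Cauchy-Riemann test of this example can also be run numerically for any candidate function. The sketch below uses only the Python standard library and central differences; the function name `cauchy_riemann_residuals` and the step `h` are illustrative assumptions.

```python
def cauchy_riemann_residuals(f, x, y, h=1e-6):
    """Return (du/dx - dv/dy, dv/dx + du/dy) at the point x + jy.

    Both residuals vanish (to finite-difference accuracy) where f satisfies
    (3.119) and (3.120).
    """
    dfdx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2.0 * h)
    dfdy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2.0 * h)
    ux, vx = dfdx.real, dfdx.imag
    uy, vy = dfdy.real, dfdy.imag
    return ux - vy, vx + uy


if __name__ == "__main__":
    # w = z^3 is analytic everywhere, so both residuals are near zero.
    print(cauchy_riemann_residuals(lambda z: z ** 3, 1.2, -0.7))
    # The complex conjugate is not analytic: the first residual is near 2.
    print(cauchy_riemann_residuals(lambda z: z.conjugate(), 1.2, -0.7))
```

The conjugate function is included only as a contrast case: its real part is x and its imaginary part is -y, so (3.119) fails at every point.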

When the complex derivative defined by (3.116) exists, it may be found by the same rules which are used for functions of a real variable. As examples

$$\frac{d}{d\mathfrak{z}}(\cos\mathfrak{z}) = -\sin\mathfrak{z} \qquad \frac{d}{d\mathfrak{z}}(\ln\mathfrak{z}) = \mathfrak{z}^{-1}$$

Returning to the Cauchy-Riemann equations, if (3.119) is differentiated with respect to x and (3.120) with respect to y and the difference taken, one obtains

$$\frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} = 0 \tag{3.121}$$

Alternatively, if the differentiation is reversed, there results

$$\frac{\partial^2v}{\partial x^2} + \frac{\partial^2v}{\partial y^2} = 0 \tag{3.122}$$

Thus both the real and imaginary components of an analytic function of a complex variable satisfy Laplace's equation in two dimensions. It is for this reason that the theory of functions of a complex variable is so rich in applications to potential problems.
For a problem in which one of the two components of $\mathfrak{w}$ is chosen to represent the potential function $\Phi$, it is interesting to note that the other component is related to the electric flux. To see this relation, let u(x,y) be chosen as the potential function for a particular two-dimensional problem. Then since $\mathbf{D}_0 = -\epsilon_0\nabla u$,

$$D_{0x} = -\epsilon_0\frac{\partial u}{\partial x} \qquad D_{0y} = -\epsilon_0\frac{\partial u}{\partial y} \tag{3.123}$$

and the vector

$$\mathbf{1}_x\frac{\partial u}{\partial x} + \mathbf{1}_y\frac{\partial u}{\partial y} \tag{3.124}$$

is perpendicular to the contour u(x,y) = constant and tangent to the flux line which passes through the point (x,y). But

$$dv = \frac{\partial v}{\partial x}\,dx + \frac{\partial v}{\partial y}\,dy$$

which, with the aid of the Cauchy-Riemann equations, can be written

$$dv = -\frac{\partial u}{\partial y}\,dx + \frac{\partial u}{\partial x}\,dy \tag{3.125}$$

In moving along a contour v = constant, dv = 0 and the displacements dx and dy are in the ratio $(\partial u/\partial x) : (\partial u/\partial y)$, which is a vector parallel to (3.124). Thus the lines v = constant coincide with the electric flux lines.
Further, combination of (3.123) with (3.125) gives

$$dv = \frac{1}{\epsilon_0}\left(D_{0y}\,dx - D_{0x}\,dy\right) \tag{3.126}$$

If $d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy$ is the general displacement implied in (3.126), then a surface element

$$d\mathbf{S} = d\boldsymbol{\ell}\times\mathbf{1}_z = \mathbf{1}_x\,dy - \mathbf{1}_y\,dx$$

can be composed which is a rectangle, with a side $d\ell$ in the xy plane and a side of unit length in the z direction. The flux through this surface element is

$$d\psi = \mathbf{D}_0\cdot d\mathbf{S} = D_{0x}\,dy - D_{0y}\,dx \tag{3.127}$$

Comparison of (3.126) and (3.127) reveals that

$$d\psi = -\epsilon_0\,dv \tag{3.128}$$

and, except for an integration constant (which can be made zero by choosing the reference for flux at v = 0),

$$\psi = -\epsilon_0v \tag{3.129}$$

This equation can be interpreted as saying that the total electric flux between the contours $v = v_1$ and $v = v_2$, per unit length in the z direction, is $-\epsilon_0(v_2 - v_1)$. Not only are the contours of v the flux lines themselves, but the value of v can be made a measure of the total flux.
If the foregoing is represented in the complex $\mathfrak{w}$ plane of Figure 3.16b, a very simple picture emerges. The horizontal equispaced grid of lines v = constant trace the electric flux density, and the vertical equispaced grid of lines u = constant give the equipotentials. This is the electrostatic field picture for the region between parallel conducting plates which have been oppositely charged (cf. Example 3.17). However, the connection to real space requires the knowledge of u and v as functions of x and y.
This connection may be viewed in terms of the transformation function $\mathfrak{w} = f(\mathfrak{z})$ relating the contours C and $\mathcal{C}$, as described earlier and shown in Figures 3.16a and b. If $\mathcal{C}$ is a line u = constant, then C will be an equipotential contour in real space; if $\mathcal{C}$ is a line v = constant, C will be a flux line in real space. If the correct transformation equation $f(\mathfrak{z})$ is found, these equipotentials and flux lines will be the solution to the problem under consideration.
It has been seen already that the function $f(\mathfrak{z})$ must be analytic if u and v are to satisfy Laplace's equation. But then the derivative must be independent of path and $d\mathfrak{w}$ is uniquely related to $d\mathfrak{z}$ by the expression

$$d\mathfrak{w} = f'(\mathfrak{z})\,d\mathfrak{z} \tag{3.130}$$

If $f'(\mathfrak{z})$ is considered in polar form, so that $f'(\mathfrak{z}) = Ae^{j\alpha}$, then Equation (3.130) states that the magnitude of $d\mathfrak{w}$ is A times the magnitude of $d\mathfrak{z}$ and that the angle of $d\mathfrak{w}$ is the angle of $d\mathfrak{z}$ augmented by $\alpha$. Therefore the entire infinitesimal region in the neighborhood of a point $\mathfrak{w}$ is similar to the infinitesimal region in the vicinity of the corresponding point $\mathfrak{z}$, merely being magnified by the scale factor A and rotated through an angle $\alpha$. For this reason, if two curves C and C' intersect at a certain angle in the $\mathfrak{z}$ plane, the transformed curves $\mathcal{C}$ and $\mathcal{C}'$ will intersect at the same angle in the $\mathfrak{w}$ plane, since both have been rotated by the angle $\alpha$. Transformations possessing this property of angle preservation are said to be conformal and every analytic function is therefore a conformal transformation. As a particular example of this conformal property, the angle between a flux line and an equipotential contour in the $\mathfrak{z}$ plane is 90 deg; the angle between a u line and a v line in the $\mathfrak{w}$ plane is also 90 deg.
The problem of determining a two-dimensional electrostatic field distribution is thus seen to be equivalent to finding the correct analytic function $f(\mathfrak{z})$ which will transform the sought-for flux-potential map into a simple rectangular grid in the $\mathfrak{w}$ plane. As is so often the case in analysis, the inverse problem is simpler, namely, to study a known function $f(\mathfrak{z})$ and see what physical potential problem it represents.
EXAMPLE 3.28

If n is a positive real number (not necessarily an integer) the function

$$\mathfrak{w} = \mathfrak{z}^n = R^ne^{jn\phi}$$

is analytic and has real and imaginary components given by

$$u = R^n\cos n\phi \qquad v = R^n\sin n\phi$$

It is evident from the exponential form of this function that a semi-infinite straight line, drawn from the origin in the $\mathfrak{z}$ plane at an angle $\phi$, will transform into a semi-infinite straight line, drawn from the origin in the $\mathfrak{w}$ plane at an angle $n\phi$. Therefore this transformation is useful in determining the fields near conducting corners. As examples, consider the grounded interior and exterior corners shown in the figure. For the interior corner, the boundaries are the semi-infinite lines at $\phi = 0, \pi/2$. If v is chosen as the potential and n = 2, these lines transform into the $\mathfrak{w}$ plane as the two halves of the v = 0 line, thus satisfying the condition of zero potential. Then in the $\mathfrak{z}$ plane, the potential distribution is

$$v(R,\phi) = R^2\sin 2\phi$$

and the flux lines can be found by letting u = constant. Both fields are plotted in the figure.
Similarly for the exterior angle, since the boundaries are the semi-infinite lines at $\phi = 0, 3\pi/2$, if n = 2/3, once again these boundaries transform into the $\mathfrak{w}$ plane as the two halves of the v = 0 line. The potential distribution is therefore

$$v(R,\phi) = R^{2/3}\sin\tfrac{2}{3}\phi$$

and u = constant gives the flux lines. These fields are also shown in the figure.
This solution is applicable to corners of any angle.
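The corner potentials of this example can be checked against Laplace's equation and the grounded-boundary condition numerically. The sketch below uses only the standard library; the angle is measured from the positive x axis in the range 0 to 2π (an assumption chosen so that the exterior-corner field region 0 to 3π/2 is free of a branch jump), and the function names are illustrative.

```python
import math


def corner_potential(x, y, n):
    """v = R^n sin(n*phi), the imaginary part of z^n, with phi taken in [0, 2*pi)."""
    R = math.hypot(x, y)
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return R ** n * math.sin(n * phi)


def laplacian(func, x, y, h=1e-3):
    """Five-point finite-difference estimate of d2f/dx2 + d2f/dy2."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / (h * h)


if __name__ == "__main__":
    exterior = lambda x, y: corner_potential(x, y, 2.0 / 3.0)
    print(laplacian(exterior, -1.0, 0.5))  # near zero inside the field region
    print(exterior(2.0, 0.0), exterior(0.0, -2.0))  # zero on phi = 0 and phi = 3*pi/2
    print(corner_potential(1.0, 1.0, 2.0))  # interior corner: R^2 sin(2*phi) = 2xy
```

For other corner angles the same function applies with n equal to π divided by the angle occupied by the field region.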
EXAMPLE 3.29

Next consider the function

$$\mathfrak{w} = \cos^{-1}\mathfrak{z}$$

which gives

$$x + jy = \cos(u + jv) = \cos u\cosh v - j\sin u\sinh v$$

$$x = \cos u\cosh v \qquad y = -\sin u\sinh v$$

from which it follows that

$$\frac{x^2}{\cosh^2v} + \frac{y^2}{\sinh^2v} = 1$$

$$\frac{x^2}{\cos^2u} - \frac{y^2}{\sin^2u} = 1$$

The first of these equations, for constant v, gives a family of confocal ellipses with the foci at $x = \pm1$. The second equation, for constant u, yields a family of confocal hyperbolas. The two sets of contours are orthogonal, as shown in the first figure.
Inspection of this figure reveals a variety of problems which may be solved by this transformation. If v is chosen to represent the potential, one can solve for: (1) the field between two confocal elliptic cylinders, or between an elliptic cylinder and a flat strip stretched between its foci; (2) the field external to a charged elliptic cylinder, including the limiting case of a flat strip extending between the foci.
If u is chosen to represent the potential, one can solve for: the field between two confocal hyperbolic cylinders, including the cases that one or both of them is a plane ($u = \pi/2$ and/or $u = 0$ and/or $u = \pi$). The special cases include two perpendicular charged plates separated by a gap and two coplanar charged plates separated by a gap.
As an illustration, consider the case of two semi-infinite conducting planes, both lying in the xz plane, and separated by a gap of width d, as shown in the second figure. The two conductors are assumed to be equally but oppositely charged, with the right plate at potential zero and the left plate at potential $V_0$.
Choosing u to represent the potential function, one sees that if d = 2 and $V_0 = \pi$, the preceding development applies without modification and u(x,y) is given implicitly by

$$\frac{x^2}{\cos^2u} - \frac{y^2}{\sin^2u} = 1$$

However, it is a simple matter to scale this solution since if

$$\nabla^2u(x,y) = 0$$

then

$$\nabla^2\left[Ku(kx,ky)\right] = 0$$

with K and k arbitrary constants. In other words, both potential and distance can be scaled linearly without affecting a solution to Laplace's equation. Therefore if general values of d and $V_0$ are used in this problem, the solution for u(x,y) is contained in the equation

$$\frac{x^2}{\cos^2\left(\dfrac{\pi u}{V_0}\right)} - \frac{y^2}{\sin^2\left(\dfrac{\pi u}{V_0}\right)} = \left(\frac{d}{2}\right)^2$$

The flux lines, given by v = constant, scale similarly and both fields are sketched in the figure
in the upper half of space. The plots in the lower half would be similar.
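The confocal families and the scaled gap solution can be checked with complex arithmetic. The sketch below uses only the standard library; it relies on the principal branch of `cmath.acos`, whose real part lies between 0 and π, and the function names are illustrative assumptions.

```python
import cmath
import math


def z_from_w(u, v):
    """x + jy = cos(u + jv)."""
    return cmath.cos(complex(u, v))


def gap_potential(x, y, d, V0):
    """Potential between coplanar plates separated by a gap d (right plate at 0, left at V0).

    Uses w = arccos(2z/d) and scales u by V0/pi, following the example.
    """
    w = cmath.acos(complex(x, y) / (d / 2.0))
    return V0 * w.real / math.pi


if __name__ == "__main__":
    u, v = 0.8, 0.5
    z = z_from_w(u, v)
    x, y = z.real, z.imag
    print(x ** 2 / math.cosh(v) ** 2 + y ** 2 / math.sinh(v) ** 2)  # ellipse: close to 1
    print(x ** 2 / math.cos(u) ** 2 - y ** 2 / math.sin(u) ** 2)    # hyperbola: close to 1
    print(gap_potential(0.0, 0.0, 4.0, 10.0))   # midway in the gap: V0/2
    print(gap_potential(5.0, 0.0, 4.0, 10.0))   # on the right plate: 0
    print(gap_potential(-5.0, 0.0, 4.0, 10.0))  # on the left plate: V0
```

Evaluating `gap_potential` on a grid and contouring it would reproduce the equipotentials sketched in the figure; the imaginary part of the same `w` gives the flux function.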

3.16 THE SCHWARZ TRANSFORMATION

Example 3.28 dealt with the function $\mathfrak{w} = \mathfrak{z}^n$, which was seen to map an angular section of the $\mathfrak{z}$ plane into the upper half of the $\mathfrak{w}$ plane. This angular section was bounded by the semi-infinite lines $\phi = 0$ and $\phi = \pi/n$ (cf. Equation (3.112) for the meaning of $\phi$) and was thus controlled by n. Field distributions for grounded corners could be deduced readily by means of this transformation.
A generalization of this technique is known as the Schwarz transformation and permits the interior of a polygon in the $\mathfrak{z}$ plane to be mapped into the upper half of the $\mathfrak{w}$ plane. (The polygon need not be closed.) To see how this is accomplished, consider the inverse transformation

$$\frac{d\mathfrak{z}}{d\mathfrak{w}} = K(\mathfrak{w} - a_1)^{\alpha_1}(\mathfrak{w} - a_2)^{\alpha_2}\cdots(\mathfrak{w} - a_N)^{\alpha_N} \tag{3.131}$$

in which K is a constant, possibly complex, and the parameters $a_n$, $\alpha_n$ are real, but as yet otherwise undetermined. This transformation is analytic everywhere except at the points $\mathfrak{w} = a_n$. Therefore if $\mathfrak{w}$ is caused to trace out the segment of the real axis in the $\mathfrak{w}$ plane which lies between $a_{n-1}$ and $a_n$, Equation (3.131) indicates the phase of $d\mathfrak{z}$ is constant and thus the corresponding contour in the $\mathfrak{z}$ plane is also a straight-line segment.
The angle $\phi_n$ which this $\mathfrak{z}$ segment makes with the real axis is equal to the argument of $d\mathfrak{z}/d\mathfrak{w}$ evaluated at the nth segment. But this is given by

$$\arg\left(\frac{d\mathfrak{z}}{d\mathfrak{w}}\right) = \arg K + \alpha_1\arg(\mathfrak{w} - a_1) + \cdots + \alpha_N\arg(\mathfrak{w} - a_N)$$

If the points $a_n$ are graduated so that $a_{n-1} < a_n$ for all n, and if the nth segment is bounded by $a_{n-1}$ and $a_n$, then it follows that for any point on the real axis, $\arg(\mathfrak{w} - a_n)$ equals zero or $\pi$ according to whether $\mathfrak{w} > a_n$ or $\mathfrak{w} < a_n$. Thus

$$\phi_n = \arg K + \pi(\alpha_n + \alpha_{n+1} + \cdots + \alpha_N) \tag{3.132}$$

If (3.132) is subtracted from a similar expression for $\phi_{n+1}$ one obtains

$$\phi_{n+1} - \phi_n = -\pi\alpha_n$$

By referring to Figure 3.17, one sees that all points on the real axis segment $a_n - a_{n-1}$ in the $\mathfrak{w}$ plane map into a line segment of slope $\phi_n$ in the $\mathfrak{z}$ plane; similarly the segment $a_{n+1} - a_n$ maps into the segment of slope $\phi_{n+1}$. The interior angle $\beta_n$ is given by

$$\beta_n = \pi - (\phi_{n+1} - \phi_n) = \pi(1 + \alpha_n)$$

Hence Equation (3.131) may be written

$$\frac{d\mathfrak{z}}{d\mathfrak{w}} = K\prod_{n=1}^{N}(\mathfrak{w} - a_n)^{(\beta_n/\pi) - 1} \tag{3.134}$$

This is the Schwarz transformation for a polygon with internal angles $\beta_n$. The complex constant K controls the relative scale and orientation of the figure in the $\mathfrak{z}$ plane.

FIGURE 3.17 Mapping of a polygon: (a) $\mathfrak{z}$ plane; (b) $\mathfrak{w}$ plane.

This transformation is useful if (3.134) is easily integrable. The difficulties obviously increase with the number of sides, and the inverse nature of the transformation causes some inconvenience, in that it would be more desirable to use as independent variables the $\mathfrak{z}$ coordinates which describe the polygon under consideration. Despite these limitations, the approach is a powerful one and will yield the field distributions around a variety of grounded segmented shapes. If one or more of the vertices of the polygon are at infinity, different parts of the boundary need not be at the same potential.
EXAMPLE 3.30

The field between two semi-infinite conducting planes charged to a potential difference $V_0$ may be solved by using the Schwarz transformation. With reference to the figure, part (a) shows the actual geometry being considered, and part (b) shows a polygon† in the $\mathfrak{z}$ plane which will tend to the actual geometry as the points $b_2$ and $b_4$ tend to $-\infty$. This limiting polygon can be mapped into a $\mathfrak{w}$ plane by recognizing that the "interior" angles tend to the limits $\beta_2 = 0$, $\beta_1 = \beta_3 = \beta_4 = 2\pi$. Then by use of (3.134)

$$\mathfrak{z} = K\int(\mathfrak{w} - a_1)(\mathfrak{w} - a_2)^{-1}(\mathfrak{w} - a_3)(\mathfrak{w} - a_4)\,d\mathfrak{w} + K'$$

† This polygon is exterior to the contour shown so as to include the region in which the field is desired, and comprises the nondotted area of the plane.

[Figure: (a) actual geometry in the $\mathfrak{z}$ plane; (b) limiting polygon in the $\mathfrak{z}$ plane; (c) segments of the u axis in the $\mathfrak{w}$ plane; (d) equipotentials and flux lines in the $\mathfrak{w}$ plane.]

in which K' is a constant of integration. This may be written

$$\mathfrak{z} = Ka_4\int\frac{(\mathfrak{w} - a_1)(\mathfrak{w} - a_3)}{\mathfrak{w} - a_2}\left(\frac{\mathfrak{w}}{a_4} - 1\right)d\mathfrak{w} + K'$$

If now $a_4 \to \infty$ so that the entire real axis in the $\mathfrak{w}$ plane may be utilized, then K can be permitted to tend to zero in such a way that $-K''$ is the limiting value of $Ka_4$. Upon making the further choices $a_1 = -1$, $a_2 = 0$, $a_3 = +1$, the above expression becomes

$$\mathfrak{z} = K''\int\frac{\mathfrak{w}^2 - 1}{\mathfrak{w}}\,d\mathfrak{w} + K'$$

and the u axis is divided into four segments as shown in part (c) of the figure. Integration gives

$$\mathfrak{z} = K''\left[\frac{\mathfrak{w}^2}{2} - \ln\mathfrak{w}\right] + K'$$

The constants may be evaluated by using the information that $\mathfrak{z} = 0 + jl$ when $\mathfrak{w} = -1$ and that $\mathfrak{z} = 0 + j0$ when $\mathfrak{w} = +1$. This gives

$$\mathfrak{z} = \frac{l}{\pi}\left[\frac{1 - \mathfrak{w}^2}{2} + \ln\mathfrak{w}\right]$$

as the final form for the transformation.
It may be observed that the negative half of the u axis is an equipotential of value $V_0$ and that the positive half of the u axis is an equipotential of value zero. Therefore the potential distribution in the upper half of the $\mathfrak{w}$ plane is given by

$$\Phi = \frac{V_0}{\pi}\,\varphi$$

in which $\varphi$ is measured counterclockwise from the u axis. If $\mathfrak{w}$ is written in the form $\mathfrak{w} = \mathcal{R}e^{j\varphi}$, setting $\varphi$ = constant traces out an equipotential, whereas setting $\mathcal{R}$ = constant traces out a flux line. This is shown in part (d) of the first figure. The corresponding $\mathfrak{z}$ traces may be found from the transformation, and lead to the flux map shown in the second figure.
This solution may be used to deduce the fringing capacitance of a parallel plate condenser.
(Cf. Example 3.32.)
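The final transformation of this example is easy to explore numerically. The sketch below uses only the standard library; it evaluates $\mathfrak{z} = (l/\pi)[(1 - \mathfrak{w}^2)/2 + \ln\mathfrak{w}]$ with the principal branch of the logarithm, and the function names, the plate spacing `l`, and the sample points are illustrative assumptions.

```python
import cmath
import math


def plate_map(w, l):
    """Map a point w of the upper half plane to z for plates at y = 0 and y = l."""
    return (l / math.pi) * ((1.0 - w * w) / 2.0 + cmath.log(w))


def potential_w(w, V0):
    """Potential in the w plane: V0 * (angle from the positive u axis) / pi."""
    return V0 * cmath.phase(w) / math.pi


if __name__ == "__main__":
    l, V0 = 2.0, 5.0
    print(plate_map(complex(-1.0, 0.0), l))  # edge of the upper plate: 0 + jl
    print(plate_map(complex(1.0, 0.0), l))   # edge of the lower plate: 0 + j0
    # Points along one equipotential (constant angle) traced into the z plane:
    angle = math.pi / 2.0
    for radius in (0.25, 0.5, 1.0, 2.0):
        w = cmath.rect(radius, angle)
        print(plate_map(w, l), potential_w(w, V0))
```

Sweeping the radius at fixed angle traces an equipotential in the $\mathfrak{z}$ plane, and sweeping the angle at fixed radius traces a flux line, which is how a flux map like that of the second figure could be generated.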

3.17

CAPACITANCE

The concept of electric flux which originates on positive charge and terminates on
negative charge has already been introduced, and has been seen to be a useful pictorialization of Gauss' law. It also has been noted that when charge resides statically on a
conductor, it does so on the outer surface. The flux lines are then normal to the surface
on the vacuum side of the interface and are confined to the vacuum side, there being no
field within the conductor. This fact led to the conclusion that at each point on the
outer surface of the conductor $D_0 = \sigma$, with $D_0$ the flux density and $\sigma$ the surface charge
density.

FIGURE 3.18 Capacitance of two arbitrary conductors.

Let these ideas be applied to the case of two conductors in free space, as shown
in Fig. 3.18. Each conductor has an arbitrary shape and their relative position and


orientation is also arbitrary. If one conductor contains a net excess charge Q and the
other a net excess charge -Q, all the flux lines leaving one conductor terminate on the
other, as suggested by the figure. From a knowledge of $D_0(x,y,z)$ in the intervening
space, one could deduce $E(x,y,z)$ and then determine the potential difference between
the two conductors by computing the line integral of longitudinal E along an arbitrary
path extending from one conductor to the other.
Suppose that one were to double Q and -Q. This would double $D_0$ everywhere and
double E everywhere, thus doubling the voltage difference between the two conductors.
From this it follows that the ratio of the charge on one of the conductors to the voltage
between them is a constant. This constant is a useful index of the charge storage capability and is called capacitance.†
The conclusion just reached is valid for arbitrary conductors and the general definition of electrostatic capacitance is therefore

$$C_0 = \frac{Q}{V} \tag{3.135}$$

in which Q is the charge in coulombs and V is the potential difference in volts. The
subscript on $C_0$ is a reminder that the intervening space is a vacuum. Capacitance is
measured in units of coul/volt, more commonly called farads.
EXAMPLE 3.31

Other illustrative examples have included a variety of situations involving two conductors
containing equal and opposite charges. Making use of the results of Example 3.9, one may
conclude that the capacitance per unit length of two concentric cylinders is

$$C_0 = \frac{2\pi\epsilon_0}{\ln (b/a)}$$

From Example 3.14, the capacitance per unit length of two parallel tubular conductors of
radius a and spacing D is

$$C_0 = \frac{\pi\epsilon_0}{\cosh^{-1}(D/2a)}$$

From Example 3.17, the capacitance per unit area of two closely spaced parallel plates is

$$C_0 = \epsilon_0 \frac{1}{l}$$

These expressions for capacitance all contain the multiplicative factor $\epsilon_0$ and indicate
why the units for $\epsilon_0$ are ordinarily taken as farads/m.
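As a quick numerical illustration, the three expressions can be evaluated directly. The sketch assumes they read 2πε₀/ln(b/a), πε₀/cosh⁻¹(D/2a) and ε₀/l. The function names and the sample dimensions are hypothetical.

```python
import math

EPS0 = 8.854187817e-12  # permittivity of free space, farads/m

def coax_per_length(a: float, b: float) -> float:
    """Concentric cylinders, inner radius a and outer radius b (farads/m)."""
    return 2 * math.pi * EPS0 / math.log(b / a)

def two_wire_per_length(a: float, spacing: float) -> float:
    """Parallel tubular conductors of radius a and spacing D (farads/m)."""
    return math.pi * EPS0 / math.acosh(spacing / (2 * a))

def plates_per_area(l: float) -> float:
    """Closely spaced parallel plates with separation l (farads/m^2)."""
    return EPS0 / l

# Hypothetical dimensions, in meters.
print(coax_per_length(0.5e-3, 2.0e-3))
print(two_wire_per_length(0.5e-3, 10.0e-3))
print(plates_per_area(1.0e-3))
```

For D much larger than a, cosh⁻¹(D/2a) approaches ln(D/a), so the two-wire value tends to πε₀/ln(D/a).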
EXAMPLE 3.32

The expression given above for capacitance per unit area of two parallel plates leads to the
approximate result

$$C_0 = \epsilon_0 \frac{A}{l}$$

as the total capacitance, if the area of each plate is A. However, this result neglects fringing
and assumes that the $D_0$ field is uniform and does not extend beyond the plate edges, as
suggested by the figure. The true field is more like the flux map shown in Example 3.30,
which indicates an extension of the field beyond the plate edges and some additional charge
storage on the back sides of the plates. A more accurate expression for capacitance may be
deduced with the aid of the Schwarz transformation there derived.

† The two conductors are said to comprise a capacitor or condenser.
[Figure: the two plates in the z plane, one at $\Phi = V_0$ and the other at $\Phi = 0$.]

For two semi-infinite parallel plates, separated in distance an amount l, and in potential
an amount $V_0$, a transformation to the w plane led to the result that

$$\Phi = \frac{V_0}{\pi}\,\varphi$$

with $w = \mathcal{R}e^{j\varphi}$. The charge density in w space is therefore

$$\sigma = D_n = -\epsilon_0 \frac{\partial \Phi}{\partial n} = -\frac{\epsilon_0}{u}\frac{\partial \Phi}{\partial \varphi} = -\frac{\epsilon_0 V_0}{\pi u}$$

The total charge on the lower plate, per unit length perpendicular to the z plane, between
$x = 0$ and $x = \xi$, and accounting for both the inner and outer surfaces, is

$$Q(\xi) = -\frac{\epsilon_0 V_0}{\pi} \int_{u_1}^{u_2} \frac{du}{u}$$

in which $u_1$ and $u_2$ straddle the point $u = a_3$ in such a way that

$$\frac{l}{\pi}\left[\frac{1 - u_1^2}{2} + \ln u_1\right] = \xi + j0$$

$$\frac{l}{\pi}\left[\frac{1 - u_2^2}{2} + \ln u_2\right] = \xi + j0$$

these expressions arising from the Schwarz transformation.

If $|\xi| \gg l$, a good approximate solution may be obtained to these transcendental equations, for then $u_1 \ll 1$ and $u_2 \gg 1$. Hence $(1 - u_1^2)/2$ is negligible compared to $\ln u_1$ and
$\ln u_2$ may be neglected in comparison to $-u_2^2/2$. Thus

$$\ln u_1 \approx \frac{\pi \xi}{l} \qquad \ln u_2 \approx \frac{1}{2}\ln\left(-\frac{2\pi \xi}{l}\right)$$

$$Q(\xi) \approx -\frac{\epsilon_0 V_0}{\pi}\left[\frac{\pi |\xi|}{l} + \frac{1}{2}\ln\left(\frac{2\pi |\xi|}{l}\right)\right]$$

The first term in this expression for charge is the value which would occur if there were no
fringing and the second term is therefore the correction due to fringing.
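The quality of this approximation can be probed by solving the two transcendental equations numerically. The sketch below is only an illustration under stated assumptions: it takes the equations to be (1 − u²)/2 + ln u = πξ/l with ξ negative, works with s = ln u, and uses plain bisection. All function names are hypothetical.

```python
import math

def h(s: float) -> float:
    """(1 - u**2)/2 + ln u, written in terms of s = ln u."""
    return 0.5 * (1.0 - math.exp(2.0 * s)) + s

def solve(func, lo: float, hi: float, target: float, iters: int = 200) -> float:
    """Bisection for func(s) = target, assuming func is monotonic on [lo, hi]."""
    increasing = func(hi) > func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_ratio_exact(xi_over_l: float) -> float:
    """ln(u2/u1) from the two equations; Q = -(eps0*V0/pi) * ln(u2/u1)."""
    target = -math.pi * abs(xi_over_l)
    s1 = solve(h, -1.0e4, 0.0, target)  # ln u1, with u1 below 1
    s2 = solve(h, 0.0, 50.0, target)    # ln u2, with u2 above 1
    return s2 - s1

def log_ratio_approx(xi_over_l: float) -> float:
    """Bracketed factor of the approximate expression for Q."""
    x = math.pi * abs(xi_over_l)
    return x + 0.5 * math.log(2.0 * x)

for ratio in (10.0, 100.0):
    print(ratio, log_ratio_exact(ratio), log_ratio_approx(ratio))
```

The two values should draw together as |ξ|/l grows, which is the regime the approximation is intended for.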


This result may be applied to the practical problem of a parallel plate capacitor of area
$A = ab$ and spacing l. If it is assumed that $a \gg l$, $b \gg l$, the fringing charge is approximately

$$|Q_f| \approx \frac{\epsilon_0 V_0}{\pi}\left[b \ln\left(\frac{\pi a}{l}\right) + a \ln\left(\frac{\pi b}{l}\right)\right]$$

in which the effects at the four edges have been summed, with $|\xi|$ chosen as a/2 or b/2
as appropriate. The total capacitance is then given by

$$C_0 = \epsilon_0 \frac{A}{l}\left[1 + \frac{\ln(\pi a/l)}{\pi a/l} + \frac{\ln(\pi b/l)}{\pi b/l}\right]$$

As a specific illustration of this result, if the two plates are 2 in. X 4 in. and spaced 0.1 in.
apart, the capacitance is 10 percent higher due to fringing.
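The quoted figure can be reproduced with a few lines of arithmetic. The sketch assumes the bracketed correction factor is 1 + ln(πa/l)/(πa/l) + ln(πb/l)/(πb/l). The function name is illustrative.

```python
import math

def fringing_fraction(a: float, b: float, l: float) -> float:
    """Fractional increase in capacitance due to fringing (a, b, l in the same units)."""
    x = math.pi * a / l
    y = math.pi * b / l
    return math.log(x) / x + math.log(y) / y

# Plates 2 in. by 4 in., spaced 0.1 in. apart; only ratios of lengths enter.
print(f"{100 * fringing_fraction(2.0, 4.0, 0.1):.1f} percent")
```

Only the ratios a/l and b/l matter, so inches can be used directly.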

3.18 MULTICAPACITOR SYSTEMS

The results just obtained for a capacitor consisting of two conducting bodies oppositely
charged may be extended to the case of many conductors. Let otherwise empty space
be populated by N conducting bodies whose outer surfaces are designated as $S_n$. These
conducting bodies may have arbitrary size, shape, orientation, and position, and their
general distribution is suggested by Figure 3.19. Without loss of generality, one of the
conductors may be considered so vast as to be an "earth," that is, an infinite reservoir
of both types of charge, and at potential zero.

FIGURE 3.19 A system of conducting bodies.

As consequences of the uniqueness theorem for solutions to Laplace's equation,
two general propositions may be established for this system of conductors:
1. If the electrostatic potential of every conductor is specified, there is only one distribution of electric charges which will yield these potentials.
2. If the total charge on each conductor is specified, there is only one way in which
the charges can distribute themselves over the surfaces Sn in order to be in
equilibrium.


The first proposition is based on the fact that, if all the boundary potentials are prescribed, a unique electrostatic potential distribution, $\Phi(x,y,z)$, exists in the intervening
space. But then a unique $\partial\Phi/\partial n$ is established everywhere, including all points contiguous to $S_n$. Since the charge distribution $\sigma$ is proportional to $\partial\Phi/\partial n$ at such points, it
follows that $\sigma$ is a unique distribution over all the conducting boundaries $S_n$.
The second proposition may be established by a similar argument. Each conductor
is an equipotential surface once its total charge is in an equilibrium distribution. But
this leads to a unique $\Phi(x,y,z)$, a unique $\partial\Phi/\partial n$, and thus a unique $\sigma$, with $\int_{S_n} \sigma\, dS_n$
equalling the total charge on the nth body.
These two results may be summarized by saying that the distribution of electric
charge over the outer conducting surfaces $S_n$ is fully specified if one knows either (1) the
potential of each conductor or (2) the total charge on each conductor.
Now suppose that there are two equilibrium distributions of charge:

1. A distribution $\sigma$ giving total charges $Q_1, Q_2, \ldots$ on the different conductors,
with their potentials being $V_1, V_2, \ldots$
2. A distribution $\sigma'$ giving total charges $Q_1', Q_2', \ldots$ on the different conductors,
with their potentials being $V_1', V_2', \ldots$

Since Laplace's equation is linear, these distributions may be superposed with the
result that $\sigma + \sigma'$ will give a total charge $Q_n + Q_n'$ on $S_n$, its potential being $V_n + V_n'$.
Clearly this conclusion may be extended to the superposition of any number of charge
distributions.
As a particular application of the foregoing, if charges $Q_1, Q_2, \ldots, Q_N$ give rise
to potentials $V_1, V_2, \ldots, V_N$ on the N conducting bodies, then charges $kQ_1, kQ_2,
\ldots, kQ_N$ will cause potentials $kV_1, kV_2, \ldots, kV_N$, with k any real constant.
Suppose next that a positive unit charge is placed on the first conductor with all
other conductors left uncharged, and that this produces the potentials

$$p_{11},\ p_{21},\ \ldots,\ p_{N1}$$

on the N conductors respectively. Then if a charge $Q_1$ is placed on the first conductor,
with all others left uncharged, this will cause potentials

$$p_{11}Q_1,\ p_{21}Q_1,\ \ldots,\ p_{N1}Q_1$$

Similarly, if placing a positive unit charge on the nth conductor and leaving the
others uncharged produces potentials

$$p_{1n},\ p_{2n},\ \ldots,\ p_{Nn}$$

then placing $Q_n$ on $S_n$ and maintaining the other bodies uncharged will yield potentials

$$p_{1n}Q_n,\ p_{2n}Q_n,\ \ldots,\ p_{Nn}Q_n$$

If these distributions are superposed, the effect of charges $Q_1, Q_2, \ldots, Q_N$ on the
N bodies is to cause their potentials to become $V_1, V_2, \ldots, V_N$ where

$$\begin{aligned}
V_1 &= p_{11}Q_1 + p_{12}Q_2 + \cdots + p_{1N}Q_N \\
V_2 &= p_{21}Q_1 + p_{22}Q_2 + \cdots + p_{2N}Q_N \\
&\;\;\vdots \\
V_N &= p_{N1}Q_1 + p_{N2}Q_2 + \cdots + p_{NN}Q_N
\end{aligned} \tag{3.136}$$


These equations give the potentials in terms of the charges. The factors $p_{ij}$ are called
coefficients of potential; they are purely geometrical quantities which depend on the
size, shape, orientation, and position of the various conductors. Except for a few simple
geometries, the calculation of the coefficients is quite involved, but their values may
be deduced experimentally with little difficulty.
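A minimal sketch of how Equations (3.136) are used once the coefficients are known. The coefficient values below are an assumption made only for illustration: for small spheres that are far apart, p_ii is taken as roughly 1/(4πε₀a_i) and p_ij as roughly 1/(4πε₀d_ij). This is a first-order approximation, not a result from the text, and the function names, radii and positions are hypothetical.

```python
import math

EPS0 = 8.854187817e-12
K = 1.0 / (4.0 * math.pi * EPS0)

def potential_coefficients(radii, centers):
    """Approximate p_ij for small, widely separated spheres (illustrative assumption)."""
    n = len(radii)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                p[i][j] = K / radii[i]
            else:
                p[i][j] = K / math.dist(centers[i], centers[j])
    return p

def potentials(p, charges):
    """V_i = sum over j of p_ij * Q_j, the form of Equations (3.136)."""
    return [sum(p_ij * q for p_ij, q in zip(row, charges)) for row in p]

radii = [0.01, 0.02, 0.015]                                   # meters
centers = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.8, 0.0)]  # meters
p = potential_coefficients(radii, centers)
charges = [1.0e-9, -2.0e-9, 0.5e-9]                           # coulombs
print(potentials(p, charges))
```

Scaling every charge by k scales every potential by k, as the superposition argument requires.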
Some of the properties of Equations (3.136) may be brought out with the aid of
Green's reciprocation theorem. This theorem is concerned with a set of M point
charges $q_m$, placed at positions where the potentials due to the other M - 1 charges
are given by a set of numbers $\Phi_m$. These potentials may be written

$$\Phi_m = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{M}{}' \frac{q_n}{\xi_{mn}}$$

in which $\xi_{mn}$ is the distance from $q_n$ to $q_m$, and the prime on the summation sign indicates that the term for n = m is deleted from the sum.
Alternatively, if a different set of charges $q_n'$ is placed at the same points, the potentials will be

$$\Phi_m' = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{M}{}' \frac{q_n'}{\xi_{mn}}$$

Upon multiplying the first of these summations by $q_m'$, the second by $q_m$, and then
summing each resulting expression over the index m, one obtains

$$\sum_{m=1}^{M} \Phi_m q_m' = \frac{1}{4\pi\epsilon_0} \sum_{m=1}^{M} \sum_{n=1}^{M}{}' \frac{q_n q_m'}{\xi_{mn}} \tag{3.137}$$

$$\sum_{m=1}^{M} \Phi_m' q_m = \frac{1}{4\pi\epsilon_0} \sum_{m=1}^{M} \sum_{n=1}^{M}{}' \frac{q_n' q_m}{\xi_{mn}} \tag{3.138}$$

The right sides of (3.137) and (3.138) are equal to each other, since either can be
converted to the other by an interchange of the summation indices m and n. Thus

$$\sum_{m=1}^{M} \Phi_m q_m' = \sum_{m=1}^{M} \Phi_m' q_m \tag{3.139}$$

which is Green's reciprocation theorem. It may be extended to a set of N conducting
bodies whose potentials are $V_n$, and which possess total charges $Q_n$, by combining
all the points of a common potential in (3.139) into a single term. This gives
$$\sum_{n=1}^{N} Q_n' V_n = \sum_{n=1}^{N} Q_n V_n' \tag{3.140}$$

Consider now the special case that $Q_i = 1$, all other conductors being uncharged, so
that (3.136) gives

$$V_1 = p_{1i}, \quad V_2 = p_{2i}, \quad \ldots, \quad V_N = p_{Ni}$$

If instead, $Q_j' = 1$, all other conductors being uncharged, then

$$V_1' = p_{1j}, \quad V_2' = p_{2j}, \quad \ldots, \quad V_N' = p_{Nj}$$

and application of (3.140) yields

$$\sum_{n=1}^{N} Q_n' V_n = p_{ji} \qquad \sum_{n=1}^{N} Q_n V_n' = p_{ij}$$

so that

$$p_{ij} = p_{ji} \tag{3.141}$$

and the coefficients of potential are symmetrical, with only N(N + 1)/2 of them being
independent. Other properties of these coefficients may be deduced from their basic
definition. Since the $p_{ij}$ are the potentials at the surfaces $S_i$ due to a positive unit charge
on $S_j$, all the $p_{ij}$ must be positive. Further, the conductor possessing the charge must be
at the most positive potential and thus

$$p_{jj} \ge p_{ij} \tag{3.142}$$
EXAMPLE 3.33

Equation (3.141) may be interpreted in words by saying that the potential to which $S_i$ is
raised by placing unit charge on $S_j$, all other bodies being uncharged, is the same as the
potential to which $S_j$ is raised by placing unit charge on $S_i$, all other bodies being uncharged. This is, of course, still true if $S_i$ and $S_j$ are the only two bodies in the system.
As a special case of this result, let the first conductor be reduced to a point P and suppose
that the system contains additionally only a second conductor. Then the potential to which
the conductor is raised by placing a unit charge at P, with the conductor itself uncharged,
is the same as the potential which would be found at P if unit charge were placed on the
conductor.
Specifically, let the conductor be a sphere and let the point P be a distance r from its
center. Since a unit charge on the sphere causes a potential $1/4\pi\epsilon_0 r$ at P, if a unit charge is
placed at P, the uncharged sphere will assume a potential $1/4\pi\epsilon_0 r$.
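Green's reciprocation theorem for point charges, (3.139), is easy to verify numerically. The sketch below uses two arbitrary (hypothetical) charge sets at the same four points and compares the two sums. The helper name `potentials_at_charges` is illustrative.

```python
import math

EPS0 = 8.854187817e-12
K = 1.0 / (4.0 * math.pi * EPS0)

def potentials_at_charges(points, charges):
    """Potential at each charge location due to all the other charges."""
    result = []
    for m, point_m in enumerate(points):
        phi = 0.0
        for n, (point_n, q_n) in enumerate(zip(points, charges)):
            if n != m:  # the primed sum omits the term n = m
                phi += K * q_n / math.dist(point_m, point_n)
        result.append(phi)
    return result

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
q = [1.0e-9, -2.0e-9, 3.0e-9, 0.5e-9]
q_prime = [-4.0e-9, 1.5e-9, 2.0e-9, 7.0e-9]

phi = potentials_at_charges(points, q)
phi_prime = potentials_at_charges(points, q_prime)

lhs = sum(phi_m * qp_m for phi_m, qp_m in zip(phi, q_prime))
rhs = sum(phip_m * q_m for phip_m, q_m in zip(phi_prime, q))
print(lhs, rhs)
```

The two sums agree to rounding error because the double sum is symmetric under the interchange of m and n.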

Equations (3.136) comprise a linear set of N equations which may be solved to give

$$\begin{aligned}
Q_1 &= c_{11}V_1 + c_{12}V_2 + \cdots + c_{1N}V_N \\
Q_2 &= c_{21}V_1 + c_{22}V_2 + \cdots + c_{2N}V_N \\
&\;\;\vdots \\
Q_N &= c_{N1}V_1 + c_{N2}V_2 + \cdots + c_{NN}V_N
\end{aligned} \tag{3.143}$$

in which the coefficients $c_{ij}$ represent appropriate ratios of two determinants involving
the $p_{ij}$'s. Thus the $c_{ij}$'s are also purely geometrical quantities, depending on the size,
shape, orientation, and position of each conducting body. $c_{ii}$ is called a coefficient of
capacitance, and $c_{ij}$ ($i \ne j$) is called a coefficient of electrostatic induction.
It follows from Green's reciprocation theorem that

$$c_{ij} = c_{ji} \tag{3.144}$$

If the jth conductor is raised to a positive potential while all the other bodies are
grounded, $Q_j$ must be positive, but all other charges must be negative. Therefore

$$c_{ij} \le 0 \qquad (i \ne j) \tag{3.145}$$

Furthermore, since the total charge of the system cannot be negative in this situation,
for any value of the index j,

$$\sum_{i=1}^{N} c_{ij} \ge 0 \tag{3.146}$$

Equations (3.143) may be rewritten in a more revealing form by making the substitutions

$$C_{ii} = \sum_{j=1}^{N} c_{ij} \qquad C_{ij} = -c_{ij} \quad (i \ne j)$$

which leads to

$$\begin{aligned}
Q_1 &= C_{11}V_1 + C_{12}(V_1 - V_2) + \cdots + C_{1N}(V_1 - V_N) \\
Q_2 &= C_{21}(V_2 - V_1) + C_{22}V_2 + \cdots + C_{2N}(V_2 - V_N) \\
&\;\;\vdots
\end{aligned} \tag{3.147}$$

The quantities $C_{ii}$ and $C_{ij}$ are known as the self-capacitance of the ith body and the
mutual capacitance between the ith and jth bodies. $C_{ij} = C_{ji}$, by virtue of (3.144),
and all the $C_{ii}$'s and $C_{ij}$'s are positive because of the defining relations and the results
(3.145) and (3.146).
An interpretation of (3.147) may be undertaken with reference to the first of these
equations. The total charge $Q_1$ has a component $C_{11}V_1$ which may be attributed to a
capacitance $C_{11}$ between the first body and ground, since $V_1$ is the absolute potential.
Additionally, there is a charge $C_{12}(V_1 - V_2)$ residing on the first body, which may be
attributed to a capacitance $C_{12}$ between the first and second bodies, with this capacitance charged to a voltage difference $V_1 - V_2$. Since $C_{21} = C_{12}$, there is an equal
and opposite charge $C_{21}(V_2 - V_1)$ residing on the second body. A number of flux
lines $C_{12}(V_1 - V_2)$ connects these two bodies, originating on $C_{12}(V_1 - V_2)$ and
terminating on $C_{21}(V_2 - V_1)$. Similar explanations may be offered for the other terms
$C_{1n}(V_1 - V_n)$ occurring in the first equation. Thus the entire set of Equations (3.147)
may be interpreted in terms of a capacitance $C_{ii}$ between the ith body and ground, plus
capacitances $C_{ij}$ between the ith body and each other body in the system. These
capacitances are purely geometrical quantities and often can best be determined by
experiment.
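The passage from coefficients of potential to self- and mutual capacitances can be sketched numerically. The matrix of coefficients below is a hypothetical example built on the assumption of three small, widely separated spheres (p_ii about 1/(4πε₀a_i), p_ij about 1/(4πε₀d_ij)), and the small Gauss-Jordan routine merely stands in for the "ratios of two determinants". None of these names or numbers come from the text.

```python
import math

EPS0 = 8.854187817e-12
K = 1.0 / (4.0 * math.pi * EPS0)

def invert(matrix):
    """Gauss-Jordan inverse of a small, well-conditioned square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical radii (diagonal) and separations (off-diagonal), in meters.
p = [[K / 0.01, K / 0.5, K / 0.8],
     [K / 0.5, K / 0.02, K / 0.9],
     [K / 0.8, K / 0.9, K / 0.015]]

c = invert(p)                                  # coefficients in the form of (3.143)
n = len(c)
C_self = [sum(c[i]) for i in range(n)]         # C_ii = sum over j of c_ij
C_mutual = [[-c[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]

V = [100.0, -50.0, 20.0]                       # volts, hypothetical
Q_direct = [sum(c[i][j] * V[j] for j in range(n)) for i in range(n)]
Q_network = [C_self[i] * V[i]
             + sum(C_mutual[i][j] * (V[i] - V[j]) for j in range(n) if j != i)
             for i in range(n)]
print(Q_direct)
print(Q_network)
```

Both ways of computing the charges should agree, since the second is only a regrouping of the first.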

3.19 ELECTROSTATIC STORED ENERGY

Since electric charges exert forces on each other, work is performed when they move.
In particular, energy normally is required to assemble a system of charges into a given
distribution. This energy may be said to be stored in the system. The technique
employed in Section 3.8 to develop the method of images provides a simple means for
calculating this stored energy.
Let it be desired to find the electrostatic energy stored in the charge system of
Figure 3.11a. If the charge $Q_{ext}$ is allowed to collapse onto the surface $\Phi_0$, forming the
distribution $\sigma'$, the external field disappears. If, in addition, the charge $Q_{int}$ is allowed
to collapse onto the surface $\Phi_0$, forming the distribution $\sigma''$, the internal field also disappears. Since $\sigma'' = -\sigma'$, the net charge everywhere on $\Phi_0$ will be zero and the system
has become electrically neutral.
This provides an excellent starting point for the creation of the system of Figure
3.11a. Consider the family of surfaces $S_0, S_1, S_2, \ldots$, which is to become the family
of equipotentials $\Phi_0, \Phi_1, \Phi_2, \ldots$, in the final system, with $\Phi_0$ the innermost of these
surfaces. One begins by placing the charge distributions $\sigma'$ and $\sigma''$ on $S_0$. The charges
comprising $\sigma'$ are then moved to $S_1$ and changed to the distribution $\sigma_1 = D_1$, in which
$D_1$ is the flux density distribution the final system is to have over $S_1$. The charges of $\sigma_1$
are next moved to $S_2$ and changed to the distribution $\sigma_2 = D_2$, in which $D_2$ is the flux
density distribution the final system is to have over $S_2$. If this process is continued to
completion, all the charges comprising $Q_{ext}$ will be in their proper places and the field
external to $S_0$ will be precisely that of the final system. Since $\sigma''$ is still on $S_0$, the field
internal to $S_0$ still will be identically zero.
How much energy is expended in moving the charges of $\sigma'$ from $S_0$ to $S_1$? Consider a
surface element $dS_0$ in $S_0$, as shown in Figure 3.20. The charge $\sigma'\, dS_0$ is to be transferred
from $dS_0$ to a surface element $dS_1$ in $S_1$ such that $\sigma'\, dS_0 = \sigma_1\, dS_1$. Let $d\rho\, dS_0$ be the
amount of charge which is transferred at a time when $\rho\, dS_0$ units of charge have already
been transferred. At this time the density of charge on $dS_0$ is $(\sigma'' + \sigma' - \rho) = -\rho$
and the density of charge on $dS_1$ is (to first order) $\rho$. If $d\ell$ is the distance from $dS_0$
to $dS_1$, the work done in transferring the charge $d\rho\, dS_0$ is

$$d^4W = \frac{\rho}{\epsilon_0}\, d\rho\, dS_0\, d\ell$$

The work required to transfer all of the charge $\sigma'\, dS_0$ is therefore

$$d^3W = \frac{1}{2}\frac{\sigma'^2}{\epsilon_0}\, dS_0\, d\ell = \frac{\epsilon_0}{2} E^2\, dS_0\, d\ell \tag{3.148}$$

FIGURE 3.20 Transfer of charge.

Thus the work done in moving all the charges comprising $\sigma'$ from $S_0$ to $S_1$ is

$$dW = \frac{1}{2}\epsilon_0 \int_{V_1 - V_0} E^2\, dV$$

in which $V_1 - V_0$ is the volume between $S_1$ and $S_0$.


It follows that the energy stored in the system as the charges of $\sigma'$ are moved from $S_0$
to their final positions is given by

$$W_{ext} = \frac{1}{2}\epsilon_0 \int_{V_{ext}} E^2\, dV \tag{3.149}$$

with $V_{ext}$ the entire volume external to $S_0$.

If $S_0, S_1', S_2', \ldots$, is the family of surfaces which is to become the family of equipotentials $\Phi_0, \Phi_1', \Phi_2', \ldots$, with $S_0$ the outermost surface, the charges comprising $\sigma''$
may now be transferred successively from $S_0$ to $S_1'$, $S_1'$ to $S_2'$, etc., until they reach their
proper final positions. The work necessary to do this will be

$$W_{int} = \frac{1}{2}\epsilon_0 \int_{V_{int}} E^2\, dV \tag{3.150}$$

with $V_{int}$ the entire volume internal to $S_0$. The conclusion is reached therefore that the
total electrostatic energy stored in a system of static charges surrounded by free space
is given by

$$W_E = \frac{1}{2}\epsilon_0 \int_V E^2\, dV = \frac{1}{2}\int_V \mathbf{E} \cdot \mathbf{D}_0\, dV \tag{3.151}$$

in which the volume integration extends throughout all space. This suggests that the
energy is stored in the field with a volume density of $\epsilon_0 E^2/2$ joule/m³, but of course no
experimental verification of this is possible. Only the integrated form (3.151) is susceptible to check. However, this interpretation will prove very attractive when time-varying fields are considered in Chapter 5.
If the ultimate position of all the charge $Q_{ext}$ is such that it is distributed over the
surface of a single conductor, and if similarly $Q_{int}$ finally ends by being distributed over
the surface of a second conductor, an electrostatic system such as the one shown in Figure
3.18 will have been created. For such a case it is interesting to return to (3.148) and write

$$d^3W = \tfrac{1}{2}(\sigma'\, dS_0)(E\, d\ell)$$

as the work required to transfer the charge $\sigma'\, dS_0$ to the surface element $dS_1$. The work
required to transfer this charge to its ultimate proper place on the conductor is therefore

$$d^2W = \tfrac{1}{2}\sigma'\, dS_0 \int E\, d\ell = \tfrac{1}{2}\sigma'\, dS_0\, (\Phi_A - \Phi_0)$$

in which $\Phi_A$ is the potential of the conductor. The work invested to transfer all of $Q_{ext}$
to the first conductor is then

$$W_{ext} = \tfrac{1}{2}(\Phi_A - \Phi_0)\int \sigma'\, dS_0 = \tfrac{1}{2}(\Phi_A - \Phi_0)\, Q_{ext}$$

Similarly, the work required to move all the charge $Q_{int}$ from $S_0$ to the second conductor
is

$$W_{int} = \tfrac{1}{2}(\Phi_B - \Phi_0)\, Q_{int}$$

in which $\Phi_B$ is the potential of the second conductor. Letting $Q = Q_{ext} = -Q_{int}$, and
letting $V = \Phi_A - \Phi_B$ be the voltage difference of the two conductors, one can conclude
that

$$W_E = \tfrac{1}{2} Q V \tag{3.152}$$

is the total energy stored. Since $Q = C_0 V$, with $C_0$ the capacitance, this result may be
written in the alternative forms

$$W_E = \tfrac{1}{2} C_0 V^2 \tag{3.153}$$

$$W_E = \frac{1}{2}\frac{Q^2}{C_0} \tag{3.154}$$

These equations probably already are familiar from circuit analysis, and it is to be
noted that the derivation just given is valid for any arbitrary pair of conductors.
EXAMPLE 3.34

If a parallel plate capacitor of plate area A and spacing l is charged to a potential difference
$V_b$, the electric field is uniform (neglecting fringing) of value

$$E = \frac{V_b}{l}$$

By virtue of (3.151), the energy stored in the capacitor is

$$W_E = \frac{1}{2}\epsilon_0 \int_V E^2\, dV = \frac{1}{2}\epsilon_0 \left(\frac{V_b}{l}\right)^2 A l = \frac{1}{2}\left(\epsilon_0 \frac{A}{l}\right) V_b^2$$

But it already has been noted that the capacitance is $C_0 = \epsilon_0 A/l$ and therefore

$$W_E = \tfrac{1}{2} C_0 V_b^2$$

in agreement with (3.153).
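The agreement between the field form and the circuit forms of the stored energy can be checked with a short calculation. The plate dimensions and voltage below are hypothetical, fringing is neglected as in the example, and the function names are illustrative.

```python
import math

EPS0 = 8.854187817e-12

def energy_from_field(area: float, spacing: float, voltage: float) -> float:
    """(1/2) eps0 E^2 integrated over the volume between the plates."""
    e_field = voltage / spacing
    return 0.5 * EPS0 * e_field ** 2 * area * spacing

def energy_from_capacitance(area: float, spacing: float, voltage: float) -> float:
    """(1/2) C0 V^2 with C0 = eps0 A / l."""
    c0 = EPS0 * area / spacing
    return 0.5 * c0 * voltage ** 2

def energy_from_charge(area: float, spacing: float, voltage: float) -> float:
    """(1/2) Q^2 / C0 with Q = C0 V."""
    c0 = EPS0 * area / spacing
    charge = c0 * voltage
    return 0.5 * charge ** 2 / c0

args = (0.01, 1.0e-3, 100.0)  # area in m^2, spacing in m, volts (hypothetical)
print(energy_from_field(*args), energy_from_capacitance(*args), energy_from_charge(*args))
```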

The energy stored in a multicapacitor system may be deduced by a generalization of the argument leading to (3.152). If $\Phi_0$ is at absolute zero potential, and $\sigma\, dS_0$ is
an element of charge which will ultimately reside on a conducting surface $S_n$, then

$$d^2W = \tfrac{1}{2}\sigma\, dS_0\, V_n$$

is the work needed to move this charge from $S_0$ to $S_n$, where the potential is to be $V_n$.
When all elements of charge in $\sigma'$ and $\sigma''$ are moved to their final positions, the energy
stored is

$$W_E = \frac{1}{2}\sum_{n=1}^{N} Q_n V_n \tag{3.155}$$

Use of (3.143) allows (3.155) to be written

$$W_E = \frac{1}{2}\sum_{m=1}^{N}\sum_{n=1}^{N} c_{nm} V_m V_n \tag{3.156}$$

The field expression for electrostatic stored energy, (3.151), may be converted to
still another form which leads to a generalized geometric formula for capacitance. Using
the relation $\mathbf{E} = -\nabla\Phi$ gives

$$W_E = \frac{1}{2}\epsilon_0 \int_V \mathbf{E} \cdot (-\nabla\Phi)\, dV$$

which, through application of the vector identity (V.107) becomes

$$W_E = \frac{1}{2}\epsilon_0 \int_V \Phi\, \nabla \cdot \mathbf{E}\, dV - \frac{1}{2}\epsilon_0 \int_V \nabla \cdot (\Phi \mathbf{E})\, dV$$

Substitution of $\epsilon_0 \nabla \cdot \mathbf{E} = \rho$ into the first integral and application of the divergence
theorem to the second yields

$$W_E = \frac{1}{2}\int_V \rho\Phi\, dV - \frac{1}{2}\epsilon_0 \oint_S \Phi\, \mathbf{E} \cdot d\mathbf{S}$$

But S may be taken as a sphere at infinity, and since E decreases as $\xi^{-2}$ and $\Phi$ decreases
as $\xi^{-1}$, the surface integral is seen to vanish. Therefore

$$W_E = \frac{1}{2}\int_V \rho\Phi\, dV \tag{3.157}$$

The electrostatic potential function $\Phi$ may be expressed as the integral

$$\Phi = \int_{V'} \frac{\rho'\, dV'}{4\pi\epsilon_0 \xi} \tag{3.158}$$

wherein primes are used so as to be able to distinguish between the contributions
to the integrals in (3.157) and (3.158). Thus the electrostatic stored energy may also be
expressed in the form

$$W_E = \frac{1}{2}\int_V \int_{V'} \frac{\rho\rho'}{4\pi\epsilon_0 \xi}\, dV\, dV' \tag{3.159}$$

in which $\xi$ is the distance between the volume elements $dV$ and $dV'$, and the integration
is to be performed twice throughout all of space containing elements of charge.
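This result may be applied to situations in which equal and opposite amounts of
charge [Q,-Q] reside on two conducting bodies whose exterior surfaces are $S_1$ and $S_2$.
The surface charge densities $\sigma_1(\xi,\eta,\zeta)$ on $S_1$ and $\sigma_2(\xi,\eta,\zeta)$ on $S_2$ are both linearly proportional to Q so that one may write

$$\sigma_1(\xi,\eta,\zeta) = Q f_1(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ on $S_1$, and

$$\sigma_2(\xi,\eta,\zeta) = Q f_2(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ on $S_2$, with $f_1$ and $f_2$ functions which give the normalized charge
distribution. Under these conditions, (3.159) may be written

$$W_E = \frac{Q^2}{2}\int_{S_1}\int_{S_1'} \frac{f_1 f_1'}{4\pi\epsilon_0 \xi}\, dS_1\, dS_1' + \frac{Q^2}{2}\int_{S_2}\int_{S_2'} \frac{f_2 f_2'}{4\pi\epsilon_0 \xi}\, dS_2\, dS_2' + Q^2 \int_{S_1}\int_{S_2} \frac{f_1 f_2}{4\pi\epsilon_0 \xi}\, dS_1\, dS_2$$

wherein $f_1'$ implies $f_1(\xi',\eta',\zeta')$, etc. It is evident from (3.154) that the capacitance of
this system must be given by

$$\frac{1}{C_0} = \int_{S_1}\int_{S_1'} \frac{f_1 f_1'}{4\pi\epsilon_0 \xi}\, dS_1\, dS_1' + \int_{S_2}\int_{S_2'} \frac{f_2 f_2'}{4\pi\epsilon_0 \xi}\, dS_2\, dS_2' + 2\int_{S_1}\int_{S_2} \frac{f_1 f_2}{4\pi\epsilon_0 \xi}\, dS_1\, dS_2 \tag{3.160}$$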

EXAMPLE 3.35

Two concentric conducting shells of radii $r_1$ and $r_2$ form a capacitor whose capacitance can
be determined with the aid of (3.160). On the basis of the figure, let charges $-Q$ and $+Q$
be placed on the inner and outer shells; then $f_1(\xi,\eta,\zeta) = -1/4\pi r_1^2$ and $f_2(\xi,\eta,\zeta) = 1/4\pi r_2^2$
are constants and

$$I = 2\int_{S_1}\int_{S_2} \frac{f_1 f_2}{4\pi\epsilon_0 \xi}\, dS_1\, dS_2 = -\frac{1}{2\pi\epsilon_0}\left(\frac{1}{4\pi r_1^2}\right)\left(\frac{1}{4\pi r_2^2}\right)\int_{S_1}\int_{S_2} \frac{dS_1\, dS_2}{\xi}$$

Let $dS_2$ be the zenith area element shown in the figure, and let $dS_1$ be the ring $2\pi r_1^2 \sin\theta\, d\theta$.
Then, since

$$\xi^2 = r_1^2 + r_2^2 - 2r_1 r_2 \cos\theta$$

$$2\xi\, d\xi = 2r_1 r_2 \sin\theta\, d\theta$$

$$dS_1 = 2\pi \frac{r_1}{r_2}\, \xi\, d\xi$$

it follows that

$$I = -\frac{2}{4\pi\epsilon_0 r_2}$$

Following the same procedure, one finds that the other two integrals in (3.160) give
$+1/4\pi\epsilon_0 r_1$ and $+1/4\pi\epsilon_0 r_2$ and therefore

$$C_0 = \frac{1}{\dfrac{1}{4\pi\epsilon_0 r_1} - \dfrac{1}{4\pi\epsilon_0 r_2}} = \frac{4\pi\epsilon_0 r_1 r_2}{r_2 - r_1}$$
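The three surface integrals of this example can also be evaluated numerically as a cross-check. The sketch below assumes (3.160) gives 1/C₀ as the sum of the two self terms plus twice the cross term, and uses a midpoint rule over the polar angle. The function names and the radii are hypothetical.

```python
import math

EPS0 = 8.854187817e-12

def ring_integral(r_src: float, r_obs: float, n: int = 2000) -> float:
    """Midpoint-rule integral of dS/xi over a sphere of radius r_src,
    seen from a point at distance r_obs from the common center."""
    step = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * step
        xi = math.sqrt(r_src ** 2 + r_obs ** 2 - 2.0 * r_src * r_obs * math.cos(theta))
        total += 2.0 * math.pi * r_src ** 2 * math.sin(theta) * step / xi
    return total

def inverse_capacitance(r1: float, r2: float) -> float:
    """1/C0 assembled from the two self terms and the cross term."""
    f1 = -1.0 / (4.0 * math.pi * r1 ** 2)
    f2 = 1.0 / (4.0 * math.pi * r2 ** 2)

    def term(fi, fj, ri, rj):
        # The inner integral is the same for every point of S_j, so the outer
        # integration only contributes the area of S_j.
        return fi * fj * (4.0 * math.pi * rj ** 2) * ring_integral(ri, rj) / (4.0 * math.pi * EPS0)

    return term(f1, f1, r1, r1) + term(f2, f2, r2, r2) + 2.0 * term(f1, f2, r1, r2)

r1, r2 = 0.05, 0.10  # meters, hypothetical
c_numeric = 1.0 / inverse_capacitance(r1, r2)
c_closed = 4.0 * math.pi * EPS0 * r1 * r2 / (r2 - r1)
print(c_numeric, c_closed)
```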

3.20* THE MAXWELL-McALISTER EXPERIMENT

This section and the next are concerned with improvements in accuracy which Maxwell and McAlister and later Plimpton and Lawton made in the Cavendish electric
force experiment (described in Section 3.1). Though not adding further to the electrostatic theory just presented, a discussion of these experiments enhances confidence
in the postulation of the inverse square law, and affords the opportunity to consider
Maxwell's analysis of the accuracy of such experiments.

FIGURE 3.21 The Maxwell-McAlister apparatus. (Labels: two thin spherical metal shells; air; insulators.)

With McAlister's help in the laboratory, Maxwell repeated the Cavendish experiment in 1878, using an apparatus of improved design. The need Cavendish had to
remove the outer hemispheres was avoided by introducing a trap door which provided
access to the inner globe, and through which the testing electrode of an electrometer
could be inserted to detect the presence of charge on the inner globe. As suggested by
Figure 3.21, the outer hemispheres were sealed together and placed on an insulating
stand. The inner globe was spaced and insulated from the outer shell, in a concentric
position, through the use of a piece of ebonite tubing. The trap door in the outer shell
was so constructed that, in its closed position, it formed an electrical connection
between the two spheres. The trap door could be lifted by an insulating thread, thus
breaking this electrical connection. The detector inserted through the resulting opening
was a version of Thomson's quadrant electrometer, a much more sensitive instrument
than the pith-ball electrometer available to Cavendish a century earlier. The case of
this electrometer and one of its electrodes were permanently grounded, and the testing
electrode was also kept grounded except when used to test the potential of the inner
globe. To estimate the original charge on the outer shell, a small brass ball was placed
on an insulating stand at a distance of about 60 cm from the center of the shell.
* This section may be omitted without loss in continuity of the technical presentation.
The procedure followed was to close the trap door, thus connecting the outer shell
to the inner globe, and then to charge the outer shell positively from a condenser which
was brought in from another room for this purpose and then promptly removed. After
this, in Maxwell's words²⁰
The small brass ball was then connected to earth for an instant, so as to give it a negative
charge by induction, and was then left insulated. The lid was then lifted up by means of the
silk string, so as to take away the communication between the shell and the globe. The shell
was then discharged and kept connected to earth. The testing electrode of the electrometer
was then disconnected from earth, and made to pass through the hole in the shell so as to
touch the globe within without touching the shell.
Not the slightest deflexion of the electrometer could be observed.

Because of the relative sizes of the small brass ball and the outer shell, and their
separation distance, Maxwell and McAlister knew that at the time the brass ball
had been momentarily grounded, it had taken on an induced negative charge which
was approximately 1/54th of the positive charge which had been applied to the outer
shell. Later, when the outer shell was grounded, it actually retained a small positive
charge, through induction, and due to the presence of the insulated, negatively charged
brass ball. This small positive charge was computed to be about 1/9th of the negative
charge on the brass ball, or 1/486th of the original charge applied to the shell system.
Thus at this stage of the experiment, the outer and inner shells were insulated from
each other, the outer shell was at ground potential and possessed a positive charge
approximately 1/486th of the original charge, the small brass ball was insulated and
contained a negative charge approximately 1/54th as big as the original charge applied
to the shell system, and the electrometer indicated no charge on the inner globe.
To test the sensitivity of the instrumentation, the outer shell was disconnected from
ground and connected instead to the electrometer. Being still at ground potential
(due to the presence of the negatively charged brass ball) the outer shell caused no
deflection of the electrometer. However, at this juncture, the small brass ball was
grounded, thus raising the potential of the outer shell and producing a large deflection
of the electrometer.
Calling this observed deflection D, and letting d be the largest deflection which
could escape detection, Maxwell and McAlister then knew that the maximum charge
which resided on the inner globe was 1/486(d/D)th of the original charge applied
to the shell system. Thus Maxwell concludes
20 J. Clerk Maxwell, ed., The Scientific Papers of the Honourable Henry Cavendish, vol. 1, pp. 404-409,
revised by J. Larmor, Cambridge University Press, 1921. (Note 19 of Notes by the Editor.)


... We know that the potential of the globe at the end of the first part of the experiment
cannot differ from zero by more than

$$\pm \frac{1}{486}\frac{d}{D}\, V$$

where V is the potential of the shell when first charged.
But it appears from the mathematical theory that if the law of repulsion had been as
$r^{-(2+\delta)}$, the potential of the globe when tested would have been

$$0.1478\, \delta V$$

Hence $\delta$ cannot differ from zero by more than $\pm \frac{1}{72}(d/D)$.
Now, even in a rough experiment, D was certainly more than 300d. In fact, no sensible
value of d was ever observed. We may therefore conclude that $\delta$, the excess of the true index
above 2, must either be zero, or must differ from zero by less than

$$\pm \frac{1}{21600}$$

The mathematical theory to which Maxwell refers in the above passage is developed
in the remainder of his Note 19. This development is slightly paraphrased, with modified notation, as follows:
Assume that the law of force between two electric charges a distance r apart is
$qq'F(r)$ in which q and q' are algebraic quantities representing the amounts of charge
and F(r) is a function to be determined. Since the system is conservative, the potential
energy can be written

$$W(r) = qq' \int_r^\infty F(R)\, dR \tag{3.161}$$

in which the result is independent of the path. This may be expressed in the form

$$W(r) = qq'\, \frac{f'(r)}{r} \tag{3.162}$$

so that

$$f(r) = \int_0^r r \left[\int_r^\infty F(R)\, dR\right] dr \tag{3.163}$$

Imagine in the foregoing that the charge q' is successively elements of a charge $Q_a$
uniformly distributed over the outer sphere (Figure 3.22) of radius a and a charge $Q_b$
which is uniformly distributed over the inner sphere of radius b. The potential energies
between the charge q and each of these elements of charge may be added.
Let $\sigma_a$ and $\sigma_b$ be the respective surface charge densities so that

$$Q_a = 4\pi a^2 \sigma_a \qquad Q_b = 4\pi b^2 \sigma_b$$

Referring to the figure, one can consider first an element of charge at P', residing in the
element of area $a^2 \sin\theta\, d\theta\, d\phi$. (The usual spherical coordinates are implied.) Placing
the charge q at P, a distance c from the center, and letting r = PP', one finds that

$$r^2 = a^2 - 2ac\cos\theta + c^2 \tag{3.164}$$

and that the potential energy between this element and q is, from (3.162)

$$q\sigma_a a^2 \sin\theta\, d\theta\, d\phi\, \frac{f'(r)}{r}$$

FIGURE 3.22 Geometry for Maxwell's analysis.

The potential energy between q and all the charge on the outer shell is therefore

Wa

qQa .f'(r) sin fJ dfJ d


41r
1"

Since r is independent of </>, this becomes

Wa = qQa
2

1'(1') sin fJ dfJ

(3.165)

However, the differential of (3.164) gives $r\,dr = ac \sin\theta\, d\theta$, so that (3.165) may be
rewritten

$$W_a = \frac{qQ_a}{2ac} \int_{a-c}^{a+c} f'(r)\, dr = \frac{qQ_a}{2ac}\, [f(a+c) - f(a-c)]$$
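The reduction of the angular integral (3.165) to the closed form above can be checked numerically. The sketch below is only an illustration: it assumes the power law $F(r) = r^{-(2+\delta)}$ adopted later in this section, for which $f'(r) = r^{-\delta}/(1+\delta)$ and $f(r) = r^{1-\delta}/(1-\delta^2)$, and the function names and sample values are hypothetical.

```python
import math

def f_prime(r, delta):
    """f'(r) = r * integral_r^inf R**-(2 + delta) dR = r**(-delta) / (1 + delta)."""
    return r ** (-delta) / (1.0 + delta)

def f(r, delta):
    """Antiderivative of f_prime with f(0) = 0: r**(1 - delta) / (1 - delta**2)."""
    return r ** (1.0 - delta) / (1.0 - delta ** 2)

def shell_energy_numeric(a, c, delta, q=1.0, Qa=1.0, n=20000):
    """Midpoint-rule estimate of (3.165): (q*Qa/2) * int_0^pi f'(r)/r sin(theta) dtheta."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        # Distance from the field point P to the shell element, from (3.164).
        r = math.sqrt(a * a - 2.0 * a * c * math.cos(theta) + c * c)
        total += f_prime(r, delta) / r * math.sin(theta)
    return 0.5 * q * Qa * total * h

def shell_energy_closed(a, c, delta, q=1.0, Qa=1.0):
    """Closed form: q*Qa/(2ac) * [f(a + c) - f(a - c)], valid for c < a."""
    return q * Qa / (2.0 * a * c) * (f(a + c, delta) - f(a - c, delta))

if __name__ == "__main__":
    a, c, delta = 1.0, 0.4, 0.1   # illustrative values only
    print(shell_energy_numeric(a, c, delta), shell_energy_closed(a, c, delta))
```

For $\delta = 0$ the closed form reduces to $qQ_a/a$, independent of $c$, which is the usual constant potential inside a uniformly charged shell.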

When this procedure is repeated for the inner sphere, one obtains

$$W_b = \frac{qQ_b}{2bc}\, [f(c+b) - f(c-b)]$$

If $q$ is taken to be a unit charge, the electric potential at $P$ is

$$V(c) = \frac{Q_a}{2ac}\, [f(a+c) - f(a-c)] + \frac{Q_b}{2bc}\, [f(c+b) - f(c-b)] \tag{3.166}$$

From this, it follows that the potential of the outer sphere is

$$V(a) = \frac{Q_a}{2a^2}\, f(2a) + \frac{Q_b}{2ab}\, [f(a+b) - f(a-b)] \tag{3.167}$$

whereas the potential of the inner sphere is

$$V(b) = \frac{Q_b}{2b^2}\, f(2b) + \frac{Q_a}{2ab}\, [f(a+b) - f(a-b)] \tag{3.168}$$


In the Cavendish experiment, the two spheres first were joined by a short wire
and charged to a common potential $V_1$ above ground. Putting $V(a) = V(b) = V_1$
into the above equations, and solving for $Q_b$, the charge on the inner sphere, one obtains

$$Q_b = 2V_1 b\, \frac{b f(2a) - a[f(a+b) - f(a-b)]}{f(2a) f(2b) - [f(a+b) - f(a-b)]^2} \tag{3.169}$$

The two spheres were next disconnected from each other and the outer sphere grounded.
At this point, the charge on the outer sphere changed to $Q_a'$, which can be determined
from

$$V(a) = 0 = \frac{Q_a'}{2a^2}\, f(2a) + \frac{Q_b}{2ab}\, [f(a+b) - f(a-b)]$$

At the same time, the potential of the inner sphere became

$$V(b) = V_2 = \frac{Q_b}{2b^2}\, f(2b) + \frac{Q_a'}{2ab}\, [f(a+b) - f(a-b)]$$

Elimination of $Q_a'$ from these two equations yields the relation

$$V_2 = V_1 \left[ 1 - \frac{a}{b}\, \frac{f(a+b) - f(a-b)}{f(2a)} \right] \tag{3.170}$$

It is this potential $V_2$ of the inner globe which the electrometer was used to detect.†
Next assume, with Cavendish, that the law of electric force is some inverse power
of the distance which differs but little from the inverse square; that is, let

$$F(r) = r^{-(2+\delta)}$$

so that

$$f(r) = \frac{r^{1-\delta}}{1 - \delta^2}$$

Since $\delta$ is assumed small, $r^{-\delta}$ can be expanded in the rapidly converging power series

$$r^{-\delta} = 1 - \delta \ln r + \frac{(\delta \ln r)^2}{2!} - \cdots$$

(This is merely a Maclaurin series in powers of $\delta$. Cf. the Mathematical Supplement,
Part I.)
Use of the first two terms of this expansion in (3.170) gives the first-order result

$$V_2 = \frac{\delta}{2}\, V_1 \left[ \frac{a}{b} \ln\frac{a+b}{a-b} - \ln\frac{4a^2}{a^2 - b^2} \right] \tag{3.171}$$

Insertion in (3.171) of the values used by Maxwell and McAlister for the radii $a$ and $b$
gives $V_2 = 0.1478\,\delta V_1$, the relation used by Maxwell in the previous quotation. It was
in this manner that Maxwell was able to determine the bound $\delta = 1/21600$.
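The step from (3.170) to the first-order result (3.171) can be checked numerically. The following sketch compares the exact ratio $V_2/V_1$ for the power law $F(r) = r^{-(2+\delta)}$ with the first-order coefficient of $\delta$; the function names and the sample radii are illustrative assumptions, not the radii of any of the experiments described here.

```python
import math

def first_order_coefficient(a, b):
    """Coefficient k in V2 ~ k * delta * V1, taken from (3.171)."""
    return 0.5 * ((a / b) * math.log((a + b) / (a - b))
                  - math.log(4.0 * a * a / (a * a - b * b)))

def exact_ratio(a, b, delta):
    """V2 / V1 from (3.170) with f(r) proportional to r**(1 - delta).

    The common factor 1 / (1 - delta**2) cancels in the ratio of f values,
    so it is omitted here.
    """
    def f(r):
        return r ** (1.0 - delta)
    return 1.0 - (a / b) * (f(a + b) - f(a - b)) / f(2.0 * a)

if __name__ == "__main__":
    a, b = 1.25, 1.0            # hypothetical radii, arbitrary units
    for delta in (1e-2, 1e-3, 1e-4):
        k_exact = exact_ratio(a, b, delta) / delta
        print(delta, k_exact, first_order_coefficient(a, b))
```

As $\delta$ shrinks, the exact ratio divided by $\delta$ approaches the first-order coefficient, and for $\delta = 0$ the ratio vanishes, which is the null result expected of an exact inverse square law.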

† In the Maxwell-McAlister version of this experiment, the presence of the small charged brass ball
adds another term to the expressions for V(a) and V(b), but this term cancels out in (3.169) and
(3.170), and thus (3.170) is valid both for the original Cavendish experiment and for the later
Maxwell-McAlister experiment.


One can carry this analysis further and assume that $Q_b = 0$ and thus that $V_2 = 0$.
Equation (3.170) then gives

$$b f(2a) - a f(a+b) + a f(a-b) = 0$$

Holding $a$ fixed, letting $b$ vary, and differentiating twice with respect to $b$, one obtains

$$f''(a+b) = f''(a-b)$$

Since this must be true for any $b < a$, it follows that $f''(r) = K_1$ and $f'(r) = K_1 r + K_2$,
in which $K_1$ and $K_2$ are constants. Hence,

$$\int_r^{\infty} F(R)\, dR = \frac{f'(r)}{r} = K_1 + \frac{K_2}{r}$$

from which

$$F(r) = \frac{K_2}{r^2} \tag{3.172}$$

Thus on the basis of the assumption of a null result in the Cavendish experiment, the
electric force law is the inverse square. This proof is due to Maxwell. It borrows from a
procedure first used by Laplace who showed that no function of the distance except
the inverse square satisfies the condition that a uniform spherical mass shell exerts no
gravitational force on a particle within it.
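The null condition $b f(2a) - a f(a+b) + a f(a-b) = 0$ can be explored numerically. The sketch below evaluates its left-hand side for the general solution $f(r) = \tfrac{1}{2}K_1 r^2 + K_2 r$ (taking $f(0) = 0$) and, for contrast, for a power law with a small nonzero $\delta$. The constants and radii are arbitrary illustrative values.

```python
def null_residual(f, a, b):
    """Left-hand side of b f(2a) - a f(a+b) + a f(a-b) = 0."""
    return b * f(2.0 * a) - a * f(a + b) + a * f(a - b)

def f_inverse_square(r, K1=0.7, K2=1.3):
    """f(r) = K1 r^2 / 2 + K2 r, the integral of f'(r) = K1 r + K2 with f(0) = 0."""
    return 0.5 * K1 * r * r + K2 * r

def f_power_law(r, delta=0.05):
    """f(r) for F(r) = r**-(2 + delta); departs from the inverse square when delta != 0."""
    return r ** (1.0 - delta) / (1.0 - delta ** 2)

if __name__ == "__main__":
    a = 1.0
    for b in (0.1, 0.5, 0.9):
        print(b, null_residual(f_inverse_square, a, b), null_residual(f_power_law, a, b))
```

The residual vanishes (to rounding error) for every $b < a$ in the first case, whereas the power law with $\delta \neq 0$ leaves a residual that depends on $b$.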

3.21 *

THE PLIMPTON-LAWTON EXPERIMENT

The accuracy of the Maxwell-McAlister result stood until 1936 when Plimpton and
Lawton of Worcester Polytechnic Institute undertook the task of attempting a more
exact measurement.²¹ Using modern equipment, they were able to show that the
electric force must be an inverse power of the distance, $r^{-(2+\delta)}$, in which $\delta$ is bounded
by $2 \times 10^{-9}$.
Plimpton and Lawton examined Maxwell's version of the Cavendish experiment
very carefully. It was at first believed that by using a greater charging potential on the
outer sphere and a more sensitive electrometer, one could increase the accuracy of
Maxwell's method. This did not prove to be the case; in fact, it was concluded that
Maxwell had apparently reached the limit attainable by his method, even granting
the sensitivity of modern equipment. Maxwell's method suffered from two limitations:
(1) radioactive contamination of the metal surfaces makes spontaneous ionization
possible, and this could affect the charge on the inner globe during measurement, and
(2) contact potentials establish a lower bound on the detectable voltage of the inner
sphere. The second effect is the more severe of the two.
The contact potential difficulty was eliminated completely and the spontaneous
ionization problem reduced by an ingenious modification of the apparatus. Plimpton
and Lawton placed the detector inside the inner globe and thus were able to make
permanent connections between the electrometer and the inner globe. This eliminated
contact potentials entirely. It was also possible to seal the inner globe inside the outer

* This section may be omitted without loss in continuity of the technical presentation.
²¹ S. J. Plimpton and W. E. Lawton, "A Very Accurate Test of Coulomb's Law of Force between
Charges," Phys. Rev., 50, 1066-1077; 1936.


one, thus reducing contamination of its surface. The time duration of the data-taking
was drastically curtailed, thereby decreasing the accumulated effects of spontaneous
ionization.
The apparatus used by Plimpton and Lawton is shown in Figure 3.23. The outer
globe consisted of two hemispheres 5 feet in diameter, soldered together, and mounted
on a porcelain insulator. A slat floor was constructed inside the outer sphere and on it
was placed the detector, housed in copper boxes. These copper boxes formed the lower
half of the inner globe, the upper half being a 4-ft-diameter hemisphere, mounted on Pyrex
glass insulators and connected to the detector boxes. (Plimpton and Lawton showed
that this deformation of the geometry used by Cavendish did not invalidate the applicability of Maxwell's analysis, as given in Section 3.20.)

FIGURE 3.23 The Plimpton-Lawton apparatus. (Labeled components: mirror, telescope, central rheostat, condenser generator, 110 a.c. supply.)

The detector used was a five-stage amplifier operating a galvanometer. This assembly
was suspended on rubber to avoid microphonics. The Johnson noise of the input resistor
caused an indication of only ~~ microvolt. The galvanometer was viewed through a
conducting window in the outer sphere, this being simply a glass-bottomed vessel
filled with a salt water solution.
During preliminary investigations, it was found that when switches were opened or
closed, the galvanometer deflected, due to magnetic field surges. A quasistatic procedure
was devised to circumvent this difficulty. The outer globe was charged by a sinusoidal
voltage source whose frequency was adjusted to the resonance of the galvanometer.
This greatly enhanced the sensitivity of the instrumentation as well as the signal-to-noise ratio. It was found that a frequency of about 2 cycles per second submerged the
galvanometer fluctuations due to inductive effects below the Johnson noise.


Plimpton and Lawton were unable to find a commercially available generator at
such a low frequency; therefore they designed and constructed their own. The time-varying
voltage was developed by moving the center plate of a tri-plate condenser
connected to a suitable power supply.
The calibration procedure was simplified by employing a high resistance potentiometer, as shown in Figure 3.24. During calibration, a small known fraction of the
charging voltage was applied to the inner hemisphere. The potentiometer was varied
until the smallest detectable voltage was determined. This voltage was consistently
less than one microvolt.

FIGURE 3.24 Method of calibration. (Labeled components: feed from condenser generator, oscilloscope.)
During the actual experiment, the outer sphere was charged with a voltage which
was always in excess of 3,000 volts. Although many trials were made, no detectable
deflection was ever observed in the galvanometer.
To adapt Maxwell's analysis to the Plimpton-Lawton experiment, one can return
to the expressions for the potentials of the two spheres, namely, Equations (3.167)
and (3.168). Since Plimpton and Lawton were using such a low frequency (2 cps), these
equations are still valid in their case, only now there is an implied time dependence of
$e^{j\omega t}$. Solving these two equations for $Q_b$ gives

$$Q_b = \frac{2b^2 f(2a) V(b) - 2ab[f(a+b) - f(a-b)] V(a)}{f(2a) f(2b) - [f(a+b) - f(a-b)]^2} \tag{3.173}$$

Since the detector is connected between the inner and outer spheres, the current
flowing through the extremely high input resistance $R$ of the detector is $\dot{Q}_b = j\omega Q_b$.
Thus

$$V(a) - V(b) = j\omega Q_b R \tag{3.174}$$


Elimination of $Q_b$ from Equations (3.173) and (3.174) gives

$$\left\{ \frac{f(2a) f(2b) - [f(a+b) - f(a-b)]^2}{j\omega R} + 2ab[f(a+b) - f(a-b)] \right\} V(a) = \left\{ 2b^2 f(2a) + \frac{f(2a) f(2b) - [f(a+b) - f(a-b)]^2}{j\omega R} \right\} V(b)$$

Since $R$ is so large, this reduces to

$$\frac{a}{b}\, [f(a+b) - f(a-b)]\, V(a) = f(2a)\, V(b)$$

from which it follows that

$$V(a) - V(b) = V(a) \left[ 1 - \frac{a}{b}\, \frac{f(a+b) - f(a-b)}{f(2a)} \right] \tag{3.175}$$

But this is the same expression which Maxwell obtained for the static case. Now, however, the voltages are sinusoidal, and the potential of the inner sphere is being measured
with respect to the outer sphere rather than ground.
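The statement that a very large detector resistance $R$ reproduces the static expression can be illustrated numerically. The sketch below eliminates $Q_b$ between (3.173) and (3.174), solves for $V(b)$ with complex arithmetic, and compares $V(a) - V(b)$ with (3.175). It works in the same normalized units as the equations of this section (no constants of proportionality), and the radii, $\delta$, and $R$ are hypothetical values chosen only to show the limit.

```python
import math

OMEGA_2CPS = 2.0 * math.pi * 2.0   # angular frequency for 2 cycles per second

def f(r, delta):
    """f(r) = r**(1 - delta) / (1 - delta**2) for F(r) = r**-(2 + delta)."""
    return r ** (1.0 - delta) / (1.0 - delta ** 2)

def potential_difference(Va, a, b, delta, omega, R):
    """V(a) - V(b) with a finite detector resistance R, from (3.173) and (3.174)."""
    g = f(a + b, delta) - f(a - b, delta)
    D = f(2.0 * a, delta) * f(2.0 * b, delta) - g * g
    leak = D / (1j * omega * R)            # term that vanishes as R grows
    Vb = Va * (leak + 2.0 * a * b * g) / (2.0 * b * b * f(2.0 * a, delta) + leak)
    return Va - Vb

def static_limit(Va, a, b, delta):
    """Right-hand side of (3.175)."""
    g = f(a + b, delta) - f(a - b, delta)
    return Va * (1.0 - (a / b) * g / f(2.0 * a, delta))

if __name__ == "__main__":
    Va, a, b, delta = 3000.0, 1.25, 1.0, 1e-6   # illustrative values only
    for R in (1e3, 1e9, 1e15):
        print(R, potential_difference(Va, a, b, delta, OMEGA_2CPS, R),
              static_limit(Va, a, b, delta))
```

As $R$ increases, the complex result settles onto the real static value, so the quasistatic measurement tests the same quantity as the original electrometer experiment.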
Once again, the assumption can be made that the electric force law is of the form
$r^{-(2+\delta)}$ with $\delta$ small, yielding the first-order result

$$V(a) - V(b) \cong V(a)\, \frac{\delta}{2} \left[ \frac{a}{b} \ln\frac{a+b}{a-b} - \ln\frac{4a^2}{a^2 - b^2} \right]$$

When the values of $a$ and $b$ used by Plimpton and Lawton† are inserted in this expression, one obtains

$$V(a) - V(b) \cong \frac{\delta}{12}\, V(a) \tag{3.176}$$

Their measurements indicated that $V(a) - V(b)$ was not greater than one-half microvolt even for $V(a)$ as great as 3,000 volts. This yields the result that $\delta$ is bracketed
by the limits $\pm 2 \times 10^{-9}$.
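The arithmetic behind this bound is short enough to spell out. The sketch below takes the coefficient in (3.176) to be 1/12, as the quoted figures imply, and uses the one-half microvolt detection limit and 3,000-volt charging potential stated in the text; the function name is hypothetical.

```python
def delta_bound(dv_max, va, coefficient=1.0 / 12.0):
    """Largest |delta| consistent with |V(a) - V(b)| <= dv_max.

    Uses the first-order relation V(a) - V(b) = coefficient * delta * V(a),
    with the coefficient assumed to be 1/12 as in (3.176).
    """
    return dv_max / (coefficient * va)

if __name__ == "__main__":
    # One-half microvolt detection limit, 3,000-volt charging potential.
    print(delta_bound(0.5e-6, 3000.0))   # approximately 2e-9
```

A smaller detectable voltage or a larger charging potential would tighten the bound in direct proportion.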
Because of the great accuracy of this determination, Plimpton and Lawton even
investigated possible effects due to gravity. This influence was shown to cause a potential difference between the spheres of less than $10^{-10}$ volts, an effect which could be
neglected.
The Plimpton-Lawton determination of the law of electric force stands as a model
of precise experimentation and provides the most confident basis for a development
of an electromagnetic theory which uses the inverse square law as a postulate.
† The value used for b needs to be a suitable average, since their inner surface was not entirely
spherical.
PROBLEMS

3.1

Two particles of equal mass m and equal charge q are suspended from a common point by
light strings of equal length. Find the angle of separation of the two strings.

3.2

Two small spheres are placed 1 m apart and a charge of 1 coul is placed on each. Will two
strong men be able to hold the spheres in position? (Define a strong man as one weighing
200 lbs and able to lift twice his weight.) How many excess electrons could be placed on
each sphere before the men begin to feel a sense of achievement?

3.3

An uncharged conductor of volume V is placed in a uniform electric field of strength E.


What force does it experience?

3.4

A quantity of negative charge -Ze is distributed uniformly throughout a sphere of radius
$r_a$. A positive point charge +Ze is situated at an arbitrary point within the negative cloud
of charge. Find the force on the positive charge.

3.5

Use the Dirac delta function to express a surface charge density distribution in the form
of a volume distribution. Insert this expression in (3.11) and (3.18) to verify (3.13) and
(3.26) .

3.6

A metal shell of radius a contains a charge $Q_a$ and a second metal shell of radius b is given
a charge $Q_b$. If these two shells are then connected by a wire, in which direction will current
flow?


3.7

Use an energy argument to show that the charge on a conductor resides on its outer
surface.

3.8

A discrete system of N charges $q_n$ is located at the points $(\xi_n, \eta_n, \zeta_n)$. Find the energy which can
be extracted from this system when each charge is allowed to move infinitely far away
from all the others.

3.9

Rutherford, in 1911, gave a satisfactory explanation for the deflection of α particles by
adopting as a model of the atom a uniformly distributed cloud of negative electricity of
amount -Ze, contained in a sphere of radius $r_a$, with a positive charge Ze at its center.
He obtained the following expressions for electric field and potential at any point within
the atom

$$E(r) = \frac{Ze}{4\pi\epsilon_0} \left( \frac{1}{r^2} - \frac{r}{r_a^3} \right)$$

$$\Phi(r) = \frac{Ze}{4\pi\epsilon_0} \left( \frac{1}{r} - \frac{3}{2r_a} + \frac{r^2}{2r_a^3} \right)$$

Verify that these expressions are correct.

3.10

Find the equation for the equipotential surfaces due to an electric doublet. Plot a few of
the contours to scale. Are the results valid near the dipole? In what way should they be
modified?

3.11

A dipole of moment p is located in a uniform electric field of strength E. Assume that p
is initially perpendicular to E and say that in this position the dipole has zero potential
energy. Then show that if the dipole is rotated into any new position, its potential energy
has changed to $-\mathbf{p} \cdot \mathbf{E}$.

3.12

A flat circular ring of radius a contains a uniform charge density $\chi$ coul/m. Find the potential and field intensity at any point along the axis.

3.13

Use the result of the previous problem to deduce the potential and field intensity at any
point along the axis of a disc of radius a which has a uniform surface charge density $\sigma$.

3.14

Two extensive metal plates are parallel and oppositely charged, each being insulated, with
the intervening space being a vacuum. If the plates are pulled further apart, explain why
their potential difference increases. Suppose that the initial separation was 1 mm and that
initially the potential difference was 1,000 volts. If the ultimate separation is 1 m, what
is the final voltage? This effect accounts for the high voltage of some lightning discharges.

3.15

Three thin concentric conducting spherical shells of radii a = 1 m, b = 2 m, and c = 4 m
are originally uncharged and insulated from each other by a vacuum. A total charge of +3
coul is then placed on the middle shell B. Next, A and C are electrically connected by a
thin wire which goes through a negligibly small hole in B without touching B. After the
wire is removed, the total charge on A and C is measured. Predict the results of these
measurements.

3.16

Use Gauss' law to show that the average potential over any spherical surface in a charge-free region is equal to the value of the potential at the center of the sphere.

3.17

With the aid of Gauss' law, determine the electric field distribution inside and outside
an electron beam of circular cross section of radius a. Assume the beam possesses a uniform
current density $J$ amp/m² and that the electrons are moving at a constant velocity $\mathbf{v} = \mathbf{1}_z v$.
(This problem may be treated electrostatically even though the charges are moving
because the time-average amount of charge at any position in the beam is a constant.)

3.18

Point charges q and -q are placed at the points A, B. The flux line which leaves A, making
an angle α with AB, meets the plane which bisects AB at right angles, in a point P. Show
that

$$\sin\frac{\alpha}{2} = \sqrt{2}\, \sin\left( \tfrac{1}{2} \angle PAB \right)$$

Hint: When this flux line is rotated about AB as axis, the surface thus generated encloses
no net charge.

3.19

A point charge of +q coul is placed d m above an infinite grounded conducting plane in
otherwise empty space. Let P be that point in the plane nearest to q, and with P as center
draw a circle in the plane of radius r. If the circular area thus formed contains one-quarter
of the total induced charge in the plane, find the value of r.

3.20

Two infinite grounded conducting planes intersect at right angles and a point charge q is
placed a distance d from each plane. Find the surface charge distribution in each plane.

3.21

Let the electric field intensity at the surface of a thin spherical conducting shell be E.
Show that if an extremely small hole pierces the shell, then the electric field at the mouth
of the hole is $\tfrac{1}{2}E$.

3.22

Use the method of images to derive a field expression for the system consisting of a uniform
line charge parallel to and external to a right circular conducting cylinder. Consider the
general case in which the cylinder is at an arbitrary potential.

3.23

A grounded conducting tube of infinite length has a circular cross section of inner radius
10 cm. A line charge of 2 microcoul/m is placed parallel to and at a distance of 5 cm from
the axis of the tube. Determine the surface charge density distribution induced on the
inner surface of the tube. What is the net force per unit length acting on the tube?

3.24

Consider an electrified system consisting of a metallic sphere of radius a and excess charge
Q together with a point charge q a distance b > a from the center of the sphere. Find the
force on q and show that under certain circumstances it can be attractive even if q and Q
are of like sign.

3.25

A hollow conducting sphere has an internal radius of 1 m. A point charge of 1 microcoul
is placed within the sphere at a distance of 50 cm from the center. Find the surface charge
density distribution on the sphere and determine the net electrostatic force experienced
by the sphere.

3.26

Two equal charges q are placed at equal distances d from the center of a grounded conducting shell of radius a. If a > d and the charges are on the same diameter, find the net
force on each charge.

3.27

A hollow conductor is formed by a quarter of a sphere and two perpendicular diametral
planes. Find the image of a charge placed at any internal point.

3.28

A spherical conducting shell of radius b is insulated and uncharged and surrounds a
spherical conductor of radius a, the distance c between their centers being small. The inner
conductor contains a charge Q. Find the potential distribution between conductors and
the surface charge density.

3.29

A cylindrical volume of radius b, extending to infinity in both axial directions, contains
a space charge of constant density $\rho_0$. Find the field intensity E(r) and the electrostatic
potential $\Phi(r)$ for any radial distance.

3.30

Two infinite coaxial cylindrical conducting shells of radii a and b bound a uniform space
charge density $\rho_0$. Determine the potential distribution in this intervening region if the
inner and outer cylinders are held at potentials $\Phi(a) = 0$ and $\Phi(b) = V_0$ respectively.

3.31

Determine the trajectory of an outer electron in an electron beam of circular cross section
subject to spreading caused by space charge repulsions within the beam. Assume axial
velocities to be constant and radial velocities such that the beam diverges symmetrically
with no crossing of trajectories. Obtain the radial electric field as a function of r and relate
this to the radial acceleration; then integrate and determine r as a function of axial
distance. Assume that for $z \le 0$ the beam is confined to $r_m$ with no radial velocity.

[Figure for Problem 3.31: beam profile, with the radius $r_m$ marked.]

3.32

How long does it take an electron to travel from cathode to plate in a planar diode? Insert
typical values for the parameters and compute the time in microseconds.

3.33

A spherical volume of radius b contains a space charge density $\rho(r) = b^2 - r^2$. Find the
field intensity E(r) and the electrostatic potential $\Phi(r)$ for any radial distance. Check your
results by substitution in Poisson's equation.

3.34

A thin conducting spherical shell of radius a contains a total excess charge Q, and has its
inner surface coated with a thin insulating film. An equal amount of charge is distributed
throughout its hollow interior such that
T

<a

Find the charge distribution within the sphere and the surface charge density on its outer
surface. What is the absolute potential of the sphere? Of the point at its center?
3.35

Use Laplace's equation to show that the electrostatic potential cannot have a maximum
or a minimum value at any point in space not occupied by an electric charge. Then show
that if $\Phi$ is maximum at a point, the point must be occupied by a positive charge, whereas
a negative charge must be at a point where the potential is a minimum.

3.36

Two infinite parallel conducting plates are separated by a distance b as shown in the
figure. A very thin conducting septum, infinitely long and of height d, is connected to the
grounded plate, with the other plate kept at a constant potential $V_0$. Solve for the potential distribution between plates.

[Figure for Problem 3.36: grounded plate ($\Phi = 0$) carrying the septum; plate separation b.]

3.37

Two L-shaped conducting channels are placed near each other so as to form a narrow
longitudinal slit, as shown in the figure. The two channels are kept at a difference in
potential $V_0$. Assume that the structure is infinite in the Z direction and solve Laplace's
equation to obtain a solution for $\Phi$ in the region between the channels.

[Figure for Problem 3.37: channels at $\Phi = V_0$ and $\Phi = 0$ separated by the slit; x and y axes.]

3.38

Find the potential distribution and electric field between two half-plane conductors set
at an angle $\phi_0$ but not quite touching. Ignore edge effects and assume that one plate is
grounded with the other at a potential $\Phi(\phi_0) = V_0$. What is the charge distribution?

[Figure for Problem 3.38: half-planes at $\Phi = 0$ and $\Phi = V_0$.]

3.39

Find the potential distribution between a four-segment commutator of radius $r_1$ and a
grounded concentric cylinder of radius $r_2 > r_1$. Alternate segments of the commutator
are at the potentials $V_0$ volts, as shown in the figure.


3.40

A spherical conducting shell of radius a is divided into two hemispheres by a narrow


equatorial gap. If the hemispheres are kept at a difference in potential $V_0$, find the potential distribution both inside and outside the shell.

3.41

For the quartered spherical shell of Example 3.25, find the potential distribution inside
the shell.

3.42

Find the Green's functions suitable for solving Dirichlet-type boundary-value problems
involving a rectangular box.

3.43

Repeat the preceding problem for a cylindrical box.

3.44

A line charge of density $\chi$ coul/m is located symmetrically inside a 90-deg grounded conducting corner, being a distance d from each face, as shown in the figure. By using a suitable conformal transformation, and then employing the image principle, find the potential
distribution of this system.

[Figure for Problem 3.44: line charge inside the grounded corner, $\Phi = 0$.]

3.45

Repeat the above problem if the line charge is placed symmetrically inside a grounded
trough, as shown in the figure.

[Figure for Problem 3.45: line charge inside the grounded trough, $\Phi = 0$.]

3.46

With the aid of a Schwarz transformation, find the potential and field distributions in the
region between the two right-angle conducting wedges shown in the figure.

[Figure for Problem 3.46: wedges at $\Phi = V_0$ and $\Phi = 0$; dimension b.]

3.47

A capacitor is formed of three concentric cylinders of which the inner and outer are connected together. Neglecting end effects, obtain a formula for the capacitance per unit
length.

3.48

Use a Schwarz transformation to determine the change in capacitance, for the geometry
shown, over the value which would be obtained if a uniform field existed in both parallel
plane regions.

[Figure for Problem 3.48: conductors at $\Phi = V_0$ and $\Phi = 0$.]

3.49

A sandwich line consists of three parallel plane conductors, as shown in the figure.
Assuming these conductors are infinitely long in a direction perpendicular to the paper,
and neglecting fringing, find the coefficients of capacitance per unit length.

[Figure for Problem 3.49: three parallel plane conductors; dimension b.]

3.50

With the aid of a transformation in the complex plane, find the coefficients of capacitance
per unit length for the geometry shown in the figure.

3.51

Show that the energy stored in the field of a coaxial capacitor is consistent with the
formula $Q^2/2C$. Repeat this calculation for a spherical capacitor.

3.52

Consider an electrostatic system consisting of N conducting bodies possessing charges
$Q_n$ and potentials $V_n$. Prove Thomson's theorem, which states that the charge will
distribute itself so that when in equilibrium the electrostatic stored energy in the field is
a minimum.

3.53

A system of N conductors is charged in any manner and then charges are transferred
among the conductors until they are all brought to the same potential V. Show that there
has been a decrease in the stored electrostatic energy equal to what would be the energy
of the system if each of the original potentials had been decreased by an amount V.

3.54

Under the assumption that the error in measuring the angular position of ball a (see
Figure 3.4) was so much larger than any other error that it was the determining factor in
accuracy, and that Coulomb could measure this position within deg, to what accuracy
did he determine the inverse square law?

3.55

A proof of the inverse square law, assuming a null result in the Cavendish experiment, was
provided by Maxwell based on a formulation of potentials. Can you give an alternative
proof based on force?

3.56

What was the size of the auxiliary small brass ball used by Maxwell and McAlister?

3.57

From the description given of the Plimpton-Lawton experiment, what is the upper bound
on the total charge residing on the inner globe?

3.58

What was the average diameter of the inner closed surface in the Plimpton-Lawton
experiment?

CHAPTER

Magnetostatics in Free Space


MAGNETOSTATICS, as a topic within electromagnetic theory, is usually introduced
by drawing upon experimental evidence to postulate either the Biot-Savart law or
Ampere's circuital law. The theory of magnetic fields due to time-independent current
distributions in free space is then developed. Following this, the behavior of magnetic
materials is considered, usually in terms of aggregations of atomic current loops or
equivalent magnetic dipoles. A satisfactory description of all gross static magnetic
effects may be achieved in this manner.
The approach to be presented in this chapter differs in several respects from the
above. First of all, no new experimentally based postulates will be introduced. Instead,
the previously obtained results of special relativity and electrostatics will be used to
derive the Biot-Savart law. The procedure will be to consider the force exerted on a test
charge by a system of charges which are at rest relative to an observer O'. To a second
observer O, in constant motion relative to O', this charge system is drifting at a constant
velocity, and thus can take on the appearance of a steady current. The second observer
detects a slightly different force to be acting on the test charge. This slight difference
is determinable through the force transformation law and proves to be the seat of
magnetostatics.
After the force transformation equations have been used to transform Coulomb's
law and thus establish the Biot-Savart law, the chapter proceeds conventionally with
the introduction of the magnetostatic vector potential function and the derivation
of Ampere's circuital law. The magnetostatic vector potential function is found to
satisfy Poisson's equation, leading to the solution of a class of boundary-value problems, in analogy with what was presented in Chapter 3 in the case of electrostatics.
As illustrations of the theory, a variety of problems is solved, including the far field
of a small current loop. This result forms the building block for an explanation of
magnetic effects in materials. However, the discussion of magnetic materials will be
deferred until Chapter 7, in order to be able to include time-varying effects.

4.1 *

HISTORICAL SURVEY

Man's awareness of magnetic effects appears to be almost as old as recorded history,
but most of the early knowledge was concerned with the properties of permanent
magnets. The subject of magnetic fields caused by electric currents had a very well-defined beginning in the winter of 1819-1820. During that period, Professor Hans

* This section may be omitted without loss in continuity of the technical presentation.


Christian Oersted (1777-1851) of the University of Copenhagen, experimented with the


placement of a closed electric circuit near a compass needle. He had been motivated
in this study by the observation that a compass needle fluctuated erratically during a
thunderstorm. Accordingly, he set up an apparatus consisting of a galvanic battery
and a short-circuiting wire. Apparently during one of his lectures Oersted placed the
wire at right angles to a compass needle, but observed no effect. At the end of this
lecture the thought occurred to him to place the wire parallel to the needle. This action
immediately caused a pronounced deflection in the needle. After putting together a
more powerful galvanic battery, Oersted assembled some of his colleagues as witnesses
and repeated the experiment. Excerpts of his own account¹ of what was observed
follow:
The opposite ends of the galvanic battery were joined by a metallic wire, which, for shortness sake, we shall call the uniting conductor . . . . To the effect which takes place in this
conductor and in the surrounding space, we shall give the name of the conflict of electricity.
Let the straight part of this wire be placed horizontally above the magnetic needle,
properly suspended, and parallel to it . . . . Things being in this state, the needle will be
moved . . . .
If the distance of the uniting conductor does not exceed three-quarters of an inch from
the needle, the declination of the needle makes an angle of about 45°. If the distance is
increased, the angle diminishes proportionally. The declination likewise varies with the
power of the battery . . . .
The effect of the uniting conductor passes to the needle through glass, metals, wood, water,
resin, stoneware, stones; . . . . The effects, therefore, which take place in the conflict of
electricity are very different from the effects of either of the electricities.
If the uniting conductor be placed in a horizontal plane under the magnetic needle,
all the effects are the same as when it is above the needle, only they are in the opposite
direction . . . .

After noting that a rotation of the wire would be tracked by a rotation of the magnetic
needle, and that no effect was observed for needles made of brass, glass, or gum lac,
Oersted offered a few observations in the nature of an explanation of the phenomenon:
The electric conflict acts only on the magnetic particles of matter. All non-magnetic
bodies appear penetrable by the electric conflict, while magnetic bodies, or rather their
magnetic particles, resist the passage of this conflict. Hence they can be moved by the
impetus of the contending powers.
It is sufficiently evident from the preceding facts that the electric conflict is not confined
to the conductor, but dispersed pretty widely in the circumjacent space.
From the preceding facts we may likewise collect that this conflict performs circles; for
without this condition, it seems impossible that the one part of the uniting conductor, when
placed below the magnetic pole, should drive it towards the east, and when placed above it
towards the west . . . .

There has been some debate as to the extent of the honor which should be accorded
Oersted for this discovery. A principal factor occasioning this debate is a letter from
Hansteen (one of Oersted's students) to Faraday, in which he says:²
1 H. C. Oersted, "Experiments on the Effect of a Current of Electricity on the Magnetic Needle,"
a pamphlet dated July 21, 1820, distributed privately to scientists and scientific societies. English
translation in Ann. of Philosophy, 16, 273-276; 1820.
2 Bence Jones, The Life and Letters of Faraday, vol. 2, pp. 389-392, Longmans, Green and Company,
London, 1870.


Professor Oersted was a man of genius, but he was a very unhappy experimenter: he
could not manipulate instruments . . . . Oersted tried to place the wire of his galvanic
battery perpendicular over the magnetic needle, but remarked no sensible motion. Once,
after the end of his lecture, as he had used a strong galvanic battery to other experiments, he
said, "Let us now once, as the battery is in activity, try to place the wire parallel with the
needle"; as this was made, he was quite struck with perplexity by seeing the needle make
a great oscillation . . . . Thus the great detection was made; and it has been said, not
without reason, that "he tumbled over it by accident." He had not before any more idea
than any other person that the force should be transversal. But as Lagrange has said of
Newton in a similar occasion, "such accidents only meet persons who deserve them."

Considerable weight has been given to this letter by Hansteen, because he was apparently a witness to the original discovery. However, the letter was written in 1857,
almost forty years after the fact, and six years after Oersted's passing. True, Oersted's
own account, excerpted above, was completely lacking in quantitative determination, but all the salient features of the phenomenon had quite clearly been investigated,
including the dependence on current strength, distance of separation, and even shielding
effects. The inference of a circular distribution of magnetic field lines was certainly an
able deduction. As Lenard³ has pointed out, the fact that Oersted had a battery and
compass needle on the table indicates he was looking for such an effect and that the
discovery cannot fairly be labeled as a pure accident. But whatever the true circumstances were surrounding this discovery, it was one of the most important in the history
of science, linking for the first time the fields of electricity and magnetism.
Oersted's discovery was promptly enlarged by others. The academician Arago learned
of it while traveling abroad and, upon his return to Paris, described the effect at a meeting of the French Academy on September 11, 1820. This news excited the interest of
several investigators, and the next discovery was announced by Andre-Marie Ampere
(1775-1836) just one week later. Reasoning that, if magnets exert forces on each other,
and if electric currents exert forces on magnets, then two electric currents should
interact, Ampere devised an experiment in which⁴
. . . in parallel directions, two straight parts of two conducting wires joined the terminals
of two voltaic piles; the one being fixed, and the other suspended from points and made very
mobile by a counterpoise, being able to approach or withdraw while still retaining its
parallelism with the first wire. I have then observed that upon passing an electric current
through each of them, they mutually attract if the two currents are in the same direction,
and that they repel each other when, instead, (the currents) are in opposite directions.

Meanwhile, Jean-Baptiste Biot (1774-1862) and Felix Savart (1791-1841) repeated
Oersted's experiments, and announced to the Academy at the October 30th meeting
that they had determined a law of force which governed the effect. The following brief
notice of their announcement was printed in the Journal de Physique:⁵
The beautiful observations of M. Oersted, combined with precise measurements of torsion
and oscillation, give the following expression for the action exercised at a distance on an
austral or boreal magnetic pole, by a nearby thin copper wire, of great length, connected to
the two terminals of a voltaic apparatus. From the point of the pole draw a perpendicular
3 P. Lenard, Great Men of Science, p. 214, The Macmillan Company, Inc., New York, 1933.
4 A. M. Ampere, "Memoir on the Mutual Action of Two Electric Currents," Annales de Chimie et
Physique, 15, pp. 59-76; 1820.
5 J de Phys (Paris), 91, 151; 1820. See also Ann de Chimie et Physique, 15, 222-223; 1820.

to the axis of the wire. The force acting on the pole is perpendicular to the axis of the wire.
Its intensity is proportional to the reciprocal of the distance. The nature of its action is the
same as if a magnetic needle were to be placed tangentially to the contour of the wire (in
place of the wire), in which case the austral and boreal magnetic poles would be acted upon
in opposite senses, but always along the same straight line determined by the preceding
construction.

The best source for the details of the experiment which established this law is Biot's
Precis Elementaire de Physique.⁶ The method used can be understood with reference to
Figure 4.1a, which is a reproduction of Biot's original drawing. Shown is a compass
6 The third edition of this text was printed in Paris in 1824. An English translation is embodied in
J. Farrar's Elements of Electricity, Magnetism, and Electromagnetism, printed by Hilliard and Metcalf,
Cambridge, Massachusetts, 1826.

FIGURE 4.1 The Biot-Savart experiments. Panels (a), (b), and (c).

needle AB, which can freely pivot about its center point, and which is placed a distance
r from a long, straight wire CMZ. A permanent magnet A'B' (not shown) is positioned
nearby in such a way as to cancel the effect of the earth's magnetic field. The equilibrium position of the needle is then found to be perpendicular to the wire axis. If the
needle is pictured as having two equal and opposite magnetic poles at its extremities,
the forces exerted by the current on these poles are thus equal, opposite and circumferential. If then the needle is displaced from equilibrium by a small angle $\theta$, as shown in
Figure 4.1b, a restoring couple is experienced by the needle, and its equation of motion
is

$$-F(r)\,L \sin\theta = I\,\ddot{\theta}$$

in which L is the length of the needle and I is its moment of inertia. For small displacements, harmonic oscillations will occur of period

$$T = 2\pi \sqrt{\frac{I}{F(r)\,L}}$$
Thus, in Biot's words,


. . . if we compare in this way, the squares of the periods, for different distances of the
uniting wire from the needle, supposing always the condition of isochronism to be fulfilled,
we shall obtain the ratios of the component forces exerted in these different cases by the
uniting wire, parallel to the direction of equilibrium about which the needle oscillates.

Upon performing this experiment, Biot and Savart obtained data which is reproduced
in Table 4.1. The last column of data was calculated under the assumption that the
law was inverse with distance. Since the errors were alternately positive and negative,
irregular, and greater for the larger distances, Biot and Savart concluded that the
law had been fairly established.
TABLE 4.1
DATA FOR THE BIOT-SAVART EXPERIMENT

Distance of        Duration of ten oscillations
the wire, mm       Observed, sec     Calculated, sec
 15                30.00             30.99
 20                33.50             33.88
 40                48.85             48.62
 50                54.75             53.74
 60                56.75             59.40
120                89.00             84.25
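
The conclusion drawn from Table 4.1 can be checked with a short calculation. Since the force on the needle varies as the inverse square of the period, a force law proportional to 1/r implies a period proportional to the square root of r. The sketch below fits a power law to the observed durations by least squares on logarithms. The function name is hypothetical, and the pairing of each observed duration with its distance (both taken in ascending order) is an assumption of the sketch.

```python
import math

# Distances (mm) and observed durations of ten oscillations (sec) from Table 4.1.
# Assumption: both columns are read in ascending order and paired row by row.
DISTANCES_MM = [15, 20, 40, 50, 60, 120]
OBSERVED_SEC = [30.00, 33.50, 48.85, 54.75, 56.75, 89.00]


def fit_period_exponent(distances, durations):
    """Least-squares slope n of log(duration) against log(distance), i.e. T ~ r**n."""
    xs = [math.log(r) for r in distances]
    ys = [math.log(t) for t in durations]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    n = fit_period_exponent(DISTANCES_MM, OBSERVED_SEC)
    # The force varies as 1/T**2, so its exponent in r is -2n.
    print(f"period exponent n = {n:.3f}, force exponent = {-2 * n:.3f}")
```

A fitted period exponent close to one-half corresponds to a force varying nearly as the reciprocal of the distance, which is the law Biot and Savart announced.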

Biot extended this experiment significantly by inquiring what the action must be
on the compass needle due to an infinitesimal length of the wire. Since the influence
of the entire straight wire varied as $r^{-1}$, and since $r^{-1}$ is the integral of $r^{-2}$, he felt that
each element of the wire should make a contribution to the total force which is proportional to the inverse square of its distance from the needle. However, he realized that


the contribution might also depend on the orientation of the element relative to the
needle, and devised an experiment to deduce this relation. Referring to Figure 4.1c,
Biot introduced an additional V-shaped wire with its apex close to the central point
of the first wire. He then determined the period of the compass needle as a function of
$r$, with a steady current alternately passing through the straight wire and the bent
wire. The difference in period under the action of the two wires could be explained⁷ by
the assumption that the contribution from a single current element $I\,d\ell$ was proportional to $(\sin\omega)/\xi^2$. The discovery of this fact led Biot to proclaim
. . . the elementary action of any lamina whatever (is) proportional to $\sin\omega/\xi^2$; and
uniting with this expression, which is founded upon experiment, the knowledge of the
absolute direction of the force which is perpendicular to the plane drawn through each
distance and through the direction of each longitudinal element of the wire under consideration, we may assign by calculation the total resultant of the action exerted by a wire,
or by any portion of a wire, whether straight or curved, limited or indefinite.

In present-day notation, this result is equivalent to saying that a system of steady
currents I creates a magnetic field at point (x,y,z) given by

$$\mathbf{B}(x,y,z) \propto \int \frac{I\,d\boldsymbol{\ell} \times \boldsymbol{\xi}}{\xi^3}$$

and that if a magnetic pole of strength m is placed at (x,y,z) it will experience a force
$m\mathbf{B}$. In the above formula, $\boldsymbol{\xi}$ is drawn from the element $d\boldsymbol{\ell}$ to (x,y,z) and $\xi$ is the corresponding distance. This important equation is known as the Biot-Savart law and is often taken as the experimental
postulate on which magnetostatics is based.
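
Biot's inference that an inverse-square element law integrates to an inverse-distance law for the whole wire can be verified numerically. The following sketch sums contributions proportional to $(\sin\omega)/\xi^2$ from the elements of a long straight wire, with the current and the constant of proportionality set to unity. The function name, the finite half-length standing in for an infinite wire, and the step count are assumptions of the sketch.

```python
import math


def straight_wire_field(r, half_length=1000.0, steps=200_000):
    """Sum sin(omega)/xi**2 over the elements dz of a straight wire (arbitrary units).

    r is the perpendicular distance from the wire to the field point.
    For a straight wire every element contributes in the same direction,
    so a scalar sum is sufficient.
    """
    dz = 2.0 * half_length / steps
    total = 0.0
    for k in range(steps):
        z = -half_length + (k + 0.5) * dz      # midpoint of the k-th element
        xi_sq = r * r + z * z                  # squared distance to the field point
        sin_omega = r / math.sqrt(xi_sq)       # angle between the element and xi
        total += sin_omega / xi_sq * dz
    return total


if __name__ == "__main__":
    for r in (1.0, 2.0, 4.0):
        # r * field should be nearly constant if the field varies as 1/r.
        print(r, r * straight_wire_field(r))
```

The product of distance and summed field stays essentially constant as long as the distance is small compared with the wire length, which is the inverse-distance behavior observed by Biot and Savart.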
Ampere, following his announcement of the discovery of the force between two
currents, continued his investigations, and later that year published a memoir⁸ which
succeeded in clarifying much of what was known about electricity at that time. He
distinguished phenomena involving electricity at rest from phenomena involving
electricity in motion, introducing for the former the name electrostatics, and for the
latter the name electrodynamics. He also distinguished between electric tension (voltage) and electric current. At that time, people were accustomed to speak of the conduction and flow of electricity, but since the two-fluid theory was popular, considerable
confusion existed with respect to the nature of the flow process. Ampere decided that
he would call the whole process an electric current, without regard to its inner nature,
and with the direction of the current defined as the direction in which the positive fluid
was presumed to move. This made the electric current something definite in terms of
which phenomena could be described.
The concept of electric potential, or tension, had been privately appreciated by
Cavendish, and had been admirably developed for electrostatics by Poisson. Ampere
noted that electric tension was observable in a voltaic pile before the circuit was closed,
being detectable through use of an electrometer or electroscope, instruments which
Ampere labeled as measurers of tension. As for the current itself, Ampere felt that it
was best measured by means of its magnetic effects, and he introduced for this purpose
an instrument which he called a galvanometer, an instrument which is still in use today.
7 See Farrar, ibid., pp. 334-339. See also Problem 4.3 at the end of this chapter.
8 Ibid., pp. 59-68.
To Ampere, tension appeared as a cause, and current as an effect. Noting that as
soon as the effect appears through completion of the circuit, the tension "disappears,
or at least becomes very small," Ampere then made the interesting observation⁹
The currents of which I speak self-accelerate until the inertia of the electric fluids and the
resistance that they encounter due to the imperfections in even the best conductors cause
equilibrium with the electromotive force, after which they continue indefinitely at a constant speed such that this force remains at a constant intensity; but they cease entirely at
the instant that the circuit is interrupted.

Ohm's law, which was to be enunciated seven years later, is thus seen to be not far off
in Ampere's thinking.
With a clear definition of current, and a means for measuring it, Ampere continued
his researches over the next three years, and in 1825 collected his results in a lengthy
memoir¹⁰ which must rank as one of the most distinguished in the history of science.
In this memoir, Ampere concerned himself with the problem of determining the law
of force between two current elements. A wide variety of experiments on an assortment
of wire geometries had led him to four conclusions about the force interaction between
currents:
1. The action of one current on another is unchanged in magnitude, but reversed
in direction, when the direction of the current is reversed.
2. The effect of a conductor bent or twisted in any small manner is the same as if
the contour were smoothed out.
3. The force exerted by a closed circuit on a current element is always normal to the
element.
4. If all dimensions of a circuit are changed proportionally, with the currents
unchanged, the forces retain their original values.
When Ampere added to these four conditions the natural assumption that the force
$d^2\mathbf{F}$ between two current elements $I\,d\boldsymbol{\ell}$ and $I'\,d\boldsymbol{\ell}'$ is along their connecting line, an
assumption consistent with Newton's gravitational theory and the Coulomb-Poisson
electrostatic theory, he was able, by an astute piece of analysis, to establish the force
law

$$d^2\mathbf{F} \propto I I'\,\boldsymbol{\xi}\left[\frac{2\,d\boldsymbol{\ell}\cdot d\boldsymbol{\ell}'}{\xi^3} - \frac{3\,(d\boldsymbol{\ell}\cdot\boldsymbol{\xi})(d\boldsymbol{\ell}'\cdot\boldsymbol{\xi})}{\xi^5}\right]$$

in which $\xi$ is the distance separating the two current elements. A clear exposition of the
analysis leading to this formula, as well as the experimental basis for Ampere's four
conditions, can be found in Mason and Weaver.¹¹
If Ampere's third condition is formulated in terms of a field concept, one may write

$$d\mathbf{F} = I'\,d\boldsymbol{\ell}' \times \mathbf{B}$$

in which $d\mathbf{F}$ is the force exerted on the current element $I'\,d\boldsymbol{\ell}'$ and $\mathbf{B}$ is the field caused
by the closed circuit. From this, if $I\,d\boldsymbol{\ell}$ is an element of the closed circuit, according
to the Biot-Savart law, it exerts a force $d^2\mathbf{F}$ on $I'\,d\boldsymbol{\ell}'$ given by

$$d^2\mathbf{F} \propto I I'\,\frac{d\boldsymbol{\ell}' \times (d\boldsymbol{\ell} \times \boldsymbol{\xi})}{\xi^3}$$

which does not agree with Ampere's formula for $d^2\mathbf{F}$.
9 Ibid., p. 64.
10 A. M. Ampere, "On the Mathematical Theory of Electrodynamic Phenomena Uniquely Deduced
from Experiment," Mem Acad, 175-388; 1825.
11 M. Mason and W. Weaver, The Electromagnetic Field, pp. 176-183, The University of Chicago Press,
1929. Reprinted by Dover Publications, Inc., New York.


A lively controversy ensued for some time as to which of these formulas for differential force is correct. Various investigators have shown¹² that for closed circuits
$\iint d^2\mathbf{F}$ gives the same answer when starting with either formula. In Ampere's time it
was not possible to decide the question by experiment. Now that the motion of free
charges can be studied under the influence of magnetic fields, the decision is clearly
in favor of the Biot-Savart formula. Ampere's difficulty can be traced to the improper
assumption that the elemental force acts along the connecting line.
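
The statement that the two element laws disagree pair by pair yet agree for closed circuits can be illustrated numerically. The sketch below evaluates both expressions for the force on a test element, first for a single source element and then summed around a circular loop approximated by short chords. All function names are hypothetical. Because the text gives both laws only as proportionalities, the common constant is set to unity and the overall sign of Ampere's expression is chosen so that parallel currents attract; both choices are assumptions of the sketch.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def scale(s, a):
    return (s * a[0], s * a[1], s * a[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def element_forces(src, dl, test, dl_test):
    """Force on element dl_test at `test` due to element dl at `src`.

    Returns (ampere_form, biot_savart_form) for unit currents.
    """
    xi = sub(test, src)                       # drawn from source element to test element
    dist = math.sqrt(dot(xi, xi))
    xi_hat = scale(1.0 / dist, xi)
    # Ampere's form: directed along the connecting line.
    bracket = 2.0 * dot(dl, dl_test) - 3.0 * dot(dl, xi_hat) * dot(dl_test, xi_hat)
    ampere = scale(-bracket / dist ** 2, xi_hat)
    # Biot-Savart form: dl' x (dl x xi) / xi**3.
    biot_savart = scale(1.0 / dist ** 2, cross(dl_test, cross(dl, xi_hat)))
    return ampere, biot_savart


def closed_loop_forces(test, dl_test, radius=1.0, segments=20000):
    """Sum both element laws around a circular loop lying in the plane z = 0."""
    total_a = (0.0, 0.0, 0.0)
    total_b = (0.0, 0.0, 0.0)
    for k in range(segments):
        t0 = 2.0 * math.pi * k / segments
        t1 = 2.0 * math.pi * (k + 1) / segments
        p0 = (radius * math.cos(t0), radius * math.sin(t0), 0.0)
        p1 = (radius * math.cos(t1), radius * math.sin(t1), 0.0)
        fa, fb = element_forces(scale(0.5, add(p0, p1)), sub(p1, p0), test, dl_test)
        total_a = add(total_a, fa)
        total_b = add(total_b, fb)
    return total_a, total_b


if __name__ == "__main__":
    print(element_forces((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1)))
    print(closed_loop_forces((2.0, 0.5, 0.3), (0.3, -0.2, 0.9)))
```

For two parallel elements placed side by side the two expressions differ by a factor of two, whereas the sums around the closed loop agree to within the discretization error.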
Unlike Biot, who regarded magnetic poles as fundamental, Ampere considered
magnetism to be basically an electrical phenomenon. He viewed a magnetized rod as
equivalent to a coil carrying an uninterrupted current. He showed that two solenoids
deflect each other in exactly the same way as do two magnetized rods, and was even
able to show that a single current loop, when free to move, sets itself like a compass
needle with respect to the earth's magnetic field. Ultimately, Ampere came to the view
that every magnetic molecule is really a small permanent circular current. This viewpoint was much too advanced for his contemporaries. The meager knowledge of atomic
structure would not permit the conception of permanent currents within materials
without a source of power. However, the impression produced by this memoir was deep
and lasting, and today Ampere's views of these phenomena form the core of magnetic
theory. He is properly credited with authorship of the force law $d\mathbf{F} = I'\,d\boldsymbol{\ell}' \times \mathbf{B}$,
even though Biot and Savart deserve citation for the correct formulation of B in terms
of the current elements in a closed circuit. Ampere himself extended the applicability
of this formula by showing that a permanent magnet will exert a force on a current.
His achievements were truly remarkable and Maxwell, writing a half century later,
labeled his memoir "one of the most brilliant achievements in science." As a fitting
tribute, the unit for electric current and the circuital law linking magnetic field and
current are named in his honor.
During this same period Faraday made a discovery of the greatest practical importance. His interest had been aroused in electromagnetism in April 1821 when Wollaston,
a colleague at the Royal Institution, attempted to make a current-carrying wire revolve
around its own axis in the presence of a magnet. Although the experiment was unsuccessful, it piqued Faraday's interest. He began by reading what had been done by
Oersted, Ampere, Biot and Savart, and others, and repeated many of their experiments.
Finally, upon repeating Wollaston's experiment, he noted:¹³
. . . Magnets of different power brought perpendicularly to this wire did not make it
revolve as Dr. Wollaston expected, but thrust it from side to side . . . . The effort of the
wire is always to pass off at a right angle from the pole, indeed to go in a circle round it; . . .
a single magnet pole in the centre of one of the circles should make the wire continually
turn round. Arranged a magnet needle in a glass tube with mercury about it and by a cork,
water, etc., supported a connecting wire so that the upper end should go into the silver cup
12 Cf., e.g., Mason and Weaver, ibid., pp. 183-185.
13 Faraday's Diary, vol. 1, pp. 49-50, being entries in his laboratory notebook for September 3rd, 1821.
Published by G. Bell and Sons, Ltd., London, 1932.

and its mercury and the lower move in the channel of mercury round the pole of the needle . . . . In this way got the revolution of the wire round the pole of the magnet . . . .
Very Satisfactory, but make more sensible apparatus.

This was the first electric motor. The next day Faraday improved on it and shortly
thereafter invented the commutator. But he left to others the reduction to practice.
Magnetostatic theory was advanced and placed more in analogy with electrostatic
theory by Franz Neumann (1798-1895) of Konigsberg, who introduced the concept
of the magnetic vector potential function A. Neumann discovered the utility of this
formulation¹⁴ while devising a theory based on Faraday's emf law, and the A which
he defined is the more general time-varying function which will be encountered in
Chapter 5. However, its time-independent counterpart facilitates the solution of many
magnetostatic problems and will be used extensively in the sections to follow.
An entirely different approach to magnetostatic theory was first perceived by Leigh
Page (1884-1952) of Yale University.¹⁵ Adopting the principles of special relativity
and Coulomb's law as fundamental postulates, he began with a system of charges at
rest relative to an observer O'. Upon introducing a second observer O, in constant motion
relative to O', he observed that the charge system took on the appearance to O of a
steady current. Upon transforming the Coulomb force, Page noted that the force on a
test charge, as observed by O, was slightly different from the value determined by O'.
He was able to show that this small difference precisely accounted for the magnetic field.
Not only was this demonstration additional evidence in support of the validity of
special relativity, but it also further illumined the basic unity of all electrical phenomena, whether they are due to charges at rest or in motion. A generalization of Page's
development will form the core of the present chapter.
4.2

THE TRANSFORMATION OF ELECTRIC FORCE

Let an electric charge $q_1'$ be at rest at an arbitrary point $(x_1', y_1', z_1')$ in the X'Y'Z' coordinate system and let a moving† charge $q'$ be instantaneously at the position

$$\mathbf{r}' = \mathbf{1}_x x' + \mathbf{1}_y y' + \mathbf{1}_z z'$$

The coordinates $x'$, $y'$, $z'$ may be arbitrary functions of time so that, in general, the
charge $q'$ has a velocity $\mathbf{v}'(t')$. This situation is depicted in Figure 4.2.
An observer O', stationary in X'Y'Z', will determine the force exerted by $q_1'$ on $q'$
through application of Coulomb's law, obtaining

$$\mathbf{F}' = \frac{q' q_1'}{4\pi\epsilon_0}\,\frac{\boldsymbol{\xi}'}{(\xi')^3} \tag{4.1}$$

in which $\boldsymbol{\xi}' = \mathbf{1}_x(x' - x_1') + \mathbf{1}_y(y' - y_1') + \mathbf{1}_z(z' - z_1')$ and it is assumed that the
charges are in free space.
Let the coordinate axes of an X YZ system be aligned with the corresponding axes

† For the moment this statement means that this charge had a value of q' when at rest in X'Y'Z'.
It shall be seen shortly that it is convenient to consider charge to be an invariant.
14 F. E. Neumann, "The Mathematical Laws for Induced Electric Currents," Berlin Abhandlungen,
p. 1; 1845. Also p. 1; 1848. Reprinted as nos. 10 and 36 of Ostwald's Klassiker.
15 L. Page, "A Derivation of the Fundamental Relations of Electrodynamics From Those of Electrostatics," Am J Sci, 34, 57-68; 1912.

FIGURE 4.2 Notation for a moving charge acted on by a fixed charge.

of the X'Y'Z' system and let the X axis slide along the X' axis in the negative direction
at a speed u. Upon the coincidence of the two origins let $t = t' = 0$. If $\mathbf{v}_1$ and $\mathbf{v}$ are the
velocities of $q_1'$ and $q'$ with respect to an observer O who is stationary in XYZ, then

(4.2)
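From Equations (2.77), the force exerted on $q'$ by $q_1'$, as determined by O, is given by

$$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m \tag{4.3}$$

in which

$$\mathbf{F}_e = \mathbf{1}_x F_x' + \mathbf{1}_y \kappa F_y' + \mathbf{1}_z \kappa F_z' \tag{4.4}$$

$$\mathbf{F}_m = \frac{\kappa u}{c^2}\left[\mathbf{1}_x (v_y F_y' + v_z F_z') - \mathbf{1}_y v_x F_y' - \mathbf{1}_z v_x F_z'\right] \tag{4.5}$$

with

$$\kappa = (1 - u^2/c^2)^{-1/2}$$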

The division of $\mathbf{F}$ into two parts is arbitrary but useful in that $\mathbf{F}_e$ contains all the
terms not dependent on the motion of $q'$ relative to XYZ, whereas $\mathbf{F}_m$ contains all the
terms which do depend on $\mathbf{v}$. The subscripts on the forces $\mathbf{F}_e$ and $\mathbf{F}_m$ refer to the fact
that they shall be designated the coulomb force and the magnetostatic force for reasons
which shall become evident.
Two cases of this force transformation will now be considered.

Case 1: $\mathbf{v} \equiv 0$.

In this case, $q_1'$ is static in X'Y'Z' and $q'$ is static in XYZ. The force exerted by $q_1'$ on
$q'$, as determined by O, is simply $\mathbf{F}_e$ given by (4.4).


Suppose that one wishes to apply the concept of electric flux density to this situation.
The reader will recall that in Chapter 3 the idea of electric flux was introduced in
connection with a system of fixed charges. Consistent with that discussion, observer O'
can say, since $q_1'$ is at rest in his coordinate system, that the electric flux density associated with $q_1'$ is given by

$$\mathbf{D}_0' = \frac{q_1'}{4\pi}\,\frac{\boldsymbol{\xi}'}{(\xi')^3} \tag{4.6}$$

in which $\boldsymbol{\xi}'$ is drawn from $q_1'$ to a field point P' where $\mathbf{D}_0'$ is being determined.
O' can say further that if $q'$ is instantaneously at the point P', it interacts with the
field $\mathbf{D}_0'$ so as to experience a force

$$\mathbf{F}' = q'\,\frac{\mathbf{D}_0'}{\epsilon_0} \tag{4.7}$$

which is consistent with (4.1).


One needs to proceed cautiously, however, in picturing force as the interaction of
charge and electric flux, when adopting the viewpoint of observer O. For him, the charge
$q_1'$ is not at rest. To discuss the concept of electric flux associated with a moving charge
requires an extension of the original definition of electric flux.
It is useful to explore the consequences of enlarging the original definition by assuming that a quantity of electric charge and the total electric flux associated with it are
invariants. The O' observer, for whom $q_1'$ is at rest, will picture the electric flux as
emanating from $q_1'$ with a spherically uniform distribution. Referring to Figure 4.3a, he
can find the components of $\mathbf{D}_0'$ at any point P' by erecting small displacements $P'P_1'$
and $P'P_2'$ as shown. When the segment $P'P_1'$ is rotated around the line L' as axis, it
cuts out a band of area $S_1'$, as shown in Figure 4.3b. If all the electric flux which pierces
this band is counted and divided by the area $S_1'$, O' obtains the transverse flux density
$[(D'_{0y})^2 + (D'_{0z})^2]^{1/2}$. Similarly, when the segment $P'P_2'$ is rotated about L', it cuts out
a ring of area $S_2'$ as shown in Figure 4.3c. If all the electric flux which pierces this ring
is counted and divided by the area $S_2'$, the longitudinal flux density $D'_{0x}$ is obtained.
The O observer, for whom the charge $q_1$ has the velocity $\mathbf{1}_x u$, sees all longitudinal
dimensions of X'Y'Z' contracted by the factor $(1 - u^2/c^2)^{1/2}$ and all transverse dimensions unaltered, thus picturing the instantaneous situation shown in Figure 4.3d.
Under the assumption that charge and total flux are invariants, O will count the same
number of flux lines piercing the area $S_1$ (generated by rotating $PP_1$ about L as in
Figure 4.3e) that O' counts piercing the area $S_1'$. However, O finds the area $S_1$ to be
smaller than $S_1'$, the relation being

$$S_1 = S_1'/\kappa$$

Thus O concludes that the transverse flux density is

$$(D_{0y}^2 + D_{0z}^2)^{1/2} = \kappa\,[(D'_{0y})^2 + (D'_{0z})^2]^{1/2} \tag{4.8}$$

By rotating $PP_2$ about L as axis (Figure 4.3f), O generates the same area as does O'
and counts the same number of piercing flux lines and thus concludes that

$$D_{0x} = D'_{0x} \tag{4.9}$$

If O considers the force exerted by $q_1$ on $q$ to be computable in the usual way, in terms

FIGURE 4.3 Comparison of flux densities. Panels (a) through (f).

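of an interaction between $q$ and the flux field of $q_1$, he can write, by virtue of (4.8) and
(4.9),

$$F_x = q\,\frac{D_{0x}}{\epsilon_0} = q\,\frac{D'_{0x}}{\epsilon_0} = F_x'$$

$$F_y = q\,\frac{D_{0y}}{\epsilon_0} = q\,\frac{\kappa D'_{0y}}{\epsilon_0} = \kappa F_y'$$

$$F_z = q\,\frac{D_{0z}}{\epsilon_0} = q\,\frac{\kappa D'_{0z}}{\epsilon_0} = \kappa F_z'$$

But these equations agree with (4.4). Therefore by defining charge and total electric
flux to be invariants, and by making use of the relativistic contraction of length, one
can extend the validity of the relation

$$\mathbf{F}_e = q\,\frac{\mathbf{D}_0}{\epsilon_0} \tag{4.10}$$

to include the case that $\mathbf{D}_0$ is the flux density due to a charge in constant translatory
motion.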

FIGURE 4.4 Interaction of electric field and charge in relative motion. One panel shows a moving charge in a stationary field; the other shows a stationary charge.

The flux distribution as visualized by the two observers is illustrated in Figure 4.4.
For observer O' the field is stationary and spherically symmetrical; the charge $q'$ is
moving through this field at the constant velocity $-\mathbf{1}_x u$. For observer O the charge $q$
is stationary and the field is moving past it at the constant velocity $\mathbf{1}_x u$; the field
exhibits some longitudinal compression due to its motion. For both observers the force
is time-varying. For observer O' this is because $q'$ keeps moving into new regions of
different static field intensity; for observer O it is because, at the static position occupied by $q$, the field intensity keeps changing with time.


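Since measurements of distance by the two observers can be connected through the
relation

$$\xi' = [\kappa^2 (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2]^{1/2}$$

it follows that (4.8), (4.9), and (4.10) can be combined to give

$$\mathbf{F}_e = \frac{\kappa\, q q_1}{4\pi\epsilon_0}\,\frac{\mathbf{1}_x (x - x_1) + \mathbf{1}_y (y - y_1) + \mathbf{1}_z (z - z_1)}{[\kappa^2 (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2]^{3/2}} \tag{4.11}$$

in which $\mathbf{F}_e$ has been expressed entirely in terms of quantities measured in XYZ.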
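
The postulate that total electric flux is invariant can be checked against the compressed field pattern seen by O. Combining (4.6), (4.8), (4.9) and the distance relation above gives a flux density proportional to $\kappa q_1$ times the separation vector, divided by $4\pi[\kappa^2 (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2]^{3/2}$; this explicit form is written out here as an assumption of the sketch. The code integrates the outward flux of that field over a sphere centered on the instantaneous position of $q_1$. The function name, the ratio `beta = u/c`, and the step count are hypothetical choices.

```python
import math


def flux_through_sphere(beta, q1=1.0, radius=1.0, steps=20000):
    """Outward flux of the moving-charge flux density through a sphere around q1.

    beta is u/c. The polar angle theta is measured from the X axis (the direction
    of motion), so the field is symmetric about that axis.
    """
    kappa = 1.0 / math.sqrt(1.0 - beta * beta)
    dtheta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        dx = radius * math.cos(theta)          # longitudinal offset x - x1
        rho = radius * math.sin(theta)         # transverse offset
        denom = (kappa ** 2 * dx ** 2 + rho ** 2) ** 1.5
        # The field is radial, so its normal component is kappa*q1*radius/(4*pi*denom).
        d_normal = kappa * q1 * radius / (4.0 * math.pi * denom)
        ring_area = 2.0 * math.pi * radius ** 2 * math.sin(theta) * dtheta
        total += d_normal * ring_area
    return total


if __name__ == "__main__":
    for beta in (0.0, 0.5, 0.9):
        print(beta, flux_through_sphere(beta))
```

Although the flux lines crowd toward the transverse plane as the speed increases, the integrated flux remains equal to the charge, in keeping with the invariance assumption.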

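Case 2: $\mathbf{v} \neq 0$.

Under this condition, in X'Y'Z', $q_1'$ is at rest and $q'$ has the general velocity $\mathbf{v}'(t')$.
In XYZ, $q_1$ has the constant velocity $\mathbf{v}_1 = \mathbf{1}_x u$ and $q$ is no longer at rest, but rather has
the general velocity $\mathbf{v}(t)$. The force exerted by $q_1$ on $q$, as determined by O, is now given
by $\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m$ in which $\mathbf{F}_e$ has the same value as in Case 1 and, from (4.5),

$$\mathbf{F}_m = \kappa\,\frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \mathbf{F}'\right) \tag{4.12}$$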

If the idea is retained that charge and total electric flux are invariants, then it
is still true that $\mathbf{F}_e = q\mathbf{D}_0/\epsilon_0$, but the total force $\mathbf{F}$ on $q$ is no longer equal to $\mathbf{F}_e$. One
could discard the assumption of invariance of charge and flux and require that both
vary in such a way that the transformation linking $\mathbf{D}_0'$ and $\mathbf{D}_0$ yields the relation
$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m = q\mathbf{D}_0/\epsilon_0$. But it is apparent from a study of (4.12) that this would
require that the flux field due to an electric system (the rigidly translating charge $q_1$)
would depend on the motion of a test charge $q$ which was not a part of the system.
Such an unwieldy definition has no utility. Therefore, the postulate that charge
and electric flux are invariants will be adopted generally and an additional
vector field will be introduced to account for the force $\mathbf{F}_m$. The reader has perhaps
already surmised that this will be the magnetic field.
In summation, if $q_1$ is moving in XYZ at constant velocity, the force exerted by $q_1$
on the arbitrarily moving charge $q$ can be expressed as

$$\mathbf{F} = q\,\frac{\mathbf{D}_0}{\epsilon_0} + \kappa\,\frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \frac{q\mathbf{D}_0'}{\epsilon_0}\right) \tag{4.13}$$

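wherein use has been made of (4.7). Since $\mathbf{v}_1$ is X directed, this equation may be converted into a form containing only XYZ quantities through utilization of (4.8), with
the result

$$\mathbf{F} = q\left[\frac{\mathbf{D}_0}{\epsilon_0} + \frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \frac{\mathbf{D}_0}{\epsilon_0}\right)\right] \tag{4.14}$$

in which the nature of the field $\mathbf{D}_0$ associated with the moving charge $q_1$ is precisely as
described in Case 1 and pictured in Figure 4.4.
Let a new vector field $\mathbf{B}(x,y,z,t)$ be defined by the relation

$$\mathbf{B} = \frac{\mathbf{v}_1 \times \mathbf{D}_0}{\epsilon_0 c^2} \tag{4.15}$$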

in which it is noted that $\mathbf{B}$ is a function of time (by virtue of the fact that $\mathbf{D}_0$ is time-varying) and depends on the source system (the moving charge $q_1$) but not on the charge
$q$. $\mathbf{B}$ is called the magnetic flux density function and has the units of webers per square
meter. A weber is 1 volt-sec and these units will take on more meaning in Chapter 5.
Substitution of (4.15) in (4.14) gives

$$\mathbf{F} = q[\mathbf{E} + \mathbf{v} \times \mathbf{B}] \tag{4.16}$$

with $\mathbf{E} = \mathbf{D}_0/\epsilon_0$. This important equation is known as the Lorentz force law. So far, it has been formulated only for the simplest source system of a single charge $q_1$ in constant translational
motion, exerting a force $\mathbf{F}$ on an arbitrarily moving charge $q$. Now a generalization
of this result will be undertaken.
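
The algebra connecting the transformed Coulomb force of (4.3) through (4.5) with the Lorentz form (4.16) can be checked numerically. The sketch below computes the force on $q$ in both ways for one arbitrary configuration. It assumes that (4.15) defines $\mathbf{B}$ as $\mathbf{v}_1 \times \mathbf{D}_0/(\epsilon_0 c^2)$, which is the form implied by (4.13) and (4.16), and that $\mathbf{E} = \mathbf{D}_0/\epsilon_0$. The function names and the sample values are hypothetical.

```python
import math

C = 299_792_458.0          # speed of light, m/s
EPS0 = 8.8541878128e-12    # permittivity of free space, F/m


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _primed_separation(u, offset):
    """Separation seen by O' for offset = (x - x1, y - y1, z - z1) measured in XYZ."""
    kappa = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    xi_p = (kappa * offset[0], offset[1], offset[2])
    return kappa, xi_p, math.sqrt(sum(c * c for c in xi_p))


def force_from_transformation(q, q1, u, offset, v):
    """Force on q from (4.3)-(4.5), starting with the Coulomb force seen by O'."""
    kappa, xi_p, dist_p = _primed_separation(u, offset)
    coef = q * q1 / (4.0 * math.pi * EPS0 * dist_p ** 3)
    fpx, fpy, fpz = (coef * c for c in xi_p)                   # F' of (4.1)
    fe = (fpx, kappa * fpy, kappa * fpz)                       # (4.4)
    k = kappa * u / C ** 2
    fm = (k * (v[1] * fpy + v[2] * fpz), -k * v[0] * fpy, -k * v[0] * fpz)  # (4.5)
    return tuple(a + b for a, b in zip(fe, fm))


def force_from_fields(q, q1, u, offset, v):
    """Force on q from q(E + v x B), with B = v1 x D0 / (eps0 c^2) (assumed form)."""
    kappa, xi_p, dist_p = _primed_separation(u, offset)
    coef = q1 / (4.0 * math.pi * dist_p ** 3)
    d0 = (coef * xi_p[0], kappa * coef * xi_p[1], kappa * coef * xi_p[2])  # (4.9), (4.8)
    e = tuple(c / EPS0 for c in d0)
    b = tuple(c / (EPS0 * C ** 2) for c in cross((u, 0.0, 0.0), d0))
    v_cross_b = cross(v, b)
    return tuple(q * (ei + wi) for ei, wi in zip(e, v_cross_b))


if __name__ == "__main__":
    args = (1e-9, 2e-9, 0.6 * C, (0.3, -0.2, 0.5), (0.1 * C, 0.2 * C, -0.3 * C))
    print(force_from_transformation(*args))
    print(force_from_fields(*args))
```

The two routes give the same force components, which is the content of the step from (4.5) to (4.16).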

4.3

THE FIELDS DUE TO A CLOSED CIRCULATING CHARGE SYSTEM

If there is a system of charges $q_1 \ldots q_N$ at rest in X'Y'Z', instead of the single charge
$q_1$, the charges of this system will have rigid translatory motion through XYZ at the
common velocity $\mathbf{1}_x u$. By use of the principle of linear superposition, the total field $\mathbf{D}_0'$
due to this charge system can be found by the methods of Chapter 3, and the total field
$\mathbf{D}_0$ can then be found with the aid of Equations (4.8) and (4.9). An observer at rest in
XYZ can determine a magnetic field $\mathbf{B}(x,y,z,t)$ due to the moving charge system by
using (4.15) and can then compute the force on a test charge $q$ moving at a velocity $\mathbf{v}$
by employing (4.16). In other words, the results of the previous section are applicable
to a system of rigidly translating charges as well as to the single translating charge $q_1$.
Admittedly, a rigidly translating system of charges is not a physically realizable
source; however, it may be used as a constituent part of any real system of charges
and currents. As an illustration, consider the closed system of circulating charges shown
in Figure 4.5. It is assumed that the motion of these charges is such that the amount of
charge in a given volume element is always the same, albeit the identity of the charge
keeps changing. It is further assumed that the charge velocity associated with a particular volume element is unchanging, so that the flow can be characterized by a static
volume charge distribution $\rho(\xi,\eta,\zeta)$ and by a static velocity distribution $\mathbf{v}_1(\xi,\eta,\zeta)$. This
circulating stream of charge constitutes a time-independent current density $\boldsymbol{\iota}(\xi,\eta,\zeta) = \rho\mathbf{v}_1$.
(Cf. Example 3.16.) If a separate test charge q is instantaneously at the point (x,y,z)
with a velocity v(t), the force F which the circulating charge system exerts on q can be
found as follows:
The circulating charge system in XYZ can be shown to be equivalent to a linear
superposition of static charge distributions in all other Lorentzian frames X'Y'Z'
which move at a variety of constant velocities u with respect to XYZ. (See Appendix E.) Therefore, by superposition, the results of the previous section are extendable
to this source system. The force on $q$ can then be thought of as being composed of differential contributions due to each source element in the circulating charge system.
Specifically, the charge $\rho(\xi,\eta,\zeta)\,d\xi\,d\eta\,d\zeta$, which is moving at the velocity $\mathbf{v}_1(\xi,\eta,\zeta)$, can
be said to exert a force on $q$ given by

$$d\mathbf{F} = q(d\mathbf{E} + \mathbf{v} \times d\mathbf{B}) \tag{4.17}$$

in which

$$d\mathbf{E} = \frac{d\mathbf{D}_0}{\epsilon_0} \qquad d\mathbf{B} = \frac{\mathbf{v}_1 \times d\mathbf{D}_0}{c^2\epsilon_0} \tag{4.18}$$


The electric flux field $d\mathbf{D}_0$ at (x,y,z), due to the moving charge $\rho\,dV$, would be time-varying except that other charges keep moving into $dV$ and assuming the velocity $\mathbf{v}_1$,
thus assuring a steady contribution to the field at (x,y,z).

FIGURE 4.5 Force due to a circulating charge system.

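All of the circulating charges can be taken into account by integrating (4.17) to
obtain

$$\mathbf{F} = q[\mathbf{E} + \mathbf{v} \times \mathbf{B}] \tag{4.19}$$

which is a generalization of the Lorentz force law (4.16), based on the principle of
superposition. The fields contained in (4.19) are given by

$$\mathbf{E}(x,y,z) = \int_V \frac{d\mathbf{D}_0}{\epsilon_0} \tag{4.20}$$

$$\mathbf{B}(x,y,z) = \int_V \frac{\mathbf{v}_1 \times d\mathbf{D}_0}{c^2\epsilon_0} \tag{4.21}$$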

in which V is the volume occupied by the circulating charge system. These fields are
static because each elemental electric flux field dD o is time-independent. This fact


can be appreciated by still another argument. Imagine that the charge q is at a particular point (x,Y,z) with a particular velocity v. If, at a later time, q is once again at
this same point with the same velocity, it will experience the same force as before
because the state of the circulating charge system is unchanged. Thus the force F in
(4.19) is a function of time only because of the motion of q, and not because of any time
variance of E and B.
Since charge and electric flux have been defined as invariants, it follows that Gauss'
law is applicable to this situation so that

$$\nabla \cdot \mathbf{D}_0 = \rho \tag{4.22}$$

Also, since $\rho(\xi,\eta,\zeta)$ is static (because of the nature of the circulating charge system under
discussion), it further follows that

$$\mathbf{D}_0(x,y,z) = \int_V \frac{\rho\,\boldsymbol{\ell}\,dV}{4\pi\ell^3} \tag{4.23}$$

with $\boldsymbol{\ell}$ drawn from $(\xi,\eta,\zeta)$ to (x,y,z). One may conclude that the closed circulating
system of charges gives rise to a static electric field which does not differ from what
would occur if the charges were at rest. This should be contrasted with the case of the
electric field associated with a single charge, which was found to depend on the charge's
motion.
It follows further that, since the volume V is arbitrary,

$$d\mathbf{D}_0 = \frac{\rho\,\boldsymbol{\ell}\,dV}{4\pi\ell^3} \tag{4.24}$$

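Returning to (4.20) and (4.21), one can write the static fields in the forms

$$\mathbf{E}(x,y,z) = \int_V \frac{\rho\,\boldsymbol{\ell}\,dV}{4\pi\epsilon_0\ell^3} \tag{4.25}$$

$$\mathbf{B}(x,y,z) = \int_V \frac{\rho\,\mathbf{v}_1 \times \boldsymbol{\ell}\,dV}{4\pi c^2\epsilon_0\ell^3} \tag{4.26}$$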

If the source system is specified, these integrals may be evaluated and the results
inserted in (4.19) to obtain the force on q.

4.4 THE BIOT-SAVART LAW

Inspection of Equations (4.19), (4.25), and (4.26) reveals that if $v_1/c$ and $v/c$ are much
less than unity, which is usually the case, then $|\mathbf{v} \times \mathbf{B}| \ll E$, and a good approximation
to the Lorentz force on $q$ is to ignore the B field altogether. This may be a valid conclusion for the system of uncompensated circulating charges discussed in Section 4.3.
However, imagine now that an additional system of charges with distribution $-\rho(\xi,\eta,\zeta)$
is superimposed on the first, but that the individual charges of the second system do not
move. Then each moving charge $q_1$ of the circulating system finds itself alongside a
charge $-q_1$ of the noncirculating system. The charge pair $[q_1, -q_1]$ exerts equal and
opposite coulomb forces on $q$ but only the moving charge $q_1$ contributes to the magnetostatic force $\mathbf{F}_m$ acting on $q$. Under these conditions the net force on $q$ is

$$\mathbf{F}_m = q\mathbf{v} \times \mathbf{B} \tag{4.27}$$

with B given by (4.26). In this case, ignoring B is tantamount to ignoring the entire
force on $q$.
With the two systems of sources superimposed in this manner, one circulating and
the other not, every volume element $dV$ is electrically neutral. This situation describes
conditions which prevail inside conductors through which steady currents are flowing.
The drift velocities of the electrons have the distribution $\mathbf{v}_1(\xi,\eta,\zeta)$. The moving electronic charge $\rho\,dV$ is compensated by the stationary ionic charge $-\rho\,dV$. A time-independent current density $\boldsymbol{\iota} = \rho\mathbf{v}_1$ amp/m$^2$ flows through the volume element $dV$
(cf. Example 3.16). Although the individual electrons have drift velocities so low that
$v_1/c$ may be as small as $10^{-10}$, there are usually so many electrons participating in the
current flow inside conductors that the calculation of B from (4.26) yields a value
which is often not insignificant.
Let a substitution constant $\mu_0$, called the permeability of free space, be defined by the
relation

$$\mu_0^{-1} = c^2\epsilon_0 \tag{4.28}$$

In MKS rationalized units, $\mu_0 = 4\pi \times 10^{-7}$ henries/m; this unit will become more
meaningful when the circuit concept of inductance is introduced. With this substitution,
(4.26) can be written

$$\mathbf{B}(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta) \times \boldsymbol{\ell}\,dV}{4\pi\mu_0^{-1}\ell^3} \tag{4.29}$$

This is the Biot-Savart law and permits computation of the B field arising from any
distribution of steady currents. Equation (4.27) can then be used to find the force
which this field exerts on a charge q moving at a velocity v. Because the magnetic field B
due to the system of steady currents is time-independent, this subject is given the name
magnetostatics.
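EXAMPLE 4.1

As an illustration of the use of (4.29), consider the case of a long straight wire in free space,
extending along the Z axis from $-z_1$ to $+z_1$ and carrying a time-independent current I.
Then $\boldsymbol{\iota}\,dV = \boldsymbol{\iota}A\,dl = I\,d\mathbf{l} = \mathbf{1}_z I\,d\zeta$
is a current element, with A the small cross-sectional
area of the wire. The magnetic flux density at a point $(r,\phi,0)$ in cylindrical coordinates will
be

$$\mathbf{B}(r,\phi,0) = \int_{-z_1}^{z_1} \frac{\mathbf{1}_z I\,d\zeta \times \boldsymbol{\ell}}{4\pi\mu_0^{-1}\ell^3} = \mathbf{1}_\phi \int_{-z_1}^{z_1} \frac{I r\,d\zeta}{4\pi\mu_0^{-1}(r^2 + \zeta^2)^{3/2}}$$

in which the current element is situated at the source point $(0,0,\zeta)$, as shown in the figure.
The above expression integrates readily to give

$$\mathbf{B}(r,\phi,0) = \mathbf{1}_\phi \frac{I}{2\pi\mu_0^{-1} r}\,\frac{z_1}{(r^2 + z_1^2)^{1/2}}$$

[Figure: straight wire along the Z axis with source point $(0,0,\zeta)$ and field point $P(r,\phi,0)$]

For points not too near the ends of the wire, and not too far removed from the wire, so that
$r \ll z_1$,

$$\mathbf{B} = \mathbf{1}_\phi \frac{I}{2\pi\mu_0^{-1} r}$$

In such regions the magnetic flux density can be mapped as a system of concentric circular
lines which thin out with distance from the wire as $r^{-1}$.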
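
The Biot-Savart integral for the finite straight wire can be checked numerically. The Python sketch below is a minimal illustration under stated assumptions: the function names, the midpoint-rule integration, and the sample dimensions are all hypothetical choices, and the closed form used for comparison is the standard finite-wire result $B_\phi = \mu_0 I z_1 / [2\pi r (r^2 + z_1^2)^{1/2}]$ evaluated in the midplane.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, henries/m


def b_phi_numeric(current, r, z1, n=20000):
    """Midpoint-rule Biot-Savart integral over source points (0, 0, zeta), -z1 <= zeta <= z1."""
    dzeta = 2.0 * z1 / n
    total = 0.0
    for k in range(n):
        zeta = -z1 + (k + 0.5) * dzeta
        total += r * dzeta / (r * r + zeta * zeta) ** 1.5
    return MU0 * current * total / (4.0 * math.pi)


def b_phi_closed(current, r, z1):
    """Closed-form field of the finite wire at the midplane point (r, phi, 0)."""
    return MU0 * current * z1 / (2.0 * math.pi * r * math.sqrt(r * r + z1 * z1))


def b_phi_long(current, r):
    """Long-wire limit, valid when r is much smaller than z1."""
    return MU0 * current / (2.0 * math.pi * r)


print(b_phi_numeric(1.0, 0.1, 1.0), b_phi_closed(1.0, 0.1, 1.0), b_phi_long(1.0, 0.1))
```

For the sample point the three values differ only slightly, which is the sense in which the long-wire formula is a good approximation near the middle of the wire.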
EXAMPLE 4.2

A freely moving charge $q$ enters a region in which a steady magnetic field exists, being
described by the equation $\mathbf{B} = \mathbf{1}_z B_0$, with $B_0$ independent of spatial coordinates as well as
time. If the entering velocity is the constant $\mathbf{v}_0 = \mathbf{1}_x v_{0x} + \mathbf{1}_y v_{0y} + \mathbf{1}_z v_{0z}$, find the subsequent motion of the charge.
The equation of motion is given by

$$\mathbf{F}_m = q\mathbf{v} \times \mathbf{B} = \frac{d}{dt}(m\mathbf{v})$$

If the velocity of the charge is never so great that its mass m need be considered relativistically, this equation can be broken down into the components

$$q v_y B_0 = m\dot{v}_x \qquad -q v_x B_0 = m\dot{v}_y \qquad 0 = m\dot{v}_z$$

These equations integrate to give

$$v_x = v_{0x}\cos\omega_b t + v_{0y}\sin\omega_b t$$
$$v_y = v_{0y}\cos\omega_b t - v_{0x}\sin\omega_b t$$
$$v_z = v_{0z}$$

in which $\omega_b = qB_0/m$ is known as the cyclotron frequency. The charge thus follows a helical
path parallel to the Z axis, the radius of the helix being

$$r_0 = \frac{(v_{0x}^2 + v_{0y}^2)^{1/2}}{\omega_b}$$

One interesting aspect of this solution is that what would otherwise be the lateral drift of the
charge has been converted to a circular motion, which can be very tight if B is large. Further, if $(x_0,y_0,z_0)$ is a point in the trajectory, then so too is the point $(x_0, y_0, z_0 + 2\pi v_{0z}/\omega_b)$.
Thus, if a group of charges is injected at the point $(x_0,y_0,z_0)$, with random initial transverse
velocities but a common $v_{0z}$, they will all come to a "focus" at $2\pi v_{0z}/\omega_b$ units of distance
further along the Z axis. This principle is used in the design of many electron devices, including some cathode ray tubes.
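
The helical solution of Example 4.2 can be spot-checked in a few lines of Python. This is only a sketch: the function names and the dimensionless sample values (q = 1, m = 1, $B_0$ = 2) are assumptions chosen for easy arithmetic, and a finite difference stands in for the time derivative.

```python
import math


def velocity(t, v0, q, b0, m):
    """Velocity components at time t for a charge in the uniform field B = 1_z * b0."""
    w = q * b0 / m  # cyclotron frequency
    v0x, v0y, v0z = v0
    return (v0x * math.cos(w * t) + v0y * math.sin(w * t),
            v0y * math.cos(w * t) - v0x * math.sin(w * t),
            v0z)


def helix_radius(v0, q, b0, m):
    """Radius of the helix: transverse speed divided by the cyclotron frequency."""
    return math.hypot(v0[0], v0[1]) / abs(q * b0 / m)


def focus_distance(v0z, q, b0, m):
    """Axial distance travelled during one cyclotron period."""
    return 2.0 * math.pi * v0z / abs(q * b0 / m)


v0 = (3.0, 4.0, 5.0)
print(helix_radius(v0, 1.0, 2.0, 1.0), focus_distance(5.0, 1.0, 2.0, 1.0))

# Finite-difference check of the X equation of motion: m dv_x/dt = q v_y B0.
t, h = 0.3, 1e-6
dvx_dt = (velocity(t + h, v0, 1.0, 2.0, 1.0)[0] - velocity(t - h, v0, 1.0, 2.0, 1.0)[0]) / (2 * h)
print(dvx_dt, 2.0 * velocity(t, v0, 1.0, 2.0, 1.0)[1])
```

The two numbers printed on the last line agree closely, and after one full period the velocity returns to its entering value, which is the basis of the focusing property described above.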

EXAMPLE 4.3

In 1879 Dr. Edwin Hall of Harvard observed that when a conductor carrying a steady
current is placed transverse to a magnetic field, as indicated in the figure, a transverse
charge separation occurs in the conductor. This phenomenon, called the Hall effect, has
proved useful in the determination of charge densities in materials, including semiconductors. It can be explained by the following argument:

[Figure: conductor of rectangular cross section with current along X and magnetic field along Y]

Let the magnetic field be locally uniform and given by $\mathbf{B} = \mathbf{1}_y B_y$. Let the current flow in
a conductor of rectangular cross section, and let it have the uniform value $\boldsymbol{\iota} = \mathbf{1}_x \iota_x$. Then
the conduction electrons have an average drift velocity $\mathbf{v} = \mathbf{1}_x v_x$ and

$$\iota_x = \rho v_x = -nev_x$$

in which $-e$ is the electronic charge and $n$ is the volume density of conduction electrons.
These electrons experience an average force given by

$$\mathbf{f} = -e\mathbf{v} \times \mathbf{B} = -\mathbf{1}_z e v_x B_y$$

This force causes a charge separation in the Z direction until oppositely charged layers are
built up on the top and bottom faces of just the proper value to cause a compensating force.
If $E_H$ is the electric field caused by this charge separation, then

$$0 = -eE_H - e v_x B_y = -eE_H + \frac{\iota_x B_y}{n}$$

so that the free charge per unit volume is

$$ne = \frac{\iota_x B_y}{E_H}$$

Since all the quantities on the right side of this equation can be measured, the number of
free electrons per atom can be determined for various metals in this manner. If the technique
is applied to a p-type semiconductor, the Hall field $E_H$ is reversed, indicating that the current
is caused predominantly by positive carriers.
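
The Hall relation $ne = \iota_x B_y / E_H$ lends itself to a short numerical sketch. In the Python fragment below the function names are hypothetical, and the sample carrier density, current density, and field are assumed round numbers of the order met in metallic conductors rather than measured data.

```python
E_CHARGE = 1.602176634e-19  # magnitude of the electronic charge, coulombs


def hall_field(current_density, b_field, n):
    """Hall field E_H (V/m) for current density (A/m^2), flux density (Wb/m^2), density n (1/m^3)."""
    return current_density * b_field / (n * E_CHARGE)


def carrier_density(current_density, b_field, e_hall):
    """Invert the Hall relation: n = iota_x * B_y / (e * E_H)."""
    return current_density * b_field / (E_CHARGE * e_hall)


n_assumed = 8.5e28  # assumed carrier density, 1/m^3
e_h = hall_field(1.0e6, 1.0, n_assumed)
print(e_h, carrier_density(1.0e6, 1.0, e_h))
```

Because the Hall field is inversely proportional to n, materials with few carriers give a much larger, more easily measured Hall field for the same current density and flux density.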

4.5 THE MAGNETIC FIELD INTENSITY


In Chapter 3, when using Coulomb's law, it was convenient to introduce the concept
of an electric field by splitting the force expression into two factors, namely,

$$\mathbf{F}_e = q\mathbf{E} = q\int_V \frac{\rho\,\boldsymbol{\ell}\,dV}{4\pi\epsilon_0\ell^3} \tag{4.30}$$

Similarly, it has proved convenient, when discussing the force on a moving charge, to
introduce the concept of a magnetic field by writing

$$\mathbf{F}_m = q\mathbf{v} \times \mathbf{B} = q\mathbf{v} \times \int_V \frac{\boldsymbol{\iota}\,dV \times \boldsymbol{\ell}}{4\pi\mu_0^{-1}\ell^3} \tag{4.31}$$

The forms of these two integrals suggest a certain analogy between B and E, with
$\rho\,dV$ being the sources for E and $\boldsymbol{\iota}\,dV$ being the sources for B. The comparison between
magnetostatics and electrostatics is further heightened by the introduction of a new
vector function $\mathbf{H}_0$, analogous to $\mathbf{D}_0$, defined by the relation

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} \tag{4.32}$$

in which the zero subscript is a reminder that the discussion so far excludes magnetic
materials. $\mathbf{H}_0$ is called the magnetic intensity, and when (4.32) is combined with the
Biot-Savart law one finds that

$$\mathbf{H}_0(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta) \times \boldsymbol{\ell}\,dV}{4\pi\ell^3} \tag{4.33}$$

Thus the units of $\mathbf{H}_0$ are amperes/m.


The manner in which E and B enter the Lorentz force law, with E acting on a charge
element $q$ to give a force, and B acting on a current element $q\mathbf{v}$ to give a force; the similar manner in which E and B are related to the sources $\rho\,dV$ and $\boldsymbol{\iota}\,dV$; the similarity
between the defining relations $\mathbf{D}_0 = \epsilon_0\mathbf{E}$ and $\mathbf{H}_0 = \mu_0^{-1}\mathbf{B}$: all serve to point up the fact
that B plays a role similar to E and that $\mathbf{H}_0$ is analogous to $\mathbf{D}_0$. It is unfortunate that
this point was not fully appreciated during the early evolvement of electrical theory,
since awareness of the analogy would be enhanced by use of the reciprocal of $\mu_0$ rather
than $\mu_0$ itself. In this text $\mu_0^{-1}$ shall be used wherever convenient in order to emphasize
this duality.
The value of introducing $\mathbf{H}_0$ will begin to emerge shortly when Ampere's circuital
law (which is analogous to Gauss' law) is established. The principal utility of introducing D and H arises when dielectric and permeable materials are discussed (cf. Chapters 6 and 7).

4.6 THE FORCE BETWEEN CURRENTS


If the solitary moving charge $q$ is replaced by a volume charge element $\rho_a\,dV_a$ so that

$$q\mathbf{v} \rightarrow \rho_a\mathbf{v}_a\,dV_a = \boldsymbol{\iota}_a\,dV_a$$

with $\boldsymbol{\iota}_a\,dV_a$ a current element which is not necessarily time-independent, Equation
(4.27) becomes

$$d\mathbf{F}_m = \boldsymbol{\iota}_a \times \mathbf{B}\,dV_a \tag{4.34}$$

Equation (4.34) gives the elemental force on a general current element $\boldsymbol{\iota}_a\,dV_a$ due to a
steady magnetic field $\mathbf{B}(x,y,z)$. This field is caused by a distribution of steady current
elements and is deducible from the Biot-Savart law. Equation (4.34) is a differential
form of what is often called Ampere's force law.
The total force on all the current elements in the circuit of which $\boldsymbol{\iota}_a\,dV_a$ is a part
can be written

$$\mathbf{F}_m = \int_{V_a} \boldsymbol{\iota}_a \times \left[\int_{V_b} \frac{\boldsymbol{\iota}_b \times \boldsymbol{\ell}\,dV_b}{4\pi\mu_0^{-1}\ell^3}\right] dV_a \tag{4.35}$$

in which $\boldsymbol{\iota}_b$ is the steady current distribution giving rise to the field B and $\boldsymbol{\ell}$ is drawn
from $dV_b$ to $dV_a$. The two volumes $V_a$ and $V_b$ may overlap. For example, a closed
circuit of steady current can exert a magnetic force on itself.
EXAMPLE 4.4

A simple case of magnetic interaction of some importance involves two long thin straight
parallel wires carrying steady currents $I_1$ and $I_2$ and separated a distance d. This situation
can be approximated by assuming the wires to be vanishingly thin and infinitely long. Then
(4.34) gives

$$d\mathbf{F}_m = I_2\,d\mathbf{l}_2 \times \int \frac{I_1\,d\mathbf{l}_1 \times \boldsymbol{\ell}}{4\pi\mu_0^{-1}\ell^3}$$

as the force on a length $d\mathbf{l}_2$ of the second wire, due to all the current elements in the first
wire. (No current element in the second wire experiences a force due to any other current
element in the second wire because $d\mathbf{l}_2 \times \boldsymbol{\ell} \equiv 0$.)
No loss in generality arises from taking $d\mathbf{l}_2$ at the position $\zeta_2 = 0$, as shown in the figure,
and writing for $\boldsymbol{\ell}$ the relation

$$\boldsymbol{\ell} = \mathbf{1}_x d - \mathbf{1}_z \zeta_1$$

If by $\mathbf{f}$ one means the force per unit length on the second wire, then

$$\mathbf{f} = -\mathbf{1}_x \frac{I_1 I_2}{2\pi\mu_0^{-1} d} \tag{4.36}$$

Thus the force between the wires is attractive if the currents are in the same direction;
otherwise it is repulsive. This simple formula can be used to define the unit of current,
the ampere, in terms of a mechanical measurement of force.

[Figure: two parallel wires separated by a distance d along the X axis]

When (4.36) is considered in conjunction with Coulomb's law, each is seen to contain
two electrical quantities, either $q$ and $\epsilon_0$ or $I$ and $\mu_0^{-1}$. But $q$ and $I$ are related through a time
derivative and $\mu_0^{-1}$ and $\epsilon_0$ are related by (4.28). Thus in reality (4.36) and Coulomb's law
each contain the same two electrical quantities, and these two force laws taken together
permit the definition of all electrical quantities in terms of the units of mass, length, and
time, indicating that a fourth fundamental unit is unnecessary.
This is not to deny that a fourth fundamental unit is convenient nor to suggest that mass
is more fundamental than charge. One could equally well start with electricity instead of
gravitation and conclude by being able to define mass in terms of the units of charge,
length, and time.
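
The parallel-wire result of Example 4.4 is easily put into numbers. The following Python sketch is illustrative only: the function name and the sign convention (negative meaning attraction toward the first wire) are assumptions, and the formula used is the standard magnitude $\mu_0 I_1 I_2 / (2\pi d)$ for infinitely long thin wires.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, henries/m


def force_per_length(i1, i2, d):
    """Signed force per unit length on wire 2 (N/m); negative means attraction toward wire 1."""
    return -MU0 * i1 * i2 / (2.0 * math.pi * d)


# Two wires 1 m apart, each carrying 1 A in the same direction.
print(force_per_length(1.0, 1.0, 1.0))
# Opposite currents give a repulsive (positive) value.
print(force_per_length(1.0, -1.0, 1.0))
```

The first case gives a magnitude of $2 \times 10^{-7}$ newtons per meter, which is the mechanical measurement on which the classical definition of the ampere rested.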

4.7 THE TIME-INDEPENDENT MAGNETIC VECTOR POTENTIAL FUNCTION

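If $\boldsymbol{\iota}(\xi,\eta,\zeta)$ and $\boldsymbol{\ell}$ are other than very simple functions, the evaluation of B from (4.29)
can be extremely difficult. This same situation was encountered with the electric field
in Chapter 3 and was eased by the introduction of the scalar electric potential function,
whose gradient gave $-\mathbf{E}$. By analogy one is led to wonder whether B can be expressed
alternatively as a vector derivative of a potential function. That this is possible can be
demonstrated by the following argument:
Since $\boldsymbol{\ell} = \mathbf{1}_x(x - \xi) + \mathbf{1}_y(y - \eta) + \mathbf{1}_z(z - \zeta)$, the relation

$$\nabla_F\left(\frac{1}{\ell}\right) = -\frac{\boldsymbol{\ell}}{\ell^3} = -\nabla_S\left(\frac{1}{\ell}\right) \tag{4.37}$$

can be used to rewrite (4.29) in the form

$$\mathbf{B} = -\frac{1}{4\pi\mu_0^{-1}}\int_V \boldsymbol{\iota} \times \nabla_F\left(\frac{1}{\ell}\right) dV \tag{4.38}$$

In (4.37)

$$\nabla_F = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z} \qquad \text{and} \qquad \nabla_S = \mathbf{1}_x\frac{\partial}{\partial \xi} + \mathbf{1}_y\frac{\partial}{\partial \eta} + \mathbf{1}_z\frac{\partial}{\partial \zeta}$$

are the gradient operators with respect to the coordinates of the field point and the source point
respectively.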
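The vector identity (V.109) can be utilized to obtain

$$\nabla_F \times \left(\frac{\boldsymbol{\iota}}{\ell}\right) = \frac{1}{\ell}\,\nabla_F \times \boldsymbol{\iota} + \nabla_F\left(\frac{1}{\ell}\right) \times \boldsymbol{\iota}$$

But $\nabla_F \times \boldsymbol{\iota} \equiv 0$ because $\boldsymbol{\iota}$ is a function only of the source variables $\xi$, $\eta$, $\zeta$. Thus (4.38)
can be written

$$\mathbf{B} = \frac{1}{4\pi\mu_0^{-1}}\int_V \nabla_F \times \left(\frac{\boldsymbol{\iota}}{\ell}\right) dV$$

However, since the limits of integration are also independent of the field point P(x,y,z),
the order of integration and differentiation may be inverted to give

$$\mathbf{B} = \nabla_F \times \int_V \frac{\boldsymbol{\iota}\,dV}{4\pi\mu_0^{-1}\ell}$$

Therefore, it is convenient to define a magnetic vector potential function A by the
expression

$$\mathbf{A}(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,dV}{4\pi\mu_0^{-1}\ell} \tag{4.39}$$

from which

$$\mathbf{B} = \nabla \times \mathbf{A} \tag{4.40}$$

In almost every case it is simpler to compute A first and take its curl to find B rather
than to compute B directly from (4.29).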


One important consequence of (4.40) is the fact that

$$\nabla \cdot \mathbf{B} \equiv 0 \tag{4.41}$$

This follows from the vector identity $\nabla \cdot \nabla \times \mathbf{A} \equiv 0$ (cf. Mathematical Supplement,
Equation V.111). Because of the defining relation (4.32) it also follows that

$$\nabla \cdot \mathbf{H}_0 \equiv 0 \tag{4.42}$$

EXAMPLE 4.5

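For the long straight wire of Example 4.1, the magnetic vector potential function is simply

$$\mathbf{A} = \mathbf{1}_z \int_{-z_1}^{z_1} \frac{I\,d\zeta}{4\pi\mu_0^{-1}[r^2 + (z - \zeta)^2]^{1/2}}$$

which integrates to give

$$\mathbf{A} = \mathbf{1}_z \frac{I}{4\pi\mu_0^{-1}} \ln\frac{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}}$$

Taking the curl in cylindrical coordinates, and then inserting the point $P(r,\phi,0)$,
one obtains

$$\mathbf{B}(r,\phi,0) = \mathbf{1}_\phi \frac{I}{2\pi\mu_0^{-1} r}\,\frac{z_1}{(z_1^2 + r^2)^{1/2}}$$

in agreement with Example 4.1.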

EXAMPLE 4.6

Imagine a small circular loop of radius a carrying a steady uniform current I, as suggested
by the figure. Let localized spherical coordinates $(r,\theta,\phi)$ be set up with origin at the center
of the loop, and such that the loop lies in the $\theta = \pi/2$ plane. It is desired to find the magnetic field at a remote point $P(r,\theta,\phi)$ such that $r \gg a$.
Consider the current element $Ia\,d\beta$ situated $\beta$ deg beyond the $\phi$ plane. The contributions
to A of this element and its twin, which is $\beta$ deg in front of the $\phi$ plane, sum to only an $A_\phi$
component. By thus arranging all the current elements in pairs one can conclude that
$\mathbf{A}(r,\theta,\phi) = \mathbf{1}_\phi A_\phi(r,\theta,\phi)$ and the task has simplified to one of finding $A_\phi(r,\theta,\phi)$. By symmetry, $A_\phi$ is not a function of $\phi$, so there is no loss in generality resulting from placing
$P(r,\theta,\phi)$ in the YZ plane. Then

$$\mathbf{1}_\phi Ia\,d\beta = -\mathbf{1}_x Ia\cos\beta\,d\beta - \mathbf{1}_y Ia\sin\beta\,d\beta$$

and, from (4.39),

$$A_\phi(r,\theta) = 2\int_0^\pi \frac{Ia\cos\beta\,d\beta}{4\pi\mu_0^{-1}\ell}$$

in which

$$\ell = [(a\sin\beta)^2 + (r\sin\theta - a\cos\beta)^2 + (r\cos\theta)^2]^{1/2}$$

is the distance from the element $Ia\,d\beta$ to the point P. Since $r \gg a$,

$$\frac{1}{\ell} = \frac{1}{r}\left(1 + \frac{a}{r}\sin\theta\cos\beta - \frac{a^2}{2r^2} + \cdots\right)$$

If terms of higher order than $r^{-2}$ are neglected,

$$A_\phi(r,\theta) = \frac{Ia}{2\pi\mu_0^{-1} r}\int_0^\pi \left(1 + \frac{a}{r}\sin\theta\cos\beta\right)\cos\beta\,d\beta$$

Integration gives

$$A_\phi(r,\theta) = \frac{\pi a^2 I\sin\theta}{4\pi\mu_0^{-1} r^2} \tag{4.43}$$

[Figure: circular current loop in the XY plane with current element at angle $\beta$ and remote field point P]

It is useful to define what is known as the magnetic moment m of this small current
element. m is chosen to have a magnitude equal to the area of the loop times the current
and a direction perpendicular to the plane of the loop. Thus $m = \pi a^2 I$; the direction of m
obeys the right-hand rule, which means that if the fingers of the right hand are placed along
the loop in the direction of current flow, then the right thumb points in the direction of m.
With this definition, Equation (4.43) may be written

$$\mathbf{A} = \frac{\mathbf{m} \times \mathbf{r}}{4\pi\mu_0^{-1} r^3} \tag{4.44}$$

in which r is drawn from the center of the loop to the point P.
The use of (4.40) yields

$$\mathbf{B} = \frac{m}{4\pi\mu_0^{-1} r^3}(\mathbf{1}_r 2\cos\theta + \mathbf{1}_\theta\sin\theta) \tag{4.45}$$

Equation (4.45) has special significance since it is found to be in the same form as (3.29);
thus there is a duality between electric dipoles and small current loops.
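
How good the small-loop approximation (4.43) is can be judged by evaluating the integral for $A_\phi$ without expanding $1/\ell$. The Python sketch below does this with a midpoint rule; the function names and the sample geometry (a = 1 cm, r = 1 m) are illustrative assumptions.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, henries/m


def a_phi_numeric(current, a, r, theta, n=2000):
    """A_phi from the unexpanded integral over beta in (0, pi), field point in the YZ plane."""
    dbeta = math.pi / n
    total = 0.0
    for k in range(n):
        beta = (k + 0.5) * dbeta
        ell = math.sqrt((a * math.sin(beta)) ** 2
                        + (r * math.sin(theta) - a * math.cos(beta)) ** 2
                        + (r * math.cos(theta)) ** 2)
        total += math.cos(beta) / ell * dbeta
    return 2.0 * MU0 * current * a * total / (4.0 * math.pi)


def a_phi_dipole(current, a, r, theta):
    """Remote-point approximation (4.43), with magnetic moment m = pi * a^2 * I."""
    m = math.pi * a * a * current
    return MU0 * m * math.sin(theta) / (4.0 * math.pi * r * r)


exact = a_phi_numeric(1.0, 0.01, 1.0, math.pi / 3)
approx = a_phi_dipole(1.0, 0.01, 1.0, math.pi / 3)
print(exact, approx, abs(exact - approx) / approx)
```

For r a hundred times the loop radius the relative discrepancy is far below one part in a thousand, consistent with neglecting terms of higher order than $r^{-2}$.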
EXAMPLE 4.7

If the small circular loop of the previous example is immersed in a region of uniform magnetic field $\mathbf{B}_0$, it experiences a torque tending to align its magnetic moment with $\mathbf{B}_0$. This
effect can be appreciated by referring to the figure, in which the loop is seen edge on and the
uniform field is indicated by flux lines. The current in the loop is assumed to be coming out of
the paper on the left side and into the paper on the right, and therefore m is upward, as
shown.
Application of Equation (4.34) leads to the conclusion that the $\mathbf{B}_0$ field exerts forces on the
left-hand and right-hand current elements which are outward, causing a couple which tends
to rotate the loop so that m will be parallel to $\mathbf{B}_0$. A quantitative expression for this couple
can be derived as follows:
With no loss in generality, the uniform field may be assumed not to have an X component,
in which case it can be given by

$$\mathbf{B}_0 = B_0(\mathbf{1}_y\sin\theta_0 + \mathbf{1}_z\cos\theta_0)$$

with $\theta_0$ a constant polar angle measured from the Z axis.
A current element $I\,d\mathbf{l}$ situated at the latitudinal angle $\phi$ can be represented by

$$I\,d\mathbf{l} = Ia\,d\phi(-\mathbf{1}_x\sin\phi + \mathbf{1}_y\cos\phi)$$

and, according to (4.34), this current element experiences a force

$$d\mathbf{F}_m = IaB_0\,d\phi(-\mathbf{1}_x\sin\phi + \mathbf{1}_y\cos\phi) \times (\mathbf{1}_y\sin\theta_0 + \mathbf{1}_z\cos\theta_0)$$
$$= IaB_0\,d\phi(\mathbf{1}_x\cos\theta_0\cos\phi + \mathbf{1}_y\cos\theta_0\sin\phi - \mathbf{1}_z\sin\theta_0\sin\phi)$$

This force causes a torque around the center point of the loop (cf. Example V.9, Mathematics Supplement) given by

$$d\mathbf{T} = \mathbf{r} \times d\mathbf{F}_m$$

in which $\mathbf{r} = \mathbf{1}_x a\cos\phi + \mathbf{1}_y a\sin\phi$ is drawn from the center of the loop to the current
element. Therefore,

$$d\mathbf{T} = Ia^2B_0\,d\phi(-\mathbf{1}_x\sin\theta_0\sin^2\phi + \mathbf{1}_y\sin\theta_0\sin\phi\cos\phi)$$

The total torque about the center point of the loop can be determined by integration:

$$\mathbf{T} = \int_0^{2\pi} d\mathbf{T} = -\mathbf{1}_x Ia^2B_0\sin\theta_0\int_0^{2\pi}\sin^2\phi\,d\phi = -\mathbf{1}_x(\pi a^2 I)B_0\sin\theta_0 = \mathbf{m} \times \mathbf{B}_0 \tag{4.46}$$

Equation (4.46) indicates that the equilibrium position for the current loop occurs when its
magnetic moment is aligned with the field. If the loop is rotated from this equilibrium
alignment, its potential energy is increased. The energy which must be supplied to rotate
the loop from an initial angle $\theta_1$ to a final angle $\theta_2$ is given by

$$U = \int_{\theta_1}^{\theta_2} T\,d\theta = mB_0\int_{\theta_1}^{\theta_2}\sin(\theta - \theta_0)\,d\theta = -mB_0[\cos(\theta_2 - \theta_0) - \cos(\theta_1 - \theta_0)]$$

If the zero reference level for the potential energy U is taken as occurring when the magnetic moment is transverse to the $\mathbf{B}_0$ field ($\theta_1 = \theta_0 + \pi/2$), then

$$U = -mB_0\cos(\theta_2 - \theta_0) = -\mathbf{m} \cdot \mathbf{B}_0 \tag{4.47}$$

in which the final direction of m is used in the dot product in (4.47).
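
The torque on the loop can also be accumulated element by element and compared with $\mathbf{m} \times \mathbf{B}_0$. The Python sketch below is a minimal check under assumed sample values; the helper names are hypothetical, and the loop is taken to lie in the XY plane with m along Z and the field in the YZ plane at polar angle $\theta_0$.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def uniform_field(b0, theta0):
    """Uniform field with no X component, at polar angle theta0 from the Z axis."""
    return (0.0, b0 * math.sin(theta0), b0 * math.cos(theta0))


def torque_numeric(current, a, b0, theta0, n=3600):
    """Sum r x (I dl x B0) over n current elements around the loop."""
    field = uniform_field(b0, theta0)
    dphi = 2.0 * math.pi / n
    total = [0.0, 0.0, 0.0]
    for k in range(n):
        phi = (k + 0.5) * dphi
        i_dl = (-current * a * math.sin(phi) * dphi, current * a * math.cos(phi) * dphi, 0.0)
        r = (a * math.cos(phi), a * math.sin(phi), 0.0)
        d_torque = cross(r, cross(i_dl, field))
        for j in range(3):
            total[j] += d_torque[j]
    return tuple(total)


def torque_closed(current, a, b0, theta0):
    """Closed form m x B0 with m = 1_z * pi * a^2 * I."""
    return cross((0.0, 0.0, math.pi * a * a * current), uniform_field(b0, theta0))


print(torque_numeric(2.0, 0.1, 0.5, math.pi / 6))
print(torque_closed(2.0, 0.1, 0.5, math.pi / 6))
```

Only the X component survives the summation, and it is negative, so the torque turns m toward the field direction as the example asserts.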

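The magnetic vector potential function A has the additional important property
that its divergence is zero. This can be seen by returning to (4.39) and writing

$$\nabla_F \cdot \mathbf{A} = \frac{1}{4\pi\mu_0^{-1}}\int_V \nabla_F \cdot \left[\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)}{\ell}\right] dV$$

in which differentiation inside the integral sign is permissible because the volume limits
are not functions of x, y, z. Use of the vector identity (V.107) gives

$$\nabla_F \cdot \left(\frac{\boldsymbol{\iota}}{\ell}\right) = \frac{1}{\ell}\,\nabla_F \cdot \boldsymbol{\iota} + \boldsymbol{\iota} \cdot \nabla_F\left(\frac{1}{\ell}\right)$$

But $\nabla_F \cdot \boldsymbol{\iota} \equiv 0$, since $\boldsymbol{\iota}$ is a function only of $\xi$, $\eta$, $\zeta$; therefore

$$\nabla_F \cdot \mathbf{A} = \frac{1}{4\pi\mu_0^{-1}}\int_V \boldsymbol{\iota} \cdot \nabla_F\left(\frac{1}{\ell}\right) dV = -\frac{1}{4\pi\mu_0^{-1}}\int_V \boldsymbol{\iota} \cdot \nabla_S\left(\frac{1}{\ell}\right) dV$$

Use of the same vector identity gives

$$\nabla_S \cdot \left(\frac{\boldsymbol{\iota}}{\ell}\right) = \frac{1}{\ell}\,\nabla_S \cdot \boldsymbol{\iota} + \boldsymbol{\iota} \cdot \nabla_S\left(\frac{1}{\ell}\right)$$

However, $\nabla_S \cdot \boldsymbol{\iota} \equiv 0$ because the currents are time-independent and the net efflux of
current from a volume element must be zero. Therefore,

$$\nabla_F \cdot \mathbf{A} = -\frac{1}{4\pi\mu_0^{-1}}\int_V \nabla_S \cdot \left(\frac{\boldsymbol{\iota}}{\ell}\right) dV$$

The divergence theorem is now applicable and permits the conversion to

$$\nabla_F \cdot \mathbf{A} = -\frac{1}{4\pi\mu_0^{-1}}\oint_S \frac{\boldsymbol{\iota} \cdot d\mathbf{S}}{\ell}$$

in which S is the closed surface bounding V. Since V can be maintained finite and yet
made large enough that none of the currents of the system intersects S, it follows that
one can make $\boldsymbol{\iota} \equiv 0$ on S. This means that

$$\nabla \cdot \mathbf{A} \equiv 0 \tag{4.48}$$

as asserted.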
EXAMPLE 4.8

In Example 4.5 the magnetic vector potential function for a long straight wire carrying a
steady current I was found to be

$$\mathbf{A} = \mathbf{1}_z \frac{I}{4\pi\mu_0^{-1}} \ln\frac{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}}$$

and therefore the divergence of A is

$$\nabla \cdot \mathbf{A} = \frac{\partial A_z}{\partial z} = \frac{I}{4\pi\mu_0^{-1}}\left\{\frac{1 + (z + z_1)[r^2 + (z + z_1)^2]^{-1/2}}{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}} - \frac{1 + (z - z_1)[r^2 + (z - z_1)^2]^{-1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}}\right\}$$
$$= \frac{I}{4\pi\mu_0^{-1}}\left\{\frac{1}{[r^2 + (z + z_1)^2]^{1/2}} - \frac{1}{[r^2 + (z - z_1)^2]^{1/2}}\right\}$$

This expression for $\nabla \cdot \mathbf{A}$ is not quite zero and the reason can be traced to conditions at the
two ends of the wire. There the current has been assumed to end abruptly and $\nabla \cdot \boldsymbol{\iota} \neq 0$,
which violates a condition imposed in deriving the result $\nabla \cdot \mathbf{A} = 0$. If one were to include
the steady currents in the remainder of the circuit, of which this long wire is a part, then a
null value for the divergence of A would be obtained. Alternatively, if the ends of the wire recede to
infinity ($z_1 \to \infty$), it is seen that $\nabla \cdot \mathbf{A} \to 0$ for finite z.

4.8 AMPERE'S CIRCUITAL LAW

The fact that $\nabla \cdot \mathbf{A} \equiv 0$ opens the way to the proof of an important theorem, the result
of which is known as Ampere's circuital law. Recall that in electrostatics the equations

$$\nabla^2\Phi = -\frac{\rho}{\epsilon_0} \qquad \Phi = \int_V \frac{\rho\,dV}{4\pi\epsilon_0\ell}$$

were encountered, the first being a differential equation for the electric potential, and
the second its solution. But from (4.39), a component of A, for example the y component, is given by

$$A_y = \int_V \frac{\iota_y\,dV}{4\pi\mu_0^{-1}\ell}$$

and therefore must satisfy Poisson's equation, namely,

$$\nabla^2 A_y = -\frac{\iota_y}{\mu_0^{-1}} \tag{4.49}$$

If both sides of (4.49) are multiplied by $\mathbf{1}_y$ and the result is added to similar terms
involving the x and z components, one obtains

$$\nabla^2\mathbf{A} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{4.50}$$

and $\mathbf{A}(x,y,z)$ is seen to satisfy a vector form of Poisson's equation.


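Since $\mathbf{B} \equiv \nabla \times \mathbf{A}$, use of the vector identity (V.113) gives

$$\nabla \times \mathbf{B} \equiv \nabla \times \nabla \times \mathbf{A} \equiv \nabla(\nabla \cdot \mathbf{A}) - \nabla^2\mathbf{A} = -\nabla^2\mathbf{A}$$

by virtue of (4.48). Thus

$$\nabla \times \mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}}$$

which means that

$$\nabla \times \mathbf{H}_0 = \boldsymbol{\iota} \tag{4.51}$$

This is the differential form of Ampere's circuital law. Integration gives

$$\int_S \nabla \times \mathbf{H}_0 \cdot d\mathbf{S} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S}$$

in which S is an open surface bounded by the closed contour C. Application of Stokes'
theorem yields

$$\oint_C \mathbf{H}_0 \cdot d\mathbf{l} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S} = I_{\text{enclosed}} \tag{4.52}$$

This is the integral form of Ampere's circuital law and it plays the same role in magnetostatics that Gauss' law does in electrostatics.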
EXAMPLE 4.9

The results of Example 4.1 can be used to deduce that the magnetic field due to a steady
current in a long straight wire is

$$\mathbf{H}_0 = \mathbf{1}_\phi \frac{I}{2\pi r}$$

at points not too far removed from the wire nor too near its ends. Let a closed contour C be
erected which encloses such a wire. An element of length along C, expressed in cylindrical
coordinates, is (cf. Example V.17)

$$d\mathbf{l} = \mathbf{1}_r\,dr + \mathbf{1}_\phi r\,d\phi + \mathbf{1}_z\,dz$$

and therefore,

$$\oint_C \mathbf{H}_0 \cdot d\mathbf{l} = \oint_C \left(\frac{I}{2\pi r}\right)(r\,d\phi) = \frac{I}{2\pi}\int_0^{2\pi} d\phi = I$$

which agrees with Ampere's law.
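
That the circulation of $\mathbf{H}_0$ depends only on the enclosed current, and not on the shape of the contour, can be illustrated numerically. The Python sketch below integrates the long-wire field around a square contour by the midpoint rule; the function names, the square shape, and the sample positions are assumptions chosen for illustration.

```python
import math


def h_field(current, x, y):
    """X and Y components of H = 1_phi * I / (2 pi r) for a long wire on the Z axis."""
    r2 = x * x + y * y
    return (-current * y / (2.0 * math.pi * r2), current * x / (2.0 * math.pi * r2))


def circulation_square(current, cx, cy, half, n=2000):
    """Line integral of H . dl counterclockwise around a square centered at (cx, cy)."""
    corners = [(cx - half, cy - half), (cx + half, cy - half),
               (cx + half, cy + half), (cx - half, cy + half)]
    total = 0.0
    for k in range(4):
        x0, y0 = corners[k]
        x1, y1 = corners[(k + 1) % 4]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for j in range(n):
            hx, hy = h_field(current, x0 + (j + 0.5) * dx, y0 + (j + 0.5) * dy)
            total += hx * dx + hy * dy
    return total


print(circulation_square(1.0, 0.0, 0.0, 1.0))  # contour encloses the wire
print(circulation_square(1.0, 3.0, 0.0, 1.0))  # contour does not enclose the wire
```

The first contour returns very nearly the full current and the second very nearly zero, as (4.52) requires.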

EXAMPLE 4.10

Consider an infinitely long cylindrical tube, shown in cross section in the figure, which
carries a uniformly distributed axial current I. What magnetic field is caused by this system
of sources?

[Figure: cross section of the tube in the XY plane, with inner radius a and outer radius b]

An answer can be given to this question by first noting that symmetry requires that $\mathbf{H}_0$ be
independent of $\phi$; since A has only an axial component, $\nabla \times \mathbf{A}$ has only a $\phi$ component.
Therefore the magnetic field is a function $H_\phi(r)$.
Next imagine that a concentric circular contour C of radius r has been constructed in a
transverse z plane. If $r \le a$, $\oint_C H_\phi\,dl = 0$ and therefore,

$$H_\phi \equiv 0 \qquad r \le a$$

If $a \le r \le b$, some current is enclosed by C. The uniform current density is given by

$$\iota_z = \frac{I}{\pi(b^2 - a^2)}$$

and thus

$$\int_0^{2\pi} H_\phi(r)\,r\,d\phi = \int_a^r \frac{I}{\pi(b^2 - a^2)}\,2\pi r'\,dr'$$

$$2\pi r H_\phi(r) = I\,\frac{r^2 - a^2}{b^2 - a^2}$$

$$H_\phi(r) = \frac{I}{2\pi r}\,\frac{r^2 - a^2}{b^2 - a^2} \qquad a \le r \le b$$

Finally, if $r \ge b$, all the current is enclosed by C and

$$H_\phi(r) = \frac{I}{2\pi r} \qquad b \le r$$

Interior points of the tube are shielded; at all exterior points the field acts as though the
entire current were concentrated on the axis.
By superposition, if two concentric conducting tubes carry equal and opposite steady
currents, which are uniformly distributed, the field between them is

$$H_\phi(r) = \frac{I}{2\pi r}$$

Throughout the hollow interior of the inner tube and outside the outer tube the field is
everywhere zero.
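
The three-region result for the current-carrying tube translates directly into a piecewise function. The Python sketch below is illustrative; the function name and the sample radii are assumptions, and r is taken to be positive.

```python
import math


def h_phi_tube(current, a, b, r):
    """H_phi at radius r > 0 for a tube (inner radius a, outer radius b) with uniform axial current."""
    if r <= a:
        return 0.0  # hollow interior: no current enclosed
    if r <= b:
        enclosed = current * (r * r - a * a) / (b * b - a * a)
        return enclosed / (2.0 * math.pi * r)
    return current / (2.0 * math.pi * r)  # all the current enclosed


for radius in (0.5, 1.0, 1.5, 2.0, 4.0):
    print(radius, h_phi_tube(1.0, 1.0, 2.0, radius))
```

The field is continuous at both tube walls, rising from zero at the inner wall to its maximum at the outer wall and falling as $r^{-1}$ beyond it.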
EXAMPLE 4.11

Consider the long thin and tightly wound circular cylindrical solenoid shown in cross
section in the figure. Let a-a' represent the central transverse plane with P any point
(external or internal) in a-a'. Let the first task be to find $\mathbf{B}(P)$. If I is the steady current in
the winding, coming out of the paper at the upper cross section of each turn, symmetrically
disposed pairs of current elements $I\,d\mathbf{l}$ and $I\,d\mathbf{l}'$ can be found, such as the two shown in the
figure at distances $\ell$ and $\ell'$ from P. These two current elements will make contributions to
the magnetic flux density at P which can be written

$$d\mathbf{B} = \frac{I\,d\mathbf{l} \times \boldsymbol{\ell} + I\,d\mathbf{l}' \times \boldsymbol{\ell}'}{4\pi\mu_0^{-1}\ell^3}$$

and which are shown in the figure. By symmetry these two contributions sum to a longitudinal component of B only. When all the current elements in the solenoid are paired in
this fashion, it is evident that the entire B field is longitudinal at every point in the central
transverse plane a-a'.

[Figure: cross section of the solenoid showing the transverse planes a-a' and b-b']
Next consider the field $\mathbf{B}(P_1)$ at an internal point in a noncentral transverse plane, such
as b-b'. One can begin to construct $\mathbf{B}(P_1)$ by once again considering pairs of current elements, this time symmetrically disposed about $P_1$. After a while, all of the current elements
to the left of $P_1$ will have been used up, but there will still be some left over far to the right
of $P_1$. However, since the solenoid is long and thin, these leftover current elements can be
considered to advantage in pairs of a different sort, such as $I\,d\mathbf{l}''$ and $I\,d\mathbf{l}'''$. (See figure.)
If b-b' is not too near either end, the position vectors drawn from $I\,d\mathbf{l}''$ to $P_1$ and from $I\,d\mathbf{l}'''$
to $P_1$ must be almost parallel as well as almost of equal length. Since the two current
elements are oppositely directed, it follows that their paired contribution to $\mathbf{B}(P_1)$ is
negligible. Thus if the transverse plane b-b' is sufficiently remote from an end, B is essentially longitudinal at all interior points of b-b'.

z------------~--::e

x
With this information about the nature of the field inside the solenoid, Ampere's circuital
law can be applied to the contour 1234 shown dotted in the second figure. Since this contour
encloses no current, and since B is essentially perpendicular to the legs 23 anrl14, it follows
that
2

JB.di= JBodi
1

and thus that B is uniform over a transverse plane inside the solenoid, provided this transverse plane is not too near either end.
Finally, n can be deduced at a point P far removed from the solenoid, with the use of the
coordinate system indicated in the second figure. If a is the radius of the solenoid, L its
length, and r the distance to the remote point 1~, then a L r, Let n be the number of
turns per unit length of the solenoid, so that n dr is the number of turns of a flat loop at the
source position t. On the basis of Equation (4.~14), the contribution to A at J:> for this flat
loop is
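$$d\mathbf{A} = \frac{\pi a^2 I n\,d\zeta\,(\mathbf{1}_z \times \mathbf{R})}{4\pi\mu_0^{-1} R^3}$$

in which $\mathbf{R} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z (z - \zeta)$ is the position vector drawn from the center of this
loop to the distant point P(x,y,z).
Since $r \gg L$, $\mathbf{R}$ varies insignificantly as $\zeta$ ranges from -L/2 to +L/2. Thus

$$\mathbf{A} \cong \frac{\pi a^2 I n L}{4\pi\mu_0^{-1}}\,\frac{\mathbf{1}_z \times \mathbf{r}}{r^3}$$

and

$$\mathbf{B} = \nabla \times \mathbf{A} \cong \frac{\pi a^2 I n L}{4\pi\mu_0^{-1} r^3}\,(\mathbf{1}_r\,2\cos\theta + \mathbf{1}_\theta \sin\theta)$$

and it is as though the entire solenoid were concentrated in the XY plane.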
These conclusions permit an approximate sketching of the B field for a long slender
solenoid, with the result suggested by the third figure.
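
The dipole limit reached above can be checked numerically. The following Python sketch is an illustration only and is not part of the original example: it sums the on-axis fields of the flat loops of $n\,d\zeta$ turns (using the standard on-axis result for a circular loop, written with the SI constant $\mu_0$ in place of the text's $1/\mu_0^{-1}$) and compares the total with the far-field expression evaluated at $\theta = 0$. The function names and the sample dimensions are assumptions chosen for the demonstration.

```python
import math

MU0 = 4e-7 * math.pi  # SI permeability, standing in for 1/mu0^{-1} of the text


def axis_field_from_loops(z, a, length, n, current, slices=4000):
    """Sum the on-axis fields of flat loops of n*d_zeta turns (midpoint rule)."""
    d_zeta = length / slices
    total = 0.0
    for k in range(slices):
        zeta = -length / 2 + (k + 0.5) * d_zeta  # source position of this slice
        total += (MU0 * current * n * d_zeta * a**2
                  / (2 * (a**2 + (z - zeta) ** 2) ** 1.5))
    return total


def axis_field_dipole(z, a, length, n, current):
    """Far-field B at theta = 0 for the moment pi a^2 I n L."""
    moment = math.pi * a**2 * current * n * length
    return MU0 * moment * 2 / (4 * math.pi * z**3)


# Hypothetical solenoid: a = 1 cm, L = 1 m, 1000 turns per metre, I = 2 A.
a, length, n, current = 0.01, 1.0, 1000.0, 2.0
for z in (5.0, 20.0, 100.0):
    ratio = (axis_field_from_loops(z, a, length, n, current)
             / axis_field_dipole(z, a, length, n, current))
    print(f"r = {z:6.1f} m   summed loops / dipole formula = {ratio:.5f}")
```

The ratio approaches unity as r/L grows, which is the sense in which the solenoid behaves as though it were concentrated at the origin.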

4.9

BOUNDARY-VALUE PROBLEMS IN MAGNETOSTATICS

In connection with Equation (4.50), it has been noted that A satisfies a vector form
of Poisson's equation; in regions removed from the current sources this reduces
to a vector form of Laplace's equation. Therefore, all the techniques discussed in
Chapter 3 pertinent to solving $\nabla^2\Phi = 0$ would appear to be applicable to boundary-value problems concerning A. Unfortunately, the situation is not that simple, due to the
vector nature of A. As discussed in Section V.16 of the Mathematical Supplement, the
Laplacian of a vector function generally involves more than the Laplacian of its scalar
component functions; additional terms may arise through the spatial derivatives of the
unit vectors. Only in rectangular coordinates is this not the case, because the unit
vectors have constant directions. In all other coordinate systems, the change in direction of these unit vectors with spatial position adds terms which complicate the solution
of the differential equation. For example, in cylindrical coordinates
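
$$\nabla^2\mathbf{A} = \mathbf{1}_r\left(\nabla^2 A_r - \frac{2}{r^2}\frac{\partial A_\phi}{\partial\phi} - \frac{A_r}{r^2}\right) + \mathbf{1}_\phi\left(\nabla^2 A_\phi + \frac{2}{r^2}\frac{\partial A_r}{\partial\phi} - \frac{A_\phi}{r^2}\right) + \mathbf{1}_z\nabla^2 A_z \tag{4.53}$$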

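and in spherical coordinates

$$\begin{aligned}\nabla^2\mathbf{A} ={}& \mathbf{1}_r\left(\nabla^2 A_r - \frac{2A_r}{r^2} - \frac{2}{r^2\sin\theta}\frac{\partial(A_\theta\sin\theta)}{\partial\theta} - \frac{2}{r^2\sin\theta}\frac{\partial A_\phi}{\partial\phi}\right)\\ &+ \mathbf{1}_\theta\left(\nabla^2 A_\theta - \frac{A_\theta}{r^2\sin^2\theta} + \frac{2}{r^2}\frac{\partial A_r}{\partial\theta} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\phi}{\partial\phi}\right)\\ &+ \mathbf{1}_\phi\left(\nabla^2 A_\phi - \frac{A_\phi}{r^2\sin^2\theta} + \frac{2}{r^2\sin\theta}\frac{\partial A_r}{\partial\phi} + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\theta}{\partial\phi}\right)\end{aligned} \tag{4.54}$$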
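
The role of the extra terms in (4.53) can be seen in a small numerical experiment. The Python sketch below is an illustration under stated assumptions, not part of the original text: it takes the hypothetical field $\mathbf{A} = \mathbf{1}_r r$, which in rectangular components is simply (x, y, 0) and therefore has a vanishing Laplacian, and shows that the scalar Laplacian of $A_r$ alone does not vanish until the term $-A_r/r^2$ is included. All function names are introduced here for the demonstration.

```python
def radial_scalar_laplacian(f, r, h=1e-3):
    """(1/r) d/dr (r df/dr) for a function of r only, by central differences."""
    outer = (r + h / 2) * (f(r + h) - f(r)) / h
    inner = (r - h / 2) * (f(r) - f(r - h)) / h
    return (outer - inner) / (h * r)


def cartesian_laplacian_2d(f, x, y, h=1e-3):
    """d2f/dx2 + d2f/dy2 by the five-point stencil."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / h**2


def a_r(r):
    """Radial component of the assumed field A = 1_r r."""
    return r


def a_x(x, y):
    """x component of the same field in rectangular coordinates."""
    return x


r0 = 2.0
naive = radial_scalar_laplacian(a_r, r0)        # scalar Laplacian of A_r alone
corrected = naive - a_r(r0) / r0**2             # r component of (4.53), A_phi = 0
reference = cartesian_laplacian_2d(a_x, r0, 0)  # at phi = 0, 1_r coincides with 1_x
print("scalar Laplacian of A_r :", naive)
print("r component of (4.53)   :", corrected)
print("rectangular reference   :", reference)
```

Only the corrected value agrees with the rectangular-coordinate reference; the scalar Laplacian of $A_r$ by itself returns 1/r.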


It is apparent that one is generally confronted with the problem of solving more
complicated differential equations than the Laplacian of a scalar function. These
equations can be mixed, and will take different forms as the type of symmetry is
changed. For this reason the techniques tend to be more specialized than was found
to be the case when solving for the electrostatic potential function. A few examples
will serve to illustrate possible approaches.
EXAMPLE

4.12

A simple configuration in cylindrical coordinates has been treated in Example 4.10, that of
an infinitely long cylindrical tube carrying a uniformly distributed axial current. Such a
current distribution yields an A which is entirely axial and a function only of r. But if
$\mathbf{A} = \mathbf{1}_z A_z(r)$, inspection of (4.53) reveals that $\nabla^2\mathbf{A} = \mathbf{1}_z \nabla^2 A_z$. Therefore, in this case Poisson's
Equation (4.50) is simply

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dA_z}{dr}\right) = \begin{cases} 0 & r < a,\ r > b \\[4pt] -\dfrac{I}{\pi\mu_0^{-1}(b^2 - a^2)} & a < r < b \end{cases}$$

The solutions to these equations may be written

$$A_z(r) = \begin{cases} C_1 & r < a \\[4pt] -\dfrac{I r^2}{4\pi\mu_0^{-1}(b^2 - a^2)} + C_2 \ln r + C_3 & a < r < b \\[4pt] C_4 \ln r + C_5 & r > b \end{cases}$$

in which the $C_i$ are constants of integration. Determination of the values of $C_1$, $C_3$, and $C_5$
is not important, since they vanish in taking the curl of A to find the magnetic field. The
requirement that $\partial A_z/\partial r$ be continuous across the interfaces leads to the evaluations

$$C_2 = \frac{I a^2}{2\pi\mu_0^{-1}(b^2 - a^2)} \qquad C_4 = -\frac{I}{2\pi\mu_0^{-1}}$$

When these values for the constants are substituted in the above expressions for $A_z$, performance of the curl operation yields expressions for the magnetic field in the three regions
which are in agreement with the results of Example 4.10.
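
The agreement claimed above can be spot-checked numerically. The sketch below is a hedged illustration, not the author's calculation: it assumes a tube of inner radius a and outer radius b, uses the SI constant $\mu_0$ in place of $1/\mu_0^{-1}$, sets the unimportant additive constants to zero, and compares $B_\phi = -\partial A_z/\partial r$ with the field obtained directly from the circuital law. The function names and sample values are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # SI permeability, standing in for 1/mu0^{-1} of the text


def a_z(r, current, a, b):
    """Axial vector potential of the tube; additive constants set to zero."""
    k = MU0 * current / (2 * math.pi * (b**2 - a**2))
    if r < a:
        return 0.0
    if r < b:
        return -0.5 * k * r**2 + k * a**2 * math.log(r)  # C2 = k a^2
    return -MU0 * current / (2 * math.pi) * math.log(r)  # C4 = -mu0 I / (2 pi)


def b_phi_from_potential(r, current, a, b, h=1e-7):
    """B_phi = -dA_z/dr by a central difference (keep r away from a and b)."""
    return -(a_z(r + h, current, a, b) - a_z(r - h, current, a, b)) / (2 * h)


def b_phi_from_ampere(r, current, a, b):
    """Circuital law: B_phi = mu0 * (enclosed current) / (2 pi r)."""
    if r < a:
        enclosed = 0.0
    elif r < b:
        enclosed = current * (r**2 - a**2) / (b**2 - a**2)
    else:
        enclosed = current
    return MU0 * enclosed / (2 * math.pi * r)


a, b, current = 0.01, 0.02, 5.0  # hypothetical tube: 1 cm and 2 cm radii, 5 A
for r in (0.005, 0.015, 0.05):
    print(r, b_phi_from_potential(r, current, a, b),
          b_phi_from_ampere(r, current, a, b))
```

Because the additive constants are dropped, the coded potential is not continuous at r = a and r = b; this does not affect the derivative at points inside each region, which is all the comparison uses.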
EXAMPLE

4.13

A problem in spherical coordinates which can be extended to several practical situations
involves a $\phi$-directed sheet of current lying in a thin spherical shell of radius a. If $\delta$ is the
thickness of the shell, then $\mathbf{j} = \boldsymbol{\iota}\delta$ amp/m can be taken as the lineal current density in the
surface of the shell. It will be assumed that $\mathbf{j} = \mathbf{1}_\phi j_\phi(\theta)$; that is, the current density will not
be taken as a function of $\phi$.
It follows that A will have only a $\phi$ component, which is a function of r and $\theta$ but not $\phi$.
Inspection of (4.54) indicates that for this case
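
$$\nabla^2\mathbf{A} = \mathbf{1}_\phi\left(\nabla^2 A_\phi - \frac{A_\phi}{r^2\sin^2\theta}\right) = 0$$

for points not in the shell. Expansion of the Laplacian operator gives

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial A_\phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial A_\phi}{\partial\theta}\right) - \frac{A_\phi}{r^2\sin^2\theta} = 0$$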

Upon assuming that $A_\phi = f_1(r) f_2(\theta)$ one obtains

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) - n(n+1) f_1 = 0$$

$$\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + \left[n(n+1)\sin\theta - \frac{1}{\sin\theta}\right] f_2 = 0$$

in which n(n + 1) is a separation constant. Both of these equations were encountered in
Chapter 3 in connection with solutions of Laplace's equation in spherical coordinates. The
most general appropriate solution is

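$$A_\phi = \begin{cases} \displaystyle\sum_{n=1}^{\infty} a_n \left(\frac{r}{a}\right)^{n} P_n^1(\cos\theta) & r < a \\[10pt] \displaystyle\sum_{n=1}^{\infty} a_n \left(\frac{a}{r}\right)^{n+1} P_n^1(\cos\theta) & r > a \end{cases}$$

with these series constituting a complete orthogonal set.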


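Performance of the curl operation yields

$$\mathbf{B} = \mathbf{1}_r\,\frac{1}{a}\sum_{n=1}^{\infty} n(n+1)\,a_n \left(\frac{r}{a}\right)^{n-1} P_n(\cos\theta) - \mathbf{1}_\theta\,\frac{1}{a}\sum_{n=1}^{\infty} (n+1)\,a_n \left(\frac{r}{a}\right)^{n-1} P_n^1(\cos\theta) \qquad r < a$$

$$\mathbf{B} = \mathbf{1}_r\,\frac{1}{a}\sum_{n=1}^{\infty} n(n+1)\,a_n \left(\frac{a}{r}\right)^{n+2} P_n(\cos\theta) + \mathbf{1}_\theta\,\frac{1}{a}\sum_{n=1}^{\infty} n\,a_n \left(\frac{a}{r}\right)^{n+2} P_n^1(\cos\theta) \qquad r > a$$
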

If a contour C is drawn in a $\phi$ = constant plane, straddling the shell as shown in the figure, application
of Ampere's circuital law gives

$$j_\phi(\theta)\,a\,d\theta = \mu_0^{-1}\,a\,d\theta\left[\frac{1}{a}\sum_{n=1}^{\infty} n\,a_n P_n^1(\cos\theta) + \frac{1}{a}\sum_{n=1}^{\infty} (n+1)\,a_n P_n^1(\cos\theta)\right]$$

so that

$$j_\phi(\theta) = \frac{\mu_0^{-1}}{a}\sum_{n=1}^{\infty} (2n+1)\,a_n P_n^1(\cos\theta)$$

is the lineal current density, expressed as a sum of orthogonal terms. If the current distribution is specified, the normalization integral for Legendre polynomials can be used to deduce
the constants $a_n$.
The case in which all of the $a_n$ coefficients are zero except for n = 1 is particularly interesting, for then inside the shell $B_r = B\cos\theta$ and $B_\theta = -B\sin\theta$ and the field strength is
uniform. The current distribution required to achieve this effect varies as $\sin\theta$.
All the foregoing can be extended to problems involving $\phi$-type currents flowing in
spherical volumes by considering such volumes to be composed of nesting spherical shells;
the results given here then become a prototype solution.
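
The uniform-field case can be verified from the interior series for B. The Python sketch below is an illustration under stated assumptions rather than part of the original example: it evaluates the interior expressions for $B_r$ and $B_\theta$ with the convention $P_n^1(\cos\theta) = \sin\theta\,dP_n/d(\cos\theta)$, converts to rectangular components in the plane $\phi = 0$, and confirms that a lone n = 1 term gives a field along z whose value does not depend on position. The function names, the dictionary of coefficients, and the sample numbers are all hypothetical.

```python
import math


def legendre_and_derivative(n, x):
    """Return (P_n(x), dP_n/dx) by upward recurrence."""
    p_prev, p = 1.0, x        # P_0, P_1
    dp_prev, dp = 0.0, 1.0    # P_0', P_1'
    if n == 0:
        return p_prev, dp_prev
    for k in range(1, n):
        p_next = ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
        dp_next = x * dp + (k + 1) * p  # P'_{k+1} = x P'_k + (k+1) P_k
        p_prev, p = p, p_next
        dp_prev, dp = dp, dp_next
    return p, dp


def field_inside(r, theta, coeffs, a):
    """Interior (r < a) components B_r, B_theta; coeffs maps n to a_n."""
    x, s = math.cos(theta), math.sin(theta)
    b_r = 0.0
    b_theta = 0.0
    for n, a_n in coeffs.items():
        p, dp = legendre_and_derivative(n, x)
        p1 = s * dp  # P_n^1(cos theta)
        radial = (r / a) ** (n - 1) / a
        b_r += n * (n + 1) * a_n * radial * p
        b_theta -= (n + 1) * a_n * radial * p1
    return b_r, b_theta


def cartesian_field(r, theta, coeffs, a):
    """Return (B_z, B_x) in the plane phi = 0."""
    b_r, b_theta = field_inside(r, theta, coeffs, a)
    b_z = b_r * math.cos(theta) - b_theta * math.sin(theta)
    b_x = b_r * math.sin(theta) + b_theta * math.cos(theta)
    return b_z, b_x


shell_radius = 0.1
only_n1 = {1: 1.0}  # hypothetical coefficient a_1 = 1, all others zero
for r, theta in ((0.0, 0.3), (0.05, 1.0), (0.08, 2.0)):
    print(r, theta, cartesian_field(r, theta, only_n1, shell_radius))
```

Adding a second entry such as `{1: 1.0, 2: 0.5}` to the coefficient dictionary makes the printed components vary with position, which shows that the uniformity is special to the n = 1 term.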

Several other techniques have proved helpful in the solution of magnetostatic
boundary-value problems. The differential form of Ampere's law yields $\nabla \times \mathbf{B} = 0$
away from the sources, and thus in such regions B may be expressed as the gradient
of a scalar potential function in much the same manner as found in electrostatics. This
technique has been widely used when describing magnetic fields in terms of equivalent
magnetic dipoles.¹⁶
Since $\nabla \cdot \mathbf{A} = 0$ it is possible to introduce a vector function W by the relation
$\mathbf{A} = \nabla \times \mathbf{W}$. In turn, W has proved to be expressible as a series of orthogonal functions,
and a variety of problems are solvable by this technique,¹⁷ including the spherical
shell of current discussed in Example 4.13.

4.10

COMPOSITE FIELDS

At this stage in the analysis, it is possible to formulate an expression for the force F
on a charge q which is moving through XYZ at a velocity v(t), when that force is
contributed to by a composite of three sets of sources:
1. A static volume charge distribution $\rho_1(\xi,\eta,\zeta)$.
2. A system of uncompensated charges $\rho_2(\xi,\eta,\zeta)\,dV$ moving through space at the
constant† velocities $\mathbf{v}_2(\xi,\eta,\zeta)$.
3. A system of compensated charges $\rho_3(\xi,\eta,\zeta)\,dV$ moving at the constant velocities
$\mathbf{v}_3(\xi,\eta,\zeta)$. There are stationary charges $-\rho_3(\xi,\eta,\zeta)\,dV$ providing the compensation,
and one may talk conveniently of the charge pairs $(\rho_3\,dV, -\rho_3\,dV)$.

Through the use of the Dirac delta function, these volume charge densities can equally
well represent surface and lineal distributions, or discrete point charges.
The force on q is given by the Lorentz force law

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$$

† By this it should be recalled one means that the charge $\rho_2\,dV$, which is instantaneously in that
volume element dV which contains the point $(\xi,\eta,\zeta)$, has, for the moment, the particular velocity
$\mathbf{v}_2(\xi,\eta,\zeta)$. The identity of the charge in dV keeps changing, but on the time-average there is always
charge at this position with this velocity. Alternatively, if the progress of a specific charge is followed,
it will be found to occupy a succession of positions, momentarily taking on a progression of velocities
$\mathbf{v}_2$, which need have neither the same magnitudes nor directions.
16 M. Abraham and R. Becker, The Classical Theory of Electricity and Magnetism, 2d English ed.,
Chap. 7, Hafner Publishing Company, New York, 1949.
17 W. R. Smythe, Static and Dynamic Electricity, pp. 260-271, McGraw-Hill Book Company, New
York, 1939.

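in which

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0}\int_V \frac{\rho\,\mathbf{R}}{R^3}\,dV \tag{4.55}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi\mu_0^{-1}}\int_V \frac{\boldsymbol{\iota} \times \mathbf{R}}{R^3}\,dV \tag{4.56}$$

with $\rho = \rho_1 + \rho_2$ and $\boldsymbol{\iota} = \rho_2\mathbf{v}_2 + \rho_3\mathbf{v}_3$, and with $\mathbf{R}$ the position vector drawn from the source point $(\xi,\eta,\zeta)$ to the field point (x,y,z). The fields E and B, as given by (4.55) and
(4.56), satisfy the differential relations

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0} \qquad \nabla \cdot \mathbf{B} = 0 \qquad \nabla \times \mathbf{E} = 0 \qquad \nabla \times \mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{4.57}$$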

These composite fields, which are due to the most general aggregation of time-independent sources, are therefore the most general electrostatic and magnetostatic
fields obtainable.
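
As a numerical illustration of the force law used in this section, the sketch below evaluates F = q(E + v × B) for given field vectors. It is a minimal Python example with made-up values; the helper names `cross` and `lorentz_force` are assumptions introduced here, and the fields are simply supplied as numbers rather than computed from (4.55) and (4.56).

```python
def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def lorentz_force(q, e_field, velocity, b_field):
    """F = q (E + v x B), all vectors given as (x, y, z) tuples in SI units."""
    v_cross_b = cross(velocity, b_field)
    return tuple(q * (e + c) for e, c in zip(e_field, v_cross_b))


# Hypothetical values: a unit charge moving along x through a B field along z.
velocity = (1.0, 0.0, 0.0)
b_field = (0.0, 0.0, 1.0)
print(lorentz_force(1.0, (0.0, 0.0, 0.0), velocity, b_field))  # magnetic part only
print(lorentz_force(1.0, (0.0, 1.0, 0.0), velocity, b_field))  # E chosen to cancel v x B
```

In the second call the electric field has been chosen equal to the negative of v × B, so the two contributions to the force cancel.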
REFERENCES

1. Abraham, M., and R. Becker, The Classical Theory of Electricity and Magnetism, 2d English ed., Hafner Publishing Company, New York, 1949.
2. Corson, D. R., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H. Freeman and Company, San Francisco, 1962.
3. Duckworth, H. E., Electricity and Magnetism, Holt, Rinehart and Winston, Inc., New York, 1960.
4. Langmuir, R. V., Electromagnetic Fields and Waves, McGraw-Hill Book Company, New York, 1961.
5. Page, L., and N. I. Adams, Jr., Electrodynamics, D. Van Nostrand and Company, Inc., New York, 1940.
6. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1956.
7. Shedd, P. C., Fundamentals of Electromagnetic Waves, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1954.
8. Smythe, W. R., Static and Dynamic Electricity, McGraw-Hill Book Company, New York, 1939.
9. Whittaker, E., A History of the Theories of Aether and Electricity, Vol. 1, Thomas Nelson and Sons, Ltd., London, 1951.

PROBLEMS
4.1

Two straight horizontal insulated aluminum wires carry equal steady currents in opposite
directions. If each wire is 0.5 cm in diameter and one wire lies on top of the other, what
current will barely cause separation?

4.2

A rectangular loop consists of a U-shaped conductor and a sliding bar. Find the force on
the bar as a function of the dimensions of the loop and the strength of the steady current I which flows in the loop.


4.3

Referring to Figure 4.1c, Biot bent the second wire into the form of a right angle and
found that the period of oscillation of the needle, with a steady current I passing through
the bent wire, was $T_1$. Upon passing half this much current through the straight vertical
wire, he measured a period $T_2$. The ratio $T_2/T_1$ was independent of the distance r and had a
mean value of 0.917. Is this result compatible with the Biot-Savart law? How should the
ratio of periods vary with the apex angle of the bent wire?

4.4

A circular loop carrying a current I and a long straight wire carrying a current I' lie in
the same plane. Show that the mutual force is proportional to II' (sec a - 1) with a the
angle subtended by the circle at the nearest point of the straight wire.

4.5

If in the last problem the straight wire is placed perpendicular to the plane of the loop,
show that a torque exists tending to set the two wires in the same plane. Does your
answer depend on whether or not the straight wire is within the loop?

4.6

Show that the net self-force on a plane circular loop carrying a steady current is zero.

4.7

Show that a simple loop of arbitrary shape carrying a steady current tends to assume the
form of a plane circular loop.

4.8

Two circular wires of radii a and b have a common center and are free to turn on an axis
which is a diameter of both. Find the torque existing between these coils if they carry
steady currents Ia and Ib and are (1) at right angles, and (2) in the same plane.

4.9

For a current-carrying circular loop of radius a, show that the rate of change of field
along the axis is constant at a distance a/2 from the center. Thus show that if two identical coils are placed coaxially a distance apart equal to their radii, an extended region
exists in which the magnetic field strength is essentially constant. Two such coils arranged
in this manner are known as Helmholtz coils.

4.10

A long thin solenoid of length L and radius a carries a steady current I through its N turns.
Find the B field along the axis.

4.11

A high-speed electron has a dynamic mass which is 1.5 times its rest mass. At what
radius of curvature will it travel in a perpendicular magnetic field of strength $B_0$ = 0.5 webers/m²?

4.12

In a parallel plane diode operating at a constant potential Vo, the electrons normally
travel directly across from cathode to plate. If a transverse uniform magnetic field of
strength $B_0$ is interposed, what must be the value of $B_0$ just to prevent the electrons from
reaching the plate?

4.13

If ions of mass M and charge q are injected transversely into a region of uniform magnetic
field $B_0$, after having accelerated through a potential $V_0$, show that they will travel a
circular path of radius

$$R = \left[\frac{2V_0}{B_0^2}\left(\frac{M}{q}\right)\right]^{1/2}$$

If a variety of ions of different mass and the same charge are collected after having
traveled through a semicircle, they will be separated laterally due to their different orbital
radii. This is the principle of the mass spectroscope.

4.14

In a Wien velocity filter, ions of a particular velocity Vo are not deflected in passing
through a region containing steady electric and magnetic fields. How are the fields
arranged to accomplish this?

4.15

What magnetic moment results if a spherical conductor of radius a, possessing a net
charge Q, rotates with a constant angular velocity $\omega$?

4.16

Find the magnetic field everywhere if two infinite coaxial cylindrical shells of radii a < b
carry equal and opposite steady currents I.

4.17

An infinitely long cylindrical shell is segmented into four equal quarter-circles. These four
segments carry axial currents in alternate directions of uniform lineal density j amp/m.
If the cylinder radius is a, find the magnetic field B at all points.

4.18

A fine wire is wound in the form of a flat spiral of N turns, shaped like a disc of radius a.
Find the magnetic dipole moment if a steady current I flows through this winding.

4.19

A toroidal coil consists of a large number N of turns of wire and carries a steady current I.
Use Ampere's circuital law to determine the field at any point inside the toroid. If a and b
are the inner and outer radii of the toroid, find the percent variation in B over a cross
section as a function of b/a.

4.20

A fine wire is wound in a single layer of N closely spaced turns on the surface of an insulating spherical shell, such that the axes of the turns coincide with the polar axis of the
sphere. If a steady current I is passed through the winding, find the magnetic field
everywhere.

CHAPTER 5

Electromagnetics in Free Space

As the structure of the word implies, electromagnetics is concerned with interrelated
electric and magnetic fields, an effect which occurs when the two fields are time-varying.
This interrelation is normally introduced by accepting Faraday's emf law as an experimental postulate and adding to it the continuity equation for current, from which
ultimately Maxwell's equations may be deduced. However, the approach to be presented here will not require this additional experimental postulate. Instead, the most
general static electric and magnetic fields will be created in one coordinate system and the
resulting force expression transformed to another (moving) coordinate system. The
transformed force expression will be recognized as a generalization of the Lorentz
force law, and permits the definition of time-varying electric and magnetic fields. These
fields are then shown to be related through Maxwell's equations, one consequence of
which is Faraday's emf law. This approach provides the additional satisfaction of
identifying the electromagnetic fields in the Lorentz force law and in Maxwell's
equations as one and the same, an identity which can only be postulated in the customary derivation.
After Maxwell's equations have been established, the vector Green's theorem is used
to obtain a general solution for the electromagnetic field. Conditions at infinity are
studied, and convergence is demonstrated for real sources. The wavelike nature of the
general solution is demonstrated and then Poynting's theorem is derived to show the
energy content transported by these waves. The chapter continues with discussions of
solutions to the vector wave equation in rectangular, cylindrical, and spherical coordinate systems and concludes with a Minkowskian formulation of the field equations.

5.1 *

HISTORICAL SURVEY

The two major discoveries on which the theory of time-varying electromagnetic fields
is ordinarily based were made by Michael Faraday (1791-1867) and James Clerk Maxwell (1831-1879). Faraday's discovery was experimental and consisted of the significant
observation that a changing magnetic field would induce an electric field. Maxwell
was led by an analogy to the theoretical conclusion that the converse was also true,
namely, that a changing electric field would induce a magnetic field. In this respect,
time-varying electric fields play the same role as conduction currents, and Maxwell
combined the two into a total current which he showed to be continuous. The mathematical formulation of all these effects and their interrelations constitutes what is
known as Maxwell's theory.

* This section may be omitted without loss in continuity of the technical presentation.

Faraday was undoubtedly motivated in his discovery by what appears to have been
a basic tenet of his scientific philosophy-that every cause and effect has its converse.
Thus since Oersted's experiment and many developments which followed had clearly
shown that electricity can produce magnetic effects, it was reasonable to expect that
magnetism should be able to produce electricity. Faraday attacked this problem many
times without success. His laboratory notebook contains an entry dated December 28,
1824 describing an experiment in which a magnet was placed inside a helical coil
" . . . but in no case did the magnet seem to affect the current so as to alter its intensity
as shewn upon a magnetic needle placed under a distant part of it. . ."
Again, on November 28, 1825, his laboratory notes refer to a battery-connected wire
" . . . parallel to which was another similar wire separated from it only by two thicknesses of paper. The ends of the latter wire attached to a galvanometer exhibited no
action." Replacing either straight wire by a helix also had no effect.
A third try was recorded on April 22, 1828. Faraday suspended a copper ring by a
thread and placed a bar magnet inside the ring but could detect no induced current.
Faraday's efforts were paralleled by those of many other scientists, but no one was
having any appreciable measure of success. The difficulty lay in the fact that everyone
was looking for the creation of a steady current. Perhaps the most significant discovery
had been made by Arago in 1824. He suspended a magnetic compass needle over a
copper plate and set it into oscillation, noting that the presence of the copper plate
enhanced the damping. Upon eliminating air disturbances and rotating the copper
plate, Arago was able to make the needle revolve also, and even showed that this
dragging effect depended on the conductivity of the rotating plate. Faraday repeated
Arago's experiment in 1825 but, despite the suggestiveness of the results, the true
explanation of the phenomenon eluded both investigators.
Finally, on August 29, 1831, six years after his first attempt, Faraday discovered the
effect he had been seeking. His notes for that day state:¹
Have had an iron ring made (soft iron), iron round and 7/8 inches thick and ring 6 inches in
external diameter. Wound many coils of copper wire round one half, the coils being separated by twine and calico-there were three lengths of wire
each about 24 feet long and they could be connected as one
length or used as separate lengths. By trial with a trough each
was insulated from the other. Will call this side of the ring A.
On the other side but separated by an interval was wound wire
in two pieces together amounting to about 60 feet in length, the
direction being as with the former coils; this side call B.
Charged a battery of 10 pr. plates 4 inches square. Made the
coil on B side one coil and connected its extremities by a copper
wire passing to a distance and just over a magnetic needle (3
feet from iron ring). Then connected the ends of one of the
pieces on A side with battery; immediately a sensible effect on
needle. It oscillated and settled at last in original position. On breaking connection of A
side with Battery again a disturbance of the needle.
Made all the wires on A side one coil and sent current from battery through the whole.
Effect on needle much stronger than before.

1 Faraday's Diary, vol. 1, p. 367. Published by G. Bell and Sons, Ltd., London, 1932.


This discovery of transformer action quickly led Faraday to an appreciation of the
entire effect. On September 24th he tried a different experiment. Using a remote helix
and compass needle as indicator, he wrapped a helical coil around a soft iron cylinder
and built up an apparatus which he described as follows:²
The iron cylinder and helix . . . . All the wires made into one helix and
these connected with the indicating helix at distance by copper wire: Then the
iron placed between the poles of bar magnets as . . . in fig. Every time the
magnetic contact at N or S was made or broken there was magnetic motion
at the indicating helix, the effect being as in former cases not permanent, but
a mere momentary push or pull. But if the electric communication (i.e. by the
copper wire) was broken then these disjunctions and contacts produced no effect
whatever. Hence here distinct conversion of Magnetism into Electricity.

On October 1st Faraday repeated the transformer experiment but
with a wooden core, and once again obtained the same effect, though
enough weaker that he had to substitute a galvanometer for the indicating helix. He concluded: "Hence there is an inducing effect without the
presence of iron . . . ."
Finally, on October 17th, Faraday performed the most significant experiment of all. He prepared a helical wire in the form of a cylinder and then³
. . . . a cylindrical bar magnet 3/4 inch in diameter and 8 1/2 inches in length had one end
just inserted into the end of the helix cylinder-then it was quickly thrust in the whole
length and the galvanometer needle moved-then pulled out and again the needle moved but
in the opposite direction. This effect was repeated every time the magnet was put in or out
and therefore a wave of Electricity was so produced from mere approximation of a magnet
and not from its formation in situ.

As noted in Chapter 3, Faraday preferred to think of all electric and magnetic effects
in terms of lines of force, having been first attracted to this view by observing the
disposition of iron filings in the neighborhood of a permanent magnet. He thus sought
to explain this new phenomenon of induced electricity in terms of an interaction with
magnetic flux lines. His raw thoughts on this subject are contained in an entry in the
laboratory notebook dated August 1, 1851, which contains the passages⁴
The force of a given magnet is definite and may be considered as represented by its
curves . . . . The curves . . . exist within the magnet as well as without; but within they
are in the contrary or return direction . . . . Whatever the condition of the interior of the
magnet: it has . . . the same kind and amount of power as the outside, and so is in full
analogy and similitude with an electro helix.
The intensity of the curves of a magnet vary greatly at different distances from the
magnet . . . . But the amount of force is definite and the same for every section of all the
curves . . . .
Hence it follows that whether the curves are intersected directly or obliquely makes no
difference provided they are intersected. The effect depends upon the number of curves
intersected. A wire moving obliquely may intersect fewer curves and therefore have a
feebler current evolved in it; but if it intersected only the same curves directly across,
it would have no larger a current.
So with a given moving wire or with a given wire under which a magnet is moving, the
quantity of electricity generated is directly as the amount of curves passed over or through.
With the same curves therefore it varies directly with the velocity of the motion.

2 Ibid., vol. 1, p. 372.
3 Ibid., vol. 1, pp. 375-376.
4 Ibid., vol. 5, pp. 409-411.

This explanation of induction as being due to the relative motion between magnetic
lines of force and a conductor was refined by Faraday and included in a paper read
to the Royal Society later that year.⁵ It was given mathematical articulation by Maxwell as the equation

$$e = -\frac{d}{dt}\int_S B_n\,dS$$

in which e is the emf induced in a contour C and $\int_S B_n\,dS$ is the total magnetic flux
enclosed by C. If the contour C is occupied by a conductor, e is the source of the resulting induced electric current. In the above, S is an open surface erected on C as boundary, and $B_n$ is the normal component of flux density, thus representing the number of
magnetic lines of force per unit area. This famous equation is known as Faraday's emf
law.
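
Faraday's emf law can be illustrated with a short numerical sketch. The example below is an assumption-laden illustration and not drawn from the historical record: it takes a plane loop of area S rotating at angular rate $\omega$ in a uniform field B, so that the enclosed flux is $BS\cos\omega t$, and compares the emf obtained by differentiating the flux numerically with the closed form $BS\omega\sin\omega t$. The function names and numerical values are hypothetical.

```python
import math


def flux(t, b, area, omega):
    """Flux through a plane loop whose normal makes the angle omega*t with B."""
    return b * area * math.cos(omega * t)


def emf_numeric(t, b, area, omega, h=1e-6):
    """e = -d(flux)/dt, estimated by a central difference."""
    return -(flux(t + h, b, area, omega) - flux(t - h, b, area, omega)) / (2 * h)


def emf_analytic(t, b, area, omega):
    """Closed form of -d/dt [B S cos(omega t)]."""
    return b * area * omega * math.sin(omega * t)


# Hypothetical loop: 0.5 weber/m^2, 0.01 m^2, turning at 50 revolutions per second.
b, area, omega = 0.5, 0.01, 2 * math.pi * 50
for t in (0.001, 0.003, 0.005):
    print(t, emf_numeric(t, b, area, omega), emf_analytic(t, b, area, omega))
```

The emf follows the rate at which flux lines are cut, in keeping with Faraday's remark that the effect varies directly with the velocity of the motion.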
After his initial discovery of induction, Faraday continued to experiment with the
phenomenon. On October 28, 1831, he invented the first direct-current generator,
consisting of a copper plate rotating between magnetic poles, with an external circuit
attached between the center and rim of the plate. Through the years Faraday designed
and tested a variety of such generators, and his entry for October 11, 1851 describes
a machine consisting of a rotating wire rectangle with a commutator attached, this being
the prototype of the modern electric generator.
Faraday also discovered the phenomenon of self-induction (in 1834), unaware that
Joseph Henry (1797-1878) had made an independent discovery of the effect two years
earlier.†
It is impossible in a survey this brief to do justice to the painstaking, thorough
manner in which Faraday carefully built and enlarged his knowledge of electrical
phenomena. The interested reader is encouraged to read extensive sections of Faraday's
laboratory notebook in order to gain a full flavor of his accomplishments. As for the
discovery of electromagnetic induction itself, this must rank as one of the most important contributions ever made to scientific knowledge.

† In fairness to Henry, it should be stated that during this period he and Faraday independently discovered many important electromagnetic phenomena, including self- and mutual induction and many
of the principles of electric machines. Henry also developed the electromagnetic relay, perfected an
electromagnetic telegraph, and showed that voltage could be stepped up or down by properly proportioning the coils in a transformer. Henry's lack of promptness in announcing the results of his
experiments has probably been the primary cause of his neglect, but the remoteness of the New World
from the Old, in those days of slow communications, was a contributing factor. Faraday's achievements were more promptly disseminated to the European centers of learning, and news of Henry's
accomplishments often bore the appearance of mere confirmation of what Faraday had already done.
In the stimulation of further scientific inquiry by others, Faraday's influence was inestimably greater.
5 M. Faraday, "On Lines of Magnetic Force; Their Definite Character; and Their Distribution Within
a Magnet and Through Space," Phil Trans Roy Soc (London), 142, 25-56; 1852.


Although Faraday's law of induction was readily accepted, his explanation in terms
of lines of force fell mainly on deaf ears. The scientists of his day had been reared on
theories of action at a distance, theories which had enjoyed wide success in describing
a variety of electric and magnetic phenomena, as well as gravitational effects. The
eminent Astronomer Royal, Sir George Biddell Airy, declared that he could "hardly
imagine anyone who knows the agreement between observation and calculation based
on action at a distance to hesitate an instant between this simple and precise action
on the one hand and anything so vague and varying as lines of force on the other."
Maxwell was only twenty-four when he undertook to overcome this objection and
place Faraday's ideas on a firm mathematical basis. In the introduction to his first
paper on electricity, he stated that⁶
. . . the limit of my design is to show how, by a strict application of the ideas and methods
of Faraday, the connection of the very different orders of phenomena which he has discovered may be clearly placed before the mathematical mind.

After defining a single line of force as a curve in space whose direction at each point is
that of the force on a positive charge, or the force on an elementary north magnetic
pole, whichever the case may be, Maxwell continued
. . . We might in the same way draw other lines of force, till we had filled all space with
curves indicating by their direction that of the force at any assigned point.
We should thus obtain a geometrical model of the physical phenomena, which would
tell us the direction of the force, but we should still require some method of indicating the
intensity of the force at any point. If we consider these curves not as mere lines, but as
fine tubes of variable section carrying an incompressible fluid, then, since the velocity of
the fluid is inversely as the section of the tube, we may make the velocity vary according
to any given law, by regulating the section of the tube, and in this way we might represent
the intensity of the force as well as its direction by the motion of the fluid in these tubes.

Maxwell then pointed out that if the force law involves distance to the inverse square,
there would be no interstices between his tubes of force.
. . . The tubes will then be mere surfaces, directing the motion of a fluid filling up the
whole space. It has been usual to commence the investigation of the laws of these forces
by at once assuming that the phenomena are due to attractive or repulsive forces acting
between certain points. We may, however, obtain a different view of the subject, and one
more suited to our difficult inquiries, by adopting for the definition of the forces of which
we treat, that they may be represented in magnitude and direction by the uniform motion
of an incompressible fluid.

With this conception, Maxwell proceeded to show that all results obtained for static
charges or permanent magnets, using action-at-a-distance formulas, were also obtainable in terms of the distribution of tubes of force. Upon pointing out the equivalence
of a steady current element and a magnetic dipole, he was also able to extend this conclusion to magnetic phenomena caused by time-independent currents. However, in
discussing induced electric currents, Maxwell admitted
6 J. C. Maxwell, "On Faraday's Lines of Force," read to the Cambridge Philosophical Society on
December 10, 1855 and February 11, 1856. Reprinted in Scientific Papers, vol. 1, pp. 155-229, Cambridge University Press, London, 1890.


. . . The idea of the electro-tonic state,† however, has not yet presented itself to my mind
in such a form that its nature and properties may be clearly explained without reference
to mere symbols, and therefore I propose in the following investigation to use symbols
freely, and to take for granted the ordinary mathematical operations. By a careful study
of the laws of elastic solids and of the motions of viscous fluids, I hope to discover a method
of forming a mechanical conception of this electro-tonic state adapted to general reasoning.

Maxwell then concluded this first paper with an extensive mathematical development
in which the vector potential emerged as being representative of the electrotonic state,
its curl giving the magnetic field, and its time derivative yielding the induction effect.
He also showed that the curl of the magnetic field at any point was equal to the current
density at that point.
This first electrical paper by Maxwell can fairly be described as principally achieving
mathematical expression for all known electric and magnetic phenomena in terms of
Faraday's physical conceptions. It exhibits Maxwell's characteristic fondness for
models, a fondness which had led him to construct a top to illustrate the dynamics of a
rigid body rotating about a fixed point, and to construct a model of Saturn's rings
(now in the Cavendish Laboratory) to illustrate the motion of the satellites in the
rings. This rich physical imagination was now to lead Maxwell to his most important
discovery, through an extension of the tube of force model so as to explain the electrotonic state. This extension was accomplished in a second paper which appeared six
years later in the Philosophical Magazine, in which he offered the introductory remark⁷
I propose now to examine magnetic phenomena from a mechanical point of view, and to
determine what tensions in, or motions of, a medium are capable of producing the mechanical phenomena observed. If, by the same hypothesis, we can connect the phenomena of
magnetic attraction with electromagnetic phenomena and with those of induced currents,
we shall have found a theory which, if not true, can only be proved to be erroneous by
experiments which will greatly enlarge our knowledge of this part of physics.

It has already been noted that Faraday looked upon electrostatic and magnetic induction as taking place along curved lines of force. He imagined these lines to be ropes of
molecules starting from a charged conductor or magnet, and acting on other nearby
bodies. These ropes of molecules were in tension, tending to shorten and at the same
time bulge out laterally. Thus the charged conductor or magnet tends to draw bodies
to itself, contracting its lines of force like the fibers of a muscle. Maxwell sought to
represent this longitudinal tension and transverse pressure in terms of equivalent
conditions in a fluid medium.
Let us now suppose that the phenomena of magnetism depend on the existence of a
tension in the direction of the lines of force, combined with a hydrostatic pressure; or in
other words, a pressure greater in the equatorial than in the axial direction: the next question
is, what mechanical explanation can we give of this inequality of pressures in a fluid or
mobile medium? The explanation which most readily occurs to the mind is that the excess
of pressure in the equatorial direction arises from the centrifugal force of vortices or eddies
in the medium having their axes in directions parallel to the lines of force . . . .

† Faraday called the state into which any body was thrown, due to the presence of a magnetic field,
the electrotonic state, and explained induction as being due to changes in the electrotonic state.
7 J. C. Maxwell, "On Physical Lines of Force," Phil Mag, 21, 161-175, 281-291, 338-348; 1861.
Reprinted in Scientific Papers, vol. 1, pp. 451-513, Cambridge University Press, London, 1890.


We shall suppose at present that all the vortices in any one part of the field are revolving in the same direction about axes nearly parallel, but that in passing from one part
of the field to another, the direction of the axes, the velocity of rotation, and the density
of the substance of the vortices are subject to change. We shall investigate the resultant
mechanical effect upon an element of the medium, and from the mathematical expression
of this resultant we shall deduce the physical character of its different component parts.

In order to have adjacent vortices rotating in the same direction, Maxwell next supposed that there exist between them a large number of minute spherical bodies which
roll, without sliding, in contact with the surfaces of the vortices. These particles, which
Maxwell assumed to constitute electricity, thus play the role of idler wheels. Under
this construction, for example, the static magnetic field of a permanent magnet can be
envisioned as consisting of vortices which fill the tubes of force, with the rotational
velocity of a vortex proportional to the strength of the field and thus varying with tube
cross section. With adjacent vortices in the magnetic field rotating at the same speed
in the same direction, the particles between them rotate idly but remain in the same
position. However, if a change should occur in the magnetic field, this would mean that
one of the vortices began rotating faster than the other, and thus the particles between
them would change position, indicating an electric current. In this way, Maxwell's
model demonstrated the creation of electric currents due to changes in the magnetic
field; hydrodynamical considerations of the relations between the rotational velocities
of adjacent vortices and the displacement of the idler particles led to a mathematical
statement of Faraday's emf law.
It was precisely at this point that the great value of the model became apparent.
If a change in vortex motion can cause a displacement of the idler particles, then the
converse should be true-a displacement of the idler particles should occasion a change
in vortex motion. Cause and effect are interchangeable. A changing magnetic field can
create an electric field; a changing electric field should produce a magnetic field. Maxwell was reaching the heart of his greatest contribution when, in Part 3 of the paper, he
said⁸
According to our theory, the particles which form the partitions between the cells (vortices) constitute the matter of electricity. The motion of these particles constitutes an
electric current; the tangential force with which the particles are pressed by the matter of
the cells is electromotive force, and the pressure of the particles on each other corresponds
to the tension or potential of the electricity.
If we can now explain the condition of a body with respect to the surrounding medium
when it is said to be "charged" with electricity, and account for the force acting between
electrified bodies, we shall have established a connexion between all the principal phenomena of electrical science.

After pointing out that electromotive force (voltage due to magnetic effects) is the
same thing as electric tension (voltage due to charge separation), Maxwell distinguished
between conductors and insulators, concluding
Here then we have two independent qualities of bodies, one by which they allow of the
passage of electricity through them, and the other by which they allow of electrical action
being transmitted through them without any electricity being allowed to pass. A conducting body may be compared to a porous membrane which opposes more or less resistance to the passage of a fluid, while a dielectric is like an elastic membrane which may be
impervious to the fluid, but transmits the pressure of the fluid on one side to that on the
other.

8 Ibid., p. 490.

Maxwell next discussed the relation between conduction current and potential in a
conductor and then went on to say
Electromotive force acting on a dielectric produces a state of polarization of its parts
. . . . In a dielectric under induction, we may conceive that the electricity in each molecule
is so displaced that one side is rendered positively, and the other negatively electrical, but
that the electricity remains entirely connected with the molecule, and does not pass from
one molecule to another.
The effect of this action on the whole dielectric mass is to produce a general displacement of the electricity in a certain direction. This displacement does not amount to a
current, because when it has attained a certain value it remains constant, but it is the
commencement of a current, and its variations constitute currents in the positive or negative direction, according as the displacement is increasing or diminishing. The amount of
the displacement depends on the nature of the body, and on the electromotive force . . . .

Thus Maxwell introduced for the first time the concept that variations in position of
bound charge were equivalent in their effect to a conduction current. By letting motion
of the idler particles of his model represent either or both, and finding the variation in vortex velocity due to a particle displacement, he arrived at a generalization of
Ampere's circuital law.
The importance of this generalization cannot be overstated. If motion of the idler
particles could only represent conduction current, then an electrical disturbance could
only propagate through a conductive medium. But with the concept of displacement
current, field changes could be transmitted through dielectric media, including air, and
even including free space (which Maxwell considered to be an ether).
Maxwell recognized that a finite velocity would be associated with the propagation
of any disturbance through his model medium. He described the mechanism of propagation by imagining that a translational motion of one layer of idler particles would
initiate a change in angular velocity of the contiguous vortices. These in turn would set
the next layer of idler particles into translational motion, and in this manner the
disturbance would be transferred through a sequence of layers. Maxwell computed
the kinetic and potential energy which were transferred in this fashion, thus obtaining
a velocity of transport. By associating kinetic energy and potential energy with the
magnetic and electric fields respectively, he deduced that the velocity of propagation
of an electromagnetic disturbance was governed by the electrostatic permittivity and
magnetostatic permeability of the supporting medium. Upon using the values for
these constants, determined for air by Kohlrausch and Weber, Maxwell deduced that
the velocity of an electromagnetic disturbance should be 193,088 mi/sec. He then
concluded
. . . the velocity of light in air, as determined by M. Fizeau, is . . . 195,647 miles per second. The velocity of transverse undulations in our hypothetical medium . . . agrees so
exactly with the velocity of light calculated from the optical experiments of M. Fizeau, that
we can scarcely avoid the inference that light consists in the transverse undulations of the same
medium which is the cause of electric and magnetic phenomena.

This discovery may be likened to an earlier occasion when Newton first tested his
law of universal gravitation by making calculations on the distance of the moon. It
was Newton's misfortune to use an inaccurate value for the diameter of the earth, and
this led to such poor agreement that he put the theory aside for nearly two decades.
Maxwell was spared a similar disappointment in that both his value and Fizeau's
were in error in the same direction.
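
The relation Maxwell used, that the velocity of propagation is fixed by the permittivity and permeability of the medium, can be illustrated with a short calculation. The sketch below is only an illustration: it assumes the modern free-space values of the two constants (which are not the Kohlrausch and Weber figures Maxwell had available) and compares the result with the two velocities quoted above.

```python
import math

# Assumed modern free-space constants (not the 1856 measurements).
MU_0 = 4.0e-7 * math.pi        # permeability, H/m
EPSILON_0 = 8.8541878128e-12   # permittivity, F/m
METERS_PER_MILE = 1609.344


def wave_speed(mu, epsilon):
    """Velocity of an electromagnetic disturbance, 1/sqrt(mu * epsilon), in m/s."""
    return 1.0 / math.sqrt(mu * epsilon)


c_mi = wave_speed(MU_0, EPSILON_0) / METERS_PER_MILE

# Values quoted in the text, in miles per second.
maxwell_value = 193_088.0
fizeau_value = 195_647.0

for label, value in (("Maxwell", maxwell_value), ("Fizeau", fizeau_value)):
    excess = 100.0 * (value - c_mi) / c_mi
    print(f"{label}: {value:.0f} mi/sec, {excess:.1f}% above the modern value")
```

Both historical figures come out a few percent above the value computed from modern constants, which is consistent with the remark that the two errors lay in the same direction.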
It should be remembered that at this time no one had ever wittingly generated or
detected electromagnetic waves. The concept was completely new, as was the notion
of a displacement current. To link light to these hypothetical phenomena was a flash
of brilliance seldom equalled in the history of science. It was not to be until eight years
after Maxwell's death that these hypotheses would receive substantiation through the
experiments of Hertz.
Maxwell next discarded the model which had served so well as a scaffolding with
which to erect his theory, and in a third paper entitled "A Dynamical Theory of the
Electromagnetic Field," presented the theory completely in electrical terms.⁹ The
properties of the field are described in terms of 20 equations, which include the relation
between displacement current and conduction current, and the continuity equation
linking charge to current, as well as what are now known conventionally as Maxwell's
equations. This paper was so carefully written that it later appears almost intact in
his Treatise.
These accomplishments, added to his contributions in color vision and molecular
theory, have earned Maxwell the place as the greatest theoretical physicist of the nineteenth century. At a centenary in 1931 honoring his birth, Max Planck reviewed the
evolution of man's knowledge of electrical phenomena and concluded¹⁰ by saying of
Maxwell
. . . it was his task to build and complete the classical theory, and in so doing he achieved
greatness unequalled. His name stands magnificently over the portal of classical physics,
and we can say this of him: by his birth, James Clerk Maxwell belongs to Edinburgh,
by his personality he belongs to Cambridge, by his work he belongs to the whole world.

Maxwell's equations, as has already been noted in Chapter 2, played a central role
in the development of the theory of special relativity. Lorentz used them as an invariant
to derive the transformation which bears his name, and Einstein devoted much of his
first paper to the same subject. In the sections to follow, this process will in effect be
reversed. The Lorentz transformation has already been established in terms of fundamental considerations of length and time measurements. The Lorentz equations will be
used to derive Maxwell's equations from a transformation of Coulomb's law.

5.2 THE TRANSFORMATION EQUATIONS FOR ELECTRIC AND MAGNETIC FIELDS

Suppose that an observer O', stationary in X'Y'Z', has created the most general composite
electrostatic and magnetostatic fields, E'(x',y',z') and B'(x',y',z'). He can do this
through the use of three types of sources: (1) a static system of charges, (2) a steady
current consisting of the flow of uncompensated charges, and (3) a steady current
against the background of static compensating charges. This situation is suggested in
Figure 5.1. Formulas for the static fields arising from a composition of these three
types of sources were given in Section 4.10. If, in the presence of these fields, a charge q
is moving through X'Y'Z' at a velocity v', observer O' can say that the force on q is

$$\mathbf{F}' = q(\mathbf{E}' + \mathbf{v}' \times \mathbf{B}') \tag{5.1}$$

which is a use of the Lorentz force law.

FIGURE 5.1 Composite sources causing most general static fields B'(x',y',z') and E'(x',y',z') which interact with a moving charge q. [Figure: static charges, uncompensated circulating charges, and compensated circulating charges in X'Y'Z', with the charge q at (x',y',z') moving at v'(t').]

9 First read to the Royal Society in 1864. Published in Phil Trans Roy Soc (London), 155; 1865.
Reprinted in Scientific Papers, vol. 1, pp. 526-597, Cambridge University Press, London, 1890.
10 James Clerk Maxwell, A collection of commemorative essays, p. 65, The Macmillan Company,
New York, 1931.


Imagine that a second observer O is stationary in a frame XYZ which is in constant
motion with respect to X'Y'Z' such that the respective axes are aligned and X is sliding
along the -X' axis at speed u. Then the Lorentz transformation equations (2.40) are
applicable and observer O will deduce that the force on q is F, where

$$\mathbf{F} = \left[\mathbf{1}_x F'_x + K(\mathbf{1}_y F'_y + \mathbf{1}_z F'_z)\right] + K\frac{u}{c^2}\,\mathbf{v} \times (\mathbf{1}_x \times \mathbf{F}') \tag{5.2}$$

in which $K = (1 - u^2/c^2)^{-1/2}$ and v(t) is the velocity of q in XYZ. Equation (5.2) is
merely a restatement of (2.76) and the present development in some ways parallels
the opening development of Chapter 4. Two cases of (5.2) will now be considered.

Case 1: $\mathbf{v} \equiv 0$.

In this case, q is static in XYZ and the force F is just the bracketed term in (5.2).
Observer O, who is accustomed to the idea that magnetic fields exert forces only on
moving charges, will ascribe this force to an electric field, since q is not moving relative
to O. Thus he defines an electric field such that

$$\mathbf{F} = q\mathbf{E} = \mathbf{1}_x F'_x + K(\mathbf{1}_y F'_y + \mathbf{1}_z F'_z) \tag{5.3}$$

This electric field depends on time as well as spatial position because the sources of O'
are moving relative to O.
With q stationary in XYZ, $\mathbf{v}' = -\mathbf{1}_x u$ and (5.1) gives

$$F'_x = qE'_x \qquad F'_y = q(E'_y + uB'_z) \qquad F'_z = q(E'_z - uB'_y) \tag{5.4}$$

Insertion of (5.4) in (5.3) yields the field transformation equations

$$E_x = E'_x \qquad E_y = K(E'_y + uB'_z) \qquad E_z = K(E'_z - uB'_y) \tag{5.5}$$

The electric field in XYZ is seen to be contributed to by both the electric and magnetic
fields of X'Y'Z', and in relative amounts controlled by u.

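Case 2: $\mathbf{v} \not\equiv 0$.
In this case q has an arbitrary motion v(t) in XYZ and the force F is the entire
expression (5.2). The charge q also has an arbitrary motion v'(t') in X'Y'Z', and
(5.1) gives

$$\begin{aligned}
F'_x &= q(E'_x + v'_y B'_z - v'_z B'_y) \\
F'_y &= q(E'_y + v'_z B'_x - v'_x B'_z) \\
F'_z &= q(E'_z + v'_x B'_y - v'_y B'_x)
\end{aligned} \tag{5.6}$$

If the velocity transformation equations (2.50) are used to replace the components of
v' by those of v in (5.6), and the resulting equations are inserted in (5.2), one obtains

$$\begin{aligned}
\mathbf{F} = q\Big\{ &\mathbf{1}_x\Big[E'_x + \frac{Ku}{c^2}(v_y E'_y + v_z E'_z) + K(v_y B'_z - v_z B'_y)\Big] \\
+\ &\mathbf{1}_y\Big[K\Big(1 - \frac{uv_x}{c^2}\Big)E'_y + v_z B'_x - K(v_x - u)B'_z\Big] \\
+\ &\mathbf{1}_z\Big[K\Big(1 - \frac{uv_x}{c^2}\Big)E'_z - v_y B'_x + K(v_x - u)B'_y\Big]\Big\}
\end{aligned} \tag{5.7}$$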

Observer O can introduce a magnetic field B(x,y,z,t) to account for the fact that the
force on q is different because it is now in motion through XYZ. This magnetic field
will be time-varying because the sources of O' are moving relative to O. If observer O
chooses to define B(x,y,z,t) such that the Lorentz force law is still valid, then he can
write

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{5.8}$$

in which qE is the force on q when it is at rest in XYZ, and thus E is as given in Case 1
by Equations (5.5). The magnetic field B must be such that (5.7) and (5.8) equate.
Upon comparing components of these two equations, one finds that

$$B_x = B'_x \qquad B_y = K\Big(B'_y - \frac{u}{c^2}E'_z\Big) \qquad B_z = K\Big(B'_z + \frac{u}{c^2}E'_y\Big) \tag{5.9}$$
These transformation equations, together with (5.5), form a set from which O can
determine the fields which interact with q to produce the force given by the Lorentz
equation (5.8). The sources which have produced these fields are of a restricted class,
being time-independent in X'Y'Z', but this restriction will be lifted shortly.
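
The field transformation lends itself to a quick numerical check. The following sketch codes the electric-field relations (5.5) together with the magnetic-field relations implied by equating (5.7) and (5.8); the function name `field_transform` and the sample field values are hypothetical, chosen only for illustration. It then confirms two things one would expect of such a transformation: the combinations E·B and E² − c²B² are the same for both observers, and reversing the sign of u undoes the transformation.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def field_transform(E_primed, B_primed, u):
    """Fields deduced by O from the fields of O', for relative speed u along x."""
    K = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    Exp, Eyp, Ezp = E_primed
    Bxp, Byp, Bzp = B_primed
    E = (Exp, K * (Eyp + u * Bzp), K * (Ezp - u * Byp))
    B = (Bxp, K * (Byp - u * Ezp / C**2), K * (Bzp + u * Eyp / C**2))
    return E, B


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Arbitrary illustrative values (V/m and tesla) and relative speed.
E_primed = (1.0e3, -2.0e3, 0.5e3)
B_primed = (2.0e-6, 1.0e-6, -3.0e-6)
u = 0.6 * C

E, B = field_transform(E_primed, B_primed, u)

# Two combinations that should agree for both observers.
e_dot_b = (dot(E_primed, B_primed), dot(E, B))
e2_minus_c2b2 = (dot(E_primed, E_primed) - C**2 * dot(B_primed, B_primed),
                 dot(E, E) - C**2 * dot(B, B))

# Reversing the sign of u should recover the primed fields.
E_back, B_back = field_transform(E, B, -u)
```

The last step reuses the same function with −u rather than coding a separate inverse, which is a design choice of this sketch and relies on the symmetry between the two observers.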
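Upon properly combining (5.5) and (5.9) one can establish that the inverse transformation is

$$E'_x = E_x \qquad E'_y = K(E_y - uB_z) \qquad E'_z = K(E_z + uB_y) \tag{5.10}$$

$$B'_x = B_x \qquad B'_y = K\Big(B_y + \frac{u}{c^2}E_z\Big) \qquad B'_z = K\Big(B_z - \frac{u}{c^2}E_y\Big) \tag{5.11}$$

5.3 THE TRANSFORMATION EQUATIONS FOR THE SOURCE DENSITIES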

It will be desirable to relate the fields E(x,y,z,t) and B(x,y,z,t) to the time-independent
sources ρ'(x',y',z') and ι'(x',y',z') created by O'. However, it is convenient first to transform these sources into their XYZ equivalents. This can be done by considering an
arbitrary volume element $dV_0$, in which an amount of charge $\rho_0\,dV_0$ is at rest. This
volume element is assumed to be moving through X'Y'Z' at velocity w', and through
XYZ at velocity w. To O', this volume element has a size $dV' = dV_0[1 - (w')^2/c^2]^{1/2}$
and to O it has a size $dV = dV_0(1 - w^2/c^2)^{1/2}$. Since charge is an invariant,

$$\rho'\,dV' = \rho_0\,dV_0 = \rho\,dV$$

from which

$$\frac{\rho}{\rho'} = \frac{dV'}{dV} = \left[\frac{1 - (w')^2/c^2}{1 - w^2/c^2}\right]^{1/2} \tag{5.12}$$

The velocity transformation equations (2.50) give

$$w^2 = \Big(1 + \frac{uw'_x}{c^2}\Big)^{-2}\Big\{\big[(w')^2 - (w'_x)^2\big]\Big(1 - \frac{u^2}{c^2}\Big) + (w'_x + u)^2\Big\}$$

which may be used to convert (5.12) to the form

$$\rho(x,y,z,t) = K\rho'(x',y',z')\Big(1 + \frac{uw'_x}{c^2}\Big) \tag{5.13}$$

Since ι(x,y,z,t) = ρ(x,y,z,t)w(x,y,z), another use of the velocity transformation (2.50)
yields

$$\boldsymbol{\iota}(x,y,z,t) = \mathbf{1}_x K\rho'(w'_x + u) + \mathbf{1}_y \rho' w'_y + \mathbf{1}_z \rho' w'_z \tag{5.14}$$

Equations (5.13) and (5.14) are called the source transformation equations and will be
of assistance in the determination of the dependence of E and B on the sources.
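
The source transformation can be spot-checked numerically. The sketch below assumes the usual relativistic velocity-addition form for the velocity transformation (2.50), which is not reproduced in this section, and uses hypothetical sample values throughout. It transforms a charge density and drift velocity from X'Y'Z' to XYZ, forms the current density as charge density times velocity, and confirms that c²ρ² − ι² has the same value for both observers.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(u):
    """K = (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def velocity_in_xyz(w_primed, u):
    """Velocity seen by O; assumed form of the velocity transformation (2.50)."""
    K = lorentz_factor(u)
    wx, wy, wz = w_primed
    d = 1.0 + u * wx / C**2
    return ((wx + u) / d, wy / (K * d), wz / (K * d))


def charge_density_in_xyz(rho_primed, w_primed, u):
    """Charge density seen by O, equation (5.13)."""
    return lorentz_factor(u) * rho_primed * (1.0 + u * w_primed[0] / C**2)


# Hypothetical sample values: charge density in C/m^3, velocities in m/s.
rho_primed = 2.0e-6
w_primed = (0.3 * C, 0.2 * C, -0.1 * C)
u = 0.5 * C

rho = charge_density_in_xyz(rho_primed, w_primed, u)
w = velocity_in_xyz(w_primed, u)
iota = tuple(rho * wi for wi in w)                      # current density seen by O
iota_primed = tuple(rho_primed * wi for wi in w_primed)  # current density seen by O'

# Closed form that follows from iota = rho * w and the velocity transformation.
K = lorentz_factor(u)
iota_closed_form = (K * (iota_primed[0] + u * rho_primed),
                    iota_primed[1], iota_primed[2])

invariant_primed = (C * rho_primed) ** 2 - sum(i * i for i in iota_primed)
invariant = (C * rho) ** 2 - sum(i * i for i in iota)
```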

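5.4 MAXWELL'S EQUATIONS

The relations between the static fields E'(x',y',z'), B'(x',y',z') and the time-independent
sources ρ'(x',y',z'), ι'(x',y',z') are already known, being given by

$$\nabla' \cdot \mathbf{E}' = \rho'/\epsilon_0 \qquad \nabla' \times \mathbf{E}' \equiv 0 \qquad \nabla' \times \mathbf{B}' = \boldsymbol{\iota}'/\mu_0^{-1} \qquad \nabla' \cdot \mathbf{B}' \equiv 0 \tag{5.15}$$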

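If all the quantities in these four equations are converted to their XYZ equivalents,
with the assistance of the transformations developed in Sections 5.2 and 5.3, the result
will be a set of equations in which the dependence of E and B on the sources is displayed. To see how this is accomplished, consider any function f of the four coordinate
variables. Upon making use of the Lorentz Equations (2.40), one can establish that

$$\begin{aligned}
\frac{\partial f}{\partial x'} &= \frac{\partial f}{\partial x}\frac{dx}{dx'} + \frac{\partial f}{\partial t}\frac{dt}{dx'} = K\frac{\partial f}{\partial x} + \frac{Ku}{c^2}\frac{\partial f}{\partial t} \\
\frac{\partial f}{\partial t'} &= \frac{\partial f}{\partial t}\frac{dt}{dt'} + \frac{\partial f}{\partial x}\frac{dx}{dt'} = K\frac{\partial f}{\partial t} + Ku\frac{\partial f}{\partial x} \\
\frac{\partial f}{\partial y'} &= \frac{\partial f}{\partial y} \qquad \frac{\partial f}{\partial z'} = \frac{\partial f}{\partial z}
\end{aligned} \tag{5.16}$$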

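Application of these formulas to the curl of E' gives terms such as

$$\frac{\partial E'_x}{\partial z'} - \frac{\partial E'_z}{\partial x'} = \frac{\partial E'_x}{\partial z} - K\frac{\partial E'_z}{\partial x} - \frac{Ku}{c^2}\frac{\partial E'_z}{\partial t}$$

which, with the use of (5.10), can be written

$$\frac{\partial E'_x}{\partial z'} - \frac{\partial E'_z}{\partial x'} = \frac{\partial E_x}{\partial z} - K^2\Big(\frac{\partial E_z}{\partial x} + u\frac{\partial B_y}{\partial x}\Big) - \frac{K^2u}{c^2}\Big(\frac{\partial E_z}{\partial t} + u\frac{\partial B_y}{\partial t}\Big)$$

Upon determining all three components in this manner, one may write

$$\begin{aligned}
\nabla' \times \mathbf{E}' \equiv 0 = {}&\mathbf{1}_x\Big[K\Big(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\Big) + Ku\Big(\nabla \cdot \mathbf{B} - \frac{\partial B_x}{\partial x}\Big)\Big] \\
+\ &\mathbf{1}_y\Big[\Big(\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x}\Big) + (1 - K^2)\frac{\partial E_z}{\partial x} - K^2u\frac{\partial B_y}{\partial x} - \frac{K^2u}{c^2}\frac{\partial E_z}{\partial t} - \frac{K^2u^2}{c^2}\frac{\partial B_y}{\partial t}\Big] \\
+\ &\mathbf{1}_z\Big[\Big(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\Big) - (1 - K^2)\frac{\partial E_y}{\partial x} - K^2u\frac{\partial B_z}{\partial x} + \frac{K^2u}{c^2}\frac{\partial E_y}{\partial t} - \frac{K^2u^2}{c^2}\frac{\partial B_z}{\partial t}\Big]
\end{aligned} \tag{5.17}$$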

This result can be simplified considerably. If f is any function of x', y', z' but not of t',
it follows from the second of Equations (5.16) that

$$\frac{\partial f}{\partial t} = -u\frac{\partial f}{\partial x}$$
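
This relation between the time and space derivatives of a quantity that is static in X'Y'Z' can be checked by finite differences. The sketch below assumes the Lorentz equation x' = K(x − ut), which is consistent with the statement in Case 1 that a charge at rest in XYZ has velocity −1ₓu in X'Y'Z', and it uses a hypothetical Gaussian profile as the static function.

```python
import math

C = 299_792_458.0                        # speed of light in m/s
u = 0.4 * C                              # relative speed of the two frames
K = 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def f_primed(xp, yp, zp):
    """Hypothetical profile that is static in X'Y'Z' (lengths in meters)."""
    return math.exp(-(xp**2 + yp**2 + zp**2))


def f(x, y, z, t):
    """The same quantity expressed in XYZ, assuming x' = K(x - ut), y' = y, z' = z."""
    return f_primed(K * (x - u * t), y, z)


def central_difference(func, value, h):
    return (func(value + h) - func(value - h)) / (2.0 * h)


x0, y0, z0, t0 = 0.7, 0.2, -0.1, 1.0e-9
df_dx = central_difference(lambda x: f(x, y0, z0, t0), x0, 1.0e-4)
df_dt = central_difference(lambda t: f(x0, y0, z0, t), t0, 1.0e-12)
```

The step sizes are chosen so that the displacement u·h in time matches the spatial step in order of magnitude; other choices would serve equally well.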

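When this relation is used in (5.17) one obtains

$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} + \mathbf{1}_x u\,\nabla \cdot \mathbf{B} = 0 \tag{5.18}$$

Further simplification is possible through determination of ∇ · B. Since

$$\frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} = K\frac{\partial B_x}{\partial x'} - \frac{Ku}{c^2}\frac{\partial B_x}{\partial t'} + \frac{\partial B_y}{\partial y'} + \frac{\partial B_z}{\partial z'}$$

use of (5.9) and recognition of the fact that partial derivatives with respect to t' are
all zero leads to

$$\nabla \cdot \mathbf{B} = K\,\nabla' \cdot \mathbf{B}' - \frac{Ku}{c^2}\Big(\frac{\partial E'_z}{\partial y'} - \frac{\partial E'_y}{\partial z'}\Big) \tag{5.19}$$

The right side of (5.19) is zero by virtue of (5.15) and thus

$$\nabla \cdot \mathbf{B} \equiv 0 \tag{5.20}$$

which means that (5.18) can be written

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{5.21}$$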

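When this procedure is repeated for the curl of B' one obtains

$$\begin{aligned}
\nabla' \times \mathbf{B}' \equiv \frac{\boldsymbol{\iota}'}{\mu_0^{-1}} = {}&\mathbf{1}_x\Big\{K\Big[\Big(\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z}\Big) - \frac{1}{c^2}\frac{\partial E_x}{\partial t}\Big] - \frac{Ku}{c^2}\nabla \cdot \mathbf{E}\Big\} \\
+\ &\mathbf{1}_y\Big[\Big(\frac{\partial B_x}{\partial z} - \frac{\partial B_z}{\partial x}\Big) - \frac{1}{c^2}\frac{\partial E_y}{\partial t}\Big] + \mathbf{1}_z\Big[\Big(\frac{\partial B_y}{\partial x} - \frac{\partial B_x}{\partial y}\Big) - \frac{1}{c^2}\frac{\partial E_z}{\partial t}\Big]
\end{aligned} \tag{5.22}$$

Once again reduction is possible since

$$\nabla \cdot \mathbf{E} = K\,\nabla' \cdot \mathbf{E}' + Ku\Big(\frac{\partial B'_z}{\partial y'} - \frac{\partial B'_y}{\partial z'}\Big) = \frac{K\rho'}{\epsilon_0} + Ku\frac{\iota'_x}{\mu_0^{-1}} = \frac{K\rho'}{\epsilon_0}\Big(1 + \frac{uw'_x}{c^2}\Big)$$

Use of (5.13) yields

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0} \tag{5.23}$$

and then use of (5.14) means that (5.22) can be written

$$\nabla \times \mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} + \frac{1}{c^2}\frac{\partial \mathbf{E}}{\partial t} \tag{5.24}$$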

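All these results are relativistically exact. When collected together, they are known
as Maxwell's equations and can be written in the form

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{5.25a}$$

$$\nabla \times \mathbf{H}_0 = \boldsymbol{\iota} + \frac{\partial \mathbf{D}_0}{\partial t} \tag{5.25b}$$

$$\nabla \cdot \mathbf{D}_0 = \rho \tag{5.25c}$$

$$\nabla \cdot \mathbf{B} \equiv 0 \tag{5.25d}$$

in which

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} \tag{5.26}$$

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} \tag{5.27}$$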

The sources ρ, ι and the fields due to them, E, B, D₀, H₀, are all time-varying. However,
they are due to a restricted class of sources, namely those consisting of static charges
and steady currents in X'Y'Z'. But this restriction can be lifted by a simple argument.
If a second set of steady sources exists in another coordinate system X''Y''Z'', they
will give rise to additional time-varying fields in XYZ which will also satisfy (5.25).
By superposition, the sum of the sources in X'Y'Z' and in X''Y''Z'' will give rise to
fields in XYZ which satisfy (5.25). If this is generalized to include steady sources in all
Lorentzian frames, including those traveling at any speed in any direction relative to
XYZ, the sum of such sources can result in completely general distributions ρ(x,y,z,t)
and ι(x,y,z,t). This fact is demonstrated in Appendix E. Thus Equations (5.25) have
the widest validity and can form the basis for the study of all types of electromagnetic
fields. Of course, observer O need not rely on the steady sources of O', O'', etc., to
establish his time-varying electromagnetic fields, but can do this equally well himself
by direct creation of the time-varying sources ρ and ι.
Integral forms of Maxwell's equations follow readily from (5.25) with the aid of
Stokes' theorem and the divergence theorem. They are:

$$\oint_C \mathbf{E} \cdot d\boldsymbol{\ell} = -\int_S \dot{\mathbf{B}} \cdot d\mathbf{S} \tag{5.28a}$$

$$\oint_C \mathbf{H}_0 \cdot d\boldsymbol{\ell} = \int_S (\boldsymbol{\iota} + \dot{\mathbf{D}}_0) \cdot d\mathbf{S} \tag{5.28b}$$

$$\oint_S \mathbf{D}_0 \cdot d\mathbf{S} = \int_V \rho\,dV \tag{5.28c}$$

$$\oint_S \mathbf{B} \cdot d\mathbf{S} \equiv 0 \tag{5.28d}$$

The first of these equations is often called Faraday's emf law and states that the line
integral of longitudinal E around any closed path is equal to minus the time rate of
change of magnetic flux enclosed. The second equation is a generalization of Ampere's
circuital law and casts $\dot{\mathbf{D}}_0$ in the same role as ι. This point was first appreciated by
Maxwell, who gave to D₀ the name displacement. For this reason, $\dot{\mathbf{D}}_0$ is called the displacement current. The third equation is a generalization of the integral form of
Poisson's equation, and the fourth integral states that at all times the total magnetic
flux piercing any closed surface is zero.
If the divergence of Equation (5.25a) is formed, the left side is zero because of the
vector identity (V.111); this is matched by the right side, which is also zero by virtue
of (5.25d). Similarly, if the divergence of (5.25b) is taken, one obtains

$$\nabla \cdot \nabla \times \mathbf{H}_0 = \nabla \cdot (\boldsymbol{\iota} + \dot{\mathbf{D}}_0) \equiv 0$$

indicating that the total current is continuous. Thus

$$\nabla \cdot \boldsymbol{\iota} = -\frac{\partial}{\partial t}(\nabla \cdot \mathbf{D}_0)$$

Use of (5.25c) converts this to

$$\nabla \cdot \boldsymbol{\iota} = -\dot{\rho} \tag{5.29}$$

which is known as the continuity equation. In words, ∇ · ι dV is the net efflux of current from a volume element dV, and $-\dot{\rho}\,dV$ is the time rate of decrease of charge within
dV. It is quite natural that these two quantities should be equal.
The continuity equation, which links charge and current, in no way denies the existence of charge without current, since it involves only $\dot{\rho}$. A static charge distribution
ρ(x,y,z) satisfies (5.29) with no current flow.
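
The continuity equation can be illustrated with a one-dimensional case. In the sketch below a hypothetical Gaussian pulse of charge drifts rigidly along x at a constant speed, so that the current density is simply the charge density times that speed; the divergence of the current and the time rate of change of the charge are then compared by finite differences. All names and values are assumptions made for the illustration.

```python
import math

DRIFT_SPEED = 2.0     # m/s, hypothetical rigid drift along x
PEAK_DENSITY = 1.0e-6  # C/m^3, hypothetical


def rho(x, t):
    """Charge density of a pulse moving rigidly at DRIFT_SPEED."""
    return PEAK_DENSITY * math.exp(-(x - DRIFT_SPEED * t) ** 2)


def iota_x(x, t):
    """Current density: charge density times drift velocity."""
    return rho(x, t) * DRIFT_SPEED


def central_difference(func, value, h):
    return (func(value + h) - func(value - h)) / (2.0 * h)


x0, t0, h = 0.8, 0.25, 1.0e-5
div_iota = central_difference(lambda x: iota_x(x, t0), x0, h)  # efflux per unit volume
rho_dot = central_difference(lambda t: rho(x0, t), t0, h)      # rate of change of charge
```

At the chosen point the pulse is arriving, so the charge density is rising while the divergence of the current is negative, and the two quantities cancel.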
EXAMPLE 5.1

Consider two rectangular conducting blocks, as shown in the figure, separated by a small
distance l so as to form a parallel plate capacitor. Assume a uniform electron flow in the
direction indicated, so that $\boldsymbol{\iota} = \mathbf{1}_z\iota_z$ is upward in both blocks. If A is the cross-sectional
area of each block, charge is accumulating on the adjacent faces at a rate $\iota_z A$ coul/sec.

[Figure: two conducting blocks separated by a small gap, with the direction of electron flow indicated.]

Therefore the total flux between faces, neglecting fringing, is increasing at the rate of
$\iota_z A$ lines/sec, or

$$\dot{\mathbf{D}}_0 = \boldsymbol{\iota}$$

Thus the conduction current in the blocks is exactly replaced by a displacement current
in the interspace and the total current is continuous, in agreement with $\nabla \cdot (\boldsymbol{\iota} + \dot{\mathbf{D}}_0) \equiv 0$,
as derived above.
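
The bookkeeping of Example 5.1 can be followed with numbers. The sketch below uses hypothetical values for the current density and the face area, accumulates charge on a face at the stated rate, converts it to a flux density between the faces with fringing neglected, and differentiates with respect to time.

```python
import math

IOTA_Z = 2.0e3   # conduction current density in the blocks, A/m^2 (hypothetical)
AREA = 1.0e-4    # cross-sectional area of each block, m^2 (hypothetical)


def charge_on_face(t):
    """Charge accumulated on a face after time t, at the rate iota_z * A."""
    return IOTA_Z * AREA * t


def displacement(t):
    """Flux density between the faces, fringing neglected: charge per unit area."""
    return charge_on_face(t) / AREA


t, dt = 1.0e-3, 1.0e-6
displacement_current = (displacement(t + dt) - displacement(t)) / dt
```

The area cancels, so the displacement current density in the gap equals the conduction current density in the blocks whatever the size of the faces.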

If this entire development, beginning with Section 5.2, had been undertaken by
starting with steady sources in X'Y'Z', and asking what fields would be determined
by an observer O*, in a coordinate system X*Y*Z* which was moving at a speed u*
relative to X'Y'Z', all the same results would once again be obtained. Time-varying
fields E*, B*, due to time-varying source distributions ρ*, ι* would be found to satisfy
Maxwell's equations. The question could then be raised as to the relations between the
fields E, B observed by O and the fields E*, B* observed by O*. It is easy to show that
these two sets of fields are related by the previously obtained transformations (5.5)
and (5.9). A proof can be found in Appendix F.

5.5 INTEGRAL SOLUTIONS OF MAXWELL'S EQUATIONS IN TERMS OF THE SOURCES

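Since Maxwell's equations are linear in free space, no loss in generality results from
assuming that time variations are harmonic and represented by $e^{j\omega t}$. The angular
frequency ω may be a component of a Fourier series or a Fourier integral, thus bringing
arbitrary time dependence within the purview of the following analysis. Accordingly, if f(x,y,z,t) is any field component or source component, it will be assumed that
$f(x,y,z,t) = f(x,y,z)e^{j\omega t}$. In this case, Maxwell's equations can be written

$$\nabla \times \mathbf{E} = -j\omega\mathbf{B} \tag{5.30}$$

$$\nabla \times \mathbf{H}_0 = \boldsymbol{\iota} + j\omega\mathbf{D}_0 \tag{5.31}$$

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0} \tag{5.32}$$

$$\nabla \cdot \mathbf{B} \equiv 0 \tag{5.33}$$

and the continuity equation becomes

$$\nabla \cdot \boldsymbol{\iota} = -j\omega\rho \tag{5.34}$$

In all the above equations, E = E(x,y,z), etc., the time-dependence being suppressed.
E, B, etc., are now complex vectors. (See Mathematical Supplement, Section V.23).
Additionally, if the curl of (5.21) and of (5.24) is taken, and if then (5.21) and (5.24)
are used to eliminate either E or B, one obtains the vector wave equations

$$\nabla \times \nabla \times \mathbf{E} + \frac{1}{c^2}\frac{\partial^2 \mathbf{E}}{\partial t^2} = -\frac{1}{\mu_0^{-1}}\frac{\partial \boldsymbol{\iota}}{\partial t} \tag{5.35a}$$

$$\nabla \times \nabla \times \mathbf{B} + \frac{1}{c^2}\frac{\partial^2 \mathbf{B}}{\partial t^2} = \nabla \times \frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.35b}$$

For an $e^{j\omega t}$ time-dependence, these become

$$\nabla \times \nabla \times \mathbf{E} - k^2\mathbf{E} = -j\omega\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.36a}$$

$$\nabla \times \nabla \times \mathbf{B} - k^2\mathbf{B} = \nabla \times \frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.36b}$$

in which $k = \omega\sqrt{\mu_0\epsilon_0}$ is called the propagation constant, for reasons which will emerge
shortly. These last two equations can be integrated through use of a technique first
introduced by Stratton and Chu, and based on a vector formulation of Green's second
identity.¹¹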
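
To give a feeling for magnitudes, the propagation constant can be evaluated for a particular frequency. The sketch below assumes modern free-space values for the two constants and a hypothetical frequency of 1 GHz, and also forms the length 2π/k, which is the distance over which a factor of the form $e^{-jkr}$ advances through one full cycle.

```python
import math

MU_0 = 4.0e-7 * math.pi        # H/m, assumed modern value
EPSILON_0 = 8.8541878128e-12   # F/m, assumed modern value


def propagation_constant(frequency_hz):
    """k = omega * sqrt(mu_0 * epsilon_0), in radians per meter."""
    omega = 2.0 * math.pi * frequency_hz
    return omega * math.sqrt(MU_0 * EPSILON_0)


frequency = 1.0e9                       # hypothetical frequency, Hz
k = propagation_constant(frequency)
cycle_length = 2.0 * math.pi / k        # meters per full cycle of exp(-j k r)
speed = 2.0 * math.pi * frequency / k   # omega / k, in m/s

print(f"k = {k:.3f} rad/m, cycle length = {cycle_length:.4f} m")
```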

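FIGURE 5.2 Geometry for the vector Green's theorem.

Consider a region V, bounded by the surfaces $S_1 \ldots S_N$ as shown in Figure 5.2.
Let F and G be two vector functions of position in this region, each continuous and
having continuous first and second derivatives everywhere within V and on the
boundary surfaces $S_i$. Using the vector identity

$$\nabla \cdot [\mathbf{A} \times \mathbf{B}] \equiv \mathbf{B} \cdot \nabla \times \mathbf{A} - \mathbf{A} \cdot \nabla \times \mathbf{B}$$

and letting A ≡ F while B ≡ ∇ × G, one obtains

$$\nabla \cdot [\mathbf{F} \times \nabla \times \mathbf{G}] \equiv \nabla \times \mathbf{G} \cdot \nabla \times \mathbf{F} - \mathbf{F} \cdot \nabla \times \nabla \times \mathbf{G}$$

whereas, if A ≡ G and B ≡ ∇ × F, there results

$$\nabla \cdot [\mathbf{G} \times \nabla \times \mathbf{F}] \equiv \nabla \times \mathbf{F} \cdot \nabla \times \mathbf{G} - \mathbf{G} \cdot \nabla \times \nabla \times \mathbf{F}$$

If the difference in these results is integrated over the volume V one obtains

$$\int_V (\mathbf{F} \cdot \nabla \times \nabla \times \mathbf{G} - \mathbf{G} \cdot \nabla \times \nabla \times \mathbf{F})\,dV = \int_V \nabla \cdot [\mathbf{G} \times \nabla \times \mathbf{F} - \mathbf{F} \times \nabla \times \mathbf{G}]\,dV$$

If one lets $\mathbf{1}_n$ be the inward-drawn normal from any boundary surface $S_i$ into the volume
V, use of the divergence theorem gives

$$\int_V (\mathbf{F} \cdot \nabla \times \nabla \times \mathbf{G} - \mathbf{G} \cdot \nabla \times \nabla \times \mathbf{F})\,dV = -\int_{S_1 \ldots S_N} (\mathbf{G} \times \nabla \times \mathbf{F} - \mathbf{F} \times \nabla \times \mathbf{G}) \cdot \mathbf{1}_n\,dS \tag{5.37}$$

This result is the vector Green's theorem.

11 J. A. Stratton and L. J. Chu, "Diffraction Theory of Electromagnetic Waves," Phys Rev, 56, 99-107;
July 1939. Also, see the excellent treatment in S. Silver, Microwave Antenna Theory and Design,
MIT Rad. Lab. Series, vol. 12, pp. 80-89, McGraw-Hill Book Company, New York, 1949. The present
development differs from Silver's principally in the nonuse of fictitious magnetic currents and charges.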


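Suppose that the E and B of (5.36a) and (5.36b) meet the conditions required of the
function F in V and let G be the vector Green's function defined by

$$\mathbf{G} = \frac{e^{-jkr}}{r}\mathbf{a} = \psi\mathbf{a} \tag{5.38}$$

in which a is an arbitrary constant vector and r is the distance from an arbitrary point
P(x,y,z) within V to any point (ξ,η,ζ) within V or on $S_i$.
G as defined by (5.38) satisfies the conditions of the vector Green's theorem everywhere except at P. Therefore, one can surround P by a sphere Σ of radius δ and consider that portion V' of V which is bounded by the surfaces $S_1 \ldots S_N$, Σ. Letting
E = F, one obtains

$$\int_{V'} (\mathbf{E} \cdot \nabla_S \times \nabla_S \times \psi\mathbf{a} - \psi\mathbf{a} \cdot \nabla_S \times \nabla_S \times \mathbf{E})\,dV = -\int_{S_1 \ldots S_N,\,\Sigma} (\psi\mathbf{a} \times \nabla_S \times \mathbf{E} - \mathbf{E} \times \nabla_S \times \psi\mathbf{a}) \cdot \mathbf{1}_n\,dS \tag{5.39}$$

in which, since ψ is a function of (x,y,z) as well as (ξ,η,ζ), it is necessary to distinguish
between differentiation with respect to these two sets of variables by subscripting the
operators so that

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial \xi} + \mathbf{1}_y\frac{\partial}{\partial \eta} + \mathbf{1}_z\frac{\partial}{\partial \zeta}$$

and

$$\nabla_P = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z}$$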

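It is shown in Appendix G that both sides of this equation may be transformed so
that a is brought outside the integral signs, the result being

$$\mathbf{a} \cdot \int_{V'} \Big(j\omega\psi\frac{\boldsymbol{\iota}}{\mu_0^{-1}} - \frac{\rho}{\epsilon_0}\nabla_S\psi\Big)\,dV - \mathbf{a} \cdot \int_{S_1 \ldots S_N,\,\Sigma} (\mathbf{1}_n \cdot \mathbf{E})\nabla_S\psi\,dS = -\mathbf{a} \cdot \int_{S_1 \ldots S_N,\,\Sigma} \big[j\omega\psi(\mathbf{1}_n \times \mathbf{B}) - (\mathbf{1}_n \times \mathbf{E}) \times \nabla_S\psi\big]\,dS \tag{5.40}$$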

SECTION

Conditions at Infinity

275

Since a is arbitrary, it follows that the integrals on the t\VO sides of (5.40) may be
equated, yielding

(jWl{!

t-t> - ~ V sl{!)

dV -

8,.

8N

[(in E)V sl{!

f (in E)Vsl{! + (in X E)

(in X E) X V sl{! - jwl{!(in X B)) dS


X Vsl{! - jwl{!(in X B)) dS

(5.41)

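where, for convenience, the surface integral over the sphere Σ is displayed separately.
It is further shown in Appendix G that the right side of (5.41) reaches the limit $-4\pi\mathbf{E}(x,y,z)$, with (x,y,z) the coordinates of the point P, as Σ shrinks to zero. Therefore the limiting value of (5.41) is

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V\left(\frac{\rho}{\varepsilon_0}\nabla_S\psi - j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)dV + \frac{1}{4\pi}\int_{S_1 \ldots S_N}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS \tag{5.42}$$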

This important formula gives E at any point in the volume V in terms of the sources within V plus the field values on the surfaces which bound V.
One may proceed in a similar fashion, by letting B = F, and deduce a companion formula for B(x,y,z). Alternatively, the curl of (5.42) may be taken and then (5.30) employed to obtain B. By either procedure, one finds that

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_V \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\,dV + \frac{1}{4\pi}\int_{S_1 \ldots S_N}\left[\frac{j\omega\psi}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_S\psi + (\mathbf{1}_n\cdot\mathbf{B})\nabla_S\psi\right]dS \tag{5.43}$$

Inspection of the volume integrals in (5.42) and (5.43) reveals that B is given in terms of the current sources only, whereas the expression for E contains terms involving both the currents and the charges. However, the continuity equation (5.34) may be used to give

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V \frac{1}{j\omega\varepsilon_0}\left[(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi + k^2\psi\,\boldsymbol{\iota}\right]dV + \frac{1}{4\pi}\int_{S_1 \ldots S_N}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS \tag{5.44}$$

Equations (5.43) and (5.44) constitute a solution of Maxwell's field equations in terms of the current sources within V and the field values over the bounding surfaces $S_i$.

5.6

CONDITIONS AT INFINITY

Let it now be assumed that the surface $S_N$ of Figure 5.2 becomes a large sphere of radius $\mathcal{R}$ centered at the point P. $\mathcal{R}$ initially will be taken great enough to enclose all the sources ι and ρ of the fields; ultimately $\mathcal{R}$ will be permitted to become infinitely large. Under these circumstances, consider the contributions to (5.43) and (5.44) of the surface integrals over $S_N$.
If $\mathbf{1}_{\mathcal{R}}$ is a unit vector directed outward along the radius of the spherical surface $S_N$, so that $\mathbf{1}_{\mathcal{R}} = -\mathbf{1}_n$, one may write for the appropriate part of (5.43)

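$$\frac{1}{4\pi}\int_{S_N}\left[\frac{j\omega\psi}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_S\psi + (\mathbf{1}_n\cdot\mathbf{B})\nabla_S\psi\right]dS$$

$$= \frac{1}{4\pi}\int_{S_N}\left\{-\frac{jk}{c}(\mathbf{1}_{\mathcal{R}}\times\mathbf{E}) + \left(jk + \frac{1}{\mathcal{R}}\right)\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{B})\times\mathbf{1}_{\mathcal{R}} + (\mathbf{1}_{\mathcal{R}}\cdot\mathbf{B})\mathbf{1}_{\mathcal{R}}\right]\right\}\frac{e^{-jk\mathcal{R}}}{\mathcal{R}}\,dS$$

$$= \frac{1}{4\pi}\int_{S_N}\left\{-\frac{jk}{c}\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{E}) - c\mathbf{B}\right] + \frac{\mathbf{B}}{\mathcal{R}}\right\}\frac{e^{-jk\mathcal{R}}}{\mathcal{R}}\,dS \tag{5.45}$$

Similarly, the appropriate part of (5.44) becomes

$$\frac{1}{4\pi}\int_{S_N}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS = \frac{1}{4\pi}\int_{S_N}\left\{j\omega\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] + \frac{\mathbf{E}}{\mathcal{R}}\right\}\frac{e^{-jk\mathcal{R}}}{\mathcal{R}}\,dS \tag{5.46}$$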

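If $\mathcal{R} \to \infty$, since the surface of the sphere increases as $\mathcal{R}^2$, the surface integral (5.45) will vanish if

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\,\mathbf{B} \ \text{is finite} \tag{5.47}$$

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{E}) - c\mathbf{B}\right] = 0 \tag{5.48}$$

Similarly, the surface integral (5.46) will vanish if

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\,\mathbf{E} \ \text{is finite} \tag{5.49}$$

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] = 0 \tag{5.50}$$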

The relations (5.47) through (5.50) are known as the Sommerfeld conditions at infinity. Expressions (5.47) and (5.49) are commonly called the finiteness conditions (Endlichkeit Bedingungen) and expressions (5.48) and (5.50) are customarily given the name of radiation conditions (Ausstrahlung Bedingungen). The finiteness conditions require that E and B diminish as $\mathcal{R}^{-1}$ while the radiation conditions require that they bear the relation to each other found in wave propagation in regions remote from the sources. (See Section 5.7.)
It is now possible to demonstrate the extremely important result that real sources, confined to a finite volume, always give rise to fields which satisfy the Sommerfeld conditions. To see this, consider Equations (5.43) and (5.44) when the only boundary surface is the large sphere $S_N$ whose radius will be permitted to become infinitely large. It shall be assumed that the real sources ι and ρ are finite and confined to a finite volume $V_0$. With the surface $S_N$ becoming an infinite sphere, the volume V in (5.43) and (5.44) also becomes infinite, but no convergence difficulties arise with the volume integrals because the sources are all within $V_0$.
Borrowing from the results of Section 5.9, the fields over $S_N$ will consist of outgoing waves whose power density is $\mathbf{E}\times\mathbf{H}_0$ watts/m². Since the surface area of $S_N$ is increasing as $\mathcal{R}^2$, if there is even the minutest loss in V, the law of conservation of energy requires that E and $\mathbf{H}_0$ diminish more rapidly than $\mathcal{R}^{-1}$, and thus conditions (5.47)-(5.50) are satisfied. One can then conclude that in an unbounded region, B(x,y,z) and E(x,y,z) are given solely by the volume integrals which appear in (5.43) and (5.44).
A check on this conclusion for the limiting case of no loss in V may be obtained through an ordering of the terms which comprise the volume integrals. To see this, select as origin an arbitrary point in $V_0$ and let r be the vector drawn from the origin to the field point P(x,y,z); the vector drawn from the source element to P will be labeled $\boldsymbol{\mathfrak{r}}$. Then

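$$\nabla_S\psi = -\mathbf{1}_{\mathfrak{r}}\,\frac{\partial}{\partial\mathfrak{r}}\left(\frac{e^{-jk\mathfrak{r}}}{\mathfrak{r}}\right)$$

in which spherical coordinates $(\mathfrak{r},\theta',\phi')$ centered at P have been employed (ψ depends on neither θ' nor φ') and

$$\mathbf{1}_{\mathfrak{r}} = \frac{\boldsymbol{\mathfrak{r}}}{\mathfrak{r}}$$

Performing the indicated differentiations, one obtains

$$\nabla_S\psi = \left(jk + \frac{1}{\mathfrak{r}}\right)\psi\,\mathbf{1}_{\mathfrak{r}} \qquad (\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi = \left[\left(-k^2 + \frac{3jk}{\mathfrak{r}} + \frac{3}{\mathfrak{r}^2}\right)(\boldsymbol{\iota}\cdot\mathbf{1}_{\mathfrak{r}})\mathbf{1}_{\mathfrak{r}} - \left(\frac{jk}{\mathfrak{r}} + \frac{1}{\mathfrak{r}^2}\right)\boldsymbol{\iota}\right]\psi \tag{5.51}$$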
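The functions ψ, $\nabla_S\psi$, and $(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi$ are all seen to involve polynomials in the variable $\mathfrak{r}^{-1}$. Retain for the moment only first-order terms; then substitution in (5.43) and (5.44) gives

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_V jk\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_{\mathfrak{r}}\,\frac{e^{-jk\mathfrak{r}}}{\mathfrak{r}}\,dV \tag{5.52}$$

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V \frac{1}{j\omega\varepsilon_0}\left[-k^2(\boldsymbol{\iota}\cdot\mathbf{1}_{\mathfrak{r}})\mathbf{1}_{\mathfrak{r}} + k^2\boldsymbol{\iota}\right]\frac{e^{-jk\mathfrak{r}}}{\mathfrak{r}}\,dV \tag{5.53}$$

But

$$\mathfrak{r} = \left[(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2\right]^{1/2} = \left[(r\sin\theta\cos\phi - \xi)^2 + (r\sin\theta\sin\phi - \eta)^2 + (r\cos\theta - \zeta)^2\right]^{1/2}$$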

in which now conventional spherical coordinates (r,θ,φ) centered at the origin have been introduced. As P becomes remote, $\mathfrak{r}$ can be expressed in the rapidly converging series

$$\mathfrak{r} = r - (\xi\sin\theta\cos\phi + \eta\sin\theta\sin\phi + \zeta\cos\theta) + O(r^{-1}) \tag{5.54}$$

Similarly,

$$\mathfrak{r}^{-1} = r^{-1} + O(r^{-2}) \qquad \lim_{r\to\infty}\mathbf{1}_{\mathfrak{r}} = \mathbf{1}_r$$


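and thus as r becomes very large, Equations (5.52) and (5.53) may be written

$$\mathbf{B}(x,y,z) = \frac{jk}{4\pi}\frac{e^{-jkr}}{r}\int_V \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\,e^{jk\text{л}}\,dV + O(r^{-2}) \tag{5.55}$$

$$\mathbf{E}(x,y,z) = \frac{j\omega}{4\pi}\frac{e^{-jkr}}{r}\int_V \mathbf{1}_r\times\left(\mathbf{1}_r\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)e^{jk\text{л}}\,dV + O(r^{-2}) \tag{5.56}$$

in which л = ξ sin θ cos φ + η sin θ sin φ + ζ cos θ.†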

If one were to go back and include all the terms in the expressions for $\nabla_S\psi$ and $(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi$, they would alter the results (5.55) and (5.56) only at the level of $O(r^{-2})$. Therefore these two expressions for B and E may be taken as exact.
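The quality of the two-term series (5.54) can be checked numerically. The following is a minimal sketch, not part of the original development: the source point, the observation angles, and the function names are all illustrative assumptions. It compares the exact source-to-field distance with r minus the directional position л and records how the discrepancy shrinks as r grows.

```python
import math

def exact_distance(r, theta, phi, src):
    """Exact distance from the source point src = (xi, eta, zeta) to the field point (r, theta, phi)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return math.dist((x, y, z), src)

def far_field_distance(r, theta, phi, src):
    """First two terms of the series (5.54): r minus the directional position of the source."""
    xi, eta, zeta = src
    ell = (xi * math.sin(theta) * math.cos(phi)
           + eta * math.sin(theta) * math.sin(phi)
           + zeta * math.cos(theta))
    return r - ell

# Hypothetical source point and observation direction, chosen only for illustration.
src = (0.3, -0.2, 0.5)
theta, phi = 1.1, 0.7

# Absolute error of the approximation at three increasingly remote field points.
errors = [abs(exact_distance(r, theta, phi, src) - far_field_distance(r, theta, phi, src))
          for r in (10.0, 100.0, 1000.0)]
```

Under these assumptions the error falls roughly tenfold for each tenfold increase in r, which is the behavior expected of a remainder of order $r^{-1}$.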
In considering the expressions (5.55) and (5.56) with respect to the Sommerfeld conditions, one notices that the terms of $O(r^{-2})$ and below satisfy all four conditions and thus concern may be focused on the explicit first-order terms. But

$$\lim_{r\to\infty} r\mathbf{B} = \frac{jk}{4\pi}\lim_{r\to\infty} e^{-jkr}\int_V \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\,e^{jk\text{л}}\,dV \tag{5.57}$$

and, since the volume integral is a function of the source coordinates and the angular direction to P, but not of r, this limit is finite. A similar argument establishes that $\lim_{r\to\infty} r\mathbf{E}$ is also finite and thus both finiteness conditions are satisfied.
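Further,

$$\lim_{r\to\infty} r\left[(\mathbf{1}_r\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] = \lim_{r\to\infty}\frac{e^{-jkr}}{4\pi}\int_V\left[jk\,\mathbf{1}_r\times\left(\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\right) + \frac{j\omega}{c}\,\mathbf{1}_r\times\left(\mathbf{1}_r\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)\right]e^{jk\text{л}}\,dV \tag{5.58}$$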

The integrand in (5.58) is identically zero and therefore condition (5.50) is satisfied. In like manner, the condition (5.48) is found to be satisfied also. This supports the argument that any system of real sources confined to a finite volume $V_0$ gives rise to an electromagnetic field at infinity which satisfies Sommerfeld's conditions, that the surface integral over an infinite sphere $S_N$ gives a null contribution, and that in an unbounded region the electromagnetic field at any point P, near or remote, is given precisely by

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V \frac{1}{j\omega\varepsilon_0}\left[(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi + k^2\psi\,\boldsymbol{\iota}\right]dV \tag{5.59}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_V \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\,dV \tag{5.60}$$

Suppose now that parts of the volume $V_0$ are excluded from V by the finite, regular closed surfaces $S_1, S_2, \ldots$. These surfaces may exclude some of the sources from V or not, but their presence does not alter the results at infinity. However, now the more general expressions (5.43) and (5.44) apply, and one may conclude by saying that (5.43) and (5.44) are valid even if the volume V is infinite, so long as real sources in a finite volume are assumed. If the volume V is infinite, the surface at infinity need not be considered.

† This symbol is the Russian lower-case "ell" and may be called the directional position of the source point.

5.7

THE POTENTIAL FUNCTIONS

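If the volume V is totally unbounded, Equations (5.42) and (5.43) give

$$\mathbf{E} = \frac{1}{4\pi}\int_V\left(\frac{\rho}{\varepsilon_0}\nabla_S\psi - j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)dV \tag{5.61}$$

$$\mathbf{B} = \frac{1}{4\pi}\int_V \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\,dV \tag{5.62}$$

Since $\nabla_P\psi = -\nabla_S\psi$, and since ι and the limits of integration are functions of (ξ,η,ζ), but not of (x,y,z), these integrals may be written

$$\mathbf{E} = -\nabla_P\int_V \frac{\rho\,\psi}{4\pi\varepsilon_0}\,dV - j\omega\int_V \frac{\boldsymbol{\iota}\,\psi}{4\pi\mu_0^{-1}}\,dV \tag{5.63}$$

$$\mathbf{B} = \nabla_P\times\int_V \frac{\boldsymbol{\iota}\,\psi}{4\pi\mu_0^{-1}}\,dV \tag{5.64}$$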

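Therefore it is convenient to introduce two potential functions by the defining relations

$$\mathbf{A}(x,y,z,t) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathfrak{r})}}{4\pi\mu_0^{-1}\,\mathfrak{r}}\,dV \tag{5.65}$$

$$\Phi(x,y,z,t) = \int_V \frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathfrak{r})}}{4\pi\varepsilon_0\,\mathfrak{r}}\,dV \tag{5.66}$$

in which the time factor $e^{j\omega t}$ has been reinserted and $e^{-jk\mathfrak{r}}/\mathfrak{r}$ has been substituted for ψ. A is known as the magnetic vector potential function and Φ is known as the electric scalar potential function.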
Since k = ω/c, one may write

$$\exp[j(\omega t - k\mathfrak{r})] = \exp[j\omega(t - \mathfrak{r}/c)]$$

Therefore each current element in the integrand of (5.65), and each charge element in the integrand of (5.66), makes a contribution to the potential at (x,y,z) at time t which is in accord with the value it had at the earlier time $t - \mathfrak{r}/c$. But this is consistent with the idea that it takes a time $\mathfrak{r}/c$ for a disturbance to travel from (ξ,η,ζ) to (x,y,z). For this reason, (5.65) and (5.66) are often called the retarded potentials.
From (5.63) and (5.64),

$$\mathbf{E} = -\nabla\Phi - \dot{\mathbf{A}} \tag{5.67}$$

$$\mathbf{B} = \nabla\times\mathbf{A} \tag{5.68}$$

in which the subscripts on the del operators have been dropped, since A and Φ are functions only of (x,y,z) and not also of (ξ,η,ζ).


The differential equations satisfied by A and Φ may be deduced by taking the divergence of (5.67) and the curl of (5.68), which leads to

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.69}$$

$$\nabla^2\Phi - \frac{1}{c^2}\ddot{\Phi} = -\frac{\rho}{\varepsilon_0} \tag{5.70}$$

these results being valid whether ι and ρ are harmonic functions of time, or more general time functions representable by Fourier integrals. A proof may be found in Appendix H.
At points away from the sources, it is unnecessary to solve for both Φ and A (unless static source distributions are involved). One need find only A, then use (5.68) to obtain B, and then use (5.24) to deduce E. The latter may be rewritten

$$\mathbf{E} = \frac{c}{jk}\,\nabla\times\nabla\times\mathbf{A} \tag{5.71}$$
It is interesting to observe that for time-independent sources (ω = k = 0), Equations (5.66) and (5.67) reduce to the electrostatic relations encountered in Chapter 3, whereas Equations (5.65) and (5.68) reduce to the magnetostatic relations developed in Chapter 4.
For the more general time-harmonic case, if all the sources are confined to a finite volume $V_0$, and if $\mathfrak{r}$ from any point in $V_0$ to (x,y,z) is much bigger than the maximum dimension of $V_0$, then (5.65) may be approximated by replacing $\mathfrak{r}$ with r in the denominator of the integrand, and by replacing $\mathfrak{r}$ with (5.54) in the phase factor, where r is drawn from an origin in $V_0$ to (x,y,z). This gives the far-field approximation

$$\mathbf{A}(x,y,z,t) = \frac{e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}\,r}\int_V \boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{jk\text{л}}\,dV \tag{5.72}$$

in which, as before, л = ξ sin θ cos φ + η sin θ sin φ + ζ cos θ. To this same approximation, using (5.68) and (5.71), one obtains

$$\mathbf{B} = -jk\,\mathbf{1}_r\times\mathbf{A} \tag{5.73}$$

$$\mathbf{E} = -c\,\mathbf{1}_r\times\mathbf{B} = -j\omega\mathbf{A}_T \tag{5.74}$$

with $\mathbf{A}_T$ that part of A which is transverse to the radial direction $\mathbf{1}_r$.


A study of Equations (5.72) through (5.74) shows that the fields are in the form of an outgoing spherical wave

$$\frac{e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}\,r} \tag{5.75}$$

which diminishes as the reciprocal of the distance, and that this wave is modified by the directional weighting function

$$\mathbf{a}(\theta,\phi) = \int_V \boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{jk\text{л}}\,dV \tag{5.76}$$

For this reason a(θ,φ) may be called the field pattern, and is closely related to the power radiation pattern of the system of sources, as will become evident shortly.


From the form of (5.75), it is apparent that the wave is propagating in the radial direction at such a speed that a point of constant phase satisfies the relation

$$\omega t - kr = \text{constant}$$

which gives

$$v_{ph} = \frac{dr}{dt} = \frac{\omega}{k} = c$$

as the phase velocity of the wave.
Further, (5.73) and (5.74) indicate that both B and E are transverse to the direction of propagation and that in the transverse plane they are perpendicular to each other, their magnitudes being in the ratio E/B = c.
These properties are common to all time-varying electromagnetic fields in free space at points remote from the sources.
EXAMPLE 5.2

A simple source of great practical importance is the half-wave dipole. It may be assumed to consist of a filamentary current disposed along the Z axis, as shown in the figure, with an amplitude distribution which is spatially sinusoidal. Thus one may describe the current distribution by the equation

$$I(\zeta,t) = I_m\cos k\zeta\;e^{j\omega t}$$

in which $I_m$ is the amplitude of the current at the central feeding terminals, which are assumed to be negligibly separated.
Use of (5.76) gives

$$\mathbf{a}(\theta) = \mathbf{1}_z\int_{-\lambda/4}^{\lambda/4} I_m\cos k\zeta\;e^{jk\zeta\cos\theta}\,d\zeta = \mathbf{1}_z\,\frac{2I_m}{k}\,\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin^2\theta}$$

so that

$$\mathbf{A}_T = -\mathbf{1}_\theta\sin\theta\,A_z = -\mathbf{1}_\theta\,\frac{I_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\,kr}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]$$

from which

$$\mathbf{E} = \mathbf{1}_\theta\,\frac{j\omega I_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\,kr}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right] = \mathbf{1}_\theta\,\frac{jcI_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\,r}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]$$

$$\mathbf{B} = \frac{\mathbf{1}_r\times\mathbf{E}}{c} = \mathbf{1}_\phi\,\frac{jI_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\,r}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]$$

The directional weighting function $[\cos(\frac{\pi}{2}\cos\theta)]/\sin\theta$ is plotted in polar form in the second figure, for a half-plane φ = constant. The three-dimensional field pattern may be obtained by rotating this plot around the Z axis, and bears some resemblance to a torus.
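The closed form quoted for the pattern integral of the half-wave dipole can be spot-checked numerically. The sketch below is illustrative only: it assumes unit current amplitude and k = 1, uses a simple midpoint rule, and the function names are hypothetical. It integrates cos kζ exp(jkζ cos θ) over the quarter-wavelength limits and compares the result with (2/k) cos((π/2) cos θ)/sin² θ.

```python
import cmath
import math

def pattern_integral(theta, k=1.0, n=2000):
    """Midpoint-rule estimate of the integral of cos(k z) exp(j k z cos(theta)) for z in [-lambda/4, lambda/4]."""
    half = math.pi / (2.0 * k)      # lambda/4 equals pi/(2k)
    dz = 2.0 * half / n
    total = 0j
    for i in range(n):
        z = -half + (i + 0.5) * dz
        total += math.cos(k * z) * cmath.exp(1j * k * z * math.cos(theta)) * dz
    return total

def closed_form(theta, k=1.0):
    """Closed-form value (2/k) cos((pi/2) cos(theta)) / sin(theta)^2 for unit current amplitude."""
    return 2.0 * math.cos(0.5 * math.pi * math.cos(theta)) / (k * math.sin(theta) ** 2)

# Compare the two at a few polar angles away from the axis.
angles = (0.4, 1.0, math.pi / 2.0)
differences = [abs(pattern_integral(t) - closed_form(t)) for t in angles]
```

The imaginary part of the numerical integral vanishes by symmetry, so the comparison effectively tests the real closed form.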


5.8 MAGNETIC STORED ENERGY


With the aid of Faraday's emf law, it is now possible to derive a relation for the energy stored in a magnetic field. To this end, consider first a charge q which is part of a current system giving rise to an electromagnetic field E, B. If v(t) is the instantaneous velocity of the charge, in time dt it suffers a displacement v dt and experiences a force q(E + v × B). The work done on the charge during this displacement is

$$dW = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})\cdot\mathbf{v}\,dt = q\mathbf{v}\cdot\mathbf{E}\,dt$$

Therefore the power being supplied to the charge by the field is

$$P = \frac{dW}{dt} = q\mathbf{v}\cdot\mathbf{E} \tag{5.77}$$

If, in place of q, one considers all the charge $\rho_1\,dV$ in a volume element dV which possesses the instantaneous velocity $\mathbf{v}_1(t)$, the power being supplied to this charge is

$$d^3P_1 = \rho_1\mathbf{v}_1\cdot\mathbf{E}\,dV = \boldsymbol{\iota}_1\cdot\mathbf{E}\,dV$$

Similarly, for all the charge $\rho_2\,dV$ in the same volume element dV, which has the different instantaneous velocity $\mathbf{v}_2(t)$, the power being supplied is $\boldsymbol{\iota}_2\cdot\mathbf{E}\,dV$. Upon superimposing the contributions for all the charges in dV, one obtains

$$d^3P = \boldsymbol{\iota}\cdot\mathbf{E}\,dV \tag{5.78}$$

in which $\boldsymbol{\iota} = \boldsymbol{\iota}_1 + \boldsymbol{\iota}_2 + \cdots$.
Next, consider a distribution of current density ι(ξ,η,ζ) which has established a steady magnetic field B(x,y,z). Let ι'(ξ,η,ζ,t) be an intermediate value of the current density as it is slowly raised from zero to its final value ι, and let B'(x,y,z,t) be the corresponding intermediate value of the magnetic field. As suggested by Figure 5.3, let B' · dS be a tube of flux through the point P(x,y,z) and let C be the contour of this tube.

FIGURE 5.3 Energy build-up in a magnetic field.


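If S' is an open surface with C as its sole boundary, then

$$\int_{S'} \boldsymbol{\iota}'\cdot d\mathbf{S}'$$

is the total current linked by C.
Let C' be the contour of one of the tubes of current $\boldsymbol{\iota}'\cdot d\mathbf{S}'$ which pierces S'. Then the tube of flux $\mathbf{B}'\cdot d\mathbf{S}$ induces an electric field along C' given by

$$\oint_{C'} d^2\mathbf{E}\cdot d\boldsymbol{\ell}' = -\dot{\mathbf{B}}'\cdot d\mathbf{S}$$

This electric field opposes the growth of the current $\boldsymbol{\iota}'\cdot d\mathbf{S}'$ and energy must be supplied by the current to the field at the rate

$$d^4P = -(\boldsymbol{\iota}'\cdot d\mathbf{S}')\oint_{C'} d^2\mathbf{E}\cdot d\boldsymbol{\ell}' = (\boldsymbol{\iota}'\cdot d\mathbf{S}')(\dot{\mathbf{B}}'\cdot d\mathbf{S})$$

which is a use of (5.78).
When all the tubes of current piercing S' are included

$$d^2P = -\int_{V'} d^2\mathbf{E}\cdot\boldsymbol{\iota}'\,dV' = \dot{\mathbf{B}}'\cdot d\mathbf{S}\int_{S'} \boldsymbol{\iota}'\cdot d\mathbf{S}'$$

in which V' is the region of current flow for all the tubes of current which pierce S'. Since the field is changing so slowly that $\dot{\mathbf{D}}_0'$ may be neglected in comparison to ι',

$$\oint_C \mathbf{H}_0'\cdot d\boldsymbol{\ell} = \int_{S'} \boldsymbol{\iota}'\cdot d\mathbf{S}'$$

and it follows that

$$d^2P = \int_{\delta V} \mathbf{H}_0'\cdot\dot{\mathbf{B}}'\,dV$$

wherein δV is the volume of the tube of flux whose contour is C. Upon including all the tubes of flux, one obtains for the power being supplied to the entire field

$$P = \int_V \mathbf{H}_0'\cdot\dot{\mathbf{B}}'\,dV = \frac{d}{dt}\int_V \tfrac{1}{2}\mu_0^{-1}B'^2\,dV \tag{5.79}$$

with V the volume of all space.
If $W_m$ is the energy stored in the magnetic field, so that $P = dW_m/dt$, then

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_V B^2\,dV = \tfrac{1}{2}\int_V \mathbf{B}\cdot\mathbf{H}_0\,dV \tag{5.80}$$

in which B now has its final steady value. Equation (5.80) is a companion formula to (3.151), which gave the electrostatic stored energy for a steady electric field distribution.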
EXAMPLE 5.3

It was shown in Chapter 4 at the end of Example 4.10, that if two long concentric conducting tubes carry equal steady currents in opposite directions, the field between them is given by

$$B_\phi(r) = \frac{I}{2\pi\mu_0^{-1}\,r}$$

in which r ranges from a, the outer radius of the inner tube, to b, the inner radius of the outer tube. I is the total current in either tube.
The energy stored between tubes, per unit length, is

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_a^b\left(\frac{I}{2\pi\mu_0^{-1}\,r}\right)^2 2\pi r\,dr = \frac{I^2}{4\pi\mu_0^{-1}}\int_a^b\frac{dr}{r} = \frac{I^2}{4\pi\mu_0^{-1}}\ln\frac{b}{a}$$

As an example, if a = 1/4 in., b = 1/2 in., and I = 1 amp, the stored energy in the magnetic field, per meter length of the two concentric tubes, is 0.07 microjoules.
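The stored-energy formula of this example is easy to evaluate directly. The short sketch below is illustrative and rests on stated assumptions: it takes the classical value 4π × 10⁻⁷ H/m for μ₀, and it assumes a radius ratio b/a = 2, chosen here because that ratio reproduces the quoted figure of about 0.07 microjoule per meter for a current of 1 amp. Only the ratio of the radii matters, so their units cancel.

```python
import math

MU0 = 4.0e-7 * math.pi  # classical value of mu_0 in henries per meter (assumed)

def coax_energy_per_length(current, a, b):
    """Magnetic energy per unit length, mu_0 I^2 ln(b/a) / (4 pi), stored between concentric tubes."""
    return MU0 * current ** 2 / (4.0 * math.pi) * math.log(b / a)

# Assumed radii with ratio 2 (any common unit), carrying 1 amp.
w_per_meter = coax_energy_per_length(1.0, 0.25, 0.5)   # joules per meter
w_micro = w_per_meter * 1.0e6                           # microjoules per meter
```

Doubling the current quadruples the stored energy, while changing both radii by a common factor leaves it unchanged.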

5.9

POYNTING'S THEOREM

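Consideration can next be given to the power balance in a time-varying electromagnetic field. Assume that there is a system of impressed sources $\boldsymbol{\iota}^i$ which causes an electromagnetic field $\mathbf{E}^i$, $\mathbf{B}^i$, and that in response to this impressed field there is an induced system† of currents $\boldsymbol{\iota}^r$ creating an additional field $\mathbf{E}^r$, $\mathbf{B}^r$. The total current density and field at any point is therefore

$$\boldsymbol{\iota} = \boldsymbol{\iota}^i + \boldsymbol{\iota}^r \qquad \mathbf{E} = \mathbf{E}^i + \mathbf{E}^r \qquad \mathbf{B} = \mathbf{B}^i + \mathbf{B}^r$$

In accordance with (5.78), the total field E reacts on the impressed source density $\boldsymbol{\iota}^i$ in such a way that, if power is being supplied to the field, it must be at the rate

$$d^3P = -\mathbf{E}\cdot\boldsymbol{\iota}^i\,dV$$

But from Maxwell's equations,

$$\boldsymbol{\iota}^i = \nabla\times\mathbf{H}_0 - \frac{\partial\mathbf{D}_0}{\partial t} - \boldsymbol{\iota}^r$$

so that

$$d^3P = \left[-\mathbf{E}\cdot\nabla\times\mathbf{H}_0 + \frac{\partial}{\partial t}\left(\tfrac{1}{2}\varepsilon_0E^2\right) + \mathbf{E}\cdot\boldsymbol{\iota}^r\right]dV \tag{5.81}$$

Application of the vector identity (V.108) gives

$$\nabla\cdot(\mathbf{E}\times\mathbf{H}_0) = \mathbf{H}_0\cdot\nabla\times\mathbf{E} - \mathbf{E}\cdot\nabla\times\mathbf{H}_0$$

and this result, coupled with (5.25a) yields

$$-\mathbf{E}\cdot\nabla\times\mathbf{H}_0 = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0) + \mathbf{H}_0\cdot\frac{\partial\mathbf{B}}{\partial t} \tag{5.82}$$

† The decomposition of the total current system into impressed and response current densities is arbitrary, but often forms a natural division. As an illustration, the currents which flow in the dipole of Example 5.2 may be considered as response currents, whereas the currents which flow in the generator and transmission line leading up to the terminals of the dipole may be taken as the impressed source system.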


Therefore (5.81) may be rewritten

$$-\mathbf{E}\cdot\boldsymbol{\iota}^i = \frac{\partial}{\partial t}\left(\tfrac{1}{2}\varepsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2\right) + \mathbf{E}\cdot\boldsymbol{\iota}^r + \nabla\cdot(\mathbf{E}\times\mathbf{H}_0) \tag{5.83}$$

This result gives the power balance in a volume element dV. The left side of (5.83) is the instantaneous power per unit volume being supplied by the sources to dV. The factor $\tfrac{1}{2}\varepsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2$ has been shown to represent the density of energy stored in static electric and magnetic fields. If it is assumed that this factor bears the same interpretation for dynamic fields (and since it is a point function, this is a most reasonable assumption), then the term

$$\frac{\partial}{\partial t}\left(\tfrac{1}{2}\varepsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2\right)$$

may be identified as the time rate of change of the density of stored energy.
The factor $\mathbf{E}\cdot\boldsymbol{\iota}^r$ represents the power density being absorbed from the field by the response current $\boldsymbol{\iota}^r$. If, for example, the response current is flowing in a conductor, this term accounts for ohmic loss. Alternatively, if $\boldsymbol{\iota}^r$ is due to freely moving charges, $\mathbf{E}\cdot\boldsymbol{\iota}^r$ accounts for their increase in kinetic energy.
When the law of conservation of energy is invoked, it follows that the term $\nabla\cdot(\mathbf{E}\times\mathbf{H}_0)$ may be interpreted as the volume density of power leaving dV.
This conclusion may be seen from another point of view by integrating (5.83). With the aid of the divergence theorem, one may write

$$-\int_V \mathbf{E}\cdot\boldsymbol{\iota}^i\,dV = \frac{d}{dt}\int_V\left(\tfrac{1}{2}\varepsilon_0E^2 + \tfrac{1}{2}\mu_0^{-1}B^2\right)dV + \int_V \mathbf{E}\cdot\boldsymbol{\iota}^r\,dV + \int_S \mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} \tag{5.84}$$

The left side of (5.84) represents the entire instantaneous power being supplied by all the sources. The first integral on the right side of this equation accounts for the time rate of change of the entire stored energy of the field. The second integral stands for the power being absorbed by the system of response currents. The last integral therefore represents the entire instantaneous power flow outward across the surface S bounding the volume V. For this reason, one may define the Poynting vector as

$$\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0 \tag{5.85}$$

and place upon it the interpretation that it gives in magnitude and direction the instantaneous rate of energy flow per unit area at a point. This is Poynting's theorem.
Since the units of E and $\mathbf{H}_0$ are volts per meter and amperes per meter respectively, it is seen that the units of $\boldsymbol{\mathcal{P}}$ are watts per square meter.
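EXAMPLE 5.4

For the field of the half-wave dipole treated in Example 5.2, at points remote from the dipole (the far-field), Equations (5.73) and (5.74) are applicable and therefore

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} = \frac{\mu_0^{-1}}{c}\,\mathbf{1}_r\times\mathbf{E} = \frac{1}{\eta}\,\mathbf{1}_r\times\mathbf{E}$$

in which $\eta = (\mu_0/\varepsilon_0)^{1/2} = 377$ ohms is called the impedance of free space. Therefore the Poynting vector (5.85) may be written for this case in the form

$$\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0 = \mathbf{E}\times\frac{1}{\eta}(\mathbf{1}_r\times\mathbf{E}) = \mathbf{1}_r\,\frac{E^2}{\eta} = \mathbf{1}_r\,\frac{\eta I_m^2}{(2\pi r)^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2\sin^2(\omega t - kr)$$

since $\mathrm{Re}\,[je^{j(\omega t - kr)}] = -\sin(\omega t - kr)$. If $\bar{\boldsymbol{\mathcal{P}}}$ is the time-average value of $\boldsymbol{\mathcal{P}}$, then

$$\bar{\boldsymbol{\mathcal{P}}} = \mathbf{1}_r\,\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2$$

and one sees that there is a steady radial flow of energy away from the dipole. The total average power being radiated may be determined by integrating $\bar{\boldsymbol{\mathcal{P}}}$ over the surface of a large sphere S centered at the dipole. This gives

$$P_{rad} = \int_S \bar{\boldsymbol{\mathcal{P}}}\cdot d\mathbf{S} = \int_0^{2\pi}\!\!\int_0^{\pi}\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2 r^2\sin\theta\,d\theta\,d\phi = \frac{\eta I_m^2}{4\pi}\int_0^{\pi}\frac{\cos^2\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\,d\theta = \frac{\eta I_m^2}{4\pi}\,(1.2186)$$

in which the integral has been evaluated by first expanding the integrand in a power series.
As a specific illustration, if $I_{eff} = 0.707I_m = 1$ amp, since η = 377 ohms, the radiated power is $P_{rad} = 73$ watts.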
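The numerical factor in the radiated-power result can be reproduced without the power-series expansion. The sketch below is a rough numerical check under stated assumptions: it uses a plain midpoint rule (the integrand tends to zero at both ends of the interval, so no special endpoint treatment is attempted), takes η = 377 ohms as in the text, and uses hypothetical function and variable names.

```python
import math

def pattern_sq_integral(n=20000):
    """Midpoint estimate of the integral over 0..pi of cos^2((pi/2) cos t) / sin t dt."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.cos(0.5 * math.pi * math.cos(t)) ** 2 / math.sin(t) * h
    return total

ETA = 377.0                 # impedance of free space in ohms, as quoted in the example
i_m = math.sqrt(2.0)        # peak current corresponding to 1 amp effective
integral = pattern_sq_integral()
p_rad = ETA * i_m ** 2 * integral / (4.0 * math.pi)   # radiated power in watts
```

The computed integral lands close to the quoted 1.2186, and the resulting power is close to the 73 watts stated above; dividing that power by the square of the effective current gives the familiar radiation resistance of roughly 73 ohms.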

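Cases such as the preceding example, in which the currents and fields are varying harmonically in time, occur so frequently and have such importance as to deserve special discussion. Expressing all quantities in the form of a complex spatial vector function multiplied by $e^{j\omega t}$, such as

$$\mathbf{E}(x,y,z,t) = \mathrm{Re}\;\boldsymbol{\mathcal{E}}(x,y,z)\,e^{j\omega t}$$

one may write

$$\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0 = \tfrac{1}{2}(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t})\times\tfrac{1}{2}(\boldsymbol{\mathcal{H}}_0e^{j\omega t} + \boldsymbol{\mathcal{H}}_0^*e^{-j\omega t})$$

$$= \tfrac{1}{4}(\boldsymbol{\mathcal{E}}\times\boldsymbol{\mathcal{H}}_0^* + \boldsymbol{\mathcal{E}}^*\times\boldsymbol{\mathcal{H}}_0) + \tfrac{1}{4}(\boldsymbol{\mathcal{E}}\times\boldsymbol{\mathcal{H}}_0\,e^{j2\omega t} + \boldsymbol{\mathcal{E}}^*\times\boldsymbol{\mathcal{H}}_0^*\,e^{-j2\omega t})$$

$$= \tfrac{1}{2}\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}_0^*) + \tfrac{1}{2}\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}_0) \tag{5.86}$$

in which, in the last line, E and $\mathbf{H}_0$ stand for the complex fields $\boldsymbol{\mathcal{E}}e^{j\omega t}$ and $\boldsymbol{\mathcal{H}}_0e^{j\omega t}$.
The term $\tfrac{1}{2}\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}_0^*)$ is independent of time and thus represents the time-average value of $\boldsymbol{\mathcal{P}}$, giving

$$\bar{\boldsymbol{\mathcal{P}}} = \tfrac{1}{2}\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}_0^*) \tag{5.87}$$

The term $\tfrac{1}{2}\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}_0)$ contains the factor $e^{j2\omega t}$ and thus represents the oscillating portion of Poynting's vector. $\boldsymbol{\mathcal{P}}$ may therefore be interpreted at a point as consisting of a steady flow of energy density plus a flow which surges back and forth at frequency 2ω.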
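The statement that the time-independent part of the Poynting vector equals one-half the real part of the cross product of E with the conjugate of H₀ can be confirmed by brute force. The following sketch is illustrative only: the two complex amplitude vectors are arbitrary made-up values, and the helper names are hypothetical. It averages the instantaneous product of the real fields over one period and compares the result with the complex formula.

```python
import cmath
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples (components may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def instantaneous_poynting(e_amp, h_amp, wt):
    """Re(E e^{jwt}) x Re(H0 e^{jwt}) for complex spatial amplitudes e_amp and h_amp."""
    rot = cmath.exp(1j * wt)
    e_t = tuple((c * rot).real for c in e_amp)
    h_t = tuple((c * rot).real for c in h_amp)
    return cross(e_t, h_t)

def time_average(e_amp, h_amp, samples=64):
    """Average of the instantaneous Poynting vector over one period, using equally spaced samples."""
    acc = [0.0, 0.0, 0.0]
    for n in range(samples):
        p = instantaneous_poynting(e_amp, h_amp, 2.0 * math.pi * n / samples)
        for i in range(3):
            acc[i] += p[i] / samples
    return tuple(acc)

def half_re_e_cross_h_conj(e_amp, h_amp):
    """The complex-amplitude formula (1/2) Re(E x H0*)."""
    h_conj = tuple(c.conjugate() for c in h_amp)
    return tuple(0.5 * c.real for c in cross(e_amp, h_conj))

# Arbitrary illustrative amplitudes (not taken from the text).
E_AMP = (1 + 2j, 0.5j, -0.3 + 0j)
H_AMP = (0.2 - 0.1j, 1.0 + 0j, 0.4 + 0.7j)
avg = time_average(E_AMP, H_AMP)
formula = half_re_e_cross_h_conj(E_AMP, H_AMP)
```

Because the oscillating part varies as the second harmonic, equally spaced samples over one period cancel it exactly, so the two results agree to rounding error.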
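Similarly

$$\tfrac{1}{2}\varepsilon_0E^2 = \tfrac{1}{2}\varepsilon_0\,\mathbf{E}\cdot\mathbf{E} = \tfrac{1}{2}\varepsilon_0\left[\tfrac{1}{2}(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t})\cdot\tfrac{1}{2}(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t})\right] = \tfrac{1}{4}\varepsilon_0\,\mathbf{E}\cdot\mathbf{E}^* + \tfrac{1}{4}\varepsilon_0\,\mathrm{Re}\,(\mathbf{E}\cdot\mathbf{E}) \tag{5.88}$$

and

$$\tfrac{1}{2}\mu_0^{-1}B^2 = \tfrac{1}{4}\mu_0^{-1}\,\mathbf{B}\cdot\mathbf{B}^* + \tfrac{1}{4}\mu_0^{-1}\,\mathrm{Re}\,(\mathbf{B}\cdot\mathbf{B}) \tag{5.89}$$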


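The terms $\tfrac{1}{4}\varepsilon_0\,\mathbf{E}\cdot\mathbf{E}^*$ and $\tfrac{1}{4}\mu_0^{-1}\,\mathbf{B}\cdot\mathbf{B}^*$ are independent of time and represent the time-average stored energies; their time derivatives are zero. The terms $\tfrac{1}{4}\varepsilon_0\,\mathrm{Re}\,(\mathbf{E}\cdot\mathbf{E})$ and $\tfrac{1}{4}\mu_0^{-1}\,\mathrm{Re}\,(\mathbf{B}\cdot\mathbf{B})$ oscillate at a frequency 2ω and they represent the variable components of the stored energy.
Finally,

$$\mathbf{E}\cdot\boldsymbol{\iota}^r = \tfrac{1}{2}(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t})\cdot\tfrac{1}{2}(\boldsymbol{\mathcal{J}}^re^{j\omega t} + \boldsymbol{\mathcal{J}}^{r*}e^{-j\omega t}) = \tfrac{1}{2}\mathrm{Re}\;\mathbf{E}\cdot\boldsymbol{\iota}^{r*} + \tfrac{1}{2}\mathrm{Re}\;\mathbf{E}\cdot\boldsymbol{\iota}^r \tag{5.90}$$

with $\boldsymbol{\mathcal{J}}^r$ the complex spatial amplitude of the response current density. Here again, the term $\tfrac{1}{2}\mathrm{Re}\;\mathbf{E}\cdot\boldsymbol{\iota}^{r*}$ represents the time-average power density being absorbed by the response currents; the term $\tfrac{1}{2}\mathrm{Re}\;\mathbf{E}\cdot\boldsymbol{\iota}^r$ oscillates at a frequency 2ω and represents the energy density being cyclically absorbed and released by the response currents.
With this formulation, Equation (5.84) may be rewritten in two parts. The time-average power balance is seen to be

$$\bar{P} = \tfrac{1}{2}\mathrm{Re}\int_V \mathbf{E}\cdot\boldsymbol{\iota}^{r*}\,dV + \tfrac{1}{2}\mathrm{Re}\int_S \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} \tag{5.91}$$

whereas the time-variable part, oscillating at a frequency 2ω, may be written

$$P(2\omega) = \frac{d}{dt}\int_V\left[\tfrac{1}{4}\varepsilon_0\,\mathrm{Re}\,(\mathbf{E}\cdot\mathbf{E}) + \tfrac{1}{4}\mu_0^{-1}\,\mathrm{Re}\,(\mathbf{B}\cdot\mathbf{B})\right]dV + \tfrac{1}{2}\mathrm{Re}\int_V \mathbf{E}\cdot\boldsymbol{\iota}^r\,dV + \tfrac{1}{2}\mathrm{Re}\int_S \mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} \tag{5.92}$$

Thus, on the time average, the sources supply power only to that component of the response currents in phase with the electric field, represented by the first integral in (5.91), and to the net energy flow out of the volume V across the surface S. In addition, the sources may have to furnish energy and take it back at the cyclic rate 2ω if the right side of (5.92) is not zero. However, in many practical circumstances, the individual integrals in (5.92) may not be in phase, but may be adjusted purposely so that they cancel each other, thus "matching" the generator.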
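EXAMPLE 5.5

In a volume V away from all currents, Equations (5.91) and (5.92) give

$$\tfrac{1}{2}\mathrm{Re}\int_S \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = 0$$

$$\tfrac{1}{2}\mathrm{Re}\int_S \mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} = -\frac{d}{dt}\int_V\left[\tfrac{1}{4}\varepsilon_0\,\mathrm{Re}\,(\mathbf{E}\cdot\mathbf{E}) + \tfrac{1}{4}\mu_0^{-1}\,\mathrm{Re}\,(\mathbf{B}\cdot\mathbf{B})\right]dV$$

The first equation says that the average power flow into V equals the average power flow out. This is as it should be since V consists of free space. The second equation says that the energy which surges back and forth across S accommodates the cyclic variation of stored energy within V.
As a specific illustration, let V be sufficiently remote from the currents so that the fields may be described by (5.73) and (5.74). If then all dimensions of V are small compared to r (the distance to the currents), A assumes the simple form throughout V of

$$\mathbf{A} = \boldsymbol{\alpha}_0\,e^{j(\omega t - kz)}$$

with $\boldsymbol{\alpha}_0$ a constant, and the local Z axis chosen in the direction of propagation.
It then follows that

$$\mathbf{E} = \mathbf{1}_x(\mathcal{E}_0e^{-jkz})e^{j\omega t} \qquad \mathbf{B} = \mathbf{1}_y\left(\frac{\mathcal{E}_0}{c}e^{-jkz}\right)e^{j\omega t}$$

in which $\mathcal{E}_0$ is taken to be a real constant, and the X and Y axes have been oriented appropriately in a transverse plane. Thus

$$\tfrac{1}{4}\varepsilon_0\,\mathrm{Re}\,(\mathbf{E}\cdot\mathbf{E}) = \tfrac{1}{4}\varepsilon_0\,\mathrm{Re}\;\mathcal{E}_0^2e^{j2(\omega t - kz)} = \tfrac{1}{4}\varepsilon_0\mathcal{E}_0^2\cos 2(\omega t - kz)$$

$$\tfrac{1}{4}\mu_0^{-1}\,\mathrm{Re}\,(\mathbf{B}\cdot\mathbf{B}) = \tfrac{1}{4}\frac{\mu_0^{-1}}{c^2}\mathcal{E}_0^2\cos 2(\omega t - kz) = \tfrac{1}{4}\varepsilon_0\,\mathrm{Re}\,(\mathbf{E}\cdot\mathbf{E})$$

The time-varying energies stored in the electric and magnetic fields have the same peak values. The same is true of the time-average values.
Further,

$$\tfrac{1}{2}\mathbf{E}\times\mathbf{H}_0^* = c\,\mathbf{1}_z\left(\tfrac{1}{2}\varepsilon_0\mathcal{E}_0^2\right) \qquad \tfrac{1}{2}\mathrm{Re}\,(\mathbf{E}\times\mathbf{H}_0) = c\,\mathbf{1}_z\left(\tfrac{1}{2}\varepsilon_0\mathcal{E}_0^2\right)\cos 2(\omega t - kz)$$

The first of these two expressions has only a real part, is independent of spatial position, and gives the time-average value of the Poynting vector. It is interesting to note that the average power crossing unit transverse area is equal to the average energy stored in a volume c units long and unit area in cross section. Since the waves are propagating at a speed c, this is a most reasonable result.
The two integrals at the beginning of this example may be applied to the case for which V is a rectangular volume of square and unit cross section, one-quarter wavelength long in the Z direction. Then the first integral becomes

$$\tfrac{1}{2}\mathrm{Re}\int_S \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = c\left(\tfrac{1}{2}\varepsilon_0\mathcal{E}_0^2\right)\int_S \mathbf{1}_z\cdot d\mathbf{S}$$

This surface integral has contributions only over the two transverse surfaces, these contributions being equal and opposite, thus giving the required null result.
For the second integral,

$$c\left(\tfrac{1}{2}\varepsilon_0\mathcal{E}_0^2\right)\int_S \mathbf{1}_z\cos 2(\omega t - kz)\cdot d\mathbf{S} = -c\,\varepsilon_0\mathcal{E}_0^2\cos 2\omega t$$

$$-\frac{d}{dt}\int_V\left(\tfrac{1}{2}\varepsilon_0\mathcal{E}_0^2\right)\cos 2(\omega t - kz)\,dV = -\frac{d}{dt}\left(\frac{c\,\varepsilon_0\mathcal{E}_0^2}{2\omega}\sin 2\omega t\right) = -c\,\varepsilon_0\mathcal{E}_0^2\cos 2\omega t$$

and thus the two sides of the second equation of this example are seen to agree.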
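The agreement between the two sides of the oscillating balance can also be confirmed numerically. The sketch below is illustrative and uses assumptions not in the text: normalized units with c, ε₀, the field amplitude, and ω all set to one, a box of unit cross section extending from z = 0 to a quarter wavelength, a midpoint rule for the volume integral, and a central difference for the time derivative. The names are hypothetical.

```python
import math

# Normalized units (assumed for illustration): c = eps0 = E0 = omega = 1, hence k = 1.
C, EPS0, E0, OMEGA = 1.0, 1.0, 1.0, 1.0
K = OMEGA / C
LENGTH = math.pi / (2.0 * K)    # one-quarter wavelength

def surface_flow(t):
    """Oscillating outward power flow through the two transverse faces at z = LENGTH and z = 0."""
    amp = C * 0.5 * EPS0 * E0 ** 2
    return amp * (math.cos(2.0 * (OMEGA * t - K * LENGTH)) - math.cos(2.0 * OMEGA * t))

def stored_oscillating(t, n=2000):
    """Midpoint estimate of the oscillating stored energy in the box (unit cross section)."""
    h = LENGTH / n
    return sum(0.5 * EPS0 * E0 ** 2 * math.cos(2.0 * (OMEGA * t - K * (i + 0.5) * h)) * h
               for i in range(n))

def minus_rate_of_change(t, dt=1.0e-5):
    """Negative time derivative of the oscillating stored energy, by central difference."""
    return -(stored_oscillating(t + dt) - stored_oscillating(t - dt)) / (2.0 * dt)

t0 = 0.3
lhs = surface_flow(t0)
rhs = minus_rate_of_change(t0)
expected = -C * EPS0 * E0 ** 2 * math.cos(2.0 * OMEGA * t0)
```

In these units both sides reduce to minus cos 2ωt, in line with the closed-form result of the example.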

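All the principal features of the preceding discussion of Poynting's theorem for time-harmonic fields may be retained by deriving a complex form of Equation (5.84). If one assumes that all fields and currents are expressible as complex vector functions of position, multiplied by the time factor $e^{j\omega t}$, one can let

$$d^3P = -\mathbf{E}\cdot\boldsymbol{\iota}^{i*}\,dV \tag{5.93}$$

be defined as the complex power being supplied by the sources to the field. This concept of complex power will require and receive subsequent interpretation. Then, since

$$\boldsymbol{\iota}^{i*} = \nabla\times\mathbf{H}_0^* - \frac{\partial\mathbf{D}_0^*}{\partial t} - \boldsymbol{\iota}^{r*} = \nabla\times\mathbf{H}_0^* + j\omega\mathbf{D}_0^* - \boldsymbol{\iota}^{r*}$$

one may write

$$d^3P = \left[-\mathbf{E}\cdot\nabla\times\mathbf{H}_0^* - j2\omega\left(\tfrac{1}{2}\varepsilon_0\,\mathbf{E}\cdot\mathbf{E}^*\right) + \mathbf{E}\cdot\boldsymbol{\iota}^{r*}\right]dV$$

Once again, use of (V.108) and (5.25a) gives

$$-\mathbf{E}\cdot\nabla\times\mathbf{H}_0^* = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*) + \mathbf{H}_0^*\cdot\frac{\partial\mathbf{B}}{\partial t} = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*) + j2\omega\left(\tfrac{1}{2}\mu_0^{-1}\,\mathbf{B}\cdot\mathbf{B}^*\right)$$

and therefore

$$d^3P = \left[j2\omega\left(\tfrac{1}{2}\mu_0^{-1}\,\mathbf{B}\cdot\mathbf{B}^* - \tfrac{1}{2}\varepsilon_0\,\mathbf{E}\cdot\mathbf{E}^*\right) + \mathbf{E}\cdot\boldsymbol{\iota}^{r*} + \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*)\right]dV \tag{5.94}$$

If this expression is integrated the result is

$$P = j4\omega(W_m - W_e) + \int_V \mathbf{E}\cdot\boldsymbol{\iota}^{r*}\,dV + \int_S \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} \tag{5.95}$$

in which, by virtue of (5.88) and (5.89), $W_m$ and $W_e$ are the time-average values of the total energies stored in the magnetic and electric fields in the volume V.
One-half the real part of (5.95) is seen to be identical with (5.91) and gives the time-average power being delivered by the sources, that is,

$$\bar{P} = \tfrac{1}{2}\mathrm{Re}\;P \tag{5.96}$$

No equivalent simple interpretation may be placed upon the imaginary part of (5.95) in the general case. However, in regions away from the sources

$$\tfrac{1}{2}\mathrm{Im}\int_S \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = 2\omega(W_e - W_m) \tag{5.97}$$

Example 5.5 contained an illustration of (5.97) in that $\mathbf{E}\times\mathbf{H}_0^*$ did not have an imaginary part and the time-average values of electric and magnetic stored energies were found to be equal.
Because of the utility of the preceding formulation, it is customary when dealing with time-harmonic fields to define a complex Poynting vector by the relation

$$\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0^* \tag{5.98}$$

from which it follows, by use of (5.87), that the time-average value of energy flow at a point, per unit transverse area, is given by

$$\bar{\boldsymbol{\mathcal{P}}} = \tfrac{1}{2}\mathrm{Re}\;\boldsymbol{\mathcal{P}} \tag{5.99}$$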

EXAMPLE 5.6

In the application of Equation (5.95) to radiation problems, all points of the surface S are customarily remote from the currents, so that (5.73) and (5.74) are applicable and

$$\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0^* = \mathbf{1}_r\,\frac{\eta k^2}{(4\pi r)^2}\,(a_\theta a_\theta^* + a_\phi a_\phi^*) \tag{5.100}$$

in which $a_\theta$ and $a_\phi$ are components of a(θ,φ) as given by (5.76).
It should be noted that $\boldsymbol{\mathcal{P}}$ in (5.100) has only a real component and is therefore twice the time-average energy flow in watts per square meter. At a fixed large distance r from the system of currents, $\boldsymbol{\mathcal{P}}$ is a function of θ and φ and is known as the power radiation pattern. The directional dependence of $\boldsymbol{\mathcal{P}}$ is controlled by the current distribution through (5.76). If $\boldsymbol{\mathcal{P}}(\theta,\phi)$ is specified, (5.76) becomes an integral equation involving the sought-for current distribution ι(ξ,η,ζ); this defines an antenna synthesis problem. If ι(ξ,η,ζ) is specified, (5.76) is an integral solution for the power pattern $\boldsymbol{\mathcal{P}}(\theta,\phi)$; this defines an antenna analysis problem. These subjects have been treated extensively in the literature.¹²
For the specific case of the half-wave dipole of Example 5.2, since

$$\mathbf{a}_T = -\mathbf{1}_\theta\,\frac{2I_m}{k}\,\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}$$

use of (5.99) and (5.100) gives

$$\bar{\boldsymbol{\mathcal{P}}} = \tfrac{1}{2}\mathrm{Re}\;\boldsymbol{\mathcal{P}} = \mathbf{1}_r\,\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2$$

which agrees with the result found in Example 5.4.

5.10

SOLUTIONS TO THE WAVE EQUATION IN RECTANGULAR


COORDINATES-UNGUIDED WAVES

Equations (5.35) may be looked upon as dynamic analogs to Poisson's equation. The
developments in Sections 5.5 and 5.6 have revealed that solutions to these equations
in regions remote from the currents may have a wavelike nature. But at points not
occupied by currents, (5.35a) and (5.35b) reduce to
(5.101a)
(5.101b)

and these homogeneous vector differential equations may be likened to Laplace's equation. Because the general solutions to (5.101) are wavelike (as will be seen shortly), (5.101a) and (5.101b) are customarily referred to as the homogeneous vector wave equations.

¹² See, e.g., H. Jasik, ed., Antenna Engineering Handbook, McGraw-Hill Book Company, New York, 1961. Also, R. C. Hansen, ed., Microwave Scanning Antennas, Academic Press, New York, 1964.

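Since $\nabla\cdot\mathbf{E} \equiv 0$, $\nabla\cdot\mathbf{B} \equiv 0$ away from the currents, these equations further reduce to

$$\nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{5.102a}$$

$$\nabla^2\mathbf{B} - \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2} = 0 \tag{5.102b}$$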

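If $f$ is any component of E or B, then in rectangular coordinates

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} + \frac{\partial^2 f}{\partial (jct)^2} = 0 \tag{5.103}$$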

and this is seen to be a four-dimensional form of Laplace's equation. Using the method
of separation of variables, in exactly the same manner that it was employed in Section
3.11, one obtains as a primitive solution of (5.103)

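$$f = e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.104}$$

with $\mathbf{r}$ drawn from the origin to the point $(x,y,z)$ and

$$\mathbf{k} = \mathbf{1}_x k_x + \mathbf{1}_y k_y + \mathbf{1}_z k_z \tag{5.105}$$

$$k^2 = k_x^2 + k_y^2 + k_z^2 = \frac{\omega^2}{c^2} \tag{5.106}$$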

The solution (5.104) is recognized as representing a uniform plane wave, in that all
points in a plane transverse to k have the same amplitude and a common phase. The
wave propagates in the direction of $\mathbf{k}$ at the velocity of light and has a wavelength $\lambda$ given by

$$\lambda = \frac{2\pi}{k} \tag{5.107}$$

If attention is restricted to uniform plane waves propagating in the positive and negative X directions, Equation (5.104) indicates the fundamental solution

$$f(x,t) = a\,e^{j(\omega t - kx)} + b\,e^{j(\omega t + kx)} \tag{5.108}$$

in which $a$ and $b$ are constants.


By linear superposition, and with the aid of the Fourier integral theorem, a general
solution may be constructed from (5.108) of the form

$$f(x,t) = f_1(x - ct) + f_2(x + ct) \tag{5.109}$$

in which $f_1$ and $f_2$ are arbitrary functions. The forms of these functions show clearly that any spatial waveform existing at a time $t_1$ is preserved and merely displaced a distance $c(t_2 - t_1)$ at a later time $t_2$, indicating undistorted propagation at the velocity of light.
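The displacement property of (5.109) can be checked numerically. The sketch below is illustrative only: the Gaussian waveform `f1` and the sample times are hypothetical choices, and only the forward-traveling term is considered.

```python
import math

C = 299_792_458.0  # velocity of light in free space, m/s

def f1(s):
    """Hypothetical waveform: a Gaussian pulse (any function would do)."""
    return math.exp(-s * s)

def field(x, t):
    """Forward-traveling solution f(x, t) = f1(x - c t)."""
    return f1(x - C * t)

# The waveform seen at time t1 reappears at t2, displaced by c (t2 - t1).
t1, t2 = 0.0, 1.0e-8
shift = C * (t2 - t1)
for x in (-1.0, 0.0, 0.5, 2.0):
    assert math.isclose(field(x, t1), field(x + shift, t2), rel_tol=1e-9)
print("waveform displaced by", shift, "m without distortion")
```

The same check applies to the backward-traveling term $f_2(x + ct)$ with the sign of the shift reversed.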
If waves propagating in all directions are considered, three-dimensional Fourier integrals may be used to fabricate arbitrary spatial distributions. In particular, a solution

$$f(x,y,z,t) = g_1(y,z)f_1(x - ct) + g_2(y,z)f_2(x + ct) \tag{5.110}$$

may be constructed with $g_1$ and $g_2$ arbitrary functions. This is seen to represent nonuniform plane waves propagating in the X direction at the velocity of light, with arbitrary amplitude distributions in a transverse plane. By insertion of (5.110) in (5.103) it is evident that $g_1$ and $g_2$ satisfy the two-dimensional Laplace's equation.
If one returns to the constituent solution (5.104), which applies for any component
of E or B, it follows that on putting the components together, a uniform plane wave
may be represented by
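$$\mathbf{E}(x,y,z,t) = \mathbf{1}_E E_0\,e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.111}$$

$$\mathbf{B}(x,y,z,t) = \mathbf{1}_B B_0\,e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.112}$$

wherein $\mathbf{1}_E$ and $\mathbf{1}_B$ are unit vectors and $E_0$ and $B_0$ are complex constants. Since $\nabla\cdot\mathbf{E} \equiv 0$ and $\nabla\cdot\mathbf{B} \equiv 0$, it follows that both $\mathbf{1}_E$ and $\mathbf{1}_B$ must be transverse to $\mathbf{k}$, and for this reason the electromagnetic wave is said to be transverse.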
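E and B are related through Maxwell's equations, and insertion of (5.111) and (5.112) in (5.25a) gives

$$(\mathbf{k}\times\mathbf{1}_E)E_0 - \mathbf{1}_B\,\omega B_0 = 0$$

which requires that

$$\mathbf{1}_B = \mathbf{1}_k\times\mathbf{1}_E \tag{5.113}$$

$$B_0 = \frac{k}{\omega}E_0 = \frac{E_0}{c} \tag{5.114}$$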

Therefore the E and B fields are crossed, both being transverse to the direction of
propagation, and their amplitudes are in the ratio c. These properties are held in
common with spherical waves at great distances from the sources, as has already been
noted in Section 5.7. But this is hardly surprising, since spherical waves at great radii
of curvature are well-approximated by plane waves.
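Relations (5.113) and (5.114) translate into a short computation. This is a sketch under stated assumptions: the helper names are hypothetical, vectors are plain 3-tuples, and the unit vectors supplied are assumed to satisfy the transversality condition.

```python
C = 299_792_458.0  # velocity of light in free space, m/s

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(p * q for p, q in zip(a, b))

def plane_wave_b(k_hat, e_hat, e0):
    """Return (1_B, B_0) from 1_B = 1_k x 1_E and B_0 = E_0 / c."""
    if abs(dot(k_hat, e_hat)) > 1e-12:
        raise ValueError("1_E must be transverse to the propagation direction")
    return cross(k_hat, e_hat), e0 / C

# Wave traveling along +Z with E along X and an assumed amplitude of 1 V/m.
b_hat, b0 = plane_wave_b((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), 1.0)
print(b_hat, b0)
```

For propagation along +Z with E along X, the magnetic field points along Y, so that E, B, and the propagation direction form a right-handed set.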
The power density of this uniform plane wave is given by
(5.115)
and many of the remarks put forth in Example 5.5, in which a plane wave approximation was made, are applicable to this case.
By linear superposition, the above results for a uniform plane wave may be generalized to the case of a nonuniform plane wave through use of Fourier integrals. The development parallels what has already been said for a single component of the field.
EXAMPLE 5.7

Consider a uniform plane wave traveling in the +X direction and imagine that it encounters a flat conducting surface in the plane $x = 0$. If it is a good conductor, practically all the energy in the incident wave will be reflected. This situation may be idealized by assuming that the conductor is a perfect reflector, meaning that $\tfrac{1}{2}\,\mathrm{Re}\,\mathbf{E}\times\mathbf{H}^* \equiv 0$ within the conductor.
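Assume that the Y axis is oriented parallel to the incident electric field $\mathbf{E}^i$. Then from (5.111)

$$\mathbf{E}^i(x,t) = \mathbf{1}_y E_0^i\,e^{j\omega t - jkx}$$

whereas from (5.12) and (5.13),

$$\mathbf{B}^i(x,t) = \mathbf{1}_z\,\frac{E_0^i}{c}\,e^{j\omega t - jkx}$$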

Equation (5.115) gives for the incident power density

$$\mathcal{P}^i = c\,\mathbf{1}_x\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right)$$
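Let the reflected wave be represented by

$$\mathbf{E}^r(x,t) = \mathbf{1}_y E_0^r\,e^{j\omega t + jkx}$$

Then, since the power flow for the reflected wave must be in the $-X$ direction, Equations (5.12) and (5.13) give

$$\mathbf{B}^r(x,t) = -\mathbf{1}_z\,\frac{E_0^r}{c}\,e^{j\omega t + jkx}$$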

Because no field exists inside the idealized conductor, the total electric field at $x = 0^-$ must vanish, so that

$$\mathbf{E}^i(0^-,t) + \mathbf{E}^r(0^-,t) = \mathbf{1}_y e^{j\omega t}(E_0^i + E_0^r) \equiv 0$$

which requires that $E_0^r = -E_0^i$. This in turn satisfies the condition that the power density
in the reflected wave equals the incident power density.
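The total magnetic field just in front of the idealized conductor is therefore

$$\mathbf{B}^i(0^-,t) + \mathbf{B}^r(0^-,t) = \mathbf{1}_z e^{j\omega t}\left(\frac{E_0^i}{c} - \frac{E_0^r}{c}\right) = \mathbf{1}_z\,\frac{2E_0^i}{c}\,e^{j\omega t}$$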

whereas the total magnetic field just inside the conductor is zero. This discontinuity in the magnetic field is accommodated by a sheet of current which flows in the surface of the conductor. This current sheet has been induced by the incident wave and is the source of the reflected wave. Its strength may be deduced by recourse to the dynamic form of Ampere's circuital law, (5.28b). With reference to the figure, if a rectangular contour is chosen in the XZ plane such that its long legs, of length $l$, are just inside and just outside the conducting surface, then

$$\oint\mathbf{H}\cdot d\mathbf{l} = 2l\,\mu_0^{-1}\,\frac{E_0^i}{c}\,e^{j\omega t}$$

This must equal the total conduction current enclosed by the contour, since the displacement current $\dot{\mathbf{D}}$ will make no contribution if the short legs of the contour are reduced to infinitesimals. Therefore, the total conduction current enclosed is

$$I = 2lc\,\epsilon_0 E_0^i\,e^{j\omega t}$$

and a linear current density $\mathbf{K}$ flows in the conducting surface such that

$$\mathbf{K} = \mathbf{1}_y\,2c\,\epsilon_0 E_0^i\,e^{j\omega t}$$

[Figure: incident and reflected waves in free space at the surface of a conductor; X and Z axes shown.]

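In causing the flow of energy in the electromagnetic wave to be turned around, the conductor suffers a reaction which may be computed from the Ampere force law. Since

$$d^3\mathbf{F} = \boldsymbol{\iota}\times\mathbf{B}\,dV$$

if the areal current density $\boldsymbol{\iota}$ is "collapsed" into the surface to give a lineal current density $\mathbf{K}$, one obtains

$$d^2\mathbf{F} = \mathbf{K}\times\mathbf{B}\,dS$$

so that the conductor experiences a pressure due to the wave given by

$$\mathbf{p} = \frac{d^2\mathbf{F}}{dS} = \mathbf{K}\times\mathbf{B}$$

Since the sheet of current is immersed in a magnetic field whose spatial average value is $\mathbf{1}_z(E_0^i/c)e^{j\omega t}$, the instantaneous pressure (formed from the real parts of $\mathbf{K}$ and $\mathbf{B}$) is

$$\mathbf{p} = \mathbf{1}_x\,2\epsilon_0|E_0^i|^2\cos^2(\omega t + \alpha), \qquad \alpha = \arg E_0^i$$

and this has an average value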

$$\mathbf{1}_x\,2\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right)$$

which is twice the energy density in the wave (cf. Example 5.5).
This result is consistent with the viewpoint that the energy possessed by the incident wave in a column $c$ units long and unit cross section is $c(\tfrac{1}{2}\epsilon_0|E_0^i|^2)$. According to the mass-energy equivalence formula, this may be equated to $mc^2$. But then this much of the wave has a momentum equal to

$$mc = \tfrac{1}{2}\epsilon_0|E_0^i|^2$$

and this much momentum is reversed in 1 sec against unit area of the conductor, causing a radiation pressure of $2mc$.
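The radiation pressure result lends itself to a quick numerical estimate. The sketch below assumes a perfect reflector at normal incidence, as in the example; the function names and the sample power density of 1361 W/m² (a typical figure for sunlight above the atmosphere) are illustrative assumptions, not part of the original text.

```python
C = 299_792_458.0        # velocity of light, m/s
EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def pressure_from_field(e0_peak):
    """Average pressure 2 * (1/2 eps0 |E0|^2) on a perfect reflector, N/m^2."""
    return 2.0 * (0.5 * EPS0 * abs(e0_peak) ** 2)

def pressure_from_power_density(p_avg):
    """Same pressure expressed through the incident power density c (1/2 eps0 |E0|^2)."""
    return 2.0 * p_avg / C

# Assumed incident power density, W/m^2 (illustrative value).
p_inc = 1361.0
print(pressure_from_power_density(p_inc))
```

Both functions describe the same quantity, since the incident power density is $c$ times the energy density.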

If one returns to (5.111) and (5.112), which are the expressions for a uniform plane
wave, and the Z axis is chosen in the direction of propagation, then
$$\mathbf{1}_E E_0 = \mathbf{1}_x E_1 + \mathbf{1}_y E_2$$

in which $E_1$ and $E_2$ are complex constants. Therefore

$$\mathbf{E} = \mathbf{1}_x E_1 e^{j(\omega t - kz)} + \mathbf{1}_y E_2 e^{j(\omega t - kz)} \tag{5.116}$$

$$\mathbf{B} = -\mathbf{1}_x\,\frac{E_2}{c}\,e^{j(\omega t - kz)} + \mathbf{1}_y\,\frac{E_1}{c}\,e^{j(\omega t - kz)} \tag{5.117}$$

Inspection of these equations reveals that $(E_x, B_y)$ and $(E_y, B_x)$ are linearly independent fields. $(E_x, B_y)$ is said to be an X-polarized wave and $(E_y, B_x)$ is called a Y-polarized wave, the designation referring to the spatial direction of the electric field. The total field $(\mathbf{E},\mathbf{B})$ is the superposition of these two cross-polarized waves.
If the complex factors $E_1$ and $E_2$ have the same phase, then $\mathbf{E}$ at any point in space oscillates along a directional line which makes a constant angle $\varphi$ with the X axis, this angle being given by $\varphi = \tan^{-1}(E_2/E_1)$. Under this condition, the wave is said to be linearly polarized.
If $E_1$ and $E_2$ have the same magnitude, but their phases differ by 90 deg, then $\mathbf{E}$ at any point in space does not oscillate. Its magnitude is constant, but its direction rotates at the angular velocity $\omega$. To see this, let $E_2 = +jE_1$, so that (5.116) yields

$$\mathbf{E}(x,t) = \mathrm{Re}\,(\mathbf{1}_x + j\mathbf{1}_y)E_1 e^{j(\omega t - kz)} = E_1[\mathbf{1}_x\cos(\omega t - kz) - \mathbf{1}_y\sin(\omega t - kz)] \tag{5.118}$$

in which, for simplicity, the phase of $E_1$ has been chosen as zero. Therefore,

$$|\mathbf{E}(x,t)| = E_1[\cos^2(\omega t - kz) + \sin^2(\omega t - kz)]^{1/2} = E_1$$

a result which is independent of time and position. Further, the direction of $\mathbf{E}$ makes an angle $\varphi$ with the X axis given by

$$\varphi = \tan^{-1}\left[\frac{-\sin(\omega t - kz)}{\cos(\omega t - kz)}\right] = -(\omega t - kz) \tag{5.119}$$

At a fixed point $(x,y,z)$, $\varphi$ changes linearly with time at the angular rate $\omega$. If the thumb of the right hand is placed in the direction of propagation, $\mathbf{E}$ thus either rotates in the direction indicated by the other fingers, or counter to this direction. When the rotation of $\mathbf{E}$ agrees with the direction of the fingers ($E_2 = -jE_1$), the wave is said to be right-handed circularly polarized; if $\mathbf{E}$ rotates counter to the finger direction ($E_2 = +jE_1$), the wave is said to be left-handed circularly polarized.
Alternatively, if time is held fixed, and the direction of $\mathbf{E}$ is viewed as a function of $z$, Equation (5.119) indicates that the locus of the tip of $\mathbf{E}$ is a helix whose axis is the Z axis, the $z$ length of one turn being a wavelength. The helix resembles either a left-hand thread or a right-hand thread, depending on whether $E_2$ lags or leads $E_1$.
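The stored energy and the energy flow associated with either a right-handed or left-handed circularly polarized wave are given by

$$w_e = \tfrac{1}{2}\epsilon_0|\mathbf{E}(x,t)|^2 = \tfrac{1}{2}\epsilon_0 E_1^2 = w_m \tag{5.120}$$

$$\mathcal{P} = \mathbf{E}\times\mathbf{H} = \mathbf{1}_z\,\frac{E_1^2}{\eta} \tag{5.121}$$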

Therefore, the energy density and the energy flow are both independent of time and
space, a characteristic not shared by linearly polarized plane waves.
If one returns again to the general solution (5.116), and if $E_1$ and $E_2$ have arbitrary
relative amplitudes and phases, at any point in space the tip of E describes a locus
which is an ellipse, and for this reason the wave is said to be elliptically polarized. It is
left as an exercise to develop the properties of such plane waves, including the useful
fact that any elliptically polarized wave may be represented by appropriate amounts of
left-handed and right-handed circularly polarized waves.
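The decomposition into circularly polarized parts mentioned above can be sketched in a few lines. This is an illustrative calculation, not taken from the original text: it assumes the handedness convention used in this section (right-handed for $E_2 = -jE_1$, left-handed for $E_2 = +jE_1$), and the function name is hypothetical.

```python
def circular_components(e1, e2):
    """Split 1_x E1 + 1_y E2 into right- and left-handed circular parts.

    Writes the field as R (1_x - j 1_y) + L (1_x + j 1_y), so that
    E1 = R + L and E2 = j (L - R); returns the pair (R, L).
    """
    right = 0.5 * (e1 + 1j * e2)
    left = 0.5 * (e1 - 1j * e2)
    return right, left

# A purely right-handed wave (E2 = -j E1) has no left-handed part.
r, l = circular_components(1.0, -1j)
print(abs(r), abs(l))

# A linearly polarized wave splits into equal circular amplitudes.
r_lin, l_lin = circular_components(1.0, 1.0)
print(abs(r_lin), abs(l_lin))
```

Unequal magnitudes of the two parts correspond to elliptical polarization, with the sense of rotation set by the larger part.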

5.11

RECTILINEAR GUIDED WAVES

Many structures, such as two-wire lines, coaxial cables, waveguides, and dielectric
rods, have been found to possess the property of being able to guide electromagnetic
waves from one point to another. When this guiding occurs along a straight-line path,
the problem is amenable to analysis, for then every component of the electromagnetic
wave may be represented in the form
$$f(u,v)\,e^{j(\omega t - k_z z)} \tag{5.122}$$

in which $z$ is chosen as the propagation direction and $u$, $v$ are generalized orthogonal coordinates in a transverse plane. (See Mathematical Supplement, Section V.11.)
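Under this assumption, in a source-free region, Maxwell's equations become

$$\frac{1}{h_2}\frac{\partial E_z}{\partial v} + jk_z E_v = -j\omega B_u$$

$$-jk_z E_u - \frac{1}{h_1}\frac{\partial E_z}{\partial u} = -j\omega B_v$$

$$\frac{1}{h_1 h_2}\left[\frac{\partial}{\partial u}(h_2 E_v) - \frac{\partial}{\partial v}(h_1 E_u)\right] = -j\omega B_z \tag{5.123}$$

wherein $h_1$ and $h_2$ are the scale factors associated with $u$ and $v$.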
These equations can be solved for the transverse field components, yielding

(5.124)

in which $k^2 = \omega^2\mu_0\epsilon_0 = (\omega/c)^2 = (2\pi/\lambda)^2$ is the square of the free-space wave number.
Equations (5.124) indicate that, in general, the entire electromagnetic field can be determined from knowledge of the longitudinal components.
An important exception to this occurs when the field is propagating in the Z direction at the velocity of light, for then $k_z = k$ and Equations (5.124) have a pole, unless $E_z = B_z \equiv 0$. Once again the conclusion is reached that electromagnetic waves propagating in free space at the velocity of light are transverse.
If this case ($k_z = k$) is pursued further, with the longitudinal field components zero, Maxwell's equations (5.123) give

$$E_u = cB_v \qquad E_v = -cB_u \tag{5.125}$$

so that

$$\mathbf{E}\cdot\mathbf{B} = (\mathbf{1}_u cB_v - \mathbf{1}_v cB_u)\cdot(\mathbf{1}_u B_u + \mathbf{1}_v B_v) \equiv 0 \tag{5.126}$$

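and thus the transverse electric and magnetic fields are orthogonal. In addition, if one writes

$$\mathbf{E} = \boldsymbol{\mathcal{E}}(u,v)\,e^{j(\omega t - kz)} \qquad \mathbf{B} = \boldsymbol{\mathcal{B}}(u,v)\,e^{j(\omega t - kz)} \tag{5.127}$$

with the implication being that $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$ are transverse two-dimensional static fields, then

$$\nabla\times\boldsymbol{\mathcal{E}} = \begin{vmatrix} \dfrac{\mathbf{1}_u}{h_2} & \dfrac{\mathbf{1}_v}{h_1} & \dfrac{\mathbf{1}_z}{h_1 h_2} \\ \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial z} \\ h_1\mathcal{E}_u & h_2\mathcal{E}_v & 0 \end{vmatrix} = 0 \tag{5.128}$$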

Therefore $\boldsymbol{\mathcal{E}}$ may be expressed as the negative gradient of an electrostatic potential function. For this reason, if any two-dimensional static electric field, such as those found in Chapter 3, is put in motion at the velocity of light, the result is a valid dynamic solution to Maxwell's equations. The rich storehouse of solved two-dimensional electrostatic problems is thus available for consideration in the creation of rectilinear guided waves.
Furthermore, since $\boldsymbol{\mathcal{B}}$ is orthogonal to $\boldsymbol{\mathcal{E}}$, it follows that the flux lines of $\boldsymbol{\mathcal{B}}$ lie in the equipotential surfaces of any two-dimensional electrostatic problem. Field maps such as those in Example 3.29 may be viewed as giving an electrostatic field and its equipotentials, or alternatively, as the transverse electric and magnetic fields of a propagating wave.
EXAMPLE 5.8

In Example 3.14, the image principle was used to determine the electrostatic potential distribution due to two infinitely long, parallel tubular conductors, each of diameter $2a$ and center-to-center spacing $D$, when the upper cylinder contained a net charge $+\kappa$ coul/m and the lower cylinder contained a net charge $-\kappa$ coul/m. With the coordinate system arranged as in the figure, the potential was given by

$$\Phi(x,y) = \frac{\kappa}{4\pi\epsilon_0}\ln\left\{\frac{x^2 + [y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2}{x^2 + [y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2}\right\}$$

When the coordinates of a point on the upper conductor were inserted in this expression, the potential of the upper conductor was deduced. When the same thing was done for the lower conductor, the potential difference was found to be

$$V = \frac{\kappa}{\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}$$

The electrostatic field caused by this static charge distribution may be determined from $\boldsymbol{\mathcal{E}} = -\nabla\Phi$, giving

$$\boldsymbol{\mathcal{E}}(x,y) = \frac{V\,\mathbf{f}(x,y)}{2\ln\{D/2a + [(D/2a)^2 - 1]^{1/2}\}}$$

in which

$$\mathbf{f}(x,y) = \frac{\mathbf{1}_x x + \mathbf{1}_y[y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]}{x^2 + [y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2} - \frac{\mathbf{1}_x x + \mathbf{1}_y[y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]}{x^2 + [y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2}$$

If this static electric field is put in motion along the cylinders at the velocity of light (this assumes the cylinders are perfectly conducting and in a free-space environment), then the electric field distribution becomes

$$\mathbf{E}(x,y,z,t) = \boldsymbol{\mathcal{E}}(x,y)\,e^{j(\omega t - kz)} = \frac{\mathbf{f}(x,y)}{2\ln\{D/2a + [(D/2a)^2 - 1]^{1/2}\}}\,V e^{j(\omega t - kz)}$$

and a voltage wave can be imagined to travel along the twin cylinders in conjunction with the electric field.

[Figure: cross section of the twin cylindrical conductors, each of diameter 2a; X axis shown.]

The accompanying magnetic field may be deduced from Maxwell's equations (5.123) or from (5.125) and is given by

$$\mathbf{B}(x,y,z,t) = \frac{1}{c}\,\mathbf{1}_z\times\mathbf{E}(x,y,z,t)$$

Using the integral form (5.28b) of Maxwell's second equation, and taking a contour in the XY plane which coincides with the perimeter of the upper conductor, one is able to determine the current flow in the upper conductor. Since $\dot{\mathbf{D}} \equiv 0$ within a perfect conductor, the result is that

$$I_{\text{enclosed}} = I e^{j(\omega t - kz)} = \oint\mathbf{H}\cdot d\mathbf{l} = \frac{V e^{j(\omega t - kz)}}{\dfrac{1}{\pi}\sqrt{\dfrac{\mu_0}{\epsilon_0}}\,\ln\{D/2a + [(D/2a)^2 - 1]^{1/2}\}}$$

so that the complex current amplitude, $I$, is linearly proportional to the complex voltage amplitude, $V$. The current in the upper conductor also is seen to be a wave; a counter current flows in the lower conductor. This two-conductor system, which guides the electromagnetic wave rectilinearly, is called a two-wire transmission line.
The ratio $V/I$ is of some interest, and is called the characteristic impedance of the transmission line. It is given by

$$Z = \frac{V}{I} = 120\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}\ \text{ohms}$$

in which the numerical values of $\mu_0$ and $\epsilon_0$ have been inserted. $Z$ is seen to be pure real and to have a value governed by the geometry of the twin conductors. If a finite length of this two-wire line is terminated by a lumped resistor of value $R = Z$, a wave traveling along the wires, upon reaching the resistive termination, will be totally absorbed, since the voltage-to-current ratio in the resistor is exactly the value required by the wave. If $R \neq Z$, there must be a reflection.
This procedure may be repeated for a variety of transmission line geometries. Several
cases are included among the problems at the end of this chapter.
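The characteristic impedance formula of this example is easily evaluated. A minimal sketch, with a hypothetical function name and illustrative dimensions; it uses the rounded factor 120 quoted in the text.

```python
import math

def two_wire_impedance(spacing, radius):
    """Characteristic impedance in ohms of a two-wire line in free space.

    spacing is the center-to-center distance D and radius is a, in the
    same units. Uses Z = 120 ln{D/2a + [(D/2a)^2 - 1]^(1/2)}.
    """
    ratio = spacing / (2.0 * radius)
    if ratio <= 1.0:
        raise ValueError("conductors overlap: D must exceed 2a")
    return 120.0 * math.log(ratio + math.sqrt(ratio * ratio - 1.0))

# Illustrative geometry: wires of 1 mm radius spaced 20 mm apart.
print(round(two_wire_impedance(20.0, 1.0), 1))
```

The logarithm is the inverse hyperbolic cosine of $D/2a$, so the impedance grows only slowly as the wires are moved apart.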

Returning to Equations (5.124), if $k_z \neq k$, one can use these equations to find the transverse-field components if the longitudinal components are known. Since these equations are linear functions of $E_z$ and $B_z$, partial transverse fields due to $E_z$ alone may be determined by setting $B_z \equiv 0$. Such fields are called transverse magnetic, or more briefly TM waves. Similarly, a second set of partial transverse fields due to $B_z$ alone may be determined by setting $E_z \equiv 0$. These fields are called transverse electric, or TE waves. The most general solution is then an arbitrary sum of the two sets of partial fields.
Because the spatial derivatives of $\mathbf{1}_u$ and $\mathbf{1}_v$ are transverse, the vector wave equation (5.102) has a separable Z component (cf. Mathematical Supplement, Section V.16) which may be written

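$$\frac{1}{h_1 h_2}\left(\frac{\partial}{\partial u}\frac{h_2}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1}{h_2}\frac{\partial}{\partial v}\right)E_z + (k^2 - k_z^2)E_z = 0 \tag{5.129a}$$

$$\frac{1}{h_1 h_2}\left(\frac{\partial}{\partial u}\frac{h_2}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1}{h_2}\frac{\partial}{\partial v}\right)B_z + (k^2 - k_z^2)B_z = 0 \tag{5.129b}$$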

Solutions to (5.129) which fit the boundary conditions of the problem under consideration may be inserted in (5.124) to determine the transverse-field components, thereby
completing the description of the electromagnetic waves which are traveling along the
guiding structure.
EXAMPLE 5.9

A rectangular waveguide consists of a hollow pipe, usually made of good conductor, with a rectangular cross section, as shown in the figure. This waveguide will support both TM and TE waves, as may be seen by the following argument:
In Cartesian coordinates, (5.129a) reduces to

$$\frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + (k^2 - k_z^2)E_z = 0$$

If the walls are good conductors, $E_z \approx 0$ against each wall. If this were not so, the currents induced in the walls would be so high as not to match properly the tangential magnetic field. This condition will be modeled by choosing as boundary conditions $E_z \equiv 0$ in each of the four walls. Then the suitable primitive solution of the above wave equation is

$$E_z = \frac{2}{\sqrt{ab}}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\,e^{j(\omega t - k_z z)} \tag{5.130}$$

[Figure: rectangular waveguide cross section of width a along the X axis, with Z the axial direction.]

in which $m$ and $n$ are independent positive integers (greater than zero). $E_z$, as given by (5.130), and the four transverse-field components associated with it, determinable from (5.124), together are called the TM$_{mn}$ mode for rectangular waveguide. In (5.130) the factor $2/\sqrt{ab}$ is included for normalization purposes, such that

$$\int_0^b\int_0^a \psi_{mn}\psi_{rs}\,dx\,dy = \delta_{mn}^{rs}$$

wherein

$$\psi_{mn} = \frac{2}{\sqrt{ab}}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b} \tag{5.131}$$

and the Kronecker delta, $\delta_{mn}^{rs}$, equals unity if $r = m$ and $s = n$, being otherwise zero.
Upon substituting (5.130) in the wave equation, one finds that $k_z = \beta_{mn}$ where

$$\beta_{mn} = \left[k^2 - \left(\frac{m\pi}{a}\right)^2 - \left(\frac{n\pi}{b}\right)^2\right]^{1/2} \tag{5.132}$$

and this mode will propagate only if

$$k^2 > \left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2$$

If this condition is not met, the mode will attenuate exponentially. For given interior dimensions $(a,b)$, the higher the values of $m$ and $n$, the shorter must be the free-space wavelength $\lambda$ in order to achieve propagation.
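The propagation condition can be explored numerically. The sketch below is illustrative and rests on assumptions not in the original text: the function names are hypothetical, and the sample dimensions (22.86 mm by 10.16 mm, the interior of a common X-band guide) are chosen only as an example. It assumes the phase constant has the form $\beta_{mn}^2 = k^2 - (m\pi/a)^2 - (n\pi/b)^2$ obtained by substituting (5.130) into the wave equation.

```python
import math

C = 299_792_458.0  # velocity of light in free space, m/s

def beta_mn(freq_hz, a, b, m, n):
    """Phase constant of the (m, n) mode in rad/m, or None below cutoff."""
    k = 2.0 * math.pi * freq_hz / C
    value = k * k - (m * math.pi / a) ** 2 - (n * math.pi / b) ** 2
    return math.sqrt(value) if value > 0.0 else None

def cutoff_frequency(a, b, m, n):
    """Frequency in Hz at which beta_mn passes through zero."""
    return 0.5 * C * math.hypot(m / a, n / b)

# Assumed interior dimensions in meters (illustrative).
a, b = 0.02286, 0.01016
print(cutoff_frequency(a, b, 1, 1))   # cutoff of the lowest TM mode
print(beta_mn(10.0e9, a, b, 1, 1))    # None: the mode is cut off at 10 GHz
print(beta_mn(20.0e9, a, b, 1, 1))    # real: the mode propagates at 20 GHz
```

Below cutoff the square root becomes imaginary, which corresponds to the exponential attenuation noted above.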
The most general solution for $E_z$ consists of a linear superposition of terms like (5.130) for all possible values of $m$ and $n$, that is,

$$E_z = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} K_{mn}\psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.133}$$

in which the $K_{mn}$ are arbitrary complex constants. Equations (5.124) may be used to obtain expressions for the four transverse-field components associated with (5.133). The resulting collection of five equations describes the most general combination of TM modes traveling in the positive Z direction in a rectangular waveguide. Reversing the sign before $\beta_{mn}$ will give a similar solution for propagation in the negative Z direction.
In like manner, in Cartesian coordinates (5.129b) reduces to

$$\frac{\partial^2 B_z}{\partial x^2} + \frac{\partial^2 B_z}{\partial y^2} + (k^2 - k_z^2)B_z = 0$$

In order to satisfy the boundary conditions that $E_x \equiv 0$ in the top and bottom walls, and that $E_y \equiv 0$ in the side walls, the suitable primitive solution of this wave equation is

$$B_z = \Psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.134}$$

in which $\Psi_{mn}$ is the product $\cos(m\pi x/a)\cos(n\pi y/b)$ multiplied by a normalizing factor inversely proportional to $\sqrt{ab}$ (5.135), and $m$ and $n$ are independent positive integers. (One or the other can be zero, but not both.) $\beta_{mn}$ once again is given by (5.132). $B_z$, as given by (5.134), and the four transverse-field components associated with it, determinable from (5.124), are together called the TE$_{mn}$ mode for rectangular waveguide.
The most general solution for $B_z$ consists of a linear superposition of terms like (5.134) for all possible values of $m$ and $n$, that is,

$$B_z = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} K_{mn}\Psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.136}$$

with the $K_{mn}$ arbitrary complex constants. Use of Equations (5.124) will yield expressions for the four transverse-field components associated with (5.136). The resulting collection of five equations describes the most general combination of TE modes traveling in the positive Z direction in a rectangular waveguide. Reversing the sign before $\beta_{mn}$ will give a similar solution for propagation in the negative Z direction.
A study of (5.132) reveals that, if $a > b$, the selection $m = 1$, $n = 0$ yields the lowest possible value of $k$ which will permit propagation. Therefore the TE mode for which $m = 1$, $n = 0$ (i.e., the TE$_{10}$ mode) will propagate at a lower frequency (longer free-space wavelength) than any other TE mode, and than any TM mode. This means that, at a given frequency, it is possible to choose $a$ and $b$ such that only the TE$_{10}$ mode will propagate. It is left as an exercise to show that to ensure this condition, $\lambda/2 < a < \lambda$, $b < \lambda/2$.
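The single-mode condition can be checked by listing which index pairs are above cutoff at a given wavelength. This is a sketch with hypothetical function names and illustrative dimensions; it assumes the cutoff wavelength $\lambda_c = 2/[(m/a)^2 + (n/b)^2]^{1/2}$ that follows from setting $\beta_{mn} = 0$, and it scans TE index pairs only (TM modes with $m, n \geq 1$ share the same cutoffs).

```python
import math

def cutoff_wavelength(a, b, m, n):
    """Free-space wavelength at which the (m, n) mode is exactly at cutoff."""
    return 2.0 / math.hypot(m / a, n / b)

def propagating_te_modes(wavelength, a, b, max_index=3):
    """List TE (m, n) pairs, excluding (0, 0), that propagate at this wavelength."""
    modes = []
    for m in range(max_index + 1):
        for n in range(max_index + 1):
            if (m, n) != (0, 0) and cutoff_wavelength(a, b, m, n) > wavelength:
                modes.append((m, n))
    return modes

def only_te10_propagates(wavelength, a, b):
    """Condition quoted in the text: lambda/2 < a < lambda and b < lambda/2."""
    return wavelength / 2.0 < a < wavelength and b < wavelength / 2.0

# Assumed dimensions in meters and a free-space wavelength of 30 mm.
a, b, lam = 0.02286, 0.01016, 0.030
print(propagating_te_modes(lam, a, b), only_te10_propagates(lam, a, b))
```

When the inequalities hold, the scan returns the single pair (1, 0).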
Because of this unique feature of the TE$_{10}$ mode, it is particularly useful when electromagnetic energy must be conveyed with a well-defined field distribution. From (5.136) and (5.124), the field components of this fundamental mode are given by

$$\begin{aligned} E_y &= -\frac{j\omega}{\pi/a}\,K\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} \\ B_x &= \frac{j\beta_{10}}{\pi/a}\,K\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} \\ B_z &= K\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} \end{aligned} \tag{5.137}$$

with $K$ an arbitrary complex constant and

$$\beta_{10} = \left[k^2 - \left(\frac{\pi}{a}\right)^2\right]^{1/2} \tag{5.138}$$

Inspection of Equations (5.137) reveals that the electric flux lines for the TE$_{10}$ mode go straight across from one broad wall to another, originating on positive charge and terminating on negative charge. The instantaneous electric flux pattern at one cross section and charge distribution along one broad wall are shown in part (a) of the second figure.

[Figure: TE$_{10}$ mode in rectangular waveguide. (a) Electric field and charge distribution; (b) Magnetic field; (c) Current flow.]

The magnetic flux lines are closed loops which lie in planes parallel to the broad walls and which encircle the y-directed displacement current $\dot{\mathbf{D}}$. Some of these flux lines are shown in part (b) of the second figure. The magnetic flux density against the waveguide walls is associated with current flow in the walls which may be deduced from the integral form of Maxwell's second equation, (5.28b). If perfectly conducting walls are assumed, $\dot{\mathbf{D}} \equiv 0$ within the conductor, and a lineal current density flows in the walls of amount

$$\mathbf{K} = \mathbf{1}_n\times\mathbf{H} \tag{5.139}$$

in which $\mathbf{1}_n$ is a unit vector normal to the wall and pointing into the interior of the waveguide. Part (c) of the second figure shows an instantaneous plot of some of the current lines.

5.12

SOLUTIONS TO THE WAVE EQUATION IN CYLINDRICAL COORDINATES

Equations (5.102) were used as the point of departure in deducing wavelike solutions to the field equations in rectangular coordinates. This was a relatively direct procedure because of the simple form taken by the Laplacian of a vector in Cartesian frames of reference. However, reinspection of Equation (4.52) reveals that the task is not so simple in circular cylindrical coordinates, except when dealing with an axial field component. For this reason, the approach to be adopted in the following study of cylindrical waves assumes that

$$\mathbf{E}(r,\phi,z,t) = \boldsymbol{\mathcal{E}}(r,\phi)\,e^{j\omega t - jk_z z} \tag{5.140a}$$

$$\mathbf{B}(r,\phi,z,t) = \boldsymbol{\mathcal{B}}(r,\phi)\,e^{j\omega t - jk_z z} \tag{5.140b}$$

in which $k_z$ is the wave number in the Z direction and $\omega$ is the angular frequency. By treating $k_z$ as a parameter, traveling or standing waves in the Z direction may be represented by appropriate linear combinations of fields of the type given by (5.140a) and (5.140b). Use of the Fourier integral theorem will embrace a still wider variety of physically realizable distributions within this formulation.
With E and B assumed in the form (5.140), the analysis of Section 5.11 is pertinent, and in this coordinate geometry (5.124) gives
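$$\begin{aligned} \mathcal{E}_r &= -\frac{1}{k^2 - k_z^2}\left(jk_z\frac{\partial\mathcal{E}_z}{\partial r} + \frac{j\omega}{r}\frac{\partial\mathcal{B}_z}{\partial\phi}\right) \\ \mathcal{E}_\phi &= \frac{1}{k^2 - k_z^2}\left(-\frac{jk_z}{r}\frac{\partial\mathcal{E}_z}{\partial\phi} + j\omega\frac{\partial\mathcal{B}_z}{\partial r}\right) \end{aligned} \tag{5.141}$$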

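so that, if the longitudinal field components can be found, the transverse components are generally determinable.
The wave equations (5.129) are applicable and in cylindrical coordinates give

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right)\begin{Bmatrix}\mathcal{E}_z \\ \mathcal{B}_z\end{Bmatrix} + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2}\begin{Bmatrix}\mathcal{E}_z \\ \mathcal{B}_z\end{Bmatrix} + (k^2 - k_z^2)\begin{Bmatrix}\mathcal{E}_z \\ \mathcal{B}_z\end{Bmatrix} = 0 \tag{5.142}$$

By assuming either $\mathcal{E}_z$ or $\mathcal{B}_z$ to be expressible in the form $f_1(r)f_2(\phi)$, one achieves the separation

$$r\frac{d}{dr}\left(r\frac{df_1}{dr}\right) + [(k^2 - k_z^2)r^2 - k_\phi^2]f_1 = 0 \tag{5.143}$$

$$\frac{d^2 f_2}{d\phi^2} + k_\phi^2 f_2 = 0 \tag{5.144}$$

with $k_\phi^2$ a separation constant.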

If $k_\phi$ is limited to the integral values $n = 0, 1, 2, \ldots$ (which provides a complete Fourier series), then letting $v = \sqrt{k^2 - k_z^2}\,r$ converts (5.143) to

$$v^2\frac{d^2 f_1}{dv^2} + v\frac{df_1}{dv} + (v^2 - n^2)f_1 = 0$$

which is seen to be identical with (3.71) and is recognized as Bessel's differential equation. This equation and its solutions were discussed in Chapter 3 and Appendix C. The solutions may be given in many forms, including Bessel functions of the first and second kinds, modified Bessel functions, and Hankel functions. Because of the asymptotic forms (3.74) and (3.75), the Hankel functions are particularly convenient when representing radial waves.
By virtue of the foregoing, $E_z(r,\phi,z,t)$ and $B_z(r,\phi,z,t)$ may be composed of suitable products of the factors

$$\begin{aligned} f_1(v) &= Z_n(v) \\ f_2(\phi) &= \{e^{jn\phi}\} \\ f_3(z) &= e^{-jk_z z} \\ f_4(t) &= e^{j\omega t} \end{aligned} \tag{5.145}$$

with $Z_n(v)$ representing suitable cylinder Bessel functions. The usual Fourier techniques may be used to generalize $f_2$, $f_3$, and $f_4$, and orthogonal expansions are also available for $f_1$ (cf. Chapter 3).
If the axis $r = 0$ is included, unless it contains sources, $f_1(v)$ must be expressed in terms of $J_n(v)$ alone. If only a sector in the $\phi$ direction is being considered, $n$ need not be an integer; the corresponding Bessel functions have a non-integral index and are unlikely to be tabulated.
An elementary wave function consists of the product of the above four factors with $n$, $k_z$, and $k$ (or $\omega$) specified, this triplet of numbers serving as identification. More general solutions may then be given by summing on these three indices. The specific solutions for $E_z$ and $B_z$ will differ in the values attached to the different elementary wave functions through imposition of the boundary conditions.
EXAMPLE 5.10

Assume that a tubular sheet of current $\mathbf{1}_z K e^{j\omega t}$ amp/m flows in the cylindrical surface $r = a$ between the limits $z = \pm\infty$, with $K$ a complex constant. What field does this source create in the region $r > a$?
Because of the disposition of the currents, B must be transverse to the Z axis, and the entire solution may be given in terms of $E_z$. By symmetry, the resultant field must be independent of $\phi$ and $z$; thus $n = k_z = 0$. Further, at large radial distances

$$H_0^{(1),(2)}(kr) \to \sqrt{\frac{2}{\pi kr}}\,e^{\pm j(kr - \pi/4)}$$

with the plus sign applying for $H_0^{(1)}$ and the minus sign for $H_0^{(2)}$; since this source system causes outgoing waves, only $H_0^{(2)}$ need be selected. Therefore

$$E_z(r,t) = b_0 H_0^{(2)}(kr)\,e^{j\omega t}$$

with $k = \omega/c$ and the constant $b_0$ yet to be determined.

Using (5.141), one finds that $E_r \equiv 0$, $E_\phi \equiv 0$, $B_r \equiv 0$, and

and the fields are given by

The ratio of the field components is

$$\frac{E_z}{B_\phi} = -jc\,\frac{H_0^{(2)}(kr)}{H_1^{(2)}(kr)}$$

and this ratio approaches $-c$ as $r \to \infty$. This same ratio also has been observed for the components of rectangular plane waves.

The time-average power density is

cP = -1 CRe
2

E X fI*
0

I'

.
21
t, Jr]
1
II(2)(kr)II(1)(kr)
2 \Hi2 ) (ka)12 0
1

At large radial distances this formula may be reduced by using the asymptotic expressions
for the Hankel functions. Further simplification is possible if ka is large, in which case

EXAMPLE 5.11

A circular cylindrical waveguide consists of a hollow conducting tube of inner radius $a$. Find the general expression for transverse magnetic waves propagating axially inside this tube.
Because the axis $r = 0$ is included in the region of interest, $J_n(v)$ must be chosen as the radial function and (5.145) gives

$$E_z = J_n(v)\begin{Bmatrix}\cos n\phi \\ \sin n\phi\end{Bmatrix}e^{j(\omega t - k_z z)}$$

If the walls are assumed to be composed of a perfect conductor, $E_z(a,\phi,z,t) \equiv 0$, which requires that

$$J_n\left(\sqrt{k^2 - k_z^2}\,a\right) = 0$$

If the roots of $J_n$ are designated by $\gamma_{n1}, \gamma_{n2}, \ldots, \gamma_{nm}, \ldots$, then

$$(k^2 - k_z^2)a^2 = \gamma_{nm}^2$$

and the propagation constant $k_z$ may have a sequence of values given by

$$\beta_{nm} = k_z = \sqrt{k^2 - \frac{\gamma_{nm}^2}{a^2}}$$

Therefore there is a doubly infinite set of transverse magnetic modes which can exist in a circular cylindrical waveguide, and these modes are distinguished by the indices $n$, $m$. For the TM$_{nm}$ mode,

$$E_z = J_n\left(\gamma_{nm}\frac{r}{a}\right)\begin{Bmatrix}\cos n\phi \\ \sin n\phi\end{Bmatrix}e^{j(\omega t - \beta_{nm}z)}$$

with the other field components deducible from (5.141). Whether or not the TM$_{nm}$ mode will propagate is governed by whether $\beta_{nm}$ is real or imaginary, which is determined by whether or not $ka$ is greater than $\gamma_{nm}$.
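The roots of $J_n$ that set the TM cutoffs can be found with elementary numerical tools. The following is a sketch only, not the procedure of the original text: it evaluates $J_n$ from its power series (adequate for the small arguments used here), locates a root by bisection inside a bracketing interval chosen by hand, and uses hypothetical function names and an illustrative tube radius.

```python
import math

C = 299_792_458.0  # velocity of light in free space, m/s

def bessel_j(n, x, terms=40):
    """Bessel function J_n(x) of integer order n >= 0 from its power series."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.factorial(s + n))
                  * (x / 2.0) ** (2 * s + n))
    return total

def bracketed_root(n, lo, hi, tol=1e-12):
    """Root of J_n between lo and hi by bisection (a sign change is assumed)."""
    f_lo = bessel_j(n, lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(n, mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def tm_cutoff_frequency(root, radius):
    """Frequency in Hz at which k a equals the given root of J_n."""
    return root * C / (2.0 * math.pi * radius)

gamma_01 = bracketed_root(0, 2.0, 3.0)   # first root of J_0
gamma_11 = bracketed_root(1, 3.0, 4.5)   # first nonzero root of J_1
print(gamma_01, gamma_11)
print(tm_cutoff_frequency(gamma_01, 0.010))  # assumed 10 mm tube radius
```

Since the first root of $J_0$ is the smallest of all the roots involved, the mode with $n = 0$, $m = 1$ has the lowest cutoff among the TM modes.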

5.13

SOLUTIONS TO THE WAVE EQUATION IN SPHERICAL COORDINATES

If Equations (5.102) are used as the starting point for the deduction of wavelike solutions to the field equations in spherical coordinates, a study of Equation (-1 ..14) indicates the difficulty of the task. All three field components are involved in all three components of the vector wave equation and separation is possible only in a few particular situations of symmetry. Still another technique must be found for the solution of (5.102) by indirect means.

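The approach to be followed begins with a study of the scalar wave equation

$$\nabla^2\Phi - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = 0 \tag{5.146}$$

in which $\Phi(r,\theta,\phi,t)$ is a scalar function expressed in spherical coordinates. Letting time variations be accounted for by writing $\Phi = \Psi(r,\theta,\phi)e^{j\omega t}$ yields

$$(\nabla^2 + k^2)\Psi = 0 \tag{5.147}$$

which may be separated by assuming $\Psi = f_1(r)f_2(\theta)f_3(\phi)$. This gives

$$\frac{\sin^2\theta}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{\sin\theta}{f_2}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + k^2 r^2\sin^2\theta + \frac{1}{f_3}\frac{d^2 f_3}{d\phi^2} = 0$$

which breaks into the pieces

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + [k^2 r^2 - n(n+1)]f_1 = 0 \tag{5.148}$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + \left[n(n+1) - \frac{m^2}{\sin^2\theta}\right]f_2 = 0 \tag{5.149}$$

$$\frac{d^2 f_3}{d\phi^2} + m^2 f_3 = 0 \tag{5.150}$$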

with $m$ and $n$ separation constants. If the field is to be single-valued and a complete azimuthal region is being considered, $m$ must be an integer; this choice for $m$ also gives a complete Fourier series representation to $f_3$.
Equation (5.149) was encountered in Section 3.13 and is recognized as being the associated Legendre equation. If the axis $\theta = 0, \pi$ is to be included, it has finite solutions $P_n^m(\cos\theta)$ whose properties are described in Appendix D. Restricting $n$ to zero and the positive integers provides a complete orthogonal set of functions from which to construct $f_2$.
If the substitution $f_1 = (kr)^{-1/2}F(r)$ is made, Equation (5.148) may be transformed to

$$r^2\frac{d^2 F}{dr^2} + r\frac{dF}{dr} + \left[k^2 r^2 - \left(n + \tfrac{1}{2}\right)^2\right]F = 0 \tag{5.151}$$

which may be recognized as Bessel's differential equation, being in the same form as (3.64). It gives rise to the solutions

$$f_1 = (kr)^{-1/2}Z_{n+1/2}(kr) \tag{5.152}$$

in which $Z_{n+1/2}(kr)$ is a cylinder function of half-order, i.e., an appropriate choice among Bessel functions of the first and second kinds, Hankel functions, etc. These functions have the same asymptotic behavior, recurrence relations, and orthogonal properties as the cylinder functions of whole order (cf. Appendix C).
It is customary to define a spherical Bessel function by the notation
(5.153)


and with this terminology one may construct solutions for $\Psi$ by forming products
of the partial solutions
$$f_1(r) = z_n(kr) \qquad f_2(\theta) = P_n^m(\cos\theta) \qquad f_3(\phi) = e^{jm\phi} \tag{5.154}$$
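The radial factor can be checked numerically. The sketch below is only an illustration and is not part of the original development: it assumes the standard closed forms $j_0(x) = \sin x/x$ and $j_1(x) = \sin x/x^2 - \cos x/x$ for the spherical Bessel functions of the first kind, and it evaluates the left side of (5.148) by central finite differences. The function names and the step size `h` are hypothetical choices.

```python
import math

def j0(x):
    """Spherical Bessel function of the first kind, order 0: sin(x)/x."""
    return math.sin(x) / x

def j1(x):
    """Spherical Bessel function of the first kind, order 1."""
    return math.sin(x) / x**2 - math.cos(x) / x

def radial_residual(f, n, k, r, h=1e-4):
    """Left side of (5.148), d/dr(r^2 df1/dr) + [k^2 r^2 - n(n+1)] f1,
    for f1(r) = f(k r), using central finite differences of step h."""
    g = lambda s: f(k * s)
    d1 = (g(r + h) - g(r - h)) / (2 * h)            # first derivative
    d2 = (g(r + h) - 2 * g(r) + g(r - h)) / h**2    # second derivative
    # d/dr(r^2 g') expands to r^2 g'' + 2 r g'
    return r**2 * d2 + 2 * r * d1 + (k**2 * r**2 - n * (n + 1)) * g(r)

if __name__ == "__main__":
    for f, n in ((j0, 0), (j1, 1)):
        print(n, radial_residual(f, n, k=2.0, r=1.3))
```

The printed residuals should be small compared with the individual terms, limited only by the finite-difference error.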

Next consider the vector function
$$\mathbf{G} = \mathbf{1}_r r \times \nabla\Psi = -\nabla\times(\mathbf{r}\Psi) \tag{5.155}$$
in which $\mathbf{r} = \mathbf{1}_r r$ and $\Psi$ is a solution to the scalar wave equation (5.147). It is shown in Appendix I
that $\mathbf{G}$ satisfies (5.102) and is therefore an appropriate solution for either $\mathbf{E}$ or $\mathbf{B}$.
However, $\mathbf{G}$ cannot represent a general field, since it has no component in the $r$ direction. Nevertheless, if $\Psi_1$ and $\Psi_2$ are two independent solutions to the scalar wave equation,
then a general solution may be constructed by choosing

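$$\mathbf{E}_1 = \mathbf{1}_r r\times\nabla\Psi_1 \qquad \mathbf{B}_1 = -\frac{1}{j\omega}\nabla\times\mathbf{E}_1$$
$$\mathbf{B}_2 = \mathbf{1}_r r\times\nabla\Psi_2 \qquad \mathbf{E}_2 = \frac{c^2}{j\omega}\nabla\times\mathbf{B}_2$$
with the total field given by $\mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2$, $\mathbf{B} = \mathbf{B}_1 + \mathbf{B}_2$. In this manner the total field
is expressed as the sum of two partial fields, one of which is TE with respect to the
radial direction, the other being TM. Since $\Psi_1$ and $\Psi_2$ are expressible in terms of complete sets of orthogonal functions, this is a broadly useful representation.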
EXAMPLE

5.12

A spherical cavity consists of a conducting shell of inner radius $a$. Find the expressions for
those resonant fields in this cavity which are without an $E_r$ component.
For such fields, $\Psi_1$ must be of the form
$$\Psi_1 = j_n(kr)\,P_n^m(\cos\theta)\begin{Bmatrix}\cos m\phi\\ \sin m\phi\end{Bmatrix}e^{j\omega t}$$
in which $j_n$ has been chosen for the Bessel function to ensure regularity at $r = 0$. Then
$$E_\theta = -\frac{1}{\sin\theta}\frac{\partial\Psi_1}{\partial\phi} \qquad E_\phi = \frac{\partial\Psi_1}{\partial\theta}$$
$$B_r = \frac{n(n+1)}{j\omega r}\Psi_1 \qquad B_\theta = \frac{1}{j\omega r}\frac{\partial}{\partial r}\left(r\frac{\partial\Psi_1}{\partial\theta}\right) \qquad B_\phi = \frac{1}{j\omega r\sin\theta}\frac{\partial}{\partial r}\left(r\frac{\partial\Psi_1}{\partial\phi}\right)$$
and, if a perfect conductor is assumed, $E_\theta$ and $E_\phi$ must vanish identically at $r = a$, which
requires that
$$j_n(ka) = 0$$
If $\gamma_{nm}$ is the $m$th root of $j_n$, such that $j_n(\gamma_{nm}) = 0$, then $k$ has the allowed values
$$k = \frac{\gamma_{nm}}{a}$$
which determine the resonant frequencies for the TE modes in the cavity.
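As a numerical illustration of the condition $j_n(ka) = 0$, the following sketch locates the first root of $j_1$ by bisection and converts it to a resonant frequency through $\omega = kc$. It is a minimal example under stated assumptions: the closed form used for $j_1$, the bracketing interval, the cavity radius of 0.1 m, and all function names are choices made here for illustration and do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def j1(x):
    """Spherical Bessel function of the first kind, order 1."""
    return math.sin(x) / x**2 - math.cos(x) / x

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f must change sign on the interval."""
    flo = f(lo)
    if flo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

def te_resonant_frequency(gamma, a):
    """Frequency in hertz for k = gamma / a, using omega = k c."""
    return C * gamma / (2 * math.pi * a)

gamma_11 = bisect(j1, 4.0, 5.0)   # first nonzero root of j_1, where tan x = x

if __name__ == "__main__":
    print(gamma_11, te_resonant_frequency(gamma_11, a=0.1))
```

Higher roots, or other orders $n$, follow by changing the bracketing interval or the function passed to `bisect`.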

5.14

INDUCTANCE

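The results of Section 5.8, concerned with magnetic stored energy, may be expressed
in an alternative manner. When the vector identity (V. 108) is employed, if $\mathbf{A}$ is the
magnetic vector potential function, such that $\mathbf{B} = \nabla\times\mathbf{A}$, then
$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B} = \mathbf{B}\cdot\mathbf{B} - \mathbf{A}\cdot\nabla\times\mathbf{B}$$
and (5.80) may be written
$$W_m = \tfrac12\mu_0^{-1}\int_V\nabla\cdot(\mathbf{A}\times\mathbf{B})\,dV + \tfrac12\mu_0^{-1}\int_V\mathbf{A}\cdot\nabla\times\mathbf{B}\,dV = \tfrac12\mu_0^{-1}\oint_S(\mathbf{A}\times\mathbf{B})\cdot d\mathbf{S} + \tfrac12\mu_0^{-1}\int_V\mathbf{A}\cdot\nabla\times\mathbf{B}\,dV$$
But $S$ may be taken as a sphere at infinity, and since $\mathbf{A}$ decreases as $r^{-1}$ and $\mathbf{B}$ decreases
as $r^{-2}$, the surface integral is seen to vanish. Therefore
$$W_m = \tfrac12\mu_0^{-1}\int_V\mathbf{A}\cdot\nabla\times\mathbf{B}\,dV \tag{5.156}$$
For static magnetic fields, $\nabla\times\mathbf{B} = \boldsymbol{\iota}/\mu_0^{-1}$ and
$$\mathbf{A} = \int_{V'}\frac{\boldsymbol{\iota}'\,dV'}{4\pi\mu_0^{-1}r} \tag{5.157}$$
wherein primes are used so as to be able to distinguish between contributions to the
integrals in (5.156) and (5.157). Thus the stored magnetic energy may also be expressed
by
$$W_m = \frac12\int_V\int_{V'}\frac{\boldsymbol{\iota}\cdot\boldsymbol{\iota}'}{4\pi\mu_0^{-1}r}\,dV\,dV' \tag{5.158}$$
in which $r$ is the distance between the volume elements $dV$ and $dV'$, and the integration
is to be performed twice throughout all of space containing current elements. This
development should be compared to the similar analysis presented for electrostatic
energy in Section 3.19; expressions (3.159) and (5.158) are seen to be completely
analogous.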
Equation (5.158) may be applied to current systems of which the prototype is
indicated by Figure 5.4. Shown are two distinct circuital volumes $V_1$ and $V_2$ which
contain steady current distributions, the rest of space being source-free. If $I_1$ is the
total current flow at some reference cross section in $V_1$, and if similarly $I_2$ is the total
current flow at some reference cross section in $V_2$, it will often occur that the current
density at any point in $V_1$ is linearly proportional to $I_1$, whereas the current density
at any point in $V_2$ is linearly proportional to $I_2$. In such cases,
$$\boldsymbol{\iota}(\xi,\eta,\zeta) = I_1\mathbf{f}_1(\xi,\eta,\zeta)$$
for any point $(\xi,\eta,\zeta)$ in $V_1$, and
$$\boldsymbol{\iota}(\xi,\eta,\zeta) = I_2\mathbf{f}_2(\xi,\eta,\zeta)$$
for any point $(\xi,\eta,\zeta)$ in $V_2$, with $\mathbf{f}_1$ and $\mathbf{f}_2$ functions which give the normalized current
intensities.

FIGURE 5.4 Self- and mutual inductance.

Under these conditions, (5.158) may be written
$$W_m = \frac12I_1^2\int_{V_1}\int_{V_1'}\frac{\mathbf{f}_1\cdot\mathbf{f}_1'}{4\pi\mu_0^{-1}r}\,dV\,dV' + I_1I_2\int_{V_1}\int_{V_2'}\frac{\mathbf{f}_1\cdot\mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV' + \frac12I_2^2\int_{V_2}\int_{V_2'}\frac{\mathbf{f}_2\cdot\mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV' \tag{5.159}$$
wherein $\mathbf{f}_1'$ implies $\mathbf{f}_1(\xi',\eta',\zeta')$, etc. The integrals appearing in (5.159) depend only on
the normalized distributions of the two systems of currents, and for a given conductor
configuration are constants, so that one may write
$$W_m = \tfrac12L_{11}I_1^2 + M_{12}I_1I_2 + \tfrac12L_{22}I_2^2 \tag{5.160}$$
in which $L_{11}$ and $L_{22}$ are called self-inductances and $M_{12}$ is called a mutual inductance,
their units being given the name henries. This development can be extended readily to
situations in which the volumes $V_1$ and $V_2$ overlap and/or in which there is any number
of separately identifiable volumes containing current systems.
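The quadratic form relating stored energy to the inductances is easy to evaluate directly. The short sketch below is illustrative only; the function name and the numerical values are hypothetical and are chosen merely to show how the sign of the product $I_1I_2$ raises or lowers the energy through the mutual term.

```python
def stored_magnetic_energy(L11, L22, M12, I1, I2):
    """Stored energy in joules for two current systems:
    W = (1/2) L11 I1^2 + M12 I1 I2 + (1/2) L22 I2^2."""
    return 0.5 * L11 * I1**2 + M12 * I1 * I2 + 0.5 * L22 * I2**2

if __name__ == "__main__":
    # Hypothetical inductances in henries and currents in amperes.
    aiding = stored_magnetic_energy(2e-3, 3e-3, 1e-3, 1.0, 2.0)
    opposing = stored_magnetic_energy(2e-3, 3e-3, 1e-3, 1.0, -2.0)
    print(aiding, opposing)
```

Reversing one current changes only the mutual term, so the two results differ by $4M_{12}|I_1I_2|$ for these values.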

EXAMPLE

5.13

Find the mutual inductance between two coaxial, coplanar filamentary loops of radii $a$ and
$b$, as shown in the figure. Assume $b \gg a$.
From (5.159),
$$M = \int_{V_1}\int_{V_2'}\frac{\mathbf{f}_1\cdot\mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV'$$
In this case, the current densities may be taken as uniform over the cross-sectional areas
$S_1$ and $S_2$ of the inner and outer loops.

[Figure: coaxial, coplanar loops; axes X, Y; angle $\phi'$]

Then
$$\mathbf{f}_1\,dV = \mathbf{1}_\phi\,dl \qquad \mathbf{f}_2'\,dV' = \mathbf{1}_{\phi'}\,dl'$$
and
$$M = \oint_{C_1}\oint_{C_2}\frac{\mathbf{1}_\phi\,dl\cdot\mathbf{1}_{\phi'}\,dl'}{4\pi\mu_0^{-1}r}$$
in which $C_1$ and $C_2$ are the median contours of the two loops.
Upon first considering that element $dl$ which is on the X axis, one sees that by symmetry, the elements $dl'$ may be taken in pairs symmetrically disposed with respect to the
X axis such that if
$$\mathbf{1}_{\phi'}\,dl' = -\mathbf{1}_x b\sin\phi'\,d\phi' + \mathbf{1}_y b\cos\phi'\,d\phi'$$
then the X components may be discarded. This gives
$$M = \oint_{C_1}dl\int_0^\pi\frac{2b\cos\phi'\,d\phi'}{4\pi\mu_0^{-1}r}$$
With $b \gg a$, $r^{-1} \approx b^{-1} + a\cos\phi'/b^2$ and thus
$$M \approx \oint_{C_1}dl\int_0^\pi\frac{2ab\cos^2\phi'\,d\phi'}{4\pi\mu_0^{-1}b^2} = \frac{\pi a^2}{2\mu_0^{-1}b}$$

EXAMPLE

5.14

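A wire of circular cross section whose radius is $b$ is bent into the form of a closed circular
ring of mean radius $a$. Find its self-inductance if $a \gg b$.
If one again makes use of (5.159) and (5.160), then
$$L = \int_V\int_{V'}\frac{\mathbf{f}_1\cdot\mathbf{f}_1'}{4\pi\mu_0^{-1}r}\,dV\,dV'$$
The geometry identifying the two volume elements $dV$ and $dV'$ is shown in the figures,

[Figures: cross section at $\theta'$; cross section at $\theta' + \theta$]

from which it follows that
$$\mathbf{f}_1\cdot\mathbf{f}_1' = f_1f_1'\cos\theta \qquad dV = (a\,d\theta)(\rho\,d\phi\,d\rho) \qquad dV' = (a\,d\theta')(\rho'\,d\phi'\,d\rho')$$
Upon letting $\theta = 2\eta + \pi$ one finds that
$$r^2 = \left\{[2a + \rho\cos(\phi+\phi') + \rho'\cos\phi']^2 + [\rho\sin(\phi+\phi') - \rho'\sin\phi']^2\right\}(1 - k^2\sin^2\eta) \approx 4a^2(1 - k^2\sin^2\eta)$$
in which
$$k^2 = \frac{4(a + \rho'\cos\phi')[a + \rho\cos(\phi+\phi')]}{[\rho\sin(\phi+\phi') - \rho'\sin\phi']^2 + [2a + \rho\cos(\phi+\phi') + \rho'\cos\phi']^2}$$
so that
$$1 - k^2 \approx \frac{\rho^2 + (\rho')^2 - 2\rho\rho'\cos\phi}{4a^2}$$
Let it be assumed that $f_1$ is a function of $\rho$ but not of $\theta$ nor $\phi$. Then, since $\cos\theta = 2\sin^2\eta - 1$ and $d\theta = 2\,d\eta$,
$$d^2L = \frac{a}{4\pi\mu_0^{-1}}\,\rho f_1(\rho)\,d\rho\;\rho'f_1(\rho')\,d\rho'\int_0^{2\pi}d\theta'\int_0^{2\pi}d\phi'\int_0^{2\pi}d\phi\int_{-\pi/2}^{\pi/2}\frac{2\sin^2\eta - 1}{(1 - k^2\sin^2\eta)^{1/2}}\,d\eta$$
$$= \frac{2\pi a}{\mu_0^{-1}}\,\rho f_1(\rho)\,d\rho\;\rho'f_1(\rho')\,d\rho'\int_0^{2\pi}\left[-\frac{2}{k^2}E(k) + \left(\frac{2}{k^2} - 1\right)K(k)\right]d\phi$$
wherein $E(k)$ and $K(k)$ are elliptic integrals. Since $a \gg b$, it follows that $k^2 \approx 1$ and
$$E(k) = \int_0^{\pi/2}\sqrt{1 - k^2\sin^2\eta}\;d\eta \approx \int_0^{\pi/2}\cos\eta\,d\eta = 1$$
To find $K(k)$, let $\beta = \pi/2 - \eta$ and then
$$K(k) = \int_0^{\beta_0}\frac{d\beta}{\sqrt{\sin^2\beta + \cos^2\beta - k^2\cos^2\beta}} + \int_{\beta_0}^{\pi/2}\frac{d\beta}{\sqrt{1 - k^2\cos^2\beta}}$$
with $\sqrt{1 - k^2} \ll \beta_0 \ll 1$. This division of the integral into two parts is done so that $\sin\beta$
may be approximated by $\beta$ when $\beta \le \beta_0$. If one places $\sin\beta = \beta$, $\cos\beta = 1$ in the first
integral, and $k = 1$ in the second, the result is that
$$K(k) \approx \ln\frac{4}{\sqrt{1 - k^2}}$$
and therefore
$$\int_0^{2\pi}\left[-\frac{2}{k^2}E(k) + \left(\frac{2}{k^2} - 1\right)K(k)\right]d\phi \approx \int_0^{2\pi}\left(\ln 4 - 2 - \ln\sqrt{1 - k^2}\right)d\phi = 2\pi(\ln 4 - 2) - \int_0^{2\pi}\ln\sqrt{\frac{\rho^2 + (\rho')^2 - 2\rho\rho'\cos\phi}{4a^2}}\;d\phi$$
which means
$$d^2L = \frac{4\pi^2a}{\mu_0^{-1}}\left(\ln\frac{8a}{\rho_>} - 2\right)\rho f_1(\rho)\,d\rho\;\rho'f_1(\rho')\,d\rho'$$
in which $\rho_>$ denotes the larger of $\rho$ and $\rho'$.
Two cases of interest may now be distinguished. First, if the wire is assumed to have infinite conductivity, so that all the current flows on its surface, then $f_1(\rho) = (2\pi b)^{-1}\delta(\rho - b)$,
and
$$L = \frac{a}{\mu_0^{-1}}\left(\ln\frac{8a}{b} - 2\right)$$
Second, if the current is assumed to be uniformly distributed over any cross section of the
wire, then $f_1 = (\pi b^2)^{-1}$ and
$$L = \frac{a}{\mu_0^{-1}}\left(\ln\frac{8a}{b} - \frac74\right)$$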
The difference in these results is slight under normal circumstances.
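The closed forms of Examples 5.13 and 5.14 lend themselves to a quick numerical check. The sketch below is an illustration under stated assumptions, not part of the original text: it uses the classical value $\mu_0 = 4\pi\times10^{-7}$ H/m (the text works with $\mu_0^{-1}$), evaluates the Neumann integral of Example 5.13 with a midpoint rule, obtains $K(k)$ from the arithmetic-geometric mean (a technique not used in the text), and compares the two ring-inductance formulas. All function names and sample dimensions are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # classical permeability of free space, H/m (assumed value)

def mutual_coplanar_loops(a, b, n=4000):
    """Neumann integral for coaxial, coplanar filamentary loops, reduced by
    symmetry to 2*pi*a times one integral over the outer loop (midpoint rule)."""
    dphi = 2.0 * math.pi / n
    inner = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        r = math.sqrt(a * a + b * b - 2.0 * a * b * math.cos(phi))
        inner += b * math.cos(phi) / r * dphi
    return MU0 / (4.0 * math.pi) * (2.0 * math.pi * a) * inner

def mutual_far_approx(a, b):
    """Closed form of Example 5.13, intended for b much larger than a."""
    return MU0 * math.pi * a * a / (2.0 * b)

def ellipk_from_complement(kp, iterations=40):
    """Complete elliptic integral K(k), with kp = sqrt(1 - k^2),
    computed as pi / (2 * AGM(1, kp))."""
    x, y = 1.0, kp
    for _ in range(iterations):
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return math.pi / (2.0 * x)

def ring_inductance(a, b, uniform=False):
    """Example 5.14: surface current (default) or uniformly distributed current."""
    return MU0 * a * (math.log(8.0 * a / b) - (1.75 if uniform else 2.0))

if __name__ == "__main__":
    print(mutual_coplanar_loops(0.01, 1.0), mutual_far_approx(0.01, 1.0))
    print(ellipk_from_complement(1e-3), math.log(4.0 / 1e-3))
    print(ring_inductance(0.1, 1e-3), ring_inductance(0.1, 1e-3, uniform=True))
```

The two ring inductances differ by $\mu_0a/4$ regardless of $b$, which is small compared with either value whenever $\ln(8a/b)$ is large.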

The concepts of capacitance and inductance may be extended to situations involving
time-varying currents and charges by the following argument: Let time variations
of the fields and sources be of the form $e^{j\omega t}$ ($\omega$ may be only one Fourier component of the
total spectrum). Then the potential functions are
$$\mathbf{A}(x,y,z,t) = \int_V\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}r}\,dV$$
$$\Phi(x,y,z,t) = \int_V\frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - kr)}}{4\pi\epsilon_0r}\,dV$$
If the currents and charges are disposed throughout a system of conductors, and if no
point $(x,y,z)$ in the system is an appreciable part of a wavelength removed from any
other point $(\xi,\eta,\zeta)$, then retarded time may be ignored between two such points, and
$$\mathbf{A}(x,y,z)\,e^{j\omega t} = e^{j\omega t}\int_V\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,dV}{4\pi\mu_0^{-1}r} \tag{5.161}$$
$$\Phi(x,y,z)\,e^{j\omega t} = e^{j\omega t}\int_V\frac{\rho(\xi,\eta,\zeta)\,dV}{4\pi\epsilon_0r} \tag{5.162}$$
Upon deleting the factors $e^{j\omega t}$, one sees that $\mathbf{A}$ and $\Phi$ have the same forms as were
encountered in magnetostatics and electrostatics, except that now $\boldsymbol{\iota}$ and $\rho$ are, in general, complex quantities.
The condition that the extent of the conductor system be small compared to a
wavelength characterizes a lumped circuit. If within the circuit the regions where
conduction current and displacement current dominate are distinct and separate, the
analyses leading to Equations (3.159) and (5.158) may be repeated, with the potential
functions (5.161) and (5.162) replacing their static counterparts. The net result is
that the same formulas for capacitance and inductance are obtained. One concludes
that the static formulas for capacitance and inductance are valid so long as the frequency is low enough to ensure that the circuit dimensions are small compared to
the wavelength.
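The lumped-circuit condition can be made concrete by comparing the circuit extent with the free-space wavelength $\lambda = c/f$. The following is a rough sketch only: the text says merely that the dimensions must be "small compared to the wavelength", so the threshold of one tenth of a wavelength used below is an assumption made here, as are the function names.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def electrical_size(extent_m, frequency_hz):
    """Circuit extent expressed as a fraction of the free-space wavelength."""
    wavelength = C / frequency_hz
    return extent_m / wavelength

def is_lumped(extent_m, frequency_hz, max_fraction=0.1):
    """True when the extent is below max_fraction of a wavelength.
    The default of 0.1 is an illustrative assumption, not a rule from the text."""
    return electrical_size(extent_m, frequency_hz) < max_fraction

if __name__ == "__main__":
    for f in (1e6, 1e8, 1e10):
        print(f, electrical_size(0.05, f), is_lumped(0.05, f))
```

For a 5 cm circuit the retardation phase $kr$ is negligible at 1 MHz but is several radians at 10 GHz, where the static formulas can no longer be expected to hold.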

5.15

TRANSFORMATION OF THE INTEGRAL SOLUTIONS TO FORMS SUITABLE FOR WAVEGUIDE PROBLEMS

In the development leading to the integral solutions for $\mathbf{E}$ and $\mathbf{B}$, given by Equations
(5.42) and (5.43), the Green's function $\psi = e^{-jkr}/r$ was employed. By retracing the steps
leading to (5.42) and (5.43), the reader will have no difficulty in convincing himself
that, if the more general Green's function
$$G(x,y,z,\xi,\eta,\zeta) = \frac{e^{-jkr}}{r} + g(x,y,z,\xi,\eta,\zeta) \tag{5.163}$$
be used in place of $\psi$, with $g$ any function which satisfies
$$(\nabla_S^2 + k^2)g = 0 \tag{5.164}$$
everywhere in the volume $V$ and over the bounding surfaces $S_1 \ldots S_N$, then (5.42)
and (5.43) are once again obtained, with $G$ replacing $\psi$ everywhere. In particular, if $V$
is a source-free region bounded by the single closed surface $S$, then at any point $(x,y,z)$
within $V$,
$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_S\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_SG + (\mathbf{1}_n\times\mathbf{E})\times\nabla_SG - j\omega G(\mathbf{1}_n\times\mathbf{B})\right]dS \tag{5.165}$$
$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_S\left[\frac{j\omega G}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_SG + (\mathbf{1}_n\cdot\mathbf{B})\nabla_SG\right]dS \tag{5.166}$$
in which $\mathbf{1}_n$ is the inward-drawn normal.


These equations may be applied to waveguide problems in the following manner:
let any cross section of a cylindrical waveguide be represented as in Figure 5.5, with $C$
the contour of the cross section, $A$ the cross-sectional surface, $u$ the outward-drawn
normal direction, and $v$ the peripheral direction.

FIGURE 5.5 Cross-sectional geometry of cylindrical waveguide.
Let the closed surface $S$ alluded to in (5.165) and (5.166) be taken to consist of the
interior surface of a portion of the waveguide, extending from $\zeta = \zeta_1$ to $\zeta = \zeta_2$, plus
end caps of area $A$ at $\zeta = \zeta_1$ and at $\zeta = \zeta_2$. Let the sources of the electromagnetic
waves within the waveguide be fields externally impressed over holes in the waveguide
walls. Further, let these holes be confined to a finite axial extent of the guide. Then, if a
small loss is assumed in the medium (an air-filled guide, instead of an evacuated guide,
for example), and if $\zeta_1 \to -\infty$, $\zeta_2 \to +\infty$, it follows that the contributions to $\mathbf{E}(x,y,z)$
and $\mathbf{B}(x,y,z)$ in (5.165) and (5.166) due to the integrals over the two end caps are negligible. This condition will be assumed, and the surface $S$ in (5.165) and (5.166) will
be taken to consist of the interior waveguide surface of infinite axial extent.
For convenience, different Green's functions will be used in (5.165) and (5.166). If one
selects
$$G_1(x,y,z,u,v,\zeta) = \frac{e^{-jkr}}{r} + g_1(x,y,z,u,v,\zeta) \tag{5.167}$$
for use in (5.165), and further stipulates that $G_1 = 0$ on $S$ (the Dirichlet condition),
then the Z component of (5.165) becomes
$$E_z(x,y,z) = -\frac{1}{4\pi}\int_SE_\zeta(u,v,\zeta)\frac{\partial G_1}{\partial u}\,dv\,d\zeta \tag{5.168}$$
in which $G_1$ satisfies
$$(\nabla_S^2 + k^2)G_1 = -4\pi\delta(P_F - P_S) \tag{5.169}$$
with $P_F$ the field point $(x,y,z)$ and $P_S$ the source point $(u,v,\zeta)$. Only $E_z$ need be found
from (5.165), since all other field components of TM waveguide modes may be expressed
in terms of $E_z$ (cf. Section 5.11).
In like manner, if one selects
$$G_2(x,y,z,u,v,\zeta) = \frac{e^{-jkr}}{r} + g_2(x,y,z,u,v,\zeta) \tag{5.170}$$
for use in (5.166) and further stipulates that $\partial G_2/\partial u = 0$ on $S$ (the Neumann condition),
then (5.166) yields the z component
$$B_z(x,y,z) = \frac{1}{4\pi}\int_S\frac{j}{\omega}\left\{E_v(u,v,\zeta)\left[\frac{\partial^2}{\partial\zeta^2} + k^2\right]G_2 - E_\zeta(u,v,\zeta)\frac{\partial^2G_2}{\partial v\,\partial\zeta}\right\}dv\,d\zeta \tag{5.171}$$
in which $G_2$ satisfies
$$(\nabla_S^2 + k^2)G_2 = -4\pi\delta(P_F - P_S) \tag{5.172}$$
In reaching the result (5.171), Maxwell's equations have been used to replace $B_u$ by
terms involving $E_v$ and $E_\zeta$, and several integrations by parts have been employed to
transfer the differentiations in the kernel to the Green's function.
Knowledge of tangential $\mathbf{E}$ everywhere on the waveguide walls permits determination
of all field components at all points, through use of (5.168) and (5.171). When the walls
are made of good conductor, $\mathbf{E}_{\tan} \approx 0$ except over the holes in the walls, and the extent
of the integration is thereby reduced.
If the waveguide has a simple cross section, such as rectangular or circular, the
Green's functions $G_1$ and $G_2$ can be expressed as complete series of orthonormal functions, thereby greatly simplifying the analysis. As an illustration, the Green's functions
for a rectangular waveguide are derived in Appendix J. When the results are substituted in (5.168) and (5.171), one obtains
E,(x,y,z)

i f .pm.,.~.c,y) J

= -

m=1 n=1

~ L~
- ~

(mir / a)

m =0 n =0

2) {3mn

+ (n

7r /

2w/3mn

II

m=O n=O

a.pm.,(1;,7]) E,(1;,7],t)e-

au

b) 2 '1t (
)
mn x,y

Wmn(X,y)
2w

ifJ m"I,-

J 'ltmn() 1-

.s

~,'YJ

av

~v ~,'YJ,s

J aw mn(I;,7]) E,(1;,7],t)e-

t l dv dt

ifJ m"I,-

(;3.173)

1
e-'~
] Iz- tid
, v Gs
mn

t l dv dt

(5.174)

in which the upper sign is used in the second sum of (5.174) if $z > \zeta$, the lower sign
being used if $z < \zeta$. $\psi_{mn}$ and $\Psi_{mn}$ have been defined previously by Equations (5.131)
and (5.135).
Equations (5.173) and (5.174) are known as the generating functions for rectangular
waveguide and permit the determination of the fields anywhere within the guide if the
fields are completely known on the walls. These generating functions were first obtained
by Stevenson.13

13 A. F. Stevenson, "Theory of Slots in Rectangular Waveguides," J App Phys, 19, 24-38; January 1948.

EXAMPLE

5.15

Two identical rectangular waveguides are joined so as to have a common broad wall, as
suggested by the figure. It is assumed that the dimensions $a$ and $b$ are so chosen that only
the TE10 mode will propagate at the frequency being considered. (Cf. Example 5.9.) A
pair of small crossed slots is cut in the common broad wall at a distance $x_1$ from the nearest
side wall. In what is to follow, it will be shown that if $x_1$ is properly chosen, this assembly
is a matched directional coupler. By this one means that if a TE10 mode is fed into port 1,
no reflected waves are detected at ports 1 and 2, but waves are detected at ports 3 and 4,
their relative amplitudes being controlled by the size of the crossed slots.

[Figure: directional coupler; slot dimensions]

If the results of Example 5.9 are utilized, a TE10 mode incident on the crossed slots from
port 1 may be represented by

$$B_z = K_0\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$
in which $K_0$ is the complex amplitude of the incident wave at $z = 0$. The other magnetic
field component of this incident wave is
$$B_x = \frac{j\beta_{10}}{\pi/a}K_0\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$
At the center of the slots $(x_1,0,0)$, the magnetic field components are
$$B_x = \frac{j\beta_{10}}{\pi/a}K_0\sin\frac{\pi x_1}{a}\,e^{j\omega t} \qquad B_z = K_0\cos\frac{\pi x_1}{a}\,e^{j\omega t}$$
If $x_1$ is chosen so that
$$\tan\frac{\pi x_1}{a} = \frac{\pi}{\beta_{10}a} = \left[\left(\frac{2a}{\lambda}\right)^2 - 1\right]^{-1/2}$$

then at the center of the slots, $B_x$ and $B_z$ will have equal amplitudes and will be in time quadrature. With this condition assumed, if the slots were not present, the X and Z components
of current density in the broad wall at $(x_1,0,0)$ also would have equal amplitudes and be in
time quadrature. If the slots are narrow and small ($l \gg d$ but $l \ll \lambda$), the conduction current which each slot interrupts is replaced mainly by a displacement current across each
slot. This means that electric fields which are in time quadrature are induced across the
two slots by the incident wave. These electric fields may be approximated by the expressions

in which $E$ is the complex induced amplitude at the center of each slot and it is assumed
that the electric field goes to zero at both ends of each slot. The first electric field expression
applies for the Z-directed slot, the second for the X-directed slot.
By virtue of (5.174), the back-scattered TE10 mode which appears as a reflected wave at
port 1 is given by
B(l)

1r

iwt

Ee

}wf3loa 2b

cos 1rX
a

{~

7r

7rr

cos ~ cos
e-itllolz-rl
a sal
-

(.J

fJIO

d~ dr

f.

1r ~
SIn cos 7T"(~ Sal

Xl)

e-j{jlolz-t\ d5t d~r }

If the substitution $\xi' = \xi - x_1$ is made, this becomes

l/2

-7f cos 7fXI


a

cos -7fS cos


l

{310S

df

X
. 7f
{310 SIn
- l

l/2

a ()

t'}

COS -7f ~' cos 7f


- ~' d t;;
l
a

Since $\sin(\pi x_1/a) = (\pi/\beta_{10}a)\cos(\pi x_1/a)$, and since $l \ll \lambda$, so that $\cos\beta_{10}\zeta \approx 1$ and $\cos(\pi\xi'/a) \approx 1$
throughout the integration interval, it follows that
$$B_z^{(1)} = 0$$
and there is no back-scattering to port 1. Similarly, $B_z^{(2)} = 0$. However, for ports 3 and 4,
the upper sign must be used in the second integral of (5.174) and one obtains
$$B_z^{(3)} = B_z^{(4)} = K_1\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$

u E
Kl=--

in which

a 3b c

The total field emerging from port 3 is given by $K_0 + K_1$; at port 4 it is given by $K_1$.
The mode amplitude $K_1$ is seen to increase with $l$ so that the fraction of power diverted to
port 4 may be controlled by slot size. Experimentation has revealed that an extensive
dynamic range of power diversion is feasible with crossed slots of this type, making them
suitable for use not only as a waveguide coupler, but also as a circularly polarized radiator
(with the upper guide removed).14
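The slot position of Example 5.15 follows from the condition $\tan(\pi x_1/a) = [(2a/\lambda)^2 - 1]^{-1/2}$. The sketch below evaluates it and confirms that the normalized amplitudes of $B_x$ and $B_z$ agree at that position. It is illustrative only: the guide width of 22.86 mm and the frequency of 10 GHz are hypothetical sample values, the free-space relation $\lambda = c/f$ is assumed, and the function names are not from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def slot_offset(a, frequency_hz):
    """Distance x1 from the side wall at which |B_x| = |B_z| for the TE10 mode."""
    ratio = 2.0 * a * frequency_hz / C          # 2a / lambda
    if ratio <= 1.0:
        raise ValueError("TE10 mode is below cutoff at this frequency")
    return (a / math.pi) * math.atan(1.0 / math.sqrt(ratio**2 - 1.0))

def te10_amplitudes(a, frequency_hz, x):
    """Magnitudes of B_x and B_z at position x, normalized to |K_0|."""
    k = 2.0 * math.pi * frequency_hz / C
    beta10 = math.sqrt(k**2 - (math.pi / a) ** 2)
    bx = (beta10 * a / math.pi) * math.sin(math.pi * x / a)
    bz = math.cos(math.pi * x / a)
    return abs(bx), abs(bz)

if __name__ == "__main__":
    a, f = 0.02286, 10e9                        # hypothetical guide width and frequency
    x1 = slot_offset(a, f)
    print(x1, te10_amplitudes(a, f, x1))
```

Because $x_1$ depends on frequency through $\beta_{10}$, the equal-amplitude condition is met exactly at only one frequency for a given slot position.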

5.16*

A MINKOWSKIAN FORMULATION OF THE FIELD EQUATIONS

Shortly after the appearance of Einstein's first paper on relativity, Hermann Minkowski (1864-1909) recognized that a considerable clarification in notation was possible
if the variable $jct$ were treated as a fourth dimension and the equations of physics
restated accordingly.15 Thus, for example, the proper distance (2.47) could then be
written
$$ds^2 = dx_1^2 + dx_2^2 + dx_3^2 + dx_4^2 \tag{5.175}$$
in which, in place of the coordinates $x$, $y$, $z$, $t$, the new coordinates
$$x_1 = x \qquad x_2 = y \qquad x_3 = z \qquad x_4 = -jct \tag{5.176}$$
have been introduced, and the two events occasioning (5.175) have been assumed to be
an infinitesimal distance and time apart. Equation (5.175) is seen to be an extension to
four dimensions of the familiar expression for differential distance in three dimensions.
With this notation, functional derivatives with respect to $x_4$ assume the same form
as those with respect to the spatial variables. As an illustration, the scalar wave equation becomes a four-dimensional version of Laplace's equation. The laws of dynamics
may be recast in a static form and the governing equations of electrical phenomena also
assume a simple and elegant structure, as shall be seen by what follows.

* This section may be omitted without loss in continuity of the technical presentation.
14 A. J. Simmons, "Circularly Polarized Slot Radiators," IRE, Trans Antennas Propagation, AP-5 (1), 31-36; January, 1957.
15 H. Minkowski, Space and Time, an address delivered at the 80th Assembly of German Natural Scientists and Physicians, at Cologne, September 21, 1908. A translation may be found in The Principle of Relativity, Dover Publications, Inc., New York.
In the Preface to his Electrodynamics, Sommerfeld indicated his admiration of Minkowski's formulation by saying
. . . After I had heard Hermann Minkowski's lecture on "Space and Time" in 1909 in
Cologne, I carefully developed the four-dimensional form of electro-dynamics as an apotheosis of Maxwell's theory . . . in return, this has always met with an enthusiastic reception
on the part of my audience.

Sommerfeld's presentation of this material is especially appealing because of its
conciseness and clarity, and the ensuing development is patterned after his approach
wherever appropriate. Recently another excellent treatment of the subject has been
offered by L. J. Chu.16 For applications of this formulation to other branches of physics,
such as dynamics, the reader is referred to the literature of those fields.17
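Confining attention to a rephrasing of the governing equations of electromagnetics,
let a four-dimensional generalization of the Laplacian operator be defined by
$$\Box^2 = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2} + \frac{\partial^2}{\partial x_4^2} \tag{5.177}$$
Additionally, let the potentials $\mathbf{A}$ and $\Phi$, defined by (5.65) and (5.66), be combined to
form the four-potential $\mathsf{A}$ whose components are
$$A_1 = A_x \qquad A_2 = A_y \qquad A_3 = A_z \qquad A_4 = \frac{\Phi}{jc} \tag{5.178}$$
With these definitions, the differential equations (5.69) and (5.70) become
$$\Box^2\mathsf{A} = -\frac{\mathsf{I}}{\mu_0^{-1}} \tag{5.179}$$
with $\mathsf{I}$ called the four-current density and possessing the components
$$I_1 = \iota_x \qquad I_2 = \iota_y \qquad I_3 = \iota_z \qquad I_4 = -jc\rho \tag{5.180}$$
Equation (5.179) is the four-dimensional wave equation relating the potential function
to its sources.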
Equation (H.5), which connects the potentials $\mathbf{A}$ and $\Phi$, may be written
$$\nabla\cdot\mathbf{A} = \frac{\partial A_1}{\partial x_1} + \frac{\partial A_2}{\partial x_2} + \frac{\partial A_3}{\partial x_3} = -\frac{1}{c^2}\dot\Phi = -\frac{\partial A_4}{\partial x_4}$$
which gives
$$\Box\cdot\mathsf{A} = 0 \tag{5.181}$$
wherein $\Box$ is the four-vector $\left(\dfrac{\partial}{\partial x_1}, \dfrac{\partial}{\partial x_2}, \dfrac{\partial}{\partial x_3}, \dfrac{\partial}{\partial x_4}\right)$. Thus the four-dimensional divergence
of the four-potential function is identically zero.

16 R. M. Fano, L. J. Chu, and R. B. Adler, Electromagnetic Fields, Energy and Forces, Appendix 1, John Wiley and Sons, Inc., New York, 1960.
17 See, e.g., H. Goldstein, Classical Mechanics, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1953.

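Turning now to Equations (5.67) and (5.68), which relate the field vectors to the
potentials, one may write
$$B_x = \frac{\partial A_3}{\partial x_2} - \frac{\partial A_2}{\partial x_3} \qquad B_y = \frac{\partial A_1}{\partial x_3} - \frac{\partial A_3}{\partial x_1} \qquad B_z = \frac{\partial A_2}{\partial x_1} - \frac{\partial A_1}{\partial x_2}$$
$$\frac{jE_x}{c} = \frac{\partial A_4}{\partial x_1} - \frac{\partial A_1}{\partial x_4} \qquad \frac{jE_y}{c} = \frac{\partial A_4}{\partial x_2} - \frac{\partial A_2}{\partial x_4} \qquad \frac{jE_z}{c} = \frac{\partial A_4}{\partial x_3} - \frac{\partial A_3}{\partial x_4} \tag{5.182}$$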

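These equations suggest the utility of introducing the four-dimensional curl
$$\operatorname{curl}_{mn}\mathsf{A} = \frac{\partial A_n}{\partial x_m} - \frac{\partial A_m}{\partial x_n} \tag{5.183}$$
Since
$$\operatorname{curl}_{mm} = 0 \qquad \operatorname{curl}_{mn} = -\operatorname{curl}_{nm}$$
it follows that $\operatorname{curl}_{mn}\mathsf{A}$ is an antisymmetric tensor with six distinct components
which differ from zero. Equations (5.182) may then be written in the tensor form
$$\operatorname{curl}\mathsf{A} = \begin{bmatrix}
0 & \left(\dfrac{\partial A_2}{\partial x_1} - \dfrac{\partial A_1}{\partial x_2}\right) & \left(\dfrac{\partial A_3}{\partial x_1} - \dfrac{\partial A_1}{\partial x_3}\right) & \left(\dfrac{\partial A_4}{\partial x_1} - \dfrac{\partial A_1}{\partial x_4}\right)\\
\left(\dfrac{\partial A_1}{\partial x_2} - \dfrac{\partial A_2}{\partial x_1}\right) & 0 & \left(\dfrac{\partial A_3}{\partial x_2} - \dfrac{\partial A_2}{\partial x_3}\right) & \left(\dfrac{\partial A_4}{\partial x_2} - \dfrac{\partial A_2}{\partial x_4}\right)\\
\left(\dfrac{\partial A_1}{\partial x_3} - \dfrac{\partial A_3}{\partial x_1}\right) & \left(\dfrac{\partial A_2}{\partial x_3} - \dfrac{\partial A_3}{\partial x_2}\right) & 0 & \left(\dfrac{\partial A_4}{\partial x_3} - \dfrac{\partial A_3}{\partial x_4}\right)\\
\left(\dfrac{\partial A_1}{\partial x_4} - \dfrac{\partial A_4}{\partial x_1}\right) & \left(\dfrac{\partial A_2}{\partial x_4} - \dfrac{\partial A_4}{\partial x_2}\right) & \left(\dfrac{\partial A_3}{\partial x_4} - \dfrac{\partial A_4}{\partial x_3}\right) & 0
\end{bmatrix} = \mathcal{F} \tag{5.184}$$
in which $\mathcal{F}$ is an antisymmetric tensor possessing six distinct components different
from zero, and properly may be called the field tensor. It is given by
$$\mathcal{F} = \begin{bmatrix}
0 & B_z & -B_y & \dfrac{jE_x}{c}\\
-B_z & 0 & B_x & \dfrac{jE_y}{c}\\
B_y & -B_x & 0 & \dfrac{jE_z}{c}\\
-\dfrac{jE_x}{c} & -\dfrac{jE_y}{c} & -\dfrac{jE_z}{c} & 0
\end{bmatrix} \tag{5.185}$$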

Similarly, the Maxwell equations can be cast in a four-dimensional form. If the
operation
$$\operatorname{div}_m\mathcal{F} = \sum_{n=1}^{4}\frac{\partial\mathcal{F}_{mn}}{\partial x_n} \tag{5.186}$$
applied to tensors is given the name reduction or divergence, it is seen to reduce a four-dimensional tensor to a four-vector, and may be contrasted to the operation $\Box\,\cdot$ introduced earlier, which reduced a four-vector to a scalar. Upon applying the operation
(5.186) to the tensor $\mathcal{F}$ given in (5.185), one obtains
$$\operatorname{div}\mathcal{F} = \frac{\mathsf{I}}{\mu_0^{-1}} \tag{5.187}$$
The first three components of this equation embody the Maxwell equation (5.24) and
the last component is a rephrasing of (5.23).
Since for any antisymmetric tensor $\mathcal{F}$ the components are related such that
$$\mathcal{F}_{mn} = -\mathcal{F}_{nm}$$
it follows that
$$\Box\cdot\operatorname{div}\mathcal{F} = 0$$
and thus from (5.187) one is able to deduce that
$$\Box\cdot\mathsf{I} = 0 \tag{5.188}$$
which is a restatement of the continuity equation (5.29).


Finally, let the dual of $\mathcal{F}$ be defined by the relation
$$\mathcal{F}^* = \begin{bmatrix}
0 & \dfrac{jE_z}{c} & -\dfrac{jE_y}{c} & B_x\\
-\dfrac{jE_z}{c} & 0 & \dfrac{jE_x}{c} & B_y\\
\dfrac{jE_y}{c} & -\dfrac{jE_x}{c} & 0 & B_z\\
-B_x & -B_y & -B_z & 0
\end{bmatrix} \tag{5.189}$$
which is formed by an interchange of the real and imaginary constituents of $\mathcal{F}$. The
reduction of this tensor is seen to give
$$\operatorname{div}\mathcal{F}^* = 0 \tag{5.190}$$
The first three components of (5.190) comprise the Maxwell equation (5.21) whereas
the fourth component is a restatement of (5.20).
To recapitulate, a complete representation of Maxwell's electromagnetic theory for
free space is therefore embodied in the equations
$$\operatorname{div}\mathcal{F} = \frac{\mathsf{I}}{\mu_0^{-1}} \qquad \operatorname{div}\mathcal{F}^* = 0 \tag{5.191}$$
The field tensor $\mathcal{F}$ may be found from the four-potential $\mathsf{A}$ through the relation
$$\mathcal{F} = \operatorname{curl}\mathsf{A} \tag{5.192}$$
whereas the potential $\mathsf{A}$ is deducible from the equations
$$\Box^2\mathsf{A} = -\frac{\mathsf{I}}{\mu_0^{-1}} \qquad \Box\cdot\mathsf{A} = 0 \tag{5.193}$$
For further development and use of this notation the interested reader is referred
to Sommerfeld's Electrodynamics.
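The structure of the field tensor can be explored with a few lines of code. The sketch below is a hedged illustration, not a transcription of the text: it assumes a layout with the components of $\mathbf{B}$ in the upper-left block and $jE/c$ in the fourth column, and it checks two properties that do not depend on sign conventions, namely antisymmetry and the value of the scalar $\tfrac12\sum_{m,n}\mathcal{F}_{mn}^2 = |\mathbf{B}|^2 - |\mathbf{E}|^2/c^2$. The function names and sample field values are hypothetical.

```python
def field_tensor(E, B, c):
    """4x4 antisymmetric field tensor built from 3-vectors E and B.
    Assumed layout: B components in the spatial block, jE/c in the fourth column."""
    Bx, By, Bz = B
    ex, ey, ez = (1j * comp / c for comp in E)
    return [
        [0,   Bz, -By, ex],
        [-Bz, 0,   Bx, ey],
        [By, -Bx,  0,  ez],
        [-ex, -ey, -ez, 0],
    ]

def is_antisymmetric(F):
    """True when F[m][n] == -F[n][m] for every pair of indices."""
    return all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))

def quadratic_invariant(F):
    """Half the sum of the squares of all components; equals |B|^2 - |E|^2 / c^2."""
    return 0.5 * sum(F[m][n] ** 2 for m in range(4) for n in range(4))

if __name__ == "__main__":
    F = field_tensor(E=(1.0, 2.0, 3.0), B=(4.0, 5.0, 6.0), c=2.0)
    print(is_antisymmetric(F), quadratic_invariant(F))
```

The imaginary unit in the fourth row and column is what turns the sum of squares into a difference, mirroring the way the imaginary fourth coordinate produces the proper distance.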
REFERENCES
1. Campbell, L., and W. Garnett, The Life of James Clerk Maxwell, Macmillan and Company, London, 1882.
2. Crowther, J. G., Men of Science, W. W. Norton and Company, New York, 1936.
3. Fano, R. M., L. J. Chu, and R. B. Adler, Electromagnetic Fields, Energy and Forces, John Wiley and Sons, Inc., New York, 1960.
4. Glazebrook, R. T., James Clerk Maxwell and Modern Physics, Cassell and Company, Ltd., London, 1896.
5. Harrington, R. F., Time-Harmonic Electromagnetic Fields, McGraw-Hill Book Company, New York, 1961.
6. Jackson, J. D., Classical Electrodynamics, John Wiley and Sons, Inc., New York, 1962.
7. Jordan, E. C., Electromagnetic Waves and Radiating Systems, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1950.
8. Jones, Bence, The Life and Letters of Faraday, Longmans, Green and Company, London, 1870.
9. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1955.
10. Ramo, S., and J. R. Whinnery, Fields and Waves in Modern Radio, 2nd ed., John Wiley and Sons, Inc., New York, 1953.
11. Shedd, P. C., Fundamentals of Electromagnetic Waves, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1954.
12. Sommerfeld, A., Electrodynamics, Academic Press, Inc., New York, 1952.
13. Stratton, J. A., Electromagnetic Theory, McGraw-Hill Book Company, New York, 1941.
14. Whittaker, E., A History of the Theories of Aether and Electricity, vol. 1, Thomas Nelson and Sons, Ltd., London, 1951.

PROBLEMS
5.1

Because of the result of Appendix E that the most general sources $\boldsymbol{\iota}(x,y,z,t)$ and $\rho(x,y,z,t)$
in XYZ may be built up from static charge distributions in all other Lorentzian frames, it
follows that one may derive Maxwell's equations by starting only with $\rho'(x',y',z')$ in
X'Y'Z', without including $\boldsymbol{\iota}'(x',y',z')$. To see this, assume only a static charge distribution
in X'Y'Z' and parallel the development of Sections 5.2-5.4 to obtain the $\rho'$-connected
portions of Maxwell's equations. Then invoke superposition to obtain equations (5.25).
This procedure has the advantage in rigor of basing the derivation of the general field
equations solely on electrostatics, and it then permits all the results of Chapter 4 to become special cases of the more general theory. In particular, Equation (4.29), which
expresses the Biot-Savart law, is seen to be a limiting form of Equation (5.60), with
$k = 0$.

5.2

Let $\mathbf{B} = \mathbf{F}$ in the vector Green's theorem (5.37) and show that $\mathbf{B}$ is expressible in the
form (5.43).


5.3

By taking the curl of (5.42), show that B may be written in the form (5.43).

5.4

Use the continuity equation to show that (5.42) may be converted to (5.44).

5.5

Using Fourier integral theory, show that the general form for the retarded potential $\Phi$, as
given by (H.7), is a natural extension of the harmonic form (H.2).

5.6

Find the field pattern for a dipole which is one wavelength long if the current distribution is

5.7

Use Poynting's theorem to determine the total radiated power for the full-wave dipole of
Problem 5.6.

5.8

Find the stored magnetostatic energy per unit length for the system consisting of two thin
parallel conducting tubes of radius a and center-to-center spacing D, if they carry equal
and opposite steady currents I.

5.9

Static charges and steady currents can set up time-independent electric and magnetic
fields in a common region such that $\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}$ is not identically zero, but still no net
power flow exists. Show that under these conditions $\oint_S\boldsymbol{\mathcal{P}}\cdot d\mathbf{S} = 0$ for any closed surface $S$
in the region.

5.10

If two uniform plane waves of common polarization but different angular frequencies $\omega_1$
and $\omega_2$ propagate simultaneously in the same direction, show that the net time-average
power flow is equal to the sum of the individual time-average power flows.

5.11

Find the radiation pressure if a plane wave is normally incident on a perfectly absorbing
plane screen.

5.12

Show that an elliptically polarized plane wave may be decomposed into appropriate
amounts of right-handed and left-handed circularly polarized waves.

5.13

Determine expressions for the instantaneous stored energy density and Poynting's vector
for an elliptically polarized plane wave. Check that your answers have the proper limits
for linearly and circularly polarized waves.

5.14

A circularly polarized plane wave is normally incident on a perfectly conducting plane
screen. What can be said about the polarization of the reflected wave?

5.15

Establish the law of reflection for a linearly polarized wave of arbitrary polarization
incident on a perfectly conducting screen at an angle a with respect to the normal.

5.16

Repeat the analysis of Example 5.8 for the case of a coaxial transmission line consisting of
two concentric circular cylindrical shells of radii $a$ and $b$, with $b > a$.

5.17

Repeat the analysis of Example 5.8 for the case of two semi-infinite planar conducting
sheets which lie in the same plane but are separated by a constant gap width $a$.

5.18

A rectangular cavity of dimensions $a$, $b$, $c$ is excited in the mode
$$E_z = \sin\frac{\pi x}{a}\sin\frac{\pi y}{b}\,e^{j\omega t}$$
If $H_z = E_x = E_y \equiv 0$ and the walls are perfectly conducting, find the resonant frequency
$\omega$ and the force exerted on each face.

5.19

Determine an expression for the resonant frequency for a TM mode in a cylindrical cavity
of radius a and length l.

5.20

Determine the integral of the Poynting vector over any cross section of the circular
cylindrical waveguide of Example 5.11 for any transverse magnetic mode.

5.21

Establish the expressions for the field components and the allowed frequencies for the
resonant TM modes in a spherical cavity.

5.22

Deduce the self-inductance per meter of two parallel wires, each of radius $a$ and center-to-center spacing $D$ ($D \gg a$). Assume that the wires carry equal and opposite harmonic
currents, and that $D \ll \lambda$, conditions often encountered in practice, such as in telephonic
communications.

5.23

Calculate the mutual inductance between the two coils shown in the figure. Assume that
b and that the respective numbers of turns are N and Ni:

[Figure for Problem 5.23: two coils, with separation d indicated]

5.24

For any antisymmetric tensor $\mathcal{F}$, show that $\Box\cdot\operatorname{div}\mathcal{F} \equiv 0$.

5.25

In a Minkowskian formulation of the field equations, define a suitable field tensor whose
terms represent the components of $\mathbf{H}$ and $\mathbf{D}$.

CHAPTER

Dielectric Materials
WITH RESPECT to some aspects of their electrical behavior, materials may be classified
as conductors, semiconductors, and dielectrics or insulators. (See Section 8.2.) An ideal
dielectric is a material which possesses no free charges and thus completely inhibits
the passage of steady electric current. Since many real materials of practical importance
approach this idealization, it constitutes a useful model on which to base an analysis
of electric behavior, as shall be seen in subsequent sections of this chapter.
Although electrically neutral, any dielectric is composed of molecules, which in turn
are composed of charged particles (nuclei and electrons), and these particles are usually
affected by the presence of an electric field. Such a field influences the positively and
negatively charged parts of a molecule oppositely, and these parts may suffer oppositely
directed displacements from their equilibrium positions, thus causing the molecule to
become polarized. These displacements are limited by strong restoring forces, caused
by the altered charge distribution within the molecule, such that the charge shift is
seldom more than a small fraction of a molecular diameter. In such cases the molecule
may be viewed as an elementary electric dipole (or several dipoles) whose distant field
can be calculated using the techniques developed in earlier chapters. When the contributions due to all the molecular dipoles are summed, the resultant is often found to
alter the field distribution significantly from the value it had in the absence of the
dielectric, both for points inside and outside the dielectric.
The dipole behavior of a molecule may arise from three distinct causes. First, the
electron cloud of a constituent atom may shift relative to its nucleus due to the presence
of an electric field. This induced effect is called electronic polarization. Second, the
molecular structure may be due to an arrangement of oppositely charged ions, which
can shift from their equilibrium positions under the action of an electric field, thus
giving rise to ionic polarization. Third, the molecule may consist of an arrangement of
atoms which, in the absence of an electric field, is a randomly oriented permanent
electric dipole. The presence of the field then causes a partial orientation of the permanent molecular dipole, causing a net polarization, and this phenomenon is termed
orientational polarization. All three effects may be present in a given material.
To account for these three sources of dielectric behavior, the material may be treated
as though it were a collection of dipole moments P in a vacuum, with consideration
of the detailed composition of P deferred to a later discussion. Accordingly, a distribution of static dipole moments will first be assumed and an expression derived for the
total electric field due to static primary charges and the assemblage of dipoles. This
expression will then be used to explain the manner in which dielectric materials affect

capacitance, to generalize the meaning of the electric flux density Do, and to deduce a
relation for the local field at the site of any molecular dipole. Consideration will then
be given to the problem of connecting, for a linear dielectric material, the strengths
of the local field and the induced moments p. This will be done for electronic, ionic, and
orientational polarization and will be seen to permit simplifications of the expression
for the total field; additionally, it will lead to the relation $\mathbf{D} = \epsilon\mathbf{E}$, with the permittivity
factor $\epsilon$ serving to describe dielectric behavior for large classes of materials.
Some attention next will be given to nonlinear materials, notably ferroelectric
crystals, in which the polarization is not only not linearly proportional to the applied
field, but also depends on the prior history of excitation.
For linear dielectric materials, the theory will then be extended to the case of time-varying fields, at which point the necessity to include dielectric losses will arise. The
concept of a complex permittivity will be introduced whose imaginary part accounts
for these losses, and the dependence of permittivity on frequency will be considered
for all three types of polarization.
Finally, the free-space form of Maxwell's equations derived in Chapter 5 will be
extended to apply in regions occupied by dielectric materials. Additional extensions
of Maxwell's equations will occur at the ends of Chapter 7 (for magnetic materials) and
Chapter 8 (for conductive materials).

6.1 *

HISTORICAL SURVEY

The recognition of a distinction between two classes of materials, conductors and
insulators, dates from 1729 when Stephen Gray discovered the phenomenon of electric
conduction.¹ Most common substances were soon categorized with respect to this
property; the metals were identified as good conductors, and many excellent insulators
were known and widely used by electrical experimenters of the eighteenth century.
It will be remembered from Chapter 3 that during this period Henry Cavendish became
interested in electrostatics, and was the first to observe that the presence of an insulator
between the plates of a condenser increased its capacity to store charge for a given
voltage. He measured the relative dielectric constants of many common substances,
such as shellac, beeswax, ebonite, and paraffin, and performed experiments which
indicated that the dielectric constant was independent of voltage (for glass) and
independent of temperature (for rosin).² However, Cavendish's papers were still unpublished in 1837 when Michael Faraday rediscovered the effect.
Faraday was led to this problem by researches he had conducted on the decomposition of chemical compounds placed between electrodes. Contrary to the behavior of
liquid electrolytes, he observed that when a solid substance such as sulphur was used,
it did not conduct electricity and was in no way decomposed; yet its presence between
the electrodes did cause an effect in that the charge stored on each electrode was
increased over the value found when air was the intervening medium. Faraday pursued
this discovery, selecting shellac and sulphur as the two insulating substances best
* This section may be omitted without loss in continuity of the technical presentation.
1 S. Gray, "Several Experiments Concerning Electricity," Phil Trans Roy Soc (London), 37, 18-44; Feb., 1731.
2 H. Cavendish, Electrical Researches, ed. by J. C. Maxwell. See particularly Notes 15 and 27. Cambridge University Press, London, 1879.

suited for experimental study of the phenomenon. Using two identical spherical
capacitors, he left one air-filled and fitted a hemispherical shellac cup between the
conducting shells of the other. Upon comparing the ratio of charge to voltage for the
two condensers, Faraday concluded³
. . . assuming the capacity of the air apparatus as 1, that of the shell-lac apparatus would
be tfi or 1.55 . . . this by no means expresses the relation of lac to air. The lac only
occupies one half of the space . . . if the effect of the two upper halves of the globes be
abstracted, then the comparison of the shell-lac power in the lower half of the one, with
the power of the air in the lower half of the other, will be as 2: 1 . . . . I cannot resist the
conclusion that shell-lac does exhibit a case of specific inductive capacity.

This coefficient of specific inductive capacity, which Faraday introduced as a quantitative measure of capacity enhancement, is today called the relative dielectric constant. For sulphur he determined it to be 2.24, and then extended his experiments
to include a variety of insulators, both liquid and gaseous, as well as solid.
To explain this phenomenon, Faraday formed a physical conception of the action of
insulators, based on an idea originally put forward by Davy some years earlier to
describe the behavior of a voltaic pile. Davy had supposed that prior to chemical
decomposition, the molecules of the liquid electrolyte became electrically polarized.
Faraday supported this hypothesis by noting⁴
When I discovered the general fact that electrolytes refused to yield their elements to a
current when in the solid state, though they gave them forth freely if in the liquid condition, I thought I saw an opening to the elucidation of inductive action . . . . For let
the electrolyte be water, a plate of ice being coated with platina foil on its two surfaces,
and these coatings connected with any continued source of the two electrical powers, the
ice will charge like a Leyden arrangement, representing a case of common induction, but
no current will pass. If the ice be liquefied, the induction will fall to a certain degree, because a current can now pass; but its passing is dependent upon a peculiar molecular arrangement of the particles . . . .
Faraday then drew the inference
. . . As, therefore, in the electrolytic action, induction appeared to be the first step, and
decomposition the second . . . as the induction was the same in its nature as that through
air, glass, wax, . . . produced by any of the ordinary means: and as the whole effect in
the electrolyte appeared to be an action of the particles thrown into a peculiar or polarized
state, I was led to suspect that common induction itself was in all cases an action of contiguous particles . . . .

In the following year Faraday elaborated on this idea, saying⁵


The particles of an insulating dielectric whilst under induction may be compared to a
series of small magnetic needles, or more correctly still to a series of small insulated conductors. If the space round a charged globe were filled with a mixture of an insulating
dielectric, as oil of turpentine or air, and small globular conductors, as shot, the latter
being at a little distance from each other so as to be insulated, then these would in their
condition and action exactly resemble what I consider to be the condition and action of
3 M. Faraday, Experimental Researches in Electricity, vol. 1, Sec. 1252-1270, Bernard Quaritch, Publisher, London, 1839.
4 Ibid., Sec. 1164.
5 Ibid., Sec. 1679.

the particles of the insulating dielectric itself. If the globe were charged, these little conductors would all be polar; if the globe were discharged, they would all return to their
normal state, to be polarized again upon the recharging of the globe.

This insight is all the more remarkable when one remembers the primitive state of
atomic theory in 1838. With this model, Faraday was able to deduce that the polarization of the dielectric would be opposite to the influence causing it, thus requiring more
primary charge to maintain the same voltage. This provided an explanation for the
increase of capacity due to the presence of a dielectric.
In drawing an analogy between dielectric polarization and the behavior of a series
of small magnetic needles, Faraday established a link to a successful theory of magnetization promulgated fourteen years earlier by Poisson.⁶ Poisson had adopted Coulomb's
doctrine of two magnetic fluids as the basis for his analysis. These fluids were presumed
to neutralize each other unless magnetically excited, in which case they moved to
opposite ends of the individual elements inside a magnetic body, but were incapable of
passing from one element to the next. This polarization of the two magnetic fluids
then caused a magnetic field distribution which was derivable as the gradient of a
potential function $\Phi_m$. Poisson showed this potential function to be given by the
expression

$$\Phi_m = \oint_S \frac{\mathbf{M}\cdot d\mathbf{S}}{R} + \int_V \frac{(-\nabla\cdot\mathbf{M})}{R}\,dV$$

with the first integral taken over the surface of the magnetic body, the second taken
throughout its volume, R the distance from the element of integration to the point of
observation, and M the polarization density, or magnetization. This formula
shows that the magnetic field produced by the body is the same as would be caused
by a fictitious distribution of magnetic charges, consisting of a surface layer whose
density is the normal component of M, plus a volume distribution of density $-\nabla\cdot\mathbf{M}$.
With this interpretation, Poisson was able to explain the magnetic phenomena known
at that time, and this theory was one of his most significant achievements.
Accepting Faraday's conception of an analogous electric polarization in insulating
materials, all that was needed to provide a theory for dielectric behavior was to translate Poisson's theory of induced magnetism into electrostatic language. This was
done independently by Lord Kelvin⁷ and F. O. Mossotti⁸ and the essence of their
development will be found in Section 6.3. In his memoir, Mossotti also showed the
manner in which the dielectric constant of a material depends on its mass density.
This relation, known as the Clausius-Mossotti equation, was derived independently by
Clausius some years later,⁹ and will be considered in Section 6.12.
With the growth of atomic theory, the nature of polarization mechanisms became
more clearly understood, and both electronic and ionic polarization were recognized
as forms of induction. It was appreciated that an atom or ion pair would become
polarized under the action of a local electric field, and that this local field was contributed to not only by the external sources, but also by the induced dipoles in all other
atoms or ion pairs of the material. Lorentz derived a particularly useful expression for
6 S. D. Poisson, "Memoir on the Theory of Magnetism," Mem Acad Sci (Paris), ser. 2, 5, 247-338; February 1824.
7 W. Thomson, Cambridge and Dublin Math J, 1, 75; November 1845.
8 F. O. Mossotti, Arch des Sc Phys (Geneva), 6, 193; 1847. Mem della Soc Ital Modena (2), 14, 49; 1850.
9 R. Clausius, Die mechanische Wärmelehre, vol. 2, pp. 62-97, Vieweg, 1879.

the local field in terms of the externally applied field and the polarization density.¹⁰ If
one uses the Lorentz expression, it is possible to obtain an explicit relation between the
externally applied field and the induced polarization, thus accounting for the alteration
in field distribution due to the presence of the material.
Orientational polarization, due to the existence of permanent dipole moments in
certain molecules, was first hypothesized by Debye¹¹ in 1912. He used this concept to
explain the high static dielectric constants of water, alcohol, and similar liquids.
Borrowing from an earlier analysis of Langevin, concerned with the analogous problem
of orientation of magnetic dipoles in a permeable material, Debye was able to demonstrate a temperature dependence for the dielectric constant of substances containing
polar molecules. He extended the Clausius-Mossotti equation to include orientational
polarization, and provided a technique whereby experimental data giving dielectric
constant versus temperature could be used to separate the orientational polarizability
from the electronic and ionic contributions.
Maxwell had introduced the displacement vector D to represent the polarization of a
medium, as has already been noted in Chapter 5, and this has proved to be a highly
satisfactory means for characterizing dielectric behavior. It is usually possible to connect D to the electric field by the relation $\mathbf{D} = \epsilon_r\epsilon_0\mathbf{E}$ in which $\epsilon_r$ is the dimensionless relative dielectric constant ($\epsilon_r$ may be a tensor). This functional relation
between D and E is controlled by the polarization properties of the material and thus
serves as a description of these properties. A central aim in any dielectric theory is to
establish the dependence of $\epsilon_r$ on the properties of materials and the state variables.
The last half century has seen an accumulation of data giving $\epsilon_r$ for a vast number
of materials as a function of several parameters, such as frequency and temperature.¹²
Theory and experiment are found to be in good agreement for most gases. In solids and
liquids, where the molecules are closer together and the local field is more complex,
agreement is less satisfactory, and subtle refinements of the theory are required.
Considerable progress has been made with this problem in the last few decades and
this is an area of research showing continued interest.

6.2

THE ELECTRIC MOMENT OF A NEUTRAL SYSTEM OF CHARGES

In the sections which follow, the electric polarization of materials will be used as the
basis for an explanation of their dielectric behavior. A fundamental concept associated
with the phenomenon of polarization is the electric moment of a system of charges $q_i$.
If $\mathbf{r}_i$ is the position vector of the ith charge with respect to some fixed point, $q_i\mathbf{r}_i$ is
defined as its electric moment with respect to that point. To generalize, the electric
moment of the system of charges $q_i$ relative to a fixed origin is given by

$$\mathbf{p} = \sum_i q_i\mathbf{r}_i \tag{6.1}$$
When the net charge of the system is zero, this definition is independent of the choice
10 H. A. Lorentz, The Theory of Electrons, Sec. 117, Teubner Publishing Company, Leipzig, 1909. Reprinted by Dover Publications, Inc., New York, 1952.
11 P. Debye, Polar Molecules, Dover Publications, Inc., New York, 1945.
12 E.g., see A. R. von Hippel, ed., Dielectric Materials and Applications, John Wiley and Sons, Inc., New York, 1954.

of origin, for if $\sum_i q_i = 0$ and the origin is displaced an amount $-\Delta\mathbf{r}$, the new value for
the electric moment is

$$\mathbf{p}' = \sum_i q_i(\mathbf{r}_i + \Delta\mathbf{r}) = \sum_i q_i\mathbf{r}_i + \Delta\mathbf{r}\sum_i q_i = \mathbf{p}$$

Under this condition of charge neutrality, the electric moment may be written in
another form by introducing electric "centers of gravity" for the positive and negative
charges via the equations

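$$\sum_{+} q_i\mathbf{r}_i = \mathbf{r}_1\sum_{+} q_i = \mathbf{r}_1 Q$$

$$\sum_{-} q_i\mathbf{r}_i = \mathbf{r}_2\sum_{-} q_i = -\mathbf{r}_2 Q$$

with $\mathbf{r}_1$ and $\mathbf{r}_2$ the equivalent positions of the total positive and negative charges
$Q$, $-Q$. With these results (6.1) may be rewritten as

$$\mathbf{p} = Q(\mathbf{r}_1 - \mathbf{r}_2) = Q\mathbf{d} \tag{6.2}$$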
in which d is the directed distance from the center of negative charge to the center of
positive charge.
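
The two statements above, that the electric moment of a neutral system does not depend on the origin and that it reduces to Q d, can be checked numerically. The following is a minimal sketch only; the charge values, positions, and the helper names `electric_moment` and `center_of_charge` are illustrative assumptions and do not come from the text.

```python
# Sketch: electric moment of a neutral system of point charges (assumed data).

def electric_moment(charges, positions, origin=(0.0, 0.0, 0.0)):
    """Return p = sum_i q_i (r_i - origin) as a 3-tuple, as in (6.1)."""
    return tuple(
        sum(q * (r[k] - origin[k]) for q, r in zip(charges, positions))
        for k in range(3)
    )

def center_of_charge(charges, positions, sign):
    """Total charge and charge-weighted mean position for one sign (+1 or -1)."""
    selected = [(q, r) for q, r in zip(charges, positions) if q * sign > 0]
    total = sum(q for q, _ in selected)
    center = tuple(sum(q * r[k] for q, r in selected) / total for k in range(3))
    return total, center

# Hypothetical neutral system: the charges sum to zero (arbitrary units).
charges = [2.0, 1.0, -3.0]
positions = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

p_a = electric_moment(charges, positions)                          # original origin
p_b = electric_moment(charges, positions, origin=(5.0, -2.0, 3.0)) # displaced origin

# Form (6.2): p = Q d, with d drawn from the negative to the positive center.
Q, r1 = center_of_charge(charges, positions, +1)
_, r2 = center_of_charge(charges, positions, -1)
p_c = tuple(Q * (a - b) for a, b in zip(r1, r2))

print(p_a, p_b, p_c)
```

All three tuples should agree to rounding error; if the charges did not sum to zero, `p_b` would differ from `p_a` by the net charge times the displacement of the origin.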
A simple example to which this analysis is applicable is the case of a single atom,
with $\mathbf{r}_1$ the equivalent center of the nucleus charge and $\mathbf{r}_2$ the equivalent center of the
electron cloud. If the atom is polarized, these positions do not coincide, and d has a
value different from zero. If on the time average d is a constant, the atom is statically
polarized; alternatively, under the influence of a time-harmonic electric field, d will be
oscillatory. This formulation is also applicable to a pair of ions, with d drawn from the
center of charge of the negative ion to the comparable point in the positive ion.
If several neutral systems of charge are considered jointly, their combined electric
moment may be written
$$\mathbf{p} = \sum_j \mathbf{p}_j \tag{6.3}$$

in which $\mathbf{p}_j = q_j\mathbf{d}_j$, with $q_j$ the total positive charge of the jth system, and $\mathbf{d}_j$ the directed
distance from the center of negative charge of the jth system to the center of positive
charge of the jth system. This formulation is applicable, for example, to the case of a
molecule in which the jth subsystem is a neutral atom. It is also applicable to a molecule
in which the jth subsystem is an ion pair, or to a group of ions which are arranged in
ion pairs. In even greater generality, it is applicable to any neutral system of charges
which can be divided into subsystems of charge which are also neutral. It is in this
larger sense that Equation (6.3) will be applied to the polarization of dielectric materials.

6.3

THE STATIC MACROSCOPIC ELECTRIC FIELD


DUE TO A VOLUME OF POLARIZED DIELECTRIC MATERIAL

The definition of E in vacuo, as introduced in Chapter 3 and generalized in Chapters 4
and 5, may be extended to apply to matter in a manner given by Lorentz in his theory
of electrons. If matter is pictured as consisting of electrons and positively charged
nuclei, since these elementary particles are extremely small compared to atomic
dimensions, at the microscopic level the picture is one of a region of space which is

essentially vacuous, with a very small part occupied by particles. The electric field
associated with these particles is customarily called the microscopic electric field and
denoted by the symbol e; a microscopic magnetic field b is also associated with the
motions of the particles. These microscopic field intensities satisfy Maxwell's equations
in vacuo as given in Chapter 5. However, the rapid variation in space and time of e and b
is not ordinarily a feature of interest. For this reason, the local average values of e and b
are introduced, with the averages taken over sufficient distance and time to smooth out
the microscopic variations. These averaged field intensities are called the macroscopic
fields and represented by the symbols E and B. It will be shown in due course that these
macroscopic fields also satisfy Maxwell's equations for all types of materials, whether
they be dielectric, magnetic, or conductive. Though established from a different viewpoint, these macroscopic fields are identical with those introduced by Maxwell in his
phenomenological theory which treated matter as a continuum.
In the case of dielectric materials, a convenient starting point for the development
of a theory is to assume that the material is ideal, meaning that there is no free charge
and that all regions of the material are electrically neutral. The fundamental "building
block" of a material will be designated by the generic term molecule, this term standing
for a neutral atom, or several atoms joined in homopolar, covalent, or ionic bonds, or
a small group of ion pairs, etc. (whichever is characteristic of the material in question).
Some examples would be

1. Neutral atom: noble gases such as Ar, Ne
2. Homopolar bond: O2, N2
3. Covalent bond: SiC
4. Ion pair: HCl
5. Group of ion pairs: Solid NaCl†
Since a fundamental characteristic of any molecule under the above definition is its
electrical neutrality, if the molecule possesses an electric moment, this moment may be
written in the form of Equation (6.3).
In developing a dielectric theory, it will be advantageous to consider static conditions
first, since they are less complex and lead to some useful conclusions which simplify
the later discussion of time-varying effects. Therefore, the time-average positions of
the centers of positive and negative charge within the nth molecule will be assumed.
Then if within this molecule a subgroup of charge $-q_{n1}$ is centered a distance $d_{n1}$ from
a subgroup of charge $+q_{n1}$, this will constitute a dipole within the molecule, and will
contribute an electric moment, or dipole moment, $\mathbf{p}_{n1} = q_{n1}\mathbf{d}_{n1}$, with $\mathbf{d}_{n1}$ the directed
displacement from $-q_{n1}$ to $+q_{n1}$. If both the magnitude and direction of $\mathbf{d}_{n1}$ are caused

† In cases such as this, there is no unique way to define the molecule. Solid NaCl has a cubical lattice
structure in which Na+ and Cl- ions alternate in a three-dimensional array. One could imagine a
rectangular cell containing one pair of these ions as constituting a molecule, but this has some difficulties in that a single Na+ ion is bonded to six Cl- ions in the solid state. Alternatively, one could
imagine a larger cubical cell which has Cl- ions at all eight corners and at the centers of all six faces,
and Na+ ions at the centers of all twelve edges, as well as at the volumetric center of the cell. The
corner ions belong equally to eight cells, the edge ions equally to four cells, and the face ions equally
to two cells, so that this molecule is equivalent to four ion pairs of (Na+, Cl-). This conception has
the advantage of including in the molecule all of the principal bonds involving the central Na+ ion,
although it does have the unnatural feature of partial ions. Either conception of the solid NaCl molecule is electrically neutral.

by an external field, $\mathbf{p}_{n1}$ denotes either electronic or ionic polarization. If only the direction of $\mathbf{d}_{n1}$ is influenced by an external field, $\mathbf{p}_{n1}$ is representative of orientational polarization. At this stage in the analysis, it will not be necessary to define the nature and
cause of the dipole moments explicitly, and $\mathbf{p}_{n1}$ will be taken to represent any of the
three types of polarization.

FIGURE 6.1 Notation for dipole moments.

The nth molecule may consist of many subgroups of displaced charge $(q_{ni}, -q_{ni})$, the
displacements being $\mathbf{d}_{ni}$, in which case a total dipole moment is defined for the molecule
by the relation

$$\mathbf{p}_n = \sum_i \mathbf{p}_{ni} = \sum_i q_{ni}\mathbf{d}_{ni} \tag{6.4}$$

With reference to Figure 6.1, if $\mathbf{R}_n$ is drawn from the site of $\mathbf{p}_n$ to a distant point (x,y,z),
then the dipole moment $\mathbf{p}_n$ causes a potential at (x,y,z) given by

$$\Phi_n = \frac{\mathbf{p}_n\cdot\mathbf{R}_n}{4\pi\epsilon_0 R_n^3} \tag{6.5}$$

this result being merely a rephrasing of Example 3.5.
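
As an illustration of the distant-point condition behind (6.5), the dipole formula can be compared with the exact potential of two opposite point charges. This is a sketch under stated assumptions: the charge, separation, and field point below are hypothetical values chosen so that the distance is much larger than the separation, and the function names are not from the text.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def dipole_potential(p, R):
    """Dipole approximation p . R / (4 pi eps0 |R|^3), the form of (6.5)."""
    dist = math.sqrt(sum(c * c for c in R))
    return sum(a * b for a, b in zip(p, R)) / (4.0 * math.pi * EPS0 * dist**3)

def two_charge_potential(q, d, R):
    """Exact potential of +q at +d/2 and -q at -d/2; R is measured from the midpoint."""
    def coulomb(charge, source):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(R, source)))
        return charge / (4.0 * math.pi * EPS0 * dist)
    plus = tuple(c / 2.0 for c in d)
    minus = tuple(-c / 2.0 for c in d)
    return coulomb(q, plus) + coulomb(-q, minus)

# Assumed values: one electronic charge displaced by 0.01 nm, observed 5 nm away.
q = 1.602e-19
d = (0.0, 0.0, 1.0e-11)          # directed from -q to +q
p = tuple(q * c for c in d)      # dipole moment p = q d
R = (3.0e-9, 0.0, 4.0e-9)        # |R| = 5 nm, much greater than |d|

approx = dipole_potential(p, R)
exact = two_charge_potential(q, d, R)
rel_err = abs(approx - exact) / abs(exact)
print(approx, exact, rel_err)
```

The relative error is of the order of the square of the ratio of separation to distance, which is why the dipole form is adequate whenever the field point is remote from the molecule.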


It is usually the case that the volume V of the dielectric material is large enough so
that it may be subdivided into a great number N of macroscopic volume elements dV,
with N big enough so that the methods of the integral calculus may be employed,
whereas dV is large enough to contain many molecules.† When this is so, a continuum

† This use of the symbol dV is open to criticism in that dV is not in this case a mathematical infinitesimal. However, if another symbol were used to emphasize its macroscopic nature, the subsequent
integral formulations would suffer from the awkwardness of unfamiliar notation.

approximation may be utilized. Let $\mathbf{P}(\xi,\eta,\zeta)\,dV$ represent the vector sum of all the
dipole moments in dV, so that P is the volume density of dipole moments. By this, one
means that if there are M molecules within dV, then

$$\mathbf{P}\,dV = \sum_{n=1}^{M}\mathbf{p}_n$$

P is often called the polarization density, or simply the polarization.


Since the position vectors $\mathbf{R}_n$ drawn from the various molecules within dV to the
distant point (x,y,z) are essentially indistinguishable, one may write for the potential at
(x,y,z), caused by all the dipoles within dV,

$$d\Phi = \frac{\mathbf{P}\cdot\mathbf{R}}{4\pi\epsilon_0 R^3}\,dV$$

with $\mathbf{R} = \mathbf{1}_x(x-\xi) + \mathbf{1}_y(y-\eta) + \mathbf{1}_z(z-\zeta)$ drawn from dV to (x,y,z). From this
it follows that the potential at (x,y,z) due to all the elementary dipoles in the entire
volume V of dielectric material is given by

$$\Phi(x,y,z) = \int_V \frac{\mathbf{P}\cdot\mathbf{R}}{4\pi\epsilon_0 R^3}\,dV \tag{6.6}$$

Implicit in the derivation of (6.6) is the assumption that (x,y,z) is sufficiently remote
from every dipole in V to satisfy the condition that $R \gg d_{ni}$ for all n and for every i.
This condition is clearly met for points (x,y,z) outside the material, and the remainder
of the analysis will first be carried through under this restriction.
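The Field at an Exterior Point. Distinguishing del operators with respect to the
source point $(\xi,\eta,\zeta)$ and the field point (x,y,z) by the definitions

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial\xi} + \mathbf{1}_y\frac{\partial}{\partial\eta} + \mathbf{1}_z\frac{\partial}{\partial\zeta} \qquad \nabla_F = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z}$$

one finds that

$$\nabla_S\left(\frac{1}{R}\right) = \frac{\mathbf{R}}{R^3}$$

and thus that

$$\Phi(x,y,z) = \frac{1}{4\pi\epsilon_0}\int_V \mathbf{P}\cdot\nabla_S\left(\frac{1}{R}\right)dV \tag{6.7}$$

Use of the vector identity

$$\nabla_S\cdot\left(\frac{\mathbf{P}}{R}\right) = \frac{\nabla_S\cdot\mathbf{P}}{R} + \mathbf{P}\cdot\nabla_S\left(\frac{1}{R}\right)$$

and then the divergence theorem converts (6.7) to the form

$$\Phi(x,y,z) = \oint_S \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0 R} + \int_V \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0 R}\,dV \tag{6.8}$$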

in which S is the bounding surface of the dielectric material. The electric field caused

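by this polarized volume of material is therefore

$$\mathbf{E}(x,y,z) = -\nabla_F\left[\oint_S \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0 R} + \int_V \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0 R}\,dV\right] \tag{6.9}$$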

A very interesting interpretation may be given to Equations (6.8) and (6.9) in terms
of equivalent charges. The electrical effect of the dielectric material at exterior points
is seen to be the same as though a volume density of bound electric charge $\rho_P = -\nabla_S\cdot\mathbf{P}$
were distributed throughout V, together with a surface density of bound electric charge
$\sigma_P = P_n$ distributed over S (with $P_n$ the normal component of P). This concept of
equivalent charges will prove useful in describing many aspects of dielectric behavior.
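
The equivalence of the dipole integral and the bound-charge form can be illustrated numerically for a simple geometry. The sketch below assumes a thin rod lying along the z axis and uniformly polarized along its length, so that the divergence of P vanishes inside, the normal component of P vanishes on the side wall, and only the two end faces carry bound charge. The rod dimensions, polarization, field point, and function names are illustrative assumptions; the rod is treated as a line of dipoles, which is reasonable only when its cross section is small compared with the distance to the field point.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def potential_from_dipoles(P, area, length, field_point, n=20000):
    """Midpoint-rule sum of dipole potentials, in the manner of (6.6)."""
    x, y, z = field_point
    dz = length / n
    total = 0.0
    for k in range(n):
        zs = -length / 2.0 + (k + 0.5) * dz      # source position on the axis
        Rz = z - zs                              # z component of R (source to field)
        R = math.sqrt(x * x + y * y + Rz * Rz)
        # P is along z, so P . R = P * Rz; the element volume is area * dz.
        total += P * area * dz * Rz / (4.0 * math.pi * EPS0 * R**3)
    return total

def potential_from_bound_charges(P, area, length, field_point):
    """Bound-charge form, as in (6.8): end-face charges +P*area and -P*area only."""
    x, y, z = field_point
    r_top = math.sqrt(x * x + y * y + (z - length / 2.0) ** 2)
    r_bottom = math.sqrt(x * x + y * y + (z + length / 2.0) ** 2)
    q_bound = P * area
    return q_bound / (4.0 * math.pi * EPS0) * (1.0 / r_top - 1.0 / r_bottom)

# Assumed values: P = 1 microcoulomb per square meter, 1 mm^2 section, 2 cm length.
P, area, length = 1.0e-6, 1.0e-6, 0.02
point = (0.03, 0.0, 0.05)

phi_dipoles = potential_from_dipoles(P, area, length, point)
phi_bound = potential_from_bound_charges(P, area, length, point)
print(phi_dipoles, phi_bound)
```

The two values should agree to within the discretization error of the midpoint rule, which reflects the statement that the polarized body acts at exterior points like its equivalent bound charges.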
The Field at an Interior Point. The previous analysis may be adapted to points
(x,y,z) within the dielectric by treating separately a small region surrounding the
interior point in question. With reference to Figure 6.2, let a spherical surface $S_\delta$ of
radius $\delta$ be erected around (x,y,z) as center, thus separating a volume $V_\delta$ from the
remainder of the dielectric. $V_\delta$ should be chosen large enough to satisfy the condition
that $\delta$ be much greater than any $d_{ni}$, but it should be small enough that P is essentially
uniform throughout $V_\delta$. These conditions will normally be satisfied if $\delta$ is an order of
magnitude larger than the linear dimensions of the macroscopic volume element dV
described in the discussion following Equation (6.5).

FIGURE 6.2 The field at an interior point.
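If $V'$ is all the volume of dielectric except $V_\delta$, then

$$\mathbf{E}'(x,y,z) = -\nabla_F\left[\oint_{S+S_\delta} \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0 R} + \int_{V'} \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0 R}\,dV\right] \tag{6.10}$$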

is the electric field at (x,y,z) due to all the dielectric, except for the contribution
of the polarized molecules within $V_\delta$. If now (x,y,z) is permitted to range over the subvolume dV, consisting of the macroscopic volume element which is at the center of $V_\delta$,
no sensible change will occur in the value of $\mathbf{E}'(x,y,z)$ computed from (6.10), since dV
is so small compared to $V_\delta$. Thus Equation (6.10) also gives the average electric field
throughout a macroscopic volume element, due to all the dipoles outside $V_\delta$.
It is shown in Appendix K that all the dipoles within $V_\delta$ cause an average electric
field throughout $V_\delta$ of amount

$$\mathbf{E}_i = -\frac{1}{4\pi\epsilon_0\delta^3}\sum_{n=1}^{M}\mathbf{p}_n \tag{6.11}$$

in which the sum is over all the dipole moments contained in $V_\delta$. But if P is the density
of dipole moments in $V_\delta$, then

$$\frac{4}{3}\pi\delta^3\,\mathbf{P} = \sum_{n=1}^{M}\mathbf{p}_n$$

so that

$$\mathbf{E}_i = -\frac{\mathbf{P}}{3\epsilon_0} \tag{6.12}$$

Since $V_\delta$ is a small local volume, if the molecules within $V_\delta$ are alike and uniformly
distributed, it follows that (6.12) also gives the average electric field throughout the
macroscopic volume element dV, due to all the dipoles within $V_\delta$.† Therefore if E(x,y,z)
is interpreted to mean the total average field in dV, or more briefly, the macroscopic field,
then

$$\mathbf{E}(x,y,z) = -\nabla_F\left[\oint_{S+S_\delta} \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0 R} + \int_{V'} \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0 R}\,dV\right] - \frac{\mathbf{P}}{3\epsilon_0} \tag{6.13}$$

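This result can be simplified. Since $V_\delta$ is so small that P is uniform over $S_\delta$, one may
take a polar axis parallel to P, as indicated in Fig. 6.3, and conclude that $P_n = -P\cos\theta$
over $S_\delta$. (Note that dS is directed into $V_\delta$.) Then

$$\begin{aligned}
-\nabla_F\oint_{S_\delta} \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0 R}
&= \oint_{S_\delta} \frac{(\mathbf{P}\cdot d\mathbf{S})\,\mathbf{R}}{4\pi\epsilon_0 R^3} \\
&= \int_0^{2\pi}\!\!\int_0^{\pi} \frac{(-P\cos\theta)(\delta^2\sin\theta\,d\theta\,d\phi)\left[-(\mathbf{1}_x\delta\sin\theta\cos\phi + \mathbf{1}_y\delta\sin\theta\sin\phi + \mathbf{1}_z\delta\cos\theta)\right]}{4\pi\epsilon_0\delta^3} \\
&= \mathbf{1}_z\,\frac{P}{2\epsilon_0}\int_0^{\pi}\cos^2\theta\,\sin\theta\,d\theta = \frac{\mathbf{P}}{3\epsilon_0}
\end{aligned} \tag{6.14}$$

Therefore the equivalent bound surface charge over $S_\delta$ makes an equal and opposite
contribution to the macroscopic field at (x,y,z), when compared to the contribution
made by the polarized molecules within $V_\delta$.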
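
The value of the surface integral over the small sphere can be confirmed by direct numerical quadrature. The following sketch evaluates the field at the center of a sphere whose surface carries the bound charge density -P cos(theta), with the polar axis along P; the polarization value, the two radii, the grid sizes, and the function name are assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def field_from_cavity_surface(P, delta, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the field at the center due to the charge on S_delta."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    ex = ey = ez = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        p_n = -P * math.cos(theta)                       # P_n, with dS directed into V_delta
        area = delta**2 * math.sin(theta) * d_theta * d_phi
        weight = p_n * area / (4.0 * math.pi * EPS0 * delta**3)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # R points from the surface element to the center of the sphere.
            ex += weight * (-delta * math.sin(theta) * math.cos(phi))
            ey += weight * (-delta * math.sin(theta) * math.sin(phi))
            ez += weight * (-delta * math.cos(theta))
    return ex, ey, ez

P = 1.0e-6                       # assumed polarization, C/m^2, directed along z
expected = P / (3.0 * EPS0)      # closed-form result of the surface integral

field_small = field_from_cavity_surface(P, delta=1.0e-8)
field_large = field_from_cavity_surface(P, delta=1.0e-6)
print(field_small, field_large, expected)
```

Only the component along P survives, and it does not depend on the radius chosen, which is consistent with the cancellation against the field of the dipoles inside the sphere.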

† Most materials have this locally homogeneous property. This includes polycrystalline and amorphous solids and many crystals, notably those possessing a cubical lattice structure. It also includes
gases and liquids, since the thermal agitation of their molecules assures a time-average homogeneity.

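Additionally, since $V_\delta$ is so small, $\nabla_S\cdot\mathbf{P}$ is constant throughout $V_\delta$ and thus

$$\nabla_F\int_{V_\delta} \frac{(\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0 R}\,dV = \frac{(\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0}\int_{V_\delta}\nabla_F\left(\frac{1}{R}\right)dV = -\frac{(\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0}\int_{V_\delta}\frac{\mathbf{R}}{R^3}\,dV = 0 \tag{6.15}$$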

Therefore (6.15) may be added to the right side of (6.13) without affecting the value
of E(x,y,z). When this is done, and the explicit value for the surface integral obtained
in (6.14) is utilized, there results

$$\mathbf{E}(x,y,z) = -\nabla_F\left[\oint_S \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0 R} + \int_V \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0 R}\,dV\right]$$

But this is identical with (6.9) and thus the macroscopic field is given by the same
formula whether (x,y,z) is an interior point or an exterior point.

FIGURE 6.3 Surface integral over $S_\delta$.
EXAMPLE 6.1

In Chapter 3 the electrical system consisting of two oppositely charged, parallel, conducting, closely-spaced plates was considered extensively. In Example 3.17 it was found
that if both plates were unheated, were separated by a distance l, and differed in potential
by $V_b$ volts, then

$$\Phi = \frac{V_b x}{l} \qquad \mathbf{E} = -\mathbf{1}_x\frac{V_b}{l}$$

were the field expressions, with the coordinate system oriented as shown in the figure.
The charge density on the positive plate was $\sigma_0 = D_0 = \epsilon_0 V_b/l$, and in Example 3.31 the
capacitance per unit area was found to be $C_0 = \epsilon_0/l$.


Imagine now that a homogeneous, isotropic dielectric material completely fills the space
between the plates. With the battery still connected across the plates, the voltage difference must remain V_b volts. By symmetry considerations, one concludes that the potential
distribution is still Φ(x) = V_b x/l and that the electric field is still uniform, being given by
E = −1_x V_b/l. Therefore, the dielectric becomes uniformly polarized in the presence of this
electric field, as suggested by the figure. The polarization is directed down because the
positive charges are shifted in that direction, the negative charges being displaced upward.

[Figure: parallel-plate capacitor filled with dielectric. The positively charged plate is at Φ = V_b and the other plate at Φ = 0; the rectangular volume V is shown dotted.]

If a rectangular volume V, whose cross section is shown dotted in the figure, is selected
for application of Equation (6.8), one finds that ∇_S · P ≡ 0 within V and that P_n ≡ 0
over S except for the two faces perpendicular to x. Over the face at x = l, P_n = −P,
whereas over the face at x = 0, P_n = +P, with P the uniform volume density of induced
electric dipole moments. Thus the entire volume of dielectric behaves as though there were
uniform and opposite surface charge densities on its two faces contiguous to the metallic
plates. These equivalent charge densities are opposite in sign to the original primary charge
densities (σ₀, −σ₀) placed on the plates by the battery.
Since the potential and electric field distributions between the plates must be the same
as before the dielectric was inserted, it follows that the battery must have delivered enough
additional charge to the two plates so as just to cancel the effect of the dielectric. Therefore
the new primary charge densities on the plates are [(σ₀ + P), −(σ₀ + P)]. The capacitance
per unit area is now

$$C = \frac{\sigma_0 + P}{V_b} = C_0 \left( 1 + \frac{P}{\sigma_0} \right) > C_0$$

This increase in capacitance may be considerable, depending on the strength of the induced P for a given σ₀. Later sections of this chapter will be concerned with a determination of P as a function of external stimulus.
This conclusion that the capacitance has been increased due to the presence of a dielectric
is consistent with an energy argument. According to Equation (3.152), the energy stored in
the capacitor, per unit plate area, is now W = ½(σ₀ + P)V_b, and this increase in stored
energy can be traced to the fact that it took work to cause a relative displacement of the
positive and negative charges comprising the dielectric.
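The bookkeeping of Example 6.1 can be traced with a few lines of arithmetic. The sketch below is illustrative only: the spacing, battery voltage, and the choice P = 2σ₀ are hypothetical numbers picked for the example, and the function name is not from the text.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m

def capacitor_with_dielectric(l, V_b, P):
    """Per-unit-area quantities for the filled capacitor of Example 6.1.

    l   : plate spacing in meters
    V_b : battery voltage in volts
    P   : uniform polarization in C/m^2 (assumed known here)
    Returns (C0, C, W): empty capacitance, filled capacitance, stored energy.
    """
    sigma0 = EPS0 * V_b / l        # primary charge density without dielectric
    C0 = EPS0 / l                  # capacitance per unit area, empty
    C = (sigma0 + P) / V_b         # capacitance per unit area, filled
    W = 0.5 * (sigma0 + P) * V_b   # stored energy per unit area
    return C0, C, W

# Hypothetical values: 1 mm spacing, 100 V battery, P equal to 2*sigma0.
l, V_b = 1.0e-3, 100.0
P = 2.0 * EPS0 * V_b / l
C0, C, W = capacitor_with_dielectric(l, V_b, P)
print(C / C0)  # ratio 1 + P/sigma0
```

With P = 2σ₀ the ratio C/C₀ comes out as 1 + P/σ₀ = 3, and W agrees with ½CV_b².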


6.4 A GENERALIZATION OF D₀
Consider next a general electrostatic system consisting of primary charges plus electric
dipoles which represent dielectric materials. One can imagine that a primary volume
charge density ρ(ξ,η,ζ) occupies the volume V₁ and that an assortment of dielectric
materials occupies the volume V₂, being bounded by the surface S₂. V₁ and V₂ may overlap and neither of them need be only one simply connected region. If P(ξ,η,ζ) is the
volume density of electric dipole moments in V₂, then the total macroscopic electric
field E at a point (x,y,z) will be

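$$\mathbf{E}(x,y,z) = \mathbf{E}_1(x,y,z) + \mathbf{E}_2(x,y,z) \tag{6.16}$$

in which

$$\mathbf{E}_1(x,y,z) = -\nabla_F \int_{V_1} \frac{\rho\, dV}{4\pi\varepsilon_0 \ell} \tag{6.17}$$

and

$$\mathbf{E}_2(x,y,z) = -\nabla_F \left[ \int_{S_2} \frac{P_n\, dS}{4\pi\varepsilon_0 \ell} + \int_{V_2} \frac{(-\nabla_S \cdot \mathbf{P})\, dV}{4\pi\varepsilon_0 \ell} \right] \tag{6.18}$$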
If the entire electrostatic system, including the dielectric materials, is viewed as a distribution of primary charges of density ρ and equivalent bound charges of density
ρ_p = −∇_S · P in a vacuum, a total macroscopic electric flux density may be defined by

$$\mathbf{D}_0(x,y,z) = \varepsilon_0 \mathbf{E}(x,y,z) = \varepsilon_0 \mathbf{E}_1(x,y,z) + \varepsilon_0 \mathbf{E}_2(x,y,z) = \mathbf{D}_{01}(x,y,z) + \mathbf{D}_{02}(x,y,z) \tag{6.19}$$

In (6.19) D₀₁ is the density of electric flux which originates on the positive charges
of the primary distribution ρ and terminates on its negative charges. Similarly D₀₂ is
the density of electric flux which originates on the positive charges of the equivalent
distribution ρ_p and terminates on its negative charges. D₀₂ is a macroscopic field and
does not have the detailed microscopic structure of a flux field associated with the
individual charges which comprise the molecular dipoles. Both the flux densities D₀₁
and D₀₂ satisfy Gauss' law and are derivable through formation of the gradient of a
scalar potential function. Therefore

$$\nabla \cdot \mathbf{D}_{01} = \rho \qquad \nabla \times \mathbf{D}_{01} \equiv 0 \qquad \nabla \cdot \mathbf{D}_{02} = -\nabla \cdot \mathbf{P} \qquad \nabla \times \mathbf{D}_{02} \equiv 0 \tag{6.20}$$

In most problems of practical significance, one is interested in the total electric field,
which determines the force on a charge, and in the electric flux density D₀₁, which is
linked to the primary sources ρ dV₁ through Gauss' law. Equation (6.19) can be rewritten to display these two quantities of interest in the form

$$\mathbf{D}_{01} = \varepsilon_0 \mathbf{E} - \mathbf{D}_{02} \tag{6.21}$$

D₀₂ is the extraneous quantity in this equation, and it is advantageous to explore the
implications of defining a macroscopic electric flux density field D by the relation

$$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P} \tag{6.22}$$


The substitution of P for −D₀₂ is clearly suggested by the third of Equations (6.20).
In making the definition (6.22) it is to be emphasized that E still has the meaning of
total macroscopic electric field and that P still represents the macroscopic volume
density of electric dipole moments. Outside V₂, where P ≡ 0,

$$\mathbf{D} = \mathbf{D}_{01} + \mathbf{D}_{02} \tag{6.23}$$

whereas inside V₂

$$\mathbf{D} = \mathbf{D}_{01} + \mathbf{D}_{02} + \mathbf{P} \tag{6.24}$$

Thus in terms of the earlier conception of electric flux associated with charges in free
space (i.e., the primary charges ρ and the equivalent bound charges ρ_p), D equals the
total macroscopic flux density D₀ outside the dielectric materials, but not inside.
Furthermore, taking the curl of (6.22) and utilizing (6.20) gives

$$\nabla \times \mathbf{D} = \nabla \times \mathbf{P} \tag{6.25}$$

whereas taking the divergence yields

$$\nabla \cdot \mathbf{D} = \rho \tag{6.26}$$

Therefore D, as defined by (6.22), may not be irrotational, but it does satisfy Gauss'
law in terms of the primary sources only, and one may write

$$\oint_S \mathbf{D} \cdot d\mathbf{S} = \int_V \rho\, dV \tag{6.27}$$

and conclude that D has discontinuities only at points occupied by primary charges.
Equation (6.22) will be adopted as the defining relation for the generalized electric
flux density function D, with the understanding that this equation refers to the total
macroscopic E everywhere, whether inside or outside the materials, but that it only
refers to the total macroscopic D₀ field outside the materials. It is an equation of cardinal
importance in the theory of dielectrics, because usually it can be converted to the form
D = ε_r ε₀ E, since for most materials P is expressible as a function of E. The relative
dielectric constant ε_r then serves the role of representing the macroscopic electric
behavior of the dielectric.
Besides containing the polarization density P explicitly, Equation (6.22) has several
other advantages. Outside the materials, (6.22) gives D = ε₀E, which is a natural relation, whereas (6.21) does not similarly connect D₀₁ and E. In the absence of materials,
(6.22) reduces everywhere to (3.32) and thus all the earlier discussion of electric fields
due only to primary charges becomes a special case of the present generalization. But
an even more important feature is that D, as defined by (6.22), shares with D₀₁ the
property of satisfying Gauss' law, thus providing a cause and effect relation with the
system of primary charges.
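As a brief check, the curl and divergence of D can be written out term by term. The lines below use only ε₀E = D₀₁ + D₀₂ from (6.19), the definition (6.22), and the four relations (6.20); nothing further is assumed.

```latex
\begin{align*}
\nabla \times \mathbf{D}
  &= \nabla \times (\mathbf{D}_{01} + \mathbf{D}_{02}) + \nabla \times \mathbf{P}
   = \nabla \times \mathbf{P}, \\
\nabla \cdot \mathbf{D}
  &= \nabla \cdot \mathbf{D}_{01} + \nabla \cdot \mathbf{D}_{02} + \nabla \cdot \mathbf{P}
   = \rho - \nabla \cdot \mathbf{P} + \nabla \cdot \mathbf{P}
   = \rho .
\end{align*}
```

The bound-charge term −∇ · P carried by D₀₂ is cancelled exactly by the added P, which is why only the primary charge density survives in the divergence.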
EXAMPLE 6.2

Imagine that a parallel plate capacitor is charged by a battery until its plates have densities
(σ, −σ) and that then the battery is removed. If a block of dielectric is inserted between the
plates, as shown in the figure, the D field is unaffected, since the charges on the plates
cannot change. However, the E field is affected, because the induced polarization in the
dielectric produces a field in the opposite direction. The equivalent bound surface charges
(σ_p, −σ_p) on the dielectric faces are also shown in the E plot. If the D₀ field were shown,
it would be everywhere proportional to E through the factor ε₀, and just like E would be
discontinuous at the dielectric faces, due to the flux lines originating on σ_p and terminating
on −σ_p.

[Figure: two panels showing the D field lines and the E field lines for the capacitor with the dielectric block inserted; plate charges σ and −σ, bound surface charges σ_p and −σ_p on the dielectric faces.]
EXAMPLE 6.3

Conceptually, the macroscopic field vectors E and D, which appear in (6.22), may be
measured inside a dielectric with the aid of small cavities constructed within the dielectric.
With reference to the first figure, let it be desired to determine E at the point A, and
imagine that through the removal of material, a long, thin, needle-shaped cavity of arbitrary orientation has been created with A as its central point. For the moment, let it be
assumed that compensating charges of volume density −∇ · P and surface density P_n have
been placed in the cavity and on its walls, so that the macroscopic field everywhere is the
same as it was before removal of part of the dielectric. Since ∇ × E ≡ 0, it follows that
around the rectangular contour 1234,

$$\oint_{1234} \mathbf{E} \cdot d\mathbf{l} = 0$$

If the cavity is small, and the legs 12 and 34 are negligible, this means that E₁₄ = E₂₃.
In other words, the longitudinal component of electric field within the cavity is the same as
the corresponding component of electric field in the adjacent dielectric.
Now consider the compensating charge distributions −∇ · P and P_n, within the cavity
and on its walls. If the cavity is small, −∇ · P is uniform within the cavity, and by symmetry these volume charges contribute nothing to the electric field at point A. If the
cavity is also slender, the surface charges comprising P_n can, at most, contribute to the
transverse electric field at A. Thus removal of these surface and volume charges will not
affect the longitudinal field at A. Therefore if a small needle-shaped volume of dielectric is
removed in the neighborhood of any point A, and no compensating charges are introduced,
a measurement of the longitudinal electric field at A will give the same value for that
component of the field as existed in the dielectric before removal of the material. Three
perpendicular orientations of the needle-shaped cavity can yield measure of all three components of the electric field within the dielectric.

Next consider the creation of a waferlike cavity, with A as its central point, as shown in
the second figure. The circular faces of this cavity are large compared to its cylindrical rim,
and its orientation is arbitrary. If, once again, volume and surface charges of densities
−∇ · P and P_n are introduced to compensate for the removal of the dielectric to form the
cavity, the total field everywhere is as before; a measurement at point A of the electric field
component E_n along the cylindrical axis of the wafer will yield the same value as existed in
the dielectric with all the material present.
If the cavity is small, −∇ · P is uniform within the cavity, and by symmetry the volume
charges contribute nothing to the electric field at A; their removal will not change the
reading of the field strength. On the other hand, the surface charges on the walls of the
cavity may make a substantial contribution to the electric field at A. If the wafer cavity is
extremely thin, the surface charges on the cylindrical rim can be ignored. However, then
the surface charges on the two circular faces, being equal and opposite, cause a field at A
which is like that between the faces of a parallel plate capacitor. This field is along the
cylindrical axis of the cavity and of strength −P_n/ε₀. Removal of these surface charges
would cause the axial component of electric field to change to the value E_n + P_n/ε₀. If
Equation (6.22) is resolved into components, one may write D_n/ε₀ = E_n + P_n/ε₀. Therefore if the axial component of electric field is measured at the center of a waferlike cavity,
with no compensating charges present, multiplication of this field reading by ε₀ gives the
component of D in the axial direction. Three perpendicular orientations of the cavity will
then yield all three components of D.

6.5 THE LOCAL FIELD

The defining relation (6.22), which gives D in terms of E and P, is a macroscopic equation whose further interpretation must await the linking of P to its causes. But this
linkage occurs at the microscopic level, and thus P forms a bridge between the macroscopic and microscopic theories of dielectric behavior.


To develop the connection between P and its causes, it is useful to introduce the
concept of the local field, E_loc, which will be defined as the average field intensity acting
on a given molecule within the dielectric. E_loc may be determined by removing the
molecule in question, maintaining all other molecules in their time-averaged polarized
positions, and calculating the space-averaged electrostatic field in the cavity previously
occupied by the removed molecule. If V_m is the volume of the cavity, then

$$\mathbf{E}_{loc} = \frac{1}{V_m} \int_{V_m} \mathbf{e}\, dV - \frac{1}{V_m} \int_{V_m} \mathbf{e}_m\, dV \tag{6.28}$$

in which e is the total time-averaged field at a point in V_m and e_m is the time-averaged
field at the same point just due to the charge distribution of the molecule in question.
If the dielectric material is locally homogeneous, the first integral in (6.28) is approximately equal to the macroscopic field E. And if V_m may be chosen as a spherical volume
of radius r_m, then the results of Appendix K give

$$\frac{1}{V_m} \int_{V_m} \mathbf{e}_m\, dV = -\frac{\mathbf{p}}{4\pi\varepsilon_0 r_m^3} \tag{6.29}$$

with p the total dipole moment of the removed molecule. If N = 1/V_m is the local
volume density of molecules, then assuming parallel and equal polarization for all
local molecules gives

$$\mathbf{P} = N\mathbf{p} = \frac{\mathbf{p}}{\tfrac{4}{3}\pi r_m^3} = 3\varepsilon_0\, \frac{\mathbf{p}}{4\pi\varepsilon_0 r_m^3}$$

so that the space-averaged self-field is −P/3ε₀. With these substitutions, (6.28) becomes

$$\mathbf{E}_{loc} = \mathbf{E} + \frac{\mathbf{P}}{3\varepsilon_0} \tag{6.30}$$

Equation (6.30) was first derived by Lorentz¹³ using the definition that the local field
was the value found at the center of the molecule rather than averaged over a molecular
volume. It requires the assumption that the molecules are alike and locally distributed
in a uniform fashion. It is approximate in the sense that the field right at one of the
dipole moments p_ni of the nth molecule may not be accurately given by the field averaged over the entire volume associated with that molecule. An additional difficulty
is the fact that if all space is to be considered filled with molecular volumes (which is
implicit in the relation V_m = 1/N), then V_m cannot be spherical. Despite these approximations, when (6.30) is used to predict dielectric behavior, the results are in satisfactory agreement with experiment for many substances.
Although the integral in (6.29) may not actually reduce to precisely −P/3ε₀ in all
cases, it is arguable that the average field in V_m due to the polarized molecule contained
in V_m should be proportional to the dipole moment of the molecule, which in turn is
proportional to P by definition. Therefore

$$\frac{1}{V_m} \int_{V_m} \mathbf{e}_m\, dV = -\gamma\, \frac{\mathbf{P}}{\varepsilon_0}$$

in which γ is called the internal field constant, and does not necessarily have the
value one-third. In this case (6.30) assumes the somewhat more general form

$$\mathbf{E}_{loc} = \mathbf{E} + \left( \frac{\gamma}{\varepsilon_0} \right) \mathbf{P} \tag{6.31}$$

A best fit between experimental data and a theory based on (6.31) may then provide a
means for determining the value of γ.

13 H. A. Lorentz, Theory of Electrons, Dover Publications, Inc., New York, 1952.
Upon adoption of this formulation of the local field, it is possible to obtain relations
between P and its causes for each of the three types of polarization, and from these
results to determine a functional relation between D and E.
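The local-field relation (6.31) is simple enough to evaluate directly. The sketch below treats E and P as parallel so that scalar magnitudes suffice; the function name and the sample numbers are hypothetical and serve only to show how the Lorentz value γ = 1/3 enters.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m

def local_field(E, P, gamma=1.0 / 3.0):
    """Local field of (6.31) for parallel E and P (scalar magnitudes).

    gamma is the internal field constant; gamma = 1/3 gives the
    Lorentz form (6.30).
    """
    return E + gamma * P / EPS0

# Hypothetical numbers, not taken from the text.
E = 1.0e5            # macroscopic field, V/m
P = 2.0 * EPS0 * E   # polarization chosen as 2*eps0*E for illustration
print(local_field(E, P))             # Lorentz value, E * (1 + 2/3)
print(local_field(E, P, gamma=0.0))  # gamma = 0: local field equals E
```

Leaving γ as a parameter mirrors the remark that its value may be fitted to experimental data rather than fixed at one-third.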

6.6 ELECTRONIC POLARIZATION

The developments contained in this and the next two sections will show that, under
suitable assumptions, a linear relation exists between the strength of the local field
and the induced polarization. This linearity presumes field strengths which are not
excessive and is applicable to most dielectric materials. It includes all three polarization
mechanisms (electronic, ionic, and orientational), the first of which will now be
considered.
In the presence of an electric field, the electron cloud and nucleus of an atom tend to
shift in opposite directions, causing electronic polarization. This effect can be related
to the electric field causing it by assuming that the electron cloud has a time-average
distribution which is a function only of radial distance from the cloud's center, and then
making use of Coulomb's law. Although such a classical model is admittedly crude, and
one should properly use quantum mechanical expressions for the cloud distribution, this
classical approach has the virtue of simplicity and yields results of the right order of
magnitude.
With reference to Figure 6.4, let the volume charge density in the electron cloud be
some general function ρ(r) which drops to zero at a radius r_a. It will be assumed that the
nucleus is displaced a distance d relative to the center of the electron cloud, due to the
presence of a field E_loc. If Ze is the total charge on the nucleus, with Z the atomic number
and e the proton charge, then ZeE_loc is the force on the nucleus due to all other charges
except its own electron cloud. This force must be balanced by the restoring force which
the electron cloud impresses on the nucleus.

FIGURE 6.4 Polarization of a single atom. (Panels: electron cloud charge distribution ρ(r); polarized atom with the nucleus displaced a distance d.)

A spherical shell of charge 4πr²ρ(r) dr will exert a force at points beyond r as though
the charge were concentrated at the center, and will exert no force at points within r.
Thus the nucleus will experience a restoring force due only to that part of the electron
cloud within the radius d. According to Coulomb's law, this force will be

$$F = \frac{Ze}{4\pi\varepsilon_0 d^2} \int_0^d 4\pi r^2 \rho(r)\, dr$$

so that, for an equilibrium displacement,

$$ZeE_{loc} - \frac{(Ze)^2}{\varepsilon_0 d^2} \int_0^d r^2 f(r)\, dr = 0 \tag{6.32}$$

in which f(r) = −ρ(r)/Ze is the normalized absolute charge density in the electron
cloud, in the sense that

$$\int_0^{r_a} 4\pi r^2 f(r)\, dr = 1$$

Equation (6.32) may be rewritten in the form

$$E_{loc} = \frac{Ze}{\varepsilon_0 d^2} \int_0^d r^2 f(r)\, dr$$

If the displacement d is small compared to the effective atomic radius r_a, this can be
approximated by

$$E_{loc} = \frac{Ze f(0)}{\varepsilon_0 d^2} \int_0^d r^2\, dr = \frac{Zed}{3\varepsilon_0}\, f(0) \tag{6.33}$$

in which case the equilibrium displacement is seen to be linearly proportional to the
local field intensity causing the displacement. But Zed = p_e is the dipole moment of
the atom, and therefore

$$p_e = \alpha_e E_{loc} \tag{6.34}$$

with α_e = 3ε₀/f(0) called the electronic polarizability of the atom.
To gain a feeling for the magnitude of f(0), imagine that ρ(r) as sketched in Figure 6.4
is constant out to the radius r_a and then zero, so that

$$f(r) = \frac{1}{\tfrac{4}{3}\pi r_a^3}, \qquad r \le r_a$$

an expression which is seen to be normalized. Then f(0) also is given by this expression
and

$$\alpha_e = 4\pi\varepsilon_0 r_a^3 \tag{6.35}$$
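To attach numbers to (6.34) and (6.35), the sketch below evaluates the uniform-cloud polarizability and the resulting nuclear displacement d = α_e E_loc/Ze. The radius of one angstrom, Z = 1, and the field of 10⁶ V/m are hypothetical inputs chosen for illustration; they are not values quoted in the text.

```python
import math

EPS0 = 8.854e-12      # permittivity of free space, F/m
E_CHARGE = 1.602e-19  # proton charge, C

def alpha_e_uniform(r_a):
    """Electronic polarizability (6.35) for a uniform cloud of radius r_a."""
    return 4.0 * math.pi * EPS0 * r_a ** 3

def displacement(Z, r_a, E_loc):
    """Equilibrium displacement d from Zed = alpha_e * E_loc, per (6.34)."""
    return alpha_e_uniform(r_a) * E_loc / (Z * E_CHARGE)

r_a = 1.0e-10                    # hypothetical atomic radius, m
alpha = alpha_e_uniform(r_a)     # about 1.11e-40 F m^2
d = displacement(1, r_a, 1.0e6)  # hypothetical local field of 1e6 V/m
print(alpha, d, d / r_a)
```

For these inputs the displacement is many orders of magnitude smaller than r_a, which is the regime in which the small-d approximation (6.33) was made.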

If the atom whose electronic polarizability is being considered is only one of two
or more atoms comprising a molecule, the radial symmetry assumed above is, of course,
invalid. The displacement for a given local field will depend on the orientation of the
molecule with respect to the field. However, assuming knowledge of the effect of nearby
atoms and the probability distribution of orientations of the molecule, a suitable
average value for α_e could be deduced, which would differ from (6.35) only by some
multiplicative factor. Therefore Equation (6.34), which indicates an induced electronic
dipole moment proportional to the local field, is generally valid, even though explicit
expressions for α_e may be too difficult to obtain in complicated cases.

6.7 IONIC POLARIZATION


Two different atoms X and Y may join together as a molecule by forming a chemical
bond. This bond may be caused by the transfer of electrons from one atom to the other
or through the sharing of electrons between the atoms. When a transfer of electrons is
involved, the bond is said to be ionic; if the electrons are being shared, the bond is called
covalent. A simple ionic bond between one X atom and one Y atom quite obviously
results in a permanent dipole moment. A covalent bond can also exhibit a permanent
dipole moment if the "center of gravity" of the shared electrons does not coincide with
the "center of gravity" of the remainder of the charges in the two atoms.

FIGURE 6.5 Ionic bonds in molecules, panels (a), (b), and (c).

Depending on the kind of bond and how many atoms of each type are in the complete
molecule, a variety of possibilities arises. Three of the more simple arrangements involving ionic bonds are suggested by Figure 6.5. In (a) the molecule XY is seen to have
a dipole moment, but in (b) the in-line disposition of atoms yields a net dipole moment
of zero for the molecule XY₂. However, if the atoms are arranged as in (c), the molecule XY₂ does have a net moment. Real molecules can be found to fit all three of these
cases; examples are (a) HCl, (b) CO₂, and (c) H₂O.
Molecules are classified as polar or non-polar according to whether or not they possess
a permanent dipole moment. In the presence of an electric field, polar molecules
experience a torque tending to align them with the field. This is orientational polarization, and will be treated more fully in Section 6.8. Non-polar molecules are not subjected to this aligning torque, but the presence of a field can induce in them a type
of polarization which can be explained with the aid of Figure 6.6. In (a) the in-line
XY₂ molecule is shown again, but this time in the presence of an electrostatic field,
which causes relative displacements of the charged atoms in the directions shown.
The two dipole moments no longer cancel, and the net moment is 2Δp. This effect is
known as ionic polarization.

FIGURE 6.6 Ionic polarization: (a) in-line XY₂ molecule parallel to E_loc, with moments p + Δp and p − Δp; (b) molecule inclined to E_loc, with moments p + Δp′ and p − Δp′.

However, in many materials the XY₂ molecules are randomly oriented and the more
general situation is shown in Figure 6.6b, where the displacement is due to the field
component E_loc cos θ. A quantitative estimate of this displacement may be obtained
in the following way: Let d_a be the normal interatomic spacing between the center
of the X atom and the center of either Y atom and let 2q be the net time-average charge
on the X atom. Then in the absence of an external electric field, if the ions were a
distance r apart, the lower Y ion would experience an attractive force

$$F_a = \frac{(2q)(q)}{4\pi\varepsilon_0 r^2} + \frac{(-q)(q)}{4\pi\varepsilon_0 (2r)^2} = \frac{7q^2}{16\pi\varepsilon_0 r^2}$$

due to the positive X ion and the other negative Y ion.


To prevent the molecule from collapsing, there must also be repulsive forces between
the ions. These repulsive forces occur when the electron shells of adjacent ions overlap,
and they increase strongly for small decreases in r. Following Born, one can make the
simple assumption that the repulsive force experienced by the lower Y ion, due to the
electron cloud of the X ion, is

$$F_r = \frac{K}{r^{n+1}}$$

in which K and n are as yet undetermined constants whose values depend on the ions
being considered. When the molecule is in equilibrium, F_a = F_r and r = d_a and thus

$$K = \frac{7q^2 d_a^{\,n-1}}{16\pi\varepsilon_0}$$

Therefore the total force on the lower Y ion, due to the other ions in the molecule, is

$$F = F_a - F_r = \frac{7q^2}{16\pi\varepsilon_0 r^2} \left[ 1 - \left( \frac{d_a}{r} \right)^{n-1} \right] \tag{6.36}$$

and this is seen to be zero for the equilibrium spacing r = d_a.
If, in the presence of E_loc, the X ion shifts a distance Δd closer to the upper Y ion, the
force experienced by the lower Y ion, due to the other two ions in the molecule, is
changed to

$$F = \frac{(2q)(q)}{4\pi\varepsilon_0 (d_a + \Delta d)^2} + \frac{(-q)(q)}{4\pi\varepsilon_0 (2d_a)^2} - \frac{7q^2 d_a^{\,n-1}}{16\pi\varepsilon_0 (d_a + \Delta d)^{n+1}} \approx \frac{q}{16\pi\varepsilon_0 d_a^3}\, [7(n+1) - 16]\, q\, \Delta d$$

This force must balance the external force −qE_loc cos θ exerted on the lower Y ion and
thus, since 2Δp′ = 2q Δd is the induced dipole moment for the entire molecule,

$$2\Delta p' = \frac{32\pi\varepsilon_0 d_a^3 \cos\theta}{7(n+1) - 16}\, E_{loc} \tag{6.37}$$
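The linearization leading to (6.37) can be checked numerically. In the sketch below the force is measured in units of q²/(16πε₀d_a²) and the displacement as x = Δd/d_a, so the three terms of the exact expression become 8/(1+x)², −1, and −7/(1+x)^(n+1). The function names and the sample values n = 7, x = 10⁻³ are illustrative choices only.

```python
def force_exact(x, n):
    """Force on the lower Y ion in units of q^2/(16*pi*eps0*d_a^2).

    x is the relative displacement (Delta d)/d_a of the X ion.
    Terms: attraction to X, repulsion from the other Y, Born repulsion.
    """
    return 8.0 / (1.0 + x) ** 2 - 1.0 - 7.0 / (1.0 + x) ** (n + 1)

def force_linear(x, n):
    """First-order approximation [7(n+1) - 16] * x in the same units."""
    return (7.0 * (n + 1) - 16.0) * x

n, x = 7, 1.0e-3
print(force_exact(0.0, n))                   # zero at the equilibrium spacing
print(force_exact(x, n), force_linear(x, n)) # agree to better than 1 percent
```

The two values differ only at second order in x, which supports dropping the higher terms when Δd is much smaller than d_a.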

If all orientations of the molecule are equally likely, and the average value of 2Δp′
is denoted by p_i, then since the average value of cos θ is 2/π,

$$p_i = \alpha_i E_{loc} \tag{6.38}$$

in which α_i is called the ionic polarizability of the molecule and is given by

$$\alpha_i = \frac{64\,\varepsilon_0 d_a^3}{7(n+1) - 16} \tag{6.39}$$

The value of n to be used in this formula depends on the number of electrons in the
outer shells of the ions. Using an approximate analysis of the interaction between
closed-shell electronic distributions, Pauling deduced¹⁴ a set of values of n as a function
of ion size which are given in Table 6.1. The numbers in this table should be used by
taking the average of the values of n for the two types of ions occurring in the molecule.
For example, carbon disulphide would call for a choice of n = 7. Some experimental
work by Slater¹⁵ on alkali halides indicates reasonable agreement with this averaging
procedure of the n values in Pauling's table.

TABLE 6.1 REPULSIVE EXPONENT

Type of closed shell structure    Representative ions       n
He                                Li⁺, Be²⁺                 5
Ne                                O²⁻, F⁻, Na⁺, Mg²⁺        7
Ar                                S²⁻, Cl⁻, K⁺, Ca²⁺        9
Kr                                Br⁻                       10
Xe                                I⁻                        12

14 L. Pauling, The Nature of the Chemical Bond, p. 339, Cornell University Press, Ithaca, New York,
1948.

When the relative sizes of typical values of r_a and d_a are taken into consideration, formulas (6.35) and (6.39) indicate comparable values for electronic and ionic polarizability.
Equation (6.38), which is seen to be in the same form as (6.34), is applicable to more
complex non-polar molecules than the one represented by Figure 6.6. It is also applicable to ionic crystals. (Cf. Example 6.4.) The analysis differs in these more general
cases only in that the coefficient in (6.37) is slightly altered, which, in turn, makes a
small change in the magnitude of α_i. Thus ionic polarization manifests itself quantitatively in the same manner as electronic polarization.
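The averaging rule for n and formula (6.39) can be combined in a short calculation. The sketch below is illustrative: the dictionary restates Table 6.1, the assignment of carbon to the He-type shell and sulphur to the Ar-type shell is an assumption made here to reproduce the n = 7 quoted for carbon disulphide, and the spacing d_a = 1.5 angstroms is a hypothetical value, not one from the text.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m

# Repulsive exponents of Table 6.1, keyed by closed-shell type.
SHELL_EXPONENT = {"He": 5, "Ne": 7, "Ar": 9, "Kr": 10, "Xe": 12}

def repulsive_exponent(shell_x, shell_y):
    """Average of the two tabulated exponents, as the text prescribes."""
    return 0.5 * (SHELL_EXPONENT[shell_x] + SHELL_EXPONENT[shell_y])

def alpha_i(d_a, n):
    """Ionic polarizability (6.39) for the in-line XY2 molecule."""
    return 64.0 * EPS0 * d_a ** 3 / (7.0 * (n + 1) - 16.0)

n = repulsive_exponent("He", "Ar")  # assumed shell types for C and S
d_a = 1.5e-10                       # hypothetical interatomic spacing, m
print(n, alpha_i(d_a, n))           # n = 7.0, alpha_i about 4.8e-41 F m^2
```

For comparison, (6.35) with a radius of one angstrom gives about 1.1e-40 F m², so the two polarizabilities are of the same order, in line with the remark above.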
EXAMPLE 6.4

An ionic crystal such as NaCl is a cubical array of (alternately) positive and negative ions.
Under the influence of a static axial field the positive and negative ions shift oppositely,
approximately as suggested by the figure, until the restoring forces balance the impressed
force and equilibrium is reestablished. The balance of forces on a central positive ion may be
determined as follows:
Let the equilibrium relative displacement be Δd, with d_a the normal interionic distance,
and Δd ≪ d_a. If z is the valence number, +ze is the net charge on a positive ion, −ze the
net charge on a negative ion. The positive ion 1⁺ at the origin has suffered no asymmetrical
displacement relative to the other positive ions, so by symmetry the other positive ions
exert a null net force on 1⁺. Also, by symmetry, the negative ions exert a net force on 1⁺
which has only an x component. Consider first the negative ion which is normally at the
position (kd_a, ld_a, md_a) with k, l, and m integers. This ion causes a force on 1⁺ whose x component is

$$f = \frac{(ze)^2}{4\pi\varepsilon_0}\, \frac{k d_a + \Delta d}{[(k d_a + \Delta d)^2 + (l d_a)^2 + (m d_a)^2]^{3/2}}$$

[Figure: cubical array of alternating +ze and −ze ions with spacing d_a in the x-y plane, the negative ions displaced by Δd relative to the positive ions under the field E; the central positive ion 1⁺ is at the origin.]

15 J. C. Slater, "Compressibility of the Alkali Halides," Phys Rev 23, 488-500; April 1924.

Using a binomial expansion and ignoring terms higher than first order, one obtains

$$f = \frac{(ze)^2}{4\pi\varepsilon_0} \left[ \frac{k}{(k^2 + l^2 + m^2)^{3/2}\, d_a^2} - \frac{\Delta d}{d_a^3}\, \frac{2k^2 - l^2 - m^2}{(k^2 + l^2 + m^2)^{5/2}} \right]$$

The zero-order term is seen to be odd in k and thus, when all negative ions are considered, there is only a first order force F₁, given by

$$F_1 = -\frac{(ze)^2\, \Delta d}{4\pi\varepsilon_0 d_a^3} \sum_k \sum_l \sum_m \frac{2k^2 - l^2 - m^2}{(k^2 + l^2 + m^2)^{5/2}}$$

in which the sum is over all allowable values of the indices k, l, m, these values corresponding to the sites of negative ions. This series sums to zero and thus

$$F_1 = 0$$
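The vanishing of the first-order lattice sum can be seen numerically on a finite cube of sites. The sketch below assumes, consistent with the alternating NaCl arrangement around a positive ion at the origin, that the negative ions sit where k + l + m is odd; the function name and the cube half-width are choices made for this illustration.

```python
def first_order_lattice_sum(half_width):
    """Sum of (2k^2 - l^2 - m^2)/(k^2 + l^2 + m^2)^(5/2) over negative-ion sites.

    Sites are taken in the cube |k|, |l|, |m| <= half_width; negative ions
    are assumed to occupy the sites with k + l + m odd.
    """
    total = 0.0
    rng = range(-half_width, half_width + 1)
    for k in rng:
        for l in rng:
            for m in rng:
                if (k + l + m) % 2 != 0:      # odd parity: negative ion
                    s = k * k + l * l + m * m  # never zero for odd parity
                    total += (2 * k * k - l * l - m * m) / s ** 2.5
    return total

print(first_order_lattice_sum(6))  # zero to within rounding error
```

The cancellation is exact for any cube because interchanging k with l or m maps the set of sites onto itself, so the k², l², and m² contributions are equal.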

and, to first order, the array of displaced ions does not contribute a restoring field.
However, the balance of interactions of the electron cloud of 1⁺ with those of the two
neighboring negative ions 1₁⁻ and 1₂⁻ has also been disturbed by the displacement, giving
rise to a restoring force F₂. Once again, use of the Born approximation gives

$$F_2 = \frac{K}{(d_a - \Delta d)^{n+1}} - \frac{K}{(d_a + \Delta d)^{n+1}} = 2K\, \frac{(n+1)\, \Delta d}{d_a^{\,n+2}}$$

The restoring force in the absence of an external field is K/d_a^{n+1} and this force must balance
the attraction exerted on 1⁺ by all the ions (both positive and negative) to one side of 1⁺
when no external field is present. Thus

$$\frac{K}{d_a^{\,n+1}} = \frac{(ze)^2}{4\pi\varepsilon_0 d_a^2} \left[ {\sum}' \frac{k}{(k^2 + l^2 + m^2)^{3/2}} - {\sum}'' \frac{k}{(k^2 + l^2 + m^2)^{3/2}} \right]$$

in which Σ′ is over all terms for which k + l + m is odd and Σ″ is over all terms for which
k + l + m is even. When these sums are evaluated, their difference gives

$$K = 0.29\, \frac{(ze)^2 d_a^{\,n-1}}{4\pi\varepsilon_0}$$

and therefore

$$F_2 = 0.58\, \frac{(ze)^2}{4\pi\varepsilon_0 d_a^3}\, (n+1)(\Delta d)$$

In the presence of a static electric field, the balance of forces on 1⁺ is thus

$$ze E_{loc} = 0.58\, \frac{(ze)^2 (n+1)}{4\pi\varepsilon_0 d_a^3}\, \Delta d$$

If one lets p_i = ze Δd stand for the induced dipole moment, it follows that

$$p_i = \alpha_i E_{loc}$$

in which the ionic polarizability α_i is given by

$$\alpha_i = \frac{4\pi\varepsilon_0 d_a^3}{0.58\,(n+1)}$$

an expression which is seen to be similar in form to (6.39).

6.8 ORIENTATIONAL POLARIZATION

If a homogeneous polar material is considered, all molecules have the same permanent dipole moment and, in the absence of an exciting field, these moments are often
randomly oriented. When an electrostatic field is present, each molecule experiences a
torque which tends to align its dipole moment with the field. Were it not for thermal
agitation, in such materials all of the molecules would become aligned; but their
collisions with each other keep breaking up this pattern, so that on the time-average
there is only a partial alignment. A quantitative indication of this effect may be
deduced by first considering a permanent dipole μ = qd, as shown in Figure 6.7a, which
makes an angle θ with respect to the field. The energy added to the system consisting
of the dipole and the sources of E_loc, when an external agency rotates the dipole from
an initial setting θ₁ to a new position θ, is

$$W = \int_{\theta_1}^{\theta} q E_{loc}\, d \sin\beta\, d\beta = -\mu E_{loc} (\cos\theta - \cos\theta_1)$$

FIGURE 6.7 Permanent dipole in a local field: (a) dipole at an angle θ to E_loc; (b) W(θ) versus θ.

Therefore the potential energy of the permanent dipole moment μ is a minimum for
θ = 0 and a maximum for θ = π; in other words, alignment with the field is the preferred orientation. If the reference for potential energy is taken as θ₁ = π/2, then

$$W = -\mu E_{loc} \cos\theta \tag{6.40}$$

and the energy versus angular position may be plotted as shown in Figure 6.7b.
Let N_v be the number of molecules in a macroscopic volume element and assume
that all pointing directions would be equally probable for the permanent dipole
moments of these N_v molecules were E_loc = 0. In the band of directions 2π sin θ dθ
shown in Figure 6.8, one would find oriented a fraction 2π sin θ dθ/4π of the N_v dipoles,
this being the ratio of the band area to that of the entire unit sphere. However, the
presence of E_loc causes a weighting by energy of the pointing directions such that

$$N(\theta)\, d\theta = K\, 2\pi \sin\theta\, e^{-W(\theta)/kT}\, d\theta \tag{6.41}$$

is the number of dipole moments pointing in this band of directions. The factor K
appearing in (6.41) is a normalizing constant and k = 1.38 × 10⁻²³ joules/deg is
known as Boltzmann's constant. The temperature T is expressed in degrees Kelvin
and the weighting factor e^{−W(θ)/kT} is a consequence of the Maxwell-Boltzmann statistics.¹⁶ Since ∫₀^π N(θ) dθ = N_v, one finds that K = aN_v/(4π sinh a), wherein a = μE_loc/kT.

FIGURE 6.8 Calculation of N(θ).
The net dipole moment for all those molecules whose dipoles point in the band of
directions 27r sin 0 dO is JJ. cos ON(0) dO and therefore the equivalent orientational dipole
16 See, e.g., F. W. Sears, An Introduction to Thermodunomics, the Kinetic-Theory of Gases, and Statistical Mechanics, 2d ed., Chaps. 12 and 14, Addison-Wesley Publishing Company, Reading, Massa-

chusetts, 1953.

SECTION

Orientational Polarization 353

moment per molecule is given by the expression

~ 0J J.I. cos ON(O) dO

p; =

N;

in which the substitution variable v

po =

J.I. (

J
a

= 2

~h
ve" dv
a SIn a_a

a cos 8 has been introduced. This integral gives

coth a -

(6.42)

= J.l.2(a)

The function $L(a)$ defined by (6.42) is known as the Langevin function; it first arose
in an analysis by Langevin in 1905 of the similar problem of the orientation of magnetic
dipoles in a steady magnetic field. It is plotted in Figure 6.9. For large values of
$a = \mu E_{loc}/kT$, the Langevin function is seen to approach unity. This corresponds to
low temperatures and gives $p_o \approx \mu$, indicating that all the permanent dipole moments
are essentially aligned with the field. At very low temperatures this is a result one would
expect, since it is thermal agitation which is interfering with the tendency to alignment.

FIGURE 6.9 The Langevin function.

However, at normal temperatures $L(a)$ is quite small. This may be appreciated by
noting that $\mu$ has a value which, in order of magnitude, is the charge on a proton
multiplied by a distance of one angstrom, i.e., $10^{-10}$ meters. (Because of the convenient
size of this product, dipole moments are often expressed in debye units, with one debye
unit equal to $10^{-10}$ esu angstroms or $3.33 \times 10^{-30}$ coulomb meters.) If one assumes a
polar gas in which each molecule has a permanent dipole moment of one debye unit,
at room temperature for a local field even so strong as ten million volts per meter,
$a \approx 10^{-2}$. For values of $a$ this small, $L(a) \approx a/3$. If this approximation is used, (6.42)
becomes

$$p_o = \frac{\mu^2 E_{loc}}{3kT} \tag{6.43}$$
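The size of $a$ and the quality of the small-argument approximation can be checked numerically. The following is a minimal sketch, assuming a room temperature of 300 °K (the text says only "room temperature"); the function names are illustrative and not part of the original development.

```python
import math

K_BOLTZMANN = 1.38e-23   # joules/deg, value quoted in the text
DEBYE = 3.33e-30         # coulomb meters per debye unit, value quoted in the text


def langevin(a: float) -> float:
    """Langevin function L(a) = coth(a) - 1/a of (6.42)."""
    if abs(a) < 1e-4:
        # Leading terms of the series a/3 - a^3/45 avoid cancellation near a = 0.
        return a / 3.0 - a**3 / 45.0
    return 1.0 / math.tanh(a) - 1.0 / a


def alignment_parameter(mu: float, e_loc: float, temperature: float) -> float:
    """a = mu * E_loc / (k * T)."""
    return mu * e_loc / (K_BOLTZMANN * temperature)


if __name__ == "__main__":
    # One debye unit in a local field of ten million volts per meter;
    # T = 300 deg K is an assumed value for "room temperature".
    a = alignment_parameter(1.0 * DEBYE, 1.0e7, 300.0)
    print(f"a    = {a:.2e}")
    print(f"L(a) = {langevin(a):.4e}")
    print(f"a/3  = {a / 3.0:.4e}")
    print(f"L(20) = {langevin(20.0):.3f}  (approaching unity for large a)")
```

For $a$ of order $10^{-2}$ the two printed values of $L(a)$ and $a/3$ differ only by a relative amount of order $a^2$, which is why (6.43) is adequate at normal temperatures.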

Thus at normal temperatures the equivalent orientational dipole moment per molecule
is inversely proportional to temperature, proportional to the square of the permanent
dipole moment per molecule, and linearly proportional to the local field. This result
assumes a material whose dipole moments are randomly oriented in the absence of an
external field.
Under the conditions leading to (6.43) one may write

$$p_o = \alpha_o E_{loc} \tag{6.44}$$

in which $\alpha_o = \mu^2/3kT$ is called the orientational polarizability of the molecule.
Thus it is seen that all three types of induced polarization can be linearly proportional
to the local field.

6.9 DIELECTRIC SUSCEPTIBILITY, PERMITTIVITY, AND RELATIVE DIELECTRIC CONSTANT

Materials whose molecules possess permanent dipole moments which are randomly
oriented in the absence of a field may exhibit all three types of polarizability: electronic,
ionic, and orientational. In such cases one may write for the average polarization
per molecule

$$p = \alpha E_{loc} \tag{6.45}$$

in which

$$p = p_e + p_i + p_o \tag{6.46}$$

is the sum of the electronic, ionic, and orientational dipole moments, and

$$\alpha = \alpha_e + \alpha_i + \alpha_o \tag{6.47}$$

is the total polarizability, being the sum of the three contributions. $\alpha_e$ and $\alpha_i$ include
the effects of all atoms and ions in the molecule and must be suitably averaged over all
orientations of the molecule with respect to the field; $\alpha_o = \mu^2/3kT$ under normal
conditions of temperature, with $\mu$ the permanent dipole moment per molecule.
If $N$ is the density of molecules per cubic meter, then

$$P = Np = N\alpha E_{loc} \tag{6.48}$$

is the volume density of dipole moments. If one invokes (6.31), this may be rewritten as

$$P = \frac{N\alpha}{1 - \gamma N\alpha/\epsilon_0}\, E \tag{6.49}$$

Thus under the restrictions of this analysis the polarization $P$ is linearly proportional
to the macroscopic electric field. For this reason one may write

$$P = \chi_e \epsilon_0 E \tag{6.50}$$

in which $\chi_e$ is a dimensionless constant, called the dielectric susceptibility, and is given by

$$\chi_e = \frac{N\alpha/\epsilon_0}{1 - \gamma N\alpha/\epsilon_0} \tag{6.51}$$

From (6.22) the flux density is

$$D = \epsilon_0 E + \chi_e \epsilon_0 E = \epsilon_0(1 + \chi_e) E = \epsilon E \tag{6.52}$$

with $\epsilon = \epsilon_0(1 + \chi_e)$ called the permittivity of the medium. The relative dielectric
constant $\epsilon_r$ is defined by the relation

$$\epsilon_r = \frac{\epsilon}{\epsilon_0} = 1 + \chi_e \tag{6.53}$$

and, like $\chi_e$, is a dimensionless quantity.


If one returns to Example 6.1, and a dielectric material satisfying the conditions
of this analysis is placed between the parallel plates of the capacitor, then the
capacitance is changed so that

$$\frac{C}{C_0} = \epsilon_r = 1 + \chi_e \tag{6.54}$$

and thus the relative dielectric constant is an easily measured quantity. If one has
determined the capacitance in a vacuum and then in the presence of the material
medium, it is then possible, through the use of (6.51), to obtain an estimate of the total
polarizability $\alpha$, provided the particle density is known and a value (such as one-third) is
assumed for $\gamma$.
The results of this section are also applicable to materials in which one or more of the
partial polarizabilities of (6.47) is zero.
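The effect of the local-field factor $\gamma$ in (6.51) is easiest to see with numbers. The sketch below evaluates (6.51) and (6.53) for a dilute value and a dense value of $N\alpha/\epsilon_0$; the two sample values and the function names are illustrative assumptions, and $\gamma = 1/3$ is the value the text suggests may be assumed.

```python
def susceptibility(n_alpha_over_eps0: float, gamma: float = 1.0 / 3.0) -> float:
    """Dielectric susceptibility of (6.51) for a given N*alpha/eps0."""
    x = n_alpha_over_eps0
    return x / (1.0 - gamma * x)


def relative_permittivity(n_alpha_over_eps0: float, gamma: float = 1.0 / 3.0) -> float:
    """Relative dielectric constant of (6.53): eps_r = 1 + chi_e."""
    return 1.0 + susceptibility(n_alpha_over_eps0, gamma)


if __name__ == "__main__":
    # Illustrative values only: a gas-like and a solid-like N*alpha/eps0.
    for x in (1.0e-3, 1.43):
        chi = susceptibility(x)
        print(f"N*alpha/eps0 = {x:<8g} chi_e = {chi:.6g}  "
              f"chi_e / (N*alpha/eps0) = {chi / x:.4f}")
```

For the dilute case the ratio printed in the last column is essentially unity, so the local-field correction is negligible; for the dense case it is not.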
EXAMPLE 6.5

The measurement of dielectric constant through a capacitance experiment is limited to
frequencies at which the circuit has dimensions small compared to a free-space wavelength.
When this is the case, a circuit such as the one shown in the figure may be employed.
Neglecting loss, the resonant frequency is

$$f_0 = \frac{[L(C_s + C)]^{-1/2}}{2\pi}$$

If $C_s$ is a calibrated condenser which may be retuned to maintain resonance after insertion
of the dielectric specimen, the change in $C_s$ gives the change in capacity $C$ due to the
dielectric, from which the susceptibility may be deduced. If the frequency is low enough,
this quasistatic procedure yields the static dielectric constant. The power loss in the
dielectric as a function of frequency may be determined by measuring the sharpness of tuning
near resonance.
At microwave frequencies a different procedure is used. A dielectric specimen perhaps a
tenth of a wavelength long is inserted in a waveguide and the change in VSWR noted.
From this the velocity of propagation $v$ in the specimen may be deduced; since
$v = (\mu_0 \epsilon)^{-1/2}$, the permittivity is thereby determined.
At light frequencies, still another procedure is followed. The index of refraction is
determined for the specimen, usually in a minimum deviation prism experiment, though total
reflection and interferometric techniques are sometimes employed. Since Snell's law gives
$\sin i = n \sin r$, with $i$ and $r$ the incident and refracted angles, if a light wave is incident
from a vacuum on the specimen, the refractive index $n$ is the ratio of light velocities in free
space and in the specimen. But this is also the square root of the relative dielectric constant.

6.10 THE STATIC DIELECTRIC CONSTANT OF GASES

For most gases, under normal conditions of pressure and temperature, $N\alpha/\epsilon_0 \ll 1$.
This conclusion will be substantiated in the illustrative examples to follow, and is due
to the relatively low particle density; it does not extend to liquids and solids. However,
when this condition is met, a significant simplification of the analysis is possible. The
formula for dielectric susceptibility becomes

$$\chi_e \approx \frac{N\alpha}{\epsilon_0} \ll 1 \tag{6.55}$$

and the relative dielectric constant is therefore close to unity. The local field is given by

$$E_{loc} = E + \frac{\gamma}{\epsilon_0} P = E(1 + \gamma\chi_e) \approx E \tag{6.56}$$

This result may be explained by saying that the molecules are far enough apart to
cause a low polarization density $P$; the nearest neighbors of a molecule do not exert a
sufficient effect to cause its local field to differ substantially from the average
macroscopic field.
These simplifications can be applied to a variety of special cases.
Monoatomic Gases. The rare gases, such as helium and argon, are monoatomic
under normal conditions, and thus the principal polarization mechanism is a relative
shift of nucleus and electron cloud, and $\alpha = \alpha_e$. A measurement of capacitance, first
in a vacuum and then in a rare gas medium, will permit a deduction of $\alpha_e$.
As an illustration, the relative static dielectric constant of helium, measured at one
atmosphere and 0°C, is found to be $\epsilon/\epsilon_0 = \epsilon_r = 1.0000684$. Under these conditions
the particle density is approximately $N = 2.7 \times 10^{25}$ atoms/m³, so from (6.55)

$$\alpha_e = 0.22 \times 10^{-40} \text{ farad m}^2$$

Through use of (6.35), an estimate of the radius of the helium atom is

$$r_a \approx 0.6 \times 10^{-10} \text{ meters}$$

which is the right order of magnitude. Thus, even though the classical atom model
(used in Section 6.6 to obtain a quantitative index of electronic polarizability) is rather
crude, the results are quite reasonable. As further illustration, Table 6.2 lists the
polarizabilities, under comparable conditions, of many of the rare gases. The
polarizability is seen to increase as the atoms increase in size, which is in accordance with
Equation (6.35). One also observes from this table that even for xenon,
$N\alpha_e/\epsilon_0 \approx 10^{-3}$, which is consistent with the assumption leading to (6.55).
TABLE 6.2  $\alpha_e \times 10^{40}$ farad m² FOR RARE GASES¹⁷

Gas          He       Ne       Ar       Kr       Xe
$\alpha_e$   0.201    0.390    1.62     2.46     3.99

17 From L. Pauling, "Many-Electron Atoms and Ions," Proc Roy Soc (London), A114, 181-211; 1927.

The average distance between molecules is $N^{-1/3}$. Under normal conditions of pressure
and temperature, this is approximately $3 \times 10^{-9}$ meters for a monatomic gas, which is
about twenty times the atomic diameter. This low population density accounts for the
low value of $\chi_e$, and Equation (6.55) then indicates that the dielectric susceptibility
is linearly dependent on particle density (so long as $N\alpha_e/\epsilon_0 \ll 1$), an effect which has
been verified by experiment.
Using the foregoing analysis, one may also deduce the relative shift $d$ of the nucleus
and electron cloud of a single atom. For an electric field of $10^6$ volts/meter, Equation
(6.34) gives for a helium atom

$$p \approx 0.2 \times 10^{-34} \text{ coul-m}$$

Since the charge is $Ze = 3.2 \times 10^{-19}$ coul, the displacement is approximately

$$d = 6 \times 10^{-17} \text{ meters}$$

This is only about one millionth of the atomic radius, which serves to justify some of the
earlier assumptions about dipole moments. For reasonable values of electric field, the
atoms of a rare gas are only minutely perturbed.
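The helium figures quoted above can be retraced in a few lines. This is a sketch under stated assumptions: (6.35) is taken in the form $\alpha_e = 4\pi\epsilon_0 r_a^3$, which is the form implied by its combination with (6.55) in Example 6.6, and (6.34) is taken as $p = \alpha_e E$; the function names are illustrative.

```python
import math

EPS0 = 8.854e-12      # farad/m
E_CHARGE = 1.6e-19    # coulombs


def polarizability_dilute(eps_r: float, density: float) -> float:
    """alpha = eps0 * chi_e / N, from the dilute-gas formula (6.55)."""
    return EPS0 * (eps_r - 1.0) / density


def atomic_radius(alpha_e: float) -> float:
    """Radius from alpha_e = 4*pi*eps0*r_a**3 (assumed form of (6.35))."""
    return (alpha_e / (4.0 * math.pi * EPS0)) ** (1.0 / 3.0)


def induced_shift(alpha_e: float, field: float, z: int):
    """Induced moment p = alpha_e * E and shift d = p / (Z e)."""
    p = alpha_e * field
    return p, p / (z * E_CHARGE)


if __name__ == "__main__":
    n = 2.7e25                                   # atoms per cubic meter
    alpha = polarizability_dilute(1.0000684, n)  # helium, 0 deg C, 1 atm
    r_a = atomic_radius(alpha)
    p, d = induced_shift(alpha, 1.0e6, z=2)
    print(f"alpha_e = {alpha:.3e} farad m^2")
    print(f"r_a     = {r_a:.2e} m")
    print(f"spacing = {n ** (-1.0 / 3.0):.2e} m")
    print(f"p       = {p:.2e} coul-m, d = {d:.1e} m, d/r_a = {d / r_a:.1e}")
```

Carrying unrounded intermediate values gives a shift slightly larger than the rounded figure in the text, but of the same order, about a millionth of the atomic radius.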
The classical development in Section 6.6 suggests that the electronic polarizability
$\alpha_e$ is dependent only on distribution of charge in the electron cloud, as represented
by the function $f(r)$. If this cloud distribution remains the same, $\alpha_e$ should be a
constant. From this one would predict that $\alpha_e$ (and hence $\epsilon_r$) is independent of temperature
for a rare gas, unless the temperature is so high as to excite the atoms above their
ground state. This prediction is also borne out by experiment.
EXAMPLE 6.6

Measurements of the atomic diameter of argon in its condensed state at 50°K give
$2r_a = 3.84 \times 10^{-10}$ meters. Using these data, what is the relative dielectric constant of argon
under standard conditions of pressure and temperature?
At $T = 273$°K and $p = 1$ atm the particle density of argon gas is approximately
$N = 2.7 \times 10^{25}$ atoms/m³. Combination of (6.35) and (6.55) gives

$$\chi_e = 4\pi N r_a^3 = 2.4 \times 10^{-3}$$

so that

$$\epsilon_r = 1 + \chi_e = 1.0024$$

Direct measurement of the relative dielectric constant of argon in a capacitance experiment
gives $\epsilon_r = 1.000545$, so the above estimation of $\chi_e$ is high by a factor of 4. The chief
source of error can be traced to the assumption, which led to Equation (6.35), that the
electron cloud has a uniform density. If one were to assume instead that the electron cloud
is relatively more dense near its center, this would increase $f(0)$ and decrease $\chi_e$, thus
bringing the above calculation into closer accord with direct experiment. Further refinement
would require use of a quantum mechanical model for the electron cloud.
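The arithmetic of this example, including the size of the discrepancy with measurement, can be reproduced as follows. This is only a sketch of the uniform-cloud estimate used in the example; the function name is illustrative.

```python
import math


def chi_uniform_cloud(density: float, radius: float) -> float:
    """chi_e = 4*pi*N*r_a**3, the combination of (6.35) and (6.55)."""
    return 4.0 * math.pi * density * radius**3


if __name__ == "__main__":
    n = 2.7e25                 # atoms per cubic meter at 273 deg K, 1 atm
    r_a = 3.84e-10 / 2.0       # half the measured atomic diameter of argon
    chi_estimated = chi_uniform_cloud(n, r_a)
    chi_measured = 1.000545 - 1.0
    print(f"estimated chi_e = {chi_estimated:.2e}")
    print(f"measured  chi_e = {chi_measured:.2e}")
    print(f"overestimate factor = {chi_estimated / chi_measured:.1f}")
```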

Non-polar Gases. For a non-polar gas whose molecules have a structure such as
that indicated by Figure 6.6, there may be ionic as well as electronic polarization.
The relative displacement of nucleus and electron cloud will be different for the several
types of atoms, and the total electronic polarizability will be the sum of the partial
effects. Both the electronic and ionic polarizabilities need to be averaged over all
possible orientations of a molecule, with the results represented by mean values of $\alpha_e$
and $\alpha_i$. Then if $N$ is the density of molecules per m³, the polarization density is given by

$$P = N(\alpha_e + \alpha_i) E_{loc} \tag{6.57}$$

Because $\alpha_i$ is comparable to $\alpha_e$, $P/\epsilon_0 E$ is quite small for normal particle densities, and
the approximations (6.55) and (6.56) are once again valid.
Ionic polarizability depends on the structure of a single molecule and thus shares
the property with electronic polarization of being temperature insensitive unless the
temperature is extremely high. For this reason non-polar gases such as CH₄ have
permittivities which are constant over wide temperature ranges, as may be seen from
Figure 6.11.

EXAMPLE 6.7

The measured dielectric susceptibility of CO₂ at 0°C and one atmosphere is $0.985 \times 10^{-3}$.
Find the total polarizability and estimate what fraction of it is due to ionic displacement.
Use of (6.55) gives

$$\alpha = \alpha_e + \alpha_i = \frac{\epsilon_0 \chi_e}{N} = \frac{8.854 \times 10^{-12} \times 0.985 \times 10^{-3}}{2.7 \times 10^{25}} = 3.2 \times 10^{-40} \text{ farad m}^2$$

If one assumes that the bond consists of C⁴⁺ with 2O²⁻, then the carbon ion has a shell
structure like helium, and the oxygen ions resemble neon. Referring to Table 6.1, the
repulsive coefficient is therefore $n = 6$. The rotation-vibration spectrum of CO₂ yields the
information¹⁸ that the internuclear distance is $d_a = 1.16$ Å. Using Equation (6.39) one
obtains

$$\alpha_i = \frac{64 \times 8.854 \times 10^{-12} \times (1.16)^3 \times 10^{-30}}{7(6 + 1) - 16} = 0.27 \times 10^{-40} \text{ farad m}^2$$

This indicates that the ionic polarizability is about 9 percent of the total polarizability.
This fractional polarizability has also been determined experimentally. If the dielectric
constant is measured at visible light frequencies (in a refraction experiment), the relatively
heavy ions cannot follow the field variations, and $\alpha_e$ alone is contributing. Upon comparing
this result with the measurement of dielectric constant at very low frequencies, where both
$\alpha_e$ and $\alpha_i$ are contributing, one is able to separate the two polarizabilities. The average of
the data of three independent investigations¹⁹ gives $\alpha_i$ to be 11 percent of the total
polarizability, so the above rough estimate is the right order of magnitude.

18 G. Herzberg, Infrared and Raman Spectra, p. 398, D. Van Nostrand Company, Inc., New York,
1945.
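The two numerical steps of Example 6.7 can be retraced as below. A caution on assumptions: the general form of Equation (6.39) is not reproduced in this passage, so `ionic_polarizability` simply transcribes the arithmetic printed in the example (the factor 64, the denominator $7(n+1) - 16$, and the cube of the internuclear distance) and should be read as a transcription rather than as the general formula.

```python
EPS0 = 8.854e-12   # farad/m


def total_polarizability(chi_e: float, density: float) -> float:
    """alpha = eps0 * chi_e / N, from the dilute-gas formula (6.55)."""
    return EPS0 * chi_e / density


def ionic_polarizability(d_a: float, n: int) -> float:
    """Transcription of the arithmetic shown in Example 6.7 (assumed form)."""
    return 64.0 * EPS0 * d_a**3 / (7.0 * (n + 1) - 16.0)


if __name__ == "__main__":
    alpha = total_polarizability(0.985e-3, 2.7e25)      # CO2, 0 deg C, 1 atm
    alpha_i = ionic_polarizability(1.16e-10, n=6)       # d_a = 1.16 angstroms
    print(f"alpha   = {alpha:.2e} farad m^2")
    print(f"alpha_i = {alpha_i:.2e} farad m^2")
    print(f"ionic fraction = {100.0 * alpha_i / alpha:.1f} percent")
```

With unrounded intermediate values the ionic fraction comes out somewhat below 9 percent, consistent with the order-of-magnitude agreement claimed in the example.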

Polar Gases. For a gas composed of polyatomic molecules possessing permanent
dipole moments, all three types of polarization are present and under normal conditions
the polarization density is expressible as

$$P = N(\alpha_e + \alpha_i + \mu^2/3kT) E = \chi_e \epsilon_0 E$$

so that the dielectric susceptibility is temperature-dependent, being given by

$$\chi_e = \frac{N}{\epsilon_0}\left(\alpha_e + \alpha_i + \mu^2/3kT\right)$$

If $\chi_e$ is plotted as a function of $T^{-1}$ for a polar gas, the curve is a straight line, as
indicated by Figure 6.10, the intercept being $N(\alpha_e + \alpha_i)/\epsilon_0$ and the slope $N\mu^2/3k\epsilon_0$.

FIGURE 6.10 Dielectric susceptibility vs. temperature for a polar gas.

When experimental data for real gases are plotted in this fashion, it is possible to separate
the contribution due to orientational polarization and make an estimate of the permanent
dipole moment per molecule. Figure 6.11 shows such a representation of the experimental
results for a variety of gases and supports the straight line prediction of the
theory.
EXAMPLE 6.8

With reference to Figure 6.11, measurement of the intercept and slope for the plot
representing CH₂Cl₂ gives

$$\frac{N(\alpha_e + \alpha_i)}{\epsilon_0} = 0.7 \times 10^{-3} \qquad \frac{N\mu^2}{3k\epsilon_0} = 1.5$$

Therefore

$$\frac{\alpha_e + \alpha_i}{\mu^2/3kT} = \frac{0.7 \times 10^{-3}\, T}{1.5}$$

and thus at room temperatures the orientational polarizability is roughly seven times as
great as the electronic and ionic polarizabilities combined. Further,

$$\mu = \left(\frac{4.5\, k \epsilon_0}{N}\right)^{1/2}$$

so that, using $N = 2.7 \times 10^{25}$ molecules/m³ as the particle density under standard
conditions, one finds that

$$\mu = 4.5 \times 10^{-30} \text{ coul-m} = 1.35 \text{ debye units}$$

is the permanent dipole moment of a CH₂Cl₂ molecule.

FIGURE 6.11 Experimental data of static dielectric susceptibility vs. temperature for
various gases (curves shown include CH₃Cl, CH₂Cl₂, and CCl₄). [After R. Sanger, Phys. Z., 27, 556; 1926.]

19 C. T. Zahn, Phys Rev, 27, 459; 1926. H. H. Uhlig, J. G. Kirkwood, and F. G. Keyes, J Chem Phys,
1, 155; 1933. H. A. Stuart, Z Physik, 51, 490; 1928.
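The procedure of this example, fitting a straight line to susceptibility against $T^{-1}$ and converting the slope to a dipole moment, can be sketched as follows. The data points here are synthetic, generated from the intercept and slope quoted in the example at arbitrarily chosen temperatures; they are not measured values, and the helper names are illustrative.

```python
import math

K_BOLTZMANN = 1.38e-23   # joules/deg
EPS0 = 8.854e-12         # farad/m
DEBYE = 3.33e-30         # coulomb meters per debye unit


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dipole_moment(slope: float, density: float) -> float:
    """mu from slope = N * mu**2 / (3 * k * eps0)."""
    return math.sqrt(3.0 * K_BOLTZMANN * EPS0 * slope / density)


def synthetic_points():
    """Synthetic (1/T, chi_e) pairs built from the quoted intercept and slope."""
    temps = [300.0, 350.0, 400.0, 450.0]   # assumed temperatures, deg K
    return [1.0 / t for t in temps], [0.7e-3 + 1.5 / t for t in temps]


if __name__ == "__main__":
    slope, intercept = fit_line(*synthetic_points())
    mu = dipole_moment(slope, 2.7e25)
    print(f"slope = {slope:.3f}, intercept = {intercept:.2e}")
    print(f"mu = {mu:.2e} coul-m = {mu / DEBYE:.2f} debye units")
```

With real measurements the same two functions would be applied to the experimental points read from a plot such as Figure 6.11.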

The procedure illustrated by Example 6.8 has been utilized to deduce the permanent
dipole moment $\mu$ for many gases, and Table 6.3 lists the results for some of the more
common cases.
TABLE 6.3  PERMANENT DIPOLE MOMENTS

Molecule    $\mu$, Debye units      Molecule    $\mu$, Debye units
HCl         1.03                    CS₂         0
HBr         0.79                    CH₄         0
H₂O         1.81                    CCl₄        0
H₂S         0.92                    CH₃Cl       1.89
CO          0.12                    NO          0.1
CO₂         0                       NO₂         0.4

A study of this table permits some inferences to be made about molecular structure.
The expected symmetry of the CH₄ and CCl₄ molecules would lead to the prediction
of no net permanent dipole moment, and this is seen to be the case. However, the zero
moment for CO₂, in face of the fact that CO does have a moment, implies that CO₂
has the in-line molecular structure of Figure 6.5b. A similar conclusion may be drawn
about the structure of CS₂. Alternatively, molecules such as H₂O, H₂S, and NO₂, which
do have net dipole moments, must have the structure suggested by Figure 6.5c, with
the two bonds making an angle other than 180 deg.

6.11 THE STATIC DIELECTRIC CONSTANT OF SOLIDS AND LIQUIDS

Unlike the situation normally encountered in gases, where a large susceptibility is
precluded because the molecules are many molecular diameters apart, in the case of
solids and liquids the molecules are contiguous. This high particle density often results
in a substantial value of $P$ per unit applied field, thus causing the local electric field to
be significantly different from the macroscopic electric field. Concomitantly, the
dielectric susceptibility is not small, and the approximate expressions (6.55) and (6.56)
for $\chi_e$ and $E_{loc}$ are not applicable, being superseded by the more exact relations (6.31)
and (6.51).
In applying these more exact relations to solids, three classes of materials may be
identified :
1. Solids exhibiting only electronic polarizability
2. Solids which manifest electronic and ionic polarizabilities
3. Solids which possess orientational as well as electronic and ionic polarizabilities
These three types of solids will be considered in turn.
Elemental Solid Dielectrics. These are materials consisting of a single type of
atom, such as diamond, sulphur, etc. It is apparent that such materials contain neither
ions nor permanent dipoles, and thus can exhibit only electronic polarization. If one
makes use of (6.51), the relative dielectric constant is

$$\epsilon_r = 1 + \chi_e = \frac{1 + (1 - \gamma)\dfrac{N\alpha_e}{\epsilon_0}}{1 - \gamma\dfrac{N\alpha_e}{\epsilon_0}} \tag{6.58}$$

Measurement of the dielectric constant of such solids, together with an assumed value
for $\gamma$, permits an inference of the electronic polarizability.

EXAMPLE 6.9

At 25°C, the relative dielectric constant of sulphur is 3.75. Estimate the electronic
polarizability if the density of sulphur at this temperature is 2.05 gms/cc.
Insertion of $\gamma = 1/3$ in (6.58) gives

$$\frac{N\alpha_e}{\epsilon_0} = 1.43$$

Since the atomic weight of sulphur is 32, one kg-mole contains $N_A = 6.02 \times 10^{26}$ atoms
and weighs 32 kg. This many atoms occupies a volume

$$V = \frac{32{,}000}{2.05} = 15{,}600 \text{ cm}^3$$

and therefore the atom density is

$$N = \frac{6.02 \times 10^{26}}{0.0156} = 3.86 \times 10^{28} \text{ atoms/m}^3$$

The electronic polarizability is thus

$$\alpha_e = \frac{1.43 \times 8.854 \times 10^{-12}}{3.86 \times 10^{28}} = 3.28 \times 10^{-40} \text{ farad m}^2$$

With reference to Table 6.2, since sulphur is close to argon in the periodic table, this result
is seen to be the right order of magnitude, but larger than one would expect for a free
sulphur atom. This increase of electronic polarizability in the solid state is due to the fact
that the bonding of adjacent atoms affects the valence electrons. Similar calculations for
the diamond-structure solids, such as silicon and germanium, also give a somewhat higher
value for $\alpha_e$ than would be predicted for the corresponding free atom.
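The steps of Example 6.9 amount to inverting (6.58) for $N\alpha_e/\epsilon_0$ and dividing by the atom density. A minimal sketch, assuming $\gamma = 1/3$ as in the example (function names are illustrative):

```python
EPS0 = 8.854e-12      # farad/m
AVOGADRO = 6.02e26    # atoms per kg-mole, value used in the text


def n_alpha_over_eps0(eps_r: float, gamma: float = 1.0 / 3.0) -> float:
    """Invert (6.58): N*alpha/eps0 = chi_e / (1 + gamma*chi_e), chi_e = eps_r - 1."""
    chi = eps_r - 1.0
    return chi / (1.0 + gamma * chi)


def atom_density(mass_density: float, atomic_weight: float) -> float:
    """Atoms per cubic meter from density (kg/m^3) and weight (kg per kg-mole)."""
    return AVOGADRO * mass_density / atomic_weight


if __name__ == "__main__":
    x = n_alpha_over_eps0(3.75)            # sulphur at 25 deg C
    n = atom_density(2050.0, 32.0)         # 2.05 gms/cc = 2050 kg/m^3
    alpha_e = x * EPS0 / n
    print(f"N*alpha_e/eps0 = {x:.3f}")
    print(f"N              = {n:.3e} atoms/m^3")
    print(f"alpha_e        = {alpha_e:.2e} farad m^2")
```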

Since the distance between atoms in a solid is only slightly affected by temperature,
the quantities $\alpha_e$, $\gamma$, and $N$ which appear in (6.58) are temperature insensitive, and
one would expect $\epsilon_r$ for an elemental solid dielectric to be essentially constant over a
wide temperature range. This prediction is confirmed by experiment.
Ionic Non-polar Solid Dielectrics. Generally, those solids which contain more than
one type of atom, but no permanent dipoles, evidence ionic as well as electronic
polarizability. Prominent among such solids are the ionic crystals, such as the alkali halides.
The structure of these crystals is characterized by a regular three-dimensional
alternation of positive and negative ions, and hence the entire crystal has no permanent
dipole moment. However, in the presence of an external field, the positive ion lattice
will suffer a displacement relative to the negative ion lattice, resulting in ionic
polarization. Additionally, both ion types will show electronic polarization, so that the total
polarization density may be written as the sum of these contributions, in the form
$P = P_e + P_i$. The static relative dielectric constant $\epsilon_{rs}$ is related to $P$ by the expression

$$P_e + P_i = (\epsilon_{rs} - 1)\epsilon_0 E \tag{6.59}$$

However, if the dielectric constant is measured at light frequencies (by a refraction
experiment), the ions are too heavy to follow the field variations, and $P_i = 0$. Denoting
the relative dielectric constant under these conditions by $\epsilon_{rl}$, one can write

$$P_e = (\epsilon_{rl} - 1)\epsilon_0 E \tag{6.60}$$

Combining (6.59) and (6.60) yields for the ratio of polarization densities

$$\frac{P_i}{P_e} = \frac{\epsilon_{rs} - \epsilon_{rl}}{\epsilon_{rl} - 1} \tag{6.61}$$

Therefore measurements of the dielectric constant under quasistatic and optical
conditions provide an indication of the relative strengths of ionic and electronic
polarization.
Table 6.4 lists these data for many of the alkali halides. The ionic contribution is seen
typically to be three times as great as the electronic contribution.

TABLE 6.4  STATIC AND OPTICAL DIELECTRIC CONSTANTS FOR ALKALI HALIDES

Crystal    $\epsilon_{rs}$    $\epsilon_{rl}$    $P_i/P_e$
LiF        9.27               1.92               8.0
LiCl       11.05              2.75               4.7
LiBr       12.1               3.16               4.1
LiI        11.03              3.80               2.6
NaF        6.0                1.74               5.8
NaCl       5.62               2.25               2.7
NaBr       5.99               2.62               2.0
NaI        6.60               2.91               1.9
KF         6.05               1.85               4.9
KCl        4.68               2.13               2.3
KBr        4.78               2.33               1.8
KI         4.94               2.69               1.3
RbF        5.91               1.93               4.3
RbCl       5.0                2.19               2.4
RbBr       5.0                2.33               2.0
RbI        5.0                2.63               1.5

Because $P_i$, like $P_e$, is dependent on atomic structure and particle density (quantities
which normally are unaffected by temperature), ionic solids also reveal susceptibilities
which are independent of temperature.
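The last column of Table 6.4 follows directly from (6.61). As a quick check, the sketch below recomputes the ratio for a few of the tabulated crystals; the function name and the choice of crystals are illustrative.

```python
def ionic_to_electronic_ratio(eps_rs: float, eps_rl: float) -> float:
    """P_i / P_e = (eps_rs - eps_rl) / (eps_rl - 1), Equation (6.61)."""
    return (eps_rs - eps_rl) / (eps_rl - 1.0)


# (static constant, optical constant, tabulated ratio) taken from Table 6.4
SAMPLES = {
    "LiF": (9.27, 1.92, 8.0),
    "NaCl": (5.62, 2.25, 2.7),
    "KI": (4.94, 2.69, 1.3),
    "RbBr": (5.0, 2.33, 2.0),
}

if __name__ == "__main__":
    for crystal, (eps_rs, eps_rl, tabulated) in SAMPLES.items():
        ratio = ionic_to_electronic_ratio(eps_rs, eps_rl)
        print(f"{crystal:5s} computed {ratio:.2f}   tabulated {tabulated}")
```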
Polar Solids. For solids whose molecules possess permanent dipole moments, the
total polarization contains three contributions, and may be written $P = P_e + P_i + P_o$,
with $P_o$ representing the orientational contribution. Unfortunately, no adequate
quantitative theory exists which relates $P_o$ to its stimulus in the case of solids. The
reason for this is that, unlike the molecules of a liquid or a gas which can rotate freely,
the molecules of a solid are often constrained by the stability of the structure and the
directional character of the bonding. This constraint varies from one material to
another, and impedes the alignment of the permanent dipoles with the electric field
in a manner which, unlike thermal agitation, is not random, and cannot be expressed
generally. Therefore the discussion of polar solids must be limited principally to some
qualitative remarks.
In solids the ease with which a molecule can rotate depends on its shape and on its
interactions with its neighbors. Apparently the more symmetrical the molecule is, the
more freely it will rotate. Non-polar solid methane (CH₄), which is highly symmetrical,
exhibits this feature and so does solid hydrogen. However, less symmetrical molecules
such as H₂S and HCl do not. Indeed, instead of displaying a rotational characteristic,
they appear to have several stable orientations and thus to obey an order-disorder
theory.²⁰
Polar solids whose molecules have a discrete number of allowed orientations show a
dielectric susceptibility with the same temperature dependence that one would expect
if the molecules could rotate freely. To see this, imagine that there are $M$ allowed
orientations which make angles $\theta_i$ with the field $E_{loc}$. If $U_i$ is the energy of a molecule in
the $i$th state in the absence of an external field, then the relative population of the
field-free states is

$$e^{-U_i/kT}$$

No net orientational polarization in the absence of a field implies that

$$\sum_{i=1}^{M} \mu \cos\theta_i \, e^{-U_i/kT} = 0$$

When the field is present, the relative population becomes

$$e^{-(U_i - \mu E_{loc}\cos\theta_i)/kT}$$

and the net polarization per molecule is therefore

$$p_o = \frac{\sum_{i=1}^{M} \mu \cos\theta_i \, e^{-(U_i - \mu E_{loc}\cos\theta_i)/kT}}{\sum_{i=1}^{M} e^{-(U_i - \mu E_{loc}\cos\theta_i)/kT}}$$

When $\mu E_{loc}/kT \ll 1$, this reduces to

$$p_o \approx \frac{\sum_{i=1}^{M} \mu \cos\theta_i \, e^{-U_i/kT}\left[1 + \mu E_{loc}\cos\theta_i/kT\right]}{\sum_{i=1}^{M} e^{-U_i/kT}\left[1 + \mu E_{loc}\cos\theta_i/kT\right]} = \frac{\mu^2 E_{loc}}{kT}\, \frac{\sum_{i=1}^{M} \cos^2\theta_i \, e^{-U_i/kT}}{\sum_{i=1}^{M} e^{-U_i/kT}} \tag{6.62}$$

If $U_i/kT$ is also small, as is usually the case, then

$$p_o = \frac{\mu^2 E_{loc}}{kT}\, \frac{1}{M}\sum_{i=1}^{M} \cos^2\theta_i \tag{6.63}$$

which, except for a difference in the multiplicative factor, is the same as expression
(6.43). For this reason, one would expect a polar solid to exhibit a $T^{-1}$ temperature
dependence, whether the orientational mechanism is a free rotation or a set of states.

FIGURE 6.12 Relative dielectric constant of solid H₂S vs. temperature at 5 kcps.
[After Smyth and Hitchcock, J. Am. Chem. Soc., 56, 1084; (1934)]

20 N. L. Alpert, "Study of Phase Transitions by Means of Nuclear Magnetic Resonance Phenomena,"
Phys Rev, 75, 398-410; February 1, 1949.
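The reduction from the exact Boltzmann average to the small-field form can be checked numerically for a simple set of allowed states. The sketch below assumes a hypothetical solid with six equal-energy orientations along the positive and negative coordinate axes, with the field along one axis; this geometry, the 300 °K temperature, and the function names are illustrative assumptions, not taken from the text.

```python
import math

K_BOLTZMANN = 1.38e-23   # joules/deg
DEBYE = 3.33e-30         # coulomb meters per debye unit


def exact_moment(mu, e_loc, temperature, cos_thetas, energies):
    """Exact Boltzmann average of mu*cos(theta_i) over the allowed states."""
    kt = K_BOLTZMANN * temperature
    weights = [math.exp(-(u - mu * e_loc * c) / kt)
               for c, u in zip(cos_thetas, energies)]
    return mu * sum(c * w for c, w in zip(cos_thetas, weights)) / sum(weights)


def small_field_moment(mu, e_loc, temperature, cos_thetas, energies):
    """Small-field form (6.62), valid when mu*E_loc/kT << 1."""
    kt = K_BOLTZMANN * temperature
    weights = [math.exp(-u / kt) for u in energies]
    mean_cos2 = sum(c * c * w for c, w in zip(cos_thetas, weights)) / sum(weights)
    return mu**2 * e_loc / kt * mean_cos2


# Hypothetical state set: +z, -z, +x, -x, +y, -y with the field along z.
COS_THETAS = [1.0, -1.0, 0.0, 0.0, 0.0, 0.0]
ENERGIES = [0.0] * 6     # equal field-free energies, so (6.63) applies

if __name__ == "__main__":
    args = (1.0 * DEBYE, 1.0e7, 300.0, COS_THETAS, ENERGIES)
    print(f"exact       p_o = {exact_moment(*args):.4e} coul-m")
    print(f"small-field p_o = {small_field_moment(*args):.4e} coul-m")
```

For this particular state set the mean of $\cos^2\theta_i$ is 1/3, so the small-field result coincides with the free-rotation expression (6.43); other state sets change only the multiplicative factor.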
FIGURE 6.13 Relative dielectric constant of nitrobenzene vs. temperature.
[After Smyth and Hitchcock, J Am Chem Soc, 54, 4631; 1932.]

This temperature dependence is shown clearly in substances such as solid HCl and
H₂S. The relative dielectric constant of solid hydrogen sulfide is plotted as a function
of temperature in Figure 6.12. $\epsilon_r$ is seen to increase as the temperature is lowered until,
at 127°K, a slight jump occurs, presumably occasioned by a change in structure and a
reconstitution of allowed states. The temperature dependence is once again evident
until 103.5°K is reached, at which point the permanent dipole moments are apparently
"frozen" and no orientational effect remains.
In considering liquid dielectrics, once again three classes of materials may be
identified. Those liquids which have only electronic, or at most electronic and ionic
polarizabilities, may be discussed in a manner analogous to what has already been done in the
case of non-polar solids. For the same reasons, such liquids customarily display dielectric
constants which are temperature insensitive. Appreciable susceptibilities are found
in some of these liquids because of their favorable molecular structure and high particle
density.
An additional difficulty arises when one attempts to apply the theory to polar liquids.
Although molecular rotation is normally free, so that the Langevin analysis of net
orientation appears pertinent, an actual rotation of a given molecule affects its nearby
neighbors and alters $E_{loc}$, thus further complicating the calculation of $\alpha_o$. This effect
was first appreciated by Onsager²¹ in 1936 and approximate theories have been developed
which take it into account.²² The prediction of these theories is that $\alpha_o$ is still
proportional to $\mu^2/kT$, the only difference being a change in the value of the
proportionality constant.
This $T^{-1}$ temperature dependence of the dielectric constant is evident in the liquid
phases of polar materials such as HCl and H₂S. The case of nitrobenzene is shown in
Figure 6.13. A rise in $\epsilon_r$ is seen to accompany a lowering of the temperature until
solidification apparently sets the permanent dipoles in fixed orientations.

6.12 THE CLAUSIUS-MOSSOTTI EQUATION

If the Lorentz expression for the local field, Equation (6.30), is assumed, so that
$\gamma = 1/3$, then the general polarizability Equation (6.51) may be written in the form

$$\frac{N\alpha}{3\epsilon_0} = \frac{\chi_e}{\chi_e + 3} = \frac{\epsilon_r - 1}{\epsilon_r + 2} \tag{6.64}$$

If $\rho_M$ is the mass density of the dielectric material, then the molecular density $N$ is
given by

$$N = \frac{N_A \rho_M}{M} \tag{6.65}$$

in which $N_A$ is Avogadro's number and $M$ is the molecular weight. In MKS units,
$N_A = 6.02 \times 10^{26}$ is the number of molecules in one kilogram molecular weight, and
$M$ is the kilogram molecular weight (as an illustration, $M = 32$ kilograms for oxygen);
$\rho_M$ is expressed in kg/m³. If (6.65) is substituted in (6.64) one obtains

$$\frac{M}{\rho_M}\, \frac{\epsilon_r - 1}{\epsilon_r + 2} = \frac{N_A \alpha}{3\epsilon_0} \tag{6.66}$$

This is the Debye generalization of the Clausius-Mossotti equation, the earlier version
of which had not included the effects of orientational polarization. It relates the
dielectric constant to the mass density of the material, and is written in such a form that the
right side (called the molar polarizability) is a function only of temperature. Therefore
the left side is independent of density.

21 L. Onsager, "Electric Moments of Molecules in Liquids," J Am Chem Soc, 58, 1486-1493; 1936.
22 H. Frohlich, Theory of Dielectrics, 2d ed., pp. 33-50, Oxford Press, London, 1958.

This equation is valid only for dielectrics of low density, for which inaccuracies in
the Lorentz local field expression have little effect. Thus principal agreement between
(6.66) and experiment is found for gases at not too excessive pressures. The form (6.66)
is really not pertinent to liquids and solids, in that their densities are not readily
changed, and then not substantially. Equation (6.64), which is often also called the
Clausius-Mossotti formula, gives fair agreement with experiment for non-polar liquids
in which short range forces are negligible, and also for some simple non-polar solids for
which the Lorentz local field expression is a good approximation.²³
The case of gaseous hydrogen is illustrated by Table 6.5, in which the calculated
values of molar polarizability listed in the last column are seen to be almost constant
over a wide range in density.
TABLE 6.5  DIELECTRIC CONSTANT OF HYDROGEN VERSUS DENSITY AT 24.9°C²⁴

Pressure         Density      $\epsilon_r$    $\frac{M}{\rho_M}\frac{\epsilon_r - 1}{\epsilon_r + 2} \times 10^3$
(atmospheres)    (kg/m³)
7.96             0.324        1.00192         3.946
30.03            1.206        1.00730         4.026
88.13            3.421        1.02083         4.030
255.04           8.984        1.05540         4.038
478.78           14.955       1.09310         4.026
814.62           21.755       1.13766         4.032
1425.36          30.357       1.19500         4.022

24 After A. Michels, P. Sanders, and A. Schipper, "The Dielectric Constant of Hydrogen,"
Physica, 2, 753-756; 1935.
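The near-constancy of the last column can be confirmed by recomputing the left side of (6.66) from the density and dielectric constant columns. In this sketch $M = 2$ is an assumption: the table does not state the value used, and 2 is chosen here because it reproduces the tabulated last column to within rounding when the density column is used as printed.

```python
# (density, relative dielectric constant) pairs from Table 6.5
HYDROGEN_ROWS = [
    (0.324, 1.00192),
    (1.206, 1.00730),
    (3.421, 1.02083),
    (8.984, 1.05540),
    (14.955, 1.09310),
    (21.755, 1.13766),
    (30.357, 1.19500),
]

MOLECULAR_WEIGHT = 2.0   # assumed value of M; not stated in the table


def molar_polarizability(density: float, eps_r: float,
                         molecular_weight: float = MOLECULAR_WEIGHT) -> float:
    """Left side of (6.66): (M / rho_M) * (eps_r - 1) / (eps_r + 2)."""
    return molecular_weight / density * (eps_r - 1.0) / (eps_r + 2.0)


if __name__ == "__main__":
    values = [molar_polarizability(rho, eps) for rho, eps in HYDROGEN_ROWS]
    for (rho, eps), value in zip(HYDROGEN_ROWS, values):
        print(f"density {rho:7.3f}   eps_r {eps:.5f}   left side {value * 1e3:.3f} x 1e-3")
    spread = (max(values) - min(values)) / (sum(values) / len(values))
    print(f"relative spread over the table: {100.0 * spread:.1f} percent")
```

The density varies by a factor of nearly one hundred across the table while the computed quantity changes by only a few percent, which is the point the table is making.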

If the dielectric is a mixture, consisting of a homogeneous blend of a number of
different types of molecules, the Clausius-Mossotti equation may be extended readily
to the form

$$\frac{\epsilon_r - 1}{\epsilon_r + 2} = \frac{1}{3\epsilon_0} \sum_n N_n \alpha_n \qquad (6.67)$$

in which $N_n$ and $\alpha_n$ are the molecular density and the total polarizability of the nth
species.
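As a sketch of how (6.67) might be used in practice, the following Python function sums the contributions of the species and solves for $\epsilon_r$. The function name and the two-component gas are hypothetical; the numerical values are illustrative only and are not taken from the text.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m


def mixture_eps_r(species):
    """Relative dielectric constant of a mixture from (6.67).

    species: iterable of (N_n, alpha_n) pairs, with N_n in molecules/m^3
    and alpha_n in F*m^2.
    """
    x = sum(n * a for n, a in species) / (3.0 * EPS0)
    if x >= 1.0:
        raise ValueError("sum lies outside the range where (6.67) can be solved")
    # Solve (eps_r - 1)/(eps_r + 2) = x for eps_r.
    return (1.0 + 2.0 * x) / (1.0 - x)


# Hypothetical two-component gas (assumed values, for illustration only).
gas = [(2.0e25, 1.0e-40), (0.5e25, 3.0e-40)]
print(mixture_eps_r(gas))
```

Splitting one species into two identical halves leaves the result unchanged, as it must, because only the sum $\sum_n N_n \alpha_n$ enters.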
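EXAMPLE 6.10

If Equation (6.66) is re-solved for the relative dielectric constant, one obtains

$$\epsilon_r = \frac{1 + 2\rho_M N_A \alpha / 3M\epsilon_0}{1 - \rho_M N_A \alpha / 3M\epsilon_0}$$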

23 For penetrating discussions of the validity of the Clausius-Mossotti formula, see H. Frohlich,
Theory of Dielectrics, 2d ed., Appendix A3, Oxford Press, London, 1958, and C. J. F. Bottcher,
Theory of Electric Polarisation, pp. 199-212, Elsevier Publishing Company, Amsterdam, 1952.


and this equation may be plotted, with the result suggested by the figure. As the density
increases, the dielectric constant is seen to climb more and more sharply, and to tend to
infinity as the density approaches the critical value $3M\epsilon_0/N_A\alpha$. This so-called polarization
catastrophe is not approached in the case of most real materials because the approximations
inherent in the derivation of the Clausius-Mossotti equation become invalid long before
such densities are reached. However, it provides an interesting insight into the behavior of
ferroelectric crystals, as shall be seen in Section 6.14. Many gases conform to the lower
portions of the curve, especially those with small polarizabilities.

[Figure: relative dielectric constant versus density.]

A typical gas under standard conditions has a relative dielectric constant in the neighborhood of 1.001. For such a gas, if one asks by what factor the density must be increased to
raise $\epsilon_r$ to the value 2, use of the above equation gives the result 750. This requires that
the molecules be 9 times closer together, which is approaching the liquid state. It is apparent
that an inordinate change in the state variables of a gas is needed before a significant change
in its susceptibility will occur.
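The factor of 750 follows from the observation that, at fixed temperature, the quantity $(\epsilon_r - 1)/(\epsilon_r + 2)$ is proportional to density. A minimal Python check, with names chosen only for this illustration, is:

```python
def cm_ratio(eps_r):
    """Clausius-Mossotti ratio (eps_r - 1)/(eps_r + 2), proportional to density."""
    return (eps_r - 1.0) / (eps_r + 2.0)


# The density must scale by the ratio of the two Clausius-Mossotti ratios.
density_factor = cm_ratio(2.0) / cm_ratio(1.001)

# The mean molecular spacing scales as the inverse cube root of the density.
spacing_factor = density_factor ** (1.0 / 3.0)

print(f"density factor: {density_factor:.0f}")
print(f"spacing factor: {spacing_factor:.1f}")
```

The cube root of the density factor gives the reduction in mean molecular spacing quoted in the example.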
EXAMPLE 6.11

If the Clausius-Mossotti equation is written for non-polar materials under quasi-static conditions ($\alpha = \alpha_e + \alpha_i$) and at light frequencies ($\alpha = \alpha_e$) and the difference is taken, one
obtains

$$\frac{M}{\rho_M}\left[\frac{\epsilon_{rs} - 1}{\epsilon_{rs} + 2} - \frac{\epsilon_{rl} - 1}{\epsilon_{rl} + 2}\right] = \frac{N_A \alpha_i}{3\epsilon_0} = \Pi_i$$

this result being known as the molar ionic polarizability. If attention is directed to the
solid alkali halides, and the expression found for $\alpha_i$ in Example 6.4 is used, one finds that

$$\frac{\epsilon_{rs} - 1}{\epsilon_{rs} + 2} - \frac{\epsilon_{rl} - 1}{\epsilon_{rl} + 2} = \frac{N\alpha_i}{3\epsilon_0} = \frac{2\pi}{1.74(n + 1)}$$

since $2d_0^3$ is the volume occupied by one molecule, so that $2Nd_0^3 = 1$. Use of the experimental values of dielectric constant from Table 6.4 provides a check of this equation.
Comparing the last two columns of Table 6.6 reveals that the agreement is reasonably good
for some of the salts listed and fair for the others.
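The theoretical column of Table 6.6 can be regenerated from the repulsive coefficients alone. The Python sketch below assumes the form $2\pi/1.74(n + 1)$ for the theoretical expression, which is the form that reproduces the tabulated values, and compares it with the experimental column. The value n = 6 used for LiF is inferred from its tabulated theoretical value and should be treated as an assumption.

```python
import math

# salt: (repulsive coefficient n, experimental value, tabulated theoretical value)
# The entry n = 6 for LiF is inferred, not read from the table.
table_6_6 = {
    "LiF": (6, 0.50, 0.52),
    "LiCl": (7, 0.40, 0.45),
    "NaF": (7, 0.43, 0.45),
    "NaCl": (8, 0.31, 0.40),
    "KCl": (9, 0.28, 0.36),
    "KI": (10.5, 0.21, 0.31),
    "RbBr": (10, 0.27, 0.33),
    "RbI": (11, 0.22, 0.30),
}


def theoretical(n):
    """Theoretical difference of Clausius-Mossotti ratios, 2*pi / (1.74 (n + 1))."""
    return 2.0 * math.pi / (1.74 * (n + 1.0))


for salt, (n, experimental, tabulated) in table_6_6.items():
    print(f"{salt:5s} n={n:<5} theory={theoretical(n):.2f} "
          f"table={tabulated:.2f} experiment={experimental:.2f}")
```

In every row the theoretical value exceeds the experimental one, by a margin that ranges from a few per cent for the lithium and sodium fluorides to roughly a third for the heavier salts.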

TABLE 6.6
MOLAR IONIC POLARIZABILITIES OF VARIOUS ALKALI HALIDES

Salt     Repulsive          Experimental                                                                              Theoretical
         coefficient, n     $\frac{\epsilon_{rs} - 1}{\epsilon_{rs} + 2} - \frac{\epsilon_{rl} - 1}{\epsilon_{rl} + 2}$

LiF       6                 0.50                                                                                      0.52
LiCl      7                 0.40                                                                                      0.45
NaF       7                 0.43                                                                                      0.45
NaCl      8                 0.31                                                                                      0.40
KCl       9                 0.28                                                                                      0.36
KI       10.5               0.21                                                                                      0.31
RbBr     10                 0.27                                                                                      0.33
RbI      11                 0.22                                                                                      0.30

6.13 PRIMARY STATIC CHARGES IN AN INFINITE, HOMOGENEOUS, ISOTROPIC MEDIUM

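The results achieved so far in this development of a theory of dielectric behavior permit
consideration of a problem of some theoretical interest. Imagine that a system of static
primary charges of density $\rho$ is contained within a finite volume $V_1$, this volume being
part of an infinite homogeneous and isotropic medium. In this event the total macroscopic field at any point in space is given by

$$\mathbf{E} = -\nabla_F \left[ \int_{V_1} \frac{\rho\, dV}{4\pi\epsilon_0 \ell} + \oint_{S_\infty} \frac{P_n\, dS}{4\pi\epsilon_0 \ell} + \int_{V_\infty} \frac{-\nabla_S \cdot \mathbf{P}\, dV}{4\pi\epsilon_0 \ell} \right] \qquad (6.68)$$

in which it has been assumed that the field caused by the primary charges has induced
in the medium a static polarization density $\mathbf{P}$ which is linearly proportional to the field.
The surface integral over $S_\infty$ vanishes because $S_\infty$ may be taken to be a sphere of
radius $R \to \infty$, and over $S_\infty$ the polarization $\mathbf{P}$ is R-dependent to a greater negative
power than -2.† Therefore (6.68) may be written

$$\mathbf{E} = -\nabla_F \int_{V_\infty} \frac{(\rho - \nabla_S \cdot \mathbf{P})\, dV}{4\pi\epsilon_0 \ell} \qquad (6.69)$$

But $\mathbf{P}$ has been assumed to be linearly proportional to the macroscopic $\mathbf{E}$ field so that

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} = \epsilon_0\mathbf{E} + \chi_e\epsilon_0\mathbf{E}$$

$$\nabla \cdot \mathbf{D} = \epsilon_0 \nabla \cdot \mathbf{E} + \nabla \cdot \mathbf{P} = \epsilon_0(1 + \chi_e)\nabla \cdot \mathbf{E}$$

$$\rho - \nabla \cdot \mathbf{P} = \epsilon_0 \nabla \cdot \mathbf{E} = \frac{\rho}{1 + \chi_e}$$

† This conclusion may be reached by permitting the minutest bit of loss in the medium during establishment of the field.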


Making this substitution in (6.69), one obtains

$$\mathbf{E} = -\nabla_F \int_{V_1} \frac{\rho\, dV}{4\pi\epsilon_0(1 + \chi_e)\ell} = -\nabla_F \int_{V_1} \frac{\rho\, dV}{4\pi\epsilon\ell} \qquad (6.70)$$

and it is as though no polarized molecules were present, but Coulomb's law and all its
consequences were valid with $\epsilon_0$ replaced by $\epsilon$ and only the primary charges considered.
For gases such as air in which $\chi_e$ is very small, the impractical restriction of Chapter 3,
that only primary charges in a vacuum would be considered, may be removed with no
practical alteration in any of the results.

6.14 FERROELECTRIC CRYSTALS

For all the dielectric materials considered in the preceding sections, the displacements
of centers of charge and the accompanying polarization were linearly proportional to
the externally applied electric field, and disappeared upon removal of the field. However, not all dielectric materials behave in this manner. In recent years several classes
of ionic crystals have been discovered to have the property that, when the centers of
positive and negative charge have been pulled apart sufficiently, they remain locked
in their new positions, causing the molecules to become permanently polarized. These
crystals then display polarization without benefit of an external field, and are said to
be spontaneously polarized.
It is possible to cause a specimen of such material to become uniformly polarized,
but a more common initial situation is one in which subvolumes of the specimen, called
domains, are individually uniformly polarized, with the direction of polarization varying from one domain to another in a random manner.25 When the specimen is in this
condition it shows no net bulk polarization. If now an external electric field is applied,
those domains whose polarization is aligned with the field grow at the expense of the
other domains, as individual molecules change the direction of their polarization. A
plot of the net polarization, as the intensity of the applied electric field is increased,
will take the form of the curve oab in Figure 6.14. At first the slope of this curve keeps
increasing, as the growth of favorable domains assists the external field in the realigning
process. Then an inflection point is reached, and the slope decreases as saturation sets
in until, at point b, only favorably oriented domains are left and no further realignment is possible. Additional increase of the electric field then causes only a slight linear
change in P (portion bc), this occurring because of the normal electronic and ionic
polarization effects discussed previously. An extrapolation of the linear segment bc
intersects the P axis at a value P s , which is the spontaneous polarization density each
domain individually possessed in the random state, and which the organized specimen
now exhibits.
If the electric field is decreased, some nonaligned domains spring up and begin to
grow, and the curve cbe is traced out. However, when the E field is totally removed,
a remanent polarization oe still exists and the specimen shows a bulk polarization in
the absence of an external field. The electric field must be reversed and reach a value
of (the segment from o to f in Figure 6.14) before the net polarization is reduced to zero. If the reversed field is further increased,
the curve fg is followed, saturation once again setting in when those domains with
polarization parallel to E have grown to engulf the entire specimen. Another reversal
of the electric field results in the curve ghic, thus completing the cycle.
The similarities between this process and the corresponding one involving ferromagnetic materials explain why dielectrics which display this hysteresis effect are called
ferroelectrics. The choice of the word ferroelectric is not completely appropriate because
the microscopic mechanisms are dissimilar, as will become evident in Chapter 7.

25 These domains can be observed through a microscope using polarized light. See, e.g., P. W. Forsbergh, Jr., "Domain Structures and Phase Transitions in Barium Titanate," Phys Rev, 76, 1187-1201;
1949.
FIGURE 6.14 Hysteresis curve for ferroelectric crystal. [Axes: polarization P versus electric field E.]

The nature of the hysteresis curve of Figure 6.14 points up the fact that there is no
linear relation between polarization and field for ferroelectric crystals, and that the
relation is not even single-valued. One can define a differential dielectric "constant"
by the equation

$$\epsilon_r = 1 + \frac{1}{\epsilon_0}\frac{dP}{dE} \qquad (6.71)$$

but $\epsilon_r$ is not a constant, and its value depends on the previous history of the specimen,
as well as on the value of E. In speaking of "the dielectric constant" of a ferroelectric
crystal, what is usually meant is the value of $\epsilon_r$ deduced from (6.71) when $dP/dE$ is
found for the virgin curve oab at the origin. It is a characteristic of ferroelectric crystals
that $\epsilon_r$ so determined is quite high, with values in the range 500-5,000 being not
uncommon.
This phenomenon of spontaneous polarization has some features akin to orientational polarization, but the two mechanisms differ in several fundamental ways.
Orientational polarization disappears when the external field is removed; spontaneous
polarization does not. Orientational polarization is due to permanent dipole moments
possessed by the individual molecules. These molecules orient themselves in a random
succession of directions due to thermal agitation and therefore orientational polarizability is temperature dependent. However, in ferroelectric crystals, the individual
molecules have permanent dipole moments which, within a given domain, are all
aligned steadily, thermal agitation not affecting this orientation.
Ferroelectric crystals do exhibit a different type of temperature effect, however. The
spontaneous polarization usually disappears above a certain characteristic temperature $\theta_f$, called the ferroelectric Curie temperature. The reason for this can be traced
to a change in crystal structure such that individual molecules no longer possess permanent dipole moments. Above the Curie temperature the dielectric constant is found
to vary with temperature in such a way as to obey the Curie-Weiss law

$$\epsilon_r = \frac{C}{T - \theta} \qquad (6.72)$$

in which $C$ and $\theta$ are constants characteristic of the crystal in question. $\theta$ is usually a
few degrees below the Curie temperature $\theta_f$.

FIGURE 6.15 Spontaneous polarization of potassium dihydrogen phosphate, KH$_2$PO$_4$. [After Arx and Bantle, Helv Phys Acta, 16, 211; 1943.] [Axes: $P_s$ versus T(°K), 100 to 125 °K.]

Several different groups of ferroelectric crystals may be classified on the basis of
their chemical composition, structure, and electrical behavior:
1. Dihydrogen phosphates and arsenates of the alkali metals. Typical of this group
is KH$_2$PO$_4$, whose spontaneous polarization density as a function of temperature is
given in Figure 6.15. The Curie temperature is seen to be 123°K, which is representative
of the entire group, and so low as to limit seriously their practical applications. The
shape of this curve is similar to one showing the spontaneous magnetization of
iron, as shall be seen in Chapter 7.
2. The tartrates. Characteristic of this group, and the first solid to be recognized
as possessing ferroelectric properties, is Rochelle salt (NaKC$_4$H$_4$O$_6$ · 4H$_2$O). This salt
was first prepared in 1672 by the French pharmacist Seignette who lived in La Rochelle,
and for this reason ferroelectricity is referred to by some authors as Seignette electricity. Figure 6.16 gives the spontaneous polarization density of Rochelle salt as a
function of temperature and reveals two transition temperatures, at 255°K and 296°K.
In the ferroelectric phase the crystal is monoclinic but above 296°K and below 255°K
it has an orthorhombic structure. The spontaneous polarization occurs along one
polar axis only and therefore the dipole moment is limited to two opposite directions,
resulting in a relatively simple domain structure. When measured along this axis, the
relative dielectric constant is found to reach values as great as 4,000 in the neighborhood of the two transition temperatures. Transverse to this axis, the relative dielectric
constant is very low.

FIGURE 6.16 Spontaneous polarization of Rochelle salt. [After Hablutzel, Helv Phys Acta, 12, 489; 1939.] [Axes: $P_s$ versus T(°K), 250 to 300 °K.]

3. The GASH group. So-named after guanidine aluminum sulphate hexahydrate
(CN$_3$H$_6$)Al(SO$_4$)$_2$ · 6H$_2$O, which was discovered to be ferroelectric in 1955, this group
contains a wide variety of crystals. They have the common property of possessing
hydrogen bonds of the type O-H-O or O-H-N which link together deformable
ions such as (SO$_4$)$^{2-}$, (SeO$_4$)$^{2-}$, etc. The list includes certain alums and many glycine
compounds. GASH-type crystals are characteristically "soft," which means that they
are water soluble, have a low melting or decomposition temperature, and are physically
soft at room temperature. They exhibit relative dielectric constants of the order
of 5 or 6 and do not have a detectable Curie point, presumably because elevation of
the temperature occasions the loss of water of crystallization, thus preventing reproducible results. GASH itself is trigonal and has a spontaneous polarization of 0.35 microcoul/cm$^2$ at room temperature. An attractive feature is its square hysteresis loop.
4. The oxygen octahedron group. Prominent among these crystals is barium titanate,
BaTiO$_3$. Above the Curie temperature $\theta_f$ = 393°K, it possesses the cubical structure
suggested by Figure 6.17. This is called the perovskite structure, after the prototype
mineral CaTiO$_3$. The Ba$^{2+}$ ions are seen to occupy the corners of a cube, the centers
of the six faces being occupied by O$^{2-}$ ions. The oxygen ions thus form an octahedron,
at the center of which is found the Ti$^{4+}$ ion.
Below the Curie temperature the structure is no longer cubic. The material becomes
spontaneously polarized in a direction parallel to one of the cube edges, and along this
direction the crystal expands, whereas transverse to this direction it contracts, resulting in a tetragonal structure. The Ti$^{4+}$ ion is no longer at the center of gravity of the
O$^{2-}$ ions, having suffered a sizable relative displacement which, coupled with its 4e
charge, can account for the high resulting polarization density. Since there are six possible directions of spontaneous polarization, the domain structure of barium titanate is
more complex than that of Rochelle salt.

FIGURE 6.17 The perovskite crystal structure of barium titanate. [Legend: Ba$^{2+}$, Ti$^{4+}$, O$^{2-}$.]

Figure 6.18 depicts the spontaneous polarization density and relative dielectric constant of BaTiO$_3$ versus temperature, when measured along a cube edge. Two additional
transition temperatures are observed. These occur at 278°K, where the spontaneous
polarization changes to a direction parallel to a face diagonal, and at 193°K, where
it changes direction again and becomes parallel to a body diagonal.
The possibility of spontaneous polarization in these ferroelectric crystals is suggested
by Equation (6.49). If $N\alpha\gamma/\epsilon_0 = 1$, the indication is that a finite value of $P$ can exist
with $E = 0$. To achieve this condition, one requires a high value of the product of $N$
(liquid or solid state), of $\gamma$ (large local field), and of the polarizability $\alpha$. This favorable
combination of factors is peculiar to a limited class of crystals.
A possible cause for the existence of a Curie temperature in these crystals may also
be traced to Equation (6.49), which can be combined with (6.50) and (6.53) to give

$$\epsilon_r - 1 = \frac{N\alpha/\epsilon_0}{1 - N\alpha\gamma/\epsilon_0} \qquad (6.73)$$

If it is assumed that $\alpha$ and $\gamma$ are independent of temperature, then $\epsilon_r$ depends on temperature in a manner controlled by $N$. But the particle density $N$, which is inversely
proportional to the volume, is related to the volume coefficient of expansion $\lambda$ by the
expression

$$\frac{dV/V}{dT} = \lambda = -\frac{dN/N}{dT} \qquad (6.74)$$

Therefore as the temperature is decreased, the particle density increases. If at some
temperature $T$ the quantity $N\alpha\gamma/\epsilon_0$ is only slightly less than unity, cooling the crystal
to a lower temperature $\theta_f$ may cause sufficient contraction to make $N\alpha\gamma/\epsilon_0 = 1$, with
the result that spontaneous polarization may occur. In a small range of temperature
just above $\theta_f$, this would suggest that changes in the dielectric constant are related
to the volume expansion of the crystal.
FIGURE 6.18 Spontaneous polarization and dielectric constant (along cube edge) of barium titanate as functions of temperature. [Axes: $P_s$ and $\epsilon_r$ versus T(°K), 120 to 420 °K.]

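If Equation (6.73) is differentiated with respect to temperature, one obtains

$$\frac{dN/N}{dT} = -\lambda = \frac{d\epsilon_r/dT}{(\epsilon_r - 1)[\gamma(\epsilon_r - 1) + 1]}$$

In the neighborhood of $\theta_f$ the relative dielectric constant $\epsilon_r \gg 1$; since $\gamma$ is of the order
of unity,

$$\frac{d\epsilon_r}{\epsilon_r^2} \cong -\lambda\gamma\, dT$$

Formation of the integral of this expression gives

$$\int_{\epsilon_r(\theta_f)}^{\epsilon_r} \frac{d\epsilon_r}{\epsilon_r^2} = -\lambda\gamma \int_{\theta_f}^{T} dT$$

Since $\epsilon_r(\theta_f) \to \infty$, this reduces to

$$\epsilon_r = \frac{1/\lambda\gamma}{T - \theta_f} \qquad (6.75)$$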

which is the form of the Curie-Weiss law (6.72); the agreement is not perfect, in that
experimentally $\theta$ is usually found to be several degrees below the Curie temperature
$\theta_f$. However, $(\lambda\gamma)^{-1}$ appears to be the right order of magnitude for the Curie constant
$C$. For example, barium titanate has an observed Curie constant of approximately $10^5$
and an expansion coefficient $\lambda = 3 \times 10^{-5}$ per degree, which would give $\gamma = 1/3$, a
reasonable value.
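The estimate of $\gamma$ and the quality of the approximation leading to (6.75) can both be checked numerically. The Python sketch below is illustrative only. It assumes that $N\alpha\gamma/\epsilon_0$ equals unity exactly at $\theta_f$ and that $\lambda$ is constant, so that (6.74) integrates to $N \propto e^{-\lambda(T - \theta_f)}$; the function names are introduced for this example.

```python
import math

lam = 3.0e-5         # volume expansion coefficient per degree (value quoted in the text)
curie_const = 1.0e5  # observed Curie constant of barium titanate (value quoted in the text)

# Identify C with 1/(lambda * gamma), as in (6.75).
gamma = 1.0 / (lam * curie_const)


def eps_r_exact(delta_t):
    """eps_r from (6.73) at T = theta_f + delta_t.

    Assumes N*alpha*gamma/eps0 = 1 at theta_f and N ~ exp(-lam * delta_t).
    """
    u_gamma = math.exp(-lam * delta_t)  # value of N*alpha*gamma/eps0
    return 1.0 + (u_gamma / gamma) / (1.0 - u_gamma)


def eps_r_curie_weiss(delta_t):
    """eps_r from the approximate law (6.75)."""
    return 1.0 / (lam * gamma * delta_t)


print(f"gamma = {gamma:.3f}")
for delta_t in (1.0, 10.0, 50.0):
    exact = eps_r_exact(delta_t)
    approx = eps_r_curie_weiss(delta_t)
    print(f"T - theta_f = {delta_t:5.1f}   exact {exact:10.1f}   Curie-Weiss {approx:10.1f}")
```

Under these assumptions the two expressions agree to a small fraction of one per cent within tens of degrees of $\theta_f$, which supports dropping unity in comparison with $\epsilon_r$ in the derivation.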
The high value of the polarizability $\alpha$ exhibited by all ferroelectrics may be explored
further by returning to a discussion of crystal structure. In the case of barium titanate,
the cubical cell of Figure 6.17 has been measured and is found to have an edge dimension
of 4.00 Å. Since Figure 6.18 indicates a spontaneous polarization density at room temperature of 0.16 coul/m$^2$, the equivalent dipole moment per unit cell of barium titanate is

$$p = P_s\, dV = (0.16)(4 \times 10^{-10})^3 = 10^{-29} \text{ coul-m}$$

If one assumes that this dipole moment is entirely due to a shift of the titanium ion
relative to the tetragonal array of oxygen ions, then this displacement is

$$d = \frac{10^{-29}}{4 \times 1.6 \times 10^{-19}} = 1.6 \times 10^{-11} \text{ m} = 0.16 \text{ Å}$$

which is a reasonable result. The actual situation may be more complicated, since some
relative shift of the barium ions and the oxygen ions is possible.26
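The arithmetic of this estimate is easily reproduced. The following Python lines are a sketch using only the numbers quoted in the text; the variable names are chosen for this illustration.

```python
P_s = 0.16             # spontaneous polarization density at room temperature, coul/m^2
cell_edge = 4.00e-10   # edge of the cubical unit cell, m (4.00 angstroms)
e_charge = 1.6e-19     # magnitude of the electronic charge, coul (value used in the text)

# Dipole moment per unit cell: polarization density times cell volume.
p_cell = P_s * cell_edge ** 3

# Displacement of the Ti(4+) ion, whose charge is 4e, that would give this moment.
d = p_cell / (4.0 * e_charge)
d_angstrom = d * 1.0e10

print(f"dipole moment per cell: {p_cell:.3e} coul-m")
print(f"titanium displacement:  {d_angstrom:.2f} angstrom")
```

The displacement is about four per cent of the cell edge, which is why the text regards it as reasonable.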
It is interesting to observe that if all the permanent dipole moments in liquid water
were aligned, the result would be a polarization density $P_0 \approx 0.10$ coul/m$^2$, a figure
which is comparable to the spontaneous polarization density for barium titanate. The
significant difference is that in water the dipole moments are randomly oriented whereas
within a domain of barium titanate they are all parallel. Another feature of interest
in barium titanate is that its electronic polarizability is not insignificant.
The unusual properties of ferroelectric crystals have led to their use in a variety
of practical applications. The nonlinear relation between P and E permits the design
of devices such as rectifiers; the polarity of the remanent polarization (oe or oh in
Figure 6.14) permits the storage of binary information in a computer memory; and
the high relative dielectric constant permits large capacities in small volumes.

6.15 PIEZOELECTRICS

In Example 6.4 the polarizability of NaCl was examined by assuming that a static
electric field was applied parallel to one axis of the cubical crystal. Relative displacements of the negative and positive ionic lattices were then hypothesized and a new
balance of forces was established, from which an equilibrium displacement could be
deduced. If second-order effects had also been considered, it would have been found
that the longitudinal shift of the two sets of ions caused transverse force components
which resulted in a slight lateral compression. This in turn permitted an elongation
of the crystal in a direction parallel to the field. If the electric field had been reversed,
the same mechanical deformation would have occurred. This phenomenon is called
electrostriction and is common to all dielectric solids.

26 See, e.g., A. J. Dekker, Solid State Physics, Sec. 8.6, Prentice-Hall, Inc., Englewood Cliffs, New
Jersey, 1957.
FIGURE 6.19 Mechanical deformation of NaCl crystal.

If the crystal possesses a center of symmetry for the constituent charges, this is not
a reciprocal effect, in the sense that a mechanical deformation will not cause a net
polarization of the specimen. The reason for this is that an applied mechanical force
acts on elements of mass but does not distinguish between the two types of charge.
Taking NaCl as an example, Figure 6.19 shows the deformation of a local region of the
crystal when it is placed in compression. The arrows indicate the directions of displacement of the six Cl$^-$ ions relative to the central Na$^+$ ion. By symmetry these displacements are equal and opposite in pairs, so that the electrical center of the six Cl$^-$ ions
still coincides with the position of the Na$^+$ ion and there is no polarization.


However, not all crystals contain a center of symmetry. Consider for example the
hexagonal array of ions shown in Figure 6.20a. Starting from any ion, if a vector is
drawn to any other ion, the negative of this vector does not reach a similar ion, so this
crystal is not center-symmetric. In the undisturbed state the equivalent centers of
the three positive and three negative ions, shown connected by solid bond lines, coincide at the point P and there is no polarization. However, if the crystal is placed in
compression, as shown in Figure 6.20b, the hexagon flattens. Since the transverse
expansion is smaller than the longitudinal contraction, the equivalent center of negative charge shifts to the right, the equivalent center of positive charge shifts to the left,
and a net transverse polarization appears. Alternatively, if the crystal is placed in
tension, the hexagon elongates, as shown in Figure 6.20c, and a transverse polarization
of the opposite sense occurs.

FIGURE 6.20 Mechanical deformation of a crystal lacking a center of symmetry. [Panels (a), (b), (c).]
This phenomenon, in which a mechanical deformation causes an electric polarization,
is called piezoelectricity, after the Greek word piezein which means to press. The effect
was discovered in 1880 by Pierre and Jacques Curie and is reciprocal: an electric field which causes
the polarization of Figure 6.20b will also cause a contraction transverse to itself; an
electric field which results in the polarization of Figure 6.20c will similarly cause an
elongation transverse to itself.
Of the 32 different classes of crystals, 20 lack a center of symmetry and hence are
possible piezoelectrics. (Quartz and Rochelle salt are two common materials which
exhibit this effect.) Such crystals are important because they permit the conversion
of mechanical energy into electrical energy and vice versa. Practical applications
include the crystal pickup in a phonograph, the stable frequency source, and the transducer, in which an electrical voltage is used to drive a piezoelectric crystal at its mechanical resonant frequency, thus serving as a source of ultrasonic radiation.
It should also be noted that when a piezoelectric crystal is heated or cooled, a charge
separation appears across its faces. This effect is known as pyroelectricity and the charges
are evidence of an internal polarization, caused by the strain associated with thermal
expansion or contraction of the crystal.
Finally, mention should be made of a class of materials known as electrets. These
are organic materials which, when solidified in the presence of a steady electric field,
appear to have a net electric moment "frozen in." Despite the fact that these moments
may persist for long periods of time, it is believed that this is a metastable state.

6.16 TIME-HARMONIC FIELDS AND COMPLEX PERMITTIVITY


In the preceding sections dielectric materials were studied when under the influence
of static electric fields. Three polarization mechanisms were identified (electronic, ionic,
orientational) in which the induced dipole moments normally were linearly proportional to the exciting field, and disappeared when the field was removed. A fourth
mechanism (spontaneous polarization) was found to occur in a limited class of materials, the ferroelectric crystals, and in such materials the relation between polarization
density and external field was generally nonlinear. Beginning with this section, these
earlier static results will be extended to the case in which the exciting field is time-harmonic, but the analysis will be restricted to linear materials. Thus electronic, ionic,
and orientational polarization will be considered, but the reaction of ferroelectric
crystals to a time-harmonic excitation will not be treated.
If an externally applied time-harmonic electric field has persisted within a dielectric
for a sufficient length of time, the induced polarization density P must also be periodic
in time. However, the displacement of charges associated with this polarization usually
shows some inertia, so that P may not be in phase with the local field. It is also possible
that P may not be parallel to the local field. An example of this was noted in the case
of the hexagonal crystal structure of Figure 6.20. However, it will be assumed in all
that follows that the magnitude of P is linearly proportional to the magnitude of the
local electric field.
It is shown in Appendix L that if $\mathbf{P}(\xi,\eta,\zeta)e^{j\omega t}$ is the induced polarization, with $\mathbf{P}$ a
complex vector, then the macroscopic potential function due to all the dipole moments
within a dielectric specimen of volume $V$ is

$$\Phi(x,y,z,t) = \oint_S \frac{\{P_n\}\, dS}{4\pi\epsilon_0 \ell} + \int_V \frac{-\nabla_S \cdot \{\mathbf{P}\}\, dV}{4\pi\epsilon_0 \ell} \qquad (6.76)$$

in which $S$ is the bounding surface of the specimen and $\ell$ is drawn from $dS$ or $dV$ to
$(x,y,z)$. $\{\mathbf{P}\}$ is the time-retarded value of $\mathbf{P}$. This result is seen to be similar to the
static formula (6.8), the only difference being that the polarization is now time-harmonic and retardation must be incorporated. Equation (6.76) normally is valid for
points within the dielectric as well as without, and permits the interpretation that the
dielectric specimen is equivalent in its electrical action to a surface charge distribution
$P_n$ plus a volume charge distribution $-\nabla_S \cdot \mathbf{P}$.
Paralleling the development of Section 6.4, one can consider a distribution of primary
charges, of density $\rho(\xi,\eta,\zeta)e^{j\omega t}$, occupying a volume $V_1$, and an assortment of dielectric
materials which fill a volume $V_2$, with $\mathbf{P}(\xi,\eta,\zeta)e^{j\omega t}$ the density of induced dipole moments.
Both $\rho$ and $\mathbf{P}$ are in general complex functions and the resulting macroscopic electric
field $\mathbf{E}$ is such that

$$\nabla \cdot (\epsilon_0\mathbf{E}) = \rho - \nabla \cdot \mathbf{P}$$

It is once again advantageous to introduce a generalized macroscopic electric flux
density function $\mathbf{D}$ by the relation

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} \qquad (6.77)$$

In (6.77) all three vectors are in general complex and $\mathbf{P}$ need not be parallel to $\mathbf{E}$, in
which case $\mathbf{D}$ is not parallel to either of them. Once again, $\mathbf{D}$ satisfies the divergence
relation

$$\nabla \cdot \mathbf{D} = \rho$$

and is thus a flux field whose discontinuities are associated with the positions of the
primary charges.
Many other results of the static theory may be extended to the time-harmonic case.
Equation (6.31), which connects the local and external fields via the polarization, is
valid if the external field constant $\gamma$ is taken to be a complex constant (or perhaps a
complex tensor). Equations (6.48) and (6.49) also apply if the polarizability $\alpha$ is treated
as complex, so that once again $\mathbf{P} = \chi_e\epsilon_0\mathbf{E}$ and (6.77) may be written

$$\mathbf{D} = \epsilon\mathbf{E}$$

in which

$$\epsilon = \epsilon_0(1 + \chi_e) = \epsilon' - j\epsilon'' \qquad (6.78)$$

is the complex permittivity, $\epsilon'$ and $\epsilon''$ being its real and imaginary parts; $\chi_e$ is now the
complex dielectric susceptibility. As before, the relative dielectric constant $\epsilon_r$ can be
defined by $\epsilon_r = \epsilon/\epsilon_0$. It will be found in what is to follow that $\alpha$, $\chi_e$, $\epsilon$, and $\epsilon_r$ are all
complex functions of frequency, and, in general, each may be a tensor, since $\mathbf{P}$ is not
necessarily parallel to $\mathbf{E}$.

6.17 TIME-HARMONIC ELECTRONIC POLARIZABILITY

The complex polarizability $\alpha$ can be considered first for simple materials, such as monoatomic gases and elemental solids, in which the only polarization mechanism is electronic. Returning to the atomic model of Section 6.6, let it be assumed that a single
atom consists of a nucleus, of charge $Ze$, and an electron cloud of total charge $-Ze$
distributed uniformly throughout a sphere of radius $r_a$. Since the nucleus is several
thousand times more massive than the electron cloud, the motion forced on the nucleus
by an external time-harmonic field can be ignored in comparison to the periodic oscillations of the cloud itself. If attention is focused first on the case of monoatomic hydrogen, the differential equation which describes the motion of the electron cloud can be
deduced by the following argument: Let the cloud be displaced an amount $\mathbf{r}$ relative
to the nucleus, after which all external forces are removed. In accordance with Equation
(6.32), the cloud will experience a restoring force given by

$$\mathbf{F}_r = -\frac{e^2}{4\pi\epsilon_0 r_a^3}\,\mathbf{r} = -a\mathbf{r}$$

in which $a$ is called the restoration coefficient. If there were no damping, the resulting
motion would be governed by the differential equation of an harmonic oscillator,
namely,

$$m\ddot{\mathbf{r}} = -a\mathbf{r} \qquad (6.79)$$

with $m$ the mass of an electron. Solutions to this equation are periodic at a natural
angular frequency $\omega_0 = (a/m)^{1/2}$. If one takes $r_a$ as $0.5 \times 10^{-10}$ m (a good approximation for the ground state) and uses $m = 0.91 \times 10^{-30}$ kg for the mass of an electron,
the estimate can be made that $10^{16} < \omega_0 < 10^{17}$ and therefore that the natural resonant
frequency of electronic oscillation for a single hydrogen atom lies in the ultraviolet
portion of the spectrum.
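This estimate can be reproduced in a few lines. The Python sketch below evaluates the restoration coefficient and the natural frequency from the expressions above; the physical constants are standard values supplied here, not taken from the text, and the variable names are illustrative.

```python
import math

E_CHARGE = 1.602e-19   # electronic charge, coul
EPS0 = 8.854e-12       # permittivity of free space, F/m
M_ELECTRON = 9.109e-31 # electron mass, kg
C_LIGHT = 2.998e8      # speed of light, m/s

r_a = 0.5e-10          # radius of the electron cloud, m (value used in the text)

# Restoration coefficient a = e^2 / (4 pi eps0 r_a^3), in newtons per meter.
a = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * r_a ** 3)

# Natural angular frequency of the undamped oscillator (6.79).
omega0 = math.sqrt(a / M_ELECTRON)

# Free-space wavelength corresponding to omega0.
wavelength = 2.0 * math.pi * C_LIGHT / omega0

print(f"a          = {a:.3e} N/m")
print(f"omega_0    = {omega0:.3e} rad/sec")
print(f"wavelength = {wavelength * 1e9:.1f} nm")
```

The corresponding wavelength is a few tens of nanometers, well short of the visible range, consistent with the statement that the resonance lies in the ultraviolet.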
Equation (6.79) is inexact because it does not account for the damping caused by
emission of energy due to the time variation of velocity of the electron cloud. It is
shown in Appendix M that this damping can properly be included by taking as the
differential equation of free motion

$$m\ddot{\mathbf{r}} = -a\mathbf{r} - 2b\dot{\mathbf{r}} \qquad (6.80)$$

in which the damping constant $b$ is given by

(6.81)

It therefore follows that if a forcing electric field $\mathbf{E}_{loc}(x,y,z)e^{j\omega t}$ is present, the equation
of motion of the cloud satisfies

$$m\ddot{\mathbf{r}} = -a\mathbf{r} - 2b\dot{\mathbf{r}} - e\mathbf{E}_{loc}e^{j\omega t} \qquad (6.82)$$

in which $\mathbf{E}_{loc}$ may be treated as a complex constant in the small region occupied by the
atom. This differential equation has the particular solution $\mathbf{r}(t) = \mathcal{Re}\,\mathbf{A}e^{j\omega t}$, with the
complex amplitude $\mathbf{A}$ given by

$$\mathbf{A} = -\frac{(e/m)\mathbf{E}_{loc}}{\omega_0^2 - \omega^2 + j\omega(2b/m)} \qquad (6.83)$$

If the forcing field has been present long enough, the damped complementary solution
to the homogeneous equation (6.80) may be ignored in comparison to the particular
solution (6.83). It then follows that the induced dipole moment (cf. Appendix L) is
given by

$$\mathbf{p} = -e\mathbf{A}e^{j\omega t} = \frac{(e^2/m)\mathbf{E}_{loc}e^{j\omega t}}{\omega_0^2 - \omega^2 + j\omega(2b/m)} \qquad (6.84)$$

so that the complex electronic polarizability is

$$\alpha_e = \frac{e^2/m}{\omega_0^2\{[1 - (\omega/\omega_0)^2] + j(2b/\omega_0 m)(\omega/\omega_0)\}} \qquad (6.85)$$

If $\alpha_e$ is separated into its real and imaginary parts according to the formula

$$\alpha_e = \alpha_e' - j\alpha_e''$$

and displayed as a function of frequency, the curves of Figure 6.21 result. The real part
is seen to be positive for $\omega/\omega_0 < 1$ and negative thereafter. For $\omega/\omega_0$ small it reduces to
the static polarizability given by (6.35), and it has two resonant peaks, symmetrically
disposed around $\omega/\omega_0 = 1$ and separated by $2b/\omega_0 m$. The imaginary part has a single
resonance centered around $\omega/\omega_0 = 1$ and it will be seen in Section 6.20 that this gives
rise to absorption of energy by the atom from the exciting field. Outside the region

$$1 - \frac{2b}{\omega_0 m} < \frac{\omega}{\omega_0} < 1 + \frac{2b}{\omega_0 m}$$

the imaginary part of the electronic polarizability is seen to be negligible; the real part
has essentially the static value below this region and is negligible above it. Upon placing
numerical values in (6.81), one finds that $2b/\omega_0 m \ll 1$ and therefore this resonance
is confined to a tight frequency region in the ultraviolet part of the spectrum.

FIGURE 6.21 Complex electronic polarizability of a single hydrogen atom. (Real and imaginary parts plotted against $\omega/\omega_0$.)
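The shape of these curves can be explored numerically. The sketch below is a hedged illustration rather than the text's own procedure: it divides the denominator of (6.85) by $\omega_0^2$, so that the only parameters are the normalized frequency `x` $= \omega/\omega_0$ and a dimensionless damping ratio `g` $= 2b/(m\omega_0)$. The value `g = 0.02` is hypothetical and far larger than the physical one, chosen only so that the resonance is wide enough to resolve on a coarse grid.

```python
def polarizability(x, g, alpha_static=1.0):
    """Normalized form of (6.85): alpha = alpha_static / (1 - x^2 + j g x).

    x is omega/omega_0, g is the assumed dimensionless damping ratio,
    alpha_static is the zero-frequency value e^2/(m omega_0^2).
    """
    return alpha_static / complex(1.0 - x * x, g * x)

def real_part_extrema(g, lo=0.8, hi=1.2, steps=40001):
    """Locate the maximum and minimum of the real part on a uniform grid."""
    xs = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    reals = [polarizability(x, g).real for x in xs]
    x_max = xs[reals.index(max(reals))]
    x_min = xs[reals.index(min(reals))]
    return x_max, x_min

g = 0.02  # hypothetical damping ratio, for illustration only
x_max, x_min = real_part_extrema(g)
print("real-part peaks at x =", x_max, "and", x_min)
print("alpha'' at resonance =", -polarizability(1.0, g).imag)
```

Since $\alpha_e = \alpha_e' - j\alpha_e''$, the loss term is the negative of the imaginary part returned by Python. The two extrema of the real part straddle $x = 1$ and are separated by very nearly `g`.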

When the higher energy eigenstates of a hydrogen atom are considered in this fashion,
the classical equivalent is to choose different values for $r_0$, which leads to a sequence of
harmonic oscillators with restoration coefficients $a_i$, damping coefficients $b_i$, and resonant frequencies $\omega_i$. The strengths of these harmonic oscillators are weighted, and the
result is a plot of complex electronic polarizability versus frequency which displays a
series of resonances of varied heights, spaced throughout the ultraviolet portion of the
spectrum; each of these resonances has the shape of Figure 6.21. At visible frequencies
and below, the electronic polarizability of a hydrogen atom is, for all practical purposes,
a pure real number given by the static formula (6.35).
If an atom of higher atomic number Z is considered, the spectral series of absorption
lines will be altered because of the more complex structure of the electron cloud.
Individual elements will exhibit electronic polarizabilities with resonances at characteristic frequencies, these frequencies predominantly occurring in the ultraviolet or
beyond. At frequencies below the ultraviolet, the total electronic polarizability will be
essentially the static real value.

6.18 COMPLEX IONIC POLARIZABILITY; TIME-HARMONIC PERMITTIVITY OF NON-POLAR MATERIALS

If two ions are set into harmonic motion by a time-harmonic field, the analysis of the
effect parallels what has been presented for electronic oscillations. In the case of ions
the restoring force is also (to first order) proportional to the displacement from equilibrium, with the restoration coefficient approximately given by

$$a = \frac{(ze)^2(n - 1)}{4\pi\epsilon_0 d_0^3}$$

in which $z$ is the ion valence, $n$ the repulsive exponent, and $d_0$ the spacing between ions.
If $m$ is the mass of an ion, the natural resonant frequency is once again given by
$\omega_0 = (a/m)^{1/2}$. Using typical values, one finds that $10^{14} < \omega_0 < 10^{15}$ so that natural
ionic oscillations are in the infrared part of the spectrum. As with the electronic case,
there is a small damping factor so that $\alpha$ is complex in the neighborhood of $\omega_0$, having
the static value below resonance and being essentially zero above resonance. For complex molecules, a series of resonances in the infrared will be noted.
In materials which exhibit only electronic and ionic polarization, the complex relative
dielectric constant is given by

$$\epsilon_r = 1 + \chi_e = \frac{1 + (1 - \gamma)N\alpha/\epsilon_0}{1 - \gamma N\alpha/\epsilon_0}$$

in which $\alpha = (\alpha_e' - j\alpha_e'') + (\alpha_i' - j\alpha_i'')$ is the sum of the complex electronic and ionic
polarizabilities per molecule, this equation arising from (6.51). For millimeter wavelengths and longer, $\epsilon_r$ for such materials is frequency independent, temperature independent (if the molecule density $N$ is constant), and equal to the static value discussed
in earlier sections of this chapter. These materials normally will show a complex
permittivity only in the infrared and ultraviolet regions of the spectrum and then only
in a series of isolated narrow bands of frequency. In the visible spectrum the ionic contribution will have dropped out, and the dielectric constant will equal what the static
value would be if there were only electronic polarization. Beyond the ultraviolet region
the permittivity is that of free space.

6.19 DIPOLAR RELAXATION


In the previous two sections the frequency dependence of electronic and ionic polarization has been discussed. It was found that dielectric materials which have only these
two kinds of polarizability show a permittivity which is frequency independent (and
equal to the static real value) at all wavelengths longer than infrared. However, many
materials do not fit this category. Due to the frequency dependence of their orientational
polarization (or other causes yielding similar effects), some materials exhibit complex
permittivities in the microwave range of frequencies and below, and thus give rise to
dielectric losses at wavelengths of practical interest to those concerned with electromagnetics. These materials include many liquids and glassy substances and even some
crystalline materials.
For such materials if a constant electric field E is suddenly applied, the polarization
shows considerable inertia and will not reach its static value immediately but instead
will approach it gradually, as suggested by Figure 6.22a. If after the ultimate value of
polarization $P_2$ has been achieved the field is suddenly removed, the polarization will
not vanish immediately but will decay as shown in Figure 6.22b.
This behavior can be explained by arguing that the almost instantaneous change in
polarization (either $P_1$ or $P_2 - P_3$) is due to the electronic and ionic contributions and
that the slow buildup or decay ($P_2 - P_1$ or $P_3$) is due to an orientational or equivalent
effect, with which there is associated relatively much more inertia. It has been noted
that the natural angular frequencies of electronic and ionic oscillations exceed $10^{14}$ rad/sec,
so the rise to $P_1$ or the fall from $P_2$ to $P_3$ can be expected to occur in a time interval
of the order of $10^{-14}$ sec. The orientational relaxation, on the other hand, may vary
from $10^{-12}$ sec to many hours, depending on the structure of the material and its
temperature.
FIGURE 6.22 Build-up and decay of polarization upon sudden application or removal of steady field. (a) Build-up of P to the levels $P_1$ and $P_2$; (b) decay from $P_2$ through $P_3$.

According to this analysis, if $\epsilon_{ei}$ is the (real) permittivity of the material at a frequency
just below the first infrared resonance, and if $\epsilon_s$ is the static (real) permittivity, then

$$P_1 = P_2 - P_3 = (\epsilon_{ei} - \epsilon_0)E \tag{6.86}$$

$$P_2 = (\epsilon_s - \epsilon_0)E \tag{6.87}$$

If the shape of the build-up or decay curve in Figure 6.22 is known, it is possible with the
aid of the boundary conditions (6.86) and (6.87) to determine the complex permittivity

$$\epsilon(\omega) = \epsilon'(\omega) - j\epsilon''(\omega)$$

in the range of angular frequencies below $\omega_i$. To see how this is accomplished,²⁷ imagine
that during the time interval between $u$ and $u + du$ a rectangular pulse of strength
$\mathbf{E}(u)$ has been applied to the dielectric, being zero outside this interval. By virtue of
the equation

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} \tag{6.88}$$

it follows that a displacement D will arise as a result of the application of this pulse,
which will be present at time $t > u + du$, but which will gradually subside. D is therefore a function of $t - u$ such that one may write

$$\mathbf{D}(t - u) = \mathbf{E}(u) f(t - u)\,du \qquad t > u + du \tag{6.89}$$

where $f(t - u)$ is the decay function which describes the gradual subsidence of D. In
general, $f(t - u)$ may be a tensor function.
As suggested by Figure 6.22, D contains a part which can follow the field almost
instantaneously and which is expressible by $\epsilon_{ei}\mathbf{E}$. Thus

$$\mathbf{D}(t - u) = \epsilon_{ei}\mathbf{E}(u) + \mathbf{E}(u) f(0)\,du \qquad u < t < u + du \tag{6.90}$$

27 This analysis draws upon a development given by Frohlich, Theory of Dielectrics, 2d ed., pp. 4-8,
70-73, Oxford Press, London, 1958.


in which it is assumed that $f$ is essentially constant during the short time interval $du$.
Imagine that at a later time interval between $u'$ and $u' + du'$ an additional rectangular
pulse $\mathbf{E}(u')$ is applied to the dielectric. Under the assumption of a linear theory, the
principle of superposition will apply and the resulting displacement $\mathbf{D}(t - u')$ will add
linearly to the earlier displacement $\mathbf{D}(t - u)$. If this process is extended to a continuous
time-dependent field $\mathbf{E}(t)$ which is initiated at the time $t = 0$, then the total displacement $\mathbf{D}(t)$ at any time $t > 0$ is given by

$$\mathbf{D}(t) = \epsilon_{ei}\mathbf{E}(t) + \int_0^t \mathbf{E}(u) f(t - u)\,du \tag{6.91}$$

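If this development is applied to time-harmonic fields so that $\mathbf{E} = \mathrm{Re}\,\mathbf{E}(x,y,z)e^{j\omega t}$, then

$$\begin{aligned}
\mathbf{D}(x,y,z,t) - \epsilon_{ei}\mathbf{E}(x,y,z)e^{j\omega t} &= \mathbf{E}(x,y,z)\int_0^t f(t - u)e^{j\omega u}\,du \\
&= \mathbf{E}(x,y,z)\int_0^t f(v)e^{j\omega(t - v)}\,dv \\
&= \mathbf{E}(x,y,z)e^{j\omega t}\int_0^t f(v)e^{-j\omega v}\,dv
\end{aligned} \tag{6.92}$$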

in which $v = t - u$ is a substitution variable. If it is assumed that the time-harmonic
field has existed long enough so that all transients have died out, and D is also periodic,
this implies that $t$ is large enough so that $f(t)$ is essentially zero. But this means that the
upper limit of integration in (6.92) can be extended to infinity with negligible error.
When this is done, (6.92) may be written in the form

$$\mathbf{D}(x,y,z,t) = \mathbf{E}(x,y,z)e^{j\omega t}\left[\epsilon_{ei} + \int_0^\infty f(v)e^{-j\omega v}\,dv\right] \tag{6.93}$$

Upon defining the complex permittivity by the relation $\mathbf{D} = \epsilon\mathbf{E}$, one finds that

$$\epsilon(\omega) = \epsilon'(\omega) - j\epsilon''(\omega) = \epsilon_{ei} + \int_0^\infty f(v)e^{-j\omega v}\,dv \tag{6.94}$$

Thus if the decay function f(v) is known, the frequency dependent portion of the complex permittivity may be deduced.
For many materials the simple exponential function

$$f(v) = Ae^{-v/\tau} \tag{6.95}$$

is found to be appropriate. In this equation $\tau$ is independent of time but may be
influenced by temperature; it is called the relaxation time and its reciprocal is referred to
as the relaxation frequency. $A$ is a constant which will now be determined.
If the special case of equilibrium in a static field is considered then $\omega = 0$, $\mathbf{D} = \epsilon_s\mathbf{E}$,
and (6.94) gives

$$\epsilon_s = \epsilon_{ei} + \int_0^\infty Ae^{-v/\tau}\,dv = \epsilon_{ei} + A\tau$$

so that $A = (\epsilon_s - \epsilon_{ei})/\tau$. Thus

$$f(v) = \frac{\epsilon_s - \epsilon_{ei}}{\tau}e^{-v/\tau} \tag{6.96}$$
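As an informal check on this decay function, one can feed it back into the superposition integral (6.91) for a constant field switched on at $t = 0$ and watch the displacement build up in the manner of Figure 6.22a. The sketch below works in relative units (permittivities in units of $\epsilon_0$, time in units of $\tau$); the function names and the sample values $\epsilon_s = 5$, $\epsilon_{ei} = 4$ are illustrative assumptions, chosen to match the ratio $\epsilon_s = 5\epsilon_{ei}/4$ used for the Debye curves.

```python
import math

def decay_function(v, eps_s, eps_ei, tau):
    """Exponential decay function of (6.96)."""
    return (eps_s - eps_ei) / tau * math.exp(-v / tau)

def displacement_step(t, e_field, eps_s, eps_ei, tau, steps=2000):
    """D(t) from (6.91) for a constant field applied at t = 0.

    The integral of f over (0, t) is evaluated with the midpoint rule.
    """
    h = t / steps
    integral = sum(
        decay_function((k + 0.5) * h, eps_s, eps_ei, tau) for k in range(steps)
    ) * h
    return e_field * (eps_ei + integral)

# Hypothetical sample values in relative units.
for t in (0.1, 1.0, 3.0, 30.0):
    print(t, displacement_step(t, 1.0, 5.0, 4.0, 1.0))
```

The numerical result follows $E[\epsilon_{ei} + (\epsilon_s - \epsilon_{ei})(1 - e^{-t/\tau})]$: an immediate jump to $\epsilon_{ei}E$ followed by a slow exponential rise to $\epsilon_s E$.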

When this decay function is inserted in (6.94), one obtains

$$\epsilon(\omega) = \epsilon_{ei} + \frac{\epsilon_s - \epsilon_{ei}}{1 + j\omega\tau} \tag{6.97}$$
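The step from (6.94) to (6.97) can be confirmed numerically by carrying out the integral directly. The following is a minimal sketch under stated assumptions: the infinite upper limit is truncated at 40 relaxation times, the midpoint rule is used, and all names and sample values are hypothetical.

```python
import cmath
import math

def debye_closed_form(omega, eps_s, eps_ei, tau):
    """Closed-form permittivity of (6.97)."""
    return eps_ei + (eps_s - eps_ei) / (1.0 + 1j * omega * tau)

def permittivity_from_decay(omega, eps_s, eps_ei, tau, v_max=40.0, steps=40000):
    """Evaluate (6.94) with the exponential decay function (6.96).

    The integral over v is truncated at v_max relaxation times and
    approximated by the midpoint rule.
    """
    h = v_max * tau / steps
    total = 0j
    for k in range(steps):
        v = (k + 0.5) * h
        f = (eps_s - eps_ei) / tau * math.exp(-v / tau)
        total += f * cmath.exp(-1j * omega * v) * h
    return eps_ei + total

for omega in (0.0, 0.5, 1.0, 3.0):
    numeric = permittivity_from_decay(omega, 5.0, 4.0, 1.0)
    exact = debye_closed_form(omega, 5.0, 4.0, 1.0)
    print(omega, numeric, exact)
```

Python's negative imaginary part corresponds to $+\epsilon''$ in the convention $\epsilon = \epsilon' - j\epsilon''$.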

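If this result is separated into its real and imaginary parts, it is found that

$$\epsilon'(\omega) = \epsilon_{ei} + \frac{\epsilon_s - \epsilon_{ei}}{1 + \omega^2\tau^2} \tag{6.98}$$

$$\epsilon''(\omega) = (\epsilon_s - \epsilon_{ei})\frac{\omega\tau}{1 + \omega^2\tau^2} \tag{6.99}$$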

These equations are customarily referred to as the Debye equations, and they are
illustrated in Figure 6.23 as functions of $\omega\tau$ for the case $\epsilon_s = 5\epsilon_{ei}/4$. The real part $\epsilon'$ is
seen to change smoothly from the value $\epsilon_s$ to the value $\epsilon_{ei}$ as $\omega$ is increased. The imaginary
part $\epsilon''$ has a broad maximum at $\omega\tau = 1$. Irrespective of the values of $\epsilon_s$ and $\epsilon_{ei}$, this
resonance is such that a decade change in $\omega\tau$ either side of unity causes a fivefold
decrease in $\epsilon''$.
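The two statements about the loss curve (a maximum at $\omega\tau = 1$ and a roughly fivefold drop one decade away) follow directly from (6.99) and are easy to verify. The sketch below uses hypothetical function names and relative permittivities; it is a check on the formulas, not a reproduction of Figure 6.23.

```python
def debye_real(wt, eps_s, eps_ei):
    """Real part (6.98) as a function of the product omega*tau."""
    return eps_ei + (eps_s - eps_ei) / (1.0 + wt * wt)

def debye_imag(wt, eps_s, eps_ei):
    """Imaginary part (6.99) as a function of the product omega*tau."""
    return (eps_s - eps_ei) * wt / (1.0 + wt * wt)

EPS_S, EPS_EI = 5.0, 4.0  # hypothetical values with eps_s = 5*eps_ei/4

# Scan omega*tau from 0.01 to 10 to locate the loss maximum.
grid = [0.01 * k for k in range(1, 1001)]
losses = [debye_imag(wt, EPS_S, EPS_EI) for wt in grid]
wt_peak = grid[losses.index(max(losses))]

ratio_above = debye_imag(1.0, EPS_S, EPS_EI) / debye_imag(10.0, EPS_S, EPS_EI)
ratio_below = debye_imag(1.0, EPS_S, EPS_EI) / debye_imag(0.1, EPS_S, EPS_EI)
print("loss peak at omega*tau =", wt_peak)
print("drop one decade above and below:", ratio_above, ratio_below)
```

Both ratios equal $101/20 = 5.05$ whatever values are chosen for $\epsilon_s$ and $\epsilon_{ei}$, since the factor $(\epsilon_s - \epsilon_{ei})$ cancels.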

FIGURE 6.23 Debye curves for a dielectric material with a single relaxation time ($\epsilon_s = 5\epsilon_{ei}/4$). ($\epsilon'$ and $\epsilon''$ plotted against $\omega\tau$.)


From these observations it may be concluded that maximum absorption of energy
from the field by such dielectric materials will occur at the relaxation frequency
$\omega = \tau^{-1}$. Sufficiently below this frequency, the absorption is negligible and the permittivity has the real value $\epsilon_s$; sufficiently above the relaxation frequency, the absorption is
once again negligible and the permittivity has the smaller real value $\epsilon_{ei}$. At the relaxation
frequency, $\epsilon = (\epsilon_s + \epsilon_{ei})/2 - j(\epsilon_s - \epsilon_{ei})/2$. For most materials to which this analysis
applies, $(\epsilon_s - \epsilon_{ei})/(\epsilon_s + \epsilon_{ei}) \ll 1$ so that this dielectric constant is only slightly complex.
Several physical models fit the assumption that the decay function $f(v)$ is representable
by a simple exponential term containing a single relaxation time $\tau$. Prominent among
these models is the case of a volume density $N$ of permanent dipoles of strength $\mu$ in a
liquid. In the presence of an externally applied periodic field, these dipoles attempt to
rotate back and forth in synchronism with the field, but are impeded in this effort by
their inertia and the thermal effects of their neighbors. The result is that there is a net
orientational polarization density $N\mathbf{p}_o$ which is periodic but lags the field, being dependent both on frequency and temperature. The analysis is similar to the one already
presented for static fields in Section 6.11 and gives results which agree best with experiment for dilute solutions of dipolar molecules in non-polar liquids.²⁸
Another physical model which fits the theory is concerned with dipolar solids.
Imagine such a solid to consist of permanently polarized molecules, each of which, due
to the internal crystalline field, possesses a number of equilibrium orientations which
are separated by potential barriers. In the simplest case only two equilibrium positions
(1) and (2) exist, with opposite dipole directions. At very low temperatures these
dipoles fall into an ordered arrangement due to interactions with each other. But as
the temperature is raised, the extent of this ordering decreases, changing from long-distance to short-distance, and finally disappearing altogether at sufficiently high
temperatures.
If one assumes that the temperature is sufficiently elevated, an individual dipole is
just as likely to be found in position (1) as in position (2), under the proviso that the
potential barrier is symmetrical. However, if an external static field E is introduced,
the energies which a dipole has in the two allowed states will now differ by $2\boldsymbol{\mu}\cdot\mathbf{E}$, in
which $\boldsymbol{\mu}$ is the dipole moment in state (1), $-\boldsymbol{\mu}$ being the dipole moment in state (2).
(Cf. Equation 6.40.) For this reason the equilibrium distribution will be altered and
there will now be more molecules in one state than in the other.
Let $N_1$, $N_2$ be the numbers of molecules in the two states at a given time and let
$w_{12}\,dt$ be the probability that a molecule in state (1) makes a transition to state (2)
during the time interval $dt$; similarly, let the probability for the reverse process be
$w_{21}\,dt$. Then

$$\dot{N}_1 = -w_{12}N_1 + w_{21}N_2 \qquad \dot{N}_2 = w_{12}N_1 - w_{21}N_2 \tag{6.100}$$

At equilibrium $\dot{N}_1 = \dot{N}_2 = 0$ which gives $N_1/N_2 = w_{21}/w_{12}$. However, in equilibrium

28 An analysis may be found in H. Frohlich, ibid., pp. 83-90. Section 11 of Frohlich's text also contains an excellent discussion of other physical models which fit the relaxation theory.


$N_1$ and $N_2$ must satisfy the Boltzmann distribution and thus

$$\frac{N_1}{N_2} = e^{2\boldsymbol{\mu}\cdot\mathbf{E}/kT}$$

If $\boldsymbol{\mu}\cdot\mathbf{E} \ll kT$, the perturbation from two equally populated states is small and

$$w_{12} = K\left(1 - \frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right) \qquad w_{21} = K\left(1 + \frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right) \tag{6.101}$$

with $K$ a constant whose value can be determined in the following manner: In the
absence of the field E let $\tau$ be the average time between when a dipole arrives in state
(1) and when it next arrives in state (2). Then in unit time, $\bar{w}_{12}N_1$ molecules leave state
(1), the same number arriving back, in which $\bar{w}_{12} = \bar{w}_{21}$ are the transition probabilities
for equally populated states. The average time per round trip per molecule is therefore
$2\tau = N_1/\bar{w}_{12}N_1 = 1/\bar{w}_{12} = 1/\bar{w}_{21}$. With $\boldsymbol{\mu}\cdot\mathbf{E}/kT$ small, $w_{12}$ and $w_{21}$ differ but little
from their field-free values $\bar{w}_{12}$ and $\bar{w}_{21}$. Therefore $K = 1/2\tau$.
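Consider next the application of a time-harmonic field $\mathbf{E}e^{j\omega t}$. Equations (6.100) are
once again applicable, but $N_1$ and $N_2$ are now functions of time. Using for (6.101)

$$w_{12} \cong \frac{1}{2\tau}\left[1 - \left(\frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right)e^{j\omega t}\right]$$

one may write for (6.100)

$$2\tau\dot{N}_1 = j2\omega\tau N_1 = -(N_1 - N_2) + (N_1 + N_2)(\boldsymbol{\mu}\cdot\mathbf{E}/kT)e^{j\omega t}$$

$$2\tau\dot{N}_2 = j2\omega\tau N_2 = (N_1 - N_2) - (N_1 + N_2)(\boldsymbol{\mu}\cdot\mathbf{E}/kT)e^{j\omega t}$$

These equations have the solution

$$N_1 - N_2 = \frac{N_1 + N_2}{1 + j\omega\tau}\left(\frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right)e^{j\omega t} \tag{6.102}$$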

If $N_1 + N_2 = N$ is the number of molecules per unit volume, then the net polarization
density is

$$\mathbf{P} = (N_1 - N_2)\boldsymbol{\mu} = \frac{(N\boldsymbol{\mu})(\boldsymbol{\mu}\cdot\mathbf{E}/kT)}{1 + j\omega\tau}e^{j\omega t} \tag{6.103}$$

and the orientational effect of these permanent dipoles in the solid makes a contribution
to the permittivity which is in the form of the second term of (6.97). Therefore this
model conforms with the assumption that the decay function is exponential with a
single relaxation time.
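The single relaxation time of the two-state model can also be seen by integrating the rate equations directly. The sketch below assumes the rate equations in the form $\dot{N}_1 = -w_{12}N_1 + w_{21}N_2$, $\dot{N}_2 = -\dot{N}_1$, with $w_{12} = (1 - x)/2\tau$ and $w_{21} = (1 + x)/2\tau$ for a static field, where $x = \boldsymbol{\mu}\cdot\mathbf{E}/kT$ is small. The forward-Euler scheme, the function name, and the sample numbers are illustrative choices, not part of the text.

```python
import math

def population_difference(n_total, x, tau, t_end, steps=10000):
    """Forward-Euler integration of the two-state rate equations.

    Starts from equal populations at t = 0 with a static field applied,
    and returns N1 - N2 at time t_end. x stands for mu.E/kT (assumed small).
    """
    n1 = n2 = n_total / 2.0
    dt = t_end / steps
    w12 = (1.0 - x) / (2.0 * tau)   # transition rate from state (1) to (2)
    w21 = (1.0 + x) / (2.0 * tau)   # transition rate from state (2) to (1)
    for _ in range(steps):
        flow = (-w12 * n1 + w21 * n2) * dt   # change in N1; N2 changes by -flow
        n1 += flow
        n2 -= flow
    return n1 - n2

N_TOTAL, X, TAU = 1.0, 0.01, 1.0   # hypothetical values
for t in (0.5, 1.0, 5.0, 20.0):
    print(t, population_difference(N_TOTAL, X, TAU, t))
```

The difference $N_1 - N_2$ rises as $Nx(1 - e^{-t/\tau})$ and settles at $Nx$, which is the zero-frequency limit of (6.102).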
The result (6.103) is based on the premise that in the absence of an external field the
two states are equally probable for any molecule. This premise loses its validity as
the temperature is lowered and short-distance order sets in. Interestingly, as the temperature is lowered still further so that the order is long-distance, the analysis once
again becomes valid. The reason for this is that when the lattice is completely ordered,
it contains two types of sites in which the dipoles are oppositely directed. Each molecule
has a second equilibrium position in which its energy is slightly higher than when in the
ordered position. But this differential in energy grows smaller as the temperature is
lowered and finally becomes negligible, so that the premise leading to (6.103) is once
again justified.


If there are several potential barriers separating more than two directions of a permanent dipole, then there should be a variety of $\tau$ values associated with the different
switching times. Similarly, if ions can occupy more than one position, these positions
being separated by potential barriers, a relaxation phenomenon occurs, leading to a
result in the form of (6.103). This latter effect appears to be the case in glassy substances
whose ions can be displaced over one or more interatomic distances.
FIGURE 6.24 Relative permittivity of ice vs. temperature and frequency, in cps. ($\epsilon_r$ from 0 to 90 plotted against T from -70°C to -10°C.) [After Smyth and Hitchcock, J Am Chem Soc, 54, 4631, 1932.]

The temperature dependence of permittivity for materials which fit assumptions such
as have been used in the foregoing analysis is more complicated than appears explicitly
in (6.103). The particle density $N$ is somewhat temperature-dependent in solids, but of
even more significance is the dependence on temperature of the relaxation time $\tau$. For
example, in ice $\tau$ increases sevenfold as the temperature is lowered from -5°C to
-22°C. The manner in which permittivity varies with temperature and frequency in
ice is shown in Figure 6.24. It is observed that at a given temperature as the frequency
is raised, the permittivity decreases until finally the orientational contribution vanishes;
at lower temperatures the frequency which must be reached is not so high because $\tau$ is
greater. It is reasonable that a permanent dipole should be able to switch its direction
in less time at higher temperatures. In water at room temperature the relaxation frequency is as high as $3 \times 10^{10}$ cps.
The Debye equations (6.98) and (6.99) are of such a form that knowledge for all
angular frequencies $\omega$ of either $\epsilon'$ or $\epsilon''$ permits complete determination of the other. This
conclusion is reached for any $f(v)$ and not just for (6.95), which led to the Debye equations. The general dependence of $\epsilon'$ and $\epsilon''$ on each other may be expressed by a pair of
integrals, known as the Kronig-Kramers relations.²⁹

6.20 DIELECTRIC LOSSES

If a static electric field is established in a region containing linear dielectric materials,
the stored energy may be deduced by the technique used in Section 3.19. All the primary
charges may be assembled in neutralizing pairs, with proper distribution over what is to
become an equipotential surface $\Phi_0$. The primary charges can then be moved along
what are to become D lines to their ultimate positions. Elements of work can be computed by using the component of E parallel to the D lines. The result is that the stored
energy is given by

$$W_E = \frac{1}{2}\int_V \mathbf{E}\cdot\mathbf{D}\,dV \tag{6.104}$$

in which $V$ is the volume of all of space. This differs from the earlier free-space formula
(3.151) only in that $\mathbf{D}$ has replaced $\mathbf{D}_0$. Each of these flux densities is associated with
surface densities of primary charge, which are the agencies whereby the energy is
introduced to the system, even though some of it ends up being stored in the induced
dipoles within the dielectrics. In (3.151) $\mathbf{E}$ and $\mathbf{D}_0$ were parallel and the dot product
notation was somewhat superfluous. However, in (6.104), for points within a dielectric,
E and D are not necessarily parallel, and it is only the component of E along the D
lines which is effective in the transfer of energy. For isotropic materials, in which E and
D are parallel, (6.104) reduces to

$$W_E = \int_V \frac{\epsilon_s E^2}{2}\,dV \tag{6.105}$$

wherein $\epsilon_s$ is the static nontensor dielectric constant. For this reason it is often stated
that the volume density of electrostatic stored energy is $\epsilon_s E^2/2$. This statement is misleading unless $\epsilon_s$ is independent of temperature. It can be shown³⁰ that (6.105) generally
represents the change in free energy of the system.†
If, after the static fields E, D of (6.104) have been established, the primary charge
distribution is changed infinitesimally by starting with additional infinitesimal charge
pairs on $\Phi_0$ and moving them to their destinations along the D lines, then the additional
energy put into the system is

$$dW = \int_V \mathbf{E}\cdot d\mathbf{D}\,dV \tag{6.106}$$

in which $dD = d\sigma$ is the charge density in transport during the process of changing the
primary charge distribution. If this process takes a time $dt$, then

$$dW = \int_V \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\,dt\,dV \tag{6.107}$$

† The free energy is an exact function of the state variables whose changes represent the maximum
work which can be done in an isothermal process.
29 See, e.g., C. Kittel, Quantum Theory of Solids, pp. 405-406, John Wiley and Sons, Inc., New York, 1963.
30 E.g., H. Frohlich, op. cit., pp. 9-12.

Imagine that, in a succession of infinitesimal time increments such as this, the
primary charge system is cyclically varied so that E and D, as they appear in (6.107),
are time-harmonic. Then

$$\mathbf{E} = \mathbf{E}(x,y,z)\cos[\omega t + \beta(x,y,z)]$$

$$\mathbf{D} = \epsilon'\mathbf{E}(x,y,z)\cos[\omega t + \beta(x,y,z)] + \epsilon''\mathbf{E}(x,y,z)\sin[\omega t + \beta(x,y,z)]$$

in which $\mathbf{E}(x,y,z)$ is a real vector function, $\beta(x,y,z)$ is a real phase angle, and $\epsilon'$, $\epsilon''$ are
the real and imaginary parts of the complex permittivity. (They may be tensors.)
Over one complete cycle the energy supplied to the system is

$$\oint dW = \int_V \mathbf{E}(x,y,z)\cdot\epsilon''\mathbf{E}(x,y,z)\,dV \int_0^{2\pi}\cos^2[\omega t + \beta(x,y,z)]\,d(\omega t) = \pi\int_V \mathbf{E}(x,y,z)\cdot\epsilon''\mathbf{E}(x,y,z)\,dV \tag{6.108}$$

According to (6.104), the energy stored in the system at the end of a cycle is the same
as at the beginning, since E and D have reverted to their initial values. Thus (6.108)
gives the energy lost per cycle within the dielectric materials in the form of heat.
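The per-cycle result can be checked by integrating $E\,dD$ numerically around one cycle for the scalar (isotropic) case, at a single point so that the volume integral drops out. The sketch below uses hypothetical amplitudes and permittivities in arbitrary units; the term in $\epsilon'$ integrates to zero over the cycle, leaving $\pi\epsilon''E^2$ per unit volume.

```python
import math

def energy_lost_per_cycle(e0, eps_real, eps_imag, steps=10000):
    """Integrate E dD over one cycle for scalar time-harmonic fields.

    With theta = omega*t + beta:
      E = e0 cos(theta)
      D = eps_real * e0 cos(theta) + eps_imag * e0 sin(theta)
    The midpoint rule is applied to E * (dD/dtheta) over 0..2*pi.
    """
    dtheta = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        e = e0 * math.cos(theta)
        dd_dtheta = e0 * (-eps_real * math.sin(theta) + eps_imag * math.cos(theta))
        total += e * dd_dtheta * dtheta
    return total

loss = energy_lost_per_cycle(e0=3.0, eps_real=2.5, eps_imag=0.1)
print(loss, math.pi * 0.1 * 3.0**2)
```

Setting `eps_imag` to zero gives no net energy input over a cycle, which is the lossless case.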
The heat produced per second per unit volume in the dielectric may therefore be
expressed by

$$L = \frac{\omega}{2}\,\mathbf{E}\cdot\epsilon''\mathbf{E} \tag{6.109}$$

If the dielectric is isotropic, so that $\mathbf{D} \parallel \mathbf{E}$ and $\epsilon'$, $\epsilon''$ are real scalars, then

$$L = \frac{\omega\epsilon''}{2}E^2 = \frac{\omega\epsilon'}{2}E^2\tan\delta \tag{6.110}$$

in which

$$\tan\delta = \frac{\epsilon''}{\epsilon'} \tag{6.111}$$

with $\delta$ the phase angle of the complex permittivity. Because of the form of (6.110), $\delta$
is also called the loss angle and $\tan\delta$ the loss tangent. Formula (6.110) is a convenient form in which to express the loss because for many materials $\epsilon'$ is essentially constant over wide frequency spectra, and $\tan\delta$ then becomes the loss index of the
material.
Dielectric loss at frequencies below the infrared is usually due to dipolar relaxation
in one of the forms discussed in Section 19. (In some dielectric materials there may
also be a small amount of loss due to free electrons which cause a slight conductivity.
This subject will be considered more fully in Chapter 8.) An extensive compilation of
dielectric data in the frequency range from 100 cps to $2.5 \times 10^{10}$ cps has been made by
von Hippel³¹ and selections from his data are reproduced in Table 6.7. One observes
that $\tan\delta$ is characteristically small and that $\epsilon'$ is reasonably constant, in keeping with
the interpretation of the Debye equations as displayed in Figure 6.23. One or more
resonances may be noted in $\tan\delta$, but they are all very broad, which is also consistent
with the theory.

TABLE 6.7
THE REAL PART OF THE RELATIVE DIELECTRIC CONSTANT AND THE LOSS
TANGENT VERSUS FREQUENCY FOR VARIOUS DIELECTRIC MATERIALS
(Values for tan δ are multiplied by 10^4.)

                                              Frequency in cps
Material               T °C   Quantity    10^2    10^4    10^6    10^8    10^10
Zirconium porcelain     25    ε'/ε0       6.44    6.35    6.32    6.30    6.18
                              tan δ       59      31      23      25      57
Borosilicate glass      25    ε'/ε0       4.05    4.05    4.05    4.05    4.05
                              tan δ       13.6    4.4     5.8     8.0     15.5
Mycalex 400             25    ε'/ε0       7.47    7.42    7.39    ....    7.12
                              tan δ       29      16      13      ....    33
Bakelite BM-120         25    ε'/ε0       4.87    4.62    4.36    3.95    3.68
                              tan δ       300     200     280     380     410
Laminated fiberglas     24    ε'/ε0       14.2    7.2     5.3     4.8     4.37
                              tan δ       2500    1600    460     260     360
Nylon FM 10001          25    ε'/ε0       3.84    3.64    3.36    3.17    3.02
                              tan δ       115     262     232     175     107
Lucite                  23    ε'/ε0       3.20    2.75    2.63    2.58    2.57
                              tan δ       620     315     145     67      49
Plexiglas               27    ε'/ε0       3.40    2.95    2.76    ....    2.59
                              tan δ       605     300     140     ....    67
Polystyrene             25    ε'/ε0       2.56    2.56    2.56    2.55    2.54
                              tan δ       <0.5    <0.5    0.7     <1      4.3
Teflon                  22    ε'/ε0       2.1     2.1     2.1     2.1     2.08
                              tan δ       <5      <3      <2      <2      3.7
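To give a feel for the magnitudes involved, formula (6.110) can be applied to one of the tabulated materials. The sketch below takes the polystyrene entries at $10^{10}$ cps ($\epsilon'/\epsilon_0 = 2.54$, $\tan\delta = 4.3 \times 10^{-4}$) and a field amplitude of $10^5$ V/m; the field strength is a hypothetical value chosen for illustration, and the function name is an assumption of this sketch.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, F/m

def heat_per_unit_volume(freq_cps, eps_rel_real, tan_delta, e_amplitude):
    """Dielectric heating rate L = (omega * eps' / 2) * E^2 * tan(delta), in W/m^3.

    freq_cps is the frequency in cycles per second, so omega = 2*pi*freq_cps;
    eps_rel_real is eps'/eps0; e_amplitude is the peak field in V/m.
    """
    omega = 2.0 * math.pi * freq_cps
    return 0.5 * omega * eps_rel_real * EPS0 * e_amplitude**2 * tan_delta

# Polystyrene at 1e10 cps; the 1e5 V/m amplitude is hypothetical.
loss_density = heat_per_unit_volume(1e10, 2.54, 4.3e-4, 1e5)
print(f"L = {loss_density:.3e} W/m^3")
```

Because the loss scales with $\omega\epsilon'\tan\delta$, a material whose loss tangent is nearly flat across the table still dissipates far more power at the high-frequency end.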
The dielectric losses occasioned by ionic vibrations are customarily referred to as
infrared absorption. Similarly, those losses associated with the electrons are called
optical absorption. Most of these losses are normally in the ultraviolet and for transparent materials they are totally so. However, in some materials they also occur in the
visible spectrum, thus giving rise to the colors of these materials. In ionic crystals this
may happen if electrons occupy some of the sites normally filled by negative ions, thus
causing a natural frequency intermediate between ultraviolet and infrared.

31 A. R. von Hippel, ed., Dielectric Materials and Applications, John Wiley and Sons, Inc., New York,
1954.

6.21 MAXWELL'S EQUATIONS FOR DIELECTRIC MATERIALS

If a dielectric material is considered at the microscopic level to consist of an aggregation
of charged particles in motion in a vacuum, then the free-space Maxwell's equations

$$\nabla\times\mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla\times\mu_0^{-1}\mathbf{B} = \boldsymbol{\iota}_t + \epsilon_0\dot{\mathbf{E}} \tag{6.112}$$

are applicable, with $\boldsymbol{\iota}_t = \boldsymbol{\iota} + \boldsymbol{\iota}_b$ representing the total current density due to motions
of both primary charges ($\boldsymbol{\iota}$) and the bound charges ($\boldsymbol{\iota}_b$) within the dielectric. However, a
more convenient form of these equations may be devised as follows: If

$$\mathbf{P}\,dV = \sum_{n=1}^{M}\mathbf{p}_n = \sum_{n=1}^{M} q_n\mathbf{r}_n$$

is the total instantaneous dipole moment within a macroscopic volume element $dV$
(cf. Sections 6.2, 6.3), then

$$\dot{\mathbf{P}}\,dV = \sum_{n=1}^{M} q_n\mathbf{v}_n$$

in which $\mathbf{v}_n$ is the instantaneous velocity of the nth charge within $dV$. This expression
is seen to be the instantaneous current in $dV$ due to the motions of the bound charges.
When the thermal velocity components (which average to zero) are subtracted out,

$$\dot{\mathbf{P}}\,dV = \boldsymbol{\iota}_b\,dV$$

in which $\boldsymbol{\iota}_b$ is the equivalent instantaneous ordered current density representing the
contributions due to the motions of all the bound charges within $dV$. Thus $\dot{\mathbf{P}} = \boldsymbol{\iota}_b$ is
the contribution to the total current density $\boldsymbol{\iota}_t$ made by the time-varying dipole moments within the dielectric. Since

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}$$

it follows that

$$\dot{\mathbf{D}} = \epsilon_0\dot{\mathbf{E}} + \dot{\mathbf{P}}$$

If the dielectric material is nonmagnetic (this restriction will be removed in Chapter 7),
then

$$\nabla\times\mu_0^{-1}\mathbf{B} = \boldsymbol{\iota}_t + \epsilon_0\dot{\mathbf{E}} = \boldsymbol{\iota} + \dot{\mathbf{P}} + \epsilon_0\dot{\mathbf{E}} = \boldsymbol{\iota} + \dot{\mathbf{D}}$$

Maxwell's other equation remains intact in the form given in (6.112) so that at
points within nonmagnetic dielectric materials, Maxwell's equations may be written in
the form

$$\nabla\times\mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla\times\mu_0^{-1}\mathbf{B} = \boldsymbol{\iota} + \dot{\mathbf{D}} \tag{6.113}$$

The auxiliary relations become

$$\nabla\cdot\mathbf{D} = \rho \qquad \nabla\cdot\mathbf{B} = 0 \tag{6.114}$$

in which $\mathbf{D} = \epsilon\mathbf{E}$, with the permittivity $\epsilon$ capable of all the diverse characteristics discussed earlier in this chapter.
These results will be generalized further in the next two chapters to include magnetic
and conductive materials.
REFERENCES

1. Bottcher, C. J. F., Theory of Electric Polarization, Elsevier Publishing Company, Amsterdam, 1952.
2. Corson, D., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H. Freeman and Company, San Francisco, 1962.
3. Dekker, A. J., Solid State Physics, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1957.
4. Dekker, A. J., Electrical Engineering Materials, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1959.
5. Frohlich, H., Theory of Dielectrics, 2d ed., Clarendon Press, Oxford, 1958.
6. Kittel, C., Introduction to Solid State Physics, 2d ed., John Wiley and Sons, Inc., New York, 1956.
7. Plonsey, R., and R. E. Collin, Principles and Applications of Electromagnetic Fields, McGraw-Hill Book Company, New York, 1961.
8. Reitz, J. R., and F. J. Milford, Foundations of Electromagnetic Theory, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1960.
9. Van Vleck, J. H., Theory of Electric and Magnetic Susceptibilities, Oxford University Press, London, 1932.
10. von Hippel, A. R., ed., Dielectric Materials and Applications, John Wiley and Sons, Inc., New York, 1954.
11. Whittaker, E. T., A History of the Theories of the Aether and Electricity, Vol. 1, Thomas Nelson and Sons, Ltd., London, 1951.

PROBLEMS

6.1

What charge distribution over a spherical surface is equivalent in the exterior region to
a dipole of moment p at its center?

6.2

An electron is placed 30 angstroms from an atom of polarizability a = 10- 40 farad 1n 2


Find the dipole moment induced in the atom and the resulting force on the free electron.
What initial acceleration will this cause the electron to experience?

6.3

Find the energy stored in an atom of polarizability a if it is placed in a uniform field of


strength Eo.

6.4

Find the electrostatic energy stored in a system consisting of two identical dipoles which
are parallel and coaxial. Let their strength be p = qd and their separation be l, with l
d.

Problems 395
6.5

The molecules of a polar gas have a permanent dipole moment of 1.5 debye units. What
externally applied field strength is needed to cause an orientational polarization which is
0.5 percent of the saturation value? Assume that the temperature is 20C.

6.6

The dipole moment of a water molecule is 1.87 debye units. Neglecting electronic and
ionic polarizabilities, estimate the dielectric constant of water at OC.

6.7

Show that the interionic force between the positive and negative ions of a salt will decrease by an order of magnitude when the salt is dissolved in water. This effect plays a
prominent role in keeping the salt in solution.

6.8

With the aid of Figure 0.11, estimate the permanent dipole moment of CH 3CI and the
relative strength of its orientational polarizability.

6.9

A commercial 400 volt d.c. paper-insulated tubular capacitor for use in electronic components has 0.1 microfarads of capacitance. Exclusive of the exterior plastic shell it con-

sists of two sheets of foil 0.0005 in. thick, interleaved by two sheets of paper 0.001 in.
thick, with the assembly wound into a circular cylinder 16 in. in diameter and 1 ~ in. long.
What is the relative dielectric constant of the paper?
6.10

Find the polarization density in the capacitor of the preceding problem, if a paper of
relative dielectric constant 2.4 is used and 400 volts is applied.

6.11

If a uniform field Eo is set up in a dielectric of permittivity


field in a spherical cavity inside the dielectric.

6.12

Figure 6.18 indicates two drops in the spontaneous polarization of BaTi0 3 , as measured
along a cube edge. These occur when the direction of polarization changes from being

find the strength of the

parallel to a cube edge, to being parallel to a face diagonal, and then to being parallel to
a body diagonal. Use the experimental data of Figure 6.19 to show that the total spontaneous polarization is essentially constant below 300oK.
6.13

Barium titanate has an index of refraction n = 2.4. Assuming that the Clausius-Mossotti
equation is applicable, show that the electronic polarizability is approximately one and
a half times as great as the spontaneous ionic polarizability.

6.14

A charge $q$ is placed at the center of a dielectric shell of inner radius $a$, outer radius $b$, and
relative dielectric constant $\epsilon_r$. What is the change in energy of the system if the charge is
removed to infinity?

6.15

Show that a conducting metallic sphere of radius $a$ has a polarizability $\alpha = 4\pi\epsilon_0 a^3$. If an
artificial dielectric is constructed by using $N$ such conducting spheres per unit volume,
find the relative static dielectric constant. What would be the frequency dependence of
the permittivity of this array?

6.16

Show that Equation (L.3) of Appendix L is applicable to the case of permanent dipoles
which oscillate about central pointing directions with {P} representing the retarded
orientation of the polarization density.

6.17

In parallel with the development in Appendix L, find the magnetic vector potential
function due to a collection of oscillating dipoles which represent a dielectric material.
Show that the equivalent time-harmonic currents are consistent with the equivalent time-harmonic charge densities of Equation (L.3).

CHAPTER

Magnetic Materials
With respect to their magnetic behavior, materials may be classified as diamagnetic,
paramagnetic, ferromagnetic, antiferromagnetic, or ferrimagnetic; distinctions will be
drawn among these five types of materials in later sections of this chapter. Briefly, the
manifestation of magnetic behavior may be attributed to electron orbital motion, to
electron spin, and to nuclear spin. All three of these causes are representable by equivalent atomic currents flowing in circular loops. Unlike the currents considered in Chapter
4, which were caused by the macroscopic transport of charge, these atomic currents are
bound to the individual molecules, and the phenomenon in many respects is analogous
to that of the bound dipole charges which formed the basis for an explanation of dielectric behavior. These equivalent atomic currents cause magnetic fields which are
calculable by the methods developed in Chapter 4. The aggregate effect of the atomic
currents in all the molecules of a material specimen may be to produce a significant
macroscopic magnetic field.
An equivalent atomic current loop may be characterized by its magnetic moment $\mathbf{m}$.
For diamagnetic materials, the net magnetic moment of each molecule in the absence
of an external magnetic field is zero. The presence of an external field induces a slight
net magnetic moment, and this effect is akin to electronic polarization in dielectric
materials. The induced magnetic moments translate into a relative permeability slightly
less than unity.
In paramagnetic materials, the molecules possess net permanent magnetic moments
which are randomly oriented. The presence of an external field causes some net alignment of these magnetic moments, in balance with the thermal agitation. This effect is
analogous to orientational polarization in dielectric materials. Normally, in paramagnetic materials, adjacent molecules exert little magnetic influence on each other and
such materials exhibit a relative permeability only slightly in excess of unity.
Ferromagnetic materials are composed of molecules possessing magnetic moments of
equal strength. The interactions of adjacent molecules are so strong that all of the
magnetic moments align within a given domain of the material even in the absence of
an external field. This effect is similar to what occurs in ferroelectric crystals. These
materials have large relative permeabilities, values in the range $10^4$ to $10^5$ being not
uncommon. Antiferromagnetic materials differ in that adjacent molecules have magnetic moments of equal strength but the alignment is antiparallel. In ferrimagnetic
materials, adjacent magnetic moments are of two different strengths as well as being
antiparallel. These four conditions of the permanent magnetic moments are suggested
for a one-dimensional model in Figure 7.1.

To account for all three causes of magnetic behavior and their possible occurrence in
any of the five types of magnetic materials, an arbitrary specimen will be treated as
though it were a general collection of magnetic moments $\mathbf{m}_i$ in a vacuum, with consideration of the detailed composition of $\mathbf{m}_i$ deferred until later in the chapter where specific
types of materials are discussed. Static conditions will be treated first, and an expression will be obtained for the total magnetic field caused by externally impressed currents
plus a distribution of atomic magnetic moments. This expression will then be used to
generalize the relation between $\mathbf{B}$ and $\mathbf{H}$ and to deduce the local field at the site of any
molecule. Consideration will then be given to the problem of relating the strengths of
the local field and the net magnetic moment density $\mathbf{M}$. For many materials this will
permit simplifications of the expression for the total field; additionally, it will lead to the
relation $\mathbf{H} = \mu^{-1}\mathbf{B}$, with the permeability factor $\mu$ serving to describe the magnetic
behavior of the various types of magnetic materials.

FIGURE 7.1 Alignment of magnetic moments for different types of magnetic materials. (Panels: Paramagnetic, Ferromagnetic, Antiferromagnetic, Ferrimagnetic.)

The permeability will be investigated for all five classes of materials under static (or
quasistatic) conditions, and then the results will be generalized to time-harmonic
situations. Resonance phenomena will be described and hysteresis losses treated.
Finally, the free-space form of Maxwell's equations derived in Chapter 5 will be extended to apply to regions occupied by magnetic materials in a manner analogous to
what was done for dielectric materials in Chapter 6.
7.1* HISTORICAL SURVEY

An awareness of the existence of magnetized and magnetizable materials can be traced
back to the ancients, who were familiar with lodestone and its power to attract iron.
Thus Plato, in the dialogue Ion, invests Socrates with the words
. . . this gift you have of speaking well on Homer is not an art; it is a power divine, impelling you like the power in the stone Euripides called the magnet . . . . This stone does
not simply attract the iron rings, just by themselves; it also imparts to the rings a force
enabling them to do the same thing as the stone itself, that is, to attract another ring, so that
sometimes a chain is formed, quite a long one, of iron rings suspended from one another.
For all of them, however, their power depends on that lodestone.

* This section may be omitted without loss in continuity of the technical presentation.


Despite this long history of acquaintance with the powers of natural magnets, the
introduction of the compass (which was the first practical application of magnetism)
did not occur until the Middle Ages.¹ The date and authorship of this invention are not
known, but primitive forms of the compass were commonly employed in northwestern
Europe by the end of the twelfth century. About one century later, and apparently
independently, the Chinese also discovered that the directive power of a magnet could
be applied to navigational purposes.
The science of magnetism can be said to date from 1269, for in that year Pierre
de Maricourt (Peregrinus) announced the discovery of an important property of lodestones. In his own words:²
So you must know that this stone bears in itself the similitude of the heavens, the method
of proving which I will explain clearly how to find . . . there are two points in the heavens
more noteworthy than the rest, because the celestial sphere turns about them as upon axes.
One of these is named the Arctic or North pole, whilst the remaining one is named the
Antarctic or Southern. So in this stone you should thoroughly comprehend there are two
points of which one is called the North, the remaining one the South. To the general discovery of these two points you may attain by manifold industry . . . one way is to have
this stone rounded with a tool with which crystals and other stones are rounded. Afterwards
let a needle . . . be placed over the stone, and along the length of the needle let a line be
marked out dividing the stone along the middle. Afterwards let the needle be placed in
another position over the stone, and mark the stone with a line again in the same way
according to that position. And if you wish, you shall do this in several places or positions,
and without doubt all the lines of this kind will meet in two points, just as all the meridian
circles of the World meet in the two opposite poles of the World. Know you then that one is
the North, the other the South . . . .

Peregrinus went on to the discussion of further experiments in which he showed that
the two poles were also the points of greatest concentration of magnetic strength. His
terminology has prevailed to this day, and this conception of a polarization effect in
magnets has made a lasting impression and forms the basis for many subsequent
theories of magnetization.
It is difficult for the present-day reader to appreciate the superstitious awe with
which medievalists regarded the magnet and its seemingly supernatural ability to
attract other bodies. Thus curative properties for all sorts of afflictions were ascribed
to it. Gout, dropsy, convulsions, and even marital disputes were believed to give way
to its magical powers. Another common belief which prevailed for centuries was that a
magnet would lose its directive properties if rubbed with garlic. The rise of the scientific
method included consideration of such matters, as one can see in this passage from the
writings of Giambattista della Porta:³
1 A summary of what is known about the discovery of the compass, with bibliography, has been given
by D. G. Knapp in "Origins of Geomagnetic Science," Chap. 6 of Magnetism of the Earth, Publication 40-1, Coast and Geodetic Survey, U.S. Dept. of Commerce, Washington, D.C., 1962.
2 Petrus Peregrinus, De Magnete, Chap. 4, English translation by Sylvanus P. Thompson, Chiswick
Press, London, 1902.
3 John Baptist Porta, Natural Magick, English translation of Magiae Naturalis, 20 volumes, published
in London, 1658.


It is a common opinion among seamen that onions and garlic are at odds with the lodestone; and steersmen and such as tend the mariner's card are forbidden to eat onions or
garlic lest they make the index of the poles drunk. But when I tried all these things, I found
them to be false; for not only breathing . . . upon the lodestone after eating of garlic did
not stop its virtues; but even when it was anointed all over with the juice of the garlic, it performed its office as well as if it had never been touched with it, and I could observe almost
not the least difference.

Although dispelling such beliefs also gained the attention of William Gilbert (1540-1603), it is fortunate that he still found time for more significant inquiries. Gilbert
was the first to appreciate that the earth itself is a giant spherical magnet. He went so
far as to magnetize a small iron ball and demonstrate that it possessed a magnetic field
similar to that of the earth. The action of a compass needle was then readily explained
as merely another example of the principle that like poles of different magnets repel,
whereas unlike poles attract.⁴
The seventeenth century witnessed a widened interest in magnetic investigations.
Among the accomplishments of that period may be mentioned the demonstration by
A. Kirchner (1601-1680) that the two poles of a magnet have equal strength. This was
done by measuring the force required to pull a piece of iron away from either pole.
N. Cabeo (1585-1650) revealed an inductive effect when he noted that an unmagnetized
needle floating freely on water would align itself with the earth's magnetic meridian.
H. Gellibrand (1597-1636) discovered the secular variation of the magnetic declination.
Descartes offered the first theoretical explanation of magnetic phenomena by attempting to embrace all known effects within his theory of vortices. He assumed that the fluid
matter of a vortex entered a magnet at one pole and emerged at the other, acting on
nearby pieces of iron because the molecules of the iron presented a special resistance
to its motion.
In the Principia, Isaac Newton speculated that the law of force for a bar magnet
was the inverse cube of distance.⁵ However, John Michell (1724-1793) was the first
to enunciate a correct law for the force between magnetic poles, stating:⁶
Whenever any Magnetism is found, whether in the Magnet itself, or any piece of Iron,
etc., excited by the Magnet, there are always found two Poles, which are generally called
North and South . . . . Each Pole attracts or repels exactly equally, at equal distances,
in every direction . . . . The Attraction and Repulsion of Magnets decreases, as the
Squares of the distances from the respective poles increase.

Michell based the statement of this law on his own experimental observations and those
of several contemporaries. The validity of the inverse square relationship was later
reinforced by the refined experiments of Coulomb.
The prevalence at that time of fluid theories of electricity naturally led to efforts
to construct similar theories of magnetism. A one-fluid theory was proposed by Aepinus
in 1759, in which the poles were presumed to be regions in which the magnetic fluid
4 W. Gilbert, De Magnete, London, 1600. An English translation by P. F. Mottelay was published in
1893, reprinted by Dover Publications, Inc., New York, 1958.
5 Cf. Book III, Prop. VI, Theorem VI, Cor. V. This conclusion is correct for distances large compared
with the length of the magnet.
6 J. Michell, A Treatise of Artificial Magnets, London, 1750.


was present in excess or deficiency of the normal amount. A two-fluid theory was
favored by Brugmans and Wilcke, with elements of one fluid repulsing each other, but
attracting all elements of the other fluid. The names austral and boreal were given to
these fluids and Coulomb adopted this two-fluid idea, using it to explain why a magnet,
upon being broken in two, becomes two magnets each with a pair of poles, rather than
two half-magnets each with a single pole. According to Coulomb,⁷ this effect could be
explained by imagining the two magnetic fluids to be trapped in equal amounts within
the molecules of magnetic bodies, with no possibility of transfer of either fluid from one
molecule to the next. In the unmagnetized state every molecule of the body has both
its fluids uniformly distributed, and magnetization occurs when the austral and boreal
fluids retreat to opposite ends of each molecule.
This polarization hypothesis of Coulomb was used by Poisson in 1824 as the basis
for the first successful theory of magnetism.⁸ Poisson's development has already been
described in Section 6.1 and its dual in terms of electric dipoles was presented in detail
in Section 6.3. Upon representing the effect of a magnetized body in terms of equivalent
surface and volume distributions of magnetic charge, Poisson was able to explain satisfactorily all the magnetic phenomena known at that time. In addition to describing the
behavior of a permanent magnet, he accounted for the properties of temporary magnets
by deriving an expression for the distribution of induced magnetization. He also offered
an explanation for the observed fact that some materials are more highly magnetizable
than others, introducing a quantitative index of this property which is akin to the
modern factor of permeability.
The Coulomb-Poisson conception that a polarized magnetic molecule is the primitive element in any magnetized specimen was favored by many subsequent investigators. With some refinements, it still forms the basis of a mathematical theory of magnetism which is acceptable for most calculations. However, a different model, conceived
by Ampere,⁹ was destined to supersede it. Stimulated by his discoveries relating magnetic fields to the shapes and dispositions of current-carrying conductors, Ampere
chose to consider all of magnetism as being basically an electrical phenomenon. In his
view, each of Poisson's magnetic molecules owed its properties to a small internal current which circulated perpetually. Aggregations of these elemental circulating currents,
properly aligned, could account for the action of permanent magnets. Although Ampere
did not develop a complete theory which would rival Poisson's, he did treat extensively
the equivalence of permanent magnets and electric circuits. It was to be some time
before this approach was carried much further; the notion of a perpetual molecular
current which encountered no resistance was too revolutionary to be accepted readily
by his contemporaries.
A new class of magnetic materials was discovered by Faraday in 1845. Earlier that
same year, he had determined that the plane of polarization of a beam of light was
rotated when the beam travelled through a bar of heavy glass in the presence of a
longitudinal magnetic field. In an extension of this experiment he noted:¹⁰

7 C. A. Coulomb, "Seventh Memoir on Electricity and Magnetism," Mem Acad, 488; 1789.
8 S. D. Poisson, "Memoir on the Theory of Magnetism," Mem Acad Sci, Ser. 2, 5, 247-338; February
1824.
9 A. M. Ampere, "On the Mathematical Theory of Electrodynamic Phenomena Uniquely Deduced
from Experiment," Mem Acad, 6, 367; 1825.
10 M. Faraday, Experimental Researches in Electricity, vol. 3, Sec. 2253-2258, Bernard Quaritch,
London, 1855.


The bar of silicated borate of lead, or heavy glass already described as the substance in
which magnetic forces were first made effectually to bear on a ray of light . . . was suspended centrally between the magnetic poles, and left until the effect of torsion was over.
The magnet was then thrown into action by making contact at the voltaic battery: immediately the bar moved, turning round its point of suspension, into a position across the magnetic curve or line of force . . . . Here then we have a magnetic bar which . . . points
perpendicularly to the lines of magnetic force.

Faraday also observed that a cube of the heavy glass would tend to move out of the
magnetic field. This behavior was contrary to that of an iron bar, which would align
itself with the field, and to that of a cube of iron, which would seek the strongest region
of the field. Faraday offered an explanation of the behavior of such materials by supposing that magnetic induction caused in them a contrary state to that which it produced in magnetic matter, and for this reason he called them diamagnetic.
Upon attempting to picture this behavior in terms of lines of magnetic force, Faraday
was led to distinguish between diamagnetic materials and those which were normally
magnetic. He introduced the word paramagnetic to describe the latter, and suggested
that a paramagnetic body possessed a high conducting power for magnetic flux, thus
causing the lines of force to crowd into it in preference to its surroundings. Diamagnetic
materials, on the other hand, were pictured as having a low conducting power for the
flux lines so that the density of lines within a diamagnetic body was low compared to
the surroundings.
Two years later, Wilhelm Weber (1804-1890) offered a detailed explanation¹¹ of
diamagnetism. He assumed the existence of Amperian molecular circuits, and invoked
Faraday's emf law to argue that currents should be induced in these circuits if a time-varying magnetic field were applied. Since the induction would result in currents whose
fields were opposed to the stimulus, this would neatly account for diamagnetic behavior.
According to this argument, all bodies exhibit diamagnetism. Weber accepted this
conclusion, and then assumed further that paramagnetic substances additionally
possessed permanent molecular currents which were the cause of their paramagnetism.
A material whose permanent molecular currents were large would be normally magnetic
to such a high degree that the weak diamagnetic effect due to induced currents would
be masked completely. Weber was so satisfied with this explanation that he used it as a
reason to reject the Coulomb-Poisson hypothesis of polarizable magnetic fluids, saying
in the same article:
Through the discovery of diamagnetism, the hypothesis of electric molecular currents
in the interior of bodies is corroborated; the hypothesis of magnetic fluids in the interior of
bodies is refuted.

In 1871 Weber reformulated his theory of magnetism, taking as a new model for an
Amperian molecular current one in which an electric charge was pictured as orbiting
around a fixed charge of opposite sign. This hypothesis preceded by a quarter of a
century Thomson's discovery of the electron and by a third of a century the development of the Rutherford-Bohr atom. It was adopted in 1905 by P. Langevin (1872-1946)
11 W. Weber, "On the Relationship of the Science of Diamagnetism With the Sciences of Magnetism
and Electricity," Ann Phys, 87, 145-189; 1852.


as the basis for the first electron theory of magnetism.¹² Langevin's theory provided a
satisfactory explanation for the distinctions between diamagnetism and paramagnetism, including the temperature independence of the former and the $T^{-1}$ temperature
dependence of the latter, effects which had been observed experimentally by P. Curie
(1859-1906) a decade earlier.¹³
In introducing the terms diamagnetic and paramagnetic Faraday had grouped all
materials whose permeabilities exceed that of free space in the paramagnetic category.
The need for subdivision of this classification became apparent in 1907 when P. E.
Weiss (1865-1932) pointed out that spontaneous magnetization exists in highly magnetic materials in the absence of an external field.¹⁴ He called this the ferromagnetic
property, and this term has ever since been used to characterize such materials. The
adjective paramagnetic is now confined to those materials which have relative permeabilities only slightly in excess of unity and which normally do not exhibit spontaneous
magnetization.
Weiss accounted for the possibility of a gross demagnetized state of a ferromagnetic
material by postulating the existence of regions (now called domains) each of which is
always magnetized to saturation. He supposed that the directions of magnetization
in the various domains were different, and randomly oriented in the absence of an
external field, in which case the gross magnetization of all the domains considered
together was zero. To explain the uniaxial saturation magnetization in any domain,
Weiss hypothesized that a strong, local molecular field forced the parallel alignment of
adjacent Amperian current loops. As a consequence, he was able to show that a ferromagnetic material, when heated to a certain critical temperature $T_c$ (called its Curie
point), ceased to be ferromagnetic, and that for higher temperatures the magnetic
susceptibility was proportional to $(T - T_c)^{-1}$. This latter result is called the Curie-Weiss law.
Experimental evidence in support of Weiss' domain theory for ferromagnetic materials was first provided by Barkhausen¹⁵ in 1919, who observed jumps in the magnetization of a ferromagnetic specimen which are associated with irregular fluctuations
in the motion of the domain walls. Even more direct experimental evidence results
from a technique introduced by Bitter.¹⁶ In this technique one prepares a colloidal
suspension of ferromagnetic particles and spreads it on a carefully prepared surface
of the specimen; the strong local magnetic fields at the domain boundaries cause the
particles to collect there, creating a pattern which is seen easily under a microscope.
The hypothesis by Weiss of a strong local field which aligns all the magnetic moments
in a domain is similar to the Lorentz local field theory for dielectrics (cf. Section 6.5).
However, in order to account for the extremely large values of relative permeability
in ferromagnetic materials, if one writes for the local field $\mathbf{B}_{\mathrm{loc}} = \mathbf{B} + (\gamma - 1)\mathbf{M}/\mu_0^{-1}$,
with $\mathbf{M}$ the magnetization density, it develops that $\gamma$ is several orders of magnitude
larger than the value predicted by a Lorentz-type theory based on ordinary electromagnetic assumptions. This difficulty was overcome in 1928 by Heisenberg,¹⁷ who
showed that the large local field may be explained in terms of quantum-mechanical
exchange forces. These forces are electrostatic, but due to constraints imposed by the
Pauli exclusion principle are equivalent to extremely strong coupling between electron
spins. Heisenberg's theory has been given experimental confirmation through studies
of the gyromagnetic effect.¹⁸ In these studies the ratio of magnetic moment to angular
momentum is determined. Theory predicts a characteristically different ratio if the
magnetism is due to electron orbital motion as opposed to spin; the experimental data
agree with the spin prediction.

12 P. Langevin, "Magnetism and the Theory of Electrons," Ann Chim (Paris), VIII, 5, 70-127; 1905.
13 P. Curie, "Magnetic Properties of Bodies at Diverse Temperatures," Ann Chim (Paris), VII, 5,
289-405; 1895.
14 P. E. Weiss, "The Hypothesis of the Molecular Field and the Ferromagnetic Property," J de Phys
(Paris), Ser. 4, 6, 661-690; 1907.
15 H. Barkhausen, "Two Phenomena Discovered with the Help of a New Amplifier," Phys Zeit, 20,
401-403; 1919.
16 F. Bitter, "Experiments on the Nature of Ferromagnetism," Phys Rev, 41, 507-515; August 15, 1932.

Another magnetic phenomenon whose explanation requires the quantum theory is
the spatial quantization of atoms under the influence of an applied magnetic field.
The theory predicts that the angular momentum of the atom along the field axis must
be an integral or half-integral multiple of $h/2\pi$, with $h$ Planck's constant. The experiments of Stern and Gerlach on the deflection of atoms in a nonhomogeneous magnetic
field have confirmed this.¹⁹ A discrete rather than continuous set of deflections was
observed, giving conclusive evidence that the atoms can assume only particular orientations in the presence of the field.
In the Heisenberg theory for ferromagnetic materials the exchange forces are positive, accounting for the parallel alignment of adjacent magnetic moments. There is
no restriction in the theory which requires that these exchange forces be positive for
all materials; if they are negative, an antiparallel alignment of neighboring spins is
favored. Materials which meet this latter condition are said to be antiferromagnetic.
The theory of such materials was first investigated by Neel²⁰ and Bitter,²¹ and later
extended by Van Vleck.²² The first material shown to be antiferromagnetic was manganese oxide; this was demonstrated in 1938 by Bizette, Squire, and Tsai.²³
Most magnetic materials are good conductors, and thus will not pass electromagnetic
waves easily, particularly at the higher frequencies. In an effort to overcome this
limitation Snoek, Verwey, and their co-workers at the Philips Laboratories instituted
a search during the 1940s for new ferromagnetic materials which would have high
enough electrical resistivities to be suitable for practical applications requiring wave
passage. They were successful²⁴ in developing a group of materials, called ferrites,
which are obtained by replacing the ferrous ion Fe$^{2+}$ in magnetite (FeO·Fe$_2$O$_3$) by
another divalent metal ion such as Mn, Cd, Co, Zn, Ni, Cu, or Mg. Mixed ferrites were
also obtained by using combinations of these ions. The resistivities achieved are in the
range $10^0$ to $10^4$ ohm-m, compared with $10^{-7}$ ohm-m for iron.
17 W. Heisenberg, "On the Theory of Ferromagnetism," Z Phys, 49, 619-636; 1928.
18 See, e.g., S. J. Barnett, "Gyromagnetic and Electron-Inertia Effects," Rev Mod Phys, 7, 129-166;
1935. Also, Proc Am Acad Arts Sci, 75, 109; 1944.
19 W. Gerlach and O. Stern, "The Experimental Detection of Quantized Orientations in a Magnetic
Field," Z Phys, 9, 349-352; 1922.
20 L. Neel, "Influence of Molecular Field Fluctuations on the Magnetic Properties of Bodies," Ann
Phys (Paris), Ser. 10, 18, 5-105; 1932.
21 F. Bitter, "A Generalization of the Theory of Ferromagnetism," Phys Rev, 54, 79-86; 1938.
22 J. H. Van Vleck, "On the Theory of Antiferromagnetism," J Chem Phys, 9, 85-90; 1941.
23 H. Bizette, C. F. Squire, and B. Tsai, "The Transition Point $\lambda$ of the Magnetic Susceptibility of
MnO," Comp Rend, 207, 449-450; 1938.
24 For reviews of these developments see J. J. Went and F. W. Gorter, "Magnetic and Electrical
Properties of Ferroxcube Materials," Philips Tech Rev, 13, 181; 1952. Also see A. Fairweather, F. F.
Roberts, and J. E. Welch, "Ferrites," Rept Prog Phys, 15, 142-172; 1952.


Ferrites have been found to have the spinel structure (after the mineral MgAl$_2$O$_4$),
which is a composite of tetrahedral and octahedral arrangements of ions. Neel has
treated these materials theoretically²⁵ by assuming that a negative interaction exists
between the ions at the tetrahedral sites (A sites) and those at the octahedral sites
(B sites), with this interaction causing an antiparallel spin alignment of the A and B
ions. The theory also includes consideration of the weaker AA and BB interactions,
which turn out to be negative as well. The ferromagnetic behavior of ferrites is thus
explained, surprisingly, in terms of three antiferromagnetic interactions. Neel termed
materials which behave in this manner ferrimagnetic.

7.2 THE STATIC MACROSCOPIC MAGNETIC FIELD DUE TO A VOLUME OF POLARIZED MAGNETIC MATERIAL

A satisfactory theory of the magnetic behavior of materials can be based on the conception that individual atoms or molecules contribute magnetic effects because of (a)
the orbital motion of their electrons, (b) electron spin, and (c) nuclear spin. Following
Ampere, one can represent each of these causes by the magnetic moment of an equivalent current loop. In this way the resulting magnetic theory will parallel in many
respects the dielectric theory presented in Chapter 6.
Consider a specimen of material of arbitrary composition whose volume V is bounded
by the surface S. In determining the magnetic field caused by the material, each molecule of the specimen will be replaced by an elementary current loop (or several current
loops), these current loops being treated as though they were in a vacuum, the total
magnetic field of all the current loops being the same as that of the material specimen
they replaced. The magnetic field will be significant if enough of these current loops
are similarly oriented.
In some materials this similar orientation exists independent of any external stimulus.
Such materials are appropriately called permanent magnets. They may occur naturally
or be formed by metallurgical processes. Other materials are magnetically neutral until
subjected to an external magnetic field. The response to a given stimulus varies from
one material to the next, and even for one material the response may vary in a complicated way with changes in the direction or strength of the external field. For most
materials the magnetic response is very small and in this sense they differ little from
free space. A strong magnetic behavior is exhibited mainly by a small group of metals,
chief of which are iron, nickel, cobalt, and their alloys.
As a consequence of this diversity in magnetic response among materials, no attempt
will be made beforehand to link the distribution of equivalent current loops with any
external field; the discussion of such cause-and-effect relations will be deferred to a
later point in the development of the theory. Accordingly, the initial broad assumption will be made that an arbitrary distribution of current loops exists throughout V.
As in the case of the dielectric theory, static conditions will be considered first.
Through use of the results of Example 4.6, if $\mathbf{m}$ is the magnetic moment of the molecule situated at the point $(\xi,\eta,\zeta)$, then the magnetic vector potential function due to this molecule is

$$\mathbf{A} = \frac{\mathbf{m} \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3} \tag{7.1}$$

in which $\boldsymbol{\xi}$ is drawn from $(\xi,\eta,\zeta)$ to the distant point $(x,y,z)$ where $\mathbf{A}$ is being determined.

25 L. Neel, "Magnetic Properties; Ferrimagnetism and Antiferromagnetism," Ann Phys (Paris), Ser. 12, 13, 137-198; 1948.

It is usually the case that the volume $V$ of the magnetic material is large enough so
that it may be subdivided into a great number $N$ of macroscopic volume elements $dV$,
with $N$ big enough so that the methods of the integral calculus may be employed,
whereas $dV$ is large enough to contain many molecules. (The notation of Figure 6.2 is
applicable.) When this is so, a continuum approximation may be employed. Let
$\mathbf{M}(\xi,\eta,\zeta)\,dV$ represent the vector sum of all the magnetic moments in $dV$, so that $\mathbf{M}$ is
the volume density of magnetic moments. Then

$$d\mathbf{A} = \frac{\mathbf{M}(\xi,\eta,\zeta) \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3}\,dV$$

is the magnetic vector potential function due to all the molecules in $dV$. From this it
follows that the potential at $(x,y,z)$ due to all the molecules in the entire volume $V$ of
magnetic material is

$$\mathbf{A}(x,y,z) = \int_V \frac{\mathbf{M}(\xi,\eta,\zeta) \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3}\,dV \tag{7.2}$$

In (7.2) $\mathbf{M}$ has the dimensions amperes/meter; it is customarily called the magnetization density, or simply the magnetization, and is a macroscopic function. Implicit in
this derivation is the assumption that $(x,y,z)$ is sufficiently remote from each equivalent current loop to satisfy the condition that $\xi \gg a$, with $a$ the loop radius.† This
condition clearly is met for points $(x,y,z)$ outside the material, and the remainder of
the analysis will first be carried through under this restriction.
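The Field at an Exterior Point. If one again distinguishes del operators with respect
to the source point $(\xi,\eta,\zeta)$ and the field point $(x,y,z)$ by the symbols $\nabla_S$ and $\nabla_F$, then
the relation $\nabla_S(1/\xi) = \boldsymbol{\xi}/\xi^3$ may be used to convert (7.2) to the form

$$\mathbf{A} = \int_V \frac{\mathbf{M} \times \nabla_S(1/\xi)}{4\pi\mu_0^{-1}}\,dV$$

Use of the vector identity (V.109) then gives

$$\mathbf{A} = \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV - \int_V \frac{\nabla_S \times (\mathbf{M}/\xi)}{4\pi\mu_0^{-1}}\,dV$$

But from Problem V.29 (at the end of Appendix V),

$$-\int_V \frac{\nabla_S \times (\mathbf{M}/\xi)}{4\pi\mu_0^{-1}}\,dV = \oint_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi}$$

and thus

$$\mathbf{A} = \oint_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi} + \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV \tag{7.3}$$

† The equivalent loop diameter need not exceed the molecular diameter for any of the three atomic
causes of magnetism.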

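in which $S$ is the bounding surface of the material. The magnetic field caused by this
specimen of magnetized material is therefore

$$\mathbf{B} = \nabla_F \times \left[\oint_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi} + \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV\right] \tag{7.4}$$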

An interesting interpretation of Equations (7.3) and (7.4) may be offered. The action
of the magnetic specimen is as though an areal current density $\boldsymbol{\iota} = \nabla \times \mathbf{M}$ amperes/m²
were distributed throughout $V$, and as though a lineal current density $\mathbf{j} = \mathbf{M} \times \mathbf{1}_n$
amperes/m were distributed over $S$, with $\mathbf{1}_n$ the outward-drawn unit normal vector.†
This concept of equivalent current densities will prove useful in describing many
aspects of magnetic behavior.
The Field at an Interior Point. The previous analysis may be adapted to points
$(x,y,z)$ within the magnetic material by treating separately a small region surrounding
the interior point in question. Let a spherical surface $S_\delta$, of radius $\delta$, be erected around
$(x,y,z)$ as center, thus separating a volume $V_\delta$ from the remainder of the material. $V_\delta$
should be chosen large enough to satisfy the condition $\delta \gg a$, with $a$ the radius of any
equivalent current loop, but it should be small enough that $\mathbf{M}$ is essentially uniform
throughout $V_\delta$. These conditions normally will be satisfied if $\delta$ is an order of magnitude
larger than the linear dimensions of a macroscopic volume element $dV$.
If $V'$ is all the volume of magnetic material except $V_\delta$, then

$$\mathbf{B}'(x,y,z) = \nabla_F \times \left[\oint_{S+S_\delta} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi} + \int_{V'} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV\right] \tag{7.5}$$

is the magnetic field at $(x,y,z)$ due to all the specimen except for the contributions
of the molecules within $V_\delta$. If now $(x,y,z)$ is permitted to range over the subvolume $dV$,
consisting of the macroscopic volume element which is at the center of $V_\delta$, no sensible
change will occur in the value of $\mathbf{B}'(x,y,z)$ computed from (7.5), since $dV$ is so small
compared to $V_\delta$. Thus Equation (7.5) also gives the average magnetic field throughout a
macroscopic volume element, due to all the molecules outside $V_\delta$.
It is shown in Appendix N that all the current loops within $V_\delta$ cause an average
magnetic field throughout $V_\delta$ of amount

$$\mathbf{B}'' = \frac{2}{3}\,\frac{\mathbf{M}}{\mu_0^{-1}} \tag{7.6}$$

Since $V_\delta$ is a small local volume, if the molecules within $V_\delta$ are alike and uniformly
distributed,‡ it follows that (7.6) also gives the average magnetic field throughout the
macroscopic volume element $dV$, due to all the current loops within $V_\delta$. Therefore,
if $\mathbf{B}(x,y,z)$ is interpreted to mean the total average field in $dV$, or more briefly the
macroscopic field, then

$$\mathbf{B}(x,y,z) = \nabla_F \times \left[\oint_{S+S_\delta} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi} + \int_{V'} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV\right] + \frac{2}{3}\,\frac{\mathbf{M}}{\mu_0^{-1}} \tag{7.7}$$

† If at a point in $S$ a line of length $dl$ is drawn in $S$ transverse to the current flow, and the total
current crossing $dl$ is counted and found to be $dI$, then $j = dI/dl$ is the lineal current density. $\mathbf{j}$ is
given the direction of the current flow.
‡ Most magnetic materials have this locally homogeneous property.

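This result can be simplified. Since $V_\delta$ is so small that $\mathbf{M}$ is uniform over $S_\delta$, one may
take a polar axis parallel to $\mathbf{M}$ so that $\mathbf{M} = \mathbf{1}_r M\cos\theta - \mathbf{1}_\theta M\sin\theta$. Noting that $d\mathbf{S}$
is directed into $V_\delta$, it follows that

$$\begin{aligned}
\nabla_F \times \oint_{S_\delta} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi}
&= \oint_{S_\delta} \frac{(\mathbf{M} \times d\mathbf{S}) \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3} \\
&= \oint_{S_\delta} \frac{[(\mathbf{1}_r M\cos\theta - \mathbf{1}_\theta M\sin\theta) \times (-\mathbf{1}_r\,dS)] \times (-\mathbf{1}_r\,\delta)}{4\pi\mu_0^{-1}\delta^3} \\
&= \oint_{S_\delta} \frac{\mathbf{1}_\theta M\sin\theta}{4\pi\mu_0^{-1}\delta^2}\,dS = -\frac{2}{3}\,\frac{\mathbf{M}}{\mu_0^{-1}}
\end{aligned} \tag{7.8}$$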

Therefore the equivalent lineal current density over $S_\delta$ makes an equal and opposite
contribution to the macroscopic field at $(x,y,z)$ when compared to the contribution
made by the molecules within $V_\delta$.
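The closed-form value of the surface integral in (7.8) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the classical SI value of $\mu_0$, a hypothetical magnetization magnitude, and a simple midpoint rule in the polar angle; the function name `cavity_surface_term` is introduced here and does not come from the text.

```python
import math

MU0 = 4e-7 * math.pi  # assumed classical SI value of mu_0, in H/m


def cavity_surface_term(m_magnitude, n_theta=2000):
    """Axial component of the S_delta surface integral in (7.8).

    The integrand is 1_theta * M * sin(theta) dS / (4 pi mu0^-1 delta^2) with
    dS = delta^2 sin(theta) dtheta dphi, so delta cancels. Only the axial
    component of 1_theta, which is -sin(theta), survives the phi integration.
    """
    d_theta = math.pi / n_theta
    theta_integral = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta  # midpoint rule in the polar angle
        theta_integral += -math.sin(theta) ** 3 * d_theta
    # The azimuthal integration contributes a factor 2*pi.
    return MU0 * m_magnitude / (4.0 * math.pi) * 2.0 * math.pi * theta_integral


M = 1.0e5  # hypothetical magnetization magnitude, A/m
numeric = cavity_surface_term(M)
closed_form = -2.0 / 3.0 * MU0 * M  # right side of (7.8): -(2/3) M / mu0^-1
print(numeric, closed_form)
```

The two printed numbers should agree closely, which is the cancellation against (7.6) that the text describes.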
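Additionally, since $V_\delta$ is so small, $\nabla_S \times \mathbf{M}$ is constant throughout $V_\delta$ and thus

$$\nabla_F \times \int_{V_\delta} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV
= \int_{V_\delta} \frac{(\nabla_S \times \mathbf{M}) \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3}\,dV
= \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}} \times \int_{V_\delta} \frac{\boldsymbol{\xi}}{\xi^3}\,dV = 0 \tag{7.9}$$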

Therefore (7.9) may be added to the right side of (7.7) without affecting the value of
$\mathbf{B}(x,y,z)$. When this is done, and the explicit value for the surface integral obtained in
(7.8) is utilized, there results

$$\mathbf{B}(x,y,z) = \nabla_F \times \left[\oint_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi} + \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV\right] \tag{7.10}$$

But this is identical with (7.4) and thus the macroscopic field is given by the same
formula, whether $(x,y,z)$ is an interior point or an exterior point.
EXAMPLE 7.1

[Figure: long, thin cylindrical compass needle with an array of arrows indicating its axial magnetization]

The long, thin cylindrical compass needle suggested by the figure has an almost-uniform
axial magnetization, which is indicated by the array of arrows. If Equation (7.3) is applied to this specimen, one can conclude that, to the extent $\mathbf{M}$ is uniform throughout $V$,
$\nabla_S \times \mathbf{M} = 0$ in $V$. Further, $\mathbf{M} \times d\mathbf{S} = 0$ over the end surfaces and $\mathbf{M} \times d\mathbf{S}$ is circumferential on the cylindrical surface and transverse to the long dimension. Thus the entire
specimen is equivalent to a long, slender solenoid with many closely spaced turns carrying
a steady current: the equivalent ampere-turns per unit length is $M$. The field of this compass needle is therefore similar to the one shown in the flux plot of Example 4.11.
If the flux lines external to this compass needle were assumed to originate on positive
magnetic charge and to terminate on negative magnetic charge, one can see how the right
end would act like a north magnetic pole and the left end like a south magnetic pole.
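The solenoid equivalence in this example can be made concrete with a short numerical sketch. It assumes the standard on-axis field expression for a finite cylindrical current sheet, sets the sheet's lineal current density equal to $M$ as the example states, and uses hypothetical needle dimensions and magnetization; the function name is illustrative only.

```python
import math

MU0 = 4e-7 * math.pi  # assumed classical SI value of mu_0, in H/m


def axial_field_of_needle(magnetization, length, radius, z):
    """On-axis B of a uniformly magnetized cylinder, at distance z from its center.

    The cylinder is replaced by its equivalent solenoidal current sheet of
    lineal density j = M (amperes/meter), and the standard on-axis result for
    a finite solenoid is applied.
    """
    z_plus = z + length / 2.0
    z_minus = z - length / 2.0
    return 0.5 * MU0 * magnetization * (
        z_plus / math.hypot(z_plus, radius) - z_minus / math.hypot(z_minus, radius)
    )


M = 1.0e5        # hypothetical magnetization, A/m
LENGTH = 0.05    # hypothetical needle length, m
RADIUS = 0.0005  # hypothetical needle radius, m

b_center = axial_field_of_needle(M, LENGTH, RADIUS, 0.0)
b_end = axial_field_of_needle(M, LENGTH, RADIUS, LENGTH / 2.0)
print(b_center / (MU0 * M), b_end / (MU0 * M))
```

For a needle this slender the field at the center is close to $M/\mu_0^{-1}$ and falls to about half that value at an end face, as expected of a long solenoid.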

The original work of Poisson, in which magnetic effects were explained in terms of
positive and negative magnetic fluids, was noted in the historical introduction. Equal
amounts of these fluids were assumed to be within each molecule and to separate into
magnetic dipoles when a specimen became magnetized. The resulting fields were computed in a manner precisely analogous to the procedure followed for dielectric materials in Chapter 6, in which polarized electric charges were assumed to be the cause of
dielectric behavior.
This analytical approach to magnetic behavior has been widely used, but it suffers
from several disadvantages. First, the equivalent atomic current concept of Ampere
appears to be closer to reality. Second, the expressions for internal and external macroscopic fields are not the same under a magnetic charge formulation,²⁶ whereas it has
just been demonstrated that they are the same when a current formulation is used.
Third, certain difficulties arise when interpreting the results of combining magnetic
fields due to primary currents and magnetic materials, when the former is expressed
in terms of electric charge transport and the latter in terms of magnetic dipoles. This
can be appreciated by tracing the arguments of Section 7.3 under a magnetic charge
formulation.
Despite these difficulties, there are boundary value problems in which the magnetic
charge concept is useful and offers certain simplifications. These problems are usually
concerned with the fields outside the materials, in which case the aforementioned
difficulties are avoided. The simplifications occur because, like electric charge effects,
the fields due to magnetic charges may be given in terms of a scalar potential function.
For the reader who is interested in this alternative development, a derivation may be
found in many textbooks.²⁷

7.3 A GENERALIZATION OF $\mathbf{H}_0$

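Consider next a general magnetostatic system consisting of primary steady currents
plus equivalent current loops which represent magnetic materials. One can imagine
that a primary current density $\boldsymbol{\iota}(\xi,\eta,\zeta)$ occupies the volume $V_1$ and that an assortment
of magnetic materials occupies the volume $V_2$. $V_1$ and $V_2$ may overlap and neither of
them need be only one simply connected region. If $\mathbf{M}(\xi,\eta,\zeta)$ is the volume density of
magnetic moments in $V_2$, then the total macroscopic magnetic field $\mathbf{B}$ at a point $(x,y,z)$
will be

$$\mathbf{B}(x,y,z) = \mathbf{B}_1(x,y,z) + \mathbf{B}_2(x,y,z) \tag{7.11}$$

in which

$$\mathbf{B}_1 = \nabla_F \times \int_{V_1} \frac{\boldsymbol{\iota}}{4\pi\mu_0^{-1}\xi}\,dV$$

and

$$\mathbf{B}_2 = \nabla_F \times \left[\oint_{S_2} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi} + \int_{V_2} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}\xi}\,dV\right]$$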

If the entire magnetostatic system, including the magnetic materials, is viewed as a
distribution of primary currents of density $\boldsymbol{\iota}$ and equivalent currents of density
$\nabla \times \mathbf{M}$ in a vacuum, a total macroscopic $\mathbf{H}_0$ field may be defined by

$$\begin{aligned}
\mathbf{H}_0(x,y,z) &= \mu_0^{-1}\mathbf{B}(x,y,z) \\
&= \mu_0^{-1}\mathbf{B}_1(x,y,z) + \mu_0^{-1}\mathbf{B}_2(x,y,z) \\
&= \mathbf{H}_{01}(x,y,z) + \mathbf{H}_{02}(x,y,z)
\end{aligned} \tag{7.12}$$

26 See, e.g., D. R. Corson and P. Lorrain, Introduction to Electromagnetic Fields and Waves, Chap. 7,
W. H. Freeman and Company, San Francisco, 1962.
27 See, e.g., J. R. Reitz and F. J. Milford, Foundations of Electromagnetic Theory, Chap. 10, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1960.

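In (7.12) $\mathbf{H}_{01}$ is the field which encircles the primary current distribution $\boldsymbol{\iota}$. Similarly,
$\mathbf{H}_{02}$ is the field associated with the equivalent currents $\nabla \times \mathbf{M}$. $\mathbf{H}_{02}$ is a macroscopic
field and does not have the detailed microscopic structure of a magnetic field associated
with the movements of the individual charges within the molecules. Both of the fields
$\mathbf{H}_{01}$ and $\mathbf{H}_{02}$ satisfy Ampere's circuital law and are derivable through formation of the
curl of a vector potential function. Therefore,

$$\begin{aligned}
\nabla \cdot \mathbf{H}_{01} &= 0 & \nabla \times \mathbf{H}_{01} &= \boldsymbol{\iota} \\
\nabla \cdot \mathbf{H}_{02} &= 0 & \nabla \times \mathbf{H}_{02} &= \nabla \times \mathbf{M}
\end{aligned} \tag{7.13}$$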

In most problems of practical significance one is interested in the total $\mathbf{B}$ field which
determines the force on a moving charge, and in the $\mathbf{H}_{01}$ field which is linked to the
primary sources $\boldsymbol{\iota}\,dV$ through Ampere's circuital law. Equation (7.12) can be rewritten
to display these two quantities of interest in the form

$$\mathbf{H}_{01} = \mu_0^{-1}\mathbf{B} - \mathbf{H}_{02} \tag{7.14}$$

$\mathbf{H}_{02}$ is the extraneous quantity in this equation, and it is advantageous to explore the
implications of defining a macroscopic magnetic field $\mathbf{H}$ by the relation

$$\mathbf{H} = \mu_0^{-1}\mathbf{B} - \mathbf{M} \tag{7.15}$$

The substitution of $\mathbf{M}$ for $\mathbf{H}_{02}$ is clearly suggested by the last of Equations (7.13). In
making the definition (7.15) it is to be emphasized that $\mathbf{B}$ still has the meaning of total
macroscopic magnetic flux density and that $\mathbf{M}$ still represents the macroscopic volume
density of magnetic moments. Outside $V_2$, where $\mathbf{M} = 0$,

$$\mathbf{H} = \mathbf{H}_{01} + \mathbf{H}_{02} \tag{7.16}$$

whereas inside $V_2$

$$\mathbf{H} = \mathbf{H}_{01} + \mathbf{H}_{02} - \mathbf{M} \tag{7.17}$$

Thus in terms of the earlier conception of magnetic fields associated with currents
in free space, $\mathbf{H}$ equals the total macroscopic $\mathbf{H}_0$ field outside the magnetic materials,
but not inside. Furthermore, taking the divergence of (7.15) and utilizing (7.13) gives

$$\nabla \cdot \mathbf{H} = -\nabla \cdot \mathbf{M} \tag{7.18}$$

whereas taking the curl yields

$$\nabla \times \mathbf{H} = \boldsymbol{\iota} \tag{7.19}$$

Therefore $\mathbf{H}$ as defined by (7.15) may not be continuous but it does satisfy Ampere's
circuital law in terms of the primary sources only, and one may write

$$\oint_C \mathbf{H} \cdot d\mathbf{l} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S} \tag{7.20}$$

and conclude that $\mathbf{H}$ is governed explicitly by the primary current distribution.


Equation (7.15) will be adopted as the defining relation for the generalized magnetic
field function $\mathbf{H}$, with the understanding that this equation refers to the total macroscopic $\mathbf{B}$ everywhere, whether inside or outside the materials, but that it refers only
to the total macroscopic $\mathbf{H}_0$ field outside the materials. It is an equation of cardinal
importance in the theory of magnetic materials and can often be converted to the form
$\mathbf{H} = \mu^{-1}\mathbf{B}$, since for many materials $\mathbf{M}$ is expressible as a function of $\mathbf{B}$. The permeability factor $\mu$ then serves the role of representing the macroscopic magnetic behavior
of the material.
Besides containing the magnetization $\mathbf{M}$ explicitly, Equation (7.15) has several
other advantages. Outside the materials (7.15) gives $\mathbf{H} = \mu_0^{-1}\mathbf{B}$ which is a natural
relation, whereas (7.14) does not similarly connect $\mathbf{H}_{01}$ and $\mathbf{B}$. In the absence of materials (7.15) reduces everywhere to (4.42) and thus all the earlier discussion of steady
magnetic fields due only to primary currents becomes a special case of the present
generalization. But an even more important feature is that $\mathbf{H}$, as defined by (7.15),
shares with $\mathbf{H}_{01}$ the property of satisfying Ampere's circuital law, thus providing a
cause and effect relation with the system of primary currents.
Needlelike and waferlike cavities can be constructed inside a specimen of magnetic
material in order to measure $\mathbf{B}$ and $\mathbf{H}$. The procedure is analogous to what was done
for dielectric materials in Example 6.3.

7.4

THE LOCAL FIELD

The defining relation (7.15), which gives $\mathbf{H}$ in terms of $\mathbf{B}$ and $\mathbf{M}$, is a macroscopic
equation whose further interpretation must await the linking of $\mathbf{M}$ to its causes. But
this linkage occurs at the microscopic level, and thus $\mathbf{M}$ forms a bridge between the
macroscopic and microscopic theories of magnetic behavior.
To develop the connection between $\mathbf{M}$ and its causes, it is useful to introduce the
concept of the local field, $\mathbf{B}_{loc}$, which will be defined as the average field intensity in the
region occupied by a given molecule within the material, due to all external sources
plus every other molecule, but excluding the molecule in question. $\mathbf{B}_{loc}$ may therefore
be determined by removing the given molecule, maintaining all other molecules in
their time-averaged states, and calculating the space-averaged magnetostatic field
in the cavity previously occupied by the removed molecule. If $V_m$ is the volume of the
cavity, then

$$\mathbf{B}_{loc} = \frac{1}{V_m}\int_{V_m} \mathbf{b}\,dV - \frac{1}{V_m}\int_{V_m} \mathbf{b}_m\,dV \tag{7.21}$$

in which $\mathbf{b}$ is the total time-averaged field at a point in $V_m$ and $\mathbf{b}_m$ is the time-averaged
field at the same point due just to the molecule in question.
If the magnetic material is locally homogeneous, the first integral in (7.21) is approximately equal to the macroscopic field $\mathbf{B}$. And if $V_m$ can be chosen as a spherical volume
of radius $r_m$, then the results of Appendix N give

$$\frac{1}{V_m}\int_{V_m} \mathbf{b}_m\,dV = \frac{2}{3}\,\frac{\mathbf{m}}{\mu_0^{-1}V_m} \tag{7.22}$$

with $\mathbf{m}$ the total magnetic moment of the removed molecule. If $N = 1/V_m$ is the local
volume density of molecules, then assuming parallel and equal magnetization for all
local molecules gives

$$\mathbf{M} = N\mathbf{m} = \frac{\mathbf{m}}{V_m}$$

so that the space-averaged self-field is $2\mathbf{M}/3\mu_0^{-1}$. With these substitutions (7.21)
becomes

$$\mathbf{B}_{loc} = \mathbf{B} - \frac{2\mathbf{M}}{3\mu_0^{-1}} \tag{7.23}$$

This equation is similar to the Lorentz field derived in Section 6.5 for electrically
polarized molecules and suffers from the same limitations. However, once again one
may argue that, even if the integral in (7.22) does not reduce precisely to $2\mathbf{M}/3\mu_0^{-1}$ in
all cases, it is defensible that the average self-field in $V_m$ should be proportional to $\mathbf{m}$,
which in turn is proportional to $\mathbf{M}$ by definition. Therefore

$$\frac{1}{V_m}\int_{V_m} \mathbf{b}_m\,dV = (1 - \gamma)\,\frac{\mathbf{M}}{\mu_0^{-1}}$$

in which $\gamma$ is called the internal field constant.† In this case (7.21) assumes the more
general form

$$\mathbf{B}_{loc} = \mathbf{B} + (\gamma - 1)\,\frac{\mathbf{M}}{\mu_0^{-1}} \tag{7.24}$$

Equation (7.24) was first postulated by Pierre Weiss. (Cf. Section 7.1.) It will be
interesting to note subsequently that for some magnetic materials $\gamma$ will be several
orders of magnitude larger than the value 1/3 predicted by the Lorentz-like derivation
given above, and culminating in (7.23).

7.5

MAGNETIC SUSCEPTIBILITY

In general, the average magnetization per molecule $\mathbf{m}$ in any magnetic specimen is
functionally related to the local field. This dependence may be described by the equation

$$\mathbf{m} = \alpha\mathbf{B}_{loc} \tag{7.25}$$

in which $\alpha$ is called the magnetic polarizability and may be a simple scalar or a tensor,
but is often a function of the strength of the local field. If $N$ is the density of molecules
per cubic meter, then

$$\mathbf{M} = N\mathbf{m} = N\alpha\mathbf{B}_{loc} \tag{7.26}$$

is the magnetization density. With the use of (7.24) this may be rewritten in the form

$$\mathbf{M} = \frac{N\alpha}{1 - (\gamma - 1)N\alpha/\mu_0^{-1}}\,\mathbf{B} \tag{7.27}$$

an equation which is seen to be similar to (6.49), derived earlier for dielectric materials.

† $(1 - \gamma)$ is used as the proportionality constant here to conform with tradition. Because of this
choice (7.24) is equivalent to $\mathbf{H}_{loc} = \mathbf{H} + \gamma\mathbf{M}$, which is the customary way of expressing the local field,
and which unfortunately emphasizes $\mathbf{H}$ rather than $\mathbf{B}$.


Equation (7.27) indicates a functional relation between $\mathbf{M}$ and $\mathbf{B}$. The magnetic
susceptibility $\chi_m$ is defined in such a way that this relation may also be written

$$\mathbf{M} = \frac{\chi_m}{1 + \chi_m}\,\mu_0^{-1}\mathbf{B} \tag{7.28}$$

A study of either (7.23) or (7.24) reveals that the dimensions of $\mathbf{M}$ and $\mathbf{B}$ differ by $\mu_0^{-1}$
and therefore that $\chi_m$ is dimensionless.†
Upon combination of (7.27) and (7.28) the magnetic susceptibility may be expressed
in terms of the polarizability and the internal field constant, namely,

$$\chi_m = \frac{N\alpha/\mu_0^{-1}}{1 - \gamma N\alpha/\mu_0^{-1}} \tag{7.29}$$

This equation is seen to be completely analogous to (6.51).
Through the combination of (7.15) and (7.28), the $\mathbf{H}$ field is given by

$$\mathbf{H} = \frac{\mu_0^{-1}}{1 + \chi_m}\,\mathbf{B} = \mu^{-1}\mathbf{B} \tag{7.30}$$

in which

$$\mu = \mu_0(1 + \chi_m) \tag{7.31}$$

is called the permeability of the material. The relative permeability $\mu_r$ is defined by

$$\mu_r = \frac{\mu}{\mu_0} = 1 + \chi_m \tag{7.32}$$

and, like $\chi_m$, is a dimensionless quantity.


Because of the nature of the polarizability $\alpha$, all the quantities $\chi_m$, $\mu$, and $\mu_r$ may be
scalars, or they may be tensors, depending on the material in question. In materials
which are strongly magnetic, they almost invariably will be dependent on the strength
of the field. At the macroscopic level the induced magnetization in a material specimen
may be represented through use of Equations (7.28) and (7.31) or (7.32) by any one of
the quantities $\chi_m$, $\mu$, or $\mu_r$.
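The chain from polarizability to susceptibility to permeability can be sketched numerically. The snippet below is an illustration only: it treats $\alpha$ as a scalar, works with the dimensionless combination $x = N\alpha/\mu_0^{-1}$, and uses hypothetical values for $x$ and $\gamma$; the function names are introduced here for convenience.

```python
def magnetization_ratio(x, gamma):
    """M / (mu0^-1 B) according to (7.27), with x = N*alpha/mu0^-1."""
    return x / (1.0 - (gamma - 1.0) * x)


def susceptibility(x, gamma):
    """Magnetic susceptibility chi_m according to (7.29)."""
    return x / (1.0 - gamma * x)


def relative_permeability(chi_m):
    """Relative permeability mu_r according to (7.32)."""
    return 1.0 + chi_m


x = 0.2            # hypothetical value of N*alpha/mu0^-1
gamma = 1.0 / 3.0  # Lorentz-like internal field constant from (7.23)

chi_m = susceptibility(x, gamma)
ratio_from_727 = magnetization_ratio(x, gamma)
ratio_from_728 = chi_m / (1.0 + chi_m)  # same ratio expressed through (7.28)
print(chi_m, relative_permeability(chi_m), ratio_from_727, ratio_from_728)
```

The last two printed values coincide, which is the consistency between (7.27) and (7.28) that leads to (7.29).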
† The reader may wonder why $\chi_m$ was not introduced by using the simpler equation $\mathbf{M} = \chi_m\mu_0^{-1}\mathbf{B}$,
in parallel with (6.50). The reason for this is that the traditional development of magnetic theory links
$\chi_m$ to $\mathbf{H}$ rather than to $\mathbf{B}$, the defining relation being $\mathbf{M} = \chi_m\mathbf{H}$, which is equivalent to (7.28).
The reader is also cautioned against a possible source of confusion in approaching the literature
of this field for the first time. Many investigators express their results in cgs units, measuring $H$ in
oersteds. They may choose to give the magnetic moment per cc, or per gram, or per mole of the
substance being measured. The corresponding susceptibilities are then $\chi$ e.m.u./cc, or $\chi'$ e.m.u./gm,
etc. The dimensionless susceptibility $\chi_m$ is related to these other quantities by $\chi_m = 4\pi\chi = 4\pi\rho\chi'$, etc.
($\rho$ is the density.)
It should also be noted that some authors use the symbol $\chi_m$ to denote susceptibility per mole.
In this text the subscripts $e$ and $m$ are used to distinguish between the dimensionless electric susceptibility $\chi_e$ and the dimensionless magnetic susceptibility $\chi_m$.

EXAMPLE 7.2

[Figure: coil wound on a soft iron core, with the dotted contour C passing through the core and the surrounding air]

Consider a coil wound on a soft iron core, as shown in the figure, and carrying a time-independent current $I$. This steady current causes a $\mathbf{B}_1$ field upward inside the coil and
thus induces a nearly uniform magnetization $\mathbf{M}$ upward in the iron core. The $\mathbf{M}$ distribution is equivalent to a solenoidal sheet of current in the cylindrical surface of the core,
in the same direction as $I$. Thus the $\mathbf{B}_2$ field caused by $\mathbf{M}$ enhances the $\mathbf{B}_1$ field caused by $I$;
the total field $\mathbf{B}$ is greater everywhere because of the presence of the iron core.
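Applying Ampere's circuital law (7.20) to the contour $C$, shown dotted in the figure,
one obtains

$$\oint_C \mathbf{H} \cdot d\mathbf{l} = NI = \int_{C_{air}} \mathbf{H} \cdot d\mathbf{l} + \int_{C_{iron}} \mathbf{H} \cdot d\mathbf{l}$$

in which $C_{air}$ and $C_{iron}$ are the parts of the contour $C$ which are outside and inside the
iron core, and $N$ is the total number of turns of the coil. If (7.30) applies in the iron, then

$$NI = \int_{C_{air}} \mu_0^{-1}\mathbf{B} \cdot d\mathbf{l} + \int_{C_{iron}} \mu^{-1}\mathbf{B} \cdot d\mathbf{l}$$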


With the length of the contour $C$ designated by $\mathcal{L}$, the average value of $B$ along $C$ is

$$B_{av} = \frac{1}{\mathcal{L}}\oint_C \mathbf{B} \cdot d\mathbf{l}$$

By virtue of this equation, the upper and lower bounds on $B_{av}$ are such that

$$\frac{NI}{\mu_0^{-1}\mathcal{L}} \leq B_{av} \leq \frac{NI}{\mu^{-1}\mathcal{L}}$$

The lower bound occurs with no iron core, and the upper bound occurs when the core
extends around to include the entire contour $C$. Since $\mu^{-1}$ is usually so much less than $\mu_0^{-1}$,
the range in $B_{av}$ can be considerable. Because $\mathbf{B}$ is continuous, it follows that $\mathbf{B}$ just outside the core is the same as $\mathbf{B}$ just inside the core. Therefore $B$ along $C_{air}$ keeps increasing
as $C_{iron}$ is made a larger and larger part of $C$. It is for this reason that iron cores in electrical
machines are designed to occupy as great a portion of the path as possible, leaving the
minimum air gap permissible under mechanical considerations. For a given current in the
coil wound on the iron core, a current-carrying moving conductor in the air gap then
experiences the greatest force through its interaction with this maximized B field. This
force provides the driving torque in a motor and the reaction torque in a generator.
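The effect of shrinking the air gap can be illustrated with a rough calculation. The sketch below makes simplifying assumptions that are not part of the example: $B$ is taken to have the same magnitude everywhere along $C$ and to be tangent to it, the iron is treated as linear with a constant relative permeability, and all numerical values are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # assumed classical SI value of mu_0, in H/m


def flux_density_along_contour(turns, current, l_air, l_iron, mu_r):
    """B from N*I = B*l_air/mu0 + B*l_iron/(mu_r*mu0), with B uniform along C."""
    return MU0 * turns * current / (l_air + l_iron / mu_r)


TURNS = 200         # hypothetical number of turns
CURRENT = 1.0       # hypothetical coil current, A
PATH_LENGTH = 0.5   # hypothetical total length of the contour C, m
MU_R = 2000.0       # hypothetical relative permeability of the iron

gaps = [0.5, 0.1, 0.01, 0.001, 0.0]  # air portion of the contour, m
fields = [
    flux_density_along_contour(TURNS, CURRENT, gap, PATH_LENGTH - gap, MU_R)
    for gap in gaps
]
for gap, b in zip(gaps, fields):
    print(f"air path {gap:6.3f} m  ->  B = {b:.4e} T")
```

The first entry reproduces the lower bound (no iron) and the last the upper bound (iron along the whole contour); the intermediate values rise steadily as the air gap is reduced.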

7.6

MEASUREMENT OF SUSCEPTIBILITY

The method used to measure magnetic susceptibility depends on whether or not the
relative permeability is close to unity. For diamagnetic and paramagnetic substances,
the commonly used techniques involve a determination of the force exerted on a specimen by a nonuniform field. This force is most easily expressed as a spatial derivative
of the stored energy.
The derivation in Section 5.8 of an expression for the magnetic energy stored in a
region of free space is still valid when magnetic materials are present, since $\mathbf{H}$ (like
$\mathbf{H}_{01}$) satisfies Ampere's circuital law and all other steps in the proof are identical. Thus
as a generalization of (5.79), one may write for the power being supplied to an entire
magnetic field in a volume $V$,

$$P = \int_V \mathbf{H} \cdot \frac{\partial\mathbf{B}}{\partial t}\,dV$$

If various subregions of $V$ contain materials which satisfy (7.30), then one may write

$$P = \int_V \mu^{-1}\mathbf{B} \cdot \frac{\partial\mathbf{B}}{\partial t}\,dV = \frac{\partial}{\partial t}\int_V \frac{1}{2}\mu^{-1}B^2\,dV$$

From this it follows that the magnetic stored energy is

$$W_m = \frac{1}{2}\int_V \mu^{-1}B^2\,dV \tag{7.33}$$

in which the permeability $\mu$ is not necessarily a constant throughout $V$.


Let this result be applied to the Gouy apparatus shown in Figure 7.2 in which a
slender magnetic specimen is suspended from a balance so as to hang between the pole
pieces of a magnet. If $V$ is taken large enough to encompass the magnet, all of its
sensible field, and the specimen, the magnetic stored energy may be written

$$W_m = \frac{1}{2}\int_V \mu^{-1}\left[\mathbf{B}(\xi,\eta,\zeta) + \delta\mathbf{B}(\xi,\eta,\zeta,z)\right]^2 d\xi\,d\eta\,d\zeta$$

in which $\mathbf{B}$ is the field distribution in the absence of the specimen, and $\delta\mathbf{B}$ is the additional field due to the presence of the specimen, with $z$ the vertical coordinate of some
reference point in the specimen (say the midpoint).

FIGURE 7.2 Gouy apparatus.

If $(V_0,\mu_0)$, $(V_1,\mu_1)$, and $(V_2,\mu_2)$ are the volume and permeability, respectively, of the
air region of $V$, the magnet region, and the specimen region, then

$$W_m = \frac{1}{2}\int_{V_0+V_2} \mu_0^{-1}(\mathbf{B} + \delta\mathbf{B})^2\,dV + \frac{1}{2}\int_{V_1} \mu_1^{-1}(\mathbf{B} + \delta\mathbf{B})^2\,dV + \frac{1}{2}\int_{V_2} (\mu_2^{-1} - \mu_0^{-1})(\mathbf{B} + \delta\mathbf{B})^2\,dV$$

in which the free-space integral has been added and subtracted over the volume $V_2$.
In the absence of a specimen, $V_0 + V_2$ is the air region of $V$, and with $\delta\mathbf{B} = 0$, the
first two integrals in the preceding expression give the total stored energy with no
specimen present. For a small diamagnetic or paramagnetic specimen ($\mu_2 \approx \mu_0$), $\delta\mathbf{B}$ is
negligibly small everywhere, and only the third integral of the preceding expression is a
sensible function of the position of the specimen. Therefore the force on the specimen is

$$\begin{aligned}
F_z = -\frac{dW_m}{dz} &= -\frac{d}{dz}\int_{V_2} \frac{1}{2}\left(\frac{\mu_2^{-1}}{\mu_0^{-1}} - 1\right)\mu_0^{-1}B^2\,dV \\
&\cong \frac{d}{dz}\int_{z-l/2}^{z+l/2} \frac{1}{2}\chi_m A\,\mu_0^{-1}B^2\,d\zeta = \frac{1}{2}\chi_m A\,\mu_0^{-1}(B_a^2 - B_b^2)
\end{aligned} \tag{7.34}$$

in which a thin cylindrical specimen of length $l$ and cross section $A$ has been assumed,
and $B_a$ and $B_b$ are the field intensities at the top and bottom of the specimen.


Typically, a specimen may be 10 to 15 cm long, so that the top end might be in a
fringe field of about 100 gauss when the bottom end is in a central field of 10,000 gauss;
$B_a$ may then be neglected in the above formula. The specimen is weighed with the
electromagnet not excited, and again with the field turned on. The difference in weight
is $F_z$; with $A$ and $B_b$ known, $\chi_m$ may be deduced. The method is applicable to solid
samples and also to liquids and powders placed in a glass tube.
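The reduction of a Gouy weighing to a susceptibility can be sketched as follows. This is a hedged illustration of (7.34), not a description of any particular instrument: the vertical coordinate $z$ is assumed to point upward, so that an apparent gain in mass corresponds to a negative $F_z$; the value of $g$, the cross section, and the sample susceptibility are hypothetical, and the field values are the 100 gauss and 10,000 gauss figures quoted above converted to teslas.

```python
import math

MU0 = 4e-7 * math.pi  # assumed classical SI value of mu_0, in H/m
G = 9.81              # assumed gravitational acceleration, m/s^2


def gouy_force(chi_m, area, b_top, b_bottom):
    """Vertical force F_z on the specimen according to (7.34)."""
    return 0.5 * chi_m * area * (b_top ** 2 - b_bottom ** 2) / MU0


def susceptibility_from_weighing(delta_mass, area, b_top, b_bottom):
    """Invert (7.34) for chi_m.

    delta_mass is the apparent gain in mass (kg) when the field is turned on;
    with z taken upward this corresponds to F_z = -delta_mass * G.
    """
    f_z = -delta_mass * G
    return 2.0 * MU0 * f_z / (area * (b_top ** 2 - b_bottom ** 2))


AREA = 1.0e-5    # hypothetical cross section, m^2
B_TOP = 0.01     # 100 gauss fringe field at the top, T
B_BOTTOM = 1.0   # 10,000 gauss central field at the bottom, T
CHI_TRUE = 2.0e-5  # hypothetical paramagnetic susceptibility

force = gouy_force(CHI_TRUE, AREA, B_TOP, B_BOTTOM)
apparent_mass_gain = -force / G
chi_recovered = susceptibility_from_weighing(apparent_mass_gain, AREA, B_TOP, B_BOTTOM)
print(force, apparent_mass_gain, chi_recovered)
```

With these numbers the force is a small fraction of a millinewton, which is why a sensitive balance is needed; a diamagnetic sample would show an apparent loss of mass instead.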
Instead of measuring directly the susceptibility of highly permeable materials, such
as iron, the B-H curve is often deduced by forming a specimen in the shape of a torus,
on which two windings are placed. A d.c. source in series with a set of standard resistors
is connected to the primary winding, and a ballistic galvanometer to the secondary
winding. As the resistors are shorted out one by one, the primary current increases in
steps, as does $H$ in the torus. An ammeter is used to measure the current, and Ampere's
circuital law to deduce H. As each resistor is shorted out, the impulse in the galvanometer
is used to determine B. If the resistors are removed sequentially, then reinserted one by
one, and if then the d.c. source is reversed and the process is repeated, a hysteresis loop
is determined. The incremental susceptibility may be deduced from the slope of the
hysteresis curve, and is normally not a constant.

7.7 DIAMAGNETISM
The diamagnetic effect in materials causes a slight negative contribution to susceptibility
and is attributable to alterations in electron orbital motion due to the presence of an
external magnetic field. While the external field is being established, the growth in its
flux density induces changes in electronic motion which in turn create a reaction field
opposed to the stimulus. When the external field reaches a steady value, so too does the
reaction field.
In terms of a classical model, this steady reaction field can be pictured as due to
circular orbital motion at constant speed of each electron in every atom of the material
specimen. Although this conception is somewhat crude when compared to a quantum
mechanical model, it greatly facilitates the calculation of magnetic moments and gives
results which agree with quantum deductions.
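Using the classical model, consider first the case of a single electron moving along a
circular path of radius $r$ at a constant speed $v$, as indicated by Figure 7.3. It will be
assumed that a uniform, steady field $\mathbf{B}_{loc}$ exists in the region, and is perpendicular to
the plane of the orbit, directed out of the paper in the figure. The electron experiences
radial forces due to (1) the electrostatic attraction of the nucleus, and (2) its interaction
with the magnetic field. Newton's force law yields

$$-\frac{e^2}{4\pi\epsilon_0 r^2}\,\mathbf{1}_r - e\,\mathbf{v} \times \mathbf{B}_{loc} = -\frac{mv^2}{r}\,\mathbf{1}_r \tag{7.35}$$

in which $m$ and $-e$ are the mass and charge of the electron, respectively. With the angular velocity $\boldsymbol{\omega}$ of the electron defined so that $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$ (cf. Section V.8), it follows that $\boldsymbol{\omega}$
is into the plane of the paper in Figure 7.3, and that (7.35) may be rewritten as

$$\boldsymbol{\omega} \times \mathbf{v} - \frac{e}{m}\,\mathbf{B}_{loc} \times \mathbf{v} = -\frac{e^2}{4\pi\epsilon_0 m r^3}\,\mathbf{r}$$

When both sides of this equation are crossed with $\mathbf{v}$, the result is

$$\left(\boldsymbol{\omega} \times \mathbf{v} - \frac{e}{m}\,\mathbf{B}_{loc} \times \mathbf{v}\right) \times \mathbf{v} = -\frac{e^2}{4\pi\epsilon_0 m r^3}\,\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})$$

which reduces to

$$\left(1 - \frac{e^2}{4\pi\epsilon_0\omega^2 m r^3}\right)\boldsymbol{\omega} = \frac{e}{m}\,\mathbf{B}_{loc} \tag{7.36}$$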

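If the angular velocity and orbital radius in the absence of a magnetic field are $\omega_0$
and $r_0$, respectively, then (7.36) yields for this case

$$\omega_0^2 = \frac{e^2}{4\pi\epsilon_0 m r_0^3} \tag{7.37}$$

Since $r_0$ is of the order $10^{-10}$ meters, (7.37) indicates that $\omega_0 \approx 10^{16}$ rad/sec. Values of
$B_{loc}$ normally do not exceed 100 webers/m² ($10^6$ gauss) and thus $eB_{loc}/m \lesssim 10^{13}$. Under
these circumstances, a study of (7.36) discloses that $\omega$ is close to $\omega_0$. If (7.36) is written
in the form

$$\frac{e}{m}\,\mathbf{B}_{loc} = \left(1 - \frac{\omega_0^2}{\omega^2}\right)\boldsymbol{\omega}$$

(with the orbital radius taken as essentially unchanged, $r \approx r_0$), and $\boldsymbol{\omega} = \boldsymbol{\omega}_0 + \delta\boldsymbol{\omega}$ is inserted, upon expansion the first-order result is

$$\delta\boldsymbol{\omega} = \frac{e}{2m}\,\mathbf{B}_{loc} \tag{7.38}$$

so that, to first order,

$$\boldsymbol{\omega} = \boldsymbol{\omega}_0 + \frac{e}{2m}\,\mathbf{B}_{loc} = \boldsymbol{\omega}_0 + \boldsymbol{\omega}_L \tag{7.39}$$

FIGURE 7.3 Electron in circular orbit about nucleus.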


The term $\boldsymbol{\omega}_L = e\mathbf{B}_{loc}/2m$ is called the Larmor angular frequency, and is the change in
angular velocity the electron experiences due to the presence of the magnetic field. It is
in the same direction as the magnetic field, and is independent of the sense of rotation of
the electron in the orbit. Although $\omega_L$ is small compared to $\omega_0$, it will soon be seen to
play a significant role in the phenomenon of diamagnetism.
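The orders of magnitude quoted for $\omega_0$ and $\omega_L$ can be reproduced with a few lines of arithmetic. The sketch below assumes rounded values of the electron charge, electron mass, and $\epsilon_0$, and uses the orbital radius and field strength mentioned in the text; the function names are introduced here for illustration.

```python
import math

E_CHARGE = 1.602e-19  # assumed magnitude of the electron charge, C
E_MASS = 9.109e-31    # assumed electron mass, kg
EPS0 = 8.854e-12      # assumed permittivity of free space, F/m


def unperturbed_angular_velocity(r0):
    """omega_0 from (7.37) for an orbit of radius r0 (meters)."""
    return math.sqrt(E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * E_MASS * r0 ** 3))


def larmor_angular_frequency(b_loc):
    """omega_L = e * B_loc / (2 m) from (7.39), with B_loc in webers/m^2."""
    return E_CHARGE * b_loc / (2.0 * E_MASS)


omega_0 = unperturbed_angular_velocity(1.0e-10)  # r0 of the order 1e-10 m
omega_l = larmor_angular_frequency(100.0)        # B_loc = 100 webers/m^2
print(f"omega_0 ~ {omega_0:.2e} rad/sec")
print(f"omega_L ~ {omega_l:.2e} rad/sec")
print(f"omega_L / omega_0 ~ {omega_l / omega_0:.1e}")
```

Even at this very large field the Larmor frequency is several orders of magnitude below $\omega_0$, which is what justifies the first-order expansion leading to (7.38).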
This electronic orbital motion corresponds to a charge $-e$ passing any point in the
orbit $\omega/2\pi$ times per second, or to an equivalent current $I = \omega e/2\pi$. The magnetic
moment caused by this orbital motion is therefore

$$\mathbf{m}_{orb} = -\frac{1}{2}\,e r^2\boldsymbol{\omega} \tag{7.40}$$

the minus sign occurring because the electronic motion in one direction corresponds to a
positive current flow in the opposite direction.
Since the angular momentum $\boldsymbol{\mathcal{L}}_{orb}$ of the electron about the nucleus is

$$\boldsymbol{\mathcal{L}}_{orb} = \mathbf{r} \times m\mathbf{v} = \mathbf{r} \times m(\boldsymbol{\omega} \times \mathbf{r})$$

it follows that $\boldsymbol{\mathcal{L}}_{orb} = mr^2\boldsymbol{\omega}$ and thus that

$$\mathbf{m}_{orb} = -\frac{e}{2m}\,\boldsymbol{\mathcal{L}}_{orb} \tag{7.41}$$

The orbital magnetic moment of an electron is therefore oppositely directed to its
angular momentum, these quantities being in the ratio $(-e/2m)$.
Consider now a material specimen which, in the absence of any applied magnetic
field, exhibits no magnetic effect whatsoever. One may conclude from this that the
orbital planes of all the electrons within the specimen have tilts which are completely
random, thus giving a net magnetic moment which is zero. When the field is applied,
one may consider the resulting effect in the following manner: Imagine that the orbital
motion of a particular electron is viewed in terms of its projection on a plane transverse
to $\mathbf{B}_{loc}$ and in terms of its projection on a plane parallel to $\mathbf{B}_{loc}$. Only the motion in the
transverse plane is affected by the field. Thus when one considers all the electrons in
the specimen, the projected orbits in planes parallel to $\mathbf{B}_{loc}$ are random and give a null
effect. The projected orbits in a plane perpendicular to $\mathbf{B}_{loc}$ are affected in an ordered
way and do give an effect.
Within a single atom of the material, if there is an electron whose projected orbital
motion in a plane transverse to $\mathbf{B}_{\mathrm{loc}}$ is counterclockwise at angular velocity
$$\omega_{+i} = \omega_{0i} + \frac{eB_{\mathrm{loc}}}{2m}$$
and radius $r_i$, it is equally likely that on the time average there will be another electron
whose motion is clockwise at angular velocity $\omega_{-i} = -\omega_{0i} + eB_{\mathrm{loc}}/2m$ and the same
radius. These electrons pair to give a net magnetic moment which, according to (7.40), is
$$m_i = -\tfrac{1}{2}\, e r_i^2 \left( \omega_{0i} + \frac{eB_{\mathrm{loc}}}{2m} \right) - \tfrac{1}{2}\, e r_i^2 \left( -\omega_{0i} + \frac{eB_{\mathrm{loc}}}{2m} \right) = -\frac{e^2 r_i^2 B_{\mathrm{loc}}}{2m} \tag{7.42}$$

If there are $N$ atoms/m$^3$ and $Z$ is the number of electrons per atom, the net magnetization is
$$\mathbf{M} = -\frac{N e^2 \mathbf{B}_{\mathrm{loc}}}{2m} \sum_{i=1}^{Z/2} r_i^2 \tag{7.43}$$


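If the XY plane is chosen to be perpendicular to $\mathbf{B}_{\mathrm{loc}}$, the mean square distance from the
field axis is
$$\overline{r_i^2} = \frac{2}{Z} \sum_{i=1}^{Z/2} r_i^2 = \frac{2}{Z} \sum_{i=1}^{Z/2} (x_i^2 + y_i^2) = \overline{x^2} + \overline{y^2}$$
If the atom is assumed to be spherical and to have a mean square radius
$\overline{r^2} = \overline{x^2} + \overline{y^2} + \overline{z^2}$, then since $\overline{x^2} = \overline{y^2} = \overline{z^2}$,
$$\overline{r_i^2} = \tfrac{2}{3}\, \overline{r^2}$$
and Equation (7.43) may be written
$$\mathbf{M} = -\frac{Z N e^2\, \overline{r^2}}{6m}\, \mathbf{B}_{\mathrm{loc}} \tag{7.44}$$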

In diamagnetic materials (i.e., in materials which exhibit this induced magnetization
but no other magnetic effects) the magnetization is so weak, and the coupling between
atoms so slight, that $B_{\mathrm{loc}} \approx B$ and $|\chi_m| \ll 1$. For such materials use of (7.28) gives
$$\chi_m = -\frac{\mu_0 Z N e^2\, \overline{r^2}}{6m} \tag{7.45}$$
Equation (7.45) is the Langevin expression for diamagnetic susceptibility. Although
derived on the basis of classical arguments, it is in agreement with the result obtained
by means of a quantum mechanical analysis.$^{28}$ If one assumes as representative values
$Z = 10$ and $N = 5 \times 10^{28}$ atoms/m$^3$, this expression predicts that $\chi_m$ will be of the
order of $10^{-5}$ in magnitude.
The susceptibility of diamagnetic materials may be determined experimentally by
measuring the force on a specimen when it is placed in a nonuniform magnetic field.$^{29}$
The results of such measurements for a variety of materials are given in Table 7.1. The
TABLE 7.1
SUSCEPTIBILITIES OF SOME DIAMAGNETIC MATERIALS
AT ROOM TEMPERATURE AND ATMOSPHERIC PRESSURE

Material        $\chi_m \times 10^5$      Material            $\chi_m \times 10^5$
Bismuth          -1.66          Selenium             -1.7
Copper           -0.95          Silicon              -0.3
Diamond          -2.2           Silver               -2.6
Graphite         -12            Sodium               -0.24
Germanium        -0.8           Aluminum oxide       -0.5
Gold             -3.6           Barium chloride      -2.0
Mercury          -3.2           Sodium chloride      -1.2

28 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., p. 577, John Wiley and Sons, Inc.,
New York, 1956.
29 Cf. Sec. 7.6 for a typical procedure. A survey of experimental methods may be found in L. F. Bates,
Modern Magnetism, 3rd ed., Cambridge University Press, London, 1951.


agreement between these data and the prediction drawn from Equation (7.45) is seen
to be quite good. The theory can be improved$^{30}$ by using more realistic assumptions in
calculating $\overline{r^2}$.
The similarities between the present discussion of diamagnetism and the earlier
treatment of electronic polarizability in dielectrics are striking (cf. Section 6.6). A
further parallel arises when one considers temperature effects. Like electronic polarizability, diamagnetism is also due to alterations in the relative positions and motions of
the parts of individual atoms caused by an external field. So long as the temperature
is not so excessive as to cause a change in electronic configuration, the diamagnetic
effect should be independent of temperature. This conclusion is supported by experiment.
EXAMPLE 7.3

A hydrogen atom in its ground state (1s) has a mean square orbital radius for its electron
given by
$$\overline{r^2} = 3a_0^2 = 3(0.529 \times 10^{-10})^2$$
in which $a_0$ is the Bohr radius expressed in meters. If the molal specific volume of H$_2$ gas
under standard conditions of pressure and temperature is taken to be 22.4 m$^3$/kg-mole,
then use of Avogadro's number yields the result that there are $2.68 \times 10^{25}$ hydrogen
molecules/m$^3$ under these conditions. If one assumes that each hydrogen atom in an
H$_2$ molecule behaves diamagnetically as though it were monatomic, then the value
$N = 5.36 \times 10^{25}$ atoms/m$^3$ may be used in formula (7.45) to compute the diamagnetic
susceptibility of hydrogen gas. When this is done, one obtains
$$\chi_m = -\frac{(1.6 \times 10^{-19})^2 (5.36 \times 10^{25}) \times 3 \times (0.529 \times 10^{-10})^2}{6 \times 9.1 \times 10^{-31} \times (4\pi \times 10^{-7})^{-1}} = -2.65 \times 10^{-9}$$
This figure is about $10^{-4}$ times the values listed in Table 7.1, but the difference can be
attributed to the fact that $Z = 1$ and to the fact that the atom density in a typical gas
is only about $10^{-3}$ of the value for a typical solid or liquid.
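
The arithmetic of Example 7.3 can be checked numerically. The following is a minimal sketch, assuming the rounded constants quoted in the example; the function name `langevin_chi` and the mean square radius used for the "representative solid" (one square angstrom) are illustrative assumptions and do not come from the text.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, henry/m
E_CHARGE = 1.6e-19          # electronic charge, coulomb (rounded as in Example 7.3)
E_MASS = 9.1e-31            # electronic mass, kg (rounded as in Example 7.3)
BOHR_RADIUS = 0.529e-10     # Bohr radius, m


def langevin_chi(n_atoms, z, mean_sq_radius):
    """Langevin diamagnetic susceptibility, Equation (7.45).

    n_atoms        -- atoms per cubic meter (N)
    z              -- electrons per atom (Z)
    mean_sq_radius -- mean square orbital radius in m^2
    """
    return -MU_0 * z * n_atoms * E_CHARGE**2 * mean_sq_radius / (6 * E_MASS)


# Hydrogen gas at standard conditions, as in Example 7.3.
chi_h2 = langevin_chi(n_atoms=5.36e25, z=1, mean_sq_radius=3 * BOHR_RADIUS**2)

# Representative solid: Z = 10, N = 5e28 atoms/m^3.
# The mean square radius of 1e-20 m^2 is an assumed, order-of-magnitude value.
chi_solid = langevin_chi(n_atoms=5e28, z=10, mean_sq_radius=1e-20)

print(f"hydrogen gas: {chi_h2:.2e}")
print(f"representative solid: {chi_solid:.1e}")
```

With these inputs the hydrogen figure reproduces the value quoted in the example, and the solid-state estimate falls in the range of the entries of Table 7.1.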

7.8 PERMANENT MAGNETIC MOMENTS

In the previous section diamagnetic effects were seen to arise in materials due to
alterations in electronic orbital motion induced by an external magnetic field. Thus
diamagnetism is present in all materials, and is in no way dependent on whether or
not the individual molecules possess permanent magnetic moments. However, the
explanation of all other magnetic phenomena (i.e., paramagnetism and the various forms
of ferromagnetism) rests on the assumption that the molecules of materials which
exhibit these phenomena do contain permanent magnetic moments. For this reason it
is desirable to review the origin of molecular magnetization in order to determine the
manner in which permanent moments can occur.
It will be assumed that the reader has some familiarity with quantum mechanics and
is acquainted with the solutions of Schrodinger's equation applicable to a hydrogen
atom.$^{31}$ These solutions are characterized by a set of quantum numbers having the
following properties:
30 C. Kittel, op. cit., p. 209.
31 An excellent introductory treatment is given by C. W. Sherwin, Introduction to Quantum Mechanics,
Holt, Rinehart and Winston, New York, 1960.


1. The principal quantum number $n$ determines the electron orbital energy. It can
have the positive integral values $n = 1, 2, 3, 4, \ldots$. The corresponding electronic
shells are called respectively the K, L, M, N, ... shells.
2. The orbital angular momentum of the electron is determined by the quantum
number $l$ which can assume the values $l = 0, 1, 2, \ldots, (n - 1)$. The corresponding
spectroscopic designations for these $l$ states are s, p, d, f, g, .... The value of the
orbital angular momentum is given by $\hbar[l(l + 1)]^{1/2}$, in which $\hbar = h/2\pi = 1.054 \times 10^{-34}$
joule sec is known as the reduced Planck's constant.
3. The azimuthal quantum number $m_l$ indicates the allowed components of orbital
angular momentum along a given direction (e.g., the direction of a magnetic field). It
may have the values $m_l = 0, \pm 1, \pm 2, \ldots, \pm l$.
4. In addition an electron is found to have spin, and the angular momentum associated with this spin has the allowed values $\pm\hbar/2$ along a magnetic field direction. For
this reason a fourth quantum number $m_s$ is introduced and is permitted the values $\pm\tfrac{1}{2}$.
5. The probability density distribution for the electron's position is governed in the
radial dimension by $n$ and $l$ and in the angular coordinates by $l$ and $m_l$.

Upon invoking the Pauli exclusion principle (which states that no two electrons of
the same atom may have the same four quantum numbers), one is able to extend these
solutions qualitatively to other elements than hydrogen and to explain the arrangement
of the elements in the periodic table.$^{32}$ If the quantum numbers for the ground state of
hydrogen are chosen to be $n = 1$, $l = 0$, $m_l = 0$, $m_s = +\tfrac{1}{2}$, then when the above development is extended to helium, which has two electrons, its ground state must be such that
the additional electron assumes the set of quantum numbers $n = 1$, $l = 0$, $m_l = 0$,
$m_s = -\tfrac{1}{2}$. This fills the K shell so that when lithium is considered the third electron in
its ground state must adopt the principal quantum number $n = 2$, thus initiating the
population of the L shell. In this manner a progression of sets of quantum numbers is
seen to accompany the additional electrons which go into the formation of the higher
elements. Table 7.2 indicates the numbers of electrons found in the various $n$ and $l$
states for the first 36 elements in the periodic table.
In the light of these relationships, the various contributions to a possible permanent
magnetic moment of a free atom or ion may be assessed as indicated in the following
sections.
1. Electron orbital motion. Equation (7.41) gives the relation between the orbital
angular momentum of an electron and its orbital magnetic moment. Though developed
through use of a classical argument, it is equally valid under a quantum mechanical
derivation. Thus the quantum numbers $l$ and $m_l$ govern the orbital contribution an
electron makes to the magnetic moment. For example, if $l = 2$, the total orbital angular
momentum is
$$\mathcal{L} = \hbar[l(l + 1)]^{1/2} = \sqrt{6}\,\hbar$$
and the allowed components of angular momentum along a field direction may be
deduced from Figure 7.4. If for convenience the field is assumed to be in the Z direction,
then it is evident that
$$\mathcal{L}_z = m_l \hbar \tag{7.46}$$

32 See, e.g., L. V. Azaroff and J. J. Brophy, Electronic Processes in Materials, Chap. 3, McGraw-Hill
Book Company, New York, 1963.


TABLE 7.2
THE ARRANGEMENT OF ELECTRONS IN THE FIRST 36 ELEMENTS

                      K       L             M                   N
                     n=1     n=2           n=3                 n=4
Atomic               l=0   l=0  l=1   l=0  l=1  l=2   l=0  l=1  l=2  l=3
number Z   Element   s     s    p     s    p    d     s    p    d    f
   1       H         1
   2       He        2
   3       Li        2     1
   4       Be        2     2
   5       B         2     2    1
   6       C         2     2    2
   7       N         2     2    3
   8       O         2     2    4
   9       F         2     2    5
  10       Ne        2     2    6
  11       Na        2     2    6     1
  12       Mg        2     2    6     2
  13       Al        2     2    6     2    1
  14       Si        2     2    6     2    2
  15       P         2     2    6     2    3
  16       S         2     2    6     2    4
  17       Cl        2     2    6     2    5
  18       A         2     2    6     2    6
  19       K         2     2    6     2    6          1
  20       Ca        2     2    6     2    6          2
  21       Sc        2     2    6     2    6    1     2
  22       Ti        2     2    6     2    6    2     2
  23       V         2     2    6     2    6    3     2
  24       Cr        2     2    6     2    6    5     1
  25       Mn        2     2    6     2    6    5     2
  26       Fe        2     2    6     2    6    6     2
  27       Co        2     2    6     2    6    7     2
  28       Ni        2     2    6     2    6    8     2
  29       Cu        2     2    6     2    6    10    1
  30       Zn        2     2    6     2    6    10    2
  31       Ga        2     2    6     2    6    10    2    1
  32       Ge        2     2    6     2    6    10    2    2
  33       As        2     2    6     2    6    10    2    3
  34       Se        2     2    6     2    6    10    2    4
  35       Br        2     2    6     2    6    10    2    5
  36       Kr        2     2    6     2    6    10    2    6


From (7.41), the allowed values of orbital magnetic moment along the field direction
are seen to be
$$\ldots,\ -2\,\frac{e\hbar}{2m},\ -\frac{e\hbar}{2m},\ 0,\ \frac{e\hbar}{2m},\ 2\,\frac{e\hbar}{2m},\ \ldots$$
The quantity $m_0 = e\hbar/2m = 9.27 \times 10^{-24}$ amp m$^2$ is commonly called a Bohr
magneton. The orbital magnetic moment of an electron along a field direction
may then be said to have the value $-m_l$ Bohr magnetons.

FIGURE 7.4 Allowed components of orbital angular momentum along a field direction.

If one recalls the discussion leading to the construction of the periodic table in the
form of Table 7.2, it is apparent that a completely filled electronic shell makes no net
contribution to the orbital magnetic moment of an atom, this being due to the fact that
electrons in the shell may be paired in terms of their equal and opposite $m_l$ quantum
numbers. A resultant orbital magnetic moment can occur only in atoms containing
incompletely filled electronic shells. Even then, the resultant for a large number of
similar atoms may be zero if the occurrence of a positive value for $m_l$ is just as frequent
as the occurrence of the corresponding negative value.
Among those elements containing incompletely filled shells particular interest
attaches to the transition elements 21 through 28 (the iron group). The partially filled
3d set of states for these elements indicates that individual atoms have a net orbital
magnetic moment, and this is confirmed by experiment. Despite this, the contributions these moments make to the magnetic properties of the iron group elements in the
solid state prove to be negligible. This result may be explained by noting that, in the
iron group, the partially filled shell lies near the outer boundary of the atom and thus
is strongly influenced by neighboring atoms. The interaction of adjacent atoms proves
so strong that the orbital magnetic moments of the individual atoms cannot orient


themselves in an external field. These moments are therefore "quenched" and their
immobility is similar to that of the rigid permanent electric dipole moments found in
some dielectrics.
This behavior may be contrasted with that of the transition elements 58 through 71
(the rare earths), whose incomplete shells lie deeper inside the atom, and are thus
shielded from the influence of neighboring atoms. The orbital magnetic moments of
these rare earth elements are not quenched and do make a contribution to the magnetic
susceptibility.
2. Electron spin. Since the spin angular momentum of an electron can be $\pm\hbar/2$
along a field direction, Equation (7.41) would suggest that the electron spin contributes
a magnetic moment of $\tfrac{1}{2}$ Bohr magneton in this direction. However, (7.41),
though applicable to orbital angular momentum, is not valid for spin. It has been
found that the proper relation is
$$\mathbf{m}_{\mathrm{spin}} = -2.0023\, \frac{e}{2m}\, \boldsymbol{\mathcal{L}}_{\mathrm{spin}} \approx -\frac{e}{m}\, \boldsymbol{\mathcal{L}}_{\mathrm{spin}} \tag{7.47}$$
Therefore the electron spin gives rise to approximately one Bohr magneton along the
direction (or opposite) of a field $\mathbf{B}$.
If one refers again to the discussion concerning the construction of the periodic table,
the Pauli exclusion principle is seen to require that a completely filled electronic shell
have equal numbers of electrons whose spins are aligned and counteraligned with a
magnetic field direction. For this reason only partially filled shells can make a net spin
contribution to the magnetic moment of an atom. Herein lies the explanation for the
strong magnetic behavior of the iron group of transition elements. Table 7.2 shows an
orderly filling of the K, L, and M shells for the first 18 elements. Beginning with
potassium, the 3d states are first passed over while the two 4s states are filled, and
then with scandium the population of the 3d states begins. There is a total of ten 3d
states ($m_l = 0, \pm 1, \pm 2$; $m_s = \pm\tfrac{1}{2}$) and these are filled in such a way that the first five
electrons have like spins; not until the sixth electron is added does any spin cancellation
occur.
This process is illustrated by Table 7.3 in which the spins of the 3d electrons are
schematically represented as either up or down. Two variances in the orderly progression of population may be noted: chromium and copper each add two electrons to the
3d level at the expense of one electron from the 4s level (cf. Table 7.2). The net number
of Bohr magnetons (in round figures) due to spin is noted in the last column for each of
the elements in this series. The free atom of manganese may be seen to have the
largest spin magnetic moment.
It should be emphasized that Table 7.3 refers to single free atoms of the transition
elements. The situation is more complicated in the solid state, since adjacent atoms can
exert a strong local effect on the orientation of the magnetic moments of their neighbors. Table 7.4 lists the net magnetic moment per atom in the solid state for several
of the iron group elements. The effect of neighboring atoms is seen to reduce the magnetic moment from the value which would occur with a single free atom.
3. Nuclear spin. Since the nucleus of an atom contains a net charge, it is also a
contributor to magnetic moment through the agency of nuclear spin. It is found that
the angular momentum associated with nuclear spin is of the same order of magnitude


as the angular momenta connected to electron spin and electron orbital motion. However, since the nuclear mass is three orders of magnitude larger than the mass of an
electron, the magnetic moment caused by nuclear spin is of the order of $10^{-3}$ Bohr
magnetons. For this reason the nuclear contribution to magnetic susceptibility is almost
invariably overshadowed by the electronic contributions.
TABLE 7.3
POPULATION OF THE 3d STATE FOR THE IRON GROUP TRANSITION ELEMENTS

Atomic                  Number of       Spin direction      Net spin magnetic moment
number    Element       3d electrons    of 3d electrons     in Bohr magnetons
21        Scandium      1               ↑                   1
22        Titanium      2               ↑↑                  2
23        Vanadium      3               ↑↑↑                 3
24        Chromium      5               ↑↑↑↑↑               4†
25        Manganese     5               ↑↑↑↑↑               5
26        Iron          6               ↑↑↑↑↑↓              4
27        Cobalt        7               ↑↑↑↑↑↓↓             3
28        Nickel        8               ↑↑↑↑↑↓↓↓            2
29        Copper        10              ↑↑↑↑↑↓↓↓↓↓          1†

† Including the contribution from the single 4s electron.

4. Spin-orbit coupling. If the nuclear contribution is neglected, the total magnetic
moment of a single atom or ion may be determined by adding the orbital and spin
contributions of the constituent electrons. An electron possessing the quantum numbers $m_l$ and $m_s$ will contribute essentially $m_l + 2m_s$ Bohr magnetons of magnetic
moment along an external field direction. The total magnetic moment along this direction may then be found by summing the individual values of $m_l + 2m_s$ associated with
the different electrons present in the atom. Filled shells make no net contribution, so
the summation can be confined to incompletely filled shells.

TABLE 7.4
MAGNETIC MOMENT PER ATOM IN THE SOLID STATE FOR IRON GROUP
TRANSITION ELEMENTS

Element                   Sc     Ti     Cr     Mn     Fe      Co      Ni
m in Bohr magnetons       ...    ...    0.4    1.1    2.22    1.71    0.61

The result of this computation may be related to the total angular momentum of the
electron cloud. If vectors $\boldsymbol{\mathcal{J}}$, $\boldsymbol{\mathcal{L}}$, $\boldsymbol{\mathcal{S}}$, of lengths $\hbar[J(J + 1)]^{1/2}$, $\hbar[L(L + 1)]^{1/2}$, $\hbar[S(S + 1)]^{1/2}$
are chosen to represent, respectively, the total angular momentum, the total orbital
angular momentum, and the total spin contribution to momentum of the electron

cloud, then
$$\boldsymbol{\mathcal{J}} = \boldsymbol{\mathcal{L}} + \boldsymbol{\mathcal{S}} \tag{7.48}$$
The quantities $L$, $S$, and $J$ are the magnitudes of the algebraic sums of the $m_l$ values,
the $m_s$ values, and the $m_l + m_s$ values for the individual electrons; i.e.,
$$L = \Big| \sum m_l \Big| \qquad S = \Big| \sum m_s \Big| \qquad J = \Big| \sum (m_l + m_s) \Big|$$
Which values of $m_l$ and $m_s$ the individual electrons of the atom are allowed to assume
is governed by the Pauli exclusion principle and the energy state of the atom. In general the allowed quantum numbers are such that $L$ and $S$ are not independent and
therefore it is proper to say that there is spin-orbit coupling. For a single isolated atom
or ion in the ground state, the values of $S$, $L$, and $J$ can be determined through application of Hund's rules:

1. The electron spins are so oriented as to yield the maximum value of $S$ consistent
with the Pauli exclusion principle.
2. The orbital angular momenta are arranged so as to maximize $L$ for the value of $S$
found above.
3. If a partially filled shell is less than half occupied, $J = L - S$; if it is more than
half occupied, $J = L + S$.

The first of these three rules is evident in the electronic configuration of Table 7.3.
The total angular momentum of a free atom or ion can assume only a discrete set of
orientations with respect to the direction of a field $\mathbf{B}$. If a construction identical with
that given in Figure 7.4 is used, the allowed orientations are such that the component
of the total angular momentum along the field direction is $\hbar$ multiplied by one of the
numbers
$$-J,\ -(J - 1),\ \ldots,\ (J - 1),\ J \tag{7.49}$$
The experimental discovery by Stern and Gerlach of this quantization in orientation
has been noted in Section 7.1.
The orbital contribution to magnetic moment for an individual electron is related to
the electron's orbital angular momentum by Equation (7.41). Similarly, the spin contribution which this electron makes to magnetic moment is related to its spin angular
momentum by Equation (7.47). When these relations are summed over all the electrons
comprising the cloud of the free atom or ion, one may write
$$\mathbf{m} = -g_J\, \frac{e}{2m}\, \boldsymbol{\mathcal{J}} \tag{7.50}$$
in which $\mathbf{m}$ is the magnetic moment for the entire free atom or ion, and $g_J$ is called the
Lande splitting factor. If each electron spin is weighted with a g factor of 2, in accordance with the approximate form of (7.47), and if each orbital contribution is weighted
with a g factor of 1, in accordance with (7.41), it develops that $g_J$ is given by the Lande
formula.$^{33}$
$$g_J = 1 + \frac{J(J + 1) + S(S + 1) - L(L + 1)}{2J(J + 1)} \tag{7.51}$$

33 A derivation may be found in G. Herzberg, Atomic Spectra and Atomic Structure, 2d ed., pp. 109-111,
Dover Publications, Inc., New York, 1944.


The name splitting factor for $g_J$ arises from the fact that the presence of a magnetic
field causes the potential energy to depend on orientation of the magnetic moment,
according to the relation $U = -\mathbf{m} \cdot \mathbf{B}$ (cf. Example 4.7). With the aid of (7.50), this
gives
$$U = -g_J\, \frac{e\hbar}{2m}\, pB = -g_J\, p\, m_0 B \tag{7.52}$$
in which $p$ is one of the quantum numbers in the sequence (7.49). Therefore successive
allowed orientations of the magnetic moment with respect to the external field direction have energy levels which differ by $g_J$ Bohr magnetons per gauss. This difference
in energy levels is evident in the spectrum (Zeeman effect) and the splitting of the
spectral lines is the cause for the naming of $g_J$.

In a crystalline solid the coupling of adjacent atoms modifies the magnetic moment
per atom and the quantization of orientation. The splitting of spectral lines is then
denoted by the more general spectroscopic splitting factor g. The Lande factor gJ
defined above may be looked upon as the spectroscopic splitting factor for the special
case of an isolated atom or ion.
EXAMPLE 7.4

Consider the Fe$^{2+}$ ion which, according to Table 7.2, has six electrons in the 3d shell and
an empty 4s shell. According to the first of Hund's rules, five of the 3d electrons will align
their spins, the sixth will oppose, and thus $S = 2$. The five aligned electrons must take the
$m_l$ values $-2, -1, 0, +1, +2$, respectively, but the sixth electron may take any of these
values. The second of Hund's rules then gives $L = 2$. Since the 3d shell is more than half
occupied, the third of Hund's rules gives $J = 4$. Use of (7.51) yields the information that
the Lande splitting factor is $g_J = 1.5$ for an Fe$^{2+}$ ion.
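
The procedure of Example 7.4 can be mechanized. The sketch below is one way of encoding Hund's three rules for a single partially filled shell and then applying the Lande formula (7.51); the function names `hund_ground_state` and `lande_g` are illustrative choices, and exact fractions are used so that half-integral quantum numbers carry no rounding error.

```python
from fractions import Fraction


def hund_ground_state(l, n):
    """Return (S, L, J) for n electrons in one shell of orbital quantum number l."""
    n_orbitals = 2 * l + 1
    if not 0 <= n <= 2 * n_orbitals:
        raise ValueError("n must lie between 0 and 2(2l + 1)")
    # Rule 1: align as many spins as the exclusion principle allows.
    n_up = min(n, n_orbitals)
    n_down = n - n_up
    S = Fraction(n_up - n_down, 2)
    # Rule 2: for each spin direction occupy m_l = l, l - 1, ... in turn.
    m_values = list(range(l, -l - 1, -1))
    L = abs(sum(m_values[:n_up]) + sum(m_values[:n_down]))
    # Rule 3: J = L - S below half filling, J = L + S otherwise.
    J = abs(L - S) if n < n_orbitals else L + S
    return S, L, J


def lande_g(S, L, J):
    """Lande splitting factor, Equation (7.51); None when J = 0 (no moment)."""
    if J == 0:
        return None
    numerator = Fraction(J * (J + 1) + S * (S + 1) - L * (L + 1))
    return 1 + numerator / (2 * J * (J + 1))


# Fe2+ has six electrons in the 3d shell (l = 2).
S, L, J = hund_ground_state(l=2, n=6)
g_fe = lande_g(S, L, J)
print(S, L, J, g_fe)
```

For the six 3d electrons of Fe$^{2+}$ this gives $S = 2$, $L = 2$, $J = 4$, and $g_J = 3/2$, in agreement with the example. The sketch treats only a single open shell and is not meant to cover atoms with several partially filled shells.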

7.9 PARAMAGNETISM
The paramagnetic effect occurs in certain materials whose individual atoms possess
permanent magnetic moments which are randomly oriented in the absence of an
external magnetic field. The presence of an external field causes some net alignment
of these magnetic moments with the field direction, and thus some magnetization of
the specimen. Under normal circumstances this magnetization is slight and the paramagnetic substance exhibits only a small positive susceptibility.
The discussion of the preceding section indicates that permanent atomic magnetic
moments occur only in atoms or ions whose electronic configuration includes one or more
incompletely filled shells. For this reason paramagnetic materials are found among
those compounds containing transition group elements. Of these the iron group
compounds and the rare earth group compounds have received the most extensive
investigation.
An analysis of the behavior of paramagnetic materials can be undertaken by considering a specimen containing $N$ atoms per unit volume, the total angular momentum
quantum number of each atom being $J$ (cf. Section 7.8). In accordance with (7.50)
there is a magnetic moment per atom $\mathbf{m}$, associated with this angular momentum,
whose magnitude is
$$m = g_J [J(J + 1)]^{1/2}\, m_0 = p_{\mathrm{eff}}\, m_0 \tag{7.53}$$
in which $p_{\mathrm{eff}}$ is the effective number of Bohr magnetons per atom.


If these atomic magnetic moments were assumed to be freely rotating, the classical
Langevin theory would be applicable, with the development paralleling what has
been presented in Section 6.8 for orientational polarization in dielectric materials. From
Example 4.7 the magnetic potential energy of an atom would be $-\mathbf{m} \cdot \mathbf{B}_{\mathrm{loc}}$. For paramagnetic materials this could be written $-\mathbf{m} \cdot \mathbf{B}$, since the local interactions are weak
and $\mathbf{B}_{\mathrm{loc}}$ is essentially equal to the macroscopic field $\mathbf{B}$. The net magnetization per unit
volume would then be
$$M = N p_{\mathrm{eff}}\, m_0\, L(a) \tag{7.54}$$
with $L(a) = \coth a - 1/a$ the Langevin function, and $a = p_{\mathrm{eff}}\, m_0 B/kT$.
However, as noted in Section 7.8, the atoms are not freely rotating but are restricted
to a discrete set of orientations relative to the direction of the applied magnetic field.
The possible components of magnetic moment along the field direction are given by
$p\, g_J m_0$ in which $p$ has one of the values listed in the sequence (7.49).
The magnetic potential energy of an atom when it is in one of these positions is then
given by (7.52) and the relative probability of finding the atom so oriented is
$\exp(p\, g_J m_0 B/kT)$. Summing over the probability distribution of $N$ atoms per unit volume
gives for the magnetization
$$M = N\, \frac{\sum_{p=-J}^{J} p\, g_J m_0\, e^{p g_J m_0 B/kT}}{\sum_{p=-J}^{J} e^{p g_J m_0 B/kT}} \tag{7.55}$$
Since both numerator and denominator of (7.55) are geometric and/or arithmetic
progressions, they can be summed to give
$$M = N g_J J m_0\, \mathcal{B}_J(x) \tag{7.56}$$
in which
$$\mathcal{B}_J(x) = \frac{2J + 1}{2J} \coth\left[ \frac{(2J + 1)x}{2J} \right] - \frac{1}{2J} \coth\left( \frac{x}{2J} \right) \tag{7.57}$$
is called the Brillouin function, with $x = g_J J m_0 B/kT$.


For large $J$ the Brillouin function is plotted in Figure 7.5, and is seen to be essentially
equivalent to the Langevin function (Figure 6.9). The magnetization formula (7.56)
then reduces to the classical expression (7.54).
A study of Figure 7.5 reveals that as $x$ becomes large, $\mathcal{B}_J(x)$ tends to a limit of
unity and therefore the magnetization density approaches the saturation value
$M_{\mathrm{sat}} = N p_{\mathrm{eff}}\, m_0$. One interpretation which may be placed on this result is that, at a
constant temperature, as $B$ is increased, a greater percentage of the atomic magnetic
moments spend a greater percentage of the time being aligned with the field. The
saturation magnetization corresponds to total alignment of all atomic magnetic moments. The formulas also indicate that alignment is easier as the temperature is lowered.
Even for small $J$, normal temperatures and field values imply that $x \ll 1$ and the
Brillouin function (7.57) then reduces to $\mathcal{B}_J(x) = (J + 1)x/3J$. (See Problem 7.10
at the end of this chapter.) The magnetization, as given by (7.56), simplifies to
$$M = \frac{N g_J^2 J(J + 1)\, m_0^2 B}{3kT} = \frac{N p_{\mathrm{eff}}^2\, m_0^2 B}{3kT} \tag{7.58}$$
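
The limiting behavior claimed for the Brillouin function (7.57) can be examined numerically. The following is a minimal sketch; the function names `brillouin` and `langevin` are illustrative, and the hyperbolic cotangent is written as `1 / math.tanh(...)` to avoid overflow at large arguments.

```python
import math


def brillouin(J, x):
    """Brillouin function of Equation (7.57) for quantum number J and argument x."""
    if x == 0:
        return 0.0
    a = (2 * J + 1) / (2 * J)
    b = 1 / (2 * J)
    return a / math.tanh(a * x) - b / math.tanh(b * x)


def langevin(a):
    """Classical Langevin function coth(a) - 1/a."""
    if a == 0:
        return 0.0
    return 1 / math.tanh(a) - 1 / a


# Small-x limit: B_J(x) is close to (J + 1) x / 3J.
J, x = 1.5, 1e-3
print(brillouin(J, x), (J + 1) * x / (3 * J))

# Large-x limit: B_J(x) tends to unity (saturation).
print(brillouin(0.5, 50.0))

# Large-J limit: B_J(x) approaches the Langevin function.
print(brillouin(1000, 2.0), langevin(2.0))
```

For $J = \tfrac{1}{2}$ expression (7.57) collapses to $\tanh x$, which offers a convenient independent check on the implementation.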


This is also the classical limit of (7.54) when $a$ is small, and thus the quantum and classical theories yield the same expression for magnetization under normal circumstances.
Using (7.28), and recognizing that the susceptibility of paramagnetic materials is
small, one may write
$$\chi_m = \frac{\mu_0 M}{B} = \frac{N \mu_0\, g_J J m_0}{B}\, \mathcal{B}_J(x) \tag{7.59}$$
For $x \ll 1$, this reduces to
$$\chi_m = \frac{N \mu_0\, p_{\mathrm{eff}}^2\, m_0^2}{3kT} = \frac{C}{T} \tag{7.60}$$
in which $C$ is called the Curie constant; Equation (7.60) is known as the Curie law of
paramagnetism. It may be contrasted to (6.57), in which the orientational contribution
to dielectric susceptibility was seen to be $N/3kT\epsilon_0$ times the square of the permanent
electric dipole moment per atom. A good experimental verification of the Curie law
of paramagnetism is offered in Figure 7.6.†

FIGURE 7.5 The Brillouin function $\mathcal{B}_J(x)$ for large $J$.

An estimate of the size of the Curie constant may be obtained by taking $N \approx 4 \times 10^{28}$
atoms/m$^3$ and $p_{\mathrm{eff}} = 2$, in which case $C \approx \tfrac{1}{2}$ °K. This means that at room temperature
one would expect paramagnetic materials to have a susceptibility of the order of $2 \times 10^{-3}$.
This prediction of the theory is borne out by the experimental data given in Table 7.5.
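
The size of the Curie constant can be estimated directly from (7.60). The sketch below assumes rounded values for the physical constants and takes room temperature to be 300 °K; both choices, and the function name `curie_constant`, are assumptions made for illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, henry/m
K_B = 1.38e-23              # Boltzmann's constant, joule/deg K
BOHR_MAGNETON = 9.27e-24    # m_0, amp m^2


def curie_constant(n_atoms, p_eff):
    """Curie constant C = N mu_0 p_eff^2 m_0^2 / 3k, from Equation (7.60)."""
    return MU_0 * n_atoms * (p_eff * BOHR_MAGNETON) ** 2 / (3 * K_B)


C = curie_constant(n_atoms=4e28, p_eff=2.0)
chi_room = C / 300.0        # Curie law, with room temperature assumed to be 300 deg K

print(f"C = {C:.2f} deg K, chi_m = {chi_room:.1e}")
```

With these rounded inputs the Curie constant comes out at a few tenths of a degree Kelvin and the room temperature susceptibility at a little over $10^{-3}$, the same order of magnitude as the entries of Table 7.5.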
In measuring the susceptibility of a paramagnetic substance, one must accept the
fact that there is present a diamagnetic contribution. However, the diamagnetic and
paramagnetic contributions are additive, and the results of Section 7.7 indicate that
the diamagnetic contribution is normally two orders of magnitude below the paramagnetic contribution at room temperature and thus can be ignored.

† Miss Hupse's original results were expressed in terms of $\chi'$, the susceptibility in e.m.u. per gram.
The dimensionless susceptibility $\chi_m$ has been determined using the relation $\chi_m = 4\pi\rho\chi'$, with $\rho$ the
density in gm/cc. In the temperature range shown, $\rho$ varies but little.

FIGURE 7.6 Reciprocal paramagnetic susceptibility of powdered CuSO$_4$ · K$_2$SO$_4$ · 6H$_2$O
vs. temperature $T$ (°K). [After J. C. Hupse, Physica, 9, 633; July 1942.]

Experimental verification of the magnetization formula (7.56) has also been achieved
by holding a variety of paramagnetic materials at a low temperature and varying the
applied field. A clear illustration of the saturation effect in $\mathcal{B}_J(x)$ is found in Figure 7.7.
Agreement between theory and experiment is seen to be excellent.
TABLE 7.5
THE ROOM TEMPERATURE SUSCEPTIBILITIES
OF SOME PARAMAGNETIC MATERIALS

Material           $10^3 \chi_m$      Material       $10^3 \chi_m$
FeSO$_4$              2.8         Fe$_2$O$_3$          1.4
NiSO$_4$              1.2         Cr$_2$O$_3$          1.7
MnSO$_4$              3.6         FeCl$_2$           3.7
CoSO$_4$ · H$_2$O        2.0         CrCl$_3$           1.5

With the aid of Hund's rules and the Lande formula for $g_J$, the effective number of
Bohr magnetons can be calculated for various particles through use of (7.53). When
this is done for the trivalent positive rare earth ions, results such as those shown in
Table 7.6 are obtained. Representative experimental values are also listed, and the
agreement is seen to be quite good.

FIGURE 7.7 Magnetic moment vs. $B/T$ (webers/m$^2$/°K) for gadolinium sulfate octahydrate, for ferric ammonium alum, and for potassium chromium alum, measured at 1.30, 2.00, 3.00, and 4.21 °K and compared with the Brillouin function. [After W. E. Henry, Phys. Rev., 88, 559; 1952.]

When this same procedure is applied to the iron group ions, no such favorable comparison with experiment is obtained, as is evident by comparing the third and fifth
columns of Table 7.7. However, if one assumes that the orbital momentum of these
ions is "quenched" by strong neighbor interactions (i.e., the orbital part of the magnetic moment is unable to respond to an external field), this is tantamount to setting
$L = 0$, letting $J = S$ and $g_J = 2$. When $p_{\mathrm{eff}}$ is calculated on this basis, the result is as
given in the fourth column of Table 7.7. These calculations are seen to be in much
better agreement with experiment.

TABLE 7.6
THE EFFECTIVE MOMENT IN BOHR MAGNETONS FOR VARIOUS
TRIVALENT RARE EARTH IONS AT ROOM TEMPERATURE

Ion          Configuration          $p_{\mathrm{eff}} = g_J[J(J + 1)]^{1/2}$      $p_{\mathrm{eff}}$(exp)
Ce$^{3+}$       4f$^{1}$5s$^2$p$^6$             2.54                       2.4
Nd$^{3+}$       4f$^{3}$5s$^2$p$^6$             3.62                       3.5
Gd$^{3+}$       4f$^{7}$5s$^2$p$^6$             7.94                       8.0
Dy$^{3+}$       4f$^{9}$5s$^2$p$^6$             10.63                      10.6
Er$^{3+}$       4f$^{11}$5s$^2$p$^6$            9.59                       9.5
Yb$^{3+}$       4f$^{13}$5s$^2$p$^6$            4.54                       4.5

This difference in behavior of the rare earth and iron group salts is reasonable when
one considers the relative positions of the incomplete shells. In the rare earth compounds the 4f shell is responsible for the paramagnetic effect, and this shell is well
shielded from neighboring atoms by the completed outer 5s and 5p shells. However,
in the iron group ions, the 3d shell is responsible for the paramagnetism; this is the
outermost shell and thus exposed to the intense local fields caused by neighboring
atoms. These local fields are responsible for the quenching of the orbital contribution
to magnetic moment.

TABLE 7.7
THE EFFECTIVE MOMENT IN BOHR MAGNETONS FOR VARIOUS IRON GROUP IONS

Ion                  Configuration    $p_{\mathrm{eff}} = g_J[J(J + 1)]^{1/2}$    $p_{\mathrm{eff}} = 2[S(S + 1)]^{1/2}$    $p_{\mathrm{eff}}$(exp)†
Ti$^{3+}$, V$^{4+}$         3d$^1$              1.55                     1.73                    1.8
V$^{3+}$                3d$^2$              1.63                     2.83                    2.8
Cr$^{3+}$, V$^{2+}$         3d$^3$              0.77                     3.87                    3.8
Mn$^{3+}$, Cr$^{2+}$        3d$^4$              0                        4.90                    4.9
Fe$^{3+}$, Mn$^{2+}$        3d$^5$              5.92                     5.92                    5.9
Fe$^{2+}$               3d$^6$              6.70                     4.90                    5.4
Co$^{2+}$               3d$^7$              6.54                     3.87                    4.8
Ni$^{2+}$               3d$^8$              5.59                     2.83                    3.2
Cu$^{2+}$               3d$^9$              3.55                     1.73                    1.9

† Representative values
EXAMPLE 7.5

For the Cr$^{3+}$ ion, which has three electrons in the 3d shell, Hund's rules give $S = \tfrac{3}{2}$ and $L = 3$. Since this shell is less than half-occupied, $J = L - S = \tfrac{3}{2}$. Use of the Landé formula yields $g_J = 0.4$. From this, it follows that $g_J[J(J+1)]^{1/2} = 0.77$ and $2[S(S+1)]^{1/2} = 3.87$, which are two of the entries in Table 7.7. The low value of calculated $g_J$, namely 0.4, is seen to be at wide variance with the value
$$\frac{p_{\text{eff}}(\text{exp})}{[S(S+1)]^{1/2}} = 1.97$$
calculated through use of the experimental data and the assumption that orbital angular momentum is quenched.
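
The arithmetic of Example 7.5 can be checked with a short sketch. It assumes the standard form of the Landé formula, $g_J = 1 + [J(J+1) + S(S+1) - L(L+1)]/[2J(J+1)]$, which is not written out in this section; the function name `lande_g` is illustrative only.

```python
from fractions import Fraction
from math import sqrt

def lande_g(S, L, J):
    """Lande g-factor in its standard form (assumed here); J must be nonzero."""
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

# Cr3+ (3d^3): Hund's rules give S = 3/2, L = 3; shell less than half full, so J = L - S
S, L = Fraction(3, 2), 3
J = abs(L - S)
g_J = lande_g(S, L, J)

# Effective moment with and without the orbital contribution
p_eff_with_orbit = float(g_J) * sqrt(float(J * (J + 1)))
p_eff_spin_only = 2 * sqrt(float(S * (S + 1)))

print(g_J, round(p_eff_with_orbit, 2), round(p_eff_spin_only, 2))
```

The spin-only figure lies close to the representative experimental value in Table 7.7, which is the point of the example.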
EXAMPLE 7.6

Since the limiting value of the Brillouin function is unity, Equation (7.56) indicates that the saturation magnetization for a paramagnetic substance is
$$M_{\text{sat}} = N g_J J m_0$$
Under general conditions of temperature and applied field, the fractional magnetization is then
$$\frac{M(x)}{M_{\text{sat}}} = \mathcal{B}_J(x)$$
At normal field intensities and temperatures, this reduces to
$$\frac{M(x)}{M_{\text{sat}}} = \frac{J+1}{3J}\,x = \frac{[(J+1)/J]^{1/2}\,p_{\text{eff}}\,m_0 B}{3kT}$$
$M(x)/M_{\text{sat}}$ may be interpreted as the fraction of atoms which have their magnetic moments aligned with the B field. At room temperature, for the representative value $[(J+1)/J]^{1/2}\,p_{\text{eff}} = 2$, the above equation gives
$$\frac{M(x)}{M_{\text{sat}}} = 1.5 \times 10^{-3}\,B$$
For the weak field $B = 10^{-6}$ webers/m$^2$ ($10^{-2}$ gauss), only one atom in $10^{9}$ is seen to be aligned. Even for the strong field $B = 1$ weber/m$^2$ ($10^{4}$ gauss), only about one atom in a thousand is aligned. This result for paramagnetic materials will be seen to be in sharp contrast with the behavior of ferromagnetic substances, all of whose magnetic moments can be aligned at room temperature.
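
The numerical coefficient in Example 7.6 can be reproduced with a minimal sketch. The room temperature of 300 °K and the rounded constants are assumptions of this sketch, and `aligned_fraction` is a hypothetical helper name.

```python
# Rounded physical constants (SI); values assumed for this sketch
M0 = 9.274e-24      # Bohr magneton, amp-m^2
K_B = 1.381e-23     # Boltzmann constant, joule/deg
T_ROOM = 300.0      # assumed room temperature, deg K

def aligned_fraction(B, moment_factor=2.0, T=T_ROOM):
    """Small-x fractional magnetization M/M_sat.

    moment_factor stands for [(J+1)/J]^(1/2) * p_eff; the representative
    value 2 is the one used in Example 7.6.
    """
    return moment_factor * M0 * B / (3 * K_B * T)

weak = aligned_fraction(1e-6)    # B = 1e-6 weber/m^2
strong = aligned_fraction(1.0)   # B = 1 weber/m^2
print(weak, strong)
```

The linear form holds only while the argument of the Brillouin function is small, which remains true even at 1 weber/m$^2$ at room temperature.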

7.10 PROPERTIES OF FERROMAGNETIC MATERIALS

A substance is said to be ferromagnetic if it can exhibit a spontaneous magnetic moment, that is, a magnetic moment even in the absence of an applied field. For a given ferromagnetic material this spontaneous magnetization can occur only below a critical temperature $T_c$, called the ferromagnetic Curie temperature. Well above the Curie temperature, such materials are found to behave paramagnetically, and to have a well-defined susceptibility which follows the Curie-Weiss law, namely,
$$\chi_m = \frac{C}{T - \theta} \tag{7.61}$$
in which $C$ is the Curie constant and $\theta$ is called the paramagnetic Curie temperature. $\theta$ is usually somewhat higher than $T_c$; for the five elements which display ferromagnetism, the comparison of these Curie temperatures is given in Table 7.8. Ferromagnetic alloys and compounds also conform to this pattern, as do some compounds which do not even contain ferromagnetic elements.
TABLE 7.8
CURIE TEMPERATURES OF FERROMAGNETIC ELEMENTS

            Fe      Co      Ni      Gd      Dy
T_c (°K)    1043    1393    631     289     105
θ (°K)      1093    1428    650


A typical plot relating $\chi_m^{-1}$ to temperature above the Curie point for a ferromagnetic material is shown in Figure 7.8. Comparison with Figure 7.6, which typifies a paramagnetic material, reveals a similarity in slope, and comparable values of $\chi_m$ for a given temperature level. The principal difference in behavior between the two types of material is that, for a truly paramagnetic material, $\theta = 0$; also, for a ferromagnetic material, as $\theta$ is approached from above, $\chi_m^{-1}$ becomes a nonlinear function of temperature. Stoner$^{34}$ has offered an explanation of this nonlinear behavior in the neighborhood of the Curie point.

FIGURE 7.8 Reciprocal susceptibility of nickel vs. temperature, T (°K). [After P. Weiss and R. Forrer, Ann Phys. (Paris), 5, 153; 1926.]

34 E. C. Stoner, Proc Leeds Phil Lit Soc, 3, 457; 1938.


Below the Curie temperature $T_c$ ferromagnetic materials show a marked increase in susceptibility, and also have a B-H characteristic which displays the familiar hysteresis effect. This behavior is indicated by Figure 7.9 and can be explained in the following way: Beginning with a virgin specimen ($B = H = 0$), if $H$ is increased (through application of a current in a coil wrapped around the specimen, for example), $B$ at first increases reversibly, and the curve oa is traced out. Further slight increases in $H$ cause large increases in $B$, until at position c on the curve saturation has set in. When position d is reached, further increases in $H$ cause negligible change in $B$, and a saturation field $B_{\text{sat}}$ is clearly delineated. If now $H$ is decreased, the curve de results, and a remanent field $B_r$ is observed with $H = 0$. Since there is no longer an external excitation, the specimen has become spontaneously magnetized. From (7.15) the magnetization at this point on the curve is $M = \mu_0^{-1} B_r$.
Reversal of the external current will cause a negative $H$ field, resulting in further decrease of $B$, and producing the curve segment ef. At a value $-H_c$, called the coercive force, the $B$ field is reduced to zero. Further decrease of $H$ creates the segment fg, with reverse saturation of the $B$ field occurring at g. Another reversal of $H$ traces out gijd, the second half of the cycle, this being an inverse image of defg.

FIGURE 7.9 B-H hysteresis loop of a ferromagnetic specimen.

The incremental permeability of the specimen may be defined as the slope of the B-H curve, and is obviously a function of $B$ and the prior history of the specimen. This incremental permeability may be extremely large, values of relative permeability as high as $10^5$ being not uncommon. The initial permeability of the specimen is defined as the slope of the virgin curve oab at the origin.
The coercive force varies widely from one ferromagnetic material to another and is a property of great practical importance. In permanent magnet materials it should be high, whereas in transformer materials it should be as low as possible. The wide range of $H_c$ is evident from a study of Table 7.9, in which the principal properties of many ferromagnetic materials are listed.
TABLE 7.9
DATA FOR FERROMAGNETIC MATERIALS

High-permeability materials

Material            Percent composition                  Maximum relative    Saturation flux density    Coercive force
                                                         permeability        B_sat (webers/m²)          H_c (amp/m)
Iron                99.91 Fe                             5,000               2.15                       80
Purified iron       99.95 Fe                             180,000             2.15                       4
Cold rolled steel   98.5 Fe                              2,000               2.10                       145
78 Permalloy        21.2 Fe, 78.5 Ni, 0.3 Mn             100,000             1.07                       4
Mu metal            18 Fe, 75 Ni, 2 Cr, 5 Cu             100,000             0.65                       4
Supermalloy         15.7 Fe, 79 Ni, 5 Mo, 0.3 Mn         800,000             0.80                       0.16

Permanent-magnet materials

Material              Percent composition                     Remanent flux density    Coercive force
                                                              B_r (webers/m²)          H_c (amp/m)
Carbon steel          98.1 Fe, 1 Mn, 0.9 C                    1.0                      4,000
Tungsten steel        94 Fe, 5 W, 0.3 Mn, 0.7 C               1.03                     5,600
Remalloy              71 Fe, 17 Mo, 12 Co                     1.05                     20,000
Alnico II (sintered)  64.5 Fe, 10 Al, 17 Ni, 2.5 Co, 6 Cu     0.69                     41,600
Alnico V              53 Fe, 8 Al, 14 Ni, 24 Co, 3 Cu         1.25                     44,000
Platinum-Cobalt       77 Pt, 23 Co                            0.45                     208,000

7.11 THE WEISS THEORY OF FERROMAGNETISM

Inspection of Table 7.9 indicates that the remanent flux density $B_r$ for permanent magnet materials is typically about 1 weber/m$^2$. Since $H = 0$ at this point on the hysteresis curve, (7.15) gives $M = \mu_0^{-1} B \approx 10^6$ amp/m. If each atom in the material has a magnetic moment of the order of one Bohr magneton, then the atom density is $N = M/m_0 \approx 10^{29}$ atoms/m$^3$, if all the atomic magnetic moments are aligned. Since this is the right order of magnitude for the atom density of solid materials, it follows that such a high remanent flux density can be explained only by assuming a spontaneous magnetization in which essentially all the atoms of the material are so oriented that their magnetic moments are parallel. This behavior in the absence of any external magnetic field whatsoever is radically different from that which occurs in paramagnetic materials. In Example 7.6 it was observed that even with so strong an external field as 1 weber/m$^2$ (10,000 gauss), only one atom in a thousand within a paramagnetic material has its magnetic moment aligned with the field.
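
The order-of-magnitude estimate of the atom density given above can be sketched as follows. The rounded constants and the choice of exactly one Bohr magneton per atom are assumptions of this sketch.

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space, henry/m
M0 = 9.274e-24             # Bohr magneton, amp-m^2 (rounded)

B_r = 1.0                  # typical remanent flux density, weber/m^2
M = B_r / MU0              # magnetization with H = 0, amp/m
N = M / M0                 # atoms/m^3 if each atom carries one Bohr magneton

print(f"M = {M:.2e} amp/m, N = {N:.2e} atoms/m^3")
```

Both figures land within a factor of two of the round numbers quoted in the text.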
In order to interpret this phenomenon of total spontaneous magnetization in ferromagnetic materials, Pierre Weiss, in 1907, advanced the hypothesis that a strong "molecular field," or local field, exists within the material and is the cause for total alignment of the atomic magnetic moments. To account for the fact that even below the Curie point a ferromagnetic specimen does not always exhibit so great an external field as $B_r$, Weiss assumed as a second hypothesis that a macroscopic region of the specimen may contain, if $T < T_c$, a number of subregions (domains), each of which is spontaneously magnetized. All the atomic magnetic moments within one domain are pictured as being aligned, but the direction of magnetization might vary from one domain to the next. With the domains randomly magnetized the net bulk magnetization would be zero; with all domains reinforcing, the net bulk magnetization would result in the field $B_r$.
With these two hypotheses Weiss was able to explain the temperature dependence of susceptibility above the Curie point, and the hysteresis effect below the Curie point. His original theory made use of the classical Langevin analysis based on freely rotating atomic magnetic dipoles. In the presentation which follows, Weiss' theory is slightly modified to account for the quantized nature of the field-directed component of atomic magnetic moment.
The Weiss expression for the local field has been introduced in Section 7.4. In terms of B fields it may be written
$$B_{\text{loc}} = B + (\gamma - 1)\mu_0 M \tag{7.62}$$
in which $B$ is the macroscopic total field, defined in Section 7.3, and $M$ is the magnetization, defined in Section 7.2; $\gamma$ is called the internal field constant, or the Weiss constant.†
Suppose that the ferromagnetic specimen contains $N$ atoms/m$^3$, each with a total angular momentum quantum number $J$. Repeating the analysis of Section 7.9, one finds that the magnetization is once again given by
$$M = N g_J J m_0\,\mathcal{B}_J(x) \tag{7.63}$$
except that now
$$x = \frac{g_J J m_0 B_{\text{loc}}}{kT} = \frac{g_J J m_0}{kT}\,[B + (\gamma - 1)\mu_0 M] \tag{7.64}$$
This analysis is distinguished from the development previously given for paramagnetic materials in that it is no longer assumed that $B_{\text{loc}} \approx B$.
It is now desirable to consider separately two temperature regions.

† The original Weiss hypothesis was that $H_{\text{loc}} = H + \gamma M$, but this is completely equivalent to (7.62). Through use of (7.15), either may be deduced from the other.

Ferromagnetic Materials at High Temperatures.
If $T > T_c$, the thermal agitation is so great that the local field is not sufficient to maintain alignment of adjacent atomic magnetic moments. The domain structure disappears, and individual magnetic moments are free to assume any allowed orientation with respect to $B_{\text{loc}}$; the behavior is completely analogous to paramagnetism except that the local field is not negligibly different from the applied field. The low magnetization at these high temperatures corresponds to $x \ll 1$ in (7.64). But then the Brillouin function may be approximated by $\mathcal{B}_J(x) = (J + 1)x/3J$ and (7.63) becomes
$$M = \frac{[N(p_{\text{eff}} m_0)^2/3kT]\,B}{1 - \mu_0(\gamma - 1)N(p_{\text{eff}} m_0)^2/3kT} \tag{7.65}$$
With the application of (7.28) the magnetic susceptibility is found to be
$$\chi_m = \frac{N\mu_0(p_{\text{eff}} m_0)^2/3k}{T - \gamma N\mu_0(p_{\text{eff}} m_0)^2/3k} \tag{7.66}$$
which is seen to be in the form of the Curie-Weiss law (7.61). The paramagnetic Curie temperature is therefore
$$\theta = \frac{\gamma N\mu_0(p_{\text{eff}} m_0)^2}{3k} \tag{7.67}$$
and the Curie constant is $C = \theta/\gamma$.

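
The step from (7.65) to (7.66) can be spelled out as a short derivation. This is a sketch that assumes (7.28) defines the susceptibility as $\chi_m = M/H$ and that $B = \mu_0(H + M)$; the abbreviation $a$ is introduced here only for convenience and does not appear in the text.

```latex
\begin{align*}
a &\equiv \frac{\mu_0 N (p_{\text{eff}} m_0)^2}{3kT}
  &&\text{(shorthand, not in the original)}\\
M\,[1 - (\gamma - 1)a] &= \frac{a}{\mu_0}\,B = a\,(H + M)
  &&\text{from (7.65) with } B = \mu_0 (H + M)\\
M\,(1 - \gamma a) &= a H\\
\chi_m = \frac{M}{H} &= \frac{a}{1 - \gamma a}
  = \frac{N \mu_0 (p_{\text{eff}} m_0)^2 / 3k}{T - \gamma N \mu_0 (p_{\text{eff}} m_0)^2 / 3k}
\end{align*}
```

Matching the last line to $C/(T - \theta)$ identifies $\theta$ with the second term of the denominator and $C$ with the numerator, so that $C = \theta/\gamma$.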

Ferromagnetic Materials at Temperatures Below the Curie Point.
The Curie-Weiss law (7.66) quite obviously cannot hold for $T \le \theta$, for then the magnetic susceptibility would pass through a pole and become negative. The presence of a pole in (7.66) suggests the possibility that spontaneous magnetization can occur (i.e., that a finite value of $M$ is possible for $H = 0$). To develop this argument further, one can return to (7.63) and consider the magnetization for values of $x$ which are not necessarily small. Since $H = 0$ for spontaneous magnetization, $B = \mu_0 M$. Insertion of this result in (7.64) gives
$$x = \frac{g_J J m_0}{kT}\,\gamma\mu_0 M \tag{7.68}$$
The variable $x$ may be eliminated from (7.63) and (7.68) by simultaneous solution, which permits the magnetization to be expressed as a function of temperature. This elimination is facilitated by first normalizing both expressions so that they become
$$\frac{M(x)}{M_{\text{sat}}} = \mathcal{B}_J(x) = \frac{J + 1}{3J}\,\frac{T}{\theta}\,x \tag{7.69}$$
in which $M_{\text{sat}}$ is the limiting maximum value of (7.63), occurring when $x \to \infty$. These two functional forms of the fractional magnetization are plotted in Figure 7.10 for the case $J = 1$ and for several values of $T/\theta$. An intersection is seen to occur only for $T < \theta$. If this intersection is used to determine $x$, the fractional magnetization may be plotted as a function of temperature, the result being as indicated in Figure 7.11. Also shown in this figure are the results of following this procedure for the case $J = \tfrac{1}{2}$ and
for the classical Langevin case $J \to \infty$. The experimental data for iron, cobalt, and nickel are seen to fit the $J = \tfrac{1}{2}$ curve quite well. Considering the fact that these three materials have widely different paramagnetic Curie temperatures and saturation magnetizations, the agreement between theory and experiment is even more significant.
The fit between experimental data and the $J = \tfrac{1}{2}$ curve in Figure 7.11 also supports the argument that the orbital contribution to magnetic moment is quenched in the iron group elements (cf. Section 7.9). Further support comes from gyromagnetic experiments, in which the $g_J$ factor of a specimen is determined by the effect that magnetization and angular momentum have on each other.$^{35}$

FIGURE 7.10 Intersection of two functional representations of fractional magnetization. [Axes: $M(x)/M_{\text{sat}}$ vs. $x$.]

FIGURE 7.11 The spontaneous magnetization of iron, nickel, and cobalt vs. temperature. [Axes: $M(x)/M_{\text{sat}}$ vs. $T/\theta$.]

From the saturation magnetization $M_{\text{sat}}$ and the volume density of atoms, one can calculate the effective number of Bohr magnetons per atom for various materials. The results of such calculations for the iron group elements were given in Table 7.4. The nonintegral values may be explained by noting that, in the solid state, atomic energy levels are broadened into bands, and adjacent atoms affect the electronic structure (and thus the magnetization) of their neighbors.
Finally, the Weiss theory as embodied in Figure 7.10 indicates that the spontaneous magnetization goes to zero when $T = \theta$, for then an intersection of the two curves is no longer possible. Thus this theory does not differentiate between the ferromagnetic Curie temperature $T_c$ and the paramagnetic Curie temperature $\theta$. Table 7.8 reveals that experimentally these two Curie points are not the same for real materials. However, the difference is not great and the theory does afford a reasonable explanation for all the principal observed phenomena.
EXAMPLE 7.7

If the Curie point $\theta = 1093$ °K is selected for iron from Table 7.8, at room temperature, for $J = 1$, the right side of (7.69) gives
$$\frac{J + 1}{3J}\,\frac{T}{\theta}\,x = 0.185x$$
A plot of this straight line on Figure 7.10 provides an intersection with the $\mathcal{B}_1(x)$ curve at a point for which $M(x)/M_{\text{sat}} > 0.99$. This result would be modified slightly if the more appropriate value $J = \tfrac{1}{2}$ were used, but the indication would still be that, at room temperature, iron is spontaneously magnetized to a degree very close to the saturation value.
For cobalt, which has a higher Curie point, the approach to saturation is even greater at room temperature. For nickel, with a Curie point of only 650 °K, the fractional magnetization at room temperature is in the neighborhood of 95 percent.
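
The graphical intersection used in Example 7.7 can also be found numerically. The following is a minimal sketch, assuming the standard closed form of the Brillouin function and a room temperature of 300 °K; the function names are illustrative, and bisection is merely one convenient root finder.

```python
import math

def brillouin(J, x):
    """Brillouin function B_J(x) in its standard closed form, for x > 0."""
    a = (2 * J + 1) / (2 * J)
    b = 1 / (2 * J)
    return a / math.tanh(a * x) - b / math.tanh(b * x)

def fractional_magnetization(J, t, tol=1e-12):
    """Nonzero solution of B_J(x) = ((J+1)/(3J)) * t * x, with t = T/theta.

    Returns M/M_sat at the intersection. Bisection is used; the lower
    bracket 1e-3 is adequate unless t is extremely close to 1.
    """
    if not 0 < t < 1:
        raise ValueError("a nonzero intersection exists only for 0 < T/theta < 1")
    slope = (J + 1) / (3 * J) * t
    f = lambda x: brillouin(J, x) - slope * x
    lo, hi = 1e-3, 1 / slope + 1      # f(lo) > 0 and f(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return slope * lo

T_ROOM = 300.0                         # assumed room temperature, deg K
iron = fractional_magnetization(1, T_ROOM / 1093)
nickel = fractional_magnetization(1, T_ROOM / 650)
print(round(iron, 3), round(nickel, 3))
```

Sweeping `t` from 0 to 1 with this function traces out a curve of the kind plotted in Figure 7.11 for the chosen $J$.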

7.12 THE WEISS FIELD CONSTANT AND THE EXCHANGE INTEGRAL

An estimation of the value of the Weiss constant $\gamma$ may be obtained by returning to Table 7.8 and noting the experimentally determined values of the paramagnetic Curie temperature $\theta$. For iron it is seen that $\theta = 1093$ °K. Insertion of this value and other appropriate quantities in (7.67) yields the result that $\gamma \approx 5{,}000$. This is four orders of magnitude higher than the classical value $\gamma = \tfrac{1}{3}$ predicted by the Lorentz theory (cf. Section 7.4).

35 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., pp. 408-410, John Wiley and Sons, Inc., New York, 1956.


In spontaneously magnetized iron, $H = 0$, $B = \mu_0 M$, and thus $B_{\text{loc}} = \gamma\mu_0 M \approx 6{,}000$ webers/m$^2$ (60 million gauss). This intense local field cannot be explained classically; the field produced by the magnetic moment of a neighboring atom is only $\sim p_{\text{eff}} m_0/4\pi\mu_0^{-1}a^3 \approx 10^{-1}$ webers/m$^2$. Proper addition of the contributions due to all significant neighbors will produce a field which still falls short of $B_{\text{loc}}$ by four orders of magnitude.†
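
The classical dipole estimate quoted above can be checked with a brief sketch. The interatomic spacing `a` and the effective moment `p_eff` used below are illustrative assumptions, since neither value is stated in this section.

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space, henry/m
M0 = 9.274e-24             # Bohr magneton, amp-m^2 (rounded)

p_eff = 2.0                # assumed effective number of Bohr magnetons
a = 2.5e-10                # assumed nearest-neighbor spacing, m

# Order-of-magnitude field of one neighboring dipole at distance a
B_dipole = MU0 * p_eff * M0 / (4 * math.pi * a**3)

B_loc = 6000.0             # local field quoted in the text, weber/m^2
print(f"B_dipole = {B_dipole:.2f} weber/m^2, ratio = {B_loc / B_dipole:.1e}")
```

With these assumed inputs a single neighbor contributes roughly a tenth of a weber/m$^2$, which is the scale the text cites.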
An adequate explanation of the physical origin of the intense local field was first presented by Heisenberg$^{36}$ and is based on the quantum mechanical exchange integral. Under suitable assumptions, the energy of interaction of atoms a and b, having spins $S_a$ and $S_b$, contains a term
$$V_{\text{ex}} = -2\mathcal{J}\,S_a \cdot S_b \tag{7.70}$$
in which $V_{\text{ex}}$ is the exchange energy and $\mathcal{J}$ is the exchange integral and is governed by the overlap of the electron clouds of the two atoms. The spatial derivatives of this exchange energy give the exchange forces, which prove to be intense and account for the high value of $B_{\text{loc}}$; these exchange forces have an electrostatic origin, but the effect is as though there were direct interaction of the spins $S_a$ and $S_b$. When $\mathcal{J}$ is positive, the exchange forces are in such directions as to cause parallel alignment of $S_a$ and $S_b$; this is the case in ferromagnetic materials. However, $\mathcal{J}$ can also be negative, in which case $S_a$ and $S_b$ tend to line up in an antiparallel arrangement. This occurs in antiferromagnetic and ferrimagnetic materials. (Cf. Sections 7.13-7.14.)
An approximate relation connecting the exchange integral $\mathcal{J}$ and the paramagnetic Curie temperature $\theta$ may be established for ferromagnetic materials as follows: Imagine that an atom in the specimen has $z$ nearest neighbors, with each of which it shares an exchange energy given by (7.70), and that its exchange energy with more distant neighbors may be ignored. With all the spins equal and parallel, the total exchange energy is
$$U_{\text{ex}} = -2z\mathcal{J}S(S + 1) \tag{7.71}$$
This energy would be expended if the atom in question were rotated so that its spin was at right angles to the spins of its nearest neighbors. But in a spontaneously magnetized specimen this would amount to rotating a magnetic moment of strength
$$p_{\text{eff}} m_0 = g_J m_0[J(J + 1)]^{1/2} = g_J m_0[S(S + 1)]^{1/2}$$
through a local field whose intensity is $B_{\text{loc}} = \gamma\mu_0 M_{\text{sat}}$. If the result of Example 4.7 is once again employed, it follows that

(7.72)

Since for a ferromagnetic material, $L = 0$ and $J = S$, the saturation magnetization is given as the limit of (7.63) in the form $M_{\text{sat}} = N g_J S m_0$. Using this result and combining (7.72) with (7.71) one obtains

(7.73)

wherein application has been made of (7.67). Therefore the exchange integral is related to the paramagnetic Curie temperature through the approximate expression
$$\mathcal{J} = \frac{3k\theta}{2zS(S + 1)} \tag{7.74}$$
indicating that a high value for the exchange energy is manifested by a high Curie temperature.

† In the case of ferroelectric materials, the electric dipole moment is two orders of magnitude higher, and dipole-dipole interactions thus provide a satisfactory explanation for the observed field constants, which for such dielectrics are in the range of one-third.
36 The Heisenberg theory is treated completely in J. H. Van Vleck, Theory of Electric and Magnetic Susceptibility, Chap. 12, Clarendon Press, Oxford, 1932.
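
As an illustration of (7.74), the exchange integral for iron can be estimated in a few lines. The coordination number $z = 8$ (body-centered cubic) and the spin $S = \tfrac{1}{2}$ are assumptions of this sketch, not values given in the text, so the result is indicative only.

```python
K_B = 1.381e-23        # Boltzmann constant, joule/deg (rounded)
EV = 1.602e-19         # joules per electron volt (rounded)

def exchange_integral(theta, z, S):
    """Approximate exchange integral from (7.74), in joules."""
    return 3 * K_B * theta / (2 * z * S * (S + 1))

theta_fe = 1093.0      # paramagnetic Curie temperature of iron, Table 7.8
z_fe = 8               # assumed number of nearest neighbors (bcc lattice)
S_fe = 0.5             # assumed spin per atom

J_ex = exchange_integral(theta_fe, z_fe, S_fe)
print(f"exchange integral = {J_ex:.2e} joule = {J_ex / EV:.3f} eV")
```

Because (7.74) is linear in $\theta$, doubling the Curie temperature at fixed $z$ and $S$ doubles the estimated exchange integral.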

7.13 FERROMAGNETIC DOMAINS

As mentioned earlier, in order to explain the fact that a ferromagnetic specimen may exist in a state which exhibits no bulk magnetization, whereas a weak external magnetic field is capable of producing saturation magnetization in the same specimen, Weiss introduced the notion that the volume of a specimen may be divided into domains. Below the Curie temperature each domain is envisioned as being spontaneously magnetized, with the degree of magnetization being appropriate to the temperature $T$ of the specimen. The bulk magnetization is given by the vector sum of the individual domain magnetizations. Examples of how the bulk magnetization can be zero in single crystals and polycrystals are given in Figure 7.12.

FIGURE 7.12 Examples of zero net magnetization. (a) Single crystal; domain walls dotted. (b) Polycrystal; crystal walls solid, domains not shown, individual crystals magnetized.


Ample experimental evidence exists to support the domain hypothesis. A graphic technique due to Bitter involves the preparation of a colloidal suspension of a fine ferromagnetic powder. When a drop of this suspension is placed on the carefully prepared surface of the ferromagnetic crystal being studied, the colloidal particles gather along the domain boundaries. This occurs because adjacent domains are magnetized in different directions, resulting in strong local fields at the interfaces. A photomicrograph reveals the powder distribution, and thus the domain boundaries. An example of this is shown in Figure 7.13.

FIGURE 7.13 Bitter powder patterns on surface of single crystal of silicon-iron, showing variation of domain size with magnetic field; H horizontal in figure. Panels: H = 475, 875, 4,000, 10,600, and 27,800 amp/m. [After Bates, Modern Magnetism, 3d ed., p. 465, Cambridge University Press, London, 1951.]

The origin of domains may be explained by invoking the principle of minimum energy. A magnetized crystal has stored magnetic energy associated with its magnetization. If the crystal can arrange itself into domains which are oppositely magnetized, the overall crystal magnetization will be reduced, as will the stored magnetic energy. But the boundary surfaces between oppositely magnetized domains require energy to be maintained, since the exchange forces favor parallel and oppose antiparallel orientations of the magnetization. Additionally, most crystals are more easily magnetized along certain crystallographic axes (easy directions) than they are along intermediate (hard) directions, the difference in energy for these two states being called the anisotropy energy. A perfect crystal will tend to form into domains whose shapes and number are such as to minimize the sum of the magnetic energy, the domain wall energy, and the anisotropy energy.


As an illustration, the single crystal shown in Figure 7.12a could represent cobalt,
which is a hexagonal crystal, and has only one easy axis of magnetization. With the
domain structure shown, there is no net magnetization and thus zero magnetic energy.
The long domains are magnetized in easy directions, but the small closure domains at
the ends are magnetized in hard directions. If this type of domain structure is retained
but the number of domains is increased, the fraction of the volume occupied by closure
domains will decrease, as will the anisotropy energy. However, the number of boundaries
will increase along with the boundary energy. These two competing tendencies will achieve equilibrium when the number of domains is such as to minimize the total energy.

FIGURE 7.14 Typical magnetization curve ($M$ vs. $H$) showing regions in which different magnetization processes dominate: reversible domain wall motion for fields below $H_1$, and irreversible domain wall motion at higher fields.

In iron, which is a cubic crystal, the easy directions of magnetization are parallel
to the cube edges. Nickel also has a cubical crystal structure but the preferred axes
of magnetization are the body diagonals. In each of these elements it is possible for
both principal domains and closure domains to be magnetized along different easy
directions. Domain size is then determined by minimizing the boundary energy plus
the magnetostrictive energy-the latter arising due to the difference in elongation
of the principal and closure domains, which causes an elastic strain.
Of course, these idealized pictures must be modified to account for the presence of
impurities, lattice imperfections, etc., but the basic explanation that energy minimization accounts for the origin and size of domains seems well established. In real materials
the size of individual domains varies widely, with typical values lying in the range
$10^{-2}$ to $10^{-6}$ cm$^3$.
Consider such a specimen of real ferromagnetic material which is below its Curie temperature and has arranged its internal domain structure so as to display no net magnetization. Upon application of an external field, the individual atomic magnetic moments tend to align with the field, upsetting the balance of internal forces which had resulted in minimum energy. It is found that the volume growth of favorably oriented domains occurs more easily than the rotation of magnetization from an easy to a hard direction, and thus domain wall motion occurs for relatively weak applied fields, whereas rotation requires high fields. This behavior is indicated by Figure 7.14, in which the magnetization curve of a virgin specimen is shown, with the different regions noted in which one or the other process dominates. Observe that for fields less than $H_1$, removal of the field permits the specimen to return to its original domain structure.
One is able to exert some control over the coercive force $H_c$ by taking advantage of the fact that domain wall motion occurs more easily than magnetization rotation. In materials designed for use as permanent magnets, a large value of $H_c$ is desired. Through suppression of domain boundary motion, a high coercivity is assured; this may be accomplished by using materials consisting of two metallurgical phases which give a heterogeneous structure on a very fine scale.
Alternatively, materials used in transformer cores should have a high permeability and a low hysteresis loss. Since the latter is proportional to $B_r H_c$ (cf. Section 7.16), reduction of $H_c$ is desirable. This can be achieved by making the material pure, homogeneous, and well-oriented to facilitate domain wall motion. The result is not only a lowered $H_c$ but a higher permeability as well. The inverse relation between coercivity and permeability is clearly demonstrated by the data in Table 7.9.

7.14 ANTIFERROMAGNETISM

The Heisenberg theory of ferromagnetism is based on the quantum mechanical exchange integral, and a positive value of that integral corresponds to parallel alignment of adjacent spins (cf. Section 7.12). However, if the exchange integral is negative, a tendency for antiparallel spin alignment exists. This occurs, for example, in certain materials in which the interatomic distance is small. Counteraligned spins are found in both antiferromagnetic and ferrimagnetic substances. These two classes of materials are distinguished by the fact that the two sets of opposed spins have unequal moments in a ferrimagnetic specimen whereas they are of equal strength in an antiferromagnetic material (see Figure 7.1).
A theoretical treatment of antiferromagnetism by Néel preceded the experimental discovery of the phenomenon (cf. Section 7.1). Polycrystalline manganese oxide is the first antiferromagnetic substance to have been identified by experiment and one of its principal properties is the manner in which the susceptibility varies with temperature. As seen in Figure 7.15, a maximum in the susceptibility occurs at a temperature $T_N$ (called the Néel temperature) and this behavior is characteristic of all antiferromagnetic materials. It may be explained qualitatively on the basis of a crystal model containing two types of atoms, A and B, distributed over two interleaved lattices. At low temperatures, due to the strong internal fields, the A spins tend to "lock in" antiparallel to the B spins and an external field has little effectiveness in inducing a net magnetization. As the temperature is raised, the thermal energy tends to unlock the spin pattern and an external field can cause a greater net magnetization, this being manifested by a higher susceptibility. Finally, at the critical temperature $T_N$ the spins are completely freed, and above $T_N$ the specimen behaves paramagnetically, exhibiting a susceptibility which decreases as $T^{-1}$. A quantitative theory based on this model will be discussed later in this section.
Neutron diffraction studies provide direct experimental evidence for the existence of the antiferromagnetic spin arrangement. A neutron beam which is incident on a crystal suffers scattering by the atomic nuclei but also interacts with the ordered spin lattice, this latter interaction giving rise to additional diffraction lines. These extra lines have an intensity which diminishes as the temperature is raised because the antiferromagnetic order is decreasing; finally, at $T_N$ the extra lines disappear completely from the diffraction pattern. Shull$^{37}$ and his co-workers were the first investigators to employ this technique and have used it successfully to determine the spin arrangements in both ferromagnetic and antiferromagnetic substances.

FIGURE 7.15 Magnetic susceptibility of MnO vs. temperature, T (°K), in 5,000 gauss field. [After Bizette, Squire, and Tsai, Comp Rend, 207, 449; 1938.]

In terms of the two sublattice model, assuming that all the nearest neighbors of an A atom are B atoms and vice versa, the local fields at the two sites may, in a further use of (7.24), be written
$$B_a = B - \alpha\mu_0 M_a - \beta\mu_0 M_b \tag{7.75}$$
$$B_b = B - \beta\mu_0 M_a - \alpha\mu_0 M_b \tag{7.76}$$
in which $\alpha$ and $\beta$ are internal field constants. It is anticipated that $\beta$ will be positive in order to account for antiparallel alignment of spins. It is further anticipated that $|\beta| > |\alpha|$, since all of an atom's nearest neighbors are of the opposite type. However, no prior prediction is being made about the sign of $\alpha$.

37 C. G. Shull and J. S. Smart, "Detection of Antiferromagnetism by Neutron Diffraction," Phys Rev, 76, 1256-1257; October 15, 1949.

It now becomes advantageous to consider several temperature regions.
1. The temperature region $T > T_N$. In this region the magnetizations $M_a$ and $M_b$ are weak and the material is behaving paramagnetically. A repetition of the analysis of Section 7.9 then leads to the result that
$$M_a = \left[\frac{N(p_{\text{eff}} m_0)^2}{3kT}\right] B_a \tag{7.77}$$
in which $N$ is the density of A atoms and $p_{\text{eff}}$ is the number of Bohr magnetons per A atom.
If it is assumed that the density of B atoms is also $N$ and that the net magnetic moment per atom is the same for both types of atoms, then it follows that
$$M_b = \left[\frac{N(p_{\text{eff}} m_0)^2}{3kT}\right] B_b \tag{7.78}$$
Upon adding these two equations and making use of (7.75) and (7.76) one obtains
$$M = M_a + M_b = \left[\frac{N(p_{\text{eff}} m_0)^2}{3kT}\right][2B - (\beta + \alpha)\mu_0 M] \tag{7.79}$$
which can be solved for $M$ to give
$$M = \frac{[2N(p_{\text{eff}} m_0)^2/3k]\,B}{T + (\beta + \alpha)\mu_0 N(p_{\text{eff}} m_0)^2/3k}$$
so that
$$\chi_m = \frac{M}{\mu_0^{-1}B} = \frac{C}{T + \theta} \tag{7.80}$$
in which the Curie constant is
$$C = \frac{2N(p_{\text{eff}} m_0)^2}{3k\mu_0^{-1}} \tag{7.81}$$
and the paramagnetic Curie temperature is
$$\theta = \frac{(\beta + \alpha)C}{2} \tag{7.82}$$
Because the expectation is that $\beta + \alpha$ is positive, one would predict that $\theta$ is also positive, and this is confirmed by experiment for a variety of materials.
It is interesting to compare the behavior of paramagnetic, ferromagnetic, and antiferromagnetic materials in a temperature range in which they are all acting paramagnetically. This is done in Figure 7.16, where the intercepts with the $T$ axis clearly indicate the differences in the nature of the paramagnetic Curie temperature.
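2. The Néel temperature $T_N$. At this temperature, the magnetization is still weak
enough for the above analysis to be valid. If there is no external magnetic field present,
at $T_N$ one may combine (7.77) and (7.75) to give

$$M_a = -\frac{N(p_{\mathrm{eff}}m_0)^2}{3kT_N}\left(\alpha\mu_0 M_a + \beta\mu_0 M_b\right)$$

which may be rewritten in the form

$$\left(1 + \frac{\alpha C}{2T_N}\right)M_a + \frac{\beta C}{2T_N}M_b = 0 \tag{7.83}$$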

[Figure 7.16: Comparison of behavior of reciprocal magnetic susceptibility with temperature above Curie points for three classes of materials. Labels recovered from the figure: $-\theta$ (antiferro) and $\theta$ (ferro) on the T axis.]

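Similarly, (7.78) and (7.76) combine to give

$$\frac{\beta C}{2T_N}M_a + \left(1 + \frac{\alpha C}{2T_N}\right)M_b = 0 \tag{7.84}$$

These two equations will yield nontrivial solutions for $M_a$ and $M_b$ only if the determinant of the coefficients vanishes. But this leads to

$$T_N = \frac{(\beta - \alpha)C}{2} \tag{7.85}$$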

Comparing (7.82) and (7.85), one sees that the two sublattice model does not predict
the same value for the Néel temperature and the paramagnetic Curie temperature.
This is in agreement with experiment, as can be seen from Table 7.10, which lists
measured values of $T_N$ and $\theta$ for a variety of antiferromagnetic materials. It may be
observed from this table that not only are $T_N$ and $\theta$ widely different but also that $T_N$
is generally lower than $\theta$. This suggests that $\alpha$ is positive, which implies that not only
the AB interactions, but also the AA and BB interactions are antiferromagnetic.
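The determinant condition can be illustrated with a short numerical sketch. It assumes the two homogeneous equations at zero applied field are arranged with diagonal coefficient 1 + αC/2T and off-diagonal coefficient βC/2T, which is one way of writing the combination of (7.75) to (7.78); the function name and the values of C, α, and β are hypothetical.

```python
def neel_determinant(T, C, alpha, beta):
    """Determinant of the coefficient matrix of the two zero-field sublattice equations.

    Assumed arrangement: (1 + alpha C / 2T) M_a + (beta C / 2T) M_b = 0 and its mirror image.
    """
    diagonal = 1.0 + alpha * C / (2.0 * T)
    off_diagonal = beta * C / (2.0 * T)
    return diagonal * diagonal - off_diagonal * off_diagonal


C, alpha, beta = 0.5, 20.0, 100.0     # hypothetical values
T_N = (beta - alpha) * C / 2.0        # Neel temperature, (7.85)
theta = (beta + alpha) * C / 2.0      # paramagnetic Curie temperature, (7.82)
print(T_N, theta, neel_determinant(T_N, C, alpha, beta))
```

With a positive α the determinant vanishes at a temperature T_N that lies below θ, in line with the trend noted for Table 7.10.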
3. The temperature region $T < T_N$. Below the Néel temperature $T_N$, it is necessary
to distinguish the behavior of a single crystal from that of a polycrystal. In a single
crystal, due to anisotropy, there will be one or more preferred directions along which
the spins will tend to align themselves. If an external field is applied perpendicular to a
natural spin direction, the magnetic moments $M_a$ and $M_b$ are turned by the field so as
to make an angle $2\phi$ with each other, as shown in Figure 7.17. The local fields at the two
sites are still given by (7.75) and (7.76), and in equilibrium $M_a$ should be aligned with


TABLE 7.10
NÉEL TEMPERATURE T_N AND PARAMAGNETIC CURIE TEMPERATURE θ
FOR CERTAIN ANTIFERROMAGNETIC MATERIALS

Substance    T_N (°K)    θ (°K)
NiCl₂           50          68
CoF₂            38          53
FeF₂            79         117
MnF₂            72         113
NiF₂            73         116
FeO            198         570
MnO            122         610
MnO₂            84         316
MnS            165         528

$B_a$ and $M_b$ should be aligned with $B_b$. Thus

$$M_a \times B_a = 0 = M_a \times (B - \alpha\mu_0 M_a - \beta\mu_0 M_b) = M_a \times (B - \beta\mu_0 M_b)$$

This requires that the component of $B - \beta\mu_0 M_b$ which is perpendicular to $M_a$ be zero, or

$$B\cos\phi - \beta\mu_0 M_b \sin 2\phi = 0 \tag{7.86}$$

When $M_b$ is crossed into (7.76), Equation (7.86) is seen to hold for $M_a$ as well. Therefore
the net magnetization in the direction of $B$ is

$$M = (M_a + M_b)\sin\phi = \frac{2B\cos\phi\,\sin\phi}{\beta\mu_0 \sin 2\phi} = \frac{B}{\beta\mu_0} \tag{7.87}$$

The transverse susceptibility is therefore given by

$$\chi_m^{\perp} = \frac{\mu_0 M}{B} = \frac{1}{\beta} \tag{7.88}$$

This formula is independent of temperature, but of course the model is imperfect,
and the concept of an array of counteraligned spins is less valid as the temperature
departs from absolute zero. An actual case is shown in Figure 7.18, where $\chi_m^{\perp}$ is seen to
decrease somewhat as $T_N$ is approached.

[Figure 7.17: Calculation of perpendicular susceptibility for antiferromagnetic material below $T_N$.]
If the applied field is parallel to a natural spin direction, the calculation of the longitudinal susceptibility $\chi_m^{\parallel}$ is more complicated, and involves statistical methods which
use Brillouin functions.

[Figure 7.18: Molar magnetic susceptibility of MnF₂ versus temperature T (°K). After Griffel and Stout, J Chem Phys, 18, 1455; 1950.]

It is apparent that


$$\chi_m^{\parallel}(0^{\circ}\mathrm{K}) = 0$$

since all spins at absolute zero are either parallel or antiparallel to the field, thus
experiencing no torque. Calculations by Van Vleck³⁸ show that the longitudinal
susceptibility increases regularly from its null value at absolute zero until it reaches
the value

$$\chi_m^{\parallel}(T_N) = \chi_m^{\perp}(T_N)$$

This conclusion is in agreement with the experimental curve of Figure 7.18.
For polycrystalline materials, the susceptibility below $T_N$ is an average value lying
between $\chi_m^{\parallel}$ and $\chi_m^{\perp}$, which accounts for the rising portion of the curve in Figure 7.15.

EXAMPLE 7.8

From the data offered in Table 7.10, the relative size of the internal field constants $\alpha$ and $\beta$
may be determined. Taking the ratio of (7.85) and (7.82), one obtains

$$\frac{T_N}{\theta} = \frac{\beta - \alpha}{\beta + \alpha}$$

38 J. H. Van Vleck, "On the Theory of Antiferromagnetism," J Chem Phys, 9, 85-90; January 1941.


which may be solved for $(\beta/\alpha)$ to give

$$\frac{\beta}{\alpha} = \frac{1 + (T_N/\theta)}{1 - (T_N/\theta)}$$

Calculation from this formula gives the following data:

Material   NiCl₂   CoF₂   FeF₂   MnF₂   NiF₂   FeO   MnO   MnO₂   MnS
β/α         6.5     6.1    5.2    4.5    4.4   2.1   1.5    1.7   1.9

When one recalls that $\beta$ and $\alpha$ are measures of the relative influence on the local field of
nearest neighbors and next-nearest neighbors, these are seen to be reasonable ratios. Their
variability is due partly to the differences in lattice dimensions.
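The arithmetic of Example 7.8 is easy to reproduce. The following sketch, offered only as an illustration, applies β/α = (1 + T_N/θ)/(1 - T_N/θ) to the entries of Table 7.10; the dictionary and function names are hypothetical, and the computed ratios agree with the tabulated ones to within rounding of the last digit.

```python
# (T_N, theta) in deg K, taken from Table 7.10.
TABLE_7_10 = {
    "NiCl2": (50, 68),
    "CoF2": (38, 53),
    "FeF2": (79, 117),
    "MnF2": (72, 113),
    "NiF2": (73, 116),
    "FeO": (198, 570),
    "MnO": (122, 610),
    "MnO2": (84, 316),
    "MnS": (165, 528),
}


def beta_over_alpha(t_neel, theta):
    """Ratio of internal field constants from T_N / theta = (beta - alpha) / (beta + alpha)."""
    r = t_neel / theta
    return (1.0 + r) / (1.0 - r)


ratios = {name: beta_over_alpha(*temps) for name, temps in TABLE_7_10.items()}
for name, ratio in ratios.items():
    print(f"{name:6s} {ratio:5.2f}")
```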

7.15 FERRIMAGNETISM

The previous section was concerned with antiferromagnetic materials, whose magnetic
properties could be explained in terms of two sublattices of equal and opposite spins.
There is no fundamental reason why the two opposite sets of spins must always be equal.
When they are not, one encounters a different class of materials called ferrimagnetics.
These materials also have a Curie point, and below this critical temperature the unequal
spin systems tend to "lock in" with an antiparallel orientation, resulting in a net
spontaneous magnetization. In this respect, they resemble ferromagnetic substances.
However, one should note carefully the basic difference in the mechanism. Ferromagnetism is associated with a positive value of the exchange integral and parallel
alignment of all the spins. Ferrimagnetism, like antiferromagnetism, corresponds to a
negative value of the exchange integral and to antiparallel spin alignment.
Prominent among ferrimagnetic materials are the ferrites, a group of compounds
whose composition may be represented by the chemical formula XO·Fe₂O₃, in which X
is a divalent ion such as Cd²⁺, Co²⁺, Cu²⁺, Fe²⁺, Mg²⁺, Mn²⁺, Ni²⁺, Zn²⁺ (or a mixture
of these ions).† Because they are oxides, ferrites have a lowered density when compared
to metallic substances and also have a much higher resistivity. Depending on the specific
composition, they have resistivities in the range 10 to 10⁴ ohm-m, which is comparable
to the resistivity of a semiconductor, and which is many orders of magnitude above the
value (10⁻⁷ ohm-m) for iron. For this reason, ferrites are attractive for use in transformer
cores at frequencies beyond the range where the eddy current losses in iron cores are
prohibitive. They are also widely used in microwave applications where the low depth
of penetration of iron prevents its use.
The process of producing ferrites involves mixing the various desired oxides in the
proportions which are required to yield a given set of properties. The oxides are then
put under pressure (sometimes with a binder) and sintered at temperatures as high as
1700°K. The result is a ceramic-like material which can be ground to a desired shape.
The electric and magnetic properties vary widely with composition.

† One of these ferrites, FeO·Fe₂O₃ (magnetite), bears the distinction of having been the first ferrous
material known to mankind, with references to it dating back through antiquity.


Through careful X-ray diffraction and neutron diffraction analyses, the crystal lattice
of ferrites has been found to have the spinel structure (after the mineral spinel, MgAl₂O₄)
with a unit cell which is cubical and approximately 8.4 Å on a side. The spinel structure
is indicated in Figure 7.19, and is found to contain 32 O²⁻ ions, 16 Fe³⁺ ions, and 8
divalent ions. The oxygen ions are so distributed as to form 64 tetrahedral interstices
(A sites) and 32 octahedral interstices (B sites). Eight of the tetrahedral sites are
occupied, as are sixteen of the octahedral sites, thus accounting for the twenty-four
metallic ions in the unit cell.

[Figure 7.19: The spinel structure, showing oxygen ions, a metallic ion at a tetrahedral site, and a metallic ion at an octahedral site. Tetrahedral interstices: 64 per unit cell. Octahedral interstices: 32 per unit cell.]

Since the O²⁻ ions have only completely filled shells, and therefore no net spin, the
magnetic properties of ferrites are due to the metallic ions. A good illustration is
magnetite, in which the 8 Fe²⁺ ions occupy half of the octahedral sites, and the 16 Fe³⁺
ions are evenly divided between tetrahedral and octahedral sites. Since each Fe²⁺ ion
has 4 Bohr magnetons and each Fe³⁺ ion has 5 Bohr magnetons (cf. Table 7.3; the 4s
electrons are removed first), and since the AB interaction is antiferromagnetic, one
would expect the A sites to have a net magnetic moment of $40m_0$ per unit cell, and the
B sites to have a counteraligned magnetic moment of $72m_0$ per unit cell. The saturation
magnetization $M_{\mathrm{sat}}$ (which is the spontaneous magnetization at 0°K) should therefore
be $72 - 40 = 32m_0$ per unit cell. The experimental value is $32.64m_0$, so the agreement
is quite good.
If the value 8.4 Å is used for the dimension of the unit cell, the above numbers
translate into a saturation magnetization $M_{\mathrm{sat}}$ for magnetite of $5 \times 10^5$ amp/m. This
is smaller than the $M_{\mathrm{sat}}$ values typically encountered in iron and cobalt materials by a
factor of 2-5. The lower value of saturation magnetization in ferrites may be explained
by noting that the concentration of magnetic ions is smaller and that the magnetizations at the A and B sites oppose.
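The bookkeeping for magnetite can be sketched in a few lines. This illustration counts Bohr magnetons on the A and B sites as described above and converts the net moment per unit cell into a magnetization; the constant names are hypothetical, and the value used for the Bohr magneton (9.274e-24 amp-m²) is the standard one rather than a figure quoted in this section.

```python
BOHR_MAGNETON = 9.274e-24   # m0 in amp-m^2 (standard value, assumed here)
CELL_EDGE = 8.4e-10         # unit-cell edge in meters (8.4 angstroms)

# A sites: 8 Fe3+ ions at 5 Bohr magnetons each.
moment_a = 8 * 5
# B sites: 8 Fe3+ ions at 5 Bohr magnetons plus 8 Fe2+ ions at 4 Bohr magnetons.
moment_b = 8 * 5 + 8 * 4
# The AB interaction is antiferromagnetic, so the sublattice moments subtract.
net_moment = moment_b - moment_a            # Bohr magnetons per unit cell

m_sat = net_moment * BOHR_MAGNETON / CELL_EDGE ** 3   # amp/m
print(moment_a, moment_b, net_moment, f"{m_sat:.3g} amp/m")
```

The result is close to the 5 × 10⁵ amp/m quoted in the text.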
In 1948, Néel³⁹ proposed a theory which accounts for the principal features of ferrimagnetism, and which is based on the hypothesis that there exists a negative interaction
between the metallic ions at the tetrahedral (A) sites and the metallic ions at the
octahedral (B) sites, this interaction being the cause of a tendency for the A and B ions
to adopt an antiparallel spin alignment. The essential features of Néel's theory may be
appreciated by considering the ferrite XO·Fe₂O₃, whose unit cell contains $y$ Fe³⁺ ions
on tetrahedral sites and $(16 - y)$ Fe³⁺ ions on octahedral sites; the divalent X ions are
then distributed such that $(8 - y)$ of them occupy tetrahedral sites and $y$ of them are
at octahedral sites. The formula for this ferrite may be written

$$\mathrm{Fe}^{3+}_{x}\,\mathrm{X}^{2+}_{1-x}\left[\mathrm{Fe}^{3+}_{2-x}\,\mathrm{X}^{2+}_{x}\right]\mathrm{O}_4$$

in which $x = y/8$ is a discrete variable having the range $0 \le x \le 1$. The bracketed
portion of this formula refers to the occupancy of the octahedral sites.
If the divalent ions are nonmagnetic, the net magnetization may be written

$$M = xM_a + (2 - x)M_b \tag{7.89}$$

in which $M_a$ and $M_b$ are the magnetization densities of the A sites and B sites for the
special case that there are 8 Fe³⁺ ions at tetrahedral sites and 8 Fe³⁺ ions at octahedral
sites per unit cell ($x = 1$).
The mathematical expressions for the local fields at the two sites were postulated by
Néel in the form

$$B_a = B - \mu_0\gamma\left[(2 - x)M_b - \alpha x M_a\right] \tag{7.90}$$

$$B_b = B - \mu_0\gamma\left[x M_a - \beta(2 - x)M_b\right] \tag{7.91}$$

in which $B$ is the total macroscopic field, $-\mu_0\gamma(2 - x)M_b$ or $-\mu_0\gamma x M_a$ accounts for the
negative AB interaction, and $\mu_0\gamma\alpha x M_a$, $\mu_0\gamma\beta(2 - x)M_b$ represent the AA and BB
interactions respectively. The factors $\alpha$, $\beta$, $\gamma$ are internal field constants, and these
equations are based on the same assumption of linearity which Weiss had earlier introduced in his theory of ferromagnetism.
Above the Curie point the magnetizations $M_a$ and $M_b$ should have an inverse temperature dependence and follow a paramagnetic Curie law. It seems reasonable to
write for this temperature region,

$$M_a = \frac{C\mu_0^{-1}}{2T}B_a \qquad M_b = \frac{C\mu_0^{-1}}{2T}B_b \tag{7.92}$$

in which $C$ is a Curie constant. Upon combining Equations (7.89)-(7.92), one finds that

$$\frac{1}{\chi_m} = \frac{T}{C} + \frac{1}{\chi_0} - \frac{\sigma}{T - \theta} \tag{7.93}$$
39 L. Néel, "Magnetic Properties of Ferrites; Ferrimagnetism and Antiferromagnetism," Ann Phys (Paris), 3, 137-198; 1948.


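in which

$$\frac{1}{\chi_0} = \frac{\gamma}{4}\left[2x(2 - x) - \alpha x^2 - \beta(2 - x)^2\right] \tag{7.94}$$

$$\sigma = \frac{\gamma^2}{16}\,Cx(2 - x)\left[x(1 + \alpha) - (2 - x)(1 + \beta)\right]^2 \tag{7.95}$$

$$\theta = \frac{\gamma}{4}\,Cx(2 - x)(2 + \alpha + \beta) \tag{7.96}$$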

Equation (7.93) indicates that, if $1/\chi_m$ is plotted versus $T$ above the Curie point, the
resulting curve is convex. This is in agreement with experiment, and a plot of experimental data in this form may be used to determine the constants $C$, $\chi_0$, $\sigma$, and $\theta$ which
occur in (7.93). If $x$ is known, (7.94)-(7.96) may be used to obtain the internal field
constants $\alpha$, $\beta$, and $\gamma$. This procedure has been followed for several ferrites with the
interesting conclusion that $\alpha$ and $\beta$ are both negative, so that all three interactions are
antiferromagnetic, with the AB interaction dominating.
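The reduction of (7.89)-(7.92) to a hyperbolic law can be checked numerically. The sketch below is an illustration only: it assumes the hyperbola has the form 1/χ_m = T/C + 1/χ_0 - σ/(T - θ), with χ_m = μ_0 M/B and the constants given by (7.94)-(7.96), and compares it with a direct solution of the two linear equations for the sublattice magnetizations. Function names and parameter values are hypothetical.

```python
def chi_direct(T, C, x, alpha, beta, gamma):
    """Solve (7.90)-(7.92) for a = mu0 M_a / B and b = mu0 M_b / B, then apply (7.89)."""
    c = C / (2.0 * T)
    # (1 - c gamma alpha x) a + c gamma (2 - x) b       = c
    # c gamma x a           + (1 - c gamma beta (2 - x)) b = c
    a11 = 1.0 - c * gamma * alpha * x
    a12 = c * gamma * (2.0 - x)
    a21 = c * gamma * x
    a22 = 1.0 - c * gamma * beta * (2.0 - x)
    det = a11 * a22 - a12 * a21
    a = c * (a22 - a12) / det   # Cramer's rule
    b = c * (a11 - a21) / det
    return x * a + (2.0 - x) * b


def chi_hyperbola(T, C, x, alpha, beta, gamma):
    """Assumed closed form: 1/chi = T/C + 1/chi0 - sigma/(T - theta)."""
    inv_chi0 = (gamma / 4.0) * (2 * x * (2 - x) - alpha * x ** 2 - beta * (2 - x) ** 2)
    sigma = (gamma ** 2 / 16.0) * C * x * (2 - x) * (x * (1 + alpha) - (2 - x) * (1 + beta)) ** 2
    theta = (gamma / 4.0) * C * x * (2 - x) * (2 + alpha + beta)
    return 1.0 / (T / C + inv_chi0 - sigma / (T - theta))


params = dict(C=2.0, x=0.75, alpha=-0.5, beta=-0.2, gamma=500.0)  # hypothetical values
for T in (1500.0, 2500.0):
    print(T, chi_direct(T, **params), chi_hyperbola(T, **params))
```

Agreement of the two columns supports the stated forms of the constants; negative α and β were chosen to mirror the conclusion quoted above.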
[Figure 7.20: Reciprocal magnetic susceptibility of magnetite vs. temperature T (°K). After C. Kittel, Introduction to Solid State Physics, 2d ed., p. 445, John Wiley and Sons, Inc., New York, 1956.]

The foregoing analysis is also applicable to ferrites in which the divalent ions are
magnetic, provided that all divalent atoms are on octahedral sites (x = 1) and the
magnetization M b includes the divalent contribution. Magnetite is an example of such
a ferrite, and its reciprocal susceptibility as a function of temperature is plotted in
Figure 7.20. The convex curvature predicted by the theory is clearly indicated.
7.16 TIME-VARYING PHENOMENA

Previous sections of this chapter have dealt with the behavior of magnetic materials
under the influence of a static or quasistatic external field. Three internal causes for
magnetic effects in materials have been noted (electron orbital motion, electron spin,
and nuclear spin), and five classes of materials have been identified (diamagnetic,
paramagnetic, ferromagnetic, antiferromagnetic, and ferrimagnetic). In what follows
the static results obtained in these earlier sections will be extended, in principle, to
include time-varying phenomena, and then several cases of practical interest will be
discussed.


If primary current sources $\iota(x,y,z)$ and their associated field $B_1(x,y,z)$ induce a magnetization density $M(x,y,z)$ in a material specimen, then if $\iota$ and $B_1$ become time-varying, in general $M$ will vary with time also. Leaving aside for the moment the question of a specific relationship between $B_1(x,y,z,t)$ and $M(x,y,z,t)$, the macroscopic
response field $B_2(x,y,z,t)$ caused by the aggregation of time-varying magnetic moments
may be deduced in a manner completely analogous to what was done for electric dipole
moments in Chapter 6 (specifically, cf. Appendix L). The result is that $B_2$ is given by
the expression

$$B_2(x,y,z,t) = \nabla_F \times \left[\oint_S \frac{\{M\} \times dS}{4\pi\mu_0^{-1}\xi} + \int_V \frac{\nabla_S \times \{M\}}{4\pi\mu_0^{-1}\xi}\,dV\right] \tag{7.97}$$

in which $S$ is the bounding surface of the specimen, $V$ its volume, and $\xi$ is the distance
from $dS$ or $dV$ to the field point $(x,y,z)$. $\{M\}$ is the time-retarded value of $M$. This
result is seen to be similar to the static formula (7.4), the only difference being that the
magnetization density is now time-varying and retardation must be included. Equation
(7.97) normally is valid for points within the magnetic specimen as well as without and
permits the interpretation that the specimen is equivalent in its magnetic effect to a
lineal current density $j = M \times \mathbf{1}_n$ on $S$ plus an areal current density $\iota = \nabla \times M$ in $V$.
In parallel with the development of Section 7.3, a macroscopic $H$ field may be related
to the magnetization $M$ and the total field $B = B_1 + B_2$ by the defining equation

$$H = \mu_0^{-1}B - M \tag{7.98}$$

In (7.98) all three vectors are time-varying and $M$ may not be parallel to $B$, in which
case $H$ is not parallel to either of them.
Effective use of Equations (7.97) and (7.98) requires knowledge of the relation
between $B$ and $M$. Unfortunately, unlike the analogous situation in dielectric materials
(for most of which $P$ and $E$ were found to be linearly connected), no simple general
relationship exists between $M$ and $B$ in those materials whose magnetization is sizeable.
Thus if one writes $H = \mu^{-1}B = B/\mu_0(1 + \chi_m)$, the common result is that the dynamic
magnetic susceptibility $\chi_m$ is a nonlinear complex tensor when the fields are time-harmonic. Despite this difficulty, many dynamic cases are amenable to analysis,
particularly those involving small incremental fields. The remainder of this section will
be devoted to a discussion of several of these cases.
1. Diamagnetic susceptibility. Diamagnetic materials have been seen to have slight
negative static magnetic susceptibilities of the order of $10^{-5}$. The phenomenon of diamagnetism is similar to electronic polarizability in dielectrics, and therefore by analogy
one would expect the dynamic diamagnetic susceptibility $\chi_m$ to exhibit a resonance at a
characteristic frequency $f_m$. In the neighborhood of $f_m$ a complex value of $\chi_m$ should be
found, indicating some absorption; below $f_m$ the static value of $\chi_m$ should prevail,
whereas above $f_m$ a negligible value of $\chi_m$ should result. However, the diamagnetic
effect is so small that this phenomenon is of little practical interest and has not been
widely investigated.
2. Cyclotron resonance. If the diamagnetic material is a semiconductor, a different
resonance phenomenon arises when the specimen is placed in a region containing both a
static magnetic field $B_0$ and an electromagnetic field $(E_1, B_1)$ whose electric intensity is
perpendicular to $B_0$. A free electron or mobile hole (cf. Section 8.11) within the semiconductor will experience forces due to these fields such that its equation of motion
between collisions is

$$m^*\frac{dv}{dt} = q\left[E_1 + v \times (B_0 + B_1)\right] \tag{7.99}$$

in which $q = \pm e$ depending on whether the mobile carrier is a hole or electron, and $m^*$
is the effective mass of the hole or electron.†
Since $E_1 = (c/n)B_1$, and since the index of refraction $n$ of the semiconductor is close
to unity, for normal values of $v$ the term $v \times B_1$ may be neglected in comparison to $E_1$.
Upon taking $B_0$ in the Z direction and $E_1$ in the X direction, (7.99) may be expanded
to give

$$qE_1 e^{j\omega t} + q v_y B_0 = m^*\dot{v}_x$$

$$-q v_x B_0 = m^*\dot{v}_y \tag{7.100}$$

$$0 = m^*\dot{v}_z$$

In (7.100) it has been assumed that the steady magnetic field $B_0$ is uniform throughout
the region occupied by the semiconductor specimen, and that the wavelength of the
electromagnetic field $(E_1, B_1)$ is large compared to the size of the specimen, so that the
instantaneous phase of $E_1$ may be assumed to be uniform throughout the specimen.
The third of Equations (7.100) indicates a constant drift in the Z direction. The first
two equations may be combined to give

$$\ddot{v}_x + \omega_c^2 v_x = j\omega\omega_c \frac{E_1}{B_0} e^{j\omega t} \tag{7.101}$$

$$\ddot{v}_y + \omega_c^2 v_y = -\omega_c^2 \frac{E_1}{B_0} e^{j\omega t} \tag{7.102}$$

in which $\omega_c = qB_0/m^*$ is called the cyclotron frequency. These equations have the
particular solution

$$v_x = \frac{j\omega\omega_c}{\omega_c^2 - \omega^2}\,\frac{E_1}{B_0}\,e^{j\omega t} \qquad v_y = -\frac{\omega_c^2}{\omega_c^2 - \omega^2}\,\frac{E_1}{B_0}\,e^{j\omega t} \tag{7.103}$$

which can be recognized as elliptical motion in the XY plane with a resonance at
$\omega = \omega_c$. When collisions are taken into account this resonance is limited to a finite
amplitude. Therefore if the frequency of the $E_1$ field is varied, the absorption of
energy by the specimen from the field will show a characteristic peak at the cyclotron
frequency. Knowledge of $B_0$ then permits determination of the effective mass of the
free electrons or holes.‡ The result is found to depend on crystal orientation, but
typically the effective masses of holes and electrons are found to be in the range 0.1 to
0.35 times the mass of a free electron.⁴⁰
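The determination of effective mass from the absorption peak amounts to inverting the cyclotron relation ω_c = qB_0/m*. A minimal sketch follows; the function name and the measured values (a peak at 24 Gc in a field of 0.1 weber/m²) are hypothetical and serve only to show the order of magnitude.

```python
import math

E_CHARGE = 1.602e-19     # magnitude of the electronic charge, coulombs
M_ELECTRON = 9.109e-31   # free-electron mass, kg


def effective_mass(resonant_frequency_hz, b0):
    """Invert omega_c = q B0 / m* for m*, given the frequency of the absorption peak."""
    omega_c = 2.0 * math.pi * resonant_frequency_hz
    return E_CHARGE * b0 / omega_c


# Hypothetical measurement: absorption peak at 24e9 cycles/sec with B0 = 0.1 weber/m^2.
m_star = effective_mass(24e9, 0.1)
print(f"m*/m = {m_star / M_ELECTRON:.3f}")
```

For these assumed numbers the ratio falls inside the 0.1 to 0.35 range quoted in the text.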
3. Paramagnetic relaxation. If a magnetic field is suddenly applied to a paramagnetic specimen, the magnetization density $M$ exhibits some inertia and does not

† The effective mass of an electron or hole in a semiconductor differs from the free space mass of an
isolated electron because the periodic potential within the crystalline semiconductor causes the average
motion of the electron or hole to be equivalent to the motion of a particle of altered mass against
a background of uniform potential.
‡ The specimen can be doped so that the density of either free electrons or holes dominates by many
orders of magnitude and thus the effective mass of only one or the other is being measured; cf. Sec. 8.11.
40 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., pp. 371-379, John Wiley and Sons,
Inc., New York, 1956.


immediately reach its static value, but instead approaches it gradually. If after the
ultimate value of magnetization is essentially achieved the field is suddenly removed,
the magnetization will not vanish instantly but instead will decay with the entire
process resembling the trace of Figure 6.22. In the case of many paramagnetic materials
the inertia in this process is attributable to two relaxation mechanisms. The first of
these results from a spin-lattice interaction through which energy may be interchanged
between the spin system and the lattice vibrations. The second mechanism is due to
spin-spin interaction through which an individual spin can exchange energy with the
magnetic field caused by neighboring spins. The spin-spin relaxation time is typically
about $10^{-10}$ seconds and is temperature independent. The spin-lattice relaxation time is
normally much longer; it varies with the material and is strongly temperature dependent.
In analogy with dipolar relaxation in dielectrics (cf. Section 6.19), this phenomenon
may be described in terms of the spin-lattice relaxation time $\tau$; a repetition of that
earlier analysis leads once again to the Debye equations, such that the complex permeability of the paramagnetic specimen is given by

$$\frac{\mu^{-1}(\omega) - \mu_a^{-1}}{\mu_s^{-1} - \mu_a^{-1}} = \frac{1}{1 + j\omega\tau} \tag{7.104}$$

in which $\mu_a$ is the permeability at a frequency $f_a$ such that $f_a^{-1}$ is much less than the
spin-lattice relaxation time $\tau$ but much greater than the spin-spin relaxation time; $\mu_s$ is the
static permeability.
If one introduces the complex paramagnetic susceptibility $\chi_m = \chi_m' - j\chi_m''$ by the
defining equation $\mu = (1 + \chi_m)\mu_0$, and recognizes that $\chi_a$ and $\chi_s$ are both small compared to unity, (7.104) may be converted to give

$$\chi_m' = \chi_a + \frac{\chi_s - \chi_a}{1 + \omega^2\tau^2} \qquad \chi_m'' = \frac{(\chi_s - \chi_a)\,\omega\tau}{1 + \omega^2\tau^2} \tag{7.105}$$

which is a generalization of a result first obtained by Gorter and Kronig.⁴¹ Equations
(7.105) indicate a resonance at $\omega = \tau^{-1}$. Another resonance would occur at a frequency
equal to the reciprocal of the spin-spin relaxation time, but (7.105) does not show this
and these equations are not valid for frequencies that high. Agreement between experiment and (7.105) in the appropriate frequency range is quite good.⁴²
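The location of the absorption maximum can be illustrated numerically. The sketch below assumes the Debye form χ_m = χ_a + (χ_s - χ_a)/(1 + jωτ) with χ_m = χ' - jχ'', which is the small-susceptibility limit of (7.104); the function name and the values of τ, χ_s, and χ_a are hypothetical.

```python
def debye_susceptibility(omega, tau, chi_s, chi_a):
    """Return (chi', chi'') for chi = chi' - j chi'' = chi_a + (chi_s - chi_a)/(1 + j omega tau)."""
    chi = chi_a + (chi_s - chi_a) / (1.0 + 1j * omega * tau)
    return chi.real, -chi.imag


tau, chi_s, chi_a = 1e-6, 2e-3, 5e-4   # hypothetical relaxation time (sec) and susceptibilities

below = debye_susceptibility(0.5 / tau, tau, chi_s, chi_a)
peak = debye_susceptibility(1.0 / tau, tau, chi_s, chi_a)
above = debye_susceptibility(2.0 / tau, tau, chi_s, chi_a)
print(below, peak, above)
```

Under this assumption the absorptive part χ'' is largest at ω = 1/τ, where it equals (χ_s - χ_a)/2, while the dispersive part χ' falls monotonically from χ_s toward χ_a.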
An interesting independent derivation of (7.105) has been provided by Casimir and
Du Pré using a thermodynamic argument.⁴³ A key point in their analysis is that the
spin-spin relaxation time is so short compared to the spin-lattice relaxation time that
the spin system can be treated as though it were always in thermodynamic equilibrium.
For this reason the aggregation of spins may be taken as a separate thermodynamic
41 C. J. Gorter and R. Kronig, "On the Theory of Absorption and Dispersion in Paramagnetic and
Dielectric Media," Physica, 3, 1009-1020; November 1936. In their analysis spin-spin relaxation was
ignored, which is equivalent to setting $\chi_a$ equal to zero.
42 L. J. F. Broer and C. J. Gorter, "Paramagnetic Dispersion in Gadolinium Salts," Physica, 10,
621-628; October 1943.
43 H. B. G. Casimir and F. K. Du Pré, "Thermodynamic Interpretation of Paramagnetic Relaxation
Phenomena," Physica, 5, 507-511; June 1938.


system, with its own temperature and entropy. This assumption gains its validity from
the fact that it leads to the Debye equations, and it further serves to explain the process
whereby temperatures below 1°K are attained by the adiabatic demagnetization of a
paramagnetic salt.
This cooling process is indicated by the entropy-temperature diagram of Figure 7.21.
The paramagnetic specimen is placed in good thermal contact with its surroundings
when the spin system has a temperature $T_1$ and an entropy $S_1$. An external magnetic
field is then applied, increasing the ordering of the magnetic moments and thus decreasing their entropy. Heat flows from the spin system to the lattice to the surroundings
and the isothermal path ab is traced out. The specimen is then insulated from its
surroundings and the external field is removed, resulting in the isentropic path bc and a
lower temperature $T_2$. Successive repetitions of this process have yielded temperatures
as low as $10^{-3}$ deg abs. The ultimate temperature attainable is limited principally by the
natural splitting of the spin energy levels (caused by the crystalline field).

[Figure 7.21: Adiabatic demagnetization. Entropy versus temperature T, with the temperatures $T_2$ and $T_1$ marked on the T axis.]
4. Electron paramagnetic resonance. The discussion of Section 7.8 indicated that,
in a crystalline solid, the magnetic moments of the individual atoms or ions can adopt
only $2J + 1$ orientations with respect to an external field direction. In a paramagnetic
material these magnetic moments are essentially independent and noninteracting. As a
consequence the $2J + 1$ energy levels are approximately equally spaced in the presence
of a macroscopic field $B_0$, the increments being $gm_0B_0$ joules (cf. Equation 7.52).
If in addition to the steady field $B_0$ the specimen is in the presence of an electromagnetic wave of angular frequency $\omega$, chosen to satisfy

$$\hbar\omega = gm_0B_0 \tag{7.106}$$

then transitions among the $2J + 1$ energy levels will be induced and the internal array
of magnetic moments can absorb energy from the wave. This phenomenon is known as
electron paramagnetic resonance. It may be observed by positioning a specimen inside


a rectangular waveguide and then placing the waveguide and specimen between the
pole faces of an electromagnet so that the field of the electromagnet is perpendicular to
the broad walls of the waveguide. When a TE$_{10}$ mode is passed down the waveguide
and detected at the far end, the detector readings show a dip as the field $B_0$ of the electromagnet is varied, the dip occurring at the value of $B_0$ which satisfies (7.106).
The first observation of this phenomenon was made by Zavoisky⁴⁴ in 1945 using the
paramagnetic salt CuCl₂·2H₂O. The clearly resolved resonance of Mn²⁺ obtained by
Cummerow and Halliday is illustrative of the experimental results which can be
achieved and is reproduced in Figure 7.22. Upon inserting in (7.106) the values
$\omega/2\pi = 2{,}930 \times 10^6$ and $B_0 = 0.1125$ derived from their experiment, one can determine
that $g = 1.86$ for the paramagnetic ion Mn²⁺. In an iron-group salt such as this, one
would expect the g factor to have a value close to the Landé factor $g_J$. The calculated
value of $g_J$ is 2 so the agreement is quite satisfactory.

[Figure 7.22: Paramagnetic resonance of Mn²⁺ ion in MnSO₄·4H₂O, observed at 2930 Mc. Horizontal axis: electromagnet field in webers/m². After Cummerow and Halliday, Phys Rev, 70, 433; 1946.]
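The quoted g value follows directly from the resonance condition ħω = g m_0 B_0. The short sketch below repeats the calculation with the experimental numbers given in the text; the function name is hypothetical, and the values used for Planck's constant and the Bohr magneton are standard ones, not figures taken from this section.

```python
PLANCK_H = 6.626e-34        # Planck's constant, joule-sec (standard value, assumed)
BOHR_MAGNETON = 9.274e-24   # m0, amp-m^2 (standard value, assumed)


def g_factor(frequency_hz, b0):
    """g from the resonance condition: h f = g m0 B0, with hbar * omega = h * f."""
    return PLANCK_H * frequency_hz / (BOHR_MAGNETON * b0)


g = g_factor(2930e6, 0.1125)   # values quoted for the Mn2+ resonance
print(f"g = {g:.2f}")
```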
Electron paramagnetic resonance has proved to be an important research tool in
gaining further understanding of various interactions within solids and liquids. For
example, the crystalline electric field inside many solids affects the allowed orientations
of magnetic moments. Therefore an anisotropy in the measured g values is observed for
these materials as the crystal axes are changed with respect to the field direction. This
anisotropy permits deductions to be made about the crystalline field. Other investigations have been concerned with the detection of impurity atoms in semiconductors and
with studies of free radicals, excited molecules, and conduction electrons in metals.
5. Ferromagnetic resonance. Electron resonance also has been observed in ferromagnetic substances, using the same absorption technique at microwave frequencies
which has just been described for paramagnetic materials. The ferromagnetic specimen
is usually a thin plate or disc, inserted in the waveguide so that its faces are transverse
to the longitudinal waveguide axis, and thus parallel to the $B_0$ field of the electromagnet.
For a constant microwave frequency, as the field of the electromagnet is varied, the
44 E. Zavoisky, "On the Absence of Anisotropy for Spin Magnetic Resonance," J Phys USSR, 9,
447-448; July 1945.


absorption traces a curve typified by Figure 7.23. This effect was first observed by
Griffiths⁴⁵ in 1946.
The analysis of ferromagnetic resonance differs from that of paramagnetic resonance
because the individual magnetic moments in a ferromagnetic specimen cannot respond
to an incident microwave field independently of each other. Consequently, it is appropriate to consider the interaction of the microwaves with the induced magnetization

[Figure 7.23: Ferromagnetic resonance in nickel ferrite at 24 Gc. Horizontal axis: electromagnet field in webers/m², 0.68 to 0.80. After Yager, Galt, Merritt, and Wood, Phys Rev, 80, 744; 1950.]

density $M$, rather than with the individual atomic magnetic moments. To this end, let
$\mathbf{m}$ be the magnetic moment of a single atom and let $\mathcal{J}$ be the angular momentum of the
electron cloud, these quantities being related by $\mathbf{m} = -g(e/2m)\mathcal{J}$ (cf. Equation 7.50).
In the presence of a magnetic field $B_{\mathrm{loc}}$, the electron cloud experiences a torque $\mathbf{m} \times B_{\mathrm{loc}}$
(cf. Example 4.7) and therefore the dynamic equation of motion of the cloud is

$$\mathbf{m} \times B_{\mathrm{loc}} = \frac{d\mathcal{J}}{dt} = -\frac{2m}{ge}\,\frac{d\mathbf{m}}{dt}$$

45 J. H. E. Griffiths, "Anomalous HF Resistance of Ferromagnetic Metals," Nature, 158, 670-671;
November 9, 1946.


Upon multiplying both sides of this equation by the atom density $N$, one obtains

$$\frac{dM}{dt} = -\Gamma M \times B_{\mathrm{loc}} = -\Gamma M \times B = -\Gamma\mu_0 M \times H \tag{7.107}$$

in which $\Gamma = g(e/2m)$ is a substitution constant, called the gyromagnetic ratio, and the
two reductions in (7.107) are by virtue of (7.24) and (7.15).
If Z is selected as the longitudinal axis of the waveguide, with Y perpendicular to the
broad walls, the total field and magnetization may be written

$$H = \mathbf{1}_y H_0 + H_1 e^{j\omega t} \qquad M = \mathbf{1}_y M_0 + M_1 e^{j\omega t} \tag{7.108}$$

in which $H_0$ is the quasistatic field of the electromagnet, $M_0$ is the magnetization it
induces, $H_1$ is the microwave field, and $M_1$ is the magnetization it induces. In a typical
experiment $H_0 \gg H_1$, $M_0 \gg M_1$. Making this assumption so that only first-order terms
are retained when (7.108) is inserted in (7.107), one finds that

$$j\omega M_{1x} = \Gamma\mu_0(M_{1z}H_0 - M_0H_{1z})$$

$$j\omega M_{1y} \approx 0 \tag{7.109}$$

$$j\omega M_{1z} = \Gamma\mu_0(M_0H_{1x} - M_{1x}H_0)$$

Since $H_{1z}$ is very much greater inside the ferromagnetic specimen than without, the
discontinuity in $M_{1z}$ at the material faces is accounted for essentially by the value of
$H_{1z}$ within the specimen. Thus one may replace $-H_{1z}$ by $M_{1z}$ in the first of Equations
(7.109) and then solve for the dynamic magnetic susceptibility in the X direction,
obtaining

$$\chi_x = \frac{M_{1x}}{H_{1x}} = \frac{M_0/H_0}{1 - (\omega/\omega_0)^2} \tag{7.110}$$

with $\omega_0 = g(e/2m)\sqrt{\mu_0 H_0 B_0}$ the ferromagnetic resonant frequency.
Equation (7.110) indicates a resonance which is useful in determining the g values for
various ferromagnetic materials, and representative values include 2.15 for iron, 2.20
for nickel, and 2.22 for cobalt. These results are in satisfactory agreement with g values
obtained for these same elements using gyromagnetic experiments.⁴⁶
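As an illustrative check, the susceptibility (7.110) can be compared with a direct solution of the first and third of Equations (7.109) after $-H_{1z}$ is replaced by $M_{1z}$. The sketch assumes $B_0 = \mu_0(H_0 + M_0)$ inside the specimen; the function names and the values of g, H_0, and M_0 are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi    # permeability of free space, H/m
E_OVER_2M = 8.794e10    # e/2m for the electron, coulombs/kg


def resonant_omega(g, h0, m0):
    """omega_0 = g (e/2m) sqrt(mu0 H0 B0), assuming B0 = mu0 (H0 + M0)."""
    b0 = MU0 * (h0 + m0)
    return g * E_OVER_2M * math.sqrt(MU0 * h0 * b0)


def chi_x_formula(omega, g, h0, m0):
    """Equation (7.110)."""
    return (m0 / h0) / (1.0 - (omega / resonant_omega(g, h0, m0)) ** 2)


def chi_x_direct(omega, g, h0, m0):
    """Solve for M1x with H1x = 1 from the two coupled first-order equations."""
    G = g * E_OVER_2M * MU0              # Gamma * mu0
    # j w M1x - G (H0 + M0) M1z = 0
    # G H0 M1x + j w M1z        = G M0
    a11, a12 = 1j * omega, -G * (h0 + m0)
    a21, a22 = G * h0, 1j * omega
    det = a11 * a22 - a12 * a21
    return (-a12 * G * m0) / det         # Cramer's rule with right-hand side (0, G M0)


g, h0, m0 = 2.2, 5.0e5, 4.8e5            # hypothetical g value and fields in amp/m
omega = 0.5 * resonant_omega(g, h0, m0)
print(resonant_omega(g, h0, m0) / (2 * math.pi), chi_x_formula(omega, g, h0, m0),
      chi_x_direct(omega, g, h0, m0))
```

The direct solution is real and matches the closed form, and the resonant frequency for these assumed values lies in the microwave range.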


6. Hysteresis loss. If the external field applied to a ferromagnetic specimen contains
a time-harmonic component, the induced magnetization generally will consist of the
sum of steady and cyclical terms. The B-H relation will take on the appearance of the
closed loop shown in Figure 7.24, this loop being traced out once per cycle. The variations in magnetic moment are resisted by the crystal, and energy must flow from the
exciting field into the specimen to provide for this hysteresis loss.
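The instantaneous power being supplied to a specimen of volume V (cf. Section 7.6) is

$$P = \int_V \mathbf{H} \cdot \dot{\mathbf{B}}\; dV$$

so that the energy supplied per cycle is

$$W_m = \int_0^{\tau} P\, dt = \int_V \oint_C \mathbf{H} \cdot d\mathbf{B}\; dV \tag{7.111}$$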

46 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., pp. 408-410, John Wiley and Sons,
Inc., New York, 1956.


in which C is the contour of the hysteresis loop and $\tau = 1/\nu$ is the period of the cyclic
field variations. If H and B are parallel, and if each is uniform throughout V, (7.111)
reduces to

$$W_m = V \oint_C H\, dB \tag{7.112}$$

and the integral in (7.112) can be recognized as the area enclosed by the hysteresis loop
in the BH plane. The time-average power supplied to account for hysteresis losses is
then simply $\nu$ times the value of $W_m$ computed from (7.112).
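The loop-area interpretation can be checked numerically. The sketch below approximates the contour integral of H dB with the shoelace formula applied to sampled points of a loop, then multiplies by the volume and the frequency. The elliptical loop with a steady bias is a synthetic stand-in (real hysteresis loops are not ellipses), and all numerical values and function names are assumptions made for illustration.

```python
import math


def loop_integral_h_db(h_vals, b_vals):
    """Approximate the closed-contour integral of H dB (shoelace formula).

    The points must trace the loop once, counterclockwise in the (H, B) plane,
    which is the sense in which a lagging B traverses a hysteresis loop.
    """
    n = len(h_vals)
    twice_area = 0.0
    for i in range(n):
        k = (i + 1) % n  # wrap around to close the contour
        twice_area += h_vals[i] * b_vals[k] - h_vals[k] * b_vals[i]
    return 0.5 * twice_area


def hysteresis_power(h_vals, b_vals, volume, nu):
    """Time-average power: nu times the energy per cycle, V times the loop area."""
    energy_per_cycle = volume * loop_integral_h_db(h_vals, b_vals)
    return nu * energy_per_cycle


def elliptical_loop(h_bias, h_amp, b_bias, b_amp, lag, n=2000):
    """Synthetic loop: B lags H by the angle `lag`; both carry a steady bias."""
    thetas = [2.0 * math.pi * i / n for i in range(n)]
    h = [h_bias + h_amp * math.cos(t) for t in thetas]
    b = [b_bias + b_amp * math.cos(t - lag) for t in thetas]
    return h, b


# Assumed illustrative values: H in A/m, B in tesla, volume in m^3, nu in Hz
h, b = elliptical_loop(500.0, 100.0, 0.8, 0.2, 0.3)
area = loop_integral_h_db(h, b)
power = hysteresis_power(h, b, volume=1e-6, nu=60.0)
print("energy density per cycle (J/m^3):", area)
print("time-average power (W):", power)
```

For this synthetic loop the exact area is pi times the two amplitudes times the sine of the lag angle, and the steady bias has no effect on it, since shifting a closed loop does not change the area it encloses.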
FIGURE 7.24 Hysteresis loop with steady bias. (Axes: B versus H.)

7. Tensor permeability in ferrites. If a ferrite specimen of arbitrary shape is placed
in a region containing both time-independent and time-harmonic magnetic fields, with
the steady field Z directed and much greater than the cyclic field, then the first-order
equations relating the magnetization and the fields are

$$\begin{aligned}
j\omega M_{1x} &= -\Gamma\mu_0\,(M_{1y}H_0 - M_0H_{1y}) \\
j\omega M_{1y} &= -\Gamma\mu_0\,(M_0H_{1x} - M_{1x}H_0) \\
j\omega M_{1z} &\simeq 0
\end{aligned} \tag{7.113}$$

These equations arise from an expansion of (7.107), the development being similar to
that which led to (7.109). Simultaneous solution of these equations for $M_{1x}$ and $M_{1y}$
yields

(7.114)


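These equations may be written in matrix notation, namely,

$$\begin{pmatrix} M_{1x} \\ M_{1y} \end{pmatrix} = \frac{M_0/H_0}{1 - (\omega/\omega_0)^2} \begin{pmatrix} 1 & j\omega/\omega_0 \\ -j\omega/\omega_0 & 1 \end{pmatrix} \begin{pmatrix} H_{1x} \\ H_{1y} \end{pmatrix} \tag{7.115}$$

which can be recognized as being in the form $\mathbf{M}_1 = \chi_m \mathbf{H}_1$, with $\chi_m$ the complex tensor
magnetic susceptibility, since it is identifiable as $[M_0/H_0]/[1 - (\omega/\omega_0)^2]$ times the
matrix appearing in (7.115). From this it follows that the complex tensor permeability,
defined by $\mathbf{B}_1 = \boldsymbol{\mu}\mathbf{H}_1 = \mu_0(1 + \chi_m)\mathbf{H}_1$, is given by the expression

$$\boldsymbol{\mu} = \mu_0 \begin{pmatrix} \epsilon & -j\kappa & 0 \\ j\kappa & \epsilon & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{7.116}$$

in which

$$\epsilon = 1 + \frac{M_0/H_0}{1 - (\omega/\omega_0)^2} \tag{7.117}$$

$$\kappa = -\frac{\omega}{\omega_0}\,\frac{M_0/H_0}{1 - (\omega/\omega_0)^2} \tag{7.118}$$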

Equation (7.116) indicates that in the presence of a static longitudinal field, the transverse components of the time-harmonic field are coupled, in the sense that either $B_{1x}$
or $B_{1y}$ can give rise to both $H_{1x}$ and $H_{1y}$ components. This coupling has a resonant
feature at the angular frequency $\omega_0$.
EXAMPLE 7.9

Electromagnetic waves can propagate through a ferrite medium without suffering intolerable attenuation because the conductivity is low. The interaction of the waves and the
ferrite is particularly interesting if the waves are circularly polarized and propagating in
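the direction of the static field $B_0$. If one lets $H_{1y} = \mp jH_{1x}$,† with $H_{1z} = 0$, Equation (7.116)
leads to

$$\begin{pmatrix} B_{1x} \\ B_{1y} \\ B_{1z} \end{pmatrix} = \mu_0 \begin{pmatrix} \epsilon & -j\kappa & 0 \\ j\kappa & \epsilon & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} H_{1x} \\ \mp jH_{1x} \\ 0 \end{pmatrix}$$

$$B_{1x} = \mu_0(\epsilon \mp \kappa)H_{1x}$$

$$B_{1y} = \mu_0(j\kappa \mp j\epsilon)H_{1x} = \mu_0(\epsilon \mp \kappa)H_{1y}$$

and therefore $\mathbf{B}_1 = \mu_0(\epsilon \mp \kappa)\mathbf{H}_1$. This means that right and left circularly polarized waves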
propagate through the ferrite as though it had a simple scalar permeability, but the value
of this permeability is different for the two senses of rotation. This fact has led to a variety
of useful microwave devices utilizing ferrites. Since $\epsilon$ and $\kappa$ are both affected by $H_0$, varying
the d.c. magnetic field will cause a circularly polarized wave to suffer a variable phaseshift
in traversing a ferrite section. This phaseshifting feature is useful in its own right. In addition, a linearly polarized wave may be treated as equal amounts of right and left circularly
polarized waves. Thus when a linearly polarized wave passes through the ferrite, its polarization rotates; this feature permits the construction of circulators through the use of two
output ports in a waveguide section containing a ferrite, these ports being disposed at
right angles to each other. Control of $\epsilon$ and $\kappa$ through $H_0$ determines which of these ports

† This is the condition for circularly polarized waves. Cf. Sec. 5.10.


is coupled to the emerging wave. The reader interested in these and other ferrite devices
is referred to the literature.⁴⁷
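The statement that a circularly polarized wave sees a scalar permeability can be verified directly. The sketch below assumes the gyrotropic form of the relative permeability tensor, with diagonal element eps, off-diagonal elements -j kappa and +j kappa, and unity along the static field; it also assumes the resonance-type expressions for eps and kappa written in the code. The normalized frequencies, the value of M0/H0, and all names are hypothetical choices for illustration.

```python
def tensor_elements(omega, omega0, chi0):
    """Assumed forms: eps = 1 + chi0/d, kappa = -(omega/omega0) chi0/d,
    with d = 1 - (omega/omega0)^2 and chi0 = M0/H0."""
    d = 1.0 - (omega / omega0) ** 2
    eps = 1.0 + chi0 / d
    kappa = -(omega / omega0) * chi0 / d
    return eps, kappa


def relative_b(eps, kappa, h):
    """Apply the assumed relative tensor [[eps, -j kappa, 0], [j kappa, eps, 0], [0, 0, 1]]."""
    hx, hy, hz = h
    return (eps * hx - 1j * kappa * hy, 1j * kappa * hx + eps * hy, hz)


# Illustrative normalized values (assumptions, not data from the text)
eps, kappa = tensor_elements(omega=0.6, omega0=1.0, chi0=2.0)

results = {}
for s in (-1, +1):                 # H1y = s * j * H1x: the two senses of rotation
    h = (1.0, s * 1j, 0.0)
    b = relative_b(eps, kappa, h)
    scalar = eps + s * kappa       # effective scalar relative permeability
    results[s] = (h, b, scalar)
    print("sense", s, "-> effective relative permeability", scalar)
```

Each sense of rotation is an eigenvector of the tensor, so B is simply a scalar multiple of H, and the two scalars differ by twice kappa. That difference is what the phase shifters and polarization rotators described in the example exploit.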

7.17 MAXWELL'S EQUATIONS FOR MAGNETIC MATERIALS


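If a collection of dielectric and/or magnetic materials† is considered at the microscopic
level to consist of an aggregation of charged particles in motion in a vacuum, then the
free space Maxwell's equations

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mu_0^{-1}\mathbf{B} = \boldsymbol{\iota}_t + \epsilon_0\dot{\mathbf{E}}$$

are applicable at points within these materials, with $\boldsymbol{\iota}_t$ representing the total current
density. $\boldsymbol{\iota}_t$ is expressible as the linear sum

$$\boldsymbol{\iota}_t = \boldsymbol{\iota} + \boldsymbol{\iota}_b + \boldsymbol{\iota}_m$$

wherein $\boldsymbol{\iota}$ is the primary current density, $\boldsymbol{\iota}_b = \dot{\mathbf{P}}$ is the contribution made by the time-varying
dipole moments which represent the dielectric effect, and $\boldsymbol{\iota}_m = \nabla \times \mathbf{M}$ is the
contribution made by the time-varying magnetic moments which represent the magnetic
effect.
Proceeding as in Section 6.21, one may account for $\boldsymbol{\iota}_b$ by writing Maxwell's equations
in the form

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mu_0^{-1}\mathbf{B} = \boldsymbol{\iota} + \boldsymbol{\iota}_m + \dot{\mathbf{D}} \tag{7.119}$$

Since

$$\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$$

it follows that

$$\boldsymbol{\iota}_m = \nabla \times \mathbf{M} = \mu_0^{-1}\nabla \times \mathbf{B} - \nabla \times \mathbf{H}$$

When this relation is substituted in (7.119) one obtains

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mathbf{H} = \boldsymbol{\iota} + \dot{\mathbf{D}} \tag{7.120}$$

which is the form of Maxwell's equations suitable for application to dielectric and/or
magnetic materials. The auxiliary relations remain

$$\nabla \cdot \mathbf{D} = \rho \qquad \nabla \cdot \mathbf{B} = 0 \tag{7.121}$$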

These results will be generalized further in the next chapter to include conductive
materials.

† This includes materials which are both dielectric and magnetic.
47 See, e.g., C. L. Hogan, "The Microwave Gyrator," BSTJ, 31, 1-31; January 1952. Also, R. F.
Soohoo, Theory and Applications of Ferrites, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1960.

REFERENCES

1. Corson, D. R., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H.
Freeman and Company, San Francisco, California, 1962.
2. Dekker, A. J., Solid State Physics, Prentice-Hall, Inc., Englewood Cliffs, New Jersey,
1957.
3. Dekker, A. J., Electrical Engineering Materials, Prentice-Hall, Inc., Englewood Cliffs,
New Jersey, 1959.
4. Katz, H. W., ed., Solid State Magnetic and Dielectric Devices, John Wiley and Sons, Inc.,
New York, 1959.
5. Kittel, C., Introduction to Solid State Physics, 2d ed., John Wiley and Sons, Inc., New
York, 1956.
6. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley
Publishing Company, Inc., Reading, Massachusetts, 1955.
7. Reitz, J. R., and F. J. Milford, Foundations of Electromagnetic Theory, Addison-Wesley
Publishing Company, Inc., Reading, Massachusetts, 1960.
8. Sears, F. W., An Introduction to Thermodynamics, Kinetic Theory of Gases, and Statistical
Mechanics, 2d ed., Addison-Wesley Publishing Company, Inc., Reading, Massachusetts,
1953.
9. Van Vleck, J. H., Theory of Electric and Magnetic Susceptibilities, Oxford University Press,
London, 1932.

PROBLEMS

7.1 A magnetic specimen has a permanent magnetization $\mathbf{M}(x,y,z)$ and occupies a volume V.
Its magnetic moment is defined by the integral $\int_V \mathbf{M}\, dV$. If the specimen is placed in a
uniform magnetic field $\mathbf{B}_0$, find the torque it experiences in terms of its magnetic moment.

7.2 A long straight wire of radius a carries a steady current I and is on the axis of a long
hollow cylinder of soft iron. The inner and outer radii of the cylinder are b and c and its
relative permeability is $\mu_r$. Find the total flux of B inside the cylinder, per unit length.
Determine the equivalent current density throughout the iron cylinder, including its inner
and outer surfaces. From this express B external to the cylinder.

7.3 Show that the force on an atomic current loop of strength $\mathbf{m}$, when placed in a nonuniform
magnetic field $\mathbf{B}$, is given by $\mathbf{F} = (\mathbf{m} \cdot \nabla)\mathbf{B}$.

7.4 In a development which parallels Example 6.3, discuss the types of cavities which could
be constructed inside a specimen of magnetic material in order to measure B and H.

7.5 A spherical shell of radii a and b is permanently magnetized with a uniform magnetization M. Find B and H in all three regions.

7.6 A homogeneous spherical shell of magnetic material, of permeability $\mu < \mu_0$, is placed in
a field which was originally uniform at strength $B_0$. Find the total field inside the shell.

7.7 A homogeneous sphere of magnetic material is uniformly magnetized to a strength M
by a coil on its surface, which carries a steady current I. How is the winding designed?
(The Westinghouse-Goudsmit mass spectrometer employs such a winding on a hollow
sphere.)

7.8 If an atomic model is assumed which consists of a nucleus surrounded by a spherical
charge cloud, all elements of the latter rotating about the nucleus at a common angular
velocity $\omega$, show that the orbital magnetic moment of such an atom is also given by (7.41).

7.9 A homogeneous sphere of mass M has a charge Q uniformly distributed over its surface
and is rotating about an axis through its center at angular velocity $\omega$. Find the gyromagnetic ratio for this system (i.e., the ratio of magnetic moment to angular momentum).

7.10 Using the formula for the sum of a geometric progression, show that (7.55) reduces to
(7.56), with the Brillouin function given by (7.57). Then show that, for $x \ll 1$, the Brillouin
function is well-approximated by $(J + 1)x/3J$. Finally show that for J large the Brillouin
and Langevin functions are essentially equivalent.

7.11 The Curie constant and paramagnetic Curie temperature which appear in the Curie-Weiss law (7.61) may be deduced from experimental data relating $\chi_m$ to temperature.
For nickel, such data are shown in the plot of Figure 7.8. Use this curve to determine
C and $\theta$ for nickel, and then use (7.67) to check the reasonableness of your results.

7.12 The trivalent ion of thulium, Tm$^{3+}$, has the outer shell configuration $4f^{12}5s^2p^6$. Using
Hund's rules and the Landé formula, show that for this ion S = 1, L = 5, J = 6, and
$g_J = 1.167$. Thus show that the magnetic moment is 7.57 Bohr magnetons. (The experimental value is 7.3.)

7.13 For several typical ferromagnetic materials, establish that $m_0 B_{loc} \sim kT$. Is this a reasonable result?

7.14 Using the value J = 1/2 and a construction similar to Figure 7.10, determine the fractional
magnetization for nickel at room temperature. What is the saturation magnetization?

7.15 From Figure 7.20 deduce the internal field constants $\alpha$, $\beta$, $\gamma$ for magnetite.

CHAPTER 8

Conductive Materials

ELECTRICAL CONDUCTIVITY varies widely from one class of materials to another: at
room temperature a typical insulator will display a conductivity in the range of $10^{-16}$
mhos/m, whereas for a semiconductor the value might be $10^{-2}$ mhos/m, and for a
metal such as silver the conductivity will be as high as $10^{8}$ mhos/m. Most of the discussion to be undertaken in this chapter will be concerned with highly conductive materials
though some of the results (such as Ohm's law) will be seen to apply to the less conductive
materials as well.
The band theory of solids will be invoked to explain the basic differences among
insulators, semiconductors, and conductors. The free electron theory will then be used
to describe the conduction process in metals. This theory views a metal, at the atomic
level, as consisting of a lattice of positive ions which is held together by an electron gas,
with motions of this gas accounting for the conductive properties. With the aid of this
model, a derivation of Ohm's law will be presented which assumes an electron-lattice
interaction that impedes the flow of the electron gas. A relaxation time and mean free
path will be deduced for the electrons with the aid of Fermi-Dirac statistics and an
atomic interpretation of Joule heat loss will also be developed. The temperature
dependence of the resistivity of metals will be considered, including the effects of
impurities. Some attention will be given to the heat capacity and thermal conductivity
of metals and the connection between thermal and electrical conductivities will be
developed. Consideration will then be given to semiconductors, and the manner in
which their conductivity varies with impurity concentration and temperature. The
chapter concludes with a discussion of the form of Maxwell's equations suitable for
conductive media.

8.1* HISTORICAL SURVEY

The discovery that certain materials could be used to convey electricity from one place
to another was made by Stephen Gray¹ in 1729. His experiments were quasistatic and
originally dealt with a glass tube about three feet long, to one end of which he fitted a
cork. Upon rubbing the glass tube Gray found that the cork also became electrified, and
concluded that . . . "there was certainly an attractive Vertue communicated to the

* This section may be omitted without loss in continuity of the technical presentation.
1 S. Gray, "Several Experiments Concerning Electricity," Phil Trans Roy Soc (London), 37, 18-44;
February 1731.


Cork by the excited Tube." Stimulated by this result, Gray interposed a wooden rod
between the glass tube and cork and observed the same effect. Next he connected tube
and cork with iron or brass wire. Still the same effect, undiminished by the length of
wire. Finally, he tied one end of a length of hemp cord to the glass rod and the other
end to an ivory ball. Using lengths of cord as great as four hundred feet, Gray was able
to electrify the ball by rubbing the distant glass rod.
Among those to whom Gray first communicated this discovery of electrical conduction was J. T. Desaguliers (1683-1744), who continued the experiments after Gray's
death in 1736. Desaguliers determined² that only a limited class of materials, notably
the metals, could convey electricity easily, and to these materials he gave the name
conductors.
Further progress with the investigation of conductive properties was hampered by
the lack of a source capable of maintaining the flow of electricity. However, Beccaria
was able to show³ in 1753 that when an electric discharge was passed through a circuit
containing a tube of water the shock was more powerful if the tube cross section were
increased. And Henry Cavendish partially anticipated Ohm's law when he showed
that the resistance of a conductor is independent of the strength of the discharge. He
also established the manner in which a discharge divides itself among a set of conductors in parallel, and determined several relative conductivities, saying in a memoir⁴
presented to the Royal Society in 1775:
It appears from some experiments, of which I propose shortly to lay an account before
this Society, that iron wire conducts about 400 million times better than rain or distilled
water-that is, the electricity meets with no more resistance in passing through a piece of
iron wire 400,000,000 inches long than through a column of water of the same diameter
only one inch long. Sea-water, or a solution of one part of salt in 30 of water, conducts
100 times, or a saturated solution of sea-salt about 720 times, better than rain-water.

The details of these experiments lay undisclosed for a century until Maxwell's posthumous edition of Cavendish's papers appeared in 1879.
Invention of the first chemical battery by Alessandro Volta (1745-1827) was an
immediate stimulus to the study of conductivity. Motivated by Galvani's researches
on animal electricity, Volta developed a pile consisting of pairs of strips of dissimilar
metals immersed in brine or a weak acid electrolyte. When a circuit was formed by
connecting a wire across the pairs of strips, a continuous electric current was observed
to flow. This was one of the most important discoveries in the history of electrical
science, and was communicated to Sir Joseph Banks, President of the Royal Society
in London, in a letter dispatched by Volta from his home in Como, Italy, on March
20, 1800. This letter arrived in two sections and was ultimately read before the Society
on June 26th, being published in the Transactions later that year.⁵
The announcement of this discovery was so startling that scientists on both sides
2 J. T. Desaguliers, "Some Thoughts and Experiments Concerning Electricity," Phil Trans Roy Soc
(London), 41, 186-210; July 1739.
3 G. B. Beccaria, Dell' elettricismo artificiale e naturale, p. 113, Turin, 1753.
4 H. Cavendish, "An Account of Some Attempts to Imitate the Effects of the Torpedo by Electricity,"
Phil Trans Roy Soc (London), 66, 196-225; 1776.
5 A. Volta, "On the Electricity Excited by the Mere Contact of Conducting Substances of Different
Kinds," Phil Trans Roy Soc (London), 90, 403-436; 1800.


of the Atlantic immediately set forth to repeat and extend Volta's experiments. Even
before the delayed second section of Volta's letter reached England, Nicholson and
Carlisle had constructed a voltaic pile and with it effected the electrical decomposition
of water into its constituent gases. This achievement was then extended by Cruickshank, who showed that metallic salt solutions could be similarly decomposed. Wollaston next showed⁶ that water could also be decomposed by a discharge of frictional
electricity, thus inferring that the sources of voltaic electricity were common with
those of electrostatic phenomena.
These experiments attracted the attention of Humphry Davy (1778-1829), who at
about this time was appointed Professor of Chemistry at the Royal Institution in
London. Together with William Pepys, an instrument maker and Fellow of the Royal
Society, Davy designed and had constructed a succession of voltaic piles which were
the largest then in existence. The last of these was built in 1808 and consisted of 2,000
pairs of plates of zinc and copper, each plate being 6 in. square. With these batteries,
Davy melted iron wires up to a tenth inch in diameter and decomposed alkalis, obtaining thereby potash and soda ash from which he extracted the new elements potassium
and sodium. He was also able to melt quartz, sapphire, and platinum, to evaporate
diamond, and to boil liquids such as water and oil. The new elements barium, strontium,
magnesium, and calcium were extracted from the decomposition of alkaline earths.
And Pepys in 1815 utilized the intense heat developed by the voltaic pile to melt iron
wire and diamond dust together, thus directly carburizing the iron and producing steel.
In 1821 Davy turned his attention to the problem of determining the ability of
various metals to conduct a voltaic current.⁷ He accomplished this by connecting a
voltaic battery across a circuit consisting of a column of water in parallel with the
metallic wire being investigated. When the length of wire was less than a certain critical
value, the division of current was such that the water ceased to decompose. Davy
measured the lengths and weights of wires of different materials which would cause
this critical condition; by comparing the results he was able to show that the critical
conductance of a wire was inversely proportional to its length l and directly proportional
to its cross sectional area A, though independent of the shape of the cross section.
Critical conductance could thus be expressed by the formula $G = \sigma(A/l)$ in which $\sigma$ is a
fundamental material property called the electrical conductivity. With this apparatus
Davy also was able to compare the conductivities of different metals, and determined
additionally that critical conductivity varied inversely with temperature.
A year earlier Ampere had provided a usable definition for current and devised an
instrument for measuring it, which he called a galvanometer. (Cf. Section 4.1.) He
distinguished between electric tension (voltage) and electric current, and observed that
electric tension existed in a voltaic battery before the circuit was closed, being detectable through the use of an electroscope. He viewed tension as a cause and current as an
effect. Ampere realized that a relation existed between the cause and the effect, but
neither he nor Davy appreciated that the relation was a simple ratio in proportion to
6 W. H. Wollaston, "Experiments on the Chemical Production and Agency of Electricity," Phil Trans
Roy Soc (London), 91, 427-434; June 1801.
7 H. Davy, "Further Researches on the Magnetic Phenomena Produced by Electricity; With Some
New Experiments on the Properties of Electrified Bodies in Their Relations to Conducting Powers and
Temperature," Phil Trans Roy Soc (London), 111, 425-439; July 1821.


Davy's critical conductivity figures. This final link in the chain was forged by Georg
Simon Ohm⁸ (1787-1854) in the year 1826.
Working with deficient apparatus, Ohm was nevertheless able to perform a series
of carefully devised and definitive experiments which firmly established the law of
conduction which now bears his name. Preliminary investigations using voltaic batteries
proved unsatisfactory, because the electric tension of such cells fluctuated with time
due to chemical changes. For this reason Ohm substituted as source a thermoelectric
battery, the principle of which had been discovered by Seebeck in 1821. Using strips
of copper and bismuth joined at their two ends, Ohm kept one point of contact in boiling
water and the other in ice, and thereby obtained a very stable current in any external
circuit he connected across the two points of contact. A magnetic needle was placed
over the circuit and suspended from a torsion balance so that the current strength
could be gauged by the torsion needed in the balance in order to preserve the pointing
direction of the needle.
In one series of experiments, Ohm prepared eight copper conductors of common
cross section but different lengths and placed them in turn across the battery, recording
the following data:

Length of conductor, in.                2      4      6      10     18     34     66     130
Strength of magnetic action, torsion    305    282    258¼   223½   178    124¾   78     44

He then went on to analyze these results, saying:

The numbers already given can be represented very satisfactorily by the equation

$$X = \frac{a}{b + x}$$

in which X is the strength of magnetic action when the conductor is used whose length is x,
and a and b are constants which represent magnitudes depending on the exciting force and
the resistance of the rest of the circuit. If for example we set b equal to 20¼ and a
equal to . . . 6800, we obtain by calculation the following results

305½   280½   259   224¾   177¾   125¼   79   45

If we compare these numbers found by calculation with the former set found by experiment,
it will appear that the differences are very small, and are of the order that one might expect
in researches of this kind.
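Ohm's arithmetic is easy to repeat. The sketch below evaluates his formula X = a/(b + x) with a = 6800 and b = 20¼ and compares the result with his measurements. The conductor lengths are assumed here to be 2, 4, 6, 10, 18, 34, 66, and 130 in., and only the whole-number parts of the measured torsions are used, since the printed fractions are not reliably legible; the function and variable names are illustrative.

```python
def ohm_fit(length_in, a=6800.0, b=20.25):
    """Ohm's empirical relation X = a / (b + x) for a conductor of length x."""
    return a / (b + length_in)


# Assumed conductor lengths (inches) and whole-number parts of measured torsions
lengths = [2, 4, 6, 10, 18, 34, 66, 130]
measured = [305, 282, 258, 223, 178, 124, 78, 44]

calculated = [ohm_fit(x) for x in lengths]
residuals = [c - m for c, m in zip(calculated, measured)]

for x, m, c in zip(lengths, measured, calculated):
    print(f"x = {x:3d} in.   measured ~ {m:3d}   calculated = {c:7.2f}")

print("largest deviation:", max(abs(r) for r in residuals))
```

In modern terms a plays the role of the source voltage and b the resistance of the rest of the circuit, both expressed in units of conductor length, so that the formula is Ohm's law for a series circuit.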

In this same paper Ohm reported four other trials for each of which the same values
of a and b gave comparable agreement. He then considered wires of different material
and different diameter and established the general validity of his formula, clearly
identifying the parameter a with the electroscopic force (voltage) of the battery. It
was then possible to deduce from his formula Davy's result that resistance is proportional to length, and inversely proportional to cross-sectional area. Ohm also confirmed
the temperature dependence of conductivity previously reported by Davy.

8 G. S. Ohm, "Determination of Laws Whereby Metals Conduct Contact Electricity," J Chemie und
Physik (Schweigger's Journal), 46, 137; 1826.
Not yet satisfied, Ohm next made the important generalization that the law he had
discovered applied to any part of the circuit as well as to the entire length of wire. He
compared the flow of electricity to the flow of heat, and drew the parallel that electroscopic force played the same role with respect to current that temperature did with
respect to heat conduction. However, neither Ohm nor his contemporaries truly appreciated the relation between the electroscopic force of a battery and the electrostatic
potential of Poisson. Several decades were to pass before this relation was widely
understood,⁹ and Ohm was forced to endure a long, bitter period during which the true
value of his work was neither recognized nor rewarded.
The law which connects the current flowing in a metallic conductor to the heat
evolved was determined by J. P. Joule (1818-1889) in the year 1841.¹⁰ This was accomplished by coiling wires of different lengths, cross sections, and composition onto thin
glass tubes, and then immersing the resulting assemblies in separate beakers containing
measured quantities of water. When the same intensity of steady current was passed
through the different coils, the water was found to heat up to an equilibrium temperature which differed among the several beakers, but in such a way that the change in
temperature was proportional to the resistance of the coil in question. From this Joule
concluded
. . . that when a given quantity of voltaic electricity is passed through a metallic conductor for a given length of time, the quantity of heat evolved by it is always proportional
to the resistance which it presents, whatever may be the length, thickness, shape or kind
of that metallic conductor.

Joule then reasoned:


On considering the above law, I thought that the effect produced by the increase of the
intensity of the electric current would be as the square of that element, for it is evident
that in that case the resistance would be augmented in a double ratio, arising from the
increase of the quantity of electricity passed in a given time, and also from the increase of
the velocity of the same. We shall immediately see that this view is actually sustained by
experiment.

Joule then established this last feature of the law which bears his name by showing
that the temperature rise of a coiled wire immersed in water is proportional to the square
of the current passing through it.
The ease with which heat passes through a conductor also was found to depend on
its electrical conductivity, and in 1853 Wiedemann and Franz obtained the experimental result that at any temperature the ratio of the thermal conductivity of a body to its
electrical conductivity is approximately the same for all metals, and that the value
of this ratio is proportional to the absolute temperature.¹¹
As the atomic nature of materials became better understood, these experimental laws
governing electrical conductivity were rendered susceptible to theoretical derivation.
9 The clarification was achieved mainly by Gustav Kirchhoff. See, e.g., "On a Derivation of Ohm's
Law which Agrees with Electrostatic Theory," Ann Phys, 78, 506-513; 1849.
10 J. P. Joule, "On the Heat Evolved by Metallic Conductors of Electricity," Phil Mag, 19, 260-265;
August 1841.
11 G. Wiedemann and R. Franz, "On the Heat Conductivity of Metals," Ann Phys, 89, 497-531; 1853.


Shortly after the discovery of the electron by J. J. Thomson in 1897, an atomic model
of a conductor was proposed by Drude in which free electrons were pictured as wandering through a lattice of fixed positive ions. The application of an electric field would
cause the free electron gas to drift, resulting in a current, and electron-ion collisions
could then account for the resistance to this current flow. The work of several investigators, based on this model, culminated in a theory by Lorentz¹² which used Maxwell-Boltzmann statistics and employed the Boltzmann transport equation to derive Ohm's
law and obtain a specific formula for the conductivity. Thermal conductivity also came
within the scope of this analysis when one assumed that the thermal energy was transported by the free electrons. A major accomplishment of the theory was a confirmation
of the Wiedemann-Franz law.
Another success of the Maxwell-Boltzmann statistics was the derivation that the
specific heat capacity of a dielectric solid should be 3R, with R the ideal gas constant.
This result was based on a consequence of the statistics that each degree of freedom
of the particles of a system yielded a mean energy per particle of $kT/2$, with k Boltzmann's constant and T the absolute temperature. It agreed with the experimental
result of Dulong and Petit which had been established in 1819. However, when the
same argument was applied to a conductor, the prediction was that the free electron
gas should make a contribution of 3R/2 to the specific heat capacity. Experiment did
not reveal this contribution, and matters were further complicated in that the specific
heat capacity of all solids was found to decrease as absolute zero is approached, in contradiction to the theory.
These theoretical difficulties were overcome with introduction of the quantum statistics. An explanation of the behavior of specific heat capacity at low temperatures was
offered by Einstein in 1906, and an improved theory was then put forward by P. Debye
in 1912. Later workers have refined this analysis so that theory and experiment are
now in satisfactory agreement. In the case of metals, when the Fermi-Dirac statistics
are applied to the free electron gas, the contribution to specific heat capacity is found
to be small, in agreement with experiment.
In 1928 A. Sommerfeld¹³ reworked Lorentz' theory of conduction in metals, retaining
the model of a free electron gas but replacing the Maxwell-Boltzmann statistics by
quantum statistics. The results were particularly satisfactory in the case of monovalent
metals, for which the assumption of a spatially uniform potential within the metal (an
inherent feature of the free electron model) is a good one. Better agreement with experiment for a wide variety of solids has been achieved by assuming a spatially periodic
variation of potential in accordance with the lattice dimensions, this development
being part of what is known as the band theory of solids.
These refinements have produced an acceptable theory of solids with respect to
electric and thermal conductivity and specific heat capacity, including many temperature effects and containing derivations of the laws of Ohm, Joule, and Wiedemann-Franz. A clear delineation of the basic differences among materials which are classified
as insulators, semiconductors, or conductors has been a major triumph of the band
theory. In the case of semiconductors, use of the Fermi-Dirac statistics to determine
electron population in the different energy bands has led to a completely satisfactory explanation of the dependence of conductivity on impurity concentration and
temperature.

12 H. A. Lorentz, Proc Amst Acad, 7, 438, 585, 684; 1904-1905.
13 A. Sommerfeld, "On the Electron Theory of Metals Based on the Fermi Statistics," Z Phys, 47,
1-32; 1928.
One experimental discovery of significance which is still engaging the attention of
theorists is the superconductivity effect uncovered by Kamerlingh Onnes of Leiden
in 1911. This phenomenon is confined principally to those metals which are not among
the better conductors at room temperature, and is characterized by a resistivity versus
temperature curve which suffers a sharp drop within a few degrees of absolute zero,
and then tends to zero as the temperature is reduced further. Applications to switching
and magnetic coils have heightened interest in these superconductive materials.

8.2 CLASSIFICATION OF CONDUCTIVE PROPERTIES UNDER THE BAND THEORY

As in Chapter 7, it will be assumed that the reader has some familiarity with quantum
mechanics and is acquainted with the solutions of Schrodinger's equation applicable
to a hydrogen atom. These solutions form a discrete set, with individual solutions
identified by four quantum numbers and a characteristic energy given quite accurately
by the formula

$$E_n = -\frac{m}{2\hbar^2 n^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 \tag{8.1}$$

in which m and $-e$ are the mass and charge of an electron, $\hbar$ is the reduced Planck's
constant, and n is the principal quantum number.
Using the quantum selection rules discussed in Section 7.8, one sees that if $n = 1$,
the quantum numbers $l$ and $m_l$ are constrained to be zero; $m_s$ can be either $+\tfrac{1}{2}$ or $-\tfrac{1}{2}$.
Thus there are two quantum states with an energy $E_1$ calculable from (8.1). Similarly,
if $n = 2$, the quantum number $l$ could be zero, in which case $m_l = 0$. Another possibility is that $l = 1$, and then $m_l$ can have any one of the three values $0, \pm 1$. For all
these four allowed combinations of $l$ and $m_l$, $m_s$ can be $\pm\tfrac{1}{2}$, so there is a total of eight
quantum states possessing an energy $E_2$ calculable from (8.1). Proceeding in this
fashion for higher values of $n$, one can determine the totality of allowed quantum
states at each energy level. If one uses the spectroscopic designation of Table 8.1, this
information may be displayed in an energy level diagram, as shown in Figure 8.1.
Each short line segment represents one allowed state and a dot placed on a particular
line segment can then indicate a hydrogen atom whose electron is in that particular
state. As an illustration the case of an atom in the excited 2p state is shown in the
figure.

TABLE 8.1
THE CORRESPONDENCE OF SPECTROSCOPIC NOTATION AND QUANTUM NUMBERS

Quantum number l ..................   0    1    2    3
Spectroscopic designation ..........   s    p    d    f
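The counting procedure described above can be sketched in a few lines of Python. This is only an illustrative enumeration, assuming the usual rules $l = 0, \ldots, n-1$, $m_l = -l, \ldots, +l$, and $m_s = \pm\tfrac{1}{2}$; the function name `allowed_states` is hypothetical and does not come from the text.

```python
def allowed_states(n):
    """Return all (l, m_l, m_s) combinations allowed for principal quantum number n."""
    states = []
    for l in range(n):                    # l = 0, 1, ..., n - 1
        for m_l in range(-l, l + 1):      # m_l = -l, ..., +l
            for m_s in (+0.5, -0.5):      # two spin orientations
                states.append((l, m_l, m_s))
    return states

for n in range(1, 5):
    # The count grows as 2 * n**2: 2, 8, 18, 32 states for n = 1, 2, 3, 4.
    print(n, len(allowed_states(n)))
```

The counts of 2 and 8 for $n = 1$ and $n = 2$ agree with the totals obtained in the text.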
FIGURE 8.1 Approximate energy level diagram for hydrogen. (Vertical axis: energy level, n = 1 to n = 4; columns l = 0, 1, 2, 3; states labeled 1s, 2s, 2p, 3s, 3p, 3d, 4s, 4p, 4d, 4f.)

Diagrams similar to Figure 8.1 may be constructed for other elements and displayed
in a somewhat simplified form, as shown in Figure 8.2a, wherein the segments representing different quantum states have been coalesced into a single horizontal line.
Note that in this more general case, the energy of a state is dependent on the quantum
number $l$ as well as the quantum number $n$. (This is really true in the case of hydrogen
as well since formula 8.1 is not quite precise.)

FIGURE 8.2 Energy levels for a single atom and for a pair of identical atoms with close spacing. (a) One atom. (b) Two atoms.

If two atoms of the same element are widely enough separated, their energy level
diagrams are identical, and each can be represented by Figure 8.2a. However, if the
atoms are brought close together, they interact. This introduces a coupling term into
Schrödinger's equation and causes the solutions to split into pairs, so that the energy
level diagram is represented by Figure 8.2b. The degree of splitting is governed by the
distance of separation. An important feature to note is that the total number of allowed
states is twice the number for an isolated atom, since there are now two atoms in the
system.
This idea can be generalized to include the case of a crystal (i.e., a three dimensional
array of atoms). If the interatomic distance is varied, an energy level diagram will take
the form suggested by Figure 8.3. As the separation of atoms decreases, the lines representing energy states begin to separate and form bands. In a typical solid of atom
density $10^{29}$ per m$^3$, there will be approximately $10^{29}$ states in each band so that a band
may be looked upon as a quasicontinuous region of allowed energy states. Between
these allowed energy bands (unless they overlap) are forbidden regions or gaps, so-called
because no electron of any atom in the solid can have an energy corresponding to a level
in one of these forbidden regions.

FIGURE 8.3 Energy bands in a crystalline solid. (Vertical axis: energy level; horizontal axis: interatomic spacing; levels labeled 1s, 2s, 2p, 3s, 3p.)

Depending on the equilibrium value $r_0$ of the interatomic separation distance, two
particular allowed energy bands may or may not overlap. As indicated by Figure 8.3,
the lower energy states split into bands at a smaller atom spacing. This is reasonable,
since these states represent electrons closer to their nuclei and therefore less coupled
to electrons of other atoms. For this reason the higher energy bands overlap first, and
the diagram indicates a situation in which the equilibrium spacing is such that all but
the 1s and 2s states have split into bands, with the 3s and 3p bands overlapping.

FIGURE 8.4 A plot of the Fermi function vs. energy for several temperatures. (Vertical axis: $f(w)$; horizontal axis: $w$, with $w_F(0)$ marked.)

For any finite atom spacing, if one looks high enough on an energy diagram of the
type represented by Figure 8.3, overlapping bands will be encountered. This feature is
of little significance if, for the material in question, the overlapping bands of allowed
energy states normally are unoccupied by electrons. However, if two overlapping bands
normally are partially filled, electronic conduction can occur readily, as shall be seen
shortly. Therefore, it is important to be able to determine the degree of occupancy of
the allowed energy levels in the various bands. This can be accomplished with the aid
of the Fermi-Dirac statistics.
It is shown$^{14}$ in texts concerned with quantum statistics that, for particles (such as
electrons) which obey the Pauli exclusion principle, the probability that a particular
quantum state having an energy $w$ is occupied is given by the Fermi-Dirac function

$$f(w) = \frac{1}{e^{(w - w_F)/kT} + 1} \qquad (8.2)$$

in which $w_F$ is a temperature-dependent parameter known as the Fermi energy. When
(8.2) refers to the energy levels in a crystalline solid, the value of $w_F$ is also found to
depend on the material being considered.

14 See, e.g., F. W. Sears, An Introduction to Thermodynamics, the Kinetic Theory of Gases, and Statistical
Mechanics, 2d ed., Chap. 16, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts,
1963.


In the temperature range $T \le 1000$°K, $kT$ does not exceed 1/10th of an electron volt
and, for most crystalline materials, $w_F/kT \gg 1$. Therefore, if $w \ll w_F$, the exponential
term in (8.2) is very small and $f(w) \approx 1$, indicating that all states having energies much
smaller than the Fermi energy essentially are completely occupied. Alternatively, if
$w \gg w_F$, the exponential term in (8.2) is quite large and $f(w)$ tends to zero exponentially
as $w$ increases. A plot of (8.2) is given in Figure 8.4 for several temperatures and it is
evident that states whose energies exceed $w_F$ by more than a few $kT$ are virtually unoccupied.
At absolute zero the distribution simplifies to a step function, with all states below
$w_F$ fully occupied and all states above $w_F$ totally empty.
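The behavior of (8.2) described above can be checked numerically. The following is a minimal sketch, assuming a Fermi energy of 7.0 ev (the order of magnitude quoted later for copper) and Boltzmann's constant expressed in ev per degree; the function name `fermi` and the overflow guard are illustrative choices, not part of the text.

```python
import math

K_EV_PER_KELVIN = 8.617e-5  # Boltzmann's constant in eV per kelvin

def fermi(w, w_f, temperature):
    """Occupation probability f(w) of (8.2); energies in eV, temperature in kelvins."""
    if temperature == 0:
        # Step-function limit; the value 1/2 exactly at w_f is a convention chosen here.
        if w == w_f:
            return 0.5
        return 1.0 if w < w_f else 0.0
    x = (w - w_f) / (K_EV_PER_KELVIN * temperature)
    if x > 700.0:
        return 0.0  # exp(x) would overflow; the occupancy is negligible anyway
    return 1.0 / (math.exp(x) + 1.0)

w_f = 7.0  # eV, assumed Fermi energy for illustration
for temperature in (0, 300, 1000):
    row = [fermi(w_f + dw, w_f, temperature) for dw in (-0.5, -0.1, 0.0, 0.1, 0.5)]
    print(temperature, ["%.4f" % value for value in row])
```

The printed rows show the step function at absolute zero and the progressive rounding of the step, confined to a few $kT$ about $w_F$, as the temperature is raised.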
If for a particular material $S(w)\,dw$ is the number of allowed quantum states per
unit volume in the energy range from $w$ to $w + dw$, it follows that

$$N(w)\,dw = f(w)S(w)\,dw = \frac{S(w)\,dw}{e^{(w - w_F)/kT} + 1} \qquad (8.3)$$

is the number of electrons per unit volume of the material having energies between $w$
and $w + dw$.

FIGURE 8.5 Energy band diagrams for various positions of the Fermi level. (Top of lower band denoted by $w_v$, bottom of upper band by $w_c$.) (a) Partially filled band: conductor. (b) Totally filled band: insulator or semiconductor. (c) Overlapping bands: conductor.

Since the Fermi level must lie either within an allowed band or between bands in a
forbidden region, three possible cases arise, as illustrated by Figure 8.5. If the Fermi
level lies within a band (Figure 8.5a), Equation (8.3) indicates that at 0°K the band is
totally filled below $w_F$ and completely empty above $w_F$. As the temperature is raised,
the population in this band spreads out somewhat but very few electrons are found in
the next higher band. A material exhibiting this type of electron distribution is a
conductor, since unpopulated energy states are available in the same band, to which
an electron can move. The change in energy which an electron requires in order to
move to one of these unpopulated states is slight, since the density of states within the
band is so great. An electron can gain this energy easily from an applied electric field
and therefore readily transfers its association from one atom to the next.
If the Fermi level lies between bands (Figure 8.5b), Equation (8.3) indicates that at
0°K, the lower band is completely filled and the upper band is empty. At a temperature
$T$, if the energy gap between bands is large compared to $kT$, the population distribution
is still essentially the same as for 0°K. Thus there are no empty states at a nearby energy
level to which an electron can move; conduction is very difficult and the material
is an insulator.† If the gap width is not too great, at a finite temperature some population of the upper band by electrons occurs and these electrons are free to move to empty
states in the upper band. They are therefore conduction electrons and contribute to a
small current occasioned by the presence of an external electric field. The states which
they vacated in the lower band also provide a contribution to the conduction process.
Such behavior characterizes materials known as semiconductors.
Finally, if the Fermi level lies in a region where two bands overlap (Figure 8.5c),
one or both of the bands is only partially filled and conduction occurs easily.
These observations may be summarized by saying that a conductor is a material
whose electrons populate the allowed energy bands in such a way that there is an upper
band which is partially filled or an upper pair of overlapping bands which are partially
filled. Electrons in these bands can move from one state to another with only a small
change in energy and are therefore highly mobile. On the other hand, an insulator is a
material whose electrons populate the allowed energy bands in such a way that there is
an upper band which is completely filled. Conduction in an insulator could only occur if
some of these electrons could move up to the next higher band, but the energy gap is
too great to permit this at normal temperatures. In a semiconductor, this gap width
is not so great, and some small population of the next band occurs at normal temperatures, permitting a slight current to flow in the presence of an electric field.

8.3 FREE ELECTRON THEORY OF METALS-OHM'S LAW

Since the conduction electrons in a good conductor lie in partially filled bands and can
move from one state to another with little change in energy, a suitable model of such
materials is one which pictures the conductor at the atomic level as consisting of a
lattice of positive ions coexisting with an electron gas. The ions are capable of vibrations about their individual lattice sites but are not free to wander from one lattice
site to another. The electrons comprising the gas, on the other hand, are highly mobile
and able to move throughout the lattice against an almost-constant background potential, thus not belonging to any particular atom. In this picture, called the free electron
model, the number of electrons in the gas is not necessarily an integral multiple of the
number of atoms in the material; thus the average valence of the lattice ions is not
restricted to an integer. Also, the identities of the electrons making up the gas may
change with time, thereby accounting for electron-lattice interactions in which an
electron is either freed or captured.

† At room temperature, kT is 1/40th of an electron volt. In a good insulator, such as diamond, the gap
width may be as great as 7 or 8 electron volts.


At a constant temperature $T$, the number of free electrons per unit volume should
have a time-independent value $n$. The average, or drift velocity of the electrons making
up the gas is given by

$$\mathbf{v}_d = \frac{1}{n}\sum_{i=1}^{n}\mathbf{v}_i \qquad (8.4)$$

in which $n$ is assumed to be large enough to make this average meaningful and $\mathbf{v}_i$ is
the instantaneous velocity of the $i$th electron. In the absence of an applied electric field
$\mathbf{v}_d = 0$ since in equilibrium, in any speed range, there should be just as many electrons
moving in one direction as in any other.
FIGURE 8.6 Equivalent velocity history of average free electron. (Vertical axis: $v(t)$; interactions spaced $\tau$ apart.)

If a constant electric field $\mathbf{E}$ is applied to the conductor, it is found that a new equilibrium condition is established in which a steady current density exists at each point in
the conductor, implying a time-independent, but nonzero value for the drift velocity $\mathbf{v}_d$.
This experimental result can be explained by assuming that each electron in the gas,
during each time interval marked by successive interactions it has with the lattice, is
accelerated by the applied field. The electrons gain momentum during these intervals
and then surrender some of their momentum to the lattice during an interaction. For a
given electron the succession of time intervals between lattice interactions is a random
sequence, as is its initial velocity after each interaction. The history of such encounters
also differs from one electron to the next. However, the net average effect is as though
each electron possessed the drift velocity $\mathbf{v}_d$.
One may account for the momentum transfer to the lattice by imagining that the
average electron has an equivalent velocity history as shown in Figure 8.6. Every $\tau$ sec
it interacts with the lattice, stopping momentarily and surrendering $m\mathbf{v}$ units of
momentum, with $m$ the electronic mass. Between interactions the equivalent constant
velocity $\mathbf{v}$ is maintained. It is apparent that the time-average velocity in Figure 8.6
is $\mathbf{v}_d$ if the interaction time is negligible compared to $\tau$. The period $\tau$ can be selected so
that the rate of momentum transfer per electron is properly given by $m\mathbf{v}_d/\tau$.
Since the time-average motion of the free electrons is unaccelerated, they experience
no net force and the effect of the external field is balanced by the rate of momentum
transfer so that

$$-e\mathbf{E} = \frac{m\mathbf{v}_d}{\tau} \qquad (8.5)$$

with $e$ the electronic charge. The induced steady current density is given by

$$\boldsymbol{\iota} = -ne\mathbf{v}_d \qquad (8.6)$$

and thus

$$\boldsymbol{\iota} = \frac{ne^2\tau}{m}\,\mathbf{E} \qquad (8.7)$$

When the electrical conductivity $\sigma_c$ is defined by the relation

$$\boldsymbol{\iota} = \sigma_c\mathbf{E} \qquad (8.8)$$

it follows that

$$\sigma_c = \frac{1}{\rho_c} = \frac{ne^2\tau}{m} \qquad (8.9)$$

in which $\rho_c$, the reciprocal of $\sigma_c$, is called the resistivity.


Equation (8.7) has been derived under the assumption that the conduction electrons
are free to wander throughout the metal against a constant background potential.
This assumption implies an isotropic material and is most valid for monovalent metals.
For nonisotropic materials, the free electron mass $m$ must be replaced by the effective
mass $m^*$ (cf. Section 7.16), and both $\tau$ and $m^*$ depend on the direction of application
of $\mathbf{E}$. In this more general case $\sigma_c$ as given by (8.9) becomes a tensor.
Inspection of Equation (8.9) reveals that the electrical conductivities of different
isotropic conductors are governed by the free electron density $n$ and by the average
time $\tau$ between momentum transfers. If one considers monovalent metals such as
lithium, copper, or silver, $n$ can be taken to be equal to the atom density. An experimental determination of the electrical conductivity will then yield an estimation of $\tau$.
The experiment consists simply of choosing a cylindrical specimen of the material,
of length $l$ and cross-sectional area $A$, and establishing a uniform electric field $E$
throughout the specimen. Then from (8.8)

$$\iota A = I = \frac{\sigma_c A}{l}\,El = \frac{V}{R} \qquad (8.10)$$

in which $I = \iota A$ is the total current, $V = El$ is the voltage drop through the length $l$, and

$$R = \frac{\rho_c l}{A} \qquad (8.11)$$

is the resistance of the specimen. Equation (8.10) is recognized as Ohm's law, an alternative form of which is given by (8.8). Thus measurements of $V$, $I$, and the dimensions
of the specimen will permit a determination of $R$, $\rho_c$, and ultimately $\sigma_c$. The results of
this experiment in the case of several monovalent metals are listed in Table 8.2. It is
to be noted that the deduced values of $\tau$ are all in the range of $10^{-14}$ sec.
If instead of being constant $\mathbf{E}$ is a function of time which varies slowly on the time scale of
$\tau$, the force equation (8.5) assumes the more general form

$$m\frac{d\mathbf{v}_d}{dt} + \frac{m\mathbf{v}_d}{\tau} = -e\mathbf{E} \qquad (8.12)$$

This differential equation has the complementary solution

$$\mathbf{v}_d(t) = \mathbf{v}_d(0)\,e^{-t/\tau} \qquad (8.13)$$

indicating that, if a steady electric field were suddenly removed, the drift velocity
would decay exponentially. For this reason $\tau$ customarily is called the relaxation time.
For the metals of Table 8.2 this relaxation time is seen to be exceedingly short.
TABLE 8.2
THE ELECTRICAL CONDUCTIVITY σ_c OF SOME MONOVALENT METALS AT

Metal        σ_c (mhos/m)        τ (secs) from (8.9)
Li           0.11 × 10^8         0.9 × 10^-14
Na           0.21 × 10^8         3.3 × 10^-14
K            0.15 × 10^8         4.4 × 10^-14
Cu           0.58 × 10^8         2.7 × 10^-14
Ag           0.62 × 10^8         3.8 × 10^-14

If $\mathbf{E}$ is time-harmonic, (8.12) becomes

$$m\left(j\omega + \frac{1}{\tau}\right)\mathbf{v}_d = -e\mathbf{E} \qquad (8.14)$$

revealing that $-\mathbf{v}_d$ (and thus $\boldsymbol{\iota}$) will be in phase with $\mathbf{E}$ only if $\tau^{-1} \gg \omega$. Microwave
measurements in K-band waveguide at an angular frequency $\omega \approx 2 \times 10^{11}$ rad/sec
indicate that, even at this high frequency, $\boldsymbol{\iota}$ and $\mathbf{E}$ are essentially in phase in the copper
walls, thus providing supporting evidence that $\tau < 10^{-11}$ sec. For this reason Equations (8.8) and (8.9) may be presumed to be valid for monovalent metals which are
good conductors, even if $\boldsymbol{\iota}$ and $\mathbf{E}$ are time-varying, and even at frequencies as high as
the upper end of the microwave range.
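The size of the phase departure implied by (8.14) can be estimated with a short calculation. The sketch below assumes that (8.14) is solved for the drift velocity, so that the time-harmonic response equals the static response multiplied by $1/(1 + j\omega\tau)$; the copper relaxation time is taken from Table 8.2 and the function name `drift_response` is hypothetical.

```python
import cmath
import math

def drift_response(omega, tau):
    """Ratio of the time-harmonic drift velocity to its static value,
    1 / (1 + j*omega*tau), obtained by solving (8.14) for v_d."""
    return 1.0 / (1.0 + 1j * omega * tau)

omega = 2e11       # rad/sec, the K-band figure quoted in the text
tau_cu = 2.7e-14   # sec, copper, from Table 8.2

ratio = drift_response(omega, tau_cu)
phase_lag_deg = -math.degrees(cmath.phase(ratio))   # lag of -v_d behind E
print("omega*tau = %.2e" % (omega * tau_cu))
print("magnitude = %.6f, phase lag = %.3f degrees" % (abs(ratio), phase_lag_deg))
```

With these values $\omega\tau$ is of the order of $10^{-3}$, so the lag amounts to a fraction of a degree, consistent with the statement that the current density and field are essentially in phase.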
EXAMPLE 8.1

By experiment, the resistance at room temperature of a silver wire is found to be 0.0281
ohms. Its measured length is 1.265 m and its mean measured diameter is 0.096 cm. It is
desired to estimate the relaxation time for this specimen.
Through use of (8.11)

$$\rho_c = \frac{RA}{l} = \frac{(0.0281)(\pi/4)(0.096 \times 10^{-2})^2}{1.265} = 1.61 \times 10^{-8}\ \text{ohm-m}$$

from which

$$\sigma_c = \frac{1}{\rho_c} = 0.62 \times 10^{8}\ \text{mhos/m}$$

in agreement with Table 8.2.
If one free electron per atom is assumed for silver, then $n = 5.86 \times 10^{28}$ and (8.9) gives

$$\tau = \frac{\sigma_c m}{ne^2} = \frac{(0.62 \times 10^{8})(9.1 \times 10^{-31})}{(5.86 \times 10^{28})(1.6 \times 10^{-19})^2} = 3.8 \times 10^{-14}\ \text{sec}$$

which is also in agreement with Table 8.2.
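The arithmetic of Example 8.1 can be reproduced with a brief script. This is a sketch only, using the rounded constants quoted in the example; the function names `resistivity` and `relaxation_time` are illustrative and not drawn from the text.

```python
import math

M_ELECTRON = 9.1e-31   # kg, value used in Example 8.1
E_CHARGE = 1.6e-19     # coulombs, value used in Example 8.1

def resistivity(resistance, length, diameter):
    """Resistivity from (8.11), rho = R*A/l, for a wire of circular cross section."""
    area = (math.pi / 4.0) * diameter ** 2
    return resistance * area / length

def relaxation_time(sigma, n):
    """Relaxation time from (8.9), tau = sigma*m/(n*e^2)."""
    return sigma * M_ELECTRON / (n * E_CHARGE ** 2)

rho = resistivity(0.0281, 1.265, 0.096e-2)   # ohms, meters, meters
sigma = 1.0 / rho
tau = relaxation_time(sigma, 5.86e28)        # one free electron per silver atom
print("rho = %.3e ohm-m, sigma = %.3e mhos/m, tau = %.2e sec" % (rho, sigma, tau))
```

The three results round to the values $1.61 \times 10^{-8}$ ohm-m, $0.62 \times 10^{8}$ mhos/m, and $3.8 \times 10^{-14}$ sec obtained in the example.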

8.4 OHM'S LAW-ALTERNATE DERIVATION

By making use of Boltzmann's transport equation, one is able to derive Ohm's law
in a manner which differs from the approach taken in Section 8.3. In addition to providing further insight to the meaning of electrical conductivity, this second procedure
has the advantage of paralleling the development of thermal conductivity to be presented in Section 8.10, thus simplifying the establishment of the Wiedemann-Franz
law.
Once again a free electron gas is assumed to exist within the conductor. If one lets
$(v_x, v_y, v_z)$ represent the velocity of an electron, at time $t$ the quantity

$$f(x,y,z,v_x,v_y,v_z,t)\,dx\,dy\,dz\,dv_x\,dv_y\,dv_z$$

can be taken to represent the number of electrons in the spatial volume element
$dx\,dy\,dz$ which have their velocities lying in the range $v_x$ to $v_x + dv_x$, $v_y$ to $v_y + dv_y$,
$v_z$ to $v_z + dv_z$. The function $f$ is known as the distribution function in six-dimensional
space, or phase space (three spatial dimensions plus three dimensions for the velocity
components).
If there were no electron-lattice interactions, an electron which at time $t$ was at the
point $(x,y,z)$ and had velocity $(v_x,v_y,v_z)$ would at time $t + dt$ be at the point
$(x + v_x\,dt,\ y + v_y\,dt,\ z + v_z\,dt)$ and have the velocity $(v_x + \dot v_x\,dt,\ v_y + \dot v_y\,dt,\ v_z + \dot v_z\,dt)$. Were it
not for collisions, all the electrons which had been in $dx \cdots dv_z$ would be in a new
volume element $dx' \cdots dv_z'$. Since the velocities and accelerations of all the electrons
in these two adjacent elements of phase space are essentially the same, to first order

$$dx\,dy\,dz\,dv_x\,dv_y\,dv_z = dx'\,dy'\,dz'\,dv_x'\,dv_y'\,dv_z'$$

If now one lets

$$\left[\frac{\partial f}{\partial t}\right]_{\text{coll}} dt\,dx'\,dy'\,dz'\,dv_x'\,dv_y'\,dv_z'$$

be the net number of electrons which are forced into $dx' \cdots dv_z'$ due to electron-lattice interactions during the time interval $dt$, then it follows that

$$f(x + v_x\,dt,\ y + v_y\,dt,\ z + v_z\,dt,\ v_x + \dot v_x\,dt,\ v_y + \dot v_y\,dt,\ v_z + \dot v_z\,dt,\ t + dt) = f(x,y,z,v_x,v_y,v_z,t) + \left[\frac{\partial f}{\partial t}\right]_{\text{coll}} dt \qquad (8.15)$$

This equation may be recognized as containing a primitive form of a total differential,
from which there results

$$v_x\frac{\partial f}{\partial x} + v_y\frac{\partial f}{\partial y} + v_z\frac{\partial f}{\partial z} + \dot v_x\frac{\partial f}{\partial v_x} + \dot v_y\frac{\partial f}{\partial v_y} + \dot v_z\frac{\partial f}{\partial v_z} + \frac{\partial f}{\partial t} = \left[\frac{\partial f}{\partial t}\right]_{\text{coll}} \qquad (8.16)$$

This is known as Boltzmann's transport equation.
In many problems such as the one being treated here the collision mechanisms are
such that, if a stimulus is present and causes the distribution $f$, and then the stimulus
is removed, the collisions cause the particles to reach an equilibrium distribution $f_0$
under a relaxation process. In other words if the stimulus is removed at $t = 0$, then

$$(f - f_0)_t = (f - f_0)_{t=0}\,e^{-t/\tau} \qquad (8.17)$$

in which $\tau$ is a relaxation time which characterizes the process.
In the absence of a stimulus collisions are the only cause of a time change in $f$ and
thus from (8.17)

$$\left[\frac{\partial f}{\partial t}\right]_{\text{coll}} = \frac{\partial}{\partial t}(f - f_0) = -\frac{f - f_0}{\tau} \qquad (8.18)$$

In the development which follows it will be assumed that (8.18) is an appropriate
expression to insert in (8.16) to account for distribution changes due to collisions.
Imagine that through the agency of external sources a static electric field has been
applied to a conductor, and that coordinate axes have been chosen so that in a small
region of the conductor the field is Z directed. After sufficient time has elapsed, a time-independent distribution $f(z,v_x,v_y,v_z)$ of the free electrons within the conductor will have
been established which differs from the field-free distribution $f_0(v_x,v_y,v_z)$. Since

$$\dot v_z = -\frac{e}{m}E_z$$

the Boltzmann equation becomes

$$v_z\frac{\partial f}{\partial z} - \frac{e}{m}E_z\frac{\partial f}{\partial v_z} = -\frac{f - f_0}{\tau} \qquad (8.19)$$

For normal electric fields $f$ will differ only slightly from $f_0$, and one may substitute
$f = f_0$ in the left side of (8.19), obtaining

$$f = f_0 + \frac{\tau e}{m}E_z\frac{\partial f_0}{\partial v_z} \qquad (8.20)$$
The distribution $f$ is thus seen, in first approximation, to depend only on the velocity
variables, and may be used in the following manner to deduce an expression for the
electric current:
With reference to Figure 8.7 let $dA$ be a small area constructed transverse to the
Z direction and consider the parallelepiped of length $v\,dt$ erected on $dA$ as a base. The
number of free electrons within this parallelepiped whose velocity components are
$v_x = v\sin\theta\cos\phi$, $v_y = v\sin\theta\sin\phi$, $v_z = v\cos\theta$ is

$$f(v_x,v_y,v_z)\,v\cos\theta\,dA\,dt\,dv_x\,dv_y\,dv_z$$

since $dx\,dy\,dz = v\cos\theta\,dA\,dt$. This is the number of electrons with velocity $\mathbf{v}$ which
have passed through $dA$ in time $dt$. (Electron-lattice interactions change the identity
of some of these electrons but not their number.) The charge transported per unit area
per unit time is therefore

$$-e\,v_z\,f(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z$$

FIGURE 8.7 Calculation of flow through area element.

Upon considering in this manner the free electrons in the entire velocity range, one
concludes that the electric current density is

$$\iota_z = -e\iiint v_z f(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z = -e\iiint v_z f_0\,dv_x\,dv_y\,dv_z - \frac{\tau e^2}{m}E_z\iiint v_z\frac{\partial f_0}{\partial v_z}\,dv_x\,dv_y\,dv_z \qquad (8.21)$$

Since $f_0$ is symmetrical in $v_z$, the first integral in (8.21) is zero; the second term may be
integrated by parts, giving

$$\iota_z = -\frac{\tau e^2}{m}E_z\left\{\iint\Big[v_z f_0\Big]_{v_z=-\infty}^{\infty}dv_x\,dv_y - \iiint f_0\,dv_x\,dv_y\,dv_z\right\} \qquad (8.22)$$

The first of these integrals gives a null value, since $f_0$ vanishes as $v_z \to \pm\infty$; the second yields the result

$$\iota_z = \frac{ne^2\tau}{m}E_z \qquad (8.23)$$

in which $n$ is the spatial electron volume density. Equation (8.23) is seen to be identical
with (8.7) when proper regard is given to vectorial directions, and is thus a statement
of Ohm's law.
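The two integral properties used in passing from (8.21) to the final result can be checked numerically in one dimension. The sketch below is an illustration under a loudly stated assumption: a Gaussian is used as a stand-in for the equilibrium distribution $f_0$ purely for convenience (the actual $f_0$ of the text is the Fermi-Dirac distribution), normalized so that its integral equals the density $n$. The helper names are hypothetical.

```python
import math

def f0(v, n=1.0, spread=1.0):
    """Stand-in equilibrium distribution (Gaussian, an assumption), with integral n."""
    return n / (math.sqrt(2.0 * math.pi) * spread) * math.exp(-v * v / (2.0 * spread ** 2))

def df0_dv(v, n=1.0, spread=1.0):
    """Analytic derivative of the stand-in distribution with respect to v."""
    return -v / spread ** 2 * f0(v, n, spread)

def trapezoid(g, a, b, steps=20000):
    """Plain trapezoidal rule for the integral of g from a to b."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * h)
    return total * h

# First integral of (8.21): v*f0 is odd in v, so it should vanish.
first = trapezoid(lambda v: v * f0(v), -12.0, 12.0)
# Second integral of (8.21): integration by parts gives minus the density n (here n = 1).
second = trapezoid(lambda v: v * df0_dv(v), -12.0, 12.0)
print("first = %.2e, second = %.6f" % (first, second))
```

Since the second integral equals $-n$, the coefficient $-(\tau e^2/m)E_z$ multiplying it in (8.21) yields a current density $n e^2 \tau E_z / m$, in agreement with the form of (8.7).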

8.5 THE MEAN TIME BETWEEN ELECTRON-LATTICE INTERACTIONS

Within the electron gas of an isotropic conductor, consider a large number $N_0$ of electrons, all of these electrons being characterized by the fact that at the present time $t$,
each is about to have its next interaction with the lattice. Let $N(t_1 + dt_1)$ be the subgroup of these electrons which have not suffered a lattice interaction since the earlier
time $t_1 + dt_1$. At the still earlier time $t_1$, this sub-group is further reduced to $N(t_1)$. It
follows therefore that the number

$$dN = N(t_1 + dt_1) - N(t_1)$$

is the count of electrons within $N_0$ which had their last interaction with the lattice
during the time interval $dt_1$ beginning at the time $t_1$.
It is reasonable to assume that $dN$ is proportional to both $N(t_1)$ and $dt_1$, so that one
may write

$$dN = \frac{N\,dt_1}{\tau_c} \qquad (8.24)$$

in which $\tau_c$ is a constant of proportionality having the units of time. Integration of
(8.24) gives

$$N(t_1) = N_0\,e^{-(t - t_1)/\tau_c} \qquad (8.25)$$

The average time between lattice interactions for these $N_0$ electrons is given by

$$\frac{1}{N_0}\int (t - t_1)\,dN = \frac{1}{\tau_c}\int_{-\infty}^{t}(t - t_1)\,e^{-(t - t_1)/\tau_c}\,dt_1 = \tau_c$$

and thus the proportionality constant $\tau_c$ is the mean free time between electron-lattice
interactions, or more briefly, the mean collision time.
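That the proportionality constant equals the mean interval can be confirmed by direct numerical integration. The sketch below assumes the exponential survival law that follows from integrating (8.24), namely that the number of electrons whose last interaction occurred an interval $s$ ago is proportional to $e^{-s/\tau_c}$; the function name `mean_interval` and the step parameters are illustrative.

```python
import math

def mean_interval(tau_c, steps=200000, span=40.0):
    """Average of the interval s weighted by (1/tau_c)*exp(-s/tau_c),
    integrated by the trapezoidal rule from s = 0 to s = span*tau_c."""
    h = span * tau_c / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end-point weights
        total += weight * s * math.exp(-s / tau_c) / tau_c
    return total * h

tau_c = 3.8e-14   # sec, the silver value of Table 8.2, used only as a sample input
print("mean interval / tau_c = %.6f" % (mean_interval(tau_c) / tau_c))
```

The ratio comes out equal to unity to within the accuracy of the integration, whatever sample value of $\tau_c$ is supplied.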
The drift velocity is the average velocity of all $N_0$ electrons just before their next
lattice interaction, namely

$$\mathbf{v}_d = \frac{1}{N_0}\int \mathbf{v}(t)\,dN_v \qquad (8.26)$$

in which $dN_v$ is a function of $\mathbf{v}$ and the integration is over all velocity space.
In the case of isotropic materials, since $v_d$ is normally so small compared to the average speed of the free electrons, the scatter of these free electrons after their lattice
interactions is isotropic, and their aggregate momentum afterwards is zero. Therefore
the momentum they transfer to the lattice is their aggregate momentum just before
interaction. This means that the average momentum transferred per electron is

$$\langle m\mathbf{v}\rangle = \frac{1}{N_0}\int m\mathbf{v}(t)\,dN_v = m\mathbf{v}_d \qquad (8.27)$$

For this reason one may say that the average rate of momentum transfer to the lattice per
electron is $m\mathbf{v}_d/\tau_c$. Upon comparing this result with the analysis in
Section 8.3 or Section 8.4, one finds that for isotropic materials the mean collision time
is the same as the relaxation time; that is, $\tau_c = \tau$. This is not true for nonisotropic
materials, but such a discussion lies outside the scope of the present development.

8.6 MEAN FREE PATH

A mean free path $\lambda$ for the electrons comprising the gas may be defined by the relation

$$\lambda = u\tau_c \qquad (8.28)$$

in which $u$ is an appropriate average electron velocity. Since the electron gas obeys the
Pauli exclusion principle, in the absence of an external field the energy distribution
of electrons is given by the Fermi-Dirac formula;$^{15}$ namely,

$$dn_w = \frac{4\pi}{h^3}(2m)^{3/2}\,\frac{w^{1/2}\,dw}{\exp[(w - w_F)/kT] + 1} \qquad (8.29)$$

in which $dn_w$ is the number of electrons per unit volume whose energy lies in the range
$w$ to $w + dw$. The quantities $h$, $k$, and $m$ which appear in (8.29) are Planck's constant, Boltzmann's constant, and the electronic mass, and $w_F$ is a temperature-dependent parameter known as the Fermi energy. (Cf. Section 8.2.) The value of $w_F$ must be
adjusted so that the integral of (8.29) over all possible energies gives the total number
of electrons per unit volume, $n$.
A relative plot of this distribution in momentum space for several values of the
absolute temperature $T$ has been given in Figure 8.4. At absolute zero all states
are seen to be uniformly occupied out to an energy $w_F(0)$, beyond which all states
are empty. As the temperature is raised, this distribution is rounded off and $w_F$
becomes the energy of the half-populated state. For $kT \ll w_F$, the rounding off causes
a significant departure from the absolute zero distribution only in the energy range
$w_F - kT < w < w_F + kT$.
The value of the Fermi energy at absolute zero may be determined by considering the distribution in momentum space. Since the free electron gas model of an
isotropic conductor implies a uniform potential distribution within the conductor, to
an energy $w$ possessed by a free electron there corresponds a momentum $p$, these quantities being related by

$$\tfrac{1}{2}mv^2 = w = \frac{(mv)^2}{2m} = \frac{p^2}{2m}$$

Thus $4\pi p^2\,dp = 2\pi(2m)^{3/2}w^{1/2}\,dw$ and the distribution (8.29) is equivalent to

$$dn_p = \frac{2/h^3}{\exp[(w - w_F)/kT] + 1}\,4\pi p^2\,dp \qquad (8.30)$$

Inspection of (8.30) reveals that, if $T = 0$°K, for $w < w_F(0)$ the distribution in momentum space is

$$dn_p = \frac{2}{h^3}\,4\pi p^2\,dp \qquad (8.31)$$

whereas for $w > w_F(0)$, $dn_p = 0$. Therefore the spherical shell of volume $4\pi p^2\,dp$ in
momentum space has a uniform population of electrons with density $(2/h^3)$ for all
shells of radius $0 \le p \le p_{F0}$, in which

$$p_{F0}^2 = 2m\,w_F(0) \qquad (8.32)$$

Integration of (8.31) gives

$$n = \frac{8\pi}{3h^3}\,p_{F0}^3 \qquad (8.33)$$

Combination of (8.32) and (8.33) yields for the Fermi energy level at absolute zero

$$w_F(0) = \frac{h^2}{8m}\left(\frac{3n}{\pi}\right)^{2/3} \qquad (8.34)$$

A calculation of the Fermi zero energy may be undertaken using (8.34) if one assumes
a value for the free-electron density $n$. For monovalent metals, allowing one free electron
per atom, the results are listed in the second column of Table 8.3. It is to be noted that
$w_F(0)$ is characteristically several electron volts for these materials.

15 See, e.g., F. W. Sears, loc. cit.

TABLE 8.3
FERMI LEVEL ENERGY, VELOCITY, AND MEAN FREE PATH FOR SOME MONOVALENT METALS

Metal        w_F(0) ev        u_F m/sec          λ angstroms
Li           4.7              1.3 × 10^6         115
Na           3.1              1.05 × 10^6        345
K            2.1              0.85 × 10^6        370
Cu           7.0              1.6 × 10^6         430
Ag           5.5              1.4 × 10^6         530

It may be shown that the Fermi energy at any temperature is given by the expansion$^{16}$

$$w_F(T) = w_F(0)\left\{1 - \frac{\pi^2}{12}\left[\frac{kT}{w_F(0)}\right]^2 + \cdots\right\}$$

Since at room temperature $kT \approx 0.025$ ev, it follows that at ordinary temperatures,
for good conductors with Fermi energies such as those given in Table 8.3, $kT/w_F(0) \ll 1$
and $w_F(T) \approx w_F(0)$. Under these circumstances one may gain a pictorial feeling for
the Fermi-Dirac distribution at room temperature for a free electron gas (within copper
for example) by imagining it to be the curve labeled $T_1$ in Figure 8.4, with the partially
populated energy levels confined essentially to a region of width less than 1/10th of $w_F$.
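The smallness of the temperature correction in this expansion can be made concrete with a short calculation. The sketch below retains only the quadratic term of the expansion and uses the copper value of $w_F(0)$ from Table 8.3; Boltzmann's constant in ev per degree and the function name `fermi_energy` are assumptions introduced for the illustration.

```python
import math

K_EV_PER_KELVIN = 8.617e-5  # Boltzmann's constant in eV per kelvin

def fermi_energy(temperature, w_f0):
    """Fermi energy (eV) at a given temperature, keeping only the first
    correction term of the expansion quoted in the text."""
    ratio = K_EV_PER_KELVIN * temperature / w_f0
    return w_f0 * (1.0 - (math.pi ** 2 / 12.0) * ratio ** 2)

w_f0_cu = 7.0   # eV, copper, from Table 8.3
for temperature in (0.0, 300.0, 1000.0):
    shift = 1.0 - fermi_energy(temperature, w_f0_cu) / w_f0_cu
    print("T = %6.0f K: fractional decrease in w_F = %.2e" % (temperature, shift))
```

At room temperature the fractional decrease is of the order of $10^{-5}$, which supports treating $w_F(T)$ and $w_F(0)$ as equal for good conductors.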
So far this discussion of the Fermi-Dirac distribution of an electron gas has been
limited to the case in which no external electric field is present. Upon application of an
external field, the entire gas takes on a drift velocity $\mathbf{v}_d$ which slightly alters the energy
distribution. However, since the energy levels below $w_F - kT$ were essentially filled,
and since the energy levels above $w_F + kT$ were essentially empty, the principal change
in the distribution occurs around the Fermi level. Thus, for example, when a steady
electric field is removed and the drift velocity $\mathbf{v}_d$ begins to disappear, this is due to a
relaxation in the energy distribution of free electrons back to the form of Figure 8.4.
For this reason it is principally those electrons whose energies are close to $w_F$ which
participate in the relaxation process. The relaxation time $\tau$ occurring in (8.13) and in
the conductivity expression (8.9) therefore refers to the Fermi-level electrons.

16 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., Chap. 10, John Wiley and Sons, Inc.,
New York, 1956. This result was first obtained by Sommerfeld.

Returning to (8.28), one sees that the mean free path for Fermi-level electrons in an
isotropic conductor is expressible as $\lambda = u_F\tau$, in which $u_F$ is given adequately by the
relation $mu_F^2/2 = w_F(0)$. For the monovalent metals listed in Table 8.3, the values for
$u_F$ and $\lambda$ calculated in this way are given in columns 3 and 4 of that table. On the presumption that the average collision time for free electrons in any speed range $(v, v + dv)$
is inversely proportional to $v$, $\lambda$ has the same value for all the free electrons. Thus even
though the mean free path $\lambda$ has been computed by focusing attention on those electrons
whose energies are close to $w_F$, the results should be applicable to the entire free electron
gas.
At first sight the A values listed in Table 8.3 seem startling, since they are t\VO orders
of magnitude larger than the lattice dimensions of the metals to which they refer.
However, if one recalls the wavelike properties of electrons," an explanation can be
found in terms of waves passing through periodic structures. Just as a light wave
would propagate through a perfect crystal without attenuation, so would electron waves
pass through a perfectly periodic lattice without interaction. Thus it is not the lattice
sites themselves which cause the electron-lattice interactions, but rather it is the
deviations from a perfectly periodic lattice which impede the electron motion. These
deviations usually are caused chiefly by thermal lattice vibrations but also include
lattice defects and foreign impurity atoms. Boundaries, of course, also play an impeding
role.
Because the classical picture of a collision between an electron and a lattice ion is
seen not to be valid in describing this process, the term electron-lattice interaction has
been used in preference to collision. The subject of lattice imperfections will be taken
up again in Section 8.9 where the temperature dependence of resistivity is discussed.
EXAMPLE 8.2

The Fermi zero energy for silver may be determined from (8.34) by assuming one free
electron per atom, so that $n = 5.86 \times 10^{28}$ electrons/m$^3$. Then

$$W_F(0) = \frac{(6.62 \times 10^{-34})^2}{8 \times 9.1 \times 10^{-31}} \left( \frac{3 \times 5.86 \times 10^{28}}{\pi} \right)^{2/3} = 8.85 \times 10^{-19} \text{ joules} = 5.5 \text{ ev}$$

From this it follows that

$$u_F = \left[ \frac{2 W_F(0)}{m} \right]^{1/2} = 1.39 \times 10^{6} \text{ m/sec}$$

Using Table 8.2, one finds the mean free path in silver to be

$$\lambda = u_F \tau = 5.3 \times 10^{-8} \text{ m} = 530 \text{ Å}$$

These results are all seen to be in accord with Table 8.3.
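The arithmetic of Example 8.2 can be retraced with a short Python sketch. The constants are the rounded values used in the example. The relaxation time `tau_silver` is an assumption: it is back-computed from the quoted mean free path and Fermi speed rather than read from Table 8.2, which is not reproduced here. All function and variable names are illustrative.

```python
import math

H = 6.62e-34    # Planck's constant, joule-sec (rounded value used in the example)
M_E = 9.1e-31   # electron mass, kg
EV = 1.6e-19    # joules per electron volt


def fermi_energy_zero(n):
    """W_F(0) = (h^2 / 8m) (3n / pi)^(2/3) in joules; n in electrons per m^3."""
    return (H ** 2 / (8.0 * M_E)) * (3.0 * n / math.pi) ** (2.0 / 3.0)


def fermi_speed(w_f):
    """u_F from m u_F^2 / 2 = W_F(0), in m/sec."""
    return math.sqrt(2.0 * w_f / M_E)


n_silver = 5.86e28                 # one free electron per atom
w_f = fermi_energy_zero(n_silver)
u_f = fermi_speed(w_f)

tau_silver = 3.8e-14               # ASSUMED relaxation time in sec (not taken from Table 8.2)
mean_free_path = u_f * tau_silver  # lambda = u_F * tau

print(w_f / EV, u_f, mean_free_path)
```

Changing `n_silver` to the free electron density of another monovalent metal gives the corresponding entries, provided a relaxation time for that metal is supplied.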


17 See, e.g., the electron diffraction experiments of Davisson and Germer, "Diffraction of Electrons by
a Crystal of Nickel," Phys Rev, 30, 705-740; 1927.

8.7

JOULE'S LAW

As noted in Section 8.1, Joule determined experimentally that the heat developed per
second in a conducting wire of resistance $R$, carrying a current $I$, is $I^2R$. If one uses
(8.10) and (8.11), this may be put in the form

$$I^2 R = \frac{V^2}{R} = \frac{(El)^2}{\rho_c l / A} = \sigma_c E^2 \, lA$$

in which $l$ and $A$ are the length of the wire and its cross section. From this, one may
infer that the volume density of heat developed per second is

$$\frac{I^2 R}{lA} = \sigma_c E^2 \tag{8.35}$$

It is this form of Joule's law for which an elementary derivation will now be given.
Consider again, as in Section 8.5, an isotropic conductor in which each of a large
number of electrons $N_0$ is about to have its next interaction with the lattice at the
present time $t$. One of these electrons had velocity components $v_x$, $v_y$, $v_z$ as it came off its
last encounter with the lattice at an earlier time $t_1$. If an electric field of strength $E$ is
acting along the negative $X$ direction, its velocity components now are

$$v_x + \frac{e}{m} E (t - t_1), \qquad v_y, \qquad v_z$$

since it has been uniformly accelerated throughout this period of time. The increase in
energy of this electron between lattice interactions is therefore

$$\Delta W = \frac{m}{2} \left[ v_x + \frac{e}{m} E (t - t_1) \right]^2 - \frac{m}{2} v_x^2 = e E v_x (t - t_1) + \frac{e^2}{2m} E^2 (t - t_1)^2 \tag{8.36}$$

Assuming isotropic scattering, if (8.36) is averaged over all values of initial velocity
$v_x$, since $\langle v_x \rangle = 0$, one finds that

$$\langle \Delta W \rangle = \frac{e^2}{2m} E^2 (t - t_1)^2 \tag{8.37}$$

If now this expression is further averaged over all earlier times $t_1$, using the distribution
function (8.25), one obtains

$$\delta W = \frac{1}{N_0} \int \langle \Delta W \rangle \, dN = \frac{e^2 E^2}{2 m \tau_c} \int_{-\infty}^{t} (t - t_1)^2 e^{-(t - t_1)/\tau_c} \, dt_1$$

Upon performing the integration, one finds that

$$\delta W = \frac{e^2 \tau_c^2}{m} E^2$$

When one recalls that for isotropic scattering $\tau = \tau_c$, if there are $n$ free electrons per
m$^3$, and if all these electrons transfer their excess kinetic energy to the lattice, then

$$\frac{n \, \delta W}{\tau} = \frac{n e^2 \tau}{m} E^2 = \sigma_c E^2 \tag{8.38}$$

which agrees with the experimental statement of Joule's law, (8.35).
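The averaging step in this derivation can be checked numerically. The sketch below is only an illustration: it assumes the exponential distribution of free times with mean $\tau_c$ described for (8.25), assumes the conductivity form $\sigma_c = ne^2\tau/m$ for (8.9), and uses invented function names and illustrative input values.

```python
import math

E_CHARGE = 1.6e-19   # magnitude of the electron charge, coulombs
M_E = 9.1e-31        # electron mass, kg


def mean_square_free_time(tau, steps=200_000, span=40.0):
    """Midpoint-rule estimate of (1/tau) * integral_0^inf s^2 exp(-s/tau) ds,
    where s = t - t1 is the time since the last lattice interaction.
    The exact value is 2 * tau**2."""
    ds = span * tau / steps          # the integral is truncated at span * tau
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        total += s * s * math.exp(-s / tau) * ds
    return total / tau


def joule_heat_density(n, tau, e_field):
    """n * deltaW / tau, with deltaW found by averaging (8.37) numerically."""
    delta_w = (E_CHARGE ** 2 * e_field ** 2 / (2.0 * M_E)) * mean_square_free_time(tau)
    return n * delta_w / tau


def conductivity(n, tau):
    """sigma_c = n e^2 tau / m (assumed form of the conductivity expression)."""
    return n * E_CHARGE ** 2 * tau / M_E


# Illustrative inputs (assumed): silver-like density and relaxation time, 1 volt/m field.
n, tau, e_field = 5.86e28, 3.8e-14, 1.0
heat = joule_heat_density(n, tau, e_field)
sigma = conductivity(n, tau)
print(heat, sigma * e_field ** 2)
```

The two printed numbers should agree closely, which is the content of (8.38).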

8.8

THE DEBYE THEORY OF SPECIFIC HEAT

The first law of thermodynamics is a statement of the principle of conservation of
energy and asserts that if an amount of heat $dQ$ is added to a system, this must be equal
to the increase $dU$ in the system's internal energy plus the work done by the system on
its surroundings. If the work done is mechanical in nature, the law takes the form

$$dQ = dU + p\,dV \tag{8.39}$$

with $p$ the pressure exerted by the system and $dV$ its change in volume.
The internal energy $U$ is expressible formally as a function of any two of the state
variables: pressure, volume, and temperature. Thus one form of the total differential is

$$dU = \left(\frac{\partial U}{\partial V}\right)_T dV + \left(\frac{\partial U}{\partial T}\right)_V dT \tag{8.40}$$

with the subscripts on the partial derivatives indicating which state variable is being
held fixed. Combination of these two equations gives

$$dQ = \left[p + \left(\frac{\partial U}{\partial V}\right)_T\right] dV + \left(\frac{\partial U}{\partial T}\right)_V dT \tag{8.41}$$

The heat capacity of the system at constant volume will be represented by the symbol
$C_v$, and is defined as the amount of heat which can be absorbed per unit rise in temperature under the restriction that the volume not change. In mathematical terms

$$C_v = \left(\frac{dQ}{dT}\right)_V \tag{8.42}$$

It is sometimes advantageous to speak in terms of one mole of the system and to distinguish molal values by using lower-case symbols for the heat absorbed, internal
energy, volume, etc. The heat capacity per mole at constant volume is thus designated
by $c_v$ and is customarily called the specific heat at constant volume. From (8.41) and
(8.42) it is given by

$$c_v = \left(\frac{\partial u}{\partial T}\right)_v \tag{8.43}$$

This quantity can be ascertained for different materials by experiment and is generally
found to be a function of temperature. In 1912 Debye¹⁸ addressed himself to the problem
of determining a theoretical expression for $c_v$ which would agree with observed data.
The essentials of his theory are embodied in the following development.
Consider a solid in the shape of a cube of edge $L$, and imagine that the vibrations of
the atoms which occupy the lattice sites of this solid give rise to elastic standing waves
within the solid. These waves satisfy the differential equation

$$\nabla^2 f = \frac{1}{c_s^2} \frac{\partial^2 f}{\partial t^2} \tag{8.44}$$

which has the standing wave solutions

$$f = \sin\left(\frac{l \pi x}{L}\right) \sin\left(\frac{m \pi y}{L}\right) \sin\left(\frac{n \pi z}{L}\right) \cos 2\pi\nu t \tag{8.45}$$

in which $l$, $m$, $n$ are positive integers and $\nu$ is the frequency of oscillation; $c_s = \nu\lambda$ is
the propagation velocity, with $\lambda$ the wavelength. Insertion of (8.45) in (8.44) yields a
relation which dictates the allowed frequencies and wavelengths, namely,

$$\left(\frac{\pi^2}{L^2}\right)(l^2 + m^2 + n^2) = \frac{4\pi^2\nu^2}{c_s^2} = \frac{4\pi^2}{\lambda^2} \tag{8.46}$$

18 P. Debye, "On the Theory of Specific Heat," Ann Phys, 39, 789-839; 1912.

The modes (8.45) may be put into one-to-one correspondence with the triplets of
positive integers $(l,m,n)$ and this correspondence permits determination of the number
of modes $Z(\nu)\,d\nu$ in the frequency range from $\nu$ to $\nu + d\nu$. If the triplets $(l,m,n)$ are
plotted as points in Cartesian coordinates, each point occupies a unit volume. Since

$$r^2 = l^2 + m^2 + n^2 = \left(\frac{2L\nu}{c_s}\right)^2$$

may be interpreted as the square of the distance out to the triplet $(l,m,n)$, then it
follows that $\frac{1}{8}(4\pi r^2\,dr)$, being the volume in the first octant between the spheres of radii $r$ and $r + dr$, is numerically equal to the number of triplets in the shell bounded by these spheres. But

$$\frac{1}{8}(4\pi r^2\,dr) = \frac{\pi}{2}\left(\frac{2L}{c_s}\right)^3 \nu^2\,d\nu = Z(\nu)\,d\nu \tag{8.47}$$

and thus the frequency distribution of modes has been determined.
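The counting argument behind (8.47) can be tried directly. The following Python sketch is a brute-force illustration with invented function names and an arbitrarily chosen radius: it counts the triplets of positive integers lying in a spherical shell and compares the count with the first-octant shell volume.

```python
import math


def count_triplets_in_shell(r, dr):
    """Count triplets (l, m, n) of positive integers with
    r <= sqrt(l^2 + m^2 + n^2) < r + dr."""
    r_outer = r + dr
    limit = int(math.floor(r_outer))
    lo2, hi2 = r * r, r_outer * r_outer
    count = 0
    for l in range(1, limit + 1):
        for m in range(1, limit + 1):
            lm2 = l * l + m * m
            if lm2 >= hi2:
                break                 # larger m only increases the distance
            for n in range(1, limit + 1):
                d2 = lm2 + n * n
                if d2 >= hi2:
                    break             # larger n only increases the distance
                if d2 >= lo2:
                    count += 1
    return count


def octant_shell_volume(r, dr):
    """(1/8)(4 pi r^2 dr), the first-octant shell volume used in (8.47)."""
    return 0.125 * 4.0 * math.pi * r * r * dr


r, dr = 60.0, 1.0                     # illustrative values
counted = count_triplets_in_shell(r, dr)
estimated = octant_shell_volume(r, dr)
print(counted, estimated)
```

The count and the volume estimate differ by a few percent at this radius, mainly because of points near the coordinate planes and the finite shell thickness; the relative difference should shrink as `r` is increased.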


In Debye's theory longitudinal and transverse elastic waves are permitted with
propagation velocities $c_l$ and $c_t$. Since there are two independent transverse axes, one
concludes as an extension of (8.47) that the distribution of modes, including longitudinal
and both transverse types, is

$$Z(\nu)\,d\nu = 4\pi V \left(c_l^{-3} + 2c_t^{-3}\right) \nu^2\,d\nu \tag{8.48}$$

in which $V = L^3$ is the volume of the specimen.
Debye assumed that each atom in the lattice had 3 degrees of vibrational freedom,
and that the number of allowed modes for the elastic waves was limited to $3N$, with $N$
the total number of atoms. This defined an upper frequency $\nu_D$ (called the Debye frequency) which satisfies the relation

$$\int_0^{\nu_D} Z(\nu)\,d\nu = 3N$$

Using (8.48), one finds that

$$\nu_D^3 = \frac{9N}{4\pi V} \left(c_l^{-3} + 2c_t^{-3}\right)^{-1} \tag{8.49}$$

A feeling for the magnitude of $\nu_D$ may be obtained by choosing as a representative value
$N/V = 10^{28}$ atoms per m$^3$, and assuming that $c_l = c_t = 10^3$ m/sec. This gives
$\nu_D \approx 10^{12}$ cps.
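The order-of-magnitude estimate of the Debye frequency follows directly from (8.49). A minimal Python sketch, with an illustrative function name and the representative inputs quoted above:

```python
import math


def debye_frequency(atoms_per_m3, c_l, c_t):
    """nu_D from (8.49): nu_D^3 = (9N / 4 pi V) / (c_l^-3 + 2 c_t^-3).
    atoms_per_m3 is N/V; c_l and c_t are the wave velocities in m/sec."""
    denominator = 4.0 * math.pi * (c_l ** -3 + 2.0 * c_t ** -3)
    return (9.0 * atoms_per_m3 / denominator) ** (1.0 / 3.0)


nu_d = debye_frequency(1e28, 1e3, 1e3)   # representative values from the text
print(nu_d)                               # cycles per second
```

Because the velocities enter as inverse cubes, the slower wave type dominates the sum, and $\nu_D$ scales only as the cube root of the atomic density.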
Using Planck's formula that the average energy of an oscillator at temperature $T$ is
$h\nu/(e^{h\nu/kT} - 1)$, one may write for the vibrational energy of the solid

$$U = \int_0^{\nu_D} Z(\nu) \frac{h\nu}{e^{h\nu/kT} - 1}\,d\nu = \frac{9NkT}{(h\nu_D/kT)^3} \int_0^{x_D} \frac{x^3\,dx}{e^x - 1} \tag{8.50}$$

in which $x = h\nu/kT$. It is convenient to introduce a characteristic temperature $\theta_D$,
called the Debye temperature, by the definition $\theta_D = h\nu_D/k$, which gives

$$U = \frac{9NkT}{(\theta_D/T)^3} \int_0^{\theta_D/T} \frac{x^3\,dx}{e^x - 1} \tag{8.51}$$

For high temperatures ($T \gg \theta_D$), inspection of (8.51) reveals that $x \ll 1$ for the
entire range of integration. The denominator of the integrand may then be replaced
by $x$ and one obtains, for one mole,

$$u = 3N_A kT \tag{8.52}$$

wherein $N_A$ is Avogadro's number, the number of atoms per mole. If one uses (8.43),

$$c_v = 3kN_A = 3R \tag{8.53}$$

in which $R = kN_A$ is the ideal gas constant. This result agrees with the classical
Maxwell-Boltzmann theory, indicating that the latter is adequate at temperatures
sufficiently above $\theta_D$.
For low temperatures ($T \ll \theta_D$), the upper limit of integration in (8.51) may be
replaced by infinity and the integral assumes¹⁹ the value $\pi^4/15$. Then

$$u = \frac{3\pi^4 R}{5\theta_D^3}\,T^4 \qquad c_v = \frac{12\pi^4 R}{5} \left(\frac{T}{\theta_D}\right)^3 \tag{8.54}$$

At intermediate temperatures the integral in (8.51) can be calculated by standard
techniques. Over the whole range of temperature one is able to construct the plot of
specific heat shown in Figure 8.8. The observed values of $c_v$ for any solid may be fitted
to this curve by selecting a value for the Debye temperature $\theta_D$. The result of doing
this for aluminum is indicated by the data points and the agreement is seen to be very
good. A list of Debye temperatures, determined in this manner for a variety of materials, is given in Table 8.4.

19 See, e.g., Whittaker and Watson, Modern Analysis, 4th ed., p. 265, Cambridge University Press,
London, 1935.

FIGURE 8.8 Debye curve for heat capacity of a solid.
Data points are for aluminum ($\theta_D = 418$°K).
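The Debye curve can be traced numerically from (8.51) and (8.43). The Python sketch below is one possible way to do this, not the method used to draw Figure 8.8: it evaluates the integral by the midpoint rule and differentiates the molal energy by a central difference. The function names, step counts, and the value of the gas constant are choices made for the illustration.

```python
import math

R_GAS = 8.314  # ideal gas constant, joules/mole/deg


def debye_integral(x_upper, steps=20_000):
    """Midpoint-rule estimate of integral_0^x_upper x^3 / (e^x - 1) dx."""
    dx = x_upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x ** 3 / math.expm1(x) * dx
    return total


def molal_energy(temp, theta_d):
    """u(T) from (8.51) with N = N_A, so that N k = R."""
    ratio = theta_d / temp
    return 9.0 * R_GAS * temp * debye_integral(ratio) / ratio ** 3


def specific_heat(temp, theta_d, rel_step=1e-2):
    """c_v = du/dT from (8.43), approximated by a central difference."""
    h = rel_step * temp
    return (molal_energy(temp + h, theta_d) - molal_energy(temp - h, theta_d)) / (2.0 * h)


theta_al = 418.0  # Debye temperature of aluminum in deg K
for t_over_theta in (0.1, 0.2, 0.5, 1.0, 1.2):
    c_v = specific_heat(t_over_theta * theta_al, theta_al)
    print(t_over_theta, c_v / (3.0 * R_GAS))
```

The printed ratio $c_v/3R$ rises from the $T^3$ law of (8.54) at low temperature toward the classical limit of (8.53) at high temperature.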
TABLE 8.4
REPRESENTATIVE VALUES OF DEBYE TEMPERATURE IN °K

Substance      θ_D      Substance      θ_D      Substance      θ_D
Beryllium      1160     Vanadium       273      Silver         225
Diamond C      2000     Chromium       402      Cadmium        300
Sodium         150      Iron           467      Tungsten       379
Magnesium      406      Cobalt         385      Platinum       229
Aluminum       418      Nickel         456      Gold           165
Silicon        658      Copper         339      Lead           95
Potassium      100      Zinc           308      Bismuth        117
Calcium        219      Germanium      366      NaCl           281
Titanium       278      Molybdenum     425      AgBr           144

The Debye theory has been found to be only approximate, with deviations occurring
between theory and experiment at very low temperatures. However, the refinements
which improve on the Debye theory need not be a concern in the present discussion,
since they do not affect the arguments which are to follow. The reader interested in
these refinements is referred to the literature.²⁰
It is apparent from the development that the internal energy expression (8.51) refers
to the lattice energy, and does not include any contribution from a free electron gas
which might be present within a conductive material. However, the Fermi-Dirac
statistics may be used to show²¹ that the total internal energy of $N$ conduction
electrons is

$$U = \frac{3}{5} N W_F(0) \left\{ 1 + \frac{5\pi^2}{12} \left[ \frac{kT}{W_F(0)} \right]^2 + \cdots \right\} \tag{8.55}$$

For a conductor which has $fN_A$ free electrons per mole, the contribution to the specific
heat capacity at constant volume is therefore

$$c_v(\text{electrons}) = \frac{\pi^2}{2} fR \left[ \frac{kT}{W_F(0)} \right] \tag{8.56}$$

Since $kT/W_F(0)$ is so small at ordinary temperatures for most metals, and since the
fraction $f$ is normally close to unity for these metals, one sees from (8.56) that the free
electron gas normally makes a contribution to the total specific heat which is much
less than $R$. For this reason, at temperatures sufficiently above $\theta_D$, $c_v(\text{total}) \approx 3R$
whether the substance is a conductor or not. This conclusion is in agreement with the
Dulong-Petit experimental law. At temperatures so elevated that the contribution
(8.56) cannot be ignored, most metals are no longer solid and indeed may have evaporated, so that the entire analysis is inapplicable.

20 See, e.g., M. Blackman, "The Theory of the Specific Heat of Solids," Rep Prog Phys, 8, 11-29; 1941.
21 See, e.g., F. W. Sears, op cit., pp. 335-336.
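The smallness of the electronic contribution is easy to quantify from (8.56). The Python sketch below uses silver as an illustration, with $W_F(0) = 5.5$ ev from Example 8.2 and $f = 1$; the function name and the choice of 300 °K are assumptions made for the example.

```python
import math

K_EV = 8.617e-5  # Boltzmann's constant in ev/deg


def electronic_specific_heat_over_r(temp, w_f0_ev, f=1.0):
    """c_v(electrons) / R from (8.56): (pi^2 / 2) f kT / W_F(0)."""
    return 0.5 * math.pi ** 2 * f * K_EV * temp / w_f0_ev


ratio = electronic_specific_heat_over_r(300.0, 5.5)  # silver at room temperature
print(ratio)         # electronic contribution in units of R
print(ratio / 3.0)   # the same contribution as a fraction of the lattice value 3R
```

Since (8.56) is linear in $T$, the electronic term would need a temperature of many thousands of degrees to become comparable with $3R$.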

8.9

THE TEMPERATURE DEPENDENCE OF THE RESISTIVITY OF METALS

Equation (8.24) may be rewritten in the form

$$\frac{dN/N}{dt_1} = \frac{1}{\tau} \tag{8.57}$$

in which $\tau$ has replaced $\tau_c$, implying isotropic materials. (Cf. Section 8.5.) Written in
this manner, (8.57) may be given the interpretation that the fraction of free electrons
which suffer a lattice interaction per unit time is $1/\tau$. Stated in another way, $1/\tau$ is the
probability for isotropic scattering of a free electron per unit time.
Imagine a conductor whose lattice atoms do not vibrate, but which contains imperfections in the forms of vacancies, interstitials, dislocations, and impurity atoms. These
imperfections cause scattering, with a probability which may be designated by $1/\tau_i$.
Alternatively, imagine a conductor which contains no imperfections but whose lattice
atoms do vibrate. These vibrations cause the instantaneous spacings of the atoms to be
slightly irregular so that the electron waves experience some scattering. Let the probability of scattering in this case be denoted by $1/\tau_l$.
A real conductor exhibits both scattering mechanisms since it contains imperfections
and since its lattice atoms do vibrate. Because these two mechanisms are independent,
the scattering probabilities are additive, and one may write

$$\frac{1}{\tau} = \frac{1}{\tau_i} + \frac{1}{\tau_l} \tag{8.58}$$

If use is made of (8.9), the resistivity is given by

$$\rho_c = \frac{m}{ne^2} \left( \frac{1}{\tau_i} + \frac{1}{\tau_l} \right) \tag{8.59}$$

Over a reasonable range of temperatures, the concentration and distribution of
imperfections, as well as the free electron density, are independent of temperature,
indicating that $\tau_l$ is the only temperature-dependent quantity in (8.59). One would
expect the scattering probability $1/\tau_l$ to be proportional to the collision cross section
of a lattice atom. In turn, the collision cross section is governed by the square of the
mean amplitude of atomic vibrations, and is thus proportional to the lattice energy.
For this reason (8.59) may be rewritten in the form

$$\rho_c = \rho_i + b'u \tag{8.60}$$

in which $\rho_i = m/ne^2\tau_i$ is known as the residual resistivity (being due to imperfections),
$u$ is the molal internal energy, and $b'$ is a constant. At temperatures sufficiently above
the Debye temperature $\theta_D$, (8.52) indicates that $u \propto T$ and then (8.60) becomes

$$\rho_c = a + bT \tag{8.61}$$

with $a$ and $b$ constants. This linear dependence of resistivity on temperature is known
as Matthiessen's rule. It is a result which agrees with observed data for most metals,
for $T > \theta_D$.

FIGURE 8.9 Resistance of an impure specimen of sodium as a function of temperature.
[After MacDonald and Mendelssohn, Proc Roy Soc (London), A202, 103; 1950.]
An illustration of the effect indicated by (8.60) is shown in Figure 8.9. The resistance
of a specimen of sodium is plotted as a function of temperature, with the curve clearly
displaying a behavior which can be explained as being due to a residual resistance plus
a contribution which follows the trend of the specific internal energy.
Another illustration of the effect of imperfections on resistivity is provided by Figure
8.10. When nickel is added as an impurity atom to copper, the resistivity increases
markedly even though nickel is itself a good conductor. The decisive factor is the symmetry disruption of the atoms filling the lattice.
EXAMPLE 8.3

With reference to Figure 8.9, the scattering probabilities due to imperfections and due to
thermal lattice vibrations may be estimated for the sodium specimen under consideration.
When contractions in the length and diameter of the specimen as the temperature is lowered
are ignored, the ordinates are resistivities relative to the resistivity at 290°K. If one takes
for the latter $\rho_c(290^\circ\text{K}) = 4 \times 10^{-8}$ ohm-m, the residual resistivity is

$$\rho_i = 4 \times 10^{-4} \rho_c(290^\circ\text{K}) = 16 \times 10^{-12} \text{ ohm-m}$$

This means that

$$\frac{1}{\tau_i} = \frac{ne^2\rho_i}{m} = \frac{(2.54 \times 10^{28})(1.6 \times 10^{-19})^2 (16 \times 10^{-12})}{9.1 \times 10^{-31}} = 1.14 \times 10^{10} \text{ sec}^{-1}$$

The mean time between electron-lattice interactions thus would be about $10^{-10}$ sec if there
were only imperfections and no lattice vibrations.
At 290°K, $\rho_c = 4 \times 10^{-8}$ ohm-m, so that

$$\frac{1}{\tau} = \frac{ne^2\rho_c}{m} = 0.285 \times 10^{14} \text{ sec}^{-1}$$

Therefore at room temperature, with both imperfections and lattice vibrations contributing
to the resistivity, the mean time between electron-lattice interactions is much shorter,
being about $3 \times 10^{-14}$ sec. Since

$$\frac{1}{\tau} = \frac{1}{\tau_i} + \frac{1}{\tau_l}$$

it follows that in this case $\tau_l \approx \tau$ and the lattice vibrations are the dominant cause of
resistivity at room temperature. Electron-lattice interactions due to lattice vibrations occur
approximately 2500 times as often as do interactions caused by imperfections in
the specimen.

FIGURE 8.10 Resistivity of Cu-Ni alloy vs. temperature as a function of
percentage of Ni concentration. [After Linde, Ann Phys, 15, 219; 1932.]
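The estimates of Example 8.3 can be reproduced in a few lines of Python. The sketch uses the rounded constants of the example and the additivity of scattering probabilities from (8.58); the function and variable names are illustrative only.

```python
E_CHARGE = 1.6e-19  # magnitude of the electron charge, coulombs
M_E = 9.1e-31       # electron mass, kg
N_SODIUM = 2.54e28  # free electrons per m^3, as used in Example 8.3


def scattering_rate(resistivity):
    """1/tau = n e^2 rho / m, obtained by inverting (8.59); result in 1/sec."""
    return N_SODIUM * E_CHARGE ** 2 * resistivity / M_E


rho_290 = 4e-8            # resistivity at 290 deg K, ohm-m
rho_i = 4e-4 * rho_290    # residual resistivity read from Figure 8.9

rate_total = scattering_rate(rho_290)
rate_imperfections = scattering_rate(rho_i)
rate_lattice = rate_total - rate_imperfections   # additivity, (8.58)

print(rate_imperfections, rate_total)
print(rate_lattice / rate_imperfections)  # lattice events per imperfection event
```

The last ratio depends only on $\rho_i/\rho_c$, so it is independent of the electron density assumed for sodium.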

8.10

THERMAL CONDUCTIVITY OF METALS


AND THE WIEDEMANN-FRANZ LAW

If the temperature is not uniform throughout a homogeneous, isotropic substance, a
flow of heat will occur in the direction opposite to the temperature gradient. This fact
is usually expressed by the equation

$$Q = -K \nabla T \tag{8.62}$$

in which $Q$ is the areal density of heat flow in watts per m$^2$ and $K$ is the thermal conductivity coefficient, expressed in watts per meter per degree.
For most materials the value of $K$ depends on temperature: the case of copper is
shown in Figure 8.11 as an illustration, and Table 8.5 lists measured thermal conductivity values for some representative materials at two different temperatures. It may be
observed from this table that the metals are generally better conductors of heat than
the insulators, the difference being as great as two orders of magnitude. This is due to
the fact that the principal means for transferring heat energy in insulators is via lattice
vibrations. Although this mechanism is also operative in metals, the highly mobile
free electrons are the carriers of most of the heat being transferred,²² and are responsible
for the greater thermal conductivity.

FIGURE 8.11 Thermal conductivity of copper vs. temperature.
The analysis which follows will be restricted to metals, and only the contribution
to thermal conductivity made by the free electrons will be taken into account. The
procedure will be to consider all the free electrons which pass in unit time through a
unit area transverse to the heat flow, in both directions, to determine their individual
transport energies, and thus by summing to deduce $Q$, the net energy flow per unit
time. The result will prove to be proportional to $\nabla T$ so that use of (8.62) will yield
an expression for $K$.

22 In some metals the thermal conductivity due to electrons may be one or two orders of magnitude
greater than the contribution made by lattice vibrations. See, e.g., C. Kittel, Introduction to Solid
State Physics, 2d ed., p. 149, John Wiley and Sons, Inc., New York, 1956.
Consider a metallic conductor in which a temperature distribution has been established. The average energy of those free electrons which are in regions of higher temperature will be greater than the average energy of those free electrons which are in
regions of lower temperature. The more energetic electrons will move toward the lower
TABLE 8.5
THERMAL CONDUCTIVITY FOR REPRESENTATIVE MATERIALS

K in watts/m/°K
Material

Aluminum
Copper
Silver
Gold
Iron

.
.
.
.
.

Magnesium

Nickel
Sodium
,
Nae!.
Potassium Alum
Carbon
Glass
Mica

.
.
.
.
.
.

250
565
420
180
185
110
150
\

26
1.2

220
385
410
300
90

170
80

135
7
1.9
4.2

1.0

0.8

temperature regions carrying this additional energy with them and causing a flow of
heat. If this loss of energy in the high temperature regions is replenished, and if the
gain in energy of the low temperature regions is siphoned off, the temperature distribution can be maintained.
Because of the difference in mean velocity associated with this differential in electron
energy, there is a tendency for the number of electrons which arrive per unit time in
the low temperature regions to exceed the number which leave, with the opposite
tendency prevailing in the high temperature regions. This tendency toward charge
accumulation causes a small internal electric field parallel to the temperature gradient,
which is just sufficient to prevent continued charge accumulation. The net result is
that as many electrons leave any region per unit time as arrive; in a low temperature
region, the arriving electrons bring with them more kinetic energy than the departing
electrons take away; the reverse situation exists in the high temperature regions.
With this picture of the mechanism in mind, choose coordinate axes such that in
some region of the conductor, the temperature is a function only of $z$, represented by
$T(z)$. In parallel with the development of Section 8.4, let $f_0(z,v_x,v_y,v_z)$ be the free electron
equilibrium distribution which would exist in the presence of this temperature gradient
if there were no electric field present, and let $f(z,v_x,v_y,v_z)$ be the free electron equilibrium
distribution with the compensating electric field also present. For this case Boltzmann's
transport equation becomes
$$v_z \frac{\partial f}{\partial z} - \frac{e}{m} E_z \frac{\partial f}{\partial v_z} = -\frac{f - f_0}{\tau} \tag{8.63}$$

Since the temperature gradients and the electric fields they induce are normally very
small, one may substitute $f = f_0$ in the left side of (8.63) with negligible error, obtaining

$$f = f_0 + \tau \left( \frac{e}{m} E_z \frac{\partial f_0}{\partial v_z} - v_z \frac{\partial f_0}{\partial T} \frac{\partial T}{\partial z} \right) \tag{8.64}$$

For the same reason $f_0$, as it appears in (8.64) and all subsequent equations, may now
be taken to be the Fermi-Dirac distribution $f_0(v_x,v_y,v_z)$.
The electric current density may be deduced from the transport of charge, and as
in Section 8.4 is given by

$$\iota_z = \iiint (-e) v_z f \, dv_x\,dv_y\,dv_z = -e\tau \iiint v_z \left( \frac{e}{m} E_z \frac{\partial f_0}{\partial v_z} - v_z \frac{\partial f_0}{\partial T} \frac{\partial T}{\partial z} \right) dv_x\,dv_y\,dv_z \tag{8.65}$$

By analogy the flow of heat may be deduced from the transport of kinetic energy,
and is given by

$$Q = \iiint \left( \frac{mv^2}{2} \right) v_z f \, dv_x\,dv_y\,dv_z = \frac{m\tau}{2} \iiint v_z v^2 \left( \frac{e}{m} E_z \frac{\partial f_0}{\partial v_z} - v_z \frac{\partial f_0}{\partial T} \frac{\partial T}{\partial z} \right) dv_x\,dv_y\,dv_z \tag{8.66}$$

Under the equilibrium requirement that $\iota_z = 0$, the electric field $E_z$ may be eliminated
by combining (8.65) and (8.66). When this is done, and one finds the partials $\partial f_0/\partial v_z$,
$\partial f_0/\partial T$ for the Fermi-Dirac distribution, evaluation of the integrals gives the first-order expression²³

$$Q = -\frac{1}{3} \frac{n\pi^2 k^2 T \tau}{m} \frac{\partial T}{\partial z} \tag{8.67}$$

From this result and (8.62) it follows that the thermal conductivity is

$$K = \frac{1}{3} \frac{n\pi^2 k^2 \tau}{m}\,T \tag{8.68}$$

a formula which is seen to depend linearly on the absolute temperature.
The electrical and thermal conductivities may be compared by forming the ratio

$$L = \frac{K}{\sigma_c T} = \frac{1}{3} \left( \frac{\pi k}{e} \right)^2 \tag{8.69}$$

This result is seen to be a universal constant and is a theoretical confirmation of the
Wiedemann-Franz law, established experimentally in 1853. The constant $L$ is known
as the Lorenz number and has the theoretical value $2.45 \times 10^{-8}$ watt-ohm/deg$^2$. The
experimental Lorenz numbers for various metals are listed in Table 8.6 and the agreement with theory is seen to be quite good.

23 The reader interested in the details of this analysis is referred to A. H. Wilson, The Theory of Metals,
2d ed., pp. 200-201, Cambridge University Press, London, 1954.
TABLE 8.6
EXPERIMENTAL LORENZ NUMBERS

               L × 10^8 watt-ohm/deg²                     L × 10^8 watt-ohm/deg²
Metal          T = 273°K    T = 373°K      Metal          T = 273°K    T = 373°K
Copper         2.23         2.33           Tin            2.52         2.49
Zinc           2.31         2.33           Tungsten       3.04         3.20
Molybdenum     2.61         2.79           Platinum       2.51         2.60
Silver         2.31         2.37           Gold           2.35         2.40
Cadmium        2.42         2.43           Lead           2.47         2.56
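The comparison between the theoretical Lorenz number of (8.69) and the experimental values can be made explicit. The Python sketch below uses rounded constants and the entries of Table 8.6; the dictionary layout and names are illustrative.

```python
import math

K_B = 1.38e-23      # Boltzmann's constant, joule/deg
E_CHARGE = 1.6e-19  # magnitude of the electron charge, coulombs

# Theoretical Lorenz number from (8.69), in watt-ohm/deg^2.
lorenz_theory = (math.pi * K_B / E_CHARGE) ** 2 / 3.0

# Experimental L x 10^8 at (273 deg K, 373 deg K), transcribed from Table 8.6.
TABLE_8_6 = {
    "Copper": (2.23, 2.33),
    "Zinc": (2.31, 2.33),
    "Molybdenum": (2.61, 2.79),
    "Silver": (2.31, 2.37),
    "Cadmium": (2.42, 2.43),
    "Tin": (2.52, 2.49),
    "Tungsten": (3.04, 3.20),
    "Platinum": (2.51, 2.60),
    "Gold": (2.35, 2.40),
    "Lead": (2.47, 2.56),
}

# Fractional deviation of each experimental value from the theoretical one.
deviations = {
    metal: tuple(value * 1e-8 / lorenz_theory - 1.0 for value in values)
    for metal, values in TABLE_8_6.items()
}

print(lorenz_theory)
for metal, pair in deviations.items():
    print(metal, pair)
```

Most of the listed metals fall within roughly ten percent of the theoretical value, with tungsten and molybdenum the largest departures.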

At low temperatures the theory given does not apply. It is no longer suitable to
neglect lattice contributions to K, and a relaxation time for the electrical and thermal
conductivities cannot be defined uniquely. For this reason neither theory nor experiment conforms to the Wiedemann-Franz law unless the temperature of the specimen
is above the Debye temperature.

8.11

CONDUCTIVITY OF SEMICONDUCTORS

Homogeneous semiconductors are found to share with metals the property of supporting drift currents which conform to Ohm's law. The explanation of this conformance
differs for the two classes of materials. In earlier sections of this chapter, the free
electron model formed the basis of a theory of conduction which gives satisfactory
agreement with experiment in the case of metals, particularly those which are monovalent. However, the conduction process is more complex in a semiconductor. The free
electron model is not applicable and one must turn to the band theory for a suitable
explanation.
Referring to the discussion of Section 8.2, a crystalline solid whose Fermi level lies
in the forbidden region between two allowed energy bands is a semiconductor if the
energy gap between these bands is small. Silicon and germanium are notable examples
of such solids. Silicon has the atomic number 14 and a valence of 4. At normal temperatures all of its energy bands are essentially filled through 2p (cf. Table 7.2). Its 3s and
3p bands have combined in a manner which can best be understood by imagining what
happens as the silicon atoms are brought closer together. At large atom separation
these two bands are separated and well-defined, with 3p above 3s. As the separation
decreases, the bands overlap and merge together. At still shorter interatomic spacing,
the merged levels split into two bands, each of which contains four states per atom.
This is the situation at the equilibrium spacing in solid silicon, and at absolute zero the
lower of these bands is completely filled, the upper band being void; a gap of 1.1 ev
separates these two bands. Similarly, germanium, with the atomic number 32, has a
valence of 4. At normal temperatures all of its energy bands are essentially filled through
3d. Its 4s and 4p bands have merged and resplit into two new bands which are separated
by a gap of 0.72 ev. At absolute zero the lower of these bands is completely filled and
the upper band is empty. It will be shown later in this section that for both of these
pure crystals the Fermi level at absolute zero lies half way between the last filled band
(called the valence band) and the first unfilled band (called the conduction band).
Because of their valence of 4, silicon and germanium solidify into crystals with a
structure similar to that of diamond, in which each atom forms a covalent bond with
four other atoms in a three-dimensional lattice. A two-dimensional representation of
this structure is indicated by Figure 8.12. At absolute zero all the bonds are complete,
but as the temperature is raised some of the bonds break, indicating that an occasional
electron has gained enough energy to detach itself, leaving behind an atom with a
vacancy, or hole.

FIGURE 8.12 Two-dimensional representation of silicon or germanium crystal structure.
In thermal equilibrium, in the absence of an external electric field, the detached
electrons will wander randomly through the crystal. So too will the holes, since it is an
unnatural state for an atom in the crystal to be incompletely bonded. An atom in this
state will "steal" an electron from one of its four neighbors, thus transferring the hole
to the neighbor, which in turn will pass the hole on to one of its four neighbors. In this
way the holes also move about randomly through the crystal. Occasionally a hole and a
detached electron encounter each other and recombine. This annihilation of hole-electron pairs is balanced by their thermal generation so that a constant volume density
of mobile electrons and holes is maintained. Raising the temperature increases the
volume density of these charge carriers.
This condition can be represented in an energy band diagram, as shown in Figure
8.13. The Fermi-Dirac function (8.2) is plotted to the right of the bands and indicates
a partial population of the bottom of the conduction band at the expense of the top
of the valence band. Individual energy levels just below $W_v$ or just above $W_c$ are randomly occupied, but on the time average, the fractional occupancy follows the Fermi
curve.
If an electric field is applied from left to right in Figure 8.12, the random motion of
mobile electrons and holes has superimposed on it an ordered drift of electrons to the
left and holes to the right; both of these drifts contribute to a current flow in the direction of the electric field. In terms of the band diagram of Figure 8.13, since there are
filled and empty energy levels adjacent to each other just below $W_v$ and just above $W_c$,
the holes and conduction electrons need to acquire only a slight amount of energy from
the field to move from one level to another. Because of this, they are highly mobile
and will drift easily.
Despite this mobility, the drift current per unit applied field is extremely low in a
pure semiconductor. The reason for this is that the density of conduction electrons and
holes is very small. For example, in germanium at room temperature, the measured
conduction electron density is only $2.5 \times 10^{19}$ per m$^3$. This is nine orders of magnitude
below the value for silver (cf. Example 8.2) and accounts for the great disparity in
the conductivities of the two materials.

FIGURE 8.13 Energy band diagram for pure semiconductor. (Ordinate: electron energy, with the
conduction band and the Fermi level $W_F$ marked; abscissa: probability that an electron occupies quantum state.)
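The sparseness of carriers in a pure semiconductor can be illustrated with the Fermi-Dirac function. The Python sketch below rests on two assumptions that should be stated plainly: that (8.2) has the usual form $1/(e^{(W - W_F)/kT} + 1)$, and that the Fermi level still sits at mid-gap at room temperature (the text asserts this only for absolute zero). Function names are illustrative.

```python
import math

K_EV = 8.617e-5  # Boltzmann's constant in ev/deg


def fermi_dirac(w, w_f, temp):
    """Probability that a state of energy w (ev) is occupied at temperature temp."""
    return 1.0 / (math.exp((w - w_f) / (K_EV * temp)) + 1.0)


def band_edge_occupancy(gap_ev, temp=300.0):
    """Return (electron occupancy at the conduction-band edge,
    hole probability at the valence-band edge), with the Fermi level
    ASSUMED to lie at mid-gap."""
    w_v, w_c = 0.0, gap_ev
    w_f = 0.5 * gap_ev
    return fermi_dirac(w_c, w_f, temp), 1.0 - fermi_dirac(w_v, w_f, temp)


for name, gap in (("germanium", 0.72), ("silicon", 1.1)):
    print(name, band_edge_occupancy(gap))

# Ratio of the conduction electron densities quoted in the text: silver to germanium.
density_ratio = 5.86e28 / 2.5e19
print(density_ratio)
```

With a mid-gap Fermi level the electron and hole probabilities at the two band edges come out equal, and the larger gap of silicon gives a much smaller occupancy than germanium at the same temperature.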
The conductivity of a semiconductor may be enhanced greatly by the addition of a
small trace of appropriately chosen impurity. For example, if a slight concentration
of an element of valence 5, such as arsenic or antimony, is introduced into molten
germanium, the crystal formed on cooling contains a dispersion of impurity atoms
which have substituted for germanium atoms at regular lattice sites. After forming
bonds with four neighbors, the impurity atom has one electron left over which is easily
detached; from then on this electron is capable of moving about through the crystal,
thus contributing to the conductivity. Because valence 5 atoms donate an electron
in this manner, they are called donors. Upon making its donation, the impurity atom
becomes an immobile positive ion. Thus donor impurities contribute conduction electrons, but do not produce holes.
Similarly, if a slight concentration of an element of valence 3, such as gallium or
boron, is added to molten germanium, the resulting crystal contains a dispersion of
substituted impurity atoms. After forming bonds with four neighbors, the impurity
atom is shy one electron, which may be "stolen" from a germanium atom, thereby
creating a mobile hole. Because valence 3 atoms accept an electron from the crystal
in this manner, they are called acceptors. Upon accepting an electron, the impurity
atom becomes an immobile negative ion. Thus acceptor impurities contribute mobile
holes, but do not produce conduction electrons.
For impurity concentrations even so low as one part in a million, the mobile carriers
produced by impurities outweigh those produced thermally by at least several orders
of magnitude; the conductivity is then governed by the impurities. For this reason, a
semiconductor containing donors is said to be n-type since the conduction is due largely
to the negatively charged electrons. Similarly, a semiconductor containing acceptors
is called p-type because the conduction is principally due to the positively charged holes.
The energy band diagram for a semiconductor containing impurities is illustrated
by Figure 8.14. Shown is the case in which both donor and acceptor atoms are present
with the donors more abundant. Therefore the population of the conduction band is
increased with respect to the case of a pure semiconductor (cf. Figure 8.13) and the
Fermi level has risen to reflect this fact. The energy level the donor electrons had when
associated with the donor atoms is designated by $W_D$. Actually, this is a band of energy
levels, but the density of donor atoms is normally so small that their spacing is 10
lattice sites or more, causing the donor band to have negligible width. Similarly, the
energy level the acceptor electrons have when they become attached to the acceptor
atoms is designated by $W_A$. The energy level $W_D$ is shown close to $W_c$ and $W_A$ close to $W_v$,
indicating that ionization of both types of impurity atoms occurs easily.
The population densities in the various energy bands may be determined with the
aid of the Fermi-Dirac function. If $N_D$ is the volume density of donor impurity atoms
and $n_D$ the number of electrons per unit volume remaining in the states at $W_D$, then

$$n_D = \frac{N_D}{e^{(W_D - W_F)/kT} + 1} \tag{8.70}$$

which is a further use of (8.2). Under normal circumstances, $W_D - W_F \gg kT$ and (8.70)
reduces to

$$n_D \approx N_D\, e^{-(W_D - W_F)/kT} \tag{8.71}$$

indicating that $n_D$ is quite small and that most of the donor atoms are ionized.
Likewise, the density of electrons $n_A$ which have been raised to the states at $W_A$ is
given by

$$n_A = \frac{N_A}{e^{(W_A - W_F)/kT} + 1} \tag{8.72}$$

with $N_A$ the volume density of acceptor impurity atoms.


To find the volume density of electrons in the conduction band, it is convenient first
to determine the equivalent volume density of states $N_c$ concentrated at the energy
level $W_c$. Actually, the allowed states are spread out through the entire conduction band,
but the Fermi-Dirac function indicates that only those states close to $W_c$ have a significant
electron population. When the true density of states is combined with the
Fermi-Dirac function, it can be shown²⁴ that the equivalent volume density of states
at $W_c$ is

$$N_c = 2\left(\frac{2\pi m_e^* kT}{h^2}\right)^{3/2} \tag{8.73}$$

in which $m_e^*$ is the effective mass of a conduction electron in the presence of the periodic
potential of the crystal (cf. Section 7.16). The volume density of electrons in the conduction
band is then given by

$$n = N_c\, e^{-(W_c - W_F)/kT} \tag{8.74}$$

FIGURE 8.14 Energy band diagram for impure semiconductor. (Vertical axis: electron energy, with the conduction band and the levels $W_F$, $W_A$, and $W_v$ marked; horizontal axis: probability that an electron occupies quantum state.)

24 See, e.g., L. V. Azaroff and J. J. Brophy, Electronic Processes in Materials, pp. 197-200, McGraw-Hill
Book Company, New York, 1963.


By similar reasoning, one can show that the volume density $p$ of mobile holes in
the valence band may be expressed as

$$p = N_v\, e^{-(W_F - W_v)/kT} \tag{8.75}$$

in which

$$N_v = 2\left(\frac{2\pi m_h^* kT}{h^2}\right)^{3/2} \tag{8.76}$$

is the equivalent volume density of states at the energy level $W_v$. In (8.76) $m_h^*$ is the
effective mass of holes in the presence of the periodic potential of the crystal.
These formulas permit determination of the population densities at all four energy
levels if these levels and the temperature are known, and if the Fermi energy $W_F$ has
been ascertained. A deduction of the value of $W_F$ proceeds from a statement of the
charge neutrality of every volume element of a homogeneous semiconductor. Since
the density of free electrons is $n$, that of negative acceptor ions is $n_A$, that of holes $p$,
and that of positive donor ions $N_D - n_D$, it follows that

$$n + n_A = p + N_D - n_D \tag{8.77}$$

which, with the aid of the preceding formulas, may be written

$$N_c\, e^{-(W_c - W_F)/kT} + \frac{N_A}{e^{(W_A - W_F)/kT} + 1} = N_v\, e^{-(W_F - W_v)/kT} + N_D - \frac{N_D}{e^{(W_D - W_F)/kT} + 1} \tag{8.78}$$

With all other quantities known (8.78) may be solved for $W_F$. The general solution is
not of great interest, but three special cases are of practical importance.

Case 1: The pure, or intrinsic, semiconductor.


If $N_A = N_D = 0$, the semiconductor contains no impurities. If one designates $n_i$
as the free electron density for this case, $p_i$ the mobile hole density, and $W_i$ the Fermi
level, (8.77) gives $n_i = p_i$, and (8.78) yields

$$W_i = \frac{W_c + W_v}{2} + \frac{kT}{2} \ln \frac{N_v}{N_c} \tag{8.79}$$

Since the effective masses of electrons and holes are nearly equal, $N_v \approx N_c$ and (8.79)
indicates that the intrinsic Fermi level is approximately halfway between the valence
and conduction bands for all reasonable temperatures.

Case 2: An n-type semiconductor.


When only donor impurities are present, $N_A = 0$. With $n_n$ designated as the conduction
electron density for this case, $p_n$ the mobile hole density, and $W_n$ the Fermi level, if
the donor atoms are essentially all ionized (which is usually the case), then (8.77) gives

$$n_n = p_n + N_D \tag{8.80}$$

However, the product of (8.74) and (8.75) is independent of Fermi level and therefore

$$n_n p_n = N_c N_v\, e^{-(W_c - W_v)/kT} = n_i p_i \tag{8.81}$$

Under normal circumstances, $N_D \gg n_i$ so that, with the aid of (8.81), Equation (8.80)
becomes

$$n_n \approx N_D \tag{8.82}$$

showing that $n_n \gg p_n$.
Combination of (8.82) with (8.74) gives for the Fermi level

$$W_n = W_c - kT \ln \frac{N_c}{N_D} \tag{8.83}$$

Equation (8.83) indicates that the Fermi level rises as $N_D$ is increased, which reflects
the increased electron population in the conduction band.

Case 3: A p-type semiconductor.


When only acceptor impurities are present, $N_D = 0$. With $n_p$ chosen to represent the
conduction electron density for this case, $p_p$ the mobile hole density, and $W_p$ the Fermi
level, if the acceptor atoms are essentially all ionized (which is usually the case), then
(8.77) gives

$$n_p + N_A = p_p \tag{8.84}$$

Because normally $N_A \gg n_i$, and because further use of (8.81) gives

$$n_p p_p = n_i p_i \tag{8.85}$$

it follows that (8.84) reduces to

$$p_p \approx N_A \tag{8.86}$$

Combination of (8.86) with (8.75) gives for the Fermi level

$$W_p = W_v + kT \ln \frac{N_v}{N_A} \tag{8.87}$$

Equation (8.87) indicates that the Fermi level lowers as $N_A$ is increased, which is
consistent with the decreased electron population of the valence band.
For all three of the foregoing cases, the densities of mobile carriers, (8.74) and (8.75),
may be written

$$n = N_c\, e^{-(W_c - W_i)/kT}\, e^{(W_F - W_i)/kT} = n_i\, e^{(W_F - W_i)/kT} \tag{8.88}$$

$$p = N_v\, e^{-(W_i - W_v)/kT}\, e^{-(W_F - W_i)/kT} = p_i\, e^{-(W_F - W_i)/kT} \tag{8.89}$$

indicating that the conduction electron and hole densities are controlled by the shift
in Fermi level. This shift may be determined by combining (8.79) and (8.83) or (8.87).
EXAMPLE 8.4

Using the atomic weight of germanium, its measured density, and Avogadro's number,
one can determine that the atom density is $4.4 \times 10^{28}$ per m$^3$. If an arsenic impurity
concentration of $10^{-5}$ is added to pure germanium, this means that the volume density of
arsenic atoms is $N_D = 4.4 \times 10^{23}$ per m$^3$. The shift in Fermi level due to this impurity
concentration can be deduced from (8.88). One may write

$$n_n \approx N_D = 4.4 \times 10^{23} = n_i\, e^{(W_n - W_i)/kT} = 2.5 \times 10^{19}\, e^{(W_n - W_i)/kT}$$

$$W_n - W_i = kT \ln (17{,}600) = 9.78\,kT$$

At room temperature, $kT = 0.025$ ev and the rise in Fermi level is

$$W_n - W_i = 0.24 \text{ ev}$$

Since $W_c - W_v = 0.72$ ev, this shift amounts to 34 percent of the gap width.
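
The arithmetic of this example can be retraced with a short script. What follows is only a sketch: the function name `fermi_level_shift_kT` is invented for illustration, and the numbers are those quoted in the example.

```python
import math

def fermi_level_shift_kT(n_doped, n_intrinsic):
    """Return (W_n - W_i)/kT obtained by inverting n = n_i exp((W_F - W_i)/kT), Eq. (8.88)."""
    return math.log(n_doped / n_intrinsic)

N_D = 4.4e23    # donor density per m^3 (10^-5 arsenic concentration in germanium)
n_i = 2.5e19    # intrinsic conduction electron density per m^3 at room temperature
kT_ev = 0.025   # kT at room temperature, in ev
gap_ev = 0.72   # W_c - W_v for germanium, in ev

shift_kT = fermi_level_shift_kT(N_D, n_i)   # shift in units of kT
shift_ev = shift_kT * kT_ev                 # shift in ev
gap_fraction = shift_ev / gap_ev            # shift as a fraction of the gap width

print(f"W_n - W_i = {shift_kT:.2f} kT = {shift_ev:.2f} ev")
print(f"fraction of gap width = {gap_fraction:.0%}")
```

Because the shift depends only on the logarithm of the ratio $N_D/n_i$, a tenfold change in impurity concentration moves the Fermi level by only about 2.3 kT.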


The conduction electron-lattice and hole-lattice interactions in a semiconductor fit
the assumptions of the analysis presented in Section 8.4, the only differences being the
form of the velocity distribution function of mobile carriers and the dependence of
carrier acceleration on the periodic potential of the lattice. When these differences are
taken into account, the result is once again Equation (8.23), with the effective mass
replacing the electronic mass. Therefore the drift current density in a semiconductor
may be written

$$\boldsymbol{\iota} = \boldsymbol{\iota}_n + \boldsymbol{\iota}_p = \sigma_e \mathbf{E} + \sigma_h \mathbf{E} = \sigma \mathbf{E} \tag{8.90}$$

in which $\boldsymbol{\iota}_n$ is the current density due to free electron drift and $\boldsymbol{\iota}_p$ is the current density
due to hole drift. The conductivities $\sigma_e$ and $\sigma_h$ are given by

$$\sigma_e = \frac{n e^2 \tau_e}{m_e^*} \qquad \sigma_h = \frac{p e^2 \tau_h}{m_h^*} \tag{8.91}$$

with $\tau_e$ and $\tau_h$ the respective relaxation times. The total conductivity $\sigma$ is the sum of
$\sigma_e$ and $\sigma_h$. Because of the anisotropies in the relaxation times and effective masses, $\sigma$ is
in general a tensor.
Average drift velocities for the conduction electrons and holes may be defined by
the flow equations

$$\boldsymbol{\iota}_n = -n e \mathbf{v}_n \qquad \boldsymbol{\iota}_p = p e \mathbf{v}_p \tag{8.92}$$

which can be combined with (8.90) to yield

$$\mathbf{v}_n = -\frac{\sigma_e}{n e} \mathbf{E} \qquad \mathbf{v}_p = \frac{\sigma_h}{p e} \mathbf{E} \tag{8.93}$$

The drift velocities per unit field are called the mobilities, and are defined by the relations

$$\mu_n = \frac{v_n}{E} \qquad \mu_p = \frac{v_p}{E} \tag{8.94}$$

in which $v_n$, $v_p$, and $E$ are magnitudes. In terms of the mobilities, the total conductivity
of a homogeneous semiconductor is

$$\sigma = e(n \mu_n + p \mu_p) \tag{8.95}$$

The mobilities are somewhat dependent on temperature and impurity concentration,
and at room temperature representative values are given in Table 8.7. The principal
determining factors of conductivity in a semiconductor, as seen from (8.95), are the
volume densities of conduction electrons and holes.
TABLE 8.7 MOBILITIES AT 300°K

                $\mu_n$ (m²/volt-sec)    $\mu_p$ (m²/volt-sec)
Silicon              0.12                  0.025
Germanium            0.36                  0.17

EXAMPLE 8.5

A cubical block of germanium 0.5 cm on a side contains a $10^{-5}$ arsenic concentration.
When a potential of 5.0 millivolts is applied across opposite faces of this specimen, the
measured current flow is 0.634 amp. From these data, the mobility of free electrons in
germanium may be deduced.
If one borrows the results of Example 8.4, the conduction electron density in this specimen
is $n_n = 4.4 \times 10^{23}$ per m$^3$. Therefore

$$p_n = \frac{n_i p_i}{n_n} = \frac{(2.5 \times 10^{19})^2}{4.4 \times 10^{23}} = 1.4 \times 10^{15}$$

and the conduction electron concentration is seen to be eight orders of magnitude higher
than the hole concentration. For this specimen (8.95) reduces to

$$\sigma = n_n e \mu_n = 7.04 \times 10^4 \mu_n \text{ mhos/m}$$

Since one may also write for the conductivity

$$\sigma = \frac{\iota}{E} = \frac{0.634/(0.005)^2}{5 \times 10^{-3}/0.005} = 2.53 \times 10^4 \text{ mhos/m}$$

it follows that

$$\mu_n = \frac{2.53 \times 10^4}{7.04 \times 10^4} = 0.36 \text{ m}^2/\text{volt-sec}$$

which agrees with the entry in Table 8.7.
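
The same deduction can be scripted. This is a minimal sketch using the data of the example; the helper name `conductivity_from_measurement` and the rounded value of the electronic charge are assumptions made here for illustration.

```python
E_CHARGE = 1.602e-19   # magnitude of the electronic charge in coulombs (rounded value)

def conductivity_from_measurement(current, voltage, side):
    """Conductivity (mhos/m) of a cube of edge `side` (m) from current (amp) and voltage (volt)."""
    current_density = current / side ** 2   # amp per m^2 through one face
    field = voltage / side                  # volt per m between opposite faces
    return current_density / field

side = 0.005       # 0.5 cm edge, in meters
voltage = 5.0e-3   # applied potential, volts
current = 0.634    # measured current, amp
n_n = 4.4e23       # conduction electron density per m^3 (from Example 8.4)

sigma = conductivity_from_measurement(current, voltage, side)
mu_n = sigma / (n_n * E_CHARGE)   # Eq. (8.95) with the hole term neglected

print(f"sigma = {sigma:.3g} mhos/m, mu_n = {mu_n:.2f} m^2/volt-sec")
```

The hole term of (8.95) is dropped because the hole density is eight orders of magnitude below the electron density in this specimen.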


It is interesting to note that this mobility is two orders of magnitude higher than the
mobility of electrons in commercial copper. The vastly greater conductivity of copper is
due solely to its free electron concentration.
For an intrinsic semiconductor, $n_i = p_i$ and (8.95) may be combined with (8.74) to give

$$\sigma = n_i e(\mu_n + \mu_p) = N_c e(\mu_n + \mu_p)\, e^{-(W_c - W_i)/kT}$$

With $W_i$ taken midway between $W_c$ and $W_v$, this may be written

$$\sigma = N_c e(\mu_n + \mu_p)\, e^{-(W_c - W_v)/2kT} \tag{8.96}$$

Equation (8.96) indicates an exponential dependence on temperature of the conductivity
of intrinsic semiconductors, a feature which distinguishes them from metallic conductors.
If $\sigma$, $\mu_n$, and $\mu_p$ are measured as functions of temperature, (8.96) may be used to deduce
the gap width $W_c - W_v$.
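
One way to carry out such a deduction is sketched below. It rests on a simplifying assumption not made in the text: over a narrow temperature range the factor $N_c e(\mu_n + \mu_p)$ is treated as constant, so that $\ln \sigma$ plotted against $1/T$ is a straight line of slope $-(W_c - W_v)/2k$. The data are synthetic, generated from an assumed gap of 0.72 ev, and are not measurements.

```python
import math

K_BOLTZ_EV = 8.617e-5   # Boltzmann constant in ev per degree Kelvin

def gap_from_conductivity(temps, sigmas):
    """Gap width in ev from the least-squares slope of ln(sigma) against 1/T."""
    xs = [1.0 / t for t in temps]
    ys = [math.log(s) for s in sigmas]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return -2.0 * K_BOLTZ_EV * slope   # slope = -(W_c - W_v)/(2k)

# Synthetic illustration only: conductivities generated from (8.96) with a
# constant (hypothetical) prefactor and an assumed gap of 0.72 ev.
assumed_gap = 0.72
prefactor = 1.0e6
temps = [280.0, 300.0, 320.0, 340.0]
sigmas = [prefactor * math.exp(-assumed_gap / (2.0 * K_BOLTZ_EV * t)) for t in temps]

recovered_gap = gap_from_conductivity(temps, sigmas)
print(f"recovered gap = {recovered_gap:.3f} ev")
```

With real data the temperature dependence of $N_c$ and of the mobilities would have to be divided out first, as the text indicates.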
For an impure semiconductor in which the impurity concentration is large enough to
dominate the conduction process, one or the other of the two terms in (8.95) may be neglected. The remaining term contains a factor n = N D or P = N A, neither of which is temperature sensitive. Since the mobilities also are not strongly dependent on temperature, the
conductivity of an impure semiconductor is found to be almost independent of temperature
(unless the temperature becomes high enough to cause the intrinsic effects to dominate those
governed by impurities).
In addition to the drift currents discussed in this section, which obey Ohm's law, a homogeneous
semiconductor can support diffusion currents if the local balance between electron-hole
annihilation and generation is upset. This will occur, for example, when two dissimilar
semiconductors have an interface, and this phenomenon is the controlling feature of the
semiconductor diode and the transistor. Diffusion currents in semiconductors are treated in
most texts on transistor theory.²⁵

25 See, e.g., the excellent discussion in R. D. Middlebrook, An Introduction to Junction Transistor
Theory, John Wiley and Sons, Inc., New York, 1957.

8.12

MAXWELL'S EQUATIONS FOR CONDUCTIVE MEDIA

If a collection of dielectric, and/or magnetic, and/or conductive materials† is considered
at the microscopic level to consist of an aggregation of charged particles in motion in a
vacuum, then the developments of Sections 6.21 and 7.17 have shown that Maxwell's
equations in the form

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mathbf{H} = \boldsymbol{\iota} + \dot{\mathbf{D}} \tag{8.97}$$

properly account for the dielectric and magnetic properties. When further it is appropriate
to express the primary current density in the form $\boldsymbol{\iota} = \sigma \mathbf{E}$, these equations
become

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mathbf{H} = \sigma \mathbf{E} + \dot{\mathbf{D}} \tag{8.98}$$

The auxiliary relations are still

$$\nabla \cdot \mathbf{D} = \rho \qquad \nabla \cdot \mathbf{B} = 0 \tag{8.99}$$

In particular, if a medium is characterizable by the constitutive parameters $\epsilon$ and $\mu$,
as well as $\sigma$, via the relations

$$\mathbf{D} = \epsilon \mathbf{E} \qquad \mathbf{H} = \mu^{-1} \mathbf{B} \qquad \boldsymbol{\iota} = \sigma \mathbf{E} \tag{8.100}$$

and the fields are time-harmonic through the function $e^{j\omega t}$, then (8.98) and (8.99)
become

$$\nabla \times \mathbf{H} = (\sigma + j\omega\epsilon)\mathbf{E} \qquad \nabla \times \mathbf{E} = -j\omega\mu \mathbf{H} \qquad \nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon} \qquad \nabla \cdot \mathbf{H} = 0 \tag{8.101}$$

Equations (8.101) comprise the general form of Maxwell's equations for time-harmonic
fields in linear media and are the starting point for most practical applications of
electromagnetic theory. The interested reader is referred to the journal literature and
to the many fine texts which deal with these subjects. Some of the latter are included
among the references at the ends of Chapters 3, 4, and 5.

† This can include materials which have any two or all three of these properties.


PROBLEMS
8.1

What is the drift velocity of electrons in a copper wire under the influence of an electric
field of strength 100 volts/cm?

8.2

Mobility may be defined as the average drift velocity, per unit of electric field, adopted
by the free electrons in a conductor as they form a steady current flow. Find the mobility
in silver.

8.3

Repeat the previous problem for copper.

8.4

Find the relation between the mean free path and mobility of the free electrons within a
metal.

8.5

Estimate the relaxation time for copper if the measured resistance at room temperature of
1,000 ft of No.8 wire is 0.641 ohms. (No.8 wire has a diameter of 0.128 in.)

8.6

Find the Fermi zero energy for free electrons in copper assuming one free electron per
atom. Use this result to find the mean free path and compare your answers with Table 8.3.

8.7

Explain why the constant a in Equation (8.61) is not the residual resistivity.

8.8

For small concentrations of impurities, how should the resistivity of a metal depend on
impurity concentration? Is your answer consistent with the data of Figure 8.10?

8.9

An absolutely pure and perfect silver wire has a conductivity at room temperature (300°K)
of $0.60 \times 10^8$ mhos/m. Find the conductivity if the temperature is doubled.

8.10

A germanium specimen contains a $10^{-5}$ impurity concentration of acceptor atoms. Assuming
that the acceptor energy level is 0.05 ev above the valence band, find the fraction of
acceptor atoms which are not ionized at room temperature.

8.11

A rectangular specimen of silicon which contains a uniform concentration of impurity
atoms is shown in the figure and is being considered at room temperature. It has measured
dimensions a = 5 mm, b = 2 mm, c = 1 mm.

(Figure: the rectangular specimen with its coordinate axes.)


(a) When a d.c. voltage is applied so as to make the face at x = a one volt higher in
potential than the face at x = 0, a current of 0.52 amp flows through the specimen.

What is the conductivity?


(b) When in addition a steady magnetic field of $10^4$ gauss is applied in the positive
Z direction, a Hall voltage of 0.0105 volts is measured in the Y direction, with the
face at y = b at the higher potential. (Cf. Example 4.3.) Is the sample p-type or
n-type? Explain.
(c) What is the mobility of holes in silicon at room temperature?
(d) What is the hole concentration?
(e) If the atom density of silicon is $5.0 \times 10^{22}$ per cm$^3$, what is the impurity concentration?

APPENDIX A

FRINGE SHIFT VERSUS ROTATION OF THE MICHELSON-MORLEY APPARATUS

IF THE Michelson-Morley apparatus is assumed to have equal arms ($l_1 = l_2 = l$) and
to be drifting through the ether at a speed $u \ll c$, a simple expression can be deduced
for the fringe shift as a function of rotational position of the apparatus. Imagine that
the ether is at rest in Figure A.1 and that the interferometer is moving upward as
indicated by the vector $\mathbf{u}$. Let $t_1'$ be the time it takes light to get from the half-silvered
mirror $P$ to the reflecting mirror $M_1$. This light suffers a horizontal displacement
$l \cos\theta$ and a vertical displacement $l \sin\theta + u t_1'$, because during its transit the mirror
$M_1$ has moved upward a distance $u t_1'$. Since the distance traveled through the ether is
also $c t_1'$, one can write

$$(c t_1')^2 = (l \cos\theta)^2 + (l \sin\theta + u t_1')^2$$

from which

$$t_1' = \frac{l}{c^2 - u^2}\left[u \sin\theta + (c^2 - u^2 \cos^2\theta)^{1/2}\right]$$

Similarly let $t_1''$ be the time required for this light to return from $M_1$ to $P$. During
transit it undergoes a horizontal displacement $l \cos\theta$ and a vertical displacement
$l \sin\theta - u t_1''$, because the half-silvered mirror $P$ has moved upward a distance $u t_1''$.
Thus

$$(c t_1'')^2 = (l \cos\theta)^2 + (l \sin\theta - u t_1'')^2$$

and

$$t_1'' = \frac{l}{c^2 - u^2}\left[-u \sin\theta + (c^2 - u^2 \cos^2\theta)^{1/2}\right]$$

Therefore the round-trip time is

$$t_1 = t_1' + t_1'' = \frac{2l\,(c^2 - u^2 \cos^2\theta)^{1/2}}{c^2 - u^2} \tag{A.1}$$

To find the time for light to travel from $P$ to $M_2$ and back to $P$, one need only replace
$\theta$ by $\theta - \pi/2$ in the above expression and obtain

$$t_2 = \frac{2l\,(c^2 - u^2 \sin^2\theta)^{1/2}}{c^2 - u^2} \tag{A.2}$$

The difference in phase of the two light components reaching the viewing telescope
is therefore

$$\delta = \frac{2\pi c}{\lambda}(t_2 - t_1) = \frac{4\pi l c}{\lambda (c^2 - u^2)}\left[(c^2 - u^2 \sin^2\theta)^{1/2} - (c^2 - u^2 \cos^2\theta)^{1/2}\right] \tag{A.3}$$

FIGURE A.1 Rotation of the Michelson-Morley apparatus.

The number of fringes shifted, $n$, can be found by dividing this phase difference by $2\pi$.
If $u \ll c$, the radicals can be expanded to give

$$n \approx \frac{l}{\lambda} \frac{u^2}{c^2} \cos 2\theta \tag{A.4}$$

and thus the fringe shifting occurs as a second harmonic of the rotational angle $\theta$.
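
The quality of the small-$u$ expansion can be checked numerically. The sketch below assumes the fringe count is taken as $(c/\lambda)(t_2 - t_1)$, a sign convention adopted here for illustration, and uses arm length, wavelength, and drift speed values that are merely plausible choices, not data from the text.

```python
import math

def round_trip_time(l, u, c, theta):
    """Round-trip time along an arm at angle theta, as in Eq. (A.1)."""
    return 2.0 * l * math.sqrt(c ** 2 - (u * math.cos(theta)) ** 2) / (c ** 2 - u ** 2)

def fringe_shift_exact(l, u, c, lam, theta):
    """Fringe count (c/lambda)(t2 - t1), with t2 obtained by replacing theta by theta - pi/2."""
    t1 = round_trip_time(l, u, c, theta)
    t2 = round_trip_time(l, u, c, theta - math.pi / 2.0)
    return (c / lam) * (t2 - t1)

def fringe_shift_approx(l, u, c, lam, theta):
    """Second-harmonic approximation valid for u << c."""
    return (l / lam) * (u / c) ** 2 * math.cos(2.0 * theta)

arm = 11.0          # effective arm length in meters (assumed value)
wavelength = 5.9e-7 # wavelength in meters (assumed value)
drift = 3.0e4       # ether drift speed in m/sec (assumed value)
light = 3.0e8       # velocity of light in m/sec

exact_0 = fringe_shift_exact(arm, drift, light, wavelength, 0.0)
approx_0 = fringe_shift_approx(arm, drift, light, wavelength, 0.0)
exact_45 = fringe_shift_exact(arm, drift, light, wavelength, math.pi / 4.0)

print(f"theta = 0:    exact {exact_0:.6f}, approximate {approx_0:.6f}")
print(f"theta = pi/4: exact {exact_45:.2e}")
```

The two arms are equivalent at $\theta = \pi/4$, so the exact shift vanishes there, as the $\cos 2\theta$ dependence requires.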

APPENDIX B

CLASSICAL DOPPLER SHIFT FROM A MOVING SOURCE IN THE PRESENCE OF A MOVING ETHER

ASSUME THE existence of a luminiferous ether and the validity of the Galilean transformation
and let XYZ be a frame of reference at rest in the ether. Let X'Y'Z' be
another frame whose axes are respectively aligned to those of XYZ, whose origin coincides
with the XYZ origin at $t = t' = 0$, and which is in translative motion at the constant
velocity $\mathbf{u}$ through XYZ.
Additionally, assume that a source of light waves is moving through X'Y'Z' at the
constant velocity $\mathbf{v}$, and thus through XYZ at the constant velocity $\mathbf{u} + \mathbf{v}$. The
directions of $\mathbf{u}$ and $\mathbf{v}$ will be taken as arbitrary but attention will be restricted to the
case that both their magnitudes are small compared to the velocity of light $c$.
In Figure B.1 the large dots represent positions of the light source relative to XYZ
at two instants of time, one period apart. Some of the radiated energy from this source
travels in the direction represented by the unit vector $\mathbf{e}$. It is desired first to find the
wavelength and frequency of the light waves going in this direction.

FIGURE B.1 Radiation from a moving source.

Since XYZ is at rest in the ether, the ray velocity (or velocity of energy transport)
in this frame is the same as the phase velocity of the waves. The latter is $\mathbf{n}c$, in which
$\mathbf{n} = \mathbf{e}$ is a unit vector normal to the wavefront. In time $\tau$, the radiation which left the
source at $t = 0$ has traveled a distance $l = c\tau$ and the source has been displaced a
distance $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{n}\,\tau$ in the direction of $\mathbf{e}$. Therefore the wavelength is

$$\lambda = [c - (\mathbf{u} + \mathbf{v}) \cdot \mathbf{n}]\,\tau \tag{B.1}$$

and the frequency is

$$\nu = \frac{c}{\lambda} = \nu_0 \frac{c}{c - (\mathbf{u} + \mathbf{v}) \cdot \mathbf{n}} \tag{B.2}$$

in which $\nu_0 = 1/\tau$ is the frequency of the light source as determined by an observer at
rest relative to the source.
These light waves, traveling in the direction of $\mathbf{e}$ in XYZ, can be described by the
equation

$$\psi = K \cos\phi \tag{B.3}$$

in which $K$ is the wave amplitude and

$$\phi = 2\pi\nu\left(t - \frac{\mathbf{r} \cdot \mathbf{n}}{c}\right) \tag{B.4}$$

is its phase. In (B.4) $\mathbf{r}$ is the position vector of the point $(x,y,z)$, that is

$$\mathbf{r} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z z \tag{B.5}$$

Equation (B.4) has a useful physical interpretation. Imagine that the wavefront
which passes through the origin of XYZ at $t = 0$, traveling in the direction of $\mathbf{e}$, is
labeled. Let an observer $O$ be placed at the fixed point $(x,y,z)$ and let him note the time
at which the labeled wavefront passes him. This time will be

$$t_1 = \frac{\mathbf{r} \cdot \mathbf{n}}{c} \tag{B.6}$$

because the phase of this wavefront never changes and its phase was zero when it
passed through the origin. Thus at any later time $t$, the phase of the wavefront which
is then passing $O$ is given by

$$\phi = 2\pi\nu(t - t_1) \tag{B.7}$$

In short, to measure the phase of the wavefront passing him at any time $t$, $O$ need
take only the labeled wavefront as reference. This means further that the number of
wavecrests which pass $O$ between the time the labeled wavefront passed him and the
time $t$ is $\nu(t - t_1)$.
Now consider an observer $O'$ stationary in X'Y'Z' at the particular point $(x',y',z')$
which causes $O$ and $O'$ to coincide at the instant $t = t'$. If $O'$ counts the total number
of crests of the wave which pass him between the time the labeled wavefront passed
him and the time $t'$, he must get the same answer as $O$. But he will describe the wave
by the equation

$$\psi' = K' \cos\phi' \tag{B.8}$$

in which

$$\phi' = 2\pi\nu'\left(t' - \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'}\right) \tag{B.9}$$

Since the labeled wavefront passed $O'$ at the time

$$t_1' = \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'} \tag{B.10}$$

and the number of crests which have passed him since then is $\nu'(t' - t_1')$, it follows that
the phase of the plane wave is an invariant, that is

$$\phi = \phi' \tag{B.11}$$

From this invariance the characteristics of the wave as viewed from X'Y'Z' can be
determined, since

$$\nu'\left(t' - \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'}\right) = \nu\left(t - \frac{\mathbf{r} \cdot \mathbf{n}}{c}\right) \tag{B.12}$$

must hold for all values of the spatial and temporal variables. If $x$, $y$, $z$, and $t$ are
eliminated from (B.12) through use of the Galilean equations

$$\mathbf{r} = \mathbf{r}' + \mathbf{u}t' \qquad t = t' \tag{B.13}$$

one obtains

$$\nu'\left[t' - \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'}\right] = \nu\left[t' - \frac{(\mathbf{r}' + \mathbf{u}t') \cdot \mathbf{n}}{c}\right] \tag{B.14}$$

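If coefficients are equated, the results are

$$\nu' = \nu\left(1 - \frac{\mathbf{u} \cdot \mathbf{n}}{c}\right) \tag{B.15}$$

$$\frac{\nu'}{c'} n_x' = \frac{\nu}{c} n_x, \text{ etc.} \tag{B.16}$$

From (B.16) it follows that

$$\frac{\nu'}{c'} = \frac{\nu}{c} \tag{B.17}$$

$$\mathbf{n}' = \mathbf{n} \tag{B.18}$$

Therefore the wavefront has the same direction in both coordinate systems and a combination
of (B.17) with (B.15) gives

$$c' = c - \mathbf{u} \cdot \mathbf{n} \tag{B.19}$$

Equation (B.19) gives the phase velocity in X'Y'Z'. The ray velocity is given by the
conventional Galilean formula $\mathbf{V}' = \mathbf{n}c - \mathbf{u}$.
Thus the phase velocity and ray velocity are not collinear in X'Y'Z'. If $\mathbf{e}'$ is a unit
vector in the direction of the ray velocity, then

$$\mathbf{n} = \mathbf{e}' \frac{V'}{c} + \frac{\mathbf{u}}{c} \tag{B.20}$$

and

$$\frac{V'}{c} = \mathbf{n} \cdot \mathbf{e}' - \frac{\mathbf{u} \cdot \mathbf{e}'}{c} \tag{B.21}$$

Since $\mathbf{n} \cdot \mathbf{e}'$ differs from unity only in second order, (B.20) can be written as

$$\mathbf{n} = \mathbf{e}'\left(1 - \frac{\mathbf{u} \cdot \mathbf{e}'}{c}\right) + \frac{\mathbf{u}}{c} \tag{B.22}$$

which is correct to first order.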

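Upon combining Equations (B.2), (B.15), and (B.22) one obtains

$$\nu' = \nu_0 \left\{ 1 - \frac{(\mathbf{v}/c) \cdot \mathbf{e}' - [(\mathbf{u}/c) \cdot \mathbf{e}'][(\mathbf{v}/c) \cdot \mathbf{e}'] + (\mathbf{u}/c) \cdot (\mathbf{v}/c)}{1 - (\mathbf{u}/c) \cdot \mathbf{e}' + [(\mathbf{u}/c) \cdot \mathbf{e}']^2 - (u/c)^2} \right\}^{-1} \tag{B.23}$$

If this equation is expanded (cf. Mathematical Supplement) and terms through the
second order only are retained, the result is that

$$\nu' = \nu_0 \left[ 1 + \frac{\mathbf{v} \cdot \mathbf{e}'}{c} + \frac{(\mathbf{v} \cdot \mathbf{e}')^2}{c^2} + \frac{\mathbf{u} \cdot \mathbf{v}}{c^2} \right] \tag{B.24}$$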
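
The second-order result can be compared numerically with the unexpanded relations. The sketch below works in units where $c = 1$ and, as an assumption made for the check, finds the wave normal exactly from $\mathbf{n} = \mathbf{e}'V'/c + \mathbf{u}/c$ with $|\mathbf{n}| = 1$ rather than from the first-order form (B.22). The velocity components are arbitrary illustrative values.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def doppler_exact(u, v, e_ray):
    """nu'/nu0 from (B.2) and (B.15), velocities in units of c.

    The ray speed V' is the positive root of |e' V' + u| = 1, which fixes n.
    """
    b = dot(u, e_ray)
    ray_speed = -b + math.sqrt(b * b + 1.0 - dot(u, u))
    n = [ray_speed * e + ui for e, ui in zip(e_ray, u)]
    w = [ui + vi for ui, vi in zip(u, v)]   # source velocity u + v in the ether frame
    return (1.0 - dot(u, n)) / (1.0 - dot(w, n))

def doppler_second_order(u, v, e_ray):
    """nu'/nu0 retaining terms through second order, as in (B.24)."""
    a = dot(v, e_ray)
    return 1.0 + a + a * a + dot(u, v)

u = (1.0e-3, 2.0e-3, 0.0)      # ether-frame velocity of X'Y'Z' (illustrative)
v = (2.0e-3, 1.0e-3, 1.0e-3)   # source velocity in X'Y'Z' (illustrative)
e_ray = (0.6, 0.8, 0.0)        # unit vector along the ray in X'Y'Z'

exact = doppler_exact(u, v, e_ray)
approx = doppler_second_order(u, v, e_ray)
print(f"exact {exact:.9f}, second order {approx:.9f}")
```

For these speeds of order $10^{-3}c$ the two values differ only at the level of the neglected third-order terms.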

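APPENDIX C

SOME PROPERTIES OF BESSEL FUNCTIONS

THE BESSEL function of the first kind $J_n(v)$ (in which $v = kr$ and $n$ is integral or zero)
is the coefficient of $t^n$ in the power series expansion of the expression

$$\exp\left[\frac{v}{2}\left(t - \frac{1}{t}\right)\right] \tag{C.1}$$

and for this reason (C.1) is known as the generating function. This result may be
appreciated by writing

$$\exp\left[\frac{v}{2}\left(t - \frac{1}{t}\right)\right] = \left[1 + \frac{vt}{2} + \frac{1}{2!}\left(\frac{vt}{2}\right)^2 + \frac{1}{3!}\left(\frac{vt}{2}\right)^3 + \cdots + \frac{1}{n!}\left(\frac{vt}{2}\right)^n + \cdots\right]$$
$$\times \left[1 - \frac{v}{2t} + \frac{1}{2!}\left(\frac{v}{2t}\right)^2 - \frac{1}{3!}\left(\frac{v}{2t}\right)^3 + \cdots + \frac{(-1)^n}{n!}\left(\frac{v}{2t}\right)^n + \cdots\right]$$

from which it follows that the coefficient of $t^n$ is

$$\frac{(v/2)^n}{n!} - \frac{(v/2)^{n+2}}{1!(n+1)!} + \frac{(v/2)^{n+4}}{2!(n+2)!} - \cdots + (-1)^m \frac{(v/2)^{n+2m}}{m!(n+m)!} + \cdots$$

which is identical with (3.72). Thus

$$\exp\left[\frac{v}{2}\left(t - \frac{1}{t}\right)\right] = \sum_{n=-\infty}^{\infty} t^n J_n(v) \tag{C.2}$$

If in (C.2) one replaces $t$ by $-1/t$ the result is

$$\exp\left[\frac{v}{2}\left(-\frac{1}{t} + t\right)\right] = \sum_{n=-\infty}^{\infty} (-t)^{-n} J_n(v)$$

If the summation index $-n$ is substituted for $n$, this becomes

$$\exp\left[\frac{v}{2}\left(t - \frac{1}{t}\right)\right] = \sum_{n=-\infty}^{\infty} (-1)^n t^n J_{-n}(v) \tag{C.3}$$

Comparison of (C.3) and (C.2) reveals that

$$J_{-n}(v) = (-1)^n J_n(v) \tag{C.4}$$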

The Bessel function of the second kind, for integral order, may be defined by the
relation

$$Y_n(v) = \lim_{\nu \to n} \frac{\cos \nu\pi \, J_\nu(v) - J_{-\nu}(v)}{\sin \nu\pi} \tag{C.5}$$

and application of L'Hospital's rule to (C.5) leads to the series (3.73). It therefore
follows, since $Y_n$ is definable in terms of Bessel functions of the first kind, which obey
(C.4), that

$$Y_{-n}(v) = (-1)^n Y_n(v) \tag{C.6}$$

Because of the nature of the defining relations (3.76) and (3.77) the Hankel functions
also obey this law. However, the manner in which the modified Bessel functions are
defined leads to the result that

$$I_{-n}(w) = I_n(w) \qquad K_{-n}(w) = K_n(w) \tag{C.7}$$

with w == fro These formulas are useful when working with the orthogonal representations (3.85) and (3.87).
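When both sides of (C.2) are differentiated with respect to $t$, one obtains

$$\sum_{n=-\infty}^{\infty} n t^{n-1} J_n(v) = \frac{v}{2}\left(1 + \frac{1}{t^2}\right) \sum_{n=-\infty}^{\infty} t^n J_n(v)$$

If the expression on the right is arranged in powers of $t$ and the coefficients of $t^{n-1}$ are
equated, it is evident that

$$n J_n(v) = \frac{v}{2}\left[J_{n-1}(v) + J_{n+1}(v)\right] \tag{C.8}$$

If any two successive Bessel functions are known, the third in sequence can be deduced
from (C.8) and then this process may be repeated indefinitely.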
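
The recurrence can be exercised numerically against the power series for $J_n(v)$ quoted earlier in this appendix. This is only a sketch: the function names are invented for illustration, and the series is truncated at a fixed number of terms, which is adequate for the small arguments used here.

```python
import math

def bessel_j(n, v, terms=30):
    """J_n(v) for integer n >= 0, summed from the power series with general term
    (-1)^m (v/2)^(n+2m) / (m! (n+m)!)."""
    return sum((-1) ** m * (v / 2.0) ** (n + 2 * m)
               / (math.factorial(m) * math.factorial(n + m))
               for m in range(terms))

def next_bessel(n, v, j_prev, j_curr):
    """J_{n+1}(v) from J_{n-1}(v) and J_n(v), by rearranging the recurrence (C.8)."""
    return (2.0 * n / v) * j_curr - j_prev

v = 2.0
j0 = bessel_j(0, v)
j1 = bessel_j(1, v)
j2_recurrence = next_bessel(1, v, j0, j1)   # third function deduced from the first two
j2_series = bessel_j(2, v)                  # same function summed directly

print(f"J0 = {j0:.5f}, J1 = {j1:.5f}")
print(f"J2 by recurrence = {j2_recurrence:.5f}, J2 by series = {j2_series:.5f}")
```

In floating-point arithmetic the upward recurrence gradually loses accuracy once the order exceeds the argument, so in practice the repetition is not carried on without limit.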
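Alternatively, if both sides of (C.2) are differentiated with respect to $v$, one obtains

$$\frac{1}{2}\left(t - \frac{1}{t}\right) \sum_{n=-\infty}^{\infty} t^n J_n(v) = \sum_{n=-\infty}^{\infty} t^n J_n'(v)$$

Upon equating coefficients of $t^n$ on the two sides of this identity, one finds that

$$J_n'(v) = \tfrac{1}{2}\left[J_{n-1}(v) - J_{n+1}(v)\right] \tag{C.9}$$

Equations (C.8) and (C.9) are known as recurrence relations and are also satisfied by
Bessel functions of the second and third kind. However, because of the nature of the
definitions (3.80) and (3.81), the modified Bessel functions satisfy

$$n I_n(w) = \frac{w}{2}\left[I_{n-1}(w) - I_{n+1}(w)\right] \tag{C.10}$$

$$I_n'(w) = \tfrac{1}{2}\left[I_{n-1}(w) + I_{n+1}(w)\right] \tag{C.11}$$

and

$$n K_n(w) = -\frac{w}{2}\left[K_{n-1}(w) - K_{n+1}(w)\right] \tag{C.12}$$

$$K_n'(w) = -\tfrac{1}{2}\left[K_{n-1}(w) + K_{n+1}(w)\right] \tag{C.13}$$

Upon eliminating either $J_{n-1}$ or $J_{n+1}$ from (C.8) and (C.9) one obtains

$$v J_n'(v) + n J_n(v) = v J_{n-1}(v)$$
$$v J_n'(v) - n J_n(v) = -v J_{n+1}(v)$$

which are equivalent to

$$\frac{d}{dv}\left[v^n J_n(v)\right] = v^n J_{n-1}(v) \tag{C.14}$$

$$\frac{d}{dv}\left[v^{-n} J_n(v)\right] = -v^{-n} J_{n+1}(v) \tag{C.15}$$

These differentiation formulas are also obeyed by the Bessel functions of the second and
third kind. For $n = 0$ the result is simply

$$J_0'(v) = -J_1(v) \tag{C.16}$$

Because of the difference in the recurrence relations, the modified Bessel functions
satisfy

$$\frac{d}{dw}\left[w^n I_n(w)\right] = w^n I_{n-1}(w) \tag{C.17}$$

$$\frac{d}{dw}\left[w^{-n} I_n(w)\right] = w^{-n} I_{n+1}(w) \tag{C.18}$$

and

$$\frac{d}{dw}\left[w^n K_n(w)\right] = -w^n K_{n-1}(w) \tag{C.19}$$

$$\frac{d}{dw}\left[w^{-n} K_n(w)\right] = -w^{-n} K_{n+1}(w) \tag{C.20}$$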
If $v = kr$ is real, the $J_n(v)$ functions oscillate and each has a sequence of roots which
may be designated by $\gamma_{n1}, \gamma_{n2}, \ldots, \gamma_{nm}, \ldots$, such that $J_n(\gamma_{nm}) = 0$,
$m = 1, 2, 3, \ldots$

A family of functions

$$J_n\left(\gamma_{nm} \frac{v}{v_0}\right) \qquad m = 1, 2, 3, \ldots \tag{C.21}$$

can be generated with the property that each of these functions has a null at $v = v_0$; for
the $m$th function there are $m$ nulls in the interval $0 \le v \le v_0$. That the individual
members of the family (C.21) are orthogonal to each other may be seen by the following
argument:

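Let

$$f_m(v) = \sqrt{v}\, J_n\left(\gamma_{nm} \frac{v}{v_0}\right) \qquad f_p(v) = \sqrt{v}\, J_n\left(\gamma_{np} \frac{v}{v_0}\right)$$

be built from any two members of the family. By direct substitution they are seen to satisfy
the differential equations

$$f_m'' + \left[\frac{\gamma_{nm}^2}{v_0^2} - \frac{4n^2 - 1}{4v^2}\right] f_m = 0 \qquad f_p'' + \left[\frac{\gamma_{np}^2}{v_0^2} - \frac{4n^2 - 1}{4v^2}\right] f_p = 0$$

Multiplying the first of these by $f_p$ and the second by $f_m$ and subtracting furnishes the
identity

$$\frac{\gamma_{nm}^2 - \gamma_{np}^2}{v_0^2} f_m f_p = f_p'' f_m - f_m'' f_p$$

Integration of both sides of this identity from 0 to $b$ yields

$$\frac{\gamma_{nm}^2 - \gamma_{np}^2}{v_0^2} \int_0^b f_m f_p \, dv = \left[f_p' f_m\right]_0^b - \int_0^b f_p' f_m' \, dv - \left[f_m' f_p\right]_0^b + \int_0^b f_m' f_p' \, dv = \left[f_p' f_m - f_m' f_p\right]_0^b$$

which can be written

$$\frac{\gamma_{nm}^2 - \gamma_{np}^2}{v_0^2} \int_0^b v J_n\left(\gamma_{nm} \frac{v}{v_0}\right) J_n\left(\gamma_{np} \frac{v}{v_0}\right) dv = \frac{b}{v_0}\left[\gamma_{np} J_n\left(\gamma_{nm} \frac{b}{v_0}\right) J_n'\left(\gamma_{np} \frac{b}{v_0}\right) - \gamma_{nm} J_n\left(\gamma_{np} \frac{b}{v_0}\right) J_n'\left(\gamma_{nm} \frac{b}{v_0}\right)\right] \tag{C.22}$$

If $m \ne p$ and $b = v_0$, since $J_n(\gamma_{nm}) = J_n(\gamma_{np}) = 0$, the above formula reduces to

$$\int_0^{v_0} v J_n\left(\gamma_{nm} \frac{v}{v_0}\right) J_n\left(\gamma_{np} \frac{v}{v_0}\right) dv = 0 \tag{C.23}$$

and thus (C.21) is an orthogonal family of functions.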


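Upon differentiating (C.22) with respect to $\gamma_{nm}$ and then letting $m = p$ and $b = v_0$,
one obtains

$$\int_0^{v_0} v J_n^2\left(\gamma_{nm} \frac{v}{v_0}\right) dv = \frac{v_0^2}{2}\left[J_n'(\gamma_{nm})\right]^2 \tag{C.24}$$

In the development leading to (C.15) it was shown that

$$J_n'(v) = \frac{n}{v} J_n(v) - J_{n+1}(v)$$

and thus

$$J_n'(\gamma_{nm}) = -J_{n+1}(\gamma_{nm}) \tag{C.25}$$

Combining these last three results, one can express the orthogonality relation in the
form

$$\int_0^{v_0} v J_n\left(\gamma_{nm} \frac{v}{v_0}\right) J_n\left(\gamma_{np} \frac{v}{v_0}\right) dv = \frac{v_0^2}{2} J_{n+1}^2(\gamma_{nm})\, \delta_{mp} \tag{C.26}$$

in which $\delta_{mp}$ is the Kronecker delta and has the value unity if $m = p$, but is otherwise
zero.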
Some of the roots $\gamma_{nm}$ for the lower-order Bessel functions are listed in Table C.1.

TABLE C.1 THE ROOTS $\gamma_{nm}$ OF $J_n(v)$; $J_n(\gamma_{nm}) = 0$

 m     n = 0      n = 1      n = 2      n = 3      n = 4      n = 5
 1     2.4048     3.8317     5.1356     6.3802     7.5883     8.7715
 2     5.5201     7.0156     8.4172     9.7610    11.0647    12.3386
 3     8.6537    10.1735    11.6198    13.0152    14.3725    15.7002
 4    11.7915    13.3237    14.7960    16.2235    17.6160    18.9801
 5    14.9309    16.4706    17.9598    19.4094    20.8269    22.2178
 6    18.0711    19.6159    21.1170    22.5827    24.0190    25.4303
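
The leading entries of the table can be reproduced by bracketing a sign change of $J_n(v)$ and bisecting. The sketch below sums the power series for $J_n$ directly, which is adequate only for the small arguments used here; the function names and the bracketing intervals are choices made for illustration.

```python
import math

def bessel_j(n, v, terms=40):
    """J_n(v) for integer n >= 0 from its power series (suitable for small v only)."""
    return sum((-1) ** m * (v / 2.0) ** (n + 2 * m)
               / (math.factorial(m) * math.factorial(n + m))
               for m in range(terms))

def find_root(n, lo, hi, tol=1e-10):
    """Root of J_n between lo and hi by bisection; J_n must change sign on [lo, hi]."""
    f_lo = bessel_j(n, lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(n, mid)
        if f_lo * f_mid <= 0.0:
            hi = mid                  # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # sign change lies in the upper half
    return 0.5 * (lo + hi)

gamma_01 = find_root(0, 2.0, 3.0)   # first root of J_0
gamma_11 = find_root(1, 3.0, 4.5)   # first root of J_1
gamma_21 = find_root(2, 4.5, 6.0)   # first root of J_2

print(f"{gamma_01:.4f} {gamma_11:.4f} {gamma_21:.4f}")
```

For the larger roots in the table a truncated power series becomes inaccurate, and a library routine or an asymptotic expansion would be the more reliable tool.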

APPENDIX D

THE ASSOCIATED LEGENDRE EQUATION

IN CHAPTER 3 the differential equation

$$(1 - u^2)\frac{d^2 f_2}{du^2} - 2u \frac{df_2}{du} + \left[n(n+1) - \frac{m^2}{1 - u^2}\right] f_2 = 0 \tag{D.1}$$

was encountered and its solution will be undertaken here. As a first step, consider the
case $m = 0$ which results in the ordinary Legendre equation

$$(1 - u^2)\frac{d^2 g}{du^2} - 2u \frac{dg}{du} + n(n+1) g = 0 \tag{D.2}$$

Let a solution to (D.2) be assumed in the form

$$g = \sum_{p=0}^{\infty} a_p u^{s+p} \tag{D.3}$$

in which $s$ is a constant. Then

$$\frac{dg}{du} = \sum_{p=0}^{\infty} (s + p) a_p u^{s+p-1} \qquad \frac{d^2 g}{du^2} = \sum_{p=0}^{\infty} (s + p)(s + p - 1) a_p u^{s+p-2}$$

and substitution of these terms in (D.2) gives

$$\sum_{p=0}^{\infty} (s+p)(s+p-1) a_p u^{s+p-2} - \sum_{p=2}^{\infty} (s+p-2)(s+p-3) a_{p-2} u^{s+p-2} - 2\sum_{p=2}^{\infty} (s+p-2) a_{p-2} u^{s+p-2} + n(n+1) \sum_{p=2}^{\infty} a_{p-2} u^{s+p-2} = 0$$

Since this result is to hold for all values of $u$, the coefficient of each power of $u$ must
separately equal zero and therefore

$$s(s - 1) a_0 = 0$$
$$(s + 1) s a_1 = 0$$
$$(s + p)(s + p - 1) a_p = \left[(s + p - 2)(s + p - 1) - n(n + 1)\right] a_{p-2}$$

If $s = 0$ the first two of these conditions are satisfied and the third condition becomes
the recursion formula

$$a_p = -\frac{(n - p + 2)(n + p - 1)}{p(p - 1)} a_{p-2} \tag{D.4}$$

The solution to (D.2) can then be written

$$g = a_0\left[1 - \frac{n(n+1)}{2!} u^2 + \frac{n(n-2)(n+1)(n+3)}{4!} u^4 - \cdots\right] + a_1\left[u - \frac{(n-1)(n+2)}{3!} u^3 + \frac{(n-1)(n-3)(n+2)(n+4)}{5!} u^5 - \cdots\right] \tag{D.5}$$

For non-integral values of $n$ both of the series in (D.5) converge except at $u = \pm 1$.
Since one series is odd and the other even, they represent linearly independent solutions
of (D.2) so that (D.5) is a general solution provided that $|u| < 1$. Nothing further
is added by choosing $s = \pm 1$ since each choice leads to one of the series in (D.5).
If $n$ is an even integer, it is clear that the first series in (D.5) terminates and is thus
a polynomial, whereas if $n$ is an odd integer the second series reduces to a polynomial.
If the arbitrary constants $a_0$ and $a_1$ are adjusted so as to give these polynomials the
value unity when $u = 1$, the Legendre polynomials are obtained, the first few of which
are:
Po(u) == 1
P1(u) == u == cos ()
P 2 (u) == tu 2 - i == t cos 2() + t
P 3 (u) == ju 3 - !u == t cos 3() + i cos ()
These polynomials can also be generated from Rodrigues' formula
1 dn
Pn(u) == - nn! (u 2 - l)n
(D.6)
2
dun
which may be verified by expansion.
For n an integer, the nonterminating series in (D.5), with the constant suitably
adjusted, is known asthe Legendre function of the second kind, Qn(U). These functions
are characterized by singularities at u == 1 and must be excluded from the solutions
of physical problems in regions containing the polar axis. They will not be considered
in this appendix.
The Legendre polynomials $P_n(u)$ defined above satisfy (D.2), which may be written in the form

$$\frac{d}{du}\left[(1 - u^2)\frac{dP_n}{du}\right] + n(n+1)P_n = 0 \tag{D.7}$$

If this equation is differentiated m times with respect to u one obtains

$$(1 - u^2)\frac{d^2 h}{du^2} - 2(m+1)u\frac{dh}{du} + [n(n+1) - m(m+1)]\,h = 0 \tag{D.8}$$

in which

$$h(u) = \frac{d^m P_n}{du^m}$$

When one lets $h(u) = (1 - u^2)^{-m/2} f_2(u)$, Equation (D.8) transforms into (D.1). Thus

$$f_2(u) = P_n^m(u) = (1 - u^2)^{m/2}\frac{d^m P_n(u)}{du^m} \tag{D.9}$$

is a solution to the associated Legendre Equation (D.1). The functions

$$P_n^m(u) = \frac{(1 - u^2)^{m/2}}{2^n n!}\frac{d^{n+m}}{du^{n+m}}(u^2 - 1)^n \tag{D.10}$$

are known as the associated Legendre functions of the first kind. Since $P_n$ is a polynomial of order n, it follows that $P_n^m(u) = 0$ for m > n.
It is obvious that the functions $P_n^0(u)$ are identical with the polynomials $P_n(u)$ previously listed. If one uses (D.9),

$$P_1^1(u) = (1 - u^2)^{1/2} = \sin\theta$$
$$P_2^1(u) = 3u(1 - u^2)^{1/2} = \tfrac{3}{2}\sin 2\theta$$
$$P_3^1(u) = \tfrac{3}{2}(5u^2 - 1)(1 - u^2)^{1/2} = \tfrac{3}{8}\sin\theta + \tfrac{15}{8}\sin 3\theta$$
$$P_2^2(u) = 3(1 - u^2) = \tfrac{3}{2} - \tfrac{3}{2}\cos 2\theta$$
$$P_3^2(u) = 15u(1 - u^2) = \tfrac{15}{4}\cos\theta - \tfrac{15}{4}\cos 3\theta$$
$$P_3^3(u) = 15(1 - u^2)^{3/2} = \tfrac{45}{4}\sin\theta - \tfrac{15}{4}\sin 3\theta$$

A second generating function for the Legendre polynomials is given by the expression

$$f(u,t) = [1 - (2ut - t^2)]^{-1/2}$$

which can be expanded into the series (cf. Mathematical Supplement)

$$f(u,t) = 1 + \frac{(\tfrac{1}{2})}{1!}(2ut - t^2) + \frac{(\tfrac{1}{2})(\tfrac{3}{2})}{2!}(2ut - t^2)^2 + \cdots + \frac{(\tfrac{1}{2})(\tfrac{3}{2})\cdots[(2n-1)/2]}{n!}(2ut - t^2)^n + \cdots$$

If this is rearranged as a power series in t one obtains

$$f(u,t) = 1 + ut + \frac{3u^2 - 1}{2}t^2 + \frac{5u^3 - 3u}{2}t^3 + \cdots$$

and the coefficients of the different powers of t are recognized to be the Legendre polynomials, so that

$$(1 - 2ut + t^2)^{-1/2} = \sum_{n=0}^{\infty} t^n P_n(u) \tag{D.11}$$

Differentiation of (D.11) with respect to t gives

$$\frac{u - t}{(1 - 2ut + t^2)^{3/2}} = \sum_{n=0}^{\infty} n t^{n-1} P_n(u)$$

which can be written

$$(u - t)\sum_{n=0}^{\infty} t^n P_n(u) = (1 - 2ut + t^2)\sum_{n=0}^{\infty} n t^{n-1} P_n(u) \tag{D.12}$$

Equating coefficients of $t^n$, one determines that

$$(n+1)P_{n+1}(u) - (2n+1)uP_n(u) + nP_{n-1}(u) = 0 \tag{D.13}$$

This recurrence relation will permit the determination of any Legendre polynomial if
two successive ones are known.
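As a concrete illustration of this remark, the sketch below (an addition, with the hypothetical function name `legendre_coefficients`) starts from $P_0 = 1$ and $P_1 = u$ and applies the three-term recurrence repeatedly, holding each polynomial as a list of exact coefficients in ascending powers of u.

```python
from fractions import Fraction

def legendre_coefficients(n_max):
    """Return [P_0, ..., P_n_max] as ascending coefficient lists, built with (D.13)."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        # (n + 1) P_{n+1} = (2n + 1) u P_n - n P_{n-1}
        shifted = [Fraction(0)] + polys[n]               # coefficients of u * P_n
        padded = polys[n - 1] + [Fraction(0)] * 2        # P_{n-1}, padded to equal length
        polys.append([((2 * n + 1) * s - n * q) / (n + 1)
                      for s, q in zip(shifted, padded)])
    return polys[: n_max + 1]

polys = legendre_coefficients(5)
for n, coeffs in enumerate(polys):
    print(n, [str(c) for c in coeffs])
```

Because every Legendre polynomial equals unity at u = 1, the coefficients in each list should sum to one, which gives a quick consistency check on the recurrence.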

Differentiation of (D.11) with respect to u yields

$$\frac{t}{(1 - 2ut + t^2)^{3/2}} = \sum_{n=0}^{\infty} t^n P_n'(u) \tag{D.14}$$

which can be rearranged as

$$t\sum_{n=0}^{\infty} t^n P_n(u) = (1 - 2ut + t^2)\sum_{n=0}^{\infty} t^n P_n'(u)$$

The coefficient of $t^n$ gives

$$P_{n-1}(u) = P_n'(u) - 2uP_{n-1}'(u) + P_{n-2}'(u) \tag{D.15}$$

Knowledge of the derivatives of two successive Legendre polynomials will thus permit determination of any other through the use of (D.15).
Alternatively, (D.14) can be rearranged with the aid of (D.12) to give

$$t\sum_{n=0}^{\infty} n t^{n-1} P_n(u) = (u - t)\sum_{n=0}^{\infty} t^n P_n'(u)$$

which yields the recursion formula

$$nP_n(u) = uP_n'(u) - P_{n-1}'(u) \tag{D.16}$$

from which the derivative of any Legendre polynomial can be determined if one polynomial and its derivative are known.
Combination of (D.15) and (D.16) delivers the useful differentiation formula

$$(1 - u^2)\frac{dP_n}{du} = nP_{n-1}(u) - nuP_n(u) \tag{D.17}$$

Recurrence relations for the associated Legendre functions follow readily with the aid of (D.10). Two of the more important formulas are

$$(n - m + 1)P_{n+1}^m - (2n+1)uP_n^m + (n + m)P_{n-1}^m = 0 \tag{D.18}$$

$$(1 - u^2)\frac{dP_n^m}{du} = (n + m)P_{n-1}^m - nuP_n^m \tag{D.19}$$

One of the most useful properties of the Legendre polynomials is their orthogonality in the interval −1 ≤ u ≤ 1. This can be established by returning to the differential equation (D.7). The two polynomials $P_l(u)$ and $P_n(u)$ satisfy this equation in the forms

$$\frac{d}{du}[(1 - u^2)P_l'(u)] + l(l+1)P_l(u) = 0 \tag{D.20}$$

$$\frac{d}{du}[(1 - u^2)P_n'(u)] + n(n+1)P_n(u) = 0 \tag{D.21}$$

Upon multiplying (D.20) and (D.21) by $P_n(u)$ and $P_l(u)$ respectively, subtracting, and then integrating from −1 to +1, one obtains

$$(l - n)(l + n + 1)\int_{-1}^{1} P_l(u)P_n(u)\,du = -\Big[(1 - u^2)[P_n(u)P_l'(u) - P_l(u)P_n'(u)]\Big]_{-1}^{1} \tag{D.22}$$

in which the right side of (D.22) has been achieved through integration by parts and vanishes at both limits. Therefore

$$\int_{-1}^{1} P_l(u)P_n(u)\,du = 0 \qquad l \ne n \tag{D.23}$$

and the Legendre polynomials are orthogonal.


To determine the value of this integral if l = n, the generating function (D.11) can be used. Squaring both sides and integrating with respect to u gives

$$\int_{-1}^{1}(1 - 2ut + t^2)^{-1}\,du = \int_{-1}^{1}[P_0(u) + tP_1(u) + \cdots + t^n P_n(u) + \cdots]^2\,du$$

which becomes

$$\left[-\frac{1}{2t}\ln(1 + t^2 - 2ut)\right]_{-1}^{1} = \sum_{n=0}^{\infty} t^{2n}\int_{-1}^{1} P_n^2(u)\,du$$

with the reduction of the right side occurring by virtue of (D.23). Insertion of the limits yields

$$\frac{1}{t}\ln\frac{1 + t}{1 - t} = \sum_{n=0}^{\infty}\frac{2t^{2n}}{2n+1} = \sum_{n=0}^{\infty} t^{2n}\int_{-1}^{1} P_n^2(u)\,du$$

in which the logarithmic function has been replaced by its series expansion. Equating coefficients of like powers of t, one obtains

$$\int_{-1}^{1} P_n^2(u)\,du = \frac{2}{2n+1} \tag{D.24}$$

The associated Legendre functions $P_l^m$ and $P_n^m$, which satisfy

$$\frac{d}{du}\left[(1 - u^2)\frac{dP_l^m}{du}\right] = -\left[l(l+1) - \frac{m^2}{1 - u^2}\right]P_l^m \tag{D.25}$$

$$\frac{d}{du}\left[(1 - u^2)\frac{dP_n^m}{du}\right] = -\left[n(n+1) - \frac{m^2}{1 - u^2}\right]P_n^m \tag{D.26}$$

are also orthogonal in the same interval. This can be established by a repetition of the foregoing procedure. If (D.25) and (D.26) are multiplied by $P_n^m$ and $P_l^m$ respectively, the difference taken, and the result integrated, the result is that

$$\int_{-1}^{1} P_l^m(u)P_n^m(u)\,du = 0 \qquad l \ne n \tag{D.27}$$

The normalization integral is

$$\int_{-1}^{1}[P_n^m(u)]^2\,du = \int_{-1}^{1}(1 - u^2)^m\frac{d^m P_n}{du^m}\frac{d^m P_n}{du^m}\,du$$

which reduces to

$$\int_{-1}^{1}[P_n^m(u)]^2\,du = -\int_{-1}^{1}\frac{d^{m-1}P_n}{du^{m-1}}\frac{d}{du}\left[(1 - u^2)^m\frac{d^m P_n}{du^m}\right]du \tag{D.28}$$

after integration by parts. If in (D.8) one replaces m by m − 1 and multiplies through by $(1 - u^2)^{m-1}$, there results

$$\frac{d}{du}\left[(1 - u^2)^m\frac{d^m P_n}{du^m}\right] = -(n - m + 1)(n + m)(1 - u^2)^{m-1}\frac{d^{m-1}P_n}{du^{m-1}}$$

Substitution of this expression in (D.28) gives

$$\int_{-1}^{1}[P_n^m(u)]^2\,du = (n + m)(n - m + 1)\int_{-1}^{1}(1 - u^2)^{m-1}\frac{d^{m-1}P_n}{du^{m-1}}\frac{d^{m-1}P_n}{du^{m-1}}\,du = (n + m)(n - m + 1)\int_{-1}^{1}[P_n^{m-1}(u)]^2\,du \tag{D.29}$$

with the aid of (D.28). Use of the reduction formula (D.29) yields

$$\int_{-1}^{1}[P_n^m(u)]^2\,du = \frac{(n + m)!}{(n - m)!}\int_{-1}^{1}[P_n^0(u)]^2\,du$$

Finally, through the use of (D.24),

$$\int_{-1}^{1} P_n^m(u)P_l^m(u)\,du = \frac{2}{2n+1}\frac{(n + m)!}{(n - m)!}\,\delta_{ln} \tag{D.30}$$

This result is of considerable importance since it provides the opportunity to expand a function $f_2(u)$ in terms of associated Legendre polynomials with the coefficients individually determinable from (D.30). This technique greatly facilitates the solution of many boundary value problems.
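The orthogonality and normalization results can be confirmed exactly for small indices. The sketch below is an illustrative addition: it relies on the fact that, by (D.9), the product $P_l^m P_n^m$ equals $(1 - u^2)^m$ times two polynomial derivatives, so the integral over −1 ≤ u ≤ 1 can be evaluated with rational arithmetic. All function names are assumptions made for this example.

```python
from fractions import Fraction
from math import factorial

def legendre(n):
    """Ascending coefficients of P_n(u), built from the recurrence (D.13)."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        shifted = [Fraction(0)] + p
        padded = p_prev + [Fraction(0), Fraction(0)]
        p_prev, p = p, [((2 * k + 1) * s - k * q) / (k + 1)
                        for s, q in zip(shifted, padded)]
    return p

def multiply(a, b):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def derivative(a, times=1):
    """Differentiate a polynomial the requested number of times."""
    for _ in range(times):
        a = [k * c for k, c in enumerate(a)][1:] or [Fraction(0)]
    return a

def integrate(a):
    """Exact integral over -1 <= u <= 1; odd powers contribute nothing."""
    return sum(2 * c / (k + 1) for k, c in enumerate(a) if k % 2 == 0)

def associated_inner_product(l, n, m):
    """Integral of P_l^m P_n^m, with integrand (1 - u^2)^m P_l^(m) P_n^(m) from (D.9)."""
    weight = [Fraction(1)]
    for _ in range(m):
        weight = multiply(weight, [Fraction(1), Fraction(0), Fraction(-1)])
    product = multiply(derivative(legendre(l), m), derivative(legendre(n), m))
    return integrate(multiply(weight, product))

def expected_norm(n, m):
    """Right side of (D.30) for l = n."""
    return Fraction(2 * factorial(n + m), (2 * n + 1) * factorial(n - m))

print(associated_inner_product(3, 3, 2), expected_norm(3, 2))
print(associated_inner_product(1, 3, 1))
```

Setting m = 0 reduces the same routine to a check of (D.23) and (D.24).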

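APPENDIX E

COMPOSITION OF GENERAL SOURCES

LET A general current density distribution $\boldsymbol{\iota}(x,y,z,t)$ be represented by a fourfold Fourier integral such that a component may be written

$$\iota_x(x,y,z,t) = \int_{-\infty}^{\infty} g_1(k_x)e^{jk_x x}\,dk_x \int_{-\infty}^{\infty} g_2(k_y)e^{jk_y y}\,dk_y \int_{-\infty}^{\infty} g_3(k_z)e^{jk_z z}\,dk_z \int_{-\infty}^{\infty} g_4(\omega)e^{j\omega t}\,d\omega \tag{E.1}$$

This may also be expressed as

$$\iota_x(x,y,z,t) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g_x(k_x,k_y,k_z,\omega)\,e^{j(\omega t + k_x x + k_y y + k_z z)}\,dk_x\,dk_y\,dk_z\,d\omega \tag{E.2}$$

in which $g_x = g_1 g_2 g_3 g_4$. Proceeding in this manner for all three components, one is able to represent the current density in the form

$$\boldsymbol{\iota}(x,y,z,t) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \mathbf{g}(k_x,k_y,k_z,\omega)\,e^{j(\omega t + k_x x + k_y y + k_z z)}\,dk_x\,dk_y\,dk_z\,d\omega \tag{E.3}$$

with $\mathbf{g} = \mathbf{1}_x g_x + \mathbf{1}_y g_y + \mathbf{1}_z g_z$.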
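Similarly, a general charge density distribution $\rho(x,y,z,t)$ may be represented by

$$\rho(x,y,z,t) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(k_x,k_y,k_z,\omega)\,e^{j(\omega t + k_x x + k_y y + k_z z)}\,dk_x\,dk_y\,dk_z\,d\omega \tag{E.4}$$

The integrands of (E.3) and (E.4) are connected by the continuity equation, $\nabla\cdot\boldsymbol{\iota} = -\dot{\rho}$, which gives

$$f = -\frac{1}{\omega}(\mathbf{k}\cdot\mathbf{g}) \tag{E.5}$$

wherein $\mathbf{k} = \mathbf{1}_x k_x + \mathbf{1}_y k_y + \mathbf{1}_z k_z$.
If the fictitious charge and current densities in the interval $(d\mathbf{k}, d\omega)$ are treated as an independent entity which satisfies the flow equation $\boldsymbol{\iota} = \rho\mathbf{v}$, then the velocity of these fictitious charges is

$$\mathbf{v}(\mathbf{k},\omega) = \frac{\mathbf{g}}{f} = -\frac{\omega\,\mathbf{g}}{\mathbf{k}\cdot\mathbf{g}} \tag{E.6}$$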

This velocity is independent of x, y, z, and t and is therefore a common velocity shared by all the charges which give rise to the $(\mathbf{k},\omega)$ current and charge waves. In a coordinate system traveling at the velocity v with respect to XYZ, these charges are at rest. As k and ω are permitted to range over their complete spectra of values, (E.6) indicates that all values of v will be encountered in the interval 0 ≤ v < ∞. One may conclude from this that arbitrary static charge distributions in all Lorentzian frames may be combined to give the most general time-varying spatial distributions of current and charge density in a particular Lorentzian frame.
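A short numerical sketch may make (E.5) and (E.6) more tangible. It is an addition rather than part of the original development; the function name `fictitious_velocity` and the sample values of k, ω, and g are arbitrary assumptions chosen only so that k·g is nonzero.

```python
import math

def fictitious_velocity(k, omega, g):
    """Charge amplitude f from (E.5) and the common velocity v from (E.6).

    k and g are 3-tuples (wave vector and current-density amplitude);
    omega is the angular frequency. Requires k . g != 0.
    """
    k_dot_g = sum(ki * gi for ki, gi in zip(k, g))
    f = -k_dot_g / omega                   # continuity equation, (E.5)
    v = tuple(gi / f for gi in g)          # flow equation g = f v, (E.6)
    return f, v

k = (2.0, 0.0, 1.0)          # rad/m, illustrative
omega = 3.0e9                # rad/s, illustrative
g = (1.0, 0.5, -0.25)        # arbitrary amplitude
f, v = fictitious_velocity(k, omega, g)

speed = math.sqrt(sum(c * c for c in v))
k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
print(v, speed, k_dot_v)
```

Whatever g is chosen, the resulting velocity satisfies k·v = −ω, and for this particular component the speed exceeds that of light, which is the situation addressed in the paragraph that follows.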
Because the range of v is unrestricted, some of these fictitious charge distributions are traveling through XYZ at speeds greater than light. This requires use of the Lorentz transformation equations when v > c. Even though the transformation is then nonphysical, this is mathematically admissible in the sense that all physical laws formally transform properly under a Lorentz transformation regardless of the value of v/c; it should also be recalled that the charge densities in the interval $(d\mathbf{k}, d\omega)$ are fictitious. No intimation is intended that the real time-varying charges, which are the sum of these fictitious static charge densities, are traveling at speeds in excess of c.
As a corollary of the above result, a steady current distribution $\boldsymbol{\iota}'(x',y',z')$ in X'Y'Z' may be decomposed into static charge distributions in all other Lorentzian frames. For this reason the most general sources $\boldsymbol{\iota}(x,y,z,t)$ and $\rho(x,y,z,t)$ in XYZ may also be built up from static charge distributions $\rho'(x',y',z')$ plus static current distributions $\boldsymbol{\iota}'(x',y',z')$ in all other Lorentzian coordinate systems X'Y'Z'.

APPENDIX F

GENERALIZATION OF THE FIELD TRANSFORMATION EQUATIONS

IN SECTION 5.2 it was established that, if the most general system of time-independent sources existed in X'Y'Z', time-varying fields E(x,y,z,t) and B(x,y,z,t) would be experienced in XYZ, with these fields contributing to the force on a moving charge q in accordance with the Lorentz force law (5.8). The relations between the static fields E', B' in X'Y'Z' and the time-varying fields E, B in XYZ were given by (5.5) and (5.9).
Imagine now a third Cartesian coordinate system X*Y*Z*, with its axes respectively aligned to those of X'Y'Z', and with the X* axis sliding along the X' axis in the negative direction at a constant speed u*. Through use of the velocity transformation equations (2.50), X*Y*Z* is seen to be in motion with respect to XYZ at a velocity $\mathbf{1}_x U$ given by

$$\mathbf{1}_x U = \mathbf{1}_x\frac{u - u^*}{1 - uu^*/c^2} \tag{F.1}$$

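In a parallel development to what was presented in Section 5.2, an observer O*, stationary in X*Y*Z*, may define time-varying fields E*(x*,y*,z*,t*) and B*(x*,y*,z*,t*) by the relations

$$E_y^* = \frac{E_y' + u^* B_z'}{[1 - (u^*)^2/c^2]^{1/2}} \qquad E_z^* = \frac{E_z' - u^* B_y'}{[1 - (u^*)^2/c^2]^{1/2}} \tag{F.2}$$

$$B_y^* = \frac{B_y' - (u^*/c^2)E_z'}{[1 - (u^*)^2/c^2]^{1/2}} \qquad B_z^* = \frac{B_z' + (u^*/c^2)E_y'}{[1 - (u^*)^2/c^2]^{1/2}} \tag{F.3}$$

Elimination of the primed field components from among (5.5), (5.9), (F.2), and (F.3) yields, with U the velocity of X*Y*Z* relative to XYZ from (F.1),

$$E_y = \frac{E_y^* + UB_z^*}{(1 - U^2/c^2)^{1/2}} \qquad E_z = \frac{E_z^* - UB_y^*}{(1 - U^2/c^2)^{1/2}} \tag{F.4}$$

$$B_y = \frac{B_y^* - (U/c^2)E_z^*}{(1 - U^2/c^2)^{1/2}} \qquad B_z = \frac{B_z^* + (U/c^2)E_y^*}{(1 - U^2/c^2)^{1/2}} \tag{F.5}$$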

But these are the same transformation equations as (5.5) and (5.9) and thus the earlier result has more general validity, linking fields as seen in two different Lorentzian frames when the fields are time-varying in both frames. Of course, the sources which gave rise to these fields are of a restricted class, being time-independent in X'Y'Z'. However, time-independent sources in still another frame X''Y''Z'' would lead to the same result (F.4) and (F.5). If time-independent sources in an infinite variety of Lorentzian frames are superimposed, the most general time-varying sources can be created for observers O and O*. By superposition, the total fields experienced by these two observers will still satisfy (F.4) and (F.5). Thus these field transformation equations have a completely general validity.
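The elimination described in this appendix can be checked numerically. The sketch below is an illustrative addition which assumes the standard form of the transverse field transformation for relative motion along the X axis (the form in which (F.4) and (F.5) appear); the function name, the chosen speeds, and the field values are all arbitrary assumptions.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def seen_from_frame_where_source_frame_moves(fields, u):
    """Transverse fields observed in a frame relative to which the given frame
    moves at +u along x. fields = (Ey, Ez, By, Bz) in the moving frame."""
    ey, ez, by, bz = fields
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return (gamma * (ey + u * bz),
            gamma * (ez - u * by),
            gamma * (by - u * ez / C ** 2),
            gamma * (bz + u * ey / C ** 2))

u, u_star = 0.6 * C, 0.3 * C
primed = (1.0, -2.0, 3.0e-9, 1.0e-9)      # static fields in X'Y'Z' (arbitrary)

in_xyz = seen_from_frame_where_source_frame_moves(primed, u)        # role of (5.5), (5.9)
in_star = seen_from_frame_where_source_frame_moves(primed, u_star)  # role of (F.2), (F.3)

# Velocity of X*Y*Z* relative to XYZ, as in (F.1).
big_u = (u - u_star) / (1.0 - u * u_star / C ** 2)
via_star = seen_from_frame_where_source_frame_moves(in_star, big_u)  # role of (F.4), (F.5)

print(in_xyz)
print(via_star)
```

The two printed tuples should agree to rounding error: transforming the primed fields directly into XYZ gives the same result as transforming first into X*Y*Z* and then into XYZ with the composite velocity.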

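APPENDIX G

REDUCTION OF THE VECTOR GREEN'S FORMULA FOR E

IN CHAPTER 5 the vector Green's theorem was used to establish the relation (5.39), namely

$$\int_{V'}(\mathbf{E}\cdot\nabla_S\times\nabla_S\times\psi\mathbf{a} - \psi\mathbf{a}\cdot\nabla_S\times\nabla_S\times\mathbf{E})\,dV = \int_{S_1\cdots S_N,\Sigma}(\psi\mathbf{a}\times\nabla_S\times\mathbf{E} - \mathbf{E}\times\nabla_S\times\psi\mathbf{a})\cdot\mathbf{1}_n\,dS$$

This equation can be transformed in the following manner: Using the vector identity (V.113) one may write

$$\nabla_S\times\nabla_S\times\psi\mathbf{a} = \nabla_S(\nabla_S\cdot\psi\mathbf{a}) - \nabla_S^2\psi\mathbf{a} \tag{G.1}$$

However,

$$\nabla_S\cdot\psi\mathbf{a} = \psi\nabla_S\cdot\mathbf{a} + \mathbf{a}\cdot\nabla_S\psi = \mathbf{a}\cdot\nabla_S\psi \tag{G.2}$$

since a is a constant vector. Also

$$\nabla_S^2\psi\mathbf{a} = \mathbf{a}\nabla_S^2\psi = -k^2\psi\mathbf{a} \tag{G.3}$$

because ψ satisfies the scalar wave equation $\nabla_S^2\psi + k^2\psi = 0$. Thus

$$\nabla_S\times\nabla_S\times\psi\mathbf{a} = \nabla_S(\mathbf{a}\cdot\nabla_S\psi) + k^2\psi\mathbf{a} \tag{G.4}$$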

Using both (G.4) and (5.3F, one obtains

VS X Vs X ifia -

ifia'

Vs X V s X

Vs(a'

Vsifi)

a'

(jw ifi

/L;l)

Further use of the identity (V.I07) gives

E V s (a V sf) == V s [E (a V sf)] - (a V sf) V s E

== V s [E(a Vsf)] - .!!- a V sf


E:o

so that the left side of (5.39) becomes

{VS. [E(a Vsf)] ..eo (a Vs1/;) + a (jwifi ~l)} dV


~o

= a'

J(jwifi

v'

~l

/-Lo

_!:. VSifi) dV -

a'

E:o

in which the divergence theorem has been employed,

J
81 ... SN,'1:

(1 n

E)VsifidS

(G.5)


The constant vector a may also be taken out in front of the integral sign on the right
side of (5.39). Since, with the aid of (V.l09) and the triple scalar product, one can write

[E X (v s X ~a)] in == [E X (Vsl/; X a)J 1 n == [(I n X E) X Vsl/;] a


[fax V s X E] in == - jwl/;(a X B) 1n == jwfa (in x B)
it follows that

f
S1 ...

(1/;a X Vs X E - E X Vs X 1/;a) in dS
SN,~

= -a'

[jw1/;(l n X B) - (in X E) X V s1/;] dS

(G.G)

81 ... SN,L-

But (G.5) and (G.6) are equal to each other, being modified forms of the left and right sides of (5.39). And since this is true for any arbitrary constant vector a, it follows that the integrals themselves must be equal. Thus

f(jw1/;
v:

~1

fJ.o

~ Vs1/;) dV
EO

[(In' E)V s1/;

81 .. SN

= [

(in X E) X Vs1/; - jw1/;(l n X B)] dB

[(1" E)V s1/;

(in X E) X Vs1/; - jw1/;(1" X B)] dS

(G.7)

where, for convenience, the surface integral over the sphere ~ has been split off.
Consider this integral. On the surface of the sphere 2: one has

(Vs1/;). = 1.,

[~ C-;k)

= -in (jk

De-;k.

(G.8)

If dQ is the element of solid angle subtended at P by a surface clement dS on 2:, then


the right side of (G. 7) call be written

=! r 41T"

(1)
+5

(in E)ln jk

j k fJ

e-0- -

u. X E)

X in

(1)
+8
jk

j kb

e-0-

jkO

e- ]
- ,iw(l n X B) -0- 02 dn
41T"

- se!"

f [jk(l n E)l n + jk(l n X E) X in


o

jw(l n X B)] dQ

41T"

- e-

jk

[(1,. E)1"

(in X E) X in] dQ

(G.D)

Since both integrals in (G.g) are well-behaved at P, it follows that

f E dQ = - E(P) f dQ = -471'E(P)
41T"

lim I = - lim e0---+0

0---+0

jk

41T"

(G .10)

Next consider the limit, as δ → 0, of the left side of (G.7). Obviously the surface integrals are well-behaved because P is restricted to be a point within V and thus not on any of the bounding surfaces $S_i$. As V' → V, the volume integral is also well-behaved. To see this, spherical coordinates may be introduced centered at P. Then dV = r² sin θ dr dθ dφ. Since ψ and $\nabla_S\psi$ contain terms involving only the inverse first and second powers of the distance from P, the contribution of the volume element at P to the volume integral in (G.7) is finite. Therefore, the limiting value of (G.7) is
E(x,Y,z)

~
J (!!... VS1f' 41r V Eo
+~
41r

J
S1'"

SlY

jW1f'

~1) dV

fJ.o

[(In E)V s1f'

(In X E) X VS1f' - jW1f'(ln X B)] as

(C.Il)

in which (x,y,z) are the coordinates of the point P, and it is to be remembered that a time factor $e^{j\omega t}$ has been suppressed.

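APPENDIX H

THE WAVE EQUATIONS FOR A AND Φ

IN CHAPTER 5 the potential functions A and Φ were introduced by the defining relations

$$\mathbf{A}(x,y,z,t) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathfrak{r})}}{4\pi\mu_0^{-1}\mathfrak{r}}\,dV \tag{H.1}$$

$$\Phi(x,y,z,t) = \int_V \frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathfrak{r})}}{4\pi\epsilon_0\mathfrak{r}}\,dV \tag{H.2}$$

with $\mathfrak{r}$ the distance from the source point (ξ,η,ζ) to the field point (x,y,z).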

Upon taking the divergence of (H.i) one obtains


(H.3)
since \ is not a function of (x,Y,z). But
1. V F

(t;-;H) =

-1. V s

(e-;k f)

e-~kf V S' 1

_ VS'

Ce~jkr)

in which use has been made of (V.107). If V s \ is replaced by -jwp in accordance with
(5.34), (H.3) may be writ.ten
VF

. f

:=:

-Jw

p(~,1],r)ej(wt-kn
-1

41rJ.Lo

dV -

\(~,1],r)ej(wt-kO
-1

41rJ.Lo ~

dS

(H.4)

after the divergence theorem has been employed. Since S may be made large enough to encompass all the sources without containing any of them in its surface, the second integral in (H.4) vanishes and one is left with the conclusion that

$$\nabla\cdot\mathbf{A} = -\frac{j\omega}{c^2}\Phi = -\frac{1}{c^2}\dot{\Phi} \tag{H.5}$$

Through application of the Fourier integral, if $\boldsymbol{\iota}$ and ρ are general functions of time, one sees that

$$\mathbf{A}(x,y,z,t) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta,t - \mathfrak{r}/c)}{4\pi\mu_0^{-1}\mathfrak{r}}\,dV \tag{H.6}$$

$$\Phi(x,y,z,t) = \int_V \frac{\rho(\xi,\eta,\zeta,t - \mathfrak{r}/c)}{4\pi\epsilon_0\mathfrak{r}}\,dV \tag{H.7}$$

these integrals being natural extensions of (H.1) and (H.2). Because linear superposition has been employed, it follows that A and Φ as given by (H.6) and (H.7) also satisfy (H.5). Further, the fields E and B arising from the sources $\boldsymbol{\iota}(\xi,\eta,\zeta,t)$ and $\rho(\xi,\eta,\zeta,t)$ satisfy

$$\mathbf{E} = -\nabla\Phi - \dot{\mathbf{A}} \tag{H.8}$$

$$\mathbf{B} = \nabla\times\mathbf{A} \tag{H.9}$$

these equations being restatements of (5.67) and (5.68) but for the more general potential functions (H.6) and (H.7).
If one takes the divergence of (H.8) and the curl of (H.9), the result is

$$\nabla\cdot\mathbf{E} = -\nabla^2\Phi - \nabla\cdot\dot{\mathbf{A}} = \frac{\rho}{\epsilon_0}$$

$$\nabla\times\mathbf{B} = \nabla\times\nabla\times\mathbf{A} = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} + \frac{1}{c^2}\dot{\mathbf{E}}$$

which, with the aid of (H.5) and (H.8), become

$$\nabla^2\Phi - \frac{1}{c^2}\ddot{\Phi} = -\frac{\rho}{\epsilon_0} \tag{H.10}$$

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{H.11}$$

Thus both A and Φ satisfy the same type of differential equation, the solutions being given formally by (H.6) and (H.7).

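APPENDIX I

VECTOR WAVE SOLUTIONS IN SPHERICAL COORDINATES

CONSIDER the vector function

$$\mathbf{G} = \mathbf{r}\times\nabla\Psi = -\nabla\times(\mathbf{r}\Psi) \tag{I.1}$$

in which Ψ satisfies the scalar wave equation

$$\nabla^2\Psi + k^2\Psi = 0 \tag{I.2}$$

It is desired to show that G satisfies the vector wave equation

$$\nabla^2\mathbf{G} + k^2\mathbf{G} = 0 \tag{I.3}$$

the entire discussion being confined to spherical coordinates. Since

$$\nabla\Psi = \mathbf{1}_r\frac{\partial\Psi}{\partial r} + \mathbf{1}_\theta\frac{1}{r}\frac{\partial\Psi}{\partial\theta} + \mathbf{1}_\phi\frac{1}{r\sin\theta}\frac{\partial\Psi}{\partial\phi}$$

it follows that

$$\mathbf{G} = -\mathbf{1}_\theta\frac{1}{\sin\theta}\frac{\partial\Psi}{\partial\phi} + \mathbf{1}_\phi\frac{\partial\Psi}{\partial\theta} \tag{I.4}$$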

Using Equation (4Jj4), which gives the Laplacian of a vector in spherical coordinates,
one finds that

yr2G == 1 0 [ \72 ( - -1- -a'l!)


sin e ae/>

But
and

\72 ( - sin

a'lJ)
e a</>

\72

1'2

de

sin"

+
1
sin

== -

(a'1J)

~
ae

a
e ae/>

(\'I2'l')

10

- .SIn

10

= -.- SIn

which was to be proved.

e ae/>

1 [\72

and therefore (1.5) becomes


V'2G

2 cos e a 2'lJ]
r 2 sin? e de a</>

+ - -1 - -a'lJ -

e dcj>

e a</>

(a'l') _ 2 cos e a '1t


de
1'2 sin 3 e ae/>2

(\7 2'1')

1'2

1
sin 2

(\7 2'1')

1'2

2 cos e a 2'1'
sin" e ae act>

1_
sin 2 e

a'lJ]
ae

(1..5)

- - - - -1 - a'l'

1'2

a'l' +

e de

1'2

sin"

e ae/>

2'l'

2 cos 8
1'2 sin 3 e de/>2

1 - (\1 2'1')

de

(k 2'lt ) - 14> -- (k 2'1' ) == - k 2G

de

(1.6)

APPENDIX J

GREEN'S FUNCTIONS FOR RECTANGULAR WAVEGUIDE

IN SECTION 5.15 it was shown that a useful formulation for TM modes in any cylindrical waveguide resulted from the use of a Green's function $G_1$ which satisfied the conditions:
e- j k r
1. G1 = 01 -
r
2. G 1 == 0 on S
3. 01 is regular in V and on S and satisfies

for all (u,v,s) on S and any (x,y,z) in V.


4. G1 satisfies
(J.l)

wherein $P_F$ and $P_S$ are the field point (x,y,z) and the source point (ξ,η,ζ) respectively, δ is the Dirac delta function, V is the volume interior to the waveguide, and S is the interior surface of the waveguide.
For the rectangular waveguide shown in the first figure of Example 5.9, assume

$$G_1 = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{2}{\sqrt{ab}}\sin\frac{m\pi\xi}{a}\sin\frac{n\pi\eta}{b}\,F_{mn}(\zeta,x,y,z) \tag{J.2}$$

This is a reasonable starting assumption, since each term in (J.2) separately satisfies the boundary condition $G_1 = 0$ on S. All other terms in the complete Fourier series in ξ and η do not satisfy this boundary condition.
If (.1.2) is substituted into the inhomogeneous wave equation (J.l), one obtains

LL[(-

m 7r) 2

2 . 1n7r~ . n7r1]
+ (n1r)
-b 2] --==
SIn SIn F
vab
a
b

LL
+ k LL
2

nl7r~

mn

n7r1]

a2F mn

--== SIn sIn - - vab


a
b ar 2

2
. 1n7r~ . n7rl1
SIn sin Fm n
vab
a
b
_ j-

47ro(PF - P s )

(J.3)

Let

$$\mu_{mn}^2 = \left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2 \tag{J.4}$$

and let

$$\beta_{mn} = (k^2 - \mu_{mn}^2)^{1/2} \tag{J.5}$$

be the third or fourth quadrant root, in which k, the wave number, has a slight negative imaginary component to account for losses which attenuate a wave as it travels along the waveguide. The pertinent phasors are shown in Figure J.1.

FIGURE J.1 Propagation phasors.

Substitution of (J.5) in (J.3) gives

II(-a2Far- +
mn

Multiplication by

f3 mnF mn

(2/V ab) sin

a2F r s +
-ar
2

R
IJrs

2
'7n7i~
n1r'Yl
_ r::L sin sin a
b

v ab

==

41rO(PF - P s )

T1r~/ a sin s1r'Yl/b and integration over (a,b) gives

rs

==

87f

T7fX

S7fY

-==
vi ab sin - a sin - b o(z - ~r)

(J.6)

Thus
2

. T7fX . S7fY
sIn vab
a
b

(J.7)

== brs(r,z) _ j - SIn -

Frs

and

Gl

==

~ 4

m=ln=l

1n7f~

n7f'Yl.

n17fX

n7fY

sin - - sin SIn - - SIn bmn(r,z)


ab
a
b
a
b

~ -

(J.8)


with $b_{mn}(\zeta,z)$ satisfying

(J.9)
Lagrange's method of variation of parameters¹ will be used to solve (J.9). Assume

(J.IO)
Then

(J.II)

Let

(J.12)

an d t h en

(J2bmn

ds2- =

dVl

as

-J~

fJmn

'/J

e- JIJ mll!

+ J~' -(JV2,
ar eJ8mflt

fJmn

fJmn

'/J,.

Vle-JlJmt,)

Pmn

'/J"

V2e JlJ mll)

(.J.13)

Substitution in (J.g) gives


.
-J~

fJmn

Equations (J.12) and (J.14)

aVl.
e-J8mnt

as

111USt

av! = !47TO(Z 0_

ar

e--

+ JQ.

aV2.

fJmn

ar eJ8mnt

= 41ro(z ,~r)

be satisfied simultaneously by

eit1mnt

j{3mn ei

j{3mn!

- j{3mn e- it1mnt

411"o(z -

{3 m" f

ej {3 mnt
j{3mn e j {3 mnf

Vi

and V2. Thus

s)e j{3mt,f

u:

(J.14)

(J.15)

Similarly,
411"o(z - r)e- j {3 mnt

aV2

ar

2.i{3mn

(J.16)

When one recalls that (3mn is a third or fourth quadrant root, (J.IO) requires that

Vl(r,Z)

== 0
:=

V2(r,Z)
For

r > z,

C1(z)

== 0
==

=0

C2 (Z)
1

r<z
r> z
r>z
r<z

r<z

¹ See, e.g., I. S. Sokolnikoff and R. M. Redheffer, Mathematics of Physics and Modern Engineering, Section 28, McGraw-Hill Book Company, New York, 1958.

Similarly,

r>

==0
Therefore, if one uses (J.10),
41T"

r>

- - e-Jf3mn(r-z)

2j{3mn

- -41T"- e- J.{1mn( z-

2j{3mn

>r

Of,

(J.17)

for all rand z. When this result is substituted in (J.8), the Green's function G 1 is found
to be

(J.18)

in which
2

V;mn(X,Y) == ---==

Vab

n1/TrX

SIn -

n7rY

SIn -

(J .19)

By a similar process, if one defines the quantity

$$\Psi_{mn}(x,y) = \frac{2}{\sqrt{ab}}\cos\frac{m\pi x}{a}\cos\frac{n\pi y}{b}$$

it may be shown that

LL
<Xl

G2

47r

(J.20)

<Xl

m=On=O

The proof of (J .21) is left as an exercise.

(J.21)
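The selection of the third or fourth quadrant root for $\beta_{mn}$ is easy to get wrong in numerical work, so a small sketch is added here. It assumes that (J.5) has the form $\beta_{mn} = (k^2 - \mu_{mn}^2)^{1/2}$ with $\mu_{mn}$ from (J.4); the function name `beta_mn`, the guide dimensions, the frequency, and the loss factor are illustrative assumptions only.

```python
import cmath
import math

def beta_mn(k, m, n, a, b):
    """Root of k^2 - mu_mn^2 with non-positive imaginary part."""
    mu_sq = (m * math.pi / a) ** 2 + (n * math.pi / b) ** 2     # (J.4)
    root = cmath.sqrt(k * k - mu_sq)
    # Keep the third- or fourth-quadrant root so exp(-j*beta*|z - zeta|) decays.
    return -root if root.imag > 0 else root

a, b = 22.86e-3, 10.16e-3                                  # guide cross section in metres
k = 2 * math.pi * 20e9 / 299_792_458.0 * (1 - 1e-4j)       # slightly lossy wave number

beta_11 = beta_mn(k, 1, 1, a, b)     # above cutoff: nearly real
beta_33 = beta_mn(k, 3, 3, a, b)     # below cutoff: nearly negative imaginary
decay = abs(cmath.exp(-1j * beta_33 * 0.01))               # amplitude factor over 1 cm

print(beta_11, beta_33, decay)
```

In the lossless limit the principal square root of a negative real number lies on the positive imaginary axis, which is why the explicit sign flip is retained rather than relying on the library's branch choice.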

APPENDIX K

THE AVERAGE ELECTROSTATIC FIELD INTENSITY INSIDE A SPHERE CONTAINING AN ARBITRARY DIPOLE DISTRIBUTION

WITH REFERENCE to Figure K.1, let $q_1$ be one of the charges inside a spherical volume of radius $r_0$, and for convenience choose the orientation of the zenith axis so that it passes through the site of $q_1$. Then if $\bar{\mathbf{E}}_1$ is the average value throughout the spherical volume V of the electrostatic field due to $q_1$, it is apparent from symmetry that $\bar{\mathbf{E}}_1$ has only a Z component. First one wishes to calculate

$$\bar{E}_{1z} = \frac{1}{V}\int_V E_{1z}\,dV \tag{K.1}$$

that is, the average field due to the single charge $q_1$.

FIGURE K.1 Average field inside a sphere.

With $r_1$ the distance from $q_1$ to the center of the sphere, let the volume V be divided into two parts by the concentric spherical surface of radius $r_1$. Then if (x,y,z) is any point within V, at a distance $\mathfrak{r}$ from $q_1$, from the figure,

$$\mathfrak{r}^2 = r^2 + r_1^2 - 2rr_1\cos\theta$$

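which leads to the expansions

$$\frac{1}{\mathfrak{r}} = \frac{1}{r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad r < r_1$$

$$\frac{1}{\mathfrak{r}} = \frac{1}{r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$

(cf. Example 3.24). The potential due to the point charge $q_1$ can therefore be expressed as

$$\Phi_1 = \frac{q_1}{4\pi\epsilon_0 r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad r < r_1$$

$$\Phi_1 = \frac{q_1}{4\pi\epsilon_0 r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$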

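Then since

$$E_{1z} = -\cos\theta\,\frac{\partial\Phi_1}{\partial r} + \frac{\sin\theta}{r}\frac{\partial\Phi_1}{\partial\theta}$$

it follows, with the aid of (D.13) and (D.17), that

$$E_{1z} = -\frac{q_1}{4\pi\epsilon_0}\sum_{n=1}^{\infty}\frac{n\,r^{n-1}}{r_1^{n+1}}P_{n-1}(\cos\theta) \qquad r < r_1$$

$$E_{1z} = \frac{q_1}{4\pi\epsilon_0}\sum_{n=0}^{\infty}\frac{(n+1)\,r_1^n}{r^{n+2}}P_{n+1}(\cos\theta) \qquad r > r_1$$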

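The orthogonality relation (D.23) gives

$$\int_{-1}^{1} P_n(\cos\theta)P_0(\cos\theta)\,d(\cos\theta) = 0 \qquad n \ne 0$$

and thus, since

$$\bar{E}_{1z} = \frac{3}{4\pi r_0^3}\int_V E_{1z}\,r^2\sin\theta\,dr\,d\theta\,d\phi = \frac{3}{2r_0^3}\int_0^{r_0} r^2\,dr\int_{-1}^{1} E_{1z}P_0(\cos\theta)\,d(\cos\theta)$$

one may conclude that

$$\bar{E}_{1z} = -\frac{q_1 r_1}{4\pi\epsilon_0 r_0^3} \tag{K.2}$$

Had the charge $q_1$ been chosen to lie in any other direction than the zenith, the answer would have been

$$\bar{\mathbf{E}}_1 = -\frac{q_1\mathbf{r}_1}{4\pi\epsilon_0 r_0^3} \tag{K.3}$$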

If additionally a charge $-q_1$ is at the position $\mathbf{r}_1'$, the average electric field due to both charges is

$$\bar{\mathbf{E}} = -\frac{q_1(\mathbf{r}_1 - \mathbf{r}_1')}{4\pi\epsilon_0 r_0^3} = -\frac{q_1\mathbf{d}_1}{4\pi\epsilon_0 r_0^3} \tag{K.4}$$

in which $\mathbf{d}_1$ is the directed displacement from $-q_1$ to $+q_1$. But $q_1\mathbf{d}_1 = \mathbf{p}_1$ is the dipole moment of the charge pair $(q_1, -q_1)$ and thus, if there are M dipoles contained within the spherical volume V, they cause an average field throughout the volume given by

$$\bar{\mathbf{E}} = -\frac{1}{4\pi\epsilon_0 r_0^3}\sum_{n=1}^{M}\mathbf{p}_n \tag{K.5}$$
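The single-charge result (K.2), on which (K.5) rests, can be checked by brute-force integration. The following sketch is an addition to the text: it uses a simple midpoint rule in r and θ (the grid sizes are arbitrary assumptions), works in units where $q_1/4\pi\epsilon_0 = 1$, and should be regarded as a rough numerical check accurate to a few percent at best, not as a precise quadrature.

```python
import math

def average_ez(r1, r0=1.0, n_r=400, n_theta=400):
    """Midpoint-rule volume average of E_z over a sphere of radius r0 for a unit
    charge on the zenith axis at distance r1 from the center (q / 4 pi eps0 = 1)."""
    dr, dth = r0 / n_r, math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            th = (j + 0.5) * dth
            z, rho = r * math.cos(th), r * math.sin(th)
            dist = math.sqrt(rho * rho + (z - r1) ** 2)      # distance from the charge
            ez = (z - r1) / dist ** 3                         # z component of Coulomb field
            total += ez * 2.0 * math.pi * r * r * math.sin(th) * dr * dth
    return total / (4.0 / 3.0 * math.pi * r0 ** 3)

numeric = average_ez(0.5)
predicted = -0.5 / 1.0 ** 3      # (K.2) with q1 / (4 pi eps0) = 1, r1 = 0.5, r0 = 1
print(numeric, predicted)
```

The integrand is singular at the charge, but the singularity is integrable and the midpoint grid never samples it exactly, so the crude rule is adequate for a sanity check.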

APPENDIX L

THE DYNAMIC MACROSCOPIC SCALAR POTENTIAL FUNCTION DUE TO A VOLUME OF POLARIZED DIELECTRIC MATERIAL

CONSIDER a charge pair (q, −q) within a macroscopic element dV of a specimen of dielectric material, and imagine that their relative separation is d cos ωt, with d the maximum separation. Let the relative displacement be parallel to a direction characterized by a unit vector $\mathbf{1}_z$, and let the instantaneous position of q be $z_1$, the instantaneous position of −q be $z_2$. With reference to Figure L.1, if $q\,\delta(\zeta_1 - z_1)\,d\zeta_1$ and $-q\,\delta(\zeta_2 - z_2)\,d\zeta_2$ are taken to be the distributions of the two moving charges with δ the Dirac delta function, then the scalar potential at a distant point, due to this charge pair, is

in which the braces indicate that time-retarded values are to be used. If neither charge
makes a great excursion about its central point, so that r > IZII, r > IZ21, then
cI>(r,(},l) == _q [jet) 8eSl 47ro

== 47ro
~

Since ZI - Z2

==

eo

r -

(1

{z I} cos (} - t -

T -

jet) 8eS2 - {Z2}) d S2 J


T - S2 cos (}

{zd) dS l _

S1 cos (}

_ co

1)e

{z 2} cos

q cos (}
- - -2 ({ZI} - {Z2})
47ror

d cos wt, if d

X, as is usually the case, then

lzd - lzzl

dcosw(t

-~)

If p is defined as having a magnitude qd and a direction parallel to $\mathbf{1}_z$, then

$$\Phi(r,\theta,t) = \frac{\mathbf{p}\cdot\mathbf{r}}{4\pi\epsilon_0 r^3}\cos\omega\left(t - \frac{r}{c}\right) \tag{L.1}$$

with r drawn from the oscillating dipole to the distant point.

FIGURE L.1 Oscillating dipole.

This result is seen to be similar to the static case except that now the dipole moment is oscillatory at angular frequency ω and retarded time must be used to deduce the scalar potential.
Upon letting P dV represent the sum of the dipole moments due to all the charge pairs within dV, one may write

$$\Phi(x,y,z,t) = \int_V \frac{\mathbf{P}(\xi,\eta,\zeta,t - \mathfrak{r}/c)\cdot\boldsymbol{\mathfrak{r}}}{4\pi\epsilon_0\mathfrak{r}^3}\,dV \tag{L.2}$$
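in which $\boldsymbol{\mathfrak{r}}$ is drawn from (ξ,η,ζ) to (x,y,z). Since $\nabla_S(1/\mathfrak{r}) = \boldsymbol{\mathfrak{r}}/\mathfrak{r}^3$, use of the vector identity

$$\nabla_S\cdot\left[\frac{\{\mathbf{P}\}}{\mathfrak{r}}\right] = \frac{\nabla_S\cdot\{\mathbf{P}\}}{\mathfrak{r}} + \{\mathbf{P}\}\cdot\nabla_S\left(\frac{1}{\mathfrak{r}}\right)$$

converts (L.2) to the form

$$\Phi(x,y,z,t) = \int_S \frac{\{\mathbf{P}\}\cdot d\mathbf{S}}{4\pi\epsilon_0\mathfrak{r}} + \int_V \frac{-\nabla_S\cdot\{\mathbf{P}\}}{4\pi\epsilon_0\mathfrak{r}}\,dV \tag{L.3}$$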

in which the divergence theorem has been used to obtain the first integral, and {P} is
the retarded value of P.
Equation (L.3) is seen to be similar to the static result (6.8) except that now P is
time-harmonic and retarded time must be used. This dynamic result is applicable at
interior points as well as exterior points. The proof follows the procedure used in Section
6.3 and requires that the radius of the sphere erected around an internal point be small
compared to a wavelength. This is normally a reasonable assumption.
Although the derivation just given is only applicable to electronic and ionic polarization, Equation (L.3) is also valid for orientational polarization. The proof of this assertion is left as an exercise.

APPENDIX M

THE DAMPING CONSTANT OF A FREELY OSCILLATING DIPOLE

UNDER THE assumption that Equation (6.80) properly describes the motion of the center of gravity of a freely oscillating electron cloud of total charge −e and mass m, if the Z axis is aligned with the displacement, one may write

$$m\dot{z}\ddot{z} + az\dot{z} = -2b\dot{z}^2 \tag{M.1}$$

which yields

$$\frac{d}{dt}\left(\tfrac{1}{2}m\dot{z}^2 + \tfrac{1}{2}az^2\right) = -2b\dot{z}^2 \tag{M.2}$$

Therefore $P = 2b\dot{z}^2$ is the instantaneous time rate of energy loss by the atom through radiation to its surroundings.
Solution of (6.80) gives

$$z(t) = z_0 e^{-(b/m)t}\cos\omega_0 t \tag{M.3}$$

wherein $z_0$ is the initial displacement of the electron cloud and

$$\omega_0 = \sqrt{\frac{a}{m} - \left(\frac{b}{m}\right)^2} \tag{M.4}$$

is the natural frequency of oscillation. From this it follows that

$$P = 2bz_0^2 e^{-(2b/m)t}\left(\frac{b}{m}\cos\omega_0 t + \omega_0\sin\omega_0 t\right)^2 \tag{M.5}$$

If $2b/m\omega_0 \ll 1$ (this will subsequently be shown to be the case), then the decay is very small during one cycle, and the energy lost by the oscillating atom in one cycle at the time t is

$$W = \frac{2bz_0^2}{\omega_0}e^{-(2b/m)t}\int_0^{2\pi}\left(\frac{b}{m}\cos\omega_0 t + \omega_0\sin\omega_0 t\right)^2 d(\omega_0 t) = \frac{2\pi bz_0^2}{\omega_0}\left[\left(\frac{b}{m}\right)^2 + \omega_0^2\right]e^{-(2b/m)t} \approx 2\pi b\omega_0 z_0^2 e^{-(2b/m)t} \tag{M.6}$$

The radiation field associated with this emission of energy may be deduced from the magnetic vector potential function

$$\mathbf{A} = \mathbf{1}_z\frac{-e\{\dot{z}\}}{4\pi\mu_0^{-1}r} \tag{M.7}$$
in which $\{\dot{z}\}$ is the retarded value of dz/dt and $z_0$ is assumed to be very small compared to the wavelength $\lambda = 2\pi c/\omega_0$. Upon computing B in spherical coordinates from $\mathbf{B} = \nabla\times\mathbf{A}$, one finds that only $B_\phi$ contains a term with an $r^{-1}$ dependency, and that this term is given by

$$B_\phi = -\frac{ez_0\sin\theta}{4\pi\mu_0^{-1}cr}e^{-(b/m)(t - r/c)}\left\{\left[\left(\frac{b}{m}\right)^2 - \omega_0^2\right]\cos\omega_0\left(t - \frac{r}{c}\right) + 2\omega_0\frac{b}{m}\sin\omega_0\left(t - \frac{r}{c}\right)\right\} \tag{M.8}$$
Through use of Poynting's theorem, the instantaneous density of power flow across a spherical surface of large radius r, centered at the dipole, is

$$\mathcal{P} = \mu_0^{-1}cB_\phi^2$$

so that the total power crossing the surface at time t is

$$\frac{dW}{dt} = \frac{e^2z_0^2}{6\pi\mu_0^{-1}c}e^{-(2b/m)(t - r/c)}\left\{\left[\left(\frac{b}{m}\right)^2 - \omega_0^2\right]\cos\omega_0\left(t - \frac{r}{c}\right) + 2\omega_0\frac{b}{m}\sin\omega_0\left(t - \frac{r}{c}\right)\right\}^2$$

Through further use of the assumption that $2b/m\omega_0 \ll 1$, the energy which crosses the surface in one cycle at the time t is therefore

$$W \approx \frac{e^2z_0^2\omega_0^3}{6\mu_0^{-1}c}e^{-(2b/m)(t - r/c)} \tag{M.9}$$

But this should equal the energy which left the dipole during one cycle at the earlier time t − r/c. That energy can be found by using retarded time in (M.6). When this is done and the two expressions are compared, one finds that

$$b = \frac{e^2\omega_0^2}{12\pi\mu_0^{-1}c} = \frac{e^2\omega_0^2}{12\pi\epsilon_0c^3} \tag{M.10}$$

To check the validity of an earlier assumption, this may be written in the form

$$\frac{2b}{m\omega_0} = \frac{e^2\omega_0}{6\pi\epsilon_0mc^3} \sim 10^{-5}$$

in which the high value $\omega_0 = 10^{17}$ has been used. Thus the assumption that $2b/m\omega_0 \ll 1$ is entirely justified and (M.10) is a good approximation to the value of the damping constant.
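The size of the ratio $2b/m\omega_0$ can be evaluated directly from (M.10). The sketch below is an addition; it assumes SI values for the electron charge and mass, for $\epsilon_0$, and for c, and the function name `damping_constant` is illustrative.

```python
import math

E_CHARGE = 1.602176634e-19      # electron charge magnitude, C
E_MASS = 9.1093837015e-31       # electron mass, kg
EPS0 = 8.8541878128e-12         # permittivity of free space, F/m
C = 299_792_458.0               # speed of light, m/s

def damping_constant(omega0):
    """b from (M.10): e^2 omega0^2 / (12 pi eps0 c^3)."""
    return E_CHARGE ** 2 * omega0 ** 2 / (12.0 * math.pi * EPS0 * C ** 3)

omega0 = 1.0e17                  # the high value of omega0 used in the text, rad/s
b = damping_constant(omega0)
ratio = 2.0 * b / (E_MASS * omega0)
print(b, ratio)
```

With these constants the ratio comes out somewhat below $10^{-6}$, so the condition $2b/m\omega_0 \ll 1$ holds with a wide margin even at this high frequency.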

APPENDIX
THE AVERAGE

MAGNETOSTATIC FIELD INTENSITY INSIDE A SPHERE

CONTAINING AN ARBITRARY

DISTRIBUTION OF CURRENT LOOPS

a source point (~,1J,s) defined by a position vector r', as shown in Figure


N.1, and a field point (x,Y,z) defined by a position vector r. Let the angles at which r'
and r point be (O',') and (O,) in conventional spherical coordinates and let I' be the
angle between r' and r. Then if ~ = Ir - r/], it follows from the law of cosines that
CONSIDER

[1'2

- =
~

+ (1")2

- 21'r' cos 1']-71~

(N.I)

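As in Example (3.24), this result may be expressed in terms of one or the other of the
expansions

$$\frac{1}{\xi} = \frac{1}{r} \sum_{n=0}^{\infty} \left(\frac{r'}{r}\right)^n P_n(\cos\gamma) \qquad r > r' \qquad (N.2)$$

$$\frac{1}{\xi} = \frac{1}{r'} \sum_{n=0}^{\infty} \left(\frac{r}{r'}\right)^n P_n(\cos\gamma) \qquad r < r' \qquad (N.3)$$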
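The equivalence of the closed form (N.1) and the Legendre expansions (N.2) and (N.3) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the truncation at 60 terms, and the sample values of $r$, $r'$, and $\gamma$ are all assumptions chosen for illustration.

```python
import math

def legendre(n_max, x):
    """Return [P_0(x), ..., P_n_max(x)] using Bonnet's recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]

def inverse_distance_series(r, r_src, gamma, n_max=60):
    """Truncated (N.2) when r > r_src, truncated (N.3) when r < r_src."""
    big, small = max(r, r_src), min(r, r_src)
    p = legendre(n_max, math.cos(gamma))
    return sum((small / big) ** n * p[n] for n in range(n_max + 1)) / big

def inverse_distance_exact(r, r_src, gamma):
    """Closed form (N.1) from the law of cosines."""
    return 1.0 / math.sqrt(r * r + r_src * r_src - 2.0 * r * r_src * math.cos(gamma))

# Illustrative values only (hypothetical): field point farther out than the source point.
r, r_src, gamma = 2.0, 0.5, 1.1
series_value = inverse_distance_series(r, r_src, gamma)
exact_value = inverse_distance_exact(r, r_src, gamma)
print(series_value, exact_value)
```

The series converges geometrically in the ratio of the smaller to the larger radius, so the truncation becomes slow as $r$ approaches $r'$; the sketch does not handle the case $r = r'$.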

However, the addition theorem for spherical harmonics gives¹

$$P_n(\cos\gamma) = P_n(\cos\theta)P_n(\cos\theta') + 2\sum_{m=1}^{n} \frac{(n-m)!}{(n+m)!}\, P_n^m(\cos\theta)P_n^m(\cos\theta')\cos[m(\phi - \phi')] \qquad (N.4)$$

so that both expansions may be written in terms of double spherical harmonics.


These results may be applied to the case of a filamentary current loop of radius $a$,
situated centrally in the XY plane, as depicted in Figure N.2. For all the source points,
$\theta' = \pi/2$ and the magnetic vector potential function due to this loop may be found at
$\phi = 0$ with no loss in generality, since the answer is $\phi$-symmetric. One obtains

$$A_\phi(r,\theta) = \frac{Ia}{4\pi\mu_0^{-1}} \int_0^{2\pi} \frac{\cos\phi'}{\xi}\, d\phi' \qquad (N.5)$$

which agrees with Example 4.6. Unlike that example, no approximations will be made
due to assumptions about the relative sizes of $r$ and $a$, but instead the expansions of
$\xi^{-1}$ will be employed to deduce exact expressions for $A_\phi$.

¹ See, e.g., J. D. Jackson, Classical Electrodynamics, pp. 67-69, John Wiley and Sons, Inc., New York,
1962.

FIGURE N.1 Source and field point geometry.

FIGURE N.2 Circular current loop at origin.

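For $r > a$,

$$A_\phi(r,\theta) = \frac{Ia}{4\pi\mu_0^{-1}}\,\frac{1}{r} \sum_{n=0}^{\infty} \left(\frac{a}{r}\right)^n \int_0^{2\pi} \cos\phi' \left[ P_n(\cos\theta)P_n(0) + 2\sum_{m=1}^{n} \frac{(n-m)!}{(n+m)!}\, P_n^m(\cos\theta)P_n^m(0)\cos m\phi' \right] d\phi'$$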

This reduces to
(N.6)

(N.7)
To evaluate the radial component of B, one needs

from which it follows that

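The $\theta$ component of $\mathbf{B}$ is

$$B_\theta = \frac{\pi a^2 I}{2\pi\mu_0^{-1} r^3} \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n-1} \frac{P_n^1(0)}{n+1}\, P_n^1(\cos\theta) \qquad r > a$$

$$B_\theta = -\frac{\pi a^2 I}{2\pi\mu_0^{-1} r^3} \sum_{n=1}^{\infty} \left(\frac{r}{a}\right)^{n+2} \frac{P_n^1(0)}{n}\, P_n^1(\cos\theta) \qquad r < a$$

For $r \gg a$, the $n = 1$ term dominates and is seen to give the same field as was found in
Example 4.6.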
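With these expressions for the magnetic field components, it is now possible to find
the average value of $\mathbf{B}$ throughout a spherical volume $V_\delta$. Referring to Figure N.3, let
the central point of $V_\delta$ lie in the XZ plane at the Cartesian position $(h,0,k)$, and let the
current loop (which is seen edge-on) lie entirely inside $V_\delta$. Then, since

$$\mathbf{B} = (B_r \sin\theta + B_\theta \cos\theta)(\mathbf{1}_x \cos\phi + \mathbf{1}_y \sin\phi) + (B_r \cos\theta - B_\theta \sin\theta)\mathbf{1}_z$$

one finds, for $r > a$,

$$\mathbf{B} = -\frac{\pi a^2 I}{2\pi\mu_0^{-1} r^3} \sum_{n=1}^{\infty} \left(\frac{a}{r}\right)^{n-1} P_n^1(0) \left[ (\mathbf{1}_x \cos\phi + \mathbf{1}_y \sin\phi)\left(\sin\theta\, P_n(\cos\theta) - \cos\theta\, \frac{P_n^1(\cos\theta)}{n+1}\right) + \mathbf{1}_z \left(\cos\theta\, P_n(\cos\theta) + \sin\theta\, \frac{P_n^1(\cos\theta)}{n+1}\right) \right] \qquad (N.8)$$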

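FIGURE N.3 Current loop inside a spherical volume.

whereas, for $r < a$,

$$\mathbf{B} = -\frac{\pi a^2 I}{2\pi\mu_0^{-1} r^3} \sum_{n=1}^{\infty} \left(\frac{r}{a}\right)^{n+2} P_n^1(0) \left[ (\mathbf{1}_x \cos\phi + \mathbf{1}_y \sin\phi)\left(\sin\theta\, P_n(\cos\theta) + \cos\theta\, \frac{P_n^1(\cos\theta)}{n}\right) + \mathbf{1}_z \left(\cos\theta\, P_n(\cos\theta) - \sin\theta\, \frac{P_n^1(\cos\theta)}{n}\right) \right] \qquad (N.9)$$

Since

$$\mathbf{B}_{av} = \frac{1}{\frac{4}{3}\pi\delta^3} \int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^{a} \mathbf{B}\, r^2 \sin\theta\, dr\, d\theta\, d\phi + \frac{1}{\frac{4}{3}\pi\delta^3} \int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_a^{r_1(\theta,\phi)} \mathbf{B}\, r^2 \sin\theta\, dr\, d\theta\, d\phi \qquad (N.10)$$

with $r_1(\theta,\phi)$ the distance from the center of the loop to a point on $S_\delta$, a study of (N.8)
and (N.9) reveals the following: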
1. The first integral of (N.10) does not contribute an X component nor a Y component to $\mathbf{B}_{av}$ because of the $\phi$ periodicity of (N.8).
2. The first integral of (N.10) does not contribute a Z component to $\mathbf{B}_{av}$ except for
the term $n = 1$ because $\sin\theta = -P_1^1(\cos\theta)$ and $\cos\theta = P_1(\cos\theta)$ and the orthogonality
relation (D.30) eliminates all other terms.
3. The second integral of (N.10) does not contribute to $\mathbf{B}_{av}$ whatsoever. This is
because

$$r_1(\theta,\phi) = h \sin\theta \cos\phi + k \cos\theta + \left[(h \sin\theta \cos\phi + k \cos\theta)^2 + \delta^2 - h^2 - k^2\right]^{1/2}$$

is even in $\phi$. If the $r$ integration is performed first, the resulting integrand factor must
contain only even terms in $\phi$, each of which is representable by spherical harmonics
whose $\phi$ integrations are zero except for $m = 0$. Even for the case $m = 0$, only the Z
component need be considered, so the problem is reduced to an evaluation of integrals
of the type

$$P_n^1(0) \int P_l(\cos\theta) \left[\cos\theta\, P_n(\cos\theta) + \sin\theta\, \frac{P_n^1(\cos\theta)}{n+1}\right] \sin\theta\, d\theta$$

However, inspection of the expression for $r_1(\theta,\phi)$ reveals that the $m = 0$ component is
accompanied by an even function of $\theta$, so the index $l$ must be even. Since $P_n^1(0)$ is zero
unless $n$ is odd, the term in square brackets in the above integrand is an even function
of $\theta$. Therefore, the entire integrand is odd and the integral is zero for all allowed values
of $n$ and $l$.

Because of these simplifications, (N.10) reduces to

$$\mathbf{B}_{av} = \frac{3}{4\pi\delta^3}\, \frac{\mathbf{1}_z \pi a^2 I}{2\pi\mu_0^{-1} a^3} \int_{-1}^{1}\!\!\int_0^{2\pi}\!\!\int_0^{a} \left\{[P_1(u)]^2 + [P_1^1(u)]^2\right\} r^2\, dr\, d\phi\, du = \frac{\mathbf{m}}{2\pi\mu_0^{-1}\delta^3} \qquad (N.11)$$

in which $\mathbf{m} = \mathbf{1}_z \pi a^2 I$ is the magnetic moment of the loop. This result is independent of
the position and orientation of the loop in $V_\delta$ and therefore, if a distribution of loops
exists in $V_\delta$, they contribute an average field in $V_\delta$ given by

$$\mathbf{B}_{av} = \frac{1}{2\pi\mu_0^{-1}\delta^3} \sum_{i=1}^{N} \mathbf{m}_i \qquad (N.12)$$

In the special but important case that the distribution of loops is uniform in a region
containing $V_\delta$, and of volume density $\mathbf{M}$, those loops within $V_\delta$ contribute an average
field throughout $V_\delta$ of amount

$$\mathbf{B}_{av} = \frac{\mathbf{M}(4\pi\delta^3/3)}{2\pi\mu_0^{-1}\delta^3} = \frac{2}{3}\, \frac{\mathbf{M}}{\mu_0^{-1}} \qquad (N.13)$$

This result includes the effects of those loops which are only partially in $V_\delta$. On the
average such loops are half within $V_\delta$ and half outside. Since only the integration over a
sphere of radius $a$ around a given loop contributes to $\mathbf{B}_{av}$, it follows that each partial
loop contributes to $\mathbf{B}_{av}$ according to that fraction of its "loop volume," $4\pi a^3/3$, which
is within $V_\delta$.

MATHEMATICAL SUPPLEMENT: PART I

TAYLOR'S SERIES

THE EXPANSION of functions into series representations is a commonly used and
effective analytical technique. In electromagnetic theory the function to be expanded
often depends on several variables, and thus it is desirable to develop such a technique
with adequate generality. Accordingly, this small supplement on series, after a brief
historical introduction, reviews several mean value theorems, derives Taylor's series for
functions of one variable, and then extends the result to cover multivariable functions.

S.1 * HISTORICAL SURVEY

The series expansion

$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \cdots$$

which bears his name was first enunciated by Brook Taylor (1685-1731) as early as
1712 in a letter to John Machin. Its first formal appearance was in his text Methodus
incrementorum directa et inversa, which was published in London in the period 1715-1717.
This text also contains the easy consequence now known as Maclaurin's series, but
Taylor's proof of these expansions did not consider convergence and is worthless. The
importance of these expansions was not appreciated by analysts for over a half century
until Lagrange pointed out their applicability, and no rigorous proof of Taylor's
theorem was offered until Cauchy included a remainder term and tested for convergence
in 1821.
Colin Maclaurin (1698-1746), though an able mathematician, is improperly credited
with authorship of the expansion

$$f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \cdots$$

which was contained in his Treatise of Fluxions published in Edinburgh in 1742. This
expansion is obviously a special case of Taylor's theorem, a point which was indicated
by Taylor 25 years earlier. Additionally, Maclaurin's expansion was apparently discovered independently by James Stirling and is contained in his paper Methodus
differentialis sive Tractatus de summatione et interpolatione serierum infinitarum, published in London in 1730. The greater fame of Maclaurin and the wider circulation of
his Treatise account for this miscredit.

* This section may be omitted without loss in continuity of the technical presentation.

S.2 MEAN VALUE THEOREMS

A discussion of Taylor's series builds on the base of several mean value theorems which
serve as lemmas. The first of these is the well-known

ROLLE'S THEOREM: Let $f(x)$ be a function of the real variable $x$ which possesses a continuous first derivative over the interval $x_1 \le x \le x_2$. Let $a$ and $b$ be two points within the
interval† for which $f(a) = f(b) = 0$. Then at least one value of $x$ can be found between $a$ and
$b$, say $x = \xi$, for which $f'(\xi) = 0$.

Proof: The truth of this theorem is almost self-evident from a geometric display of the
function such as shown in Figure S.1. If the function is to be zero at $a$ and at $b$, it cannot
be ever-increasing, nor can it be ever-decreasing in the interval between $a$ and $b$. Where
the function changes over from increasing to decreasing, the first derivative must vanish.

FIGURE S.1 Rolle's theorem.

Rolle's theorem can be employed to establish the

THEOREM OF MEAN VALUE: Let $f(x)$ and $g(x)$ be two functions of the real variable $x$ which
possess continuous first derivatives throughout the interval $x_1 \le x \le x_2$. Let $a$ and $b$ be any
two points within this interval such that $g(a) \ne g(b)$. If $g'(x)$ is nowhere zero in the interval,
then for some value $x = \xi$ between $a$ and $b$,

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)} \qquad (S.1)$$

Proof: Define a function $h(x)$ by the relation

$$h(x) = \frac{f(b) - f(a)}{g(b) - g(a)}\,[g(x) - g(a)] - [f(x) - f(a)]$$

† In this and all subsequent theorems of this supplement, $b$ can be either larger or smaller than $a$.

It can be observed that $h(x)$ is a function which satisfies all the requirements of Rolle's
theorem. It has a continuous first derivative in the interval and $h(a) = h(b) = 0$. Since

$$h'(x) = \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(x) - f'(x)$$

it follows that for some $x = \xi$,

$$h'(\xi) = 0 = \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(\xi) - f'(\xi)$$

which, upon rearrangement, yields the stated result.

A special case of this theorem of some importance occurs when $g(x) = x$. Then
Equation (S.1) reduces to

$$\frac{f(b) - f(a)}{b - a} = f'(\xi) \qquad (S.2)$$
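The content of (S.2) can be made concrete by locating the point $\xi$ numerically. The sketch below is an illustration only: the helper name, the choice $f(x) = x^3$ on the interval from 1 to 3, and the use of bisection are assumptions, and the routine presumes that $f'(x)$ minus the chord slope changes sign exactly once between $a$ and $b$.

```python
import math

def find_mean_value_point(f, df, a, b, tol=1e-12):
    """Bisect for xi between a and b where df(xi) equals the chord slope of f.

    Assumes df(x) - slope has opposite signs at the two end points.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = min(a, b), max(a, b)

    def g(x):
        return df(x) - slope

    if g(lo) * g(hi) > 0:
        raise ValueError("df - slope does not change sign on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical example: f(x) = x**3 between a = 1 and b = 3.
xi = find_mean_value_point(lambda x: x ** 3, lambda x: 3 * x ** 2, 1.0, 3.0)
print(xi)
```

For this choice the chord slope is 13, so the condition $3\xi^2 = 13$ can also be solved by hand, which gives a direct check on the bisection result.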

A significant generalization of the above theorem is embodied in the

EXTENDED THEOREM OF MEAN VALUE: Let $f(x)$ be any function of the real variable $x$
which, together with its first $n$ derivatives, is continuous in the interval $x_1 \le x \le x_2$. Let $a$
and $b$ be any two points within this interval. Then

$$f(b) = f(a) + \frac{b-a}{1!} f'(a) + \frac{(b-a)^2}{2!} f''(a) + \cdots + \frac{(b-a)^{n-1}}{(n-1)!} f^{(n-1)}(a) + \frac{(b-a)^n}{n!} f^{(n)}(\xi_n) \qquad (S.3)$$

in which $\xi_n$ is some point between $a$ and $b$.

Proof: If one makes use of Equation (S.2), there is a point $\xi_0$ between $a$ and $b$ for which

$$f(b) - f(a) - \frac{(b-a)}{1!} f'(\xi_0) = 0$$

Define a constant $K_2$ by the equation

$$f(b) - f(a) - \frac{(b-a)}{1!} f'(a) - \frac{(b-a)^2}{2!} K_2 = 0 \qquad (S.4)$$

and from this form the function

$$h(x) = f(x) - f(a) - \frac{(x-a)}{1!} f'(a) - \frac{(x-a)^2}{2!} K_2$$

The function $h(x)$ has a continuous first derivative in the interval, given by

$$h'(x) = f'(x) - f'(a) - (x-a)K_2$$

and since $h(a) = h(b) = 0$, Rolle's theorem applies. Thus there is a point $x = \xi_1$ between
$a$ and $b$ such that $h'(\xi_1) = 0$.
Furthermore, $h'(x)$ has a continuous first derivative in the interval, namely

$$h''(x) = f''(x) - K_2$$
and since $h'(a) = h'(\xi_1) = 0$, there must be a point $x = \xi_2$ between $a$ and $\xi_1$ for which

$$h''(\xi_2) = 0 = f''(\xi_2) - K_2$$

If this result is substituted in (S.4), one obtains

$$f(b) = f(a) + \frac{(b-a)}{1!} f'(a) + \frac{(b-a)^2}{2!} f''(\xi_2) \qquad (S.5)$$

A constant $K_3$ can next be defined by the relation

$$f(b) - f(a) - \frac{(b-a)}{1!} f'(a) - \frac{(b-a)^2}{2!} f''(a) - \frac{(b-a)^3}{3!} K_3 = 0 \qquad (S.6)$$

from which it follows by the above procedure that $K_3 = f'''(\xi_3)$, where $\xi_3$ lies between
$a$ and $b$. Continuing this process out to the $n$th derivative yields the result (S.3). The
last term of this series, namely

$$\frac{(b-a)^n}{n!} f^{(n)}(\xi_n)$$

is known as the remainder after $n$ terms. For the important case in which $f(x)$ is a
function with continuous derivatives of all orders, (S.3) becomes an infinite series as
$n \to \infty$. If the remainder goes to zero in this process, the series converges to the value
$f(b)$ and one may write

$$f(b) = \sum_{m=0}^{\infty} \frac{(b-a)^m}{m!} f^{(m)}(a) \qquad (S.7)$$
EXAMPLE S.1

If $f(x) = \sin x$, the remainder does go to zero and the expansion (S.7) is applicable. If one
lets $a = \pi/4$ and $b = \pi/6$, it follows that $f(a) = 1/\sqrt{2}$ and $f(b) = \tfrac{1}{2}$. (S.7) gives

$$f(b) = \frac{1}{\sqrt{2}} \left[ 1 - \frac{\pi}{12} - \frac{1}{2}\left(\frac{\pi}{12}\right)^2 + \frac{1}{6}\left(\frac{\pi}{12}\right)^3 + \cdots \right]$$

Use of only the first four terms of this series gives the approximation

$$f(b) \approx 0.4999$$

S.3 TAYLOR'S SERIES FOR ONE VARIABLE

If $f(x)$ and all its derivatives are continuous in the interval $x_1 \le x \le x_2$, and if $a$ and $x$
are any two points within this interval, it follows from (S.3) that

$$f(x) = f(a) + \frac{(x-a)}{1!} f'(a) + \frac{(x-a)^2}{2!} f''(a) + \cdots + \frac{(x-a)^{n-1}}{(n-1)!} f^{(n-1)}(a) + \frac{(x-a)^n}{n!} f^{(n)}(\xi_n) \qquad (S.8)$$

in which $\xi_n$ is some point between $a$ and $x$. If

$$\lim_{n \to \infty} \frac{(x-a)^n}{n!} f^{(n)}(\xi_n) = 0$$

for all $x$ within $[x_1, x_2]$, then

$$f(x) = \sum_{m=0}^{\infty} \frac{(x-a)^m}{m!} f^{(m)}(a) \qquad (S.9)$$

is a convergent series representation for $f(x)$, valid within the entire interval. (S.9) is
known as the Taylor's series expansion of $f(x)$ about the point $a$.
The special case of this result in which $a = 0$ is known as Maclaurin's series, and can
be written

$$f(x) = \sum_{m=0}^{\infty} \frac{x^m}{m!} f^{(m)}(0) \qquad (S.10)$$
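A truncated form of the Taylor's series about a point $a$ is easy to evaluate numerically, which gives a check on Example S.1. The sketch below is illustrative only: the function names are assumptions, and it relies on the standard identity that the $m$th derivative of $\sin x$ is $\sin(x + m\pi/2)$.

```python
import math

def sine_derivative(m, a):
    """m-th derivative of sin evaluated at a, using sin(a + m*pi/2)."""
    return math.sin(a + m * math.pi / 2)

def taylor_partial_sum(derivative, a, x, terms):
    """Sum of the first `terms` terms of the Taylor's series about a, evaluated at x."""
    return sum(
        (x - a) ** m / math.factorial(m) * derivative(m, a)
        for m in range(terms)
    )

a, b = math.pi / 4, math.pi / 6
four_terms = taylor_partial_sum(sine_derivative, a, b, 4)   # the truncation used in Example S.1
many_terms = taylor_partial_sum(sine_derivative, a, b, 20)  # effectively converged
print(round(four_terms, 4), many_terms)
```

Setting `a = 0.0` in the same routine evaluates the Maclaurin form of the series.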

Another useful form of Taylor's series results when $f(x + \Delta x)$ is expanded in a series
about the point $x$. A straight substitution in (S.9) gives

$$f(x + \Delta x) = \sum_{m=0}^{\infty} \frac{(\Delta x)^m}{m!} f^{(m)}(x) \qquad (S.11)$$

Both $x$ and $x + \Delta x$ must be within the interval $[x_1, x_2]$.

EXAMPLE S.2

Consider the function $f(x + \Delta x) = (x + \Delta x)^n$ in which $n$ is a positive integer. Then $f(x) = x^n$ and

$$f^{(m)}(x) = \frac{n!}{(n-m)!}\, x^{n-m} \qquad m \le n$$

$$f^{(m)}(x) = 0 \qquad m > n$$

Substitution in (S.11) gives

$$(x + \Delta x)^n = \sum_{m=0}^{n} \frac{n!}{m!(n-m)!}\, x^{n-m}(\Delta x)^m = x^n + n x^{n-1}(\Delta x) + \frac{n(n-1)}{2!}\, x^{n-2}(\Delta x)^2 + \cdots + n x (\Delta x)^{n-1} + (\Delta x)^n \qquad (S.12)$$

which can be recognized as the binomial expansion.
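The statement of Example S.2 can be checked by summing the series term by term. This is a minimal sketch with assumed function names and arbitrary sample values; it uses the fact that the $m$th derivative of $x^n$ is $n!\,x^{n-m}/(n-m)!$ for $m \le n$ and zero otherwise.

```python
import math

def power_derivative(n, m, x):
    """m-th derivative of x**n for a non-negative integer n."""
    if m > n:
        return 0.0
    return math.factorial(n) / math.factorial(n - m) * x ** (n - m)

def binomial_via_taylor(x, dx, n):
    """Evaluate (x + dx)**n from the Taylor's series in dx; terms beyond m = n vanish."""
    return sum(
        dx ** m / math.factorial(m) * power_derivative(n, m, x)
        for m in range(n + 1)
    )

# Arbitrary sample values chosen for illustration.
value = binomial_via_taylor(1.5, 0.25, 7)
print(value, (1.5 + 0.25) ** 7)
```

The coefficient multiplying $x^{n-m}(\Delta x)^m$ in each term is $n!/[m!(n-m)!]$, which is the binomial coefficient returned by `math.comb(n, m)`.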

S.4 TAYLOR'S SERIES FOR SEVERAL VARIABLES

The results of the previous section may be extended to functions of more than one
variable with little difficulty. Let $f(x,y)$ be any function which, together with all its
partial derivatives, is continuous in the interval $x_1 \le x \le x_2$, $y_1 \le y \le y_2$. If $(a,b)$ and
$(x,y)$ are any two points within this interval, then by Equation (S.9),

$$f(x,y) = \sum_{m=0}^{\infty} \frac{(x-a)^m}{m!}\, \frac{\partial^m f(a,y)}{\partial x^m} \qquad (S.13)$$

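But the functions of $y$ appearing on the right side of (S.13) also can be expanded in a
Taylor's series, namely,

$$\frac{\partial^m f(a,y)}{\partial x^m} = \sum_{n=0}^{\infty} \frac{(y-b)^n}{n!}\, \frac{\partial^{m+n} f(a,b)}{\partial x^m\, \partial y^n} \qquad (S.14)$$

so that

$$f(x,y) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \frac{(x-a)^m}{m!}\, \frac{(y-b)^n}{n!}\, \frac{\partial^{m+n} f(a,b)}{\partial x^m\, \partial y^n} \qquad (S.15)$$

All the series in (S.13), (S.14), and (S.15) must converge for all points in the interval
in order for this to be a valid procedure. When they do, (S.15) is known as a Taylor's
series expansion of $f(x,y)$ about the point $(a,b)$.
A useful alternative form of (S.15) arises when $f(x + \Delta x, y + \Delta y)$ is expanded in a
Taylor's series about $(x,y)$. Direct substitution in (S.15) gives

$$f(x + \Delta x, y + \Delta y) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \frac{(\Delta x)^m}{m!}\, \frac{(\Delta y)^n}{n!}\, \frac{\partial^{m+n} f(x,y)}{\partial x^m\, \partial y^n} \qquad (S.16)$$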
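The double series (S.16) can be exercised on a function whose mixed partial derivatives are known in closed form. The sketch below is an illustration under stated assumptions: the test function $f(x,y) = e^x \sin y$, the truncation of both sums at the same order, and all names are choices made here, not taken from the text.

```python
import math

def mixed_partial(m, n, x, y):
    """d^(m+n)/dx^m dy^n of exp(x)*sin(y): x-derivatives leave exp(x), y-derivatives shift the phase."""
    return math.exp(x) * math.sin(y + n * math.pi / 2)

def taylor_two_variables(x, y, dx, dy, order):
    """Truncate both sums of the two-variable Taylor's series at m, n <= order."""
    total = 0.0
    for m in range(order + 1):
        for n in range(order + 1):
            total += (
                dx ** m / math.factorial(m)
                * dy ** n / math.factorial(n)
                * mixed_partial(m, n, x, y)
            )
    return total

# Arbitrary expansion point and increments, for illustration only.
x, y, dx, dy = 0.3, 0.7, 0.2, -0.1
exact = math.exp(x + dx) * math.sin(y + dy)
first_order = taylor_two_variables(x, y, dx, dy, 1)
high_order = taylor_two_variables(x, y, dx, dy, 12)
print(exact, first_order, high_order)
```

Raising `order` shows the truncated double sum approaching the exact value, which is the convergence requirement the text places on the series.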

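Next, let $f(x,y,z)$ be any function which, together with all its partial derivatives, is
continuous in the interval $x_1 \le x \le x_2$, $y_1 \le y \le y_2$, $z_1 \le z \le z_2$. If $(a,b,c)$ and $(x,y,z)$
are any two points within this interval, then by (S.15),

$$f(x,y,z) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \frac{(x-a)^m}{m!}\, \frac{(y-b)^n}{n!}\, \frac{\partial^{m+n} f(a,b,z)}{\partial x^m\, \partial y^n} \qquad (S.17)$$

whereas from (S.9),

$$\frac{\partial^{m+n} f(a,b,z)}{\partial x^m\, \partial y^n} = \sum_{p=0}^{\infty} \frac{(z-c)^p}{p!}\, \frac{\partial^{m+n+p} f(a,b,c)}{\partial x^m\, \partial y^n\, \partial z^p} \qquad (S.18)$$

Combination of these results gives

$$f(x,y,z) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \sum_{p=0}^{\infty} \frac{(x-a)^m}{m!}\, \frac{(y-b)^n}{n!}\, \frac{(z-c)^p}{p!}\, \frac{\partial^{m+n+p} f(a,b,c)}{\partial x^m\, \partial y^n\, \partial z^p} \qquad (S.19)$$

When it is assumed that the necessary convergence conditions are met, (S.19) is known
as the Taylor's series expansion of $f(x,y,z)$ about the point $(a,b,c)$.
In an alternative form,

$$f(x + \Delta x, y + \Delta y, z + \Delta z) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \sum_{p=0}^{\infty} \frac{(\Delta x)^m}{m!}\, \frac{(\Delta y)^n}{n!}\, \frac{(\Delta z)^p}{p!}\, \frac{\partial^{m+n+p} f(x,y,z)}{\partial x^m\, \partial y^n\, \partial z^p} \qquad (S.20)$$

The extension of these results to functions of four or more variables follows the same
procedure and can be predicted by inspection.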
EXAMPLE S.3

In a vacuum triode, the plate current $i_b$ is a function of both the plate voltage $e_b$ and the
grid voltage $e_c$. In many applications the triode has a plate current which consists of a
time-independent, or d.c. component, and a time-varying component. The plate current
can then be expressed in the form

$$i_b = I_b + i_p$$

in which $I_b$ is the quiescent value and $i_p$ is the superimposed time-varying part. These two
component currents flow in response to the voltages $e_b = E_b + e_p$ and $e_c = E_c + e_g$, with
$(E_b,E_c)$ the quiescent portions and $(e_p,e_g)$ the time-varying portions. When Equation (S.16)
is applied to this situation, one obtains

$$I_b + i_p = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \frac{e_p^m}{m!}\, \frac{e_g^n}{n!}\, \frac{\partial^{m+n} i_b(E_b,E_c)}{\partial e_b^m\, \partial e_c^n}$$

If the triode is biased to operate in the linear portion of its characteristic, then all higher
order derivatives vanish and this expansion simplifies to

$$I_b + i_p = i_b(E_b,E_c) + e_p \frac{\partial i_b}{\partial e_b} + e_g \frac{\partial i_b}{\partial e_c} \qquad (S.21)$$

If one defines the plate conductance $g_p$ and transconductance $g_m$ by the relations

$$g_p = \frac{\partial i_b}{\partial e_b} \qquad g_m = \frac{\partial i_b}{\partial e_c}$$

the time-varying part of (S.21) can be written

$$i_p = g_p e_p + g_m e_g \qquad (S.22)$$

Equation (S.22) is the basis for a variety of equivalent circuits for the triode which are
distinguished by assumptions concerning the waveforms of the signal voltages and the
lumped elements placed in the grid and plate circuits.
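The small-signal linearization behind (S.22), commonly written $i_p = g_p e_p + g_m e_g$, can be illustrated numerically. Everything specific in the sketch below is an assumption made for illustration: the three-halves-power characteristic, the constants `k` and `mu`, the quiescent point, and the signal amplitudes are hypothetical and are not taken from the text. The partial derivatives are estimated by central differences at the quiescent point.

```python
def plate_current(e_b, e_c, k=1.5e-3, mu=20.0):
    """Assumed three-halves-power triode law; k and mu are made-up illustrative constants."""
    drive = e_c + e_b / mu
    return k * drive ** 1.5 if drive > 0 else 0.0

def small_signal_parameters(E_b, E_c, step=1e-3):
    """Central-difference estimates of g_p = di_b/de_b and g_m = di_b/de_c at the quiescent point."""
    g_p = (plate_current(E_b + step, E_c) - plate_current(E_b - step, E_c)) / (2 * step)
    g_m = (plate_current(E_b, E_c + step) - plate_current(E_b, E_c - step)) / (2 * step)
    return g_p, g_m

E_b, E_c = 250.0, -8.0          # hypothetical quiescent voltages
g_p, g_m = small_signal_parameters(E_b, E_c)

e_p, e_g = -1.0, 0.2            # hypothetical small signal voltages
i_p_linear = g_p * e_p + g_m * e_g
i_p_exact = plate_current(E_b + e_p, E_c + e_g) - plate_current(E_b, E_c)
print(g_p, g_m, i_p_linear, i_p_exact)
```

Because the assumed characteristic is not exactly linear, the two values of $i_p$ differ slightly; the discrepancy is the contribution of the higher order terms that the linear approximation discards.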

REFERENCES

1. Cajori, F., A History of Mathematics, 2d ed., pp. 226-229, The Macmillan Company, New
York, 1919.
2. Love, C. E., and E. D. Rainville, Differential and Integral Calculus, 6th ed., pp. 439-447,
The Macmillan Company, New York, 1962.
3. Smith, D. E., History of Mathematics, vol. 1, pp. 449-454, Ginn and Company, New York,
1923.

MATHEMATICAL SUPPLEMENT: PART II

VECTORS

VECTOR analysis is a major part of the mathematical language of electromagnetic
theory and occupies an equally important position in many fields of science. Because
of the varied backgrounds in vector analysis possessed by those embarking on a study
of electromagnetics, this supplement is intended to meet the needs of several groups of
people. For those well-versed in the subject, an orientation in the notation will be the
principal purpose. For those whose experience in vector analysis has been largely confined to Cartesian coordinate systems, the sections on generalized coordinates may prove
helpful, and perhaps some of the less commonly encountered integral theorems will be
worthy of attention. For those who are new to the subject, the supplement is designed
to cover all those vector topics needed in the exposition in the main part of this book.
A selection of problems is provided at the end as an aid to comprehension and manipulative skill.
Following an historical review, the supplement begins with a discussion of vector
algebra which includes developments of the dot and cross products and their applications to physical problems. Vector differentiation is introduced, after which generalized
coordinate systems are discussed. Gradient, divergence, and curl are then defined,
physically interpreted, and their general forms developed. Various integral theorems
are treated, notably the divergence theorem, Stokes' theorem, and Green's theorems.
The supplement contains a list of useful vector identities and concludes with a summary
of important vector relations in the principal coordinate systems.

V.1 * HISTORICAL SURVEY

Historically, the origins of the concept of a vector as a quantity possessing direction
as well as magnitude can be traced to early attempts to display the number system
pictorially. The notion of opposite directions was embodied in the representation of
positive and negative quantities as distances laid off to the right and left of a reference
point on a straight line. However, it is probably more meaningful to date the origin
of vector analysis from the work of John Wallis (1673) who used two successive directed
orthogonal displacements in a plane to represent the complex roots of quadratic
equations.
Wallis selected an origin on a horizontal axis and then laid off a distance to the right
or left of this origin algebraically proportional to the real part of the root. From the

* This section may be omitted without loss in continuity of the technical presentation.

point thus determined on the axis of reals, he then erected a perpendicular line, of
length proportional to the imaginary part of the root, and in one direction or the other,
depending on the sign of the imaginary part. In this way, Wallis achieved a one-to-one
correspondence between the points in a plane and the set of complex numbers; but
surprisingly, it did not occur to him to take the logical next step of introducing a vertical axis of imaginaries.
It remained for Caspar Wessel, a Norwegian surveyor, to take this step over a century later. In the modestly titled article On the Analytic Representation of Direction;
an Attempt, published in the Proceedings of the Royal Society of Denmark in 1799,
Wessel introduced an axis of imaginaries, constructed a directed line segment connecting the points $(0,0)$ and $(a,b)$, and then associated the terminal point of this directed
line segment with the complex number $a + jb$. After defining addition and subtraction
such that $(a,b) \pm (c,d) = (a \pm c, b \pm d)$, Wessel proceeded to define the product of
two complex numbers in terms of an operation on the two line segments. He decided
that the product should be a new line segment whose length is the product of the
lengths of the two original line segments, with the new line segment making an angle
with the axis of reals which is the sum of the angles which the two original line segments
make with the axis of reals. This construction is consistent with the expansion

$$(a + jb)(c + jd) = ac - bd + j(ad + bc)$$

Thus Wessel introduced all the features of what is commonly called an Argand diagram. Unfortunately he published his findings in a journal not widely read by scholars, and
most of the fame for this discovery went to J. R. Argand, who independently reached
similar conclusions in 1806.
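The agreement between Wessel's geometric rule (multiply the lengths, add the angles) and the algebraic expansion of $(a + jb)(c + jd)$ can be confirmed with a short computation. This is an illustrative sketch only; the function name and the sample numbers are assumptions.

```python
import cmath

def wessel_product(z1, z2):
    """Multiply two complex numbers geometrically: product of lengths, sum of angles."""
    r1, phi1 = cmath.polar(z1)
    r2, phi2 = cmath.polar(z2)
    return cmath.rect(r1 * r2, phi1 + phi2)

# Sample values chosen for illustration: a + jb = 3 + 4j, c + jd = 1 - 2j.
a, b, c, d = 3.0, 4.0, 1.0, -2.0
geometric = wessel_product(complex(a, b), complex(c, d))
algebraic = complex(a * c - b * d, a * d + b * c)   # (ac - bd) + j(ad + bc)
print(geometric, algebraic)
```

The two results agree to within floating-point rounding, since the polar route passes through trigonometric functions while the algebraic route does not.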
Wessel sought without success to extend his rotational multiplication method to
three dimensions, and the French mathematician Servois made a similar unrewarding
attempt 15 years later. This extension was finally accomplished by Sir William Rowan
Hamilton (1805-1865) after much fruitless toil, through his willingness to break with
tradition and discard the commutative law of multiplication.
Hamilton was a widely gifted man, accomplished in the classics and languages at
the age of thirteen, famous at twenty-seven for his mathematical prediction of conical
refraction, and celebrated at thirty for his fundamental work in dynamics. From this
point in his career, Hamilton proceeded to devote all his energy and talent for the
remainder of his life to the subject of quaternions, whose invention he announced before
a meeting of the Royal Irish Academy in 1843.
A quaternion $q$ may be written in the form

$$q = a_0 + a_1 \mathbf{1}_1 + a_2 \mathbf{1}_2 + a_3 \mathbf{1}_3$$

in which the $a_i$ are ordinary numbers and the $\mathbf{1}_i$ are fundamental units possessing direction as well as magnitude. A quaternion is thus, in essence, the sum of a scalar and a
vector, the latter consisting of three independent unidimensional components. Hamilton imposed the conditions that the fundamental unit vectors be subject to noncommutative rules of multiplication:

$$\mathbf{1}_2 \mathbf{1}_3 = -\mathbf{1}_3 \mathbf{1}_2 = \mathbf{1}_1$$
$$\mathbf{1}_3 \mathbf{1}_1 = -\mathbf{1}_1 \mathbf{1}_3 = \mathbf{1}_2$$
$$\mathbf{1}_1 \mathbf{1}_2 = -\mathbf{1}_2 \mathbf{1}_1 = \mathbf{1}_3$$

with the additional requirement that

$$\mathbf{1}_1^2 = \mathbf{1}_2^2 = \mathbf{1}_3^2 = -1$$
As a consequence of the distributive law, he was able to show that the product of two
quaternions is a quaternion. Two special cases of this general result are noteworthy:
First, the product of a quaternion with a vector can be made to result in any arbitrary
vector, by suitable choice of the coefficients in the quaternion. Thus the quaternion
proved to be an operator capable of rotating a vector to any new arbitrary direction
and altering its length by any prescribed factor. This was the goal Hamilton had been
seeking originally in his effort to extend Wessel's construction to three dimensions.
Second, the product of two vectors is a quaternion, the scalar part of which is the
negative of what is now called the dot product, and the vector part of which is now
called the cross product. This result is rich in physical applicability, a point which was
well made by Hamilton in his textbook Elements of Quaternions, published in 1866, the
year after his death. In this text Hamilton also presented another of his inventions,
the del operator, and exhibited the concepts which are now known as gradient, divergence, and curl. He considered this work to be his crowning achievement and did secure
one lifelong champion in P. G. Tait (1831-1901). Despite this, quaternion theory
never gained wide popularity in the scientific community. It was unnecessarily encumbered in that vectors were only a part of quaternions, and the results of most operations
gave mixed physical interpretations. What has endured from all of Hamilton's tremendous labors in this field was his demonstration of a self-consistent algebra which
did not require the commutative law of multiplication to hold, and his introduction
of the del operator.
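Hamilton's multiplication rules for the fundamental units are enough to compute the product of any two quaternions, and the statement that the product of two vectors has the negative dot product as its scalar part and the cross product as its vector part can be verified directly. The following is a minimal sketch: the tuple representation (scalar first, then the three vector components), the function names, and the sample vectors are assumptions made for illustration.

```python
def quaternion_product(p, q):
    """Hamilton product of two quaternions given as (scalar, x, y, z) tuples."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,
        p0 * q2 + p2 * q0 + p3 * q1 - p1 * q3,
        p0 * q3 + p3 * q0 + p1 * q2 - p2 * q1,
    )

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

# Two sample vectors, embedded as quaternions with zero scalar part.
a = (1.0, 2.0, 3.0)
b = (-4.0, 0.5, 2.0)
product = quaternion_product((0.0, *a), (0.0, *b))
print(product, -dot(a, b), cross(a, b))
```

Swapping the order of the two factors leaves the scalar part unchanged and reverses the sign of the vector part, which is the noncommutativity Hamilton had to accept.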
A year after Hamilton's first announcement of the quaternion theory, Grassman
(1809-1877) published in Germany a remarkable treatise, Die Lineale Ausdehnungslehre, ein neuer Zweig der Mathematik, concerned with algebra in n-dimensional space.
Most of the ingredients of what are now called vector and tensor analysis were embodied
in Grassman's work as special cases. To appreciate the breadth of Grassman's view,
one must realize that, except for the contemporary work of Cayley, no one else at that
time was thinking beyond Euclid's three dimensions. Even Hamilton's quaternions,
while being a four-tuple, were restricted to three dimensions in their vector character.
Like Hamilton, Grassman was a widely gifted man. Accomplished in philosophy,
harmony, physics, and the Sanskrit classics, he was a pious husband, who supported a
wife and nine children from his meagre earnings as an elementary school teacher.
Grassman invested what time he could find, stretching over three decades of his career,
in the algebra which he proudly referred to as a new branch of mathematics. Its development was a rich source of satisfaction to him, ranking perhaps only behind theology
as a central force in his life.
Briefly stated, Grassman's algebra is concerned with the concept of hyper-numbers,
of which an example is the polynomial

in which the $a_m$ coefficients are ordinary numbers and the quantities $\mathbf{1}_m$ are primary
units upon which Grassman imposed a variety of conditions. The sum of two such
hyper-numbers is given by

and multiplication and division of hyper-numbers by ordinary numbers follow the conventional rules of algebra.
The product of two hyper-numbers can be written as

Grassman imposed the conditions

$$\mathbf{1}_m^2 = 1 \qquad \mathbf{1}_m \mathbf{1}_n = 0 \quad (m \ne n)$$

to obtain what he called the inner product, and the conditions

$$\mathbf{1}_m^2 = 0 \qquad \mathbf{1}_m \mathbf{1}_n = -\mathbf{1}_n \mathbf{1}_m$$

to obtain the outer product, thus rejecting, as did Hamilton, the inviolability of the
commutative law of multiplication.
In three dimensions, these results are recognized as the dot and cross product, and
Grassman explained their significance in great detail. He also considered higher products, including the triple scalar product associated with volume in three dimensions.
Another type of product, which Grassman called "open," or "indeterminate," led to
what is now called a matrix, and Grassman clearly anticipated the later work of Cayley
in this field. Quaternions can be included in this generalized vector algebra as a very
special case, and the theories of determinants and tensors are also embodied in the
general development.
Unlike Hamilton, Grassman received no honors in his lifetime. His philosophical
interests led him to endow his theory with the greatest possible generality, and the
first edition of Ausdehnungslehre (1844) was heavily burdened with philosophical
abstractions. The fact that it was all but ignored by mathematicians spurred Grassman
to greater efforts to gain recognition for his theory. Eighteen years later, he published
a second edition of Ausdehnungslehre, which was extensively revised and greatly
expanded but which was hardly less incomprehensible. The combination of a generalized
theory which broke with tradition to plow new and difficult ground, plus his own
obfuscations, doomed the second edition to the same fate as the first. Grassman abandoned mathematics, 50 years ahead of his time, with the tribute that was his due
reserved for the twentieth century to bestow posthumously.
Probably the figure who had the most influence on the shaping of vector analysis
as it is now used was Josiah Willard Gibbs (1839-1903). Renowned for his work in
statistical mechanics and America's outstanding mathematical physicist of the nineteenth century, Gibbs was perhaps better qualified than either Hamilton or Grassman
to sense the mathematical needs of scientists. He blended the more useful features of
both men's work into a treatise entitled Elements of Vector Analysis, printed in pamphlet form (1881-1884) for the private use of his students. In the preface, Gibbs
acknowledged a similarity between his development and quaternions, but indicated
a stronger relation to the work of Grassman, of which he was intimately aware. In
effect, Gibbs employed the three-dimensional form of Grassman's general algebra,

taking the fundamental units to be mutually orthogonal in space, and stripping Grassman's development of the confusion inherent in its generality. As such, he was dealing
with vectors unshackled from their wedding to scalars, an intrinsic feature of the
quaternion theory. However, Gibbs did retain the various del operations introduced
by Hamilton and clearly illuminated their physical significance.
Although privately published, Gibbs's pamphlet became widely known and precipitated a prolonged controversy over the relative merits of quaternions and vector
analysis. He received strong support in this controversy from Oliver Heaviside (1850-1925),
who published in 1893 a text entitled Electromagnetic Theory. A long chapter of
this book was devoted to the development of vector algebra and analysis with numerous
practical applications. His point of view was harmonious with that of Gibbs; they
principally differed in notation. The end result of this controversy over quaternions
versus vectors was predictable on practical grounds, and with the death of Tait in
1901, quaternions quietly moved into history.
In 1901, E. B. Wilson published an excellent and exhaustive text entitled Vector
Analysis based on the lectures of Gibbs, and this book was instrumental in establishing
the wide use of vector analysis as a mathematical tool. The first significant presentation of the vector
method to appear on the Continent was contained in Föppl's Geometrie der Wirbelfelder
(1897), which received extensive distribution in a revised version written by M. Abraham and published in 1904. The impact of these two texts, and others which followed,
has placed vector analysis in its secure and rightful position as a standard part of the
mathematical education of students of science.

V.2 SCALARS AND VECTORS

Many physical quantities permit a mathematical description, and for some a magnitude will suffice. Thus the temperature of a chemical solution, the size of an audience,
the entropy of a gas, the yield of a cornfield, can all be described by a real number. These
are known as scalar quantities.†

FIGURE V.1 Representation of a vector.

However, the statement that a shell is traveling at 2,000 ft/sec is incomplete. No
one will deny that in this instance the direction might prove to be a highly valuable
piece of information. Similarly, all displacements, velocities, accelerations, and forces
are completely defined only when both their magnitudes and directions are specified.
These are called vector quantities and the branch of mathematics which is concerned
with them is known as vector analysis.

† The discussion will be enlarged to include complex scalars and complex vectors in Section V.23.


The simplest example of a vector quantity is a displacement. It may be represented
pictorially by a directed line segment, as shown in Figure V.1. The length OA represents
the magnitude of the displacement; the orientation of the line OA and the arrow serve
to indicate the direction of the displacement. If the actual place of occurrence of the
displacement is significant, O can represent the initial point and A the terminal point.
As indicated by the use of the symbol a in Figure V.1, a vector will be denoted in this
text by boldface type and its magnitude by the same symbol in italic type.

V.3 THE ADDITION LAW FOR VECTORS

Addition of vectors is permissible if (1) they represent quantities of the same dimensions, and if (2) they are either free or fixed and acting at the same point.†

FIGURE V.2 The sum of two vectors.

The law of vector addition follows naturally from the concept of successive directed
displacements. For free vectors, if the tail of vector b is placed in coincidence with the
head of vector a, and a vector c is drawn from the tail of a to the head of b, then c is
said to be the sum of a and b and the addition operation is written

$$\mathbf{c} = \mathbf{a} + \mathbf{b}$$

FIGURE V.3 The parallelogram law of vector addition.

This construction is shown in Figure V.2, from which it is apparent that an equally
valid method for determining the sum of two vectors would be through the use of a
parallelogram with sides a and b, as shown in Figure V.3. c can then be identified as a

† The problem being considered determines whether a vector is fixed or free. For example, the weights
of soldiers standing on a bridge may be represented by vectors directed downward at the appropriate
spots. To obtain the total live load on the bridge's supports, these vectors may be moved parallel to
themselves, until they are at a common point, and then summed. However, if the partial load at each
support is to be computed, the positions of the vectors are also important. They are the same vectors
in both problems, but in the first case they are free, whereas in the second case they are fixed.


diagonal of the parallelogram, with the tails of a, b, and c coincident. Since this alternative method is also applicable to fixed vectors, it is the one which shall be adopted.
It is customarily referred to as the parallelogram law of vector addition.
Since the result of two successive directed displacements is independent of their
order, it follows that the commutative law holds for vector addition, namely that
a + b = b + a. This is consistent with the introduction of the parallelogram law,
and the fact that opposite sides of a parallelogram are equal and parallel.

FIGURE V.4 Vector subtraction.

If c = a + b = 2a = 2b, in which by the symbol 2a is meant a vector in the same
direction as a and twice as long, then it is evident from either Figure V.2 or Figure V.3
that a = b. Thus two vectors are equal if they have the same magnitude and a common direction.
If c = a + b = 0, c is said to be a null vector, and it is equally evident from the
two figures that a and b have the same magnitude but are oppositely directed. The
null equation can be rewritten in the form a = -b, providing the interpretation that a
vector and its negative have the same magnitude, but point in opposite directions.

FIGURE V.5 The addition of three vectors.


This permits extension of the concept of vector addition to the case of subtraction of one
vector from another. By writing a - b = a + (- b) and using the construction indicated in Figure V.4, the difference is obtained easily.
To add three free vectors, let them be connected as shown in Figure V.5. From
inspection, it follows that
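
$$\mathbf{a} + \mathbf{b} + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c} = (\mathbf{b} + \mathbf{c}) + \mathbf{a} = \mathbf{b} + \mathbf{c} + \mathbf{a}$$

Thus in the addition of three free vectors, the commutative and associative laws hold,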
and the order in which the three vectors are added is immaterial. These ideas are readily
extended to fixed vectors, and by induction, to the addition (or subtraction) of four or
more vectors.
EXAMPLE V.1

In the absence of wind, an airplane flying due west is able to average 200 mph. If a wind is
blowing from the northwest at 40 mph, at what bearing must the pilot set the course if the
plane is actually to be traveling west? What will be its ground speed?

To solve this problem, the navigator can draw a vector a 40 units long (by some convenient scale) from the northwest, thus representing the displacement caused by the wind.
As shown in the figure, he can then account for his air speed by drawing a vector b 200 units
long, starting from the head of a, and such that the head of b is on the same horizontal line
as the tail of a. The resultant c can then be drawn from the tail of a to the head of b so
that c = a + b. This ensures that the resultant ground velocity is due west, as required.
By graphical measurements from the figure, or by trigonometry, the navigator finds that
the bearing θ is 8.1 deg north of west and that the ground speed is 170 mph.
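
The trigonometric route mentioned in Example V.1 can be checked numerically. The following is a minimal sketch, assuming the X axis points east and the Y axis points north; the function name `westbound_course` is a hypothetical helper and not part of the text.

```python
import math

def westbound_course(air_speed, wind_speed):
    """Return (bearing in deg north of west, ground speed) for a plane that
    must track due west against a wind blowing from the northwest."""
    # A wind from the northwest points toward the southeast: (+east, -north).
    wind_east = wind_speed * math.cos(math.radians(45.0))
    wind_north = -wind_speed * math.sin(math.radians(45.0))
    # The northward part of the air velocity must cancel the wind's southward part.
    theta = math.asin(-wind_north / air_speed)
    # What remains of the vector sum is purely westward.
    ground_speed = air_speed * math.cos(theta) - wind_east
    return math.degrees(theta), ground_speed

bearing, speed = westbound_course(200.0, 40.0)
print(f"bearing = {bearing:.1f} deg north of west, ground speed = {speed:.0f} mph")
```

The two printed values should round to the figures quoted in the example.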

EXAMPLE V.2

Many theorems of plane geometry can be proved with considerable economy by the use of
vector algebra. As an example, one can show that for any quadrilateral, if successive midpoints of the sides are joined, the figure thus formed is a parallelogram.

To show this, imagine that the sides are free vectors, such that a is the directed line
segment drawn from P1 to P2, b is the directed line segment drawn from P2 to P3, etc.
Since the quadrilateral is a closed figure, it follows that a + b + c + d = 0. But, with e and f denoting two opposite sides of the inscribed figure,

$$\mathbf{e} = \tfrac{1}{2}\mathbf{d} + \tfrac{1}{2}\mathbf{a} \qquad \mathbf{f} = \tfrac{1}{2}\mathbf{b} + \tfrac{1}{2}\mathbf{c}$$

and therefore

$$\mathbf{e} + \mathbf{f} = 0 \qquad \mathbf{e} = -\mathbf{f}$$

Thus e and f are parallel and of equal length, and therefore the inscribed figure is a
parallelogram.
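
The result of Example V.2 can be spot-checked for a particular quadrilateral. This is only an illustrative sketch: the corner coordinates are arbitrary, and the helper names `midpoint` and `inscribed_sides` are hypothetical.

```python
def midpoint(p, q):
    """Midpoint of the segment joining points p and q."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def inscribed_sides(p1, p2, p3, p4):
    """Return e and f, two opposite sides of the figure joining successive midpoints."""
    m_a, m_b = midpoint(p1, p2), midpoint(p2, p3)
    m_c, m_d = midpoint(p3, p4), midpoint(p4, p1)
    e = (m_a[0] - m_d[0], m_a[1] - m_d[1])  # equals d/2 + a/2
    f = (m_c[0] - m_b[0], m_c[1] - m_b[1])  # equals b/2 + c/2
    return e, f

e, f = inscribed_sides((0, 0), (4, 1), (5, 5), (-1, 3))
print(e, f)  # e and f should be equal and opposite
```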

V.4 THE MULTIPLICATION OF VECTORS BY SCALARS

The product of any vector a with any real number α will be written as αa and defined
to be a vector whose magnitude is |α|a. Its direction is the same as that of a if α is positive, and opposite to that of a if α is negative. This definition obviously includes division
of a vector by a scalar, since α can be written as the reciprocal of another scalar.
Up to this point, the discussion has been confined to real vectors, but the concept
of a vector can be extended usefully to the complex domain. This can be done by taking
the product of a real vector a with any complex number γ = α + jβ. The result can
be written γa, a complex vector whose real and imaginary parts, αa and βa, are
real vectors which obey all the rules for vectors so far enunciated. Thus the product
of a real vector and a real number becomes a special case of the product of a real vector
and a complex number. Complex vectors will be treated more fully in Section V.23.
All the rules of scalar algebra apply to the multiplication of a real vector and a real
scalar. As examples,

$$\alpha(\beta\mathbf{a}) = (\alpha\beta)\mathbf{a} = \beta(\alpha\mathbf{a})$$
$$(\alpha + \beta)\mathbf{a} = \alpha\mathbf{a} + \beta\mathbf{a}$$
$$\alpha(\mathbf{a} + \mathbf{b}) = \alpha\mathbf{a} + \alpha\mathbf{b}$$

It is now possible to explain why velocity, acceleration, and force are vectors which
obey the parallelogram law of addition. This law, as has been seen, is a logical statement of the sum of directed displacements. But velocity can be defined as the limiting
ratio of displacement to time interval. Thus if

$$\mathbf{v}_1 = \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_1}{\Delta t} \qquad \mathbf{v}_2 = \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_2}{\Delta t}$$

then

$$\mathbf{v}_1 + \mathbf{v}_2 = \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_1}{\Delta t} + \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_2}{\Delta t} = \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_1 + \Delta\boldsymbol{\ell}_2}{\Delta t}$$

and the sum of two velocities is found by summing two displacements (which sum
obeys the parallelogram law of addition), dividing by the scalar factor Δt, and taking
the limit. Similarly, summing two accelerations amounts to summing two velocity
increments (which sum has just been shown to obey the parallelogram law), dividing
by the factor Δt, and taking the limit. Therefore accelerations are vectors which obey
the parallelogram law of addition. In like manner one can show that forces, which
involve the time rate of change of momentum, also add according to the parallelogram
law, since momentum is the vector velocity multiplied by the scalar mass.

V.5 RESOLUTION INTO COMPONENTS

The parallelogram law of addition provides a useful way to decompose vectors. Suppose a real vector a is acting at a point P. Choose three different lines PP1, PP2, and
PP3 through P, not all coplanar. Find a vector a1 along PP1 such that a - a1 lies in
the plane containing PP2 and PP3. Then find vectors a2 and a3 along PP2 and PP3
respectively, such that a2 + a3 = a - a1. This ensures that

$$\mathbf{a} = \mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3$$

and a is said to have been resolved into components along the lines PP1, PP2, and PP3.
For the important case in which PP1, PP2, and PP3 are mutually perpendicular,
the components are uniquely determined and form the sides of a rectangular parallelopiped, as shown in Figure V.6. The magnitudes are then simply related by the expression $a^2 = a_1^2 + a_2^2 + a_3^2$. To find a1, a2, or a3, one need only drop a perpendicular from
the head of a to PP1, PP2, or PP3 and measure the projection.

FIGURE V.6 Resolution of a vector into components.

It will often be convenient to describe the position of the point P in Cartesian coordinates. The lines PP1, PP2, and PP3 are then taken parallel to the X, Y, and Z axes
and the process is known as resolving a into its Cartesian components. For this case,
the notation can be improved by introducing unit vectors† $\mathbf{1}_x$, $\mathbf{1}_y$, and $\mathbf{1}_z$. These vectors
have a dimensionless magnitude of unity and are oriented parallel to the coordinate
axes in the direction of increasing x, y, and z. The quantity $\mathbf{1}_x a_x$ is then understood to
be a vector of magnitude $|a_x|$ which points along the X axis. It is positively or negatively
directed according to whether $a_x$ is positive or negative. With this convention it is
possible to find three real numbers $a_x$, $a_y$, and $a_z$ such that

$$\mathbf{a} = \mathbf{1}_x a_x + \mathbf{1}_y a_y + \mathbf{1}_z a_z$$

and

$$a = (a_x^2 + a_y^2 + a_z^2)^{1/2} \tag{V.1}$$

Addition or subtraction of two vectors then becomes a purely algebraic matter through
the relation

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_x(a_x \pm b_x) + \mathbf{1}_y(a_y \pm b_y) + \mathbf{1}_z(a_z \pm b_z) \tag{V.2}$$

† Some authors use the symbols i, j, and k to represent these unit vectors. However, in texts on electromagnetic theory, this can cause confusion since i may stand for current density, j for the imaginary
unit, and k for the wave number.

EXAMPLE V.3

A river which is 2 mi wide flows due south at a speed which is to be determined. Let identical
rowers in identical rowboats start out from a common point by the west bank. As suggested
in the figure, rower A proceeds to the east bank and then returns to the starting point, with
his total route lying in an east-west line. Rower B goes straight south for 2 mi and then
turns around and rows back to the starting point. The difference in elapsed times is 0.357
hours. If each man rows at a steady rate of 2 mph relative to the water, what is the speed of
the river current?

This problem derives some importance from the fact that it is analogous to the Michelson-Morley experiment (see Chapter 2). To solve it, let the X axis point east and the Y axis point
north and denote the river current by $-\mathbf{1}_y v_r$. Then for rower A on his first leg, the total velocity relative to the ground is

$$\mathbf{1}_x v_{a1} = \mathbf{1}_x\, 2\cos\theta + \mathbf{1}_y\, 2\sin\theta - \mathbf{1}_y v_r$$

in which θ is the angle north of east at which he must point his rowboat in order to proceed
due east. Thus

$$v_{a1} = 2\cos\theta \qquad v_r = 2\sin\theta$$

Similarly, on the return leg, rower A establishes a total velocity relative to the ground which
is given by

$$-\mathbf{1}_x v_{a2} = -\mathbf{1}_x\, 2\cos\theta' + \mathbf{1}_y\, 2\sin\theta' - \mathbf{1}_y v_r$$

in which θ' is the angle north of west at which he must point his rowboat in order to proceed
due west. Thus

$$v_{a2} = 2\cos\theta' \qquad v_r = 2\sin\theta'$$

so that

$$\theta' = \theta \qquad v_{a2} = v_{a1}$$

Rower A spends the same time on the first leg that he does on the second, and his total
elapsed time is

$$t_1 = \frac{4}{v_{a1}} = \frac{2}{[1 - (v_r/2)^2]^{1/2}}$$

As for rower B, his downstream velocity is $-\mathbf{1}_y(2 + v_r)$ and his upstream velocity is
$\mathbf{1}_y(2 - v_r)$. Therefore, his total elapsed time is

$$t_2 = \frac{2}{2 + v_r} + \frac{2}{2 - v_r} = \frac{8}{4 - v_r^2} = \frac{2}{1 - (v_r/2)^2}$$

If the two times are compared, it is apparent that rower B takes longer. Since the difference
in elapsed times is known, one can write

$$t_2 - t_1 = 0.357 = \frac{2}{[1 - (v_r/2)^2]^{1/2}} \left\{ \frac{1}{[1 - (v_r/2)^2]^{1/2}} - 1 \right\}$$

Solving for the river current gives $v_r$ = 1.0 mph.
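
The final step of Example V.3, solving the time-difference relation for the river current, can be carried out numerically. The sketch below uses simple bisection and assumes the 2 mi width and 2 mph rowing speed stated in the example; the function names `time_difference` and `solve_current` are hypothetical.

```python
import math

def time_difference(v_r, width=2.0, row_speed=2.0):
    """Elapsed-time difference t2 - t1 in hours for a river current v_r in mph."""
    cross_speed = math.sqrt(row_speed**2 - v_r**2)  # rower A's speed over the ground
    t1 = 2.0 * width / cross_speed                  # across and back
    t2 = width / (row_speed + v_r) + width / (row_speed - v_r)  # downstream, upstream
    return t2 - t1

def solve_current(target, lo=0.0, hi=1.9, tol=1e-9):
    """Bisection; the time difference grows steadily with v_r on this interval."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if time_difference(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v_r = solve_current(0.357)
print(f"river current = {v_r:.2f} mph")
```

Bisection is used here only because it needs nothing beyond the standard library; any root finder would serve.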

Similarly, it may prove convenient to describe the position of the point P in cylindrical coordinates. The lines PP1, PP2, and PP3 are then taken in the radial, angular,
and axial directions, as shown in Figure V.7. Once again, unit vectors can be introduced
in the positive directions of these three coordinates so that a may be written

$$\mathbf{a} = \mathbf{1}_r a_r + \mathbf{1}_\phi a_\phi + \mathbf{1}_z a_z$$

FIGURE V.7 Unit vectors in cylindrical coordinates.

However, an important essential difference is to be noted. Whereas in Cartesian coordinates, the unit vectors have directions which are independent of the (x,y,z) coordinates
of the point P, in cylindrical coordinates the directions of both $\mathbf{1}_r$ and $\mathbf{1}_\phi$ depend on
the r and φ coordinates of the point P. Thus in cylindrical coordinates, one can express
addition or subtraction of two vectors in the form

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_r(a_r \pm b_r) + \mathbf{1}_\phi(a_\phi \pm b_\phi) + \mathbf{1}_z(a_z \pm b_z)$$

only if the vectors are acting at the same point. If they are fixed vectors, this is an
automatic condition. If they are free vectors, in moving one vector from its original
point to a common point, account must be taken of the shift in direction of $\mathbf{1}_r$ and $\mathbf{1}_\phi$.
In effect, the shifted vector must be re-resolved into new components. For this reason
Cartesian coordinates is the natural system in which to translate vectors.
In like fashion the position of the point P can be described in spherical coordinates,
as shown in Figure V.8. The lines PP1, PP2, and PP3 are then taken in the radial,
latitudinal, and azimuthal directions and unit vectors can be introduced in the positive
directions of these three coordinates so that a may be written

$$\mathbf{a} = \mathbf{1}_r a_r + \mathbf{1}_\theta a_\theta + \mathbf{1}_\phi a_\phi$$

FIGURE V.8 Unit vectors in spherical coordinates.

Once again, addition or subtraction of two vectors can be expressed in the form

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_r(a_r \pm b_r) + \mathbf{1}_\theta(a_\theta \pm b_\theta) + \mathbf{1}_\phi(a_\phi \pm b_\phi)$$


with the precaution that the vectors are acting at the same point, since the directions
of all three unit vectors depend on the position of the point P.
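
The precaution about adding curvilinear components only at a common point can be illustrated with a small numerical case. This is a sketch under stated assumptions: the two vectors and the helper `cylindrical_to_cartesian` are hypothetical choices made for the illustration.

```python
import math

def cylindrical_to_cartesian(a_r, a_phi, a_z, phi):
    """Cartesian components of a vector whose cylindrical components
    (a_r, a_phi, a_z) are referred to the unit vectors at azimuth phi."""
    c, s = math.cos(phi), math.sin(phi)
    return (a_r * c - a_phi * s, a_r * s + a_phi * c, a_z)

# Two unit radial vectors attached at different azimuths.
a = cylindrical_to_cartesian(1.0, 0.0, 0.0, 0.0)          # points along +x
b = cylindrical_to_cartesian(1.0, 0.0, 0.0, math.pi / 2)  # points along +y

true_sum = tuple(p + q for p, q in zip(a, b))
true_magnitude = math.sqrt(sum(c * c for c in true_sum))

# Adding the radial components directly, 1 + 1, would wrongly suggest magnitude 2.
naive_magnitude = 1.0 + 1.0
print(true_magnitude, naive_magnitude)
```

Re-resolving both vectors into Cartesian components first gives the correct sum, which is the point made in the text about translating vectors.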

V.6 MULTIPLICATION OF VECTORS-THE DOT PRODUCT

The multiplication of a vector by a scalar has already been introduced. Additionally,
it is convenient to define two operations involving the multiplication of a vector by a
vector, each of which has extensive physical application. The first of these operations
is called the dot product (or scalar product, or inner product) and is symbolized in the
form a · b. By definition,

$$\mathbf{a} \cdot \mathbf{b} = ab\cos\theta \tag{V.3}$$

in which θ is the angle measured from a to b, as shown in Figure V.9. Note that the
dot product can be positive or negative according to whether θ is less than or greater
than 90 deg.

FIGURE V.9 Notation for the dot product.

The dot product is a permissible operation if the vectors are free, or are fixed and
acting at the same point. They need not have the same dimensions. Since b · a =
ba cos(-θ) = ba cos θ, it follows that

$$\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a} \tag{V.4}$$

and thus the dot product is commutative.

FIGURE V.10 Distributive law for the dot product.

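Pictorially, a · b may be interpreted as b times the projection on b of a, or as a times
the projection on a of b. Thus in forming the dot product a · (b + c), it can be seen
from Figure V.10 that

$$\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = a\,\mathrm{proj}_a(\mathbf{b} + \mathbf{c}) = a\,\mathrm{proj}_a\,\mathbf{b} + a\,\mathrm{proj}_a\,\mathbf{c}$$

and therefore that

$$\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c} \tag{V.5}$$

Thus the dot product obeys the distributive law. This result can be used to expand
products such as

$$\mathbf{a} \cdot \mathbf{b} = (\mathbf{1}_x a_x + \mathbf{1}_y a_y + \mathbf{1}_z a_z) \cdot (\mathbf{1}_x b_x + \mathbf{1}_y b_y + \mathbf{1}_z b_z)$$

When it is recognized that $\mathbf{1}_x \cdot \mathbf{1}_x = 1$, $\mathbf{1}_x \cdot \mathbf{1}_y = 0$, etc., the expansion reduces to

$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z \tag{V.6}$$

Equation (V.6) is a statement of the dot product in terms of Cartesian components.
In particular

$$\mathbf{a} \cdot \mathbf{a} = a^2 = a_x^2 + a_y^2 + a_z^2 \tag{V.7}$$

The dot product may also be used to find the angle between two vectors through
the relation

$$\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{ab} \tag{V.8}$$

Finally, if a · b = 0 and a > 0, b > 0, then a must be perpendicular to b.

EXAMPLE V.4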

Let a be any vector, and let it make angles α, β, γ with the X, Y, and Z axes. Then, since
$\mathbf{1}_x$ is a unit vector,

$$\mathbf{1}_x \cdot \mathbf{a} = a_x = a\cos\alpha, \text{ etc.}$$

and

$$\mathbf{a} = a(\mathbf{1}_x \cos\alpha + \mathbf{1}_y \cos\beta + \mathbf{1}_z \cos\gamma)$$

from which it follows that

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$$

The three quantities

$$\cos\alpha = \frac{a_x}{a} \qquad \cos\beta = \frac{a_y}{a} \qquad \cos\gamma = \frac{a_z}{a}$$

are called the direction cosines. All three are needed to determine uniquely the direction of a.
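
The Cartesian form of the dot product and the direction cosines of Example V.4 can be sketched in a few lines. The helper names below are hypothetical, and the sample vector (6, -3, 2) is chosen only because its magnitude is a whole number.

```python
import math

def dot(a, b):
    """Dot product from Cartesian components."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(a):
    """Magnitude from the dot product of a vector with itself."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in degrees between two nonzero vectors."""
    return math.degrees(math.acos(dot(a, b) / (magnitude(a) * magnitude(b))))

def direction_cosines(a):
    """Cosines of the angles that a makes with the X, Y, and Z axes."""
    m = magnitude(a)
    return tuple(x / m for x in a)

cosines = direction_cosines((6.0, -3.0, 2.0))
print(cosines, sum(c * c for c in cosines))
print(angle_between((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```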
EXAMPLE V.5

The dot product is very useful in formulating energy problems. As an example, let a particle
of mass m move along the path C from point P1 to point P2 under the action of a force F.
The path C need not lie in a plane and F need not be constant. At a general position on the
path, the mass m undergoes a displacement dℓ. Since dℓ has a magnitude dℓ and the direction
of the motion of the particle (i.e., tangent to the path), it follows that F · dℓ is the component of F parallel to the displacement, multiplied by the displacement. But this is the element of work dW done by F on the particle during the displacement dℓ. Thus

$$W = \int_{P_1}^{P_2} dW = \int_{P_1}^{P_2} \mathbf{F} \cdot d\boldsymbol{\ell}$$

is the work done by the force F on the particle as it moves from P1 to P2.
As a specific illustration of this relation, let the particle move along the helix given by the
parametric equations

$$x = \cos t \qquad y = \sin t \qquad z = t$$

with the initial point P1 taken to correspond to t1 = 0 and the terminal point P2 taken to
correspond to t2 = 2π. Let the force be given by

$$\mathbf{F} = \mathbf{1}_x x + \mathbf{1}_y y - \mathbf{1}_z z^2$$

Then

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz = [\mathbf{1}_x(-\sin t) + \mathbf{1}_y \cos t + \mathbf{1}_z]\,dt$$
$$\mathbf{F} = \mathbf{1}_x \cos t + \mathbf{1}_y \sin t - \mathbf{1}_z t^2$$
$$\mathbf{F} \cdot d\boldsymbol{\ell} = (-\cos t \sin t + \sin t \cos t - t^2)\,dt$$
$$W = -\int_0^{2\pi} t^2\,dt = -\frac{(2\pi)^3}{3}$$


The negative sign attached to this answer means that on the average the force F was opposed
to the particle's motion. Thus rather than doing work on the particle, the force F was the
means whereby energy was taken from the particle.
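
The line integral of Example V.5 can be checked by direct numerical summation. The following sketch uses the midpoint rule along the helix; the function name `work_along_helix` and the number of steps are assumptions made for the illustration.

```python
import math

def work_along_helix(n=20000):
    """Midpoint-rule estimate of the work done by F = (x, y, -z^2)
    along the helix x = cos t, y = sin t, z = t for t from 0 to 2*pi."""
    t1, t2 = 0.0, 2.0 * math.pi
    dt = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        t = t1 + (i + 0.5) * dt
        x, y, z = math.cos(t), math.sin(t), t
        force = (x, y, -z * z)
        tangent = (-math.sin(t), math.cos(t), 1.0)  # d(ell)/dt
        total += sum(f * d for f, d in zip(force, tangent)) * dt
    return total

print(work_along_helix(), -(2.0 * math.pi) ** 3 / 3.0)
```

The two printed numbers should agree closely, and both are negative, consistent with the remark that the force opposes the motion on the average.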

V.7 THE EQUATION OF A PLANE

The dot product is also useful in the derivation of the equation of a plane, a result
which is needed in the discussion of electromagnetic plane waves. Consider the plane
shown in Figure V.11. Let P0(x0,y0,z0) be that point in the plane which is closest to the
origin, and let P(x,y,z) be any other point in the plane. Further, let a be drawn from
the origin to P0, and let b be drawn from P0 to P. Then a is perpendicular to the plane,
whereas b lies in the plane, and thus

$$\mathbf{a} \cdot \mathbf{b} = x_0(x - x_0) + y_0(y - y_0) + z_0(z - z_0) = 0 \tag{V.9}$$

which can be rearranged to give

$$\frac{x_0 x + y_0 y + z_0 z}{(x_0^2 + y_0^2 + z_0^2)^{1/2}} = (x_0^2 + y_0^2 + z_0^2)^{1/2} \tag{V.10}$$

This is the equation of the plane, and is in the standard form Ax + By + Cz = D.

FIGURE V.11 Geometry of a plane.

As written, it reveals all the important properties which characterize the plane.
The coefficients of x, y, and z in Equation (V.10) are the direction cosines of any
line perpendicular to the plane and thus define its tilt. The constant term on the right
gives the distance from the origin to the plane. The equation of any plane can be put
in the form (V.10) merely by normalizing the coefficients. This is accomplished when
the sum of the squares of the coefficients of x, y, and z is unity.
It follows that the equations for a family of parallel planes will have identical x, y,
and z coefficients when written in the normalized form (V.10). The distance between
any two members of the family will then be the difference between the constant terms
on the right sides of the equations.
EXAMPLE V.6

What is the equation of a plane which is parallel to the plane 6x - 3y + 2z = 7 and three
times as far from the origin?
One can begin by normalizing the equation of the first plane. Since

$$[(6)^2 + (-3)^2 + (2)^2]^{1/2} = 7$$

the normalized form is

$$\tfrac{6}{7}x - \tfrac{3}{7}y + \tfrac{2}{7}z = 1$$

and thus the first plane is 1 unit from the origin. A suitable choice for the second plane is
therefore

$$\tfrac{6}{7}x - \tfrac{3}{7}y + \tfrac{2}{7}z = 3$$

An equally valid choice is the plane on the other side of the origin, for which one obtains the
equation

$$\tfrac{6}{7}x - \tfrac{3}{7}y + \tfrac{2}{7}z = -3$$
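
The normalization step of Example V.6 is mechanical enough to sketch in code. The function name `normalize_plane` is hypothetical; it takes the coefficients of Ax + By + Cz = D and returns the normalized coefficients together with the signed distance from the origin.

```python
import math

def normalize_plane(A, B, C, D):
    """Scale Ax + By + Cz = D so that the squares of the x, y, z coefficients sum to unity."""
    norm = math.sqrt(A * A + B * B + C * C)
    return (A / norm, B / norm, C / norm, D / norm)

l, m, n, dist = normalize_plane(6.0, -3.0, 2.0, 7.0)

# Parallel planes three times as far from the origin, one on each side.
parallel_planes = [(l, m, n, 3.0 * dist), (l, m, n, -3.0 * dist)]
print(dist, parallel_planes)
```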

V.8 MULTIPLICATION OF VECTORS-THE CROSS PRODUCT

The second operation to be introduced which involves the multiplication of a vector
by a vector is known as the cross product (or vector product or outer product) and is
symbolized in the form a × b. By definition,

$$\mathbf{a} \times \mathbf{b} = ab\,|\sin\theta|\,\mathbf{1}_n \tag{V.11}$$

in which θ is the smaller angle measured from a to b as indicated in Figure V.12, and
$\mathbf{1}_n$ is a unit vector normal to the plane containing a and b. Which direction $\mathbf{1}_n$ takes
perpendicular to the plane is determined by the right-hand rule. If a right-hand screw
were placed parallel to $\mathbf{1}_n$ and rotated in such a way as to take a through the angle θ
into b, the longitudinal displacement of the screw would be in the direction of $\mathbf{1}_n$. An
alternative method to determine the sense of $\mathbf{1}_n$ is to use the thumb, index, and middle
fingers of the right hand to indicate respectively the directions of a, b, and $\mathbf{1}_n$.

FIGURE V.12 Notation for the cross product.

From this definition, it is evident that taking the cross product of two vectors yields
a new vector perpendicular to both original vectors. Its magnitude is determined by
the magnitudes of the original vectors and the sine of the angle between them. The
distinctions between the scalar product and the vector product should be carefully
noted.
Forming the vector product is a permissible operation if the vectors are free, or fixed
and acting at the same point. They need not be of the same dimensions.
An immediate consequence of the rule for determining the sense of $\mathbf{1}_n$ is the relation

$$\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a} \tag{V.12}$$

and thus the commutative law is not obeyed. The vector product does conform to the
distributive law, however, and one may write

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c} \tag{V.13}$$

This result may be established with the aid of Figure V.13. b and c (and hence b + c)
lie in a common plane which, in general, is tilted with respect to the plane of the paper,
since in the figure a is taken perpendicular to the plane of the paper. Their projections
in the plane of the paper are b', c', and (b + c)' respectively. It follows that

$$\mathbf{a} \times \mathbf{b} = \mathbf{a} \times \mathbf{b}' \qquad \mathbf{a} \times \mathbf{c} = \mathbf{a} \times \mathbf{c}' \qquad \mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times (\mathbf{b} + \mathbf{c})'$$

FIGURE V.13 Distributive law for the cross product.

But if the triangle of sides b', c', and (b + c)' is rotated 90 deg and altered by the
factor a, the result is a triangle of sides a × b', a × c', and a × (b + c)'. Thus

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c})' = \mathbf{a} \times \mathbf{b}' + \mathbf{a} \times \mathbf{c}'$$

and thus

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}$$

which is (V.13). This proof may be extended readily to a × (b + c + d), (a + b) ×
(c + d), etc. In particular it may be applied to the expansion

$$\mathbf{a} \times \mathbf{b} = (\mathbf{1}_x a_x + \mathbf{1}_y a_y + \mathbf{1}_z a_z) \times (\mathbf{1}_x b_x + \mathbf{1}_y b_y + \mathbf{1}_z b_z)$$

When it is recognized that $\mathbf{1}_x \times \mathbf{1}_x = 0$, $\mathbf{1}_x \times \mathbf{1}_y = \mathbf{1}_z$, etc., this expansion reduces to

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{1}_x & \mathbf{1}_y & \mathbf{1}_z \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix} \tag{V.14}$$

Equation (V.14) is oft-used and worth committing to memory. It can be observed
that b × a corresponds to an interchange of two rows of the determinant, which gives
the required change of sign.
The statement that $\mathbf{1}_x \times \mathbf{1}_y = \mathbf{1}_z$ implies the use of right-handed coordinate systems.
This convention has been adopted almost universally and will be followed in this
text. Thus for example, (x,y,z), (r,φ,z), and (r,θ,φ) will be the chosen order of writing
coordinates in Cartesian, cylindrical, and spherical systems. The positive directions
of the three coordinate axes will always be chosen so that $\mathbf{1}_x \times \mathbf{1}_y = \mathbf{1}_z$, $\mathbf{1}_r \times \mathbf{1}_\phi = \mathbf{1}_z$,
and $\mathbf{1}_r \times \mathbf{1}_\theta = \mathbf{1}_\phi$, etc. Strict adherence to this convention is necessary if confusions
of sign in formulas for vector products are to be avoided.
Finally, if a × b = 0, and a > 0, b > 0, it follows that a and b are parallel.
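
The determinant form of the cross product expands into three component expressions, which can be sketched as follows. The function name `cross` is a hypothetical helper; the sample vectors are the Cartesian unit vectors in a right-handed system.

```python
def cross(a, b):
    """Cross product from the cofactor expansion of the determinant form."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

unit_x, unit_y, unit_z = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(cross(unit_x, unit_y))  # right-handed system: x cross y gives z
print(cross(unit_y, unit_x))  # reversing the order reverses the sign
print(cross(unit_x, unit_x))  # parallel vectors give the null vector
```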
EXAMPLE V.7

A parallelogram of sides a and b has an area given by S = ab sin θ, with θ an interior corner
angle. If the sides of the parallelogram are treated as vectors, this can be written S =
|a × b|.
It is even more useful in this case not to take the absolute value. The vector product itself
not only has a magnitude equal to the area of the parallelogram, but also points in a direction perpendicular to the plane of the parallelogram. This direction serves to describe the
orientation in space of the plane surface bounded by the parallelogram, and thus more information is provided when the relation

$$\mathbf{S} = \mathbf{a} \times \mathbf{b} \tag{V.15}$$

is used.
EXAMPLE V.8

The result of the previous illustrative example can be combined with the dot product to
express the volume of the parallelopiped shown in the figure. If the three sides are represented
by the free vectors a, b, and c, the area of the base is |b × c| and the height is the projection of a
on a line perpendicular to the base. Thus the volume of the parallelopiped is

$$\text{Volume} = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})$$

When one makes use of the cross product in the form (V.14), the volume can be written
simply as the determinant

$$\text{Volume} = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}$$

This result is often called the scalar triple product.
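
The scalar triple product of Example V.8 can be evaluated directly from components. In this sketch the helper names and the three sample edge vectors are hypothetical; the base is a 2 by 3 rectangle in the XY plane, so the expected volume is the base area times the z component of a.

```python
def dot(a, b):
    """Dot product from Cartesian components."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product from Cartesian components."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

a, b, c = (1, 2, 3), (2, 0, 0), (0, 3, 0)
volume = triple_product(a, b, c)
print(volume)  # base area 6 times height 3
```

The sign of the result depends on the order of the three vectors, so an absolute value may be wanted when only the volume is of interest.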


EXAMPLE V.9

Consider a force F acting on a pivoted rod as shown in the figure. The torque exerted by this
force is T = Fr sin θ, and it tends to rotate the rod counterclockwise. By letting r be a free
vector drawn from the pivot point out to the point of application of the force, one can write

$$\mathbf{T} = \mathbf{r} \times \mathbf{F} \tag{V.16}$$

The magnitude of T is the torque. The direction of T is the axis of rotation for the torque,
and application of the right-hand rule† yields the information as to whether the torque is
clockwise or counterclockwise. The formula (V.16) is a general result, not restricted to the
case that the rod and the force F lie in the plane of the paper.

† The right thumb is placed in the direction of T and the remaining four fingers then point in the
rotational direction of the torque.

EXAMPLE V.10

Imagine a body to be rotating about a fixed axis at an angular velocity ω, as shown in the
figure. Let r be drawn from a fixed point O on the axis of rotation to a point P in the body.
The angular rotation may be described by the vector ω. The magnitude of ω is the angular
velocity; its direction is the axis of rotation. The sense of ω (up or down) is chosen to correspond to the right-hand rule, thus yielding the proper direction of rotation (clockwise or
counterclockwise). It follows that the velocity of the point P is given by

$$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r} \tag{V.17}$$

Equations (V.15), (V.16), and (V.17) illustrate the power of the vector method. In
a simple, terse vector formula, one is able to include all the information otherwise contained in a scalar equation and its associated paragraph of explanation concerning
directions.

V.9 THE DERIVATIVE OF A VECTOR


Frequently the value (both magnitude and direction) of a vector will depend on one
or more variables. As examples, the velocity of an accelerating particle is a function
of time, the gravitational attraction of the earth is a function of height in the atmosphere, and the intensity of a radio signal depends on distance and direction from the
transmitting station.
The rate of change of such vector functions with respect to the functional variable
is often of considerable interest, and one is thereby led to the notion of a vector derivative. It proves useful to adopt a definition for this derivative similar to the one used in
ordinary calculus. Thus suppose there is a vector f related to a scalar parameter s in
such a way that as s varies continuously, f does also. This dependency of f on s will be
indicated by writing f(s). If Δf denotes the increment in f due to an increment Δs in
the parameter s, then the vector derivative df/ds is defined by

$$\frac{d\mathbf{f}}{ds} = \lim_{\Delta s \to 0} \frac{\Delta\mathbf{f}}{\Delta s} = \lim_{\Delta s \to 0} \frac{\mathbf{f}(s + \Delta s) - \mathbf{f}(s)}{\Delta s} \tag{V.18}$$


It should be emphasized that the increment Δf is often not parallel to f, indicating
that not only is the magnitude changing, but the direction as well. In such cases df/ds
is not in the same direction as f. This is an essential feature of the vector derivative
which distinguishes it from the scalar derivatives encountered in earlier
studies of the calculus. One should guard against overlooking this feature.
Formulas for the vector derivatives of common functional combinations can be
established in a manner completely analogous to what is done in ordinary calculus.
Thus for example, if u is a continuous scalar function of the parameter s and if v is a
continuous vector function of the same parameter, then

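
$$\frac{d(u\mathbf{v})}{ds} = \lim_{\Delta s \to 0} \frac{(u + \Delta u)(\mathbf{v} + \Delta\mathbf{v}) - u\mathbf{v}}{\Delta s} = u\frac{d\mathbf{v}}{ds} + \mathbf{v}\frac{du}{ds} \tag{V.19}$$

Similarly, if u1, u2, v1, and v2 are all continuous functions of the parameter s,

(V.20)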

These two formulas can be used to establish the general derivative of f(s) in terms of its
Cartesian components. If one writes f(s) in the form

$$\mathbf{f}(s) = \mathbf{1}_x f_1(s) + \mathbf{1}_y f_2(s) + \mathbf{1}_z f_3(s)$$

since the unit vectors are not functions of s, it follows from (V.19) and (V.20) that

$$\frac{d\mathbf{f}}{ds} = \mathbf{1}_x \frac{df_1}{ds} + \mathbf{1}_y \frac{df_2}{ds} + \mathbf{1}_z \frac{df_3}{ds}$$
$$\frac{d^2\mathbf{f}}{ds^2} = \mathbf{1}_x \frac{d^2 f_1}{ds^2} + \mathbf{1}_y \frac{d^2 f_2}{ds^2} + \mathbf{1}_z \frac{d^2 f_3}{ds^2} \tag{V.21}$$

Formula (V.21) is not, in general, extendable to other coordinate systems. For example,
if f were a function of the angular variable φ in cylindrical coordinates, and if one
expressed f in terms of its cylindrical components, namely,

$$\mathbf{f} = \mathbf{1}_r f_r + \mathbf{1}_\phi f_\phi + \mathbf{1}_z f_z$$

application of the formulas (V.19) and (V.20) would have to account for the fact that
$\mathbf{1}_r$ and $\mathbf{1}_\phi$ have directions which are functions of φ.
EXAMPLE V.11

With respect to the origin of a Cartesian coordinate system, the position of a particle as a
function of time can be expressed in the functional form r(t). Such a particle might be following a path as shown in the figure. The instantaneous velocity of the particle is given by

$$\mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \lim_{\Delta t \to 0} \frac{\Delta\mathbf{r}}{\Delta t} = \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t}$$

and it is evident from the construction in the figure that v(t) is tangent to the path.
As a specific illustration, suppose that the trajectory is given by

in which the coefficients $k_i$ are constants. Then

and the particle is drifting at constant speed in its projection on the XY plane but has a
linearly changing velocity in the Z direction. Its acceleration is

a constant (which in some problems might be due to gravity).


EXAMPLE V.12

Consider a particle which is going around in a circular orbit of radius $r_0$ at the constant angular rate ω rad/sec. As suggested by the figure, a reference line can be established which intersects the orbit at a position which the particle occupied at t = 0. The instantaneous angular
position of the particle can then be specified in terms of the angle φ by the relation φ = ωt so
that dφ/dt = ω.
If the origin of cylindrical coordinates is taken at the center of the orbit, the position vector of the particle is

$$\mathbf{r}(t) = \mathbf{1}_r r_0$$

which is a function of time by virtue of the fact that the direction of the unit vector $\mathbf{1}_r$ is a
function of time. When use is made of (V.19), the velocity of the particle is seen to be

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = r_0 \frac{d}{dt}(\mathbf{1}_r) = r_0 \frac{d}{d\phi}(\mathbf{1}_r)\frac{d\phi}{dt} = \omega r_0 \frac{d}{d\phi}(\mathbf{1}_r)$$

The angular derivative of $\mathbf{1}_r$ can be determined with the aid of the vector diagram showing
$\mathbf{1}_r$ at successive positions dφ apart, from which it is evident that $d(\mathbf{1}_r) = \mathbf{1}_\phi\,d\phi$, and thus

$$\frac{d}{d\phi}(\mathbf{1}_r) = \mathbf{1}_\phi$$

which leads to the result that $\mathbf{v} = \mathbf{1}_\phi\,\omega r_0$.
This problem could also have been solved in the manner of Example V.10.
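
The result of Example V.12 can be compared against a direct numerical derivative of the position vector. This is a minimal sketch: the orbit radius, angular rate, sample time, and function names are all assumptions chosen for the illustration, and a central difference stands in for the limit in (V.18).

```python
import math

def position(t, r0, omega):
    """Cartesian components of the position vector on the circular orbit."""
    phi = omega * t
    return (r0 * math.cos(phi), r0 * math.sin(phi), 0.0)

def velocity_numeric(t, r0, omega, dt=1e-6):
    """Central-difference estimate of dr/dt."""
    ahead, behind = position(t + dt, r0, omega), position(t - dt, r0, omega)
    return tuple((p - q) / (2.0 * dt) for p, q in zip(ahead, behind))

def velocity_formula(t, r0, omega):
    """omega * r0 times the azimuthal unit vector at the particle's position."""
    phi = omega * t
    unit_phi = (-math.sin(phi), math.cos(phi), 0.0)
    return tuple(omega * r0 * c for c in unit_phi)

v_num = velocity_numeric(0.7, 2.0, 3.0)
v_exact = velocity_formula(0.7, 2.0, 3.0)
print(v_num, v_exact)
```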

V.10 TANGENT LINES AND TANGENT PLANES

As indicated in the previous section, a space curve can be described by the vector
function r(s), in which s is a parameter (not necessarily distance or time). Since dr/ds
is tangent to the space curve, this provides a means for developing the equation of the
tangent line. As an example, in rectangular coordinates the space curve could be
represented by

$$\mathbf{r} = \mathbf{1}_x f_1(s) + \mathbf{1}_y f_2(s) + \mathbf{1}_z f_3(s)$$

so that

$$\frac{d\mathbf{r}}{ds} = \mathbf{1}_x f_1'(s) + \mathbf{1}_y f_2'(s) + \mathbf{1}_z f_3'(s)$$

If $P_0(x_0,y_0,z_0)$ is the point on the space curve corresponding to the parameter value
$s_0$, and $P(x,y,z)$ is any other point on the line which is tangent to the space curve at $P_0$,
then the equation of this tangent line is simply

$$\frac{x - x_0}{f_1'(s_0)} = \frac{y - y_0}{f_2'(s_0)} = \frac{z - z_0}{f_3'(s_0)} \tag{V.22}$$

EXAMPLE V.13

Consider once again the helix of Example V.5, given by the parametric equations

$$x = \cos t \qquad y = \sin t \qquad z = t$$

A point $P_0$ on this helix is given by the position vector

$$\mathbf{r}(t_0) = \mathbf{1}_x \cos t_0 + \mathbf{1}_y \sin t_0 + \mathbf{1}_z t_0$$

and, through use of (V.22), the line tangent to the helix at $P_0$ is seen to satisfy the equation

$$\frac{x - \cos t_0}{-\sin t_0} = \frac{y - \sin t_0}{\cos t_0} = z - t_0$$
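The tangent-line result for the helix can be illustrated with a short numerical sketch. The function names and the line parameter `lam` are illustrative choices, not part of the text; the code generates a point on the tangent line from the derivative dr/dt and confirms that it satisfies the symmetric equation obtained from (V.22).

```python
import math

def helix(t):
    """Position on the helix x = cos t, y = sin t, z = t."""
    return (math.cos(t), math.sin(t), t)

def helix_tangent(t):
    """dr/dt for the helix, i.e. the direction of the tangent line."""
    return (-math.sin(t), math.cos(t), 1.0)

def tangent_line_point(t0, lam):
    """Point reached by moving a parameter distance lam along the tangent at t0."""
    p0, d = helix(t0), helix_tangent(t0)
    return tuple(p + lam * di for p, di in zip(p0, d))

if __name__ == "__main__":
    t0, lam = 1.0, 0.7
    x, y, z = tangent_line_point(t0, lam)
    # The three ratios of the symmetric form should agree with each other.
    print((x - math.cos(t0)) / (-math.sin(t0)),
          (y - math.sin(t0)) / math.cos(t0),
          z - t0)
```

The symmetric form breaks down where sin t0 or cos t0 vanishes; the parametric form used in `tangent_line_point` does not have that limitation.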

Consider next a scalar function $F(x,y,z)$. The locus of points for which

$$F(x,y,z) = F(x_0,y_0,z_0) = K \tag{V.23}$$

with K a constant, is simply the collection of points $P(x,y,z)$ for which the function F
has the same value that it does at the point $P_0(x_0,y_0,z_0)$. This locus is usually a surface.
This fact may be appreciated by recognizing that for most functions F of practical
interest, (V.23) can be solved in the form

$$z = f(x,y) \tag{V.24}$$

Equation (V.24) determines a point $P(x,y,z)$ for each value of x and y. When x and y
vary continuously over some region, $P(x,y,z)$ will, in general, vary continuously over
the corresponding portion of a surface in space.
In what follows attention will be restricted to functions F for which (V.23) is the
equation of a surface. At the point $P_0$, the total differential is

$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial z}\,dz \tag{V.25}$$

in which it is assumed implicitly that the three partial derivatives in (V.25) have been
evaluated at $P_0$. If the neighboring point $P_1$ reached by the increments dx, dy, and dz
is also on the surface defined by (V.23), then $dF = 0$. It is interesting to interpret this
as an equation of the form

$$\mathbf{N} \cdot d\boldsymbol{\ell} = 0 \tag{V.26}$$

in which

$$\mathbf{N} = \mathbf{1}_x \frac{\partial F}{\partial x} + \mathbf{1}_y \frac{\partial F}{\partial y} + \mathbf{1}_z \frac{\partial F}{\partial z} \tag{V.27}$$

and

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz \tag{V.28}$$

Since $P_0$ and $P_1$ are both points in the surface, $d\boldsymbol{\ell}$ (which connects them) is tangent
to a space curve which lies in the surface and passes through $P_0$ and $P_1$. N is perpendicular to this space curve, and since $P_1$ can be any neighboring point in the surface, it
follows that N is normal to all the space curves in the surface which pass through the
point $P_0$, and is therefore normal to the surface itself at the point $P_0$.
If one refers to the development of Section V.7, the equation of the tangent plane at
$P_0$ is of the form $Ax + By + Cz = D$, and

$$A = \frac{\partial F}{\partial x} \qquad B = \frac{\partial F}{\partial y} \qquad C = \frac{\partial F}{\partial z}$$

The value of the constant D may be found by inserting the point $P_0$ into the general
equation of the plane, which gives

$$x_0 \frac{\partial F}{\partial x} + y_0 \frac{\partial F}{\partial y} + z_0 \frac{\partial F}{\partial z} = D$$

Thus the equation of the tangent plane is

$$(x - x_0)\frac{\partial F}{\partial x} + (y - y_0)\frac{\partial F}{\partial y} + (z - z_0)\frac{\partial F}{\partial z} = 0 \tag{V.29}$$

This result could have been achieved in another way. Since $\mathbf{1}_x(x - x_0) + \mathbf{1}_y(y - y_0) + \mathbf{1}_z(z - z_0)$ is a vector which lies in the tangent plane, its dot product with N, a vector
normal to the plane, should be zero.
EXAMPLE V.14

Let the problem be posed to find the general expression for the plane tangent to a sphere of
radius r. Since the equation of a sphere can be written

$$F(x,y,z) = x^2 + y^2 + z^2 - r^2 = 0$$

it follows that, at the point $P_0(x_0,y_0,z_0)$,

$$\frac{\partial F}{\partial x} = 2x_0 \qquad \frac{\partial F}{\partial y} = 2y_0 \qquad \frac{\partial F}{\partial z} = 2z_0$$

When one uses (V.29), if $P(x,y,z)$ is any point in the tangent plane, then

$$2x_0(x - x_0) + 2y_0(y - y_0) + 2z_0(z - z_0) = 0$$

which can be rewritten as

$$x_0 x + y_0 y + z_0 z = r^2$$

This is the general expression for the equation of a plane tangent to a sphere of radius r at the
point $(x_0,y_0,z_0)$. As a special case, if the point of tangency is $(r,0,0)$, then the equation of the
tangent plane is $x = r$, an expected result.
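The tangent-plane recipe of (V.27) and (V.29) lends itself to a small numerical sketch. The helper names below are hypothetical, the partial derivatives are estimated by central differences instead of being taken analytically, and the sphere of radius 3 with tangency point (1, 2, 2) is an arbitrary test case.

```python
def gradient(F, p, h=1e-6):
    """Central-difference estimate of (dF/dx, dF/dy, dF/dz) at point p."""
    g = []
    for i in range(3):
        hi = [p[j] + (h if j == i else 0.0) for j in range(3)]
        lo = [p[j] - (h if j == i else 0.0) for j in range(3)]
        g.append((F(*hi) - F(*lo)) / (2.0 * h))
    return tuple(g)

def tangent_plane(F, p0):
    """Return (A, B, C, D) of the plane Ax + By + Cz = D tangent to F = const at p0."""
    A, B, C = gradient(F, p0)
    D = A * p0[0] + B * p0[1] + C * p0[2]
    return A, B, C, D

def sphere(x, y, z, radius=3.0):
    """F(x, y, z) = x^2 + y^2 + z^2 - r^2 for an assumed radius of 3."""
    return x * x + y * y + z * z - radius * radius

if __name__ == "__main__":
    # (1, 2, 2) lies on the sphere of radius 3, so D / 2 should approximate r^2 = 9.
    print(tangent_plane(sphere, (1.0, 2.0, 2.0)))
```

Dividing the returned coefficients by 2 gives the form $x_0 x + y_0 y + z_0 z = r^2$ found in Example V.14.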

V.ll

GENERALIZED COORDINATES

It may be observed that the definitions of all vector operations so far introduced
(addition, subtraction, the scalar and vector products) are independent of any coordinate
system. This will also be true of all vector operations to be defined subsequently. This
feature of the vector method is one reason for its wide applicability to physical problems. A physical law is independent of any coordinate system used to describe it; the
mathematics employed should reflect this independence.
This does not mean that one should never use coordinate systems with vector analysis.
On the contrary, it means that any admissible coordinate system may be used as the
frame of reference for vector analysis with the assurance that the results have physical
applicability. A common choice is Cartesian coordinates, and earlier sections have
shown the forms which some vector relations assume in that frame.
There are many other useful coordinate systems, and which one is employed depends
on the problem being considered. The physical symmetry usually suggests the proper
choice. Thus plane electromagnetic waves are most simply described in Cartesian
coordinates because the equation of a plane has its simplest form in that system.
Cylindrical coordinates are used to show the transfer of power in a coaxial cable, since
the cable consists of two concentric cylindrical conductors. Radiation from antennas is
best expressed in spherical coordinates because, from a great distance, antennas appear
to be point sources. The elliptical cross section of the electrodes of some modern vacuum
tubes indicates the choice of elliptical cylindrical coordinates, etc.
It is a needless expenditure of effort to derive the forms for all vector operations in
each and every coordinate system which is considered to be potentially useful. A
preferable procedure is the following:

1. Establish the form that the vector operation assumes in Cartesian coordinates.
2. Specify a general coordinate system by the transformation equations

$$u = g_1(x,y,z) \qquad v = g_2(x,y,z) \qquad w = g_3(x,y,z) \tag{V.30}$$

3. Transform the Cartesian expression found in Step 1 by means of Equations (V.30)
to obtain the general expression for the vector operation in (u,v,w) coordinates.

The specific expression desired may then be found by substituting the appropriate
coordinate system for (u,v,w) in the general expression deduced in Step 3.
The only restriction on the procedure just outlined is that (u,v,w) should constitute
an admissible coordinate system. Admissibility in this context arises from the notion
that physical space can be described by a Cartesian reference frame, in the sense that
to each physical point in space there corresponds a unique triplet of numbers (x,y,z).
Similarly, to each triplet of numbers (x,y,z) there corresponds a unique point in physical
space.† This notion is often stated more briefly by saying there is a one-to-one correspondence between the Cartesian coordinates and physical space. If (u,v,w) is to be
an admissible coordinate system, there must also be a one-to-one correspondence
between points in physical space and the triplets of numbers (u,v,w). Mathematically,
this means that the functions (V.30) must be single-valued and defined for all values of
x, y, and z. This will ensure that to every triplet (x,y,z) there corresponds a unique
triplet (u,v,w). It means further that it must be possible to solve (V.30) in the form

$$x = G_1(u,v,w) \qquad y = G_2(u,v,w) \qquad z = G_3(u,v,w) \tag{V.31}$$

in which $G_1$, $G_2$, and $G_3$ are single-valued functions, defined for all values of u, v, and w.
This will ensure that to every triplet (u,v,w) there corresponds a unique triplet (x,y,z).
When these conditions are met, (u,v,w) is a completely admissible coordinate system.
A criterion for admissibility can be developed with the aid of a geometric interpretation of Equations (V.30). If the discussion of Section V.10 is recalled, $u = g_1(x,y,z)$
may be thought of as a family of surfaces, with different members of the family identified
by different values of u. Similarly, $v = g_2(x,y,z)$ and $w = g_3(x,y,z)$ may be treated as
families of surfaces. These three families should be mutually intersecting so that the
three surfaces $u_0 = g_1(x,y,z)$, $v_0 = g_2(x,y,z)$, and $w_0 = g_3(x,y,z)$ have in common the
single point $P_0(u_0,v_0,w_0)$.
The surfaces $u_0$ and $v_0$ intersect in a line whose parametric equations are given by

$$x = G_1(u_0,v_0,w) \qquad y = G_2(u_0,v_0,w) \qquad z = G_3(u_0,v_0,w) \tag{V.32}$$

This is called a w line because, along its length, w varies but u and v do not. It intersects
the v line

$$x = G_1(u_0,v,w_0) \qquad y = G_2(u_0,v,w_0) \qquad z = G_3(u_0,v,w_0) \tag{V.33}$$

and the u line

$$x = G_1(u,v_0,w_0) \qquad y = G_2(u,v_0,w_0) \qquad z = G_3(u,v_0,w_0) \tag{V.34}$$

at the point $P_0(u_0,v_0,w_0)$. These are the three coordinate lines through the point. The
geometric features just described are suggested by Figure V.14.

† Such a concept of space is usually called Newtonian.

FIGURE V.14 Generalized coordinates.

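The displacement

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz = \left[\mathbf{1}_x \frac{\partial G_1}{\partial w} + \mathbf{1}_y \frac{\partial G_2}{\partial w} + \mathbf{1}_z \frac{\partial G_3}{\partial w}\right] dw \tag{V.35}$$

obtained by forming differentials from Equations (V.32), lies in the w line at $P_0$ if the
partial derivatives in (V.35) are evaluated at $P_0$. Thus the vector

$$\mathbf{T}_w = \mathbf{1}_x \frac{\partial G_1}{\partial w} + \mathbf{1}_y \frac{\partial G_2}{\partial w} + \mathbf{1}_z \frac{\partial G_3}{\partial w} \tag{V.36}$$

is tangent to the w line at $P_0$. Similarly, the vectors

$$\mathbf{T}_v = \mathbf{1}_x \frac{\partial G_1}{\partial v} + \mathbf{1}_y \frac{\partial G_2}{\partial v} + \mathbf{1}_z \frac{\partial G_3}{\partial v} \tag{V.37}$$

$$\mathbf{T}_u = \mathbf{1}_x \frac{\partial G_1}{\partial u} + \mathbf{1}_y \frac{\partial G_2}{\partial u} + \mathbf{1}_z \frac{\partial G_3}{\partial u} \tag{V.38}$$

are tangent to the v and u lines at $P_0$ respectively.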

If no one of these three tangent vectors is a null vector, and if they are not coplanar,
then the three surfaces $u_0$, $v_0$, and $w_0$ will intersect in a point $P_0$ (rather than, say, in a
line). When the result of Example V.8 is employed, it follows that the necessary and
sufficient condition to ensure a one-point intersection is

$$J = \begin{vmatrix} \dfrac{\partial G_1}{\partial u} & \dfrac{\partial G_2}{\partial u} & \dfrac{\partial G_3}{\partial u} \\[2ex] \dfrac{\partial G_1}{\partial v} & \dfrac{\partial G_2}{\partial v} & \dfrac{\partial G_3}{\partial v} \\[2ex] \dfrac{\partial G_1}{\partial w} & \dfrac{\partial G_2}{\partial w} & \dfrac{\partial G_3}{\partial w} \end{vmatrix} \neq 0 \tag{V.39}$$

Since the inequality (V.39) is also the necessary and sufficient condition that (V.30)
can be derived from (V.31), it follows that the (u,v,w) coordinate system is completely
admissible if $J \neq 0$ for all values of u, v, and w.
The determinant of (V.39) is known as the Jacobian.
EXAMPLE V.15

As an illustration of the Jacobian test for admissibility, consider the transformation to
cylindrical coordinates defined by

$$r = [x^2 + y^2]^{1/2} \qquad \phi = \arctan\frac{y}{x} \qquad z = z \tag{V.40}$$

These equations represent, in turn, a family of concentric cylinders, a family of half-planes
which have the Z axis in common, and a family of planes which are all perpendicular to the
Z axis. The geometric relations between cylindrical and Cartesian coordinates were shown
in Figure V.7.
Equations (V.40) may be solved to give

$$x = r\cos\phi \qquad y = r\sin\phi \qquad z = z \tag{V.41}$$

The Jacobian for cylindrical coordinates is thus

$$J = \begin{vmatrix} \cos\phi & \sin\phi & 0 \\ -r\sin\phi & r\cos\phi & 0 \\ 0 & 0 & 1 \end{vmatrix} = r \tag{V.42}$$

Except along the Z axis where $r = 0$, cylindrical coordinates are seen to be an admissible
coordinate transformation. The reason that the Z axis is inadmissible can be understood if
one considers the triplet (0,0,3) in Cartesian coordinates. This triplet corresponds to all the
triplets $(0,\phi,3)$ in cylindrical coordinates for $0 \le \phi \le 2\pi$. Thus a one-to-one correspondence
does not exist for this point or any other on the Z axis.
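The Jacobian test can also be carried out numerically for any transformation of the form (V.31). The following sketch is illustrative only: the function names are hypothetical, the partial derivatives in (V.39) are approximated by central differences, and cylindrical coordinates serve as the test case so that the result can be compared with (V.42).

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def jacobian(G, q, h=1e-6):
    """Estimate J of (V.39) for (x, y, z) = G(u, v, w) at q = (u, v, w).

    Row i holds the derivatives of G1, G2, G3 with respect to the i-th coordinate.
    """
    rows = []
    for i in range(3):
        hi, lo = list(q), list(q)
        hi[i] += h
        lo[i] -= h
        ahead, behind = G(*hi), G(*lo)
        rows.append([(a - b) / (2.0 * h) for a, b in zip(ahead, behind)])
    return det3(rows)

def cylindrical_to_cartesian(r, phi, z):
    """Equations (V.41)."""
    return (r * math.cos(phi), r * math.sin(phi), z)

if __name__ == "__main__":
    print(jacobian(cylindrical_to_cartesian, (2.0, 0.7, 1.0)))  # close to r = 2
    print(jacobian(cylindrical_to_cartesian, (0.0, 0.7, 1.0)))  # close to 0 on the Z axis
```

A vanishing estimate at r = 0 reflects the loss of one-to-one correspondence on the Z axis noted in the example.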

EXAMPLE V.16

Another type of transformation, of importance in classical mechanics, connects one Cartesian
coordinate system to another. With reference to the figure, let XYZ and X'Y'Z' be the two
sets of axes and let the origin of the primed coordinate system occupy the point $(x_0,y_0,z_0)$ in
the unprimed coordinate system at time $t = 0$. Further, let the primed origin have the constant velocity $\mathbf{u} = \mathbf{1}_x u_x + \mathbf{1}_y u_y + \mathbf{1}_z u_z$ relative to the unprimed system. Also, let $\cos xx'$,
$\cos xy'$, ..., $\cos zz'$ be the cosines of the angles between the various axes of the two frames.

If (x,y,z) is the position of a particle at time t, as seen by an observer O who is stationary in
the unprimed system, and if (x',y',z') is the position of the same particle, at the same time,†
as seen by an observer O' who is stationary in the primed system, then

$$\mathbf{r} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z z$$

is the instantaneous position vector which O attaches to the particle, and

$$\mathbf{r}' = \mathbf{1}_{x'} x' + \mathbf{1}_{y'} y' + \mathbf{1}_{z'} z'$$

is the instantaneous position vector which O' attaches to the particle. Additionally, O
describes the instantaneous position of the primed origin by the vector

$$\mathbf{R} = \mathbf{1}_x (x_0 + u_x t) + \mathbf{1}_y (y_0 + u_y t) + \mathbf{1}_z (z_0 + u_z t)$$

If it is assumed that the two observers agree about distance measurements (another classical
assumption) then these position vectors can be connected by the relation

$$\mathbf{r}' = \mathbf{r} - \mathbf{R}$$

If this equation is dotted successively with $\mathbf{1}_{x'}$, $\mathbf{1}_{y'}$, and $\mathbf{1}_{z'}$ the result is the coordinate transformation

$$\begin{aligned}
x' &= (x - x_0 - u_x t)\cos xx' + (y - y_0 - u_y t)\cos yx' + (z - z_0 - u_z t)\cos zx' \\
y' &= (x - x_0 - u_x t)\cos xy' + (y - y_0 - u_y t)\cos yy' + (z - z_0 - u_z t)\cos zy' \\
z' &= (x - x_0 - u_x t)\cos xz' + (y - y_0 - u_y t)\cos yz' + (z - z_0 - u_z t)\cos zz'
\end{aligned} \tag{V.43}$$

Equations (V.43) are known as the most general Galilean transformation. Their physical
interpretation is that the primed system is moving relative to the unprimed system at a
speed $U = (u_x^2 + u_y^2 + u_z^2)^{1/2}$. This motion is in an arbitrary direction with respect to the
XYZ axes and is also in an arbitrary direction with respect to the X'Y'Z' axes. Furthermore,
the primed origin is in an arbitrary position relative to the unprimed origin at $t = 0$. The
special case $U = 0$ yields the most general static transformation between two Cartesian
coordinate systems.

† This is a classical assumption that time is the same in both frames, an assumption which is challenged
in Einstein's special relativity. See Chap. 2.

The Jacobian can be deduced readily from (V.43) and is

$$J = \begin{vmatrix} \cos xx' & \cos yx' & \cos zx' \\ \cos xy' & \cos yy' & \cos zy' \\ \cos xz' & \cos yz' & \cos zz' \end{vmatrix} \tag{V.44}$$

It is left as an exercise to show that (V.44) is different from zero, and therefore that the
general Galilean transformation is completely admissible. (See Problem V.21 at the end of
this supplement.)
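The transformation (V.43) can be sketched in a few lines of code. In this illustrative example the primed axes are supplied as unit vectors expressed in unprimed components, so that each row holds the three direction cosines appearing in one line of (V.43). The helper `axes_rotated_about_z` and all numerical values are assumptions made purely for the demonstration.

```python
import math

def galilean(point, t, origin0, u, primed_axes):
    """Apply (V.43): primed coordinates of `point` at time t.

    origin0     -- position of the primed origin at t = 0
    u           -- constant velocity of the primed origin
    primed_axes -- unit vectors of X', Y', Z' written in XYZ components
    """
    rel = [p - o - ui * t for p, o, ui in zip(point, origin0, u)]  # r - R
    return tuple(sum(a * b for a, b in zip(axis, rel)) for axis in primed_axes)

def axes_rotated_about_z(alpha):
    """Hypothetical choice of primed axes: XYZ rotated by alpha about Z."""
    c, s = math.cos(alpha), math.sin(alpha)
    return ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))

if __name__ == "__main__":
    axes = axes_rotated_about_z(0.3)
    origin0, u, t = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0), 4.0
    p = galilean((1.0, 1.0, 1.0), t, origin0, u, axes)
    q = galilean((4.0, 5.0, 1.0), t, origin0, u, axes)
    # The two observers should agree on the separation of the two points.
    print(math.dist(p, q))
```

Because the rows of `primed_axes` are orthonormal, separations between simultaneous points are unchanged, which is the classical agreement about distance measurements assumed in the example.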

V.12

ELEMENTARY GEOMETRY IN GENERALIZED COORDINATES

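By virtue of Equation (V.8), the angle of intersection of a u line with a v line can be
expressed in the form

$$\cos\theta_{uv} = \frac{\mathbf{T}_u \cdot \mathbf{T}_v}{|\mathbf{T}_u|\,|\mathbf{T}_v|}$$

in which $\mathbf{T}_u$ and $\mathbf{T}_v$ are given by (V.38) and (V.37). Therefore a necessary and sufficient
condition that the intersection of a u line with a v line be a right angle is

$$\frac{\partial G_1}{\partial u}\frac{\partial G_1}{\partial v} + \frac{\partial G_2}{\partial u}\frac{\partial G_2}{\partial v} + \frac{\partial G_3}{\partial u}\frac{\partial G_3}{\partial v} = 0 \tag{V.45}$$

Similarly,

$$\frac{\partial G_1}{\partial v}\frac{\partial G_1}{\partial w} + \frac{\partial G_2}{\partial v}\frac{\partial G_2}{\partial w} + \frac{\partial G_3}{\partial v}\frac{\partial G_3}{\partial w} = 0 \tag{V.46}$$

$$\frac{\partial G_1}{\partial w}\frac{\partial G_1}{\partial u} + \frac{\partial G_2}{\partial w}\frac{\partial G_2}{\partial u} + \frac{\partial G_3}{\partial w}\frac{\partial G_3}{\partial u} = 0 \tag{V.47}$$

are the necessary and sufficient conditions that the intersections of a w line with a v
line and a u line respectively be right angles.
A coordinate system for which Equations (V.45), (V.46), and (V.47) are satisfied at
all points is said to be orthogonal.† Computation of physical quantities is vastly simplified in orthogonal coordinate systems, and in the remainder of this supplement only
orthogonal systems will be considered.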
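An example of this simplification is the expression for an element of length in generalized coordinates. Since

$$d\ell = [(dx)^2 + (dy)^2 + (dz)^2]^{1/2}$$

forming the total differentials of Equations (V.31) yields the transformation to (u,v,w)
coordinates, namely,

$$\begin{aligned}
d\ell = \Bigg\{ &\left[\left(\frac{\partial G_1}{\partial u}\right)^2 + \left(\frac{\partial G_2}{\partial u}\right)^2 + \left(\frac{\partial G_3}{\partial u}\right)^2\right](du)^2
+ \left[\left(\frac{\partial G_1}{\partial v}\right)^2 + \left(\frac{\partial G_2}{\partial v}\right)^2 + \left(\frac{\partial G_3}{\partial v}\right)^2\right](dv)^2 \\
&+ \left[\left(\frac{\partial G_1}{\partial w}\right)^2 + \left(\frac{\partial G_2}{\partial w}\right)^2 + \left(\frac{\partial G_3}{\partial w}\right)^2\right](dw)^2
+ 2\left[\frac{\partial G_1}{\partial u}\frac{\partial G_1}{\partial v} + \frac{\partial G_2}{\partial u}\frac{\partial G_2}{\partial v} + \frac{\partial G_3}{\partial u}\frac{\partial G_3}{\partial v}\right]du\,dv \\
&+ 2\left[\frac{\partial G_1}{\partial u}\frac{\partial G_1}{\partial w} + \frac{\partial G_2}{\partial u}\frac{\partial G_2}{\partial w} + \frac{\partial G_3}{\partial u}\frac{\partial G_3}{\partial w}\right]du\,dw
+ 2\left[\frac{\partial G_1}{\partial v}\frac{\partial G_1}{\partial w} + \frac{\partial G_2}{\partial v}\frac{\partial G_2}{\partial w} + \frac{\partial G_3}{\partial v}\frac{\partial G_3}{\partial w}\right]dv\,dw \Bigg\}^{1/2}
\end{aligned}$$

† It should be noted that when a coordinate system is orthogonal not only do the coordinate lines
intersect at right angles but the coordinate surfaces do as well.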

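When attention is restricted to orthogonal coordinate systems, this reduces to

$$d\ell = [(h_1\,du)^2 + (h_2\,dv)^2 + (h_3\,dw)^2]^{1/2} \tag{V.48}$$

in which the $h_i$ are called scale factors and are given by

$$h_1 = \left[\left(\frac{\partial G_1}{\partial u}\right)^2 + \left(\frac{\partial G_2}{\partial u}\right)^2 + \left(\frac{\partial G_3}{\partial u}\right)^2\right]^{1/2} \tag{V.49}$$

$$h_2 = \left[\left(\frac{\partial G_1}{\partial v}\right)^2 + \left(\frac{\partial G_2}{\partial v}\right)^2 + \left(\frac{\partial G_3}{\partial v}\right)^2\right]^{1/2} \tag{V.50}$$

$$h_3 = \left[\left(\frac{\partial G_1}{\partial w}\right)^2 + \left(\frac{\partial G_2}{\partial w}\right)^2 + \left(\frac{\partial G_3}{\partial w}\right)^2\right]^{1/2} \tag{V.51}$$

A general space curve can be characterized by the parametric equations

$$u = U(s) \qquad v = V(s) \qquad w = W(s) \tag{V.52}$$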

and the length of a section of this curve is therefore

$$\ell = \int d\ell = \int_{s_1}^{s_2} \left[\left(h_1 \frac{du}{ds}\right)^2 + \left(h_2 \frac{dv}{ds}\right)^2 + \left(h_3 \frac{dw}{ds}\right)^2\right]^{1/2} ds \tag{V.53}$$

EXAMPLE V.17

An element of length in cylindrical coordinates can be found if one chooses Equations
(V.41) to define the functions $G_1$, $G_2$, and $G_3$ and performs the differentiations indicated in
Equations (V.49)-(V.51). One obtains

$$\begin{aligned}
h_1 &= [(\cos\phi)^2 + (\sin\phi)^2 + (0)^2]^{1/2} = 1 \\
h_2 &= [(-r\sin\phi)^2 + (r\cos\phi)^2 + (0)^2]^{1/2} = r \\
h_3 &= [(0)^2 + (0)^2 + (1)^2]^{1/2} = 1
\end{aligned}$$

from which

$$d\ell = [(dr)^2 + (r\,d\phi)^2 + (dz)^2]^{1/2}$$

With reference to Figure V.7, this result is consistent with the observation that elemental
displacements from the point P in the directions of the three unit vectors are $dr$, $r\,d\phi$, and $dz$.
EXAMPLE V.18

In cylindrical coordinates the parametric equations of a helix are

$$r = r_0 \qquad \phi = a_1 s \qquad z = a_2 s$$

in which $r_0$, $a_1$, and $a_2$ are constants and s is a parameter. The length of one turn of this helix
can be determined with the aid of (V.53).

$$\ell = \int_{s_1}^{s_2} [(a_1 r_0)^2 + (a_2)^2]^{1/2}\,ds = (s_2 - s_1)[(a_1 r_0)^2 + a_2^2]^{1/2}$$

But $a_1 s_2 - a_1 s_1 = 2\pi$ and therefore

$$\ell = 2\pi\left[r_0^2 + \left(\frac{a_2}{a_1}\right)^2\right]^{1/2}$$
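The arc-length formula (V.53) can be evaluated numerically for any curve given in orthogonal coordinates. The sketch below is a hedged illustration: `curve_length` and its midpoint summation are an assumed discretization rather than anything prescribed by the text, and the helix constants are arbitrary. The result is compared with the closed form found in Example V.18.

```python
import math

def curve_length(scale_factors, curve, s1, s2, n=2000):
    """Approximate (V.53) by summing [(h1 du)^2 + (h2 dv)^2 + (h3 dw)^2]^(1/2)."""
    ds = (s2 - s1) / n
    total = 0.0
    for k in range(n):
        s_mid = s1 + (k + 0.5) * ds
        start, end = curve(s_mid - 0.5 * ds), curve(s_mid + 0.5 * ds)
        h = scale_factors(*curve(s_mid))  # scale factors at the segment midpoint
        total += math.sqrt(sum((hi * (b - a)) ** 2
                               for hi, a, b in zip(h, start, end)))
    return total

def cylindrical_scale_factors(r, phi, z):
    """h1 = 1, h2 = r, h3 = 1 from Example V.17."""
    return (1.0, r, 1.0)

R0, A1, A2 = 1.5, 2.0, 0.5  # hypothetical helix constants

def helix(s):
    """r = r0, phi = a1 s, z = a2 s."""
    return (R0, A1 * s, A2 * s)

if __name__ == "__main__":
    one_turn = 2.0 * math.pi / A1
    numeric = curve_length(cylindrical_scale_factors, helix, 0.0, one_turn)
    closed = 2.0 * math.pi * math.sqrt(R0 ** 2 + (A2 / A1) ** 2)
    print(numeric, closed)
```

For the helix the integrand is constant, so the two values agree to rounding; for a general curve the agreement improves as `n` is increased.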

In generalized coordinates, the equation of a surface may be expressed in the form

$$w = w(u,v) \tag{V.54}$$

The family of coordinate surfaces $u = g_1(x,y,z)$ will intersect this surface in a grid of
lines. The family of coordinate surfaces $v = g_2(x,y,z)$ will also intersect it in a grid of
lines. These two grids will cross each other and thus divide the surface into a mesh of
surface elements. The area of the element cut out by the surfaces $u_0$, $u_0 + du$, $v_0$, and
$v_0 + dv$ can be found as follows:
In Cartesian coordinates the parametric equations of the line of intersection of
(V.54) and $u_0$ are

$$x = G_1[u_0,v,w(u_0,v)] \qquad y = G_2[u_0,v,w(u_0,v)] \qquad z = G_3[u_0,v,w(u_0,v)] \tag{V.55}$$

so that

$$\begin{aligned}
dx &= \frac{\partial G_1}{\partial v}\,dv + \frac{\partial G_1}{\partial w}\frac{\partial w}{\partial v}\,dv \\
dy &= \frac{\partial G_2}{\partial v}\,dv + \frac{\partial G_2}{\partial w}\frac{\partial w}{\partial v}\,dv \\
dz &= \frac{\partial G_3}{\partial v}\,dv + \frac{\partial G_3}{\partial w}\frac{\partial w}{\partial v}\,dv
\end{aligned}$$
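The elemental length of the line (V.55) which is contained between the surfaces $v_0$ and
$v_0 + dv$ is thus given by the vector

$$\mathbf{V}\,dv = \left[\mathbf{1}_x\left(\frac{\partial G_1}{\partial v} + \frac{\partial G_1}{\partial w}\frac{\partial w}{\partial v}\right) + \mathbf{1}_y\left(\frac{\partial G_2}{\partial v} + \frac{\partial G_2}{\partial w}\frac{\partial w}{\partial v}\right) + \mathbf{1}_z\left(\frac{\partial G_3}{\partial v} + \frac{\partial G_3}{\partial w}\frac{\partial w}{\partial v}\right)\right]dv \tag{V.56}$$

Similarly, the elemental length of the line of intersection of (V.54) and $v_0$ which is contained between the surfaces $u_0$ and $u_0 + du$ can be expressed by the vector

$$\mathbf{U}\,du = \left[\mathbf{1}_x\left(\frac{\partial G_1}{\partial u} + \frac{\partial G_1}{\partial w}\frac{\partial w}{\partial u}\right) + \mathbf{1}_y\left(\frac{\partial G_2}{\partial u} + \frac{\partial G_2}{\partial w}\frac{\partial w}{\partial u}\right) + \mathbf{1}_z\left(\frac{\partial G_3}{\partial u} + \frac{\partial G_3}{\partial w}\frac{\partial w}{\partial u}\right)\right]du \tag{V.57}$$

The area of this surface element is the magnitude of the vector

$$d\mathbf{S} = \mathbf{U} \times \mathbf{V}\,du\,dv \tag{V.58}$$

If C is any simple closed curve lying wholly in the surface (V.54), the area enclosed
by C is

$$S = \int dS = \int_{u_1}^{u_2} du \int_{v_1(u)}^{v_2(u)} |\mathbf{U} \times \mathbf{V}|\,dv \tag{V.59}$$

in which $v_1(u)$ and $v_2(u)$ are the projections of the two partial curves into which C is
divided by the extreme values $u_1$ and $u_2$.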
A common problem is the computation of an area in one of the coordinate surfaces.
This corresponds to setting $w = w_0$ in (V.54). In this case (V.59) reduces to

$$S = \int_{u_1}^{u_2} du \int_{v_1(u)}^{v_2(u)} h_1 h_2\,dv \tag{V.60}$$

EXAMPLE V.19

Compute the total surface area of the figure bounded by the plane $z = -2$, the cylinder
$x^2 + y^2 = 2^2$, and the plane $z = x + 2$.

In cylindrical coordinates these three surfaces can be described by the equations $z = -2$,
$r = 2$, and $z = r\cos\phi + 2$. The first and second are seen to be coordinate surfaces and
(V.60) may be used. Thus

$$S_1 = \int_0^{2\pi} d\phi \int_0^2 r\,dr = 4\pi$$

$$S_2 = \int_0^{2\pi} d\phi \int_{-2}^{2\cos\phi + 2} 2\,dz = 2\int_0^{2\pi} (2\cos\phi + 4)\,d\phi = 16\pi$$

The third surface is not a coordinate surface so the more general expression (V.59) must be
employed. Making use first of (V.56) and (V.57), one obtains

$$\mathbf{U} = \mathbf{1}_x \cos\phi + \mathbf{1}_y \sin\phi + \mathbf{1}_z \cos\phi$$

$$\mathbf{V} = -\mathbf{1}_x r\sin\phi + \mathbf{1}_y r\cos\phi - \mathbf{1}_z r\sin\phi$$

so that

$$\mathbf{U} \times \mathbf{V} = r[\mathbf{1}_x(-\sin^2\phi - \cos^2\phi) + \mathbf{1}_y(-\sin\phi\cos\phi + \sin\phi\cos\phi) + \mathbf{1}_z(\cos^2\phi + \sin^2\phi)]$$

$$|\mathbf{U} \times \mathbf{V}| = \sqrt{2}\,r$$

$$S_3 = \sqrt{2}\int_0^{2\pi} d\phi \int_0^2 r\,dr = 4\pi\sqrt{2}$$

Thus

$$S = S_1 + S_2 + S_3 = 25.66\pi$$
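The slanted face of Example V.19 offers a convenient check of (V.59). The sketch below forms U and V as given in the example, takes the magnitude of their cross product, and sums it over a midpoint grid in r and φ. The grid sizes and function names are illustrative assumptions.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def edge_vectors(r, phi):
    """U and V of (V.57) and (V.56) for the surface z = r cos(phi) + 2."""
    c, s = math.cos(phi), math.sin(phi)
    U = (c, s, c)
    V = (-r * s, r * c, -r * s)
    return U, V

def slanted_area(radius=2.0, n_r=100, n_phi=100):
    """Midpoint-rule estimate of (V.59) over 0 <= r <= radius, 0 <= phi <= 2 pi."""
    dr, dphi = radius / n_r, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            normal = cross(*edge_vectors(r, phi))
            total += math.sqrt(sum(c * c for c in normal)) * dr * dphi
    return total

if __name__ == "__main__":
    s3 = slanted_area()
    total = 4.0 * math.pi + 16.0 * math.pi + s3
    print(s3 / math.pi, total / math.pi)
```

Adding the two coordinate-surface areas to the numerical estimate reproduces the total of roughly 25.66π quoted in the example.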

The volume of an object may be computed if one finds the expression for the volume
of a differential element and then sums the volumes of all the elements contained within
the boundary surface of the object. In generalized coordinates the differential element
consists of the region within the surfaces $u_0$, $u_0 + du$, $v_0$, $v_0 + dv$, $w_0$, and $w_0 + dw$. The
edges of this element are sections of coordinate lines and are given by the vectors

$$\mathbf{U}\,du = \left[\mathbf{1}_x \frac{\partial G_1}{\partial u} + \mathbf{1}_y \frac{\partial G_2}{\partial u} + \mathbf{1}_z \frac{\partial G_3}{\partial u}\right]du$$

$$\mathbf{V}\,dv = \left[\mathbf{1}_x \frac{\partial G_1}{\partial v} + \mathbf{1}_y \frac{\partial G_2}{\partial v} + \mathbf{1}_z \frac{\partial G_3}{\partial v}\right]dv$$

$$\mathbf{W}\,dw = \left[\mathbf{1}_x \frac{\partial G_1}{\partial w} + \mathbf{1}_y \frac{\partial G_2}{\partial w} + \mathbf{1}_z \frac{\partial G_3}{\partial w}\right]dw$$

When one makes use of Example V.8, the volume of this element is

$$(\mathbf{U} \times \mathbf{V}) \cdot \mathbf{W}\,du\,dv\,dw = J\,du\,dv\,dw \tag{V.61}$$

in which J is the Jacobian, given by (V.39). Since attention is being restricted to
orthogonal coordinate systems, $J = h_1 h_2 h_3$, and the volume contained within a simple
closed surface S is given by

$$V = \int_{u_1}^{u_2} du \int_{v_1(u)}^{v_2(u)} dv \int_{w_1(u,v)}^{w_2(u,v)} h_1 h_2 h_3\,dw \tag{V.62}$$

EXAMPLE V.20

Find the volume of the figure bounded by the surfaces $z = -2$, $x^2 + y^2 = 2^2$, and $z = x + 2$.
(Cf. Example V.19.)

Using (V.62), one obtains

$$V = \int_0^2 dr \int_0^{2\pi} d\phi \int_{-2}^{r\cos\phi + 2} r\,dz = 16\pi$$

V.13

ADDITION, SUBTRACTION, AND MULTIPLICATION IN GENERALIZED ORTHOGONAL COORDINATES

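If attention is centered on the vector operations already defined, in generalized orthogonal coordinates a vector may be expressed in terms of its components by writing

$$\mathbf{a} = \mathbf{1}_u a_u + \mathbf{1}_v a_v + \mathbf{1}_w a_w$$

in which the unit vectors are taken in the positive directions along the coordinate lines
at the point where a is acting.
Addition or subtraction of two vectors then takes the form

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_u(a_u \pm b_u) + \mathbf{1}_v(a_v \pm b_v) + \mathbf{1}_w(a_w \pm b_w) \tag{V.63}$$

Since (u,v,w) is an orthogonal coordinate system,

$$\mathbf{a} \cdot \mathbf{b} = a_u b_u + a_v b_v + a_w b_w \tag{V.64}$$

and

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{1}_u & \mathbf{1}_v & \mathbf{1}_w \\ a_u & a_v & a_w \\ b_u & b_v & b_w \end{vmatrix} \tag{V.65}$$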

Thus the general expressions for all these operations have the same form as the earlier
specific expressions encountered in Cartesian coordinates. Several precautions should
be noted, however. In (V.63) the vectors must have the same dimensions; this is not a
requirement in (V.64) and (V.65). In all three formulas the vectors must be either free
or fixed and acting at the same point. If they are free, care must be observed in translating one vector from its original point to a common point. If the directions of the
unit vectors change in the process, the vector must be re-resolved into components
appropriate to the new point. The reader is referred to a discussion of this difficulty
given in Section V.5.
V.14

GRADIENT

Let $\Psi(x,y,z,t)$ be a scalar function of space and time with continuous first derivatives.
At any specific time, by assigning a sequence of constant values to $\Psi$, a family of surfaces
can be described. The locus of points $P(x,y,z)$ which satisfy $\Psi(x,y,z,t_0) = K_0$, with $K_0$
and $t_0$ constants, is one of these surfaces. It has been shown in Section V.10 that a
vector normal to this surface is

$$\mathbf{1}_x \frac{\partial \Psi}{\partial x} + \mathbf{1}_y \frac{\partial \Psi}{\partial y} + \mathbf{1}_z \frac{\partial \Psi}{\partial z} \tag{V.66}$$

in which it is understood that the three partial derivatives have been evaluated at the
point in question in the surface. Though not so identified at the time, this normal
vector is called the gradient of $\Psi$.
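Because of the form of (V.66), it is convenient to introduce a vector operator by the
definition

$$\nabla = \mathbf{1}_x \frac{\partial}{\partial x} + \mathbf{1}_y \frac{\partial}{\partial y} + \mathbf{1}_z \frac{\partial}{\partial z} \tag{V.67}$$

The symbol $\nabla$ is widely called the del operator though some writers prefer the name
"nabla." Wilson gives an engaging footnote on the origins of these appellations.¹
Operation on a scalar function $\Psi$ with the del operator (V.67) will then mean

$$\nabla\Psi = \left(\mathbf{1}_x \frac{\partial}{\partial x} + \mathbf{1}_y \frac{\partial}{\partial y} + \mathbf{1}_z \frac{\partial}{\partial z}\right)\Psi = \mathbf{1}_x \frac{\partial \Psi}{\partial x} + \mathbf{1}_y \frac{\partial \Psi}{\partial y} + \mathbf{1}_z \frac{\partial \Psi}{\partial z} \tag{V.68}$$

and the gradient of $\Psi$ usually will be referred to by the symbolic shorthand $\nabla\Psi$.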
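In addition to being normal to a surface over which $\Psi$ is constant, the gradient has
several other interesting properties. Suppose P is a point in the surface $\Psi(x,y,z,t_0) = K_0$
and that the gradient $\nabla\Psi$ is computed at P. Let any space curve be drawn through P
and characterized by the parametric equations

$$x = f_1(s) \qquad y = f_2(s) \qquad z = f_3(s)$$

The differential change in $\Psi$ along this space curve will be

$$d\Psi = \frac{\partial \Psi}{\partial x}\,dx + \frac{\partial \Psi}{\partial y}\,dy + \frac{\partial \Psi}{\partial z}\,dz$$

in which dx, dy, and dz are components of the displacement

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz = \left[\mathbf{1}_x \frac{df_1}{ds} + \mathbf{1}_y \frac{df_2}{ds} + \mathbf{1}_z \frac{df_3}{ds}\right]ds$$

which extends from the point P to a neighboring point on the space curve. Thus

$$\frac{d\Psi}{d\ell} = \frac{\partial \Psi}{\partial x}\frac{dx}{d\ell} + \frac{\partial \Psi}{\partial y}\frac{dy}{d\ell} + \frac{\partial \Psi}{\partial z}\frac{dz}{d\ell} = \nabla\Psi \cdot \mathbf{1}_T \tag{V.69}$$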

¹ Somewhat paraphrased, after saying in the main text, "It has been found by experience that the
monosyllable del is so short and easy to pronounce that even in complicated formulae in which ∇
occurs a number of times no inconvenience to the speaker or hearer arises from the repetition." Wilson
adds in a footnote: "Some use the term Nabla owing to its fancied resemblance to an Assyrian harp.
Others have noted its likeness to an inverted Δ and have consequently coined the none too euphonious
name Atled by inverting the order of the letters in the word Delta. Föppl avoids any special designation and refers to the symbol as 'die Operation ∇.' How this is to be read is not divulged." From
J. W. Gibbs and E. B. Wilson, Vector Analysis, p. 138, C. Scribner's Sons, Inc., New York, 1901.

in which $\mathbf{1}_T$ is a unit vector tangent to the space curve at P and pointing in the direction
of $d\boldsymbol{\ell}$.
It follows from (V.69) that the maximum spatial rate of change of $\Psi$ is along that
space curve which is normal to the surface $\Psi(x,y,z,t_0) = K_0$, for then $\nabla\Psi \cdot \mathbf{1}_T$ is maximized. Since $\mathbf{1}_T$ is a unit vector, this maximum value equals $|\nabla\Psi|$. Therefore the gradient
is not only normal to the surface $\Psi = K_0$, but it has a magnitude and direction which
denote the maximum spatial rate of change of $\Psi$. It is these properties of the gradient
which are responsible for its name.
The physical meaning of gradient makes the task of finding its expression in generalized coordinates a simple one. One need only find the rates of change of $\Psi$ with
respect to distance in three mutually perpendicular directions, multiply these three
quantities by the appropriate unit vectors, and sum.
With reference to (V.48), which is the expression for an element of length in generalized orthogonal coordinates, $h_1\,du$, $h_2\,dv$, and $h_3\,dw$ can be recognized as being
differential lengths in the directions of the three coordinate lines. Since these three
coordinate lines are mutually perpendicular,

$$\nabla\Psi = \mathbf{1}_u \frac{1}{h_1}\frac{\partial \Psi}{\partial u} + \mathbf{1}_v \frac{1}{h_2}\frac{\partial \Psi}{\partial v} + \mathbf{1}_w \frac{1}{h_3}\frac{\partial \Psi}{\partial w} \tag{V.70}$$

is the formula for the gradient of $\Psi$ in generalized orthogonal coordinates.
Because $d\ell = [(dx)^2 + (dy)^2 + (dz)^2]^{1/2}$ in Cartesian coordinates, it follows from
(V.48) that the scale factors are $h_1 = h_2 = h_3 = 1$ in that system. If this special case is
inserted in (V.70), one obtains (V.68) as expected.
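Formula (V.70) can be exercised numerically. The sketch below is an illustration under stated assumptions: the scalar function Ψ = x²y + z is an arbitrary test function, cylindrical coordinates with scale factors (1, r, 1) are used as the orthogonal system, and the partial derivatives are estimated by central differences.

```python
import math

def psi_cartesian(x, y, z):
    """Arbitrary test function, chosen only for illustration."""
    return x * x * y + z

def psi_cylindrical(r, phi, z):
    """The same function expressed through (V.41)."""
    return psi_cartesian(r * math.cos(phi), r * math.sin(phi), z)

def partial(f, q, i, h=1e-6):
    """Central-difference estimate of the derivative of f in its i-th argument."""
    hi, lo = list(q), list(q)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2.0 * h)

def gradient_cylindrical(f, r, phi, z):
    """Components of grad f along 1_r, 1_phi, 1_z according to (V.70)."""
    scale = (1.0, r, 1.0)  # h1, h2, h3 for cylindrical coordinates
    q = (r, phi, z)
    return tuple(partial(f, q, i) / scale[i] for i in range(3))

if __name__ == "__main__":
    r, phi, z = 1.3, 0.4, 2.0
    print(gradient_cylindrical(psi_cylindrical, r, phi, z))
```

Projecting the Cartesian gradient (2xy, x², 1) onto the unit vectors $\mathbf{1}_r$, $\mathbf{1}_\phi$, $\mathbf{1}_z$ at the same point gives the same three numbers to within the finite-difference error.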
EXAMPLE V.21

A room 6 m square and 3 m high has a temperature distribution given by

$$T(x,y,z,t) = 300\left[1 + \frac{(z - 3)(x - 6)}{100}\,e^{-t}\right]$$

in which the origin of coordinates has been taken in a lower corner of the room with the Z
axis pointing up. The temperature is in degrees Kelvin, distances are in meters, and time is
in hours.
The gradient of this temperature distribution is

$$\nabla T = 3e^{-t}\,[\mathbf{1}_x (z - 3) + \mathbf{1}_z (x - 6)]$$

This result may be plotted by employing a useful technique known as flux mapping.
At every point in the room there is a value of the gradient possessing magnitude and direction. If, at every point in a figure representing the room, lines are drawn in the direction of
the gradient, with the density of these lines numerically equal to the magnitude of the gradient, all the information about gradient can be displayed in the figure. The arrows on the
lines indicate direction, and a practiced eye soon associates closely bunched lines with a high
value, and widely spaced lines with a low value. This technique can, of course, be utilized to
plot any vector function. The plot is often called a flux map or a field representation and the
lines themselves are called flux lines or field lines. Common usage of this technique has caused
vector functions and their field representations to become interchangeable concepts, in much
the same way that the equation of a line and the graph of a line have become equivalent.

The temperature gradient at t = 0 is shown plotted in the figure, in which a two-dimensional representation is sufficient, since there is no y dependency. The bunching of the lines
indicates that the greatest spatial rate of change of temperature occurs near the Y axis.

[Figure: flux lines of the temperature gradient at t = 0 in the XZ plane, with isotherms labeled from 300 to 350 deg in steps of 5 deg.]

It is also possible to plot in this figure profiles of the surfaces over which the temperature
is constant, and several of these profiles are indicated. The flux lines representing gradient
are seen to be perpendicular to these isotherms in accordance with the earlier discussion
concerning properties of the gradient.
When one displays, such as has been done here with isotherms, a regularly spaced selection
of the members of the family of surfaces over each of which a scalar function is constant, the
resulting figure is said to be a field representation of the scalar function. Where the surfaces
are closely bunched, the function is changing rapidly as the point of interest moves from one
surface to the next. Where the surfaces are widely spaced, the function is changing slowly.
Of course, along one of these surfaces, the function remains constant. Because of this ability
to represent scalar functions by such a plot, the terms scalar field and scalar function have
also grown to be used interchangeably.
Since heat flows in the direction of temperature decrease, the figure suggests the presence
of a heat source concentrated near the Y axis with the flow of heat being along the flux lines
against the direction of the arrows. The exponential time factor indicates that this heat
source is dying out, with an ultimate uniform temperature of 300 deg predicted. If the field
map were plotted for a succession of times the shape would be unaffected, but the gradient
lines would thin out everywhere and the isotherms would spread further apart.
EXAMPLE

V.22

Newton's gravitational law states that the force on a mass m due to other masses tru,
(V.71)

in which G is a universal constant, r, is the distance between m and m i, and l r i is a unit

602

Vectors

MATHEl\fATICAL SUPPLEMENT: PART II

vector drawn from m, toward m. If a scalar function <f> is defined by the expression

L
N

= -Gm

eJ.>(x,Y,z,t)

i= 1

mi
r,

(V.72)

in which (x,y,z) are the instantaneous positional coordinates of m, then


f

= - V<f>

(V.73)

To show this let (Xi,Yi,Zi) be the instantaneous positional coordinates of m, and then

But

and therefore

as was to be proved.
$\Phi(x,y,z,t)$ is called the potential energy function. If the mass $m$ is moved over a surface on which $\Phi$ is constant, since the force on $m$ is everywhere perpendicular to this surface, no work is done on the mass $m$ and the energy of the system is held constant. If the mass $m$ is moved from one potential surface to another, there is a component of the force on $m$ which is along the path and work is done on $m$, which changes the energy of the system. The value of $\Phi$ is the energy potentially available if the mass $m$ is removed infinitely far from the proximity of all the other masses. Since gravitational forces are always attractive, this energy potentially available is negative. Thus additional energy has to be put into the system of masses in order to remove $m$ from the influence of the other bodies.
This serves to explain why $\Phi$ was defined, in (V.72), as the negative of a sum of intrinsically positive terms. It also becomes physically clear why $\mathbf{f}$ should be the negative gradient of the potential energy function. The force on $m$ is toward the remainder of the mass system, whereas the potential energy decreases as $m$ approaches the other masses.
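The relation between (V.71), (V.72), and (V.73) can be checked numerically. The sketch below is an illustration only: it assumes units in which $G = 1$, and the function names, masses, and positions are hypothetical choices that do not come from the text. It evaluates the force directly from Newton's law and again as the negative gradient of the potential, the gradient being estimated by central differences.

```python
import math

G = 1.0  # assumption: units chosen so that the gravitational constant is 1

def potential(p, m, sources):
    """Potential energy Phi = -G m sum(m_i / r_i), as in (V.72)."""
    return -G * m * sum(mi / math.dist(p, pi) for mi, pi in sources)

def force_newton(p, m, sources):
    """Force f = -G m sum(m_i / r_i**2) 1_ri, with 1_ri drawn from m_i toward m."""
    f = [0.0, 0.0, 0.0]
    for mi, pi in sources:
        r = math.dist(p, pi)
        for k in range(3):
            # (p[k] - pi[k]) / r is the k-th component of the unit vector 1_ri
            f[k] += -G * m * mi * (p[k] - pi[k]) / r**3
    return f

def force_from_potential(p, m, sources, h=1e-4):
    """Force f = -grad(Phi), each partial derivative estimated by a central difference."""
    f = []
    for k in range(3):
        hi, lo = list(p), list(p)
        hi[k] += h
        lo[k] -= h
        f.append(-(potential(hi, m, sources) - potential(lo, m, sources)) / (2 * h))
    return f

# Hypothetical configuration: three source masses and one test mass.
sources = [(2.0, (0.0, 0.0, 0.0)), (1.0, (3.0, 0.0, 1.0)), (3.0, (-1.0, 4.0, 2.0))]
point = (1.0, 2.0, 0.5)
direct = force_newton(point, 1.5, sources)
from_gradient = force_from_potential(point, 1.5, sources)
```

The two results agree to within the accuracy of the finite-difference estimate, which is what (V.73) asserts.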
EXAMPLE V.23

What are the expressions for gradient in spherical and cylindrical coordinates?
To answer this question for spherical coordinates, one can refer to Figure V.8 and write the transformation equations linking spherical and Cartesian coordinates either in the form

$$r = [x^2 + y^2 + z^2]^{1/2} \qquad \theta = \arctan\frac{[x^2 + y^2]^{1/2}}{z} \qquad \phi = \arctan\frac{y}{x} \tag{V.74}$$

or in the form

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta \tag{V.75}$$

Equations (V.74) are a specific example of (V.30) whereas Equations (V.75) are in the form of (V.31). Using (V.49)-(V.51), one finds for spherical coordinates that

$$h_1 = 1 \qquad h_2 = r \qquad h_3 = r\sin\theta \tag{V.76}$$

Thus a differential path length in spherical coordinates is

$$d\boldsymbol{\ell} = \mathbf{1}_r\,dr + \mathbf{1}_\theta\,r\,d\theta + \mathbf{1}_\phi\,r\sin\theta\,d\phi \tag{V.77}$$

With reference once again to Figure V.8, this result is consistent with the interpretation of displacements from the point $P$, in the directions of the three unit vectors, caused by the increments $dr$, $d\theta$, and $d\phi$.
When (V.70) is used, the expression for gradient in spherical coordinates is seen to be

$$\nabla\Phi = \mathbf{1}_r\,\frac{\partial\Phi}{\partial r} + \mathbf{1}_\theta\,\frac{1}{r}\frac{\partial\Phi}{\partial\theta} + \mathbf{1}_\phi\,\frac{1}{r\sin\theta}\frac{\partial\Phi}{\partial\phi} \tag{V.78}$$

To answer the question for cylindrical coordinates is a simple matter, since the scale factors were found in Example V.17. Once again using the general form (V.70), one can determine that the expression for gradient in cylindrical coordinates is

$$\nabla\Phi = \mathbf{1}_r\,\frac{\partial\Phi}{\partial r} + \mathbf{1}_\phi\,\frac{1}{r}\frac{\partial\Phi}{\partial\phi} + \mathbf{1}_z\,\frac{\partial\Phi}{\partial z} \tag{V.79}$$

V.15 DIVERGENCE

The discussion in Example V.21 established the idea that any vector function may be represented by a field of lines whose density and direction at every point give the magnitude and direction of the vector at that point. If one visualizes a general vector function $\mathbf{A}(x,y,z,t)$ in this manner, and if a differential element of area $d\mathbf{S}$ is erected at a point $P(x,y,z)$, then $\mathbf{A}\cdot d\mathbf{S}$ is instantaneously the number of field lines piercing $d\mathbf{S}$. When time is imagined to be "stopped," if the element of area $d\mathbf{S}$ is oriented in a succession of directions, $\mathbf{A}\cdot d\mathbf{S}$ will vary, and will be a maximum when $d\mathbf{S}$ is transverse to the field or, in other words, when $d\mathbf{S}$ is in the same direction as $\mathbf{A}$. For any orientation of $d\mathbf{S}$, the component of $\mathbf{A}$ in the direction of $d\mathbf{S}$ will be $\mathbf{A}\cdot d\mathbf{S}/dS$, which follows readily from the definition of the dot product.
In physical problems it often occurs that the field lines which represent a vector function are discontinuous, which implies that the number of lines which enters a volume element is different from the number of lines which emerges. This is an important effect which is measured by the divergence of the vector function.
To put this concept into more specific terms, let $\mathbf{A}(x,y,z,t)$ be any vector field and let $\Delta V = \Delta x\,\Delta y\,\Delta z$ be a volume element which has as one of its corners the point $P(x,y,z)$. This situation is depicted in Figure V.15. If the net efflux from $\Delta V$ is taken to mean the excess of emerging lines over entering lines (the net efflux may be negative) then

$$\oint_S \mathbf{A}\cdot d\mathbf{S} \tag{V.80}$$

is numerically equal to this net efflux. $S$ is the six-sided surface surrounding $\Delta V$, and $d\mathbf{S}$ must everywhere be chosen to have the direction of the outward-drawn normal† if (V.80) is to be interpreted as efflux rather than influx.

† The convention of the outward-drawn normal will be adhered to in this text.

FIGURE V.15 Net efflux from a volume element.

Application of the mean value theorem gives

$$\begin{aligned}
\oint_S \mathbf{A}\cdot d\mathbf{S} ={}& A_x(x + \Delta x,\; y + k_1\Delta y,\; z + k_2\Delta z,\; t)\,\Delta y\,\Delta z \\
&- A_x(x,\; y + k_3\Delta y,\; z + k_4\Delta z,\; t)\,\Delta y\,\Delta z \\
&+ A_y(x + k_5\Delta x,\; y + \Delta y,\; z + k_6\Delta z,\; t)\,\Delta x\,\Delta z \\
&- A_y(x + k_7\Delta x,\; y,\; z + k_8\Delta z,\; t)\,\Delta x\,\Delta z \\
&+ A_z(x + k_9\Delta x,\; y + k_{10}\Delta y,\; z + \Delta z,\; t)\,\Delta x\,\Delta y \\
&- A_z(x + k_{11}\Delta x,\; y + k_{12}\Delta y,\; z,\; t)\,\Delta x\,\Delta y
\end{aligned}$$

in which $0 \le k_i \le 1$. If these terms are expanded in a Taylor's series about the point $P(x,y,z)$ (cf. Part I of this Supplement), there results

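$$\begin{aligned}
\oint_S \mathbf{A}\cdot d\mathbf{S} ={}& \left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right)\Delta x\,\Delta y\,\Delta z \\
&+ \frac{\partial A_x}{\partial y}(k_1 - k_3)\,\Delta y^2\,\Delta z + \frac{\partial A_x}{\partial z}(k_2 - k_4)\,\Delta y\,\Delta z^2 \\
&+ \frac{\partial A_y}{\partial x}(k_5 - k_7)\,\Delta x^2\,\Delta z + \frac{\partial A_y}{\partial z}(k_6 - k_8)\,\Delta x\,\Delta z^2 \\
&+ \frac{\partial A_z}{\partial x}(k_9 - k_{11})\,\Delta x^2\,\Delta y + \frac{\partial A_z}{\partial y}(k_{10} - k_{12})\,\Delta x\,\Delta y^2 + \cdots
\end{aligned}$$

in which the undesignated terms are all higher order.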

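The divergence of $\mathbf{A}$, written div $\mathbf{A}$, is defined as the limiting value of net efflux per unit volume at $P$, that is

$$\operatorname{div}\mathbf{A} \equiv \lim_{\Delta V \to 0} \frac{\oint_S \mathbf{A}\cdot d\mathbf{S}}{\Delta V} \tag{V.81}$$

Since $k_1 - k_3 \to 0$, etc., as $\Delta V \to 0$, it follows that

$$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \tag{V.82}$$

Equation (V.82) is the expression for divergence in Cartesian coordinates. Since the right side of (V.82) is precisely $\nabla\cdot\mathbf{A}$, in which $\nabla$ is the del operator for Cartesian coordinates defined in (V.67), the divergence is usually symbolized in this way.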
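The limiting process in (V.81) can be imitated numerically. The following sketch is illustrative only; the vector field, the sample point, and the function names are assumptions introduced here, not part of the text. It approximates the net efflux through a small cube having $P$ as one corner by taking the normal component at the center of each face, and divides by the volume of the cube.

```python
def field(x, y, z):
    """Hypothetical vector field A = (x^2 y, y z, x z^2)."""
    return (x * x * y, y * z, x * z * z)

def div_exact(x, y, z):
    """Cartesian divergence of the field above, from (V.82)."""
    return 2 * x * y + z + 2 * x * z

def efflux_per_volume(f, x, y, z, d):
    """Net efflux through a cube of edge d with one corner at (x, y, z), per unit volume.

    Each face integral is approximated by the normal component of the field
    at the face center times the face area d*d.
    """
    half = d / 2
    flux = 0.0
    flux += (f(x + d, y + half, z + half)[0] - f(x, y + half, z + half)[0]) * d * d
    flux += (f(x + half, y + d, z + half)[1] - f(x + half, y, z + half)[1]) * d * d
    flux += (f(x + half, y + half, z + d)[2] - f(x + half, y + half, z)[2]) * d * d
    return flux / d**3

P = (1.0, 2.0, 3.0)
estimates = [efflux_per_volume(field, *P, d) for d in (1e-1, 1e-2, 1e-3)]
exact = div_exact(*P)
```

As the edge of the cube shrinks, the estimates approach the value given by (V.82). Because $P$ sits at a corner of the cube rather than at its center, the error in this sketch falls off only in proportion to the edge length.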
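The procedure just followed may be repeated to obtain the expression for divergence in generalized orthogonal coordinates. Let the volume element be contained within the surfaces $u$, $u + \Delta u$, $v$, $v + \Delta v$, $w$, and $w + \Delta w$. Application of the mean value theorem to each of the six sides of this volume element gives

$$\begin{aligned}
\oint_S \mathbf{A}\cdot d\mathbf{S} ={}& B_u(u + \Delta u,\; v + k_1\Delta v,\; w + k_2\Delta w,\; t)\,\Delta v\,\Delta w \\
&- B_u(u,\; v + k_3\Delta v,\; w + k_4\Delta w,\; t)\,\Delta v\,\Delta w \\
&+ B_v(u + k_5\Delta u,\; v + \Delta v,\; w + k_6\Delta w,\; t)\,\Delta u\,\Delta w \\
&- B_v(u + k_7\Delta u,\; v,\; w + k_8\Delta w,\; t)\,\Delta u\,\Delta w \\
&+ B_w(u + k_9\Delta u,\; v + k_{10}\Delta v,\; w + \Delta w,\; t)\,\Delta u\,\Delta v \\
&- B_w(u + k_{11}\Delta u,\; v + k_{12}\Delta v,\; w,\; t)\,\Delta u\,\Delta v
\end{aligned}$$

in which $B_u = h_2h_3A_u$, $B_v = h_1h_3A_v$, $B_w = h_1h_2A_w$, and $0 \le k_i \le 1$. If each term of this result is expanded in a Taylor's series about the point $P(u,v,w)$, and if the limit is taken, after division by $\Delta V = h_1h_2h_3\,\Delta u\,\Delta v\,\Delta w$, one obtains

$$\nabla\cdot\mathbf{A} = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}(h_2h_3A_u) + \frac{\partial}{\partial v}(h_1h_3A_v) + \frac{\partial}{\partial w}(h_1h_2A_w)\right] \tag{V.83}$$

which is the expression for divergence in generalized orthogonal coordinates.
When one recalls that for Cartesian coordinates $h_1 = h_2 = h_3 = 1$, substitution of this special case in (V.83) yields the expected result (V.82).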
EXAMPLE V.24

Find the expressions for divergence in cylindrical and spherical coordinates.
Since the scale factors for each of these systems already have been determined (see Examples V.17 and V.23), this is a simple matter of substitution in Equation (V.83). For cylindrical coordinates one obtains

$$\operatorname{div}\mathbf{A} = \frac{1}{r}\frac{\partial}{\partial r}(rA_r) + \frac{1}{r}\frac{\partial A_\phi}{\partial\phi} + \frac{\partial A_z}{\partial z} \tag{V.84}$$

and for spherical coordinates the result is

$$\operatorname{div}\mathbf{A} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2A_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,A_\theta) + \frac{1}{r\sin\theta}\frac{\partial A_\phi}{\partial\phi} \tag{V.85}$$

Comparing Equations (V.79) and (V.84), one can say that in cylindrical coordinates the gradient operator is

$$\mathbf{1}_r\,\frac{\partial}{\partial r} + \mathbf{1}_\phi\,\frac{1}{r}\frac{\partial}{\partial\phi} + \mathbf{1}_z\,\frac{\partial}{\partial z}$$

whereas the divergence operator is

$$\frac{1}{r}\frac{\partial}{\partial r}(r\,\mathbf{1}_r\cdot\;) + \frac{1}{r}\frac{\partial}{\partial\phi}(\mathbf{1}_\phi\cdot\;) + \frac{\partial}{\partial z}(\mathbf{1}_z\cdot\;)$$

Thus these operators are not the same. Similarly, if one were to compare (V.78) and (V.85), the observation could be made that the gradient and divergence operators in spherical coordinates are different. Indeed, only in Cartesian coordinates, and only because in that system $h_1 = h_2 = h_3 = 1$, do the gradient and divergence operators turn out to be identical. Yet it was this identity which caused the suggestion, when Equation (V.82) was reached, to symbolize div $\mathbf{A}$ by $\nabla\cdot\mathbf{A}$.
The use of $\nabla\cdot\mathbf{A}$ as the symbol for the divergence of $\mathbf{A}$ is widespread, regardless of the coordinate system being used, and this will cause no difficulty if one remembers that $\nabla$ is a different operator for divergence than it is for gradient in every system except Cartesian coordinates. In Section V.18 it will be found that the del operator for curl takes still a third form.

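V.16 THE LAPLACIAN OPERATOR

The gradient of a scalar function is, in general, a vector function. As such, it may be represented by field lines (cf. Example V.21) and these lines may originate or terminate in volume elements, which gives meaning to the concept of the divergence of the gradient of a scalar function. If $\Psi(x,y,z,t)$ is a general scalar function, then $\nabla\cdot\nabla\Psi$ will symbolize the divergence of its gradient. The combination of expressions (V.70) and (V.83) yields

$$\nabla\cdot\nabla\Psi = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}\left(\frac{h_2h_3}{h_1}\frac{\partial\Psi}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{h_1h_3}{h_2}\frac{\partial\Psi}{\partial v}\right) + \frac{\partial}{\partial w}\left(\frac{h_1h_2}{h_3}\frac{\partial\Psi}{\partial w}\right)\right] \tag{V.86}$$

as the form for this operation in generalized orthogonal coordinates. In the case of Cartesian coordinates this reduces to

$$\nabla\cdot\nabla\Psi = \left[\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right]\Psi \tag{V.87}$$

Since, by analogy with Equation (V.7),

$$\nabla\cdot\nabla = \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{V.88}$$

it is customary to write (V.87) as $\nabla^2\Psi$. The scalar operator $\nabla^2$, whose Cartesian form is defined by (V.88), is called the Laplacian operator. From (V.86) the form the Laplacian takes in generalized orthogonal coordinates is

$$\nabla^2 = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}\left(\frac{h_2h_3}{h_1}\frac{\partial}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{h_1h_3}{h_2}\frac{\partial}{\partial v}\right) + \frac{\partial}{\partial w}\left(\frac{h_1h_2}{h_3}\frac{\partial}{\partial w}\right)\right] \tag{V.89}$$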

The expressions for the Laplacian in cylindrical and spherical coordinates are listed at the end of this supplement.
The Laplacian operator may also be applied to a vector, in which case one may write

$$\nabla^2\mathbf{A} = \frac{1}{h_1h_2h_3}\left(\frac{\partial}{\partial u}\frac{h_2h_3}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1h_3}{h_2}\frac{\partial}{\partial v} + \frac{\partial}{\partial w}\frac{h_1h_2}{h_3}\frac{\partial}{\partial w}\right)(\mathbf{1}_uA_u + \mathbf{1}_vA_v + \mathbf{1}_wA_w) \tag{V.90}$$

The unit vectors $\mathbf{1}_u$, $\mathbf{1}_v$, and $\mathbf{1}_w$ are in general functions of $u$, $v$, and $w$ because their directions may change from point to point even though their magnitudes remain unity. Care must therefore be exercised in computing the derivatives.
In Cartesian coordinates the unit vectors are not functions of position, so (V.90) reduces readily to

$$\nabla^2\mathbf{A} = \mathbf{1}_x\nabla^2A_x + \mathbf{1}_y\nabla^2A_y + \mathbf{1}_z\nabla^2A_z \tag{V.91}$$

The Laplacian of a vector is given for cylindrical and spherical coordinates in Problems V.26 and V.27 at the end of this Supplement.
The Laplacian of vector and scalar functions arises in the study of differential equations which stem from a wide variety of physical phenomena, chiefly those concerned with wave motion.

V.17 THE DIVERGENCE THEOREM

By virtue of the definition of divergence, contained in Equation (V.81), if $\mathbf{A}$ is a vector field, its net efflux from a volume element $dV$ is $\nabla\cdot\mathbf{A}\,dV$. This result may be integrated. Let $S$ be a closed surface (not necessarily simply connected) which bounds a volume $V$. The net efflux from $V$ is $\int_V \nabla\cdot\mathbf{A}\,dV$, being the number of field lines leaving $V$ minus the number of field lines entering $V$. But this result can also be computed from $\oint_S \mathbf{A}\cdot d\mathbf{S}$ in which $d\mathbf{S}$ is an element of surface area in $S$ to which has been affixed an outward-drawn unit normal vector. Thus

$$\int_V \nabla\cdot\mathbf{A}\,dV = \oint_S \mathbf{A}\cdot d\mathbf{S} \tag{V.92}$$

Equation (V.92) is known as the divergence theorem. It is valid when $\mathbf{A}$ has continuous first derivatives throughout $V$ and over $S$, and is applicable only when $S$ is a closed surface.
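A quick numerical check of the divergence theorem is sketched below for the unit cube $0 \le x, y, z \le 1$. The vector field and the function names are assumptions chosen for illustration, and both integrals are evaluated with a simple midpoint rule, so the agreement is approximate rather than exact.

```python
def field(x, y, z):
    """Hypothetical vector field A = (x^3, x y, z^2)."""
    return (x**3, x * y, z**2)

def divergence(x, y, z):
    """Cartesian divergence of the field above: 3x^2 + x + 2z."""
    return 3 * x**2 + x + 2 * z

def volume_integral(n=40):
    """Midpoint-rule estimate of the integral of div A over the unit cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    return sum(divergence(x, y, z) for x in mids for y in mids for z in mids) * h**3

def surface_integral(n=40):
    """Midpoint-rule estimate of the outward flux of A through the six faces."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for s in mids:
        for t in mids:
            # On each pair of opposite faces the outward normals are +1_k and -1_k.
            total += field(1.0, s, t)[0] - field(0.0, s, t)[0]
            total += field(s, 1.0, t)[1] - field(s, 0.0, t)[1]
            total += field(s, t, 1.0)[2] - field(s, t, 0.0)[2]
    return total * h * h

volume_side = volume_integral()
surface_side = surface_integral()
```

For this field both sides of (V.92) equal 2.5 analytically; the two numerical estimates differ only by the small discretization error of the volume integral.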

V.18 CURL

The last major vector operation to be introduced is curl. Its physical significance can be anticipated by first discussing an example.
Suppose a small spoked wheel, free to turn on its shaft, is immersed in a stream of water, as suggested by Figure V.16. If the spokes and hub are extremely thin, the action of the water will be entirely on the rim. If the water is coursing past the rim more rapidly on one side than on the other, the wheel will rotate. If the center of the wheel is kept at the same point $P$ while the shaft is oriented in a succession of different directions, it will be found in general that the wheel will rotate at a succession of different speeds. For some orientation of the shaft, this rotational effect will be a maximum. What is being measured is the tendency of the water to "curl" in the neighborhood of $P$. If the wheel is made smaller and smaller it becomes less of a contaminating influence on the effect under observation, and in the limit one would be measuring the curl of the water right at the point $P$. This curl is a vector effect, since it involves the maximum rotational speed of the wheel and also the orientation of the shaft when this maximum speed occurs.

FIGURE V.16 Rotation of submerged wheel.

How can this effect be expressed mathematically? If $\mathbf{A}(x,y,z,t)$ describes the flow of water, then the action at the rim will be proportional to $\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}$ in which $d\boldsymbol{\ell}$ is an increment of length along the rim and $C$ is the locus of points $(x,y,z)$ occupied by the rim at time $t$. (The small circle attached to the integral sign indicates that a complete closed contour is being taken and the integral $\oint \mathbf{A}\cdot d\boldsymbol{\ell}$ is often called the circulation.) In what is to follow it will be seen that in the limit, as the size of the wheel shrinks to an infinitesimal, $\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}$ is a second-order differential, and thus the curl of the water at point $P$ is appropriately measured by

$$\lim_{\Delta S \to 0} \frac{\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}}{\Delta S} \tag{V.93}$$

in which $\Delta S$ is the area of the wheel. The value for curl obtained from (V.93) will depend on the orientation of the wheel, and thus it is more proper to say that (V.93) measures the curl of the water around an axis perpendicular to $\Delta S$.
Of course, the concept of curl has wider applicability than the example just cited. Any vector field can exhibit curl if it yields a value upon insertion in (V.93). Thus the curl of a general vector function $\mathbf{A}(x,y,z,t)$ at a point $P(\xi,\eta,\zeta)$ is defined by the following process: Construct a unit vector $\mathbf{1}_n$ through $P$ in an arbitrary direction. Choose a smooth surface $S$ through $P$ such that $\mathbf{1}_n$ is normal to $S$ at $P$. In $S$ construct any simple closed path $C$ around $P$. The curl of $\mathbf{A}$ at $P$, around an axis in the direction of $\mathbf{1}_n$, is then defined by the relation

$$(\operatorname{curl}\mathbf{A})\cdot\mathbf{1}_n = \lim_{\Delta S \to 0} \frac{\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}}{\Delta S} \tag{V.94}$$

$\Delta S$ is the area on $S$ enclosed by $C$ and the direction of integration along $C$ is given by the right-hand rule.†
The limit in (V.94) must exist and be independent of the shapes of $S$ and $C$ if $(\operatorname{curl}\mathbf{A})\cdot\mathbf{1}_n$ is to be defined uniquely. It can be shown that these conditions are met if $\mathbf{A}$ has continuous first derivatives in the neighborhood of $P$.

FIGURE V.17 Circular contour and its projection.

To obtain $(\operatorname{curl}\mathbf{A})\cdot\mathbf{1}_n$ in Cartesian coordinates, one may select as $S$ the plane through $P(\xi,\eta,\zeta)$ perpendicular to $\mathbf{1}_n = \mathbf{1}_xn_x + \mathbf{1}_yn_y + \mathbf{1}_zn_z$. For $C$ a circle of radius $r$ can be chosen with its center at $P$. The parametric equations of $C$ are

$$\begin{aligned}
x - \xi &= \frac{r}{(n_x^2 + n_y^2)^{1/2}}\,(n_xn_z\cos\psi - n_y\sin\psi) \\
y - \eta &= \frac{r}{(n_x^2 + n_y^2)^{1/2}}\,(n_yn_z\cos\psi + n_x\sin\psi) \\
z - \zeta &= -(n_x^2 + n_y^2)^{1/2}\,r\cos\psi
\end{aligned} \tag{V.95}$$

These equations can be established by recognizing that the projection of $C$ on the $XY$ plane (see Figure V.17) is an ellipse whose minor axis is in the direction of $\mathbf{1}_xn_x + \mathbf{1}_yn_y$ and whose center is at $(\xi,\eta,0)$. The parametric equations of this ellipse are

$$x' = rn_z\cos\psi \qquad y' = r\sin\psi$$

and a simple rotation of axes gives the parametric equations in terms of $x$ and $y$. Since the equation of the plane $S$ is $n_x(x - \xi) + n_y(y - \eta) + n_z(z - \zeta) = 0$, the parametric equation for $z$ is found by substitution.

† If the right thumb is placed in the direction of $\mathbf{1}_n$, the remaining fingers indicate the proper direction.
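From (V.95) an increment along the path $C$ can be determined to be

$$d\boldsymbol{\ell} = \frac{r\,d\psi}{(n_x^2 + n_y^2)^{1/2}}\left\{\mathbf{1}_x(-n_xn_z\sin\psi - n_y\cos\psi) + \mathbf{1}_y(-n_yn_z\sin\psi + n_x\cos\psi) + \mathbf{1}_z(n_x^2 + n_y^2)\sin\psi\right\} \tag{V.96}$$

Then, since

$$\mathbf{A}\cdot d\boldsymbol{\ell} = A_x(x,y,z,t)\,d\ell_x + A_y(x,y,z,t)\,d\ell_y + A_z(x,y,z,t)\,d\ell_z \tag{V.97}$$

if $A_x$ is expanded in a Taylor's series, namely,

$$A_x(x,y,z,t) = A_x(\xi,\eta,\zeta,t) + (x - \xi)\frac{\partial A_x}{\partial x} + (y - \eta)\frac{\partial A_x}{\partial y} + (z - \zeta)\frac{\partial A_x}{\partial z} + \cdots$$

and if similar expansions are obtained for $A_y$ and $A_z$, substitution in (V.97) gives

$$\oint_C \mathbf{A}\cdot d\boldsymbol{\ell} = \pi r^2\left[n_x\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + n_y\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) + n_z\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\right] + \cdots$$

When one divides by the area $\Delta S = \pi r^2$ and takes the limit, all the higher order terms vanish, and thus

$$(\operatorname{curl}\mathbf{A})\cdot\mathbf{1}_n = \mathbf{1}_n\cdot\left[\mathbf{1}_x\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + \mathbf{1}_y\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) + \mathbf{1}_z\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\right] \tag{V.98}$$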

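The direction and magnitude of the maximum curl are therefore given by the vector which is dotted with $\mathbf{1}_n$ in (V.98). The curl about an axis parallel to the $X$ axis can be found by setting $n_y = n_z = 0$ and $n_x = 1$ in (V.98), etc. It is therefore completely appropriate to treat curl $\mathbf{A}$ as a vector and write

$$\operatorname{curl}\mathbf{A} = \mathbf{1}_x\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + \mathbf{1}_y\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) + \mathbf{1}_z\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) \tag{V.99}$$

as the expression for the curl of a vector function in Cartesian coordinates. Since (V.99) can be put in the alternative form

$$\operatorname{curl}\mathbf{A} = \begin{vmatrix} \mathbf{1}_x & \mathbf{1}_y & \mathbf{1}_z \\[4pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[4pt] A_x & A_y & A_z \end{vmatrix} \tag{V.100}$$

comparison with (V.14) suggests the notation curl $\mathbf{A} = \nabla\times\mathbf{A}$, in which $\nabla$ is the del operator defined by (V.67).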
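The definition (V.94) can be tried out numerically by walking around the small circle described by (V.95) and (V.96). The sketch below is an illustration under stated assumptions: the vector field, the point $P$, the unit normal, and all function names are hypothetical, and the parametrization requires $n_x^2 + n_y^2 \neq 0$. The circulation is summed at equally spaced values of the parameter $\psi$, divided by $\pi r^2$, and compared with the component of (V.99) along $\mathbf{1}_n$.

```python
import math

def field(x, y, z):
    """Hypothetical vector field A = (-y, x z, x y)."""
    return (-y, x * z, x * y)

def curl_exact(x, y, z):
    """Cartesian curl of the field above, from (V.99): (0, -y, z + 1)."""
    return (0.0, -y, z + 1.0)

def circle_point(P, n, r, psi):
    """Point on the circle C of (V.95); requires n[0]**2 + n[1]**2 != 0."""
    s = math.hypot(n[0], n[1])
    return (P[0] + r / s * (n[0] * n[2] * math.cos(psi) - n[1] * math.sin(psi)),
            P[1] + r / s * (n[1] * n[2] * math.cos(psi) + n[0] * math.sin(psi)),
            P[2] - s * r * math.cos(psi))

def circle_tangent(n, r, psi):
    """Derivative of the position on C with respect to psi, as in (V.96)."""
    s = math.hypot(n[0], n[1])
    return (r / s * (-n[0] * n[2] * math.sin(psi) - n[1] * math.cos(psi)),
            r / s * (-n[1] * n[2] * math.sin(psi) + n[0] * math.cos(psi)),
            r * s * math.sin(psi))

def curl_component(f, P, n, r=1e-2, steps=720):
    """Circulation of f around C divided by the enclosed area pi r^2."""
    dpsi = 2 * math.pi / steps
    circulation = 0.0
    for i in range(steps):
        psi = i * dpsi
        a = f(*circle_point(P, n, r, psi))
        t = circle_tangent(n, r, psi)
        circulation += sum(ak * tk for ak, tk in zip(a, t)) * dpsi
    return circulation / (math.pi * r * r)

P = (1.0, 2.0, 3.0)
n = (1 / 3, 2 / 3, 2 / 3)  # unit normal 1_n
estimate = curl_component(field, P, n)
expected = sum(c * nk for c, nk in zip(curl_exact(*P), n))
```

With the circle traversed in the sense of increasing $\psi$, the estimate reproduces $(\operatorname{curl}\mathbf{A})\cdot\mathbf{1}_n$, which indicates that (V.95) describes the contour in the right-handed sense about $\mathbf{1}_n$.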

EXAMPLE V.25

Let the flow of water in a river be represented by the function $\mathbf{v} = \mathbf{1}_xKz^2$ with $K$ a constant. Then the curl of the water is

$$\nabla\times\mathbf{v} = \mathbf{1}_y\,\frac{\partial v_x}{\partial z} = 2Kz\,\mathbf{1}_y$$

which indicates that a waterwheel with its axis transverse to the stream will turn, the rotation being greatest near the top of the stream.

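Since the $x$, $y$, and $z$ components of (V.99) are precisely the values for curl which would be found by orienting $\mathbf{1}_n$ parallel to the three coordinate axes in turn, this fact suggests a simple way to find the expression for $\nabla\times\mathbf{A}$ in generalized coordinates. Suppose that it is desired to find the curl of a general vector function $\mathbf{A}(u,v,w,t)$ at the point $P(u,v,w)$. $S$ can be taken as the $u$ surface through $P$. The surfaces $v \pm \Delta v/2$ and $w \pm \Delta w/2$ intersect $S$ in such a way as to form a "rectangle" around $P$. If this "rectangle" is taken as the contour $C$, application of the law of the mean gives

$$\begin{aligned}
\oint_C \mathbf{A}\cdot d\boldsymbol{\ell} ={}& B_w\!\left(u,\; v + \frac{\Delta v}{2},\; w + k_1\Delta w\right)\Delta w - B_w\!\left(u,\; v - \frac{\Delta v}{2},\; w + k_2\Delta w\right)\Delta w \\
&- B_v\!\left(u,\; v + k_3\Delta v,\; w + \frac{\Delta w}{2}\right)\Delta v + B_v\!\left(u,\; v + k_4\Delta v,\; w - \frac{\Delta w}{2}\right)\Delta v
\end{aligned}$$

in which $B_v = h_2A_v$ and $B_w = h_3A_w$ and in which $-\tfrac{1}{2} \le k_i \le \tfrac{1}{2}$. If these terms are expanded in Taylor's series about the point $(u,v,w)$, and the result is divided by the area $\Delta S = h_2h_3\,\Delta v\,\Delta w$ before the limit is taken, one obtains

$$(\nabla\times\mathbf{A})\cdot\mathbf{1}_u = \frac{1}{h_2h_3}\left[\frac{\partial}{\partial v}(h_3A_w) - \frac{\partial}{\partial w}(h_2A_v)\right]$$

Cyclical interchange of the coordinates yields as the expression for curl $\mathbf{A}$ in generalized orthogonal coordinates

$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_u}{h_2h_3} & \dfrac{\mathbf{1}_v}{h_1h_3} & \dfrac{\mathbf{1}_w}{h_1h_2} \\[6pt] \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\[6pt] h_1A_u & h_2A_v & h_3A_w \end{vmatrix} \tag{V.101}$$

The specific forms which curl takes in cylindrical and spherical coordinates are listed at the end of this supplement.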

V.19 STOKES' THEOREM

As an outgrowth of the definition of curl embodied in Equation (V.94), if $d\mathbf{S}$ is an element of area bounded by the infinitesimal closed contour $dC$, then

$$\nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint_{dC} \mathbf{A}\cdot d\boldsymbol{\ell} \tag{V.102}$$

This effect may be integrated. Consider the simple closed curve $C$ shown in Figure V.18. If a surface cap $S$ is constructed with $C$ as its sole boundary, then $S$ can be divided into surface elements $d\mathbf{S}$. The direction of the outward-drawn normal associated with $d\mathbf{S}$ is determined by first selecting a direction of integration along $C$. If the right thumb is then placed along $C$ in this direction, the remaining fingers thread $S$ in the direction of the outward-drawn normal.

FIGURE V.18 Stokes' theorem.

Let $\mathbf{A}$ be a vector function with continuous first derivatives in a region containing $S$ and $C$. Then integration of (V.102) gives

$$\int_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \int_S\left(\oint_{dC} \mathbf{A}\cdot d\boldsymbol{\ell}\right) \tag{V.103}$$

in which the notation on the right side of (V.103) indicates that the line integral around every elemental contour $dC$ which divides $S$ is to be included. With reference once again to Figure V.18, it can be observed that whenever two adjacent surface elements have a common boundary, a line integration is performed along the common boundary for each element but in opposite directions. These integrations cancel, and only for those elements which have an unshared boundary will there be a net contribution. All such elements border on $C$ and the unshared parts of their boundaries comprise precisely all of $C$. Thus (V.103) can be rewritten

$$\int_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint_C \mathbf{A}\cdot d\boldsymbol{\ell} \tag{V.104}$$

This important result is known as Stokes' theorem.

V.20 VECTOR IDENTITIES

The operations of gradient, divergence, and curl may be applied in many combinations. One example of this has been noted in the case of the divergence of the gradient of a scalar function, which led to the definition of the Laplacian operator. Several other useful relations may be derived from the basic definitions. The reader may wish to convince himself that if $\Phi$, $\Psi$, $\mathbf{A}$, and $\mathbf{B}$ are regular functions of space and time, then

$$\nabla(\Phi\Psi) \equiv \Psi\nabla\Phi + \Phi\nabla\Psi \tag{V.105}$$
$$\nabla(\mathbf{A}\cdot\mathbf{B}) \equiv (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A}) \tag{V.106}$$
$$\nabla\cdot(\Phi\mathbf{A}) \equiv \Phi\nabla\cdot\mathbf{A} + \nabla\Phi\cdot\mathbf{A} \tag{V.107}$$
$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) \equiv \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B} \tag{V.108}$$
$$\nabla\times\Phi\mathbf{A} \equiv \Phi\nabla\times\mathbf{A} + \nabla\Phi\times\mathbf{A} \tag{V.109}$$
$$\nabla\times(\mathbf{A}\times\mathbf{B}) \equiv \mathbf{A}\nabla\cdot\mathbf{B} - \mathbf{B}\nabla\cdot\mathbf{A} + (\mathbf{B}\cdot\nabla)\mathbf{A} - (\mathbf{A}\cdot\nabla)\mathbf{B} \tag{V.110}$$
$$\nabla\cdot(\nabla\times\mathbf{A}) \equiv 0 \tag{V.111}$$
$$\nabla\times(\nabla\Phi) \equiv 0 \tag{V.112}$$
$$\nabla\times\nabla\times\mathbf{A} \equiv \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} \tag{V.113}$$

It is sufficient to prove these identities in Cartesian coordinates. Since the basic definitions of gradient, divergence, and curl are independent of any coordinate system, the above relations will be true in all systems if they are true in one.
Equations (V.108) and (V.113) are especially important to the subject of electromagnetic field theory and are worth committing to memory. Equation (V.111) offers an interesting sidelight on the del operator since it indicates that $\nabla\times\mathbf{A}$ is "perpendicular" to $\nabla$. In this respect $\nabla$ is no different from any other vector. Equation (V.113) provides an alternative method for computing $\nabla^2\mathbf{A}$. (Cf. Problems V.26 and V.27 at the end of this Supplement.)
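A reader who prefers a numerical spot check before attempting the proofs might test one of the identities at a single point. The sketch below does this for the identity that expresses the divergence of $\mathbf{A}\times\mathbf{B}$ as $\mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B}$, working in Cartesian coordinates with central-difference derivatives. The two fields, the sample point, and the helper names are assumptions made for illustration; agreement at one point is a sanity check, not a proof.

```python
def partial(f, p, k, h=1e-4):
    """Central-difference estimate of the k-th partial derivative of scalar f at p."""
    hi, lo = list(p), list(p)
    hi[k] += h
    lo[k] -= h
    return (f(*hi) - f(*lo)) / (2 * h)

def div(F, p):
    """Cartesian divergence: sum of dF_k/dx_k."""
    return sum(partial(lambda *q, k=k: F(*q)[k], p, k) for k in range(3))

def curl(F, p):
    """Cartesian curl, component by component."""
    def d(i, k):
        # derivative of component i with respect to coordinate k
        return partial(lambda *q: F(*q)[i], p, k)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def A(x, y, z):
    """Hypothetical field A."""
    return (y * z, x * x, x * y * z)

def B(x, y, z):
    """Hypothetical field B."""
    return (x + z, y * y, x - y)

def A_cross_B(x, y, z):
    return cross(A(x, y, z), B(x, y, z))

p = (0.7, -1.2, 0.4)
lhs = div(A_cross_B, p)
rhs = dot(B(*p), curl(A, p)) - dot(A(*p), curl(B, p))
```

The two sides agree to within the finite-difference error. The same helpers can be pointed at the other identities by swapping in the corresponding left and right sides.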

V.21 GREEN'S INTEGRAL THEOREMS

An integral formulation due to Green which is useful in the solution of boundary-value problems can be presented in terms of two identities. If $\Phi$ and $\Psi$ are suitably behaved scalar functions, then the divergence theorem gives

$$\int_V \nabla\cdot(\Phi\nabla\Psi)\,dV = \oint_S (\Phi\nabla\Psi)\cdot d\mathbf{S}$$

However, if one uses (V.107), the integrand on the left may be expanded, which gives

$$\int_V (\Phi\nabla^2\Psi + \nabla\Phi\cdot\nabla\Psi)\,dV = \oint_S (\Phi\nabla\Psi)\cdot d\mathbf{S} \tag{V.114}$$

This is called Green's first identity.
If the roles of $\Phi$ and $\Psi$ are interchanged, one can write

$$\int_V (\Psi\nabla^2\Phi + \nabla\Psi\cdot\nabla\Phi)\,dV = \oint_S (\Psi\nabla\Phi)\cdot d\mathbf{S} \tag{V.115}$$

Subtraction of (V.115) from (V.114) gives

$$\int_V (\Phi\nabla^2\Psi - \Psi\nabla^2\Phi)\,dV = \oint_S (\Phi\nabla\Psi - \Psi\nabla\Phi)\cdot d\mathbf{S} \tag{V.116}$$

which is known as Green's second identity. It may also be written in the form

$$\int_V (\Phi\nabla^2\Psi - \Psi\nabla^2\Phi)\,dV = \oint_S \left(\Phi\frac{\partial\Psi}{\partial n} - \Psi\frac{\partial\Phi}{\partial n}\right)dS \tag{V.117}$$

in which $n$ is a metric variable in the direction of the outward-drawn normal.
If one chooses $\Psi$ to be a point source function, the value of $\Phi$ at a point may be expressed in terms of knowledge of $\Phi$ throughout a volume and over a bounding surface. This is often a useful formulation, especially when the boundary conditions are simple. The reader is referred to Chapter 3 for several examples of this technique.

V.22 SOLENOIDAL AND IRROTATIONAL VECTOR FIELDS

The identity $\nabla\cdot\nabla\times\mathbf{A} \equiv 0$ leads to the observation that any vector field is divergenceless if it can be expressed as the curl of another vector field; that is, its flux lines are continuous, neither originating nor terminating in any volume element. Such a field, whose divergence is everywhere zero, is said to be solenoidal.
Similarly, the identity $\nabla\times\nabla\Phi \equiv 0$ indicates that if a vector field can be expressed as the gradient of a scalar field, then the vector field is without curl. Such a field, whose curl is everywhere zero, is said to be irrotational.
All vector fields can be categorized as belonging to one of three types:
1. Irrotational
2. Solenoidal
3. Neither irrotational nor solenoidal
A theorem due to Helmholtz which is concerned with static (time-independent) fields states, in effect, that under certain conditions a field in the third category can be expressed as a linear sum of fields from the first two categories. These conditions are so reasonable as to include all cases of practical physical interest. An alternative way of stating the Helmholtz theorem is to say that any physically realizable static vector field is completely determined by its divergence and curl. A proof of this theorem may be found in many textbooks.²
A corollary of the Helmholtz theorem is that any static irrotational vector field is derivable from a scalar potential function. Thus if $\mathbf{E}(x,y,z)$ is a field such that $\nabla\times\mathbf{E} \equiv 0$, then

$$\mathbf{E} = -\nabla\Phi \tag{V.118}$$

and application of Stokes' theorem gives

$$\oint_C \mathbf{E}\cdot d\boldsymbol{\ell} = \int_S \nabla\times(-\nabla\Phi)\cdot d\mathbf{S} \equiv 0 \tag{V.119}$$

Therefore the line integral of an irrotational static field around any closed contour is zero.

² See, e.g., R. Plonsey and R. Collin, Principles and Applications of Electromagnetic Fields, pp. 29-36, McGraw-Hill Book Company, New York, 1961.

A second corollary of the Helmholtz theorem is that any static solenoidal vector field is derivable from a vector potential function. Thus if $\mathbf{B}(x,y,z)$ is a field such that $\nabla\cdot\mathbf{B} \equiv 0$, then

$$\mathbf{B} \equiv \nabla\times\mathbf{A} \tag{V.120}$$

and application of the divergence theorem gives

$$\oint_S \mathbf{B}\cdot d\mathbf{S} \equiv 0 \tag{V.121}$$

in which $S$ is any closed surface.

V.23 COMPLEX VECTORS

In Section V.4 the notion was introduced that a real vector $\mathbf{a}$ could be multiplied by a complex number $\gamma = \alpha + j\beta$. This generated a complex vector $\gamma\mathbf{a}$ whose real and imaginary parts, $\alpha\mathbf{a}$ and $\beta\mathbf{a}$, were real vectors. By prescribing that $\alpha\mathbf{a}$ and $\beta\mathbf{a}$ obey all the rules for real vectors developed in this chapter, the entire formalism of vector analysis can be extended to the complex domain. The rules for vector algebra and complex number algebra combine in a logical manner to yield the results for all operations on and by complex vectors. Thus, for example, if $\gamma = \alpha + j\beta$ and $\gamma' = \alpha' + j\beta'$
are complex numbers and a and b are real vectors, then

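in which the multiplication process $\gamma\gamma'$ follows the rule for the product of complex numbers. The distributive law holds so that

$$(\gamma + \gamma')\mathbf{a} = \gamma\mathbf{a} + \gamma'\mathbf{a}$$
$$\gamma(\mathbf{a} + \mathbf{b}) = \gamma\mathbf{a} + \gamma\mathbf{b}$$

For the addition law one obtains

$$\gamma\mathbf{a} + \gamma'\mathbf{b} = (\alpha\mathbf{a} + \alpha'\mathbf{b}) + j(\beta\mathbf{a} + \beta'\mathbf{b})$$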

A complex vector may be resolved into components in the usual way, namely,

and the dot product operations are simply


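$$(\gamma\mathbf{a})\cdot(\gamma'\mathbf{b}) = \gamma\gamma'\,\mathbf{a}\cdot\mathbf{b}$$
$$\gamma\mathbf{a}\cdot(\gamma'\mathbf{b} + \gamma''\mathbf{c}) = \gamma\gamma'\,\mathbf{a}\cdot\mathbf{b} + \gamma\gamma''\,\mathbf{a}\cdot\mathbf{c}$$
$$\gamma\mathbf{a}\cdot\gamma^*\mathbf{a} = |\gamma|^2a^2$$

in which $\gamma^*$ is the complex conjugate of $\gamma$. Similarly, the cross product operations are

$$\gamma\mathbf{a}\times\gamma'\mathbf{b} = \gamma\gamma'(\mathbf{a}\times\mathbf{b})$$
$$\gamma\mathbf{a}\times(\gamma'\mathbf{b} + \gamma''\mathbf{c}) = \gamma\gamma'(\mathbf{a}\times\mathbf{b}) + \gamma\gamma''(\mathbf{a}\times\mathbf{c})$$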

Some of the most useful operations involve complex vector and scalar fields which are functions of real variables. Thus if

$$\mathbf{A}(u,v,w,t) = \mathbf{A}_1(u,v,w,t) + j\mathbf{A}_2(u,v,w,t)$$

in which $\mathbf{A}$ is complex and $\mathbf{A}_1$ and $\mathbf{A}_2$ are its real and imaginary parts, then

$$\frac{\partial\mathbf{A}}{\partial u} = \frac{\partial\mathbf{A}_1}{\partial u} + j\,\frac{\partial\mathbf{A}_2}{\partial u}, \quad \text{etc.}$$

and calculation of such quantities as $\nabla\Phi$, $\nabla\cdot\mathbf{A}$, $\nabla\times\mathbf{A}$, and integrals containing these quantities proceeds naturally if the real and imaginary parts are treated separately as real vectors and then their complex sum is taken after the operations are concluded.
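The algebraic rules of this section can be tried out with ordinary complex arithmetic. In the sketch below a complex vector is represented as a tuple of three complex components; this representation, the helper names, and the sample values are assumptions made for illustration. Note that the dot product used here is the plain, unconjugated product of the text, with the conjugate appearing only where $\gamma^*$ is written explicitly.

```python
def scale(gamma, a):
    """Multiply the vector a by the (possibly complex) number gamma."""
    return tuple(gamma * c for c in a)

def dot(u, v):
    """Unconjugated dot product, component by component."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Cross product, valid for real or complex components."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

gamma = 2 + 3j        # hypothetical complex multiplier
gamma_p = 1 - 1j      # hypothetical second multiplier
a = (1.0, 2.0, -1.0)  # real vectors
b = (0.5, 0.0, 4.0)

# (gamma a) . (gamma' b) = gamma gamma' (a . b)
dot_lhs = dot(scale(gamma, a), scale(gamma_p, b))
dot_rhs = gamma * gamma_p * dot(a, b)

# (gamma a) . (gamma* a) = |gamma|^2 a^2
norm_lhs = dot(scale(gamma, a), scale(gamma.conjugate(), a))
norm_rhs = abs(gamma) ** 2 * dot(a, a)

# (gamma a) x (gamma' b) = gamma gamma' (a x b)
cross_lhs = cross(scale(gamma, a), scale(gamma_p, b))
cross_rhs = scale(gamma * gamma_p, cross(a, b))
```

Each left side matches its right side to rounding error, and the product with the conjugate comes out purely real, as the rule $\gamma\mathbf{a}\cdot\gamma^*\mathbf{a} = |\gamma|^2a^2$ requires.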

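SUMMARY OF IMPORTANT VECTOR RELATIONS

GENERALIZED ORTHOGONAL COORDINATES

$$\mathbf{a} = \mathbf{1}_ua_u + \mathbf{1}_va_v + \mathbf{1}_wa_w$$
$$\mathbf{a} + \mathbf{b} = \mathbf{1}_u(a_u + b_u) + \mathbf{1}_v(a_v + b_v) + \mathbf{1}_w(a_w + b_w)$$
$$\mathbf{a}\cdot\mathbf{b} = a_ub_u + a_vb_v + a_wb_w$$
$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{1}_u & \mathbf{1}_v & \mathbf{1}_w \\ a_u & a_v & a_w \\ b_u & b_v & b_w \end{vmatrix}$$
$$d\ell = [(h_1\,du)^2 + (h_2\,dv)^2 + (h_3\,dw)^2]^{1/2}$$
$$\nabla\Phi = \frac{\mathbf{1}_u}{h_1}\frac{\partial\Phi}{\partial u} + \frac{\mathbf{1}_v}{h_2}\frac{\partial\Phi}{\partial v} + \frac{\mathbf{1}_w}{h_3}\frac{\partial\Phi}{\partial w}$$
$$\nabla\cdot\mathbf{A} = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}(h_2h_3A_u) + \frac{\partial}{\partial v}(h_1h_3A_v) + \frac{\partial}{\partial w}(h_1h_2A_w)\right]$$
$$\nabla^2 = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}\left(\frac{h_2h_3}{h_1}\frac{\partial}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{h_1h_3}{h_2}\frac{\partial}{\partial v}\right) + \frac{\partial}{\partial w}\left(\frac{h_1h_2}{h_3}\frac{\partial}{\partial w}\right)\right]$$
$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_u}{h_2h_3} & \dfrac{\mathbf{1}_v}{h_1h_3} & \dfrac{\mathbf{1}_w}{h_1h_2} \\[6pt] \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\[6pt] h_1A_u & h_2A_v & h_3A_w \end{vmatrix}$$
$$\nabla\cdot\nabla\times\mathbf{A} \equiv 0$$
$$\nabla\times(\nabla\Phi) \equiv 0$$
$$\nabla\times\nabla\times\mathbf{A} = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$$
$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B}$$
$$\int_V \nabla\cdot\mathbf{A}\,dV = \oint_S \mathbf{A}\cdot d\mathbf{S}$$
$$\int_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint_C \mathbf{A}\cdot d\boldsymbol{\ell}$$
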

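CARTESIAN COORDINATES

$$u = x \qquad v = y \qquad w = z$$
$$h_1 = 1 \qquad h_2 = 1 \qquad h_3 = 1$$
$$\nabla\Phi = \mathbf{1}_x\frac{\partial\Phi}{\partial x} + \mathbf{1}_y\frac{\partial\Phi}{\partial y} + \mathbf{1}_z\frac{\partial\Phi}{\partial z}$$
$$\nabla\cdot\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}$$
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$
$$\nabla\times\mathbf{A} = \begin{vmatrix} \mathbf{1}_x & \mathbf{1}_y & \mathbf{1}_z \\[4pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[4pt] A_x & A_y & A_z \end{vmatrix}$$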

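CYLINDRICAL COORDINATES

$$u = r \qquad v = \phi \qquad w = z$$
$$h_1 = 1 \qquad h_2 = r \qquad h_3 = 1$$
$$\nabla\Phi = \mathbf{1}_r\frac{\partial\Phi}{\partial r} + \mathbf{1}_\phi\frac{1}{r}\frac{\partial\Phi}{\partial\phi} + \mathbf{1}_z\frac{\partial\Phi}{\partial z}$$
$$\nabla\cdot\mathbf{A} = \frac{1}{r}\frac{\partial}{\partial r}(rA_r) + \frac{1}{r}\frac{\partial A_\phi}{\partial\phi} + \frac{\partial A_z}{\partial z}$$
$$\nabla^2 = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2}$$
$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_r}{r} & \mathbf{1}_\phi & \dfrac{\mathbf{1}_z}{r} \\[6pt] \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\[6pt] A_r & rA_\phi & A_z \end{vmatrix}$$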

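SPHERICAL COORDINATES

$$u = r \qquad v = \theta \qquad w = \phi$$
$$h_1 = 1 \qquad h_2 = r \qquad h_3 = r\sin\theta$$
$$\nabla\Phi = \mathbf{1}_r\frac{\partial\Phi}{\partial r} + \mathbf{1}_\theta\frac{1}{r}\frac{\partial\Phi}{\partial\theta} + \mathbf{1}_\phi\frac{1}{r\sin\theta}\frac{\partial\Phi}{\partial\phi}$$
$$\nabla\cdot\mathbf{A} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2A_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,A_\theta) + \frac{1}{r\sin\theta}\frac{\partial A_\phi}{\partial\phi}$$
$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$
$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_r}{r^2\sin\theta} & \dfrac{\mathbf{1}_\theta}{r\sin\theta} & \dfrac{\mathbf{1}_\phi}{r} \\[6pt] \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\[6pt] A_r & rA_\theta & r\sin\theta\,A_\phi \end{vmatrix}$$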


PROBLEMS

V.1

A ship is traveling southwest at a speed of 20 knots relative to land. If the water current is
2 knots due north, what is the ship's velocity relative to the water?

V.2

Prove, with the aid of vector analysis, that the diagonals of a parallelogram bisect each
other.

V.3

Use vector analysis to show that for any triangle the line connecting the midpoints of
any two sides is parallel to the third and half its length.

V.4

Prove the law of cosines for a plane triangle with the aid of vector analysis.

V.5

Prove that the diagonals of a rhombus are perpendicular.

V.6

Find the equation of a plane through the point (2,3,4) and perpendicular to the vector
= 1x5 + 1112 - 1:3.

V.7

How far distant from the origin is the plane 3x + 4y + 5z = 12? Write the equations
of the planes parallel to it and twice as far from the origin.

V.8

Write the equation for the family of planes perpendicular to those of Problem V.7 and
parallel to the Z axis.

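V.9

Show that

$$\sin^2\theta = \frac{(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b})}{(\mathbf{a}\cdot\mathbf{a})(\mathbf{b}\cdot\mathbf{b})}$$

V.10

Use the result of Problem V.9 to prove the identity

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b}) = \begin{vmatrix} \mathbf{a}\cdot\mathbf{a} & \mathbf{a}\cdot\mathbf{b} \\ \mathbf{a}\cdot\mathbf{b} & \mathbf{b}\cdot\mathbf{b} \end{vmatrix}$$

V.11

Deduce the law of sines for a plane triangle from the fact that $\mathbf{c}\times\mathbf{c} = \mathbf{c}\times(\mathbf{a} + \mathbf{b})$.

V.12

What is the necessary and sufficient condition that three free vectors be coplanar?

V.13

Show that

$$\frac{d}{ds}(\mathbf{u}\cdot\mathbf{v}) = \mathbf{u}\cdot\frac{d\mathbf{v}}{ds} + \frac{d\mathbf{u}}{ds}\cdot\mathbf{v}$$

V.14

Show that

$$\frac{d}{ds}(\mathbf{u}\cdot\mathbf{v}\times\mathbf{w}) = \mathbf{u}\cdot\mathbf{v}\times\frac{d\mathbf{w}}{ds} + \mathbf{u}\cdot\frac{d\mathbf{v}}{ds}\times\mathbf{w} + \frac{d\mathbf{u}}{ds}\cdot\mathbf{v}\times\mathbf{w}$$

V.15

A top is rotating about a vertical axis at the constant rate of 50 rad/sec. Its point of contact is moving on a horizontal plane in a circle of radius 2 feet at a constant angular velocity of 10 rad/sec. For a point in the top a distance -l- in. from the axis, find the velocity in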
feet/sec as a function of time. (Reminder: The term velocity refers to a vector; speed is
used to designate the magnitude of a velocity.)

V.16

When air resistance is neglected, the trajectory of a shell is given by

$$\mathbf{r} = \mathbf{1}_x\,1{,}000t + \mathbf{1}_y(1{,}000t - 16.1t^2)$$

The point of observation is the gun, distances are in feet, and time is in seconds.
(a) Find the velocity $\mathbf{v}$ as a function of time.
(b) Find the acceleration $\mathbf{a}$ as a function of time.
(c) What is the gun elevation?
(d) What is the range of the shell?

V.17

If a particle travels along a space curve with velocity $\mathbf{v}$, show that its acceleration is given by

$$\mathbf{a} = \mathbf{1}_T\frac{dv}{dt} + \mathbf{1}_N\frac{v^2}{\rho}$$

in which $\mathbf{1}_T$ is a unit tangent vector, $\mathbf{1}_N$ is a unit normal vector in the principal plane, and $\rho$ is the radius of curvature of the space curve at the point occupied by the particle.

V.18

A constant-rise helix is wrapped on a cone. Find the parametric equations which describe it and write the equation of the line tangent to the helix at an arbitrary point.

V.19

Find the equation of the plane tangent to a paraboloid of revolution at an arbitrary point.

V.20

Show that spherical coordinates are orthogonal. Determine and explain the Jacobian.

V.21

With the aid of Equations (V.45)-(V.47), show that the Jacobian for the general Galilean
transformation is not zero.

V.22

When one assumes that the earth is a sphere, the equation of a line of latitude can be expressed as

$$r = r_0 \qquad \theta = \theta_0 \qquad \phi = ks$$

Find the circumference at this latitude.


V.23

A right circular cone is truncated by a sphere whose center is at the apex of the cone.
Find the total surface area of the toplike body consisting of the cone and its spherical cap.

V.24

The two cylinders $x^2 + y^2 = a^2$ and $x^2 + z^2 = a^2$ intersect, forming a common region.
Show that the area of this common region is $16a^2$.
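
The value quoted in Problem V.24 can be spot-checked numerically. The sketch below assumes that "area" means the surface area bounding the common region. On the cylinder x² + y² = a², with x = a cos φ, the part of the surface lying inside the other cylinder satisfies |z| ≤ a|sin φ|, so a strip of width a dφ has height 2a|sin φ|. By symmetry the second cylinder contributes the same amount. The function name and the midpoint-rule integration are illustrative choices, not the method of the text.

```python
import math

def bicylinder_surface_area(a, n=100_000):
    """Midpoint-rule estimate of the surface area bounding the region
    common to x^2 + y^2 = a^2 and x^2 + z^2 = a^2."""
    dphi = 2.0 * math.pi / n
    one_cylinder = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        height = 2.0 * a * abs(math.sin(phi))  # z-extent allowed by x^2 + z^2 <= a^2
        one_cylinder += height * a * dphi      # strip of width a*dphi on the cylinder
    return 2.0 * one_cylinder                  # both cylinders contribute equally

print(bicylinder_surface_area(1.0))  # close to 16 for a = 1
```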

V.25

Compute the volume of the body of Problem V.23.

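V.26

Show that in cylindrical coordinates

$$\nabla^2 \mathbf{A} = \mathbf{1}_r \left( \nabla^2 A_r - \frac{2}{r^2} \frac{\partial A_\phi}{\partial \phi} - \frac{A_r}{r^2} \right) + \mathbf{1}_\phi \left( \nabla^2 A_\phi + \frac{2}{r^2} \frac{\partial A_r}{\partial \phi} - \frac{A_\phi}{r^2} \right) + \mathbf{1}_z \nabla^2 A_z$$

V.27

Show that in spherical coordinates

$$\nabla^2 \mathbf{A} = \mathbf{1}_r \left[ \frac{1}{r} \nabla^2 (r A_r) - \frac{2}{r} \nabla \cdot \mathbf{A} \right] + \mathbf{1}_\theta \left( \nabla^2 A_\theta + \frac{2}{r^2} \frac{\partial A_r}{\partial \theta} - \frac{A_\theta}{r^2 \sin^2 \theta} - \frac{2 \cos \theta}{r^2 \sin^2 \theta} \frac{\partial A_\phi}{\partial \phi} \right) + \mathbf{1}_\phi \left( \nabla^2 A_\phi + \frac{2}{r^2 \sin \theta} \frac{\partial A_r}{\partial \phi} + \frac{2 \cos \theta}{r^2 \sin^2 \theta} \frac{\partial A_\theta}{\partial \phi} - \frac{A_\phi}{r^2 \sin^2 \theta} \right)$$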


V.28 An interesting proof of (V.111) is possible through a wedding of Stokes' theorem and the
divergence theorem.
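V.29 Show that

$$\int_V \nabla \times \mathbf{A} \, dV = -\oint_S \mathbf{A} \times d\mathbf{S}$$

V.30 Show that

$$\int_V \nabla \Phi \, dV = \oint_S \Phi \, d\mathbf{S}$$

V.31 Show that

$$\int_S \nabla \Phi \times d\mathbf{S} = -\oint_C \Phi \, d\mathbf{l}$$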

Author Index
Alhazen, 3-4, 15-16
Alpert, N. L., 363

Ampere, A. M., 218, 221-223, 244, 400, 469


Anaxagoras, 41
Arago, D. F., 38, 50, 218
Aristotle, 2, 15
Arx, A., 372
Avicenna, 3-4, 16

Bacon, F., 5, 16-17


Bacon, R., 3-4, 16
Bantle, W., 372
Barkhausen, H., 402
Bates, L. F., 443
Beccaria, G. B., 468
Bessel, F. W., 156
Biot, J. B., 218-221
Bitter, F., 402-403, 443
Bizette, H., 403, 446
Blackman, M., 493
Bradley, J., 20-23, 37
Brillouin, L., 428
Broer, L. J. F., 457

Cabeo, N., 399


Casimir, H. B. G., 457
Cauchy, A. L., 176, 557
Cavendish, H., 101-106, 221, 327, 468
Cedarholm, J. P., 83-85
Chu, L. J., 273, 320
Clausius, R., 329, 366
Coulomb, C. A. de, 106-112, 329, 399-400
Cranshaw, T. E., 76
Cummerow, R. L., 459
Curie, P., 372, 378, 402

Davisson, C., 488


Davy, H., 328, 469
Debye, P., 33~ 366, 386, 490

Desaguliers, J. T., 468


Descartes, R., 4-8, 18-19
de Sitter, W., 34, 61
Dirac, P. A. M., 126, 476
Dirichlet, P. G. L., 151, 173
Drude, P., 472
Dulong, P. L., 472
DuPre, F. K., 457
Durbin, R. P., 76

Egelstaff, P. A., 76
Einstein, A., 12-13, 40, 61-62, 72
Empedocles, 1-2, 15
Euclid, 3, 15
Euler, L., 10

Fairweather, A., 403


Faraday, M., 115, 223, 256-259, 327-329, 400-401
Fermi, E., 476
FitzGerald, G. F., 39
Fizeau, H., 11, 23-25, 38, 49-51, 82-83, 263
Forrer, R., 434
Forsbergh, P. W., Jr., 370
Foucault, J. B. L., 11, 23, 25-26
Fourier, J. B. J., 153
Franklin, B., 10, 99
Franz, R., 471
Fresnel, A. J., 11, 38, 50

Galileo, G., 4, 17
Galt, J. K., 460



Gauss, K. F., 115, 134
Gellibrand, H., 399
Gerlach, W., 403
Germer, L. H., 488
Gibbs, J. W., 567-568
Gilbert, W., 399
Gordon, J. P., 83
Gorter, C. J., 457
Gorter, F. W., 403
Grassman, H. G., 566-567
Gray, S., 327, 467-468
Green, G., 113, 273, 315
Griffel, M., 450
Griffiths, J. H. E., 460

Larmor, J., 418


Lawton, W. E., 117, 204-207
Legendre, A. M., 166
Lewis, G. N., 40, 85

Linde, J. O., 495


Loar, H. H., 76
Lorentz, H. A., 39, 59, 72, 264, 329-331, 343, 472

McAlister, D., 116, 199
MacDonald, D. K. C., 495
Maclaurin, C., 557
Maxwell, J. C., 12, 38-39, 115-117, 256, 260-264

Halblützel, J., 373


Hall, E., 235
Halliday, D., 459

Hamilton, W. R., 565-566


Havens, W. W., Jr., 76
Hay, H. J., 76
Heisenberg, W., 403, 441
Helmholtz, H. L. F., 254
Henry, J., 259
Henry, W. E., 431
Heron, 15
Hertz, H., 12, 264
Hitchcock, C. S., 364-365, 389
Hogan, C. L., 464
Hooke, R., 6-9, 19
Hupse, J. C., 430
Huygens, C., 8-10, 18, 20
Jakobsen, M., 76
Joule, J. P., 471, 489
Kennedy, R. J., 40, 59
Kepler, J., 3-5, 18
Keyes, F. G., 359
Kirchhoff, G. R., 11
Kirchner, A., 399
Kirkwood, J. G., 359
Kittel, C., 454
Kohlrausch, K. W. F., 263
Kronig, R., 457

Mendelssohn, K., 495


Merrit, F. R., 460
Michell, J., 399
Michels, A., 367

Michelson, A. A., 26-29, 39, 50, 51-58


Miller, D. C., 57
Millikan, R. A., 13
Minkowski, H., 319
Møller, C., 84
Morley, E. W., 28, 50, 51-58
Mossbauer, R. L., 76
Mossotti, F. O., 329, 366

Neel, L., 403-404, 445, 453


Neumann, F. E., 151, 173, 224
Newton, I., 7-10, 20, 263-264, 399
Oersted, H. C., 217-218
Ohm, G. S., 470-471, 478
Onnes, K., 473
Onsager, L., 366

Page, L., 224


Pauling, L., 348, 357
Peregrinus, 398

Petit, A. T., 472


Planck, M., 12-13, 40, 264
Plato, 397

Plimpton, S. J., 117, 204-207


Poincare, H., 40, 59, 72

Lagrange, J. L., 112, 557


Langevin, P., 330, 353, 401, 419
Laplace, P. S., 112, 114, 151

Poisson, S. D., 112-115, 148, 221, 329, 400
Porta, J. B., 398
Poynting, J. H., 285

Priestley, J., 99-100
Ptolemy, 3, 15

Riemann, G. F. B., 176


Roberts, F. F., 403
Robison, J., 100-101
Roemer, O., 19-20

Sanders, P., 367


Sanger, R., 360
Savart, F., 218-221
Schiffer, J. P., 76
Schipper, A., 367
Schulz, A., 76
Schwarz, H. A., 182
Shull, C. G., 446
Simmons, A. J., 319
Slater, J. C., 349
Smart, J. S., 446
Smyth, C. P., 364-365, 389
Snell, W., 6
Socrates, 397
Sommerfeld, A., 276, 320, 472
Spees, A. H., 88-89
Squire, C. F., 403, 446
Steinberger, J., 76
Stern, O., 403
Stevenson, A. F., 317
Stokes, G. G., 612
Stoner, E. C., 434
Stout, J. W., 450
Stratton, J. A., 273
Stuart, H. A., 359

Taylor, B., 557


Thomson, W. (Lord Kelvin), 116, 329
Thorndike, E. M., 59
Tolman, R. C., 40, 85
Townes, C. H., 83-85
Tsai, B., 403, 446
Tycho, 21

Uhlig, H. H., 359

Van Vleck, J. H., 403, 450


Volta, A., 468-469
Von Hippel, A. R., 392

Wallis, J., 564


Weber, W., 263, 401
Weiss, P. E., 402, 434

Welch, J. E., 403


Went, J. J., 403
Wessel, C., 565
Whittaker, E., 114
Wiedemann, G., 471
Wiegand, C. E., 76
Wien, W., 254
Wollaston, W. H., 469
Wood, E. A., 460

Yager, W. A., 460


Young, T., 10-11, 37-38

Zahn, C. T., 88-89, 359


Zavoisky, E., 459
Zeeman, P., 50, 427
Zeiger, H. J., 83


Subject Index
Aberration of light, 20-23, 37-38
Absolute motion, 48-49
Absolute potential, 129
Absolute reference frame, 48
Acceptors, 503
Acoustic power, 32-34
Acoustic velocity, 31-32
Acoustic wave equation, 29-32
Acoustic waves, 29-34
Adiabatic demagnetization, 458
Alkali halides, 349, 363, 369
Ammonia beam maser, 83
Ampere's circuital law, 244-245, 270
Ampere's experiments, 222
Analytic function, 176
Anisotropy energy, 443
Antiferromagnetism, 403, 445-451
Associated Legendre functions, 168, 307-308
Atomic magnetic moment, 240-241
Average field in sphere
electric, 544-546
magnetic, 552-556
Band theory, 473-478, 500-505
Barium titanate, 373
Bessel functions, 157-165, 519-523
Bessel's equation, 156
Biot-Savart experiment, 218-221
Biot-Savart law, 221, 232-234
Bitter powder technique, 443
Bohr magneton, 423
Boltzmann transport equation, 482-483
Brillouin function, 428
Capacitance, 186-193, 337-338
Cauchy-Riemann equations, 176

Cavendish experiment
Cavendish, 101-106
Maxwell-McAlister, 199-201
Maxwell's analysis, 201-204
Plimpton-Lawton, 204-207
Cedarholm-Townes maser experiment, 83-85
Child-Langmuir law, 150
Circular polarization, 296, 463-464
Circular waveguide, 306
Circulator, 463
Classical invariance
of distance, 43
of mass, 44
Classical velocity transformation, 42-43, 45
applications, 46-49
cars, 46-47
light, 47-48
sound, 47
Classification
of conductive properties, 473-478
of magnetic materials, 396
Clausius-Mossotti equation, 366-369
Clock synchronization, 70-71
Coaxial cylinders, 137-138, 246-247, 284-285
Coefficient
of capacitance, 192
of electrostatic induction, 192
of potential, 191
Coercive force, 435
Compass, 398, 407
Complex paramagnetic susceptibility, 457
Complex permittivity, 379-380
Complex Poynting vector, 290
Complex Poynting's theorem, 290
Complex vectors, 615-616
Composite static fields, 252-253
Composition of general sources, 530-531


Conditions at infinity, 275-279


Conduction band, 477-478, 504
Conductive materials, 467
Conductor, 478
Conductor-vacuum interface, 140-142
Conformal mapping, 175-186
Continuity equation, 271
Contraction of length, 73-74
Contraction hypothesis, 39-40, 59
Corpuscular theory of light, 6
Coulomb experiment, 106-112
Coulomb's law
formulation, 117-121
history, 99-112
Coulomb's torsion balance, 106-108
Covalent bond, 346
Covariance of Newton's general force law,
43-45
Crossed-slot coupler, 317-319
Curie law of paramagnetism, 429
Curie temperature
ferroelectric, 372
ferromagnetic, 433
paramagnetic, 433, 438, 442, 447
Curie-Weiss law
electric, 372, 376
magnetic, 433, 438, 453
Curl, 607-611
Cyclotron frequency, 456
Cyclotron resonance, 455-456

Damping constant (dipole), 550-551


Debye equations, 386
Debye temperature, 492
Debye theory of specific heat, 490-494
Debye unit, 353
Definition
of electric field, 121
of magnetic field, 229
Delta function, 126-127
Diamagnetism, 401, 416-420
Diamond structure, 501
Dielectric constant
definition, 355
of gases, 356-361
of liquids, 365-366
of solids, 361-365
Dielectric losses, 390-393
Dielectric susceptibility, 354
Dilatation of time, 75

Dipolar relaxation, 383-390


Dipole
electrostatic, 130-131
magnetic, 241
Dipole moment
electric, 330-331
magnetic, 241, 404
Dirac delta function, 126-127
Directional coupler, 318
Dirichlet problem, 151, 173
Displacement current, 263, 270
Displacement vector, 116
Divergence, 603-606
Divergence theorem, 115, 607
Divisions of space-time, 79-80
Domain wall energy, 443
Domains, 437, 442-445
Donors, 503
Doppler effect
classical, 32
relativistic, 96
Doppler shift
classical, 515
relativistic, 96
Dulong-Petit law, 472, 493
Dynamic length, 62-70

Effective mass, 456, 504

Electrets, 379
Electric dipole
dynamic, 547
static, 130-131
Electric dipole moment, 330-331
Electric field
dynamic, 266
static, 121-127
Electric flux, 138
Electrical conductivity, 480
Electromagnetic induction, 259
Electromagnetic nature of light, 263
Electromagnetics, 256-325
Electron orbital motion, 421-424
Electron paramagnetic resonance, 458-459
Electron spin, 424
Electronic polarizability
static, 344-346
time-harmonic (complex), 380-382
Electrostatic field, 121
Electrostatic potential, 127-134
Electrostatic stored energy, 193-197

Electrostatics, 98-215
Electrostriction, 377
Elliptical polarization, 296
Emission theories, 61
Energy
electrostatic, 193
kinetic (relativistic), 89-91
magnetic, 283-285
Energy gaps, 475
Equipotentials, 130, 140
Equivalence of mass and energy, 40, 91
Equivalent magnetic charge, 408
Ether, 5, 34-35, 37-40, 47-49, 61
Ether drag, 50, 58-59
Event, definition, 71
Exchange integral, 440-442
Extended theorem of mean value, 559

Far-field approximation, 280


Far-field pattern, 280
Faraday rotation, 463
Faraday's emf law, 259, 270
Faraday's experiments, 257-259
Fermi-Dirac statistics, 476-477, 503-505
Fermi level, 477, 487
Ferrimagnetism, 403-404, 451-454
Ferrites, 403, 451-454, 462-464
Ferroelectric crystals, 370-376
Ferromagnetic domains, 442-445
Ferromagnetic resonance, 459-461
Ferromagnetism, 402, 436-445
Field tensor, 322
Field transformation equations
to electromagnetics, 264-267
generalization, 532-533
to magnetostatics, 224-230
Fizeau's moving water experiment, 49-51,
82-83
Flux lines, 115
Flux maps, 600
Force between currents, 237-239
Force transformation law, 92-94
Four-potential, 320
Fourth fundamental unit, 238
Fractional magnetization, 438
Free electron theory of metals, 478
Fringing, 188


Galilean transformation, 42, 45, 72


Gauss' divergence theorem, 115
Gauss' law, 134-136
General sources, 530-531
Generalized coordinates, 589-594
Generalized electric flux density, 339
Generalized magnetic intensity, 408-410
Germanium, 500-508
Gouy apparatus, 414
Gradient, 598-603
Gravitational potential, 44
Green's functions, 172-175, 315-317, 534-536, 540-543
Green's integral theorems, 613-614
Green's reciprocation theorem, 191
Guided waves, 297-303

Half-wave dipole, 281-282


Hall effect, 235-236
Hankel functions, 157
Heat flow, 497
Heisenberg's exchange theory, 441
Helmholtz coils, 254
Hole-electron pairs, 501
Huygens' principle, 9
Hydrodynamic analog of electromagnetism,
261-263
Hysteresis, 371, 416, 435, 461
Hysteresis loss, 461-462

Ideal gas law, 30


Images, 142-148
Impedance of free space, 286
Impurities in semiconductors, 502
Inductance, 309-314
Inertial coordinate systems, 41
Infinite homogeneous medium, 369
Infrared absorption, 392
Initial permeability, 436
Insulator, 478
Integral solutions of Maxwell's equations,
272-275
Interdependence of space and time, 61-70
Interference of light, 10-11
Interference fringes, 51
Internal field constant
electric, 344
magnetic, 411, 440



Invariance
of charge, 226, 229
of electric flux, 226, 229
of transverse length, 62-66
Inverse square law
formulation, 117-121
history, 99-112
Maxwell's derivation, 204
Ionic bond, 346
Ionic crystals, 349, 362
Ionic polarizability
static, 348
time-harmonic (complex), 382-383
Ionic polarization, 346-351
Irrotational vector fields, 614-615

Joule's experiment, 471


Joule's law, 471, 489

Kennedy-Thorndike experiment, 59

Lande splitting factor, 426


Langevin function, 353, 440
Laplace's equation, 112, 151-186, 249-252
Laplacian operator, 606-607
Larmor angular frequency, 418
Law of inertia, 41
Legendre functions, 524-529
Legendre polynomials, 167
Legendre's equation, 166
Length contraction, 73-74
Light
aberration of, 20-23, 37-38
corpuscular theory, 6
electromagnetic nature, 12
interference, 10-11
nature of, 1-14
polarization of, 8
spectrum, 7
transverse vibrations, 10
velocity of, 14-29
wave theory, 6
Linear polarization, 296
Local field
electric, 342-344
magnetic, 410-411, 437
Lorentz-FitzGerald contraction hypothesis,
39-40, 59

Lorentz force law, 230-231, 252, 265


Lorentz internal field
electric, 343
magnetic, 411, 440
Lorentz transformation equations, 39, 70-73
Loss tangent, 391
Lumped circuit concept, 314

Macroscopic electric field, 331-337


Macroscopic magnetic field, 404-408
Magnetic current loop, 240-241
Magnetic dipole moment, 241
Magnetic flux lines, 258, 260
Magnetic focusing, 235
Magnetic intensity, 236-237
Magnetic polarizability, 411
Magnetic poles, 398
Magnetic stored energy, 283-285
Magnetic susceptibility, 411-414
Magnetization density, 405
Magnetostatic field, 231
Magnetostatic vector potential function, 239
Magnetostatics, 216-255
Maser, 83
Mass (variable), 40, 85-89
Mass-energy equivalence, 40, 91
Mass spectrometer, 465
Mass spectroscope, 254
Mass transformation law, 91-92
Matthiesen's rule, 496
Maxwell-McAlister experiment, 199
Maxwell's equations, 268-270, 393-394,
464, 509
Mean free path (metals), 486
Mean free time (metals), 485

Measurement
of electrostatic fields, 341
of susceptibility
electric, 355
magnetic, 414-416
of velocity of light, 19-29
Meson decay, 76
Method of images, 142-148
Michelson interferometer, 28, 39, 51-58
Michelson-Morley experiment, 51-58, 513
Minkowskian formulation, 319-323
Mobility, 507
Modified Bessel functions, 159
Molar polarizability, 366
Momentum, 89-90

Momentum principle, 87
Mossbauer effect, 76
Motion of source, influence on wave velocity,
32, 34, 47-48, 61-62
Moving water experiment of Fizeau, 49-51,
82-83
Multicapacitor systems, 189-193
Mutual capacitance, 193
Mutual inductance, 310-311
Nature of light, 1-14
Neel temperature, 447
Neumann problem, 151, 173
Neutron diffraction, 446, 452
Newtonian space, 41
Nuclear spin, 424
Oersted's experiment, 217
Ohm's experiments, 470
Ohm's law, 470-471, 478-485
Optical absorption, 393
Orbital angular momentum, 421
Orientational polarizability, 354
Orientational polarization, 351-354, 371
Orthogonal coordinates, 594
Orthogonalized Bessel functions, 160-162
Paramagnetic relaxation, 456-458
Paramagnetic resonance, 458-459
Paramagnetism, 402, 427-433
Periodic table, 422
Permanent magnetic moments, 420-427
Permeability, 412
Permittivity
free space, 120
materials, 355
Photoelectric effect, 12-14
Photons, 13
Piezoelectrics, 376-379
Plane waves, 291-296
Plimpton-Lawton experiment, 204-207
Poisson's equation, 114, 148-151, 245
Poisson's theory, 112-114
Polar molecules, 346
Polarizability
electronic, 345
ionic, 348
orientational, 354
Polarization density, 334


Polarization
of light, 8
of waves, 296
Polarized molecules, 326, 328
Potential difference, 129
Potential energy of magnetic moment in a
field, 243
Potential function
scalar, electrostatic, 128
time-varying, electromagnetic, 279-280
vector, magnetostatic, 239
Poynting vector, 286
Poynting's theorem, 285-291
Principle of relativity, 40-46, 61
Proper distance, 79
Proper time, 78
Properties of ferromagnetic materials, 433-436
Pyroelectricity, 378
Quantum states, 420-421
Quenching of magnetic moments, 424, 432
Radiation integral, 280
Radiation pressure, 295
Recessional velocity, limit on, 82
Rectangular waveguide, 300-303, 317-319,
540-543
Regular function, 176
Relative dielectric constant, 355
Relative permeability, 412
Relativistic velocity transformation, 81-82
Relativity principle, 40, 61
Relaxation time in metals, 481
Remanent field, 435
Residual resistivity, 494
Resistivity, 480, 494
Retarded potentials, 279
Robison experiment, 100
Rochelle salt, 372
Rolle's theorem, 558
Ruler experiments, 62-70
Russell-Saunders coupling, 425-427
Rutherford atomic model, 209
Scalar potential function
electrostatic, 128
time-varying
dielectric, 547-549
free space, 279



Scattering

impurity, 494
lattice, 494
Schwarz transformation, 182
Seignette electricity, 372
Self-capacitance, 193
Self-inductance, 310, 312-314
Semiconductors, 478, 500-508
Separation of variables, 152, 156, 166
Silicon, 500-508
Snell's law, 6
Solenoidal vector fields, 614-615
Solenoids, 247-248, 407
Solutions
to Laplace's equation
using conformal mapping, 175-186
in cylindrical coordinates, 155-165
in rectangular coordinates, 152-155
in spherical coordinates, 165-172
to wave equation
in cylindrical coordinates, 303-306
in rectangular coordinates, 291-303
in spherical coordinates, 306-309
Sommerfeld conditions, 276
Sound waves, 29-34
Source transformation law, 267
Sources of magnetic moments, 404
Space-time interdependence, 61-70
Special relativity, 36-97
Specific heat, 490-494
Specific inductive capacity, 328
Spectroscopic notation, 473
Spherical Bessel functions, 307
Spherical cavity, 308
Spinel structure, 452
Spin-lattice interaction, 457
Spin-orbit coupling, 425-427
Spin-spin interaction, 457
Splitting factor, 427
Spontaneous polarization
electric, 370, 374-376
magnetic, 435, 437
Stellar aberration, 20-23, 37-38
Stokes' theorem, 612
Stored energy
electromagnetic, 286
electrostatic, 193
magnetostatic, 283-285
Superconductivity, 473
Superposition of forces, 118

Susceptibility
dielectric, 354-356
magnetic, 411-414
Symmetry experiments
colliding balls, 85-87
rulers, 62-70
Synchronization of clocks, 70-71

Table of vector relations, 616-617


Taylor's series
for one variable, 560-561
for several variables, 561-563
Temperature dependence
of dielectric constant, 359, 363-365, 388
of magnetic susceptibility, 429, 438, 447,
453
of resistivity
in metals, 494-496
in semiconductors, 508
Tensor conductivity, 507
Tensor permeability in ferrites, 462-464
Theorem of mean value, 558
Thermal conductivity of metals, 496-500
Time dilatation, 75
Time-space interdependence, 61-70
Torque on magnetic moment, 243
Transformation
of electric force, 224-230
of fields, 264-267
of sources, 267
Transformation law
for force, 92-94
for mass, 91-92
Transformer action, 257-258
Transition elements
iron group, 423, 425, 432
rare earths, 424, 432

Unguided waves, 291-296


Uniform plane waves, 291-296
Uniqueness theorem for electrostatic potential, 151-152

Valence band, 477-478, 505


Variability
of longitudinal length, 66-68
of mass, 40, 85-89
Vector analysis, 564-620
