
Statistical Mechanics Notes: Honours

Contents

Preface

1 Classical Thermodynamics
  1.1 Brief Review of Classical Thermodynamics
    1.1.1 The First Law of Thermodynamics
    1.1.2 The Second Law of Thermodynamics
    1.1.3 Third Law of Thermodynamics
    1.1.4 Zeroth Law of Thermodynamics
  1.2 Entropy
    1.2.1 Origin of the Entropy Function: Clausius' Theorem
    1.2.2 Properties of the Entropy Function
  1.3 Mini assignment 1

2 Fundamental Equation
  2.1 Fundamental Equation of Thermodynamics
  2.2 Alternative Forms of the Fundamental Relation
  2.3 Thermodynamic Potentials
    2.3.1 In terms of S and P: Enthalpy
    2.3.2 In terms of T and V: Helmholtz Free Energy
    2.3.3 In terms of T and P: Gibbs Free Energy
  2.4 The Maxwell Relations
  Exercises

3 Models of Thermodynamic Systems
  3.1 Quantum Models
    3.1.1 Stationary States
    3.1.2 External Parameters
    3.1.3 Particle Number
    3.1.4 Interaction
    3.1.5 Independent Particle Models
  3.2 Classical Models
    3.2.1 Classical Specification of State
    3.2.2 External Parameters
  Exercises

4 Isolated Systems: Fundamental Postulates
  4.1 Thermodynamic Specification of States
  4.2 Equilibrium States of Isolated Systems
  4.3 Microscopic View of Thermodynamic Equilibrium
  4.4 Isolation of a System
  4.5 Dependence of Ω on E and δE
  4.6 Thermal Interaction
    4.6.1 The Probability Distribution
    4.6.2 Sharpness of Peak
  4.7 Entropy
    4.7.1 Meaning of β and Ω
  4.8 Density of States
  4.9 Approximate value for Ω_total
  4.10 Fundamental equation
  Exercises

5 Microcanonical formalism: Magnetic Systems
  5.1 Magnetic Materials and Behaviour
  5.2 Description of Magnetic Behaviour
  5.3 Paramagnetic Materials
    5.3.1 A Model for Paramagnetism
    5.3.2 Single Particle States Available to Each Atom of the Solid
    5.3.3 Microstates of the Solid
    5.3.4 Number of Accessible States in the range E to E + δE
    5.3.5 The Fundamental Relation
    5.3.6 Thermodynamics of the Paramagnetic system
    5.3.7 Equations of State
    5.3.8 Predictions of the Model
    5.3.9 Heat Capacities
    5.3.10 Entropy
  Exercises

6 Classical Model of a Gas
  6.1 Classical phase space density
  6.2 General Model of a Gas
  6.3 Ideal Gas
    6.3.1 Assumptions
    6.3.2 Fundamental Relation
    6.3.3 Properties of the Gas
  6.4 Monatomic Ideal Gas
    6.4.1 Evaluation of Ω
    6.4.2 Fundamental Equation
    6.4.3 Heat Equation and Specific Heat
    6.4.4 Dependence of S on N
  6.5 Correct Classical Counting of States
  6.6 Gibbs Paradox
  Exercises

7 Canonical Formalism
  7.1 A New Formalism
  7.2 The Probability Distribution
    7.2.1 Probability of Occurrence of State r
    7.2.2 Probability of Occurrence of Energy E
    7.2.3 Probability of Energy in range E to E + δE
  7.3 Statistical Calculation of System Parameters
    7.3.1 Energy
    7.3.2 Conjugate Variables
  7.4 Fundamental Relation
  7.5 Entropy
  Exercises

8 Heat Capacity of Solids
  8.1 Modelling Specific Heats
  8.2 Experimental Facts
  8.3 Historical
  8.4 Einstein's Model
    8.4.1 Partition Function
    8.4.2 Thermodynamics of the Oscillator System
    8.4.3 Entropy
    8.4.4 Energy Equation
    8.4.5 Heat Capacity
  8.5 Discussion of Results
    8.5.1 Energy
    8.5.2 Heat Capacity
  8.6 Comparison with Experiment
  Exercises

9 Paramagnetism: Canonical Approach
  9.1 Magnetic Moment of Spin-S Particles
  9.2 Quantum States of Spin-S Particles
  9.3 Statistics of a Single Paramagnetic Atom
    9.3.1 Average Magnetic Moment
    9.3.2 Average Energy
  9.4 Properties of the Brillouin Functions
  9.5 Properties of Paramagnetic Solids of Spin S
  9.6 Thermodynamics of Paramagnetic Materials
  Exercises

10 Canonical Formalism in Classical Models
  10.1 Ideal Monatomic Gas
  10.2 Monatomic Gas: Another Method
  10.3 Maxwell's Distribution
    10.3.1 Distribution of a Component of Velocity
    10.3.2 Distribution of Speed
  10.4 Gas in a Gravitational Field
  10.5 Equipartition of Energy
  Exercises

11 Quantum Theory of Ideal Gases
  11.1 Quantum Theory of Gases
    11.1.1 Indistinguishability of particles
    11.1.2 The Pauli Exclusion Principle
    11.1.3 Spin and Statistics
    11.1.4 Effect of Indistinguishability and Exclusion on Counting Procedures
    11.1.5 Maxwell-Boltzmann Case
    11.1.6 Bose-Einstein Case
    11.1.7 Fermi-Dirac Case
  11.2 The Partition Functions
    11.2.1 Maxwell-Boltzmann Gas
    11.2.2 Bose-Einstein Gas
    11.2.3 Fermi-Dirac Gas
  11.3 Mean occupation numbers

12 Blackbody Radiation
  12.1 Equilibrium States of a Radiation Field
  12.2 Modelling Cavity Radiation
  12.3 Cavity Modes
  12.4 Partition Function of the Radiation Field
  12.5 Statistics of the Radiation Field in Thermal Equilibrium
  12.6 Planck's Law for Blackbody Radiation
  12.7 Spectral Properties of the Radiation Field
  12.8 Fundamental Relation for Radiation Field
  12.9 Thermodynamics of the Radiation Field
  Exercises

13 Grand Canonical Formalism
  13.1 Another Formalism
  13.2 Chemical Potential
  13.3 Grand Canonical Distribution
  13.4 The Grand Partition Function
  13.5 Grand Canonical Potential
  13.6 Thermodynamics via the Grand Canonical Potential
  13.7 Relation to the Canonical Potential
  13.8 Application to Boson and Fermion Gases
    13.8.1 Grand Partition Function for Gas of Bosons
    13.8.2 Grand Partition Function for Fermion Gas
  13.9 Occupation numbers
    13.9.1 Bosons
    13.9.2 Fermions
  13.10 Quantum statistics in the classical limit
    Low concentration
    High temperature

14 Ideal Fermi Gas
  14.1 Fermi-Dirac Particles
  14.2 Ideal Fermi Gas
    14.2.1 Classical limit
  14.3 Formal Criteria for a Degenerate Fermion Gas
  14.4 Density of States
    14.4.1 Properties of the Degenerate Gas
  14.5 Fundamental Equation
    14.5.1 Fermi-Dirac Functions
  14.6 Simplistic Model of a white dwarf star
    14.6.1 Relativistic Density of states
    14.6.2 Energy of a Relativistic Fermi Gas at T = 0
    14.6.3 Pressure of a Relativistic Fermi Gas at T = 0
    14.6.4 Stability of the White Dwarf Star
  Exercises

15 Phase transitions and critical exponents
  15.1 Dynamical model of phase transitions
  15.2 Ising model in the zeroth approximation
  15.3 Critical Exponents

A Statistical Calculations
  A.1 The Integral ∫ e^{−x²} dx
  A.2 The Integral ∫₀^∞ xⁿ e^{−x} dx
  A.3 Calculation of n!
    A.3.1 Approximate Formulae for n!
    A.3.2 Lowest Order Approximation
    A.3.3 Stirling's Formula
    A.3.4 Infinite Series for n!
  A.4 The Gamma Function
    A.4.1 Definition
    A.4.2 Recurrence Relation
    A.4.3 Γ and the Factorial Function

B Volume of a Sphere in Rⁿ

C Evaluation of ∫₀^∞ x³ (eˣ − 1)⁻¹ dx

D Fermi-Dirac Functions

Preface
These notes are based on notes written by F A M Frescura. They are intended for the honours course on
Statistical Mechanics at the School of Physics, University of the Witwatersrand, and may not be reproduced
or used for any other purpose.
DPJ January 2009

Chapter 1
Classical Thermodynamics
1.1 Brief Review of Classical Thermodynamics
Historically, thermodynamics developed out of the need to increase the efficiency of early steam engines.
Classical Thermodynamics (TD; from the Greek therme, meaning heat, and dynamis, meaning power) is
a branch of physics that studies the effects of changes in thermodynamic variables on physical systems at
the macroscopic scale by analyzing the collective behaviour of the constituent parts. Roughly, heat means
"energy in transit" and dynamics relates to "movement"; thus, in essence thermodynamics studies the
movement of energy and how energy instills movement.
The starting point for most thermodynamic considerations is the laws of thermodynamics, which
postulate that energy can be exchanged between physical systems as heat or work. They also postulate
the existence of a quantity named entropy, which can be defined for any system. In thermodynamics,
interactions between large ensembles of objects are studied and categorized. Central to this are the
concepts of system and surroundings. A system is composed of particles, whose average motions define its
properties, which in turn are related to one another through equations of state. Properties can be combined
to express internal energy and thermodynamic potentials, which are useful for determining conditions
for equilibrium and spontaneous processes. With these tools, thermodynamics describes how systems
respond to changes in their surroundings. This can be applied to a wide variety of topics in science and
engineering, such as engines, phase transitions, chemical reactions, transport phenomena, and even black
holes. The results of thermodynamics are essential for other fields of physics and for chemistry, chemical
engineering, aerospace engineering, mechanical engineering, cell biology, biomedical engineering, and
materials science to name a few.
Classical Thermodynamics deals with the macroscopic properties of macroscopic systems. It does so
without making any assumptions about the ultimate constitution of matter, and does not really depend
on whether matter has any ultimate constitution at all. In order to develop a statistical description
of thermodynamic systems from a microscopic description, the basic facts that we will need about
macroscopic systems are as follows:
1. Left for a sufficiently long time, each macroscopic system eventually settles into a condition in which
its macroscopic properties no longer change with time but remain constant until the system is disturbed
by outside influences. These settled states are called equilibrium states, or states of thermodynamic
equilibrium, of the system. The characteristic time needed for the system to settle into an equilibrium
state is called the relaxation time for the system. Relaxation times differ hugely for different systems,
and can be as short as 10⁻⁶ s for gases, and as long as several centuries for glass.

2. In each equilibrium state, the configuration of the system can be specified by a small number
of macroscopic configuration variables. These normally include variables to specify the spatial
dimensions of the system (volume, area, length), the amount of material it contains (mole numbers
or masses of each constituent chemical species, total mole number or mass of the system), and its
electrical and magnetic condition (polarisation, magnetisation). The configuration of the simplest
systems, called simple hydrostatic systems, is specified by the volume V it occupies and the total
amount of material that it contains. Classically, this was specified by its total mass m. However, in
view of the atomic nature of matter, it is better to specify the material content of the system either by
the total number of moles it contains (commonly used in chemistry), or by the total number N of
atoms, molecules or particles it contains (commonly used in physics).

3. For each configuration variable of a given system, there is an associated generalised force which is
responsible for changing the value of that configuration variable. Thus, pressure is responsible for
altering volume, surface tension alters area, force alters length, chemical potentials alter species mole
numbers, electric field alters polarisation, magnetic field alters magnetisation, and so on.

4. In each equilibrium state of the system, each generalised force has a fixed constant value.

5. The generalised forces are conjugate to their associated configuration variables in the sense that they
occur conjointly in the expression for the work done by the system in a given change of configuration.
Denote the configuration variables by x_i, and their associated generalised forces by X_i. Then, if the
configuration of the system is changed quasistatically and non-dissipatively (that is, reversibly) from
value x_i to value x_i + dx_i, the amount of work done by the system is given by

đW = Σ_i X_i dx_i = X_i dx_i    (1.1)

where the second form uses the summation convention.

6. Each equilibrium state has a fixed definite temperature T. This is the only variable which, in the initial
stages of the theory, does not have a conjugate. After the introduction of the Second Law, which leads
to the discovery of a new configuration variable for the system called its entropy S, we discover that T
is the conjugate of S. However, S is not a directly measurable macroscopic parameter for the system,
so we do not include it here.

7. In each equilibrium state, each macroscopic measurable variable, that is x_i, X_i, and T, has a well
defined constant value, and the equilibrium state is fully characterised by these values. The variables
(x_i, X_i, T) are therefore called state variables, since they characterise the equilibrium state of the
system.

8. Not all of the state variables are independent. Fixing a certain number of them uniquely fixes the values
of all the others. This means that the state variables are subject to a certain number of relations, called
the equations of state of the system. The number of independent state variables in the system is called
the number of degrees of freedom of the TD system. For example, a simple hydrostatic system has
state variables (V, P, T, N). Of these, only two are independent. This system thus has two degrees of
freedom (a worked illustration follows this list).

Mathematically, we can represent each set of values (x_i, X_i, T) as a point in a (2n + 1)-dimensional
Cartesian space, R^{2n+1}. The existence of equations of state means that not all points of this space
represent possible equilibrium states of the system. Only those points that lie on the lower dimensional
hypersurface defined by the equations of state represent equilibrium states. The points off that surface
have no physical interpretation at all. Denote the number of equations of state for the system by r. The
existence of r relations among 2n + 1 variables defines a (2n + 1 − r)-dimensional hypersurface.
The space of equilibrium points is thus a manifold, or space, of dimension (2n + 1 − r).
This space is called the state space for the TD system.
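As a concrete illustration (an added example; the ideal gas law is quoted here, not derived in these
notes), consider a fixed quantity of ideal gas as a simple hydrostatic system. It has n = 1 configuration
variable V, one conjugate force P, and the temperature T, giving 2n + 1 = 3 state variables, and the
single (r = 1) mechanical equation of state

P V = N k_B T

The state space is therefore the (2n + 1 − r) = 2-dimensional surface {(V, P, T) ∈ R³ : P V = N k_B T},
in agreement with the two degrees of freedom counted in point 8 above.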

The basic principles of classical TD that we need to use are as follows:

1.1.1 The First Law of Thermodynamics

In each equilibrium state, the system has a definite, well defined amount of energy E, called its internal
energy. The internal energy is therefore a function of the equilibrium state, and so can be expressed
mathematically as a function of any set of independent variables taken from (xi , X i , T ). For example, for
a simple hydrostatic system, we have E = E(V, P ), E = E(V, T ), or E = E(P, T ).
If the equilibrium state of the system is changed, its internal energy increases by a fixed definite amount
ΔE = E(f) − E(i), where i and f represent the initial and final equilibrium states of the system. This
energy difference must be supplied by the surroundings of the system, and may be supplied in one of
two forms only: work, and heat. Work is energy transferred to or from the system by virtue of changes
in its configuration coordinates alone. Energy transferred by any means other than a change of system
configuration is called heat. From a microscopic point of view, heat is energy transferred by changes in all
of the configurational degrees of freedom of the system which remain unaccounted for in the macroscopic
description, that is, the microscopic degrees of freedom. The sign conventions for work and heat are as
follows: positive work is energy transferred out of the system by the configurational degrees of freedom;
heat is energy transferred into the system.
1.1.1.1 The Principle of the Conservation of Energy

This principle states that the total energy of the system and its surroundings is constant. Denote the work
done by the system in a given change of equilibrium by W , and the heat into the system by Q. Then, the

conservation law gives E(f) = E(i) + Q − W, or

Q = ΔE + W    (1.2)

For infinitesimal changes of equilibrium state, this becomes

đQ = dE + đW    (1.3)

where the symbol đ indicates a small amount of the associated quantity, and not an infinitesimal difference
of function values. In other words, there are no functions Q and W that can be differentiated to give
đQ and đW. These two equations are often called "the First Law of Thermodynamics." In fact, they are
not. The First Law asserts the existence of the function E. These two equations are a combination of two
independent laws, the First Law and the Principle of the Conservation of Energy. But no confusion can or
will arise if you continue to refer to these equations as "the First Law."
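A quick numerical illustration (added here; the numbers are arbitrary): if a gas absorbs Q = 500 J of
heat from its surroundings while doing W = 200 J of work on them, then by (1.2) its internal energy
changes by

ΔE = Q − W = 500 J − 200 J = 300 J

whatever the details of the process by which the change was effected.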

1.1.2 The Second Law of Thermodynamics

This law states that heat does not, of itself, flow from a cold body to a hotter one (Clausius Statement).
An equivalent statement is that no system can of itself convert a given amount of heat completely into
work (Kelvin-Planck Statement). Logic applied relentlessly to the second law shows that, for every
thermodynamic system, there exists a function of state, S, called the entropy of the system, which has the
property that for all processes of the system,
T dS = dE − X_i dx_i    (1.4)

For simple hydrostatic systems, this relation is


T dS = dE + P dV − μ dN    (1.5)

If, furthermore, the system is closed, then dN = 0 and this relation reduces to the familiar
T dS = dE + P dV    (1.6)

This relation is the single most important equation in TD, and we consider it in detail later.
The most common enunciation of the second law of thermodynamics is essentially due to Rudolf Clausius:
"The entropy of an isolated system not in equilibrium will tend to increase over time, approaching a
maximum value at equilibrium."
Note that the content of (1.6) is different in general from that of (1.3). In a very special subclass of
idealised processes, called reversible (these are the TD analogue of frictionless motion in Mechanics), we
have
đQ = T dS and đW = X_i dx_i    (1.7)

and the content of (1.3) and (1.6) becomes identical. However, for irreversible processes, which hugely
outnumber the reversible ones and include all real processes which occur in nature (as opposed to ideal
processes, which do not), we have
đQ < T dS and đW < X_i dx_i    (1.8)

For these processes, equations (1.3) and (1.6) are different in content and provide independent pieces of
information.

1.1.3 Third Law of Thermodynamics

The third law of thermodynamics is an axiom of nature regarding entropy and the impossibility of reaching
absolute zero of temperature. The most common enunciation of the third law of thermodynamics is: "As
a system approaches absolute zero of temperature all processes cease and the entropy of the system
approaches a minimum value."

1.1.4 Zeroth Law of Thermodynamics

The zeroth law of thermodynamics is a generalized statement about bodies in contact at thermal
equilibrium and is the basis for the concept of temperature. The most common enunciation of the zeroth
law of thermodynamics is: "If two thermodynamic systems are in thermal equilibrium with a third, they
are also in thermal equilibrium with each other."

1.2 Entropy
Entropy is at the centre of statistical mechanics.

1.2.1 Origin of the Entropy Function: Clausius' Theorem

Using the Second Law, Clausius proved the existence of a new state function S for any given
thermodynamic system. He called the new state function the entropy of the system. Entropy means
conversion. Clausius chose this name because S determines the maximum amount of work that can be
derived from a given amount of heat. In other words, it determines the conversion of heat into work. The
existence of S follows directly from a result which we now call the Clausius Theorem.

Theorem 1 (Clausius' Theorem)

If Γ is any quasistatic cyclic process for a given system, then

∮_Γ đQ/T ≤ 0    (1.9)

The integral here is taken over one complete cycle. The equality holds if and only if the process is also
non-dissipative, and thus reversible.
A general proof of this theorem is found in Fermi (Fermi, 1937, p 46-49). Other books, like Zemansky
and Dittmann (Zemansky and Dittmann, 1997, p 186-189) and Sears and Salinger (Sears and Salinger,
1975, p 127-129), generally give a proof that is valid only for systems of 2 degrees of freedom, and prove
only half of Clausius' result.
If we restrict ourselves to reversible cyclic processes alone, this theorem states that, for all reversible
cyclic processes of the system,

∮ đQ/T = 0

This means that the integral taken between any two given equilibrium states is path independent, and so
can be used to define a function S by putting
S(f) = ∫_i^f đQ/T + S(i)    (1.10)

Here i is any fixed chosen equilibrium state, and f is any other equilibrium state. The value S(i) is
effectively a constant of integration and may be assigned arbitrarily. With its value fixed, the integral may
then be evaluated for each state f, thus assigning a unique value S(f) to each equilibrium state of the
system. S is thus defined uniquely for the system up to an additive constant.
Note that (1.10) defines S as a function of the equilibrium states of the system, and thus gives S as a
function of the state variables. So, though we use reversible processes to infer the value of S for each
equilibrium state, once S is known, its value is determined only by the given equilibrium state and does
not in any way depend on how the system arrived in that state. S is therefore a function of the equilibrium
state alone, and is not in any way a function of process.


With S determined, we may now use S to calculate the change in entropy ΔS = S(f) − S(i) for any
process Γ, quasistatic or non-quasistatic, in which the system begins in equilibrium state i and ends in
equilibrium state f. And furthermore, if the process Γ is quasistatic, be it reversible or irreversible, we
may also calculate the value of the integral ∫_Γ đQ/T for the process. How does the value of this integral
compare with the value of ΔS? To answer this, let R be any reversible process that takes the system from
the given initial state i to the given final state f. Then the combined process Γ + (−R), where −R is the
reverse process of R, is a quasistatic cyclic process for the system. By the Clausius Theorem (equation
(1.9)) we then have

0 ≥ ∮_{Γ+(−R)} đQ/T = ∫_Γ đQ/T + ∫_{−R} đQ/T = ∫_Γ đQ/T − ∫_R đQ/T = ∫_Γ đQ/T − ΔS    (1.11)

so that for any quasistatic process Γ whatever, we have

∫_Γ đQ/T ≤ ΔS    (1.12)

We have therefore shown that, for any quasistatic process of the system, the change in entropy of the
system is always greater than or equal to the integral ∫ đQ/T. The equality holds only for reversible
processes. For infinitesimal processes, the above result gives

dS ≥ đQ/T    (1.13)

or, since T > 0 always,

T dS ≥ đQ    (1.14)

with the equality holding only for reversible infinitesimal processes. We thus see that, in a general process,
we do not have đQ = T dS. Consequently also, in a general process, đW ≤ X_i dx_i, with the equality
holding only in a reversible process. For a simple hydrostatic system, this means đW ≤ P dV. These
results for quasistatic processes are summarised in the following table:
  Reversible                    Irreversible
  ΔS = ∫ đQ/T                   ΔS > ∫ đQ/T
  T dS = đQ                     T dS > đQ
  đW = P dV                     đW < P dV
  đW = X_i dx_i                 đW < X_i dx_i
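A standard example of these inequalities (added for illustration; the ideal gas entropy formula is
quoted, not derived here): in the free expansion of an ideal gas from volume V_i into an evacuated
vessel, so that the final volume is V_f > V_i, no heat flows and no work is done. Since S is a state
function, ΔS may be evaluated along any reversible path joining the same end states, and for an ideal
gas this gives

ΔS = N k_B ln(V_f / V_i) > 0 = ∫ đQ/T

consistent with the strict inequality in the irreversible column.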

1.2.2 Properties of the Entropy Function

For reversible and irreversible processes, we have

đQ = dE + đW.    (1.15)

For reversible processes, we also have

đQ = T dS and đW = X_i dx_i    (1.16)

Combining these relations, we get

T dS = dE − X_i dx_i    (1.17)

Though we arrived at this relation by considering reversible processes, the relation itself makes no
reference at all to any process. It contains only state variables, and infinitesimal differences (that is,
differentials) of state variables. It is therefore an equation among the state variables of the system that is
valid at all times, and not only while the system undergoes a reversible process. This equation is called the
fundamental equation for the system.
Equation (1.17) is a differential expression. It expresses the differential dS in terms of the differentials
dE and dx_i. Since S is a state function, dS is an exact differential. Equation (1.17) thus shows that the
primitive of the differential dS is a function of the variables E and x_i, and assumes also that all other
state variables, including T and the X_i, have all been expressed in terms of the independent variables
E, x₁, ..., x_a. The entropy of the system is therefore properly a function of the internal energy E of the
system, and of the system configuration variables x_i. Thus,

S = S(E, x₁, x₂, ..., x_a)    (1.18)

The following properties of S can be deduced:

1. S is continuous and differentiable. This fact follows from its defining equation, and is used implicitly
every time we use the differential dS.

2. S is an extensive variable. Thus, if a given system containing n moles of material in a given equilibrium
state has entropy S, then a system containing λn moles of the same material and in the same
equilibrium state will have entropy λS. For a given equilibrium state of a system, S is therefore
proportional to n, the total number of moles contained in the system.

3. The function S(E, x₁, ..., x_a) is homogeneous of degree 1. That is, for any λ ∈ R, we have

S(λE, λx₁, λx₂, ..., λx_a) = λ S(E, x₁, x₂, ..., x_a)

This property has important consequences that are often exploited (a symbolic check is sketched after
this list). The most important is contained in Euler's Theorem for homogeneous functions, which states
that

Euler's Theorem:
If f(x₁, ..., x_n) is homogeneous of degree k, that is, f has the property that for all λ ∈ R

f(λx₁, ..., λx_n) = λ^k f(x₁, ..., x_n)

then

x₁ (∂f/∂x₁)(x₁, ..., x_n) + ... + x_n (∂f/∂x_n)(x₁, ..., x_n) = k f(x₁, ..., x_n)

Using the summation convention, this result can be written more concisely in the form

x_i (∂f/∂x_i) = k f

Applied to the entropy, this theorem gives

E (∂S/∂E)_{x₁,...,x_a} + x_i (∂S/∂x_i)_{E,x₁,...,x̂_i,...,x_a} = S    (1.19)

where we have used the summation convention, and where the symbol x̂_i means "omit the variable x_i
from the list of variables."

4. S is a monotonically increasing function of E. That is,

(∂S/∂E)_{x₁,...,x_a} > 0

We shall see later that, physically, this means that the (absolute) temperature T of the system is
positive,

T > 0    (1.20)

This fact incorporates several results, including the Third Law of Thermodynamics which leads to the
conclusion that the absolute zero of temperature is an unattainable lower limit of temperature. We can
approach as close as we like to the absolute zero, but can never reach it.

5. S is additive over subsystems. Thus if a given system is a compound system consisting of separate
constituents α, each with its own variables and entropy function S^(α) of those variables, then the total
entropy of the composite system is

S = Σ_α S^(α)    (1.21)
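As a quick symbolic check of Euler's theorem and of relation (1.19) (an added sketch, not part of the
original notes; the test entropy is the standard monatomic ideal gas form, quoted without derivation),
the following Python fragment verifies both the degree-1 homogeneity and the Euler sum:

import sympy as sp

E, V, N, k, c = sp.symbols('E V N k c', positive=True)

# Test function: a Sackur-Tetrode-like entropy S(E, V, N); the constant c
# absorbs all the details that do not matter for the homogeneity check.
S = N * k * (sp.log(V / N) + sp.Rational(3, 2) * sp.log(E / N) + c)

# Euler's theorem for a degree-1 homogeneous function: E S_E + V S_V + N S_N = S
euler_sum = E * sp.diff(S, E) + V * sp.diff(S, V) + N * sp.diff(S, N)
print(sp.simplify(euler_sum - S))        # prints 0

# Direct check of homogeneity: S(aE, aV, aN) - a S(E, V, N) = 0
a = sp.symbols('a', positive=True)
scaled = S.subs({E: a * E, V: a * V, N: a * N}, simultaneous=True)
print(sp.simplify(scaled - a * S))       # prints 0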

References
Wikipedia, http://en.wikipedia.org/wiki/
Fermi, E., 1937, Thermodynamics, Dover Publications Inc., New York.
Sears, F. W., and Salinger, G. L., 1975, Thermodynamics, Kinetic Theory, and Statistical Thermodynamics, Addison-Wesley Publishing
Company, Reading, Massachusetts.
Zemansky, M. W., and Dittmann, R. H., 1997, Heat and Thermodynamics, Seventh Edition, McGraw-Hill, Boston,
Massachusetts.
Brief reviews of Classical Thermodynamics can be found in the following texts:
Callen, H. B., 1985, Thermodynamics and an Introduction to Thermostatistics, John Wiley and Sons, New York, p
5-26.
Reif, F., 1965, Fundamentals of Statistical and Thermal Physics, McGraw-Hill Book Company, New York, p
122-123.


1.3 Mini assignment 1


Various terms and concepts used in Statistical Mechanics and Thermodynamics

This assignment is intended to give you the opportunity to make sure that you are familiar with various
terms and concepts used in Statistical Mechanics and Thermodynamics. The list is not exhaustive and
we may add more terms and concepts as the course progresses. Write short explanatory definitions or
descriptions of the following terms and concepts.
1. Thermodynamics
2. Statistical Mechanics
3. Chemical Thermodynamics
4. Intensive variables and extensive variables.
5. Thermodynamic state
6. Reversible process
7. Irreversible process
8. Thermodynamic systems
9. Conjugate variables
10. Thermodynamic process:
(a) isobaric
(b) isochoric
(c) isothermal
(d) isentropic
(e) isenthalpic
(f) adiabatic
11. The laws of thermodynamics

Chapter 2
Fundamental Equation
2.1 Fundamental Equation of Thermodynamics
By combining the First and the Second Laws of thermodynamics, we arrive at the equation
T dS = dE − X_i dx_i.    (2.1)

This relation is the most important single equation in thermodynamics. The reason for its importance is
that it contains within it everything that can be known about a thermodynamic system. For ease of notation
in the rest of this section, we shall only consider simple closed hydrostatic systems where this expression
reduces to
T dS = dE + P dV.    (2.2)

(A closed hydrostatic system is a system of constant mass that exerts a uniform pressure on its
surroundings. Its equilibrium states can be described in terms of pressure, volume and temperature. The
equation of state takes the form F(P, V, T) = 0 which implies that there are only two independent variables.)
Remember, however, that in general the thermodynamic functions will depend on other variables such as
particle numbers as well.
The importance of equation (2.1) or (2.2) can be seen as follows. Express equation (2.2) in the form
dS = (1/T) dE + (P/T) dV    (2.3)

This is a differential expression in three variables: S, E and V . The fact that S, E and V are the variables
in this relation, and not any other combination of P , T , S, E and V , is clear from the fact that each of them
appears in the expression as a differential. In contrast, neither dP nor dT appear in (2.2) or (2.3).
In (2.3), dS has been expressed in terms of dE and dV . dE and dV therefore determine dS. This
means, by virtue of the way in which we have written (2.3), that we have chosen to regard E and V as the
independent variables, with S dependent. Implicitly therefore, S has been expressed as a function of E and
V.
Since E and V are the independent variables in (2.3), with S a function of E and V , the right hand side
must be a differential expression in E and V . The coefficients of dE and dV are thus functions of E and
V . More explicitly, (2.3) assumes that both 1/T and P/T , and therefore also T and P , are functions of E
and V . In other words, the right hand side of (2.3) is a differential expression of the type
f(E, V) dE + g(E, V) dV    (2.4)

where f (E, V ) = 1/T and g(E, V ) = P/T respectively.


By the Clausius Theorem, the differential dS is exact. Equation (2.3) must therefore arise by
differentiation of a function of the form
S = S(E, V)    (2.5)

Differentiating (2.5) we get


dS = (∂S/∂E)_V dE + (∂S/∂V)_E dV    (2.6)

Comparing with (2.3) we get


1
=
T

S
E

(2.7)
V


and
P/T = (∂S/∂V)_E    (2.8)

Equations (2.7) and (2.8) are essentially the heat equation (the energy as a function of temperature
and volume) and the mechanical equation of state (the pressure as a function of the configuration
variables excluding energy) for the system, albeit in a form different from the usual, and requiring a little
manipulation to reduce them to something more familiar.
Consider first equation (2.7). Since S is a function of E and V, so is the partial derivative (∂S/∂E)_V.
Equation (2.7) is therefore an equation relating E, T and V . It is therefore an implicit equation for E in
terms of V and T , and can be made explicitly into such an equation by solving it for E in terms of T and
V . This equation must therefore be the heat equation for the system. For, if it were not, then it, together
with the heat equation, would provide two relations between E, T and V . This would mean that there
is only one independent variable among E, V and T . This is a contradiction, since this system has two
independent variables and not one alone. So the heat equation cannot be different from equation (2.7).
Thus (2.7) is the heat equation for the system in implicit form.
Now consider equation (2.8). This is a relation between the variables P , T , E and V . We can use
equation (2.7) to eliminate E, leaving an equation between P , T and V . This relation must be the equation
of state for the system. For if it were not, then it, together with the equation of state for the system, would
provide two relations between the three variables P , T and V . This would mean that there is only one
independent variable among the three, which contradicts the fact that there are two independent variables
among them. The equation obtained by eliminating E from (2.8) must therefore be the equation of state
for the system.
This is a remarkable result. All thermodynamic information of a simple hydrostatic system is contained
in its two equations of state, the mechanical equation and the heat equation. From these, every possible
macroscopic property of the system, and its behaviour in every conceivable quasistatic non-dissipative
process, can be calculated. Since both of these equations can be calculated from equation (2.5), all
thermodynamic information of the system is contained in this single equation. To fully specify the
thermodynamics of a hydrostatic system, therefore, all we need do is determine the single equation

S = S(E, V)
From it, we can deduce, by simple differentiation and elementary manipulations, both of the equations of
state for the system, and from these equations we can then deduce by differentiation, CP , the heat capacity
at constant pressure, CV the heat capacity as constant volume, and every other system parameter of
interest. To know (2.5) therefore, is to know everything that there is to know about the system.
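To make the preceding argument concrete, here is a worked example (added for illustration; the
entropy function is the standard monatomic ideal gas result, quoted without derivation). Take

S(E, V) = N k_B [ln(V/N) + (3/2) ln(E/N) + s₀]

with s₀ a constant. Then (2.7) and (2.8) give

1/T = (∂S/∂E)_V = 3N k_B/(2E),    that is,    E = (3/2) N k_B T    (the heat equation)

P/T = (∂S/∂V)_E = N k_B/V,    that is,    P V = N k_B T    (the mechanical equation of state)

so the single function S(E, V) indeed yields both equations of state by differentiation alone.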
This makes equation (2.5) the single most important relation in all of thermodynamics. To emphasise its
importance, it is called the Fundamental Relation of Thermodynamics.
We shall see in later sections that the content of the fundamental relation can be expressed in a variety
of different, equivalent, ways. To distinguish the particular form (2.5) from its equivalent representations,
we shall refer to the fundamental equation as expressed in (2.5) as the fundamental equation in entropy
representation. This name reflects the fact that (2.5) expresses the entropy S of the system as a function of
E and V .
The entropy representation of the fundamental equation, like all its other representations that we shall
encounter, is deduced from the differential relation (2.2). We shall therefore refer to equation (2.2) as the
fundamental relation in differential form.
Note that the argument presented above explicitly uses the fact that S has been expressed in terms
of E and V . We may in fact express S, like any other state variable, in terms of any convenient set
of independent variables. However, if we do so, we lose some of the information that is contained in
equation (2.5). The argument presented above explicitly requires that S be expressed in terms of E and V ,
and not in terms of any other set of independent variables. In this sense, E and V are the natural variables
for the function S. If we choose to express S in any other form, we forfeit one or other of the equations of
state and so arrive only at partial information about the system. S, expressed in terms of any other pair of
variables is therefore not a fundamental relation for the system.


2.2 Alternative Forms of the Fundamental Relation


To arrive at relation (2.5), we isolated dS in (2.2) and expressed it in terms of dE and dV , so obtaining
equation (2.3). This was rather arbitrary. We could equally well have isolated dE, expressing it in terms of
dS and dV , to obtain
dE = T dS − P dV    (2.9)

or dV , expressing it in terms of dS and dE, to obtain


dV = (T/P) dS − (1/P) dE    (2.10)

Then by an argument analogous to the one used above, we could have shown that complete information
about the thermodynamic system is contained in the functions E = E(S, V ) or V = V (S, E) respectively.
Each of the functions E = E(S, V ) or V = V (S, E) is therefore another fundamental relation for the
system, called respectively the energy representation and the volume representation of the fundamental
relation. The details are as follows.

Energy Representation:

Write equation (2.2) in the form


dE = T dS − P dV    (2.11)

This presupposes that E is given in the form


E = E(S, V)    (2.12)

Differentiating (2.12), we get


dE = (∂E/∂S)_V dS + (∂E/∂V)_S dV    (2.13)

Comparing with (2.11) gives


T = (∂E/∂S)_V    (2.14)

and
P = −(∂E/∂V)_S    (2.15)

Equations (2.14) and (2.15) are then, up to some elementary manipulations, the equations of state for the
system. This is seen as follows.
Equation (2.14) gives T as a function of S and V . Equation (2.15) gives P as a function of S and V . If
we eliminate S from these two equations, we get a relation between P , V and T . By the same reasoning
as above, this must be the mechanical equation of state. Also, if we eliminate S from (2.12), using either
(2.14) or (2.15), we get E as a function either of T and V , or of P and V . Either way, we get the heat
equation in one of its familiar forms.

Volume Representation:

Write equation (2.2) in the form


dV = (T/P) dS − (1/P) dE    (2.16)

This presupposes that V is given in the form


V = V(S, E)    (2.17)


Differentiating (2.17), we get


dV = (∂V/∂S)_E dS + (∂V/∂E)_S dE    (2.18)

Comparing with (2.16) gives


T/P = (∂V/∂S)_E    (2.19)

and

1/P = −(∂V/∂E)_S    (2.20)

Equation (2.20) gives P as a function of S and E. Equation (2.19) gives T /P as a function of S and
E. And since we already know P as a function of S and E, we get also T as a function of S and E.
Eliminating S from P = P (S, E) and T = T (S, E), we get a relation between P , T and E, which is the
heat equation. Further, eliminating S and E from (2.17) using (2.19) and (2.20), we get V as a function of
T and P , which gives the mechanical equation of state.

2.3 Thermodynamic Potentials


It is clear from the previous section that the fundamental relation can be expressed in many forms.
In each of its forms, some thermodynamic variable of the system is expressed as a function of two
other, uniquely determined, state variables. That function is called a thermodynamic potential. The
reason for this name appears to be that all thermodynamic quantities can be calculated from the chosen
thermodynamic potential by simple differentiation, in a manner analogous to the way that forces in
mechanics are calculated from force potentials. The analogy between thermodynamics and mechanics in
fact is very close. The thermodynamic potentials play a role in thermodynamics entirely analogous to that
played by the Lagrangian function in Lagrangian mechanics and the Hamiltonian function in Hamiltonian
mechanics. All the system information of a mechanical system is contained implicitly in the Lagrangian
or the Hamiltonian, and is extracted from these by differentiations. So also all the system information of a
thermodynamic system is contained implicitly in the thermodynamic potentials and is extracted from them
by differentiations.
Note that each representation of the fundamental relation gives rise to a corresponding thermodynamic
potential. Any one of these potentials is sufficient for a complete description of the system. In fact, all of
the potentials are equivalent in information content.
The fundamental relation can be represented in a variety of ways. However, not all its representations
are equally useful. For example, of the representations considered above, the energy and entropy
representations are often used, but the volume representation is not.
The usefulness of a particular representation is determined by what variables are controlled in a
given experimental situation. Generally, we obtain the simplest description of a system if we choose as
independent those variables that are controlled in the experiment, and express all others in terms of them.
When altering the representation of the fundamental equation to suit a particular context, we must be
careful not to lose information. Incorrect manipulation leads to loss of information, and yields expressions
which are not representations of the fundamental relation. The method to be followed when changing
representation is illustrated in the following important examples.

2.3.1 In terms of S and P: Enthalpy

Suppose we wish to write the fundamental equation in a way that regards S and P as the independent
variables. The fundamental equation in differential form is
T dS = dE + P dV    (2.21)

This equation is a differential relation among the variables S, E and V. P is not among them. To make


P appear among the variables, we must manipulate (2.21) in such a way as to generate a term containing
dP. This is done by using the product rule of differentiation to rewrite the term P dV in the form

P dV = d(P V) − V dP    (2.22)

Substituting into (2.21) we get

T dS = dE + d(P V) − V dP = d(E + P V) − V dP    (2.23)

Here we have combined the differentials dE and d(P V ) into a single differential d(E + P V ). The reason
for this step is that our system has only two independent variables. Any other is expressible in terms of the
two independent ones alone. Any single differential relation must therefore be reducible to an expression
containing three variables only, with two regarded as independent, and the third a function of the two. The
obvious differentials to combine in (2.23) are the ones without coefficients. Combining them yields a new
state variable. In this case, the new variable is
H = E + P V,    (2.24)

which is the enthalpy of the system. Thus (2.21) becomes


T dS = dH − V dP    (2.25)

This is a new differential representation of (2.21). It is equivalent to it, since we can pass from (2.21) to (2.25)
and back again.
It is interesting that the enthalpy appears in a natural way in these manipulations. It shows that, expressed
in terms of the correct variables, enthalpy gives a complete representation of the thermodynamic system.
We may now choose which of the three variables P , S and H we wish to regard as independent. A
particularly useful representation is obtained by expressing H in terms of P and S. In terms of (2.25), this
means writing
dH = T dS + V dP    (2.26)

Equation (2.26) assumes implicitly that H is given as a function


H = H(S, P)    (2.27)

so that
dH = (∂H/∂S)_P dS + (∂H/∂P)_S dP    (2.28)

Comparing with (2.26), we get


T = (∂H/∂S)_P    (2.29)

and

V = (∂H/∂P)_S    (2.30)

Equations (2.29) and (2.30) are essentially the equations of state for the system. Eliminating S from
(2.29) and (2.30) gives the mechanical equation of state. The heat equation is then obtained by noting that
H = E + P V , so that
E = H(S, P) − P V(S, P) = H(S, P) − P (∂H/∂P)_S

which gives E as a function of S and P. This is the heat equation, expressed in terms of S and P. If we
want it in terms of T and P, we can eliminate S using (2.29). If we want it in terms of V and P, we can
eliminate S using (2.30). Equation (2.27) therefore contains full information about the thermodynamic
system, so it is a representation of the fundamental equation. It is called the fundamental
equation in enthalpy representation.
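A brief illustration (added here; it uses the monatomic ideal gas results E = (3/2) N k_B T and
P V = N k_B T quoted earlier): for such a gas

H = E + P V = (5/2) N k_B T,    so that    C_P = (∂H/∂T)_P = (5/2) N k_B

This example also shows why H is convenient for constant-pressure processes: at fixed P, equation
(2.25) reduces to dH = T dS, the heat absorbed in a reversible constant-pressure process.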


The representations obtained by writing S = S(H, P ) and P = P (S, H) are hardly ever used and so
have no name.
As an exercise work out the details of these representations.

2.3.2 In terms of T and V: Helmholtz Free Energy

Suppose we wish to write the fundamental relation in terms of T and V . In differential form, the
fundamental relation is
T dS = dE + P dV    (2.31)

V appears as one of the variables in this equation. However, T does not. To make it appear as a variable,
we must manipulate (2.31) in such a way as to generate a term containing dT . This is done by rewriting
the term T dS, using the product rule of differentiation, in the form
T dS = d(T S) − S dT
Substituting into (2.31) we get
d(T S) − S dT = dE + P dV    (2.32)

Since we wish to regard T and V as independent, we need to express everything in terms of dT and dV .
This gives,
d(E − T S) = −S dT − P dV    (2.33)

The function F = E − T S is a new state function that we have not previously encountered. It is called the
Helmholtz Free Energy. In terms of it, (2.33) becomes

dF = −S dT − P dV    (2.34)

Since F = E − T S is a state function, dF is exact, and so (2.34) is the differential form of

F = F(T, V)    (2.35)

Differentiating (2.35), we get


dF = (∂F/∂T)_V dT + (∂F/∂V)_T dV    (2.36)

and comparing with (2.34), we get

S = −(∂F/∂T)_V    (2.37)

and

P = −(∂F/∂V)_T    (2.38)

Equations (2.37) and (2.38) are essentially the equations of state for the system. In fact, equation (2.38) is
a relation between P, V and T and so is directly the mechanical equation of state. The heat equation is
obtained by noting that F = E − T S, so that

E = F(T, V) + T S(T, V) = F(T, V) − T (∂F/∂T)_V    (2.39)

which is the heat equation in terms of T and V . Equation (2.35) thus contains all thermodynamic
information and so is a representation of the fundamental equation. It is called the Helmholtz
Representation.
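For illustration (an added example, consistent with the monatomic ideal gas used earlier): the
Helmholtz free energy of such a gas has the form

F(T, V) = −N k_B T [ln(V/N) + (3/2) ln T + f₀]

with f₀ a constant. Equations (2.37)-(2.39) then give

P = −(∂F/∂V)_T = N k_B T/V,    S = −(∂F/∂T)_V = N k_B [ln(V/N) + (3/2) ln T + f₀ + 3/2]

and E = F + T S = (3/2) N k_B T, recovering both equations of state of the gas.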

2.3.3 In terms of T and P: Gibbs Free Energy

This is the last representation that we consider. There are still others that are useful, but the above are
sufficient to illustrate the method.
We wish to write the fundamental equation in terms of T and P . The fundamental equation in differential
form is
T dS = dE + P dV    (2.40)

To make both T and P appear in this equation as variables, we need to alter two terms by means of the
product rule. By the same method as described in the previous examples, we get
d(T S) − S dT = dE + d(P V) − V dP

or, collecting together the differentials without coefficients,

d(E + P V − T S) = −S dT + V dP    (2.41)

The function G = E + P V − T S is another new function, not previously encountered. It is called the
Gibbs Free Energy. In terms of it, (2.41) becomes

dG = −S dT + V dP    (2.42)

which is the differential form of

G = G(T, P)    (2.43)

Differentiating,
dG = (∂G/∂T)_P dT + (∂G/∂P)_T dP    (2.44)

and comparing with (2.42), we get


S = −(∂G/∂T)_P    (2.45)

and

V = (∂G/∂P)_T    (2.46)

Equation (2.46) is directly the mechanical equation of state. The heat equation is obtained by noting that
G = E + P V − T S, so that

E = G(T, P) + T S(T, P) − P V(T, P) = G(T, P) − T (∂G/∂T)_P − P (∂G/∂P)_T    (2.47)

which is the heat equation in terms of T and P. Equation (2.43) thus contains all thermodynamic
information and so is a representation of the fundamental equation. It is called the Gibbs Representation.

2.4 The Maxwell Relations


Maxwell's relations are a set of equations in thermodynamics which are derivable from the definitions
of the thermodynamic potentials. The Maxwell relations are statements of equality among the second
derivatives of the thermodynamic potentials. They follow directly from the fact that the order of
differentiation of an analytic function of two variables is irrelevant. If Φ is a thermodynamic potential and
x and y are two independent variables for that potential, then the Maxwell relation for that potential and
those variables is:

∂²Φ/∂x∂y = ∂²Φ/∂y∂x


Of the representations of the fundamental relation considered above, four are particularly important.
They are the energy, enthalpy, Helmholtz and Gibbs representations, given respectively in differential form
by
dE = T dS − P dV
dH = T dS + V dP
dF = −S dT − P dV
dG = −S dT + V dP    (2.48)

Since each of the differentials on the left hand side of these relations is exact, the differential expressions
on the right hand side must each be exact. The equations (2.48) are therefore of the form

dΦ = (∂Φ/∂x)_y dx + (∂Φ/∂y)_x dy.    (2.49)

For continuous functions with continuous second derivatives, mixed partial derivatives are equal:

∂/∂y (∂Φ/∂x) = ∂/∂x (∂Φ/∂y),    that is,    ∂²Φ/∂x∂y = ∂²Φ/∂y∂x    (2.50)

Applying the exactness (or integrability) condition to each of the equations (2.48), we obtain a set of four
useful relations called the Maxwell equations:

(∂T/∂V)_S = −(∂P/∂S)_V
(∂T/∂P)_S = (∂V/∂S)_P
(∂S/∂V)_T = (∂P/∂T)_V
(∂S/∂P)_T = −(∂V/∂T)_P    (2.51)

For example, applying this to the energy representation E = E(S, V), where T = (∂E/∂S)_V and
P = −(∂E/∂V)_S, gives

(∂T/∂V)_S = ∂²E/∂V∂S = ∂²E/∂S∂V = −(∂P/∂S)_V    (2.52)
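As a small added check (not part of the original notes), the Maxwell relation (∂S/∂V)_T = (∂P/∂T)_V
arising from dF can be verified symbolically in Python for the monatomic ideal gas free energy used
in the example of Section 2.3.2:

import sympy as sp

T, V, N, k, f0 = sp.symbols('T V N k f0', positive=True)

# Monatomic ideal gas Helmholtz free energy (form assumed in the example above)
F = -N * k * T * (sp.log(V / N) + sp.Rational(3, 2) * sp.log(T) + f0)

S = -sp.diff(F, T)    # equation (2.37)
P = -sp.diff(F, V)    # equation (2.38)

# Maxwell relation from dF = -S dT - P dV: (dS/dV)_T - (dP/dT)_V = 0
print(sp.simplify(sp.diff(S, V) - sp.diff(P, T)))    # prints 0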

References
Callen, H. B., 1985, Thermodynamics and an Introduction to Thermostatistics, John Wiley and Sons, New York,
p 27-33.
Carrington, G., 1994, Basic Thermodynamics, Oxford University Press, Ch 9 & 10.

Exercises
Lower case symbols are used for "specific" quantities, that is for thermodynamic quantities per mole of
substance.
1. Show that if F is known as a function of V and T, then

   H = F − T (∂F/∂T)_V − V (∂F/∂V)_T

and

   G = F − V (∂F/∂V)_T

2. The specific Gibbs function of a gas is given by

   g = G/N = RT ln(P/P₀) + AP

where A is a function of T only. (a) Derive expressions for the equation of state of the gas and its
specific entropy. (b) Derive expressions for the other thermodynamic potentials. (c) Derive expressions
for c_P and c_V. (d) Derive expressions for the isothermal compressibility (κ = −(1/V)(∂V/∂P)_T)
and the expansivity (β = (1/V)(∂V/∂T)_P). (e) Derive an expression for the Joule-Thomson
coefficient, which is defined by μ = (∂T/∂P)_h.
[Hint: show that μ = (T (∂v/∂T)_P − v)/c_P, where c_P = T (∂s/∂T)_P.]

3. The specific Gibbs function of a gas is given by

   g = RT ln(v/v₀) + Bv

where B is a function of T only. (a) Show explicitly that this form of the Gibbs function does not
completely specify the properties of the gas. (b) What further information is necessary so that the
properties of the gas can be completely specified?
Solution: g = g(T, v)
4. Define a property of a system represented by Φ which is given by the equation

   Φ = S − (E + P V)/T

Show that

   V = −T (∂Φ/∂P)_T,    E = T² (∂Φ/∂T)_P + P T (∂Φ/∂P)_T

and

   S = Φ + T (∂Φ/∂T)_P

5. The fundamental equation of a certain system is given by

   E = (v₀θ/R²) S³/(N V)

where v₀, θ and R are constants. (a) In terms of what representation is this fundamental relation
expressed? Explain. (b) Find the equation of state for the system, and the heat equation.
6. A particular system obeys the relation

   u = E/N = A v⁻² e^{s/R}

N moles of this substance, initially at a temperature T₀ and pressure P₀, are expanded isentropically
until the pressure is halved. Find the final temperature of the system. (Answer: T_f = 0.63 T₀)
7. A simple hydrostatic system is such that P V^k is constant in a reversible adiabatic process, where
k > 0 is a given constant. Show that its internal energy has the form

   E = (1/(k − 1)) P V + N f(P V^k / N^k)

where f is an arbitrary function.

Hint: P V^k must be a function of S (why?) so that (∂E/∂V)_S = −g(S, N) V^{−k}, where g(S, N) is an
arbitrary function.



Chapter 3
Models of Thermodynamic Systems
Any system containing more than two particles in interaction may be regarded as a thermodynamic system.
Thus a complex atom, or a large nucleus, or a spray of elementary particles in a collider may be treated
by the methods of thermodynamics. More typically, however, thermodynamic systems are macroscopic.
These consist of vast numbers of interacting particles, with even the smallest containing well in excess of
10¹⁵. Chemists, solid and condensed state physicists, and material scientists regularly deal with systems
containing anywhere between 10²⁴ and 10³⁰ particles, and astrophysicists with ones that exceed these
figures by many orders of magnitude.
Modelling systems of such complexity requires us to be judiciously selective. The choice of basic unit
for the model depends on the type of system considered, and on which of its properties are of interest.
Particle physicists are interested in properties that depend on the fundamental interactions of elementary
particles. The basic entities in their models are thus quarks and gluons. Nuclear physicists will build their
models from protons, neutrons and electrons. Chemists, solid state physicists and material scientists use
atoms, molecules and macromolecules as basic building blocks, while cosmologists build their models
using galaxies as their "particles". But however the model is constructed, all face the same difficulty in the
end: how to keep track of so many basic entities. Complete modelling of a thermodynamic system would
require us to keep track of each and every one of its constituents. This is clearly impossible. Tracking in
detail for only a few minutes the particles even of a system as small as a single oxygen atom would require
so many calculations that it is estimated that the most powerful computers available today would take
longer than the age of the universe to perform the calculations.
If we are to make any headway in developing tractable models of thermodynamic systems, we need to
adopt a different procedure. The answer is in statistics. Large assemblies of items display predictable
regularities and trends when averaged. The detailed behaviour under the same conditions of two identical
systems can differ substantially. But, on average, their behaviour and properties are the same. This
phenomenon is called statistical regularity. The way to deal with huge systems of particles, then, is not by
detailed modelling, but by developing methods for predicting their average behaviour, the likelihood of
observing departures from their predicted average behaviour, and the expected size of these departures.
The set of techniques by which this is done is called Statistical Physics because each particle is assumed to
obey the laws of Physics, or Statistical Thermodynamics, because the systems modelled are macroscopic
in size and are thus thermodynamic systems.
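A toy numerical illustration of statistical regularity (an added sketch, not part of the original notes):
the average of a fluctuating quantity becomes sharper as the number of samples grows, with relative
fluctuations shrinking roughly like 1/√n.

import random

random.seed(1)

# Sample mean of v**2 for n independent Gaussian "velocity components":
# the estimate settles towards its true value (here 1.0) as n grows.
for n in (10, 1000, 100000):
    samples = [random.gauss(0.0, 1.0) ** 2 for _ in range(n)]
    print(n, sum(samples) / n)

For macroscopic systems, where n is of order 10²⁴, averages of this kind are for all practical purposes
perfectly sharp.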
Strictly speaking, the behaviour of the constituent particles is governed by the laws of quantum
mechanics. Classical mechanics is now known to be incorrect in the sense that it does not describe
the behaviour of systems on the atomic scale. It provides a tolerably accurate account of the general
mechanical behaviour of macroscopic systems, but does not adequately describe that of subatomic
particles. Any viable statistical approach will need to describe the behaviour of large numbers of quantum
particles and must therefore be based on quantum mechanics. In certain limits, however, such as that of
low density and high temperature for example, quantum properties are not important. In these cases, the
behaviour of the constituent particles is adequately described by the laws of classical mechanics. Quantum
mechanics is significantly more difficult to deal with than classical mechanics, both conceptually and
mathematically. It is therefore advantageous in these limiting cases to use classical mechanics to model the
system. Ironically, classical statistical models are much more difficult to implement than quantum models.
This fact substantially compromises the usefulness of classical statistical models and makes them less
attractive than might otherwise have been expected.
In this chapter, we look at some general features of both quantum and classical models of large numbers
of particles. The details of these models are not always easy to grasp at first reading, or to implement. The
best way to understand them is thus, not by protracted discussion of the general theory, but by working
through sufficient examples that illustrate it. So, if at first you fail to understand the principles outlined in
this chapter, don't give up. Rather press on to suitably chosen illustrative examples and return to the theory
described here after you have seen it applied in particular case examples.


3.1 Quantum Models


3.1.1 Stationary States

A quantum system of f degrees of freedom is described by a wavefunction Ψ(q₁, q₂, ..., q_f, t) of the f
coordinates q₁, q₂, ..., q_f of the system, and of the time t. This wave function satisfies the time dependent
Schrödinger equation

iħ ∂Ψ/∂t = Ĥ Ψ    (3.1)

where Ĥ is the Hamiltonian operator for the system.
The first law of thermodynamics asserts that a thermodynamic system in equilibrium has a well defined
energy. We are therefore interested in the stationary state solutions of equation (3.1), since these are the
states of fixed energy for the quantum system. Denote the energy of the quantum system by E. Then the
stationary state with energy E has wavefunction
Ψ(q₁, q₂, ..., q_f, t) = ψ(q₁, q₂, ..., q_f) e^{−iEt/ħ}    (3.2)

where ψ(q₁, q₂, ..., q_f) is a function of the coordinates q₁, q₂, ..., q_f alone, and satisfies the time
independent Schrödinger equation

Ĥ ψ = E ψ    (3.3)

Equation (3.3) is not by itself sufficient to determine the solution completely. It needs to be supplemented
by boundary conditions. These distinguish the physically acceptable solutions from those which are
not. Once the boundary conditions are specified, equation (3.3) generally does not admit solutions for
arbitrarily chosen values of E, but only for certain well defined values called the characteristic energies,
or eigenenergies, or energy eigenvalues of the system. These values are the only energies at which the quantum system can be found. It will never be found with any other value of the energy.
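The standard one-dimensional illustration (added here as a reminder; the three-dimensional version is worked in Exercise 1 at the end of this chapter) is a particle of mass m in an infinite square well of width L. The boundary conditions \psi(0) = \psi(L) = 0 admit solutions of

    -\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} = E\psi

only at the discrete eigenenergies

    E_n = \frac{n^2 \pi^2 \hbar^2}{2mL^2} = \frac{n^2 h^2}{8mL^2}, \qquad n = 1, 2, 3, ...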
Equation (3.3) can sometimes (approximately) be reduced by suitable choice of coordinates q1 , q2 , ..., qf
to a set of f simultaneous ordinary differential equations, one for each coordinate. One technique by which
this reduction is effected is the method of separation of variables, which is valid if the particle interactions
are negligible. Each of the resulting ordinary differential equations then contains a constant of separation,
and allows solutions for each choice of value for this constant. However, the boundary conditions for
the original problem induce boundary conditions on each of these equations, whose net effect is that the individual equation for each coordinate allows solutions not for all possible values of the constant of separation, but only at well defined values of it. We thus obtain a restricted family of physically acceptable
solutions, parametrised by a single variable, usually taking only discrete values (normally integers), called
the quantum number for that equation. The value of the separation constant thus depends on the quantum
number of the solution. There is therefore exactly one quantum number associated with each degree of
freedom of the quantum system.
The solutions admitted by (3.3) thus occur in general only for a restricted set of values of E. In the idealised case, where interactions between particles are neglected, E is a sum of the separation constants. Each solution is then labelled by a set of f quantum numbers n_1, n_2, ..., n_f. We show this explicitly by denoting the solutions as \psi_{n_1,n_2,...,n_f}. Each solution \psi_{n_1,n_2,...,n_f} of (3.3) occurs at a well defined energy, which we denote by E_{n_1,n_2,...,n_f}. This energy is uniquely determined by the quantum numbers n_1, ..., n_f. Since each solution \psi_{n_1,n_2,...,n_f} defines a unique stationary state (3.2) of the quantum system,

    \Psi_{n_1,...,n_f}(q_1, ..., q_f, t) = \psi_{n_1,...,n_f}(q_1, ..., q_f)\, e^{-iE_{n_1,...,n_f} t/\hbar}                (3.4)

the quantum numbers n_1, n_2, ..., n_f uniquely specify a stationary state of the quantum system and its energy. The quantum numbers n_1, n_2, ..., n_f are thus in one to one correspondence with the quantum states. Each set n_1, n_2, ..., n_f defines exactly one quantum state, and conversely.
It is important to note that while each given set of quantum numbers n_1, n_2, ..., n_f uniquely defines one stationary state of the system, and therefore also its energy E_{n_1,n_2,...,n_f}, a given admissible value of the energy E does not in general define uniquely a corresponding stationary state. More often than not, there are many stationary states with that same energy. If so, we say that the energy level with value E is degenerate, and we denote the number of stationary states which occur at energy E by g_E. The number g_E is called the degeneracy of the energy level with energy E. In the special case when g_E = 1, we say that the energy level at energy E is non degenerate.
Note the difference between "energy levels" and "quantum states". The quantum states are the actual
stationary states of the system. The energy levels are the values of the energy E that are allowed for the
system. These are not the same concept. Each state has a definite energy. But there may exist more than
one state with that energy. So in general there are more quantum states than there are energy levels. In
systems with 1 degree of freedom, there is no degeneracy and therefore there are as many energy levels as
there are quantum states, but this is an exception and not the rule. Systems with more than 1 degree of
freedom almost always display degeneracy, and the larger the number of degrees of freedom of the system,
the larger the degree of degeneracy.
It is a general feature of quantum systems that the degree of degeneracy of the energy levels increases
dramatically as the number of degrees of freedom of the system increases. When the number of degrees of
freedom is huge, as it is in the case of thermodynamic systems, the degree of degeneracy of the levels can
be astronomical. This feature is exploited in statistical physics.
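To see this growth concretely, the short sketch below (our own illustration, not part of the original notes) tallies quantum states and distinct energy levels for a single particle in a cubic box, for which the energy is proportional to n_x^2 + n_y^2 + n_z^2 (see Exercise 1 at the end of this chapter). Even for this small system, states vastly outnumber levels, and adding degrees of freedom widens the gap further.

    # Tally quantum states vs distinct energy levels for a particle in a
    # cubic box, where E is proportional to nx^2 + ny^2 + nz^2.
    from collections import Counter

    levels = Counter()
    for nx in range(1, 21):
        for ny in range(1, 21):
            for nz in range(1, 21):
                levels[nx**2 + ny**2 + nz**2] += 1

    print("states:", sum(levels.values()))   # 8000 quantum states ...
    print("levels:", len(levels))            # ... but far fewer energy levels
    print("g_E for E = 6:", levels[6])       # (1,1,2), (1,2,1), (2,1,1) -> 3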

SUMMARY:

- The stationary states of a quantum system with f degrees of freedom are specified by a set of f quantum numbers n_1, n_2, ..., n_f.

- Each stationary state has a unique energy E_{n_1,n_2,...,n_f}.

- Any given allowed energy value E may correspond to many quantum states; that is, there may be many sets of values n_1, n_2, ..., n_f which give the same value E of the energy.

- The number of states with given energy E is called the degeneracy of the energy level E, and is denoted by g_E.

3.1.2

External Parameters

The characteristic energies E of the stationary states of a quantum system are determined by the Hamiltonian operator \hat{H} in equation (3.3). For a system of particles, \hat{H} is the sum of kinetic and potential energy operators for the system,

    \hat{H} = \hat{T} + \hat{V}                (3.5)

The kinetic energy contains information about the masses of the particles. The potential, on the other
hand, contains two types of information: information about how the particles interact with each other, and
information about the environment. For example,

    V = \sum_{i<j} u(\vec{r}_i, \vec{r}_j) + \sum_{i=1}^{N} v_{external}(\vec{r}_i)

where u(\vec{r}_i, \vec{r}_j) represents the interaction potential between two particles, and v_{external}(\vec{r}_i) the interaction of a particle with the environment. The environmental information is contained in the form of parameters, called external parameters, that specify the strength of interaction between system and environment, the dimensions (such as volume, or length, breadth and height) of the potential that confines the system to a specific region of space, the magnitude and direction of the applied electric and magnetic fields, and so on. Denote these external parameters by \lambda^1, \lambda^2, ..., \lambda^r, or, more briefly, by \lambda^i, so that v_{external} = v_{external}(\vec{r}_i, \lambda^j).
When solving equation (3.3), we assume that the external parameters each have a given, fixed value. This means that the solutions obtained, both for \psi and for E, are implicitly functions of these fixed values. Change the values of the \lambda^i, and the solutions and their eigenvalues must also change.


By their nature, the external parameters are continuous variables. Small changes in their values are not expected to produce catastrophic changes in the state of the system. We thus expect small changes in the \lambda^i to produce only small changes in the eigenfunctions \psi_{n_1,n_2,...,n_f} of the system and in their corresponding energies E_{n_1,n_2,...,n_f}. Put differently, we expect both \psi_{n_1,n_2,...,n_f} and E_{n_1,n_2,...,n_f} to be continuous functions of the parameters \lambda^i.

3.1.3 Particle Number

Another important parameter in thermodynamics is the number N of particles in the system. In a quantum model, N enters explicitly into both \hat{T} and \hat{V} via the number of terms in the summations that make up each of these operators. It thus enters explicitly also into the operator \hat{H}. It also enters into the wavefunctions \psi via the number of variables q^i on which they depend.

    \hat{H} = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m_i} \nabla_i^2 \right) + \sum_{i<j} u(\vec{r}_i, \vec{r}_j) + \sum_{i=1}^{N} v_{external}(\vec{r}_i, \lambda^j)

The principal difference between particle number and the external parameters is that N is not a
continuous variable, but discrete. A change in N produces a discontinuous change of the system. This is
reflected in the fact that if N is changed, the number f of degrees of freedom of the system changes. This
means that a different Schrödinger equation needs to be solved and not, as was the case with changes in the
values of the external parameters, the same equation but with different constants. The system stationary
states and their corresponding energy eigenvalues will thus be labelled by a different number of quantum
numbers n1 , n2 , ..., nf .
It often happens, however, that the changes introduced by changing the value of N are not as dramatic as might have been expected, and that the energy eigenvalues can be expressed as a function of N. Further, when N is large, even quite large changes \Delta N in particle number constitute only a very small relative change in N, that is, \Delta N/N \ll 1, and we may use a series expansion in the variable \Delta N/N to good effect. We shall therefore treat N as if it were a continuous parameter for the system. This puts N effectively on the same footing as the external parameters \lambda^i. We thus often regard E as a continuous function of \lambda^1, \lambda^2, ..., \lambda^r, and N.

3.1.4 Interaction

There are essentially two ways to change the energy of any given quantum system. In the first, all external
parameters are held fixed. The energy levels of the system therefore do not change. We make the system
change its state by supplying it with a quantum of energy of the right magnitude, or by removing one. The
system will then make a transition from an initial quantum state to another at a different energy. In the
second, we prevent the system from absorbing or emitting quanta. It thus cannot make any transitions to
states of different energy. However, if we change its external parameters very slowly, we gradually change
the values of the eigenenergies of the system for each given state (n1 , ..., nf ) without altering the quantum
state in which it is found. In this way, we transfer energy into or out of the system while the system remains
in a fixed quantum state. (Remember that the quantum state is defined by its quantum numbers. These are
assumed to remain unchanged as the external parameters are changed.)
Physically, the two types of interaction are interpreted as follows. The external parameters of a system
include things like the size of the system (that is, its length, breadth and height, or its volume), the applied
electric and magnetic fields, and so on. These parameters are analogous to, and in fact are closely related
to, the configuration variables of thermodynamic systems. We will see later that the external parameters do
not always coincide with the thermodynamic configuration parameters. But often they do. And when they
do not coincide, they are nonetheless closely related to them. In the first type of interaction, these external
parameters are held constant. This type of interaction is thus analogous to a process in thermodynamics in
which there is no change of the configuration variables of the system. In such a process, no work is done,
and the system changes its equilibrium state by heat flow alone. This kind of interaction is thus analogous
to heat flow. Of course, we have not yet sufficiently developed the microscopic theory to enable us to
explain heat flow completely. But this first method for transferring energy to the system forms the basis of

23
the explanation that we will eventually give of heating. We thus refer to this kind of interaction as thermal
interaction.
In the second type of interaction, we force the system to remain in a given stationary state and alter its
energy by manipulating the external parameters. This is analogous in thermodynamics to an adiabatic
process, in which we completely inhibit heat flows to and from the system, and force it to alter its
equilibrium state purely by changing the values of its configuration variables. This kind of process is thus
is called adiabatic perturbation of the quantum system. It provides the basis for a microscopic model of
work done by the thermodynamic system.
Manipulation of the external parameters changes the energy of the system, even though its quantum
state remains the same. This process therefore transfers energy to and from the system. We shall call
energy transferred in this way work, and the interaction, work interaction. If the external parameters are
distances, areas or volumes, the work is said to be mechanical work, and the interaction a mechanical
interaction. If they are electric or magnetic fields, the work and interaction are said to be respectively
electric or magnetic.
Mathematically, the work done by the system is calculated as follows. The energy eigenvalues of the system are functions of the external parameters,

    E_R = E_R(\lambda^1, \lambda^2, ..., \lambda^r)                (3.6)

Here we have used R as an abbreviation for the quantum numbers n_1, n_2, ..., n_f that define the quantum state of the system. Suppose the system is in a given quantum state R. Change the values of the external parameters, each by an amount d\lambda^i. The value of the energy eigenvalue of state R then increases by the amount

    dE_R = \sum_{i=1}^{r} \frac{\partial E_R}{\partial \lambda^i}(\lambda^1, \lambda^2, ..., \lambda^r)\, d\lambda^i                (3.7)

The work done by the system on its surroundings is therefore

    dW_R = -dE_R = -\sum_{i=1}^{r} \frac{\partial E_R}{\partial \lambda^i}(\lambda^1, \lambda^2, ..., \lambda^r)\, d\lambda^i                (3.8)

The coefficient

    X_{R,i} = -\frac{\partial E_R}{\partial \lambda^i}                (3.9)

is called the generalised force conjugate to the external parameter \lambda^i exerted by the environment on the system. If the parameter \lambda^i is an ordinary Cartesian distance, then X_{R,i} is an ordinary force; if it is an angle, then X_{R,i} is a torque; and if it is a volume, then X_{R,i} is a pressure.
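As a concrete check (our own worked example, using the box energies of Exercise 1 at the end of this chapter): for a particle in a cubic box of side L and volume V = L^3, the energy of state R = (n_x, n_y, n_z) is

    E_R = \frac{h^2}{8mV^{2/3}} \left( n_x^2 + n_y^2 + n_z^2 \right)

so the generalised force conjugate to the volume is

    X_{R,V} = -\frac{\partial E_R}{\partial V} = \frac{2}{3} \frac{E_R}{V}

which is indeed a pressure.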
Note that the work dW_R done by the system when the external parameters are changed depends on the state R in which the system is found. For different states R, the same change of external parameters results in general in different amounts of work being done. The generalised force X_{R,i} exerted by the environment on the system when its external parameters have given values \lambda^1, \lambda^2, ..., \lambda^r also depends on the state R of the system. Thus, in general, for given \lambda^i, the same system in different states will experience different forces.

Remark 1 The concept of a generalised force comes from Lagrangian Mechanics and is a generalisation of the Newtonian concept. The definition arises as follows. The work done in displacing a particle by d\vec{r} increases its energy by an amount dE = \vec{F} \cdot d\vec{r}. Expressing this in Cartesian coordinates gives

    F_x = \frac{\partial E}{\partial x}, \quad F_y = \frac{\partial E}{\partial y}, \quad F_z = \frac{\partial E}{\partial z}                (3.10)

We can now reverse the order of the definitions. Given the energy E of the particle as a function of x, y, z, we can define the ith component of the force on the particle to be the partial derivative of E with respect to the ith coordinate. This definition permits us now to use arbitrary coordinate systems. It is easy to show that, in polar coordinates, the derivative \partial E / \partial \theta is the torque, or angular force, exerted by the force on the


particle about the origin. Lagrangian Mechanics makes the obvious generalisation: the derivative with
respect to any type of coordinate yields the generalised force conjugate to that coordinate. "Conjugate" here means the "force" responsible for changes in that coordinate. So, for example, if volume is used as a coordinate, the "force" conjugate to it is pressure. The advantage of this concept of generalised force is that
it can be imported into any context where the system changes its energy by a change of some configuration
coordinate, even those contexts where the Newtonian concepts fail (as in quantum mechanics) or are absent
(as in thermodynamics).

If the above two effects occur simultaneously, we have a model for a general quasistatic thermodynamic
process. It is possible also to change the external parameters suddenly rather than gradually. This normally
results in two things simultaneously: the system changes its stationary state and the energy levels of each
given stationary state alter as a result of the change in the values of the external parameters. In such a
situation however, we do not know into which stationary state the system has moved or how much energy
was supplied by the sudden influx of heat. This provides a model for non-quasistatic processes.
These facts form the basis of a microscopic understanding of both heat and work in thermodynamics.
They do not yet provide a complete theory of them. There is one more factor involved which will be
discussed in a later section. It is this: there are in general very many stationary states of the system
compatible with any given set of thermodynamic state variables, and each such state has a certain
probability of occurrence. Thermodynamic heat and work are therefore statistical averages of the above
effects, taken over all these compatible states.

3.1.5 Independent Particle Models

In many situations of interest, the individual particles of the system are identical and interact only weakly
with each other. By "interact weakly with each other" we mean that, compared with their kinetic energy and with their energy of interaction with the environment, the energy of their interaction with each other is negligible. The strength of an interaction is measured by the amount of energy involved in the interaction. So if the particle-particle interaction energy is much less than that of the particle-environment interaction energy, or the particle kinetic energy, then we commit only a small error by neglecting it. The resultant
model provides a good first approximation to the real behaviour of the system. At worst, we only get a
qualitative explanation of the observed phenomena by taking this approach, but often we get a powerful
predictive model in spite of the severity of the approximation. So instead of using
    \hat{H} = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m_i} \nabla_i^2 \right) + \sum_{i<j} u(\vec{r}_i, \vec{r}_j) + \sum_{i=1}^{N} v_{external}(\vec{r}_i, \lambda^j)

we use the simplified approximation

    \hat{H} = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m_i} \nabla_i^2 \right) + \sum_{i=1}^{N} v_{external}(\vec{r}_i, \lambda^j)

Ignoring inter-particle interactions produces considerable mathematical simplification in the quantum model. This is the principal reason for using this approximation. For a general system, the Hamiltonian consists of a sum of terms of the type
    \hat{H} = \sum_{A=1}^{N} \hat{H}_A + \sum_{\substack{A,B=1 \\ A<B}}^{N} \hat{H}_{AB}                (3.11)

Here \hat{H}_A = -\frac{\hbar^2}{2m_A}\nabla_A^2 + v_{external}(\vec{r}_A, \lambda^j) is the Hamiltonian of the Ath particle alone in interaction with the given environment, while \hat{H}_{AB} = u_{AB}(\vec{r}_A, \vec{r}_B) is the Hamiltonian of the interaction of particle A with particle B. So if we ignore the inter-particle interaction, we may remove the \hat{H}_{AB} terms from \hat{H} and


write

    \hat{H} = \sum_{A=1}^{N} \hat{H}_A                (3.12)

Since each \hat{H}_A = \hat{H}_A(\vec{r}_A, \vec{p}_A) involves only the position and momentum of the Ath particle in the external potential, the time independent Schrödinger equation is separable. If we write

    \psi(\vec{r}_1, \vec{r}_2, ..., \vec{r}_N) = \mathcal{P}\, \psi^{(1)}(\vec{r}_1)\, \psi^{(2)}(\vec{r}_2) \cdots \psi^{(N)}(\vec{r}_N),                (3.13)

the time independent Schrödinger equation separates into N independent equations, one for each particle,

    \hat{H}_A \psi^{(A)}(\vec{r}_A) = E^{(A)} \psi^{(A)}(\vec{r}_A)                (3.14)

In (3.13), \mathcal{P} is a permutation operator that depends on the statistics of the particles (fermions or bosons). If, furthermore, the particles are identical, each of these equations is identical in form to every other, and they therefore have solutions that are identical in form.
We write the common equation symbolically as

    \hat{h}\phi = \epsilon\phi                (3.15)

where \hat{h} = -\frac{\hbar^2}{2m}\nabla^2 + v_{external}(\vec{r}, \lambda^j) is the common single particle Hamiltonian, \phi is the single particle wave function, and \epsilon is the eigenenergy of the single particle alone in interaction with the environment.
The state of the N particle system is then specified by giving the state of each individual particle. The single particle state is specified by the 3 quantum numbers n, l, m (1 particle has 3 degrees of freedom, so 3 quantum numbers), and its energy is \epsilon_{nlm}. It is customary to abbreviate (n, l, m) by the single letter r = (n, l, m). The corresponding energy for the single particle is then denoted \epsilon_r. The state of the system of N particles is now specified by the quantum numbers r_i for each particle, that is, by the set of numbers r_1, r_2, ..., r_N, and the total energy of the system is given by

    E_{r_1, r_2, ..., r_N} = \epsilon_{r_1} + \epsilon_{r_2} + ... + \epsilon_{r_N}                (3.16)

This independent particle model requires a lot of words to describe, but in fact it is very easy to use. How to use it will become clear in particular examples. Keep in mind that the independent particle approximation is usually a very severe approximation, and we only expect to get qualitative results that will help us to develop a quantitative intuition for the physics that occurs in real systems.
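The bookkeeping of equation (3.16) is easily mechanised. The sketch below (our own illustration; the single particle levels \epsilon_r are an arbitrary assumption, and the particles are treated as distinguishable, ignoring the permutation operator \mathcal{P}) enumerates all microstates (r_1, ..., r_N) of N = 3 independent particles and tallies the total energies:

    # Enumerate microstates (r1, ..., rN) of N independent (distinguishable)
    # particles and tally the total energies E = eps[r1] + ... + eps[rN].
    from itertools import product
    from collections import Counter

    eps = [0.0, 1.0, 2.0]    # assumed single-particle energies eps_r
    N = 3

    omega = Counter()
    for state in product(range(len(eps)), repeat=N):
        omega[sum(eps[r] for r in state)] += 1

    for E in sorted(omega):
        print("E =", E, " microstates:", omega[E])   # 27 states, 7 levels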

3.2 Classical Models


Classical mechanics is quite different in concept and structure from quantum mechanics. The basic features
of the theory that are needed to construct a classical statistical mechanics are accordingly different. This
section outlines the principal points on which the theory is built.

3.2.1

Classical Specification of State

There are three principal approaches to classical mechanics that we need to note. These are
Newtonian, Lagrangian, and Hamiltonian. Newtonian mechanics is the simplest and most direct of these.
The picture it offers is immediate and intuitive.
Any mechanical system can be considered to consist of a fixed number N of point particles. Each particle is subject to the action of a force \vec{F}_A, A = 1, 2, ..., N, which determines its acceleration via Newton's Second Law,

    \vec{F}_A = m_A \vec{a}_A                (3.17)

The acceleration then determines, together with 6N initial values of position and velocity, the N


trajectories \vec{r}_A = \vec{r}_A(t) of the particles by the 3N second order ordinary differential equations

    \frac{d^2 \vec{r}_A}{dt^2} = \vec{a}_A = \frac{\vec{F}_A}{m_A}                (3.18)

These trajectories together define the position and velocity of each particle in space at each time t. The
position of each particle at time t is defined by 3 coordinates. To specify the configuration of the system at
time t therefore, we need 3N coordinates. The set of all possible configurations is called the configuration
space for the system, and we say that the system has 3N degrees of freedom (in the absence of constraints).
However, knowing the configuration of the system alone is not sufficient for determining how the system
will move, or change its configuration, in the next instant. For this, we also need to know the velocity of
each particle. We thus need to specify 6N quantities, three position coordinates and three components of
velocity, for each particle in order to specify the state of motion of the system. These 6N quantities can
be represented in a space of 6N dimensions, called the velocity phase space. The state of motion of the
system is called its phase.
It is often convenient to express the state of motion of the system not in terms of position and velocity, but of position and momentum. Since there is a unique momentum associated with each velocity, and conversely, these two descriptions are equivalent. We may thus specify the state of motion of the system by three position coordinates and three components of momentum for each particle. These 6N quantities may again be represented in a space of 6N dimensions. This one is called the momentum phase space.

For convenience, denote the configuration coordinates of the system by q^i, i = 1, 2, ..., f. Here, f is called the number of degrees of freedom of the system, and for a system of unconstrained particles, f = 3N. Denote also the momenta of the particles by p_i, i = 1, 2, ..., f. Newton's law then determines the q^i and the p_i as functions of time, and the state of motion of the system at time t is given by (q^i(t), p_i(t)). The set of states for a classical system of f degrees of freedom is thus a 2f-dimensional space, or continuum.
In each state, and at each time, the classical system has a well defined energy. This means that the energy of the system is a function H of the 2f + 1 variables (q^i, p_i, t). If we denote the energy by E, this means that

    E = H(q^1, ..., q^f, p_1, ..., p_f, t)                (3.19)

The function H is called the Hamiltonian function for the system. The Hamiltonian function is usually easily constructed from a knowledge of the physical interactions of the particles in the system. In Cartesian coordinates, it takes the standard form

    H(q^1, ..., q^f, p_1, ..., p_f, t) = T + V                (3.20)

where T is the total kinetic energy of the system, and consists of the sum of the kinetic energies of each of the particles,

    T = \frac{\vec{p}_1^{\,2}}{2m_1} + \frac{\vec{p}_2^{\,2}}{2m_2} + \cdots + \frac{\vec{p}_N^{\,2}}{2m_N} = \sum_{A=1}^{N} \frac{\vec{p}_A^{\,2}}{2m_A}                (3.21)

and V is the sum of the potential energies of the particles due both to external fields (gravitational, electric,
magnetic) and to mutual interactions (Coulomb, van der Waals, etc.).
All that is needed for the construction of a classical statistical mechanics is a knowledge of the
Hamiltonian function H. Strangely, this does not require us to solve the equations of motion for the
system. This should be contrasted with the quantum case, where we need to solve the Schrödinger equation
explicitly to uncover the possible energies of the eigenstates of the quantum system.
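For instance, the following minimal sketch (our own illustration; the choice of a uniform gravitational field is an arbitrary assumption) constructs the Hamiltonian function for N identical particles. Evaluating it at any phase point (q, p) requires no solution of the equations of motion:

    import numpy as np

    def hamiltonian(q, p, m=1.0, g=9.81):
        """H(q, p) = T + V for N particles of mass m in a uniform
        gravitational field; q and p are arrays of shape (N, 3)."""
        kinetic = np.sum(p**2) / (2.0 * m)     # T = sum_A p_A^2 / 2m
        potential = m * g * np.sum(q[:, 2])    # V = sum_A m g z_A
        return kinetic + potential

    q = np.zeros((5, 3)); p = np.ones((5, 3))
    print(hamiltonian(q, p))   # energy of this phase point: 7.5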

3.2.2 External Parameters

As in the quantum model, the effect of the environment on the system (applied potentials, external forces, electric and magnetic fields, confinement potentials, etc.) is described by means of parameters \lambda^i, i = 1, ..., r, that enter into the external potentials. This means that the energy of the system in state (q^i, p_i) at time t depends not only on its state, but also on the values of the external parameters. We therefore


have

    E = H(q^1, ..., q^f, p_1, ..., p_f, t; \lambda^1, ..., \lambda^r)                (3.22)

For a given state at a given time, therefore, it is possible to change the energy of that state by altering the values of the external parameters. If we denote the state at time t by R = (q^i, p_i), we can adopt a notation analogous to the one used above when discussing interaction in the quantum case and write

    E = H_R(\lambda^1, ..., \lambda^r)                (3.23)

where the argument t has been suppressed for simplicity. We may now divide possible interactions with
the system into two types. In the first, the external parameters for the system are held constant and energy
is exchanged by the system with the surroundings through interaction with external potentials. This type
of interaction forms the basis for a theory of thermal interaction when modelling thermodynamic systems.
In the second, no energy is exchanged by the system with the surroundings via the external potentials,
but the external parameters are changed. This type forms the basis for a theory of work in a model of a
thermodynamic system.
References:
Reif, F., 1965, Fundamentals of Statistical and Thermal Physics, McGraw-Hill Book Company, New York, Ch 2, 6 and 7.
Tolman, R. C., 1979, The Principles of Statistical Mechanics, Dover Publications, Inc., New York, Ch 2, 3, 7, 8.


Exercises

1. Consider a non-relativistic particle of mass m in a rectangular box of dimensions L_x, L_y, L_z. The potential inside the box can be set to zero; outside the box the potential is infinite.

(a) Show that the allowed energies of the independent particle states are

    \epsilon_{n_x, n_y, n_z} = \frac{h^2}{8m} \left( \frac{n_x^2}{L_x^2} + \frac{n_y^2}{L_y^2} + \frac{n_z^2}{L_z^2} \right)

with n_x, n_y, n_z = 1, 2, 3, ... positive integers.

(b) Show that the number of states in an infinitesimal interval d\epsilon is given by

    n(\epsilon) = \rho(\epsilon)\, d\epsilon

where

    \rho(\epsilon) = 2\pi V \left( \frac{2m}{h^2} \right)^{3/2} \epsilon^{1/2}                (3.24)

is the density of states and V the volume of the box. (Consider \epsilon \gg 0.)

2. Result (3.24) is specific to non-relativistic particles, since the Schrödinger equation from which it was derived is non-relativistic. Result (3.24) is therefore also correct only for non-relativistic particles. We
may arrive at the same result in a more general way as follows. A free particle with momentum \vec{p} and energy \epsilon has wavefunction

    \psi_{\vec{p}}(\vec{r}, t) = A e^{i(\vec{r} \cdot \vec{p} - \epsilon t)/\hbar} = \phi(\vec{r})\, e^{-i\epsilon t/\hbar}                (3.25)

(Recall that \hbar \vec{k} = \vec{p}, where \vec{k} is the wavevector.) The parameter \epsilon is not a free parameter, but is determined by the three parameters \vec{p}. For a non-relativistic particle, it satisfies the equation

    \epsilon = \frac{p^2}{2m}                (3.26)

For a relativistic particle, it satisfies the equation

    \epsilon^2 = p^2 c^2 + m^2 c^4                (3.27)

The following steps are valid for both the relativistic and non-relativistic cases.

(a) Show that by demanding that \phi(\vec{r}) obeys periodic boundary conditions (\phi(\vec{r}) = \phi(x, y, z) = \phi(x + L_x, y, z), and similarly for the y and z directions and any combination of coordinates), we get the quantisation condition

    p_x = \frac{h n_x}{L_x}, \quad p_y = \frac{h n_y}{L_y}, \quad p_z = \frac{h n_z}{L_z}                (3.28)

with n_x, n_y, n_z integers. (We can also use the rectangular box to arrive at the same result; try this and compare.)
(b) According to (3.28), the quantum states for the particle may be represented by a lattice of integer points in a 3-d state space with axes n_x, n_y, n_z. Denote the number of quantum states with parameter values in the range p_x to p_x + dp_x, p_y to p_y + dp_y, p_z to p_z + dp_z by \rho(\vec{p})\, d^3p. Show, with appropriate explanations, that

    \rho(\vec{p})\, d^3p = \frac{V}{h^3}\, d^3p                (3.29)

where V = L_x L_y L_z.


(c) Let p^2 = p_x^2 + p_y^2 + p_z^2. Denote the number of states with parameter p in the range p to p + dp by \rho(p)\, dp, and show that this number is given by

    \rho(p)\, dp = \frac{V}{h^3}\, 4\pi p^2\, dp.                (3.30)

(d) In spite of appearances, \vec{p} in the above formulae is not the momentum of a particle in a box. Explain this assertion as fully as you are able. Attempt a physical interpretation of \vec{p}. (Hint: this question is not trivial. You might find Reif, pp 353-360, useful in this regard.)
(e) Non-relativistic density of states: The energy of the particle is given by \epsilon = p^2/2m. Use this relation, together with result (3.30), to show that the number \rho(\epsilon)\, d\epsilon of states with energy in the range \epsilon to \epsilon + d\epsilon is given by

    \rho(\epsilon)\, d\epsilon = 2\pi V \left( \frac{2m}{h^2} \right)^{3/2} \epsilon^{1/2}\, d\epsilon

(f) Relativistic density of states: The energy of the particle is given by \epsilon = +\sqrt{p^2 c^2 + m^2 c^4}. Use this relation, together with result (3.30), to find an expression for the number \rho(\epsilon)\, d\epsilon of states with energy in the range \epsilon to \epsilon + d\epsilon.
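As a numerical sanity check on (3.24) (our own sketch, not part of the exercise set), one can count the box states of part (a) directly and compare with the integral of the density of states, \Phi(\epsilon) = \int_0^\epsilon \rho(\epsilon')\, d\epsilon' = \frac{4\pi V}{3} \left( \frac{2m}{h^2} \right)^{3/2} \epsilon^{3/2}. In units where h = m = L = 1:

    import math

    def count_states(eps_max):
        """Directly count box states with eps = (nx^2+ny^2+nz^2)/8 <= eps_max."""
        r2 = 8.0 * eps_max
        n_max = int(math.isqrt(int(r2)))
        return sum(1 for nx in range(1, n_max + 1)
                     for ny in range(1, n_max + 1)
                     for nz in range(1, n_max + 1)
                     if nx*nx + ny*ny + nz*nz <= r2)

    eps0 = 100.0
    analytic = (4.0 * math.pi / 3.0) * 2.0**1.5 * eps0**1.5
    print(count_states(eps0), round(analytic))   # direct count vs leading-order
                                                 # formula; agreement improves
                                                 # as eps0 grows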

Chapter 4

Isolated Systems: Fundamental Postulates
The simplest thermodynamic situations to imagine are those where the system of interest is isolated.
"Isolated" means that the system does not interact at all with its surroundings. Heat does not flow into it,
it can do no work, and the number of particles that it contains does not change. So, its internal energy
remains constant. According to quantum theory therefore, the system is in a stationary state, and its
external parameters do not change. Comparison of this microscopic description of the state of the system
with its classical macroscopic description has far reaching consequences. We therefore begin our study of
microscopic modelling by constructing a model of an isolated system.

4.1 Thermodynamic Specification of States


From a thermodynamic point of view, the only parameters of a quantum system which can be regarded as macroscopic are its total energy E, the values \lambda^i of the external parameters, and the number N of particles in the system. All other variables are microscopic, and any closer specification of the system requires a detailed knowledge of its microscopic constitution and state. The values (E, \lambda^1, ..., \lambda^r, N) thus provide the most detailed possible macroscopic specification of the quantum state. In thermodynamic language, (E, \lambda^1, ..., \lambda^r, N) are state variables for this system. Sometimes, not all the external parameters \lambda^i are of interest. The description they give is too detailed. In such cases, we use a reduced set of parameters defined from the \lambda^i in their place. If so, then this reduced set, together with E and N, are the state variables.
In general, the values (E, \lambda^1, ..., \lambda^r, N) do not define a unique stationary state of the quantum system. The values \lambda^i and N, together with information about the interactions that the particles undergo, determine the Schrödinger equation to be solved, and hence the stationary states, their energies and their degeneracies. Thereafter, choosing the value of E selects the energy level at which the stationary state is to be found. If E happens to be a non degenerate energy eigenvalue, then the given values (E, \lambda^1, ..., \lambda^r, N) have specified a unique quantum state. However, with systems containing huge numbers of particles, non degenerate energy levels are very rare. It is more usual that a level has a very high degree of degeneracy, so that there are not one, but g_E \gg 1 stationary states for the system at this energy value. Of course, if the given value of E is not one of the system energy eigenvalues, then there are no stationary states at all at this value.

Since the quantum system offers no other macroscopic variables for its description, this means that a thermodynamic description of the system will have to be formulated exclusively in terms of E, \lambda^i, N, and any parameters that can be deduced from these as generalised forces. In particular, fixed values of E, \lambda^i, N must determine the thermodynamic equilibrium states of this system.
From this observation, we learn the following important fact: specification of a thermodynamic state of the system is an incomplete specification of its quantum state. Given a thermodynamic equilibrium state, the system could be in any one of g_E different stationary states. This leads us to the following definitions:

- A stationary state of the system will be called a microstate of the system. This represents the most complete detailed knowledge that one can have of the system and of its state of motion. This microstate is specified by the particle number N, the values n_1, n_2, ..., n_f of the f quantum numbers, and the values of the external variables \lambda^i. Together, these determine the energy E of the system.

- The partial specification of the state of a system by the values of the macroscopic variables (E, \lambda^i, N) will be called a thermodynamic state of the system. A specification of thermodynamic state thus leaves uncertain the microstate that the system actually occupies, but limits the number of microstates in which it could be.

- The set of microstates which could be occupied by the system in a given thermodynamic state, and which are compatible with it, are called its accessible (micro)states.


4.2 Equilibrium States of Isolated Systems


Consider an isolated system. Its configuration parameters are fixed, so the system cannot work on its
surroundings. Also, it is thermally insulated, so it cannot exchange heat with its surroundings. Its energy
is therefore fixed. This means that the system must be in a quantum state which has that energy as its
eigenvalue, and such that the values of its external parameters are compatible with those of its macroscopic
configuration. But there are in general a huge number of such quantum states, and the system could be in
any one of them. In which state therefore is it? We cannot know. Nor will any macroscopic measurement
reveal it.
There are two things however that we do know. First, when the system is in equilibrium, all of its
properties are independent of time. So the probability of finding it in any given accessible state must also
be time independent. Second, there is nothing in the laws of mechanics, classical or quantum, to prefer one
accessible state above another. So it is not unreasonable to suppose that it is equally likely to be in any one
of them. This leads us to the following hypothesis:
Fundamental Postulate:

    An isolated system in equilibrium is equally likely to be in any one of its accessible states.
We have not proved this statement. We have merely made it plausible by a reasonable argument. Like
all hypotheses in science, its "proof" rests on the predictions it makes and how these match experiment.
This postulate has now been used extensively for well over 100 years and its predictions have been amply
verified in a very wide variety of contexts. It is, in fact, one of the most well tested laws of physics.

Remark 2 It is not immediately clear that the fundamental hypothesis is consistent with the laws of mechanics, either classical or quantum. The study of its consistency is long, involved and difficult. Much of
the original work on this subject was devoted to this question. This is reflected in the large amount of space
devoted to it in the older textbooks. (See, for example, R C Tolman, The Principles of Statistical Mechanics,
Oxford University Press, 1938.)

4.3 Microscopic View of Thermodynamic Equilibrium


An isolated thermodynamic system could be in any one of a huge number of accessible states. The
fundamental postulate asserts that, when the system is in equilibrium, it is equally likely to be in any one
of them. What does this mean in practice? There are two ways to think about such a system. One is based
on a standard, ideal, statistical fiction, the other on sound physical intuition.
The statistical fiction is this. Imagine, not a single isolated system, but a huge collection of copies of the system, each prepared to identical macroscopic specifications. This huge collection is called a statistical ensemble. ("Ensemble" is the French word for "set".) The macroscopic specification is too weak to determine
uniquely the microstate of each system in the ensemble so each, in general, will be in a different microstate.
The original system in equilibrium is then represented by an ensemble in which each accessible microstate
is represented in the ensemble an equal number of times. This is called an equilibrium ensemble.
To calculate the average value of any given measured parameter for the system, we imagine a set of
measurements of it, one on each copy of the system in the ensemble. These measurements, in general, will
each yield a different result, depending on what value that parameter has in any given microstate. To obtain
the value that would actually be measured in the given thermodynamic state, we average all the values
obtained over the ensemble.
A more physical way to think about the system is as follows. Consider, not an ensemble of systems,
but a single given system in thermodynamic equilibrium in a given thermodynamic state. The isolation of
the system can never be perfect. There will always remain some residual interaction of the system with
its surroundings, even if this interaction is extremely small. This small residual interaction is sufficient to
drive the system continually from one of its accessible states to another. So in the course of time, if one
waits sufficiently long, the system will pass through each of its accessible states in random order. Since

32

Chapter 4

Isolated Systems: Fundamental Postulates

the probability that the system is in any given accessible state is equal to that of it being in any other, the
system must spend, on average, equal amounts of time in each of its accessible states. Suppose now that
we wish to measure the value of a particular variable for the system. Typically, the time needed for the
measurement is very much longer than the amount of time that the system spends on average in any one
microstate. So if we make repeated measurements of this particular variable, we can expect to pick up a
series of values each of which is an average of the values of that variable over whatever microstates the
system has been in during the time taken for the measurement. The average of these measurements will
then be the average value of that variable for the system in the given thermodynamic equilibrium.
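The time-average picture is easy to illustrate with a toy simulation (our own sketch; the three-state system, the observable's values and the hopping rule are arbitrary assumptions). A system hopping at random among equally probable accessible states spends, on average, equal time in each, so the long-time average of any observable converges to the flat average over the accessible states:

    import random

    values = {0: 1.0, 1: 2.0, 2: 4.0}   # an observable's value in each of 3 accessible states

    random.seed(1)
    total, steps = 0.0, 200_000
    for _ in range(steps):
        state = random.randrange(3)     # residual interactions drive random hops
        total += values[state]

    print(total / steps)                # ~2.333 = (1 + 2 + 4)/3, the ensemble average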

Remark 3 Attainment of Equilibrium: Consider a system not in equilibrium. The system can then
be found in any of its accessible states, but not with equal probability. There will however be a finite
probability for the system to be in each of its accessible states. Since the system moves spontaneously to
equilibrium, this probability distribution must change with time until it assumes the same constant value
for each accessible state. Physically what happens is this. The system may initially be in some equilibrium
state. Something is done to disturb this equilibrium (e.g. sudden compression of a gas, or decompression).
The system then finds itself with a new set of accessible states, but distributed over them in some non
uniform way. (E.g. in sudden compression of a gas, the molecules are clustered near the piston. In sudden
decompression, they are clustered in the original volume which they occupied.) There is nothing in the new
situation to prevent the system from entering the new set of accessible states. The system thus begins to
make transitions into them. Eventually, it reaches a situation in which it is spending on average as much time in any one of the new accessible states as in any other. That is the new equilibrium. The time taken to
attain a new equilibrium is called the relaxation time of the system. It can be shorter than a microsecond
for a gas, some minutes for very viscous fluids, or even centuries for geological systems. The study of non
equilibrium situations is very complex and difficult. It falls under the heading of non-equilibrium statistical
mechanics, or non-equilibrium thermodynamics.

4.4 Isolation of a System


In our model of an isolated system, we have assumed that complete and perfect isolation of the system is possible. In fact, this is a naïve idealisation. No physical system is, or can ever be, completely isolated
from its surroundings. There will always exist forces that permeate all of physical space. They may be
weak and negligible in comparison with the effects of interest, yet they are always present. Gravitational
fields provide one such example. There is no way to shield a gravitational field. So, no matter how weak
gravitational effects may be, there will always remain some small residual gravitational interaction of the
system with the surroundings.
These residual interactions with the system have important consequences. First, they provide the means
by which the physical system will continually make transitions between all of its accessible states. A
quantum system which is truly isolated, once it has entered any one of its stationary states, will remain in
that state forever. There will never be any spontaneous transitions to any other states accessible to it. The
presence of residual interactions thus has the effect of producing the "dynamical equilibrium" of a quantum system of interest in thermodynamics, in which the system is continually moving from one accessible quantum state to another. But second, and more importantly, the residual interaction broadens the range and number of states that are actually accessible to the system. A weak interaction will allow the system to exchange energy with the interacting field. The thermodynamic equilibrium into which it settles will therefore not be one at a perfectly definite energy E, but one in which the energy can have some value in a small range E to E + \delta E. The number of states accessible to the system will therefore not be simply the degeneracy of the energy level with energy E, but will be the sum of the degeneracies of all the levels with energy in the narrow range E to E + \delta E. The more perfect the isolation of the system, the narrower the range of energy accessible to the system. A sharply defined energy E is therefore an ideal limit.
A thermodynamic system contains a huge number N of particles. This has two effects. First, the degree


of degeneracy, g_E, of each energy level of the system is usually extremely large. Second, the spacing between adjacent energy levels is very, very small. In fact, the level spacing is almost always much, much smaller than the precision \delta E of definition of the total energy E of the system. This means that the states accessible to the system will be, not the g_E states at some sharply defined energy E, but all of the states represented by each energy level in a cluster of levels at energy E in the range E to E + \delta E. The number of accessible states is thus normally impressively large. We denote this number by the symbol

    \Omega(E, \delta E) = number of accessible states in the range E to E + \delta E                (4.1)

Note the inclusion of the range size \delta E in the symbol, to emphasize its dependence on \delta E. We shall later find that the results of interest to us in the case of "isolated systems" are completely insensitive to the actual value of \delta E. It is customary therefore, for the sake of simplicity, to omit the dependence of \Omega on \delta E whenever this dependence is not of interest. We will thus often write \Omega(E) in place of \Omega(E, \delta E), or, more briefly still, \Omega. Note that these are abuses of notation, introduced only for convenience. In all cases, it is implicit that \Omega = \Omega(E, \delta E).
In the above discussion, we assumed that the parameters \lambda^i, N are held fixed. In general, if we change their values, the number of states accessible to the system will also change. So \Omega can be expected to be a function of \lambda^i and N as well as of E and \delta E, or \Omega = \Omega(E, \delta E, \lambda^i, N). The fact that \Omega is largely insensitive to the value of \delta E leads us to express this as

    \Omega = \Omega(E, \lambda^i, N)                (4.2)

4.5 Dependence of \Omega on E and \delta E


It is useful to have a rough estimate of how the number of accessible states depends on E and \delta E for typical systems. For this, it is sufficient to consider only the independent particle approximation.

In independent particle models, there are two cases to consider. In the first, the number of states available to a single particle is unlimited, with energies ranging from the ground state energy to infinity. In the second, the number of states is finite and there is a state of highest energy available to the particles. This occurs, for example, in spin systems.

Consider first a system in which an unlimited number of states is available to a single particle. Suppose the system has f degrees of freedom, and a total energy E. We are interested in calculating the number of states \Omega(E, \delta E) available to the system when its energy lies in the range E to E + \delta E. This can be done most easily as follows. Denote by \Phi(E) the number of states available to the system with energy less than E. Then for sufficiently small \delta E we have

    \Omega(E, \delta E) = \Phi(E + \delta E) - \Phi(E) = \frac{d\Phi}{dE}(E)\, \delta E                (4.3)

It is not difficult to estimate \Phi(E). This is done as follows. There is nothing in a system of identical independent particles to prefer one degree of freedom above another. So, on average, the total energy E of the system will be distributed equally among the available degrees of freedom. Each degree of freedom therefore will contain, on average, energy \epsilon = E/f. Experience in dealing with systems of this kind shows that the energy eigenvalues for a single degree of freedom typically follow a power law of the type \epsilon \sim n^{\alpha}, where n is the quantum number for this degree of freedom, and \alpha is a number like 2 for the infinite square well, 1 for the simple harmonic oscillator, -2 for the hydrogen atom, and so on. In a rough estimate such as the one we are making here, we can afford to be generous and to say that \alpha has a range something like 10^{-1} < |\alpha| < 10. Now, n is the quantum number that labels the states available to the single particle. Ordinarily, n is an integer with values 1, 2, 3, ... or 0, 1, 2, .... Denote the number of states available to the single particle with energy less than \epsilon by \phi(\epsilon). This number of states, give or take one state, is equal to the quantum number n corresponding to the given energy \epsilon. Thus

    \phi(\epsilon) \approx n \sim \epsilon^{1/\alpha} = \left( \frac{E}{f} \right)^{1/\alpha}                (4.4)


The number of states available to the whole system of identical particles is thus the product of the number of states available to the individual particles,

    \Phi(E) = [\phi(\epsilon)]^f \sim \left( \frac{E}{f} \right)^{f/\alpha}                (4.5)

from which

    \Omega(E, \delta E) \sim \left( \frac{E}{f} \right)^{(f/\alpha) - 1} \delta E                (4.6)

For macroscopic systems, f is of the order of Avogadro's number, 6.02 \times 10^{23} \sim 10^{24}. So (f/\alpha) - 1 \approx f/\alpha and

    \ln \Omega(E, \delta E) \approx \ln f - \ln \alpha + \frac{f}{\alpha}(\ln E - \ln f) + \ln \delta E                (4.7)

Since f \sim 10^{24}, the terms \ln f \approx 24, \ln \alpha \sim 1, and \ln \delta E are all utterly negligible in comparison with the term with coefficient f/\alpha \sim 10^{23}. Note that, since \ln \delta E is utterly negligible, to excellent approximation \Omega does not depend on \delta E at all, and we can regard it as a function of E alone. This is the promised proof that the choice of \delta E is essentially of no importance in this problem. Also, since \alpha and f are constants for this problem, we can write the dependence of \Omega on E in the form

    \Omega(E) \sim E^{f/\alpha}                (4.8)

This shows, by an order of magnitude estimation, that \Omega depends essentially only on E and not on \delta E, and that it increases extremely rapidly with increasing E. In fact, it rises as a power law with exponent of order 10^{24} for macroscopic systems. This increase is staggeringly rapid. The graph of \Omega vs E rises almost vertically after an initial tail, as shown in Figure 4.1.

Figure 4.1. Approximate dependence of \Omega(E) on E.


For systems where there is a finite number of quantum states available to the single particles, there is a maximum allowed energy for the system. The initial behaviour is similar, but after the initial rapid increase, \Omega(E) starts to decrease until the maximum allowed energy is reached, at which point \Omega becomes zero, since no more energy can be taken into the system.
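A minimal numerical illustration of this second case (our own sketch): for N two-level spins with n excited, \Omega(n) = \binom{N}{n}, which first rises staggeringly fast and then falls back to one as the maximum allowed energy is approached:

    from math import comb, log

    # Omega(n) = C(N, n): microstates of N two-level spins with n excited.
    N = 1000
    for n in (0, 100, 250, 500, 750, 900, 1000):
        print(n, log(comb(N, n)))   # ln(Omega) peaks at n = N/2, then falls back to 0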
There is an alternative way to derive the essence of equation (4.5) without appeal to the independent particle picture. A quantum state is specified by its quantum numbers \{q_i\}. The quantum numbers are normally discrete, and the number of quantum numbers n_i for the ith degree of freedom with values \leq |q_i| will be proportional to |q_i|. For a given energy E_R, the total number of eigenstates will be proportional to

    \Phi(E_R) \sim \prod_{i=1}^{f} |q_i|                (4.9)

If we assume that the total energy is partitioned among the different quantum numbers in proportion to the quantum number,

    |q_i|^{\alpha_i} \sim \frac{E}{f},

equation (4.9) becomes

    \Phi(E) \sim \prod_i |q_i| \sim \prod_{i=1}^{f} \left( \frac{E}{f} \right)^{1/\alpha_i}

which is qualitatively equivalent to (4.5).

4.6 Thermal Interaction


Consider now two thermodynamic systems A and A' interacting thermally with each other, but not in any other way. The combined system A + A' is isolated. Denote the energies of these two systems by E and E' respectively, and the number of states available to them at these energies by \Omega_A(E) and \Omega_{A'}(E'). Since A and A' are free to interact thermally but not in any other way, the external parameters of each are fixed. And since the composite system A + A' is isolated, it will at all times have a fixed total energy E_0. E and E' are therefore constrained by the equation

    E + E' = E_0                (4.10)

The system A + A' will eventually settle into a state of thermodynamic equilibrium. Our problem is to determine, from our microscopic model, the energies of each of A and A' in this equilibrium.

Remark 4 The relation E_{A+A'} = E_A + E_{A'} assumes that the interaction of A with A' is weak, so that the total energy involved in the interaction is small compared with that contained in the systems themselves. In general, the Hamiltonian operator for two systems in interaction has the form

    H = H_A + H_{A'} + H_{int}                (4.11)

where H_A is the Hamiltonian for the system A by itself, H_{A'} that for A' by itself, and H_{int} is the term that represents the interaction. We thus expect the total energy for the system to be

    E_0 = E_A + E_{A'} + E_{int}                (4.12)

The assumption that we have made, therefore, is that E_{int} \ll E_A + E_{A'}. For macroscopic systems in thermal contact, this is generally an excellent approximation.
Now, A + A' is an isolated system. So, by the fundamental postulate, when it has reached equilibrium, it will be found with equal probability in any one of its accessible states. But A and A' are free to exchange energy. So A + A' could be found in any state in which A is in equilibrium with energy E, and A' is in equilibrium with energy E' = E_0 - E, where 0 \leq E \leq E_0. But when A has energy E, it has \Omega_A(E) states accessible to it, and when A' has energy E' = E_0 - E, it has \Omega_{A'}(E_0 - E) states accessible to it. So when A has energy E, and A' has energy E' = E_0 - E, the number of states available to A + A' is

    \Omega_{A+A'}(E, E') = \Omega_A(E)\, \Omega_{A'}(E_0 - E)                (4.13)

The total number \Omega_{total} of states accessible to the isolated system A + A', for all possible partitions of the energy E_0 between A and A', is therefore

    \Omega_{total} = \sum_{E=0}^{E_0} \Omega_{A+A'}(E, E') = \sum_{E=0}^{E_0} \Omega_A(E)\, \Omega_{A'}(E_0 - E)                (4.14)


This total number \Omega_{total} is a given fixed constant for this problem. Now, all the states accessible to the isolated system A + A' are equally probable, and there are \Omega_{A+A'}(E, E_0 - E) of them in which A has energy E and A' energy E' = E_0 - E. The probability, therefore, of finding A + A' with energy E in A and E' = E_0 - E in A' is

    P(E) = \frac{\Omega_{A+A'}(E, E')}{\Omega_{total}} = c\, \Omega_A(E)\, \Omega_{A'}(E_0 - E)                (4.15)

where, for convenience, we have put c = 1/\Omega_{total}.

4.6.1 The Probability Distribution

Result (4.15) shows us that, when A + A' is in equilibrium, A does not have a uniquely defined energy. In fact, A could be found with any energy E in the range 0 \leq E \leq E_0. This appears to contradict our classical expectations, which lead us to believe that A will have a well defined total internal energy E while A' will have an equally well defined total internal energy E' = E_0 - E. To see how to resolve this apparent contradiction, we must examine the probability distribution (4.15).

First, we present an intuitive argument. In the next section, we will give a detailed mathematical justification of it. P(E) is a probability distribution. So, necessarily, 0 \leq P(E) \leq 1 for all E. Also, for any macroscopic system, \Omega(E) increases rapidly with increasing E. So the factor \Omega_A(E) in (4.15) increases rapidly with E, while \Omega_{A'}(E_0 - E) decreases rapidly. Because P(E) always has values between 0 and 1, the product \Omega_A(E)\, \Omega_{A'}(E_0 - E) must always be finite. It must therefore climb very steeply to a maximum and then drop again, also very steeply. So P(E), which is this product divided by the huge number \Omega_{total}, will exhibit a pronounced maximum in the form of a sharp spike, with P(E) very close to zero everywhere else. This is illustrated in Figure 4.2. Physically, the implications of this behaviour of P(E)

Figure 4.2. Sharpness of P(E).


are the following. When A + A' is in equilibrium, the probability of finding system A with energy E is very nearly zero for almost all energies E in the range 0 \leq E \leq E_0. The system thus has non negligible probability of being found with energy E in A only when E is in a narrow range \Delta E around the value E = \tilde{E}. Here, \tilde{E} denotes that value of E where P(E) attains its maximum value. The total internal energy of A will thus fluctuate around a very sharply defined value E = \tilde{E}. We shall see in the next section that these fluctuations are so minute that, to the accuracy of classical experiments, the internal energy of A can be regarded as "precisely" defined, with value E = \tilde{E}.

4.6.2 Sharpness of Peak

We saw from general arguments that the probability distribution P(E) must have a very sharp peak. We now investigate this more closely. Denote the value of E at which P(E) has its maximum by \tilde{E}. Since P(E) is expected to change rapidly near \tilde{E}, it is easier and more convenient to investigate \ln P(E) rather than


P(E), since this function changes more slowly than P(E). Now,

    \ln P(E) = \ln [c\, \Omega_A(E)\, \Omega_{A'}(E_0 - E)] = \ln c + \ln \Omega_A(E) + \ln \Omega_{A'}(E_0 - E)                (4.16)

Near \tilde{E} we have

    \ln \Omega_A(E) = \ln \Omega_A(\tilde{E}) + \frac{d \ln \Omega_A}{dE}(\tilde{E})\,(E - \tilde{E}) + \frac{1}{2!} \frac{d^2 \ln \Omega_A}{dE^2}(\tilde{E})\,(E - \tilde{E})^2 + \cdots                (4.17)

Put

    \beta = \frac{d \ln \Omega_A}{dE}(\tilde{E}), \qquad \lambda = -\frac{d^2 \ln \Omega_A}{dE^2}(\tilde{E})                (4.18)

The negative sign in the definition of \lambda is introduced for convenience. We shall see later that the second derivative of \ln \Omega_A evaluated at \tilde{E} is always negative. The negative sign thus makes \lambda always positive. With this notation, and writing \eta = \ln \Omega_A(\tilde{E}), (4.17) then becomes

    \ln \Omega_A(E) = \eta + \beta (E - \tilde{E}) - \frac{1}{2!} \lambda (E - \tilde{E})^2 + \cdots                (4.19)

Similarly, \ln \Omega_{A'} may be expanded about the value \tilde{E}' = E_0 - \tilde{E}. This gives

    \ln \Omega_{A'}(E_0 - E) = \ln \Omega_{A'}(E_0 - \tilde{E}) - \frac{d \ln \Omega_{A'}}{dE'}(E_0 - \tilde{E})\,(E - \tilde{E}) + \frac{1}{2!} \frac{d^2 \ln \Omega_{A'}}{dE'^2}(E_0 - \tilde{E})\,(E - \tilde{E})^2 + \cdots                (4.20)

so, putting

    \eta' = \ln \Omega_{A'}(E_0 - \tilde{E}), \qquad \beta' = \frac{d \ln \Omega_{A'}}{dE'}(E_0 - \tilde{E}), \qquad \lambda' = -\frac{d^2 \ln \Omega_{A'}}{dE'^2}(E_0 - \tilde{E})                (4.21)

we get

    \ln \Omega_{A'}(E_0 - E) = \eta' - \beta'(E - \tilde{E}) - \frac{1}{2!} \lambda' (E - \tilde{E})^2 + \cdots                (4.22)

and hence, near \tilde{E},

    \ln P(E) = \ln c + (\eta + \eta') + (\beta - \beta')(E - \tilde{E}) - \frac{1}{2}(\lambda + \lambda')(E - \tilde{E})^2 + \cdots                (4.23)

Since \ln P has its maximum at E = \tilde{E}, its first derivative there must vanish, so

    \beta = \beta'                (4.24)

Therefore, for E near \tilde{E},

    P(E) = c\, e^{\eta + \eta'}\, e^{-\frac{1}{2}(\lambda + \lambda')(E - \tilde{E})^2}                (4.25)

But at E = \tilde{E}, P(E) = P(\tilde{E}), so P(\tilde{E}) = c\, e^{\eta + \eta'}, and we may write

    P(E) = P(\tilde{E})\, e^{-\frac{1}{2}(\lambda + \lambda')(E - \tilde{E})^2}                (4.26)

This result shows that, near its maximum, P(E) is approximately Gaussian, with width \Delta^* E = 1/\sqrt{\lambda + \lambda'}. Note that \lambda + \lambda' cannot be negative. If it were, P(E) would have a minimum, not a maximum. But we know from experiment that the system A + A' reaches a well defined thermal equilibrium, so P(E) must have a maximum, not a minimum. So \lambda + \lambda' \geq 0.

Further, \lambda and \lambda' must each individually be positive. The values of \lambda and \lambda' are properties of the systems A and A' respectively. By altering the systems A and A', we can change the values of \lambda and \lambda', and hence also the value of \lambda + \lambda'. Since for all systems A and A' we must have \lambda + \lambda' \geq 0, this must be true for


all possible choices of values of \lambda and \lambda'. Thus \lambda \geq 0 and \lambda' \geq 0. This result has a simple physical interpretation, which we shall discuss in the next sections.

Returning to the sharpness of P(E), we see that the value of P(E) will be negligibly small when E lies outside the range \tilde{E} \pm \Delta^* E, where

    \Delta^* E = \frac{1}{\sqrt{\lambda + \lambda'}}

To estimate \Delta^* E, suppose that \lambda < \lambda', so that \lambda + \lambda' > 2\lambda and \Delta^* E = 1/\sqrt{\lambda + \lambda'} < 1/\sqrt{2\lambda}. But according to our order of magnitude estimates, Eq. (4.7),

    \ln \Omega(E) \approx \frac{f}{\alpha}(\ln E - \ln f)                (4.27)

so that

    \lambda = -\frac{\partial^2}{\partial E^2} \left[ \frac{f}{\alpha}(\ln E - \ln f) \right]_{E = \tilde{E}} = \frac{f}{\alpha} \frac{1}{\tilde{E}^2}                (4.28)

and hence

    \Delta^* E < \frac{1}{\sqrt{2\lambda}} = \frac{\tilde{E}}{\sqrt{2f/\alpha}}                (4.29)

or

    \Delta^* E \sim \frac{\tilde{E}}{\sqrt{f}}                (4.30)

For typical macroscopic systems, f 1024 and 1, so


E
1012
E
This is an extremely sharp maximum. If E differs from Ẽ by as little as 1 part in 10¹², then P(E) is completely negligible.
The sharpness of the peak of P(E) is characteristic of systems having a huge number of degrees of freedom. It means that the energy of the system is so sharply defined that statistical fluctuations of the energy around its mean value are undetectable except with extremely sensitive equipment, or in very small systems. This explains why classical thermodynamics remained completely ignorant of such fluctuations for so long: they show up only with very sensitive equipment, or in very small systems.

4.7 Entropy
The behaviour of the composite system A + A′ in relation to P(E) is reminiscent of its behaviour in classical thermodynamics in relation to the entropy. There, the system comes to equilibrium in that thermodynamic state which gives the system entropy its maximum value (second law of thermodynamics), where

    S_{A+A′}(E, E′) = S_A(E) + S_{A′}(E′)        (4.31)

Here, the system comes to thermodynamic equilibrium when its energy is such that it maximises the probability P(E). Up to a multiplicative constant, P(E) is just the function

    Ω_{A+A′}(E, E′) = Ω_A(E) Ω_{A′}(E′)        (4.32)

This suggests that S and Ω are closely related. Suppose the relationship is of the form

    S = f(Ω)        (4.33)

Our previous discussion made no special assumptions about the nature of the systems A and A′. The relationship (4.33) must therefore be universal, relating S and Ω for any conceivable type of physical
system, irrespective of its detailed nature. So, in particular, we must have for the individual systems A and A′,

    S_A = f(Ω_A)   and   S_{A′} = f(Ω_{A′})        (4.34)

while for the composite system A + A′ we must have

    S_{A+A′} = f(Ω_{A+A′})        (4.35)

From this last relation, using (4.31) and (4.32) above, we get

    S_A + S_{A′} = f(Ω_A Ω_{A′})        (4.36)

or

    f(Ω_A) + f(Ω_{A′}) = f(Ω_A Ω_{A′})        (4.37)

This last relation must hold for all possible systems A and A′, and thus for all possible values of Ω_A and Ω_{A′}. It is therefore an identity for the unknown function f, of the form

    f(x) + f(y) = f(xy)        (4.38)

which must hold for each possible value of x and each possible value of y. In particular, it holds for any chosen fixed value y = α,

    f(x) + f(α) = f(αx)        (4.39)

So, differentiating with respect to x, we get

    df/dx (x) + 0 = (df(y)/dy)|_{y=αx} · d(αx)/dx = α (df(y)/dy)|_{y=αx}        (4.40)

This last relation must also hold true for all possible values of x, and so in particular it must hold true for the value x = 1. We thus get

    (df/dx)|_{x=1} = α (df/dx)|_{x=α}        (4.41)

The left hand side is a constant. Call it k. Thus

    (df/dx)|_{x=α} = k/α        (4.42)

This relation states that the function df/dx evaluated at value α is k/α. But since the value α was chosen arbitrarily, and since k/α is just the function k/x evaluated at α, this means that

    df/dx = k/x        (4.43)

which is a differential equation for f. Its solution is f = k ln x + c, where k and c are constants. In terms of our original problem, this gives

    S = k ln Ω + S₀        (4.44)
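As a quick aside (my own verification, not part of the derivation), both the functional identity (4.38) and the differential equation (4.43) can be checked symbolically:

    # Symbolic check that f(x) = k ln x satisfies f(x) + f(y) = f(xy), Eq. (4.38),
    # and that Eq. (4.43) integrates to f = k ln x + c.
    import sympy as sp

    x, y, k = sp.symbols('x y k', positive=True)
    f = lambda t: k * sp.log(t)

    # expand_log is valid here because x, y are declared positive
    print(sp.expand_log(f(x) + f(y) - f(x * y)))   # prints 0: identity holds

    F = sp.Function('F')
    print(sp.dsolve(sp.Eq(F(x).diff(x), k / x), F(x)))   # F(x) = C1 + k*log(x)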

This relation must hold irrespective of the nature of the thermodynamic system considered. The constants k and S₀ must therefore be universal constants which relate, for every system, the number of states accessible to the system to its entropy.
In classical thermodynamics, the entropy was defined by the Clausius theorem only up to an arbitrary additive constant, and it provided no way to determine that constant. Here, on the other hand, ln Ω(E) has a physically well defined value determined by the system itself. This means that we have, because of equation (4.44), a way of defining the entropy of a system in an absolute way, without the intrusion of any arbitrary constant. The constant S₀ simply defines the zero of the entropy scale. We may therefore choose it to be zero. And since relation (4.44) is universal, this sets S₀ to zero for all possible systems. We are thus left with the relation

    S = k ln Ω
The remaining constant k is a universal scaling constant that relates the scale of S to that of ln Ω. It is called Boltzmann's constant. We cannot set its value arbitrarily, but must determine it by evaluating it for any conveniently chosen thermodynamic system. Using an ideal gas of N particles, we shall later show that Nk = nR, so that

    k = (n/N) R = R/N_A

where N_A is Avogadro's number. Thus, numerically,

    k = 1.381 × 10⁻²³ J K⁻¹
We have thus arrived at the seminal relation of statistical mechanics, called Boltzmann's equation,

    S = k ln Ω

The entire theory of statistical mechanics hinges on this single relation. It relates Ω, given as a function of the parameters E, xᵢ, N, to the classical entropy S of the system. It therefore yields the fundamental relation of the thermodynamic system in the entropy representation. In this way, it determines all the thermodynamic properties of the system, and so provides the fundamental link between its microscopic and macroscopic descriptions. We shall later see that Boltzmann's entropy equation in fact provides more information than did classical thermodynamics, since it determines also the statistical fluctuations of the classical macroscopic parameters about their mean values.

4.7.1 Meaning of β and λ

It is reasonable to assume that the system will most probably be found in the states with the highest probability which, according to equation (4.26), are the states with total energy Ẽ. It also follows that equilibrium is attained at P(Ẽ). From classical thermodynamics it is known that the temperatures of two thermodynamic systems in equilibrium are the same. From equation (4.24) it follows that at equilibrium

    β = (d ln Ω_A/dE)(Ẽ) = (d ln Ω_{A′}/dE′)(E₀ − Ẽ) = β′        (4.45)

Since the parameters β and β′ are equal at equilibrium, it is now reasonable to associate β with the thermodynamic temperature. Recall that

    (∂S/∂E)_V = 1/T        (4.46)

From (4.45) and (4.46) it follows that

    (∂ ln Ω/∂E)_V = 1/kT        (4.47)

From (4.18) and (4.47) follow the properties of λ,

    λ = − (∂² ln Ω/∂E²)|_{E=Ẽ} ≥ 0

Expressed in terms of the temperature T = 1/kβ, this becomes

    0 ≤ λ = − ∂β/∂E = (1/kT²) (∂T/∂E)        (4.48)
or

    ∂T/∂E ≥ 0        (4.49)

that is, the temperature of a system cannot drop as its internal energy increases, nor can its internal energy decrease as its temperature rises. This is consistent with experiment. The temperature of a system always rises on heating, except during phase changes, when it remains constant. Heating never produces a drop in the system temperature.

4.8 Density of States


To find the fundamental equation for a system from a microscopic model, we must count the number
of states accessible to the system when it is in any given thermodynamic equilibrium state. This can be
done explicitly for some simple systems by combinatorial techniques. In general however, this problem
is analytically intractable, even by highly specialised combinatorial methods, so alternative approximate
methods must be used. These alternative methods are based on the concept of the density of states.
For macroscopic systems, and also for very many smaller ones, the eigenenergies of the system are
extremely closely spaced. In fact, the typical spacing between energy levels is hugely smaller than the accuracy δE of any measurements that we either can or may wish to make. Measurements are thus not normally able to resolve individual energy levels of the system. The result is that, when dealing with macroscopic measurements on macroscopic systems, we are dealing not with the states accessible to the system at some precisely defined energy E, but with the states accessible in some cluster of levels in a range of energy E to E + δE.
This situation is analogous to that encountered in classical mechanics when calculating mass density.
Mass is distributed in space atomically. But in situations in which space measurements are not able to
resolve individual atoms, we measure the mass of those atoms clustered about the given point in space and
which are contained in the smallest volume that we are able to resolve. Dividing this mass by the volume
containing them, we obtain an average density which serves as a very good description, at this scale, of the
way that mass is distributed in space. Were our measurements refined in such a way that we could measure
distances less than the inter-atomic spacing, we would detect the atomic nature of matter and would obtain
an accurate description of the discrete distribution of matter in space. But such accuracy would not give
any substantial improvement of the macroscopic properties that we calculate from the density function.
For most purposes, the original average mass density is as accurate as we need and simpler to use than the
detailed discontinuous distribution of matter given by more accurate measurements.
In a similar way, quantum states are distributed along the energy axis in a discrete and discontinuous way. But, for macroscopic systems, the states are clustered so closely that our energy measurements cannot, in general, resolve the individual levels. We may therefore describe their numbers on the energy axis by means of an average "density" which tells us how many quantum states are found per unit range of energy, accurate to the degree of resolution of energy in our experiments. This average "number density" is called the density of states.
The density of states for a system is defined as follows. Consider the states of the system with energies between E and E + δE, where δE is any given small value. Denote the number of states with energy in this range of values by Ω(E, δE). Clearly, this number must depend on both E and δE. It will also depend on the external parameters of the system or, if they are different, on the thermodynamic variables that define the thermodynamic state of the system. For clarity, we here suppress this dependence in the notation, but it should not be forgotten and will play an important role later. As δE → 0, we must have Ω(E, δE) → 0, since an energy range of width zero can contain no quantum states at all. We now examine the behaviour of Ω(E, δE) for small δE. Provided that δE does not approach the typical inter-level spacing for the system, we will have

    Ω(E, δE) = Ω(E, 0) + ∂Ω/∂(δE) (E, 0) δE + (1/2) ∂²Ω/∂(δE)² (E, 0) (δE)² + ⋯
             = 0 + ω(E) δE + terms of order (δE)² or higher        (4.50)
For sufficiently small δE, but with δE still above the typical inter-level spacing, we get, to very good approximation,

    Ω(E, δE) ≅ ω(E) δE        (4.51)

The coefficient ω(E) is called the density of states for the system at energy E. Note that Ω depends not only on E but also on the external parameters xᵢ and N, so the density of states ω must also depend on xᵢ and N.
An equivalent definition of ω is as follows. For given E, take an interval δE of any size, and count the number Ω(E, δE) of quantum states with energies in the range E to E + δE. Form the ratio

    Ω(E, δE)/δE
As δE is made smaller, this ratio will at first tend more and more closely to a definite value, ω(E). The approach to this value will continue smoothly until δE reaches a size so small that it becomes sensitive to the inter-level spacing of the system at energy E. At this point, the limiting procedure will start to go badly wrong and will yield ratios that decrease discontinuously until they reach the value zero. Just before this happens, however, the ratio will have approached the value ω(E) so closely that, had we not insisted on making δE still smaller, we could easily have continued under the illusion that ω(E) was the limit of the ratio as δE → 0. In fact, if we lacked the technology to measure intervals δE less than this critical value, which is a realistic scenario, then we would have no way of knowing that the limiting procedure actually fails in practice. Thus we could "define" ω(E) as the limiting ratio

    ω(E) = lim Ω(E, δE)/δE        (4.52)

in the limit as δE is made very small, but not too small.


From this discussion, you can see that the concept of a density of states is a fiction. However, it is a
pleasant fiction that permits the calculation of macroscopically measurable parameters with considerable
ease. If you are unhappy with the concept, then consider it as an approximation procedure which is
an integral part of the microscopic model, with the function (E) included among the postulates. But
remember that we have ways of guessing" this function from the microscopic model, as I shall show you
in a later section.
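This limiting procedure can be watched in action numerically. The sketch below (my own illustration, with an arbitrarily chosen energy and bin widths) uses the particle-in-a-box spectrum ε = n_x² + n_y² + n_z² in dimensionless units: the ratio Ω(ε, δε)/δε is stable for moderate δε, but fails once δε approaches the level spacing, exactly as described above.

    # Estimating the density of states for a particle in a cubic box.
    import numpy as np

    n = np.arange(1, 61)
    nx, ny, nz = np.meshgrid(n, n, n, indexing='ij')
    levels = np.sort((nx**2 + ny**2 + nz**2).ravel())   # all single-particle levels

    eps = 1500.5          # chosen between integer levels, so tiny bins are empty
    for d_eps in (100.0, 10.0, 1.0, 0.1, 0.01):
        count = np.count_nonzero((levels >= eps) & (levels < eps + d_eps))
        print(f"d_eps = {d_eps:6}:  Omega/d_eps = {count / d_eps:8.2f}")

    # Analytic estimate: Phi(eps) = (pi/6) eps^(3/2) states below eps (one octant
    # of a sphere of radius sqrt(eps)), so omega(eps) = (pi/4) sqrt(eps) ~ 30.4
    print("expected omega =", np.pi / 4 * np.sqrt(eps))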

4.9 Approximate value for Ω_total

The fact that Ω_{A+A′}(E, E′) exhibits such a large, narrow peak at the value Ẽ allows us to use the concept of the density of states to obtain a useful approximate formula for the total number Ω_total of states accessible to the system A + A′,

    Ω_total = Σ_{E=0}^{E₀} Ω_{A+A′}(E, E₀ − E) ≈ Σ_r ω_{A+A′}(E_r, E₀ − E_r) δ_r E ≈ ∫₀^{E₀} ω_{A+A′}(E, E₀ − E) dE        (4.53)

Here we have discretised the energy axis into a net of subintervals, each of width δ_r E, then approximated Ω_{A+A′} in each interval by the density function, and finally converted this discrete sum into an integral over E. Now, ω_{A+A′} is nearly zero almost everywhere and so contributes negligibly to the value of Ω_total; it differs significantly from zero only in a region of width Δ*E about Ẽ, in which it rises rapidly to its maximum value. So, to good approximation,

    Ω_total ≈ ∫₀^{E₀} ω_{A+A′}(E, E₀ − E) dE ≈ ω_{A+A′}(Ẽ, E₀ − Ẽ) Δ*E        (4.54)

But if our maximum energy resolution is δE,

    ω_{A+A′}(E, E₀ − E) = Ω_{A+A′}(E, E′)/δE        (4.55)
So

    Ω_total ≈ Ω_{A+A′}(Ẽ, E₀ − Ẽ) Δ*E/δE        (4.56)

More importantly,

    ln Ω_total ≈ ln Ω_{A+A′}(Ẽ, E₀ − Ẽ) + ln(Δ*E/δE) ≈ ln Ω_{A+A′}(Ẽ, E₀ − Ẽ)        (4.57)

This means that ln Ω_total is, to excellent approximation, just ln Ω_{A+A′} evaluated at E = Ẽ,

    ln Ω_total ≈ ln Ω_{A+A′}(Ẽ, E₀ − Ẽ)        (4.58)

and we may take

    Ω_total ≈ Ω_{A+A′}(Ẽ, E₀ − Ẽ)        (4.59)

Physically, this means that the overwhelming majority of the system's accessible states are those at E = Ẽ, and we may ignore all those occurring at other energies in our count of states. They do not contribute significantly to the total count.
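This dominance of the peak term is easy to check numerically for a toy model. In the sketch below (my own construction), each system is a set of two-level particles, whose level degeneracies are binomial coefficients, as in the paramagnet of the next chapter; the system sizes are arbitrary.

    # Check of Eq. (4.58): ln(Omega_total) is dominated by the largest term
    # Omega_A(E~) * Omega_A'(E0 - E~) when two model spin systems share E0 quanta.
    from math import comb, log

    N_A, N_Ap, E0 = 4000, 6000, 5000    # illustrative sizes
    terms = [comb(N_A, E) * comb(N_Ap, E0 - E)
             for E in range(max(0, E0 - N_Ap), min(N_A, E0) + 1)]

    ln_total = log(sum(terms))          # ln of the full sum over energies
    ln_peak = log(max(terms))           # ln of the single largest term
    print(f"ln Omega_total = {ln_total:.2f}")
    print(f"ln Omega_peak  = {ln_peak:.2f}")
    print(f"relative difference = {(ln_total - ln_peak)/ln_total:.2e}")

The relative difference is of order 10⁻³ even for these modest sizes, and shrinks further as the systems grow.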

4.10 Fundamental equation


Boltzmann's equation relates the entropy S of a macroscopic system of N particles with external parameters xᵢ to the number of states Ω(E, xᵢ, N) accessible to it in a narrow range of energy about the value E. For definiteness, consider a system with exactly one external parameter, the volume V it occupies. Then, since

    Ω = Ω(E, V, N)

Boltzmann's entropy equation

    S = k ln Ω        (4.60)

gives the entropy as a function of E, V and N. We thus get the entropy of the system in the form

    S = S(E, V, N)        (4.61)

The entropy given in this form is a representation of the fundamental equation. It therefore contains complete thermodynamic information about the system modelled. This information may be extracted as follows. First, form the differential of S from (4.61),

    dS = (∂S/∂E)_{V,N} dE + (∂S/∂V)_{E,N} dV + (∂S/∂N)_{E,V} dN        (4.62)

Comparing with the fundamental relation in infinitesimal form,

    T dS = dE + P dV − μ dN        (4.63)

and combining with (4.60) gives

    1/T = k (∂ ln Ω/∂E)_{V,N}        (4.64)
    P/T = k (∂ ln Ω/∂V)_{E,N}        (4.65)
    −μ/T = k (∂ ln Ω/∂N)_{E,V}        (4.66)
Equation (4.64) gives T as a function of E, V and N, and so is the heat equation for the system,

    T = T(E, V, N)        (4.67)

All heating properties and specific heats can be calculated from it. Equation (4.65), taken together with (4.64) to eliminate T, gives P as a function of E, V and N,

    P = T(E, V, N) k (∂ ln Ω/∂V)_{E,N} = P(E, V, N)        (4.68)

This is a second, implicit, form of the heat equation. However, if we eliminate E between (4.67) and (4.68), we get the mechanical equation of state. All mechanical parameters, like expansivities, compressibilities and bulk moduli, can be calculated from it.
It is important to understand that, even though we considered an isolated system in order to arrive at
these results, equation (4.61) is a representation of the fundamental equation. It therefore contains full
thermodynamic information for the system irrespective of the particular conditions in which the system
finds itself. It does not matter therefore whether the system is isolated, interacting either thermally or
mechanically with its surroundings, open or closed. All of its properties, in all conceivable circumstances,
are described by (4.61). We have thus arrived at a complete thermodynamic model of the system via our
microscopic statistical analysis.
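The recipe (4.64)-(4.66) can be carried out symbolically. The sketch below (an illustration, not part of the notes) applies it to the form ln Ω = N ln V + (3N/2) ln E + const, which we shall derive for the monatomic ideal gas in Chapter 6 and which is quoted here on trust.

    # Extracting the heat equation and the equation of state from ln Omega,
    # following Eqs. (4.64)-(4.65), for the assumed ideal-gas form of ln Omega.
    import sympy as sp

    E, V, N, k, T = sp.symbols('E V N k T', positive=True)
    lnOmega = N * sp.log(V) + sp.Rational(3, 2) * N * sp.log(E)

    inv_T = k * sp.diff(lnOmega, E)        # Eq. (4.64): 1/T
    P_over_T = k * sp.diff(lnOmega, V)     # Eq. (4.65): P/T

    E_of_T = sp.solve(sp.Eq(1 / T, inv_T), E)[0]
    print("heat equation:       E =", E_of_T)                      # 3*N*T*k/2
    print("mechanical equation: P =", sp.simplify(P_over_T * T))   # N*T*k/V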
References:
Callen, H. B., 1985, Thermodynamics and an Introduction to Thermostatistics, John Wiley and Sons, New York, Ch 12.
Reif, F., 1965, Fundamentals of Statistical and Thermal Physics, McGraw-Hill Book Company, New York, Ch 8.
Pathria, R. K., 1972, Statistical Mechanics, Butterworth Heinemann, Oxford, Ch 1.
Zemansky, M. W., 1957, Heat and Thermodynamics, McGraw-Hill Book Company, Inc., New York, Ch 17.


Exercises
1. A system consists of two identical but distinguishable simple harmonic oscillators of natural frequency ω (System A). The quantum states of each are labelled by n = 0, 1, 2, ..., and the energy of state n is ε_n = (n + 1/2)ℏω. The system is isolated, and has total energy in the range E to E + δE.
(a) Using the quantum numbers of the oscillators, devise a way to specify the state of the system. Briefly explain your reasoning.
(b) Represent the states of the system as a discrete lattice in the plane. Explain your representation.
(c) The system is isolated, and has total energy in the range E to E + δE. Identify, in your plane representation, the states of the system with energy E, then identify those states with energy E + δE. Finally, identify in your representation those states with energy in the range E to E + δE.
(d) Devise a way to count approximately the number of states accessible to the system when it has energy in the range E to E + δE. You may assume that E ≫ δE ≫ ℏω. Explain your method.
(e) Find an expression for the entropy of the system. Hence find the heat equation for the system, and its heat capacity.

2. A system consists of two distinguishable simple harmonic oscillators of natural frequency ω and natural frequency 3ω, respectively (System B).
(a) How are the answers in Question 1 affected by the difference in the oscillator frequencies? Explain.
(b) Systems A and B are initially each isolated, with energies E_A and E_B respectively. They are now allowed to interact thermally. What is the energy of each when equilibrium is established? Explain your reasoning.



Chapter 5
Microcanonical formalism: Magnetic Systems
In the previous chapter we arrived at Boltzmann's entropy equation by considering the microscopic states accessible to a system that is isolated, with a fixed number of particles and fixed volume. We relaxed the constraint of fixed energy slightly and considered the accessible states in a small range of energies between E and E + δE. This is the essence of the microcanonical formalism of statistical mechanics. So, in the microcanonical formulation, the number of accessible states Ω(E, V, N) is found by investigating an isolated system in which effectively no particles or energy are exchanged with the rest of the universe. As a first example of applying this approach, we derive a statistical description of a paramagnetic system.

5.1 Magnetic Materials and Behaviour


Magnetic materials are materials which develop magnetic fields either in response to an applied magnetic
field, or spontaneously in the absence of a field. There are five principal types of magnetic behaviour in
materials.
Paramagnetism is the phenomenon by which a material placed in a magnetic field becomes weakly
magnetised in the direction of the applied field. It thus develops its own magnetic field which strengthens
the applied field. The effect is lost when the applied field is removed.
Diamagnetism is the phenomenon by which a material placed in a magnetic field becomes magnetised
in the direction opposite to that of the applied field. It thus develops its own magnetic field which weakens
the applied field. The effect is lost when the applied field is removed. Diamagnetism is generally a very
weak effect when compared with paramagnetism. It is easy to remember the difference between dia- and
para- magnetism: dia is Greek for through, or against, as in diatribe which is a violent criticism of, or
attack against, a person or a doctrine; para is Greek for alongside, or auxiliary, as in paramilitary which is
a body or force to assist or to fight alongside the military.
Ferromagnetism is the phenomenon by which a material becomes strongly magnetised. It may become
magnetised spontaneously, or in response to an applied field, in which case it responds strongly to the
field by becoming magnetised in the same direction as the applied field. Even weak applied fields produce
strong magnetisation. The magnetisation persists after the removal of the field.
Antiferromagnetism is the phenomenon by which materials which have strong atomic magnetic dipole
moments, which might be expected collectively to exhibit ferromagnetic behaviour, fail to develop any
magnetisation. The difference between ferromagnetic materials and antiferromagnetic materials is in the
type of interaction between neighbouring atomic dipoles. In ferromagnetism, the interaction is such as
to favour alignment of the dipoles. In antiferromagnetism, it favours counter-alignment of neighbouring
dipoles, resulting in zero nett magnetisation.
Ferrimagnetism is the phenomenon by which two different magnetic components in a compound
material align parallel or anti-parallel to the applied magnetic field. Each component individually produces
a magnetisation, but the strength of each is different and the magnetisations are opposite to each other.
The nett effect is a magnetisation in the direction of the applied field that is weaker in magnitude than that
of the individual components considered separately. The strength of ferrimagnetism is between that of
paramagnetism and ferromagnetism, as shown in Figure 5.1.

Figure 5.1. Comparison of strengths of magnetic behaviour (from Askeland, 1989, p 707, Figure 18-14).

5.2 Description of Magnetic Behaviour


The applied magnetic field is produced by known currents which are controlled by the experimenter. In electromagnetic theory, these are known as free currents, and the field that they produce is denoted by H⃗. In the absence of magnetic materials, this is the only field in the system, and it is related to the magnetic induction B⃗ by the permeability of free space, μ₀, through the equation

    B⃗ = μ₀ H⃗

A magnetic material put into this field responds by generating currents which are not controlled by the experimenter, and of which he generally has no knowledge. These currents are responsible for generating their own magnetic field, and are called bound currents. Their presence is taken into account by introducing a new field m⃗, called the magnetisation of the material. Magnetisation is the magnetic moment of the material per unit volume. The magnetisation field contributes to the total magnetic induction B⃗, alongside the field H⃗ generated by the free currents. The total magnetic induction B⃗ produced by both free and bound currents is given by

    B⃗ = μ₀ (H⃗ + m⃗)        (5.1)

For a large class of magnetic materials, called linear isotropic homogeneous materials, it is found empirically that m⃗ is proportional to the applied field H⃗,

    m⃗ = χ H⃗        (5.2)

where χ is called the magnetic susceptibility of the material. We can thus write

    B⃗ = μ₀ (H⃗ + χ H⃗) = μ₀ (1 + χ) H⃗ ≡ μ_m H⃗        (5.3)

where μ_m is called the magnetic permeability of the material in the field. The usual symbol for magnetic permeability is μ, but we will be using this symbol later for the atomic magnetic moment, so we here denote the magnetic permeability by μ_m. The symbol μ₀ is called, by analogy, the magnetic permeability of the vacuum.
A myriad other magnetic permeabilities and susceptibilities are also used, all of them related in some
trivial way to the ones defined above. See Askeland, 1989, p 704-706, or any other book on magnetic
materials or Electromagnetism.


5.3 Paramagnetic Materials


5.3.1 A Model for Paramagnetism

Paramagnetism occurs in materials whose atoms contain unpaired atomic electrons. Electrons have a nett
magnetic dipole moment due to electron spin. Unpaired electrons thus leave the atoms in the material
with a nett magnetic moment. In the absence of an applied magnetic field, these magnetic moments are
randomly oriented in the material and so produce no nett magnetisation. When a field is applied, the
electronic magnetic dipoles align themselves with the field to produce a nett magnetisation.
All known paramagnetic materials consist of very large molecules in which only one ion has magnetic
properties. For example, in chromium potassium sulphate, Cr₂(SO₄)₃·K₂SO₄·24H₂O, the only magnetically active ion is chromium (Cr). In the molecule, each chromium atom is hugely outnumbered: for every Cr atom there are 2 sulphur atoms, 1 potassium atom, 20 oxygen atoms, and 24 hydrogen atoms. This gives a total of 47 other atoms for each Cr atom, all of which are non-magnetic.
material. If there is any interaction between them, therefore, it can only be very weak. Consequently, there
is no mutual reinforcement of magnetic effects and large fields are needed to produce magnetic alignment.
Also, once the field is removed, the ions randomise themselves again very quickly and all magnetic effect
is lost.
We model a paramagnetic material as follows. Atoms in a solid are localised. The electrons that give
them their nett magnetic moment are therefore confined to definite sites within the solid, so we may treat
them as distinguishable identical particles. Also, they are generally widely separated, so we may neglect
their mutual interaction. Our model will therefore consist of a system of N identical non-interacting
magnetic dipoles, each with spin 1/2 and dipole moment μ, at fixed positions in space. This assumes that
only one valence electron per molecule is responsible for producing the paramagnetism. We will relax this
condition and consider a more general case later in the course.

5.3.2 Single Particle States Available to Each Atom of the Solid

From experiment, we know that a spin-1/2 dipole placed in a magnetic field B⃗ can assume only one of two possible orientations with respect to B⃗: it either aligns, or it counteraligns. There are no other possibilities available to it. This is a quantum property of spin-1/2 particles, and is contrary to the expectations of classical mechanics. According to classical mechanics, the particle ought to be able to assume any orientation relative to the field.
The fact that the particle either aligns or counteraligns with the magnetic field means that there are only two quantum states accessible to it. These are generally denoted by + and − respectively. The energy of a magnetic dipole μ⃗ in a magnetic field B⃗ is given by

    ε = − μ⃗ · B⃗

so the energies of these two states are respectively

    ε₊ = −μB,    ε₋ = +μB        (5.4)

This information must now be applied to our model of a paramagnetic solid.


When a paramagnetic solid is placed in an external magnetic field H⃗, which is created and controlled by an experimenter, each spin-1/2 dipole aligns or counteraligns with the applied field. As they do so, the dipoles collectively set up a magnetisation m⃗ which combines with the applied field to give a total magnetic field B⃗ at each point in space, where

    B⃗ = μ₀ (H⃗ + m⃗)        (5.5)

Each dipole will interact with the B⃗-field at its position. Here B⃗ is the local magnetic field which acts on the dipole. By the local magnetic field, we mean the total magnetic field that would be present at the position of the dipole if that dipole were to be removed, assuming that its removal in no way disturbs its surroundings. If the dipoles interact only with the local magnetic field and with nothing else, then the
energy of each dipole is either ε₊ = −μB or ε₋ = +μB, according as it is aligned or counteraligned with B⃗.
Paramagnetism is a weak effect. The magnetisation m⃗ of the material in response to the applied field H⃗ is weak. To good approximation, therefore, it may be neglected in comparison with H⃗. We thus assume that

    B⃗ ≈ μ₀ H⃗        (5.6)

This means that each dipole interacts only with the applied field. The energies of the quantum states available to the dipoles are thus approximately ε₊ = −μμ₀H or ε₋ = +μμ₀H, according to whether they are aligned or counteraligned.

5.3.3 Microstates of the Solid

To specify the microstate of the solid in our model, we need to specify the magnetic state of each of its constituent particles. This requires us to list each particle, and its corresponding state, in a table of the type

    Particle Number:   1    2    3    ⋯    A    ⋯    N
    Particle State:    r₁   r₂   r₃   ⋯    r_A  ⋯    r_N

where r_A has value either + or −. Each such table represents a possible microstate of the solid. Each microstate can thus be specified by the set R = (r₁, r₂, ..., r_N). Each rᵢ can take only one of two values, + or −. So, the total number of microstates accessible to the system is 2^N.
Not all of the microstates R have energy in the required range E to E + δE. Our next task therefore is to determine how many microstates R have energy in this range. Given R = (r₁, r₂, ..., r_N), the total energy of the solid in this microstate is

    E_R = Σ_{A=1}^{N} ε_{r_A} = ε_{r₁} + ε_{r₂} + ⋯ + ε_{r_N}

E_R is better expressed in terms of the numbers n₊ and n₋ of particles in magnetic states + and − respectively, where

    n₊ + n₋ = N        (5.7)

n₊ and n₋ are called the occupation numbers of the single particle states + and − respectively. It is clear that each microstate R uniquely determines a set of occupation numbers. The converse however is not true: a given set of occupation numbers does not uniquely determine a microstate R of the system, since many microstates may give rise to the same occupation numbers. Expressed in terms of n₊ and n₋, the energy of the microstate R is given by

    E_R = n₊ ε₊ + n₋ ε₋ = −(n₊ − n₋) μμ₀H ≡ −(n₊ − n₋) μB        (5.8)

In a system in which the single particles can be in only one of two states, the energy E of the system and the number N of particles uniquely determine the occupation numbers. This is exceptional, and occurs only for two-state systems. More typically, the single particles have more than two states available to them. There is then an occupation number for each state available to the single particles. The energy and particle number of the system, expressed in terms of the occupation numbers nᵢ, i = 1, 2, ..., provide only two equations among the occupation numbers. They therefore do not determine them, but merely constrain them.
For a paramagnetic system with given E and N, we have

    E = −(n₊ − n₋) μB        (5.9)
    N = n₊ + n₋        (5.10)
The given values of E and N therefore uniquely determine the occupation numbers,

    n₊ = (1/2) (N − E/μB)        (5.11)
    n₋ = (1/2) (N + E/μB)        (5.12)

To determine the number of microstates available to the system at energy E, we need to count how many microstates have given occupation numbers n₊ and n₋. This can be done by regrouping the entries in the above table in such a way that all particles with state + are collected together, and likewise all particles with state −. The table then looks like this:

    Particle State:     +                            −
    Particle Number:    A₁ A₂ ⋯ A_{n₊}              A_{n₊+1} A_{n₊+2} ⋯ A_{n₊+n₋}

Permuting the entries within any given column does not yield a new microstate, so we obtain a microstate of the system at energy E for each distinct arrangement of the particle numbers into two columns, with n₊ in the first and n₋ in the second, without regard to the order of the elements in the individual columns. There are N! ways of entering the particle numbers into the table. But there are n₊! ways to rearrange the entries in the first column, and n₋! ways to rearrange those in the second. So the number N! over-counts the microstates by a factor n₊! n₋!. The total number of microstates for given n₊ and n₋ is therefore

    N! / (n₊! n₋!)        (5.13)

Since this is the number of quantum states of the system with given E and N, we have calculated the degeneracy g_E of the energy level E of the system of N particles. Thus,

    g_E = N!/(n₊! n₋!) = N! / [ (N/2 − E/2μB)! (N/2 + E/2μB)! ]        (5.14)
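For small N, the count (5.13)-(5.14) can be verified by brute-force enumeration of all 2^N microstates; the check below is my own illustration.

    # Enumerate every microstate R = (r1, ..., rN) and verify that the number
    # with a given n_plus equals N!/(n_plus! n_minus!).
    from itertools import product
    from math import factorial

    N = 10
    counts = {}
    for R in product('+-', repeat=N):            # every microstate
        n_plus = R.count('+')
        counts[n_plus] = counts.get(n_plus, 0) + 1

    for n_plus, g in sorted(counts.items()):
        formula = factorial(N) // (factorial(n_plus) * factorial(N - n_plus))
        assert g == formula
        print(f"n+ = {n_plus:2d}: counted {g:4d}, formula {formula:4d}")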

5.3.4 Number of Accessible States in the Range E to E + δE

Many texts interpret relation (5.14) as the number of states accessible to the isolated system. The properties obtained in this way are correct, but the interpretation is wrong. Equation (5.14) gives the degeneracy of the energy level E, not the number Ω of states accessible to the isolated system in the energy range E to E + δE. To calculate Ω, we need to count the number of accessible states in this range.
Note from (5.9) that the energy levels of the system are evenly spaced, and separated by an amount 2μB. This follows from the fact that a change Δn₊ in n₊ and Δn₋ in n₋ produces a change in E given by

    ΔE = −(Δn₊ − Δn₋) μB

But Δn₊ and Δn₋ are constrained by (5.10), since N is constant,

    0 = Δn₊ + Δn₋

so that

    ΔE = −2μB Δn₊

The smallest change that can be made in n₊ is ±1, so the energy level spacing in the system is

    ΔE = 2μB

Even for high fields B, this spacing is extremely small. Typically, the magnetic moment of a paramagnetic spin-1/2 atom is of the order of the Bohr magneton, μ_B = eℏ/2mₑ = 9.27 × 10⁻²⁴ J/T, and the magnetic field of a large electromagnet ranges from 1 to 10 T. The strongest known magnetic fields are those at the surface of strongly magnetic neutron stars, called magnetars, which have strengths of the order of 10¹¹ T. The
level separation for paramagnetic materials in laboratory fields is thus of the order of 10⁻²⁴ J. Even on the surface of a magnetar, the level separation would be only about 10⁻¹³ J, which is still extremely small.
The degree of isolation of a system determines the size δE of its energy range. Typically, δE is several orders of magnitude larger than the level separation ΔE, so the energy range E to E + δE contains very many energy levels. To calculate Ω, we need to add together the degeneracies g_E for all the levels in the given range,

    Ω(E, δE) = Σ_{E ≤ E′ ≤ E+δE} g_{E′}

This is not easy. However, we can obtain an excellent approximation to it as follows. For small changes of E, the occupation numbers n₊ and n₋ do not change substantially. So the energy levels in the range E to E + δE all have almost the same degeneracy. We may therefore write

    Ω(E, δE) ≈ g_E × (number of levels in range E to E + δE)

The number of levels in the range E to E + δE is δE/ΔE = δE/2μB, so
    Ω(E, δE) = N! / [ (N/2 − E/2μB)! (N/2 + E/2μB)! ] · (δE/2μB)        (5.15)

5.3.5 The Fundamental Relation

We calculate the fundamental relation for the paramagnetic system from (5.15) using Boltzmann's entropy equation,

    S = k ln Ω

To write the right hand side in tractable form, we use Stirling's approximation, which states that for very large numbers n, ln n! is given to lowest order by

    ln n! ≈ n ln n − n

This approximation is excellent for numbers as low as 10³. For numbers of the order of 10²³, it is very good. Using it, we get
    S/k = N ln N − N − (N/2 − E/2μB) ln(N/2 − E/2μB) + (N/2 − E/2μB)
              − (N/2 + E/2μB) ln(N/2 + E/2μB) + (N/2 + E/2μB) + ln(δE/2μB)        (5.16)

We make two further simplifications. All of the logarithms on the right hand side, except the last, have as coefficients the numbers N, n₊, n₋. Now, N ≈ 10²³, and each of n₊ and n₋ is a fraction of that, and so each is still very large. The last term, on the other hand, has coefficient 1, so at most its value will be in the range 0 to 30. It is utterly negligible when compared with the other terms, and can be omitted with no significant loss of accuracy. In addition, the terms −N, +(N/2 − E/2μB) and +(N/2 + E/2μB) cancel exactly, since n₊ + n₋ = N. We thus obtain

    S/k = N ln N − (N/2 − E/2μB) ln(N/2 − E/2μB) − (N/2 + E/2μB) ln(N/2 + E/2μB)        (5.17)

This shows explicitly that S is essentially independent of the energy range δE. It explains also why those authors who assume that Ω is given by (5.14) still get correct answers. Incidentally, you can use this fact to shortcut similar calculations in microcanonical problems: take Ω to be the degeneracy of the level at energy E. But remember that it may not be possible in some problems to shortcut the calculation in this way.

5.3.6 Thermodynamics of the Paramagnetic System

Equation (5.17) is the fundamental relation of the system, from which all its properties can be calculated.
Equation (5.17) expresses S as a function of the parameters E, B and N. We expected S to depend on E, which is the internal energy of the paramagnetic system, and on N. Its dependence on B is unfamiliar from elementary thermodynamics, and needs to be examined.
Surprisingly, S has no dependence on V , even though we specified at the beginning that the volume of
the system be held fixed. It is not difficult to find the reason for this. In our model, we stipulated that each
atom occupy a fixed position in space. They are not free to move. Our model is such, therefore, that even
in principle its volume cannot change. Consequently there is no parameter in S that reflects any change of
volume. So, it is not that the system does not have a volume, but that our model does not allow it to change.
To remedy this defect, we would need to allow the atoms some mobility. This could be done by means of
a potential that is responsible for holding the individual atoms in their respective sites, but which allows
some small movement around the site as the atom acquires some energy. This potential would depend on
characteristic length parameters that in turn would define V . Then S would depend on V also.
Note that, if V is absent from S as a parameter, we cannot calculate the system pressure. If V were really
absent from the description of the system, pressure would not even be definable. The model constructed
above actually does have a volume, and thus a pressure. However, we have fixed the volume completely
rigidly, so this model does not provide any method for calculating pressure.
To understand the thermodynamics of this model, we need first to determine the significance of the
dependence of S on B. The magnetic field is an external parameter in the system Hamiltonian. It
determines the energy levels available to the single particles in the system, and thus also the energy levels
available to the system as a whole. If we change B, we change also the values of ε₊ and ε₋. This shifts the single particle energy levels, and hence also the system energy levels, closer together or further apart. As B is changed, therefore, the entire system of energy levels for the N-particle system shifts.
We thus have two ways of passing energy into and out of the system. One is by depositing or removing quanta of energy from it. This is achieved by radiation; that is exactly what heating is here, since heat is electromagnetic radiation in a particular range of frequencies. The other is by changing B. If we change B very slowly, in such a way that each particle of the system remains in a fixed quantum state as the energy level of that state is slowly altered, we alter the total energy of the system. Changing the system in this way is called adiabatic perturbation. It is called adiabatic because the change of energy of the system is not by the emission or absorption of quanta, which would constitute radiation and would thus be a heat flow, but purely by alteration of the parameter B, which shifts the energy levels of the system. If the energy transfer is not by heat flow, then it must be by work, so this change of energy is work done.
We can arrive at an expression for the work done by a magnetic system using the following simple argument. Consider the magnetic system in microstate R. In an adiabatic change of B, each particle in the system remains in a fixed single particle state, so n₊ and n₋ remain fixed. The energy of the microstate R is given by

    E_R = −(n₊ − n₋) μB = −M_R B        (5.18)

where M_R = (n₊ − n₋) μ is the total magnetic moment of the system in the microstate R. So, if B is changed to B + dB with n₊ and n₋ fixed, the change in the system energy is

    dE_R = −(n₊ − n₋) μ dB = −M_R dB        (5.19)

Since this is a change of energy of the system, the work done by the system is

    dW_R = M_R dB        (5.20)

Averaging over all the accessible microstates gives the work done by the system in a macroscopic process. We thus get

    dW = M dB        (5.21)

where M is the total magnetic moment of the sample.


This same result can be obtained without appeal to an atomic model. However, the calculation of the
work done in magnetisation is tricky and requires some care. A good discussion is found in Mandl, 1988,
p 21-28, and p 336-339.

The fundamental relation for a magnetic system in infinitesimal form is therefore

    T dS = dE + M dB − μ̄ dN        (5.22)

where we have written μ̄ for the chemical potential, so as not to confuse it with the magnetic moment μ of the atomic dipoles.

5.3.7 Equations of State

The equations of state for the system are obtained from (5.22) by the standard methods,

    1/T = (∂S/∂E)_{B,N}        (5.23)
    M/T = (∂S/∂B)_{E,N}        (5.24)
    −μ̄/T = (∂S/∂N)_{E,B}        (5.25)

The equation for μ̄ is not of particular interest in the case of a solid, so we calculate only the first two. The calculation is simplified by noting that

    ∂(y ln y − y)/∂x = (∂y/∂x) ln y + y (1/y) (∂y/∂x) − ∂y/∂x = (∂y/∂x) ln y

From (5.23) and (5.17), we get

    1/kT = (1/k) (∂S/∂E)_{B,N} = (1/2μB) ln[ (N/2 − E/2μB) / (N/2 + E/2μB) ]        (5.26)

and

    M/kT = (1/k) (∂S/∂B)_{E,N} = −(E/2μB²) ln[ (N/2 − E/2μB) / (N/2 + E/2μB) ]        (5.27)

Equation (5.26) yields T as a function T = T(E, B, N). Changing the subject of this relation from T to E gives the heat equation of the system in the form E = E(T, B, N),

    E = −NμB tanh(μB/kT)        (5.28)

In principle, substituting into equation (5.27) the function T = T(E, B, N) obtained from (5.26) yields an equation for M as a function of E, B and N,

    M = −kT(E, B, N) (E/2μB²) ln[ (N/2 − E/2μB) / (N/2 + E/2μB) ]        (5.29)

This, however, is not the most useful form of the magnetisation equation. In the laboratory we directly control T and B, not E and B. We want therefore to express M in terms of T, B and N. This is most easily done by using equations (5.26) and (5.28) in conjunction to eliminate E from (5.29) in favour of T. This gives

    M = Nμ tanh(μB/kT)        (5.30)

Note from (5.28) and (5.30) that

    E = −MB        (5.31)

This result has an interesting interpretation from electromagnetic theory. In the general case of interaction of a magnetised material with the magnetic field, −M⃗·B⃗ is the total energy of interaction between the material and the field (Mandl, 1988, p 26, equation (1.44)). In our case, we have assumed that both B⃗ and M⃗ are uniform, so this interaction energy becomes −MB. Our total internal energy is therefore simply the energy of interaction of the material with the field. This reflects precisely how we constructed our model, in which we did not include either the internal energy of the dipoles in the system, or the energy that is stored in the magnetic field. Equation (5.31) is thus a nice confirmation of the consistency of our model.
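The behaviour of the magnetisation equation (5.30) in its two regimes can be previewed numerically. The sketch below is my own; the values of N, B and T are illustrative, with the Bohr magneton standing in for μ, and the linear weak-field form M ≈ Nμ²B/kT anticipates the Curie-law limit derived in the next subsection.

    # Exact magnetisation M = N mu tanh(mu B / kT) versus its weak-field limit.
    import numpy as np

    k = 1.381e-23        # Boltzmann constant, J/K
    mu = 9.27e-24        # Bohr magneton, J/T
    N = 1.0e23           # illustrative particle number
    B = 1.0              # applied field, T

    for T in (0.1, 1.0, 10.0, 300.0):
        x = mu * B / (k * T)
        M = N * mu * np.tanh(x)
        M_curie = N * mu * x            # weak-field / high-temperature limit
        print(f"T = {T:6.1f} K: x = {x:8.4f},  M = {M:.3e},  Curie approx = {M_curie:.3e}")

At 300 K the two agree to better than a part in 10⁵; at 0.1 K the exact M saturates at Nμ while the linear form overshoots badly.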

5.3.8 Predictions of the Model

Equations (5.28) and (5.30) enable us to make numerous predictions of the behaviour of paramagnetic systems. We examine these in detail.

Magnetisation:

The magnetisation m⃗ of a magnetic material is its magnetic moment M⃗ per unit volume. In the above model, the total magnetic moment M⃗ of the sample is in the direction of the applied field, and has magnitude given as a function of T, B and N by (5.30). Thus
    m = (N/V) μ tanh(μB/kT)        (5.32)

The parameter that determines the detailed behaviour of the sample is x = μB/kT. We examine the two obvious limiting cases. First, consider the case x ≪ 1, or μB ≪ kT. This corresponds to weak applied fields, or high temperatures. Then tanh x ≈ x, so

    m ≈ (N/V) (μ²B/kT)        (5.33)

The magnetic susceptibility χ of a material is defined by the relation

    m⃗ = χ H⃗        (5.34)

For paramagnetic solids, the magnetisation is very small compared with the applied field H⃗, so B⃗ ≈ μ₀H⃗. We thus get, in this low field, high temperature, limit

    χ = (N μ² μ₀ / V k) (1/T)        (5.35)

The relation χ = C_C/T was established experimentally before there was an adequate theory of paramagnetism. This relation is called Curie's Law, and C_C is called Curie's constant. Our model explicitly evaluates Curie's constant in terms of the number density N/V of magnetic ions (which is directly related to the density of the solid), the magnetic moment μ of the participating dipoles, and the fundamental constants μ₀ and k.
Curie's law is found to hold very accurately for many salts at low temperatures, right down to about 1 K. This makes paramagnetic salts useful as low temperature thermometers. For one particular salt, cerium magnesium nitrate, Ce₂Mg₃(NO₃)₁₂·24H₂O, Curie's law holds down to 0.01 K, making it an especially useful thermometer. The excellent agreement at low temperature of the behaviour of this salt with Curie's Law is shown in Figure 5.2.
At the opposite extreme, namely in the limit x ≫ 1 or μB ≫ kT, which corresponds to high field or low temperature, we have tanh x → 1, and we get

    m ≈ (N/V) μ        (5.36)

Figure 5.2. Magnetic mass susceptibility, χ_m = χ/density, vs. 1/T for cerium magnesium nitrate (from Mandl, 1988, p 73, after J. C. Hupse, Physica, 9, 622 (1942)).
The magnetisation has thus become independent of both the applied field and the temperature. This is expected in terms of our model, for the following reasons. The material contains a finite number of atomic dipoles. There is therefore a maximum value of magnetisation that can be attained. The state of lowest energy for the system is achieved when all the dipoles are aligned with the field. As we drop the temperature of the sample, or increase the applied field, we expect the number of aligned atoms to increase, but this increase can continue only until all dipoles have been aligned, at which point the magnetisation saturates and reaches a maximum value.
The general behaviour of the magnetisation as a function of x = μB/kT is displayed in Figure 5.3. This curve is an excellent fit to experimental data, as is shown in Figure 5.4.
The very good agreement between experiment and theory shows that the simplifying approximations of this simple model are justified.

5.3.9

Heat Capacities:

The energy equation (5.28) can be used to calculate the heat capacities of the paramagnetic system. The
system can be heated at constant magnetic field. This is commonly called the magnetic heat capacity. It is
defined in non dissipative quasistatic processes by
CB =

dQB
dT

(5.37)

Using the fact that in these processes we have


dQ = T dS = dE + M dB

(5.38)

this gives
CB =

E
T

= B

where the last term follows from


E = BM
3
4

From Mandl, 1988, p 71.


From Mandl, 1988, p 74, after W. E. Henry, Phys. Rev., 88, 559 (1952).

M
T

(5.39)
B

Figure 5.3. Magnetisation m as a function of x (from Mandl, 1988, p 71).

Equation (5.28) is in the correct form for use here, since (5.39) assumes that E is given as a function of T and B. We therefore get

    C_B = Nk (μB/kT)² sech²(μB/kT)        (5.40)

C_B/Nk is plotted against 1/x = kT/μB in Figure 5.5. The heat capacity is zero both at very low temperatures and at very high temperatures, and reaches a maximum value at kT ≈ μB. The hump shape is characteristic of a two state system and is called a Schottky hump. Its presence in empirical data signals the existence of two relatively close low-energy states which are well separated from all other energy states by a considerable gap. The hump shape of the curve indicates that, for a given magnetic field, the system undergoes a transition from a state of complete order at low temperature to one of random disorder at some higher temperature, with the transition being completed over a finite temperature range. At temperature T = 0, the system is in complete order. For a very small increase of temperature (the almost flat initial part of the curve), the state of complete order persists, but it begins to break up almost immediately (the very sharp rise to the peak). The break-up of the order is seen in the fact that a huge amount of heat is taken into the system with very little rise of temperature. This shows that the heat is being used to "loosen up" the system. When almost complete disorder has set in (at the peak of the hump), the heat intake is used less and less in creating disorder (there is little more disorder that can be created) and more and more in raising the temperature of the system. The system has reached a state of total disorder when C_B has dropped to zero. This happens effectively when kT/μB ≈ 10. The total energy needed for the transition is given by

    ∫ C_B dT = μBN        (5.41)
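A short numerical sketch (my own) of Eq. (5.40) locates the maximum of the Schottky hump and confirms that C_B has effectively vanished by kT/μB ≈ 10. At the peak, x satisfies x tanh x = 1, giving x ≈ 1.20.

    # Schottky heat capacity C_B/Nk = x^2 sech^2(x), with x = mu B / kT,
    # evaluated on a grid of kT/(mu B) = 1/x values.
    import numpy as np

    inv_x = np.linspace(0.05, 10.0, 2000)      # kT / (mu B)
    x = 1.0 / inv_x
    C_over_Nk = x**2 / np.cosh(x)**2

    i = np.argmax(C_over_Nk)
    print(f"peak C_B/Nk = {C_over_Nk[i]:.4f} at kT/(mu B) = {inv_x[i]:.3f}")
    print(f"C_B/Nk at kT/(mu B) = 10: {C_over_Nk[-1]:.2e}")

The peak value is about 0.44 at kT/μB ≈ 0.83, the standard Schottky result.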

To calculate the heat capacity at constant magnetisation, we proceed as follows. The definition of C_M is

    C_M = (dQ/dT)_M        (5.42)

To calculate dQ_M, we need to rewrite the fundamental equation in differential form in such a way that M appears as one of the variables,

    dQ = T dS = dE + M dB = dE + d(MB) − B dM = d(E + MB) − B dM        (5.43)

Figure 5.4. M/Nμ vs. B/T for samples of I. potassium chromium alum, II. iron ammonium alum, III. gadolinium sulphate (from Mandl, 1988, p 74, after W. E. Henry, Phys. Rev., 88, 559 (1952)).

so that

    dQ_M = d(E + MB)        (5.44)

and hence

    C_M = (∂(E + MB)/∂T)_M        (5.45)

Since for the paramagnetic system we have

    E = −MB        (5.46)

this gives

    C_M = 0        (5.47)

At constant magnetisation, therefore, there is no heat intake into the system. This result holds in general, provided that M is a function of B/T alone. Any substance for which this is true is said to be an ideal paramagnetic material.

5.3.10 Entropy

We get a nice confirmation of the order-disorder transition interpretation given above for the magnetic specific heat curve by examining the behaviour of the entropy of the system. To make the comparison, we need first to express the entropy as a function of T, B, N rather than of E, B, N. We must therefore use

Figure 5.5. C_B/Nk vs. kT/μB (from Mandl, 1988, p 76).

(5.28) to eliminate E in (5.17) in favour of T. This gives

    S = Nk [ ln 2 + ln cosh(μB/kT) − (μB/kT) tanh(μB/kT) ]        (5.48)

The behaviour of this function is displayed in Figure 5.6 below.
In the low temperature or high field limit, x = μB/kT ≫ 1, we have cosh x ≈ eˣ/2 and tanh x ≈ 1, and the entropy becomes, to first order,

    S ≈ Nk [ ln 2 + ln(eˣ/2) − x ] = 0        (5.49)

Using Boltzmann's entropy relation, this means that Ω = 1. The system thus has exactly one microstate available to it in its lowest energy, or ground, state. Since, in this limit, M ≈ Nμ, we see that the system is in that state in which all of its dipoles are aligned with the field. This confirms that at low temperatures, and/or high field, the system is in a state of very high order.
At the opposite extreme, in the limit of high temperature or low field, we have x = μB/kT ≪ 1, so that ln cosh x ≈ ln(1 + x²/2 + ⋯) ≈ x²/2 and tanh x ≈ x, giving to lowest order

    S ≈ Nk [ ln 2 + x²/2 − x² ] ≈ Nk ln 2 = k ln 2^N        (5.50)

Using Boltzmann's entropy equation, this means that, in this limit, Ω = 2^N. Thus, at high temperature and/or low field, the system has each of its total of 2^N states accessible to it, and the system is in maximum disarray.
The transition to a completely ordered state at extremely low temperature is responsible for the failure of Curie's Law at the lowest temperatures. A nice discussion of this point is found in Sears and Salinger, 1975, p 228-233.
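Both limits of (5.48) are quickly verified numerically; the sketch below is my own.

    # S/Nk = ln 2 + ln cosh(x) - x tanh(x) should approach 0 as x -> infinity
    # (complete order, Omega = 1) and ln 2 as x -> 0 (Omega = 2^N).
    import numpy as np

    def S_over_Nk(x):
        return np.log(2.0) + np.log(np.cosh(x)) - x * np.tanh(x)

    for x in (20.0, 5.0, 1.0, 0.1, 0.001):
        print(f"x = {x:7.3f}:  S/Nk = {S_over_Nk(x):.6f}")
    print("ln 2 =", np.log(2.0))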


Figure 5.6. S/Nk vs. 1/x = kT/μB (from Sears and Salinger, 1975, p 405).


References
Askeland, D.R., 1989, The Science and Engineering of Materials, Second Edition, Chapman and Hall, London.
Mandl, F., 1988, Statistical Physics, Second Edition, The Manchester Physics Series, John Wiley and Sons, New
York.
Sears, F.W., Salinger, G.L., 1975, Thermodynamics, Kinetic Theory, and Statistical Mechanics, Third Edition,
Addison-Wesley Publishing Company, Reading, Massachusetts.


Exercises

1. Use relation (5.17) to deduce relations (5.28), (5.29) and (5.30).

2. Typically, laboratory magnetic fields are of the order of 1 T, and atomic dipoles have magnetic moment with value close to the Bohr magneton, given in SI units by

    μ_B = eℏ/2mₑ = 9.27 × 10⁻²⁴ J T⁻¹        (5.51)

where e is the electron charge and mₑ is the electron mass. Calculate the value of the parameter x = μB/kT at temperatures T = 1 K and T = 300 K, deduce approximate expressions for the magnetisation m of a paramagnetic solid at these temperatures, and estimate the error made in these approximations.

3. Use (5.28) to eliminate E in (5.17) in favour of T, and show that this gives

    S = Nk [ ln 2 + ln cosh(μB/kT) − (μB/kT) tanh(μB/kT) ]        (5.52)

4. Consider an isolated system consisting of a large number N of very weakly interacting localised particles of spin 1/2. Each particle has a magnetic moment μ that can point either parallel or anti-parallel to an applied magnetic field H. The energy of the system is then E = −(n₁ − n₂)μH, where n₁ is the number of spins aligned parallel to H and n₂ the number of spins aligned antiparallel to H.
(a) Consider the energy range between E and E + δE, where δE ≪ E but is microscopically large, so that δE ≫ μH. What is the total number of states Ω(E, δE) in the energy range (E, E + δE)?
(b) Find the expression for ln Ω(E) as a function of E. Simplify the expression by applying Stirling's formula.
(c) Assume that the energy E is in the region where Ω(E) is large, that is, not close to the possible extreme energy values ±NμH which it can assume. Show that

    Ω(E, δE) ≈ 2^N √(2/πN) exp(−E²/2Nμ²H²) · δE/2μH

[Hint: Expand the solution to part (a) in terms of the small parameter ε = E/(NμH).]


Chapter 6
Classical Model of a Gas
6.1 Classical phase space density
In classical mechanics, the state of a system with f degrees of freedom is specified by a phase point (q₁, ..., q_f, p₁, ..., p_f) in the phase space of the system. The energy of the state represented by the phase point (q₁, ..., q_f, p₁, ..., p_f) is given by a function of the form

    E = E(q₁, ..., q_f, p₁, ..., p_f)

The set of states accessible to a classical system is continuous, and thus non-denumerable. This makes it inconvenient for statistical analysis. The simplest way to proceed is by replacing the set of classical states by another set which is discrete, and therefore denumerable. This is done by following the same procedure as was used in the quantum mechanical formalism: rectangulate the phase space by dividing it into cells of equal phase volume h^f, and regard each phase cell as representing, in an approximate way, one state of the classical system. In the classical approach, the parameter h is not well defined and can be considered as an adjustable parameter of the formulation.
Consider now the energy of the state represented by a given phase cell. Let (q₁, ..., q_f, p₁, ..., p_f) be any phase point contained in the given phase cell. The energy of the state represented by this phase point is given by E(q₁, ..., q_f, p₁, ..., p_f). If this function varies sufficiently slowly over the points of the cell, then to good approximation we may take this energy to be the energy of the state represented by the given cell. The requirement that E(q₁, ..., q_f, p₁, ..., p_f) vary slowly over the cell is essentially a restriction on the permissible cell size h^f. If a given cell size does not meet this requirement adequately, we can always improve the approximation by making h smaller. The total number of allowed states in the phase space volume dq₁ ⋯ dq_f dp₁ ⋯ dp_f for a system of f degrees of freedom is given by

    number of allowed states = dq₁ ⋯ dq_f dp₁ ⋯ dp_f / h^f        (6.1)

6.2 General Model of a Gas


We model a gas as a system of N identical particles enclosed in a container of volume V. The total energy of the system can be written as

    E_gas = E_ke + E_pe + E_int

Here E_ke denotes the total kinetic energy of translation of the particles, E_pe their total potential energy, and E_int their total internal energy. Denote the momentum of the centre of mass of the A-th particle by p⃗_(A). Then

    E_ke = Σ_{A=1}^{N} p⃗²_(A) / 2m        (6.2)

The potential energy of the particles can arise from several sources. First, there is the potential energy of the potential that confines them to the container of volume V. This potential acts separately, and in the same way, on each of the particles. Denote the potential for a single particle at position r⃗ in the container by u(r⃗). The potential for the entire gas in the container is then

    u(r⃗_(1)) + u(r⃗_(2)) + ⋯ + u(r⃗_(N)) = Σ_{A=1}^{N} u(r⃗_(A))        (6.3)


If the container is perfectly rigid then the potential is given by

    φ(r⃗) = 0   if r⃗ is inside the container
    φ(r⃗) = ∞   if r⃗ is outside the container                                 (6.4)

Second, there is the potential energy due to the action of external fields, like gravity. Potentials of this
kind also act on each particle individually, so the contribution to Epe of potentials of this type will be
a sum of the same form as in (6.3). Each field acting on the particle will contribute a sum of this type.
Lastly, there is the potential energy due to the interaction of the gas particles with each other. Depending
on the nature of the interactions, there may be two body interactions, three body interactions, ... , to N
body interactions. Each type will contribute a sum which includes the r-body interactions for each distinct
r-body combination. The terms in each sum will each be a function of the position vectors of the r bodies
involved in the interaction. In general therefore, we will have
    Epe = Epe(r⃗(1), r⃗(2), ..., r⃗(N))

In the special case of magnetic forces, Epe will depend also on the velocities v⃗(A) of the particles.
The internal energy Eint includes all energies not yet considered, such as the kinetic energy of rotation of
the particles about their centres of mass (if the gas does not consist of point particles), the energy of
vibration about the centre of mass (if the particles are composite particles like molecules), atomic
excitation energies, etc. These internal degrees of freedom can each be described by
additional generalised coordinates, which we denote by Q1, ..., Qs, with associated generalised momenta
P1, ..., Ps. Then,

    Eint = Eint(Q1, ..., Qs, P1, ..., Ps)

If the gas particles are treated as point particles, Eint = 0.
The number of states accessible to the gas when its total energy Egas lies in the range E to E + δE is
calculated from the formula

    Ω(E, δE) = (1/h^{3N+s}) ∫_D d³r⃗(1) ⋯ d³r⃗(N) d³p⃗(1) ⋯ d³p⃗(N) dQ1 ⋯ dQs dP1 ⋯ dPs      (6.5)

where D is the domain in the (6N + 2s)-dimensional phase space defined by the condition

    E ≤ Egas(r⃗(1), ..., r⃗(N), p⃗(1), ..., p⃗(N), Q1, ..., Qs, P1, ..., Ps) ≤ E + δE
Because of this condition, evaluation of integral (6.5) is highly non-trivial, except in the simplest cases.

6.3 Ideal Gas


An important case in which we can evaluate integral (6.5) without difficulty is that of an ideal gas. An ideal
gas is defined in classical thermodynamics to be any gas whose equation of state is the ideal gas equation,

    P V = nRT

and which obeys Joule's Law, namely, that the internal energy of the gas (at constant N) is a function of its
temperature alone,

    E = E(T)
In this section we use classical mechanics to construct a simple atomic model of a gas, and show how it
explains the familiar properties of ideal gases. The atomic model provides insight into the nature and origin
of these well known properties, and also indicates how it might be modified to also explain the properties
of real gases.

6.3.1 Assumptions

Consider first the simplest possible situation, in which no external fields act on the system and in which
there is no interaction between the particles. The only potential acting on the gas is thus that due to the
walls of the container. If we assume also that the container is perfectly rigid, the only potential acting on
the particles is that given by (6.4). The condition under which integral (6.5) must be evaluated is thus
    E ≤ Σ_{A=1}^{N} p⃗²(A)/2m + Σ_{A=1}^{N} φ(r⃗(A)) + Eint(Q1, ..., Qs, P1, ..., Ps) ≤ E + δE      (6.6)

where φ is defined by (6.4). The discontinuous nature of φ means that condition (6.6) is violated if for any
A we have r⃗(A) outside V, while if for all A we have r⃗(A) inside V, then (6.6) becomes
    E ≤ Σ_{A=1}^{N} p⃗²(A)/2m + Eint(Q1, ..., Qs, P1, ..., Ps) ≤ E + δE        (6.7)

which does not involve the r⃗(A). This means that we can separate out the r⃗ integrals in (6.5) and evaluate
them separately.

6.3.2 Fundamental Relation

With the assumptions of the previous sections, (6.5) becomes

    Ω(E, δE) = (1/h^{3N+s}) ∫_D d³r⃗(1) ⋯ d³r⃗(N) d³p⃗(1) ⋯ d³p⃗(N) dQ1 ⋯ dQs dP1 ⋯ dPs
             = [∫_V d³r⃗(1)] ⋯ [∫_V d³r⃗(N)] × (1/h^{3N+s}) ∫_D d³p⃗(1) ⋯ d³p⃗(N) dQ1 ⋯ dQs dP1 ⋯ dPs
             = V^N × (1/h^{3N+s}) ∫_D d³p⃗(1) ⋯ d³p⃗(N) dQ1 ⋯ dQs dP1 ⋯ dPs

where D is now the domain defined in the p-Q-P subspace by condition (6.7). The remaining integral will in
general depend on E, N, and any other parameters that occur in Eint(Q1, ..., Qs, P1, ..., Ps). Put

    Φ(E, N) = (1/h^{3N+s}) ∫_D d³p⃗(1) ⋯ d³p⃗(N) dQ1 ⋯ dQs dP1 ⋯ dPs           (6.8)

where Φ does not depend on V. Then

    Ω(E, δE) = V^N Φ(E, N)                                                    (6.9)

The fundamental relation for this gas is therefore given by

    S = kN ln V + k ln Φ(E, N)                                                (6.10)

where Φ(E, N) is an unknown function of E and N which does not depend on V.

6.3.3 Properties of the Gas

To extract the properties of the gas modelled above, recall that

    T dS = dE + P dV − μ dN                                                   (6.11)

We thus obtain the following equations of state. First,

    1/T = (∂S/∂E)_{V,N} = k (∂ ln Φ/∂E)_N                                     (6.12)

In general, this equation provides a relation between T, E, V, and N. If we solve it for E, it yields the heat
equation for the system in the form E = E(T, V, N). However, since Φ is a function of E and N alone,
and does not depend at all on V, this is a relation between T, E and N alone. So when solved for E it gives

    E = E(T, N)                                                               (6.13)


which is Joule's Law and states that, when N is constant, the internal energy of the gas depends on its
temperature alone.

Second, we have

    P/T = (∂S/∂V)_{E,N} = kN/V                                                (6.14)

or

    P V = NkT = nRT                                                           (6.15)

which is the ideal gas equation of state. Our model thus satisfies both conditions that define an ideal gas,
and we have uncovered sufficient conditions for a gas to be ideal. The assumption that there are no external
fields acting on the particles is not really an assumption about the system, but about its environment, and
so is not essential. The essential assumption on which the model is based is that the molecules do not
interact with each other, and that they do not interact with the container walls except at the instant of
impact. This assumption of non-interaction is what makes an ideal gas "ideal".

Note that we know nothing about the function Φ(E, N), so we can come to no definite prediction
about the specific heats of the gas. We can however arrive at two conclusions. The first is that the specific
heats are independent of V, and can at most depend on E and N, and hence on T and N. The second is
that they will depend critically on the internal degrees of freedom of the particles. This is seen from the
fact that the evaluation of the integral that defines Φ depends on Eint.

6.4 Monatomic Ideal Gas


We now add a further assumption to the ones made above. We suppose the particles of the gas are point
particles. By definition, a point particle has no internal degrees of freedom. There are thus no variables
(Q1, ..., Qs, P1, ..., Ps) for the system, and Eint = 0.

6.4.1 Evaluation of Φ

This assumption enables us to explicitly evaluate the function Φ. Its defining integral becomes

    Φ(E, N) = (1/h^{3N}) ∫_D d³p⃗(1) ⋯ d³p⃗(N)                                 (6.16)

where the domain D is now defined by the condition

    E ≤ Σ_{A=1}^{N} p⃗²(A)/2m ≤ E + δE                                        (6.17)

For given E, the equation

    E = Σ_{A=1}^{N} p⃗²(A)/2m                                                 (6.18)

or,

    Σ_{A=1}^{N} p⃗²(A) = 2mE

defines a sphere of radius R = √(2mE) in ℝ^{3N}. Condition (6.17) thus defines a spherical shell in ℝ^{3N} of
radius R and thickness δR, where

    δR = (dR/dE) δE = (m/√(2mE)) δE = (R/2E) δE


The integral in (6.16) is the volume of this shell. Denoting by 𝒱_{3N}(R) the volume and by A_{3N}(R) the
surface area of a sphere of radius R in ℝ^{3N}, the shell volume is given by (see Appendix B)

    𝒱_{3N}(R + δR) − 𝒱_{3N}(R) ≈ (d𝒱_{3N}/dR)(R) δR = A_{3N}(R) δR           (6.19)

which, by formula (B.11) of Appendix B, is

    𝒱_{3N}(R + δR) − 𝒱_{3N}(R) ≈ [2π^{3N/2}/(3N/2 − 1)!] (2mE)^{(3N−1)/2} (m/√(2mE)) δE
                               = [2π^{3N/2}/(3N/2 − 1)!] (2mE)^{3N/2} δE/(2E)

so

    Φ(E, N) = [1/(3N/2 − 1)!] (2πmE/h²)^{3N/2} (δE/E)                         (6.20)

6.4.2 Fundamental Equation

With Φ evaluated, we can now find the fundamental equation for this system. We have

    Ω(E, δE) = [V^N/(3N/2 − 1)!] (2πmE/h²)^{3N/2} (δE/E)                      (6.21)

and so

    S = k ln Ω = k ln [ V^N (2mπE)^{3N/2} δE / ((3N/2 − 1)! h^{3N} E) ]       (6.22)

But for a gas, N is huge. So 3N/2 − 1 ≈ 3N/2 and, using Stirling's approximation, we have

    ln (3N/2 − 1)! ≈ ln (3N/2)! ≈ (3N/2) ln (3N/2) − 3N/2                     (6.23)

so that,

    S/k = N ln [ V (4πmE/(3h²N))^{3/2} ] + (3/2)N + ln (δE/E)                 (6.24)

The first two terms in this expression have coefficient N ≈ 10²³. The last is the logarithm of a fraction
δE/E, and so is utterly negligible in comparison. Note that the term ln(δE/E) remains utterly negligible
no matter how large we make δE! The expression for the entropy is therefore completely insensitive to the
shell thickness δE. At first, this seems surprising. We might have expected S to show some sensitivity to
δE, so that we would then have had to take the limit δE → 0. But this does not happen. The reason for
this is that in spaces of high dimension, most of the volume of a sphere is contained in a thin layer at the
surface of the sphere. So the vast majority of the accessible states for the system are found very near the
surface of the momentum sphere. Increasing the thickness of the shell for given radius therefore does not
substantially increase the number of states accessible to the system. We are lucky to live in a space of low
dimensionality. An orange in a space of high dimension would be almost exclusively pith.
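This claim about high-dimensional spheres is easy to verify: the fraction of a d-dimensional ball's volume
lying within a surface layer of relative thickness ε is 1 − (1 − ε)^d, which tends to 1 as d grows. A minimal
sketch (the values of ε and d are arbitrary):

    # Fraction of a d-dimensional ball's volume within the outer 1% of its
    # radius: 1 - (1 - eps)^d, with eps = 0.01 (both values arbitrary).
    eps = 0.01
    for d in (3, 30, 300, 3000):
        print(d, 1 - (1 - eps)**d)
    # d=3 gives ~0.03; by d=3000 essentially all the volume is in the shell.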
Since the dependence of S on δE is ridiculously small, we are justified in regarding S as a function of
E, V and N alone, as anticipated. The fundamental equation for an ideal gas of point particles is therefore
given by

    S = Nk ln [ V (4πmE/(3h²N))^{3/2} ] + (3/2)Nk                             (6.25)

6.4.3 Heat Equation and Specific Heat

The only difference between the general ideal gas and one composed of point particles is that, for the
latter, we have an explicit expression for Φ. The equation of state for the gas of point particles will thus be
the same as for the general ideal gas, but the energy equation can now be calculated explicitly. From (6.25)
we get

    1/T = (∂S/∂E)_{V,N} = 3Nk/2E                                              (6.26)

from which we get the energy equation

    E = (3/2)NkT = (3/2)nRT                                                   (6.27)

Here n is the number of moles of gas. The molar specific heat at constant volume for the gas of point
particles is then

    cV = (1/n)(∂E/∂T)_{V,N} = (3/2)R                                          (6.28)

This has numerical value

    cV = 12.47 J K⁻¹ mol⁻¹                                                    (6.29)

Since the internal energy is a function of T only,

    dE = (∂E/∂T)_{V,N} dT = CV dT                                             (6.30)

Then

    dQ = CV dT + P dV

Since

    P V = NkT

it follows that at constant pressure

    P dV = Nk dT

and

    dQ = CV dT + Nk dT = CV dT + nR dT                                        (6.31)

so that

    cP = (1/n)(dQ/dT)_P = cV + R = (3/2)R + R = (5/2)R                        (6.32)

The ratio of specific heats is thus

    γ = cP/cV = 5/3 = 1.667

From (6.32) we can also write

    γ = cP/cV = 1 + R/cV                                                      (6.33)


Expression (6.33), evaluated with the measured cV, yields better results for the γ of real gases than the
fixed value 5/3. The particles in monatomic gases are not exactly point particles, but may be well
approximated by them. How good this approximation is may be seen from the two monatomic gases listed
in the table in Figure 6.1.

Figure 6.1. Specific heat data for monatomic gases (table from Reif, 1965, p 157).

6.4.4 Dependence of S on N

Even though (6.25) gives the correct equation of state and heat equation for an ideal gas, it cannot be the
correct fundamental relation. S, like E, V, and N, is an extensive quantity. The function S(E, V, N) must
thus be homogeneous of order 1 in each of its arguments. The function on the right hand side of (6.25),
however, is not, since

    S(λE, λV, λN) = (λN)k ln [ λV (4πm(λE)/(3h²(λN)))^{3/2} ] + (3/2)(λN)k
                  = λNk ln [ λV (4πmE/(3h²N))^{3/2} ] + (3/2)λNk
                  = λNk ln [ V (4πmE/(3h²N))^{3/2} ] + (3/2)λNk + λNk ln λ
                  = λS(E, V, N) + λNk ln λ
                  ≠ λS(E, V, N)                                               (6.34)

There is therefore something fundamentally wrong with equation (6.25).


One way to track down the problem is to tamper with the result and see what, if anything, can be done to
give it the correct form. The single offending term that prevents S(E, V, N ) from being homogeneous is
the solitary V inside the log function. The coefficient of the log function is already homogeneous of degree
1, as is the term that is added to it. The argument of the log function must therefore be homogeneous of
degree zero. The bracketed factor in the second but last time in Eq. (6.34), raised to the power 3/2, is
homogeneous of degree zero by virtue of containing the ratio E/N . Only the V appears by itself and not in
some ratio with another extensive variable, and gives rise to the additional term N k ln in the final result
that violates the homogeneity of the right hand side.

Table from Reif, 1965, p 157.


Recall that

    1/T = (∂S/∂E)_{V,N}   and   P/T = (∂S/∂V)_{E,N} = kN/V

are given correctly by equation (6.25). To remedy the problem with equation (6.25), we must introduce
into the argument of the log function a division by some extensive variable, so as to render the ratio of
V with the new variable homogeneous of degree zero. This new variable can be neither E nor V, since
otherwise the mechanical and heat equations of state, which we know to be correct, would change. It must
therefore be N. Rewrite equation (6.25) by introducing a denominator N under V, but without changing
S. This means we add and re-subtract Nk ln N on the right hand side of (6.25) to get
    S = Nk ln [ V (4πmE/(3h²N))^{3/2} ] + (3/2)Nk
      = Nk ln [ (V/N) (4πmE/(3h²N))^{3/2} ] + (3/2)Nk + Nk ln N
      = Nk ln [ (V/N) (4πmE/(3h²N))^{3/2} ] + (5/2)Nk + k(N ln N − N)
      = Nk ln [ (V/N) (4πmE/(3h²N))^{3/2} ] + (5/2)Nk + k ln N!               (6.35)

By inspection, we see that the first two terms on the right hand side together are homogeneous of degree 1.
Put

    S̃ = Nk ln [ (V/N) (4πmE/(3h²N))^{3/2} ] + (5/2)Nk                        (6.36)

Then S̃ and S are related by

    S̃ = S − k ln N!                                                          (6.37)

"
The function S(E,
V, N) differs from S by a term which is a function of N alone, and so yields the same
mechanical and heat equations of state as S. Further, it is homogeneous of degree 1 in all its arguments. It
therefore has all the properties needed to represent the same system as S, but without its deficiency. We
thus get consistent results if we replace (6.25) by (6.36).
Note that the same argument could have been applied to equation (6.10) instead of (6.25). We would
then have obtained a necessary condition on the unknown function (E, N ).
In thermodynamics, S is defined up to an arbitrary constant. Here, however, we have altered S not by a
constant, but by a function of N , so we have not simply replaced one possible entropy function by another.
We have replaced it by a different one. S and S" are not equivalent entropy functions. How do we justify
this?
The fact that S is not homogeneous is sufficient justification. But if you are not happy with this, then
note that above we did not calculate the third equation of state, which yields the chemical potential as a
function of E, V, N. The functions S and S" yield the same first two equations of state. However they yield
different equations for the chemical potential. S gives an incorrect answer, so we are justified in rejecting
it as a fundamental relation for the system.
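A direct numerical check of the claims above is straightforward. The sketch below (illustrative values
only; units chosen so that k = h = m = 1) evaluates S from (6.25) and S̃ from (6.36) under the scaling
(E, V, N) → (λE, λV, λN):

    from math import log, pi

    def S(E, V, N):          # equation (6.25), units with k = h = m = 1
        return N * log(V * (4 * pi * E / (3 * N))**1.5) + 1.5 * N

    def S_tilde(E, V, N):    # equation (6.36)
        return N * log((V / N) * (4 * pi * E / (3 * N))**1.5) + 2.5 * N

    E, V, N = 1.0e3, 1.0e2, 1.0e3              # arbitrary illustrative values
    for lam in (1, 2, 4):
        print(lam,
              S(lam * E, lam * V, lam * N) / (lam * S(E, V, N)),
              S_tilde(lam * E, lam * V, lam * N) / (lam * S_tilde(E, V, N)))
    # The S-tilde ratio stays exactly 1; the S ratio drifts away from 1.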

6.5 Correct Classical Counting of States


The fact that we obtained an incorrect fundamental relation calls into question the entire classical statistical
modelling procedure. There are only two possibilities. Either the theory we used is fundamentally flawed,
or we have imported into the theory an incorrect assumption that has led to the error. It is unlikely that the
theory is flawed, since it easily yielded two correct equations of state. It is probable therefore that we have


unwittingly imported an incorrect assumption. To identify this assumption, we need first to discover the
meaning of the correction term k ln N! in equation (6.37).

Denote by Ω̃ the number of accessible states of the system predicted by S̃, and by Ω the number
predicted by S. Then, from (6.37),

    Ω̃ = e^{S̃/k} = (1/N!) e^{S/k} = Ω/N!
In obtaining S, we therefore over-counted the accessible states by a factor N!. Now, N! is the number of
permutations of N objects. Our gas consists of N particles. To reduce a count involving N objects by N!
means that those objects, which originally were being treated as distinguishable, are now being regarded
as indistinguishable. The difference between S and S̃, therefore, is that S represents the entropy of a gas
of distinguishable particles, while S̃ represents that of a gas of indistinguishable ones. The fact that S̃ is
the correct entropy function, and S is patently incorrect, forces us to conclude that the particles of a gas, in
classical mechanics, must be regarded as indistinguishable.

This is a surprising result. In quantum mechanics, the concept of indistinguishability is a natural
consequence of the fact that, for a gas, the wavefunctions give only a probability distribution and no
information on the position of a single particle. We therefore have no means of identifying which of the
particles has been at a particular position. In classical mechanics, however, there is no rational basis for
treating the particles as indistinguishable. In this theory, the gas particles are identical, but localised. We
therefore have a priori reason to expect them to be distinguishable. Both experiment and theory however
force on us the conclusion that they are not.
It is not therefore the general theory of classical statistical mechanics that is incorrect, but the way in
which we counted the accessible states. When dealing with gases, to correct for the over-counting of states,
we must replace (6.5) with the corrected expression

    Ω(E, δE) = (1/N!) (1/h^{3N+s}) ∫_D d³r⃗(1) ⋯ d³r⃗(N) d³p⃗(1) ⋯ d³p⃗(N) dQ1 ⋯ dQs dP1 ⋯ dPs

It is interesting to note that, in a counting problem in which N indistinguishable objects (gas particles)
are distributed into an array of boxes (accessible states), the number N! is the "over-count factor" only if
we are allowed at most one object per box. If we are allowed more than one object per box, the over-count
factor is different. For a distribution in which there are n1 objects in the first box, n2 in the second, etc.,
permutation of the objects within each box does not change the distribution, so the over-count factor is not
N!, but the smaller number

    N!/(n1! n2! ⋯)

The fact that we have been forced to introduce the factor N!, and not N!/(n1! n2! ⋯), thus forces us also to
conclude that, in a classical gas, the occupancy of each accessible state is at most 1. We will see this again
in a later section. This restriction is not due to any exclusion principle. It is due to the huge number of
accessible states in classical theory. No finite number of particles can fill an effectively infinite number of
states, so the likelihood of finding more than one particle in a given state is zero.
It is here that we see the limitation of classical statistical mechanics. In reality, every system is a quantum
system, and so has only a finite number of states accessible to it for given energy E. But if E is in a range
where the available states hugely outnumber the number of particles, then the situation is similar to the one
in classical mechanics and we can expect the classical theory to give good results. If, on the other hand,
the number of available states at energy E is comparable to the number of particles, then we can expect
multiple occupancy if there is no exclusion principle, or restricted occupancy if there is an exclusion
principle, and the behaviour of the system will differ radically from that expected classically, where there
is no pressure on the accommodation of particles into the accessible states.
Classically, the number of states accessible to a single gas particle is proportional to V . The number of
states per particle is thus proportional to V /N . We can expect a criterion determining when the classical
theory is applicable to be stated in terms of V /N. We shall deduce one later in the course.


6.6 Gibbs Paradox


In his original work, Gibbs devised a thought experiment to show that expression (6.24) for the entropy of
a gas is incorrect. The thought experiment leads to an obvious contradiction, called the Gibbs paradox.
The thought experiment is as follows.

Consider a system consisting of two ideal gases of N1 and N2 particles respectively, kept at
the same temperature T in separate, adjacent containers of volume V1 and V2 respectively. The chambers
of the two containers are separated only by a removable partition which initially separates the one from the
other. Let the partition then be removed, so that the two gases are free to mix, and consider the change in
entropy of the system which results from the mixing of the two gases. The temperature is kept constant
throughout the mixing process.

Both gases are ideal, so the internal energy of each depends only on its temperature, and does not depend
on its volume. Both gases are at the same temperature throughout, so the internal energy of each remains
constant. The total entropy of the system before the partition is removed is

Before:

    S1 = N1 k ln [ V1 (4πm1E1/(3h0²N1))^{3/2} ] + (3/2)N1 k
    S2 = N2 k ln [ V2 (4πm2E2/(3h0²N2))^{3/2} ] + (3/2)N2 k
    S = S1 + S2

while the entropy after the partition is removed is (writing V = V1 + V2 for the combined volume):

After:

    S1 = N1 k ln [ V (4πm1E1/(3h0²N1))^{3/2} ] + (3/2)N1 k
    S2 = N2 k ln [ V (4πm2E2/(3h0²N2))^{3/2} ] + (3/2)N2 k
    S = S1 + S2

The change in entropy for this process is therefore

    ΔS = Sf − Si = (S1 + S2)f − (S1 + S2)i
       = N1 k ln V + N2 k ln V − N1 k ln V1 − N2 k ln V2
       = N1 k ln (V/V1) + N2 k ln (V/V2)                                      (6.38)

Since the process considered is the mixing of the gases, ΔS is called the entropy of mixing. This result is
independent of the type of gases being mixed. The only variable in the formula for ΔS that reflects
the type of gas involved is m, the mass of the gas molecule. Since the terms containing m1 and m2 cancel
in the expression for ΔS, ΔS is the same whatever the nature of the gases involved.

In (6.38), ΔS is positive, as it should be for an irreversible process such as the mixing of two gases. So far,
then, all seems well. The paradox arises when the two gases that are mixed are identical. In this case
the initial and final density and temperature are identical and the process is reversible and adiabatic. When
the removable partition is replaced, the final system is identical to the initial system and must have the
same, unchanged, entropy. Gibbs' paradox is resolved by using expression (6.36) for the entropy of an ideal
gas.
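The resolution is easy to verify numerically. With the extensive entropy (6.36), removing the partition
between two samples of the same gas at equal temperature and density changes nothing; with (6.25) a
spurious mixing entropy appears. A minimal sketch (arbitrary illustrative values; units with k = h = m = 1):

    from math import log, pi

    def S(E, V, N):          # equation (6.25), units with k = h = m = 1
        return N * log(V * (4 * pi * E / (3 * N))**1.5) + 1.5 * N

    def S_tilde(E, V, N):    # equation (6.36)
        return N * log((V / N) * (4 * pi * E / (3 * N))**1.5) + 2.5 * N

    E1, V1, N1 = 500.0, 50.0, 500.0            # two identical samples of one gas
    dS  = S(2 * E1, 2 * V1, 2 * N1) - 2 * S(E1, V1, N1)
    dSt = S_tilde(2 * E1, 2 * V1, 2 * N1) - 2 * S_tilde(E1, V1, N1)
    print("Delta S  from (6.25):", dS)         # = 2 N1 k ln 2 > 0: the paradox
    print("Delta S~ from (6.36):", dSt)        # = 0: paradox resolved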
References
Reif, F., 1965, Fundamentals of Statistical and Thermal Physics, McGraw-Hill Book Company, New York.

Exercises
1. Consider a classical one-dimensional harmonic oscillator of mass m and spring constant k. Denote the
displacement coordinate by x and the momentum by p.
(a) Show that for constant energy an ellipse is traced out by x and p in the two-dimensional phase
space defined by (x, p).
(b) Give an argument to show that the number of states accessible to the oscillator in the energy range
E to E + δE is proportional to the area between two ellipses.
(c) Give an argument (it can be graphical) to show that for a given dx the oscillator is more likely to be
found near an extremum of the displacement than near the centre of the displacement of the oscillator.
2. Consider an ensemble of identical classical one-dimensional oscillators, each with the same energy, but
with different phases.
(a) The displacement x is a function of time t given by x = A cos(ωt + φ). Assume that the phase
angle φ is equally likely to assume any value in the range 0 ≤ φ < 2π. Show that the probability
w(φ) dφ that φ lies between φ and φ + dφ is then simply w(φ) dφ = dφ/2π. For a fixed time t, find
the probability P(x) dx that x lies between x and x + dx by summing w(φ) dφ over all angles φ
for which x lies in this range. Express P(x) in terms of A and x.
Hint: P(x) dx = ∫ w(φ) dφ, where the integral is over all φ such that x lies between x and
x + dx. Consider φ as a function of x; the integral then extends over all φ in the range φ(x) to
φ(x + dx). Therefore (since dx ≪ 1)

    P(x) dx = ∫_{φ(x)}^{φ(x+dx)} w(φ) dφ = w(φ) |dφ/dx| dx

and the absolute value is required since P(x) dx > 0. In the integration above, it is assumed that
φ(x) is a single-valued function of x. Note that in this case φ is a double-valued function of x and
this must be taken into account in the integration. (A numerical sketch checking the resulting P(x)
is given after the exercises.)
Figure 6.2.

(b) Consider the classical phase space for such an ensemble of oscillators, with energy in a small
range between E and E + δE. Calculate P(x) dx by taking the ratio of the volume of phase
space lying in this energy range and in the range x to x + dx to the total volume of phase space lying in
the range E to E + δE. Express P(x) in terms of E and x. By relating E to A, show that the
result is the same as in part (a).
3. Show that the heat capacity of a classical system of N independent harmonic oscillators is equal to Nk
for N ≫ 1. Within the microcanonical formulation you have to count the number of states available
to the system in the energy range (E, E + δE). Try to do this in two ways:
1) Determine the total number of states with energies ≤ E, and then take the derivative with respect to


E. Try to build up an expression for two, three and more oscillators.


2) Do the integration in momentum phase space.
You may find the discussion at http://physics.unipune.ernet.in/~phyed/26.3/File8.pdf of interest. Do
you agree with the discussion in this presentation? Beware that not everything published on the internet
is correct.
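The following Monte Carlo sketch checks the result of Exercise 2(a). Carrying the hint through gives
P(x) = 1/(π√(A² − x²)); the amplitude A and sample size below are arbitrary choices:

    import random
    from math import cos, pi, sqrt

    A, samples, bins = 1.0, 200_000, 20        # arbitrary choices
    width = 2 * A / bins
    counts = [0] * bins
    for _ in range(samples):
        x = A * cos(random.uniform(0.0, 2 * pi))   # phase uniform in [0, 2 pi)
        counts[min(int((x + A) / width), bins - 1)] += 1

    for b in (0, 5, 10):                       # bins near the edge and the centre
        xc = -A + (b + 0.5) * width
        mc = counts[b] / (samples * width)     # empirical probability density
        exact = 1.0 / (pi * sqrt(A**2 - xc**2))
        print(f"x={xc:+.2f}: MC={mc:.3f}  exact={exact:.3f}")

Note how the density is largest near the turning points x = ±A, as part (c) of Exercise 1 anticipates.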


Chapter 7
Canonical Formalism
7.1 A New Formalism
The microcanonical formalism, where a system is considered in isolation with energy in the range E to
E + δE, is simple in principle. All we need to do is count the number of states available to the system in
a given narrow range of energy E to E + δE. But though simple in principle, it is generally intractable in
practice. Even in the most elementary cases, it taxes ingenuity to the limit.
The problem lies with the counting procedures. In quantum models, these require combinatorial
techniques, and combinatorial problems are notoriously difficult to solve. More often than not, they require
convoluted reformulation to render them soluble. This makes the counting of the accessible states either
impractical or, as is generally the case, impossible. In classical models, the counting requires the evaluation
of an huge number of multiple integrals over complicated domains of integration. In the simplest examples,
the integration problem may be overcome by ingenuity, but the slightest modification of the system to make
it physically more realistic complicates the domain of integration so hugely that the integral becomes
intractable. In practice therefore, the evaluation of Ω is possible only for a few highly idealized models.
It is not difficult to find the root of the counting problem, and it is the same in both classical and quantum
models. It is the requirement that the energy lie in a restricted range. Were it not for this single fact,
there would be hardly a problem. But with the restriction on energy in place, the combinatorial problem
effectively requires us to calculate the number of ways that a given amount of energy can be distributed
among the available degrees of freedom, and this is generally beyond our mathematical capabilities.
The solution to the counting problem is therefore to remove the limitation on the available energy. This
requires us to consider a system that is not isolated, but which is able to interact with its surroundings. This
can be done by allowing it either to work on the surroundings, or to absorb heat from them, or both.
In the canonical formalism, we allow the system to interact thermally with its surroundings, but not in
any other way. More precisely, we allow the system to interact thermally with the rest of the universe. And
since the universe is really a rather large place, no matter how much heat the system absorbs or rejects to it,
it can hardly be expected to change its temperature substantially. So, we model the surroundings as an heat
reservoir at fixed temperature. In principle, the temperature of the surroundings may be set and controlled
by the experimenter.
The canonical formalism therefore considers a system which is in contact with an heat reservoir at fixed
temperature, and which can exchange any amount of energy with it. Thus, states of all energies, from zero
to infinity, are now available to the system, and not just those states that are found in some specified narrow
range.
Since the system is no longer isolated, the postulate of equal a priori probabilities no longer applies. The
system will have different probabilities of being in the different states accessible to it. It therefore does not
spend the same fraction of time in each state. Our first problem is thus to determine the probabilities of the
system being in each of its accessible states. Thereafter, using these probabilities, we can calculate the
average value of any of the system's parameters.
It turns out, after proper analysis, that the canonical formalism can be used to obtain a fundamental
relation, but not in the entropy representation as was the case in the microcanonical formalism. The
fundamental relation obtained is in the Helmholtz representation. From it, all thermodynamic properties
of the system can be calculated by the methods of thermodynamics. Since along the way we obtain also a
probability distribution for the system's microstates, we can calculate exactly the same quantities by the
methods of probability theory. We thus have a choice in how to arrive at the desired answers. An advantage
of probability methods, however, is that they permit us to make statements about the microscopic properties
of the system. This is not permitted by standard thermodynamics, which remains stubbornly agnostic about
the microscopic constitution of matter. Strangely, a number of the more important of these microscopic
parameters can, with hindsight, also be extracted from the fundamental relation. This is an unexpected
boon which renders classical thermodynamics more useful even than was first imagined.


7.2 The Probability Distribution


For simplicity, we first consider only quantum models. Classical models will be discussed in a separate
section.

Let A be a quantum system in thermal interaction with an heat reservoir A′. We allow only thermal
interaction between A and A′. The macroscopic configuration parameters of A are therefore all held fixed.
So A can do no work, and the only way that A can interact with its surroundings is by heat flow. We
impose no constraint on this heat flow. System A therefore can absorb or reject heat freely.

As far as system A is concerned, A′ represents "the rest of the universe", since it can interact with
nothing else. We lose no generality therefore by regarding the combined system A + A′ as isolated. The
total energy of the combined system will therefore lie in a narrow range of energies E0 to E0 + δE0. No
matter how much heat A absorbs or rejects, this energy is transferred to and from A′, and the
total energy of the combined system remains constant.

System A may be large or small. It may be a macroscopic system, or an individual atom or molecule in
a macroscopic system. Our only requirement is that A must be sufficiently small in relation to A′ for A′
to be, for all practical purposes, an heat reservoir for A. Thus any likely heat exchange between A and A′
should leave the temperature T′ of A′ practically unchanged. We shall see below that this condition may
always be satisfied in any practical situation.
Physically, the contact between A and A′ has the following effect. When A is first put into contact with
A′, the unconstrained macroscopic parameters of A will begin to change. After a sufficient length of time,
A will settle into a condition where there are no more detectable changes in its unconstrained macroscopic
parameters. The system has then reached a state of thermodynamic equilibrium. From thermodynamics,
we know that this final equilibrium is one in which the temperature of the system is equal to that of the
reservoir.
Microscopically, what happens is this. The thermal contact between A and A′ allows heat to flow freely
between the two systems. Since the system's configuration coordinates are held fixed, only those states of
the system in which the configuration coordinates have the given fixed values are accessible to the system.
These accessible states in general will occur over a very wide range of energies for A. Since the flow of
heat between A and A′ is unconstrained, none of these states is inaccessible to the system, no matter
how high or low the energy of that state might be. In the microcanonical formalism, where the system is
isolated with the same values of its configuration parameters, most of these states are not accessible to
it. Only those with energy in the given range E to E + δE can be reached. Now however, states of any
energy with the given configuration parameters are accessible. The set of states accessible to the system
has thus been considerably enlarged by allowing thermal interaction.

The principal difference between the system A when isolated and when in thermal contact with A′
is that it is no longer equally likely to be in any one of its accessible states. The postulate of equal a
priori probability for each accessible state holds only for isolated systems. To calculate the values of its
various relevant physical parameters, therefore, we need first to find the probability Pr that system A is in
accessible microstate r when it is in thermal equilibrium with A′.

7.2.1 Probability of Occurrence of State r

Denote the microstates accessible to system A by r, and the energy of A in microstate r by Er. We will
assume without loss of generality that the accessible microstates have been labelled in such a way that
E1 ≤ E2 ≤ E3 ≤ ⋯.

Now, A is not isolated, but the composite system A + A′ is isolated, so we can apply the postulate of
equal a priori probabilities to it. Each of its accessible states is therefore equally likely.

Suppose the interaction of A with A′ is weak. Then the total energy of the composite system is in a
narrow range E0 to E0 + δE0, where

    E0 = Er + E′

If the interaction were not weak, then we would have had E0 = Er + E′ + Eint. "Weak interaction"
therefore means Eint ≪ Er and E′, so that we can neglect Eint in comparison with Er and E′. This
condition is more than adequately met in the vast majority of cases of interest and so represents no real


restriction on the theory.

When A is in the given microstate r, which has (exactly) energy Er, system A′ can be in any one of its
very many accessible states which have energy in the range E′ = E0 − Er to E′ + δE′ = E0 + δE0 − Er.
Denote the number of microstates accessible to A′ in this range by Ω_{A′}(E′). The total number of states
accessible to the composite system A + A′ is then equal to the number of states accessible to A,
multiplied by the total number of states accessible to A′. But, since A is in the given state r, the number
of states accessible to it is 1, and so we have

    Ω_{A+A′}(Er, E′) = 1 × Ω_{A′}(E′) = Ω_{A′}(E0 − Er)

The composite system is equally likely to be in any one of its accessible states. The probability Pr of A
being in state r is thus given by

    Pr = (number of states accessible to A + A′ in which A is in state r)
         / (total number of states accessible to A + A′ in the range E0 to E0 + δE0)

In symbols,

    Pr = Ω_{A+A′}(Er, E0 − Er) / Σ_r Ω_{A+A′}(Er, E0 − Er) = Ω_{A′}(E0 − Er) / Σ_r Ω_{A′}(E0 − Er)      (7.1)
(7.1)

The denominator is a given fixed number which is characteristic of the composite system A + A′, and
which is independent of r. We may therefore write

    Pr = c Ω_{A′}(E0 − Er)                                                    (7.2)

Note that we can evaluate c by using the property of probabilities that

    Σ_r Pr = 1                                                                (7.3)

so, from (7.2), we get

    1 = Σ_r Pr = Σ_r c Ω_{A′}(E0 − Er) = c Σ_r Ω_{A′}(E0 − Er)                (7.4)

so that c = 1/Σ_r Ω_{A′}(E0 − Er), in agreement with (7.1), as expected.
Up to now, the discussion has been completely general. No special assumptions have been made about
system A′. Systems A and A′ might have been any two systems in thermal interaction. We shall now use
the fact that A′ is an heat reservoir for A. A is thus very much smaller than A′, and so Er ≈ E ≪ E0. We
may therefore expand the function Ω_{A′}(E0 − Er) about the value E0. In fact, since Ω_{A′} is a very rapidly
varying function, it is more useful to expand ln Ω_{A′}, and we write

    ln Ω_{A′}(E0 − Er) = ln Ω_{A′}(E0) + (∂ ln Ω_{A′}/∂E′)(E0)(−Er) + (1/2)(∂² ln Ω_{A′}/∂E′²)(E0)(−Er)² + ⋯      (7.5)

But ln Ω_{A′} = S′/k, where S′ is the entropy of system A′. So

    ∂ ln Ω_{A′}/∂E′ = 1/kT′

where T′ is its temperature. So, defining the temperature T by

    1/kT = (∂ ln Ω_{A′}/∂E′)(E0)

(7.5) becomes

    ln Ω_{A′}(E0 − Er) = ln Ω_{A′}(E0) − Er/kT − (Er²/2kT²)(∂T′/∂E′)(E0) + ⋯      (7.6)

Physically, T is the temperature of system A′ when all of the energy E0 is found in A′, and none of it is
found in A.


Now, system A′ is an heat reservoir, so it has an huge number of degrees of freedom. The value of
Ω_{A′}(E′) thus increases with fantastic rapidity as E′ increases, and so the value Ω_{A′}(E0 − Er) will
decrease with fantastic rapidity as Er increases. The probability of states r of A occurring in which Er has
relatively high values is therefore absurdly small, and their contribution to statistical averages is completely
negligible. Any approximation made in (7.6) when Er is large, no matter how absurd, will thus have no
effect at all on the statistical averages. The states r that contribute significantly to statistical averages all
have Er ≪ E′, or E′ = E0 − Er ≈ E0, and so

    (∂ ln Ω_{A′}/∂E′)(E′) = 1/kT′ ≈ (∂ ln Ω_{A′}/∂E′)(E0) = 1/kT              (7.7)

In other words, for all those states r in which A has any significant likelihood of being found, the actual
temperature T′ of the reservoir A′ can only differ insignificantly from the temperature T which it has when
it contains the total energy E0 of the composite system. This implies that (∂ⁿT′/∂E′ⁿ)(E0) ≈ 0, and we may
therefore write (7.5) to more than excellent approximation as

    ln Ω_{A′}(E0 − Er) = ln Ω_{A′}(E0) − Er/kT                                (7.8)

or

    Ω_{A′}(E0 − Er) = Ω_{A′}(E0) e^{−Er/kT}                                   (7.9)

Combining this with (7.2) gives

    Pr = c Ω_{A′}(E0 − Er) = c Ω_{A′}(E0) e^{−Er/kT}                          (7.10)

or

    Pr = C e^{−Er/kT}                                                         (7.11)

where C is a constant for the isolated system A + A′, and is independent of r.


Equation (7.11) gives the probability Pr that a system A, with microstates r, in thermal contact with an
heat reservoir A′ at temperature T, will be found in quantum state r. This probability distribution was first
discovered by Boltzmann, and is called the Boltzmann distribution. Special cases of it were known before
Boltzmann's result (e.g. Maxwell's distribution), but Boltzmann was the first to show that this distribution
applies generally to all systems in thermal contact with an heat reservoir. The factor e^{−Er/kT} is called the
Boltzmann factor. This distribution is also called the canonical distribution. It is a result of fundamental
importance in statistical mechanics.

The constant C can be evaluated easily from relation (7.3). This gives

    1 = Σ_r Pr = Σ_r C e^{−Er/kT} = C Σ_r e^{−Er/kT}

so that

    C = 1 / Σ_r e^{−Er/kT}

The ratio 1/kT occurs frequently in statistical physics, so it is often denoted more briefly by β. The
function Σ_r e^{−Er/kT} whose inverse yields C also occurs frequently and, as we shall see, is of
fundamental importance. It is denoted by

    Z = Σ_r e^{−Er/kT}

Z is a sum over all microstates of the system of the Boltzmann factor e^{−Er/kT}. To emphasise this, it is
called in German the "sum over all states", or Zustandssumme, from which we get our symbol Z. We shall
see that it describes how the energy is distributed over the accessible states of the system. To emphasise
this, Z is called, in English, the partition function.
In terms of β and Z, the Boltzmann distribution becomes

    Pr = e^{−βEr} / Σ_r e^{−βEr} = e^{−βEr} / Z

This result yields interesting insights into the behaviour of the system when it is in contact with an heat
reservoir. Suppose the system is known to be in state r. Then it has energy Er. Accordingly, A′ has energy
E0 − Er, and it could be in any one of an huge number of states accessible to it at this energy. Now, the
number of states available to A′ is a very rapidly increasing function of its energy. So the higher the energy
Er of A, the lower the energy available to A′ and the smaller the number of its accessible states; the lower,
therefore, the probability of this division of energy between A and A′. So the higher the energy Er of the
state r of A, the lower the probability of its occurrence. This is what the exponential dependence of Pr on
Er expresses.
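As a concrete illustration, the sketch below evaluates the canonical probabilities for a two-level system
with energies 0 and ε (ε and the temperatures are arbitrary illustrative values; units with k = 1):

    from math import exp

    eps = 1.0                                  # energy gap (arbitrary units, k = 1)
    for T in (0.1, 1.0, 10.0):
        beta = 1.0 / T
        Z = 1.0 + exp(-beta * eps)             # partition function: states 0, eps
        p0, p1 = 1.0 / Z, exp(-beta * eps) / Z
        print(f"T={T}: P(E=0)={p0:.4f}, P(E=eps)={p1:.4f}")
    # Low T: the system is frozen into the ground state.
    # High T: the two states become equally probable.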

7.2.2 Probability of Occurrence of Energy E

The Boltzmann distribution gives the probability of finding system A in one particular quantum state of
energy Er. However, there may be very many quantum states with the given energy Er, so Pr does not
necessarily give the probability that the system has energy E = Er. To calculate this probability, we must
sum Pr over all the states r with this given energy.

Denote the degeneracy of the energy level E of the quantum system by gE, and the probability that A
has exactly energy E by PE. Then

    PE = Σ_{r: Er=E} Pr = Σ_{r: Er=E} e^{−βEr}/Z = (e^{−βE}/Z) Σ_{r: Er=E} 1 = gE e^{−βE}/Z      (7.12)

where we have used the fact that Z does not depend on r. Thus

    PE = gE e^{−βE}/Z                                                         (7.13)

The partition function can also be expressed as a sum over all allowed energies of the system rather than a
sum over all states. Summing (7.13) over all allowed energies E and using (7.3), we get

    Z = Σ_E gE e^{−βE}                                                        (7.14)

7.2.3 Probability of Energy in the range E to E + δE

It is useful to know also the probability P(E, δE) that the system has energy in a range from E to
E + δE. This is found by summing Pr over all states with energies in this range. Thus

    P(E, δE) = Σ_{E<Er<E+δE} Pr                                               (7.15)

If the energy range δE is narrow, then Er ≈ E for all the states r in the sum, and so

    P(E, δE) = Σ_{E<Er<E+δE} e^{−βEr}/Z ≈ (e^{−βE}/Z) Σ_{E<Er<E+δE} 1 = (1/Z) Ω_A(E, δE) e^{−E/kT}      (7.16)

This result is informative. The function e^{−E/kT} is a very rapidly decreasing function of E. On the other
hand, Ω_A(E) is a very rapidly increasing function of E. Their product, normalised by the factor 1/Z, is
therefore negligibly small almost everywhere except in a very narrow range where it rises steeply to its
maximum value, and then drops to zero again.


Figure 7.1. P(E) as a function of E.


This behaviour is shown schematically in Figure 1. The proportions in the figure are exaggerated to
show the general behaviour of P (E). Generally, the peak is extremely sharp. The larger the system A, the
sharper this maximum becomes. So though the system is free to assume any energy, in practice it spends
most of its time in states with energy extremely near to the value E, with deviations from this value that
are barely detectable. To all intents and purposes therefore, A will have a well defined" energy E, with
extremely small random deviations from it. This explains why classical physicists were able to regard
without contradiction the energy of the system as sharply defined.

7.3 Statistical Calculation of System Parameters


7.3.1 Energy

The energy of a system in contact with an heat reservoir is not fixed. Its configuration parameters are held
constant, and so it cannot do work on its surroundings. But it is not thermally isolated, so it can take in or
expel heat freely. The energy of the system is thus expected to fluctuate as it moves from one accessible
state to another. Once equilibrium is reached, the system temperature will remain fixed at the temperature
T of the heat reservoir A′, but the system energy will fluctuate about a well defined mean value, Ē,
with some statistical scatter. In this section, we calculate first the average system energy, and then the
statistical scatter about this mean energy. Finally, we examine the probability that the system has energy in
a specified narrow range E to E + δE.
7.3.1.1 Average energy

The average energy of the system can be calculated from

    Ē = Σ_r Pr Er = Σ_r (e^{−βEr}/Z) Er = (1/Z) Σ_r Er e^{−βEr}              (7.17)

But, consistent with keeping the configuration variables, and hence the Hamiltonian, constant, the


eigenenergies Er of the system do not depend on β, so

    Er e^{−βEr} = −(∂/∂β) e^{−βEr}

and we have

    Σ_r Er e^{−βEr} = −Σ_r (∂/∂β) e^{−βEr} = −(∂/∂β) Σ_r e^{−βEr} = −∂Z/∂β   (7.18)

and so

    Ē = −(1/Z)(∂Z/∂β) = −(∂/∂β) ln Z                                          (7.19)

The average energy of the system can thus be calculated simply from the partition function Z or, more
directly, from the function ln Z.
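A minimal numerical check of (7.19), again for a two-level system with energies 0 and ε (arbitrary values;
k = 1): the direct average Σ_r Pr Er is compared with a finite-difference evaluation of −∂ ln Z/∂β:

    from math import exp, log

    eps, beta, d = 1.0, 0.7, 1e-6              # arbitrary illustrative values

    def lnZ(b):
        return log(1.0 + exp(-b * eps))

    # direct average: sum_r P_r E_r over the two states 0 and eps
    direct = eps * exp(-beta * eps) / (1.0 + exp(-beta * eps))
    # -d(ln Z)/d(beta) by central difference
    deriv = -(lnZ(beta + d) - lnZ(beta - d)) / (2 * d)
    print(direct, deriv)                       # agree to high precision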
7.3.1.2 Energy Fluctuations

The energy of the system has average value Ē. Its actual value will fluctuate about this value. Its root mean
square (RMS) fluctuation will be the square root of (writing ⟨·⟩ for the mean)

    (ΔE)² = ⟨(E − Ē)²⟩ = ⟨E²⟩ − 2Ē⟨E⟩ + Ē² = ⟨E²⟩ − Ē²

Now,

    ⟨E²⟩ = Σ_r Pr Er² = (1/Z) Σ_r Er² e^{−βEr}                                (7.20)

By the same reasoning as above,

    Er² e^{−βEr} = (∂²/∂β²) e^{−βEr}                                          (7.21)

so that

    ⟨E²⟩ = (1/Z) ∂²Z/∂β²                                                      (7.22)

We can express this in terms of ln Z using the product rule as follows,

    (1/Z)(∂²Z/∂β²) = (∂/∂β)[(1/Z)(∂Z/∂β)] + (1/Z²)(∂Z/∂β)²
                   = (∂/∂β)[(1/Z)(∂Z/∂β)] + [(1/Z)(∂Z/∂β)]²
                   = −∂Ē/∂β + Ē²                                              (7.23)

and thus

    (ΔE)² = (∂²/∂β²) ln Z = −∂Ē/∂β                                            (7.24)

This result can be re-expressed in an interesting way. Since β = 1/kT, we have ∂T/∂β = −kT², and so

    (ΔE)² = −(∂Ē/∂T)(∂T/∂β) = kT² C

where C is the heat capacity of the system at constant external parameters. For a P V T-system, this heat
capacity is CV. The quantity of interest is not (ΔE)² directly. A more meaningful quantity is the relative

fluctuation. Denote the RMS deviation by ΔE = √((ΔE)²). Then, the relative variation is

    ΔE/Ē = √(kT²C)/Ē                                                          (7.25)

Note that both C and Ē are extensive quantities, and so are each proportional to N. On the other hand,
kT² is independent of N. Thus

    ΔE/Ē ∝ 1/√N                                                               (7.26)

The relative fluctuation of the energy of a system in thermal contact with an heat reservoir is thus inversely
proportional to the square root of its particle number. For macroscopic systems, N ≈ 10²⁴, and so
ΔE/Ē ≈ 10⁻¹². The fluctuations in energy are thus 1 part in 10¹², which is ridiculously small. The
energy of the system is thus, to all intents and purposes, constant. This confirms quantitatively the property
we deduced above from qualitative arguments.

This expression also enables us to evaluate explicitly the RMS deviation of the system energy from the
mean. Note that this deviation increases with increasing T, but is dominated by the value of k, which is
extremely small.
Another interesting consequence of the above result is the following. Since (ΔE)² ≥ 0, we must have
−∂Ē/∂β ≥ 0, and thus also ∂Ē/∂β ≤ 0. Expressed in terms of T, this becomes

    ∂Ē/∂T ≥ 0                                                                 (7.27)

Thus Ē can only ever remain constant or increase with increasing T. It can never decrease.
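Both forms of the fluctuation result are easy to verify numerically for the two-level system used earlier
(arbitrary ε and β; k = 1): the second β-derivative of ln Z is compared with kT²C, with C obtained by
differencing Ē with respect to T:

    from math import exp, log

    eps, beta, d = 1.0, 0.7, 1e-4              # arbitrary illustrative values
    T = 1.0 / beta

    def lnZ(b):
        return log(1.0 + exp(-b * eps))

    def E_bar(b):
        return eps * exp(-b * eps) / (1.0 + exp(-b * eps))

    # (Delta E)^2 as the second beta-derivative of ln Z (central difference)
    fluct = (lnZ(beta + d) - 2 * lnZ(beta) + lnZ(beta - d)) / d**2

    # k T^2 C with C = dE_bar/dT, also by central difference (k = 1)
    C = (E_bar(1.0 / (T + d)) - E_bar(1.0 / (T - d))) / (2 * d)
    print(fluct, T**2 * C)                     # both ~0.2217 for these values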

7.3.2 Conjugate Variables

In the canonical formalism, it is assumed that the configuration variables of the system are held fixed. Their
values are thus necessarily well defined. Their conjugate variables, or associated generalised forces,
however, take on different values in different states r of the system. They will therefore display statistical
scatter about some mean value. To calculate this mean value and its statistical scatter, we consider virtual
changes of the configuration variables.

The configuration variables for the system appear in a microscopic model as parameters in the
Hamiltonian for the system. They are also called external parameters, because they are usually parameters
whose values an experimenter can set and control. The system eigenvalues will thus be functions of the
external parameters.

For simplicity, assume the eigenvalues Er are functions of one external parameter x only, and denote
the associated generalised force by X. Generalisation to two or more parameters is straightforward. For
the sake of definiteness, you can take this external parameter to be the volume V of the system. Then the
corresponding generalised force will be the system pressure P.
Suppose now that the system is in the microstate r. While it is in this state, we change the value of the
external parameter quasistatically from x to x + dx. The system energy will then change from Er(x) to
Er(x + dx), that is, the system energy changes by an amount

    dEr = (∂Er/∂x)(x) dx                                                      (7.28)

and so the work done by the system on the surroundings through this change is

    dWr = −dEr = −(∂Er/∂x)(x) dx                                              (7.29)

But dW = X dx. The generalised force acting on the system when it is in state r which causes x to
change is thus

    Xr = −(∂Er/∂x)(x)                                                         (7.30)


Note that the value Xr of the generalised force exerted by the system depends in general on the state r of
the system. So when the system is in equilibrium at temperature T, this value will change randomly as the
system moves from one accessible state to another. Its average value is given by

    X̄ = Σ_r Pr Xr = (1/Z) Σ_r e^{−βEr} Xr = −(1/Z) Σ_r e^{−βEr} (∂Er/∂x)(x)      (7.31)

Now,

    Σ_r e^{−βEr} (∂Er/∂x)(x) = −(1/β)(∂/∂x) Σ_r e^{−βEr} = −(1/β) ∂Z/∂x       (7.32)

and hence

    X̄ = (1/β)(1/Z)(∂Z/∂x) = (1/β) ∂ ln Z/∂x                                  (7.33)

The average values of the conjugate variables can thus also be expressed in terms of the partition function
Z.

As an example, consider a system in which the external parameter x is the system volume V. The
variable conjugate to V is the system pressure P. The above calculation shows that though the system
volume is fixed, the system pressure in fact depends on the state r of the system. So as the system moves
randomly from one accessible state to another, the pressure will fluctuate. The mean pressure for the
system is given by

    P̄ = (1/β) ∂ ln Z/∂V                                                      (7.34)

As with the energy, it can be shown that the probability that the system has pressure in a range P to
P + δP is sharply spiked, with maximum at P̄. Thus, though the pressure of the system fluctuates, the
system has effectively a well defined mean pressure, and the random fluctuations in pressure around this
mean value are extremely small. This explains why classical physicists could labour under the illusion that
the system parameters were well defined.

Note also that Z is a function of T (through β) and of V (through the Er). This last equation is therefore a
relation between P, V and T. It is therefore the equation of state.
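As a quick consistency check (a sketch under the assumption that the spatial factorisation found for Ω in
Section 6.3 carries over to Z, so that Z = ζ(T, N) V^N with ζ independent of V), equation (7.34) gives

    \bar{P} = \frac{1}{\beta}\frac{\partial \ln Z}{\partial V}
            = \frac{1}{\beta}\frac{\partial}{\partial V}\bigl(N \ln V + \ln \zeta(T, N)\bigr)
            = \frac{N}{\beta V} = \frac{NkT}{V}

which is the ideal gas equation of state, in agreement with (6.15).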

7.4 Fundamental Relation


The fact that all the above calculations by statistical methods of relevant physical parameters of the system
lead to formulae expressible in terms of ln Z leads one to suspect that ln Z must somehow be related to
a fundamental relation for the system. It is not difficult to discover how. We have seen that the system
energy is very sharply defined, with mean value Ē and very small random fluctuations about this value. The
classical internal energy of the system is therefore given by U = Ē. We now use this fact to express Z in
terms of the classical thermodynamic variables.

The partition function Z can be expressed as a sum over energies rather than as a sum over states as
follows. First reticulate the energy scale into ranges of width δE. Then sum over all possible ranges to
get

    Z = Σ_r e^{−βEr} = Σ_E Ω_A(E, δE) e^{−βE}                                 (7.35)

The summand is a product of a rapidly increasing function of E with a very rapidly decreasing function.
The product is thus very sharply spiked, with maximum at U = Ē. The contribution to this sum is thus
overwhelmingly from the region around U = Ē, in a region of width δE. So, to excellent approximation,
we have

    Z = Σ_E Ω_A(E, δE) e^{−βE} ≈ Ω_A(U, δE) e^{−βU}                           (7.36)


Taking the log, we get

    ln Z ≈ ln Ω_A(U, δE) − βU = S(U)/k − U/kT                                 (7.37)

where S = k ln Ω_A(U, δE) is the entropy of the system A at energy U, and T is its temperature. Thus

    kT ln Z ≈ T S(U) − U                                                      (7.38)

and we obtain

    F = Ē − T S = −kT ln Z                                                    (7.39)

Z is directly related to the Helmholtz free energy of the system. Further, Z is a function of T, through β in
the exponential terms in the sum, and also of all the external parameters, since each energy eigenvalue Er
is a function of them. Equation (7.39) thus yields

    F = F(T, x1, x2, ..., xn)                                                 (7.40)

which is a fundamental representation.
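As a consistency check (a sketch using the Helmholtz representation of Chapter 2, where S = −∂F/∂T at
fixed external parameters), differentiating F = −kT ln Z gives

    S = -\left(\frac{\partial F}{\partial T}\right)
      = k \ln Z + kT \frac{\partial \ln Z}{\partial T}
      = k \ln Z + kT \cdot \frac{\bar{E}}{kT^{2}}
      = k \ln Z + \frac{\bar{E}}{T}

where we used ∂ ln Z/∂T = (∂ ln Z/∂β)(dβ/dT) = (−Ē)(−1/kT²). Rearranging gives F = Ē − TS, in
agreement with (7.39).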

7.5 Entropy

Since

    Pr = e^{−βEr}/Z                                                           (7.41)

it follows that

    ln Pr = −ln Z − βEr

and

    Σ_r Pr ln Pr = −Σ_r Pr ln Z − β Σ_r Pr Er = −ln Z − βĒ = β(F − Ē) = −S/k      (7.42)

where we have recalled that F = Ē − TS. Hence

    S = −k Σ_r Pr ln Pr                                                       (7.43)

This is a more general definition of entropy than the one we deduced for the microcanonical case. For an
isolated system with Ω equally probable (degenerate) states, this expression becomes

    S = −k Σ_{r=1}^{Ω} (1/Ω) ln (1/Ω) = k ln Ω

as we had before.
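A small numerical sketch of (7.43) (arbitrary values; k = 1): the Gibbs entropy of the two-level canonical
distribution, followed by the equal-probability limit recovering k ln Ω:

    from math import exp, log

    def gibbs_entropy(probs):                  # S/k = -sum p ln p
        return -sum(p * log(p) for p in probs if p > 0.0)

    eps, beta = 1.0, 0.7                       # arbitrary two-level example, k = 1
    Z = 1.0 + exp(-beta * eps)
    print(gibbs_entropy([1.0 / Z, exp(-beta * eps) / Z]))

    Omega = 1000                               # isolated system: Omega equal states
    print(gibbs_entropy([1.0 / Omega] * Omega), log(Omega))   # both 6.9077...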


Exercises
1. Consider a system with N magnetic atoms per unit volume which is placed in a magnetic field B.
Assume that each atom has spin ½ (corresponding to one unpaired electron) and intrinsic magnetic
moment of magnitude μ. For this two-level system, in a quantum mechanical description the magnetic
moment of each atom can align either parallel or anti-parallel to the field B. Ignore possible
interactions between the magnetic moments of the atoms.
(a) Consider the rest of the system as a heat bath and determine the mean magnetic moment μ̄ of the
atom if the substance is at absolute temperature T. (For each atom the magnetic moment will
fluctuate between parallel and anti-parallel alignment with the field. If we write the component of the
magnetic moment along B as μ_B = ±μ, then μ̄ is the time average of μ_B.)
(b) Show that when μB/kT ≪ 1, μ̄ ≈ μ²B/kT, and give a physical interpretation of this result.
(c) The magnetisation, or mean magnetic moment per unit volume, is equal to M = N μ̄. Show that when

    μB/kT ≪ 1

the magnetic susceptibility is inversely proportional to the temperature (compare this with the
results for the microcanonical approach).
2. Show that the probability that the system has pressure in a range P to P + δP is sharply spiked, with
maximum at P̄.


Chapter 8
Heat Capacity of Solids
8.1 Modelling Specific Heats
The heat capacity of a system (C = dQ/dT) is determined by the way that energy can be taken into its various
degrees of freedom. The more complex the system, the greater the number of its degrees of freedom, and
hence the greater its capacity to absorb energy. Each degree of freedom of the system thus contributes to
the heat capacity of the system.
The different degrees of freedom of the system however do not all contribute to its heat capacity with
equal importance. Some require large energies to excite their associated motions. These therefore are not
able to take in energy until the total energy of the system reaches some critical threshold, below which
these degrees of freedom lie dormant. So long as the system energy is below this threshold therefore, they
are inactive and contribute nothing to the heat capacity. Some degrees of freedom, though active, may
not be able to accommodate as much energy as others. Their contribution to the heat capacity is small
and perhaps negligible. We may thus divide the degrees of freedom of a system into distinct categories
according to their ability to contribute to the system heat capacity at different temperatures, and according
to their activation threshold.
When modelling a system, we attempt to identify those degrees of freedom whose contribution to the
heat capacity is likely to be significant. These are modelled independently of the rest, and their contribution
compared with experimental data. Once the principal features of observed behaviour of the system have
been replicated satisfactorily, qualitatively or quantitatively, we then seek to improve the model by taking
into account those degrees of freedom whose contribution to the behaviour of the system is less significant.
In this way, we build up realistic models of real physical systems.
There are several easily identifiable important contributions to the specific heat of solids. These
include the degrees of freedom associated with the motion of conduction electrons in conductors, and the
orientation of magnetic moments in magnetic materials. But the principal contribution to the specific heat
of solids is the vibrational energy of its constituent atoms as they oscillate about their equilibrium positions
where they are bound by the forces that give rise to the solid state. Each of these contributions may be
treated separately and then combined into a more comprehensive model. In this section we consider only
that contribution to the specific heat of solids due to the vibration of the individual atoms or molecules that
make up the solid.

8.2 Experimental Facts


There are two basic experimental facts about the heat capacity of solids which any satisfactory theory must
explain. They are,
1. Close to room temperature, the molar heat capacity of most solids is constant, with a value close to 3R,
where R is the gas constant. Thus, for most solids at room temperature, we have

    cP = 3R = 24.9 J mol⁻¹ K⁻¹                                                (8.1)

This result is essentially the Dulong-Petit Law, which was enunciated in 1819. Note that all early
experiments were open-bench experiments and were thus conducted at constant pressure. Strictly,
therefore, the Dulong-Petit Law is a statement directly about cP and not cV. However, the rate at which
the volume of a solid changes per unit rise in temperature is extremely small, so the work done by
the solid in expansion is generally negligible. This makes cV ≈ cP, and we may refer loosely to "the
specific heat of a solid" without being too particular about which specific heat is intended.
Note that the Dulong-Petit Law only holds approximately and can be quite wrong, even at room

Section 8.3

Historical

85

temperature. Silicon, for example, has a specific heat of 19.9 J mole1 K1 , while that of diamond is
6.1 J mole1 K1 . Any credible model will have to explain also these severe deviations from the law
of Dulong and Petit.
At low temperatures, the heat capacities of solids differ dramatically from that predicted by the
Dulong-Petit law. Typical behaviour is illustrated in Figure 1 which shows the variation of cP with
temperature for gold.8

2.

Figure 8.1.cP vs. T for gold.


For most substances, the specific heat at low temperatures is found to obey a law of the form
cV = T 3 + T

(8.2)

where and are constants. For insulators, is zero and the specific heat follows a pure T 3 law. For
conductors, is non-zero. It is reasonable to conclude from this that the linear term must thus represent
a contribution to the specific heat from the presence in conductors of free conduction electrons. The
law represented by equation (8.2) is most clearly illustrated in a plot of cV /T vs. T 2 . According to
(8.2), such a plot will yield a straight line with slope and intercept . The plot for a typical insulator
(KCl) is shown in Figure 2,9 while that for a typical conductor (Cu) is shown in Figure 3.10
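The use of such a plot can be illustrated numerically. The following short Python sketch uses synthetic values of α and γ (invented for illustration, not measured data) to show how the two constants are recovered from a least-squares fit of c_V/T against T².

    import numpy as np

    # Synthetic low-temperature data obeying equation (8.2); the values of
    # alpha and gamma below are arbitrary illustrative inputs.
    alpha_true, gamma_true = 5.0e-5, 7.0e-4      # J mol^-1 K^-4, J mol^-1 K^-2
    T = np.linspace(1.0, 4.0, 20)                # low-temperature points, in K
    cV = alpha_true * T**3 + gamma_true * T      # equation (8.2)

    # c_V/T vs T^2 is a straight line: slope = alpha, intercept = gamma.
    slope, intercept = np.polyfit(T**2, cV / T, 1)
    print(f"alpha = {slope:.2e}, gamma = {intercept:.2e}")  # recovers the inputs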

8.3 Historical

Classical statistical mechanics predicts for solids a constant heat capacity with value 3R at all temperatures, and is incapable of explaining any deviations from this value. As seen above, this predicted behaviour is in strong disagreement with experiment. In particular, the typical low-temperature behaviour described by the empirical equation (8.2) is completely unexplained by models based on classical mechanics.

Figure 8.2. c_V/T vs. T² for KCl.

The first model to account for the behaviour described by equation (8.2) was proposed by Einstein in 1907. In it he used his new quantum theory to treat, in simplified form, the lattice vibrations in the solid. This theory reproduces qualitatively one of the principal features observed experimentally: the steep rise of c_V from the value zero at T = 0 to the classical value 3R of the Dulong-Petit law. This remarkable success, where classical theory had failed completely, helped to make credible Einstein's new theory of the quantum. But Einstein's oversimplified model does not give good quantitative results. The increase in c_V that it predicts is exponential, not the experimentally observed T³ behaviour. The Einstein model was made more realistic by Debye (1912). Debye's model predicts the T³ behaviour at low temperatures, and fits the experimental data impressively. This model confirms the inference from the experimental data that the T³ term in equation (8.2) is due entirely to the lattice vibrations of the solid. The linear term in (8.2) is not due to lattice vibrations, but is a contribution from the electronic degrees of freedom in conducting materials. It is thus not explained by either the Debye or the Einstein theories.

The Debye model contains no principles not already contained in Einstein's model. It simply implements Einstein's concepts more realistically and with less dramatic simplifying assumptions. Hence its better results. Because of this conceptual overlap with the Einstein model, a study of Einstein's model is useful, in spite of the fact that its detailed predictions are wrong.

8.4 Einstein's Model

In Einstein's model, the atoms in a solid are assumed to vibrate independently of each other about fixed lattice sites. Each atom is assumed to be an isotropic harmonic oscillator with frequency ω. The classical equation of motion for each atom is thus

    d²r⃗^(A)/dt² = −ω² r⃗^(A)

This reduces to three identical independent 1-dimensional simple harmonic oscillator equations, each with frequency ω. The assembly of N atoms in the solid can thus be modelled as a set of 3N identical (but distinguishable) independent simple harmonic oscillators, each of frequency ω.

In this, Einstein's model does not differ from the classical model, which predicts c_V = 3R. Einstein's principal innovation is in assuming that the oscillators are not classical, but quantised. He adopted the same hypothesis as was used by Planck in 1900 in his explanation of the spectrum of blackbody radiation, which assumed that the atoms in the cavity walls that contained the radiation were quantised oscillators. According to this hypothesis, each oscillator can exist only in a discrete set of states r = 0, 1, 2, ..., where the energy of the oscillator in state r is given by

    ε_r = (r + 1/2) ℏω    (8.3)

Note that, in his original treatment, Einstein assumed, like Planck before him, that the oscillator energy levels are given not by (8.3), but by

    ε_r = r ℏω

It was not until the new quantum theory of Heisenberg and Schrödinger that it became known that the oscillator energy is in fact given by expression (8.3). In these notes, we use the correct expression (8.3).

Figure 8.3. c_V/T vs. T² for Cu.

8.4.1 Partition Function

Consider a system of 3N independent oscillators. Denote the quantum states of this system by R. Then, since the oscillators are distinguishable, we can specify the state of the entire system by listing the state r of each oscillator. Thus

    R = (r₁, r₂, ..., r_3N)

Since according to this model the 3N oscillators are non-interacting, the energy of the system in state R is

    E_R = Σ_{A=1}^{3N} ε_r(A) = ε_r(1) + ε_r(2) + ⋯ + ε_r(3N)

The probability of finding the system in state R is thus

    P_R = (1/Z) e^{−βE_R}

where Z is the partition function for the system of 3N oscillators. The partition function can be expressed as

    Z = Σ_R e^{−βE_R} = Σ_{(r₁,...,r_3N)} e^{−β(ε_r(1) + ⋯ + ε_r(3N))} = Π_{i=1}^{3N} Σ_{r(i)} e^{−βε_r(i)} = z^{3N}    (8.4)

where z = Σ_r e^{−βε_r} is the partition function of a single (one-dimensional) harmonic oscillator. We therefore only have to determine the partition function of a single oscillator in order to find the partition function for the whole system of independent, identical oscillators. The function Z determines both the statistics and the thermodynamics of the 3N oscillator system.

Consider first a single oscillator in thermal contact with an heat reservoir at temperature T. The probability of the oscillator being in quantum state r is then

    P_r = (1/z) e^{−βε_r} = (1/z) e^{−β(r+1/2)ℏω}    (8.5)

Here, z is the partition function for a single oscillator and is given by

    z = Σ_{r=0}^{∞} e^{−βε_r} = Σ_{r=0}^{∞} e^{−β(r+1/2)ℏω} = e^{−βℏω/2} Σ_{r=0}^{∞} e^{−βrℏω}    (8.6)

The sum on the right hand side of (8.6) is a geometric series with common ratio ρ = e^{−βℏω} and first term a = 1, so it can be summed in closed form by the usual formula for the sum of a geometric series of n terms,

    Σ_n = a(ρⁿ − 1)/(ρ − 1)

For an infinite sum, provided that |ρ| < 1, ρⁿ → 0 as n → ∞, so z becomes

    z = e^{−βℏω/2} · a/(1 − ρ) = e^{−βℏω/2} · 1/(1 − e^{−βℏω})    (8.7)

The average energy of an oscillator is

    ε̄ = −(1/z) ∂z/∂β = ℏω/2 + ℏω e^{−βℏω}/(1 − e^{−βℏω}) = ℏω/2 + ℏω/(e^{βℏω} − 1)

For kT ≫ ℏω, expand the exponentials:

    ε̄ ≈ ℏω/2 + ℏω/[βℏω(1 + ½βℏω + ⋯)] = ℏω/2 + kT(1 − ½βℏω + ⋯) ≈ kT    (8.8)

The properties of this system are obviously determined by the ratio ℏω/kT. The numerator ℏω is the spacing between the oscillator energy levels and thus represents the excitation energy of the oscillator. From Eq. (8.8), the denominator kT is the characteristic thermal energy available to the oscillator. The statistics of the single oscillator is thus determined by the ratio of the oscillator excitation energy to the available thermal energy. The smaller the amount of energy available thermally, the less likely the excitation of the oscillator.

This discussion may be re-expressed in terms of temperature. The fact that k is an universal constant means that energy and temperature are measurements, in different units, of one and the same underlying physical reality. We may thus convert temperatures to energy, as in the expression kT, or energies to temperature. This is seen above by the fact that ℏω/k has the units of temperature. We may therefore define a characteristic system temperature θ by the relation

    kθ = ℏω

θ is called the Einstein temperature of the oscillator, and it measures the excitation energy of the oscillator in the units of temperature. The behaviour of the system is then determined by the ratio of the characteristic system temperature θ to the temperature T of the reservoir. The temperature T is a measure of the thermal energy available to the system by virtue of its contact with the heat reservoir. Expressed in terms of θ, the partition function (8.7) becomes

    z = e^{−θ/2T} · 1/(1 − e^{−θ/T})


Since it is the ratio of the energies, or equivalently of the temperatures, that determines the behaviour of the system, it is convenient to put

    x = βℏω = ℏω/kT = θ/T

This enables us to write (8.7) in the form

    z = e^{−x/2} · 1/(1 − e^{−x})    (8.9)

The probability of finding the system of 3N oscillators in state R is thus

    P_R = (1/Z) e^{−βE_R}    (8.10)

where Z is the partition function for the system of 3N oscillators. By virtue of (8.9), this partition function becomes

    Z = z₁ z₂ ⋯ z_3N = z^{3N} = e^{−3Nx/2} [1/(1 − e^{−x})]^{3N}    (8.11)

The function Z determines both the statistics and the thermodynamics of the 3N oscillator system.

8.4.2 Thermodynamics of the Oscillator System

The fundamental relation for the system is determined by the seminal relation of the canonical formalism,

    F = −kT ln Z = E − TS    (8.12)

Thus, from (8.11),

    F = 3NkT [x/2 + ln(1 − e^{−x})] = (3/2)Nℏω + 3NkT ln(1 − e^{−ℏω/kT})    (8.13)

Relation (8.13) expresses F as a function of T and N. It has no dependence on the volume V of the solid. The reason for this is that, in setting up the model, we assumed that the atomic lattice sites are fixed points in space. This automatically holds the volume of the solid fixed. We cannot therefore expect this simple model to display the V-dependence of F.

The relation

    F = F(T, N)

allows us to calculate two equations of state. We have, in infinitesimal form,

    dF = −S dT + μ dN    (8.14)

The first equation of state thus yields the system entropy as a function of T and N, and the second yields the chemical potential μ for the solid. Since we are not interested in the solid as an open system, the chemical potential is not relevant. The entropy equation however allows us to calculate the energy equation for the system from the relation

    E = F + TS

and its heat capacity from the relation

    C_V = T (∂S/∂T)_V    (8.15)

The heat capacity can also be calculated from E, since

    C_V = (∂E/∂T)_V    (8.16)

This is less direct, since we must first obtain E. However, it is sometimes less painful, since the expression for E may be less complicated than that for S.

8.4.3 Entropy

The entropy of the system is obtained from (8.13) as follows:

    S = −(∂F/∂T)_N
      = −3Nk ln(1 − e^{−ℏω/kT}) − 3NkT · [−e^{−ℏω/kT} · ℏω/(kT²)] / (1 − e^{−ℏω/kT})
      = −3Nk ln(1 − e^{−ℏω/kT}) + (3Nℏω/T) · 1/(e^{ℏω/kT} − 1)    (8.17)

8.4.4 Energy Equation

The energy equation may be obtained from the relation

    E = F + TS    (8.18)

or

    E = −∂ ln Z/∂β    (8.19)

which gives

    E = (3/2)Nℏω + 3Nℏω · 1/(e^{ℏω/kT} − 1)    (8.20)

8.4.5 Heat Capacity

The dependence of E on T is less complicated than that of S. It is easier therefore, in this example, to calculate the heat capacity of the system from the system energy. Thus,

    C_V = (∂E/∂T)_V = 3Nℏω · e^{ℏω/kT}/(e^{ℏω/kT} − 1)² · ℏω/(kT²)

or

    C_V = 3Nk (ℏω/kT)² e^{ℏω/kT}/(e^{ℏω/kT} − 1)²    (8.21)

In terms of the parameter x defined above, this becomes

    C_V = 3Nk x² e^x/(e^x − 1)²    (8.22)
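Equations (8.20)-(8.22) are easy to explore numerically. The following minimal Python sketch evaluates the Einstein heat capacity per mole; the value θ = 1325 K anticipates the diamond fit discussed in Section 8.6, and R = 8.314 J mol⁻¹ K⁻¹ is the gas constant.

    import numpy as np

    R = 8.314  # gas constant, J/(mol K)

    def einstein_cv(T, theta):
        # Equation (8.22) per mole: c_V = 3R x^2 e^x / (e^x - 1)^2, x = theta/T.
        # np.expm1 computes e^x - 1 accurately for small x.
        x = theta / T
        return 3.0 * R * x**2 * np.exp(x) / np.expm1(x)**2

    theta = 1325.0  # Einstein temperature of diamond (Section 8.6)
    for T in (50.0, 300.0, 1325.0, 5000.0):
        print(f"T = {T:7.1f} K   c_V = {einstein_cv(T, theta):6.2f} J/(mol K)")
    # At T >> theta, c_V -> 3R = 24.94 (Dulong-Petit); at T << theta it is
    # exponentially small, reproducing the limits derived in Section 8.5.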

8.5 Discussion of Results

8.5.1 Energy

The total internal energy of the system is given by (8.20). It has two contributions. The first, (3/2)Nℏω, is the zero point energy of the 3N oscillators. This is an energy possessed by the system irrespective of its state of excitation. The second term in (8.20) represents the excitation energy of the oscillators when the system is at temperature T.

It is instructive to investigate the limiting forms of the expression for the system energy in the limits x ≪ 1 and x ≫ 1. When x ≪ 1, we have ℏω ≪ kT. The temperature of the system is thus so high that the thermal energy kT is much larger than the level separation ℏω of the constituent oscillators, and therefore also of the system. This limit is generally called the high temperature limit, and gives

    (e^x − 1)^{−1} = (x + ⋯)^{−1} ≈ kT/ℏω

so that

    E ≈ (3/2)Nℏω + 3NkT    (8.23)

The excitation energy of the system is thus 3NkT. The excitation energy per oscillator of the system, or equivalently, the average excitation energy of any given oscillator, is thus kT. This result agrees with the classical prediction for the system.

The limit x ≫ 1 occurs when the level separation ℏω for the system is much greater than the thermal energy kT, and is called the low temperature limit. When this condition is satisfied, e^{−ℏω/kT} ≪ 1, so

    E ≈ (3/2)Nℏω + 3Nℏω e^{−ℏω/kT}    (8.24)

The excitation energy per oscillator thus drops dramatically and becomes very different from the classically expected value of kT (see the section on the equipartition theorem). As the system temperature drops to zero, e^{−ℏω/kT} → 0, and so E → (3/2)Nℏω. At the absolute zero therefore, each of the oscillators of the system drops to its ground state.

8.5.2 Heat Capacity

The heat capacity of the system is given by equations (8.21) and (8.22),

    C_V = 3Nk (ℏω/kT)² e^{ℏω/kT}/(e^{ℏω/kT} − 1)² = 3Nk x² e^x/(e^x − 1)²    (8.25)

This result is commonly expressed in terms of the Einstein temperature θ,

    C_V = 3Nk (θ/T)² e^{θ/T}/(e^{θ/T} − 1)²    (8.26)

The behaviour of the system is determined by the ratio θ/T. We consider two limiting cases, T ≫ θ and T ≪ θ, the high temperature and the low temperature limits respectively.

Consider first the high temperature limit, T ≫ θ. Then x ≪ 1, so that e^x ≈ 1 + x and

    x² e^x/(e^x − 1)² = x²(1 + x + ⋯)/(x + ⋯)² ≈ 1

and so (8.26) becomes

    C_V ≈ 3Nk = 3nR    (8.27)

The high temperature limit thus yields the classical Dulong-Petit law.

In the low temperature limit, we have T ≪ θ, so x = θ/T ≫ 1, giving e^x ≫ 1 and

    C_V ≈ 3Nk x² e^x/(e^x)² = 3Nk x² e^{−x}    (8.28)

or, in terms of θ and T,

    C_V ≈ 3Nk (θ/T)² e^{−θ/T}    (8.29)

To calculate the limit as T → 0, we need to use L'Hospital's rule:

    lim_{x→∞} x²/e^x = lim_{x→∞} 2x/e^x = lim_{x→∞} 2/e^x = 0    (8.30)

According to this model, the heat capacity of the system goes to zero as T is decreased to zero. Qualitatively, this matches the experimentally observed behaviour of the system, which classical models were not able to do. In detail however, the Einstein model predicts an exponential temperature dependence of c_V, not the observed T³ behaviour.
Physically, the model explains the decrease of c_V with temperature as follows. At very low temperatures, kT is much smaller than ℏω. This means that the energy available from the heat reservoir is much less than that needed for the excitation of the oscillators, so the probability of an oscillator acquiring a quantum of the right size to put it into an excited state is very low. At low temperature therefore, most oscillators will be in their ground state and very little energy will be absorbed by the system from the heat reservoir. Its internal energy therefore stays low initially, almost unchanged, as the temperature of the reservoir is raised. Since c_V is the rate of increase per mole of the system energy E with respect to the temperature of the reservoir, it will at first show no perceptible sign of increase. The curve of c_V vs. T at this point is therefore almost flat. The oscillator degrees of freedom thus contribute insignificantly to c_V in this temperature range.

Degrees of freedom which cannot be excited because there is insufficient energy available for the excitation are said to be frozen. The existence of frozen degrees of freedom is a purely quantum mechanical phenomenon. It cannot happen classically, since in classical mechanics the smallest amount of energy can be absorbed into any degree of freedom. When a system has several types of degrees of freedom, it can happen that some are excited by a given energy influx, while the others remain frozen. The contribution of each kind of degree of freedom to the heat capacity is different at different temperatures, and the heat capacity shows marked change at the onset of excitation of a new type of motional mode.

When the temperature T of the reservoir reaches the critical value θ, the energy available to the system from the reservoir becomes sufficient to produce excitations of the oscillators. The amount of energy that can be absorbed from the reservoir thus begins to rise as more of the oscillators are able to make the transition to the first excited level. Correspondingly, the internal energy E of the system begins to rise and so c_V begins to climb. The available energy however is still not sufficient to supply each and every oscillator with the requisite amount of energy ℏω for excitation. The average excitation energy per degree of freedom is still below its maximum classical value of kT.

At higher temperatures, the amount of energy available from the reservoir is so great that there is sufficient for each oscillator of the system to absorb the amount needed for the next transition. The energy absorbed per mole per unit rise in temperature has thus reached its peak value of 3R, and can increase no further. The capacity of each oscillator degree of freedom for energy absorption per unit rise of temperature has thus saturated, and c_V can rise no further.

8.6 Comparison with Experiment

The Einstein model gives a good fit to the data for the heat capacity of diamond, as is shown in Figure 8.4.¹¹ The Einstein temperature θ is a free parameter in the model whose value is obtained by looking for the best fit of the model with the data. In his original paper, Einstein chose for θ the value 1325 K in order to produce the good agreement shown in Figure 8.4. According to the model, it is this high value of θ which is responsible for the remarkably low heat capacity of diamond at room temperature. Most other solids have values of θ between 200 K and 300 K, which brings the value of c_V for them very close to the classical value of 3R. This explains the success of the Dulong-Petit Law.

Einstein's result (8.29) does not yield the experimentally observed T³ behaviour for insulators. The relation it predicts is exponential, and is far from giving the linear relation between c_V/T and T² of Figures 8.2 and 8.3. In general, its agreement with experiment is only approximate, no matter how θ is chosen. In view of the simplistic assumptions of the model, this fact is not surprising. It is clear that atoms in a crystal lattice cannot vibrate independently, and all at the same frequency. The lattice oscillators are coupled oscillators. Reduction of the equations of motion to normal mode coordinates will thus not produce a single frequency of vibration, but a spectrum of frequencies. The virtual oscillators needed to represent the vibrating atoms in the model can thus not be identical, but must have a range of frequencies. This correction to Einstein's model is the starting point of the Debye theory, which leads to the correct T³ dependence for C_V. The agreement between the Debye model and experiment is excellent, showing beyond dispute that the origin of the T³ term in the observed specific heat is in the lattice vibrations of the particles that make up the solid.

The reason that Einstein's model gives good results for diamond is the "stiffness" of the potential that binds the diamond atoms to their lattice sites. The stiffer the potential, the smaller the spread of frequencies of the coupled virtual oscillators. If this spread of frequencies is sufficiently small, then Einstein's assumption of a single frequency for the oscillators is a good approximation.

Figure 8.4. Observed molar heat capacity of diamond, compared with Einstein's model with θ = 1325 K.

¹¹ From Mandl, 1988, p 154, after Einstein's original paper of 1907.

References:
Mandl, F., 1988, Statistical Physics, Second Edition, The Manchester Physics Series, John Wiley and Sons, Chichester.

Exercises

1.  Show that the entropy for the Einstein solid goes to zero as T → 0.

2.  The specific heat of graphite can be estimated from the following simple model. Graphite has a highly anisotropic crystalline structure: each carbon atom in this structure can be regarded as performing simple harmonic motion in three dimensions, with the restoring forces parallel to the planes larger than the restoring forces perpendicular to the layers. The natural frequencies for in-plane oscillations are large, with θ∥ ≈ 3000 K. On the other hand, the natural frequencies for inter-plane motion are small, with θ⊥ ≈ 300 K. On the basis of this model, what is the molar specific heat (at constant volume) of graphite at 300 K?

Chapter 9
Paramagnetism : Canonical Approach
We have considered the case where each magnetic atom has spin 1/2. We used it to illustrate the use of
the micro-canonical formalism. In the general case, the atoms can have spin S, where S is integral or
half integral. Treating this general case using the micro-canonical formalism is clumsy and difficult. In
contrast, the canonical treatment of it is simple and straightforward. We therefore use it in this chapter to
illustrate the use of the canonical formalism.

9.1 Magnetic Moment of Spin-S Particles

A spin-S particle is a particle with angular momentum of magnitude J = Sℏ. Thus S measures its total angular momentum in units of the fundamental constant ℏ. This angular momentum has a direction in space that we represent by a spin-vector S⃗. The angular momentum vector of the particle is thus J⃗ = ℏS⃗.

Each particle with spin vector S⃗ has a magnetic moment given by

    μ⃗ = g μ_u S⃗

where μ_u is a convenient standard unit of magnetic moment, and g is a dimensionless factor of order unity. For atomic problems, the standard unit of magnetic moment is the Bohr magneton which, in SI units, is given by

    μ_B = eℏ/2mₑ = 9.2732 × 10⁻²⁴ J T⁻¹

where mₑ is the electron mass. In CGS units, μ_B is given by

    μ_B = eℏ/2mₑc = 9.2732 × 10⁻²¹ erg gauss⁻¹

where c is the speed of light. In nuclear problems, the standard unit of magnetic moment is the nuclear magneton, given in SI units by

    μ_N = eℏ/2mₚ = 5.0505 × 10⁻²⁷ J T⁻¹

where mₚ is the proton mass, and in CGS units by

    μ_N = eℏ/2mₚc = 5.0505 × 10⁻²⁴ erg gauss⁻¹

The factor g is an empirical factor to correct for the fact that small residual interactions of the particles change the magnetic moment of the bare particle. In the case of atoms having both electronic spin and orbital angular momentum, g is called the Landé g-factor.

9.2 Quantum States of Spin-S Particles

According to Classical Mechanics, the magnetic moment vector μ⃗ of a spin-S particle placed into a magnetic field B⃗ can assume any orientation relative to the field, and the particle acquires a magnetic potential energy given by

    ε = −μ⃗ · B⃗

According to QM, however, the possible orientations of μ⃗ relative to B⃗ are restricted by the condition that the component of S⃗ in the direction of B⃗ can have only the values

    −S, −S + 1, −S + 2, ..., S − 2, S − 1, S

A spin-S particle in a magnetic field can thus be in any of 2S + 1 quantum states. It is convenient to label these by the symbol r, where r can range through the 2S + 1 values

    r = −S, −S + 1, −S + 2, ..., S − 2, S − 1, S

Thus, if we choose the z-axis in the direction of B⃗ and denote the component of S⃗ along B⃗ by S_z, we have states r = S_z. Thus,

    S = 0   :  S_z = 0                              1 state
    S = 1/2 :  S_z = −1/2, +1/2                     2 states
    S = 1   :  S_z = −1, 0, +1                      3 states
    S = 3/2 :  S_z = −3/2, −1/2, +1/2, +3/2         4 states
    S = 2   :  S_z = −2, −1, 0, +1, +2              5 states

and so on. The energy of the state r is then given by

    ε_r = −(μ_z)_r B = −g μ_u (S_z)_r B = −r g μ_u B

It is convenient to define the energy

    ε = g μ_u B

The energy of quantum state r can then be written as

    ε_r = −rε    (9.1)

The total field B⃗ at the site of a magnetic atom is given in general by B⃗ = μ₀(H⃗ + m⃗), where H⃗ is the applied field and m⃗ is the magnetisation field of the material. In the case of paramagnetic substances, the magnetisation field is very small when compared with the applied field, so the total field is given approximately by B⃗ ≈ μ₀H⃗, and we have

    ε ≈ g μ_u μ₀ H    (9.2)

9.3 Statistics of a Single Paramagnetic Atom

We are interested in the properties of a paramagnetic solid consisting of N magnetic atoms, each of spin S and magnetic moment μ. To obtain the fundamental relation for this system by the canonical formalism, we imagine the system in contact with an heat reservoir of temperature T, and in equilibrium with it.

Consider first a single paramagnetic atom of spin S in this solid. It interacts weakly with the rest of the paramagnetic solid, and also with the heat reservoir. Since solid and reservoir are in equilibrium, both will be at the same temperature T, and we can regard the single atom as a system in its own right, in thermal contact with an enlarged reservoir consisting of the rest of the solid and the original heat reservoir. The quantum states accessible to this system are the states r discussed above, with energies ε_r given by (9.2). This system is free to exchange energy with the enlarged reservoir. In the course of time, it will make transitions continually to each of its accessible states, spending different times in each in proportion to the probability of its being found in that state. According to the canonical theory, the probability of finding the single atom in state r is thus

    P_r = constant × e^{−βε_r} = constant × e^{βrε}

The constant can be evaluated from the property

    1 = Σ_r P_r = constant × Σ_r e^{−βε_r}

The function

    z = Σ_r e^{−βε_r}

is the partition function for the system consisting of a single magnetic atom, and so

    P_r = (1/z) e^{−βε_r}    (9.3)

This probability distribution completely determines the statistics of the single particle system and also all of the average values of its associated parameters. If it makes sense to think of the system as a thermodynamic system, it also determines fully the thermodynamics of the system via the fundamental relation

    F = −kT ln z    (9.4)

We can evaluate z as follows. We have

    z = Σ_{r=−S}^{S} e^{−βε_r} = e^{βSε} + e^{β(S−1)ε} + ⋯ + e^{−β(S−1)ε} + e^{−βSε}

This is the sum of a geometric series of m = 2S + 1 terms, where the common ratio is ρ = e^{−βε} and the first term is a = e^{βSε}. Thus

    z = a(1 − ρᵐ)/(1 − ρ) = e^{βSε}(1 − e^{−β(2S+1)ε})/(1 − e^{−βε}) = (e^{βSε} − e^{−β(S+1)ε})/(1 − e^{−βε})

We can express this more symmetrically by multiplying numerator and denominator by e^{βε/2} to get

    z = (e^{β(S+1/2)ε} − e^{−β(S+1/2)ε})/(e^{βε/2} − e^{−βε/2})

or

    z = sinh[(S + 1/2)βε]/sinh(βε/2)    (9.5)

This is the partition function for a single atom in contact with an heat reservoir at temperature T. This single atom will continually change its state as it interacts with the heat reservoir, emitting and absorbing quanta of energy from it. We can calculate the time-averaged properties of the single particle from the probability distribution (9.3).
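As a quick sanity check, result (9.5) can be verified numerically by comparing the brute-force sum over the 2S + 1 states with the closed sinh form. A minimal Python sketch (the values of S and x are arbitrary test inputs):

    import numpy as np

    def z_sum(S, x):
        # Direct sum of e^{-beta*eps_r} = e^{r x} over r = -S, ..., S,
        # with x = beta*eps.
        r = np.arange(-S, S + 1)
        return np.exp(x * r).sum()

    def z_closed(S, x):
        # Closed form, equation (9.5).
        return np.sinh((S + 0.5) * x) / np.sinh(0.5 * x)

    for S in (0.5, 1.0, 1.5, 2.0):
        for x in (0.1, 1.0, 3.0):
            assert np.isclose(z_sum(S, x), z_closed(S, x))
    print("direct sum and sinh form of (9.5) agree")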

9.3.1 Average Magnetic Moment

In the CM theory of magnetic moment, all three components of μ⃗ can in principle be measured at any instant. However, the magnetic field exerts a torque on the particle which causes μ⃗ to precess around the direction of the magnetic field. The time average of the components of μ⃗ perpendicular to the field is thus zero. The only component of μ⃗ with non-zero average value is thus that in the direction of the field.

In the QM theory, the components of the particle magnetic moment are represented by three operators. These are proportional to the spin operators and so, apart from a scaling factor, have the same properties as angular momentum operators. Since these do not commute, it is not possible, even in principle, to measure simultaneously the three components of magnetic moment. The best that we can do is to measure simultaneously the square of the total magnetic moment μ⃗² of the particle, and the component of μ⃗ in the direction of the magnetic field. Choose the z-axis in the direction of the magnetic field. Then, when the particle is in quantum state r, its z-component of magnetic moment is given by

    (μ_z)_r = g μ_u r

The time-average value of μ_z for the magnetic particle is thus

    μ̄_z = Σ_r P_r (μ_z)_r = (1/z) Σ_r e^{βr(g μ_u B)} (g μ_u r) = (1/z)(1/β) ∂/∂B Σ_r e^{βr(g μ_u B)} = (1/β)(1/z) ∂z/∂B    (9.6)

so that

    μ̄_z = (1/β) ∂ ln z/∂B    (9.7)

Now, from (9.5), we have

    ln z = ln sinh[(S + 1/2)βε] − ln sinh(βε/2)    (9.8)

so that

    μ̄_z = (1/β) ∂ ln z/∂B = g μ_u [(S + 1/2) coth((S + 1/2)βε) − (1/2) coth(βε/2)]    (9.9)

which we may write in the form

    μ̄_z = μ B_S(βε)    (9.10)

where μ = g μ_u S is the magnitude of the particle magnetic moment, and the function B_S(x) is defined by

    B_S(x) = (1/S) [(S + 1/2) coth((S + 1/2)x) − (1/2) coth(x/2)]    (9.11)

and is called the Brillouin function for spin S.
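The Brillouin function is straightforward to implement. The sketch below (not part of the notes; the parameter values are arbitrary) codes equation (9.11) using coth y = 1/tanh y, and checks the two limits derived in Section 9.4 below: saturation B_S → 1 for x ≫ 1, and the linear form B_S ≈ (S + 1)x/3 for x ≪ 1.

    import numpy as np

    def brillouin(S, x):
        # Equation (9.11): B_S(x) = [(S+1/2) coth((S+1/2)x) - (1/2) coth(x/2)] / S
        a = (S + 0.5) / S
        b = 0.5 / S
        return a / np.tanh((S + 0.5) * x) - b / np.tanh(0.5 * x)

    S = 1.5
    print(brillouin(S, 20.0))                  # ~ 1.0: saturation, eq. (9.13)
    x = 1e-3
    print(brillouin(S, x), (S + 1) * x / 3)    # both ~ (S+1)x/3, eq. (9.14)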

9.3.2 Average Energy

We can also use result (9.3) to calculate the time-average energy of the magnetic particle in the applied field,

    ε̄ = Σ_r P_r ε_r = (1/z) Σ_r ε_r e^{−βε_r} = −(1/z) ∂/∂β Σ_r e^{−βε_r} = −∂ ln z/∂β    (9.12)

which gives

    ε̄ = −Sε B_S(βε)

9.4 Properties of the Brillouin Functions

All of the properties of the paramagnetic system for spin S can be expressed in terms of the Brillouin functions B_S(x). In this section, we examine their properties. The parameter

    x = βε = ε/kT

represents a ratio of characteristic energies. The energy ε = g μ_u B is a characteristic magnetic potential energy, and represents the spacing between the energy levels available to the single magnetic particle. On the other hand, kT is the characteristic thermal energy of the system at temperature T. The behaviour of the system is thus determined by the ratio of the magnetic energy level spacing to the thermal energy of the particle. We examine the behaviour of B_S(x) in the limits x ≫ 1 and x ≪ 1.

When x ≫ 1, that is, when ε ≫ kT, we have e^x ≫ e^{−x}, so that

    coth x = (e^x + e^{−x})/(e^x − e^{−x}) ≈ e^x/e^x = 1    for x ≫ 1

and hence

    B_S(x) ≈ (1/S) [(S + 1/2) − 1/2] = 1    (9.13)

When x ≪ 1, that is, when ε ≪ kT, we have

    coth x = (e^x + e^{−x})/(e^x − e^{−x})
           = (1 + x²/2 + ⋯)/(x + x³/6 + ⋯)
           = (1/x)(1 + x²/2 + ⋯)(1 + x²/6 + ⋯)^{−1}
           = (1/x)(1 + x²/2 + ⋯)(1 − x²/6 + ⋯)
           = (1/x)(1 + x²/3 + ⋯)

so that

    coth x ≈ 1/x + x/3 + ⋯

and hence

    B_S(x) = (1/S) [(S + 1/2)(1/((S + 1/2)x) + (S + 1/2)x/3 + ⋯) − (1/2)(2/x + x/6 + ⋯)]
           = (1/S) [1/x + (S + 1/2)²x/3 − 1/x − x/12 + ⋯]
           ≈ (x/3)(S + 1)    for x ≪ 1    (9.14)

It is easy to show that B_S(x) is monotonic increasing. It thus rises linearly when x is small and rapidly saturates to the value 1. The function B_S(x) is graphed in Figure 9.1 for several values of S.

Figure 9.1. B_S(x) as a function of x for different values of S.

9.5 Properties of Paramagnetic Solids of Spin S

Result (9.10) gives the time-average magnetic moment of a single particle in the magnetic solid as a function of temperature. We can use this result to infer the average properties of the entire paramagnetic solid. Suppose the solid contains N magnetic atoms. Then at any given time the magnetic atoms of the solid will be distributed over the states available to the single particles in the same proportions as the proportion of time spent by a given single atom in each state. The time average of a parameter for a single particle will thus be the same as the average value of that parameter per atom at a given time. So, for example, the average value of the z-component of total magnetic moment of the solid will be

    M̄_z = N μ̄_z = N μ B_S(βε) = N g μ_u S B_S(βε)    (9.15)

For ε ≪ kT, this relation gives

    M̄_z ≈ N g μ_u S · (S + 1)βε/3 = N g² μ_u² B S(S + 1)/3kT    (9.16)

The magnetic susceptibility χ of the material is defined by the relation

    m⃗ = χ H⃗

where m⃗ is the magnetic moment of the material per unit volume, and H⃗ is the applied field. Since m = M̄/V, where V is the volume of the material, and for paramagnetic substances B ≈ μ₀H, we have from (9.16),

    χ = (N/V) g² μ_u² μ₀ S(S + 1)/3k · 1/T = C/T    (9.17)

which is Curie's Law with

    C = (N/V) g² μ_u² μ₀ S(S + 1)/3k

At the opposite limit, where ε ≫ kT, relation (9.15) becomes

    M̄_z = N g μ_u S    (9.18)

The magnetic moment therefore saturates, and no further increase of the applied magnetic field can increase it further. This is expected since, at saturation, all of the atomic dipole moments have been aligned and no further alignment can take place. The magnetic moment of the sample has thus reached the maximum value that it can have.

Figure 9.2 shows theoretical plots of the predicted magnetic moment per ion versus B/T for three substances, together with the corresponding experimentally measured values. The data is taken from Henry, W.E., Phys. Rev., 88, 561 (1952). The substances are, I. potassium chromium alum, II. iron ammonium alum, and III. gadolinium sulphate octahydrate.

Figure 9.2. Magnetic moment per ion versus B/T for three different samples.

Note that the theory developed here is equally valid whether the magnetic moment of the atoms is due to unpaired orbital electrons or to the magnetic moment of the atomic nucleus. The difference between electronic and nuclear paramagnetism is in magnitude only. In electronic paramagnetism, μ is of the order of a Bohr magneton. In nuclear paramagnetism, it is of the order of a nuclear magneton, which is smaller than the Bohr magneton by about the ratio of the electron to the nucleon mass, i.e. it is about 1000 times smaller. Nuclear paramagnetism is thus about 1000 times smaller than electronic paramagnetism, and so requires an absolute temperature about 1000 times smaller to achieve the same degree of magnetic alignment.

9.6 Thermodynamics of Paramagnetic Materials

The calculations above were probabilistic. The same results can be obtained by thermodynamic methods. This shows one way in which the canonical formalism is superior to, and more flexible than, the microcanonical formalism: it allows us to take the system as large or as small as we please, and to deduce its properties either by probability theory or by thermodynamics.

In the thermodynamic approach, we take the system to be, not a single magnetic atom in the material, but the material with all of its N magnetic atoms. Denote the quantum states of this system by R. Specification of the state R requires the specification of the state r of each magnetic atom individually. Thus

    R = (r₁, r₂, ..., r_N)

Since there are 2S + 1 states for each atom, the total number of states accessible to the N-atom system is (2S + 1)^N. The partition function for the system is then

    Z = Σ_R e^{−βE_R}    (9.19)

where E_R is the energy of the quantum state R. The paramagnetic atoms are well separated in the material, so their interaction with each other is very weak. We may thus write

    E_R = Σ_{i=1}^{N} ε_{r_i} = ε_{r₁} + ε_{r₂} + ⋯ + ε_{r_N}

The states r_i of the constituent atoms are independent, so the partition function in (9.19) factorises into N identical factors, one for each constituent particle,

    Z = (Σ_{r₁} e^{−βε_{r₁}}) (Σ_{r₂} e^{−βε_{r₂}}) ⋯ (Σ_{r_N} e^{−βε_{r_N}}) = z^N    (9.20)

where z is given by (9.5). The fundamental relation for the N-atom system is thus

    F = −kT ln Z = −NkT ln z = −NkT { ln sinh[(S + 1/2)βε] − ln sinh(βε/2) }

F is a function of T, B and N, as expected. For a magnetic system, the fundamental relation in infinitesimal form is

    dF = −S dT − M dB + μ dN

The equations of state then give S, M and μ as functions of T, B, and N.
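The agreement between the thermodynamic and probabilistic routes can be checked numerically without working out the closed-form equations of state (which are left as exercises below). The sketch that follows uses units with k = gμ_u = 1, so that ε = B; all numerical values are arbitrary test inputs. It differentiates F = −NkT ln z by a central finite difference and compares with (9.15).

    import numpy as np

    def lnz(B, T, S):
        # ln z from (9.5), with x = beta*eps = B/T in these units.
        x = B / T
        return np.log(np.sinh((S + 0.5) * x) / np.sinh(0.5 * x))

    def brillouin(S, x):
        # Brillouin function, equation (9.11).
        return ((S + 0.5) / np.tanh((S + 0.5) * x)
                - 0.5 / np.tanh(0.5 * x)) / S

    N, S, T, B, dB = 1000, 1.5, 1.3, 0.7, 1e-6
    F = lambda b: -N * T * lnz(b, T, S)                  # F = -NkT ln z
    M_thermo = -(F(B + dB) - F(B - dB)) / (2 * dB)       # M = -(dF/dB)_{T,N}
    M_prob = N * S * brillouin(S, B / T)                 # equation (9.15), mu = S here
    print(M_thermo, M_prob)   # agree to finite-difference accuracy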

Exercises

1.  Find S as a function of T, B, and N.

2.  Find E as a function of T, B, and N, and hence find the heat capacity C_B of the paramagnetic system. Discuss the x ≪ 1 and x ≫ 1 approximations of C_B, and sketch C_B/Nk as a function of 1/x = kT/ε.

3.  Find M as a function of T, B, and N, and compare your answer with that already obtained by probabilistic methods.

Chapter 10

Canonical Formalism in Classical Models
In classical mechanics, the state of a system with f degrees of freedom is specified by a phase point (q¹, ..., q^f, p₁, ..., p_f) in the phase space of the system. The energy of the state represented by the phase point (q¹, ..., q^f, p₁, ..., p_f) is given by a function of the form

    E = E(q¹, ..., q^f, p₁, ..., p_f)    (10.1)

The set of states accessible to a classical system is continuous, and thus non-denumerable. This makes it inconvenient for statistical analysis. The simplest way to proceed is by replacing the set of classical states by another set which is discrete, and therefore denumerable. This is done by following the same procedure as was used in the microcanonical formalism: rectangulate the phase space by dividing it into cells of equal phase volume h^f, and regard each phase cell as representing, in an approximate way, one state of the classical system.

Consider now the energy of the state represented by a given phase cell. Let (q¹, ..., q^f, p₁, ..., p_f) be any phase point contained in the given phase cell. The energy of the state represented by this phase point is given by E(q¹, ..., q^f, p₁, ..., p_f). If the function E varies sufficiently slowly over the points of the cell, then to good approximation we may take this energy to be the energy of the state represented by the given cell. The requirement that E vary slowly over the cell is essentially a restriction on the permissible cell size h^f. If a given cell size does not meet this requirement adequately, we can always improve the approximation by making h smaller.
Now put the system into thermal contact with an heat reservoir at temperature T. The system is able to exchange energy freely with the reservoir in the form of heat. As it does so, it will move from one accessible state to another, spending a time in each which is proportional to the probability of it being found in that state. The results deduced in the general theory of the canonical formalism are independent of whether the system considered is a classical or quantum system, and rely only on the assumption that the states are discrete and hence denumerable. Though we cannot apply these results directly to the continuum of states of a classical system, they are easily applied to the discrete approximation of the states by phase cells. The probability of finding the system in a state represented by a cell that contains the phase point (q¹, ..., q^f, p₁, ..., p_f) is thus proportional to

    e^{−βE(q¹,...,q^f,p₁,...,p_f)}    (10.2)

This fact enables us to define a probability density for the phase space as follows. Consider a phase cell with sides dq¹, dq², ..., dq^f, dp₁, ..., dp_f, chosen in such a way that the dq^i, dp_i are sufficiently small for E not to vary substantially over the points of the cell, but yet sufficiently large for the cell d^f q d^f p to contain many phase cells of volume h^f. Then the total probability of finding the system in one of these states is proportional to

    e^{−βE(q¹,...,q^f,p₁,...,p_f)} d^f q d^f p / h^f    (10.3)

We define the probability density P(q¹, ..., q^f, p₁, ..., p_f) for the system to be the probability per unit phase volume that it has phase coordinates in the range (q¹, ..., q^f, p₁, ..., p_f) to (q¹ + dq¹, ..., q^f + dq^f, p₁ + dp₁, ..., p_f + dp_f). Thus,

    P(q¹, ..., q^f, p₁, ..., p_f) dq¹ ⋯ dq^f dp₁ ⋯ dp_f = c e^{−βE(q¹,...,q^f,p₁,...,p_f)} dq¹ ⋯ dq^f dp₁ ⋯ dp_f / h^f    (10.4)

which we abbreviate in an obvious way to

    P(q, p) d^f q d^f p = c e^{−βE(q,p)} d^f q d^f p / h^f    (10.5)

This is the continuous probability distribution that replaces the discrete distribution

    P_r = e^{−βE_r}/Z    (10.6)

in the case that the system has a continuous set of states rather than a discrete one.

The partition function for a continuous state space is obtained in a manner analogous to that used in the case where the set of accessible states is discrete. Evaluate the constant c using the condition that

    ∫_M P(q, p) d^f q d^f p = 1    (10.7)

where M is the phase space for the system. The classical partition function is thus Z = 1/c and is given by

    Z = (1/h^f) ∫_M e^{−βE(q,p)} d^f q d^f p    (10.8)

It can be shown, by methods analogous to those used in the discrete case, that all properties of the system can be calculated from Z. Z thus provides a representation of the fundamental relation for the classical system through the seminal relation

    F = −kT ln Z    (10.9)

where F is a function of T through β, of the external parameters of the system through E, and of the number N of particles in the system, also through E.

Note that the choice of cell size for the rectangulation of the phase space of the system does not affect the fundamental relation (10.9) in an essential way. The cell size h contributes an arbitrary additive constant to F. This constant vanishes on differentiation, and so affects only the zero point of the function F. This is consistent with classical thermodynamics, where F is defined only up to an arbitrary additive constant.

The logic leading to equation (10.8) is tricky and convoluted. You may prefer to consider (10.9) as a fundamental postulate of classical statistical mechanics, with h an arbitrary constant.

The method of the canonical formalism for classical statistical mechanics thus reduces to the following algorithm:

1.  Write down the expression for the energy E(q, p) of the system in state (q¹, ..., q^f, p₁, ..., p_f).

2.  Calculate Z from (10.8), with h arbitrarily chosen.

3.  Calculate all relevant physical parameters, either by statistical methods from the probability distribution (10.5) with c = 1/Z, or by thermodynamic methods from the fundamental relation (10.9).

Note that h cancels out of (10.5) when c is replaced by 1/Z, so P(q, p) is independent of h,

    P(q, p) = e^{−βE(q,p)} / ∫_M e^{−βE(q,p)} d^f q d^f p    (10.10)

This shows that h plays no role at all in determining the statistics of the system.
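The algorithm is easily demonstrated on the simplest possible example. The following sketch (arbitrary units with k = 1, so β = 1/T) applies steps 1-3 to a single one-dimensional harmonic oscillator, evaluating the phase-space integral (10.8) on a grid. The computed canonical average ⟨E⟩ = kT illustrates both the classical equipartition result and the fact that h drops out of all statistical averages.

    import numpy as np

    # Step 1: energy E(q, p) = p^2/2m + m w^2 q^2 / 2 for a 1-D oscillator.
    m, w, T, h = 1.0, 1.0, 2.0, 1.0   # arbitrary test values; h is arbitrary
    beta = 1.0 / T

    q = np.linspace(-20, 20, 801)
    p = np.linspace(-20, 20, 801)
    Q, P = np.meshgrid(q, p)
    E = P**2 / (2 * m) + 0.5 * m * w**2 * Q**2

    # Step 2: Z from equation (10.8), with f = 1.
    dqdp = (q[1] - q[0]) * (p[1] - p[0])
    boltz = np.exp(-beta * E)
    Z = boltz.sum() * dqdp / h

    # Step 3: a statistical average, here <E>, from the distribution (10.10).
    E_mean = (E * boltz).sum() * dqdp / (h * Z)
    print(E_mean, T)   # <E> = kT for the classical oscillator, independent of h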

10.1 Ideal Monatomic Gas

Consider a gas of N point particles contained in volume V at absolute temperature T. Point particles, by definition, have no internal structure, and thus no internal energy. The only degrees of freedom of this system are translational. Its phase coordinates are (r⃗_A, p⃗_A), where A = 1, ..., N. The energy of the gas in state (r⃗_A, p⃗_A) is

    E(r⃗₁, ..., r⃗_N, p⃗₁, ..., p⃗_N) = Σ_{A=1}^{N} p⃗_A²/2m + Φ(r⃗₁, ..., r⃗_N)    (10.11)

If we assume also that there is no interaction between the particles, then Φ(r⃗₁, ..., r⃗_N) = 0 inside the container and infinite outside the container. The energy is then

    E(r⃗₁, ..., r⃗_N, p⃗₁, ..., p⃗_N) = Σ_{A=1}^{N} p⃗_A²/2m    (10.12)

and the partition function for this system is given by

    Z = (1/h^{3N}) ∫ e^{−β Σ_A p⃗_A²/2m} d^{3N}r⃗ d^{3N}p⃗    (10.13)
Each integral in (10.13) is independent and may be performed separately, so Z becomes

    Z = (1/h^{3N}) ∫_V d³r⃗₁ ⋯ ∫_V d³r⃗_N ∫_{R³} e^{−βp⃗₁²/2m} d³p⃗₁ ⋯ ∫_{R³} e^{−βp⃗_N²/2m} d³p⃗_N
      = (1/h^{3N}) V^N [∫_{R³} e^{−βp⃗²/2m} d³p⃗]^N

The momentum integral is easily evaluated,

    ∫_{R³} e^{−βp⃗²/2m} d³p⃗ = ∫∫∫ e^{−β(p_x² + p_y² + p_z²)/2m} dp_x dp_y dp_z = [∫_{−∞}^{+∞} e^{−βp²/2m} dp]³

But

    ∫_{−∞}^{+∞} e^{−βp²/2m} dp = √(2m/β) ∫_{−∞}^{+∞} e^{−x²} dx = √(2πm/β)

so

    ∫_{R³} e^{−βp⃗²/2m} d³p⃗ = (2πmkT)^{3/2}

and hence

    Z = V^N (2πmkT/h²)^{3N/2}    (10.14)

The fundamental relation for the system is thus

    F = −kT ln Z = −NkT [ln V + (3/2) ln T + (3/2) ln(2πmk/h²)]    (10.15)

From this, we may calculate the equations of state for the system. The fundamental equation in infinitesimal form is

    dF = −S dT − P dV + μ dN

and so we get:

1.  Entropy equation:

        S = −(∂F/∂T)_{V,N} = Nk [ln V + (3/2) ln T + (3/2) ln(2πmk/h²)] + NkT · (3/2)(1/T)
          = Nk ln[V T^{3/2} (2πmk/h²)^{3/2}] + (3/2)Nk    (10.16)

2.  Equation of state:

        P = −(∂F/∂V)_{T,N} = NkT/V

    or

        PV = NkT    (10.17)

3.  Chemical potential:

        μ = (∂F/∂N)_{T,V} = −kT [ln V + (3/2) ln T + (3/2) ln(2πmk/h²)]
          = −kT ln[V T^{3/2} (2πmk/h²)^{3/2}]    (10.18)

4.  Energy equation: since F = E − TS, we have

        E = F + TS = −NkT ln[V T^{3/2} (2πmk/h²)^{3/2}] + NkT ln[V T^{3/2} (2πmk/h²)^{3/2}] + (3/2)NkT    (10.19)

    which gives

        E = (3/2)NkT
As in the microcanonical formalism for the classical ideal gas, the above expressions for F, S and μ cannot be correct, in spite of the fact that we have obtained the correct mechanical equation of state and the correct heat equation from the model. F and S should be extensive quantities, and so homogeneous functions of degree 1 in the extensive variables V and N. Also, μ is an intensive variable and so should be an homogeneous function of degree zero in V and N. It is clear by inspection that they are not. The fault lies with their dependence on V. To obtain expressions of the correct form for F, S and μ we need, as in the microcanonical formalism and for the same reasons, to replace V by V/N. This amounts to admitting that we have used an incorrect formula for the partition function of the gas. We should have used, in place of (10.14), the function

    Z̃ = (1/N!) Z = (1/N!) (1/h^f) ∫ e^{−βE(q,p)} d^f q d^f p    (10.20)

The additional factor of 1/N! is due to an over-counting of states accessible to the gas. This over-counting occurred in the argument leading up to equation (10.5). There, we used the fact that the number of states accessible to the system in the range q, p to q + dq, p + dp is d^f q d^f p/h^f. This is correct in situations in which the particles are distinguishable. In the case of a gas however, we have already seen that the particles must be reckoned to be indistinguishable. So this expression leads to the incorrect answer. The "correct counting" of accessible states thus leads to the result that, for a gas, the correct partition function is given by equation (10.20).

Note that, for systems in which the particles are distinguishable, the correct partition function is still that given by equation (10.8).
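The extensivity argument can be made concrete numerically. The sketch below (arbitrary units with k = h = m = 1) evaluates the entropy (10.16) directly and then with the Gibbs ln N! correction implied by (10.20), showing that only the corrected entropy doubles when N and V are doubled.

    import numpy as np
    from math import lgamma

    def S_wrong(N, V, T):
        # Equation (10.16) with k = h = m = 1:
        # S = N ln[V T^{3/2} (2 pi)^{3/2}] + (3/2) N
        return N * (np.log(V * T**1.5 * (2 * np.pi)**1.5) + 1.5)

    def S_gibbs(N, V, T):
        # Subtract k ln N! (computed exactly via lgamma, no Stirling needed).
        return S_wrong(N, V, T) - lgamma(N + 1.0)

    N, V, T = 1e4, 2e4, 1.0
    print(S_wrong(2*N, 2*V, T) - 2*S_wrong(N, V, T))   # ~ 2N ln 2, not extensive
    print(S_gibbs(2*N, 2*V, T) - 2*S_gibbs(N, V, T))   # ~ 0, up to O(ln N)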

10.2 Monatomic Gas: Another Method

Instead of considering the entire gas as the system, as we did above, we can focus attention on a single gas particle and consider this to be the system. The gas particle is then in thermal interaction with the rest of the gas, and with the heat reservoir. At equilibrium, both are at temperature T. So the single particle may be considered to be in thermal contact with an enlarged heat reservoir at temperature T.

If the particle is a point particle, it has no internal structure and its accessible states are specified by the coordinates (r⃗, p⃗). The phase space for the single particle is thus 6-dimensional. We may now rectangulate this phase space as before into cells of equal phase volume h³ and regard each cell as representing a single state of the particle. Arguments analogous to those given above then lead us to the probability density P(r⃗, p⃗) for the particle to be in state (r⃗, p⃗), given by

    P(r⃗, p⃗) = c (1/h³) e^{−βε(r⃗,p⃗)}

The probability of the particle having position in the range r⃗ to r⃗ + dr⃗ and momentum p⃗ to p⃗ + dp⃗ is thus

    P(r⃗, p⃗) d³r⃗ d³p⃗ = c (1/h³) e^{−βε(r⃗,p⃗)} d³r⃗ d³p⃗

The partition function for the single particle is therefore given by

    z = (1/h³) ∫_M e^{−βε(r⃗,p⃗)} d³r⃗ d³p⃗

where M is the 6-dimensional phase space of the single particle.

If we assume that the interaction of the particle with the reservoir is weak, so that its interaction energy is negligible when compared with its total energy, then its total energy is

    ε(r⃗, p⃗) = p⃗²/2m + Φ(r⃗)

In the absence of external fields, this becomes

    ε(r⃗, p⃗) = p⃗²/2m

so that

    z = (1/h³) ∫_M e^{−βp⃗²/2m} d³r⃗ d³p⃗ = (1/h³) ∫_V d³r⃗ ∫_{R³} e^{−βp⃗²/2m} d³p⃗ = (1/h³) V (2πmkT)^{3/2}

The momentum integral has been evaluated in the same way as before. Thus

    z = V (2πmkT/h²)^{3/2}    (10.21)

For the entire gas, were we to treat the particles as distinguishable, we would have

    Z = z^N = V^N (2πmkT/h²)^{3N/2}

which is the same incorrect partition function obtained before. However, if we treat the particles as indistinguishable, and assume that no two particles are likely to be found in the same state, we obtain

    Z = (1/N!) z^N = (1/N!) V^N (2πmkT/h²)^{3N/2}

which is the corrected partition function of the previous section.

10.3 Maxwell's Distribution

Maxwell's distribution determines the probability, in a gas at temperature T, that a molecule of the gas has velocity in the range v⃗ to v⃗ + dv⃗. Maxwell arrived at his result by arguments based on kinetic theory (Jeans, 1940, Appendix I, p 296). The same result can be derived more easily using the canonical formalism.

Consider a particle of mass m in a dilute gas. The gas need not be monatomic, or even consist of a single chemical species. Denote the position of the center of mass of the particle by r⃗, the momentum of its center of mass by p⃗, and the coordinates and momenta of its internal degrees of freedom by (Q_α, P_α), where α = 1, ..., s. The states accessible to this particle are thus specified by the variables (r⃗, p⃗, Q_α, P_α). The phase space for the single particle is thus (6 + 2s)-dimensional.

If there are no external forces on the particle, its energy is given by

    ε = p⃗²/2m + ε_int(Q, P)    (10.22)

The first term on the right is the kinetic energy of the center of mass motion. The second arises only if the molecule is not monatomic, and includes rotational and vibrational energies of the constituent atoms of the molecule with respect to the molecular center of mass. These may be treated either by classical or quantum methods. Note that we have assumed that the particle does not interact, or else interacts only weakly, with the other particles in the gas, so ε does not depend on r⃗.

When the gas is in thermal contact with an heat reservoir at temperature T, the given gas particle may be considered to interact weakly with the system consisting of the reservoir and the rest of the gas. The probability that the particle is in an accessible state that lies in the range (r⃗, p⃗, Q_α, P_α) to (r⃗ + dr⃗, p⃗ + dp⃗, Q_α + dQ_α, P_α + dP_α) is thus

    P(r⃗, p⃗, Q, P) d³r⃗ d³p⃗ dˢQ dˢP = c (1/h^{3+s}) e^{−β[p⃗²/2m + ε_int(Q,P)]} d³r⃗ d³p⃗ dˢQ dˢP
                                   = c (1/h³) e^{−βp⃗²/2m} d³r⃗ d³p⃗ · (1/hˢ) e^{−βε_int(Q,P)} dˢQ dˢP    (10.23)

Every statistic for the gas particle can be calculated from distribution (10.23). For example, we may be interested in the probability that the particle is found with position in the range r⃗ to r⃗ + dr⃗ and with momentum in the range p⃗ to p⃗ + dp⃗, irrespective of its internal state. This probability is obtained by adding the probabilities given by (10.23) for all possible internal states, and is thus given by

    P(r⃗, p⃗) d³r⃗ d³p⃗ = [∫_{Q,P} P(r⃗, p⃗, Q, P) dˢQ dˢP] d³r⃗ d³p⃗
                     = c (1/h³)(1/hˢ) [∫_{Q,P} e^{−β[p⃗²/2m + ε_int(Q,P)]} dˢQ dˢP] d³r⃗ d³p⃗
                     = c (1/h³) e^{−βp⃗²/2m} d³r⃗ d³p⃗ · (1/hˢ) ∫_{Q,P} e^{−βε_int(Q,P)} dˢQ dˢP    (10.24)

The integral over the internal coordinates and momenta yields a factor independent of r⃗ and p⃗, which depends on T through β, and on the system parameters that enter into ε_int. We do not need to evaluate it explicitly since, if we incorporate it with the coefficient c, we may then evaluate the new coefficient c′ from the normalisation condition for probabilities. Thus, writing

    P(r⃗, p⃗) d³r⃗ d³p⃗ = c′ (1/h³) e^{−βp⃗²/2m} d³r⃗ d³p⃗

we may evaluate c′ by using the condition that

    1 = ∫_{p⃗} ∫_{r⃗} P(r⃗, p⃗) d³r⃗ d³p⃗ = c′ (1/h³) ∫_{p⃗} ∫_{r⃗} e^{−βp⃗²/2m} d³r⃗ d³p⃗

The advantage of calculating c′ by this method is that we do not need to evaluate explicitly the integral over the internal coordinates and momenta.
To obtain the probability that the particle has velocity in the range v⃗ to v⃗ + dv⃗, we calculate first the probability that it has momentum in the range p⃗ to p⃗ + dp⃗. For this, we need to sum the probabilities for all accessible states in this range. We have already summed the probabilities over all possible internal coordinates. We thus need only sum (10.24) over all possible positions of the particle. This gives

    P(p⃗) d³p⃗ = ∫_V P(r⃗, p⃗) d³r⃗ d³p⃗ = c′ (1/h³) [∫_V d³r⃗] e^{−βp⃗²/2m} d³p⃗ = c′ (V/h³) e^{−βp⃗²/2m} d³p⃗

Combining c′ with the new factors which are independent of p⃗, we get a new factor which we denote by C. C is a function of the external parameters of the system and of T. The desired probability distribution is thus given by

    P(p⃗) d³p⃗ = C e^{−βp⃗²/2m} d³p⃗    (10.25)

Without knowing c′ explicitly, we can evaluate C from the condition that

    1 = ∫_{p⃗} P(p⃗) d³p⃗ = C ∫_{p⃗} e^{−βp⃗²/2m} d³p⃗ = C (2πmkT)^{3/2}

so that

    P(p⃗) d³p⃗ = (1/2πmkT)^{3/2} e^{−βp⃗²/2m} d³p⃗

To obtain the probability that the particle has velocity in the range v⃗ to v⃗ + dv⃗, we simply change the variable in (10.25) from p⃗ to v⃗ = p⃗/m, giving

    P(v⃗) d³v⃗ = (1/2πmkT)^{3/2} e^{−βmv⃗²/2} m³ d³v⃗

or

    P(v⃗) d³v⃗ = (m/2πkT)^{3/2} e^{−βmv⃗²/2} d³v⃗    (10.26)

This is Maxwell's distribution of velocities for particles of mass m in an ideal gas in equilibrium at temperature T. It gives the probability that a molecule in the gas with mass m has velocity in the range v⃗ to v⃗ + dv⃗.

Maxwell's distribution is often expressed in terms of numbers of particles, or of particle densities. If the total number of molecules in the gas of the species considered is N, then the total number of particles of this species in the gas which have velocity in the range v⃗ to v⃗ + dv⃗ is given by

    N(v⃗) d³v⃗ = N P(v⃗) d³v⃗ = N (m/2πkT)^{3/2} e^{−βmv⃗²/2} d³v⃗

Equivalently, if the total number density of molecules in the gas of the species considered is n = N/V, then the total number of particles per unit volume of this species in this gas with velocity in the range v⃗ to v⃗ + dv⃗ is given by

    n(v⃗) d³v⃗ = n P(v⃗) d³v⃗ = n (m/2πkT)^{3/2} e^{−βmv⃗²/2} d³v⃗

Two other distributions follow immediately from Maxwell's. They are the distribution of a component of velocity of a particle, and the distribution of its speed. We discuss these individually.

10.3.1 Distribution of a Component of Velocity

Consider first the probability that the given particle has a component of velocity, say the x-component, in the range v_x to v_x + dv_x. We obtain this by adding the probabilities P(v⃗) d³v⃗ for all values of v_y and v_z. Thus

    P(v_x) dv_x = ∫_{v_y=−∞}^{+∞} ∫_{v_z=−∞}^{+∞} P(v⃗) dv_z dv_y dv_x
                = (m/2πkT)^{3/2} [∫_{−∞}^{+∞} e^{−βmv_y²/2} dv_y] [∫_{−∞}^{+∞} e^{−βmv_z²/2} dv_z] e^{−βmv_x²/2} dv_x
                = (m/2πkT)^{3/2} (2πkT/m) e^{−βmv_x²/2} dv_x

so that

    P(v_x) dv_x = (m/2πkT)^{1/2} e^{−mv_x²/2kT} dv_x    (10.27)

Equation (10.27) shows that a component of velocity is distributed with a Gaussian distribution, symmetric about the value v_x = 0.

The mean value of v_x is zero. Physically, this is obvious. There is nothing to favour one direction above another, so the component of velocity in any given direction is as likely to be positive as it is to be negative. Its mean value must therefore be zero. Mathematically, this follows from the fact that, in the integral

    v̄_x = ∫_{−∞}^{+∞} v_x P(v_x) dv_x = (m/2πkT)^{1/2} ∫_{−∞}^{+∞} v_x e^{−mv_x²/2kT} dv_x

the integrand is an odd function of v_x. The same is true of the mean value of all odd powers of v_x. The mean square of v_x however is given by (show as an exercise)

    v̄_x² = ∫_{−∞}^{+∞} v_x² P(v_x) dv_x = kT/m

The root mean square width of the Gaussian (10.27) is thus

    Δv_x = √(v̄_x²) = √(kT/m)

So, the lower the temperature, the smaller the width of the distribution. Note also from (10.26) and (10.27) that

    P(v⃗) d³v⃗ = [P(v_x) dv_x][P(v_y) dv_y][P(v_z) dv_z]

which shows that the three components of velocity are statistically independent.

10.3.2 Distribution of Speed

The distribution of speed for molecules of mass m in an ideal gas is also obtained from (10.26). First, express the velocity in terms of polar coordinates, and then sum the probabilities for given v over all polar angles θ and φ. Thus, from (10.26),

    P(v) dv = ∫_{θ=0}^{π} ∫_{φ=0}^{2π} P(v⃗) d³v⃗
            = (m/2πkT)^{3/2} ∫_{θ=0}^{π} ∫_{φ=0}^{2π} e^{−βmv²/2} v² sin θ dv dθ dφ
            = (m/2πkT)^{3/2} · 2π [∫_{θ=0}^{π} sin θ dθ] e^{−βmv²/2} v² dv

so that

    P(v) dv = (m/2πkT)^{3/2} 4πv² e^{−βmv²/2} dv    (10.28)

This is Maxwell's distribution of speeds for particles of mass m in an ideal gas in equilibrium at temperature T. It gives the probability that a particle of mass m will have its speed in the range v to v + dv.
The mean values for this distribution are as follows (show these results as an exercise). The mean speed
v is given by
.

8kT
v=
vP (v) dv =
(10.29)
m
0
The mean square speed v 2 is given by

v2 =

v2 P (v) dv =

3kT
m

(10.30)

so the root mean square speed vrms is given by


vrms =

v2 =

3kT
m

(10.31)

Finally, the most probable speed of the particle is that speed for which (10.28) reaches its maximum, and is
given by
.
2kT
vmax =
(10.32)
m
Note that the most probable speed and the root mean square speed do not coincide. The distribution is thus
skewed. The fact that vrms > vmax means that the distribution is skewed toward the higher speeds.
All the characteristic speeds are functions of $\sqrt{kT/m}$. Thus, for a given mass, the higher the temperature, the higher the speed; and at a given temperature, the greater the particle mass, the lower its speed. In an ideal gas mixture, therefore, the heavier molecules have, on average, the lower speeds. In the earth's atmosphere, hydrogen and helium have speeds such that a large proportion of these particles move faster than the escape velocity for the earth's gravitational field. In contrast, the molecules of heavier gases rarely reach the escape velocity. This explains why the abundance of hydrogen and helium is far lower than that of other gases.
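The characteristic speeds (10.29) to (10.32) are easily confirmed numerically; the sketch below reuses the illustrative mass and temperature values assumed in the earlier component-distribution sketch.

```python
# Numerical check of (10.29)-(10.32) for the Maxwell speed distribution.
import numpy as np
from scipy import integrate, optimize

k, m, T = 1.380649e-23, 6.6e-26, 300.0    # k is standard; m, T assumed

def P_v(v):
    """Maxwell speed distribution (10.28)."""
    return (m / (2.0*np.pi*k*T))**1.5 * 4.0*np.pi * v**2 * np.exp(-m*v**2/(2.0*k*T))

v_mean, _ = integrate.quad(lambda v: v * P_v(v), 0.0, np.inf)
v_msq, _ = integrate.quad(lambda v: v**2 * P_v(v), 0.0, np.inf)
print(v_mean, np.sqrt(8.0*k*T/(np.pi*m)))    # mean speed, (10.29)
print(np.sqrt(v_msq), np.sqrt(3.0*k*T/m))    # rms speed, (10.31)

# Most probable speed: locate the maximum of P_v and compare with (10.32).
res = optimize.minimize_scalar(lambda v: -P_v(v), bounds=(1.0, 2000.0), method="bounded")
print(res.x, np.sqrt(2.0*k*T/m))             # both ~ v_max
```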

10.4 Gas in a Gravitational Field


In a gravitational field, the molecules of a gas also have potential energy. For simplicity, consider a gas in volume V in thermal contact with a heat reservoir at temperature T. Focus attention on one molecule in the gas, and denote its mass by m. The energy of the molecule is then

$\epsilon = \frac{p^2}{2m} + mgz + \epsilon_{\rm int}$

where we have chosen the z-axis in the vertical direction in the gravitational field. Denote the internal coordinates and momenta of the particle by $Q_\alpha$ and $P_\alpha$, $\alpha = 1, ..., s$, respectively. Then, the probability that the particle is in an accessible state that lies in the range $(\vec r, \vec p, Q_\alpha, P_\alpha)$ to $(\vec r + d\vec r, \vec p + d\vec p, Q_\alpha + dQ_\alpha, P_\alpha + dP_\alpha)$ is

$P(\vec r, \vec p, Q, P)\,d^3\vec r\,d^3\vec p\,d^sQ\,d^sP = c\,\frac{1}{h^{3+s}}\,e^{-\beta[p^2/2m + mgz + \epsilon_{\rm int}(Q,P)]}\,d^3\vec r\,d^3\vec p\,d^sQ\,d^sP$

$= c\,\frac{1}{h^3}\,e^{-\beta p^2/2m}\,e^{-\beta mgz}\,d^3\vec r\,d^3\vec p\;\frac{1}{h^s}\,e^{-\beta\epsilon_{\rm int}(Q,P)}\,d^sQ\,d^sP$

Therefore, the probability that the particle is in an accessible state that lies in the range $(\vec r, \vec p)$ to $(\vec r + d\vec r, \vec p + d\vec p)$, whatever its internal state, is

$P(\vec r, \vec p)\,d^3\vec r\,d^3\vec p = c'\,\frac{1}{h^3}\,e^{-\beta p^2/2m}\,e^{-\beta mgz}\,d^3\vec r\,d^3\vec p$

where the constant $c'$ now includes the integral over the internal coordinates and momenta.
Unlike the field-free case, this probability now depends also on the coordinate z.
Consider now the probability that the molecule has momentum in the range $\vec p$ to $\vec p + d\vec p$. This is given by

$P(\vec p)\,d^3\vec p = \left[\int_{\vec r} P(\vec r, \vec p)\,d^3\vec r\right] d^3\vec p = C\,e^{-\beta p^2/2m}\,d^3\vec p$
which is exactly the same distribution as was obtained in the absence of the gravitational field. Thus the
momentum, velocity, component of velocity, and speed distributions are identical to those in the absence
of the gravitational field and need not be reexamined.
At first, this may seem to be a surprising result. One might be inclined to think that the component of
velocity in the z-direction might be preferentially downwards because of the action of the gravitational
field. The corresponding distributions might thus be expected to be skewed in the negative z-direction.
This however is not so. In fact, if a gas in a field-free equilibrium configuration is put into a gravitational
field, it will not be in equilibrium in the field. The molecules will move preferentially in the downward
z-direction. This will continue until equilibrium is established. Once equilibrium is established however,
the parameters that characterise the system will cease to change with time. In particular, its density will
have stabilised. Once this has happened, we can no longer have preferential motion in the downward
direction, otherwise the gas density will continue to change with time. The velocity distribution must thus
be once again as it was in the absence of the field, with no preferred direction for the velocity.
What is different from the field-free case is the probability that a molecule is at position $\vec r$. In the absence of a field, the molecule was equally likely to be anywhere in the volume V in which the gas is contained, since $P(\vec r)\,d^3\vec r$ was constant. This is no longer true. We have

$P(\vec r)\,d^3\vec r = \left[\int_{\vec p} P(\vec r, \vec p)\,d^3\vec p\right] d^3\vec r = C\,e^{-\beta mgz}\,d^3\vec r$

This probability is independent of x and y, and so the probability distribution for each of these coordinates
is a constant. The particle is thus equally likely to be at any horizontal position within the container.
However, $P(\vec r)$ depends on z. The greater the value of z, the less likely it is that the molecule will be at that position. The probability distribution for the z-coordinate of the molecule is

$P(z)\,dz = \int_x\int_y P(\vec r)\,dx\,dy\;dz = C'\,e^{-\beta mgz}\,dz$

The constant $C'$ may be found from the normalisation condition

$1 = \int_z P(z)\,dz$

The probability of finding a particle at height z thus decreases exponentially with height. This means that
the gas is concentrated in the lowest portion of its container. This result is called the law of atmospheres,
because it would describe the variation of density in the atmosphere near the surface of a planet were the
atmosphere in equilibrium at temperature T. Generally, planetary atmospheres, if they exist, are not in equilibrium, and they are also not isothermal. This law thus has only very limited application. The number $N(z)\,dz$ of molecules of mass m, and thus also the number density $n(z)\,dz$ of molecules of mass m, in an ideal gas in a gravitational field in a layer at height z and of thickness dz is proportional to $P(z)\,dz$. Thus

$N(z)\,dz = N\,P(z)\,dz = N C'\,e^{-\beta mgz}\,dz$

and

$n(z)\,dz = \frac{N}{V}\,P(z)\,dz = \frac{N}{V}\,C'\,e^{-\beta mgz}\,dz$   (10.33)
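A minimal sketch of this height distribution for a column of finite height h follows; the molecular mass (roughly N2), the temperature and the column height are illustrative assumptions.

```python
# Sketch of the law of atmospheres: normalised height distribution in a
# box of vertical extent h. The values of m, T and h are assumptions.
import numpy as np

k, g = 1.380649e-23, 9.81
m, T, h = 4.7e-26, 300.0, 1.0e4      # ~N2 molecule, 300 K, 10 km column

a = m * g / (k * T)                  # inverse scale height, 1/m
C = a / (1.0 - np.exp(-a * h))       # from 1 = C * integral_0^h exp(-a z) dz

for z in np.linspace(0.0, h, 5):
    print(f"z = {z:8.0f} m   P(z) = {C * np.exp(-a * z):.3e} per metre")
# P(z), and with it the density n(z), decays exponentially with height.
```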

10.5 Equipartition of Energy


The equipartition theorem is a very useful result in classical statistical mechanics which can be established
under general assumptions.
Consider a system with f degrees of freedom, and denote its generalised coordinates and momenta by $q^i$ and $p_i$ respectively, where $i = 1, ..., f$. The energy of the system when in the state $(q^i, p_i)$ is given by a function of the type

$E = E(q^1, ..., q^f, p_1, ..., p_f)$

In many problems it happens that the dependence of E on one or more of the generalised momenta (or generalised coordinates), say $p_1$ to $p_s$, has the special form

$E = \epsilon_1(p_1) + \cdots + \epsilon_s(p_s) + \tilde E(q^1, ..., q^f, p_{s+1}, ..., p_f)$   (10.34)

where for each $\alpha = 1, ..., s$ the term $\epsilon_\alpha(p_\alpha) = a_\alpha p_\alpha^2$ is quadratic in $p_\alpha$ with $a_\alpha$ constant, and the term $\tilde E(q^1, ..., q^f, p_{s+1}, ..., p_f)$ does not depend at all on the $p_\alpha$ (but possibly does depend on the $q^\alpha$). The most common situation of this kind is where the $p_i$ are linear momenta. The kinetic energy terms are then quadratic in the $p_i$, and the potential does not depend at all on them.

For simplicity, we will display explicitly only one of the terms $\epsilon_\alpha(p_\alpha)$, say the term with $\alpha = 1$. We will thus write (10.34) in the form

$E = \epsilon_1(p_1) + \tilde E(q^1, ..., q^f, p_2, ..., p_f)$
The results obtained are easily extended to as many terms as desired.


Put the system in thermal contact with a heat reservoir at temperature T. Since the probability of the system being in state $(q^i, p_i)$ is

$P(q,p)\,d^fq\,d^fp = c\,\frac{1}{h^f}\,e^{-\beta E(q,p)}\,d^fq\,d^fp$   (10.35)

the mean value $\bar\epsilon_1$ of the term $\epsilon_1$ is

$\bar\epsilon_1 = c\,\frac{1}{h^f}\int \epsilon_1(p_1)\,e^{-\beta E(q,p)}\,d^fq\,d^fp = c\,\frac{1}{h^f}\int \epsilon_1(p_1)\,e^{-\beta\epsilon_1(p_1)}\,e^{-\beta\tilde E(q^1,...,q^f,p_2,...,p_f)}\,d^fq\,d^fp$

$= c\,\frac{1}{h^f}\int_{p_1} \epsilon_1(p_1)\,e^{-\beta\epsilon_1(p_1)}\,dp_1 \int e^{-\beta\tilde E(q^1,...,q^f,p_2,...,p_f)}\,d^fq\,d^{f-1}p$   (10.36)

We can evaluate c in the usual way from the normalisation of (10.35) to get

$1 = \int P(q,p)\,d^fq\,d^fp = c\,\frac{1}{h^f}\int e^{-\beta E(q,p)}\,d^fq\,d^fp = c\,\frac{1}{h^f}\int_{p_1} e^{-\beta\epsilon_1(p_1)}\,dp_1 \int e^{-\beta\tilde E(q^1,...,q^f,p_2,...,p_f)}\,d^fq\,d^{f-1}p$   (10.37)

The integral over the remaining coordinates and momenta is common to (10.36) and (10.37) and cancels in the ratio, from which we get

$\bar\epsilon_1 = \frac{\displaystyle\int_{p_1} \epsilon_1(p_1)\,e^{-\beta\epsilon_1(p_1)}\,dp_1}{\displaystyle\int_{p_1} e^{-\beta\epsilon_1(p_1)}\,dp_1}$

We can re-express this result conveniently by noting that

$\int_{p_1} \epsilon_1(p_1)\,e^{-\beta\epsilon_1(p_1)}\,dp_1 = -\frac{\partial}{\partial\beta}\int_{p_1} e^{-\beta\epsilon_1(p_1)}\,dp_1$

so that

$\bar\epsilon_1 = -\frac{\partial}{\partial\beta}\ln\int_{p_1} e^{-\beta\epsilon_1(p_1)}\,dp_1$   (10.38)

We may now evaluate this mean value. With $\epsilon_1(p_1) = a_1 p_1^2$,

$\int_{p_1} e^{-\beta\epsilon_1(p_1)}\,dp_1 = \int_{-\infty}^{+\infty} e^{-\beta a_1 p_1^2}\,dp_1 = \frac{1}{\sqrt{\beta a_1}}\int_{-\infty}^{+\infty} e^{-\xi^2}\,d\xi = \sqrt{\frac{\pi}{\beta a_1}}$

where we have substituted $\xi = \sqrt{\beta a_1}\,p_1$, and so

$\bar\epsilon_1 = -\frac{\partial}{\partial\beta}\ln\sqrt{\frac{\pi}{\beta a_1}} = -\frac{\partial}{\partial\beta}\left[\tfrac{1}{2}\ln\pi - \tfrac{1}{2}\ln\beta - \tfrac{1}{2}\ln a_1\right] = \frac{1}{2\beta}$

or

$\bar\epsilon_1 = \frac{1}{2}kT$   (10.39)
The same result is obtained for each momentum $p_\alpha$ which contributes only a single quadratic term to the energy E of the system. Although the discussion was in terms of the momenta, equation (10.38) is completely general for a system of independent particles, and equation (10.39) is valid for any degree of freedom that gives a quadratic contribution to the energy.

Thus the result can be extended. In certain systems, some generalised coordinates, $q^1, ..., q^t$ say, also each contribute exactly one quadratic term $\epsilon_v(q^v) = b_v (q^v)^2$ to the energy. For these, the average value of $\epsilon_v(q^v)$, by an analogous calculation, is again

$\bar\epsilon_v = \frac{1}{2}kT$   (10.40)

This result is called the theorem of the equipartition of energy. We may state it as follows:
Theorem of the Equipartition of Energy: When a system is in thermal equilibrium with a heat reservoir at temperature T and its external parameters are held fixed, each independent generalised position or momentum coordinate which occurs only in a single quadratic term in the expression $E(q,p)$ for the energy of the system contributes an energy $kT/2$ to the mean energy $\bar E$ of the system.

This result is true only for classical systems. It holds only approximately in quantum systems, when the energy levels are spaced in such a way that the inter-level spacing $\Delta E$ is very small compared with the thermal energy $kT$. If $\Delta E \gtrsim kT$, then there is a large discrepancy between the results predicted by the classical equipartition theorem and the actual mean energy of the system.
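The size of this discrepancy is easy to exhibit numerically. The sketch below compares the thermal mean energy of a single oscillator-like degree of freedom with evenly spaced levels (energy measured from the ground state, as in the Einstein model of Chapter 8) with the classical value kT for its two quadratic terms; the units and spacings are illustrative assumptions.

```python
# Mean thermal energy of a degree of freedom with level spacing dE,
# measured from the ground state, versus the classical value kT.
# Units are chosen so that kT = 1; the spacings are illustrative.
import numpy as np

kT = 1.0
for dE in [0.01, 0.1, 1.0, 10.0]:
    x = dE / kT
    e_mean = dE / np.expm1(x)        # dE / (exp(dE/kT) - 1)
    print(f"dE/kT = {x:6.2f}   mean energy / kT = {e_mean/kT:.4f}   (classical: 1)")
# For dE << kT the mean energy approaches the equipartition value kT
# (kT/2 per quadratic term); for dE >~ kT it falls far below it.
```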

Example: Mean kinetic energy of a molecule in an ideal gas.


Let the gas be at fixed temperature T and volume V. The kinetic energy of a molecule is given by a sum of quadratic terms,

$E = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}$   (10.41)

The momentum components of this molecule do not feature anywhere else in the expression for the energy. The average kinetic energy of the molecule is therefore

$\bar K = \frac{3}{2}kT$

If the gas is monatomic and not in an external force field, then the entire energy of the molecule is kinetic, and so its mean energy is

$\bar\epsilon = \frac{3}{2}kT$

If the entire gas consists of N molecules of a single species, then the average total energy of the gas is

$\bar E = \frac{3}{2}NkT$

so that the heat capacity at constant volume is

$C_V = \left(\frac{\partial\bar E}{\partial T}\right)_V = \frac{3}{2}Nk = \frac{3}{2}nR$   (10.42)

Reference:
Jeans, J., 1940, An Introduction to the Kinetic Theory of Gases, Cambridge University Press.


Exercises

1. An ideal monatomic gas is in thermal equilibrium at room temperature T, so that the molecular velocity distribution is Maxwellian.

(a) If v denotes the speed of a molecule, calculate $\overline{1/v}$. Compare this to $1/\bar v$.

(b) Find the mean number of molecules per unit volume whose energy lies in the range between $\epsilon$ and $\epsilon + d\epsilon$.

(c) What is the most probable kinetic energy of molecules with a Maxwellian velocity distribution? Is it equal to $\frac{1}{2}m\tilde v^2$, where $\tilde v$ is the most probable speed of the molecules?

2. Determine the heat capacity of a monatomic gas in a uniform gravitational field, with energy per molecule

$\epsilon = \frac{p^2}{2m} + mgz$

where we have chosen the z-axis in the vertical direction in the gravitational field. You may assume that the gas is in a rectangular box with vertical length h.
Chapter 11
Quantum Theory of Ideal Gases
11.1 Quantum Theory of Gases
So far, when developing the theory of gases, we have assumed that the motion of the particles of the gas is governed by the laws of classical mechanics and that the states accessible to the gas are described by points of a classical phase space. It is known however that the motion of particles is correctly described, not by the laws of classical mechanics, but by those of quantum mechanics. To develop a correct theory of gases therefore, we must replace Newton's equations for the gas system by a Schrödinger equation for the particles involved.

This replacement does not in any way affect the general statistical theory previously developed, nor does it affect the relation between the resulting statistics and the thermodynamic properties of the system. All states accessible to an isolated system are still equally probable, and the fundamental relation for the system is still given by Boltzmann's relation

$S = k\ln\Omega$   (11.1)

Equivalently, if the system is in thermal contact with a heat reservoir at temperature T, the probability of the system being in an accessible state R with energy $E_R$ is still

$P_R = \frac{e^{-\beta E_R}}{Z}$   (11.2)

and the fundamental relation for the system is still given in Helmholtz representation by

$F = -kT\ln Z$   (11.3)

The principal difference between classical and quantum treatments of a given system is in the states R
accessible to the system, and the way in which they are counted. Without entering into a detailed discussion
of quantum mechanics, the essential differences that characterise the classical and quantum theories of
gases can be reduced to two principles: the indistinguishability of gas particles, and the exclusion principle.
Equation (11.2) is valid for a system irrespective of whether the particles in the system interact with each other or not. In the general case, determining the eigenenergies of a system of interacting particles is impossible, and we normally resort to simplified models in which the interactions between particles are neglected as a first approximation. This is sometimes reasonable. The interactions can also be partially incorporated by introducing quasi-particles: these include part of the interactions in their defining properties and are then treated as non-interacting. This works well at low excitation energies, but the interactions between the quasi-particles normally increase in importance at higher excitation energies.

11.1.1 Indistinguishability of particles

Intuition, which is formed on the basis of a lifetime of experience with macroscopic objects, leads us to expect the identical microscopic particles which make up a gas to be distinguishable. This naïve expectation is rudely shattered by the classical theory of gases. The assumption of distinguishability leads to an incorrect partition function which overcounts the states accessible to the gas by a factor of N!. To obtain the correct partition function, we are forced to divide the partition function by N!. The only plausible explanation for this correction factor is that the gas particles are not in fact distinguishable, but indistinguishable, and that there are so many states available to the individual particles that no two or more of them will ever be found in the same state.
There is no rational basis in classical mechanics for the assumption that the gas particles are
indistinguishable. It is a post hoc conclusion that is forced on us by the theory, and that we have to accept even though we can offer no explanation for it.
The situation in the quantum theory of gases is different. There the theory itself forces on us a priori the
fact of indistinguishability. The wave function of a single gas particle is non-zero almost everywhere in the
volume V of the container. The particle it describes thus has non-zero probability of being found in any
given small element of volume in the container. This is true for each of the gas particles. Thus, given an
element of volume within the container, if we observe a gas particle in it, we cannot say which of the gas
particles we have observed. All of them have a finite probability of being found at that position, so it could
have been any one of them.
The indistinguishability of particles in a gas (or a liquid) leads, by a simple application of group
theory, to the result that the state space for the gas system must carry an irreducible representation of
the permutation group of N objects. For large N, there are a very large number of such irreducible
representations. We might therefore expect to find in nature a very large number of particle varieties, each
variety corresponding to one of the possible irreducible representations of the permutation group. Nature
however has been kind. Only two of these representations are found to occur: the fully symmetric, and the
fully antisymmetric representations. These correspond respectively to those particles called Bosons and
Fermions, defined below. The antisymmetric representation, furthermore, leads necessarily to the Pauli
Exclusion Principle, discussed in the next section.
Though quantum theory demands that the particles in a gas be regarded as indistinguishable, it is
nonetheless instructive to consider the effect on the theory of treating the particles as distinguishable. For
this purpose, we shall invent a third category of particles which we shall call Maxwell-Boltzmann particles, and shall consider the statistics of Bosons, of Fermions, and of Maxwell-Boltzmann particles. Remember however that nature does not admit this third category of particles. It is a purely theoretical fiction whose sole purpose is to further our understanding.

11.1.2 The Pauli Exclusion Principle

In the case where the gas particles are indistinguishable and non-interacting, the state of the gas is
completely specified by the occupation numbers nr of the single particle states r. We must therefore now
consider what values these occupation numbers are allowed to take.
The most natural assumption to make is that there is no limit to the number of particles that can be found
in any given single particle state r. For a system of N particles therefore, according to this assumption, we
can have
nr = 0, 1, 2, ..., N
For a system like a photon or a phonon gas, where the number of particles in the system is variable, we will
have
nr = 0, 1, 2, ....
This is the assumption made by Bose in 1924. He was the first person to work out explicitly the statistics of a photon system, and to use this result to derive Planck's radiation law. He sent his paper to Einstein for his opinion. Einstein immediately saw the importance of Bose's work and communicated his paper to the Zeitschrift für Physik. He also saw the implications of Bose's work for particles of non-zero mass and submitted a paper of his own in which he applied Bose's work to an ideal gas. These two papers were published in 1924, and the associated statistics are now known as Bose-Einstein statistics.
At about this time, Pauli was using the new quantum theory of Heisenberg and Schrödinger to work out the theory of multi-electron atoms. He discovered that he could explain the periodicities of the periodic table of the elements by assuming that no two electrons in a multi-electron atom can occupy the same single-particle state simultaneously. This principle, by which occupation of a given single-particle state by one electron excludes the occupation of that same state by any other, is now known as the Pauli exclusion principle. It was published in 1925. The moment Fermi read Pauli's paper, he realised its implications for ideal gases whose particles are electrons and immediately worked out the associated statistics. He submitted a paper on this subject which was published in 1926. While Fermi's paper was in press, Dirac submitted a paper of his own on the same subject. He had independently worked out the same theory as Fermi. The associated statistics are now known as Fermi-Dirac statistics.


The successful application by Fermi and Dirac of their statistics to electron gases showed that not
all particles obey Bose-Einstein statistics. We now distinguish two kinds of particles. Those that obey
Bose-Einstein statistics are called Bosons, while those that obey Fermi-Dirac statistics are called Fermions.
For Fermion systems therefore, the occupation numbers nr are restricted to two values only,
nr = 0, 1
Occupation numbers nr > 1 are not allowed.
Many attempts have been made to work out statistics in which the occupation numbers for single-particle
states are permitted values other than those given above. In spite of extensive searches for systems that
obey these other statistics, none have yet been found. We may therefore take it as an experimental fact
that nature admits only Bosons and Fermions and excludes other theoretically conceivable particle types.
Should particles ever be found in the future which are not adequately described by either Bose-Einstein or
Fermi-Dirac statistics, then this is one place where the current theory might be adjusted to accommodate
the new particles.
The Pauli exclusion principle as described above applies to non-interacting particles. For a system of identical Fermions, the general principle states that the eigenfunctions $\psi(1, 2, 3, ..., N)$ are symmetric under any even permutation of the particles, and antisymmetric under any odd permutation. It reduces to the statement that the occupation numbers $n_r$ are restricted to the two values $n_r = 0, 1$ for non-interacting identical Fermions.

11.1.3 Spin and Statistics

It is an experimental fact that most particles have an angular momentum which is not related to their
motion. They have it even when they are at rest. It is therefore not a property of the orbit of the particle,
but of the particle itself. It is thus called intrinsic angular momentum, to distinguish it from orbital angular
momentum. When originally discovered, it was imagined to be something like the angular momentum of a
spinning top whose centre of mass is at rest. It was thus called the spin of the particle. We now know this
analogy to be false, but the name spin has remained.
Intrinsic angular momentum is quantized in multiples of $\hbar/2$. In quantum theory, the natural unit of angular momentum is $\hbar$. It is thus common to refer to a particle with intrinsic angular momentum $\hbar/2$ as a spin-1/2 particle. Similarly for spin-1, spin-3/2, etc., particles. Particles which have no intrinsic angular momentum are called spin-0 particles. The spin of an elementary particle is fixed and cannot be changed. Thus, all electrons, protons and neutrons have spin 1/2, and all photons have spin 1. The spin of composite particles is determined by the configuration of their constituents.
It is known from experiment that all particles with integral spin obey Bose-Einstein statistics, and
that all particles with half integral spin obey Fermi-Dirac statistics. No exception to this rule is known.
There thus seems to be a fundamental relationship between spin and statistics. This relationship between
spin and statistics was first explained theoretically by Pauli in 1940 in what is now called the Spin and
Statistics Theorem. He showed that this relationship between spin and statistics is a necessary consequence
of demanding that the quantum theory of fields, which is the correct way to describe systems of many
particles quantum mechanically, be consistent with the requirements of the special theory of relativity.
Pauli's proof, and its later generalisations, is the only known theory that explains the relation between spin and statistics. The fact that it is able to do so is generally regarded as compelling evidence for the validity of relativistic quantum field theory.

11.1.4 Effect of Indistinguishability and Exclusion on Counting Procedures
The statistics of the three types of particles introduced above are markedly different. The reason for this
is that the distinguishability or indistinguishability of particles radically changes the number of states
accessible to the system. The respective partition functions are therefore also radically different. Further, if
the particles are indistinguishable, the presence or absence of an exclusion principle also severely affects
the number of states accessible to the system, and hence also the partition function for the system.
The effects on the counting procedures of indistinguishability, and of the exclusion principle, are best illustrated by a trivial example. Consider a system with few states available to the single particles, and consisting of a small number of particles. The exclusion principle requires at most one particle per state, so if we are to illustrate all three cases simultaneously, we need to consider a case in which there are more single-particle states available than there are particles. Consider therefore a "gas" consisting of two particles, in which there are exactly three quantum states available to each particle. We shall count the number of states available to the whole two-particle "gas" in each of the three cases.

11.1.5 Maxwell-Boltzmann Case

The particles are distinguishable, so denote them by A and B. We may regard the three states r = 1, 2, 3 available to each particle as "bins" or "piles" into which the two objects A and B are to be placed. Each "binning" of the two objects then represents a possible gas state R. Placing both A and B into the same bin produces only one "binning", irrespective of the order in which the individual particles were binned: putting A then B into bin 1 is equivalent to putting B then A into bin 1. Placing A and B into different bins in different orders however produces different "binnings" which must be counted as distinct: A in bin 1 with B in bin 2 is not equivalent to B in bin 1 with A in bin 2. The possible "binnings" for this case, written as the contents (bin 1, bin 2, bin 3), are thus

(AB, -, -)   (-, AB, -)   (-, -, AB)
(A, B, -)    (B, A, -)
(A, -, B)    (B, -, A)
(-, A, B)    (-, B, A)

This "gas" thus has 9 states accessible to it. The partition function will thus consist of a sum of nine terms which, after simplification, becomes

$Z = \sum_R e^{-\beta E_R} = e^{-2\beta\epsilon_1} + e^{-2\beta\epsilon_2} + e^{-2\beta\epsilon_3} + 2e^{-\beta(\epsilon_1+\epsilon_2)} + 2e^{-\beta(\epsilon_1+\epsilon_3)} + 2e^{-\beta(\epsilon_2+\epsilon_3)}$   (11.4)

11.1.6 Bose-Einstein Case

The particles are indistinguishable, so denote each of them by the same symbol A. As before, placing both particles into the same bin produces only one "binning", irrespective of the order in which the particles are placed there. However, the indistinguishability makes itself felt when the particles are placed into different bins. In the Maxwell-Boltzmann case, this led to distinct "gas" states. In the Bose-Einstein case it does not: with one particle in bin 1 and the other in bin 2, there is no way to say which particle is which, so (A, A, -) describes a single state. The possible "binnings" for this case are thus

(AA, -, -)   (-, AA, -)   (-, -, AA)
(A, A, -)    (A, -, A)    (-, A, A)

This "gas" thus has 6 states accessible to it, not 9. Accordingly, the partition function will consist of a sum of six terms which, after simplification, becomes

$Z = \sum_R e^{-\beta E_R} = e^{-2\beta\epsilon_1} + e^{-2\beta\epsilon_2} + e^{-2\beta\epsilon_3} + e^{-\beta(\epsilon_1+\epsilon_2)} + e^{-\beta(\epsilon_1+\epsilon_3)} + e^{-\beta(\epsilon_2+\epsilon_3)}$   (11.5)

which is markedly different from (11.4), and so leads to very different properties for the "gas".

11.1.7 Fermi-Dirac Case

The particles are again indistinguishable, and so may be denoted by the same symbol A, as in the Bose-Einstein case. This case however differs from the Bose-Einstein case in that the exclusion principle forbids placing both particles into the same bin. The states accessible to this "gas" are therefore

(A, A, -)   (A, -, A)   (-, A, A)

This "gas" thus has only 3 states accessible to it. The partition function will thus consist of a sum of three terms only, and is given by

$Z = \sum_R e^{-\beta E_R} = e^{-\beta(\epsilon_1+\epsilon_2)} + e^{-\beta(\epsilon_1+\epsilon_3)} + e^{-\beta(\epsilon_2+\epsilon_3)}$
This result is different to both (11.4) and (11.5), and leads to markedly different properties for this gas.
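These three counting rules are easy to verify by brute force. The sketch below enumerates all states of the two-particle, three-level "gas" in each case and evaluates the corresponding partition function; the single-particle energies and the value of beta are illustrative.

```python
# Brute-force enumeration of the two-particle, three-level "gas",
# verifying the state counts 9 (MB), 6 (BE) and 3 (FD) and summing
# the partition functions (11.4), (11.5) and the Fermi-Dirac Z.
import itertools
import math

eps = [0.0, 1.0, 2.0]     # single-particle energies e1, e2, e3 (illustrative)
beta = 0.7                # illustrative

mb = list(itertools.product(range(3), repeat=2))                 # ordered pairs
be = list(itertools.combinations_with_replacement(range(3), 2))  # unordered, repeats allowed
fd = list(itertools.combinations(range(3), 2))                   # unordered, no repeats

for name, states in [("MB", mb), ("BE", be), ("FD", fd)]:
    Z = sum(math.exp(-beta * (eps[r1] + eps[r2])) for r1, r2 in states)
    print(name, len(states), Z)
# Expected counts: MB 9, BE 6, FD 3.
```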

11.2 The Partition Functions


11.2.1 Maxwell-Boltzmann Gas

Consider a gas of N particles, confined to volume V and held at temperature T . Denote the states available
to a single particle by r, and those available to the entire gas by R. Denote the corresponding energies of
these states respectively by r and ER .
For the moment, we shall ignore the details of the single-particle states. The distributions we shall derive are therefore valid irrespective of the particular model used, or of the details of the system being modelled, with the one restriction that particle interactions be negligible. The details will be added later when we develop particular models.
First, we must find a way to specify the states R accessible to the entire gas. In the Maxwell-Boltzmann
case, the particles are distinguishable. The occupation numbers nr thus provide only a partial specification
of the state of the gas. Note however that all gas states R with the same occupation numbers $n_1, n_2, ...$ have the same energy

$E_R = n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots$

We can thus calculate the partition function Z for the gas without needing to find a way to specify fully the gas state R. Each gas state with the given occupation numbers contributes precisely the same term $e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)}$ to the partition function. All we need do therefore to determine Z is count the
number of gas states R with the given occupation numbers. This is easily done. Let $n_1, n_2, ...$ be given. This means that we have distributed N distinguishable particles into distinct piles in such a way that $n_1$ particles are in the first pile, $n_2$ in the second, and so on. The number of ways of ordering the N particles is N!. However, shuffling the particles within pile r does not produce a new distribution into piles. We have therefore overcounted the distinct pile distributions by the number of ways in which each pile can be reshuffled. The total number of distinct ways of distributing the N particles into piles with $n_1$ in the first, $n_2$ in the second, etc., is therefore

$\frac{N!}{n_1!\,n_2!\,n_3!\cdots}$   (11.6)

The partition function for the Maxwell-Boltzmann gas is therefore

$Z = \sum_R e^{-\beta E_R} = \sum_{\substack{n_1,n_2,...\\ \sum_r n_r = N}} \frac{N!}{n_1!\,n_2!\,n_3!\cdots}\,e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)}$   (11.7)

where the condition under the summation sign indicates that the summations over $n_1, n_2, ...$ are not to be carried out independently, but subject to the restrictive condition

$\sum_r n_r = N$

It is easy to evaluate this partition function. We have

$Z = \sum_{\substack{n_1,n_2,...\\ \sum_r n_r = N}} \frac{N!}{n_1!\,n_2!\,n_3!\cdots}\,e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)} = \sum_{\substack{n_1,n_2,...\\ \sum_r n_r = N}} \frac{N!}{n_1!\,n_2!\,n_3!\cdots}\,\left(e^{-\beta\epsilon_1}\right)^{n_1}\left(e^{-\beta\epsilon_2}\right)^{n_2}\left(e^{-\beta\epsilon_3}\right)^{n_3}\cdots$

$= \left(e^{-\beta\epsilon_1} + e^{-\beta\epsilon_2} + e^{-\beta\epsilon_3} + \cdots\right)^N$   (11.8)

by the multinomial theorem, so that

$Z = \left(\sum_r e^{-\beta\epsilon_r}\right)^N$   (11.9)

This is the partition function for the Maxwell-Boltzmann gas.


Alternative Derivation: When the particles are distinguishable, the state R of the gas can be specified completely by specifying the state r of each individual particle in the gas. Thus,

$R = (r_1, r_2, r_3, ..., r_N)$

(Note that R cannot be specified in this way when the particles are indistinguishable.) We thus obtain all possible states of the gas by allowing each of the $r_i$ to range over all their possible values. Thus

$Z = \sum_R e^{-\beta E_R} = \sum_{r_1,r_2,...,r_N} e^{-\beta(\epsilon_{r_1} + \epsilon_{r_2} + \cdots + \epsilon_{r_N})} = \sum_{r_1,r_2,...,r_N} e^{-\beta\epsilon_{r_1}}\,e^{-\beta\epsilon_{r_2}}\cdots e^{-\beta\epsilon_{r_N}} = \left(\sum_{r_1} e^{-\beta\epsilon_{r_1}}\right)\left(\sum_{r_2} e^{-\beta\epsilon_{r_2}}\right)\cdots\left(\sum_{r_N} e^{-\beta\epsilon_{r_N}}\right) = \left(\sum_r e^{-\beta\epsilon_r}\right)^N$   (11.10)

which is result (11.9) above. The advantage of this derivation is its simplicity. Its disadvantage is that the method it uses cannot be used in the other two cases to be considered.
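The equality of the constrained sum (11.7) and the product form (11.9) can also be checked numerically for a small system; in the sketch below the spectrum, beta and N are illustrative choices.

```python
# Check of (11.9): the constrained sum (11.7) over occupation numbers
# with n1 + n2 + n3 = N equals z**N. Small illustrative example.
import itertools
from math import exp, factorial

eps = [0.0, 1.0, 2.0]
beta, N = 0.7, 4

z = sum(exp(-beta * e) for e in eps)

Z = 0.0
for ns in itertools.product(range(N + 1), repeat=len(eps)):
    if sum(ns) != N:
        continue                              # restriction sum_r n_r = N
    weight = factorial(N)
    for n in ns:
        weight /= factorial(n)                # N! / (n1! n2! n3!)
    E = sum(n * e for n, e in zip(ns, eps))
    Z += weight * exp(-beta * E)

print(Z, z**N)                                # agree to rounding error
```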

11.2.2 Bose-Einstein Gas

When the gas particles are indistinguishable, we cannot specify the state of each particle in the gas, since the individual particles cannot be distinguished one from another. The most that can be said of any state of the gas is that $n_1$ particles are in state 1, $n_2$ in state 2, etc. The state of a gas of indistinguishable particles is thus completely specified by its occupation numbers $n_1, n_2, ...$. The gas states are thus given by

$R = (n_1, n_2, n_3, ...)$

Each allowed distinct choice of occupation numbers thus defines one possible state of the gas. This is true in both the Bose-Einstein and Fermi-Dirac cases.

The difference between the Bose-Einstein and Fermi-Dirac cases lies only in what values the occupation numbers are allowed to take. For a Bose-Einstein gas of N particles, each occupation number $n_r$ can take values $0, 1, 2, ..., N$. However, they cannot range independently. They are subject to the restrictive condition that

$\sum_r n_r = N$

The partition function for the Bose-Einstein gas is thus given by

$Z = \sum_R e^{-\beta E_R} = \sum_{\substack{n_1,n_2,...\\ \sum_r n_r = N}} e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)}$   (11.11)

The restrictive condition is awkward. It means that the summations over $n_1, n_2, ...$ are not independent of each other. So, even though the exponential factor splits into a product of factors, the first involving $n_1$ alone, the second $n_2$ alone, and so on, the summation does not split into a product of summations, since the range of each $n_r$ depends on the partial summations already carried out. This complication prevents us for the moment from evaluating explicitly the partition function Z for this gas.
There is however a special case of a Bose-Einstein gas for which Z can be evaluated explicitly at once. Photons are not conserved particles. In a radiation cavity in thermal contact with a heat bath, photons are randomly created and destroyed as the radiation field interacts with the walls of the containing cavity. Their number is therefore not permanently fixed at some given value N. The occupation numbers for a photon gas are therefore not constrained by any relation, and each occupation number $n_r$ may assume any value whatever, irrespective of the values assumed by any of the other occupation numbers. For the photon gas therefore, we have


$Z = \sum_R e^{-\beta E_R} = \sum_{n_1,n_2,...=0}^{\infty} e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)} = \sum_{n_1,n_2,...=0}^{\infty} e^{-\beta n_1\epsilon_1}\,e^{-\beta n_2\epsilon_2}\,e^{-\beta n_3\epsilon_3}\cdots$

$= \left(\sum_{n=0}^{\infty} e^{-\beta n\epsilon_1}\right)\left(\sum_{n=0}^{\infty} e^{-\beta n\epsilon_2}\right)\left(\sum_{n=0}^{\infty} e^{-\beta n\epsilon_3}\right)\cdots$   (11.12)

Now, each factor is the sum of an infinite geometric series and so may be evaluated explicitly,

$\sum_{n=0}^{\infty} e^{-\beta n\epsilon_r} = 1 + e^{-\beta\epsilon_r} + e^{-2\beta\epsilon_r} + e^{-3\beta\epsilon_r} + \cdots = \frac{1}{1 - e^{-\beta\epsilon_r}}$   (11.13)

So the partition function for a photon gas becomes

$Z = \frac{1}{1 - e^{-\beta\epsilon_1}}\cdot\frac{1}{1 - e^{-\beta\epsilon_2}}\cdot\frac{1}{1 - e^{-\beta\epsilon_3}}\cdots$   (11.14)

This can be expressed more concisely by taking the log of both sides to get

$\ln Z = -\sum_r \ln\left(1 - e^{-\beta\epsilon_r}\right)$   (11.15)
The summation is over all possible single particle states r for the photons.
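A one-line numerical check of the geometric series (11.13), with an illustrative value of the dimensionless combination beta times epsilon:

```python
# Truncated geometric series for one photon mode versus the closed
# form 1/(1 - exp(-beta*eps)); beta*eps = 0.5 is illustrative.
import math

x = 0.5                                   # beta * eps_r (assumed)
partial = sum(math.exp(-n * x) for n in range(200))
closed = 1.0 / (1.0 - math.exp(-x))
print(partial, closed)                    # the two agree
```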

11.2.3 Fermi-Dirac Gas

The Fermi-Dirac case differs from the Bose-Einstein case only in the values that the occupation numbers $n_r$ are allowed to take. Fermions are subject to an exclusion principle that prevents multiple occupation of any given single-particle state. Each occupation number $n_r$ can thus only take the values

$n_r = 0$ or $1$

As in the Bose-Einstein case however, they cannot range independently since, for a gas of N particles, they are subject to the restrictive condition that

$\sum_r n_r = N$

The partition function for the Fermi-Dirac gas is thus given by

$Z = \sum_R e^{-\beta E_R} = \sum_{\substack{n_1,n_2,...\\ \sum_r n_r = N}} e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)}$   (11.16)

where each $n_r = 0, 1$.

As in the Bose-Einstein case, the restrictive condition means that the summations over $n_1, n_2, ...$ are not independent of each other. So, even though the exponential factor splits into a product of factors, the first involving $n_1$ alone, the second $n_2$ alone, and so on, the summation does not split into a product
of summations, since the range of each nr depends on the partial summations already carried out. This
complication prevents us for the moment from evaluating the partition function Z for this gas explicitly.
There are no known cases in nature of Fermions whose numbers are not conserved. There is thus no special case of a Fermion gas analogous to that of a photon gas. Were there such a gas, then its partition function would be given by

$Z = \sum_R e^{-\beta E_R} = \sum_{n_1,n_2,...=0}^{1} e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)} = \left(\sum_{n_1=0}^{1} e^{-\beta n_1\epsilon_1}\right)\left(\sum_{n_2=0}^{1} e^{-\beta n_2\epsilon_2}\right)\left(\sum_{n_3=0}^{1} e^{-\beta n_3\epsilon_3}\right)\cdots$   (11.17)

so that

$Z = \left(1 + e^{-\beta\epsilon_1}\right)\left(1 + e^{-\beta\epsilon_2}\right)\left(1 + e^{-\beta\epsilon_3}\right)\cdots$

which can be expressed more concisely by taking the log of both sides to get

$\ln Z = \sum_r \ln\left(1 + e^{-\beta\epsilon_r}\right)$   (11.18)

11.3 Mean occupation numbers


The mean occupation number $\bar n_i$ of a single-particle level i (which, for Fermions, is also the probability that the level is occupied) is given by

$-\frac{1}{\beta}\left(\frac{\partial\ln Z}{\partial\epsilon_i}\right)_{T,\,\epsilon_j,\,j\neq i} = \frac{\sum_{n_1,n_2,...} n_i\,e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)}}{\sum_{n_1,n_2,...} e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots)}} = \bar n_i$   (11.19)

For a Maxwell-Boltzmann gas we have

$Z = \left(\sum_r e^{-\beta\epsilon_r}\right)^N = z^N$   (11.20)

so that

$\bar n_i = -\frac{1}{\beta}\left(\frac{\partial\ln Z}{\partial\epsilon_i}\right)_{T,\,\epsilon_j,\,j\neq i} = -\frac{N}{\beta}\left(\frac{\partial\ln z}{\partial\epsilon_i}\right)_{T,\,\epsilon_j,\,j\neq i} = -\frac{N}{\beta}\,\frac{1}{z}\,\frac{\partial z}{\partial\epsilon_i} = \frac{N e^{-\beta\epsilon_i}}{z}$   (11.21)


while for a photon gas

$\bar n_i = -\frac{1}{\beta}\left(\frac{\partial\ln Z}{\partial\epsilon_i}\right)_{T,\,\epsilon_j,\,j\neq i} = \frac{1}{\beta}\,\frac{\partial}{\partial\epsilon_i}\sum_r \ln\left(1 - e^{-\beta\epsilon_r}\right) = \frac{e^{-\beta\epsilon_i}}{1 - e^{-\beta\epsilon_i}}$   (11.22)

so that

$\bar n_i = \frac{1}{e^{\beta\epsilon_i} - 1}$   (11.23)

For the fictitious Fermi gas of (11.18),

$\bar n_i = \frac{1}{e^{\beta\epsilon_i} + 1}$   (11.24)

This illustrates clearly that the different statistics lead to different thermodynamics.
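The difference between the two statistics is visible directly in these formulas, as the following small tabulation of (11.23) and (11.24) shows; the values of beta times epsilon are illustrative.

```python
# Mean occupation of a level of energy eps: photon/Bose form (11.23)
# versus the fictitious Fermi form (11.24), for illustrative beta*eps.
import numpy as np

for x in [0.2, 1.0, 3.0]:                         # x = beta * eps_i
    n_bose = 1.0 / np.expm1(x)                    # 1/(e^x - 1)
    n_fermi = 1.0 / (np.exp(x) + 1.0)             # 1/(e^x + 1)
    print(f"beta*eps = {x:4.1f}   Bose {n_bose:7.3f}   Fermi {n_fermi:7.3f}")
# The Bose occupation diverges as beta*eps -> 0, while the Fermi
# occupation never exceeds 1, reflecting the exclusion principle.
```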

Chapter 12
Blackbody Radiation
12.1 Equilibrium States of a Radiation Field
All bodies at temperature $T \neq 0$ emit electromagnetic radiation. If the emitted radiation moves away from the emitter and is transported to distant parts of space, it does not have the opportunity to interact with matter in such a way as to be brought into thermal equilibrium with it. In such circumstances, the radiation field never reaches equilibrium and carries with it the properties, or traces of the properties, of the emission process that gave it birth. However, if the radiation field is trapped in the vicinity of the emitter, or in a region of space where it can interact with matter, its photons will continually be absorbed and re-emitted. The net effect of this interaction is to cause the energy in the radiation modes to redistribute itself until the field reaches thermal equilibrium with the surrounding material at some temperature T. When this happens, the radiation loses the characteristic properties imparted to it by the emission process from which it was born, and it acquires a new set of properties that reflect the equilibrium condition it has reached. This process, by which the radiation loses all trace of its birth characteristics and is left only with the characteristic properties of its equilibrium state at temperature T, is called the thermalisation of the radiation field.
An interesting way to thermalise radiation and to study its properties in thermal equilibrium is to trap it
inside an opaque cavity whose walls are kept at a fixed temperature T . The opaqueness of the cavity walls
ensures that there is sufficient interaction for the field to reach thermal equilibrium quickly. If a small hole
is cut into the cavity wall, some of the radiation will escape through it and its properties can be studied.
Provided that the hole is sufficiently small, the loss of radiation through it will not significantly disturb
the equilibrium in the cavity, and so the emitted radiation will have the same properties as that inside the
cavity. Also, any radiation incident on the hole will enter the cavity and, by repeated interaction with the
opaque walls, will quickly be brought into equilibrium with the system.
The radiation emitted from the small hole also has the same properties as that emitted by a perfectly black body at the same temperature T. For this reason, cavity radiation is also called black-body radiation. A perfectly black body absorbs all radiation falling on it. To very good approximation, the small hole in the wall of the cavity also has this property: radiation entering the cavity through the hole is almost completely absorbed through repeated reflections from the cavity walls. So the radiation emitted from the hole and that emitted by a perfectly black body have the same characteristics, as follows from Kirchhoff's laws of radiation.

12.2 Modelling Cavity Radiation


The wave-particle duality of electromagnetic radiation leads to two alternative statistical treatments of
black-body radiation, one based on the wave picture, the other on the photon model.
According to the wave picture, the normal modes of oscillation of the radiation field in the cavity are its states of fixed energy. Classically, these might have any energy in a continuous range, beginning from zero. According to quantum theory, however, the normal mode energies form a discrete set. When the field is put into contact with a heat bath at temperature T, it will come to equilibrium. When in equilibrium, the energy in each of its modes is determined by the Boltzmann factor of statistical theory. Classical theory gives an incorrect spectrum for this radiation. Quantum theory however leads to the Planck radiation law, which is in excellent agreement with observation.
The alternative approach is based on the photon model of light. It models the radiation field as a gas of
radiation quanta, or photons. These photons populate the available energy levels in the same way that gas
molecules in an ideal gas populate the energy levels available to them. The key ingredient in this model is
the fact that photons behave like particles of spin one. They are thus not subject to any exclusion principle, and therefore obey Bose-Einstein statistics. Thermalised radiation is thus an interesting example of an ideal quantum gas.

12.3 Cavity Modes


Consider the normal modes of oscillation of an electromagnetic field in a rectangular cavity with sides $L_x$, $L_y$ and $L_z$. Each vector component $\psi$ of both the electric and magnetic fields satisfies the classical wave equation,

$\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0$   (12.1)

The normal mode solutions are of the form

$\psi(\vec x, t) = \phi(\vec x)\,e^{-i\omega t}$   (12.2)

Substitution of (12.2) into (12.1) gives

$0 = \nabla^2\phi + \frac{\omega^2}{c^2}\phi$   (12.3)

So, for (12.2) to be a solution of (12.1), the function $\phi$ must be a solution of the Helmholtz equation

$0 = \nabla^2\phi + k^2\phi$   (12.4)

where we have put

$k^2 = \frac{\omega^2}{c^2}$   (12.5)

We must now look for solutions of (12.4) that represent cavity modes. These have the property that $\phi = 0$ everywhere outside the cavity. In particular, this means that $\phi = 0$ at the cavity walls.

Note that this is a different condition from another commonly encountered one, where the function is required to be periodic at the boundary walls. That condition does not require the field to vanish outside the cavity walls, but to be periodic with periodicity determined by the cavity. It therefore admits travelling wave solutions, which are more general than the solutions admitted here. In spite of its superficial similarity to the cavity problem, it is therefore physically distinct.
To solve the Helmholtz equation, we look for solutions in separable form,

$\phi = X(x)\,Y(y)\,Z(z)$

Substitution into (12.4) then yields, by the method of separation of variables, three ordinary differential equations for the unknown functions X, Y and Z,

$0 = \frac{d^2X}{dx^2} + k_x^2 X$   (12.6)

$0 = \frac{d^2Y}{dy^2} + k_y^2 Y$   (12.7)

$0 = \frac{d^2Z}{dz^2} + k_z^2 Z$   (12.8)

where the constants of separation $k_x$, $k_y$ and $k_z$ satisfy

$k_x^2 + k_y^2 + k_z^2 = k^2$
We now attempt to satisfy the boundary conditions. Suppose the cavity extends from x = 0 to $x = L_x$ on the x-axis. The general solution of (12.6) is

$X(x) = A_x\sin(k_x x + \varphi_x)$

The boundary condition at x = 0 requires that

$0 = X(0) = A_x\sin(\varphi_x)$

so that $\varphi_x = m\pi$ where $m = 0, \pm 1, \pm 2, ...$. These possible values of m do not all yield independent solutions. In fact, of the infinite set, only one is independent, with all others identical with it up to a sign. We therefore choose the value m = 0, which gives $\varphi_x = 0$ and hence

$X(x) = A_x\sin(k_x x)$

The boundary condition at $x = L_x$ requires that

$0 = X(L_x) = A_x\sin(k_x L_x)$

so that $k_x L_x = n\pi$ with $n = 0, \pm 1, \pm 2, ...$. Again, not all of these allowed values yield independent solutions. The value n = 0 yields X(x) = 0 and hence $\phi(\vec x) = 0$, which is trivial, while each negative integer yields, up to sign, a solution identical with the one obtained from the corresponding positive value. We thus obtain independent solutions only for the values n = 1, 2, 3, ..., so that the permitted values of $k_x$ are given by

$k_x = \frac{n_x\pi}{L_x}$

where $n_x = 1, 2, 3, ...$. In a similar way, we get

$k_y = \frac{n_y\pi}{L_y}$ and $k_z = \frac{n_z\pi}{L_z}$

where $n_y, n_z = 1, 2, 3, ...$. The cavity oscillations of the field thus do not permit arbitrary values of $k_x$, $k_y$ or $k_z$, but restrict their values as indicated. This in turn restricts the values that k may take to

$k^2 = \pi^2\left(\frac{n_x^2}{L_x^2} + \frac{n_y^2}{L_y^2} + \frac{n_z^2}{L_z^2}\right)$

The normal mode frequencies of the oscillations are accordingly also restricted to

$\omega^2 = c^2\pi^2\left(\frac{n_x^2}{L_x^2} + \frac{n_y^2}{L_y^2} + \frac{n_z^2}{L_z^2}\right)$

The separable cavity mode solutions are thus

$\psi_{n_x n_y n_z}(\vec x, t) = \sin\left(\frac{n_x\pi}{L_x}x\right)\sin\left(\frac{n_y\pi}{L_y}y\right)\sin\left(\frac{n_z\pi}{L_z}z\right)e^{-i\omega(n_x,n_y,n_z)t}$   (12.9)
All other cavity modes are linear combinations of these.


The restriction of the cavity modes to a discrete set of solutions is essentially the quantisation phenomenon: not all oscillation frequencies are admitted in the cavity. The ones that are admitted are uniquely defined by triplets $(n_x, n_y, n_z)$ of non-zero integers. For each distinct set of values $(n_x, n_y, n_z)$ with $n_x, n_y, n_z = 1, 2, 3, ...$, we obtain a distinct normal oscillation mode. The normal mode states of the field in the cavity are thus completely specified by the values $(n_x, n_y, n_z)$.

In the quantum interpretation of the field, its energy is given by $E = \hbar\omega$, so the field energy is quantised and is given by

$E_{n_x n_y n_z} = \hbar\omega_{n_x n_y n_z} = \hbar c\pi\sqrt{\frac{n_x^2}{L_x^2} + \frac{n_y^2}{L_y^2} + \frac{n_z^2}{L_z^2}}$   (12.10)


The field acquires its energy in discrete steps by the absorption of energy quanta, so the numbers
(nx , ny , nz ) can be interpreted as the number of field quanta in the corresponding field modes. These
quanta are not particles in the usual sense. They are not localised in space. Rather, their energy is
distributed diffusely through space even though they might enter the field discretely by absorption from a
localised source. In this sense, the field quanta are pseudo-particles.
Since the divergence of the electric field is zero in free space,

$\nabla\cdot\vec E(\vec x) = 0$   (12.11)

only two independent solutions are allowed per $\vec k$ vector. This follows since each of the components of the electric field takes the separable form (12.9) with a common set of mode numbers, so that the divergence condition reduces to

$\nabla\cdot\vec E(\vec x) \propto \vec k\cdot\vec E_0 = 0$   (12.12)

where $\vec k = (k_x, k_y, k_z)$ is one of the allowed wavevectors and $\vec E_0$ the amplitude vector. This implies that the electric field is perpendicular to the direction of propagation defined by $\vec k$, and hence that it is transverse. For each value of $\vec k$ we can choose two mutually perpendicular directions of polarisation perpendicular to $\vec k$.

Consider a narrow range of frequencies $\omega$ to $\omega + d\omega$. The number of states in this range for cavity radiation is (see tutorial)

$\rho(\omega)\,d\omega = \frac{V}{\pi^2 c^3}\,\omega^2\,d\omega$   (12.13)

12.4 Partition Function of the Radiation Field


In this section, we treat the radiation field of cavity radiation as an ideal quantum gas. Photons have three properties that are important for this model:

1. Photons obey Bose-Einstein statistics. Insofar as they are particles, they behave like particles of spin one. According to the Spin and Statistics Theorem, therefore, they must obey Bose-Einstein statistics.

2. Photons do not interact with each other. This is not quite true. Extremely strong electromagnetic fields display significant numbers of photon-photon interactions, which are predicted by quantum electrodynamics. However, in moderate fields, and even in very strong ones, the number of these interactions is so small as to be utterly insignificant. This is reflected in the fact that the Maxwell equations are linear, so the fields obey a principle of superposition. Since the degree of interaction of photons with each other is utterly negligible, we may treat them as an ideal quantum gas.

3. The number of photons in the radiation field is not constant. Photons do not obey a law of conservation of photon number. Thermal equilibrium between the radiation field and the cavity walls is achieved by the continual emission and absorption of photons by the atoms in the surrounding walls. Photons are thus continually created and destroyed. Their number is therefore not constant, but fluctuates about a mean. This mean number is a property of the equilibrium state and is determined by the temperature T of the cavity.

This last property has an important consequence: when calculating the partition function for the photon gas, the occupation numbers for the photon states are not subject to any restrictive condition of the type $\sum_r n_r = N$, as are those of particles that obey a law of conservation of number. This enables us to evaluate the partition function for the photon gas without difficulty (see equation (11.15)):

$Z = \prod_r \frac{1}{1 - e^{-\beta\epsilon_r}}$   (12.14)

12.5 Statistics of the Radiation Field in Thermal Equilibrium
The state of the radiation field in equilibrium at any instant can be specified by listing the occupation numbers of each of its modes at that instant. Its state is thus determined by $R = (n_1, n_2, ...)$, where each $n_r$ is the number of photons in the rth mode at that instant. Since the particles are indistinguishable, no fuller specification of the state of the system is possible. The total energy of the field in this state is then

$E_R = \sum_r n_r\epsilon_r$

and the probability of finding the field in this state is therefore

$P_R = \frac{e^{-\beta E_R}}{Z} = \frac{e^{-\beta\sum_r n_r\epsilon_r}}{Z} = \frac{1}{Z}\prod_r e^{-\beta n_r\epsilon_r}$   (12.15)

$P_R$ is the probability of finding the field simultaneously with $n_1$ photons in mode r = 1, $n_2$ in mode r = 2, $n_3$ in mode r = 3, etc. In other words, $P_R = P(n_1, n_2, ...)$. Now note that

$Z = \prod_r \frac{1}{1 - e^{-\beta\epsilon_r}}$

so that we can write (12.15) as

$P(n_1, n_2, ...) = \left[e^{-\beta n_1\epsilon_1}\left(1 - e^{-\beta\epsilon_1}\right)\right]\left[e^{-\beta n_2\epsilon_2}\left(1 - e^{-\beta\epsilon_2}\right)\right]\cdots = \prod_r e^{-\beta n_r\epsilon_r}\left(1 - e^{-\beta\epsilon_r}\right)$   (12.16)

Interestingly, each factor summed over $n_r$ yields 1,

$\sum_{n_r} e^{-\beta n_r\epsilon_r}\left(1 - e^{-\beta\epsilon_r}\right) = \left(1 - e^{-\beta\epsilon_r}\right)\sum_{n_r} e^{-\beta n_r\epsilon_r} = 1$   (12.17)

From (12.16) we can calculate the probability $p_r(n_r)$ that mode r contains exactly $n_r$ photons. For example, the probability that mode r = 1 contains exactly $n_1$ photons is

$p_1(n_1) = \sum_{n_2,n_3,...=0}^{\infty} P(n_1, n_2, ...) = \left[e^{-\beta n_1\epsilon_1}\left(1 - e^{-\beta\epsilon_1}\right)\right]\sum_{n_2=0}^{\infty}\left[e^{-\beta n_2\epsilon_2}\left(1 - e^{-\beta\epsilon_2}\right)\right]\sum_{n_3=0}^{\infty}\left[e^{-\beta n_3\epsilon_3}\left(1 - e^{-\beta\epsilon_3}\right)\right]\cdots = e^{-\beta n_1\epsilon_1}\left(1 - e^{-\beta\epsilon_1}\right)$   (12.18)

where each of the sums equals 1 by (12.17). Similarly, the probability $p_r(n_r)$ that mode r contains exactly $n_r$ photons is

$p_r(n_r) = e^{-\beta n_r\epsilon_r}\left(1 - e^{-\beta\epsilon_r}\right) = \frac{e^{-\beta n_r\epsilon_r}}{Z_r}$

where we have put

$Z_r = \frac{1}{1 - e^{-\beta\epsilon_r}}$

Since

$\sum_{n=0}^{\infty} p_r(n) = 1$

we may interpret $p_r(n)$ as the probability that mode r contains exactly n photons, and $Z_r$ as the partition function for that mode. (12.16) thus becomes

$P(n_1, n_2, ...) = p_1(n_1)\,p_2(n_2)\cdots = \prod_r p_r(n_r)$

which shows that the occupation of each mode is independent of that of every other. We may therefore treat each mode separately without regard to the others.
We may now calculate the average occupation number of each mode r. We have

$\bar n_r = \sum_{n_r=0}^{\infty} n_r\,p_r(n_r) = \frac{1}{Z_r}\sum_{n_r=0}^{\infty} n_r\,e^{-\beta n_r\epsilon_r}$

Now note that

$\sum_{n_r=0}^{\infty} n_r\,e^{-\beta n_r\epsilon_r} = -\frac{1}{\beta}\frac{\partial}{\partial\epsilon_r}\sum_{n_r=0}^{\infty} e^{-\beta n_r\epsilon_r} = -\frac{1}{\beta}\frac{\partial Z_r}{\partial\epsilon_r}$

so that

$\bar n_r = -\frac{1}{\beta}\frac{\partial\ln Z_r}{\partial\epsilon_r}$

Explicitly evaluated, this gives

$\bar n_r = \frac{e^{-\beta\epsilon_r}}{1 - e^{-\beta\epsilon_r}}$

or

$\bar n_r = \frac{1}{e^{\beta\epsilon_r} - 1}$   (12.19)

This result, which gives the average occupation number of the rth mode, is called the Planck distribution.
The fluctuation $\Delta n_r$ in the photon occupation number $n_r$ at equilibrium can be calculated by a similar method. We have, by definition,

$(\Delta n_r)^2 = \langle(n_r - \bar n_r)^2\rangle = \langle n_r^2\rangle - \bar n_r^2$

and

$\langle n_r^2\rangle = \sum_{n_r=0}^{\infty} n_r^2\,p_r(n_r) = \frac{1}{Z_r}\sum_{n_r=0}^{\infty} n_r^2\,e^{-\beta n_r\epsilon_r} = \frac{1}{Z_r}\,\frac{1}{\beta^2}\,\frac{\partial^2 Z_r}{\partial\epsilon_r^2}$

from which

$(\Delta n_r)^2 = \frac{1}{\beta^2}\,\frac{\partial^2\ln Z_r}{\partial\epsilon_r^2} = -\frac{1}{\beta}\frac{\partial\bar n_r}{\partial\epsilon_r}$

Evaluated explicitly, this gives

$(\Delta n_r)^2 = \frac{e^{\beta\epsilon_r}}{\left(e^{\beta\epsilon_r} - 1\right)^2}$   (12.20)
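Both (12.19) and (12.20) can be confirmed by summing the single-mode distribution (12.18) directly; the value of beta times epsilon below is illustrative.

```python
# Direct-summation check of (12.19) and (12.20) for one mode.
import numpy as np

x = 0.8                                       # beta * eps_r (assumed)
n = np.arange(400)                            # truncation; the tail is negligible
p = np.exp(-n * x) * (1.0 - np.exp(-x))       # p_r(n), eq. (12.18)

mean = np.sum(n * p)
var = np.sum(n**2 * p) - mean**2

print(mean, 1.0 / np.expm1(x))                # mean occupation, (12.19)
print(var, np.exp(x) / np.expm1(x)**2)        # fluctuation squared, (12.20)
```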

12.6 Planck's Law for Blackbody Radiation


The Planck law for blackbody radiation determines the spectral composition of electromagnetic radiation
in thermal equilibrium. In other words, it determines how much energy is contained on average in the
radiation field at each frequency when the field is in thermal equilibrium.
Consider first the average energy in the rth mode. It is given by

$\bar n_r\,\epsilon_r = \frac{\epsilon_r}{e^{\beta\epsilon_r} - 1}$   (12.21)


This result is not useful experimentally. The energy levels $\epsilon_r$ are too closely spaced to be resolved in a measurement. The best that can be done in a laboratory is to measure the total energy dE in the radiation field contained in a given narrow range of angular frequencies $\omega$ to $\omega + d\omega$, or of wavelengths $\lambda$ to $\lambda + d\lambda$. These measurements are then used to determine the spectral density of energy in the radiation field, that is, the energy per unit frequency or per unit wavelength.

The spectral density of the radiation can be predicted from Planck's distribution using the density of states for the radiation field. Consider a narrow range of frequencies $\omega$ to $\omega + d\omega$. The number of states in this range for cavity radiation is

$\rho(\omega)\,d\omega = \frac{V}{\pi^2 c^3}\,\omega^2\,d\omega$   (12.22)

Since the range is narrow, we can assume that the energy of all modes in this range is approximately the same. The energy of a photon in a mode at frequency $\omega$ is $\hbar\omega$, so the average number of photons in such a mode is

$\bar n_\omega = \frac{1}{e^{\beta\hbar\omega} - 1}$   (12.23)

and the average energy at this frequency is

$\hbar\omega\,\bar n_\omega$   (12.24)

The energy in the modes with frequencies in the range $\omega$ to $\omega + d\omega$ is therefore

$\hbar\omega\,\bar n_\omega\,\rho(\omega)\,d\omega = \frac{V}{\pi^2 c^3}\,\frac{\hbar\omega^3}{e^{\beta\hbar\omega} - 1}\,d\omega$   (12.25)

This energy is proportional to the volume occupied by the radiation, so it is convenient to define the energy per unit volume per unit angular frequency range of the radiation at angular frequency $\omega$ and temperature T. This is denoted by $u(\omega, T)$ and is given by

$u(\omega, T)\,d\omega = \frac{\hbar}{\pi^2 c^3}\,\frac{\omega^3\,d\omega}{e^{\beta\hbar\omega} - 1}$   (12.26)

This is Planck's Radiation Law. Its prediction is in excellent agreement with measurement. It gives a complete description of what can be measured experimentally of the properties of cavity and of blackbody radiation.
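For reference, (12.26) is straightforward to evaluate numerically. The sketch below uses the physical constants tabulated in scipy.constants; the temperature and sample frequencies are illustrative choices.

```python
# Planck's law (12.26) evaluated numerically.
import numpy as np
from scipy import constants

hbar, k, c = constants.hbar, constants.k, constants.c

def u(omega, T):
    """Spectral energy density u(omega, T) of (12.26), in J s / m^3."""
    return (hbar / (np.pi**2 * c**3)) * omega**3 / np.expm1(hbar * omega / (k * T))

T = 6000.0                                   # illustrative temperature, K
for omega in [1e14, 1e15, 5e15]:             # illustrative frequencies, rad/s
    print(f"omega = {omega:.1e} rad/s   u = {u(omega, T):.3e} J s / m^3")
```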

12.7 Spectral Properties of the Radiation Field


The energy density $2\pi u(\omega, T)$ of the radiation, per unit volume per unit range of the frequency $\nu = \omega/2\pi$, is plotted in Figure 12.1 for three temperatures: 2000, 4000 and 6000 K. The wave-number scale $\lambda^{-1} = \nu/c$, where c is the speed of light, is also plotted on the horizontal axis, and the visible part of the spectrum is marked. At 6000 K the maximum of the distribution lies at the edge of the visible spectrum. 6000 K is the temperature of the surface of the sun, which approximates a black body fairly closely.

The maximum of the distribution shifts to higher frequencies with increasing temperature. At a given temperature it occurs where

$\frac{\partial u(\omega, T)}{\partial\omega} = 0$   (12.27)

which leads to the transcendental equation

$(3 - x)\,e^x = 3$   (12.28)

Figure 12.1: Spectral energy density of blackbody radiation at 2000, 4000 and 6000 K (after Mandl, 1988, p. 252).

where $x = \beta\hbar\omega_{\max}$. The solution must be found numerically and is

$\frac{\hbar\omega_{\max}}{kT} = \frac{h\nu_{\max}}{kT} = 2.822$   (12.29)

This result is known as Wien's displacement law and is well confirmed by experiment.
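The root of (12.28) is easily found numerically; a minimal sketch:

```python
# Numerical solution of (3 - x) e^x = 3, confirming the constant in
# Wien's displacement law (12.29).
import math
from scipy import optimize

root = optimize.brentq(lambda x: (3.0 - x) * math.exp(x) - 3.0, 1.0, 3.0)
print(root)        # ~2.8214, i.e. hbar * omega_max / kT = 2.822
```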
If we integrate (12.26) over all frequencies, we obtain the total energy density of the black-body radiation at a given temperature T:

$u(T) = \int_0^\infty u(\omega, T)\,d\omega = \frac{\hbar}{\pi^2 c^3}\int_0^\infty \frac{\omega^3\,d\omega}{e^{\beta\hbar\omega} - 1} = \frac{k^4 T^4}{\pi^2 c^3\hbar^3}\int_0^\infty \frac{x^3\,dx}{e^x - 1} = \frac{\pi^2 k^4}{15\,\hbar^3 c^3}\,T^4 = \frac{4\sigma}{c}\,T^4$   (12.30)

where we have substituted $x = \beta\hbar\omega$ and used

$\int_0^\infty \frac{x^3\,dx}{e^x - 1} = \frac{\pi^4}{15}$

(see appendix). This is called the Stefan-Boltzmann law, and the constant

$\sigma = \frac{\pi^2 k^4}{60\,\hbar^3 c^2} = 5.67\times 10^{-8}\ \mathrm{J\,m^{-2}\,s^{-1}\,K^{-4}}$

is called the Stefan-Boltzmann constant.

Note that in the kinetic theory of gases it is shown that

$\frac{1}{4}\,c\,u(T) = \sigma T^4$   (12.31)

is the energy incident per unit area per unit time on the wall of the enclosure containing the radiation.
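The numerical value of sigma quoted above follows directly from the constants; the sketch below computes it from the formula and compares it with the tabulated value in scipy.constants.

```python
# Stefan-Boltzmann constant from sigma = pi^2 k^4 / (60 hbar^3 c^2).
import numpy as np
from scipy import constants

sigma = np.pi**2 * constants.k**4 / (60.0 * constants.hbar**3 * constants.c**2)
print(sigma)                           # ~5.670e-08 J m^-2 s^-1 K^-4
print(constants.Stefan_Boltzmann)      # library value, for comparison
```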


12.8 Fundamental Relation for Radiation Field


The fundamental relation for a radiation field in thermal equilibrium is obtained from the photon partition
function Z,
F = kT ln Z
Now,
Z=
So,

3
r

1
1 er

ln 1 er

F = kT

(12.32)

In a cavity of macroscopic size, the energy eigenvalues r are very closely spaced. So we can replace the
sum over states on the left by a sum over energies by subdividing the energy axis into small intervals of
size and treating all states within each range as having the same energy. Thus,

ln 1 er

ln 1 eA (A )

where () is the energy density of states for cavity radiation, that is, the number of energy states per unit
energy range. If we make the subdivision very small, but keeping it always larger than the typical gap
between energy levels, then the right hand side is well approximated by an integral. Thus,
F kT

ln 1 e ()d

We have already shown that, for photons,

$\rho(\epsilon) = \frac{V}{\pi^2 c^3\hbar^3}\,\epsilon^2$

so we have

$F \approx \frac{kT\,V}{\pi^2 c^3\hbar^3}\int_0^\infty \ln\left(1 - e^{-\beta\epsilon}\right)\epsilon^2\,d\epsilon = \frac{k^4 T^4\,V}{\pi^2 c^3\hbar^3}\int_0^\infty \ln\left(1 - e^{-x}\right)x^2\,dx$   (12.33)

where we have substituted $x = \beta\epsilon$. The integral on the right hand side simplifies if integrated by parts,

$\int_0^\infty \ln\left(1 - e^{-x}\right)x^2\,dx = \int_0^\infty \ln\left(1 - e^{-x}\right)d\left(\frac{x^3}{3}\right) = \left[\ln\left(1 - e^{-x}\right)\frac{x^3}{3}\right]_0^\infty - \int_0^\infty \frac{x^3}{3}\,d\left[\ln\left(1 - e^{-x}\right)\right]$   (12.34)

The boundary terms evaluate to zero. When $x \ll 1$, we have $e^{-x} \approx 1 - x$, so that $1 - e^{-x} \approx x$ and hence

$x^3\ln\left(1 - e^{-x}\right) \approx x^3\ln x$

giving

$\lim_{x\to 0} x^3\ln\left(1 - e^{-x}\right) = \lim_{x\to 0} x^3\ln x = 0$

Similarly, for $x \gg 1$,

$x^3\ln\left(1 - e^{-x}\right) = x^3\ln\left[e^{-x}\left(e^x - 1\right)\right] = -x^4 + x^3\ln\left(e^x - 1\right)$

but for $x \gg 1$ we have $e^x - 1 \approx e^x$, so that

$x^3\ln\left(1 - e^{-x}\right) \approx -x^4 + x^3\ln e^x = -x^4 + x^4 = 0$

giving

$\lim_{x\to\infty} x^3\ln\left(1 - e^{-x}\right) = 0$

Thus

$\int_0^\infty \ln\left(1 - e^{-x}\right)x^2\,dx = -\int_0^\infty \frac{x^3}{3}\,d\left[\ln\left(1 - e^{-x}\right)\right] = -\frac{1}{3}\int_0^\infty \frac{x^3\,e^{-x}}{1 - e^{-x}}\,dx = -\frac{1}{3}\int_0^\infty \frac{x^3}{e^x - 1}\,dx$

The integral on the right hand side can be evaluated in closed form (see the Appendix to this chapter) and gives

$\int_0^\infty \frac{x^3}{e^x - 1}\,dx = \frac{\pi^4}{15}$   (12.35)

so we finally get

$F = -\frac{\pi^2 k^4}{45\,c^3\hbar^3}\,T^4\,V$   (12.36)

The constant in this equation is related to the Stefan-Boltzmann constant $\sigma$, since

$\sigma = \frac{\pi^2 k^4}{60\,c^2\hbar^3} = 5.67\times 10^{-8}\ \mathrm{J\,m^{-2}\,s^{-1}\,K^{-4}}$   (12.37)

so that we can write the Helmholtz free energy for the system as

$F = -\frac{4\sigma}{3c}\,T^4\,V$   (12.38)

Note that F is a function of T and V alone,

$F = F(T, V)$   (12.39)

This is consistent with the fact that, in thermal equilibrium, the radiation field does not contain a fixed
number of photons. Its thermodynamic potentials therefore cannot depend on N .

12.9 Thermodynamics of the Radiation Field


All thermodynamic properties of the radiation field follow directly from (12.39) by standard calculations.
For example, the equations of state for the field are obtained by comparing the fundamental relation in
Helmholtz representation,
dF = S dT P dV

(12.40)

with the differential


dF =

F
T

dT +
V

F
V

dV

(12.41)

Thus,

$$S = -\left(\frac{\partial F}{\partial T}\right)_V = \frac{16\sigma}{3c}\,T^3 V \tag{12.42}$$


and

$$P = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{4\sigma}{3c}\,T^4 \tag{12.43}$$

Relation (12.42) gives the entropy of the field, or its heat equation, in the form S = S(T, V), while (12.43) yields its mechanical equation of state. The total internal energy of the field can then be calculated from the relation F = E − TS, and gives

$$E = F + TS = -\frac{4\sigma}{3c}\,T^4V + \frac{16\sigma}{3c}\,T^4V = \frac{4\sigma}{c}\,T^4V \tag{12.44}$$

The field therefore has energy density

$$u(T) = \frac{E}{V} = \frac{4\sigma}{c}\,T^4 \tag{12.45}$$

which is the Stefan-Boltzmann law, previously derived directly from Planck's distribution by calculating the average energy in the radiation field at temperature T. Note that u = u(T) is a function of temperature alone. The pressure of the radiation field can be directly related to its energy density by eliminating T from (12.43) and (12.45) to get

$$P = \frac{1}{3}\,u(T) \tag{12.46}$$

This result was first deduced from Maxwell's equations.

Exercises
1. Consider a narrow range of frequencies $\omega$ to $\omega + d\omega$. Show that the density of photon states for cavity radiation is
   $$\rho(\omega)\,d\omega = \frac{V}{\pi^2 c^3}\,\omega^2\,d\omega$$
   Keep in mind that $\omega_k = ck$ and that for each allowed wavevector there are two allowed polarisations.

2. Prove equation (12.28).

3. Electromagnetic radiation at temperature $T_i$ fills a cavity of volume V. If the volume of the thermally insulated cavity is expanded quasistatically to a volume 8V, what is the final temperature $T_f$? (Neglect the heat capacity of the walls of the cavity.)

4. Apply the thermodynamic relation
   $$T\,dS = dE + P\,dV$$
   to a photon gas. We can write $E = Vu$, where $u(T)$ is the mean energy density of the radiation field, independent of the volume.
   (a) Considering S as a function of T and V, express dS in terms of dT and dV. Find $\left(\frac{\partial S}{\partial T}\right)_V$ and $\left(\frac{\partial S}{\partial V}\right)_T$.
   (b) Show that the mathematical identity $\frac{\partial^2 S}{\partial V\,\partial T} = \frac{\partial^2 S}{\partial T\,\partial V}$ gives a differential equation for u which can be integrated to yield the Stefan-Boltzmann law $u \propto T^4$.

Chapter 13
Grand Canonical Formalism
13.1 Another Formalism
For an ideal gas consisting of Boson or Fermion particles which are conserved, the number of particles in a gas enclosed in a container of volume V remains constant. The partition function for such a system is thus given by a summation over the occupation numbers $n_r$ in which these numbers are restricted by the condition

$$\sum_r n_r = N \tag{13.1}$$

This requirement makes that summation a nested one, and nested sums are difficult to evaluate. Often, they are intractable by analytic methods.
There are two ways of getting around this problem. One is to develop special mathematical techniques to
enable us to evaluate the particular nested summations encountered. Several such methods are available for
Fermion and Boson gases. All of these are ad hoc and complicated, and most are not transparent. The other
option that is open to us is to change the thermodynamic formalism that we are using, to enable us to drop
the restriction on particle number. Essentially, this means that we must turn from a thermodynamic theory
of closed systems to one of open systems. This is done by putting the system in contact not only with a heat reservoir, as was done in the canonical formalism, but also with a particle reservoir. In this way, the
system can draw particles from the particle reservoir, or reject particles to it. The particle equilibrium is
then determined by the value of the chemical potential of the particle reservoir.
At first, this approach might seem radical. The most common reason for reluctance to follow this route
appears to be unfamiliarity with the concept of the chemical potential. But this approach is no more radical
than the step we took when passing from the microcanonical formalism to the canonical. In fact, it is
entirely analogous to it. There, we removed the troublesome condition of counting states at fixed energy by allowing the system to exchange heat freely with a heat reservoir. In that formalism, the energy of the system is no longer held fixed by external constraints, but is determined by the thermal equilibrium with
the heat reservoir at fixed temperature. Now, we propose to eliminate the troublesome condition of fixed
particle number by allowing free passage of particles between the system and a particle reservoir. In the
resulting formalism, the number of particles in the system is no longer held fixed by external constraints,
but is determined by the chemical equilibrium of the system with the particle reservoir, held at fixed
chemical potential.
The statistical and thermodynamic theory for open systems at fixed temperature was first introduced by
Gibbs. He called it the grand canonical formalism.

13.2 Chemical Potential


Equation (4.63) states that for a one component system

$$T\,dS = dE + P\,dV - \mu\,dN \tag{13.2}$$

or

$$dE = T\,dS - P\,dV + \mu\,dN \tag{13.3}$$

which can be re-arranged as

$$\mu = \left(\frac{\partial E}{\partial N}\right)_{S,V}$$

The chemical potential is therefore the energy needed to change the particle number by one while keeping


the volume and entropy of the system constant. The chemical potential can also be expressed in terms of the Helmholtz and Gibbs free energies as follows:

$$dF = -S\,dT - P\,dV + \mu\,dN \tag{13.4}$$

$$dG = -S\,dT + V\,dP + \mu\,dN \tag{13.5}$$

whence

$$\mu = \left(\frac{\partial F}{\partial N}\right)_{T,V} = \left(\frac{\partial G}{\partial N}\right)_{T,P} \tag{13.6}$$

13.3 Grand Canonical Distribution


The grand canonical distribution is obtained in a manner completely analogous to that used in obtaining the canonical distribution. Consider a system A in thermal contact with a heat reservoir A′, and able to exchange particles of the correct species with a particle reservoir A″. For simplicity, we may regard the two reservoirs A′ and A″ as a single system and denote it by a single symbol B. Since we allow A to interact only with B, the composite system A + B may be taken to be isolated. If it is not, we lose no generality by isolating it. The composite system is then equally likely to be in any one of its accessible
states. Denote the total energy of the composite system by $E_0$, and its total particle number by $N_0$, and denote the total number of states accessible to the isolated composite system A + B by

$$\Omega_{\rm total}(E_0, N_0) \tag{13.7}$$

Note that we have suppressed the dependence of $\Omega$ on all other parameters since these are not of immediate interest here.
Denote the quantum states accessible to system A by R. We want to calculate the probability for A to be in some given, definite accessible quantum state R. Denote the energy of the system in this state by $E_R$, and the number of particles that it contains by $N_R$. Then the reservoir B contains energy $E_0 - E_R$ and particle number $N_0 - N_R$. The total number of states accessible to the composite system when A is in state R is thus, since A is in one definite given state,

$$1 \times \Omega_B(E_0 - E_R,\, N_0 - N_R) \tag{13.8}$$

Now, the composite system is isolated. So, each of its accessible states is equally probable, and the probability $P_R$ of finding system A in the state R is

$$P_R = \frac{\Omega_B(E_0 - E_R,\, N_0 - N_R)}{\Omega_{\rm total}(E_0, N_0)} \tag{13.9}$$

We now use the fact that B is a heat and particle reservoir for A and so contains hugely more energy and particles than A. So, for any state R in which A is likely to be found, we have $E_R \ll E_0$ and $N_R \ll N_0$. Now, the function $\Omega_B(E_0 - E_R, N_0 - N_R)$ is an extremely rapidly decreasing function of $E_R$ and of $N_R$. To avoid convergence problems, we expand its log rather than the function itself about the values $E_0$ and $N_0$. This gives

$$\ln\Omega_B(E_0 - E_R,\, N_0 - N_R) = \ln\Omega_B(E_0, N_0) + \frac{\partial\ln\Omega_B}{\partial E}(E_0, N_0)\,(-E_R) + \frac{\partial\ln\Omega_B}{\partial N}(E_0, N_0)\,(-N_R) \tag{13.10}$$

The terms in this equation are interpreted as follows. First, note that

$$k\,\frac{\partial\ln\Omega_B}{\partial E}\bigg|_N = \frac{\partial S_B}{\partial E}\bigg|_N = \frac{1}{T_B} \tag{13.11}$$


where $T_B$ is the temperature of the heat reservoir. So,

$$k\,\frac{\partial\ln\Omega_B}{\partial E}\bigg|_N(E_0, N_0)$$

is the reciprocal of the temperature of the heat reservoir when all of the energy, and all the particles, of the composite system A + B are found in the reservoir. However, $E_0 - E_R \approx E_0$, since $E_R \ll E_0$, and $N_0 - N_R \approx N_0$, since $N_R \ll N_0$. So, $T_B$ is in fact also the temperature of the reservoir for all possible states R in which the system A is ever likely to be found. This, after all, is the definition of a reservoir: it is a system whose temperature does not change no matter how much heat the system draws from it, or rejects to it. Once equilibrium is reached, the system will be at the same temperature as the reservoir, so it too will be at temperature $T_B$. We can thus write

$$\frac{\partial\ln\Omega_B}{\partial E}(E_0, N_0) = \frac{1}{kT} \tag{13.12}$$

where T is the system temperature.


In a similar way,

$$k\,\frac{\partial\ln\Omega_B}{\partial N}\bigg|_E = \frac{\partial S_B}{\partial N}\bigg|_E = -\frac{\mu_B}{T_B} \tag{13.13}$$

where $\mu_B$ is the chemical potential of the reservoir and $T_B$ is its temperature. Thus

$$-kT_B\,\frac{\partial\ln\Omega_B}{\partial N}(E_0, N_0) \tag{13.14}$$

is the chemical potential of the reservoir when all of the energy, and all of the particles, of the composite system A + B are found in it. However, $E_0 - E_R \approx E_0$, since $E_R \ll E_0$, and $N_0 - N_R \approx N_0$, since $N_R \ll N_0$. So $\mu_B$ is the chemical potential of the reservoir for all states R in which the system A is ever likely to be found. At equilibrium, the system will be at the same chemical potential as the reservoir, so $\mu = \mu_B$, and we can write

$$\frac{\partial\ln\Omega_B}{\partial N}(E_0, N_0) = -\frac{\mu}{kT} \tag{13.15}$$

where $\mu$ is the chemical potential of the system A.


Relation (13.10) thus becomes

$$\ln\Omega_B(E_0 - E_R,\, N_0 - N_R) = \ln\Omega_B(E_0, N_0) - \frac{E_R}{kT} + \frac{\mu N_R}{kT} \tag{13.16}$$

or

$$\Omega_B(E_0 - E_R,\, N_0 - N_R) = \Omega_B(E_0, N_0)\,e^{-(E_R - \mu N_R)/kT} \tag{13.17}$$

Since $\Omega_{\rm total}(E_0, N_0)$ and $\Omega_B(E_0, N_0)$ are each constants for this problem, we finally obtain from (13.9) the result that

$$P_R = \text{constant}\times e^{-(E_R - \mu N_R)/kT} \tag{13.18}$$

Result (13.18) is called the grand canonical distribution, or the Gibbs distribution. It gives the probability
that the system A is in a given quantum state R when it is in equilibrium, in thermal contact with a heat reservoir at temperature T, and able to exchange particles with a particle reservoir at chemical potential $\mu$.

13.4 The Grand Partition Function


The constant occurring in distribution (13.18) may be evaluated using the standard condition for


probabilities that

$$1 = \sum_R P_R$$

Thus

$$1 = \text{constant}\times\sum_R e^{-(E_R - \mu N_R)/kT}$$

so that

$$\text{constant} = \frac{1}{\sum_R e^{-(E_R - \mu N_R)/kT}} \tag{13.19}$$

The function

$$\Xi = \sum_R e^{-(E_R - \mu N_R)/kT} \tag{13.20}$$

plays the same role in the grand canonical formalism as did Z in the canonical formalism, and is called the grand partition function. In the next section, we show that it provides a fundamental relation for the system, from which all its thermodynamic and statistical properties may be calculated. In terms of $\Xi$, the Gibbs distribution becomes

$$P_R = \frac{e^{-(E_R - \mu N_R)/kT}}{\sum_R e^{-(E_R - \mu N_R)/kT}} = \frac{1}{\Xi}\,e^{-(E_R - \mu N_R)/kT} \tag{13.21}$$

13.5 Grand Canonical Potential


The sum that defines the partition function is expressed in (13.20) as a sum over quantum states R. In
macroscopic systems, where N is large, the energy levels are highly degenerate. There are thus very many
quantum states R with given values of E and N. We may therefore express this sum as a sum over E and
N by writing
e(ER NR )/kT =

=
R

A (E, N) e(EN)/kT

(13.22)

E,N

Now, A (E, N) is a very rapidly increasing function of E and N . Also, e(ER NR )/kT is a very rapidly
decreasing function of E and N . The product of these two functions is therefore very sharply peaked about
some value E of E and N of N. Most of the terms occurring in the sum on the right hand side of (13.22)
thus contribute negligibly to the partition function, with the most overwhelming contribution by far coming
from the terms in a narrow range E and N about E and N . We may thus write approximately
A (E, N ) e(EN)/kT E N

(13.23)

Taking logs, and noting that ln E N is utterly negligible in comparison with the other terms in the
resultant expression, we have to more than excellent approximation
ln ln A (E, N )

E N
kT

(13.24)

or
kT ln

= kT ln A (E, N) E + N
= T S E + N

(13.25)


since $k\ln\Omega_A(\bar{E}, \bar{N}) = S_A(\bar{E}, \bar{N})$ is the entropy of the system in its most probable state at equilibrium. This result leads us to define the grand canonical potential, $\Omega$, by the relation

$$\Omega = E - TS - \mu N \tag{13.26}$$

The properties of $\Omega$ are easily deduced. We have

$$d\Omega = dE - T\,dS - S\,dT - \mu\,dN - N\,d\mu = -S\,dT - P\,dV - N\,d\mu \tag{13.27}$$

This means that the correct variables for $\Omega$ to be a fundamental relation for the system are T, V and $\mu$. The fundamental relation for a system expressed in the form

$$\Omega = \Omega(T, V, \mu) \tag{13.28}$$

is said to be in grand canonical representation. The variables T, V, $\mu$ are precisely the variables in terms of which $\Xi$ is expressed. The grand partition function thus provides a fundamental relation for the system via the algorithm

$$\Omega = -kT\ln\Xi \tag{13.29}$$

13.6 Thermodynamics via the Grand Canonical Potential


The grand canonical potential is given by

$$\Omega = E - TS - \mu N \tag{13.30}$$

Expressed in infinitesimal form, it gives

$$d\Omega = -S\,dT - P\,dV - N\,d\mu \tag{13.31}$$

which is the fundamental relation expressed in terms of the variables T, V, and $\mu$. The equations of state are thus given by

$$S = -\left(\frac{\partial\Omega}{\partial T}\right)_{V,\mu} = S(T, V, \mu) \tag{13.32}$$

$$P = -\left(\frac{\partial\Omega}{\partial V}\right)_{T,\mu} = P(T, V, \mu) \tag{13.33}$$

$$N = -\left(\frac{\partial\Omega}{\partial\mu}\right)_{T,V} = N(T, V, \mu) \tag{13.34}$$

These are respectively the entropy equation, the mechanical equation of state, and the system particle number equation, each expressed in terms of T, V, $\mu$. To obtain the entropy equation and the mechanical equation of state in their more usual forms, which are expressed in terms of T, V, N, we need to solve (13.34) for $\mu$ to get $\mu = \mu(T, V, N)$, and then use this to eliminate $\mu$ in equations (13.32), (13.33) in favour of N.

The heat equation can be obtained directly from the definition of $\Omega$, which gives

$$E = \Omega + TS + \mu N = \Omega - T\left(\frac{\partial\Omega}{\partial T}\right)_{V,\mu} - \mu\left(\frac{\partial\Omega}{\partial\mu}\right)_{T,V} = E(T, V, \mu) \tag{13.35}$$

Again, to obtain it in more familiar form, we must eliminate $\mu$ in favour of N using equation (13.34).
A further, very useful, result is provided by the Euler relation. For our system, this relation is

$$E = TS - PV + \mu N \tag{13.36}$$


so that

$$\Omega = E - TS - \mu N = -PV$$

giving

$$PV = kT\ln\Xi \tag{13.37}$$

This relation enables us to obtain the equation of state for the system directly, without having first to calculate P from (13.33). This is a useful shortcut. Note however that (13.37) still expresses the equation of state in terms of T, V, $\mu$, and not in terms of T, V, N. To get it in terms of T, V, N, we again need to use relation (13.34) to eliminate $\mu$ in favour of N.

13.7 Relation to the Canonical Potential


You may feel uneasy having to learn to use yet another thermodynamic potential to represent the fundamental relation. If so, it is easy, in principle, to relate the grand canonical potential to the canonical, that is, to the Helmholtz free energy. It is also easy, in principle, to calculate the partition function from the grand partition function. The Helmholtz free energy is given by

$$F = E - TS \tag{13.38}$$

which is obtained from the partition function by

$$F = -kT\ln Z \tag{13.39}$$

or

$$Z = e^{-\beta F} \tag{13.40}$$

On the other hand

$$\Omega = E - TS - \mu N = F - \mu N \tag{13.41}$$

and

$$\Omega = -kT\ln\Xi \tag{13.42}$$

so that

$$\Xi = e^{-\beta\Omega} = e^{-\beta(F - \mu N)} = Z\,e^{\beta\mu N}$$

or

$$Z = \Xi\,e^{-\beta\mu N} \tag{13.43}$$

where N in these formulae represents the mean number of particles $\bar{N}$ in the open system. Similarly, the symbol E represents the mean energy $\bar{E}$ of the system in thermal contact with the heat reservoir.
Relation (13.43) allows us, in principle, to calculate the partition function Z from the grand partition function $\Xi$, from which we may obtain all thermodynamic and statistical results via the methods and formulae of the canonical formalism. You may encounter a snag however. To get a true canonical representation, you will need to eliminate $\mu$ from all the above formulae in favour of N. This needs to be
done using the equation of state (13.34) of the grand canonical formalism. It often happens however that
the inversion of this equation is an intractable problem. So, though it can be done in principle, it may be
difficult to implement it in practice. For this reason, it is better to learn to live with the grand canonical
formalism, rather than insist on converting back to the canonical. It saves a lot of work in the long run.

13.8 Application to Boson and Fermion Gases

13.8.1 Grand Partition Function for Gas of Bosons

Placing the Boson system in contact with a particle reservoir, as well as a heat reservoir, radically increases the number of quantum states R accessible to the system. We are no longer considering the quantum states accessible to the system when T, the remaining external parameters, and the number N of particles are held fixed, but the quantum states accessible to the system with given T and given remaining external parameters when it has no particles, when it has 1 particle, when it has 2 particles, etc., with the number of particles increasing without limit.
Denote the energy of the given quantum state R by $E_R$, and the number of particles the system contains when in this state R by $N_R$. The grand partition function is given by

$$\Xi = \sum_R e^{-\beta(E_R - \mu N_R)} \tag{13.44}$$

We can express the sum on the right hand side in terms of the occupation numbers $n_r$ of the states available to the single particles by writing

$$E_R = n_1\epsilon_1 + n_2\epsilon_2 + n_3\epsilon_3 + \cdots = \sum_r n_r\epsilon_r \tag{13.45}$$

$$N_R = n_1 + n_2 + n_3 + \cdots = \sum_r n_r \tag{13.46}$$

so that (13.44) becomes

$$\Xi = \sum_{n_1,n_2,n_3,\ldots} e^{-\beta\left[(n_1\epsilon_1 + n_2\epsilon_2 + \cdots) - \mu(n_1 + n_2 + \cdots)\right]} = \sum_{n_1,n_2,n_3,\ldots} e^{-\beta\left[n_1(\epsilon_1 - \mu) + n_2(\epsilon_2 - \mu) + \cdots\right]} = \sum_{n_1,n_2,n_3,\ldots} e^{-\beta n_1(\epsilon_1 - \mu)}\,e^{-\beta n_2(\epsilon_2 - \mu)}\,e^{-\beta n_3(\epsilon_3 - \mu)}\cdots$$

The difference between this summation and that which occurs in the evaluation of the partition function Z
for the same system is that here there is no restrictive condition on the nr , so each nr is allowed to range
over its possible values freely. The summation thus decomposes into a product of independent summations.
Since we are considering a Boson system, in which there is no limit to the number of particles that can be accommodated into one state, each $n_r$ ranges from 0 to $\infty$. This gives
$$\Xi = \left(\sum_{n_1=0}^{\infty} e^{-\beta n_1(\epsilon_1 - \mu)}\right)\left(\sum_{n_2=0}^{\infty} e^{-\beta n_2(\epsilon_2 - \mu)}\right)\left(\sum_{n_3=0}^{\infty} e^{-\beta n_3(\epsilon_3 - \mu)}\right)\cdots$$

Each of the summations in this product is of the form

$$\sum_{n=0}^{\infty} e^{-\beta n(\epsilon_r - \mu)} = 1 + e^{-\beta(\epsilon_r - \mu)} + e^{-2\beta(\epsilon_r - \mu)} + e^{-3\beta(\epsilon_r - \mu)} + \cdots$$

which is the sum of an infinite geometric series, and thus has value

$$\sum_{n=0}^{\infty} e^{-\beta n(\epsilon_r - \mu)} = \frac{1}{1 - e^{-\beta(\epsilon_r - \mu)}}$$

The grand partition function for the Boson system thus becomes

$$\Xi = \left(\frac{1}{1 - e^{-\beta(\epsilon_1 - \mu)}}\right)\left(\frac{1}{1 - e^{-\beta(\epsilon_2 - \mu)}}\right)\left(\frac{1}{1 - e^{-\beta(\epsilon_3 - \mu)}}\right)\cdots$$


This is more neatly expressed in terms of logs,

$$\ln\Xi = -\sum_r \ln\left(1 - e^{-\beta(\epsilon_r - \mu)}\right) \tag{13.47}$$

The fundamental relation for the Boson gas is thus

$$\Omega = -kT\ln\Xi = kT\sum_r \ln\left(1 - e^{-\beta(\epsilon_r - \mu)}\right) \tag{13.48}$$
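As a concrete check of the formalism, one can take a small set of single-particle levels and verify numerically that the equation of state $N = -(\partial\Omega/\partial\mu)_{T,V}$ reproduces the sum of Bose-Einstein occupations; a minimal sketch with made-up (hypothetical) level energies:

```python
import numpy as np

k, T, mu = 1.0, 1.0, -0.5                 # units with k = 1; illustrative T and mu
eps = np.array([0.0, 0.3, 0.7, 1.2])      # hypothetical single-particle levels (mu < eps_0)
beta = 1.0 / (k * T)

def omega(mu):
    # Grand potential from (13.48): Omega = kT * sum_r ln(1 - e^{-beta(eps_r - mu)})
    return k * T * np.sum(np.log(1.0 - np.exp(-beta * (eps - mu))))

h = 1e-6
N_from_omega = -(omega(mu + h) - omega(mu - h)) / (2 * h)   # eq. (13.34) by finite difference
N_from_occupations = np.sum(1.0 / (np.exp(beta * (eps - mu)) - 1.0))  # eq. (13.61)
print(N_from_omega, N_from_occupations)   # the two agree to ~1e-9
```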

In principle, we could find the partition function Z from $\Xi$,

$$Z = \Xi\,e^{-\beta\mu N} \tag{13.49}$$

from which we can get the fundamental relation in Helmholtz representation,

$$F = -kT\ln Z = -kT\ln\Xi + \mu N$$
When attempting to do this, however, we encounter a technical problem. The Helmholtz representation assumes all variables to be expressed in terms of T, V, N. We therefore need to eliminate $\mu$ from the above expressions in favour of N. Now

$$N = -\left(\frac{\partial\Omega}{\partial\mu}\right)_{T,V} = \sum_r \frac{1}{e^{\beta(\epsilon_r - \mu)} - 1} \tag{13.50}$$

which is a transcendental relation between $\mu$ and N. It cannot be inverted by hand. That is, we cannot express $\mu$ in terms of N using ordinary elementary functions. Without inventing a new, special function for the purpose of reinstating the canonical formalism, we cannot effect the conversion in practice. It is better therefore to learn to live with the grand canonical formalism, which does not require the use of any new techniques.

13.8.2 Grand Partition Function for Fermion Gas

The calculation of the grand partition function for the Fermion gas is identical to that for the Boson gas up to the point where the range of the occupation numbers is inserted into the formulae. For the Boson system, there is no limit to the number of particles that can be accommodated into one state, so each $n_r$ can range from 0 to $\infty$. For the Fermion gas, the particles are subject to the Pauli exclusion principle, so the occupation numbers are allowed to assume only the values $n_r = 0, 1$. Thus the grand partition function becomes
$$\Xi = \left(\sum_{n_1=0}^{1} e^{-\beta n_1(\epsilon_1 - \mu)}\right)\left(\sum_{n_2=0}^{1} e^{-\beta n_2(\epsilon_2 - \mu)}\right)\left(\sum_{n_3=0}^{1} e^{-\beta n_3(\epsilon_3 - \mu)}\right)\cdots$$

Each of the summations in this product is of the form

$$\sum_{n=0}^{1} e^{-\beta n(\epsilon_r - \mu)} = 1 + e^{-\beta(\epsilon_r - \mu)}$$

and the grand partition function for the Fermion system thus becomes

$$\Xi = \left(1 + e^{-\beta(\epsilon_1 - \mu)}\right)\left(1 + e^{-\beta(\epsilon_2 - \mu)}\right)\left(1 + e^{-\beta(\epsilon_3 - \mu)}\right)\cdots$$

This is more neatly expressed in terms of logs,

$$\ln\Xi = \sum_r \ln\left(1 + e^{-\beta(\epsilon_r - \mu)}\right)$$

The fundamental relation for the Fermion gas is thus

$$\Omega = -kT\ln\Xi = -kT\sum_r \ln\left(1 + e^{-\beta(\epsilon_r - \mu)}\right) \tag{13.51}$$

In trying to convert these formulae to those for the canonical formalism, we encounter the same snag as was encountered in the Boson case. In principle, we could find the partition function Z from $\Xi$,

$$Z = \Xi\,e^{-\beta\mu N} \tag{13.52}$$

from which we can get the fundamental relation in Helmholtz representation,

$$F = -kT\ln Z = -kT\ln\Xi + \mu N \tag{13.53}$$

However, the relation between N and $\mu$ is given by

$$N = -\left(\frac{\partial\Omega}{\partial\mu}\right)_{T,V} = \sum_r \frac{1}{e^{\beta(\epsilon_r - \mu)} + 1} \tag{13.54}$$

which is again transcendental. It is easier therefore to retain the grand canonical formalism, rather than change to the canonical formalism.

13.9 Occupation numbers


The relative probability of finding a gas of N particles in a state R with n1 particles in state 1, n2 particles
in state 2 etc. is
e(n1 1 +n2 2 +...+nN N ) .

(13.55)

The mean number of particles in state s can therefore be written as


'
(n1 1 +n2 2 +...+nN N )
R ns e
'
n
s =
(n1 1 +n2 2 +...+nN N )
Re
1
1 (n1 1 +n2 2 +...+nN N )
=
e

ZN R
s
=

13.9.1

1 ln ZN
1 ZN
=
.
ZN s
s

(13.56)

Bosons

Let $\alpha = -\mu/kT$. For Bosons we can write

$$\ln Z = \ln\Xi + \alpha N \tag{13.57}$$

and

$$\bar{n}_s = -\frac{1}{\beta}\,\frac{\partial\ln Z_N}{\partial\epsilon_s} = -\frac{1}{\beta}\left[\frac{\partial}{\partial\epsilon_s}\ln\Xi + \frac{\partial\alpha}{\partial\epsilon_s}\,\frac{\partial}{\partial\alpha}\left(\ln\Xi + \alpha N\right)\right] \tag{13.58}$$

where the last term takes account of the fact that $\alpha$ is a function of $\epsilon_s$ through (13.50). Using $\Xi$ from (13.47), the relation becomes

$$\bar{n}_s = \frac{1}{e^{\beta(\epsilon_s - \mu)} - 1} - \frac{1}{\beta}\,\frac{\partial\alpha}{\partial\epsilon_s}\,\frac{\partial(\ln Z)}{\partial\alpha}. \tag{13.59}$$


In (13.49) the chemical potential $\mu$ is the common chemical potential of the particle reservoir and the thermodynamic system of interest. The assumption is that $\sum_N Z(N)\,e^{\beta\mu N} = \sum_N Z(N)\,e^{-\alpha N} = \Xi$ has a maximum for $N = \bar{N}$ at a fixed chemical potential and temperature, or fixed $\alpha$, hence

$$\frac{\partial}{\partial N}\left[\ln Z(N) - \alpha N\right]_{N=\bar{N}} = \frac{\partial}{\partial N}\ln Z(\bar{N}) - \alpha = 0. \tag{13.60}$$

Since this expression involves a specific value of $N = \bar{N}$, $\alpha$ must be a function of $\bar{N}$. Recall that $\bar{N}$ here is the particle number at equilibrium and that a change in $\bar{N}$ implies a change in $\alpha$. Therefore, from (13.57), the relation (13.60) is equivalent to

$$0 = \bar{N} + \frac{\partial}{\partial\alpha}\ln\Xi = \frac{\partial}{\partial\alpha}\ln Z$$
Consequently, (13.59) becomes

$$\bar{n}_s = \frac{1}{e^{\beta(\epsilon_s - \mu)} - 1}. \tag{13.61}$$

13.9.2 Fermions

For Fermions a similar argument to the above leads to

$$\bar{n}_s = \frac{1}{e^{\beta(\epsilon_s - \mu)} + 1} \tag{13.62}$$

13.10 Quantum statistics in the classical limit


The average occupation number of a single particle state s is given by the relation
n
s =

1
e(s ) 1

(13.63)

where the + sign refers to Fermi-Dirac statistics (Fermions) and the sign to Bose-Einstein statics
particles, the parameter is determined by the
(Bosons). If at equilibrium the gas consists of N
condition
=
N

ns =
s

e(s )

(13.64)

The partition function of the gas is given by

$$\ln Z = \alpha\bar{N} \pm \sum_s \ln\left(1 \pm e^{-\beta(\epsilon_s - \mu)}\right) \tag{13.65}$$

($\alpha = -\mu/kT$, with the same sign convention as in (13.63)). Let us examine $\mu$ in some limiting cases.

Low concentration
At a given temperature the concentration of the gas will be low when $\bar{N}$ is made small enough. The relation (13.64) can then be satisfied only if each term in the sum is sufficiently small, that is, if

$$\bar{n}_s \ll 1 \quad\text{or}\quad e^{\beta(\epsilon_s - \mu)} \gg 1 \quad\text{for all states } s.$$


High temperature
Consider a gas with a fixed number of particles when the temperature is made sufficiently large, that is, $\beta$ is made sufficiently small. If $\mu$ in (13.64) were kept constant, more terms of appreciable magnitude would contribute as the temperature is increased. It follows that when $\beta \to 0$, an increasing number of terms with large values of $\epsilon_s$ contribute substantially to the sum. To prevent the sum from exceeding $\bar{N}$, $-\mu(T)$ must become large enough so that each term is sufficiently small. Again this means that $e^{\beta(\epsilon_s - \mu)} \gg 1$, or $\bar{n}_s \ll 1$, for all states s.

The conclusion is that if the concentration is sufficiently low or the temperature is sufficiently high,

$$e^{\beta(\epsilon_s - \mu)} \gg 1$$

for all s. Equivalently this means that the occupation numbers become small, so that

$$\bar{n}_s \ll 1$$

for all s. These limits are known as the classical limit, since they imply that for both Fermi-Dirac and Bose-Einstein statistics the occupation numbers reduce to

$$\bar{n}_s = e^{-\beta(\epsilon_s - \mu)} \tag{13.66}$$

From (13.64) the chemical potential can be determined as

$$\bar{N} = \sum_s e^{-\beta(\epsilon_s - \mu)} = e^{\beta\mu}\sum_s e^{-\beta\epsilon_s}$$

or

$$e^{\beta\mu} = \frac{\bar{N}}{\sum_s e^{-\beta\epsilon_s}}. \tag{13.67}$$

Since

$$\bar{n}_s = e^{-\beta(\epsilon_s - \mu)}$$

in the classical limit,

$$\bar{n}_s = \bar{N}\,\frac{e^{-\beta\epsilon_s}}{\sum_s e^{-\beta\epsilon_s}} \tag{13.68}$$

In the classical limit, therefore, both Fermi-Dirac and Bose-Einstein quantum distribution laws reduce to
Maxwell-Boltzmann distribution laws.
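This convergence is easy to see numerically: at fixed $\mu$, the Fermi-Dirac and Bose-Einstein occupations approach the Maxwell-Boltzmann form as $e^{\beta(\epsilon-\mu)}$ grows. A minimal sketch in illustrative units:

```python
import numpy as np

beta, mu = 1.0, 0.0   # illustrative units; the classical limit needs beta*(eps - mu) >> 1
for eps in [0.5, 2.0, 5.0, 10.0]:
    x = np.exp(beta * (eps - mu))
    n_fd = 1.0 / (x + 1.0)   # Fermi-Dirac, eq. (13.62)
    n_be = 1.0 / (x - 1.0)   # Bose-Einstein, eq. (13.61)
    n_mb = 1.0 / x           # Maxwell-Boltzmann limit, eq. (13.66)
    print(eps, n_fd, n_be, n_mb)
# as eps - mu grows, all three occupations agree to better and better accuracy
```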
Recall that

$$\ln(1 \pm x) = \pm x - \frac{x^2}{2} + O(x^3).$$

The partition function in the classical limit therefore becomes

$$\ln Z = \alpha\bar{N} + \sum_s e^{-\beta(\epsilon_s - \mu)} = \alpha\bar{N} + \bar{N}. \tag{13.69}$$

From (13.67) then

$$\alpha = -\ln\bar{N} + \ln\sum_s e^{-\beta\epsilon_s}$$

and

$$\ln Z = -\bar{N}\ln\bar{N} + \bar{N} + \bar{N}\ln\sum_s e^{-\beta\epsilon_s} \tag{13.70}$$


This is not the partition function for the Maxwell-Boltzmann gas, equation (11.9),

$$\ln Z_{MB} = \bar{N}\ln\sum_s e^{-\beta\epsilon_s}$$

However,

$$\ln Z = -\bar{N}\ln\bar{N} + \bar{N} + \ln Z_{MB} \approx -\ln\bar{N}! + \ln Z_{MB}$$

or

$$Z = \frac{Z_{MB}}{\bar{N}!} \tag{13.71}$$

where we made use of Stirling's formula for large N. Here N! corresponds to the number of possible permutations of the particles, permutations that are physically meaningless when the particles are identical. This is precisely the factor that was introduced ad hoc to save us from the consequences of the Gibbs paradox.
A gas in the classical limit, where (13.66) is satisfied, is said to be non-degenerate, while a gas for which the concentration is high enough and the temperature low enough that Fermi-Dirac or Bose-Einstein statistics are essential for a proper description is said to be degenerate.

Chapter 14
Ideal Fermi Gas
14.1 Fermi-Dirac Particles
Electrons, protons, neutrons and a number of other distinct particles have been found to satisfy the Pauli principle, which states that the wavefunctions $\Psi(1, 2, 3, \ldots, N)$ of identical fermions are totally symmetric under any even permutation of the particles, and anti-symmetric for any odd permutation. Because of the Pauli principle the eigenstates of non-interacting fermions bound to a common potential cannot be simple product states. The simplest N-particle state formed from independent single particle states that satisfies the Pauli exclusion principle is a Slater determinant: an $N\times N$ determinant

$$\Psi(1, 2, 3, \ldots, N) = \frac{1}{\sqrt{N!}}\,\det\left[\phi_{k_m}(j)\right] \tag{14.1}$$

where k labels the rows and j the columns, or vice versa. Here the $k_m$'s represent the quantum numbers and the j's the spatial and spin coordinates. Clearly the N single particle wavefunctions $\phi_{k_m}(j)$ in the Slater determinant must be distinct, otherwise the determinant is identically zero. The pre-factor $1/\sqrt{N!}$ ensures that the wavefunction is normalised if the single particle states are normalised.
The Hamiltonian for a system of N independent particles reduces to the kinetic energy operator and a sum over single particle potential operators:

$$\hat{H} = -\frac{\hbar^2}{2m}\sum_i \nabla_i^2 + \sum_i v_i(\vec{r}_i). \tag{14.2}$$

For a system of free Fermions, the total energy of the system is simply the sum of the kinetic energies of the individual particles. To see this, let

$$\hat{H} = \hat{T} = -\frac{\hbar^2}{2m}\sum_i \nabla_i^2 \tag{14.3}$$

We can simulate an infinitely extended system of non-interacting Fermions by requiring that the single particle wavefunctions satisfy periodic boundary conditions

$$\phi(x + L_x) = \phi(x), \tag{14.4}$$

and similarly for y and z. Assuming isotropy (space looks the same in every direction) and with $L_x = L_y = L_z = L$, the single particle eigenfunctions of the kinetic energy become

$$\phi_{\vec{k}}(\vec{r}) = \frac{1}{L^{3/2}}\,e^{i\vec{k}\cdot\vec{r}} \tag{14.5}$$

with

$$\vec{k} = \frac{2\pi}{L}(m_x, m_y, m_z), \qquad m_x, m_y, m_z \text{ integers} \tag{14.6}$$

From (14.5) it follows that the single particle energies are given by

$$\epsilon_k = \frac{\hbar^2\left(k_x^2 + k_y^2 + k_z^2\right)}{2m}. \tag{14.7}$$

The total energy is

$$E(\ldots, k_m, \ldots) = \langle\Psi(1, 2, \ldots, N)|\hat{H}|\Psi(1, 2, \ldots, N)\rangle = \langle\Psi(1, 2, \ldots, N)|\hat{T}|\Psi(1, 2, \ldots, N)\rangle \tag{14.8}$$


If this expression is expanded, we find that each of the N! permutations of $\Psi$ has the same kinetic energy and, taking the normalisation factor into account, the total energy is

$$E(\ldots, k_m, \ldots) = \sum_{\vec{k}}\frac{\hbar^2\left(k_x^2 + k_y^2 + k_z^2\right)}{2m} \tag{14.9}$$

where the allowed wavevectors $\vec{k}$ depend on the boundary conditions.


The $\vec{k}$-vectors can be mapped one-to-one onto a cubic grid with vertices $(m_x, m_y, m_z)$, and hence from (14.6) it follows that in k-space each particle occupies a volume $\left(\frac{2\pi}{L}\right)^3$. In the ground state the lowest N energy states are occupied, which implies that the corresponding $\vec{k}$-vectors lie in a sphere with radius $k_F$ in k-space. It now follows that counting the number of grid points in k-space that correspond to occupied single particle states for the ground state yields

$$N = \sum_{m_x}\sum_{m_y}\sum_{m_z} 1 = \left(\frac{L}{2\pi}\right)^3\frac{4\pi}{3}\,k_F^3. \tag{14.10}$$

To add (or remove) a particle from this Fermi sea with the minimum energy, it is necessary to find an unoccupied (occupied) single particle state near the Fermi surface ($k = k_F$). The energy associated with this process is $\frac{\hbar^2 k_F^2}{2m}$, close to a single particle eigenenergy for $k = k_F$. This identifies the chemical potential:

$$\mu = \frac{\hbar^2 k_F^2}{2m} = \epsilon_F.$$

14.2 Ideal Fermi Gas


The partition function $Z_N$ can be found by allowing each $\vec{k}$ a single choice, occupied (with the appropriate Boltzmann factor) or unoccupied, the counting being subject to the requirement that the number of occupied states sums to N. In chapter 13 we saw that it is in principle possible to determine the partition function $Z_N$ from the grand partition function $\Xi$, equation (13.53). Therefore we have

$$F = -kT\ln Z = -kT\ln\Xi + \mu N = -kT\sum_k \ln\left(1 + e^{-(\epsilon(k)-\mu)/kT}\right) + \mu N \tag{14.11}$$

where $\epsilon(k)$ is the single particle eigenenergy with quantum numbers $\vec{k}$. The total number of Fermions is given by equation (13.54),

$$N = \sum_k \frac{1}{e^{\beta(\epsilon(k)-\mu)} + 1} = \sum_k f(\epsilon(k)) \tag{14.12}$$

with

$$f(\epsilon) = \frac{1}{e^{\beta(\epsilon-\mu)} + 1}$$

the Fermi function shown in figure 14.1.

The total energy E is the sum of the individual energies. This is consistent with the thermodynamic definition $E = \frac{\partial(\beta F)}{\partial\beta}$:

$$E = \frac{\partial(\beta F)}{\partial\beta} = \frac{\partial}{\partial\beta}\left[-\sum_k \ln\left(1 + e^{-\beta(\epsilon(k)-\mu)}\right) + \beta\mu N\right] = \sum_k f(\epsilon(k))\,\epsilon(k) \tag{14.13}$$

This confirms that $f(\epsilon)$ is the average occupancy of a state with energy $\epsilon$. This function is not a probability, but it is positive, varies in the range from 1 to 0, and we average quantities over it. Note that $f(\epsilon) = \frac{1}{2}$ at $\epsilon = \mu$ for all T, and therefore the locus of $f(\epsilon) = \frac{1}{2}$ determines the Fermi surface.

Figure 14.1: the Fermi function $f(\epsilon) = \frac{1}{e^{\beta(\epsilon-\mu)} + 1}$ as a function of $\beta(\epsilon - \mu)$.

At T = 0, $f(\epsilon)$ is a step function, equal to 1 for $\epsilon \le \epsilon_F$ and zero for $\epsilon > \epsilon_F$. The condition that the Fermi sphere at T = 0 must allow for N occupied states determines the value of the chemical potential, $\mu_0$, at T = 0.
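The sharpening of $f(\epsilon)$ into a step as $T \to 0$ is easy to see numerically; a minimal sketch, with energies measured in units of the Fermi energy:

```python
import numpy as np

def fermi(eps, mu, kT):
    # Fermi function f(eps) = 1/(exp((eps - mu)/kT) + 1), eq. (14.12)
    return 1.0 / (np.exp((eps - mu) / kT) + 1.0)

eps = np.linspace(0.0, 2.0, 9)   # energies in units of eps_F
for kT in [0.5, 0.1, 0.01]:      # temperatures in the same units
    print(kT, np.round(fermi(eps, mu=1.0, kT=kT), 3))
# as kT -> 0 the occupancy approaches a step: 1 below mu, 0 above,
# dropping from 1 to 0 over a band of width ~4kT around eps = mu
```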

14.2.1 Classical limit

In the classical limit of sufficiently low density or sufficiently high temperature, the partition function becomes (see equation (13.70))

$$\ln Z = \bar{N}\left(\ln\sum_s e^{-\beta\epsilon_s} - \ln\bar{N} + 1\right), \qquad Z = \frac{\zeta^{\bar{N}}}{\bar{N}!} \tag{14.14}$$

where $\zeta = \sum_s e^{-\beta\epsilon_s}$.

This expression can be evaluated using (14.7):

$$\zeta = \sum_{k_x,k_y,k_z} e^{-\beta\hbar^2\left(k_x^2 + k_y^2 + k_z^2\right)/2m}$$

where the sum is over all allowed values of $k_x$, $k_y$ and $k_z$ given by (14.6). The exponential function factors, and thus this expression can be written as the product of three similar sums:

$$\zeta = \left(\sum_{k_x} e^{-\beta\hbar^2 k_x^2/2m}\right)\left(\sum_{k_y} e^{-\beta\hbar^2 k_y^2/2m}\right)\left(\sum_{k_z} e^{-\beta\hbar^2 k_z^2/2m}\right) \tag{14.15}$$

As a good approximation, the sums can be replaced by integrals:

$$\sum_{k_x} e^{-\beta\hbar^2 k_x^2/2m} \approx \frac{L_x}{2\pi}\int_{-\infty}^{\infty} e^{-\beta\hbar^2 k_x^2/2m}\,dk_x = \frac{L_x}{2\pi}\left(\frac{2\pi m}{\beta\hbar^2}\right)^{1/2} \tag{14.16}$$


Hence

$$\zeta = \frac{V}{(2\pi)^3}\left(\frac{2\pi m}{\beta\hbar^2}\right)^{3/2} = \frac{V}{h^3}\left(2\pi mkT\right)^{3/2} \tag{14.17}$$

Thus, in the classical limit, using Stirling's approximation $\ln N! = N\ln N - N$, the partition function can be written as

$$\ln Z = N\left[\ln\frac{V}{N} - \frac{3}{2}\ln\beta + \frac{3}{2}\ln\frac{2\pi m}{h^2} + 1\right], \tag{14.18}$$

the energy as

$$\bar{E} = -\frac{\partial}{\partial\beta}\ln Z = \frac{3N}{2\beta} = \frac{3}{2}NkT, \tag{14.19}$$

and the entropy as

$$S = k\ln Z + \frac{\bar{E}}{T} = Nk\left[\ln\frac{V}{N} + \frac{3}{2}\ln T + \sigma_0\right] \tag{14.20}$$

where $\sigma_0 = \frac{3}{2}\ln\frac{2\pi mk}{h^2} + \frac{5}{2}$. The heat capacity is found from

$$C_V = \frac{\partial\bar{E}}{\partial T} = \frac{3}{2}Nk. \tag{14.21}$$

The pressure of the gas follows from equation (13.37):

$$PV = kT\ln\Xi, \qquad P = \frac{kT}{V}\sum_k \ln\left(1 + e^{-(\epsilon(k)-\mu)/kT}\right) \approx \frac{kT}{V}\sum_s e^{-\beta(\epsilon_s - \mu)} = \frac{NkT}{V} \tag{14.22}$$

14.3 Formal Criteria for a Degenerate Fermion Gas


In the classical limit for both Fermi-Dirac and Bose-Einstein statistics the occupation numbers reduce to
n
s = e(s ) 1

(14.23)

This condition implies that 0 and thus


z = e 1.

(14.24)

In this limit we can use the results for a classical gas, where we found that, for monatomic particles,

$$\Xi(V, T, \mu) = \sum_R e^{-\beta(E_R - \mu N_R)} = \sum_{N_R} e^{\beta\mu N_R}\sum_{E_{N_R}} e^{-\beta E_{N_R}} = \sum_{N_R} e^{\beta\mu N_R}\,Z_{N_R} \tag{14.25}$$

where $Z_{N_R} = \sum_{E_{N_R}} e^{-\beta E_{N_R}}$ is the partition function for a system with $N_R$ particles. In the previous chapter, equation (13.71), we saw that in the classical limit $Z_{N_R} = \frac{(z_{MB})^{N_R}}{N_R!}$, where $z_{MB}$ is the classical


partition function of a single particle. In the classical limit the grand partition function becomes

$$\Xi(V, T, \mu) = \sum_{N_R=0}^{\infty}\frac{(z_{MB})^{N_R}}{N_R!}\,e^{\beta\mu N_R} = \sum_{N_R=0}^{\infty}\frac{(z\,z_{MB})^{N_R}}{N_R!} = e^{z\,z_{MB}} \tag{14.26}$$

Recall that for a classical gas, equations (6.9) and (6.13),

$$\Omega(E, N) = V^N\,\omega(E, N) \tag{14.27}$$

and

$$\bar{E} = \bar{E}(T, N) \tag{14.28}$$

hence

$$\Omega(E, N) = V^N\,\omega(\bar{E}(T, N), N)$$

$$Z_{MB} = \sum_E \Omega(E, N)\,e^{-E/kT} = \left(Vf(T)\right)^N = \zeta^N \tag{14.29, 14.30}$$

with

$$\zeta = \frac{V}{h^3}\left(2\pi mkT\right)^{3/2} \tag{14.31}$$

and hence

$$f(T) = \frac{1}{h^3}\left(2\pi mkT\right)^{3/2}. \tag{14.32}$$

The grand partition function can therefore be written as

$$\Xi(V, T, \mu) = e^{zVf(T)} \tag{14.33}$$

hence

$$\bar{N} = kT\left(\frac{\partial\ln\Xi}{\partial\mu}\right)_{T,V} = zVf(T) \tag{14.34}$$

or

$$z = e^{\beta\mu} = \frac{\bar{N}}{V}\,\frac{1}{f(T)}. \tag{14.35}$$

The condition (14.24) is therefore equivalent to

$$\frac{\bar{N}}{V}\,\frac{1}{f(T)} \ll 1$$

or

$$\frac{N}{V}\,\frac{h^3}{(2\pi mkT)^{3/2}} \ll 1 \tag{14.36}$$


Since, in the classical limit,

$$\bar{p} = \sqrt{3mkT} \tag{14.37}$$

we have

$$\frac{N}{V}\,\frac{\lambda^3}{g} \ll 1 \tag{14.38}$$

where $\lambda$ is the de Broglie wavelength, $\lambda = h/\bar{p}$, and g is the number of spin states allowed per particle.
A Fermion gas is said to be nearly degenerate if

$$\frac{N}{V}\,\frac{\lambda^3}{g} \approx 1 \tag{14.39}$$

In the limit

$$\frac{N}{V}\,\frac{\lambda^3}{g} \gg 1 \tag{14.40}$$

the gas is said to be completely degenerate. For example, if N and V are held fixed, the gas becomes completely degenerate in the limit $T \to 0$, since $\lambda = \frac{h}{\bar{p}} = \frac{h}{\sqrt{3mkT}} \to \infty$. Similarly, if we hold T and N fixed, then the gas becomes degenerate in the limit as $V \to 0$, or, if we hold T and V fixed, in the limit as $N \to \infty$.
The situation most commonly encountered when dealing with Fermion gases is that in which N is constant and V is constant. In this case, the term "degenerate" is synonymous with "low temperature," and "completely degenerate" with "zero temperature." Note however that the meaning of these terms is broader than this restricted usage.
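The criterion (14.38)-(14.40) is easy to evaluate for a concrete case; a minimal sketch for the conduction electrons in copper, taking the electron density quoted in the exercises below ($N/V \approx 8.4\times10^{22}$ cm$^{-3}$) and g = 2:

```python
import numpy as np

k, h, m = 1.380649e-23, 6.62607015e-34, 9.1093837e-31  # SI: J/K, J s, kg
n = 8.4e22 * 1e6    # electron density of copper, m^-3
g, T = 2, 300.0     # spin degeneracy; room temperature

p = np.sqrt(3 * m * k * T)   # classical thermal momentum, eq. (14.37)
lam = h / p                  # de Broglie wavelength
print(n * lam**3 / g)        # ~1e4 >> 1: strongly degenerate even at 300 K
```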
The average occupation number of the single particle states is determined by

$$f(\epsilon_r) = \bar{n}_r = \frac{1}{e^{(\epsilon_r - \mu)/kT} + 1} \tag{14.41}$$

When T = 0, that is, in the completely degenerate case, $\bar{n}_r$ is a step function, with

$$f(\epsilon) = \begin{cases} 1 & \text{for } \epsilon \le \epsilon_F \\ 0 & \text{for } \epsilon > \epsilon_F \end{cases} \tag{14.42}$$

In the nearly degenerate case, we have T close to, but not quite, zero. More precisely,

$$0 < T \ll T_F = \frac{\epsilon_F}{k}$$

Then $\bar{n}_r$ is close to a step function, with $\bar{n}_r \approx 1$ in most of the range $\epsilon < \mu$, and $\bar{n}_r \approx 0$ in most of the range $\epsilon > \mu$, but dropping steeply from value 1 to 0 in a narrow band of width $\approx 4kT$.
In this nearly degenerate state, most of the Fermions remain in the ground state. Only a very small number of electrons with energies in the range $\epsilon_F - 2kT$ to $\epsilon_F$ move from the ground state into an excited state that lies in the narrow range $\epsilon_F$ to $\epsilon_F + 2kT$.

14.4 Density of States


It is often convenient to work with the density of states expressed in terms of the magnitude of the particle
momentum, rather than in terms of the particle energy. The reason for this is that, expressed in terms of p,
the density has the same form irrespective of whether the particles are relativistic or non-relativistic, and
quantum or non-quantum. The same formula applies to all cases.
We shall derive this formula here using a classical argument. Consider a free particle confined to volume V in space. The number of states, $\Phi(P)$, of this particle in which the magnitude p of its momentum is less

than a given magnitude P is then proportional to the volume of phase space defined by the conditions

$$\vec{r} \in V \quad\text{and}\quad p = \sqrt{p_x^2 + p_y^2 + p_z^2} \le P$$

It is thus given by the integral

$$\Phi(P) = \frac{1}{h_0^3}\int_V\int_{p\le P} d^3\vec{r}\;d^3\vec{p} = \frac{1}{h_0^3}\int_V d^3\vec{r}\int_{p\le P} d^3\vec{p} \tag{14.43}$$

We can interpret the constant $h_0^3$ as the volume of a cell in phase space that contains a "single" classical state. In classical mechanics, we must take the limit $h_0 \to 0$, so the only physically significant quantities are those in which this constant does not play a role, or in which it cancels out. In quantum mechanics however, it is known that the number of states available is finite, and is defined by the value $h_0 = h$, where h is Planck's constant. We thus obtain the appropriate quantum formulae from the classical ones by
replacing h0 by h. Since h0 does not feature in any significant way in classical results, it does not matter
what value it is actually assigned. So, we may as well assign it value h0 = h in all subsequent calculations.
That way we will get formulae that are valid for both quantum and non-quantum theories.
The first integral is easily evaluated. $\vec{r}$ is allowed to range only over the volume V to which the particle is confined, so

$$\int_V d^3\vec{r} = V$$

Also, $\vec{p}$ is only allowed to range over momenta that satisfy the condition

$$p_x^2 + p_y^2 + p_z^2 \le P^2$$

This defines a sphere in p-space of radius P, and so the second integral is also easily evaluated,

$$\int_{p\le P} d^3\vec{p} = \frac{4\pi}{3}\,P^3$$

Hence

$$\Phi(P) = \frac{1}{h^3}\,V\,\frac{4\pi}{3}\,P^3$$

The density of states is therefore given by

$$\rho(P) = \frac{d\Phi}{dP}(P) = \frac{4\pi V}{h^3}\,P^2 \tag{14.44}$$

or, changing the dummy variable,

$$\rho(p) = \frac{4\pi V}{h^3}\,p^2 \tag{14.45}$$

If we need the density of states in terms of the energy, we use the fact that, by definition of the density of states, we must have

$$\rho(\epsilon)\,d\epsilon = \rho(p)\,dp$$

so that,

$$\rho(\epsilon)\,d\epsilon = \rho(p(\epsilon))\,\frac{dp}{d\epsilon}(\epsilon)\,d\epsilon$$

Note that the relation $p = p(\epsilon)$ must be used at this stage of the calculation. Whether we get results (for quantum or non-quantum theory) that are relativistic or non-relativistic depends on what relation is used for p as a function of $\epsilon$.
Note that this result is true for spinless particles. For particles of spin s, which have g spin states available for each given momentum $\vec{p}$, we will have density of states given by

$$\rho(\epsilon)\,d\epsilon = g\,\rho(p(\epsilon))\,\frac{dp}{d\epsilon}(\epsilon)\,d\epsilon \tag{14.46}$$


For example, in a non-relativistic theory of a free particle, quantum or non-quantum, we have

$$\epsilon = \frac{p^2}{2m}$$

So,

$$d\epsilon = \frac{2p\,dp}{2m} = \frac{p\,dp}{m} \qquad\Longrightarrow\qquad \frac{dp}{d\epsilon} = \frac{m}{p} = \frac{m}{\sqrt{2m\epsilon}}$$

giving

$$\rho(\epsilon) = \rho(p)\,\frac{dp}{d\epsilon} = \frac{4\pi V}{h^3}\,p^2\,\frac{m}{p} = \frac{4\pi V}{h^3}\,m\sqrt{2m\epsilon} \tag{14.47}$$

or,

$$\rho(\epsilon) = 2\pi V\left(\frac{2m}{h^2}\right)^{3/2}\epsilon^{1/2} \tag{14.48}$$

which is the density deduced previously by other arguments.

14.4.1 Properties of the Degenerate Gas

Consider first the total internal energy of the degenerate Fermion gas. At temperature T = 0, all of the states for energies $\epsilon \le \epsilon_F$ are occupied, with all the states for energies $\epsilon > \epsilon_F$ unoccupied, so

$$f(\epsilon) = \bar{n}_\epsilon = \begin{cases} 1 & \text{for } \epsilon \le \epsilon_F \\ 0 & \text{for } \epsilon > \epsilon_F \end{cases}$$

So the total energy $E_0$ of the Fermion gas at temperature T = 0 is given by

$$E_0 = \sum_r \bar{n}_r\epsilon_r = \int_0^\infty f(\epsilon)\,\epsilon\,\rho(\epsilon)\,d\epsilon = \int_0^{\epsilon_F} \epsilon\,\rho(\epsilon)\,d\epsilon = \int_0^{p_F} \epsilon(p)\,\rho(p)\,dp \tag{14.49}$$

This formula is true in all cases. For a non-relativistic gas, we have $\epsilon = p^2/2m$, so

$$E_0 = \int_0^{p_F}\frac{p^2}{2m}\,\frac{4\pi gV}{h^3}\,p^2\,dp = \frac{2\pi gV}{mh^3}\int_0^{p_F} p^4\,dp \tag{14.50}$$

or,

$$E_0 = \frac{2\pi gV}{mh^3}\,\frac{p_F^5}{5} \tag{14.51}$$

This result is more transparent when expressed in terms of the average energy per particle of the system. We have

$$\frac{E_0}{N} = \frac{\frac{2\pi gV}{mh^3}\frac{p_F^5}{5}}{\frac{4\pi gV}{3h^3}\,p_F^3} = \frac{3}{10}\,\frac{p_F^2}{m} \tag{14.52}$$

or

$$\frac{E_0}{N} = \frac{3}{5}\,\epsilon_F \tag{14.53}$$

The pressure of the Fermi gas at T = 0 can be obtained from the relation

$$E = \frac{3}{2}\,PV \tag{14.54}$$


(see equation (14.67)) and is given by

$$P_0 = \frac{2}{3}\,\frac{E_0}{V} = \frac{2}{5}\,\frac{N}{V}\,\epsilon_F \tag{14.55}$$

More explicitly,

$$P_0 = \frac{1}{5}\left(\frac{6\pi^2}{g}\right)^{2/3}\frac{\hbar^2}{m}\left(\frac{N}{V}\right)^{5/3} \tag{14.56}$$

Note that, in a Fermi gas at T = 0, almost all of the particles have energies well above their ground state and momenta above their ground state momentum. This is directly a result of the Pauli exclusion principle. Even at zero temperature, Fermions are prevented from settling into their lowest possible energy state, unlike Bosons, which all cascade into their ground state. The Fermi gas is thus quite "alive," even at zero temperature.
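Equations (14.53)-(14.56) are straightforward to evaluate; a minimal sketch, again using the copper conduction-electron density from the exercises as an illustrative input:

```python
import numpy as np

hbar, m = 1.054571817e-34, 9.1093837e-31   # SI units
n, g = 8.4e28, 2                           # electron density m^-3, spin degeneracy

kF   = (6 * np.pi**2 * n / g) ** (1.0 / 3.0)   # Fermi wavevector
epsF = hbar**2 * kF**2 / (2 * m)               # Fermi energy
E0_per_N = 0.6 * epsF                          # eq. (14.53)
P0 = 0.4 * n * epsF                            # eq. (14.55)

print(epsF / 1.602e-19)   # ~7 eV Fermi energy for copper
print(P0)                 # ~4e10 Pa ground-state (degeneracy) pressure
```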

14.5 Fundamental Equation


14.5.1 Fermi-Dirac Functions

The grand partition function for the ideal Fermi gas can be written as

$$\Xi = \sum_R e^{-(E_R - \mu N_R)/kT} = \sum_{N=0}^{\infty} z^N\sum_{\{n_i\}}{}'\,e^{-\sum_i n_i\epsilon_i/kT} = \prod_i\sum_{n_i=0}^{1}\left(z\,e^{-\epsilon_i/kT}\right)^{n_i} = \prod_i\left(1 + z\,e^{-\epsilon_i/kT}\right) \tag{14.57}$$

where the prime indicates summation over $\sum_i n_i = N$ and $z = e^{\mu/kT}$ is the fugacity of the gas. From (13.37) it follows that for the ideal Fermi gas

$$\frac{PV}{kT} = \ln\Xi = \sum_i \ln\left(1 + z\,e^{-\beta\epsilon_i}\right) \tag{14.58}$$

and from (13.54)

$$N = \sum_r \frac{1}{e^{\beta(\epsilon_r - \mu)} + 1} = \sum_r \frac{1}{z^{-1}e^{\beta\epsilon_r} + 1} \tag{14.59}$$

For a large volume V, the energy levels of the gas are almost continuous and we can convert the summations in (14.58) and (14.59) to integrals. Using (14.48) we get

$$\frac{PV}{kT} = 2\pi gV\left(\frac{2m}{h^2}\right)^{3/2}\int_0^\infty \epsilon^{1/2}\ln\left(1 + z\,e^{-\beta\epsilon}\right)d\epsilon + \ln(1 + z) \tag{14.60}$$

$$N = 2\pi gV\left(\frac{2m}{h^2}\right)^{3/2}\int_0^\infty \frac{\epsilon^{1/2}}{z^{-1}e^{\beta\epsilon} + 1}\,d\epsilon + \frac{z}{z + 1}. \tag{14.61}$$

The factor g is a weight factor arising from the internal structure of the particles, such as spin. The last terms in (14.60) and (14.61) are included since the conversion to integrals inadvertently gives zero weight to the ground-state contributions. The parameter z can take the values $0 \le z \le \infty$. The last term in (14.61) is therefore negligible for large N, and for z of order unity the last term in (14.60) can also be neglected.
Upon substituting $\beta\epsilon = \frac{p^2}{2mkT} = x$ into (14.60) and (14.61) we obtain
$$\frac{P}{kT} = 2\pi g\left(\frac{2mkT}{h^2}\right)^{3/2}\int_0^\infty x^{1/2}\ln\left(1 + z\,e^{-x}\right)dx = \frac{g}{\lambda^3}\,f_{5/2}(z) \tag{14.62}$$

$$\frac{N}{V} = 2\pi g\left(\frac{2mkT}{h^2}\right)^{3/2}\int_0^\infty \frac{x^{1/2}}{z^{-1}e^x + 1}\,dx = \frac{g}{\lambda^3}\,f_{3/2}(z) \tag{14.63}$$

where $\lambda$ is the mean thermal wavelength of the particles

$$\lambda = \frac{h}{(2\pi mkT)^{1/2}} \tag{14.64}$$

The functions $f_n(z)$ are the Fermi-Dirac functions (see appendix D) defined by

$$f_n(z) = \frac{1}{\Gamma(n)}\int_0^\infty \frac{x^{n-1}}{z^{-1}e^x + 1}\,dx \tag{14.65}$$
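The Fermi-Dirac functions are easy to evaluate by direct quadrature, which is a useful check on the series and asymptotic expansions used below; a minimal sketch:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def f_n(n, z):
    # Fermi-Dirac function, eq. (14.65), with the integrand written in an
    # overflow-safe form: x^{n-1} * z e^{-x} / (1 + z e^{-x})
    integrand = lambda x: x**(n - 1) * z * np.exp(-x) / (1.0 + z * np.exp(-x))
    val, _ = quad(integrand, 0, np.inf)
    return val / gamma(n)

z = 0.1   # small fugacity: the non-degenerate limit
print(f_n(1.5, z), z - z**2 / 2**1.5)   # agree to ~1e-4, matching the series z - z^2/2^n + ...
```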

The internal energy of the system E is given by

$$E = -\left(\frac{\partial\ln\Xi}{\partial\beta}\right)_{z,V} = kT^2\left(\frac{\partial\ln\Xi}{\partial T}\right)_{z,V} = \frac{3}{2}\,kT\,\frac{gV}{\lambda^3}\,f_{5/2}(z) = \frac{3}{2}\,NkT\,\frac{f_{5/2}(z)}{f_{3/2}(z)}. \tag{14.66}$$

From (14.62) and (14.66) it follows that

$$P = \frac{2E}{3V} \tag{14.67}$$

The heat capacity $C_V$ of the gas can be obtained by differentiating (14.66) with respect to T, keeping V and N constant and using

$$\frac{1}{z}\left(\frac{\partial z}{\partial T}\right)_{N,V} = -\frac{3}{2T}\,\frac{f_{3/2}(z)}{f_{1/2}(z)}.$$

The result is

$$\frac{C_V}{Nk} = \frac{15}{4}\,\frac{f_{5/2}(z)}{f_{3/2}(z)} - \frac{9}{4}\,\frac{f_{3/2}(z)}{f_{1/2}(z)}. \tag{14.68}$$


The Helmholtz free energy of the gas is given by

$$F = N\mu - PV = NkT\left[\ln z - \frac{f_{5/2}(z)}{f_{3/2}(z)}\right] \tag{14.69}$$

and the entropy by

$$S = \frac{E - F}{T} = Nk\left[\frac{5}{2}\,\frac{f_{5/2}(z)}{f_{3/2}(z)} - \ln z\right] \tag{14.70}$$

In order to determine the various properties of the Fermi gas in terms of N, V and T, we need to know the functional dependence of the parameter z on N, V and T. This information is implicit in equation (14.63). In the limit of low density or high temperature it follows from (14.63) that

$$f_{3/2}(z) = \frac{N h^3}{V g\,(2\pi mkT)^{3/2}} \tag{14.71}$$

This corresponds to the highly non-degenerate state. From the expansion (see appendix D)

$$f_n(z) = \sum_{r=0}^{\infty}(-1)^r\,\frac{z^{r+1}}{(r+1)^n} = z - \frac{z^2}{2^n} + \frac{z^3}{3^n} - \frac{z^4}{4^n} + \cdots$$

z is small in the highly non-degenerate state and $f_n(z) \approx z$. This leads to
$$P \approx \frac{NkT}{V}, \qquad E \approx \frac{3}{2}\,NkT, \qquad C_V \approx \frac{3}{2}\,Nk$$

At low, but finite, temperatures, z is large in comparison to unity, but finite. Under these conditions the series expansions of $f_n(z)$ can be expressed as asymptotic expansions in powers of $(\ln z)^{-1}$ (see appendix E, Statistical Mechanics, R K Pathria). To a first approximation,

$$f_{5/2}(z) = \frac{8}{15\sqrt{\pi}}\,(\ln z)^{5/2}\left[1 + \frac{5\pi^2}{8}\,(\ln z)^{-2} + \ldots\right] \tag{14.72}$$

$$f_{3/2}(z) = \frac{4}{3\sqrt{\pi}}\,(\ln z)^{3/2}\left[1 + \frac{\pi^2}{8}\,(\ln z)^{-2} + \ldots\right] \tag{14.73}$$

$$f_{1/2}(z) = \frac{2}{\sqrt{\pi}}\,(\ln z)^{1/2}\left[1 - \frac{\pi^2}{24}\,(\ln z)^{-2} + \ldots\right] \tag{14.74}$$

Substituting (14.73) into (14.63) gives

$$\frac{N}{V} = \frac{4\pi g}{3}\left(\frac{2m}{h^2}\right)^{3/2}(kT\ln z)^{3/2}\left[1 + \frac{\pi^2}{8}\,(\ln z)^{-2} + \ldots\right]. \tag{14.75}$$

In the zeroth approximation this gives

$$kT\ln z = \left(\frac{3N}{4\pi gV}\right)^{2/3}\frac{h^2}{2m} \tag{14.76}$$


which is identical with the ground state result $\mu_0 = \epsilon_F$. In the next approximation,

$$kT\ln z = \mu \approx \epsilon_F\left[1 - \frac{\pi^2}{12}\left(\frac{kT}{\epsilon_F}\right)^2\right] \tag{14.77}$$

Substituting (14.72) and (14.73) into (14.66) gives

$$\frac{E}{N} = \frac{3}{5}\,(kT\ln z)\left[1 + \frac{\pi^2}{2}\left(\frac{kT}{\epsilon_F}\right)^2 + \ldots\right]$$

Using (14.76) this becomes

$$\frac{E}{N} = \frac{3}{5}\,\epsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{kT}{\epsilon_F}\right)^2\right] \tag{14.78}$$
The pressure of the gas is given by

$$P = \frac{2E}{3V} = \frac{2}{5}\,\frac{N}{V}\,\epsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{kT}{\epsilon_F}\right)^2\right] \tag{14.79}$$
From (14.78) the low temperature heat capacity is given by

$$C_V = Nk\,\frac{\pi^2}{2}\,\frac{kT}{\epsilon_F} + \ldots \tag{14.80}$$

which shows that at low temperatures the specific heat for electrons is linear in the temperature.
The Helmholtz free energy becomes

$$\frac{F}{N} = \mu - \frac{PV}{N} = \frac{3}{5}\,\epsilon_F\left[1 - \frac{5\pi^2}{12}\left(\frac{kT}{\epsilon_F}\right)^2\right] \tag{14.81}$$

whence

$$\frac{S}{Nk} = \frac{\pi^2}{2}\,\frac{kT}{\epsilon_F} + \ldots \tag{14.82}$$

Thus, as $T \to 0$, $S \to 0$, in accordance with the third law of thermodynamics.

In a metal the Fermi temperature $T_F = \epsilon_F/k \approx 10^5$ K. Hence at room temperature (300 K), the Fermi energy and chemical potential are essentially equivalent. This is significant since it is the chemical potential, not the Fermi energy, which appears in Fermi-Dirac statistics. It also implies that even at room temperature, an ideal Fermi gas can be treated as highly degenerate.

14.6 Simplistic Model of a white dwarf star


A typical simple model of a white dwarf star assumes a mass M( 1030 kg) of helium, packed into a ball
of mass density ( 107 g cm3 ), at a central temperature T ( 107 K). At this temperature the mean
thermal energy per particle is of the order of 103 eV, which is much greater than the ionisation energy of
an helium atom ( 24 eV). The complete star therefore exists in a state of ionisation. The microscopic
components of the star can be taken as N electrons, each of mass m, and 12 N helium nuclei, each of mass
4mp . The typical electron density in a white dwarf is O 1030 electrons per cm3 .


The Fermi momentum of the electron gas is

$$p_F = \left(\frac{3N}{8\pi V}\right)^{1/3} h = O\left(10^{-17}\right)\ \mathrm{g\,cm\,s^{-1}}, \tag{14.83}$$

which is comparable to the characteristic momentum mc of an electron. The Fermi energy is $\epsilon_F = O(10^6)$ eV and hence the Fermi temperature $T_F = \epsilon_F/k = O(10^{10})$ K. The conclusion is that the electrons in a white dwarf are relativistic and, though the temperature is high in comparison to terrestrial standards, the electrons are, statistically speaking, in a state of almost complete degeneracy: $T/T_F = O(10^{-3})$.
As a further simplification, treat the white dwarf star as if it were an ideal electron gas. That is, ignore the presence of the He nuclei, neglect all electron-electron interactions, neglect all electromagnetic radiation emitted by the electrons, and neglect the gravitational attraction on the electrons. We thus treat the star as if it were an ideal Fermi-Dirac gas of uniform density. This model is unrealistic, but the results it gives are qualitatively correct since the stability of the core is determined mainly by the balance between the gravitational attraction and the pressure due to the degenerate Fermi gas of electrons.

14.6.1 Relativistic Density of States

The electrons in a white dwarf are moving at relativistic velocities. We therefore cannot use the density of states for a non-relativistic homogeneous electron gas given by

$$\rho(\epsilon) = 2\pi gV\left(\frac{2m}{h^2}\right)^{3/2}\epsilon^{1/2}, \tag{14.84}$$

since this result was deduced from the non-relativistic Schrödinger equation, based on the classical relation $E = p^2/2m$ for the energy of a free particle. The relation that we need to use for relativistic particles is

$$E = \sqrt{p^2c^2 + m^2c^4} \tag{14.85}$$

We must therefore use a more general expression for the density of states which is valid for all values of the momentum p.
For this, we revert to the de Broglie theory, which is relativistic. The wave associated with a particle of energy E and momentum p has wavelength and frequency given by the equations $p = h/\lambda$, $E = h\nu$. From (14.6) it follows that

$$p_x = \hbar k_x = \frac{2\pi\hbar\,n_x}{L_x}$$

and similarly for the y and z components. The total number of permissible states in a volume $d^3p$ in momentum space is therefore

$$\frac{L_x L_y L_z}{(2\pi\hbar)^3}\,d^3p = \frac{V}{h^3}\,d^3p = 4\pi\,\frac{V}{h^3}\,p^2\,dp$$

The relativistic density of states in momentum space for spinless particles is thus

$$\rho(p) = 4\pi\,\frac{V}{h^3}\,p^2,$$

the same as equation (14.45) derived using a classical argument. For particles with spin s, it is therefore

$$\rho(p) = 4\pi g(s)\,\frac{V}{h^3}\,p^2$$

where g is a factor that corresponds to the internal structure of the particle. For example, for a spin-$\frac{1}{2}$ particle, two single particle states are allowed for each momentum vector, spin up and spin down, which makes g = 2 for spin-$\frac{1}{2}$ Fermions.

14.6.2 Energy of a Relativistic Fermi Gas at T = 0

We have

$$E = \sum_r \bar{n}_r\epsilon_r. \tag{14.86}$$

At temperature T = 0,

$$\bar{n}_r = \begin{cases} 1 & \epsilon < \epsilon_F \\ 0 & \epsilon > \epsilon_F \end{cases} \tag{14.87}$$

so that, in integral approximation, we get

$$E = \int_0^{\epsilon_F} \epsilon\,\rho(\epsilon)\,d\epsilon \tag{14.88}$$

The number of states in an interval dp which corresponds to an energy range $d\epsilon$ can be expressed as

$$\rho(p)\,dp = \rho(p)\,\frac{dp}{d\epsilon}\,d\epsilon = \rho(\epsilon)\,d\epsilon$$

Since we have the density of states as a function of p, we rewrite the integral (14.88), using $\rho(\epsilon)\,d\epsilon = \rho(p)\,dp$, as

$$E = \int_0^{p_F} \epsilon(p)\,\rho(p)\,dp = \int_0^{p_F}\sqrt{p^2c^2 + m^2c^4}\;4\pi g\,\frac{V}{h^3}\,p^2\,dp \tag{14.89}$$

$$= 4\pi g\,\frac{V}{h^3}\int_0^{p_F}\sqrt{p^2c^2 + m^2c^4}\;p^2\,dp \tag{14.90}$$

Write

$$\sqrt{p^2c^2 + m^2c^4} = mc^2\sqrt{1 + \left(\frac{p}{mc}\right)^2}.$$

Then, setting $x = p/mc$, we have

$$\int_0^{p_F}\sqrt{p^2c^2 + m^2c^4}\;p^2\,dp = mc^2\int_0^{x_F}\sqrt{1 + x^2}\,(mcx)^2\,d(mcx) = m^4c^5\int_0^{x_F}\sqrt{1 + x^2}\,x^2\,dx$$

$$= \frac{m^4c^5}{8}\left[x_F\left(1 + 2x_F^2\right)\sqrt{1 + x_F^2} - \operatorname{arcsinh}(x_F)\right] = \frac{m^4c^5}{8}\left[x_F\left(1 + 2x_F^2\right)\sqrt{1 + x_F^2} - \ln\left(x_F + \sqrt{x_F^2 + 1}\right)\right] \equiv m^4c^5\,I(x_F) \tag{14.91}$$


where $x_F = p_F/mc$ and

$$I(x_F) = \int_0^{x_F}\sqrt{1 + x^2}\,x^2\,dx = \frac{1}{8}\left[x_F\left(1 + 2x_F^2\right)\sqrt{1 + x_F^2} - \ln\left(x_F + \sqrt{x_F^2 + 1}\right)\right] \tag{14.92}$$

We get, at T = 0, that

$$E = 4\pi g\,\frac{V}{h^3}\,m^4c^5\,I(x_F). \tag{14.93}$$
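The closed form (14.92) is easy to verify against direct numerical integration; a minimal sketch:

```python
import numpy as np
from scipy.integrate import quad

def I_closed(xF):
    # Closed form of eq. (14.92)
    return (xF * (1 + 2 * xF**2) * np.sqrt(1 + xF**2)
            - np.log(xF + np.sqrt(xF**2 + 1))) / 8.0

for xF in [0.1, 1.0, 10.0]:
    num, _ = quad(lambda x: np.sqrt(1 + x**2) * x**2, 0, xF)
    print(xF, num, I_closed(xF))   # the two columns agree
```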

By definition of $p_F$, we have at T = 0

$$N = \sum_r \bar{n}_r = \int_0^{\epsilon_F}\rho(\epsilon)\,d\epsilon = \int_0^{p_F}\rho(p)\,dp = 4\pi g\,\frac{V}{h^3}\int_0^{p_F} p^2\,dp \tag{14.94}$$

so that

$$p_F = \left(\frac{3}{4\pi}\,\frac{N}{V}\,\frac{h^3}{g}\right)^{1/3} \tag{14.95}$$

and hence

$$x_F = \frac{h}{2mc}\left(\frac{3}{\pi}\,\frac{N}{V}\right)^{1/3} \tag{14.96}$$

where we have used the fact that, for electrons, g = 2. Note that m for our model of a white dwarf becomes the electron mass $m_e$.

14.6.3 Pressure of a Relativistic Fermi Gas at T = 0

Equation (14.93) expresses the energy E of the Fermi-Dirac gas at T = 0 in terms of V and N. It is thus an equation of the form E = E(T, V, N), evaluated at T = 0. This means that the above manipulations have succeeded in eliminating $\mu$ in favour of N, albeit only approximately, and at zero temperature. We have thus obtained an equation of state for the gas in Helmholtz representation. Now, in this representation,

$$dF = -S\,dT - P\,dV + \mu\,dN \tag{14.97}$$

where F = E − TS. Thus

$$P = -\left(\frac{\partial F}{\partial V}\right)_{T,N} = -\left(\frac{\partial}{\partial V}(E - TS)\right)_{T,N} \tag{14.98}$$

so, at zero temperature, we have

$$P = -\left[\left(\frac{\partial}{\partial V}(E - TS)\right)_{T,N}\right]_{T=0} = -\frac{\partial}{\partial V}\bigg|_N\left[E(T, V, N)\right]_{T=0} = -\frac{\partial E(0, V, N)}{\partial V}\bigg|_N \tag{14.99}$$


Hence,

$$P = -\frac{\partial}{\partial V}\bigg|_N\left[8\pi\,\frac{V}{h^3}\,m^4c^5\,I(x_F)\right] = -\frac{8\pi}{h^3}\,m^4c^5\,I(x_F) - \frac{8\pi V}{h^3}\,m^4c^5\,\frac{dI}{dx_F}\big(x_F(V,N)\big)\,\frac{\partial x_F}{\partial V}\bigg|_N = \frac{\pi m^4c^5}{3h^3}\left[x_F\left(2x_F^2 - 3\right)\sqrt{1 + x_F^2} + 3\sinh^{-1}(x_F)\right] \tag{14.100}$$

Using (14.92),

$$\frac{dI}{dx_F}\big(x_F(V,N)\big) = x_F^2\sqrt{1 + x_F^2} = x_F^2 + \frac{1}{2}x_F^4 - \frac{1}{8}x_F^6 + \cdots, \qquad \frac{\partial x_F}{\partial V}\bigg|_N = -\frac{1}{3}\,\frac{x_F}{V} \tag{14.101}$$

For $x_F \ll 1$,

$$P = \frac{\pi m^4c^5}{3h^3}\left[\frac{8}{5}\,x_F^5 - \frac{4}{7}\,x_F^7 + O\left(x_F^9\right)\right] \tag{14.102}$$

while for $x_F \gg 1$,

$$P = \frac{\pi m^4c^5}{3h^3}\left[2x_F^4 - 2x_F^2 + 3\ln 2x_F - \frac{7}{4} + \frac{5}{4}\,x_F^{-2} + O\left(x_F^{-3}\right)\right] \tag{14.103}$$

Equation (14.100) is an expression for the pressure in terms of N and V since, by equation (14.96),

$$x_F = \frac{h}{mc}\left(\frac{3N}{8\pi V}\right)^{1/3} \tag{14.104}$$

N and V are not convenient parameters to use; more convenient parameters are the stellar radius R and the stellar mass M. Now, a white dwarf star consists almost exclusively of helium. So its mass is given by

$$M = N(m_e + 2m_p) \approx 2m_p N$$

Also,

$$V = \frac{4}{3}\pi R^3$$

So,

$$\frac{N}{V} = \frac{M/2m_p}{\frac{4}{3}\pi R^3} = \frac{3}{8\pi}\,\frac{M}{m_p R^3}$$

giving

$$x_F = \frac{h}{2mc}\left(\frac{3}{\pi}\,\frac{3}{8\pi}\,\frac{M}{m_p R^3}\right)^{1/3} = \frac{h}{2mcR}\left(\frac{9}{8\pi^2}\,\frac{M}{m_p}\right)^{1/3} \tag{14.105}$$

14.6.4 Stability of the White Dwarf Star

We can arrive at a condition for the stability of a white dwarf star by the following simple argument. First, consider a situation in which there is no gravitational attraction to hold the ideal electron gas together. To assemble the electrons into a star of volume V, we would have to compress the electron gas from an infinite volume into volume V. The work done (by the system) in this process is

$$\int_\infty^V P\,dV = \int_\infty^R P\,4\pi r^2\,dr \tag{14.106}$$


This is the work that must have been done by the gravity of the star to assemble it.
Second, we calculate the work that must be done against the gravitational field (by the system) to dismantle it and to disperse it through space. By dimensional arguments, this work is given by an expression of the form

$$\alpha\,\frac{GM^2}{R} \tag{14.107}$$

where $\alpha$ is some constant $\approx 1$ determined by the detailed mass distribution of the star.
Since the star is stable under the combined action of electron pressure and gravity, these two energies must be the same. Thus

$$\int_\infty^R P\,4\pi r^2\,dr = \alpha\,\frac{GM^2}{R} \tag{14.108}$$

Now get rid of the integral by differentiating both sides with respect to R. This gives

$$P\,4\pi R^2 = -\alpha\,\frac{GM^2}{R^2} \cdot (-1) \qquad\text{or}\qquad P = \frac{\alpha}{4\pi}\,\frac{GM^2}{R^4} \tag{14.109}$$

Since this pressure is provided by the degenerate electron gas, it is given by (14.102), and so we have

$$\frac{8}{5}\,x_F^5 - \frac{4}{7}\,x_F^7 = \frac{3\alpha h^3 GM^2}{4\pi^2 m^4c^5 R^4} \tag{14.110}$$

This equation establishes a one-to-one, though implicit, relationship between the mass M and the radius R of white dwarf stars (see equation (14.105)).

Since $M \approx 10^{33}$ g, $m_p \approx 10^{-24}$ g and $\frac{\hbar}{mc} \approx 10^{-11}$ cm, we have $x_F \gg 1$ for $R \ll 10^8$ cm. For $R \gg 10^8$ cm, $x_F \ll 1$ with $P \approx \frac{8\pi}{15}\,\frac{m^4c^5}{h^3}\,x_F^5$, and

$$R \approx \frac{3\,(9\pi)^{2/3}}{40\,\alpha}\,\frac{\hbar^2}{G\,m\,m_p^{5/3}}\,M^{-1/3} \tag{14.111}$$

For $R \ll 10^8$ cm, $x_F \gg 1$ with $P \approx \frac{\pi m^4c^5}{3h^3}\left(2x_F^4 - 2x_F^2\right)$, which results in

$$R \approx \frac{(9\pi)^{1/3}}{2}\,\frac{\hbar}{mc}\left(\frac{M}{m_p}\right)^{1/3}\left[1 - \left(\frac{M}{M_0}\right)^{2/3}\right]^{1/2} \tag{14.112}$$

where

$$M_0 = \frac{9}{64}\left(\frac{3\pi}{\alpha^3}\right)^{1/2}\left(\frac{\hbar c}{G}\right)^{3/2}\frac{1}{m_p^2} \tag{14.113}$$

The radius of the white dwarf therefore decreases with increasing mass. There is obviously no solution for $M > M_0$, and we can conclude that, according to this crude approximation, all white dwarfs must have a mass less than $M_0$.
The correct limiting value of the mass is called the Chandrasekhar limit. The physical reason for this is that for a mass exceeding this limit, the ground state pressure of the electron gas (which arises from the Pauli exclusion principle) would not be sufficient to support the system against gravitational collapse.
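A minimal numerical sketch of (14.113), taking $\alpha = 1$ (the value of $\alpha$ depends on the star's mass distribution, so this is only an order-of-magnitude estimate):

```python
import numpy as np

hbar, c, G = 1.054571817e-34, 2.99792458e8, 6.674e-11   # SI units
mp, Msun = 1.67262e-27, 1.989e30                         # proton mass, solar mass
alpha = 1.0                                              # assumed mass-distribution constant

M0 = (9.0 / 64.0) * np.sqrt(3 * np.pi / alpha**3) * (hbar * c / G)**1.5 / mp**2
print(M0, M0 / Msun)   # ~1.6e30 kg, i.e. ~0.8 solar masses for alpha = 1;
                       # the accepted Chandrasekhar limit is ~1.4 solar masses
```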

Exercises
1. An ideal Fermi gas is at rest at absolute zero and has Fermi energy $\mu$. The mass of each particle is m. If $\vec{v}$ denotes the velocity of each particle, find $\overline{v_x}$ and $\overline{v_x^2}$.

2. Consider an ideal Fermi gas of N electrons in a volume V at absolute zero.
   (a) Calculate the total mean energy $\bar{E}$ of the gas.
   (b) Express $\bar{E}$ in terms of the Fermi energy $\mu$.
   (c) Show that $\bar{E}$ is properly an extensive quantity, but that for a fixed volume V, $\bar{E}$ is not proportional to the number of particles in the container.

3. Find the relation between the mean pressure $\bar{p}$ and volume V of an ideal Fermi-Dirac gas at T = 0.
   (a) Compute this by the general relation $\bar{p} = -\left(\frac{\partial\bar{E}}{\partial V}\right)$, valid at T = 0.
   (b) Compute this relationship from $\bar{p} = \frac{2}{3}\frac{\bar{E}}{V}$.
   (c) Use this to calculate an approximate pressure exerted by the conduction electrons of copper on the solid lattice which confines the electrons of the metal. ($\frac{N}{V} = 8.4\times10^{22}$ electrons cm$^{-3}$.)

4. Use arguments based on specific heat to give approximate answers to the following questions. (You may assume $T\,dS = dQ$ and $C \propto T^a$.)
   (a) By what factor does the entropy of the conduction electrons in a metal change when the temperature is changed from 200 to 400 K?
   (b) By what factor does the entropy of the electromagnetic radiation field change when the temperature is changed from 1000 to 2000 K?

Chapter 15
Phase transitions and critical exponents
We have discussed examples where the microscopic components of the system can be regarded as practically non-interacting. As a result the thermodynamic functions of the system follow from a knowledge of the energy levels of the individual constituents. Examples are the specific heats of gases and solids, the condensation of an ideal Bose gas, paramagnetism and the spectral distribution of black-body radiation. In the case of solids, the interatomic interactions play an important physical role, but since the positions of the atoms do not depart substantially from their mean positions over a range of temperatures, we can rewrite the problem, in a first approximation, in terms of so-called normal coordinates, treat the solid effectively as an assembly of non-interacting harmonic oscillators, and determine the specific heat, for example, from a simple, smooth, thermodynamic function. In the process the interparticle interactions are effectively removed by a transformation where the normal coordinates represent quasi-particles that do not interact.
Many physical phenomena cannot be described by simple models where the interactions between the
particles are removed by a suitable transformation. As a consequence the energy levels of the system
cannot be related to the levels of the individual constituents. The interactions between the constituents are
necessary to describe various kinds of phase transitions such as the melting of solids and condensation,
ferromagnetism and antiferromagnetism and order-disorder transitions in alloys. In these systems, it
is typical to find large numbers of microscopic constituents interacting with one another in a strong,
cooperative fashion. This cooperative behaviour normally assumes macroscopic significance below a
particular temperature Tc , known as the critical temperature of the system.
Mathematical problems associated with the study of cooperative phenomena are quite daunting. As
usual, one is forced to introduce models where the interparticle interactions are simplified, but yet retain
the essential physics of the system to describe the cooperative behaviour of the system.

15.1 Dynamical model of phase transitions


A number of systems that undergo phase transitions can be modelled by an array of lattice sites, with
only nearest-neighbour interaction that depends on the manner of occupation of the sites. This type of
simple model is good enough to give a unified theoretical basis for understanding a variety of phenomena
such as ferromagnetism and antiferromagnetism, gas-liquid transitions and order-disorder transitions in
alloys. The model oversimplifies the physics but retains the essential physical features that give rise to long-range order, the hallmark of the cooperative phenomena characteristic of these systems.
It is convenient to formulate the problem in terms of a lattice model of ferromagnetism. Each of N lattice sites is regarded as occupied by an atom with magnetic moment $\vec\mu$ of magnitude $g\mu_B\sqrt{J(J+1)}$, which has (2J + 1) discrete orientations in space. These orientations define different possible occupations of a given lattice site. The whole lattice is therefore capable of $(2J+1)^N$ different configurations. Associated with each configuration is a different energy E that arises from the mutual interactions between the neighbouring sites and from the interaction of the whole lattice with an external field $\vec B$. A statistical analysis of the canonical ensemble should yield the expectation value $\bar M(B,T)$ of the net magnetization $\vec M$. The presence of spontaneous magnetisation, $\bar M(0,T)$, at temperatures below the critical temperature $T_c$, and its absence above $T_c$, will then be interpreted as a ferromagnetic phase transition at $T_c$.
The simplest model to describe ferromagnetism assumes that
$$\mu = 2\mu_B\sqrt{s(s+1)}, \qquad s = \tfrac{1}{2}, \tag{15.1}$$
where s is the quantum number associated with the electron spin. With $s = \tfrac{1}{2}$ there are only two
orientations for each site,
$$s_z = +\tfrac12,\ \mu_z = \mu_B; \qquad s_z = -\tfrac12,\ \mu_z = -\mu_B. \tag{15.2}$$
The whole lattice is capable of $2^N$ configurations.
The interaction between neighbouring sites is of crucial importance in giving a model for a phase transition. The interaction energy between two neighbouring spins in this model is
$$\varepsilon_{ij} = \text{constant} - 2J_{ij}\,(\vec s_i\cdot\vec s_j). \tag{15.3}$$

The scalar product $\vec s_i\cdot\vec s_j$ can be written as
$$\vec s_i\cdot\vec s_j = \frac{1}{2}\left[(\vec s_i + \vec s_j)^2 - \vec s_i^{\,2} - \vec s_j^{\,2}\right] = \frac{1}{2}S(S+1) - s(s+1) \tag{15.4}$$
where S = 0, 1 is the total spin of the two particles and $s = \tfrac12$ the spin of a single particle. Hence
$$\vec s_i\cdot\vec s_j = \begin{cases} +\frac{1}{4}, & S = 1 \\[2pt] -\frac{3}{4}, & S = 0 \end{cases} \tag{15.5}$$

The difference in energy between two parallel-spin and two anti-parallel-spin neighbours is
$$\varepsilon_{\uparrow\downarrow} - \varepsilon_{\uparrow\uparrow} = 2J_{ij}\left(\tfrac14 + \tfrac34\right) = 2J_{ij}. \tag{15.6}$$
If $J_{ij} > 0$, the $\uparrow\uparrow$-state is energetically favoured over the $\uparrow\downarrow$-state and we would expect ferromagnetism. For $J_{ij} < 0$ the situation is reversed and we expect antiferromagnetism.
The constant in (15.3) is immaterial because the potential energy is defined to within a constant, and if only nearest neighbours are considered, we can set $J_{ij} = J$ for nearest neighbours and zero between all other sites. The interaction energy of the whole system is then given by
$$E = \text{constant} - 2J\sum_{n.n.}(\vec s_i\cdot\vec s_j) \tag{15.7}$$
where the sum is over nearest neighbours only. This model is known as the Heisenberg model (Heisenberg, W., Z. Physik 49, 619, 1928).
A simpler model results if we write the energy of the lattice as
$$E = \text{constant} - J\sum_{n.n.}\sigma_i\sigma_j \tag{15.8}$$
where $\sigma = +1$ for up spin and $\sigma = -1$ for down spin. Note that we still have $\varepsilon_{\uparrow\downarrow} - \varepsilon_{\uparrow\uparrow} = 2J$. The simplified expression (15.8) for the energy does not require a quantum mechanical treatment and thus avoids the need to take care of the commutation properties of the spin operators. This model is known as the Ising model (Lenz, W., Z. Physik 21, 613, 1920; Ising, E., Z. Physik 31, 253, 1925).
The Ising model can be derived from the Heisenberg model by retaining only the last term in
$$\vec s_i\cdot\vec s_j = s_{ix}s_{jx} + s_{iy}s_{jy} + s_{iz}s_{jz}. \tag{15.9}$$
A different model, the X-Y model, results if the z-component is suppressed and the x and y components are retained.


To study the statistical mechanics of the Ising model we disregard the kinetic energy of the atoms occupying the different lattice sites. The z-direction is fixed by an external magnetic field $\vec B$ in the up direction, so that each spin acquires an additional energy $-\mu B\sigma_i$. The Hamiltonian of the system is then given by
$$H\{\sigma_i\} = -J\sum_{n.n.}\sigma_i\sigma_j - \mu B\sum_i \sigma_i. \tag{15.10}$$

The partition function is given by
$$Z_N(B,T) = \sum_{\sigma_1}\cdots\sum_{\sigma_N} e^{-\beta H\{\sigma_i\}} = \sum_{\sigma_1}\cdots\sum_{\sigma_N} \exp\left\{\beta\left[J\sum_{n.n.}\sigma_i\sigma_j + \mu B\sum_i\sigma_i\right]\right\} \tag{15.11}$$

The Helmholtz free energy, the internal energy, the specific heat and the nett magnetisation of the system follow from the formulae
$$F(B,T) = -kT\ln Z_N(B,T) \tag{15.12}$$
$$\bar E(B,T) = -T^2\frac{\partial}{\partial T}\left(\frac{F}{T}\right) = kT^2\frac{\partial}{\partial T}\ln Z_N(B,T) \tag{15.13}$$
$$C(B,T) = \left(\frac{\partial \bar E(B,T)}{\partial T}\right)_B = \frac{\partial}{\partial T}\left[kT^2\frac{\partial}{\partial T}\ln Z_N(B,T)\right] \tag{15.14}$$

and
$$\bar M(B,T) = \mu\left\langle\sum_i\sigma_i\right\rangle = \frac{1}{\beta}\frac{\partial}{\partial B}\ln Z_N(B,T) = -\frac{\partial F}{\partial B}. \tag{15.15}$$

The quantity $\bar M(0,T)$ gives the spontaneous magnetisation of the system. If it is non-zero only below the critical temperature $T_c$, the system is ferromagnetic for $T < T_c$ and paramagnetic for $T > T_c$. At the critical temperature itself, the system is expected to show some sort of singular behaviour.
It is obvious that the energy levels as a whole will be degenerate, since various configurations of $\{\sigma_i\}$ have the same energy. The energy of this model does not depend on the detailed values of the variables $\sigma_i$; it depends only on the total number $N_{\uparrow\uparrow}$ of up-up nearest-neighbour pairs, and so on. To see this, define
$$\begin{aligned} N_\uparrow &\equiv \text{total up spin sites}\\ N_\downarrow &\equiv \text{total down spin sites}\\ N_{\uparrow\uparrow} &\equiv \text{total up-up spin nearest-neighbour pairs}\\ N_{\downarrow\downarrow} &\equiv \text{total down-down spin nearest-neighbour pairs}\\ N_{\uparrow\downarrow} &\equiv \text{total opposite-spin nearest-neighbour pairs} \end{aligned} \tag{15.16}$$

The numbers $N_\uparrow$ and $N_\downarrow$ must satisfy
$$N_\uparrow + N_\downarrow = N. \tag{15.17}$$
Let q be the coordination number of the lattice, i.e. the number of nearest neighbours of each lattice site. For example, for a square lattice q = 4, for a simple cubic lattice q = 6, and for a body-centred cubic lattice q = 8. We also have the relations
$$qN_\uparrow = 2N_{\uparrow\uparrow} + N_{\uparrow\downarrow}, \qquad qN_\downarrow = 2N_{\downarrow\downarrow} + N_{\uparrow\downarrow}. \tag{15.18}$$

All numbers can be expressed in terms of any two, for example in terms of $N_\uparrow$ and $N_{\uparrow\uparrow}$:
$$\begin{aligned} N_\downarrow &= N - N_\uparrow\\ N_{\uparrow\downarrow} &= qN_\uparrow - 2N_{\uparrow\uparrow}\\ N_{\downarrow\downarrow} &= \tfrac12 qN - qN_\uparrow + N_{\uparrow\uparrow} \end{aligned} \tag{15.19}$$


Note that the sum of all nearest-neighbour pairs is
$$N_{\uparrow\uparrow} + N_{\downarrow\downarrow} + N_{\uparrow\downarrow} = \tfrac12 qN,$$
as expected. With the relations above, the Hamiltonian can be expressed as
$$\begin{aligned} H_N(N_\uparrow, N_{\uparrow\uparrow}) &= -J\left(N_{\uparrow\uparrow} + N_{\downarrow\downarrow} - N_{\uparrow\downarrow}\right) - \mu B\left(N_\uparrow - N_\downarrow\right)\\ &= -J\left(\tfrac12 qN - 2qN_\uparrow + 4N_{\uparrow\uparrow}\right) - \mu B\left(2N_\uparrow - N\right). \end{aligned} \tag{15.20}$$

Let $g_N(N_\uparrow, N_{\uparrow\uparrow})$ be the number of distinct ways in which the N spins of the lattice can be arranged to give the preassigned values of $N_\uparrow$ and $N_{\uparrow\uparrow}$. The partition function of the system can now be expressed as
$$Z_N(B,T) = \sum_{N_\uparrow,\,N_{\uparrow\uparrow}} g_N(N_\uparrow, N_{\uparrow\uparrow})\,e^{-\beta H_N(N_\uparrow, N_{\uparrow\uparrow})} = e^{\beta N\left(\frac12 qJ - \mu B\right)}\sum_{N_\uparrow} e^{-2\beta(qJ - \mu B)N_\uparrow}\ {\sum_{N_{\uparrow\uparrow}}}' \, g_N(N_\uparrow, N_{\uparrow\uparrow})\,e^{4\beta J N_{\uparrow\uparrow}} \tag{15.21}$$
where the primed sum runs over $N_{\uparrow\uparrow}$ at fixed $N_\uparrow$. The central problem thus consists of determining the combinatorial function $g_N(N_\uparrow, N_{\uparrow\uparrow})$ for a given value of $N_\uparrow$.
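For very small lattices, (15.11) can instead be evaluated by direct enumeration of all $2^N$ configurations. The sketch below is not part of the original notes: the one-dimensional periodic chain, the parameter values, and the finite-difference estimate of the magnetisation (15.15) are illustrative assumptions, chosen only to show the structure of the calculation (units with k = 1).

```python
# Brute-force evaluation of Z_N for a small periodic Ising chain, eq. (15.11).
import itertools
import math

def ising_Z(N=8, J=1.0, mu=1.0, B=0.1, T=2.0):
    """Sum exp(-H/kT) over all 2^N spin configurations, H from eq. (15.10)."""
    beta = 1.0 / T
    Z = 0.0
    for sigma in itertools.product((+1, -1), repeat=N):
        H = -J * sum(sigma[i] * sigma[(i + 1) % N] for i in range(N))
        H -= mu * B * sum(sigma)
        Z += math.exp(-beta * H)
    return Z

def magnetisation(N=8, J=1.0, mu=1.0, B=0.1, T=2.0, dB=1e-4):
    """Estimate M = (1/beta) d(ln Z)/dB of eq. (15.15) by central differences."""
    beta = 1.0 / T
    return (math.log(ising_Z(N, J, mu, B + dB, T))
            - math.log(ising_Z(N, J, mu, B - dB, T))) / (2 * beta * dB)

print(ising_Z(), magnetisation())
```

The cost grows as $2^N$, which is precisely why the combinatorial function $g_N(N_\uparrow, N_{\uparrow\uparrow})$, or an approximation to it, is needed for macroscopic N.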

15.2 Ising model in the zeroth approximation


Define a long-range order parameter L in a given configuration by the relationship
$$L = \frac{1}{N}\sum_i \sigma_i = \frac{N_\uparrow - N_\downarrow}{N} = 2\frac{N_\uparrow}{N} - 1 \qquad (-1 \le L \le 1), \tag{15.22}$$
so that
$$N_\uparrow = \frac{N}{2}(1 + L) \quad\text{and}\quad N_\downarrow = \frac{N}{2}(1 - L). \tag{15.23}$$

The magnetisation M is then given by
$$M = \mu(N_\uparrow - N_\downarrow) = \mu N L \qquad (-\mu N \le M \le \mu N). \tag{15.24}$$
For a completely random system $N_\uparrow = N_\downarrow = \tfrac12 N$ and the expectation values of both L and M are zero.
We now approximate the first part of the Hamiltonian in equation (15.10) by
$$-J\sum_{n.n.}\sigma_i\sigma_j \;\approx\; -J\left(\tfrac12 q\bar\sigma\right)\sum_i\sigma_i. \tag{15.25}$$
In other words, for a given $\sigma_i$ we replace each neighbouring $\sigma_j$ by the average
$$\bar\sigma = \frac{1}{N}\sum_i\left(\frac{1}{q}\sum_{j\ n.n.\ \text{of}\ i}\sigma_j\right) \tag{15.26}$$
over nearest neighbours. The factor $\tfrac12$ is included to avoid duplication in the counting of nearest-neighbour pairs. Since the coordination number is q,
$$\bar\sigma = \frac{1}{N}\sum_i\left(\frac{1}{q}\sum_{j\ n.n.\ \text{of}\ i}\sigma_j\right) = \frac{1}{N}\sum_i\sigma_i = L. \tag{15.27}$$


From (15.10), (15.25), (15.26) and (15.27) it follows that the approximate configurational energy of the system is given by
$$E = -\left(\tfrac12 qJL\right)NL - (\mu B)NL. \tag{15.28}$$
The expectation value of E is then given by
$$\bar E = -\tfrac12 qJN\bar L^2 - \mu BN\bar L. \tag{15.29}$$

In the same approximation, the energy expended in changing an up spin into a down spin is given by
$$\Delta\varepsilon = \Delta\sigma\left(qJ\bar\sigma + \mu B\right) = 2\left(qJ\bar L + \mu B\right) \tag{15.30}$$
since $\Delta\sigma = 2$. The quantity $qJ\bar L/\mu$ plays the role of an internal field. It is determined by (i) the mean value $\bar L$ of the long-range order prevailing in the system and by (ii) the strength of the coupling, qJ, between a given spin and its q nearest neighbours. This approximation is equivalent to the mean molecular field theory of Weiss, put forward in 1907 to explain the magnetic behaviour of ferromagnetic domains.
The relative values of the equilibrium numbers $N_\uparrow$ and $N_\downarrow$ follow from the Boltzmann principle,
$$\frac{N_\uparrow}{N_\downarrow} = e^{\Delta\varepsilon/kT} = e^{2(qJ\bar L + \mu B)/kT}. \tag{15.31}$$

Substituting (15.23) into (15.31) leads to
$$\frac{1-L}{1+L} = e^{-2(qJ\bar L + \mu B)/kT} \tag{15.32}$$
or
$$\frac{qJ\bar L + \mu B}{kT} = -\frac12\ln\frac{1-L}{1+L} = \tanh^{-1}L. \tag{15.33}$$

In order to investigate possible spontaneous magnetisation, let $B \to 0$, which gives the relationship
$$L_0 = \tanh\frac{qJL_0}{kT}. \tag{15.34}$$

Figure 15.1. Graphical solution of equation (15.34), showing the line y = x together with the curves y = tanh(2x) and y = tanh(2x/3), and the function sech(x).
The solution $L_0 = 0$ is allowed for all positive T, but only for $T < T_c$ does the straight line $y = L_0$ intersect the curve $y = \tanh(qJL_0/kT)$ at a non-zero positive $L_0$. Since
$$\frac{d}{dx}\tanh(lx) = l\,\mathrm{sech}^2(lx),$$
the slope of the curve $y = \tanh(qJL_0/kT)$ is equal to $qJ/kT$ at $L_0 = 0$ and decreases monotonically to zero with increasing $L_0$. The condition that (15.34) has a solution at finite $L_0$ is then
$$\frac{qJ}{kT} > 1, \quad\text{or}\quad T < \frac{qJ}{k} = T_c. \tag{15.35}$$

The system therefore has a critical temperature $T_c$, below which it acquires a spontaneous magnetisation and above which it does not. It is natural to identify $T_c$ with the Curie temperature of the system, the temperature that marks the transition from paramagnetic to ferromagnetic behaviour.
For each $L_0$ which is a solution of (15.34), $-L_0$ will also be a solution. Since this is a solution for the zero-field situation, there is no way of assigning up and down spin states, and both solutions are equally valid.
The precise variation of $L_0(T)$ as a function of T can be obtained by solving equation (15.34) numerically, as in the sketch below. The general trend, however, can be seen by noting that at $T = qJ/k$ the straight line $y = L_0$ is tangential to the curve $y = \tanh(qJL_0/kT)$ at the origin, so that $L_0(T_c) = 0$. As T decreases, the initial slope of the curve increases and the point of intersection rapidly moves away from the origin; accordingly $L_0(T)$ increases rapidly as T decreases below $T_c$.
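The following minimal sketch (not part of the original notes) solves (15.34) by fixed-point iteration, in units where $T_c = qJ/k = 1$; the starting guess and the tolerance are arbitrary choices.

```python
import math

def solve_L0(T, Tc=1.0, tol=1e-12):
    """Positive solution L0 of L0 = tanh(Tc*L0/T); zero for T >= Tc."""
    if T >= Tc:
        return 0.0
    L = 1.0                          # start from complete order
    while True:
        L_new = math.tanh(Tc * L / T)
        if abs(L_new - L) < tol:
            return L_new
        L = L_new

for t in (0.2, 0.5, 0.9, 0.99):
    print(t, solve_L0(t))            # L0 grows rapidly as T drops below Tc
```

The iteration converges because the slope of $\tanh(T_cL/T)$ at the non-zero fixed point is less than one, which is exactly the geometric condition discussed above.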
It is possible to get an approximation for the behaviour of $L_0(T)$ near the critical temperature by noting that for small x
$$\tanh x \approx x - \frac{x^3}{3},$$
to obtain
$$L_0(T) \simeq \left[3\left(1 - \frac{T}{T_c}\right)\right]^{1/2} \qquad (T \lesssim T_c,\ B = 0). \tag{15.36}$$

Figure 15.2. Spontaneous magnetisation $L_0(T)$ compared with experimental data for iron, nickel, cobalt and magnetite (from Pathria, 2003, p 325).
As $T \to 0$, $L_0(T) \to 1$ in accordance with the relationship
$$L_0(T) \simeq 1 - 2e^{-2T_c/T}. \tag{15.37}$$

Experimental data for iron, nickel, cobalt and magnetite are shown in Figure 15.2.
The field-free configurational energy is given by equation (15.29),
$$\bar E(T) = -\tfrac12 qJN L_0^2,$$
and the specific heat by
$$C_0(T) = \frac{d\bar E}{dT} = -qJN L_0\frac{dL_0}{dT} = \frac{NkL_0^2}{\dfrac{(T/T_c)^2}{1 - L_0^2} - \dfrac{T}{T_c}}. \tag{15.38}$$

For all $T > T_c$, $\bar E(T)$ and $C_0(T)$ are zero. The value of the specific heat as $T_c$ is approached from below follows from equations (15.36) and (15.38). Writing $x = 1 - T/T_c$, so that $L_0^2 \approx 3x$,
$$C_0(T \to T_c^-) = \lim_{x\to 0}\frac{Nk\,(3x)}{\dfrac{(1-x)^2}{1-3x} - (1-x)} = \lim_{x\to 0}\frac{3}{2}Nk\,\frac{1-3x}{1-x} = \frac{3}{2}Nk. \tag{15.39}$$
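As a numerical check (again not part of the original notes, and reusing solve_L0 from the sketch above), one can evaluate (15.38) just below $T_c$ and watch the specific heat approach $\tfrac32 Nk$:

```python
def C0_over_Nk(T, Tc=1.0):
    """Configurational specific heat C0/Nk from eq. (15.38)."""
    L0 = solve_L0(T, Tc)
    if L0 == 0.0:
        return 0.0                   # no configurational order above Tc
    t = T / Tc
    return L0**2 / (t**2 / (1.0 - L0**2) - t)

for t in (0.5, 0.9, 0.99, 0.999):
    print(t, C0_over_Nk(t))          # tends to 1.5 as T -> Tc from below
```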

The specific heat therefore has a discontinuity at the transition point. On the other hand, $C_0(T)$ vanishes as $T \to 0$ according to the formula
$$C_0(T) \approx 4Nk\left(\frac{T_c}{T}\right)^2 e^{-2T_c/T}. \tag{15.40}$$


The vanishing of the configurational energy and the specific heat above $T_c$ is related to the fact that, in the present approximation, there is no configurational order above $T_c$. Consequently the configurational entropy attains its maximum value at $T = T_c$, and above that temperature the system remains thermodynamically inert. As a check, the configurational entropy of the system at $T = T_c$ can be evaluated using equations (15.34) and (15.38):
$$S_0(T_c) = \int_0^{T_c}\frac{C_0(T)}{T}\,dT = -qJN\int_0^{T_c}\frac{L_0}{T}\frac{dL_0}{dT}\,dT = -qJN\int_1^0\frac{L_0}{T}\,dL_0 = Nk\int_0^1\tanh^{-1}L_0\,dL_0 = Nk\ln 2, \tag{15.41}$$
where (15.34) was used in the form $1/T = (k/qJ)\tanh^{-1}(L_0)/L_0$ to convert the integration over T into one over $L_0$.
This is exactly the entropy of a system capable of $2^N$ equally likely microstates. The fact that there are $2^N$ equally likely microstates is again related to the fact that there is no configurational order for $T \ge T_c$.
The magnetic susceptibility of the system can be determined from equation (15.33) and is given by
$$\chi(B,T) = \left(\frac{\partial\bar M}{\partial B}\right)_T = \mu N\left(\frac{\partial\bar L}{\partial B}\right)_T = \frac{N\mu^2}{k}\left(\frac{1 - L^2(B,T)}{T - T_c\left[1 - L^2(B,T)\right]}\right). \tag{15.42}$$
For $L \ll 1$, which is true for high temperatures at finite B and also near $T_c$ if B is small, (15.42) gives the Curie-Weiss law
$$\chi_0(T) \simeq \frac{N\mu^2}{k}\,(T - T_c)^{-1} \qquad (T \gtrsim T_c,\ B \to 0). \tag{15.43}$$

For T less than, but close to, $T_c$, equation (15.36) can be used to give
$$\chi_0(T) \simeq \frac{N\mu^2}{2k}\,(T_c - T)^{-1} \qquad (T \lesssim T_c,\ B \to 0). \tag{15.44}$$
Experimentally the Curie-Weiss law is satisfied with considerable accuracy, except that the value of $T_c$ obtained by fitting the data is always somewhat larger than the true transition temperature of the material.

15.3 Critical Exponents


A basic problem in the theory of phase transitions is to study the behaviour of a system near its critical point. This behaviour is marked by the fact that various physical quantities possess singularities at the critical point, as we saw for the specific heat in the previous section. It is customary to express these singularities in terms of power laws characterised by a set of critical exponents, which determine the qualitative nature of the critical behaviour of the system.
To begin with, identify an order parameter m, and a corresponding ordering field h, such that in the limit $h \to 0$, m tends to a limiting value $m_0$, with the property that $m_0$ is zero for $T \ge T_c$ and non-zero for $T < T_c$. For magnetic systems, the natural candidate for m is the parameter L of equation (15.27), while h is identified as $\mu B/kT_c$.
The manner in which $m_0 \to 0$ as $T \to T_c$ from below defines the critical exponent $\beta$:
$$m_0 \sim (T_c - T)^{\beta} \qquad (h \to 0,\ T \to T_c^-). \tag{15.45}$$
The manner in which the low-field susceptibility $\chi_0$ diverges as $T \to T_c$ from above (or below) defines the

critical exponent $\gamma$ (or $\gamma'$):
$$\chi_0 \equiv \left(\frac{\partial m}{\partial h}\right)_{T,\,h\to 0} \sim \begin{cases} (T - T_c)^{-\gamma} & \text{for } h \to 0,\ T \to T_c^+\\[4pt] (T_c - T)^{-\gamma'} & \text{for } h \to 0,\ T \to T_c^- \end{cases} \tag{15.46}$$

The critical exponent $\delta$ is defined by the relation
$$m\big|_{T=T_c} \sim h^{1/\delta} \qquad (T = T_c,\ h \to 0) \tag{15.47}$$

and the exponents $\alpha$ and $\alpha'$ are defined on the basis of the specific heat:
$$C_V \sim \begin{cases} (T - T_c)^{-\alpha} & \text{for } T \to T_c^+\\[4pt] (T_c - T)^{-\alpha'} & \text{for } T \to T_c^- \end{cases} \tag{15.48}$$

In the simple Ising model of the previous example,
$$\beta = \tfrac12, \qquad \gamma = \gamma' = 1, \qquad \delta = 3, \qquad \alpha = \alpha' = 0. \tag{15.49}$$

Experimental values of the exponents differ from those derived within the mean field approximation, which makes it clear that a theory beyond the mean field approximation is necessary to describe real systems. It turns out that the critical exponents are not independent, but satisfy inequalities or equalities dictated by the principles of thermodynamics. The critical exponents are determined by a small number of parameters and tend to be similar from system to system. The most important parameters are the dimension in which the system is embedded, the number of components that determine the order parameter, and the range of the microscopic interaction. As far as the interactions are concerned, the only thing that matters is whether they are short or long ranged.


Appendix A
Statistical Calculations
15.1 The Integral $\int e^{-x^2}dx$

The following integral occurs frequently in statistical calculations,
$$I = \int_{-\infty}^{+\infty} e^{-x^2}\,dx$$

To evaluate it, we use the following subterfuge. First, square the integral to get
$$I^2 = \int_{x=-\infty}^{+\infty} e^{-x^2}\,dx\int_{y=-\infty}^{+\infty} e^{-y^2}\,dy = \iint_{\mathbb{R}^2} e^{-(x^2+y^2)}\,dx\,dy$$
Then, transform to polar coordinates on the x-y plane, with $r^2 = x^2 + y^2$, to get
$$I^2 = \int_{\theta=0}^{2\pi}\int_{r=0}^{\infty} e^{-r^2}\,r\,dr\,d\theta = \int_{\theta=0}^{2\pi}d\theta\int_{r=0}^{\infty} e^{-r^2}\,d(r^2/2) = \pi\int_{u=0}^{\infty} e^{-u}\,du = \pi$$
from which it follows that
$$\int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi}$$
Since the integrand $e^{-x^2}$ in this integral is even, we can deduce another useful result,
$$\int_0^{\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}$$

15.2 The Integral $\int x^n e^{-x}dx$

Another integral encountered frequently in statistical calculations is
$$I_n = \int_0^{\infty} x^n e^{-x}\,dx$$
where n = 0, 1, 2, .... This integral can be evaluated by induction on n. First note that, for n = 0, $I_0$ is trivially evaluated,
$$I_0 = \int_0^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_0^{\infty} = 1$$
Then note that $I_n$ can be reduced to $I_{n-1}$ by integration by parts,
$$I_n = \int_0^{\infty} x^n e^{-x}\,dx = -\int_0^{\infty} x^n\,d\!\left(e^{-x}\right) = \left[-x^n e^{-x}\right]_0^{\infty} + \int_0^{\infty} e^{-x}\,d(x^n)$$
Using l'Hopital's rule, it is easy to show that
$$\lim_{x\to\infty} x^n e^{-x} = 0$$


so that $\left[-x^n e^{-x}\right]_0^{\infty} = 0$, and hence
$$I_n = n\int_0^{\infty} e^{-x}x^{n-1}\,dx = n\,I_{n-1}$$
This is a recurrence relation which, by repeated application, enables us to express $I_n$ in terms of $I_0$, which we have already evaluated. For all integers n > 0, it gives
$$I_n = n\,I_{n-1} = n(n-1)\,I_{n-2} = \cdots = n(n-1)(n-2)\cdots 3\cdot 2\cdot 1\;I_0$$
Since $I_0 = 1$, we get
$$I_n = n!$$
Since $I_0$ is well defined and has value $I_0 = 1$, we extend the factorial function to include also the value n = 0. Define
$$0! = I_0 = 1$$
With this definition, we can summarise the results of this section by a single formula: for n = 0, 1, 2, ...,
$$\int_0^{\infty} x^n e^{-x}\,dx = n! \tag{A.1}$$
This, incidentally, is the real reason for defining the symbol 0! and for assigning to it the value 1.

15.3 Calculation of n!

15.3.1 Approximate Formulae for n!

The calculation of n! for large n is laborious and time consuming. Fortunately, there are several simple approximations by which n! may be calculated to any desired degree of accuracy.
To lowest order, n! is given for very large n by
$$\ln n! = n\ln n - n \tag{A.2}$$
This can also be expressed as
$$n! = n^n e^{-n} = \left(\frac{n}{e}\right)^n \tag{A.3}$$
In statistical mechanics, we generally deal with huge numbers, and (A.3) is an excellent approximation. For numbers n which are large, but not huge, a better approximation is obtained from Stirling's formula, given by
$$n! = n^n e^{-n}\sqrt{2\pi n}$$
Stirling's formula can also be written as
$$n! = \left(\frac{n}{e}\right)^n\sqrt{2\pi n}$$
or
$$\ln n! = n\ln n - n + \frac12\ln 2\pi n \tag{A.4}$$
Note that, if n is extremely large, then $\ln n \ll n$, and (A.4) reduces to the simpler result (A.2). For example, if $n = 6\times 10^{23}$, then $\ln n \approx 55$, which is utterly negligible when compared with $n\ln n - n$.
For numbers which are neither huge nor very large, Stirling's formula may not be sufficiently accurate. We can then use an infinite series expansion for n! to calculate n! to any desired degree of accuracy. This infinite series is given by
$$n! = \left(\frac{n}{e}\right)^n\sqrt{2\pi n}\left[1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51840n^3} + \cdots\right] \tag{A.5}$$
Higher order terms in this series are rarely needed, but can be calculated easily by the method outlined below in the proof of this power series. Series (A.5) can also be expressed as
$$\ln n! = (n\ln n - n) + \frac12\ln 2\pi + \frac12\ln n + \ln\left[1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51840n^3} + \cdots\right]$$
In statistical mechanics, we deal typically with numbers of the order of $10^{23}$. The difference between the crude approximation of equation (A.2) and the answer given by the power series (A.5) continued to any number of terms is, for $n = 10^{23}$,
$$\Delta\ln n! = \frac12\ln 2\pi + \frac12\ln n + \ln\left[1 + \frac{1}{12n} + \frac{1}{288n^2} - \cdots\right] \approx 0.9 + 26.5 + 0.0 \approx 27$$
so the fractional error made in the calculation of $\ln n!$ is
$$\frac{\Delta\ln n!}{\ln n!} = \frac{27}{n(\ln n - 1)} \approx \frac{27}{5\times 10^{24}}$$
Formula (A.2) is therefore in error by only about 1 part in $10^{23}$, which is fantastically small. As for calculating n! itself, (A.2) is off by a multiplicative factor of about $e^{27}$, which is still utterly negligible compared with n! itself, a number of order $e^{5\times 10^{24}}$.

15.3.2 Lowest Order Approximation

To lowest order, n! is given by
$$\ln n! = n\ln n - n$$
The proof of this result is elementary. First, write n! explicitly as a product,
$$n! = 1\cdot 2\cdot 3\cdots(n-2)(n-1)n$$
Then
$$\ln n! = \ln 1 + \ln 2 + \cdots + \ln(n-2) + \ln(n-1) + \ln n = \sum_{r=1}^{n}\ln r \tag{A.6}$$
We now interpret the summation in (A.6) as the area under the polygonal graph shown in Figure 15.1 (from Sears and Salinger, 1975, p 426, Figure C-1). In the theory of integration, a polygonal area like this is regarded as an approximation to the area under the graph of the function ln x, also shown in the figure,
$$\int_1^n \ln x\,dx \approx \sum_{r=1}^{n-1}\ln x_r\,\Delta x \tag{A.7}$$
where $x_r = r$ and $\Delta x = 1$. If we invert this line of reasoning, we may regard the integral on the left hand side of (A.7) as an approximation to the polygonal area represented by the sum on the right hand side, and thus also to ln n!. The error in this approximation is displayed visually in Figure 15.1 by the excess of the polygonal area over that under the graph of ln x. This error is large for small n, but is seen to decrease dramatically as n is made larger, because of the way that the function ln x flattens as $x \to \infty$. The larger the value of n, therefore, the smaller the fractional error made by replacing ln n! by the integral $\int_1^n\ln x\,dx$. So, integrating by parts, we get
$$\ln n! \approx \int_1^n \ln x\,dx = \left[x\ln x\right]_1^n - \int_1^n dx = n\ln n - (n-1)$$

Figure 15.1. Interpretation of $\sum_{r=1}^{n}\ln r$ as an area, and its approximation by the area $\int_1^n \ln x\,dx$.

For $n \gg 1$, we have $n - 1 \approx n$, and hence
$$\ln n! \approx n\ln n - n$$
as claimed.

15.3.3 Stirling's Formula

Stirling's formula is a better approximation to n!, valid over a larger range of n. The method by which it is proved is one that is used to prove several other results in statistical mechanics and is thus useful in its own right.
The factorial function is usefully represented as an integral by means of result (A.1) of Section 15.2 above,
$$n! = \int_0^{\infty} x^n e^{-x}\,dx \tag{A.8}$$
The proof of Stirling's formula begins by noting that, when n is large, the integrand in (A.8) is a product of a very rapidly increasing function, $x^n$, with an even more rapidly decreasing function, $e^{-x}$. The product $F(x) = x^n e^{-x}$ is therefore a very sharply peaked function of x which is nearly zero almost everywhere, except in a narrow interval where it rises rapidly to extremely large values, reaches a maximum, and then drops very rapidly to nearly zero again. In this respect, it behaves something like the Dirac $\delta$-function. The integral in (A.8) thus accumulates almost all of its value in a very narrow region of the domain of integration around the peak of F(x), with the rest of the domain contributing virtually nothing to the integral. We expect therefore to get a good approximation to it by considering F(x) only in the region of its peak.
The peak of F(x) occurs when
$$0 = \frac{dF}{dx} = nx^{n-1}e^{-x} - x^n e^{-x} = (n-x)\,x^{n-1}e^{-x}$$
So x = 0 or x = n. Since F(x) = 0 when x = 0, there is only one peak, at x = n.
Interestingly, our first approximation to n! is just the maximum value of F(x), since at the maximum, $F(n) = n^n e^{-n}$. This confirms that the only significant contribution to the integral (A.8) comes from the immediate vicinity of the peak of F(x). In fact, since
$$n! \approx F(n) = F(n)\cdot 1 \approx \int_{n-1/2}^{n+1/2} x^n e^{-x}\,dx,$$
we see that when $n \gg 1$, the principal contribution to (A.8) comes from a region of width $\Delta x \approx 1$.


To obtain a better approximation than $n! \approx F(n)$, we expand F(x) in the vicinity of x = n as a power series and use it to evaluate the integral in (A.8). Actually, it is not convenient to expand F(x) directly. F(x) changes so rapidly near the peak that it is difficult to obtain for it a power series that is valid over an appreciable range of x. It is better to expand ln F(x), which varies much more slowly and is thus easier to expand. Put $x = n + \xi$. Then
$$\ln F(x) = n\ln x - x = n\ln(n+\xi) - (n+\xi) \tag{A.9}$$
But for $|x| \le 1$ we have
$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots = \sum_{r=1}^{\infty}(-1)^{r-1}\frac{x^r}{r}$$
so, correct to second order, we get
$$\ln(n+\xi) = \ln n\left(1+\frac{\xi}{n}\right) = \ln n + \ln\left(1+\frac{\xi}{n}\right) \approx \ln n + \frac{\xi}{n} - \frac{1}{2}\frac{\xi^2}{n^2} \tag{A.10}$$
Substituting this in (A.9) gives, correct to second order,
$$\ln F(n+\xi) \approx n\ln n - n - \frac{1}{2}\frac{\xi^2}{n}$$
or
$$F(n+\xi) \approx n^n e^{-n}\,e^{-\xi^2/2n} \tag{A.11}$$

Integral (A.8) then becomes
$$n! \approx n^n e^{-n}\int_{-n}^{\infty} e^{-\xi^2/2n}\,d\xi \tag{A.12}$$
The integrand in this last integral is Gaussian. It has a maximum at $\xi = 0$ and width of order $\sqrt{2n}$. When n is large, $\sqrt{n} \ll n$. So, outside of the region $|\xi| \lesssim \sqrt{2n}$, the Gaussian is small and contributes very little to the integral. In particular, for $\xi \le -n$, the Gaussian is negligibly small. We may therefore extend the lower limit of integration from $-n$ to $-\infty$ without altering the value of the integral in any significant way. This gives
$$n! \approx n^n e^{-n}\int_{-\infty}^{\infty} e^{-\xi^2/2n}\,d\xi = n^n e^{-n}\sqrt{2n}\int_{-\infty}^{\infty} e^{-\eta^2}\,d\eta = n^n e^{-n}\sqrt{2\pi n}$$
where we have put $\eta = \xi/\sqrt{2n}$, and have used the fact that $\int_0^{\infty} e^{-x^2}dx = \sqrt{\pi}/2$. Thus,
$$n! = \sqrt{2\pi n}\; n^n e^{-n} \tag{A.13}$$
which is Stirling's formula.

15.3.4 Infinite Series for n!

We obtained Stirling's formula by approximating $\ln F(n+\xi)$ to second order. If we continue the series, we obtain an infinite series for n!. Thus, (A.10) becomes
$$\ln(n+\xi) = \ln n + \ln\left(1+\frac{\xi}{n}\right) = \ln n + \frac{\xi}{n} - \frac{1}{2}\frac{\xi^2}{n^2} + \sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^r}$$
so that
$$\ln F = n\ln n - n - \frac{1}{2}\frac{\xi^2}{n} + \sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^{r-1}}$$


In place of integral (A.12) we then obtain
$$n! = n^n e^{-n}\int_{-n}^{\infty} e^{-\xi^2/2n}\exp\left[\sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^{r-1}}\right]d\xi$$
The dominant term in the integrand is the Gaussian factor. It makes the integrand negligibly small when $|\xi| > n^{1/2}$. In particular, the integrand is utterly insignificant for all $\xi < -n$, and so we may extend the lower limit of the integral to $-\infty$, to get
$$n! \approx n^n e^{-n}\int_{-\infty}^{\infty} e^{-\xi^2/2n}\exp\left[\sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^{r-1}}\right]d\xi \tag{A.14}$$
Since the integrand is negligibly small when $|\xi| > n^{1/2}$, the second factor contributes to the value of the integral only when $\xi \lesssim n^{1/2}$, so we may expand it by a Taylor series to give
$$\begin{aligned}
\exp\left[\sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^{r-1}}\right]
&= 1 + \sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^{r-1}} + \frac{1}{2!}\left[\sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^{r-1}}\right]^2 + \cdots\\
&= 1 + \sum_{r=3}^{\infty}(-1)^{r-1}\frac{1}{r}\frac{\xi^r}{n^{r-1}} + \frac{1}{2!}\sum_{r=3}^{\infty}\sum_{s=3}^{\infty}(-1)^{r+s-2}\frac{1}{rs}\frac{\xi^{r+s}}{n^{r+s-2}} + \cdots\\
&= 1 + \left[\frac{\xi^3}{3n^2} - \frac{\xi^4}{4n^3} + \frac{\xi^5}{5n^4} - \frac{\xi^6}{6n^5} + \cdots\right] + \frac{1}{2!}\left[\frac{\xi^6}{9n^4} - \frac{2\,\xi^7}{12\,n^5} + \cdots\right] + \cdots
\end{aligned}$$
Thus (A.14) becomes
$$n! \approx n^n e^{-n}\int_{-\infty}^{\infty} e^{-\xi^2/2n}\left[1 + \frac{1}{n^2}\frac{\xi^3}{3} - \frac{1}{n^3}\frac{\xi^4}{4} + \frac{1}{n^4}\left(\frac{\xi^5}{5} + \frac{\xi^6}{18}\right) - \frac{1}{n^5}\left(\frac{\xi^6}{6} + \frac{\xi^7}{12}\right) + \cdots\right]d\xi$$
This series can be continued to any desired degree. The resulting integrals are all standard, and their evaluation is straightforward. Correct to four terms, this series gives
$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n\left[1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51840n^3}\right] \tag{A.15}$$
which is seen to converge very rapidly. Even when n is as small as 10, the first two terms of this series give answers that are accurate to better than 1 percent, while if 4 terms of the series are used, the answers are accurate to better than 0.001 percent.
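A quick numerical comparison (not part of the original notes) makes these accuracy claims concrete for n = 10:

```python
import math

n = 10
exact = math.factorial(n)                                  # 3628800
crude = (n / math.e)**n                                    # eq. (A.3)
stirling = math.sqrt(2 * math.pi * n) * crude              # eq. (A.13)
series = stirling * (1 + 1/(12*n) + 1/(288*n**2)
                     - 139/(51840*n**3))                   # eq. (A.15)

print(crude / exact)      # ~0.125     : (A.3) is poor for small n
print(stirling / exact)   # ~0.9917    : within 1 percent
print(series / exact)     # ~1.0000000 : better than 0.001 percent
```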

15.4 The Gamma Function

15.4.1 Definition

The integral
$$\int_0^{\infty} x^n e^{-x}\,dx$$
which we used to represent n! in the proof of Stirling's formula is defined not only for non-negative integer values of n, but for all real numbers $n > -1$. It thus defines a real valued function on the interval $(-1, \infty)$.

This function is called the gamma function, or the $\Gamma$-function, and occurs frequently in mathematical physics. It is often encountered in statistical mechanics. In this appendix, we define it and study some of its properties.
The gamma function is defined for all real $\nu > -1$ by the relation
$$\Gamma(\nu+1) = \int_0^{\infty} x^{\nu} e^{-x}\,dx$$
Equivalently, it is defined for all $\alpha = \nu + 1 > 0$ by
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1}e^{-x}\,dx \tag{A.16}$$
It can be shown (Brand, 1955, pp 436-437) that the integral on the right hand side of (A.16) is absolutely convergent for all real values $\alpha > 0$. $\Gamma(\alpha)$ is thus well defined by (A.16) for all $\alpha > 0$.

15.4.2 Recurrence Relation

The recurrence relation proved for $I_n = \Gamma(n+1)$ in Appendix 15.2 for non-negative integer values of n is valid for the gamma function for all real values $\alpha > 0$. We state this result as a theorem:
Theorem: For all real values $\alpha > 0$,
$$\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha) \tag{A.17}$$
Proof: The proof is by partial integration,
$$\Gamma(\alpha+1) = \int_0^{\infty} x^{\alpha}e^{-x}\,dx = -\int_0^{\infty} x^{\alpha}\,d\!\left(e^{-x}\right) = \left[-x^{\alpha}e^{-x}\right]_0^{\infty} + \int_0^{\infty} e^{-x}\,d(x^{\alpha}) = 0 + \alpha\int_0^{\infty} e^{-x}x^{\alpha-1}\,dx = \alpha\,\Gamma(\alpha)$$
The restriction $\alpha > 0$ is necessary because the boundary term $\left[-x^{\alpha}e^{-x}\right]_0^{\infty}$ is undefined at x = 0 if $\alpha \le 0$, as is also the integral obtained by integration by parts.
This recurrence relation is a functional equation for $\Gamma$, and is its most basic property.
15.4.2.1 Extension of $\Gamma$ to Negative Values

Recurrence relation (A.17) can be used to extend the definition of $\Gamma$ to negative non-integral values of $\alpha$ as follows. Demand, wherever possible, that relation (A.17) holds. Let $\alpha < 0$ be given. Then by adding some positive integer to it, we can bring it into the range [0, 1]. Denote the integer needed to do this by n. Then $0 \le \alpha + n \le 1$. Ignoring for the moment the fact that $\Gamma(\alpha)$ is undefined at $\alpha = 0$, we attempt to preserve the characteristic property of the $\Gamma$-function given by (A.17). We thus want
$$\begin{aligned}
\Gamma(\alpha+n) &= (\alpha+n-1)\,\Gamma(\alpha+n-1)\\
&= (\alpha+n-1)(\alpha+n-2)\,\Gamma(\alpha+n-2)\\
&\;\;\vdots\\
&= (\alpha+n-1)(\alpha+n-2)\cdots(\alpha+1)(\alpha+0)\,\Gamma(\alpha)
\end{aligned}$$
Since everything in this equation is defined except $\Gamma(\alpha)$, we can invert it and use it to define $\Gamma(\alpha)$. We thus tentatively define, for $\alpha < 0$,
$$\Gamma(\alpha) = \frac{\Gamma(\alpha+n)}{\alpha(\alpha+1)\cdots(\alpha+n-1)} \tag{A.18}$$
where n is the smallest integer such that $\alpha + n \ge 0$.
This definition is good for all negative values of $\alpha$ except when $\alpha$ is a negative integer. For then the smallest integer value n to take us into the range [0, 1] lands us on the value $\alpha + n = 0$, so the above method fails because $\Gamma(\alpha+n)$ is undefined. One might try to remedy this by adding n + 1 to $\alpha$ to make $\alpha + n + 1 = 1$, since $\Gamma(1)$ is well defined. However, this is undesirable since we would obtain a discontinuous extension for the function $\Gamma$. It is better therefore to accept that $\Gamma$ must remain undefined for all negative integer values of $\alpha$.
We thus take (A.18) as the definition of $\Gamma$ for all values $\alpha < 0$, except when $\alpha$ is an integer, and agree to leave $\Gamma$ undefined for all integers $\alpha = 0, -1, -2, \ldots$. This is equivalent to demanding
$$\begin{aligned}
\Gamma(\alpha) &= \frac{\Gamma(\alpha+1)}{\alpha} &&\text{for } -1 < \alpha < 0\\[4pt]
\Gamma(\alpha) &= \frac{\Gamma(\alpha+2)}{\alpha(\alpha+1)} &&\text{for } -2 < \alpha < -1\\
&\;\;\vdots\\
\Gamma(\alpha) &= \frac{\Gamma(\alpha+n)}{\alpha(\alpha+1)\cdots(\alpha+n-1)} &&\text{for } -n < \alpha < -n+1
\end{aligned}$$
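As a check (not part of the original notes), the extension (A.18) can be implemented directly and compared against a library gamma function, which already incorporates this extension:

```python
import math

def gamma_negative(alpha):
    """Gamma(alpha) for negative non-integer alpha, via eq. (A.18)."""
    assert alpha < 0 and alpha != int(alpha)
    n = math.ceil(-alpha)                 # smallest n with alpha + n >= 0
    denom = 1.0
    for k in range(n):
        denom *= (alpha + k)              # alpha(alpha+1)...(alpha+n-1)
    return math.gamma(alpha + n) / denom

print(gamma_negative(-2.5), math.gamma(-2.5))   # both approximately -0.9453
```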

The graph of the $\Gamma$-function, extended to negative values, is displayed in Figure 15.2 (figure from Brand, 1955, p 438).

Figure 15.2. Graph of the $\Gamma$-function.

15.4.3 $\Gamma$ and the Factorial Function

If $\alpha = n$ is a positive integer, $\Gamma(n)$ is defined by the integral (A.16) and is given by
$$\Gamma(n) = \int_0^{\infty} x^{n-1}e^{-x}\,dx = I_{n-1}$$
Here $I_n$ is the integral evaluated in Appendix 15.2. Thus $\Gamma(n) = (n-1)!$, or
$$\Gamma(n+1) = n! \tag{A.19}$$
This result allows us to regard the $\Gamma$-function as an extension of the factorial function to all values for which $\Gamma$ is defined, and factorial notation is often used in place of the $\Gamma$-function notation.


An example of the use of the $\Gamma$-function to give meaning to non-integer factorials is in the definition of (n/2)!, where n > 0 is an integer. Using the recursion relation for $\Gamma$, we get for any odd integer n > 0,
$$\Gamma\left(\frac{n}{2}+1\right) = \frac{n}{2}\left(\frac{n}{2}-1\right)\left(\frac{n}{2}-2\right)\cdots\frac{5}{2}\cdot\frac{3}{2}\cdot\frac{1}{2}\,\Gamma\!\left(\frac{1}{2}\right)$$
Since the $\Gamma$-function for $\alpha < 0$ is defined in terms of its values in the interval (0, 1), it is pointless extending this recursion any further. We have thus reduced the calculation of $\Gamma(n/2+1)$ to what looks like a factorial product of half integers. Had we evaluated $\Gamma(\nu+1)$ for $\nu = n$, where n > 0 is an integer, we would have obtained
$$\Gamma(n+1) = n(n-1)(n-2)\cdots 3\cdot 2\cdot 1\cdot\Gamma(1)$$
The half-integral case differs from this only in that the recursion ends with $\Gamma(1/2)$ instead of $\Gamma(1) = 1$. We therefore define the symbol (n/2)! for any integer n > 0 by
$$\left(\frac{n}{2}\right)! = \Gamma\left(\frac{n}{2}+1\right) \tag{A.20}$$
Thus, by definition,
$$\left(\frac{n}{2}\right)! = \frac{n}{2}\left(\frac{n}{2}-1\right)\left(\frac{n}{2}-2\right)\cdots\frac{5}{2}\cdot\frac{3}{2}\cdot\frac{1}{2}\,\Gamma\!\left(\frac{1}{2}\right)$$
All we need do to complete this definition is evaluate $\Gamma(1/2)$. From the definition of the $\Gamma$-function,
$$\Gamma\left(\frac{1}{2}\right) = \int_0^{\infty} e^{-x}x^{-1/2}\,dx$$
Change the variable of integration to $u = x^{1/2}$, so that $du = \tfrac12 x^{-1/2}dx$. Then
$$\Gamma\left(\frac{1}{2}\right) = 2\int_0^{\infty} e^{-u^2}\,du = \int_{-\infty}^{\infty} e^{-u^2}\,du$$
This last integral was evaluated in Appendix 15.1. We thus have
$$\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi} \tag{A.21}$$

Because of definition (A.20) for n > 0, many authors denote $\Gamma(1/2)$ by the symbol $(-1/2)!$. Thus, result (A.21) is often written as
$$\left(-\frac{1}{2}\right)! = \sqrt{\pi}$$
and hence
$$\left(\frac{n}{2}\right)! = \frac{n}{2}\left(\frac{n}{2}-1\right)\left(\frac{n}{2}-2\right)\cdots\frac{5}{2}\cdot\frac{3}{2}\cdot\frac{1}{2}\sqrt{\pi}$$
You will probably suffer an apoplectic seizure when encountering these results for the first time. While convalescing, however, you will probably have plenty of time for reflection, in which case you will undoubtedly eventually realise that the result $(-1/2)! = \sqrt{\pi}$ is no more shocking than the result that 0! = 1. It is merely a matter of definition of symbols, based on perfectly reasonable properties of the $\Gamma$-function.
Incidentally, in defining the factorial function for positive half-integer values, we have proved also the following important result, which we will need to use in this course: for all integers $n \ge 1$,
$$\int_0^{\infty} e^{-x}x^{n/2}\,dx = \Gamma\left(\frac{n}{2}+1\right) = \left(\frac{n}{2}\right)!$$

References

Brand, L., 1955, Advanced Calculus, An Introduction to Classical Analysis, John Wiley and Sons, New York.
Sears, F.W., and Salinger, G.L., 1975, Thermodynamics, Kinetic Theory, and Statistical Thermodynamics, Third Edition, Addison-Wesley Publishing Company, Reading, Massachusetts.


Appendix B
Volume of a Sphere in $\mathbb{R}^n$

By a sphere in $\mathbb{R}^n$ we mean the surface in $\mathbb{R}^n$ defined by the equation
$$(x^1)^2 + (x^2)^2 + \cdots + (x^n)^2 = R^2 \tag{B.1}$$

where R is a given constant, called its radius. Denote the set of points on and within the sphere by $\Sigma$. Thus,
$$\Sigma = \left\{x \in \mathbb{R}^n : (x^1)^2 + (x^2)^2 + \cdots + (x^n)^2 \le R^2\right\} \tag{B.2}$$
The volume of the sphere is defined to be the value of the integral
$$\mathcal{V}_n(R) = \int_{\Sigma} d\tau \tag{B.3}$$
where $d\tau$ is the volume element for $\mathbb{R}^n$, and is given by
$$d\tau = d^n x = dx^1\,dx^2\cdots dx^n$$
The surface area $A_n(R)$ of a sphere is defined to be the volume of a thin spherical shell of radius R and thickness $\Delta R$, divided by $\Delta R$, in the limit as $\Delta R \to 0$. Now, the volume of a spherical shell bounded by radii R and $R + \Delta R$ is given by $\mathcal{V}_n(R+\Delta R) - \mathcal{V}_n(R)$, so
$$A_n(R) = \lim_{\Delta R\to 0}\frac{\mathcal{V}_n(R+\Delta R) - \mathcal{V}_n(R)}{\Delta R} = \frac{d\mathcal{V}_n}{dR}(R) \tag{B.4}$$

There are several methods available for calculating $\mathcal{V}_n(R)$. A simple one is as follows. First, note from the definition of volume that $\mathcal{V}_n$ must have dimensions of $(\text{length})^n$, and so must be proportional to the product of n lengths that characterise the sphere. But a sphere is characterised by only one length, its radius R. So its volume must be given by a formula of the form
$$\mathcal{V}_n(R) = c_n R^n \tag{B.5}$$
where $c_n$ is a constant to be determined. Further, from definition (B.4), its surface area must be
$$A_n(R) = n\,c_n R^{n-1} \tag{B.6}$$

All we now need to do is determine the unknown constant $c_n$. This may be achieved by the following cunning method.
We evaluated the integral
$$\int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi} \tag{B.7}$$
by squaring it, and then using the fact that $x^2 + y^2 = r^2$ is the radius squared of a circle in $\mathbb{R}^2$. If instead of taking the second power we take the nth, we get a formula which contains the radius of a sphere in $\mathbb{R}^n$. Thus,
$$\pi^{n/2} = \left(\int_{-\infty}^{\infty} e^{-x^2}dx\right)^n = \int e^{-(x^1)^2}dx^1\int e^{-(x^2)^2}dx^2\cdots\int e^{-(x^n)^2}dx^n = \int_{\mathbb{R}^n} e^{-\left[(x^1)^2+(x^2)^2+\cdots+(x^n)^2\right]}\,d\tau \tag{B.8}$$


where $d\tau = d^n x = dx^1 dx^2\cdots dx^n$ is the volume element of $\mathbb{R}^n$. Putting $R^2 = (x^1)^2 + (x^2)^2 + \cdots + (x^n)^2$, this gives
$$\pi^{n/2} = \int_{\mathbb{R}^n} e^{-R^2}\,d\tau \tag{B.9}$$
Now convert the n-fold integral over $\mathbb{R}^n$ into a single integral by using (B.4) and (B.6), according to which
$$d\mathcal{V}_n(R) = \frac{d\mathcal{V}_n}{dR}(R)\,dR = A_n(R)\,dR = n\,c_n R^{n-1}\,dR$$
The integral over $\mathbb{R}^n$ may in this way be replaced by an integral over all infinitesimal spherical shells from radius R = 0 to $R = \infty$. So, (B.8) becomes
$$\pi^{n/2} = \int_{R=0}^{\infty} e^{-R^2}\,n\,c_n R^{n-1}\,dR = n\,c_n\int_0^{\infty} e^{-R^2}R^{n-1}\,dR \tag{B.10}$$

which provides an equation for $c_n$. To solve for $c_n$, we must first evaluate the integral on the right hand side of equation (B.10). This is done as follows. First, change the variable of integration from R to $x = R^2$. So $dx = 2R\,dR$, giving $R^{n-1}dR = \tfrac12 x^{(n-2)/2}dx$, and hence
$$\int_{R=0}^{\infty} e^{-R^2}R^{n-1}\,dR = \frac12\int_{x=0}^{\infty} e^{-x}x^{(n/2)-1}\,dx = \frac12\,\Gamma\!\left(\frac{n}{2}\right)$$
So, (B.10) becomes
$$\pi^{n/2} = c_n\,\frac{n}{2}\,\Gamma\!\left(\frac{n}{2}\right) = c_n\,\Gamma\!\left(\frac{n}{2}+1\right) = c_n\left(\frac{n}{2}\right)!$$
or,
$$c_n = \frac{\pi^{n/2}}{\Gamma\!\left(\frac{n}{2}+1\right)} = \frac{\pi^{n/2}}{\left(\frac{n}{2}\right)!}$$
The formulae for the volume and surface area of the sphere in $\mathbb{R}^n$ are therefore

$$\mathcal{V}_n(R) = \frac{\pi^{n/2}}{\Gamma\!\left(\frac{n}{2}+1\right)}\,R^n = \frac{\pi^{n/2}}{\left(\frac{n}{2}\right)!}\,R^n$$
and
$$A_n(R) = \frac{n\,\pi^{n/2}}{\Gamma\!\left(\frac{n}{2}+1\right)}\,R^{n-1} = \frac{2\,\pi^{n/2}}{\Gamma\!\left(\frac{n}{2}\right)}\,R^{n-1} = \frac{2\,\pi^{n/2}}{\left(\frac{n}{2}-1\right)!}\,R^{n-1} \tag{B.11}$$
For n = 3, these formulae give $\mathcal{V}_3(R) = 4\pi R^3/3$ and $A_3(R) = 4\pi R^2$, which are the volume and surface area of an ordinary sphere, as expected. For n = 2, they give $\mathcal{V}_2(R) = \pi R^2$ and $A_2(R) = 2\pi R$, showing that area is the hypervolume for a 2-dimensional space, and length is its hyperarea. For n = 1, they give $\mathcal{V}_1(R) = 2R$ and $A_1(R) = 2$. So length is the hypervolume of a 1-dimensional space, while its hyperarea simply counts the endpoints of the hypervolume.
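These formulae are easy to evaluate with a library gamma function. The short sketch below (not part of the original notes) reproduces the familiar low-dimensional cases:

```python
import math

def volume(n, R=1.0):
    """V_n(R) of eq. (B.11), using Gamma(n/2 + 1) = (n/2)!."""
    return math.pi**(n / 2) * R**n / math.gamma(n / 2 + 1)

def area(n, R=1.0):
    """A_n(R) = n * c_n * R^(n-1), eq. (B.6)."""
    return n * volume(n, R) / R

for n in (1, 2, 3):
    print(n, volume(n), area(n))
# n=1: 2.0 and 2.0;  n=2: pi and 2*pi;  n=3: 4*pi/3 and 4*pi
```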


Appendix C
Evaluation of $\int_0^{\infty} x^3(e^x - 1)^{-1}dx$

(Written by F F Frescura.)
There are several ways to show the result that
$$I = \int_0^{\infty}\frac{x^3}{(e^x - 1)}\,dx = \frac{\pi^4}{15} \tag{C.1}$$

Unfortunately, none of them is obvious. Each requires a devious mind. Here is one.
First, return to the original infinite summation from which we obtained the factor $1/(e^x - 1)$. Thus write
$$\frac{1}{(e^x - 1)} = \frac{e^{-x}}{(1 - e^{-x})} = \sum_{n=1}^{\infty} e^{-nx} \tag{C.2}$$
Substituting this into integral (C.1) then gives
$$I = \int_0^{\infty} x^3\sum_{n=1}^{\infty} e^{-nx}\,dx \tag{C.3}$$
Now interchange sum and integral and make the substitution y = nx to get
$$I = \sum_{n=1}^{\infty}\int_0^{\infty} x^3 e^{-nx}\,dx = \sum_{n=1}^{\infty}\frac{1}{n^4}\int_0^{\infty} y^3 e^{-y}\,dy \tag{C.4}$$
The integral evaluates easily by parts and yields, after several steps,
$$\int_0^{\infty} y^3 e^{-y}\,dy = 6 \tag{C.5}$$

so that
$$I = 6\sum_{n=1}^{\infty}\frac{1}{n^4} \tag{C.6}$$
The problem now is to evaluate the infinite sum in (C.6). If you wish to look no further, this sum can easily be evaluated numerically. Successive terms decrease very rapidly, and you can calculate its numerical value to any desired degree of accuracy. To three decimal places, you will get
$$\sum_{n=1}^{\infty}\frac{1}{n^4} = 1.082\ldots \tag{C.7}$$
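A direct numerical check (not part of the original notes) confirms (C.7) and, in anticipation of the closed form derived below, the value $\pi^4/90$:

```python
import math

s = sum(1.0 / n**4 for n in range(1, 100001))   # partial sum of (C.6)
print(s, math.pi**4 / 90)        # both 1.0823232...
print(6 * s, math.pi**4 / 15)    # the integral I of (C.1)
```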

It is more satisfying in theoretical work, however, to find an answer in closed form. To do this, it is useful to recall that sums like this one occur often in Fourier series, so it is a good idea to go looking in a book that lists them. It should not take you long to discover that the answer you want can be obtained by slightly devious means from the Fourier series for a suitably scaled square wave. The method for doing so, and the necessary deviousness, is as follows. Consider the Fourier series for the periodic function with period $2\pi$ defined by the prescription
$$f(x) = \begin{cases} +\pi/4 & 0 < x < \pi\\ -\pi/4 & -\pi < x < 0 \end{cases} \tag{C.8}$$
The $\pi/4$ amplitude is a scaling factor we have applied to get a Fourier series more closely resembling our


series. Since f(x) is odd, its Fourier series has the form
$$f(x) = \sum_{n=1}^{\infty} a_n\sin nx \tag{C.9}$$
and the coefficients evaluate easily to
$$a_n = \begin{cases} 0 & n = 2, 4, 6, \ldots\\ 1/n & n = 1, 3, 5, \ldots \end{cases} \tag{C.10}$$
so that f(x) in the range $0 < x < \pi$ yields
$$\frac{\pi}{4} = \sum_{n=1,3,5,\ldots}\frac{\sin nx}{n} \tag{C.11}$$

We can get higher inverse powers of n into the infinite sum by integrating both sides from 0 to x. To get an inverse fourth power, we need three successive integrations, which give respectively
$$\frac{\pi}{4}x = \sum_{n=1,3,5,\ldots}\frac{1}{n^2}(1 - \cos nx) \tag{C.12}$$
$$\frac{\pi}{8}x^2 = \sum_{n=1,3,5,\ldots}\left[\frac{1}{n^2}x - \frac{1}{n^3}\sin nx\right] \tag{C.13}$$
$$\frac{\pi}{24}x^3 = \sum_{n=1,3,5,\ldots}\left[\frac{1}{2n^2}x^2 - \frac{1}{n^4}(1 - \cos nx)\right] \tag{C.14}$$
We can isolate the sum $\sum' 1/n^4$ from the last term of (C.14) by choosing $x = \pi/2$, for which $\cos(n\pi/2) = 0$ for all odd n, to get
$$\frac{\pi^4}{192} = \sum_{n=1,3,5,\ldots}\left[\frac{\pi^2}{8}\frac{1}{n^2} - \frac{1}{n^4}\right] \tag{C.15}$$

or
$$\sum_{n=1,3,5,\ldots}\frac{1}{n^4} = \frac{\pi^2}{8}\sum_{n=1,3,5,\ldots}\frac{1}{n^2} - \frac{\pi^4}{192} \tag{C.16}$$
We may now evaluate $\sum' 1/n^2$ by putting $x = \pi/2$ in (C.12) to get
$$\sum_{n=1,3,5,\ldots}\frac{1}{n^2} = \frac{\pi^2}{8} \tag{C.17}$$

so that
$$\sum_{n=1,3,5,\ldots}\frac{1}{n^4} = \frac{\pi^2}{8}\cdot\frac{\pi^2}{8} - \frac{\pi^4}{192} = \frac{\pi^4}{96} \tag{C.18}$$

"Aaah," you say, "but you have not evaluated the sum that we want, which is of $1/n^4$ over all integers, and not over the odd ones only." "True," I reply, "but watch this for a last bit of impressive cunning and deviousness!" At which point I note that
$$\sum_{n=1}^{\infty}\frac{1}{n^4} = \sum_{n=1,3,5,\ldots}\frac{1}{n^4} + \sum_{n=2,4,6,\ldots}\frac{1}{n^4} \tag{C.19}$$
and then I note furthermore that
$$\sum_{n=2,4,6,\ldots}\frac{1}{n^4} = \sum_{r=1}^{\infty}\frac{1}{(2r)^4} = \frac{1}{16}\sum_{r=1}^{\infty}\frac{1}{r^4} \tag{C.20}$$

so that
$$\sum_{n=1}^{\infty}\frac{1}{n^4} = \sum_{n=1,3,5,\ldots}\frac{1}{n^4} + \frac{1}{16}\sum_{n=1}^{\infty}\frac{1}{n^4} \tag{C.21}$$
and hence
$$\sum_{n=1}^{\infty}\frac{1}{n^4} = \frac{16}{15}\sum_{n=1,3,5,\ldots}\frac{1}{n^4} = \frac{16}{15}\cdot\frac{\pi^4}{96} = \frac{\pi^4}{90} \tag{C.22}$$

I then retreat and watch you squirm in wonder and amazement at this coup of intellect and mathematical power, and leave you of limited prowess to show that the final answer we need is
$$I = 6\sum_{n=1}^{\infty}\frac{1}{n^4} = \frac{\pi^4}{15} \tag{C.23}$$


Appendix D
Fermi-Dirac Functions

By definition the Fermi-Dirac functions are
$$f_n(z) = \frac{1}{\Gamma(n)}\int_0^{\infty}\frac{x^{n-1}}{z^{-1}e^x + 1}\,dx, \tag{D.1}$$
where $\Gamma(n)$ is the gamma function. For $-1 < y < 1$ we have


$$(1+y)^{-1} = 1 - y + y^2 - \cdots = \sum_{r=0}^{\infty}(-1)^r y^r \tag{D.2}$$
so for $0 \le z \le 1$ we get
$$\left(z^{-1}e^x + 1\right)^{-1} = \left(z^{-1}e^x\right)^{-1}\left(1 + ze^{-x}\right)^{-1} = ze^{-x}\sum_{r=0}^{\infty}(-1)^r\left(ze^{-x}\right)^r = \sum_{r=0}^{\infty}(-1)^r\left(ze^{-x}\right)^{r+1} \tag{D.3}$$
where we have used the fact that, in the integration, $0 \le x < \infty$, so $1 \ge e^{-x} > 0$, and $z \le 1$, so $ze^{-x} \le 1$ for the entire range of integration. This gives

$$\int_0^{\infty}\frac{x^{n-1}}{z^{-1}e^x + 1}\,dx = \sum_{r=0}^{\infty}(-1)^r z^{r+1}\int_0^{\infty} x^{n-1}e^{-(r+1)x}\,dx \tag{D.4}$$

The integral on the right hand side is now nearly in the form of a $\Gamma$-function, which is defined for all real $n > -1$ by
$$\Gamma(n+1) = \int_0^{\infty} e^{-x}x^n\,dx$$
so, putting $\eta = (r+1)x$ in (D.4), we get for that integral
$$\int_0^{\infty} x^{n-1}e^{-(r+1)x}\,dx = \frac{1}{(r+1)^n}\int_0^{\infty}\eta^{n-1}e^{-\eta}\,d\eta = \frac{\Gamma(n)}{(r+1)^n} \tag{D.5}$$
so that, for $z \le 1$, we have
$$f_n(z) = \sum_{r=0}^{\infty}(-1)^r\frac{z^{r+1}}{(r+1)^n} = z - \frac{z^2}{2^n} + \frac{z^3}{3^n} - \frac{z^4}{4^n} + \cdots \tag{D.6}$$
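The series (D.6) is straightforward to evaluate numerically. The sketch below (not part of the original notes; the truncation lengths are arbitrary choices) checks it at z = 1, where $f_1(1) = \ln 2$ and $f_2(1) = \pi^2/12$:

```python
import math

def f(n, z, terms=200):
    """Partial sum of the series (D.6) for f_n(z), valid for 0 < z <= 1."""
    return sum((-1)**r * z**(r + 1) / (r + 1)**n for r in range(terms))

print(f(1, 1.0, 10**5), math.log(2))       # alternating series: slow at z = 1
print(f(2, 1.0, 10**5), math.pi**2 / 12)
```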
