Cosmology I & II

Fall 2014
Syksy Räsänen
1 INTRODUCTION 1

Preface
These are lecture notes for the courses Cosmology I and II at the University of
Helsinki. They are closely based on the notes prepared by Hannu Kurki-Suonio,
who has lectured this course in the past. (Credit for some figures goes to Elina
Keihänen, Jussi Väliviita and Reijo Keskitalo.)
A difficulty in teaching cosmology is that some central aspects rely on rather
advanced physics, such as quantum field theory in curved spacetime. Nevertheless,
the main applications of these advanced concepts can be discussed in relatively
simple terms, so requiring students to first learn general relativity, quantum field
theory and so on in detail is not necessary. Thus only the standard Bachelor level
theoretical physics background (mechanics, special relativity, quantum mechanics,
statistical physics) is assumed. The more advanced theories that cosmology relies
on are reviewed as part of these courses to the required extent. This means that
some results have to be accepted without proper derivation, especially in the later
parts of the courses.
The course is divided into two parts. In Cosmology I, the universe is discussed in
terms of the homogeneous and isotropic approximation (the Friedmann-Robertson-
Walker model) and statistical physics. In Cosmology II, deviations from homogeneity
and isotropy are discussed, from inflation as a theory of initial perturbations to
structure formation and the cosmic microwave background.

1 Introduction
1.1 Overview of modern cosmology
Cosmology is the study of the universe as a whole, its structure, origin and evolution.
Cosmology is grounded on observations, many of them astronomical, and laws of
physics measured on Earth. These lead naturally to the standard framework of
modern cosmology, the Hot Big Bang theory.
As a science, cosmology has a rare, if not singular, restriction: we can observe
only one universe. We cannot do experiments in cosmology, and observations are
restricted to a single object: the Universe. We cannot observationally make compar-
ative or statistical studies among many universes (to the extent such a concept even
makes sense). Also, we cannot move around our universe, but are restricted to (on
cosmological scales) a single point in space and time. As a result, cosmology relies
more on model-dependent interpretations than many other branches of physics.
Nevertheless, the last few decades have seen remarkable progress in cosmology,
as a significant body of observational data has become available with modern
astronomical instruments. We now have a good observational handle on the overall
history of the universe for all times between one second and the present time (the
universe is today about 14 billion years old). Theoretically, we understand the
evolution of the universe at times between $10^{-11}$ s and a few billion years. Important
questions remain, in particular about the nature of dark matter, dark energy and
the processes of inflation and baryogenesis. In the first part of the course, we will
consider dark energy and dark matter, and will look at inflation in the second part.
Baryogenesis will be mentioned only in passing.
One historically important observation supporting the big bang theory is the
redshift of distant galaxies. Their spectra are shifted towards longer wavelengths.
The further out they are, the larger is the shift. This implies that they are receding
from us; their distance from us is increasing. According to general relativity, this is
understood as the expansion of space itself (which is an aspect of the curvature of
spacetime), not as motion of galaxies in space. As space expands, the wavelength of
the light travelling through space is stretched.
The expansion appears to be uniform over large scales. While there are deviations
of order unity in the expansion rate on small scales (for example our galaxy does not
expand), the average expansion rate on scales larger than galaxy clustering, 10 Mpc
or so, is almost the same everywhere. In homogeneous and isotropic cosmological
models, the expansion is simply described by a time-dependent scale factor a(t).
Starting from the observed present expansion rate, H ≡ ȧ/a, we can use general
relativity to calculate a(t) as a function of time, given our understanding of the
physics of the matter content in the universe. (We will discuss this in more detail
later.) The result is that a(t) → 0 and the density of the universe ρ → ∞ about 14
billion years ago. At this singularity, time and space begin, and we can choose it as
the origin of the time coordinate, t = 0. However, we do not expect general relativity,
which governs the evolution of spacetime, nor the Standard Model of particle physics,
which governs the behaviour of matter, to be applicable at extremely high energy
densities. At the so-called Planck density, $\rho_{\rm Pl} \sim 10^{97}$ kg/m$^3$, quantum gravitational
effects should be large. To describe the earliest times, the Planck era, we would need
a theory of quantum gravity, which we do not have.¹ At present we don’t know what
happened at $t = 0$ or in its immediate vicinity at the Planck time of $10^{-42}$ s. It
seems likely that a major breakthrough in our understanding of physics is required
before we can make any definite statements about that primordial era.
Thus, the big bang theory does not actually apply all the way to the beginning
of time and a “big bang”. Rather, it is a valid description of the history of the
universe starting from some early time when the universe was very hot,² very dense,
and expanding rapidly. The universe was at early times filled with an almost ho-
mogeneous “soup” of particles which was in thermal equilibrium for a long time.
We can therefore describe the state of the early universe with a small number of
thermodynamic variables, and this makes the evolution of the universe calculable.
The fact that the scale factor tends to zero at early times does not imply that
the universe would have started from a point. The part of the universe which we
can observe today was indeed very small at very early times, possibly smaller than
1 mm in diameter at the earliest times that can be sensibly discussed within the
big bang framework. And if the inflationary scenario is correct, it would have been
much smaller than that before inflation. However, the universe extends beyond
what can be observed today (beyond our horizon), and may even be infinite. In
the current cosmological models, if the universe is infinite, it has always been
infinite, except at the moment t = 0, when the size is not defined. (We do not know,
theoretically or observationally, whether the universe is finite or infinite.)

¹ String theory is a candidate for the theory of quantum gravity, and applying it to the early universe is an active area of research. However, we do not have a complete formulation of the theory, and it is not known whether string theory is in fact correct, so its applications are necessarily rather speculative.

² The realisation that the early universe must have had a high temperature did not come immediately after the discovery of the expansion. The results of big bang nucleosynthesis and the discovery of the cosmic microwave background (which we will discuss in some detail), however, provided convincing evidence that the early universe was hot.

As the universe expands, it is not expanding into some space “around” the
universe. The universe by definition consists of all space: as the universe expands,
the volume of space grows. (In the case of an infinite universe, it is meaningless
to talk about change of the total volume. However, we can say that the volume of
any sufficiently large finite portion of the universe grows in time. The disclaimer
“sufficiently large” is here because on small scales, space can collapse, as happens
during structure formation, to which we will come later.)
In order to describe behaviour at high temperatures and energies, we need to
treat matter in terms of quantum fields rather than classical particles. (In this
course, we can skip a lot of particle physics details.) The Standard Model of particle
physics is based on the symmetry group $SU(3)_C \otimes SU(2)_W \otimes U(1)_Y$. From the
viewpoint of the Standard Model, we live today in a low-energy universe, where part
of the symmetry of the theory is broken. The “natural” energy scale of the theory
is reached when the temperature of the universe exceeds 100 GeV (about $10^{15}$ K),
which was the case when the universe was younger than $10^{-11}$ s. Then the primordial
soup of particles consisted of free massless fermions (quarks and leptons) and
massless gauge bosons mediating the interactions (colour and electroweak) between
these fermions. The Standard Model also includes a particle called the Higgs boson,
which was discovered in 2012.³
The Higgs field is responsible for the breaking of the electroweak symmetry. This
is one of the phase transitions in the early universe. In the electroweak (EW) phase
transition the electroweak interaction splits into two separate interactions:⁴ 1) the
weak interaction mediated by the massive gauge bosons $W^\pm$ and $Z^0$, and 2) the
electromagnetic interaction mediated by the massless gauge boson $\gamma$, the photon.
Fermions and W and Z bosons get their masses in the EW phase transition (with
the possible exception of the neutrinos, the origin of whose masses is not clear).
The mass is due to the interaction of the particle with the Higgs field. The EW
phase transition took place when the universe cooled below the critical temperature
$T_c \sim 100$ GeV of the phase transition at $t \sim 10^{-11}$ s.
Another phase transition, the QCD phase transition, or the quark-hadron transition,
took place at $t \sim 10^{-5}$ s. The critical temperature of the QCD phase transition
is $T_c \sim 150$ MeV. Quarks, which had been free until this time, formed hadrons:
baryons (which contain three quarks), e.g. the nucleons n and p, and mesons (which
contain a quark and an antiquark), e.g. π and K. The matter filling the universe was
converted from a quark-gluon plasma to a hadron gas. (All mesons and most baryons
are unstable with rather short lifetimes, so only protons and neutrons remain at late
times.)
The order of the electroweak phase transition depends on the particle physics
on those scales. In the Standard Model, the transition is very smooth. In fact,
the electroweak phase transition is a misnomer, because the process is a smooth
cross-over and not a phase transition at all. However, in extensions of the Standard
Model, there can be a first order phase transition (so the two phases have different
energy densities at the critical temperature), in which case it proceeds through the
formation of bubbles of the new phase. The phase transition can then have interesting
effects, and baryogenesis, the generation of the observed matter-antimatter
asymmetry in baryons, may occur in that process.

³ CERN announced the discovery of a new particle consistent with the Higgs boson on July 4, 2012. Many theoretical physicists felt confident already at the time that it is the Higgs particle, but it was officially labeled the Higgs only on March 14, 2013.

⁴ More accurately, the interaction before the symmetry breaking had the two parts $SU(2)_W \otimes U(1)_Y$, and after the symmetry breaking the broken weak interactions and the unbroken electromagnetic symmetry have parts of both.

For every type of particle there is a corresponding antiparticle, which has the
same properties (e.g. mass and spin) as the particle, except for charges (like the
electric charge or colour charge), which have the opposite sign. Particles that do not
have any charges, such as photons, are their own antiparticles. At high temperatures,
T ≫ m, where m is the mass of the particle, particles and antiparticles are constantly
created and annihilated in various reactions, and there is roughly the same number
of particles and antiparticles. But when T ≪ m, particles and antiparticles may still
annihilate each other (and decay, if they are unstable), but there is no more thermal
production of particle-antiparticle pairs. As the universe cools, heavy particles and
antiparticles therefore annihilate each other. These annihilation reactions produce
additional lighter particles and antiparticles. If the universe had originally an equal
number of particles and antiparticles, only photons and neutrinos (of the known
particles) would be left over today in any significant quantities. The presence of
matter today indicates that in the early universe there must have been slightly
more nucleons and electrons than antinucleons and positrons, and this excess was
left over. The lightest known charged massive particle is the electron, so the last
annihilation event was the electron-positron annihilation which took place when
$T \sim m_e \sim 0.5$ MeV and $t \sim 1$ s.⁵ After this the only remaining antiparticles were
the antineutrinos, and the primordial soup consisted of a large number of photons
(which are their own antiparticles) and neutrinos (and antineutrinos) and a smaller
number of “left-over” protons, neutrons, and electrons. (Dark matter is left out of
this story; in typical dark matter models, there are an equal number of dark matter
particles and antiparticles in the late universe. We will come back to this later.)
When the universe was a few minutes old, T ∼ 100 keV, protons and neutrons
formed nuclei of light elements. This process is known as Big Bang Nucleosynthesis
(BBN), and it produced about 75% (of the total mass in ordinary matter) $^1$H, 25%
$^4$He, $10^{-4}$ $^2$H, $10^{-4}$ $^3$He, and $10^{-9}$ $^7$Li. (Other elements were formed much later, in
stars.) At this time matter was completely ionized: all electrons were free. In this
plasma the mean free path of a photon was short and the universe was opaque.
The universe became transparent when it was about 400 000 years old. At a
temperature T ∼ 3000 K (∼ 0.25 eV), electrons and nuclei formed neutral atoms,
and the photon mean free path became longer than the radius of the observable
universe. This event is called recombination. (This name, taken from statistical
physics, is misleading, since this is actually the first time ever when electrons and
nuclei combine.) Since recombination the primordial photons have been travelling
through space almost without scattering. We can observe them today as the Cosmic
Microwave Background (CMB). The CMB is like a photograph of the universe at
400 000 years of age, modified by its passage to us through 13 billion years.
From the CMB we learn that the early universe was locally very homogeneous
and isotropic, unlike the present universe, where matter has accumulated into stars,
galaxies, clusters, filaments and superclusters. (The distribution of structures in
the late time universe is statistically homogeneous and isotropic, but there are large
local variations.) The density variations in baryonic matter and photons were of
the order $10^{-5}$,⁶ and we see these as small intensity variations of the CMB (the
CMB anisotropy). Due to gravity, slight overdensities in the matter have grown
in time, and eventually they formed galaxies. This process is called cosmological
structure formation. Galaxies are not evenly distributed in space but form various
structures, galaxy groups, clusters (large gravitationally bound groups), superclusters,
filaments, and walls, separated by large, relatively empty voids. Observations
of this present-day large scale structure of the universe form a significant body of
data, which our cosmological theories are able to explain in detail.

⁵ Neutrinos also have very small masses. However, at temperatures less than the neutrino mass, the neutrino interactions are so weak that the neutrinos and antineutrinos cannot annihilate each other. It is also possible that neutrinos are their own antiparticles, like photons.

There are two parts to structure formation:
1. The origin of the primordial density fluctuations, the “seeds of galaxies”. The
scenario which has been most successful in explaining the observed properties of the
primordial fluctuations is cosmic inflation, which probably occurred much before
the EW phase transition. We will discuss inflation in the second part of the course
and we will calculate how primordial perturbations are generated from quantum
fluctuations.
2. The growth of fluctuations in time. The growth is due to gravity, and it
depends on the composition and amount (average density) of matter, as well as the
way the universe expands.
One of the main problems in cosmology today is that most of the matter and
energy content of the universe appears to be in some unknown forms, called dark
matter and dark energy. The dark matter issue dates back to the 1920s, whereas the
dark energy problem arose in the 1990s.
From the motions of galaxies we can deduce that the matter we can directly
observe as stars, “luminous matter”, is just a small fraction of the total mass which
affects the galaxy motions through gravity. The rest is dark matter, something that
we observe only via its gravity. There are also many other lines of evidence for dark
matter, including the structure of the CMB anisotropies, the pattern of large-scale
structure, the motions of galaxies and gas in clusters, and gravitational lensing.⁷
We do not know what most of this dark matter is. A small part of it is just
ordinary, baryonic, matter, which consists of atoms just like stars, but does not
shine enough for us to notice it. Possibilities include planet-like bodies in interstellar
space, “failed” stars (too small, m < 0.07m⊙ , to ignite thermonuclear fusion) called
brown dwarfs, old white dwarf stars, and tenuous intergalactic gas. In large clusters
of galaxies the intergalactic gas⁸ is so hot that we can observe its radiation. Thus
its mass can be estimated and it turns out to be several times larger than the total
mass of the stars in the galaxies. We can infer that other parts of the universe,
where this gas would be too cold to be observable from here, also contain significant
amounts of thin gas, which is apparently the main component of baryonic dark
matter.

⁶ There were larger density variations in dark matter. The amplitude depends on the scale, but for relevant cosmological scales they were of the order $10^{-3}$; we will look at this in detail later.

⁷ Since all evidence for dark matter comes from its gravitational effect, it might in principle be possible to explain the observations without dark matter by modifying the laws of gravity instead. However, fitting all the different observations requires resorting to rather baroque models, whereas dark matter explains several observations via one simple hypothesis. Many cosmologists consider the existence of dark matter as an established fact.

⁸ This gas is ionized, so it should more properly be called plasma. Astronomers, however, often use the word “gas” also for ionized matter.

However, we can estimate from BBN and the CMB anisotropies the total amount
of baryonic matter, and there is not nearly enough of it to explain the whole dark
matter problem. Most of the dark matter is non-baryonic, meaning that it is not
made out of protons and neutrons.⁹ The only non-baryonic particles in the Standard
Model of particle physics that could act as dark matter are neutrinos. If they had a
suitable mass, ∼ 1 eV, neutrinos left from the early universe would have a sufficient
total mass to be a significant dark matter component. However, producing the
structures seen in the universe requires most of the dark matter to have different
properties than neutrinos have. Technically, most of the dark matter must be “cold”
or “warm”, instead of “hot”. These terms refer to the dynamics of the particles
making up the matter, and do not specify the details of the particles. The difference
between hot dark matter (HDM) and cold dark matter (CDM) is that HDM is made
of particles whose velocities were large when structure formation began, but CDM
particles had small velocities. Neutrinos with m ∼ 1 eV would be HDM. Dark
matter between the two alternatives is called warm dark matter (WDM), and it
is still observationally allowed. As the Standard Model does not contain particles
that could explain all of the dark matter, it appears that most of the matter in the
universe is made out of some unknown particles.
Usually, the term “dark matter” is used to refer only to non-baryonic dark matter,
and often neutrinos are also excluded, so that “dark matter” refers only to the
unknown, exotic part of the matter that is not observed via light.
Particle physicists have independently come to the conclusion that the Standard
Model is not the final word in particle physics. Many proposed extensions of the
Standard Model contain suitable dark matter particle candidates (e.g. neutralinos,
technibaryons, axions, right-handed neutrinos). Their interactions have to be rather
weak to explain why they have not been detected so far, which imposes constraints
on models of particle physics. Dark matter thus presents one area where the physics
of the very small and the very large overlap.
The situation of dark energy is different. While cosmologists consider dark mat-
ter quite natural, many of them find dark energy puzzling. The increasing breadth
and precision of cosmological observations has made it possible to determine the
distance scales and the expansion of the universe accurately. In the context of the
homogeneous and isotropic models based on general relativity, a form of matter
called dark energy is required to fit these observations. Unlike dark matter, which
is clustered, dark energy is relatively uniform in the observable universe. And while
dark matter has negligible pressure, dark energy has a large negative pressure. The
simplest possibility for dark energy is a cosmological constant or vacuum energy.
High energy physics predicts that the vacuum has an energy density, but it is
difficult to understand the small energy scale ∼ meV that is required to explain the
observations. Another possible explanation is modification of the law of gravity at
large distances. In the dark energy case, this is less difficult than for dark mat-
ter, as the only observable effect of dark energy is to increase the expansion rate
of the universe at late times, whereas the effect of dark matter is seen in various
physical systems on different scales and in different eras of the universe. Nevertheless,
constructing models that would explain the observations on large scales while
being consistent with the precision tests of general relativity in the solar system
has proven to be difficult (apart from the cosmological constant). This remains an
active area of research. The third possibility is that the homogeneous and isotropic
approximation is not good enough at late times due to the formation of non-linear
structures. Studying this possibility is difficult, because it requires dealing with
non-perturbative general relativity in the complex setting of cosmological structure
formation.

⁹ And electrons. Although electrons are not baryons (they are leptons), cosmologists refer to matter made out of protons, electrons, and neutrons as “baryonic”. In any case, electrons are so light that their contribution to the total mass is tiny.

1.2 Units and terminology


Natural units. We use natural units in which $c = \hbar = k_B = 1$.

c=1
The theory of relativity unifies space and time into a single concept, the four-
dimensional spacetime. It is thus natural to use the same units for measuring spatial
distance and time. Since the (vacuum) speed of light is c = 299 792 458 m/s, we set
1 s ≡ 299 792 458 m, so that c = 1 and 1 second = 1 light second and 1 year = 1 light
year. Velocity is thus a dimensionless quantity, and smaller than one for massive
objects. Energy and mass have the same dimension and the relation between mass
m and energy E for free particles, $E^2 = m^2 c^4 + p^2 c^2$, is simply $E^2 = m^2 + p^2$, where
p is particle momentum.

kB = 1
Temperature T is a parameter that describes a thermal equilibrium distribution.
The formula for the occupation number of energy level E includes the exponential
form $e^{\beta E}$, where $\beta = 1/(k_B T)$. The only function of the Boltzmann constant,
$k_B = 1.3806 \times 10^{-23}$ J/K, is to convert temperature into energy units. We decide to
give temperature directly in energy units, so $k_B$ becomes unnecessary. We define
1 K = $1.3806 \times 10^{-23}$ J, or
$$ 1\ \mathrm{eV} = 11600\ \mathrm{K} = 1.78 \times 10^{-36}\ \mathrm{kg} = 1.60 \times 10^{-19}\ \mathrm{J} . \tag{1.1} $$
Thus $k_B = 1$, and the exponential form is just $e^{E/T}$.

$\hbar = 1$
The third simplification in the natural system of units is to set the reduced Planck
constant to unity, $\hbar = h/2\pi = 1$. In SI units we have $\hbar = 1.054573 \times 10^{-34}$ J s,
so in the natural system of units the dimensions of mass and energy are equal to
the dimension of 1/time or 1/distance. This is convenient, because the typical time
and distance scales of quantum mechanics are associated with particle energy. For
example, the energy of a photon $E = \hbar\omega = \omega$ is equal to its angular frequency. We
have
$$ 1\ \mathrm{eV} = 5.07 \times 10^{6}\ \mathrm{m}^{-1} = 1.52 \times 10^{15}\ \mathrm{s}^{-1} . \tag{1.2} $$
A very useful relation to remember is
$$ \hbar \approx 197\ \mathrm{MeV\,fm} = 1 , \tag{1.3} $$
where we have the energy scale ∼ 100 MeV and length scale 1 fm of strong interac-
tions.
Equations become now simpler and the physical relations more transparent, since
we do not have to include the above fundamental constants. However, we still have to
do conversions among different units, because the preferred units used in particle
physics and cosmology (not to mention astrophysics) are different.
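
As an illustration of such conversions, the following minimal Python sketch restores the constants c, ħ and k_B to check relations (1.1)–(1.3). The function names are hypothetical, chosen only for this example, and the SI values of the constants are the CODATA 2018 ones rather than the rounded values quoted above.

```python
# SI values of the constants that are set to unity in natural units
# (CODATA 2018 values; an assumption of this sketch, not taken from the notes).
C = 299_792_458.0         # speed of light, m/s
HBAR = 1.054_571_817e-34  # reduced Planck constant, J s
K_B = 1.380_649e-23       # Boltzmann constant, J/K
EV = 1.602_176_634e-19    # one electronvolt, J


def ev_to_kelvin(energy_ev):
    """Temperature in K corresponding to an energy in eV (T = E / k_B)."""
    return energy_ev * EV / K_B


def kelvin_to_ev(temperature_k):
    """Energy in eV corresponding to a temperature in K (E = k_B T)."""
    return temperature_k * K_B / EV


def ev_to_kg(energy_ev):
    """Mass in kg corresponding to an energy in eV (m = E / c^2)."""
    return energy_ev * EV / C**2


def ev_to_inverse_metre(energy_ev):
    """Inverse length in 1/m corresponding to an energy in eV (E / (hbar c))."""
    return energy_ev * EV / (HBAR * C)


def ev_to_inverse_second(energy_ev):
    """Inverse time in 1/s corresponding to an energy in eV (E / hbar)."""
    return energy_ev * EV / HBAR


# hbar * c expressed in MeV fm, cf. relation (1.3)
HBAR_C_MEV_FM = HBAR * C / (1e6 * EV) / 1e-15

if __name__ == "__main__":
    print(ev_to_kelvin(1.0), ev_to_kg(1.0))
    print(ev_to_inverse_metre(1.0), ev_to_inverse_second(1.0))
    print(HBAR_C_MEV_FM)
```

With these helpers, `ev_to_kelvin(1e11)` gives the temperature corresponding to 100 GeV (of order $10^{15}$ K), and `kelvin_to_ev(3000)` gives the energy corresponding to 3000 K (about a quarter of an eV), matching the values quoted in section 1.1.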

Astronomical units. A common unit of mass and energy is the solar mass, $m_\odot = 1.99 \times 10^{30}$ kg,
and a common unit of length is one parsec, 1 pc = 3.26 light years = $3.09 \times 10^{16}$ m.
One parsec is defined as the distance from which 1 astronomical unit
(AU, the distance between the Earth and the Sun) forms an angle of one arcsecond,
1″.¹⁰ (Astronomers and cosmologists only use light years when talking to outsiders.)
A more common scale in cosmology is 1 Mpc = $10^6$ pc, which is roughly the typical
distance between galaxies at the present time.
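
The definition of the parsec can be checked numerically. The sketch below is only an illustration: the value of the astronomical unit and the use of a Julian year (365.25 days) for the light year are assumptions of the example, not values given in these notes.

```python
import math

AU_M = 1.495_978_707e11    # astronomical unit in metres (assumed value)
C_M_S = 299_792_458.0      # speed of light, m/s
JULIAN_YEAR_S = 365.25 * 86_400.0  # assumed length of a year, in seconds

# One arcsecond is 1/3600 of a degree.
ARCSECOND_RAD = math.radians(1.0 / 3600.0)

# Distance from which 1 AU subtends an angle of one arcsecond.
PARSEC_M = AU_M / math.tan(ARCSECOND_RAD)

LIGHT_YEAR_M = C_M_S * JULIAN_YEAR_S
PARSEC_LY = PARSEC_M / LIGHT_YEAR_M

if __name__ == "__main__":
    print(PARSEC_M, PARSEC_LY)
```

Both results agree with the rounded values above, 1 pc ≈ $3.09 \times 10^{16}$ m ≈ 3.26 light years.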

1.3 The Observable Universe


The observations relevant to cosmology are mainly astronomical. The speed of light
is finite, and therefore, when we look far away, we also look back in time. The
universe has been transparent since the formation of atoms at about 400 000 years,
so more than 99.99% of the history of the universe can be seen by us.
The most important channel of observation is the electromagnetic radiation
(light, radio waves, X-rays, etc.) coming from space. We also observe charged
particles (protons, electrons, nuclei), called cosmic rays, as well as neutrinos. Ac-
cording to theory, there are also gravitational waves going through us, but we have
not been able to observe them so far. In addition, the composition of matter in the
solar system has cosmological significance.

1.3.1 Redshift and the Hubble law


One of the starting points of modern observational cosmology was the discovery by
Lemaître in 1927 and Hubble in 1929 that the redshifts of galaxies are proportional
to their distance. Redshift refers to the fact that the light is redder (has longer
wavelength) when it reaches us than when it was emitted. This redshift can be
determined with high accuracy from spectral lines. These lines are caused by transitions
between different energy states of atoms, and thus their original wavelengths
$\lambda_0$ can be measured in the laboratory on Earth.¹¹ The redshift z is defined as
$$ z \equiv \frac{\lambda - \lambda_0}{\lambda_0} \quad \text{or} \quad 1 + z = \frac{\lambda}{\lambda_0} , \tag{1.4} $$
where $\lambda$ is the observed wavelength. The redshift is observed to be independent of
wavelength and follows the relation
$$ z = H_0 d , \tag{1.5} $$
where d is the distance to the galaxy and z is its redshift. (The speed of light c is
set to unity, as noted above.) This relation was first discovered by Lemaître, and it
is called the Hubble law. The proportionality constant $H_0$ is correspondingly called
the Hubble constant. Lemaître introduced it, first determined its value from
observations, and interpreted it as the expansion rate.

¹⁰ One degree is divided into 60 arc minutes, denoted by 60′, and one arc minute is divided into 60 arc seconds, denoted by 60″.

¹¹ Assuming that the laws of physics are the same here and at the emission event. Put another way, spectral lines offer a sensitive test of the change of the laws of quantum electrodynamics and nuclear physics in time and space. No deviations from the laws observed on Earth have been found so far.

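
To make the definition (1.4) and the Hubble law (1.5) concrete, here is a minimal Python sketch. The function names and the observed wavelength are hypothetical, the rest wavelength is that of the hydrogen Hα line, and the default value of the Hubble constant, 72.5 km/s/Mpc, is the measurement quoted later in this section. The speed of light is restored so that the distance comes out in Mpc, and the result is only meaningful for z ≪ 1.

```python
C_KM_S = 299_792.458  # speed of light in km/s, restored to get d in Mpc


def redshift(observed_wavelength, rest_wavelength):
    """Redshift z from eq. (1.4); both wavelengths in the same units."""
    return observed_wavelength / rest_wavelength - 1.0


def hubble_law_distance_mpc(z, h0_km_s_mpc=72.5):
    """Distance in Mpc from z = H0 d / c, eq. (1.5) with c restored.

    Valid only for z << 1.
    """
    return C_KM_S * z / h0_km_s_mpc


if __name__ == "__main__":
    # Hypothetical observation: H-alpha (rest 656.28 nm) seen at 662.84 nm.
    z = redshift(662.84, 656.28)
    print(z, hubble_law_distance_mpc(z))
```

For this made-up spectral line the redshift is roughly 0.01, which corresponds to a distance of about 41 Mpc.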
While the redshift can be readily determined with high accuracy, it is more
difficult to determine the distance d. Measurements of distances in general and the
Hubble parameter in particular have been the subject of much work and controversy
over decades. Distance determinations used to be exclusively based on the notion of
the cosmic distance ladder. This refers to a series of relative distance determinations
between more nearby and faraway objects. The first step of the ladder is made of
nearby stars, whose absolute distance can be determined from their parallax, their
apparent motion on the sky due to our motion around the Sun. The other steps
require “standard candles”, classes of objects with the same absolute luminosity
(radiated power), so that their relative distances are inversely related to the square
roots of their “brightness” or apparent luminosity (received flux density). Often
several steps are needed, since objects that can be found close by are too faint to
be observed from very far away, and errors (inaccuracies) accumulate from step to step.
Nowadays there are also measurements which do not use the distance ladder, and
the value of H0 is determined to a reasonable accuracy, though the matter may not
be entirely settled.
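
The standard candle argument above can be sketched in a few lines of Python. This is an idealised illustration: the function names and numbers are hypothetical, the received flux is assumed to fall off exactly as the inverse square of the distance, and effects such as absorption or the expansion of the universe are ignored.

```python
import math


def relative_distance(flux_near, flux_far):
    """Ratio d_far / d_near for two objects of equal absolute luminosity.

    Since flux is proportional to 1 / d^2, the distance ratio is the
    square root of the inverse flux ratio.
    """
    return math.sqrt(flux_near / flux_far)


def ladder_step(distance_near, flux_near, flux_far):
    """Distance to the far object, given the known distance to the near one."""
    return distance_near * relative_distance(flux_near, flux_far)


if __name__ == "__main__":
    # A candle that appears 100 times fainter is 10 times further away.
    print(relative_distance(100.0, 1.0))
    # Hypothetical step: near candle at 50 pc (e.g. from parallax).
    print(ladder_step(50.0, 4.0, 1.0))
```

Each step of the ladder multiplies a previously determined distance by such a ratio, which is why the uncertainty of every earlier step carries over to all later ones.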
The latest measurement of H0 reports the value [1]

$$ H_0 = 72.5 \pm 2.5\ \mathrm{km/s/Mpc} . $$

The error bars are derived from combining three different measurements which are
used to anchor the distance ladder. Systematic effects are included, but may be
somewhat underestimated. The error represents the range where the real result is
with a 68% probability. Doubling the error bar gives the 95% probability range.
(Unless otherwise noted, all error bars given during the course are 68% error bars.)
There are other observations pointing to around this value (as well as some
observations pointing to a somewhat lower value). This uncertainty of the distance
scale is reflected in many cosmological quantities. It is customary to give these
quantities multiplied by the appropriate power of h, defined by

$$ H_0 = h \cdot 100\ \mathrm{km/s/Mpc} . \tag{1.6} $$

Conservatively, we have h = 0.6 . . . 0.8.
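
As a small worked example of the definitions above, the following Python sketch computes h from the quoted measurement, the 68% and 95% ranges obtained by adding one or two error bars, and the inverse of the Hubble constant in years. The last quantity is only a unit conversion, included as an illustration. The names, the value of the megaparsec in km and the length of the year are assumptions of this sketch.

```python
MPC_KM = 3.0857e19    # one megaparsec in km (assumed rounded value)
YEAR_S = 3.15576e7    # one Julian year in seconds (assumed)

H0 = 72.5             # measured value quoted above, km/s/Mpc
SIGMA = 2.5           # its 68% error bar, km/s/Mpc


def little_h(h0_km_s_mpc):
    """Dimensionless h defined by H0 = h * 100 km/s/Mpc, eq. (1.6)."""
    return h0_km_s_mpc / 100.0


def probability_range(value, sigma, n_sigma=1):
    """Interval value +/- n_sigma * sigma (68% for n_sigma=1, 95% for 2)."""
    return (value - n_sigma * sigma, value + n_sigma * sigma)


def inverse_h0_gyr(h):
    """1 / H0 in units of 10^9 years, for a given h."""
    seconds = MPC_KM / (100.0 * h)  # (km/Mpc) / (km/s/Mpc) = s
    return seconds / YEAR_S / 1e9


if __name__ == "__main__":
    print(little_h(H0))
    print(probability_range(H0, SIGMA), probability_range(H0, SIGMA, 2))
    print(inverse_h0_gyr(little_h(H0)))
```

The 95% range, 67.5 to 77.5 km/s/Mpc, lies inside the conservative interval h = 0.6 … 0.8 given above.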


For small redshifts (z ≪ 1), the redshift is often thought of as the Doppler effect
due to the relative motion of the source and the observer. The distant galaxies are
thus apparently receding from us with the velocity

$$ v = z . \tag{1.7} $$

The further out the galaxies are, the faster they are receding from us. Astronomers
often report the redshift in units of velocity (by reintroducing c in (1.7), z = v/c).
However, according to general relativity, movement in space is not the proper
way to understand the redshift. The galaxies are not actually moving; the distance
between the galaxies increases because the space between the galaxies is expanding.
We will later derive the redshift from general relativity. It turns out that equations
(1.5) and (1.7) hold only in the limit z ≪ 1, and the general result, d(z), which
relates distance d and redshift z, is more complicated. In particular, the distance
reaches a finite value as z goes to infinity (though we should be careful about what
we mean by distance). We look at this in detail in the next chapter. The redshift is
directly related to the expansion. The easiest way to understand the cosmological
redshift is that the wavelength of travelling light expands with the universe. Thus
the universe has expanded by a factor 1 + z during the time light travelled from an
object with redshift z to us.
The largest observed redshift of a galaxy is at present z = 8.6. Thus the
universe has expanded by a factor of about 10 while the observed light has been on
its way. When the light left the galaxy, the age of the universe was only about 600
million years. At that time the first galaxies were just being formed. This upper
limit in the observations is however not due to there being no earlier galaxies, but
rather to the fact that they are so faint due to the large distance. There may be some
galaxies with a redshift greater than 10. NASA is planning a new space telescope,
the James Webb Space Telescope12 , which would be able to observe these.
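The relation between redshift and expansion can be put into a few lines of code. This is a minimal sketch; the function names and the 500 nm example wavelength are arbitrary choices for illustration.

```python
# Sketch: the universe expands by a factor 1 + z while light with
# redshift z travels to us, and the wavelength stretches by the same factor.
# Function names and the 500 nm example are illustrative assumptions.

def expansion_factor(z):
    """Factor by which the universe has expanded since the light was emitted."""
    return 1.0 + z

def observed_wavelength(emitted_wavelength, z):
    """Wavelength seen by the observer, in the same units as the input."""
    return emitted_wavelength * (1.0 + z)

print(expansion_factor(8.6))            # roughly a factor of 10
print(observed_wavelength(500.0, 8.6))  # 500 nm light arrives in the infrared
```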
The expansion rate H changes on the cosmological timescale. Properly, the
time-dependent function H(t) is called the Hubble parameter, and its present value
is called the Hubble constant, H0 . In cosmology, it is customary to denote present
values of quantities with the subscript 0. Thus H0 ≡ H(t0 ).
The galaxies are not exactly at rest in the expanding space. Each galaxy has
its own peculiar motion vgal , caused by the gravity of nearby mass concentrations,
such as other galaxies. Neighbouring galaxies can fall towards each other, orbit each
other and so on13 .
Thus the redshift of an individual galaxy is the sum of the cosmic and the peculiar
redshift,
z = H0 d + vgal (when z ≪ 1) . (1.8)
Usually only the redshift is known precisely. Typically vgal is around 300 . . . 500
km/s. (In large galaxy clusters, where galaxies orbit each other, it can be several
thousand km/s; but then one can take the average redshift of the cluster.) For
faraway galaxies, H0 d ≫ vgal . The larger the redshift, the younger the universe was
when the light left.
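To see how much the peculiar velocity matters in Eq. (1.8), the sketch below builds a redshift from a known distance and peculiar velocity, and then estimates the distance back while ignoring the peculiar term. It assumes H0 = 72.5 km/s/Mpc and restores c explicitly; the function names are hypothetical.

```python
# Sketch of Eq. (1.8) with c restored: c z = H0 d + v_gal (valid for z << 1).
# H0 value, function names and the example galaxy are assumptions.

C_KM_S = 299792.458  # speed of light in km/s

def redshift(distance_mpc, v_pec_km_s, H0=72.5):
    """Low-redshift approximation: cosmic plus peculiar contribution."""
    return (H0 * distance_mpc + v_pec_km_s) / C_KM_S

def naive_distance(z, H0=72.5):
    """Distance estimate in Mpc that ignores the peculiar velocity."""
    return C_KM_S * z / H0

z = redshift(100.0, 500.0)       # galaxy at 100 Mpc receding 500 km/s faster
error_mpc = naive_distance(z) - 100.0
print(z, error_mpc)              # the error equals v_gal / H0
```

The distance error v_gal / H0 is fixed by the peculiar velocity, so its relative importance shrinks for faraway galaxies, where H0 d ≫ v_gal.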

1.3.2 The horizon


Because of the finite speed of light and the finite age of the universe, only a finite
part of the universe is observable. Our horizon is at the distance from which light
has just had time to reach us during the entire age of the universe. Were it not for
the expansion of the universe, the distance to this horizon dhor would equal the age
of the universe, around 14 billion light years (4300 Mpc). Expansion complicates
the situation; we will calculate the horizon distance later. For large distances the
redshift grows faster than (1.5). At the horizon z → ∞, i.e., dhor = d(z = ∞).
The universe has been transparent only for z < 1090 (after recombination), so the
12
http://www.jwst.nasa.gov
13
Alternatively, we may say that the galaxies remain in place, but there are local deviations
in the expansion rate. This is the more natural interpretation from the point of view of general
relativity. However, the idea of peculiar velocities is perhaps simpler to understand, as it is closer
to Newtonian physics (the way it is usually formulated).

“practical horizon”, i.e. the limit to what we can see, lies already at z = 1090. The
distances d(z = 1090) and d(z = ∞) are very close to each other; z = 4 lies about
halfway from here to the horizon.
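The conversion between the two distance units quoted above for the non-expanding case can be checked with a short sketch. The conversion factor of about 3.2616 light years per parsec is a standard value; the function name is an illustrative choice.

```python
# Sketch: convert a light-travel distance in light years to megaparsecs.
# With c = 1, an age of 14 billion years corresponds to 14 billion light years.

LIGHT_YEARS_PER_PARSEC = 3.2616  # standard conversion factor (approximate)

def light_years_to_mpc(distance_ly):
    """Distance in Mpc for a distance given in light years."""
    return distance_ly / LIGHT_YEARS_PER_PARSEC / 1.0e6

print(light_years_to_mpc(14e9))  # close to the 4300 Mpc quoted in the text
```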
Therefore we can only observe a finite region of the universe, enclosed in the
sphere with radius dhor . The universe can extend to large distances beyond that,
and it may even be infinite. Sometimes the word “universe” is used to denote just this
observable part of the “whole” universe. Then we can say that the universe contains
some $10^{12}$ galaxies and $10^{23}$ stars. Over cosmological time scales the horizon recedes
and parts of the universe which are beyond our present horizon become observable.
(However, if the expansion continues to accelerate as it seems to have done during
the past few billion years, the observable region will not grow, and in the distant
future galaxies that are now observable will disappear from our sight.)

1.3.3 The electromagnetic channel


Consider first the electromagnetic channel of observation. Although interstellar
space is transparent (except for radio waves longer than 100 m, which are absorbed
by interstellar ionized gas, short-wavelength ultraviolet radiation, which is absorbed
by neutral gas and very high-energy gamma rays which interact with the cosmic
microwave background), Earth’s atmosphere is opaque except for two wavelength
ranges, the optical window (λ = 300–800 nm), which includes visible light, and the
radio window (λ = 1 mm–20 m). The atmosphere is partially transparent to infrared
radiation, which is absorbed by water molecules in the air; high altitude and dry
air favour infrared astronomy. Accordingly, the traditional branches of astronomy
are optical astronomy and radio astronomy. Observations at other wavelengths have
become possible only during the past few decades, from space (with satellites) or at
very high altitude in the atmosphere (with planes, rockets and balloons).
From optical astronomy we know that there are stars in space. The stars are
grouped into galaxies. There are different kinds of galaxies, such as irregular, ellip-
tical, and flat disks or spirals. Our own galaxy (the Galaxy, or Milky Way galaxy)
is a disk. The plane of the disk can be seen (on a dark night) as a faint band –the
Milky Way– across the sky.
Notable nearby galaxies are the Andromeda galaxy (M31) and the Magellanic
clouds (LMC, Large Magellanic Cloud, and SMC, Small Magellanic Cloud). These
are the only other galaxies that are visible to the naked eye. (The Magellanic clouds
and the center of the Milky Way lie too far south to be seen from Finland.) The
number of galaxies that can be seen with powerful telescopes is many billions.
Other observable objects include dust clouds, which hide stars behind them, and
gas clouds. Gas clouds absorb starlight at certain frequencies, which excite the gas
atoms to higher energy states. As the atoms return to lower energy states they
emit photons at the corresponding wavelength. Thus we can determine from the
spectrum of light what elements the gas cloud is made of. In the same way the
composition of stellar surfaces can be determined.
The earliest “cosmological observation” was that the night sky is dark. If the
universe were eternal and infinitely large, unchanging and similar everywhere, our
eye would eventually meet the surface of a star in every direction, and the entire
night sky would be as bright as the Sun. This is called Olbers’ paradox.

Optical astronomy and the large scale structure. There is a large body of
data relevant to cosmology from optical astronomy. Counting the number of stars
and galaxies we can estimate the matter density they contribute to the universe.
From the different redshifts of galaxies within the same galaxy cluster we obtain
their relative motions, which reflect the gravitating mass within the system. The
mass estimates for galaxy clusters obtained this way are much larger than those
obtained by counting the visible stars and galaxies in the cluster, pointing to the
existence of dark matter.
From the spectral lines of stars and gas clouds we can determine the relative
amounts of different elements and their isotopes in the universe.
The distribution of galaxies in space and their relative velocities tell us about
the large scale structure of the universe. The galaxies are not distributed uniformly.
There are galaxy groups and clusters. Our own galaxy belongs to a small group
of galaxies called the Local Group. The Local Group consists of three large spiral
galaxies: M31 (the Andromeda galaxy), M33, and the Milky Way, and about 60
smaller galaxies and dwarf galaxies. The Local Group’s diameter is around 3 Mpc.
The nearest large cluster is the Virgo Cluster. The grouping of galaxies into clus-
ters is not as strong as the grouping of stars into galaxies. Rather, galaxies are
distributed in a complex pattern called the cosmic web, which consists of walls, fil-
aments, clusters and voids (low-density regions). Most galaxies are not part of any
well defined cluster.

Radio astronomy. The sky looks very different on radio wavelengths than to the
naked eye. There are many strong radio sources very far away. These are galaxies
which are optically barely observable. They are distributed isotropically, i.e. there
are equal numbers of them in every direction, but there are more far away (at z > 1)
than close by (z < 1). The isotropy is evidence of the homogeneity of the universe
at the largest scales—there is structure only at smaller scales. The dependence
on distance is a time evolution effect in two ways: the radio sources evolve and
the volume of the universe evolves. In general, there are more objects at larger
distances simply because there is more volume there. However, the evolution in the
number counts is not explained only by the change in the volume of the universe,
but it shows that the radio sources themselves evolve. Some galaxies are strong
radio sources when they are young, but become weaker with age by a factor of more
than 1000.
Cold gas clouds can be mapped using the 21 cm spectral line of hydrogen. The
ground state (n = 1) of hydrogen is split into two very close energy levels depending
on whether the proton and electron spins are parallel or antiparallel (the hyperfine
structure). The separation of these energy levels, the hyperfine structure constant,
is 5.9 µeV, corresponding to a photon wavelength of 21 cm, i.e. radio waves. The
redshift of this spectral line shows that redshift is independent of wavelength (it is
the same for radio waves and visible light), as it should if it is due to the expansion
of space.
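The correspondence between the hyperfine splitting and the 21 cm wavelength follows from λ = hc/E. The sketch below evaluates it and also redshifts the line. The value of hc in eV m is a standard constant; the function names are hypothetical.

```python
# Sketch: wavelength of the photon emitted in the hyperfine transition,
# and the same line after cosmological redshift.

HC_EV_M = 1.23984e-6  # Planck constant times speed of light, in eV m

def photon_wavelength_m(energy_eV):
    """Wavelength in metres of a photon with the given energy in eV."""
    return HC_EV_M / energy_eV

def redshifted_wavelength_m(energy_eV, z):
    """Observed wavelength of the line for a source at redshift z."""
    return photon_wavelength_m(energy_eV) * (1.0 + z)

print(photon_wavelength_m(5.9e-6))           # about 0.21 m
print(redshifted_wavelength_m(5.9e-6, 1.0))  # the same line from z = 1
```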

Cosmic microwave background. At microwave frequencies the sky is dominated
by the cosmic microwave background (CMB), which is highly isotropic, i.e.
the microwave sky appears glowing uniformly without any features, except for very
small contrasts. The electromagnetic spectrum of the CMB is the black body spec-
trum with a temperature of T0 = 2.72548 ± 0.00057 K [2]. It follows the theoretical
black body spectrum better than anything else we have observed or produced. It
is the remnant of a hot state in the early universe, when matter and light were al-
most homogeneously and isotropically distributed and in thermal equilibrium. The
temperature of the CMB falls as $(1 + z)^{-1}$ due to photon redshift, so as the CMB
redshift is about 1090, the original temperature was about 3000 K.
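The scaling can be written as T(z) = T0 (1 + z) for the temperature at the time corresponding to redshift z. A minimal sketch, with an illustrative function name:

```python
# Sketch: CMB temperature at redshift z from the present value T0,
# assuming only the (1 + z) scaling stated in the text.

T0_K = 2.72548  # present CMB temperature in kelvin, from the text

def cmb_temperature(z):
    """Temperature of the CMB photons at redshift z, in kelvin."""
    return T0_K * (1.0 + z)

print(cmb_temperature(1090))  # a little under 3000 K
```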
The state of a system in thermal equilibrium is determined by a small number
of thermodynamic variables, in this case the temperature and chemical potentials
(for particles with conserved quantum numbers). The observed temperature of the
CMB and the observed density of the present universe allows us to fix the evolution
of the temperature and the density of the universe, which then allows us to calculate
the sequence of events in the early universe. That the early universe was hot and
in thermal equilibrium is a central part of the Big Bang paradigm, and it is often
called the Hot Big Bang theory to spell this out.
With sensitive instruments a small anisotropy can be observed in the microwave
sky. This is dominated by the dipole anisotropy (one side of the sky is slightly hotter
and the other side colder), with an amplitude of 3.346 ± 0.017 mK, or ∆T /T0 =
0.0012. This is a Doppler effect due to the motion of the observer, i.e. the motion of
our Solar System with respect to the radiating matter at our horizon. The velocity
of this motion is v = (∆T /T0 ) c = 369 km/s, or v = 0.00123 and it is directed
towards the constellation of Leo (R.A. 11h 8m 50s, Dec. −6° 37′), near the autumnal
equinox (where the ecliptic and the equator cross on the sky) [3]. It is due to two
components, the motion of the Sun around the center of the Galaxy, and the peculiar
motion of the Galaxy due to the gravitational pull of matter concentrations up to
100–200 Mpc away14 .
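The velocity follows from the dipole amplitude through v = (∆T/T0) c, which holds to first order in v/c. The sketch below evaluates it with the numbers given above; the function name is an illustrative choice, and the result should come out close to the quoted value, with small differences depending on the rounding of the inputs.

```python
# Sketch: velocity of the observer relative to the CMB from the dipole,
# using the first-order Doppler relation v = (dT / T0) c.

C_KM_S = 299792.458   # speed of light in km/s
T0_K = 2.72548        # mean CMB temperature in kelvin, from the text
DIPOLE_K = 3.346e-3   # dipole amplitude in kelvin, from the text

def dipole_velocity_km_s(delta_T, T0):
    """Observer velocity in km/s implied by a dipole amplitude delta_T."""
    return delta_T / T0 * C_KM_S

print(DIPOLE_K / T0_K)                       # about 0.0012
print(dipole_velocity_km_s(DIPOLE_K, T0_K))  # a few hundred km/s
```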
When we subtract the effect of this motion from the observations (and look
away from the plane of the Galaxy—our Galaxy also emits microwave radiation,
but with a non-thermal spectrum) the true anisotropy of the CMB remains, with an
amplitude of about $3 \times 10^{-5}$, or 80 µK.15 This anisotropy gives a picture of the
small density variations in the early universe, the “seeds” of galaxies. Theories of
structure formation have to match both the small inhomogeneity of the order $10^{-5}$
at z = 1090 and the structure observed today (z = 0).
14
Sometimes it is asked whether there is a contradiction with special relativity here—doesn’t
CMB provide an absolute reference frame? There is no contradiction. The relativity principle
just says that the laws of physics are the same in the different reference frames. It does not say
that systems cannot have reference frames which are particularly natural for that system, e.g. the
center-of-mass frame or the laboratory frame. For road transportation, the surface of the earth is
a natural reference frame. In cosmology, the CMB gives us a good “natural” reference frame—it is
closely related to the center-of-mass frame of the observable part of the universe, or rather, a part
of it which is close to the horizon (the last scattering surface). The different parts of the plasma
from which the CMB originates are moving with different velocities (part of the $10^{-5}$ anisotropy
is due to these velocity variations). If there is something surprising here, it is that these relative
velocities are so small, of the order of just a few km/s, reflecting the astonishing homogeneity of the
early universe over large scales. We will return to the question of whether these are natural initial
conditions when we discuss inflation in the second part of the course.
15
The numbers refer to the standard deviation of the CMB temperature on the sky. The hottest
and coldest spots deviate some 4 or 5 times this amount from the average temperature.
REFERENCES 14

Gamma ray bursts and quasars. The highest energy region of the electro-
magnetic spectrum is occupied by γ rays. Space-based γ-ray observatories have
discovered powerful Gamma Ray Bursts (GRB) on the sky. These are short events
lasting from a fraction of a second to a few minutes. They are observed about once
per day, and are distributed isotropically on the sky. The isotropic distribution sug-
gests that they are at cosmological distances (further out than our own or nearby
galaxies). This has been confirmed by the identification of some GRBs with galaxies
with high redshifts (z > 1). This means that the bursts must have extremely high
energies. The longer duration (longer than a second) GRBs appear to be related to
particularly powerful supernova events. The shorter duration ones (less than a second)
are possibly due to collisions of neutron stars with each other or with black holes.
Quasars (Quasistellar Objects, QSOs) are the most powerful continuously radi-
ating objects in the universe. Thus the most-distant (earliest) objects observed in
the universe are mostly quasars. The highest observed power is about $10^{41}$ W. At
first quasars were considered different from galaxies since they looked like point-like
objects. In photographs they looked like stars, but their redshifts revealed their
huge distances and thus their huge power outputs. Now better observations have
revealed “host” galaxies around several quasars. It has been concluded that quasars
are powerfully radiating galactic nuclei, and are related to some more close-by galax-
ies (Seyfert galaxies), whose nuclei are also fairly powerful sources of radiation. To-
gether these objects are called Active Galactic Nuclei (AGN). Quasars are powerful
sources at many different wavelengths (radio, optical, X-ray). Some of them be-
long to the radio sources mentioned earlier, others are radio quiet. There are more
quasars at large distances (in the past, z > 1) than nearer to us (later, z < 1),
because quasars grow fainter as they age; they become ordinary quiet galaxies.
The power source of an AGN is thought to be a very large black hole (with
$m = 10^{8} M_\odot$ or so) at the center of the galaxy, into which surrounding matter is
falling. As it approaches the hole, this matter is heated up and begins to radiate.
AGNs quiet down over cosmological time scales as the black hole gradually cleans
up the surrounding regions.

1.3.4 Cosmic rays


Cosmic rays are protons, electrons, and nuclei coming from space. Some of them
have extremely high energies, some above $10^{20}$ eV (in the laboratory frame; in the
center-of-mass frame the energy is of the order $10^{15}$ eV). These energies are higher
than what can be reached in particle accelerators (LHC reaches $\sim 10^{13}$ eV). It is
thought that cosmic rays originate from supernovae (exploding stars). Since they
are charged particles their paths are warped by galactic magnetic fields, so their
arrival direction does not point towards their origin. The cosmic rays are about
90% protons, 10% other nuclei and 1% electrons. All elements up to uranium are
represented.
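The difference between the laboratory and center-of-mass energies quoted above can be estimated with relativistic kinematics. The sketch below assumes, for illustration only, that the cosmic ray is a proton hitting a proton at rest (for example in the atmosphere); the notes do not specify the target. In units with c = 1, the invariant is s = m1^2 + m2^2 + 2 E m2.

```python
# Sketch: center-of-mass energy for a cosmic ray hitting a target at rest.
# Assumption (not from the notes): both projectile and target are protons.
import math

PROTON_MASS_EV = 0.938e9  # proton rest energy in eV (approximate)

def center_of_mass_energy_eV(lab_energy_eV,
                             projectile_mass_eV=PROTON_MASS_EV,
                             target_mass_eV=PROTON_MASS_EV):
    """sqrt(s) for total projectile energy E and a target at rest, c = 1."""
    s = (projectile_mass_eV**2 + target_mass_eV**2
         + 2.0 * lab_energy_eV * target_mass_eV)
    return math.sqrt(s)

print(center_of_mass_energy_eV(1e20))  # several times 10^14 eV
```

Under these assumptions the result is of the order $10^{15}$ eV, well above the LHC energy quoted in the text.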

References
[1] G. Efstathiou, Mon. Not. Roy. Astron. Soc. 440 (2014) 1138 [arXiv:1311.3461 [astro-ph.CO]]

[2] D.J. Fixsen, Astrophys. J. 707 (2009) 916.

[3] N. Aghanim et al. [Planck Collaboration], arXiv:1303.5087 [astro-ph.CO] (conversion to equatorial coordinates: www.astro.utu.fi/EGal/CooC/CooC6.html)
2 BASICS OF GENERAL RELATIVITY 16

2 Basics of general relativity


The general theory of relativity completed by Albert Einstein in 1915 (and nearly
simultaneously by David Hilbert) is the current theory of gravity. General relativity
replaced the previous theory of gravity, Newtonian gravity, which can be understood
as a limit of general relativity in the case of isolated systems, slow motions and weak
fields. General relativity has been extensively tested during the past 99 years, and no
deviations have been found, with the possible exception of the accelerated expansion
of the universe, which is however usually explained by introducing new matter rather
than changing the laws of gravity [1]. We will not go through the details of general
relativity; we will just try to get a rough idea of what the theory is like, and
introduce a few concepts and definitions that we will need.
The principle behind special relativity is that space and time together form
four-dimensional spacetime. The essence of general relativity is that gravity is a
manifestation of the curvature of spacetime. While in Newton’s theory gravity acts
directly as a force between two bodies1 , in Einstein’s theory the gravitational in-
teraction is mediated by the spacetime. In other words, gravity is an aspect of the
geometry of spacetime. Matter curves the surrounding spacetime. This curvature
then affects the motion of other matter (as well as the motion of the matter gener-
ating the curvature). “Matter tells spacetime how to curve, spacetime tells matter
how to move” [2]. From the viewpoint of general relativity, gravity is not a force;
if there are no other forces than gravity acting on a body, the body is in free fall.
A freely falling body is moving in a straight line in the curved spacetime, along a
geodesic. If there are other forces, they cause the body to deviate from geodesic
motion. It is important to remember that the viewpoint is that of spacetime, not
just space. For example, the orbit of the earth around the sun is curved in space,
but straight in spacetime.
If a spacetime is not curved, it is said to be flat, which just means that it has the
geometry of Minkowski space. In the case of space (as opposed to spacetime), “flat”
means that the geometry is Euclidean. (Note the possibly confusing terminology:
Minkowski spacetime is called simply Minkowski space!)
To define a physical theory, we should give 1) the kinematics of the theory
(closely related to the symmetry properties), 2) the degrees of freedom and 3) the
laws that determine the time evolution of the degrees of freedom, consistent with
the kinematics (in other words the dynamics). In Newtonian gravity, the kinematics
is that of Euclidean space with the Galilean symmetry group, which is to say that
the laws of physics are invariant under the change of coordinates

\[
\begin{aligned}
x^i &\to x'^i = R^i{}_j x^j + A^i + v^i t \\
t &\to t' = B t + C ,
\end{aligned} \tag{2.1}
\]

where $x^i$ are spatial coordinates, $t$ is time, $R^i{}_j$ is a constant rotation matrix, $A^i$
and $v^i$ are constant vectors and $B$ and $C$ are constants. (Summation over repeated
indices is implied; see section 2.5.) The degrees of freedom are point particles, and
the dynamics is given by Newton’s second law, which states that the acceleration of
1
That is, in the way Newtonian gravity is usually formulated. It is also possible to formulate Newtonian
gravity in geometric terms, so that gravity is an expression of spacetime curvature, although this
is less natural than in the case of general relativity.
particle 1 due to particle 2 is

\[
\ddot{\bar{x}}_1 = -G_N m_2 \frac{\bar{x}_1 - \bar{x}_2}{|\bar{x}_1 - \bar{x}_2|^3} , \tag{2.2}
\]

where $G_N$ is Newton’s constant and $m_2$ is the mass of particle 2. This law is
consistent with the symmetry (2.1), but it is not uniquely specified by it.
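The consistency with the symmetry can be checked numerically for the translation and boost parts of (2.1): the acceleration (2.2) depends only on the separation of the particles, so shifting both positions by the same amount leaves it unchanged. The sketch below is only an illustration; the function names, the SI value of Newton's constant and the example numbers (roughly Earth's mass and a 7000 km separation) are assumptions of this example.

```python
# Sketch: Newtonian acceleration, Eq. (2.2), and a check that it is unchanged
# when both particles are translated and boosted as in Eq. (2.1)
# (identity rotation, same time t for both particles).
import math

G_N = 6.674e-11  # Newton's constant in SI units (approximate)

def acceleration(x1, x2, m2):
    """Acceleration of particle 1 at x1 due to a particle of mass m2 at x2."""
    sep = [a - b for a, b in zip(x1, x2)]
    dist = math.sqrt(sum(s * s for s in sep))
    return [-G_N * m2 * s / dist**3 for s in sep]

def shift_and_boost(x, A, v, t):
    """x -> x + A + v t, the translation and boost part of Eq. (2.1)."""
    return [xi + Ai + vi * t for xi, Ai, vi in zip(x, A, v)]

# Illustrative numbers: a body 7000 km from an Earth-mass particle.
x1, x2, m2 = [7.0e6, 0.0, 0.0], [0.0, 0.0, 0.0], 5.97e24
A, v, t = [1.0e3, -2.0e3, 5.0e2], [10.0, 0.0, -3.0], 60.0

a = acceleration(x1, x2, m2)
a_shifted = acceleration(shift_and_boost(x1, A, v, t),
                         shift_and_boost(x2, A, v, t), m2)
print(a, a_shifted)  # the two results agree; a points towards particle 2
```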
In general relativity, Euclidean space is replaced by curved spacetime. Unlike
Euclidean space or Minkowski space, a general curved spacetime has no symmetries.
However, a central role is played in the theory by diffeomorphism invariance, which
is to say invariance under general coordinate transformations, $x^\alpha \to x'^\alpha(x^\beta)$ (Greek
indices label directions in four-dimensional spacetime, Latin indices label spatial
directions). In addition to the matter degrees of freedom (which are more complicated
than point particles and have to be specified by a matter model; general relativity
is not a theory about the structure of matter!), there are gravitational degrees
of freedom. In Newtonian theory, gravity is just an interaction between particles,
but in general relativity, it is an aspect of the geometry of spacetime and its degrees
of freedom are described by the metric. The equation of motion is the Einstein
equation. Below, we will first go through some kinematics of curved spacetime, and
then briefly discuss the Einstein equation and its relation to Newtonian gravity.

2.1 Curved 2D and 3D space


(If you are familiar with the concept of curved space and how its geometry is given
by the metric, you can skip the following and go straight to Sec. 2.3.)
To help to visualise a four-dimensional curved spacetime, it may be useful to
consider curved two-dimensional spaces embedded in a flat three-dimensional space.2
So let us consider first a 2D space. Imagine there are 2D beings living in this 2D
space. They have no access to a third dimension. How can they determine whether
the space they live in is curved? By examining whether the laws of Euclidean
geometry hold. If the space is flat, then the sum of the angles of any triangle is
180°, and the circumference of any circle with radius χ is 2πχ. If by measurement
they find that this does not hold for some triangles or circles, then they can conclude
that the space is curved.
A simple example of a curved 2D space is the sphere. The sum of angles of any
triangle on a sphere is greater than 180°, and the circumference of any circle drawn
on the surface of a sphere is less than 2πχ. (Straight lines on the sphere are sections
of great circles, which divide the sphere into two equal hemispheres.)
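Both statements can be made quantitative with two standard results of spherical geometry, which are not derived in these notes: a circle of radius χ (measured along the surface) on a sphere of radius r0 has circumference 2π r0 sin(χ/r0), and a triangle of area A has angle sum π + A/r0^2. The function names below are illustrative choices.

```python
# Sketch: how 2D beings on a sphere could detect curvature.
# The two closed-form expressions are standard results, quoted without proof.
import math

def circle_circumference_on_sphere(chi, r0):
    """Circumference of a circle of surface radius chi (chi <= pi * r0)."""
    return 2.0 * math.pi * r0 * math.sin(chi / r0)

def triangle_angle_sum(area, r0):
    """Angle sum in radians of a triangle with the given area on the sphere."""
    return math.pi + area / r0**2

r0 = 1.0
print(circle_circumference_on_sphere(1.0, r0), 2.0 * math.pi * 1.0)
# One octant of the sphere is a triangle with three right angles:
print(math.degrees(triangle_angle_sum(4.0 * math.pi * r0**2 / 8.0, r0)))
```

For χ much smaller than r0 the circumference approaches 2πχ, so small regions of the sphere look flat.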
In contrast, the surface of a cylinder has Euclidean geometry, i.e. there is no
way that 2D beings living on it could conclude that it differs from a flat surface,
and thus by our definition it is a flat 2D space. (Except that by travelling around
the cylinder they could conclude that their space has a non-trivial topology.)
In a similar manner we could try to determine whether the 3D space around us
is curved, by measuring whether the sum of angles of a triangle is 180° or whether
2
This embedding is only a visualisation aid. A curved 2D space is defined completely in terms
of its two independent coordinates, without any reference to a higher dimension. The geometry is
given by the metric (part of the definition of the 2D space), which is a function of these coordinates.
Some such curved 2D spaces have the same geometry as a 2D surface in flat 3D space. We then say
that the 2D space can be embedded in flat 3D space. But there are curved 2D spaces which have
no such corresponding surface, i.e. not all curved 2D spaces can be embedded in flat 3D space.

Figure 1: Cylinder and sphere.

a sphere with radius r has surface area $4\pi r^2$. The space around Earth is indeed
curved due to Earth’s gravity, but the curvature is so small that more sophisticated
measurements than the ones described above are needed to detect it.

2.2 The metric of 2D and 3D space


The tool to describe the geometry of space is the metric. The metric is given in
terms of a set of coordinates. The coordinate system can be an arbitrary curved
coordinate system. The coordinates are numbers which identify locations, but do
not, by themselves, say anything about physical distances. The distance information
is in the metric.
To introduce the concept of a metric, let us first consider Euclidean two-dimensional
space with Cartesian coordinates x, y. Take a parametrised curve x(η), y(η) that
begins at η1 and ends at η2 . The length of the curve is given by
\[
s = \int ds = \int \sqrt{dx^2 + dy^2} = \int_{\eta_1}^{\eta_2} \sqrt{x'^2 + y'^2} \, d\eta , \tag{2.3}
\]
where $x' \equiv dx/d\eta$, $y' \equiv dy/d\eta$. Here $ds = \sqrt{dx^2 + dy^2}$ is the line element. The
square of the line element, the metric, is

\[ ds^2 = dx^2 + dy^2 . \tag{2.4} \]

The line element has the dimension of distance. As a working definition for the
metric, we can use that the metric is an expression which gives the square of the
line element in terms of the coordinate differentials.
We could use another coordinate system on the same 2-dimensional Euclidean
space, e.g., polar coordinates. Then the metric is

\[ ds^2 = dr^2 + r^2 d\varphi^2 , \tag{2.5} \]

giving the length of a curve as


\[
s = \int ds = \int \sqrt{dr^2 + r^2 d\varphi^2} = \int_{\eta_1}^{\eta_2} \sqrt{r'^2 + r^2 \varphi'^2} \, d\eta . \tag{2.6}
\]
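That the two metrics describe the same geometry can be checked numerically: the length of a given curve must come out the same in both coordinate systems. The sketch below approximates the integrals (2.3) and (2.6) by sums over short segments. The curve x = 1 + η, y = η^2 and all function names are arbitrary choices made for this example.

```python
# Sketch: length of the same curve from the Cartesian metric (2.4)
# and from the polar metric (2.5), using sums over short segments.
import math

def curve(eta):
    """Example curve (an arbitrary choice): x = 1 + eta, y = eta^2."""
    return 1.0 + eta, eta**2

def length_cartesian(n=10000):
    """Sum of sqrt(dx^2 + dy^2) over n segments, for eta from 0 to 1."""
    total = 0.0
    x_prev, y_prev = curve(0.0)
    for k in range(1, n + 1):
        x, y = curve(k / n)
        total += math.sqrt((x - x_prev)**2 + (y - y_prev)**2)
        x_prev, y_prev = x, y
    return total

def length_polar(n=10000):
    """Sum of sqrt(dr^2 + r^2 dphi^2) over n segments, r at the midpoint."""
    total = 0.0
    x, y = curve(0.0)
    r_prev, phi_prev = math.hypot(x, y), math.atan2(y, x)
    for k in range(1, n + 1):
        x, y = curve(k / n)
        r, phi = math.hypot(x, y), math.atan2(y, x)
        r_mid = 0.5 * (r + r_prev)
        total += math.sqrt((r - r_prev)**2 + r_mid**2 * (phi - phi_prev)**2)
        r_prev, phi_prev = r, phi
    return total

print(length_cartesian(), length_polar())  # the two lengths agree closely
```

The coordinates of the points along the curve differ between the two systems, but the length, which is the physical information carried by the metric, does not.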

Figure 2: A parametrised curve in Euclidean 2D space with Cartesian coordinates.

In a similar manner, in 3-dimensional Euclidean space, the metric is

\[ ds^2 = dx^2 + dy^2 + dz^2 \tag{2.7} \]

in Cartesian coordinates, and

\[ ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\varphi^2 \tag{2.8} \]

in spherical coordinates (where the r coordinate has the dimension of distance, but
the angular coordinates θ and ϕ are dimensionless).
Now we can go to our first example of a curved (2-dimensional) space, the sphere.
Let the radius of the sphere be $r_0$. For the two coordinates on this 2D space we can
take the angles θ and ϕ. We get the metric from the Euclidean 3D metric in spherical
coordinates by setting $r = r_0$,

\[
ds^2 = r_0^2 \left( d\theta^2 + \sin^2\theta \, d\varphi^2 \right) . \tag{2.9}
\]

The length of a curve θ(η), ϕ(η) on this sphere is given by


\[
s = \int ds = r_0 \int_{\eta_1}^{\eta_2} \sqrt{\theta'^2 + \sin^2\theta \, \varphi'^2} \, d\eta . \tag{2.10}
\]

For later application in cosmology, it is instructive to now consider a coordinate
transformation $r = \sin\theta$ (this new coordinate r has nothing to do with the earlier
r of 3D space; it is a coordinate on the sphere growing in the same direction as θ,
starting at r = 0 from the North Pole (θ = 0)). Since now $dr = \cos\theta \, d\theta = \sqrt{1 - r^2} \, d\theta$,
the metric becomes (for a unit sphere, $r_0 = 1$)
\[
ds^2 = \frac{dr^2}{1 - r^2} + r^2 d\varphi^2 . \tag{2.11}
\]
For r ≪ 1 (in the vicinity of the North Pole), this metric is approximately the same
as Eq. (2.5), i.e., it becomes polar coordinates on the “Arctic plane”. Only as r
gets bigger we begin to notice the deviation from flat geometry. Note that we run
into a problem when r = 1. This corresponds to θ = 90°, i.e. the “equator”. After

Figure 3: A parametrised curve on a 2D sphere with spherical coordinates.

Figure 4: The part of the sphere covered by the coordinates in Eq. (2.11).

this r = sin θ begins to decrease again, repeating the same values. Also, at r = 1,
the $1/(1 - r^2)$ factor in the metric becomes infinite. We say we have a coordinate
singularity at the equator. There is nothing wrong with the space itself, but our
chosen coordinate system applies only for a part of this space, the region “north” of
the equator.
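That the singularity at r = 1 is only a property of the coordinates can be illustrated by computing the distance from the North Pole along a line of constant ϕ. With the metric (2.11) this is the integral of dr/√(1 − r^2), which equals arcsin r = θ and stays finite (π/2) at the equator. The sketch below checks this numerically with a simple midpoint rule; the function name and the number of steps are choices made for this example.

```python
# Sketch: proper distance from the pole in the r coordinate of Eq. (2.11),
# for a unit sphere and constant phi, compared with arcsin(r).
import math

def proper_distance_from_pole(r, n=100000):
    """Midpoint-rule integral of dr' / sqrt(1 - r'^2) from 0 to r (r < 1)."""
    h = r / n
    total = 0.0
    for k in range(n):
        r_mid = (k + 0.5) * h
        total += h / math.sqrt(1.0 - r_mid**2)
    return total

print(proper_distance_from_pole(0.5), math.asin(0.5))
print(proper_distance_from_pole(0.999), math.asin(0.999))
print(math.asin(1.0))  # distance to the equator: finite, equal to pi/2
```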

2.3 4D flat spacetime


The coordinates of the four-dimensional spacetime are $(x^0, x^1, x^2, x^3)$, where $x^0 = t$ is
a time coordinate. Some examples are “Cartesian” (t, x, y, z) and spherical (t, r, θ, ϕ)
coordinates. We use Greek indices to denote an arbitrary spacetime coordinate, $x^\mu$,
where µ can have any of the values 0, 1, 2, 3. Latin indices are used to denote space
coordinates, $x^i$, where i can have any of the values 1, 2, 3.
The metric of the Minkowski space of special relativity is

\[ ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 , \tag{2.12} \]

in Cartesian coordinates. In spherical coordinates it is

\[ ds^2 = -dt^2 + dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\varphi^2 . \tag{2.13} \]

The fact that time appears in the metric with a different sign is responsible for
the special geometric features of Minkowski space. (We assume that the reader is
familiar with special relativity, and won’t go into details.) There are three kinds of
distance intervals,

Figure 5: The light cone.

• timelike, $ds^2 < 0$

• lightlike, $ds^2 = 0$

• spacelike, $ds^2 > 0$.

The lightlike directions form the observer’s future and past light cones. Light
moves along the light cone, so everything we see lies on our past light cone. To see
us as we are now, the observer has to lie on our future light cone. As we move in
time along our world line, we drag our light cones with us so that they sweep over
the spacetime. The motion of a massive body is always timelike, and the motion of
massless particles is always lightlike.
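The classification of intervals can be sketched directly from the Minkowski metric (2.12) in Cartesian coordinates, with c = 1. The function names and the numerical tolerance are choices made for this illustration.

```python
# Sketch: classify a small displacement (dt, dx, dy, dz) in Minkowski space
# by the sign of ds^2 = -dt^2 + dx^2 + dy^2 + dz^2.

def interval_squared(dt, dx, dy, dz):
    """ds^2 from the Minkowski metric in Cartesian coordinates, c = 1."""
    return -dt**2 + dx**2 + dy**2 + dz**2

def classify_interval(dt, dx, dy, dz, tol=1e-12):
    """Return 'timelike', 'lightlike' or 'spacelike' (tol is an assumption)."""
    s2 = interval_squared(dt, dx, dy, dz)
    if s2 < -tol:
        return "timelike"
    if s2 > tol:
        return "spacelike"
    return "lightlike"

print(classify_interval(1.0, 0.5, 0.0, 0.0))  # slower than light
print(classify_interval(1.0, 1.0, 0.0, 0.0))  # along the light cone
print(classify_interval(1.0, 2.0, 0.0, 0.0))  # outside the light cone
```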

2.4 Curved spacetime


These features of the Minkowski space are inherited by the spacetime of general
relativity. However, spacetime is now curved, whereas Minkowski spacetime is flat.
(Recall that when we say space is flat, we mean it has Euclidean geometry; when
we say spacetime is flat, we mean it has Minkowski geometry.) The (proper) length
of a spacelike curve is $\Delta s \equiv \int ds$. Light moves along lightlike world lines, $ds^2 = 0$,
massive objects along timelike world lines, $ds^2 < 0$. The time measured by a
clock carried by the object, the proper time, is $\Delta\tau = \int d\tau$, where $d\tau \equiv \sqrt{-ds^2}$, so
$d\tau^2 = -ds^2 > 0$. The proper time τ is a natural parameter for the world line, $x^\mu(\tau)$.
The four-velocity of an object is defined as
\[
u^\mu = \frac{dx^\mu}{d\tau} . \tag{2.14}
\]

The zeroth component of the 4-velocity, $u^0 = dx^0/d\tau = dt/d\tau$, relates the proper time
τ to the coordinate time t, and the other components of the 4-velocity, $u^i = dx^i/d\tau$,
to the coordinate velocity $v^i \equiv dx^i/dt = u^i/u^0$. To convert this coordinate velocity

Figure 6: Two coordinate systems with different time slicings.

into a “physical” velocity (with respect to the coordinate system), we still need to
use the metric, see Eq. (2.20).
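As a concrete case, consider flat Minkowski space in Cartesian coordinates, where dτ^2 = −ds^2 = dt^2 (1 − v^2), so that u^0 = dt/dτ = 1/√(1 − v^2) and u^i = u^0 v^i. The sketch below builds the four-velocity from a coordinate velocity and checks that its Minkowski norm is −1. It covers only this flat-space case, with c = 1; the function names are hypothetical.

```python
# Sketch: four-velocity in Minkowski space (Cartesian coordinates, c = 1)
# from the coordinate velocity v^i = dx^i/dt.
import math

def four_velocity(v):
    """Return [u^0, u^1, u^2, u^3] for a 3-component coordinate velocity v."""
    speed_squared = sum(vi * vi for vi in v)
    if speed_squared >= 1.0:
        raise ValueError("a massive body must move slower than light")
    u0 = 1.0 / math.sqrt(1.0 - speed_squared)  # dt/dtau
    return [u0] + [u0 * vi for vi in v]

def minkowski_norm(u):
    """eta_{mu nu} u^mu u^nu with eta = diag(-1, 1, 1, 1)."""
    return -u[0]**2 + sum(ui * ui for ui in u[1:])

u = four_velocity([0.6, 0.0, 0.0])
print(u, minkowski_norm(u), u[1] / u[0])  # norm is -1; u^1/u^0 recovers v
```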
In an orthogonal coordinate system the coordinate lines are everywhere orthog-
onal to each other. The metric is then diagonal, meaning that it contains no cross-
terms like dxdy. We will only use orthogonal coordinate systems in this course.
The three-dimensional subspace, or hypersurface t = const. of spacetime is called
the space (or the universe) at time t, or a time slice of the spacetime. It is possible
to slice the same spacetime in many different ways, i.e. to make different choices of
the time coordinate t.

2.5 Vectors, tensors, and the volume element


The metric $g_{\mu\nu}$ of spacetime is related to the distance interval by
\[
ds^2 = \sum_{\mu=0}^{3} \sum_{\nu=0}^{3} g_{\mu\nu} dx^\mu dx^\nu \equiv g_{\mu\nu} dx^\mu dx^\nu . \tag{2.15}
\]

We introduce the Einstein summation rule: we always sum over repeated indices,
even if we don’t bother to write down the summation sign $\sum$. This also applies
to Latin indices, $g_{ij} dx^i dx^j \equiv \sum_{i=1}^{3} \sum_{j=1}^{3} g_{ij} dx^i dx^j$. The objects $g_{\mu\nu}$ are the
components of the metric tensor. They are usually taken to be dimensionless, but
sometimes (particularly in the case of angular coordinates) it is more useful to keep
the coordinates dimensionless and put the dimension in the metric. The components
of the metric tensor form a symmetric 4 × 4 matrix.
In the case of Minkowski space, the metric tensor in Cartesian coordinates is
called $\eta_{\mu\nu} \equiv \mathrm{diag}(-1, 1, 1, 1)$. In matrix notation we have for Minkowski space
$$ g_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.16} $$
in Cartesian coordinates, and
$$ g_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & r^2 & 0 \\ 0 & 0 & 0 & r^2 \sin^2\theta \end{pmatrix} \tag{2.17} $$
in spherical coordinates.
As another example, the metric tensor for a sphere (discussed above as an ex-
ample of a curved 2D space) has the components
$$ [g_{ij}] = \begin{pmatrix} r_0^2 & 0 \\ 0 & r_0^2 \sin^2\theta \end{pmatrix} . \tag{2.18} $$

The vectors that occur naturally in relativity are four-vectors, with four compo-
nents, as with the four-velocity discussed above. We will use the short term “vector”
to refer both to three-vectors and four-vectors, as it should be obvious from the con-
text which one we mean. As in three-dimensional flat geometry, the values of the
components depend on the basis used. For example, if we move along the coordinate
$x^1$ so that it changes by $dx^1$, the distance travelled is $ds = \sqrt{g_{11}\, dx^1 dx^1} = \sqrt{g_{11}}\, dx^1$.
Similarly, the components of a vector do not give the physical magnitude of the
quantity. In the case when the metric is diagonal, we just multiply by the square root
of the absolute value of the relevant metric component to get the physical magnitude,

$$ w^{\hat\alpha} \equiv \sqrt{|g_{\alpha\alpha}|}\, w^\alpha , \tag{2.19} $$

where wα is the component of a vector in the basis where the metric is gαβ , and wα̂
is the correctly normalised physical magnitude of the vector. (In the above, there is
no summation over α.)
For example, the physical velocity of an object is³

$$ v^{\hat i} = \frac{\sqrt{g_{ii}}\, dx^i}{\sqrt{|g_{00}|}\, dx^0} , \tag{2.20} $$

and the spatial components are always smaller than one.


The volume of a region of space (given by some range in the spatial coordinates
$x^1$, $x^2$, $x^3$) is given by
$$ V = \int_V dV = \int_V \sqrt{\det[g_{ij}]}\, dx^1 dx^2 dx^3 , \tag{2.21} $$
where $dV \equiv \sqrt{\det[g_{ij}]}\, dx^1 dx^2 dx^3$ is the volume element. Here $\det[g_{ij}]$ is the
determinant of the 3 × 3 submatrix of the metric tensor components corresponding to the
spatial coordinates. For an orthogonal coordinate system, the volume element is
$$ dV = \sqrt{g_{11}}\, dx^1 \sqrt{g_{22}}\, dx^2 \sqrt{g_{33}}\, dx^3 . \tag{2.22} $$

Similarly, the surface area of a two-dimensional spatial region S is
$$ S = \int_S dS = \int_S \sqrt{\det[g_{ij}]}\, dx^1 dx^2 , \tag{2.23} $$
where $dS \equiv \sqrt{\det[g_{ij}]}\, dx^1 dx^2$ is the area element. Here $\det[g_{ij}]$ is the determinant of
the 2 × 2 submatrix of the metric tensor components corresponding to the surface
with constant $x^0$ and $x^3$. For an orthogonal coordinate system, we again have
$$ dS = \sqrt{g_{11}}\, dx^1 \sqrt{g_{22}}\, dx^2 . \tag{2.24} $$
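As a quick check of the area element, the following sketch integrates $\sqrt{\det[g_{ij}]}$ for the sphere metric (2.18) and compares the result with the familiar $4\pi r_0^2$. The function name `sphere_area` and the midpoint-rule integration are illustrative choices, not part of the notes.

```python
import math

def sphere_area(r0, n=2000):
    """Integrate dS = sqrt(det[g_ij]) dtheta dphi for the sphere metric (2.18)."""
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta            # midpoint rule in theta
        g_tt = r0 ** 2                         # g_{theta theta}
        g_pp = (r0 * math.sin(theta)) ** 2     # g_{phi phi}
        total += math.sqrt(g_tt * g_pp) * dtheta
    # The integrand does not depend on phi, so the phi integral gives 2*pi.
    return 2.0 * math.pi * total

area = sphere_area(2.0)
print(area, 4.0 * math.pi * 2.0 ** 2)
```

The two printed numbers should agree to several digits, which illustrates that the metric alone fixes the area.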

³ When $g_{00} = -1$, this simplifies to $\sqrt{g_{ii}}\, dx^i/dt$.

The metric tensor is used for taking scalar products of four-vectors,

$$ w \cdot u \equiv g_{\alpha\beta}\, u^\alpha w^\beta . \tag{2.25} $$

The (squared) norm of a four-vector w is

$$ w \cdot w \equiv g_{\alpha\beta}\, w^\alpha w^\beta . \tag{2.26} $$

Exercise: Show that the norm of the four-velocity is always −1.
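A numerical spot check (not a proof) of this statement in Minkowski space can be sketched as follows. The helper names `four_velocity` and `dot`, and the sample velocity, are assumptions made for illustration; units are c = 1.

```python
import math

# Minkowski metric in Cartesian coordinates, Eq. (2.16).
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def four_velocity(v):
    """u^mu = dx^mu/dtau for a coordinate velocity v = (vx, vy, vz) with |v| < 1."""
    speed_sq = sum(vi * vi for vi in v)
    u0 = 1.0 / math.sqrt(1.0 - speed_sq)   # u^0 = dt/dtau
    return [u0] + [u0 * vi for vi in v]     # u^i = u^0 v^i

def dot(metric, a, b):
    """Scalar product g_{alpha beta} a^alpha b^beta, Eq. (2.25)."""
    return sum(metric[i][j] * a[i] * b[j] for i in range(4) for j in range(4))

u = four_velocity((0.3, 0.4, 0.5))
print(dot(ETA, u, u))   # close to -1, up to rounding
```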

2.6 Contravariant and covariant components


(This subsection is not really needed for the course, but it may help to clarify things for
those who wonder why we sometimes write indices up and sometimes down, and what the
difference is.)
Sometimes the index is written as a subscript, sometimes as a superscript. We will not
be doing index gymnastics in the course, but for completeness’ sake, let us say a few words
about this. The component $w^\alpha$ of a four-vector is called a contravariant component. The
corresponding covariant component is defined as

$$ w_\alpha \equiv g_{\alpha\beta}\, w^\beta . \tag{2.27} $$

The norm is now simply
$$ w \cdot w = w_\alpha w^\alpha . \tag{2.28} $$
In particular, for the 4-velocity we always have
$$ u_\mu u^\mu = g_{\mu\nu}\, u^\mu u^\nu = \frac{ds^2}{d\tau^2} = -1 . \tag{2.29} $$
In Minkowski space written in Cartesian coordinates, the only difference is in the sign
of the 0-component, but in curved spacetime (or in curved coordinates), the covariant and
contravariant vectors can be quite different.
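We defined the metric tensor through its covariant components (Eq. 2.15). We now
define the corresponding contravariant components $g^{\alpha\beta}$ as the inverse matrix of the matrix
$[g_{\alpha\beta}]$,
$$ g_{\alpha\beta}\, g^{\beta\gamma} = \delta_\alpha^\gamma . \tag{2.30} $$
Now
$$ g^{\alpha\beta} w_\beta = g^{\alpha\beta} g_{\beta\gamma} w^\gamma = \delta^\alpha_\gamma w^\gamma = w^\alpha . \tag{2.31} $$
The metric tensor can be used to lower and raise indices. For tensors we have
$$ \begin{aligned} A_\alpha{}^\beta &= g_{\alpha\gamma} A^{\gamma\beta} \\ A_{\alpha\beta} &= g_{\alpha\gamma}\, g_{\beta\delta} A^{\gamma\delta} \\ A^{\alpha\beta} &= g^{\alpha\gamma} g^{\beta\delta} A_{\gamma\delta} . \end{aligned} \tag{2.32} $$

Note that in general $A_\alpha{}^\beta \neq A^\beta{}_\alpha$ unless the tensor is symmetric.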


The symbols δαβ and ηαβ are not tensors, and the location of their indices carries no
meaning.
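The lowering and raising of indices can be illustrated with a small sketch for a diagonal metric, here the Minkowski metric in spherical coordinates (2.17) evaluated at one point. The point (r, θ) and the vector components are arbitrary illustrative values, and the shortcut of inverting the metric entry by entry is valid only because the metric is diagonal.

```python
import math

r, theta = 2.0, 0.5   # arbitrary point, chosen for illustration

# Diagonal entries of g_{alpha beta}, Eq. (2.17).
g_lower = [-1.0, 1.0, r ** 2, (r * math.sin(theta)) ** 2]
# For a diagonal metric, the inverse g^{alpha beta} has reciprocal diagonal entries.
g_upper = [1.0 / g for g in g_lower]

w_upper = [1.0, 0.2, 0.3, 0.4]                              # contravariant w^alpha
w_lower = [g * w for g, w in zip(g_lower, w_upper)]         # Eq. (2.27)
w_back = [g * w for g, w in zip(g_upper, w_lower)]          # Eq. (2.31)

norm = sum(wl * wu for wl, wu in zip(w_lower, w_upper))     # Eq. (2.28)
norm_direct = sum(g * w * w for g, w in zip(g_lower, w_upper))  # Eq. (2.26)
print(w_lower, w_back, norm, norm_direct)
```

Raising the lowered index returns the original components, and both ways of computing the norm agree.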

2.7 The Einstein equation


Given that the degrees of freedom of the spacetime are given by the metric and we
want to have equations of motion which are second order, they can only involve the
metric and its first and second derivatives,

$$ g_{\mu\nu} , \quad \partial g_{\mu\nu}/\partial x^\sigma , \quad \partial^2 g_{\mu\nu}/(\partial x^\sigma \partial x^\tau) , \tag{2.33} $$

as well as the matter degrees of freedom. The requirement of invariance under gen-
eral coordinate transformations restricts the equation of motion (in four dimensions)
to have the form
$$ G_{\mu\nu} = 8\pi G_N T_{\mu\nu} , \tag{2.34} $$
where Gµν is a unique tensor constructed from the metric and its first and second
derivatives and Tµν is the energy-momentum tensor, also known as the stress-energy
tensor. This equation specifies how the geometry of spacetime and its matter content
interact, in other words it is the law of gravity according to general relativity. We
will not discuss the Einstein tensor or this equation in much detail in this course.
In the first part of the course we only need it in the case of the homogeneous and
isotropic approximation, and in the second part we will look at small perturbations
around this. However, we have explained a little bit about general relativity to give
some idea of the mathematical structure which underlies the Friedmann-Robertson-
Walker models.
The energy-momentum tensor describes all properties of matter which affect the
spacetime, namely energy density, momentum density, pressure, and stress. For
frictionless continuous matter, a perfect fluid, it has the form

$$ T_{\mu\nu} = (\rho + p)\, u_\mu u_\nu + p\, g_{\mu\nu} , \tag{2.35} $$

where ρ is the energy density and p is the pressure measured by an observer moving
with four-velocity $u^\mu$ (such an observer is in the rest frame of the fluid). In cosmology
we can usually assume that the energy-momentum tensor has the perfect fluid form. $T_{00}$ is the
energy density in the coordinate frame, $T_{i0}$ gives the momentum density, which is
equal to the energy flux $T_{0i}$, and $T_{ij}$ gives the flux of momentum i-component in
j-direction.
In Newton’s theory the source of gravity is mass, in the case of continuous matter,
the mass density $\rho_m$. According to Newton, the gravitational field $\vec g_N$ is given by
the equation
$$ \nabla^2 \Phi = -\nabla \cdot \vec g_N = 4\pi G \rho_m , \tag{2.36} $$
where Φ is the gravitational potential. (We earlier discussed Newton’s law in the
form of the force law for point particles; this potential formulation for a continuous
medium is equivalent, for finite systems.) Comparing (2.36) to (2.34), the mass
density ρm has been replaced by Tµν , and ∇2 Φ has been replaced by the Einstein
tensor Gµν , which is a short way of writing a complicated expression built from gµν
and its first and second derivatives. Thus the gravitational potential is replaced
by the 10-component tensor gµν .
In the case of a weak gravitational field, the metric is close to the Minkowski
metric, and it can be written as

$$ ds^2 = -(1 + 2\Phi)\, dt^2 + (1 - 2\Phi)\, \delta_{ij}\, dx^i dx^j , \tag{2.37} $$

where |Φ| ≪ 1. The Einstein equation then reduces to

$$ \nabla^2 \Phi = 4\pi G (\rho + 3p) . \tag{2.38} $$

Comparing this to Eq. (2.36) we see that the mass density $\rho_m$ has been replaced
by ρ + 3p. For relativistic matter, where mass is not the dominant contribution
to the energy density and p can be of the same order of magnitude as ρ, this is
an important modification to the law of gravity. For nonrelativistic matter, where
the particle velocities are v ≪ 1, we have $p \ll \rho \simeq \rho_m$, and we get the Newtonian
equation.

Figure 7: Defining the angular diameter distance.

We said that general coordinate invariance together with the requirement of
second order equations of motion constrains the Einstein equation to have the par-
ticular form (2.34) in four spacetime dimensions. However, there is one caveat – we
can add a multiple of the metric tensor to the equation so that it becomes

$$ G_{\mu\nu} + \Lambda g_{\mu\nu} = 8\pi G T_{\mu\nu} . \tag{2.39} $$

The constant Λ is called the cosmological constant. The gravitational effect of a
positive cosmological constant is repulsive; we come back to this in the next chapter,
where we discuss homogeneous and isotropic cosmological models.
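To make the role of pressure in Eq. (2.38) concrete, the sketch below evaluates the source term ρ + 3p for three kinds of matter. The parametrisation p = wρ and the value w = −1 used to mimic a positive cosmological constant are assumptions brought in for illustration; they are not derived in this section, and applying the weak-field formula to them is only meant to show the sign of the effect.

```python
def gravitating_density(rho, w):
    """Source term rho + 3p of Eq. (2.38), assuming a pressure p = w * rho."""
    return rho * (1.0 + 3.0 * w)

dust = gravitating_density(1.0, 0.0)              # nonrelativistic matter, p << rho
radiation = gravitating_density(1.0, 1.0 / 3.0)   # relativistic matter, p = rho / 3
vacuum = gravitating_density(1.0, -1.0)           # assumed p = -rho (cosmological constant)

print(dust, radiation, vacuum)
```

Under these assumptions radiation gravitates twice as strongly as dust of the same energy density, while the negative source term in the last case corresponds to repulsion.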

2.8 Distance, luminosity, and magnitude


In general relativity, it is possible to define spacelike distances just as in special
relativity and Newtonian physics. One simply draws a spacelike line and integrates
$\sqrt{|ds^2|}$ along the line. However, in a space that evolves in time, it is impossible
to measure such distances, because they are defined only at one particular moment
in time. The observer necessarily moves forward in time, and can never travel in
a spacelike direction. (In other words, the space changes as the observer is going
about measuring it.) Even if we lived in a static universe where such measurements
would be possible in principle, they could not be done in practice for cosmology,
since we cannot move for cosmologically significant distances. In cosmology the
observationally relevant distances are those defined with respect to light. They are
not distances in space but in spacetime, specifically along lightlike directions. The
two main distances used in cosmology are the angular diameter distance and the
luminosity distance.
In Euclidean space, an object with proper size dS at distance d is seen at an
angle (when d ≫ dS)
$$ d\theta = \frac{dS}{d} . \tag{2.40} $$
In general relativity, we therefore define the angular diameter distance of an
object with proper size dS and angular size dθ as
$$ d_A \equiv \frac{dS}{d\theta} . \tag{2.41} $$
The reasoning of the Euclidean situation is here reversed. Objects do not look
smaller because they are further away, they are further away because they look
smaller. In the case of curved spacetime this can lead to behaviour at odds with in-
tuition from Euclidean geometry; we will encounter one example in the next section.
In order to determine the angular diameter distance, we need to know the proper
size of the object we are observing. In cosmology, this can be done reliably only in
a few cases, the most notable of which is the pattern of the anisotropy of the CMB,
which we will discuss in the second part of the course.
The luminosity distance is defined in a similar manner. In Euclidean space, if an
object radiates isotropically with absolute luminosity L (this is the radiated energy
per unit time you would measure next to the object), an observer at distance d sees
the flux (energy per unit time per unit area)

$$ F = \frac{L}{4\pi d^2} . \tag{2.42} $$
In general relativity, the luminosity distance $d_L$ is defined as
$$ d_L \equiv \sqrt{\frac{L}{4\pi F}} . \tag{2.43} $$
As with the angular diameter distance, objects in curved spacetime are further
away because they look fainter, not the other way around. (However, at least in
homogeneous and isotropic models, the luminosity distance behaves qualitatively as
expected from Euclidean intuition, unlike dA .)
In any spacetime, the two distances are related by $d_L = (1 + z)^2 d_A$, so there is
really only one independent observational cosmological distance measure.
In astronomy, luminosity is often expressed in terms of magnitude. This system
hails back to the ancient Greeks, who classified stars visible to the naked eye into six
classes according to their brightness. Magnitude in modern astronomy is defined so
that it roughly matches this ancient classification, but it is not restricted to positive
integers. The magnitude scale is logarithmic in such a way that a difference of 5
magnitudes corresponds to a factor of 100 in luminosity⁴. The absolute magnitude
M and the apparent magnitude m of an object are defined as
$$ M \equiv -2.5 \log_{10} \frac{L}{L_0} , \qquad m \equiv -2.5 \log_{10} \frac{F}{F_0} , \tag{2.44} $$
where L0 and F0 are reference luminosity and flux. There are actually different
magnitude scales corresponding to different regions of the electromagnetic spectrum,
with different reference luminosities. The bolometric magnitude and luminosity refer
to the power or flux integrated over all frequencies, whereas the visual magnitude
and luminosity refer only to the visible light. In the bolometric magnitude scale
$L_0 = 3.0 \times 10^{28}$ W. The reference flux $F_0$ for the apparent scale is chosen in relation
to the absolute scale so that a star whose distance is d = 10 pc has m = M. From
this, (2.43) and (2.44), it follows that the difference between apparent and absolute
magnitudes is related to the luminosity distance as

$$ m - M = -5 + 5 \log_{10} (d_L/\mathrm{pc}) . \tag{2.45} $$
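Equation (2.45) is easy to evaluate in both directions. The following sketch uses hypothetical helper names (`distance_modulus`, `luminosity_distance_pc`) and takes the luminosity distance in parsecs.

```python
import math

def distance_modulus(d_L_pc):
    """m - M from Eq. (2.45), for a luminosity distance given in parsecs."""
    return -5.0 + 5.0 * math.log10(d_L_pc)

def luminosity_distance_pc(m, M):
    """Invert Eq. (2.45) to get the luminosity distance in parsecs."""
    return 10.0 ** ((m - M + 5.0) / 5.0)

print(distance_modulus(10.0))            # a source at 10 pc has m = M
print(luminosity_distance_pc(5.0, 0.0))  # m - M = 5 corresponds to 100 pc
```

Each additional 5 magnitudes in m − M corresponds to a factor of 10 in luminosity distance, consistent with a factor of 100 in flux.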


⁴ So a difference of 1 magnitude corresponds to a factor $100^{1/5} \approx 2.512$ in luminosity.

References
[1] C.M. Will, The confrontation between general relativity and experiment, Living
Rev. Rel. 9 (2006) 3, http://www.livingreviews.org/lrr-2006-3 [arXiv:gr-qc/0510072]

[2] C.W. Misner, K.S. Thorne, J.A. Wheeler, Gravitation (Freeman 1973)

3 The Friedmann-Robertson-Walker model


3.1 Kinematics
3.1.1 The Robertson-Walker metric
In cosmology a common approximation is that there is a slicing of spacetime into
spacelike slices which are exactly homogeneous and isotropic. This means that there
exists a coordinate system in which the t = const. hypersurfaces are homogeneous
and isotropic. The proper time t which labels the hypersurfaces is called the cosmic
time.
There is evidence that the universe is indeed statistically homogeneous (all places
look the same) and isotropic (all directions look the same) on scales larger than
about 100 Mpc. This does not prove that the universe would be well described
by a model which is exactly homogeneous and isotropic, but it does motivate using
it as a first approximation. (We shall see that the approximation is in fact quite
good, and at early times it is excellent, as the universe was then more homogeneous
and isotropic than today.) In the first part of the course we only consider exactly
homogeneous and isotropic spacetimes, in the second part we look at perturbations
around homogeneity and isotropy.
Since the spacetime is spatially homogeneous and isotropic, its curvature is the
same at all points in space, but can vary in time. It can be shown that the metric
can be written (by a suitable choice of the coordinates) in the form

$$ ds^2 = -dt^2 + a^2(t) \left[ \frac{dr^2}{1 - Kr^2} + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2 \right] . \tag{3.1} $$

An alternative form, in Cartesian as opposed to spherical coordinates, is


$$ ds^2 = -dt^2 + a^2(t) \frac{1}{\left( 1 + \frac{K}{4} r^2 \right)^2}\, \delta_{ij}\, dx^i dx^j . \tag{3.2} $$

In either form, this is called the Robertson–Walker (RW) metric, sometimes the
Friedmann–Robertson–Walker (FRW) metric or the Friedmann–Lemaître–Robertson–Walker
(FLRW) metric¹. Note that neither form of the metric has the same amount
of symmetry as the spacetime itself: the metrics are isotropic, but not homogeneous.
The full symmetry of the spacetime is usually not apparent in the metric itself, even
though all physical quantities calculated from the metric display the symmetry. The
time coordinate t is the cosmic time. Here K is a constant, related to curvature of
space (not spacetime) and a(t) is a function of time which tells how the universe
expands (or contracts). We call
$$ R_{\rm curv} \equiv a(t)/\sqrt{|K|} \tag{3.3} $$
the curvature radius of space (at time t). The metric (3.1) is given in spherical
coordinates. We see immediately that the 2-dimensional surfaces t = r = const have
the metric of a sphere with radius ar. The time-dependent factor a(t) is called the
scale factor. We will need the Einstein equation to solve for a(t). From the geometrical
point of view, it is just an arbitrary function of the time coordinate t.

¹ The most commonly used term is the FRW metric. However, some authors prefer to make
the distinction between the geometry (with the names Robertson and Walker attached) and the
equations of motion (endowed with the name Friedmann and sometimes also Lemaître).

We have the freedom to rescale the radial coordinate r. For example, we can
multiply all values of r by a factor of 2, if we also divide a by a factor of 2 and K
by a factor of 4. The geometry of the spacetime stays the same, the meaning of the
coordinate r has just changed: the point that had a given value of r has now twice
that value in the rescaled coordinate system. There are two common ways to use
the rescaling to make the notation easier. If K ≠ 0, we can rescale r to make K
equal to ±1. In this case K is usually denoted k, r is dimensionless, and
a(t) has the dimension of distance. The other way is to set the scale factor today
to unity², $a(t_0) \equiv a_0 = 1$. We will use this latter convention in the course. In this
case a(t) is dimensionless, and r and $K^{-1/2}$ have the dimension of distance.
If K = 0, the space part (t = const.) of the Robertson–Walker metric is flat.
The 3-metric (the space part of the full metric) is that of ordinary Euclidean space,
with the radial distance given by ar. The spacetime, however, is curved, since a(t)
depends on time, describing the expansion or contraction of space. It is often said
that the “universe is flat” in this case, though if the universe is understood as the
four-dimensional spacetime (as opposed to a spatial slice), “spatially flat” would be
more correct.
If K > 0, the coordinate system is singular at $r = r_K \equiv 1/\sqrt{K}$. (Remember
the discussion of the 2-sphere in the previous chapter.) With the substitution
(coordinate transformation) $r = r_K \sin\chi$ the metric becomes

$$ ds^2 = -dt^2 + a^2(t) K^{-1} \left[ d\chi^2 + \sin^2\chi\, (d\theta^2 + \sin^2\theta\, d\varphi^2) \right] . \tag{3.4} $$

The spatial part has the metric of a 3D hypersphere, a sphere with one extra dimension.
There is a new angular coordinate, χ, whose values range from 0 to π, just
like θ. The singularity at $r = 1/\sqrt{K}$ disappears in this coordinate transformation,
showing that it was just a coordinate artifact, not a physical singularity. The original
coordinates covered only half of the hypersphere, as the coordinate singularity
$r = 1/\sqrt{K}$ divides the hypersphere into two halves. The case K > 0 corresponds to
a closed universe, whose spatial curvature is positive.³ This is a finite universe, with
circumference $2\pi a r_K = 2\pi R_{\rm curv}$ and volume $2\pi^2 a^3 r_K^3 = 2\pi^2 R_{\rm curv}^3$, and $R_{\rm curv}$ is the
radius of the hypersphere.
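The quoted volume can be checked numerically from the metric (3.4): the spatial volume element is $R_{\rm curv}^3 \sin^2\chi \sin\theta\, d\chi\, d\theta\, d\varphi$. The sketch below, with the illustrative name `closed_universe_volume`, does the χ integral by the midpoint rule and uses the analytic value 4π for the angular part.

```python
import math

def closed_universe_volume(a, K, n=20000):
    """Volume of the K > 0 spatial slice, integrating the volume element of Eq. (3.4)."""
    R = a / math.sqrt(K)                      # curvature radius R_curv, Eq. (3.3)
    dchi = math.pi / n
    radial = 0.0
    for k in range(n):
        chi = (k + 0.5) * dchi
        radial += math.sin(chi) ** 2 * dchi   # integral of sin^2(chi) over [0, pi]
    # The integral of sin(theta) dtheta dphi over the full sphere is 4*pi.
    return 4.0 * math.pi * R ** 3 * radial

a, K = 1.0, 0.25                              # illustrative values, so R_curv = 2
print(closed_universe_volume(a, K), 2.0 * math.pi ** 2 * 2.0 ** 3)
```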
If K < 0, there is no coordinate singularity, and r ranges from 0 to ∞. The
substitution $r = |K|^{-1/2} \sinh\chi$ is, however, often useful in calculations. The case
K < 0 corresponds to an open universe, the spatial curvature of which is negative.
The metric is then

$$ ds^2 = -dt^2 + a^2(t) |K|^{-1} \left[ d\chi^2 + \sinh^2\chi\, (d\theta^2 + \sin^2\theta\, d\varphi^2) \right] . \tag{3.5} $$

This universe is infinite, just like in the case K = 0⁴.


² In some discussions of the early universe, it may be convenient to rescale a to unity at early
time instead.
³ Positive (negative) curvature means that the sum of angles of any triangle is greater than (less
than) 180° and that the area of a sphere with radius r is less than (greater than) $4\pi r^2$.
⁴ The terminology of open vs. closed refers to the simplest possible choice of topology for the
space. The K > 0 models are always finite, but it is also possible for the K = 0 and K < 0 models
to be finite in a compact space with non-trivial topology. We will not discuss this possibility. (The
mathematically oriented reader will note that the terms “open” and “closed” do not have the same
meaning in cosmology as they do in topology!)

Figure 1: The hypersphere. This figure is for K = k = 1. Consider the semicircle in the
figure. It corresponds to χ ranging from 0 to π. You get the (2-dimensional) sphere by
rotating this semicircle off the paper around the vertical axis by an angle ∆ϕ = 2π. You get
the (3-dimensional) hypersphere by rotating it twice, in two extra dimensions, by ∆θ = π
and by ∆ϕ = 2π, so that each point makes a sphere. Thus each point in the semicircle
corresponds to a full sphere with coordinates θ and ϕ, and radius $(a/\sqrt{K}) \sin\chi$.

The Robertson–Walker metric has two associated length scales, both of which
in general evolve in time. The first is the curvature radius, $R_{\rm curv} \equiv a |K|^{-1/2}$. The
second is the time scale of the expansion, the Hubble time, $t_H \equiv H^{-1}$, where $H \equiv \dot a/a$
is the Hubble parameter. The Hubble time multiplied by the speed of light, c = 1,
gives the Hubble length, $\ell_H \equiv c t_H = H^{-1}$. In the case K = 0 the universe is flat, so
the Hubble length is the only length scale.
The coordinates (t, r, θ, ϕ) of the Robertson–Walker metric are called comoving
coordinates. This means that the coordinate system follows the expansion of space,
so that the space coordinates of objects which do not move with respect to the
background remain the same. The homogeneity of the universe fixes a special frame
of reference, the cosmic rest frame given by the above coordinate system, so (unlike
in the empty Minkowski space) the concept “does not move” has a specific meaning
(as long as the energy density and pressure are not zero). The coordinate distance
between two such objects stays the same, but their physical, or proper, distance
grows with time as space expands. The time coordinate t, the cosmic time, gives
the time measured by such an observer, at (r, θ, ϕ) = const.
It can be shown that expansion causes the motion of an object in free fall to slow
down with respect to the comoving coordinate system. For nonrelativistic physical
velocities we have
$$ v(t_2) = \frac{a(t_1)}{a(t_2)}\, v(t_1) . \tag{3.6} $$
The velocity of a galaxy with respect to the background is called peculiar velocity⁵.

⁵ When there are perturbations, the split between the background and the perturbations is a
delicate issue, and statements like “moving with respect to the background” have to be phrased
carefully. We will discuss this a bit more in the second part of the course.

3.1.2 Conformal time


If we want the metric to remain isotropic, we cannot make coordinate transforma-
tions that mix the time coordinate with the spatial coordinates. However, just as
we redefined the radial coordinate above, we can make redefinitions which involve
only the time coordinate. In the comoving coordinates used above, the space part
of the coordinate system is expanding with the expansion of the universe. It is often
practical to change the time coordinate so that the “unit of time” (i.e. separation
of time coordinate surfaces) also increases in time. The conformal time η is defined
by
$$ d\eta \equiv \frac{dt}{a(t)} , \quad \text{or} \quad \eta = \int_0^t \frac{dt'}{a(t')} . \tag{3.7} $$
Exercise: Write the FRW metric in the coordinates (η, r, θ, ϕ) and (η, χ, θ, ϕ).
The latter form is especially nice for studying light propagation, where ds2 = 0.
(Since all directions are equivalent, we can choose the direction of propagation to
be radial, so that dθ = dϕ = 0.)
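As an illustration of definition (3.7), the sketch below integrates dt/a(t) numerically between two cosmic times. The power-law scale factor a = t^(2/3) is an assumed example (normalised so that a = 1 at t = 1) and is not derived at this point of the notes; for it the integral has the closed form 3(t_end^(1/3) − t_start^(1/3)).

```python
def conformal_time_interval(a, t_start, t_end, n=100000):
    """Midpoint-rule estimate of the integral of dt / a(t) from t_start to t_end, Eq. (3.7)."""
    dt = (t_end - t_start) / n
    return sum(dt / a(t_start + (k + 0.5) * dt) for k in range(n))

def a_power_law(t):
    """Assumed illustrative scale factor a(t) = t^(2/3), with a(1) = 1."""
    return t ** (2.0 / 3.0)

eta_numeric = conformal_time_interval(a_power_law, 0.1, 1.0)
eta_exact = 3.0 * (1.0 ** (1.0 / 3.0) - 0.1 ** (1.0 / 3.0))
print(eta_numeric, eta_exact)
```

The interval starts at t = 0.1 rather than t = 0 only to keep the simple midpoint rule accurate; the integrand is integrable at t = 0 for this choice of a(t).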

3.1.3 Redshift
As mentioned in chapter 1, redshift is one of the most important cosmological ob-
servables. Let us find how it is related to the spacetime geometry in the case of
the FRW metric. Consider galaxy A. Light leaves the galaxy at time t1 with wave-
length λ1 and arrives at galaxy O at time t2 with wavelength λ2 . It takes a time
δt1 = λ1 /c = 1/f1 to send one wavelength and a time δt2 = λ2 /c = 1/f2 to receive
one wavelength. Now follow the two light rays sent at times $t_1$ and $t_1 + \delta t_1$ (see
Figure 2). We can choose the coordinates such that the light path is radial (since all
directions are equivalent), θ and ϕ stay constant while t and r change. Light follows
lightlike geodesics for which
ds2 = 0 . (3.8)
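We thus have
$$ ds^2 = -dt^2 + a^2(t) \frac{dr^2}{1 - Kr^2} = 0 \tag{3.9} $$
$$ \Rightarrow \quad \frac{dt}{a(t)} = \frac{-dr}{\sqrt{1 - Kr^2}} . \tag{3.10} $$
Integrating this, we get for the first light ray,
$$ \int_{t_1}^{t_2} \frac{dt}{a(t)} = \int_0^{r_A} \frac{dr}{\sqrt{1 - Kr^2}} , \tag{3.11} $$
and for the second,
$$ \int_{t_1 + \delta t_1}^{t_2 + \delta t_2} \frac{dt}{a(t)} = \int_0^{r_A} \frac{dr}{\sqrt{1 - Kr^2}} . \tag{3.12} $$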
The right hand sides of the two equations are the same, since the sender and the
receiver have not moved (they stay at r = rA and r = 0). Thus
$$ 0 = \int_{t_1 + \delta t_1}^{t_2 + \delta t_2} \frac{dt}{a(t)} - \int_{t_1}^{t_2} \frac{dt}{a(t)} = \int_{t_2}^{t_2 + \delta t_2} \frac{dt}{a(t)} - \int_{t_1}^{t_1 + \delta t_1} \frac{dt}{a(t)} = \frac{\delta t_2}{a(t_2)} - \frac{\delta t_1}{a(t_1)} , \tag{3.13} $$
and the time to receive one wavelength is
$$ \delta t_2 = \frac{a(t_2)}{a(t_1)}\, \delta t_1 . \tag{3.14} $$

Figure 2: The two light rays to establish the redshift.

(This derivation is even simpler when using conformal time.)
As is clear from the derivation, this cosmological time dilation effect applies to
observing any event taking place in galaxy A. As we observe galaxy A, we see ev-
erything happening in “slow motion”, slowed down by the factor a(t2 )/a(t1 ), which
is the factor by which the universe has expanded since the light (or any electromag-
netic signal) left the galaxy. This effect can be observed e.g. in the light curves (flux
as a function of time) of supernovae.
For the redshift we have the result
$$ 1 + z \equiv \frac{\lambda_2}{\lambda_1} = \frac{\delta t_2}{\delta t_1} = \frac{a(t_2)}{a(t_1)} . \tag{3.15} $$
The result is simple: the wavelength expands with the universe. So the redshift tells
us how much smaller the universe was when the light left the galaxy.
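Relations (3.14) and (3.15) translate directly into two one-line functions. The names `redshift` and `observed_duration` and the numbers in the usage lines are illustrative assumptions.

```python
def redshift(a_emit, a_obs=1.0):
    """Redshift from Eq. (3.15), with the scale factor at observation defaulting to a(t0) = 1."""
    return a_obs / a_emit - 1.0

def observed_duration(dt_emit, z):
    """Cosmological time dilation, Eq. (3.14): an event lasting dt_emit is seen to last (1+z) dt_emit."""
    return (1.0 + z) * dt_emit

z = redshift(0.5)                    # light emitted when the universe was half its present size
print(z, observed_duration(20.0, z)) # a 20-day event at the source is observed over 40 days
```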

3.1.4 Age-redshift relation


If we see a source at redshift z, how old was the universe when the light left the
source? In the FRW universe we have
$$ dt = \frac{da}{\dot a} = \frac{da}{a} \frac{1}{H} = -\frac{dz}{1 + z} \frac{1}{H} , \tag{3.16} $$
where we have assumed that $\dot a \neq 0$, i.e. that a(t) is monotonic. The age of the
universe at redshift z is then
$$ t(z) = \int_0^t dt' = \int_z^\infty \frac{dz'}{1 + z'} \frac{1}{H(z')} , \tag{3.17} $$

where we have already used the knowledge that for realistic cosmological models,
the universe has a finite age and that at the beginning a = 0 (i.e. z = ∞), and
chosen the beginning of time as t = 0. Putting z = 0 gives the present age of the
universe,
$$ t_0 = \int_0^\infty \frac{dz'}{1 + z'} \frac{1}{H(z')} . \tag{3.18} $$

Subtracting the two tells us for how long the photons travelled to come to our
detectors:
$$ \Delta t \equiv t_0 - t(z) = \int_0^z \frac{dz'}{1 + z'} \frac{1}{H(z')} . \tag{3.19} $$

Note that whereas time t is a coordinate whose origin is in the past (in usual
cosmological models it is chosen to be at the beginning of the universe), the origin
of the redshift is set to be today. Conceptually, t is just like Newtonian time, so
it is simple to use. For example, if we discuss two different cosmological models, it
is straightforward to compare them when the universe has the same age (assuming
both have a beginning of time). In contrast, comparing them at the same redshift
doesn’t make sense unless you specify by which criteria you select ’today’ in the two
models. Observationally, however, it is difficult to determine the age of the universe,
while it is easy to measure the redshift. The redshift is useful when it is expressed in
relation to some quantities which are easier to measure than time, such as distances,
to which we now turn.
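The age integrals (3.17) to (3.19) can be evaluated numerically once H is specified. In the sketch below the integration variable is changed from z′ to a = 1/(1 + z′), so that the infinite upper limit becomes a = 0. The form H = H0 a^(−3/2), a spatially flat matter-only model, is an assumption used only as a test case because it gives the closed form t(z) = (2/3) H0⁻¹ (1 + z)^(−3/2); the function names are hypothetical.

```python
def age_at_redshift(z, hubble, n=100000):
    """t(z) of Eq. (3.17), computed as the integral of da / (a H(a)) from 0 to 1/(1+z)."""
    a_max = 1.0 / (1.0 + z)
    da = a_max / n
    total = 0.0
    for k in range(n):
        a = (k + 0.5) * da            # midpoint rule in the scale factor
        total += da / (a * hubble(a))
    return total

H0 = 1.0  # times are then measured in units of the Hubble time 1/H0

def hubble_matter(a):
    """Assumed H(a) = H0 * a^(-3/2): spatially flat, matter only (illustrative)."""
    return H0 * a ** -1.5

t0 = age_at_redshift(0.0, hubble_matter)       # Eq. (3.18)
t_z3 = age_at_redshift(3.0, hubble_matter)     # Eq. (3.17) at z = 3
lookback = t0 - t_z3                           # Eq. (3.19)
print(t0, t_z3, lookback)
```

In this assumed model light from z = 3 left the source when the universe was one eighth of its present age.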

3.1.5 Angular diameter distance


Almost all cosmological observations are made along the past lightcone, and impor-
tant observable quantities include, in addition to the redshift, the angular diameter
and luminosity of objects. We want to use the FRW model to relate these observable
quantities to parameters of the model, so that we can constrain the geometry of the
universe with the observations. (As we will soon discuss, we can then also constrain
the matter content of the universe.)
Suppose we have a set of standard rulers, objects that we know are all the same
small size ds, observed at different redshifts. Their observed angular sizes dθ(z) then
give us the angular diameter distance as dA (z) = ds/dθ(z), as discussed in chapter
2. This can then be compared to the theoretical dA (z) for the FRW universe to find
parameter values which give the best fit between observation and theory.
From the FRW metric, the proper distance corresponding to the angle dθ follows from
$ds^2 = a^2(t) r^2 d\theta^2 \Rightarrow ds = a(t) r\, d\theta$. We thus have
$$ d_A = a(t) r = \frac{1}{1 + z}\, r . \tag{3.20} $$
Now we have to relate the radial coordinate r to the observed redshift. As light
travels on null geodesics, we have
$$ ds^2 = -dt^2 + a(t)^2 \frac{dr^2}{1 - Kr^2} = 0 \quad \Rightarrow \quad dt = -a(t) \frac{dr}{\sqrt{1 - Kr^2}} . \tag{3.21} $$

Since we place the observer at the center, the radial coordinate for incoming light
rays decreases as time increases, hence the minus sign. Integrating, we obtain
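$$ \int_{t_1}^{t_0} \frac{dt}{a(t)} = \int_0^r \frac{dr}{\sqrt{1 - Kr^2}} = (-K)^{-1/2}\, \mathrm{arsinh}[(-K)^{1/2} r] = \begin{cases} K^{-1/2} \arcsin(K^{1/2} r) , & K > 0 \\ r , & K = 0 \\ |K|^{-1/2}\, \mathrm{arsinh}(|K|^{1/2} r) , & K < 0 . \end{cases} \tag{3.22} $$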

To facilitate handling all three cases simultaneously, we define the function



$$ S_K(x) \equiv \frac{1}{\sqrt{-K}} \sinh(\sqrt{-K}\, x) = \begin{cases} K^{-1/2} \sin(K^{1/2} x) , & K > 0 \\ x , & K = 0 \\ |K|^{-1/2} \sinh(|K|^{1/2} x) , & K < 0 . \end{cases} \tag{3.23} $$

In other words, the function $\frac{1}{\sqrt{-K}} \sinh(\sqrt{-K}\, x)$ is understood as the analytical
continuation when K is positive, and as the limit of small K when K = 0. The inverse
of this function is denoted $S_K^{-1}(x)$. Putting together (3.22) and (3.23), we have
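$$ \begin{aligned} r &= \frac{1}{\sqrt{-K}} \sinh\left( \sqrt{-K} \int_{t_1}^{t_0} \frac{dt}{a(t)} \right) \\ &= \frac{1}{\sqrt{-K}} \sinh\left( \sqrt{-K} \int_0^z \frac{dz'}{H(z')} \right) , \end{aligned} \tag{3.24} $$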

where we have on the second line used the relation (3.16).


Inserting (3.24) into (3.20), we finally obtain the angular diameter distance as a
function of redshift:
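$$ \begin{aligned} d_A(z) &= (1 + z)^{-1} S_K\left( \int_0^z \frac{dz'}{H(z')} \right) \\ &= (1 + z)^{-1} \frac{1}{\sqrt{-K}} \sinh\left( \sqrt{-K} \int_0^z \frac{dz'}{H(z')} \right) . \end{aligned} \tag{3.25} $$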

In the spatially flat case this reduces to


$$ d_A(z) = (1 + z)^{-1} \int_0^z \frac{dz'}{H(z')} . \tag{3.26} $$
This relation tells us how distance scales in the FRW universe change because
of the expansion of the universe. For a general FRW metric, the angular diameter
distance depends only on the redshift, the coordinate curvature radius $1/\sqrt{-K}$ and
the integral over the inverse Hubble parameter. Note that if the universe expanded
very fast in the past (which is the case in the real universe), the contribution to the
distance from early times rapidly becomes very small, because the length scales at
early times were much smaller than today.
Because of the factor $(1 + z)^{-1}$, the angular diameter distance is not necessarily
monotonic in redshift, i.e. it may be that the distance to objects decreases above
some redshift. This curious feature is present in realistic cosmological models, and
can occur even if the spatial geometry is Euclidean (i.e. K = 0). It is related to the
fact that the angular diameter distance is defined along a lightlike direction in the
non-Euclidean spacetime, not along a spatial slice (which may itself, of course, be
non-Euclidean).
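The non-monotonic behaviour can be seen in a short numerical sketch of Eq. (3.25). The Hubble law H(z) = H0 (1 + z)^(3/2), a spatially flat matter-only model, is an assumption chosen because the result is known in closed form, d_A = 2 H0⁻¹ [1 − (1 + z)^(−1/2)] / (1 + z), with a maximum at z = 1.25; the function names are illustrative.

```python
import math

def S_K(x, K):
    """The function S_K(x) of Eq. (3.23)."""
    if K > 0:
        return math.sin(math.sqrt(K) * x) / math.sqrt(K)
    if K < 0:
        return math.sinh(math.sqrt(-K) * x) / math.sqrt(-K)
    return x

def angular_diameter_distance(z, hubble, K=0.0, n=20000):
    """d_A(z) of Eq. (3.25), with a(t0) = 1 and c = 1; hubble(z) returns H(z)."""
    dz = z / n
    integral = sum(dz / hubble((k + 0.5) * dz) for k in range(n))  # midpoint rule
    return S_K(integral, K) / (1.0 + z)

def hubble_matter(z, H0=1.0):
    """Assumed H(z) = H0 (1+z)^(3/2): spatially flat, matter only (illustrative)."""
    return H0 * (1.0 + z) ** 1.5

for z in (0.5, 1.25, 5.0):
    print(z, angular_diameter_distance(z, hubble_matter))
```

In this assumed model the distance at z = 5 comes out smaller than at z = 1.25, so equally sized objects beyond the maximum subtend larger angles the farther away they are.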

3.1.6 Luminosity distance


Recall that if the absolute luminosity of an object is L and the measured flux is F,
its luminosity distance is
$$ d_L = \sqrt{\frac{L}{4\pi F}} . \tag{3.27} $$
Consider the situation in the FRW universe. The absolute luminosity can be
expressed as:
$$ L = \frac{\text{number of photons emitted}}{\text{time}} \times \text{their average energy} = \frac{N_\gamma E_{\rm em}}{t_{\rm em}} . \tag{3.28} $$
If the observer is at a coordinate distance r from the source, the photons have
at that distance spread over the area (recall that a(t0 ) = 1)

$$ A = 4\pi r^2 . \tag{3.29} $$

The flux can be expressed as:


$$ F = \frac{\text{number of photons observed}}{\text{area} \cdot \text{time}} \times \text{their average energy} = \frac{N_\gamma E_{\rm obs}}{t_{\rm obs} A} . \tag{3.30} $$
The number of photons $N_\gamma$ is conserved, but their energy is redshifted, $E_{\rm obs} = E_{\rm em}/(1 + z)$.
Also, if the source is at redshift z, it takes a factor 1 + z longer to
receive the photons ⇒ $t_{\rm obs} = (1 + z)\, t_{\rm em}$. Thus,

$$ F = \frac{N_\gamma E_{\rm obs}}{t_{\rm obs} A} = \frac{N_\gamma E_{\rm em}}{t_{\rm em}} \frac{1}{(1 + z)^2} \frac{1}{4\pi r^2} . \tag{3.31} $$

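We thus have
$$ \begin{aligned} d_L = \sqrt{\frac{L}{4\pi F}} &= (1 + z)\, r \\ &= (1 + z)^2 d_A(z) \\ &= (1 + z) \frac{1}{\sqrt{-K}} \sinh\left( \sqrt{-K} \int_0^z \frac{dz'}{H(z')} \right) , \end{aligned} \tag{3.32} $$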

where we have used (3.20) and (3.25). Compared to the angular diameter distance
$d_A(z)$, there are two extra factors of 1 + z. Half a factor comes from the redshift of
photon energy, half a factor from cosmological time dilation in receiving the emitted
photons, and one factor from the change in the area element. (As we mentioned in chapter
2, this relation holds for a general spacetime, not just for the FRW universe; however,
proving the general case is more complicated.)
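The bookkeeping of the factors of 1 + z can be verified with a few lines. The sketch below builds the flux from Eq. (3.31), recovers $d_L$ from Eq. (3.27) and compares with Eq. (3.20); the numerical values of L, r and z are arbitrary illustrative inputs, with a(t0) = 1.

```python
import math

def observed_flux(L, r, z):
    """Flux of Eq. (3.31): photon energy and arrival rate are each reduced by 1/(1+z)."""
    return L / (4.0 * math.pi * r ** 2 * (1.0 + z) ** 2)

def luminosity_distance(L, F):
    """d_L of Eq. (3.27)."""
    return math.sqrt(L / (4.0 * math.pi * F))

L, r, z = 1.0, 2.0, 3.0               # arbitrary illustrative values
d_L = luminosity_distance(L, observed_flux(L, r, z))
d_A = r / (1.0 + z)                   # Eq. (3.20)
print(d_L, (1.0 + z) * r, (1.0 + z) ** 2 * d_A)
```

All three printed numbers coincide, as Eq. (3.32) requires.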

3.1.7 Proper distance


As discussed in chapter 2, the only cosmological distances we can measure are those
defined along lightlike curves. However, spacelike distances are still theoretically
interesting. In particular, the proper distance of an object is a useful quantity:
this is the size of an object (more generally, distance between two points) in the
rest frame of that object (more generally, at a surface of constant cosmic time). In
practice, objects do not move perfectly uniformly. However, deviations from the
mean flow are small, v < 10^{-3} (recall that c = 1, so 10^{-3} ≈ 300 km/s), so effects
like Lorentz contraction and time dilation are small.
Proper distance is defined as the physical distance measured on a slice of constant
time. If we consider the proper distance between galaxy O and galaxy A, we can
without loss of generality choose the direction between them to be radial and set
O to be at the origin r = 0 and A to be at the radial coordinate rA . The distance
interval is given by

$$ ds^2 = a(t)^2 \frac{dr^2}{1 - Kr^2} , \tag{3.33} $$
so the proper distance is
$$ \begin{aligned}
d_P &= \int_0^A ds \\
    &= a(t) \int_0^{r_A} \frac{dr}{\sqrt{1 - Kr^2}} \\
    &= a(t) S_K^{-1}(r) .
\end{aligned} \tag{3.34} $$

The distance between two points which are fixed in the comoving coordinates
grows proportionally to the scale factor as the universe expands, like we would
expect.
In cosmology, it is common to use the comoving distance, which just means
the physical distance to redshift z rescaled by the ratio of the scale factor
now to the scale factor then. So if we have some distance measure d(z), the corresponding
comoving distance, denoted d^c(z), is d^c(z) = (1 + z)d(z). The idea is that it is easier
to compare objects from different eras if we discuss them in terms of the distance
they would now span. For example, the sound horizon of the photon-baryon plasma
at the time of last scattering when the universe was about 380 000 years old is
rs ≈ 0.13 Mpc, whereas the comoving sound horizon is (1 + z∗ )rs ≈ 140 Mpc,
where 1 + z∗ ≈ 1090 is the redshift of the last scattering surface. This is especially
convenient for the comoving proper distance, which remains constant in time,

$$ d_P^c = (1+z) d_P = S_K^{-1}(r) . \tag{3.35} $$

The relation (3.34) shows how the coordinate r is related to the physical distance
dP ,

$$ r = S_K(d_P/a) = S_K(d_P^c) . \tag{3.36} $$

The radial coordinate r does not give the physical distance, but nevertheless has
a clear physical interpretation. The physical distance to an object at coordinate r
is dP, the length of the circle with physical radius dP (t, r) is 2πa(t)r and its surface
area is 4πa(t)^2 r^2, as can be immediately verified from the FRW metric (3.1).

Figure 3: Calculation of the proper distance.

The functions SK and SK^{-1} convert between two natural length measures of a
FRW universe: the proper distance measured along the radial line (i.e. the proper
radius) and the area distance measured along the surface of a sphere. The fact
that these quantities do not agree is a reflection of the fact that the space is non-
Euclidean. In the flat case with K = 0, we have simply SK (x) = x, as the space is
Euclidean. In this case the only relativistic effect is the stretching of space.
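
The conversion between the proper radius and the area distance can be sketched in a few lines of Python. This is only an illustration: the function names are ours, the K > 0 branch is obtained by doing the integral in (3.34) directly, and the K < 0 branch is the sinh form that appears in (3.32).

```python
import math

def S_K(x, K):
    """Coordinate radius r corresponding to comoving proper distance x."""
    if K > 0:
        return math.sin(math.sqrt(K) * x) / math.sqrt(K)
    if K < 0:
        return math.sinh(math.sqrt(-K) * x) / math.sqrt(-K)
    return x

def S_K_inv(r, K):
    """Comoving proper distance: the integral of dr / sqrt(1 - K r^2) from 0 to r."""
    if K > 0:
        return math.asin(math.sqrt(K) * r) / math.sqrt(K)
    if K < 0:
        return math.asinh(math.sqrt(-K) * r) / math.sqrt(-K)
    return r

def circumference_over_2pi_radius(r, K):
    # Ratio of the circumference 2 pi a r to 2 pi times the proper radius
    # a S_K^{-1}(r); the scale factor cancels.
    return r / S_K_inv(r, K)
```

The ratio returned by the last function is 1 for K = 0, smaller than 1 for K > 0 and larger than 1 for K < 0, which is one way of seeing that the space is non-Euclidean.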
In addition to the straightforward issue of proper distance as a function of time
as measured on the spacelike slice, we can ask the following slightly more involved
question: if we see (along a null geodesic) a galaxy at redshift z, what is the proper
distance (along the spacelike slice) to the galaxy today? Here we assume that the
galaxy is at rest in the comoving frame (i.e. we neglect peculiar velocities) and still
exists today. (In fact, we cannot know what has happened to the galaxy since the
light left it.)
From (3.20), (3.25) and (3.34) we have for the proper distance to an object that
emitted light at time t1, as measured at time t:
$$ \begin{aligned}
d_P(t_1, t) &= a(t) S_K^{-1}(d_A/a) \\
            &= a(t) \int_{t_1}^{t} \frac{dt'}{a(t')} \\
            &= (1+z)^{-1} \int_z^{z_1} \frac{dz'}{H(z')} .
\end{aligned} \tag{3.37} $$

Note that this result is independent of spatial curvature. So the proper distance to
redshift z today is
$$ d_P(z) = \int_0^z \frac{dz'}{H(z')} . \tag{3.38} $$
As the distance in (3.38) is defined today, it makes no difference whether it is
comoving or not. The longest distance (as measured along the spatial slice) from
which it has been possible to receive signals at time t is called the horizon distance
dhor at time t, or simply the horizon. (Sometimes the name horizon is also used for
the spherical shell with proper radius dhor centred on the observer.) We get it by
putting t1 = 0, or equivalently z = ∞ (as a(0) = 0) in (3.37),
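$$ \begin{aligned}
d_{\rm hor}(t) &= a(t) \int_0^t \frac{dt'}{a(t')} = (1+z)^{-1} \int_z^\infty \frac{dz'}{H(z')} \\
\Rightarrow \quad d_{\rm hor}^c(t) &= \int_0^t \frac{dt'}{a(t')} = \int_z^\infty \frac{dz'}{H(z')} .
\end{aligned} \tag{3.39} $$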

We get the horizon distance today by putting t = t0 or z = 0 in the above.


There are actually a few different concepts in cosmology called the horizon. The
one given above is the particle horizon, and it indicates the maximum distance from
which we can in principle have received any information up to now. Another
horizon concept is the event horizon, which is related to how far the light can travel
in the future. (More precisely, the event horizon is the boundary of the region, if any,
from which the observer can never receive any signals, even infinitely far into the
future.) The Hubble distance H^{-1} is also often referred to as the horizon (especially
when one talks about subhorizon and superhorizon distance scales, as we will do
in the second part of the course). For realistic cosmological models, the particle
horizon and the Hubble distance are (in the late universe) almost the same: they
differ only by a factor of order unity.

3.1.8 The Hubble law


In chapter 2 we discussed the Hubble law: the redshift-distance relationship
is linear for small redshifts, z = H0 d. Given the different measures of distance (and
we can define new distance measures simply by multiplying dA with any power of
(1 + z)), the question arises: which distance appears in this relation?
The answer is that for small redshifts, all of the above distance measures agree.
From (3.25), (3.32) and (3.37) we get

$$ d_A \simeq d_L \simeq d_P \simeq H_0^{-1} z \tag{3.40} $$

for z ≪ 1. For redshifts that are not small, the relation between the distance and the
redshift is more complicated, as shown by (3.25), (3.32) and (3.37). We need to know
not just the present value H0 , but the function H(z) all the way to the redshift of
the source (in the case of the angular diameter distance and the luminosity distance,
we also need the spatial curvature). The function H(z) is determined by the matter
content according to the dynamics of general relativity, to which we now turn.

3.2 Dynamics
3.2.1 The Friedmann equations
The considerations thus far have been purely geometrical and kinematical. In order
to find how the scale factor a(t) evolves, we need to consider the equations of motion,
given by the Einstein equation. The Robertson-Walker metric of (3.1) has the
components
$$ g_{\mu\nu} = \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & \frac{a^2}{1 - Kr^2} & 0 & 0 \\
0 & 0 & a^2 r^2 & 0 \\
0 & 0 & 0 & a^2 r^2 \sin^2\theta
\end{pmatrix} . \tag{3.41} $$
Calculating the Einstein tensor from this metric gives

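$$ \begin{aligned}
G^0{}_0 &= -3 \frac{\dot a^2}{a^2} - 3 \frac{K}{a^2} && (3.42) \\
G^i{}_j &= -\left( 2 \frac{\ddot a}{a} + \frac{\dot a^2}{a^2} + \frac{K}{a^2} \right) \delta^i{}_j && (3.43) \\
G^0{}_i &= 0 . && (3.44)
\end{aligned} $$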

From the symmetry of the spacetime it follows that the energy-momentum tensor
has the perfect fluid form,
 
$$ T^\mu{}_\nu = \begin{pmatrix}
-\rho & 0 & 0 & 0 \\
0 & p & 0 & 0 \\
0 & 0 & p & 0 \\
0 & 0 & 0 & p
\end{pmatrix} , \tag{3.45} $$

where ρ is the energy density and p is the pressure. Homogeneity implies that they
only depend on time, ρ = ρ(t), p = p(t).
In general, the Einstein equation Gαβ = 8πGN Tαβ is a non-linear system of ten
partial differential equations. In the case of the FRW universe, it reduces to two
ordinary non-linear differential equations:

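$$ \begin{aligned}
3 \frac{\dot a^2}{a^2} + 3 \frac{K}{a^2} &= 8\pi G_N \rho && (3.46) \\
-2 \frac{\ddot a}{a} - \frac{\dot a^2}{a^2} - \frac{K}{a^2} &= 8\pi G_N p . && (3.47)
\end{aligned} $$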
This pair of equations can be rearranged as

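$$ \begin{aligned}
3 \frac{\dot a^2}{a^2} + 3 \frac{K}{a^2} &= 8\pi G_N \rho && (3.48) \\
3 \frac{\ddot a}{a} &= -4\pi G_N (\rho + 3p) . && (3.49)
\end{aligned} $$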
These are the Friedmann equations. (“Friedmann equation” in the singular refers
to (3.48).)
The general relativity version of energy and momentum conservation, energy-
momentum continuity, follows from the Einstein equation. In the present case this
becomes the energy continuity equation (sometimes this is considered to be one of
the Friedmann equations)

$$ \dot\rho = -3 \frac{\dot a}{a} (\rho + p) . \tag{3.50} $$
Since the symmetry of the situation forbids fluid flow in the spatial directions, the
equation corresponding to momentum conservation is satisfied identically. (Exer-
cise: Derive (3.50) from the Friedmann equations.)
In fact, (3.50) is not a conservation equation for the energy. Rather, it shows
how the energy density evolves as the universe expands. We can rewrite (3.50) as

$$ \begin{aligned}
p &= -\frac{1}{3H} \frac{1}{a^3} \frac{d(a^3 \rho)}{dt} \\
  &= -\frac{d(a^3 \rho)}{d(a^3)} .
\end{aligned} \tag{3.51} $$

If the pressure is zero, the energy contained in a volume remains constant as the
universe expands or contracts. If the pressure is positive, the total amount of energy
decreases with the expansion of the universe (and increases if the universe contracts).
If the pressure is negative, the opposite happens: the energy of an expanding universe
increases. We can compare (3.51) with the first law of thermodynamics,
$$ T dS = dU + p\,dV - \sum_i \mu_i \, dN_i , \tag{3.52} $$

where T is the temperature, S is the entropy, U is the internal energy, V is volume
and µi and Ni are chemical potential and particle number for particle species i. We
see that the energy density in a FRW universe changes like the energy density of
a gas which is expanding or contracting adiabatically and with constant particle
number. However, while pressure has a kinematical interpretation in the statistical
physics of a gas of particles, the quantity p appearing in (3.51) is more general. The
pressure of matter which consists of a gas of (almost) free particles is always positive,
but other forms of matter (such as coherent scalar fields or topological defects) can
have negative pressure. (We will come back to this soon.)

3.2.2 Critical density


The Hubble parameter H = H(t) gives the expansion rate of the universe. Its present
value H0 is the Hubble constant. The dimension of H is 1/time, or 1/distance. In the
time interval dt a distance gets stretched by a factor of 1 + Hdt (a distance L grows
with velocity HL). The Friedmann equation (3.48) connects the three quantities,
the density ρ, the space curvature K/a2 , and the expansion rate H of the universe,

$$ 3H^2 = 8\pi G_N \rho - 3 \frac{K}{a^2} . \tag{3.53} $$
Dividing by 3H^2, we have
$$ 1 = \frac{8\pi G_N \rho}{3H^2} - \frac{K}{(aH)^2} \equiv \Omega + \Omega_K , \tag{3.54} $$

where we have defined the density parameters


$$ \begin{aligned}
\Omega(t) &\equiv \frac{8\pi G_N \rho}{3H^2} \\
\Omega_K(t) &\equiv -\frac{K}{(aH)^2} .
\end{aligned} \tag{3.55} $$
Another often used quantity is the critical density defined as

$$ \rho_c(t) \equiv \frac{3H(t)^2}{8\pi G_N} . \tag{3.56} $$
The critical density is the energy density that a spatially flat universe that expands
with the rate H(t) would have. So the critical density changes as the universe evolves
and the Hubble parameter changes. Often in cosmology the word critical density is
used to refer just to the present value. We always use the subscript 0 when referring
to the present value:
$$ \rho_{c0} \equiv \rho_c(t_0) = \frac{3H_0^2}{8\pi G_N} , \tag{3.57} $$
so we have
$$ \Omega(t) \equiv \frac{\rho(t)}{\rho_c(t)} . \tag{3.58} $$
Positive curvature contributes to the Hubble rate with a negative sign and neg-
ative curvature with a positive sign, as (3.48) shows. In other words, if we measure
that the density of the universe is ρ and the critical density is ρc (i.e. the Hubble
parameter is H), we can make the following conclusion about the spatial curvature:

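$$ \begin{aligned}
\rho < \rho_c \;&\Leftrightarrow\; \Omega < 1 \;\Leftrightarrow\; K < 0 && (3.59) \\
\rho = \rho_c \;&\Leftrightarrow\; \Omega = 1 \;\Leftrightarrow\; K = 0 && (3.60) \\
\rho > \rho_c \;&\Leftrightarrow\; \Omega > 1 \;\Leftrightarrow\; K > 0 . && (3.61)
\end{aligned} $$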

Thus Ω = 1 implies that the universe is spatially flat, Ω < 1 implies that spatial
curvature is negative and Ω > 1 that spatial curvature is positive. The Friedmann
equation can be written as
$$ \Omega(t) = 1 + \frac{K}{a(t)^2 H(t)^2} = 1 + \left( \frac{\ell_H}{R_{\rm curv}} \right)^2 , \tag{3.62} $$
where ℓH is the Hubble length and Rcurv is the curvature radius. If Ω < 1 (or > 1)
at some instant of time, it will stay that way (since K is constant). And if Ω = 1, it
will stay constant, Ω = Ω0 = 1. Observations show that the density of the universe
today is close to critical, Ω0 ≈ 1.
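
To get a feeling for the numbers, the present critical density (3.57) can be evaluated in SI units. The sketch below is only illustrative: it uses the parametrisation H0 = 100h km/s/Mpc, rounded values of Newton's constant and the megaparsec, and expresses the result as a mass density (that is, it divides the energy density by c^2, which equals 1 in the units of these notes). The function names are ours.

```python
import math

G_NEWTON = 6.674e-11   # Newton's constant in m^3 kg^-1 s^-2 (rounded)
MPC_IN_M = 3.0857e22   # metres per megaparsec (rounded)

def hubble_constant_si(h):
    """H0 = 100 h km/s/Mpc converted to 1/s."""
    return 100.0 * h * 1.0e3 / MPC_IN_M

def critical_density_today(h):
    """rho_c0 = 3 H0^2 / (8 pi G_N) as a mass density in kg/m^3."""
    H0 = hubble_constant_si(h)
    return 3.0 * H0 ** 2 / (8.0 * math.pi * G_NEWTON)

if __name__ == "__main__":
    for h in (0.6, 0.7, 1.0):
        print(f"h = {h:.1f}   rho_c0 = {critical_density_today(h):.3e} kg/m^3")
```

The result scales as h^2 and is about 1.9 × 10^{-26} h^2 kg/m^3, corresponding to a few hydrogen atom masses per cubic metre.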

3.2.3 Matter components


In the two Friedmann equations (3.48) and (3.49), there are three unknowns, a(t),
ρ(t) and p(t). We can also consider a system of three equations, with (3.50) added
to the mix, but in that case only two are independent. The system is underdeter-
mined, reflecting the fact that different matter components affect the expansion rate
differently, and we need to specify which kind of matter there is in the universe. In
order to close the system, it is enough to give the relation between pressure and the
energy density: we can then solve for the energy density from (3.50) or (3.51) and
insert the solution to (3.48) and integrate.
The relation between the pressure and the energy density is called the equation
of state. In cosmology, this term refers specifically to the combination p/ρ. The
simplest equations of state are barotropic, which means that the pressure is a func-
tion of the energy density, p(ρ). (Scalar fields, for which the equation of state is
not barotropic, will be important in the second part of the course.) The simplest
possibilities are the following:
• Matter. The term “matter” refers to a form of matter for which the pressure
is zero, p = 0, or at least negligible, |p| ≪ ρ. Such a form of matter is also
called “dust”. (The name “dust” is more common in a pure general relativity
context.) This is the case for a gas of free non-relativistic particles, where
the energy density is dominated by the mass. The relation (3.51) shows that
d(ρa^3)/dt = 0, or ρ ∝ a^{-3}.

• Radiation. The term “radiation” refers to matter for which the pressure
is (exactly or very closely) 1/3 of the energy density, p = ρ/3. This is the
case for a gas of free ultrarelativistic particles, for which the energy density is
dominated by the kinetic energy (i.e. the momentum is much bigger than the
mass). In particular, this always holds for massless particles such as photons.
From (3.51), we get d(ρa^4)/dt = 0, in other words ρ ∝ a^{-4}.

• Vacuum energy. For vacuum energy the energy density does not change in
time, ρ = constant. From (3.50) it follows that the pressure is very negative,
p = −ρ. (This type of matter is, a bit misleadingly, also called the cosmological
constant; see section 3.2.4 below.) Thus a positive vacuum energy corresponds
to negative vacuum pressure. The total amount of energy increases in proportion
to the volume of space (because there is more space, and a constant
amount of energy per volume everywhere).
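
The three scaling laws above can be checked numerically. The following Python sketch integrates the continuity equation (3.50), rewritten as dρ/d(ln a) = −3(1 + w)ρ for a constant ratio w = p/ρ, with a standard fourth-order Runge-Kutta step. The function name, the step count and the symbol w are our choices for the illustration.

```python
import math

def evolve_density(w, a_end, rho_start=1.0, steps=1000):
    """Integrate d(rho)/d(ln a) = -3 (1 + w) rho from a = 1 to a = a_end (RK4)."""
    h = math.log(a_end) / steps          # step in ln a
    rho = rho_start

    def rhs(r):
        return -3.0 * (1.0 + w) * r

    for _ in range(steps):
        k1 = rhs(rho)
        k2 = rhs(rho + 0.5 * h * k1)
        k3 = rhs(rho + 0.5 * h * k2)
        k4 = rhs(rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return rho

if __name__ == "__main__":
    for name, w in (("matter", 0.0), ("radiation", 1.0 / 3.0), ("vacuum", -1.0)):
        print(f"{name:9s}  rho(a = 2) / rho(a = 1) = {evolve_density(w, 2.0):.6f}")
```

Doubling the scale factor dilutes matter by a factor of 8 and radiation by a factor of 16, while the vacuum energy density stays the same.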
The universe contains non-relativistic matter in the form of ordinary, baryonic
matter (i.e. atoms, ions and electrons) as well as (most probably) dark matter, which
is (practically) pressureless, weakly interacting and extremely cold. Dark matter is
usually thought to consist of a gas of a new heavy particle species. We discuss dark
matter in more detail in chapter 7, at the end of the first part of the course. There is
also radiation, most importantly in the form of the cosmic microwave background,
which is a remnant of the radiation that used to dominate the expansion of the
universe. In addition, there are neutrinos, which behaved like radiation in the early
universe but now behave like matter. This happens for all particles which are not
strictly massless: the kinetic energy falls with the expansion of the universe, so that
at some point the mass starts dominating the particle energy. In chapters 4 and 5
we discuss in detail how different particle species behave as radiation in the early
universe when it is very hot, but as the universe cools, the massive particles change
from ultrarelativistic (radiation) to nonrelativistic (matter). During the transition
period the pressure due to that particle species falls from p = ρ/3 to p ∼ 0. In this
chapter, we focus on the late universe, when it is sufficient to divide matter into dust
(p ≈ 0) and radiation (p ≈ ρ/3), without worrying about the transitions. (Neutrinos
may undergo the transition quite late –the neutrino masses are not precisely known–
but their contribution to the total energy budget is negligible at late times, so we
can skip this detail.)
We have mentioned that the present observational data cannot be explained in
terms of known particles (or hypothetical particles with similar properties), general
relativity and the FRW metric. One of the three assumptions –known forms of
matter, general relativity and the approximation of homogeneity and isotropy– is
then wrong. Sticking to the FRW metric and general relativity, the observations
indicate that the expansion of the universe has accelerated during the past few billion
years. From (3.49) we see that this requires (in the context of general relativity and
the FRW metric) an energy component with negative pressure, dark energy. It
is called dark since it has not been observed to emit or absorb light, and energy,
since the name “dark matter” was already taken (though “dark pressure” might be
more appropriate!). The simplest possibility for dark energy is just the cosmological
constant (vacuum energy), which generates repulsive gravity, leading to accelerated
expansion which fits the data in detail. Therefore we shall carry on our discussion
assuming three energy components: matter, radiation, and vacuum energy. We shall
later comment on how much current observations actually constrain the equation of
state of dark energy, if it is not just vacuum energy.
If the universe contains these three energy components, we can arrange (3.48)
into the form
$$ 3 \frac{\dot a^2}{a^2} = 8\pi G_N \rho_{r0} a^{-4} + 8\pi G_N \rho_{m0} a^{-3} - 3K a^{-2} + \Lambda , \tag{3.63} $$
where ρr0, ρm0, a0, K, and Λ are constants (see footnote 6). The four terms on the right hand side
are due to radiation, matter, spatial curvature, and vacuum energy, respectively. As
the universe expands (a grows), different components on the right hand side become
important at different times. The universe was first radiation-dominated up to about
50 000 years, then the expansion was dominated by matter until a few billion years
ago, when vacuum energy started to dominate. (The universe has apparently never
been in a state where the spatial curvature would have been the largest term.)
The radiation component is insignificant at present, and we can forget it in
(3.63), if we exclude the first few million years of the universe from discussion. In
the “inflationary scenario”, there was something resembling a very large vacuum
energy density in the very early universe (during the first small fraction of the
first second). So there may have been a very early “vacuum-dominated” era called
inflation – we will return to this in the second part of the course.
We thus divide the density into matter, radiation, and vacuum components ρ =
ρm + ρr + ρvac , and likewise for the density parameter, Ω = Ωm + Ωr + ΩΛ , where
Ωm ≡ ρm /ρc , Ωr ≡ ρr /ρc , and ΩΛ ≡ ρvac /ρc ≡ Λ/(3H^2). The density parameters Ωm ,
Ωr , and ΩΛ are functions of time (although ρvac is constant, ρc (t) is not). We have

Ω = Ωm + Ωr + ΩΛ . (3.64)

Even more so than in the case of the critical density, the symbols Ωm , Ωr , ΩΛ and
ΩK are often used to denote the present values of these quantities. In this course, to
avoid confusion, we always use the subscript 0 when referring to the present values,
Ωm0 ≡ Ωm (t0 ), Ωr0 ≡ Ωr (t0 ), ΩΛ0 ≡ ΩΛ (t0 ), ΩK0 ≡ ΩK (t0 ). The present radiation
density is relatively small, Ωr0 ∼ 10^{-4} (we will calculate the precise number in
chapter 5). So we usually write just

Ω0 = Ωm0 + ΩΛ0 . (3.65)

In addition to being small today, the radiation density is also known very accurately
from the temperature of the cosmic microwave background, and therefore Ωr0 is
not usually considered as a cosmological parameter (in the sense of an inaccurately
known number that we try to determine with observations). This simple FRW
cosmological model is thus defined by giving the present values of three cosmological
parameters, which we can take to be H0 = 100h km/s/Mpc, Ωm0, and ΩΛ0.

Footnote 6: We ignore transfer of energy between the components. Such transfer is important only in the
early universe, before the decoupling of the different particle species, or when particle species go
from being relativistic to non-relativistic. In chapters 4 and 5 we return to this issue in some detail.

It is often useful to define the “physical” or “reduced” density parameters where
we multiply away the dependence of the critical density (and thus the Ω parameters)
on the value of h: ωm ≡ Ωm0 h^2, ωr ≡ Ωr0 h^2, which are directly proportional to the
actual densities in kg/m^3. (The corresponding quantities ωΛ and ωK are not as
useful.) Note that the ω parameters are not defined as a function of time, they are
constants defined with respect to present-day density only!
Two models have been particularly important. The first is the Einstein-de Sitter
model, which contains only matter and is spatially flat, Ωm = 1, Ωr = ΩK = ΩΛ = 0.
This model (with radiation added at early times, and coupled to a specific spectrum
of perturbations around homogeneity and isotropy – we will discuss this in the
second part of the course) was known as the Standard CDM model from the 1980s
onwards. (The abbreviation CDM stands for cold dark matter, which we will discuss
in chapter 7.)
At the end of the 1990s SCDM was supplanted by the ΛCDM model, which is
identical except that it also contains vacuum energy (it is spatially flat). This model
is also known as the “concordance model” due to the fact that it is able to fit a
number of independent observations. Comparing to observations, the parameters of
the model turn out to be h ≈ 0.7, Ωm0 ≈ 0.3 and ΩΛ0 ≈ 0.7. The precise values
depend on the datasets one fits to and the assumptions one makes about them. (The
first two parameters H0 and Ωm0 can also be determined model-independently,
and the values agree with the parameter fits done in the context of the ΛCDM model.
Determining the vacuum energy density is more complicated; we discuss this at the
end of this chapter.)
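
The sequence of eras described in this section can be made quantitative with the scalings ρr ∝ a^{-4}, ρm ∝ a^{-3} and ρvac = constant. The Python sketch below is a rough illustration only: it takes the approximate values Ωm0 ≈ 0.3, ΩΛ0 ≈ 0.7 and Ωr0 ∼ 10^{-4} quoted above as inputs, and the condition for the onset of acceleration, ρm = 2ρvac, is our own reading of (3.49) with pm = 0, pvac = −ρvac and radiation neglected. The function names are ours.

```python
def transition_scale_factors(omega_r0, omega_m0, omega_l0):
    """Scale factors (with a = 1 today) at which the dominant component changes."""
    a_rad_mat = omega_r0 / omega_m0                        # rho_r = rho_m
    a_mat_vac = (omega_m0 / omega_l0) ** (1.0 / 3.0)       # rho_m = rho_vac
    a_accel = (omega_m0 / (2.0 * omega_l0)) ** (1.0 / 3.0) # rho_m = 2 rho_vac
    return a_rad_mat, a_mat_vac, a_accel

def redshift(a):
    """Redshift corresponding to scale factor a, 1 + z = 1/a."""
    return 1.0 / a - 1.0

if __name__ == "__main__":
    labels = ("matter-radiation equality", "matter-vacuum equality", "onset of acceleration")
    for label, a in zip(labels, transition_scale_factors(1.0e-4, 0.3, 0.7)):
        print(f"{label:26s} a = {a:.4g}   z = {redshift(a):.4g}")
```

With these rough inputs, matter-radiation equality falls at 1 + z ≈ 3000, vacuum energy overtakes matter at z ≈ 0.33 and the expansion starts to accelerate at z ≈ 0.67.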

3.2.4 Vacuum energy


Before proceeding into more details of the expansion history and the distance-
redshift relationship for different matter contents, let us say a few words about the
cosmological constant. It was originally introduced by Einstein because he thought
the universe should be static. A look at (3.49) shows that this requires matter with
negative pressure or a positive cosmological constant. Introducing a cosmological
constant makes it possible to balance the gravitational attraction of the energy den-
sity of matter against the repulsion due to a positive cosmological constant. This
model is called the Einstein static universe. (It is, in fact, unstable to small pertur-
bations and thus does not provide a viable model of the universe.)
While the cosmological constant is a geometrical term (a contribution to the
left-hand side of the Einstein equation), we can add an identical term to the matter
(the right-hand side of the Einstein equation), and this is called vacuum energy.
In quantum field theory, the fundamental physical objects are fields, particles
are just quanta of the field oscillations. Vacuum refers to the ground state of the
system, where fields have values which correspond to minimum energy. In quantum
field theory, there is no reason why this minimum energy should be zero. This energy
density is analogous to the zero-point energy of a harmonic oscillator in quantum
mechanics. The energy-momentum tensor of the vacuum has the form Tµν = −ρvac gµν . Thus
vacuum energy has exactly the same effect as a cosmological constant with the value

Λ = 8πGN ρvac . (3.66)

Vacuum energy is observationally indistinguishable from a cosmological constant,
though conceptually they are different, because the former is a new matter component,
and the latter is a modification of the law of gravity.
A problem with vacuum energy is that the expected scale of vacuum fluctuations
is huge, of the order of particle physics scales (perhaps the Planck scale of 10^{18}
GeV, and at least 100 GeV), but observations restrict it to a much smaller value –
if vacuum energy is responsible for acceleration at late times, the energy density is
of the scale (meV)^4. Equivalently, the value of the cosmological constant is of the
order (10^{-33} eV)^2. However, our present understanding of quantum theory does not
allow us to calculate what the value of the vacuum energy is, so there is no conflict
between theory and observation, just unmet expectations. Possibly there is some
unknown principle which sets the vacuum energy to be zero, or at least prevents it
from interacting gravitationally – or almost so. The cosmological constant problem
was considered to be one of the most important issues in cosmology and particle
physics already before the observation of late time acceleration.

3.2.5 The expansion law and the big bang


Let us now solve the Friedmann equation for the case where it is dominated by a
term with a constant equation of state, ω ≡ p/ρ = constant. From (3.50) we get

$$ \rho \propto a^{-3(1+\omega)} . \tag{3.67} $$

As far as the expansion history is concerned, spatial curvature is equivalent to a
fluid with the equation of state ω = −1/3 and a positive (negative) energy density
corresponding to negative (positive) spatial curvature, respectively. (However, the
spatial curvature also changes the relation between the expansion rate and the red-
shift, as we have discussed above; a fluid with the same equation of state of course
would have no such effect.) Inserting (3.67) into the Friedmann equation (3.48) and
putting K = 0, we get
$$ \frac{\dot a^2}{a^2} \propto a^{-3(1+\omega)} . \tag{3.68} $$
Integrating, we get (we assume that ω > −1; the vacuum energy case ω = −1 and
the case ω < −1 need to be treated separately)
$$ a \propto (t - t_i)^{\frac{2}{3(1+\omega)}} . \tag{3.69} $$

At a finite time into the past, the scale factor becomes zero; without loss of gener-
ality, we choose the origin of the time coordinate to be there, ti = 0. At this time
the energy density is correspondingly infinite, and the spacetime is also infinitely
curved. This singularity is called the big bang, and it is a general feature not only
of FRW models but of realistic cosmological models which include inhomogeneities.
Space and time do not continue beyond this event. However, at the big bang (or
more properly, as we come near its vicinity) general relativity does not apply any-
more, so we cannot make any definite statements about what happens very near the
beginning. Also, we cannot really expect matter to behave in this simple way in the
early universe; in the second part of the course when we discuss inflation we will see
one possibility of how the early universe can behave differently. (But inflation does
not save us from the cosmological singularity.)
In particular, we have the three cases
ω = 1/3     radiation-dominated             a ∝ t^{1/2}
ω = 0       matter-dominated                a ∝ t^{2/3}
ω = −1/3    curvature-dominated (K < 0)     a ∝ t
The cases K > 0 and vacuum energy have to be treated differently (this is left as
an exercise).
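
The power laws above are easy to verify directly. The short Python sketch below computes the exponent in (3.69) and checks that a = t^n satisfies (3.68), in the sense that H^2 a^{3(1+ω)} does not depend on t. It is only an illustration: the function names are ours, and the variable w in the code stands for the equation of state p/ρ written ω in the text.

```python
def expansion_exponent(w):
    """Exponent n in a ∝ t^n for K = 0 and constant w > -1, from (3.69)."""
    if w <= -1.0:
        raise ValueError("the power-law solution requires w > -1")
    return 2.0 / (3.0 * (1.0 + w))

def friedmann_residual(w, t):
    """H^2 * a^(3(1+w)) for a = t^n; (3.68) requires this to be independent of t."""
    n = expansion_exponent(w)
    a = t ** n
    H = n / t   # H = (da/dt)/a for a = t^n
    return H ** 2 * a ** (3.0 * (1.0 + w))

if __name__ == "__main__":
    for name, w in (("radiation", 1.0 / 3.0), ("matter", 0.0), ("curvature", -1.0 / 3.0)):
        print(f"{name:9s}  n = {expansion_exponent(w):.4f}")
```

For every allowed w the residual equals n^2 at all times, so the power law solves the equation up to the overall normalisation of a.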

3.2.6 Age of universe


Now that we have a parametrised form of the expansion function H(z), we can
return to the age of the universe discussed in section 3.1.4. The Friedmann equation
(3.48) reads

$$ 3H^2 = 8\pi G_N \rho_{r0} a^{-4} + 8\pi G_N \rho_{m0} a^{-3} - 3K a^{-2} + \Lambda . \tag{3.70} $$

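Dividing by 3H0^2 and taking the square root, we get

$$ \begin{aligned}
\frac{H}{H_0} &= \sqrt{\Omega_{r0} a^{-4} + \Omega_{m0} a^{-3} + \Omega_{K0} a^{-2} + \Omega_{\Lambda 0}} \\
              &= \sqrt{\Omega_{r0} (1+z)^4 + \Omega_{m0} (1+z)^3 + \Omega_{K0} (1+z)^2 + \Omega_{\Lambda 0}} .
\end{aligned} \tag{3.71} $$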

We will have much use for this convenient form of the Friedmann equation. Inserting
(3.71) into the relation (3.17) for the age, we find the time it takes for the universe
to expand from scale factor a1 to a2 , or from redshift z1 to z2 ,
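$$ \begin{aligned}
t_2 - t_1 &= \int_{z_2}^{z_1} \frac{dz'}{1+z'} \frac{1}{H(z')} \\
          &= H_0^{-1} \int_{z_2}^{z_1} \frac{dz'}{(1+z') \sqrt{\Omega_{r0} (1+z')^4 + \Omega_{m0} (1+z')^3 + \Omega_{K0} (1+z')^2 + \Omega_{\Lambda 0}}} \\
          &= H_0^{-1} \int_{\frac{1}{1+z_1}}^{\frac{1}{1+z_2}} \frac{da}{\sqrt{\Omega_{r0} a^{-2} + \Omega_{m0} a^{-1} + \Omega_{K0} + \Omega_{\Lambda 0} a^2}} ,
\end{aligned} \tag{3.72} $$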

where the last form is more convenient due to the cancellation of some factors of
1 + z. Recall that ΩK0 = 1 − Ωr0 − Ωm0 − ΩΛ0 ≡ 1 − Ω0 . The expression (3.72) is
integrable to an elementary function if two of the four terms under the root sign are
absent. From this we get the age of the universe t at redshift z as
$$ t(z) = H_0^{-1} \int_0^{\frac{1}{1+z}} \frac{da}{\sqrt{\Omega_{r0} a^{-2} + \Omega_{m0} a^{-1} + \Omega_{K0} + \Omega_{\Lambda 0} a^2}} . \tag{3.73} $$

This gives the function t(z), that is, t(a). Inverting this function gives us a(t), the
scale factor as a function of time. Note that a(t) is not necessarily an elementary
function, even if t(a) is. However, even in that case we can sometimes have a
parametric representation a(ψ), t(ψ) in terms of elementary functions.
For the present age of the universe we get


$$ t_0 = H_0^{-1} \int_0^1 \frac{da}{\sqrt{\Omega_{r0} a^{-2} + \Omega_{m0} a^{-1} + \Omega_{K0} + \Omega_{\Lambda 0} a^2}} . \tag{3.74} $$

If the Ωs are of order unity (and ΩΛ0 is not the only non-zero one), the value of
the integral is of order unity. So the age of the universe is of the order of the
Hubble time. In the real universe, Ωr0 ≈ 10^{-4}, so dropping the radiation term
causes negligible error (physically, this means that the radiation-dominated era is
relatively short).
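
When the integral cannot be done by hand, (3.73) is straightforward to evaluate numerically. The Python sketch below uses Simpson's rule after the substitution a = x^2, which removes the square-root behaviour of the integrand at a = 0. It is a minimal illustration rather than a production routine: the function name, the substitution and the number of steps are our choices, and the integrand is set to zero at a = 0, which is the correct limit as long as radiation, matter or curvature is present.

```python
import math

def age_in_hubble_times(omega_r0, omega_m0, omega_l0, a_max=1.0, n=2000):
    """H0 * t at scale factor a_max, from (3.73), by Simpson's rule (n even)."""
    omega_k0 = 1.0 - omega_r0 - omega_m0 - omega_l0

    def integrand(x):
        # With a = x^2 and da = 2 x dx, the integrand of (3.73) becomes
        # 2 x a / sqrt(Omega_r0 + Omega_m0 a + Omega_K0 a^2 + Omega_L0 a^4).
        if x == 0.0:
            return 0.0
        a = x * x
        e2 = omega_r0 + omega_m0 * a + omega_k0 * a ** 2 + omega_l0 * a ** 4
        return 2.0 * x * a / math.sqrt(e2)

    x_max = math.sqrt(a_max)
    h = x_max / n
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    print(f"matter only, flat:       H0 t0 = {age_in_hubble_times(0.0, 1.0, 0.0):.4f}")
    print(f"matter 0.3, vacuum 0.7:  H0 t0 = {age_in_hubble_times(0.0, 0.3, 0.7):.4f}")
```

For a spatially flat matter-only model this gives H0 t0 = 2/3, and for Ωm0 = 0.3, ΩΛ0 = 0.7 it gives roughly 0.96.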
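Example: Age of the open universe. Let us consider an open universe (K < 0 or
Ω0 < 1), without vacuum energy (ΩΛ = 0), and approximating Ωr ≈ 0. Integrating
(3.74) (e.g., with the substitution a = [Ωm0/(1 − Ωm0)] sinh^2(ψ/2)) gives the age of the open
universe as
$$ \begin{aligned}
t_0 &= H_0^{-1} \int_0^1 \frac{da}{\sqrt{1 - \Omega_{m0} + \Omega_{m0} a^{-1}}} \\
    &= H_0^{-1} \left[ \frac{1}{1 - \Omega_{m0}} - \frac{\Omega_{m0}}{2 (1 - \Omega_{m0})^{3/2}} \operatorname{arcosh}\left( \frac{2}{\Omega_{m0}} - 1 \right) \right] .
\end{aligned} \tag{3.75} $$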

A special case of the open universe is the completely empty universe, which is dom-
inated by the spatial curvature, with Ωm = ΩΛ = 0 and ΩK = 1. In this case we
obtain from (3.71) the result a = H0 t, and we have t0 = H0^{-1}. We thus get the
following table for the age of the universe:

Ωm0     ΩΛ0     H0 t0
0       0       1
0.1     0       0.90
0.3     0       0.81
0.5     0       0.75
1       0       2/3

The cases (Ωm > 1, ΩΛ = 0) and (Ω0 = Ωm + ΩΛ = 1, ΩΛ > 0) are left as
exercises. The more general case (ΩK ≠ 1, ΩΛ ≠ 0) leads to elliptic functions. The
results for H0 t0 are plotted in figure 4.
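
The intermediate rows of the table can be reproduced by evaluating (3.75) directly, as in the following short Python sketch. The function name is ours, and the formula is only applied for 0 < Ωm0 < 1, since the two end points of the table are limiting cases in which the expression takes an indeterminate form.

```python
import math

def open_universe_age(omega_m0):
    """H0 * t0 from (3.75), for 0 < omega_m0 < 1 and no vacuum energy or radiation."""
    if not 0.0 < omega_m0 < 1.0:
        raise ValueError("(3.75) is applied here only for 0 < omega_m0 < 1")
    one_minus = 1.0 - omega_m0
    return (1.0 / one_minus
            - omega_m0 / (2.0 * one_minus ** 1.5) * math.acosh(2.0 / omega_m0 - 1.0))

if __name__ == "__main__":
    for om in (0.1, 0.3, 0.5):
        print(f"Omega_m0 = {om:.1f}   H0 t0 = {open_universe_age(om):.2f}")
```

As Ωm0 approaches 1 the result tends to the Einstein-de Sitter value 2/3, and as Ωm0 approaches 0 it tends to the empty-universe value 1.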
The best model-independent estimates of the age of the universe (based on ages
of globular clusters, which are compact groups of stars in our galaxy) give a 95%
probability lower limit on the age of the universe of 11 Gyr, and a best-fit age of
about 13.4 Gyr. The Hubble time is H0^{-1} ≈ 9.8 h^{-1} Gyr, so we get H0 t0 ≳ 1.14h
as the lower limit, and H0 t0 ≈ 1.37h as the preferred value. So the age of the
universe implies that models with only matter and curvature need a small Hubble
parameter. For a spatially flat matter model, we would need h = 0.48, and an open
model with Ωm0 ≈ 0.3 (as indicated by observations) would need h ≈ 0.6. Recall
that measurements of H0 indicate h = 0.73 ± 0.04: the mean value gives H0 t0 ≈ 1.0.
So just from measurements of H0 and t0 we can conclude that models with spatial
curvature and matter have trouble fitting the observations. However, the strongest
evidence against a model with no vacuum energy (or other form of negative pressure
matter) comes from distance measurements.
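
The arithmetic behind these statements is simple enough to sketch in Python. The unit conversions below use rounded values for the megaparsec and the gigayear, and the function names are ours; the inputs H0 t0 = 2/3 and 0.81 are the values for the flat and the Ωm0 = 0.3 open matter models from the table above, and 13.4 Gyr is the best-fit age quoted in the text.

```python
MPC_IN_KM = 3.0857e19   # kilometres per megaparsec (rounded)
GYR_IN_S = 3.156e16     # seconds per gigayear (rounded)

def hubble_time_gyr(h):
    """1/H0 in Gyr for H0 = 100 h km/s/Mpc."""
    return MPC_IN_KM / (100.0 * h) / GYR_IN_S

def required_h(h0_t0, age_gyr):
    """Value of h for which a model with the given H0 * t0 has the given age."""
    return h0_t0 * hubble_time_gyr(1.0) / age_gyr

if __name__ == "__main__":
    print(f"Hubble time for h = 1:          {hubble_time_gyr(1.0):.2f} Gyr")
    print(f"flat matter model, 13.4 Gyr:    h = {required_h(2.0 / 3.0, 13.4):.2f}")
    print(f"open model with Omega_m0 = 0.3: h = {required_h(0.81, 13.4):.2f}")
```

Both matter-only models need a value of h well below the measured h = 0.73 ± 0.04 to reach the preferred age.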

Figure 4: The age of the universe as a function of Ωm0 and ΩΛ0. (Contour plot of the age in units of H0^{-1}, with Ωm0 from 0 to 2 on the horizontal axis and ΩΛ0 from −1 to 2 on the vertical axis. Marked regions and dividing lines: “no big bang”, “accelerating/decelerating”, “open/closed”, “recollapses eventually”. Fig. by E. Sihvola.)


3.2.7 Distances in the universe


Earlier, we derived the angular diameter distance, luminosity distance and proper
distance for a general FRW spacetime. We can now plug the parametrised expan-
sion rate H(z) into these expressions, like we did for the age of the universe. The
comoving proper distance to a comoving object seen at redshift z is, from (3.37) and
(3.71),
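\[
\begin{aligned}
d_P^c(z) &= \int_0^z \frac{dz'}{H(z')} \\
&= H_0^{-1} \int_0^z \frac{dz'}{\sqrt{\Omega_{r0}(1+z')^4 + \Omega_{m0}(1+z')^3 + \Omega_{K0}(1+z')^2 + \Omega_{\Lambda 0}}} \\
&= H_0^{-1} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{\Omega_{r0} + \Omega_{m0} a + \Omega_{K0} a^2 + \Omega_{\Lambda 0} a^4}} \\
&\approx H_0^{-1} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{\Omega_0 (a - a^2) - \Omega_{\Lambda 0} (a - a^4) + a^2}} \, ,
\end{aligned}
\tag{3.76}
\]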

where on the last line we have dropped the Ωr0 term which has negligible effect,
and used Ωm0 = Ω0 − ΩΛ0 . The proper distance depends on three independent
cosmological parameters, for which we have taken H0 , Ω0 and ΩΛ0 , and the distance
at a given redshift is proportional to the Hubble distance, H0^-1. If we give the
distance in units of H0^-1, then the distance depends only on the two remaining
parameters, Ω0 and ΩΛ0 .
If we increase Ω0 while keeping ΩΛ0 constant (meaning that we increase Ωm0 ),
the distance corresponding to a given redshift decreases. This is because the universe
has expanded faster in the past, so that there is less time between a given value of
the scale factor a = 1/(1 + z) and the present. The distance to the galaxy with
redshift z is shorter because photons have had less time to travel. Whereas if we
increase ΩΛ0 with a fixed Ω0 (meaning that we decrease Ωm0 ), we have the opposite
situation and the distance increases. In figure 5 we show the proper distance for
some parameter values.
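In the case ΩΛ0 = 0, we have
\[
d_P^c(z) = H_0^{-1} \frac{2}{\Omega_{m0}^2 (1+z)} \left[ \Omega_{m0} z - (2 - \Omega_{m0}) \left( \sqrt{1 + \Omega_{m0} z} - 1 \right) \right] . \tag{3.77}
\]
A subcase of this is the Einstein-de Sitter universe, which has Ω = Ωm = 1, ΩK = ΩΛ = 0,
\[
d_P^c(z) = 2 H_0^{-1} \left( 1 - \frac{1}{\sqrt{1+z}} \right) . \tag{3.78}
\]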
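
As a numerical cross-check, the last line of (3.76) can be integrated directly and compared with the closed form (3.78). This is a sketch only: the function names and the Simpson integrator are hypothetical helpers, distances are in units of H0^-1, and radiation is neglected as in the last line of (3.76).

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def comoving_distance(z, omega_0, omega_l0):
    """Comoving proper distance in units of 1/H0, last line of (3.76)."""

    def integrand(a):
        return 1.0 / math.sqrt(
            omega_0 * (a - a**2) - omega_l0 * (a - a**4) + a**2
        )

    return simpson(integrand, 1.0 / (1.0 + z), 1.0)


def eds_distance(z):
    """Closed form (3.78) for the Einstein-de Sitter model, in units of 1/H0."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))


if __name__ == "__main__":
    z = 1.0
    print("EdS, numerical :", comoving_distance(z, 1.0, 0.0))
    print("EdS, (3.78)    :", eds_distance(z))
    print("flat, OmegaL0=0.7:", comoving_distance(z, 1.0, 0.7))
    print("closed, Omega0=1.5, OmegaL0=0:", comoving_distance(z, 1.5, 0.0))
```

The last two calls illustrate the trends described above: raising ΩΛ0 at fixed Ω0 lengthens the distance to a given redshift, while raising Ω0 at fixed ΩΛ0 shortens it.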
The comoving horizon distance today is
\[
d_{\rm hor}^c = H_0^{-1} \int_0^1 \frac{da}{\sqrt{\Omega_{\Lambda 0} a^4 + (1 - \Omega_0) a^2 + \Omega_{m0} a + \Omega_{r0}}} . \tag{3.79}
\]

In figure 6 the comoving horizon distance is plotted for various choices of parameters.
[Figure 5: two panels, "Matter only" and "Flat universe", showing distance in units of H0^-1 (0 to 2) against redshift (0 to 4).]

Figure 5: The proper distance, (3.76), for a) the matter-only universe ΩΛ = 0, Ωm0 = 0,
0.2, ..., 1.8 (from top to bottom) b) the spatially flat universe Ω = 1 (ΩΛ = 1 − Ωm), Ωm0 = 0,
0.05, 0.2, 0.4, 0.6, 0.8, 1.0, 1.05 (from top to bottom). The thick line in both cases is the
Einstein-de Sitter model with Ωm = 1, ΩΛ = 0.

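The angular diameter distance is, from (3.25) and (3.76),
\[
\begin{aligned}
d_A(z) &= (1+z)^{-1} \frac{1}{\sqrt{-K}} \sinh\left( \sqrt{-K} \int_0^z \frac{dz'}{H(z')} \right) \\
&= H_0^{-1} (1+z)^{-1} \frac{1}{\sqrt{\Omega_{K0}}} \sinh\left( \sqrt{\Omega_{K0}} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{\Omega_{r0} + \Omega_{m0} a + \Omega_{K0} a^2 + \Omega_{\Lambda 0} a^4}} \right) \\
&\approx H_0^{-1} (1+z)^{-1} \frac{1}{\sqrt{1 - \Omega_0}} \sinh\left( \sqrt{1 - \Omega_0} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{\Omega_0 (a - a^2) - \Omega_{\Lambda 0} (a - a^4) + a^2}} \right) ,
\end{aligned}
\tag{3.80}
\]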

where we have used the definition ΩK0 = −K/H0^2 = 1 − Ω0 and have on the last line
again dropped Ωr0. The angular diameter distance is plotted in figure 7 for some
values of the parameters; figure 8 shows the same plot for the comoving angular
diameter distance. As always, the luminosity distance is dL = (1 + z)^2 dA.
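In a spatially flat universe the angular diameter distance is equal to the proper
distance,
\[
\begin{aligned}
d_A^c(z) = d_P^c(z) &= \int_0^z \frac{dz'}{H(z')} \\
&= H_0^{-1} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{\Omega_{r0} + \Omega_{m0} a + \Omega_{K0} a^2 + \Omega_{\Lambda 0} a^4}} .
\end{aligned}
\tag{3.81}
\]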

From anisotropies of the CMB we can infer that dA (1090) ≈ 13 Mpc. In the
second part of the course we will discuss in detail where this length scale comes from.
But given this number, it is simple to use it as a cosmological constraint: the
parameters of any model have to be such that this distance is reproduced.
[Figure 6: contour plot of the comoving horizon distance in units of H0^-1 in the (Ωm, ΩΛ) plane, with the regions "no big bang", "accelerating / decelerating", "open / closed" and "recollapses eventually" marked. Fig. by E. Sihvola]

Figure 6: The comoving horizon as a function of Ωm and ΩΛ.


[Figure 7: two panels, "Matter only" and "Flat universe", showing angular diameter distance in units of H0^-1 (0 to 2) against redshift (0 to 4).]

Figure 7: The angular diameter distance, for a) the matter-only universe ΩΛ = 0, Ωm0 = 0,
0.2, ..., 1.8 (from top to bottom) b) the spatially flat universe Ω = 1 (ΩΛ = 1 − Ωm), Ωm0 = 0,
0.05, 0.2, 0.4, 0.6, 0.8, 1.0, 1.05 (from top to bottom). The thick line in both cases is the
Einstein-de Sitter model with Ωm = 1, ΩΛ = 0. Note how the angular diameter distance
decreases for large redshifts, meaning that the object that is farther away may appear larger
on the sky. In the flat case, this is an expansion effect. In the matter-only case, the effect
is enhanced by space curvature effects for the closed (Ωm > 1) models.

[Figure 8: two panels, "Matter only" and "Flat universe", showing comoving angular diameter distance in units of H0^-1 (0 to 2) against redshift (0 to 4).]

Figure 8: Same as figure 7, but for the comoving angular diameter distance. For the closed
models (for Ωm > 1 in the case of ΩΛ = 0) even the comoving angular diameter distance
may begin to decrease if we look at large enough redshifts. This happens when we are
looking beyond χ = π/2, where the universe “begins to close up” as we pass the equator of
the hypersphere. The figure does not go to high enough z to show this for the parameters
used. Note how for the flat universe the comoving angular diameter distance is equal to the
comoving distance (see figure 5).

Figure 9: Spacetime diagrams for a flat universe giving a) the actual distance b) the
comoving distance from origin as a function of cosmic time.

Exercise. Show that in order to fit the distance dA(1090) ≈ 13 Mpc in the Einstein-de Sitter
model, the Hubble parameter has to be smaller than the observed value 73 ± 4 km/s/Mpc.

3.2.8 Illustrating the distances


Just like any planar map of the surface of the Earth must be distorted, so is that
of the curved spacetime. Even in the simplest spatially flat case, the expansion
rate affects the mapping. Thus any spacetime diagram is a distortion of the true
situation. In figures 9 and 10 there are three different ways of drawing the same
spacetime diagram for the simplest cosmological model, the Einstein-de Sitter model
which has Ωm = 1. In the first one the vertical distance is proportional to the cosmic
time t, the horizontal distance to the actual distance at that time, d1 . The second one
is in the comoving coordinates (t, r), so that the horizontal distance is proportional
to the comoving proper distance d^c_P. (Recall that in the case K = 0 we have
d^c_P = r, see (3.35).) The third one uses the conformal coordinates (η, r). The last
one has the advantage that light cones are always at a 45° angle. This is thus a
spacetime analogue of the Mercator projection.

3.2.9 Luminosity distance and observations


To use the luminosity distance to constrain the cosmological model, we would ideally
have a set of standard candles, objects which are known to have the same absolute
luminosity L. From their observed redshifts z and fluxes F we then get an
observed luminosity-distance-redshift relationship dL(z) = sqrt(L/4πF), which can be
compared to the theoretical one to find the values of the cosmological parameters
which give the best fit between theory and observations.

Figure 10: Spacetime diagram for a flat universe in conformal coordinates.

[Figure 11: two panels, "Matter only" and "Flat universe", showing luminosity distance in units of H0^-1 (0 to 10) against redshift (0 to 4).]

Figure 11: Same as Fig. 7, but for the luminosity distance. Note how the vertical scale now
extends to 10 Hubble distances instead of just 2, to have room for the much more rapidly
increasing luminosity distance.

[Figure 12: two panels, "Matter only" and "Flat universe", showing magnitude (−5 to 5) against redshift (0 to 4).]

Figure 12: Same as Fig. 7, but for the magnitude-redshift relationship. The constant
M − 5 − 5 lg H0 in (3.84), which is different for different classes of standard candles, has
been arbitrarily set to 0.
As we discussed in chapter 2, astronomers give luminosities as magnitudes. From
the definitions of the absolute and apparent magnitude,
\[
M \equiv -2.5 \log_{10} \frac{L}{L_0} , \qquad m \equiv -2.5 \log_{10} \frac{F}{F_0} , \tag{3.82}
\]
and (3.27) we get the distance modulus m − M in terms of the luminosity distance
as
\[
m - M = -2.5 \log_{10} \frac{F}{L} \frac{L_0}{F_0} = 5 \log_{10} d_L + 2.5 \log_{10} \left( 4\pi \frac{F_0}{L_0} \right) = -5 + 5 \log_{10} d_L({\rm pc}) . \tag{3.83}
\]
(As explained in chapter 2, the constants L0 and F0 are chosen so as to give the
value −5 for the constant term, when dL is given in parsecs.) For a set of standard
candles, all having the same absolute magnitude M , we find that their apparent
magnitudes m should be related to their redshift z as

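\[
\begin{aligned}
m(z) &= M - 5 + 5 \log_{10} d_L({\rm pc}) \\
&= M - 5 - 5 \log_{10} H_0 + 5 \log_{10} \left[ (1+z) \frac{1}{\sqrt{1 - \Omega_0}} \sinh\left( \sqrt{1 - \Omega_0} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{\Omega_0 (a - a^2) - \Omega_{\Lambda 0} (a - a^4) + a^2}} \right) \right] .
\end{aligned}
\tag{3.84}
\]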

The Hubble constant H0 contributes only a constant term in this magnitude-


redshift relationship. If we just know that all the objects have the same M , but do
not know the value of M , we cannot use the observed m(z) to determine H0 , since
both M and H0 contribute to this constant term. On the other hand, the shape of
the m(z) curve depends only on the parameters Ω0 and ΩΛ0 .
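
Since the constant term drops out of magnitude differences, the shape dependence can be illustrated by computing the difference in m(z) between a given model and a reference model from (3.84). The sketch below is illustrative only: the function names are hypothetical, radiation is neglected, and for closed models (Ω0 > 1) the sinh of an imaginary argument is written as a sine.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def luminosity_distance(z, omega_0, omega_l0):
    """Luminosity distance in units of 1/H0, from the bracket in (3.84)."""

    def integrand(a):
        return 1.0 / math.sqrt(
            omega_0 * (a - a**2) - omega_l0 * (a - a**4) + a**2
        )

    chi = simpson(integrand, 1.0 / (1.0 + z), 1.0)
    omega_k0 = 1.0 - omega_0
    if abs(omega_k0) < 1e-12:      # spatially flat
        r = chi
    elif omega_k0 > 0:             # open: sinh
        s = math.sqrt(omega_k0)
        r = math.sinh(s * chi) / s
    else:                          # closed: sinh(i x)/i = sin(x)
        s = math.sqrt(-omega_k0)
        r = math.sin(s * chi) / s
    return (1.0 + z) * r


def delta_m(z, model, reference=(1.0, 0.0)):
    """m(z) of `model` minus m(z) of `reference`; each is a pair (Omega_0, Omega_L0)."""
    return 5.0 * math.log10(
        luminosity_distance(z, *model) / luminosity_distance(z, *reference)
    )


if __name__ == "__main__":
    for z in (0.2, 0.5, 1.0):
        print(z, delta_m(z, (1.0, 0.7)), delta_m(z, (0.3, 0.0)))
```

Here the default reference is the Einstein-de Sitter model, and the two models printed are a flat model with ΩΛ0 = 0.7 and an open matter-only model with Ωm0 = 0.3; these parameter values are chosen for illustration.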
Unfortunately, there are no known good standard candles. However, the absolute
peak luminosity of type Ia supernovae (SNe Ia) is correlated with the shape of
the observed luminosity as a function of time. Therefore, calibrating off nearby
supernovae whose distance can be determined independently, it is possible to find
the absolute luminosity of individual SNe Ia.^7

[Figure 13: relative difference in m(z), ∆m from −0.4 to 1.0, against redshift (0 to 2).]

Figure 13: The difference of the magnitude-redshift relationship of the different
models in Fig. 12 from that of the reference model Ωm = 1, ΩΛ = 0 (which appears as the
horizontal thick line). The red (solid) lines are for the matter-only (ΩΛ = 0) models and
the blue (dashed) lines are for the flat (Ω0 = 1) models.

It was observations of SNe Ia published in 1998 which provided the first evidence
that the expansion of the universe has accelerated. (Though there were a number
of hints, such as the age of the universe coupled with the value of H0, that something
was wrong with the matter-dominated model.) Two groups, the Supernova
Cosmology Project^8 and the High-Z Supernova Search Team^9, made independent
observations and published independent analyses of SNe Ia up to redshifts z ∼ 1
to determine the values of the cosmological parameters Ω0 and ΩΛ0 [2, 3]. Their
observations were inconsistent with a matter-dominated universe, i.e., with ΩΛ = 0.
In fact the expansion of the universe was indicated to be accelerating.
Later more accurate observations by these and other groups have confirmed this
result. This SNIa data is one of the main arguments for the existence of dark
energy in the universe.^10 See Fig. 14 for SNIa data from 2004, and Fig. 15 for a
determination of Ωm0 and ΩΛ0 from more recent data in 2009 [5]. (The lines labelled
“Union” and “Union+CfA3” are different supernova compilations, and BAO stands
for Baryon Acoustic Oscillations, which is information provided by surveys of large
scale structure.)
^7 The variation in luminosity is about a factor of ten, so type Ia supernovae are far from standard
candles, though they are often incorrectly referred to as such. Another, less incorrect, expression is
“standardizable candle”.
^8 http://supernova.lbl.gov/
^9 http://cfa-www.harvard.edu/cfa/oir/Research/supernova/HighZ.html
^10 The other main argument comes from combining CMB anisotropy and large-scale-structure
data, and will be discussed in Cosmology II.
[Figure 14: two panels showing ∆(m−M) (mag) against z (0 to 2). The data points are marked "Ground Discovered" and "HST Discovered"; the curves are labelled "high-z gray dust", "Evolution ∼ z", "Empty (Ω = 0)", "ΩM = 0.27, ΩΛ = 0.73", "ΩM = 1.0, ΩΛ = 0.0" and ""replenishing" gray dust".]

Figure 14: The Supernova Ia luminosity-redshift data. The top panel shows all supernovae
of the data set. The bottom panel shows the averages from different redshift bins. The
curves correspond to three different FRW cosmologies, and some alternative explanations:
“dust” refers to the possibility that the universe is not transparent, but some photons get
absorbed on the way; “evolution” to the possibility that the SNIa are not standard candles,
but were different in the younger universe, so that M = M(z). This figure is from Riess et
al., astro-ph/0402512 [4].

We have in the preceding assumed that the mysterious dark energy component of
the universe is vacuum energy, for which pde = −ρde. Instead allowing the equation
of state parameter wde ≡ pde/ρde for dark energy to be an arbitrary constant^11, we
see that wde is restricted to be close to −1; see Fig. 16.
It is worth emphasising that all the supernova observations (and observations of
the angular diameter distance to the CMB) show is that the distances are longer
than in the Einstein-de Sitter model. If the distance observations are interpreted
assuming that the FRW approximation is valid (i.e. that the FRW relation (3.25)
between the distance and the expansion rate holds), it follows that the expansion
rate has accelerated. Assuming that the Friedmann equations hold (i.e. that general
relativity is valid), (3.49) shows that the total pressure then has to be negative. While
the acceleration has not been established independently of the assumption that the
FRW approximation holds, observations of t0 and H0 (and other observables, such
as the growth rate of cosmic structures) are consistent with this interpretation. Note
that the only cosmological effect of vacuum energy is to increase the expansion rate
and correspondingly increase the distances. Its success in fitting various cosmological
observations in detail is thus strong evidence for faster expansion, but it may
be that the explanation for the faster expansion is not vacuum energy but something
else, be it a more complicated form of dark energy or modified gravity, or breakdown
of the FRW approximation due to cosmological structure formation. In this course,
we will not discuss these possibilities, and will stick with vacuum energy.

^11 There is no theoretical justification for the assumption that wde is constant, if it is different
from −1. It is just taken for simplicity.

Figure 15: The densities Ωm and ΩΛ determined from Supernova Ia data. The dotted
contours are the 1998 results [2]. This figure is from [5].

Figure 16: The matter density Ωm and the dark energy equation of state w determined
from Supernova Ia data, assuming spatial flatness. This figure is from [5].

References
[1] L.M. Krauss and B. Chaboyer, Age Estimates of Globular Clusters in the Milky
Way: Constraints on Cosmology, Science 299 (2003) 65

[2] A.G. Riess et al., Astron. J. 116, 1009 (1998)

[3] S. Perlmutter et al., Astrophys. J. 517, 565 (1999)

[4] A.G. Riess et al., Astrophys. J. 607, 665 (2004), astro-ph/0402512

[5] M. Hicken et al., Astrophys. J. 700, 1097 (2009), arXiv:0901.4804


4 Thermodynamics in an expanding universe
4.1 Phase space density
As we look out in space we can see the history of the universe unfolding in front of
our telescopes. However, at redshift z = 1090 our line of sight hits the last scattering
surface, from which the cosmic microwave background (CMB) radiation originates.
This corresponds to t ≈ 400 000 years. Before that the universe was not transparent,
so we cannot see further back in time. However, the isotropy of the CMB indicates
that matter was distributed almost homogeneously and isotropically in the early uni-
verse, and the spectrum of the CMB shows that this matter, the “primordial soup”
of particles, was in thermal equilibrium. Therefore we can use thermodynamics to
calculate the history of the early universe. As we will see, this calculation leads to
predictions testable by observation (and big bang nucleosynthesis in particular has
been successfully tested). We will now derive the thermodynamics of the primordial
soup starting from statistical physics. Note that we only deal with the statistical
physics of a gas of particles: the thermodynamics of the gravitational degrees of
freedom are poorly understood, and will not be relevant for our discussion. Also,
the interactions responsible for thermal equilibrium are those of non-gravitational
physics. The only role of gravity here is to determine the expansion of space.
From elementary quantum mechanics we are familiar with the “particle in a
box”. Let us consider a cubic box, whose edge is L (volume V = L^3), with periodic
boundary conditions. Solving the Schrödinger equation gives us the energy and
momentum eigenstates, where the possible momentum values are
\[
\vec{p} = \frac{h}{L} (n_1 \hat{x} + n_2 \hat{y} + n_3 \hat{z}) \qquad (n_i = 0, \pm 1, \pm 2, \ldots). \tag{4.1}
\]
The state density in momentum space (number of states / ∆px ∆py ∆pz) is thus
\[
\frac{L^3}{h^3} = \frac{V}{h^3} , \tag{4.2}
\]
and the state density in phase space {(x, p)} is 1/h^3. If the particle has g internal
degrees of freedom (e.g., spin),
\[
\text{density of states} = \frac{g}{h^3} = \frac{g}{(2\pi)^3} \qquad \left( \hbar \equiv \frac{h}{2\pi} \equiv 1 \right) . \tag{4.3}
\]
This result is true even for relativistic momenta. The state density in phase space
is independent of the volume V, so we can apply it to arbitrarily large systems
(including an infinite universe).
For much of the early universe, we can ignore interaction energies between particles.
Then the particle energy is
\[
E(\vec{p}) = \sqrt{p^2 + m^2} , \tag{4.4}
\]
where p ≡ |p| is the magnitude of the three-momentum (not pressure!), and the
states available for the particles are the free particle states discussed above.
Particles fall into two classes, fermions and bosons. Fermions obey the Pauli
exclusion principle: no two fermions can be in the same state.


In thermodynamic equilibrium the distribution function, or the expectation value
f of the occupation number of a state, depends only on the energy of the state.
According to statistical physics, it is
\[
f(\vec{p}) = \frac{1}{e^{(E - \mu)/T} \pm 1} , \tag{4.5}
\]
where + is for fermions and − is for bosons. (For fermions, where f ≤ 1, f gives
the probability that a state is occupied.) This equilibrium distribution has two
parameters, the temperature T , and the chemical potential µ. The temperature is
related to the energy density in the system and the chemical potential is related
to the number density n of particles in the system. Note that since we use the
relativistic formula for the particle energy E, which includes the mass m, it is also
“included” in the chemical potential µ. Thus in the nonrelativistic limit, both E
and µ differ from the corresponding quantities of nonrelativistic statistical physics
by m, in such a way that E − µ and the distribution functions remain the same.
If there is no conserved particle number in the system (as e.g. in a photon gas),
then µ = 0 in equilibrium.
The particle density in phase space is the density of states times their occupation
number,
\[
\frac{g}{(2\pi)^3} f(\vec{p}) . \tag{4.6}
\]
We get the particle density in (ordinary) space by integrating over the momentum
space. We thus have the following quantities:
\[
\text{number density} \qquad n = \frac{g}{(2\pi)^3} \int f(\vec{p}) \, d^3p \tag{4.7}
\]
\[
\text{energy density} \qquad \rho = \frac{g}{(2\pi)^3} \int E(\vec{p}) f(\vec{p}) \, d^3p \tag{4.8}
\]
\[
\text{pressure} \qquad p = \frac{g}{(2\pi)^3} \int \frac{|\vec{p}|^2}{3E} f(\vec{p}) \, d^3p . \tag{4.9}
\]
The above discussion applies separately to each particle species i, with its own mass mi;
the quantities n, ρ and p then carry a species index i.
We won’t need it, but let’s note for the sake of completeness that the general
expression for the energy-momentum tensor of a species i is
\[
T^{\alpha}{}_{\beta} = \frac{g}{(2\pi)^3} \int \frac{d^3p}{E} \, p^{\alpha} p_{\beta} f(\vec{p}) , \tag{4.10}
\]
where the four-momentum is p^α = (E, p), with p_α p^α = −m^2.

4.2 Equilibrium distributions


If particle species i has the above distribution for some µi and Ti , we say the species
is in kinetic equilibrium. If the system is in thermal equilibrium, all species have the
same temperature, Ti = T . If the system is in chemical equilibrium (“chemistry”
here refers to reactions where particles change into other species), the chemical
potentials of different particle species are related according to the reaction formulae.
For example, if we have a reaction
\[
i + j \leftrightarrow k + l , \tag{4.11}
\]
then
\[
\mu_i + \mu_j = \mu_k + \mu_l . \tag{4.12}
\]
Thus all chemical potentials can be expressed in terms of the chemical potentials
of conserved quantities, e.g., the baryon number chemical potential, µB . There are
thus as many independent chemical potentials as there are independent conserved
particle numbers. For example, if the chemical potential of particle species i is µi ,
then the chemical potential of the corresponding antiparticle is −µi .
As the universe expands, T and µ change in such a way that the energy continuity
equation is satisfied and conserved quantum numbers are conserved. In principle,
an expanding universe is not in equilibrium. The expansion is however so slow that
the particle soup usually has time to settle close to local equilibrium. (And since
the universe is homogeneous, the local values of thermodynamic quantities are also
global values). From the remaining numbers of fermions (electrons and nucleons) in
the present universe, we can conclude that in the early universe we had |µ| ≪ T when
T ≫ m. (We don’t know the chemical potentials of the three neutrino species, but
they are usually assumed to be small too.) If the temperature is much greater than
the mass, T ≫ m, the ultrarelativistic limit, we can approximate E = sqrt(p^2 + m^2) ≈ p.
For |µ| ≪ T and m ≪ T, we approximate µ = 0 and m = 0 to get the following
formulae
 3
g
Z ∞
4πp2 dp  2 ζ(3)gT 3 fermions

n = = 4π (4.13)
(2π)3 0 ep/T ± 1   1 ζ(3)gT 3 bosons
π2
 2

gT 4 fermions

4πp3 dp
Z ∞ 
g 
8 30
ρ = = (4.14)
(2π)3 0 ep/T ± 1  π 2
4
 gT

bosons
30

Z ∞ 4 3 1.0505nT fermions
g 3 πp dp 1
p = = ρ ≈ (4.15)
(2π)3 0 ep/T ± 1 3 
0.9004nT bosons .

For the average particle energy we get

7π 4


 T ≈ 3.151T fermions
ρ 
180ζ(3)
hEi = = (4.16)
n  π4

 T ≈ 2.701T bosons .
30ζ(3)

In the above ζ is the Riemann zeta function, with ζ(3) ≡ ∞ −3 = 1.20206.


P
n=1 n
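
The coefficients in (4.13), (4.14) and (4.16) can be verified by evaluating the dimensionless integrals over x = p/T numerically. The following is a sketch with hypothetical function names; it takes g = 1 and T = 1, and truncates the integrals at x = 60, where the integrand is negligible.

```python
import math


def simpson(f, a, b, n=6000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def thermal_integral(power, sign, x_max=60.0):
    """Integral of x**power / (exp(x) + sign) from 0 to x_max.

    sign = +1 for fermions, -1 for bosons; power is 2 for n and 3 for rho.
    """

    def integrand(x):
        if x == 0.0:
            return 0.0  # the integrand vanishes at x = 0 for power >= 2
        return x**power / (math.exp(x) + sign)

    return simpson(integrand, 0.0, x_max)


def number_coefficient(sign):
    """n / (g T^3): the factor g/(2 pi)^3 * 4 pi equals g/(2 pi^2)."""
    return thermal_integral(2, sign) / (2.0 * math.pi**2)


def energy_coefficient(sign):
    """rho / (g T^4)."""
    return thermal_integral(3, sign) / (2.0 * math.pi**2)


if __name__ == "__main__":
    for name, sign in (("fermions", +1), ("bosons", -1)):
        n = number_coefficient(sign)
        rho = energy_coefficient(sign)
        print(name, "n/(gT^3) =", n, " rho/(gT^4) =", rho, " <E>/T =", rho / n)
```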
If the chemical potential vanishes, µ = 0, there are equal numbers of particles
and antiparticles. If µ ≠ 0, we find for fermions in the ultrarelativistic limit T ≫ m
(i.e., for m = 0, but µ ≠ 0) the “net particle number”
\[
\begin{aligned}
n - \bar{n} &= \frac{g}{(2\pi)^3} \int_0^\infty dp \, 4\pi p^2 \left[ \frac{1}{e^{(p-\mu)/T} + 1} - \frac{1}{e^{(p+\mu)/T} + 1} \right] \\
&= \frac{g T^3}{6\pi^2} \left[ \pi^2 \left( \frac{\mu}{T} \right) + \left( \frac{\mu}{T} \right)^3 \right]
\end{aligned}
\tag{4.17}
\]
and the total energy density
\[
\begin{aligned}
\rho + \bar{\rho} &= \frac{g}{(2\pi)^3} \int_0^\infty dp \, 4\pi p^3 \left[ \frac{1}{e^{(p-\mu)/T} + 1} + \frac{1}{e^{(p+\mu)/T} + 1} \right] \\
&= \frac{7}{8} g \frac{\pi^2}{15} T^4 \left[ 1 + \frac{30}{7\pi^2} \left( \frac{\mu}{T} \right)^2 + \frac{15}{7\pi^4} \left( \frac{\mu}{T} \right)^4 \right] .
\end{aligned}
\tag{4.18}
\]
Note that the last forms in equations (4.17) and (4.18) are exact, not just truncated
series. (The difference n − n̄ and the sum ρ + ρ̄ lead to a nice cancellation between
the two integrals. We don’t get such an elementary form for the individual n, n̄, ρ,
ρ̄, or the sum n + n̄ and the difference ρ − ρ̄ when µ ≠ 0.)
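
That the closed forms in (4.17) and (4.18) are exact, rather than leading terms of a series, can be tested numerically even for µ/T of order one. This is a sketch under stated assumptions: T = 1, massless fermions, hypothetical function names, and the integrals truncated at x = p/T = 80.

```python
import math


def simpson(f, a, b, n=8000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def fermi(x, y):
    """Fermi-Dirac occupation number for p/T = x and mu/T = y (massless)."""
    return 1.0 / (math.exp(x - y) + 1.0)


def net_number_numeric(y, g=2, x_max=80.0):
    """(n - nbar)/T^3 from the integral in (4.17)."""
    f = lambda x: x**2 * (fermi(x, y) - fermi(x, -y))
    return g / (2.0 * math.pi**2) * simpson(f, 0.0, x_max)


def net_number_exact(y, g=2):
    """(n - nbar)/T^3 from the closed form in (4.17)."""
    return g / (6.0 * math.pi**2) * (math.pi**2 * y + y**3)


def total_energy_numeric(y, g=2, x_max=80.0):
    """(rho + rhobar)/T^4 from the integral in (4.18)."""
    f = lambda x: x**3 * (fermi(x, y) + fermi(x, -y))
    return g / (2.0 * math.pi**2) * simpson(f, 0.0, x_max)


def total_energy_exact(y, g=2):
    """(rho + rhobar)/T^4 from the closed form in (4.18)."""
    bracket = 1.0 + 30.0 * y**2 / (7.0 * math.pi**2) + 15.0 * y**4 / (7.0 * math.pi**4)
    return 7.0 / 8.0 * g * math.pi**2 / 15.0 * bracket


if __name__ == "__main__":
    for y in (0.1, 1.0, 3.0):
        print(y, net_number_numeric(y), net_number_exact(y),
              total_energy_numeric(y), total_energy_exact(y))
```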
In the nonrelativistic limit, T ≪ m and T ≪ m − µ, the typical kinetic energies
are much below the mass m, so that we can approximate E = m + p^2/2m. The
second condition, T ≪ m − µ, leads to occupation numbers ≪ 1, a dilute system.
This second condition is usually satisfied in cosmology when the first one is. (It is
violated in systems of high density, like white dwarf stars and neutron stars.) We
can then approximate
\[
e^{(E-\mu)/T} \pm 1 \approx e^{(E-\mu)/T} , \tag{4.19}
\]
so that the boson and fermion expressions become equal^1, and we get (exercise)
\[
n = g \left( \frac{mT}{2\pi} \right)^{3/2} e^{-\frac{m-\mu}{T}} \tag{4.20}
\]
\[
\rho = n \left( m + \frac{3T}{2} \right) \tag{4.21}
\]
\[
p = nT \ll \rho \tag{4.22}
\]
\[
\langle E \rangle = m + \frac{3T}{2} \tag{4.23}
\]
\[
n - \bar{n} = 2g \left( \frac{mT}{2\pi} \right)^{3/2} e^{-\frac{m}{T}} \sinh \frac{\mu}{T} . \tag{4.24}
\]
In the general case, where neither T ≪ m, nor T ≫ m, the integrals don’t give
elementary functions, but n(T), ρ(T), etc. need to be calculated numerically for the
region T ∼ m.^2
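
Such a numerical calculation is straightforward for the number density (4.7) with µ = 0. The sketch below, with hypothetical function names and T = 1, integrates over x = p/T for an arbitrary ratio m/T and compares the result with the nonrelativistic formula (4.20); the integration cut-off at x = m/T + 60 is an assumption that suffices for the values used here.

```python
import math


def simpson(f, a, b, n=8000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def number_density(m_over_t, g=1, fermion=True):
    """n / T^3 from (4.7) with mu = 0 and E = sqrt(p^2 + m^2)."""
    sign = 1.0 if fermion else -1.0

    def integrand(x):
        if x == 0.0:
            return 0.0  # the factor x**2 makes the integrand vanish at x = 0
        energy = math.sqrt(x**2 + m_over_t**2)
        return x**2 / (math.exp(energy) + sign)

    return g / (2.0 * math.pi**2) * simpson(integrand, 0.0, m_over_t + 60.0)


def number_density_nonrel(m_over_t, g=1):
    """n / T^3 from the nonrelativistic formula (4.20) with mu = 0."""
    return g * (m_over_t / (2.0 * math.pi)) ** 1.5 * math.exp(-m_over_t)


if __name__ == "__main__":
    for r in (0.0, 1.0, 3.0, 10.0, 30.0):
        print(f"m/T = {r:5.1f}: n/T^3 = {number_density(r):.4e}, "
              f"(4.20) gives {number_density_nonrel(r):.4e}")
```

For m/T = 0 the function reproduces the fermion coefficient of (4.13), and for m/T ≫ 1 it approaches (4.20) from above, with the two differing by a few per cent at m/T = 30.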
By comparing the ultrarelativistic (T ≫ m) and nonrelativistic (T ≪ m) limits
we see that the number density, energy density, and pressure of a particle species
fall exponentially as the temperature falls below the mass of the particle. We
have not so far made assumptions about the interactions that are responsible for
maintaining equilibrium. In the cosmological case, these include annihilation and
particle-antiparticle pair formation. At high temperatures, these reactions balance
each other, but as the temperature falls below the mass, the thermal particle energies
are not sufficient for pair production any more, so the reactions happen only in the
annihilation direction. The process of particle-antiparticle annihilation takes place
mainly (about 80%) during the temperature interval T = m → m/6, as shown in
figure 1. It is thus not an instantaneous event, but takes several Hubble times.
^1 This approximation leads to what is called Maxwell–Boltzmann statistics; whereas the previous
exact formulae give Fermi–Dirac (for fermions) and Bose–Einstein (for bosons) statistics.
^2 If we use Maxwell–Boltzmann statistics, i.e., we drop the term ±1, the integrals give modified
Bessel functions, e.g., K2(m/T), and the error is often less than 10%.

Figure 1: The fall of energy density of a particle species, with mass m, as a function of
temperature (decreasing to the right).

4.3 Effective number of degrees of freedom


According to the Friedmann equation the expansion of the universe is governed by
the total energy density
\[
\rho(T) = \sum_i \rho_i(T) ,
\]
where i runs over particle species. Since the energy density of relativistic species is
much greater than that of nonrelativistic species, it suffices to include the relativistic
species only. (This is true in the early universe, but not at later times. Eventually
the rest masses of the particles left over from annihilation begin to dominate and
we enter the matter-dominated era.) Thus we have
\[
\rho(T) = \frac{\pi^2}{30} g_*(T) T^4 , \tag{4.25}
\]
where
\[
g_*(T) = g_b(T) + \frac{7}{8} g_f(T) ,
\]
and g_b = Σ_i g_i over relativistic bosons and g_f = Σ_i g_i over relativistic fermions.
For pressure we have p(T) ≈ ρ(T)/3.
The above is a simplification of the true situation: since the annihilation takes
a long time, often the annihilation of some particle species is going on, and the
contribution of this species disappears gradually. Using the exact formula for ρ we
define the effective number of degrees of freedom g∗(T) by
\[
g_*(T) \equiv \frac{30}{\pi^2} \frac{\rho}{T^4} . \tag{4.26}
\]
We also define
\[
g_{*p}(T) \equiv \frac{90}{\pi^2} \frac{p}{T^4} \approx g_*(T) . \tag{4.27}
\]
When there are no annihilations taking place, g∗p = g∗ = const ⇒ p = ρ/3. From
the Friedmann equation it then follows that ρ ∝ a^-4, and since ρ ∝ T^4, we have
T ∝ a^-1. We will soon calculate the scale factor-temperature relation more
precisely (including the effects of annihilations).

4.4 Redshift of momenta


Let us now show that the momentum of freely moving massless (or ultrarelativistic,
m ≪ T ⇒ E ≃ p) particles redshifts with the expansion of the universe as
\[
p(t_2) = \frac{a(t_1)}{a(t_2)} \, p(t_1) . \tag{4.28}
\]
It follows that ultrarelativistic non-interacting particles stay in kinetic equilibrium.
We can see this as follows.
At time t1 a phase space element d^3p1 dV1 contains
\[
dN = \frac{g}{(2\pi)^3} f(\vec{p}_1) \, d^3p_1 \, dV_1 \tag{4.29}
\]
particles, where
\[
f(\vec{p}_1) = \frac{1}{e^{(p_1 - \mu_1)/T_1} \pm 1}
\]
is the distribution function at time t1. At time t2 these same dN particles are in a
phase space element d^3p2 dV2. How is the distribution function at t2, given by
\[
\frac{g}{(2\pi)^3} f(\vec{p}_2) = \frac{dN}{d^3p_2 \, dV_2} ,
\]
related to f(p1)? Since d^3p2 = (a1/a2)^3 d^3p1 and dV2 = (a2/a1)^3 dV1, we have
\[
\begin{aligned}
dN &= \frac{g}{(2\pi)^3} \frac{d^3p_1 \, dV_1}{e^{(p_1 - \mu_1)/T_1} \pm 1} && \text{($dN$ evaluated at $t_1$)} \\
&= \frac{g}{(2\pi)^3} \frac{\left( \frac{a_2}{a_1} \right)^3 d^3p_2 \left( \frac{a_1}{a_2} \right)^3 dV_2}{e^{\left( \frac{a_2}{a_1} p_2 - \mu_1 \right)/T_1} \pm 1} && \text{(rewritten in terms of $p_2$, $dp_2$, and $dV_2$)} \\
&= \frac{g}{(2\pi)^3} \frac{d^3p_2 \, dV_2}{e^{(p_2 - \mu_2)/T_2} \pm 1} && \text{(defining $\mu_2$ and $T_2$)} ,
\end{aligned}
\tag{4.30}
\]
where µ2 ≡ (a1/a2)µ1 and T2 ≡ (a1/a2)T1. Thus the distribution retains the thermal
shape; the temperature and the chemical potential just redshift ∝ a^-1.
Exercise. Show that for a non-relativistic particle species, the distribution function
retains the thermal shape as the universe expands, with T2 = T1 (a(t1)/a(t2))^2 ∝
a(t2)^-2 and µ(t2) = m + (µ(t1) − m)T2/T1.

4.5 Scale factor-temperature relation


The relation between the temperature T and the scale factor a follows from the
conservation of entropy. We define the entropy density by s ≡ S/V and the effective
number of entropy degrees of freedom g∗s(T) by
\[
s(T) \equiv \frac{2\pi^2}{45} g_{*s}(T) T^3 . \tag{4.31}
\]
The equation (4.31) defines the coefficient g∗s(T).
Figure 2: The expansion of the universe increases the volume element dV and decreases
the momentum space element d^3p so that the phase space element d^3p dV stays constant.

According to the second law of thermodynamics the total entropy of the universe
never decreases: it either stays constant or grows. It turns out that entropy
production in various processes in the universe is insignificant compared to the total
entropy of the universe^3, which is huge, and which is dominated by the relativistic
species. Thus it is an excellent approximation to treat the expansion of the universe
as adiabatic, so the entropy stays constant, i.e.,
\[
d(s a^3) = 0 . \tag{4.32}
\]
This gives the desired relation between a and T:
\[
g_{*s}(T) \, T^3 a(t)^3 = \text{constant} . \tag{4.33}
\]
We will have much use for this formula.


In order to give substance to (4.33), we have to know what g∗s(T) is. For this
we turn to the fundamental equation of thermodynamics,
\[
E = TS - pV + \sum_i \mu_i N_i ,
\]
from which we get
\[
s = \frac{\rho + p - \sum_i \mu_i n_i}{T} . \tag{4.34}
\]
In general, we get the entropy density by summing up the contributions to
ρ + p − Σ_i µi ni from all particle species, using the exact expressions given earlier. If
|µi| ≪ T, we have for a single relativistic species
\[
s = \frac{\rho + p}{T} =
\begin{cases}
\dfrac{7\pi^2}{180} g T^3 & \text{fermions} \\[2mm]
\dfrac{2\pi^2}{45} g T^3 & \text{bosons} .
\end{cases}
\tag{4.35}
\]
^3 There may be exceptions to this in the very early universe, most notably the end of inflation,
where essentially all of the entropy of the universe may have been produced. Recall that we are
discussing only the entropy of matter: the entropy of gravitational degrees of freedom is a topic
which remains poorly understood. Black holes are thought to have extremely large entropy.

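Adding up all relativistic species and allowing for the possibility that some of them
may have a kinetic temperature Ti different from the temperature T of those species
that remain in thermal equilibrium, we get
\[
\begin{aligned}
g_*(T) &= \sum_{\rm bos} g_i \left( \frac{T_i}{T} \right)^4 + \frac{7}{8} \sum_{\rm fer} g_i \left( \frac{T_i}{T} \right)^4 \\
g_{*s}(T) &= \sum_{\rm bos} g_i \left( \frac{T_i}{T} \right)^3 + \frac{7}{8} \sum_{\rm fer} g_i \left( \frac{T_i}{T} \right)^3 ,
\end{aligned}
\tag{4.36}
\]
where the sums are over all relativistic species of bosons and fermions.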
If some species are “semirelativistic”, i.e. m = O(T ), then ρ(T ) and s(T ) have
to be calculated from the integral formulae of section 4.2. Non-relativistic species
give negligible contribution to the entropy.
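
The sums in (4.36) are simple to evaluate once the particle content is specified. The sketch below is illustrative: the function names are hypothetical, and the particle content (photons with g = 2, electrons and positrons with g = 4 in total, three neutrino species with g = 6 in total) and the temperature ratio (4/11)^(1/3) for decoupled neutrinos are standard values assumed here, not results derived in this section.

```python
def g_star(bosons, fermions, power=4):
    """Effective number of degrees of freedom from (4.36).

    bosons, fermions: lists of (g_i, T_i / T) pairs for the relativistic species.
    power = 4 gives g_* (energy density), power = 3 gives g_*s (entropy density).
    """
    total = sum(g * ratio**power for g, ratio in bosons)
    total += 7.0 / 8.0 * sum(g * ratio**power for g, ratio in fermions)
    return total


def g_star_s(bosons, fermions):
    """Effective number of entropy degrees of freedom from (4.36)."""
    return g_star(bosons, fermions, power=3)


if __name__ == "__main__":
    photons = [(2, 1.0)]

    # Assumed content 1: photons, e+ e- and three neutrino species at a common T.
    common = [(4, 1.0), (6, 1.0)]
    print(g_star(photons, common), g_star_s(photons, common))

    # Assumed content 2: photons plus neutrinos with T_nu / T = (4/11)^(1/3).
    t_ratio = (4.0 / 11.0) ** (1.0 / 3.0)
    neutrinos = [(6, t_ratio)]
    print(g_star(photons, neutrinos), g_star_s(photons, neutrinos))
```

When all species share one temperature the two functions return the same number; once a species has a different temperature the fourth and third powers of Ti/T make g∗ and g∗s differ.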
As long as all species have the same temperature and p ≈ ρ/3, we have
\[
g_{*s}(T) \approx g_*(T) . \tag{4.37}
\]
We will see that this approximation breaks down in the real universe at around 1 s.
5 Thermal history of the early universe
5.1 Timescale of the early universe
We will now apply the thermodynamics discussed in the previous section to the
evolution of the early universe. It is useful to keep in mind some simple relations
between time, distance and temperature in a radiation-dominated universe. Spatial
curvature can be neglected in the early universe, so the metric is

$$ ds^2 = -dt^2 + a^2(t) \left[ dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\varphi^2 \right] , \tag{5.1} $$

and the Friedmann equation is

$$ 3H^2 = 8\pi G_N \rho(T) = \frac{\pi^2}{30} g_*(T) \frac{T^4}{M_{\rm Pl}^2} , \tag{5.2} $$

where we have written Newton’s constant in terms of the Planck mass,
$M_{\rm Pl} \equiv 1/\sqrt{8\pi G_N} \approx 2.436 \times 10^{21}$ MeV. To integrate this equation
exactly we would need to calculate numerically the function $g_*(T)$ with all the
annihilations. For most of the time, however, $g_*(T)$ is changing slowly, so we can
approximate $g_*(T) = \text{const}$. Then $T \propto a^{-1}$ and $H \propto a^{-2}$, so we get the
following relation between the age of the universe $t$ and the Hubble parameter $H$:

$$ t = \frac{1}{2} H^{-1} = \sqrt{\frac{45}{2\pi^2}} \, \frac{1}{\sqrt{g_*}} \frac{M_{\rm Pl}}{T^2} \approx \frac{1.51}{\sqrt{g_*}} \frac{M_{\rm Pl}}{T^2} \approx \frac{2.42}{\sqrt{g_*}} \left( \frac{T}{\rm MeV} \right)^{-2} {\rm s} . \tag{5.3} $$

We thus have

$$ a \propto T^{-1} \propto t^{1/2} . $$
This approximate result (5.3) will be sufficient for us as far as the time scale is
concerned1 , but for the relation between a and T , we need to use the more exact
result derived in section 4.5.
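As a rough numerical illustration of (5.3), the following Python sketch evaluates the age
of the universe at a few temperatures. It assumes that g∗ is constant at the quoted value;
the function name age_seconds and the conversion constant for ħ are our own choices for
this illustration, not part of the notes.

```python
import math

M_PL_MEV = 2.436e21      # reduced Planck mass in MeV, as quoted in the text
HBAR_MEV_S = 6.582e-22   # hbar in MeV s (assumed value), converts 1/MeV to seconds


def age_seconds(T_mev, g_star):
    """Age of a radiation-dominated universe from (5.3), with g_* held constant."""
    prefactor = math.sqrt(45.0 / (2.0 * math.pi**2))  # about 1.51
    t_in_inverse_mev = prefactor * M_PL_MEV / (math.sqrt(g_star) * T_mev**2)
    return t_in_inverse_mev * HBAR_MEV_S


# Illustrative (T in MeV, g_*) pairs taken from this chapter.
for label, T, g in [
    ("EW scale", 1.0e5, 106.75),
    ("QCD transition", 150.0, 61.75),
    ("neutrino decoupling", 1.0, 10.75),
]:
    print(f"{label}: T = {T:g} MeV, t = {age_seconds(T, g):.3g} s")
```

The results agree with the order-of-magnitude time scales used in this chapter: a few
tens of picoseconds at T ∼ 100 GeV and somewhat less than a second at T ∼ 1 MeV.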
The distance to the horizon (i.e. the proper comoving distance to $t = 0$, or
$z = \infty$) is

$$ d_{\rm hor} = a(t) \int_0^t \frac{dt'}{a(t')} = 2t = H^{-1} . \tag{5.4} $$

In the radiation-dominated early universe, the distance to the horizon is equal to the
Hubble length, so we can use the terms “horizon length”, “horizon” and “Hubble
length” interchangeably. (This is often also done for other eras, when the two are
not equal.)
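As a short check of the horizon distance in a radiation-dominated universe, one can
insert a ∝ t^{1/2} explicitly (the normalisation of a cancels):

```latex
d_{\rm hor} = a(t) \int_0^t \frac{dt'}{a(t')}
            = t^{1/2} \int_0^t t'^{-1/2} \, dt'
            = t^{1/2} \cdot 2 t^{1/2} = 2t ,
\qquad
H = \frac{\dot a}{a} = \frac{1}{2t}
\quad \Rightarrow \quad H^{-1} = 2t = d_{\rm hor} .
```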

5.2 Particle content


The primordial soup initially consists of all the different species of known elementary
particles. Their masses range from the heaviest known elementary particle, the top
quark (m = 174 GeV) down to the lightest particles, the electron (m = 511 keV),
the neutrinos (m ≲ 2 eV) and the photon (m = 0). In addition to the particles of
the Standard Model, there are presumably other, so far undiscovered, species. In
particular, we will discuss dark matter particles in chapter 7. As the temperature
falls, the various particle species become nonrelativistic and annihilate at different
times.

1
Usually the error from ignoring the time-dependence of g∗(T) is small, since the time scales of
earlier events are so much shorter.
The particles of the Standard Model are listed in table 1, and the effective number
of degrees of freedom g∗(T) (solid), g∗p(T) (dashed), and g∗s(T) (dotted) are plotted in
figure 1 as a function of temperature. In table 2 we list some important events in
the early universe.

Table 1: The particles in the Standard Model (masses from Particle Data Group, 2014 [1])

Particle               Mass                     Antiparticle   Spin   g (each)          g (sum)

Quarks (3 colours)
  t                    173.2 ± 0.9 GeV          t̄              1/2    2 · 2 · 3 = 12
  b                    4.18 ± 0.03 GeV          b̄
  c                    1.275 ± 0.025 GeV        c̄
  s                    95 ± 5 MeV               s̄
  d                    4.5–5.3 MeV              d̄
  u                    1.8–3.0 MeV              ū
                                                                                        72
Gluons
  8 massless bosons                                            1      2                 16

Charged leptons
  τ−                   1776.82 ± 0.16 MeV       τ+             1/2    2 · 2 = 4
  µ−                   105.658 MeV              µ+
  e−                   510.999 keV              e+
                                                                                        12
Neutrinos
  ντ                   < 2 eV                   ν̄τ             1/2    2
  νµ                   < 2 eV                   ν̄µ
  νe                   < 2 eV                   ν̄e
                                                                                        6
Electroweak gauge bosons
  W±                   80.385 ± 0.015 GeV                      1      3
  Z0                   91.1876 ± 0.0021 GeV                    1      3
  γ                    0 (< 1 × 10^−18 eV)                     1      2
                                                                                        11
Higgs boson
  H                    125.7 ± 0.4 GeV                         0      1                 1

gf = 72 + 12 + 6 = 90
gb = 16 + 11 + 1 = 28

For T > mt = 173 GeV, all known particles are relativistic. Adding up their
internal degrees of freedom we get

gb = 28      gluons 8×2, photons 2, W± and Z0 3×3, and Higgs 1
gf = 90      quarks 12×6, charged leptons 6×2, neutrinos 3×2
g∗ = gb + (7/8) gf = 106.75 .
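The counting above can be reproduced with a few lines of Python. This is only a
bookkeeping sketch: the dictionary layout and the function name g_star are our own, and
all species are assumed to share the same temperature, so that g∗ = gb + (7/8) gf.

```python
# Internal degrees of freedom, grouped as in table 1.
BOSONS = {"gluons": 8 * 2, "photon": 2, "W+": 3, "W-": 3, "Z0": 3, "Higgs": 1}
FERMIONS = {"quarks": 6 * 12, "charged leptons": 3 * 4, "neutrinos": 3 * 2}


def g_star(bosons, fermions):
    """Effective number of degrees of freedom for a common temperature."""
    return sum(bosons.values()) + 7.0 / 8.0 * sum(fermions.values())


g_b = sum(BOSONS.values())
g_f = sum(FERMIONS.values())
print(g_b, g_f, g_star(BOSONS, FERMIONS))  # prints: 28 90 106.75
```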
Figure 1: The functions g∗(T) (solid), g∗p(T) (dashed), and g∗s(T) (dotted) calculated
for Standard Model particle content. [Plot not reproduced: g∗ on a logarithmic vertical
axis from 10^0 to 10^2, against −lg(T/MeV) from −6 to 2 on the horizontal axis.]

The electroweak (EW) phase transition took place close to the temperature 100
GeV.² As with other phase transitions, the system was not in thermal equilibrium
during this event, and this may have important cosmological consequences (in
particular, it may determine the baryon-antibaryon asymmetry observed in the universe),
depending on the way the electroweak phase transition happens – this is one of the
main research topics at the Large Hadron Collider at CERN, due to start taking
data again in 2015. We will not discuss the electroweak phase transition, and for
our purposes it is enough to know that it appears that g∗ was the same before and
after the transition. Going to earlier times and higher temperatures, we expect g∗
to get larger than 106.75 as new physics (new, thus far unknown, particle species)
comes into play.
Let us now follow the history of the universe starting at the time when the
EW transition has already happened. We have T ∼ 100 GeV, t ∼ 20 ps, and t
quark annihilation is ongoing. (Recall that the transition from relativistic into
non-relativistic behaviour is not complete until about T ≈ m/6 ≈ 30 GeV.) The Higgs
boson and the gauge bosons W±, Z0 annihilate next. At T ∼ 10 GeV, we have
g∗ = 86.25. Next the b and c quarks annihilate, followed by the τ lepton. If the s
quark also had time to annihilate, we would reach g∗ = 51.25.
2
Like many other terms in cosmology, this may be a bit of a misnomer, because in the Standard
Model, there is no phase transition, just a smooth crossover from one regime to another. In some
extensions of the Standard Model, there is really a phase transition.

Event                                   Temperature                  Time

Electroweak phase transition            T ∼ 100 GeV                  t ∼ 20 ps
QCD phase transition                    T ∼ 150 MeV                  t ∼ 20 µs
Neutrino decoupling                     T ∼ 1 MeV                    t ∼ 1 s
Electron-positron annihilation          T < me ∼ 0.5 MeV             t ∼ 10 s
Big Bang Nucleosynthesis                T ∼ 50–100 keV               t ∼ 3–30 min
Matter-radiation equality               T ∼ 0.8 eV ∼ 9000 K          t ∼ 50000 yr
Recombination + photon decoupling       T ∼ 0.3 eV ∼ 3000 K          t ∼ 380000 yr

Table 2: Early universe events.

5.3 QCD phase transition


In the middle of the s quark annihilation, something else happens, however: matter
undergoes the QCD phase transition (also called the quark–hadron phase transition).
This takes place at T ∼ 150 MeV, t ∼ 20 µs. The colour forces between quarks and
gluons become important (so the formulae for the energy density in chapter 4 no
longer apply) and the phase transition takes place. After that, there are no more
free quarks and gluons; the quark-gluon plasma has become a hadron plasma. The
quarks and gluons have formed bound three-quark systems, called baryons, and
quark-antiquark pairs, called mesons. (Together, these bound states of quarks are
known as hadrons.) The lightest baryons are the nucleons: the proton and the
neutron. The lightest mesons are the pions: π ± , π 0 . Baryons are fermions, mesons
are bosons.
There are many different species of baryons and mesons, but all except pions are
nonrelativistic below the QCD phase transition temperature. Thus the only particle
species left in large numbers are the pions, muons, electrons, neutrinos, and the
photons. For pions, g = 3, so now we have g∗ = 17.25.

Table 3: History of g∗(T)

T ∼ 200 GeV     all present           106.75
T ∼ 100 GeV     EW transition         (no effect)
T < 170 GeV     top annihilation      96.25
T < 80 GeV      W±, Z0, H0            86.25
T < 4 GeV       bottom                75.75
T < 1 GeV       charm, τ−             61.75
T ∼ 150 MeV     QCD transition        17.25     (u, d, g → π±, π0; 37 → 3)
T < 100 MeV     π±, π0, µ−            10.75     (e±, ν, ν̄, γ left)
T < 500 keV     e− annihilation       (7.25)    2 + 5.25 (4/11)^(4/3) = 3.36

This table gives the value g∗(T) would have after each annihilation is over,
assuming the next annihilation had not yet begun. In reality they overlap in
many cases. The temperature value on the left is the approximate mass of the
particle in question and indicates roughly when the annihilation begins. The temperature
is much smaller when the annihilation ends. The top quark receives its mass in the
EW transition, so its annihilation only begins after the EW transition.
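The entries of table 3 down to g∗ = 10.75 can be checked by removing species one step
at a time. The sketch below assumes, as the table does, that each annihilation is complete
before the next one begins, and it lumps the s quark into the QCD transition step; the
list STEPS and its layout are our own illustration.

```python
# Each step: (label, bosonic g removed, fermionic g removed).
STEPS = [
    ("top", 0, 12),
    ("W, Z, Higgs", 3 * 3 + 1, 0),
    ("bottom", 0, 12),
    ("charm, tau", 0, 12 + 4),
    # QCD transition: u, d, s quarks (3 x 12) and gluons (16) are replaced by pions (g = 3).
    ("QCD transition", 16 - 3, 3 * 12),
    ("pions, muons", 3, 4),
]

g_b, g_f = 28, 90  # all Standard Model particles relativistic
history = [("all present", g_b + 7.0 / 8.0 * g_f)]
for label, removed_bosons, removed_fermions in STEPS:
    g_b -= removed_bosons
    g_f -= removed_fermions
    history.append((label, g_b + 7.0 / 8.0 * g_f))

for label, value in history:
    print(f"{label:15s} g* = {value}")
```

The last value, 10.75, corresponds to photons, electrons, positrons and neutrinos only.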

5.4 Neutrino decoupling and electron-positron annihilation


Soon after the QCD phase transition, pions and muons annihilate and for T =
20 MeV → 1 MeV, we have g∗ = 10.75. Next the electrons annihilate, but to discuss
the e+ e− annihilation we need a bit more detail.
So far we have assumed that all particle species have the same temperature,
i.e. interactions among the particles are able to keep them in thermal equilibrium.
Neutrinos, however, feel only the weak interaction. The weak interaction is actually
not that weak when particle energies are close to (or higher than) the masses of the
W and Z bosons, which mediate the interaction. But as the temperature falls, the
weak interaction becomes rapidly weaker.
A particle species falls out of chemical equilibrium when interactions become
too weak to maintain it in touch with the other species as the universe expands.
This happens when the interaction rate Γ becomes smaller than the expansion rate,
Γ < H. The interaction rate Γ has units of 1/time, and it can be interpreted as the
frequency of particle interactions. The limit Γ < H can roughly be understood as
saying that if particles on average have less than one interaction per Hubble time,
the distribution cannot keep up with the expansion. The interaction rate can be
written as Γ = n⟨σ|v|⟩, where n is the number density of the particles, σ is the
interaction cross section, v is the particle velocity and the brackets represent an
average over the phase space. If the cross section is independent of velocity, we can
take it out of the average. If the particles are ultrarelativistic, we can approximate
|v| = 1, in which case we have simply Γ = nσ. The cross section has units of area, and
it expresses the strength of the interaction.³
For the weak interaction processes relevant for neutrinos, the cross section is
$\sigma \sim G_F^2 T^2$, where $G_F \approx 1.17 \times 10^{-5}$ GeV$^{-2}$ is the Fermi constant.
The interaction rate is then $\Gamma = n \sigma v \sim G_F^2 T^5$, where $n$ is the number density
and $v \approx 1$ is typical neutrino velocity. According to the Friedmann equation,
$H \sim \sqrt{\rho / M_{\rm Pl}^2} \sim T^2 / M_{\rm Pl}$. So we have
$\Gamma / H \sim G_F^2 M_{\rm Pl} T^3 \sim (T / {\rm MeV})^3$. So, neutrinos decouple close to
T ∼ 1 MeV, after which they move practically freely, without interactions.
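To see that the combination G_F² M_Pl indeed singles out the MeV scale, one can evaluate
it numerically. The sketch below is an order-of-magnitude estimate only: all numerical
factors of order one are dropped, exactly as in the argument above, and the names
rate_ratio and T_dec are our own.

```python
G_F = 1.17e-5 * 1.0e-6   # Fermi constant converted from GeV^-2 to MeV^-2
M_PL = 2.436e21          # reduced Planck mass in MeV, as quoted in (5.2)


def rate_ratio(T_mev):
    """Order-of-magnitude estimate Gamma/H ~ G_F^2 M_Pl T^3 (O(1) factors dropped)."""
    return G_F**2 * M_PL * T_mev**3


# Temperature at which the estimated ratio equals one.
T_dec = (G_F**2 * M_PL) ** (-1.0 / 3.0)
print(f"Gamma/H at 10 MeV: {rate_ratio(10.0):.3g}")
print(f"Gamma/H = 1 at T of about {T_dec:.2f} MeV")
```

The ratio falls as T³, so the interactions switch off quickly once T drops below about
1 MeV.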
Even though neutrinos are no longer in chemical equilibrium, they remain in
thermal equilibrium as long as the temperature of the particle soup also evolves
like T ∝ a−1 , so that Tν = T . However, annihilations will cause a deviation from
T ∝ a−1 . The next annihilation event is the electron-positron annihilation.
As the number of relativistic degrees of freedom is reduced, energy density and
entropy are transferred from electrons and positrons to photons, but not to neutrinos,
in the annihilation reactions

e+ + e− → γ+γ .

The photons are thus heated relative to neutrinos (the photon temperature does not
fall as much). In the electron-positron annihilation, g∗s changes from

$$ g_{*s} = g_* = \underbrace{2}_{\gamma} + \underbrace{3.5}_{e^\pm} + \underbrace{5.25}_{\nu} = 10.75 \tag{5.5} $$

to

$$ g_{*s} = 2 + 5.25 \left( \frac{T_\nu}{T} \right)^3 . \tag{5.6} $$

3
This terminology comes from particle physics. The idea is that if you consider a beam of classical
particles randomly directed at a target with total area A, σ of which is covered with particles, the
probability of crossing a particle and hence interacting is σ/A.
For time 1 before the annihilation and time 2 after it, we have from (4.34)

$$ 2 a_2^3 T_2^3 + 5.25 \, a_2^3 T_{\nu 2}^3 = 10.75 \, a_1^3 T_1^3 . \tag{5.7} $$

Before the electron-positron annihilation, the neutrino temperature was the same as
the temperature of the other species, so $a_1^3 T_1^3 = a_1^3 T_{\nu 1}^3 = a_2^3 T_{\nu 2}^3$,
where we have used the fact that $T_\nu \propto a^{-1}$ throughout, since neutrinos are
relativistic and they are not heated by the electron-positron annihilation. We thus have
from (5.7)

$$ 2 \left( \frac{T}{T_\nu} \right)^3 + 5.25 = 10.75 , $$
from which we solve the neutrino temperature after e+ e− annihilation,⁴

$$ T_\nu = \left( \frac{4}{11} \right)^{1/3} T = 0.714 \, T $$

$$ g_{*s}(T) = 2 + 5.25 \cdot \frac{4}{11} = 3.909 \tag{5.8} $$

$$ g_*(T) = 2 + 5.25 \left( \frac{4}{11} \right)^{4/3} = 3.363 . $$
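The numbers in (5.8) follow from entropy conservation alone, and can be verified with a
short Python sketch. It assumes instantaneous and complete neutrino decoupling before the
annihilation (so it ignores the small correction mentioned in footnote 4); the variable
names are our own.

```python
# Entropy degrees of freedom before annihilation: photons + e+- + neutrinos.
g_photon, g_electron, g_neutrino = 2.0, 3.5, 5.25
g_before = g_photon + g_electron + g_neutrino            # 10.75

# Entropy conservation: 2 (T/T_nu)^3 + 5.25 = 10.75, so (T/T_nu)^3 = 11/4.
t_over_tnu_cubed = (g_before - g_neutrino) / g_photon
tnu_over_t = t_over_tnu_cubed ** (-1.0 / 3.0)

g_s_after = g_photon + g_neutrino * tnu_over_t**3        # entropy density
g_after = g_photon + g_neutrino * tnu_over_t**4          # energy density

print(round(tnu_over_t, 3), round(g_s_after, 3), round(g_after, 3))
# prints: 0.714 3.909 3.363
```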

These relations remain true for the photon+neutrino background as long as the
neutrinos stay ultrarelativistic (mν ≪ T ). The neutrinos are no longer in chemical or
thermal equilibrium, but they are still in kinetic equilibrium, i.e. their distribution
function has the thermal shape.
If the neutrinos are massless or their masses are so small that they can be ig-
nored, the above relation would apply even today, when the photon (the CMB)
temperature is T = T0 = 2.725 K = 0.2348 meV, giving the neutrino background
the temperature Tν0 = 0.714 · 2.725 K = 1.945 K = 0.1676 meV today. However,
neutrino oscillation experiments in the 1990s established that neutrinos have masses
which are at least in the meV range5 , and there is an upper limit of about 2 eV from
direct detection experiments and cosmology. Therefore, the neutrino background is
non-relativistic today. As neutrinos become non-relativistic, they fall out of kinetic
equilibrium, because the shape of the thermal distribution function is not preserved
as the momenta redshift to the value p ∼ m. However, once neutrinos become very
non-relativistic, with typical values of the momenta p ≪ m, the distribution function
again has the thermal shape, but with a different temperature scaling.

4
To be more precise, neutrino decoupling was not complete when e+ e− -annihilation began, so
some of the energy and entropy did leak to the neutrinos. Therefore the neutrino energy density
after e+ e− -annihilation is about 1.3% higher (at a given T ) than the above calculation gives. The
neutrino distribution also deviates slightly from kinetic equilibrium.
5
Specifically, the oscillations show that the mass differences between the neutrinos are of the
order meV. In principle, the lightest neutrino could be massless.

Figure 2: The evolution of the energy density, or rather, g∗ (T ), and its different components
through electron-positron annihilation. Since g∗ (T ) is defined as ρ/(π 2 T 4 /30), where T
is the photon temperature, the photon contribution appears constant. If we had plotted
ρ/(π 2 Tν4 /30) ∝ ρa4 instead, the neutrino contribution would appear constant, and the
photon contribution would increase at the cost of the electron-positron contribution, which
would better reflect what is going on.

5.5 Matter
We noted that the early universe is dominated by the relativistic particles, and we
can forget the nonrelativistic particles when we are considering the dynamics of
the universe. We followed one species after another becoming nonrelativistic and
disappearing from the picture, until only photons (the cosmic background radiation)
and neutrinos were left, and even the latter of these had stopped interacting.
We now return to look in more detail at what happens to nucleons and electrons.
We found that they annihilated with their antiparticles when the temperature fell
below their respective rest masses. For nucleons, the annihilation began immediately
after they were formed in the QCD phase transition. There were however slightly
more particles than antiparticles, and this small excess of particles was left over.
(This has to be the case because we observe electrons and nucleons today.) This
means that the chemical potential µB associated with baryon number differs from
zero (it is positive). Baryon number is a conserved quantity in the eras we are
considering (though not before the electroweak phase transition). Baryon number
resides today in nucleons (protons and neutrons; since the proton is lighter than the
neutron, free neutrons have decayed into protons, but there are neutrons in atomic
nuclei) because they are the lightest baryons. The universe is electrically neutral,
and the negative charge lies in the electrons, the lightest particles with negative
charge. Therefore the number of electrons must equal the number of protons.
The number densities etc. of the electrons and the nucleons we get from the
equations of chapter 4. But what is the value of the chemical potential µ? For each
species, we get µ(T ) from the conserved quantities6 . The baryon number resides in
the nucleons,
nB = nN − nN̄ = np + nn − np̄ − nn̄ . (5.9)
Let us define the parameter η, the baryon-photon ratio today,

$$ \eta \equiv \frac{n_B(t_0)}{n_\gamma(t_0)} . \tag{5.10} $$

From observations we know that $\eta \approx 6 \times 10^{-10}$. (We take a closer look at this
number in the next chapter.) Since baryon number is conserved, $n_B V \propto n_B a^3$ stays
constant, so

$$ n_B \propto a^{-3} . \tag{5.11} $$

After electron annihilation $n_\gamma \propto a^{-3}$, so we get

$$ n_B(T) = \eta \, n_\gamma = \eta \, \frac{2\zeta(3)}{\pi^2} T^3 \quad \text{for } T \ll m_e . \tag{5.12} $$

We can put (5.11) and (5.12) together and replace $a^{-3}$ using the relation (4.34)
between the temperature and the scale factor to obtain

$$ n_B(T) = \eta \, \frac{2\zeta(3)}{\pi^2} \frac{g_{*s}(T)}{g_{*s}(T_0)} T^3 . \tag{5.13} $$
6
In general, the recipe to find how the thermodynamical parameters evolve in the expanding FRW
universe is to use the conservation laws of the conserved number densities, entropy conservation
and the energy-momentum tensor continuity relation, to find how the number densities and energy
densities evolve. The other thermodynamical parameters will then evolve so as to satisfy these
requirements.

For T < 10 MeV we have

nN̄ ≪ nN and nN ≡ nn + np = nB .

In the next chapter, we will discuss big bang nucleosynthesis, i.e. how the protons
and neutrons form atomic nuclei. Approximately one quarter of all nucleons (all
neutrons and roughly the same number of protons) form nuclei (A > 1) and three
quarters remain as free protons. Let us denote by n∗p and n∗n the number densities
of protons and neutrons including those in nuclei (and also those in atoms), whereas
we shall use np and nn for the number densities of free protons and neutrons, which
are not bound with each other or electrons. We thus write

n∗N ≡ n∗n + n∗p = nB .

In the same manner, for T < 10 keV we have

ne+ ≪ ne− and ne− = n∗p .

At this time (T ∼ 10 keV → 1 eV) the universe contains a relativistic photon and
neutrino background (“radiation”) and nonrelativistic free electrons, protons, and
nuclei (“matter”). Since $\rho \propto a^{-4}$ for radiation, but $\rho \propto a^{-3}$ for matter,
the energy density in radiation eventually falls below the energy density in matter:
the universe becomes matter-dominated.
The above discussion is in terms of the known particle species. Today there is
much indirect observational evidence for the existence of what is called cold dark
matter (CDM), which is supposedly made out of some yet undiscovered species of
particles. The CDM particles interact weakly with normal matter (they decouple
early), and their energy density contribution should be small when we are well in
the radiation-dominated era, so they do not affect the above discussion much. They
become nonrelativistic early and dominate the matter density of the universe today
(there is about five to six times as much mass in CDM as there is in baryons). Thus
the CDM causes the universe to become matter-dominated earlier than if the matter
consisted of nucleons and electrons only. The CDM will be important later when
we discuss the formation of structure in the universe. The time of matter-radiation
equality teq is calculated in an exercise at the end of this chapter.

5.6 Recombination
Radiation (photons) and matter (electrons, protons, and nuclei) remained in thermal
equilibrium for as long as there were lots of free electrons. When the temperature
became low enough the electrons and nuclei combined to form neutral atoms, an
event known as recombination7 , and the density of free electrons fell sharply. The
photon mean free path grew rapidly and became longer than the horizon distance.
Thus the universe became transparent. Photons and matter decoupled, i.e. their
interaction was no longer able to maintain them in thermal equilibrium with each
other. After this, by T we refer to the photon temperature. Today, these photons are
the CMB, and T = T0 = 2.725 K. (After photon decoupling, the matter temperature
fell at first faster than the photon temperature, but structure formation then heated
up the matter to different temperatures at different places.)

7
This is the first time when nuclei and electrons combine, so the term recombination, adopted
from chemistry, is somewhat of a misnomer.
The relevant interaction here is not weak interaction, as in the case of the
neutrinos, but instead the electromagnetic interaction between photons and electrons.
The interaction rate is $\Gamma \sim n_e \sigma_T$, where
$\sigma_T = \frac{8\pi}{3} \alpha^2 / m_e^2 \approx 2 \times 10^{-3}$ MeV$^{-2}$ is
the Thomson cross-section, and $\alpha \approx 1/137$ is the electromagnetic coupling constant.
(The $1/m^2$ factor shows that interactions between photons and nuclei are not
important, as they are suppressed by the large masses of the nuclei.) Finding the photon
decoupling era is a bit more involved than in the neutrino case, as the evolution of
the electron number density is more complicated.
To simplify the discussion, let us ignore other nuclei than protons (over 90% (by
number) of the nuclei are protons, and almost all the rest are 4 He nuclei). Let us
denote the number density of free protons by np , free electrons by ne , and hydrogen
atoms by nH . Since the universe is electrically neutral, np = ne . The conservation
of baryon number gives $n_B = n_p + n_H$. From chapter 4 we have

$$ n_i = g_i \left( \frac{m_i T}{2\pi} \right)^{3/2} e^{\frac{\mu_i - m_i}{T}} . \tag{5.14} $$

For as long as the reaction

$$ p + e^- \to H + \gamma \tag{5.15} $$

is in chemical equilibrium the chemical potentials are related by $\mu_p + \mu_e = \mu_H$ (since
$\mu_\gamma = 0$). Using this we get the relation

$$ n_H = \frac{g_H}{g_p g_e} \, n_p n_e \left( \frac{m_e T}{2\pi} \right)^{-3/2} e^{B/T} \tag{5.16} $$

between the number densities. Here $B = m_p + m_e - m_H = 13.6$ eV is the binding
energy of hydrogen. The numbers of internal degrees of freedom are $g_p = g_e = 2$,
$g_H = 4$. Outside the exponent we approximated $m_H \approx m_p$. Defining the fractional
ionisation

$$ x \equiv \frac{n_p}{n_B} , \tag{5.17} $$

equation (5.16) becomes

$$ \frac{1 - x}{x^2} = \frac{4\sqrt{2} \, \zeta(3)}{\sqrt{\pi}} \, \eta \left( \frac{T}{m_e} \right)^{3/2} e^{B/T} , \tag{5.18} $$

the Saha equation for ionisation in thermal equilibrium. When $B \ll T \ll m_e$, the
RHS ≪ 1 so that x ∼ 1, and almost all protons and electrons are free. As temperature
falls, $e^{B/T}$ grows, but since both η and $(T/m_e)^{3/2}$ are ≪ 1, the temperature
needs to fall to T ≪ B, before the whole expression becomes large (∼ 1 or ≫ 1).
The ionisation fraction at first follows the equilibrium result of (5.18) closely, but
as this equilibrium fraction begins to fall rapidly, the true ionisation fraction begins
to lag behind. As the number densities of free electrons and protons fall, it becomes
more difficult for them to find each other to “recombine”, and they are no longer
able to maintain chemical equilibrium for the reaction (5.15). To find the correct
ionisation evolution, x(t), requires then a more complicated calculation involving
the reaction cross section of this reaction. See figures 3 and 4.
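The equilibrium part of this story is easy to explore numerically. The following Python
sketch solves the Saha equation (5.18) for x at a given temperature and finds, by
bisection, the temperature at which the equilibrium ionisation fraction is x = 0.5. It
describes only the equilibrium track, not the true (Peebles) evolution; the function
names, the bisection bracket and the values of the physical constants are our own
assumptions for this illustration.

```python
import math

ZETA3 = 1.2020569        # Riemann zeta(3)
M_E_EV = 510999.0        # electron mass in eV
B_EV = 13.6              # hydrogen binding energy in eV
K_B_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K
T0_K = 2.725             # CMB temperature today in K


def saha_rhs(T_ev, eta):
    """Right-hand side of the Saha equation (5.18)."""
    prefactor = 4.0 * math.sqrt(2.0) * ZETA3 / math.sqrt(math.pi)
    return prefactor * eta * (T_ev / M_E_EV) ** 1.5 * math.exp(B_EV / T_ev)


def x_equilibrium(T_ev, eta):
    """Positive root of S x^2 + x - 1 = 0, written in a numerically stable form."""
    S = saha_rhs(T_ev, eta)
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * S))


def half_ionisation_temperature(eta, lo=0.1, hi=1.0):
    """Bisection for x_eq(T) = 0.5; x_eq increases with T on the bracket [lo, hi] eV."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if x_equilibrium(mid, eta) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


eta = 6.0e-10
T_half = half_ionisation_temperature(eta)
z_half = T_half / (K_B_EV_PER_K * T0_K) - 1.0
print(f"x_eq = 0.5 at T = {T_half:.3f} eV, 1 + z = {1.0 + z_half:.0f}")
```

For η of order 10⁻⁹ this gives a temperature of about 0.3 eV, far below the binding
energy B = 13.6 eV.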
Although the equilibrium formula is thus not enough to give us the true ionisation
evolution, its benefit is twofold:
Figure 3: Recombination. In the top panel the dashed curve gives the equilibrium ionisation
fraction as given by the Saha equation. The solid curve is the true ionisation fraction,
calculated using the actual reaction rates (original calculation by Peebles). You can see that
the equilibrium fraction is followed at first, but then the true fraction lags behind. The
bottom panel shows the free electron number density ne and the photon mean free path
λγ. The latter is given in comoving units, i.e., the distance is scaled to the corresponding
present distance. This figure is for η = 8.22 × 10^−10. (Figure by R. Keskitalo.)
[Plot not reproduced. Top panel: x from 0 to 1, curves labelled Saha and Peebles. Bottom
panel: ne/m^−3 and λγ/Mpc on logarithmic axes. Horizontal axis: 1 + z from 1900 down to 900.]

Figure 4: Same as figure 3, but with a logarithmic scale for the ionisation fraction, and the
redshift scale extended to present time (z = 0 or 1 + z = 1). You can see that a residual
ionisation x ∼ 10^−4 remains. This figure does not include reionisation, which happened
around z ∼ 10. (Figure by R. Keskitalo.)
[Plot not reproduced. Top panel: x on a logarithmic axis from 10^−5 to 10, curves labelled
Saha and Peebles. Bottom panel: ne/m^−3 and λγ/Mpc on logarithmic axes. Horizontal axis:
1 + z from 1800 down to 0.]

1. It tells us when recombination begins. While the equilibrium ionisation changes
   only very slowly, it is easy to stay in equilibrium. Thus things won’t start to
   happen until the equilibrium fraction begins to change a lot.

2. It gives the initial conditions for the more complicated calculation that will
give the true evolution.

A similar situation holds for many other events in the early universe, e.g. big bang
nucleosynthesis.
Recombination is not instantaneous. Let us define the recombination temperature
$T_{\rm rec}$ as the temperature where x = 0.5. Now $T_{\rm rec} = T_0 (1 + z_{\rm rec})$ since
$1 + z = a^{-1}$ and the photon temperature falls as $T \propto a^{-1}$. (Since η ≪ 1, the energy
release in recombination is negligible compared to $\rho_\gamma$; and after photon decoupling
photons travel freely maintaining kinetic equilibrium with $T \propto a^{-1}$.)
We get (for $\eta \sim 10^{-9}$)

$$ T_{\rm rec} \sim 0.3 \ {\rm eV} , \qquad z_{\rm rec} \sim 1300 . $$

You might have expected that Trec ∼ B. Instead we found Trec ≪ B. The main
reason for this is that η ≪ 1. This means that there are very many photons for each
hydrogen atom. Even when T ≪ B, the high-energy tail of the photon distribution
contains photons with energy E > B so that they can ionise a hydrogen atom.
The photon decoupling takes place somewhat later, at $T_{\rm dec} \equiv (1 + z_{\rm dec}) T_0$, when
the ionisation fraction has fallen enough. We define the photon decoupling time to
be the time when the photon mean free path exceeds the Hubble distance. The
numbers are roughly

$$ T_{\rm dec} \sim 3000 \ {\rm K} \sim 0.26 \ {\rm eV} , \qquad z_{\rm dec} \sim 1100 . $$

The decoupling means that the recombination reaction can no longer keep the
ionisation fraction on the equilibrium track, but instead we are left with a residual
ionisation of $x \sim 10^{-4}$.
A long time later (at z ∼ 10) the first stars form, and their radiation reionises
the gas that is left in interstellar space. The gas has now such a low density, however,
that the universe remains transparent.
Exercise: Transparency of the universe. We say the universe is transparent
when the photon mean free path $\lambda_\gamma$ is larger than the Hubble length $l_H = H^{-1}$,
and opaque when $\lambda_\gamma < l_H$. The photon mean free path is determined mainly by
the scattering of photons by free electrons, so that $\lambda_\gamma = 1/(\sigma_T n_e)$, where
$n_e = x n_e^*$ is the number density of free electrons, $n_e^*$ is the total number density of
electrons, and x is the ionisation fraction. The cross section for photon-electron scattering
is independent of energy for $E_\gamma \ll m_e$ and is then called the Thomson cross section,
$\sigma_T = \frac{8\pi}{3} (\alpha / m_e)^2$, where α is the fine-structure constant. In recombination
x falls from 1 to $10^{-4}$. Show that the universe is opaque before recombination and
transparent after recombination. (Assume the recombination takes place
instantly at z = 1300. You can assume a matter-dominated universe; see below
for parameter values.) The interstellar matter gets later reionised (to x ∼ 1) by the
light from the first stars. What is the earliest redshift when this can happen without
making the universe opaque again? (You can assume that most (∼ all) matter has
remained interstellar.) Calculate for $\Omega_{m0} = 1.0$ and $\Omega_{m0} = 0.3$ (note that $\Omega_m$ also
includes nonbaryonic matter). Use $\Omega_\Lambda = 0$, h = 0.7 and $\eta = 6 \times 10^{-10}$.

Figure 5: The CMB frequency spectrum as measured by the FIRAS instrument on the
COBE satellite [2]. This first spectrum from FIRAS is based on just 9 minutes of
measurements. The CMB temperature estimated from it was T = 2.735 ± 0.060 K. The final result
from FIRAS is T = 2.725 ± 0.002 K (95% confidence) [3]. Using data from other experiments
as well, the best current value is T0 = 2.72548 ± 0.00057 K (68% confidence) [4].
The photons in the cosmic background radiation have thus travelled almost
without scattering through space all the way since we had T = Tdec ∼ 1100 T0.⁸ When we
look at this cosmic background radiation we thus see the universe (its faraway parts
near our horizon) as it was at that early time. Because of the redshift, these
photons which were then largely in the visible part of the spectrum, have now become
microwave photons, so this radiation is now called the cosmic microwave background
(CMB). It still maintains the thermal equilibrium distribution. This was confirmed
to high accuracy by the FIRAS (Far InfraRed Absolute Spectrophotometer)
instrument on the COBE (Cosmic Background Explorer) satellite in 1989. John Mather
received the 2006 Physics Nobel Prize for this measurement of the CMB frequency
(photon energy) spectrum (see figure 5).⁹
We shall now, for a while, stop the detailed discussion of these events,
recombination and photon decoupling. The universe is about 400 000 years old now.
Next, gravitationally bound structures start to form as gravity attracts matter into
overdense regions. Before photon decoupling the radiation pressure from photons
prevented this. But before going to the physics of structure formation, we discuss
some earlier events (big bang nucleosynthesis, dark matter decoupling and inflation)
in more detail.

8
The probability for a photon to have one or more scatterings between decoupling and today is
about 10%.
9
He shared the Nobel Prize with George Smoot, who got it for the discovery of the CMB
anisotropy with the DMR instrument on the same satellite. The CMB anisotropy will be discussed
in the second part of the course.

5.7 The Dark Ages


How would the universe after recombination appear to an observer with human eyes?
At first one would see a uniform red glow everywhere, since the wavelengths of the
CMB photons are in the visible range. (It would also feel rather hot, 3000 K.) As
time goes on this glow gets dimmer and dimmer as the photons redshift towards
the infrared, and after a few million years it gets completely dark. There are no
stars yet. This era is often called the
Dark Ages of the universe. It lasts several hundred million years. While it lasts, it
gradually gets cold. In the dark, however, masses are gathering together. And then,
one by one, the first stars light up.
It seems that the star-formation rate peaked between redshifts z = 1 and z = 2.
Thus the universe at a few billion years was brighter than it is today, since the
brightest stars are short-lived, and the galaxies were closer to each other then.¹⁰

5.8 The radiation and neutrino backgrounds


While the starlight is more visible to us than the cosmic microwave background,
its average energy density and photon number density in the universe are much lower.
Thus the photon density is essentially given by the CMB. The number density of CMB
photons today (T0 = 2.725 K) is

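$$ n_{\gamma 0} = \frac{2\zeta(3)}{\pi^2} T_0^3 = 410.5 \ \text{photons/cm}^3 \tag{5.19} $$

and the energy density is

$$ \rho_{\gamma 0} = \frac{\pi^2}{15} T_0^4 = 2.701 \, T_0 n_{\gamma 0} = 4.641 \times 10^{-31} \ \text{kg/m}^3 . \tag{5.20} $$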
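The numbers in (5.19) and (5.20) are given in natural units converted to laboratory
units. The Python sketch below spells out the conversion; the values used for k_B, ħc
and the eV-to-kg factor are standard constants that we supply here, and the variable
names are our own.

```python
import math

ZETA3 = 1.2020569          # Riemann zeta(3)
K_B = 8.617333e-5          # Boltzmann constant in eV/K
HBAR_C = 1.973270e-5       # hbar c in eV cm
EV_TO_KG = 1.782662e-36    # mass equivalent of 1 eV in kg
T0_K = 2.725               # CMB temperature today in K

T0_eV = K_B * T0_K                              # temperature in energy units
inverse_cm3 = (T0_eV / HBAR_C) ** 3             # T0^3 expressed in cm^-3

n_gamma = 2.0 * ZETA3 / math.pi**2 * inverse_cm3            # photons per cm^3
rho_eV_cm3 = math.pi**2 / 15.0 * T0_eV * inverse_cm3        # energy density in eV/cm^3
rho_kg_m3 = rho_eV_cm3 * 1.0e6 * EV_TO_KG                   # mass density in kg/m^3

print(f"n_gamma0   = {n_gamma:.1f} photons/cm^3")
print(f"rho_gamma0 = {rho_kg_m3:.3e} kg/m^3")
print(f"mean photon energy / T0 = {rho_eV_cm3 / (n_gamma * T0_eV):.3f}")
```

The last line reproduces the factor 2.701 relating the energy density to T0 nγ0.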
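Since the critical density today is

$$ \rho_{c0} = \frac{3H_0^2}{8\pi G} = h^2 \cdot 1.8788 \times 10^{-26} \ \text{kg/m}^3 \tag{5.21} $$

we get for the photon density parameter

$$ \Omega_{\gamma 0} \equiv \frac{\rho_{\gamma 0}}{\rho_{c0}} = 2.47 \times 10^{-5} h^{-2} . \tag{5.22} $$

While relativistic, neutrinos contribute another radiation component

$$ \rho_\nu = \frac{7 N_\nu}{8} \frac{\pi^2}{15} T_\nu^4 . \tag{5.23} $$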
10
Although galaxies seen from far away are rather faint objects, difficult to see with the unaided
eye. If you were suddenly transported to a random location in the present universe, you might not
be able to see anything. To enjoy the spectacle, our hypothetical observer should be located within
a forming galaxy, or have a good telescope.

After e+ e− annihilation this gives

$$ \rho_\nu = \frac{7 N_\nu}{8} \left( \frac{4}{11} \right)^{4/3} \rho_\gamma , \tag{5.24} $$

where $N_\nu = 3$ is the number of neutrino species.
When the number of neutrino species was not yet known, cosmology (BBN)
was used to constrain it. Big bang nucleosynthesis is sensitive to the expansion
rate in the early universe, and that depends on the energy density. Observations
require Nν = 2–4. Actually any new particle species that would be relativistic
at nucleosynthesis time (T ∼ 50 keV – 1 MeV) and would thus contribute to the
expansion rate through its energy density, but which would not interact directly
with nuclei and electrons, would have the same effect. Thus such unknown particles
may not contribute to the energy density of the universe at that time more than one
neutrino species does.
If we take (5.24) to define Nν , but then take into account the extra contribution
to ρν from energy leakage during e+ e− -annihilation (and some other small effects),
we get (as a result of years of hard work by many theorists)

Nν = 3.046 . (5.25)

(This does not mean that there are 3.046 neutrino species, but that the total energy
density in neutrinos is 3.046 times as much as the energy density one neutrino species
would contribute had it decoupled completely before e+ e− annihilation.)
If neutrinos were still relativistic today, the neutrino density parameter would be
$$ \Omega_{\nu 0} = \frac{7 N_\nu}{22} \left(\frac{4}{11}\right)^{1/3} \Omega_{\gamma 0} = 1.71 \times 10^{-5}\, h^{-2} , \tag{5.26} $$
so the total radiation density parameter would be

$$ \Omega_{r0} = \Omega_{\gamma 0} + \Omega_{\nu 0} = 4.18 \times 10^{-5}\, h^{-2} \sim 10^{-4} . \tag{5.27} $$
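The arithmetic behind (5.24), (5.26) and (5.27) can be sketched in a few lines of Python. The input numbers are those quoted in the text; the variable names are illustrative only.

```python
# Inputs quoted in the text.
N_NU = 3.046            # effective number of neutrino species, (5.25)
OMEGA_GAMMA = 2.47e-5   # omega_gamma = Omega_gamma0 * h^2, from (5.22)

# Neutrino-to-photon energy density ratio after e+ e- annihilation, (5.24).
# Note (7/8)(4/11)^(4/3) = (7/22)(4/11)^(1/3), which is the form used in (5.26).
nu_to_gamma = 7.0 * N_NU / 8.0 * (4.0 / 11.0) ** (4.0 / 3.0)

omega_nu = nu_to_gamma * OMEGA_GAMMA   # (5.26), valid only for massless neutrinos
omega_r = OMEGA_GAMMA + omega_nu       # (5.27)

print(f"rho_nu / rho_gamma = {nu_to_gamma:.4f}")
print(f"omega_nu = {omega_nu:.3e}")
print(f"omega_r  = {omega_r:.3e}")
```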

We thus confirm the claim in chapter 3 that the radiation component can be ignored
in the Friedmann equation, except in the early universe. The combination Ωi h2 is
often denoted by ωi , so we have

$$ \omega_\gamma = 2.47 \times 10^{-5} \tag{5.28} $$
$$ \omega_\nu = 1.71 \times 10^{-5} \tag{5.29} $$
$$ \omega_r = \omega_\gamma + \omega_\nu = 4.18 \times 10^{-5} . \tag{5.30} $$

As noted earlier, neutrinos have masses in the meV to eV range. Thus neutrinos
are nonrelativistic today and count as matter, not radiation, so the above result for
the neutrino energy density does not apply. However, unless the neutrino masses
are above 0.2 eV, they would still have been relativistic, and counted as radiation,
at the time of recombination and matter-radiation equality. While the neutrinos are
relativistic, we get neutrino energy density

$$ \rho_\nu = \Omega_{\nu 0}\, \rho_{c0}\, a^{-4} \tag{5.31} $$

using Ων0 from (5.26), even though Ων0 does not give the present density of neutrinos.

Today, even though the photon and neutrino backgrounds do not dominate the
energy density of the universe any more, they do dominate the entropy density.
Exercise: Matter–radiation equality. The present density of matter is
ρm0 = Ωm0 ρc0 and the present density of radiation is ρr0 = ργ0 + ρν0 (we assume
neutrinos are massless). What was the age of the universe teq when ρm = ρr ? (Note
that at these early times, but not today, you can ignore the curvature and vacuum
terms in the Friedmann equation.) Give numerical values (in years) for the cases Ωm0
= 0.1, 0.3, and 1.0, and H0 = 70 km/s/Mpc. What was the temperature at that
time, Teq ?

References
[1] K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38 (2014) 090001,
http://pdg.lbl.gov/2014/listings/contents_listings.html

[2] J.C. Mather et al., Astrophys. J. Lett. 354 (1990) 37.

[3] J.C. Mather et al., Astrophys. J. 512 (1999) 511.

[4] D.J. Fixsen, Astrophys. J. 707 (2009) 916.


6 Big Bang Nucleosynthesis
One quarter (by mass) of baryonic matter in the universe is helium, heavier elements
make up a few per cent, and the rest is hydrogen.
The building blocks of atomic nuclei, the nucleons, or protons and neutrons, are
formed in the QCD phase transition at T ∼ 150 MeV and t ∼ 20 µs. Elements
heavier than lithium, up to iron, cobalt, and nickel, have been made from lighter
elements by fusion reactions in stars. These reactions provide the energy source for
the stars. Elements heavier than these have been formed in supernova explosions.
However, the amount of helium and some other light isotopes in the universe cannot
be understood by these mechanisms. It turns out that 2 H, 3 He, 4 He, and 7 Li were
mainly produced already during the first hour of the universe, in a process called
Big Bang Nucleosynthesis (BBN).
Nucleons and antinucleons annihilated each other soon after the QCD phase
transition, and the small excess of nucleons left over from annihilation did not have
a significant effect on the expansion and thermodynamics of the universe until much
later, when the universe became matter-dominated ($t_{\rm eq} \approx 1000\, \omega_m^{-2}$ years). The
ordinary matter in the present universe comes from this small excess of nucleons.
Let us now consider what happened to it in the early universe. We will focus on the
period when the temperature fell from T ∼ 10 MeV to T ∼ 10 keV (from t ∼ 0.01 s
to a few hours).

6.1 Equilibrium
The total number of nucleons minus antinucleons stays constant due to baryon
number conservation. In the temperature range under consideration, the number
density of antinucleons is negligible. This baryon number can be in the form of
protons and neutrons or atomic nuclei. Weak nuclear reactions convert neutrons
and protons into each other and strong nuclear reactions build nuclei from them.
During the period of interest the nucleons and nuclei are nonrelativistic (T ≪
mp ). Assuming thermal equilibrium we have
$$ n_i = g_i \left(\frac{m_i T}{2\pi}\right)^{3/2} e^{(\mu_i - m_i)/T} \tag{6.1} $$

for the number density of nucleus type i. If the nuclear reactions needed to build
nucleus i (with mass number A and charge Z) from the nucleons,

(A − Z)n + Zp ↔ i

occur at sufficiently high rate to maintain chemical equilibrium, we have

µi = (A − Z)µn + Zµp (6.2)

for the chemical potentials. Since for free nucleons we have


$$ n_p = 2 \left(\frac{m_p T}{2\pi}\right)^{3/2} e^{(\mu_p - m_p)/T} $$
$$ n_n = 2 \left(\frac{m_n T}{2\pi}\right)^{3/2} e^{(\mu_n - m_n)/T} , \tag{6.3} $$


Nucleus   B_i        g_i
^2H       2.22 MeV   3
^3H       8.48 MeV   2
^3He      7.72 MeV   2
^4He      28.3 MeV   1
^12C      92.2 MeV   1

Table 1. Some of the lightest nuclei and their binding energies.

we can express ni in terms of the neutron and proton densities,


$$ n_i = g_i\, A^{3/2}\, 2^{-A} \left(\frac{2\pi}{m_N T}\right)^{\frac{3}{2}(A-1)} n_p^Z\, n_n^{A-Z}\, e^{B_i/T} , \tag{6.4} $$

where
Bi ≡ Zmp + (A − Z)mn − mi (6.5)
is the binding energy of the nucleus. Here we have approximated mp ≈ mn ≈ mi /A
outside the exponent, and denoted it by mN (“nucleon mass”).
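The step from (6.1)–(6.3) to (6.4) can be spelled out as follows. This is only a sketch of the algebra, using the definitions above and the approximation m_p ≈ m_n ≈ m_i/A ≡ m_N outside the exponent.

```latex
% Solve (6.3) for the exponentials of the free-nucleon chemical potentials:
e^{\mu_p/T} = \frac{n_p}{2}\left(\frac{2\pi}{m_p T}\right)^{3/2} e^{m_p/T},
\qquad
e^{\mu_n/T} = \frac{n_n}{2}\left(\frac{2\pi}{m_n T}\right)^{3/2} e^{m_n/T}.

% Chemical equilibrium (6.2) gives
% e^{\mu_i/T} = (e^{\mu_n/T})^{A-Z} (e^{\mu_p/T})^{Z}, so (6.1) becomes
n_i = g_i\left(\frac{m_i T}{2\pi}\right)^{3/2} 2^{-A}
      \left(\frac{2\pi}{m_N T}\right)^{3A/2}
      n_p^{Z}\, n_n^{A-Z}\,
      e^{\left[Z m_p + (A-Z) m_n - m_i\right]/T}.

% With m_i \approx A m_N in the prefactor and the definition (6.5) of B_i:
n_i = g_i\, A^{3/2}\, 2^{-A}
      \left(\frac{2\pi}{m_N T}\right)^{\frac{3}{2}(A-1)}
      n_p^{Z}\, n_n^{A-Z}\, e^{B_i/T}.
```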
The different number densities add up to the total baryon number density
$$ \sum_i A_i n_i = n_B . \tag{6.6} $$

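The baryon number density nB we get from the photon density
$$ n_\gamma = \frac{2}{\pi^2}\, \zeta(3)\, T^3 \tag{6.7} $$
and the baryon/photon ratio
$$ \frac{n_B}{n_\gamma} = \frac{g_{*s}(T)}{g_{*s}(T_0)}\, \eta \tag{6.8} $$
as
$$ n_B = \frac{g_{*s}(T)}{g_{*s}(T_0)}\, \eta\, \frac{2}{\pi^2}\, \zeta(3)\, T^3 . \tag{6.9} $$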
After electron-positron annihilation g∗s (T ) = g∗s (T0 ) and nB = ηnγ . Here η is the
present baryon/photon ratio. It can be estimated from various observations, and it
is about 6 × 10−10 .
For temperatures mN ≫ T ≳ Bi we have
$$ (m_N T)^{3/2} \gg T^3 \gg n_B > n_p ,\, n_n $$
and thus (6.4) implies that
$$ n_i \ll n_p ,\, n_n $$
for A > 1. Thus initially there are only free neutrons and protons in large numbers.

6.2 Neutron-proton ratio


What can we say about np and nn ? Protons and neutrons are converted into each
other by the weak interaction in the reactions

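$$ n + \nu_e \leftrightarrow p + e^- $$
$$ n + e^+ \leftrightarrow p + \bar\nu_e \tag{6.10} $$
$$ n \leftrightarrow p + e^- + \bar\nu_e . $$

If these reactions are in equilibrium, we have µn + µνe = µp + µe , and the
neutron/proton ratio is
$$ \frac{n_n}{n_p} \equiv \frac{n}{p} = e^{-Q/T + (\mu_e - \mu_{\nu_e})/T} , \tag{6.11} $$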
where Q ≡ mn − mp = 1.293 MeV.
We need now some estimate of the chemical potentials of electrons and electron
neutrinos. The universe is electrically neutral, so the number of electrons (or ne− −
ne+ ) equals the number of protons, and µe can be calculated exactly in terms of
η and T . We leave the exact calculation as an exercise, but give below a rough
estimate for the ultrarelativistic limit (T ≫ me ):
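$$ n_{e^-} - n_{e^+} = \frac{2T^3}{6\pi^2}\left[\pi^2\, \frac{\mu_e}{T} + \left(\frac{\mu_e}{T}\right)^3\right] = n_p^* < n_B \approx \eta\, n_\gamma = \eta\, \frac{2}{\pi^2}\, \zeta(3)\, T^3 . \tag{6.12} $$
Here $n_p^*$ includes the protons inside nuclei. Since η is small, µe ≪ T , and we can drop
the $(\mu_e/T)^3$ term to get
$$ \frac{\mu_e}{T} \lesssim \frac{6}{\pi^2}\, \zeta(3)\, \eta . \tag{6.13} $$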
Thus µe /T ∼ η ∼ 10−9 . The nonrelativistic limit can be dealt with in a similar
manner. It turns out that µe rises as T falls, and somewhere between T = 30 keV
and T = 10 keV µe becomes larger than T , and, in fact, comparable to me .
For T ≳ 30 keV, µe ≪ T , and we can drop µe in (6.11).
Since we have not measured the cosmic neutrino background (and will not do
so in the foreseeable future, as neutrinos interact so weakly), we don’t know the
neutrino chemical potentials. Usually it is assumed that the neutrino asymmetry is
small, like the baryon asymmetry, so that |µνe | ≪ T . The observational upper limit
from BBN is |µνe |/T ≲ 0.1; if neutrinos are their own antiparticles, their chemical
potentials are exactly zero. Thus, we ignore both µe and µνe and get the equilibrium
neutron/proton ratio
$$ \frac{n}{p} = e^{-Q/T} . \tag{6.14} $$
(This is not valid for T ≲ 30 keV, since then µe is no longer small, but we will use
this formula only at higher temperatures.)
We can thus express the number densities of all nuclei in terms of the free proton
number density np , as long as chemical equilibrium holds.
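To get a feeling for (6.14), the following Python sketch evaluates the equilibrium neutron/proton ratio at a few temperatures. The function name and the sample temperatures are chosen for illustration only.

```python
import math

Q_MEV = 1.293  # neutron-proton mass difference Q = m_n - m_p in MeV, from (6.11)

def neutron_to_proton_ratio(temperature_mev: float) -> float:
    """Equilibrium n/p ratio from (6.14), neglecting the chemical potentials."""
    return math.exp(-Q_MEV / temperature_mev)

# Illustrative temperatures in MeV (chosen for this sketch only).
for temperature in (10.0, 3.0, 1.0, 0.8):
    ratio = neutron_to_proton_ratio(temperature)
    print(f"T = {temperature:5.1f} MeV   n/p = {ratio:.3f}")
```

At T ≫ Q the ratio approaches one, and it drops steeply once T falls below Q.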

6.3 Bottlenecks
We define the mass fraction of nucleus i as
$$ X_i \equiv \frac{A_i n_i}{n_B} . \tag{6.15} $$
Since
$$ n_B = \sum_i A_i n_i , \tag{6.16} $$
(where the sum includes protons and neutrons), we have $\sum_i X_i = 1$.
Using also the normalization condition (6.16), or $\sum_i X_i = 1$,¹ we get all equilibrium
abundances as a function of T (they also depend on the value of the parameter
η). There are two items to note:

1. The normalization condition (6.16) includes all nuclei up to uranium and be-
yond. Thus we would get a huge polynomial equation from which to solve
Xp .

2. In practice we don’t have to care about the first item, since as the tempera-
ture falls the nuclei no longer follow their equilibrium abundances. The reac-
tions are in equilibrium only at high temperatures, when the other equilibrium
abundances except Xp and Xn are small, and we can use the approximation
Xn + Xp = 1.

In the early universe the baryon density is too low and the time available is
too short for reactions involving three or more incoming nuclei to occur at any
appreciable rate. The heavier nuclei have to be built sequentially from lighter nuclei
in two-particle reactions, so deuterium is formed first in the reaction

n+p → d + γ.

Only when deuterons are available can helium nuclei be formed, and so on. This
process has “bottlenecks”: the lack of sufficient densities of lighter nuclei hinders
the production of heavier nuclei, and prevents them from following their equilibrium
abundances.
As the temperature falls, the equilibrium abundances rise fast. They become
large later for nuclei with small binding energies. Since deuterium is formed directly
from neutrons and protons it can follow its equilibrium abundance as long as there
are large numbers of free neutrons available. Since the deuterium binding energy is
rather small, the deuterium abundance becomes large rather late (at T < 100 keV).
Therefore heavier nuclei with larger binding energies, whose equilibrium abundances
would become large earlier, cannot be formed. This is the deuterium bottleneck.
Only when there is lots of deuterium (Xd ∼ 10−3 ) can helium be produced in large
numbers.
The nuclei are positively charged so they repel each other electromagnetically.
The nuclei need large kinetic energies to overcome this Coulomb barrier and get
within the range of the strong interaction. Thus the cross sections for these fusion
reactions fall rapidly with decreasing energy and the nuclear reactions are “shut off” when the
temperature falls below T ∼ 30 keV. Thus there is less than one hour available for
nucleosynthesis. Because of the short time available and additional bottlenecks (e.g.
there are no stable nuclei with A = 8), only very small amounts of elements heavier
than helium are formed.
¹ For np and nn we know just their ratio, since we do not know µp and µn , only that µp = µn .
Therefore this extra equation is needed to solve all ni .

6.4 Calculation of the helium abundance


Let us now calculate the numbers. For T > 0.1 MeV, we still have Xn + Xp ≈ 1, so
the equilibrium abundances are

$$ X_n = \frac{e^{-Q/T}}{1 + e^{-Q/T}} \quad \text{and} \quad X_p = \frac{1}{1 + e^{-Q/T}} . \tag{6.17} $$
Nucleons follow these equilibrium abundances until neutrinos decouple at T ∼
0.8 MeV, shutting off the weak n ↔ p reactions. After this, free neutrons decay, so

$$ X_n(t) = X_n(t_1)\, e^{-(t - t_1)/\tau_n} , \tag{6.18} $$

where τn = 880.0 ± 0.9 s is the mean lifetime of a free neutron². (The half-life is
τ1/2 = (ln 2)τn .) In reality, the decoupling and thus the shift from behavior (6.17)
to behavior (6.18) is not instantaneous, but an approximation where one takes it to
be instantaneous at time t1 when T = 0.8 MeV, so that Xn (t1 ) = 0.1657, gives a
fairly accurate final result.
The equilibrium mass fractions are, from (6.4),
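$$ X_i = \frac{1}{2}\, X_p^Z\, X_n^{A-Z}\, g_i\, A^{5/2}\, \epsilon^{A-1}\, e^{B_i/T} \tag{6.19} $$
where
$$ \epsilon \equiv \frac{1}{2}\left(\frac{2\pi}{m_N T}\right)^{3/2} n_B = \frac{1}{\pi^2}\, \zeta(3) \left(\frac{2\pi T}{m_N}\right)^{3/2} \frac{g_{*s}(T)}{g_{*s}(T_0)}\, \eta \sim \left(\frac{T}{m_N}\right)^{3/2} \eta . $$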

The factors which change rapidly with T are $\epsilon^{A-1} e^{B_i/T}$. For temperatures mN ≫
T ≫ Bi we have $e^{B_i/T} \sim 1$ and ε ≪ 1. Thus Xi ≪ 1 for all nuclei (A > 1) other than
protons and neutrons. As the temperature falls, ε becomes even smaller and at T ∼ Bi
we have Xi ≪ 1 still. The temperature has to fall below Bi by a large factor before
the factor $e^{B_i/T}$ wins and the equilibrium abundance becomes large.
Deuterium has Bd = 2.22 MeV, so we get $\epsilon\, e^{B_d/T} = 1$ at Td = 0.06 MeV–0.07
MeV (assuming $\eta = 10^{-10}$–$10^{-9}$), so the deuterium abundance becomes large
close to this temperature. Since $^4$He has a much higher binding energy, B4 = 28.3
MeV, the corresponding situation $\epsilon^3 e^{B_4/T} = 1$ occurs at a higher temperature T4 ∼
0.3 MeV. But we noted earlier that only deuterium stays close to its equilibrium
abundance once it gets large. Helium begins to form only when there is sufficient
deuterium available, in practice slightly above Td . Helium then forms rapidly. The
available number of neutrons sets an upper limit to $^4$He production. Since helium
has the highest binding energy per nucleon (of all isotopes below A = 12), almost
all neutrons end up in $^4$He, and only small amounts of the other light isotopes, $^2$H,
$^3$H, $^3$He, $^7$Li, and $^7$Be, are produced.
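The condition ε e^{B_d/T} = 1 has no closed-form solution for T, but it is easy to solve numerically. The Python sketch below does this by bisection. The nucleon mass value, the choice g∗s(T)/g∗s(T0) = 1 (strictly valid only after electron-positron annihilation) and all function names are assumptions of this sketch, so the result should land in or near the 0.06–0.07 MeV range quoted above rather than reproduce it exactly.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)
M_N = 938.9                 # nucleon mass in MeV (assumed average of m_p and m_n)
B_D = 2.22                  # deuterium binding energy in MeV (Table 1)

def log_epsilon(temperature: float, eta: float, g_ratio: float = 1.0) -> float:
    """ln(epsilon), with epsilon = zeta(3)/pi^2 (2 pi T/m_N)^(3/2) g_ratio eta.

    g_ratio stands for g_*s(T)/g_*s(T0); setting it to 1 is an assumption
    that holds only after electron-positron annihilation.
    """
    prefactor = ZETA3 / math.pi**2 * (2.0 * math.pi * temperature / M_N) ** 1.5
    return math.log(prefactor * g_ratio * eta)

def deuterium_temperature(eta: float, lo: float = 0.01, hi: float = 1.0) -> float:
    """Solve ln(epsilon) + B_d/T = 0 for T (in MeV) by bisection."""
    def f(temperature: float) -> float:
        return log_epsilon(temperature, eta) + B_D / temperature

    # f decreases monotonically on [lo, hi]: it is positive at lo, negative at hi.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for eta in (1e-10, 6e-10, 1e-9):
    print(f"eta = {eta:.0e}   T_d = {1e3 * deuterium_temperature(eta):.1f} keV")
```

Because η enters only logarithmically, changing it by a factor of ten shifts T_d by just a few keV.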

The Coulomb barrier shuts off the nuclear reactions before there is time for heavier
nuclei (A > 8) to form. We get a fairly good approximation for $^4$He production
by assuming instantaneous nucleosynthesis at T = Tns ∼ 1.1Td ∼ 70 keV, with all
neutrons ending up in $^4$He, so that
$$ X_4 \approx 2 X_n(T_{ns}) . \tag{6.20} $$

² The error bar may not be an accurate reflection of the uncertainty in the neutron lifetime, as
there are large differences between measurements, and the preferred value has changed annually at
the percent level, e.g. the shift from 2010 to 2012 was 5.6 seconds. This is the current best estimate,
from 2014 [1].

After electron-positron annihilation (T ≪ me = 0.511 MeV) the time-temperature
relation is
$$ t \approx \frac{2.42}{\sqrt{g_*}} \left(\frac{T}{\text{MeV}}\right)^{-2} \text{s} , \tag{6.21} $$

where g∗ = 3.363. Since most of the time in T = 0.8 MeV–0.07 MeV is spent at the
lower part of this temperature range, this formula gives a good approximation for
the time,
tns − t1 = 267 s (in reality 264.3 s).
Thus we get for the final 4 He abundance

$$ X_4 = 2 X_n(t_1)\, e^{-(t_{ns} - t_1)/\tau_n} = 24.5\,\% . \tag{6.22} $$

Accurate numerical calculations, using the rates of the relevant weak and
strong reactions, give X4 = 21–26 % (for $\eta = 10^{-10}$–$10^{-9}$).
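The estimate (6.22) chains together (6.17), (6.18) and (6.21), and can be reproduced with a short Python sketch. All numbers are those quoted in the text; the function and variable names are illustrative.

```python
import math

Q_MEV = 1.293     # m_n - m_p in MeV, from (6.11)
TAU_N = 880.0     # mean neutron lifetime in seconds
G_STAR = 3.363    # g_* after electron-positron annihilation
T_FREEZE = 0.8    # neutrino decoupling temperature in MeV
T_NS = 0.07       # "instantaneous nucleosynthesis" temperature in MeV

def equilibrium_neutron_fraction(temperature_mev: float) -> float:
    """Equilibrium neutron mass fraction X_n from (6.17)."""
    boltzmann = math.exp(-Q_MEV / temperature_mev)
    return boltzmann / (1.0 + boltzmann)

def time_seconds(temperature_mev: float) -> float:
    """Time-temperature relation (6.21), in seconds."""
    return 2.42 / math.sqrt(G_STAR) * temperature_mev ** -2

x_n_freeze = equilibrium_neutron_fraction(T_FREEZE)        # X_n(t_1)
elapsed = time_seconds(T_NS) - time_seconds(T_FREEZE)      # t_ns - t_1
x_helium = 2.0 * x_n_freeze * math.exp(-elapsed / TAU_N)   # (6.22)

print(f"X_n(t_1)   = {x_n_freeze:.4f}")
print(f"t_ns - t_1 = {elapsed:.0f} s")
print(f"X_4        = {100.0 * x_helium:.1f} %")
```

Varying T_FREEZE or T_NS by ten per cent in this sketch gives a quick sense of how sensitive the helium yield is to the two "instantaneous" approximations.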
This calculation of the helium abundance X4 involves a bit of cheating in the
sense that we have used results of accurate numerical calculations to infer that we
need to use T = 0.8 MeV as the neutrino decoupling temperature, and Tns = 1.1Td
as the “instantaneous nucleosynthesis” temperature, to best approximate the correct
behavior. However, it gives us a quantitative description of what is going on, and
an understanding of how the helium yield depends on various things.
Exercise: Using the preceding calculation, find the dependence of X4 on η, i.e.,
calculate dX4 /dη.

6.5 Why so late?


Let us return to the question of why the temperature has to fall so much below the
binding energy before the equilibrium abundances become large. From the energetics
we might conclude that when typical kinetic energies, $\langle E_k \rangle \approx \frac{3}{2} T$, are smaller than
the binding energy, it would be easy to form nuclei but difficult to break them.
Above we saw that the smallness of the factor $\epsilon \sim (T/m_N)^{3/2}\, \eta$ is the reason why
this is not so. Here $\eta \sim 10^{-9}$ and $(T/m_N)^{3/2} \sim 10^{-6}$ (for T ∼ 0.1 MeV). The main
culprit is thus the small baryon/photon ratio. Since there are $10^9$ photons for each
baryon, there are enough photons in the high-energy tail of the photon distribution
to disintegrate nuclei, even at rather low temperatures.
One can also express this result in terms of entropy. A high photon/baryon ratio
corresponds to a high entropy per baryon. High entropy favors free nucleons.

6.6 The most important reactions


In reality, neither neutrino decoupling nor nucleosynthesis is an instantaneous process.
Accurate results require a rather large numerical computation where one
uses the cross sections of all the relevant weak and strong interactions. These cross
sections are energy-dependent. Integrating them over the energy and velocity distributions
and multiplying by the relevant number densities leads to temperature-dependent
reaction rates. The most important reactions are the weak n ↔ p reactions
(6.10) and the following strong reactions³ (see also Fig. 1):

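$$ \begin{aligned}
p + n &\to {}^2\mathrm{H} + \gamma \\
{}^2\mathrm{H} + p &\to {}^3\mathrm{He} + \gamma \\
{}^2\mathrm{H} + {}^2\mathrm{H} &\to {}^3\mathrm{H} + p \\
{}^2\mathrm{H} + {}^2\mathrm{H} &\to {}^3\mathrm{He} + n \\
n + {}^3\mathrm{He} &\to {}^3\mathrm{H} + p \\
p + {}^3\mathrm{H} &\to {}^4\mathrm{He} + \gamma \\
{}^2\mathrm{H} + {}^3\mathrm{H} &\to {}^4\mathrm{He} + n \\
{}^2\mathrm{H} + {}^3\mathrm{He} &\to {}^4\mathrm{He} + p \\
{}^4\mathrm{He} + {}^3\mathrm{He} &\to {}^7\mathrm{Be} + \gamma \\
{}^4\mathrm{He} + {}^3\mathrm{H} &\to {}^7\mathrm{Li} + \gamma \\
{}^7\mathrm{Be} + n &\to {}^7\mathrm{Li} + p \\
{}^7\mathrm{Li} + p &\to {}^4\mathrm{He} + {}^4\mathrm{He}
\end{aligned} $$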

In principle, all of these nuclear cross sections are determined by just a few
parameters in QCD. However, calculating these cross sections from first principles
is too difficult in practice. Instead cross sections measured in the laboratory are
used. Cross sections of the weak reactions (6.10) are known theoretically (there is
one parameter describing the strength of the weak interaction, which is determined
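³ The reaction chain that produces helium from hydrogen in BBN is not the same as the one that occurs
in stars. The conditions in stars are different: on the one hand, there are no free neutrons and
the temperatures are lower, but on the other hand the densities are higher and there is more time
available. In addition, second generation stars contain heavier nuclei (C, N, O) that act as catalysts
in helium production. Some of the most important reaction chains in stars are [Karttunen et al:
Fundamental Astronomy, p. 251]:
1. The proton-proton chain
$$ \begin{aligned}
p + p &\to {}^2\mathrm{H} + e^+ + \nu_e \\
{}^2\mathrm{H} + p &\to {}^3\mathrm{He} + \gamma \\
{}^3\mathrm{He} + {}^3\mathrm{He} &\to {}^4\mathrm{He} + p + p ,
\end{aligned} $$
2. and the CNO-chain
$$ \begin{aligned}
{}^{12}\mathrm{C} + p &\to {}^{13}\mathrm{N} + \gamma \\
{}^{13}\mathrm{N} &\to {}^{13}\mathrm{C} + e^+ + \nu_e \\
{}^{13}\mathrm{C} + p &\to {}^{14}\mathrm{N} + \gamma \\
{}^{14}\mathrm{N} + p &\to {}^{15}\mathrm{O} + \gamma \\
{}^{15}\mathrm{O} &\to {}^{15}\mathrm{N} + e^+ + \nu_e \\
{}^{15}\mathrm{N} + p &\to {}^{12}\mathrm{C} + {}^4\mathrm{He} .
\end{aligned} $$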

The cross section of the direct reaction d+d → 4 He + γ is small (i.e. the 3 H + p and 3 He + n
channels dominate d+d →), and it is not important in either context.
The triple-α reaction 4 He + 4 He + 4 He → 12 C, responsible for carbon production in stars, is also
not important during big bang, since the density is not sufficiently high for three-particle reactions
to occur (the three 4 He nuclei would need to come within the range of the strong interaction within
the lifetime 2.6×10−16 s of the intermediate state 8 Be). Exercise: calculate the number and mass
density of nucleons at T = 1 MeV.

Figure 1: The 12 most important nuclear reactions in big bang nucleosynthesis.

experimentally). The relevant reaction rates are now known sufficiently accurately,
so that the nuclear abundances produced in BBN (for a given value of η) can be
calculated with better accuracy than the present abundances can be measured from
astronomical observations.
The reaction chain proceeds along stable and long-lived (compared to the
nucleosynthesis timescale of minutes) isotopes towards larger mass numbers. At least
one of the two incoming nuclei must be an isotope which is abundant during
nucleosynthesis, i.e. n, p, $^2$H or $^4$He. The mass numbers A = 5 and A = 8 form
bottlenecks, since they have no stable or long-lived isotopes. The A = 5 bottleneck
is crossed with the reactions $^4$He + $^3$He and $^4$He + $^3$H, which form a small number of
$^7$Be and $^7$Li. Their abundances remain so small that we can ignore the reactions
(e.g. $^7$Be + $^4$He → $^{11}$C + γ and $^7$Li + $^4$He → $^{11}$B) which cross the A = 8 bottleneck.
Numerical calculations also show that the production of the other stable lithium
isotope, $^6$Li, is several orders of magnitude smaller than that of $^7$Li.
Thus BBN produces the isotopes $^2$H, $^3$H, $^3$He, $^4$He, $^7$Li and $^7$Be. Of these, $^3$H
(half-life 12.3 a) and $^7$Be (53 d) are unstable and decay after nucleosynthesis into $^3$He
and $^7$Li. (Actually, $^7$Be becomes $^7$Li through electron capture $^7$Be + $e^-$ → $^7$Li + $\nu_e$.)
Figure 2: The time evolution of the n, $^2$H (written as d) and $^4$He abundances during
BBN. Notice how the final $^4$He abundance is determined by the n abundance before nuclear
reactions begin. Only a small part of the neutrons decay or end up in other nuclei. Before
becoming $^4$He, all neutrons pass through $^2$H. To improve the visibility of the deuterium curve,
we have plotted it also as multiplied by a factor of 50. The other abundances (except p)
remain so low that to see them the figure should be redrawn in logarithmic scale (see Fig. 4.3
of Kolb&Turner). This figure is for $\eta = 6 \times 10^{-10}$. The time at T = (90, 80, 70, 60) keV is
(152, 199, 266, 367) s. Thus the action peaks at about t = 4 min.

In the end BBN has produced cosmologically significant (compared to present
abundances) amounts of the four isotopes $^2$H, $^3$He, $^4$He and $^7$Li (the fifth isotope
$^1$H = p we had already before BBN). Their production in BBN can be calculated,
and there is only one free parameter, the baryon/photon ratio

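$$ \eta \equiv \frac{n_{B0}}{n_{\gamma 0}} = \frac{\Omega_{b0}\, \rho_{c0}}{m_N\, n_{\gamma 0}} = \frac{\Omega_{b0}}{m_N\, n_{\gamma 0}}\, \frac{3H_0^2}{8\pi G} = 274 \times 10^{-10}\, \omega_b = 1.46 \times 10^{18} \left(\frac{\rho_{b0}}{\text{kg m}^{-3}}\right) . \tag{6.23} $$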

Here ρb0 is the average density of baryonic matter today; recall that ωb ≡ Ωb0 h2 .
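The coefficient in (6.23) follows directly from numbers given earlier in the notes. The Python sketch below recomputes it from the critical density (5.21) and the photon number density (5.19). Taking m_N to be the proton mass is an assumption of this sketch, so agreement with 274 × 10⁻¹⁰ is only expected at the sub-per-cent level.

```python
RHO_C0_OVER_H2 = 1.8788e-26   # critical density today divided by h^2, kg/m^3, from (5.21)
N_GAMMA0 = 410.5e6            # CMB photon number density today, 1/m^3, from (5.19)
M_N = 1.6726e-27              # nucleon mass in kg (assumed here: the proton mass)

def eta_from_omega_b(omega_b: float) -> float:
    """Baryon/photon ratio eta for a given omega_b = Omega_b0 h^2, following (6.23)."""
    return omega_b * RHO_C0_OVER_H2 / (M_N * N_GAMMA0)

coefficient = eta_from_omega_b(1.0)      # eta per unit omega_b
eta_example = eta_from_omega_b(0.02207)  # omega_b value quoted for the CMB estimate

print(f"eta / omega_b          = {coefficient:.3e}")
print(f"eta(omega_b = 0.02207) = {eta_example:.3e}")
```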

6.7 BBN as a function of time


Let us follow nucleosynthesis as a function of time (or temperature). See Fig. 4.3 in
Kolb&Turner or Fig. 2 here.
The nuclei $^2$H and $^3$H are intermediate states through which reactions proceed
towards $^4$He. Therefore their abundance first rises, is highest at the time when $^4$He
production is fastest, and then falls as baryonic matter ends up in $^4$He. $^3$He is also
an intermediate state, but the main channel from $^3$He to $^4$He is via $^3$He + n → $^3$H + p,
which is extinguished early as the free neutrons are used up. Therefore the abundance
of $^3$He does not fall the same way as $^2$H and $^3$H. The abundance of $^7$Li also
rises at first and then falls via $^7$Li + p → $^4$He + $^4$He. Since $^4$He has a higher binding
energy per nucleon, B/A, than $^7$Li and $^7$Be have, these also want to return into $^4$He.
This does not happen to $^7$Be, however, since, just like for $^3$He, the free neutrons
needed for the reaction $^7$Be + n → $^4$He + $^4$He have almost disappeared.

Nucleus   B (MeV)    B/A (MeV)
^2H       2.2245     1.11
^3H       8.4820     2.83
^3He      7.7186     2.57
^4He      28.2970    7.07
^6Li      31.9965    5.33
^7Li      39.2460    5.61
^7Be      37.6026    5.37
^12C      92.1631    7.68
^56Fe     492.2623   8.79

6.8 Primordial abundances as a function of the baryon-to-photon ratio
Let us consider BBN as a function of η (see figure 3). The greater η is, the higher is
the number density of nucleons. The reaction rates are faster and the nucleosynthesis
can proceed further. This means that a smaller fraction of “intermediate nuclei”, $^2$H,
$^3$H, and $^7$Li, is left over: the burning of nuclear matter into $^4$He is “cleaner”.
Also the $^3$He production falls with increasing η. However, $^7$Be production increases
with η. In the figure we have plotted the final BBN yields, so that $^3$He is the sum
of $^3$He and $^3$H, and $^7$Li is the sum of $^7$Li and $^7$Be. The complicated shape of the
$^7$Li(η) curve is due to these two contributions: 1) For small η we get lots of “direct”
$^7$Li, whereas 2) for large η there is very little “direct” $^7$Li left, but a lot of $^7$Be is
produced. In the middle, at $\eta \sim 3 \times 10^{-10}$, there is a minimum of $^7$Li production
where neither way is very effective.
The $^4$He production increases with η, since for higher density nucleosynthesis
begins earlier, when there are more neutrons left.

6.9 Comparison with observations


Abundances of the various isotopes calculated from BBN can be compared to the
observed abundances. This is one of the most important tests of the big bang theory.
A good agreement is obtained for η in the range $\eta = (5\text{--}7) \times 10^{-10}$. This was, in fact,
the best method to estimate the amount of ordinary matter in the universe, until the
advent of precise cosmic microwave background (CMB) anisotropy data in 2003, from the WMAP
satellite⁴. The comparison of calculated abundances with observed abundances is
complicated by chemical evolution. BBN gives the primordial abundances of the
isotopes. The first stars form with this element composition. In stars, further fusion
reactions take place and the composition of the star changes with time. Towards the
end of its life, the star ejects its outer parts into interstellar space, and the processed
material mixes with primordial material. The next generation of stars forms from
this mixed material, and so on.
The observations of present abundances are based on spectra of interstellar clouds
and stellar surfaces. To obtain the primordial abundances from the present abundances
the effect of chemical evolution has to be estimated. Since $^2$H is so fragile
(its binding energy is so low), there is hardly any $^2$H production in stars; rather,
any pre-existing $^2$H is destroyed early on in stars. Therefore any interstellar $^2$H is
primordial. The smaller the fraction of processed material in an interstellar cloud,
the higher its $^2$H abundance should be. Thus all observed $^2$H abundances are lower
limits to the primordial $^2$H abundance⁵. Conversely, stellar production increases the
$^4$He abundance. Thus all $^4$He observations are upper limits to the primordial $^4$He.
Moreover, stellar processing produces heavier elements, e.g. C, N, O, which are not
produced in BBN. Their abundance varies a lot from place to place, giving a measure
of how much chemical evolution has happened in various parts of the universe.
Plotting $^4$He vs. these heavier elements, we can extrapolate the $^4$He abundance
to zero chemical evolution to obtain the primordial abundance. Since $^3$He and $^7$Li
are both produced and destroyed in stellar processing, it is more difficult to make
estimates of their primordial abundances based on observed present abundances.

⁴ Many cosmological parameters can be estimated from the CMB anisotropy, as we will discuss
in chapter 12. The current best CMB estimate is from the Planck satellite (also using polarization
data from WMAP), ωb = 0.02207 ± 0.00027, corresponding to $\eta = (6.047 \pm 0.074) \times 10^{-10}$ [2, 1].

Figure 3: The primordial abundances of the light elements as a function of η. For $^4$He we
give the mass fraction, for D = $^2$H, $^3$He, and $^7$Li the number ratio to H = $^1$H, i.e., ni /nH .
(Curves labelled $^4$He, D/H, $^3$He/H and $^7$Li/H; logarithmic axes, with η from $10^{-11}$ to $10^{-8}$
and abundances from $10^{-10}$ to $10^{0}$.)
There are two clear qualitative signatures of big bang nucleosynthesis in the
present universe:

1. All stars and gas clouds observed contain at least 23% $^4$He. If all $^4$He had been
produced in stars, we would see similar variations in the $^4$He abundance as we
see for the other elements, such as C, N, and O, with some regions containing
just a few % or even less $^4$He. This universal minimum amount of $^4$He signifies
a primordial abundance produced when matter in the universe was uniform.

2. The existence of significant amounts of $^2$H in the universe is a sign of BBN,
since there are no known astrophysical sources of large amounts of $^2$H.

The observed abundances of the BBN isotopes $^2$H, $^3$He and $^4$He indicate the
range $\eta = (5.7\text{--}6.7) \times 10^{-10}$ (95% C.L.), whereas $^7$Li prefers the range
$\eta \approx (1.5\text{--}4.5) \times 10^{-10}$ [1]. Measurements of each isotope are subject to different systematic
errors, and the convergence of three out of four of them to a narrow range strongly
supports it as the correct one. Furthermore, this range agrees with the baryon-to-photon
ratio η at the time of last scattering determined from the CMB. There
is a problem with the $^7$Li abundance: this may be a hint of new particle physics
that would deplete lithium, or it may simply indicate poorly understood stellar
astrophysics (the abundance is determined from measurements of old stars). Fig. 4
illustrates these issues.
The 95% C.L. range $\eta = (5.7\text{--}6.7) \times 10^{-10}$ corresponds to 0.021 ≤ ωb ≤ 0.025.
With h = 0.6 . . . 0.8, this gives

Ωb0 = 0.03 . . . 0.07 (6.24)

for a conservative range of the baryonic density parameter. With h = 0.7, we get

Ωb0 = 0.04 . . . 0.05 . (6.25)
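The conversion from the η range to (6.24) and (6.25) is a two-step piece of arithmetic: first ωb = η / (274 × 10⁻¹⁰) from (6.23), then Ωb0 = ωb / h². A minimal Python sketch, with illustrative function names:

```python
ETA_PER_OMEGA_B = 274e-10   # eta = 274e-10 * omega_b, from (6.23)

def omega_b_from_eta(eta: float) -> float:
    """omega_b = Omega_b0 h^2 corresponding to a baryon/photon ratio eta."""
    return eta / ETA_PER_OMEGA_B

def omega_b0(eta: float, h: float) -> float:
    """Baryon density parameter Omega_b0 for given eta and Hubble parameter h."""
    return omega_b_from_eta(eta) / h**2

ETA_MIN, ETA_MAX = 5.7e-10, 6.7e-10   # 95% C.L. range quoted in the text

# The lowest Omega_b0 comes from the smallest eta and the largest h, and vice versa.
conservative = (omega_b0(ETA_MIN, 0.8), omega_b0(ETA_MAX, 0.6))   # (6.24)
fiducial = (omega_b0(ETA_MIN, 0.7), omega_b0(ETA_MAX, 0.7))       # (6.25)

print(f"omega_b      : {omega_b_from_eta(ETA_MIN):.3f} ... {omega_b_from_eta(ETA_MAX):.3f}")
print(f"h = 0.6...0.8: {conservative[0]:.2f} ... {conservative[1]:.2f}")
print(f"h = 0.7      : {fiducial[0]:.2f} ... {fiducial[1]:.2f}")
```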


⁵ This does not apply to sites which have been enriched in $^2$H due to separation of $^2$H from $^1$H.
Deuterium binds into molecules more easily than ordinary hydrogen. Since deuterium is heavier
than ordinary hydrogen, deuterium and deuterated molecules have lower thermal velocities and do
not escape from gravity as easily. Thus planets tend to have high deuterium-to-hydrogen ratios.

Figure 4: The abundances of $^4$He, D, $^3$He and $^7$Li and the range of $\eta_{10} \equiv 10^{10} \eta$ determined
from BBN (yellow boxes) and the CMB (blue strip) [1]. Both the BBN and CMB ranges
are 95% C.L.

Even the conservative range is much less than cosmological estimates for Ωm0 .
Therefore most of the matter in the universe is non-baryonic. In the next chapter,
we will discuss this non-baryonic dark matter.
We can also use BBN to test for the presence of physics beyond the Standard
Model. The expansion rate of the universe depends on the energy density of radia-
tion, encoded in g∗ . During BBN, we have g∗ = 5.5+1.75Nν , where Nν is the number
of neutrino species with masses so small that they are relativistic during BBN and
have weak interactions so that their distribution is coupled to the thermal bath until
about T = 0.8 MeV. The number of neutrino species can also be left as a free pa-
rameter, in which case it parametrises any additional radiation degrees of freedom
that may be present. As mentioned in the previous chapter, for the Standard Model
we have Nν = 3.046, because neutrinos are not totally decoupled from the thermal
bath when electrons and positrons annihilate, so some of the entropy (and energy
density) of electrons and positrons is transferred to the neutrinos, hence the 0.046
correction. If we leave Nν as a free parameter and fit the observations (neglecting
lithium) we get [1] $\eta = (4.9\text{--}7.1) \times 10^{-10}$ and 1.8 < Nν < 4.5. So as far as BBN is
concerned, there is room for one more light neutrino species. Combining with CMB
data from Planck, we have Nν = 3.28 ± 0.28, so there is some room for additional
radiation degrees of freedom, but not quite for another full neutrino species⁶.
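To see how strongly an extra species affects the expansion rate, note that in the radiation-dominated era H ∝ √ρ ∝ √g∗ T², so at fixed temperature the expansion rate scales as the square root of g∗. The Python sketch below evaluates this for the BBN-era formula g∗ = 5.5 + 1.75 Nν; the function names are illustrative.

```python
import math

def g_star_bbn(n_nu: float) -> float:
    """Effective number of relativistic degrees of freedom during BBN."""
    return 5.5 + 1.75 * n_nu

def expansion_rate_ratio(n_nu: float, n_nu_ref: float = 3.0) -> float:
    """H(n_nu) / H(n_nu_ref) at fixed temperature, using H proportional to sqrt(g_*)."""
    return math.sqrt(g_star_bbn(n_nu) / g_star_bbn(n_nu_ref))

for n_nu in (2.0, 3.0, 3.046, 4.0):
    print(f"N_nu = {n_nu:5.3f}   g_* = {g_star_bbn(n_nu):6.3f}   "
          f"H/H(N_nu=3) = {expansion_rate_ratio(n_nu):.4f}")
```

A faster expansion leaves less time for neutrons to decay before nucleosynthesis, which is why the helium abundance is sensitive to Nν.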

References
[1] K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38 (2014) 090001,
http://pdg.lbl.gov/2014/listings/contents_listings.html

[2] P.A.R. Ade et al. [Planck Collaboration], arXiv:1303.5076 [astro-ph.CO].

⁶ We know from collider and laboratory experiments that there are only three light weakly
interacting neutrinos. The mass limit for new weakly interacting neutrinos is m > 40 GeV if they
are of the Majorana type and m > 2400 GeV if they are of the Dirac type [1]. The distinction
corresponds to whether they are or are not (respectively) their own antiparticles.
7 Dark matter
7.1 Observational evidence for dark matter
The term “dark matter” was coined by Jacobus Kapteyn in 1922 in his studies of
the motions of stars in our galaxy to refer to matter that interacts gravitationally,
but is not seen via electromagnetic radiation[1]. He found that no dark matter is
needed in the galactic Solar neighbourhood. In 1932, Jan Oort suggested the
opposite result that there would be twice as much dark matter as visible matter in
the Solar vicinity. This is the first claim of evidence for dark matter. However, later
observations have shown this claim to be wrong, and the discovery of dark matter is
usually credited to Fritz Zwicky who made the first correct claim for the existence
of dark matter in 1933. Zwicky concluded from measurements of the redshifts of
galaxies in the Coma cluster that their velocities are much larger than the escape
velocity due to the visible mass of the cluster.
There is nowadays a large amount of evidence for dark matter, including from
gravitational lensing, the expansion rate of the universe and other measures. One of the
earliest, and easiest to understand, pieces of evidence comes from rotation curves
of galaxies, which have been studied extensively since the 1970s. According to
Newtonian gravity, the velocity v of a body on a circular orbit in an axially symmetric
mass distribution is
$$\frac{v^2}{r} = G_N \frac{M(r)}{r^2} , \tag{7.1}$$
where M (r) is the mass inside radius r, and the function v(r) is called the rotation
curve. For an orbit around a compact central mass, for example planets in the Solar
system, we get Kepler’s third law $v \propto r^{-1/2}$. For stars orbiting the center of a
galaxy the situation is different, since the mass inside the orbit increases with the
distance. Suppose that the energy density of a galaxy decreases as a power-law,

$$\rho \propto r^{-n} \tag{7.2}$$

with some constant n. Then the mass inside radius r is

$$M(r) \propto \int \mathrm{d}r \, r^2 r^{-n} \propto r^{3-n} \quad \text{for } n < 3 . \tag{7.3}$$

Thus the rotation velocity in our model galaxy should vary with distance from the
center as
$$v(r) \propto r^{1-n/2} . \tag{7.4}$$
Observed rotation curves increase with r for small r, i.e., near the center of the
galaxy, but then typically flatten out, becoming roughly v(r) ≈ const. According
to (7.4), this would indicate a density profile

$$\rho \propto r^{-2} . \tag{7.5}$$

However, the density of stars falls more rapidly away from the core of a galaxy, and
falls exponentially at the edge. Also, the total mass from stars and other visible
objects, like gas and dust clouds, is too small to account for the rotation velocity at
large distances.
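
The scaling (7.4) can be checked numerically. The following Python sketch integrates the power-law density (7.2) to get M(r), forms v(r) from (7.1), and measures the logarithmic slope of the resulting rotation curve. The function names, the units with G_N = 1 and the small inner cutoff r_min (needed to put the integral on a finite grid) are illustrative choices, not part of the notes.

```python
import numpy as np

G_N = 1.0  # units with G_N = 1; only the slope of v(r) matters here

def enclosed_mass(r, n, r_min=1e-6, samples=4000):
    """Mass inside radius r for rho = r**(-n), by trapezoidal integration of 4*pi*r^2*rho."""
    rp = np.geomspace(r_min, r, samples)
    integrand = 4.0 * np.pi * rp**2 * rp**(-n)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(rp)))

def circular_velocity(r, n):
    """Circular velocity from (7.1): v^2 / r = G_N M(r) / r^2."""
    return np.sqrt(G_N * enclosed_mass(r, n) / r)

def log_slope(n, r1=1.0, r2=10.0):
    """Logarithmic slope d ln v / d ln r estimated between r1 and r2."""
    v1, v2 = circular_velocity(r1, n), circular_velocity(r2, n)
    return np.log(v2 / v1) / np.log(r2 / r1)

for n in (0.0, 1.0, 2.0, 2.5):
    print(f"n = {n}: measured slope {log_slope(n):+.3f}, formula 1 - n/2 = {1 - n / 2:+.3f}")
```

The case n = 2 gives a slope close to zero, i.e. a flat rotation curve, in line with (7.5).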
This seems to indicate the presence of another mass component to galaxies. This
mass component should have a different density profile than the visible, or luminous,
mass in the galaxy, so that it would be subdominant in the inner parts of the galaxy,
but would dominate in the outer parts. The dark component appears to extend well
beyond the visible parts of galaxies, forming a dark halo surrounding the galaxy.

Figure 1: Rotation curves of spiral galaxies, from [2].

From more detailed observations, we see that instead of $1/r^2$, the distribution
of dark matter in galaxies is well fit by the Navarro-Frenk-White (NFW) profile,
$$\rho = \frac{\rho_0}{\frac{r}{r_s} \left( 1 + \frac{r}{r_s} \right)^2} , \tag{7.6}$$

where ρ0 and rs are constants. The profile obviously does not hold all the way
to the center (the density does not diverge anywhere!), and close to the centers of
galaxies, the densities are typically dominated by baryonic matter, and the dark
matter profile rises less steeply than in the NFW case.
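
As an illustration of (7.6), the sketch below evaluates the logarithmic slope d ln ρ / d ln r of the NFW profile by finite differences. It shows that the profile behaves like 1/r well inside r_s and like 1/r³ far outside, and that it has the 1/r² slope of (7.5) only around r = r_s. The values ρ0 = r_s = 1 and the function names are arbitrary illustrative choices.

```python
import math

def nfw_density(r, rho0=1.0, rs=1.0):
    """NFW profile (7.6): rho0 / [(r/rs) * (1 + r/rs)^2]."""
    x = r / rs
    return rho0 / (x * (1.0 + x) ** 2)

def nfw_log_slope(r, rs=1.0, eps=1e-4):
    """d ln(rho) / d ln(r) from a central finite difference in ln r."""
    lo, hi = r * (1.0 - eps), r * (1.0 + eps)
    d_ln_rho = math.log(nfw_density(hi, rs=rs)) - math.log(nfw_density(lo, rs=rs))
    return d_ln_rho / (math.log(hi) - math.log(lo))

for r in (1e-3, 1e-1, 1.0, 1e1, 1e3):
    print(f"r/rs = {r:g}: slope = {nfw_log_slope(r):+.3f}")
```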
Dark matter can be discussed in terms of the mass-to-light ratio M/L of sources.
It is customarily given in units of M⊙ /L⊙ , where M⊙ and L⊙ are the mass and
absolute luminosity of the Sun. The luminosity of a star increases with its mass
faster than linearly, so stars with M > M⊙ have M/L < 1, and smaller stars have
M/L > 1. Small stars are more common than large stars, so a typical mass-to-light
ratio from the stellar component of galaxies is M/L ∼ a few. For stars in our part of
the Milky Way galaxy, M/L ≈ 2.2. Because large stars are more short-lived, M/L
increases with the age of the star system, and the typical M/L from stars in the
universe is somewhat larger. However, this still does not account for the full masses
of galaxies.
The mass-to-light ratio of a galaxy turns out to be difficult to determine; the
larger volume around the galaxy you include, the larger M/L you get. The mass
M is determined from velocities of orbiting bodies and at large distances there may
be no such bodies visible. For galaxy clusters you can use the velocities of the
galaxies themselves as they orbit the center of the cluster. The mass-to-light ratios
of clusters appear to be several hundred. Presumably isolated galaxies would have
similar values if we could measure them to large enough radii.
Estimates for the total matter density $\Omega_{m0}$ based on the gravitational effects of
matter in the universe via many different methods give a similar conservative range
$0.1 \lesssim \Omega_{m0} \lesssim 0.4$, with a more likely range of $0.15 \lesssim \Omega_{m0} \lesssim 0.3$ [3]. Also, from the
CMB we have $\omega_m = \Omega_{m0} h^2 = 0.1426 \pm 0.0025$ for the spatially flat ΛCDM model
[4], and $\omega_m = 0.14 \pm 0.01$ model-independently [5] (footnote 1).
Footnote 1: In fact, if there is only baryonic matter, the CMB anisotropy pattern looks qualitatively different

The estimates for the amount of ordinary matter in the objects we can see on
the sky, stars and visible gas and dust clouds, i.e. luminous matter, give a much
smaller contribution to the density parameter,

$$\Omega_{\mathrm{lum}0} \lesssim 0.01 \tag{7.7}$$

In the previous chapter we found that big bang nucleosynthesis leads to the value
$0.019 \le \Omega_{b0} h^2 \le 0.024$ at 95% C.L., and the CMB gives a similar range. Taking
conservatively h = 0.6 . . . 0.8, we have

$$\Omega_{b0} = 0.03 \ldots 0.07 \tag{7.8}$$

for baryonic matter.
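
The range (7.8) is simple arithmetic, spelled out in the short Python sketch below: the smallest Ωb0 comes from the lower BBN bound with the largest h, and the largest from the upper bound with the smallest h. The variable names are illustrative.

```python
# Bounds on Omega_b0 * h^2 from BBN and the conservative range of h, as quoted in the text
omega_b_low, omega_b_high = 0.019, 0.024
h_low, h_high = 0.6, 0.8

# Dividing by h^2: the extreme values combine the opposite ends of the two ranges
Omega_b0_min = omega_b_low / h_high**2
Omega_b0_max = omega_b_high / h_low**2

print(f"Omega_b0 between {Omega_b0_min:.3f} and {Omega_b0_max:.3f}")
```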


We thus have at very high confidence

$$\Omega_{\mathrm{lum}0} < \Omega_{b0} < \Omega_{m0} . \tag{7.9}$$

This is consistent, as all luminous matter is baryonic, and all baryonic matter is
matter. That we have two inequalities tells us that there are two kinds of dark matter
(as opposed to luminous matter): baryonic dark matter (BDM) and nonbaryonic
dark matter. We do not know the precise nature of either kind of dark matter, and
this is called the dark matter problem. Determining the nature of dark matter is one
of the most important problems in cosmology today. Often the expression “dark
matter” is used to refer to the nonbaryonic kind only, or only to non-baryonic dark
matter other than neutrinos, i.e. only to the part which is unknown.

7.2 Baryonic dark matter


Candidates for BDM include compact (e.g. planet-like) objects in interstellar space
as well as thin intergalactic gas (or plasma). Objects of the former kind have been
dubbed MACHOs (Massive Astrophysical Compact Halo Objects) to contrast them
with another dark matter candidate, WIMPs, to be discussed later. A way to
detect such a dark compact object is gravitational microlensing: if such a massive
object passes near the line of sight between us and a distant star, its gravity focuses
the light of that star towards us, and the star appears to brighten for a while. The
brightening has a characteristic time profile, and is independent of wavelength, which
clearly distinguishes it from other ways a star may brighten (variable stars).
An observation of a microlensing event gives an estimate of the mass, distance
and velocity (footnote 2) of the compact object, but tells nothing else about it. Thus in principle
we could have nonbaryonic MACHOs. But as we do not know of any such objects
(except black holes), the MACHOs are usually thought of as ordinary substellar
objects, such as brown dwarfs or “jupiters”. Ordinary stars can of course also cause a
microlensing event, but then we would also see light from the star. Heavier relatively
faint objects that could fall into this category include old white dwarfs, neutron stars
and black holes.
(Footnote 1, continued) than in the case with dark matter, so the CMB provides a strong case for dark matter even in the
absence of any other observations. We return to this in the second part of the course.
Footnote 2: Actually we do not get an independent measure of all three quantities, as the observables
depend on combinations of these. However, we can make some reasonable assumptions of the
expected distance and velocity distributions among such objects, leading to a rough estimate of the
mass. Especially from a set of many events, we can get an estimate for the typical mass.

The masses of ordinary black holes are included in the Ωb estimate from BBN,
since they were formed from baryonic matter after BBN. However, if there are
primordial black holes produced in the big bang before BBN, they would not be
included in Ωb .
A star requires a mass of about 0.07M⊙ to ignite thermonuclear fusion, and to
start to shine as a star. Smaller, “failed”, stars are called brown dwarfs. They are not
completely dark; they are warm balls of gas which radiate faint thermal radiation.
They were warmed up by the gravitational energy released in their compression
to a compact object. Thus brown dwarfs can be, and have been, observed with
telescopes if they are quite close by. Smaller such objects are called “jupiters” after
the representative in the Solar System.
The strategy to observe a microlensing event is to monitor constantly a large
number of stars to catch such a brightening when it occurs for one of them. Since
the typical time scales of these events are many days, or even months, it is enough
to look at each star, say, once every day or so. As most of the dark matter is in the
outer parts of the galaxy, further out than we are, it would be best if the stars to
be monitored were outside of our galaxy. The Large Magellanic Cloud (LMC), a satellite
galaxy of our own galaxy, is a good place to look, being at a suitable distance,
where individual stars are still easy to distinguish. Because of the required precise
alignment of us, the MACHO, and the distant star, the microlensing events will be
rare. But if the BDM in our galaxy consisted mainly of MACHOs (with masses
between that of Jupiter and several solar masses), and we monitored constantly
millions of stars in the LMC, we should observe many events every year.
Such observing campaigns (MACHO, OGLE, EROS, . . . ) were begun in the
1990s. Indeed, several microlensing events have been observed. The typical mass of
these MACHOs turned out to be ∼ 0.5M⊙ , much larger than the brown dwarf mass
that had been expected. The most natural faint object with such a mass would be
a white dwarf. However, white dwarfs had been expected to be much too rare to
explain the number of observed events. On the other hand the number of observed
events is too small for these objects to dominate the mass of the BDM in the halo
of our galaxy.
BDM in our universe is dominated by thin intergalactic ionized gas. In fact, in
large clusters of galaxies, we can see this gas, as it has been heated by the deep
gravitational well of the cluster, and radiates X-rays.

7.3 Nonbaryonic dark matter


The favorite candidates for nonbaryonic dark matter can be divided into three
classes, hot dark matter (HDM), warm dark matter (WDM) and cold dark mat-
ter (CDM), based on the typical velocities of the particles making up this matter at
the time they decouple from the thermal bath.
Dark matter particles are called HDM if they decouple while they are relativistic
and the number density is determined by the freeze-out of their interactions (we
will discuss this shortly; see also footnote 3). Then they retain a large number density, requiring their
masses to be small, less than 100 eV, so that their density is less than the critical
density. Because of the small mass, their thermal velocities are large when structure
formation begins, making it difficult to trap them in potential wells of the forming
structures. CDM, on the other hand, refers to dark matter particles with negligible
velocities. If these velocities are thermal, this requires their masses to be large,
which means that they must have decoupled while already nonrelativistic, so their
number density would have been suppressed by annihilation. Candidates between
hot and cold are called, naturally enough, warm dark matter.

Footnote 3: Today, the HDM particles should be nonrelativistic in order to count as “matter”.

Figure 2: Comparison of the expected halo of the Milky Way and the galaxies M31 and M33
in CDM and WDM models. From http://www.clues-project.org/images/darkmatter.html.

HDM, WDM and CDM all have a different effect on structure formation in the
universe. Structure formation refers to the process in which the originally nearly
homogeneously distributed matter forms bound structures such as galaxies and galaxy
clusters under the pull of gravity. We shall discuss structure formation in the second
part of the course. But let us already mention that today the best way to differentiate
between HDM, WDM and CDM is through the observed large-scale structure in the
universe, i.e. the way galaxies are distributed in space, combined with the CMB
which shows the seeds of structure. We show in figure 2 the results of a simulation
of the halo of dark matter around the Milky Way and two other galaxies. For CDM,
there is more substructure and satellites around the galaxy, while their formation is
suppressed for WDM. According to observations, there are an order of magnitude fewer
satellites around the Milky Way than predicted by CDM models. However,
the observations are not complete, and the discrepancy may also be due to other
causes than WDM.
The most common candidates for non-baryonic dark matter are Weakly Interacting
Massive Particles, or WIMPs. They decouple from the thermal bath early,
like neutrinos, but are much heavier, so that they are a form of
CDM. Some dark matter candidates have stronger or weaker interactions. For
example, gravitinos have only gravitational interactions, while TIMPs (Technicolour
Interacting Massive Particles) can have stronger interactions.

7.4 Hot dark matter


The archetypal HDM candidates are neutrinos, which have a small but nonzero rest
mass. The cosmic neutrino background would make a significant contribution to the
density parameter if the neutrinos had a rest mass of the order of 1 eV.
For massive neutrinos, the number density today is the same as for massless
neutrinos, but their energy density today is dominated by their rest masses, giving
(there is a factor of 3/4 since neutrinos are fermions and 4/11 due to $e^+ e^-$
annihilation)
$$\rho_\nu = \sum_i m_{\nu_i} n_{\nu_i} = \frac{3}{11} n_\gamma \sum_i m_{\nu_i} , \tag{7.10}$$

where the sum is over the neutrino mass eigenstates (which are not the same as the
weak interaction eigenstates, for which the names electron neutrino, muon neutrino
and tau neutrino are properly reserved). For T0 = 2.725 K, this gives the neutrino
density parameter
$$\Omega_{\nu 0} h^2 = \frac{\sum m_\nu}{94.14\ \mathrm{eV}} , \tag{7.11}$$
which applies if the neutrino masses are less than the neutrino decoupling tem-
perature, 1 MeV, but greater than the present temperature of massless neutrinos,
Tν0 = 0.168 meV. This counts then as one contribution to Ωm0 . In this case, neu-
trinos are hot dark matter (HDM). Data on large scale structure combined with
structure formation theory requires that a majority of the matter in the universe
has to be CDM and the present upper limit to HDM in the form of massive neutrinos
leads to the conservative mass limit [6]
$$\sum m_\nu \lesssim 2.0\ \mathrm{eV} . \tag{7.12}$$

Therefore the maximum contribution of neutrinos is $\Omega_{\nu 0} h^2 \lesssim 0.02$, at most the
same order of magnitude as baryonic matter.
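
The number 94.14 eV in (7.11) can be reproduced approximately from (7.10). The Python sketch below computes the photon number density for T0 = 2.725 K, takes 3/11 of it as the neutrino number density per species, and divides the critical density by it. The physical constants (Boltzmann constant, ħc, and ρc/h² ≈ 1.0537 × 10⁴ eV/cm³) are standard values entered by hand; they are assumptions of this sketch and are not quoted in these notes, so agreement is expected only to the precision of those inputs.

```python
import math

ZETA3 = 1.2020569031595943    # Riemann zeta(3)
K_B = 8.617333262e-5          # Boltzmann constant in eV/K (assumed standard value)
HBAR_C = 1.973269804e-5       # hbar*c in eV cm (assumed standard value)
RHO_CRIT_OVER_H2 = 1.0537e4   # critical density divided by h^2, in eV/cm^3 (assumed value)

T0 = 2.725 * K_B                                          # photon temperature today in eV
n_gamma = 2.0 * ZETA3 / math.pi**2 * (T0 / HBAR_C) ** 3   # photons per cm^3
n_nu = 3.0 / 11.0 * n_gamma                               # neutrinos per cm^3 per species, as in (7.10)

# Omega_nu0 h^2 = (sum of masses) * n_nu / (rho_crit / h^2) = (sum of masses) / denominator
denominator_eV = RHO_CRIT_OVER_H2 / n_nu

def omega_nu_h2(sum_masses_eV):
    """Neutrino density parameter times h^2 for a given sum of masses in eV, as in (7.11)."""
    return sum_masses_eV / denominator_eV

print(f"n_gamma = {n_gamma:.1f} per cm^3, denominator = {denominator_eV:.2f} eV")
print(f"sum m = 2.0 eV gives Omega_nu0 h^2 = {omega_nu_h2(2.0):.4f}")
```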
If neutrinos were the dominant form of matter, there would be a lower limit on
their mass from constraints on the phase space density, called the Tremaine-Gunn
limit. Essentially, in order to achieve a certain rotation velocity for galaxies, you need
a certain amount of mass inside a given volume, and the Pauli exclusion principle
constrains the number of particles you can pack inside a given volume. Even
though we know that neutrinos are a subdominant component of dark matter, the
Tremaine-Gunn limit applies to any fermionic dark matter candidate, even if its
distribution is not thermal. There is no such lower limit on the mass of a bosonic
dark matter particle.
Exercise. Suppose neutrinos would dominate the mass of galaxies (to the ex-
tent you could ignore all other forms of matter). We know the mass of a galaxy
(within a certain radius) from its rotation velocity. The mass could come from a
smaller number of heavier neutrinos or a larger number of lighter neutrinos, but the
available phase space (you don’t have to assume a thermal distribution) limits the
total number of neutrinos whose velocity is below the escape velocity. This leads
to a lower limit of the neutrino mass mν . Assume for simplicity that either a) all
neutrinos have the same mass, or b) only ντ is massive. Let r be the radius of the
galaxy, and v its rotation velocity at this distance. Find the minimum mν needed
for neutrinos to dominate the galaxy mass. (A rough estimate is enough: you can,
e.g., assume that the neutrino distribution is spherically symmetric, and that the
escape velocity within radius r equals the escape velocity at r.) Give the numerical
value for the case v = 200 km/s and r = 10 kpc.

7.5 Cold dark matter


Observations of large-scale structure together with the theory of structure formation
require that dark matter is dominated by CDM or WDM, with CDM being the
currently preferred option. There is no particle in the Standard Model of particle
physics that is suitable as CDM. We can therefore say that cosmological observations
of dark matter are one of the most important pieces of evidence for physics beyond
the Standard Model.
A major class of CDM particle candidates is called WIMPs (Weakly Interacting
Massive Particles). For a HDM candidate, the mass must be small so that the total
contribution to the energy density today would not be huge; from (7.11) we see that
a neutrino mass larger than a few dozen eV would give more energy density than
observed. In contrast, if the mass of a weakly interacting particle species is much
larger than the decoupling temperature of weak interactions, these particles are
largely annihilated before the decoupling. This suppression of the number density
makes it possible to achieve a suitable energy density starting from an initial thermal
distribution at very high temperatures.
A favourite WIMP candidate comes from supersymmetric extensions of the Stan-
dard Model. In the simplest version, the Minimal Supersymmetric Standard Model
(MSSM), every Standard Model particle has a partner with the same quantum
numbers (footnote 4) but a spin which is different by 1/2. The MSSM has a symmetry called
R-parity due to which superpartners can only be created or destroyed in pairs, so
the lightest supersymmetric partner (LSP) is stable. The parameters of the MSSM
can be chosen such that the LSP is electrically neutral and a color singlet, so that it
has only weak interactions. If it exists, it is possible that the LSP would be created
and detected at the LHC (Large Hadron Collider) at CERN. A measurement of
its properties would allow a calculation of its expected number and energy density
in the universe. Thus far, there has been no evidence for (or even suggestions of)
MSSM, or any other physics beyond the Standard Model, at the LHC.
If a CDM particle was in thermal equilibrium in the early universe, its number
density is suppressed, as noted above. Its mass then has to be large to have a
significant energy density today. (We will soon look at this in detail!) In the MSSM,
the LSP is expected to have a mass somewhere in the range from 100 GeV to
a few TeV or so. However, if the particle was not in thermal equilibrium when it
decoupled, the number density is not thus constrained.
For the particle not to be in thermal equilibrium, its interactions need to be
very weak, and typically it should not even feel the weak interaction (which, despite
the name, is not actually weak at large energies; recall that the weak interaction
cross section is $\propto E^2$). One such candidate is called the axion. Axion particles
are “born cold” and have never been in thermal interaction. The axion is related to the
so-called “strong CP problem” in particle physics. We shall not go into the details
of this, just note that it can be phrased as the question “why is the neutron electric
dipole moment so small?”. (The electric dipole moment is zero to the accuracy of
measurement, the upper limit being $d_n < 0.29 \times 10^{-25}\ e\,\mathrm{cm}$ [7], whereas the neutron
does have a significant magnetic dipole moment.) A proposed solution involves
an additional symmetry of particle physics (the Peccei–Quinn symmetry). The
axion would then be the Goldstone boson of the breaking of this symmetry. The
important point for us is that these axions would be created in the early universe
when the temperature fell below the QCD energy scale (of the order of 100 MeV),
and they would be created with negligible kinetic energy, and would never be in
thermal interaction. Thus axions would have negligible velocities, and act like CDM.
(Though calling axions “cold” is a bit of a misnomer, as their phase space distribution
is not thermal! Here the word just means that their typical kinetic energy is much
smaller than the mass.) Another dark matter candidate of this type is the gravitino,
the supersymmetric partner of the graviton. We will not discuss this kind of dark
matter further, and will stick to massive CDM particles.

Footnote 4: If supersymmetry were unbroken, the mass would also be the same. In that case superpartners
would have been observed already, so supersymmetry has to be broken. The partners retain the
same quantum numbers, but their masses become different.

7.6 Dark matter decoupling


Many dark matter candidates, WIMPs in particular, are in thermal equilibrium at
early times and decouple once their interactions become too weak to keep them in
equilibrium. Such particles are called thermal relics, since their density today is
determined by the thermal equilibrium of the early universe, just as is the case for
neutrinos. If the candidate is stable (or has a long lifetime) and there are no particles
decaying to it, the number of particles is conserved after decoupling, so the number
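density falls like $a^{-3}$. If we assume that the main interaction is the annihilation of
dark matter particles and antiparticles, we can write

$$\dot{n}_{\mathrm{dm}} + 3 H n_{\mathrm{dm}} = -\langle \sigma v \rangle \bar{n}_{\mathrm{dm}} n_{\mathrm{dm}} + \psi_{\mathrm{dm}} , \tag{7.13}$$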

where $n_{\mathrm{dm}}$ is the number density of the dark matter particles, $\bar{n}_{\mathrm{dm}}$ is the antiparticle
number density, $\psi_{\mathrm{dm}}$ is the rate of creation of the dark matter particles, and $\langle \, \rangle$
indicates average over the phase space distribution. Let us first consider the case
when there is no particle-antiparticle asymmetry, so the chemical potential is zero,
$\mu_{\mathrm{dm}} = 0$. We will later see what happens if there is a conserved quantum number
which enforces a particle-antiparticle asymmetry. (The term “thermal relic” is often
used to refer only to the case when an asymmetry between particles and antiparticles
is not important.) In equilibrium, equally many particles are being annihilated and
created, so $\psi_{\mathrm{dm}} = \langle \sigma v \rangle n_{\mathrm{dm}}^2 \equiv \langle \sigma v \rangle n_{\mathrm{eq}}^2 \equiv \Gamma n_{\mathrm{eq}}$, where $n_{\mathrm{eq}}$ is the number density in
equilibrium. Denoting the number of dark matter particles $N_{\mathrm{dm}} \propto a^3 n_{\mathrm{dm}}$ (and the
equilibrium number by $N_{\mathrm{eq}}$), we have
$$\frac{1}{N_{\mathrm{eq}}} \frac{\mathrm{d} N_{\mathrm{dm}}}{\mathrm{d}(\ln a)} = -\frac{\Gamma}{H} \left[ \left( \frac{N_{\mathrm{dm}}}{N_{\mathrm{eq}}} \right)^2 - 1 \right] . \tag{7.14}$$

In the limit Γ ≫ H, interactions rapidly restore any deviations from the equilibrium
distribution. If Ndm > Neq , the right-hand side of (7.14) is negative, so the numbers
will decrease, and the opposite for Ndm < Neq . In the limit of weak coupling,
Γ ≪ H, we get Ndm ≈ constant. The time when the number of particles reaches
this constant value is called decoupling (a term we already used with photons and
neutrinos) or freeze-out. A crude approximation is to say that decoupling happens
at exactly the temperature Td where H = Γ, and that the number of particles
follows the equilibrium behaviour before and is conserved afterwards, as we did for
the neutrinos.
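
The behaviour of (7.14) in the two limits, and the quality of the crude approximation, can be explored with a toy integration. The Python sketch below assumes a non-relativistic species during radiation domination with a constant number of degrees of freedom, so that y ≡ m/T is proportional to a, the equilibrium number scales as y^{3/2} e^{-y}, and Γ/H = strength × y^{1/2−q} e^{-y}. The parameter `strength`, the value 2.9 × 10⁷ used for it, and all function names are assumptions made for illustration. Because the equation is stiff while Γ ≫ H, the sketch uses an implicit (backward Euler) step, which only requires solving a quadratic.

```python
import math

def n_eq(y):
    """Equilibrium number N_eq up to a constant factor, with y = m/T taken proportional to a."""
    return y**1.5 * math.exp(-y)

def integrate_number(strength, q=0, y_start=5.0, y_end=1.0e4, steps=40000):
    """Backward-Euler solution of (7.14) in the variable ln y = ln a + const.

    Assumes Gamma/H = strength * y**(0.5 - q) * exp(-y); `strength` is an
    illustrative input of this sketch.
    """
    h = math.log(y_end / y_start) / steps
    log_y = math.log(y_start)
    n = n_eq(y_start)                        # start in equilibrium
    for _ in range(steps):
        log_y += h
        y = math.exp(log_y)
        rate = strength * y**(0.5 - q) * math.exp(-y)   # Gamma/H at the new point
        a = h * strength * y**(-1.0 - q)     # h * (Gamma/H) / N_eq, written without exp(-y)
        c = n + h * rate * n_eq(y)
        # Positive root of a*N^2 + N - c = 0 (the implicit step), in a cancellation-free form
        n = 2.0 * c / (1.0 + math.sqrt(1.0 + 4.0 * a * c))
    return n

def crude_estimate(strength, q=0):
    """y where Gamma = H (by fixed-point iteration) and N_eq evaluated there."""
    y = math.log(strength)
    for _ in range(100):
        y = math.log(strength) + (0.5 - q) * math.log(y)
    return y, n_eq(y)

strength = 2.9e7
tracking = integrate_number(strength, y_end=10.0) / n_eq(10.0)   # Gamma >> H: follows equilibrium
frozen = integrate_number(strength)                              # Gamma << H: constant number
y_d, crude = crude_estimate(strength)

print(f"N / N_eq at y = 10: {tracking:.4f}")
print(f"Gamma = H at y = {y_d:.2f}")
print(f"final number / crude estimate: {frozen / crude:.2f}")
```

In this toy setup the number follows the equilibrium value closely while Γ ≫ H, and the final frozen-out number should agree with the crude estimate to within a few tens of per cent.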
If a particle decouples while it is relativistic, the number density is of the order
$T_d^3$. We calculated this starting from the phase space distribution, but it is fairly
obvious, because Td is the only relevant dimensional quantity. As we discussed
above, such hot dark matter would have a large energy density today unless the
mass is small. However, as a particle species becomes non-relativistic, the number
density falls exponentially (still assuming that the chemical potential is zero), so
the mass of the dark matter particle can be large while keeping the number density
down.
The number density of a non-relativistic particle in thermal equilibrium (with
zero chemical potential) at decoupling time $t_d$ and temperature $T_d$ is
$$n_{\mathrm{eq}}(t_d) = g_{\mathrm{dm}} \left( \frac{m T_d}{2\pi} \right)^{3/2} e^{-m/T_d} , \tag{7.15}$$

where m is the mass of the dark matter particle. From this we get the density today
as (assuming negligible decay)

$$n_{\mathrm{dm}}(t_0) = \frac{a(t_d)^3}{a(t_0)^3} n_{\mathrm{eq}}(t_d) = \frac{g_{*S}(T_0)}{g_{*S}(T_d)} \left( \frac{T_0}{T_d} \right)^3 n_{\mathrm{eq}}(t_d) , \tag{7.16}$$

where we have used the relation $g_{*S}(T) T^3 a^3 = \text{constant}$, which follows from conservation
of entropy. The total energy density is $\rho_{\mathrm{dm}} = 2 m n_{\mathrm{dm}}$ (the factor 2 comes
from the fact that we have an equal number of particles and antiparticles).
In order to determine the number density of a thermal relic, we need to know
the mass, the decoupling temperature and the number of degrees of freedom at
decoupling. At decoupling, $\Gamma = n_{\mathrm{eq}}(t_d) \langle \sigma v \rangle$, so we need to know the mean of the
cross section times the velocity. The cross-section depends on the details of the
particle physics, but we can roughly parametrise the annihilation cross-section as
$\sigma v \propto v^{2q}$, where q = 0 for annihilation in the ground state (s-wave), and q = 1
for annihilation in the p-wave state. This can be understood as an expansion in
the square of the velocity, and since $v \ll 1$, only the leading term is relevant.
(The p-wave term is only important if annihilation in the ground state is forbidden
or strongly suppressed for some reason.) For a non-relativistic particle,
$\langle |v| \rangle = \sqrt{8/\pi} \sqrt{T/m}$, so we write $\langle \sigma v \rangle = \sigma_0 (T/m)^q$. We therefore have

$$\Gamma(t_d) = \sigma_0 m^3 \frac{g_{\mathrm{dm}}}{(2\pi)^{3/2}} y^{-q-3/2} e^{-y} , \tag{7.17}$$
where we have defined y ≡ m/Td ; we have y ≫ 1 since the dark matter particle is
non-relativistic.
According to the Friedmann equation, the Hubble parameter is given by

$$3 H^2 = 8 \pi G_N \frac{\pi^2}{30} g_*(T) T^4 = \frac{\pi^2}{30 M_{Pl}^2} g_*(T) T^4 , \tag{7.18}$$
so
$$H(t_d) = \pi \sqrt{\frac{g_*(T_d)}{90}} \frac{m^2}{M_{Pl}} y^{-2} . \tag{7.19}$$
Equating $\Gamma(t_d) = H(t_d)$, we get an equation from which we can solve the decoupling
temperature in units of the dark matter mass, y,

$$N y^{1/2-q} e^{-y} = 1 , \tag{7.20}$$

where $N \equiv \sqrt{45/(4 \pi^5 g_*(T_d))} \, g_{\mathrm{dm}} M_{Pl} m \sigma_0$. For a given value of $g_{\mathrm{dm}} m \sigma_0 / \sqrt{g_*(T_d)}$,
we can straightforwardly solve y numerically from (7.20). However, to get some
analytical understanding, we can write (7.20) as
$$y = \ln N + \left( \frac{1}{2} - q \right) \ln y , \tag{7.21}$$

and solve iteratively, so that

$$\begin{aligned}
y_0 &= \ln N \\
y_1 &= \ln N + \left( \frac{1}{2} - q \right) \ln(\ln N) ,
\end{aligned} \tag{7.22}$$

and so on; the first approximation will be good enough for us.
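
To see how quickly the iteration converges, the following Python sketch solves (7.20) by bisection and compares the result with the first few iterates of (7.21). The value N = 2.9 × 10⁷ with q = 0 is used purely as an example input, and the function names are illustrative.

```python
import math

def solve_exact(N, q=0):
    """Root y > 1 of ln N + (1/2 - q) ln y - y = 0, i.e. the logarithm of (7.20)."""
    def f(y):
        return math.log(N) + (0.5 - q) * math.log(y) - y
    lo, hi = 1.0, 10.0 * math.log(N)    # f(lo) > 0 > f(hi) for large N, and f decreases for y > 1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def iterates(N, q=0, count=3):
    """y_0, y_1, ... obtained by repeatedly inserting the previous estimate into (7.21)."""
    ys = [math.log(N)]
    for _ in range(count - 1):
        ys.append(math.log(N) + (0.5 - q) * math.log(ys[-1]))
    return ys

N_example = 2.9e7
y_exact = solve_exact(N_example)
y0, y1, y2 = iterates(N_example)

print(f"numerical solution: y = {y_exact:.4f}")
print(f"iterates: y0 = {y0:.4f}, y1 = {y1:.4f}, y2 = {y2:.4f}")
```

For this input the zeroth iterate is off by somewhat less than 10%, while the first iterate is already within a fraction of a per cent of the numerical solution.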
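From (7.15) and (7.16), the relic abundance is

$$\begin{aligned}
n_{\mathrm{dm}0} &= \frac{g_{*S}(T_0)}{g_{*S}(T_d)} \frac{g_{\mathrm{dm}}}{(2\pi)^{3/2}} y^{3/2} e^{-y} T_0^3 \\
&= \frac{g_{*S}(T_0)}{g_{*S}(T_d)} \frac{g_{\mathrm{dm}}}{(2\pi)^{3/2}} N^{-1} y^{1+q} T_0^3 \\
&= \frac{\pi^3 g_{*S}(T_0)}{\sqrt{360} \, \zeta(3)} \frac{y^{1+q}}{\sqrt{g_*(T_d)} \, M_{Pl} m \sigma_0} n_{\gamma 0} \\
&\approx 5.31 \frac{y^{1+q}}{\sqrt{g_*(T_d)} \, M_{Pl} m \sigma_0} n_{\gamma 0} ,
\end{aligned} \tag{7.23}$$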

where we have used (7.20), put $g_{*S}(T_d) = g_*(T_d)$ (we assume that no particles
are becoming non-relativistic as the dark matter decouples) and $g_{*S}(T_0) \approx 3.91$,
and traded the temperature today for the photon number density via the relation
$n_\gamma = 2 \zeta(3) T^3 / \pi^2$. The relic energy density $\rho_{\mathrm{dm}0} = 2 m n_{\mathrm{dm}0}$ depends on the mass
only logarithmically via y, apart from the possible mass dependence of $\sigma_0$.
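
The algebra leading to (7.23) can be cross-checked numerically. The Python sketch below evaluates the numerical prefactor and verifies, for arbitrary test values of y, q, g∗(Td) and g_dm, that the first and third lines of (7.23) agree once N is eliminated with (7.20). The test values and function names are illustrative assumptions; all quantities are dimensionless ratios.

```python
import math

ZETA3 = 1.2020569031595943   # Riemann zeta(3)
G_STAR_S_TODAY = 3.91        # g_*S(T_0), as used in the text

def prefactor():
    """Numerical coefficient pi^3 g_*S(T_0) / (sqrt(360) zeta(3)) in (7.23)."""
    return math.pi**3 * G_STAR_S_TODAY / (math.sqrt(360.0) * ZETA3)

def first_line(y, g_star_d, g_dm):
    """n_dm0 / T_0^3 from the first line of (7.23), with g_*S(T_d) = g_*(T_d)."""
    return G_STAR_S_TODAY / g_star_d * g_dm / (2.0 * math.pi) ** 1.5 * y**1.5 * math.exp(-y)

def third_line(y, q, g_star_d, g_dm):
    """n_dm0 / T_0^3 from the third line of (7.23), with N fixed by (7.20)."""
    N = y ** (q - 0.5) * math.exp(y)                      # (7.20) solved for N
    mpl_m_sigma0 = N / (math.sqrt(45.0 / (4.0 * math.pi**5 * g_star_d)) * g_dm)
    n_over_ngamma = prefactor() * y ** (1 + q) / (math.sqrt(g_star_d) * mpl_m_sigma0)
    return n_over_ngamma * 2.0 * ZETA3 / math.pi**2       # convert n_gamma0 to T_0^3

coefficient = prefactor()
ratio_s_wave = third_line(20.0, 0, 75.75, 4) / first_line(20.0, 75.75, 4)
ratio_p_wave = third_line(25.0, 1, 10.75, 2) / first_line(25.0, 10.75, 2)

print(f"prefactor = {coefficient:.3f}")
print(f"third line / first line: {ratio_s_wave:.12f} (q = 0), {ratio_p_wave:.12f} (q = 1)")
```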

7.7 The WIMP miracle


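Let us consider a particle with $g_{\mathrm{dm}} = 4$ (for example, a spin-1/2 fermion with
both left- and right-handed components), mass m not too different from 1 GeV, and
weak-scale annihilation cross section $\sigma_0 \sim G_F^2 E^2 \sim G_F^2 m^2$, where the Fermi constant is
$G_F \approx 1.17 \times 10^{-5}\ \mathrm{GeV}^{-2}$. Let us also assume that the particle annihilates via the
s-wave process, q = 0. Then we have $n_{\mathrm{dm}0} \propto m^{-3}$, $\rho_{\mathrm{dm}0} \propto m^{-2}$. In the Standard
Model, $g_*(T_d) = 75.75$ for 4 GeV > T > 1 GeV, and let us adopt that value. We
then have $N \approx 2.9 \times 10^7 (m/\mathrm{GeV})^3$, or $\ln N \approx 17 + 3 \ln(m/\mathrm{GeV})$, which is also
approximately the value of y. We thus get $T_d \approx m/[17 + 3 \ln(m/\mathrm{GeV})]$. This is
consistent with the adopted value of $g_*(T_d)$ only for roughly $40\ \mathrm{GeV} \gtrsim m \gtrsim 10\ \mathrm{GeV}$,
but since $g_*(T_d)$ enters only logarithmically, the value of $T_d$ is not sensitive to the
precise number of degrees of freedom. These numbers give
$$\begin{aligned}
n_{\mathrm{dm}0} &\approx 3 \times 10^{-8} \left( 1 + 0.2 \ln \frac{m}{\mathrm{GeV}} \right) \left( \frac{m}{\mathrm{GeV}} \right)^{-3} n_{\gamma 0} \\
&= 3 \times 10^{-8} \eta^{-1} \left( 1 + 0.2 \ln \frac{m}{\mathrm{GeV}} \right) \left( \frac{m}{\mathrm{GeV}} \right)^{-3} n_{B0} \\
&\approx 50 \left( 1 + 0.2 \ln \frac{m}{\mathrm{GeV}} \right) \left( \frac{m}{\mathrm{GeV}} \right)^{-3} n_{B0} ,
\end{aligned} \tag{7.24}$$
where we have taken $\eta = 6 \times 10^{-10}$. Since $m_b \approx 1$ GeV, we have
$$\rho_{\mathrm{dm}0} \approx 100 \left( 1 + 0.2 \ln \frac{m}{\mathrm{GeV}} \right) \left( \frac{m}{\mathrm{GeV}} \right)^{-2} \rho_{b0} . \tag{7.25}$$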
For m = 1 GeV, we would have $\rho_{\mathrm{dm}0}/\rho_{b0} \approx 100$ and m = 100 GeV would give
$\rho_{\mathrm{dm}0}/\rho_{b0} \approx 10^{-2}$. Since $\rho_{b0} \approx 0.05 \rho_{c0}$, we get the bound $m \gtrsim 2$ GeV on the mass
of the dark matter particle in order for its present density not to exceed the critical
density. This is called the Lee-Weinberg bound. We get the observed ratio
$\rho_{\mathrm{dm}0}/\rho_{b0} \approx 5 \ldots 7$ for $m \approx 4 \ldots 5$ GeV. Note the assumptions in the derivation
of the bound: the particle is assumed to be a thermal relic (i.e. the number den-
sity is determined by the thermal equilibrium distribution at decoupling) and the
annihilation occurs via the s-wave process.
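
The numbers of this section can be reproduced with a few lines of Python. The sketch below evaluates N from its definition below (7.20), solves (7.21) by iteration, and inserts the result into the last line of (7.23). It assumes that M_Pl in (7.18) is the reduced Planck mass, 2.435 × 10¹⁸ GeV (a value not quoted in the text), and takes m_b = 1 GeV; the function and variable names are illustrative. Since y is iterated to convergence instead of being set to ln N, the results differ from the rounded values in (7.24) and (7.25) at the level of ten per cent or so.

```python
import math

M_PLANCK = 2.435e18   # reduced Planck mass in GeV (assumed to be the M_Pl of (7.18))
G_FERMI = 1.17e-5     # Fermi constant in GeV^-2
G_DM = 4              # degrees of freedom of the dark matter particle
G_STAR_D = 75.75      # g_*(T_d) adopted in the text
ETA = 6e-10           # baryon-to-photon ratio
Q = 0                 # s-wave annihilation

def wimp_relic(m):
    """Return (N, y, n_dm0/n_gamma0, rho_dm0/rho_b0) for a WIMP of mass m in GeV."""
    sigma0 = G_FERMI**2 * m**2                     # weak-scale cross section, GeV^-2
    N = math.sqrt(45.0 / (4.0 * math.pi**5 * G_STAR_D)) * G_DM * M_PLANCK * m * sigma0
    y = math.log(N)
    for _ in range(50):                            # iterate (7.21) to convergence
        y = math.log(N) + (0.5 - Q) * math.log(y)
    n_over_ngamma = 5.31 * y ** (1 + Q) / (math.sqrt(G_STAR_D) * M_PLANCK * m * sigma0)
    rho_ratio = 2.0 * m * n_over_ngamma / ETA      # uses rho_b0 = (1 GeV) * eta * n_gamma0
    return N, y, n_over_ngamma, rho_ratio

for m in (1.0, 2.0, 4.5, 100.0):
    N, y, n_ratio, rho_ratio = wimp_relic(m)
    print(f"m = {m:6.1f} GeV: N = {N:.2e}, T_d = {m / y:.3f} GeV, "
          f"n_dm0/n_gamma0 = {n_ratio:.2e}, rho_dm0/rho_b0 = {rho_ratio:.3g}")
```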
The fact that a thermal relic with weak cross section and a mass not too different
from the weak scale gives the right relic abundance is called the WIMP miracle.
However, in the MSSM, a weakly interacting dark matter particle with a mass of 4
to 5 GeV would already have been detected in collider experiments. According to
the latest analyses, the lower mass limit for fermionic SUSY partners in the MSSM is
15 GeV [8]. As the LHC starts taking data again in 2015, the limit is expected to
rise. (Lighter particles can be viable in more complicated models, however.) Generally,
the preferred range for dark matter masses is of the order 100 GeV or so in the
usually studied models. One can still get the right relic abundance by making the
self-annihilation cross section smaller so that more particles remain, and extensions
of the Standard Model such as MSSM contain enough free parameters to adjust the
cross sections and masses. However, they can be independently tested in colliders
and via direct and indirect detection of the dark matter particles, which we will
shortly discuss.

7.8 Asymmetric dark matter


It is noteworthy that the observed dark matter abundance is so close to the baryon
abundance, given that in the scenario discussed above the two are determined by
completely different physics. The baryon number density is determined by the con-
servation of baryon number after baryogenesis in the primordial universe, while for
a WIMP thermal relic the dark matter number density is determined by the balance
between weak interactions and gravity via the freeze-out temperature.
There are also models where the dark matter abundance is determined by a
conserved quantum number, as is the case for baryons. It is illustrative to first con-
sider what would happen if there were no baryon-antibaryon asymmetry. Then the
baryon abundance would be determined by the freeze-out of nucleon annihilations
just as in the case for WIMP dark matter. We have g = 4 (protons and neutrons
both have 2 spin states) and $m_N = 0.94$ GeV. The nucleon-antinucleon annihilation
cross section is $\langle \sigma v \rangle = \sigma_0 \sim m_{\pi^0}^{-2}$, where the neutral pion mass is $m_{\pi^0} = 0.135$
GeV. We take $g_*(T_d) = 10.75$, which is the value for $T_d$ between 100 MeV and 0.5
MeV. These numbers give $N \approx 6 \times 10^{19}$, or $\ln N \approx 46$, which gives $y \approx 50$. For the
freeze-out temperature we get $T_d \approx 19$ MeV. The resulting nucleon abundance is
$n_{N0} \approx 7 \times 10^{-19} n_{\gamma 0}$ (plus an equal number of antinucleons), which is smaller than
the real nucleon density $n_{N0} = n_{B0} = 6 \times 10^{-10} n_{\gamma 0}$ by a factor of about $10^9$.
This failure of the reasoning based on the naive freeze-out argument which does
not account for the presence of a conserved quantum number can be light-heartedly
called the “baryon catastrophe”. The lesson is that a primordial baryon asymmetry
and the conservation of the baryon number are essential in determining the baryon
density.
We don’t know which is the correct theory of particle physics that determines
the dark matter density. In many models such as MSSM, there is no dark matter-
antimatter asymmetry. However, there are also models where the dark matter carries
a conserved quantum number which has an asymmetry generated at early times.
In particular, this is the case in some technicolour models. (A recent overview of
asymmetric dark matter can be found in [9].)
In the Standard Model colour interaction, quarks are the relevant degrees of free-
dom at high energies, but at low energies they are bound into mesons and baryons.
Technicolour is a higher energy version of the same idea. In technicolour, the Higgs
is not an elementary particle, but a bound state of some elementary fields which
become visible when probing sufficiently high energies. Technicolour models also
contain other bound states, just like QCD does, and one of those bound states could
be the dark matter particle. In correspondence to the baryon number B of the
Standard Model, there is the technibaryon number TB , carried by elementary tech-
nicolour particles and their bound states. If there is a conserved asymmetry in the
technibaryon number, the abundance of dark matter particles may be determined
by this asymmetry, and it can be very different from the freeze-out abundance we
calculated above, just as with baryons.
If the process which generates the asymmetry in the dark matter is related to
the process which generates the asymmetry in the baryons (baryogenesis), then the
baryon and dark matter number densities are naturally related to each other. This
possibility is called cogenesis. Alternatively, the quantum numbers could be related
because they are mixed by some later process, a possibility called sharing. The
details depend on the particle physics models, and as in the case of WIMP thermal
relics, we keep the discussion at a general level.
If the dark matter particle carries one unit of the conserved quantum number Q
(which could for example be the technibaryon number) and the symmetry-violating
interactions produce N units of Q for every unit of B, and there is no mixing
afterwards, the dark matter abundance today is simply

$$ n_{dm0} = N n_{B0} , \tag{7.26} $$

so

$$ \frac{\rho_{dm0}}{\rho_{B0}} = N \frac{m_{dm}}{m_N} , \tag{7.27} $$

which agrees with the observed ratio $5 \ldots 7$ for $m_{dm} = (5 \ldots 7)/N$ GeV.
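As a small illustration of (7.27), the following sketch inverts the relation to get the dark matter mass for a given density ratio and a given $N$. The function name is hypothetical, and the nucleon mass is taken as 0.94 GeV rather than rounded to 1 GeV, so the result is slightly below the round numbers quoted above.

```python
# Invert eq. (7.27): rho_dm0 / rho_B0 = N * m_dm / m_N.
# The function name and defaults are illustrative, not from the notes.

M_N_GEV = 0.94  # nucleon mass in GeV

def dm_mass_gev(density_ratio, n_per_baryon):
    """Dark matter mass (GeV) giving the requested rho_dm0/rho_B0."""
    return density_ratio * M_N_GEV / n_per_baryon

for ratio in (5.0, 7.0):
    for n in (1.0, 0.01):
        print(f"ratio={ratio}, N={n}: m_dm = {dm_mass_gev(ratio, n):.1f} GeV")
```

For $N = 1$ the mass is a few GeV, while $N \ll 1$ pushes it up towards the electroweak scale.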
One constraint on such models is that the phase space distribution of the dark
matter particles has to correspond to CDM (or WDM). So the dark matter particle
cannot have decoupled at the electroweak phase transition with a thermal
distribution function if its mass is smaller than 100 GeV. However, a model where the
distribution function is not thermal would still be possible – the essential thing is
that the high momentum states of the dark matter particles are not occupied. From
the point of view of technicolour models, $m_{dm} \lesssim 10$ GeV is also somewhat
unnaturally low unless $N \ll 1$, since the technicolour scale has to be $\gtrsim 1$ TeV to be
consistent with collider experiments (no technicolour bound states, or any other
signatures of technicolour for that matter, have been observed). Naively, one would
expect the mass of the stable technicolour dark matter particle to be of this order,
or at least of the order of the Higgs mass, $m_H \sim 100$ GeV, since they have the same
origin as bound states. But there could be a reason why the lightest stable fermionic
bound state is much lighter than a bosonic unstable state. (In QCD, the lightest
bosonic bound states, the pions with $m_{\pi^0} = 135$ MeV and $m_{\pi^\pm} = 140$ MeV, are
about an order of magnitude lighter than the lightest stable bound state, the proton
with $m_p = 938$ MeV, because of chiral symmetry – let’s not get into this!)
Alternatively, we can have reactions which mix particles carrying baryon number
and particles carrying Q together, so that their relative abundance depends on the
freeze-out temperature $T_f$ of these interactions. Let’s say that we have reactions
which interconvert baryons and dark matter particles

dm + X ↔ q + Y , (7.28)

where q stands for a quark, which carries B = 1/3, dm stands for the dark matter
particle which carries Q = 1 (or any other particle carrying the same quantum
number), and X and Y are particles which carry neither B nor Q, and we assume
we can neglect their chemical potentials. We then have, as long as these reactions
are in equilibrium, µdm = µq . Let us assume that these reactions freeze out at the
electroweak phase transition, and take the particle carrying the quantum number to
be massless. (Since the top quark receives a mass of the order of the electroweak
scale at the phase transition, we would have to consider the effect of this. We
neglect this complication. At least in some technicolour models, the top quark does
not make a difference [10].)
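We assume that the technicolour particles are in thermal equilibrium. In order
for them to count as CDM, we then need $m_{dm} \gg T_f$. We thus have

$$
\begin{aligned}
n_B - \bar{n}_B &= g_B T^3 \frac{\mu_B}{T} \\
n_{Q_{dm}} - \bar{n}_{Q_{dm}} &= g_{dm} \left( \frac{m_{dm} T}{2\pi} \right)^{3/2} e^{-m_{dm}/T} \left( e^{\mu_{dm}/T} - e^{-\mu_{dm}/T} \right) \\
&\simeq 2 g_{dm} \left( \frac{m_{dm} T}{2\pi} \right)^{3/2} e^{-m_{dm}/T} \frac{\mu_{dm}}{T} ,
\end{aligned}
\tag{7.29}
$$
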
where we have taken into account that the asymmetries and thus the chemical
potentials are small, and gB = 24 (the number of degrees of freedom in the quarks is
72, and each quark has B = 1/3). Note that just as gB is the number of degrees of
freedom which carry the conserved quantum number B which ends up in baryons,
gdm is the number of degrees of freedom which carry Qdm , which will in the late
universe end up in the dark matter particles only. Equating the chemical potentials
and noting that today ρB0 = mN nB0 , we obtain

$$ \frac{\rho_{dm0}}{\rho_{B0}} = \frac{g_{dm}}{12 (2\pi)^{3/2}} \frac{m_{dm}}{m_N} \left( \frac{m_{dm}}{T_f} \right)^{3/2} e^{-m_{dm}/T_f} . \tag{7.30} $$

Taking gdm = 100 and Tf = 100 GeV, we get the observed abundance for mdm ≈
700 . . . 800 GeV ∼ 1 TeV. (Note that the temperature at which the electroweak
phase transition happens may change from the Standard Model value of 100 GeV
due to the new particles and interactions present in technicolour.)
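
Equation (7.30) cannot be inverted for $m_{dm}$ in closed form, but a numerical solution is straightforward. The sketch below uses plain bisection on the branch $m_{dm} > 2.5\, T_f$, where the right-hand side decreases monotonically with mass (this is the branch relevant for $m_{dm} \gg T_f$). The function names and the bracketing interval are choices made here for illustration.

```python
import math

M_N_GEV = 0.94  # nucleon mass in GeV

def density_ratio(m_dm, t_f=100.0, g_dm=100.0):
    """Right-hand side of eq. (7.30); masses and temperature in GeV."""
    x = m_dm / t_f
    prefactor = g_dm / (12.0 * (2.0 * math.pi) ** 1.5)
    return prefactor * (m_dm / M_N_GEV) * x ** 1.5 * math.exp(-x)

def solve_mass(target_ratio, t_f=100.0, g_dm=100.0):
    """Bisection for m_dm on the decreasing branch m_dm > 2.5 T_f."""
    lo, hi = 2.5 * t_f, 50.0 * t_f
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if density_ratio(mid, t_f, g_dm) > target_ratio:
            lo = mid   # ratio still too large: need a heavier particle
        else:
            hi = mid
    return 0.5 * (lo + hi)

for ratio in (5.0, 7.0):
    print(f"rho_dm/rho_B = {ratio}: m_dm ~ {solve_mass(ratio):.0f} GeV")
```

Because of the exponential, the required mass is only logarithmically sensitive to $g_{dm}$, but it scales almost linearly with the assumed $T_f$.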

7.9 Dark matter vs. modified gravity


Since all evidence for non-baryonic dark matter comes from its gravitational effects,
the observations could in principle instead be explained
by changing the law of gravity. Until the dark matter particle is detected, there is
some room for uncertainty. The problem for modified gravity is that there are so
many different observations for dark matter, in different physical systems: motions
of stars in galaxies, motions of galaxies in clusters, gravitational lensing, large-
scale structure, CMB anisotropies and so on [1, 11]. Gravity has to be adjusted
in a different manner for these different observations, and the resulting models are
rather contrived. Expressed another way, the dark matter scenario is very predictive:
the simple hypothesis of a massive particle with weak couplings to itself and to the
Standard Model particles explains a number of disparate observations and has made
several successful predictions.
One example which got a lot of attention a few years ago is the Bullet cluster [12].
Shown in figure 3 is the collision between two clusters of galaxies. According to the
dark matter scenario the mass of a cluster of galaxies has three main components: 1)
visible galaxies, 2) intergalactic gas and 3) cold dark matter. The last component is
expected to have the largest mass, and the first one the smallest. When two clusters of
galaxies collide, it is unlikely for individual galaxies to collide, and the intergalactic
gas is too thin to noticeably slow down the relatively compact galaxies. On the
other hand, the intergalactic gas components do not travel through each other freely
but are slowed down and heated up by the collision. Thus after the clusters have
passed through each other, much of the intergalactic gas is left behind between the
receding clusters. Cold dark matter, in turn, should be weakly interacting, and thus
practically collisionless. Thus the CDM components of both clusters should also
travel through each other unimpeded.
In the picture of the Bullet cluster, figure 3, the intergalactic gas has indeed
been left behind the galaxies in the collision. The mass distribution of the system
has been estimated from the gravitational lensing effect on the apparent shapes of
galaxies behind the cluster. If there were no cold dark matter, most of the mass
would be in the intergalactic gas, whose mass is estimated to be about five times
that of the visible galaxies. Even in a modified gravity theory, we would expect
most of the lensing effect to be where most of the mass is. However, expectation is
not proof, so the observation cannot be said to rule out all possible models of modified
gravity. Nevertheless, it does provide an example of a successful complex prediction
of the cold dark matter hypothesis.

Figure 3: A composite image of galaxy cluster 1E 0657-56, also called the Bullet Cluster. It
consists of two subclusters, a larger one on the left, and a smaller one on the right. They have
recently collided and traveled through each other. One component of the image is an optical
image which shows the visible galaxies. Superposed on it, in red, is an X-ray image, which
shows the heated intergalactic gas, that has been slowed down by the collision and left behind
the galaxy components of the clusters. The blue colour is another superposed image, which
represents an estimate of the total mass distribution of the cluster, based on gravitational
lensing. NASA Astronomy Picture of the Day 2006 August 24. Composite Credit: X-Ray:
NASA/CXC/CfA/M. Markevitch et al. Lensing map: NASA/STScI; ESO WFI;
Magellan/U. Arizona/D. Clowe et al. Optical: NASA/STScI; Magellan/U. Arizona/D. Clowe
et al.

7.10 Direct detection


As we have seen, there are different plausible mechanisms for producing the observed
dark matter abundance. (There are also mechanisms which we did not discuss, which
involve neither a conserved quantum number nor a thermal relic, such as those
relevant for axions and gravitinos.) These mechanisms are in turn realised in many
different models. In order to distinguish the models and confirm the identity of
the dark matter particle, as well as to be sure that the correct interpretation of
observations is really dark matter and not modified gravity, we need to observe dark
matter also via its non-gravitational interactions.
Usually, detection of dark matter is divided into three different categories:
producing the dark matter particle at colliders (collider detection), measuring the
interactions of baryonic matter with dark matter in the laboratory (direct detection) and
measuring the end products of astrophysical dark matter annihilation or decay (in-
direct detection). A fourth category could be added, detecting the influence of dark
matter on the evolution of stars and the intergalactic medium. For example, dark
matter annihilation in the early universe can heat up the gas which forms stars and
thus have an impact on the formation of early stars and reionization. It has also
been suggested that the first stars would be powered mainly by dark matter anni-
hilation instead of fusion reactions; these have been dubbed ’dark stars’ [13] (this is
something of a misnomer, as they do shine!). Discussing such details requires delv-
ing into details of astrophysics, so we just mention the possibility. Detailed collider
signals are also properly the topic of a specialised particle physics course; we simply
note that if dark matter physics is related to the electroweak scale, whether via su-
persymmetry, technicolour or some other theory, then it is expected to be accessible
in experiments at the LHC. On the other hand, axions or light warm dark matter
candidates would not necessarily have any signature in high-energy colliders.
Let us consider direct detection. Since dark matter is everywhere, including on
(and in) the Earth, we should be able to see its interactions with baryonic mat-
ter. As dark matter interactions with ordinary matter have to be weak in order to
agree with cosmological observations, sensitive dedicated experiments are required.
Mostly WIMPs, like neutrinos, pass through the Earth without interacting at all,
but sometimes they interact with ordinary matter. A typical WIMP direct detec-
tion setup is a well isolated crystal or liquid sample, which is being observed to
find the energy and momentum deposited inside it by a collision of a nucleus with
a dark matter particle$^5$. The problem is that there are many “background” events
which may cause a similar signal. Thus WIMP detectors are continuously detecting
something. Therefore one looks for an annual modulation in the signal. WIMPs, if
they exist, have a particular velocity distribution related to the gravitational well of
our galaxy. The Earth is moving with respect to this velocity distribution, and the
annual change in the direction of Earth’s motion should result in a corresponding
variation in the detection rate.
5
For dark matter particles which do not feel the weak interaction, different detection methods
are needed. For axions, one kind of a detector is a low noise microwave cavity with a large magnetic
field. An axion may interact with the magnetic field and convert into a microwave photon. No
axions have so far been detected. Some dark matter candidates, such as gravitinos, may interact
so weakly that they can in practice be detected only via their gravitational effect.

Let us estimate the expected energy deposition from the elastic collision of a
dark matter particle and a nucleus. For cold dark matter, the velocities are
non-relativistic, so in the laboratory frame we have from conservation of energy and
momentum

$$ m v^2 = m v_{dm}^2 + m_t v_t^2 , \qquad m v = m v_{dm} + m_t v_t , \tag{7.31} $$

where $m_t$ is the mass of the target nucleus, and $m$ is still the dark matter mass.
Here $v$ is the velocity of the incoming dark matter particle, and $v_{dm}$ and $v_t$ are the
velocities of the dark matter particle and the nucleus after the collision. As
the kinetic energy $\frac{1}{2} m_t v_t^2$ given to the nucleus, we get

$$ E = \frac{2 m_t}{(1 + m_t/m)^2} v^2 \approx \frac{2A}{(1 + A m_N/m)^2} \left( \frac{v}{300\ \mathrm{km/s}} \right)^2 \mathrm{keV} , \tag{7.32} $$

where $A$ is the mass number of the target nucleus and $m_t \approx A m_N$.
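
To get a feeling for the numbers in (7.32), the following sketch evaluates the recoil energy directly from the first form of the equation, in natural units with $v$ expressed as a fraction of the speed of light. It assumes the one-dimensional (head-on) kinematics of (7.31); the function name and default values are illustrative.

```python
# Recoil energy from eq. (7.32), E = 2 m_t v^2 / (1 + m_t/m)^2,
# for the one-dimensional elastic collision of eq. (7.31).
# Function name and defaults are illustrative.

M_N_GEV = 0.94          # nucleon mass in GeV
C_KM_S = 299_792.458    # speed of light in km/s

def recoil_energy_kev(m_dm_gev, mass_number, v_km_s=300.0):
    """Kinetic energy (keV) given to a nucleus of mass number A."""
    m_t = mass_number * M_N_GEV           # target mass in GeV
    beta = v_km_s / C_KM_S                # velocity in units of c
    e_gev = 2.0 * m_t * beta**2 / (1.0 + m_t / m_dm_gev) ** 2
    return e_gev * 1.0e6                  # GeV -> keV

# Sodium (A = 23) for a few dark matter masses.
for m in (5.0, 10.0, 100.0):
    print(f"m = {m:5.0f} GeV: E = {recoil_energy_kev(m, 23):.1f} keV")
```

For a light dark matter particle the recoil energy is suppressed by $(m/m_t)^2$, which is why low-threshold detectors and light target nuclei matter for masses of a few GeV.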


The velocity distribution of the dark matter particles is often taken to be Maxwellian
(with a cut-off at the Galactic escape velocity), with a dispersion of $220/\sqrt{2}$ km/s;
the velocity of the Solar system with respect to the Galaxy is 230 km/s, and the
velocity of the Earth relative to the Sun is 30 km/s. A rough estimate of the typical
root mean square velocity is thus $v \approx 300$ km/s. Note that the interaction strength
is irrelevant for the energy exchange; it only affects the probability of the interaction
(i.e. the rate of events observed in the detector). The expected annual modulation
is roughly 30 (km/s)/$v$, which in our approximation is about 10%. There are
uncertainties in the dark matter distribution and the rotation of the Solar system in
the Galaxy, and the annual modulation rate can reasonably be between 1% and 10%
[14].
The event rate depends on the dark matter-nucleus cross section,
$\sigma_{dm-nucleus} \approx A^2 \sigma_{dm-p}$, where $\sigma_{dm-p}$ is the dark matter-proton cross section.
The dark matter-proton cross section can be completely different from the dark
matter-dark matter annihilation cross section. The total number of events per unit
time is given by the interaction rate of a single nucleus times the number of nuclei in
the target with mass $M$, which we denote by $N = M/(A m_N)$:

$$ \Gamma N = \langle \sigma_{dm-nucleus} v \rangle n_{dm} N \approx \frac{2 \times 10^4 A}{\mathrm{yr}} \, \frac{M}{\mathrm{ton}} \, \frac{\langle \sigma_{dm-p} v \rangle}{10^{-40}\ \mathrm{cm}^2 \times 300\ \mathrm{km/s}} \, \frac{\rho_{dm}}{0.3\ \mathrm{GeV/cm}^3} \left( \frac{m}{\mathrm{GeV}} \right)^{-1} , \tag{7.33} $$

where we have put in typical values for the cross section, velocity and WIMP density.
The latter two are determined by taking a given density profile for the dark matter as
a function of radius and using the observed rotation curves, and they also agree with
typical values obtained from galactic simulations of dark matter$^6$. For comparison,
the weak interaction annihilation cross section for 1 GeV mass is
$\sigma \sim G_F^2\, \mathrm{GeV}^2 \sim 10^{-10}\ \mathrm{GeV}^{-2} \approx 4 \times 10^{-38}\ \mathrm{cm}^2 \approx 10^{-27}\ \mathrm{cm}^3/\mathrm{s}$,
using the relation 197 MeV ≈ 1/fm.
6
The energy density one gets in detailed analyses typically does not vary from
$\rho_{dm} = 0.3\ \mathrm{GeV/cm}^3$ by more than a factor of a few. However, strictly speaking, observations are
consistent with no dark matter in the Solar system – as far as galactic rotation curves are concerned,
dark matter is only needed in the outer parts of the galaxy. There is a direct upper limit on the
density of dark matter in the Solar system from the fact that no disruption of planetary orbits
has been observed, and it is about $10^6$ times this value.
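
The prefactor in (7.33) is just unit conversion, which the sketch below spells out in CGS units. It is an illustration under the reference values used in the text ($\sigma_{dm-p} = 10^{-40}$ cm$^2$, $v = 300$ km/s, $\rho_{dm} = 0.3$ GeV/cm$^3$); the function names and the rounded constants are choices made here.

```python
# Event rate of eq. (7.33) in CGS units:
# Gamma * N = A^2 sigma_p * v * (rho_dm / m) * M / (A m_N).
# Names and rounded constants are illustrative.

SECONDS_PER_YEAR = 3.156e7   # seconds in one year
NUCLEON_MASS_G = 1.67e-24    # nucleon mass in grams
GRAMS_PER_TON = 1.0e6        # one metric ton in grams

def events_per_ton_year(mass_number, m_dm_gev, sigma_p_cm2=1e-40,
                        v_km_s=300.0, rho_gev_cm3=0.3):
    """Expected scattering events per ton of target material per year."""
    sigma_nucleus = mass_number**2 * sigma_p_cm2      # A^2 scaling, cm^2
    v_cm_s = v_km_s * 1.0e5                           # km/s -> cm/s
    n_dm = rho_gev_cm3 / m_dm_gev                     # particles per cm^3
    nuclei_per_ton = GRAMS_PER_TON / (mass_number * NUCLEON_MASS_G)
    rate_per_second = sigma_nucleus * v_cm_s * n_dm * nuclei_per_ton
    return rate_per_second * SECONDS_PER_YEAR

def events_per_kg_day(mass_number, m_dm_gev, **kwargs):
    """Same rate expressed per kilogram of target per day."""
    return events_per_ton_year(mass_number, m_dm_gev, **kwargs) / (1000.0 * 365.25)

print(f"A=1,  m=1 GeV : {events_per_ton_year(1, 1.0):.2e} per ton per year")
print(f"A=23, m=10 GeV: {events_per_kg_day(23, 10.0):.2f} per kg per day")
```

For $A = 1$ and $m = 1$ GeV this reproduces the prefactor of order $10^4$ per ton per year, and the rate scales linearly with $A$ at fixed target mass because the $A^2$ enhancement is partly offset by the smaller number of nuclei.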

[Figure 4 plot: residuals (cpd/kg/keV) in the 2-6 keV range versus time (day); DAMA/LIBRA ≈ 250 kg (0.87 ton×yr).]

Figure 4: Modulation of the detection rate of the DAMA/LIBRA experiment in the 2-6
keV energy range, in units of counts per day/kg/keV. From [15].

One direct detection experiment, DAMA/LIBRA$^7$, claims to have detected dark
matter. They see an annual modulation signal with the expected time of maximum
rate (given the direction of the Solar system’s velocity with respect to the galaxy
and the direction of Earth’s velocity with respect to the Sun). They use sodium
iodide crystals; for sodium, $A = 23$. The modulation of the rate is shown in figure 4.
The peak of the energy is at 3 keV, corresponding to a dark matter particle mass of
around 10 GeV. They had a total of about 1.17 ton×year of exposure in the beginning
of 2010 (when figure 4 was released), so we would expect about
$4 \times 10^5 \times \sigma_{dm-p}/(10^{-40}\ \mathrm{cm}^2)$ events, or about 0.1 events per day per kilogram.
With a modulation rate of 10%, we would have a change of 0.01 events per day per
kilogram. This roughly agrees with the number 0.02 in figure 4 for
$\sigma_{dm-p} \approx 10^{-40}\ \mathrm{cm}^2$. (Note that the y-axis is in units of counts per day/kg/keV.
We should integrate the energy-dependent count rate over the range
2–6 keV to compare to our estimate; this will give a factor of order unity.) The
CoGeNT experiment has also seen annual modulation. However, other direct detection
experiments have ruled out ordinary WIMPs as an explanation for DAMA or
CoGeNT, as shown in figure 5, and the interpretation of the data remains controversial.
(For more details on the various experiments, see [17].) Particle physics
possibilities include non-elastic collisions involving an excited state of the dark matter
particle, but systematic effects in the experimental setup are widely considered the
most likely explanation. In any case, there are several direct detection experiments
taking data, and if the dark matter is composed of standard WIMPs, it is possible
they will be observed in the near future.

7.11 Indirect detection


Indirect detection refers to the case when the dark matter particle is identified
through its annihilation or decay products. If there are no dark matter antiparticles
around, as is the case for asymmetric dark matter, there is no annihilation signal.
If the particle is stable or has a lifetime much longer than the age of the universe,
there is no detectable signal from decays. We consider only annihilation.
The relic density of a thermal relic WIMP is determined by when the annihilation
reactions freeze out (essentially because the density gets so low that particles
and antiparticles don’t meet). However, the density in local clumps grows during
structure formation, and this can lead to observable amounts of annihilation. (Note
the similarity to nuclear reactions: they freeze out in the early universe, but light
up again in regions where the density of baryonic matter rises sufficiently due to
gravitational collapse.)

7
http://www.lngs.infn.it/lngs/htexts/dama/

Figure 5: Allowed regions of parameter space for a WIMP interpretation of DAMA
and CoGeNT results. The DAMA and CoGeNT regions are ruled out by other
experiments. From [16].

The amount of annihilation is proportional to the square of the dark matter
density, so the largest signal is expected from regions with high dark matter density, such
as dwarf galaxies or the centre of our own galaxy. Dark matter can also accumulate in
the Sun and at the centre of the Earth, and though the numbers are much smaller,
these locations are much nearer to us, so the detection is easier. However, only
neutrinos can escape from the Sun or the centre of the Earth, whereas in the case
of astrophysical objects we can observe several kinds of annihilation products –
though there too the propagation of charged particles is a bit complicated. From
the direction where we measure a positron or an antiproton we cannot deduce where
the source is, since the paths of both particles are twisted by magnetic fields on the
way. Only the detected number of charged particles carries useful information, not
their direction (and even to calculate the numbers we have to make some assumptions
about propagation). In contrast, photons at the relevant energies travel basically
unimpeded through the galaxy, so we can immediately determine where they come
from. (Scattering of light due to dust is negligible at high energies.)
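Let us consider the annihilation signal from the centre of our galaxy. The
annihilation rate per particle is $\Gamma = \langle \sigma v \rangle n_{dm}$, so the number of annihilation events per
unit volume per unit time is $\langle \sigma v \rangle n_{dm}^2 = \langle \sigma v \rangle m^{-2} \rho_{dm}^2$. Integrating along the line of
sight, the observed flux from annihilations into particle X is

$$ \Phi_X(E) = \frac{N_X(E) \langle \sigma v \rangle}{4\pi m^2} \frac{1}{\Delta\Omega} \int_{\Delta\Omega} d\Omega \int dl\, \rho_{dm}^2 \approx 2 \times 10^{-8}\, N_X \frac{\langle \sigma v \rangle}{10^{-30}\ \mathrm{cm}^3\, \mathrm{s}^{-1}} \left( \frac{\mathrm{GeV}}{m} \right)^2 \bar{J}(\Delta\Omega)\ \mathrm{m}^{-2}\, \mathrm{s}^{-1}\, \mathrm{sr}^{-1} , \tag{7.34} $$

where $N_X(E)$ is the number of X particles of energy $E$ produced in each
annihilation, $\Delta\Omega$ is the observed solid angle and the integral $\int dl$ is over the line of
sight, averaged over the angle. The reference value $10^{-30}\ \mathrm{cm}^3\, \mathrm{s}^{-1}$ is the weak scale
annihilation cross section (with $m = 1$ GeV) times 300 km/s. The function $\bar{J}(\Delta\Omega)$
(the overbar stands for angular average) contains the uncertainties due to the dark
matter distribution, and is defined as

$$ \bar{J}(\Delta\Omega) \equiv \frac{1}{8.5\ \mathrm{kpc}} \left( \frac{1}{0.3\ \mathrm{GeV/cm}^3} \right)^2 \frac{1}{\Delta\Omega} \int_{\Delta\Omega} d\Omega \int dl\, \rho_{dm}^2 . \tag{7.35} $$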

The virtue of indirect detection is that the relevant cross section is the same
one that determines the relic density (for thermal relics), unlike in direct detection,
though the issue is complicated by model-dependent decay channels. However, the
dark matter density profile at the galactic centre (and on small scales in general)
is poorly known. In fact the mismatch between observations and simulations as
regards the centers of galaxies is considered to be one of the most pressing issues of
the CDM model. Simulations predict a sharper increase of density near the center
than inferred from observations of galaxies. For different choices of the density
profile, one gets a range $\bar{J} \approx 30 \ldots 10^6$ for $\Delta\Omega = 10^{-3}$ sr and $\bar{J} \approx 30 \ldots 10^8$ for
$\Delta\Omega = 10^{-5}$ sr [18]. The relevant angular size depends on the angular resolution of
the instrument. In any case, for weak scale annihilation cross-sections, the expected
flux is quite small, though not completely out of reach.
Observational limits from the flux of photons measured by the Fermi-LAT
satellite on dark matter in a constrained version of the MSSM are shown in
figure 6.
At present, there are some observational signals which have been interpreted
as evidence for dark matter. In particular, an excess of positrons was seen by
the PAMELA satellite experiment in 2008, and confirmed by the Fermi satellite
experiment in 2011. However, the rate is too high by a factor of about $10^3$ if the
annihilation cross section is taken to be fixed by the observed relic density, so the
observations are inconsistent with the simple WIMP picture. Different options, from
increased clumping of the dark matter to a mechanism which boosts the annihilation
cross section at small velocities (i.e. today but not at decoupling) were suggested,
but the interpretation remains uncertain. It is possible that the positron excess is
due to astrophysical sources such as pulsars or supernova remnants, or even that
all of the excess positrons are generated by the scattering of other cosmic rays (so
that there is no primary source of excess cosmic rays), as the details of cosmic ray
propagation in the Galaxy are not clear.
One interesting indirect detection channel is neutrinos. High-energy neutrinos
from outside the Solar system have been detected in 2010–2012 by the IceCube
detector on the South Pole [20]. They could be from astrophysical sources, but an
interpretation in terms of dark matter decay has also been proposed.

Figure 6: Observational limits on WIMP dark matter in a constrained version of
the MSSM from dwarf spheroidal galaxies. From [19].

In April 2012, a monochromatic (i.e. occurring at one energy only) gamma ray
signal of 130 GeV was reported from the galactic center using the Fermi satellite
data [21]. If confirmed, this would be a strong indication of dark matter, since
astrophysical sources typically generate a continuum of energies, and dark matter
annihilation is the only known source for emission at a single energy. However, in
dark matter models, we also expect a continuum to accompany the monochromatic
signal, since the dark matter also annihilates into particles other than photons, and some
of these then decay to photons, producing a wide range of energies in the photon
final states. Typically, the continuum signal is expected to be about 1000 times
stronger than the monochromatic line (remember that dark matter couples weakly
to photons!), and no such excess in the continuum emission is seen. So if the signal is
due to dark matter, it has properties rather different from what is expected. At the
moment, the matter remains open. (For more details on the various experiments,
see [17].)
In summary, as with direct detection, the reach of indirect detection
experiments is increasing, and many avenues are being investigated. Whether dark matter
will be detected (via its non-gravitational interactions) depends on which model of
dark matter is correct, and it is worth bearing in mind that there are some models
for which it is impossible to detect dark matter via non-gravitational interactions
in the foreseeable future. For example, gravitinos have only gravitational-strength
interactions with other particles.

References
[1] J. Einasto, arXiv:0901.0632 [astro-ph.CO].

[2] Y. Sofue and V. Rubin, Ann. Rev. Astron. Astrophys. 39, 137 (2001),
arXiv:astro-ph/0010594.

[3] P.J.E. Peebles, arXiv:astro-ph/0410284.

[4] P.A.R. Ade et al. [Planck Collaboration], arXiv:1303.5076 [astro-ph.CO].

[5] M. Vonlanthen, S. Räsänen and R. Durrer, JCAP 08 (2010) 023, arXiv:1003.0810
[astro-ph.CO]; B. Audren, J. Lesgourgues, K. Benabed and S. Prunet,
JCAP 02 (2013) 001, arXiv:1210.7183 [astro-ph.CO].

[6] S. Hannestad, Prog.Part.Nucl.Phys.65:185-208,2010, arXiv:1007.0658 [hep-ph].

[7] K. Nakamura et al. (Particle Data Group), J. Phys. G 37, 075021 (2010),
http://pdg.lbl.gov/2010/tables/contents_tables.html.

[8] G. Belanger et al., arXiv:1308.3735.

[9] K.M. Zurek, arXiv:1308.0338 [hep-ph].

[10] T.A. Ryttov and F. Sannino, Phys.Rev. D78, 115010 (2008), arXiv:0809.0713v1
[hep-ph].

[11] M. Roos, arXiv:1001.0316 [astro-ph.CO].

[12] D. Clowe et al., Astrophys. J. Lett. 648, L109 (2006), arXiv: astro-ph/0608407.

[13] K. Freese, D. Spolyar, P. Bodenheimer and P. Gondolo, New J. Phys. 11, 105014
(2009), arXiv:0903.0101 [astro-ph.CO].

[14] A. M. Green, Mod. Phys. Lett. A 27, 1230004 (2012), arXiv:1112.0524
[astro-ph.CO].

[15] R. Bernabei et al., arXiv:1002.1028v1 [astro-ph.GA].

[16] E. Aprile et al. [XENON100 Collaboration], Phys. Rev. Lett. 109, 181301
(2012), arXiv:1207.5988 [astro-ph.CO].

[17] L. Bergström, arXiv:1309.7267 [hep-ph].

[18] L. Roszkowski et al., Phys. Lett. B671, 10 (2009), arXiv:0707.0622 [astro-ph].

[19] A. Drlica-Wagner [Fermi LAT Collaboration], arXiv:1210.5558 [astro-ph.HE].

[20] Francis Halzen for the IceCube Collaboration, arXiv:1308.3171 [astro-ph.HE].

[21] C. Weniger, JCAP 1208 (2012) 007, arXiv:1204.2797 [hep-ph].


8 Inflation: background
8.1 Motivation
Inflation is a scenario in which there was a period of accelerated expansion in the
very early universe. While it has not been established beyond reasonable doubt
that inflation took place, inflation, like dark matter, is a very successful hypothesis.
Inflationary models have made several detailed predictions that have been observa-
tionally verified, and no competing scenario has had the same success.
The word “inflation” also refers to the period of accelerated expansion, and also
to the accelerated expansion itself. For example, we say that “the universe
inflates” to mean that the expansion accelerates. The motivation for inflation came
from some unresolved issues of the “Big Bang model”, i.e. the homogeneous and
isotropic FRW model with matter consisting of a gas of particles, which we have
considered thus far.

8.1.1 Homogeneity and isotropy problem, or the horizon problem


One question concerns the origin of the symmetry of the FRW model. There is
no unique way to define a measure on the “set of spacetimes” (though attempts
have been made), but the homogeneous and isotropic universes seem very special.
Homogeneity and isotropy of the universe have two distinct aspects. First, depar-
tures from homogeneity and isotropy are small, i.e. the differences in any physical
quantity calculated at two different spatial points are small. This is the case in the
early universe, as we know from the fact that the amplitude of the CMB
perturbations is only $\sim 10^{-5}$ (apart from the dipole component, which is $10^{-3}$ and which is
presumably overwhelmingly due to our motion)$^1$. Today, perturbations are locally
large in the sense that differences in the local value of the energy density,
expansion rate and so on are large. The density of a galaxy can typically be $10^6$ times
larger than the mean density, and a void can expand 50% faster than the mean.
However (and this is the second aspect of homogeneity and isotropy), the universe is
still statistically homogeneous and isotropic today, because the initial distribution
of the small perturbations had this symmetry (we will discuss this in more detail
later). So the homogeneity and isotropy question is related to the origin of the
seeds of structure. Where did the small deviations come from, and why is their
distribution the observed one? A particular aspect of this is the horizon problem. The
CMB anisotropies are correlated on all observed scales. However, at the time of last
scattering any regions which are today separated by more than about 1° had not
had time to interact in the Big Bang model (i.e. assuming the FRW metric and
ordinary matter).
1
Strictly speaking, this is the amplitude of deviations in the distribution of photons. As baryonic
matter was in equilibrium with the photons, it was also close to homogeneous and isotropic. How-
ever, departures from homogeneity and isotropy were larger in the dark matter, which is decoupled
from photons. This is crucial for structure formation, as we will discuss in chapter 11.


Figure 1: The horizon problem.

8.1.2 The flatness problem


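Another issue is the spatial flatness of the universe. The density parameter satisfies

$$ \Omega - 1 = \frac{K}{(aH)^2} . \tag{8.1} $$

If $\Omega$ is unity at some time, it is always unity. If $\Omega \neq 1$ at any time, it evolves in
time. Assuming that the spatial curvature is initially small, we have

$$ \text{mat. dom.} \quad a \propto t^{2/3} , \ H \propto t^{-1} \ \Rightarrow \ \frac{1}{aH} \propto t^{1/3} \ \Rightarrow \ |1 - \Omega| \propto t^{2/3} \tag{8.2} $$

$$ \text{rad. dom.} \quad a \propto t^{1/2} , \ H \propto t^{-1} \ \Rightarrow \ \frac{1}{aH} \propto t^{1/2} \ \Rightarrow \ |1 - \Omega| \propto t . \tag{8.3} $$
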
If the spatial curvature is positive, it will quickly dominate over matter or radia-
tion, and the expansion will stop and turn around, and the universe will collapse. If
the spatial curvature is negative, the universe will quickly become empty and cold.
The flatness problem is thus also called the oldness problem. In the ΛCDM model,
the curvature today is very small, $|\Omega(t_0) - 1| < 10^{-2}$. Therefore, at BBN we have
$|\Omega(t_{BBN}) - 1| \lesssim 10^{-17}$, which seems like a strong tuning. (Of course, if the spatial
curvature is initially zero, it will remain zero. However, this is a special value, for
which we would like to have an explanation.)
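
As a rough numerical check of the tuning estimate above, the scalings (8.2) and (8.3) can be chained backwards from today. The sketch below is only an order-of-magnitude illustration. The function name, the value of about 50 000 years for matter-radiation equality (quoted later in these notes), the BBN-era times between 1 s and 100 s, and the neglect of the recent vacuum-dominated epoch are all assumptions made here.

```python
# Back-of-envelope extrapolation of |Omega - 1| from today back to the BBN era,
# using the scalings (8.2) and (8.3). All input numbers are rough assumptions.

SECONDS_PER_YEAR = 3.156e7

def omega_deviation_at(t_early_s, dev_today=1e-2, t0_yr=14e9, t_eq_yr=5e4):
    """Scale |Omega - 1| from t0 back to an early time in the radiation era.

    Matter era    (t_eq .. t0):      |Omega - 1| ∝ t^(2/3)
    Radiation era (t_early .. t_eq): |Omega - 1| ∝ t
    The late-time vacuum-dominated epoch is ignored.
    """
    t0 = t0_yr * SECONDS_PER_YEAR
    t_eq = t_eq_yr * SECONDS_PER_YEAR
    dev_eq = dev_today * (t_eq / t0) ** (2.0 / 3.0)   # back to equality
    return dev_eq * (t_early_s / t_eq)                # back through radiation era

if __name__ == "__main__":
    for t_bbn in (1.0, 10.0, 100.0):  # seconds, assumed BBN-era times
        print(f"t = {t_bbn:6.1f} s  ->  |Omega - 1| < {omega_deviation_at(t_bbn):.1e}")
```

With these inputs the bound comes out between roughly $10^{-18}$ and $10^{-16}$, which brackets the value quoted above.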

8.1.3 The relic problem


At early times in the Big Bang model the temperature is very high. In grand unified
theories of particle physics there are phase transitions at high temperatures (just
like the Standard Model has the QCD and electroweak phase transitions). These
phase transitions typically generate topological defects such as magnetic monopoles,

cosmic strings and domain walls, which correspond to (approximately) zero-, one-
and two-dimensional topological defects. Just like (given a specific model) we can
calculate the relic density of dark matter particles, we can calculate the density of
these relics. In some models the energy density in monopoles today would be much
higher than the observed energy density. This is related to the fact that monopoles
are typically very massive, with masses of the order of the grand unified scale. The
presence of cosmic strings and domain walls would also be problematic, as they
are also typically very heavy, and their energy density relative to ordinary matter
increases with time (i.e. it goes down more slowly). In supersymmetric models one
particular problem is the overproduction of gravitinos, the supersymmetric partners
of the graviton. If gravitinos are not stable, their lifetime is very long, because
they interact only gravitationally, so they typically decay after BBN, and ruin its
observational success. It is however also possible that the gravitinos form the dark
matter. The constraint on the temperature of the universe from gravitinos is of
the order $T \lesssim 10^7$ GeV. However, it may be that the grand unified theories or
supersymmetric models do not describe reality, in which case this is not a problem.
Observationally, from BBN, we know only that the universe has been at least as hot
as 1 MeV.

8.1.4 What is needed


All of the above problems are solved if we have a mechanism which produces an
“initial condition” for the universe at T > 1 MeV, where the universe is homogeneous
and isotropic up to small perturbations that are correlated on all observable scales,
where the spatial curvature is very small and matter is in thermal equilibrium (at
least the part which consists of baryons, photons and neutrinos). In specific theories
of particle physics, there may be an upper limit on the temperature.

8.2 Inflation introduced


Inflation is not a replacement for the Hot Big Bang model, but an add-on, occurring
at very early times (somewhere between the energy scales of MeV and $10^{16}$ GeV, in
most models closer to the upper end) without disturbing its successes. Inflation is
defined as accelerating expansion:

inflation ⇔ ä > 0 . (8.4)

Often the term inflation is used to refer only to a period of accelerated expansion
in the early universe, and not to the recent phase of accelerated expansion.
Consider how the flatness and horizon problems can be solved with inflation.
The origin of the flatness problem is that $|\Omega - 1| = |K|/(aH)^2 = |K|\dot a^{-2}$ grows
with time because ȧ falls, i.e. the universe decelerates. If the expansion instead
accelerates, Ω is driven towards unity starting from any value. (This is the case for
an expanding universe. If the universe contracts, the behaviour is reversed.)
Consider now the horizon problem. The problem is that in the standard Big
Bang model the horizon at the time of photon decoupling is small compared to the
part of the universe we can see today. In the standard Big Bang picture the universe
is first radiation-dominated and then becomes matter-dominated somewhat before
photon decoupling. (Recall that $t_{\rm eq} \approx 50\,000$ years and $t_{\rm dec} \approx 380\,000$ years.) In
the radiation-dominated era, the horizon is $d_{\rm hor}(t) = 2t = H^{-1}$. In the matter-dominated
era, we have $d_{\rm hor}(t) = 3t = 2H^{-1}$. The horizon at decoupling is between
these values, $H^{-1} < d_{\rm hor}(t_{\rm dec}) < 2H^{-1}$. The size of the observable universe today
is of the order of the present Hubble length, $d_{\rm hor}(t_0) \sim H_0^{-1}$ (the precise prefactor
depends on the vacuum energy density, but the order of magnitude is enough for
us). The presently observable universe was a factor of $a_{\rm dec}/a_0$ smaller at decoupling.
In order to compare the sizes of the horizons at different times we can use comoving
lengths, where this change is taken into account. The comoving horizon at decoupling is

\[ d^c_{\rm hor}(t_{\rm dec}) = (1+z_{\rm dec})\, d_{\rm hor}(t_{\rm dec}) \sim (1+z_{\rm dec})\, H_{\rm dec}^{-1} = (a_{\rm dec} H_{\rm dec})^{-1} , \tag{8.5} \]

and $d_{\rm hor}(t_0) = d^c_{\rm hor}(t_0) \sim H_0^{-1}$. The horizon problem arises because the first horizon
is much smaller than the second,

\[ \frac{d^c_{\rm hor}(t_{\rm dec})}{d^c_{\rm hor}(t_0)} \sim \frac{a_0 H_0}{a_{\rm dec} H_{\rm dec}} \sim (1+z_{\rm dec}) \frac{t_{\rm dec}}{t_0} \approx 0.03 \ll 1 , \tag{8.6} \]
where we have for clarity inserted $a_0$, even though it is equal to unity, and used
$t_{\rm dec} = 380\,000$ yr, $t_0 = 14 \times 10^9$ yr and $z_{\rm dec} = 1090$. In other words, the presently
observed universe was about 30 times larger than the particle horizon at decoupling,
so it contained about $10^5$ regions that had never been in causal contact with each
other. Thus the problem is again that $aH$ decreases with time,

\[ \frac{d}{dt}(aH) = \ddot a < 0 , \tag{8.7} \]
so a period with ä > 0 might solve the problem.
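
The arithmetic behind the estimate (8.6) can be reproduced in a few lines. This is a minimal sketch using the numbers quoted in the text. The function and variable names are illustrative, and counting disconnected regions as the cube of the linear ratio is a crude assumption that is only meaningful at the order-of-magnitude level.

```python
def comoving_horizon_ratio(t_dec_yr=380_000.0, t0_yr=14e9, z_dec=1090.0):
    """Estimate d^c_hor(t_dec) / d^c_hor(t_0) ~ (1 + z_dec) * t_dec / t_0, as in (8.6)."""
    return (1.0 + z_dec) * t_dec_yr / t0_yr

ratio = comoving_horizon_ratio()
linear_factor = 1.0 / ratio        # decoupling horizons across the observable universe
n_regions = linear_factor ** 3     # naive count of causally disconnected volumes

if __name__ == "__main__":
    print(f"horizon ratio      ~ {ratio:.3f}")
    print(f"linear size factor ~ {linear_factor:.0f}")
    print(f"number of regions  ~ {n_regions:.1e}")
```

The linear factor comes out slightly above 30 and the naive volume count is a few times $10^4$, consistent at the order-of-magnitude level with the figures quoted above.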
Recall that the particle horizon refers to the maximum distance that light could
in principle have travelled from the beginning of the universe until time t. If we add
a new period in the early universe to the matter-dominated era and the radiation-dominated
era, such as accelerating expansion, the calculation of $d^c_{\rm hor}$ will depend
on it. We always have $d^c_{\rm hor}(t_0) > d^c_{\rm hor}(t_{\rm dec})$ since $t_0 > t_{\rm dec}$, and the interval $(0, t_{\rm dec})$
is included in the interval $(0, t_0)$. However, in the horizon problem, the relevant
present-day quantity is not actually the distance from which we could in principle
have received signals, but the distance from which we actually measure signals.
Because the universe is opaque before decoupling, the size of the present observable
universe is given by the distance photons have travelled in the interval $(t_{\rm dec}, t_0)$, and
this is not affected by what happens before $t_{\rm dec}$. Thus the relevant present-day scale
is always $\sim H_0^{-1}$.
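Note that the comoving Hubble parameter is equal to the conformal Hubble
parameter,

\[ aH = \frac{1}{a}\frac{da}{d\eta} = \dot a , \tag{8.8} \]

where η is conformal time. The Hubble length is

\[ l_H \equiv H^{-1} , \quad \text{where} \quad H \equiv \frac{\dot a}{a} , \tag{8.9} \]

and the comoving Hubble length is

\[ l^c_H \equiv \frac{1}{a}\, l_H = \frac{1}{aH} = \frac{1}{\dot a} . \tag{8.10} \]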

If aH decreases, then (aH)−1 increases, and vice versa. So we can say that inflation
is any epoch when the comoving Hubble length shrinks:
 
\[ \text{inflation} \ \Leftrightarrow\ \frac{d}{dt}\left(\frac{1}{aH}\right) < 0 . \tag{8.11} \]
It has unfortunately become customary in cosmology to use the word “horizon”
also for the Hubble distance, particularly with regard to inflation. We adopt this
lamentable practice when referring to subhorizon or superhorizon modes (to be de-
fined a bit later), but will otherwise try to be careful not to confuse the two concepts.
Let us consider an example of accelerated expansion that we are already familiar
with from the discussion on dark energy, namely exponential expansion, corresponding
to the vacuum energy equation of state $w = -1$, $a(t) \propto e^{Ht}$, with constant $H$.
We will shortly see that this is a first approximation for the expansion law during
inflation. The horizon distance is

\[ d_{\rm hor}(t) = a(t) \int_0^t \frac{dt'}{a(t')} = H^{-1} e^{Ht} \left(1 - e^{-Ht}\right) \simeq H^{-1} e^{Ht} , \tag{8.12} \]

where the last limit is for $t \gg H^{-1}$. So in contrast to the radiation- and matter-dominated
eras, the physical particle horizon grows exponentially, and the comoving
particle horizon stays almost constant, $d^c_{\rm hor}(t) \simeq H^{-1}$. The present observable
universe has evolved from a small patch of a much larger causally connected region.
See figure 2.
However, even though the particle horizon grows exponentially, the distance over
which it is possible to send signals does not grow. If we consider a light ray, we have
$0 = ds^2 = -dt^2 + a(t)^2 dr^2$, so the comoving coordinate separation (which is the
comoving distance, since the universe is spatially flat) between emission at $t_1$ and
reception at $t_2$ is (taking $a = e^{Ht}$)

\[ \Delta r = \int_{t_1}^{t_2} \frac{dt'}{a(t')} = H^{-1} \left(e^{-Ht_1} - e^{-Ht_2}\right) < H^{-1} . \tag{8.13} \]

If the coordinate separation between two points is more than the Hubble length, it
is not possible to send signals between them. In this sense, the Hubble length gives
the comoving size of the region during inflation over which it is possible to retain
causal connection. If the universe before inflation is matter-dominated, for example,
observers with separation $2(aH)^{-1}$ have been able to send signals to each other,
so causal connection is lost. Also, regardless of what happened before inflation,
during inflation a signal sent at $t_1$ cannot travel a longer coordinate distance than
$H^{-1} e^{-Ht_1}$, and this distance gets smaller as $t_1$ grows, so causal connection is lost during
inflation. Note that the particle horizon, which expresses the maximum distance at
which parts of the universe can have been in causal contact always grows as a
function of time, it never shrinks. What changes during inflation is just that regions
that once were in causal contact cannot send signals to each other any more.
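
The two integrals (8.12) and (8.13) are easy to check numerically. The sketch below assumes exactly exponential expansion, $a(t) = e^{Ht}$, in units where $H = 1$ by default. The function names and the midpoint-rule integrator are illustrative choices, not part of the notes.

```python
import math

def comoving_distance(t1, t2, H=1.0, steps=20_000):
    """Midpoint-rule integral of dt'/a(t') for a(t) = exp(H t), i.e. Delta r in (8.13)."""
    dt = (t2 - t1) / steps
    return sum(math.exp(-H * (t1 + (i + 0.5) * dt)) for i in range(steps)) * dt

def particle_horizon(t, H=1.0):
    """Physical particle horizon d_hor(t) = a(t) * integral_0^t dt'/a(t'), eq. (8.12)."""
    return math.exp(H * t) * comoving_distance(0.0, t, H)

if __name__ == "__main__":
    # The particle horizon grows like exp(H t) ...
    for t in (1.0, 5.0, 10.0):
        print(f"t = {t:4.1f}: d_hor = {particle_horizon(t):.4e}, "
              f"analytic = {math.exp(t) - 1.0:.4e}")
    # ... but a signal sent at t1 never covers more comoving distance than exp(-H t1)/H.
    for t1 in (0.0, 2.0, 4.0):
        print(f"t1 = {t1:3.1f}: max comoving reach = {comoving_distance(t1, t1 + 50.0):.4e}")
```

The second loop illustrates the point made in the text: the later a signal is sent, the smaller the comoving distance it can ever cover.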
The Friedmann equations are

\begin{align}
3 \frac{\dot a^2}{a^2} &= 8\pi G_N \rho - 3 \frac{K}{a^2} \tag{8.14} \\
3 \frac{\ddot a}{a} &= -4\pi G_N (\rho + 3p) . \tag{8.15}
\end{align}

Figure 2: Evolution of the comoving Hubble radius (length, distance) during and after
inflation (schematic).

Thus, in general relativity and assuming the FRW metric, accelerating expansion
requires negative pressure:

inflation ⇔ ρ + 3p < 0 . (8.16)

Note that the energy density of matter for which $p/\rho < -\frac{1}{3}$ falls in an
expanding universe more slowly than $a^{-2}$, i.e. it grows relative to the spatial curvature.
(If $p/\rho < -1$, the energy density actually rises as the universe expands.) The flatness
problem of the Big Bang model is simply the feature that for matter composed of a
gas of particles we have $p \ge 0$, so the energy density falls at least as fast as $a^{-3}$, and
the curvature term will at some point overtake the energy density. In inflationary
models, the energy density typically remains nearly constant during a period in
which the scale factor grows by a huge factor, typically by a factor $e^{60}$ or more.
Thus inflation predicts that $\Omega_0 = 1$ to extremely high accuracy (see footnote 2). See figure 3.
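
The scaling statements in this paragraph follow from the continuity equation $\dot\rho = -3H(\rho + p)$. A short worked version is given below for the simplified case of a constant equation of state $p = w\rho$; the constancy of $w$ is an assumption made here for illustration.

```latex
\begin{align}
\dot\rho = -3H(1+w)\rho
  &\ \Rightarrow\ \frac{d\ln\rho}{d\ln a} = -3(1+w)
  \ \Rightarrow\ \rho \propto a^{-3(1+w)} , \\
\frac{\rho}{|K|/a^{2}} &\propto a^{-(1+3w)} .
\end{align}
```

For $w = 0$ and $w = \frac{1}{3}$ this gives the familiar $a^{-3}$ and $a^{-4}$, and the ratio to the curvature term decreases as the universe expands. For $w < -\frac{1}{3}$ the exponent $-(1+3w)$ is positive, so the energy density grows relative to the curvature term, and for $w = -1$ it stays constant.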
As for the relic problem, if unwanted relics are produced before inflation, they
are diluted to practically zero density by the expansion, just like spatial curvature.
However, we have to be careful that they are not produced after inflation, i.e. the
reheating temperature (see below) has to be small enough. This is one constraint on
models of inflation. At the end of inflation, matter is produced in reheating (see footnote 3), which
2
If it were discovered by observations that $\Omega_0 \neq 1$, this would be a blow to the credibility of
inflation. However, there is a version of inflation, called open inflation, for which it is natural that
$\Omega_0 < 1$. The existence of such models of inflation has led critics of inflation to complain that
inflation is “unfalsifiable” in the sense that no matter what the observation, a model of inflation
can be found that agrees with it. Nevertheless, most models of inflation give similar “generic”
predictions, including Ω0 = 1 to great accuracy, and thus far the observations have been in good
agreement with them.
3
“Reheating” may turn out to be as much a misnomer as “recombination”, as it is not clear
whether matter was ever in a thermal state before inflation.

Figure 3: Solving the flatness problem. This figure is for a universe with no dark energy,
where the expansion keeps decelerating after inflation ended in the early universe. Present
observational evidence indicates that actually the expansion began accelerating again a few
billion years ago. Thus the universe is, technically speaking, inflating again, and Ω is again
being driven towards 1. However, this current epoch of inflation is not enough to solve the
flatness problem, or the other problems, since the universe has only expanded by about a
factor of 2 during it.

produces the gas of particles that is the initial condition for the hot Big Bang model.
Inflation is better called a scenario rather than a theory. It is an idea of a certain
kind of behaviour of the universe, which is realised in dozens of different models.
The models are related to extensions of the Standard Model of particle physics or
extensions of general relativity, and some of them are just “toy models”, which
have the right features and are presumably at most an approximate description of
some more complicated physics. One noteworthy inflationary model is based on the
Standard Model Higgs boson coupled to gravity in a non-standard way.
The important point is that inflation makes many “generic” predictions, i.e. pre-
dictions that are independent of the particular model of inflation, for most models.
(Though exceptional models can be found that would violate one or more of these
general features.) There are also numerical predictions of cosmological observables
that differ from one model of inflation to another, allowing observations to rule out
models (and some have been already ruled out). Present observational data agrees
with the generic predictions (which were made before the advent of the observations
of the CMB anisotropies, which are the most direct way of testing the models),
while alternatives to inflation have not managed to explain the observations in an
equally simple and successful way. Most cosmologists thus consider it likely that
inflation took place in the primordial universe. To quote the cosmologist Douglas
Scott, “something like inflation is something like proven”.
Exercise: Assume that at the beginning of inflation we have $|\Omega_K| = 0.1$. Calculate,
as a function of the reheating temperature $T_{\rm reh}$, how many e-folds of inflation
are required to reduce present-day spatial curvature to $|\Omega_{K0}| < 10^{-2}$. (Assume
$h = 0.7$ and that neutrinos are massless.) Approximate that the expansion rate at
the beginning of inflation is completely dominated by the inflaton, that the inflaton
field value does not change during inflation and that reheating happens instantaneously.
In which directions do the above approximations change the result? What
is the number of e-folds for $T_{\rm reh} = 10^7$ GeV?

8.2.1 Starting inflation


In the discussion, we have already assumed that we can use the FRW metric, i.e. that
the universe is homogeneous and isotropic. In order to explain how inflation produces
homogeneity and isotropy and solves the horizon problem, we should consider how
inflation gets started from some generic initial conditions. Inflation certainly makes
the homogeneity and isotropy problem “exponentially smaller” in the sense that
it produces a large homogeneous and isotropic causally connected patch from a
small one. However, the issue of how to get inflation started remains an open
question. There are some ideas and studies of this, but as we have no solid theoretical
understanding (and no observations at all) of the pre-inflationary era, the issue
remains rather speculative. We will comment on this a bit more after discussing the
simplest inflationary models.
We will assume that sufficient inflation has already taken place to make the
universe (within a horizon volume) flat and homogeneous, and follow inflation in
detail after that. Thus we will work in the flat FRW universe. In any case, from
the modern point of view, the most important (and testable) aspect of inflation is
the generation of the seeds of structure, which we will discuss in chapter 10, which
makes the question of deviations from homogeneity and isotropy quantitative.

8.3 The inflaton field


As we saw in section 8.2, inflation requires negative pressure. In chapter 5 we
considered systems of particles where interaction energies can be neglected (the
ideal gas approximation). For such systems the pressure is always non-negative (see footnote 4).
However, the particle picture is not fundamental. In the early universe, at high
energy densities, we have to consider the more fundamental entities, fields. Particles
are just excitations of fields. The mean value of a field can have negative pressure,
even if a gas consisting of the particles corresponding to the field does not. The
simplest form of matter which has a negative pressure is a scalar field, so the simplest
inflationary models involve just a single scalar field. The field responsible for inflation
(and the corresponding spin 0 particle) is called the inflaton.
The starting point is the Lagrangean density $\mathcal{L}(\varphi, \partial_\mu\varphi)$, where $\varphi$ is the inflaton
field. In the simplest case where the kinetic term of the field has the canonical form
and the field is minimally coupled (see below), we have

\[ \mathcal{L} = -\frac{1}{2} g^{\mu\nu} \partial_\mu\varphi\, \partial_\nu\varphi - V(\varphi) , \tag{8.17} \]

where $V(\varphi)$ is the potential of the field. The action is correspondingly

\[ S = \int d^4x \sqrt{-g}\, \mathcal{L} , \tag{8.18} \]

where $g$ is the determinant of the metric. The effect of spacetime curvature is
manifested via the metric in the kinetic term and the determinant of the metric in
the integration measure. This case is called the minimal coupling to gravity. It is also
possible to include in the Lagrangean density terms which couple the scalar field to
quantities built from the derivatives of the metric. (Such a non-minimal coupling
4
A gas of interacting particles could have negative pressure.

is important if the Higgs field is the inflaton.) We will not discuss non-minimal
coupling.
If the field is free, we have

\[ V(\varphi) = \frac{1}{2} m^2 \varphi^2 , \tag{8.19} \]

and the mass of the particle corresponding to the field $\varphi$ is $m$. If the potential
has higher order terms, these describe self-interactions of the field. Even when the
potential is more complicated than in eq. (8.19), we define the quantity
$m^2(\varphi) \equiv V''(\varphi)$. For $m^2 > 0$, this gives the mass of the particles when the field has the value
$\varphi$. In the case $m^2 < 0$, the field configuration is unstable, and small perturbations
no longer describe particles with mass $m$. We also use the notation

\[ V'(\varphi) \equiv \frac{dV}{d\varphi} \quad \text{and} \quad V''(\varphi) \equiv \frac{d^2V}{d\varphi^2} . \tag{8.20} \]
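Minimisation of the action leads to the Euler–Lagrange equation

\[ \frac{\partial(\sqrt{-g}\,\mathcal{L})}{\partial\varphi} - \partial_\mu \frac{\partial(\sqrt{-g}\,\mathcal{L})}{\partial[\partial_\mu\varphi]} = 0 . \tag{8.21} \]

For the Lagrange density (8.17) we get the field equation

\[ -\frac{1}{\sqrt{-g}}\, \partial_\mu \left( \sqrt{-g}\, g^{\mu\nu} \partial_\nu\varphi \right) + V' = 0 . \tag{8.22} \]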

In flat spacetime (Minkowski space), we have $g^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$, so we get

\[ \ddot\varphi - \nabla^2\varphi + V' = 0 . \tag{8.23} \]

For a free field, we have $V'(\varphi) = m^2\varphi$, and the equation of motion reduces to
the Klein-Gordon equation. For the spatially flat FRW metric we have (in Cartesian
coordinates) $g^{\mu\nu} = \mathrm{diag}(-1, a^{-2}, a^{-2}, a^{-2})$, so we get

\[ \ddot\varphi + 3H\dot\varphi - a^{-2}\nabla^2\varphi + V' = 0 , \tag{8.24} \]

where $\nabla^2\varphi \equiv \delta^{ij}\partial_i\partial_j\varphi$. During inflation, the field (like the space) is almost homogeneous,
so we can take $\partial_i\varphi = 0$ for the background evolution (we will consider
perturbations in chapters 9 and 10). In fact, inflation makes the inflaton field more
homogeneous, as the coefficient $a^{-2}$ falls. A sufficient level of initial homogeneity of
the field is required to get inflation started. We start our discussion when a sufficient
level of inflation has already taken place to make the gradients negligible, so that
the field can be considered homogeneous.
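The Lagrangean density also gives us the energy-momentum tensor

\[ T_{\mu\nu} = -\frac{\partial\mathcal{L}}{\partial(\partial^\mu\varphi)}\, \partial_\nu\varphi + g_{\mu\nu}\mathcal{L} , \tag{8.25} \]

which for the Lagrangean density (8.17) is

\[ T_{\mu\nu} = \partial_\mu\varphi\, \partial_\nu\varphi - g_{\mu\nu} \left( \frac{1}{2} g^{\alpha\beta} \partial_\alpha\varphi\, \partial_\beta\varphi + V \right) . \tag{8.26} \]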

For the FRW metric, the energy density and pressure measured by an observer
comoving with the FRW metric are (see footnote 5)

\begin{align}
\rho &= -T^0{}_0 = \frac{1}{2}\dot\varphi^2 + V \tag{8.27} \\
p &= T^i{}_i = \frac{1}{2}\dot\varphi^2 - V . \tag{8.28}
\end{align}
The field has negative pressure when the potential dominates over the kinetic term,
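i.e. when the field is moving slowly. The equation of state parameter $w \equiv p/\rho$ is

\[ w = \frac{\dot\varphi^2 - 2V(\varphi)}{\dot\varphi^2 + 2V(\varphi)} = \frac{1 - 2V/\dot\varphi^2}{1 + 2V/\dot\varphi^2} , \tag{8.29} \]

so

\[ -1 \le w \le 1 . \tag{8.30} \]

If the kinetic term $\frac{1}{2}\dot\varphi^2$ dominates, $w \approx 1$; if the potential term $V(\varphi)$ dominates,
$w \approx -1$. Different inflaton models have different potentials $V(\varphi)$. From (8.27) and (8.28), we
can form the useful combinations

\begin{align}
\rho + p &= \dot\varphi^2 \\
\rho + 3p &= 2\left(\dot\varphi^2 - V\right) . \tag{8.31}
\end{align}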
We have the equation of motion of the field from (8.24). Alternatively, we could
just insert the energy density and pressure from (8.27) into the continuity equation

ρ̇ = −3H(ρ + p) . (8.32)

This gives the same result,

ϕ̈ + 3H ϕ̇ = −V ′ . (8.33)

This is the field equation for a homogeneous field in a spatially flat FRW universe.
The effect of expansion is to add the term 3H ϕ̇, which acts like friction and slows
down the evolution of ϕ.
The condition for inflation, $\rho + 3p < 0$, is satisfied if

\[ \dot\varphi^2 < V . \tag{8.34} \]

Let us assume that ϕ is initially far from the minimum of V (ϕ). The potential
then pulls ϕ towards the minimum (see figure 4). If the potential has a suitable
(sufficiently flat) shape, the friction term soon makes ϕ̇ small enough to satisfy
(8.34), even if it was not satisfied initially.
We also need the Friedmann equation,

\[ H^2 = \frac{8\pi G}{3}\rho = \frac{1}{3M_{\rm Pl}^2}\rho . \tag{8.35} \]

Inserting the energy density from (8.27), we have

\[ H^2 = \frac{1}{3M_{\rm Pl}^2} \left( \frac{1}{2}\dot\varphi^2 + V \right) . \tag{8.36} \]
5
Those used to the Einstein summation convention should note that there is no summation over
i in (8.28).

Figure 4: An example of inflaton potential.

We have ignored other contributions to the energy density and pressure besides
the inflaton. During inflation, the inflaton moves slowly, so the inflaton energy
density, which is dominated by V (ϕ), also changes slowly. If there are matter and
radiation components in the energy density, they decrease fast, $\rho \propto a^{-3}$ or $\propto a^{-4}$,
and soon become negligible, like the spatial curvature. The presence of extra matter
can put some constraints on the initial conditions for inflation to get started and
the inflaton to become dominant. But once inflation begins, we can soon forget
components other than the inflaton.

8.4 Slow-roll inflation


The friction (expansion) term tends to slow down the evolution of ϕ, so the system
easily reaches a situation where the following conditions hold:

\begin{align}
\dot\varphi^2 &\ll V \tag{8.37} \\
|\ddot\varphi| &\ll 3H|\dot\varphi| \tag{8.38}
\end{align}

These are the slow-roll conditions. If the slow-roll conditions are valid, we may approximate
(the slow-roll approximation) (8.33) and (8.36) by the slow-roll equations:

\begin{align}
H^2 &= \frac{V}{3M_{\rm Pl}^2} \tag{8.39} \\
3H\dot\varphi &= -V' . \tag{8.40}
\end{align}

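The shape of the potential $V(\varphi)$ determines the slow-roll parameters, defined as

\begin{align}
\varepsilon(\varphi) &\equiv \frac{1}{2} M_{\rm Pl}^2 \left( \frac{V'}{V} \right)^2 \tag{8.41} \\
\eta(\varphi) &\equiv M_{\rm Pl}^2 \frac{V''}{V} . \tag{8.42}
\end{align}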
Exercise: Show that

ε≪1 and |η| ≪ 1 ⇐ (8.37) and (8.38) (8.43)

Note that the implication goes only in this direction. The conditions ε ≪ 1 and
|η| ≪ 1 are necessary, but not sufficient for the slow-roll approximation (i.e. the
slow-roll conditions) to be valid. The conditions are not sufficient, because they

Figure 5: The potential $V(\varphi) = \frac{1}{2} m^2 \varphi^2$ and its two slow-roll sections.

only constrain the form of the potential, and identify from the potential a slow-roll
section, where the slow-roll approximation may be valid. Since the field equation
(8.33) is second order, it accepts arbitrary ϕ and ϕ̇ as initial conditions. Thus (8.37)
and (8.38) may not hold initially, even if ϕ is in the slow-roll section. However, it
turns out that the slow-roll solution, the solution of the slow-roll equations (8.39)
and (8.40), is an attractor of the full equations, (8.33) and (8.36). This means that
the solution of the full equations rapidly approaches it, if the initial conditions
are in the basin of attraction. To be in the basin of attraction means that ϕ must
be in the slow-roll section; if ϕ̇ is large, ϕ needs to be deep in the slow-roll section.
Once the system has reached the attractor, where (8.39) and (8.40) hold, ϕ̇ is
determined by ϕ. In fact everything is determined by ϕ (assuming a fixed potential
V (ϕ)). The value of ϕ is the single parameter describing the state of the universe,
and ϕ evolves down the potential V (ϕ) as specified by the slow-roll equations.
The ideas of “attractor” and “basin of attraction” can be taken further. Suppose the
universe (or a region of it) initially lies in (or enters) the basin of attraction
of slow-roll inflation. This means that there is a sufficiently large region where the
curvature is sufficiently small, the inflaton makes a sufficient contribution to the
total energy density, the inflaton is sufficiently homogeneous, and it lies sufficiently
deep in the slow-roll section. Then this region begins inflating and rapidly becomes
very homogeneous and flat, all other contributions to the energy density besides the
inflaton become negligible, and the inflaton begins to follow the slow-roll solution.
Thus inflation erases all memory of initial conditions, and we can predict the
later history of the universe just from the shape of V (ϕ) and the assumption that
ϕ started out far enough in the slow-roll part of it. Note the similarity to thermal
equilibrium. In the stages of the universe we discussed earlier, things were calculable
because in thermal equilibrium, it is sufficient to know the temperature, masses of
particles and conserved quantum numbers in order to have full information about
the system. In the case of inflation, knowing the inflaton field value (and the shape
of the potential) is enough, because of a rather different kind of attractor behaviour.
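Example:

\[ V(\varphi) = \frac{1}{2} m^2 \varphi^2 \ \Rightarrow\ V'(\varphi) = m^2\varphi , \quad V''(\varphi) = m^2 \tag{8.44} \]

\[ \varepsilon(\varphi) = \frac{1}{2} M_{\rm Pl}^2 \left( \frac{2}{\varphi} \right)^2 , \quad \eta(\varphi) = M_{\rm Pl}^2 \frac{2}{\varphi^2} \ \Rightarrow\ \varepsilon = \eta = 2 \left( \frac{M_{\rm Pl}}{\varphi} \right)^2 \tag{8.45} \]

and

\[ \varepsilon, \eta \ll 1 \ \Rightarrow\ \varphi^2 \gg 2M_{\rm Pl}^2 . \tag{8.46} \]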
See figure 5.
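
The attractor behaviour described above can be illustrated numerically for this quadratic potential. The following is a minimal sketch, not a production integrator: it works in units with $M_{\rm Pl} = 1$, and the mass value, the initial conditions, the function names and the hand-written Runge-Kutta stepper are all assumptions chosen for illustration. It integrates the full equations (8.33) and (8.36) from a state that is off the slow-roll solution and compares the resulting field velocity with the slow-roll value from (8.39) and (8.40).

```python
import math

M_PL = 1.0   # units with the reduced Planck mass set to 1
M = 1e-2     # inflaton mass in these units (illustrative value, an assumption)

def V(phi):
    return 0.5 * M**2 * phi**2

def dV(phi):
    return M**2 * phi

def d2V(phi):
    return M**2

def epsilon(phi):
    """Slow-roll parameter (8.41)."""
    return 0.5 * M_PL**2 * (dV(phi) / V(phi)) ** 2

def eta(phi):
    """Slow-roll parameter (8.42)."""
    return M_PL**2 * d2V(phi) / V(phi)

def hubble(phi, phidot):
    """Friedmann equation (8.36)."""
    return math.sqrt((0.5 * phidot**2 + V(phi)) / (3.0 * M_PL**2))

def slow_roll_phidot(phi):
    """Field velocity implied by the slow-roll equations (8.39) and (8.40)."""
    return -dV(phi) / (3.0 * math.sqrt(V(phi) / (3.0 * M_PL**2)))

def deriv(phi, phidot):
    """Full field equation (8.33) written as a first-order system."""
    return phidot, -3.0 * hubble(phi, phidot) * phidot - dV(phi)

def evolve(phi, phidot, t_end, dt=0.05):
    """Integrate (8.33) with (8.36) using classical fourth-order Runge-Kutta."""
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(phi, phidot)
        k2 = deriv(phi + 0.5 * dt * k1[0], phidot + 0.5 * dt * k1[1])
        k3 = deriv(phi + 0.5 * dt * k2[0], phidot + 0.5 * dt * k2[1])
        k4 = deriv(phi + dt * k3[0], phidot + dt * k3[1])
        phi += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        phidot += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return phi, phidot

if __name__ == "__main__":
    # Start deep in the slow-roll section but with the "wrong" velocity
    # (moving up the potential), then let the friction term act.
    phi, phidot = evolve(16.0, 0.05, t_end=100.0)
    print(f"epsilon = {epsilon(phi):.4f}, eta = {eta(phi):.4f}")
    print(f"phidot (full)      = {phidot:.6e}")
    print(f"phidot (slow-roll) = {slow_roll_phidot(phi):.6e}")
```

After a few Hubble times the two velocities agree to well within a percent in this setup, even though the initial velocity had the opposite sign.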

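8.4.1 Relation between inflation and slow-roll

\[ H = \frac{\dot a}{a} \ \Rightarrow\ \dot H = \frac{\ddot a}{a} - \frac{\dot a^2}{a^2} \ \Rightarrow\ \frac{\ddot a}{a} = \dot H + H^2 \tag{8.47} \]

Thus the condition for inflation is $\dot H + H^2 > 0$. This would be satisfied if $\dot H > 0$,
but this is not possible here, since it would require $p < -\rho$, i.e., $w \equiv p/\rho < -1$,
which is not possible for a minimally coupled scalar field, see (8.29) and footnote 6. Thus we have
$\dot H \le 0$ and:

\[ \text{inflation} \ \Leftrightarrow\ -\frac{\dot H}{H^2} < 1 . \tag{8.48} \]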
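If the slow-roll approximation is valid,

\begin{align}
H^2 = \frac{V}{3M_{\rm Pl}^2}
&\ \Rightarrow\ 2H\dot H = \frac{V'\dot\varphi}{3M_{\rm Pl}^2}
\ \Rightarrow\ H^2\dot H = \frac{V' H\dot\varphi}{6M_{\rm Pl}^2} = -\frac{V'^2}{18M_{\rm Pl}^2} \quad (\text{using } 3H\dot\varphi = -V') \\
&\ \Rightarrow\ -\frac{\dot H}{H^2} = \frac{V'^2}{18M_{\rm Pl}^2} \frac{9M_{\rm Pl}^4}{V^2} = \frac{1}{2} M_{\rm Pl}^2 \left( \frac{V'}{V} \right)^2 = \varepsilon \ll 1 .
\end{align}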

So if the slow-roll approximation is valid, inflation is guaranteed. This result
also shows that during slow-roll inflation, the Hubble parameter changes slowly
(while the scale factor changes almost exponentially). As we have noted, slow-roll
conditions are not necessary for inflation: it is possible to have inflation even when
the slow-roll parameters are not small (called fast-roll inflation). However, when we
consider perturbations in chapter 10, we will see that slow-roll inflation automatically
produces a spectrum of perturbations that is in close agreement with observations,
unlike fast-roll inflation.

8.5 Models of inflation


A scalar field model of inflation consists of the potential for the inflaton and its
couplings to other fields. In most models, couplings to other fields don’t matter
during inflation, and only the inflaton is dynamically important. However, these
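6
From the Friedmann equations,

\begin{align}
\left(\frac{\dot a}{a}\right)^2 &= \frac{8\pi G}{3}\rho - \frac{K}{a^2} \\
\frac{\ddot a}{a} &= -\frac{4\pi G}{3}(\rho + 3p)
\end{align}

we get

\[ \dot H = \frac{\ddot a}{a} - \frac{\dot a^2}{a^2} = -4\pi G \left( \rho + p - \frac{K}{4\pi G a^2} \right) . \]

Thus $\dot H > 0$ requires $\rho + p - K/(4\pi G a^2) < 0$. In the above, we assume that spatial curvature can already
be neglected, i.e. we can take $K = 0$.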

Figure 6: Potential for (a) large field and (b) small field inflation. For a typical small-field
model, the entire range of ϕ shown is ≪ MPl .

couplings usually come into play when inflation ends. Inflation can end because the
slow-roll approximation is no longer valid, as the field has rolled down the potential.
In this case inflation ends when either ε(ϕ) or |η(ϕ)| becomes of order unity. Another
possibility is that inflation ends while the inflaton undergoes slow-roll, because other
fields coupled to the inflaton become dynamically important and terminate inflation.
An example of this is hybrid inflation, where there is an extra scalar field in addition
to the inflaton.
Inflation models can be divided into two classes:
1. Small field inflation: $\Delta\varphi < M_{\rm Pl}$ in the slow-roll section.
2. Large field inflation: $\Delta\varphi > M_{\rm Pl}$ in the slow-roll section.
Here $\Delta\varphi$ is the change in $\varphi$ during (the observationally relevant part of) inflation.
Example: Consider a simple potential of the form $V(\varphi) = A\varphi^n$. This is a large
field model, since $V'/V = n/\varphi$, so that $\varepsilon \ll 1$ requires $\varphi^2 \gg \frac{1}{2} n^2 M_{\rm Pl}^2$.

See figure 6 for typical shapes of potentials of large field and small field models.

8.5.1 An exact solution


Usually the slow-roll approximation is sufficient. In single-field models it fails near
the end of inflation, but this is usually not a large correction. It is also much easier
to solve the slow-roll equations, (8.39) and (8.40), than the full equations, (8.33) and
(8.36). However, it is useful to have some exact solutions to the full equations, for
comparison. For some special cases, exact analytical solutions exist. One example
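is power-law inflation, where the potential is

\[ V(\varphi) = V_0 \exp\left( -\sqrt{\frac{2}{p}}\, \frac{\varphi}{M_{\rm Pl}} \right) , \quad p > 1 , \tag{8.49} \]

where $V_0$ and $p$ are constants.
An exact solution for the full equations, (8.33) and (8.36), is

\begin{align}
a(t) &= a_0 t^p \tag{8.50} \\
\varphi(t) &= \sqrt{2p}\, M_{\rm Pl} \ln\left( \sqrt{\frac{V_0}{p(3p-1)}}\, \frac{t}{M_{\rm Pl}} \right) . \tag{8.51}
\end{align}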

Figure 7: After inflation, the inflaton field is left oscillating at the bottom.

The slow-roll parameters for this model are

\[ \varepsilon = \frac{1}{2}\eta = \frac{1}{p} , \tag{8.52} \]
independent of ϕ. In this model inflation never ends unless other physics intervenes.
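
One can verify directly that (8.50) and (8.51) solve the full equations by substituting them into (8.33) and (8.36). The sketch below does this numerically in units with $M_{\rm Pl} = 1$. The values of $V_0$ and $p$ and all function names are illustrative assumptions.

```python
import math

M_PL = 1.0   # units with the reduced Planck mass set to 1
V0 = 1.0     # normalisation of the potential (illustrative value)
P = 10.0     # power in a(t) ∝ t^p, must exceed 1 for inflation (illustrative value)

LAM = math.sqrt(2.0 / P) / M_PL   # exponent in V = V0 * exp(-LAM * phi), eq. (8.49)

def V(phi):
    return V0 * math.exp(-LAM * phi)

def dV(phi):
    return -LAM * V(phi)

def phi_exact(t):
    """Exact field solution (8.51)."""
    return math.sqrt(2.0 * P) * M_PL * math.log(
        math.sqrt(V0 / (P * (3.0 * P - 1.0))) * t / M_PL)

def phidot_exact(t):
    return math.sqrt(2.0 * P) * M_PL / t

def phiddot_exact(t):
    return -math.sqrt(2.0 * P) * M_PL / t**2

def hubble_exact(t):
    """H = da/dt / a for a(t) = a0 * t^p, eq. (8.50)."""
    return P / t

def relative_residuals(t):
    """Residuals of (8.33) and (8.36), normalised by a typical term in each equation."""
    phi, pd, pdd, H = phi_exact(t), phidot_exact(t), phiddot_exact(t), hubble_exact(t)
    field = (pdd + 3.0 * H * pd + dV(phi)) / (3.0 * H * pd)
    friedmann = (H**2 - (0.5 * pd**2 + V(phi)) / (3.0 * M_PL**2)) / H**2
    return field, friedmann

def epsilon():
    """Slow-roll parameter (8.41); constant for an exponential potential."""
    return 0.5 * M_PL**2 * LAM**2

if __name__ == "__main__":
    for t in (1.0, 7.5, 300.0):
        print(t, relative_residuals(t))
    print("epsilon =", epsilon(), " 1/p =", 1.0 / P)
```

Both residuals vanish up to floating-point rounding at every time tried, and the constant value of ε matches (8.52).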

8.6 Reheating
During inflation, practically all energy in the universe is in the inflaton potential
$V(\varphi)$, since according to the slow-roll approximation $\frac{1}{2}\dot\varphi^2 \ll V(\varphi)$. As inflation
ends, this energy is transferred in the reheating process to a thermal bath of particles.
Thus reheating creates, from $V(\varphi)$, all the stuff there is
in the later universe. The conversion of the inflaton energy density into a thermal
gas of particles does not affect the spectrum of density perturbations in single field
models of inflation (at least on superhorizon scales; see section 8.7 below). (It does
change the relation between $\varphi_k$ and $k/H_0$ given in (8.62), i.e. the
amount by which the perturbations are stretched between the end of inflation and today.)
The main constraint on reheating is that the reheating temperature must be above 1
MeV, but sufficiently low so as not to produce unwanted relics, where “sufficiently”
depends on the theory under consideration. For typical supersymmetric theories the
constraint on the reheating temperature is $T_R \lesssim 10^7$ GeV.

8.6.1 Scalar field oscillations


After inflation, the inflaton field ϕ begins to oscillate at the bottom of the potential
$V(\varphi)$, see figure 7. The inflaton field is still homogeneous, $\varphi(t, \vec{x}) = \varphi(t)$, so it
oscillates in the same phase everywhere (the oscillation is coherent). The oscillation
period soon becomes much shorter than the expansion time scale $H^{-1}$.
Assume the potential can be approximated as $V(\varphi) = \frac{1}{2} m^2 \varphi^2$ near the minimum
of $V(\varphi)$, where the amplitude of $\varphi$ is small. The equation of motion is then

\[ \ddot\varphi + 3H\dot\varphi = -m^2\varphi . \tag{8.53} \]

In the limit $m \gg H$, we can neglect the friction term, and the field undergoes
oscillations with frequency $m$. We can write the energy continuity equation as

\[ \dot\rho + 3H\rho = -3Hp = -\frac{3}{2} H \left( \dot\varphi^2 - m^2\varphi^2 \right) . \tag{8.54} \]

Figure 8: The time evolution of ϕ as inflation ends.

The oscillating factor on the right hand side averages to zero over one oscillation
period (in the limit where the period is $\ll H^{-1}$), so on average the energy density
goes like $\rho \propto a^{-3}$, just like in a matter-dominated universe. The fall in the energy
density shows as a decrease of the oscillation amplitude, see figure 8.
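
The matter-like scaling of the oscillating field can be checked with a short numerical experiment. The sketch below integrates (8.53) together with the Friedmann equation (8.36) and $d\ln a/dt = H$, in units with $M_{\rm Pl} = 1$ and $m = 1$. The initial amplitude, the step size and all names are assumptions chosen so that $m \gg H$ holds throughout.

```python
import math

M_PL = 1.0   # units with the reduced Planck mass set to 1
M = 1.0      # inflaton mass; time is then measured in units of 1/m

def deriv(state):
    """Time derivatives of (phi, phidot, ln a) from (8.53), (8.36) and d(ln a)/dt = H."""
    phi, phidot, _ = state
    H = math.sqrt((0.5 * phidot**2 + 0.5 * M**2 * phi**2) / (3.0 * M_PL**2))
    return (phidot, -3.0 * H * phidot - M**2 * phi, H)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(x + h * dx for x, dx in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * dt))
    k3 = deriv(shifted(state, k2, 0.5 * dt))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def rho_and_rho_a3(t_end, phi0=0.1, dt=0.01):
    """Return (rho, rho * a^3) at time t_end, starting from rest at phi0 with a = 1."""
    state = (phi0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
    phi, phidot, lna = state
    rho = 0.5 * phidot**2 + 0.5 * M**2 * phi**2
    return rho, rho * math.exp(3.0 * lna)

if __name__ == "__main__":
    for t in (50.0, 100.0, 200.0):
        rho, rho_a3 = rho_and_rho_a3(t)
        print(f"t = {t:6.1f}: rho = {rho:.3e}, rho * a^3 = {rho_a3:.4e}")
```

The energy density itself drops substantially between the sampled times, while the product $\rho a^3$ stays constant up to small oscillations of relative size of order $H/m$.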

8.6.2 Inflaton decay


When the inflaton field is oscillating around the minimum of the potential, the
energy stored in the inflaton field is transferred into particles, both by decay into
quanta of the inflaton field, which subsequently decay, and direct decay into other
fields via coupling between them and the inflaton. There can be tension between
achieving efficient reheating and having a long period of inflation. To have a long
duration of inflation, the inflaton field must be weakly coupled, but couplings to
other degrees of freedom are required for reheating⁷.
If the decay is slow, the inflaton energy density satisfies the equation

$$ \dot\rho_\phi + 3H\rho_\phi = -\Gamma_\phi \rho_\phi , \tag{8.55} $$

where $\Gamma_\phi = 1/\tau_\phi$, the decay width, is the inverse of the inflaton decay time $\tau_\phi$, and
the term $-\Gamma_\phi \rho_\phi$ represents energy transfer to other particles.
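
Equation (8.55) can be integrated once in closed form. As a short worked step (assuming a constant decay width, and writing $t_{\rm osc}$ for a reference time at the start of the oscillations, a label introduced here only for illustration), multiply by $a^3$ and use $H = \dot a/a$:

```latex
\frac{d}{dt}\left(\rho_\phi a^3\right)
  = a^3\left(\dot\rho_\phi + 3H\rho_\phi\right)
  = -\Gamma_\phi\,\rho_\phi a^3
\quad\Rightarrow\quad
\rho_\phi(t)\,a(t)^3
  = \rho_\phi(t_{\rm osc})\,a(t_{\rm osc})^3\,
    e^{-\Gamma_\phi (t - t_{\rm osc})} .
```

So the inflaton energy in a comoving volume decays exponentially on the time scale $\tau_\phi$, on top of the matter-like dilution $\rho_\phi \propto a^{-3}$ found in section 8.6.1.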
If the inflaton can decay into bosons, the decay may be very rapid, involving a
mechanism called parametric resonance. The produced particles are far from thermal
equilibrium (only certain bands in momentum space become populated, and their
occupation numbers are huge). In realistic models of inflation, the inflaton can
decay via a mixture of different decay mechanisms. The process by which the inflaton
transfers its energy into particles is called preheating and the thermalisation of the
gas of particles is called reheating. However, terminology varies, and often the term
reheating is used to refer just to the energy transfer, even if the final state is not in
thermal equilibrium.

8.6.3 Thermalisation
The particles produced from the inflaton will interact, create other particles through
particle reactions, and the resulting soup will eventually reach thermal equilibrium
with some temperature $T_{\rm reh}$. This reheating temperature is determined by the energy
density $\rho_{\rm reh}$ at the end of the reheating epoch:

$$ \rho_{\rm reh} = \frac{\pi^2}{30} g_*(T_{\rm reh}) T_{\rm reh}^4 . \tag{8.56} $$

Necessarily $\rho_{\rm reh} < \rho_{\rm end}$ (end = end of inflation). If reheating takes a long time,
we may have $\rho_{\rm reh} \ll \rho_{\rm end}$. The evolution of the gas of particles into a thermal
state can be quite involved, and it has been studied in various models. Usually it
is just assumed that it happens eventually, since the particles are able to interact.
However, it is possible that some particles (such as gravitinos) never reach thermal
equilibrium, since their interactions are too weak. In any case, as long as the
momenta of the particles are much higher than their masses, the energy density of
the universe behaves like radiation, regardless of the momentum space distribution.
So the background expansion rate is the same. After thermalisation of at least the
baryons, photons and neutrinos is complete, the standard Hot Big Bang era begins.

⁷ In fact, if the scale of inflation is sufficiently high, it is possible to reheat without any couplings
between the inflaton and the Standard Model degrees of freedom by producing particles gravitationally
out of the vacuum. This is called gravitational reheating, and it is one of the many delicacies
of inflation we will not have time to sample!

Figure 9: Remaining number of e-foldings N(t) as a function of time.
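
Equation (8.56) is easily inverted for the reheating temperature. A small sketch follows; the function names and the sample values ($\rho_{\rm reh}^{1/4} = 10^{16}$ GeV, $g_* = 100$) are illustrative assumptions, and $g_*$ is treated as a fixed input rather than as a function of temperature.

```python
import math


def reheating_temperature(rho_reh, g_star):
    """Invert rho_reh = (pi^2 / 30) g_* T^4 for T (natural units, e.g. GeV)."""
    return (30.0 * rho_reh / (math.pi ** 2 * g_star)) ** 0.25


def radiation_energy_density(temperature, g_star):
    """Eq. (8.56): energy density of a thermal gas of relativistic particles."""
    return math.pi ** 2 / 30.0 * g_star * temperature ** 4


# Illustrative input: rho_reh^(1/4) = 1e16 GeV with g_* = 100.
t_reh = reheating_temperature((1.0e16) ** 4, 100.0)
print(f"T_reh = {t_reh:.2e} GeV")
```

Because of the fourth root, the temperature comes out somewhat below $\rho_{\rm reh}^{1/4}$ and depends only weakly on the assumed $g_*$.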

8.7 Scales of inflation


8.7.1 Amount of inflation
During inflation, the scale factor a(t) grows by a huge factor. We define the number
of e-foldings from time t to the end of inflation $t_{\rm end}$ as

$$ N(t) \equiv \ln \frac{a(t_{\rm end})}{a(t)} . \tag{8.57} $$

See figure 9. We can calculate N(t) ≡ N(ϕ(t)) ≡ N(ϕ) from the shape of the
potential V(ϕ) and the value of ϕ at time t:

$$ N(\phi) = \ln \frac{a(t_{\rm end})}{a(t)} = \int_t^{t_{\rm end}} H(t)\,dt = \int_\phi^{\phi_{\rm end}} \frac{H}{\dot\phi}\,d\phi \overset{\text{slow roll}}{\approx} \frac{1}{M_{\rm Pl}^2} \int_{\phi_{\rm end}}^{\phi} \frac{V}{V'}\,d\phi , \tag{8.58} $$

where we have used $da/a = d\ln a = H\,dt = H\,d\phi/\dot\phi$.
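
As an illustration of (8.58), the slow-roll integral can be evaluated numerically for a given potential. The sketch below uses the quadratic potential of section 8.6.1, for which the integral gives $N = (\phi^2 - \phi_{\rm end}^2)/(4M_{\rm Pl}^2)$. The choice $\phi_{\rm end} = \sqrt{2} M_{\rm Pl}$ (where the slow-roll parameter $\varepsilon = \frac{1}{2} M_{\rm Pl}^2 (V'/V)^2$ reaches unity), the units $M_{\rm Pl} = 1$, the sample field value and the function name are assumptions made for the example.

```python
import math


def efolds(potential, dpotential, phi, phi_end, m_pl=1.0, n=1000):
    """Slow-roll N(phi) = (1 / M_Pl^2) * integral from phi_end to phi of V / V'.

    Uses composite Simpson's rule with n (even) subintervals.
    """
    if n % 2:
        n += 1
    step = (phi - phi_end) / n

    def integrand(x):
        return potential(x) / dpotential(x)

    total = integrand(phi_end) + integrand(phi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(phi_end + i * step)
    return total * step / 3.0 / m_pl ** 2


# Quadratic potential V = m^2 phi^2 / 2; the mass drops out of V / V'.
m = 1.0e-6
V = lambda phi: 0.5 * m ** 2 * phi ** 2
dV = lambda phi: m ** 2 * phi

phi_end = math.sqrt(2.0)   # assumed end of slow roll (epsilon = 1)
phi_start = 15.6           # illustrative field value in units of M_Pl
print(efolds(V, dV, phi_start, phi_end))
print((phi_start ** 2 - phi_end ** 2) / 4.0)   # analytic result for comparison
```

For this potential the number of e-foldings depends only on the field value in Planck units, not on the mass m.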

8.7.2 Evolution of scales


When discussing the evolution of density perturbations and formation of structures
in the universe (to which we will get later), we will be interested in the history
of each comoving distance scale, or each comoving wave number k (from Fourier
expansion in comoving coordinates).

$$ k = \frac{2\pi}{\lambda} , \qquad k^{-1} = \frac{\lambda}{2\pi} $$
An important question is whether a distance scale is larger or smaller than the
Hubble length at a given time. A scale is said to be

• superhorizon, when $k < aH$, i.e. $k^{-1} > (aH)^{-1}$

• at horizon (exiting or entering horizon), when $k = aH$

• subhorizon, when $k > aH$, i.e. $k^{-1} < (aH)^{-1}$.

Note that large length scales (large $k^{-1}$) correspond to small k, and vice versa,
although we often talk about “scale k”. This can easily cause confusion, so be
careful with wording! Notice also that we are here using the word “horizon” to
refer to the Hubble length: more correct terminology would be “sub-Hubble” and
“super-Hubble”⁸. Recall that $(aH)^{-1}$ shrinks during inflation, and grows during all
other eras. See figures 10 and 11.
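
The behaviour of $(aH)^{-1}$ follows from the Friedmann equation: for a spatially flat universe dominated by a fluid with constant equation of state w, $\rho \propto a^{-3(1+w)}$ gives $(aH)^{-1} \propto a^{(1+3w)/2}$, which decreases for $w < -1/3$. The sketch below encodes this scaling and the classification of scales given above. The function names and the normalisation are hypothetical, chosen for illustration.

```python
def comoving_hubble_length(a, w, a_ref=1.0, length_ref=1.0):
    """(aH)^-1 for constant w in flat FRW, normalised to length_ref at a_ref."""
    return length_ref * (a / a_ref) ** ((1.0 + 3.0 * w) / 2.0)


def classify_scale(k, a, hubble):
    """Compare the comoving wavenumber k with aH."""
    a_h = a * hubble
    if k < a_h:
        return "superhorizon"
    if k > a_h:
        return "subhorizon"
    return "at horizon"


# Growth of the scale factor by a factor of 10 in three different eras.
for label, w in [("inflation (w = -1)", -1.0),
                 ("radiation (w = 1/3)", 1.0 / 3.0),
                 ("matter (w = 0)", 0.0)]:
    print(label, comoving_hubble_length(10.0, w))

print(classify_scale(k=0.5, a=1.0, hubble=1.0))
```

A fixed comoving scale $k^{-1}$ therefore ends up outside the shrinking Hubble length during inflation and re-enters once the Hubble length grows again.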
We shall later find that the amplitude of primordial density perturbations on a
given comoving scale freezes as this scale exits the horizon during inflation. The
largest observable scales are of the size of the horizon today. (Since the universe
has recently begun accelerating again, these scales have just barely entered, and are
now exiting again.)
To identify the distance scales during inflation with the corresponding distance
scales in the present universe, we need a complete history from inflation to the
present. We divide it into the following periods:

1. From the time the scale k of interest exits the horizon during inflation to the
end of inflation ($t_k$ to $t_{\rm end}$).

2. From the end of inflation to the time when thermal equilibrium at high temperature
(Hot Big Bang conditions) is achieved, i.e. reheating. We assume
that the universe behaves as if matter-dominated, $\rho \propto a^{-3}$, during this period,
as discussed in section 8.6.1 ($t_{\rm end}$ to $t_{\rm reh}$).

3. From reheating to matter-radiation equality (the radiation era, $\rho \propto a^{-4}$)
($t_{\rm reh}$ to $t_{\rm eq}$).

4. The matter era, $\rho \propto a^{-3}$, from $t_{\rm eq}$ to $t_0$.

Consider a scale k that exits at $t = t_k$, when $a = a_k$ and $H = H_k$:

$$ \Rightarrow\quad k = (aH)_k = a_k H_k . $$
⁸ As discussed in the first part of the course, there are (at least) three different usages for the
word “horizon”:
1. particle horizon
2. event horizon
3. Hubble length

Figure 10: The evolution of the Hubble length, and two scales, $k_1^{-1}$ and $k_2^{-1}$, seen in
comoving coordinates.

Figure 11: The evolution of the Hubble length, and the scale $k^{-1}$ seen in terms of physical
distance.

To find out how large this scale is today, we relate it to the present “horizon”, i.e.,
the Hubble scale (for clarity, we insert $a_0$ here, even though we have chosen it to be
equal to unity):

$$ \begin{aligned}
\frac{k}{a_0 H_0} &= \frac{a_k H_k}{a_0 H_0}
 = \frac{a_k}{a_{\rm end}} \frac{a_{\rm end}}{a_{\rm reh}} \frac{a_{\rm reh}}{a_0} \frac{H_k}{H_0} \\
 &= e^{-N(k)} \left( \frac{\rho_{\rm end}}{\rho_{\rm reh}} \right)^{-1/3}
    \left( \frac{\rho_{\rm reh}}{\rho_{r0}} \right)^{-1/4}
    \left( \frac{V_k}{\rho_{c0}} \right)^{1/2} ,
\end{aligned} \tag{8.59} $$

where we have used the relation $a_k/a_{\rm end} = e^{-N(k)}$, where N(k) is the number of
e-foldings until the end of inflation after the scale k exits the horizon. We have also
taken into account that from the end of inflation until reheating we have approximately
$\rho \propto a^{-3}$ and that from reheating until today the radiation component evolves
like $\rho \propto a^{-4}$. This is slightly inaccurate, since $\rho_r \propto a^{-4}$ does not take into account
the change in $g_*$. However, the approximation is good enough for us here, as the
error will only enter logarithmically in the number of e-foldings⁹. (We assume that
almost all energy density goes into particles with masses smaller than the reheating
temperature.) Finally, we have used $H \propto \sqrt{\rho}$, which follows from the Friedmann
equation, and noted that during slow-roll inflation $\rho_k \approx V_k$, where the subscript k
again refers to the time when the mode with wavenumber k exits the horizon.
⁹ Accurately this would go as:

$$ g_{*s} a^3 T^3 = {\rm const.} \quad\Rightarrow\quad \frac{a_{\rm reh}}{a_0} = \left( \frac{g_{*s}(T_0)}{g_{*s}(T_{\rm reh})} \right)^{1/3} \frac{T_0}{T_{\rm reh}} . \tag{8.60} $$

We approximated this with

$$ \left( \frac{\rho_{r0}}{\rho_{\rm reh}} \right)^{1/4} = \left( \frac{g_*(T_0)}{g_*(T_{\rm reh})} \right)^{1/4} \frac{T_0}{T_{\rm reh}} . $$

Taking $g_{*s}(T_{\rm reh}) = g_*(T_{\rm reh}) \sim 100$, the ratio of these two becomes

$$ \frac{g_{*s}(T_0)^{1/3}}{g_*(T_0)^{1/4}\, g_*(T_{\rm reh})^{1/12}} \approx \frac{3.909^{1/3}}{3.363^{1/4}\, 100^{1/12}} = 0.79 \sim 1 . $$

Note that $a \propto \rho_r^{-1/4}$ is a better approximation than $a \propto T^{-1}$, since these two differ by

$$ \left( \frac{g_*(T_{\rm reh})}{g_*(T_0)} \right)^{1/4} \sim \left( \frac{100}{3.363} \right)^{1/4} \sim 2.33 . $$

We can rewrite (8.59) as

$$ \frac{k}{a_0 H_0} = e^{-N(k)}\, \frac{10^{16}\,{\rm GeV}\, \rho_{r0}^{1/4}}{\rho_{c0}^{1/2}}\, \frac{V_k^{1/4}}{10^{16}\,{\rm GeV}} \left( \frac{V_k}{V_{\rm end}} \right)^{1/4} \left( \frac{\rho_{\rm reh}}{V_{\rm end}} \right)^{1/12} , \tag{8.61} $$

where we have inserted the comparison scale $10^{16}$ GeV, taken into account that
$\rho_{\rm end} \approx V_{\rm end}$ (if inflation ends due to the slow-roll approximation being violated, this
will only be true up to factors of order unity, which we neglect) and rearranged
some of the terms. We don’t know the energy scale of inflation, but there is an
upper limit of approximately $10^{16}$ GeV from the lack of observation of primordial
gravity waves, whose amplitude provides a measure of the inflationary energy scale.
Inserting the values $\rho_{r0} = 4.18 \times 10^{-5} h^{-2} \rho_{c0}$ (assuming massless neutrinos) and
$\rho_{c0}^{1/4} = (\sqrt{3} H_0 M_{\rm Pl})^{1/2} \approx 3.0 \times 10^{-12} h^{1/2}$ GeV, and taking h = 0.7, we obtain for the
number of e-folds

$$ N(\phi_k) = -\ln \frac{k}{a_0 H_0} + 61 - \frac{1}{3} \ln \frac{V_{\rm end}^{1/4}}{\rho_{\rm reh}^{1/4}} + \ln \frac{V_k^{1/4}}{V_{\rm end}^{1/4}} - \ln \frac{10^{16}\,{\rm GeV}}{V_k^{1/4}} , \tag{8.62} $$

where $\phi_k \equiv \phi(t_k)$. The terms have been arranged such that the quantities in the
logarithms are bigger than unity. The second logarithm depends on the efficiency of
reheating: if all of the inflaton potential energy is converted into radiation degrees
of freedom instantaneously, it is zero. The third logarithm is expected to be small, since
the potential varies slowly during slow-roll: the dependence on k in the first term is
expected to dominate. The last term can however be large if the inflation scale is
much lower than $10^{16}$ GeV. For example, inflation at the TeV scale would give −30.
For any given present scale, given as a fraction of the present Hubble distance¹⁰,
(8.62) identifies the value $\phi_k$ the inflaton had when this scale exited the horizon
during inflation. The last three terms give the dependence on the energy scales
connected with inflation and reheating. In typical inflation models, they are relatively
small. Usually, the precise value of N is not that important; we are more interested
in the derivative $dN/dk$, or rather $d\phi_k/dk$.
Anyway, we see that typically (for high scale inflation) about 60 e-foldings of
inflation occur after the largest observable scales exit the horizon. There is no
similar constraint on the number of e-folds before these scales exited the horizon,
and the number varies from a few to $10^8$ (or more) between different models.
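
To get a feeling for the sizes of the terms in (8.62), the formula can be evaluated directly. The following sketch takes the numerical inputs quoted in the text (h = 0.7, $\rho_{r0} = 4.18 \times 10^{-5} h^{-2} \rho_{c0}$, $\rho_{c0}^{1/4} \approx 3.0 \times 10^{-12} h^{1/2}$ GeV) and recomputes the constant rather than hard-coding 61. The function and variable names are hypothetical, and all energy scales are passed as fourth roots of energy densities in GeV.

```python
import math

H_LITTLE = 0.7                                   # dimensionless Hubble parameter
RHO_C0_QUARTER = 3.0e-12 * math.sqrt(H_LITTLE)   # rho_c0^(1/4) in GeV
OMEGA_R0 = 4.18e-5 / H_LITTLE ** 2               # rho_r0 / rho_c0, massless neutrinos
SCALE = 1.0e16                                   # comparison scale in GeV


def efolds_after_exit(k_over_a0h0, v_k_quarter, v_end_quarter, rho_reh_quarter):
    """Eq. (8.62): e-folds between horizon exit of scale k and the end of inflation."""
    # ln(1e16 GeV * rho_r0^(1/4) / rho_c0^(1/2)); this is the "61" of the text.
    constant = math.log(SCALE * OMEGA_R0 ** 0.25 / RHO_C0_QUARTER)
    return (
        -math.log(k_over_a0h0)
        + constant
        - math.log(v_end_quarter / rho_reh_quarter) / 3.0
        + math.log(v_k_quarter / v_end_quarter)
        - math.log(SCALE / v_k_quarter)
    )


# Present Hubble scale, instantaneous reheating, flat potential:
high = efolds_after_exit(1.0, 1.0e16, 1.0e16, 1.0e16)
tev = efolds_after_exit(1.0, 1.0e3, 1.0e3, 1.0e3)
print(high, tev, tev - high)
```

The difference between the two cases is the last logarithm alone, which is how the text arrives at a shift of about −30 for TeV-scale inflation.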

8.8 Before inflation


As we discussed earlier, inflation erases all memory of the initial conditions before
inflation, and we do not have a good theoretical understanding
of what happened in that era. However, there are some ideas. During inflation, the
universe is expanding and (in most models) the energy density is decreasing. We thus
expect that the energy density is higher before inflation than during it or after it.
Often it is assumed that inflation begins right at the Planck scale, $\rho \sim M_{\rm Pl}^4$. This is
the highest energy density to which we can extend our discussion, which is based on
classical GR: at such densities quantum gravitational effects are expected to be important. One
idea is that the universe at that time, the Planck era, is some kind of a “spacetime
foam”, where the fabric of spacetime itself is subject to large quantum fluctuations.
When the energy density of some region, larger than $H^{-1}$, falls below $M_{\rm Pl}^4$, spacetime
in that region begins to behave in a classical manner. See figure 12.


Figure 12: Classical regions emerging from spacetime foam.

The initial conditions, i.e. the conditions at the time when our Universe emerges
from the spacetime foam, are usually assumed to be chaotic (this word does not refer to
chaos theory!), i.e. ϕ takes random values in different regions. Since $\rho \geq \rho_\phi$, and

$$ \rho_\phi = \frac{1}{2} \dot\phi^2 + \frac{1}{2} \frac{1}{a^2} \partial_i\phi\, \partial_i\phi + V , \tag{8.63} $$

we must have

$$ \dot\phi^2 \lesssim M_{\rm Pl}^4 , \qquad \frac{1}{a^2} \partial_i\phi\, \partial_i\phi \lesssim M_{\rm Pl}^4 , \qquad V \lesssim M_{\rm Pl}^4 \tag{8.64} $$

in a region for it to emerge from the spacetime foam.

¹⁰ For example, $k/H_0 = 10$ means that we are talking about a scale corresponding to a
wavelength λ such that λ/2π is one tenth of the Hubble distance.
Inflation may begin at many different parts of the spacetime foam. Our observ-
able universe is just one small part of one such region which has inflated to a huge
size.
It is also possible that during inflation, for some part of the potential, quantum
fluctuations of the inflaton (not the spacetime!) dominate over the classical evolution
and push ϕ higher in some regions. These regions will then expand faster, and
dominate the volume. This gives rise to eternal inflation, where, at any given time,
most of the volume of the universe is inflating. (Whether or not this can happen
depends on the shape of the potential and the field value during inflation.) But our
observable Universe is part of a region where ϕ rolled down and came to a region
of the potential, where the quantum fluctuations of ϕ were small and the classical
behaviour began to dominate and inflation ended.
Thus the ultra-large scale structure of the universe may be very complicated.
However, this is not observable to us, and all the features of the universe we see
can be explained in terms of what happened in our patch during or after inflation.
These ideas of the spacetime foam and eternal inflation are rather speculative, and
there are also different suggestions for the initial stages of the universe.
9 Linear perturbation theory
9.1 Structure formation
Up to this point we have discussed the universe in terms of the homogeneous and
isotropic FRW model. We have however already used the notion of temperature,
which involves fluctuations, so inhomogeneities have already implicitly been present.
We now take the next step by explicitly considering small perturbations around the
homogeneous and isotropic model (which we now refer to as the “unperturbed” or the
“background” universe). In cosmology, perturbation theory has wide applicability.
Often the distribution of non-linear objects can be treated in terms of linear theory,
even though their internal composition cannot, and even very non-linear structures
such as planets, stars and galaxies have evolved from small initial perturbations
under the influence of gravity. This growth is called structure formation, though
sometimes the term is used to refer only to the situation when perturbations become
of order unity and bound structures form. The discussion of perturbations can thus
be divided into two parts.

1) The generation of the primordial perturbations, “the seeds of structure”. This
is the more speculative part of structure formation theory. We don’t know
how the primordial perturbations came about, but we have a good candidate
scenario, inflation, the predictions of which have so far agreed very well with
observations, and which are currently being tested more thoroughly. According
to the inflationary scenario, all structure originates from quantum fluctuations
in the early universe.

2) The growth of the small perturbations into the present observable structure
of the universe. This part is less speculative, since we have a well established
theory of gravity, general relativity. However, there is uncertainty in this part
too, since we do not know the precise nature of the dominant components
to the energy density of the universe, the dark matter and the possible dark
energy. The gravitational growth depends on the equations of state and the
streaming lengths (particle mean free path between interactions) of these den-
sity components. Besides gravity, the growth is affected by pressure (due to
non-gravitational interactions).

We will first discuss the formalism of cosmological perturbation theory. We will
apply it to the generation and early evolution of structures, then to the evolution
of the perturbations in the various later eras in the history of the universe. We
will discuss the cosmic microwave background using perturbation theory. We will
not discuss the formation of galaxies or other non-linear structures except in very
general terms, as we only follow perturbations up to the time when they enter the
non-linear regime.
We will work with first order perturbation theory (also called linear perturbation
theory). This means that all quantities are written as a sum of the background value,
corresponding to the homogeneous and isotropic model, and a perturbation, which
is the deviation from the background value. For example, for the energy density we
have
ρ(t, x) = ρ̄(t) + δρ(t, x) ,


where x are the comoving spatial coordinates. We assume that perturbations are
small, so that we can drop all terms which contain a product of two or more pertur-
bations. The remaining equations then contain only terms which are either zeroth
order i.e. contain only background quantities, or first order i.e. contain exactly one
power of the perturbed quantities. If we understand the zeroth order parts as the
average, then the average of the perturbations vanishes. By averaging the inhomo-
geneous equations we thus get back the equations of the homogeneous and isotropic
universe. Subtracting these from our equations we arrive at the perturbation equa-
tions where every term is first order in the perturbation quantities, i.e. the equations
are linear¹.
The more rigorous way of doing perturbation theory would be to take the full
set of equations (in this case the various components of the Einstein equation) for a
general inhomogeneous spacetime and linearise them, dropping higher order terms
as discussed above. The more conventional way is to start with the homogeneous
and isotropic model and add perturbations on top of that. We will follow this easier
route².

9.2 The perturbed metric


Let us first discuss perturbations of the metric. We leave the development of cos-
mological perturbation theory to a more advanced course, and just summarise some
basic concepts and results. (The interested reader may consult [1, 2] for details.)
We have the line element

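$$ ds^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta = (\bar g_{\alpha\beta} + \delta g_{\alpha\beta})\, dx^\alpha dx^\beta , \tag{9.1} $$

where $\bar g_{\alpha\beta}$ is the background metric,

$$ \bar g_{\alpha\beta}\, dx^\alpha dx^\beta = -dt^2 + a^2(t)\, \delta_{ij}\, dx^i dx^j , \tag{9.2} $$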

and $\delta g_{\alpha\beta}$ is a perturbation, which we take to be small. In this course, we only
consider spatially flat backgrounds, as spatial curvature would introduce technical
complications we don’t want to deal with. The question of what counts as a small perturbation
is not entirely straightforward. For example, we might naively demand
$|\delta g_{\alpha\beta}| \ll |\bar g_{\alpha\beta}|$. But (leaving aside that some of the components $\bar g_{\alpha\beta}$ are zero) this kind of a
statement is coordinate-dependent. We can make a coordinate transformation that
will make a large change to the metric, while keeping the physics exactly the same.
An example would be a large Lorentz boost. This shows another problem, namely
that perturbations in the metric do not necessarily correspond to changes in the
physical state of the system.
In general, if perturbations in all physical quantities are small, it should be
possible to choose a coordinate system where the metric perturbations are small
(compared to unity). Note that the reverse is not true: from the fact that the
metric perturbations are small one cannot conclude that perturbations in all physical
quantities are small. For example, the gravitational field in the solar system is quite
small, and the solar system can be represented as a linear perturbation around
Minkowski space. However, the energy density in the solar system changes by a
factor $10^{20}$ when going from Earth to interplanetary space.

¹ This way of decoupling the background and the perturbations does not work straightforwardly
beyond first order perturbation theory. We will be content with linear theory.

² Note that it is not guaranteed that such a linear extension is a linearised version of a solution
of the full equations. We won’t worry about such details.
From now on, we assume that we have chosen an appropriate coordinate system
such that the metric perturbations are small, so we can neglect all terms which are
second order or higher in the metric perturbations.
In the linear approximation, the metric perturbations do not influence the evo-
lution of the background on which they live. The metric perturbations inherit geo-
metric structure from the background. Just like in classical electrodynamics we can
decompose a general tensor into irreducible representations of the Lorentz group,
we can decompose the metric perturbations into irreducible parts with regard to
the symmetries of the background, namely translation and rotation in the spatial
dimensions. In less technical language, the perturbations can be split up into things
which have either zero, one or two spatial indices, and which we can treat like scalars,
vectors and tensors living on a Euclidean space. The most general linear perturbation
around the FRW metric (9.2), decomposed into its irreducible parts, reads

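$$ \begin{aligned}
ds^2 &= g_{\alpha\beta}\, dx^\alpha dx^\beta
      = \bar g_{\alpha\beta}\, dx^\alpha dx^\beta + \delta g_{\alpha\beta}\, dx^\alpha dx^\beta \\
     &= -(1 + 2\Phi)\, dt^2 + 2a(t)(B_{,i} - S_i)\, dx^i dt \\
     &\quad + a(t)^2 \left[ (1 - 2\Psi)\delta_{ij} + 2E_{,ij} + F_{i,j} + F_{j,i} + h_{ij} \right] dx^i dx^j ,
\end{aligned} \tag{9.3} $$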

where Φ, Ψ, B and E are scalars, $S_i$ and $F_i$ are vectors and $h_{ij}$ is a tensor, and a comma
stands for derivative with respect to $x^i$, i.e. $f_{,i} \equiv \partial f/\partial x^i$. The vector perturbations
are transverse, $\delta^{ij} S_{i,j} = \delta^{ij} F_{i,j} = 0$, and the tensor perturbation is transverse and
traceless, $\delta^{ij} h_{ij} = 0$, $h_{ij,j} = 0$. Physically, tensors correspond to gravity waves,
vectors describe rotation and scalars are directly related to the density perturbation,
as we will see.
Since we drop all non-linear terms, the scalar, vector and tensor perturbations
evolve independently. The vector perturbations decay with the expansion, and are
expected to be negligible in the linear regime, so we put them to zero, $S_i = F_i = 0$.
There can be significant tensor perturbations in the universe, and they may be
observable in the cosmic microwave background anisotropy. This depends on the
details of inflation. No tensor perturbations have been detected thus far, but it is
possible that the Planck satellite, whose data on the polarisation of the CMB is set to
be released in 2014, will be able to detect them.
For the metric perturbation, we have 10 functions $\delta g_{\alpha\beta}(t, \mathbf{x})$. So there would
appear to be ten degrees of freedom. However, four of them are not physical degrees
of freedom, they just correspond to the freedom of choosing the four coordinates.
So there are 6 physical degrees of freedom. There are thus different coordinate
systems (also called different gauges) which describe the same physics. The choice
of coordinates is called a choice of gauge³. It can be shown that we can choose
E = B = 0, and that doing so fixes the coordinate system completely. This choice
is known as the longitudinal gauge and also as the conformal Newtonian gauge. We
are then left with the metric

$$ \begin{aligned}
ds^2 &= g_{\alpha\beta}\, dx^\alpha dx^\beta
      = \bar g_{\alpha\beta}\, dx^\alpha dx^\beta + \delta g_{\alpha\beta}\, dx^\alpha dx^\beta \\
     &= -(1 + 2\Phi)\, dt^2 + a(t)^2 \left[ (1 - 2\Psi)\delta_{ij} + h_{ij} \right] dx^i dx^j ,
\end{aligned} \tag{9.4} $$

so we have two scalar degrees of freedom and one transverse traceless symmetric
tensor, which has two independent degrees of freedom. The metric perturbations
Φ(t, x) and Ψ(t, x) are called the Bardeen potentials⁴. The function Φ is also called
the Newtonian potential, since in the Newtonian limit, it becomes equal to the
Newtonian potential perturbation, and Ψ is called the Newtonian curvature perturbation,
because it determines the curvature of the 3-dimensional t = const. subspaces, which
are flat in the unperturbed universe.

³ More precisely, perturbation theory is formulated in terms of a mapping from the real
inhomogeneous and anisotropic spacetime to a background spacetime, and it is the choice of map which
is called a “gauge choice”. However, the choice of coordinates and choice of mapping are often
conflated in cosmological parlance. More simply, change of gauge is a change of coordinates, except
that it only affects the perturbations, the background is kept fixed. We will not get into such details.
The evolution of the metric perturbations is determined by the Einstein equa-
tion, which couples the metric to the matter content as described by the energy-
momentum tensor.

9.3 The perturbed equations of motion


The Einstein equation is

$$ G_{\alpha\beta} = 8\pi G_N T_{\alpha\beta} , \tag{9.5} $$

where $G_{\alpha\beta}$ is a tensor which is built from the metric and its first and second
derivatives, and the energy-momentum tensor $T_{\alpha\beta}$ describes the properties of matter. In
chapter 3 we noted that for an ideal fluid the energy-momentum tensor has the
following form

$$ T_{\alpha\beta} = (\rho + p) u_\alpha u_\beta + p\, g_{\alpha\beta} , \tag{9.6} $$

where ρ is the energy density and p is the pressure measured by an observer moving
with four-velocity $u^\alpha$. In the FRW case, the energy-momentum tensor necessarily
has this form for all forms of matter due to the symmetry of the spacetime. In
the perturbed case, the energy-momentum tensor can also have contributions from
energy flux and anisotropic stress in addition to the energy density and pressure.
We will not discuss such imperfect fluids.
As with the metric, we split the contributions to the energy-momentum tensor
into background plus perturbations,

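$$ \rho(t, \mathbf{x}) = \bar\rho(t) + \delta\rho(t, \mathbf{x}) \tag{9.7} $$

$$ p(t, \mathbf{x}) = \bar p(t) + \delta p(t, \mathbf{x}) \tag{9.8} $$

$$ u^\alpha(t, \mathbf{x}) = \delta^{\alpha 0} + \delta u^\alpha(t, \mathbf{x}) , \tag{9.9} $$

and we throw out all terms which have two or more powers of the perturbations,
whether of the metric or the matter variables. The four-velocity is normalised as
$g_{\alpha\beta} u^\alpha u^\beta = -1$, from which it follows that $\delta u^0 = -\Phi$ in linear theory.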
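
The result for $\delta u^0$ takes one line to verify. As a sketch, using the time-time component $g_{00} = -(1 + 2\Phi)$ of the metric (9.4) and noting that the spatial velocity $\delta u^i$ enters the normalisation only at second order:

```latex
-1 = g_{\alpha\beta} u^\alpha u^\beta
   = -(1 + 2\Phi)\,(1 + \delta u^0)^2 + \mathcal{O}(\delta u^i \delta u^j)
   \simeq -1 - 2\Phi - 2\,\delta u^0
\quad\Rightarrow\quad
\delta u^0 = -\Phi .
```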
⁴ Warning: Sign conventions for Φ and Ψ differ, and the definitions of Ψ and Φ are also sometimes
switched with each other.

Equating the Einstein tensor corresponding to the metric (9.4) to the energy-momentum
tensor (9.6) (times $8\pi G_N$) in the linear approximation, we get the familiar
equations for the background:

$$ 3H^2 = 8\pi G_N \bar\rho \tag{9.10} $$

$$ 3(\dot H + H^2) = -4\pi G_N (\bar\rho + 3\bar p) , \tag{9.11} $$

where we have used the relation $\ddot a/a = \dot H + H^2$. For the perturbations, we get

$$ 4\pi G_N\, \delta\rho = \frac{1}{a^2} \nabla^2 \Psi - 3H(\dot\Psi + H\Phi) \tag{9.12} $$

$$ 4\pi G_N (\bar\rho + \bar p)\, \delta u_i = -(\dot\Psi + H\Phi)_{,i} \tag{9.13} $$

$$ 4\pi G_N\, \delta p\, \delta_{ij} = \left[ (2\dot H + 3H^2)\Phi + H\dot\Phi + \ddot\Psi + 3H\dot\Psi + \frac{1}{2} \frac{1}{a^2} \nabla^2 D \right] \delta_{ij} - \frac{1}{2} \frac{1}{a^2} D_{,ij} \tag{9.14} $$

$$ 0 = \ddot h_{ij} + 3H \dot h_{ij} - \frac{1}{a^2} \nabla^2 h_{ij} , \tag{9.15} $$

where $\nabla^2 \equiv \delta^{ij} \partial_i \partial_j$ and $D \equiv \Phi - \Psi$. These are the central equations for discussing
the evolution of perturbations. In this course, we cannot properly derive them from
the general Einstein equation, we just have to take them as given.
From the non-diagonal components of (9.14) we get that $D_{,ij} = 0$ for all $i \neq j$.
The general solution of this equation is D = A(t, x) + B(t, y) + C(t, z). In
cosmology there are no preferred coordinate axes, so the only physically relevant
solution is D = D(t). However, this corresponds to changing the time coordinate,
so we can set D(t) = 0 without loss of generality. We therefore have Φ = Ψ.⁵ To see
what the single remaining scalar metric degree of freedom corresponds to, we can
manipulate the remaining perturbation equations (9.12)–(9.14). Let us introduce
some notation: the density contrast is defined as

$$ \delta \equiv \frac{\delta\rho}{\bar\rho} . \tag{9.16} $$

We also define the background equation of state as $w \equiv \bar p/\bar\rho$, and introduce the
variable $v^2 \equiv \delta p/\delta\rho$. We will later see that v corresponds (for certain types of
perturbation called adiabatic) to the sound speed of the cosmic fluid (if $v^2 < 0$,
it instead describes an instability of the fluid). We can now express the pressure
perturbation in terms of $v^2$ and δ, and write (9.12)–(9.15) as

$$ 0 = \ddot\Phi + H(4 + 3v^2)\dot\Phi - v^2 \frac{1}{a^2} \nabla^2 \Phi + \left[ 2\dot H + (3 + 3v^2)H^2 \right] \Phi \tag{9.17} $$

$$ \delta = \frac{2}{3} \frac{1}{(aH)^2} \nabla^2 \Phi - 2 \frac{1}{H} \dot\Phi - 2\Phi \tag{9.18} $$

$$ \delta u^i = \frac{1}{a^2 \dot H}\, \partial_i (\dot\Phi + H\Phi) \tag{9.19} $$

$$ 0 = \ddot h_{ij} + 3H \dot h_{ij} - \frac{1}{a^2} \nabla^2 h_{ij} . \tag{9.20} $$

⁵ In fact, neutrinos develop anisotropic stress after neutrino decoupling, they do not behave like
an ideal fluid. Therefore the two Bardeen potentials actually differ from each other by about 10%
in the time between neutrino decoupling and matter-radiation equality. After the universe becomes
matter-dominated, the neutrinos become unimportant, and Ψ and Φ rapidly approach each other.
The same thing happens to photons after photon decoupling, but the universe is then already
matter-dominated, so the photons do not cause a significant difference between Ψ and Φ.
From the set of equations (9.17)–(9.19) it follows that the metric perturbation
Φ is non-zero only if there is matter. So Φ is generated directly by matter sources,
in particular by the density perturbations. In contrast, the tensor perturbation $h_{ij}$
can be non-zero even if the space is empty: it corresponds to gravity waves.
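
The step from (9.12) and (9.14) to (9.17) and (9.18) can be made explicit. As a sketch: set Ψ = Φ and D = 0, take the diagonal part of (9.14), substitute $\delta p = v^2 \delta\rho$ with $\delta\rho$ taken from (9.12), and use the background equation (9.10) in the form $4\pi G_N \bar\rho = \frac{3}{2} H^2$:

```latex
\begin{aligned}
4\pi G_N\,\delta p
  &= \ddot\Phi + 4H\dot\Phi + (2\dot H + 3H^2)\Phi
   = v^2\left[\frac{1}{a^2}\nabla^2\Phi - 3H\dot\Phi - 3H^2\Phi\right] \\
\Rightarrow\quad
0 &= \ddot\Phi + H(4 + 3v^2)\dot\Phi - \frac{v^2}{a^2}\nabla^2\Phi
     + \left[2\dot H + (3 + 3v^2)H^2\right]\Phi , \\
\delta = \frac{4\pi G_N\,\delta\rho}{4\pi G_N\,\bar\rho}
  &= \frac{2}{3H^2}\left[\frac{1}{a^2}\nabla^2\Phi - 3H\dot\Phi - 3H^2\Phi\right]
   = \frac{2}{3}\frac{1}{(aH)^2}\nabla^2\Phi - \frac{2}{H}\dot\Phi - 2\Phi .
\end{aligned}
```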
The procedure for solving the perturbed equations is the following.

1) Give the matter model, i.e. give w and v 2 .

2) Solve for the evolution of the background and obtain a(t).

3) Solve the perturbation equations.

The order of solving the perturbation equations is that (9.17) gives the evolution
of Φ, and we then find the corresponding density contrast from (9.18) and the
velocity perturbation from (9.19). (We will not be much concerned about the velocity
perturbation.) Note an important difference in (9.18) from the classical Poisson
equation: there are terms of the metric perturbation without any gradients on the
right-hand side. This is a purely general relativistic feature which has very important
consequences, as we will see.

9.4 Fourier transformation


Since the equations are linear, they are easily solved in terms of a Fourier
transformation. We define

$$ \Phi(t, \mathbf{x}) = \frac{1}{(2\pi)^{3/2}} \int d^3k\, \Phi_{\mathbf{k}}(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} , \tag{9.21} $$

and $\delta_{\mathbf{k}}$, $u^i_{\mathbf{k}}$ and $h_{\mathbf{k}ij}$ are defined in the same way. Because the universe is expanding,
the variable k, called the comoving momentum or comoving wavenumber, is not
the physical momentum, which is instead given by k/a. With the scale factor
normalised to unity today, the comoving momentum of a Fourier mode is the physical
momentum it has today.
The flatness of the spatial sections is crucial here. If the spatial sections were
curved, plane waves would not form a complete set of basis functions, and we would
instead have to use more complicated functions. (There would also be an additional
scale present, given by the spatial curvature term $K/a^2$.)
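Different Fourier modes decouple, and the equations for the metric perturbations
reduce to ordinary second order differential equations for each mode. Inserting (9.21)
into (9.17)–(9.20) (we drop the equation for the velocity), we get

$$ 0 = \ddot\Phi_{\mathbf{k}} + H(4 + 3v^2)\dot\Phi_{\mathbf{k}} + v^2 \frac{k^2}{a^2} \Phi_{\mathbf{k}} + \left[ 2\dot H + (3 + 3v^2)H^2 \right] \Phi_{\mathbf{k}} \tag{9.22} $$

$$ \delta_{\mathbf{k}} = -\frac{2}{3} \frac{k^2}{(aH)^2} \Phi_{\mathbf{k}} - 2 \frac{1}{H} \dot\Phi_{\mathbf{k}} - 2\Phi_{\mathbf{k}} \tag{9.23} $$

$$ 0 = \ddot h_{\mathbf{k}ij} + 3H \dot h_{\mathbf{k}ij} + \frac{k^2}{a^2} h_{\mathbf{k}ij} , \tag{9.24} $$

where we have denoted $k \equiv |\mathbf{k}|$.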
The equations (9.22) and (9.23) as well as (9.24) have an interesting property. For
a fluid for which $v^2 = w$, the last term in (9.22) vanishes due to (9.10) and (9.11).
Thus, for long wavelength perturbations, $k \ll aH$, we find that $\Phi_{\mathbf{k}}$ = constant
is a solution of the equations, and (9.23) shows that the density contrast $\delta_{\mathbf{k}}$ is
then also constant in time and equal to $-2\Phi_{\mathbf{k}}$. The gravity waves also have a
constant solution, regardless of the sound speed or the equation of state, as long as
$k \ll aH$. So the relativistic equations allow for the possibility that perturbations
with wavelengths much larger than the Hubble scale are “frozen in” and remain
unaffected by cosmological evolution. Such a feature is not present in Newtonian
gravity.
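
The frozen superhorizon solution can be checked by integrating (9.22) numerically. The sketch below is an illustration under stated assumptions, not the method of these notes: it takes a single fluid with constant w and $v^2 = w$, a power-law background $a = t^p$ with $p = 2/(3(1+w))$ (so that $a = 1$ at $t = 1$), and a hand-written Runge-Kutta integrator. The function names, time range and initial conditions are hypothetical.

```python
def evolve_phi(w, k, t0=1.0, t1=100.0, phi0=1.0, dphi0=0.0, n=20000):
    """Integrate eq. (9.22) with v^2 = w on the background a = t^p.

    Returns (Phi, dPhi/dt, a, H) at time t1.
    """
    p = 2.0 / (3.0 * (1.0 + w))

    def rhs(t, phi, dphi):
        a = t ** p
        hubble = p / t
        hubble_dot = -p / t ** 2
        # The last bracket of (9.22); it vanishes on this background.
        mass_term = 2.0 * hubble_dot + 3.0 * (1.0 + w) * hubble ** 2
        ddphi = -(hubble * (4.0 + 3.0 * w) * dphi
                  + w * k ** 2 / a ** 2 * phi
                  + mass_term * phi)
        return dphi, ddphi

    h = (t1 - t0) / n
    t, phi, dphi = t0, phi0, dphi0
    for _ in range(n):
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(t, phi, dphi)
        k2 = rhs(t + h / 2, phi + h / 2 * k1[0], dphi + h / 2 * k1[1])
        k3 = rhs(t + h / 2, phi + h / 2 * k2[0], dphi + h / 2 * k2[1])
        k4 = rhs(t + h, phi + h * k3[0], dphi + h * k3[1])
        phi += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dphi += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return phi, dphi, t ** p, p / t


def density_contrast(phi, dphi, k, a, hubble):
    """Eq. (9.23): density contrast of the mode in terms of Phi and dPhi/dt."""
    return (-2.0 / 3.0 * k ** 2 / (a * hubble) ** 2 * phi
            - 2.0 / hubble * dphi - 2.0 * phi)


# Radiation domination, mode far outside the Hubble length (k << aH).
phi, dphi, a, hubble = evolve_phi(w=1.0 / 3.0, k=1.0e-3)
print(phi, density_contrast(phi, dphi, 1.0e-3, a, hubble))
```

Starting from a non-zero $\dot\Phi$ instead excites the decaying solution; for matter domination (w = 0) it falls off as $t^{-5/3}$ and the mode relaxes to a constant.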
In the first part of the course we saw that the early universe is radiation-dominated
until $t = t_{\rm eq} \approx 50\,000$ years, after which the universe is matter-dominated
until it becomes (in the ΛCDM model) dominated by the vacuum energy at a few
billion years. In order to know the evolution of the perturbations, all we need to
do is to plug the background evolution we have already calculated into the above
equations and solve, keeping in mind that we have to track at least four different
components (photons, neutrinos, baryons and dark matter) with different behaviour
(i.e. different w and v 2 ).
The equations (9.22) and (9.23) give the time evolution of the Fourier compo-
nents, but the spatial dependence (i.e. dependence on k) is left unconstrained, and
since the equations are linear, all linear combinations of solutions are also solutions.
The spatial dependence is fixed by the initial conditions at early times. Until the
1980s, initial conditions were based on assumptions about simplicity, but today we
have a scenario called inflation in which it is possible to actually calculate how per-
turbations are generated from quantum fluctuations. We will discuss this in the
next chapter, but let us first consider some statistical properties of fluctuations.

9.5 Gaussian perturbations


The simplest models of inflation predict, and observations show, that cosmological perturbations are (in the linear regime) close to Gaussian. Possible deviations from Gaussianity are a topical subject in cosmology at the moment. No deviations in the primordial perturbations have been found, and the non-Gaussian contribution has to be less than $10^{-4}$, according to observations by the Planck satellite. (Non-linear structure formation does destroy the Gaussianity of the initial perturbations on small scales.) Let us discuss a generic Gaussian perturbation $g(\mathbf{x})$, where $g$ could be $\Phi$, $\delta$ or some other linear theory quantity (we suppress the time dependence here):
$$g(\mathbf{x}) = \sum_{\mathbf{k}} g_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{x}} , \tag{9.25}$$

where the set of Fourier coefficients $\{g_{\mathbf{k}}\}$ is a result of a Gaussian random process. We have here used a Fourier series instead of the integral Fourier transformation. Formally, this corresponds to considering some cubic region (“box”) of the universe, in the comoving coordinates, with some comoving volume $L^3$ and assuming periodic boundary conditions. The box is just a physically irrelevant mathematical convenience. In the end we can take the limit $L^3 \to \infty$ and replace the Fourier series with a Fourier integral. (See section 9.6 for the correspondence.) In cosmology, we can only predict the probability distribution from which the perturbations are drawn (since they originate in quantum mechanics), not the particular realisation that corresponds to our universe. This brings some limitations on the comparison between theory and observation, which we will come back to when we discuss the cosmic microwave background.
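Cosmological perturbations are real, so we have $g_{-\mathbf{k}} = g_{\mathbf{k}}^*$. We can write $g_{\mathbf{k}}$ in terms of its real and imaginary part,

$$g_{\mathbf{k}} = \alpha_{\mathbf{k}} + i\beta_{\mathbf{k}} . \tag{9.26}$$

To know a random process means to know the probability distribution $\mathrm{Prob}(g_{\mathbf{k}})$. The expectation value of a quantity which depends on $g_{\mathbf{k}}$ as $f(g_{\mathbf{k}})$ is given by

$$\langle f(g_{\mathbf{k}})\rangle \equiv \int f(g_{\mathbf{k}})\,\mathrm{Prob}(g_{\mathbf{k}})\, d\alpha_{\mathbf{k}}\, d\beta_{\mathbf{k}} , \tag{9.27}$$

where the integral is over the complex plane, i.e.

$$\int_{-\infty}^{\infty} d\alpha_{\mathbf{k}} \int_{-\infty}^{\infty} d\beta_{\mathbf{k}} .$$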

We shall now define what we mean by Gaussian perturbations (or by a Gaussian random process, which is a process which produces such perturbations). We restrict to perturbations with zero mean, which is the relevant situation in cosmology. Such perturbations $g(\mathbf{x})$ satisfy two properties:

1. The probability distribution of an individual Fourier component is Gaussian⁶:

$$\mathrm{Prob}(g_{\mathbf{k}}) = \frac{1}{2\pi s_{\mathbf{k}}^2}\exp\left(-\frac{1}{2}\frac{|g_{\mathbf{k}}|^2}{s_{\mathbf{k}}^2}\right) = \frac{1}{\sqrt{2\pi}\,s_{\mathbf{k}}}\exp\left(-\frac{1}{2}\frac{\alpha_{\mathbf{k}}^2}{s_{\mathbf{k}}^2}\right) \times \frac{1}{\sqrt{2\pi}\,s_{\mathbf{k}}}\exp\left(-\frac{1}{2}\frac{\beta_{\mathbf{k}}^2}{s_{\mathbf{k}}^2}\right) . \tag{9.28}$$

From this distribution we immediately get (exercise) its mean

$$\langle g_{\mathbf{k}}\rangle = 0 \tag{9.29}$$

and variance

$$\langle |g_{\mathbf{k}}|^2\rangle = 2 s_{\mathbf{k}}^2 . \tag{9.30}$$

The distribution has one free parameter for each value of $\mathbf{k}$, the real positive number $s_{\mathbf{k}}$ that gives the width (determines the variance) of the distribution.

⁶ We take the definition of Gaussianity to include zero mean.

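2. The probabilities of different Fourier modes are independent (i.e., they are not correlated),

$$\langle g_{\mathbf{k}}\, g_{\mathbf{k}'}^*\rangle = 0 \quad \text{for } \mathbf{k} \neq \mathbf{k}' . \tag{9.31}$$

Because of the complex conjugate $^*$, this holds also when $\mathbf{k}' = -\mathbf{k}$ (exercise).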
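The two properties above can be illustrated by drawing samples of a single Fourier coefficient. The following is a minimal sketch using only Python's standard `random` module; the function names, the sample size and the value $s_{\mathbf{k}} = 0.5$ are arbitrary choices made for illustration.

```python
import random


def sample_mode(s_k, rng):
    """Draw one g_k = alpha_k + i*beta_k, each part Gaussian with zero mean and width s_k, as in (9.28)."""
    return complex(rng.gauss(0.0, s_k), rng.gauss(0.0, s_k))


def mode_statistics(s_k, n_samples=200_000, seed=1):
    """Sample estimates of <g_k> and <|g_k|^2> for one Fourier mode."""
    rng = random.Random(seed)
    total = 0j
    total_abs2 = 0.0
    for _ in range(n_samples):
        g = sample_mode(s_k, rng)
        total += g
        total_abs2 += abs(g) ** 2
    return total / n_samples, total_abs2 / n_samples


mean, variance = mode_statistics(0.5)
print(abs(mean), variance, 2 * 0.5**2)
```

The sample mean should be close to zero and the sample variance close to $2s_{\mathbf{k}}^2$, in line with (9.29) and (9.30).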
In addition, the distribution is assumed to be statistically homogeneous and isotropic in space. This means that the probability distribution is independent of the direction of the Fourier mode $\mathbf{k}$:

$$s_{\mathbf{k}} = s(k) . \tag{9.32}$$

Like Gaussianity, this is a prediction of typical models of inflation, and seems to be in agreement with the data. (There appear to be some anomalies in the CMB which may point to a small violation of this symmetry, but the issue remains unsettled.)
We can combine (9.30) and (9.31) into a single equation,

$$\langle g_{\mathbf{k}}\, g_{\mathbf{k}'}^*\rangle = 2\delta_{\mathbf{k}\mathbf{k}'}\, s_{\mathbf{k}}^2 = \delta_{\mathbf{k}\mathbf{k}'}\langle |g_{\mathbf{k}}|^2\rangle . \tag{9.33}$$

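Going from Fourier space back to coordinate space, we find

$$\langle g(\mathbf{x})\rangle = \left\langle \sum_{\mathbf{k}} g_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{x}} \right\rangle = \sum_{\mathbf{k}} \langle g_{\mathbf{k}}\rangle\, e^{i\mathbf{k}\cdot\mathbf{x}} = 0 . \tag{9.34}$$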

The expectation value of the perturbation is zero, since it represents a deviation from
the background value, and positive and negative deviations are equally probable.
(In other words, the background quantity gives the mean value.) The square of the
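perturbation can be written as

$$g(\mathbf{x})^2 = \sum_{\mathbf{k}} g_{\mathbf{k}}^*\, e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_{\mathbf{k}'} g_{\mathbf{k}'}\, e^{i\mathbf{k}'\cdot\mathbf{x}} \tag{9.35}$$

since $g(\mathbf{x})$ is real. The typical amplitude of the perturbation is described by the variance, the expectation value of this square,

$$\langle g(\mathbf{x})^2\rangle = \sum_{\mathbf{k}\mathbf{k}'} \langle g_{\mathbf{k}}^*\, g_{\mathbf{k}'}\rangle\, e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{x}} = \sum_{\mathbf{k}} \langle |g_{\mathbf{k}}|^2\rangle = 2\sum_{\mathbf{k}} s_{\mathbf{k}}^2 . \tag{9.36}$$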

9.6 The power spectrum


As noted above, all statistical information about a Gaussian perturbation is encoded in a single function of one variable. In cosmology, this function gives the spatial dependence of the initial conditions for the perturbations, and it is usually discussed in terms of the power spectrum, which is defined as

$$\mathcal{P}_g(k) \equiv \left(\frac{L}{2\pi}\right)^3 4\pi k^3 \langle |g_{\mathbf{k}}|^2\rangle = \frac{L^3}{2\pi^2} k^3 \langle |g_{\mathbf{k}}|^2\rangle . \tag{9.37}$$
We will want to convert the Fourier series back into a Fourier integral. The
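correspondence between the two is (following Liddle & Lyth [3])

$$g(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}} \int g(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x}}\, d^3k , \qquad g(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \int g(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, d^3x . \tag{9.38}$$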

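To take the limit of infinite box size, $L^3 \to \infty$, we replace

$$\left(\frac{2\pi}{L}\right)^3 \sum_{\mathbf{k}} \to \int d^3k , \qquad \left(\frac{L}{2\pi}\right)^3 g_{\mathbf{k}} \to \frac{1}{(2\pi)^{3/2}}\, g(\mathbf{k}) , \qquad \left(\frac{L}{2\pi}\right)^3 \delta_{\mathbf{k}\mathbf{k}'} \to \delta^3(\mathbf{k} - \mathbf{k}') . \tag{9.39}$$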

It is usually easiest to work with the series, and convert to the integral near the end
(to avoid dealing with products of delta functions).
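We find for the variance of $g(\mathbf{x})$,⁷

$$\langle g(\mathbf{x})^2\rangle = \sum_{\mathbf{k}} \langle |g_{\mathbf{k}}|^2\rangle = \left(\frac{2\pi}{L}\right)^3 \sum_{\mathbf{k}} \frac{1}{4\pi k^3}\, \mathcal{P}_g(k) \to \frac{1}{4\pi} \int \frac{d^3k}{k^3}\, \mathcal{P}_g(k) = \int_0^\infty \frac{dk}{k}\, \mathcal{P}_g(k) = \int_{-\infty}^{\infty} \mathcal{P}_g(k)\, d\ln k . \tag{9.40}$$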

Thus the power spectrum of g gives the contribution of a logarithmic scale inter-
val to the variance of g(x). For Gaussian perturbations, the power spectrum gives a
complete statistical description, and all statistical quantities can be calculated from
it.
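As an illustration of (9.40), the variance can be estimated numerically by integrating a given power spectrum over $\ln k$ between two cutoffs. The sketch below uses a simple trapezoidal rule; the function name, the flat test spectrum and the cutoff values are hypothetical choices made only for this example.

```python
import math


def variance_from_spectrum(power, k_min, k_max, n_steps=10_000):
    """Trapezoidal estimate of the integral of power(k) over ln k from k_min to k_max."""
    ln_min, ln_max = math.log(k_min), math.log(k_max)
    h = (ln_max - ln_min) / n_steps
    total = 0.5 * (power(k_min) + power(k_max))
    for i in range(1, n_steps):
        total += power(math.exp(ln_min + i * h))
    return total * h


def flat_spectrum(k):
    """Hypothetical k-independent spectrum with amplitude 1e-9."""
    return 1e-9


variance = variance_from_spectrum(flat_spectrum, 1e-4, 1e2)
print(variance, 1e-9 * math.log(1e2 / 1e-4))
```

For a $k$-independent spectrum every logarithmic interval contributes equally, so the variance grows only logarithmically with the ratio of the cutoffs.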
In practice the integration is not extended all the way from $k = 0$ to $k = \infty$. Rather, there is usually some largest and smallest relevant scale, which introduce natural cutoffs at both ends of the integral. The largest relevant scale could be the size of the observable universe: the perturbation $g(\mathbf{x})$ represents a deviation from the background quantity, but the best estimate we have for the background may be the average taken over the observable universe. Then perturbations at larger scales contribute to our estimate of the background value instead of contributing to the perturbation away from it. However, the appropriate cutoff scale is a matter of some debate, and we will find it necessary to discuss perturbations larger than the Hubble scale. The smallest relevant scale in the present context is the end of the linear regime. However, by including non-linear corrections, it is possible to discuss the power spectrum also in the non-linear regime, though on very small scales the original information has now been erased by non-linear processes. From a fundamental point of view, there is expected to be no information on very small scales anyway, because of the process of free-streaming, which we will discuss later. From a practical point of view, the relevant scale for comparing to observations is limited by the resolution of the observational survey considered. For example, if we consider density perturbations in terms of perturbations in the number density of galaxies, then this is only meaningfully defined on scales larger than the typical separation between galaxies.
An alternative definition for the power spectrum is

$$P_g(k) \equiv L^3 \langle |g_{\mathbf{k}}|^2\rangle . \tag{9.41}$$

Both this and the previous definition are used; in these notes we distinguish them by the different typeface ($\mathcal{P}_g$ for (9.37), $P_g$ for (9.41)). They are related by

$$P_g(k) = \frac{2\pi^2}{k^3}\, \mathcal{P}_g(k) . \tag{9.42}$$

Given the matter content and the initial condition in terms of the power spectrum (both for the scalar and tensor perturbations), the solution in the linear regime is completely determined by (9.20), (9.22) and (9.23). In the next chapter, we discuss how the initial field of Gaussian perturbations is generated by inflation and what are the expected power spectra for scalars and for tensors.

⁷ Note that the result has no $\mathbf{x}$-dependence. Even though the function $g(\mathbf{x})^2$ varies from place to place, its expectation value is the same everywhere.

References
[1] V.F. Mukhanov, H.A. Feldman, R.H. Brandenberger, Theory of cosmologi-
cal perturbations. Part 1. Classical perturbations. Part 2. Quantum theory of
perturbations. Part 3. Extensions, Phys. Rept. 215 (1992) 203-333.

[2] V. Mukhanov, Physical Foundations of Cosmology (Cambridge University Press, 2005).

[3] A.R. Liddle and D.H. Lyth, Cosmological Inflation and Large-Scale Structure (Cambridge University Press, 2000).
10 Inflation: perturbations
10.1 The evolution of perturbations
10.1.1 The equations of motion
We now want to find out how perturbations are generated during inflation and
how they evolve. In chapter 9 we gave the equations of motion for the metric
perturbations, and noted that in order to solve them we need to give the background
equation of state and v 2 = δp/δρ. We have discussed the background evolution
during inflation in chapter 8. Rather than dealing with the perturbation equations in
terms of the energy density and pressure, in the inflationary case it is more convenient
to discuss perturbations in the inflaton field. As with the other quantities, we split
the field into the background and the perturbation,

ϕ(t, x) = ϕ̄(t) + δϕ(t, x) . (10.1)

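In chapter 8, we derived the equation of motion for the scalar field,

$$\frac{1}{\sqrt{-g}}\, \partial_\mu \left(\sqrt{-g}\, g^{\mu\nu} \partial_\nu \varphi\right) - V'(\varphi) = 0 . \tag{10.2}$$

In the spatially flat Friedmann-Robertson-Walker universe, we have

$$\ddot\varphi + 3H\dot\varphi - \frac{1}{a^2}\nabla^2\varphi + V'(\varphi) = 0 , \tag{10.3}$$

which in Minkowski space reduces to

$$\ddot\varphi - \nabla^2\varphi + V'(\varphi) = 0 . \tag{10.4}$$

We now input, instead of the FRW metric, the perturbed metric in the longitudinal gauge from chapter 9. We then get (recall that $g^{\mu\nu}$ is the inverse of the metric tensor)

$$\delta\ddot\varphi + 3H\delta\dot\varphi + \left[-\frac{1}{a^2}\nabla^2 + V''(\bar\varphi)\right]\delta\varphi = -2\Phi V'(\bar\varphi) + \left(\dot\Phi + 3\dot\Psi\right)\dot{\bar\varphi} . \tag{10.5}$$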
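Making a Fourier transformation, we obtain

$$\delta\ddot\varphi_{\mathbf{k}} + 3H\delta\dot\varphi_{\mathbf{k}} + \left[\left(\frac{k}{a}\right)^2 + m^2(\bar\varphi)\right]\delta\varphi_{\mathbf{k}} = -2\Phi_{\mathbf{k}} V'(\bar\varphi) + \left(\dot\Phi_{\mathbf{k}} + 3\dot\Psi_{\mathbf{k}}\right)\dot{\bar\varphi} , \tag{10.6}$$

where we have used $m^2(\bar\varphi) \equiv V''(\bar\varphi)$.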


We could now write the perturbed energy density and pressure of the scalar field,
plug them into the perturbation equations given in chapter 9, and solve them in con-
nection with (10.6). However, there is an easier way. We mentioned in chapter 9
that not all metric perturbations correspond to changes in physics. This is not just a
nuisance, it may also be used for benefit. By making coordinate transformations (or
more precisely gauge transformations!), we can change the form of our equations of
motion to be more easily solved. Dealing with the details of the gauge transformations is beyond the scope of this course, so we just note that it is possible to choose the coordinate system such that metric perturbations make a negligible contribution to the equation of motion of the inflaton perturbations during slow-roll inflation, to first order in the slow-roll parameters¹. The equation (10.6) then reduces to

$$\delta\ddot\varphi_{\mathbf{k}} + 3H\delta\dot\varphi_{\mathbf{k}} + \left[\left(\frac{k}{a}\right)^2 + m^2(\bar\varphi)\right]\delta\varphi_{\mathbf{k}} = 0 . \tag{10.7}$$

This is precisely what we would get if we just inserted (10.1) into the background
equation of motion for the inflaton field and subtracted the background (i.e. ignored
perturbations in the metric).

10.1.2 Solutions
During inflation, $H$ and $m^2$ change slowly. Thus we now make an approximation where we treat them as constants. The general solution of (10.7) is then

$$\delta\varphi_{\mathbf{k}}(t) = a^{-3/2}\left[A_{\mathbf{k}}\, J_{-\nu}\!\left(\frac{k}{aH}\right) + B_{\mathbf{k}}\, J_{\nu}\!\left(\frac{k}{aH}\right)\right] , \tag{10.8}$$

where $J_\nu$ is the Bessel function of order $\nu$, with

$$\nu = \sqrt{\frac{9}{4} - \frac{m^2}{H^2}} . \tag{10.9}$$

The time dependence of the scale factor for constant $H$ is

$$a(t) \propto e^{Ht} . \tag{10.10}$$

If the slow-roll approximation is valid, the inflaton has negligible mass, $m^2 \ll H^2$, since then

$$\frac{m^2}{H^2} = 3M_{\rm Pl}^2\frac{V''}{V} = 3\eta \ll 1 . \tag{10.11}$$

Thus we can drop $m^2/H^2$ in (10.9), so

$$\nu = \frac{3}{2} . \tag{10.12}$$

Bessel functions of half-integer order are, up to a prefactor, the spherical Bessel functions, which can be expressed in terms of trigonometric functions. The solution (10.8) now reduces to

$$\delta\varphi_{\mathbf{k}}(t) = A_{\mathbf{k}}\, w_k(t) + B_{\mathbf{k}}\, w_k^*(t) , \tag{10.13}$$

where the constants $A_{\mathbf{k}}$, $B_{\mathbf{k}}$ have been redefined to absorb some numerical constants, compared to (10.8), and

$$w_k(t) = \left(i + \frac{k}{aH}\right)\exp\left(\frac{ik}{aH}\right) . \tag{10.14}$$
Well before horizon exit, k ≫ aH, the argument of the exponent is large, and the
solution oscillates rapidly. After horizon exit, k ≪ aH, the solution stops oscillating
and approaches the constant value i(Ak − Bk ). (This fits in with our observation in
chapter 9 that the scalar metric perturbation and the density become constant for
k ≪ aH.)
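That the mode function (10.14) indeed solves (10.7) in the approximation used here can be verified directly. The short check below is a sketch under the stated assumptions ($m^2 = 0$ and constant $H$); the shorthand $x \equiv k/(aH)$ is introduced only for this calculation. For constant $H$ we have $\dot a = aH$, so $\dot x = -Hx$.

```latex
\begin{align*}
w_k &= (i + x)\,e^{ix} , \qquad x \equiv \frac{k}{aH} , \qquad \dot{x} = -Hx , \\
\frac{dw_k}{dx} &= e^{ix} + i(i + x)\,e^{ix} = ix\,e^{ix} , \\
\dot{w}_k &= \dot{x}\,\frac{dw_k}{dx} = -iHx^2 e^{ix} , \\
\ddot{w}_k &= -iH\,\dot{x}\,\frac{d}{dx}\left(x^2 e^{ix}\right) = H^2\left(2ix^2 - x^3\right)e^{ix} , \\
\ddot{w}_k + 3H\dot{w}_k + \frac{k^2}{a^2}\,w_k
  &= H^2 e^{ix}\left[\left(2ix^2 - x^3\right) - 3ix^2 + x^2(i + x)\right] = 0 .
\end{align*}
```

In the last line we used $k^2/a^2 = H^2x^2$. Since the equation is real and linear, $w_k^*$ is a solution as well, which gives the two independent solutions in (10.13).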
¹ One such gauge is the spatially flat gauge, where the scalar perturbations are chosen such that constant time slices have Euclidean geometry. There are still perturbations in the spacetime curvature, which show up in the $g_{0i}$ components of the metric.

10.1.3 The comoving curvature perturbation


We have now the solution for the field perturbation – or, more precisely, a field
perturbation, that is to say, the field perturbation in a particular gauge. (The
field perturbation is not a gauge-invariant quantity.) How is this field perturbation
related to quantities in the longitudinal gauge we have used earlier? The price to
pay for simplifying the equations of motion by judicious choice of gauge is that we
have to deal with quantities in different gauges. A clean way to solve the problem is
to use quantities which are gauge-invariant, that is to say, the same in every gauge.
A central such quantity is the comoving curvature perturbation R. We won’t go
into the definition of this quantity: for us it is sufficient to know that its value is the
same in all gauges. So if we calculate R in terms of δϕ in the gauge above, we can
use the resulting value of R in any other gauge. The gauge invariant quantity is a
“bridge” from one gauge to another, if you will.
In the gauge we used above, the comoving curvature perturbation is

$$\mathcal{R}_{\mathbf{k}} = -H\frac{\delta\varphi_{\mathbf{k}}}{\dot{\bar\varphi}} . \tag{10.15}$$

So, we should calculate the inflaton field perturbation some time after horizon exit, when it has settled to a constant value, and then calculate $\mathcal{R}$ with (10.15). This is then a quantity which is gauge-independent and conserved outside the horizon, and we can calculate things like the density contrast $\delta$ from it (we will discuss this in the next chapter).
The pieces that we are missing are the constants of integration in (10.13), i.e.
the initial conditions for the perturbation.

10.2 The generation of perturbations


It may sound somewhat odd to discuss the generation of perturbations. This implies that we consider the state of a system which is homogeneous and isotropic at some initial time, but where the behaviour is nevertheless different at different positions at a later time. This may seem impossible, because then we would have to have a rule that would say where the perturbations are going to be, which would distinguish one position from another. Therefore it would seem that perturbations have to be given as initial conditions, and cannot be calculated from first principles. In a deterministic theory, this is true. However, quantum mechanics offers a way out of this impasse. Quantum theory is indeterministic: there is no rule that will tell what the outcome of a quantum process will be, and only probabilities of various outcomes (i.e. statistical distributions) are calculable. To discuss quantum behaviour of the inflaton field, we need to use quantum field theory in an inflating FRW universe. To warm up we first consider quantum field theory of a scalar field in Minkowski space.

10.2.1 Vacuum fluctuations in Minkowski space


The field equation for a massive free (i.e. $V(\varphi) = \frac{1}{2}m^2\varphi^2$) real scalar field in Minkowski space is

$$\ddot\varphi - \nabla^2\varphi + m^2\varphi = 0 , \tag{10.16}$$

or

$$\ddot\varphi_{\mathbf{k}} + E_k^2\,\varphi_{\mathbf{k}} = 0 , \tag{10.17}$$

where $E_k^2 = k^2 + m^2$, for Fourier components. We recognise (10.17) as the equation for a harmonic oscillator. Thus each Fourier component of the field behaves as an independent harmonic oscillator.
In the quantum mechanical treatment of the harmonic oscillator one introduces
the creation and annihilation operators, which raise and lower the energy state of
the system. We can do the same here.
Now we have a different pair of creation and annihilation operators $\hat a^\dagger_{\mathbf{k}}$, $\hat a_{\mathbf{k}}$ for each Fourier mode $\mathbf{k}$. We denote the ground state of the system by $|0\rangle$, and call it the vacuum. As discussed earlier, particles are quanta of the oscillations of the field. The vacuum is a state with no particles. Operating on the vacuum with the creation operator $\hat a^\dagger_{\mathbf{k}}$, we add one quantum with momentum $\mathbf{k}$ and energy $E_k$ to the system, i.e., we create one particle. We denote this state with one particle, whose momentum is $\mathbf{k}$, by $|1_{\mathbf{k}}\rangle$. Thus

$$\hat a^\dagger_{\mathbf{k}} |0\rangle = |1_{\mathbf{k}}\rangle , \tag{10.18}$$

and the state is normalised as $\langle 1_{\mathbf{k}} | 1_{\mathbf{k}'}\rangle = \delta_{\mathbf{k}\mathbf{k}'}$. This particle has a well-defined momentum $\mathbf{k}$, and therefore it is completely unlocalised, as dictated by the Heisenberg uncertainty principle. The annihilation operator acting on the vacuum gives zero, i.e. not the vacuum state but the zero element of Hilbert space (the space of all quantum states),

$$\hat a_{\mathbf{k}} |0\rangle = 0 . \tag{10.19}$$

We denote the hermitian conjugate of the vacuum state by $\langle 0|$. Thus

$$\langle 0|\hat a_{\mathbf{k}} = \langle 1_{\mathbf{k}}| \quad \text{and} \quad \langle 0|\hat a^\dagger_{\mathbf{k}} = 0 . \tag{10.20}$$

The commutation relations of the creation and annihilation operators are

$$[\hat a^\dagger_{\mathbf{k}}, \hat a^\dagger_{\mathbf{k}'}] = [\hat a_{\mathbf{k}}, \hat a_{\mathbf{k}'}] = 0 , \qquad [\hat a_{\mathbf{k}}, \hat a^\dagger_{\mathbf{k}'}] = \delta_{\mathbf{k}\mathbf{k}'} . \tag{10.21}$$

When going from classical physics to quantum physics, classical observables are replaced by operators. We can then calculate expectation values for these observables using the operators. Here the classical observable

$$\varphi(t, \mathbf{x}) = \sum_{\mathbf{k}} \varphi_{\mathbf{k}}(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} \tag{10.22}$$

is replaced by the field operator

$$\hat\varphi(t, \mathbf{x}) = \sum_{\mathbf{k}} \hat\varphi_{\mathbf{k}}(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} \tag{10.23}$$

where²

$$\hat\varphi_{\mathbf{k}}(t) = w_k(t)\,\hat a_{\mathbf{k}} + w_k^*(t)\,\hat a^\dagger_{-\mathbf{k}} \tag{10.24}$$

and

$$w_k(t) = L^{-3/2}\frac{1}{\sqrt{2E_k}}\, e^{-iE_k t} \tag{10.25}$$

is the mode function, a solution of the field equation (10.17). (The normalisation has been fixed to get the right commutation relations, (10.27).) We are using the Heisenberg picture, i.e. we have time-dependent operators and the quantum states are time-independent. Note that since the operator $\hat\varphi(t, \mathbf{x})$ is Hermitian (corresponding to a real field), $\hat\varphi(t, \mathbf{x})^\dagger = \hat\varphi(t, \mathbf{x})$, the corresponding Fourier components satisfy $\hat\varphi_{\mathbf{k}}(t)^\dagger = \hat\varphi_{-\mathbf{k}}(t)$. So the Fourier component operators are not Hermitian.

² We skip the detailed derivation of the field operator, which belongs to a course of quantum field theory. See e.g. Peskin & Schroeder, section 2.3 (note different normalisations of operators and states, related to doing Fourier integrals rather than sums, and considerations of Lorentz invariance).
In quantum mechanics, we have two conjugate variables, position and momentum. In quantum field theory, we have the field and the corresponding canonical momentum, which is in this case just given by the time derivative of the field. Combining (10.24) and (10.25), we have

$$\dot{\hat\varphi}_{\mathbf{k}}(t) = -iE_k\left[w_k(t)\,\hat a_{\mathbf{k}} - w_k^*(t)\,\hat a^\dagger_{-\mathbf{k}}\right] . \tag{10.26}$$

We can now calculate the commutator between the field operator and the corresponding velocity operator. A straightforward calculation with the rules (10.21) gives

$$[\hat\varphi_{\mathbf{k}}(t), \dot{\hat\varphi}_{\mathbf{k}'}(t)] = iL^{-3}\,\delta_{\mathbf{k},-\mathbf{k}'} . \tag{10.27}$$

(Exercise: Show that demanding the canonical commutation relation (10.27) fixes the normalisation to be the one given in (10.25).) Recall that the Lagrange density of a scalar field is (in Minkowski space)

$$\hat{\mathcal{L}} = -\frac{1}{2}\eta^{\mu\nu}\partial_\mu\hat\varphi\,\partial_\nu\hat\varphi - V(\hat\varphi) , \tag{10.28}$$

where $\eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ as always. The corresponding Hamiltonian density is

$$\hat{\mathcal{H}} = -\frac{1}{2}\eta^{\mu\nu}\partial_\mu\hat\varphi\,\partial_\nu\hat\varphi + V(\hat\varphi) , \tag{10.29}$$

and the Hamiltonian is the spatial integral of the Hamiltonian density,

$$\hat H = \int d^3x\, \hat{\mathcal{H}} . \tag{10.30}$$
(Note that the Lagrange density corresponds to the pressure of the scalar field, and the Hamiltonian density corresponds to the energy density.) Since the Hamiltonian depends on the field velocity operator, it does not commute with the field operator,

$$[\hat H, \hat\varphi] \neq 0 . \tag{10.31}$$

As a result, the Hamiltonian and the field operator do not share a complete set of eigenstates. So, in general an eigenstate of the Hamiltonian is not an eigenstate of the field operator. Eigenstates of the Hamiltonian operator are the energy eigenstates, and the state with the smallest energy is called the vacuum state. Since the vacuum is not an eigenstate of the field operator, the field does not have a well-defined value in the vacuum; instead we have only a distribution of values. In other words, the scalar field has vacuum fluctuations. It can be shown that these fluctuations are Gaussian (we skip the proof). This means that they are completely characterised by the power spectrum, as discussed in chapter 9.
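It is straightforward to calculate the power spectrum, defined as

$$\mathcal{P}_\varphi(k) = L^3\frac{k^3}{2\pi^2}\langle |\varphi_{\mathbf{k}}|^2\rangle . \tag{10.32}$$

Recall that the power spectrum is related to the variance of the field as (note that $\langle\hat\varphi\rangle = 0$)

$$\langle \hat\varphi(\mathbf{x})^2\rangle = \int_0^\infty \mathcal{P}_\varphi(k)\,\frac{dk}{k} . \tag{10.33}$$

For the vacuum state $|0\rangle$, the expectation value of $|\varphi_{\mathbf{k}}|^2$ is

$$\begin{aligned}
\langle 0|\hat\varphi_{\mathbf{k}}\hat\varphi_{\mathbf{k}}^\dagger|0\rangle &= |w_k|^2\langle 0|\hat a_{\mathbf{k}}\hat a^\dagger_{\mathbf{k}}|0\rangle + w_k^2\langle 0|\hat a_{\mathbf{k}}\hat a_{-\mathbf{k}}|0\rangle + (w_k^*)^2\langle 0|\hat a^\dagger_{-\mathbf{k}}\hat a^\dagger_{\mathbf{k}}|0\rangle + |w_k|^2\langle 0|\hat a^\dagger_{-\mathbf{k}}\hat a_{-\mathbf{k}}|0\rangle \\
&= |w_k|^2\langle 1_{\mathbf{k}}|1_{\mathbf{k}}\rangle = |w_k|^2
\end{aligned} \tag{10.34}$$

since all but the first term give 0, and the states are normalised so that $\langle 1_{\mathbf{k}}|1_{\mathbf{k}'}\rangle = \delta_{\mathbf{k}\mathbf{k}'}$. Therefore the power spectrum is

$$\mathcal{P}_\varphi(k) = L^3\frac{k^3}{2\pi^2}|w_k|^2 . \tag{10.35}$$

From (10.25) we have $|w_k|^2 = 1/(2L^3E_k)$, so as the final result we get

$$\mathcal{P}_\varphi(k) = \frac{k^3}{4\pi^2E_k} . \tag{10.36}$$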
In the case of inflation, the mode functions are different because space is expand-
ing, but the reasoning is the same.

10.2.2 Vacuum fluctuations during inflation


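During inflation the field equation for inflaton perturbations is, from (10.6),

$$\delta\ddot\varphi_{\mathbf{k}} + 3H\delta\dot\varphi_{\mathbf{k}} + \left[\left(\frac{k}{a}\right)^2 + m^2(\bar\varphi)\right]\delta\varphi_{\mathbf{k}} = 0 . \tag{10.37}$$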

In inflation, the background field is treated classically, and only the perturbations
around the mean value of the field are quantised. In fact, if we were to do the
calculation in a gauge-independent manner, we would see that the variables which
are quantised are a linear combination of the scalar field perturbations and metric
perturbations. Thus in inflation, part of the spacetime metric is quantised. Inflation
may thus be called the first quantum gravity scenario which has been confronted with
observations – with great success. However, just like the background scalar field,
the background metric is not quantised. How to quantise the metric in general, and
not just small perturbations, remains one of the most studied and most difficult
questions in physics. In this course, we just treat the field perturbation during
inflation the same way that we treated the field in Minkowski space. That is, the
Fourier modes of the field perturbation are written as

$$\delta\hat\varphi_{\mathbf{k}}(t) = w_k(t)\,\hat a_{\mathbf{k}} + w_k^*(t)\,\hat a^\dagger_{-\mathbf{k}} , \tag{10.38}$$

where the mode function $w_k(t)$ satisfies the classical equation of motion (10.37), with the normalisation fixed by the canonical commutation relation,

$$[\delta\hat\varphi_{\mathbf{k}}(t), \delta\dot{\hat\varphi}_{\mathbf{k}'}(t)] = i(aL)^{-3}\,\delta_{\mathbf{k},-\mathbf{k}'} , \tag{10.39}$$

where the only difference from the Minkowski space commutator (10.27) is the presence of $a^{-3}$ on the right-hand side.

Taking the solution of (10.7) given in section 10.1.2, under the approximations $H = \mathrm{const.}$ and $m^2/H^2 = 3\eta \approx 0$, and fixing the normalisation with (10.39), we get the solution

$$w_k(t) = L^{-3/2}\frac{H}{\sqrt{2k^3}}\left(i + \frac{k}{aH}\right)\exp\left(\frac{ik}{aH}\right) , \tag{10.40}$$

where the time-dependence is $a(t) \propto e^{Ht}$.
When the scale k is well inside the horizon, k ≫ aH, δϕk (t) oscillates rapidly
compared to the Hubble time H −1 . If we consider distance and time scales much
smaller than the Hubble scale, spacetime curvature does not matter and things
should behave like in Minkowski space. Considering (10.40) in this limit, one finds
(exercise) that wk (t) indeed becomes (up to a slowly varying phase) equal to the
Minkowski space mode function (10.25), with the lengths scaled by a. (The prefactor
in (10.40) was chosen so that the normalisations would agree.) Therefore the mode
function wk (t) of (10.40) tells us how the perturbation behaves as it approaches and
exits the horizon.
The power spectrum of inflaton fluctuations is, as in Minkowski space,

$$\mathcal{P}_\varphi(k) = L^3\frac{k^3}{2\pi^2}|w_k|^2 . \tag{10.41}$$

Well before horizon exit, $k \gg aH$, and on timescales $\ll H^{-1}$, the field operator $\delta\hat\varphi_{\mathbf{k}}(t)$ agrees with the Minkowski space field operator and we have the same kind of vacuum fluctuations in $\delta\varphi$ as in Minkowski space. However, the time evolution of the perturbations is different. Well after horizon exit, $k \ll aH$, the mode function approaches a constant

$$w_k(t) \to L^{-3/2}\frac{iH}{\sqrt{2k^3}} , \tag{10.42}$$

so the vacuum fluctuations “freeze” and the power spectrum acquires the constant value

$$\mathcal{P}_\varphi(k) = L^3\frac{k^3}{2\pi^2}|w_k|^2 = \left(\frac{H}{2\pi}\right)^2 . \tag{10.43}$$
We have calculated the power spectrum of the inflaton field perturbations by
using the quantum mechanical expectation value of the square of the field perturba-
tion. We now identify this with the expectation value of a probability distribution of
a classical variable, i.e. we assume that the quantum mechanical fluctuations become
classical. Some part of this process is understood (it can be shown that the quantum mechanical expectation values become equal to those of a classical stochastic distribution, or “squeezed”), but how classical reality emerges from a quantum system remains an unsolved problem. In particle physics appeal is often made to the Copenhagen interpretation, according to which states become classical when they are measured, but for cosmology this is inadequate. We just assume that we can replace an expectation value of a quantum state with the ensemble average of a classical distribution.
For our purposes, quantum mechanics generates the initial perturbations and
solves the problem of how perturbations can emerge from a state which is homoge-
neous and isotropic. As a remnant of the indeterministic origin of the perturbations,
we cannot predict the specific member of the ensemble which is realised in the uni-
verse, we can only calculate the statistical distribution of perturbations. As noted,
this distribution is Gaussian, so all Fourier modes δϕk acquire their values as in-
dependent random variables (except for the reality condition δϕ−k = δϕ∗k ) with a
Gaussian probability distribution.
The result (10.43) was obtained treating H as a constant. However, H does
change, albeit slowly, during inflation. The main purpose of our discussion was to
follow the inflaton perturbations through the horizon exit. After the perturbation
is well outside the horizon, we switch to other variables, namely the curvature per-
turbation Rk which remains constant outside the horizon even though H changes,
unlike δϕk (we see from (10.37) that δϕk is not constant in general). To take into
account evolution we use for each scale k the value of H which is representative for
the evolution of that particular scale through the horizon. That is, we choose the
value of $H$ at horizon exit³, so that $aH = k$. Thus the power spectrum is

$$\mathcal{P}_\varphi(k) = L^3\frac{k^3}{2\pi^2}|w_k|^2 = \left(\frac{H}{2\pi}\right)^2_{aH=k} , \tag{10.44}$$

where the subscript notation signifies that the value of H for each k is to be taken
at horizon exit of that particular scale.
Since we have only one quantity which has fluctuations, the inflaton field, and the
perturbations are treated in linear theory, the perturbations of any other quantity
are related to the inflaton field fluctuation by linear and local equations. In other
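words, any perturbation quantity $g_{\mathbf{k}}$ depends only on the field perturbation $\delta\varphi_{\mathbf{k}}$ with the same wavenumber, $g_{\mathbf{k}}(t) = f_{g\varphi}(t, k)\,\delta\varphi_{\mathbf{k}}(t_k)$. Thus the statistics of the inflaton perturbations $\delta\varphi(\mathbf{x})$ are inherited by all other perturbations, and we have $\mathcal{P}_g(k) = f_{g\varphi}(t, k)^2\,\mathcal{P}_\varphi(k)$. The function $f_{g\varphi}(t, k)$ (like the power spectrum $\mathcal{P}_\varphi(k)$) can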
only depend on the magnitude k, not on the direction of k, because the background
is homogeneous and isotropic. So the distribution of the perturbations inherits
the property of homogeneity and isotropy from the symmetry of the background
on which they are created and evolve: perturbations generated by inflation are
statistically homogeneous and isotropic.
In particular, for the comoving curvature perturbation we have, from (10.15),

$$\mathcal{R}_{\mathbf{k}} = -H\frac{\delta\varphi_{\mathbf{k}}}{\dot{\bar\varphi}} , \tag{10.45}$$

so we obtain

$$\mathcal{P}_{\mathcal{R}}(k) = \left(\frac{H}{\dot{\bar\varphi}}\right)^2\mathcal{P}_\varphi(k) = \left[\left(\frac{H}{\dot{\bar\varphi}}\right)\left(\frac{H}{2\pi}\right)\right]^2_{aH=k} . \tag{10.46}$$

This is the main result for quantum fluctuations during inflation. The problem has now been completely reduced to the evolution of the background scalar field and the background Hubble parameter. We just need to specify the inflaton potential and calculate how the background evolves, and plug it in (10.46) to get complete information about the perturbations. That, in turn, is the starting point for calculating structure formation and the CMB anisotropy. Turning this around, observations of large-scale structure and the CMB can be used to obtain information about quantum processes in the primordial universe. Note that the power spectrum depends only on $k$. Statistical homogeneity and isotropy of the perturbations, inherited from the symmetry of the background, is a strong feature of inflation. (I use the word ’feature’ rather than ’prediction’, because it is possible to construct models where, for example, space expands anisotropically during inflation. However, that requires untypical assumptions, such as having a short period of inflation, so that the anisotropy is not washed away, or inflation driven by something else than a scalar field.)

³ One can do a more precise calculation, where one takes into account the evolution of $H(t)$. The result is that one gets a correction to the amplitude of $\mathcal{P}_{\mathcal{R}}(k)$, which is first order in slow-roll parameters, and a correction to its spectral index $n$ which is second order in the slow-roll parameters. Note that $H$ is assumed to be constant only for each $k$ mode during the time it crosses the horizon. The equations of motion of the different modes are independent, so in principle $H$ could be very different for modes that exit at very different times without violating our assumptions.

10.3 The primordial spectrum in slow-roll inflation


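So, inflation generates primordial perturbations $\mathcal{R}_{\mathbf{k}}$ with the power spectrum

$$\mathcal{P}_{\mathcal{R}}(k) = \left[\left(\frac{H}{\dot\varphi}\right)\left(\frac{H}{2\pi}\right)\right]^2_{aH=k} . \tag{10.47}$$

(In this section, we drop the overbar from the background values.) We have expressed the dynamics of slow-roll inflation in terms of the two slow-roll variables, so let us see what the power spectrum looks like in terms of them. Applying the slow-roll equations

$$H^2 = \frac{V}{3M_{\rm Pl}^2} \quad \text{and} \quad 3H\dot\varphi = -V' ,$$

(10.47) becomes

$$\mathcal{P}_{\mathcal{R}}(k) = \frac{1}{12\pi^2}\frac{1}{M_{\rm Pl}^6}\frac{V^3}{V'^2} = \frac{1}{24\pi^2}\frac{1}{M_{\rm Pl}^4}\frac{V}{\varepsilon} , \tag{10.48}$$

where $\varepsilon$ is the slow-roll parameter.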
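The intermediate steps between (10.47) and (10.48) can be written out as follows. This is a sketch of the algebra, assuming the definition $\varepsilon = \frac{1}{2}M_{\rm Pl}^2 (V'/V)^2$ of the slow-roll parameter from chapter 8.

```latex
\begin{align*}
\mathcal{P}_{\mathcal{R}}(k) &= \frac{H^4}{4\pi^2\dot\varphi^2}
  = \frac{9H^6}{4\pi^2 V'^2}
  && \text{using } \dot\varphi = -\frac{V'}{3H} \\
 &= \frac{9}{4\pi^2 V'^2}\left(\frac{V}{3M_{\rm Pl}^2}\right)^3
  = \frac{1}{12\pi^2}\frac{1}{M_{\rm Pl}^6}\frac{V^3}{V'^2}
  && \text{using } H^2 = \frac{V}{3M_{\rm Pl}^2} \\
 &= \frac{1}{24\pi^2}\frac{1}{M_{\rm Pl}^4}\frac{V}{\varepsilon}
  && \text{using } \varepsilon = \frac{M_{\rm Pl}^2}{2}\left(\frac{V'}{V}\right)^2 .
\end{align*}
```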
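According to observations of CMB and large-scale structure, the amplitude of the primordial power spectrum is

$$\mathcal{P}_{\mathcal{R}}(k)^{1/2} \approx 5 \times 10^{-5} \tag{10.49}$$

on cosmological scales. This gives a constraint on inflation

$$\left(\frac{V}{\varepsilon}\right)^{1/4} \approx 24^{1/4}\sqrt{\pi}\,\sqrt{5 \times 10^{-5}}\, M_{\rm Pl} \approx 0.028\, M_{\rm Pl} = 6.8 \times 10^{16}\ \mathrm{GeV} . \tag{10.50}$$

Since $\varepsilon \ll 1$, this implies an upper limit on the energy scale of inflation,

$$V^{1/4} < 0.028\, M_{\rm Pl} . \tag{10.51}$$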

This puts a limit on the Hubble scale during inflation. From $H^2 = V/(3M_{\rm Pl}^2)$, the constraint $V^{1/4} < 6.8 \times 10^{16}$ GeV translates into $H < 10^{15}$ GeV, or in terms of length, $H^{-1} > 10^{-31}$ m.
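The numbers in (10.50) and in the bound on $H$ can be reproduced with a few lines of arithmetic. The sketch below assumes the reduced Planck mass $M_{\rm Pl} \approx 2.435 \times 10^{18}$ GeV and $\hbar c \approx 1.973 \times 10^{-16}$ GeV m for the unit conversion; these values and the function names are assumptions made for the example.

```python
import math

M_PL_GEV = 2.435e18       # reduced Planck mass in GeV (assumed value)
HBAR_C_GEV_M = 1.973e-16  # hbar*c in GeV*m (assumed value), converts 1/GeV to metres


def energy_scale_ratio(amplitude=5e-5):
    """(V/eps)^(1/4) in units of M_Pl from (10.48), given P_R^(1/2) = amplitude."""
    return (24.0 * math.pi**2 * amplitude**2) ** 0.25


def hubble_rate_gev(v_quarter_gev):
    """H = sqrt(V / (3 M_Pl^2)) in GeV for V^(1/4) = v_quarter_gev."""
    return v_quarter_gev**2 / (math.sqrt(3.0) * M_PL_GEV)


ratio = energy_scale_ratio()
v_quarter_max = ratio * M_PL_GEV          # upper limit on V^(1/4) in GeV
h_max = hubble_rate_gev(v_quarter_max)    # upper limit on H in GeV
hubble_length_min = HBAR_C_GEV_M / h_max  # lower limit on 1/H in metres
print(ratio, v_quarter_max, h_max, hubble_length_min)
```

The output agrees with the rounded values quoted in the text to the stated precision.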
Since during slow-roll inflation V and V′ change slowly while a wide range of
scales k exit the horizon, we expect $\mathcal{P}_{\mathcal{R}}(k)$ to be a slowly varying function of k. We
describe this small variation with the spectral index n of the primordial spectrum,
defined as$^4$
$$ n(k) - 1 \equiv \frac{d \ln \mathcal{P}_{\mathcal{R}}}{d \ln k} . \tag{10.52} $$
If the spectral index is independent of k, we say that the spectrum is scale-free. In
this case the primordial spectrum is a power law
$$ \mathcal{P}_{\mathcal{R}}(k) = A^2 \left( \frac{k}{k_p} \right)^{n-1} , \tag{10.53} $$
where the “pivot scale” $k_p$ is some chosen reference scale and A is the amplitude at
this pivot scale.
If the power spectrum is constant,
$$ \mathcal{P}_{\mathcal{R}}(k) = \text{const.} , \tag{10.54} $$
corresponding to n = 1, we say that the spectrum is scale-invariant (which is a
special case of a scale-free spectrum). A scale-invariant spectrum is also called the
Harrison–Zel’dovich spectrum.
If n ≠ 1, the spectrum is called tilted. A tilted spectrum is called red if n < 1
(more power on large scales) and blue if n > 1 (more power on small scales). If
dn/dk ≠ 0, it is said that there is a running spectral index.
Using (10.48) and (10.52), we can calculate the spectral index for slow-roll inflation.
Since $\mathcal{P}_{\mathcal{R}}(k)$ is evaluated from (10.48) when k = aH, we have
$$ \frac{d \ln k}{dt} = \frac{d \ln(aH)}{dt} = \frac{\dot{a}}{a} + \frac{\dot{H}}{H} = (1 - \varepsilon) H , $$
where we used the fact that in the slow-roll approximation $\dot{H} = -\varepsilon H^2$ in the last
step. Thus

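$$ \frac{d}{d \ln k} = \frac{1}{1-\varepsilon} \frac{1}{H} \frac{d}{dt} = \frac{1}{1-\varepsilon} \frac{\dot{\varphi}}{H} \frac{d}{d\varphi} = -\frac{M_{\rm Pl}^2}{1-\varepsilon} \frac{V'}{V} \frac{d}{d\varphi} \approx -M_{\rm Pl}^2 \frac{V'}{V} \frac{d}{d\varphi} . \tag{10.55} $$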
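Let us first calculate the scale dependence of the slow-roll parameters:
$$ \frac{d\varepsilon}{d \ln k} = -M_{\rm Pl}^2 \frac{V'}{V} \frac{d}{d\varphi} \left[ \frac{M_{\rm Pl}^2}{2} \left( \frac{V'}{V} \right)^2 \right] = M_{\rm Pl}^4 \left[ \left( \frac{V'}{V} \right)^4 - \left( \frac{V'}{V} \right)^2 \frac{V''}{V} \right] = 4\varepsilon^2 - 2\varepsilon\eta \tag{10.56} $$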
and, in a similar manner (exercise),
$$ \frac{d\eta}{d \ln k} = \ldots = 2\varepsilon\eta - \xi , \tag{10.57} $$
where we have defined a third slow-roll parameter
$$ \xi \equiv M_{\rm Pl}^4 \frac{V'}{V^2} V''' . \tag{10.58} $$
$^4$ The −1 is in the definition for historical reasons, related to other ways of defining the power
spectrum of perturbations.

The parameter ξ is typically second-order small in the sense that $\sqrt{|\xi|}$ is of the same
order of magnitude as ε and η. (Therefore it is sometimes written as $\xi^2$, although
this can be misleading, as it does not have to be positive.)
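We can now calculate the spectral index:
$$ \begin{aligned} n - 1 &= \frac{1}{\mathcal{P}_{\mathcal{R}}} \frac{d\mathcal{P}_{\mathcal{R}}}{d \ln k} = \frac{\varepsilon}{V} \frac{d}{d \ln k} \left( \frac{V}{\varepsilon} \right) = \frac{1}{V} \frac{dV}{d \ln k} - \frac{1}{\varepsilon} \frac{d\varepsilon}{d \ln k} \\ &= -M_{\rm Pl}^2 \frac{V'}{V} \cdot \frac{1}{V} \frac{dV}{d\varphi} - 4\varepsilon + 2\eta = -6\varepsilon + 2\eta . \end{aligned} \tag{10.59} $$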

Slow-roll requires ε ≪ 1 and |η| ≪ 1, so the spectrum is predicted to be close to
scale invariant. This agrees well with observations.
Assuming that at late times the universe is described by the ΛCDM model,
the latest constraint on the spectral index, using data from the Planck satellite
(and some data from the WMAP satellite), is [1]
$$ n = 0.9603 \pm 0.0073 . \tag{10.60} $$

This value is model-dependent, and with a different cosmological model (different
dark energy, added isocurvature perturbations (to which we come in the next
chapter), added topological defects and so on), the preferred value of the spectral
index can change, but in all but the most exotic models it remains close to
scale-invariant.
From the results of the running of ε and η, we obtain the running of the spectral
index:
$$ \frac{dn}{d \ln k} = 16\varepsilon\eta - 24\varepsilon^2 - 2\xi . \tag{10.61} $$
The running is second order in slow-roll parameters, so it’s expected to be even
smaller than the deviation from scale invariance. The observational range is (using
data from Planck, CMB and some ground-based CMB experiments) [1]

$$ \frac{dn}{d \ln k} = -0.015 \pm 0.017 . \tag{10.62} $$
Some inflation models have |n − 1| and |dn/d ln k| larger than this, while others do
not. These observations have ruled out some inflation models, while a zoo of dozens
and dozens of viable models remains [2]. Note how, as in the case of dark matter,
things work out automatically. In order to have negative pressure, a scalar field has
to roll slowly. Once the background evolution is slowly rolling, the perturbations are
close to scale-invariant, without needing to add new ingredients or tune anything.
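As a cross-check of (10.56), (10.57) and (10.61), the following sketch applies the leading-order operator from (10.55) numerically. It assumes the illustrative potential V = φ⁴ in units where M_Pl = 1 (this choice, the field value and the helper names are not from the notes) and drops the factor 1/(1 − ε), exactly as in the last step of (10.55).

```python
import math

# Illustrative potential V = phi^4 in units M_Pl = 1 (an assumption for this check only).
def potential_derivs(phi):
    """Return V, V', V'', V''' for V = phi^4."""
    return phi**4, 4.0 * phi**3, 12.0 * phi**2, 24.0 * phi

def slow_roll(phi):
    """Slow-roll parameters (epsilon, eta, xi) with M_Pl = 1; xi as in (10.58)."""
    v, v1, v2, v3 = potential_derivs(phi)
    eps = 0.5 * (v1 / v) ** 2
    eta = v2 / v
    xi = v1 * v3 / v**2
    return eps, eta, xi

def d_dlnk(f, phi, step=1e-4):
    """Leading-order (10.55): d/dlnk ~ -(V'/V) d/dphi, via a central difference."""
    v, v1, _, _ = potential_derivs(phi)
    df_dphi = (f(phi + step) - f(phi - step)) / (2.0 * step)
    return -(v1 / v) * df_dphi

phi = 20.0  # hypothetical field value well inside the slow-roll regime
eps, eta, xi = slow_roll(phi)

deps_numeric = d_dlnk(lambda p: slow_roll(p)[0], phi)
deta_numeric = d_dlnk(lambda p: slow_roll(p)[1], phi)
running_numeric = d_dlnk(lambda p: -6.0 * slow_roll(p)[0] + 2.0 * slow_roll(p)[1], phi)

deps_formula = 4.0 * eps**2 - 2.0 * eps * eta                  # (10.56)
deta_formula = 2.0 * eps * eta - xi                            # (10.57)
running_formula = 16.0 * eps * eta - 24.0 * eps**2 - 2.0 * xi  # (10.61)

print(deps_numeric, deps_formula)
print(deta_numeric, deta_formula)
print(running_numeric, running_formula)
```

Each printed pair should agree to within the finite-difference error; any other smooth potential can be substituted in `potential_derivs` to repeat the check.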
CMB experiments have measured the CMB anisotropy over a range ∆ ln k ≈ 8.
On scales smaller than this, the CMB anisotropy is expected to be negligible (see
chapter 12 for the reason why!), so there’s nothing more to find. However, it is
possible to probe these smaller scales by observations of large-scale structure. Recall
that for high energy-scale inflation, the number of e-folds until the end of inflation
when the largest observable modes are generated is about 60, so we are only seeing
a small part of inflation.
The above results do not yet allow an independent determination of the two
slow-roll parameters ε and η. However, it turns out that the spectral index of tensor
perturbations produced by inflation is independent of η (it is −2ε). So if tensor
perturbations are detected (from their signature on the CMB) and their spectrum
is measured, we can get both ε and η. The amplitude of the tensor perturbations
also depends directly on the Hubble parameter during inflation, so it will provide a
measurement of the energy scale of inflation. Typically, large-field inflation models
produce tensor perturbations with much larger amplitude than small-field inflation
models. In the small-field case they may be too small to be detectable in the near
future. It is possible to calculate the spectrum of gravity waves the same way as
we did for the scalar perturbations (the calculation is in fact simpler in the sense
that the gravity waves do not couple to matter, so we don’t have to worry about
the scalar field perturbations and gauges).
Example: Consider the simple inflation model
$$ V(\varphi) = \frac{1}{2} m^2 \varphi^2 . \tag{10.63} $$
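In chapter 8 we already calculated the slow-roll parameters for this model:
$$ \varepsilon = \eta = 2 \frac{M_{\rm Pl}^2}{\varphi^2} , \tag{10.64} $$
and we immediately see that ξ = 0. Thus
$$ \begin{aligned} n &= 1 - 6\varepsilon + 2\eta = 1 - 8 \left( \frac{M_{\rm Pl}}{\varphi} \right)^2 \\ \frac{dn}{d \ln k} &= 16\varepsilon\eta - 24\varepsilon^2 - 2\xi = -32 \left( \frac{M_{\rm Pl}}{\varphi} \right)^4 . \end{aligned} \tag{10.65} $$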
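To get the numbers out, we need the values of φ when the relevant cosmological
scales left the horizon. We know that the number of inflation e-foldings after that
should be about N ≈ 50…60. We have
$$ N(\varphi) = \frac{1}{M_{\rm Pl}^2} \int_{\varphi_{\rm end}}^{\varphi} \frac{V}{V'}\, d\varphi = \frac{1}{M_{\rm Pl}^2} \int_{\varphi_{\rm end}}^{\varphi} \frac{\varphi}{2}\, d\varphi = \frac{1}{4M_{\rm Pl}^2} \left( \varphi^2 - \varphi_{\rm end}^2 \right) , \tag{10.66} $$
and we estimate $\varphi_{\rm end}$ from $\varepsilon(\varphi_{\rm end}) = 2M_{\rm Pl}^2/\varphi_{\rm end}^2 = 1 \Rightarrow \varphi_{\rm end} = \sqrt{2} M_{\rm Pl}$ to get
$$ \varphi^2 = \varphi_{\rm end}^2 + 4M_{\rm Pl}^2 N = 2M_{\rm Pl}^2 + 4M_{\rm Pl}^2 N \approx 4M_{\rm Pl}^2 N . \tag{10.67} $$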

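Thus
$$ \left( \frac{M_{\rm Pl}}{\varphi} \right)^2 = \frac{1}{4N} \tag{10.68} $$
and
$$ \begin{aligned} n &= 1 - \frac{2}{N} \approx 0.96 \\ \frac{dn}{d \ln k} &= -\frac{2}{N^2} \approx -0.0008 . \end{aligned} \tag{10.69} $$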
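The energy scale of inflation is determined from (10.48) and (10.49). Putting in
(10.68), we get
$$ m \approx \frac{9}{N} \times 10^{14}\ \text{GeV} \approx 2 \times 10^{13}\ \text{GeV} \approx 8 \times 10^{-6} M_{\rm Pl} , \tag{10.70} $$
for N = 50. We get $V^{1/4} \approx 2 \times 10^{16}$ GeV as the energy scale for the period when
the perturbations seen in the CMB were generated. Potential energy at the end of
inflation is
$$ V_{\rm end}^{1/4} = \left( \frac{1}{2} m^2 \varphi_{\rm end}^2 \right)^{1/4} = \sqrt{\frac{m}{M_{\rm Pl}}}\, M_{\rm Pl} \approx 3 \times 10^{-3} M_{\rm Pl} \approx 7 \times 10^{15}\ \text{GeV} . \tag{10.71} $$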

Because of the high energy scale, the amplitude of tensor perturbations, as quantified
by the tensor-to-scalar ratio r, is significant, r ≈ 0.1. As there is no sign of tensor
perturbations in the Planck data, this simple model is slightly disfavoured by the
data. There was an announcement in March 2014 by the BICEP2 telescope team
that inflationary gravity waves had been detected, but this turned out to be
premature.
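The numbers quoted in this example follow from (10.48), (10.49) and (10.64)–(10.71) by direct substitution. The sketch below is one way to reproduce them; the reduced Planck mass of 2.436 × 10^18 GeV is an assumed input that is not quoted in the text, and the function and key names are illustrative.

```python
import math

M_PL_GEV = 2.436e18   # reduced Planck mass in GeV (assumed value)
AMPLITUDE = 5e-5      # P_R^(1/2) from (10.49)

def quadratic_inflation(n_efolds):
    """Observables of V = m^2 phi^2 / 2, using phi^2 ~ 4 M_Pl^2 N from (10.67)."""
    eps = eta = 2.0 / (4.0 * n_efolds)            # (10.64) with (10.68)
    n_s = 1.0 - 6.0 * eps + 2.0 * eta             # (10.65), equals 1 - 2/N
    running = 16.0 * eps * eta - 24.0 * eps**2    # (10.61) with xi = 0
    # (10.48) with V = 2 m^2 M_Pl^2 N and eps = 1/(2N) gives
    # P_R^(1/2) = m N / (sqrt(6) pi M_Pl), solved here for m.
    m_over_mpl = math.sqrt(6.0) * math.pi * AMPLITUDE / n_efolds
    v_quarter_over_mpl = (2.0 * m_over_mpl**2 * n_efolds) ** 0.25
    v_end_quarter_over_mpl = math.sqrt(m_over_mpl)   # (10.71)
    return {
        "n_s": n_s,
        "running": running,
        "m_gev": m_over_mpl * M_PL_GEV,
        "v_quarter_gev": v_quarter_over_mpl * M_PL_GEV,
        "v_end_quarter_gev": v_end_quarter_over_mpl * M_PL_GEV,
    }

result = quadratic_inflation(50)
for name, value in result.items():
    print(f"{name}: {value:.3g}")
```

Calling `quadratic_inflation(60)` instead shows how mildly the predictions depend on the assumed number of e-foldings.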

References
[1] P.A.R. Ade et al. [Planck Collaboration], Astron. Astrophys. (2014)
[arXiv:1303.5076 [astro-ph.CO]]

[2] J. Martin, C. Ringeval and V. Vennin, Phys. Dark Univ. (2014)
[arXiv:1303.3787 [astro-ph.CO]]

11 Perturbations after inflation
11.1 Evolution on superhorizon scales
We have calculated the primordial power spectrum of scalar field fluctuations and
noted how it is related to the gauge invariant quantity R, the comoving curvature
perturbation, which is conserved on superhorizon scales$^1$, k ≪ aH. Now we want to
know how R is related to Φ, the metric perturbation in the longitudinal gauge, and
to the density contrast δ. We want to know how the perturbations evolve after the
end of inflation, i.e. how to go from primordial perturbations to the perturbations
seen today.
It can be shown that R is related to Φ as follows:
$$ \mathcal{R} = -\frac{5 + 3w}{3 + 3w} \Phi - \frac{2}{3 + 3w} H^{-1} \dot{\Phi} ; \tag{11.1} $$
recall that w ≡ p̄/ρ̄. Given R, we can read (11.1) as a differential equation from
which to solve Φ. During any period when w = constant, the solution is
$$ \Phi_k = -\frac{3 + 3w}{5 + 3w} \mathcal{R}_k + \text{a decaying part} . \tag{11.2} $$
Thus, after w has been constant for some time, the Bardeen potential has settled to
the constant value
$$ \Phi_k = -\frac{3 + 3w}{5 + 3w} \mathcal{R}_k . \tag{11.3} $$
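In particular, we have
$$ \begin{aligned} \Phi_k &= -\frac{2}{3} \mathcal{R}_k && (\text{rad.dom., } w = \tfrac{1}{3}) \\ \Phi_k &= -\frac{3}{5} \mathcal{R}_k && (\text{mat.dom., } w = 0) . \end{aligned} \tag{11.4} $$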
Recall from chapter 9 that the density contrast is given by
$$ \delta_k = -\frac{2}{3} \frac{k^2}{(aH)^2} \Phi_k - \frac{2}{H} \dot{\Phi}_k - 2\Phi_k . \tag{11.5} $$
So for superhorizon modes with k ≪ aH and a constant equation of state we
have (after we can neglect the decaying mode)
$$ \delta_k = -2\Phi_k = \frac{6 + 6w}{5 + 3w} \mathcal{R}_k . \tag{11.6} $$
We should now find out how the perturbations evolve when they enter the hori-
zon, and how the situation changes as we pass from radiation domination to matter
domination to being dominated by vacuum energy. (Exercise. According to (11.6),
we would get δk = 0 for w = −1. Explain this in physical terms.)
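The superhorizon relations (11.3), (11.4) and (11.6) are simple enough to tabulate exactly. The sketch below does so with exact rational arithmetic; the function names are illustrative and not part of the notes.

```python
from fractions import Fraction

def phi_over_curvature(w):
    """Phi_k / R_k after the decaying part has died out, from (11.3)."""
    w = Fraction(w)
    return -(3 + 3 * w) / (5 + 3 * w)

def delta_over_curvature(w):
    """delta_k / R_k on superhorizon scales, from (11.6)."""
    return -2 * phi_over_curvature(w)

cases = [("radiation", Fraction(1, 3)), ("matter", Fraction(0)), ("vacuum", Fraction(-1))]
for label, w in cases:
    print(label, phi_over_curvature(w), delta_over_curvature(w))
```

The radiation and matter rows reproduce the coefficients in (11.4); the vacuum row gives the value referred to in the exercise above.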
$^1$ More precisely, R is conserved when the perturbations are adiabatic. We will come back to this
shortly.


11.2 Horizon entry


When the expansion of the universe decelerates, i.e. after inflation but before the
recent period of accelerated expansion, scales are entering the horizon$^2$. Short scales
enter first, large scales enter later. The history of different scales after horizon
entry, and thus their present perturbation amplitude, depends on the epoch during
which they enter the horizon. Even if the primordial perturbations are scale-free,
the perturbations seen today are not scale-free, because different scales have been
processed differently. The wavelengths of the modes which enter during transitions
between epochs are the special scales which characterise the present structure of the
universe. Such important scales are the scale
$$ k_{\rm eq}^{-1} = (a_{\rm eq} H_{\rm eq})^{-1} \approx 13.7\, \omega_m^{-1}\ \text{Mpc} , \tag{11.7} $$
which enters at the time $t_{\rm eq}$ of matter-radiation equality, and the scale
$$ k_{\rm dec}^{-1} = (a_{\rm dec} H_{\rm dec})^{-1} \approx 90\, \omega_m^{-1/2}\ \text{Mpc} , \tag{11.8} $$
which enters at the time $t_{\rm dec}$ ≈ 380 000 yr of photon decoupling. A conservative
observational range is $\omega_m$ = 0.12…0.16. This gives $k_{\rm eq}^{-1}$ = 86…110 Mpc and
$k_{\rm dec}^{-1}$ = 220…260 Mpc, with mean values $k_{\rm eq}^{-1}$ ≈ 100 Mpc and $k_{\rm dec}^{-1}$ ≈ 240 Mpc. The
smallest “cosmological” scale is that corresponding to a typical distance between
galaxies, about 1 Mpc.$^3$ This scale entered during the radiation-dominated epoch,
well after Big Bang nucleosynthesis.
The scale corresponding to the present “horizon” (i.e. Hubble length) is
$$ k_0^{-1} = (a_0 H_0)^{-1} \approx 3000\, h^{-1}\ \text{Mpc} \approx 4000 \ldots 5000\ \text{Mpc} \tag{11.9} $$
for values h = 0.6…0.8, and the commonly accepted value h = 0.7 gives $(a_0 H_0)^{-1}$ =
4300 Mpc. If the universe is accelerating at the moment$^4$, this scale is actually exiting
now, and there are scales, somewhat larger than this, that have briefly entered and
then exited again in the recent past. The largest observable scales, of the order of
$k_0^{-1}$, are essentially at their primordial amplitude now.
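The three characteristic scales (11.7)–(11.9) are simple scaling formulas, so the quoted ranges can be explored directly. The sketch below evaluates them for the mid-range values $\omega_m$ = 0.14 and h = 0.7; the function names are illustrative.

```python
def k_eq_inverse_mpc(omega_m):
    """Scale entering at matter-radiation equality, (11.7), in Mpc."""
    return 13.7 / omega_m

def k_dec_inverse_mpc(omega_m):
    """Scale entering at photon decoupling, (11.8), in Mpc."""
    return 90.0 / omega_m ** 0.5

def k_today_inverse_mpc(h):
    """Present Hubble length, (11.9), in Mpc."""
    return 3000.0 / h

omega_m = 0.14   # middle of the range 0.12...0.16 quoted above
h = 0.7

print(f"k_eq^-1  = {k_eq_inverse_mpc(omega_m):.0f} Mpc")
print(f"k_dec^-1 = {k_dec_inverse_mpc(omega_m):.0f} Mpc")
print(f"k_0^-1   = {k_today_inverse_mpc(h):.0f} Mpc")
```

Evaluating the same functions at the ends of the quoted ranges gives the spread of values stated in the text, up to rounding.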

11.3 Composition of the real universe


In the ΛCDM model, the energy density of the universe has five relevant components:

1. cold dark matter (CDM)

2. baryonic matter

3. photons
4. neutrinos

5. vacuum energy.

$^2$ Recall that what we call the horizon here is just the Hubble radius, not the particle horizon.
$^3$ In the present universe, structure at smaller scales has undergone a non-linear process of galaxy
formation, and it bears little relation to the primordial perturbations. However, observations of
the high-redshift universe, especially so-called Lyman-α observations (absorption spectra of high-z
quasars, which reveal distant gas clouds along the line of sight), can reveal these structures when
they are closer to their primordial state. With such observations, the “cosmological” range of scales
can be extended down to ∼ 0.1 Mpc.
$^4$ This is the case in the ΛCDM model, but there are also models where the acceleration has
transitioned back into deceleration. Either possibility is allowed by observations.

The existence of baryons, photons and neutrinos is beyond reasonable doubt; the
existence of dark matter is considered established by most cosmologists (however,
warm dark matter remains a plausible alternative to cold dark matter); and the
existence and nature of dark energy are still a subject of debate. As in the first part of
the course, we will stick with the ΛCDM model and only consider vacuum energy.
We have
$$ \rho = \underbrace{\rho_c + \rho_b}_{\rho_m} + \underbrace{\rho_\gamma + \rho_\nu}_{\rho_r} + \rho_\Lambda , \tag{11.10} $$

where we have grouped CDM (denoted with c) and baryons together as matter, and
photons and neutrinos as radiation. As we have discussed, neutrinos are actually
non-relativistic today and so constitute matter. However, for simplicity we will
neglect neutrino masses, as we have done before. (Because the contribution of the
neutrinos to the total energy density, or the energy density of matter, is small when
they become non-relativistic, this approximation is not too bad.)
Until the decoupling of photons and matter at t = tdec , baryons and photons are
tightly coupled, so for t < tdec it is useful to treat them as a single component,

ρbγ ≡ ρb + ργ . (11.11)

We treat the other components as non-interacting (except via gravity). The
description of matter as an ideal fluid (i.e. one with a unique density and velocity
at every point in space) applies to components whose particle mean free paths are
smaller than the scales of interest, or for which only very low momentum modes are
occupied (as is the case for CDM), so that the momentum distribution is irrelevant.
After decoupling, photons free-stream, i.e. move almost without scattering, and
cannot be discussed as an ideal fluid. On the other hand, the density contrast in
the photon component does not grow after decoupling, so we can neglect the effect
of photon perturbations compared to perturbations in the matter after decoupling$^5$.
We make the same approximation for the neutrinos, treating them as an ideal fluid
of radiation. If the dark energy is vacuum energy, it is perfectly smooth, with
no perturbations. (In other dark energy models, perturbations of dark energy are
typically not important on small scales, but they may have an effect on large scales.)

11.4 Multifluid matter


Let us now discuss the general case when the matter consists of several components,
which individually can be treated as ideal fluids and which interact with each other
only gravitationally. This means that each component sees only its own pressure$^6$,
and the components can have different flow velocities.

$^5$ The CMB perturbations carry important information, and will be the focus of our attention
in the next section. However, their influence on the evolution of the total density perturbation is
small.
$^6$ In standard cosmology, we actually have just one component, the baryon-photon fluid (or, after
decoupling, the baryons), which sees its own pressure; the other components do not see even their
own pressure. Neutrinos after decoupling interact only weakly, and cold dark matter does not even
have significant pressure.

We can introduce the density, pressure, and velocity perturbations for each
component separately,

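$$ \rho_i(t, \mathbf{x}) = \bar{\rho}_i(t) + \delta\rho_i(t, \mathbf{x}) \tag{11.12} $$
$$ p_i(t, \mathbf{x}) = \bar{p}_i(t) + \delta p_i(t, \mathbf{x}) \tag{11.13} $$
$$ u_i^\alpha(t, \mathbf{x}) = \delta^{\alpha 0} + \delta u_i^\alpha(t, \mathbf{x}) , \tag{11.14} $$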

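and the total quantities are
$$ \bar{\rho} = \sum_i \bar{\rho}_i , \qquad \bar{p} = \sum_i \bar{p}_i \tag{11.15} $$
$$ \delta\rho = \sum_i \delta\rho_i , \qquad \delta p = \sum_i \delta p_i . \tag{11.16} $$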

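The individual density contrasts are
$$ \delta_i \equiv \frac{\delta\rho_i}{\bar{\rho}_i} , \tag{11.17} $$
and the total density contrast is
$$ \delta = \frac{\delta\rho}{\bar{\rho}} = \frac{\sum_i \delta\rho_i}{\sum_j \bar{\rho}_j} . \tag{11.18} $$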

Note that the total density contrast is not just the sum of the individual density
contrasts. Instead, the density contrasts are weighted by the mean densities,
$$ \delta = \sum_i \frac{\bar{\rho}_i}{\bar{\rho}} \delta_i . \tag{11.19} $$
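To make the weighting in (11.19) concrete, the sketch below computes the total density contrast for a made-up four-component mixture. The densities and contrasts are hypothetical numbers chosen only for illustration (arbitrary units), not values from the notes.

```python
def total_density_contrast(rho_bar, delta):
    """Total contrast (11.19): mean-density-weighted sum of the component contrasts."""
    rho_total = sum(rho_bar.values())
    return sum(rho_bar[name] / rho_total * delta[name] for name in rho_bar)

# Hypothetical background densities (arbitrary units) and density contrasts.
rho_bar = {"cdm": 5.0, "baryons": 1.0, "photons": 3.0, "neutrinos": 2.0}
delta = {"cdm": 3e-5, "baryons": 3e-5, "photons": 4e-5, "neutrinos": 4e-5}

delta_total = total_density_contrast(rho_bar, delta)
naive_sum = sum(delta.values())

print(f"weighted total: {delta_total:.3e}")
print(f"naive sum:      {naive_sum:.3e}")
```

The weighted total always lies between the smallest and largest component contrast, whereas the naive sum grows with the number of components.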

11.5 Adiabatic and isocurvature perturbations


Before going to the evolution of the different components, let us discuss perturbations
in the multifluid case. Suppose that the equation of state is barotropic

p = p(ρ) , (11.20)

i.e. the pressure is uniquely determined by the energy density. Then the perturbations
δp and δρ are necessarily related by the derivative dp/dρ of the function
p(ρ),
$$ p = \bar{p} + \delta p = \bar{p}(\bar{\rho}) + \frac{dp}{d\rho}(\bar{\rho})\, \delta\rho \quad \Rightarrow \quad \delta p = \frac{dp}{d\rho}\, \delta\rho . $$
The time derivatives of the background quantities p̄ and ρ̄ are related by the same
derivative,
$$ \dot{\bar{p}} = \frac{d\bar{p}}{dt} = \frac{dp}{d\rho}(\bar{\rho}) \frac{d\bar{\rho}}{dt} = \frac{dp}{d\rho} \dot{\bar{\rho}} . $$
Assuming the derivative dp/dρ is non-negative, its square root is the speed of sound
$$ c_s \equiv \sqrt{\frac{dp}{d\rho}} . \tag{11.21} $$
We thus have, for a barotropic equation of state, the relation
$$ v^2 \equiv \frac{\delta p}{\delta\rho} = \frac{\dot{\bar{p}}}{\dot{\bar{\rho}}} = c_s^2 . $$
In general, p may depend on other variables besides ρ. The sound speed is then
given by
$$ c_s^2 = \left( \frac{\partial p}{\partial \rho} \right)_S \tag{11.22} $$
where the subscript S indicates that the derivative is taken so that the entropy of the
fluid element is kept constant. Since the background universe expands adiabatically
(meaning that there is no entropy production), we have
$$ \frac{\dot{\bar{p}}}{\dot{\bar{\rho}}} = \left( \frac{\partial p}{\partial \rho} \right)_S = c_s^2 . \tag{11.23} $$

Perturbations with the property
$$ \frac{\delta p}{\delta\rho} = \frac{\dot{\bar{p}}}{\dot{\bar{\rho}}} \tag{11.24} $$
are called adiabatic perturbations. If p = p(ρ), perturbations are necessarily adiabatic.
In the general case, perturbations may or may not be adiabatic. If they are
not, the perturbations can be divided into an adiabatic component and isocurvature
perturbations. An adiabatic perturbation corresponds to a change in the total energy
density, whereas isocurvature perturbations correspond to perturbations between
the different components. For adiabatic perturbations we thus have
$$ \delta p = c_s^2\, \delta\rho = \frac{\dot{\bar{p}}}{\dot{\bar{\rho}}}\, \delta\rho . \tag{11.25} $$
Adiabatic perturbations are the simplest kind of perturbations. Single-field in-
flation produces adiabatic perturbations, since perturbations in all quantities are
proportional to a perturbation δϕ in a single scalar quantity, the inflaton field.
Adiabatic perturbations have the property that the local state of matter (deter-
mined here by the quantities p and ρ) at some spacetime point (t, x) of the perturbed
universe is the same as in the background universe at some slightly different time
t + δt, this time difference being different for different locations x. We can thus view
adiabatic perturbations as some parts of the universe being “ahead” and others
“behind” in the evolution, as visualised in figure 1.
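For the different components we have
$$ \left. \begin{aligned} \delta\rho_i(\mathbf{x}) &= \dot{\bar{\rho}}_i\, \delta t(\mathbf{x}) \\ \delta p_i(\mathbf{x}) &= \dot{\bar{p}}_i\, \delta t(\mathbf{x}) \end{aligned} \right\} \quad \Rightarrow \quad \left\{ \begin{aligned} \frac{\delta p_i}{\delta\rho_i} &= \frac{\dot{\bar{p}}_i}{\dot{\bar{\rho}}_i} \\ \frac{\delta\rho_i}{\delta\rho_j} &= \frac{\dot{\bar{\rho}}_i}{\dot{\bar{\rho}}_j} \end{aligned} \right. \tag{11.26} $$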

If there is no energy transfer between the fluid components at the background level,
the energy continuity equation is satisfied by each one separately,
$$ \dot{\bar{\rho}}_i = -3H(\bar{\rho}_i + \bar{p}_i) = -3H(1 + w_i)\bar{\rho}_i . \tag{11.27} $$

Figure 1: For adiabatic perturbations, the conditions in the perturbed universe
(right) at $(t_1, \mathbf{x})$ equal conditions in the (homogeneous) background universe (left)
at some time $t_1 + \delta t(\mathbf{x})$.

Thus for adiabatic perturbations we have
$$ \frac{\delta_i}{1 + w_i} = \frac{\delta_j}{1 + w_j} . \tag{11.28} $$

For matter components $w_i = 0$, and for radiation components $w_i = \frac{1}{3}$. Thus, for
adiabatic perturbations, all matter components have the same perturbation
$$ \delta_i = \delta_m \tag{11.29} $$
and we likewise have for all radiation perturbations
$$ \delta_i = \delta_r = \frac{4}{3} \delta_m . \tag{11.30} $$
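The isocurvature perturbation between two components is defined as
$$ S_{ij} \equiv -3H \left( \frac{\delta\rho_i}{\dot{\bar{\rho}}_i} - \frac{\delta\rho_j}{\dot{\bar{\rho}}_j} \right) = \frac{\delta_i}{1 + w_i} - \frac{\delta_j}{1 + w_j} , \tag{11.31} $$
and it describes deviation from the adiabatic case.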
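The adiabatic condition (11.28) and the isocurvature measure (11.31) can be illustrated with exact fractions. In the sketch below the matter contrast of 3 × 10^−5 and the offset radiation contrast of 5 × 10^−5 are hypothetical inputs, and the helper names are illustrative.

```python
from fractions import Fraction

W_MATTER = Fraction(0)
W_RADIATION = Fraction(1, 3)

def adiabatic_contrast(delta_matter, w):
    """Contrast of a component with equation of state w, from (11.28) with w_matter = 0."""
    return (1 + w) * delta_matter

def isocurvature(delta_i, w_i, delta_j, w_j):
    """S_ij from the right-hand side of (11.31)."""
    return delta_i / (1 + w_i) - delta_j / (1 + w_j)

delta_m = Fraction(3, 100000)                          # hypothetical matter contrast
delta_r = adiabatic_contrast(delta_m, W_RADIATION)     # adiabatic partner, (11.30)

s_adiabatic = isocurvature(delta_m, W_MATTER, delta_r, W_RADIATION)
s_offset = isocurvature(delta_m, W_MATTER, Fraction(5, 100000), W_RADIATION)

print("delta_r =", delta_r)
print("S (adiabatic case) =", s_adiabatic)
print("S (radiation contrast raised to 5e-5) =", s_offset)
```

For the adiabatic pair the isocurvature perturbation vanishes identically; any mismatch in the ratio of contrasts shows up as a non-zero S.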


Adiabatic perturbations remain adiabatic while they are outside the horizon and
are frozen, but isocurvature perturbations may develop when the perturbations enter
the horizon and start evolving, because different components can evolve differently.
However, there can also be primordial isocurvature perturbations. Present observa-
tional data is consistent with the primordial perturbations being purely adiabatic,
and any isocurvature contribution is constrained to be at most of the order of 10%.

11.5.1 Multifluid evolution


The background evolution is given by
$$ 3H^2 = 8\pi G_N \bar{\rho} - 3\frac{K}{a^2} \tag{11.32} $$
$$ 3\frac{\ddot{a}}{a} = -4\pi G_N (\bar{\rho} + 3\bar{p}) \tag{11.33} $$
$$ 0 = \dot{\bar{\rho}}_i + 3H(\bar{\rho}_i + \bar{p}_i) . \tag{11.34} $$

If we had energy transfer between components, the left-hand side of (11.34) would
be non-zero for the individual components (but still zero for the total energy density
and pressure).
Just like the background expansion is sourced by the total energy density and
pressure, the metric perturbations are sourced by the perturbations in the total
energy density and pressure, so we have, from chapter 9,
$$ 0 = \ddot{\Phi}_k + H(4 + 3v^2)\dot{\Phi}_k + v^2 \frac{k^2}{a^2} \Phi_k + [2\dot{H} + (3 + 3v^2)H^2]\Phi_k \tag{11.35} $$
$$ \delta_k = -\frac{2}{3} \frac{k^2}{(aH)^2} \Phi_k - \frac{2}{H} \dot{\Phi}_k - 2\Phi_k , \tag{11.36} $$
where $v^2 \equiv \delta p/\delta\rho$. For adiabatic perturbations, we have $v^2 = c_s^2$.

11.6 The radiation-dominated era


After reheating (or, more accurately, preheating, as the matter does not need to be
thermal) the universe is dominated by radiation. As late as BBN, matter contributes
only a fraction of about $10^{-6}$ to the total energy density. So let us first see how the
density perturbations evolve in the radiation-dominated universe. In this era, we
have to good accuracy for the background energy density $\rho \approx \rho_r \propto a^{-4}$, and spatial
curvature and vacuum energy are negligible. We therefore get from (11.32) $a \propto t^{1/2}$,
H = 1/(2t). We assume that the perturbations are adiabatic, so we have from
(11.30) $\delta_m = \frac{3}{4}\delta_r$. Since $\bar{\rho}_r \gg \bar{\rho}_m$, we therefore have $\delta\rho_r \gg \delta\rho_m$, so
$\delta p/\delta\rho = \delta p_r/\delta\rho \approx \delta p_r/\delta\rho_r = 1/3$ and $\delta \approx \delta_r$ (see (11.18)) to good accuracy.
Hence $v^2 = c_s^2 = \frac{1}{3}$.
The general solution of (11.35) and (11.36) is then
$$ \Phi_k(t) = [y \cos y - \sin y]\, a^{-3} A_{1k} + [y \sin y + \cos y]\, a^{-3} A_{2k} \tag{11.37} $$
$$ \begin{aligned} \delta_k(t) ={}& 4\left[ (y^2 - 1) \sin y + y \left( 1 - \tfrac{1}{2} y^2 \right) \cos y \right] a^{-3} A_{1k} \\ &+ 4\left[ (1 - y^2) \cos y + y \left( 1 - \tfrac{1}{2} y^2 \right) \sin y \right] a^{-3} A_{2k} , \end{aligned} \tag{11.38} $$
where the behaviour has been conveniently expressed in terms of the variable
$y \equiv k/(\sqrt{3}\, aH) \propto a \propto t^{1/2}$. There are two limiting regimes, perturbations much larger
than the horizon (y ≪ 1) and perturbations deep inside the horizon (y ≫ 1).
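For y ≪ 1, the mode proportional to $A_{2k}$ in $\Phi_k$ decays as $a^{-3}$, while the amplitude
of the $A_{1k}$ mode stays constant, and likewise for $\delta_k$. So, the non-decaying
mode behaviour in the long-wavelength limit is
$$ \begin{aligned} \Phi_k(t) &= -\frac{1}{9\sqrt{3}} \left( \frac{k}{H_0} \right)^3 A_{1k} = \text{constant} \\ \delta_k(t) &= \frac{2}{9\sqrt{3}} \left( \frac{k}{H_0} \right)^3 A_{1k} = \text{constant} . \end{aligned} \tag{11.39} $$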
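On sub-horizon scales, y ≫ 1, we have (again dropping the decaying mode)
$$ \begin{aligned} \Phi_k(t) &= \frac{1}{\sqrt{3}} \left( \frac{k}{H_0} \right) a^{-2} \cos y\, A_{1k} \propto a^{-2} \cos y \\ \delta_k(t) &= -\frac{2}{3\sqrt{3}} \left( \frac{k}{H_0} \right)^3 \cos y\, A_{1k} \propto \cos y . \end{aligned} \tag{11.40} $$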

So the gravitational potential decays, while the density perturbation oscillates around
a constant amplitude.
Though the physical wavelength of the mode is growing ∝ a, the visual horizon
is growing faster, $H^{-1} \propto a^2$. (Viewed in terms of the comoving wavelength, it
stays constant, while $aH \propto a^{-1}$ drops.) For superhorizon modes, the decaying
mode becomes negligible, while the non-decaying mode remains constant. Once the
wavelength of the mode becomes smaller than the horizon, the density contrast starts
to oscillate, and the gravitational potential decays. In both cases, the perturbations
remain small.
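The two limits (11.39) and (11.40) can be checked numerically from the $A_{1k}$ terms of (11.37) and (11.38). The sketch below uses the fact that y ∝ a in this era and, as a normalisation assumed here for convenience, sets a = y, so that each bracket is simply divided by y³; the function names are illustrative.

```python
import math

def phi_mode(y):
    """A_1k bracket of (11.37) times a^-3, with the normalisation a = y."""
    return (y * math.cos(y) - math.sin(y)) / y**3

def delta_mode(y):
    """A_1k bracket of (11.38), including the factor 4, times a^-3 with a = y."""
    bracket = (y * y - 1.0) * math.sin(y) + y * (1.0 - 0.5 * y * y) * math.cos(y)
    return 4.0 * bracket / y**3

y_small = 0.01              # far outside the horizon
y_large = 200.0 * math.pi   # deep inside the horizon, chosen so that cos(y) = 1

print("superhorizon:", phi_mode(y_small), delta_mode(y_small))
print("subhorizon:  ", phi_mode(y_large) * y_large**2, delta_mode(y_large))
```

In the first line the two numbers approach −1/3 and 2/3, so that δ = −2Φ as in (11.39). In the second line the potential has been multiplied by y² ∝ a² to exhibit the $a^{-2}$ decay of its envelope, while the density contrast oscillates with constant amplitude as in (11.40).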
What about perturbations in the matter? Baryons are tightly coupled to radi-
ation until z ≈ 1100, so they have the same perturbations as the radiation fluid.
(We will later come back to what happens when baryons and photons decouple;
that occurs in the matter dominated era.) However, dark matter decouples from
the thermal bath earlier than the baryons, since it interacts weakly. We assume
here that dark matter is cold, so its pressure is negligible. After the decoupling of
dark matter, its energy-momentum tensor is individually conserved. Since the dark
matter contributes negligibly to the background and to the gravitational potential,
we can take (11.37) as a given and see how the dark matter perturbation evolves
in this gravitational potential. The derivation for the dark matter density contrast
is not complicated, but it requires a bit more general relativity than we have on
this course, so we just give the result. For a general FRW background and metric
perturbation Φ, we have

$$ \ddot{\delta}_{ck} + 2H\dot{\delta}_{ck} = 3\ddot{\Phi}_k + 6H\dot{\Phi}_k - \frac{k^2}{a^2} \Phi_k . \tag{11.41} $$
It is clear that the solution for superhorizon modes k ≪ aH is $\delta_{ck}$ = constant,
given that $\Phi_k$ = constant. In the opposite limit k ≫ aH we get, by inputting
$a \propto t^{1/2}$ and (11.40), the solution
$$ \delta_{ck} = \tilde{A}_{1k} + \tilde{A}_{2k} \ln y , \tag{11.42} $$
where the coefficients $\tilde{A}_{1k}$ and $\tilde{A}_{2k}$ are expressible in terms of $A_{1k}$ and $A_{2k}$.
(Exercise. Calculate $\tilde{A}_{1k}$ and $\tilde{A}_{2k}$ in terms of $A_{1k}$ and $A_{2k}$.) (Recall that if we assume
adiabatic initial conditions, we have $\delta_m = \frac{3}{4}\delta_r$, with $\delta_r \approx \delta$.) So, in contrast to baryons, the
density contrast of cold dark matter grows logarithmically during the radiation-dominated
era. The dark matter perturbations thus have a head start on perturbations
in baryonic matter, which is tightly coupled to the photons.

11.7 The matter-dominated era


11.7.1 CDM density perturbations
For cold dark matter, it is simple to determine how the perturbations evolve. In
the case of baryons, we have to consider two separate periods: before decoupling,
when baryons evolve as part of the tightly coupled baryon-photon fluid, and after
decoupling, when baryons are an independent pressureless fluid. Let us first consider
cold dark matter. In the matter-dominated era, we have $\rho \approx \rho_m \propto a^{-3}$, so we get
(assuming negligible spatial curvature) $a \propto t^{2/3}$, H = 2/(3t). Assuming that the
initial density contrast in the radiation is not much larger than that of cold dark
matter, we can neglect perturbations in the radiation fluid in the matter-dominated
era (as $\delta\rho_r = \delta_r \bar{\rho}_r$). This is always true for adiabatic perturbations. We therefore
have $v^2 \approx c_s^2 \approx 0$. The general solution of (11.35) and (11.36) is now
$$ \begin{aligned} \Phi_k(t) &= B_{1k} + a^{-5/2} B_{2k} \\ \delta_k(t) &= -(2y^2 + 2) B_{1k} - (2y^2 - 3)\, a^{-5/2} B_{2k} , \end{aligned} \tag{11.43} $$
where we have $y \equiv k/(\sqrt{3}\, aH) \propto a^{1/2} \propto t^{1/3}$. Note that with $c_s^2 = 0$, the equation
(11.35) for the gravitational potential contains no spatial derivatives, so one cannot
have oscillating solutions. (This is physically obvious: with zero sound speed, there
cannot be any sound waves.) For superhorizon modes, k ≪ aH, the behaviour
is qualitatively the same as in the radiation-dominated era: the decaying mode
becomes negligible, and the amplitude of the non-decaying mode remains constant,
both for the gravitational potential and the density contrast. However, the short
wavelength behaviour is quite different. The gravitational potential is constant,
and the density contrast grows like $(aH)^{-2} \propto a \propto t^{2/3}$. It is also noteworthy
that (neglecting the decaying mode), the metric perturbation during the matter-
dominated era is constant on all scales, not just for super-Hubble wavelengths.
As the universe changes from radiation domination to being dominated by mat-
ter, the coefficient B1k is determined in terms of the radiation era coefficient A1k
(more precisely, the full solution describes a smooth interpolation between the two
eras).
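The density contrast in (11.43) follows from inserting the potential into (11.36). The sketch below checks this numerically for a matter-dominated background with $a = t^{2/3}$ and H = 2/(3t); the normalisation of a, the sample values of t, k and the coefficients, and the function names are all illustrative assumptions.

```python
def scale_factor(t):
    """Matter-dominated scale factor, normalised so that a = 1 at t = 1 (assumption)."""
    return t ** (2.0 / 3.0)

def hubble(t):
    return 2.0 / (3.0 * t)

def potential(t, b1, b2):
    """Phi_k(t) from (11.43)."""
    return b1 + b2 * scale_factor(t) ** -2.5

def delta_from_potential(t, k, b1, b2):
    """Evaluate (11.36), taking the time derivative by central difference."""
    step = 1e-6 * t
    phi = potential(t, b1, b2)
    phi_dot = (potential(t + step, b1, b2) - potential(t - step, b1, b2)) / (2.0 * step)
    x_sq = (k / (scale_factor(t) * hubble(t))) ** 2
    return -(2.0 / 3.0) * x_sq * phi - 2.0 * phi_dot / hubble(t) - 2.0 * phi

def delta_closed_form(t, k, b1, b2):
    """delta_k(t) from (11.43), with y^2 = k^2 / (3 a^2 H^2)."""
    y_sq = k**2 / (3.0 * (scale_factor(t) * hubble(t)) ** 2)
    return -(2.0 * y_sq + 2.0) * b1 - (2.0 * y_sq - 3.0) * scale_factor(t) ** -2.5 * b2

# Hypothetical sample point and coefficients.
t, k, b1, b2 = 2.0, 5.0, 1e-5, 3e-6
print(delta_from_potential(t, k, b1, b2), delta_closed_form(t, k, b1, b2))
```

The two printed values should agree up to the finite-difference error, for the growing and decaying modes separately as well as for their sum.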

11.7.2 Baryon density perturbations


Falling into CDM potential wells. Although CDM is the dominant matter
component in the universe, most observations are of (light emitted by) baryonic
matter. The main method to observe the density perturbations today is to study
the distribution of galaxies. Thus to compare the theory of structure formation to
observations, it is crucial to know how perturbations in the baryonic component
evolve. The issue is complicated by the coupling between baryons and photons.
After decoupling, the evolution of the baryon density perturbation is governed by
the gravitational effect of the dominant matter component, the CDM. So in fact the
driving term for baryonic perturbations is the total density contrast, which includes
both baryons and CDM. On large scales, we can ignore the pressure of the baryonic
component, and then δb has the same evolution equation as δc , namely (11.41). We
can define the baryon-CDM isocurvature perturbation as

Scb = δc − δb , (11.44)

and it expresses how perturbations in the two components deviate from each other.
For both δc and δb , the right-hand side of (11.41) is the same, so subtracting the
equations we get an equation for Scb :

S̈cb + 2H Ṡcb = 0 . (11.45)

We assume that the primordial perturbations were adiabatic, so that we originally
had $\delta_b = \delta_c$, i.e. $S_{cb} = 0$ at horizon entry. For large scales, which enter the
horizon after decoupling, a non-zero $S_{cb}$ does not develop, so the evolution of the
baryon perturbations is the same as that of the CDM perturbations. (This is for linear scales:
when perturbations become non-linear, baryons and CDM behave differently.)
But for scales which enter before decoupling, a non-zero Scb develops because
baryon perturbations are coupled to photon perturbations, whereas CDM pertur-
bations are not. After decoupling, δc ≫ δb , since δc has been growing, while δb has
been oscillating. The initial condition is then Scb ∼ δc (“initial” time here being the
time of decoupling $t_{\rm dec}$). During the matter-dominated epoch, the solution for $S_{cb}$
is
$$ S_{cb} = A + B t^{-1/3} , \tag{11.46} $$
whereas for $\delta_c$ it is, neglecting the effect of baryons on it, from Eq. (11.56),
$$ \delta_c = C t^{2/3} + D t^{-1} \simeq C t^{2/3} . \tag{11.47} $$

We call the first term the “growing” and the second term the “decaying” mode,
even though the “growing” mode of Scb is actually constant. The modes have been
evolving since horizon entry, so we can drop the decaying part.
To work out the precise initial conditions, we would need to work out the behaviour
of $S_{cb}$ during decoupling. However, we really only need to assume that there
is no strong cancellation between the growing and decaying modes, so that $S_{cb} \sim \delta_c$
implies that A is not much larger than $\delta_c$,
$$ A \lesssim C t_{\rm dec}^{2/3} . \tag{11.48} $$
Later, at $t \gg t_{\rm dec}$,
$$ S_{cb} \sim A \lesssim C t_{\rm dec}^{2/3} $$
$$ \delta_c \sim C t^{2/3} \gg S_{cb} = \delta_c - \delta_b \quad \Rightarrow \quad \delta_b \sim \delta_c . $$

Thus the baryon density contrast δb grows to match the CDM density contrast
δc (see figure 2), and we have eventually δb = δc = δ to high accuracy.
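The catch-up can be made quantitative with a short sketch. It assumes that $S_{cb}$ stays at its constant ("growing") value from (11.46) while $\delta_c$ grows linearly in a as in (11.47), and, as an extreme illustrative case not taken from the notes, that $S_{cb} = \delta_c$ at decoupling (i.e. $\delta_b$ = 0 there).

```python
def baryon_to_cdm_ratio(a_over_a_dec, s_over_delta_c_dec=1.0):
    """delta_b / delta_c = 1 - S_cb / delta_c, for constant S_cb and delta_c proportional to a.

    s_over_delta_c_dec is S_cb / delta_c evaluated at decoupling; the default 1.0
    is the extreme case delta_b = 0 at t_dec (an assumption for illustration).
    """
    return 1.0 - s_over_delta_c_dec / a_over_a_dec

for growth in (1.0, 2.0, 10.0, 100.0, 1090.0):
    print(f"a/a_dec = {growth:7.1f}   delta_b/delta_c = {baryon_to_cdm_ratio(growth):.4f}")
```

Even starting from zero, the baryon contrast is within ten per cent of the CDM contrast once the scale factor has grown by a factor of ten, and the fractional difference keeps falling as 1/a.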
The baryon density perturbation begins to grow only after tdec . Before decou-
pling the radiation pressure prevents growth. Without CDM, the density contrast
would grow only as δb ∝ a ∝ t2/3 after decoupling (during the matter-dominated
period, and the growth stops when the universe becomes dark energy dominated).
Thus it would have grown at most by the factor $a_0/a_{\rm dec} = 1 + z_{\rm dec} \approx 1090$ after
decoupling. In the anisotropy of the CMB we observe directly the baryon density
perturbations at $t = t_{\rm dec}$. They are too small (about $10^{-5}$) for a growth factor of
1090 to give the present observed large scale structure$^7$.
With CDM, this problem is solved. The CDM perturbations begin to grow
earlier, logarithmically in a during the radiation-dominated era and linearly in a from
t ∼ teq onwards, so by t = tdec they are much larger than the baryon perturbations.
After decoupling the baryons lose support from photon pressure and fall into the
CDM gravitational potential wells and catch up with the CDM perturbations. This
allows the baryon perturbations to be small at t = tdec and to grow after that by
much more than the factor $10^3$, solving the problem with observations. This is one
of the strongest pieces of evidence for dark matter.
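The size of the mismatch is easy to state in numbers. The sketch below multiplies the observed contrast at decoupling by the maximal growth factor without CDM; the comparison with a contrast of order one, taken here as a rough marker for the onset of non-linear structure, is an assumption of this illustration rather than a statement from the notes.

```python
delta_b_at_decoupling = 1e-5   # order of magnitude seen in the CMB
growth_factor_no_cdm = 1090.0  # a_0 / a_dec = 1 + z_dec

delta_b_today_no_cdm = delta_b_at_decoupling * growth_factor_no_cdm
required_growth = 1.0 / delta_b_at_decoupling   # growth needed to reach delta ~ 1

print(f"contrast today without CDM: {delta_b_today_no_cdm:.3f}")
print(f"growth needed to reach order one: {required_growth:.0e}")
```

Without CDM the contrast today would still be at the per cent level, about two orders of magnitude short of order one.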
$^7$ This assumes adiabatic primordial perturbations, since we are seeing $\delta_\gamma$, not $\delta_b$. For a time,
primordial baryon entropy perturbations $S_{b\gamma} = \delta_b - \frac{3}{4}\delta_\gamma$ were considered a possible explanation,
but more accurate observations have ruled out this possibility.

Figure 2: Evolution of the CDM and baryon density perturbations after horizon entry (at
t = tk ). The figure is just schematic; the upper part is to be understood as having a ∼
logarithmic scale; the difference δc − δb stays roughly constant, but the fractional difference
becomes negligible as both δc and δb grow by a large factor.

The above situation became clear in the 1980s when the upper limits to CMB
anisotropy (which was finally discovered by COBE in 1992) became tighter and
tighter. Today we have accurate measurements of the structure of the CMB anisotropy
which are compared to detailed calculations with CDM, and the argument is raised
to a different level: instead of comparing just two numbers we are now comparing
entire power spectra, which we will discuss in the next chapter.

The Jeans equation. Before decoupling, baryons see the photon pressure (as
well as their own pressure), while after decoupling, they just see their own pressure.
Baryon pressure is much smaller than photon pressure, but it is important on small
scales. At the background level, the baryon pressure can be taken to be zero, $\bar{p}_b = 0$,
but the perturbation is non-zero, $\delta p_b \neq 0$. After decoupling, baryonic matter is a
gas of hydrogen and helium. If we ignore the formation of molecules in the gas and
neglect the contribution of helium, so that we have a monoatomic gas, we have
$$ v^2 = \frac{\delta p_b}{\delta\rho_b} \approx T_b \frac{\delta n_b}{\delta\rho_b} = \frac{T_b}{m_N} , \tag{11.49} $$
where we have taken into account that the temperature is very uniform, and $m_N \approx 1$ GeV
is the nucleon mass. Note that in this case $v^2 = c_s^2 = \partial p_b/\partial\rho_b$. Down until
$z \sim 100$, residual free electrons maintain enough interaction between the baryon
and photon components to keep $T_b \approx T_\gamma$. During this period, we thus have
$c_s^2 \approx 10^{-13}(1+z) \propto 1/a$. After that the baryon temperature falls faster than the photon
temperature,
$$ T_b \propto (1+z)^2 \quad \text{whereas} \quad T_\gamma \propto 1+z $$
(as shown in an exercise in chapter 4).
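As a quick numerical check of the order of magnitude of the sound speed, the sketch below evaluates $c_s^2 \approx T_b/m_N$ in natural units. The Boltzmann constant and the present CMB temperature of 2.725 K are standard values; the function name is a hypothetical helper, and $m_N$ is rounded to 1 GeV as in the text.

```python
# Baryon sound speed squared, c_s^2 ≈ T_b / m_N, in natural units (k_B = c = 1).
K_TO_EV = 8.617333e-5     # Boltzmann constant in eV/K
T0_KELVIN = 2.725         # present CMB temperature
M_NUCLEON_EV = 1.0e9      # m_N ≈ 1 GeV, as in the text

def cs2_coupled(z):
    """c_s^2 while T_b ≈ T_gamma ∝ (1 + z), i.e. down to z ~ 100."""
    t_gamma_ev = K_TO_EV * T0_KELVIN * (1 + z)
    return t_gamma_ev / M_NUCLEON_EV

coefficient = cs2_coupled(0)           # prefactor multiplying (1 + z), extrapolated
cs_at_dec = cs2_coupled(1090) ** 0.5   # sound speed in units of c near decoupling
print(f"c_s^2 ≈ {coefficient:.2e} (1+z),  c_s(z=1090) ≈ {cs_at_dec:.1e}")
```

The prefactor comes out at a few times $10^{-13}$, consistent with the estimate above, and the sound speed just after decoupling is about $10^{-5}$ of the speed of light.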
However, even a tiny pressure can be important on small scales. If we take
the analogue of (11.41) for the baryonic component, which includes a tiny pressure
contribution (we skip the derivation), we get the Jeans equation⁸, valid on subhorizon
scales,
$$ \ddot{\delta}_{bk} + 2H\dot{\delta}_{bk} + \left( c_s^2 \frac{k^2}{a^2} - 4\pi G_N \bar{\rho} \right) \delta_{bk} = 0 . \tag{11.50} $$

We have assumed that the universe is spatially flat, so we can also write this as
$$ \ddot{\delta}_{bk} + 2H\dot{\delta}_{bk} + \left( c_s^2 \frac{k^2}{a^2} - \frac{3}{2} H^2 \right) \delta_{bk} = 0 . \tag{11.51} $$

We see that the small pressure term $c_s^2$ is enhanced on small scales by the factor
$k^2$. If we take $k$ to be sufficiently large, this term will dominate, no matter how small
$c_s^2$ is. The nature of the solution to the Jeans equation depends on the sign of the
factor in brackets. Pressure resists compression, so if the first term dominates, we
get an oscillating solution, i.e. sound waves. The second term in the brackets is due
to gravity. If this term dominates, the perturbations grow. The wavenumber for
which the terms are equal,
$$ k_J = \frac{a\sqrt{4\pi G\bar{\rho}}}{c_s} = \sqrt{\frac{3}{2}}\, \frac{aH}{c_s} , \tag{11.52} $$
is called the Jeans wavenumber, and the corresponding wavelength
$$ \lambda_J = \frac{2\pi}{k_J} \tag{11.53} $$
is called the Jeans length.
For scales much smaller than the Jeans length, $k \gg k_J$, we can approximate
the Jeans equation by
$$ \ddot{\delta}_{bk} + 2H\dot{\delta}_{bk} + c_s^2 \frac{k^2}{a^2} \delta_{bk} = 0 . \tag{11.54} $$
The solutions oscillate with angular frequency $\omega = c_s k/a$ (assuming that $c_s$ is constant,
or changes slowly; this is not really quite true, as we have seen). The
oscillations are damped by the $2H\dot{\delta}_{bk}$ term, thus the amplitude of the oscillations
decreases with time. There is no growth of structure on sub-Jeans scales.
For scales much longer than the Jeans length (but still subhorizon), $aH \ll k \ll k_J$,
we have
$$ \ddot{\delta}_{bk} + 2H\dot{\delta}_{bk} - \frac{3}{2} H^2 \delta_{bk} = 0 . \tag{11.55} $$

In the matter-dominated era we have $a \propto t^{2/3}$, and the general solution is

$$ \delta_{bk}(t) = C_{1k}\, t^{2/3} + C_{2k}\, t^{-1} . \tag{11.56} $$

So baryon perturbations on scales larger than the Jeans length but smaller than the
Hubble length grow just like CDM perturbations, as we discussed earlier.
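The two exponents in (11.56) can be verified by inserting the trial solution $\delta \propto t^p$ into (11.55) with $H = 2/(3t)$, which leaves the algebraic condition $p(p-1) + \frac{4}{3}p - \frac{2}{3} = 0$. The short sketch below checks this with exact rational arithmetic; the function name is illustrative.

```python
from fractions import Fraction

# Matter domination: a ∝ t^(2/3), so H = 2/(3t).  Inserting delta = t^p into
# (11.55) and dividing by t^(p-2) leaves the indicial polynomial below.
def indicial(p):
    two_h_t = 2 * Fraction(2, 3)                               # 2H * t
    three_half_h2_t2 = Fraction(3, 2) * Fraction(2, 3) ** 2    # (3/2) H^2 * t^2
    return p * (p - 1) + two_h_t * p - three_half_h2_t2

growing, decaying = Fraction(2, 3), Fraction(-1)
print(indicial(growing), indicial(decaying))
```

Both values vanish, so $t^{2/3}$ is the growing mode and $t^{-1}$ the decaying mode.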
⁸ Often the Jeans equations are derived starting from the equations of Newtonian gravity, in
which context they were originally presented.

The ratio of the (comoving) Jeans length to the comoving Hubble length is, from
(11.52),
$$ \frac{\lambda_J}{(aH)^{-1}} = 2\pi \sqrt{\frac{2}{3}}\, c_s . $$
Before decoupling, the baryons see the photon pressure, and $c_s^2 \sim \frac{1}{3}$. From
this ratio we would then conclude that before decoupling the baryonic Jeans length
is comparable to the Hubble length, so that all subhorizon modes are sub-Jeans.
Therefore, all subhorizon baryon modes oscillate before decoupling. However, this
argument is not really correct, because the Jeans equation is not valid when c2s is
large. Also, in the period close to decoupling the photon mean free path λγ grows
rapidly. The fluid description, which we are here using for the perturbations, applies
only to scales ≫ λγ , whereas the photons are smooth only on scales ≪ λγ . The
behaviour during this period can be treated properly only with numerical codes, such
as COSMOMC. Nevertheless, the conclusion that all baryonic subhorizon modes
oscillate before decoupling is correct, at least when perturbations are adiabatic⁹.
After decoupling, the Jeans length grows. However, at all times until today, it is
≪ Mpc. It would be relevant if we were interested in the process of the formation
of individual galaxies, but here we are interested in the larger scales reflected in
perturbations of the galaxy number density. Thus for our purposes, the baryonic
component is pressureless after decoupling.
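The size of the drop in the Jeans length at decoupling can be illustrated with the ratio $\lambda_J/(aH)^{-1} = 2\pi\sqrt{2/3}\,c_s$. The sketch below is only the naive estimate discussed above (the Jeans equation is not strictly valid for large $c_s^2$); the sound speeds are the values quoted in the text, and the function name is hypothetical.

```python
import math

def jeans_over_hubble(cs):
    """Comoving Jeans length over comoving Hubble length: 2*pi*sqrt(2/3)*c_s."""
    return 2 * math.pi * math.sqrt(2.0 / 3.0) * cs

# Before decoupling: photon pressure, c_s^2 ~ 1/3.
ratio_before = jeans_over_hubble(math.sqrt(1.0 / 3.0))

# Just after decoupling: c_s^2 ≈ 1e-13 (1 + z) with 1 + z ≈ 1090.
ratio_after = jeans_over_hubble(math.sqrt(1e-13 * 1090))

print(f"before: {ratio_before:.2f}, after: {ratio_after:.1e}")
```

The ratio falls from order unity to below $10^{-4}$, which is why the baryons can be treated as pressureless on the scales of interest here.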
The subhorizon evolution history of the different cosmological scales of pertur-
bations is summarised in figure 3.

11.8 The transfer function


Let us summarise the evolution of the linearly perturbed universe. The universe
expands like $a \propto t^{1/2}$ in the radiation-dominated era, followed by expansion given
by $a \propto t^{2/3}$ in the matter-dominated era, with a smooth transition around redshift
$z_{\rm eq} = 3500$ at 50 000 years. During both eras, perturbations with wavelengths
larger than the horizon remain frozen¹⁰. This means that the properties of the
superhorizon perturbations (i.e. the growing mode amplitudes $A_{1k}$) are preserved
from the inflationary era.
As perturbations enter the horizon during the radiation-dominated era, the grav-
itational potential decays, while the density contrast of photons and baryons oscil-
lates. The density contrast of dark matter grows logarithmically. As the universe
becomes matter-dominated, the density contrast of sub-horizon modes starts to grow
proportional to a, and the gravitational potential stays constant. When the universe
becomes dominated by dark energy, perturbations stop growing. (Exercise: Show
this.)
⁹ If there is an initial baryon isocurvature perturbation, i.e. a perturbation in baryon density
without the corresponding radiation perturbation, it will initially begin to grow in the same manner
as a CDM perturbation, since the pressure perturbation provided by the photons is missing. (Such
a baryon entropy perturbation corresponds to a perturbation in the baryon-photon ratio η.) But as
the movement of baryons drags the photons with them, a radiation perturbation will be generated,
and the baryon perturbation will begin to oscillate around its initial value (instead of oscillating
around zero).

¹⁰ In fact, during the transition from radiation domination to matter domination, the equation of
state is not barotropic, so the perturbations undergo a small change even on super-Hubble scales.

Figure 3: The evolution of perturbations on different subhorizon scales. The baryon Jeans
length $k_J^{-1}$ drops precipitously at decoupling so that all cosmological scales become
super-Jeans after decoupling, whereas all subhorizon scales are sub-Jeans before decoupling.
The wavy lines symbolise the oscillation of baryon perturbations before decoupling, and the
opening pair of lines around them symbolise the ∝ a growth of CDM perturbations after
$t_{\rm eq}$. There is also logarithmic growth of CDM perturbations between horizon entry and $t_{\rm eq}$.

These effects modify the primordial value of the perturbations, and this is encoded
in the transfer function. We express the relation between the primordial
curvature perturbation $\mathcal{R}_k$ and any other quantity we are interested in via a
transfer function. Since we have only one source of perturbations and perturbations
are assumed to be small, the value of any perturbation $g$ at time $t$ is related to the
primordial perturbation $\mathcal{R}_k$ linearly:

$$ g_k(t) = T_g(t, k)\, \mathcal{R}_k , \tag{11.57} $$

where Tg (t, k) is the transfer function for perturbation g. The transfer function
depends only on the magnitude k and not on the direction of k, because perturba-
tions are evolving on a homogeneous and isotropic background. Often the transfer
function separates, Tg (t, k) = f (t)F (k). In particular, this is the case for cold dark
matter, if the decaying mode can be neglected. The transfer function incorporates
all the physics that determines how structure evolves in the linear regime. The
power spectrum of $g$ is

$$ \mathcal{P}_g(t, k) = T_g(t, k)^2\, \mathcal{P}_{\mathcal{R}}(k) . \tag{11.58} $$

On scales $k^{-1} \gg 10$ Mpc, perturbations are still small today, and one does not
have to go beyond the transfer function. For smaller scales, corresponding to galax-
ies and galaxy clusters, the density perturbations have become large at late times,
and the physics of structure growth has become nonlinear. As the perturbations be-
come non-linear, modes with different wavenumber become coupled. This nonlinear
evolution is typically studied using large numerical simulations which use Newtonian
gravity. There are also some analytical results, also mostly in Newtonian gravity.
On scales that are still superhorizon today, the relation between the density
contrast and the primordial perturbations is simple: we have
$\delta_m \approx \delta = -2\Phi = \frac{6}{5}\mathcal{R}$, where we have used (11.4).
So for $k \ll a_0 H_0$, we simply have $T_\delta(t, k) = \frac{6}{5}$.
On scales that are subhorizon today, the situation is a bit more involved. Let
us make a crude estimate of the transfer function on those scales. Let us first look
at scales that enter before matter-radiation equality,
$k^{-1} < k_{\rm eq}^{-1} \approx 13.7\,\omega_m^{-1}$ Mpc $\approx 100$ Mpc.
We make the approximation that the relation (11.4), $\Phi_k = -\frac{2}{3}\mathcal{R}_k$, holds
all the way to horizon entry ($k = aH$), though it is strictly only valid for $k \ll aH$.
From (11.37) and (11.38) we have that at horizon entry ($k = aH$, or $y = 1/\sqrt{3}$)
$\delta_k \approx -\frac{5}{2}\Phi_k = \frac{5}{3}\mathcal{R}_k$. With adiabatic initial conditions, we have
$\delta_m = \frac{3}{4}\delta_r \approx \frac{3}{4}\delta$. We thus get
$$ \delta_{ck} \approx \frac{3}{4}\delta_k \approx \frac{5}{4}\mathcal{R}_k \tag{11.59} $$
at horizon entry. If we neglect the logarithmic growth of the CDM density per-
turbations, their amplitude stays at this level until the universe becomes matter-
dominated at t = teq , after which we can approximate δk ≈ δck and δk begins to
grow according to the matter-dominated law, ∝ 1/(aH)2 ∝ a. Putting in the log-
arithmic growth from horizon entry to matter-radiation equality, the perturbations
are in addition enhanced by a factor ln(aeq /aentry ) = 2 ln[aentry Hentry /(aeq Heq )] =
2 ln(k/keq ), where the subscript entry refers to horizon entry. So all in all we have,
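for $k \gg k_{\rm eq}$ in the matter-dominated era

$$ \delta_k(t) \approx \frac{5}{2} \left( \frac{a_{\rm eq} H_{\rm eq}}{aH} \right)^2 \ln\left( \frac{k}{k_{\rm eq}} \right) \mathcal{R}_k
 = \frac{5}{2} \left( \frac{k_{\rm eq}}{aH} \right)^2 \ln\left( \frac{k}{k_{\rm eq}} \right) \mathcal{R}_k . \tag{11.60} $$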

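In contrast, for perturbations which enter the horizon during matter domination,
$k \ll k_{\rm eq}$, we have

$$ \delta_k(t) = -\frac{2}{3} \left( \frac{k}{aH} \right)^2 \Phi_k
 = \frac{2}{5} \left( \frac{k}{aH} \right)^2 \mathcal{R}_k , \tag{11.61} $$

where we have used the relation given by (11.4), $\Phi = -\frac{3}{5}\mathcal{R}$.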


For a scale-invariant spectrum of primordial comoving curvature perturbations,
the amplitude of the density perturbations grows on small scales proportional to $k^2$. All
modes enter ($k = aH$) with approximately the same amplitude, but their amplitude
then grows when they are subhorizon. However, the modes which entered during the
radiation-dominated era have not grown during that era, so their growth is damped
by the extra factor $(k_{\rm eq}/k)^2$ (modulo the logarithmic growth). This behaviour can
be parametrised by introducing a new transfer function $T(k)$, which is defined as
$$ \delta_k = \frac{2}{5} \left( \frac{k}{aH} \right)^2 \mathcal{R}_k\, T(k) . \tag{11.62} $$

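Putting the above results together, we have

$$ T(k) = 1 \quad (k \ll k_{\rm eq}) , \qquad
   T(k) \approx \left( \frac{k_{\rm eq}}{k} \right)^2 \ln\left( \frac{k}{k_{\rm eq}} \right) \quad (k \gg k_{\rm eq}) , \tag{11.63} $$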

where we have dropped factors of order unity from the case k ≫ keq , since the
calculation is anyway approximate. If we wanted a transfer function which is con-
tinuous, we could replace ln(k/keq ) with ln(e + k/keq ). However, our calculation
is rather crude, and we should take into account the transition from radiation to
matter domination in more detail. An analytical fit to a numerical calculation gives
[1]

$$ T(k) = \frac{\ln(1 + 2.34q)}{2.34q \left[ 1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4 \right]^{1/4}} , \tag{11.64} $$

where $q \approx k e^{f_b}/(14 k_{\rm eq})$, and the baryon fraction $f_b \equiv \omega_b/\omega_m$ takes into account
interactions between baryons and photons which dampen the matter perturbations.
The form (11.64) is called the BBKS transfer function after Bardeen, Bond, Kaiser
and Szalay. For realistic values fb ≈ 0.2, it has an error of around 30% around
the turning value keq , while it is accurate for high and low values of k. In detailed
calculations, numerical solutions of the baryon-photon-dark matter system are used
to derive the transfer function. There are publicly available computer programs
for doing this, such as COSMOMC. One of the main effects missing from both
(11.63) and (11.64) is baryon acoustic oscillations in the regime k > keq . These are
remnants of the oscillations of the baryon-photon fluid before decoupling, which are
imprinted on the pattern of density fluctuations (and thus the distribution of
galaxies) today. Since there is much more dark matter than baryons, the oscillations
are only a small feature in the overall power spectrum, but they carry important
cosmological information, like the CMB anisotropies we discuss in the next chapter.
Further discussion of the baryon acoustic oscillations is beyond the scope of this
course.
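To get a feeling for the fit (11.64), it can be evaluated directly. The sketch below is a minimal implementation as a function of the rescaled wavenumber $q$; the function name is hypothetical, and the conversion from $k$ to $q$ is left to the caller since it depends on $k_{\rm eq}$ and $f_b$.

```python
import math

def bbks_transfer(q):
    """BBKS fit (11.64) as a function of the rescaled wavenumber q > 0."""
    x = 2.34 * q
    poly = 1 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return math.log1p(x) / x * poly ** -0.25

for q in (1e-4, 1e-2, 1.0, 100.0):
    print(f"q = {q:g}: T = {bbks_transfer(q):.3e}")

# For large q the fit approaches ln(2.34 q) / (2.34 * 6.71 * q^2),
# i.e. the same ln(k) / k^2 behaviour as the crude estimate (11.63).
q_large = 1e4
asymptote = math.log(2.34 * q_large) / (2.34 * 6.71 * q_large ** 2)
print(bbks_transfer(q_large) / asymptote)
```

The fit goes to unity for small $q$ and falls off as $\ln q/q^2$ for large $q$, interpolating smoothly between the two regimes of (11.63).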
According to the currently favoured picture, the universe becomes dark energy
dominated as we approach the present time. The equation-of-state parameter w
becomes negative and Φ begins to decay, so the growth of the density perturbations
is damped. This effect is not very big up until today (and we shall not calculate
it now), since the universe has expanded by less than a factor of 2 after the onset
of dark energy domination, but it is important in detailed matching of observations
and theory.
We have calculated everything using linear perturbation theory. This breaks
down when the perturbations become large, |δ(x)| ∼ 1. We say that the perturbation
becomes nonlinear. This has happened for scales $k^{-1} \lesssim 10$ Mpc by now. When
the perturbation becomes nonlinear, i.e. an overdense region becomes about twice
as dense as the average density of the universe, it collapses rapidly, and forms a
gravitationally bound structure, such as a galaxy or a cluster of galaxies. Further
collapse is prevented by the angular momentum of the structure. Stars and gas and
CDM particles in a galaxy orbit around the center of mass of the bound structure,
and galaxies in galaxy groups and clusters have more complicated orbits around
each other. Underdense regions start to depart from the linear behaviour when they
are roughly half as dense as the background. Such regions become ever emptier, as
they expand faster than the background.

11.9 The meaning of scale-invariance


Inflation predicts and observations give evidence for an almost scale invariant pri-
mordial power spectrum. Let us forget the “almost” for a moment and discuss what
it means for the primordial power spectrum to be scale-invariant.
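The primordial spectrum is something we have at superhorizon scales, where we
have discussed it in terms of the comoving curvature perturbation $\mathcal{R}$. The perturbation
spectrum is called scale-invariant when

$$ \mathcal{P}_{\mathcal{R}}(k) = A^2 = \text{const.} , \tag{11.65} $$

where in the real universe $A \approx 5 \times 10^{-5}$.

In terms of the other definition of the power spectrum, $P(k) \equiv (2\pi^2/k^3)\mathcal{P}(k)$, we
have
$$ P_{\mathcal{R}}(k) \propto k^{-3} \mathcal{P}_{\mathcal{R}} \propto k^{-3} , \qquad
   P_\delta(k) \propto k^{-3} \mathcal{P}_\delta \propto k\, \mathcal{P}_{\mathcal{R}} \propto k . \tag{11.66} $$

For $\mathcal{P}_{\mathcal{R}}(k) \propto k^{n-1}$ we have $P_\delta(k) \propto k^n$. This is the reason for the −1 in the definition
of the spectral index in terms of $\mathcal{P}_{\mathcal{R}}$: it was originally defined in terms of $P_\delta$.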

We might ask why inflation generates a scale-invariant spectrum – not the math-
ematical reason (we calculated that in the previous chapter) but the physical idea.
During inflation the universe is close to a de Sitter universe, with the metric
$$ ds^2 = -dt^2 + e^{2Ht} (dx^2 + dy^2 + dz^2) $$
with H = const. The de Sitter universe is an example of a maximally symmetric
spacetime. In addition to being homogeneous (in the space directions), it also looks
the same at all times. (This is not obvious from the metric, just like spatial ho-
mogeneity is not obvious from the metric for FRW universes with non-zero spatial
curvature.) Therefore, modes of different wavelength get the same perturbations
imprinted on them regardless of when they leave the horizon.
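We would now like to see how the scale-invariance relates to the density perturbation.
The power spectrum of density perturbations is
$$ \mathcal{P}_\delta(k) = \frac{4}{25} \left( \frac{k}{aH} \right)^4 T(k)^2\, \mathcal{P}_{\mathcal{R}}(k) , \tag{11.67} $$
and for the gravitational potential we have
$$ \mathcal{P}_\Phi(k) = \frac{9}{25} \mathcal{P}_{\mathcal{R}}(k)\, T(k)^2 = \text{constant for } k < k_{\rm eq} . \tag{11.68} $$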
We see that perturbations in the gravitational potential are scale invariant (apart
from the transfer function), but perturbations in density are not. Instead the density
perturbation spectrum is steeply rising on small scales, meaning that there is more
structure at small scales than at large scales. Thus the scale invariance refers to
the metric perturbations. The density perturbation then turns at ∼ keq to become
almost flat (growing ∼ ln k) at small scales, due to the inhibition of the growth
of density perturbations during the radiation-dominated era. We can also say that
the scale-invariance refers to the density perturbations as they enter the horizon,
i.e. density perturbations on all scales enter the horizon with the same amplitude
$(2/5)A \approx 2 \times 10^{-5}$.
The relation between density and gravitational potential perturbations reflects
the nature of gravity: a 1% overdense region 100 Mpc across generates a much
deeper potential well than a 1% overdense region 10 Mpc across, since the former
has 1000 times more mass. Therefore we need much stronger density perturbations
at smaller scales to have an equal contribution to Φ.
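The scaling in this argument can be made explicit with a rough Newtonian estimate, which is an assumption of this sketch rather than a result derived above: the excess mass of a region of size $R$ scales as $\delta\,\bar{\rho}R^3$, and the depth of its potential well as $G\,\delta M/R \propto \delta R^2$. Constant prefactors are dropped and the function names are illustrative.

```python
def relative_mass(delta, size_mpc):
    """Excess mass of a region, up to constant factors: delta * R^3."""
    return delta * size_mpc ** 3

def relative_potential(delta, size_mpc):
    """Newtonian potential depth ~ G * dM / R, up to constants: delta * R^2."""
    return relative_mass(delta, size_mpc) / size_mpc

mass_ratio = relative_mass(0.01, 100) / relative_mass(0.01, 10)
potential_ratio = relative_potential(0.01, 100) / relative_potential(0.01, 10)

# Density contrast a 10 Mpc region would need in order to match the
# potential of the 1% overdense 100 Mpc region.
delta_needed = 0.01 * potential_ratio
print(mass_ratio, potential_ratio, delta_needed)
```

The larger region has 1000 times the mass and a potential well 100 times deeper, which is the $k^2$ relation between density and potential perturbations seen in (11.61).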
Thus the perturbations get rapidly stronger on smaller scales, down to $k_{\rm eq}^{-1} \sim 100$ Mpc.
The ∼ 100 Mpc scale appears indeed quite prominent in large scale structure
surveys, like the 2dFGRS and SDSS galaxy distribution surveys. Towards
smaller scales the structure keeps getting stronger, but now quite slowly. However,
the perturbations are now so large that first order perturbation theory begins to
fail, and that limit is crossed at around $k^{-1} \sim 10$ Mpc. Nonlinear effects cause the
density power spectrum to rise more steeply than calculated by perturbation theory
on scales smaller than this.
The present-day density power spectrum $\mathcal{P}_\delta(k)$ can be determined observationally
from the distribution of galaxies (Fig. 5). The quantity plotted is usually
$P_\delta(k) \equiv (2\pi^2/k^3)\mathcal{P}_\delta(k)$. It should go as
$$ P_\delta(k) \propto k^n \quad \text{for } k \ll k_{\rm eq} , \qquad
   P_\delta(k) \propto k^{n-4} \ln^2 k \quad \text{for } k \gg k_{\rm eq} . \tag{11.69} $$

See figure 6.

11.10 Free-streaming
We earlier presented a simple argument for why dark matter is needed, based on
the $10^{-5}$ amplitude of the observed CMB anisotropies. Because baryons are tightly
coupled with photons at the time of last scattering, their density contrast $\delta_b$ is also
∼ $10^{-5}$, and since density perturbations grow only linearly with the scale factor, an
expansion factor of ∼ 1000 is not enough to produce non-linear perturbations. However,
the density contrast of dark matter, which is not coupled to the baryons, grows
logarithmically during the radiation-dominated era, and so a factor of one thousand
amplification is enough to give non-linear structures today.
With the more detailed look above, we note that even without the transfer func-
tion, the amplitude of the density perturbation, unlike the gravitational potential,
depends on the scale. The conclusion that non-linear baryonic structures on the
presently observed scales could not have formed without dark matter is correct, but
the argument is a bit more subtle. Perturbations on comoving length scale R become
non-linear when their density contrast becomes of order unity. The density contrast
smoothed on a ball of radius $R$ around the point $\mathbf{x}$ is
$$ \delta(\mathbf{x}, R) \equiv \frac{1}{V} \int W\left( \frac{|\mathbf{x}' - \mathbf{x}|}{R} \right) \delta(\mathbf{x}')\, d^3x' , \tag{11.70} $$

where $W(y)$, the window function, is some function which falls off rapidly for $y > 1$,
i.e., $|\mathbf{x} - \mathbf{x}'| > R$, and $V \equiv \int d^3x\, W(x/R)$ is the volume of $W$. A typical choice of
$W$ is a Gaussian, $W(x/R) = \exp(-x^2/2R^2)$.


We are not interested in any specific point $\mathbf{x}$, but in the typical value of $|\delta(\mathbf{x}, R)|$
(the average of $\delta(\mathbf{x}, R)$ is zero), so we consider the mean square density contrast

$$ \sigma^2(R) \equiv \langle \delta(\mathbf{x}, R)^2 \rangle , \tag{11.71} $$

where $\langle\ \rangle$ stands for the spatial average. As we are considering the linear density field,
this is just the average over the background space,
$\langle \delta(\mathbf{x}, R)^2 \rangle = \left( \int d^3x \right)^{-1} \int d^3x\, \delta(\mathbf{x}, R)^2$.
Structures start forming on comoving scale $R$ when $\sigma(R)$, which grows linearly with
the scale factor, reaches unity. Doing a Fourier transform, we can write the mean
square density contrast as
$$ \sigma^2(R) = \int_0^\infty \frac{dk}{k}\, \mathcal{P}_\delta(k, t)\, W(kR)^2 , \tag{11.72} $$
where for a Gaussian window function we have $W(kR) = e^{-\frac{1}{2} k^2 R^2}$. If the spectrum
of density perturbations were a power law, $\mathcal{P}_\delta(k) = A k^{n+3}$, we would have (exercise)
$$ \sigma^2(R) = \frac{1}{2} \Gamma\left( \frac{n+3}{2} \right) \mathcal{P}_\delta(R^{-1}) . \tag{11.73} $$
So the mean square density contrast on a given comoving scale R would be
roughly given by the value of the power spectrum at k = R−1 . The real power
spectrum is more complicated because of the transfer function, but qualitatively it
is still true that the amplitude of density perturbations on a given scale is roughly
given by the power spectrum on that scale.
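The closed form (11.73) can be checked numerically by integrating (11.72) for a power-law spectrum with the Gaussian window. The sketch below uses a simple midpoint rule in $u = \ln(kR)$; the integration limits, step count, and the choice $R = 8$ (in arbitrary length units) are illustrative assumptions.

```python
import math

def sigma2_numeric(n, radius, amplitude=1.0, steps=100_000):
    """Integrate (11.72) for P(k) = A k^(n+3) with a Gaussian window."""
    # With u = ln(k R): dk/k = du and the integrand becomes
    # A R^-(n+3) * exp((n+3) u) * exp(-exp(2u)).
    u_min, u_max = -40.0, 4.0
    du = (u_max - u_min) / steps
    total = 0.0
    for i in range(steps):
        u = u_min + (i + 0.5) * du          # midpoint rule
        total += math.exp((n + 3) * u - math.exp(2 * u)) * du
    return amplitude * radius ** -(n + 3) * total

def sigma2_analytic(n, radius, amplitude=1.0):
    """Closed form (11.73): (1/2) Gamma((n+3)/2) P(1/R)."""
    power_at_inverse_r = amplitude * radius ** -(n + 3)
    return 0.5 * math.gamma((n + 3) / 2) * power_at_inverse_r

for n in (1.0, 0.96, -1.0):
    print(n, sigma2_numeric(n, 8.0), sigma2_analytic(n, 8.0))
```

The two columns agree to high accuracy, confirming that $\sigma^2(R)$ equals the power spectrum at $k = R^{-1}$ up to a factor of order unity.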

If the transfer function were to continue to have the $k^{-2} \ln k$ behaviour for very
large $k$ without limit, we would have $\mathcal{P}_\delta(k) \sim k^{n-1} (\ln(k/k_{\rm eq}))^2$. So if $n \geq 1$, the
power spectrum would reach non-linear values at all times, if we just go to small
enough scales. So we would always have non-linear structures, albeit on very small
scales! However, the radiation-dominated era after inflation has a finite duration,
so the amount of logarithmic growth is limited. There is also another effect which
wipes out structure on small scales, namely the motion of the dark matter particles,
called free-streaming.
Even CDM has a finite temperature, which means that dark matter particles
have thermal motions, and this smooths density perturbations below some scale,
as particles from overdense and underdense regions mix and balance the density
perturbations out. For CDM, the transfer function is modified by the term $e^{-k^2/k_{\rm fs}^2}$
for $k \gg k_{\rm fs}$, where $k_{\rm fs}$ is the free-streaming scale, related to the distance the dark
matter particles have moved since decoupling. For $k < k_{\rm fs}$, structure formation is
unaffected, but on small scales, perturbations are highly suppressed. The smallest
scale on which structures form is given by the free-streaming length, which for a
WIMP is approximately [4]
$$ k_{\rm fs}^{-1} \approx \left( \frac{m}{100\ \text{GeV}} \right)^{1/2} \left( \frac{T_D}{30\ \text{MeV}} \right)^{1/2} \text{pc} , \tag{11.74} $$

where $m$ is the mass of the dark matter particle and $T_D$ is its decoupling temperature.
The smallest structures for a typical WIMP are therefore of comoving length 1 pc.
They form around a redshift of z = 40 . . . 60.
For warm dark matter, the free-streaming scale is larger, so structures on larger
scales are wiped out. For example, for sterile neutrinos (a prominent WDM candidate),
the transfer function is instead modified approximately with the term
$[1 + (k/k_{\rm fs})^2]^{-5}$, with [5]
$$ k_{\rm fs} \approx \left( \frac{m}{500\ \text{eV}} \right) \text{Mpc}^{-1} . \tag{11.75} $$
If the sterile neutrino mass were 500 eV, all structures on comoving scales smaller
than a Mpc would have been suppressed, in drastic conflict with observations. How-
ever, for a mass of say 5 keV, galaxies still form, but smaller structures are sup-
pressed, which may help to explain why there are fewer observed satellites of the
Milky Way than predicted in CDM models¹¹. Viewed from another perspective,
observations of structures can be used to constrain particle physics dark matter
models.
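The two sterile neutrino masses mentioned above can be compared directly using (11.75) and the suppression term $[1 + (k/k_{\rm fs})^2]^{-5}$. The sketch below evaluates the suppression at a comoving wavenumber of 1 Mpc⁻¹, which is an illustrative choice; the function names are hypothetical.

```python
def k_fs_sterile(mass_ev):
    """Free-streaming wavenumber from (11.75), in Mpc^-1."""
    return mass_ev / 500.0

def wdm_suppression(k, mass_ev):
    """Factor multiplying the transfer function: [1 + (k/k_fs)^2]^-5."""
    return (1.0 + (k / k_fs_sterile(mass_ev)) ** 2) ** -5

k = 1.0  # comoving wavenumber in Mpc^-1 (illustrative choice)
for mass_ev in (500.0, 5000.0):
    factor = wdm_suppression(k, mass_ev)
    print(f"m = {mass_ev:g} eV: suppression at k = 1/Mpc is {factor:.3f}")
```

For 500 eV the transfer function at this scale is reduced to about 3% of its CDM value, while for 5 keV it is reduced by only about 5%. The effect on the power spectrum is the square of these factors.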
¹¹ Note that $k_{\rm fs}^{-1}$ is the comoving scale of the linear perturbation from which the structure formed.
The corresponding actual size of the structure today is smaller, because structures contract and
then stop expanding when they form, whereas in linear theory they would have been stretched
linearly with the scale factor.

Figure 4: The whole picture of structure formation theory from quantum fluctuations
during inflation to the present-day power spectrum at t0 .

References
[1] J.M. Bardeen, J.R. Bond, N. Kaiser and A.S. Szalay, The statistics of peaks of
Gaussian random fields, Astrophys. J. 304, 15 (1986).

[2] J. Richard Gott III et al., A Map of the Universe, Astrophys. J. 624, 463 (2005),
astro-ph/0310571.

[3] M. Tegmark et al., Cosmological Constraints from the SDSS Luminous Red
Galaxies, Phys. Rev. D74, 123507 (2006), arXiv:astro-ph/0608632.

[4] A.M. Green, S. Hofmann and D.J. Schwarz, MNRAS 353, L23 (2004),
arXiv:astro-ph/0309621.

[5] S.H. Hansen, J. Lesgourgues, S. Pastor and J. Silk, MNRAS 333, 544 (2002),
arXiv:astro-ph/0106108.

Figure 5: Distribution of galaxies according to the Sloan Digital Sky Survey (SDSS). This
figure shows galaxies that are within 2◦ of the equator and closer than 858 Mpc (assuming
H0 = 71 km/s/Mpc). Figure from astro-ph/0310571[2].

[Figure 6 plots. Top panel: $k^3 P(k)/2\pi^2$ against $k$ [Mpc$^{-1}$]. Bottom panel: $P(k)$ [Mpc$^3$] against $k$ [Mpc$^{-1}$].]

Figure 6: The matter power spectrum from the SDSS obtained using luminous red galaxies
[3]. The top figure shows $\mathcal{P}_\delta(k)$ and the bottom figure $P_\delta(k)$. A Hubble constant value
$H_0 = 71.4$ km/s/Mpc has been assumed for this figure. (These galaxy surveys only obtain
the scales up to the Hubble constant, and therefore the observed $P_\delta(k)$ is usually shown in
units of $h$ Mpc$^{-1}$, so that no value for $H_0$ needs to be assumed.) The black bars are the
observations and the red curve is a theoretical fit, from linear perturbation theory, to the
data. The bend in $P(k)$ at $k_{\rm eq} \sim 0.01$ Mpc$^{-1}$ is clearly visible in the bottom figure. Linear
perturbation theory fails when $\mathcal{P}(k) \gtrsim 1$, and therefore the data points do not follow the
theoretical curve to the right of the dashed line (representing an estimate on how far linear
theory can be trusted). Figure by R. Keskitalo.
12 Cosmic microwave background
The cosmic microwave background (CMB) is isotropic to a high degree. This tells
us that the early universe was very homogeneous at t = tdec , when the CMB was
formed. However, with precise measurements we can detect the small anisotropy of
the CMB, which reflects the small perturbations in the early universe.
This anisotropy was first detected by the COBE satellite in 1992, which mapped
the whole sky in three microwave frequencies. The angular resolution of COBE was
rather poor, 7◦ , so only features larger than this were detected. Measurements with
better resolution, but covering only small parts of the sky, were then performed
using instruments carried by balloons to the upper atmosphere, and ground-based
detectors located at high altitudes.
The next full-sky map of the CMB was made by the Wilkinson Microwave
Anisotropy Probe (WMAP) satellite, in orbit around the L2 point of the Sun-Earth
system, 1.5 million kilometers from the Earth in the direction opposite to the Sun.
The satellite was launched by NASA in June 2001, and the results of the first year
of measurements were published in February 2003. The WMAP satellite made eight
years of measurements, and the data from the first seven have been made public.
The Planck satellite was launched by ESA in May 2009, and the first cosmological
results were released in March 2013. The polarisation data has not yet been released;
it is expected to be made public in December 2014.

Figure 1: The cosmic microwave background according to the DMR instrument aboard the
COBE satellite.

Figures 1 to 3 show the observed variation $(\delta T/T)_{\rm obs}$ in the temperature of the
CMB on the sky (in the first two plots, yellow and red mean hotter than average,
blue means colder than average).
The photons we see as the CMB have travelled to us from where our past
light cone intersects the hypersurface where photons decouple at $t = t_{\rm dec}$. This
intersection forms a sphere that is called the last scattering surface¹. We are at the
center of this sphere, which extends away from us both in space and in time.

¹ Or the last scattering sphere. The expression “last scattering surface” often refers to the entire
$t = t_{\rm dec}$ time slice.

Figure 2: The cosmic microwave background according to WMAP 7-year results.

Figure 3: The cosmic microwave background according to Planck 1.5-year results.

Figure 4: The observed CMB temperature anisotropy gets a contribution from the last
scattering surface, $(\delta T/T)_{\rm intr} = \Theta(t_{\rm dec}, \mathbf{x}_{\rm ls}, \hat{n})$, and from along the photon’s journey to us,
$(\delta T/T)_{\rm jour}$.

The observed temperature anisotropy is due to two contributions, an intrinsic
temperature variation at the surface of last scattering and a variation in the redshift
the photons have suffered during their journey to us,
$$ \left( \frac{\delta T}{T} \right)_{\rm obs} = \left( \frac{\delta T}{T} \right)_{\rm intr} + \left( \frac{\delta T}{T} \right)_{\rm jour} . \tag{12.1} $$

See figure 4. There are two ways to define what we mean by the CMB perturbation
$\delta T$. The first way is to just take the angular average of the temperature field and
call this the mean, $\bar{T} \equiv T_0 \equiv \frac{1}{4\pi} \int d\Omega\, T$, and define the anisotropy as the difference
from the mean, $\delta T = T - T_0$. This is the physically most correct way. However, in
the context of perturbed FRW models, it can be simpler to call the temperature in
the background model the mean temperature. The perturbations also contribute to
the mean temperature, so this is a bit misleading, but common. We will also use
the notation $\delta T/T$ instead of $\delta T/\bar{T}$ or $\delta T/T_0$, as is common, but it should be understood
that the temperature in the denominator is the mean temperature. (Of course, this
would only make a difference at second order.)
The first term in (12.1), $(\delta T/T)_{\rm intr}$, represents the temperature variation of the
photon gas at $t = t_{\rm dec}$. (It also includes the Doppler effect from the motion of this
photon gas.) At that time the largest scales we see on the CMB sky were still outside
the horizon. The separation of δT /T into two components is gauge-dependent. If
the time slice t = tdec dips further into the past in some location, it finds a higher
temperature, but the photons from there also have then a longer way to go and suffer
a larger redshift, so the two effects balance each other. We can calculate in any gauge
we want, getting different results for (δT /T )intr and (δT /T )jour depending on the
gauge, but their sum (δT /T )obs is gauge-independent, because it is an observed
quantity.
One might think that (δT /T )intr should be equal to zero, since in our earlier dis-
cussion of recombination and decoupling we identified decoupling with a particular
temperature Tdec ∼ 3000 K. This kind of thinking corresponds to a particular gauge
choice where the $t = t_{\rm dec}$ time slice coincides with the $T = T_{\rm dec}$ hypersurface. In
this gauge $(\delta T/T)_{\rm intr} = 0$, except for the Doppler effect (we are not going to use this
gauge). Anyway, it is not true that all photons have their last scattering exactly
when $T = T_{\rm dec}$. Rather, the last scatterings occur over a fairly large temperature interval and
time period. The zeroth-order (background) time evolution of the temperature of
the photon distribution is the same before and after last scattering, $T \propto a^{-1}$, so it
does not matter how we draw the artificial separation line, the time slice $t = t_{\rm dec}$
separating the fluid and free particle treatment of the photons. See figure 5.

Figure 5: Depending on the gauge, the $T_{\rm dec} = $ const. surface may, or (usually) may not,
coincide with the $t = t_{\rm dec}$ time slice.

12.1 Multipole analysis


The CMB temperature anisotropy is a function on a sphere. In analogy with Fourier
expansion in three-dimensional flat space, we separate out the contributions of dif-
ferent angular scales by doing a multipole expansion,
\[
\frac{\delta T}{T_0}(\theta,\phi) = \sum_{\ell m} a_{\ell m} Y_{\ell m}(\theta,\phi) \tag{12.2}
\]
where the sum runs over ℓ = 1, 2, . . . ∞ and m = −ℓ, . . . , ℓ, giving 2ℓ + 1 values of m
for each ℓ. The functions Yℓm (θ, φ) are the spherical harmonics (see figure 6), which
form an orthonormal set of functions over the sphere, so that we can calculate the
multipole coefficients aℓm from
\[
a_{\ell m} = \int Y^*_{\ell m}(\theta,\phi)\, \frac{\delta T}{T_0}(\theta,\phi)\, d\Omega . \tag{12.3}
\]
This definition gives dimensionless aℓm . Often they are defined without the T0 =
2.725 K term in (12.2), and then they have the dimension of temperature and are
usually given in units of µK.
The coefficient aℓ0 is real but the other aℓm are complex, and aℓ,−m = a*ℓm . The
sum begins at ℓ = 1, since Y00 = const. and therefore we must have a00 = 0 for
a quantity which represents a deviation from average. The dipole part, ℓ = 1, is
dominated by the Doppler effect due to the motion of the solar system with respect
to the last scattering surface, and it is difficult to separate the cosmological dipole
caused by large scale perturbations. (This was done for the first time with Planck,
though not to great accuracy.) Therefore we are here interested only in the ℓ ≥ 2
part of the expansion.
Another notation for Yℓm (θ, φ) is Yℓm (n̂), where n̂ is a unit vector whose direction
is specified by the angles θ and φ. (The hat denotes unit vector.)

12.1.1 Spherical harmonics


We list here some useful properties of the spherical harmonics. They are orthonormal
functions on the sphere, so
\[
\int d\Omega\, Y_{\ell m}(\theta,\phi)\, Y^*_{\ell' m'}(\theta,\phi) = \delta_{\ell\ell'}\delta_{mm'} . \tag{12.4}
\]

Summing over the m corresponding to the same multipole number ℓ we have the
closure relation
\[
\sum_m |Y_{\ell m}(\theta,\phi)|^2 = \frac{2\ell+1}{4\pi} . \tag{12.5}
\]

We will also use the expansion of a plane wave in terms of spherical harmonics,
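\[
e^{i\mathbf{k}\cdot\mathbf{x}} = 4\pi \sum_{\ell m} i^\ell j_\ell(kx)\, Y_{\ell m}(\hat{\mathbf{x}})\, Y^*_{\ell m}(\hat{\mathbf{k}}) . \tag{12.6}
\]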

Here x̂ and k̂ are the unit vectors in the directions of x and k, and jℓ is the spherical
Bessel function.

12.1.2 The theoretical angular power spectrum


The CMB anisotropy is due to the primordial perturbations, and therefore it reflects
their Gaussian nature. Because we get the values of the aℓm from the other pertur-
bation quantities through linear equations (in first-order perturbation theory), the
aℓm are also (complex) Gaussian random variables. Since they represent deviation
from the average temperature, their expectation value is zero,

\[
\langle a_{\ell m} \rangle = 0 , \tag{12.7}
\]

and the quantity we want to calculate from theory is the variance ⟨|aℓm|²⟩ to get a
prediction for the typical size of the aℓm . The isotropic nature of the random process
shows up in the aℓm so that these expectation values depend only on ℓ, not m. (The
ℓ are related to the angular size of the anisotropy pattern, whereas the m are related
to “orientation” or “pattern”.) Since ⟨|aℓm|²⟩ is independent of m, we can define
\[
C_\ell \equiv \langle |a_{\ell m}|^2 \rangle = \frac{1}{2\ell+1} \sum_m \langle |a_{\ell m}|^2 \rangle . \tag{12.8}
\]

The aℓm are independent random variables, so

\[
\langle a_{\ell m} a^*_{\ell' m'} \rangle = \delta_{\ell\ell'}\delta_{mm'} C_\ell . \tag{12.9}
\]

This function Cℓ (of integers ℓ ≥ 1) is called the (theoretical) angular power spec-
trum. It is analogous to the power spectrum P(k) of density perturbations. For
Gaussian perturbations, Cℓ contains all the statistical information about the CMB
temperature anisotropy. This is all we can predict from theory. Thus analysis of
the CMB anisotropy consists of calculating the angular power spectrum from the
observed CMB and comparing it to the Cℓ predicted by theory (footnote 2).

Footnote 2: In addition to the temperature anisotropy, the CMB also has another property, its polarisation.
There are two additional power spectra related to the polarisation, CℓEE and CℓBB , and one related
to the correlation between temperature and polarisation, CℓTE . The spectra CℓEE and CℓTE have
been measured, while there is thus far no detection of a non-zero CℓBB , only an upper bound. A
detection would indicate the presence of primordial gravitational waves. In the simplest inflationary
models, such as the m²ϕ² model, the amplitude of the gravitational waves produced during inflation
is large enough that it should be seen by Planck. In many other models, the amplitude is too small
to be detected by CMB experiments in the near future.

Figure 6: The three lowest multipoles ℓ = 1, 2, 3 of spherical harmonics. Left column: Y10 ,
Re Y11 , Im Y11 . Middle column: Y20 , Re Y21 , Im Y21 , Re Y22 , Im Y22 . Right column: Y30 ,
Re Y31 , Im Y31 , Re Y32 , Im Y32 , Re Y33 , Im Y33 . Figure by Ville Heikkilä.

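Just like the three-dimensional density power spectrum P(k) gives the contribution
of scale k to the density variance ⟨δ(x)²⟩, the angular power spectrum Cℓ is
related to the contribution of multipole ℓ to the temperature variance,
\[
\begin{aligned}
\left\langle \left( \frac{\delta T(\theta,\phi)}{T} \right)^2 \right\rangle
&= \left\langle \sum_{\ell m} a_{\ell m} Y_{\ell m}(\theta,\phi) \sum_{\ell' m'} a^*_{\ell' m'} Y^*_{\ell' m'}(\theta,\phi) \right\rangle \\
&= \sum_{\ell\ell'} \sum_{mm'} Y_{\ell m}(\theta,\phi)\, Y^*_{\ell' m'}(\theta,\phi)\, \langle a_{\ell m} a^*_{\ell' m'} \rangle \\
&= \sum_\ell C_\ell \sum_m |Y_{\ell m}(\theta,\phi)|^2 = \sum_\ell \frac{2\ell+1}{4\pi}\, C_\ell ,
\end{aligned}
\tag{12.10}
\]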

where we used (12.9) and the closure relation (12.5).


Thus, if we plot (2ℓ + 1)Cℓ /4π on a linear ℓ scale, or ℓ(2ℓ + 1)Cℓ /4π on a log-
arithmic ℓ scale, the area under the curve gives the temperature variance, i.e. the
expectation value for the squared deviation from the average temperature. It has
become customary to plot the angular power spectrum as ℓ(ℓ + 1)Cℓ /2π, which is
neither of these, but for large ℓ approximates the second case. The reason for this
custom is explained later.
Equation (12.10) represents the expectation value from theory and thus it is the
same for all directions θ,φ. The actual, “realised”, value of course varies from one
direction θ,φ to another. We can imagine an ensemble of universes, each representing
a different realisation of the same random process that produces the primordial
perturbations. Then ⟨ ⟩ represents the average over such an ensemble.

12.1.3 Observed angular power spectrum


Theory predicts expectation values ⟨|aℓm|²⟩ from the random process responsible for
the CMB anisotropy, but we can observe only one realisation of this random process,
the set {aℓm } of our CMB sky. We define the observed angular power spectrum as
the average
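\[
\widehat{C}_\ell \equiv \frac{1}{2\ell+1} \sum_m |a_{\ell m}|^2 \tag{12.11}
\]
of these observed values.
The variance of the observed temperature anisotropy is the average of (δT(θ,φ)/T)²
over the celestial sphere,
\[
\begin{aligned}
\frac{1}{4\pi} \int \left( \frac{\delta T(\theta,\phi)}{T} \right)^2 d\Omega
&= \frac{1}{4\pi} \int d\Omega \sum_{\ell m} a_{\ell m} Y_{\ell m}(\theta,\phi) \sum_{\ell' m'} a^*_{\ell' m'} Y^*_{\ell' m'}(\theta,\phi) \\
&= \frac{1}{4\pi} \sum_{\ell m} \sum_{\ell' m'} a_{\ell m} a^*_{\ell' m'} \underbrace{\int Y_{\ell m}(\theta,\phi)\, Y^*_{\ell' m'}(\theta,\phi)\, d\Omega}_{\delta_{\ell\ell'}\delta_{mm'}} \\
&= \frac{1}{4\pi} \sum_\ell \underbrace{\sum_m |a_{\ell m}|^2}_{(2\ell+1)\widehat{C}_\ell} \\
&= \sum_\ell \frac{2\ell+1}{4\pi}\, \widehat{C}_\ell .
\end{aligned}
\tag{12.12}
\]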


Figure 7: The observed angular power spectrum Ĉℓ according to the Planck satellite.
The observational results are the data points, with error bars representative of the cosmic
variance. The solid curve is the theoretical Cℓ from the best-fit ΛCDM model, and the gray
band around it represents the cosmic variance corresponding to this Cℓ .

Contrast this with (12.10), which gives the variance of δT /T at an arbitrary location
on the sky over different realisations of the random process which produced the
primordial perturbations; whereas equation (12.12) gives the variance of δT /T of
our given sky over the celestial sphere.

12.1.4 Cosmic variance


The expectation value of the observed spectrum Ĉℓ is equal to Cℓ , the theoretical
spectrum (12.8), i.e.
\[
\langle \widehat{C}_\ell \rangle = C_\ell \quad\Rightarrow\quad \langle \widehat{C}_\ell - C_\ell \rangle = 0 , \tag{12.13}
\]
but its actual, realised, value is not, although we expect it to be close. The expected
squared difference between Ĉℓ and Cℓ is called the cosmic variance. We can calculate
it using the properties of (complex) Gaussian random variables (exercise). The
answer is
\[
\langle (\widehat{C}_\ell - C_\ell)^2 \rangle = \frac{2}{2\ell+1}\, C_\ell^2 . \tag{12.14}
\]
We see that the expected relative difference between Ĉℓ and Cℓ is smaller for
higher ℓ. This is because we have a larger (size 2ℓ + 1) statistical sample of aℓm
available for calculating the Ĉℓ .
The cosmic variance limits the accuracy of comparison of CMB observations with
theory, especially for large scales (low ℓ).
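
The result (12.14) can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original notes: it assumes a real temperature field (so that the m < 0 coefficients are fixed by the m > 0 ones), draws Gaussian aℓm with variance Cℓ, and compares the scatter of the resulting Ĉℓ with 2Cℓ²/(2ℓ + 1). The function names, the number of simulated skies and the seed are illustrative choices.

```python
import random


def sample_c_hat(ell, c_ell, rng):
    """Draw one realisation of the observed spectrum (12.11) for multipole ell."""
    # a_{l0} is real with variance C_l.
    total = rng.gauss(0.0, c_ell ** 0.5) ** 2
    # For m > 0, a_{lm} is complex with <|a_lm|^2> = C_l, so the real and
    # imaginary parts each have variance C_l / 2. The m < 0 coefficients are
    # complex conjugates and contribute the same |a_lm|^2 again.
    sigma = (c_ell / 2.0) ** 0.5
    for _ in range(ell):
        re = rng.gauss(0.0, sigma)
        im = rng.gauss(0.0, sigma)
        total += 2.0 * (re * re + im * im)
    return total / (2 * ell + 1)


def cosmic_variance_check(ell, c_ell=1.0, n_skies=10000, seed=1):
    """Return the sample mean of C_hat and the mean of (C_hat - C_l)^2."""
    rng = random.Random(seed)
    samples = [sample_c_hat(ell, c_ell, rng) for _ in range(n_skies)]
    mean = sum(samples) / n_skies
    var = sum((s - c_ell) ** 2 for s in samples) / n_skies
    return mean, var


for ell in (2, 10, 30):
    mean, var = cosmic_variance_check(ell)
    predicted = 2.0 / (2 * ell + 1)  # eq. (12.14) with C_l = 1
    print(f"ell={ell}: <C_hat>={mean:.3f}, variance={var:.4f}, predicted={predicted:.4f}")
```

With Cℓ = 1 the simulated variance should approach 2/(2ℓ + 1), i.e. 0.4 for the quadrupole, which illustrates why the lowest multipoles are so poorly constrained.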

12.2 Multipoles and scales


12.2.1 Rough correspondence
The different multipole numbers ℓ correspond to different angular scales, low ℓ to
large scales and high ℓ to small scales. Examination of the functions Yℓm (θ, φ) reveals
that they have an oscillatory pattern on the sphere, so that there are typically ℓ
“wavelengths” of oscillation around a full great circle of the sphere. See figure 8.
Thus the angle corresponding to this wavelength is


\[
\theta_\lambda = \frac{2\pi}{\ell} = \frac{360^\circ}{\ell} . \tag{12.15}
\]
See figure 9. The angle corresponding to a “half-wavelength”, i.e. the separation
between a neighbouring minimum and maximum is then
\[
\theta_{\rm res} = \frac{\pi}{\ell} = \frac{180^\circ}{\ell} . \tag{12.16}
\]
This is the angular resolution required of the microwave detector for it to be able to
resolve the angular power spectrum up to this ℓ.
For example, COBE had an angular resolution of 7°, allowing a measurement up
to ℓ = 180/7 ≈ 26, WMAP had resolution 0.23°, reaching to ℓ = 180/0.23 ≈ 783,
and the European Planck satellite has resolution 5′, which allows Cℓ to be measured up
to ℓ = 2160 (footnote 3).
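
As a quick arithmetic check of these numbers, the relation (12.16) can be evaluated directly. This is only an illustrative sketch; the function name is ours, and the resolutions are the ones quoted in the text.

```python
def ell_max(resolution_deg):
    """Highest multipole resolved at a given angular resolution, from (12.16)."""
    return 180.0 / resolution_deg


for name, res_deg in [("COBE", 7.0), ("WMAP", 0.23), ("Planck", 5.0 / 60.0)]:
    print(f"{name}: resolution {res_deg:.3f} deg, ell up to about {ell_max(res_deg):.0f}")
```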
The angles on the sky are related to actual physical or comoving distances via the
angular diameter distance dA , defined as the ratio of the physical length (transverse
to the line of sight) and the angle it covers, as discussed in chapter 3,

\[
d_A \equiv \frac{\lambda_{\rm phys}}{\theta} . \tag{12.17}
\]
Likewise, we defined the comoving angular diameter distance dcA by

\[
d_A^c \equiv \frac{\lambda^c}{\theta} \tag{12.18}
\]
where λc = (1/a)λphys = (1 + z)λphys is the corresponding comoving length. Thus
dcA = (1/a)dA = (1 + z)dA . See figure 10.
Consider now the Fourier modes of our earlier perturbation theory discussion.
A mode with comoving wavenumber k has comoving wavelength λc = 2π/k. Thus
this mode should show up as a pattern on the CMB sky with angular size
\[
\theta_\lambda = \frac{\lambda^c}{d_A^c} = \frac{2\pi}{k d_A^c} = \frac{2\pi}{\ell} . \tag{12.19}
\]

For the last equality we used the relation (12.15). From it we get that the modes
with wavenumber k contribute mostly to multipoles around

ℓ = kdcA . (12.20)

12.2.2 Exact treatment


The above matching of wavenumbers with multipoles is rather naive, for two reasons:

1. The description of a spherical harmonic Yℓm having an “angular wavelength”
of 2π/ℓ is just a crude characterisation. See figure 8.

2. The modes k are not wrapped around the sphere of last scattering, but the
wave vector forms a different angle with the sphere at different places.

Footnote 3: In reality, there is no sharp cut-off at a particular ℓ; the observational error bars just blow up.

Figure 8: Randomly generated skies containing only a single multipole ℓ. Starting from top
left: ℓ = 1 (dipole only), 2 (quadrupole only), 3 (octopole only), 4, 5, 6, 7, 8, 9, 10, 11, 12.
Figure by Ville Heikkilä.

Figure 9: The rough correspondence between multipoles ℓ and angles.



Figure 10: The comoving angular diameter distance relates the comoving size of an object
and the angle in which we see it.

The following precise discussion applies only for the case of a flat universe (K = 0
Friedmann model as the background), where one can Fourier expand functions on
a time slice. We start from the expansion of the plane wave in terms of spherical
harmonics, for which we have the result (12.6),
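\[
e^{i\mathbf{k}\cdot\mathbf{x}} = 4\pi \sum_{\ell m} i^\ell j_\ell(kx)\, Y_{\ell m}(\hat{\mathbf{x}})\, Y^*_{\ell m}(\hat{\mathbf{k}}) , \tag{12.21}
\]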

where jℓ is the spherical Bessel function.


Consider now some function
\[
f(\mathbf{x}) = \sum_{\mathbf{k}} f_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{x}} \tag{12.22}
\]

on the t = tdec time slice. We want the multipole expansion of the values of this
function on the last scattering sphere. See figure 11. These are the values f (xx̂),
where x ≡ |x| has a constant value, the (comoving) radius of this sphere. Thus
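\[
\begin{aligned}
a_{\ell m} &= \int d\Omega_x\, Y^*_{\ell m}(\hat{\mathbf{x}})\, f(x\hat{\mathbf{x}}) \\
&= \sum_{\mathbf{k}} \int d\Omega_x\, Y^*_{\ell m}(\hat{\mathbf{x}})\, f_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{x}} \\
&= 4\pi \sum_{\mathbf{k}} \sum_{\ell' m'} \int d\Omega_x\, f_{\mathbf{k}}\, Y^*_{\ell m}(\hat{\mathbf{x}})\, i^{\ell'} j_{\ell'}(kx)\, Y_{\ell' m'}(\hat{\mathbf{x}})\, Y^*_{\ell' m'}(\hat{\mathbf{k}}) \\
&= 4\pi i^\ell \sum_{\mathbf{k}} f_{\mathbf{k}}\, j_\ell(kx)\, Y^*_{\ell m}(\hat{\mathbf{k}}) ,
\end{aligned}
\tag{12.23}
\]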

where we used the orthonormality of the spherical harmonics. The corresponding
result for a Fourier transform f (k) is
\[
a_{\ell m} = \frac{4\pi i^\ell}{(2\pi)^{3/2}} \int d^3k\, f(\mathbf{k})\, j_\ell(kx)\, Y^*_{\ell m}(\hat{\mathbf{k}}) . \tag{12.24}
\]
The jℓ are oscillating functions with decreasing amplitude. For large values of ℓ
the position of the first (and largest) maximum is near kx = ℓ (see figure 12). Thus
the aℓm pick a large contribution from the Fourier modes k for which

kx ∼ ℓ . (12.25)

Figure 11: A plane wave intersecting the last scattering sphere.


Figure 12: Spherical Bessel functions jℓ (x) for ℓ = 2, 3, 4, 200, 201, and 202. Note how
the first and largest peak is near x = ℓ (but to be precise, at a slightly larger value). Figure
by R. Keskitalo.

In a flat universe the comoving distance x (from our location to the sphere of last
scattering) and the comoving angular diameter distance dcA are equal, so we can
write this result as
kdcA ∼ ℓ . (12.26)
The conclusion is that a given multipole ℓ acquires a contribution from modes
with a range of wavenumbers, but most of the contribution comes from near the value
given by (12.20). This concentration is tighter for larger ℓ. We will use equation
(12.20) for qualitative purposes.
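
The statement that the first and largest peak of jℓ(x) lies slightly above x = ℓ can be illustrated with a short numerical sketch. It is not part of the original notes: the helper names are ours, and the upward recurrence used here is only reliable when x is not much smaller than ℓ, which is enough for locating the first peak of low multipoles.

```python
import math


def j_ell(ell, x):
    """Spherical Bessel function j_ell(x) by upward recurrence (x > 0)."""
    j_prev = math.sin(x) / x                        # j_0
    if ell == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x   # j_1
    for n in range(1, ell):
        # j_{n+1}(x) = (2n + 1)/x * j_n(x) - j_{n-1}(x)
        j_prev, j_curr = j_curr, (2 * n + 1) / x * j_curr - j_prev
    return j_curr


def first_peak(ell, x_min=1.0, x_max=30.0, step=0.01):
    """Grid search for the position of the largest maximum of j_ell."""
    n_points = int(round((x_max - x_min) / step)) + 1
    grid = [x_min + i * step for i in range(n_points)]
    return max(grid, key=lambda x: j_ell(ell, x))


for ell in (2, 3, 4, 10):
    print(f"ell={ell}: largest peak of j_ell near x={first_peak(ell):.2f}")
```

For higher multipoles such as ℓ = 200 a stable method (for example downward recurrence or a library routine) would be needed instead of this simple recurrence.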

12.3 Important distance scales on the last scattering surface


12.3.1 Angular diameter distance to the last scattering surface
In chapter 3 we derived the comoving angular diameter distance to redshift z in a
FRW model,
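\[
\begin{aligned}
d_A^c(z) &= \frac{1}{\sqrt{-K}} \sinh\left( \sqrt{-K} \int_0^z \frac{dz'}{H(z')} \right) \\
&= \frac{H_0^{-1}}{\sqrt{1-\Omega_0}} \sinh\left( \sqrt{1-\Omega_0} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{\Omega_0 (a - a^2) - \Omega_{\Lambda 0} (a - a^4) + a^2}} \right) ,
\end{aligned}
\tag{12.27}
\]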

where the second line holds for an FRW model that contains only matter and vacuum
energy (Ω0 = Ωm0 +ΩΛ0 ). In the real universe, the contribution of radiation is small,
since the radiation-dominated era ends early, when the universe is around 50 000
years old. Recall that Ω0 − 1 = −ΩK0 = K/H0². We are interested in the distance
to the last scattering sphere, i.e. dcA (zdec ), where 1 + zdec ≈ 1090.
In the simplest case of the spatially flat matter-dominated universe, ΩΛ0 = 0,
Ωm0 = 1, the integral gives
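\[
d_A^c(z_{\rm dec}) = H_0^{-1} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{a}} = 2 H_0^{-1} \left( 1 - \frac{1}{\sqrt{1+z_{\rm dec}}} \right) = 1.94 H_0^{-1} \approx 2 H_0^{-1} , \tag{12.28}
\]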

where the last approximation corresponds to ignoring the contribution from the
lower limit.
We also consider two more general situations, of which the above is a special
case.
a) Open universe with no dark energy, ΩΛ0 = 0 and Ωm0 = Ω0 < 1. Now we have
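\[
\begin{aligned}
d_A^c(z_{\rm dec}) &= \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}} \sinh\left( \sqrt{1-\Omega_{m0}} \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{(1-\Omega_{m0}) a^2 + \Omega_{m0} a}} \right) \\
&= \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}} \sinh\left( \int_{\frac{1}{1+z}}^{1} \frac{da}{\sqrt{a^2 + \frac{\Omega_{m0}}{1-\Omega_{m0}} a}} \right) \\
&= \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}} \sinh\left( 2\, {\rm arsinh} \sqrt{\frac{1-\Omega_{m0}}{\Omega_{m0}}} - 2\, {\rm arsinh} \sqrt{\frac{1-\Omega_{m0}}{\Omega_{m0}} \frac{1}{1+z_{\rm dec}}} \right) \\
&\approx \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}} \sinh\left( 2\, {\rm arsinh} \sqrt{\frac{1-\Omega_{m0}}{\Omega_{m0}}} \right) \\
&= 2\, \frac{H_0^{-1}}{\Omega_{m0}} ,
\end{aligned}
\tag{12.29}
\]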


Figure 13: The comoving proper distance dcP (z = ∞) (dashed) and the comoving angular
diameter distance dcA (z = ∞) (solid) to the horizon in matter-only open universe. The
vertical axis is the distance in units of Hubble distance H0−1 and the horizontal axis is the
density parameter Ω0 = Ωm0 . The distances to last scattering, dcP (zdec ) and dcA (zdec ), are a
few per cent less.

where again the approximation ignores the contribution from the lower limit
(i.e., it actually gives the comoving angular diameter distance to the horizon,
dcA (z = ∞)). In the last step we used sinh 2x = 2 sinh x cosh x =
2 sinh x √(1 + sinh² x). We show this result (together with the comoving proper
distance dcP (z = ∞)) in figure 13.

b) Spatially flat universe with vacuum energy, ΩΛ + Ωm = 1. Here the integral
does not give an elementary function, but a reasonable approximation, which
we use in the following, is
\[
d_A^c(z_{\rm dec}) \approx \frac{2}{\Omega_m^{0.4}}\, H_0^{-1} . \tag{12.30}
\]

The distance dcA (zdec ) depends on the expansion history of the universe. For one,
the longer it takes for the universe to cool from Tdec to T0 (i.e., to expand by the
factor 1 + zdec ), the longer distance the photons have time to travel. For spatially
curved universes the angular diameter distance gets an additional effect from the
geometry of the universe, which acts like a “lens” and makes the distant CMB pattern
at the last scattering sphere look smaller or larger (see figure 14).
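
The approximations (12.28) to (12.30) can be compared with a direct numerical evaluation of the distance integral for a universe containing only matter and vacuum energy. The sketch below is an illustration under those assumptions (radiation neglected, as in the text); the function name, the substitution a = u² and the number of Simpson steps are our choices, not something taken from the notes.

```python
import math


def comoving_angular_diameter_distance(z, omega_m, omega_lambda, n=2000):
    """d_A^c(z) in units of 1/H0 for a matter + vacuum energy FRW model.

    With a = u**2 the integrand da / sqrt(Om*a + Ok*a**2 + OL*a**4)
    becomes 2 du / sqrt(Om + Ok*u**2 + OL*u**6), which is smooth near a = 0.
    """
    omega_k = 1.0 - omega_m - omega_lambda   # equals -K / H0^2
    u_min = (1.0 + z) ** -0.5

    def integrand(u):
        return 2.0 / math.sqrt(omega_m + omega_k * u**2 + omega_lambda * u**6)

    # Composite Simpson rule on [u_min, 1]; n must be even.
    h = (1.0 - u_min) / n
    total = integrand(u_min) + integrand(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(u_min + i * h)
    chi = total * h / 3.0                    # comoving distance in units of 1/H0

    if abs(omega_k) < 1e-12:                 # spatially flat
        return chi
    if omega_k > 0:                          # open universe, K < 0
        root = math.sqrt(omega_k)
        return math.sinh(root * chi) / root
    root = math.sqrt(-omega_k)               # closed universe, K > 0
    return math.sin(root * chi) / root


z_dec = 1089.0
for label, om, ol, approx in [
    ("flat, matter only", 1.0, 0.0, 2.0),
    ("open, Omega_m0 = 0.3", 0.3, 0.0, 2.0 / 0.3),
    ("flat LCDM, Omega_m0 = 0.3", 0.3, 0.7, 2.0 / 0.3**0.4),
]:
    exact = comoving_angular_diameter_distance(z_dec, om, ol)
    print(f"{label}: numerical {exact:.3f}, approximation {approx:.3f} (units of 1/H0)")
```

The open-universe approximation 2/Ωm0 is the distance to the horizon, so the numerical distance to last scattering comes out somewhat smaller.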

12.3.2 Decoupling scale and the matter-radiation equality scale


Figure 14: The geometry effect in a closed (top) or an open (bottom) universe affects the
angle at which we see a structure of given size at the last scattering surface, and thus its
angular diameter distance.

Subhorizon (k ≫ aH) and superhorizon (k ≪ aH) scales behave differently. Thus
we want to know which of the structures we see on the last scattering surface are
subhorizon and which are superhorizon (at the time of last scattering). For that we
need to know the comoving Hubble scale aH at tdec .
We make the approximation that neutrinos are massless. The physical radiation
density today is then ωr ≡ Ωr0 h2 ≈ 4.18 × 10−5 , the photon contribution being
ωγ ≈ 2.47×10−5 . We also make the approximation that the universe was completely
matter-dominated at tdec , i.e. we ignore the radiation contribution to the Friedmann
equation at tdec . This is not a terribly good approximation, since

\[
\frac{\rho_m(t_{\rm dec})}{\rho_r(t_{\rm dec})} = \frac{\omega_m}{(1+z_{\rm dec})\,\omega_r} \approx 22\,\omega_m \approx 2.6 \ldots 3.5 , \tag{12.31}
\]

for ωm = 0.12 . . . 0.16. The curvature and (for most dark energy models, including
vacuum energy) dark energy contributions are negligible at tdec . Thus we have

\[
H_{\rm dec}^2 \approx \frac{8\pi G}{3}\rho_m = \Omega_{m0} H_0^2 (1+z_{\rm dec})^3 , \tag{12.32}
\]
and we get for the comoving Hubble scale

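\[
k_{\rm dec}^{-1} \equiv (a_{\rm dec} H_{\rm dec})^{-1} = (1+z_{\rm dec}) H_{\rm dec}^{-1} = (1+z_{\rm dec})^{-1/2} \frac{H_0^{-1}}{\sqrt{\Omega_{m0}}} = \frac{1}{\sqrt{\Omega_{m0}}}\, 91\, h^{-1}\, {\rm Mpc} , \tag{12.33}
\]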
using 1 + zdec = 1090. The scale which is entering at t = tdec is thus

\[
k_{\rm dec} = a_{\rm dec} H_{\rm dec} = (1+z_{\rm dec})^{1/2} \sqrt{\Omega_{m0}}\, H_0 , \tag{12.34}
\]

and the corresponding multipole number on the last scattering sphere is


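\[
\ell_H \equiv k_{\rm dec}\, d_A^c = (1+z_{\rm dec})^{1/2} \sqrt{\Omega_{m0}} \times
\begin{cases}
2/\Omega_{m0} = 66.0\, \Omega_{m0}^{-0.5} & (\Omega_\Lambda = 0) \\
2/\Omega_{m0}^{0.4} = 66.0\, \Omega_{m0}^{0.1} & (\Omega_0 = 1)
\end{cases}
\tag{12.35}
\]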
The angle subtended by a half-wavelength π/k of this mode on the last scattering
sphere is
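\[
\theta_H \equiv \frac{\pi}{\ell_H} = \frac{180^\circ}{\ell_H} =
\begin{cases}
2.7^\circ\, \Omega_{m0}^{0.5} & (\Omega_\Lambda = 0) \\
2.7^\circ\, \Omega_{m0}^{-0.1} & (\Omega_0 = 1) .
\end{cases}
\tag{12.36}
\]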
For the open model with Ωm0 = 0.3, we get 1.5◦ , and for the spatially flat ΛCDM
model with Ωm0 = 0.3, we get ∼ 3◦ .
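
The numbers quoted in (12.33), (12.35) and (12.36) follow from simple arithmetic, sketched below. The function and constant names are illustrative; the only input not stated explicitly in the text is the Hubble distance c/H0 ≈ 2997.9 h⁻¹ Mpc, and the distances use the approximations (12.29) and (12.30).

```python
import math

HUBBLE_DISTANCE = 2997.9   # c/H0 in units of h^-1 Mpc
ONE_PLUS_Z_DEC = 1090.0    # value used in the text


def hubble_scale_at_decoupling(omega_m0):
    """Comoving Hubble length 1/k_dec in h^-1 Mpc, eq. (12.33)."""
    return HUBBLE_DISTANCE / math.sqrt(ONE_PLUS_Z_DEC * omega_m0)


def ell_horizon(omega_m0, flat):
    """Multipole l_H of the scale entering the horizon at t_dec, eq. (12.35)."""
    # Approximate d_A^c in units of 1/H0: (12.30) if flat, (12.29) if open.
    d_a = 2.0 / omega_m0**0.4 if flat else 2.0 / omega_m0
    return math.sqrt(ONE_PLUS_Z_DEC * omega_m0) * d_a


def theta_horizon_deg(omega_m0, flat):
    """Angle of a half-wavelength of the k_dec mode in degrees, eq. (12.36)."""
    return 180.0 / ell_horizon(omega_m0, flat)


print(f"1/k_dec for Omega_m0 = 1: {hubble_scale_at_decoupling(1.0):.1f} h^-1 Mpc")
for flat in (False, True):
    model = "flat LCDM" if flat else "open"
    print(f"{model}, Omega_m0 = 0.3: l_H = {ell_horizon(0.3, flat):.0f}, "
          f"theta_H = {theta_horizon_deg(0.3, flat):.2f} deg")
```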
Another important scale is keq , the scale which enters at the time of matter-
radiation equality teq , since the transfer function T (k) is bent at that point. Pertur-
bations for scales k ≪ keq essentially maintain their primordial spectrum, whereas
scales k ≫ keq have lost relative power between their horizon entry and teq . With a
calculation similar to kdec (taking into account that ρtot (teq ) = 2ρm (teq )), we get
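\[
k_{\rm eq}^{-1} = (a_{\rm eq} H_{\rm eq})^{-1} \approx 14\, \omega_m^{-1}\, {\rm Mpc} = 4.7 \times 10^{-3}\, \Omega_{m0}^{-1} h^{-1} H_0^{-1} . \tag{12.37}
\]

For ωm = 0.14 we have 1/keq = 100 Mpc. The corresponding multipole number is

\[
\ell_{\rm eq} = k_{\rm eq}\, d_A^c = 214\, \Omega_{m0} h \times
\begin{cases}
2/\Omega_{m0} = 430\, h & (\Omega_\Lambda = 0) \\
2/\Omega_{m0}^{0.4} \approx 430\, h\, \Omega_{m0}^{0.6} & (\Omega_0 = 1) .
\end{cases}
\tag{12.38}
\]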
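
For orientation, (12.37) and (12.38) can be evaluated for concrete parameter values. This is only a sketch: the parameter choices Ωm0 = 0.3 and h = 0.7 are illustrative assumptions, and the function names are ours.

```python
def k_eq_inverse_mpc(omega_m_h2):
    """Comoving Hubble length at matter-radiation equality in Mpc, eq. (12.37)."""
    return 14.0 / omega_m_h2


def ell_eq(omega_m0, h, flat):
    """Multipole corresponding to k_eq, eq. (12.38)."""
    # Approximate d_A^c in units of 1/H0: (12.30) if flat, (12.29) if open.
    d_a = 2.0 / omega_m0**0.4 if flat else 2.0 / omega_m0
    return 214.0 * omega_m0 * h * d_a


print(f"1/k_eq for omega_m = 0.14: {k_eq_inverse_mpc(0.14):.0f} Mpc")
print(f"l_eq, open, Omega_m0 = 0.3, h = 0.7: {ell_eq(0.3, 0.7, flat=False):.0f}")
print(f"l_eq, flat, Omega_m0 = 0.3, h = 0.7: {ell_eq(0.3, 0.7, flat=True):.0f}")
```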

12.4 CMB anisotropy from perturbation theory


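We began this chapter with the observation (12.1), that the CMB temperature
anisotropy is a sum of two parts,
\[
\left( \frac{\delta T}{T} \right)_{\rm obs} = \left( \frac{\delta T}{T} \right)_{\rm intr} + \left( \frac{\delta T}{T} \right)_{\rm jour} , \tag{12.39}
\]

and that this separation is gauge dependent. We shall consider this in the longitudinal
gauge, since the second part, (δT/T)jour, the integrated redshift perturbation
along the line of sight, is easiest to calculate in this gauge. The calculation requires
more general relativity tools than we have available, so we just give the result:
\[
\begin{aligned}
\left( \frac{\delta T}{T} \right)_{\rm jour}
&= -\int_{\rm dec}^{o} d\Phi + \mathbf{v}_{\rm obs} \cdot \hat{\mathbf{n}} + \int_{\rm dec}^{o} dt \left( \dot\Phi + \dot\Psi - \frac{1}{2} \dot{h}_{ij} \hat{n}^i \hat{n}^j \right) \\
&= \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) - \Phi(t_0, \mathbf{0}) + \mathbf{v}_{\rm obs} \cdot \hat{\mathbf{n}} + \int_{\rm dec}^{o} dt \left( \dot\Phi + \dot\Psi - \frac{1}{2} \dot{h}_{ij} \hat{n}^i \hat{n}^j \right) \\
&\overset{\Psi \approx \Phi}{=} \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) - \Phi(t_0, \mathbf{0}) + \mathbf{v}_{\rm obs} \cdot \hat{\mathbf{n}} + 2 \int_{\rm dec}^{o} dt\, \dot\Phi - \frac{1}{2} \hat{n}^i \hat{n}^j \int_{\rm dec}^{o} dt\, \dot{h}_{ij} ,
\end{aligned}
\tag{12.40}
\]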

where the integral is from (tdec , xls ) to (t0 , 0) along the path of the photon (a null
geodesic) and n̂ is a unit vector pointing in the direction the observer is looking
at. The observer’s location has been chosen as the origin 0. The term vobs · n̂
is the Doppler effect from the observer’s motion (which is assumed nonrelativistic,
|vobs | ≪ 1), where vobs is the observer’s velocity. The subscript ls in xls indicates that
x lies somewhere on the last scattering sphere. In the matter-dominated universe
the Newtonian potential remains constant in time, Φ̇ = 0 (footnote 4), so we get a contribution
from the integral only from epochs when the contributions of radiation, dark energy
or spatial curvature to the total energy density cannot be ignored.
We can understand the above result as follows. If the potential is constant in
time, the blueshift the photon acquires when falling into a potential well is canceled
by the redshift from climbing up the well. Thus the net redshift/blueshift caused
by gravitational potential perturbations is just the difference between the values of
Φ at the beginning and in the end. However, if the potential is changing while the
photon is traversing the well, this cancellation is not exact, and we get the integral
term to account for this effect.
The value of the potential perturbation at the observing site, Φ(t0 , 0) is the same
for photons coming from all directions. Thus it does not contribute to the observed
anisotropy. It just produces an overall shift in the observed average temperature.
(Recall the discussion of the two ways of defining the mean temperature at the
beginning of the chapter.) This is included in the observed value T0 = 2.725 K, and
there is no way for us to separate it from the unperturbed value. Thus we will ignore
the monopole. The observer motion vobs causes a dipole (ℓ = 1) pattern in the CMB
anisotropy, from which it is difficult to disentangle the cosmological dipole on the
last scattering sphere. Therefore the dipole is usually removed from the CMB map
before analysing it for cosmological purposes. Accordingly, we ignore this term also.
We will also not consider the effect of gravitational waves. Our final result for the
journey part is therefore
\[
\left( \frac{\delta T}{T} \right)_{\rm jour} = \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2 \int_{\rm dec}^{o} \dot\Phi\, dt . \tag{12.41}
\]

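The other part, (δT/T)intr, comes from the local temperature perturbation at t =
tdec and the Doppler effect, −v · n̂, from the local (baryon+photon) fluid motion at
that time. Since
\[
\rho_\gamma = \frac{\pi^2}{15} T^4 , \tag{12.42}
\]
the local temperature perturbation is directly related to the relative perturbation in
the photon energy density,
\[
\left( \frac{\delta T}{T} \right)_{\rm intr} = \frac{1}{4} \delta_\gamma - \mathbf{v} \cdot \hat{\mathbf{n}} . \tag{12.43}
\]

We can now write the observed temperature anisotropy as
\[
\left( \frac{\delta T}{T} \right)_{\rm obs} = \frac{1}{4} \delta_\gamma - \mathbf{v} \cdot \hat{\mathbf{n}} + \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2 \int_{\rm dec}^{o} \dot\Phi\, dt . \tag{12.44}
\]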

Both the density perturbation δγ and the fluid velocity v are gauge dependent; we
use the longitudinal gauge only.
Footnote 4: In linear perturbation theory. In second and higher order perturbation theory we have Φ̇ ≠ 0
even in a spatially flat matter-dominated universe.

To make further progress we now

1. consider only adiabatic primordial perturbations and

2. make the (crude) approximation that the universe is already matter dominated
at t = tdec .
For adiabatic perturbations we have
\[
\delta_b = \delta_c \equiv \delta_m = \frac{3}{4} \delta_\gamma . \tag{12.45}
\]
The perturbations stay adiabatic only on superhorizon scales. Once the perturbation
has entered the horizon, different physics begins to act on different matter
components, so the adiabatic relation between their density perturbations is bro-
ken. In particular, the baryon-photon perturbation is affected by photon pressure,
which damps its growth and causes it to oscillate, whereas the CDM perturbation
is unaffected and keeps growing. Since the baryon and photon components see the
same pressure, they evolve together and maintain their adiabatic relation until pho-
ton decoupling. Thus, after horizon entry but before decoupling we have,
\[
\delta_c \neq \delta_b = \frac{3}{4} \delta_\gamma . \tag{12.46}
\]
At decoupling, the equality holds for scales larger than the photon mean free path
at tdec .
After decoupling, this connection between the photons and baryons is broken,
and the baryon density perturbation begins to approach the CDM density pertur-
bation,
\[
\delta_c \leftarrow \delta_b \neq \frac{3}{4} \delta_\gamma . \tag{12.47}
\]
We shall return to these issues when we discuss the shorter scales in sections 12.6
and 12.7. But let us first consider the scales which are still superhorizon at tdec , so
that (12.45) applies.

12.5 Large scales: Sachs–Wolfe part of the spectrum


Consider now the scales k ≪ kdec , or ℓ ≪ ℓH , which are still superhorizon at
decoupling. According to the adiabatic condition (12.45) we have
\[
\frac{1}{4} \delta_\gamma = \frac{1}{3} \delta_m \approx \frac{1}{3} \delta , \tag{12.48}
\]
where the latter (approximate) equality comes from taking the universe to be matter
dominated at tdec , so that we can identify δ ≈ δm . For these scales the Doppler effect
from fluid motion is subdominant, and we can ignore it. This can be seen from (9.19):
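Fourier transforming the equation we have ui ∼ ki Φ/(a²H). Thus (12.44) becomes
\[
\left( \frac{\delta T}{T} \right)_{\rm obs} = \frac{1}{3} \delta + \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2 \int_{\rm dec}^{o} \dot\Phi\, dt . \tag{12.49}
\]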

On superhorizon scales we have δ = −2Φ and (12.49) becomes
\[
\begin{aligned}
\left( \frac{\delta T}{T} \right)_{\rm obs}
&= -\frac{2}{3} \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2 \int_{\rm dec}^{o} \dot\Phi\, dt \\
&= \frac{1}{3} \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2 \int_{\rm dec}^{o} \dot\Phi\, dt .
\end{aligned}
\tag{12.50}
\]

This part of the CMB anisotropy is called the Sachs–Wolfe effect. The first part,
(1/3)Φ(tdec , xls ), is called the ordinary Sachs–Wolfe effect, and the second part, 2∫Φ̇dt,
is called the integrated Sachs–Wolfe effect (ISW), since it involves integrating along
the line of sight. There are two contributions to the integrated Sachs–Wolfe effect,
the early ISW effect and the late ISW effect. The early ISW effect is caused by
the effect of radiation at last scattering. In our approximation where we assume
that the universe is completely matter-dominated at t = tdec , this term is absent.
When dark energy becomes important at times close to today, Φ starts to evolve
again, which leads to the late ISW effect, which shows up as a rise in the smallest ℓ
of the angular power spectrum Cℓ . However, it is difficult to detect this effect due to
the large cosmic variance at small ℓ. The late ISW effect also leads to a correlation
between the CMB anisotropies and the galaxy distribution, which makes it easier to
detect its presence. The late ISW effect has been detected this way, from the cross-
correlation of the CMB and large scale structure. We shall now for a while ignore
the ISW, which for ℓ ≪ ℓH is expected to be smaller than the ordinary Sachs–Wolfe
effect.

12.5.1 Angular power spectrum from the ordinary Sachs–Wolfe effect


We now calculate the contribution from the ordinary Sachs–Wolfe effect,
\[
\left( \frac{\delta T}{T} \right)_{\rm SW} = \frac{1}{3} \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) , \tag{12.51}
\]
to the angular power spectrum Cℓ . This is the dominant effect for ℓ ≪ ℓH .
Since Φ is evaluated at the last scattering sphere, we have from (12.23),
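\[
a_{\ell m} = 4\pi i^\ell \sum_{\mathbf{k}} \frac{1}{3} \Phi_{\mathbf{k}}\, j_\ell(kx)\, Y^*_{\ell m}(\hat{\mathbf{k}}) . \tag{12.52}
\]

In the matter-dominated epoch,
\[
\Phi = -\frac{3}{5} \mathcal{R} , \tag{12.53}
\]
so that we have
\[
a_{\ell m} = -\frac{4\pi}{5} i^\ell \sum_{\mathbf{k}} \mathcal{R}_{\mathbf{k}}\, j_\ell(kx)\, Y^*_{\ell m}(\hat{\mathbf{k}}) . \tag{12.54}
\]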
The coefficient aℓm is thus a linear combination of the independent random
variables Rk , i.e. it is of the form
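\[
\sum_{\mathbf{k}} b_{\mathbf{k}} \mathcal{R}_{\mathbf{k}} . \tag{12.55}
\]

For any such linear combination, the expectation value of its absolute value squared
is
\[
\begin{aligned}
\left\langle \left| \sum_{\mathbf{k}} b_{\mathbf{k}} \mathcal{R}_{\mathbf{k}} \right|^2 \right\rangle
&= \sum_{\mathbf{k}} \sum_{\mathbf{k}'} b_{\mathbf{k}} b^*_{\mathbf{k}'} \langle \mathcal{R}_{\mathbf{k}} \mathcal{R}^*_{\mathbf{k}'} \rangle \\
&= \left( \frac{2\pi}{L} \right)^3 \sum_{\mathbf{k}} \frac{1}{4\pi k^3} \mathcal{P}_{\mathcal{R}}(k)\, |b_{\mathbf{k}}|^2 ,
\end{aligned}
\tag{12.56}
\]
where we used
\[
\langle \mathcal{R}_{\mathbf{k}} \mathcal{R}^*_{\mathbf{k}'} \rangle = \delta_{\mathbf{k}\mathbf{k}'} \left( \frac{2\pi}{L} \right)^3 \frac{1}{4\pi k^3} \mathcal{P}_{\mathcal{R}}(k) \tag{12.57}
\]
(the independence of the random variables Rk and the definition of the power spectrum
P(k)).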
Thus
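\[
\begin{aligned}
C_\ell &\equiv \frac{1}{2\ell+1} \sum_m \langle |a_{\ell m}|^2 \rangle \\
&= \frac{16\pi^2}{25} \frac{1}{2\ell+1} \sum_m \left( \frac{2\pi}{L} \right)^3 \sum_{\mathbf{k}} \frac{1}{4\pi k^3} \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2 \left| Y^*_{\ell m}(\hat{\mathbf{k}}) \right|^2 \\
&= \frac{1}{25} \left( \frac{2\pi}{L} \right)^3 \sum_{\mathbf{k}} \frac{1}{k^3} \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2 .
\end{aligned}
\tag{12.58}
\]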

(Although all h|aℓm |2 i are equal for the same ℓ, we used the sum over m in order to
apply (12.5).) Replacing the sum with an integral, we get
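\[
\begin{aligned}
C_\ell &= \frac{1}{25} \int \frac{d^3k}{k^3} \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2 \\
&= \frac{4\pi}{25} \int_0^\infty \frac{dk}{k} \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2 ,
\end{aligned}
\tag{12.59}
\]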

the final result for an arbitrary primordial power spectrum PR (k).


The integral can be done for a power-law power spectrum, PR (k) = A²(k/kp)^(n−1).
In particular, for a scale-invariant (n = 1) primordial power spectrum,

PR (k) = const. = A2 , (12.60)

we have
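\[
C_\ell = A^2 \frac{4\pi}{25} \int_0^\infty \frac{dk}{k}\, j_\ell(kx)^2 = \frac{A^2}{25} \frac{2\pi}{\ell(\ell+1)} , \tag{12.61}
\]
since
\[
\int_0^\infty \frac{dk}{k}\, j_\ell(kx)^2 = \frac{1}{2\ell(\ell+1)} . \tag{12.62}
\]
We can write this as
\[
\frac{\ell(\ell+1)}{2\pi} C_\ell = \frac{A^2}{25} = \text{const.} \quad \text{(independent of } \ell\text{)} \tag{12.63}
\]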
The reason why the angular power spectrum is customarily plotted as ℓ(ℓ + 1)Cℓ /2π
is that it makes the Sachs–Wolfe part of the Cℓ flat for a scale-invariant primordial
power spectrum PR (k).
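
The integral (12.62) can be verified numerically for a low multipole. The sketch below does this for ℓ = 2, where the integral should equal 1/12. It is only an illustration: the cut-offs, the step size and the function names are our choices, and the two small end corrections rely on the standard limits j2(u) ≈ u²/15 for small u and an average of j2(u)² ≈ 1/(2u²) for large u. With u = kx we have dk/k = du/u, so the result does not depend on x.

```python
import math


def j2(u):
    """Spherical Bessel function j_2(u) in closed form."""
    return (3.0 / u**2 - 1.0) * math.sin(u) / u - 3.0 * math.cos(u) / u**2


def sachs_wolfe_integral(u_min=0.1, u_max=200.0, step=0.01):
    """Approximate the integral of j_2(u)^2 / u over 0 < u < infinity."""
    def f(u):
        return j2(u) ** 2 / u

    # Trapezoidal rule on [u_min, u_max].
    n = int(round((u_max - u_min) / step))
    total = 0.5 * (f(u_min) + f(u_max))
    for i in range(1, n):
        total += f(u_min + i * step)
    total *= step

    total += u_min**4 / 900.0          # 0 < u < u_min, using j_2 ~ u^2 / 15
    total += 1.0 / (4.0 * u_max**2)    # u > u_max, using <j_2^2> ~ 1 / (2 u^2)
    return total


ell = 2
numerical = sachs_wolfe_integral()
expected = 1.0 / (2 * ell * (ell + 1))
print(f"numerical: {numerical:.6f}, 1/(2 l (l+1)): {expected:.6f}")
```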
Present data shows that the spectrum has a small red tilt, n = 0.96 ± 0.007,
as expected from the simplest inflationary models. Since the spectrum is close to
scale-invariant, determining the spectral index requires observations over a range of
scales. However, determining the overall amplitude is possible just by observing the
few lowest multipoles, known as the Sachs–Wolfe plateau. The COBE satellite saw
only up to about ℓ = 25, so the COBE data in figure 1 is completely in this region.
The amplitude is
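\[
\frac{\ell(\ell+1)}{2\pi} \widehat{C}_\ell \approx 10^{-10} . \tag{12.64}
\]
This gives the amplitude of the primordial power spectrum as
\[
\mathcal{P}_{\mathcal{R}}(k) = A^2 \approx 25 \times 10^{-10} = \left( 5 \times 10^{-5} \right)^2 . \tag{12.65}
\]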
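
The step from (12.64) to (12.65) is just the inversion of (12.63). As a small worked example (the variable names are ours, and the conversion to microkelvin is an added illustration using T0 = 2.725 K from earlier in the chapter):

```python
import math

T0_KELVIN = 2.725        # mean CMB temperature quoted in the text
plateau = 1e-10          # l(l+1) C_l / (2 pi) on the Sachs-Wolfe plateau, eq. (12.64)

# Invert eq. (12.63): l(l+1) C_l / (2 pi) = A^2 / 25.
amplitude_squared = 25.0 * plateau
amplitude = math.sqrt(amplitude_squared)

# Typical temperature fluctuation corresponding to the plateau value.
delta_t_microkelvin = math.sqrt(plateau) * T0_KELVIN * 1e6

print(f"A^2 = {amplitude_squared:.1e}, A = {amplitude:.1e}")
print(f"sqrt(plateau) * T0 = {delta_t_microkelvin:.0f} microkelvin")
```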

We already used this result (confirmed after COBE by other experiments) in chapter
10 as a constraint on the energy scale of inflation. Nowadays, the detailed structure
of the anisotropies has been measured: the latest data from Planck is shown in figure
7. Let us now discuss how the structure of peaks and troughs is generated.

12.6 Acoustic oscillations


Consider now the scales k ≫ kdec , or ℓ ≫ ℓH , which are subhorizon at decoupling.
The observed temperature anisotropy is, from (12.44)
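\[
\left( \frac{\delta T}{T} \right)_{\rm obs} = \frac{1}{4} \delta_\gamma(t_{\rm dec}, \mathbf{x}_{\rm ls}) + \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) - \mathbf{v} \cdot \hat{\mathbf{n}}(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2 \int_{\rm dec}^{o} \dot\Phi\, dt . \tag{12.66}
\]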

We concentrate on the three first terms, which correspond to the situation at the
point (tdec , xls ) we are looking at on the last scattering sphere.
Before decoupling the photons are tightly coupled to the baryons. The per-
turbations in the baryon-photon fluid are oscillating, whereas CDM perturbations
grow (logarithmically during the radiation-dominated epoch, and then ∝ a during
the matter-dominated epoch). Therefore CDM perturbations begin to dominate the
total density perturbation δρ and thus also Φ already before the universe becomes
matter-dominated and CDM begins to dominate the background energy density.
Thus we can make the approximation that Φ is given by the CDM perturbation.
The baryon-photon fluid oscillates in these potential wells caused by the CDM. The
potential Φ evolves at first but then becomes constant as the universe becomes
matter dominated.
We will not do a full calculation of the δbγ oscillations in the expanding universe,
since that would require somewhat more general relativity tools than we have at our disposal.
One reason is that ρbγ is a relativistic fluid, and we gave the equation for the density
perturbation for a nonrelativistic fluid only (the Jeans equation). The nonrelativistic
perturbation equation for a fluid component i is (this follows from (11.50) when we
replace the baryonic density contrast with the total density contrast in the driving
term)
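\[
\ddot\delta_{\mathbf{k}i} + 2H \dot\delta_{\mathbf{k}i} = -\left( \frac{k}{a} \right)^2 \left[ c_s^2 \delta_{\mathbf{k}i} + \Phi_{\mathbf{k}} \right] . \tag{12.67}
\]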
The generalisation of the (subhorizon) perturbation equations to the case of a
relativistic fluid is considerably easier if we ignore the expansion of the universe.
Then (12.67) becomes
\[
\ddot\delta_{\mathbf{k}i} + k^2 \left[ c_s^2 \delta_{\mathbf{k}i} + \Phi_{\mathbf{k}} \right] = 0 . \tag{12.68}
\]
According to GR, the density of “passive gravitational mass” is ρ+p = (1+w)ρ, not
just ρ as in Newtonian gravity. Therefore the force on a fluid element of the fluid
component i is proportional to (ρi + pi)∇Φ = (1 + wi)ρi ∇Φ instead of just ρi ∇Φ,
and (12.68) generalises to the case of a relativistic fluid as⁵
$$\ddot{\delta}_{\mathbf{k}i} + k^2\left[c_s^2\,\delta_{\mathbf{k}i} + (1 + w_i)\Phi_{\mathbf{k}}\right] = 0 . \tag{12.69}$$

In the present application the fluid component ρi is the baryon-photon fluid ρbγ
and the gravitational potential Φ is caused by the CDM. Before decoupling, the
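adiabatic relation δb = (3/4)δγ still holds between photons and baryons, and we have
the adiabatic relation between pressure and density perturbations,
$$\delta p_{b\gamma} = c_s^2\,\delta\rho_{b\gamma} , \tag{12.70}$$
so the sound speed of the fluid is given by
$$c_s^2 = \frac{\delta p_{b\gamma}}{\delta\rho_{b\gamma}} \approx \frac{\delta p_\gamma}{\delta\rho_{b\gamma}} = \frac{1}{3}\frac{\delta\rho_\gamma}{\delta\rho_\gamma + \delta\rho_b} = \frac{1}{3}\frac{\bar{\rho}_\gamma\delta_\gamma}{\bar{\rho}_\gamma\delta_\gamma + \bar{\rho}_b\delta_b} = \frac{1}{3}\frac{1}{1 + \frac{3}{4}\frac{\bar{\rho}_b}{\bar{\rho}_\gamma}} \equiv \frac{1}{3}\frac{1}{1 + R} . \tag{12.71}$$
We defined
$$R \equiv \frac{3}{4}\frac{\bar{\rho}_b}{\bar{\rho}_\gamma} . \tag{12.72}$$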
We can now write the perturbation equation (12.69) for the baryon-photon fluid as
$$\ddot{\delta}_{b\gamma\mathbf{k}} + k^2\left[c_s^2\,\delta_{b\gamma\mathbf{k}} + (1 + w_{b\gamma})\Phi_{\mathbf{k}}\right] = 0 . \tag{12.73}$$

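The equation-of-state parameter for the baryon-photon fluid is
$$w_{b\gamma} \equiv \frac{\bar{p}_{b\gamma}}{\bar{\rho}_{b\gamma}} = \frac{\frac{1}{3}\bar{\rho}_\gamma}{\bar{\rho}_\gamma + \bar{\rho}_b} = \frac{1}{3}\frac{1}{1 + \frac{4}{3}R} . \tag{12.74}$$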

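We can therefore write (12.73) as
$$\ddot{\delta}_{b\gamma\mathbf{k}} + k^2\left[\frac{1}{3}\frac{1}{1 + R}\,\delta_{b\gamma\mathbf{k}} + \frac{\frac{4}{3}(1 + R)}{1 + \frac{4}{3}R}\,\Phi_{\mathbf{k}}\right] = 0 . \tag{12.75}$$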

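We introduce the notation⁶
$$\Theta_0 \equiv \frac{1}{4}\delta_\gamma , \tag{12.76}$$
which gives the perturbation in the photons, not in the baryon-photon fluid. The
two are related by
$$\delta_{b\gamma} = \frac{\delta\rho_{b\gamma}}{\bar{\rho}_{b\gamma}} = \frac{\delta\rho_\gamma + \delta\rho_b}{\bar{\rho}_\gamma + \bar{\rho}_b} = \frac{\bar{\rho}_\gamma\delta_\gamma + \bar{\rho}_b\delta_b}{\bar{\rho}_\gamma + \bar{\rho}_b} = \frac{1 + R}{1 + \frac{4}{3}R}\,\delta_\gamma . \tag{12.77}$$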

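Thus we can write (12.73) as
$$\ddot{\delta}_{\gamma\mathbf{k}} + k^2\left[\frac{1}{3}\frac{1}{1 + R}\,\delta_{\gamma\mathbf{k}} + \frac{4}{3}\Phi_{\mathbf{k}}\right] = 0 , \tag{12.78}$$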
5
Actually the derivation is more complicated, since also the density of “inertial mass” is ρi + pi
and the energy continuity equation is modified by a work-done-by-pressure term. Anyway, (12.69)
is the correct result.
6
The subscript 0 refers to the monopole (ℓ = 0) of the local photon distribution. Likewise,
the dipole (ℓ = 1) of the local photon distribution corresponds to the velocity of the photon fluid,
Θ1 ≡ vγ /3.
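or
$$\ddot{\Theta}_{0\mathbf{k}} + k^2\left[\frac{1}{3}\frac{1}{1 + R}\,\Theta_{0\mathbf{k}} + \frac{1}{3}\Phi_{\mathbf{k}}\right] = 0 , \tag{12.79}$$
or
$$\ddot{\Theta}_{0\mathbf{k}} + c_s^2 k^2\left[\Theta_{0\mathbf{k}} + (1 + R)\Phi_{\mathbf{k}}\right] = 0 . \tag{12.80}$$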
If we now take R and Φk to be constant, this is the harmonic oscillator equation
for the quantity Θ0k + (1 + R)Φk with the general solution

Θ0k + (1 + R)Φk = Ak cos(cs kt) + Bk sin(cs kt) , (12.81)

or
Θ0k + Φk = −RΦk + Ak cos(cs kt) + Bk sin(cs kt) , (12.82)
or
Θ0k = −(1 + R)Φk + Ak cos(cs kt) + Bk sin(cs kt) . (12.83)
We are interested in the quantity Θ0 + Φ = (1/4)δγ + Φ, called the effective temperature
perturbation, since this combination appears in (12.66). It is the local temperature
perturbation minus the redshift photons suffer when climbing from the potential well
of the perturbation (negative Φ for a CDM overdensity). We see that this quantity
oscillates in time, and the effect of baryons (via R) is to shift the equilibrium point
of the oscillation by −RΦk .
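
The equilibrium shift can be checked numerically. The following is a minimal sketch of the solution (12.82) for constant R and Φ; the function name and the numerical values of Φ, R, k and the amplitude are illustrative assumptions, not values from these notes.

```python
import math

def theta0_plus_phi(t, k, phi, R, A, B=0.0):
    """Effective temperature perturbation from (12.82), for constant R and Phi (c = 1)."""
    cs = 1.0 / math.sqrt(3.0 * (1.0 + R))  # sound speed from (12.71)
    return -R * phi + A * math.cos(cs * k * t) + B * math.sin(cs * k * t)

# Illustrative (assumed) numbers: a potential well (phi < 0) and a pure cosine mode.
phi, R, k, A = -1.0, 0.6, 1.0, 0.5
cs = 1.0 / math.sqrt(3.0 * (1.0 + R))
period = 2.0 * math.pi / (cs * k)

# Average over one full period: the oscillating part cancels,
# leaving the equilibrium point -R * phi.
n = 1000
mean = sum(theta0_plus_phi(i * period / n, k, phi, R, A) for i in range(n)) / n
print(mean, -R * phi)
```

With these assumed numbers the time average equals −RΦ, so for a potential well (Φ < 0) the baryons displace the oscillation towards positive effective temperature.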
In the preceding we ignored the effect of the expansion of the universe. The
expansion affects the result in several ways. For example, cs , wbγ and R change
with time. The potential Φ also evolves, especially at early times when radiation
dominates the expansion. However, the qualitative result of an oscillation of Θ0 + Φ,
and the shift of its equilibrium point by baryons, remains. The time t in the solution
(12.82) gets replaced by conformal time η, and since cs changes with time, cs η is
replaced by
$$r_s^c(t) \equiv \int_0^{\eta} c_s\,d\eta = \int_0^{t} \frac{c_s(t)}{a(t)}\,dt . \tag{12.84}$$
We call this quantity $r_s^c(t)$ the comoving sound horizon at time t, since it gives the
comoving distance sound waves have travelled by time t.
The relative weight of the cosine and sine solutions (i.e., the constants Ak and
Bk in (12.81)) depends on the initial conditions. Since the perturbations are initially
at superhorizon scales, the initial conditions are determined there, and the present
discussion does not really apply. However, using the superhorizon initial conditions
gives the correct qualitative result for the phase of the oscillation.
We had that for adiabatic primordial perturbations, initially Φ = −(3/5)ℛ and
(1/4)δγ = −(2/3)Φ = (2/5)ℛ, giving us an initial condition Θ0 + Φ = (1/3)Φ = −(1/5)ℛ = const.
(At these early times R ≪ 1, so we can drop the factor 1 + R.) Thus adiabatic
primordial perturbations correspond essentially to the cosine solution. There are
effects at the horizon scale which affect the amplitude of the oscillations—the main
effect being the decay of Φ as it enters the horizon—so we can’t use the preceding
discussion to determine the amplitude, but we get the right result about the initial
phase of the Θ0 + Φ oscillations.
Thus we have that, qualitatively, the effective temperature behaves at subhorizon
scales as
$$\Theta_{0\mathbf{k}} + (1 + R)\Phi_{\mathbf{k}} \propto \cos[k r_s^c(t)] . \tag{12.85}$$

Consider a region where the primordial curvature perturbation ℛ is positive. It
begins with an initial overdensity (as we assume perturbations are adiabatic, this
applies to all components: photons, baryons, CDM and neutrinos), and a negative
gravitational potential Φ. For the scales of interest for CMB anisotropy, the potential
stays negative, since the CDM begins to dominate the potential early enough and
the CDM perturbations do not oscillate, they just grow. The effective temperature
perturbation Θ0 + Φ, which is the oscillating quantity, begins with a negative value.
After half an oscillation period it is at its positive extreme value. This increase of
Θ0 + Φ corresponds to an increase in δγ ; from its initial positive value it has grown
to a larger positive value. Thus the oscillation begins by the initially overdense
baryon-photon fluid element falling deeper into the potential well, and reaching
maximum compression after half a period. After this maximum compression the
photon pressure pushes the baryon-photon fluid out from the potential well, and
after a full period, the fluid reaches its maximum decompression in the potential
well. Since the potential Φ has meanwhile decayed (horizon entry and the resulting
potential decay always happens during the first oscillation period, since the sound
horizon and the Hubble length are close to each other, as the sound speed is close to
the speed of light), the decompression does not bring the δbγ back to its initial value
(which was overdense), but the photon-baryon fluid actually becomes underdense in
the potential well (and overdense in the neighbouring potential “hill”). And so the
oscillation goes on until photon decoupling.
These are standing waves and they are called acoustic oscillations. See figure 15.
Because of the potential decay at horizon entry, the amplitude of the oscillation is
larger than Φ, and thus also Θ0 changes sign in the oscillation.
These oscillations end at photon decoupling, when the photons are liberated.
The CMB shows these standing waves as a snapshot⁷ at their final moment t = tdec .
At photon decoupling we have
$$\Theta_{0\mathbf{k}} + (1 + R)\Phi_{\mathbf{k}} \propto \cos[k r_s^c(t_{\rm dec})] . \tag{12.86}$$
At this moment oscillations for scales k which have
$$k r_s^c(t_{\rm dec}) = n\pi \tag{12.87}$$
(n = 1, 2, 3, . . .) are at their extreme values (maximum compression or maximum
decompression). Therefore we see strong structure in the CMB anisotropy at the
multipoles
$$\ell = k d_A^c(t_{\rm dec}) = n\pi\frac{d_A^c(t_{\rm dec})}{r_s^c(t_{\rm dec})} \equiv n\ell_A \tag{12.88}$$
corresponding to these scales. Here
$$\ell_A \equiv \pi\frac{d_A^c(t_{\rm dec})}{r_s^c(t_{\rm dec})} \equiv \frac{\pi}{\theta_s} \tag{12.89}$$
is the acoustic scale in multipole space and
$$\theta_s \equiv \frac{r_s^c(t_{\rm dec})}{d_A^c(t_{\rm dec})} \tag{12.90}$$
7
Actually, photon decoupling takes quite a long time. Therefore this “snapshot” has a rather
long “exposure time” causing it to be “blurred”. This prevents us from seeing very small scales in
the CMB anisotropy.
Figure 15: Acoustic oscillations. The top panel shows the time evolution of the Fourier
amplitudes Θ0k , Φk , and the effective temperature Θ0k + Φk . The Fourier mode shown
corresponds to the fourth acoustic peak of the Cℓ spectrum. The bottom panel shows δbγ (x)
for one Fourier mode as a function of position at various times (maximum compression,
equilibrium level, and maximum decompression).

is the sound horizon angle, i.e., the angle at which we see the sound horizon on the
last scattering surface. This is the CMB anisotropy quantity which is determined
with most accuracy from the data. Analysis of the 5-year data from the WMAP
satellite and data from the ACBAR ground-based CMB experiment gives the model-
independent value θs = 0.593◦ ± 0.001◦ , a precision of 0.3% [1].
Because of these acoustic oscillations, the CMB angular power spectrum Cℓ has
a structure of acoustic peaks on subhorizon scales. The centers of these peaks are
located approximately at ℓn ≈ nℓA . An exact calculation shows that they actually
lie at somewhat smaller ℓ due to a number of effects. The separation of neighbouring
peaks is closer to ℓA than the positions of the peaks are to nℓA .
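
As a quick numerical illustration of (12.88) and (12.89), the sketch below converts the quoted sound horizon angle into the acoustic scale and the naive peak positions nℓA. The variable names are our own; as stated above, the true peaks lie at somewhat smaller ℓ.

```python
import math

theta_s_deg = 0.593                  # sound horizon angle quoted in the text [1]
theta_s = math.radians(theta_s_deg)  # the same angle in radians
l_A = math.pi / theta_s              # acoustic scale in multipole space, (12.89)

# Naive peak positions n * l_A from (12.88).
naive_peaks = [n * l_A for n in (1, 2, 3)]
print(round(l_A, 1), [round(l) for l in naive_peaks])
```

This gives ℓA of roughly 300, so the naive estimate places the first three peaks near ℓ ≈ 300, 600 and 900.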
These acoustic oscillations involve motion of the baryon-photon fluid. When the
oscillation of one Fourier mode is at its extreme, e.g. at the maximal compression in
the potential well, the fluid is momentarily at rest, but then it begins flowing out of
the well until the other extreme, the maximal decompression, is reached. Therefore
those Fourier modes k which have the maximum effect on the CMB anisotropy via
the (1/4)δγ (tdec , xls ) + Φ(tdec , xls ) term (the effective temperature effect) in (12.66) have
the minimum effect via the −v · n̂(tdec , xls ) term (the Doppler effect) and vice versa.
Therefore the Doppler effect also contributes a peak structure to the Cℓ spectrum,
but its peaks are in the locations where the effective temperature contribution has
troughs.
The Doppler effect is subdominant to the effective temperature effect, and therefore
the peak positions in the Cℓ spectrum are determined by the effective temperature
effect, according to (12.88). The Doppler effect just partially fills the troughs
between the peaks, weakening the peak structure of Cℓ . (See figure 18. The calculation
involves some approximations which allow the description of Cℓ as just a sum
of these contributions, and is not as accurate as a full calculation using e.g. the
CAMB code⁸.)
Figure 16 shows the values of the effective temperature perturbation Θ0 + Φ (as
well as Θ0 and Φ separately) and the magnitude of the velocity perturbation (Θ1 ∼
v/3) at tdec as a function of the scale k. This is a result of a numerical calculation
which includes the effect of the expansion of the universe, but not diffusion damping.

12.7 Diffusion damping


For small enough scales the effect of photon diffusion and the finite thickness of
the last scattering surface (∼ the photon mean free path just before last scattering)
smooth out the photon distribution and the CMB anisotropy. This effect is characterised
by the damping scale $k_D^{-1}$, which is the distance that photons have travelled
up to last scattering. The photon density and velocity perturbations at scale k are
damped at tdec by
$$e^{-k^2/k_D^2} , \tag{12.91}$$
and the Cℓ spectrum is damped as
$$e^{-\ell^2/\ell_D^2} . \tag{12.92}$$

We can estimate kD and ℓD as follows (see [2], page 129, for a bit more detail).
Before decoupling photons are scattering from the electrons in the plasma. The
typical time between collisions (which for c = 1 equals the photon mean free path) at time t is
$\lambda_\gamma = t_c(t) = \Gamma^{-1} = (n_e(t)\sigma_T)^{-1}$, where $\sigma_T = \frac{8\pi}{3}\frac{\alpha^2}{m_e^2}$ is the Thomson cross-section. The
free electron density depends on the ionisation fraction x (see section 5.6). For
simplicity, we take x = 1. (In fact, the ionisation fraction drops, and tc grows,
rapidly during decoupling.) The photon direction changes randomly at each collision
and independently of the previous collision, so photons undergo a random walk with
uncorrelated steps. The number of steps a photon has taken up to time t is
N = t/tc (taking tc to be constant for simplicity), and the net distance it has
travelled at decoupling is
$$d_D = \sqrt{N}\,t_c = \sqrt{t_{\rm dec}\,t_c} \approx 14\ {\rm kpc} , \tag{12.93}$$
where we have put in tdec = 380 000 yr, tc = tc (tdec ) and used Tdec = 3000 K,
η = 6 × 10⁻¹⁰. The comoving diffusion scale is given by
$$k_D^{-1} = (1 + z_{\rm dec})\,d_D \approx 15\ {\rm Mpc} , \tag{12.94}$$
using zdec = 1090. This corresponds to multipole moment
$$\ell_D \sim k_D\,d_A^c(z_{\rm dec}) \approx 900 , \tag{12.95}$$
where we have put in $d_A^c(z_{\rm dec})$ = 13.8 Gpc (see section 12.9.2).
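
The random-walk estimate can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: it uses SI constants, the blackbody photon number density at Tdec, full ionisation (x = 1), and one free electron per baryon (helium is ignored). All variable names are our own.

```python
import math

# Physical constants (SI units).
K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
SIGMA_T = 6.6524587e-29   # Thomson cross-section, m^2
ZETA3 = 1.2020569         # Riemann zeta(3)
KPC = 3.0856776e19        # one kiloparsec in metres
YEAR = 3.15576e7          # one Julian year in seconds

# Inputs quoted in the text.
T_dec = 3000.0            # K
eta = 6e-10               # baryon-to-photon ratio
t_dec = 380_000.0 * YEAR  # s
z_dec = 1090.0
d_A_c_mpc = 13.8e3        # comoving angular diameter distance, Mpc

# Blackbody photon number density at T_dec, in m^-3.
n_gamma = 2.0 * ZETA3 / math.pi**2 * (K_B * T_dec / (HBAR * C))**3
# Assumption: x = 1 and one electron per baryon (helium ignored).
n_e = eta * n_gamma

mean_free_path = 1.0 / (n_e * SIGMA_T)   # m
t_c = mean_free_path / C                 # time between collisions, s
d_D_kpc = C * math.sqrt(t_dec * t_c) / KPC        # (12.93), with c restored
k_D_inv_mpc = (1.0 + z_dec) * d_D_kpc / 1.0e3     # (12.94)
l_D = d_A_c_mpc / k_D_inv_mpc                     # (12.95)

print(f"d_D = {d_D_kpc:.1f} kpc, 1/k_D = {k_D_inv_mpc:.1f} Mpc, l_D = {l_D:.0f}")
```

With these inputs the numbers should land close to, though not exactly on, the rounded values quoted in (12.93)–(12.95); the small differences depend on how electrons per baryon and the constants are treated.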
8
CAMB is a publicly available code for precise calculation of the Cℓ spectrum. See
http://camb.info/ .
[Figure 16 panels: Φ and Θ0; Θ0 + Φ; Θ1; (Θ0 + Φ)²; Θ1²; horizontal axis k/H0 from 0 to 1000; curves for ωm = 0.10, 0.20 and 0.30.]

Figure 16: Values of oscillating quantities (normalised to an initial value ℛk = 1) at the
time of decoupling as a function of the scale k, for three different values of ωm , and for
ωb = 0.01. Θ1 represents the velocity perturbation. The effect of diffusion damping is
neglected. Figure and calculation by R. Keskitalo.
[Figure 17 plot: curves labelled "Undamped spectra" and "Including damping" for ωm = 0.20, 0.25, 0.30 and 0.35; horizontal axis from 0 to 2000.]

Figure 17: The angular power spectrum Cℓ , calculated both with and without the effect of
diffusion damping. The spectrum is given for four different values of ωm , with ωb = 0.01.
(This is a rather low value of ωb , about half the real value, so ℓD < 1500 and the damping
is quite strong.) Figure and calculation by R. Keskitalo.

This calculation is rather approximate, because of the rapid growth of the pho-
ton mean free path (and the typical time between collisions) during recombination,
and a more accurate calculation involves an integral over time to take into account
this effect. However, the order of magnitude ℓD ∼ 1000 is correct, as we see from fig-
ure 17, which shows the result of a numerical calculation with and without diffusion
damping.
Of the cosmological parameters, the damping depends most strongly on ωb , since
increasing baryon density shortens the photon mean free path before decoupling.
Thus for larger ωb the damping moves to shorter scales, i.e. ℓD becomes larger.
The time evolution of λγ before decoupling, and the diffusion scale, is different
for different ωb . For small ωb , tc has already become quite large through the slow
dilution of the baryon density by the expansion of the universe, and the growth of
λγ relies less on the fast reduction of free electron density during recombination.

12.8 The complete Cℓ spectrum


As we have discussed, the CMB anisotropy has three contributions (see (12.66)): the
effective temperature effect,
$$\frac{1}{4}\delta_\gamma(t_{\rm dec}, \mathbf{x}_{\rm ls}) + \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) , \tag{12.96}$$
the Doppler effect,
$$-\mathbf{v}\cdot\hat{\mathbf{n}}(t_{\rm dec}, \mathbf{x}_{\rm ls}) , \tag{12.97}$$
and the integrated Sachs–Wolfe effect,
$$2\int_{\rm dec}^{o} \dot{\Phi}\,dt . \tag{12.98}$$
Because Cℓ is quadratic in the perturbations, it includes cross-terms between these
three effects.
The calculation of the full Cℓ proceeds much as the calculation of just the or-
dinary Sachs–Wolfe part (which the effective temperature effect becomes at super-
horizon scales) in section 12.5.1, but now with the full δT /T . Since all perturbations
are proportional to the primordial perturbations, the Cℓ spectrum is proportional
to the primordial perturbation spectrum $P_{\mathcal{R}}(k)$ (with integrals over the spherical
Bessel functions jℓ (kx), like in (12.59), to get from k to ℓ).
The difference is that instead of the constant proportionality factor (δT /T )SW =
−(1/5)ℛ, we have a k-dependent proportionality resulting from the evolution (including
e.g. the acoustic oscillations) of the perturbations.
In figure 18 we show the full Cℓ spectrum and the different contributions to it.
Because the Doppler effect and the effective temperature effect are almost completely
out of phase, their cross term gives a negligible contribution.
Since the ISW effect is relatively weak, it contributes more via its cross terms
with the Doppler effect and effective temperature than directly. The cosmological
model used for figure 18 has ΩΛ = 0, so there is no late ISW effect (which would
contribute at the very lowest ℓ), and the ISW effect shown is the early ISW effect
due to radiation contribution to the expansion law. This effect contributes mainly
to the first peak and to the left of it, explaining why the first peak is so much higher
than the other peaks. It also shifts the first peak position slightly to the left and
changes its shape.

12.9 Cosmological parameters and CMB anisotropy


Let us finally consider the effect of the various cosmological parameters on the Cℓ
spectrum. The Cℓ provides perhaps the most important single observational data set
for determining (or constraining) cosmological parameters, since it has a rich struc-
ture which we can measure with an accuracy that other cosmological observations
cannot match, and because it depends on several different cosmological parameters.
The latter is both a strength and weakness: the CMB has only a couple of features
(overall amplitude and the positions and heights of the peaks and troughs), so typically
you cannot hope to determine more than a handful of independent parameters
from the data. This is because different parameters affect the same features in sim-
ilar ways, so that we only get a constraint on their combination. Such parameters
are called degenerate. Other cosmological observations are needed to break these
degeneracies.
The CMB anisotropy pattern is set by the physics at decoupling, and it is then
modified as the CMB passes through the universe to be observed today. The CMB
pattern at decoupling is determined by the primordial spectrum, and the densities of
photons, neutrinos, baryons and cold dark matter. The photon density is precisely
known from the CMB mean temperature, and (assuming zero neutrino chemical
potential) this also fixes the density of neutrinos. In the case of many inflationary
models, the primordial spectrum is a power-law, characterised by an amplitude
and a spectral index. In summary, the physics at decoupling is determined by

• ωb ≡ Ωb0 h² the physical baryon density

• ωm ≡ Ωm0 h² the physical matter density

• A amplitude of primordial scalar perturbations (at pivot scale kp )

• n spectral index of primordial scalar perturbations

[Figure 18 panels: the top panel shows the full spectrum together with the Θ0 + Φ, Θ1, ISW, ISW cross terms and (Θ0 + Φ)×Θ1 contributions (vertical axis label as extracted: 2l(l+1)Cl/2π; horizontal axis l from 0 to 2000). The lower panels show Θ0 + Φ, Θ1, ISW, (Θ0 + Φ)×Θ1, ISW×(Θ0 + Φ), ISW×Θ1, and ISW×(Θ0 + Φ) + ISW×Θ1 separately.]

Figure 18: The full Cℓ spectrum calculated for the cosmological model Ω0 = 1, ΩΛ = 0,
ωm = 0.2, ωb = 0.03, A = 1, n = 1, and the different contributions to it. Here Θ1 denotes
the Doppler effect. Figure and calculation by R. Keskitalo.


The angular scale at which the pattern is seen changes as the universe evolves,
and this is the main effect of the physics after decoupling on the CMB anisotropy. In
addition, the CMB photons scatter off free charges after reionisation. In principle,
this effect is determined in terms of the physical parameters at decoupling, but the
physics involved in stellar formation and other relevant processes is too complicated
to calculate from first principles. Therefore the effect of reionisation is encoded in an
effective parameter τ called the optical depth (discussed in section 12.9.6). Roughly
speaking, τ gives the probability that a given photon scatters at least once between
decoupling and today. We could therefore take the model-independent post-decoupling
CMB parameters as
• $d_A^c(z_{\rm dec})$ comoving angular diameter distance to the last scattering surface

• τ optical depth
The angular diameter distance is a general model-independent quantity. In a
given FRW model, it is determined by the spatial curvature and the expansion
history, as we have discussed. In the ΛCDM model, where there is vacuum energy
and spatial curvature, the angular diameter distance can be replaced by these two
parameters, so we have
• Ω0 total density parameter

• ΩΛ0 vacuum energy density parameter

• τ optical depth
In addition to changing the angular diameter distance, vacuum energy and spatial
curvature also contribute to the CMB anisotropy via the ISW effect, as discussed
earlier. The decoupling and post-decoupling parameters add up to a total of seven
parameters. Since spatial curvature is not needed to explain the observations and
there is no indication for it, it is usually put to zero, i.e. Ω0 = 1. Usually references
to the ΛCDM model, or the “standard cosmological model”, or the concordance
model refer to the model parametrised by the six parameters above, without spatial
curvature. We will keep spatial curvature in the discussion in order to show what
effect it would have.
There are other possible cosmological parameters (“additional parameters”) which
might affect the Cℓ spectrum, e.g.
• mνi neutrino masses

• w dark energy equation of state parameter

• dn/d ln k scale dependence of the spectral index

• r, nT relative amplitude and spectral index of tensor perturbations

• B, niso amplitudes and spectral indices of primordial isocurvature perturbations,

• Acor , ncor and their correlation with primordial curvature perturbations

We assume here that these additional parameters have no impact, i.e., they have
the “standard” values
$$m_{\nu_i} = r = \frac{dn}{d\ln k} = B = A_{\rm cor} = 0 , \qquad w = -1 \tag{12.99}$$
to the accuracy which matters for Cℓ observations. Apart from the neutrino masses,
there is no sign in the present-day CMB data for non-zero values of these parameters.
On the other hand, significant deviations from zero can be consistent with the data,
and may be discovered by future CMB (and other) observations, in particular the
Planck satellite. The primordial isocurvature perturbations refer to the possibility
that the primordial scalar perturbations are not adiabatic, and therefore are not
completely determined by the comoving curvature perturbation ℛ.
The assumption that these additional parameters have no impact leads to a
determination of the standard parameters with an accuracy that may be too op-
timistic, since the standard parameters may have degeneracies with the additional
parameters.

12.9.1 Independent vs. dependent parameters


The above is our choice of independent cosmological parameters. Ωm0 , Ωb0 and H0
(or h) are then dependent (or “derived”) parameters, since they are determined by
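$$\Omega_0 = \Omega_{m0} + \Omega_{\Lambda 0} \quad\Rightarrow\quad \Omega_{m0} = \Omega_0 - \Omega_{\Lambda 0} \tag{12.100}$$
$$h = \sqrt{\frac{\omega_m}{\Omega_{m0}}} = \sqrt{\frac{\omega_m}{\Omega_0 - \Omega_{\Lambda 0}}} \tag{12.101}$$
$$\Omega_{b0} = \frac{\omega_b}{h^2} = \frac{\omega_b}{\omega_m}(\Omega_0 - \Omega_{\Lambda 0}) \tag{12.102}$$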
In particular, the Hubble constant H0 is a dependent parameter. The CMB has
no sensitivity to H0 except via the angular diameter distance to the last scattering
surface.
Different choices of independent parameters are possible within our 7-dimensional
parameter space (e.g. we could have chosen H0 to be an independent parameter and
let ΩΛ0 be a dependent parameter instead). They can be thought of as different coordinate
systems in this seven-dimensional space. It is not meaningful to discuss the
effect of one parameter without specifying what is the set of independent parameters!
Some choices of independent parameters are better than others. The above choice
represents standard practice in cosmology today.⁹ The independent parameters have
been chosen so that they correspond as directly as possible to physics affecting
the Cℓ spectrum and thus to observable features in it. We want the effects of
our independent parameters on the observables to be as different (“orthogonal”) as
possible in order to avoid parameter degeneracy.
In particular,
• ωm (not Ωm0 ) determines zeq and keq , and thus e.g. the magnitude of the
early ISW effect and which scales enter during the matter- or radiation-dominated
epoch.

• ωb (not Ωb0 ) determines the baryon/photon ratio and thus e.g. the relative
heights of the odd and even peaks.

• ΩΛ0 (not ΩΛ0 h² ) determines the late ISW effect.

9
There are other choices in use, that are even more geared to minimising parameter degeneracy.
For example, the sound horizon angle θs may be used instead of ΩΛ0 as an independent parameter,
since it is directly determined by the acoustic peak separation, and thus less subject to degeneracies.
However, the determination of the dependent parameters from it is in turn more complicated.

There are many effects on the Cℓ spectrum, and parameters act on them in dif-
ferent combinations. Thus there is no perfectly “clean” way of choosing independent
parameters.
In the following plots made with CAMB we see the effect of these parameters on
Cℓ by varying one parameter at a time around a reference model, whose parameters
have the following values.
Independent parameters:

Ω0 = 1 ΩΛ0 = 0.7
A=1 ωm = 0.147
n=1 ωb = 0.022
τ = 0.1

which give for the dependent parameters

Ωm0 = 0.3          h = 0.7
Ωc0 = 0.2551       ωc = 0.125
Ωb0 = 0.0449

Setting A = 1 just means that the resulting Cℓ still need to be multiplied
by the true value of A². (In this model the true value should be about A = 5 × 10⁻⁵
to agree with observations.) If we really had A = 1, perturbation theory of course
would not be valid! This is a relatively common practice: the effect of changing
A is so trivial that it doesn’t make sense to plot Cℓ separately for different values of A.
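
The dependent parameters of the reference model follow directly from (12.100)–(12.102). The sketch below is a minimal illustration; the function name and the dictionary keys are our own choices, and ωc = ωm − ωb assumes that matter consists of baryons and CDM only.

```python
import math

def dependent_parameters(omega_0, omega_lambda0, w_m, w_b):
    """Derive Omega_m0, h, Omega_b0, omega_c and Omega_c0 from the independent parameters."""
    omega_m0 = omega_0 - omega_lambda0   # (12.100)
    h = math.sqrt(w_m / omega_m0)        # (12.101)
    omega_b0 = w_b / h**2                # (12.102)
    w_c = w_m - w_b                      # assumption: matter = baryons + CDM
    omega_c0 = omega_m0 - omega_b0
    return {"Omega_m0": omega_m0, "h": h, "Omega_b0": omega_b0,
            "omega_c": w_c, "Omega_c0": omega_c0}

# Reference model: Omega_0 = 1, Omega_Lambda0 = 0.7, omega_m = 0.147, omega_b = 0.022.
reference = dependent_parameters(1.0, 0.7, 0.147, 0.022)
for name, value in reference.items():
    print(f"{name} = {value:.4f}")
```

For the reference model this reproduces the listed values h = 0.7, Ωb0 = 0.0449, ωc = 0.125 and Ωc0 = 0.2551.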

12.9.2 Sound horizon angle


The positions of the acoustic peaks of the Cℓ spectrum provide us with a measurement
of the sound horizon angle
$$\theta_s \equiv \frac{r_s^c(t_{\rm dec})}{d_A^c(t_{\rm dec})} .$$
We can use this in the determination of the values of the cosmological parameters,
once we have calculated how this angle depends on those parameters. It is the ratio
of two quantities, the sound horizon at photon decoupling, $r_s^c(t_{\rm dec})$, and the angular
diameter distance to last scattering, $d_A^c(t_{\rm dec})$.

Angular diameter distance to last scattering


The angular diameter distance $d_A^c(t_{\rm dec})$ to the last scattering surface we have
already calculated and it is given by (12.27) as
$$d_A^c(t_{\rm dec}) = H_0^{-1}\frac{1}{\sqrt{1 - \Omega_0}}\sinh\left(\sqrt{1 - \Omega_0}\int_{\frac{1}{1 + z_{\rm dec}}}^{1}\frac{da}{\sqrt{\Omega_0(a - a^2) - \Omega_{\Lambda 0}(a - a^4) + a^2}}\right) , \tag{12.103}$$
from which we see that it depends on the three cosmological parameters H0 , Ω0 and
ΩΛ0 . Here Ω0 = Ωm0 + ΩΛ0 , so we could also say that it depends on H0 , Ωm0 , and
ΩΛ0 , but it is easier to discuss the effects of these different parameters if we keep
Ω0 as an independent parameter, instead of Ωm0 , since the “geometry effect” of the
curvature of space, which determines the relation between the comoving angular
diameter distance $d_A^c$ and the comoving distance $d^c$, is determined by Ω0 .

1. The comoving angular diameter distance is inversely proportional to H0 (directly
proportional to the Hubble distance $H_0^{-1}$).

2. Increasing Ω0 decreases $d_A^c(t_{\rm dec})$ in relation to $d^c(t_{\rm dec})$ because of the geometry
effect.

3. With a fixed ΩΛ0 , increasing Ω0 decreases $d_A^c(t_{\rm dec})$, since it means increasing
Ωm0 , which has a decelerating effect on the expansion. With a fixed present
expansion rate H0 , deceleration means that expansion was faster earlier ⇒
universe is younger ⇒ there is less time for photons to travel as the universe
cools from Tdec to T0 ⇒ last scattering surface is closer to us.

4. Increasing ΩΛ0 (with a fixed Ω0 ) increases $d_A^c(t_{\rm dec})$, since it means a larger
part of the energy density is in dark energy, which has an accelerating effect
on the expansion. With fixed H0 , this means that expansion was slower in
the past ⇒ universe is older ⇒ more time for photons ⇒ last
scattering surface is further out ⇒ ΩΛ0 increases $d_A^c(t_{\rm dec})$.

Here 2 and 3 work in the same direction: increasing Ω0 decreases $d_A^c(t_{\rm dec})$, but the
geometry effect (2) is stronger. See figure 13 for the case ΩΛ0 = 0, where the dashed
line (the comoving distance) shows effect (3) and the solid line (the comoving angular
diameter distance) the combined effect (2) and (3).
However, now we have to take into account that, in our chosen parametrisation,
H0 is not an independent parameter, but
$$H_0^{-1} \propto \sqrt{\frac{\Omega_0 - \Omega_{\Lambda 0}}{\omega_m}} ,$$
so that via $H_0^{-1}$, increasing Ω0 increases and increasing ΩΛ0 decreases $d_A^c(t_{\rm dec})$, which are the opposite
effects to those discussed above. For ΩΛ0 this opposite effect wins. See Figs. 21 and
22.
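
The competition between these effects can be explored by evaluating (12.103) numerically in the (Ω0, ΩΛ0, ωm) parametrisation. The sketch below is a minimal illustration, not the calculation behind the figures: it neglects radiation as in (12.103), uses a simple composite Simpson rule, and replaces sinh by sin for closed models (Ω0 > 1). The function name and the step count are our own choices.

```python
import math

H100_INV_MPC = 2997.92  # Hubble distance for h = 1 in Mpc, from (12.108)

def comoving_angular_diameter_distance(omega_0, omega_lambda0, w_m, z_dec=1090.0, n=4000):
    """d_A^c(t_dec) in Mpc from (12.103), with H0 fixed by (12.101). n must be even."""
    h = math.sqrt(w_m / (omega_0 - omega_lambda0))  # (12.101)
    a_dec = 1.0 / (1.0 + z_dec)

    def integrand(a):
        return 1.0 / math.sqrt(omega_0 * (a - a**2) - omega_lambda0 * (a - a**4) + a**2)

    # Composite Simpson rule on [a_dec, 1].
    step = (1.0 - a_dec) / n
    total = integrand(a_dec) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(a_dec + i * step)
    integral = total * step / 3.0

    curvature = 1.0 - omega_0
    if abs(curvature) < 1e-12:      # flat: sinh(sqrt(K) x)/sqrt(K) -> x
        x = integral
    elif curvature > 0.0:           # open
        x = math.sinh(math.sqrt(curvature) * integral) / math.sqrt(curvature)
    else:                           # closed: sinh of imaginary argument becomes sin
        x = math.sin(math.sqrt(-curvature) * integral) / math.sqrt(-curvature)
    return H100_INV_MPC / h * x

d_ref = comoving_angular_diameter_distance(1.0, 0.7, 0.147)
d_more_lambda = comoving_angular_diameter_distance(1.0, 0.8, 0.147)
print(f"reference: {d_ref / 1e3:.2f} Gpc, Omega_Lambda0 = 0.8: {d_more_lambda / 1e3:.2f} Gpc")
```

At fixed ωm, raising ΩΛ0 from 0.7 to 0.8 lowers the distance in this sketch, in line with the statement that for ΩΛ0 the effect via $H_0^{-1}$ wins.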

Sound horizon
To calculate the comoving sound horizon,
$$r_s^c(t_{\rm dec}) = a_0\int_0^{t_{\rm dec}}\frac{c_s(t)}{a(t)}\,dt = \int_0^{t_{\rm dec}} c_s(t)\frac{dt}{a} = \int_0^{a_{\rm dec}} c_s(a)\frac{da}{a\cdot(da/dt)} , \tag{12.104}$$
we need the speed of sound from (12.71),
$$c_s^2(a) = \frac{1}{3}\frac{1}{1 + \frac{3}{4}\frac{\bar{\rho}_b}{\bar{\rho}_\gamma}} = \frac{1}{3}\frac{1}{1 + \frac{3}{4}\frac{\omega_b}{\omega_\gamma}a} . \tag{12.105}$$
The upper limit of the integral (12.104) is $a_{\rm dec} = 1/(1 + z_{\rm dec})$.
The other element in the integrand of (12.104) is the expansion law a(t) before
decoupling. We have
$$a\frac{da}{dt} = H_0\sqrt{\Omega_{\Lambda 0}a^4 + (1 - \Omega_0)a^2 + \Omega_{m0}a + \Omega_{r0}} . \tag{12.106}$$
In the integral (12.103) we dropped the Ωr0 term, since it is important only at early times,
and the integral from adec to 1 is dominated by late times. Integral (12.104), on the
other hand, includes only early times, and now we can instead drop the ΩΛ0 and
1 − Ω0 terms (i.e., we can ignore the effect of curvature and dark energy in the early
universe, before photon decoupling), so that
$$a\frac{da}{dt} \approx H_0\sqrt{\Omega_{m0}a + \Omega_{r0}} = H_{100}\sqrt{\omega_m a + \omega_r} = \frac{\sqrt{\omega_m a + \omega_r}}{2998\ {\rm Mpc}} , \tag{12.107}$$
where we have written
$$H_0 \equiv h\cdot 100\ \frac{\rm km/s}{\rm Mpc} \equiv h\cdot H_{100} = \frac{h}{2997.92\ {\rm Mpc}} . \tag{12.108}$$
Thus the sound horizon is given by
$$r_s^c(a) = 2998\ {\rm Mpc}\int_0^a\frac{c_s(a)\,da}{\sqrt{\omega_m a + \omega_r}} = 2998\ {\rm Mpc}\cdot\frac{1}{\sqrt{3\omega_r}}\int_0^a\frac{da}{\sqrt{\left(1 + \frac{\omega_m}{\omega_r}a\right)\left(1 + \frac{3}{4}\frac{\omega_b}{\omega_\gamma}a\right)}} . \tag{12.109}$$
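Here
$$\omega_\gamma = 2.4702\times 10^{-5} \tag{12.110}$$
and
$$\omega_r = \left[1 + \frac{7}{8}N_\nu\left(\frac{4}{11}\right)^{4/3}\right]\omega_\gamma = 1.6904\,\omega_\gamma = 4.1756\times 10^{-5} \tag{12.111}$$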

are accurately known from the CMB temperature T0 = 2.725 K (and therefore we
do not consider them as cosmological parameters in the sense of something to be
determined from the Cℓ spectrum).
Thus the sound horizon depends on the two cosmological parameters ωm and ωb ,

$$r_s^c(t_{\rm dec}) = r_s^c(\omega_m, \omega_b) .$$

From (12.109) we see that increasing either ωm or ωb makes the sound horizon at
decoupling, $r_s^c(a_{\rm dec})$, shorter:

• ωb slows the sound down

• ωm speeds up the expansion at a given temperature, so the universe cools to
Tdec in less time.

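The integral (12.109) can be done and it gives
$$r_s^c(t_{\rm dec}) = \frac{2998\ {\rm Mpc}}{\sqrt{1 + z_{\rm dec}}}\,\frac{2}{\sqrt{3\omega_m R_*}}\,\ln\frac{\sqrt{1 + R_*} + \sqrt{R_* + r_* R_*}}{1 + \sqrt{r_* R_*}} , \tag{12.112}$$
where
$$r_* \equiv \frac{\bar{\rho}_r(t_{\rm dec})}{\bar{\rho}_m(t_{\rm dec})} = \frac{\omega_r}{\omega_m}\frac{1}{a_{\rm dec}} = 0.0459\,\frac{1}{\omega_m}\,\frac{1 + z_{\rm dec}}{1100} \tag{12.113}$$
$$R_* \equiv \frac{3\bar{\rho}_b(t_{\rm dec})}{4\bar{\rho}_\gamma(t_{\rm dec})} = \frac{3\omega_b}{4\omega_\gamma}a_{\rm dec} = 27.6\,\omega_b\,\frac{1100}{1 + z_{\rm dec}} . \tag{12.114}$$

For our reference values ωm = 0.147, ωb = 0.022, and 1 + zdec = 1100¹⁰ we get r∗ =
0.312 and R∗ = 0.607 and $r_s^c(t_{\rm dec})$ = 143 Mpc for the sound horizon at decoupling.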
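
The closed form (12.112) can be checked against a direct numerical integration of (12.109). The following sketch does both for the reference values; the function names and the use of a simple composite Simpson rule are our own choices, and the constants are those of (12.110) and (12.111).

```python
import math

HUBBLE_DIST_MPC = 2998.0  # Hubble distance for h = 1, as used in (12.109)
W_GAMMA = 2.4702e-5       # (12.110)
W_R = 4.1756e-5           # (12.111)

def sound_horizon_closed_form(w_m, w_b, one_plus_z_dec=1100.0):
    """Comoving sound horizon at decoupling in Mpc from (12.112)-(12.114)."""
    a_dec = 1.0 / one_plus_z_dec
    r_star = W_R / (w_m * a_dec)                    # (12.113)
    R_star = 3.0 * w_b * a_dec / (4.0 * W_GAMMA)    # (12.114)
    log_term = math.log((math.sqrt(1.0 + R_star) + math.sqrt(R_star + r_star * R_star))
                        / (1.0 + math.sqrt(r_star * R_star)))
    return HUBBLE_DIST_MPC * math.sqrt(a_dec) * 2.0 / math.sqrt(3.0 * w_m * R_star) * log_term

def sound_horizon_numeric(w_m, w_b, one_plus_z_dec=1100.0, n=2000):
    """The same quantity from Simpson integration of (12.109). n must be even."""
    a_dec = 1.0 / one_plus_z_dec

    def integrand(a):
        return 1.0 / math.sqrt((1.0 + w_m / W_R * a) * (1.0 + 0.75 * w_b / W_GAMMA * a))

    step = a_dec / n
    total = integrand(0.0) + integrand(a_dec)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    return HUBBLE_DIST_MPC / math.sqrt(3.0 * W_R) * total * step / 3.0

rs_closed = sound_horizon_closed_form(0.147, 0.022)
rs_numeric = sound_horizon_numeric(0.147, 0.022)
print(f"closed form: {rs_closed:.1f} Mpc, numerical: {rs_numeric:.1f} Mpc")
```

Both routes give about 143 Mpc for the reference model, and raising either ωm or ωb in the function call shortens the result, as stated above.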

Summary
The angular diameter distance $d_A^c(t_{\rm dec})$ is most naturally discussed in terms of
H0 , Ω0 , and ΩΛ0 , but since these are not the most convenient choice of independent
parameters for other purposes, we shall trade H0 for ωm according to (12.101). Thus
we see that the sound horizon angle depends on 4 parameters,
$$\theta_s \equiv \frac{r_s^c(\omega_m, \omega_b)}{d_A^c(\Omega_0, \Omega_{\Lambda 0}, \omega_m)} = \theta_s(\Omega_0, \Omega_{\Lambda 0}, \omega_m, \omega_b) . \tag{12.115}$$

If we keep ωm and ωb fixed, we have $r_s^c(t_{\rm dec})$ = 143 Mpc. From the observed
model-independent value θs = 0.593° ± 0.001° [1] we then have $d_A^c$ = 13.8 Gpc
≈ 4.6h $H_0^{-1}$ ≈ 3$H_0^{-1}$, where in the last equality we have taken h = 0.7. For the
Einstein-de Sitter model we have $d_A^c(1090)$ ≈ 1.97$H_0^{-1}$ ≈ 8.4 Gpc, so the observed
distance to the last scattering surface is about 50% larger than predicted by the
FRW model without dark energy or spatial curvature.
We get a rough estimate of the angular diameter distance from the observed
angular size of the extrema on the CMB sky as follows.
√1 dhor (tdec )
rsc (tdec )
dcA (zdec )
= ≈ 3 (1 + zdec )
θs θs
180◦ √
≈ 3tdec (1 + zdec ) ≈ 21 Gpc , (12.116)
πθs (◦ )

where we have approximated r_s = d_hor/√3 and d_hor = 3t, and θs(°) is θs in degrees.
This value is within a factor of 2 of the real result. However, the difference between
the observation and the Einstein-de Sitter result for d_A^c is only about 50%, so this
rough approximation cannot be used to rule out the Einstein-de Sitter model; we have to
use a more precise value for the sound horizon.
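The rough estimate (12.116) can be checked numerically. The value of tdec is not given in this section, so the sketch below assumes tdec ≈ 380 000 years (an assumption, as is the light-year to Mpc conversion constant) and converts it to a length with c = 1.

```python
import math

LIGHT_YEARS_PER_MPC = 3.2616e6  # 1 Mpc expressed in light years


def rough_distance_mpc(t_dec_years, one_plus_zdec, theta_s_deg):
    """Rough d_A^c from r_s = d_hor/sqrt(3) and d_hor = 3t, as in (12.116)."""
    t_dec_mpc = t_dec_years / LIGHT_YEARS_PER_MPC        # c * t_dec as a length
    r_s_comoving = math.sqrt(3.0) * t_dec_mpc * one_plus_zdec
    return r_s_comoving / math.radians(theta_s_deg)


if __name__ == "__main__":
    # t_dec = 380 000 yr is an assumed input, not a value taken from this section.
    d_rough = rough_distance_mpc(380_000.0, 1100.0, 0.593)
    print(d_rough / 1000.0, "Gpc")
    print(d_rough / 13800.0, "times the value obtained with the precise sound horizon")
```

With these inputs the estimate comes out near the 21 Gpc of (12.116), well within a factor of 2 of 13.8 Gpc but too far off to discriminate between models.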

12.9.3 Acoustic peak heights


There are a number of effects which affect the heights of the acoustic peaks:

1. The early ISW effect. The early ISW effect raises the first peak. It is
caused by the evolution of Φ because of the effect of the radiation contribution
on the expansion law after tdec . This depends on the radiation-matter ratio at
that time; decreasing ωm makes the early ISW effect stronger.

Footnote 10: Photon decoupling temperature, and thus 1 + zdec, depends somewhat on ωb,
but since this dependence is not easy to calculate (recombination and photon decoupling
were discussed in chapter 5), we have mostly ignored this dependence and used the fixed
value 1 + zdec = 1100.

2. Shift of oscillation equilibrium by baryons. (Baryon drag.) This makes
the odd peaks (which correspond to compression of the baryon-photon fluid
in the potential wells, decompression on potential hills) higher, and the even
peaks (decompression at potential wells, compression on top of potential hills)
lower.

3. Baryon damping. The time evolution of R ≡ 3ρ̄b/4ρ̄γ causes the amplitude
of the acoustic oscillations to be damped in time roughly as (1 + R)^(−1/4). This
reduces the amplitudes of all peaks.

4. Radiation driving (footnote 11). This is an effect related to horizon scale
physics that we have not tried to properly calculate. For scales k which enter
during the radiation-dominated epoch, or near matter-radiation equality, the
potential Φ decays around the time when the scale enters. The potential keeps
changing as long as the radiation contribution is important, but the largest change
in Φ is around horizon entry. Because the sound horizon and Hubble length are
comparable, horizon entry and the corresponding potential decay always happen
during the first oscillation period. This means that the baryon-photon fluid is
falling into a deep potential well, and therefore is compressed by gravity by a
large factor, before the resulting overpressure is able to push it out. Meanwhile
the potential has decayed, so it is less able to resist the decompression phase,
and the overpressure is able to kick the fluid further out of the well. This
increases the amplitude of the acoustic oscillations. The effect is stronger for
the smaller scales which enter when the universe is more radiation-dominated,
and therefore raises the peaks with a larger peak number n more. Reducing
ωm makes the universe more radiation dominated, making this effect stronger
and extending it towards the peaks with lower peak number n.

5. Diffusion damping. Diffusion damping lowers the heights of the peaks. It
acts in the opposite direction to the radiation driving effect, lowering the
peaks with a larger peak number n more. Because the diffusion damping
effect is exponential in ℓ, it wins for large ℓ.

Effects 1 and 4 depend on ωm , effects 2, 3, and 5 on ωb . See Figs. 19 and 20 for the
effects of ωm and ωb on peak heights.
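To get a feeling for the size of the baryon damping in item 3, the sketch below evaluates the factor (1 + R)^(−1/4) at decoupling, with R∗ taken from (12.114). This is only the rough scaling quoted above, not a calculation of the peak heights, and the function names are illustrative.

```python
def baryon_loading_at_decoupling(omega_b, one_plus_zdec=1100.0):
    """R_* = 3*rho_b / (4*rho_gamma) at decoupling, eq. (12.114)."""
    return 27.6 * omega_b * 1100.0 / one_plus_zdec


def baryon_damping_factor(omega_b, one_plus_zdec=1100.0):
    """Rough amplitude reduction (1 + R)^(-1/4) evaluated at decoupling."""
    return (1.0 + baryon_loading_at_decoupling(omega_b, one_plus_zdec)) ** -0.25


if __name__ == "__main__":
    for omega_b in (0.01, 0.02, 0.03, 0.04):
        print(omega_b, baryon_loading_at_decoupling(omega_b),
              baryon_damping_factor(omega_b))
```

The factor decreases slowly with ωb, which is in line with the general downward trend with increasing ωb noted for Figure 20.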

12.9.4 Effect of Ω0 and ΩΛ0


These two parameters have only two effects:

1. they affect the sound horizon angle and thus the positions of the acoustic peaks

2. they affect the late ISW effect

See Figs. 21 and 22. Since the late ISW effect is in the region of the Cℓ spectrum
where the cosmic variance is large, it is difficult to detect. Thus we can in practice
only use θs to determine Ω0 and ΩΛ0. Since ωb and ωm can be determined quite
accurately from Cℓ acoustic peak heights, peak separation, i.e., θs, can then indeed
be used for the determination of Ω0 and ΩΛ0. Since one number cannot be used
to determine two, the parameters Ω0 and ΩΛ0 are degenerate. CMB observations
alone cannot be used to determine them both. Other cosmological observations (like
the power spectrum Pδ(k) from large scale structure, or the SNIa redshift-distance
relationship) are needed to break this degeneracy.

Footnote 11: This is also called gravitational driving, which is perhaps more
appropriate, since the effect is due to the change in the gravitational potential.

[Figure 19 image: two panels, ωb = 0.01 (left) and ωb = 0.03 (right), showing
ℓ(ℓ+1)Cℓ/2π versus ℓ (0 to 2000) for ωm = 0.10, 0.20, 0.30, 0.40.]

Figure 19: The effect of ωm. The angular power spectrum Cℓ is here calculated without
the effect of diffusion damping, so that the other effects on peak heights could be seen more
clearly. Notice how reducing ωm raises all peaks, but the effect on the first few peaks is
stronger in relative terms, as the radiation driving effect is extended towards larger scales
(smaller ℓ). The first peak is raised mainly because the ISW effect becomes stronger. Figure
and calculation by R. Keskitalo.

[Figure 20 image: ℓ(ℓ+1)Cℓ/2π versus ℓ (0 to 2000) for ωb = 0.01, 0.02, 0.03, 0.04.]

Figure 20: The effect of ωb. The angular power spectrum Cℓ is here calculated without
the effect of diffusion damping, so that the other effects on peak heights could be seen more
clearly. Notice how increasing ωb raises odd peaks relative to the even peaks. Because of
baryon damping there is a general trend downwards with increasing ωb. This figure is for
ωm = 0.20. Figure and calculation by R. Keskitalo.
A fixed θs together with fixed ωb and ωm determines a line on the (Ω0, ΩΛ0)
plane. See figure 23. Derived parameters, e.g., h, vary along that line. As you can
see from Figs. 21 and 22, changing Ω0 (around the reference model) affects θs much
more strongly than changing ΩΛ0 . This means that the orientation of the line is such
that ΩΛ0 varies more rapidly along that line than Ω0 . Therefore using additional
constraints from other cosmological observations, e.g. the Hubble Space Telescope
determination of h based on the distance ladder, which select a short section from
that line, gives us a fairly good determination of Ω0 , leaving the allowed range for
ΩΛ0 still quite large.
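The degeneracy can be illustrated numerically. The sketch below computes the comoving angular diameter distance to decoupling for a given (Ωm0, ΩΛ0) while holding ωm fixed, so that h becomes a derived parameter, and converts it to the acoustic scale ℓA ≡ π/θs using the fixed sound horizon of 143 Mpc. It is a minimal sketch under stated assumptions: radiation is included with ωr = 0.0459/1100 as implied by (12.113), the total density is taken as Ω0 = Ωm0 + ΩΛ0 + Ωr0, the integral is done with Simpson's rule in ln(1 + z), and the function names are illustrative.

```python
import math

C_OVER_H100_MPC = 2998.0          # c / (100 km/s/Mpc) in Mpc
OMEGA_R_PHYS = 0.0459 / 1100.0    # omega_r implied by (12.113) (assumption)
R_S_MPC = 143.0                   # sound horizon for the reference omega_m, omega_b


def distance_to_last_scattering(omega_m0, omega_l0, omega_m=0.147,
                                one_plus_zdec=1100.0, n=4000):
    """Return (d_A^c in Mpc, derived h) for fixed physical matter density omega_m."""
    h = math.sqrt(omega_m / omega_m0)          # h follows from omega_m = Omega_m0 h^2
    omega_r0 = OMEGA_R_PHYS / h ** 2
    omega_k0 = 1.0 - omega_m0 - omega_l0 - omega_r0

    def integrand(x):
        # x = ln(1 + z), so dz / E(z) = exp(x) dx / E
        e_squared = (omega_r0 * math.exp(4 * x) + omega_m0 * math.exp(3 * x)
                     + omega_k0 * math.exp(2 * x) + omega_l0)
        return math.exp(x) / math.sqrt(e_squared)

    x_max = math.log(one_plus_zdec)
    step = x_max / n                           # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    chi = total * step / 3.0                   # comoving distance in units of c/H0

    if omega_k0 < -1e-8:                       # closed: Omega_0 > 1
        s = math.sqrt(-omega_k0)
        d = math.sin(s * chi) / s
    elif omega_k0 > 1e-8:                      # open: Omega_0 < 1
        s = math.sqrt(omega_k0)
        d = math.sinh(s * chi) / s
    else:
        d = chi
    return C_OVER_H100_MPC / h * d, h


def acoustic_scale(omega_m0, omega_l0):
    """l_A = pi / theta_s = pi * d_A^c / r_s^c."""
    d_a, _ = distance_to_last_scattering(omega_m0, omega_l0)
    return math.pi * d_a / R_S_MPC


if __name__ == "__main__":
    for omega_m0, omega_l0 in ((0.3, 0.7), (1.3, 0.0)):
        d_a, h = distance_to_last_scattering(omega_m0, omega_l0)
        print(omega_m0, omega_l0, "h =", h, "d_A =", d_a,
              "l_A =", acoustic_scale(omega_m0, omega_l0))
```

The two example models, a flat one with ΩΛ0 = 0.7 and a closed one without dark energy, have very different derived h but give nearly the same distance and hence nearly the same ℓA. This is the sense in which a single measured angle fixes only a line on the plane.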
Therefore it is often said that CMB measurements have determined that Ω0 ∼ 1,
i.e. that the universe is spatially flat. However, this is misleading. First, the
CMB only determines the angular diameter distance to the last scattering surface.
Determining the spatial curvature from this requires knowing the expansion history
H(z), in other words the constraints on the spatial curvature are model-dependent.
Second, even restricting to the ΛCDM model, we need to use some other cosmological data
to fix H0 . So the correct statement is that assuming that the universe is described
by the ΛCDM model, and given constraints on the Hubble parameter today, the
CMB data shows the universe to be close to spatially flat.

12.9.5 Effect of the primordial spectrum


The effect of the primordial spectrum is simple: raising the amplitude A raises the
Cℓ also, and changing the primordial spectral index tilts Cℓ . See Figs. 24 and 25.
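The two effects can be made concrete with a toy power-law spectrum. The sketch below assumes the form P(k) = A² (k/kp)^(n−1); the quadratic dependence on A is suggested by the definition r ≡ A_T²/A² used later in this chapter, but it is an assumption here, and the pivot scale kp = 0.05 Mpc⁻¹ is a hypothetical choice made only for illustration.

```python
def primordial_power(k, amplitude=1.0, n=1.0, k_pivot=0.05):
    """Assumed power-law primordial spectrum A^2 * (k / k_pivot)^(n - 1)."""
    return amplitude ** 2 * (k / k_pivot) ** (n - 1.0)


if __name__ == "__main__":
    scales = (0.001, 0.05, 0.2)  # wavenumbers in 1/Mpc, illustrative only
    for amplitude, n in ((1.0, 1.0), (1.1, 1.0), (1.0, 0.9), (1.0, 1.1)):
        row = [primordial_power(k, amplitude, n) for k in scales]
        print("A =", amplitude, "n =", n, row)
```

Changing A rescales the spectrum by the same factor at every k, while n different from 1 changes the ratio of power between large and small scales, pivoting around kp.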

12.9.6 Optical depth due to reionisation


When radiation from the first stars reionises the intergalactic gas, CMB photons may
scatter from the resulting free electrons. The optical depth τ due to reionisation
is the expected number of such scatterings per CMB photon. It has a value of
about τ = 0.09 ± 0.02, i.e., most CMB photons do not scatter at all. The rescattering
causes additional polarisation of the CMB, and CMB polarisation measurements are
actually the best way to determine τ.
Because of this scattering, not all CMB photons come from the location on
the last scattering surface they seem to come from. The effect of the rescattered
photons is to mix up signals from different directions and therefore reduce the CMB
anisotropy. The reduction factor on δT/T is e^(−τ) and on the Cℓ spectrum e^(−2τ).
However, this does not affect the largest scales, scales larger than the area from
which the rescattered photons reaching us from a certain direction originally came.
Such a large-scale anisotropy has affected all such photons the same way, and
thus is not lost in the mixing. See figure 26.
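For the quoted value of τ the suppression factors are easy to evaluate. The sketch below simply computes e^(−τ) and e^(−2τ) for the central value and for the ends of the quoted range. The function names are illustrative.

```python
import math


def temperature_suppression(tau):
    """Reduction factor exp(-tau) on the anisotropy dT/T at small scales."""
    return math.exp(-tau)


def power_suppression(tau):
    """Reduction factor exp(-2 tau) on the C_l spectrum at small scales."""
    return math.exp(-2.0 * tau)


if __name__ == "__main__":
    for tau in (0.07, 0.09, 0.11):
        print(tau, temperature_suppression(tau), power_suppression(tau))
```

For τ = 0.09 the Cℓ at high ℓ are reduced by roughly a sixth, an effect that is hard to distinguish from a lower primordial amplitude in the temperature spectrum alone.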

12.9.7 Effect of ωb and ωm


These parameters affect both the positions of the acoustic peaks (through θs) and
the heights of the different peaks. The latter effect is the more important, since
both parameters have their own signature on the peak heights, allowing an accurate
determination of these parameters, whereas the effect on θs is degenerate with Ω0
and ΩΛ0.

[Figure 21 image: ℓ(ℓ+1)Cℓ/2π versus ℓ for Ω0 = 1.1, 1.0, 0.9. Top panel: linear
ℓ axis (0 to 1400). Bottom panel: logarithmic ℓ axis.]

Figure 21: The effect of changing Ω0 from its reference value Ω0 = 1. The top panel
shows the Cℓ spectrum with a linear ℓ scale so that details at larger ℓ where cosmic variance
effects are smaller can be better seen. The bottom plot has a logarithmic ℓ scale so that
the integrated Sachs-Wolfe effect at small ℓ can be better seen. The logarithmic scale also
makes clear that the effect of the change in sound horizon angle is to stretch the spectrum
by a constant factor in ℓ space.

[Figure 22 image: ℓ(ℓ+1)Cℓ/2π versus ℓ for ΩΛ0 = 0.80, 0.70, 0.60. Top panel: linear
ℓ axis (0 to 1400). Bottom panel: logarithmic ℓ axis.]

Figure 22: The effect of changing ΩΛ0 from its reference value ΩΛ0 = 0.7.

[Figure 23 image: contour plot titled "Distance between successive acoustic peaks (∆ℓ)"
on the (Ωm, ΩΛ) plane, for ωb = 0.022, ωm = 0.147, with h a derived parameter. The
regions "no big bang", "accel/decel" and "open/closed" are marked.]

Figure 23: The lines of constant sound horizon angle θs on the (Ωm0, ΩΛ0) plane for fixed
ωb and ωm. The numbers on the lines refer to the corresponding acoustic scale ℓA ≡ π/θs
(∼ peak separation) in multipole space. Figure by J. Väliviita. See his PhD thesis [5], p. 70,
for an improved version including the HST constraint on h.
Especially ωb has a characteristic effect on peak heights: increasing ωb raises the
odd peaks and reduces the even peaks, because it shifts the balance of the acoustic
oscillations (the −RΦ effect). This shows most clearly at the first and second
peaks (footnote 12). Raising ωb also shortens the damping scale kD⁻¹ due to photon
diffusion, moving the corresponding damping scale ℓD of the Cℓ spectrum towards larger ℓ.
This has the effect of raising Cℓ at large ℓ. See figure 27.
Increasing ωm makes the universe more matter dominated at tdec and therefore
it reduces the early ISW effect, making the first peak lower. This also affects the
shape of the first peak.
The “radiation driving” effect is most clear at the second to fourth peaks. Reducing
ωm makes these peaks higher by making the universe more radiation-dominated
at the time the scales corresponding to these peaks enter, and thus strengthening
this radiation driving. The fifth and further peaks correspond to scales that have
essentially the full effect in any case, whereas for the first peak this effect is
weak. (We see instead the ISW effect in the first peak.) See figure 28.
Footnote 12: There is also an overall “baryon damping effect” on the acoustic
oscillations which we have not calculated. It is due to the time dependence of
R ≡ 3ρ̄b/4ρ̄γ, which reduces the amplitude of the oscillation by about (1 + R)^(−1/4).
This explains why the third peak in figure 27 is no higher for ωb = 0.030 than it is
for ωb = 0.022.

[Figure 24 image: ℓ(ℓ+1)Cℓ/2π versus ℓ for A = 1.1, 1, 0.9. Top panel: linear
ℓ axis (0 to 1400). Bottom panel: logarithmic ℓ axis.]

Figure 24: The effect of changing the primordial amplitude from its reference value A = 1.

[Figure 25 image: ℓ(ℓ+1)Cℓ/2π versus ℓ for n = 1.1, 1, 0.9. Top panel: linear
ℓ axis (0 to 1400). Bottom panel: logarithmic ℓ axis.]

Figure 25: The effect of changing the spectral index from its reference value n = 1.

[Figure 26 image: ℓ(ℓ+1)Cℓ/2π versus ℓ for τ = 0.20, 0.10, 0. Top panel: linear
ℓ axis (0 to 1400). Bottom panel: logarithmic ℓ axis.]

Figure 26: The effect of changing the optical depth from its reference value τ = 0.1.

[Figure 27 image: ℓ(ℓ+1)Cℓ/2π versus ℓ for ωb = 0.030, 0.022, 0.015. Top panel: linear
ℓ axis (0 to 1400). Bottom panel: logarithmic ℓ axis.]

Figure 27: The effect of changing the physical baryon density parameter from its reference
value ωb = 0.022.

[Figure 28 image: ℓ(ℓ+1)Cℓ/2π versus ℓ for ωm = 0.200, 0.147, 0.100. Top panel: linear
ℓ axis (0 to 1400). Bottom panel: logarithmic ℓ axis.]

Figure 28: The effect of changing the physical matter density parameter from its reference
value ωm = 0.147.

12.10 Best values of the cosmological parameters


The most important cosmological data set for determining the values for the cosmo-
logical parameters is the Planck satellite data on the CMB anisotropy. For high ℓ,
it can be supplemented with CMB measurements from ground-based and balloon-
borne instruments with higher resolution but poorer sensitivity and sky coverage.
The most accurate measurements for the higher multipoles to date are from the
Arcminute Cosmology Bolometer Array Receiver (ACBAR) [3] and the South Pole
Telescope (SPT) [4].
Because of degeneracies of cosmological parameters in the CMB data, most im-
portantly the fact that the CMB is sensitive to the vacuum energy and spatial
curvature mostly via the angular diameter distance, CMB observations have to be
supplemented by other cosmological data for a good determination of the main cos-
mological parameters.
Large scale structure surveys, i.e. the measurement of the 3-dimensional matter
power spectrum Pδ(k) from the distribution of galaxies, mainly measure the
combination Ωm0 h, since this determines where Pδ(k) turns down. The turn is at keq,
which is proportional to ωm ≡ Ωm0 h², but since in these surveys the distances to
the galaxies are deduced from their redshifts (therefore these surveys are also called
galaxy redshift surveys), which give the distances only up to the Hubble constant
H0, these surveys determine h⁻¹ keq instead of keq. This cancels one power of h.
Having Ωm0 h² from the CMB and Ωm0 h from the galaxy surveys gives us both h and
Ωm0 = Ω0 − ΩΛ0, which breaks the Ω0-ΩΛ0 degeneracy.
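The algebra of combining the two measurements is elementary: dividing Ωm0 h² by Ωm0 h gives h, and (Ωm0 h)² divided by Ωm0 h² gives Ωm0. The sketch below spells this out. The input Ωm0 h = 0.21 is a hypothetical illustrative value, not a survey result, chosen so that the answer matches the reference model used in this chapter.

```python
def solve_h_and_omega_m0(omega_m_h2, omega_m_h):
    """Solve for (h, Omega_m0) given Omega_m0*h^2 (CMB) and Omega_m0*h (galaxy surveys)."""
    h = omega_m_h2 / omega_m_h
    omega_m0 = omega_m_h ** 2 / omega_m_h2
    return h, omega_m0


if __name__ == "__main__":
    # 0.147 is the reference omega_m of this chapter; 0.21 is a made-up example input.
    h, omega_m0 = solve_h_and_omega_m0(0.147, 0.21)
    print("h =", h, "Omega_m0 =", omega_m0)
```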
These measurements of Pδ(k) are now so accurate that the small residual effect
from the baryon acoustic oscillations (BAO) before photon decoupling can be seen
as a weak wavy pattern [6]. This is the same structure which we see in the Cℓ
but now much fainter, since now the baryons have fallen into the CDM potential
wells, and the CDM was only mildly affected by these oscillations in the
baryon-photon fluid. The half-wavelength of this pattern, however, corresponds to the
same sound horizon distance r_s^c(t_dec) in both cases (footnote 13). The redshift at
which the pattern is seen is however much smaller, so this gives a measurement of
d_A^c(z) at a different redshift (footnote 14). The measurement in [6] uses the
2-degree Field Galaxy Redshift Survey (2dFGRS) and the Sloan Digital
Sky Survey (SDSS). Another way to break the Ω0-ΩΛ0 degeneracy is to use the
redshift-distance relationship from Supernova Type Ia surveys.
However, the more datasets one puts together, the more assumptions are involved
in the analysis, so constraints from large combinations of data should be treated
cautiously. In Table I we give values for the standard parameters from the analysis
of the 1.5-year Planck data [7]. It has been assumed that Ω0 = 1. The first column
gives the mean value and the error bars (footnote 15) for the Planck 1.5-year data
only, and in the second column a measurement of polarisation from the WMAP satellite
(WP), large multipole data from ground-based CMB experiments (highL) and data from
baryon acoustic oscillations (see above) have also been used. In Table II we list some
related derived parameters, and in Table III we give limits on some non-standard
parameters. Note that in Table III the error bars are the 95% confidence limits
(instead of the usual 68% confidence limits), and the first column is Planck data
plus WMAP polarisation data.

Footnote 13: To be accurate, the best tdec value to represent the effect in Pδ(k) is
not exactly the same as for Cℓ, since photon decoupling was not instantaneous, and in
the galaxies we are looking at the effect on matter and in the CMB at the effect on
photons.

Footnote 14: In fact, the BAO signal gives a combination of dA(z) and H(z).

Footnote 15: The upper and lower limits are “16- and 84-percentiles” which means that
there is some relation to having a formal 68% probability that the correct value is in
this range. The probability interpretation has some subtleties however; we will not go
into the matter here.

Table I: Standard parameters

Parameter   Planck only        Planck + WP + highL + BAO
100ωb       2.217 ± 0.033      2.214 ± 0.024
ωc          0.1186 ± 0.0031    0.1187 ± 0.0017
n           0.9635 ± 0.0094    0.9608 ± 0.0054
ΩΛ0         0.693 ± 0.019      0.692 ± 0.010
τ           0.089 ± 0.032      0.092 ± 0.013

Table II: Derived parameters

Parameter   Planck only        Planck + WP + highL + BAO
Ωm0         0.307 ± 0.019      0.308 ± 0.010
100h        67.9 ± 1.5         67.8 ± 0.77
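The entries of Tables I and II are not independent, which allows a quick consistency check. The sketch below uses the "Planck only" central values and compares ωb + ωc with Ωm0 h², checks that Ωm0 + ΩΛ0 = 1 as assumed, and tests whether ωb falls inside the BBN range 0.019 ≤ ωb ≤ 0.024 discussed in this section. Uncertainties are ignored, so only rough agreement should be expected; small contributions to the matter density other than baryons and CDM are also neglected in the sum.

```python
# "Planck only" central values from Tables I and II
omega_b = 2.217 / 100.0
omega_c = 0.1186
omega_lambda0 = 0.693
omega_m0 = 0.307
h = 67.9 / 100.0

omega_m_from_table_1 = omega_b + omega_c      # physical matter density from Table I
omega_m_from_table_2 = omega_m0 * h ** 2      # same quantity from the derived parameters
total_density = omega_m0 + omega_lambda0      # should equal the assumed Omega_0 = 1
bbn_consistent = 0.019 <= omega_b <= 0.024    # BBN range quoted in the text

if __name__ == "__main__":
    print("omega_b + omega_c =", omega_m_from_table_1)
    print("Omega_m0 h^2      =", omega_m_from_table_2)
    print("Omega_m0 + Omega_Lambda0 =", total_density)
    print("omega_b within BBN range:", bbn_consistent)
```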
The BBN limit 0.019 ≤ ωb ≤ 0.024 has not been used here, but we see that the
constraint on the baryon density coming from the CMB is consistent with the BBN
value. The agreement between these two independent datasets (the abundances
of light elements and anisotropies on the microwave sky), one of which probes the
physics at around a couple of minutes and the other at around 400 000 years, is
remarkable. This increases our confidence that the basic physical picture of the
evolution of the universe is correct. Indeed, BBN and CMB are two of the most important
pieces of observational support for the standard cosmological model.
The parameters in Table III are derived under the assumption that the non-standard
parameters other than the one being considered remain at their standard values. The CMB
alone does not give good constraints on the spatial curvature or the dark energy
equation of state (since they mostly only affect d_A^c(zdec), and are thus degenerate
with ΩΛ0). In fact, the CMB data is consistent with a closed universe without dark
energy, with Ω0 = Ωm0 ≈ 1.3, and h ≈ 0.3. The upper limits given for the sum of
neutrino masses Σmν and the ratio r ≡ A_T²/A² of tensor perturbations to scalar
perturbations are 95% confidence limits. We see that there is no indication in the
data for a deviation of these additional parameters from their standard values.
In conclusion, almost all cosmological data are consistent with a “vanilla” uni-
verse, i.e. a spatially flat ΛCDM model with adiabatic and Gaussian primordial
density perturbations, described by the six cosmological parameters ΩΛ0 , ωm , ωb ,
A, n, τ .
The simplest inflationary models predict an amplitude for gravitational waves that
Planck would be able to observe using the polarisation of the CMB. This data will be
released in 2014.

Table III: Additional parameters

Parameter   Planck + WP                Planck + WP + highL + BAO
Σmν         < 0.933 eV                 < 0.230 eV
w           −1.49 (+0.65, −0.57)       −1.13 (+0.23, −0.25)
ΩK0         −0.037 (+0.043, −0.049)    −0.0005 (+0.0065, −0.0066)
dn/d ln k   −0.013 ± 0.018             −0.014 (+0.016, −0.017)
r           < 0.12                     < 0.111

Figure 29: Constraints on the scalar perturbation spectral index n, and the tensor/scalar
ratio r from Planck satellite data. Figure from [8].

References
[1] M. Vonlanthen, S. Räsänen and R. Durrer, JCAP 08 (2010) 023, arXiv:1003.0810
[astro-ph.CO].

[2] David H. Lyth and Andrew R. Liddle: The Primordial Density Perturbation
(Cambridge University Press 2009).

[3] C. Reichardt et al., High Resolution CMB Power Spectrum from the Complete
ACBAR Data Set, arXiv:0801.1491.

[4] K.T. Story et al., A Measurement of the Cosmic Microwave Background Damping
Tail from the 2500-square-degree SPT-SZ survey, arXiv:1210.7231.

[5] J. Väliviita, PhD thesis, University of Helsinki 2005.

[6] W.J. Percival et al., Measuring the Baryon Acoustic Oscillation scale using the
Sloan Digital Sky Survey and 2dF Galaxy Redshift Survey, arXiv:0705.3323,
Mon.Not.Roy.Astron.Soc. 381 (2007) 1053.

[7] P.A.R. Ade et al. [Planck Collaboration], arXiv:1303.5076 [astro-ph.CO].

[8] P.A.R. Ade et al. [Planck Collaboration], arXiv:1303.5082 [astro-ph.CO].
