Rasanen - Cosmology I & II (Notes) PDF
Fall 2014
Syksy Räsänen
1 INTRODUCTION 1
Preface
These are lecture notes for the courses Cosmology I and II at the University of
Helsinki. They are closely based on the notes prepared by Hannu Kurki-Suonio,
who has lectured this course in the past. (Credit for some figures goes to Elina
Keihänen, Jussi Väliviita and Reijo Keskitalo.)
A difficulty in teaching cosmology is that some central aspects rely on rather
advanced physics, such as quantum field theory in curved spacetime. Nevertheless,
the main applications of these advanced concepts can be discussed in relatively
simple terms, so requiring students to first learn general relativity, quantum field
theory and so on in detail is not necessary. Thus only the standard Bachelor level
theoretical physics background (mechanics, special relativity, quantum mechanics,
statistical physics) is assumed. The more advanced theories that cosmology relies
on are reviewed as part of these courses to the required extent. This means that
some results have to be accepted without proper derivation, especially in the later
parts of the courses.
The course is divided into two parts. In Cosmology I, the universe is discussed in
terms of the homogeneous and isotropic approximation (the Friedmann-Robertson-
Walker model) and statistical physics. In Cosmology II, deviations from homogeneity
and isotropy are discussed, from inflation as a theory of initial perturbations to
structure formation and the cosmic microwave background.
1 Introduction
1.1 Overview of modern cosmology
Cosmology is the study of the universe as a whole, its structure, origin and evolution.
Cosmology is grounded on observations, many of them astronomical, and laws of
physics measured on Earth. These lead naturally to the standard framework of
modern cosmology, the Hot Big Bang theory.
As a science, cosmology has a rare, if not singular, restriction: we can observe
only one universe. We cannot do experiments in cosmology, and observations are
restricted to a single object: the Universe. We cannot observationally make compar-
ative or statistical studies among many universes (to the extent such a concept even
makes sense). Also, we cannot move around our universe, but are restricted to (on
cosmological scales) a single point in space and time. As a result, cosmology relies
more on model-dependent interpretations than many other branches of physics.
Nevertheless, the last few decades have seen remarkable progress in cosmology,
as a significant body of observational data has become available with modern
astronomical instruments. We now have a good observational handle on the overall
history of the universe for all times between one second and the present time (the
universe is today about 14 billion years old). Theoretically, we understand the evolution
of the universe at times between $10^{-11}$ s and a few billion years. Important
questions remain, in particular about the nature of dark matter, dark energy and
the processes of inflation and baryogenesis. In the first part of the course, we will
consider dark energy and dark matter, and will look at inflation in the second part.
Baryogenesis will be mentioned only in passing.
One historically important observation supporting the big bang theory is the
redshift of distant galaxies. Their spectra are shifted towards longer wavelengths.
The further out they are, the larger is the shift. This implies that they are receding
from us; their distance from us is increasing. According to general relativity, this is
understood as the expansion of space itself (which is an aspect of the curvature of
spacetime), not as motion of galaxies in space. As space expands, the wavelength of
the light travelling through space is stretched.
The expansion appears to be uniform over large scales. While there are deviations
of order unity in the expansion rate on small scales (for example our galaxy does not
expand), the average expansion rate on scales larger than galaxy clustering, 10 Mpc
or so, is almost the same everywhere. In homogeneous and isotropic cosmological
models, the expansion is simply described by a time-dependent scale factor a(t).
Starting from the observed present expansion rate, H ≡ ȧ/a, we can use general
relativity to calculate a(t) as a function of time, given our understanding of the
physics of the matter content in the universe. (We will discuss this in more detail
later.) The result is that a(t) → 0 and the density of the universe ρ → ∞ about 14
billion years ago. At this singularity, time and space begin, and we can choose it as
the origin of the time coordinate, t = 0. However, we do not expect general relativity,
which governs the evolution of spacetime, nor the Standard Model of particle physics,
which governs the behaviour of matter, to be applicable at extremely high energy
densities. At the so-called Planck density, $\rho_{\rm Pl} \sim 10^{97}$ kg/m$^3$, quantum gravitational
effects should be large. To describe the earliest times, the Planck era, we would need
a theory of quantum gravity, which we do not have.¹ At present we don’t know what
happened at t = 0 or in its immediate vicinity at the Planck time of $10^{-43}$ s. It
seems likely that a major breakthrough in our understanding of physics is required
before we can make any definite statements about that primordial era.
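The order of magnitude quoted for the Planck density can be checked with a short calculation. This is a minimal sketch that assumes the standard definition $\rho_{\rm Pl} = c^5/(\hbar G^2)$ and CODATA values for the constants; neither the formula nor the function name `planck_density` appears in these notes.

```python
# Constants in SI units (CODATA values, an assumption: not quoted in the notes).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s


def planck_density():
    """Planck density rho_Pl = c^5 / (hbar G^2) in kg/m^3 (standard definition)."""
    return C**5 / (HBAR * G**2)


rho_pl = planck_density()
print(f"rho_Pl = {rho_pl:.2e} kg/m^3")
```

The result lies between $10^{96}$ and $10^{97}$ kg/m$^3$, consistent with the rough estimate in the text.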
Thus, the big bang theory does not actually apply all the way to the beginning
of time and a “big bang”. Rather, it is a valid description of the history of the
universe starting from some early time when the universe was very hot,² very dense,
and expanding rapidly. The universe was at early times filled with an almost
homogeneous “soup” of particles which was in thermal equilibrium for a long time.
We can therefore describe the state of the early universe with a small number of
thermodynamic variables, and this makes the evolution of the universe calculable.
The fact that the scale factor tends to zero at early times does not imply that
the universe would have started from a point. The part of the universe which we
can observe today was indeed very small at very early times, possibly smaller than
1 mm in diameter at the earliest times that can be sensibly discussed within the
big bang framework. And if the inflationary scenario is correct, it would have been
much smaller than that before inflation. However, the universe extends beyond
what can be observed today (beyond our horizon), and may even be infinite. In
the current cosmological models, if the universe is infinite, it has always been
infinite, except at the moment t = 0, when the size is not defined. (We do not know,
¹ String theory is a candidate for the theory of quantum gravity, and applying it to the early
universe is an active area of research. However, we do not have a complete formulation of the theory,
and it is not known whether string theory is in fact correct, so its applications are necessarily rather
speculative.
² The realisation that the early universe must have had a high temperature did not come
immediately after the discovery of the expansion. The results of big bang nucleosynthesis and the
discovery of the cosmic microwave background (which we will discuss in some detail) however
provided convincing evidence that the early universe was hot.
Model, there can be a first order phase transition (so the two phases have different
energy densities at the critical temperature), in which case it proceeds through the
formation of bubbles of the new phase. The phase transition can then have inter-
esting effects, and baryogenesis, the generation of the observed matter-antimatter
asymmetry in baryons, may occur in that process.
For every type of particle there is a corresponding antiparticle, which has the
same properties (e.g. mass and spin) as the particle, except for charges (like the
electric charge or colour charge), which have the opposite sign. Particles that do not
have any charges, such as photons, are their own antiparticles. At high temperatures,
T ≫ m, where m is the mass of the particle, particles and antiparticles are constantly
created and annihilated in various reactions, and there is roughly the same number
of particles and antiparticles. But when T ≪ m, particles and antiparticles may still
annihilate each other (and decay, if they are unstable), but there is no more thermal
production of particle-antiparticle pairs. As the universe cools, heavy particles and
antiparticles therefore annihilate each other. These annihilation reactions produce
additional lighter particles and antiparticles. If the universe had originally an equal
number of particles and antiparticles, only photons and neutrinos (of the known
particles) would be left over today in any significant quantities. The presence of
matter today indicates that in the early universe there must have been slightly
more nucleons and electrons than antinucleons and positrons, and this excess was
left over. The lightest known charged massive particle is the electron, so the last
annihilation event was the electron-positron annihilation which took place when
$T \sim m_e \sim 0.5$ MeV and $t \sim 1$ s.⁵ After this the only remaining antiparticles were
the antineutrinos, and the primordial soup consisted of a large number of photons
(which are their own antiparticles) and neutrinos (and antineutrinos) and a smaller
number of “left-over” protons, neutrons, and electrons. (Dark matter is left out of
this story; in typical dark matter models, there are an equal number of dark matter
particles and antiparticles in the late universe. We will come back to this later.)
When the universe was a few minutes old, T ∼ 100 keV, protons and neutrons
formed nuclei of light elements. This process is known as Big Bang Nucleosynthesis
(BBN), and it produced about 75% (of the total mass in ordinary matter) $^1$H, 25%
$^4$He, $10^{-4}$ $^2$H, $10^{-4}$ $^3$He, and $10^{-9}$ $^7$Li. (Other elements were formed much later, in
stars.) At this time matter was completely ionized: all electrons were free. In this
plasma the mean free path of a photon was short and the universe was opaque.
The universe became transparent when it was about 400 000 years old. At a
temperature T ∼ 3000 K (∼ 0.25 eV), electrons and nuclei formed neutral atoms,
and the photon mean free path became longer than the radius of the observable
universe. This event is called recombination. (This name, taken from statistical
physics, is misleading, since this is actually the first time ever when electrons and
nuclei combine.) Since recombination the primordial photons have been travelling
through space almost without scattering. We can observe them today as the Cosmic
Microwave Background (CMB). The CMB is like a photograph of the universe at
400 000 years of age, modified by its passage to us through 13 billion years.
From the CMB we learn that the early universe was locally very homogeneous
and isotropic, unlike the present universe, where matter has accumulated into stars,
⁵ Neutrinos also have very small masses. However, at temperatures less than the neutrino mass,
the neutrino interactions are so weak that the neutrinos and antineutrinos cannot annihilate each
other. It is also possible that neutrinos are their own antiparticles, like photons.
amounts of thin gas, which is apparently the main component of baryonic dark
matter.
However, we can estimate from BBN and the CMB anisotropies the total amount
of baryonic matter, and there is not nearly enough of it to explain the whole dark
matter problem. Most of the dark matter is non-baryonic, meaning that it is not
made out of protons and neutrons.⁹ The only non-baryonic particles in the Standard
Model of particle physics that could act as dark matter are neutrinos. If they had a
suitable mass, ∼ 1 eV, neutrinos left from the early universe would have a sufficient
total mass to be a significant dark matter component. However, producing the
structures seen in the universe requires most of the dark matter to have different
properties than neutrinos have. Technically, most of the dark matter must be “cold”
or “warm”, instead of “hot”. These terms refer to the dynamics of the particles
making up the matter, and do not specify the details of the particles. The difference
between hot dark matter (HDM) and cold dark matter (CDM) is that HDM is made
of particles whose velocities were large when structure formation began, but CDM
particles had small velocities. Neutrinos with m ∼ 1 eV would be HDM. Dark
matter between the two alternatives is called warm dark matter (WDM), and it
is still observationally allowed. As the Standard Model does not contain particles
that could explain all of the dark matter, it appears that most of the matter in the
universe is made out of some unknown particles.
Usually, the term “dark matter” is used to refer only to non-baryonic dark matter,
and often neutrinos are also excluded, so that “dark matter” refers only to the
unknown, exotic part of the matter that is not observed via light.
Particle physicists have independently come to the conclusion that the Standard
Model is not the final word in particle physics. Many proposed extensions of the
Standard Model contain suitable dark matter particle candidates (e.g. neutralinos,
technibaryons, axions, right-handed neutrinos). Their interactions have to be rather
weak to explain why they have not been detected so far, which imposes constraints
on models of particle physics. Dark matter thus presents one area where the physics
of the very small and the very large overlap.
The situation of dark energy is different. While cosmologists consider dark matter
quite natural, many of them find dark energy puzzling. The increasing breadth
and precision of cosmological observations have made it possible to determine the
distance scales and the expansion of the universe accurately. In the context of the
homogeneous and isotropic models based on general relativity, a form of matter
called dark energy is required to fit these observations. Unlike dark matter, which
is clustered, dark energy is relatively uniform in the observable universe. And while
dark matter has negligible pressure, dark energy has a large negative pressure. The
simplest possibility for dark energy is a cosmological constant or vacuum energy.
High energy physics predicts that the vacuum has an energy density, but it is
difficult to understand the small energy scale ∼ meV that is required to explain the
observations. Another possible explanation is modification of the law of gravity at
large distances. In the dark energy case, this is less difficult than for dark mat-
ter, as the only observable effect of dark energy is to increase the expansion rate
of the universe at late times, whereas the effect of dark matter is seen in various
⁹ And electrons. Although electrons are not baryons (they are leptons), cosmologists refer to
matter made out of protons, electrons, and neutrons as “baryonic”. Electrons are anyway so light
that their contribution to the total mass is tiny.
physical systems on different scales and in different eras of the universe. Neverthe-
less, constructing models that would explain the observations on large scales while
being consistent with the precision tests of general relativity in the solar system
has proven to be difficult (apart from the cosmological constant). This remains an
active area of research. The third possibility is that the homogeneous and isotropic
approximation is not good enough at late times due to the formation of non-linear
structures. Studying this possibility is difficult, because it requires dealing with
non-perturbative general relativity in the complex setting of cosmological structure
formation.
$c = 1$
The theory of relativity unifies space and time into a single concept, the four-
dimensional spacetime. It is thus natural to use the same units for measuring spatial
distance and time. Since the (vacuum) speed of light is c = 299 792 458 m/s, we set
1 s ≡ 299 792 458 m, so that c = 1 and 1 second = 1 light second and 1 year = 1 light
year. Velocity is thus a dimensionless quantity, and smaller than one for massive
objects. Energy and mass have the same dimension and the relation between mass
m and energy E for free particles, $E^2 = m^2 c^4 + p^2 c^2$, is simply $E^2 = m^2 + p^2$, where
p is particle momentum.
$k_B = 1$
Temperature T is a parameter that describes a thermal equilibrium distribution.
The formula for the occupation number of energy level E includes the exponential
form $e^{\beta E}$, where $\beta = 1/(k_B T)$. The only function of the Boltzmann constant,
$k_B = 1.3806 \times 10^{-23}$ J/K, is to convert temperature into energy units. We decide to
give temperature directly in energy units, so $k_B$ becomes unnecessary. We define 1
K = $1.3806 \times 10^{-23}$ J, or
$$ 1\,\mathrm{eV} = 11600\,\mathrm{K} = 1.78 \times 10^{-36}\,\mathrm{kg} = 1.60 \times 10^{-19}\,\mathrm{J} . \tag{1.1} $$
Thus $k_B = 1$, and the exponential form is just $e^{E/T}$.
$\hbar = 1$
The third simplification in the natural system of units is to set the reduced Planck
constant to unity, $\hbar = h/2\pi = 1$. In SI units we have $\hbar = 1.054572 \times 10^{-34}$ J s,
so in the natural system of units the dimensions of mass and energy are equal to
the dimension of 1/time or 1/distance. This is convenient, because the typical time
and distance scales of quantum mechanics are associated with particle energy. For
example, the energy of a photon $E = \hbar\omega = \omega$ is equal to its angular frequency. We
have
$$ 1\,\mathrm{eV} = 5.07 \times 10^{6}\,\mathrm{m}^{-1} = 1.52 \times 10^{15}\,\mathrm{s}^{-1} . \tag{1.2} $$
A very useful relation to remember is
$$ \hbar \approx 197\,\mathrm{MeV\,fm} = 1 , \tag{1.3} $$
where we have the energy scale ∼ 100 MeV and length scale 1 fm of strong interac-
tions.
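The conversion factors in (1.1), (1.2) and (1.3) follow from the SI values of c, $k_B$ and ħ. The sketch below reproduces them, assuming CODATA values for the constants; the function names are illustrative and not part of the notes.

```python
# SI constants (CODATA values; an assumption, the notes quote rounded numbers).
C = 299_792_458.0        # speed of light, m/s
K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
EV = 1.602176634e-19     # one electronvolt in joules


def ev_to_kelvin(energy_ev):
    """Temperature equivalent of an energy, T = E / k_B."""
    return energy_ev * EV / K_B


def ev_to_kg(energy_ev):
    """Mass equivalent of an energy, m = E / c^2."""
    return energy_ev * EV / C**2


def ev_to_inverse_metre(energy_ev):
    """Inverse length equivalent of an energy, E / (hbar c)."""
    return energy_ev * EV / (HBAR * C)


def ev_to_inverse_second(energy_ev):
    """Inverse time (angular frequency) equivalent of an energy, E / hbar."""
    return energy_ev * EV / HBAR


def hbar_c_in_mev_fm():
    """The combination hbar*c expressed in MeV fm."""
    return HBAR * C / EV / 1e6 / 1e-15


print(f"1 eV = {ev_to_kelvin(1.0):.0f} K")
print(f"1 eV = {ev_to_kg(1.0):.3e} kg")
print(f"1 eV = {ev_to_inverse_metre(1.0):.3e} 1/m")
print(f"1 eV = {ev_to_inverse_second(1.0):.3e} 1/s")
print(f"hbar c = {hbar_c_in_mev_fm():.1f} MeV fm")
```

The same helpers convert the temperatures quoted earlier: `ev_to_kelvin(0.25)` is close to the 3000 K of recombination.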
Equations become now simpler and the physical relations more transparent, since
we do not have to include the above fundamental constants. However, we still have to
do conversions among different units because the preferred units used in particle
physics and cosmology (not to mention astrophysics) are different.
Astronomical units. A common unit of mass and energy is the solar mass,
$m_\odot = 1.99 \times 10^{30}$ kg, and a common unit of length is one parsec, 1 pc = 3.26 light years =
$3.09 \times 10^{16}$ m. One parsec is defined as the distance from which 1 astronomical unit
(AU, the distance between the earth and the sun) forms an angle of one arcsecond,
1″.¹⁰ (Astronomers and cosmologists only use light years when talking to outsiders.)
A more common scale in cosmology is 1 Mpc = $10^6$ pc, which is roughly the typical
distance between galaxies at the present time.
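The definition of the parsec can be turned into numbers directly. This is a small sketch, assuming the IAU value of the astronomical unit and a Julian year of 365.25 days for the light year (neither value is stated in the notes); the function name is illustrative.

```python
import math

AU_M = 1.495978707e11                          # astronomical unit in metres (IAU value, assumed)
LIGHT_YEAR_M = 299_792_458.0 * 365.25 * 86400  # light year in metres, using a Julian year


def parsec_in_metres():
    """Distance at which 1 AU subtends an angle of one arcsecond."""
    one_arcsecond = math.radians(1.0 / 3600.0)  # 1 degree = 3600 arcseconds
    return AU_M / math.tan(one_arcsecond)


pc = parsec_in_metres()
print(f"1 pc = {pc:.3e} m = {pc / LIGHT_YEAR_M:.2f} light years")
```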
$$ z = H_0 d , \tag{1.5} $$
¹⁰ One degree is divided into 60 arc minutes, denoted by 60′, and one arc minute is divided into
60 arc seconds, denoted by 60″.
¹¹ Assuming that the laws of physics are the same here and at the emission event. Put another
way, spectral lines offer a sensitive test of the change of the laws of quantum electrodynamics and
nuclear physics in time and space. No deviations from the laws observed on Earth have been found
so far.
where d is the distance to the galaxy and z is its redshift. (The speed of light c is
set to unity, as noted above.) This relation was first discovered by Lemaître, and it
is called the Hubble law. The proportionality constant $H_0$ is correspondingly called
the Hubble constant. Lemaître introduced it, first determined its value from observations,
and interpreted it as the expansion rate.
While the redshift can be readily determined with high accuracy, it is more
difficult to determine the distance d. Measurements of distances in general and the
Hubble parameter in particular have been the subject of much work and controversy
over decades. Distance determinations used to be exclusively based on the notion of
the cosmic distance ladder. This refers to a series of relative distance determinations
between more nearby and faraway objects. The first step of the ladder is made of
nearby stars, whose absolute distance can be determined from their parallax, their
apparent motion on the sky due to our motion around the Sun. The other steps
require “standard candles”, classes of objects with the same absolute luminosity
(radiated power), so that their relative distances are inversely related to the square
roots of their “brightness” or apparent luminosity (received flux density). Often
several steps are needed, since objects that can be found close by are too faint to
be observed from very far away, and errors (inaccuracies) accumulate from step to step.
Nowadays there are also measurements which do not use the distance ladder, and
the value of H0 is determined to a reasonable accuracy, though the matter may not
be entirely settled.
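The standard-candle step of the distance ladder amounts to a one-line calculation. The sketch below assumes the Euclidean inverse-square law, flux = L / (4π d²), which is adequate only at small redshift; the function names are hypothetical and the input numbers are made up for illustration.

```python
import math


def distance_ratio(flux_a, flux_b):
    """Return d_b / d_a for two objects of equal absolute luminosity.

    With flux = L / (4 pi d^2), the ratio of distances is the inverse
    ratio of the square roots of the received fluxes.
    """
    if flux_a <= 0 or flux_b <= 0:
        raise ValueError("fluxes must be positive")
    return math.sqrt(flux_a / flux_b)


def ladder_step(d_anchor, flux_anchor, flux_target):
    """Distance to a target candle, given an anchor candle of known distance."""
    return d_anchor * distance_ratio(flux_anchor, flux_target)


# A candle that appears 100 times fainter than an identical one is 10 times further away.
print(distance_ratio(100.0, 1.0))
# Anchor at 2 (arbitrary units) with flux 4; target with flux 1.
print(ladder_step(2.0, 4.0, 1.0))
```

Because each step multiplies the previous distance by a measured ratio, a relative error in the anchor distance carries over to every later step.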
The latest measurement of H0 reports the value [1]
The error bars are derived from combining three different measurements which are
used to anchor the distance ladder. Systematic effects are included, but may be
somewhat underestimated. The error represents the range where the real result is
with a 68% probability. Doubling the error bar gives the 95% probability range.
(Unless otherwise noted, all error bars given during the course are 68% error bars.)
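The correspondence between one and two error bars and the 68% and 95% ranges holds for a Gaussian error distribution, which is the assumption made in this sketch (the notes do not state the distribution explicitly). The function name is illustrative.

```python
import math


def gaussian_coverage(n_sigma):
    """Probability that a Gaussian variable lies within n_sigma standard deviations of its mean."""
    return math.erf(n_sigma / math.sqrt(2.0))


for n in (1, 2):
    print(f"{n} sigma: {100 * gaussian_coverage(n):.1f}%")
```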
There are other observations pointing to around this value (as well as some
observations pointing to a somewhat lower value). This uncertainty of the distance
scale is reflected in many cosmological quantities. It is customary to give these
quantities multiplied by the appropriate power of h, defined by
$$ H_0 = h \cdot 100\,\mathrm{km/s/Mpc} . \tag{1.6} $$
$$ v = z . \tag{1.7} $$
The further out the galaxies are, the faster they are receding from us. Astronomers
often report the redshift in units of velocity (by reintroducing c in (1.7), z = v/c).
However, according to general relativity, movement in space is not the proper
way to understand the redshift. The galaxies are not actually moving; the distance
between the galaxies increases because the space between the galaxies is expanding.
We will later derive the redshift from general relativity. It turns out that equations
(1.5) and (1.7) hold only in the limit z ≪ 1, and the general result, d(z), which
relates distance d and redshift z, is more complicated. In particular, the distance
reaches a finite value as z goes to infinity – though we should be careful about what
we mean by distance! We look at this in detail in the next chapter. The redshift is
directly related to the expansion. The easiest way to understand the cosmological
redshift is that the wavelength of travelling light expands with the universe. Thus
the universe has expanded by a factor 1 + z during the time light travelled from an
object with redshift z to us.
The largest observed redshift of a galaxy at present is z = 8.6. Thus the
universe has expanded by a factor of about 10 while the observed light has been on
its way. When the light left the galaxy, the age of the universe was only about 600
million years. At that time the first galaxies were just being formed. This upper
limit in the observations is however not due to there being no earlier galaxies, but
rather to the fact that they are so faint due to the large distance. There may be some
galaxies with a redshift greater than 10. NASA is planning a new space telescope,
the James Webb Space Telescope,¹² which would be able to observe these.
The expansion rate H changes on the cosmological timescale. Properly, the
time-dependent function H(t) is called the Hubble parameter, and its present value
is called the Hubble constant, H0 . In cosmology, it is customary to denote present
values of quantities with the subscript 0. Thus H0 ≡ H(t0 ).
The galaxies are not exactly at rest in the expanding space. Each galaxy has
its own peculiar motion $v_{\rm gal}$, caused by the gravity of nearby mass concentrations,
such as other galaxies. Neighbouring galaxies can fall towards each other, orbit each
other and so on.¹³
Thus the redshift of an individual galaxy is the sum of the cosmic and the peculiar
redshift,
$$ z = H_0 d + v_{\rm gal} \quad (\text{when } z \ll 1) . \tag{1.8} $$
Usually only the redshift is known precisely. Typically $v_{\rm gal}$ is around 300–500
km/s. (In large galaxy clusters, where galaxies orbit each other, it can be several
thousand km/s; but then one can take the average redshift of the cluster.) For
faraway galaxies, $H_0 d \gg v_{\rm gal}$. The larger the redshift, the younger the universe was
when the light left.
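To see when the peculiar velocity matters, relation (1.8) can be evaluated with c restored, cz = H0 d + v_gal. The sketch below is illustrative only: h = 0.7 is an assumed round value rather than the measurement referred to in the notes, the function names are hypothetical, and the formula applies only for z ≪ 1.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def hubble_constant(h):
    """H0 in km/s/Mpc from the dimensionless parameter h of Eq. (1.6)."""
    return 100.0 * h


def low_z_redshift(d_mpc, v_gal_km_s=0.0, h=0.7):
    """Eq. (1.8) with c restored: z = (H0 d + v_gal) / c, valid only for z << 1."""
    return (hubble_constant(h) * d_mpc + v_gal_km_s) / C_KM_S


def peculiar_fraction(d_mpc, v_gal_km_s, h=0.7):
    """Size of the peculiar term relative to the cosmic term H0 d."""
    return abs(v_gal_km_s) / (hubble_constant(h) * d_mpc)


# h = 0.7 and v_gal = 400 km/s are assumed, illustrative inputs.
for d in (10.0, 100.0, 300.0):
    z = low_z_redshift(d, v_gal_km_s=400.0)
    frac = peculiar_fraction(d, 400.0)
    print(f"d = {d:5.0f} Mpc: z = {z:.4f}, peculiar/cosmic = {frac:.3f}")
```

At 10 Mpc the peculiar term is comparable to the cosmic one, while at a few hundred Mpc it is a correction of a few per cent.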
“practical horizon”, i.e. the limit to what we can see, lies already at z = 1090. The
distances d(z = 1090) and d(z = ∞) are very close to each other; z = 4 lies about
halfway from here to the horizon.
Therefore we can only observe a finite region of the universe, enclosed in the
sphere with radius $d_{\rm hor}$. The universe can extend to large distances beyond that,
and it may even be infinite. Sometimes the word “universe” is used to denote just this
observable part of the “whole” universe. Then we can say that the universe contains
some $10^{12}$ galaxies and $10^{23}$ stars. Over cosmological time scales the horizon recedes
and parts of the universe which are beyond our present horizon become observable.
(However, if the expansion continues to accelerate as it seems to have done during
the past few billion years, the observable region will not grow, and in the distant
future galaxies that are now observable will disappear from our sight.)
Optical astronomy and the large scale structure. There is a large body of
data relevant to cosmology from optical astronomy. Counting the number of stars
and galaxies we can estimate the matter density they contribute to the universe.
From the different redshifts of galaxies within the same galaxy cluster we obtain
their relative motions, which reflect the gravitating mass within the system. The
mass estimates for galaxy clusters obtained this way are much larger than those
obtained by counting the visible stars and galaxies in the cluster, pointing to the
existence of dark matter.
From the spectral lines of stars and gas clouds we can determine the relative
amounts of different elements and their isotopes in the universe.
The distribution of galaxies in space and their relative velocities tell us about
the large scale structure of the universe. The galaxies are not distributed uniformly.
There are galaxy groups and clusters. Our own galaxy belongs to a small group
of galaxies called the Local Group. The Local Group consists of three large spiral
galaxies: M31 (the Andromeda galaxy), M33, and the Milky Way, and about 60
smaller galaxies and dwarf galaxies. The Local Group’s diameter is around 3 Mpc.
The nearest large cluster is the Virgo Cluster. The grouping of galaxies into clus-
ters is not as strong as the grouping of stars into galaxies. Rather, galaxies are
distributed in a complex pattern called the cosmic web, which consists of walls, fil-
aments, clusters and voids (low-density regions). Most galaxies are not part of any
well defined cluster.
Radio astronomy. The sky looks very different on radio wavelengths than to the
naked eye. There are many strong radio sources very far away. These are galaxies
which are optically barely observable. They are distributed isotropically, i.e. there
are equal numbers of them in every direction, but there are more far away (at z > 1)
than close by (z < 1). The isotropy is evidence of the homogeneity of the universe
at the largest scales—there is structure only at smaller scales. The dependence
on distance is a time evolution effect in two ways: the radio sources evolve and
the volume of the universe evolves. In general, there are more objects at larger
distances simply because there is more volume there. However, the evolution in the
number counts is not explained only by the change in the volume of the universe,
but it shows that the radio sources themselves evolve. Some galaxies are strong
radio sources when they are young, but become weaker with age by a factor of more
than 1000.
Cold gas clouds can be mapped using the 21 cm spectral line of hydrogen. The
ground state (n = 1) of hydrogen is split into two very close energy levels depending
on whether the proton and electron spins are parallel or antiparallel (the hyperfine
structure). The separation of these energy levels, the hyperfine structure constant,
is 5.9 µeV, corresponding to a photon wavelength of 21 cm, i.e. radio waves. The
redshift of this spectral line shows that redshift is independent of wavelength (it is
the same for radio waves and visible light), as it should be if it is due to the expansion
of space.
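The link between the 5.9 µeV hyperfine splitting and the 21 cm wavelength is λ = hc/E. This is a minimal check, assuming CODATA values for the constants; the function names are illustrative.

```python
H_PLANCK = 6.62607015e-34  # Planck constant, J s
C = 299_792_458.0          # speed of light, m/s
EV = 1.602176634e-19       # one electronvolt in joules


def wavelength_from_energy_ev(energy_ev):
    """Photon wavelength in metres for a given photon energy in eV."""
    return H_PLANCK * C / (energy_ev * EV)


def observed_wavelength(rest_wavelength, z):
    """Wavelength stretched by the expansion factor 1 + z."""
    return rest_wavelength * (1.0 + z)


rest = wavelength_from_energy_ev(5.9e-6)
print(f"rest wavelength = {100 * rest:.1f} cm")
print(f"observed at z = 1: {100 * observed_wavelength(rest, 1.0):.1f} cm")
```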
small contrasts. The electromagnetic spectrum of the CMB is the black body spec-
trum with a temperature of T0 = 2.72548 ± 0.00057 K [2]. It follows the theoretical
black body spectrum better than anything else we have observed or produced. It
is the remnant of a hot state in the early universe, when matter and light were al-
most homogeneously and isotropically distributed and in thermal equilibrium. The
temperature of the CMB falls as $(1 + z)^{-1}$ due to photon redshift, so as the CMB
redshift is about 1090, the original temperature was about 3000 K.
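The estimate of the original temperature is a single multiplication, T = T0 (1 + z). The sketch uses the values quoted above; the function name is illustrative.

```python
T0_K = 2.72548   # present CMB temperature in kelvin, as quoted in the text
Z_CMB = 1090     # redshift of the CMB, as quoted in the text


def cmb_temperature(z, t0=T0_K):
    """CMB temperature at redshift z, scaling as (1 + z)."""
    return t0 * (1.0 + z)


print(f"T(z = {Z_CMB}) = {cmb_temperature(Z_CMB):.0f} K")
```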
The state of a system in thermal equilibrium is determined by a small number
of thermodynamic variables, in this case the temperature and chemical potentials
(for particles with conserved quantum numbers). The observed temperature of the
CMB and the observed density of the present universe allow us to fix the evolution
of the temperature and the density of the universe, which then allows us to calculate
the sequence of events in the early universe. That the early universe was hot and
in thermal equilibrium is a central part of the Big Bang paradigm, and it is often
called the Hot Big Bang theory to spell this out.
With sensitive instruments a small anisotropy can be observed in the microwave
sky. This is dominated by the dipole anisotropy (one side of the sky is slightly hotter
and the other side colder), with an amplitude of 3.346 ± 0.017 mK, or $\Delta T/T_0 = 0.0012$.
This is a Doppler effect due to the motion of the observer, i.e. the motion of
our Solar System with respect to the radiating matter at our horizon. The velocity
of this motion is $v = (\Delta T/T_0)\,c = 369$ km/s, or $v = 0.00123$, and it is directed
towards the constellation of Leo (R.A. 11h 8m 50s, Dec. −6° 37′), near the autumnal
equinox (where the ecliptic and the equator cross on the sky) [3]. It is due to two
components, the motion of the Sun around the center of the Galaxy, and the peculiar
motion of the Galaxy due to the gravitational pull of matter concentrations up to
100–200 Mpc away.¹⁴
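The velocity inferred from the dipole follows from the first-order Doppler relation ΔT/T0 = v/c. The sketch below evaluates it with the amplitude and temperature quoted in the text; it keeps only the term linear in v, and the function name is illustrative.

```python
C_KM_S = 299_792.458   # speed of light in km/s
T0_K = 2.72548         # CMB temperature in kelvin, as quoted in the text
DIPOLE_K = 3.346e-3    # dipole amplitude in kelvin, as quoted in the text


def dipole_velocity(delta_t, t0=T0_K):
    """Observer velocity in km/s from the dipole amplitude, to first order in v/c."""
    return delta_t / t0 * C_KM_S


v = dipole_velocity(DIPOLE_K)
print(f"v = {v:.0f} km/s, v/c = {v / C_KM_S:.5f}")
```

With these rounded inputs the result is about 368 km/s, close to the 369 km/s quoted above.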
When we subtract the effect of this motion from the observations (and look
away from the plane of the Galaxy—our Galaxy also emits microwave radiation,
but with a non-thermal spectrum) the true anisotropy of the CMB remains, with an
amplitude of about $3 \times 10^{-5}$, or 80 μK.¹⁵ This anisotropy gives a picture of the
small density variations in the early universe, the “seeds” of galaxies. Theories of
structure formation have to match both the small inhomogeneity of the order $10^{-5}$
at z = 1090 and the structure observed today (z = 0).
¹⁴ Sometimes it is asked whether there is a contradiction with special relativity here: doesn’t
the CMB provide an absolute reference frame? There is no contradiction. The relativity principle
just says that the laws of physics are the same in the different reference frames. It does not say
that systems cannot have reference frames which are particularly natural for that system, e.g. the
center-of-mass frame or the laboratory frame. For road transportation, the surface of the earth is
a natural reference frame. In cosmology, the CMB gives us a good “natural” reference frame—it is
closely related to the center-of-mass frame of the observable part of the universe, or rather, a part
of it which is close to the horizon (the last scattering surface). The different parts of the plasma
from which the CMB originates are moving with different velocities (part of the $10^{-5}$ anisotropy
is due to these velocity variations). If there is something surprising here, it is that these relative
velocities are so small, of the order of just a few km/s, reflecting the astonishing homogeneity of the
early universe over large scales. We will return to the question of whether these are natural initial
conditions when we discuss inflation in the second part of the course.
15
The numbers refer to the standard deviation of the CMB temperature on the sky. The hottest
and coldest spots deviate some 4 or 5 times this amount from the average temperature.
REFERENCES 14
Gamma ray bursts and quasars. The highest energy region of the electro-
magnetic spectrum is occupied by γ rays. Space-based γ-ray observatories have
discovered powerful Gamma Ray Bursts (GRB) on the sky. These are short events
lasting from a fraction of a second to a few minutes. They are observed about once
per day, and are distributed isotropically on the sky. The isotropic distribution sug-
gests that they are at cosmological distances (further out than our own or nearby
galaxies). This has been confirmed by the identification of some GRBs with galaxies
with high redshifts (z > 1). This means that the bursts must have extremely high
energies. The longer-duration GRBs (longer than a second) appear to be related to
particularly powerful supernova events. The shorter-duration bursts (less than a second)
are possibly due to collisions of neutron stars with each other or with black holes.
Quasars (Quasistellar Objects, QSOs) are the most powerful continuously radi-
ating objects in the universe. Thus the most-distant (earliest) objects observed in
the universe are mostly quasars. The highest observed power is about $10^{41}$ W. At
first quasars were considered different from galaxies since they looked like point-like
objects. In photographs they looked like stars, but their redshifts revealed their
huge distances and thus their huge power outputs. Now better observations have
revealed “host” galaxies around several quasars. It has been concluded that quasars
are powerfully radiating galactic nuclei, and are related to some more close-by galax-
ies (Seyfert galaxies), whose nuclei are also fairly powerful sources of radiation. To-
gether these objects are called Active Galactic Nuclei (AGN). Quasars are powerful
sources at many different wavelengths (radio, optical, X-ray). Some of them be-
long to the radio sources mentioned earlier, others are radio quiet. There are more
quasars at large distances (in the past, z > 1) than nearer to us (later, z < 1),
because quasars grow fainter as they age; they become “ordinary” quiet galaxies.
The power source of an AGN is thought to be a very large black hole (with
$m = 10^8\,M_\odot$ or so) at the center of the galaxy, into which surrounding matter is
falling. As it approaches the hole, this matter is heated up and begins to radiate.
AGNs quiet down over cosmological time scales as the black hole gradually cleans
up the surrounding regions.
References
[1] G. Efstathiou, Mon. Not. Roy. Astron. Soc. 440 (2014) 1138 [arXiv:1311.3461 [astro-ph.CO]]
$$x^i \to x'^i = R^i{}_j x^j + A^i + v^i t, \qquad t \to t' = Bt + C, \tag{2.1}$$
a sphere with radius r has surface area $4\pi r^2$. The space around Earth is indeed
curved due to Earth’s gravity, but the curvature is so small that more sophisticated
measurements than the ones described above are needed to detect it.
The line element has the dimension of distance. As a working definition for the
metric, we can use that the metric is an expression which gives the square of the
line element in terms of the coordinate differentials.
We could use another coordinate system on the same 2-dimensional Euclidean
space, e.g., polar coordinates. Then the metric is
in spherical coordinates (where the r coordinate has the dimension of distance, but
the angular coordinates θ and ϕ are dimensionless).
Now we can go to our first example of a curved (2-dimensional) space, the sphere.
Let the radius of the sphere be $r_0$. For the two coordinates on this 2D space we can
take the angles θ and ϕ. We get the metric from the Euclidean 3D metric in spherical
coordinates by setting r = r0 ,
Figure 4: The part of the sphere covered by the coordinates in Eq. (2.11).
this r = sin θ begins to decrease again, repeating the same values. Also, at r = 1,
the $1/(1 - r^2)$ factor in the metric becomes infinite. We say we have a coordinate
singularity at the equator. There is nothing wrong with the space itself, but our
chosen coordinate system applies only for a part of this space, the region “north” of
the equator.
The fact that time appears in the metric with a different sign is responsible for
the special geometric features of Minkowski space. (We assume that the reader is
familiar with special relativity, and won’t go into details.) There are three kinds of
distance intervals,
2 BASICS OF GENERAL RELATIVITY 21
• lightlike, $ds^2 = 0$
The lightlike directions form the observer’s future and past light cones. Light
moves along the light cone, so everything we see lies on our past light cone. To see
us as we are now, the observer has to lie on our future light cone. As we move in
time along our world line, we drag our light cones with us so that they sweep over
the spacetime. The motion of a massive body is always timelike, and the motion of
massless particles is always lightlike.
into a “physical” velocity (with respect to the coordinate system), we still need to
use the metric, see Eq. (2.20).
In an orthogonal coordinate system the coordinate lines are everywhere orthog-
onal to each other. The metric is then diagonal, meaning that it contains no cross-
terms like dxdy. We will only use orthogonal coordinate systems in this course.
The three-dimensional subspace, or hypersurface t = const. of spacetime is called
the space (or the universe) at time t, or a time slice of the spacetime. It is possible
to slice the same spacetime in many different ways i.e. to make different choices of
the time coordinate t.
We introduce the Einstein summation rule: we always sum over repeated indices,
even if we don’t bother to write down the summation sign $\sum$. This also applies
to Latin indices, $g_{ij}\,dx^i dx^j \equiv \sum_{i=1}^{3}\sum_{j=1}^{3} g_{ij}\,dx^i dx^j$. The objects $g_{\mu\nu}$ are the components
of the metric tensor. They are usually taken to be dimensionless, but
sometimes (particularly in the case of angular coordinates) it is more useful to keep
the coordinates dimensionless and put the dimension in the metric. The components
of the metric tensor form a symmetric 4 × 4 matrix.
In the case of Minkowski space, the metric tensor in Cartesian coordinates is
called ηµν ≡ diag(−1, 1, 1, 1). In matrix notation we have for Minkowski space
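$$g_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.16}$$
in Cartesian coordinates, and
$$g_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & r^2 & 0 \\ 0 & 0 & 0 & r^2\sin^2\theta \end{pmatrix} \tag{2.17}$$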
in spherical coordinates.
As another example, the metric tensor for a sphere (discussed above as an ex-
ample of a curved 2D space) has the components
$$[g_{ij}] = \begin{pmatrix} r_0^2 & 0 \\ 0 & r_0^2\sin^2\theta \end{pmatrix}. \tag{2.18}$$
The vectors that occur naturally in relativity are four-vectors, with four compo-
nents, as with the four-velocity discussed above. We will use the short term “vector”
to refer both to three-vectors and four-vectors, as it should be obvious from the con-
text which one we mean. As in three-dimensional flat geometry, the values of the
components depend on the basis used. For example, if we move along the coordinate
$x^1$ so that it changes by $dx^1$, the distance travelled is $ds = \sqrt{g_{11}\,dx^1 dx^1} = \sqrt{g_{11}}\,dx^1$.
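Similarly, the components of a vector do not give the physical magnitude of the
quantity. In the case when the metric is diagonal, we just multiply by the square root of the
absolute value of the relevant metric component to get the physical magnitude,
$$w^{\hat\alpha} \equiv \sqrt{|g_{\alpha\alpha}|}\,w^\alpha, \tag{2.19}$$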
where wα is the component of a vector in the basis where the metric is gαβ , and wα̂
is the correctly normalised physical magnitude of the vector. (In the above, there is
no summation over α.)
For example, the physical velocity of an object is3
$$v^{\hat i} = \frac{\sqrt{g_{ii}}\,dx^i}{\sqrt{|g_{00}|}\,dx^0}, \tag{2.20}$$
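$$w \cdot u \equiv g_{\alpha\beta}\,u^\alpha w^\beta. \tag{2.25}$$
$$w \cdot w \equiv g_{\alpha\beta}\,w^\alpha w^\beta. \tag{2.26}$$
$$w_\alpha \equiv g_{\alpha\beta}\,w^\beta. \tag{2.27}$$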
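The contractions in (2.25)–(2.27) are plain double sums over the repeated indices, which a few lines of Python make explicit. This is only a minimal sketch, assuming the Minkowski metric diag(−1, 1, 1, 1) from above and units with c = 1; the helper names `dot` and `lower` and the sample four-velocity are illustrative choices, not part of the notes.

```python
import math

# Minkowski metric in Cartesian coordinates, eta = diag(-1, 1, 1, 1).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def dot(g, u, w):
    """Inner product w.u = g_ab u^a w^b, summed over both repeated indices."""
    return sum(g[a][b] * u[a] * w[b] for a in range(4) for b in range(4))

def lower(g, w):
    """Lower an index: w_a = g_ab w^b."""
    return [sum(g[a][b] * w[b] for b in range(4)) for a in range(4)]

# Illustrative four-velocity of a particle moving with v = 0.6 along x (c = 1).
v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v * v)
u = [gamma, gamma * v, 0.0, 0.0]

print(dot(ETA, u, u))   # close to -1: a timelike, unit-normalised vector
print(lower(ETA, u))    # only the time component changes sign
```

With a diagonal metric the double sum collapses to four terms, which is why only orthogonal coordinate systems are needed in this course.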
as well as the matter degrees of freedom. The requirement of invariance under gen-
eral coordinate transformations restricts the equation of motion (in four dimensions)
to have the form
$$G_{\mu\nu} = 8\pi G_N T_{\mu\nu}, \tag{2.34}$$
where Gµν is a unique tensor constructed from the metric and its first and second
derivatives and Tµν is the energy-momentum tensor, also known as the stress-energy
tensor. This equation specifies how the geometry of spacetime and its matter content
interact, in other words it is the law of gravity according to general relativity. We
will not discuss the Einstein tensor or this equation in much detail in this course.
In the first part of the course we only need it in the case of the homogeneous and
isotropic approximation, and in the second part we will look at small perturbations
around this. However, we have explained a little bit about general relativity to give
some idea of the mathematical structure which underlies the Friedmann-Robertson-
Walker models.
The energy-momentum tensor describes all properties of matter which affect the
spacetime, namely energy density, momentum density, pressure, and stress. For
frictionless continuous matter, a perfect fluid, it has the form
$$T_{\mu\nu} = (\rho + p)\,u_\mu u_\nu + p\,g_{\mu\nu}, \tag{2.35}$$
where ρ is the energy density and p is the pressure measured by an observer moving
with four-velocity uµ (such an observer is in the rest frame of the fluid). In cosmology
we can usually assume that the energy-momentum tensor has the perfect fluid form. $T_{00}$ is the
energy density in the coordinate frame, $T_{i0}$ gives the momentum density, which is
equal to the energy flux $T_{0i}$, and $T_{ij}$ gives the flux of momentum i-component in
j-direction.
In Newton’s theory the source of gravity is mass, in the case of continuous matter,
the mass density $\rho_m$. According to Newton, the gravitational field $\vec{g}_N$ is given by
the equation
$$\nabla^2\Phi = -\nabla\cdot\vec{g}_N = 4\pi G\rho_m, \tag{2.36}$$
where Φ is the gravitational potential. (We earlier discussed Newton’s law in the
form of the force law for point particles; this potential formulation for a continuous
medium is equivalent, for finite systems.) Comparing (2.36) to (2.34), the mass
density $\rho_m$ has been replaced by $T_{\mu\nu}$, and $\nabla^2\Phi$ has been replaced by the Einstein
tensor $G_{\mu\nu}$, which is a short way of writing a complicated expression built from $g_{\mu\nu}$
and its first and second derivatives. Thus the gravitational potential is replaced
by the 10-component tensor gµν .
In the case of a weak gravitational field, the metric is close to the Minkowski
metric, and it can be written as
Comparing this to Eq. (2.36) we see that the mass density ρm has been replaced
by ρ + 3p. For relativistic matter, where mass is not the dominant contribution
to the energy density and p can be of the same order of magnitude as ρ, this is
In order to determine the angular diameter distance, we need to know the proper
size of the object we are observing. In cosmology, this can be done reliably only in
a few cases, the most notable of which is the pattern of the anisotropy of the CMB,
which we will discuss in the second part of the course.
The luminosity distance is defined in a similar manner. In Euclidean space, if an
object radiates isotropically with absolute luminosity L (this is the radiated energy
per unit time you would measure next to the object), an observer at distance d sees
the flux (energy per unit time per unit area)
$$F = \frac{L}{4\pi d^2}. \tag{2.42}$$
In general relativity, the luminosity distance dL is defined as
$$d_L \equiv \sqrt{\frac{L}{4\pi F}}. \tag{2.43}$$
As with the angular diameter distance, objects in curved spacetime are further
away because they look fainter, not the other way around. (However, at least in
homogeneous and isotropic models, the luminosity distance behaves qualitatively as
expected from Euclidean intuition, unlike dA .)
In any spacetime, the two distances are related by $d_L = (1+z)^2 d_A$, so there is
really only one independent observational cosmological distance measure.
In astronomy, luminosity is often expressed in terms of magnitude. This system
hails back to the ancient Greeks, who classified stars visible to the naked eye into six
classes according to their brightness. Magnitude in modern astronomy is defined so
that it roughly matches this ancient classification, but it is not restricted to positive
integers. The magnitude scale is logarithmic in such a way that a difference of 5
magnitudes corresponds to a factor of 100 in luminosity4 . The absolute magnitude
M and the apparent magnitude m of an object are defined as
$$M \equiv -2.5\log_{10}\frac{L}{L_0}, \qquad m \equiv -2.5\log_{10}\frac{F}{F_0}, \tag{2.44}$$
where L0 and F0 are reference luminosity and flux. There are actually different
magnitude scales corresponding to different regions of the electromagnetic spectrum,
with different reference luminosities. The bolometric magnitude and luminosity refer
to the power or flux integrated over all frequencies, whereas the visual magnitude
and luminosity refer only to the visible light. In the bolometric magnitude scale
$L_0 = 3.0 \times 10^{28}$ W. The reference flux $F_0$ for the apparent scale is chosen in relation
to the absolute scale so that a star whose distance is d = 10 pc has m = M. From
this, (2.43) and (2.44), it follows that the difference between apparent and absolute
magnitudes is related to the luminosity distance as
$$m - M = -5 + 5\log_{10}\frac{d_L}{\mathrm{pc}}.$$
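As a quick numerical illustration of the magnitude definitions (2.44), the sketch below computes m − M from a luminosity distance. It assumes only what is stated above, namely that $F_0$ is fixed by requiring m = M at 10 pc, so that (2.43) and (2.44) give m − M = 5 log10(d_L / 10 pc). The function name and the sample distances are illustrative choices.

```python
import math

def distance_modulus(d_L_pc):
    """Return m - M for a source at luminosity distance d_L_pc (in parsecs).

    Uses m - M = 5 log10(d_L / 10 pc), which follows from (2.43) and (2.44)
    when F0 is chosen so that m = M at a distance of 10 pc.
    """
    return 5.0 * math.log10(d_L_pc / 10.0)

# A source at 10 pc has m = M by construction.
print(distance_modulus(10.0))
# Each factor of 10 in distance adds 5 magnitudes (a factor 100 in flux).
print(distance_modulus(100.0))
# A source at 1 Mpc.
print(distance_modulus(1.0e6))
```

Because the scale is logarithmic, a small error in magnitude translates into a fixed relative error in distance, whatever the distance is.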
References
[1] C.M. Will, The confrontation between general relativity and experiment, Liv-
ing Rev. Rel. 9 (2006) 3, http://www.livingreviews.org/lrr-2006-3 [arXiv:gr-
qc/0510072]
[2] C.W. Misner, K.S. Thorne, J.A. Wheeler, Gravitation (Freeman 1973)
3 THE FRIEDMANN-ROBERTSON-WALKER MODEL 29
$$ds^2 = -dt^2 + a^2(t)\left[\frac{dr^2}{1 - Kr^2} + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2\right]. \tag{3.1}$$
In either form, this is called the Robertson–Walker (RW) metric, sometimes the
Friedmann–Robertson–Walker (FRW) metric or the Friedmann–Lemaître–Robertson–Walker
(FLRW) metric1 . Note that neither form of the metric has the same amount
of symmetry as the spacetime itself: the metrics are isotropic, but not homogeneous.
The full symmetry of the spacetime is usually not apparent in the metric itself, even
though all physical quantities calculated from the metric display the symmetry. The
time coordinate t is the cosmic time. Here K is a constant, related to curvature of
space (not spacetime) and a(t) is a function of time which tells how the universe
expands (or contracts). We call
$$R_{\rm curv} \equiv \frac{a(t)}{\sqrt{|K|}} \tag{3.3}$$
the curvature radius of space (at time t). The metric (3.1) is given in spherical
coordinates. We see immediately that the 2-dimensional surfaces t = r = const have
the metric of a sphere with radius ar. The time-dependent factor a(t) is called the
1
The most commonly used term is the FRW metric. However, some authors prefer to make
the distinction between the geometry (with the names Robertson and Walker attached) and the
equations of motion (endowed with the name Friedmann and sometimes also Lemaître).
scale factor. We will need the Einstein equation to solve a(t). From the geometrical
point of view, it is just an arbitrary function of the time coordinate t.
We have the freedom to rescale the radial coordinate r. For example, we can
multiply all values of r by a factor of 2, if we also divide a by a factor of 2 and K
by a factor of 4. The geometry of the spacetime stays the same, the meaning of the
coordinate r has just changed: the point that had a given value of r has now twice
that value in the rescaled coordinate system. There are two common ways to use
the rescaling to make the notation easier. If $K \neq 0$, we can rescale r to make K
equal to ±1. In this case K is usually denoted k, r is dimensionless, and
a(t) has the dimension of distance. The other way is to set the scale factor today
to unity2, $a(t_0) \equiv a_0 = 1$. We will use this latter convention in the course. In this
case a(t) is dimensionless, and r and $K^{-1/2}$ have the dimension of distance.
If K = 0, the space part (t = const.) of the Robertson–Walker metric is flat.
The 3-metric (the space part of the full metric) is that of ordinary Euclidean space,
with the radial distance given by ar. The spacetime, however, is curved, since a(t)
depends on time, describing the expansion or contraction of space. It is often said
that the “universe is flat” in this case, though if the universe is understood as the
four-dimensional spacetime (as opposed to a spatial slice), “spatially flat” would be
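more correct.
If K > 0, the coordinate system is singular at $r = r_K \equiv 1/\sqrt{K}$. (Remember
the discussion of the 2-sphere in the previous chapter.) With the substitution
(coordinate transformation) $r = r_K \sin\chi$ the metric becomes
$$ds^2 = -dt^2 + a^2(t)\,r_K^2\left[d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\varphi^2)\right]. \tag{3.4}$$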
The spatial part has the metric of a 3D hypersphere, a sphere with one extra dimension.
There is a new angular coordinate, χ, whose values range from 0 to π, just
like θ. The singularity at $r = 1/\sqrt{K}$ disappears in this coordinate transformation,
showing that it was just a coordinate artifact, not a physical singularity. The original
coordinates covered only half of the hypersphere, as the coordinate singularity
$r = 1/\sqrt{K}$ divides the hypersphere into two halves. The case K > 0 corresponds to
a closed universe, whose spatial curvature is positive.3 This is a finite universe, with
circumference $2\pi a r_K = 2\pi R_{\rm curv}$ and volume $2\pi^2 a^3 r_K^3 = 2\pi^2 R_{\rm curv}^3$, and $R_{\rm curv}$ is the
radius of the hypersphere.
If K < 0, there is no coordinate singularity, and r ranges from 0 to ∞. The
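substitution $r = |K|^{-1/2}\sinh\chi$ is, however, often useful in calculations. The case
K < 0 corresponds to an open universe, the spatial curvature of which is negative.
The metric is then
$$ds^2 = -dt^2 + a^2(t)\,|K|^{-1}\left[d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\,d\varphi^2)\right]. \tag{3.5}$$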
Figure 1: The hypersphere. This figure is for K = k = 1. Consider the semicircle in the
figure. It corresponds to χ ranging from 0 to π. You get the (2-dimensional) sphere by
rotating this semicircle off the paper around the vertical axis by an angle ∆ϕ = 2π. You get
the (3-dimensional) hypersphere by rotating it twice, in two extra dimensions, by ∆θ = π
and by ∆ϕ = 2π, so that each point makes a sphere. Thus each point in the semicircle
corresponds to a full sphere with coordinates θ and ϕ, and radius $(a/\sqrt{K})\sin\chi$.
The Robertson–Walker metric has two associated length scales, both of which
in general evolve in time. The first is the curvature radius, $R_{\rm curv} \equiv a|K|^{-1/2}$. The
second is the time scale of the expansion, the Hubble time, $t_H \equiv H^{-1}$, where $H \equiv \dot a/a$
is the Hubble parameter. The Hubble time multiplied by the speed of light, c = 1,
gives the Hubble length, $\ell_H \equiv c\,t_H = H^{-1}$. In the case K = 0 the universe is flat, so
the Hubble length is the only length scale.
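To get a feeling for the sizes involved, the following sketch converts a Hubble parameter into a Hubble time and a Hubble length in conventional units. The value of 70 km/s/Mpc is an assumed, purely illustrative number that is not taken from this section, and the function names are hypothetical. Factors of c are restored here, although the notes use c = 1.

```python
# Physical constants in SI units.
C_M_PER_S = 299_792_458.0   # speed of light
MPC_IN_M = 3.0857e22        # one megaparsec in metres
GYR_IN_S = 3.15576e16       # one gigayear (Julian years) in seconds

def hubble_time_gyr(H_km_s_mpc):
    """Hubble time t_H = 1/H in gigayears, for H given in km/s/Mpc."""
    H_si = H_km_s_mpc * 1.0e3 / MPC_IN_M   # convert H to 1/s
    return 1.0 / H_si / GYR_IN_S

def hubble_length_mpc(H_km_s_mpc):
    """Hubble length l_H = c/H in megaparsecs."""
    return (C_M_PER_S / 1.0e3) / H_km_s_mpc

H0 = 70.0  # assumed illustrative value in km/s/Mpc, not taken from the text
print(hubble_time_gyr(H0))     # roughly 14 Gyr
print(hubble_length_mpc(H0))   # roughly 4300 Mpc
```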
The coordinates (t, r, θ, ϕ) of the Robertson–Walker metric are called comoving
coordinates. This means that the coordinate system follows the expansion of space,
so that the space coordinates of objects which do not move with respect to the
background remain the same. The homogeneity of the universe fixes a special frame
of reference, the cosmic rest frame given by the above coordinate system, so (unlike
in the empty Minkowski space) the concept “does not move” has a specific meaning
(as long as the energy density and pressure are not zero). The coordinate distance
between two such objects stays the same, but their physical, or proper, distance
grows with time as space expands. The time coordinate t, the cosmic time, gives
the time measured by such an observer, at (r, θ, ϕ) = const.
It can be shown that expansion causes the motion of an object in free fall to slow
down with respect to the comoving coordinate system. For nonrelativistic physical
velocities we have
$$v(t_2) = \frac{a(t_1)}{a(t_2)}\,v(t_1). \tag{3.6}$$
The velocity of a galaxy with respect to the background is called peculiar velocity5 .
5
When there are perturbations, the split between the background and the perturbations is a
delicate issue, and statements like “moving with respect to the background” have to be phrased
3.1.3 Redshift
As mentioned in chapter 1, redshift is one of the most important cosmological ob-
servables. Let us find how it is related to the spacetime geometry in the case of
the FRW metric. Consider galaxy A. Light leaves the galaxy at time t1 with wave-
length λ1 and arrives at galaxy O at time t2 with wavelength λ2 . It takes a time
δt1 = λ1 /c = 1/f1 to send one wavelength and a time δt2 = λ2 /c = 1/f2 to receive
one wavelength. Follow now the two light rays sent at times t1 and t1 + δt1 (see
figure). We can choose the coordinates such that the light path is radial (since all
directions are equivalent), θ and ϕ stay constant while t and r change. Light follows
lightlike geodesics for which
$$ds^2 = 0. \tag{3.8}$$
We thus have
$$ds^2 = -dt^2 + a^2(t)\frac{dr^2}{1 - Kr^2} = 0 \tag{3.9}$$
$$\Rightarrow\quad \frac{dt}{a(t)} = \frac{-dr}{\sqrt{1 - Kr^2}}. \tag{3.10}$$
Integrating this, we get for the first light ray,
$$\int_{t_1}^{t_2}\frac{dt}{a(t)} = \int_0^{r_A}\frac{dr}{\sqrt{1 - Kr^2}}, \tag{3.11}$$
and for the second,
$$\int_{t_1+\delta t_1}^{t_2+\delta t_2}\frac{dt}{a(t)} = \int_0^{r_A}\frac{dr}{\sqrt{1 - Kr^2}}. \tag{3.12}$$
The right hand sides of the two equations are the same, since the sender and the
receiver have not moved (they stay at r = rA and r = 0). Thus
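$$0 = \int_{t_1+\delta t_1}^{t_2+\delta t_2}\frac{dt}{a(t)} - \int_{t_1}^{t_2}\frac{dt}{a(t)} = \int_{t_2}^{t_2+\delta t_2}\frac{dt}{a(t)} - \int_{t_1}^{t_1+\delta t_1}\frac{dt}{a(t)} = \frac{\delta t_2}{a(t_2)} - \frac{\delta t_1}{a(t_1)}, \tag{3.13}$$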
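The statement in (3.13), that the interval between two light rays is stretched by the ratio of the scale factors, can be checked numerically. The sketch below assumes a toy expansion history a(t) = t^(2/3), chosen only because the integral of dt/a can then be done in closed form; the function names and the numbers are illustrative. Since δt = λ/c, the same ratio gives the stretching of the wavelength.

```python
def scale_factor(t):
    """Assumed toy model: a(t) = t^(2/3), in units where a(1) = 1."""
    return t ** (2.0 / 3.0)

def comoving_path(t_emit, t_obs):
    """Integral of dt/a(t) from t_emit to t_obs, analytic for a = t^(2/3)."""
    return 3.0 * (t_obs ** (1.0 / 3.0) - t_emit ** (1.0 / 3.0))

def arrival_time(t_emit, path):
    """Invert comoving_path: when has light emitted at t_emit covered `path`?"""
    return (path / 3.0 + t_emit ** (1.0 / 3.0)) ** 3

t1, t2 = 0.1, 1.0              # emission and reception of the first ray
path = comoving_path(t1, t2)   # fixed comoving separation of sender and receiver
dt1 = 1.0e-6                   # delay between the two emitted rays
dt2 = arrival_time(t1 + dt1, path) - arrival_time(t1, path)

print(dt2 / dt1)                             # stretching of the time interval
print(scale_factor(t2) / scale_factor(t1))   # a(t2)/a(t1), as (3.13) predicts
```

The two printed numbers agree up to corrections of relative size δt1/t1, which is the approximation made in the last step of (3.13).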
carefully. We will discuss this a bit more in the second part of the course.
where we have already used the knowledge that for realistic cosmological models,
the universe has a finite age and that at the beginning a = 0 (i.e. z = ∞), and
chosen the beginning of time as t = 0. Putting z = 0 gives the present age of the
universe,
$$t_0 = \int_0^\infty \frac{dz'}{1+z'}\,\frac{1}{H(z')}. \tag{3.18}$$
Subtracting the two tells us for how long the photons travelled to come to our
detectors:
$$\Delta t \equiv t_0 - t(z) = \int_0^z \frac{dz'}{1+z'}\,\frac{1}{H(z')}. \tag{3.19}$$
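The integrals (3.18) and (3.19) are easy to evaluate numerically once H(z) is given. The sketch below is an illustration under an assumed toy model, a spatially flat universe containing only pressureless matter, for which H(z) = H0 (1+z)^(3/2) and the answers are known in closed form. The function names are hypothetical and times are in units of 1/H0.

```python
def midpoint(f, lo, hi, n=20000):
    """Midpoint-rule integral of f over [lo, hi]; never evaluates the endpoints."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def hubble_eds(z, H0=1.0):
    """Assumed toy model (flat, matter only): H(z) = H0 (1+z)^(3/2)."""
    return H0 * (1.0 + z) ** 1.5

def lookback_time(z, H=hubble_eds):
    """Eq. (3.19): time the light from redshift z has travelled."""
    return midpoint(lambda zp: 1.0 / ((1.0 + zp) * H(zp)), 0.0, z)

def age_today(H=hubble_eds):
    """Eq. (3.18), using x = 1/(1+z') to map z' in [0, inf) onto x in (0, 1]."""
    # dz'/(1+z') = -dx/x, so t0 is the integral over x of 1 / (x H(1/x - 1)).
    return midpoint(lambda x: 1.0 / (x * H(1.0 / x - 1.0)), 0.0, 1.0)

print(age_today())          # analytic value for this model: 2/(3 H0)
print(lookback_time(3.0))   # analytic value: (2/3)(1 - 1/8)/H0
```

The change of variable in `age_today` avoids truncating the infinite redshift range by hand; any other H(z) can be passed in through the `H` argument.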
Note that whereas time t is a coordinate whose origin is in the past (in usual
cosmological models it is chosen to be at the beginning of the universe), the origin
of the redshift is set to be today. Conceptually, t is just like Newtonian time, so
it is simple to use. For example, if we discuss two different cosmological models, it
is straightforward to compare them when the universe has the same age (assuming
both have a beginning of time). In contrast, comparing them at the same redshift
doesn’t make sense unless you specify by which criteria you select “today” in the two
models. Observationally, however, it is difficult to determine the age of the universe,
while it is easy to measure the redshift. The redshift is useful when it is expressed in
relation to some quantities which are easier to measure than time, such as distances,
to which we now turn.
$$d_A = a(t)\,r = \frac{r}{1+z}. \tag{3.20}$$
Now we have to relate the radial coordinate r to the observed redshift. As light
travels on null geodesics, we have
$$ds^2 = -dt^2 + a(t)^2\frac{dr^2}{1 - Kr^2} = 0 \quad\Rightarrow\quad dt = -a(t)\frac{dr}{\sqrt{1 - Kr^2}}. \tag{3.21}$$
Since we place the observer at the center, the radial coordinate for incoming light
rays decreases as time increases, hence the minus sign. Integrating, we obtain
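$$\int_{t_1}^{t_0}\frac{dt}{a(t)} = \int_0^r\frac{dr}{\sqrt{1 - Kr^2}} = (-K)^{-1/2}\,\mathrm{arsinh}\!\left[(-K)^{1/2} r\right] = \begin{cases} K^{-1/2}\arcsin(K^{1/2} r), & K > 0 \\ r, & K = 0 \\ |K|^{-1/2}\,\mathrm{arsinh}(|K|^{1/2} r), & K < 0. \end{cases} \tag{3.22}$$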
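The three cases of (3.22) and their inverses translate directly into code. This is a minimal sketch; the names `S_K_inv` and `S_K` are chosen to mirror the notation $S_K^{-1}$ and $S_K$ used later in these notes, and the sample values of K and r are arbitrary.

```python
import math

def S_K_inv(r, K):
    """Value of the radial integral in (3.22) out to coordinate r."""
    if K > 0:
        return math.asin(math.sqrt(K) * r) / math.sqrt(K)   # needs r <= 1/sqrt(K)
    if K < 0:
        return math.asinh(math.sqrt(-K) * r) / math.sqrt(-K)
    return r

def S_K(x, K):
    """Inverse of S_K_inv: the coordinate r at which the integral equals x."""
    if K > 0:
        return math.sin(math.sqrt(K) * x) / math.sqrt(K)
    if K < 0:
        return math.sinh(math.sqrt(-K) * x) / math.sqrt(-K)
    return x

for K in (1.0, 0.0, -1.0):
    r = 0.5
    x = S_K_inv(r, K)
    print(K, x, S_K(x, K))   # the last column should reproduce r = 0.5
```

For a given r the integral is largest for positive curvature and smallest for negative curvature, and all three cases agree when |K| r² is small.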
can occur even if the spatial geometry is Euclidean (i.e. K = 0). It is related to the
fact that the angular diameter distance is defined along a lightlike direction in the
non-Euclidean spacetime, not along a spatial slice (which may itself, of course, be
non-Euclidean).
$$A = 4\pi r^2. \tag{3.29}$$
$$F = \frac{N_\gamma E_{\rm obs}}{t_{\rm obs} A} = \frac{N_\gamma E_{\rm em}}{t_{\rm em}}\,\frac{1}{(1+z)^2}\,\frac{1}{4\pi r^2}. \tag{3.31}$$
We thus have
$$d_L = \sqrt{\frac{L}{4\pi F}} = (1+z)\,r = (1+z)^2 d_A(z) = (1+z)\frac{1}{\sqrt{-K}}\sinh\!\left(\sqrt{-K}\int_0^z\frac{dz'}{H(z')}\right), \tag{3.32}$$
where we have used (3.20) and (3.25). Compared to the angular diameter distance
dA (z), there are two extra factors of 1 + z. One-half comes from the redshift of
photon energy, one-half from cosmological time dilation in receiving the emitted
photons, and one from the change in the area element. (As we mentioned in chapter
2, this relation holds for a general spacetime, not just for the FRW universe; however,
proving the general case is more complicated.)
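The two distance measures can be compared numerically. The sketch below is restricted to the spatially flat case K = 0, where (3.32) reduces to r equal to the integral of dz'/H(z'), and it assumes the toy expansion law H(z) = H0 (1+z)^(3/2) of a matter-only universe. Function names are hypothetical and distances are in units of 1/H0.

```python
def comoving_r_flat(z, H, n=10000):
    """r = integral of dz'/H(z') from 0 to z (K = 0 only), by the midpoint rule."""
    h = z / n
    return h * sum(1.0 / H((i + 0.5) * h) for i in range(n))

def hubble_eds(z, H0=1.0):
    """Assumed toy model (flat, matter only): H(z) = H0 (1+z)^(3/2)."""
    return H0 * (1.0 + z) ** 1.5

def d_A(z, H=hubble_eds):
    """Angular diameter distance, eq. (3.20)."""
    return comoving_r_flat(z, H) / (1.0 + z)

def d_L(z, H=hubble_eds):
    """Luminosity distance, second equality of eq. (3.32)."""
    return (1.0 + z) * comoving_r_flat(z, H)

for z in (0.1, 1.0, 1.25, 5.0):
    print(z, d_A(z), d_L(z), d_L(z) / d_A(z))   # last column equals (1+z)^2
```

In this toy model $d_L$ grows monotonically with redshift, while $d_A$ reaches a maximum near z = 1.25 and then decreases, which is the behaviour contrary to Euclidean intuition mentioned earlier.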
$$ds^2 = a(t)^2\frac{dr^2}{1 - Kr^2}, \tag{3.33}$$
so the proper distance is
$$d_P = \int_0^A ds = a(t)\int_0^{r_A}\frac{dr}{\sqrt{1 - Kr^2}} = a(t)\,S_K^{-1}(r). \tag{3.34}$$
The distance between two points which are fixed in the comoving coordinates
grows proportionally to the scale factor as the universe expands, like we would
expect.
In cosmology, it is common to use the comoving distance, which just means
the physical distance to redshift z scaled by the ratio of the scale factor
now to the scale factor then. So if we have some distance measure d(z), the corresponding
comoving distance, denoted $d^c(z)$, is $d^c(z) = (1+z)\,d(z)$. The idea is that it is easier
to compare objects from different eras if we discuss them in terms of the distance
they would now span. For example, the sound horizon of the photon-baryon plasma
at the time of last scattering when the universe was about 380 000 years old is
$r_s \approx 0.13$ Mpc, whereas the comoving sound horizon is $(1+z_*)\,r_s \approx 140$ Mpc,
where $z_* \approx 1090$ is the redshift of the last scattering surface. This is especially
convenient for the comoving proper distance, which remains constant in time,
$$d^c_P = (1+z)\,d_P = S_K^{-1}(r). \tag{3.35}$$
The relation (3.34) shows how the coordinate r is related to the physical distance
$d_P$,
$$r = S_K\!\left(\frac{d_P}{a(t)}\right). \tag{3.36}$$
The radial coordinate r does not give the physical distance, but nevertheless has
a clear physical interpretation. The physical distance to an object at coordinate r
is $d_P$, the circumference of the circle with physical radius $d_P(t, r)$ is $2\pi a(t) r$ and the surface
area of the corresponding sphere is $4\pi a(t)^2 r^2$, as can be immediately verified from the FRW metric (3.1).
The functions $S_K$ and $S_K^{-1}$ convert between two natural length measures of an
FRW universe: the proper distance measured along the radial line (i.e. the proper
radius) and the area distance measured along the surface of a sphere. The fact
that these quantities do not agree is a reflection of the fact that the space is non-Euclidean.
In the flat case with K = 0, we have simply $S_K(x) = x$, as the space is
Euclidean. In this case the only relativistic effect is the stretching of space.
In addition to the straightforward issue of proper distance as a function of time
as measured on the spacelike slice, we can ask the following slightly more involved
question: if we see (along a null geodesic) a galaxy at redshift z, what is the proper
distance (along the spacelike slice) to the galaxy today? Here we assume that the
galaxy is at rest in the comoving frame (i.e. we neglect peculiar velocities) and still
exists today. (In fact, we cannot know what has happened to the galaxy since the
light left it.)
From (3.20), (3.25) and (3.34) we have for the proper distance to an object that
emitted light at time $t_1$, as measured at time t:
$$d_P(t_1, t) = a(t)\,S_K^{-1}(d_A/a) = a(t)\int_{t_1}^{t}\frac{dt'}{a(t')} = (1+z)^{-1}\int_{z}^{z_1}\frac{dz'}{H(z')}. \tag{3.37}$$
Note that this result is independent of spatial curvature. So the proper distance to
redshift z today is
$$d_P(z) = \int_0^z\frac{dz'}{H(z')}. \tag{3.38}$$
$$d_A \simeq d_L \simeq d_P \simeq H_0^{-1} z \tag{3.40}$$
for z ≪ 1. For redshifts that are not small, the relation between the distance and the
redshift is more complicated, as shown by (3.25), (3.32) and (3.37). We need to know
not just the present value H0 , but the function H(z) all the way to the redshift of
the source (in the case of the angular diameter distance and the luminosity distance,
we also need the spatial curvature). The function H(z) is determined by the matter
content according to the dynamics of general relativity, to which we now turn.
3.2 Dynamics
3.2.1 The Friedmann equations
The considerations thus far have been purely geometrical and kinematical. In order
to find how the scale factor a(t) evolves, we need to consider the equations of mo-
tion, given by the Einstein equation. The Robertson-Walker metric of (3.1) has the
components
$$g_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & \dfrac{a^2}{1 - Kr^2} & 0 & 0 \\ 0 & 0 & a^2 r^2 & 0 \\ 0 & 0 & 0 & a^2 r^2\sin^2\theta \end{pmatrix}. \tag{3.41}$$
Calculating the Einstein tensor from this metric gives
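$$G^0{}_0 = -3\frac{\dot a^2}{a^2} - 3\frac{K}{a^2} \tag{3.42}$$
$$G^i{}_j = -\left(2\frac{\ddot a}{a} + \frac{\dot a^2}{a^2} + \frac{K}{a^2}\right)\delta^i_j \tag{3.43}$$
$$G^0{}_i = 0. \tag{3.44}$$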
From the symmetry of the spacetime it follows that the energy-momentum tensor
has the perfect fluid form,
$$T^\mu{}_\nu = \begin{pmatrix} -\rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}, \tag{3.45}$$
where ρ is the energy density and p is the pressure. Homogeneity implies that they
only depend on time, ρ = ρ(t), p = p(t).
In general, the Einstein equation Gαβ = 8πGN Tαβ is a non-linear system of ten
partial differential equations. In the case of the FRW universe, it reduces to two
ordinary non-linear differential equations:
$$3\frac{\dot a^2}{a^2} + 3\frac{K}{a^2} = 8\pi G_N\rho \tag{3.46}$$
$$-2\frac{\ddot a}{a} - \frac{\dot a^2}{a^2} - \frac{K}{a^2} = 8\pi G_N p. \tag{3.47}$$
This pair of equations can be rearranged as
$$3\frac{\dot a^2}{a^2} + 3\frac{K}{a^2} = 8\pi G_N\rho \tag{3.48}$$
$$3\frac{\ddot a}{a} = -4\pi G_N(\rho + 3p). \tag{3.49}$$
These are the Friedmann equations. (“Friedmann equation” in the singular refers
to (3.48).)
The general relativity version of energy and momentum conservation, energy-
momentum continuity, follows from the Einstein equation. In the present case this
becomes the energy continuity equation (sometimes this is considered to be one of
the Friedmann equations)
$$\dot\rho = -3(\rho + p)\frac{\dot a}{a}. \tag{3.50}$$
a
Since the symmetry of the situation forbids fluid flow in the spatial directions, the
equation corresponding to momentum conservation is satisfied identically. (Exer-
cise: Derive (3.50) from the Friedmann equations.)
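Before attempting the derivation, it can be reassuring to check numerically that the continuity equation holds for arbitrary expansion histories. The sketch below is a sanity check rather than a derivation: for a trial function a(t) it reads off ρ and p from (3.46) and (3.47) using finite differences, in assumed units where $8\pi G_N = 1$, and evaluates the combination that (3.50) says must vanish. The function names and the two trial histories are illustrative choices.

```python
def friedmann_fluid(a, t, K=0.0, h=1.0e-4):
    """Return (rho, p, H) implied by (3.46)-(3.47) for a given a(t), with 8 pi G_N = 1."""
    a0 = a(t)
    adot = (a(t + h) - a(t - h)) / (2.0 * h)            # central first difference
    addot = (a(t + h) - 2.0 * a0 + a(t - h)) / h ** 2   # central second difference
    rho = 3.0 * (adot ** 2 + K) / a0 ** 2               # eq. (3.46)
    p = -2.0 * addot / a0 - (adot ** 2 + K) / a0 ** 2   # eq. (3.47)
    return rho, p, adot / a0

def continuity_residual(a, t, K=0.0, dt=1.0e-3):
    """rho_dot + 3 H (rho + p), which should vanish if (3.50) holds."""
    rho, p, H = friedmann_fluid(a, t, K)
    rho_plus = friedmann_fluid(a, t + dt, K)[0]
    rho_minus = friedmann_fluid(a, t - dt, K)[0]
    rho_dot = (rho_plus - rho_minus) / (2.0 * dt)
    return rho_dot + 3.0 * H * (rho + p)

# Two arbitrary trial histories (illustrative choices only).
print(continuity_residual(lambda t: t ** (2.0 / 3.0), 1.0))                # flat case
print(continuity_residual(lambda t: t ** 0.5 + 0.1 * t ** 2, 1.0, K=0.3))  # curved case
```

Both residuals are zero up to finite-difference error, whatever a(t) and K are chosen, which reflects the fact that (3.50) is a consequence of the Friedmann equations and not an independent condition.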
In fact, (3.50) is not a conservation equation for the energy. Rather, it shows
how the energy density evolves as the universe expands. We can rewrite (3.50) as
$$p = -\frac{1}{3H}\,\frac{1}{a^3}\,\frac{d(a^3\rho)}{dt} = -\frac{d(a^3\rho)}{d(a^3)}. \tag{3.51}$$
If the pressure is zero, the energy contained in a volume remains constant as the
universe expands or contracts. If the pressure is positive, the total amount of energy
decreases with the expansion of the universe (and increases if the universe contracts).
If the pressure is negative, the opposite happens: the energy of an expanding universe
increases. We can compare (3.51) with the first law of thermodynamics,
$$T\,dS = dU + p\,dV - \sum_i \mu_i\,dN_i, \tag{3.52}$$
$$3H^2 = 8\pi G_N\rho - 3\frac{K}{a^2}. \tag{3.53}$$
Dividing by $3H^2$, we have
$$1 = \frac{8\pi G_N\rho}{3H^2} - \frac{K}{(aH)^2} \equiv \Omega + \Omega_K, \tag{3.54}$$
$$\rho_c(t) \equiv \frac{3H(t)^2}{8\pi G_N}. \tag{3.56}$$
The critical density is the energy density that a spatially flat universe that expands
with the rate H(t) would have. So the critical density changes as the universe evolves
and the Hubble parameter changes. Often in cosmology the term critical density is
used to refer just to the present value. We always use the subscript 0 when referring
to the present value:
$$\rho_{c0} \equiv \rho_c(t_0) = \frac{3H_0^2}{8\pi G_N}, \tag{3.57}$$
so we have
$$\Omega(t) \equiv \frac{\rho(t)}{\rho_c(t)}. \tag{3.58}$$
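To see how small the critical density is in everyday units, the sketch below evaluates (3.56) in SI units. It assumes an illustrative Hubble parameter of 70 km/s/Mpc, which is not a value quoted in this section, and it returns the result as a mass density, i.e. the energy density divided by c². The function names are hypothetical.

```python
import math

G_N = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22   # one megaparsec in metres

def critical_density(H_km_s_mpc):
    """Critical density (3.56) as a mass density in kg/m^3 (rho_c divided by c^2)."""
    H_si = H_km_s_mpc * 1.0e3 / MPC_IN_M   # convert H to 1/s
    return 3.0 * H_si ** 2 / (8.0 * math.pi * G_N)

def density_parameter(rho, H_km_s_mpc):
    """Density parameter (3.58): Omega = rho / rho_c, with rho as a mass density."""
    return rho / critical_density(H_km_s_mpc)

H0 = 70.0  # assumed illustrative value in km/s/Mpc
print(critical_density(H0))   # of order 1e-26 kg/m^3
```

Since the critical density scales as H², a 10% change in the assumed Hubble parameter changes it by about 20%.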
Positive curvature contributes to the Hubble rate with a negative sign and neg-
ative curvature with a positive sign, as (3.48) shows. In other words, if we measure
that the density of the universe is ρ and the critical density is ρc (i.e. the Hubble
parameter is H), we can make the following conclusion about the spatial curvature:
Thus Ω = 1 implies that the universe is spatially flat, Ω < 1 implies that spatial
curvature is negative and Ω > 1 that spatial curvature is positive. The Friedmann
equation can be written as
$$\Omega(t) = 1 + \frac{K}{a(t)^2 H(t)^2} = 1 + \mathrm{sgn}(K)\left(\frac{\ell_H}{R_{\rm curv}}\right)^2, \tag{3.62}$$
where ℓH is the Hubble length and Rcurv is the curvature radius. If Ω < 1 (or > 1)
at some instant of time, it will stay that way (since K is constant). And if Ω = 1, it
will stay constant, Ω = Ω0 = 1. Observations show that the density of the universe
today is close to critical, Ω0 ≈ 1.
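The relation between Ω and the spatial curvature can be illustrated with a short sketch. It uses the Friedmann equation in the form Ω = 1 + K/(aH)², together with the definitions of the Hubble length and the curvature radius given earlier, in units with c = 1. The function names and sample values are illustrative assumptions.

```python
import math

def omega_from_curvature(K, a, H):
    """Friedmann equation written as Omega = 1 + K / (a H)^2."""
    return 1.0 + K / (a * H) ** 2

def curvature_radius_in_hubble_lengths(omega):
    """R_curv / l_H = 1 / sqrt(|Omega - 1|); infinite for a spatially flat universe."""
    if omega == 1.0:
        return math.inf
    return 1.0 / math.sqrt(abs(omega - 1.0))

# The sign of Omega - 1 equals the sign of K, whatever a and H are.
for K in (-1.0, 0.0, 1.0):
    print(K, omega_from_curvature(K, a=1.0, H=2.0), omega_from_curvature(K, a=3.0, H=0.5))

print(curvature_radius_in_hubble_lengths(1.01))   # about 10 Hubble lengths
```

A density within one percent of critical thus means that the curvature radius is at least about ten times the Hubble length.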
not barotropic, will be important in the second part of the course.) The simplest
possibilities are the following:
• Matter. The term “matter” refers to a form of matter for which the pressure
is zero p = 0, or at least negligible, |p| ≪ ρ. Such a form of matter is also
called “dust”. (The name “dust” is more common in a pure general relativity
context.) This is the case for a gas of free non-relativistic particles, where
the energy density is dominated by the mass. The relation (3.51) shows that
d(ρa³)/dt = 0, or ρ ∝ a⁻³.
• Radiation. The term “radiation” refers to matter for which the pressure
is (exactly or very closely) 1/3 of the energy density, p = ρ/3. This is the
case for a gas of free ultrarelativistic particles, for which the energy density is
dominated by the kinetic energy (i.e. the momentum is much bigger than the
mass). In particular, this always holds for massless particles such as photons.
From (3.51), we get d(ρa⁴)/dt = 0, in other words ρ ∝ a⁻⁴.
• Vacuum energy. For vacuum energy the energy density does not change in
time, ρ = constant. From (3.50) it follows that the pressure is very negative
p = −ρ. (This type of matter is, a bit misleadingly, also called the cosmological
constant; see section 3.2.4 below.) Thus a positive vacuum energy corresponds
to negative vacuum pressure. The total amount of energy increases propor-
tional to the volume of space (because there is more space, and a constant
amount of energy per volume everywhere).
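The three scalings in the list are special cases of one short calculation. As a sketch, assume that (3.51) has the form d(ρa³)/dt = −p d(a³)/dt (an assumption about an equation that is only quoted here) and that the pressure is p = ωρ with a constant ω:

```latex
\frac{d(\rho a^3)}{dt} = -\omega\rho\,\frac{d(a^3)}{dt}
\;\Longrightarrow\;
\frac{d\rho}{\rho} = -3(1+\omega)\,\frac{da}{a}
\;\Longrightarrow\;
\rho \propto a^{-3(1+\omega)} .
```

Setting ω = 0, 1/3 and −1 gives ρ ∝ a⁻³, ρ ∝ a⁻⁴ and ρ = constant, respectively.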
The universe contains non-relativistic matter in the form of ordinary, baryonic
matter (i.e. atoms, ions and electrons) as well as (most probably) dark matter, which
is (practically) pressureless, weakly interacting and extremely cold. Dark matter is
usually thought to consist of a gas of a new heavy particle species. We discuss dark
matter in more detail in chapter 7, at the end of the first part of the course. There is
also radiation, most importantly in the form of the cosmic microwave background,
which is a remnant of the radiation that used to dominate the expansion of the
universe. In addition, there are neutrinos, which behaved like radiation in the early
universe but now behave like matter. This happens for all particles which are not
strictly massless: the kinetic energy falls with the expansion of the universe, so that
at some point the mass starts dominating the particle energy. In chapters 4 and 5
we discuss in detail how different particle species behave as radiation in the early
universe when it is very hot, but as the universe cools, the massive particles change
from ultrarelativistic (radiation) to nonrelativistic (matter). During the transition
period the pressure due to that particle species falls from p = ρ/3 to p ∼ 0. In this
chapter, we focus on the late universe, when it is sufficient to divide matter into dust
(p ≈ 0) and radiation (p ≈ ρ/3), without worrying about the transitions. (Neutrinos
may undergo the transition quite late –the neutrino masses are not precisely known–
but their contribution to the total energy budget is negligible at late times, so we
can skip this detail.)
We have mentioned that the present observational data cannot be explained in
terms of known particles (or hypothetical particles with similar properties), general
relativity and the FRW metric. One of the three assumptions –known forms of
matter, general relativity and the approximation of homogeneity and isotropy– is
then wrong. Sticking to the FRW metric and general relativity, the observations
indicate that the expansion of the universe has accelerated during the past few billion
years. From (3.49) we see that this requires (in the context of general relativity and
the FRW metric) an energy component with negative pressure, dark energy. It
is called dark since it has not been observed to emit or absorb light, and energy,
since the name “dark matter” was already taken (though “dark pressure” might be
more appropriate!). The simplest possibility for dark energy is just the cosmological
constant (vacuum energy), which generates repulsive gravity, leading to accelerated
expansion which fits the data in detail. Therefore we shall carry on our discussion
assuming three energy components: matter, radiation, and vacuum energy. We shall
later comment on how much current observations actually constrain the equation of
state of dark energy, if it is not just vacuum energy.
If the universe contains these three energy components, we can arrange (3.48)
into the form
$$3\frac{\dot{a}^2}{a^2} = 8\pi G_N \rho_{r0}\,a^{-4} + 8\pi G_N \rho_{m0}\,a^{-3} - 3Ka^{-2} + \Lambda , \tag{3.63}$$
where ρr0, ρm0, a0, K, and Λ are constants.⁶ The four terms on the right hand side
are due to radiation, matter, spatial curvature, and vacuum energy, respectively. As
the universe expands (a grows), different components on the right hand side become
important at different times. The universe was first radiation-dominated up to about
50 000 years, then the expansion was dominated by matter until a few billion years
ago, when vacuum energy started to dominate. (The universe has apparently never
been in a state where the spatial curvature would have been the largest term.)
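The order of the eras follows from comparing neighbouring terms of (3.63) pairwise. A minimal Python sketch, normalised to a0 = 1; the parameter values Ωr0 = 10⁻⁴, Ωm0 = 0.3 and ΩΛ0 = 0.7 are illustrative assumptions, and the function name is hypothetical:

```python
def equality_scale_factors(omega_r0, omega_m0, omega_l0):
    """Scale factors (a0 = 1) at which neighbouring terms of the Friedmann
    equation are equal: radiation = matter, and matter = vacuum energy."""
    a_rm = omega_r0 / omega_m0                    # rho_r0 a^-4 = rho_m0 a^-3
    a_ml = (omega_m0 / omega_l0) ** (1.0 / 3.0)   # rho_m0 a^-3 = rho_vac
    return a_rm, a_ml

# Illustrative (assumed) parameter values, not fitted numbers.
a_rm, a_ml = equality_scale_factors(1.0e-4, 0.3, 0.7)
z_rm = 1.0 / a_rm - 1.0   # redshift of matter-radiation equality
z_ml = 1.0 / a_ml - 1.0   # redshift of matter-vacuum equality
```

With these inputs the radiation term dominates only at very small a, and the vacuum term overtakes matter at a somewhat below 1, in line with the sequence described above.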
The radiation component is insignificant at present, and we can forget it in
(3.63), if we exclude the first few million years of the universe from discussion. In
the “inflationary scenario”, there was something resembling a very large vacuum
energy density in the very early universe (during the first small fraction of the
first second). So there may have been a very early “vacuum-dominated” era called
inflation – we will return to this in the second part of the course.
We thus divide the density into matter, radiation, and vacuum components ρ =
ρm + ρr + ρvac , and likewise for the density parameter, Ω = Ωm + Ωr + ΩΛ , where
Ωm ≡ ρm/ρc, Ωr ≡ ρr/ρc, and ΩΛ ≡ ρvac/ρc ≡ Λ/(3H²). The density parameters Ωm,
Ωr , and ΩΛ are functions of time (although ρvac is constant, ρc (t) is not). We have
Ω = Ωm + Ωr + ΩΛ . (3.64)
Even more so than in the case of the critical density, the symbols Ωm , Ωr , ΩΛ and
ΩK are often used to denote the present values of these quantities. In this course, to
avoid confusion, we always use the subscript 0 when referring to the present values,
Ωm0 ≡ Ωm (t0 ), Ωr0 ≡ Ωr (t0 ), ΩΛ0 ≡ ΩΛ (t0 ), ΩK0 ≡ ΩK (t0 ). The present radiation
density is relatively small, Ωr0 ∼ 10⁻⁴ (we will calculate the precise number in
chapter 5). So we usually write just
In addition to being small today, the radiation density is also known very accurately
from the temperature of the cosmic microwave background, and therefore Ωr0 is
⁶ We ignore transfer of energy between the components. Such transfer is important only in the
early universe, before the decoupling of the different particle species, or when particle species go
from being relativistic to non-relativistic. In chapters 4 and 5 we return to this issue in some detail.
vacuum energy has exactly the same effect as a cosmological constant with the value
$$\rho \propto a^{-3(1+\omega)} . \tag{3.67}$$
At a finite time into the past, the scale factor becomes zero; without loss of gener-
ality, we choose the origin of the time coordinate to be there, ti = 0. At this time
the energy density is correspondingly infinite, and the spacetime is also infinitely
curved. This singularity is called the big bang, and it is a general feature not only
of FRW models but of realistic cosmological models which include inhomogeneities.
Space and time do not continue beyond this event. However, at the big bang (or
more properly, as we come near its vicinity) general relativity does not apply any-
more, so we cannot make any definite statements about what happens very near the
beginning. Also, we cannot really expect matter to behave in this simple way in the
early universe; in the second part of the course when we discuss inflation we will see
one possibility of how the early universe can behave differently. (But inflation does
not save us from the cosmological singularity.)
In particular, we have the three cases
ω = 1/3     radiation-dominated            a ∝ t^(1/2)
ω = 0       matter-dominated               a ∝ t^(2/3)
ω = −1/3    curvature-dominated (K < 0)    a ∝ t
The cases K > 0 and vacuum energy have to be treated differently (this is left as
an exercise).
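The three cases are instances of a ∝ t^(2/[3(1+ω)]), which follows from (3.67) and the Friedmann equation when a single component with constant ω > −1 dominates (H² ∝ a^(−3(1+ω)) integrates to a^(3(1+ω)/2) ∝ t). A small Python sketch, with a hypothetical function name:

```python
from fractions import Fraction

def expansion_exponent(omega):
    """Exponent n in a ∝ t^n for a universe dominated by one component
    with constant equation of state omega (requires omega > -1)."""
    omega = Fraction(omega)
    if omega <= -1:
        raise ValueError("power-law solution requires omega > -1")
    return Fraction(2, 3) / (1 + omega)

exponents = {
    "radiation": expansion_exponent(Fraction(1, 3)),    # 1/2
    "matter": expansion_exponent(0),                     # 2/3
    "curvature": expansion_exponent(Fraction(-1, 3)),    # 1
}
```

The vacuum energy case ω = −1 is excluded because the exponent diverges there, consistent with the remark that it has to be treated differently.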
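$$\begin{aligned}
\frac{H}{H_0} &= \sqrt{\Omega_{r0}a^{-4} + \Omega_{m0}a^{-3} + \Omega_{K0}a^{-2} + \Omega_{\Lambda 0}} \\
&= \sqrt{\Omega_{r0}(1+z)^4 + \Omega_{m0}(1+z)^3 + \Omega_{K0}(1+z)^2 + \Omega_{\Lambda 0}} .
\end{aligned} \tag{3.71}$$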
We will have much use for this convenient form of the Friedmann equation. Inserting
(3.71) into the relation (3.17) for the age, we find the time it takes for the universe
to expand from scale factor a1 to a2 , or from redshift z1 to z2 ,
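$$\begin{aligned}
t_2 - t_1 &= \int_{z_2}^{z_1} \frac{dz'}{1+z'}\,\frac{1}{H(z')} \\
&= H_0^{-1}\int_{z_2}^{z_1} \frac{dz'}{(1+z')\sqrt{\Omega_{r0}(1+z')^4 + \Omega_{m0}(1+z')^3 + \Omega_{K0}(1+z')^2 + \Omega_{\Lambda 0}}} \\
&= H_0^{-1}\int_{\frac{1}{1+z_1}}^{\frac{1}{1+z_2}} \frac{da}{\sqrt{\Omega_{r0}a^{-2} + \Omega_{m0}a^{-1} + \Omega_{K0} + \Omega_{\Lambda 0}a^2}} ,
\end{aligned} \tag{3.72}$$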
where the second form is more convenient due to the cancellation of some factors of
1 + z. Recall that ΩK0 = 1 − Ωr0 − Ωm0 − ΩΛ0 ≡ 1 − Ω0 . The expression (3.17) is
integrable to an elementary function if two of the four terms under the root sign are
absent. From this we get the age of the universe t at redshift z as
$$t(z) = H_0^{-1}\int_0^{\frac{1}{1+z}} \frac{da}{\sqrt{\Omega_{r0}a^{-2} + \Omega_{m0}a^{-1} + \Omega_{K0} + \Omega_{\Lambda 0}a^2}} . \tag{3.73}$$
This gives the function t(z), that is, t(a). Inverting this function gives us a(t), the
scale factor as a function of time. Note that a(t) is not necessarily an elementary
function, even if t(a) is. However, even in that case we can sometimes have a
parametric representation a(ψ), t(ψ) in terms of elementary functions.
3 THE FRIEDMANN-ROBERTSON-WALKER MODEL 48
If the Ωs are of order unity (and ΩΛ0 is not the only non-zero one), the value of
the integral is of order unity. So the age of the universe is of the order of the
Hubble time. In the real universe, Ωr0 ≈ 10⁻⁴, so dropping the radiation term
causes negligible error (physically, this means that the radiation-dominated era is
relatively short).
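When the integral (3.73) is not elementary, it can be evaluated numerically. The following Python sketch uses a plain midpoint rule; the function name, the default step count and the choice of integrator are assumptions made for this example. It is only meaningful for parameter values with a big bang, where the expression under the root stays positive.

```python
import math

def age_in_hubble_times(omega_m0, omega_l0, omega_r0=0.0, z=0.0, steps=100_000):
    """H0 * t(z) from (3.73), by the midpoint rule on 0 < a < 1/(1+z).

    Omega_K0 is fixed by Omega_K0 = 1 - Omega_r0 - Omega_m0 - Omega_Lambda0.
    """
    omega_k0 = 1.0 - omega_r0 - omega_m0 - omega_l0
    a_max = 1.0 / (1.0 + z)
    da = a_max / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da   # midpoints, so a = 0 itself is never evaluated
        total += da / math.sqrt(omega_r0 / a**2 + omega_m0 / a
                                + omega_k0 + omega_l0 * a**2)
    return total

t0_matter_only = age_in_hubble_times(1.0, 0.0)   # flat, matter only
t0_empty = age_in_hubble_times(0.0, 0.0)         # curvature only
```

For the flat matter-only model the integrand reduces to √a and the result is 2/3; for the empty model the integrand is 1 and the result is 1.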
Example: Age of the open universe. Let us consider an open universe (K < 0 or
Ω0 < 1), without vacuum energy (ΩΛ = 0), and approximating Ωr ≈ 0. Integrating
(3.74) (e.g., with the substitution a = [Ωm0/(1 − Ωm0)] sinh²(ψ/2)) gives the age of the open
universe as
$$\begin{aligned}
t_0 &= H_0^{-1}\int_0^1 \frac{da}{\sqrt{1 - \Omega_{m0} + \Omega_{m0}a^{-1}}} \\
&= H_0^{-1}\left[\frac{1}{1-\Omega_{m0}} - \frac{\Omega_{m0}}{2(1-\Omega_{m0})^{3/2}}\,\mathrm{arcosh}\left(\frac{2}{\Omega_{m0}} - 1\right)\right] .
\end{aligned} \tag{3.75}$$
A special case of the open universe is the completely empty universe, which is dom-
inated by the spatial curvature, with Ωm = ΩΛ = 0 and ΩK = 1. In this case we
obtain from (3.71) the result a = H0 t, and we have t0 = H0⁻¹. We thus get the
following table for the age of the universe:
Ωm0    ΩΛ0    H0 t0
0      0      1
0.1    0      0.90
0.3    0      0.81
0.5    0      0.75
1      0      2/3
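The intermediate rows of the table can be checked directly against the closed form (3.75). A short Python sketch (the function name is hypothetical):

```python
import math

def open_universe_age(omega_m0):
    """H0 * t0 from (3.75), valid for 0 < Omega_m0 < 1 and Omega_Lambda0 = 0."""
    if not 0.0 < omega_m0 < 1.0:
        raise ValueError("(3.75) applies for 0 < Omega_m0 < 1")
    k = 1.0 - omega_m0   # equals Omega_K0 when radiation and vacuum energy vanish
    return 1.0 / k - omega_m0 / (2.0 * k**1.5) * math.acosh(2.0 / omega_m0 - 1.0)

# Rounded to two decimals, as in the table.
table = {om: round(open_universe_age(om), 2) for om in (0.1, 0.3, 0.5)}
```

The end points Ωm0 = 0 and Ωm0 = 1 are limits of this expression rather than values it can be evaluated at, which is why the function rejects them.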
The cases (Ωm > 1, ΩΛ = 0) and (Ω0 = Ωm + ΩΛ = 1, ΩΛ > 0) are left as
exercises. The more general case (ΩK ≠ 1, ΩΛ ≠ 0) leads to elliptic functions. The
results for H0 t0 are plotted in figure 4.
The best model-independent estimates of the age of the universe (based on ages
of globular clusters, which are compact groups of stars in our galaxy) give a 95%
probability lower limit on the age of the universe of 11 Gyr, and a best-fit age of
about 13.4 Gyr. The Hubble time is H0⁻¹ ≈ 9.8 h⁻¹ Gyr, so we get H0 t0 ≳ 1.14h
as the lower limit, and H0 t0 ≈ 1.37h as the preferred value. So the age of the
universe implies that models with only matter and curvature need a small Hubble
parameter. For a spatially flat matter model, we would need h = 0.48, and an open
model with Ωm0 ≈ 0.3 (as indicated by observations) would need h ≈ 0.6. Recall
that measurements of H0 indicate h = 0.73 ± 0.04: the mean value gives H0 t0 ≈ 1.0.
So just from measurements of H0 and t0 we can conclude that models with spatial
curvature and matter have trouble fitting the observations. However, the strongest
evidence against a model with no vacuum energy (or other form of negative pressure
matter) comes from distance measurements.
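The arithmetic behind these values of h is short enough to spell out. In the Python sketch below the Hubble time and best-fit age are the rounded numbers quoted above; the function name is hypothetical.

```python
HUBBLE_TIME_GYR = 9.8   # H0^-1 in Gyr for h = 1, as quoted in the text

def required_h(h0_t0, age_gyr=13.4):
    """Value of h for which a model with dimensionless age H0*t0
    reproduces an age of age_gyr (in Gyr)."""
    return h0_t0 * HUBBLE_TIME_GYR / age_gyr

h_flat_matter = required_h(2.0 / 3.0)   # Einstein-de Sitter, H0 t0 = 2/3
h_open = required_h(0.81)               # open model with Omega_m0 = 0.3
```

Both values come out well below the measured h = 0.73 ± 0.04, up to the rounding of the inputs.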
[Figure 4 (plot not reproduced): contours of the age of the universe in units of H0⁻¹, with Ωm on the horizontal axis; regions are labelled “no big bang” and “recollapses eventually”. Fig. by E. Sihvola.]
where on the last line we have dropped the Ωr0 term which has negligible effect,
and used Ωm0 = Ω0 − ΩΛ0 . The proper distance depends on three independent
cosmological parameters, for which we have taken H0 , Ω0 and ΩΛ0 , and the distance
at a given redshift is proportional to the Hubble distance, H0⁻¹. If we give the
distance in units of H0⁻¹, then the distance depends only on the two remaining
parameters, Ω0 and ΩΛ0 .
If we increase Ω0 while keeping ΩΛ0 constant (meaning that we increase Ωm0 ),
the distance corresponding to a given redshift decreases. This is because the universe
has expanded faster in the past, so that there is less time between a given value of
the scale factor a = 1/(1 + z) and the present. The distance to the galaxy with
redshift z is shorter because photons have had less time to travel. Whereas if we
increase ΩΛ0 with a fixed Ω0 (meaning that we decrease Ωm0 ), we have the opposite
situation and the distance increases. In figure 5 we show the proper distance for
some parameter values.
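These trends can be checked numerically. The Python sketch below assumes that the proper distance today is the line-of-sight integral d = H0⁻¹ ∫ dz'/E(z') from 0 to z, with E = H/H0 taken from (3.71) and the radiation term dropped; the function name and the midpoint integrator are choices made for this example.

```python
import math

def proper_distance(z, omega_m0, omega_l0, steps=20_000):
    """Distance today to redshift z in units of H0^-1: the integral of
    dz'/E(z') with E = H/H0 from (3.71), radiation neglected."""
    omega_k0 = 1.0 - omega_m0 - omega_l0
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zp = (i + 0.5) * dz   # midpoint of the i-th redshift bin
        e = math.sqrt(omega_m0 * (1 + zp)**3 + omega_k0 * (1 + zp)**2 + omega_l0)
        total += dz / e
    return total

d_more_matter = proper_distance(1.0, 0.5, 0.0)   # larger Omega_0, same Omega_Lambda0
d_less_matter = proper_distance(1.0, 0.3, 0.0)
d_flat_lambda = proper_distance(1.0, 0.3, 0.7)   # Omega_0 = 1, with vacuum energy
d_flat_matter = proper_distance(1.0, 1.0, 0.0)   # Omega_0 = 1, matter only
```

Increasing Ωm0 at fixed ΩΛ0 shortens the distance, while trading matter for vacuum energy at fixed Ω0 lengthens it.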
In the case ΩΛ0 = 0, we have
$$d^c_P(z) = H_0^{-1}\,\frac{2}{\Omega_{m0}^2(1+z)}\left[\Omega_{m0}z - (2-\Omega_{m0})\left(\sqrt{1+\Omega_{m0}z} - 1\right)\right] . \tag{3.77}$$
In figure 6 the comoving horizon distance is plotted for various choices of parameters.
[Figure 5 plot: two panels, distance in units of H0⁻¹ (vertical axis) against redshift (horizontal axis, 0 to 4).]
Figure 5: The proper distance, (3.76), for a) the matter-only universe ΩΛ = 0, Ωm0 = 0,
0.2, ..., 1.8 (from top to bottom) b) the spatially flat universe Ω = 1 (ΩΛ = 1 − Ωm), Ωm0 = 0,
0.05, 0.2, 0.4, 0.6, 0.8, 1.0, 1.05 (from top to bottom). The thick line in both cases is the
Einstein-de Sitter model with Ωm = 1, ΩΛ = 0.
where we have used the definition ΩK0 = −K/H0² = 1 − Ω0 and have on the last line
again dropped Ωr0. The angular diameter distance is plotted in figure 7 for some
values of the parameters; figure 8 shows the same plot for the comoving angular
diameter distance. As always, the luminosity distance is dL = (1 + z)² dA.
In a spatially flat universe the angular diameter distance is equal to the proper
distance,
From anisotropies of the CMB we can infer that dA (1090) ≈ 13 Mpc. In the
second part of the course we will discuss in detail where this length scale comes from.
But given this number, it is simple to use it as a cosmological constraint: the
parameters of any model have to be such that this distance is reproduced.
[Figure 6 (plot not reproduced): contours of the horizon distance in units of H0⁻¹, with Ωm on the horizontal axis; regions are labelled “no big bang” and “recollapses eventually”. Fig. by E. Sihvola.]
[Figure 7 plot: two panels, angular diameter distance in units of H0⁻¹ (vertical axis) against redshift (horizontal axis, 0 to 4).]
Figure 7: The angular diameter distance, for a) the matter-only universe ΩΛ = 0, Ωm0 = 0,
0.2, ..., 1.8 (from top to bottom) b) the spatially flat universe Ω = 1 (ΩΛ = 1 − Ωm), Ωm0 = 0,
0.05, 0.2, 0.4, 0.6, 0.8, 1.0, 1.05 (from top to bottom). The thick line in both cases is the
Einstein-de Sitter model with Ωm = 1, ΩΛ = 0. Note how the angular diameter distance
decreases for large redshifts, meaning that the object that is farther away may appear larger
on the sky. In the flat case, this is an expansion effect. In the matter-only case, the effect
is enhanced by space curvature effects for the closed (Ωm > 1) models.
[Figure 8 plot: two panels, horizontal axis redshift (0 to 4).]
Figure 8: Same as figure 7, but for the comoving angular diameter distance. For the closed
models (for Ωm > 1 in the case of ΩΛ = 0) even the comoving angular diameter distance
may begin to decrease if we look at large enough redshifts. This happens when we are
looking beyond χ = π/2, where the universe “begins to close up” as we pass the equator of
the hypersphere. The figure does not go to high enough z to show this for the parameters
used. Note how for the flat universe the comoving angular diameter distance is equal to the
comoving distance (see figure 5).
Figure 9: Spacetime diagrams for a flat universe giving a) the actual distance b) the
comoving distance from origin as a function of cosmic time.
Exercise. Show that in order to fit this distance in the Einstein-de Sitter model,
the Hubble parameter has to be smaller than the observed value 73±4 km/s/Mpc.
[Figure 11 plot: two panels, luminosity distance in units of H0⁻¹ (vertical axis) against redshift (horizontal axis, 0 to 4).]
Figure 11: Same as Fig. 7, but for the luminosity distance. Note how the vertical scale now
extends to 10 Hubble distances instead of just 2, to have room for the much more rapidly
increasing luminosity distance.
[Figure 12 plot: two panels titled “Matter only” and “Flat universe”, magnitude (vertical axis, −5 to 5) against redshift (horizontal axis, 0 to 4).]
Figure 12: Same as Fig. 7, but for the magnitude-redshift relationship. The constant
M − 5 − 5 lg H0 in (3.84), which is different for different classes of standard candles, has
been arbitrarily set to 0.
compared to the theoretical one to find the values of the cosmological parameters
which give the best fit between theory and observations.
As we discussed in chapter 2, astronomers give luminosities as magnitudes. From
the definitions of the absolute and apparent magnitude,
$$M \equiv -2.5\log_{10}\frac{L}{L_0} , \qquad m \equiv -2.5\log_{10}\frac{F}{F_0} , \tag{3.82}$$
and (3.27) we get the distance modulus m − M in terms of the luminosity distance
as
$$m - M = -2.5\log_{10}\left(\frac{F}{L}\frac{L_0}{F_0}\right) = 5\log_{10} d_L + 2.5\log_{10}\left(4\pi\frac{F_0}{L_0}\right) = -5 + 5\log_{10} d_L({\rm pc}) . \tag{3.83}$$
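The last form of (3.83) is easy to evaluate and to invert. A minimal Python sketch (the function names are hypothetical):

```python
import math

def distance_modulus(d_l_pc):
    """m - M = -5 + 5 log10(d_L / pc), the last form of (3.83)."""
    return -5.0 + 5.0 * math.log10(d_l_pc)

def luminosity_distance_pc(m, M):
    """Invert (3.83): luminosity distance in parsecs from the apparent
    magnitude m and the absolute magnitude M."""
    return 10.0 ** ((m - M + 5.0) / 5.0)

mu_10pc = distance_modulus(10.0)   # an object at 10 pc has m = M
```

Every factor of 10 in luminosity distance adds 5 magnitudes to the distance modulus.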
(As explained in chapter 2, the constants L0 and F0 are chosen so as to give the
value −5 for the constant term, when dL is given in parsecs.) For a set of standard
candles, all having the same absolute magnitude M , we find that their apparent
magnitudes m should be related to their redshift z as
[Figure 13 plot: vertical axis labelled m (−0.4 to 0.8), horizontal axis redshift (0 to 2).]
Figure 13: The difference between the magnitude-redshift relationship of the different
models in Fig. 12 from the reference model Ωm = 1, ΩΛ = 0 (which appears as the horizontal
thick line). The red (solid) lines are for the matter-only (ΩΛ = 0) models and the blue
(dashed) lines are for the flat (Ω0 = 1) models.
[Figure 14 plot: two panels showing Δ(m−M) in magnitudes against redshift z (0 to 2). The legend distinguishes “Ground Discovered” and “HST Discovered” supernovae; the curves are labelled “high-z gray dust (+ΩM=1.0)”, “Evolution ~ z (+ΩM=1.0)”, “Empty (Ω=0)”, “ΩM=0.27, ΩΛ=0.73”, “‘replenishing’ gray dust” and “ΩM=1.0, ΩΛ=0.0”.]
Figure 14: The Supernova Ia luminosity-redshift data. The top panel shows all supernovae
of the data set. The bottom panel shows the averages from different redshift bins. The
curves correspond to three different FRW cosmologies, and some alternative explanations:
“dust” refers to the possibility that the universe is not transparent, but some photons get
absorbed on the way; “evolution” to the possibility that the SNIa are not standard candles,
but were different in the younger universe, so that M = M(z). This Figure is from Riess et
al., astro-ph/0402512 [4].
We have in the preceding assumed that the mysterious dark energy component of
the universe is vacuum energy, for which pde = −ρde . Instead allowing the equation
of state parameter wde ≡ pde/ρde for dark energy to be an arbitrary constant¹¹, we
see that wde is restricted to be close to −1; see Fig. 16.
It is worth emphasising that all the supernova observations (and observations of
the angular diameter distance to the CMB) show is that the distances are longer
than in the Einstein-de Sitter model. If the distance observations are interpreted
assuming that the FRW approximation is valid (i.e. that the FRW relation (3.25)
between the distance and the expansion rate holds), it follows that the expansion
rate has accelerated. Assuming that the Friedmann equations hold (i.e. that general
relativity is valid), (3.49) shows that the total pressure then has to be negative. While
the acceleration has not been established independently of the assumption that the
FRW approximation holds, observations of t0 and H0 (and other observables, such
as the growth rate of cosmic structures) are consistent with this interpretation. Note
that the only cosmological effect of vacuum energy is to increase the expansion rate
and correspondingly increase the distances. Its success in fitting various cosmological
observations in detail is thus strong evidence for faster expansion, but it may
be that the explanation for the faster expansion is not vacuum energy but something
¹¹ There is no theoretical justification for the assumption that wde is constant, if it is different
from −1. It is just taken for simplicity.
Figure 15: The densities Ωm and ΩΛ determined from Supernova Ia data. The dotted
contours are the 1998 results [2]. This figure is from [5].
Figure 16: The matter density Ωm and the dark energy equation of state w determined
from Supernova Ia data, assuming spatial flatness. This figure is from [5].
References
[1] L.M. Krauss and B. Chaboyer, Age Estimates of Globular Clusters in the Milky
Way: Constraints on Cosmology, Science 299 (2003) 65
$$\frac{L^3}{h^3} = \frac{V}{h^3} , \tag{4.2}$$
and the state density in phase space $\{(\vec{x}, \vec{p})\}$ is 1/h³. If the particle has g internal
degrees of freedom (e.g., spin),
$$\text{density of states} = \frac{g}{h^3} = \frac{g}{(2\pi)^3} \qquad \left(\hbar \equiv \frac{h}{2\pi} \equiv 1\right) . \tag{4.3}$$
This result is true even for relativistic momenta. The state density in phase space
is independent of the volume V , so we can apply it to arbitrarily large systems
(including an infinite universe).
For much of the early universe, we can ignore interaction energies between par-
ticles. Then the particle energy is
$$E(\vec{p}) = \sqrt{p^2 + m^2} , \tag{4.4}$$
4 THERMODYNAMICS IN AN EXPANDING UNIVERSE 62
then
µi + µj = µk + µl . (4.12)
Thus all chemical potentials can be expressed in terms of the chemical potentials
of conserved quantities, e.g., the baryon number chemical potential, µB . There are
thus as many independent chemical potentials as there are independent conserved
particle numbers. For example, if the chemical potential of particle species i is µi ,
then the chemical potential of the corresponding antiparticle is −µi .
As the universe expands, T and µ change in such a way that the energy continuity
equation is satisfied and conserved quantum numbers are conserved. In principle,
an expanding universe is not in equilibrium. The expansion is however so slow that
the particle soup usually has time to settle close to local equilibrium. (And since
the universe is homogeneous, the local values of thermodynamic quantities are also
global values). From the remaining numbers of fermions (electrons and nucleons) in
the present universe, we can conclude that in the early universe we had |µ| ≪ T when
T ≫ m. (We don’t know the chemical potentials of the three neutrino species, but
they are usually assumed to be small too.) If the temperature is much greater than
the mass, T ≫ m, the ultrarelativistic limit, we can approximate E = √(p² + m²) ≈ p.
For |µ| ≪ T and m ≪ T , we approximate µ = 0 and m = 0 to get the following
formulae
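$$n = \frac{g}{(2\pi)^3}\int_0^\infty \frac{4\pi p^2\,dp}{e^{p/T} \pm 1} = \begin{cases} \dfrac{3}{4\pi^2}\,\zeta(3)\,gT^3 & \text{fermions} \\[2ex] \dfrac{1}{\pi^2}\,\zeta(3)\,gT^3 & \text{bosons} \end{cases} \tag{4.13}$$
$$\rho = \frac{g}{(2\pi)^3}\int_0^\infty \frac{4\pi p^3\,dp}{e^{p/T} \pm 1} = \begin{cases} \dfrac{7}{8}\,\dfrac{\pi^2}{30}\,gT^4 & \text{fermions} \\[2ex] \dfrac{\pi^2}{30}\,gT^4 & \text{bosons} \end{cases} \tag{4.14}$$
$$p = \frac{g}{(2\pi)^3}\int_0^\infty \frac{\frac{4}{3}\pi p^3\,dp}{e^{p/T} \pm 1} = \frac{1}{3}\rho \approx \begin{cases} 1.0505\,nT & \text{fermions} \\ 0.9004\,nT & \text{bosons} \end{cases} \tag{4.15}$$
$$\langle E \rangle = \frac{\rho}{n} = \begin{cases} \dfrac{7\pi^4}{180\,\zeta(3)}\,T \approx 3.151\,T & \text{fermions} \\[2ex] \dfrac{\pi^4}{30\,\zeta(3)}\,T \approx 2.701\,T & \text{bosons} \end{cases} \tag{4.16}$$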
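The numerical coefficients in (4.15) and (4.16) follow from the closed forms in (4.13) and (4.14). The Python sketch below recomputes them; ζ(3) is summed from its defining series, and the function names are hypothetical.

```python
import math

# Riemann zeta(3) from its defining series; the tail after N terms is below 1/(2 N^2).
ZETA3 = sum(1.0 / k**3 for k in range(1, 200_001))

def mean_energy_over_T(fermion):
    """<E>/T = rho/(nT) in the ultrarelativistic limit, from (4.13) and (4.14)."""
    n_coeff = (0.75 if fermion else 1.0) * ZETA3 / math.pi**2         # n / (g T^3)
    rho_coeff = (7.0 / 8.0 if fermion else 1.0) * math.pi**2 / 30.0   # rho / (g T^4)
    return rho_coeff / n_coeff

def pressure_over_nT(fermion):
    """p/(nT), using p = rho/3 as in (4.15)."""
    return mean_energy_over_T(fermion) / 3.0
```

Fermions carry slightly more energy per particle than bosons at the same temperature because their number density is suppressed by 3/4 while their energy density is suppressed only by 7/8.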
Note that the last forms in equations (4.17) and (4.18) are exact, not just truncated
series. (The difference n − n̄ and the sum ρ + ρ̄ lead to a nice cancellation between
the two integrals. We don’t get such an elementary form for the individual n, n̄, ρ,
ρ̄, or the sum n + n̄ and the difference ρ − ρ̄ when µ ≠ 0.)
In the nonrelativistic limit, T ≪ m and T ≪ m − µ, the typical kinetic energies
are much below the mass m, so that we can approximate E = m + p2 /2m. The
second condition, T ≪ m − µ, leads to occupation numbers ≪ 1, a dilute system.
This second condition is usually satisfied in cosmology when the first one is. (It is
violated in systems of high density, like white dwarf stars and neutron stars.) We
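can then approximate
$$e^{(E-\mu)/T} \pm 1 \approx e^{(E-\mu)/T} , \tag{4.19}$$
so that the boson and fermion expressions become equal¹, and we get (exercise)
$$n = g\left(\frac{mT}{2\pi}\right)^{3/2} e^{-\frac{m-\mu}{T}} \tag{4.20}$$
$$\rho = n\left(m + \frac{3T}{2}\right) \tag{4.21}$$
$$p = nT \ll \rho \tag{4.22}$$
$$\langle E \rangle = m + \frac{3T}{2} \tag{4.23}$$
$$n - \bar{n} = 2g\left(\frac{mT}{2\pi}\right)^{3/2} e^{-\frac{m}{T}}\sinh\frac{\mu}{T} . \tag{4.24}$$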
In the general case, where neither T ≪ m, nor T ≫ m, the integrals don’t give
elementary functions, but n(T ), ρ(T ), etc. need to be calculated numerically for the
region T ∼ m.²
By comparing the ultrarelativistic (T ≫ m) and nonrelativistic (T ≪ m) limits
we see that the number density, energy density, and pressure of a particle species
fall exponentially as the temperature falls below the mass of the particle. We
have not so far made assumptions about the interactions that are responsible for
maintaining equilibrium. In the cosmological case, these include annihilation and
particle-antiparticle pair formation. At high temperatures, these reactions balance
each other, but as the temperature falls below the mass, the thermal particle energies
are not sufficient for pair production any more, so the reactions happen only in the
annihilation direction. The process of particle-antiparticle annihilation takes place
mainly (about 80%) during the temperature interval T = m → m/6, as shown in
figure 1. It is thus not an instantaneous event, but takes several Hubble times.
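To see the exponential suppression numerically, one can compare the nonrelativistic number density (4.20) with µ = 0 to the ultrarelativistic boson density (4.13) at the same temperature and the same g. This is a sketch of the limiting formulae only; it does not reproduce the exact curve for T ∼ m, and the function name is hypothetical.

```python
import math

ZETA3 = 1.2020569   # Riemann zeta(3), rounded

def suppression(x):
    """Ratio n_nonrel / n_rel for x = m/T, using (4.20) with mu = 0 and the
    boson form of (4.13). Only meaningful for x >> 1."""
    n_nonrel = (x / (2.0 * math.pi)) ** 1.5 * math.exp(-x)   # in units of g T^3
    n_rel = ZETA3 / math.pi**2                               # in units of g T^3
    return n_nonrel / n_rel

ratios = [suppression(x) for x in (6.0, 10.0, 20.0)]
```

The ratio is already at the per cent level at T = m/6 and continues to fall steeply as the temperature drops.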
¹ This approximation leads to what is called Maxwell–Boltzmann statistics; whereas the previous
exact formulae give Fermi–Dirac (for fermions) and Bose–Einstein (for bosons) statistics.
² If we use Maxwell–Boltzmann statistics, i.e., we drop the term ±1, the integrals give modified
Bessel functions, e.g., K2(m/T), and the error is often less than 10%.
Figure 1: The fall of energy density of a particle species, with mass m, as a function of
temperature (decreasing to the right).
where i runs over particle species. Since the energy density of relativistic species is
much greater than that of nonrelativistic species, it suffices to include the relativistic
species only. (This is true in the early universe, but not at later times. Eventually
the rest masses of the particles left over from annihilation begin to dominate and
we enter the matter-dominated era.) Thus we have
$$\rho(T) = \frac{\pi^2}{30}\,g_*(T)\,T^4 , \tag{4.25}$$
where
$$g_*(T) = g_b(T) + \frac{7}{8}\,g_f(T) ,$$
and g_b = Σ_i g_i over relativistic bosons and g_f = Σ_i g_i over relativistic fermions.
For pressure we have p(T) ≈ ρ(T)/3.
The above is a simplification of the true situation: Since the annihilation takes
a long time, often the annihilation of some particle species is going on, and the
contribution of this species disappears gradually. Using the exact formula for ρ we
define the effective number of degrees of freedom g∗ (T ) by
$$g_*(T) \equiv \frac{30}{\pi^2}\,\frac{\rho}{T^4} . \tag{4.26}$$
We also define
$$g_{*p}(T) \equiv \frac{90}{\pi^2}\,\frac{p}{T^4} \approx g_*(T) . \tag{4.27}$$
When there are no annihilations taking place, g∗p = g∗ = const ⇒ p = ρ/3. From
the Friedmann equation it then follows that ρ ∝ a⁻⁴, and since ρ ∝ T⁴ we have
T ∝ a⁻¹. We will soon calculate the scale factor-temperature relation more
precisely (including the effects of annihilations).
$$p(t_2) = \frac{a(t_1)}{a(t_2)}\,p(t_1) . \tag{4.28}$$
It follows that ultrarelativistic non-interacting particles stay in kinetic equilibrium.
We can see this as follows.
At time t1 a phase space element d³p1 dV1 contains
$$dN = \frac{g}{(2\pi)^3}\,f(\vec{p}_1)\,d^3p_1\,dV_1 \tag{4.29}$$
particles, where
$$f(\vec{p}_1) = \frac{1}{e^{(p_1-\mu_1)/T_1} \pm 1}$$
is the distribution function at time t1. At time t2 these same dN particles are in a
phase space element d³p2 dV2. How is the distribution function at t2, given by
$$\frac{g}{(2\pi)^3}\,f(\vec{p}_2) = \frac{dN}{d^3p_2\,dV_2} ,$$
$$\begin{aligned}
dN &= \frac{g}{(2\pi)^3}\,\frac{d^3p_1\,dV_1}{e^{(p_1-\mu_1)/T_1} \pm 1} && \text{($dN$ evaluated at $t_1$)} \\
&= \frac{g}{(2\pi)^3}\,\frac{\left(\frac{a_2}{a_1}\right)^3 d^3p_2 \left(\frac{a_1}{a_2}\right)^3 dV_2}{e^{\left(\frac{a_2}{a_1}p_2-\mu_1\right)/T_1} \pm 1} && \text{(rewritten in terms of $p_2$, $dp_2$, and $dV_2$)} \\
&= \frac{g}{(2\pi)^3}\,\frac{d^3p_2\,dV_2}{e^{(p_2-\mu_2)/T_2} \pm 1} && \text{(defining $\mu_2$ and $T_2$)} ,
\end{aligned} \tag{4.30}$$
where µ2 ≡ (a1/a2)µ1 and T2 ≡ (a1/a2)T1. Thus the distribution retains the thermal
shape; the temperature and the chemical potential just redshift ∝ a⁻¹.
Exercise. Show that for a non-relativistic particle species, the distribution function
retains the thermal shape as the universe expands, with T2 = T1 (a(t1)/a(t2))² ∝
a(t2)⁻² and µ(t2) = m + (µ(t1) − m)T2/T1.
$$s(T) \equiv \frac{2\pi^2}{45}\,g_{*s}(T)\,T^3 . \tag{4.31}$$
The equation (4.31) defines the coefficient g∗s (T ).
Figure 2: The expansion of the universe increases the volume element dV and decreases
the momentum space element d³p so that the phase space element d³p dV stays constant.
According to the second law of thermodynamics the total entropy of the universe
never decreases: it either stays constant or grows. It turns out that entropy
production in various processes in the universe is insignificant compared to the total
entropy of the universe³, which is huge, and which is dominated by the relativistic
species. Thus it is an excellent approximation to treat the expansion of the universe
as adiabatic, so the entropy stays constant, i.e.,
$$d(sa^3) = 0 . \tag{4.32}$$
Adding up all relativistic species and allowing for the possibility that some of them
may have a kinetic temperature Ti different from the temperature T of those species
that remain in thermal equilibrium, we get
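$$\begin{aligned}
g_*(T) &= \sum_{\rm bos} g_i\left(\frac{T_i}{T}\right)^4 + \frac{7}{8}\sum_{\rm fer} g_i\left(\frac{T_i}{T}\right)^4 \\
g_{*s}(T) &= \sum_{\rm bos} g_i\left(\frac{T_i}{T}\right)^3 + \frac{7}{8}\sum_{\rm fer} g_i\left(\frac{T_i}{T}\right)^3 ,
\end{aligned} \tag{4.36}$$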
and the sums are over all relativistic species of bosons and fermions.
If some species are “semirelativistic”, i.e. m = O(T ), then ρ(T ) and s(T ) have
to be calculated from the integral formulae of section 4.2. Non-relativistic species
give negligible contribution to the entropy.
As long as all species have the same temperature and p ≈ ρ/3, we have
g∗s (T ) ≈ g∗ (T ) . (4.37)
We will see that this approximation breaks down in the real universe at around 1 s.
5 Thermal history of the early universe
5.1 Timescale of the early universe
We will now apply the thermodynamics discussed in the previous section to the
evolution of the early universe. It is useful to keep in mind some simple relations
between time, distance and temperature in a radiation-dominated universe. Spatial
curvature can be neglected in the early universe, so the metric is
$$3H^2 = 8\pi G_N \rho(T) = \frac{\pi^2}{30}\,g_*(T)\,\frac{T^4}{M_{\rm Pl}^2} , \tag{5.2}$$
where we have written Newton’s constant in terms of the Planck mass, MPl ≡
1/√(8πGN) ≈ 2.436 × 10²¹ MeV. To integrate this equation exactly we would need
to calculate numerically the function g∗(T) with all the annihilations. For most of
the time, however, g∗(T) is changing slowly, so we can approximate g∗(T) = const.
Then T ∝ a⁻¹ and H ∝ a⁻², so we get the following relation between the age of the
universe t and the Hubble parameter H:
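$$t = \frac{1}{2}H^{-1} = \sqrt{\frac{45}{2\pi^2}}\,\frac{1}{\sqrt{g_*}}\,\frac{M_{\rm Pl}}{T^2} \approx \frac{1.51}{\sqrt{g_*}}\,\frac{M_{\rm Pl}}{T^2} \approx \frac{2.42}{\sqrt{g_*}}\left(\frac{T}{\rm MeV}\right)^{-2}\,{\rm s} . \tag{5.3}$$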
We thus have
$$a \propto T^{-1} \propto t^{1/2} .$$
This approximate result (5.3) will be sufficient for us as far as the time scale is
concerned¹, but for the relation between a and T, we need to use the more exact
result derived in section 4.5.
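The conversion in (5.3) from natural units to seconds can be made explicit. In the Python sketch below the value of ħ in MeV s and the example value g∗ = 10.75 at T = 1 MeV are assumptions introduced for the illustration; the Planck mass is the one quoted above, and the function name is hypothetical.

```python
import math

M_PL_MEV = 2.436e21      # reduced Planck mass in MeV, as quoted in the text
HBAR_MEV_S = 6.582e-22   # hbar in MeV s (assumed value; converts 1/MeV to seconds)

def age_seconds(T_mev, g_star):
    """t = (1/2) H^-1 from (5.3), for constant g_star and temperature in MeV."""
    t_natural = math.sqrt(45.0 / (2.0 * math.pi**2 * g_star)) * M_PL_MEV / T_mev**2
    return t_natural * HBAR_MEV_S   # result in seconds

t_1mev = age_seconds(1.0, 10.75)        # g_star = 10.75 is an assumed example value
t_100gev = age_seconds(1.0e5, 106.75)   # T = 100 GeV with g_star = 106.75
```

Since t ∝ T⁻², a factor of 10 in temperature corresponds to a factor of 100 in time, up to changes in g∗.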
The distance to the horizon (i.e. the proper comoving distance to t = 0, or
z = ∞) is
$$d_{\rm hor} = a(t)\int_0^t \frac{dt'}{a(t')} = 2t = H^{-1} . \tag{5.4}$$
0 a(t′ )
In the radiation-dominated early universe, the distance to the horizon is equal to the
Hubble length, so we can use the terms “horizon length”, “horizon” and “Hubble
length” interchangeably. (This is often also done for other eras, when the two are
not equal.)
the Standard Model, there are presumably other, so far undiscovered, species. In
particular, we will discuss dark matter particles in chapter 7. As the temperature
falls, the various particle species become nonrelativistic and annihilate at different
times.
The particles of the Standard Model are listed in table 1, and the effective number
of degrees of freedom g∗ (T ) (solid), g∗p (T ) (dashed), and g∗s (T ) (dotted) are plotted in
figure 1 as a function of temperature. In table 2 we list some important events in
the early universe.
For T > mt = 173 GeV, all known particles are relativistic. Adding up their
internal degrees of freedom we get
gf = 72 + 12 + 6 = 90
gb = 16 + 11 + 1 = 28
so that g∗ = gb + (7/8) gf = 106.75.
Figure 1: The functions g∗ (T ) (solid), g∗p (T ) (dashed), and g∗s (T ) (dotted) calculated
for Standard Model particle content. (Vertical axis: g∗, logarithmic, from 10⁰ to 10²; horizontal axis: −lg(T/MeV), from −6 to 2.)
The electroweak (EW) phase transition took place close to the temperature 100
GeV.2 As with other phase transitions, the system was not in thermal equilibrium
during this event, and this may have important cosmological consequences (in partic-
ular, it may determine the baryon-antibaryon asymmetry observed in the universe),
depending on the way the electroweak phase transition happens – this is one of the
main research topics at the Large Hadron Collider in CERN, due to start taking
data again in 2015. We will not discuss the electroweak phase transition, and for
our purposes it is enough to know that it appears that g∗ was the same before and
after the transition. Going to earlier times and higher temperatures, we expect g∗
to get larger than 106.75 as new physics (new, thus far unknown, particle species)
comes into play.
Let us now follow the history of the universe starting at the time when the
EW transition has already happened. We have T ∼ 100 GeV, t ∼ 20 ps, and t
quark annihilation is ongoing. (Recall that the transition from relativistic into non-
relativistic behaviour is not complete until about T ≈ m/6 ≈ 30 GeV.) The Higgs
boson and the gauge bosons W ± , Z 0 annihilate next. At T ∼ 10 GeV, we have
g∗ = 86.25. Next the b and c quarks annihilate, followed by the τ lepton. If the s
quark also had time to annihilate, we would reach g∗ = 51.25.
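The bookkeeping behind these values of g∗ can be sketched in a few lines of Python. The totals gb = 28 and gf = 90 are the ones given above; the per-species degrees of freedom (12 per quark flavour, 4 for the τ, 10 for the Higgs, W± and Z together) are the standard counting and are stated here as an assumption, since table 1 is not reproduced in this section. The names `g_star` and `g_star_history` are illustrative.

```python
def g_star(g_bosons, g_fermions):
    """Effective degrees of freedom, with fermions weighted by 7/8."""
    return g_bosons + 7.0 / 8.0 * g_fermions


# (species, is_fermion, internal degrees of freedom), in the order of the text.
ANNIHILATIONS = [
    ("t quark", True, 12),
    ("Higgs, W, Z", False, 10),
    ("b quark", True, 12),
    ("c quark", True, 12),
    ("tau", True, 4),
    ("s quark", True, 12),
]


def g_star_history(g_bosons=28, g_fermions=90):
    """Value of g* after each annihilation, ignoring any overlap between them."""
    history = [("all relativistic", g_star(g_bosons, g_fermions))]
    for name, is_fermion, dof in ANNIHILATIONS:
        if is_fermion:
            g_fermions -= dof
        else:
            g_bosons -= dof
        history.append((f"after {name}", g_star(g_bosons, g_fermions)))
    return history


if __name__ == "__main__":
    for label, value in g_star_history():
        print(f"{label:22s} g* = {value}")
```

The sequence starts at 106.75, passes through 86.25 once the Higgs and gauge bosons are gone, and ends at 51.25 after the hypothetical s quark annihilation, matching the values quoted in the text.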
2
Like many other terms in cosmology, this may be a bit of a misnomer, because in the Standard
Model, there is no phase transition, just a smooth crossover from one regime to another. In some
extensions of the Standard Model, there is really a phase transition.
Table 3: History of g∗ (T )
This table gives what value g∗ (T ) would have after the annihilation is over as-
suming the next annihilation would not have begun yet. In reality they overlap in
many cases. The temperature value on the left is the approximate mass of the parti-
cle in question and indicates roughly when the annihilation begins. The temperature
is much smaller when the annihilation ends. The top quark receives its mass in the
EW transition, so its annihilation only begins after the EW transition.
e+ + e− → γ+γ .
The photons are thus heated relative to neutrinos (the photon temperature does not
fall as much). In the electron-positron annihilation, g∗s changes from
$$ g_{*s} = g_* = 2 + 3.5 + 5.25 = 10.75 \tag{5.5} $$
to
$$ g_{*s} = 2 + 5.25 \left( \frac{T_\nu}{T} \right)^3 . \tag{5.6} $$
For time 1 before the annihilation and time 2 after it, we have from (4.34)
Before the electron-positron annihilation, the neutrino temperature was the same as
the temperature of the other species, so $a_1^3 T_1^3 = a_1^3 T_{\nu 1}^3 = a_2^3 T_{\nu 2}^3$, where we have used
the fact that $T_\nu \propto a^{-1}$ throughout, since neutrinos are relativistic and they are not
heated by the electron-positron annihilation. We thus have from (5.7)
$$ 10.75 = 2 \left( \frac{T}{T_\nu} \right)^3 + 5.25 , $$
from which $(T/T_\nu)^3 = 11/4$, i.e. $T_\nu = (4/11)^{1/3}\, T \approx 0.714\, T$ after the annihilation.
These relations remain true for the photon+neutrino background as long as the
neutrinos stay ultrarelativistic (mν ≪ T ). The neutrinos are no longer in chemical or
thermal equilibrium, but they are still in kinetic equilibrium, i.e. their distribution
function has the thermal shape.
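The entropy argument above can be checked numerically. The sketch below solves 10.75 = 2 (T/Tν)³ + 5.25 for Tν/T and evaluates the effective degrees of freedom after the annihilation. It assumes the usual weighting, in which neutrinos enter g∗ with (Tν/T)⁴ and g∗s with (Tν/T)³; the function names are illustrative.

```python
def neutrino_to_photon_temperature():
    """T_nu / T after e+ e- annihilation, from entropy conservation."""
    # 10.75 = 2 (T/T_nu)^3 + 5.25  =>  (T/T_nu)^3 = 11/4
    ratio_cubed = (10.75 - 5.25) / 2.0
    return ratio_cubed ** (-1.0 / 3.0)


def g_star_after_annihilation(r):
    """Energy-density weighted g*: photons plus neutrinos at T_nu = r T."""
    return 2.0 + 5.25 * r ** 4


def g_star_s_after_annihilation(r):
    """Entropy weighted g*s, eq. (5.6)."""
    return 2.0 + 5.25 * r ** 3


if __name__ == "__main__":
    r = neutrino_to_photon_temperature()
    print(f"T_nu / T = {r:.4f}")
    print(f"g*       = {g_star_after_annihilation(r):.3f}")
    print(f"g*s      = {g_star_s_after_annihilation(r):.3f}")
```

This gives Tν/T ≈ 0.714, g∗ ≈ 3.363 and g∗s ≈ 3.909.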
If the neutrinos are massless or their masses are so small that they can be ig-
nored, the above relation would apply even today, when the photon (the CMB)
temperature is T = T0 = 2.725 K = 0.2348 meV, giving the neutrino background
the temperature Tν0 = 0.714 · 2.725 K = 1.945 K = 0.1676 meV today. However,
neutrino oscillation experiments in the 1990s established that neutrinos have masses
which are at least in the meV range5 , and there is an upper limit of about 2 eV from
direct detection experiments and cosmology. Therefore, the neutrino background is
non-relativistic today. As neutrinos become non-relativistic, they fall out of kinetic
equilibrium, because the shape of the thermal distribution function is not preserved
as the momenta redshift to the value p ∼ m. However, once neutrinos become very
non-relativistic, with typical values of the momenta p ≪ m, the distribution function
again has the thermal shape, but with a different temperature scaling.
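The numbers in this paragraph follow from a unit conversion, sketched below. The Boltzmann constant is an assumed CODATA value, and the helper `mass_over_temperature` is a hypothetical convenience for comparing a neutrino mass with the background temperature; a ratio much larger than one indicates a non-relativistic species.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K (assumed value)

T0_K = 2.725                                   # CMB temperature today
T_NU0_K = (4.0 / 11.0) ** (1.0 / 3.0) * T0_K   # neutrino temperature if massless


def kelvin_to_milli_ev(temperature_k):
    """Convert a temperature in kelvin to meV."""
    return temperature_k * K_B_EV_PER_K * 1.0e3


def mass_over_temperature(mass_milli_ev, temperature_k=T_NU0_K):
    """Ratio m / T_nu today for a neutrino of the given mass in meV."""
    return mass_milli_ev / kelvin_to_milli_ev(temperature_k)


if __name__ == "__main__":
    print(f"T0    = {kelvin_to_milli_ev(T0_K):.4f} meV")
    print(f"T_nu0 = {T_NU0_K:.3f} K = {kelvin_to_milli_ev(T_NU0_K):.4f} meV")
    # Example masses: 1 meV and the 2 eV upper limit mentioned in the text.
    for mass in (1.0, 2000.0):
        print(f"m = {mass:g} meV: m / T_nu0 = {mass_over_temperature(mass):.3g}")
```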
4
To be more precise, neutrino decoupling was not complete when e+ e− -annihilation began, so
some of the energy and entropy did leak to the neutrinos. Therefore the neutrino energy density
after e+ e− -annihilation is about 1.3% higher (at a given T ) than the above calculation gives. The
neutrino distribution also deviates slightly from kinetic equilibrium.
5
Specifically, the oscillations show that the mass differences between the neutrinos are of the
order meV. In principle, the lightest neutrino could be massless.
Figure 2: The evolution of the energy density, or rather, g∗ (T ), and its different components
through electron-positron annihilation. Since g∗ (T ) is defined as $\rho/(\pi^2 T^4/30)$, where T
is the photon temperature, the photon contribution appears constant. If we had plotted
$\rho/(\pi^2 T_\nu^4/30) \propto \rho a^4$ instead, the neutrino contribution would appear constant, and the
photon contribution would increase at the cost of the electron-positron contribution, which
would better reflect what is going on.
5.5 Matter
We noted that the early universe is dominated by the relativistic particles, and we
can forget the nonrelativistic particles when we are considering the dynamics of
the universe. We followed one species after another becoming nonrelativistic and
disappearing from the picture, until only photons (the cosmic background radiation)
and neutrinos were left, and even the latter of these had stopped interacting.
We now return to look in more detail at what happens to nucleons and electrons.
We found that they annihilated with their antiparticles when the temperature fell
below their respective rest masses. For nucleons, the annihilation began immediately
after they were formed in the QCD phase transition. There were however slightly
more particles than antiparticles, and this small excess of particles was left over.
(This has to be the case because we observe electrons and nucleons today). This
means that the chemical potential µB associated with baryon number differs from
zero (it is positive). Baryon number is a conserved quantity in the eras we are
considering (though not before the electroweak phase transition). Baryon number
resides today in nucleons (protons and neutrons; since the proton is lighter than the
neutron, free neutrons have decayed into protons, but there are neutrons in atomic
nuclei) because they are the lightest baryons. The universe is electrically neutral,
and the negative charge lies in the electrons, the lightest particles with negative
charge. Therefore the number of electrons must equal the number of protons.
The number densities etc. of the electrons and the nucleons we get from the
equations of chapter 4. But what is the value of the chemical potential µ? For each
species, we get µ(T ) from the conserved quantities6 . The baryon number resides in
the nucleons,
nB = nN − nN̄ = np + nn − np̄ − nn̄ . (5.9)
Let us define the parameter η, the baryon-photon ratio today,
$$ \eta \equiv \frac{n_B(t_0)}{n_\gamma(t_0)} . \tag{5.10} $$
From observations we know that $\eta \approx 6 \times 10^{-10}$. (We take a closer look at this
number in the next chapter.) Since baryon number is conserved, $n_B V \propto n_B a^3$ stays
constant, so
$$ n_B \propto a^{-3} . \tag{5.11} $$
After electron annihilation $n_\gamma \propto a^{-3}$, so we get
$$ n_B(T) = \eta n_\gamma = \eta \frac{2\zeta(3)}{\pi^2} T^3 \quad \text{for } T \ll m_e . \tag{5.12} $$
We can put (5.11) and (5.12) together and replace $a^{-3}$ using the relation (4.34)
between the temperature and the scale factor to obtain
$$ n_B(T) = \eta \frac{2\zeta(3)}{\pi^2} \frac{g_{*s}(T)}{g_{*s}(T_0)} T^3 . \tag{5.13} $$
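To get a feeling for the numbers, (5.13) can be evaluated directly. The following sketch works in natural units and converts to cm⁻³ with ħc; the value of ħc, the present value g∗s(T0) = 43/11 ≈ 3.909 (which follows from (5.6) with (Tν/T)³ = 4/11) and the function name are assumptions made for this illustration.

```python
import math

ZETA_3 = 1.2020569031595943     # Riemann zeta(3)
HBARC_MEV_CM = 1.973269804e-11  # hbar * c in MeV cm (assumed value)
G_STAR_S_TODAY = 43.0 / 11.0    # g*s(T0) = 2 + 5.25 * (4/11), about 3.909


def baryon_number_density(temperature_mev, eta=6e-10, g_star_s=G_STAR_S_TODAY):
    """Baryon number density in cm^-3 from eq. (5.13)."""
    t_inverse_cm = temperature_mev / HBARC_MEV_CM  # temperature in cm^-1
    photon_like = 2.0 * ZETA_3 / math.pi ** 2 * t_inverse_cm ** 3
    return eta * (g_star_s / G_STAR_S_TODAY) * photon_like


if __name__ == "__main__":
    t0_mev = 2.348e-10  # 2.725 K expressed in MeV
    print(f"n_B today ~ {baryon_number_density(t0_mev):.3g} per cm^3")
```

For the present temperature this gives about 2.5 × 10⁻⁷ baryons per cm³, i.e. roughly one baryon per four cubic metres.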
6
In general, the recipe to find how the thermodynamical parameters evolve in the expanding FRW
universe is to use the conservation laws of the conserved number densities, entropy conservation
and the energy-momentum tensor continuity relation, to find how the number densities and energy
densities evolve. The other thermodynamical parameters will then evolve so as to satisfy these
requirements.
nN̄ ≪ nN and nN ≡ nn + np = nB .
In the next chapter, we will discuss big bang nucleosynthesis, i.e. how the protons
and neutrons form atomic nuclei. Approximately one quarter of all nucleons (all
neutrons and roughly the same number of protons) form nuclei (A > 1) and three
quarters remain as free protons. Let us denote by n∗p and n∗n the number densities
of protons and neutrons including those in nuclei (and also those in atoms), whereas
we shall use np and nn for the number densities of free protons and neutrons, which
are not bound with each other or electrons. We thus write
At this time (T ∼ 10 keV → 1 eV) the universe contains a relativistic photon and
neutrino background (“radiation”) and nonrelativistic free electrons, protons, and
nuclei (“matter”). Since $\rho \propto a^{-4}$ for radiation, but $\rho \propto a^{-3}$ for matter, the energy
density in radiation eventually falls below the energy density in matter, and the universe
becomes matter-dominated.
The above discussion is in terms of the known particle species. Today there is
much indirect observational evidence for the existence of what is called cold dark
matter (CDM), which is supposedly made out of some yet undiscovered species of
particles. The CDM particles interact weakly with normal matter (they decouple
early), and their energy density contribution should be small when we are well in
the radiation-dominated era, so they do not affect the above discussion much. They
become nonrelativistic early and dominate the matter density of the universe today
(there is about five to six times as much mass in CDM as there is in baryons). Thus
the CDM causes the universe to become matter-dominated earlier than if the matter
consisted of nucleons and electrons only. The CDM will be important later when
we discuss the formation of structure in the universe. The time of matter-radiation
equality teq is calculated in an exercise at the end of this chapter.
5.6 Recombination
Radiation (photons) and matter (electrons, protons, and nuclei) remained in thermal
equilibrium for as long as there were lots of free electrons. When the temperature
became low enough the electrons and nuclei combined to form neutral atoms, an
event known as recombination7 , and the density of free electrons fell sharply. The
photon mean free path grew rapidly and became longer than the horizon distance.
Thus the universe became transparent. Photons and matter decoupled, i.e. their
interaction was no longer able to maintain them in thermal equilibrium with each
other. After this, by T we refer to the photon temperature. Today, these photons are
the CMB, and T = T0 = 2.725 K. (After photon decoupling, the matter temperature
7
This is the first time when nuclei and electrons combine, so the term recombination, adopted
from chemistry, is somewhat of a misnomer.
fell at first faster than the photon temperature, but structure formation then heated
up the matter to different temperatures at different places.)
The relevant interaction here is not weak interaction, as in the case of the neu-
trinos, but instead the electromagnetic interaction between photons and electrons.
The interaction rate is Γ ∼ ne σT , where $\sigma_T = \frac{8\pi}{3} \alpha^2/m_e^2 \approx 2 \times 10^{-3}\ {\rm MeV}^{-2}$ is
the Thomson cross section.
$$ n_i = g_i \left( \frac{m_i T}{2\pi} \right)^{3/2} e^{(\mu_i - m_i)/T} . \tag{5.14} $$
For as long as the reaction
p + e− → H + γ (5.15)
is in chemical equilibrium the chemical potentials are related by µp + µe = µH (since
µγ = 0). Using this we get the relation
$$ n_H = \frac{g_H}{g_p g_e} n_p n_e \left( \frac{m_e T}{2\pi} \right)^{-3/2} e^{B/T} , \tag{5.16} $$
between the number densities. Here B = mp + me − mH = 13.6 eV is the binding
energy of hydrogen. The numbers of internal degrees of freedom are gp = ge = 2,
gH = 4. Outside the exponent we approximated mH ≈ mp . Defining the fractional
ionisation
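$$ x \equiv \frac{n_p}{n_B} , \tag{5.17} $$
equation (5.16) becomes
$$ \frac{1-x}{x^2} = \frac{4\sqrt{2}\,\zeta(3)}{\sqrt{\pi}}\, \eta \left( \frac{T}{m_e} \right)^{3/2} e^{B/T} , \tag{5.18} $$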
Figure 3: Recombination. In the top panel the dashed curve gives the equilibrium ionisation
fraction as given by the Saha equation. The solid curve is the true ionisation fraction,
calculated using the actual reaction rates (original calculation by Peebles). You can see that
the equilibrium fraction is followed at first, but then the true fraction lags behind. The
bottom panel shows the free electron number density ne and the photon mean free path
λγ . The latter is given in comoving units, i.e., the distance is scaled to the corresponding
present distance. This figure is for $\eta = 8.22 \times 10^{-10}$. (Figure by R. Keskitalo.)
(Axes: ionisation fraction x in the top panel; ne/m⁻³ and λγ/Mpc in the bottom panel; horizontal axis 1 + z, from 1900 to 900.)
Figure 4: Same as figure 3, but with a logarithmic scale for the ionisation fraction, and the
redshift scale extended to present time (z = 0 or 1 + z = 1). You can see that a residual
ionisation $x \sim 10^{-4}$ remains. This figure does not include reionisation, which happened
around z ∼ 10. (Figure by R. Keskitalo.)
(Curves: Saha and Peebles ionisation fraction x in the top panel; ne/m⁻³ and λγ/Mpc in the bottom panel; horizontal axis 1 + z, from 1800 to 0.)
2. It gives the initial conditions for the more complicated calculation that will
give the true evolution.
A similar situation holds for many other events in the early universe, e.g. big bang
nucleosynthesis.
Recombination is not instantaneous. Let us define the recombination temperature
$T_{\rm rec}$ as the temperature where x = 0.5. Now $T_{\rm rec} = T_0 (1 + z_{\rm rec})$ since $1 + z = a^{-1}$
and the photon temperature falls as $T \propto a^{-1}$. (Since η ≪ 1, the energy release in
recombination is negligible compared to ργ ; and after photon decoupling photons
travel freely maintaining kinetic equilibrium with $T \propto a^{-1}$.)
We get (for $\eta \sim 10^{-9}$)
$$ T_{\rm rec} \sim 0.3\ {\rm eV} , \qquad z_{\rm rec} \sim 1300 . $$
You might have expected that Trec ∼ B. Instead we found Trec ≪ B. The main
reason for this is that η ≪ 1. This means that there are very many photons for each
hydrogen atom. Even when T ≪ B, the high-energy tail of the photon distribution
contains photons with energy E > B so that they can ionise a hydrogen atom.
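The estimate of Trec can be reproduced by solving the Saha equation (5.18) numerically. The sketch below is only the equilibrium calculation (the true ionisation fraction lags behind it, as figure 3 shows). The electron mass value, the bisection bracket and the function names are assumptions introduced for this illustration.

```python
import math

ZETA_3 = 1.2020569031595943  # Riemann zeta(3)
M_E_EV = 0.511e6             # electron mass in eV (assumed value)
B_EV = 13.6                  # hydrogen binding energy in eV, from the text
T0_EV = 2.348e-4             # present photon temperature, 2.725 K, in eV


def ionisation_fraction(temperature_ev, eta):
    """Equilibrium x from the Saha equation (5.18), (1 - x)/x^2 = S."""
    prefactor = 4.0 * math.sqrt(2.0) * ZETA_3 / math.sqrt(math.pi)
    ln_s = (math.log(prefactor * eta)
            + 1.5 * math.log(temperature_ev / M_E_EV)
            + B_EV / temperature_ev)
    if ln_s > 600.0:
        # S is huge, so x ~ 1/sqrt(S); this avoids overflow in exp().
        return math.exp(-0.5 * ln_s)
    s = math.exp(ln_s)
    # Positive root of S x^2 + x - 1 = 0, written in a cancellation-free form.
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * s))


def recombination_temperature(eta, lo=0.1, hi=1.0):
    """Temperature in eV where x = 0.5, by bisection (x grows with T here)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ionisation_fraction(mid, eta) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t_rec = recombination_temperature(1e-9)
    print(f"T_rec ~ {t_rec:.3f} eV, 1 + z_rec ~ {t_rec / T0_EV:.0f}")
```

For η = 10⁻⁹ the equilibrium value comes out slightly above 0.3 eV, far below B = 13.6 eV, which illustrates how strongly the small value of η delays recombination.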
The photon decoupling takes place somewhat later, at Tdec ≡ (1 + zdec )T0 , when
the ionisation fraction has fallen enough. We define the photon decoupling time to
be the time when the photon mean free path exceeds the Hubble distance. The
numbers are roughly
The decoupling means that the recombination reaction can no longer keep the
ionisation fraction on the equilibrium track, but instead we are left with a residual
ionisation of $x \sim 10^{-4}$.
A long time later (at z ∼ 10) the first stars form, and their radiation reionises
the gas that is left in interstellar space. The gas now has such a low density, however,
that the universe remains transparent.
Exercise: Transparency of the universe. We say the universe is transparent
when the photon mean free path λγ is larger than the Hubble length lH = H −1 ,
and opaque when λγ < lH . The photon mean free path is determined mainly by
the scattering of photons by free electrons, so that λγ = 1/(σT ne ), where ne = xn∗e
is the number density of free electrons, n∗e is the total number density of electrons,
and x is the ionisation fraction. The cross section for photon-electron scattering
is independent of energy for Eγ ≪ me and is then called the Thomson cross section,
$\sigma_T = \frac{8\pi}{3} (\alpha/m_e)^2$, where α is the fine-structure constant. In recombination x
falls from 1 to $10^{-4}$. Show that the universe is opaque before recombination and
transparent after recombination. (Assume that recombination takes place
instantly at z = 1300. You can assume a matter-dominated universe; see below
for parameter values.) The interstellar matter is later reionised (to x ∼ 1) by the
Figure 5: The CMB frequency spectrum as measured by the FIRAS instrument on the
COBE satellite [2]. This first spectrum from FIRAS is based on just 9 minutes of measure-
ments. The CMB temperature estimated from it was T = 2.735 ± 0.060 K. The final result
from FIRAS is T = 2.725±0.002 K (95% confidence) [3]. Using data from other experiments
as well, the best current value is T0 = 2.72548 ± 0.00057 K (68% confidence) [4].
light from the first stars. What is the earliest redshift when this can happen without
making the universe opaque again? (You can assume that most (∼ all) matter has
remained interstellar.) Calculate for Ωm0 = 1.0 and Ωm0 = 0.3 (note that Ωm also
includes nonbaryonic matter). Use ΩΛ = 0, h = 0.7 and $\eta = 6 \times 10^{-10}$.
The photons in the cosmic background radiation have thus travelled almost with-
out scattering through space all the way since we had T = Tdec ∼ 1100 T0 .8 When we
look at this cosmic background radiation we thus see the universe (its faraway parts
near our horizon) as it was at that early time. Because of the redshift, these pho-
tons which were then largely in the visible part of the spectrum, have now become
microwave photons, so this radiation is now called the cosmic microwave background
(CMB). It still maintains the thermal equilibrium distribution. This was confirmed
to high accuracy by the FIRAS (Far InfraRed Absolute Spectrophotometer) instru-
ment on the COBE (Cosmic Background Explorer) satellite in 1989. John Mather
received the 2006 Physics Nobel Prize for this measurement of the CMB frequency
(photon energy) spectrum (see figure 5).9
We shall now, for a while, stop the detailed discussion of these events, recom-
bination and photon decoupling. The universe is about 400 000 years old now.
Next, gravitationally bound structures start to form as gravity attracts matter into
overdense regions. Before photon decoupling the radiation pressure from photons
8
The probability for a photon to have one or more scatterings between decoupling and today is
about 10%.
9
He shared the Nobel Prize with George Smoot, who got it for the discovery of the CMB
anisotropy with the DMR instrument on the same satellite. The CMB anisotropy will be discussed
in the second part of the course.
prevented this. But before going to the physics of structure formation, we discuss
some earlier events (big bang nucleosynthesis, dark matter decoupling and inflation)
in more detail.
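$$ n_{\gamma 0} = \frac{2\zeta(3)}{\pi^2} T_0^3 = 410.5\ \text{photons/cm}^3 \tag{5.19} $$
and the energy density is
$$ \rho_{\gamma 0} = \frac{\pi^2}{15} T_0^4 = 2.701\, T_0 n_{\gamma 0} = 4.641 \times 10^{-31}\ {\rm kg/m^3} . \tag{5.20} $$
Since the critical density today is
$$ \rho_{c0} = \frac{3H_0^2}{8\pi G} = h^2 \cdot 1.8788 \times 10^{-26}\ {\rm kg/m^3} \tag{5.21} $$
we get for the photon density parameter
$$ \Omega_{\gamma 0} \equiv \frac{\rho_{\gamma 0}}{\rho_{c0}} = 2.47 \times 10^{-5} h^{-2} . \tag{5.22} $$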
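The numbers in (5.19)–(5.22) can be reproduced in SI units. The sketch below restores the factors of ħ, c and kB that are set to one in the text; the values of the physical constants (in particular G and the megaparsec) are assumptions, so the last digits may differ slightly from those quoted above.

```python
import math

ZETA_3 = 1.2020569031595943       # Riemann zeta(3)
K_B = 1.380649e-23                # Boltzmann constant, J/K
HBAR = 1.054571817e-34            # reduced Planck constant, J s
C = 2.99792458e8                  # speed of light, m/s
G_NEWTON = 6.67430e-11            # Newton's constant, m^3 kg^-1 s^-2 (assumed)
MPC_M = 3.0856775814913673e22     # one megaparsec in metres


def photon_number_density(temperature_k):
    """Photon number density in m^-3, eq. (5.19) with units restored."""
    inverse_length = K_B * temperature_k / (HBAR * C)  # m^-1
    return 2.0 * ZETA_3 / math.pi ** 2 * inverse_length ** 3


def photon_mass_density(temperature_k):
    """Photon energy density divided by c^2, in kg/m^3, eq. (5.20)."""
    energy_density = (math.pi ** 2 / 15.0
                      * (K_B * temperature_k) ** 4 / (HBAR * C) ** 3)
    return energy_density / C ** 2


def critical_density(h):
    """Critical density in kg/m^3 for H0 = 100 h km/s/Mpc, eq. (5.21)."""
    hubble = 100.0e3 * h / MPC_M  # s^-1
    return 3.0 * hubble ** 2 / (8.0 * math.pi * G_NEWTON)


if __name__ == "__main__":
    t0 = 2.725
    print(f"n_gamma0     = {photon_number_density(t0) / 1e6:.1f} per cm^3")
    print(f"rho_gamma0   = {photon_mass_density(t0):.4g} kg/m^3")
    print(f"Omega_g h^2  = {photon_mass_density(t0) / critical_density(1.0):.3g}")
```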
While relativistic, neutrinos contribute another radiation component
$$ \rho_\nu = \frac{7 N_\nu}{8} \frac{\pi^2}{15} T_\nu^4 . \tag{5.23} $$
10
Although galaxies seen from far away are rather faint objects, difficult to see with the unaided
eye. If you were suddenly transported to a random location in the present universe, you might not
be able to see anything. To enjoy the spectacle, our hypothetical observer should be located within
a forming galaxy, or have a good telescope.
Nν = 3.046 . (5.25)
(This does not mean that there are 3.046 neutrino species, but that the total energy
density in neutrinos is 3.046 times as much as the energy density one neutrino species
would contribute had it decoupled completely before e+ e− annihilation.)
If neutrinos were still relativistic today, the neutrino density parameter would
be
$$ \Omega_{\nu 0} = \frac{7 N_\nu}{22} \left( \frac{4}{11} \right)^{1/3} \Omega_{\gamma 0} = 1.71 \times 10^{-5} h^{-2} , \tag{5.26} $$
so the total radiation density parameter would be
We thus confirm the claim in chapter 3 that the radiation component can be ignored
in the Friedmann equation, except in the early universe. The combination Ωi h2 is
often denoted by ωi , so we have
As noted earlier, neutrinos have masses in the meV to eV range. Thus neutrinos
are nonrelativistic today and count as matter, not radiation, so the above result for
the neutrino energy density does not apply. However, unless the neutrino masses
are above 0.2 eV, they would still have been relativistic, and counted as radiation,
at the time of recombination and matter-radiation equality. While the neutrinos are
relativistic, we get neutrino energy density
using Ων0 from (5.26), even though Ων0 does not give the present density of neutrinos.
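The neutrino contribution in (5.26) and the resulting total radiation density follow from simple arithmetic, sketched below. It assumes that the total radiation density parameter is just the sum of the photon and (massless) neutrino parts; the function names are illustrative.

```python
N_NU = 3.046               # effective number of neutrino species, eq. (5.25)
OMEGA_GAMMA_H2 = 2.47e-5   # photon density parameter times h^2, eq. (5.22)


def omega_nu_h2(n_nu=N_NU, omega_gamma_h2=OMEGA_GAMMA_H2):
    """Neutrino density parameter times h^2 if neutrinos were massless, eq. (5.26)."""
    return 7.0 * n_nu / 22.0 * (4.0 / 11.0) ** (1.0 / 3.0) * omega_gamma_h2


def omega_radiation_h2(n_nu=N_NU, omega_gamma_h2=OMEGA_GAMMA_H2):
    """Photons plus massless neutrinos (assumed to be the only radiation)."""
    return omega_gamma_h2 + omega_nu_h2(n_nu, omega_gamma_h2)


if __name__ == "__main__":
    print(f"Omega_nu h^2 = {omega_nu_h2():.3g}")
    print(f"Omega_r  h^2 = {omega_radiation_h2():.3g}")
```

With the values above the neutrino part is about 1.71 × 10⁻⁵ and the total about 4.18 × 10⁻⁵, which is indeed negligible compared with the matter density today.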
Today, even though the photon and neutrino backgrounds do not dominate the
energy density of the universe any more, they do dominate the entropy density.
Exercise: Matter–radiation equality. The present density of matter is
ρm0 = Ωm0 ρc and the present density of radiation is ρr0 = ργ0 + ρν0 (we assume
neutrinos are massless). What was the age of the universe teq when ρm = ρr ? (Note
that in these early times—but not today—you can ignore the curvature and vacuum
terms in the Friedmann equation.) Give numerical values (in years) for the cases Ωm0
= 0.1, 0.3, and 1.0, with H0 = 70 km/s/Mpc. What was the temperature at that
time, Teq ?
References
[1] K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38 (2014) 090001,
http://pdg.lbl.gov/2014/listings/contents listings.html
ordinary matter in the present universe comes from this small excess of nucleons.
Let us now consider what happened to it in the early universe. We will focus on the
period when the temperature fell from T ∼ 10 MeV to T ∼ 10 keV (from t ∼ 0.01 s
to a few hours).
6.1 Equilibrium
The total number of nucleons minus antinucleons stays constant due to baryon
number conservation. In the temperature range under consideration, the number
density of antinucleons is negligible. This baryon number can be in the form of
protons and neutrons or atomic nuclei. Weak nuclear reactions convert neutrons
and protons into each other and strong nuclear reactions build nuclei from them.
During the period of interest the nucleons and nuclei are nonrelativistic (T ≪
mp ). Assuming thermal equilibrium we have
$$ n_i = g_i \left( \frac{m_i T}{2\pi} \right)^{3/2} e^{(\mu_i - m_i)/T} \tag{6.1} $$
for the number density of nucleus type i. If the nuclear reactions needed to build
nucleus i (with mass number A and charge Z) from the nucleons,
(A − Z)n + Zp ↔ i
6 BIG BANG NUCLEOSYNTHESIS 86
Nucleus   B          g
²H        2.22 MeV   3
³H        8.48 MeV   2
³He       7.72 MeV   2
⁴He       28.3 MeV   1
¹²C       92.2 MeV   1
where
Bi ≡ Zmp + (A − Z)mn − mi (6.5)
is the binding energy of the nucleus. Here we have approximated mp ≈ mn ≈ mi /A
outside the exponent, and denoted it by mN (“nucleon mass”).
The different number densities add up to the total baryon number density
$$ \sum_i A_i n_i = n_B . \tag{6.6} $$
$$ \frac{n_B}{n_\gamma} = \frac{g_{*s}(T)}{g_{*s}(T_0)}\, \eta \tag{6.8} $$
as
$$ n_B = \frac{g_{*s}(T)}{g_{*s}(T_0)}\, \eta\, \frac{2}{\pi^2} \zeta(3) T^3 . \tag{6.9} $$
After electron-positron annihilation g∗s (T ) = g∗s (T0 ) and nB = ηnγ . Here η is the
present baryon/photon ratio. It can be estimated from various observations, and it
is about $6 \times 10^{-10}$.
For temperatures $m_N \gg T \gtrsim B_i$ we have
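$$ \begin{aligned} n + \nu_e &\leftrightarrow p + e^- \\ n + e^+ &\leftrightarrow p + \bar\nu_e \\ n &\leftrightarrow p + e^- + \bar\nu_e . \end{aligned} \tag{6.10} $$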
6.3 Bottlenecks
We define the mass fraction of nucleus i as
$$ X_i \equiv \frac{A_i n_i}{n_B} . \tag{6.15} $$
Since
$$ n_B = \sum_i A_i n_i , \tag{6.16} $$
(where the sum includes protons and neutrons), we have $\sum_i X_i = 1$.
1
Using also the normalization condition (6.16), or $\sum_i X_i = 1$, we get all equilibrium
abundances as a function of T (they also depend on the value of the parameter
η). There are two items to note:
1. The normalization condition (6.16) includes all nuclei up to uranium and be-
yond. Thus we would get a huge polynomial equation from which to solve
Xp .
2. In practice we don’t have to care about the first item, since as the tempera-
ture falls the nuclei no longer follow their equilibrium abundances. The reac-
tions are in equilibrium only at high temperatures, when the other equilibrium
abundances except Xp and Xn are small, and we can use the approximation
Xn + Xp = 1.
In the early universe the baryon density is too low and the time available is
too short for reactions involving three or more incoming nuclei to occur at any
appreciable rate. The heavier nuclei have to be built sequentially from lighter nuclei
in two-particle reactions, so deuterium is formed first in the reaction
n+p → d + γ.
Only when deuterons are available can helium nuclei be formed, and so on. This
process has “bottlenecks”: the lack of sufficient densities of lighter nuclei hinders
the production of heavier nuclei, and prevents them from following their equilibrium
abundances.
As the temperature falls, the equilibrium abundances rise fast. They become
large later for nuclei with small binding energies. Since deuterium is formed directly
from neutrons and protons it can follow its equilibrium abundance as long as there
are large numbers of free neutrons available. Since the deuterium binding energy is
rather small, the deuterium abundance becomes large rather late (at T < 100 keV).
Therefore heavier nuclei with larger binding energies, whose equilibrium abundances
would become large earlier, cannot be formed. This is the deuterium bottleneck.
Only when there is lots of deuterium ($X_d \sim 10^{-3}$) can helium be produced in large
numbers.
The nuclei are positively charged so they repel each other electromagnetically.
The nuclei need large kinetic energies to overcome this Coulomb barrier and get
within the range of the strong interaction. Thus the cross sections for these fusion
reactions fall rapidly with decreasing energy and the nuclear reactions are “shut off” when the
temperature falls below T ∼ 30 keV. Thus there is less than one hour available for
nucleosynthesis. Because of the short time available and additional bottlenecks (e.g.
there are no stable nuclei with A = 8), only very small amounts of elements heavier
than helium are formed.
1
For np and nn we know just their ratio, since we do not know µp and µn , only that µp = µn .
Therefore this extra equation is needed to solve all ni .
$$ X_n = \frac{e^{-Q/T}}{1 + e^{-Q/T}} \quad \text{and} \quad X_p = \frac{1}{1 + e^{-Q/T}} . \tag{6.17} $$
Nucleons follow these equilibrium abundances until neutrinos decouple at T ∼
0.8 MeV, shutting off the weak n ↔ p reactions. After this, free neutrons decay, so
where τn = 880.0 ± 0.9 s is the mean lifetime of a free neutron2 . (The half-life is
τ1/2 = (ln 2)τn .) In reality, the decoupling and thus the shift from behavior (6.17)
to behavior (6.18) is not instantaneous, but an approximation where one takes it to
be instantaneous at time t1 when T = 0.8 MeV, so that Xn (t1 ) = 0.1657, gives a
fairly accurate final result.
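The value Xn(t1) = 0.1657 can be checked directly from (6.17). The sketch below assumes that Q is the neutron-proton mass difference, Q = mn − mp ≈ 1.293 MeV (the definition of Q is not repeated in this section); the function names are illustrative.

```python
import math

Q_MEV = 1.293  # neutron-proton mass difference in MeV (assumed value)


def neutron_fraction(temperature_mev):
    """Equilibrium neutron mass fraction X_n from eq. (6.17)."""
    boltzmann = math.exp(-Q_MEV / temperature_mev)
    return boltzmann / (1.0 + boltzmann)


def proton_fraction(temperature_mev):
    """Equilibrium proton mass fraction X_p from eq. (6.17)."""
    return 1.0 / (1.0 + math.exp(-Q_MEV / temperature_mev))


if __name__ == "__main__":
    for temperature in (10.0, 3.0, 1.0, 0.8):
        print(f"T = {temperature:4.1f} MeV: X_n = {neutron_fraction(temperature):.4f}")
```

At high temperature the fractions approach 1/2 each, and at T = 0.8 MeV the neutron fraction has dropped to about 0.166.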
The equilibrium mass fractions are, from (6.4),
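$$ X_i = \frac{1}{2} X_p^Z X_n^{A-Z} g_i A^{5/2} \epsilon^{A-1} e^{B_i/T} \tag{6.19} $$
where
$$ \epsilon \equiv \frac{1}{2} \left( \frac{2\pi}{m_N T} \right)^{3/2} n_B = \frac{1}{\pi^2} \zeta(3) \left( \frac{2\pi T}{m_N} \right)^{3/2} \frac{g_{*s}(T)}{g_{*s}(T_0)}\, \eta \sim \left( \frac{T}{m_N} \right)^{3/2} \eta . $$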
The factors which change rapidly with T are $\epsilon^{A-1} e^{B_i/T}$. For temperatures $m_N \gg T \gg B_i$
we have $e^{B_i/T} \sim 1$ and $\epsilon \ll 1$. Thus $X_i \ll 1$ for others (A > 1) than
protons and neutrons. As temperature falls, $\epsilon$ becomes even smaller and at $T \sim B_i$
we have $X_i \ll 1$ still. The temperature has to fall below $B_i$ by a large factor before
the factor $e^{B_i/T}$ wins and the equilibrium abundance becomes large.
Deuterium has $B_d = 2.22$ MeV, so we get $\epsilon e^{B_d/T} = 1$ at $T_d$ = 0.06 MeV–0.07
MeV (assuming $\eta = 10^{-10}$–$10^{-9}$), so the deuterium abundance becomes large
close to this temperature. Since ⁴He has a much higher binding energy, $B_4 = 28.3$
MeV, the corresponding situation $\epsilon^3 e^{B_4/T} = 1$ occurs at a higher temperature $T_4 \sim$
0.3 MeV. But we noted earlier that only deuterium stays close to its equilibrium
abundance once it gets large. Helium begins to form only when there is sufficient
deuterium available, in practice slightly above $T_d$. Helium then forms rapidly. The
available number of neutrons sets an upper limit to ⁴He production. Since helium
has the highest binding energy per nucleon (of all isotopes below A = 12), almost
all neutrons end up in ⁴He, and only small amounts of the other light isotopes, ²H,
³H, ³He, ⁷Li, and ⁷Be, are produced.
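The temperatures Td and T4 quoted above can be estimated by solving ε^(A−1) e^(B/T) = 1 numerically. The sketch below uses the full prefactor of ε but sets g∗s(T)/g∗s(T0) = 1 by default, which is a simplification (the ratio is somewhat larger while electron-positron annihilation is still under way); the nucleon mass value and the function names are likewise assumptions.

```python
import math

ZETA_3 = 1.2020569031595943  # Riemann zeta(3)
M_N_MEV = 938.9              # nucleon mass in MeV (assumed value)


def ln_epsilon(temperature_mev, eta, g_ratio=1.0):
    """Logarithm of epsilon as defined below eq. (6.19)."""
    return (math.log(ZETA_3 / math.pi ** 2 * g_ratio * eta)
            + 1.5 * math.log(2.0 * math.pi * temperature_mev / M_N_MEV))


def threshold_temperature(mass_number, binding_mev, eta, lo=0.01, hi=1.0):
    """Solve eps^(A-1) exp(B/T) = 1 for T in MeV by bisection."""
    def log_factor(temperature_mev):
        return ((mass_number - 1) * ln_epsilon(temperature_mev, eta)
                + binding_mev / temperature_mev)

    # log_factor is positive at low T and negative at high T in this bracket.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if log_factor(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for eta in (1e-10, 1e-9):
        t_d = threshold_temperature(2, 2.22, eta)
        t_4 = threshold_temperature(4, 28.3, eta)
        print(f"eta = {eta:g}: T_d ~ {t_d:.3f} MeV, T_4 ~ {t_4:.2f} MeV")
```

Both temperatures depend only logarithmically on η, which is why the quoted ranges are so narrow.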
The Coulomb barrier shuts off the nuclear reactions before there is time for heavier
nuclei (A > 8) to form. We get a fairly good approximation for ⁴He production
2
The error bar may not be an accurate reflection of the uncertainty in the neutron lifetime, as
there are large differences between measurements, and the preferred value has changed annually at
the percent level, e.g. the shift from 2010 to 2012 was 5.6 seconds. This is the current best estimate,
from 2014 [1].
where g∗ = 3.363. Since most of the time in T = 0.8 MeV–0.07 MeV is spent at the
lower part of this temperature range, this formula gives a good approximation for
the time,
tns − t1 = 267 s (in reality 264.3 s).
Thus we get for the final ⁴He abundance
Accurate numerical calculations, using the rates of the relevant weak and
strong reactions, give X4 = 21–26 % (for $\eta = 10^{-10}$–$10^{-9}$).
This calculation of the helium abundance X4 involves a bit of cheating in the
sense that we have used results of accurate numerical calculations to infer that we
need to use T = 0.8 MeV as the neutrino decoupling temperature, and Tns = 1.1Td
as the “instantaneous nucleosynthesis” temperature, to best approximate the correct
behavior. However, it gives us a quantitative description of what is going on, and
an understanding of how the helium yield depends on various things.
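The estimate described above can be put together in a few lines. The sketch assumes that free neutrons decay exponentially with mean lifetime τn between t1 and tns, and that all surviving neutrons end up in ⁴He, so that X4 = 2 Xn(tns); the input values Xn(t1) = 0.1657 and tns − t1 = 267 s are the ones quoted in the text, and the function name is illustrative.

```python
import math

TAU_N_S = 880.0  # mean lifetime of a free neutron in seconds, from the text


def helium_mass_fraction(xn_at_decoupling=0.1657, decay_time_s=267.0):
    """Estimate X_4 = 2 X_n(t_ns), with free neutron decay after decoupling."""
    xn_at_nucleosynthesis = xn_at_decoupling * math.exp(-decay_time_s / TAU_N_S)
    # Each helium-4 nucleus binds two neutrons and two protons.
    return 2.0 * xn_at_nucleosynthesis


if __name__ == "__main__":
    print(f"X_4 ~ {100.0 * helium_mass_fraction():.1f} %")
```

The result, a little under 25 %, lies inside the 21–26 % range of the accurate numerical calculations.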
Exercise: Using the preceding calculation, find the dependence of X4 on η, i.e.,
calculate dX4 /dη.
sections are energy-dependent. Integrating them over the energy and velocity dis-
tributions and multiplying with the relevant number densities leads to temperature-
dependent reaction rates. The most important reactions are the weak n ↔ p reac-
tions (6.10) and the following strong reactions3 (see also Fig. 1):
p + n → ²H + γ
²H + p → ³He + γ
²H + ²H → ³H + p
²H + ²H → ³He + n
n + ³He → ³H + p
p + ³H → ⁴He + γ
²H + ³H → ⁴He + n
²H + ³He → ⁴He + p
⁴He + ³He → ⁷Be + γ
⁴He + ³H → ⁷Li + γ
⁷Be + n → ⁷Li + p
⁷Li + p → ⁴He + ⁴He
In principle, all of these nuclear cross sections are determined by just a few
parameters in QCD. However, calculating these cross sections from first principles
is too difficult in practice. Instead cross sections measured in the laboratory are
used. Cross sections of the weak reactions (6.10) are known theoretically (there is
one parameter describing the strength of the weak interaction, which is determined
3
The reaction chain that produces helium from hydrogen in BBN is not the same that occurs
in stars. The conditions in stars are different: on the one hand, there are no free neutrons and
the temperatures are lower, but on the other hand the densities are higher and there is more time
available. In addition, second generation stars contain heavier nuclei (C,N,O) that act as catalysts
in helium production. Some of the most important reaction chains in stars are [Karttunen et al:
Fundamental Astronomy, p. 251] :
1. The proton-proton chain
2
p+p → H + e+ + νe
2 3
H+p → He + γ
3
He + 3 He → 4
He + p + p,
The cross section of the direct reaction d+d → 4 He + γ is small (i.e. the 3 H + p and 3 He + n
channels dominate d+d →), and it is not important in either context.
The triple-α reaction 4 He + 4 He + 4 He → 12 C, responsible for carbon production in stars, is also
not important during the big bang, since the density is not sufficiently high for three-particle reactions
to occur (the three 4 He nuclei would need to come within the range of the strong interaction within
the lifetime 2.6×10−16 s of the intermediate state 8 Be). Exercise: calculate the number and mass
density of nucleons at T = 1 MeV.
experimentally. The relevant reaction rates are now known sufficiently accurately,
so that the nuclear abundances produced in BBN (for a given value of η) can be
calculated with better accuracy than the present abundances can be measured from
astronomical observations.
The reaction chain proceeds along stable and long-lived (compared to the nu-
cleosynthesis timescale—minutes) isotopes towards larger mass numbers. At least
one of the two incoming nuclei must be an isotope which is abundant during nu-
cleosynthesis, i.e. n, p, 2 H or 4 He. The mass numbers A = 5 and A = 8 form
bottlenecks, since they have no stable or long-lived isotopes. The A = 5 bottleneck
is crossed with the reactions 4 He+3 He and 4 He+3 H, which form a small number of
7 Be and 7 Li. Their abundances remain so small that we can ignore the reactions
abundances) amounts of the four isotopes, 2 H, 3 He, 4 He and 7 Li (the fifth isotope
1 H=p we had already before BBN). Their production in the BBN can be calculated,
Here ρb0 is the average density of baryonic matter today; recall that ωb ≡ Ωb0 h2 .
Isotope    B (MeV)    B/A (MeV)
²H           2.2245   1.11
³H           8.4820   2.83
³He          7.7186   2.57
⁴He         28.2970   7.07
⁶Li         31.9965   5.33
⁷Li         39.2460   5.61
⁷Be         37.6026   5.37
¹²C         92.1631   7.68
⁵⁶Fe       492.2623   8.79
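The last column of the table is simply the total binding energy divided by the mass number. A minimal sketch that recomputes it is given below; the dictionary and function names are illustrative choices, and the input values are copied from the table.

```python
# Total binding energies B in MeV, copied from the table above.
# Keys are isotope labels, values are (mass number A, B in MeV).
BINDING_ENERGY_MEV = {
    "2H": (2, 2.2245),
    "3H": (3, 8.4820),
    "3He": (3, 7.7186),
    "4He": (4, 28.2970),
    "6Li": (6, 31.9965),
    "7Li": (7, 39.2460),
    "7Be": (7, 37.6026),
    "12C": (12, 92.1631),
    "56Fe": (56, 492.2623),
}


def binding_energy_per_nucleon(mass_number, binding_energy_mev):
    """Return B/A in MeV."""
    return binding_energy_mev / mass_number


for name, (a, b) in BINDING_ENERGY_MEV.items():
    per_nucleon = binding_energy_per_nucleon(a, b)
    print(f"{name:>5}  A = {a:<3d} B/A = {per_nucleon:.2f} MeV")
```

The jump in B/A between the A = 3 nuclei and ⁴He is what makes ⁴He the natural end point of the reaction chain.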
[Figure: curves labelled ⁴He, D/H, ³He/H and ⁷Li/H; logarithmic vertical axis from 10⁻¹⁰ to 10⁰; logarithmic horizontal axis from 10⁻¹¹ to 10⁻⁸.]
Moreover, stellar processing produces heavier elements, e.g. C, N, O, which are not
produced in BBN. Their abundance varies a lot from place to place, giving a mea-
sure of how much chemical evolution has happened in various parts of the universe.
Plotting 4 He vs. these heavier elements, we can extrapolate the 4 He abundance
to zero chemical evolution to obtain the primordial abundance. Since 3 He and 7 Li
are both produced and destroyed in stellar processing, it is more difficult to make
estimates of their primordial abundances based on observed present abundances.
There are two clear qualitative signatures of big bang nucleosynthesis in the
present universe:
1. All stars and gas clouds observed contain at least 23% 4 He. If all 4 He had been
produced in stars, we would see similar variations in the 4 He abundance as we
see for the other elements, such as C, N, and O, with some regions containing
just a few % or even less 4 He. This universal minimum amount of 4 He signifies
primordial abundance produced when matter in the universe was uniform.
for a conservative range of the baryonic density parameter. With h = 0.7, we get
Figure 4: The abundances of ⁴He, D, ³He and ⁷Li and the range of η10 ≡ 10¹⁰ η determined
from BBN (yellow boxes) and the CMB (blue strip) [1]. Both the BBN and CMB ranges
are 95% C.L.
Even the conservative range is much less than cosmological estimates for Ωm0 .
Therefore most of the matter in the universe is non-baryonic. In the next chapter,
we will discuss this non-baryonic dark matter.
We can also use BBN to test for the presence of physics beyond the Standard
Model. The expansion rate of the universe depends on the energy density of radia-
tion, encoded in g∗ . During BBN, we have g∗ = 5.5+1.75Nν , where Nν is the number
of neutrino species with masses so small that they are relativistic during BBN and
have weak interactions so that their distribution is coupled to the thermal bath until
about T = 0.8 MeV. The number of neutrino species can also be left as a free pa-
rameter, in which case it parametrises any additional radiation degrees of freedom
that may be present. As mentioned in the previous chapter, for the Standard Model
we have Nν = 3.046, because neutrinos are not totally decoupled from the thermal
bath when electrons and positrons annihilate, so some of the entropy (and energy
density) of electrons and positrons is transferred to the neutrinos, hence the 0.046
correction. If we leave Nν as a free parameter and fit the observations (neglecting
lithium) we get [1] η = (4.9–7.1) × 10⁻¹⁰ and 1.8 < Nν < 4.5. So as far as BBN is
concerned, there is room for one more light neutrino species. Combining with CMB
data from Planck, we have Nν = 3.28 ± 0.28, so there is some room for additional
radiation degrees of freedom, but not quite for another full neutrino species6 .
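To get a feeling for how strongly Nν affects the expansion rate, one can evaluate g∗ = 5.5 + 1.75Nν directly. The sketch below assumes only that formula and the scaling H ∝ √g∗ at fixed temperature, which follows from the Friedmann equation in the radiation-dominated era; the function names are illustrative. (The constant 5.5 counts photons, 2, plus electrons and positrons, 7/8 × 4, and each neutrino species adds 7/8 × 2 = 1.75.)

```python
import math


def g_star_bbn(n_nu):
    """g* during BBN as quoted in the text: 5.5 + 1.75 * N_nu."""
    return 5.5 + 1.75 * n_nu


def hubble_ratio(n_nu, n_nu_ref=3.046):
    """H(N_nu) / H(N_ref) at fixed temperature, using H proportional to sqrt(g*)."""
    return math.sqrt(g_star_bbn(n_nu) / g_star_bbn(n_nu_ref))


for n_nu in (1.8, 3.0, 3.046, 3.28, 4.046, 4.5):
    print(f"N_nu = {n_nu:5.3f}: g* = {g_star_bbn(n_nu):6.3f}, "
          f"H/H_SM = {hubble_ratio(n_nu):.4f}")
```

Under these assumptions, one additional full neutrino species speeds up the expansion at a given temperature by several per cent, which is why the helium yield is sensitive to it.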
References
[1] K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38 (2014) 090001,
http://pdg.lbl.gov/2014/listings/contents listings.html
6
We know from collider and laboratory experiments that there are only three light weakly in-
teracting neutrinos. The mass limit for new weakly interacting neutrinos is m > 40 GeV if they
are of the Majorana type and m > 2400 GeV if they are of the Dirac type [1]. The distinction
corresponds to whether they are or are not (respectively) their own antiparticles.
7 Dark matter
7.1 Observational evidence for dark matter
The term “dark matter” was coined by Jacobus Kapteyn in 1922 in his studies of
the motions of stars in our galaxy to refer to matter that interacts gravitationally,
but is not seen via electromagnetic radiation[1]. He found that no dark matter is
needed in the Solar neighbourhood of the galaxy. In 1932, Jan Oort suggested the
opposite result that there would be twice as much dark matter as visible matter in
the Solar vicinity. This is the first claim of evidence for dark matter. However, later
observations have shown this claim to be wrong, and the discovery of dark matter is
usually credited to Fritz Zwicky who made the first correct claim for the existence
of dark matter in 1933. Zwicky concluded from measurements of the redshifts of
galaxies in the Coma cluster that their velocities are much larger than the escape
velocity due to the visible mass of the cluster.
There is nowadays a large amount of evidence for dark matter, including from
gravitational lensing, expansion rate of the universe and other measures. One of the
earliest, and easiest to understand, pieces of evidence comes from rotation curves
of galaxies, which have been studied extensively since the 1970s. According to
Newtonian gravity, the velocity v of a body on a circular orbit in an axially symmetric
mass distribution is
$$\frac{v^2}{r} = G_N \frac{M(r)}{r^2} , \tag{7.1}$$
where M (r) is the mass inside radius r, and the function v(r) is called the rotation
curve. For an orbit around a compact central mass, for example planets in the Solar
system, we get Kepler’s third law v ∝ 1/r^{1/2}. For stars orbiting the center of a
galaxy the situation is different, since the mass inside the orbit increases with the
distance. Suppose that the energy density of a galaxy decreases as a power-law,
$$\rho \propto r^{-n} . \tag{7.2}$$
Thus the rotation velocity in our model galaxy should vary with distance from the
center as
$$v(r) \propto r^{1-n/2} . \tag{7.4}$$
Observed rotation curves increase with r for small r, i.e., near the center of the
galaxy, but then typically flatten out, becoming roughly v(r) ≈ const. According
to (7.4), this would indicate a density profile
$$\rho \propto r^{-2} . \tag{7.5}$$
However, the density of stars falls more rapidly away from the core of a galaxy, and
falls exponentially at the edge. Also, the total mass from stars and other visible
objects, like gas and dust clouds, is too small to account for the rotation velocity at
large distances.
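The scaling (7.4) can be checked numerically. The following sketch integrates the power-law density (7.2) to obtain the enclosed mass and then measures the logarithmic slope of v(r) from (7.1). It works in arbitrary units (G_N and the density normalisation are set to 1, an assumption made purely for convenience), and the function names are illustrative.

```python
import math


def enclosed_mass(r, n, steps=2000):
    """M(r) = 4*pi * integral_0^r rho(s) s^2 ds for rho(s) = s**(-n), n < 3.

    Midpoint rule in arbitrary units (density normalisation set to 1).
    """
    ds = r / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        total += s ** (2.0 - n) * ds
    return 4.0 * math.pi * total


def circular_velocity(r, n):
    """v(r) from equation (7.1), v^2 / r = M(r) / r^2, with G_N = 1."""
    return math.sqrt(enclosed_mass(r, n) / r)


def rotation_curve_slope(n, r1=1.0, r2=2.0):
    """Logarithmic slope d ln v / d ln r estimated between two radii."""
    ratio = circular_velocity(r2, n) / circular_velocity(r1, n)
    return math.log(ratio) / math.log(r2 / r1)


for n in (0.0, 1.0, 2.0):
    print(f"n = {n}: measured slope {rotation_curve_slope(n):+.3f}, "
          f"expected 1 - n/2 = {1.0 - n / 2.0:+.3f}")
```

The case n = 2 gives a flat rotation curve, while a density that falls faster than r⁻² gives a falling one, which is the behaviour expected from the visible matter alone.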
This seems to indicate the presence of another mass component to galaxies. This
mass component should have a different density profile than the visible, or luminous,
mass in the galaxy, so that it would be subdominant in the inner parts of the galaxy,
but would dominate in the outer parts. The dark component appears to extend well
beyond the visible parts of galaxies, forming a dark halo surrounding the galaxy.
From more detailed observations, we see that instead of 1/r², the distribution
of dark matter in galaxies is well fit by the Navarro-Frenk-White (NFW) profile,
$$\rho = \frac{\rho_0}{\frac{r}{r_s}\left(1 + \frac{r}{r_s}\right)^2} , \tag{7.6}$$
where ρ0 and rs are constants. The profile obviously does not hold all the way
to the center (the density does not diverge anywhere!), and close to the centers of
galaxies, the densities are typically dominated by baryonic matter, and the dark
matter profile rises less steeply than in the NFW case.
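To see how the NFW profile (7.6) differs from a pure 1/r² law, it helps to evaluate its logarithmic slope and the mass it encloses. The sketch below is illustrative: the function names are hypothetical, and the closed-form enclosed mass is the standard result of integrating 4πr²ρ over the profile rather than something derived in these notes.

```python
import math


def nfw_density(r, rho0=1.0, rs=1.0):
    """Equation (7.6): rho = rho0 / [(r/rs) (1 + r/rs)^2]."""
    x = r / rs
    return rho0 / (x * (1.0 + x) ** 2)


def nfw_enclosed_mass(r, rho0=1.0, rs=1.0):
    """Integral of 4 pi s^2 rho(s) from 0 to r:
    4 pi rho0 rs^3 [ln(1 + x) - x / (1 + x)] with x = r/rs."""
    x = r / rs
    return 4.0 * math.pi * rho0 * rs ** 3 * (math.log(1.0 + x) - x / (1.0 + x))


def nfw_log_slope(r, rs=1.0, eps=1e-4):
    """Numerical d ln rho / d ln r, by a centred finite difference."""
    lo, hi = r * (1.0 - eps), r * (1.0 + eps)
    return ((math.log(nfw_density(hi, rs=rs)) - math.log(nfw_density(lo, rs=rs)))
            / (math.log(hi) - math.log(lo)))


for x in (0.01, 1.0, 100.0):
    print(f"r/rs = {x:6.2f}: d ln rho / d ln r = {nfw_log_slope(x):.3f}")
```

The slope runs from about −1 in the inner region to about −3 far outside, passing through −2 at r = rs, so the profile resembles ρ ∝ r⁻² only around the scale radius. The enclosed mass grows logarithmically at large r.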
Dark matter can be discussed in terms of the mass-to-light ratio M/L of sources.
It is customarily given in units of M⊙ /L⊙ , where M⊙ and L⊙ are the mass and
absolute luminosity of the Sun. The luminosity of a star increases with its mass
faster than linearly, so stars with M > M⊙ have M/L < 1, and smaller stars have
M/L > 1. Small stars are more common than large stars, so a typical mass-to-light
ratio from the stellar component of galaxies is M/L ∼ a few. For stars in our part of
the Milky Way galaxy, M/L ≈ 2.2. Because large stars are more short-lived, M/L
decreases with the age of the star system, and the typical M/L from stars in the
universe is somewhat larger. However, this still does not account for the full masses
of galaxies.
The mass-to-light ratio of a galaxy turns out to be difficult to determine; the
larger volume around the galaxy you include, the larger M/L you get. The mass
M is determined from velocities of orbiting bodies and at large distances there may
be no such bodies visible. For galaxy clusters you can use the velocities of the
galaxies themselves as they orbit the center of the cluster. The mass-to-light ratios
of clusters appear to be several hundreds. Presumably isolated galaxies would have
similar values if we could measure them to large enough radii.
Estimates for the total matter density Ωm0 based on the gravitational effects of
matter in the universe via many different methods give a similar conservative range
0.1 ≲ Ωm0 ≲ 0.4, with a more likely range of 0.15 ≲ Ωm0 ≲ 0.3 [3]. Also, from the
CMB we have ωm = Ωm0 h2 = 0.1426 ± 0.0025 for the spatially flat ΛCDM model
[4], and ωm = 0.14 ± 0.01 model-independently [5]1 .
1
In fact, if there is only baryonic matter, the CMB anisotropy pattern looks qualitatively different
The estimates for the amount of ordinary matter in the objects we can see on
the sky, stars and visible gas and dust clouds, i.e. luminous matter, give a much
smaller contribution to the density parameter,
In the previous chapter we found that big bang nucleosynthesis leads to the value
0.019 ≤ Ωb0 h2 ≤ 0.024 at 95% C.L., and the CMB gives a similar range. Taking
conservatively h = 0.6 . . . 0.8, we have
This is consistent, as all luminous matter is baryonic, and all baryonic matter is
matter. That we have two inequalities tells us that there are two kinds of dark matter
(as opposed to luminous matter): baryonic dark matter (BDM) and nonbaryonic
dark matter. We do not know the precise nature of either kind of dark matter, and
this is called the dark matter problem. Determining the nature of dark matter is one
of the most important problems in cosmology today. Often the expression “dark
matter” is used to refer to the nonbaryonic kind only, or only to non-baryonic dark
matter other than neutrinos, i.e. only to the part which is unknown.
The masses of ordinary black holes are included in the Ωb estimate from BBN,
since they were formed from baryonic matter after BBN. However, if there are
primordial black holes produced in the big bang before BBN, they would not be
included in Ωb .
A star requires a mass of about 0.07M⊙ to ignite thermonuclear fusion, and to
start to shine as a star. Smaller, “failed”, stars are called brown dwarfs. They are not
completely dark; they are warm balls of gas which radiate faint thermal radiation.
They were warmed up by the gravitational energy released in their compression
to a compact object. Thus brown dwarfs can be, and have been, observed with
telescopes if they are quite close by. Smaller such objects are called “jupiters” after
the representative in the Solar System.
The strategy to observe a microlensing event is to monitor constantly a large
number of stars to catch such a brightening when it occurs for one of them. Since
the typical time scales of these events are many days, or even months, it is enough
to look at each star, say, once every day or so. As most of the dark matter is in the
outer parts of the galaxy, further out than we are, it would be best if the stars to
be monitored were outside of our galaxy. The Large Magellanic Cloud, a satellite
galaxy of our own galaxy, is a good place to look, being at a suitable distance,
where individual stars are still easy to distinguish. Because of the required precise
alignment of us, the MACHO, and the distant star, the microlensing events will be
rare. But if the BDM in our galaxy consisted mainly of MACHOs (with masses
between that of Jupiter and several solar masses), and we monitored constantly
millions of stars in the LMC, we should observe many events every year.
Such observing campaigns (MACHO, OGLE, EROS, . . . ) were begun in the
1990’s. Indeed, several microlensing events have been observed. The typical mass of
these MACHOs turned out to be ∼ 0.5M⊙ , much larger than the brown dwarf mass
that had been expected. The most natural faint object with such a mass would be
a white dwarf. However, white dwarfs had been expected to be much too rare to
explain the number of observed events. On the other hand the number of observed
events is too small for these objects to dominate the mass of the BDM in the halo
of our galaxy.
BDM in our universe is dominated by thin intergalactic ionized gas. In fact, in
large clusters of galaxies, we can see this gas, as it has been heated by the deep
gravitational well of the cluster, and radiates X-rays.
Figure 2: Comparison of the expected halo of the Milky Way and the galaxies M31 and M33
in CDM and WDM models. From http://www.clues-project.org/images/darkmatter.html.
structures. CDM, on the other hand, refers to dark matter particles with negligible
velocities. If these velocities are thermal, this requires their masses to be large,
which means that they must have decoupled while already nonrelativistic, so their
number density would have been suppressed by annihilation. Candidates between
hot and cold are called, naturally enough, warm dark matter.
HDM, WDM and CDM all have a different effect on structure formation in the
universe. Structure formation refers to the process in which the originally nearly
homogeneously distributed matter forms bound structures such as galaxies and galaxy
clusters under the pull of gravity. We shall discuss structure formation in the second
part of the course. But let us already mention that today the best way to differentiate
between HDM, WDM and CDM is through the observed large-scale structure in the
universe, i.e. the way galaxies are distributed in space, combined with the CMB
which shows the seeds of structure. We show in figure 2 the results of a simulation
of the halo of dark matter around the Milky Way and two other galaxies. For CDM,
there is more substructure and satellites around the galaxy, while their formation is
suppressed for WDM. According to observations, there are an order of magnitude fewer
satellites around the Milky Way than predicted by CDM models. However,
the observations are not complete, and the discrepancy may also be due to other
causes than WDM.
The most common candidates for non-baryonic dark matter are Weakly Inter-
acting Massive Particles, or WIMPs. They decouple from the thermal bath of the
early universe early, like neutrinos, but are much heavier, so that they are a form of
CDM. The interactions of some dark matter candidates are stronger or weaker. For
example, gravitinos have only gravitational interactions, while TIMPs (Technicolour
Interacting Massive Particles) can have stronger interactions.
where the sum is over the neutrino mass eigenstates (which are not the same as the
weak interaction eigenstates, for which the names electron neutrino, muon neutrino
and tau neutrino are properly reserved). For T0 = 2.725 K, this gives the neutrino
density parameter
$$\Omega_{\nu 0} h^2 = \frac{\sum m_\nu}{94.14\ \mathrm{eV}} , \tag{7.11}$$
which applies if the neutrino masses are less than the neutrino decoupling tem-
perature, 1 MeV, but greater than the present temperature of massless neutrinos,
Tν0 = 0.168 meV. This counts then as one contribution to Ωm0 . In this case, neu-
trinos are hot dark matter (HDM). Data on large scale structure combined with
structure formation theory requires that a majority of the matter in the universe
has to be CDM and the present upper limit to HDM in the form of massive neutrinos
leads to the conservative mass limit [6]
$$\sum m_\nu \lesssim 2.0\ \mathrm{eV} . \tag{7.12}$$
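The numbers quoted around (7.11) and (7.12) are easy to reproduce. The sketch below assumes the standard relation Tν0 = (4/11)^{1/3} T0 for the temperature of massless neutrinos and takes ωm = 0.1426 from the CMB value quoted earlier in this chapter; the function names and the value of the Boltzmann constant in eV/K are supplied here for illustration.

```python
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K


def neutrino_temperature_today_ev(t0_kelvin=2.725):
    """T_nu0 = (4/11)^(1/3) T_0 for massless neutrinos, in eV (assumed relation)."""
    return (4.0 / 11.0) ** (1.0 / 3.0) * t0_kelvin * K_B_EV_PER_K


def omega_nu_h2(masses_ev):
    """Equation (7.11): Omega_nu0 h^2 = sum(m_nu) / 94.14 eV."""
    return sum(masses_ev) / 94.14


t_nu0_mev = neutrino_temperature_today_ev() * 1.0e3   # in meV
print(f"T_nu0 = {t_nu0_mev:.3f} meV")

omega_m_h2 = 0.1426                  # CMB value quoted in the text
bound = omega_nu_h2([2.0])           # mass sum at the limit (7.12)
print(f"sum m_nu = 2.0 eV gives Omega_nu0 h^2 = {bound:.4f}, "
      f"a fraction {bound / omega_m_h2:.2f} of the matter density")
```

Even at the conservative limit (7.12), neutrinos would thus make up only a minority of the matter density.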
for neutrinos to dominate the galaxy mass. (A rough estimate is enough: you can,
e.g., assume that the neutrino distribution is spherically symmetric, and that the
escape velocity within radius r equals the escape velocity at r.) Give the numerical
value for the case v = 200 km/s and r = 10 kpc.
so-called “strong CP problem” in particle physics. We shall not go into the details
of this, just note that it can be phrased as the question “why is the neutron electric
dipole moment so small?”. (The electric dipole moment is zero to the accuracy of
measurement, the upper limit being dn < 0.29 × 10⁻²⁵ e cm [7], whereas the neutron
does have a significant magnetic dipole moment.) A proposed solution involves
an additional symmetry of particle physics (the Peccei–Quinn symmetry). The
axion would then be the Goldstone boson of the breaking of this symmetry. The
important point for us is that these axions would be created in the early universe
when the temperature fell below the QCD energy scale (of the order of 100 MeV),
and they would be created with negligible kinetic energy, and would never be in
thermal interaction. Thus axions would have negligible velocities, and act like CDM.
(Though calling axions “cold” is a bit of a misnomer, as their phase space distribution
is not thermal! Here the word just means that their typical kinetic energy is much
smaller than the mass.) Another dark matter candidate of this type is the gravitino,
the supersymmetric partner of the graviton. We will not discuss this kind of dark
matter further, and will stick to massive CDM particles.
where ndm is the number density of the dark matter particles, n̄dm is the antiparticle
number density, ψdm is the rate of creation of the dark matter particles, and ⟨ ⟩
indicates the average over the phase space distribution. Let us first consider the case
when there is no particle-antiparticle asymmetry, so the chemical potential is zero,
µdm = 0. We will later see what happens if there is a conserved quantum number
which enforces a particle-antiparticle asymmetry. (The term “thermal relic” is often
used to refer only to the case when an asymmetry between particles and antiparticles
is not important.) In equilibrium, equally many particles are being annihilated and
created, so ψdm = ⟨σv⟩n²dm ≡ ⟨σv⟩n²eq ≡ Γneq, where neq is the number density in
equilibrium. Denoting the number of dark matter particles Ndm ∝ a³ndm (and the
equilibrium number by Neq), we have
$$\frac{1}{N_{eq}} \frac{dN_{dm}}{d(\ln a)} = -\frac{\Gamma}{H} \left[ \left( \frac{N_{dm}}{N_{eq}} \right)^2 - 1 \right] . \tag{7.14}$$
In the limit Γ ≫ H, interactions rapidly restore any deviations from the equilibrium
distribution. If Ndm > Neq , the right-hand side of (7.14) is negative, so the numbers
will decrease, and the opposite for Ndm < Neq . In the limit of weak coupling,
Γ ≪ H, we get Ndm ≈ constant. The time when the number of particles reaches
this constant value is called decoupling (a term we already used with photons and
neutrinos) or freeze-out. A crude approximation is to say that decoupling happens
at exactly the temperature Td where H = Γ, and that the number of particles
follows the equilibrium behaviour before and is conserved afterwards, as we did for
the neutrinos.
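For reference, the algebra leading to (7.14) can be written out. The sketch below assumes that the rate equation has the standard form in which the comoving number changes through annihilation and creation only, with equal particle and antiparticle densities; this starting point is an assumption about the preceding equation, while the definitions ψdm = ⟨σv⟩n²eq, Γ = ⟨σv⟩neq and Ndm ∝ a³ndm are those of the text.

```latex
% Assumed starting point: rate equation with n_dm = \bar{n}_dm
\begin{align*}
\frac{1}{a^3} \frac{d(a^3 n_{dm})}{dt}
  &= -\langle\sigma v\rangle n_{dm}^2 + \psi_{dm}
   = -\langle\sigma v\rangle \left( n_{dm}^2 - n_{eq}^2 \right) \\
% multiply by a^3 and use a^3 n_eq^2 = n_eq N_eq (up to the common constant in N)
\frac{dN_{dm}}{dt}
  &= -\langle\sigma v\rangle n_{eq} N_{eq}
     \left[ \left( \frac{N_{dm}}{N_{eq}} \right)^2 - 1 \right] \\
% use d/dt = H d/d(ln a) and Gamma = <sigma v> n_eq
\frac{1}{N_{eq}} \frac{dN_{dm}}{d(\ln a)}
  &= -\frac{\Gamma}{H}
     \left[ \left( \frac{N_{dm}}{N_{eq}} \right)^2 - 1 \right]
\end{align*}
```

The last line makes the role of the ratio Γ/H explicit: it sets how many e-folds of expansion are needed to relax Ndm towards Neq.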
If a particle decouples while it is relativistic, the number density is of the order
Td³. We calculated this starting from the phase space distribution, but it is fairly
obvious, because Td is the only relevant dimensional quantity. As we discussed
above, such hot dark matter would have a large energy density today unless the
mass is small. However, as a particle species becomes non-relativistic, the number
density falls exponentially (still assuming that the chemical potential is zero), so
the mass of the dark matter particle can be large while keeping the number density
down.
The number density of a non-relativistic particle in thermal equilibrium (with
zero chemical potential) at decoupling time td and temperature Td is
$$n_{eq}(t_d) = g_{dm} \left( \frac{m T_d}{2\pi} \right)^{3/2} e^{-m/T_d} , \tag{7.15}$$
where m is the mass of the dark matter particle. From this we get the density today
as (assuming negligible decay)
where we have used the relation g∗S (T )T 3 a3 = constant, which follows from con-
servation of entropy. The total energy density is ρdm = 2mndm (the factor 2 comes
from the fact that we have an equal number of particles and antiparticles).
In order to determine the number density of a thermal relic, we need to know
the mass, the decoupling temperature and the number of degrees of freedom at
decoupling. At decoupling, Γ = neq(td)⟨σv⟩, so we need to know the mean of the
cross section times the velocity. The cross-section depends on the details of the
particle physics, but we can roughly parametrise the annihilation cross-section as
σv ∝ v^{2q}, where q = 0 for annihilation in the ground state (s-wave), and q = 1
for annihilation in the p-wave state. This can be understood as an expansion in
the square of the velocity, and since v ≪ 1, only the leading term is relevant.
(The p-wave term is only important if annihilation in the ground state is forbidden
or strongly suppressed for some reason.) For a non-relativistic particle, ⟨|v|⟩ =
√(8/π) √(T/m), so we write ⟨σv⟩ = σ0 (T/m)^q. We therefore have
$$\Gamma(t_d) = \frac{g_{dm}}{(2\pi)^{3/2}} \sigma_0 m^3 y^{-q-3/2} e^{-y} , \tag{7.17}$$
where we have defined y ≡ m/Td ; we have y ≫ 1 since the dark matter particle is
non-relativistic.
According to the Friedmann equation, the Hubble parameter is given by
$$3H^2 = 8\pi G_N \frac{\pi^2}{30} g_*(T) T^4 = \frac{\pi^2}{30 M_{Pl}^2} g_*(T) T^4 , \tag{7.18}$$
so
$$H(t_d) = \pi \sqrt{\frac{g_*(T_d)}{90}} \frac{m^2}{M_{Pl}} y^{-2} . \tag{7.19}$$
Equating Γ(td ) = H(td ), we get an equation from which we can solve the decoupling
temperature in units of the dark matter mass, y,
$$y_0 = \ln N$$
$$y_1 = \ln N + \left( \frac{1}{2} - q \right) \ln(\ln N) , \tag{7.22}$$
and so on; the first approximation will be good enough for us.
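The iteration can be carried through numerically. The sketch below reconstructs the condition Γ(td) = H(td) from (7.17) and (7.19) as e^y = N y^{1/2−q}, where N collects all prefactors; this form of N is a reconstruction consistent with the iterates y0 and y1 above, not a quotation. The parameter values (a weak-scale cross section σ0 = G_F² m², gdm = 2, g∗ = 61.75) are illustrative assumptions, and M_Pl is the reduced Planck mass, as implied by (7.18).

```python
import math

G_FERMI = 1.166e-5            # Fermi constant in GeV^-2
M_PLANCK_REDUCED = 2.435e18   # reduced Planck mass in GeV, 1/sqrt(8 pi G_N)


def prefactor_n(m_gev, g_dm=2.0, g_star=61.75, sigma0=None):
    """N in e^y = N y^(1/2 - q), from equating (7.17) with (7.19).

    All default parameter values are illustrative assumptions.
    """
    if sigma0 is None:
        sigma0 = G_FERMI ** 2 * m_gev ** 2   # assumed weak-scale cross section
    return (math.sqrt(90.0 / g_star) * g_dm * sigma0 * m_gev * M_PLANCK_REDUCED
            / ((2.0 * math.pi) ** 1.5 * math.pi))


def solve_y(n, q=0, iterations=50):
    """Fixed-point iteration y -> ln N + (1/2 - q) ln y, starting from y0 = ln N."""
    y = math.log(n)
    for _ in range(iterations):
        y = math.log(n) + (0.5 - q) * math.log(y)
    return y


n_1gev = prefactor_n(1.0)
y0 = math.log(n_1gev)
y1 = y0 + 0.5 * math.log(y0)
print(f"ln N = y0 = {y0:.2f}, y1 = {y1:.2f}, converged y = {solve_y(n_1gev):.2f}")
```

Because the correction is logarithmic, the iteration converges within a few steps, and the difference between y0 and the converged value is less than ten per cent for these parameters.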
From (7.15) and (7.16), the relic abundance is
where we have used (7.20), put g∗S (Td ) = g∗ (Td ) (we assume that no particles
are becoming non-relativistic as the dark matter decouples) and g∗S (T0 ) ≈ 3.91,
and traded the temperature today for the photon number density via the relation
nγ = 2ζ(3)T³/π². The relic energy density ρdm0 = 2mndm0 depends on the mass
only logarithmically via y, apart from the possible mass dependence of σ0 .
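The photon number density used here is simple to evaluate. The sketch below converts nγ = 2ζ(3)T³/π² from natural units to cm⁻³ for T0 = 2.725 K; the physical constants and the function name are supplied for illustration, and η = 6 × 10⁻¹⁰ is the value used in this chapter.

```python
import math

ZETA_3 = 1.2020569031595942   # Riemann zeta(3)
K_B_EV_PER_K = 8.617333e-5    # Boltzmann constant in eV/K
HBAR_C_EV_CM = 1.973270e-5    # hbar * c in eV cm


def photon_number_density(t_kelvin):
    """n_gamma = 2 zeta(3) T^3 / pi^2, converted from natural units to cm^-3."""
    t_inverse_cm = t_kelvin * K_B_EV_PER_K / HBAR_C_EV_CM
    return 2.0 * ZETA_3 / math.pi ** 2 * t_inverse_cm ** 3


n_gamma0 = photon_number_density(2.725)
eta = 6.0e-10
print(f"n_gamma0 = {n_gamma0:.1f} cm^-3")
print(f"n_B0 = eta * n_gamma0 = {eta * n_gamma0:.2e} cm^-3")
```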
the approximate value of y. We thus get Td ≈ m/[17 + 3 ln(m/ GeV)]. This is
consistent with the adopted value of g∗(Td) only for roughly 40 GeV ≳ m ≳ 10 GeV,
but since g∗ (Td ) enters only logarithmically, the value of Td is not sensitive to the
precise number of degrees of freedom. These numbers give
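$$\begin{aligned}
n_{dm0} &\approx 3 \times 10^{-8} \left[ 1 + 0.2 \ln\frac{m}{\mathrm{GeV}} \right] \left( \frac{m}{\mathrm{GeV}} \right)^{-3} n_{\gamma 0} \\
&= 3 \times 10^{-8}\, \eta^{-1} \left[ 1 + 0.2 \ln\frac{m}{\mathrm{GeV}} \right] \left( \frac{m}{\mathrm{GeV}} \right)^{-3} n_{B0} \\
&\approx 50 \left[ 1 + 0.2 \ln\frac{m}{\mathrm{GeV}} \right] \left( \frac{m}{\mathrm{GeV}} \right)^{-3} n_{B0} ,
\end{aligned} \tag{7.24}$$
where we have taken η = 6 × 10⁻¹⁰. Since mb ≈ 1 GeV, we have
$$\rho_{dm0} \approx 100 \left[ 1 + 0.2 \ln\frac{m}{\mathrm{GeV}} \right] \left( \frac{m}{\mathrm{GeV}} \right)^{-2} \rho_{b0} . \tag{7.25}$$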
For m = 1 GeV, we would have ρdm0/ρb0 ≈ 10² and m = 100 GeV would give
ρdm0/ρb0 ≈ 10⁻². Since ρb0 ≈ 0.05ρc0, we get the bound m ≳ 2 GeV on the mass
of the dark matter particle in order for its present density not to exceed the critical
density. This is called the Lee-Weinberg bound. We get the observed ratio
ρdm0/ρb0 ≈ 5 . . . 7 for m ≈ 4 . . . 5 GeV. Note the assumptions in the derivation
of the bound: the particle is assumed to be a thermal relic (i.e. the number den-
sity is determined by the thermal equilibrium distribution at decoupling) and the
annihilation occurs via the s-wave process.
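The mass values quoted above follow from inverting (7.25). A minimal sketch is given below; it takes (7.25) at face value, uses ρb0 ≈ 0.05ρc0 so that the critical density corresponds to a ratio of 20, and solves for the mass by bisection. The function names are illustrative.

```python
import math


def relic_to_baryon_ratio(m_gev):
    """Equation (7.25): rho_dm0 / rho_b0 for an s-wave thermal relic."""
    return 100.0 * (1.0 + 0.2 * math.log(m_gev)) / m_gev ** 2


def mass_for_ratio(target, lo=1.0, hi=1.0e3, tol=1e-6):
    """Mass in GeV at which (7.25) equals target.

    Bisection; the ratio decreases monotonically with m for m >= 1 GeV.
    """
    if not relic_to_baryon_ratio(hi) <= target <= relic_to_baryon_ratio(lo):
        raise ValueError("target ratio is outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if relic_to_baryon_ratio(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(f"ratio at m = 1 GeV:   {relic_to_baryon_ratio(1.0):.3g}")
print(f"ratio at m = 100 GeV: {relic_to_baryon_ratio(100.0):.3g}")
print(f"Lee-Weinberg bound (ratio 20): m = {mass_for_ratio(20.0):.2f} GeV")
print(f"observed ratio 5 to 7: m = {mass_for_ratio(7.0):.2f} "
      f"to {mass_for_ratio(5.0):.2f} GeV")
```

The steep m⁻² dependence is what turns an upper limit on the density into a lower limit on the mass.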
The fact that a thermal relic with weak cross section and a mass not too different
from the weak scale gives the right relic abundance is called the WIMP miracle.
However, in the MSSM, a weakly interacting dark matter particle with a mass of 4
to 5 GeV would already have been detected in collider experiments. According to
the latest analyses, the lower mass limit for fermionic SUSY partners in the MSSM is
15 GeV [8]. As the LHC starts taking data again in 2015, the limit is expected to
rise. (Lighter particles can be viable in more complicated models, however.) Generally,
the preferred range for dark matter masses is of the order 100 GeV or so in the
usually studied models. One can still get the right relic abundance by making the
self-annihilation cross section smaller so that more particles remain, and extensions
of the Standard Model such as MSSM contain enough free parameters to adjust the
cross sections and masses. However, they can be independently tested in colliders
and via direct and indirect detection of the dark matter particles, which we will
shortly discuss.
so
$$\frac{\rho_{dm0}}{\rho_{B0}} = N \frac{m_{dm}}{m_N} , \tag{7.27}$$
which agrees with the observed ratio 5 . . . 7 for mdm = (5 . . . 7)/N GeV.
One constraint on such models is that the phase space distribution of the dark
matter particles has to correspond to CDM (or WDM). So the dark matter particle
cannot have decoupled at the electroweak phase transition with a thermal distri-
bution function if its mass is smaller than 100 GeV. However, a model where the
distribution function is not thermal would still be possible – the essential thing is
that the high momentum states of the dark matter particles are not occupied. From
the point of view of technicolour models, mdm ≲ 10 GeV is also somewhat unnaturally
low unless N ≪ 1, since the technicolour scale has to be ≳ 1 TeV to be
consistent with collider experiments (no technicolour bound states, or any other
signatures of technicolour for that matter, have been observed). Naively, one would
expect the mass of the stable technicolour dark matter particle to be of this order,
or at least of the order of the Higgs mass, mH ∼ 100 GeV, since they have the same
origin as bound states. But there could be a reason why the lightest stable fermionic
bound state is much lighter than a bosonic unstable state. (In QCD, the lightest
bosonic bound states, the pions with mπ0 = 135 MeV and mπ± = 140 MeV, are
about an order of magnitude lighter than the lightest stable bound state, the proton
with mp = 938 MeV, because of chiral symmetry – let’s not get into this!)
Alternatively, we can have reactions which mix particles carrying baryon number
and particles carrying Q together, so that their relative abundance depends on the
freeze-out temperature Tf of these interactions. Let’s say that we have reactions which
interconvert baryons and dark matter particles
dm + X ↔ q + Y , (7.28)
where q stands for a quark, which carries B = 1/3, dm stands for the dark matter
particle which carries Q = 1 (or any other particle carrying the same quantum
number), and X and Y are particles which carry neither B nor Q, and we assume
we can neglect their chemical potentials. We then have, as long as these reactions
are in equilibrium, µdm = µq . Let us assume that these reactions freeze out at the
electroweak phase transition, and take the particle carrying the quantum number to
be massless. (Since the top quark receives a mass of the order of the electroweak
scale at the phase transition, we would have to consider the effect of this. We
neglect this complication. At least in some technicolour models, the top quark does
not make a difference [10].)
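We assume that the technicolour particles are in thermal equilibrium. In order
for them to count as CDM, we then need mdm ≫ Tf. We thus have
$$\begin{aligned}
n_B - \bar{n}_B &= g_B T^3 \frac{\mu_B}{T} \\
n_{Q_{dm}} - \bar{n}_{Q_{dm}} &= g_{dm} \left( \frac{m_{dm} T}{2\pi} \right)^{3/2} e^{-m_{dm}/T} \left( e^{\mu_{dm}/T} - e^{-\mu_{dm}/T} \right) \\
&\simeq 2 g_{dm} \frac{\mu_{dm}}{T} \left( \frac{m_{dm} T}{2\pi} \right)^{3/2} e^{-m_{dm}/T} ,
\end{aligned} \tag{7.29}$$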
where we have taken into account that the asymmetries and thus the chemical
potentials are small, and gB = 24 (the number of degrees of freedom in the quarks is
72, and each quark has B = 1/3). Note that just as gB is the number of degrees of
freedom which carry the conserved quantum number B which ends up in baryons,
gdm is the number of degrees of freedom which carry Qdm , which will in the late
universe end up in the dark matter particles only. Equating the chemical potentials
and noting that today ρB0 = mN nB0 , we obtain
$$\frac{\rho_{dm0}}{\rho_{B0}} = \frac{g_{dm}}{12(2\pi)^{3/2}} \frac{m_{dm}}{m_N} \left( \frac{m_{dm}}{T_f} \right)^{3/2} e^{-m_{dm}/T_f} . \tag{7.30}$$
Taking gdm = 100 and Tf = 100 GeV, we get the observed abundance for mdm ≈
700 . . . 800 GeV ∼ 1 TeV. (Note that the temperature at which the electroweak
phase transition happens may change from the Standard Model value of 100 GeV
due to the new particles and interactions present in technicolour.)
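Equation (7.30) is steep in mdm/Tf because of the exponential, so it is instructive to evaluate it numerically. The sketch below uses the values gdm = 100 and Tf = 100 GeV from the text and assumes mN = 0.939 GeV; the function names and the bisection bracket are illustrative choices.

```python
import math

M_NUCLEON_GEV = 0.939   # nucleon mass in GeV (assumed value)


def asymmetric_ratio(m_dm_gev, t_f_gev=100.0, g_dm=100.0):
    """Equation (7.30): rho_dm0 / rho_B0 for freeze-out at temperature T_f."""
    x = m_dm_gev / t_f_gev
    return (g_dm / (12.0 * (2.0 * math.pi) ** 1.5)
            * (m_dm_gev / M_NUCLEON_GEV) * x ** 1.5 * math.exp(-x))


def mass_for_ratio(target, lo=300.0, hi=3000.0, tol=1e-6):
    """Mass in GeV giving the target ratio.

    Bisection on the branch m > 2.5 T_f, where (7.30) decreases with m
    (valid as written for the default T_f = 100 GeV).
    """
    if not asymmetric_ratio(hi) <= target <= asymmetric_ratio(lo):
        raise ValueError("target ratio is outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if asymmetric_ratio(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for m in (600.0, 700.0, 800.0, 900.0):
    print(f"m_dm = {m:5.0f} GeV: rho_dm0 / rho_B0 = {asymmetric_ratio(m):.2f}")
print(f"ratio 7 at m_dm = {mass_for_ratio(7.0):.0f} GeV, "
      f"ratio 5 at m_dm = {mass_for_ratio(5.0):.0f} GeV")
```

With these inputs the observed ratio is reached for masses of several hundred GeV, and changing mdm by 100 GeV changes the ratio by roughly a factor of two.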
Figure 3: A composite image of galaxy cluster 1E 0657-56, also called the Bullet Cluster. It
consists of two subclusters, a larger one on the left, and a smaller one on the right. They have
recently collided and traveled through each other. One component of the image is an optical
image which shows the visible galaxies. Superposed on it, in red, is an X-ray image, which
shows the heated intergalactic gas, that has been slowed down by the collision and left behind
the galaxy components of the clusters. The blue colour is another superposed image, which
represents an estimate of the total mass distribution of the cluster, based on gravitational
lensing. NASA Astronomy Picture of the Day 2006 August 24. Composite Credit: X-Ray:
NASA/CXC/CfA/M. Markevitch et al. Lensing map: NASA/STScI; ESO WFI; Magel-
lan/U. Arizona/D. Clowe et al. Optical: NASA/STScI; Magellan/U. Arizona/D. Clowe et
al.
$$\begin{aligned}
m v^2 &= m v_{dm}^2 + m_t v_t^2 \\
m v &= m v_{dm} + m_t v_t ,
\end{aligned} \tag{7.31}$$
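where mt is the mass of the target nucleus, and m is still the dark matter mass. For
the kinetic energy ½ mt vt² given to the nucleus, we get
$$\begin{aligned}
E &= \frac{2 m_t}{(1 + m_t/m)^2} v^2 \\
&\approx \frac{2A}{(1 + A m_N/m)^2} \left( \frac{v}{300\ \mathrm{km/s}} \right)^2 \mathrm{keV} ,
\end{aligned} \tag{7.32}$$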
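The recoil energy (7.32) can be evaluated for concrete targets. The sketch below implements the first line of (7.32) for a head-on elastic collision; the nucleon mass, the choice of xenon (A = 131) as an example target and the function name are assumptions made for illustration.

```python
M_NUCLEON_GEV = 0.939    # nucleon mass in GeV (assumed value)
C_KM_S = 299792.458      # speed of light in km/s


def recoil_energy_kev(m_dm_gev, mass_number, v_km_s=300.0):
    """Equation (7.32): E = 2 m_t v^2 / (1 + m_t/m)^2, returned in keV."""
    m_t = mass_number * M_NUCLEON_GEV        # target nucleus mass in GeV
    beta = v_km_s / C_KM_S                   # velocity in units of c
    energy_gev = 2.0 * m_t * beta ** 2 / (1.0 + m_t / m_dm_gev) ** 2
    return energy_gev * 1.0e6


for m_dm in (10.0, 100.0, 1000.0):
    print(f"m = {m_dm:6.0f} GeV on A = 131: "
          f"E = {recoil_energy_kev(m_dm, 131):.1f} keV")
```

For a light dark matter particle the factor (1 + mt/m)⁻² suppresses the recoil strongly, which is why heavy targets are less sensitive to low masses.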
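$$\begin{aligned}
\Gamma_N &= \langle \sigma_{dm-nucleus} v \rangle n_{dm} N \\
&\approx \frac{2 \times 10^4 A}{\mathrm{yr}} \frac{M}{\mathrm{ton}} \frac{\langle \sigma_{dm-p} v \rangle}{10^{-40}\ \mathrm{cm^2} \times 300\ \mathrm{km/s}} \frac{\rho_{dm}}{0.3\ \mathrm{GeV/cm^3}} \left( \frac{m}{\mathrm{GeV}} \right)^{-1} ,
\end{aligned} \tag{7.33}$$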
where we have put in typical values for the cross section, velocity and WIMP density.
The latter two are determined by taking a given density profile for the dark matter as
a function of radius and using the observed rotation curves, and they also agree with
typical values obtained from galactic simulations of dark matter⁶. For comparison,
the weak interaction annihilation cross section for 1 GeV mass is σ ∼ G_F² GeV² ∼
10⁻¹⁰ GeV⁻² ≈ 4 × 10⁻³⁸ cm² ≈ 10⁻²⁷ cm³/s, using the relation 197 MeV ≈ 1/fm.
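Both the scaling relation (7.33) and the unit conversion for the weak cross section are quick to check. The sketch below evaluates (7.33) exactly as written and converts G_F² GeV² to cm² using (ħc)² ≈ 3.894 × 10⁻²⁸ GeV² cm²; the constants, the function names and the xenon example are assumptions added for illustration.

```python
G_FERMI = 1.166e-5            # Fermi constant in GeV^-2
GEV_INV2_TO_CM2 = 3.894e-28   # (hbar c)^2: 1 GeV^-2 expressed in cm^2
C_CM_S = 2.998e10             # speed of light in cm/s


def event_rate_per_year(mass_number, m_dm_gev, detector_mass_ton=1.0,
                        sigma_v_over_ref=1.0, rho_dm_gev_cm3=0.3):
    """Scaling form of (7.33), in events per year.

    sigma_v_over_ref is <sigma_dm-p v> in units of 1e-40 cm^2 x 300 km/s.
    """
    return (2.0e4 * mass_number * detector_mass_ton * sigma_v_over_ref
            * (rho_dm_gev_cm3 / 0.3) / m_dm_gev)


def weak_cross_section_cm2(energy_gev=1.0):
    """sigma ~ G_F^2 E^2 converted from GeV^-2 to cm^2."""
    return G_FERMI ** 2 * energy_gev ** 2 * GEV_INV2_TO_CM2


rate = event_rate_per_year(131, 100.0)
sigma = weak_cross_section_cm2()
print(f"A = 131, m = 100 GeV, 1 ton: {rate:.0f} events per year")
print(f"sigma_weak(1 GeV) = {sigma:.1e} cm^2, sigma * c = {sigma * C_CM_S:.1e} cm^3/s")
```

The conversion reproduces the order of magnitude quoted in the text; the exact prefactor depends on whether G_F² is rounded to 10⁻¹⁰ GeV⁻⁴.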
6
The energy density one gets in detailed analyses typically does not vary from ρdm =
0.3 GeV/cm3 by more than a factor of a few. However, strictly speaking, observations are con-
sistent with no dark matter in the Solar system – as far as galactic rotation curves are concerned,
dark matter is only needed in the outer parts of the galaxy. There is a direct upper limit on the
density of dark matter in the Solar system from the fact that no disruption of planetary orbits
has been observed, and it is about 10⁶ times this value.
Figure 4: Modulation of the detection rate of the DAMA/LIBRA experiment in the 2-6
keV energy range, in units of counts per day/kg/keV. From [15].
structure formation, and this can lead to observable amounts of annihilation. (Note
the similarity to nuclear reactions: they freeze out in the early universe, but light
up again in regions where the density of baryonic matter rises sufficiently due to
gravitational collapse.)
The amount of annihilation is proportional to the square of the dark matter
density, so the largest signal is expected from regions with high dark matter density, such
as dwarf galaxies or the centre of our own galaxy. Dark matter can also accumulate in
the Sun and at the center of the Earth, and though the numbers are much smaller,
these locations are much nearer to us, so the detection is easier. However, only
neutrinos can escape from the Sun or the centre of the Earth, whereas in the case
of astrophysical objects we can observe several kinds of annihilation products –
though there too the propagation of charged particles is a bit complicated. From
the direction where we measure a positron or an antiproton we cannot deduce where
the source is, since the paths of both particles are twisted by magnetic fields on the
way. Only the detected number of charged particles carries useful information, not
their direction (and even to calculate the numbers we have to make some assumptions
about propagation). In contrast, photons at the relevant energies travel basically
unimpeded through the galaxy, so we can immediately determine where they come
from. (Scattering of light due to dust is negligible at high energies.)
Let us consider the annihilation signal from the centre of our galaxy. The annihilation
rate per particle is $\Gamma = \langle \sigma v \rangle n_{\mathrm{dm}}$, so the number of annihilation events per
unit volume per unit time is $\langle \sigma v \rangle n_{\mathrm{dm}}^2 = \langle \sigma v \rangle m^{-2} \rho_{\mathrm{dm}}^2$. Integrating along the line of
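\[ \Phi_X(E) = \frac{N_X(E) \langle \sigma v \rangle}{4\pi m^2} \frac{1}{\Delta\Omega} \int_{\Delta\Omega} d\Omega \int dl \, \rho_{\mathrm{dm}}^2 \approx 2 \times 10^{-8} \, \frac{N_X \langle \sigma v \rangle}{10^{-30}\,\mathrm{cm}^3\,\mathrm{s}^{-1}} \left( \frac{\mathrm{GeV}}{m} \right)^2 \bar{J}(\Delta\Omega) \, \mathrm{m}^{-2}\,\mathrm{s}^{-1}\,\mathrm{sr}^{-1} , \tag{7.34} \]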
The virtue of indirect detection is that the relevant cross section is the same
one that determines the relic density (for thermal relics), unlike in direct detection,
though the issue is complicated by model-dependent decay channels. However, the
dark matter density profile at the galactic centre (and on small scales in general)
is poorly known. In fact the mismatch between observations and simulations as
regards the centers of galaxies is considered to be one of the most pressing issues of
the CDM model. Simulations predict a sharper increase of density near the center
than inferred from observations of galaxies. For different choices of the density
profile, one gets a range $\bar{J} \approx 30 \ldots 10^6$ for $\Delta\Omega = 10^{-3}$ sr and $\bar{J} \approx 30 \ldots 10^8$ for
$\Delta\Omega = 10^{-5}$ sr [18]. The relevant angular size depends on the angular resolution of
the instrument. In any case, for weak scale annihilation cross-sections, the expected
flux is quite small, though not completely out of reach.
Observational limits from the flux of photons measured by the Fermi-LAT satellite
on dark matter in a constrained version of the MSSM are shown in
figure 6.
At present, there are some observational signals which have been interpreted
as evidence for dark matter. In particular, an excess of positrons was seen by
the PAMELA satellite experiment in 2008, and confirmed by the Fermi satellite
experiment in 2011. However, the rate is too high by a factor of about $10^3$ if the
annihilation cross section is taken to be fixed by the observed relic density, so the
observations are inconsistent with the simple WIMP picture. Different options, from
increased clumping of the dark matter to a mechanism which boosts the annihilation
cross section at small velocities (i.e. today but not at decoupling) were suggested,
but the interpretation remains uncertain. It is possible that the positron excess is
due to astrophysical sources such as pulsars or supernova remnants, or even that
all of the excess positrons are generated by the scattering of other cosmic rays (so
that there is no primary source of excess cosmic rays), as the details of cosmic ray
propagation in the Galaxy are not clear.
One interesting indirect detection channel is neutrinos. High-energy neutrinos
from outside the Solar system were detected in 2010–2012 by the IceCube
detector at the South Pole [20]. They could be from astrophysical sources, but an
interpretation in terms of dark matter decay has also been proposed.
In April 2012, a monochromatic (i.e. occurring at one energy only) gamma ray
signal of 130 GeV was reported from the galactic center using the Fermi satellite
data [21]. If confirmed, this would be a strong indication of dark matter, since
astrophysical sources typically generate a continuum of energies, and dark matter
annihilation is the only known source for emission at a single energy. However, in
dark matter models, we also expect a continuum to accompany the monochromatic
signal, since the dark matter also annihilates into particles other than photons, and some
of these then decay to photons, producing a wide range of energies in the photon
final states. Typically, the continuum signal is expected to be about 1000 times
stronger than the monochromatic line (remember that dark matter couples weakly
to photons!), and no such excess in the continuum emission is seen. So if the signal is
due to dark matter, it has properties rather different from what is expected. At the
moment, the matter remains open. (For more details on the various experiments,
see [17].)
In summary, as with direct detection, the reach of indirect detection experiments
is increasing, and many avenues are being investigated. Whether dark matter
will be detected (via its non-gravitational interactions) depends on which model of
dark matter is correct, and it is worth bearing in mind that there are some models
for which it is impossible to detect dark matter via non-gravitational interactions.
References
[1] J. Einasto, arXiv:0901.0632 [astro-ph.CO].
[2] Y. Sofue and V. Rubin, Ann. Rev. Astron. Astrophys. 39, 137 (2001),
arXiv:astro-ph/0010594.
[7] K. Nakamura et al. (Particle Data Group), J. Phys. G 37, 075021 (2010),
http://pdg.lbl.gov/2010/tables/contents_tables.html.
[10] T.A. Ryttov and F. Sannino, Phys. Rev. D 78, 115010 (2008), arXiv:0809.0713v1
[hep-ph].
[12] D. Clowe et al., Astrophys. J. Lett. 648, L109 (2006), arXiv:astro-ph/0608407.
[13] K. Freese, D. Spolyar, P. Bodenheimer and P. Gondolo, New J. Phys. 11, 105014
(2009), arXiv:0903.0101 [astro-ph.CO].
[14] A. M. Green, Mod. Phys. Lett. A 27, 1230004 (2012), arXiv:1112.0524 [astro-ph.CO].
[16] E. Aprile et al. [XENON100 Collaboration], Phys. Rev. Lett. 109, 181301
(2012), arXiv:1207.5988 [astro-ph.CO].
8 INFLATION: BACKGROUND 122
cosmic strings and domain walls, which correspond to (approximately) zero-, one-
and two-dimensional topological defects. Just like (given a specific model) we can
calculate the relic density of dark matter particles, we can calculate the density of
these relics. In some models the energy density in monopoles today would be much
higher than the observed energy density. This is related to the fact that monopoles
are typically very massive, with masses of the order of the grand unified scale. The
presence of cosmic strings and domain walls would also be problematic, as they
are also typically very heavy, and their energy density relative to ordinary matter
increases with time (i.e. it goes down more slowly). In supersymmetric models one
particular problem is the overproduction of gravitinos, the supersymmetric partners
of the graviton. If gravitinos are not stable, their lifetime is very long, because
they interact only gravitationally, so they typically decay after BBN, and ruin its
observational success. It is however also possible that the gravitinos form the dark
matter. The constraint on the temperature of the universe from gravitinos is of
the order $T \lesssim 10^7$ GeV. However, it may be that the grand unified theories or
supersymmetric models do not describe reality, in which case this is not a problem.
Observationally, from BBN, we know only that the universe has been at least as hot
as 1 MeV.
Often the term inflation is used to refer only to a period of accelerated expansion
in the early universe, and not to the recent phase of accelerated expansion.
Consider how the flatness and horizon problems can be solved with inflation.
The origin of the flatness problem is that $|\Omega - 1| = |K|/(aH)^2 = |K| \dot{a}^{-2}$ grows
with time because ȧ falls, i.e. the universe decelerates. If the expansion instead
accelerates, Ω is driven towards unity starting from any value. (This is the case for
an expanding universe. If the universe contracts, the behaviour is reversed.)
Consider now the horizon problem. The problem is that in the standard Big
Bang model the horizon at the time of photon decoupling is small compared to the
part of the universe we can see today. In the standard Big Bang picture the universe
is first radiation-dominated and then becomes matter-dominated somewhat before
photon decoupling. (Recall that $t_{\mathrm{eq}} \approx 50\,000$ years and $t_{\mathrm{dec}} \approx 380\,000$ years.) In
and $d_{\mathrm{hor}}(t_0) = d^c_{\mathrm{hor}}(t_0) \sim H_0^{-1}$. The horizon problem arises because the first horizon
is much smaller than the second,
\[ \frac{d^c_{\mathrm{hor}}(t_{\mathrm{dec}})}{d^c_{\mathrm{hor}}(t_0)} \sim \frac{a_0 H_0}{a_{\mathrm{dec}} H_{\mathrm{dec}}} \sim (1 + z_{\mathrm{dec}}) \frac{t_{\mathrm{dec}}}{t_0} \approx 0.03 \ll 1 , \tag{8.6} \]
where we have for clarity inserted $a_0$, even though it is equal to unity, and used
$t_{\mathrm{dec}} = 380\,000$ yr, $t_0 = 14 \times 10^9$ yr and $z_{\mathrm{dec}} = 1090$. In other words, the presently
observed universe was about 30 times larger than the particle horizon at decoupling,
so it contained about $10^5$ regions that had never been in causal contact with each
other. Thus the problem is again that $aH$ decreases with time,
\[ \frac{d}{dt}(aH) = \ddot{a} < 0 , \tag{8.7} \]
so a period with ä > 0 might solve the problem.
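The estimate (8.6) is easy to reproduce. The sketch below simply plugs in the values quoted in the text; the function names are illustrative, and cubing the linear ratio to count causally disconnected regions is a crude assumption that ignores geometric factors of order unity.

```python
# Sketch of the horizon-problem estimate (8.6), using the values quoted in the text.
T_DEC_YR = 380_000.0   # time of photon decoupling in years
T_0_YR = 14e9          # present age of the universe in years
Z_DEC = 1090.0         # redshift of decoupling


def horizon_ratio(t_dec=T_DEC_YR, t_0=T_0_YR, z_dec=Z_DEC):
    """Ratio of comoving horizon at decoupling to the present one, eq. (8.6)."""
    return (1.0 + z_dec) * t_dec / t_0


def causal_patches(ratio):
    """Rough number of horizon-sized regions at decoupling inside the observed volume.

    Assumption: simple volume scaling, (linear ratio)^-3, no factors of order unity.
    """
    return (1.0 / ratio) ** 3


if __name__ == "__main__":
    r = horizon_ratio()
    print(f"horizon ratio ~ {r:.3f}")                   # about 0.03
    print(f"number of patches ~ {causal_patches(r):.1e}")
```

With these inputs the linear ratio is about 0.03, and the volume count comes out at a few times $10^4$, which agrees with the number quoted above at the order-of-magnitude level.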
Recall that the particle horizon refers to the maximum distance that light could
in principle have travelled from the beginning of the universe until time $t$. If we add
a new period in the early universe to the matter-dominated era and the radiation-dominated
era, such as accelerating expansion, the calculation of $d^c_{\mathrm{hor}}$ will depend
on it. We always have $d^c_{\mathrm{hor}}(t_0) > d^c_{\mathrm{hor}}(t_{\mathrm{dec}})$ since $t_0 > t_{\mathrm{dec}}$, and the interval $(0, t_{\mathrm{dec}})$
is included in the interval $(0, t_0)$. However, in the horizon problem, the relevant
present-day quantity is not actually the distance from which we could in principle
have received signals, but the distance from which we actually measure signals.
Because the universe is opaque before decoupling, the size of the present observable
universe is given by the distance photons have travelled in the interval $(t_{\mathrm{dec}}, t_0)$, and
this is not affected by what happens before $t_{\mathrm{dec}}$. Thus the relevant present-day scale
is always $\sim H_0^{-1}$.
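Note that the comoving Hubble parameter is equal to the conformal Hubble
parameter,
\[ aH = \frac{1}{a} \frac{da}{d\eta} = \dot{a} , \tag{8.8} \]
where $\eta$ is conformal time. The Hubble length is
\[ l_H \equiv H^{-1} , \quad \text{where } H \equiv \frac{\dot{a}}{a} , \tag{8.9} \]
and the comoving Hubble length is
\[ l^c_H \equiv \frac{1}{a} l_H = \frac{1}{aH} = \frac{1}{\dot{a}} . \tag{8.10} \]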
If $aH$ decreases, then $(aH)^{-1}$ increases, and vice versa. So we can say that inflation
is any epoch when the comoving Hubble length shrinks:
\[ \text{inflation} \quad \Leftrightarrow \quad \frac{d}{dt} \left( \frac{1}{aH} \right) < 0 . \tag{8.11} \]
It has unfortunately become customary in cosmology to use the word “horizon”
also for the Hubble distance, particularly with regard to inflation. We adopt this
lamentable practice when referring to subhorizon or superhorizon modes (to be de-
fined a bit later), but will otherwise try to be careful not to confuse the two concepts.
Let us consider an example of accelerated expansion that we are already familiar
with from the discussion on dark energy, namely exponential expansion, corresponding
to the vacuum energy equation of state $w = -1$, $a(t) \propto e^{Ht}$, with constant $H$.
We will shortly see that this is a first approximation for the expansion law during
inflation. The horizon distance is
\[ d_{\mathrm{hor}}(t) = a(t) \int_0^t \frac{dt'}{a(t')} = H^{-1} e^{Ht} (1 - e^{-Ht}) \simeq H^{-1} e^{Ht} , \tag{8.12} \]
where the last limit is for $t \gg H^{-1}$. So in contrast to the radiation- and matter-dominated
eras, the physical particle horizon grows exponentially, and the comoving
particle horizon stays almost constant, $d^c_{\mathrm{hor}}(t) \simeq H^{-1}$. The present observable
universe has evolved from a small patch of a much larger causally connected region.
See figure 2.
However, even though the particle horizon grows exponentially, the distance over
which it is possible to send signals does not grow. If we consider a light ray, we have
$0 = ds^2 = -dt^2 + a(t)^2 dr^2$, so the comoving coordinate separation (which is the
comoving distance, since the universe is spatially flat) between emission at $t_1$ and
reception at $t_2$ is (taking $a = e^{Ht}$)
\[ \Delta r = \int_{t_1}^{t_2} \frac{dt'}{a(t')} = H^{-1} (e^{-Ht_1} - e^{-Ht_2}) < H^{-1} . \tag{8.13} \]
If the coordinate separation between two points is more than the Hubble length, it
is not possible to send signals between them. In this sense, the Hubble length gives
the comoving size of the region during inflation over which it is possible to retain
causal connection. If the universe before inflation is matter-dominated, for example,
observers with separation $2(aH)^{-1}$ have been able to send signals to each other,
so causal connection is lost. Also, regardless of what happened before inflation,
during inflation a signal sent at $t_1$ cannot travel a longer coordinate distance than
$H^{-1} e^{-Ht_1}$, and this distance gets smaller as $t_1$ grows, so causal connection is lost during
inflation. Note that the particle horizon, which expresses the maximum distance at
which parts of the universe can have been in causal contact, always grows as a
function of time; it never shrinks. What changes during inflation is just that regions
that once were in causal contact cannot send signals to each other any more.
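The two results (8.12) and (8.13) can be checked numerically for $a = e^{Ht}$. This is a minimal sketch in units where $H = 1$ by default; the function names and the midpoint-rule integrator are illustrative choices, not part of the notes.

```python
import math


def particle_horizon(t, H=1.0):
    """Physical particle horizon for a = exp(H t), eq. (8.12)."""
    return math.exp(H * t) * (1.0 - math.exp(-H * t)) / H


def comoving_signal_distance(t1, t2, H=1.0):
    """Comoving distance covered by light between t1 and t2, eq. (8.13)."""
    return (math.exp(-H * t1) - math.exp(-H * t2)) / H


def numeric_comoving_distance(t1, t2, H=1.0, steps=10_000):
    """Midpoint-rule evaluation of the integral of dt / a(t) as a cross-check."""
    dt = (t2 - t1) / steps
    return sum(math.exp(-H * (t1 + (i + 0.5) * dt)) for i in range(steps)) * dt


if __name__ == "__main__":
    # The particle horizon keeps growing ...
    print(particle_horizon(5.0), particle_horizon(10.0))
    # ... while the reach of a signal emitted at t1 is bounded by exp(-H t1) / H
    # and shrinks as t1 grows.
    for t1 in (0.0, 1.0, 2.0, 3.0):
        print(t1, comoving_signal_distance(t1, 50.0))
```

The printed reach falls by a factor of $e$ for every unit of $H t_1$, which is the loss of causal connection described above.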
The Friedmann equations are
\[ 3 \frac{\dot{a}^2}{a^2} = 8\pi G_N \rho - 3 \frac{K}{a^2} \tag{8.14} \]
\[ 3 \frac{\ddot{a}}{a} = -4\pi G_N (\rho + 3p) . \tag{8.15} \]
Figure 2: Evolution of the comoving Hubble radius (length, distance) during and after
inflation (schematic).
Thus, in general relativity and assuming the FRW metric, accelerating expansion
requires negative pressure: from (8.15), $\ddot{a} > 0$ requires $\rho + 3p < 0$, i.e. $p < -\rho/3$ for $\rho > 0$.
Note that the energy density of matter for which $p/\rho < -\frac{1}{3}$ falls in an
expanding universe more slowly than $a^{-2}$, i.e. it grows relative to the spatial curvature.
(If $p/\rho < -1$, the energy density actually rises as the universe expands.) The flatness
problem of the Big Bang model is simply the feature that for matter composed of a
gas of particles we have $p \geq 0$, so the energy density falls at least as fast as $a^{-3}$, and
the curvature term will at some point overtake the energy density. In inflationary
models, the energy density typically remains nearly constant during a period in
which the scale factor grows by a huge factor, typically by a factor $e^{60}$ or more.
Thus inflation predicts that $\Omega_0 = 1$ to extremely high accuracy$^2$. See figure 3.
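To get a feeling for the numbers, one can combine $|\Omega - 1| = |K|/(aH)^2$ with $\rho \propto a^{-3(1+w)}$. The sketch below assumes a constant equation of state $w$ and a universe close enough to flat that $H^2 \propto \rho$; under these assumptions $|\Omega - 1| \propto a^{1+3w}$. The function name is illustrative.

```python
import math


def omega_deviation_factor(w, efolds):
    """Factor by which |Omega - 1| changes over `efolds` e-folds of expansion.

    Assumptions: constant equation of state w and H^2 proportional to rho
    (curvature subdominant), so that |Omega - 1| scales as a^(1 + 3w).
    """
    return math.exp((1.0 + 3.0 * w) * efolds)


if __name__ == "__main__":
    # 60 e-folds with w = -1 suppress |Omega - 1| by exp(-120).
    print(omega_deviation_factor(-1.0, 60.0))
    # Radiation (w = 1/3) and matter (w = 0) make the deviation grow instead.
    print(omega_deviation_factor(1.0 / 3.0, 1.0), omega_deviation_factor(0.0, 1.0))
```

The borderline case $w = -1/3$ leaves $|\Omega - 1|$ unchanged, matching the statement that only matter with $p/\rho < -\frac{1}{3}$ grows relative to the curvature.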
As for the relic problem, if unwanted relics are produced before inflation, they
are diluted to practically zero density by the expansion, just like spatial curvature.
However, we have to be careful that they are not produced after inflation, i.e. the
reheating temperature (see below) has to be low enough. This is one constraint on
models of inflation. At the end of inflation, matter is produced in reheating$^3$, which
$^2$ If it were discovered by observations that $\Omega_0 \neq 1$, this would be a blow to the credibility of
inflation. However, there is a version of inflation, called open inflation, for which it is natural that
$\Omega_0 < 1$. The existence of such models of inflation has led critics of inflation to complain that
inflation is “unfalsifiable” in the sense that no matter what the observation, a model of inflation
can be found that agrees with it. Nevertheless, most models of inflation give similar “generic”
predictions, including $\Omega_0 = 1$ to great accuracy, and thus far the observations have been in good
agreement with them.
$^3$ “Reheating” may turn out to be as much a misnomer as “recombination”, as it is not clear
whether matter was ever in a thermal state before inflation.
Figure 3: Solving the flatness problem. This figure is for a universe with no dark energy,
where the expansion keeps decelerating after inflation ended in the early universe. Present
observational evidence indicates that actually the expansion began accelerating again a few
billion years ago. Thus the universe is, technically speaking, inflating again, and Ω is again
being driven towards 1. However, this current epoch of inflation is not enough to solve the
flatness problem, or the other problems, since the universe has only expanded by about a
factor of 2 during it.
produces the gas of particles that is the initial condition for the hot Big Bang model.
Inflation is better called a scenario rather than a theory. It is an idea of a certain
kind of behaviour of the universe, which is realised in dozens of different models.
The models are related to extensions of the Standard Model of particle physics or
extensions of general relativity, and some of them are just “toy models”, which
have the right features and are presumably at most an approximate description of
some more complicated physics. One noteworthy inflationary model is based on the
Standard Model Higgs boson coupled to gravity in a non-standard way.
The important point is that inflation makes many “generic” predictions, i.e. pre-
dictions that are independent of the particular model of inflation, for most models.
(Though exceptional models can be found that would violate one or more of these
general features.) There are also numerical predictions of cosmological observables
that differ from one model of inflation to another, allowing observations to rule out
models (and some have been already ruled out). Present observational data agrees
with the generic predictions (which were made before the advent of the observations
of the CMB anisotropies, which are the most direct way of testing the models),
while alternatives to inflation have not managed to explain the observations in an
equally simple and successful way. Most cosmologists thus consider it likely that
inflation took place in the primordial universe. To quote the cosmologist Douglas
Scott, “something like inflation is something like proven”.
Exercise: Assume that at the beginning of inflation we have $|\Omega_K| = 0.1$. Calculate,
as a function of the reheating temperature $T_{\mathrm{reh}}$, how many e-folds of inflation
are required to reduce present-day spatial curvature to $|\Omega_{K0}| < 10^{-2}$. (Assume
$h = 0.7$ and that neutrinos are massless.) Approximate that the expansion rate at
the beginning of inflation is completely dominated by the inflaton, that the inflaton
field value does not change during inflation and that reheating happens instantaneously.
In which directions do the above approximations change the result? What
is the number of e-folds for $T_{\mathrm{reh}} = 10^7$ GeV?
is important if the Higgs field is the inflaton.) We will not discuss non-minimal
coupling.
If the field is free, we have
\[ V(\varphi) = \frac{1}{2} m^2 \varphi^2 , \tag{8.19} \]
and the mass of the particle corresponding to the field $\varphi$ is $m$. If the potential
has higher order terms, these describe self-interactions of the field. Even when the
potential is more complicated than in eq. (8.19), we define the quantity $m^2(\varphi) \equiv
V''(\varphi)$. For $m^2 > 0$, this gives the mass of the particles when the field has the value
$\varphi$. In the case $m^2 < 0$, the field configuration is unstable, and small perturbations
no longer describe particles with mass $m$. We also use the notation
\[ V'(\varphi) \equiv \frac{dV}{d\varphi} \quad \text{and} \quad V''(\varphi) \equiv \frac{d^2 V}{d\varphi^2} . \tag{8.20} \]
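Minimisation of the action leads to the Euler–Lagrange equation
\[ \frac{\partial(\sqrt{-g} \mathcal{L})}{\partial \varphi} - \partial_\mu \frac{\partial(\sqrt{-g} \mathcal{L})}{\partial[\partial_\mu \varphi]} = 0 . \tag{8.21} \]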
\[ \ddot{\varphi} - \nabla^2 \varphi + V' = 0 . \tag{8.23} \]
For a free field, we have $V'(\varphi) = m^2 \varphi$, and the equation of motion reduces to
the Klein-Gordon equation. For the spatially flat FRW metric we have (in Cartesian
coordinates) $g^{\mu\nu} = \mathrm{diag}(-1, a^{-2}, a^{-2}, a^{-2})$, so we get
\[ \ddot{\varphi} + 3H \dot{\varphi} - a^{-2} \nabla^2 \varphi + V' = 0 , \tag{8.24} \]
where $\nabla^2 \equiv \delta^{ij} \partial_i \partial_j$. During inflation, the field (like the space) is almost
homogeneous, so we can take $\partial_i \varphi = 0$ for the background evolution (we will consider
perturbations in chapters 9 and 10). In fact, inflation makes the inflaton field more
homogeneous, as the coefficient $a^{-2}$ falls. A sufficient level of initial homogeneity of
the field is required to get inflation started. We start our discussion when a sufficient
amount of inflation has already taken place to make the gradients negligible, so that
the field can be considered homogeneous.
The Lagrangian density also gives us the energy-momentum tensor
\[ T_{\mu\nu} = -\frac{\partial \mathcal{L}}{\partial(\partial^\mu \varphi)} \partial_\nu \varphi + g_{\mu\nu} \mathcal{L} , \tag{8.25} \]
For the FRW metric, the energy density and pressure measured by an observer
comoving with the FRW metric are$^5$
\[ \rho = -T^0{}_0 = \frac{1}{2} \dot{\varphi}^2 + V \tag{8.27} \]
\[ p = T^i{}_i = \frac{1}{2} \dot{\varphi}^2 - V . \tag{8.28} \]
The field has negative pressure when the potential dominates over the kinetic term,
i.e. when the field is moving slowly. The equation of state parameter $w \equiv p/\rho$ is
\[ w = \frac{\dot{\varphi}^2 - 2V(\varphi)}{\dot{\varphi}^2 + 2V(\varphi)} = \frac{1 - 2V/\dot{\varphi}^2}{1 + 2V/\dot{\varphi}^2} , \tag{8.29} \]
so
\[ -1 \leq w \leq 1 . \tag{8.30} \]
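If the kinetic term $\frac{1}{2} \dot{\varphi}^2$ dominates, $w \approx 1$; if the potential term $V(\varphi)$ dominates,
$w \approx -1$. Different inflaton models have different potentials $V(\varphi)$. From (8.27) and (8.28), we
can form the useful combinations
\[ \rho + p = \dot{\varphi}^2 , \qquad \rho + 3p = 2 \left( \dot{\varphi}^2 - V \right) . \tag{8.31} \]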
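The relations (8.27) to (8.31) translate directly into a few lines of code. This is a minimal sketch with illustrative function names; `phidot` stands for $\dot{\varphi}$ and `V` for the value of the potential, both in the same (arbitrary) units.

```python
def energy_density(phidot, V):
    """rho = phidot^2 / 2 + V, eq. (8.27)."""
    return 0.5 * phidot**2 + V


def pressure(phidot, V):
    """p = phidot^2 / 2 - V, eq. (8.28)."""
    return 0.5 * phidot**2 - V


def equation_of_state(phidot, V):
    """w = p / rho, eq. (8.29)."""
    return pressure(phidot, V) / energy_density(phidot, V)


def is_accelerating(phidot, V):
    """True when rho + 3p < 0, i.e. when phidot^2 < V by eq. (8.31)."""
    return energy_density(phidot, V) + 3.0 * pressure(phidot, V) < 0.0


if __name__ == "__main__":
    print(equation_of_state(0.0, 1.0))   # potential domination: w = -1
    print(equation_of_state(1.0, 0.0))   # kinetic domination: w = +1
    print(is_accelerating(0.5, 1.0), is_accelerating(1.5, 1.0))
```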
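We have the equation of motion of the field from (8.24). Alternatively, we could
just insert the energy density and pressure from (8.27) and (8.28) into the continuity equation
\[ \dot{\rho} = -3H(\rho + p) . \tag{8.32} \]
Dividing the result by $\dot{\varphi}$ gives
\[ \ddot{\varphi} + 3H \dot{\varphi} = -V' . \tag{8.33} \]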
This is the field equation for a homogeneous field in a spatially flat FRW universe.
The effect of expansion is to add the term 3H ϕ̇, which acts like friction and slows
down the evolution of ϕ.
The condition for inflation, $\rho + 3p < 0$, is satisfied if
\[ \dot{\varphi}^2 < V(\varphi) . \tag{8.34} \]
Let us assume that ϕ is initially far from the minimum of V (ϕ). The potential
then pulls ϕ towards the minimum (see figure 4). If the potential has a suitable
(sufficiently flat) shape, the friction term soon makes ϕ̇ small enough to satisfy
(8.34), even if it was not satisfied initially.
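We also need the Friedmann equation,
\[ H^2 = \frac{8\pi G}{3} \rho = \frac{1}{3 M_{\mathrm{Pl}}^2} \rho . \tag{8.35} \]
Inserting the energy density from (8.27), we have
\[ H^2 = \frac{1}{3 M_{\mathrm{Pl}}^2} \left( \frac{1}{2} \dot{\varphi}^2 + V \right) . \tag{8.36} \]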
$^5$ Those used to the Einstein summation convention should note that there is no summation over
$i$ in (8.28).
We have ignored other contributions to the energy density and pressure besides
the inflaton. During inflation, the inflaton moves slowly, so the inflaton energy
density, which is dominated by V (ϕ), also changes slowly. If there are matter and
radiation components in the energy density, they decrease fast, $\rho \propto a^{-3}$ or $\propto a^{-4}$,
and soon become negligible, like the spatial curvature. The presence of extra matter
can put some constraints on the initial conditions for inflation to get started and
the inflaton to become dominant. But once inflation begins, we can soon forget
components other than the inflaton.
\[ \dot{\varphi}^2 \ll V \tag{8.37} \]
\[ |\ddot{\varphi}| \ll 3H |\dot{\varphi}| \tag{8.38} \]
These are the slow-roll conditions. If the slow-roll conditions are valid, we may approximate
(the slow-roll approximation) (8.33) and (8.36) by the slow-roll equations:
\[ H^2 = \frac{V}{3 M_{\mathrm{Pl}}^2} \tag{8.39} \]
\[ 3H \dot{\varphi} = -V' . \tag{8.40} \]
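The shape of the potential $V(\varphi)$ determines the slow-roll parameters, defined as
\[ \varepsilon(\varphi) \equiv \frac{1}{2} M_{\mathrm{Pl}}^2 \left( \frac{V'}{V} \right)^2 \tag{8.41} \]
\[ \eta(\varphi) \equiv M_{\mathrm{Pl}}^2 \frac{V''}{V} . \tag{8.42} \]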
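Exercise: Show that
\[ \text{the slow-roll conditions (8.37) and (8.38)} \quad \Rightarrow \quad \varepsilon \ll 1 \ \text{and} \ |\eta| \ll 1 . \tag{8.43} \]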
Note that the implication goes only in this direction. The conditions ε ≪ 1 and
|η| ≪ 1 are necessary, but not sufficient for the slow-roll approximation (i.e. the
slow-roll conditions) to be valid. The conditions are not sufficient, because they
only constrain the form of the potential, and identify from the potential a slow-roll
section, where the slow-roll approximation may be valid. Since the field equation
(8.33) is second order, it accepts arbitrary ϕ and ϕ̇ as initial conditions. Thus (8.37)
and (8.38) may not hold initially, even if ϕ is in the slow-roll section. However, it
turns out that the slow-roll solution, the solution of the slow-roll equations (8.39)
and (8.40), is an attractor of the full equations, (8.33) and (8.36). This means that
the solution of the full equations rapidly approaches it, if the initial conditions
are in the basin of attraction. To be in the basin of attraction means that ϕ must
be in the slow-roll section; if ϕ̇ is large, ϕ needs to be deep in the slow-roll section.
Once the system has reached the attractor, where (8.39) and (8.40) hold, ϕ̇ is
determined by ϕ. In fact everything is determined by ϕ (assuming a fixed potential
V (ϕ)). The value of ϕ is the single parameter describing the state of the universe,
and ϕ evolves down the potential V (ϕ) as specified by the slow-roll equations.
The ideas of “attractor” and “basin of attraction” can be taken further. If the
universe (or a region of it) finds itself initially in (or enters) the basin of attraction
of slow-roll inflation, meaning that: there is a sufficiently large region, where the
curvature is sufficiently small, the inflaton makes a sufficient contribution to the
total energy density, the inflaton is sufficiently homogeneous, and lies sufficiently
deep in the slow-roll section, then this region begins inflating, it becomes rapidly
very homogeneous and flat, all other contributions to the energy density besides the
inflaton become negligible, and the inflaton begins to follow the slow-roll solution.
Thus inflation erases all memory of initial conditions, and we can predict the
later history of the universe just from the shape of V (ϕ) and the assumption that
ϕ started out far enough in the slow-roll part of it. Note the similarity to thermal
equilibrium. In the stages of the universe we discussed earlier, things were calculable
because in thermal equilibrium, it is sufficient to know the temperature, masses of
particles and conserved quantum numbers in order to have full information about
the system. In the case of inflation, knowing the inflaton field value (and the shape
of the potential) is enough, because of a rather different kind of attractor behaviour.
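The attractor behaviour can be seen by integrating the full equations (8.33) and (8.36) numerically and comparing $\dot{\varphi}$ with the slow-roll value from (8.39) and (8.40). The sketch below is only an illustration: it assumes the quadratic potential $V = \frac{1}{2} m^2 \varphi^2$, units with $M_{\mathrm{Pl}} = m = 1$, a hand-written fourth-order Runge-Kutta step, and arbitrarily chosen initial velocities.

```python
import math

M_PL = 1.0  # reduced Planck mass (unit choice, an assumption of this sketch)
M = 1.0     # inflaton mass; time is measured in units of 1/m


def potential(phi):
    return 0.5 * M**2 * phi**2


def dpotential(phi):
    return M**2 * phi


def hubble(phi, phidot):
    """Full Friedmann equation (8.36)."""
    return math.sqrt((0.5 * phidot**2 + potential(phi)) / (3.0 * M_PL**2))


def rhs(phi, phidot):
    """Right-hand side of the field equation (8.33) as a first-order system."""
    return phidot, -3.0 * hubble(phi, phidot) * phidot - dpotential(phi)


def evolve(phi, phidot, t_end, dt=1e-3):
    """Integrate (8.33) and (8.36) with a classical RK4 scheme."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(phi, phidot)
        k2 = rhs(phi + 0.5 * dt * k1[0], phidot + 0.5 * dt * k1[1])
        k3 = rhs(phi + 0.5 * dt * k2[0], phidot + 0.5 * dt * k2[1])
        k4 = rhs(phi + dt * k3[0], phidot + dt * k3[1])
        phi += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        phidot += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return phi, phidot


def slow_roll_velocity(phi):
    """phidot from the slow-roll equations (8.39) and (8.40)."""
    return -dpotential(phi) / (3.0 * math.sqrt(potential(phi) / (3.0 * M_PL**2)))


if __name__ == "__main__":
    # Start deep in the slow-roll section with very different initial velocities.
    for phidot0 in (-5.0, 0.0, 5.0):
        phi, phidot = evolve(15.0, phidot0, t_end=1.0)
        print(phidot0, phidot, slow_roll_velocity(phi))
```

In this setup the three runs should end with nearly the same $\dot{\varphi}$, close to the slow-roll value, because the friction term $3H\dot{\varphi}$ damps the initial velocity on a time scale of order $(3H)^{-1}$.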
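Example:
\[ V(\varphi) = \frac{1}{2} m^2 \varphi^2 \quad \Rightarrow \quad V'(\varphi) = m^2 \varphi , \quad V''(\varphi) = m^2 \tag{8.44} \]
\[ \varepsilon(\varphi) = \frac{1}{2} M_{\mathrm{Pl}}^2 \left( \frac{2}{\varphi} \right)^2 , \quad \eta(\varphi) = M_{\mathrm{Pl}}^2 \frac{2}{\varphi^2} \quad \Rightarrow \quad \varepsilon = \eta = 2 \frac{M_{\mathrm{Pl}}^2}{\varphi^2} \tag{8.45} \]
and
\[ \varepsilon, \eta \ll 1 \quad \Rightarrow \quad \varphi^2 \gg 2 M_{\mathrm{Pl}}^2 . \tag{8.46} \]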
See figure 5.
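For potentials where the derivatives are tedious to compute by hand, the slow-roll parameters (8.41) and (8.42) can be estimated with finite differences. The following is a sketch under stated assumptions: units with $M_{\mathrm{Pl}} = 1$, a central-difference step `h` chosen by hand, and the quadratic potential of the example (with $m = 1$) used only as a check against (8.45).

```python
M_PL = 1.0  # reduced Planck mass (unit choice, an assumption of this sketch)


def slow_roll_parameters(V, phi, h=1e-3):
    """Return (epsilon, eta) of eqs. (8.41) and (8.42) using central differences."""
    v0 = V(phi)
    dv = (V(phi + h) - V(phi - h)) / (2.0 * h)           # approximates V'
    d2v = (V(phi + h) - 2.0 * v0 + V(phi - h)) / h**2     # approximates V''
    epsilon = 0.5 * M_PL**2 * (dv / v0) ** 2
    eta = M_PL**2 * d2v / v0
    return epsilon, eta


def quadratic(phi, m=1.0):
    """Free-field potential (8.44)."""
    return 0.5 * m**2 * phi**2


if __name__ == "__main__":
    for phi in (2.0, 5.0, 10.0, 15.0):
        eps, eta = slow_roll_parameters(quadratic, phi)
        # For the quadratic potential both should equal 2 M_Pl^2 / phi^2, eq. (8.45).
        print(phi, eps, eta, 2.0 * M_PL**2 / phi**2)
```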
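\[ H^2 = \frac{V}{3 M_{\mathrm{Pl}}^2} \quad \Rightarrow \quad 2H \dot{H} = \frac{V' \dot{\varphi}}{3 M_{\mathrm{Pl}}^2} \quad \Rightarrow \quad H^2 \dot{H} = \frac{V' H \dot{\varphi}}{6 M_{\mathrm{Pl}}^2} \overset{3H\dot{\varphi} = -V'}{=} -\frac{V'^2}{18 M_{\mathrm{Pl}}^2} \]
\[ \Rightarrow \quad -\frac{\dot{H}}{H^2} = \frac{V'^2}{18 M_{\mathrm{Pl}}^2} \frac{9 M_{\mathrm{Pl}}^4}{V^2} = \frac{1}{2} M_{\mathrm{Pl}}^2 \left( \frac{V'}{V} \right)^2 = \varepsilon \ll 1 . \]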
Figure 6: Potential for (a) large field and (b) small field inflation. For a typical small-field
model, the entire range of $\varphi$ shown is $\ll M_{\mathrm{Pl}}$.
couplings usually come into play when inflation ends. Inflation can end because the
slow-roll approximation is no longer valid, as the field has rolled down the potential.
In this case inflation ends when either ε(ϕ) or |η(ϕ)| becomes of order unity. Another
possibility is that inflation ends while the inflaton undergoes slow-roll, because other
fields coupled to the inflaton become dynamically important and terminate inflation.
An example of this is hybrid inflation, where there is an extra scalar field in addition
to the inflaton.
Inflation models can be divided into two classes:
1. Small field inflation: $\Delta\varphi < M_{\mathrm{Pl}}$ in the slow-roll section.
2. Large field inflation: $\Delta\varphi > M_{\mathrm{Pl}}$ in the slow-roll section.
Here $\Delta\varphi$ is the change in $\varphi$ during (the observationally relevant part of) inflation.
Example: Consider a simple potential of the form $V(\varphi) = A \varphi^n$. This is a large
field model, since $V'/V = n/\varphi$ $\Rightarrow$ $\varepsilon \ll 1$ requires $\varphi^2 \gg \frac{1}{2} n^2 M_{\mathrm{Pl}}^2$.
See figure 6 for typical shapes of potentials of large field and small field models.
Figure 7: After inflation, the inflaton field is left oscillating at the bottom.
8.6 Reheating
During inflation, practically all energy in the universe is in the inflaton potential
$V(\varphi)$, since according to the slow-roll approximation $\frac{1}{2} \dot{\varphi}^2 \ll V(\varphi)$. As inflation
ends, this energy is transferred in the reheating process to a thermal bath of particles.
Thus reheating creates, from $V(\varphi)$, all the stuff there is
in the later universe. The conversion of the inflaton energy density into a thermal
gas of particles does not affect the spectrum of density perturbations in single field
models of inflation (at least on superhorizon scales; see section 8.7 below). (It does
change the relation between $\varphi_k$ and $k/H_0$ given in (8.62), i.e. the
amount that the perturbations are stretched between the end of inflation and today.)
The main constraint on reheating is that the reheating temperature must be above 1
MeV, but sufficiently low so as not to produce unwanted relics – where “sufficiently”
depends on the theory under consideration. For typical supersymmetric theories the
constraint on the reheating temperature is $T_R \lesssim 10^7$ GeV.
\[ \ddot{\varphi} + 3H \dot{\varphi} = -m^2 \varphi . \tag{8.53} \]
In the limit $m \gg H$, we can neglect the friction term, and the field undergoes
oscillations with frequency $m$. We can write the energy continuity equation as
\[ \dot{\rho} + 3H\rho = -3Hp = -\frac{3}{2} H \left( \dot{\varphi}^2 - m^2 \varphi^2 \right) . \tag{8.54} \]
The oscillating factor on the right hand side averages to zero over one oscillation
period (in the limit where the period is $\ll H^{-1}$), so on average the energy density
goes like $\rho \propto a^{-3}$, just like in a matter-dominated universe. The fall in the energy
density shows as a decrease of the oscillation amplitude, see figure 8.
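The matter-like behaviour of the oscillating field can be checked by integrating (8.53) together with the Friedmann equation and monitoring $\rho a^3$. The sketch below is an illustration only: it assumes units with $M_{\mathrm{Pl}} = m = 1$, an initial amplitude $\varphi = 0.1 M_{\mathrm{Pl}}$ chosen so that $m \gg H$ from the start, no decay of the inflaton, and a hand-written Runge-Kutta integrator.

```python
import math

M_PL = 1.0  # reduced Planck mass (unit choice, an assumption of this sketch)
M = 1.0     # inflaton mass; time is measured in units of 1/m


def energy_density(phi, phidot):
    return 0.5 * phidot**2 + 0.5 * M**2 * phi**2


def rhs(state):
    """Time derivatives of (phi, phidot, ln a) from (8.53) and the Friedmann equation."""
    phi, phidot, _ln_a = state
    h = math.sqrt(energy_density(phi, phidot) / (3.0 * M_PL**2))
    return (phidot, -3.0 * h * phidot - M**2 * phi, h)


def rk4_step(state, dt):
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def oscillate(phi0=0.1, t_end=200.0, dt=0.01):
    """Evolve from rest at phi0; returns (phi, phidot, ln a) at t_end."""
    state = (phi0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
    return state


if __name__ == "__main__":
    phi, phidot, ln_a = oscillate()
    rho0 = energy_density(0.1, 0.0)
    rho1 = energy_density(phi, phidot)
    print("e-folds of expansion:", ln_a)
    print("rho a^3 relative to its initial value:", rho1 * math.exp(3.0 * ln_a) / rho0)
```

The combination $\rho a^3$ should stay close to its initial value, up to small oscillations of relative size of order $H/m$, while $\rho$ itself falls by a large factor.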
where $\Gamma_\varphi = 1/\tau_\varphi$, the decay width, is the inverse of the inflaton decay time $\tau_\varphi$, and
the term $-\Gamma_\varphi \rho_\varphi$ represents energy transfer to other particles.
If the inflaton can decay into bosons, the decay may be very rapid, involving a
mechanism called parametric resonance. The produced particles are far from thermal
equilibrium (only certain bands in momentum space become populated, and their
occupation numbers are huge). In realistic models of inflation, the inflaton can
decay via a mixture of different decay mechanisms. The process by which the inflaton
transfers its energy into particles is called preheating and the thermalisation of the
gas of particles is called reheating. However, terminology varies, and often the term
reheating is used to refer just to the energy transfer, even if the final state is not in
thermal equilibrium.
8.6.3 Thermalisation
The particles produced from the inflaton will interact, create other particles through
particle reactions, and the resulting soup will eventually reach thermal equilibrium
$^7$ In fact, if the scale of inflation is sufficiently high, it is possible to reheat without any couplings
between the inflaton and the Standard Model degrees of freedom by producing particles gravitationally
out of the vacuum. This is called gravitational reheating, and it is one of the many delicacies
of inflation we will not have time to sample!
with some temperature $T_{\mathrm{reh}}$. This reheating temperature is determined by the energy
density $\rho_{\mathrm{reh}}$ at the end of the reheating epoch:
\[ \rho_{\mathrm{reh}} = \frac{\pi^2}{30} g_*(T_{\mathrm{reh}}) T_{\mathrm{reh}}^4 . \tag{8.56} \]
Necessarily $\rho_{\mathrm{reh}} < \rho_{\mathrm{end}}$ (end = end of inflation). If reheating takes a long time,
we may have $\rho_{\mathrm{reh}} \ll \rho_{\mathrm{end}}$. The evolution of the gas of particles into a thermal
state can be quite involved, and it has been studied in various models. Usually it
is just assumed that it happens eventually, since the particles are able to interact.
However, it is possible that some particles (such as gravitinos) never reach ther-
mal equilibrium, since their interactions are too weak. In any case, as long as the
momenta of the particles are much higher than their masses, the energy density of
the universe behaves like radiation, regardless of the momentum space distribution.
So the background expansion rate is the same. After thermalisation of at least the
baryons, photons and neutrinos is complete, the standard Hot Big Bang era begins.
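Relation (8.56) can be inverted to read off the reheating temperature from the energy density. The sketch below assumes natural units (energy density in GeV$^4$, temperature in GeV) and, as a labelled assumption not taken from the text, the Standard Model value $g_* = 106.75$ at high temperature; the function names are illustrative.

```python
import math

# Assumption: effective number of relativistic degrees of freedom of the
# Standard Model at high temperature. Other models give other values.
G_STAR_SM = 106.75


def radiation_energy_density(temperature, g_star=G_STAR_SM):
    """rho = (pi^2 / 30) g_* T^4, eq. (8.56), in natural units."""
    return math.pi**2 / 30.0 * g_star * temperature**4


def reheating_temperature(rho, g_star=G_STAR_SM):
    """Invert eq. (8.56) for the temperature."""
    return (30.0 * rho / (math.pi**2 * g_star)) ** 0.25


if __name__ == "__main__":
    # Hypothetical example: instantaneous reheating with rho_reh^(1/4) = 1e16 GeV.
    rho_reh = (1e16) ** 4
    print(f"T_reh ~ {reheating_temperature(rho_reh):.2e} GeV")
```

Because of the fourth root, the result depends only weakly on the assumed $g_*$: the temperature is a factor of a few below $\rho_{\mathrm{reh}}^{1/4}$ for any reasonable value.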
\[ N(t) \equiv \ln \frac{a(t_{\mathrm{end}})}{a(t)} . \tag{8.57} \]
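See figure 9. We can calculate $N(t) \equiv N(\varphi(t)) \equiv N(\varphi)$ from the shape of the
potential $V(\varphi)$ and the value of $\varphi$ at time $t$:
\[ N(\varphi) = \ln \frac{a(t_{\mathrm{end}})}{a(t)} = \int_t^{t_{\mathrm{end}}} H(t) \, dt = \int_\varphi^{\varphi_{\mathrm{end}}} \frac{H}{\dot{\varphi}} \, d\varphi \overset{\text{slow roll}}{\approx} \frac{1}{M_{\mathrm{Pl}}^2} \int_{\varphi_{\mathrm{end}}}^{\varphi} \frac{V}{V'} \, d\varphi , \tag{8.58} \]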
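The slow-roll integral in (8.58) is straightforward to evaluate numerically. The sketch below assumes units with $M_{\mathrm{Pl}} = 1$, a simple midpoint rule, and, for the check, the quadratic potential with the end of inflation taken at $\varepsilon = 1$, i.e. $\varphi_{\mathrm{end}}^2 = 2 M_{\mathrm{Pl}}^2$ from (8.45); for that potential the integral gives $N = (\varphi^2 - \varphi_{\mathrm{end}}^2)/(4 M_{\mathrm{Pl}}^2)$.

```python
import math

M_PL = 1.0  # reduced Planck mass (unit choice, an assumption of this sketch)


def efolds(V, dV, phi, phi_end, steps=10_000):
    """Slow-roll number of e-folds, eq. (8.58), by the midpoint rule."""
    dphi = (phi - phi_end) / steps
    total = 0.0
    for i in range(steps):
        x = phi_end + (i + 0.5) * dphi
        total += V(x) / dV(x)
    return total * dphi / M_PL**2


def quadratic(phi, m=1.0):
    return 0.5 * m**2 * phi**2


def dquadratic(phi, m=1.0):
    return m**2 * phi


def phi_for_efolds_quadratic(n):
    """Field value n e-folds before the end of inflation for V = m^2 phi^2 / 2.

    Assumption: inflation ends at epsilon = 1, i.e. phi_end^2 = 2 M_Pl^2.
    """
    return math.sqrt(4.0 * n + 2.0) * M_PL


if __name__ == "__main__":
    phi_end = math.sqrt(2.0) * M_PL
    phi_60 = phi_for_efolds_quadratic(60.0)
    print(phi_60, efolds(quadratic, dquadratic, phi_60, phi_end))
```

For 60 e-folds the field has to start at roughly $15.6\, M_{\mathrm{Pl}}$ in this model, which illustrates why it counts as a large field model.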
of each comoving distance scale, or each comoving wave number k (from Fourier
expansion in comoving coordinates).
\[ k = \frac{2\pi}{\lambda} , \qquad k^{-1} = \frac{\lambda}{2\pi} \]
An important question is whether a distance scale is larger or smaller than the
Hubble length at a given time. A scale is said to be
• superhorizon, when $k < aH$, i.e. $k^{-1} > (aH)^{-1}$
• subhorizon, when $k > aH$, i.e. $k^{-1} < (aH)^{-1}$.
Note that large length scales (large $k^{-1}$) correspond to small k, and vice versa,
although we often talk about “scale k”. This can easily cause confusion, so be
careful with wording! Notice also that we are here using the word “horizon” to
refer to the Hubble length: more correct terminology would be “sub-Hubble” and
“super-Hubble”⁸. Recall that $(aH)^{-1}$ shrinks during inflation, and grows during all
other eras. See figures 10 and 11.
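The behaviour of the comoving Hubble length can be checked with a short sketch. It assumes exactly constant H during inflation (a ∝ e^{Ht}) and pure radiation domination afterwards (a ∝ t^{1/2}); the units, the wavenumber and the function names are hypothetical choices for illustration.

```python
import math

H_INF = 1.0  # assumed constant Hubble rate during inflation (arbitrary units)


def hubble_length_inflation(t):
    """Comoving Hubble length (aH)^-1 for a = exp(H t) with constant H."""
    return 1.0 / (math.exp(H_INF * t) * H_INF)


def hubble_length_radiation(t):
    """Comoving Hubble length for a = sqrt(t), for which H = 1/(2t)."""
    return 1.0 / (math.sqrt(t) * (1.0 / (2.0 * t)))


def classify(k, a, hubble):
    """Compare the comoving wavenumber k with aH."""
    return "superhorizon" if k < a * hubble else "subhorizon"


k = 5.0  # hypothetical comoving wavenumber
for t in (0.0, 1.0, 3.0):
    a = math.exp(H_INF * t)
    print(t, hubble_length_inflation(t), classify(k, a, H_INF))
```

During the inflationary phase the mode with fixed k starts subhorizon and becomes superhorizon once aH has grown past k, while in the radiation-dominated phase the comoving Hubble length grows again as 2√t.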
We shall later find that the amplitude of primordial density perturbations on a
given comoving scale freezes as this scale exits the horizon during inflation. The
largest observable scales are of the size of the horizon today. (Since the universe
has recently begun accelerating again, these scales have just barely entered, and are
now exiting again.)
To identify the distance scales during inflation with the corresponding distance
scales in the present universe, we need a complete history from inflation to the
present. We divide it into the following periods:
1. From the time the scale k of interest exits the horizon during inflation to the
end of inflation (tk to tend ).
2. From the end of inflation to the time when thermal equilibrium at high tem-
perature (Hot Big Bang conditions) is achieved, i.e. reheating. We assume
that the universe behaves as if matter-dominated, $\rho \propto a^{-3}$, during this period,
as discussed in section 8.6.1 (tend to treh ).
$$\Rightarrow\quad k = (aH)_k = a_k H_k .$$
8
As discussed in the first part of the course, there are (at least) three different usages for the
word “horizon”:
1. particle horizon
2. event horizon
3. Hubble length
Figure 10: The evolution of the Hubble length, and two scales, $k_1^{-1}$ and $k_2^{-1}$, seen in
comoving coordinates.
Figure 11: The evolution of the Hubble length, and the scale $k^{-1}$ seen in terms of physical
distance.
To find out how large this scale is today, we relate it to the present “horizon”, i.e.,
the Hubble scale (for clarity, we insert a0 here, even though we have chosen it to be
equal to unity):
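$$\begin{aligned}
\frac{k}{a_0 H_0} &= \frac{a_k H_k}{a_0 H_0} = \frac{a_k}{a_{\rm end}}\,\frac{a_{\rm end}}{a_{\rm reh}}\,\frac{a_{\rm reh}}{a_0}\,\frac{H_k}{H_0} \\
&= e^{-N(k)} \left(\frac{\rho_{\rm end}}{\rho_{\rm reh}}\right)^{-\frac{1}{3}} \left(\frac{\rho_{\rm reh}}{\rho_{r0}}\right)^{-\frac{1}{4}} \left(\frac{V_k}{\rho_{c0}}\right)^{\frac{1}{2}} ,
\end{aligned} \tag{8.59}$$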
where we have inserted the comparison scale $10^{16}$ GeV, taken into account that
ρend ≈ Vend (if inflation ends due to the slow-roll approximation being violated, this
will only be true up to factors of order unity, which we neglect) and rearranged
some of the terms. We don’t know the energy scale of inflation, but there is an
upper limit of approximately $10^{16}$ GeV from the lack of observation of primordial
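9
Accurately this would go as:
$$g_{*s} a^3 T^3 = \text{const.} \quad\Rightarrow\quad \frac{a_{\rm reh}}{a_0} = \left(\frac{g_{*s}(T_0)}{g_{*s}(T_{\rm reh})}\right)^{\frac{1}{3}} \frac{T_0}{T_{\rm reh}} . \tag{8.60}$$
We approximated this with
$$\left(\frac{\rho_{r0}}{\rho_{\rm reh}}\right)^{\frac{1}{4}} = \left(\frac{g_*(T_0)}{g_*(T_{\rm reh})}\right)^{\frac{1}{4}} \frac{T_0}{T_{\rm reh}} .$$
Taking g∗s (Treh ) = g∗ (Treh ) ∼ 100, the ratio of these two becomes
$$\frac{g_{*s}(T_0)^{\frac{1}{3}}}{g_*(T_0)^{\frac{1}{4}}\, g_*(T_{\rm reh})^{\frac{1}{12}}} \approx \frac{3.909^{\frac{1}{3}}}{3.363^{\frac{1}{4}}\, 100^{\frac{1}{12}}} = 0.79 \sim 1 .$$
$$\left(\frac{g_*(T_{\rm reh})}{g_*(T_0)}\right)^{\frac{1}{4}} \sim \left(\frac{100}{3.363}\right)^{\frac{1}{4}} \sim 2.33 .$$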
gravity waves, whose amplitude provides a measure of the inflationary energy scale.
Inserting the values $\rho_{r0} = 4.18 \times 10^{-5} h^{-2} \rho_{c0}$ (assuming massless neutrinos) and
$\rho_{c0}^{1/4} = (\sqrt{3} H_0 M_{\rm Pl})^{1/2} \approx 3.0 \times 10^{-12} h^{1/2}$ GeV, and taking h = 0.7, we obtain for the
number of e-folds
$$N(\varphi_k) = -\ln\frac{k}{a_0 H_0} + 61 - \frac{1}{3}\ln\frac{V_{\rm end}^{1/4}}{\rho_{\rm reh}^{1/4}} + \ln\frac{V_k^{1/4}}{V_{\rm end}^{1/4}} - \ln\frac{10^{16}\,{\rm GeV}}{V_k^{1/4}} , \tag{8.62}$$
where ϕk ≡ ϕ(tk ). The terms have been arranged such that the quantities in the
logarithms are bigger than unity. The second term depends on the efficiency of
reheating: if all of the inflaton potential energy is converted into radiation degrees
of freedom instantaneously, it is zero. The third term is expected to be small, since
the potential varies slowly during slow-roll: the dependence on k in the first term is
expected to dominate. The last term can however be large if the inflation scale is
much lower than $10^{16}$ GeV. For example, inflation at the TeV scale would give −30.
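The constant 61 in (8.62) can be reproduced from the numbers quoted in the text. The sketch below is one way to organise the arithmetic, not a definitive implementation. The function names and keyword defaults are hypothetical. By default it assumes instantaneous reheating (ρreh = Vend) and a potential that does not change between horizon exit and the end of inflation (Vend = Vk), so that the second and third logarithms vanish.

```python
import math

# Inputs quoted in the text around (8.62).
H_LITTLE = 0.7
RHO_R0_OVER_RHO_C0 = 4.18e-5 / H_LITTLE**2          # rho_r0 / rho_c0
RHO_C0_QUARTER_GEV = 3.0e-12 * math.sqrt(H_LITTLE)  # rho_c0^(1/4) in GeV
COMPARISON_SCALE_GEV = 1.0e16


def efold_constant():
    """ln(1e16 GeV / rho_c0^(1/4)) + (1/4) ln(rho_r0 / rho_c0)."""
    return (math.log(COMPARISON_SCALE_GEV / RHO_C0_QUARTER_GEV)
            + 0.25 * math.log(RHO_R0_OVER_RHO_C0))


def efolds_at_exit(k_over_a0h0, v_k_quarter, v_end_quarter=None,
                   rho_reh_quarter=None):
    """Evaluate (8.62); energy scales are fourth roots in GeV.

    Defaults (assumptions of this sketch): V_end = V_k and rho_reh = V_end.
    """
    if v_end_quarter is None:
        v_end_quarter = v_k_quarter
    if rho_reh_quarter is None:
        rho_reh_quarter = v_end_quarter
    return (-math.log(k_over_a0h0)
            + efold_constant()
            - math.log(v_end_quarter / rho_reh_quarter) / 3.0
            + math.log(v_k_quarter / v_end_quarter)
            - math.log(COMPARISON_SCALE_GEV / v_k_quarter))


print(round(efold_constant(), 1))             # the constant term in (8.62)
print(round(efolds_at_exit(1.0, 1.0e16), 1))  # high-scale inflation, k = a0 H0
print(round(efolds_at_exit(1.0, 1.0e3), 1))   # TeV-scale inflation, k = a0 H0
```

With these inputs the constant comes out close to 61, and lowering the inflation scale from 10^16 GeV to 1 TeV reduces N by 13 ln 10 ≈ 30, in line with the statement above.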
For any given present scale, given as a fraction of the present Hubble distance10 ,
(8.62) identifies the value ϕk the inflaton had, when this scale exited the horizon
during inflation. The last three terms give the dependence on the energy scales con-
nected with inflation and reheating. In typical inflation models, they are relatively
small. Usually, the precise value of N is not that important; we are more interested
in the derivative dN/dk, or rather dϕk /dk.
Anyway, we see that typically (for high scale inflation) about 60 e-foldings of
inflation occur after the largest observable scales exit the horizon. There is no
similar constraint on the number of e-folds before these scales exited the horizon,
and the number varies from a few to $10^8$ (or more) between different models.
the limit to how high energy densities we can extend our discussion, which is based on
classical GR, and quantum gravitational effects are expected to be important. One
idea is that the universe at that time, the Planck era, is some kind of a “spacetime
foam”, where the fabric of spacetime itself is subject to large quantum fluctuations.
When the energy density of some region, larger than $H^{-1}$, falls below $M_{\rm Pl}^4$, spacetime
$$\rho_\varphi = \frac{1}{2}\dot\varphi^2 + \frac{1}{2}\frac{1}{a^2}\,\partial_i\varphi\,\partial_i\varphi + V , \tag{8.63}$$
10
For example, k/H0 = 10 means that we are talking about a scale corresponding to a
wavelength λ such that λ/2π is one tenth of the Hubble distance.
we must have
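$$\dot\varphi^2 \lesssim M_{\rm Pl}^4 , \qquad \frac{1}{a^2}\,\partial_i\varphi\,\partial_i\varphi \lesssim M_{\rm Pl}^4 , \qquad V \lesssim M_{\rm Pl}^4 \tag{8.64}$$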
in a region for it to emerge from the spacetime foam.
Inflation may begin at many different parts of the spacetime foam. Our observ-
able universe is just one small part of one such region which has inflated to a huge
size.
It is also possible that during inflation, for some part of the potential, quantum
fluctuations of the inflaton (not the spacetime!) dominate over the classical evolution
and push ϕ higher in some regions. These regions will then expand faster, and
dominate the volume. This gives rise to eternal inflation, where, at any given time,
most of the volume of the universe is inflating. (Whether or not this can happen
depends on the shape of the potential and the field value during inflation.) But our
observable Universe is part of a region where ϕ rolled down and came to a region
of the potential, where the quantum fluctuations of ϕ were small and the classical
behaviour began to dominate and inflation ended.
Thus the ultra-large scale structure of the universe may be very complicated.
However, this is not observable to us, and all the features of the universe we see
can be explained in terms of what happened in our patch during or after inflation.
These ideas of the spacetime foam and eternal inflation are rather speculative, and
there are also different suggestions for the initial stages of the universe.
9 Linear perturbation theory
9.1 Structure formation
Up to this point we have discussed the universe in terms of the homogeneous and
isotropic FRW model. We have however already used the notion of temperature,
which involves fluctuations, so inhomogeneities have already implicitly been present.
We now take the next step by explicitly considering small perturbations around the
homogeneous and isotropic model (which we now refer to as the “unperturbed” or the
“background” universe). In cosmology, perturbation theory has wide applicability.
Often the distribution of non-linear objects can be treated in terms of linear theory,
even though their internal composition cannot, and even very non-linear structures
such as planets, stars and galaxies have evolved from small initial perturbations
under the influence of gravity. This growth is called structure formation, though
sometimes the term is used to refer only to the situation when perturbations become
of order unity and bound structures form. The discussion of perturbations can thus
be divided into two parts.
2) The growth of the small perturbations into the present observable structure
of the universe. This part is less speculative, since we have a well established
theory of gravity, general relativity. However, there is uncertainty in this part
too, since we do not know the precise nature of the dominant components
to the energy density of the universe, the dark matter and the possible dark
energy. The gravitational growth depends on the equations of state and the
streaming lengths (particle mean free path between interactions) of these den-
sity components. Besides gravity, the growth is affected by pressure (due to
non-gravitational interactions).
where x are the comoving spatial coordinates. We assume that perturbations are
small, so that we can drop all terms which contain a product of two or more pertur-
bations. The remaining equations then contain only terms which are either zeroth
order i.e. contain only background quantities, or first order i.e. contain exactly one
power of the perturbed quantities. If we understand the zeroth order parts as the
average, then the average of the perturbations vanishes. By averaging the inhomo-
geneous equations we thus get back the equations of the homogeneous and isotropic
universe. Subtracting these from our equations we arrive at the perturbation equa-
tions where every term is first order in the perturbation quantities, i.e. the equations
are linear1 .
The more rigorous way of doing perturbation theory would be to take the full
set of equations (in this case the various components of the Einstein equation) for a
general inhomogeneous spacetime and linearise them, dropping higher order terms
as discussed above. The more conventional way is to start with the homogeneous
and isotropic model and add perturbations on top of that. We will follow this easier
route2 .
quantities are small. For example, the gravitational field in the solar system is quite
small, and the solar system can be represented as a linear perturbation around
Minkowski space. However, the energy density in the solar system changes by a
factor $10^{20}$ when going from Earth to interplanetary space.
From now on, we assume that we have chosen an appropriate coordinate system
such that the metric perturbations are small, so we can neglect all terms which are
second order or higher in the metric perturbations.
In the linear approximation, the metric perturbations do not influence the
evolution of the background on which they live. The metric perturbations inherit
geometric structure from the background. Just as in classical electrodynamics we can
decompose a general tensor into irreducible representations of the Lorentz group,
here we can decompose the metric perturbations into irreducible parts with regard to
the symmetries of the background, namely translation and rotation in the spatial
dimensions. In less technical language, the perturbations can be split up into things
which have either zero, one or two spatial indices, and which we can treat like scalars,
vectors and tensors living on a Euclidean space. The most general linear perturbation
around the FRW metric (9.2), decomposed into its irreducible parts, reads
where Φ, Ψ, B and E are scalars, $S_i$ and $F_i$ are vectors and $h_{ij}$ is a tensor, and a comma
stands for derivative with respect to $x^i$, i.e. $f_{,i} \equiv \partial f/\partial x^i$. The vector perturbations
are transverse, $\delta^{ij} S_{i,j} = \delta^{ij} F_{i,j} = 0$, and the tensor perturbation is transverse and
traceless, $\delta^{ij} h_{ij} = 0$, $h_{ij,j} = 0$. Physically, tensors correspond to gravity waves,
vectors describe rotation and scalars are directly related to the density perturbation,
as we will see.
Since we drop all non-linear terms, the scalar, vector and tensor perturbations
evolve independently. The vector perturbations decay with the expansion, and are
expected to be negligible in the linear regime, so we put them to zero, Si = Fi = 0.
There can be significant tensor perturbations in the universe, and they may be
observable in the cosmic microwave background anisotropy. This depends on the
details of inflation. No tensor perturbations have been detected thus far, but it is
possible that the Planck satellite, whose data on the polarisation of the CMB is set to
be released in 2014, will be able to detect them.
For the metric perturbation, we have 10 functions δgαβ (t, x). So there would
appear to be ten degrees of freedom. However, four of them are not physical degrees
of freedom, they just correspond to the freedom of choosing the four coordinates.
So there are 6 physical degrees of freedom. There are thus different coordinate
systems (also called different gauges) which describe the same physics. The choice
of coordinates is called a choice of gauge3 . It can be shown that we can choose
3
More precisely, perturbation theory is formulated in terms of a mapping from the real inhomo-
geneous and anisotropic spacetime to a background spacetime, and it is the choice of map which
is called a “gauge choice”. However, the choice of coordinates and choice of mapping are often
conflated in cosmological parlance. More simply, change of gauge is a change of coordinates, except
that it only affects the perturbations, the background is kept fixed. We will not get into such details.
E = B = 0, and that doing so fixes the coordinate system completely. This choice
is known as the longitudinal gauge and also as the conformal Newtonian gauge. We
are then left with the metric
so we have two scalar degrees of freedom and one transverse traceless symmetric
tensor, which has two independent degrees of freedom. The metric perturbations
Φ(t, x) and Ψ(t, x) are called the Bardeen potentials4 . The function Φ is also called
the Newtonian potential, since in the Newtonian limit, it becomes equal to the New-
tonian potential perturbation, and Ψ is called the Newtonian curvature perturbation,
because it determines the curvature of the 3-dimensional t = const. subspaces, which
are flat in the unperturbed universe.
The evolution of the metric perturbations is determined by the Einstein equa-
tion, which couples the metric to the matter content as described by the energy-
momentum tensor.
where Gαβ is a tensor which is built from the metric and its first and second deriva-
tives, and the energy-momentum tensor Tαβ describes the properties of matter. In
chapter 3 we noted that for an ideal fluid the energy-momentum tensor has the
following form
where ρ is the energy density and p is the pressure measured by an observer moving
with four-velocity uα . In the FRW case, the energy-momentum tensor necessarily
has this form for all forms of matter due to the symmetry of the spacetime. In
the perturbed case, the energy-momentum tensor can also have contributions from
energy flux and anisotropic stress in addition to the energy density and pressure.
We will not discuss such imperfect fluids.
As with the metric, we split the contributions to the energy-momentum tensor
into background plus perturbations,
and we throw out all terms which have two or more powers of the perturbations,
whether of the metric or the matter variables. The four-velocity is normalised as
gαβ uα uβ = −1, from which it follows that δu0 = −Φ in linear theory.
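The step from the normalisation condition to δu⁰ = −Φ can be sketched as follows. The sketch assumes the sign convention g₀₀ = −(1 + 2Φ) for the longitudinal-gauge metric (conventions differ between texts), vanishing g₀ᵢ in this gauge, and that the spatial velocity uⁱ is itself first order, so that the term quadratic in uⁱ is dropped.

```latex
% Assumed: g_{00} = -(1 + 2\Phi), g_{0i} = 0, u^0 = 1 + \delta u^0, u^i first order.
\begin{aligned}
-1 &= g_{\alpha\beta} u^\alpha u^\beta
    = g_{00} (u^0)^2 + g_{ij} u^i u^j \\
   &\simeq -(1 + 2\Phi)(1 + \delta u^0)^2
    \simeq -1 - 2\Phi - 2\,\delta u^0
\quad\Rightarrow\quad \delta u^0 = -\Phi .
\end{aligned}
```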
4
Warning: Sign conventions for Φ and Ψ differ, and the definitions of Ψ and Φ are also sometimes
switched with each other.
Equating the Einstein tensor corresponding to the metric (9.4) to the energy-
momentum tensor (9.6) (times 8πGN ) in the linear approximation, we get the fa-
miliar equations for the background:
$$3H^2 = 8\pi G_N \bar\rho \tag{9.10}$$
$$3(\dot H + H^2) = -4\pi G_N (\bar\rho + 3\bar p) , \tag{9.11}$$
where we have used the relation $\ddot a/a = \dot H + H^2$. For the perturbations, we get
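$$4\pi G_N\, \delta\rho = \frac{1}{a^2}\nabla^2\Psi - 3H(\dot\Psi + H\Phi) \tag{9.12}$$
$$4\pi G_N (\bar\rho + \bar p)\, \delta u_i = -(\dot\Psi + H\Phi)_{,i} \tag{9.13}$$
$$4\pi G_N\, \delta p\, \delta_{ij} = \left[ (2\dot H + 3H^2)\Phi + H\dot\Phi + \ddot\Psi + 3H\dot\Psi + \frac{1}{2}\frac{1}{a^2}\nabla^2 D \right] \delta_{ij} - \frac{1}{2}\frac{1}{a^2} D_{,ij} \tag{9.14}$$
$$0 = \ddot h_{ij} + 3H\dot h_{ij} - \frac{1}{a^2}\nabla^2 h_{ij} , \tag{9.15}$$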
where $\nabla^2 \equiv \delta^{ij}\partial_i\partial_j$ and D ≡ Φ − Ψ. These are the central equations for discussing
the evolution of perturbations. In this course, we cannot properly derive them from
the general Einstein equation, we just have to take them as given.
From the non-diagonal components of (9.14) we get that $D_{,ij} = 0$ for all i ≠ j.
The general solution of this equation is D = A(t, x) + B(t, y) + C(t, z). In
cosmology there are no preferred coordinate axes, so the only physically relevant
solution is D = D(t). However, this corresponds to changing the time coordinate,
so we can set D(t) = 0 without loss of generality. We therefore have Φ = Ψ.5 To see
what the single remaining scalar metric degree of freedom corresponds to, we can
manipulate the remaining perturbation equations (9.12)–(9.14). Let us introduce
some notation: the density contrast is defined as
$$\delta \equiv \frac{\delta\rho}{\bar\rho} . \tag{9.16}$$
We also define the background equation of state as w ≡ p̄/ρ̄, and introduce the
variable v 2 ≡ δp/δρ. We will later see that v corresponds (for certain types of
perturbation called adiabatic) to the sound speed of the cosmic fluid (if v 2 < 0,
it instead describes an instability of the fluid). We can now express the pressure
5
In fact, neutrinos develop anisotropic stress after neutrino decoupling, they do not behave like
an ideal fluid. Therefore the two Bardeen potentials actually differ from each other by about 10%
in the time between neutrino decoupling and matter-radiation equality. After the universe becomes
matter-dominated, the neutrinos become unimportant, and Ψ and Φ rapidly approach each other.
The same thing happens to photons after photon decoupling, but the universe is then already
matter-dominated, so the photons do not cause a significant difference between Ψ and Φ.
The order of solving the perturbation equations is that (9.17) gives the evolution
of Φ, and we then find the corresponding density contrast from (9.18) and the
velocity perturbation from (9.19). (We will not be much concerned about the velocity
perturbation.) Note an important difference in (9.18) from the classical Poisson
equation: there are terms of the metric perturbation without any gradients on the
right-hand side. This is a purely general relativistic feature which has very important
consequences, as we will see.
and δk , uik and hkij are defined in the same way. Because the universe is expanding,
the variable k, called the comoving momentum or comoving wavenumber, is not
the physical momentum, which is instead given by k/a. With the scale factor
normalised to unity today, the comoving momentum of a Fourier mode is the physical
momentum it has today.
The flatness of the spatial sections is crucial here. If the spatial sections were
curved, plane waves would not form a complete set of basis functions, and we would
instead have to use more complicated functions. (There would also be an additional
scale present, given by the spatial curvature term K/a2 .)
Different Fourier modes decouple, and the equations for the metric perturbations
reduce to ordinary second order differential equations for each mode. Inserting (9.21)
where the set of Fourier coefficients {gk } is a result of a Gaussian random process.
We have here used a Fourier series instead of the integral Fourier transformation.
Formally, this corresponds to considering some cubic region (“box”) of the universe,
in the comoving coordinates, with some comoving volume $L^3$ and assuming periodic
boundary conditions. The box is just a physically irrelevant mathematical
convenience. In the end we can take the limit $L^3 \to \infty$ and replace the Fourier series
with a Fourier integral. (See section 9.6 for the correspondence.) In cosmology,
we can only predict the probability distribution from which the perturbations are
drawn (since they originate in quantum mechanics), not the particular realisation
that corresponds to our universe. This brings some limitations on the comparison
between theory and observation, which we will come back to when we discuss the
cosmic microwave background.
Cosmological perturbations are real, so we have $g_{-\mathbf{k}} = g_{\mathbf{k}}^*$. We can write $g_{\mathbf{k}}$ in
terms of its real and imaginary part,
$$g_{\mathbf{k}} = \alpha_{\mathbf{k}} + i\beta_{\mathbf{k}} . \tag{9.26}$$
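$$\mathrm{Prob}(g_{\mathbf{k}}) = \frac{1}{2\pi s_{\mathbf{k}}^2} \exp\left(-\frac{1}{2}\frac{|g_{\mathbf{k}}|^2}{s_{\mathbf{k}}^2}\right) = \frac{1}{\sqrt{2\pi}\, s_{\mathbf{k}}} \exp\left(-\frac{1}{2}\frac{\alpha_{\mathbf{k}}^2}{s_{\mathbf{k}}^2}\right) \times \frac{1}{\sqrt{2\pi}\, s_{\mathbf{k}}} \exp\left(-\frac{1}{2}\frac{\beta_{\mathbf{k}}^2}{s_{\mathbf{k}}^2}\right) . \tag{9.28}$$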
$$\langle g_{\mathbf{k}} \rangle = 0 \tag{9.29}$$
and variance
$$\langle |g_{\mathbf{k}}|^2 \rangle = 2 s_{\mathbf{k}}^2 . \tag{9.30}$$
The distribution has one free parameter for each value of k, the real positive
number sk that gives the width (determines the variance) of the distribution.
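The statements (9.29) and (9.30) can be checked by drawing samples. The following sketch draws the real and imaginary parts of a single Fourier coefficient independently from a Gaussian of width s_k, as in (9.28), and estimates the mean and the variance. The sample size, seed, width and function names are arbitrary choices for illustration.

```python
import random


def draw_mode(s_k, rng):
    """Draw one complex coefficient g_k = alpha_k + i beta_k.

    alpha_k and beta_k are independent Gaussians with zero mean and
    standard deviation s_k, as in (9.28).
    """
    return complex(rng.gauss(0.0, s_k), rng.gauss(0.0, s_k))


def sample_moments(s_k, n_samples=200_000, seed=1):
    """Return sample estimates of <g_k> and <|g_k|^2>."""
    rng = random.Random(seed)
    draws = [draw_mode(s_k, rng) for _ in range(n_samples)]
    mean = sum(draws) / n_samples
    variance = sum(abs(g) ** 2 for g in draws) / n_samples
    return mean, variance


s_k = 0.5  # hypothetical width for one mode
mean, variance = sample_moments(s_k)
print(abs(mean), variance, 2.0 * s_k**2)
```

The estimated variance should approach 2s_k² rather than s_k², because the real and imaginary parts each contribute s_k².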
6
We take the definition of Gaussianity to include zero mean.
2. The probabilities of different Fourier modes are independent (i.e., they are not
correlated),
$$\langle g_{\mathbf{k}}\, g_{\mathbf{k}'}^* \rangle = 0 \quad \text{for } \mathbf{k} \neq \mathbf{k}' . \tag{9.31}$$
Because of the complex conjugate, this holds also when k′ = −k (exercise).
In addition, the distribution is assumed to be statistically homogeneous and isotropic
in space. This means that the probability distribution is independent of the direction
of the Fourier mode k:
sk = s(k) . (9.32)
Like Gaussianity, this is a prediction of typical models of inflation, and seems to be
in agreement with the data. (There appear to be some anomalies in the CMB which
may point to a small violation of this symmetry, but the issue remains unsettled.)
We can combine (9.30) and (9.31) into a single equation,
The expectation value of the perturbation is zero, since it represents a deviation from
the background value, and positive and negative deviations are equally probable.
(In other words, the background quantity gives the mean value.) The square of the
perturbation can be written as
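$$g(\mathbf{x})^2 = \sum_{\mathbf{k}} g_{\mathbf{k}}^*\, e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_{\mathbf{k}'} g_{\mathbf{k}'}\, e^{i\mathbf{k}'\cdot\mathbf{x}} \tag{9.35}$$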
since g(x) is real. The typical amplitude of the perturbation is described by the
variance, the expectation value of this square,
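$$\langle g(\mathbf{x})^2 \rangle = \sum_{\mathbf{k}\mathbf{k}'} \langle g_{\mathbf{k}}^*\, g_{\mathbf{k}'} \rangle\, e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{x}} = \sum_{\mathbf{k}} \langle |g_{\mathbf{k}}|^2 \rangle = 2 \sum_{\mathbf{k}} s_{\mathbf{k}}^2 . \tag{9.36}$$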
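$$\begin{aligned}
\left(\frac{2\pi}{L}\right)^3 \sum_{\mathbf{k}} &\to \int d^3k \\
\left(\frac{L}{2\pi}\right)^3 g_{\mathbf{k}} &\to \frac{1}{(2\pi)^{3/2}}\, g(\mathbf{k}) \\
\left(\frac{L}{2\pi}\right)^3 \delta_{\mathbf{k}\mathbf{k}'} &\to \delta^3(\mathbf{k} - \mathbf{k}')
\end{aligned} \tag{9.39}$$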
It is usually easiest to work with the series, and convert to the integral near the end
(to avoid dealing with products of delta functions).
We find for the variance of g(x),7
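$$\begin{aligned}
\langle g(\mathbf{x})^2 \rangle &= \sum_{\mathbf{k}} \langle |g_{\mathbf{k}}|^2 \rangle = \left(\frac{2\pi}{L}\right)^3 \sum_{\mathbf{k}} \frac{1}{4\pi k^3}\, \mathcal{P}_g(k) \\
&\to \frac{1}{4\pi} \int \frac{d^3k}{k^3}\, \mathcal{P}_g(k) = \int_0^\infty \frac{dk}{k}\, \mathcal{P}_g(k) = \int_{-\infty}^{\infty} \mathcal{P}_g(k)\, d\ln k .
\end{aligned} \tag{9.40}$$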
Thus the power spectrum of g gives the contribution of a logarithmic scale inter-
val to the variance of g(x). For Gaussian perturbations, the power spectrum gives a
complete statistical description, and all statistical quantities can be calculated from
it.
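To make the statement about logarithmic intervals concrete, the sketch below integrates a power spectrum over ln k between two cutoffs, following (9.40). The amplitude, tilt, pivot scale and cutoffs are hypothetical numbers chosen only for illustration, and the function names are not from the notes.

```python
import math


def variance_from_spectrum(power, k_min, k_max, steps=20_000):
    """Integrate P_g(k) d ln k between two cutoffs with the trapezoidal rule."""
    ln_min, ln_max = math.log(k_min), math.log(k_max)
    h = (ln_max - ln_min) / steps
    total = 0.5 * (power(k_min) + power(k_max))
    for i in range(1, steps):
        total += power(math.exp(ln_min + i * h))
    return total * h


def scale_invariant(k, amplitude=2.0e-9):
    """A spectrum that does not depend on k (hypothetical amplitude)."""
    return amplitude


def power_law(k, amplitude=2.0e-9, tilt=-0.04, k_pivot=0.05):
    """A slightly tilted spectrum (hypothetical parameters)."""
    return amplitude * (k / k_pivot) ** tilt


flat = variance_from_spectrum(scale_invariant, 1.0e-4, 1.0)
tilted = variance_from_spectrum(power_law, 1.0e-4, 1.0)
print(flat, 2.0e-9 * math.log(1.0e4), tilted)
```

For a scale-invariant spectrum every decade in k contributes the same amount to the variance, so the result is the amplitude times ln(k_max/k_min). This also shows why cutoffs are needed: the integral diverges logarithmically if they are removed.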
In practice the integration is not extended all the way from k = 0 to k = ∞.
Rather, there is usually some largest and smallest relevant scale, which introduce
natural cutoffs at both ends of the integral. The largest relevant scale could be the
size of the observable universe: The perturbation g(x) represents a deviation from
the background quantity, but the best estimate we have for the background may
be the average taken over the observable universe. Then perturbations at larger
scales contribute to our estimate of the background value instead of contributing to
the perturbation away from it. However, the appropriate cutoff scale is a matter of
some debate, and we will find it necessary to discuss perturbations larger than the
size of the Hubble scale. The smallest relevant scale in the present context is the
end of the linear regime. However, by including non-linear corrections, it is possible
to discuss the power spectrum also in the non-linear regime, though on very small
scales the original information has now been erased by non-linear processes. From
a fundamental point of view, there is expected to be no information on very small
scales anyway, because of the process of free-streaming, which we will discuss later.
From a practical point of view, the relevant scale for comparing to observations is
limited by the resolution of the observational survey considered. For example, if
we consider density perturbations in terms of perturbations in the number density
of galaxies, then this is only meaningfully defined on scales larger than the typical
separation between galaxies.
An alternative definition for the power spectrum is
Both this and the previous definition are used; in these notes we distinguish them
by the different typeface. They are related by
$$P_g(k) = \frac{2\pi^2}{k^3}\, \mathcal{P}_g(k) . \tag{9.42}$$
Given the matter content and the initial condition in terms of the power spectrum
(both for the scalar and tensor perturbations), the solution in the linear regime is
completely determined by (9.20), (9.22) and (9.23). In the next chapter, we discuss
how the initial field of Gaussian perturbations is generated by inflation and what
are the expected power spectra for scalars and for tensors.
10 Inflation: perturbations
10.1 The evolution of perturbations
10.1.1 The equations of motion
We now want to find out how perturbations are generated during inflation and
how they evolve. In chapter 9 we gave the equations of motion for the metric
perturbations, and noted that in order to solve them we need to give the background
equation of state and v 2 = δp/δρ. We have discussed the background evolution
during inflation in chapter 8. Rather than dealing with the perturbation equations in
terms of the energy density and pressure, in the inflationary case it is more convenient
to discuss perturbations in the inflaton field. As with the other quantities, we split
the field into the background and the perturbation,
$$\ddot\varphi - \nabla^2\varphi + V'(\varphi) = 0 . \tag{10.4}$$
We now input, instead of the FRW metric, the perturbed metric in the longitu-
dinal gauge from chapter 9. We then get (recall that g µν is the inverse of the metric
tensor)
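$$\delta\ddot\varphi + 3H\delta\dot\varphi + \left[ -\frac{1}{a^2}\nabla^2 + V''(\bar\varphi) \right] \delta\varphi = -2\Phi V'(\bar\varphi) + \left( \dot\Phi + 3\dot\Psi \right) \dot{\bar\varphi} . \tag{10.5}$$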
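Making a Fourier transformation, we obtain
$$\delta\ddot\varphi_{\mathbf{k}} + 3H\delta\dot\varphi_{\mathbf{k}} + \left[ \left(\frac{k}{a}\right)^2 + m^2(\bar\varphi) \right] \delta\varphi_{\mathbf{k}} = -2\Phi_{\mathbf{k}} V'(\bar\varphi) + \left( \dot\Phi_{\mathbf{k}} + 3\dot\Psi_{\mathbf{k}} \right) \dot{\bar\varphi} . \tag{10.6}$$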
This is precisely what we would get if we just inserted (10.1) into the background
equation of motion for the inflaton field and subtracted the background (i.e. ignored
perturbations in the metric).
10.1.2 Solutions
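During inflation, H and $m^2$ change slowly. Thus we now make an approximation
where we treat them as constants. The general solution of (10.7) is then
$$\delta\varphi_{\mathbf{k}}(t) = a^{-3/2} \left[ A_{\mathbf{k}}\, J_{-\nu}\!\left(\frac{k}{aH}\right) + B_{\mathbf{k}}\, J_{\nu}\!\left(\frac{k}{aH}\right) \right] , \tag{10.8}$$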
where $J_\nu$ is the Bessel function of order ν, with
$$\nu = \sqrt{\frac{9}{4} - \frac{m^2}{H^2}} . \tag{10.9}$$
The time dependence of the scale factor for constant H is
$$a(t) \propto e^{Ht} . \tag{10.10}$$
If the slow-roll approximation is valid, the inflaton has negligible mass, $m^2 \ll H^2$,
since then
$$\frac{m^2}{H^2} = 3 M_{\rm Pl}^2 \frac{V''}{V} = 3\eta \ll 1 . \tag{10.11}$$
Thus we can drop $m^2/H^2$ in (10.9), so
$$\nu = \frac{3}{2} . \tag{10.12}$$
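How good the approximation ν = 3/2 is can be seen by evaluating (10.9) with m²/H² = 3η from (10.11). The value of η used below is a hypothetical example, and the function name is introduced only for this sketch.

```python
import math


def bessel_order(eta):
    """nu from (10.9), with m^2/H^2 = 3*eta as in (10.11)."""
    return math.sqrt(9.0 / 4.0 - 3.0 * eta)


eta = 0.01  # hypothetical slow-roll parameter
print(bessel_order(0.0), bessel_order(eta), 1.5 - eta)
```

Expanding the square root to first order gives ν ≈ 3/2 − η, so for small η the correction to the order of the Bessel function is of the same size as η itself.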
Bessel functions of half-integer order are the spherical Bessel functions which can be
expressed in terms of trigonometric functions. The solution (10.8) now reduces to
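$$\delta\varphi_{\mathbf{k}}(t) = A_{\mathbf{k}}\, w_k(t) + B_{\mathbf{k}}\, w_k^*(t) , \tag{10.13}$$
where the constants $A_{\mathbf{k}}$, $B_{\mathbf{k}}$ have been redefined to absorb some numerical constants,
compared to (10.8), and
$$w_k(t) = \left( i + \frac{k}{aH} \right) \exp\left( \frac{ik}{aH} \right) . \tag{10.14}$$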
Well before horizon exit, k ≫ aH, the argument of the exponent is large, and the
solution oscillates rapidly. After horizon exit, k ≪ aH, the solution stops oscillating
and approaches the constant value i(Ak − Bk ). (This fits in with our observation in
chapter 9 that the scalar metric perturbation and the density become constant for
k ≪ aH.)
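The behaviour of the mode function (10.14) can be verified numerically. The sketch below assumes that the equation being solved is the massless mode equation without metric perturbations, δϕ̈ + 3Hδϕ̇ + (k/a)²δϕ = 0, with exactly constant H. The units, the wavenumber and the function names are hypothetical choices for illustration.

```python
import cmath
import math

H = 1.0   # constant Hubble rate (arbitrary units, an assumption of this sketch)
K = 50.0  # hypothetical comoving wavenumber


def scale_factor(t):
    """a(t) = exp(H t), normalised to a = 1 at t = 0."""
    return math.exp(H * t)


def mode(t):
    """w_k(t) = (i + k/(aH)) exp(i k/(aH)), equation (10.14)."""
    x = K / (scale_factor(t) * H)
    return (1j + x) * cmath.exp(1j * x)


def residual(t, dt=1e-4):
    """Finite-difference value of w'' + 3H w' + (k/a)^2 w, which should vanish."""
    w_minus, w_0, w_plus = mode(t - dt), mode(t), mode(t + dt)
    first = (w_plus - w_minus) / (2.0 * dt)
    second = (w_plus - 2.0 * w_0 + w_minus) / dt**2
    return second + 3.0 * H * first + (K / scale_factor(t)) ** 2 * w_0


for t in (2.0, 5.0, 20.0):
    print(t, K / (scale_factor(t) * H), abs(residual(t)), mode(t))
```

The residual stays at the level of the finite-difference error both before and after horizon exit, and at late times the mode function settles to the constant i, which is why (10.13) approaches i(A_k − B_k).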
1
One such gauge is the spatially flat gauge, where the scalar perturbations are chosen such
that constant time slices have Euclidean geometry. There are still perturbations in the spacetime
curvature, which show up in the g0i components of the metric.
When going from classical physics to quantum physics, classical observables are
replaced by operators. We can then calculate expectation values for these observ-
ables using the operators. Here the classical observable
$$\varphi(t, \mathbf{x}) = \sum_{\mathbf{k}} \varphi_{\mathbf{k}}(t)\, e^{i\mathbf{k}\cdot\mathbf{x}} \tag{10.22}$$
where2
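$$\hat\varphi_{\mathbf{k}}(t) = w_k(t)\, \hat a_{\mathbf{k}} + w_k^*(t)\, \hat a_{-\mathbf{k}}^\dagger \tag{10.24}$$
and
$$w_k(t) = L^{-3/2} \frac{1}{\sqrt{2E_k}}\, e^{-iE_k t} \tag{10.25}$$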
is the mode function, a solution of the field equation (10.17). (The normalisation
has been fixed to get the right commutation relations, (10.27).) We are using the
2
We skip the detailed derivation of the field operator, which belongs to a course of quantum
field theory. See e.g. Peskin & Schroeder, section 2.3 (note different normalisations of operators and
states, related to doing Fourier integrals rather than sums, and considerations of Lorentz invariance).
Heisenberg picture, i.e. we have time-dependent operators and the quantum states
are time-independent. Note that since the operator ϕ̂(t, x) is Hermitian (corre-
sponding to a real field), ϕ̂(t, x)† = ϕ̂(t, x), the corresponding Fourier components
satisfy ϕ̂k (t)† = ϕ̂−k (t). So the Fourier component operators are not Hermitian.
In quantum mechanics, we have two conjugate variables, position and momen-
tum. In quantum field theory, we have the field and the corresponding canonical
momentum, which is in this case just given by the time derivative of the field. Com-
bining (10.24) and (10.25), we have
$$\dot{\hat\varphi}_{\mathbf{k}}(t) = -iE_k \left[ w_k(t)\, \hat a_{\mathbf{k}} - w_k^*(t)\, \hat a_{-\mathbf{k}}^\dagger \right] . \tag{10.26}$$
We can now calculate the commutator between the field operator and the cor-
responding velocity operator. A straightforward calculation with the rules (10.21)
gives
(Exercise: Show that demanding the canonical commutation relation (10.27) fixes
the normalisation to be the one given in (10.25).) Recall that the Lagrange density
of a scalar field is (in Minkowski space)
$$\hat{\mathcal{L}} = -\frac{1}{2} \eta^{\mu\nu}\, \partial_\mu\hat\varphi\, \partial_\nu\hat\varphi - V(\hat\varphi) , \tag{10.28}$$
where η µν = diag(−1, 1, 1, 1) as always. The corresponding Hamiltonian density is
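$$\hat{\mathcal{H}} = \frac{1}{2} \dot{\hat\varphi}^2 + \frac{1}{2} \delta^{ij}\, \partial_i\hat\varphi\, \partial_j\hat\varphi + V(\hat\varphi) , \tag{10.29}$$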
and the Hamiltonian is the spatial integral of the Hamiltonian density,
$$\hat H = \int d^3x\, \hat{\mathcal{H}} . \tag{10.30}$$
(Note that the Lagrange density corresponds to the pressure of the scalar field, and
the Hamiltonian density corresponds to the energy density.) Since the Hamiltonian
depends on the field velocity operator, it does not commute with the field operator,
As a result, the Hamiltonian and the field operator do not share a complete set of
eigenstates. So, in general an eigenstate of the Hamiltonian is not an eigenstate of the
field operator. Eigenstates of the Hamiltonian operator are the energy eigenstates,
and the state with the smallest energy is called the vacuum state. Since the vacuum
is not an eigenstate of the field operator, the field does not have a definite value in the
vacuum; instead we have only a distribution of values. In other words, the
scalar field has vacuum fluctuations. It can be shown that these fluctuations are
Gaussian (we skip the proof). This means that they are completely characterised
by the power spectrum, as discussed in chapter 9.
It is straightforward to calculate the power spectrum, defined as
$$\mathcal{P}_\varphi(k) = L^3 \frac{k^3}{2\pi^2} \langle |\varphi_{\mathbf{k}}|^2 \rangle . \tag{10.32}$$
Recall that the power spectrum is related to the variance of the field as (note
that $\langle\hat\varphi\rangle = 0$)
$$\langle \hat\varphi(\mathbf{x})^2 \rangle = \int_0^\infty \frac{dk}{k}\, \mathcal{P}_\varphi(k) . \tag{10.33}$$
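since all but the first term give 0, and the states are normalised so that
$\langle 1_{\mathbf{k}} | 1_{\mathbf{k}'} \rangle = \delta_{\mathbf{k}\mathbf{k}'}$. Therefore the power spectrum is
$$\mathcal{P}_\varphi(k) = L^3 \frac{k^3}{2\pi^2} |w_k|^2 . \tag{10.35}$$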
From (10.25) we have $|w_k|^2 = 1/(2L^3 E_k)$, so as the final result we get
$$\mathcal{P}_\varphi(k) = \frac{k^3}{4\pi^2 E_k} . \tag{10.36}$$
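As a quick numerical illustration of (10.36), the sketch below evaluates the Minkowski vacuum power spectrum. It assumes the usual dispersion relation E_k = (k² + m²)^{1/2}, which is not written out in this part of the notes. The function name and the sample values are hypothetical.

```python
import math


def minkowski_power(k, mass=0.0):
    """P_phi(k) = k^3 / (4 pi^2 E_k), with E_k = sqrt(k^2 + m^2) ASSUMED."""
    energy = math.sqrt(k**2 + mass**2)
    return k**3 / (4.0 * math.pi**2 * energy)


for k in (0.1, 1.0, 10.0):
    print(k, minkowski_power(k), minkowski_power(k, mass=1.0))
```

For a massless field the spectrum is (k/2π)², so vacuum fluctuations are larger on smaller scales, and a mass suppresses the fluctuations for k below m.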
In the case of inflation, the mode functions are different because space is expand-
ing, but the reasoning is the same.
In inflation, the background field is treated classically, and only the perturbations
around the mean value of the field are quantised. In fact, if we were to do the
calculation in a gauge-independent manner, we would see that the variables which
are quantised are a linear combination of the scalar field perturbations and metric
perturbations. Thus in inflation, part of the spacetime metric is quantised. Inflation
may thus be called the first quantum gravity scenario which has been confronted with
observations – with great success. However, just like the background scalar field,
the background metric is not quantised. How to quantise the metric in general, and
not just small perturbations, remains one of the most studied and most difficult
questions in physics. In this course, we just treat the field perturbation during
inflation the same way that we treated the field in Minkowski space. That is, the
Fourier modes of the field perturbation are written as
where the mode function $w_k(t)$ satisfies the classical equation of motion (10.6), with
the normalisation fixed by the canonical commutation relation,
where the only difference from the Minkowski space commutator (10.27) is the presence
of $a^{-3}$ on the right-hand side.
Taking the solution of (10.6) given in section 10.1.2, under the approximations
$H = \text{const.}$ and $m^2/H^2 = 3\eta \approx 0$, and fixing the normalisation with (10.39),
we get the solution
\[ w_k(t) = L^{-3/2} \frac{H}{\sqrt{2k^3}} \left( i + \frac{k}{aH} \right) \exp\left( \frac{ik}{aH} \right) , \tag{10.40} \]
where the time-dependence is $a(t) \propto e^{Ht}$.
When the scale k is well inside the horizon, $k \gg aH$, $\delta\varphi_k(t)$ oscillates rapidly
compared to the Hubble time $H^{-1}$. If we consider distance and time scales much
smaller than the Hubble scale, spacetime curvature does not matter and things
should behave like in Minkowski space. Considering (10.40) in this limit, one finds
(exercise) that $w_k(t)$ indeed becomes, up to a slowly varying phase, equal to the
Minkowski space mode function (10.25), with the lengths scaled by a. (The prefactor
in (10.40) was chosen so that the normalisations would agree.) Therefore the mode
function $w_k(t)$ of (10.40) tells us how the perturbation behaves as it approaches and
exits the horizon.
and the power spectrum of inflaton fluctuations is, as in Minkowski space,
\[ \mathcal{P}_\varphi(k) = L^3 \frac{k^3}{2\pi^2} |w_k|^2 . \tag{10.41} \]
Well before horizon exit, $k \gg aH$, and on timescales $\ll H^{-1}$, the field operator
$\delta\hat\varphi_k(t)$ agrees with the Minkowski space field operator and we have the same kind of
vacuum fluctuations in $\delta\varphi$ as in Minkowski space. However, the time evolution of
the perturbations is different. Well after horizon exit, $k \ll aH$, the mode function
approaches a constant
\[ w_k(t) \to L^{-3/2} \frac{iH}{\sqrt{2k^3}} , \tag{10.42} \]
so the vacuum fluctuations “freeze” and the power spectrum acquires the constant
value
\[ \mathcal{P}_\varphi(k) = L^3 \frac{k^3}{2\pi^2} |w_k|^2 = \left( \frac{H}{2\pi} \right)^2 . \tag{10.43} \]
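The freezing of the fluctuations can be checked numerically. The sketch below evaluates the mode function (10.40) and the power spectrum (10.41) for a single mode at several values of the scale factor; the function names and the unit choices (H = 1, L = 1) are assumptions made for illustration. Inserting (10.40) into (10.41) gives $\mathcal{P}_\varphi = (H/2\pi)^2 [1 + (k/aH)^2]$, which is what the printed ratios should follow.

```python
import cmath
import math


def mode_function(k, a, H, L=1.0):
    """De Sitter mode function w_k of (10.40), valid for constant H."""
    x = k / (a * H)  # x > 1 inside the horizon, x < 1 outside
    return L**-1.5 * H / math.sqrt(2.0 * k**3) * (1j + x) * cmath.exp(1j * x)


def power_spectrum(k, a, H, L=1.0):
    """Power spectrum (10.41): L^3 k^3 / (2 pi^2) |w_k|^2."""
    return L**3 * k**3 / (2.0 * math.pi**2) * abs(mode_function(k, a, H, L)) ** 2


H, k = 1.0, 1.0
frozen_value = (H / (2.0 * math.pi)) ** 2
for a in (0.1, 1.0, 10.0, 1000.0):
    # The ratio tends to 1 as the mode moves far outside the horizon.
    print(a, power_spectrum(k, a, H) / frozen_value)
```

Note that the box size L drops out of the power spectrum, as it should.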
We have calculated the power spectrum of the inflaton field perturbations by
using the quantum mechanical expectation value of the square of the field perturba-
tion. We now identify this with the expectation value of a probability distribution of
a classical variable, i.e. we assume that the quantum mechanical fluctuations become
classical. Some part of this process is understood (it can be shown that the quantum
mechanical expectation values become equal to those of a classical stochastic
distribution, or “squeezed”), but how classical reality emerges from
a quantum system remains an unsolved problem. In particle physics, appeal
is often made to the Copenhagen interpretation, according to which states become
classical when they are measured, but for cosmology this is inadequate. We just assume
that we can replace an expectation value of a quantum state with the ensemble
average of a classical distribution.
For our purposes, quantum mechanics generates the initial perturbations and
solves the problem of how perturbations can emerge from a state which is homoge-
neous and isotropic. As a remnant of the indeterministic origin of the perturbations,
we cannot predict the specific member of the ensemble which is realised in the
universe; we can only calculate the statistical distribution of perturbations. As noted,
this distribution is Gaussian, so all Fourier modes $\delta\varphi_k$ acquire their values as
independent random variables (except for the reality condition $\delta\varphi_{-k} = \delta\varphi_k^*$) with a
Gaussian probability distribution.
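To illustrate what "independent Gaussian Fourier modes with a reality condition" means in practice, here is a minimal one-dimensional toy sketch. It is not part of the inflationary calculation: the box, the number of modes, the `power` function and all names are hypothetical choices for illustration. Each mode with k > 0 is drawn independently, and the mode with −k is fixed by complex conjugation, which makes the resulting field real.

```python
import cmath
import math
import random


def draw_modes(power, n_modes, seed=0):
    """Draw Fourier modes for k = -n_modes..n_modes, excluding k = 0.

    `power(k)` is a hypothetical function giving the variance <|mode_k|^2>.
    """
    rng = random.Random(seed)
    modes = {}
    for k in range(1, n_modes + 1):
        sigma = math.sqrt(power(k) / 2.0)  # variance shared by Re and Im parts
        value = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        modes[k] = value
        modes[-k] = value.conjugate()  # reality condition
    return modes


def field_at(x, modes, box=1.0):
    """Sum the Fourier series at position x in a periodic box of length `box`."""
    return sum(c * cmath.exp(2j * math.pi * k * x / box) for k, c in modes.items())


modes = draw_modes(lambda k: 1.0 / k**3, n_modes=16)
samples = [field_at(i / 8.0, modes) for i in range(8)]
print(max(abs(s.imag) for s in samples))  # imaginary parts vanish up to rounding
```

Changing the seed gives a different member of the ensemble with the same statistics, which is the sense in which only the distribution can be predicted.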
The result (10.43) was obtained treating H as a constant. However, H does
change, albeit slowly, during inflation. The main purpose of our discussion was to
follow the inflaton perturbations through the horizon exit. After the perturbation
is well outside the horizon, we switch to other variables, namely the curvature
perturbation $\mathcal{R}_k$, which remains constant outside the horizon even though H changes,
unlike $\delta\varphi_k$ (we see from (10.37) that $\delta\varphi_k$ is not constant in general). To take into
account the evolution, we use for each scale k the value of H which is representative for
the evolution of that particular scale through the horizon. That is, we choose the
value of H at horizon exit (footnote 3), so that aH = k. Thus the power spectrum is
\[ \mathcal{P}_\varphi(k) = L^3 \frac{k^3}{2\pi^2} |w_k|^2 = \left. \left( \frac{H}{2\pi} \right)^2 \right|_{aH=k} , \tag{10.44} \]
where the subscript notation signifies that the value of H for each k is to be taken
at horizon exit of that particular scale.
Since we have only one quantity which has fluctuations, the inflaton field, and the
perturbations are treated in linear theory, the perturbations of any other quantity
are related to the inflaton field fluctuation by linear and local equations. In other
words, any perturbation quantity $g_k$ depends only on the field perturbation $\delta\varphi_k$
with the same wavenumber, $g_k(t) = f_{g\varphi}(t, k)\,\delta\varphi_k(t_k)$. Thus the statistics of the
inflaton perturbations $\delta\varphi(\mathbf{x})$ are inherited by all other perturbations, and we have
$\mathcal{P}_g(k) = f_{g\varphi}(t, k)^2\, \mathcal{P}_\varphi(k)$. The function $f_{g\varphi}(t, k)$ (like the power spectrum $\mathcal{P}_\varphi(k)$) can
only depend on the magnitude k, not on the direction of k, because the background
is homogeneous and isotropic. So the distribution of the perturbations inherits
the property of homogeneity and isotropy from the symmetry of the background
on which they are created and evolve: perturbations generated by inflation are
statistically homogeneous and isotropic.
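In particular, for the comoving curvature perturbation we have, from (10.15),
\[ \mathcal{R}_k = -H \frac{\delta\varphi_k}{\dot{\bar\varphi}} , \tag{10.45} \]
so we obtain
\[ \mathcal{P}_\mathcal{R}(k) = \left( \frac{H}{\dot{\bar\varphi}} \right)^2 \mathcal{P}_\varphi(k) = \left. \left[ \left( \frac{H}{\dot{\bar\varphi}} \right) \left( \frac{H}{2\pi} \right) \right]^2 \right|_{aH=k} . \tag{10.46} \]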
This is the main result for quantum fluctuations during inflation. The problem has
now been completely reduced to the evolution of the background scalar field and the
background Hubble parameter. We just need to specify the inflation potential and
calculate how the background evolves, and plug it in (10.46) to get complete
information about the perturbations. That, in turn, is the starting point for calculating
structure formation and the CMB anisotropy. Turning this around, observations of
large-scale structure and the CMB can be used to obtain information about quantum
processes in the primordial universe. Note that the power spectrum depends only
on k. Statistical homogeneity and isotropy of the perturbations, inherited from the
symmetry of the background, is a strong feature of inflation. (I use the word 'feature'
rather than 'prediction', because it is possible to construct models where, for example,
space expands anisotropically during inflation. However, that requires untypical
assumptions, such as having a short period of inflation, so that the anisotropy is not
washed away, or inflation driven by something other than a scalar field.)

Footnote 3: One can do a more precise calculation, where one takes into account the evolution of H(t).
The result is that one gets a correction to the amplitude of $\mathcal{P}_\mathcal{R}(k)$, which is first order in slow-roll
parameters, and a correction to its spectral index n, which is second order in the slow-roll parameters.
Note that H is assumed to be constant only for each k mode during the time it crosses the horizon.
The equations of motion of the different modes are independent, so in principle H could be very
different for modes that exit at very different times without violating our assumptions.
(In this section, we drop the overbar from the background values.) We have
expressed the dynamics of slow-roll inflation in terms of the two slow-roll variables,
so let us see what the power spectrum looks like in terms of them. Applying the
slow-roll equations
\[ H^2 = \frac{V}{3M_{\rm Pl}^2} \quad \text{and} \quad 3H\dot\varphi = -V' , \]
(10.47) becomes
\[ \mathcal{P}_\mathcal{R}(k) = \frac{1}{12\pi^2} \frac{1}{M_{\rm Pl}^6} \frac{V^3}{V'^2} = \frac{1}{24\pi^2} \frac{1}{M_{\rm Pl}^4} \frac{V}{\varepsilon} , \tag{10.48} \]
where ε is the slow-roll parameter.
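As a quick consistency check, the two forms of (10.48) can be compared numerically. This is a sketch under stated assumptions: units with $M_{\rm Pl} = 1$, the definition $\varepsilon = \frac{1}{2} M_{\rm Pl}^2 (V'/V)^2$, and a quadratic potential with made-up parameter values; the function name is illustrative.

```python
import math

M_PL = 1.0  # reduced Planck mass; units with M_Pl = 1 are an assumption of this sketch


def curvature_power(V, dV):
    """Return both forms of (10.48) for given V and V' = dV/dphi."""
    eps = 0.5 * M_PL**2 * (dV / V) ** 2
    from_potential = V**3 / (12.0 * math.pi**2 * M_PL**6 * dV**2)
    from_epsilon = V / (24.0 * math.pi**2 * M_PL**4 * eps)
    return from_potential, from_epsilon


# Illustrative quadratic potential V = m^2 phi^2 / 2 (values are made up).
m, phi = 1.0e-5, 15.0
first, second = curvature_power(0.5 * m**2 * phi**2, m**2 * phi)
print(first, second)
```

For this potential both expressions reduce to $m^2\varphi^4/(96\pi^2 M_{\rm Pl}^6)$.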
According to observations of CMB and large-scale structure, the amplitude of
the primordial power spectrum is
This puts a limit on the Hubble scale during inflation. From $H^2 = V/(3M_{\rm Pl}^2)$, the
constraint $V^{1/4} < 6.8 \times 10^{16}$ GeV translates into $H < 10^{15}$ GeV, or in terms of
length, $H^{-1} > 10^{-31}$ m.
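The conversion from the bound on $V^{1/4}$ to the bound on H is a short piece of arithmetic. The sketch below assumes the reduced Planck mass $M_{\rm Pl} \approx 2.435 \times 10^{18}$ GeV and $\hbar c \approx 1.973 \times 10^{-16}$ GeV m for the unit conversion; the names are illustrative.

```python
import math

M_PL_GEV = 2.435e18       # reduced Planck mass in GeV (assumed value)
HBAR_C_GEV_M = 1.973e-16  # hbar * c in GeV * m, converts GeV^-1 to metres


def hubble_rate_gev(v_quarter_gev):
    """H from the slow-roll Friedmann equation H^2 = V / (3 M_Pl^2)."""
    potential = v_quarter_gev**4
    return math.sqrt(potential / 3.0) / M_PL_GEV


h_max_gev = hubble_rate_gev(6.8e16)
hubble_length_min_m = HBAR_C_GEV_M / h_max_gev
print(h_max_gev, hubble_length_min_m)
```

With these inputs the bound comes out close to $10^{15}$ GeV and the Hubble length close to $2 \times 10^{-31}$ m, consistent with the order-of-magnitude values quoted above.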
Since during slow-roll inflation V and V′ change slowly while a wide range of
scales k exit the horizon, we expect $\mathcal{P}_\mathcal{R}(k)$ to be a slowly varying function of k. We
describe this small variation with the spectral index n of the primordial spectrum,
defined as (footnote 4)
\[ n(k) - 1 \equiv \frac{d\ln\mathcal{P}_\mathcal{R}}{d\ln k} . \tag{10.52} \]
If the spectral index is independent of k, we say that the spectrum is scale-free. In
this case the primordial spectrum is a power law
\[ \mathcal{P}_\mathcal{R}(k) = A^2 \left( \frac{k}{k_p} \right)^{n-1} , \tag{10.53} \]
where the “pivot scale” $k_p$ is some chosen reference scale and A is the amplitude at
this pivot scale.
If the power spectrum is constant,
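\[ \frac{d\ln k}{dt} = \frac{d\ln(aH)}{dt} = \frac{\dot a}{a} + \frac{\dot H}{H} = (1 - \varepsilon)H , \]
where we used the fact that in the slow-roll approximation $\dot H = -\varepsilon H^2$ in the last
step. Thus
\[ \frac{d}{d\ln k} = \frac{1}{1-\varepsilon} \frac{1}{H} \frac{d}{dt} = \frac{1}{1-\varepsilon} \frac{\dot\varphi}{H} \frac{d}{d\varphi} = -\frac{M_{\rm Pl}^2}{1-\varepsilon} \frac{V'}{V} \frac{d}{d\varphi} \approx -M_{\rm Pl}^2 \frac{V'}{V} \frac{d}{d\varphi} . \tag{10.55} \]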
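Let us first calculate the scale dependence of the slow-roll parameters:
\[ \frac{d\varepsilon}{d\ln k} = -M_{\rm Pl}^2 \frac{V'}{V} \frac{d}{d\varphi} \left[ \frac{M_{\rm Pl}^2}{2} \left( \frac{V'}{V} \right)^2 \right] = M_{\rm Pl}^4 \left[ \left( \frac{V'}{V} \right)^4 - \left( \frac{V'}{V} \right)^2 \frac{V''}{V} \right] = 4\varepsilon^2 - 2\varepsilon\eta \tag{10.56} \]
and, in a similar manner (exercise),
\[ \frac{d\eta}{d\ln k} = \ldots = 2\varepsilon\eta - \xi , \tag{10.57} \]
where we have defined a third slow-roll parameter
\[ \xi \equiv M_{\rm Pl}^4 \frac{V'}{V^2} V''' . \tag{10.58} \]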
Footnote 4: The −1 is in the definition for historical reasons, related to other ways of defining the power
spectrum of perturbations.

The parameter ξ is typically second-order small in the sense that $\sqrt{|\xi|}$ is of the same
order of magnitude as ε and η. (Therefore it is sometimes written as $\xi^2$, although
this can be misleading, as it does not have to be positive.)
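We can now calculate the spectral index:
\[ n - 1 = \frac{1}{\mathcal{P}_\mathcal{R}} \frac{d\mathcal{P}_\mathcal{R}}{d\ln k} = \frac{\varepsilon}{V} \frac{d}{d\ln k} \left( \frac{V}{\varepsilon} \right) = \frac{1}{V} \frac{dV}{d\ln k} - \frac{1}{\varepsilon} \frac{d\varepsilon}{d\ln k} = -M_{\rm Pl}^2 \frac{V'}{V} \cdot \frac{1}{V} \frac{dV}{d\varphi} - 4\varepsilon + 2\eta = -6\varepsilon + 2\eta . \tag{10.59} \]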
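The result (10.59) can be verified numerically for a potential of one's choice. The following sketch differentiates $\ln(V/\varepsilon)$ with finite differences, using the approximate operator from (10.55). It assumes units with $M_{\rm Pl} = 1$, the definitions $\varepsilon = \frac{1}{2} M_{\rm Pl}^2 (V'/V)^2$ and $\eta = M_{\rm Pl}^2 V''/V$, and an illustrative quartic potential with a made-up prefactor; all function names are hypothetical.

```python
import math

M_PL = 1.0  # reduced Planck mass; units with M_Pl = 1 are an assumption


def derivative(f, x, h=1.0e-3):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def slow_roll_parameters(V, phi):
    """Return (epsilon, eta) for a potential function V at field value phi."""
    dV = derivative(V, phi)
    d2V = derivative(lambda p: derivative(V, p), phi)
    eps = 0.5 * M_PL**2 * (dV / V(phi)) ** 2
    eta = M_PL**2 * d2V / V(phi)
    return eps, eta


def log_power(V, phi):
    """ln P_R up to an additive constant, using P_R proportional to V / epsilon."""
    eps, _ = slow_roll_parameters(V, phi)
    return math.log(V(phi) / eps)


def spectral_tilt(V, phi):
    """n - 1 = d ln P_R / d ln k with d/d ln k = -M_Pl^2 (V'/V) d/dphi, as in (10.55)."""
    dV = derivative(V, phi)
    return -M_PL**2 * dV / V(phi) * derivative(lambda p: log_power(V, p), phi)


def quartic(phi):
    """Illustrative potential proportional to phi^4 (the prefactor is made up)."""
    return 1.0e-12 * phi**4


eps, eta = slow_roll_parameters(quartic, 20.0)
tilt = spectral_tilt(quartic, 20.0)
print(tilt, -6.0 * eps + 2.0 * eta)
```

For a quartic potential both numbers should equal $-24 M_{\rm Pl}^2/\varphi^2$, up to finite-difference error.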
\[ \frac{dn}{d\ln k} = -0.015 \pm 0.017 \tag{10.62} \]
Some inflation models have |n − 1| and |dn/d ln k| larger than this, while others do
not. These observations have ruled out some inflation models, while a zoo of dozens
and dozens of viable models remains [2]. Note how, as in the case of dark matter,
things work out automatically. In order to have negative pressure, a scalar field has
to roll slowly. Once the background evolution is slowly rolling, the perturbations are
close to scale-invariant, without needing to add new ingredients or tune anything.
CMB experiments have measured the CMB anisotropy over a range ∆ ln k ≈ 8.
On scales smaller than this, the CMB anisotropy is expected to be negligible (see
chapter 12 for the reason why!), so there’s nothing more to find. However, it is
possible to probe these smaller scales by observations of large-scale structure. Recall
that for high energy-scale inflation, the number of e-folds until the end of inflation
when the largest observable modes are generated is about 60, so we are only seeing
a small part of inflation.
The above results do not yet allow an independent determination of the two
slow-roll parameters ε and η. However, it turns out that the spectral index of tensor
perturbations produced by inflation is independent of η (it is −2ε). So if tensor
perturbations are detected (from their signature on the CMB) and their spectrum
is measured, we can get both ε and η. The amplitude of the tensor perturbations
also depends directly on the Hubble parameter during inflation, so it will provide a
measurement of the energy scale of inflation. Typically, large-field inflation models
produce tensor perturbations with much larger amplitude than small-field inflation
models. In the small-field case they may be too small to be detectable in the near
future. It is possible to calculate the spectrum of gravity waves the same way as
we did for the scalar perturbations (the calculation is in fact simpler in the sense
that the gravity waves do not couple to matter, so we don’t have to worry about
the scalar field perturbations and gauges).
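Example: Consider the simple inflation model
\[ V(\varphi) = \frac{1}{2} m^2 \varphi^2 . \tag{10.63} \]
In chapter 8 we already calculated the slow-roll parameters for this model:
\[ \varepsilon = \eta = 2 \frac{M_{\rm Pl}^2}{\varphi^2} \tag{10.64} \]
and we immediately see that ξ = 0. Thus
\[ n = 1 - 6\varepsilon + 2\eta = 1 - 8 \left( \frac{M_{\rm Pl}}{\varphi} \right)^2 \]
\[ \frac{dn}{d\ln k} = 16\varepsilon\eta - 24\varepsilon^2 - 2\xi = -32 \left( \frac{M_{\rm Pl}}{\varphi} \right)^4 . \tag{10.65} \]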
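To get the numbers out, we need the values of ϕ when the relevant cosmological
scales left the horizon. We know that the number of inflation e-foldings after that
should be about N ≈ 50 . . . 60. We have
\[ N(\varphi) = \frac{1}{M_{\rm Pl}^2} \int_{\varphi_{\rm end}}^{\varphi} \frac{V}{V'} \, d\varphi = \frac{1}{M_{\rm Pl}^2} \int_{\varphi_{\rm end}}^{\varphi} \frac{\varphi}{2} \, d\varphi = \frac{1}{4M_{\rm Pl}^2} \left( \varphi^2 - \varphi_{\rm end}^2 \right) , \tag{10.66} \]
and we estimate $\varphi_{\rm end}$ from $\varepsilon(\varphi_{\rm end}) = 2M_{\rm Pl}^2/\varphi_{\rm end}^2 = 1 \Rightarrow \varphi_{\rm end} = \sqrt{2} M_{\rm Pl}$ to get
\[ \varphi^2 = \varphi_{\rm end}^2 + 4M_{\rm Pl}^2 N = 2M_{\rm Pl}^2 + 4M_{\rm Pl}^2 N \approx 4M_{\rm Pl}^2 N . \tag{10.67} \]
Thus
\[ \left( \frac{M_{\rm Pl}}{\varphi} \right)^2 = \frac{1}{4N} \tag{10.68} \]
and
\[ n = 1 - \frac{2}{N} \approx 0.96 \]
\[ \frac{dn}{d\ln k} = -\frac{2}{N^2} \approx -0.0008 . \tag{10.69} \]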
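The numbers in (10.69) follow from (10.65) and (10.68) and are easy to tabulate for the range N ≈ 50 . . . 60. A minimal sketch (the function name is illustrative):

```python
def quadratic_inflation_predictions(N):
    """Spectral index and its running for V = m^2 phi^2 / 2.

    Uses (10.65) together with (M_Pl / phi)^2 = 1 / (4 N) from (10.68).
    """
    x = 1.0 / (4.0 * N)  # (M_Pl / phi)^2
    n = 1.0 - 8.0 * x
    running = -32.0 * x**2
    return n, running


for N in (50, 60):
    n, running = quadratic_inflation_predictions(N)
    print(N, round(n, 4), round(running, 6))
```

The quoted values n ≈ 0.96 and dn/d ln k ≈ −0.0008 correspond to N = 50.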
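The energy scale of inflation is determined from (10.48) and (10.49). Putting in
(10.68), we get
\[ m \approx \frac{9}{N} \times 10^{14}\ \text{GeV} \approx 2 \times 10^{13}\ \text{GeV} \approx 8 \times 10^{-6} M_{\rm Pl} , \tag{10.70} \]
for N = 50. We get $V^{1/4} \approx 2 \times 10^{16}$ GeV as the energy scale for the period when
the perturbations seen in the CMB were generated. Potential energy at the end of
inflation is
\[ V_{\rm end}^{1/4} = \left( \frac{1}{2} m^2 \varphi_{\rm end}^2 \right)^{1/4} = \sqrt{\frac{m}{M_{\rm Pl}}} \, M_{\rm Pl} \approx 3 \times 10^{-3} M_{\rm Pl} \approx 7 \times 10^{15}\ \text{GeV} . \tag{10.71} \]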
Because of the high energy scale, the amplitude of tensor perturbations, as quantified
by the tensor-to-scalar ratio r, is significant, r ≈ 0.1. As there is no sign of tensor
perturbations in the Planck data, this simple model is slightly disfavoured by the
data. There was an announcement in March 2014 by the BICEP2 telescope team
that inflationary gravity waves had been detected, but this turned out to be
premature.
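The numbers in (10.70) and (10.71) can be reproduced with a few lines. Inserting $V = \frac{1}{2} m^2\varphi^2$, (10.64) and (10.67) into (10.48) gives $\mathcal{P}_\mathcal{R} = m^2 N^2/(6\pi^2 M_{\rm Pl}^2)$. The sketch below solves this for m; the amplitude $\mathcal{P}_\mathcal{R} \approx 2.2 \times 10^{-9}$ and the reduced Planck mass $2.435 \times 10^{18}$ GeV are assumed input values, not taken from the text above, and the function names are illustrative.

```python
import math

M_PL_GEV = 2.435e18  # reduced Planck mass in GeV (assumed value)
AMPLITUDE = 2.2e-9   # assumed amplitude of P_R, roughly the Planck value


def inflaton_mass_gev(N):
    """m from P_R = m^2 N^2 / (6 pi^2 M_Pl^2)."""
    return math.sqrt(6.0 * math.pi**2 * AMPLITUDE) * M_PL_GEV / N


def energy_scale_gev(N):
    """V^(1/4) at horizon exit, with phi^2 = 4 M_Pl^2 N from (10.67)."""
    m = inflaton_mass_gev(N)
    return (2.0 * m**2 * M_PL_GEV**2 * N) ** 0.25


def end_scale_gev(N):
    """V_end^(1/4) = sqrt(m M_Pl), as in (10.71)."""
    return math.sqrt(inflaton_mass_gev(N) * M_PL_GEV)


N = 50
print(inflaton_mass_gev(N), energy_scale_gev(N), end_scale_gev(N))
```

With these inputs the three numbers come out near $2 \times 10^{13}$ GeV, $2 \times 10^{16}$ GeV and $7 \times 10^{15}$ GeV, matching the rounded values in the example.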
References
[1] P.A.R. Ade et al. [Planck Collaboration], Astron. Astrophys. (2014)
[arXiv:1303.5076 [astro-ph.CO]]
\[ \delta_k = -\frac{2}{3} \frac{k^2}{(aH)^2} \Phi_k - 2 \frac{1}{H} \dot\Phi_k - 2\Phi_k . \tag{11.5} \]
11 PERTURBATIONS AFTER INFLATION 168
which enters at the time $t_{\rm eq}$ of matter-radiation equality, and the scale
\[ k_{\rm dec}^{-1} = (a_{\rm dec} H_{\rm dec})^{-1} \approx 90\, \omega_m^{-1/2}\ \text{Mpc} , \tag{11.8} \]
which enters at the time $t_{\rm dec} \approx 380\,000$ yr of photon decoupling. A conservative
observational range is $\omega_m = 0.12 \ldots 0.16$. This gives $k_{\rm eq}^{-1} = 86 \ldots 110$ Mpc and
$k_{\rm dec}^{-1} = 220 \ldots 260$ Mpc, with mean values $k_{\rm eq}^{-1} \approx 100$ Mpc and $k_{\rm dec}^{-1} \approx 240$ Mpc. The
smallest “cosmological” scale is that corresponding to a typical distance between
galaxies, about 1 Mpc (footnote 3). This scale entered during the radiation-dominated epoch,
well after Big Bang nucleosynthesis.
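The quoted ranges can be reproduced from the scaling relations. The sketch below uses (11.8) for the decoupling scale and the relation $k_{\rm eq}^{-1} \approx 13.7\, \omega_m^{-1}$ Mpc, which is quoted later in this chapter; the function names are illustrative.

```python
def equality_scale_mpc(omega_m):
    """Comoving scale entering at matter-radiation equality, 13.7 / omega_m Mpc."""
    return 13.7 / omega_m


def decoupling_scale_mpc(omega_m):
    """Comoving scale entering at photon decoupling, from (11.8)."""
    return 90.0 / omega_m**0.5


for omega_m in (0.12, 0.14, 0.16):
    print(omega_m, round(equality_scale_mpc(omega_m)), round(decoupling_scale_mpc(omega_m)))
```

The extremes of the range give roughly 86 to 114 Mpc for the equality scale and 225 to 260 Mpc for the decoupling scale, in line with the rounded values in the text.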
The scale corresponding to the present “horizon” (i.e. Hubble length) is
for values h = 0.6 . . . 0.8, and the commonly accepted value h = 0.7 gives $(a_0 H_0)^{-1} =
4300$ Mpc. If the universe is accelerating at the moment (footnote 4), this scale is actually exiting
now, and there are scales, somewhat larger than this, that have briefly entered, and
then exited again in the recent past. The largest observable scales, of the order of
$k_0^{-1}$, are essentially at their primordial amplitude now.
2. baryonic matter
3. photons
4. neutrinos
5. vacuum energy.

Footnote 2: Recall that what we call the horizon here is just the Hubble radius, not the particle horizon.

Footnote 3: In the present universe, structure at smaller scales has undergone a non-linear process of galaxy
formation, and it bears little relation to the primordial perturbations. However, observations of
the high-redshift universe, especially so-called Lyman-α observations (absorption spectra of high-z
quasars, which reveal distant gas clouds along the line of sight), can reveal these structures when
they are closer to their primordial state. With such observations, the “cosmological” range of scales
can be extended down to ∼ 0.1 Mpc.

Footnote 4: This is the case in the ΛCDM model, but there are also models where the acceleration has
transitioned back into deceleration. Either possibility is allowed by observations.

The existence of baryons, photons and neutrinos is beyond reasonable doubt, the
existence of dark matter is considered established by most cosmologists (however,
warm dark matter remains a plausible alternative to cold dark matter) and the
existence and nature of dark energy are still a subject of debate. As in the first part of
the course, we will stick with the ΛCDM model and only consider vacuum energy.
We have
\[ \rho = \underbrace{\rho_c + \rho_b}_{\rho_m} + \underbrace{\rho_\gamma + \rho_\nu}_{\rho_r} + \rho_\Lambda , \tag{11.10} \]
where we have grouped CDM (denoted with c) and baryons together as matter, and
photons and neutrinos as radiation. As we have discussed, neutrinos are actually
non-relativistic today and so constitute matter. However, for simplicity we will
neglect neutrino masses, as we have done before. (Because the contribution of the
neutrinos to the total energy density, or the energy density of matter, is small when
they become non-relativistic, this approximation is not too bad.)
Until the decoupling of photons and matter at $t = t_{\rm dec}$, baryons and photons are
tightly coupled, so for $t < t_{\rm dec}$ it is useful to treat them as a single component,
\[ \rho_{b\gamma} \equiv \rho_b + \rho_\gamma . \tag{11.11} \]
and the components can have different flow velocities. We can introduce the density,
pressure, and velocity perturbations for each component separately,
Note that the total density contrast is not just the sum of the individual density
contrasts. Instead, the density contrasts are weighted by the mean densities,
\[ \delta = \sum_i \frac{\bar\rho_i}{\bar\rho} \delta_i . \tag{11.19} \]
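A small numerical illustration of the weighting in (11.19) follows. The component names and all numbers are made up for the example; only the weighting rule comes from the text.

```python
import math


def total_density_contrast(mean_densities, contrasts):
    """Total density contrast (11.19): sum of delta_i weighted by rho_i / rho."""
    total_density = sum(mean_densities.values())
    return sum(
        mean_densities[name] / total_density * contrasts[name]
        for name in mean_densities
    )


# Made-up example: matter three times denser than radiation.
rho_bar = {"matter": 3.0, "radiation": 1.0}
delta = {"matter": 0.75e-5, "radiation": 1.0e-5}
print(total_density_contrast(rho_bar, delta))
```

The result lies between the two individual contrasts, closer to that of the dominant component, and is not their sum.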
\[ p = p(\rho) , \tag{11.20} \]
i.e. the pressure is uniquely determined by the energy density. Then the perturbations
δp and δρ are necessarily related by the derivative dp/dρ of the function
p(ρ),
\[ p = \bar p + \delta p = \bar p(\bar\rho) + \frac{dp}{d\rho}(\bar\rho)\, \delta\rho \quad \Rightarrow \quad \delta p = \frac{dp}{d\rho} \delta\rho . \]
The time derivatives of the background quantities $\bar p$ and $\bar\rho$ are related by the same
derivative,
\[ \dot{\bar p} = \frac{d\bar p}{dt} = \frac{dp}{d\rho}(\bar\rho) \frac{d\bar\rho}{dt} = \frac{dp}{d\rho} \dot{\bar\rho} . \]
Assuming the derivative dp/dρ is non-negative, its square root is the speed of sound
\[ c_s \equiv \sqrt{\frac{dp}{d\rho}} . \tag{11.21} \]
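\[ v^2 \equiv \frac{\delta p}{\delta\rho} = \frac{\dot{\bar p}}{\dot{\bar\rho}} = c_s^2 . \]
In general, p may depend on other variables besides ρ. The sound speed is then
given by
\[ c_s^2 = \left( \frac{\partial p}{\partial\rho} \right)_S \tag{11.22} \]
where the subscript S indicates that the derivative is taken so that the entropy of the
fluid element is kept constant. Since the background universe expands adiabatically
(meaning that there is no entropy production), we have
\[ \frac{\dot{\bar p}}{\dot{\bar\rho}} = \left( \frac{\partial p}{\partial\rho} \right)_S = c_s^2 . \tag{11.23} \]
\[ \delta p = c_s^2\, \delta\rho = \frac{\dot{\bar p}}{\dot{\bar\rho}} \delta\rho . \tag{11.25} \]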
Adiabatic perturbations are the simplest kind of perturbations. Single-field in-
flation produces adiabatic perturbations, since perturbations in all quantities are
proportional to a perturbation δϕ in a single scalar quantity, the inflaton field.
Adiabatic perturbations have the property that the local state of matter (deter-
mined here by the quantities p and ρ) at some spacetime point (t, x) of the perturbed
universe is the same as in the background universe at some slightly different time
t + δt, this time difference being different for different locations x. We can thus view
adiabatic perturbations as some parts of the universe being “ahead” and others
“behind” in the evolution, as visualised in figure 1.
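For the different components we have
\[ \left. \begin{aligned} \delta\rho_i(\mathbf{x}) &= \dot{\bar\rho}_i\, \delta t(\mathbf{x}) \\ \delta p_i(\mathbf{x}) &= \dot{\bar p}_i\, \delta t(\mathbf{x}) \end{aligned} \right\} \quad \Rightarrow \quad \frac{\delta p_i}{\delta\rho_i} = \frac{\dot{\bar p}_i}{\dot{\bar\rho}_i} , \qquad \frac{\delta\rho_i}{\delta\rho_j} = \frac{\dot{\bar\rho}_i}{\dot{\bar\rho}_j} . \tag{11.26} \]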
If there is no energy transfer between the fluid components at the background level,
the energy continuity equation is satisfied by each one separately,
\[ \delta_i = \delta_m \tag{11.29} \]
If we had energy transfer between components, the left-hand side of (11.34) would
be non-zero for the individual components (but still zero for the total energy density
and pressure).
Just like the background expansion is sourced by the total energy density and
pressure, the metric perturbations are sourced by the perturbations in the total
energy density and pressure, so we have, from chapter 9,
\[ 0 = \ddot\Phi_k + H(4 + 3v^2) \dot\Phi_k + v^2 \frac{k^2}{a^2} \Phi_k + [2\dot H + (3 + 3v^2) H^2] \Phi_k \tag{11.35} \]
\[ \delta_k = -\frac{2}{3} \frac{k^2}{(aH)^2} \Phi_k - 2 \frac{1}{H} \dot\Phi_k - 2\Phi_k , \tag{11.36} \]
where $v^2 \equiv \delta p/\delta\rho$. For adiabatic perturbations, we have $v^2 = c_s^2$.
So the gravitational potential decays, while the density perturbation oscillates around
a constant amplitude.
Though the physical wavelength of the mode is growing ∝ a, the visual horizon
is growing faster, $H^{-1} \propto a^2$. (Viewed in terms of the comoving wavelength, it
stays constant, while $aH \propto a^{-1}$ drops.) For superhorizon modes, the decaying
mode becomes negligible, while the non-decaying mode remains constant. Once the
wavelength of the mode becomes smaller than the horizon, the density contrast starts
to oscillate, and the gravitational potential decays. In both cases, the perturbations
remain small.
What about perturbations in the matter? Baryons are tightly coupled to radi-
ation until z ≈ 1100, so they have the same perturbations as the radiation fluid.
(We will later come back to what happens when baryons and photons decouple;
that occurs in the matter dominated era.) However, dark matter decouples from
the thermal bath earlier than the baryons, since it interacts weakly. We assume
here that dark matter is cold, so its pressure is negligible. After the decoupling of
dark matter, its energy-momentum tensor is individually conserved. Since the dark
matter contributes negligibly to the background and to the gravitational potential,
we can take (11.37) as a given and see how the dark matter perturbation evolves
in this gravitational potential. The derivation for the dark matter density contrast
is not complicated, but it requires a bit more general relativity than we have on
this course, so we just give the result. For a general FRW background and metric
perturbation Φ, we have
\[ \ddot\delta_{ck} + 2H \dot\delta_{ck} = 3\ddot\Phi_k + 6H \dot\Phi_k - \frac{k^2}{a^2} \Phi_k . \tag{11.41} \]
It is clear that the solution for superhorizon modes $k \ll aH$ is $\delta_{ck}$ = constant,
given that $\Phi_k$ = constant. In the opposite limit $k \gg aH$ we get, by inputting
$a \propto t^{1/2}$ and (11.40), the solution
where the coefficients $\tilde A_{1k}$ and $\tilde A_{2k}$ are expressible in terms of $A_{1k}$ and $A_{2k}$.
(Exercise: Calculate $\tilde A_{1k}$ and $\tilde A_{2k}$ in terms of $A_{1k}$ and $A_{2k}$.) (Recall that if we assume
adiabatic initial conditions, we have $\delta_m = \frac{3}{4} \delta_r \approx \delta$.) So, in contrast to baryons, the
density contrast of cold dark matter grows logarithmically during the radiation-dominated
era. The dark matter perturbations thus have a head start on perturbations
in baryonic matter, which is tightly coupled to the photons.
\[ S_{cb} = \delta_c - \delta_b , \tag{11.44} \]
and it expresses how perturbations in the two components deviate from each other.
For both $\delta_c$ and $\delta_b$, the right-hand side of (11.41) is the same, so subtracting the
equations we get an equation for $S_{cb}$:
baryon perturbations is the same as CDM perturbations. (This is for linear scales:
when perturbations become non-linear, baryons and CDM behave differently.)
But for scales which enter before decoupling, a non-zero $S_{cb}$ develops because
baryon perturbations are coupled to photon perturbations, whereas CDM perturbations
are not. After decoupling, $\delta_c \gg \delta_b$, since $\delta_c$ has been growing, while $\delta_b$ has
been oscillating. The initial condition is then $S_{cb} \sim \delta_c$ (“initial” time here being the
time of decoupling $t_{\rm dec}$). During the matter-dominated epoch, the solution for $S_{cb}$
is
\[ S_{cb} = A + B t^{-1/3} , \tag{11.46} \]
whereas for $\delta_c$ it is, neglecting the effect of baryons on it, from Eq. (11.56),
We call the first term the “growing” and the second term the “decaying” mode,
even though the “growing” mode of Scb is actually constant. The modes have been
evolving since horizon entry, so we can drop the decaying part.
To work out the precise initial conditions, we would need to work out the be-
haviour of Scb during decoupling. However, we really only need to assume that there
is no strong cancellation between the growing and decaying modes, so that Scb ∼ δc
implies that A is not much larger than δc ,
Later, at t ≫ tdec ,
Thus the baryon density contrast δb grows to match the CDM density contrast
δc (see figure 2), and we have eventually δb = δc = δ to high accuracy.
The baryon density perturbation begins to grow only after $t_{\rm dec}$. Before decoupling
the radiation pressure prevents growth. Without CDM, the density contrast
would grow only as $\delta_b \propto a \propto t^{2/3}$ after decoupling (during the matter-dominated
period, and the growth stops when the universe becomes dark energy dominated).
Thus it would have grown at most by the factor $a_0/a_{\rm dec} = 1 + z_{\rm dec} \approx 1090$ after
decoupling. In the anisotropy of the CMB we observe directly the baryon density
perturbations at $t = t_{\rm dec}$. They are too small (about $10^{-5}$) for a growth factor of
1090 to give the present observed large scale structure (footnote 7).
With CDM, this problem is solved. The CDM perturbations begin to grow
earlier, logarithmically in a during the radiation-dominated era and linearly in a from
t ∼ teq onwards, so by t = tdec they are much larger than the baryon perturbations.
After decoupling the baryons lose support from photon pressure and fall into the
CDM gravitational potential wells and catch up with the CDM perturbations. This
allows the baryon perturbations to be small at $t = t_{\rm dec}$ and to grow after that by
much more than the factor $10^3$, solving the problem with observations. This is one
of the strongest pieces of evidence for dark matter.
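The arithmetic behind this argument is short enough to spell out. The sketch uses the values quoted above (a density contrast of about $10^{-5}$ at decoupling and a maximal linear growth factor of about 1090); the variable names are illustrative.

```python
delta_at_decoupling = 1.0e-5  # baryon density contrast seen in the CMB (order of magnitude)
growth_factor = 1090.0        # a_0 / a_dec = 1 + z_dec, the maximal growth without CDM

delta_today_without_cdm = delta_at_decoupling * growth_factor
print(delta_today_without_cdm)  # about 0.01, still deep in the linear regime
```

Non-linear structures such as galaxies require density contrasts of order one or larger, so growth by a factor of about $10^3$ falls short by roughly two orders of magnitude.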
Footnote 7: This assumes adiabatic primordial perturbations, since we are seeing $\delta_\gamma$, not $\delta_b$. For a time,
primordial baryon entropy perturbations $S_{b\gamma} = \delta_b - \frac{3}{4} \delta_\gamma$ were considered a possible explanation,
but more accurate observations have ruled out this possibility.
Figure 2: Evolution of the CDM and baryon density perturbations after horizon entry (at
$t = t_k$). The figure is just schematic; the upper part is to be understood as having an
approximately logarithmic scale; the difference $\delta_c - \delta_b$ stays roughly constant, but the
fractional difference becomes negligible as both $\delta_c$ and $\delta_b$ grow by a large factor.
The above situation became clear in the 1980s when the upper limits to CMB
anisotropy (which was finally discovered by COBE in 1992) became tighter and
tighter. Today we have accurate measurements of the structure of the CMB anisotropy
which are compared to detailed calculations with CDM, and the argument is raised
to a different level – instead of comparing just two numbers we are now comparing
entire power spectra, which we will discuss in the next chapter.
The Jeans equation. Before decoupling, baryons see the photon pressure (as
well as their own pressure), while after decoupling, they just see their own pressure.
Baryon pressure is much smaller than photon pressure, but it is important on small
scales. At the background level, the baryon pressure can be taken to be zero, $\bar p_b = 0$,
but the perturbation is non-zero, $\delta p_b \neq 0$. After decoupling, baryonic matter is a
gas of hydrogen and helium. If we ignore the formation of molecules in the gas and
neglect the contribution of helium, so that we have a monoatomic gas, we have
\[ v^2 = \frac{\delta p_b}{\delta\rho_b} \approx T_b \frac{\delta n_b}{\delta\rho_b} = \frac{T_b}{m_N} , \tag{11.49} \]
where we have taken into account that the temperature is very uniform, and $m_N \approx
1$ GeV is the nucleon mass. Note that in this case $v^2 = c_s^2 = \partial p_b/\partial\rho_b$. Down until
$z \sim 100$, residual free electrons maintain enough interaction between the baryon
and photon components to keep $T_b \approx T_\gamma$. During this period, we thus have $c_s^2 \approx
10^{-13}(1 + z) \propto 1/a$. After that the baryon temperature falls faster than the photon
temperature,
\[ T_b \propto (1 + z)^2 \quad \text{whereas} \quad T_\gamma \propto 1 + z \]
(as shown in an exercise in chapter 4).
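The order of magnitude $c_s^2 \approx 10^{-13}(1 + z)$ can be checked from (11.49). The sketch below assumes a present-day CMB temperature of 2.725 K, the Boltzmann constant in eV/K, and the rounded nucleon mass $m_N \approx 1$ GeV used above; it applies only while $T_b \approx T_\gamma$, i.e. for redshifts between roughly 100 and decoupling. The names are illustrative.

```python
K_B_EV_PER_K = 8.617e-5  # Boltzmann constant in eV / K
T_CMB_TODAY_K = 2.725    # present photon temperature in K (assumed value)
M_NUCLEON_EV = 1.0e9     # nucleon mass, rounded to 1 GeV as in the text


def baryon_sound_speed_squared(z):
    """c_s^2 = T_b / m_N in units of c^2, assuming T_b = T_gamma = T_0 (1 + z)."""
    temperature_ev = K_B_EV_PER_K * T_CMB_TODAY_K * (1.0 + z)
    return temperature_ev / M_NUCLEON_EV


coefficient = baryon_sound_speed_squared(999.0) / 1000.0
print(coefficient)  # the prefactor multiplying (1 + z)
```

The prefactor comes out as a few times $10^{-13}$, consistent with the estimate in the text at the order-of-magnitude level.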
However, even a tiny pressure can be important on small scales. If we take
the analogue of (11.41) for the baryonic component, which includes a tiny pressure
contribution (we skip the derivation), we get the Jeans equation (footnote 8), valid on subhorizon
scales,
\[ \ddot\delta_{bk} + 2H \dot\delta_{bk} + \left[ c_s^2 \frac{k^2}{a^2} - 4\pi G_N \bar\rho \right] \delta_{bk} = 0 . \tag{11.50} \]
We have assumed that the universe is spatially flat, so we can also write this as
\[ \ddot\delta_{bk} + 2H \dot\delta_{bk} + \left[ c_s^2 \frac{k^2}{a^2} - \frac{3}{2} H^2 \right] \delta_{bk} = 0 . \tag{11.51} \]
We see that the small pressure term $c_s^2$ is enhanced on small scales by the factor
$k^2$. If we take k to be sufficiently large, this term will dominate, no matter how small
$c_s^2$ is. The nature of the solution to the Jeans equation depends on the sign of the
factor in brackets. Pressure resists compression, so if the first term dominates, we
get an oscillating solution, i.e. sound waves. The second term in the brackets is due
to gravity. If this term dominates, the perturbations grow. The wavenumber for
which the terms are equal,
\[ k_J = a \frac{\sqrt{4\pi G \bar\rho}}{c_s} = \sqrt{\frac{3}{2}} \frac{aH}{c_s} , \tag{11.52} \]
is called the Jeans wavenumber, and the corresponding wavelength
\[ \lambda_J = \frac{2\pi}{k_J} \tag{11.53} \]
is called the Jeans length.
For scales much smaller than the Jeans length, $k \gg k_J$, we can approximate
the Jeans equation by
\[ \ddot\delta_{bk} + 2H \dot\delta_{bk} + c_s^2 \frac{k^2}{a^2} \delta_{bk} = 0 . \tag{11.54} \]
The solutions oscillate with angular frequency $\omega = c_s k/a$ (assuming that $c_s$ is constant,
or changes slowly, which is not really quite true, as we have seen). The
oscillations are damped by the $2H \dot\delta_{bk}$ term, thus the amplitude of the oscillations
decreases with time. There is no growth of structure on sub-Jeans scales.
For scales much longer than the Jeans length (but still subhorizon), $aH \ll
k \ll k_J$, we have
\[ \ddot\delta_{bk} + 2H \dot\delta_{bk} - \frac{3}{2} H^2 \delta_{bk} = 0 . \tag{11.55} \]
So baryon perturbations on scales larger than the Jeans length but smaller than the
Hubble length grow just like CDM perturbations, as we discussed earlier.
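The growth law implied by (11.55) can be found by trying a power law $\delta \propto t^p$. The sketch below assumes a background with $H = c/t$ (matter domination corresponds to $c = 2/3$, since $a \propto t^{2/3}$); substituting into (11.55) gives the quadratic $p(p - 1) + 2cp - \frac{3}{2} c^2 = 0$, which the code solves. The function name is illustrative.

```python
import math


def power_law_exponents(hubble_coefficient):
    """Exponents p of delta ~ t^p solving (11.55) when H = hubble_coefficient / t.

    The substitution gives p^2 + (2c - 1) p - (3/2) c^2 = 0.
    """
    c = hubble_coefficient
    linear = 2.0 * c - 1.0
    constant = -1.5 * c**2
    root = math.sqrt(linear**2 - 4.0 * constant)
    return (-linear + root) / 2.0, (-linear - root) / 2.0


growing, decaying = power_law_exponents(2.0 / 3.0)
print(growing, decaying)
```

In matter domination the growing solution has p = 2/3, i.e. $\delta \propto t^{2/3} \propto a$, and the decaying solution has p = −1.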
Footnote 8: Often the Jeans equations are derived starting from the equations of Newtonian gravity, in
which context they were originally presented.

The ratio of the (comoving) Jeans length to the comoving Hubble length is, from
(11.52),
\[ \frac{\lambda_J}{(aH)^{-1}} = 2\pi \sqrt{\frac{2}{3}} \, c_s . \]
Before decoupling, the baryons see the photon pressure, and $c_s^2 \sim \frac{1}{3}$. From
(11.7.2) we would then conclude that before decoupling the baryonic Jeans length
is comparable to the Hubble length, so that all subhorizon modes are sub-Jeans.
Therefore, all subhorizon baryon modes oscillate before decoupling. However, this
argument is not really correct, because the Jeans equation is not valid when $c_s^2$ is
large. Also, in the period close to decoupling the photon mean free path $\lambda_\gamma$ grows
rapidly. The fluid description, which we are here using for the perturbations, applies
only to scales $\gg \lambda_\gamma$, whereas the photons are smooth only on scales $\ll \lambda_\gamma$. The
behaviour during this period can be treated properly only with numerical codes, such
as COSMOMC. Nevertheless, the conclusion that all baryonic subhorizon modes
oscillate before decoupling is correct, at least when perturbations are adiabatic (footnote 9).
After decoupling, the Jeans length grows. However, at all times until today, it is
≪ Mpc. It would be relevant if we were interested in the process of the formation
of individual galaxies, but here we are interested in the larger scales reflected in
perturbations of the galaxy number density. Thus for our purposes, the baryonic
component is pressureless after decoupling.
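To get a feeling for the numbers, the comoving baryon Jeans length just after decoupling can be estimated from the ratio $\lambda_J/(aH)^{-1} = 2\pi\sqrt{2/3}\, c_s$. The sketch below is a rough estimate only: it uses $c_s^2 \approx 10^{-13}(1 + z)$ with $1 + z_{\rm dec} \approx 1090$ and takes the comoving Hubble length at decoupling to be about 240 Mpc, the mean value of $k_{\rm dec}^{-1}$ quoted earlier in the chapter. The names are illustrative.

```python
import math


def baryon_sound_speed(z):
    """c_s in units of c, from c_s^2 of about 1e-13 (1 + z) while T_b tracks T_gamma."""
    return math.sqrt(1.0e-13 * (1.0 + z))


def jeans_to_hubble_ratio(c_s):
    """Ratio of comoving Jeans length to comoving Hubble length, 2 pi sqrt(2/3) c_s."""
    return 2.0 * math.pi * math.sqrt(2.0 / 3.0) * c_s


hubble_length_dec_mpc = 240.0  # comoving Hubble length at decoupling (approximate)
jeans_length_mpc = jeans_to_hubble_ratio(baryon_sound_speed(1089.0)) * hubble_length_dec_mpc
print(jeans_length_mpc)
```

Under these assumptions the Jeans length is of order 0.01 Mpc, far below the smallest cosmological scale of about 1 Mpc.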
The subhorizon evolution history of the different cosmological scales of pertur-
bations is summarised in figure 3.
Figure 3: The evolution of perturbations on different subhorizon scales. The baryon Jeans
length $k_J^{-1}$ drops precipitously at decoupling so that all cosmological scales became
super-Jeans after decoupling, whereas all subhorizon scales were also sub-Jeans before decoupling.
The wavy lines symbolise the oscillation of baryon perturbations before decoupling, and the
opening pair of lines around them symbolise the ∝ a growth of CDM perturbations after
teq . There is also logarithmic growth of CDM perturbations between horizon entry and teq .
These effects modify the primordial value of the perturbations, and this is encoded
in the transfer function. We also express the relation between the primordial
curvature perturbation $\mathcal{R}_k$ and any other quantity we are interested in via a
transfer function. Since we have only one source of perturbations and perturbations
are assumed to be small, the value of any perturbation g at time t is related to the
primordial perturbation $\mathcal{R}_k$ linearly:
\[ g_k(t) = T_g(t,k)\,\mathcal{R}_k , \]
where Tg (t, k) is the transfer function for perturbation g. The transfer function
depends only on the magnitude k and not on the direction of k, because perturba-
tions are evolving on a homogeneous and isotropic background. Often the transfer
function separates, Tg (t, k) = f (t)F (k). In particular, this is the case for cold dark
matter, if the decaying mode can be neglected. The transfer function incorporates
all the physics that determines how structure evolves in the linear regime. The
power spectrum of g is
\[ \mathcal{P}_g(t,k) = T_g(t,k)^2\,\mathcal{P}_\mathcal{R}(k) . \]
On scales $k^{-1} \gg 10$ Mpc, perturbations are still small today, and one does not
have to go beyond the transfer function. For smaller scales, corresponding to galax-
ies and galaxy clusters, the density perturbations have become large at late times,
and the physics of structure growth has become nonlinear. As the perturbations be-
come non-linear, modes with different wavenumber become coupled. This nonlinear
evolution is typically studied using large numerical simulations which use Newtonian
gravity. There are also some analytical results, also mostly in Newtonian gravity.
On scales that are still superhorizon today, the relation between the density
contrast and the primordial perturbations is simple: we have
$\delta_m \approx \delta = -2\Phi = \frac{6}{5}\mathcal{R}$, where we have used (11.4).
So for $k \ll a_0 H_0$, we simply have $T_\delta(t,k) = \frac{6}{5}$.
On scales that are subhorizon today, the situation is a bit more involved. Let
us make a crude estimate of the transfer function on those scales. Let us first look
at scales that enter before matter-radiation equality,
$k^{-1} < k_{eq}^{-1} \approx 13.7\,\omega_m^{-1}$ Mpc $\approx$ 100 Mpc.
We make the approximation that the relation (11.4) $\Phi_k = -\frac{2}{3}\mathcal{R}_k$ holds
all the way to horizon entry ($k = aH$), though it is strictly only valid for $k \ll aH$.
From (11.37) and (11.38) we have that at horizon entry ($k = aH$, or $y = 1/\sqrt{3}$)
$\delta_k \approx -\frac{5}{2}\Phi_k = \frac{5}{3}\mathcal{R}_k$. With adiabatic initial conditions, we have
$\delta_m = \frac{3}{4}\delta_r \approx \frac{3}{4}\delta$. We thus get
\[ \delta_{ck} \approx \frac{3}{4}\delta_k \approx \frac{5}{4}\mathcal{R}_k \tag{11.59} \]
at horizon entry. If we neglect the logarithmic growth of the CDM density per-
turbations, their amplitude stays at this level until the universe becomes matter-
dominated at t = teq , after which we can approximate δk ≈ δck and δk begins to
grow according to the matter-dominated law, $\propto 1/(aH)^2 \propto a$. Putting in the
logarithmic growth from horizon entry to matter-radiation equality, the perturbations
are in addition enhanced by a factor
$\ln(a_{eq}/a_{entry}) = 2\ln[a_{entry}H_{entry}/(a_{eq}H_{eq})] = 2\ln(k/k_{eq})$,
where the subscript entry refers to horizon entry. So all in all we have,
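\[
\delta_k(t) \approx \frac{5}{2}\left(\frac{a_{eq}H_{eq}}{aH}\right)^2 \ln\!\left(\frac{k}{k_{eq}}\right)\mathcal{R}_k
= \frac{5}{2}\left(\frac{k_{eq}}{aH}\right)^2 \ln\!\left(\frac{k}{k_{eq}}\right)\mathcal{R}_k . \tag{11.60}
\]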
In contrast, for perturbations which enter the horizon during matter domination,
$k \ll k_{eq}$, we have
\[
\delta_k(t) = -\frac{2}{3}\left(\frac{k}{aH}\right)^2 \Phi_k
= \frac{2}{5}\left(\frac{k}{aH}\right)^2 \mathcal{R}_k , \tag{11.61}
\]
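\[
\begin{aligned}
T(k) &= 1 && (k \ll k_{eq}) \\
T(k) &\approx \left(\frac{k_{eq}}{k}\right)^2 \ln\!\left(\frac{k}{k_{eq}}\right) && (k \gg k_{eq}) ,
\end{aligned}
\tag{11.63}
\]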
where we have dropped factors of order unity from the case k ≫ keq , since the
calculation is anyway approximate. If we wanted a transfer function which is con-
tinuous, we could replace ln(k/keq ) with ln(e + k/keq ). However, our calculation
is rather crude, and we should take into account the transition from radiation to
matter domination in more detail. An analytical fit to a numerical calculation gives
[1]
\[
T(k) = \frac{\ln(1 + 2.34q)}{2.34q\,\left[1 + 3.89q + (16.1q)^2 + (5.46q)^3 + (6.71q)^4\right]^{1/4}} , \tag{11.64}
\]
where $q \approx k e^{f_b}/(14 k_{eq})$, and the baryon fraction $f_b \equiv \omega_b/\omega_m$ takes into account
interactions between baryons and photons which dampen the matter perturbations.
The form (11.64) is called the BBKS transfer function after Bardeen, Bond, Kaiser
and Szalay. For realistic values fb ≈ 0.2, it has an error of around 30% around
the turning value keq , while it is accurate for high and low values of k. In detailed
calculations, numerical solutions of the baryon-photon-dark matter system are used
to derive the transfer function. There are publicly available computer programs
for doing this, such as COSMOMC. One of the main effects missing from both
(11.63) and (11.64) is baryon acoustic oscillations in the regime k > keq . These are
remnants of the oscillations of the baryon-photon fluid before decoupling, which are
imprinted on the pattern of density fluctuations (and thus the distribution of
galaxies) today. Since there is much more dark matter than baryons, the oscillations
are only a small feature in the overall power spectrum, but they carry important
cosmological information, like the CMB anisotropies we discuss in the next chapter.
Further discussion of the baryon acoustic oscillations is beyond the scope of this
course.
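As a rough numerical illustration of the fit (11.64), the following Python sketch evaluates the BBKS formula for a few values of k/keq. The function names and the default fb = 0.2 are choices made here for illustration only, and the mapping from k/keq to q simply follows the approximate relation quoted above.

```python
import math

def scaled_wavenumber(k_over_keq, f_b=0.2):
    """q ~ k e^{f_b} / (14 k_eq), with k given in units of k_eq (assumed default f_b)."""
    return k_over_keq * math.exp(f_b) / 14.0

def bbks_transfer(q):
    """BBKS fitting formula (11.64) as a function of q."""
    if q <= 0.0:
        return 1.0  # limiting value for q -> 0
    poly = 1 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return math.log(1 + 2.34 * q) / (2.34 * q) * poly ** -0.25

if __name__ == "__main__":
    for k_over_keq in (0.01, 0.1, 1.0, 10.0, 100.0):
        T = bbks_transfer(scaled_wavenumber(k_over_keq))
        print(f"k/k_eq = {k_over_keq:7.2f}   T = {T:.4e}")
```

For large q the fit falls off as ln(2.34q)/q^2 up to a constant, which is the same k^-2 ln k behaviour as the crude estimate (11.63).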
According to the currently favoured picture, the universe becomes dark energy
dominated as we approach the present time. The equation-of-state parameter w
becomes negative and Φ begins to decay, so the growth of the density perturbations
is damped. This effect is not very big up until today (and we shall not calculate
it now), since the universe has expanded by less than a factor of 2 after the onset
of dark energy domination, but it is important in detailed matching of observations
and theory.
We have calculated everything using linear perturbation theory. This breaks
down when the perturbations become large, |δ(x)| ∼ 1. We say that the perturbation
becomes nonlinear. This has happened for scales $k^{-1} \lesssim 10$ Mpc by now. When
the perturbation becomes nonlinear, i.e. an overdense region becomes about twice
as dense as the average density of the universe, it collapses rapidly, and forms a
gravitationally bound structure, such as a galaxy or a cluster of galaxies. Further
collapse is prevented by the angular momentum of the structure. Stars and gas and
CDM particles in a galaxy orbit around the center of mass of the bound structure,
and galaxies in galaxy groups and clusters have more complicated orbits around
each other. Underdense regions start to depart from the linear behaviour when they
are roughly half as dense as the background. Such regions become ever emptier, as
they expand faster than the background.
For $\mathcal{P}_\mathcal{R}(k) \propto k^{n-1}$ we have $P_\delta(k) \propto k^n$. This is the reason for the −1 in the definition
of the spectral index in terms of $\mathcal{P}_\mathcal{R}$: it was originally defined in terms of $P_\delta$.
We might ask why inflation generates a scale-invariant spectrum – not the math-
ematical reason (we calculated that in the previous chapter) but the physical idea.
During inflation the universe is close to a de Sitter universe, with the metric
\[ ds^2 = -dt^2 + e^{2Ht}\left(dx^2 + dy^2 + dz^2\right) \]
with H = const. The de Sitter universe is an example of a maximally symmetric
spacetime. In addition to being homogeneous (in the space directions), it also looks
the same at all times. (This is not obvious from the metric, just like spatial ho-
mogeneity is not obvious from the metric for FRW universes with non-zero spatial
curvature.) Therefore, modes of different wavelength get the same perturbations
imprinted on them regardless of when they leave the horizon.
We would now like to see how the scale-invariance relates to the density pertur-
bation. The power spectrum of density perturbations is
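\[
\mathcal{P}_\delta(k) = \frac{4}{25}\left(\frac{k}{aH}\right)^4 T(k)^2\,\mathcal{P}_\mathcal{R}(k) , \tag{11.67}
\]
and for the gravitational potential we have
\[
\mathcal{P}_\Phi(k) = \frac{9}{25}\,\mathcal{P}_\mathcal{R}(k)\,T(k)^2 = \text{constant for } k < k_{eq} . \tag{11.68}
\]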
We see that perturbations in the gravitational potential are scale invariant (apart
from the transfer function), but perturbations in density are not. Instead the density
perturbation spectrum is steeply rising on small scales, meaning that there is more
structure at small scales than at large scales. Thus the scale invariance refers to
the metric perturbations. The density perturbation then turns at ∼ keq to become
almost flat (growing ∼ ln k) at small scales, due to the inhibition of the growth
of density perturbations during the radiation-dominated era. We can also say that
the scale-invariance refers to the density perturbations as they enter the horizon,
i.e. density perturbations on all scales enter the horizon with the same amplitude
$(2/5)A \approx 2 \times 10^{-5}$.
The relation between density and gravitational potential perturbations reflects
the nature of gravity: a 1% overdense region 100 Mpc across generates a much
deeper potential well than a 1% overdense region 10 Mpc across, since the former
has 1000 times more mass. Therefore we need much stronger density perturbations
at smaller scales to have an equal contribution to Φ.
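The scaling can be made explicit with a rough Newtonian estimate (a sketch, ignoring factors of order unity). For a region of comoving size R with density contrast δ, the excess mass and the depth of the potential well scale as

```latex
\[
\delta M \sim \delta\,\bar\rho\,R^3 , \qquad
\Phi \sim \frac{G\,\delta M}{R} \sim G\,\bar\rho\,\delta\,R^2 ,
\qquad\text{so}\qquad
\frac{\Phi(100\ \mathrm{Mpc})}{\Phi(10\ \mathrm{Mpc})} \sim \left(\frac{100}{10}\right)^2 = 100
\quad\text{at equal } \delta .
\]
```

This is the real-space counterpart of the factor $(k/aH)^2$ relating $\delta_k$ to $\Phi_k$ in (11.61).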
Thus the perturbations get rapidly stronger on smaller scales, down to $k_{eq}^{-1} \sim$
100 Mpc. The ∼ 100 Mpc scale appears indeed quite prominent in large scale struc-
ture surveys, like the 2dFGRS and SDSS galaxy distribution surveys. Towards
smaller scales the structure keeps getting stronger, but now quite slowly. However,
the perturbations are now so large that first order perturbation theory begins to
fail, and that limit is crossed at around $k^{-1} \sim 10$ Mpc. Nonlinear effects cause the
density power spectrum to rise more steeply than calculated by perturbation theory
on scales smaller than this.
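The present-day density power spectrum $\mathcal{P}_\delta(k)$ can be determined observationally
from the distribution of galaxies (Fig. 5). The quantity plotted is usually
$P_\delta(k) \equiv (2\pi^2/k^3)\,\mathcal{P}_\delta(k)$. It should go as
\[
\begin{aligned}
P_\delta(k) &\propto k^n && \text{for } k \ll k_{eq} \\
P_\delta(k) &\propto k^{n-4}\ln^2 k && \text{for } k \gg k_{eq} .
\end{aligned}
\tag{11.69}
\]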
See figure 6.
11.10 Free-streaming
We earlier presented a simple argument for why dark matter is needed, based on
the $10^{-5}$ amplitude of the observed CMB anisotropies. Because baryons are tightly
coupled with photons at the time of last scattering, their density contrast $\delta_b$ is also
$\sim 10^{-5}$, and since density perturbations grow only linearly with the scale factor, an
expansion factor of ∼ 1000 is not enough to produce non-linear perturbations. However,
the density contrast of dark matter, which is not coupled to the baryons, grows
logarithmically during the radiation-dominated era, and so a factor of one thousand
amplification is enough to give non-linear structures today.
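In round numbers (a sketch assuming growth strictly proportional to the scale factor since last scattering), the baryon-only estimate reads

```latex
\[
\delta_b(t_0) \sim \delta_b(t_{dec}) \times \frac{a_0}{a_{dec}}
\sim 10^{-5} \times 10^{3} = 10^{-2} \ll 1 ,
\]
```

which falls short of the non-linear regime by about two orders of magnitude.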
With the more detailed look above, we note that even without the transfer func-
tion, the amplitude of the density perturbation, unlike the gravitational potential,
depends on the scale. The conclusion that non-linear baryonic structures on the
presently observed scales could not have formed without dark matter is correct, but
the argument is a bit more subtle. Perturbations on comoving length scale R become
non-linear when their density contrast becomes of order unity. The density contrast
smoothed on a ball of radius R around the point x is
\[
\delta(\mathbf{x}, R) \equiv \frac{1}{V}\int W\!\left(\frac{|\mathbf{x}' - \mathbf{x}|}{R}\right)\delta(\mathbf{x}')\,d^3x' , \tag{11.70}
\]
where W(y), the window function, is some function which falls off rapidly as y > 1,
i.e., |x − x′| > R, and $V \equiv \int d^3x\, W(x/R)$ is the volume of W. A typical choice of
where $\langle\,\rangle$ stands for the spatial average. As we are considering the linear density field,
this is just the average over the background space,
$\langle\delta(\mathbf{x},R)^2\rangle = \left(\int d^3x\right)^{-1}\int d^3x\,\delta(\mathbf{x},R)^2$.
Structures start forming on comoving scale R when σ(R), which grows linearly with
the scale factor, reaches unity. Doing a Fourier transform, we can write the mean
square density contrast as
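\[
\sigma^2(R) = \int_0^\infty \frac{dk}{k}\,\mathcal{P}_\delta(k,t)\,W(kR)^2 , \tag{11.72}
\]
where for a Gaussian window function we have $W(kR) = e^{-\frac{1}{2}k^2R^2}$. If the spectrum
of density perturbations were a power law, $\mathcal{P}(k) = Ak^{n+3}$, we would have (exercise)
\[
\sigma^2(R) = \frac{1}{2}\,\Gamma\!\left(\frac{n+3}{2}\right)\mathcal{P}_\delta(R^{-1}) . \tag{11.73}
\]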
So the mean square density contrast on a given comoving scale R would be
roughly given by the value of the power spectrum at k = R−1 . The real power
spectrum is more complicated because of the transfer function, but qualitatively it
is still true that the amplitude of density perturbations on a given scale is roughly
given by the power spectrum on that scale.
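The power-law result (11.73) can be checked numerically. The sketch below integrates (11.72) with a Gaussian window using the midpoint rule in ln k and compares it with the closed form; the function names, the integration range and the step count are choices made here for illustration, not part of the notes.

```python
import math

def sigma2_numeric(n, R, A=1.0, steps=20000):
    """Integrate (11.72) for P(k) = A k^(n+3) with W(kR)^2 = exp(-k^2 R^2).

    Uses the midpoint rule in u = ln(kR), where dk/k = du. The range
    [-30, 3] is an assumption that is adequate for n > -3.
    """
    lo, hi = -30.0, 3.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        u = lo + (i + 0.5) * h
        k = math.exp(u) / R
        total += A * k ** (n + 3) * math.exp(-(k * R) ** 2) * h
    return total

def sigma2_analytic(n, R, A=1.0):
    """Closed form (11.73): (1/2) Gamma((n+3)/2) P(1/R)."""
    return 0.5 * math.gamma((n + 3) / 2) * A * R ** (-(n + 3))

if __name__ == "__main__":
    for n in (0.0, 1.0, 2.0):
        print(n, sigma2_numeric(n, R=2.0), sigma2_analytic(n, R=2.0))
```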
If the transfer function were to continue to have the $k^{-2}\ln k$ behaviour for very
large k without limit, we would have $\mathcal{P}_\delta(k) \sim k^{n-1}(\ln(k/k_{eq}))^2$. So if n ≥ 1, the
power spectrum would reach non-linear values at all times, if we just go to small
enough scales. So we would always have non-linear structures, albeit on very small
scales! However, the radiation-dominated era after inflation has a finite duration,
so the amount of logarithmic growth is limited. There is also another effect which
wipes out structure on small scales, namely the motion of the dark matter particles,
called free-streaming.
Even CDM has a finite temperature, which means that dark matter particles
have thermal motions, and this smooths density perturbations below some scale,
as particles from overdense and underdense regions mix and balance the density
perturbations out. For CDM, the transfer function is modified by the term $e^{-k^2/k_{fs}^2}$
for $k \gg k_{fs}$, where $k_{fs}$ is the free-streaming scale, related to the distance the dark
matter particles have moved since decoupling. For $k < k_{fs}$, structure formation is
unaffected, but on small scales, perturbations are highly suppressed. The smallest
scale on which structures form is given by the free-streaming length, which for a
WIMP is approximately [4]
\[
k_{fs}^{-1} \approx \left(\frac{m}{100\ \mathrm{GeV}}\right)^{-1/2}\left(\frac{T_D}{30\ \mathrm{MeV}}\right)^{-1/2} \mathrm{pc} , \tag{11.74}
\]
where m is the mass of the dark matter particle and TD is its decoupling temperature.
The smallest structures for a typical WIMP are therefore of comoving length 1 pc.
They form around a redshift of z = 40 . . . 60.
For warm dark matter, the free-streaming scale is larger, so structures on larger
scales are wiped out. For example, for sterile neutrinos (a prominent WDM can-
didate), the transfer function is instead modified approximately with the term
$[1 + (k/k_{fs})^2]^{-5}$, with [5]
\[
k_{fs} \approx \frac{m}{500\ \mathrm{eV}}\ \mathrm{Mpc}^{-1} . \tag{11.75}
\]
If the sterile neutrino mass were 500 eV, all structures on comoving scales smaller
than a Mpc would have been suppressed, in drastic conflict with observations. How-
ever, for a mass of say 5 keV, galaxies still form, but smaller structures are sup-
pressed, which may help to explain why there are fewer observed satellites of the
Milky Way than predicted in CDM models11 . Viewed from another perspective,
observations of structures can be used to constrain particle physics dark matter
models.
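To make the two mass examples concrete, here is a small Python sketch of the suppression factor built from (11.75) and the term $[1 + (k/k_{fs})^2]^{-5}$. The function names are illustrative, and the factor is only the approximate modification quoted above, not a full transfer function.

```python
def k_fs_sterile(m_eV):
    """Free-streaming wavenumber in Mpc^-1 for a sterile neutrino of mass m (eV), from (11.75)."""
    return m_eV / 500.0

def wdm_suppression(k, m_eV):
    """Approximate factor [1 + (k/k_fs)^2]^-5 multiplying the transfer function (k in Mpc^-1)."""
    return (1.0 + (k / k_fs_sterile(m_eV)) ** 2) ** -5

if __name__ == "__main__":
    for m in (500.0, 5000.0):
        print(f"m = {m:6.0f} eV: factor at k = 1/Mpc is {wdm_suppression(1.0, m):.4f}")
```

At k = 1 Mpc^-1 the factor is 1/32 for m = 500 eV, while for m = 5 keV it stays within about 5% of unity, in line with the statement that galaxy-scale structure survives in the heavier case.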
11 Note that $k_{fs}^{-1}$ is the comoving scale of the linear perturbation from which the structure formed.
The corresponding actual size of the structure today is smaller, because structures contract and
then stop expanding when they form, whereas in linear theory they would have been stretched
linearly with the scale factor.
REFERENCES 187
Figure 4: The whole picture of structure formation theory from quantum fluctuations
during inflation to the present-day power spectrum at t0 .
References
[1] J.M. Bardeen, J.R. Bond, N. Kaiser and A.S. Szalay, The statistics of peaks of
Gaussian random fields, Astrophys. J. 304, 15 (1986).
[2] J. Richard Gott III et al., A Map of the Universe, Astrophys. J. 624, 463 (2005),
astro-ph/0310571.
[3] M. Tegmark et al., Cosmological Constraints from the SDSS Luminous Red
Galaxies, Phys. Rev. D74, 123507 (2006), arXiv:astro-ph/0608632.
[4] A.M. Green, S. Hofmann and D.J. Schwarz, MNRAS 353, L23 (2004),
arXiv:astro-ph/0309621.
[5] S.H. Hansen, J. Lesgourgues, S. Pastor and J. Silk, MNRAS 333, 544 (2002),
arXiv:astro-ph/0106108.
Figure 5: Distribution of galaxies according to the Sloan Digital Sky Survey (SDSS). This
figure shows galaxies that are within 2° of the equator and closer than 858 Mpc (assuming
H0 = 71 km/s/Mpc). Figure from astro-ph/0310571[2].
[Figure 6 panels. Top: $k^3 P(k)/2\pi^2$ against k [Mpc$^{-1}$]. Bottom: P(k) [Mpc$^3$] against k [Mpc$^{-1}$].]
Figure 6: The matter power spectrum from the SDSS obtained using luminous red galaxies
[3]. The top figure shows $\mathcal{P}_\delta(k)$ and the bottom figure $P_\delta(k)$. A Hubble constant value
H0 = 71.4 km/s/Mpc has been assumed for this figure. (These galaxy surveys only obtain
the scales up to the Hubble constant, and therefore the observed $P_\delta(k)$ is usually shown in
units of h Mpc$^{-1}$, so that no value for H0 needs to be assumed.) The black bars are the
observations and the red curve is a theoretical fit, from linear perturbation theory, to the
data. The bend in P(k) at $k_{eq} \sim 0.01$ Mpc$^{-1}$ is clearly visible in the bottom figure. Linear
perturbation theory fails when $\mathcal{P}(k) \gtrsim 1$, and therefore the data points do not follow the
theoretical curve to the right of the dashed line (representing an estimate of how far linear
theory can be trusted). Figure by R. Keskitalo.
12 Cosmic microwave background
The cosmic microwave background (CMB) is isotropic to a high degree. This tells
us that the early universe was very homogeneous at t = tdec , when the CMB was
formed. However, with precise measurements we can detect the small anisotropy of
the CMB, which reflects the small perturbations in the early universe.
This anisotropy was first detected by the COBE satellite in 1992, which mapped
the whole sky in three microwave frequencies. The angular resolution of COBE was
rather poor, 7°, so only features larger than this were detected. Measurements with
better resolution, but covering only small parts of the sky, were then performed
using instruments carried by balloons to the upper atmosphere, and ground-based
detectors located at high altitudes.
The next full-sky map of the CMB was made by the Wilkinson Microwave
Anisotropy Probe (WMAP) satellite, in orbit around the L2 point of the Sun-Earth
system, 1.5 million kilometers from the Earth in the direction opposite to the Sun.
The satellite was launched by NASA in June 2001, and the results of the first year
of measurements were published in February 2003. The WMAP satellite made eight
years of measurements, and the data from the first seven have been made public.
The Planck satellite was launched by ESA in May 2009, and the first cosmological
results were released in March 2013. The polarisation data has not yet been released;
it is expected to be made public in December 2014.
Figure 1: The cosmic microwave background according to the DMR instrument aboard the
COBE satellite.
Figure 4: The observed CMB temperature anisotropy gets a contribution from the last
scattering surface, (δT /T )intr = Θ(tdec , xls , n̂) and from along the photon’s journey to us,
(δT /T )jour .
center of this sphere, which extends away from us both in space and in time.
The observed temperature anisotropy is due to two contributions, an intrinsic
temperature variation at the surface of last scattering and a variation in the redshift
the photons have suffered during their journey to us,
\[
\left(\frac{\delta T}{T}\right)_{obs} = \left(\frac{\delta T}{T}\right)_{intr} + \left(\frac{\delta T}{T}\right)_{jour} . \tag{12.1}
\]
See figure 4. There are two ways to define what we mean by the CMB perturbation
δT. The first way is to just take the angular average of the temperature field and
call this the mean, $\bar{T} \equiv T_0 \equiv \frac{1}{4\pi}\int d\Omega\, T$, and define the anisotropy as the difference
from the mean, δT = T − T0. This is the physically most correct way. However, in
the context of perturbed FRW models, it can be simpler to call the temperature in
the background model the mean temperature. The perturbations also contribute to
the mean temperature, so this is a bit misleading, but common. We will also use
the notation $\delta T/T$ instead of $\delta T/\bar{T}$ or $\delta T/T_0$, as is common, but it should be understood
that the temperature in the denominator is the mean temperature. (Of course, this
would only make a difference at second order.)
The first term in (12.1), $(\delta T/T)_{intr}$, represents the temperature variation of the
photon gas at t = tdec . (It also includes the Doppler effect from the motion of this
photon gas.) At that time the largest scales we see on the CMB sky were still outside
the horizon. The separation of δT /T into two components is gauge-dependent. If
the time slice t = tdec dips further into the past in some location, it finds a higher
temperature, but the photons from there also have then a longer way to go and suffer
a larger redshift, so the two effects balance each other. We can calculate in any gauge
we want, getting different results for (δT /T )intr and (δT /T )jour depending on the
gauge, but their sum (δT /T )obs is gauge-independent, because it is an observed
quantity.
One might think that (δT /T )intr should be equal to zero, since in our earlier dis-
cussion of recombination and decoupling we identified decoupling with a particular
temperature Tdec ∼ 3000 K. This kind of thinking corresponds to a particular gauge
choice where the t = tdec time slice coincides with the T = Tdec hypersurface. In
this gauge (δT /T )intr = 0, except for the Doppler effect (we are not going to use this
gauge). Anyway, it is not true that all photons have their last scattering exactly
when T = Tdec. Rather, the last scatterings occur during a rather large temperature
interval and time period. The zeroth-order (background) time evolution of the temperature of
the photon distribution is the same before and after last scattering, $T \propto a^{-1}$, so it
does not matter how we draw the artificial separation line, the time slice t = tdec
separating the fluid and free particle treatment of the photons. See figure 5.
Figure 5: Depending on the gauge, the Tdec = const. surface may, or (usually) may not
coincide with the t = tdec time slice.
Summing over the m corresponding to the same multipole number ℓ we have the
closure relation
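\[
\sum_m |Y_{\ell m}(\theta,\phi)|^2 = \frac{2\ell+1}{4\pi} . \tag{12.5}
\]
We will also use the expansion of a plane wave in terms of spherical harmonics,
\[
e^{i\mathbf{k}\cdot\mathbf{x}} = 4\pi\sum_{\ell m} i^\ell j_\ell(kx)\,Y_{\ell m}(\hat{\mathbf{x}})\,Y^*_{\ell m}(\hat{\mathbf{k}}) . \tag{12.6}
\]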
Here x̂ and k̂ are the unit vectors in the directions of x and k, and jℓ is the spherical
Bessel function.
\[ \langle a_{\ell m}\rangle = 0 , \tag{12.7} \]
and the quantity we want to calculate from theory is the variance $\langle|a_{\ell m}|^2\rangle$ to get a
prediction for the typical size of the $a_{\ell m}$. The isotropic nature of the random process
shows up in the $a_{\ell m}$ so that these expectation values depend only on ℓ, not m. (The
ℓ are related to the angular size of the anisotropy pattern, whereas the m are related
to “orientation” or “pattern”.) Since $\langle|a_{\ell m}|^2\rangle$ is independent of m, we can define
\[
C_\ell \equiv \langle|a_{\ell m}|^2\rangle = \frac{1}{2\ell+1}\sum_m \langle|a_{\ell m}|^2\rangle . \tag{12.8}
\]
This function Cℓ (of integers ℓ ≥ 1) is called the (theoretical) angular power spectrum.
It is analogous to the power spectrum P(k) of density perturbations. For
Gaussian perturbations, Cℓ contains all the statistical information about the CMB
temperature anisotropy. This is all we can predict from theory. Thus analysis of
the CMB anisotropy consists of calculating the angular power spectrum from the
observed CMB and comparing it to the Cℓ predicted by theory2 .
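The comparison between a theoretical Cℓ and the power measured on a single sky can be illustrated with a toy Python sketch. It assumes Gaussian perturbations and, as a simplification made here, treats the 2ℓ + 1 coefficients of a given ℓ as independent real Gaussian numbers (as in a real spherical-harmonic basis). The function names are illustrative; the relative variance 2/(2ℓ + 1) used below is the standard result for an average of 2ℓ + 1 squared Gaussian variables.

```python
import math
import random

def draw_alm(ell, C_ell, rng):
    """Draw 2*ell+1 real Gaussian coefficients with variance C_ell (toy model of one sky)."""
    return [rng.gauss(0.0, math.sqrt(C_ell)) for _ in range(2 * ell + 1)]

def estimate_C(alm):
    """Observed angular power: average of |a_lm|^2 over the 2*ell+1 values of m."""
    return sum(abs(a) ** 2 for a in alm) / len(alm)

def relative_variance(ell):
    """Relative variance of the estimate for Gaussian a_lm: 2 / (2*ell + 1)."""
    return 2.0 / (2 * ell + 1)

if __name__ == "__main__":
    rng = random.Random(1)
    for ell in (2, 10, 100):
        one_sky = estimate_C(draw_alm(ell, 1.0, rng))
        scatter = math.sqrt(relative_variance(ell))
        print(f"ell = {ell:3d}: estimate on one sky = {one_sky:.3f}, expected scatter = {scatter:.3f}")
```

The scatter shrinks with ℓ because more values of m are averaged, which is why the lowest multipoles are the least well determined from a single sky.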
2 In addition to the temperature anisotropy, the CMB also has another property, its polarisation.
There are two additional power spectra related to the polarisation, $C_\ell^{EE}$ and $C_\ell^{BB}$, and one related
to the correlation between temperature and polarisation, $C_\ell^{TE}$. The spectra $C_\ell^{EE}$ and $C_\ell^{TE}$ have
been measured, while there is thus far no detection of a non-zero $C_\ell^{BB}$, only an upper bound. A
detection would indicate the presence of primordial gravitational waves. In the simplest inflationary
models, such as the $m^2\varphi^2$ model, the amplitude of the gravitational waves produced during inflation
is large enough that it should be seen by Planck. In many other models, the amplitude is too small
to be detected by CMB experiments in the near future.
Figure 6: The three lowest multipoles ℓ = 1, 2, 3 of spherical harmonics. Left column: Y10 ,
Re Y11 , Im Y11 . Middle column: Y20 , Re Y21 , Im Y21 , Re Y22 , Im Y22 . Right column: Y30 ,
Re Y31 , Im Y31 , Re Y32 , Im Y32 , Re Y33 , Im Y33 . Figure by Ville Heikkilä.
Just like the three-dimensional density power spectrum P(k) gives the contribution
of scale k to the density variance $\langle\delta(\mathbf{x})^2\rangle$, the angular power spectrum Cℓ is
related to the contribution of multipole ℓ to the temperature variance,
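\[
\begin{aligned}
\left\langle\left(\frac{\delta T(\theta,\phi)}{T}\right)^2\right\rangle
&= \left\langle \sum_{\ell m} a_{\ell m}Y_{\ell m}(\theta,\phi) \sum_{\ell'm'} a^*_{\ell'm'}Y^*_{\ell'm'}(\theta,\phi) \right\rangle \\
&= \sum_{\ell\ell'}\sum_{mm'} Y_{\ell m}(\theta,\phi)\,Y^*_{\ell'm'}(\theta,\phi)\,\langle a_{\ell m}a^*_{\ell'm'}\rangle \\
&= \sum_\ell C_\ell \sum_m |Y_{\ell m}(\theta,\phi)|^2 = \sum_\ell \frac{2\ell+1}{4\pi}\,C_\ell ,
\end{aligned}
\tag{12.10}
\]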
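\[
\begin{aligned}
\frac{1}{4\pi}\int\left(\frac{\delta T(\theta,\phi)}{T}\right)^2 d\Omega
&= \frac{1}{4\pi}\int d\Omega \sum_{\ell m} a_{\ell m}Y_{\ell m}(\theta,\phi) \sum_{\ell'm'} a^*_{\ell'm'}Y^*_{\ell'm'}(\theta,\phi) \\
&= \frac{1}{4\pi}\sum_{\ell m}\sum_{\ell'm'} a_{\ell m}a^*_{\ell'm'} \underbrace{\int Y_{\ell m}(\theta,\phi)\,Y^*_{\ell'm'}(\theta,\phi)\,d\Omega}_{\delta_{\ell\ell'}\delta_{mm'}} \\
&= \frac{1}{4\pi}\sum_\ell \underbrace{\sum_m |a_{\ell m}|^2}_{(2\ell+1)\hat{C}_\ell} \\
&= \sum_\ell \frac{2\ell+1}{4\pi}\,\hat{C}_\ell .
\end{aligned}
\tag{12.12}
\]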
Figure 7: The observed angular power spectrum $\hat{C}_\ell$ according to the Planck satellite.
The observational results are the data points, with error bars representative of the cosmic
variance. The solid curve is the theoretical Cℓ from the best-fit ΛCDM model, and the gray
band around it represents the cosmic variance corresponding to this Cℓ .
Contrast this with (12.10), which gives the variance of δT /T at an arbitrary location
on the sky over different realisations of the random process which produced the
primordial perturbations; whereas equation (12.12) gives the variance of δT /T of
our given sky over the celestial sphere.
\[ d_A \equiv \frac{\lambda_{phys}}{\theta} . \tag{12.17} \]
Likewise, we defined the comoving angular diameter distance $d_A^c$ by
\[ d_A^c \equiv \frac{\lambda^c}{\theta} , \tag{12.18} \]
where $\lambda^c = (1/a)\lambda_{phys} = (1+z)\lambda_{phys}$ is the corresponding comoving length. Thus
$d_A^c = (1/a)d_A = (1+z)d_A$. See figure 10.
Consider now the Fourier modes of our earlier perturbation theory discussion.
A mode with comoving wavenumber k has comoving wavelength λc = 2π/k. Thus
this mode should show up as a pattern on the CMB sky with angular size
\[
\theta_\lambda = \frac{\lambda^c}{d_A^c} = \frac{2\pi}{k d_A^c} = \frac{2\pi}{\ell} . \tag{12.19}
\]
For the last equality we used the relation (12.15). From it we get that the modes
with wavenumber k contribute mostly to multipoles around
\[ \ell = k d_A^c . \tag{12.20} \]
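The two relations (12.19) and (12.20) amount to a simple unit conversion, sketched below in Python. The distance of 14 000 Mpc used in the example is purely an illustrative assumption, not a value taken from these notes, and the function names are hypothetical.

```python
import math

def multipole_from_k(k, d_A_c):
    """Multipole around which a mode of comoving wavenumber k contributes, ell = k * d_A^c (12.20)."""
    return k * d_A_c

def angle_deg_from_multipole(ell):
    """Angular size theta = 2*pi/ell of (12.19), converted to degrees."""
    return math.degrees(2.0 * math.pi / ell)

if __name__ == "__main__":
    d_A_c = 14000.0  # Mpc, assumed value for illustration only
    for k in (0.001, 0.01, 0.1):  # Mpc^-1
        ell = multipole_from_k(k, d_A_c)
        print(f"k = {k} /Mpc -> ell ~ {ell:.0f}, angle ~ {angle_deg_from_multipole(ell):.2f} deg")
```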
2. The modes k are not wrapped around the sphere of last scattering, but the
wave vector forms a different angle with the sphere at different places.
3
In reality, there is no sharp cut-off at a particular ℓ, the observational error bars just blow up.
Figure 8: Randomly generated skies containing only a single multipole ℓ. Starting from top
left: ℓ = 1 (dipole only), 2 (quadrupole only), 3 (octopole only), 4, 5, 6, 7, 8, 9, 10, 11, 12.
Figure by Ville Heikkilä.
Figure 10: The comoving angular diameter distance relates the comoving size of an object
and the angle in which we see it.
The following precise discussion applies only for the case of a flat universe (K = 0
Friedmann model as the background), where one can Fourier expand functions on
a time slice. We start from the expansion of the plane wave in terms of spherical
harmonics, for which we have the result (12.6),
\[
e^{i\mathbf{k}\cdot\mathbf{x}} = 4\pi\sum_{\ell m} i^\ell j_\ell(kx)\,Y_{\ell m}(\hat{\mathbf{x}})\,Y^*_{\ell m}(\hat{\mathbf{k}}) , \tag{12.21}
\]
on the t = tdec time slice. We want the multipole expansion of the values of this
function on the last scattering sphere. See figure 11. These are the values f (xx̂),
where x ≡ |x| has a constant value, the (comoving) radius of this sphere. Thus
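\[
\begin{aligned}
a_{\ell m} &= \int d\Omega_x\, Y^*_{\ell m}(\hat{\mathbf{x}})\, f(x\hat{\mathbf{x}}) \\
&= \sum_{\mathbf{k}} \int d\Omega_x\, Y^*_{\ell m}(\hat{\mathbf{x}})\, f_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{x}} \\
&= 4\pi \sum_{\mathbf{k}} \sum_{\ell'm'} \int d\Omega_x\, f_{\mathbf{k}}\, Y^*_{\ell m}(\hat{\mathbf{x}})\, i^{\ell'} j_{\ell'}(kx)\, Y_{\ell'm'}(\hat{\mathbf{x}})\, Y^*_{\ell'm'}(\hat{\mathbf{k}}) \\
&= 4\pi i^\ell \sum_{\mathbf{k}} f_{\mathbf{k}}\, j_\ell(kx)\, Y^*_{\ell m}(\hat{\mathbf{k}}) ,
\end{aligned}
\tag{12.23}
\]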
\[
a_{\ell m} = \frac{4\pi i^\ell}{(2\pi)^{3/2}} \int d^3k\, f(\mathbf{k})\, j_\ell(kx)\, Y^*_{\ell m}(\hat{\mathbf{k}}) . \tag{12.24}
\]
The jℓ are oscillating functions with decreasing amplitude. For large values of ℓ
the position of the first (and largest) maximum is near kx = ℓ (see figure 12). Thus
the aℓm pick a large contribution from the Fourier modes k for which
kx ∼ ℓ . (12.25)
Figure 12: Spherical Bessel functions jℓ (x) for ℓ = 2, 3, 4, 200, 201, and 202. Note how
the first and largest peak is near x = ℓ (but to be precise, at a slightly larger value). Figure
by R. Keskitalo.
In a flat universe the comoving distance x (from our location to the sphere of last
scattering) and the comoving angular diameter distance $d_A^c$ are equal, so we can
write this result as
\[ k d_A^c \sim \ell . \tag{12.26} \]
The conclusion is that a given multipole ℓ acquires a contribution from modes
with a range of wavenumbers, but most of the contribution comes from near the value
given by (12.20). This concentration is tighter for larger ℓ. We will use equation
(12.20) for qualitative purposes.
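The statement that the first and largest peak of jℓ lies slightly above x = ℓ can be checked for small ℓ with a short Python sketch. It uses the standard upward recurrence starting from j0 and j1, which is adequate here but loses accuracy for x much smaller than ℓ; the function names, grid range and step are choices made for this illustration.

```python
import math

def sph_bessel(ell, x):
    """Spherical Bessel function j_ell(x) by upward recurrence (fine for small ell, x of order ell)."""
    j_prev = math.sin(x) / x
    if ell == 0:
        return j_prev
    j_curr = math.sin(x) / x ** 2 - math.cos(x) / x
    for n in range(1, ell):
        # j_{n+1}(x) = (2n+1)/x * j_n(x) - j_{n-1}(x)
        j_prev, j_curr = j_curr, (2 * n + 1) / x * j_curr - j_prev
    return j_curr

def first_peak(ell, step=1e-3):
    """Grid location of the largest maximum of j_ell between x = 0.5 and x = ell + 10."""
    n_points = int((ell + 10 - 0.5) / step)
    xs = [0.5 + i * step for i in range(n_points)]
    return max(xs, key=lambda x: sph_bessel(ell, x))

if __name__ == "__main__":
    for ell in (2, 3, 4):
        print(f"ell = {ell}: largest peak of j_ell near x = {first_peak(ell):.2f}")
```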
where the second line holds for an FRW model that contains only matter and vacuum
energy (Ω0 = Ωm0 +ΩΛ0 ). In the real universe, the contribution of radiation is small,
since the radiation-dominated era ends early, when the universe is around 50 000
years old. Recall that $\Omega_0 - 1 = -\Omega_{K0} = K/H_0^2$. We are interested in the distance
to the last scattering sphere, i.e. $d_A^c(z_{dec})$, where $1 + z_{dec} \approx 1090$.
In the simplest case of the spatially flat matter-dominated universe, ΩΛ0 = 0,
Ωm0 = 1, the integral gives
\[
d_A^c(z_{dec}) = H_0^{-1}\int_{\frac{1}{1+z_{dec}}}^{1} \frac{da}{\sqrt{a}}
= 2H_0^{-1}\left(1 - \frac{1}{\sqrt{1+z_{dec}}}\right) = 1.94H_0^{-1} \approx 2H_0^{-1} , \tag{12.28}
\]
where the last approximation corresponds to ignoring the contribution from the
lower limit.
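The number in (12.28) is easy to reproduce. The Python sketch below evaluates the closed form and, as a cross-check, integrates da/√a numerically with the midpoint rule in ln a. The function names are illustrative; the constant is c divided by 100 km/s/Mpc, i.e. the Hubble distance in units of h^-1 Mpc.

```python
import math

HUBBLE_DISTANCE = 2997.92458  # c / (100 km/s/Mpc): H0^-1 in units of h^-1 Mpc

def dA_c_closed_form(z):
    """Comoving angular diameter distance in units of H0^-1 for the flat matter-only model."""
    return 2.0 * (1.0 - 1.0 / math.sqrt(1.0 + z))

def dA_c_numeric(z, steps=10000):
    """Midpoint-rule integral of da/sqrt(a) from 1/(1+z) to 1, using x = ln a (da/sqrt(a) = e^{x/2} dx)."""
    lo = -math.log(1.0 + z)
    h = -lo / steps
    return sum(math.exp(0.5 * (lo + (i + 0.5) * h)) * h for i in range(steps))

if __name__ == "__main__":
    z_dec = 1089.0
    d = dA_c_closed_form(z_dec)
    print(f"d_A^c = {d:.4f} H0^-1 = {d * HUBBLE_DISTANCE:.0f} h^-1 Mpc")
    print(f"numerical check: {dA_c_numeric(z_dec):.6f} H0^-1")
```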
We also consider two more general situations, of which the above is a special
case.
a) Open universe with no dark energy, ΩΛ0 = 0 and Ωm0 = Ω0 < 1. Now we have
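\[
\begin{aligned}
d_A^c(z_{dec}) &= \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}}\,\sinh\!\left(\sqrt{1-\Omega_{m0}}\int_{\frac{1}{1+z_{dec}}}^{1}\frac{da}{\sqrt{(1-\Omega_{m0})a^2 + \Omega_{m0}a}}\right) \\
&= \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}}\,\sinh\!\left(\int_{\frac{1}{1+z_{dec}}}^{1}\frac{da}{\sqrt{a^2 + \frac{\Omega_{m0}}{1-\Omega_{m0}}a}}\right) \\
&= \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}}\,\sinh\!\left(2\,\mathrm{arsinh}\sqrt{\frac{1-\Omega_{m0}}{\Omega_{m0}}} - 2\,\mathrm{arsinh}\sqrt{\frac{1-\Omega_{m0}}{\Omega_{m0}}\frac{1}{1+z_{dec}}}\right) \\
&\approx \frac{H_0^{-1}}{\sqrt{1-\Omega_{m0}}}\,\sinh\!\left(2\,\mathrm{arsinh}\sqrt{\frac{1-\Omega_{m0}}{\Omega_{m0}}}\right) \\
&= 2\,\frac{H_0^{-1}}{\Omega_{m0}} ,
\end{aligned}
\tag{12.29}
\]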
Figure 13: The comoving proper distance $d_P^c(z=\infty)$ (dashed) and the comoving angular
diameter distance $d_A^c(z=\infty)$ (solid) to the horizon in a matter-only open universe. The
vertical axis is the distance in units of the Hubble distance $H_0^{-1}$ and the horizontal axis is the
density parameter Ω0 = Ωm0. The distances to last scattering, $d_P^c(z_{dec})$ and $d_A^c(z_{dec})$, are a
few per cent less.
where again the approximation ignores the contribution from the lower limit
(i.e., it actually gives the comoving angular diameter distance to the horizon,
$d_A^c(z=\infty)$). In the last step we used
$\sinh 2x = 2\sinh x\cosh x = 2\sinh x\sqrt{1+\sinh^2 x}$. We show this result (together
with the comoving proper distance $d_P^c(z=\infty)$) in figure 13.
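A quick numerical check of (12.29) is sketched below in Python: it evaluates the arsinh expression with and without the lower-limit term and compares the latter with 2/Ωm0. The function name and the optional argument z are choices made here for illustration; distances are in units of H0^-1.

```python
import math

def dA_c_open(omega_m, z=None):
    """Comoving angular diameter distance (units of H0^-1) in an open matter-only model.

    With z=None the lower-limit term is dropped, giving the distance to the horizon.
    Valid for 0 < omega_m < 1.
    """
    ratio = (1.0 - omega_m) / omega_m
    upper = 2.0 * math.asinh(math.sqrt(ratio))
    lower = 0.0 if z is None else 2.0 * math.asinh(math.sqrt(ratio / (1.0 + z)))
    return math.sinh(upper - lower) / math.sqrt(1.0 - omega_m)

if __name__ == "__main__":
    for omega_m in (0.1, 0.3, 0.9):
        horizon = dA_c_open(omega_m)
        last_scattering = dA_c_open(omega_m, z=1089.0)
        print(f"Omega_m0 = {omega_m}: horizon {horizon:.3f}, 2/Omega_m0 {2.0 / omega_m:.3f}, "
              f"last scattering {last_scattering:.3f}")
```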
The distance $d_A^c(z_{dec})$ depends on the expansion history of the universe. For one,
the longer it takes for the universe to cool from Tdec to T0 (i.e., to expand by the
factor 1 + zdec ), the longer distance the photons have time to travel. For spatially
curved universes the angular diameter distance gets an additional effect from the
geometry of the universe, which acts like a “lens” to make the distant CMB pattern
at the last scattering sphere to look smaller or larger (see figure 14).
Figure 14: The geometry effect in a closed (top) or an open (bottom) universe affects the
angle at which we see a structure of given size at the last scattering surface, and thus its
angular diameter distance.
subhorizon and which are superhorizon (at the time of last scattering). For that we
need to know the comoving Hubble scale aH at tdec .
We make the approximation that neutrinos are massless. The physical radiation
density today is then $\omega_r \equiv \Omega_{r0}h^2 \approx 4.18 \times 10^{-5}$, the photon contribution being
$\omega_\gamma \approx 2.47 \times 10^{-5}$. We also make the approximation that the universe was completely
matter-dominated at tdec , i.e. we ignore the radiation contribution to the Friedmann
equation at tdec . This is not a terribly good approximation, since
\[
\frac{\rho_m(t_{dec})}{\rho_r(t_{dec})} = \frac{\omega_m}{(1+z_{dec})\,\omega_r} \approx 22\,\omega_m \approx 2.6 \ldots 3.5 , \tag{12.31}
\]
for ωm = 0.12 . . . 0.16. The curvature and (for most dark energy models, including
vacuum energy) dark energy contributions are negligible at tdec . Thus we have
2 8πG
Hdec ≈ ρm = Ωm0 H02 (1 + zdec )3 , (12.32)
3
and we get for the comoving Hubble scale
H −1 1
−1
kdec −1
≡ (adec Hdec )−1 = (1 + zdec )Hdec = (1 + zdec )−1/2 √ 0 = √ 91h−1 Mpc ,
Ωm0 Ωm0
(12.33)
using 1 + zdec = 1090. The scale which is entering at t = tdec is thus
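For $\omega_m = 0.14$ we have $k_{\rm eq}^{-1} = 100$ Mpc. The corresponding multipole number is
$$ \ell_{\rm eq} = k_{\rm eq}\, d^c_A = 214\,\Omega_{m0} h \times
\begin{cases}
2/\Omega_{m0} = 430\,h & (\Omega_\Lambda = 0) \\
2/\Omega_{m0}^{0.4} \approx 430\,h\,\Omega_{m0}^{0.6} & (\Omega_0 = 1) .
\end{cases} \tag{12.38} $$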
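The numbers quoted in (12.31) and (12.33) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original notes; the function names are ours, and the only inputs are the values given in the text ($\omega_r = 4.18 \times 10^{-5}$, $1 + z_{\rm dec} = 1090$) together with the Hubble distance $H_0^{-1} = 2997.92\,h^{-1}$ Mpc from (12.108).

```python
import math

# Values quoted in the text
OMEGA_R = 4.18e-5          # physical radiation density (massless neutrinos)
ONE_PLUS_Z_DEC = 1090.0    # redshift of decoupling used in (12.33)
HUBBLE_DIST_H = 2997.92    # H0^-1 in units of h^-1 Mpc, from (12.108)


def matter_to_radiation_ratio(omega_m):
    """rho_m / rho_r at decoupling, eq. (12.31)."""
    return omega_m / (ONE_PLUS_Z_DEC * OMEGA_R)


def comoving_hubble_scale(omega_m0):
    """Comoving Hubble scale k_dec^-1 in h^-1 Mpc, eq. (12.33)."""
    return HUBBLE_DIST_H / math.sqrt(ONE_PLUS_Z_DEC * omega_m0)


if __name__ == "__main__":
    for omega_m in (0.12, 0.16):
        print("omega_m =", omega_m,
              " rho_m/rho_r =", round(matter_to_radiation_ratio(omega_m), 2))
    print("k_dec^-1 for Omega_m0 = 1:",
          round(comoving_hubble_scale(1.0), 1), "h^-1 Mpc")
```

The ratio comes out between roughly 2.6 and 3.5 for the quoted range of $\omega_m$, which is why treating decoupling as fully matter-dominated is only a crude approximation.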
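and that this separation is gauge dependent. We shall consider this in the longitudinal
gauge, since the second part, $(\delta T/T)_{\rm jour}$, the integrated redshift perturbation
along the line of sight, is easiest to calculate in this gauge. The calculation requires
more general relativity tools than we have available, so we just give the result.
$$
\begin{aligned}
\left(\frac{\delta T}{T}\right)_{\rm jour}
&= -\int_{\rm dec}^{o} d\Phi + \mathbf{v}_{\rm obs}\cdot\hat{\mathbf{n}} + \int_{\rm dec}^{o} dt \left( \dot\Phi + \dot\Psi - \frac{1}{2}\dot h_{ij}\hat n^i \hat n^j \right) \\
&= \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) - \Phi(t_0, \mathbf{0}) + \mathbf{v}_{\rm obs}\cdot\hat{\mathbf{n}} + \int_{\rm dec}^{o} dt \left( \dot\Phi + \dot\Psi - \frac{1}{2}\dot h_{ij}\hat n^i \hat n^j \right) \\
&\overset{\Psi\approx\Phi}{=} \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) - \Phi(t_0, \mathbf{0}) + \mathbf{v}_{\rm obs}\cdot\hat{\mathbf{n}} + 2\int_{\rm dec}^{o} dt\,\dot\Phi - \frac{1}{2}\hat n^i \hat n^j \int_{\rm dec}^{o} dt\,\dot h_{ij} ,
\end{aligned}
\tag{12.40}
$$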
where the integral is from (tdec , xls ) to (t0 , 0) along the path of the photon (a null
geodesic) and n̂ is a unit vector pointing in the direction the observer is looking
at. The observer’s location has been chosen as the origin 0. The term vobs · n̂
is the Doppler effect from the observer’s motion (which is assumed nonrelativistic,
|vobs | ≪ 1), where vobs is the observer’s velocity. The subscript ls in xls indicates that
Both the density perturbation δγ and the fluid velocity v are gauge dependent; we
use the longitudinal gauge only.
To make further progress we now
2. make the (crude) approximation that the universe is already matter dominated
at t = tdec .
For adiabatic perturbations we have
$$ \delta_b = \delta_c \equiv \delta_m = \frac{3}{4}\delta_\gamma . \tag{12.45} $$
The perturbations stay adiabatic only on superhorizon scales. Once the perturbation
has entered the horizon, different physics begins to act on different matter
components, so the adiabatic relation between their density perturbations is broken.
In particular, the baryon-photon perturbation is affected by photon pressure,
which damps its growth and causes it to oscillate, whereas the CDM perturbation
is unaffected and keeps growing. Since the baryon and photon components see the
same pressure, they evolve together and maintain their adiabatic relation until photon
decoupling. Thus, after horizon entry but before decoupling we have
$$ \delta_c \neq \delta_b = \frac{3}{4}\delta_\gamma . \tag{12.46} $$
At decoupling, the equality holds for scales larger than the photon mean free path
at tdec .
After decoupling, this connection between the photons and baryons is broken,
and the baryon density perturbation begins to approach the CDM density pertur-
bation,
$$ \delta_c \leftarrow \delta_b \neq \frac{3}{4}\delta_\gamma . \tag{12.47} $$
We shall return to these issues when we discuss the shorter scales in sections 12.6
and 12.7. But let us first consider the scales which are still superhorizon at tdec , so
that (12.45) applies.
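$$
\begin{aligned}
\left(\frac{\delta T}{T}\right)_{\rm obs}
&= -\frac{2}{3}\Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2\int_{\rm dec}^{o} \dot\Phi\, dt \\
&= \frac{1}{3}\Phi(t_{\rm dec}, \mathbf{x}_{\rm ls}) + 2\int_{\rm dec}^{o} \dot\Phi\, dt .
\end{aligned}
\tag{12.50}
$$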
This part of the CMB anisotropy is called the Sachs–Wolfe effect. The first part,
$\frac{1}{3}\Phi(t_{\rm dec}, \mathbf{x}_{\rm ls})$, is called the ordinary Sachs–Wolfe effect, and the second part,
$2\int \dot\Phi\, dt$, is called the integrated Sachs–Wolfe effect (ISW), since it involves integrating along
the line of sight. There are two contributions to the integrated Sachs–Wolfe effect,
the early Sachs–Wolfe effect and the late Sachs–Wolfe effect. The first is caused by
the effect of radiation at last scattering. In our approximation where we assume
that the universe is completely matter-dominated at t = tdec , this term is absent.
When dark energy becomes important at times close to today, Φ starts to evolve
again, which leads to the late ISW effect, which shows up as a rise in the smallest ℓ
of the angular power spectrum Cℓ . However, it is difficult to detect this effect due to
the large cosmic variance at small ℓ. The late ISW effect also leads to a correlation
between the CMB anisotropies and the galaxy distribution, which makes it easier to
detect its presence. The late ISW effect has been detected this way, from the cross-
correlation of the CMB and large scale structure. We shall now for a while ignore
the ISW, which for ℓ ≪ ℓH is expected to be smaller than the ordinary Sachs–Wolfe
effect.
For any such linear combination, the expectation value of its absolute value squared
is
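$$
\begin{aligned}
\left\langle \left| \sum_{\mathbf{k}} b_{\mathbf{k}} \mathcal{R}_{\mathbf{k}} \right|^2 \right\rangle
&= \sum_{\mathbf{k}} \sum_{\mathbf{k}'} b_{\mathbf{k}} b^*_{\mathbf{k}'} \langle \mathcal{R}_{\mathbf{k}} \mathcal{R}^*_{\mathbf{k}'} \rangle \\
&= \left(\frac{2\pi}{L}\right)^3 \sum_{\mathbf{k}} \frac{1}{4\pi k^3}\, \mathcal{P}_{\mathcal{R}}(k)\, |b_{\mathbf{k}}|^2 ,
\end{aligned}
\tag{12.56}
$$
where we used
$$ \langle \mathcal{R}_{\mathbf{k}} \mathcal{R}^*_{\mathbf{k}'} \rangle = \delta_{\mathbf{k}\mathbf{k}'} \left(\frac{2\pi}{L}\right)^3 \frac{1}{4\pi k^3}\, \mathcal{P}_{\mathcal{R}}(k) \tag{12.57} $$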
(the independence of the random variables Rk and the definition of the power spec-
trum P(k)).
Thus
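$$
\begin{aligned}
C_\ell &\equiv \frac{1}{2\ell + 1} \sum_m \langle |a_{\ell m}|^2 \rangle \\
&= \frac{16\pi^2}{25}\, \frac{1}{2\ell + 1} \sum_m \left(\frac{2\pi}{L}\right)^3 \sum_{\mathbf{k}} \frac{1}{4\pi k^3}\, \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2\, \left| Y^*_{\ell m}(\hat{\mathbf{k}}) \right|^2 \\
&= \frac{1}{25} \left(\frac{2\pi}{L}\right)^3 \sum_{\mathbf{k}} \frac{1}{k^3}\, \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2 .
\end{aligned}
\tag{12.58}
$$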
(Although all h|aℓm |2 i are equal for the same ℓ, we used the sum over m in order to
apply (12.5).) Replacing the sum with an integral, we get
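$$
\begin{aligned}
C_\ell &= \frac{1}{25} \int \frac{d^3k}{k^3}\, \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2 \\
&= \frac{4\pi}{25} \int_0^\infty \frac{dk}{k}\, \mathcal{P}_{\mathcal{R}}(k)\, j_\ell(kx)^2 ,
\end{aligned}
\tag{12.59}
$$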
we have
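$$ C_\ell = A^2\, \frac{4\pi}{25} \int_0^\infty \frac{dk}{k}\, j_\ell(kx)^2 = \frac{A^2}{25}\, \frac{2\pi}{\ell(\ell + 1)} , \tag{12.61} $$
since
$$ \int_0^\infty \frac{dk}{k}\, j_\ell(kx)^2 = \frac{1}{2\ell(\ell + 1)} . \tag{12.62} $$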
We can write this as
$$ \frac{\ell(\ell + 1)}{2\pi}\, C_\ell = \frac{A^2}{25} = \text{const. (independent of } \ell) \tag{12.63} $$
The reason why the angular power spectrum is customarily plotted as ℓ(ℓ + 1)Cℓ /2π
is that it makes the Sachs–Wolfe part of the Cℓ flat for a scale-invariant primordial
power spectrum PR (k).
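The integral (12.62), and with it the flatness of the plateau in (12.63), is easy to check numerically for the lowest multipoles. The sketch below is an illustration only, not part of the original notes. It uses the closed-form spherical Bessel functions $j_1$ and $j_2$, a midpoint rule in the variable $u = kx$ (the result does not depend on $x$), and the asymptotic estimate $1/(4u_{\rm max}^2)$ for the truncated tail; the function names, step size and cut-off are our choices.

```python
import math


def j1(u):
    """Spherical Bessel function j_1; series for small u avoids cancellation."""
    if u < 0.1:
        return u / 3.0 * (1.0 - u * u / 10.0)
    return math.sin(u) / u**2 - math.cos(u) / u


def j2(u):
    """Spherical Bessel function j_2; series for small u avoids cancellation."""
    if u < 0.1:
        return u * u / 15.0 * (1.0 - u * u / 14.0)
    return (3.0 / u**3 - 1.0 / u) * math.sin(u) - 3.0 * math.cos(u) / u**2


def bessel_integral(jl, u_max=400.0, h=0.01):
    """Integral of j_l(u)^2 du/u from 0 to infinity, cf. eq. (12.62)."""
    n = int(round(u_max / h))
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h              # midpoint rule
        total += jl(u) ** 2 / u
    # For large u, j_l(u)^2 averages to 1/(2 u^2), so the tail is ~ 1/(4 u_max^2)
    return total * h + 1.0 / (4.0 * u_max**2)


def sw_plateau(ell, jl, A=1.0):
    """l(l+1) C_l / 2 pi for a scale-invariant spectrum, eqs. (12.61), (12.63)."""
    c_ell = A**2 * 4.0 * math.pi / 25.0 * bessel_integral(jl)
    return ell * (ell + 1) * c_ell / (2.0 * math.pi)


if __name__ == "__main__":
    for ell, jl in ((1, j1), (2, j2)):
        print(ell, bessel_integral(jl), 1.0 / (2 * ell * (ell + 1)),
              sw_plateau(ell, jl))
```

For both multipoles the last column should agree with $A^2/25$, independent of $\ell$, as (12.63) states.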
Present data shows that the spectrum has a small red tilt, n = 0.96 ± 0.007,
as expected from the simplest inflationary models. Since the spectrum is close to
scale-invariant, determining the spectral index requires observations over a range of
scales. However, determining the overall amplitude is possible just by observing the
few lowest multipoles, known as the Sachs–Wolfe plateau. The COBE satellite saw
only up to about ℓ = 25, so the COBE data in figure 1 is completely in this region.
The amplitude is
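$$ \frac{\ell(\ell + 1)}{2\pi}\, \hat{C}_\ell \approx 10^{-10} . \tag{12.64} $$
This gives the amplitude of the primordial power spectrum as
$$ \mathcal{P}_{\mathcal{R}}(k) = A^2 \approx 25 \times 10^{-10} = \left(5 \times 10^{-5}\right)^2 . \tag{12.65} $$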
We already used this result (confirmed after COBE by other experiments) in chapter
10 as a constraint on the energy scale of inflation. Nowadays, the detailed structure
of the anisotropies has been measured: the latest data from Planck is shown in figure
7. Let us now discuss how the structure of peaks and troughs is generated.
We concentrate on the three first terms, which correspond to the situation at the
point (tdec , xls ) we are looking at on the last scattering sphere.
Before decoupling the photons are tightly coupled to the baryons. The per-
turbations in the baryon-photon fluid are oscillating, whereas CDM perturbations
grow (logarithmically during the radiation-dominated epoch, and then ∝ a during
the matter-dominated epoch). Therefore CDM perturbations begin to dominate the
total density perturbation δρ and thus also Φ already before the universe becomes
matter-dominated and CDM begins to dominate the background energy density.
Thus we can make the approximation that Φ is given by the CDM perturbation.
The baryon-photon fluid oscillates in these potential wells caused by the CDM. The
potential Φ evolves at first but then becomes constant as the universe becomes
matter dominated.
We will not do a full calculation of the δbγ oscillations in the expanding universe,
since that would require more general relativity tools than we have at our disposal.
One reason is that ρbγ is a relativistic fluid, and we gave the equation for the density
perturbation for a nonrelativistic fluid only (the Jeans equation). The nonrelativistic
perturbation equation for a fluid component i is (this follows from (11.50) when we
replace the baryonic density contrast with the total density contrast in the driving
term)
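$$ \ddot\delta_{\mathbf{k}i} + 2H\dot\delta_{\mathbf{k}i} = -\left(\frac{k}{a}\right)^2 \left[ c_s^2\, \delta_{\mathbf{k}i} + \Phi_{\mathbf{k}} \right] . \tag{12.67} $$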
The generalisation of the (subhorizon) perturbation equations to the case of a
relativistic fluid is considerably easier if we ignore the expansion of the universe.
Then (12.67) becomes
$$ \ddot\delta_{\mathbf{k}i} + k^2 \left[ c_s^2\, \delta_{\mathbf{k}i} + \Phi_{\mathbf{k}} \right] = 0 . \tag{12.68} $$
According to GR, the density of “passive gravitational mass” is ρ+p = (1+w)ρ, not
just ρ as in Newtonian gravity. Therefore the force on a fluid element of the fluid
In the present application the fluid component ρi is the baryon-photon fluid ρbγ
and the gravitational potential Φ is caused by the CDM. Before decoupling, the
adiabatic relation $\delta_b = \frac{3}{4}\delta_\gamma$ still holds between photons and baryons, and we have
the adiabatic relation between pressure and density perturbations,
We defined
$$ R \equiv \frac{3}{4}\frac{\bar\rho_b}{\bar\rho_\gamma} . \tag{12.72} $$
We can now write the perturbation equation (12.69) for the baryon-photon fluid as
or
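$$ \ddot\Theta_{0\mathbf{k}} + k^2 \left[ \frac{1}{3}\frac{1}{1 + R}\,\Theta_{0\mathbf{k}} + \frac{1}{3}\Phi_{\mathbf{k}} \right] = 0 , \tag{12.79} $$
or
$$ \ddot\Theta_{0\mathbf{k}} + c_s^2 k^2 \left[ \Theta_{0\mathbf{k}} + (1 + R)\Phi_{\mathbf{k}} \right] = 0 , \tag{12.80} $$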
If we now take R and Φk to be constant, this is the harmonic oscillator equation
for the quantity Θ0k + (1 + R)Φk with the general solution
or
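$$ \Theta_{0\mathbf{k}} + \Phi_{\mathbf{k}} = -R\Phi_{\mathbf{k}} + A_{\mathbf{k}} \cos(c_s k t) + B_{\mathbf{k}} \sin(c_s k t) , \tag{12.82} $$
or
$$ \Theta_{0\mathbf{k}} = -(1 + R)\Phi_{\mathbf{k}} + A_{\mathbf{k}} \cos(c_s k t) + B_{\mathbf{k}} \sin(c_s k t) . \tag{12.83} $$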
We are interested in the quantity $\Theta_0 + \Phi = \frac{1}{4}\delta_\gamma + \Phi$, called the effective temperature
perturbation, since this combination appears in (12.66). It is the local temperature
perturbation minus the redshift photons suffer when climbing from the potential well
of the perturbation (negative Φ for a CDM overdensity). We see that this quantity
oscillates in time, and the effect of baryons (via R) is to shift the equilibrium point
of the oscillation by −RΦk .
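The statement that (12.83) solves (12.80) for constant $R$ and $\Phi_{\mathbf{k}}$, and that baryons shift the equilibrium point of $\Theta_0 + \Phi$ to $-R\Phi_{\mathbf{k}}$, can be verified numerically. The sketch below is an illustration, not part of the original notes. It takes $c_s^2 = 1/[3(1 + R)]$, which is what comparing (12.79) with (12.80) implies, evaluates the left-hand side of (12.80) with a finite-difference second derivative, and averages $\Theta_0 + \Phi$ over one period. The parameter values in the usage part are arbitrary illustrative numbers, not values from the text.

```python
import math


def sound_speed(R):
    """c_s = 1/sqrt(3(1+R)), read off by comparing (12.79) and (12.80)."""
    return 1.0 / math.sqrt(3.0 * (1.0 + R))


def theta0(t, k, R, phi, A, B):
    """Solution (12.83) for constant R and Phi_k."""
    w = sound_speed(R) * k
    return -(1.0 + R) * phi + A * math.cos(w * t) + B * math.sin(w * t)


def residual(t, k, R, phi, A, B, dt=1e-4):
    """Left-hand side of (12.80); second derivative by central differences."""
    def f(s):
        return theta0(s, k, R, phi, A, B)
    acc = (f(t + dt) - 2.0 * f(t) + f(t - dt)) / dt**2
    return acc + (sound_speed(R) * k) ** 2 * (f(t) + (1.0 + R) * phi)


def mean_effective_temperature(k, R, phi, A, B, n=1000):
    """Average of Theta_0 + Phi over one oscillation period."""
    period = 2.0 * math.pi / (sound_speed(R) * k)
    samples = [theta0(i * period / n, k, R, phi, A, B) + phi for i in range(n)]
    return sum(samples) / n


if __name__ == "__main__":
    # Arbitrary illustrative values (assumptions, not taken from the notes)
    k, R, phi, A, B = 1.0, 0.6, -1.0, 0.3, 0.1
    print("residual of (12.80):", residual(2.0, k, R, phi, A, B))
    print("mean of Theta_0 + Phi:", mean_effective_temperature(k, R, phi, A, B),
          " expected -R*Phi =", -R * phi)
```

With a potential well ($\Phi < 0$) the mean is positive, so compressions into the well are enhanced relative to rarefactions; this is the origin of the odd/even peak asymmetry discussed later in the chapter.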
In the preceding we ignored the effect of the expansion of the universe. The
expansion affects the result in several ways. For example, cs , wbγ and R change
with time. The potential Φ also evolves, especially at early times when radiation
dominates the expansion. However, the qualitative result of an oscillation of Θ0 + Φ,
and the shift of its equilibrium point by baryons, remains. The time t in the solution
(12.82) gets replaced by conformal time η, and since cs changes with time, cs η is
replaced by
$$ r_s^c(t) \equiv \int_0^\eta c_s\, d\eta = \int_0^t \frac{c_s(t)}{a(t)}\, dt . \tag{12.84} $$
We call this quantity $r_s^c(t)$ the comoving sound horizon at time $t$, since it gives the
comoving distance sound waves have travelled by time $t$.
The relative weight of the cosine and sine solutions (i.e., the constants Ak and
Bk in (12.81)) depends on the initial conditions. Since the perturbations are initially
at superhorizon scales, the initial conditions are determined there, and the present
discussion does not really apply. However, using the superhorizon initial conditions
gives the correct qualitative result for the phase of the oscillation.
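We had that for adiabatic primordial perturbations, initially $\Phi = -\frac{3}{5}\mathcal{R}$ and
$\frac{1}{4}\delta_\gamma = -\frac{2}{3}\Phi = \frac{2}{5}\mathcal{R}$, giving us an initial condition $\Theta_0 + \Phi = \frac{1}{3}\Phi = -\frac{1}{5}\mathcal{R} = $ const.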
(At these early times R ≪ 1, so we can drop the factor 1 + R.) Thus adiabatic
primordial perturbations correspond essentially to the cosine solution. There are
effects at the horizon scale which affect the amplitude of the oscillations—the main
effect being the decay of Φ as it enters the horizon—so we can’t use the preceding
discussion to determine the amplitude, but we get the right result about the initial
phase of the Θ0 + Φ oscillations.
Thus we have that, qualitatively, the effective temperature behaves at subhorizon
scales as
$$ \Theta_{0\mathbf{k}} + (1 + R)\Phi_{\mathbf{k}} \propto \cos[k r_s^c(t)] , \tag{12.85} $$
Figure 15: Acoustic oscillations. The top panel shows the time evolution of the Fourier
amplitudes Θ0k , Φk , and the effective temperature Θ0k + Φk . The Fourier mode shown
corresponds to the fourth acoustic peak of the Cℓ spectrum. The bottom panel shows δbγ (x)
for one Fourier mode as a function of position at various times (maximum compression,
equilibrium level, and maximum decompression).
is the sound horizon angle, i.e., the angle at which we see the sound horizon on the
last scattering surface. This is the CMB anisotropy quantity which is determined
with most accuracy from the data. Analysis of the 5-year data from the WMAP
satellite and data from the ACBAR ground-based CMB experiment gives the model-
independent value θs = 0.593◦ ± 0.001◦ , a precision of 0.3% [1].
Because of these acoustic oscillations, the CMB angular power spectrum Cℓ has
a structure of acoustic peaks on subhorizon scales. The centers of these peaks are
located approximately at ℓn ≈ nℓA . An exact calculation shows that they actually
lie at somewhat smaller ℓ due to a number of effects. The separation of neighbouring
peaks is closer to ℓA than the positions of the peaks are to nℓA .
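As a quick illustration of the numbers involved, the acoustic scale can be computed from the measured sound horizon angle. This sketch is not part of the original notes. It assumes the definition $\ell_A \equiv \pi/\theta_s$ (the one used in the caption of figure 23) and the value $\theta_s = 0.593^\circ$ quoted above; the nominal positions $n\ell_A$ it lists are only the rough estimate, since the true peaks lie at somewhat smaller $\ell$.

```python
import math

THETA_S_DEG = 0.593   # sound horizon angle quoted in the text


def acoustic_scale(theta_s_deg):
    """l_A = pi / theta_s, with theta_s converted from degrees to radians."""
    return math.pi / math.radians(theta_s_deg)


def nominal_peak_positions(theta_s_deg, n_peaks=4):
    """Rough peak positions l_n ~ n * l_A (the true peaks lie somewhat lower)."""
    ell_a = acoustic_scale(theta_s_deg)
    return [n * ell_a for n in range(1, n_peaks + 1)]


if __name__ == "__main__":
    print("l_A =", round(acoustic_scale(THETA_S_DEG), 1))
    print("n * l_A =", [round(l) for l in nominal_peak_positions(THETA_S_DEG)])
```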
These acoustic oscillations involve motion of the baryon-photon fluid. When the
oscillation of one Fourier mode is at its extreme, e.g. at the maximal compression in
the potential well, the fluid is momentarily at rest, but then it begins flowing out of
the well until the other extreme, the maximal decompression, is reached. Therefore
those Fourier modes k which have the maximum effect on the CMB anisotropy via
the $\frac{1}{4}\delta_\gamma(t_{\rm dec}, \mathbf{x}_{\rm ls}) + \Phi(t_{\rm dec}, \mathbf{x}_{\rm ls})$ term (the effective temperature effect) in (12.66) have
the minimum effect via the −v · n̂(tdec , xls ) term (the Doppler effect) and vice versa.
Therefore the Doppler effect also contributes a peak structure to the Cℓ spectrum,
but its peaks are in the locations where the effective temperature contribution has
troughs.
The Doppler effect is subdominant to the effective temperature effect, and therefore
the peak positions in the Cℓ spectrum are determined by the effective temperature
effect, according to (12.88). The Doppler effect just partially fills the troughs
between the peaks, weakening the peak structure of Cℓ . (See figure 18. The calculation
involves some approximations which allow the description of Cℓ as just a sum
of these contributions, and is not as accurate as a full calculation using e.g. the
CAMB code; see footnote 8.)
Figure 16 shows the values of the effective temperature perturbation Θ0 + Φ (as
well as Θ0 and Φ separately) and the magnitude of the velocity perturbation (Θ1 ∼
v/3) at tdec as a function of the scale k. This is a result of a numerical calculation
which includes the effect of the expansion of the universe, but not diffusion damping.
We can estimate kD and ℓD as follows (see [2], page 129, for a bit more details).
Before decoupling photons are scattering from the electrons in the plasma. The
typical time between collisions (i.e. the photon mean free path) at time $t$ is
$\lambda_\gamma = t_c(t) = \Gamma^{-1} = (n_e(t)\sigma_T)^{-1}$, where $\sigma_T = \frac{8\pi}{3}\frac{\alpha^2}{m_e^2}$ is the Thomson cross-section. The
free electron density depends on the ionisation fraction x (see section 5.6). For
simplicity, we take x = 1. (In fact, the ionisation fraction drops, and tc grows,
rapidly during decoupling.) The photon direction changes randomly at each collision
and independently of the previous collision, so photons undergo a random walk with
uncorrelated steps. The number of steps a photon has taken up to time $t$ is
$N = t/t_c$ (taking $t_c$ to be constant for simplicity), and the total distance it has
travelled at decoupling is
$$ d_D = \sqrt{N}\, t_c = \sqrt{t_{\rm dec}\, t_c} \approx 14\ {\rm kpc} , \tag{12.93} $$
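where we have put in $t_{\rm dec} = 380\,000$ yr, $t_c = t_c(t_{\rm dec})$ and used $T_{\rm dec} = 3000$ K,
$\eta = 6 \times 10^{-10}$. The comoving diffusion scale (the inverse of the diffusion wavenumber) is given by
$$ k_D^{-1} = (1 + z_{\rm dec})\, d_D \approx 15\ {\rm Mpc} , \tag{12.94} $$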
where we have put in dcA (zdec ) = 13.8 Gpc (see section 12.9.2).
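The random-walk estimate (12.93) and (12.94) can be reproduced in SI units. The sketch below is an illustration, not part of the original notes, and rests on stated assumptions: full ionisation ($x = 1$), all baryons counted as hydrogen so that $n_e = \eta\, n_\gamma$, a blackbody photon density $n_\gamma = (2\zeta(3)/\pi^2)(k_B T/\hbar c)^3$, and $1 + z_{\rm dec} = 1090$. The relation $\ell_D \approx k_D\, d^c_A$ is assumed by analogy with (12.38). Constants are standard values and the function names are ours.

```python
import math

# Physical constants (SI)
C = 2.99792458e8           # speed of light, m/s
K_B = 1.380649e-23         # Boltzmann constant, J/K
HBAR = 1.054571817e-34     # reduced Planck constant, J s
SIGMA_T = 6.6524587e-29    # Thomson cross-section, m^2
ZETA3 = 1.2020569          # Riemann zeta(3)
YEAR = 3.15576e7           # seconds in a Julian year
KPC = 3.0857e19            # metres in a kiloparsec


def photon_number_density(T):
    """Blackbody photon number density in m^-3."""
    return 2.0 * ZETA3 / math.pi**2 * (K_B * T / (HBAR * C)) ** 3


def diffusion_length_kpc(T_dec=3000.0, eta=6e-10, t_dec_yr=3.8e5):
    """Physical diffusion length d_D = sqrt(t_dec * t_c), eq. (12.93)."""
    n_e = eta * photon_number_density(T_dec)   # assumes x = 1, hydrogen only
    t_c = 1.0 / (n_e * SIGMA_T * C)            # mean time between collisions, s
    t_dec = t_dec_yr * YEAR
    return C * math.sqrt(t_dec * t_c) / KPC


def comoving_diffusion_scale_mpc(d_D_kpc, one_plus_z_dec=1090.0):
    """k_D^-1 = (1 + z_dec) d_D, eq. (12.94), in Mpc."""
    return one_plus_z_dec * d_D_kpc / 1000.0


def diffusion_multipole(k_D_inv_mpc, d_A_mpc=13800.0):
    """l_D ~ k_D * d_A^c (assumed by analogy with (12.38))."""
    return d_A_mpc / k_D_inv_mpc


if __name__ == "__main__":
    d_D = diffusion_length_kpc()
    k_D_inv = comoving_diffusion_scale_mpc(d_D)
    print("d_D [kpc]:", d_D)
    print("k_D^-1 [Mpc]:", k_D_inv)
    print("l_D:", diffusion_multipole(k_D_inv))
```

With these inputs the result lands within about ten per cent of the quoted values, and the multipole comes out of order 1000, consistent with the order-of-magnitude statement in the text.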
8 CAMB is a publicly available code for precise calculation of the Cℓ spectrum. See
http://camb.info/ .
[Figure 16 (plot): panels showing Φ and Θ0, Θ0 + Φ, Θ1, (Θ0 + Φ)² and Θ1² as functions of k/H0 (0 to 1000), for ωm = 0.10, 0.20, 0.30.]
[Figure 17 (plot): undamped spectra and spectra including damping, for ωm = 0.20, 0.25, 0.30, 0.35; horizontal axis from 0 to 2000.]
Figure 17: The angular power spectrum Cℓ , calculated both with and without the effect of
diffusion damping. The spectrum is given for four different values of ωm , with ωb = 0.01.
(This is a rather low value of ωb , about half the real value, so ℓD < 1500 and the damping
is quite strong.) Figure and calculation by R. Keskitalo.
This calculation is rather approximate, because of the rapid growth of the pho-
ton mean free path (and the typical time between collisions) during recombination,
and a more accurate calculation involves an integral over time to take into account
this effect. However, the order of magnitude ℓD ∼ 1000 is correct, as we see from fig-
ure 17, which shows the result of a numerical calculation with and without diffusion
damping.
Of the cosmological parameters, the damping depends most strongly on ωb , since
increasing baryon density shortens the photon mean free path before decoupling.
Thus for larger ωb the damping moves to shorter scales, i.e. ℓD becomes larger.
The time evolution of λγ before decoupling, and the diffusion scale, is different
for different ωb . For small ωb , tc has already become quite large through the slow
dilution of the baryon density by the expansion of the universe, and the growth of
λγ relies less on the fast reduction of free electron density during recombination.
[Figure 18 (plot): the full spectrum and the separate contributions Θ0 + Φ, Θ1, ISW, the ISW cross terms ISW×(Θ0+Φ) and ISW×Θ1, and (Θ0+Φ)×Θ1; vertical axis labelled 2l(l+1)Cl/2π, horizontal axis l from 0 to 2000.]
Figure 18: The full Cℓ spectrum calculated for the cosmological model Ω0 = 1, ΩΛ = 0,
ωm = 0.2, ωb = 0.03, A = 1, n = 1, and the different contributions to it. Here Θ1 denotes
the Doppler effect. Figure and calculation by R. Keskitalo.
• τ optical depth
The angular diameter distance is a general model-independent quantity. In a
given FRW model, it is determined by the spatial curvature and the expansion
history, as we have discussed. In the ΛCDM model, where there is vacuum energy
and spatial curvature, the angular diameter distance can be replaced by these two
parameters, so we have
• Ω0 total density parameter
• τ optical depth
In addition to changing the angular diameter distance, vacuum energy and spatial
curvature also contribute to the CMB anisotropy via the ISW effect, as discussed
earlier. The decoupling and post-decoupling parameters add up to a total of seven
parameters. Since spatial curvature is not needed to explain the observations and
there is no indication for it, it is usually put to zero, i.e. Ω0 = 1. Usually references
to the ΛCDM model, or the “standard cosmological model”, or the concordance
model refer to the model parametrised by the six parameters above, without spatial
curvature. We will keep spatial curvature in the discussion in order to show what
effect it would have.
There are other possible cosmological parameters (“additional parameters”) which
might affect the Cℓ spectrum, e.g.
• mνi neutrino masses
We assume here that these additional parameters have no impact, i.e., they have
the “standard” values
$$ m_{\nu_i} = r = \frac{dn}{d\ln k} = B = A_{\rm cor} = 0 , \qquad w = -1 \tag{12.99} $$
to the accuracy which matters for Cℓ observations. Apart from the neutrino masses,
there is no sign in the present-day CMB data for non-zero values of these parameters.
On the other hand, significant deviations from zero can be consistent with the data,
and may be discovered by future CMB (and other) observations, in particular the
Planck satellite. The primordial isocurvature perturbations refer to the possibility
that the primordial scalar perturbations are not adiabatic, and therefore are not
completely determined by the comoving curvature perturbation R.
The assumption that these additional parameters have no impact leads to a
determination of the standard parameters with an accuracy that may be too op-
timistic, since the standard parameters may have degeneracies with the additional
parameters.
• ωb (not Ωb0 ) determines the baryon/photon ratio and thus e.g. the relative
heights of the odd and even peaks.
Ω0 = 1 ΩΛ0 = 0.7
A=1 ωm = 0.147
n=1 ωb = 0.022
τ = 0.1
The meaning of setting A = 1 is just that the resulting Cℓ still need to be multiplied
by the true value of $A^2$. (In this model the true value should be about $A = 5 \times 10^{-5}$
to agree with observations.) If we really had A = 1, perturbation theory of course
would not be valid! This is a relatively common practice: since the effect of changing
A is so trivial, it doesn’t make sense to plot Cℓ separately for different values of A.
from which we see that it depends on the three cosmological parameters H0 , Ω0 and
ΩΛ0 . Here Ω0 = Ωm0 + ΩΛ0 , so we could also say that it depends on H0 , Ωm0 , and
ΩΛ0 , but it is easier to discuss the effects of these different parameters if we keep
Ω0 as an independent parameter, instead of Ωm0 , since the “geometry effect” of the
curvature of space, which determines the relation between the comoving angular
diameter distance dcA and the comoving distance dc , is determined by Ω0 .
3. With a fixed ΩΛ0 , increasing Ω0 decreases dcA (tdec ), since it means increasing
Ωm0 , which has a decelerating effect on the expansion. With a fixed present
expansion rate H0 , deceleration means that expansion was faster earlier ⇒
universe is younger ⇒ there is less time for photons to travel as the uni-
verse cools from Tdec to T0 ⇒ last scattering surface is closer to us.
4. Increasing ΩΛ0 (with a fixed Ω0 ) increases dcA (tdec ), since it means a larger
part of the energy density is in dark energy, which has an accelerating effect
on the expansion. With fixed H0 , this means that expansion was slower in
the past ⇒ universe is older ⇒ more time for photons ⇒ last
scattering surface is further out ⇒ ΩΛ0 increases dcA (tdec ).
Here 2 and 3 work in the same direction: increasing Ω0 decreases dcA (tdec ), but the
geometry effect (2) is stronger. See figure 13 for the case ΩΛ0 = 0, where the dashed
line (the comoving distance) shows effect (3) and the solid line (the comoving angular
diameter distance) the combined effect (2) and (3).
However, now we have to take into account that, in our chosen parametrisation,
H0 is not an independent parameter, but
$$ H_0^{-1} \propto \sqrt{\frac{\Omega_0 - \Omega_{\Lambda 0}}{\omega_m}} , $$
so that via $H_0^{-1}$, $\Omega_0$ increases and $\Omega_{\Lambda 0}$ decreases $d^c_A(t_{\rm dec})$, which are the opposite
effects to those discussed above. For ΩΛ0 this opposite effect wins. See Figs. 21 and
22.
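The competing effects described above can be explored numerically. The sketch below is an illustration, not part of the original notes. It assumes that the comoving distance to last scattering is $H_0^{-1}\int_{a_{\rm dec}}^{1} da/\sqrt{\Omega_{\Lambda 0}a^4 + (1 - \Omega_0)a^2 + \Omega_{m0}a}$, i.e. the expansion law (12.106) with the radiation term dropped, and that the comoving angular diameter distance follows from it by the usual $\sin$/$\sinh$ curvature relation. The substitution $a = u^2$ removes the steep behaviour at small $a$. Function names, the Simpson step count and $1 + z_{\rm dec} = 1090$ are our choices.

```python
import math

A_DEC = 1.0 / 1090.0      # scale factor at decoupling (assumed 1 + z_dec = 1090)
HUBBLE_DIST_H = 2997.92   # H0^-1 in h^-1 Mpc, from (12.108)


def comoving_distance(omega0, omega_l, n=2000):
    """H0 * d^c from a_dec to 1, radiation neglected; Simpson rule in u = sqrt(a)."""
    omega_m = omega0 - omega_l
    omega_k = 1.0 - omega0

    def f(u):
        return 2.0 / math.sqrt(omega_l * u**6 + omega_k * u**2 + omega_m)

    lo, hi = math.sqrt(A_DEC), 1.0
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return s * h / 3.0


def angular_diameter_distance(omega0, omega_l):
    """H0 * d_A^c, including the geometry effect of spatial curvature."""
    chi = comoving_distance(omega0, omega_l)
    omega_k = 1.0 - omega0
    if abs(omega_k) < 1e-12:
        return chi
    s = math.sqrt(abs(omega_k))
    if omega_k > 0:                      # open universe
        return math.sinh(s * chi) / s
    return math.sin(s * chi) / s         # closed universe


def distance_mpc(omega0, omega_l, omega_m_phys=0.147):
    """d_A^c in Mpc at fixed omega_m, so that h = sqrt(omega_m / Omega_m0)."""
    h = math.sqrt(omega_m_phys / (omega0 - omega_l))
    return HUBBLE_DIST_H / h * angular_diameter_distance(omega0, omega_l)


if __name__ == "__main__":
    for omega0, omega_l in ((1.0, 0.7), (1.1, 0.7), (0.9, 0.7), (1.0, 0.8)):
        print(omega0, omega_l,
              "H0*d_A =", round(angular_diameter_distance(omega0, omega_l), 3),
              " d_A [Mpc] at fixed omega_m =", round(distance_mpc(omega0, omega_l)))
```

Comparing the two columns shows the point of the paragraph: at fixed $H_0$ a larger $\Omega_{\Lambda 0}$ lengthens the distance, but at fixed $\omega_m$ the accompanying change in $H_0^{-1}$ reverses the trend.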
Sound horizon
To calculate the comoving sound horizon,
$$ r_s^c(t_{\rm dec}) = a_0 \int_0^{t_{\rm dec}} \frac{c_s(t)}{a(t)}\, dt = \int_0^{t_{\rm dec}} c_s(t)\, \frac{dt}{a} = \int_0^{a_{\rm dec}} c_s(a)\, \frac{da}{a \cdot (da/dt)} , \tag{12.104} $$
The other element in the integrand of (12.104) is the expansion law a(t) before
decoupling. We have
$$ a\frac{da}{dt} = H_0 \sqrt{\Omega_{\Lambda 0} a^4 + (1 - \Omega_0) a^2 + \Omega_{m0} a + \Omega_{r0}} . \tag{12.106} $$
In the integral (12.103) we dropped the Ωr0 , since it is important only at early times,
and the integral from adec to 1 is dominated by late times. Integral (12.104), on the
other hand, includes only early times, and now we can instead drop the ΩΛ0 and
1 − Ω0 terms (i.e., we can ignore the effect of curvature and dark energy in the early
universe, before photon decoupling), so that
$$ a\frac{da}{dt} \approx H_0 \sqrt{\Omega_{m0} a + \Omega_{r0}} = H_{100} \sqrt{\omega_m a + \omega_r} = \frac{\sqrt{\omega_m a + \omega_r}}{2998\ {\rm Mpc}} , \tag{12.107} $$
where we have written
$$ H_0 \equiv h \cdot 100\, \frac{\rm km/s}{\rm Mpc} \equiv h \cdot H_{100} = \frac{h}{2997.92\ {\rm Mpc}} . \tag{12.108} $$
Thus the sound horizon is given by
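$$
\begin{aligned}
r_s^c(a) &= 2998\ {\rm Mpc} \int_0^a \frac{c_s(a)\, da}{\sqrt{\omega_m a + \omega_r}} \\
&= 2998\ {\rm Mpc} \cdot \frac{1}{\sqrt{3\omega_r}} \int_0^a \frac{da}{\sqrt{1 + \frac{\omega_m}{\omega_r} a}\, \sqrt{1 + \frac{3}{4}\frac{\omega_b}{\omega_\gamma} a}} .
\end{aligned}
\tag{12.109}
$$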
Here
are accurately known from the CMB temperature T0 = 2.725 K (and therefore we
do not consider them as cosmological parameters in the sense of something to be
determined from the Cℓ spectrum).
Thus the sound horizon depends on the two cosmological parameters ωm and ωb ,
From (12.109) we see that increasing either ωm or ωb makes the sound horizon at
decoupling, rsc (adec ), shorter:
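where
$$ r_* \equiv \frac{\bar\rho_r(t_{\rm dec})}{\bar\rho_m(t_{\rm dec})} = \frac{\omega_r}{\omega_m}\frac{1}{a_{\rm dec}} = 0.0459\, \frac{1}{\omega_m} \left( \frac{1 + z_{\rm dec}}{1100} \right) \tag{12.113} $$
$$ R_* \equiv \frac{3\bar\rho_b(t_{\rm dec})}{4\bar\rho_\gamma(t_{\rm dec})} = \frac{3\omega_b}{4\omega_\gamma}\, a_{\rm dec} = 27.6\, \omega_b \left( \frac{1100}{1 + z_{\rm dec}} \right) . \tag{12.114} $$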
For our reference values $\omega_m = 0.147$, $\omega_b = 0.022$, and $1 + z_{\rm dec} = 1100$ (see footnote 10) we get $r_* =
0.312$ and $R_* = 0.607$ and $r_s^c(t_{\rm dec}) = 143$ Mpc for the sound horizon at decoupling.
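The quoted values of $r_*$, $R_*$ and the sound horizon can be reproduced by integrating (12.109) numerically. The sketch below is an illustration, not part of the original notes; it uses Simpson's rule with the values $\omega_r = 4.18 \times 10^{-5}$ and $\omega_\gamma = 2.47 \times 10^{-5}$ given earlier in the chapter. The function names and the number of integration steps are our choices.

```python
import math

OMEGA_R = 4.18e-5       # physical radiation density (massless neutrinos)
OMEGA_GAMMA = 2.47e-5   # physical photon density
HUBBLE_DIST = 2998.0    # H100^-1 in Mpc, as used in (12.109)


def r_star(omega_m, one_plus_z_dec=1100.0):
    """Radiation-to-matter density ratio at decoupling, eq. (12.113)."""
    return OMEGA_R / omega_m * one_plus_z_dec


def R_star(omega_b, one_plus_z_dec=1100.0):
    """Baryon-to-photon weight R at decoupling, eq. (12.114)."""
    return 0.75 * omega_b / OMEGA_GAMMA / one_plus_z_dec


def sound_horizon(omega_m, omega_b, one_plus_z_dec=1100.0, n=2000):
    """Comoving sound horizon at decoupling in Mpc, second line of (12.109)."""
    a_dec = 1.0 / one_plus_z_dec
    alpha = omega_m / OMEGA_R
    beta = 0.75 * omega_b / OMEGA_GAMMA

    def f(a):
        return 1.0 / math.sqrt((1.0 + alpha * a) * (1.0 + beta * a))

    h = a_dec / n
    s = f(0.0) + f(a_dec)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(i * h)   # Simpson weights
    return HUBBLE_DIST / math.sqrt(3.0 * OMEGA_R) * s * h / 3.0


if __name__ == "__main__":
    print("r_* =", round(r_star(0.147), 3))
    print("R_* =", round(R_star(0.022), 3))
    print("r_s [Mpc] =", round(sound_horizon(0.147, 0.022), 1))
```

Raising either `omega_m` or `omega_b` in the call shortens the returned sound horizon, since both parameters appear only in the denominator of the integrand.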
Summary
The angular diameter distance dcA (tdec ) is most naturally discussed in terms of
H0 , Ω0 , and ΩΛ0 , but since these are not the most convenient choice of independent
parameters for other purposes, we shall trade H0 for ωm according to (12.101). Thus
we see that the sound horizon angle depends on 4 parameters,
$$ \theta_s \equiv \frac{r_s^c(\omega_m, \omega_b)}{d^c_A(\Omega_0, \Omega_{\Lambda 0}, \omega_m)} = \theta_s(\Omega_0, \Omega_{\Lambda 0}, \omega_m, \omega_b) . \tag{12.115} $$
If we keep ωm and ωb fixed, we have $r_s^c(t_{\rm dec}) = 143$ Mpc. From the observed
model-independent value θs = 0.593◦ ± 0.001◦ [1] we then have $d^c_A = 13.8$ Gpc
$\approx 4.6\,h H_0^{-1} \approx 3 H_0^{-1}$, where in the last equality we have taken h = 0.7. For the
Einstein-de Sitter model we have $d^c_A(1090) \approx 1.97 H_0^{-1} \approx 8.4$ Gpc, so the observed
distance to the last scattering surface is about 50% larger than predicted by the
FRW model without dark energy or spatial curvature.
We get a rough estimate of the angular diameter distance from the observed
angular size of the extrema on the CMB sky as follows.
$$
\begin{aligned}
d^c_A(z_{\rm dec}) &= \frac{r_s^c(t_{\rm dec})}{\theta_s} \approx \frac{\frac{1}{\sqrt{3}}\, d_{\rm hor}(t_{\rm dec})\, (1 + z_{\rm dec})}{\theta_s} \\
&\approx \frac{180^\circ}{\pi\, \theta_s(^\circ)}\, \sqrt{3}\, t_{\rm dec}\, (1 + z_{\rm dec}) \approx 21\ {\rm Gpc} ,
\end{aligned}
\tag{12.116}
$$
where we have approximated $r_s = d_{\rm hor}/\sqrt{3}$ and $d_{\rm hor} = 3t$, and $\theta_s(^\circ)$ is $\theta_s$ in degrees.
This value is within a factor of 2 of the real result. However, the difference between
the observation and the Einstein-de Sitter result for $d^c_A$ is only 50%, so this rough
approximation cannot be used to rule out the Einstein-de Sitter model; we have to
use a more precise value for the sound horizon.
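Both distance estimates in this summary are simple enough to evaluate directly. The sketch below is an illustration, not part of the original notes. It computes $d^c_A = r_s^c/\theta_s$ from the quoted sound horizon and angle, and the rough estimate (12.116). The light-year to Mpc conversion factor and the choice $1 + z_{\rm dec} = 1090$ in the rough estimate are our assumptions.

```python
import math

MPC_PER_LIGHTYEAR = 1.0 / 3.2616e6   # 1 Mpc is about 3.2616 million light years


def distance_from_angle(r_s_mpc, theta_s_deg):
    """d_A^c = r_s^c / theta_s, with theta_s converted to radians."""
    return r_s_mpc / math.radians(theta_s_deg)


def rough_distance(theta_s_deg, t_dec_yr=3.8e5, one_plus_z_dec=1090.0):
    """Rough estimate (12.116): sqrt(3) * t_dec * (1 + z_dec) / theta_s, with c = 1."""
    t_dec_mpc = t_dec_yr * MPC_PER_LIGHTYEAR
    return math.sqrt(3.0) * t_dec_mpc * one_plus_z_dec / math.radians(theta_s_deg)


if __name__ == "__main__":
    print("d_A from r_s = 143 Mpc [Gpc]:",
          round(distance_from_angle(143.0, 0.593) / 1000.0, 1))
    print("rough estimate (12.116) [Gpc]:",
          round(rough_distance(0.593) / 1000.0, 1))
```

The first number reproduces the 13.8 Gpc quoted above; the second overshoots by about 50%, which illustrates why the rough estimate cannot distinguish the observed value from the Einstein-de Sitter prediction.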
1. The early ISW effect. The early ISW effect raises the first peak. It is
caused by the evolution of Φ because of the effect of the radiation contribution
on the expansion law after tdec . This depends on the radiation-matter ratio at
that time; decreasing ωm makes the early ISW effect stronger.
10 Photon decoupling temperature, and thus 1 + zdec , depends somewhat on ωb , but since this
dependence is not easy to calculate (recombination and photon decoupling were discussed in chapter
5), we have mostly ignored this dependence and used the fixed value 1 + zdec = 1100.
3. Baryon damping. The time evolution of $R \equiv 3\bar\rho_b/4\bar\rho_\gamma$ causes the amplitude
of the acoustic oscillations to be damped in time roughly as $(1 + R)^{-1/4}$. This
reduces the amplitudes of all peaks.
Effects 1 and 4 depend on ωm , effects 2, 3, and 5 on ωb . See Figs. 19 and 20 for the
effects of ωm and ωb on peak heights.
1. they affect the sound horizon angle and thus the positions of the acoustic peaks
See Figs. 21 and 22. Since the late ISW effect is in the region of the Cℓ spectrum
where the cosmic variance is large, it is difficult to detect. Thus we can in practice
only use θs to determine Ω0 and ΩΛ0 . Since ωb and ωm can be determined quite
accurately from Cℓ acoustic peak heights, peak separation, i.e., θs , can then indeed
be used for the determination of Ω0 and ΩΛ0 . Since one number cannot be used
11 This is also called gravitational driving, which is perhaps more appropriate, since the effect is
due to the change in the gravitational potential.
[Figure 19 (plot): two panels, ωb = 0.01 and ωb = 0.03, each with curves for ωm = 0.10, 0.20, 0.30, 0.40; vertical axis l(l+1)Cl/2π, horizontal axis l from 0 to 2000.]
Figure 19: The effect of ωm . The angular power spectrum Cℓ is here calculated without
the effect of diffusion damping, so that the other effects on peak heights could be seen more
clearly. Notice how reducing ωm raises all peaks, but the effect on the first few peaks is
stronger in relative terms, as the radiation driving effect is extended towards larger scales
(smaller ℓ). The first peak is raised mainly because the ISW effect becomes stronger. Figure
and calculation by R. Keskitalo.
[Figure 20 (plot): curves for ωb = 0.01, 0.02, 0.03, 0.04; vertical axis labelled 2l(l+1)Cl/2π, horizontal axis l from 0 to 2000.]
Figure 20: The effect of ωb . The angular power spectrum Cℓ is here calculated without
the effect of diffusion damping, so that the other effects on peak heights could be seen more
clearly. Notice how increasing ωb raises odd peaks relative to the even peaks. Because of
baryon damping there is a general trend downwards with increasing ωb . This figure is for
ωm = 0.20. Figure and calculation by R. Keskitalo.
to determine two, the parameters Ω0 and ΩΛ0 are degenerate. CMB observations
alone cannot be used to determine them both. Other cosmological observations (like
the power spectrum Pδ (k) from large scale structure, or the SNIa redshift-distance
relationship) are needed to break this degeneracy.
A fixed θs together with fixed ωb and ωm determine a line on the (Ω0 , ΩΛ0 )-plane.
See figure 23. Derived parameters, e.g., h, vary along that line. As you can
see from Figs. 21 and 22, changing Ω0 (around the reference model) affects θs much
more strongly than changing ΩΛ0 . This means that the orientation of the line is such
that ΩΛ0 varies more rapidly along that line than Ω0 . Therefore using additional
constraints from other cosmological observations, e.g. the Hubble Space Telescope
determination of h based on the distance ladder, which select a short section from
that line, gives us a fairly good determination of Ω0 , leaving the allowed range for
ΩΛ0 still quite large.
Therefore it is often said that CMB measurements have determined that Ω0 ∼ 1,
i.e. that the universe is spatially flat. However, this is misleading. First, the
CMB only determines the angular diameter distance to the last scattering surface.
Determining the spatial curvature from this requires knowing the expansion history
H(z), in other words the constraints on the spatial curvature are model-dependent.
Even restricting to the ΛCDM model, we need to use some other cosmological data
to fix H0 . So the correct statement is that assuming that the universe is described
by the ΛCDM model, and given constraints on the Hubble parameter today, the
CMB data shows the universe to be close to spatially flat.
[Figure 21 (plots): curves for Ω0 = 1.1, 1.0, 0.9; vertical axis ℓ(ℓ+1)Cℓ/2π; top panel with a linear ℓ axis (0 to 1400), bottom panel with a logarithmic ℓ axis.]
Figure 21: The effect of changing Ω0 from its reference value Ω0 = 1. The top panel
shows the Cℓ spectrum with a linear ℓ scale so that details at larger ℓ where cosmic variance
effects are smaller can be better seen. The bottom plot has a logarithmic ℓ scale so that
the integrated Sachs-Wolfe effect at small ℓ can be better seen. The logarithmic scale also
makes clear that the effect of the change in sound horizon angle is to stretch the spectrum
by a constant factor in ℓ space.
[Figure 22 (plots): curves for ΩΛ0 = 0.80, 0.70, 0.60; vertical axis ℓ(ℓ+1)Cℓ/2π; top panel with a linear ℓ axis (0 to 1400), bottom panel with a logarithmic ℓ axis.]
Figure 22: The effect of changing ΩΛ0 from its reference value ΩΛ0 = 0.7.
[Figure 23 (plot): lines labelled by the acoustic scale ℓA ; vertical axis ΩΛ , horizontal axis Ωm from 0.2 to 1.4.]
Figure 23: The lines of constant sound horizon angle θs on the (Ωm0 ,ΩΛ0 ) plane for fixed
ωb and ωm . The numbers on the lines refer to the corresponding acoustic scale ℓA ≡ π/θs
(∼ peak separation) in multipole space. Figure by J. Väliviita. See his PhD thesis[5], p.70,
for an improved version including the HST constraint on h.
both parameters have their own signature on the peak heights, allowing an accurate
determination of these parameters, whereas the effect on θs is degenerate with Ω0
and ΩΛ0 .
In particular, ωb has a characteristic effect on peak heights: increasing ωb raises the
odd peaks and reduces the even peaks, because it shifts the balance of the acoustic
oscillations (the −RΦ effect). This shows most clearly at the first and second
peaks (see footnote 12). Raising ωb also shortens the damping scale kD⁻¹ due to photon diffusion,
moving the corresponding damping scale ℓD of the Cℓ spectrum towards larger ℓ.
This has the effect of raising Cℓ at large ℓ. See figure 27.
Increasing ωm makes the universe more matter dominated at tdec and therefore
it reduces the early ISW effect, making the first peak lower. This also affects the
shape of the first peak.
The “radiation driving” effect is clearest at the second to fourth peaks. Reducing
ωm makes these peaks higher by making the universe more radiation-dominated
at the time the scales corresponding to these peaks enter the horizon, thus strengthening
the radiation driving. The fifth and further peaks correspond to scales that
receive essentially the full effect in any case, whereas for the first peak the effect is
weak. (In the first peak we see the ISW effect instead.) See figure 28.
Footnote 12: There is also an overall “baryon damping effect” on the acoustic oscillations which we have not
calculated. It is due to the time dependence of R ≡ 3ρ̄b/4ρ̄γ, which reduces the amplitude of the
oscillation by about (1 + R)^(−1/4). This explains why the third peak in figure 27 is no higher for
ωb = 0.030 than it is for ωb = 0.022.
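The size of the (1 + R)^(−1/4) factor can be illustrated with a short sketch. The value R = 0.6 assigned to ωb = 0.022 is a hypothetical round number, and R is assumed to scale linearly with ωb when all other parameters are held fixed; neither assumption is taken from the calculation in these notes.

```python
def baryon_damping_factor(R):
    """Approximate amplitude reduction (1 + R)**(-1/4) of the acoustic oscillation."""
    if R < 0:
        raise ValueError("R must be non-negative")
    return (1.0 + R) ** -0.25

# Assumed illustrative value of R for the reference baryon density.
OMEGA_B_REFERENCE = 0.022
R_REFERENCE = 0.6  # hypothetical, for illustration only

# Baryon densities shown in figure 27, with R scaled linearly in omega_b.
factors = {}
for omega_b in (0.015, 0.022, 0.030):
    R = R_REFERENCE * omega_b / OMEGA_B_REFERENCE
    factors[omega_b] = baryon_damping_factor(R)

for omega_b, factor in factors.items():
    print(f"omega_b = {omega_b:.3f}: amplitude factor = {factor:.3f}")
```

The factor decreases as ωb grows, which works against the enhancement of the odd peaks and is in line with the behaviour of the third peak described in the footnote.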
[Plot: ℓ(ℓ+1)Cℓ/2π versus ℓ for A = 1.1, 1 and 0.9. Top panel: linear ℓ axis, 0 to 1400. Bottom panel: logarithmic ℓ axis.]
Figure 24: The effect of changing the primordial amplitude from its reference value A = 1.
[Plot: ℓ(ℓ+1)Cℓ/2π versus ℓ for n = 1.1, 1 and 0.9. Top panel: linear ℓ axis, 0 to 1400. Bottom panel: logarithmic ℓ axis.]
Figure 25: The effect of changing the spectral index from its reference value n = 1.
[Plot: ℓ(ℓ+1)Cℓ/2π versus ℓ for τ = 0.20, 0.10 and 0. Top panel: linear ℓ axis, 0 to 1400. Bottom panel: logarithmic ℓ axis.]
Figure 26: The effect of changing the optical depth from its reference value τ = 0.1.
[Plot: ℓ(ℓ+1)Cℓ/2π versus ℓ for ωb = 0.030, 0.022 and 0.015. Top panel: linear ℓ axis, 0 to 1400. Bottom panel: logarithmic ℓ axis.]
Figure 27: The effect of changing the physical baryon density parameter from its reference
value ωb = 0.022.
[Plot: ℓ(ℓ+1)Cℓ/2π versus ℓ for ωm = 0.200, 0.147 and 0.100. Top panel: linear ℓ axis, 0 to 1400. Bottom panel: logarithmic ℓ axis.]
Figure 28: The effect of changing the physical matter density parameter from its reference
value ωm = 0.147.
related derived parameters, and in Table III we give limits on some non-standard
parameters. Note that in Table III the error bars are the 95% confidence limits
(instead of the usual 68% confidence limits), and the first column is Planck data
plus WMAP polarisation data.
The BBN limit 0.019 ≤ ωb ≤ 0.024 has not been used here, but we see that the
constraint on the baryon density coming from the CMB is consistent with the BBN
value. The agreement between these two independent datasets (the abundances
of light elements and anisotropies on the microwave sky), one of which probes the
physics at around a couple of minutes and the other at around 400 000 years, is
remarkable. This increases our confidence that the basic physical picture of the evolution
of the universe is correct. Indeed, BBN and CMB are two of the most important
pieces of observational support for the standard cosmological model.
The parameters in Table III are derived under the assumption that the non-
standard parameters other than the one being considered remain zero. The CMB
alone does not give good constraints on the spatial curvature or the dark energy
equation of state (since they mostly only affect d_A^c(z_dec), and are thus degenerate
with ΩΛ0). In fact, the CMB data is consistent with a closed universe without dark
energy, with Ω0 = Ωm0 ≈ 1.3 and h ≈ 0.3. The upper limits given for the sum of
neutrino masses Σmν and the ratio r ≡ AT²/A² of tensor perturbations to scalar
perturbations are 95% confidence limits. We see that there is no indication in the
data for a deviation of these additional parameters from their standard values.
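As a rough cross-check of the closed model quoted above, one can convert its parameters into a physical matter density, assuming the usual definition ωm ≡ Ωm0 h². The function name below is hypothetical, and the sketch only performs this one conversion; it is not a fit to CMB data.

```python
def physical_density(omega_density, h):
    """Return the physical density parameter omega = Omega * h**2."""
    if h <= 0:
        raise ValueError("h must be positive")
    return omega_density * h ** 2

# Closed model without dark energy quoted in the text: Omega_m0 = 1.3, h = 0.3.
closed_omega_m = physical_density(1.3, 0.3)
print(f"omega_m for the closed model: {closed_omega_m:.3f}")
```

This gives ωm ≈ 0.117, which can be compared with the reference value ωm = 0.147 used in figure 28. The very low h is the reason other data on the Hubble parameter are needed to exclude such a model.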
In conclusion, almost all cosmological data are consistent with a “vanilla” uni-
verse, i.e. a spatially flat ΛCDM model with adiabatic and Gaussian primordial
density perturbations, described by the six cosmological parameters ΩΛ0 , ωm , ωb ,
A, n, τ .
The simplest inflationary models predict an amplitude for gravitational waves that Planck
would be able to observe using the polarisation of the CMB. This data will be
released in 2014.
ΩK0          −0.037 (+0.043/−0.049)     −0.0005 (+0.0065/−0.0066)
dn/d ln k    −0.013 ± 0.018             −0.014 (+0.016/−0.017)
Figure 29: Constraints on the scalar perturbation spectral index n, and the tensor/scalar
ratio r from Planck satellite data. Figure from [8].
References
[1] M. Vonlanthen, S. Räsänen and R. Durrer, JCAP 08 (2010) 023, arXiv:1003.0810
[astro-ph.CO].
[2] David H. Lyth and Andrew R. Liddle: The Primordial Density Perturbation
(Cambridge University Press 2009).
[3] C. Reichardt et al., High Resolution CMB Power Spectrum from the Complete
ACBAR Data Set, arXiv:0801.1491.
[4] K.T. Story et al., A Measurement of the Cosmic Microwave Background Damping
Tail from the 2500-square-degree SPT-SZ survey, arXiv:1210.7231.
[6] W.J. Percival et al., Measuring the Baryon Acoustic Oscillation scale using the
Sloan Digital Sky Survey and 2dF Galaxy Redshift Survey, arXiv:0705.3323,
Mon.Not.Roy.Astron.Soc. 381 (2007) 1053.