Quantum Theory for High-School Students
arXiv:1803.07098v2 [physics.pop-ph] 30 Jan 2019
Barak Shoshany
bshoshany@perimeterinstitute.ca
February 1, 2019
Abstract
We present a conceptually clear introduction to quantum theory at a level suitable for exceptional
high-school students. It is entirely self-contained and no university-level background knowledge is
required. The lectures were given over four days, four hours each day, as part of the International
Summer School for Young Physicists (ISSYP) at Perimeter Institute, Waterloo, Ontario, Canada. On
the first day the students were given all the relevant mathematical background from linear algebra
and probability theory. On the second day, we used the acquired mathematical tools to define the
full quantum theory in the case of a finite Hilbert space and discuss some consequences such as
entanglement, Bell’s theorem and the uncertainty principle. Finally, on days three and four we
presented an overview of advanced topics related to infinite-dimensional Hilbert spaces, including
canonical and path integral quantization, the quantum harmonic oscillator, quantum field theory,
the Standard Model, and quantum gravity.
Contents

1 Linear Algebra
  1.1 Vectors
  1.2 Dual Vectors, Inner Product and Norm
  1.3 Orthonormal Bases
  1.4 Matrices and Adjoint
  1.5 The Outer Product
  1.6 The Completeness Relation
  1.7 Multiplication of Matrices
  1.8 Inner Products with Matrices
  1.9 Eigenvalues and Eigenvectors
  1.10 Hermitian Matrices
  1.11 Unitary Matrices
  1.12 The Spectral Decomposition Theorem
  1.13 The Cauchy-Schwarz Inequality
2 Probability Theory
  2.1 Random Variables
  2.2 Probability Distributions
  2.3 Expected Values
  2.4 Standard Deviation
  2.5 Normal (Gaussian) Distributions
6 Quantum Gravity
7 Further Reading
8 Acknowledgments

1 Linear Algebra
1.1 Vectors
We shall mostly deal with n-dimensional complex vectors, which are simply ordered lists of n complex numbers, also known as n-tuples. For example, a 2-dimensional complex vector is written as:
$$|\Psi\rangle = \begin{pmatrix} \Psi_1 \\ \Psi_2 \end{pmatrix}, \qquad \Psi_1, \Psi_2 \in \mathbb{C}, \tag{1.1}$$
where the notation $\Psi_1, \Psi_2 \in \mathbb{C}$ means that the components $\Psi_1, \Psi_2$ are complex numbers. The notation $|\Psi\rangle$ is called Dirac (or "bra-ket") notation and $\Psi$ is simply a label used to refer to the vector. We can write anything as the label, including English/Greek letters, numbers, or even words, for example:
$$|A\rangle, \quad |\beta\rangle, \quad |3\rangle, \quad |\mathrm{cat}\rangle, \quad \ldots \tag{1.2}$$
Addition of vectors is defined by simply adding the components. That is, if
$$|\Phi\rangle = \begin{pmatrix} \Phi_1 \\ \Phi_2 \end{pmatrix}, \tag{1.3}$$
then
$$|\Psi\rangle + |\Phi\rangle = \begin{pmatrix} \Psi_1 + \Phi_1 \\ \Psi_2 + \Phi_2 \end{pmatrix}. \tag{1.4}$$
We also define multiplication of the whole vector by a scalar, that is, a complex number λ ∈ C, by
multiplying all of the components:
$$\lambda\,|\Psi\rangle = \begin{pmatrix} \lambda\Psi_1 \\ \lambda\Psi_2 \end{pmatrix}. \tag{1.5}$$
$\lambda$ is called a scalar because it "scales" the vector. All the n-dimensional complex vectors together make up a vector space called $\mathbb{C}^n$.
Exercise: Show that addition and multiplication by a scalar are distributive, that is, $\lambda\left(|\Psi\rangle + |\Phi\rangle\right) = \lambda|\Psi\rangle + \lambda|\Phi\rangle$ and $(\alpha + \beta)|\Psi\rangle = \alpha|\Psi\rangle + \beta|\Psi\rangle$.
1.2 Dual Vectors, Inner Product and Norm

To each vector $|\Psi\rangle$ there corresponds a dual vector $\langle\Psi|$, which is obtained by transposing the column of components into a row and taking the complex conjugate of each component:
$$\langle\Psi| = \begin{pmatrix} \Psi_1^* & \Psi_2^* \end{pmatrix}, \tag{1.6}$$
where the ∗ denotes the complex conjugate, defined for any complex number z as follows:
$$z = a + ib \implies z^* = a - ib, \qquad a, b \in \mathbb{R}. \tag{1.7}$$
Addition and multiplication by a scalar are defined as for vectors, but you may not add vectors and dual vectors together! Note that the dual of a dual vector is again a vector, since $(z^*)^* = z$.
Using the dual vectors, we may define the inner product. This product allows us to take two vectors
and produce one complex number out of them. The inner product only works for one vector and one
dual vector, and to calculate it we multiply the components of both vectors one by one and add them
up:
$$\langle\Psi|\Phi\rangle = \begin{pmatrix} \Psi_1^* & \Psi_2^* \end{pmatrix} \begin{pmatrix} \Phi_1 \\ \Phi_2 \end{pmatrix} = \Psi_1^*\Phi_1 + \Psi_2^*\Phi_2. \tag{1.8}$$
In bra-ket notation, vectors $|\Psi\rangle$ are "kets" and dual vectors $\langle\Psi|$ are "bras". Then the notation for $\langle\Psi|\Phi\rangle$ is called a "bracket".
Note that this inner product resembles the dot product defined for vectors in $\mathbb{R}^n$, but here one must be careful to only multiply vectors by dual vectors. Finally, we may define the norm-squared of a vector by taking its inner product with itself ("squaring" it):
$$\|\Psi\|^2 = \langle\Psi|\Psi\rangle = \begin{pmatrix} \Psi_1^* & \Psi_2^* \end{pmatrix} \begin{pmatrix} \Psi_1 \\ \Psi_2 \end{pmatrix} = |\Psi_1|^2 + |\Psi_2|^2, \tag{1.9}$$
where the magnitude-squared of a complex number z is defined to be
$$|z|^2 = z^* z. \tag{1.10}$$
Then we can define the norm as the square root of the norm-squared:
$$\|\Psi\| = \sqrt{\|\Psi\|^2} = \sqrt{\langle\Psi|\Psi\rangle}. \tag{1.11}$$
A vector space with an inner product satisfying the properties you will now prove in the exercise is called a Hilbert space. Thus, $\mathbb{C}^n$ is a Hilbert space.
Exercise:
1. Prove that the norm-squared $\|\Psi\|^2$ is always non-negative, and it is zero if and only if $|\Psi\rangle$ is the zero vector, that is, the vector whose components are all zero.
2. Prove that $\langle\Phi|\Psi\rangle = \langle\Psi|\Phi\rangle^*$, that is, if we swap the vectors in the product we get the complex conjugate. (Thus the inner product is non-commutative.)
3. Prove that if $\alpha, \beta \in \mathbb{C}$ and $|\Psi\rangle, |\Phi\rangle, |\Theta\rangle \in \mathbb{C}^n$ then the inner product is linear in the ket: $\langle\Theta|\left(\alpha|\Psi\rangle + \beta|\Phi\rangle\right) = \alpha\langle\Theta|\Psi\rangle + \beta\langle\Theta|\Phi\rangle$.
1.3 Orthonormal Bases

An orthonormal basis of $\mathbb{C}^n$ is a set of n non-zero vectors $|B_1\rangle, \ldots, |B_n\rangle$ such that:
1. They are linearly independent, which means that the only way to write the zero vector as a linear combination $\sum_{i=1}^n \lambda_i |B_i\rangle = 0$ is with all of the coefficients $\lambda_i$ equal to zero.
2. They span $\mathbb{C}^n$, which means that any vector $|\Psi\rangle \in \mathbb{C}^n$ can be written as a linear combination of the basis vectors:
$$|\Psi\rangle = \sum_{i=1}^n \lambda_i |B_i\rangle, \tag{1.14}$$
for a unique choice of $\lambda_i \in \mathbb{C}$.
3. They are all orthogonal to each other, that is, the inner product of any two different vectors evaluates to zero:
$$\langle B_i | B_j \rangle = 0 \quad \text{for all } i \neq j. \tag{1.15}$$
4. They are all unit vectors, that is, have a norm of 1:
$$\|B_i\| = 1 \quad \text{for all } i. \tag{1.16}$$
Exercise:
1. Show that property 1 means that no vector in the basis can be written as a linear combination of the other vectors in the basis.
2. Any basis which is orthogonal but not orthonormal, that is, does not satisfy property 4, can be made orthonormal by normalizing each basis vector, i.e. dividing it by its norm: $|B_i\rangle \mapsto |B_i\rangle / \|B_i\|$. Assume a basis which satisfies properties 1-3 and show that after normalizing it, properties 1-3 are still satisfied.
These requirements become much simpler in $n = 2$ dimensions. Then an orthonormal basis for $\mathbb{C}^2$ is a set of 2 non-zero vectors $|B_1\rangle, |B_2\rangle$ such that:
1. They are linearly independent, which means that we cannot write one in terms of a scalar times the other, i.e.:
$$|B_1\rangle \neq \lambda |B_2\rangle, \qquad \lambda \in \mathbb{C}. \tag{1.18}$$
2. They span $\mathbb{C}^2$, which means that any vector $|\Psi\rangle \in \mathbb{C}^2$ can be written as a linear combination of the basis vectors:
$$|\Psi\rangle = \lambda_1 |B_1\rangle + \lambda_2 |B_2\rangle, \tag{1.19}$$
for a unique choice of $\lambda_1, \lambda_2 \in \mathbb{C}$.
3. They are orthogonal to each other, that is, the inner product between them evaluates to zero:
$$\langle B_1 | B_2 \rangle = 0. \tag{1.20}$$
4. They are both unit vectors, that is, have a norm of 1:
$$\|B_1\|^2 = \langle B_1|B_1\rangle = 1, \qquad \|B_2\|^2 = \langle B_2|B_2\rangle = 1. \tag{1.21}$$
Exercise: Show that the standard basis vectors (the vectors which have one component equal to 1 and all other components equal to 0) satisfy the four properties above.
1.4 Matrices and Adjoint
Matrices in n dimensions are $n \times n$ arrays of complex numbers$^1$. In $n = 2$ dimensions we have
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad A_{11}, A_{12}, A_{21}, A_{22} \in \mathbb{C}. \tag{1.23}$$
A matrix acts on a vector from the left by taking the inner product of each row of the matrix with the
vector:
$$A|\Psi\rangle = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} \Psi_1 \\ \Psi_2 \end{pmatrix} = \begin{pmatrix} A_{11}\Psi_1 + A_{12}\Psi_2 \\ A_{21}\Psi_1 + A_{22}\Psi_2 \end{pmatrix}. \tag{1.24}$$
A matrix can also act on a dual vector from the right by taking the inner product of the dual vector
with each column of the matrix:
$$\langle\Psi| A = \begin{pmatrix} \Psi_1^* & \Psi_2^* \end{pmatrix} \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} \Psi_1^* A_{11} + \Psi_2^* A_{21} & \Psi_1^* A_{12} + \Psi_2^* A_{22} \end{pmatrix}. \tag{1.25}$$
Note that the dual vector $\langle\Psi|A$ is not the dual of the vector $A|\Psi\rangle$. However, we can define the adjoint of a matrix by both transposing rows into columns and taking the complex conjugate of the components:
$$A^\dagger = \begin{pmatrix} A_{11}^* & A_{21}^* \\ A_{12}^* & A_{22}^* \end{pmatrix}, \tag{1.26}$$
where the notation $\dagger$ for the adjoint is called "dagger". Then the vector dual to $A|\Psi\rangle$ is $\langle\Psi|A^\dagger$.
The identity matrix, written simply as $1$, is:
$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{1.27}$$
Acting with it on any vector or dual vector does not change it: $1|\Psi\rangle = |\Psi\rangle$.
Exercise:
1.5 The Outer Product

Given a vector $|\Phi\rangle$ and a dual vector $\langle\Psi|$, we define the outer product $|\Phi\rangle\langle\Psi|$ as the matrix whose component at row $i$, column $j$ is given by multiplying the component at row $i$ of $|\Phi\rangle$ and the component at column $j$ of $\langle\Psi|$:
$$|\Phi\rangle\langle\Psi| = \begin{pmatrix} \Phi_1 \\ \Phi_2 \end{pmatrix} \begin{pmatrix} \Psi_1^* & \Psi_2^* \end{pmatrix} = \begin{pmatrix} \Psi_1^*\Phi_1 & \Psi_2^*\Phi_1 \\ \Psi_1^*\Phi_2 & \Psi_2^*\Phi_2 \end{pmatrix}. \tag{1.29}$$
$^1$ In fact, matrices don't have to be square; they can have a different number of rows and columns, but non-square matrices will not be needed in these lectures.
Exercise: Calculate the outer product $|\Psi\rangle\langle\Phi|$ for
$$|\Psi\rangle = \begin{pmatrix} 1 \\ 2+i \end{pmatrix}, \qquad |\Phi\rangle = \begin{pmatrix} 3-i \\ 4i \end{pmatrix}. \tag{1.30}$$
Remember that when writing the dual vector, the components are complex-conjugated!
1.6 The Completeness Relation

Note that what we did here is go from a vector $|B_i\rangle$ times a complex number $\langle B_i|\Psi\rangle$ to a matrix $|B_i\rangle\langle B_i|$ times a vector $|\Psi\rangle$, for each $i$. The fact that they are, in fact, equal to one another (as you will prove in the exercise) is made intuitive by using the bra-ket notation. The notation now suggests that
$$\sum_{i=1}^n |B_i\rangle\langle B_i| = 1, \tag{1.35}$$
where $|B_i\rangle\langle B_i|$ is the outer product defined above, and the $1$ on the right-hand side is the identity matrix. In $\mathbb{C}^2$, we simply have
$$|B_1\rangle\langle B_1| + |B_2\rangle\langle B_2| = 1. \tag{1.36}$$
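One can verify the completeness relation numerically for any orthonormal basis of $\mathbb{C}^2$; a minimal sketch in Python (the basis below is an illustrative choice):

```python
import numpy as np

# An orthonormal basis of C^2 (not the standard one, so the check is non-trivial)
B1 = np.array([1, 1], dtype=complex) / np.sqrt(2)
B2 = np.array([1, -1], dtype=complex) / np.sqrt(2)

# |B><B| = np.outer(ket, bra); the bra carries the complex conjugate
completeness = np.outer(B1, B1.conj()) + np.outer(B2, B2.conj())

print(np.allclose(completeness, np.eye(2)))  # True: the sum is the identity
```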
Exercise:
1. Provide a more rigorous proof that
$$\sum_{i=1}^n |B_i\rangle\langle B_i|\Psi\rangle = \left( \sum_{i=1}^n |B_i\rangle\langle B_i| \right) |\Psi\rangle. \tag{1.37}$$
1.7 Multiplication of Matrices
The product of two matrices is calculated by taking the inner product of each row of the left matrix with each column of the right matrix:
$$AB = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}. \tag{1.39}$$
Exercise:
4. Show that multiplying by a scalar is the same as multiplying by a diagonal matrix with the scalar at each component of the diagonal, that is:
$$\lambda A = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} A. \tag{1.42}$$
1.8 Inner Products with Matrices

Given two vectors $|\Psi\rangle, |\Phi\rangle$ and a matrix $A$, we may combine them into the inner product $\langle\Psi|A|\Phi\rangle$, a complex number satisfying
$$\langle\Psi|A|\Phi\rangle^* = \langle\Phi|A^\dagger|\Psi\rangle. \tag{1.45}$$
Taking the complex conjugate thus reverses the order of the inner product, and also replaces the
matrix with its adjoint.
Exercise: Calculate the inner product $\langle\Psi|A|\Phi\rangle$ where
$$|\Psi\rangle = \begin{pmatrix} 5+2i \\ 4-3i \end{pmatrix}, \qquad A = \begin{pmatrix} 9 & 8i \\ 7+6i & 5-4i \end{pmatrix}, \qquad |\Phi\rangle = \begin{pmatrix} 3+4i \\ 2-5i \end{pmatrix}. \tag{1.46}$$
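After doing this calculation by hand, you can check your answer numerically; a minimal sketch in Python:

```python
import numpy as np

psi = np.array([5 + 2j, 4 - 3j])
A = np.array([[9, 8j], [7 + 6j, 5 - 4j]])
phi = np.array([3 + 4j, 2 - 5j])

# <Psi|A|Phi>: np.vdot conjugates psi, which gives the dual vector <Psi|
print(np.vdot(psi, A @ phi))
```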
1.9 Eigenvalues and Eigenvectors
If the matrix $A$, acting on the vector $|\Psi\rangle$, results in a scalar multiple of $|\Psi\rangle$:
$$A|\Psi\rangle = \lambda|\Psi\rangle, \tag{1.47}$$
then we say that $|\Psi\rangle$ is an eigenvector of $A$ with eigenvalue $\lambda$.
Exercise:
1. The matrix
$$A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \tag{1.51}$$
has two eigenvectors. Find them and their corresponding eigenvalues.
2. Prove that, if $|\Psi\rangle$ is an eigenvector, then $\alpha|\Psi\rangle$ is also an eigenvector for any $\alpha \in \mathbb{C}$, and it has the same eigenvalue.
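If you want to check your answer to item 1, NumPy can diagonalize the matrix for you (a sketch; `np.linalg.eigh` is intended for Hermitian matrices and returns eigenvalues in ascending order):

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 1.0]])

eigenvalues, eigenvectors = np.linalg.eigh(A)
print(eigenvalues)         # [-1.  3.]
print(eigenvectors[:, 0])  # eigenvector for -1, proportional to (1, -1)
print(eigenvectors[:, 1])  # eigenvector for +3, proportional to (1, 1)
```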
1.10 Hermitian Matrices

A Hermitian matrix is a matrix which is equal to its own adjoint:
$$A = A^\dagger. \tag{1.52}$$
Exercise:
1. Find the most general $2 \times 2$ Hermitian matrix by demanding that $A = A^\dagger$ and finding conditions on the components of $A$.
2. Prove that the eigenvalues of a Hermitian matrix are always real.
3. Prove that two eigenvectors $|\Psi\rangle, |\Phi\rangle$ of a Hermitian matrix with different eigenvalues are necessarily orthogonal, that is, $\langle\Phi|\Psi\rangle = 0$.
4. Prove that the eigenvectors of a Hermitian matrix, when properly normalized, make up an orthonormal basis of $\mathbb{C}^n$. Assume that all eigenvectors have distinct eigenvalues.$^2$
$^2$ If two eigenvectors correspond to the same eigenvalue, they will not necessarily be orthogonal. This case is called degenerate eigenvalues. (Give an example of a Hermitian matrix which has degenerate eigenvalues!) We will not deal with degenerate eigenvalues in these lectures. However, it is in fact possible to find an orthonormal basis even if there are degenerate eigenvalues, using the Gram-Schmidt process, which the student is encouraged to look up and study.
1.11 Unitary Matrices
A unitary matrix U is a matrix which, when multiplied by its adjoint on either side, results in the
identity matrix:
$$U U^\dagger = U^\dagger U = 1. \tag{1.53}$$
Exercise: Find the most general $2 \times 2$ unitary matrix by demanding that $U U^\dagger = U^\dagger U = 1$ and finding conditions on the components of $U$.
1.12 The Spectral Decomposition Theorem

The spectral decomposition theorem states that any Hermitian matrix $A$, with eigenvalues $\lambda_i$ and corresponding orthonormal eigenvectors $|B_i\rangle$, may be written as
$$A = \sum_{i=1}^n \lambda_i\,|B_i\rangle\langle B_i|.$$
This will be referred to as the spectral decomposition of $A$. We also call this diagonalization, because it means the operator may be represented as a diagonal matrix with the eigenvalues $\lambda_i$ on the diagonal (but we are not going to give the details here because they are not important for what follows).
Exercise:
2. In fact, the theorem is true for any normal matrix, which satisfies AA† = A† A. Prove that a
Hermitian matrix is a special case of a normal matrix. Prove that a unitary matrix is also a
special case of a normal matrix.
3. Prove the spectral decomposition theorem.
2 Probability Theory
2.1 Random Variables
A random variable X is a function which assigns a real value to each possible outcome of an experiment
or process. Sometimes these values will be the actual measured value in some way: for example, the
value of the random variable X for rolling a 6-sided die will simply be the number on the die. Other
times, the value of the random variable will be just a numerical label assigned to each outcome: for
example, for a coin toss we can assign 1 to heads and 0 to tails.
10
These examples were of discrete random variables, but we can also have continuous random variables,
such as the position of a particle along a line, which in principle can take any real value. There are
some subtleties to continuous random variables, which will not concern us here.
Exercise: Think of more examples of discrete and continuous random variables.
2.2 Probability Distributions

A probability distribution assigns a probability to each possible value of a random variable. For the coin toss, each of the two values is equally likely, so we have
$$P(X=1) = \frac12, \qquad P(X=0) = \frac12, \tag{2.1}$$
and for the die roll we have
$$P(X=1) = \frac16, \qquad P(X=2) = \frac16, \qquad P(X=3) = \frac16, \tag{2.2}$$
$$P(X=4) = \frac16, \qquad P(X=5) = \frac16, \qquad P(X=6) = \frac16. \tag{2.3}$$
The sum of probabilities for all possible values of X is always 1.
These probability distributions are uniform, since they assign the same probability to each value of $X$. However, probability distributions need not be uniform. For example, if we toss two coins $X_1$ and $X_2$ and add the results, $X = X_1 + X_2$, we can get any of the following 4 outcomes:
$$0 + 0 = 0, \qquad 0 + 1 = 1, \qquad 1 + 0 = 1, \qquad 1 + 1 = 2. \tag{2.4}$$
Since the outcome 1 is obtained from two of the four equally likely combinations, the probabilities are:
$$P(X=0) = \frac14, \qquad P(X=1) = \frac12, \qquad P(X=2) = \frac14. \tag{2.6}$$
Exercise: Calculate the probability distribution for the sum of two rolls of a 6-sided die. This is known
to players of role-playing games (such as Dungeons & Dragons) as a “2d6”, where we define ndN to
be the sum of n rolls of an N-sided die.
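One way to work through this exercise, or to check your answer, is to enumerate all 36 equally likely outcomes by brute force; a minimal sketch in Python:

```python
from collections import Counter
from itertools import product

# All 36 equally likely outcomes of rolling two 6-sided dice
counts = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))
dist = {total: count / 36 for total, count in sorted(counts.items())}

print(dist)  # P(X=2) = 1/36, ..., P(X=7) = 6/36, ..., P(X=12) = 1/36
```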
2.3 Expected Values

The expected value of a random variable tells us what value we should expect to get on average. It is defined as
$$\langle X \rangle = \sum_{i=1}^N p_i x_i, \tag{2.7}$$
where $N$ is the total number of possible outcomes, $x_i$ is the value of outcome number $i$ and $p_i$ is its probability. In the example of the coin toss, we have:
$$\langle X \rangle = \frac12 \cdot 0 + \frac12 \cdot 1 = \frac12, \tag{2.8}$$
and for the die roll, we have:
$$\langle X \rangle = \frac16 \cdot 1 + \frac16 \cdot 2 + \frac16 \cdot 3 + \frac16 \cdot 4 + \frac16 \cdot 5 + \frac16 \cdot 6 = \frac72 = 3.5. \tag{2.9}$$
Note that the expected value is often not an actual value the random variable can take.
Exercise: Calculate the expected value for the sum of two coin tosses and for the 2d6. What do you
learn from that?
2.4 Standard Deviation

The standard deviation $\sigma_X$ of a random variable $X$ measures how far its values are typically spread around the expected value. It is defined by$^3$
$$\sigma_X^2 = \left\langle \left( X - \langle X \rangle \right)^2 \right\rangle = \left\langle X^2 \right\rangle - \langle X \rangle^2.$$
Exercise: Prove that the second expression follows from the first. Use the fact that the expected value is linear, $\langle X + Y \rangle = \langle X \rangle + \langle Y \rangle$ and $\langle \lambda X \rangle = \lambda \langle X \rangle$ where $\lambda$ is a constant, and that $\langle X \rangle$ itself is a constant.
For example, for the coin toss we have from before
$$\langle X \rangle = \frac12, \tag{2.10}$$
and we also calculate:
$$\left\langle X^2 \right\rangle = \frac12 \cdot 0^2 + \frac12 \cdot 1^2 = \frac12, \tag{2.11}$$
which gives us
$$\sigma_X = \sqrt{\frac12 - \frac14} = \frac12. \tag{2.12}$$
This is to be expected, since the two actual values of the outcomes, 0 and 1, lie exactly 1/2 away from
the mean in each direction.
For the die roll, we have from before
$$\langle X \rangle = \frac72, \tag{2.13}$$
and we also calculate:
$$\left\langle X^2 \right\rangle = \frac16 \left( 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 \right) = \frac{91}{6}, \tag{2.14}$$
which gives us
$$\sigma_X = \sqrt{\frac{91}{6} - \frac{49}{4}} = \sqrt{\frac{35}{12}} \approx 1.7. \tag{2.15}$$
Exercise: Calculate the standard deviation for the sum of two coin tosses and for the 2d6.
2.5 Normal (Gaussian) Distributions

Figure 1: The normal (or Gaussian) distribution with mean value µ = 0.
The normal distribution is the most common probability distribution you will encounter. The reason is the central limit theorem, which states that whenever we add independent random variables, the probability distribution of the sum will look like a normal distribution. As we add more and more variables, the sum will be closer and closer to a normal distribution.
This can already be seen in the case of the die rolls. For 1d6 we have a uniform distribution, as depicted in Fig. 2. For 2d6, the sum of two die rolls, we get a triangular distribution centered at the mean value of 7, as depicted in Fig. 3. By solving the exercise in Sec. 2.2, you already know that the probability for each combination of the two dice is $\frac16 \cdot \frac16 = \frac{1}{36}$, but the outcomes 2 and 12 appear only once (corresponding to either 1 or 6 on both dice), while the outcome 7 appears six times (corresponding to 1+6, 2+5, 3+4, 4+3, 5+2 and 6+1) and thus has a probability of 6/36 = 1/6.
For 3d6, as depicted in Fig. 4, we see that the probability distribution is starting to obtain the familiar
“bell” shape of the normal distribution. Its mean value is 10.5, as you can calculate. We will get
closer and closer to a normal distribution as we increase the number of dice, that is, the n in nd6. In
the limit n → ∞, we will precisely obtain a normal distribution, but even for small values of n, the
approximation is already close enough for all practical purposes.
$^3$ The square of the standard deviation is called the variance, but it will not interest us here.
Figure 2: The distribution of results for one roll of a 6-sided die, also known as 1d6. It is
a uniform distribution.
Figure 3: The distribution of results for the sum of two rolls of a 6-sided die, also known
as 2d6. It is triangular.
Figure 4: The distribution of results for the sum of three rolls of a 6-sided die, also known
as 3d6. It is starting to obtain the “bell” shape of a normal distribution.
Exercise: Plot the probability distributions of the sum of n coin tosses up to whatever value of n
satisfies you. Note how the distribution looks more and more like a normal distribution as you
increase n.
The numerical value thus does not have any physical meaning whatsoever! It is merely a consequence
of choosing to work with one system of units (e.g. meters and seconds) and not another. For this
reason, we can simply choose to work in Planck units, where:
$$c = 8\pi G = \hbar = \frac{1}{4\pi\varepsilon_0} = k_B = 1. \tag{3.2}$$
Here $c$ is the speed of light, $G$ is the gravitational constant, $\hbar$ is the (reduced) Planck constant used in quantum mechanics, $1/4\pi\varepsilon_0$ is the Coulomb constant used in electromagnetism, and $k_B$ is the Boltzmann constant used in statistical mechanics.
Planck units are commonly used in quantum gravity research, and we will also use them in these
lecture notes. This means that h̄ will not appear in any of our equations!
3.2 States and Operators
A system in a quantum theory is the mathematical representation of a physical system (such as a
particle) as a Hilbert space. The type and dimension of the Hilbert space depend on the particular
system; note that the dimension of the Hilbert space is unrelated to the dimension of spacetime. In
the finite-dimensional case, the Hilbert space will usually be C n for some n, such as C2 , which was
used in the examples above. In the infinite-dimensional case, it will usually be a space of functions,
which is much more complicated, and we will not describe it in detail here.
An operator on a quantum system is a matrix in the appropriate Hilbert space. It represents an action
performed on the system, such as a measurement, a transformation, or time evolution. In the contin-
uous case, where the states are functions, this “matrix” will in fact correspond to derivatives acting
on the functions.
A state of a quantum system is a vector with norm 1 in the appropriate Hilbert space, that is, a vector $|\Psi\rangle$ which satisfies
$$\|\Psi\| = \sqrt{\langle\Psi|\Psi\rangle} = 1. \tag{3.3}$$
The state represents the configuration of the system, and encodes the possible outcomes of measurements performed on that system. It is important to stress that, although there are many vectors in
a Hilbert space, only vectors which have norm equal to 1 represent states. However, if we have a
vector with non-unit norm, we can simply divide it by its norm to obtain a unit vector, which then
represents a state.
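For example, normalizing a vector in $\mathbb{C}^2$ so that it becomes a valid state looks like this in NumPy (a minimal sketch with arbitrary illustrative components):

```python
import numpy as np

v = np.array([3 + 4j, 1 - 2j])     # an arbitrary non-zero vector
state = v / np.linalg.norm(v)      # divide by the norm to get a unit vector

print(np.vdot(state, state).real)  # 1.0, so `state` represents a valid state
```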
3.4 Superposition
Let the state of a quantum system be $|\Psi\rangle$. Once we have chosen a Hermitian operator to be our observable, there is a basis of states $|B_i\rangle$ corresponding to the eigenvectors of the operator. We may then write the vector $|\Psi\rangle$ as a linear combination of the basis vectors, as we did above:
$$|\Psi\rangle = \sum_{i=1}^n |B_i\rangle\langle B_i|\Psi\rangle. \tag{3.4}$$
Remember that $\langle B_i|\Psi\rangle$ is a scalar (a complex number). So this is a sum over the basis vectors $|B_i\rangle$ with a complex number attached to each of them. Such a linear combination of states is called a superposition.
3.5 Probability Amplitudes
In quantum theory, the inner product $\langle B_i|\Psi\rangle$ is called the probability amplitude to measure the eigenvalue $\lambda_i$ corresponding to the eigenvector $|B_i\rangle$, given the state $|\Psi\rangle$. When we take the magnitude-squared of a probability amplitude, we get the corresponding probability. Thus
$$|\langle B_i|\Psi\rangle|^2 \tag{3.5}$$
is the probability to measure the eigenvalue $\lambda_i$ corresponding to the eigenvector $|B_i\rangle$, given the state $|\Psi\rangle$.
Why is this a probability? It is easy to see that it behaves as a probability is expected to behave. It is always non-negative, since it is the magnitude-squared of a complex number. In addition, the sum of all possible probabilities for distinct outcomes has to equal one. Let us check that this is indeed the case.
Exercise: Verify that $\sum_{i=1}^n |\langle B_i|\Psi\rangle|^2 = 1$.
So the magnitudes-squared $|\langle B_i|\Psi\rangle|^2$ are non-negative numbers which sum to 1, and therefore may represent probabilities. Why they actually represent probabilities is a question that has no good answer, except that this is how quantum theory works, and it can be verified experimentally.
The simplest quantum system is the qubit, whose Hilbert space is $\mathbb{C}^2$. Given an orthonormal basis of two states $|0\rangle$ and $|1\rangle$, the most general state of a qubit is the superposition
$$|\Psi\rangle = a|0\rangle + b|1\rangle, \tag{3.7}$$
where $a = \langle 0|\Psi\rangle$ and $b = \langle 1|\Psi\rangle$.
Exercise:
1. Show that $a$ and $b$ indeed take these values (note that we have proven this for a general basis above).
2. Show that the requirement that the state $|\Psi\rangle$ is normalized means that $|a|^2 + |b|^2 = 1$, that is, the probabilities sum to one (again, this follows from the general case we've discussed above).
3. Show that we can write the state $|\Psi\rangle$ explicitly as a vector in the Hilbert space:
$$|\Psi\rangle = \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \langle 0|\Psi\rangle \\ \langle 1|\Psi\rangle \end{pmatrix}. \tag{3.9}$$
The fact that $|0\rangle$ and $|1\rangle$ are orthogonal hints that they are eigenstates of a Hermitian operator with different eigenvalues. Let us introduce the Pauli matrices:
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.10}$$
It is easy to see that $|0\rangle$ and $|1\rangle$ are, in fact, eigenstates of $\sigma_z$, with eigenvalues $+1$ and $-1$ respectively.
Exercise:
3.8 Inner Products with Matrices and the Expectation Value
Consider a Hermitian operator $A$ with an orthonormal basis of eigenstates $|B_i\rangle$ and eigenvalues $\lambda_i$. This means that
$$A|B_i\rangle = \lambda_i|B_i\rangle, \qquad \langle B_i|B_j\rangle = \delta_{ij}, \tag{3.15}$$
where $\delta_{ij}$ is the Kronecker delta, defined above:
$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j. \end{cases} \tag{3.16}$$
Let $|\Psi\rangle$ be the state of the system. We may write $|\Psi\rangle$, as usual, as a superposition of the basis states:
$$|\Psi\rangle = \sum_{i=1}^n |B_i\rangle\langle B_i|\Psi\rangle. \tag{3.17}$$
Let us calculate the inner product $\langle\Psi|A|\Psi\rangle$. Expanding both the bra and the ket in the basis, and using $A|B_j\rangle = \lambda_j|B_j\rangle$, we get
$$\langle\Psi|A|\Psi\rangle = \sum_{i=1}^n \sum_{j=1}^n \lambda_j\,\langle\Psi|B_i\rangle\langle B_i|B_j\rangle\langle B_j|\Psi\rangle = \sum_{i=1}^n \sum_{j=1}^n \lambda_j\,\delta_{ij}\,\langle\Psi|B_i\rangle\langle B_j|\Psi\rangle.$$
When taking the sum over $j$, the $\delta_{ij}$ is always 0 except when $j = i$. Therefore the sum over $j$ always gives us just one term, the one where $j = i$. We get:
$$\langle\Psi|A|\Psi\rangle = \sum_{i=1}^n \lambda_i\,\langle\Psi|B_i\rangle\langle B_i|\Psi\rangle = \sum_{i=1}^n \lambda_i\,|\langle\Psi|B_i\rangle|^2.$$
Since $|\langle\Psi|B_i\rangle|^2$ is the probability to measure the eigenvalue $\lambda_i$ (associated with the eigenstate $|B_i\rangle$) given the state $|\Psi\rangle$, we have obtained, by definition, the expectation value for the measurement of $A$! For this reason, we sometimes simply write $\langle A\rangle$ instead of $\langle\Psi|A|\Psi\rangle$, as long as it is clear that the expected value is taken with respect to the state $|\Psi\rangle$. Sometimes we also use the notation
$$\langle A\rangle_\Psi = \langle\Psi|A|\Psi\rangle. \tag{3.19}$$
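A minimal numerical sketch of this formula (the state below is an illustrative example, and will also be useful for the exercise that follows):

```python
import numpy as np

sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def expectation(A, psi):
    """Return <Psi|A|Psi> for a normalized state psi."""
    return np.vdot(psi, A @ psi).real  # real, since A is Hermitian

psi = np.array([1, 1], dtype=complex) / np.sqrt(2)  # (|0> + |1>)/sqrt(2)
print(expectation(sigma_z, psi))  # 0.0: outcomes +1 and -1 are equally likely
```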
Exercise: Calculate $\langle\sigma_z\rangle_\Psi$ where $\sigma_z$ is the Pauli matrix
$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{3.20}$$
for the following three states:
$$|\Psi\rangle = \frac{1}{\sqrt2}\left( |0\rangle + |1\rangle \right), \tag{3.21}$$
$$|\Psi\rangle = \frac{1}{\sqrt5}|0\rangle + \frac{2}{\sqrt5}|1\rangle, \tag{3.22}$$
$$|\Psi\rangle = \frac{3}{\sqrt{13}}|0\rangle + \frac{2}{\sqrt{13}}|1\rangle. \tag{3.23}$$
where in each of these, the first state is the state of qubit A and the second is the state of qubit B. Thus $|0\rangle\otimes|0\rangle$ corresponds to $|0\rangle$ for both qubits, $|0\rangle\otimes|1\rangle$ corresponds to $|0\rangle$ for qubit A and $|1\rangle$ for qubit B, $|1\rangle\otimes|0\rangle$ corresponds to $|1\rangle$ for qubit A and $|0\rangle$ for qubit B, and $|1\rangle\otimes|1\rangle$ corresponds to $|1\rangle$ for both qubits.
The product $\otimes$ is called a tensor product. We will not go into its definition and properties; what's important is that it simply represents a way to combine the states of two separate systems into one system. Note that it is not commutative, since it matters which system is the first and which system is the second. It is, however, distributive.
The most general state of both qubits is described as a superposition of all possible combinations:
$$|\Psi\rangle = \alpha_{00}\,|0\rangle\otimes|0\rangle + \alpha_{01}\,|0\rangle\otimes|1\rangle + \alpha_{10}\,|1\rangle\otimes|0\rangle + \alpha_{11}\,|1\rangle\otimes|1\rangle, \tag{4.4}$$
where $\alpha_{00}, \alpha_{01}, \alpha_{10}, \alpha_{11} \in \mathbb{C}$. We may now ask the question: when do the two qubits depend on each other? Another way to phrase this question is that we would like to know whether qubit A can be $|0\rangle$ or $|1\rangle$ independently of the state of qubit B, and vice versa. This depends on the coefficients $\alpha_{ij}$, as we will now see.
Let us define a separable state: this is a state which can be written as just one tensor product instead of a sum of tensor products. That is, a separable state is of the form
$$|\Psi\rangle = |\Psi_A\rangle \otimes |\Psi_B\rangle, \tag{4.5}$$
where $|\Psi_A\rangle$ is the state of qubit A and $|\Psi_B\rangle$ is the state of qubit B. A simple example of a separable state would be:
$$|\Psi\rangle = |0\rangle\otimes|0\rangle. \tag{4.6}$$
This just means that both qubits are, with 100% probability, in the state $|0\rangle$. A more interesting separable state is:
$$|\Psi\rangle = \frac12\left( |0\rangle\otimes|0\rangle + |0\rangle\otimes|1\rangle + |1\rangle\otimes|0\rangle + |1\rangle\otimes|1\rangle \right). \tag{4.7}$$
To see that it is separable, all we need to do is simplify using the distributive property, and get:
$$|\Psi\rangle = \frac{1}{\sqrt2}\left( |0\rangle + |1\rangle \right) \otimes \frac{1}{\sqrt2}\left( |0\rangle + |1\rangle \right). \tag{4.8}$$
Exercise: Use the distributive property of the tensor product, that is,
$$\left( \alpha|\Psi\rangle + \beta|\Phi\rangle \right) \otimes |\Theta\rangle = \alpha\,|\Psi\rangle\otimes|\Theta\rangle + \beta\,|\Phi\rangle\otimes|\Theta\rangle,$$
and similarly for the second factor, to show that (4.7) and (4.8) are indeed the same state.
In contrast, consider now the following state:
$$|\Psi\rangle = \frac{1}{\sqrt2}\left( |0\rangle\otimes|1\rangle + |1\rangle\otimes|0\rangle \right). \tag{4.12}$$
No matter how much we try, we can never write it as just one tensor product; it is always going to be the sum of two products! This means that the states of the two qubits are no longer independent. Indeed, if qubit A is in the state $|0\rangle$ then qubit B must be in the state $|1\rangle$ (due to the first term), and if qubit A is in the state $|1\rangle$ then qubit B must be in the state $|0\rangle$ (due to the second term). This is precisely what it means for two systems to be entangled.
Exercise: Find three more entangled states.
When Bob gets to Andromeda, he opens his envelope. If he sees 0 he knows that Alice’s envelope
says 1, and if he sees 1 he knows that Alice’s envelope says 0.
Obviously, this does not allow any information to be transmitted between Alice and Bob, nor does
each envelope need to “know” what’s inside the other envelope in order for the measurements to
match. If Bob sees 0, then the piece of paper saying 0 was inside the envelope all along, and the
piece of paper saying 1 was inside Alice’s envelope all along – and vice versa. The envelopes are
classically correlated and nothing weird is going on. What, then, is the difference between this classical
correlation and quantum entanglement? The answer to this question can be made precise using Bell’s
inequality.
Consider the following experiment. I prepare two qubits, and give one to Alice and another to Bob.
Alice can measure one of two different physical properties of her qubit, Q or R, both having two
possible outcomes, +1 or −1. Similarly, Bob can measure one of two different physical properties of
his qubit, S or T, both having two possible outcomes, +1 or −1.
We now make two important assumptions:
• Locality: Both Alice and Bob measure their qubits at the same time in different places, so that
their measurements cannot possibly disturb or influence each other without sending information faster than light.
• Realism: The values of the physical properties Q, R, S, T exist independently of observation, that
is, they have certain definite values $q, r, s, t$ which are already determined before any measurements took place (like in the envelope scenario).
Under these assumptions, all four values $q, r, s, t$ exist simultaneously, and we may consider the combination
$$rs + qs + rt - qt = (r + q)s + (r - q)t = \pm 2. \tag{4.14}$$
To see that, note that since r = ±1 and q = ±1, we must either have r + q = 0 if they have opposite
signs, or r − q = 0 if they have the same sign. In the first case we have (r − q) t = ±2 because t = ±1
and in the second case we have (r + q) s = ±2 because s = ±1.
Using this information, we can calculate the mean value of this expression. To do that, we assign the
probability p (q, r, s, t) to each outcome of q, r, s, t. For example, we might simply assign a probability
distribution where all probabilities are equal:
$$p(q, r, s, t) = \frac{1}{16}, \tag{4.15}$$
for any values of $q, r, s, t$. However, the probability distribution can be anything. Even though we don't know the probabilities in advance, we can nonetheless still calculate the mean value:
$$\langle rs + qs + rt - qt \rangle = \sum_{q,r,s,t\,\in\,\{-1,+1\}} p(q, r, s, t)\,(rs + qs + rt - qt) \le 2 \sum_{q,r,s,t\,\in\,\{-1,+1\}} p(q, r, s, t) = 2.$$
For the inequality we used the fact that $rs + qs + rt - qt = \pm 2$, and thus it is always less than or equal to 2; for the last equality we used the fact that the sum of all possible probabilities is 1.
Also, since the expected value function is linear, we have
$$\langle rs \rangle + \langle qs \rangle + \langle rt \rangle - \langle qt \rangle = \langle rs + qs + rt - qt \rangle \le 2.$$
Exercise: Prove this.
We thus obtain the (in)famous Bell inequality:
$$\langle RS \rangle + \langle QS \rangle + \langle RT \rangle - \langle QT \rangle \le 2. \tag{4.17}$$
Now we are going to see that quantum entanglement violates this inequality. To see that, assume that
I prepared the following entangled state of two qubits:
$$|\Psi\rangle = \frac{1}{\sqrt2}\left( |0\rangle\otimes|1\rangle - |1\rangle\otimes|0\rangle \right). \tag{4.18}$$
Alice gets the first qubit in the tensor product, and Bob gets the second qubit. We define the observables $Q, R, S, T$ in terms of the Pauli matrices. Alice has
$$Q = \sigma_z, \qquad R = \sigma_x, \tag{4.19}$$
and Bob has
$$S = -\frac{1}{\sqrt2}\left( \sigma_z + \sigma_x \right), \qquad T = \frac{1}{\sqrt2}\left( \sigma_z - \sigma_x \right). \tag{4.20}$$
A direct calculation (which you are encouraged to perform) then gives the expectation values:
$$\langle RS \rangle = \langle QS \rangle = \langle RT \rangle = \frac{1}{\sqrt2}, \qquad \langle QT \rangle = -\frac{1}{\sqrt2}. \tag{4.21}$$
We thus get:
$$\langle RS \rangle + \langle QS \rangle + \langle RT \rangle - \langle QT \rangle = 2\sqrt2 > 2, \tag{4.22}$$
which violates the Bell inequality. This means that our assumptions, either locality or realism (or
both), must be incorrect!
Exercise:
1. Think about the consequences of letting go of each assumption. Which one would you rather
lose, locality or realism?
2. The precise statement of Bell’s theorem is that theories of local hidden variables cannot reproduce
all the predictions of quantum mechanics. You are encouraged to look up theories of local
hidden variables and read about them for a deeper understanding of the theorem.
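The violation can also be verified numerically. The following sketch uses the observables defined above (with $S$ and $T$ chosen as in the standard construction) and computes each correlation directly in the two-qubit Hilbert space $\mathbb{C}^4$:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# The entangled state (|0>(x)|1> - |1>(x)|0>)/sqrt(2) as a vector in C^4
ket01 = np.kron([1, 0], [0, 1]).astype(complex)
ket10 = np.kron([0, 1], [1, 0]).astype(complex)
psi = (ket01 - ket10) / np.sqrt(2)

Q, R = sz, sx                  # Alice's observables
S = -(sz + sx) / np.sqrt(2)    # Bob's observables
T = (sz - sx) / np.sqrt(2)

def corr(A, B):
    """Expectation value of A (on qubit A) tensored with B (on qubit B)."""
    return np.vdot(psi, np.kron(A, B) @ psi).real

print(corr(R, S) + corr(Q, S) + corr(R, T) - corr(Q, T))  # 2.828... = 2*sqrt(2)
```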
4.5 Commutators

The commutator of two operators $A$ and $B$ is defined as
$$[A, B] = AB - BA. \tag{4.23}$$
If the operators commute, then $AB = BA$ and thus the commutator vanishes: $[A, B] = 0$. Otherwise, $AB \neq BA$ and the commutator is non-zero: $[A, B] \neq 0$. The commutator thus tells us if the operators commute or not. Note that any operator commutes with itself: $[A, A] = 0$ for any $A$.
Recall now the Pauli matrices:
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.24}$$
The Pauli matrices do not commute with each other, as you will find in the exercise:
Exercise: Calculate the commutators $[\sigma_x, \sigma_y]$, $[\sigma_y, \sigma_z]$ and $[\sigma_z, \sigma_x]$.
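After doing the calculation by hand, you can check your results with a minimal sketch in Python (the identities in the comments are the well-known Pauli commutation relations):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def comm(A, B):
    return A @ B - B @ A

print(np.allclose(comm(sx, sy), 2j * sz))  # True: [sx, sy] = 2i*sz
print(np.allclose(comm(sy, sz), 2j * sx))  # True: [sy, sz] = 2i*sx
print(np.allclose(comm(sz, sx), 2j * sy))  # True: [sz, sx] = 2i*sy
```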
4.6 The Uncertainty Principle
When two quantum observables do not commute, we get an uncertainty relation. The most well-known such relation is the position-momentum uncertainty relation$^4$:
$$\Delta x\,\Delta p \ge \frac12. \tag{4.25}$$
On the left-hand side we have the uncertainty, or more precisely the standard deviation, in position x,
which is labeled ∆x, multiplied by the uncertainty in momentum p, which is labeled ∆p. This relation
can be derived from the commutator of the operators x and p:
$$[x, p] = i. \tag{4.26}$$
Let us prove this relation for the general case of two Hermitian operators, A and B, which do not
commute, that is,
$$[A, B] \neq 0. \tag{4.27}$$
Recall that the standard deviation $\Delta A$ of $A$ is given by
$$(\Delta A)^2 = \left\langle \left( A - \langle A\rangle \right)^2 \right\rangle. \tag{4.28}$$
We have seen that expectation values in quantum theory are calculated using the inner product
$$\langle A\rangle = \langle\Psi|A|\Psi\rangle, \tag{4.29}$$
where $|\Psi\rangle$ is the state of the system with respect to which the expectation value is calculated. However, the uncertainty relation does not depend on the specific choice of state, so we will work with a general state $|\Psi\rangle$. The standard deviation is thus
$$(\Delta A)^2 = \langle\Psi|\left( A - \langle A\rangle \right)^2|\Psi\rangle = \langle a|a\rangle, \qquad \text{where} \quad |a\rangle \equiv \left( A - \langle A\rangle \right)|\Psi\rangle,$$
and similarly $(\Delta B)^2 = \langle b|b\rangle$, where $|b\rangle \equiv \left( B - \langle B\rangle \right)|\Psi\rangle$.
Using the Cauchy-Schwarz inequality (given in the linear algebra chapter), we have
$$(\Delta A)^2 (\Delta B)^2 = \langle a|a\rangle\langle b|b\rangle \ge |\langle a|b\rangle|^2,$$
where a short calculation gives
$$\langle a|b\rangle = \langle AB\rangle - \langle A\rangle\langle B\rangle,$$
and similarly
$$\langle b|a\rangle = \langle BA\rangle - \langle A\rangle\langle B\rangle. \tag{4.35}$$
Thus
$$\langle a|b\rangle - \langle b|a\rangle = \langle AB\rangle - \langle BA\rangle = \langle [A, B] \rangle,$$
and since $|z|^2 \ge \left( \operatorname{Im} z \right)^2 = -\frac14\left( z - z^* \right)^2$ for any complex number $z$, we get
$$(\Delta A)^2 (\Delta B)^2 \ge -\frac14 \langle [A, B] \rangle^2. \tag{4.36}$$
Finally, since $-\frac14 = \left( \frac{1}{2i} \right)^2$, we can write
$$(\Delta A)^2 (\Delta B)^2 \ge \left\langle \frac{1}{2i}[A, B] \right\rangle^2. \tag{4.37}$$
Exercise:
1. Show that substituting the commutator
$$[x, p] = i \tag{4.38}$$
into this general uncertainty relation reproduces the position-momentum uncertainty relation (4.25).
3. Calculate the uncertainty relation for $\sigma_x$ and $\sigma_y$ given the most general state of a qubit, $|\Psi\rangle = a|0\rangle + b|1\rangle$.
4.7 Dynamics: Discrete Time Evolution

In quantum theory, the state $|\Psi(t_1)\rangle$ of a system at time $t_1$ is related to its state $|\Psi(t_2)\rangle$ at a later time $t_2$ by a unitary operator $U(t_1, t_2)$:
$$|\Psi(t_2)\rangle = U(t_1, t_2)\,|\Psi(t_1)\rangle.$$
As always, the exact form of $U(t_1, t_2)$ is determined by the specific quantum system. All quantum theory tells us is that it must be a unitary operator, just like an observable must be described by a Hermitian operator.
Exercise:
1. Recall that the Pauli matrices $\sigma_x, \sigma_y, \sigma_z$ are unitary. Take $U(t_1, t_2)$ to be each of these matrices, and describe how the general qubit state, $|\Psi\rangle = a|0\rangle + b|1\rangle$, evolves using each choice. Note that this is a discrete evolution, that is, there is no explicit dependence of the state on a continuous time variable $t$. We simply evolve from one state to another by applying the unitary operator.
2. Another unitary operator which acts on qubits is the Hadamard gate:
$$G = \frac{1}{\sqrt2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{4.44}$$
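A minimal sketch of such discrete evolution in NumPy, applying the Hadamard gate to the basis states:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
G = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard gate

print(G @ ket0)  # (|0> + |1>)/sqrt(2): an equal superposition
print(G @ ket1)  # (|0> - |1>)/sqrt(2)
print(np.allclose(G @ G.conj().T, np.eye(2)))  # True: G is unitary
```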
4.8 Dynamics: Continuous Time Evolution
The equation we gave in the previous section relates two states at two different times by a unitary operator. However, in cases where the states depend on a continuous time variable $t$, we may write a differential equation giving the state $|\Psi\rangle$ for any time $t$. This equation is the Schrödinger equation$^5$:
$$\frac{d\,|\Psi(t)\rangle}{dt} = -iH\,|\Psi(t)\rangle. \tag{4.45}$$
It is called a differential equation because it involves derivatives. The equation simply tells us how $|\Psi(t)\rangle$ changes when $t$ changes, in a precise way.
On the right-hand side, the operator H is a Hermitian operator called the Hamiltonian. This Hermitian
operator corresponds to an observable: the energy of the system. The evolution of the state |Ψ (t)i is
thus dictated solely by the action of the Hamiltonian operator H on the state.
Since the Hamiltonian is Hermitian, it has eigenvalues $E_k$ corresponding to eigenstates $|E_k\rangle$ which make up an orthonormal basis$^6$:
$$H\,|E_k\rangle = E_k\,|E_k\rangle. \tag{4.46}$$
The basis eigenstate $|E_k\rangle$ is simply a state in which the system has energy $E_k$, and is called an energy eigenstate. There will always be a state of lowest energy, that is, a state $|E_0\rangle$ for which the eigenvalue $E_0$ is the lowest among all the eigenvalues:
$$E_0 \le E_k \quad \text{for all } k. \tag{4.47}$$
Solving the Schrödinger equation, we find that the state $|\Psi_2\rangle$ at time $t_2$ is related to the state $|\Psi_1\rangle$ at time $t_1$ by
$$|\Psi_2\rangle = e^{-iH(t_2 - t_1)}\,|\Psi_1\rangle. \tag{4.48}$$
The perceptive student will surely notice that this equation is of the form presented in the previous
section, with the unitary operator
$$U(t_1, t_2) = e^{-iH(t_2 - t_1)}. \tag{4.49}$$
Therefore, the Schrödinger equation is equivalent to the discrete time evolution equation presented
above. Note that this also explains where the i in the Schrödinger equation comes from!
Exercise: Show that, if $H$ is any Hermitian operator, then $U = e^{iH\alpha}$ is a unitary operator for any real number $\alpha \in \mathbb{R}$.
• Unitary operators corresponding to time evolution of the system,
• Specific states on which these operators act, which correspond to different configurations of the
system.
Of course, not every possible model we can make will actually correspond to a physical system that
we can find in nature. However, amazingly, the opposite statement does seem to be true: every phys-
ical system that we find in nature can be precisely described by a model built using the ingredients of
quantum theory!
We can think of quantum theory as a sort of language. Just like English is a language with rules such
as grammar and spelling, so is quantum theory a language with its own rules: observables must be
Hermitian operators, measurements are given by eigenvalues of these operators, and so on. And just
like we can use English to make any sentence we want, both true and false, we can use quantum
theory to make any model we want, both models that correspond to real physical systems and those
that do not.
of the canonical coordinates, q and p, which in the simplest cases represent position and momentum
respectively.
Since we have limited time, and we are interested in quantum mechanics and not classical mechanics, we will not go over the Hamiltonian formulation in detail. We will instead just note that the Hamiltonian is generally of the form
$$H = T(p) + V(q), \qquad T(p) = \frac{p^2}{2m}, \tag{5.1}$$
where T is the kinetic energy, V is the potential energy, and m is the mass of the particle. Using this
expression we can then get the time evolution of the system, that is, the derivatives of q and p with
respect to time t:
$$\frac{dq}{dt} = +\frac{\partial H}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}. \tag{5.2}$$
These equations are called Hamilton’s equations.
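To see Hamilton's equations in action, one can integrate them numerically; the sketch below uses a simple Euler step and an illustrative potential $V(q) = \frac12 k q^2$ (a classical harmonic oscillator):

```python
m, k, dt = 1.0, 1.0, 0.001   # illustrative mass, spring constant, time step
q, p = 1.0, 0.0              # initial conditions

for _ in range(1000):        # evolve up to t = 1
    dq = p / m               # dq/dt = +dH/dp
    dp = -k * q              # dp/dt = -dH/dq
    q, p = q + dq * dt, p + dp * dt

print(q, p)  # close to cos(1) and -sin(1), the exact solution at t = 1
```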
In canonical quantization of a classical system described by a Hamiltonian, we promote the variables q
and p, as well as the Hamiltonian, to Hermitian operators. All three operators now represent observ-
ables in the quantum theory; they have eigenstates and eigenvalues, and these represent measure-
ments. This means that the values of q, p and H are no longer uniquely determined from some initial
conditions, as in the classical theory; they become probabilistic. In addition, the time evolution of the
system is no longer described by Hamilton’s equations, but rather, by the Schrödinger equation.
We will not perform canonical quantization in detail in these lecture notes, except for the quantum
harmonic oscillator (see below). However, many examples can be found in any quantum mechanics
textbook.
Since $\hat x$ is a Hermitian operator, the eigenstates $|x\rangle$ form an orthonormal basis$^7$. The wavefunction $\psi$ for a particle described by the state $|\Psi\rangle$ is given by:
$$\psi(x) = \langle x|\Psi\rangle, \tag{5.4}$$
where $|x\rangle$ is an eigenstate of position. Wavefunctions are, therefore, functions on space which return the probability amplitudes to find the particle at each point in space$^8$.
It should be noted that wavefunctions are not fundamental entities in modern quantum theory. The fundamental entities are the states. Wavefunctions only exist for particular systems where it is meaningful to define them, such as the quantized particle, and even in those cases we lose much of the rich mathematical toolbox of linear algebra, presented above, when we describe the system using wavefunctions instead of states. Furthermore, some quantum systems, such as the qubit, can only be described using states.
$^7$ Or, more precisely, the infinite-dimensional equivalent of an orthonormal basis, which we will not discuss here.
$^8$ Equivalently, wavefunctions may be defined in the momentum basis: $\psi(p) = \langle p|\Psi\rangle$.
5.4 Phases and the Wave-Particle Duality
It is important to note that since the probability, as predicted by quantum mechanics, depends only on the magnitude (or absolute value) of the probability amplitude, we can multiply the amplitude by any complex number with magnitude 1, and the probabilities will remain the same. For example, the following states all have probability 50% to measure either 0 or 1:
$$|\Psi\rangle = \frac{1}{\sqrt2}|0\rangle + \frac{1}{\sqrt2}|1\rangle, \tag{5.5}$$
$$|\Psi\rangle = \frac{1}{\sqrt2}|0\rangle + \frac{i}{\sqrt2}|1\rangle, \tag{5.6}$$
$$|\Psi\rangle = \frac{1}{\sqrt2}|0\rangle - \frac{i}{\sqrt2}|1\rangle, \tag{5.7}$$
$$|\Psi\rangle = \frac{i}{\sqrt2}|0\rangle + \frac{1+i}{2}|1\rangle. \tag{5.8}$$
This means that there is something “extra” in quantum mechanics, more than just probability; the
probability amplitude is the fundamental quantity, and the probability is just a consequence of it!
Exercise: Write down four different states which all have a probability of 1/3 to measure 0 and 2/3
to measure 1. Remember that the states must be normalized to 1.
The phase of the amplitude is irrelevant if we have just one probability amplitude, but if we add up several probability amplitudes and only then take the magnitude-squared, the different amplitudes will interfere with each other. For example, if we have two amplitudes $a = 1/\sqrt2$ and $b = 1/\sqrt2$, then $|a + b|^2 = 2$, while for $a = 1/\sqrt2$ and $b = -1/\sqrt2$ we instead get $|a + b|^2 = 0$: the amplitudes can interfere constructively or destructively, even though $|a|^2 + |b|^2 = 1$ in both cases. Consider now a particle which can reach a detector in one of two ways, for example through one of two slits as in the double-slit experiment$^9$; if $|\Psi_A\rangle$ and $|\Psi_B\rangle$ are the states corresponding to the two possibilities, the full state is the superposition
$$|\Psi\rangle = a\,|\Psi_A\rangle + b\,|\Psi_B\rangle, \qquad |a|^2 + |b|^2 = 1. \tag{5.11}$$
The probability amplitude to measure the particle at a specific position $x_0$ is, of course,
$$\langle x_0|\Psi\rangle = a\,\langle x_0|\Psi_A\rangle + b\,\langle x_0|\Psi_B\rangle. \tag{5.12}$$
The probability itself is the magnitude-squared:
$$|\langle x_0|\Psi\rangle|^2 = |a\,\langle x_0|\Psi_A\rangle + b\,\langle x_0|\Psi_B\rangle|^2 = |a|^2\,|\langle x_0|\Psi_A\rangle|^2 + |b|^2\,|\langle x_0|\Psi_B\rangle|^2 + 2\operatorname{Re}\left( a^* b\,\langle\Psi_A|x_0\rangle\langle x_0|\Psi_B\rangle \right),$$
where $\operatorname{Re}$ stands for "the real part of", that is, if $a, b \in \mathbb{R}$ then $\operatorname{Re}(a + bi) = a$.
$^9$ This is a very important experiment in quantum mechanics. If you're not familiar with it, look it up!
Exercise: Derive explicitly the expression given above for $|\langle x_0|\Psi\rangle|^2$.
The terms $|a|^2\,|\langle x_0|\Psi_A\rangle|^2$ and $|b|^2\,|\langle x_0|\Psi_B\rangle|^2$ are always non-negative. However, the third term is a real number which can be either positive or negative, depending on the specific values of $a, b$, and the phases of the probability amplitudes. This term will either increase or decrease the probability to find the particle at $x_0$, and it is precisely what is responsible for the interference pattern in the double-slit experiment.
We have thus seen that wave-particle duality is not really that mysterious: it’s simply the consequence
of the particle in quantum mechanics having a probability amplitude to be in every possible position,
instead of just one unique position as in classical mechanics!
where $a^\dagger$ is called the creation operator and $a$ is called the annihilation operator.
Exercise:
The last result of the exercise is very important: the form of the Hamiltonian has been simplified considerably! This is what allows us to solve for the energy eigenstates of the system (that is, the eigenstates of $H$) easily. We define a new operator called the number operator:
$$N = a^\dagger a. \tag{5.17}$$
Since both ω and 1/2 are just numbers, the problem of finding the eigenvalues of H now reduces to
finding the eigenvalues of N.
Exercise:
1. Show that if $n$ is an eigenvalue of $N$ then $\omega\left( n + \frac12 \right)$ is an eigenvalue of $H$.
2. Show that
$$[a, a^\dagger] = 1, \qquad [N, a^\dagger] = a^\dagger, \qquad [N, a] = -a. \tag{5.19}$$
Now, let $|n\rangle$ be an eigenstate of $N$ with eigenvalue $n$. Since $N$ is Hermitian, we know that $n$ must be a real number. In fact, we can do more than that. Let us calculate the expectation value:
$$\langle n|N|n\rangle = \langle n|a^\dagger a|n\rangle = \left\| a|n\rangle \right\|^2, \tag{5.20}$$
where we used the fact that $\langle n|a^\dagger$ is the dual vector to $a|n\rangle$. On the other hand, we have
$$\langle n|N|n\rangle = n\,\langle n|n\rangle = n, \tag{5.21}$$
where we used the fact that $n$ is the eigenvalue of $N$ corresponding to the eigenstate $|n\rangle$, that is, $N|n\rangle = n|n\rangle$, and that the state $|n\rangle$ is normalized to 1, like all states, so $\langle n|n\rangle = 1$. By comparing the two equations, we see that
$$n = \left\| a|n\rangle \right\|^2 \ge 0, \tag{5.22}$$
that is, $n$ is not only real but non-negative.
Next, we act with $Na$ and $Na^\dagger$ on $|n\rangle$. In the exercise you showed that
$$Na - aN = [N, a] = -a, \tag{5.23}$$
$$Na^\dagger - a^\dagger N = [N, a^\dagger] = a^\dagger, \tag{5.24}$$
so we have
$$Na = aN - a = a(N - 1), \qquad Na^\dagger = a^\dagger N + a^\dagger = a^\dagger(N + 1), \tag{5.25}$$
and thus
$$Na\,|n\rangle = a(N - 1)\,|n\rangle = (n - 1)\,a|n\rangle, \tag{5.26}$$
$$Na^\dagger\,|n\rangle = a^\dagger(N + 1)\,|n\rangle = (n + 1)\,a^\dagger|n\rangle, \tag{5.27}$$
where we used the fact that $N|n\rangle = n|n\rangle$ and that since $n \pm 1$ is a number, it commutes with operators
and can be moved to the left. Writing this result in a different way, we see that $a|n\rangle$ is an eigenstate of $N$ with eigenvalue $n - 1$, and $a^\dagger|n\rangle$ is an eigenstate of $N$ with eigenvalue $n + 1$; in other words, $a|n\rangle$ is proportional to $|n - 1\rangle$ and $a^\dagger|n\rangle$ is proportional to $|n + 1\rangle$. From (5.22) we already know that $\left\| a|n\rangle \right\|^2 = n$.
To calculate $\left\| a^\dagger|n\rangle \right\|^2$, we recall from the exercise that
$$aa^\dagger - a^\dagger a = [a, a^\dagger] = 1, \tag{5.31}$$
and thus
$$aa^\dagger = a^\dagger a + 1 = N + 1. \tag{5.32}$$
We therefore get
$$\left\| a^\dagger|n\rangle \right\|^2 = \langle n|aa^\dagger|n\rangle = \langle n|(N + 1)|n\rangle = \langle n|N|n\rangle + \langle n|n\rangle = n + 1. \tag{5.33}$$
To summarize, the norms are
$$\left\| a|n\rangle \right\| = \sqrt n, \qquad \left\| a^\dagger|n\rangle \right\| = \sqrt{n + 1}. \tag{5.34}$$
The normalized eigenstates are now obtained, as usual, by dividing by the norm:
$$|n - 1\rangle = \frac{1}{\sqrt n}\,a|n\rangle, \qquad |n + 1\rangle = \frac{1}{\sqrt{n + 1}}\,a^\dagger|n\rangle. \tag{5.35}$$
Another way to write this, from a different point of view, is as the action of the operators $a$ and $a^\dagger$ on the state $|n\rangle$:
$$a|n\rangle = \sqrt n\,|n - 1\rangle, \qquad a^\dagger|n\rangle = \sqrt{n + 1}\,|n + 1\rangle. \tag{5.36}$$
We see that $a$ reduces the energy eigenvalue by 1, while $a^\dagger$ increases the energy eigenvalue by 1. A fancier way to describe this is by saying that $a^\dagger$ gets us to the state of next higher energy (it "creates a quantum of energy") while $a$ gets us to the state of next lower energy (it "annihilates a quantum of energy"). For this reason, we call $a^\dagger$ the creation operator and $a$ the annihilation operator. We also call them the ladder operators, because they let us "climb the ladder" of energy eigenstates.
Going back to the definition of the Hamiltonian in terms of the number operator, we see that
$$H|n\rangle = \omega\left( n + \frac12 \right)|n\rangle, \tag{5.37}$$
and thus $|n\rangle$ is an energy eigenstate with eigenvalue
$$E_n = \omega\left( n + \frac12 \right). \tag{5.38}$$
In particular, since we showed above that $n$ must be non-negative, and since we now also see that it has to be an integer (as it can only be increased or decreased by 1), the possible eigenstates are found to be
$$|0\rangle, |1\rangle, |2\rangle, |3\rangle, \ldots \tag{5.39}$$
The state of lowest energy is $|0\rangle$, also called the ground state, which has energy
$$E_0 = \frac{\omega}{2}. \tag{5.40}$$
We say that $a^\dagger$, which takes us from $|0\rangle$ to $|1\rangle$, excites the harmonic oscillator from the ground state to the first excited state, which has exactly one quantum. Yes, this is what the "quantum" in "quantum mechanics" means! In general, the state $|n\rangle$ has exactly $n$ quanta, while the ground state $|0\rangle$ has no quanta.
We found that the energy of the harmonic oscillator is discrete, or quantized, and the system can only
have energy which differs from ω/2 by equal steps of ω.
Exercise: Prove that
$$|n\rangle = \frac{\left( a^\dagger \right)^n}{\sqrt{n!}}\,|0\rangle. \tag{5.41}$$
This means that, once we know the ground state, we can create any energy eigenstate by simply applying the operator $a^\dagger$ $n$ times and normalizing.
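A minimal numerical sketch of the ladder-operator construction: truncating the infinite ladder to its lowest few rungs, one can represent $a$, $a^\dagger$ and $N$ as matrices and read off the energies (the cutoff `dim` is an artifact of the truncation, not part of the theory):

```python
import numpy as np

dim = 10  # keep only the states |0>, |1>, ..., |9>

# a|n> = sqrt(n)|n-1>: square roots of 1..9 on the superdiagonal
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
adag = a.conj().T      # creation operator
N = adag @ a           # number operator, diagonal with entries 0, 1, ..., 9

omega = 1.0
H = omega * (N + 0.5 * np.eye(dim))

print(np.diag(H).round(1))  # energies omega*(n + 1/2): 0.5, 1.5, 2.5, ...
```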
5.6 The Path Integral Formulation
So far, we have been working exclusively with one way to define quantum mechanics, using Hilbert spaces, states and operators. There is another way, equivalent but very different: path integrals.
The path integral formulation of quantum mechanics tells us that, in order to find the probability
amplitude for the particle to go from the point x1 at time t1 to the point x2 at time t2 , we need to take
into account all the different paths that the particle can take between these points. We will thus sum
(or integrate) over all the paths, and each path will get a certain weight in this sum, given by a phase
factor. The equation describing this process is:
$$\langle x_2, t_2 | x_1, t_1 \rangle = \int_{x(t_1) = x_1}^{x(t_2) = x_2} \mathcal{D}x(t)\,\Phi(x(t)), \tag{5.42}$$
where:
• $\langle x_2, t_2 | x_1, t_1 \rangle$ is the probability amplitude for the particle to go from the point $x_1$ at time $t_1$ to the point $x_2$ at time $t_2$.
• $\int_{x(t_1) = x_1}^{x(t_2) = x_2} \mathcal{D}x(t)$ means "sum over all the paths $x(t)$ such that $x(t_1) = x_1$ and $x(t_2) = x_2$". In other words, sum over all the paths the particle can take from the point $x_1$ at time $t_1$ to the point $x_2$ at time $t_2$.
• $\Phi(x(t))$ is the phase factor, which is a complex number with magnitude 1 assigned to each path $x(t)$.
When we sum over the phase factors for all the different paths, they will interfere constructively and
destructively, as discussed previously. Thus, we see that important results of quantum theory such
as the wave-particle duality follow naturally from this approach, without ever needing to define a
Hilbert space, or assign states and operators to various aspects of the system. Everything is defined
using only the paths x (t) and the phase factor Φ ( x (t)), which is calculated using tools from classical
mechanics.
footing as the dimensions of space. In a relativistic theory, we still evolve systems in time, so in the
relativistic quantum theory, we will still relate states at different times by a unitary operator; but we
must treat space the same way we treat time.
There are two ways to do that. One is to promote time to an operator, just like we did for the position
operator. This works to some extent, but turns out not to be very convenient as a mathematical
framework. The other option is to demote position from an operator to a mere label of states, just like
time is a label of the state |Ψ (t)i. The states of the system will now have, as a label, not only time but
also position, and there is no longer a position operator (and thus also no momentum operator, since
the two go hand in hand).
Roughly speaking, the way it works mathematically is by placing a quantum harmonic oscillator at each point in space. Then we define creation operators $a^\dagger(x)$ and annihilation operators $a(x)$ at each point $x$. These operators create or destroy quanta, which we call particles. The vacuum state $|0\rangle$ is simply the state with no particles anywhere. Acting with $a^\dagger(x)$ on the vacuum state creates one particle at position $x$:
$$a^\dagger(x)\,|0\rangle = |1_x\rangle. \tag{5.43}$$
Acting with $a^\dagger(y)$ on this state creates another particle, this time at position $y$:
$$a^\dagger(y)\,|1_x\rangle = |1_x 1_y\rangle. \tag{5.44}$$
And so on. The states with one or more particles are called excited states. In this way, we can build
excited states with arbitrary numbers of particles at arbitrary points in space. The actual state of the
system is going to be, as usual, a superposition of all the possible states – that is, all the possible
numbers of particles at all the possible positions!
This is called a field because it is exactly what you get when you quantize a classical theory describing
a classical field, such as the electromagnetic field. We will not go over the details here. A nice way to
visualize it is to think about a “field” of harmonic oscillators spread throughout space, each having
its own “ladder” of eigenstates corresponding to different numbers of quanta.
• Scalar fields: They have spin 0 and are represented mathematically simply as some complex
number φ. The only scalar in the Standard Model is the Higgs field, which gives the other fields
mass through a process known as the Higgs mechanism.
• Fermion fields: They have spin 1/2 and may be represented mathematically as a vector in $\mathbb{C}^4$, that is, a vector $(\psi_1, \psi_2, \psi_3, \psi_4)$ with four complex components $\psi_1, \psi_2, \psi_3, \psi_4 \in \mathbb{C}$. Note that this is an abstract vector, not a vector in spacetime itself! All particles of matter are described using fermions. This includes the quarks $u, d, c, s, t, b$ and the leptons $e, \nu_e, \mu, \nu_\mu, \tau, \nu_\tau$. The quarks and leptons come in three generations, each more massive than the previous one and thus requiring more energy to create in particle collisions. The fermions are listed in Table 1.
Table 1: The fermions of the Standard Model.

                       1st Generation            2nd Generation         3rd Generation
  "Up-Type" Quarks     Up (u)                    Charm (c)              Top (t)
  "Down-Type" Quarks   Down (d)                  Strange (s)            Bottom (b)
  Charged Leptons      Electron (e)              Muon (µ)               Tau (τ)
  Neutral Leptons      Electron Neutrino (νe)    Muon Neutrino (νµ)     Tau Neutrino (ντ)
• Vector fields: Also known as gauge fields, they have spin 1 and are represented mathematically as a vector in $\mathbb{R}^4$, that is, a vector $(t, x, y, z)$ in spacetime itself. Gauge fields are used to mediate interactions between the particles. There are four types of gauge bosons in the Standard Model: the photon mediates the electromagnetic interaction, the gluon mediates the strong interaction, and the W and Z bosons mediate the weak interaction.
In general, fields with integer spin are called bosons while fields with half-integer spins are called
fermions. This is why the excitation of the Higgs field is called the Higgs boson and the excitations of
the gauge fields are called gauge bosons. The meaning of spin can be interpreted as follows: if a field has spin $S$, it means that if we rotate space around it, its mathematical representation will come back to its original configuration after $1/S$ full rotations. A scalar field is just a number, so it always stays the
same; a vector field is just like a usual vector in space(time), so it comes back to itself after one full
rotation; and a fermion field is a bit weird because it gains a minus sign after a full rotation, which
means that it goes back to its original state only after two full rotations!
Each of these fields can be excited to obtain a particle. For example, by exciting the electron field using the creation operator at a particular point, we get an electron at that point. The mass of the field is simply the amount of energy one needs to put into the field in order to excite it. So it is much easier to create an electron, with mass 0.5 MeV, than a Higgs boson, with mass 125,000 MeV. (Mega-electron-volt or MeV is a unit of mass roughly equal to $1.8 \times 10^{-30}$ kg.)
6 Quantum Gravity
As we have alluded above, the mathematical framework of quantum theory has been successfully
applied to most of known physics, with the notable exception of gravity. General relativity describes
gravity using the notion of spacetime curvature, and it is our best and most precise theory of gravity.
In the terms we have used so far, quantizing gravity would mean finding a quantum system, that
is, a Hilbert space with appropriate states and operators, which reduces to general relativity when
quantum effects are neglected.
In naive attempts to quantize gravity, one tries to approximate curved spacetime as a flat spacetime
with tiny “bumps” of curvature. These “bumps” are called gravitons. Quantizing them then leads
to a low-energy effective theory, which means that the theory is only valid as long as we are dealing
with large distance scales (or equivalently, low energy scales). For small distances (or high energies),
which are what we are actually interested in when formulating a quantum theory, the graviton theory
breaks down and can no longer produce any predictions. Thus, it is not useful as a scientific theory.
There are several speculative approaches to solving the problem of quantizing gravity. Some ap-
proaches, like string theory, hypothesize a completely new quantum theory, with new fundamental
entities (strings, in this case), which produces general relativity in the appropriate limit. Other ap-
proaches, such as loop quantum gravity, start with general relativity itself, with a fully curved space-
time, and try to quantize it like any other quantum field theory. However, at this point all of these
approaches are purely hypothetical, with no complete theory available and definitely no experimental verification.
In order to formulate a complete and mathematically consistent theory of quantum gravity, fresh
ideas that have not been tried before are needed. Perhaps the reader of these lecture notes will contribute to this endeavor?
7 Further Reading
The readers who successfully made it through these lecture notes, and wish to learn more about quantum theory, are invited to check out volume III of the lectures by Feynman [1], the undergraduate-level quantum mechanics textbook by Griffiths and Schroeter [2], and the quantum computation textbook by Nielsen and Chuang [3].
8 Acknowledgments
The author would like to thank his three groups of students from ISSYP 2015, 2016 and 2017. It was
a great pleasure to teach you, and these notes could not have been completed without your valuable
feedback.
This research was supported in part by Perimeter Institute for Theoretical Physics. Research at
Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science.
References
[1] R.P. Feynman, R.B. Leighton, and M. Sands. The Feynman Lectures on Physics, boxed set: The New
Millennium Edition. Basic Books, 2011.
[2] David J. Griffiths and Darrell F. Schroeter. Introduction to Quantum Mechanics. Cambridge University Press, 3rd edition, 2018.
[3] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information: 10th
Anniversary Edition. Cambridge University Press, 2010.