
Version 0.2h Eduardo Martin-Martinez - Phys 434

Block 1

Review of the mathematical formalism of Quantum Mechanics
1.1 Introduction to Hilbert spaces. Dual space and Bra-Ket notation

1.1.1 Hilbert spaces and inner product

Definition (Vector space). A vector space over a field F (typically R or C; the elements of F are called
scalars) is a set V together with two operations, vector addition and multiplication by a scalar, that satisfy a
series of axioms:

• The addition has the following properties.

– Associativity: u + (v + w) = (u + v) + w.
– Commutativity: u + v = v + u.
– There exists an identity element 0 such that u + 0 = u.
– Every element u has an inverse −u such that u + (−u) = 0.

• The multiplication by scalar has an identity element 1 such that 1u = u.

• Compatibility of both operations: a(bv) = (ab)v.

• Distributivity of multiplication by a scalar:


a(u + v) = au + av; (a + b)v = av + bv.

where u, v, w ∈ V and a, b ∈ F .

The elements of V are called vectors. For the purposes of this introduction, it is going to be almost always
the case that F = C. Then V is called a complex vector space.

Definition (Inner product). An inner product ⟨·, ·⟩ is a map from V × V to C (it assigns a complex number
to each pair of elements of the vector space V ), which satisfies the following properties:

• Conjugate symmetry: ⟨x, y⟩ = ⟨y, x⟩*.

• Linearity in the second argument: ⟨x, ay + bz⟩ = a⟨x, y⟩ + b⟨x, z⟩.
  Hence antilinearity in the first argument: ⟨ax, y⟩ = a*⟨x, y⟩.

• Positive-definiteness: ⟨x, x⟩ ≥ 0, with ⟨x, x⟩ = 0 ⇔ x = 0.

Review of the foundations of QM 1 Eduardo Martín-Martínez



An example of an inner product is the dot product in Cⁿ. Let us take the two vectors

x = (i, 0, 1)†,   y = (1, 1, 0)†, (1.1.1)

and compute the inner product ⟨x, y⟩. Notice that, to satisfy the properties above, the components of the
first vector have to be complex conjugated when multiplied by the components of the second. Therefore,

⟨x, y⟩ = −i · 1 + 0 · 1 + 1 · 0 = −i. (1.1.2)
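This computation is easy to check numerically. The following is a minimal NumPy sketch (not part of the notes); note that `np.vdot` conjugates its first argument, which matches the convention ⟨x, y⟩ = Σᵢ xᵢ* yᵢ used above.

```python
import numpy as np

# The vectors of Eq. (1.1.1)
x = np.array([1j, 0, 1])
y = np.array([1, 1, 0])

# np.vdot conjugates its first argument, matching <x, y> = sum_i x_i^* y_i
inner = np.vdot(x, y)
assert inner == -1j   # agrees with Eq. (1.1.2)
```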

The inner product defines a Euclidean norm, which we denote ‖x‖. The norm is the square
root of the inner product of a vector with itself,

‖x‖ = +√⟨x, x⟩, (1.1.3)

and this norm defines a natural distance between two vectors as the norm of their difference:

d(x, y) = ‖x − y‖. (1.1.4)

Let us now introduce the notion of Cauchy sequence.


Definition (Cauchy sequence). Given a metric space (V, d) (where V is a set, e.g., the elements of a vector
space, and d is a distance function), a sequence {v₁, v₂, v₃, . . . }, where vᵢ ∈ V, is a Cauchy sequence if for
every positive real number ε > 0 there is a positive integer N such that for all positive integers m, n > N the
distance d(v_m, v_n) < ε. A metric space (V, d) in which every Cauchy sequence converges to an element of V
is called complete.

In simple and rough words: a Cauchy sequence is a sequence whose elements
become arbitrarily close to each other as the sequence progresses.
Not all Cauchy sequences converge in the metric space in which they are defined. For example, consider as the
metric space the rational numbers V = Q with the absolute value as the norm and the distance that comes
from it. The sequence

aₙ = (1 + 1/n)ⁿ (1.1.5)

is a Cauchy sequence (all the elements are rational numbers [e.g., a₁ = 2, a₂ = 9/4, a₃ = 64/27, a₄ = 625/256, . . . ],
and the distance between terms becomes arbitrarily small as n grows). However, the limit of the sequence is
the number e, which is not rational.
A vector space with a notion of norm (which defines a distance as the norm of the difference of two vectors)
that is complete with respect to the norm distance (or, in other words, a vector space in which all Cauchy
sequences converge with respect to the norm distance) is called a Banach space. We now have all the
necessary ingredients for the following definition.
Definition (Hilbert space). A Hilbert space V is a vector space with an inner product which is also complete
with respect to the distance function induced by the inner product. That is, a Banach space in which
the distance function is induced by an inner product is a Hilbert space. All Hilbert spaces are also Banach
spaces; not all Banach spaces are Hilbert spaces.

Examples of Hilbert spaces are Rⁿ and Cⁿ endowed with the inner product defined by the usual dot
product, and the space L²(R) of square-integrable functions over the real numbers with the inner product

⟨f, g⟩ = ∫_R dx f(x)* g(x). (1.1.6)


1.1.2 Bras and Kets. The dual space

Definition (Linear operator). A linear operator Ô is a map between two Hilbert spaces V and W that preserves:

• additivity: Ô(v₁ + v₂) = Ô(v₁) + Ô(v₂);

• multiplication by a scalar: Ô(av) = aÔ(v).

The set of vectors v ∈ V on which Ô is defined is called the domain of Ô, Dom(Ô), and the set of vectors
{Ô(v) ∈ W : v ∈ Dom(Ô)} is called the image of Ô, Im(Ô).

Examples of linear operators are n × m matrices (which map Cᵐ into Cⁿ), or the derivative operator ∂ₓ
in the space of square-integrable functions L²(R).

In a finite-dimensional Hilbert space, linear operators admit a ‘matrix representation’: any linear
operator mapping vectors in a Hilbert space of dimension m to a Hilbert space of dimension n can be written
as an n × m matrix once we have chosen specific bases in both spaces. For our purposes we will mainly focus
on linear operators mapping a vector space to itself. These can be represented by square matrices in the case
of finite-dimensional Hilbert spaces.

Now, we may ask the following question: is there a set of linear operators such that they map vectors of the
Hilbert space into complex numbers? As we will discuss in more detail when we review differential geometry,
such objects are called ‘linear functionals’ or ‘one-forms’. In some contexts they are also called ‘co-vectors’.
More formally, a linear functional is a linear operator Â on the space V such that

Ây = c ∈ C, (1.1.7)

with y ∈ V.

Example: Let us take C³ as our Hilbert space. We represent y ∈ C³ with a column vector

y = (y₁, y₂, y₃)†. (1.1.8)

A linear operator Â (a matrix) such that Ây = c is a complex number has to be a row vector, let us say
Â = (x₁*, x₂*, x₃*), so that

(x₁*, x₂*, x₃*) (y₁, y₂, y₃)† = x₁* y₁ + x₂* y₂ + x₃* y₃ ∈ C. (1.1.9)

So, row vectors are the linear functionals on the space of column vectors. We will say that the dual space
of C³ is the vector space of row vectors (the co-vectors of the column vectors). More generally (see the section on
differential geometry for details):

Definition (Dual space). The dual space V* of a vector space V is the vector space formed by the linear
functionals on V, i.e., the operators Â such that Ây = c ∈ C for all y ∈ V.


In the Hilbert space Cⁿ, the dot product uniquely assigns a linear functional to each vector and vice versa,
Â_v ≡ v†, so that

Â_v w = ⟨v, w⟩ = v† · w. (1.1.10)

This can be generalized to general Hilbert spaces, since the inner product provides a unique way of assigning
to each vector v a linear functional Â_v through the relation

Â_v w = ⟨v, w⟩. (1.1.11)

Inspired by this and following in Dirac’s footsteps, let us notate the vectors of the Hilbert space as
y ≡ |y⟩, and the element of the dual space uniquely assigned by the inner product as Â_x ≡ ⟨x|. The inner
product can then be written as

⟨x, y⟩ = ⟨x|y⟩. (1.1.12)

1.1.3 Adjoint of a linear operator and self-adjoint operators

In the bra-ket notation we usually express the action of an operator on a vector as

|w⟩ = Ô|v⟩. (1.1.13)

Each linear operator in a Hilbert space has a so-called adjoint operator. If the Hilbert space is finite-
dimensional, the adjoint is the Hermitian conjugate of the matrix (swapping rows and columns and taking
the complex conjugate of the matrix elements). In general we can define the adjoint
operation abstractly as follows:

Definition (Adjoint of a linear operator). Consider a linear operator Ô mapping vectors v ∈ V to vectors
w ∈ W. The adjoint of Ô, denoted Ô†, is the unique operator from W to V that satisfies

⟨w, Ôv⟩ = ⟨Ô†w, v⟩. (1.1.14)

In the bra-ket notation this tells us that if Ô|v⟩ = |u⟩ then ⟨v|Ô† = ⟨u|. This means that we
can write (1.1.14) equivalently as

⟨w|Ô|v⟩ = ⟨v|Ô†|w⟩* (1.1.15)

and the operator Ô† that satisfies (1.1.15) is the adjoint of Ô.
A couple of important properties of the adjoint operation, very useful to remember, are

(cA)† = c* A†, (1.1.16)
(AB)† = B† A†, (1.1.17)

where c ∈ C.
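These properties are easy to verify numerically. The following NumPy sketch (illustrative only; the random matrices are arbitrary examples) checks (1.1.16), (1.1.17), and also the adjoint relation (1.1.15):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
c = 2.0 - 0.5j

def dag(M):
    """Hermitian conjugate: transpose plus complex conjugation."""
    return M.conj().T

assert np.allclose(dag(c * A), np.conj(c) * dag(A))   # (cA)† = c* A†
assert np.allclose(dag(A @ B), dag(B) @ dag(A))       # (AB)† = B† A†

# <w|O|v> = <v|O†|w>*, Eq. (1.1.15), for arbitrary vectors
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.isclose(np.vdot(w, A @ v), np.conj(np.vdot(v, dag(A) @ w)))
```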

Definition (Hermitian operator). A linear operator Ĥ is Hermitian if and only if it satisfies

⟨u|Ĥ|v⟩ = ⟨u|Ĥ†|v⟩ (1.1.18)

for all |v⟩, |u⟩ ∈ Dom(Ĥ) ⊆ V.


Notice that to satisfy the Hermitian property Ĥ has to be a map from (a dense subset¹ of) V to itself (i.e.,
a square matrix in finite-dimensional Hilbert spaces).
Definition (Self-Adjoint operator). A linear operator Ĥ is self-adjoint if and only if Ĥ = Ĥ†.

The difference between Hermitian and self-adjoint operators can be seen more clearly in this alternative
definition of self-adjoint operator:
Definition (Self-Adjoint operator). A linear operator Ĥ is self-adjoint if and only if Ĥ is Hermitian and
Dom(Ĥ) = Dom(Ĥ†).

In finite-dimensional Hilbert spaces, Hermitian and self-adjoint are equivalent definitions (Ĥ and Ĥ† are
guaranteed to have the same domain, the whole V, if Ĥ is Hermitian). However, this will not be true
in general for infinite-dimensional Hilbert spaces. An example of an operator which is Hermitian but not
self-adjoint is i d/dx acting on square-integrable functions on the closed interval [0, 1]. In this case, it can be made
self-adjoint by imposing boundary conditions, i.e., restricting the domain on which it acts.
Definition. Let Ô be an operator (not necessarily linear) on a Hilbert space V and Ô(v) its action on a vector
v ∈ V. The operator Ô is bounded if there exists a real number M such that ‖Ô(v)‖ ≤ M‖v‖ for all v ∈ V.
Definition. The eigenvectors u of an operator Ô with eigenvalue λ ∈ C are all u ∈ V such that

Ôu = λu. (1.1.19)

In other words, the eigenvectors of a linear operator Ô are the set of vectors on which Ô acts by multiplication
by a scalar.
The set of all the eigenvalues of a bounded operator is bounded².
Let us see a couple of important theorems regarding Hermitian operators that are very useful in quantum
mechanics:
Theorem 1.1.3.1. If Ĥ is a Hermitian operator then all its eigenvalues λ are real.

Proof. If |λ⟩ is an eigenvector of Ĥ with eigenvalue λ, then by the definition of a Hermitian operator we have that

⟨λ|Ĥ|λ⟩ = ⟨λ|Ĥ|λ⟩* ⇒ λ⟨λ|λ⟩ = λ*⟨λ|λ⟩ ⇒ λ = λ* ⇒ λ ∈ R. (1.1.20)

Although outside the scope of this course, notice that the set of eigenvalues of a Hermitian operator is
only part of its full spectrum. Indeed, the spectrum of a Hermitian operator can be divided into three disjoint
subsets: the point spectrum, the continuous spectrum and the residual spectrum. Rigorously speaking, only
the point spectrum is the set of the operator’s eigenvalues. For a Hermitian operator the eigenvalues and
the continuous spectrum are real, but the residual spectrum may, in general, be complex. It can be proved
that if the operator is also self-adjoint, then there is no residual spectrum and the whole spectrum (point and
continuous) is real (for details see, e.g., [?, ?]). We will come back to this superficially when we analyze the
continuous spectrum of self-adjoint operators.
¹ A subset of V is dense if it intersects every non-empty open set in V.
² A subset S of real numbers is called bounded from above if there is a real number k such that k ≥ s for all s ∈ S. The number
k is called an upper bound of S. The terms bounded from below and lower bound are similarly defined. More generally, a subset
S of a metric space (M, d) is bounded if it is contained in a ball of finite radius, i.e., if there exists x ∈ M and r > 0 such that for
all s ∈ S, we have d(x, s) < r.


Theorem 1.1.3.2. Two eigenvectors with different eigenvalues of a Hermitian operator Ĥ have zero inner
product (are orthogonal).

Proof. Let us consider two eigenvectors of Ĥ with different eigenvalues, such that Ĥ|λ₁⟩ = λ₁|λ₁⟩, Ĥ|λ₂⟩ =
λ₂|λ₂⟩ with λ₁ ≠ λ₂. Then, since Ĥ is Hermitian,

⟨λ₁|Ĥ|λ₂⟩ = ⟨λ₂|Ĥ|λ₁⟩* ⇒ ⟨λ₁|λ₂⟩λ₂ = ⟨λ₂|λ₁⟩*λ₁* ⇒ (1.1.21)

⟨λ₁|λ₂⟩(λ₂ − λ₁*) = 0 ⇒ ⟨λ₁|λ₂⟩(λ₂ − λ₁) = 0, (1.1.22)

where we used theorem 1.1.3.1, which states that λ₁* = λ₁. Now, given that our hypothesis is that the eigenvalues
are different,

λ₁ ≠ λ₂ ⇒ ⟨λ₁|λ₂⟩ = 0. (1.1.23)
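Both theorems can be illustrated numerically for a randomly generated Hermitian matrix. A NumPy sketch (the matrix here is an arbitrary example, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = M + M.conj().T                     # Hermitian by construction

# Theorem 1.1.3.1: the eigenvalues are real (checked with the generic solver)
assert np.allclose(np.linalg.eigvals(H).imag, 0, atol=1e-10)

# Theorem 1.1.3.2: eigenvectors of distinct eigenvalues are orthogonal;
# np.linalg.eigh returns a full orthonormal eigenbasis of a Hermitian matrix
evals, evecs = np.linalg.eigh(H)
assert np.allclose(evecs.conj().T @ evecs, np.eye(4), atol=1e-10)
```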

1.1.4 Tensor product, orthonormal sets, bases and projectors

Tensor Product (of vector and one-forms)

Without entering into deep details that we will see in later blocks, the tensor product of a vector and a one-form
is the following.

Definition (Tensor Product of a vector and a one-form). Let V and W be two vector spaces, |x⟩ ∈ V an
element of V and ⟨y| ∈ W* an element of the dual of W. The tensor product |x⟩ ⊗ ⟨y| ≡ |x⟩⟨y| is a map from
W to V such that

|x⟩⟨y| : |w⟩ → ⟨y|w⟩ |x⟩, (1.1.24)

where |w⟩ ∈ W.

If we think again in terms of Cⁿ as our vector space, the tensor product of a column vector x ∈ Cⁿ and a
row vector y ∈ (Cⁿ)* will be the matrix that takes a column vector v and returns x multiplied by the scalar
product of y and v. As a remark, notice that the properties of the adjoint operation (1.1.17) trivially tell us
that

(|ψ⟩⟨φ|)† = |φ⟩⟨ψ|. (1.1.25)

Orthonormal bases

The Hilbert spaces that we will encounter are separable, i.e., they admit a countable (possibly infinite) orthonormal
basis in the following sense.
A set of linearly independent vectors B = {|eᵢ⟩} ⊂ V such that ⟨eᵢ|eⱼ⟩ = δᵢⱼ (they are orthonormal) is
an orthonormal basis if any vector |v⟩ ∈ V can be expressed as |v⟩ = ∑ᵢ vᵢ|eᵢ⟩, where vᵢ = ⟨eᵢ|v⟩ are the
‘coordinates’ of |v⟩ in the basis B (and the collection of numbers vᵢ provides a representation of |v⟩ in that
basis). Let us rewrite this as

|v⟩ = ∑ᵢ ⟨eᵢ|v⟩ |eᵢ⟩ = ∑ᵢ |eᵢ⟩⟨eᵢ|v⟩ = (∑ᵢ |eᵢ⟩⟨eᵢ|) |v⟩. (1.1.26)


The last relationship above trivially yields the so-called ‘closure’ or ‘completeness relationship’,

∑ᵢ |eᵢ⟩⟨eᵢ| = 𝟙, (1.1.27)

also known as the spectral decomposition of the identity or the resolution of the identity.
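A minimal numerical illustration of the completeness relationship (a NumPy sketch; the particular orthonormal basis of C² used here is just an example):

```python
import numpy as np

# An orthonormal basis of C^2
e1 = np.array([1, 1j]) / np.sqrt(2)
e2 = np.array([1j, 1]) / np.sqrt(2)

# np.outer(a, b) has entries a_i b_j, so the bra slot must be conjugated
identity = np.outer(e1, e1.conj()) + np.outer(e2, e2.conj())

assert np.allclose(identity, np.eye(2))   # sum_i |e_i><e_i| = 1
```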
Having a basis is convenient for operational reasons. For any given orthonormal basis, we can represent
vectors in coordinates and we can represent linear operators in terms of their matrix elements in that basis.
Let us consider a linear operator Ô from the finite-dimensional vector space V to the vector space W, with
{|eᵢ⟩} and {|dᵢ⟩} bases of V and W respectively. The matrix elements Oᵢⱼ of Ô in these bases are

Oᵢⱼ = ⟨dᵢ|Ô|eⱼ⟩ ∈ C. (1.1.28)

That is, the matrix {Oᵢⱼ} takes elements of V in the basis {|eᵢ⟩} and returns elements of W in the basis {|dᵢ⟩}.
As an example, let us consider an operator Ô from C² to itself. We can check whether the notation convention
picked here is consistent with the usual convention for matrix subindices: Oᵢⱼ is the matrix element in the
i-th row and the j-th column. It is rather easy to see that everything is consistent by, for example, finding
the O₁₂ entry of the operator

Ô = ( a  b
      c  d ) (1.1.29)

represented in the canonical basis |ê₁⟩ = (1, 0)†, |ê₂⟩ = (0, 1)† (notice that |êᵢ⟩ are column vectors). According
to the definition (1.1.28),

O₁₂ = ⟨ê₁|Ô|ê₂⟩ = (1, 0) Ô (0, 1)† = b. (1.1.30)

Let us again illustrate this concept with a simple example, finding two representations in different orthonormal
bases of an operator from C² to itself.
Consider the following operator Ô given in the canonical basis B̂, whose elements are |ê₁⟩ = (1, 0)†, |ê₂⟩ =
(0, 1)†:

ÔB̂ = ( −i  0
       −1  0 ), (1.1.31)

where we are using the subindex B̂ to explicitly denote that this matrix is the representation of Ô in the
canonical basis B̂ = {|ê₁⟩, |ê₂⟩}. Let us compute the matrix elements of Ô in a (different) orthonormal basis
B whose elements are:

|e₁⟩ = (1/√2) (1, i)†,   |e₂⟩ = (1/√2) (i, 1)†, (1.1.32)

that is, we want the matrix representing Ô to ‘eat’ vectors in the basis B and return vectors in the same basis.
We know that Oᵢⱼ = ⟨eᵢ|Ô|eⱼ⟩. Therefore

ÔB = ( ⟨e₁|Ô|e₁⟩  ⟨e₁|Ô|e₂⟩ )   (  0   0 )
     ( ⟨e₂|Ô|e₁⟩  ⟨e₂|Ô|e₂⟩ ) = ( −1  −i ). (1.1.33)
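This change of basis can be reproduced numerically: if U is the matrix whose columns are the new basis vectors, then the matrix elements ⟨eᵢ|Ô|eⱼ⟩ are the entries of U†ÔU. A NumPy sketch (illustrative only):

```python
import numpy as np

O_canon = np.array([[-1j, 0], [-1, 0]])       # Eq. (1.1.31)
e1 = np.array([1, 1j]) / np.sqrt(2)           # Eq. (1.1.32)
e2 = np.array([1j, 1]) / np.sqrt(2)

U = np.column_stack([e1, e2])                 # columns: the new basis vectors
O_B = U.conj().T @ O_canon @ U                # entries O_ij = <e_i|O|e_j>

assert np.allclose(O_B, np.array([[0, 0], [-1, -1j]]))   # Eq. (1.1.33)
```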

Although we have explicitly denoted above the representation in which we are writing the matrix corre-
sponding to the operator Ô using the basis subindex, it is common for finite-dimensional linear operators not


to specify the representation explicitly to lighten the notation. We will adhere to this in all cases where there
is no ambiguity.
It is important to remark at this point the difference between vectors and operators in a Hilbert space
and their representations in a given basis. Very often (even earlier in these notes) notation is abused by
identifying an abstract vector in a Hilbert space with its representation in a given basis. While sometimes
convenient for notational reasons, it is important to always keep in mind that a vector in, say, C³ is not a
collection of three complex numbers; rather, given a basis, it can be represented by these three numbers.
In this regard it would be more correct to write that |x⟩ can be represented as

(⟨eᵢ|x⟩) = (x₁, x₂, x₃)†, (1.1.34)

in the canonical basis {|e₁⟩, |e₂⟩, |e₃⟩} rather than the rigorously incorrect abuse of notation

|x⟩ = (x₁, x₂, x₃)†. (1.1.35)
This will be particularly relevant when we consider infinite-dimensional Hilbert spaces in further sections.
Finally, the trace of an operator Ô is the sum of its eigenvalues. In matrix representation it is just the sum
of the diagonal elements. Given any orthonormal basis B = {|eᵢ⟩}, in our notation it can be
expressed as

Tr Ô = ∑ᵢ ⟨eᵢ|Ô|eᵢ⟩. (1.1.36)

Notice that since the eigenvalues are independent of the basis, the trace is invariant under change of basis.

Projectors

Definition. A projector P̂ is a linear self-adjoint operator that satisfies P̂² = P̂.

The action of a projector is to cancel all the components of a given vector perpendicular to the subspace
onto which it is projecting. Indeed, if |ψ⟩ is any vector, then

|ψ⟩ = P̂|ψ⟩ + (𝟙 − P̂)|ψ⟩ (1.1.37)

and the two terms are orthogonal. An example of a projector is the tensor product of a unit-norm
vector with itself, that is,

P̂_{|e⟩} = |e⟩⟨e|, (1.1.38)

where ⟨e|e⟩ = 1. Indeed, P̂²_{|e⟩} = |e⟩⟨e|e⟩⟨e| = |e⟩⟨e| = P̂_{|e⟩}. The action of P̂_{|e⟩} on any vector of a
Hilbert space V gives the component of that vector in the direction of |e⟩.
More generally, if we consider a subspace S ⊂ V and {|sᵢ⟩} is an orthonormal basis of S, then the projector
onto the subspace S is

P̂_S = ∑ᵢ |sᵢ⟩⟨sᵢ|. (1.1.39)


Let us consider again an example in a simple Hilbert space, R³. We want to build the projector onto the
XY plane. We know that a basis for that plane is made of the two vectors

|e₁⟩ = (1, 0, 0)†,   |e₂⟩ = (0, 1, 0)†. (1.1.40)

We build the projector onto the XY plane following (1.1.39) as

P̂_XY = |e₁⟩⟨e₁| + |e₂⟩⟨e₂| = ( 1 0 0
                               0 1 0
                               0 0 0 ). (1.1.41)

Evidently, the action of P̂_XY on a vector |v⟩ = (vₓ, v_y, v_z)† is just to keep the components on the XY plane:

P̂_XY |v⟩ = (vₓ, v_y, 0)†. (1.1.42)
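A NumPy sketch of this projector (illustrative only), confirming idempotence, self-adjointness, and the action (1.1.42) on an arbitrary example vector:

```python
import numpy as np

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

# P_XY = |e1><e1| + |e2><e2|, Eq. (1.1.41)
P = np.outer(e1, e1) + np.outer(e2, e2)

assert np.allclose(P @ P, P)          # idempotent: P^2 = P
assert np.allclose(P, P.conj().T)     # self-adjoint

v = np.array([3.0, -2.0, 7.0])
assert np.allclose(P @ v, [3.0, -2.0, 0.0])   # the z component is removed
```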

Some aspects of the spectral theorem

Now, if the spectrum of a self-adjoint operator is discrete (finite or infinite) and the sum of the dimensions
of the eigenspaces is the same as the dimension of the Hilbert space, the operator can be diagonalized. In
other words, its eigenvectors form an orthonormal basis of the Hilbert space. Furthermore, if the |λᵢ⟩ are such
that Â|λᵢ⟩ = λᵢ|λᵢ⟩, it is possible to write Â in its diagonal form as a sum of projectors:

Â = ∑ᵢ λᵢ |λᵢ⟩⟨λᵢ|. (1.1.43)

More generally, and hence taking into account the multiplicity of the eigenvalues,

Â = ∑ᵢ λᵢ P̂ᵢ, (1.1.44)

where P̂ᵢ is the projector onto the subspace of eigenvectors with eigenvalue λᵢ.
This is particularly useful in order to compute arbitrary functions of operators f(Â):

f(Â) = ∑ᵢ f(λᵢ) P̂ᵢ. (1.1.45)

Pauli operators

Very often we will make use of some particularly useful self-adjoint operators that act on a two-dimensional
Hilbert space, namely the Pauli operators σ̂x, σ̂y, and σ̂z, defined in the following way. Let {|0⟩, |1⟩} be an
orthonormal basis. Then

σ̂x = |0⟩⟨1| + |1⟩⟨0|, (1.1.46)
σ̂y = i|0⟩⟨1| − i|1⟩⟨0|, (1.1.47)
σ̂z = −|0⟩⟨0| + |1⟩⟨1|. (1.1.48)


These operators satisfy the following commutation relations:

[σ̂x, σ̂y] = 2iσ̂z (1.1.49)

and cyclic permutations of the indices. They also satisfy

σ̂x² = σ̂y² = σ̂z² = 𝟙. (1.1.50)

If we use the canonical matrix representation |0⟩ = (0, 1)† and |1⟩ = (1, 0)†, they have the form

σ̂x = ( 0 1        σ̂y = ( 0 −i        σ̂z = ( 1  0
       1 0 ),            i  0 ),             0 −1 ). (1.1.51)
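The algebra (1.1.49)–(1.1.51) follows directly from the outer-product definitions, which can be verified with a NumPy sketch (illustrative only):

```python
import numpy as np

# Canonical representation |0> = (0,1)^T, |1> = (1,0)^T, as in the text
ket0 = np.array([0.0, 1.0])
ket1 = np.array([1.0, 0.0])

sx = np.outer(ket0, ket1) + np.outer(ket1, ket0)
sy = 1j * np.outer(ket0, ket1) - 1j * np.outer(ket1, ket0)
sz = -np.outer(ket0, ket0) + np.outer(ket1, ket1)

assert np.allclose(sx, [[0, 1], [1, 0]])          # Eq. (1.1.51)
assert np.allclose(sy, [[0, -1j], [1j, 0]])
assert np.allclose(sz, [[1, 0], [0, -1]])
assert np.allclose(sx @ sy - sy @ sx, 2j * sz)    # Eq. (1.1.49)
assert np.allclose(sx @ sx, np.eye(2))            # Eq. (1.1.50)
```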

Continuous spectrum

Now, what happens in the case where the operator Ĥ has a continuous spectrum? The main problem comes
from the fact that the eigenvectors of continuous-spectrum operators are not normalizable (and therefore not
in the Hilbert space). Let us consider for example the ‘position operator’ X̂, which acts by multiplication by
x on functions of x: X̂φ(x) = xφ(x). Now the eigenvalue equation X̂φ(x) = x₀φ(x) admits the (formal)
solutions φ(x) = δ(x − x₀) (obvious: since x₀ is a constant, the eigenfunctions should be zero everywhere
except when x = x₀). Those solutions are not well-defined elements of the Hilbert space, since, to begin with,
they are not normalizable with respect to the L² product, but they can be given meaning in the context of
distributions (limits of sequences of functions).
The usual treatment, due to Dirac, is to write formally the eigenvalue problem as

X̂|x₀⟩ = x₀|x₀⟩ (1.1.52)

and then write the orthonormality condition of the basis in the continuous case as

⟨x₀|x₀′⟩ = δ(x₀ − x₀′), (1.1.53)

where the Dirac delta function is defined as

δ(x − x₀) = (1/2π) ∫_{−∞}^{∞} dp e^{ip(x−x₀)}. (1.1.54)

This allows us to write the completeness relationship as

𝟙 = ∫_{−∞}^{∞} dx₀ |x₀⟩⟨x₀| (1.1.55)

and the spectral decomposition of the operator X̂ as

X̂ = ∫_{−∞}^{∞} dx₀ x₀ |x₀⟩⟨x₀|. (1.1.56)

This is a convenient working solution, but the problem is that the objects in the integral are not well-defined
objects in the Hilbert space. The usual ‘elegant’ solution is to appropriately extend the Hilbert space to
include this sort of distributions. The resulting structure is called a rigged Hilbert space [].
The spectral theorem guarantees that for every self-adjoint operator there corresponds a unique family of
projection operators E(λ) that project onto the proper subspace corresponding to eigenvalues smaller than or
equal to a given λ. The theorem also guarantees that the limit λ → ∞ exists (in the distributional sense) and
is such that E(λ) projects onto the whole Hilbert space in this limit.


1.1.5 Complete commuting sets of self-adjoint operators

We are going to give two extremely useful theorems (without proof) regarding self-adjoint operators.
Theorem 1.1.5.1. Let Â and B̂ be two self-adjoint operators that possess a complete set of eigenvectors.
If [Â, B̂] = 0, then there exists a complete set of vectors which are simultaneously eigenvectors of both Â
and B̂.

That is to say, two self-adjoint operators that commute can be diagonalized simultaneously in the same
basis.
Definition (Complete commuting set of operators). Let (Â, B̂, . . . ) be a set of mutually commuting operators.
If there is only one eigenvector |λₙ⟩ for each combination of eigenvalues (aₙ, bₙ, . . . ), then the set is said to be
a complete commuting set of operators.
Theorem 1.1.5.2. Any operator that commutes with all the members of a complete commuting set must be a
function of the operators in that set.

1.1.6 Unitary operators

To finish this review section, we will define unitary operators.

Definition (Unitary operator). A unitary operator Û is a bounded linear operator mapping (isomorphically)
a Hilbert space V to itself, satisfying ÛÛ† = Û†Û = 𝟙.

In other words, the inverse of a unitary operator is its adjoint.

Unitary operators have the important property that they preserve the inner product. In other words,

⟨Ûx, Ûy⟩ = ⟨x, y⟩, (1.1.57)

which is trivially proven from the definition using the bra-ket notation:

⟨Ûx, Ûy⟩ ≡ ⟨Ûx|Ûy⟩ = ⟨x|Û†Û|y⟩ = ⟨x|y⟩. (1.1.58)

The complex exponential of a self-adjoint operator is unitary; indeed,

Û = e^{iĤ} ⇒ Û† = e^{−iĤ} = Û⁻¹, (1.1.59)

and vice versa, every unitary operator can be written as the exponential of a self-adjoint operator.
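This connection can be illustrated numerically by building e^{iĤ} through the spectral decomposition (1.1.45) and checking unitarity. A NumPy sketch (the random matrix is just an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = (M + M.conj().T) / 2                    # a self-adjoint matrix

# Build U = e^{iH} spectrally: U = sum_k e^{i lambda_k} |lambda_k><lambda_k|
evals, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(1j * evals)) @ V.conj().T

assert np.allclose(U @ U.conj().T, np.eye(3), atol=1e-10)   # U U† = 1
assert np.allclose(U.conj().T @ U, np.eye(3), atol=1e-10)   # U† U = 1
```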
Definition (Anti-linear map). A map F from a complex vector space V to another W is said to be anti-linear
if

F[a|v₁⟩ + b|v₂⟩] = a* F[|v₁⟩] + b* F[|v₂⟩].

Definition (Anti-unitary operator). An anti-unitary operator Ū is a bounded anti-linear operator mapping a
Hilbert space V to itself, satisfying ŪŪ† = Ū†Ū = 𝟙.

An important property of anti-unitary operators is that

⟨Ūx, Ūy⟩ = ⟨x, y⟩*.


1.2 Position and momentum representations

Let us consider a given Hilbert space whose elements are going to be the abstract vectors |ψ⟩. We have seen
that their expression in coordinates in a given basis {|eᵢ⟩} is given by

|ψ⟩ = ∑ᵢ ⟨eᵢ|ψ⟩ |eᵢ⟩.

Now let us assume that |ψ⟩ is an element of the space L² of square-integrable functions. Every abstract
vector |ψ⟩ represents an element of that function space. We call the coordinate representation the expansion
of |ψ⟩ in the basis of eigenfunctions (eigenvectors in a function space) of the Hermitian position operator X̂.
That is, we can write

ψ(x) = ⟨x|ψ⟩, ∀|x⟩. (1.2.1)

That is, since the eigenfunctions |x⟩ of X̂ form a continuous basis, the collection of all the coefficients defines
a continuous function of x. This is the equivalent of the full collection of components of a vector in a
finite-dimensional vector space.
Remembering what we saw before, the eigenstates of X̂ are delta functions. Indeed, for every value
x = x₀ we can write the value of the function ψ(x₀) as

ψ(x₀) = ∫ dx δ(x − x₀) ψ(x).

In the same fashion, the action of an operator in the abstract vector space can be written in coordinates as

Ôψ(x) = ⟨x|Ôψ⟩ = ⟨x|Ô|ψ⟩.

Since X̂ is Hermitian, the action of the position operator is of course

X̂ψ(x) = ⟨x|X̂|ψ⟩ = x⟨x|ψ⟩ = xψ(x).

That is, it acts by multiplication, as we know.

Also, in the position representation we can readily retrieve the usual L² product:

⟨ϕ|ψ⟩ = ∫ dx ⟨ϕ|x⟩⟨x|ψ⟩ = ∫ dx ϕ(x)* ψ(x).

This is just one representation that we can build. We could have considered the momentum operator P̂.
We define this operator by its action on the vectors of the L² Hilbert space in the position representation³ as

⟨x|P̂|ψ⟩ = −i∂ₓψ(x).

We denote the eigenfunctions of the momentum operator as |p⟩, for which

P̂|p⟩ = p|p⟩,

³ A common abuse of notation, again identifying the abstract operator with its position representation, can be found in most
introductory texts on quantum theory, so it is not unusual to see P̂ = −i∂ₓ. Writing this is fine (and convenient sometimes!) as
long as we know it is an abuse of notation.


and as it is a continuous basis it is also delta-normalized:

⟨p|p′⟩ = δ(p − p′).

Let us call φₚ(x) the eigenfunctions of P̂. It would be useful to write the eigenbasis of P̂ in the basis of
eigenfunctions of X̂, that is, to compute ⟨x|p⟩. One way to compute them is to write ⟨x|P̂|p⟩ in the
following two forms:

⟨x|P̂|p⟩ = p⟨x|p⟩,
⟨x|P̂|p⟩ = −i∂ₓφₚ(x) = −i∂ₓ⟨x|p⟩.

These two equations imply

−i∂ₓ⟨x|p⟩ = p⟨x|p⟩,

which is an extremely simple differential equation that can be easily solved as

⟨x|p⟩ = c(p)e^{ipx},

where the constant is determined by the delta normalization:

δ(p − p′) = ⟨p|p′⟩ = ∫ dx ⟨p|x⟩⟨x|p′⟩ = c*(p)c(p′) ∫ dx e^{ix(p′−p)} = c*(p)c(p′) 2πδ(p − p′) ⇒ c(p) = 1/√(2π),

where we have made use of the closure relationship (1.1.55).

Therefore we know that

⟨x|p⟩ = (1/√(2π)) e^{ipx}.

Now, equivalently to (1.2.1), we can have a representation of the vectors |ψ⟩ of the Hilbert space in terms
of the eigenbasis of the momentum operator, that is,

ψ̃(p) = ⟨p|ψ⟩.

How do we change from the position representation ψ(x) to the momentum representation ψ̃(p)? As easily as
introducing the spectral decomposition of the identity operator again:

ψ̃(p) = ⟨p|ψ⟩ = ∫ dx ⟨p|x⟩⟨x|ψ⟩ = (1/√(2π)) ∫ dx e^{−ipx} ψ(x).

Therefore the momentum representation of a state that has ψ(x) as its position representation is just the
Fourier transform of ψ(x):

ψ̃(p) = (1/√(2π)) ∫ dx e^{−ipx} ψ(x).
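As a numerical sanity check of this Fourier-transform relation, the following NumPy sketch (illustrative only) uses the Gaussian ψ(x) = π^{−1/4} e^{−x²/2}, whose momentum representation in this convention is again a Gaussian, ψ̃(p) = π^{−1/4} e^{−p²/2}:

```python
import numpy as np

# psi(x) = pi^{-1/4} exp(-x^2/2); its momentum representation should be
# psi~(p) = pi^{-1/4} exp(-p^2/2)
x = np.linspace(-20, 20, 40001)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2)

for p in (0.0, 0.7, 1.5):
    # psi~(p) = (2 pi)^{-1/2} ∫ dx e^{-ipx} psi(x), discretized as a sum
    psi_tilde = np.sum(np.exp(-1j * p * x) * psi) * dx / np.sqrt(2 * np.pi)
    assert np.isclose(psi_tilde, np.pi ** -0.25 * np.exp(-p ** 2 / 2), atol=1e-8)
```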

1.2.1 Postulates of quantum mechanics

Postulate 1. All physical systems are associated with a separable complex Hilbert space H. Vectors of unit
norm |v⟩ in H are associated with states of the system. Vectors represent the same state if they differ only
by a phase factor. The Hilbert space of a composite system is the Hilbert space tensor product of the state
spaces associated with the component systems.


Postulate 2. Physical observables (in principle, measurable properties of the system⁴) are represented by
(possibly unbounded) self-adjoint operators on H. The result of a ‘classical’ measurement of any observable
Ô will always be one of the eigenvalues of that observable⁵.
Postulate 3 (Born’s rule). The probability of obtaining the outcome aᵢ when measuring the observable
associated with Â on the state |v⟩ is given by

P(A = aᵢ) = |⟨aᵢ|v⟩|², (1.2.2)

where |aᵢ⟩ is an eigenvector of Â with eigenvalue aᵢ.

Therefore, the expectation value (in the sense of probability theory) of the observable Â for the system in
a state represented by the unit vector |v⟩ ∈ H is

⟨Â⟩ := ⟨v|Â|v⟩. (1.2.3)
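A minimal numerical illustration of the Born rule and the expectation value formula on a two-dimensional Hilbert space (a NumPy sketch; the state and observable are arbitrary examples):

```python
import numpy as np

# Observable A = diag(1, -1) with eigenvectors a1 = (1,0)^T, a2 = (0,1)^T
A = np.diag([1.0, -1.0])
a1 = np.array([1.0, 0.0])
a2 = np.array([0.0, 1.0])

# A unit-norm state (the angle is an arbitrary example)
t = 0.3
v = np.cos(t) * a1 + np.sin(t) * a2

# Born rule, Eq. (1.2.2): outcome probabilities
p1 = abs(np.vdot(a1, v)) ** 2
p2 = abs(np.vdot(a2, v)) ** 2
assert np.isclose(p1 + p2, 1.0)            # probabilities sum to one

# Expectation value, Eq. (1.2.3): <v|A|v> = sum_i a_i P(a_i)
assert np.isclose(np.vdot(v, A @ v).real, 1.0 * p1 + (-1.0) * p2)
```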

1.3 Uncertainty relations

As we saw, in its most elemental form, the notion of quantum state is identified with distributions of probability
for observable quantities. In assignment 1 you had the chance to get yourself mathematically re-acquainted
with the notion of commutators. We will use them here to show that the distributions of probability for
quantum observables can be related to each other.
Let A and B be two observables represented by self-adjoint operators Â and B̂, and let

[Â, B̂] = iĈ (1.3.1)

(where the factor i is conveniently introduced so that Ĉ is self-adjoint: Ĉ† = Ĉ). In an arbitrary state
represented by the state vector |ψ⟩, the expectation value and the variance of Â are ⟨Â⟩ = ⟨ψ|Â|ψ⟩ and
∆²A = ⟨ψ|(Â − ⟨Â⟩𝟙)²|ψ⟩, respectively. For B̂ we have analogous expressions.

Let us now define the operators Â0 = Â − hÂi11 and B̂0 = B̂ − hB̂i11. Then the variance of the two
distributions is given by ∆2A = hψ| Â20 |ψi , and ∆2B = hψ| B̂02 |ψi
For any operator T̂ we have that the product with its adjoint is non-negative⁶: hψ| T̂ T̂† |ψi ≥ 0. Let us
now cleverly pick T̂ = Â0 + iω B̂0, so that T̂† = Â0 − iω B̂0, where ω ∈ R. Then we can write the following
inequality:

hψ| T̂ T̂† |ψi = hψ| Â0² |ψi − iω hψ| [Â0 , B̂0 ] |ψi + ω² hψ| B̂0² |ψi ≥ 0. (1.3.2)

 
Since the identity commutes with everything, we know from (1.3.1) that [Â0 , B̂0 ] = iĈ, so that (1.3.2) reads
hψ| Â0² |ψi + ω hψ| Ĉ |ψi + ω² hψ| B̂0² |ψi ≥ 0. The strongest inequality will be given by the value of ω
that minimizes this quadratic form. This value is

ω = − hψ| Ĉ |ψi / (2 hψ| B̂0² |ψi) , (1.3.3)
⁴ One should not always identify self-adjoint operators with magnitudes that can be physically measured. One counterexample
is self-adjoint operators representing non-gauge-invariant objects, such as the electromagnetic potential.
⁵ By a ‘classical’ measurement we mean a process that assigns a fixed value to the result of a measurement process.
⁶ An operator Ô is non-negative if and only if hψ|Ô|ψi ≥ 0 for all |ψi. Equivalently, an operator is non-negative if and only
if all its eigenvalues are non-negative.

therefore the most restrictive inequality is

hψ| Â0² |ψi hψ| B̂0² |ψi − (hψ| Ĉ |ψi)²/4 ≥ 0,

or in other words,

∆²A ∆²B ≥ (hψ| Ĉ |ψi)²/4 , (1.3.4)

which we can write in compact form as

∆A ∆B ≥ (1/2) |hĈi| . (1.3.5)
This result is true for any two operators  and B̂ satisfying (1.3.1). Substituting X̂ and P̂, for which
[X̂, P̂] = iℏ11 and hence Ĉ = ℏ11, yields Heisenberg’s uncertainty principle ∆X ∆P ≥ ℏ/2.
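The bound (1.3.5) can be checked numerically. In this sketch (an illustration only) we pick  = σ̂x and B̂ = σ̂y, for which [σ̂x , σ̂y ] = 2iσ̂z and hence Ĉ = 2σ̂z, and verify the inequality on randomly chosen states:

```python
import numpy as np

# Check Delta_A * Delta_B >= |<C>|/2 for A = sigma_x, B = sigma_y,
# where [A, B] = 2i sigma_z, i.e. C = 2 sigma_z.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
C = 2 * sz

rng = np.random.default_rng(1)
for _ in range(100):
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)                    # random unit state |psi>
    exp = lambda O: (psi.conj() @ O @ psi).real   # <psi|O|psi>
    dA = np.sqrt(exp(sx @ sx) - exp(sx) ** 2)     # Delta_A via <A^2> - <A>^2
    dB = np.sqrt(exp(sy @ sy) - exp(sy) ** 2)
    assert dA * dB >= abs(exp(C)) / 2 - 1e-12     # uncertainty bound holds
```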

1.3.1 Projective Measurements and POVMs

Oftentimes projective measurements are introduced as an additional postulate in quantum mechanics. The
reason is that the notion of measurement in quantum mechanics is a particularly tricky one: while the nature
of quantum states is fundamentally probabilistic, the outcome of measurements that one observes and writes
down when working at a physics lab is a particular numerical result.
The ‘Copenhagian’ handling of the issue of quantum measurements is the most common one in basic
texts on quantum mechanics: they introduce the notion of “collapse of the
wavefunction” and postulate that a measurement ‘collapses’ the state of the quantum system to an eigenstate
of the self-adjoint operator representing the observed quantity corresponding to the eigenvalue that coincides
with the outcome of the measurement. This, however, is terribly unsatisfactory both from the point of view
of quantum foundations and for its incompatibility with relativity. Because of that, it is a notion that is
abandoned in most modern approaches to quantum theory, for the reasons we will discuss below.
Let us begin by introducing a modern (and more general than the usual Copenhagian one) quantum
measurement ‘Postuloid’ to capture the measurement process in quantum mechanics, and of which projective
measurements will be a particular case:
Postuloid 4 (Definition of quantum measurements). Quantum measurements are represented by a set
{M̂n } of measurement operators. These operators act on the state space of the system being measured and
satisfy a completeness relation

∑n M̂n† M̂n = 11. (1.3.6)
The index n refers to the measurement outcomes that may occur in the experiment. If the state of the quantum
system is |ψi immediately before the measurement then

1. The probability that measurement outcome n occurs is given by

p(n) = hψ| M̂n† M̂n |ψi . (1.3.7)

2. The state of the system after the measurement is

|ψ′i = M̂n |ψi / √( hψ| M̂n† M̂n |ψi ) . (1.3.8)
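A minimal sketch of this Postuloid in action (the specific measurement operators below are a hypothetical example of a non-projective measurement, chosen only to illustrate (1.3.6)–(1.3.8)):

```python
import numpy as np

# Sketch (assumption: qubit basis |0>, |1> as standard vectors) of a
# non-projective measurement with operators
#   M0 = sqrt(eta)|0><0| + |1><1|,   M1 = sqrt(1 - eta)|0><0|,
# which satisfy M0^dag M0 + M1^dag M1 = 1.
eta = 0.3
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
M0 = np.sqrt(eta) * np.outer(ket0, ket0.conj()) + np.outer(ket1, ket1.conj())
M1 = np.sqrt(1 - eta) * np.outer(ket0, ket0.conj())

# Completeness relation (1.3.6)
assert np.allclose(M0.conj().T @ M0 + M1.conj().T @ M1, np.eye(2))

psi = (ket0 + ket1) / np.sqrt(2)                   # |psi> = (|0> + |1>)/sqrt(2)
p0 = (psi.conj() @ M0.conj().T @ M0 @ psi).real    # outcome probability (1.3.7)
p1 = (psi.conj() @ M1.conj().T @ M1 @ psi).real

psi_after = M0 @ psi / np.sqrt(p0)                 # update rule (1.3.8)
```

Note that the post-measurement state is normalized by construction, and the outcome probabilities sum to one as a direct consequence of (1.3.6).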

Notice that the set of projectors over an orthonormal basis satisfies the completeness relationship, since the
projectors are Hermitian and square to themselves. If we take the M̂n to be projectors over an orthonormal basis then
(1.3.6) follows immediately, since it is simply the spectral decomposition of the identity in that basis.
The usual projective measurements are a particular case of Postuloid 4 in which the measurement
operators are the projectors over the eigenbasis of an observable represented by the self-adjoint
operator Ô. Indeed, if an experiment that measures the observable Ô can have the outcomes {on }, with
associated eigenstates {|on i} and eigenprojectors {P̂n } = {|on ihon |}, then

1. The probability that measurement outcome on occurs is given by

p(on ) = hψ| P̂n |ψi . (1.3.9)

2. And the state of the system after the measurement is

|ψ′i = P̂n |ψi / √(p(on )) . (1.3.10)

As an example of this, let us consider a qubit in an arbitrary state

|ψi = a |0i + b |1i (1.3.11)

where |0i and |1i are eigenstates of σ̂z with eigenvalues −1 and 1 respectively. Let us perform two sequential
projective measurements: first we measure σ̂z and compute the probability of obtaining the value 1. Assuming
that the measurement outcome is indeed 1, we then measure σ̂x and compute the probability
of obtaining −1. Finally, we compute the state after these two processes.

1. Measuring σ̂z : probability of obtaining the value 1:

Since σ̂z = |1ih1| − |0ih0|, the projector over the eigenvector of σ̂z of eigenvalue 1 is P̂1z = |1ih1|.
Applying (1.3.9):

p(σz = 1) = hψ| P̂1z |ψi = |hψ|1i|² = |b|² . (1.3.12)

2. Computing the post-measurement state:

Applying (1.3.10),

|ψ′i = P̂1z |ψi / √(p(σz = 1)) = (b/|b|) |1i ≡ |1i , (1.3.13)

where in the last step we have used that states that differ only by a phase represent the same physical state.

3. Measuring σ̂x : probability of obtaining the value −1:

We recall that σ̂x = |0ih1| + |1ih0|. This operator has two eigenvectors:

1 1
|+i = √ (|1i + |0i) , |−i = √ (|1i − |0i) (1.3.14)
2 2

respectively, with eigenvalues 1 and −1, as is trivial to check by acting with σ̂x on them. We see that the
projector over the eigenvector of σ̂x of eigenvalue −1 is

P̂−1x = |−ih−| = (1/2)(|1ih1| + |0ih0| − |0ih1| − |1ih0|) . (1.3.15)

Applying (1.3.9):

p(σx = −1) = hψ′| P̂−1x |ψ′i = h1| P̂−1x |1i = 1/2 , (1.3.16)

which is independent of the initial state of the system, since the first measurement ‘collapsed’ the state to
the eigenstate corresponding to the observed measurement result.

4. Computing the post-measurement state:

Applying (1.3.10),

|ψ′′i = P̂−1x |ψ′i / √(p(σx = −1)) = |−i . (1.3.17)
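The whole two-step example can be checked numerically. In this sketch, |0i and |1i are represented as the standard basis of C², and the amplitudes a, b are arbitrary choices for illustration:

```python
import numpy as np

# Numerical check of the sequential projective measurements on a qubit.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
a, b = 0.6, 0.8j                        # arbitrary amplitudes, |a|^2 + |b|^2 = 1
psi = a * ket0 + b * ket1               # |psi> = a|0> + b|1>

P1z = np.outer(ket1, ket1.conj())       # projector onto sigma_z eigenvalue 1
p_z = (psi.conj() @ P1z @ psi).real     # Eq. (1.3.12): should equal |b|^2

psi1 = P1z @ psi / np.sqrt(p_z)         # post-measurement state, ~ |1>

minus = (ket1 - ket0) / np.sqrt(2)      # |->, sigma_x eigenvector, eigenvalue -1
Pmx = np.outer(minus, minus.conj())
p_x = (psi1.conj() @ Pmx @ psi1).real   # Eq. (1.3.16): 1/2, independent of a, b

psi2 = Pmx @ psi1 / np.sqrt(p_x)        # final state, proportional to |->
```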
As anticipated, this classic way of viewing measurements presents fundamental problems. First of all,
if we accept quantum mechanics as a general framework to describe all physical systems in our universe, then no
matter what the measurement apparatus is, it will still be subject to the laws of quantum mechanics.
The interaction between the measurement apparatus and the quantum system being measured would have
to be described by a (possibly very complex, perhaps even unknowable) interaction Hamiltonian, including
the interaction of the system with its environment (as we will discuss in detail in the section about time
evolution later on). However, it can be proved that the action on the probed system of this unitary evolution
of detector+system+environment cannot be just a projection, thus rendering projective measurements unable
to describe such a scenario. Indeed, we can compute the unitary evolution of the total system consisting of target
system, detector and its environment, and then average over all possible states of environment and detector to
get the state of the system, which will not be, in general, representable as a state vector. We will address this
further when we see multipartite quantum systems and density matrices.
Projective measurements have been abused to give the status of physical reality to the phenomenon known as
“collapse of the wavefunction”, which is a convenient fiction that is strongly incompatible with relativity and
inconsistent with the flows of information in quantum mechanics, as we will discuss in more depth when
we talk about multipartite systems and entanglement. This ‘collapse’ is not a justified element of physical
reality (above all if we want to work with relativity as well), but it is a reasonable effective description of the
outcome of most idealized experiments, hence why it is so convenient and widely used. Nevertheless, collapse is not a
physical process in itself, and it is insufficient to understand how flows of information and measurements work
in quantum theory.

POVMs

So if we advocate that projective measurements are not really elements of physical reality, how do we
model the outcome and effect of measurements? POVMs are the answer to that question.
The acronym POVM stands for ‘Positive Operator-Valued Measure’. Let us go back to Postuloid 4, which
is rather elegant and general. Suppose that a measurement is performed upon a quantum system in the state
|ψi that can have n outcomes, each of them associated with a measurement operator M̂n. Let us define the
operators

Ên = M̂n† M̂n , (1.3.18)

which by definition are positive semidefinite (all their eigenvalues are non-negative) and satisfy a completeness
relationship ∑n Ên = 11. The operators {Ên } defining a POVM are called POVM elements.
In the particular case that the set {Ên } is made of projectors, the POVM reduces to a projective measurement.
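As an example of a POVM that is not a projective measurement, consider the so-called ‘trine’ POVM on a qubit (included here only as an illustration): Êk = (2/3)|φk ihφk | with |φk i = cos(tk )|0i + sin(tk )|1i and tk = 2πk/3. Each element is positive semidefinite but not a projector, yet the set sums to the identity:

```python
import numpy as np

# The 'trine' POVM: three elements E_k = (2/3)|phi_k><phi_k| built from
# qubit states separated by 120 degrees. Not projective, but complete.
thetas = [2 * np.pi * k / 3 for k in range(3)]
phis = [np.array([np.cos(t), np.sin(t)], dtype=complex) for t in thetas]
Es = [(2 / 3) * np.outer(p, p.conj()) for p in phis]

assert np.allclose(sum(Es), np.eye(2))                 # sum_n E_n = 1
for E in Es:
    assert np.all(np.linalg.eigvalsh(E) >= -1e-12)     # positive semidefinite
    assert not np.allclose(E @ E, E)                   # NOT a projector

psi = np.array([1, 0], dtype=complex)                  # state |0>
probs = [(psi.conj() @ E @ psi).real for E in Es]      # outcome probabilities
```

Note that this POVM has three outcomes on a two-dimensional Hilbert space, which no projective measurement on the qubit alone could have.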
The power of POVMs is that, assuming that environment, detector, and target system are all ruled by
quantum mechanics, it can be proved that the action on the target system of any kind of interaction of
said system with a detector and its environment, followed by a measurement on the detector, can
be expressed as a (convex combination of) POVM(s) on the system. We will see this in more detail after
introducing the formalism for multipartite quantum systems. That is why we used the name ‘Postuloid’:
POVMs can be thought of as emerging from the action of quantum systems (detector+environment)
unitarily coupling to the target system, on which we do not need to postulate any additional measurement
update rules.
