QUANTUM NOTES: Matrix Analysis - Linear Algebra On Complex Scalar Product Spaces
Budapest Semesters in Mathematics / Aquincum Institute of Technology
Preface
The aim of this note is to deal with ortho-projections, self-adjoint and unitary matrices,
operator positivity and with (finite dimensional) operator- and trace-inequalities. These
are not purely linear algebraic concepts; they belong to the area which some would call
"matrix analysis". The crucial difference is that apart from a vector space, here we also
need a scalar product.
Since we intend to apply the machinery learned here in Quantum Information Theory, we
shall need to study these concepts over the field of complex numbers. However, according
to my experience, although students attending my course have some familiarity with —
what they call — “dot-products” or “inner-products”, they have never seen them employed
on complex spaces. Thus here we begin with the concept of scalar products on complex
spaces.
Over the years, starting as a couple of pages long handout, the notes kept getting steadily
longer and longer. So let me also use this preface to thank BSM student Ryan Utke, who
gave a hand in writing up this note.
2 The adjoint
2.1 Definition and elementary properties
2.2 Normal operators
2.3 Spectral and further characterizations
3 Operator positivity
3.1 Fundamental properties
3.2 Gram matrices
3.3 Operator and trace inequalities
3.4 The cone of positives
3.5 The convex body of density operators
Chapter 1
Scalar products on complex spaces
Definition 1.1.1. Let F be either R or C and V a vector space over F. A binary operation
⟨·, ·⟩ : V × V → F satisfying for all v, x, y ∈ V and λ ∈ F
• ⟨v, x + λy⟩ = ⟨v, x⟩ + λ⟨v, y⟩ (linearity in the second variable),
• ⟨x, y⟩ = \overline{⟨y, x⟩} (conjugate symmetry),
• ⟨v, v⟩ > 0 whenever v ≠ 0 (positive definiteness)
is called a scalar product on V . A vector space with a fixed scalar product is called a
scalar product space.
Note that already by the second listed property it follows that ⟨v, v⟩ must be real;
however, it still could be negative — this is why we made one more requirement. Note
also that the first and second listed properties imply that the scalar product is conjugate
linear (rather than just linear) in its first variable; that is, we have that
⟨x + λy, v⟩ = ⟨x, v⟩ + λ̄⟨y, v⟩.
Of course, what is important here is that we require linearity only in one variable. Whether
we list (hence write) this to be the first or the second variable is a question of convention
and typography, similar to choosing between writing f(x) (i.e. the function f applied to x)
or (x)f (the input x is "loaded" into f). In fact, mathematicians usually follow a different
convention: they would call such an operation an inner product and they would
require linearity in its first, rather than in its second variable. Here we decided to follow
the physicist convention.
The set l²(N) is clearly closed under multiplication by scalars, but as |z + c|² ≤ 2|z|² + 2|c|²
for every z, c ∈ C, it is also closed under addition. Moreover, as for any z, c ∈ C we also
have that |zc| = |z| |c| ≤ ½|z|² + ½|c|², our formula for the scalar product actually gives a
well-defined finite number for any pair of elements of l²(N), turning l²(N) into an infinite
dimensional scalar product space.
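Truncating sequences to finitely many coordinates, the defining properties of such a scalar product can be checked numerically. Below is a minimal sketch assuming Python with numpy; the function name scalar_product is of course our own choice. Note the conjugation on the first argument, matching the physicist convention of these notes.

```python
import numpy as np

# A sketch of the scalar product used in these notes (physicist convention:
# conjugate-linear in the FIRST variable, linear in the second).
def scalar_product(x, y):
    x, y = np.asarray(x, dtype=complex), np.asarray(y, dtype=complex)
    return np.sum(np.conj(x) * y)

x = np.array([1 + 1j, 2j, 3])
y = np.array([2, 1 - 1j, 1j])

# conjugate symmetry: <x, y> equals the conjugate of <y, x>
assert np.isclose(scalar_product(x, y), np.conj(scalar_product(y, x)))
# linearity in the second variable
lam = 2 - 3j
assert np.isclose(scalar_product(x, lam * y), lam * scalar_product(x, y))
# positivity: <x, x> is a positive real number (sum of the |x_k|^2)
assert np.isclose(scalar_product(x, x).imag, 0) and scalar_product(x, x).real > 0
```

(numpy's built-in `np.vdot` performs the same conjugate-first product.)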
Another infinite dimensional generalization can be obtained by replacing the summation by
integration and thus defining the scalar product of scalar-valued functions on some (measure) space
X by the formula
⟨f, g⟩ := ∫_X f̄ g.
Of course, also here, we need to make sure that the integral is well-defined and results in a
finite number. For example, we may choose the vector space V := C([0, 1], C) to be the set
of continuous functions on [0, 1]; then the formula ⟨f, g⟩ := ∫₀¹ \overline{f(x)} g(x) dx will indeed give
a well-defined scalar product on V . Alternatively, like in our previous example, we may
consider the set of square-integrable functions on our space X. However, here some care
is needed, since — unlike in the previous example where we used continuous functions on
[0, 1] — in general ∫_X |f|² = 0 does not imply that f is constantly zero. Thus one usually uses
the above integral formula to introduce a scalar product on the space L²(X) of equivalence
classes of square-integrable functions. So for example, consider two functions f and f̃ on
the real line that are square-integrable (with respect to the usual Lebesgue measure) and
differ only at a single point. Then as functions f ≠ f̃, but as elements of L²(R), we regard
them to be the same since ∫_R |f − f̃|² = 0.
which is precisely how we expect that length should behave. However, we are also used to
another property; namely, the triangle inequality. Is that also automatic? As we shall see
in the next section, the answer is yes.
|⟨u, v⟩| = |⟨u, λu⟩| = |λ| |⟨u, u⟩| = |λ| ‖u‖² = ‖u‖ ‖λu‖ = ‖u‖ ‖v‖.
So let us assume now that u and v are not parallel; then u + λv ≠ 0 for every λ ∈ C and hence
0 < ⟨u + λv, u + λv⟩.
Expanding the above scalar product and noting that ⟨u, λv⟩ + ⟨λv, u⟩ = 2Re(⟨u, λv⟩), we
conclude that
∀λ ∈ C : ‖u‖² + ‖λv‖² + 2Re(⟨u, λv⟩) > 0.
¹ i.e. one of them is a scalar multiple of the other
Then taking a real parameter t ∈ R and setting λ = t \overline{⟨u, v⟩} we find that the real polynomial
q(t) := ‖v‖² |⟨u, v⟩|² t² + 2|⟨u, v⟩|² t + ‖u‖²
is positive for every t ∈ R. Of course, if the scalar product ⟨u, v⟩ is zero, we are ready since
the claimed inequality is then clearly satisfied in a strict manner (note that as u and v
are assumed to be non-parallel, in particular they cannot be zero and hence ‖u‖ and ‖v‖
are strictly positive). On the other hand, when ⟨u, v⟩ ≠ 0, the polynomial q is of second
degree and its strict positivity implies that its discriminant is negative:
(2|⟨u, v⟩|²)² − 4|⟨u, v⟩|² ‖v‖² ‖u‖² < 0,
which, after simple reordering, is just the strict version of the claimed inequality.
But ⟨u, v⟩ + ⟨v, u⟩ = 2Re(⟨u, v⟩) ≤ 2|⟨u, v⟩| and hence, using the Cauchy-Schwarz
inequality,
‖u + v‖² ≤ ‖u‖² + ‖v‖² + 2|⟨u, v⟩| ≤ ‖u‖² + ‖v‖² + 2‖u‖ ‖v‖ = (‖u‖ + ‖v‖)².
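Both the Cauchy-Schwarz and the triangle inequality are easy to test numerically on random complex vectors. A small sketch assuming numpy (whose `vdot` conjugates its first argument, matching our convention); the tolerances are arbitrary choices of ours:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(100):
    u = rng.normal(size=4) + 1j * rng.normal(size=4)
    v = rng.normal(size=4) + 1j * rng.normal(size=4)
    ip = np.vdot(u, v)                    # <u, v>, conjugate-linear in u
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    assert abs(ip) <= nu * nv + 1e-12                    # Cauchy-Schwarz
    assert np.linalg.norm(u + v) <= nu + nv + 1e-12      # triangle inequality
```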
In a real scalar product space — which, especially in the finite dimensional case, is
usually called a Euclidean space — the Cauchy-Schwarz inequality implies that one can
introduce the notion of angle between vectors. If u, v are nonzero, then the ratio
⟨u, v⟩ / (‖u‖ ‖v‖)
must be a number in [−1, 1] and hence there exists a unique α ∈ [0, π] such that
cos(α) = ⟨u, v⟩ / (‖u‖ ‖v‖),
which, in accordance with the formula defining the "dot-product", can be viewed as the
angle between the two nonzero vectors in question.
We shall further say that two collections of vectors S₁ and S₂ are orthogonal to each other
when all vectors of S₁ are orthogonal to all vectors of S₂; in notation: S₁ ⊥ S₂. Also, we
introduce the orthogonal of a collection S of vectors of a scalar product space V as the set
S⊥ ≡ {v ∈ V | u ⊥ v for all u ∈ S}
of all vectors that are orthogonal to every element of S. It is quite evident that
• S⊥ is a linear subspace (regardless of whether S was one or not),
• S⊥ = (Span(S))⊥,
• if S₁ ⊂ S₂ then S₁⊥ ⊃ S₂⊥,
• S ⊂ (S⊥)⊥.
Our experience with the 3-dimensional Euclidean space suggests that we should also have
the following two further properties: if S is already a linear subspace, then S and S⊥ are
complementary and (S⊥)⊥ = S. However, in infinite dimensions these affirmations are
actually false! (See exercise E 1.5.) So although we shall only need finite dimensional scalar
product spaces for our study, we do need some justification of the finite dimensional version
of these statements — we cannot just say that these are "evidently" true when in fact they
do not follow from the defining properties of a scalar product.
We shall address these questions in the next section; for the moment, we shall consider
some special collections of vectors e_1, …, e_n having the property that
⟨e_j, e_k⟩ = 1 if j = k, and 0 otherwise.
Such collections are said to form an ortho-normal system, or in short, an ONS. Indeed,
each member of an ONS is "normalized" (as by the above relation it must be of length 1)
and elements of an ONS are pairwise orthogonal.
Lemma 1.3.1. An ONS e_1, …, e_n is automatically a linearly independent set.
Proof. Suppose c_1 e_1 + … + c_n e_n = 0 for some scalar coefficients c_1, …, c_n. Then considering
the scalar product with e_k for any k = 1, …, n we have that
0 = ⟨e_k, 0⟩ = ⟨e_k, c_1 e_1 + … + c_n e_n⟩ = c_1 ⟨e_k, e_1⟩ + … + c_k ⟨e_k, e_k⟩ + … + c_n ⟨e_k, e_n⟩
= c_1·0 + … + c_k·1 + … + c_n·0 = c_k.
Hence each of our scalar coefficients must be zero, which is exactly what we needed to
conclude.
Proposition 1.3.2. Let e_1, …, e_n be an ONS. Then the map
P : v ↦ Σ_{k=1}^n ⟨e_k, v⟩ e_k
is a linear projection with Im(P) = Span{e_1, …, e_n} and Ker(P) = (Im(P))⊥.
Proof. Since the scalar product is linear in its second variable, P is indeed a linear map.
It is also quite evident that Im(P) ⊂ Span{e_1, …, e_n}, as by our defining formula P v is
actually given as a linear combination of the e vectors. Moreover, as for every j = 1, …, n
P e_j = Σ_{k=1}^n ⟨e_k, e_j⟩ e_k = 0·e_1 + … + 1·e_j + … + 0·e_n = e_j,
we even have that Im(P) = Span{e_1, …, e_n} and that P acts as the identity on its image;
that is, P is a projection onto this subspace. Finally, P v = 0 can only happen when all
the coefficients ⟨e_k, v⟩ are zero, as by our previous lemma the vectors e_1, …, e_n are
linearly independent. Thus Ker(P) =
{e_1, …, e_n}⊥ = (Span{e_1, …, e_n})⊥ = (Im(P))⊥.
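The formula of the proposition translates directly into code. A sketch assuming numpy, with ons_projection our own hypothetical helper name:

```python
import numpy as np

def ons_projection(ons, v):
    """P v = sum_k <e_k, v> e_k for an orthonormal system `ons` (list of vectors)."""
    return sum(np.vdot(e, v) * e for e in ons)

# the first two standard basis vectors of C^3 form an ONS
e1 = np.array([1, 0, 0], dtype=complex)
e2 = np.array([0, 1, 0], dtype=complex)
v = np.array([2 + 1j, -1j, 5], dtype=complex)

Pv = ons_projection([e1, e2], v)
assert np.allclose(Pv, [2 + 1j, -1j, 0])                # component in Span{e1, e2}
assert np.allclose(ons_projection([e1, e2], Pv), Pv)    # P is idempotent: P(Pv) = Pv
# v - Pv lies in the kernel, i.e. it is orthogonal to both e1 and e2
assert np.isclose(np.vdot(e1, v - Pv), 0) and np.isclose(np.vdot(e2, v - Pv), 0)
```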
Corollary 1.3.3. If V is a finite dimensional scalar product space, then there exists an
ONS e_1, …, e_n ∈ V which is also a basis of V.
Proof. We take a nonzero vector 0 ≠ v ∈ V and make the first vector of our ONS its
normalized version: e_1 := (1/‖v‖) v. Then, if we find another unit-vector which is orthogonal
to our first element of our ONS, we set that to be our second element e_2, and we continue
in this manner: whenever we find a unit-vector which is orthogonal to all elements of our
so-far-obtained ONS, we add that as the next element.
This procedure will surely stop at some point: an ONS is linearly independent, thus
its cardinality cannot exceed the dimension of V. But when exactly will this happen?
Suppose at some moment our ONS e_1, …, e_k is not a spanning set in V: there exists a vector
v ∈ V such that v ∉ Span{e_1, …, e_k}.
Consider the projection P introduced in the previous proposition, for which Im(P) =
Span{e_1, …, e_k} and Ker(P) = (Im(P))⊥. We have that the vector v − P v ≠ 0 (as v ∉
Span{e_1, …, e_k}) and that it is actually in the orthogonal of our ONS (since P(v − P v) = 0
and so it is in Ker(P) = (Im(P))⊥). So setting e_{k+1} := (1/‖v − P v‖)(v − P v) creates a larger
ONS. Thus the conclusion is that when the procedure stops (and it indeed does stop after
finitely many steps), it gives an ONS which is a spanning set and hence is also a basis.
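The procedure just described is, of course, the familiar Gram-Schmidt orthonormalization. A minimal numpy sketch of it (gram_schmidt is our own helper name; vectors already in the span are skipped, just as the proof skips a v with v − Pv = 0):

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Build an ONS spanning Span(vectors): project each new vector onto the
    span of the ONS built so far, keep the normalized difference if nonzero."""
    ons = []
    for v in map(lambda x: np.asarray(x, dtype=complex), vectors):
        residual = v - sum(np.vdot(e, v) * e for e in ons)   # v - P v
        norm = np.linalg.norm(residual)
        if norm > tol:                   # skip vectors already in the span
            ons.append(residual / norm)
    return ons

basis = gram_schmidt([[1, 1, 0], [1, 0, 1], [2, 1, 1], [0, 1, 1]])
assert len(basis) == 3                   # the four inputs span a 3-dim space
G = np.array([[np.vdot(e, f) for f in basis] for e in basis])
assert np.allclose(G, np.eye(3))         # <e_j, e_k> = delta_{jk}
```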
Such a basis discussed above is usually called an ortho-normal basis, or in short an
ONB. Note that if E = (e_1, …, e_n) is an ONB in V, then by the formula appearing in the
main proposition of this section, Σ_{k=1}^n ⟨e_k, v⟩ e_k = v for all v ∈ V. Thus we can just read
off the coordinates of v in this basis:
(v)_E = (⟨e_1, v⟩, …, ⟨e_n, v⟩)ᵀ.
Proof. It is clear that it suffices to show one of the relations (the other one follows by
exchanging the roles of U and W). So suppose U ⊥ W; then of course U⊥ ⊃ W. For
the other containment, consider a vector v ∈ U⊥. Since U and W are complementary,
there exists a decomposition v = u + w in which u ∈ U and w ∈ W. Then using the
orthogonality relations,
0 = ⟨u, v⟩ = ⟨u, u⟩ + ⟨u, w⟩ = ‖u‖²,
so u = 0 and hence v = w ∈ W.
Corollary 1.4.2. Let P be an ortho-projection. Then Ker(P) is not only orthogonal to
Im(P), but it is precisely the orthogonal of Im(P). Hence an ortho-projection is fully
determined by the subspace it projects onto.
Proof. By what was explained in the previous sections, there exists a finite ONB in U which
then can be used to explicitly construct the desired projection (see the main proposition
of the last section). The rest has been discussed just before our claim.
Exercises
E 1.1. If the vectors u, v ∈ V of a scalar product space are orthogonal, then one can easily
show that ‖u + v‖² = ‖u‖² + ‖v‖². Is the converse true; i.e. does this "Pythagorean"
equality imply that the vectors in question must be orthogonal?
E 1.2. Can there be 4 (nonzero) vectors in a Euclidean space so that the angle between
any two of them is larger than or equal to 120°?
E 1.3. Use the Cauchy-Schwarz inequality to show that for any a, b, c ≥ 0 we have
√(a + b) + √(a + c) + √(b + c) ≤ √(6(a + b + c)).
E 1.5. Let D ⊂ l²(N) be the subspace formed by the "finite" sequences; i.e. the ones
having only finitely many nonzero terms. What is D⊥? Is it true that (D⊥)⊥ coincides
with D? Is it true that D and its orthogonal are complementary?
E 1.6. Show that if U, W are linear subspaces in a finite dimensional scalar product space,
then U⊥ ∩ W⊥ = (U + W)⊥ and U⊥ + W⊥ = (U ∩ W)⊥. What can be said about the
validity of these relations in infinite dimensions?
E 1.7. We have seen that an ortho-normal system of vectors is always linearly independent.
Let us loosen the condition of "strict" orthogonality: show that if the vectors v_1, …, v_{n+1}
are of unit length with |⟨v_j, v_k⟩| < 1/n for all j ≠ k, then they are linearly independent.
E 1.9. Let P be an ortho-projection. Prove that P v is precisely the point of the subspace
Im(P ) which is the closest to v.
Chapter 2
The adjoint
Lemma 2.1.1. If A₁* and A₂* are both adjoints of A, then A₁* = A₂*.
Proof. Using the defining property of the adjoint we find that ⟨(A₁* − A₂*)v, w⟩ = 0 for
all v, w ∈ V. In particular, setting w := (A₁* − A₂*)v, this implies that (A₁* − A₂*)v = 0
for all v ∈ V, which means that A₁* and A₂* are the same.
After discussing uniqueness, let us turn to the question of existence. Of course, here one
could start by simply trying to show that for each A ∈ Lin(V) there exists a corresponding
adjoint A*. However, more than just knowing existence, what we would really prefer is to
be able to compute the adjoint in a concrete manner; i.e. to have a simple formula for A*.
For this reason, let us pick an ONB E = (e_1, …, e_n) in V and consider how an operator
X ∈ Lin(V) "looks" in this basis. The k-th column of the matrix (X)_E is simply the list of
coordinates of X e_k in our ONB. Thus, by the remark at the end of section 1.3, the entry
value at the j-th row, k-th column is
((X)_E)_{j,k} = ⟨e_j, X e_k⟩.
In particular, ((A*)_E)_{j,k} = ⟨e_j, A* e_k⟩ = ⟨A e_j, e_k⟩ = \overline{⟨e_k, A e_j⟩} = \overline{((A)_E)_{k,j}};
that is, the matrix of the adjoint is the conjugate-transpose of the matrix of A.
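In concrete matrix terms, then, the adjoint is computed by the conjugate-transpose. A quick numerical sanity check of the defining property ⟨A*v, w⟩ = ⟨v, Aw⟩, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A_star = A.conj().T          # conjugate-transpose: the matrix of the adjoint

# the defining property <A* v, w> = <v, A w>, tested on random vectors
for _ in range(20):
    v = rng.normal(size=3) + 1j * rng.normal(size=3)
    w = rng.normal(size=3) + 1j * rng.normal(size=3)
    assert np.isclose(np.vdot(A_star @ v, w), np.vdot(v, A @ w))
```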
From this, the following properties are easily checked:
• (A + B)* = A* + B*,
• (AB)* = B* A*,
• (A*)* = A,
• I* = I,
• Ker(A*) and Im(A) are the orthogonal of each other, and hence rk(A*) = rk(A),
• Tr(A*) = \overline{Tr(A)},
• det(A*) = \overline{det(A)}.
S(V) ≡ {A ∈ Lin(V) | A = A*}
(AB)* = B*A* = BA;
thus, the product remains self-adjoint if and only if A and B commute. This is the exact
opposite of what we have with unitary operators. Indeed, for example both I and −I are
unitary, but I + (−I) = 0 is of course not unitary. Thus the sum of the unitary operators
U₁, U₂ need not be unitary; however,
(U⁻¹)* = (U*)* = U = (U⁻¹)⁻¹,
so the inverse also remains unitary. Thus, the set of unitary operators on V forms a group
under composition.
An easy but fairly important observation is that both unitary and self-adjoint operators
commute with their adjoint. This is a common, nontrivial property
(random examples with matrices will quickly convince the reader that in general X and
X* do not commute). Since in what follows we shall only use this property, it is worth
introducing a word for it; we shall say that an operator
• N such that N*N = NN* is normal.
This extra class does not have such a nice structure as those of the self-adjoint operators
(which form a real linear subspace) or the unitary ones (which form a group under composition).
Actually, when doing quantum physics, we shall not need to consider operators that
are merely normal. However, since both unitary and self-adjoint operators are also normal, by
making statements about them we can avoid repeating propositions for the separate classes
that are important for us.
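This observation can be illustrated numerically. The sketch below (assuming numpy; is_normal is our own helper name) checks a self-adjoint and a unitary example against a generic matrix:

```python
import numpy as np

def is_normal(X, tol=1e-10):
    Xs = X.conj().T
    return np.allclose(Xs @ X, X @ Xs, atol=tol)

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = X + X.conj().T                  # self-adjoint, hence normal
U = np.linalg.qr(X)[0]              # unitary (the Q factor), hence normal

assert is_normal(H) and is_normal(U)
assert not is_normal(X)             # a generic matrix is not normal
```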
where we have used that N e_j = λ_j e_j for some scalar λ_j (as the vectors e_1, …, e_k are all
eigenvectors) and our earlier corollary, by which N e_j = λ_j e_j implies that N* e_j = λ̄_j e_j.
This shows that the subspace W := {e_1, …, e_k}⊥ is invariant under N; i.e. N maps
vectors of W to vectors of W. Thus, if W were at least one-dimensional, then the restriction
of N to W would still have an eigenvector; say 0 ≠ w ∈ W. But then e_1, …, e_k, w/‖w‖
would be an even larger ONS consisting of eigenvectors of N, contradicting the assumed
maximality. Therefore, W must be zero-dimensional, and hence e_1, …, e_k must actually
form a basis.
For the other direction — for the “if” part of the statement — the argument is simpler
and actually it works regardless whether our space is complex or real. Indeed, suppose
that there exists an ONB consisting of eigenvectors of N . Then the matrix of N in this
basis is a diagonal one and hence also its transpose-conjugate is a diagonal matrix. But
two diagonal matrices always commute, so N must commute with N ∗ .
Let us note that the "only if" part of the previous statement becomes false if one
considers real spaces instead of complex ones. Indeed, let V = R² with its standard scalar
product and consider the 90° anticlockwise rotation; i.e. the multiplication with the matrix
R = ( 0  −1
      1   0 ).
This is easily seen to be unitary (or, as it is usually called in the real case: an orthogonal
transformation) and hence also normal. However, it has not a single eigenvector, so clearly
one will find no ONB of R² consisting of eigenvectors of R. In some sense of course R does
have eigenvalues — namely i and −i — but these are not real numbers and hence any
eigenvector of R must contain some non-real entry, too.
The previous proposition can be reformulated in terms of spectral decompositions, too.
By what was established, eigenvectors of a normal operator span the full space, so every
normal operator admits a spectral decomposition.
A spectral projection P_λ projects onto the eigenspace associated to the eigenvalue λ
along the sum of the other eigenspaces, and as we proved in this section, eigenvectors
associated to different eigenvalues of a normal operator are orthogonal. Thus, each
spectral projection appearing in the spectral decomposition of a normal operator is an
ortho-projection.
Corollary 2.2.6. Let V be a finite dimensional complex scalar product space. Then an
operator N on V is normal if and only if it admits a spectral decomposition N = Σ_λ λ P_λ
in which each projection P_λ is an orthogonal one.
Proof. We have just discussed the "only if" direction. On the other hand, if N = Σ_λ λ P_λ
is a spectral decomposition and each projection P_λ is an orthogonal one, then
N* = (Σ_λ λ P_λ)* = Σ_λ (λ P_λ)* = Σ_λ λ̄ P_λ,
since every ortho-projection is self-adjoint. Thus N commutes with N*, since each is a
linear combination of the same collection of commuting projections.
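For a self-adjoint (hence normal) matrix, this decomposition can be computed with numpy's Hermitian eigensolver. A sketch, where the matrix N is just an illustrative example of ours with two distinct eigenvalues, so each spectral projection is rank one:

```python
import numpy as np

# Spectral decomposition N = sum over lambda of lambda * P_lambda.
N = np.array([[2, 1j], [-1j, 2]], dtype=complex)
eigvals, eigvecs = np.linalg.eigh(N)

# each eigenvector column e gives a rank-one ortho-projection |e><e|
projections = [np.outer(e, e.conj()) for e in eigvecs.T]
reconstructed = sum(lam * P for lam, P in zip(eigvals, projections))
assert np.allclose(reconstructed, N)

for P in projections:
    assert np.allclose(P @ P, P)          # idempotent: a projection
    assert np.allclose(P, P.conj().T)     # self-adjoint: an ortho-projection
```

(For a degenerate eigenvalue, the rank-one pieces belonging to it would be summed into a single P_λ.)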
Proposition 2.3.1. Let V be a finite dimensional (real or complex) scalar product space
and N an operator on V. Then
• N = N* implies Sp(N) ⊂ R,
• N ≥ 0 implies Sp(N) ⊂ R⁺₀,
• N unitary implies |λ| = 1 for every λ ∈ Sp(N).
Proof. In words, the claim is that eigenvalues of self-adjoint operators are real, eigenvalues
of positive operators are nonnegative, and eigenvalues of unitary operators are of unit
absolute value. So let us assume that λ is an eigenvalue of N and v ≠ 0 is a corresponding
eigenvector. Then
λ = ⟨v, N v⟩ / ⟨v, v⟩,
where the denominator is a positive number. Thus, if N = N* then ⟨v, N v⟩ = ⟨N v, v⟩
and hence
λ = ⟨v, N v⟩ / ⟨v, v⟩ = ⟨N v, v⟩ / ⟨v, v⟩ = λ̄,
showing that λ must be real. If N is positive, then N = X*X for some operator X and
hence ⟨v, N v⟩ = ⟨v, X*X v⟩ = ⟨X v, X v⟩ ≥ 0 which, by our expression for λ, implies the
non-negativity of λ, too. Finally, if N is unitary, then N*N = I, so
‖v‖² = ⟨v, N*N v⟩ = ⟨N v, N v⟩ = ‖N v‖² = |λ|² ‖v‖²,
showing that |λ| = 1.
Clearly, the above spectral properties alone cannot in general give complete characterizations.
First, because looking at the eigenvalues only does not reveal anything about the
angle between the eigenspaces — and we know that all these types of operators are normal
and hence should have orthogonal eigenspaces. So to have some new characterizations, at
the least we should require normality. But there is also another problem. The spectral
decomposition theorem for normal operators that we discussed in the last section holds only in
the complex case. In fact, if we are over the field of real numbers, then of course we cannot
have non-real eigenvalues; however, it is clear that even in the real case, not every normal
operator is self-adjoint. Thus, for turning the above properties into characterizations, we
also need to be over the field of complex numbers.
Theorem 2.3.2. Let V be a finite dimensional complex scalar product space and N a
normal operator on V. Then
• N is self-adjoint if and only if Sp(N) ⊂ R,
• N is unitary if and only if |λ| = 1 for every λ ∈ Sp(N),
• N is positive if and only if Sp(N) ⊂ R⁺₀.
Hence if |λ| = 1 for every λ ∈ Sp(N), then N N* = I, showing that N in this case is
unitary. Finally, as N admits a spectral decomposition, if Sp(N) ⊂ R⁺₀ then one can
apply the root function √· : R⁺₀ → R⁺₀ to N. The resulting operator √N is still normal,
as apart from a re-labeling, its spectral projections coincide with those of N. However,
as Sp(√N) = {√λ | λ ∈ Sp(N)} is still a subset of the reals, from what we have already
established, √N is a self-adjoint operator and actually (√N)* √N = √N √N = (√N)² = N,
showing that in this case N is a positive operator.
Note that since positive operators form a subset of the self-adjoint ones, which in turn form
a subset of the normal operators, in the positive case the above characterization also results
in the following slightly different one.
Corollary 2.3.3. Let V be a finite dimensional complex scalar product space and A an
operator on V. Then A ≥ 0 if and only if A* = A and Sp(A) ⊂ R⁺₀.
In practice, when dealing with given matrices, it is easier to check self-adjointness than
normality, so often, instead of the characterization given by Theorem 2.3.2, we shall use
the one provided by our previous corollary.
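In matrix terms, the corollary gives a practical positivity test: check self-adjointness, then check the spectrum. A sketch assuming numpy (is_positive is our own helper name; the tolerance is an arbitrary choice):

```python
import numpy as np

def is_positive(A, tol=1e-10):
    """Check A >= 0 via Corollary 2.3.3: A self-adjoint with Sp(A) in R_0^+."""
    if not np.allclose(A, A.conj().T, atol=tol):
        return False
    return bool(np.all(np.linalg.eigvalsh(A) >= -tol))

X = np.array([[1, 2j], [0, 1]], dtype=complex)
assert is_positive(X.conj().T @ X)       # X* X is always positive
assert not is_positive(np.array([[1, 0], [0, -1]], dtype=complex))
assert not is_positive(np.array([[0, 1], [0, 0]], dtype=complex))  # not self-adjoint
```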
Let us now discuss some characterizations based on scalar product values rather than
eigenvalues. Suppose A = X*X is a positive operator on V. Then
⟨v, A v⟩ = ⟨v, X*X v⟩ = ⟨X v, X v⟩ = ‖X v‖² ≥ 0,
showing that ⟨v, A v⟩ is a nonnegative — so in particular real — number. This latter property in the real case of
course is of not much use (since in that case every scalar product value is real). However,
it is nice to know that in the complex case it actually gives a characterization.
Theorem 2.3.4. Let V be a finite dimensional complex scalar product space and A an
operator on V. Then A = A* if and only if ⟨v, A v⟩ is real for all v ∈ V.
Proof. By what we have already discussed, we only need to show the "if" direction. So
suppose ⟨v, A v⟩ is real for all v ∈ V. Then for any x, y ∈ V, considering v = x + y shows
that ⟨x, A y⟩ + ⟨y, A x⟩ is real, while considering v = x + iy shows that ⟨x, A y⟩ − ⟨y, A x⟩
is purely imaginary. Hence, by conjugate symmetry,
⟨A y, x⟩ = \overline{⟨x, A y⟩}
= \overline{ (⟨x, A y⟩ + ⟨y, A x⟩)/2 + (⟨x, A y⟩ − ⟨y, A x⟩)/2 }
= (⟨x, A y⟩ + ⟨y, A x⟩)/2 − (⟨x, A y⟩ − ⟨y, A x⟩)/2 = ⟨y, A x⟩.
Thus ⟨A y, x⟩ = ⟨y, A x⟩ for every x, y ∈ V, showing that the adjoint of A is itself.
Corollary 2.3.5. Let V be a finite dimensional complex scalar product space and A an
operator on V. Then A ≥ 0 if and only if ⟨v, A v⟩ is a nonnegative real for all v ∈ V.
Proof. Again, by what we have discussed earlier, we only need to show the "if" direction.
So assume ⟨v, A v⟩ ≥ 0 for all v ∈ V. Then in particular, by the previous theorem, A
is self-adjoint. If λ ∈ Sp(A), then there is a corresponding eigenvector v ≠ 0 such that
A v = λv, and then by our assumption λ = ⟨v, A v⟩/⟨v, v⟩ is nonnegative. Thus the self-adjoint
operator A has nonnegative eigenvalues only and hence, by Cor. 2.3.3, it is actually a positive
operator.
We finish this section with two further characterizations of unitary operators. The first
is rather straightforward to show, while the second one is easily deduced from the first
by noticing that scalar product values can be expressed with norms; in the real case, we
have
⟨x, y⟩ = (‖x + y‖² − ‖x − y‖²)/4.
Exercises
E 2.1. Let A and B be self-adjoint operators. Show that AB = 0 if and only if the images
of A and B are orthogonal subspaces.
E 2.2. Let
A = [ i+1  3i−1  i+1
      i−1    2    1
      i−1    1    1 ]   and   B = [  0   3−i   0
                                    3+i   −1  −2
                                     0    −2   2 ].
Find all values of t ∈ R for which the equation
A + t² A* + t B − X*X = 0
admits a solution, and for each such value present an X ∈ M₃,₃(C) satisfying the equation.
E 2.3 (Polar decomposition). Show that for any operator X (of a finite dimensional scalar
product space) there exists a positive operator A ≥ 0 and a unitary one U = (U*)⁻¹ such that
X = U A.
Show further that the positive operator A appearing in the above decomposition is uniquely
determined by X, whereas the unitary U is uniquely determined if and only if X is invertible.
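A numerical sketch of the decomposition in E 2.3 can be obtained from the singular value decomposition X = W S V*: then A = V S V* (which equals √(X*X)) and U = W V*. Assuming numpy; polar is our own helper name, not a numpy function:

```python
import numpy as np

def polar(X):
    """Sketch of X = U A with U unitary and A >= 0, via the SVD X = W S V*."""
    W, S, Vh = np.linalg.svd(X)
    A = Vh.conj().T @ np.diag(S) @ Vh    # A = V S V* = sqrt(X* X) >= 0
    U = W @ Vh                           # product of unitaries, hence unitary
    return U, A

rng = np.random.default_rng(3)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, A = polar(X)

assert np.allclose(U @ A, X)
assert np.allclose(U @ U.conj().T, np.eye(3))       # U is unitary
assert np.allclose(A, A.conj().T)                   # A is self-adjoint...
assert np.all(np.linalg.eigvalsh(A) >= -1e-10)      # ...with spectrum in R_0^+
```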
E 2.4. We proved that if on a complex (finite dimensional) scalar product space an operator
N is normal then there exists an ONB consisting of eigenvectors of N only. Though in the
real case this is false, prove that if in particular N is self-adjoint, then it still follows that
there exists an ONB consisting of eigenvectors of N only.
Chapter 3
Operator positivity
showing that tA + B ≥ 0.
Proposition 3.1.2. For a positive operator A ≥ 0 we have A v = 0 ⇔ ⟨v, A v⟩ = 0.
Proof. The implication to the right is trivial; it does not need any assumption of positivity.
As for the other direction, using the algebraic definition of positivity, we have that A =
X*X for some X and hence
0 = ⟨v, A v⟩ = ⟨v, X*X v⟩ = ⟨X v, X v⟩ = ‖X v‖²,
implying X v = 0 and thus also A v = X*(X v) = 0.
Which one could be positive? Our original definition (according to which positives are
of the form X ∗ X) in itself does not seem to give an easy way to answer this question.
However, since we know that positives are also self-adjoint, we can immediately rule out
the last one, whose transpose conjugate is not itself. Moreover, the first one has a negative
number on its diagonal, and the second one has a zero at a diagonal place with a further
non-zero element in the same row; so according to our lemma, we can also rule those out.
This leaves us with just 2 matrices: the 3rd and 4th ones. They are self-adjoint and have no
“evident” problems, so we might decide to use our spectral characterization and compute
their eigenvalues. The eigenvalues of the 3rd one turn out to be 1 and 4, so it is indeed a
positive matrix. On the other hand, the 4th matrix also has a negative eigenvalue (−1/2),
so it is not a positive matrix.
Apart from such uses, the previous lemma also has an important consequence regarding
trace values. Let us formally state this.
Corollary 3.1.5. For a positive operator A ≥ 0 we have that Tr(A) ≥ 0 with equality
holding if and only if A = 0.
We can also easily establish the non-negativity of the determinant. However, to do so,
this time the best is to use the original definition.
Proposition 3.1.6. For a positive operator A ≥ 0 we have that det(A) ≥ 0.
We finish this section with the uniqueness of the positive square-root. Both for
establishing its existence (which we have "accidentally" also done in the last chapter, when we
proved the equivalence of the definition and the spectral characterization) and for concluding
its uniqueness, we shall use the spectral decomposition given by the spectral characterization.
Proposition 3.1.7. Let A be a positive operator. Then there exists a unique positive
operator B whose square is A.
3.2 Gram matrices
Let V be a scalar product space over F = R or C and v_1, …, v_n ∈ V a collection (more
precisely: a list) of vectors. The n × n square-matrix G defined by the formula
G_{j,k} ≡ ⟨v_j, v_k⟩
is called the Gram matrix of the system of vectors v_1, …, v_n. Considering Fⁿ with its
standard scalar product, we have that
⟨c, G c⟩ = Σ_{k,l=1}^n c̄_k G_{k,l} c_l = Σ_{k,l=1}^n c̄_k ⟨v_k, v_l⟩ c_l = ⟨Σ_{k=1}^n c_k v_k, Σ_{l=1}^n c_l v_l⟩ = ‖Σ_{k=1}^n c_k v_k‖² ≥ 0
for any c ∈ Fⁿ. This shows that G is a positive matrix. This also works in the other
direction: if M is an arbitrary positive matrix, then there exists a matrix X such that
M = X*X, and it is straightforward to check that M is actually the Gram matrix of the
system of vectors formed by the columns of X. Thus, every Gram matrix is positive and
every positive matrix is the Gram matrix of a vector collection.
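Both directions are easy to see in a small numerical experiment. A sketch assuming numpy (gram is our own helper name), with a third vector that is intentionally the sum of the first two:

```python
import numpy as np

def gram(vectors):
    """Gram matrix G_{j,k} = <v_j, v_k>, conjugation on the first argument."""
    return np.array([[np.vdot(vj, vk) for vk in vectors] for vj in vectors])

vs = [np.array([1, 1j, 0]), np.array([0, 1, 1]), np.array([1, 1 + 1j, 1])]
G = gram(vs)

assert np.allclose(G, G.conj().T)                   # Gram matrices are self-adjoint
assert np.all(np.linalg.eigvalsh(G) >= -1e-10)      # ...and positive
# the third vector is the sum of the first two, so the rank drops to 2
assert np.linalg.matrix_rank(G) == 2
```

This also previews the rank property discussed below: the rank of G equals the dimension of the span of the vectors.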
Considering the Gram matrix G of a system of vectors v_1, …, v_n is a simple but powerful
trick which is often the key to solving certain types of problems (see exercise E 3.5 for an
example). The reason for its usefulness lies in the way it encodes some essential properties
of our collection of vectors. In particular, we have the following.
• The trace of the Gram matrix is Tr(G) = Σ_{k=1}^n G_{k,k} = Σ_{k=1}^n ⟨v_k, v_k⟩ = Σ_{k=1}^n ‖v_k‖².
• The maximum number of linearly independent columns of G coincides with the maximum
number of linearly independent elements of v_1, …, v_n; that is, the rank of the
Gram matrix is
rk(G) = dim(Span{v_1, …, v_n}).
Indeed, if v_j is expressed as a linear combination v_j = Σ_{k≠j} c_k v_k, then G_{l,j} =
⟨v_l, v_j⟩ = ⟨v_l, Σ_{k≠j} c_k v_k⟩ = Σ_{k≠j} c_k ⟨v_l, v_k⟩ = Σ_{k≠j} c_k G_{l,k} for all l = 1, …, n. As
for the other direction, now suppose that G_{l,j} = Σ_{k≠j} c_k G_{l,k} for all l = 1, …, n.
Then, going backward, we get that the difference v_j − Σ_{k≠j} c_k v_k has a zero scalar
product with all of the vectors v_l for l = 1, …, n and hence it is orthogonal to
the subspace Span{v_1, …, v_n}. However, as it is also evidently contained in this
subspace, this difference must be zero and hence we must have that v_j = Σ_{k≠j} c_k v_k.
Finally — as we shall not use this — without proof we also mention here that in the real
case, the determinant of the Gram matrix is the square of the volume of the parallelotope
formed by the vectors v1 , . . . , vn .
Proof. Most of the above affirmations are fairly easy to show and are left to the reader
to check. Here we only comment on one point; namely, that both transitivity and the
claim regarding the sum of operator-inequalities rely on the fact that the sum of positive
operators is positive. Indeed, the operator-inequalities A ≥ B and B ≥ C mean that A − B
and B − C are positive operators and hence that (A − B) + (B − C) = A − C ≥ 0, too.
What we have learned so far about operator-inequalities makes one think that it is
something similar to inequalities between real numbers. Some differences, however, must be
pointed out. First, the operator-inequality is only a partial ordering. Indeed, if for example the
self-adjoint operator A has both a positive and a negative eigenvalue, then neither A nor
−A is a positive operator. Second, while operator-inequalities can be summed, they
cannot, for example, be squared. That is, whereas for two nonnegative reals a, b ≥ 0 the
inequality a ≥ b implies that a² ≥ b², for two positive operators A, B ≥ 0 the inequality
A ≥ B does not imply that A² ≥ B². In fact, such discrepancies lead to the concept
of operator-monotone functions. A function f : R → R (or f : R⁺₀ → R) is said to be
operator-monotone on self-adjoint (or on positive) operators, if for any self-adjoint
(or positive) operators A, B
A ≥ B ⇒ f(A) ≥ f(B).
If f is operator-monotone then in particular it is monotone on 1 × 1 matrices; i.e. it is also
monotone in the conventional sense. However, the mentioned case of squares shows that
in general the converse is false.
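A concrete 2 × 2 counterexample (our own choice; many similar pairs work) can be verified numerically, assuming numpy:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])

# A >= B >= 0: both B and A - B have nonnegative spectra
assert np.all(np.linalg.eigvalsh(B) >= -1e-12)
assert np.all(np.linalg.eigvalsh(A - B) >= -1e-12)   # eigenvalues of A - B: 0 and 2

# ...yet A^2 - B^2 = [[4, 3], [3, 2]] has negative determinant,
# hence a negative eigenvalue, so A^2 >= B^2 fails
assert np.min(np.linalg.eigvalsh(A @ A - B @ B)) < 0
```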
There are several known characterizations that allow us to find a wide variety of operator-monotone
functions. For example, if 0 < q ≤ 1, then the function x ↦ x^q is operator-monotone
on positive operators. However, such theorems are outside of our present scope and most
of them also require some further tools that we did not discuss here. Nevertheless, even
with what we have, as a good example and as an interesting problem, the reader can try
to find an elementary proof showing that √· is operator-monotone on positive operators.
Let us now move on to the topic of trace-inequalities. If A, B are self-adjoint then
(AB)∗ = B∗A∗ = BA; hence the product is again self-adjoint if and only if A and B com-
mute. Nevertheless, the complex conjugate of Tr(AB) is Tr((AB)∗) = Tr(BA) = Tr(AB),
showing that although not necessarily self-adjoint, the product still has a real trace.
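This observation is easy to test on random Hermitian matrices (a numpy sketch; the seed and size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = G + G.conj().T                 # self-adjoint
B = H + H.conj().T                 # self-adjoint

AB = A @ B
print(np.allclose(AB, AB.conj().T))       # generically False: AB is not self-adjoint
print(abs(np.trace(AB).imag) < 1e-9)      # True: its trace is nevertheless real
```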
Since a positive operator is in particular self-adjoint, by what was just explained, the
product of positive operators cannot be in general a positive operator. However, regarding
trace-values we have the following.
Proposition 3.3.2. Let A, B be positive operators. Then Tr(AB) ≥ 0 with equality holding
if and only if AB = 0.
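Before turning to the proof, a quick numerical sanity check (a numpy sketch; the positive operators are built in the form X∗X and Y∗Y, which any positive operator admits):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Y = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = X.conj().T @ X                 # positive operator A = X*X
B = Y.conj().T @ Y                 # positive operator B = Y*Y

t = np.trace(A @ B)
print(abs(t.imag) < 1e-9, t.real >= 0)    # Tr(AB) is real and nonnegative
```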
Proof. We may write A = X∗X and B = Y∗Y for some operators X and Y. Then,
setting Z := XY∗, we have that
Tr(AB) = Tr(X∗XY∗Y) = Tr(XY∗YX∗) = Tr(Z∗Z) ≥ 0,
as the trace of the positive operator Z∗Z is nonnegative. As for the part regarding
equality: by corollary 3.1.5, equality implies that Z∗Z = 0.
since both A + B and A − B are positive operators. This shows that A ≥ B ≥ 0 does imply
that Tr(A²) ≥ Tr(B²). We now move on to discuss a simple but important construction.
Proposition 3.3.3 (The Hilbert-Schmidt scalar product). For a finite dimensional scalar
product space V the formula ⟨A, B⟩ := Tr(A∗B) defines a scalar product on Lin(V).
Finally, by corollary 3.1.5 and by the remark made in the proof of the previous proposition;
namely, that Z∗Z = 0 implies Z = 0, it follows that
⟨A, A⟩ = Tr(A∗A) ≥ 0,
with equality if and only if A = 0.
It is easy to check that the Hilbert-Schmidt scalar product satisfies the following two
additional properties:
We finish our discussion by mentioning that the Cauchy-Schwarz inequality, when applied
to the Hilbert-Schmidt scalar product, also gives an important trace-inequality:
|Tr(A∗B)|² ≤ Tr(A∗A) · Tr(B∗B).
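Spelled out, Cauchy-Schwarz for the Hilbert-Schmidt scalar product says |Tr(A∗B)|² ≤ Tr(A∗A) · Tr(B∗B). A quick check on random matrices (a numpy sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def hs(A, B):
    """Hilbert-Schmidt scalar product <A, B> = Tr(A* B)."""
    return np.trace(A.conj().T @ B)

lhs = abs(hs(A, B)) ** 2
rhs = (hs(A, A) * hs(B, B)).real   # both factors are real and nonnegative
print(lhs <= rhs)                  # True: |Tr(A*B)|^2 <= Tr(A*A) Tr(B*B)
```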
Since the scalar product is linear in its second variable, our map is in fact a linear operator;
|v⟩⟨w| ∈ Lin(V). What makes this formalism nice is that all sorts of equalities that the
symbol seems to suggest turn out to be valid formulas. In particular, the reader can easily
justify the following properties:
• |v⟩⟨w₁ + w₂| = |v⟩⟨w₁| + |v⟩⟨w₂| and |v₁ + v₂⟩⟨w| = |v₁⟩⟨w| + |v₂⟩⟨w|,
• (|v⟩⟨w|)∗ = |w⟩⟨v|,
• if e₁, …, eₖ is an ONS then by Prop. 1.3.2, the operator ∑_{j=1}^{k} |eⱼ⟩⟨eⱼ| is precisely the
ortho-projection onto the subspace spanned by our ONS,
• in particular, if e₁, …, eₙ is an ONB then ∑_{j=1}^{n} |eⱼ⟩⟨eⱼ| = I, and if v is a unit-length
vector then |v⟩⟨v| is the ortho-projection onto the line given by v.
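These bullet points are easy to verify in coordinates (a numpy sketch; the orthonormal basis here comes from a QR decomposition of a random matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
# the columns of a unitary Q form an orthonormal basis of C^n
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
basis = [Q[:, j] for j in range(n)]

def ketbra(v, w):
    """The operator |v><w| as a matrix: its (j, k) entry is v_j * conj(w_k)."""
    return np.outer(v, w.conj())

# sum over part of an ONS: an ortho-projection (self-adjoint and idempotent)
P = sum(ketbra(e, e) for e in basis[:2])
print(np.allclose(P, P.conj().T), np.allclose(P @ P, P))

# sum over a full ONB: the identity
print(np.allclose(sum(ketbra(e, e) for e in basis), np.eye(n)))
```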
It is worth considering the case when our vector space is Cⁿ with its standard scalar
product. Let us think of elements of Cⁿ as column vectors, interpret the symbol ⟨w| as
the row vector of conjugates (w̄₁, w̄₂, …, w̄ₙ), and regard |v⟩ ≡ v as an “ordinary” column
vector. Then
\[
\langle w|\,|v\rangle = (\overline{w_1}, \overline{w_2}, \ldots, \overline{w_n})
\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}
= \sum_{k=1}^{n} \overline{w_k}\, v_k
\]
is the scalar product between w and v, whereas writing them in the reversed order gives
\[
|v\rangle\langle w| =
\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}
(\overline{w_1}, \overline{w_2}, \ldots, \overline{w_n})
=
\begin{pmatrix}
v_1\overline{w_1} & v_1\overline{w_2} & \ldots & v_1\overline{w_n} \\
v_2\overline{w_1} & v_2\overline{w_2} & \ldots & v_2\overline{w_n} \\
\vdots & \vdots & \ddots & \vdots \\
v_n\overline{w_1} & v_n\overline{w_2} & \ldots & v_n\overline{w_n}
\end{pmatrix},
\]
an n × n matrix; it is easy to see that this is exactly (the concrete matrix realization of)
the linear operator |v⟩⟨w| we introduced in an abstract manner before. Considering the
matrix realization, we can also note one more important property:
• Tr(|v⟩⟨w|) = ⟨w, v⟩.
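In coordinates this last property is a one-liner to confirm (a numpy sketch with an arbitrary pair of vectors):

```python
import numpy as np

v = np.array([1.0 + 2.0j, 0.5j, -1.0])
w = np.array([2.0, 1.0 - 1.0j, 3.0j])

ketbra = np.outer(v, w.conj())     # matrix of |v><w|: entries v_j * conj(w_k)
# np.vdot conjugates its first argument, so np.vdot(w, v) = <w, v>
print(np.isclose(np.trace(ketbra), np.vdot(w, v)))   # True
```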
Let us now return to the question of self-duality.
Proposition 3.4.1. Let V be a finite dimensional scalar product space and X ∈ Lin(V).
Then X ∈ S⁺(V) if and only if Tr(AX) ≥ 0 for all A ∈ S⁺(V).
Proof. The “if” part has already been dealt with. So suppose Tr(AX) ≥ 0 for all positive
operators A. Then in particular, for the positive operator A = |v⟩⟨v| we have
Tr(|v⟩⟨v| X) = ⟨v, Xv⟩ ≥ 0
for every vector v, and hence X is positive.
Exercises
E 3.1. Show by example that if A and B are positive operators such that A ≥ B, then it
does not follow that A² ≥ B². Show also that if A and B commute, then the implication
in question becomes true.
E 3.2. Let A and B be two self-adjoint operators. Show that Tr((AB)²) and Tr(A²B²) are
real and satisfy the inequality
|Tr((AB)²)| ≤ Tr(A²B²).
E 3.3. Show that if A, B are positive operators and A ≥ B, then √A ≥ √B.
E 3.4. Suppose V is a finite dimensional vector space on which we are given two (possibly
different) scalar products. Show that there exists a basis of V whose members are pairwise
orthogonal with respect to both scalar products.
E 3.5. Let v₁, …, v₁₆ be sixteen unit-length vectors of a scalar product space V. Show
that if
|⟨vⱼ, vₖ⟩| < 1/3
for all j ≠ k, then dim(V) ≥ 7.
E 3.6. Let A and B be two positive operators. Show that there exists a t > 0 such that
tA ≥ B if and only if Im(B) ⊂ Im(A).
E 3.7. Let |X| := √(X∗X). Show that |Tr(X)| ≤ Tr(|X|).
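Several of these exercises invite numerical experimentation before a proof is attempted. For instance, E 3.7 can be checked on random matrices (a numpy sketch; recall that Tr|X| equals the sum of the singular values of X):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

abs_trace = abs(np.trace(X))
trace_abs = np.linalg.svd(X, compute_uv=False).sum()   # Tr|X| = sum of singular values
print(abs_trace <= trace_abs)                          # True
```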