Lecture Notes 1: Matrix Algebra

Part A: Vectors and Matrices

Peter J. Hammond
My email is p.j.hammond@warwick.ac.uk
or hammond@stanford.edu
A link to these lecture slides can be found at
https://web.stanford.edu/~hammond/pjhLects.html

revised 2020 September 14th

University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 1 of 71


Outline

Solving Two Equations in Two Unknowns


First Example

Vectors
Vectors and Inner Products
Addition, Subtraction, and Scalar Multiplication
Linear versus Affine Functions
Norms and Unit Vectors
Orthogonality
The Canonical Basis
Linear Independence and Dimension

Matrices
Matrices and Their Transposes
Matrix Multiplication: Definition



Example of Two Equations in Two Unknowns
It is easy to check that

    x + y = 10
    x − y = 6     =⇒   x = 8, y = 2

More generally, one can:

1. add the two equations, to eliminate y;
2. subtract the second equation from the first, to eliminate x.
This leads to the following transformation

    x + y = b1          2x = b1 + b2
    x − y = b2    =⇒    2y = b1 − b2

of the two-equation system with general right-hand sides.


Obviously the solution is

    x = (1/2)(b1 + b2),   y = (1/2)(b1 − b2)
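The elimination above is easy to check numerically. A minimal sketch, assuming numpy is available (the function name solve_two is illustrative, not from the lecture):

```python
import numpy as np

# Coefficient matrix of the system x + y = b1, x - y = b2
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])

def solve_two(b1, b2):
    """Closed-form solution from adding and subtracting the equations."""
    x = 0.5 * (b1 + b2)   # half the sum of the two equations
    y = 0.5 * (b1 - b2)   # half the difference
    return x, y

# The first example: b1 = 10, b2 = 6 gives x = 8, y = 2
print(solve_two(10, 6))                # (8.0, 2.0)
# Cross-check against a general linear solver
print(np.linalg.solve(A, [10, 6]))     # [8. 2.]
```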
Using Matrix Notation, I
Matrix notation allows the two equations

1x + 1y = b1
1x − 1y = b2

to be expressed as

    [ 1   1 ] [ x ]   [ b1 ]
    [ 1  −1 ] [ y ] = [ b2 ]

or as Az = b, where

    A = [ 1   1 ] ,   z = [ x ] ,   and   b = [ b1 ] .
        [ 1  −1 ]         [ y ]               [ b2 ]

Here A, z, b are respectively: (i) the coefficient matrix;
(ii) the vector of unknowns; (iii) the vector of right-hand sides.



Using Matrix Notation, II

Also, the solution x = (1/2)(b1 + b2), y = (1/2)(b1 − b2)
can be expressed as

    x = (1/2)b1 + (1/2)b2
    y = (1/2)b1 − (1/2)b2

or as

    z = [ x ] = [ 1/2   1/2 ] [ b1 ] = Cb,   where   C = [ 1/2   1/2 ]
        [ y ]   [ 1/2  −1/2 ] [ b2 ]                     [ 1/2  −1/2 ]
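The matrix C here is precisely the inverse of the coefficient matrix A, which can be verified numerically. A sketch assuming numpy:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
C = np.array([[0.5, 0.5],
              [0.5, -0.5]])

# C inverts A: their product is the 2x2 identity matrix
print(C @ A)                              # [[1. 0.], [0. 1.]]
print(np.allclose(C, np.linalg.inv(A)))   # True

# Hence z = C b solves A z = b for any right-hand side b
b = np.array([10.0, 6.0])
print(C @ b)                              # [8. 2.]
```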



Two General Equations

Consider the general system of two equations

    ax + by = u = 1u + 0v
    cx + dy = v = 0u + 1v

in two unknowns x and y, filled in with some extra 1s and 0s.


In matrix form, these equations can be written as

    [ a  b ] [ x ]   [ 1  0 ] [ u ]
    [ c  d ] [ y ] = [ 0  1 ] [ v ]

For simplicity, we assume throughout the subsequent analysis
that the coefficients a, b, c, d on the left-hand sides are all ≠ 0.



Three Different Cases
We are considering the two equations

ax + by = u and cx + dy = v

They correspond to the two straight lines

y = (u − ax)/b and y = (v − cx)/d

The two lines have respective slopes −a/b and −c/d.


These two slopes are equal iff a/b = c/d, or iff ad = bc,
or iff D := ad − bc = 0.
We will distinguish three cases:
(A) If D ≠ 0, the two lines have different slopes,
so their intersection consists of a single point.
(B) If D = 0 and u/b ≠ v/d, then the two lines
are parallel but distinct, so their intersection is empty.
(C) If D = 0 and u/b = v/d, then the two lines are identical,
so their intersection consists of all the points on either line.
First Steps
We assume that a ≠ 0 in the matrix equation

    [ a  b ] [ x ]   [ 1  0 ] [ u ]
    [ c  d ] [ y ] = [ 0  1 ] [ v ]

We can eliminate x from the second equation


by adding −c/a times the first row to the second.
Given D = ad − bc, we obtain the new equality

    [ a   b  ] [ x ]   [   1    0 ] [ u ]
    [ 0  D/a ] [ y ] = [ −c/a   1 ] [ v ]

Now multiply the second row by a to obtain

    [ a  b ] [ x ]   [  1   0 ] [ u ]
    [ 0  D ] [ y ] = [ −c   a ] [ v ]



Two General Equations: Case A
The equations are

    [ a  b ] [ x ]   [  1   0 ] [ u ]
    [ 0  D ] [ y ] = [ −c   a ] [ v ]

In Case A when D := ad − bc ≠ 0,
we can add −b/D times the second row to the first, which yields

    [ a  0 ] [ x ]   [ 1 + (bc/D)  −ab/D ] [ u ]
    [ 0  D ] [ y ] = [     −c         a   ] [ v ]

Recognizing that 1 + (bc/D) = (D + bc)/D = ad/D,
then dividing the two rows/equations by a and D respectively,
we obtain

    [ x ]   [ 1  0 ] [ x ]          [  d  −b ] [ u ]
    [ y ] = [ 0  1 ] [ y ] = (1/D)  [ −c   a ] [ v ]

This implies the unique solution

    x = (1/D)(du − bv)   and   y = (1/D)(av − cu)
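The Case A formulas can be coded directly. A sketch, with numpy used only as a cross-check (the function name solve_case_a is illustrative):

```python
import numpy as np

def solve_case_a(a, b, c, d, u, v):
    """Unique solution of ax + by = u, cx + dy = v when D = ad - bc != 0."""
    D = a * d - b * c
    if D == 0:
        raise ValueError("D = 0: no unique solution (Case B or C)")
    x = (d * u - b * v) / D
    y = (a * v - c * u) / D
    return x, y

x, y = solve_case_a(2, 1, 1, 3, 5, 10)
print(x, y)                                         # 1.0 3.0
# Cross-check with a general solver
print(np.linalg.solve([[2, 1], [1, 3]], [5, 10]))   # [1. 3.]
```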
Two General Equations: Cases B and C
Beyond Case A, when D := ad − bc = 0,
the multiplier −ab/D is undefined and the system

    [ a   b  ] [ x ]   [   1    0 ] [ u ]
    [ 0  D/a ] [ y ] = [ −c/a   1 ] [ v ]

collapses to

    [ a  b ] [ x ]   [     u     ]
    [ 0  0 ] [ y ] = [ v − cu/a ]
This leaves us with two cases:
Case B) If cu ≠ av, there is no solution.
We are trying to find the intersection of two distinct parallel lines.
Case C) If cu = av , then the second equation reduces to 0 = 0.
There is a continuum of solutions
satisfying the one remaining equation ax + by = u,
or x = (u − by )/a where y is any real number.
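The three cases can be told apart from the coefficients alone. A minimal sketch (the function name classify and the returned labels are illustrative, not from the lecture):

```python
def classify(a, b, c, d, u, v):
    """Classify ax + by = u, cx + dy = v into Cases A, B, C,
    assuming a != 0 as in the elimination above."""
    D = a * d - b * c
    if D != 0:
        return "A: unique solution"
    # D = 0: after elimination the second row reads 0 = v - cu/a
    if c * u != a * v:
        return "B: no solution (distinct parallel lines)"
    return "C: continuum of solutions (identical lines)"

print(classify(1, 1, 1, -1, 10, 6))   # A: unique solution
print(classify(1, 1, 2, 2, 1, 3))     # B: no solution (distinct parallel lines)
print(classify(1, 1, 2, 2, 1, 2))     # C: continuum of solutions (identical lines)
```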





Vectors and Inner Products
Let x = (xi)_{i=1}^m ∈ R^m denote a column m-vector of the form

    [ x1 ]
    [ x2 ]
    [ ⋮  ]
    [ xm ]

Its transpose is the row m-vector

    x^T = (x1, x2, . . . , xm).

Given a column m-vector x and row n-vector y^T = (yj)_{j=1}^n ∈ R^n
where m = n, the dot or scalar or inner product is defined as

    y^T x := y · x := Σ_{i=1}^n yi xi

But when m ≠ n, the scalar product is not defined.
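This dimension requirement is exactly what array libraries enforce. A sketch assuming numpy:

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0])

# The inner product y . x = sum_i yi * xi needs equal dimensions
print(np.dot(y, x))   # 32.0 = 1*4 + 2*5 + 3*6

# With mismatched dimensions the product is undefined; numpy raises an error
try:
    np.dot(y, np.array([1.0, 2.0]))
except ValueError as e:
    print("undefined:", e)
```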


Exercise on Quadratic Forms

Exercise
Consider the quadratic form f(w) := w^T w
as a function f : R^n → R of the column n-vector w.
Explain why f(w) ≥ 0 for all w ∈ R^n,
with equality if and only if w = 0,
where 0 denotes the zero vector of R^n.



Net Quantity Vectors
Suppose there are n commodities numbered from i = 1 to n.
Each component qi of the net quantity vector q = (qi)_{i=1}^n ∈ R^n
represents the quantity of the ith commodity.
Often each such quantity is non-negative.
But general equilibrium theory, following Debreu's Theory of Value,
often uses only the sign of qi to distinguish between
• a consumer's demands and supplies of the ith commodity;
• or a producer's outputs and inputs of the ith commodity.
This sign is taken to be
positive for demands or outputs;
negative for supplies or inputs.
In fact, qi is taken to be
• the consumer's net demand for the ith commodity;
• the producer's net supply or net output of the ith commodity.
Then q is the net quantity vector.
Price Vectors

Each component pi of the (row) price vector p^T ∈ R^n
indicates the price per unit of commodity i.
Then the scalar product

    p^T q = p · q = Σ_{i=1}^n pi qi

is the total value of the net quantity vector q
evaluated at the price vector p.
In particular, p^T q indicates
• the net profit (or minus the net loss) for a producer;
• the net dissaving for a consumer.





Definitions

Consider any two n-vectors x = (xi)_{i=1}^n and y = (yi)_{i=1}^n in R^n.
Their sum s := x + y and difference d := x − y are constructed
by adding or subtracting the vectors component by component
— i.e., s = (si)_{i=1}^n and d = (di)_{i=1}^n where

    si = xi + yi and di = xi − yi

for i = 1, 2, . . . , n.
The scalar multiple λx of any scalar λ ∈ R
and vector x = (xi)_{i=1}^n ∈ R^n is constructed by multiplying
each component of the vector x by the scalar λ — i.e.,

    λx = (λxi)_{i=1}^n
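These componentwise definitions are what array arithmetic implements. A sketch assuming numpy:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

print(x + y)     # [5. 7. 9.]    componentwise sum: si = xi + yi
print(x - y)     # [-3. -3. -3.] componentwise difference: di = xi - yi
print(2.0 * x)   # [2. 4. 6.]    each component multiplied by the scalar
```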



Algebraic Fields
Definition
An algebraic field (F, +, ·) of scalars is a set F that, together with
the two binary operations + of addition and · of multiplication,
satisfies the following axioms for all a, b, c ∈ F:
1. F is closed under + and ·:
— i.e., both a + b and a · b are in F.
2. + and · are associative:
both a + (b + c) = (a + b) + c and a · (b · c) = (a · b) · c.
3. + and · both commute:
both a + b = b + a and a · b = b · a.
4. There are identity elements 0, 1 ∈ F for + and · respectively,
with 0 ≠ 1, such that: (i) a + 0 = a; (ii) 1 · a = a.
5. There are inverse operations − for + and ⁻¹ for · such that:
(i) a + (−a) = 0; (ii) provided a ≠ 0, also a · a⁻¹ = 1.
6. The distributive law: a · (b + c) = (a · b) + (a · c).
Three Examples of Real Algebraic Fields
Exercise
Verify that the following well known sets are algebraic fields:
• the set R of all real numbers,
with the usual operations of addition and multiplication;
• the set Q of all rational numbers
— i.e., those that can be expressed as the ratio r = p/q
of integers p, q ∈ Z with q ≠ 0.
(Check that Q is closed
under the usual operations of addition and multiplication,
and that each non-zero rational
has a rational multiplicative inverse.)
• the set Q + √2 Q := {r1 + √2 r2 | r1, r2 ∈ Q} ⊂ R
of all real numbers that can be expressed as the sum of:
(i) a rational number;
(ii) a rational multiple of the irrational number √2.
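Closure of Q + √2 Q under multiplication follows from (r1 + √2 r2)(s1 + √2 s2) = (r1 s1 + 2 r2 s2) + √2 (r1 s2 + r2 s1). A sketch using exact rational arithmetic from Python's fractions module (the class QSqrt2 is illustrative, not from the lecture):

```python
from fractions import Fraction

class QSqrt2:
    """A number r1 + sqrt(2) * r2 with r1, r2 rational (illustrative)."""
    def __init__(self, r1, r2):
        self.r1, self.r2 = Fraction(r1), Fraction(r2)

    def __add__(self, other):
        # Closure under addition: add coordinate by coordinate
        return QSqrt2(self.r1 + other.r1, self.r2 + other.r2)

    def __mul__(self, other):
        # (r1 + sqrt2 r2)(s1 + sqrt2 s2) = (r1 s1 + 2 r2 s2) + sqrt2 (r1 s2 + r2 s1)
        return QSqrt2(self.r1 * other.r1 + 2 * self.r2 * other.r2,
                      self.r1 * other.r2 + self.r2 * other.r1)

    def __repr__(self):
        return f"{self.r1} + {self.r2}*sqrt(2)"

a = QSqrt2(1, 1)    # 1 + sqrt(2)
b = QSqrt2(3, -2)   # 3 - 2*sqrt(2)
print(a * b)        # (3 - 4) + sqrt(2)(-2 + 3) = -1 + 1*sqrt(2)
```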



Two Examples of Complex Algebraic Fields
Exercise
Verify that the following well known sets are algebraic fields:
• C, the set of all complex numbers
— i.e., those that can be expressed as c = a + ib,
where a, b ∈ R and i is defined to satisfy i^2 = −1.
Note that C is effectively the set R × R of ordered pairs (a, b)
satisfying a, b ∈ R, together with the operations of:
(i) addition defined by (a, b) + (c, d) = (a + c, b + d)
because (a + bi) + (c + di) = (a + c) + (b + d)i;
(ii) multiplication defined by
(a, b) · (c, d) = (ac − bd, ad + bc)
because (a + bi)(c + di) = (ac − bd) + (ad + bc)i.
• the set of all rational complex numbers
— i.e., those that can be expressed as c = a + ib,
where a, b ∈ Q and i is defined to satisfy i^2 = −1.
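The pair rule for multiplication can be cross-checked against Python's built-in complex type. A minimal sketch (the helper cmul is illustrative):

```python
def cmul(p, q):
    """Multiply complex numbers represented as pairs (a, b) = a + bi."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

print(cmul((1, 2), (3, 4)))           # (-5, 10)
# Cross-check: (1 + 2i)(3 + 4i) = 3 + 4i + 6i + 8i^2 = -5 + 10i
print(complex(1, 2) * complex(3, 4))  # (-5+10j)
```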
General Vector Spaces

Definition
A vector (or linear) space V over an algebraic field F
is a combination ⟨V, F, +, ·⟩ of:
• a set V of vectors;
• the field F of scalars;
• the binary operation V × V ∋ (u, v) ↦ u + v ∈ V
of vector addition;
• the binary operation F × V ∋ (α, u) ↦ αu ∈ V
of multiplication by a scalar.
These are required to satisfy
all of the following eight vector space axioms.



Eight Vector Space Axioms
1. Addition is associative: u + (v + w) = (u + v) + w
2. Addition is commutative: u + v = v + u
3. Additive identity: There exists a zero vector 0 ∈ V
such that v + 0 = v for all v ∈ V .
4. Additive inverse: For every v ∈ V , there exists
an additive inverse −v ∈ V of v such that v + (−v) = 0
5. Multiplication by a scalar is distributive
w.r.t. vector addition: α(u + v) = αu + αv
6. Multiplication by a scalar is distributive
w.r.t. field addition: (α + β)v = αv + βv
7. Multiplication by a scalar and field multiplication
are compatible: α(βv) = (αβ)v
8. The unit element 1 ∈ F is an identity element
for scalar multiplication: 1v = v



Multiplication by the Zero Scalar

Exercise
Prove that 0v = 0 for all v ∈ V .
Hint: Which three axioms justify the following chain of equalities

0v = [1 + (−1)]v = 1v + (−1)v = v − v = 0 ?



A General Class of Finite Dimensional Vector Spaces

Exercise
Given an arbitrary algebraic field F, let F^n denote
the space of all lists ⟨ai⟩_{i=1}^n of n elements ai ∈ F
— i.e., the n-fold Cartesian product of F with itself.
1. Show how to construct the respective binary operations

    F^n × F^n ∋ (x, y) ↦ x + y ∈ F^n
    F × F^n ∋ (λ, x) ↦ λx ∈ F^n

of addition and scalar multiplication
so that (F^n, F, +, ×) is a vector space.
2. Show too that subtraction and division by a (non-zero) scalar
can be defined by v − w = v + (−1)w and v/α = (1/α)v.



Two Particular Finite Dimensional Vector Spaces

From now on we mostly consider real vector spaces
over the real field R,
and especially the space (R^n, R, +, ×) of n-vectors over R.
We will consider, however, the space (C^n, C, +, ×)
of n-vectors over C — the complex plane — when considering:
• eigenvalues and diagonalization of square matrices;
• systems of linear difference and differential equations;
• the characteristic function of a random variable.





Linear Combinations

Definition
A linear combination of vectors is the weighted sum Σ_{h=1}^k λh x^h,
where x^h ∈ V and λh ∈ F for h = 1, 2, . . . , k.

Exercise
By induction on k, show that the vector space axioms
imply that any linear combination of vectors in V
must also belong to V .



Linear Functions

Definition
A function V ∋ u ↦ f(u) ∈ F is linear provided that

f (λu + µv) = λf (u) + µf (v)

for every linear combination λu + µv of two vectors u, v ∈ V ,


with λ, µ ∈ F.

Exercise
Prove that the function V ∋ u ↦ f(u) ∈ F is linear
if and only if both:
1. for every vector v ∈ V and scalar λ ∈ F
one has f (λv) = λf (v);
2. for every pair of vectors u, v ∈ V
one has f (u + v) = f (u) + f (v).



Key Properties of Linear Functions
Exercise
Use induction on k to show
that if the function f : V → F is linear, then

    f(Σ_{h=1}^k λh x^h) = Σ_{h=1}^k λh f(x^h)

for all linear combinations Σ_{h=1}^k λh x^h in V
— i.e., f preserves linear combinations.

Exercise
In case V = R^n and F = R, show that
any linear function is homogeneous of degree 1,
meaning that f(λv) = λf(v) for all λ ∈ R and all v ∈ R^n.
In particular, putting λ = 0 gives f(0) = 0.
What is the corresponding property in case V = Q^n and F = Q?
Affine Functions

Definition
A function g : V → F is said to be affine
if there is a scalar additive constant α ∈ F
and a linear function f : V → F such that g(v) ≡ α + f(v).

Exercise
Under what conditions is an affine function g : R → R linear
when its domain R is regarded as a vector space?



An Economic Aggregation Theorem
Suppose that a finite population of households h ∈ H
with respective non-negative incomes yh ∈ Q_+ (h ∈ H)
have non-negative demands xh ∈ R (h ∈ H)
which depend on household income via a function yh ↦ fh(yh).
Given total income Y := Σ_h yh, under what conditions
can their total demand X := Σ_h xh = Σ_h fh(yh)
be expressed as a function X = F(Y) of Y alone?
The answer is an implication of Cauchy's functional equation.
In this context the theorem asserts that this aggregation condition
implies that the functions fh (h ∈ H) and F must be co-affine.
This means there exists a common multiplicative constant ρ ∈ R,
along with additive constants αh (h ∈ H) and A, such that

    fh(yh) ≡ αh + ρ yh (h ∈ H)   and   F(Y) ≡ A + ρY



Cauchy’s Functional Equation: Proof of Sufficiency

Theorem
Except in the trivial case when H has only one member,
Cauchy's functional equation F(Σ_{h∈H} yh) ≡ Σ_{h∈H} fh(yh)
is satisfied for functions F, fh : Q → R if and only if:
1. there exists a single function φ : Q → R such that

    F(q) = F(0) + φ(q) and fh(q) = fh(0) + φ(q) for all h ∈ H

2. the function φ : Q → R is linear,
implying that the functions F and fh are co-affine.

Proof.
Suppose fh(yh) ≡ αh + ρ yh for all h ∈ H, and F(Y) ≡ A + ρY.
Then Cauchy's functional equation F(Σ_{h∈H} yh) ≡ Σ_{h∈H} fh(yh)
is obviously satisfied provided that A = Σ_{h∈H} αh.



Cauchy’s Equation: Necessity in the Differentiable Case
Suppose that #H ≥ 2 and that F(Σ_{h∈H} yh) ≡ Σ_{h∈H} fh(yh)
where each R ∋ yh ↦ fh(yh) ∈ R is differentiable.
For any pair j, k ∈ H, consider the effect of transferring
a small amount η from k to j, with yh fixed for all h ∈ H \ {j, k}.
Because both Σ_{h∈H} yh and Σ_{h∈H\{j,k}} fh(yh) are unchanged,
equating the changes to the two sides of the Cauchy equation
gives 0 = fj(yj + η) + fk(yk − η) − fj(yj) − fk(yk).
Because we assume that fj′(yj) and fk′(yk) both exist,
we can differentiate the last equation with respect to η
to get 0 = fj′(yj) − fk′(yk).
It follows that there exists a constant c
such that fh′(y) = c for all h ∈ H and all real y, so fh(y) = αh + cy.
But then one has

    F(Σ_{h∈H} yh + η) − F(Σ_{h∈H} yh) = fj(yj + η) − fj(yj) = cη

It follows that F(Y) is the affine function A + cY, for some real A.


Cauchy’s Equation: Beginning the Proof of Necessity
Lemma
The mapping Q ∋ q ↦ φ(q) := F(q) − F(0) ∈ R must satisfy:
1. φ(q) ≡ fi(q) − fi(0) for all i ∈ H and q ∈ Q;
2. φ(q + q′) ≡ φ(q) + φ(q′) for all q, q′ ∈ Q.

Proof.
To prove part 1, consider any i ∈ H and all q ∈ Q.
Note that Cauchy's equation F(Σ_h yh) ≡ Σ_h fh(yh)
implies that F(q) = fi(q) + Σ_{h≠i} fh(0)
and also F(0) = fi(0) + Σ_{h≠i} fh(0).
Now define the function φ(q) := F(q) − F(0) on the domain Q.
Then subtract the second equation from the first to obtain

    φ(q) = F(q) − F(0) = fi(q) − fi(0)



Cauchy’s Equation: Continuing the Proof of Necessity
Proof.
To prove part 2, consider any i, j ∈ H with i ≠ j,
and any q, q′ ∈ Q.
Note that Cauchy's equation F(Σ_h yh) ≡ Σ_h fh(yh) implies that

    F(q + q′) = fi(q) + fj(q′) + Σ_{h∈H\{i,j}} fh(0)
    F(0) = fi(0) + fj(0) + Σ_{h∈H\{i,j}} fh(0)

Now subtract the second equation from the first,
then use the equation φ(q) = F(q) − F(0) = fi(q) − fi(0)
derived in the previous slide, to obtain successively

    φ(q + q′) = F(q + q′) − F(0)
              = fi(q) − fi(0) + fj(q′) − fj(0)
              = φ(q) + φ(q′)



Cauchy’s Equation: Resuming the Proof of Necessity

Because φ(q + q′) ≡ φ(q) + φ(q′),
for any k ∈ N one has φ(kq) = φ((k − 1)q) + φ(q).
As an induction hypothesis, which is trivially true for k = 2,
suppose that φ((k − 1)q) = (k − 1)φ(q).
Confirming the induction step, the hypothesis implies that

    φ(kq) = φ((k − 1)q) + φ(q) = (k − 1)φ(q) + φ(q) = kφ(q)

So φ(kq) = kφ(q) for every k ∈ N and every q ∈ Q.
Putting q′ = kq implies that φ(q′) = kφ(q′/k).
Interchanging q and q′, it follows that φ(q/k) = (1/k)φ(q).



Cauchy’s Equation: Completing the Proof of Necessity

So far we have proved that, for every k ∈ N and every q ∈ Q,
one has both φ(kq) = kφ(q) and φ(q/k) = (1/k)φ(q).
Hence, for every rational r = m/n ∈ Q
one has φ(mq/n) = mφ(q/n) = (m/n)φ(q) and so φ(rq) = rφ(q).
In particular, φ(r) = rφ(1), so φ is linear on its domain Q
(though not on the whole of R without additional assumptions
such as continuity or monotonicity).
The rest of the proof is routine checking of definitions.





Euclidean Norm as Length
Pythagoras’s theorem implies that the lengthq
of the typical vector x = (x1 , x2 ) ∈ R2 is x12 + x22
or, perhaps less clumsily, (x12 + x22 )1/2 .
In R3 , the same result implies that the length
of the typical vector x = (x1 , x2 , x3 ) is
 2 1/2
(x12 + x22 )1/2 + x32 = (x12 + x22 + x32 )1/2 .

An obvious extension to Rn is the following:


Definition
The length of the typical n-vector x = (xi )ni=1 ∈ Rn
is its (Euclidean) norm
X n 1/2 √ √
kxk := xi2 = x> x = x·x
i=1



Unit n-Vectors, the Unit Sphere, and Unit Ball
Definition
A unit vector u ∈ R^n is a vector with unit norm
— i.e., its components satisfy Σ_{i=1}^n ui^2 = ‖u‖^2 = 1.
The set of all such unit vectors forms a surface
called the unit sphere of dimension n − 1
(one less than n because of the defining equation).
It is defined as the hollow set (like a football or tennis ball)

    S^{n−1} := { x ∈ R^n | Σ_{i=1}^n xi^2 = 1 }

The unit ball B ⊂ R^n is the solid set (like a cricket ball or golf ball)

    B := { x ∈ R^n | Σ_{i=1}^n xi^2 ≤ 1 }

of all points bounded by the surface of the unit sphere S^{n−1} ⊂ R^n.



Cauchy–Schwarz Inequality
Theorem
For all pairs a, b ∈ R^n, one has |a · b| ≤ ‖a‖‖b‖.

Proof.
Define the function R ∋ ξ ↦ f(ξ) := Σ_{i=1}^n (ai ξ + bi)^2 ∈ R.
Clearly f is the quadratic function f(ξ) ≡ Aξ^2 + Bξ + C
where A := Σ_{i=1}^n ai^2 = ‖a‖^2, B := 2 Σ_{i=1}^n ai bi = 2a · b,
and C := Σ_{i=1}^n bi^2 = ‖b‖^2.
There is a trivial case when A = 0 because a = 0.
Otherwise A > 0, so we can complete the square to get

    f(ξ) ≡ Aξ^2 + Bξ + C = A[ξ + (B/2A)]^2 + C − B^2/4A

But the definition of f implies that f(ξ) ≥ 0 for all ξ ∈ R,
including ξ = −B/2A, so 0 ≤ f(−B/2A) = C − B^2/4A.
Hence (1/4)B^2 ≤ AC,
implying that |a · b| = (1/2)|B| ≤ √(AC) = ‖a‖‖b‖.


The Angle Between Two Vectors
Consider the triangle in R^n whose vertices
are the three distinct points x, y, 0.
Its three sides or edges have respective lengths ‖x‖, ‖y‖, ‖x − y‖,
where the last follows from the parallelogram law.
Note that ‖x − y‖^2 ⋚ ‖x‖^2 + ‖y‖^2 according as the angle at 0 is:
(i) acute; (ii) a right angle; (iii) obtuse. But

    ‖x − y‖^2 − ‖x‖^2 − ‖y‖^2 = Σ_{i=1}^n (xi − yi)^2 − Σ_{i=1}^n (xi^2 + yi^2)
                              = Σ_{i=1}^n (−2 xi yi) = −2 x · y

So the three cases (i)–(iii) occur according as x · y ⋛ 0.
Using the Cauchy–Schwarz inequality, one can define the angle
between x and y as the unique solution θ = arccos(x · y/(‖x‖‖y‖))
in the interval [0, π] of the equation cos θ = x · y/(‖x‖‖y‖) ∈ [−1, 1].
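The angle formula can be sketched as follows, assuming numpy (the clip guard is a numerical precaution against rounding drift, not part of the definition):

```python
import numpy as np

def angle(x, y):
    """Angle between nonzero vectors x and y, in radians."""
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    # Guard against tiny rounding drift outside [-1, 1]
    return np.arccos(np.clip(c, -1.0, 1.0))

print(angle(np.array([1.0, 0.0]), np.array([0.0, 1.0])))   # pi/2: orthogonal
print(angle(np.array([1.0, 0.0]), np.array([1.0, 1.0])))   # pi/4: acute
```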



Orthogonal and Orthonormal Sets of Vectors
Case (ii) suggests defining two vectors x, y ∈ R^n
as orthogonal just in case x · y = 0,
which is true if and only if θ = arccos 0 = π/2 = 90°.
A set of k vectors {x^1, x^2, . . . , x^k} ⊂ R^n is said to be:
• pairwise orthogonal just in case x^i · x^j = 0 whenever j ≠ i;
• orthonormal just in case, in addition, each ‖x^i‖ = 1
— i.e., all k elements of the set are vectors of unit length.
Define the Kronecker delta function

    {1, 2, . . . , n} × {1, 2, . . . , n} ∋ (i, j) ↦ δij ∈ {0, 1}

on the set of pairs i, j ∈ {1, 2, . . . , n} by

    δij := 1 if i = j, and δij := 0 otherwise

Then the set of k vectors {x^1, x^2, . . . , x^k} ⊂ R^n is orthonormal
if and only if x^i · x^j = δij for all pairs i, j ∈ {1, 2, . . . , k}.
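The δij characterization gives a direct orthonormality test. A sketch assuming numpy (the function name is illustrative):

```python
import numpy as np

def is_orthonormal(vectors, tol=1e-12):
    """Check x^i . x^j = delta_ij for all pairs in a list of n-vectors."""
    k = len(vectors)
    for i in range(k):
        for j in range(k):
            delta = 1.0 if i == j else 0.0
            if abs(np.dot(vectors[i], vectors[j]) - delta) > tol:
                return False
    return True

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
print(is_orthonormal([e1, e2]))                    # True
print(is_orthonormal([e1, np.array([1.0, 1.0])]))  # False: not orthogonal to e1
```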


The Canonical Basis of Rn

Example
A prominent orthonormal set is the canonical basis of R^n,
defined as the set of n different n-vectors e^i (i = 1, 2, . . . , n)
whose respective components (e^i_j)_{j=1}^n
satisfy e^i_j = δij for all j ∈ {1, 2, . . . , n}.

Exercise
Show that each n-vector x = (xi)_{i=1}^n is a linear combination

    x = (xi)_{i=1}^n = Σ_{i=1}^n xi e^i

of the canonical basis vectors {e^1, e^2, . . . , e^n},
with the multiplier attached to each basis vector e^i
equal to the respective component xi (i = 1, 2, . . . , n).
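The expansion x = Σ_i xi e^i can be verified directly. A sketch assuming numpy, whose np.eye supplies the canonical basis vectors as the rows of the identity matrix:

```python
import numpy as np

n = 4
x = np.array([2.0, -1.0, 0.5, 3.0])
E = np.eye(n)   # row i of the identity is the canonical basis vector e^i

# Rebuild x as the linear combination sum_i xi * e^i
rebuilt = sum(x[i] * E[i] for i in range(n))
print(rebuilt)                      # [ 2.  -1.   0.5  3. ]
print(np.array_equal(rebuilt, x))   # True
```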



The Canonical Basis in Commodity Space

Example
Consider the case when each vector x ∈ R^n is a quantity vector,
whose components are (xi)_{i=1}^n,
where xi indicates the net quantity of commodity i.
Then the ith unit vector e^i of the canonical basis of R^n
represents a commodity bundle
that consists of one unit of commodity i,
but nothing of every other commodity.
In case the row vector p^T ∈ R^n
is a price vector for the same list of n commodities,
the value p^T e^i of the ith unit vector e^i must equal pi,
the price (of one unit) of the ith commodity.



Linear Functions

Theorem
The function R^n ∋ x ↦ f(x) ∈ R is linear
if and only if there exists y ∈ R^n such that f(x) = y^T x.

Proof.
Sufficiency is easy to check.
Conversely, note that x equals the linear combination Σ_{i=1}^n xi e^i
of the n canonical basis vectors.
Hence, linearity of f implies that

    f(x) = f(Σ_{i=1}^n xi e^i) = Σ_{i=1}^n xi f(e^i) = Σ_{i=1}^n f(e^i) xi = y^T x

where y is the column vector whose components
are yi = f(e^i) for i = 1, 2, . . . , n.



Linear Transformations: Definition

Definition
The vector-valued function

    R^n ∋ x ↦ F(x) = (Fi(x))_{i=1}^m ∈ R^m

is a linear transformation just in case
each component function R^n ∋ x ↦ Fi(x) ∈ R is linear
— or equivalently, iff F(λx + µy) = λF(x) + µF(y)
for every linear combination λx + µy of every pair x, y ∈ R^n.



Characterizing Linear Transformations
Theorem
The mapping R^n ∋ x ↦ F(x) ∈ R^m is a linear transformation
if and only if there exist vectors y1, . . . , ym ∈ R^n
such that each component function satisfies Fi(x) = yi^T x.

Proof.
Sufficiency is obvious.
Conversely, because x equals the linear combination Σ_{i=1}^n xi e^i
of the n canonical basis vectors {e^i}_{i=1}^n
and because each component function R^n ∋ x ↦ Fi(x) is linear,
one has

    Fi(x) = Fi(Σ_{j=1}^n xj e^j) = Σ_{j=1}^n xj Fi(e^j) = yi^T x

where yi^T is the row vector whose components
are (yi)_j = Fi(e^j) for i = 1, 2, . . . , m and j = 1, 2, . . . , n.
Representing a Linear Transformation

Definition
A matrix representation
of the linear transformation R^n ∋ x ↦ F(x) ∈ R^m
relative to the canonical bases of R^n and R^m
is an m × n array whose n columns
are the m-vector images F(e^j) = (Fi(e^j))_{i=1}^m ∈ R^m
of the n canonical basis vectors {e^j}_{j=1}^n of R^n.
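Building the representing matrix column by column from the images F(e^j) can be sketched as follows, with an illustrative linear map (numpy assumed; the particular F is not from the lecture):

```python
import numpy as np

def F(x):
    """An illustrative linear transformation from R^3 to R^2."""
    return np.array([x[0] + 2 * x[1], 3 * x[2] - x[0]])

# The n columns of the representing matrix are the images F(e^j)
# of the canonical basis vectors e^1, ..., e^n (rows of the identity).
A = np.column_stack([F(e) for e in np.eye(3)])
print(A.shape)   # (2, 3): an m x n array with m = 2, n = 3

# The matrix reproduces the transformation: A x = F(x) for every x
x = np.array([1.0, 1.0, 1.0])
print(np.array_equal(A @ x, F(x)))   # True
```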





Linear Combinations and Dependence: Definitions

Definition
A linear combination of the finite set {x^1, x^2, . . . , x^k} of vectors
is the scalar weighted sum Σ_{h=1}^k λh x^h,
where λh ∈ R for h = 1, 2, . . . , k.

Definition
The finite set {x^1, x^2, . . . , x^k} of vectors is linearly independent
just in case the only solution of the equation Σ_{h=1}^k λh x^h = 0
is the trivial solution λ1 = λ2 = · · · = λk = 0.
Alternatively, if the equation has a non-trivial solution,
then the set of vectors is linearly dependent.



Characterizing Linear Dependence

Theorem
The finite set {x^1, x^2, . . . , x^k} of k n-vectors is linearly dependent
if and only if at least one of the vectors, say x^1 after reordering,
can be expressed as a linear combination of the others
— i.e., there exist scalars αh (h = 2, 3, . . . , k)
such that x^1 = Σ_{h=2}^k αh x^h.

Proof.
If x^1 = Σ_{h=2}^k αh x^h, then (−1)x^1 + Σ_{h=2}^k αh x^h = 0,
so Σ_{h=1}^k λh x^h = 0 has a non-trivial solution.
Conversely, suppose Σ_{h=1}^k λh x^h = 0 has a non-trivial solution.
After reordering, we can suppose that λ1 ≠ 0.
Then x^1 = Σ_{h=2}^k αh x^h,
where αh = −λh/λ1 for h = 2, 3, . . . , k.
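Linear dependence can be detected numerically: a set of vectors is independent iff the matrix with those vectors as columns has rank equal to the number of vectors. A sketch assuming numpy (the function name is illustrative):

```python
import numpy as np

def is_linearly_independent(vectors):
    """True iff the matrix whose columns are the given n-vectors
    has rank equal to the number of vectors."""
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == len(vectors)

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
print(is_linearly_independent([v1, v2]))           # True
print(is_linearly_independent([v1, v2, v1 + v2]))  # False: third = v1 + v2
```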



Orthogonality Implies Linear Independence

Theorem
If the finite set S = {x^1, x^2, . . . , x^k} of k non-zero n-vectors
is pairwise orthogonal, then it is linearly independent.

Proof.
Let s denote the linear combination Σ_{h=1}^k αh x^h.
Then for each j = 1, . . . , k one has s · x^j = Σ_{h=1}^k αh x^h · x^j.
In case S is pairwise orthogonal, one has x^h · x^j = 0 for all h ≠ j,
and so s · x^j = αj x^j · x^j.
So when s = 0, it follows that αj x^j · x^j = 0 for all j = 1, . . . , k.
Then, because we assumed that x^j ≠ 0,
we have αj = 0 for all j = 1, . . . , k.
This proves that S is linearly independent.



Dimension
Definition
The dimension of a vector space V is the size
of any maximal set of linearly independent vectors,
if this number is finite.
Otherwise, if there is an infinite set of linearly independent vectors,
the dimension is infinite.

Exercise
Show that the canonical basis of R^n is linearly independent.

Example
The previous exercise shows that the dimension of R^n is at least n.
Later results will imply that any set of k > n vectors in R^n
is linearly dependent.
This implies that the dimension of R^n is exactly n.





Matrices as Rectangular Arrays
An m × n matrix A = (a_ij)_{m×n} is a (rectangular) array

    A = [ a_11  a_12  …  a_1n ]
        [ a_21  a_22  …  a_2n ]  = ((a_ij)_{i=1}^m)_{j=1}^n = ((a_ij)_{j=1}^n)_{i=1}^m
        [   ⋮     ⋮    ⋱    ⋮  ]
        [ a_m1  a_m2  …  a_mn ]

Note that in a_ij, we write the row number i
before the column number j.
An m × 1 matrix is a column vector with m rows and 1 column.
A 1 × n matrix is a row vector with 1 row and n columns.
The m × n matrix A consists of:
n columns in the form of m-vectors
a_j = (a_ij)_{i=1}^m ∈ R^m for j = 1, 2, …, n;
m rows in the form of n-vectors
a_i^⊤ = (a_ij)_{j=1}^n ∈ R^n for i = 1, 2, …, m.
The Transpose of a Matrix
The transpose of the m × n matrix A = (a_ij)_{m×n}
is defined as the n × m matrix

    A^⊤ = (a_ij^⊤)_{n×m} = (a_ji)_{n×m} = [ a_11  a_21  …  a_m1 ]
                                          [ a_12  a_22  …  a_m2 ]
                                          [   ⋮     ⋮    ⋱    ⋮  ]
                                          [ a_1n  a_2n  …  a_mn ]

Thus the transposed matrix A^⊤ results from transforming
each column m-vector a_j = (a_ij)_{i=1}^m (j = 1, 2, …, n) of A
into the corresponding row m-vector a_j^⊤ = (a_ji^⊤)_{i=1}^m of A^⊤.

Equivalently, for each i = 1, 2, …, m,
the ith row n-vector a_i^⊤ = (a_ij)_{j=1}^n of A
is transformed into the ith column n-vector a_i = (a_ji^⊤)_{j=1}^n of A^⊤.

Either way, one has a_ij^⊤ = a_ji for all relevant pairs i, j.



Rows Before Columns

VERY Important Rule: Rows before columns!


This order really matters.
Reversing it gives a transposed matrix.
Exercise
Verify that the double transpose of any m × n matrix A
satisfies (A^⊤)^⊤ = A
— i.e., transposing a matrix twice recovers the original matrix.
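The exercise can be checked numerically. A minimal Python sketch, representing a matrix as a list of rows (an assumption of this illustration, not notation from the slides):

```python
def transpose(A):
    """Transpose a matrix given as a list of rows: rows become columns."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3],
     [4, 5, 6]]              # a 2 x 3 matrix

At = transpose(A)            # 3 x 2: entry (i, j) of At is entry (j, i) of A
print(At)                    # → [[1, 4], [2, 5], [3, 6]]

# Transposing twice recovers the original matrix: (A^T)^T = A.
assert transpose(At) == A
```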





Multiplying a Matrix by a Scalar

A scalar, usually denoted by a Greek letter,


is simply a member α ∈ F of the algebraic field F
over which the vector space is defined.
So when F = R, a scalar is a real number α ∈ R.

The product of any m × n matrix A = (aij )m×n


and any scalar α ∈ R
is the new m × n matrix denoted by αA = (αaij )m×n ,
each of whose elements αaij results
from multiplying the corresponding element aij of A by α.



Matrix Multiplication

The matrix product of two matrices A and B
is defined (whenever possible) as the matrix C = AB = (c_ij)_{m×n}
whose element c_ij in row i and column j
is the inner product c_ij = a_i^⊤ · b_j of:
▶ the ith row vector a_i^⊤ of the first matrix A;
▶ the jth column vector b_j of the second matrix B.



Compatibility for Matrix Multiplication, I

Again: rows before columns!


Note that the resulting matrix product C must have:
▶ as many rows as the first matrix A;
▶ as many columns as the second matrix B.

Yet again: rows before columns!



Compatibility for Matrix Multiplication, II

Question: when is this definition
of the matrix product C = AB possible?
Answer: if and only if A has as many columns as B has rows.
This condition ensures that every inner product a_i^⊤ b_j is defined,
which is true iff (if and only if) every row of A
has exactly the same number of elements as every column of B.
In this case, the two matrices A and B
are compatible for multiplication.
Specifically, if A is m × ℓ for some m,
then B must be ℓ × n for some n.
Then the product C = AB is m × n,
with elements c_ij = a_i^⊤ b_j = ∑_{k=1}^ℓ a_ik b_kj
for i = 1, 2, …, m and j = 1, 2, …, n.
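The definition above — the compatibility check first, then the formula c_ij = ∑_k a_ik b_kj — can be sketched directly in Python, with matrices as lists of rows (an illustration, not code from the course):

```python
def matmul(A, B):
    """Product of an m x l matrix A and an l x n matrix B (lists of rows)."""
    m, l = len(A), len(A[0])
    # Compatibility: A must have as many columns as B has rows.
    if l != len(B):
        raise ValueError("incompatible shapes for matrix multiplication")
    n = len(B[0])
    # c_ij is the inner product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(l)) for j in range(n)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]              # 2 x 3
B = [[1, 0],
     [0, 1],
     [1, 1]]                 # 3 x 2

print(matmul(A, B))          # → [[4, 5], [10, 11]], a 2 x 2 matrix
```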



Laws of Matrix Multiplication
Exercise
Verify that the following laws of matrix multiplication hold
whenever the matrices A, B, C are compatible for multiplication.
associative law for matrices: A(BC) = (AB)C;
distributive laws: A(B + C) = AB + AC and (A + B)C = AC + BC;
transpose law: (AB)^⊤ = B^⊤ A^⊤;
associative law for scalars: α(AB) = (αA)B = A(αB) for all α ∈ R.
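For instance, the transpose law can be checked on one pair of matrices; a small Python sketch with illustrative entries (a spot check, not a proof):

```python
def transpose(A):
    """Rows of A become columns."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product, with matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2],
     [3, 4],
     [5, 6]]                 # 3 x 2
B = [[1, 0, 2],
     [0, 1, 3]]              # 2 x 3

# Transpose law: (AB)^T = B^T A^T.
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```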

Exercise
Let X be any m × n matrix, and z any column n-vector.
1. Show that the matrix product z^⊤ X^⊤ X z is well-defined,
   and that its value is a scalar.
2. By putting w = Xz in the previous exercise regarding the sign
   of the quadratic form w^⊤ w, what can you conclude
   about the value of the scalar z^⊤ X^⊤ X z?
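A numerical hint for part 2: z^⊤ X^⊤ X z equals w^⊤ w with w = Xz, which is a sum of squares. A small Python sketch with arbitrary illustrative data:

```python
def matvec(X, z):
    """Product of a matrix X (list of rows) and a column vector z."""
    return [sum(x * zi for x, zi in zip(row, z)) for row in X]

X = [[1, -2],
     [0, 3],
     [4, 1]]                 # an illustrative 3 x 2 matrix
z = [2, -1]                  # an illustrative 2-vector

w = matvec(X, z)             # w = Xz
# z^T X^T X z = w^T w = sum of squares, hence never negative.
quad = sum(wi * wi for wi in w)
print(quad)                  # → 74
assert quad >= 0
```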
Exercise for Econometricians I

Exercise
An econometrician has access to data series (such as time series)
involving the real values
▶ y_t (t = 1, 2, …, T) of one endogenous variable;
▶ x_ti (t = 1, 2, …, T and i = 1, 2, …, k)
  of k different exogenous variables
  — sometimes called explanatory variables or regressors.
The data is to be fitted into the linear regression model

    y_t = ∑_{i=1}^k b_i x_ti + e_t

whose scalar constants b_i (i = 1, 2, …, k)
are unknown regression coefficients,
and each scalar e_t is the error term or residual.



Exercise for Econometricians II

1. Discuss how the regression model can be written


in the form y = Xb + e for suitable column vectors y, b, e.
2. What are the dimensions of these vectors,
and of the exogenous data matrix X?
3. Why do you think econometricians use this matrix equation,
rather than the alternative y = bX + e?
4. How can the equation y = Xb + e
   accommodate the constant term α
   in the alternative equation y_t = α + ∑_{i=1}^k b_i x_ti + e_t?
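A hint for part 4: the constant α can be treated as the coefficient on an extra regressor that equals 1 in every period. A minimal Python sketch with made-up data (the numbers are illustrative only):

```python
# Illustrative data: T = 4 observations on k = 2 exogenous regressors.
x = [[0.5, 1.2],
     [1.0, 0.7],
     [1.5, 2.0],
     [2.0, 1.1]]

# Prepend a column of ones, so that y_t = alpha + sum_i b_i x_ti + e_t
# becomes y = Xb + e with coefficient vector b = (alpha, b_1, ..., b_k).
X = [[1.0] + row for row in x]
print(X[0])                  # → [1.0, 0.5, 1.2]
```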



Matrix Multiplication Does Not Commute I
The two matrices A and B commute just in case AB = BA.
Note that typical pairs of matrices DO NOT commute,
meaning that AB ≠ BA — i.e., the order of multiplication matters.
Indeed, suppose that A is ℓ × m and B is m × n,
as is needed for AB to be defined.
Then the reverse product BA is undefined
except in the special case when n = ℓ.
Hence, for both AB and BA to be defined,
where B is m × n, the matrix A must be n × m.
But then AB is n × n, whereas BA is m × m.
Evidently AB ≠ BA unless m = n.
Thus, for AB = BA to be possible at all, we must be in the special case
when all four matrices A, B, AB and BA
are square matrices of the same dimension m × m = n × n.
Matrix Multiplication Does Not Commute II
Even if both A and B are n × n matrices,
implying that both AB and BA are also n × n,
one can still have AB ≠ BA.
Example
Here is a 2 × 2 example:

    [ 0 1 ][ 0 0 ]   [ 0 1 ]       [ 0 0 ]   [ 0 0 ][ 0 1 ]
    [ 1 0 ][ 0 1 ] = [ 0 0 ]   ≠   [ 1 0 ] = [ 0 1 ][ 1 0 ]
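The example above can be verified numerically; a self-contained Python sketch (2 × 2 matrices as lists of rows):

```python
def matmul2(A, B):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, 1],
     [1, 0]]
B = [[0, 0],
     [0, 1]]

print(matmul2(A, B))         # → [[0, 1], [0, 0]]
print(matmul2(B, A))         # → [[0, 0], [1, 0]]
assert matmul2(A, B) != matmul2(B, A)
```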

Exercise
For matrix multiplication, explain why there are two different
versions of the distributive law — namely

A(B + C) = AB + AC and (A + B)C = AC + BC



More Warnings Regarding Matrix Multiplication

Exercise
Let A, B, C denote three general matrices.
Give examples showing that:
1. The matrix AB might be defined, even if BA is not.
2. One can have AB = 0 even though A ≠ 0 and B ≠ 0.
3. If AB = AC and A ≠ 0, it does not follow that B = C.
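A hint for part 2: one such pair (among many) can be checked directly; a minimal Python sketch:

```python
def matmul2(A, B):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Both factors are non-zero matrices...
A = [[0, 1],
     [0, 0]]
B = [[1, 0],
     [0, 0]]

# ...yet their product is the zero matrix.
assert matmul2(A, B) == [[0, 0], [0, 0]]
```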

