Strang Linear Algebra Notes
Schwarz inequality: |v · w| ≤ ||v|| ||w||
Triangle inequality: ||v + w|| ≤ ||v|| + ||w||
1.3: Matrices
Multiplying a matrix on the right by a vector (Ax) can be thought of as a linear combination of the columns of A, with the entries of x as the coefficients
One can also think of A as acting on the vector x, insofar as the elements of Ax are combinations of the entries of x according to the corresponding rows of A
Of course, there's also the usual way of multiplying matrices which takes dot products of corresponding
rows and columns (or for a vector, the whole vector)
One other way of looking at the equation Ax = b is as a system of equations, where b is known and we're trying to find x
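As a quick sanity check of the column and row pictures, here's a small numpy sketch (the matrix and vector are made-up examples):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 2.0])

# Column picture: Ax is a combination of A's columns, weighted by the entries of x
col_combo = x[0] * A[:, 0] + x[1] * A[:, 1]

# Row picture: each entry of Ax is a dot product of a row of A with x
row_dots = np.array([A[0] @ x, A[1] @ x])

print(col_combo, row_dots, A @ x)   # all three give [4. 7.]
```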
This system of equations is well-behaved, and as a result the matrix A is invertible. What that means is that for every right-hand side b there is exactly one solution x. Note that this doesn't have to be the case: if we had an under- or over-determined system, we could have infinitely many solutions / no solutions at all.
How do these facts about the corresponding systems of equations translate over to the matrix equation Ax = b?
Not having enough independent variables in the equations corresponds to redundant columns in the source matrix, AKA the columns of A are dependent. Not having enough independent equations for our unknowns means that our rows are dependent. Either of these can result in infinitely many solutions or no solutions, depending on b.
Another way to look at dependence: the column / row vectors (in 3D) lie in the same plane, instead of spanning the entire 3D space. This means that not all vectors b can result from a linear combination of the columns, and so there is sometimes no solution.
Having dependent columns / rows means that the matrix is not invertible. If the matrix were invertible, that would imply that there was exactly one solution to the system of equations for every b. Since this is clearly not the case, the matrix cannot be inverted.
Viewing this in column form, it's clear why linearly dependent columns lead to a singular matrix: if the columns are linearly dependent, then for some vectors b there will be no x such that a linear combination of the columns equals b
The row picture, of course, is of two lines in the x-y plane, intersecting in a point
Now say we have a system of 3 linear equations, with a 3x3 coefficient matrix A
The column picture is the one Strang prefers, for its cleanliness
Once elimination produces an upper triangular system, it can be quickly solved with back substitution
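A minimal back-substitution sketch (the upper triangular system here is an arbitrary example):

```python
import numpy as np

def back_substitute(U, c):
    """Solve Ux = c for upper triangular U, working from the last row up."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the already-known components, then divide by the pivot
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0, 1.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 4.0]])
c = np.array([5.0, 7.0, 8.0])
print(np.allclose(back_substitute(U, c), np.linalg.solve(U, c)))   # True
```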
Some nomenclature:
The pivot is the first nonzero element in the row that does the elimination (i.e. the row that is
subtracted from other rows)
The multiplier is (entry to eliminate) / (pivot)
Elimination can fail in some scenarios - e.g. the pivot is 0. This happens in systems with no solution, as well
as in systems with infinitely many solutions. Sometimes, the pivot is 0 and the system is still solvable, but
rows need to be exchanged.
Consider the elimination step where we subtract 2x the first row from the second.
The elimination matrix E_ij that subtracts a multiple ℓ of the jth row from the ith row is the identity matrix with -ℓ in its (i, j) entry
The purpose of elimination is to produce a 0 in the row that's being acted upon, in the column of the pivot of the row that's being subtracted
The operation we're performing is a multiplication on the left: E_21 A, one elimination matrix per step
Also, what's the impact of the ordering of the E_ij's on the final elimination matrix?
Ok, so how do we multiply the matrices on the left hand side?
We know what the result of Ax (matrix times vector) is, so we can make various observations from this result
We also know that the rule for multiplying matrices must give the same result as the matrix-vector multiplication when the right-hand matrix is a single column
When performing elimination, add b as the last column (the augmented matrix [A b]), because the same operations act on b as on A
An example is elimination (EA = U) - think about the resulting rows of U and their relation to the rows of the original matrices
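A small sketch of the elimination step above as a left-multiplication (the matrix is made up; the multiplier is 2, matching the step described earlier):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [4.0, 5.0]])

# E_21 subtracts 2x the first row from the second (multiplier = 4/2 = 2)
E21 = np.array([[ 1.0, 0.0],
                [-2.0, 1.0]])

print(E21 @ A)   # [[2. 1.], [0. 3.]]: a zero now sits below the pivot
```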
Block multiplication
You can divide matrices into blocks. If in the multiplication AB, the column cuts of A match the row cuts of B, you can do the resulting multiplication just as if the blocks were numbers
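A quick numpy check of block multiplication with a single cut (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

# Cut A's columns and B's rows at the same place
A1, A2 = A[:, :2], A[:, 2:]
B1, B2 = B[:2, :], B[2:, :]

# Multiply the blocks as if they were numbers: AB = A1 B1 + A2 B2
print(np.allclose(A1 @ B1 + A2 @ B2, A @ B))   # True
```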
You can see how this (a nonzero solution to Ax = 0) is equivalent to linear dependence of the columns
A matrix is invertible iff its determinant is not 0
Gauss-Jordan Elimination
Inverting A means solving AA^-1 = I, which is 3 systems of 3 linear equations (one per column of I); we can solve them all at once in elimination, on the augmented matrix [A I]
Once you get to an upper triangular matrix, you perform another round of elimination, but upwards, so that all elements above the pivots in U are 0, resulting in a diagonal matrix
Finally, you divide rows through by the elements on the diagonal to get I on the left and A^-1 on the right
We've multiplied [A I] by a series of elimination matrices to get I on the left, which implies that E, the product of all the elimination matrices, is A^-1, and since we've performed the same steps on I, the right-hand matrix is E I = A^-1
Gauss-Jordan form explains why, when we computationally solve Ax = b, we generally won't try to invert A. We need to solve n systems of equations to invert a matrix with dimension n, but to solve Ax = b we simply need to solve one system of equations
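A minimal Gauss-Jordan sketch, assuming no row exchanges are needed (the 2x2 matrix is just an example):

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by eliminating on the augmented matrix [A | I]."""
    n = A.shape[0]
    aug = np.hstack([A.astype(float), np.eye(n)])
    for i in range(n):
        aug[i] /= aug[i, i]                    # scale so the pivot is 1
        for j in range(n):
            if j != i:
                aug[j] -= aug[j, i] * aug[i]   # clear the rest of column i
    return aug[:, n:]                          # right half is now A^-1

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(np.allclose(gauss_jordan_inverse(A), np.linalg.inv(A)))   # True
```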
Elimination is one of these factorizations - when you take EA = U, E is always an invertible matrix (if there are no row exchanges), so this is equivalent to factoring A = E^-1 U = LU
Each step of the elimination is intended to produce a 0 in the (i, j) element of the operand by multiplying by E_ij, with the multiplier -ℓ_ij in the (i, j) element of the elimination matrix; the inverse does the opposite (it adds the row back, with +ℓ_ij)
An amazing thing happens - in the product L (the inverses of the elimination matrices, taken in reverse order), each of the original multipliers goes in directly, without any cross-talk!
It happens because each row of A is a sum of the corresponding row of U and multiples of the rows of U above it: the rows above a given row in U do not change once they've been used for elimination, so the original row of A can be recovered by reversing the subtractions, and those multiples are exactly the multipliers that go into L
If we want U to be cleaner (have 1s on the diagonal like L), we can divide out by a diagonal matrix D to get the new factorization A = LDU
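A small sketch of the multipliers going straight into L (a made-up 3x3 matrix, no row exchanges needed):

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 5.0, 3.0],
              [2.0, 5.0, 5.0]])

L = np.eye(3)
U = A.copy()
for j in range(3):
    for i in range(j + 1, 3):
        L[i, j] = U[i, j] / U[j, j]   # the multiplier goes straight into L
        U[i] -= L[i, j] * U[j]

print(np.allclose(L @ U, A))   # True: the multipliers reassemble A with no cross-talk
```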
The best way to look at this is with the row picture, multiplying on the left
Some rules: (A + B)^T = A^T + B^T, (AB)^T = B^T A^T, (A^-1)^T = (A^T)^-1
v^T w is called the inner product because the transpose is on the inside; the outer product is v w^T, with the transpose on the outside
Note that the inner product of two n-dimensional vectors is a scalar, while the outer product is an n x n matrix
There's a way to define transposes through dot products, but it doesn't seem to add intuition for me
A^T A and A A^T are both symmetric square matrices (but they're different matrices)
Look at the individual elements - the (i, j) and (j, i) elements are the same, because as you swap the order of the matrices, you also swap rows for columns
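A quick check that A^T A and A A^T are both symmetric but different (random rectangular A):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2))

AtA, AAt = A.T @ A, A @ A.T
print(np.allclose(AtA, AtA.T))   # True: 2x2 symmetric
print(np.allclose(AAt, AAt.T))   # True: 3x3 symmetric (a different matrix)
```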
Symmetric matrices make elimination a bit easier
Permutation matrices swap rows of the matrix they multiply (when applied on the left)
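A tiny illustration (this particular P swaps the last two rows):

```python
import numpy as np

P = np.array([[1, 0, 0],
              [0, 0, 1],
              [0, 1, 0]])
A = np.arange(9).reshape(3, 3)

print(P @ A)   # rows 1 and 2 of A are exchanged
print(A @ P)   # multiplying on the right permutes columns instead
```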
Ax = b is solvable iff b is in the column space of A
Chapter 4: Orthogonality
Questions
Why is ?
What is a geometric description of the subspace that we project into in least squares?
Need to rederive least squares formulas for myself + get intuitive understanding of each step.
4.2 Projections
We want to find projections of vectors onto subspaces - i.e. find the vectors in those subspaces that are
the closest to the original vector
Our goal, thus, is to find the vector p which is the projection of our original vector b into a subspace, as well as the projection matrix P that takes b to p
The key to projection is orthogonality - consider the example where we're trying to project a vector b onto a line in the direction of a: the resulting vector on the line is p = a (a^T b) / (a^T a)
Since (a^T b) / (a^T a) is a scalar, we can shift it around to get the projection matrix P = a a^T / (a^T a), which is a rank 1 matrix
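A small numerical check of projection onto a line (the vectors a and b are made up):

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([1.0, 1.0, 1.0])

P = np.outer(a, a) / (a @ a)   # rank-1 projection matrix a a^T / (a^T a)
p = P @ b                      # projection of b onto the line through a

print(np.linalg.matrix_rank(P))     # 1
print(np.allclose(a @ (b - p), 0))  # True: the error b - p is orthogonal to a
```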
We have n linearly independent vectors in R^m (i.e. they form a basis for a subspace), and we want to find the combination of the vectors that is closest to a given vector b
Our error vector is e = b - Ax̂, where A is the matrix with the vectors as columns
We know that our error vector is perpendicular to the subspace, which implies that it's in the left nullspace of A, so A^T (b - Ax̂) = 0, i.e. A^T A x̂ = A^T b
The solution x̂ that minimizes the error is the least squares solution
The relation between the plane that is formed by the columns of A and the straight line we're fitting to is unclear
So, when Ax = b has no solution, multiply by A^T and solve A^T A x̂ = A^T b
Every Ax (for fitting a straight line) lies in the plane of the columns
We want the point on the plane closest to b, which is the projection p
This gives the smallest error e = b - p, and the points are fitted to a line
Don't really understand how this gives us a solution, it just sketches out what a solution looks
like
By algebra
We can decompose b = p + e, where p is in the column space and e is orthogonal to the column space, i.e. e is in the left nullspace
The squared error is ||Ax - b||^2 = ||Ax - p||^2 + ||e||^2 by the Pythagorean thm, since the two vectors are orthogonal, and it is smallest for Ax = p, i.e. x = x̂
By calculus
We take partial derivatives of the squared error and find the point where they're simultaneously 0; the resulting system of equations is equivalent to A^T A x̂ = A^T b
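A small least-squares sketch via the normal equations (the data points are made up):

```python
import numpy as np

# Fit b = C + D*t to three data points
t = np.array([0.0, 1.0, 2.0])
b = np.array([6.0, 0.0, 0.0])

A = np.column_stack([np.ones_like(t), t])   # columns: all-ones and t
x_hat = np.linalg.solve(A.T @ A, A.T @ b)   # solve A^T A x = A^T b
p = A @ x_hat                               # projection of b onto the column space

print(x_hat)                           # [C, D] = [5. -3.]
print(np.allclose(A.T @ (b - p), 0))   # True: the error is orthogonal to the columns
```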
Previously, when we projected, we had A with its columns as a basis for the subspace we wanted to project into, and P = A (A^T A)^-1 A^T, but if the matrix Q has orthonormal columns, this simplifies to P = Q Q^T (and simplifies even further if Q is square)
Note that if Q is square, the projection matrix Q Q^T is the identity, which makes sense - the space you're projecting into is the whole space
E.g. the least squares solution becomes x̂ = Q^T b
So, you ask, how do we get this orthogonal matrix Q from our sad, sad matrix A?
Very simple - say we have n independent vectors that we want to turn into n orthonormal vectors
Simple: subtract away from each vector the portion that lies in the span of the previous vectors, then normalize (see the sketch below)
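A minimal Gram-Schmidt sketch (classical version, assuming the input vectors are independent):

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn independent vectors into the orthonormal columns of Q."""
    qs = []
    for v in vectors:
        w = v.astype(float)
        for q in qs:
            w -= (q @ v) * q                  # remove the part along each earlier q
        qs.append(w / np.linalg.norm(w))      # normalize what's left
    return np.column_stack(qs)

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = gram_schmidt([A[:, 0], A[:, 1]])
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: the columns are orthonormal
```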
The QR Factorization
We went from a matrix A to a matrix Q - are these matrices related via matrix operations such that we can factor A = QR?
Yep - the key insight is similar to what makes L so clean in the LU factorization: R = Q^T A is upper triangular because each column a_j involves only q_1 through q_j
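A quick numpy check of the QR factorization (random rectangular A):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

Q, R = np.linalg.qr(A)                   # reduced QR: Q is 4x3 with orthonormal columns
print(np.allclose(Q.T @ Q, np.eye(3)))   # True
print(np.allclose(np.triu(R), R))        # True: R is upper triangular
print(np.allclose(Q @ R, A))             # True: A = QR
```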
Chapter 5: Determinants
5.1 The Properties of Determinants
3 basic properties, plus more that follow as corollaries:
1. det I = 1
2. The determinant changes sign when two rows or columns are exchanged
3. The determinant is linear in each row separately
5. Subtracting a multiple of one row from another does not change the determinant
8. det A = 0 iff A is singular
If no row exchanges are involved, the product of the pivots is the determinant
det P is ±1, determined by whether the number of row exchanges is even or odd; det U is the product of the pivots
You get this formula because elimination doesn't change the determinant, so for an invertible matrix you can reduce all the way to the diagonal matrix D of pivots, where linearity of the determinant by row means you can factor out each pivot, leaving det I = 1
The pivot formula is much easier to compute, but it's harder to relate the end product back to the terms of
the initial matrix
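A quick check that the product of the pivots gives the determinant (a made-up matrix, no row exchanges needed):

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 5.0, 3.0],
              [2.0, 5.0, 5.0]])

# Eliminate to upper triangular form, then multiply the pivots
U = A.copy()
for j in range(3):
    for i in range(j + 1, 3):
        U[i] -= (U[i, j] / U[j, j]) * U[j]

print(np.prod(np.diag(U)))   # 16.0
print(np.linalg.det(A))      # 16.0 (up to floating point)
```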
You can derive the big formula for the determinant, esp in the small cases, through applying linearity a bunch
Each term in the formula uses each row and each column exactly once
Determinant by Cofactors
Chapter 6: Eigenvalues and Eigenvectors
Most vectors change direction when multiplied by A, but eigenvectors are distinguished by the fact that Ax is parallel to x
Basic equation: Ax = λx, where the eigenvalue λ determines how the vector is scaled by A
One interesting example is the identity matrix, which has one eigenvalue (λ = 1), but every vector is an eigenvector
When A is raised to a power, the eigenvectors stay the same, while the eigenvalues are raised to the same power
The geometric picture here is clear - the directionally invariant vectors stay that way, while the scaling
gets applied again and again
There's a cool picture here - say you have eigenvalues λ1 = 1 and |λ2| < 1; then the first eigenvector will be the steady state, since it doesn't get scaled as you keep applying A, while the second eigenvector will decay as λ2^k goes to 0
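A quick numerical illustration of the steady state / decay picture (a made-up Markov matrix whose eigenvalues happen to be 1 and 0.7):

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])   # columns sum to 1

vals, vecs = np.linalg.eig(A)
print(np.sort(vals))         # [0.7 1.0]

# Applying A over and over kills the lambda = 0.7 component,
# leaving the eigenvector with lambda = 1 as the steady state
v = np.array([1.0, 0.0])
for _ in range(50):
    v = A @ v
print(v)                     # ~[0.667, 0.333], proportional to the steady-state eigenvector
```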
Special properties of a matrix lead to special eigenvectors and eigenvalues
So: