
Linear Algebra: Review

Anubha Gupta, PhD.


Professor
SBILab, Dept. of ECE,
IIIT-Delhi, India
Contact: anubha@iiitd.ac.in; Lab: http://sbilab.iiitd.edu.in
Machine Learning in Hindi

Linear Algebra: Review (Matrices)

Learning Objectives
In this module, we will study

• Matrix properties
• System of linear equations
• Matrix Factorization
o Eigenvalues and eigenvectors
o Eigenvalue Decomposition (EVD)
o Singular Value Decomposition (SVD)
• Taylor Series

Matrix Properties
● Matrix Operations: transpose, trace, determinant, adjoint, algebraic operations on
matrices
(addition and multiplication)

● The rank of a matrix A ∈ Rm×n, rank(A), is the size of the largest linearly independent set of
columns of A. Rank satisfies rank(A) ≤ min{m,n}.

● For matrices A ∈ Rm×k and B ∈ Rk×n, we have


rank(A)+rank(B)−k ≤ rank(AB) ≤ min{rank(A), rank(B)}.

● For two matrices A and B of the same size, we have


rank(A + B) ≤ rank(A) + rank(B).

● The trace of a square matrix is the sum of its main diagonal entries, that is,
trace(A) = Σᵢ aᵢᵢ

● For matrices A ∈ Rm×k and B ∈ Rk×m, trace(AB) = trace(BA)
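As a quick numerical check of the rank bounds and the trace identity above (the matrices below are illustrative examples, not from the slides):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.], [5., 6.]])   # A ∈ R^{3×2}, rank 2
B = np.array([[1., 0., 1.], [0., 1., 1.]])     # B ∈ R^{2×3}, rank 2

rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)

# Sylvester bound: rank(A) + rank(B) − k ≤ rank(AB) ≤ min(rank(A), rank(B))
k = A.shape[1]
assert rA + rB - k <= rAB <= min(rA, rB)

# trace(AB) = trace(BA), even though AB is 3×3 and BA is 2×2
print(np.trace(A @ B), np.trace(B @ A))
```

Here both traces come out equal although the two products have different sizes, which is the point of the identity.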



System of Linear Equations


● The range of a matrix A ∈ Rm×n, range(A), is the set
range(A) = {b ∈ Rm | ∃ x ∈ Rn with Ax = b}
Equivalently, the range of A is the set of all linear combinations of columns of A.

● The nullspace of a matrix, null(A) is the set of all vectors x such that Ax = 0.
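A small NumPy sketch of these two sets (the rank-1 example matrix is hypothetical):

```python
import numpy as np

A = np.array([[1., 2.], [2., 4.]])   # rank-1 matrix; range(A) = span of (1, 2)

# b ∈ range(A) iff Ax = b has a solution; a least-squares solve reveals this
b_in = np.array([3., 6.])            # 3 × (first column of A)
x, res, rank, sv = np.linalg.lstsq(A, b_in, rcond=None)
assert np.allclose(A @ x, b_in)      # consistent, so b_in ∈ range(A)

# null(A): every multiple of (2, −1) is mapped to 0
n = np.array([2., -1.])
assert np.allclose(A @ n, 0.0)
```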

System of Linear Equations


● The equation Ax = b is consistent if there exists a solution x to this equation; equivalently, we
have rank([A, b]) = rank(A) (Rouché–Capelli theorem), or b ∈ range(A).

● The equation has a unique solution if rank([A, b]) = rank(A) = n.


● The equation has infinitely many solutions
if rank([A, b]) = rank(A) < n.
● The equation has no solution if rank([A, b]) > rank(A).

● A square matrix A ∈ Rn×n is nonsingular if Ax = 0 holds only for x = 0, i.e., null(A) = {0}.
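The rank conditions above can be checked mechanically; a sketch in NumPy (the `classify` helper and the example systems are illustrative, not from the slides):

```python
import numpy as np

def classify(A, b):
    """Classify the system Ax = b via the Rouché–Capelli rank test."""
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
    n = A.shape[1]
    if rAb > rA:
        return "no solution"
    return "unique" if rA == n else "infinitely many"

A = np.array([[1., 1.], [1., -1.]])            # full rank: 2
assert classify(A, np.array([2., 0.])) == "unique"

A2 = np.array([[1., 2.], [2., 4.]])            # rank 1
assert classify(A2, np.array([1., 2.])) == "infinitely many"
assert classify(A2, np.array([1., 3.])) == "no solution"
```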



Matrix Properties Continued


● A normal matrix is a matrix N such that NNᴴ = NᴴN.
● A Hermitian matrix is one such that Aᴴ = A. A real-valued Hermitian matrix is called a
symmetric matrix.
● A skew-Hermitian matrix is one such that Aᴴ = −A.
● A unitary matrix is a square matrix with UUᴴ = I = UᴴU.
● A real-valued unitary matrix is called an orthogonal matrix.
● An idempotent matrix satisfies A² = A.
● A projection matrix P satisfies P² = P (P is idempotent). If P = Pᴴ,
then P is called an orthogonal projection.
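These definitions are straightforward to verify numerically; a sketch with illustrative example matrices:

```python
import numpy as np

# Real symmetric matrix -> Hermitian
H = np.array([[2., 1.], [1., 3.]])
assert np.allclose(H, H.conj().T)

# 90° rotation -> orthogonal (real-valued unitary)
U = np.array([[0., -1.], [1., 0.]])
assert np.allclose(U @ U.T, np.eye(2))

# Orthogonal projection onto the x-axis: idempotent and Hermitian
P = np.array([[1., 0.], [0., 0.]])
assert np.allclose(P @ P, P) and np.allclose(P, P.conj().T)

# Every Hermitian matrix is normal: H Hᴴ = Hᴴ H
assert np.allclose(H @ H.conj().T, H.conj().T @ H)
```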

Matrix Decomposition (Factorization)


Matrix factorization is a method of expressing a given matrix as a product of two or
more matrices

● Need: to simplify a matrix by breaking it down into smaller, more manageable
components; the properties of the component matrices are helpful in many
applications.

● Types
○ Eigenvalue decomposition (EVD)
○ Singular value decomposition (SVD)

Note: There are many other types of matrix decomposition, such as QR decomposition,
Cholesky decomposition, non-negative matrix factorization (NMF), and so on.


Eigenvalues and Eigenvectors


Definition: A matrix K is called positive semi-definite if the quadratic form
zᵀKz ≥ 0 ∀ z ≠ 0,
and is positive definite (p.d.) if the quadratic form
zᵀKz > 0 ∀ z ≠ 0.

Definition: The eigenvalues of an n×n matrix K are
those numbers λ for which the characteristic equation
Kν = λν has a solution ν ≠ 0,
where ν = eigenvector and λ = eigenvalue.
Eigenvectors are normalized vectors having unit norm,
i.e., νᵀν = ‖ν‖² = 1.

Theorem: λ is an eigenvalue of the square matrix K if
and only if
det(K − λI) = 0.

Eigenvalues and Eigenvectors


Definition: Two n×n matrices A and B are called similar if there exists an n×n matrix T with det(T) ≠
0, s.t. T⁻¹AT = B.

Theorem: An n×n matrix K is similar to a diagonal matrix if and only if K has n linearly
independent eigenvectors, and hence
U⁻¹KU = Λ

Theorem: Let K be a real symmetric matrix with eigenvalues λ₁, λ₂, …, λₙ. Then, K has n
mutually orthogonal eigenvectors ν₁, ν₂, …, νₙ.

Theorem: Let K be a real symmetric matrix with eigenvalues λ₁, λ₂, …, λₙ.
Then, K is similar to a diagonal matrix, i.e.,
U⁻¹KU = Λ,
where the columns of U contain the ordered
orthogonal unit eigenvectors ν₁, ν₂, …, νₙ of K.
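A NumPy sketch of the last theorem, using `numpy.linalg.eigh` (which returns orthonormal eigenvectors for symmetric input; the matrix below is an arbitrary example):

```python
import numpy as np

K = np.array([[4., 1.], [1., 4.]])             # real symmetric; eigenvalues 3 and 5
lam, U = np.linalg.eigh(K)

assert np.allclose(U.T @ U, np.eye(2))         # eigenvectors are orthonormal
assert np.allclose(U.T @ K @ U, np.diag(lam))  # U⁻¹KU = Λ (here U⁻¹ = Uᵀ)
```

For symmetric K the similarity transform reduces to Uᵀ K U, since orthonormal U satisfies U⁻¹ = Uᵀ.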

Eigenvalues and Eigenvectors

Theorem: A real symmetric matrix K is positive definite if and only if all its eigenvalues λ₁, λ₂,
…, λₙ are positive.

Note: Since det(K) is the product of its eigenvalues, det(K) > 0 for a positive definite K. This
implies that the matrix K is full rank.

Let us Try!
Question: Perform eigenvalue decomposition on the following matrix.

𝐀 = [ 5 4
      1 2 ]

You may pause the video and try.



Let us Try!
Question: Perform eigenvalue decomposition on the following matrix.

𝐀 = [ 5 4
      1 2 ]
Answer:
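The worked answer appeared as a slide image in the original; as a sketch: det(A − λI) = λ² − 7λ + 6 = (λ − 6)(λ − 1), so the eigenvalues are 6 and 1, with eigenvectors proportional to (4, 1) and (1, −1). A NumPy check:

```python
import numpy as np

A = np.array([[5., 4.], [1., 2.]])

# det(A − λI) = λ² − 7λ + 6 = (λ − 6)(λ − 1), so λ = 6 and λ = 1
lam, V = np.linalg.eig(A)
assert np.allclose(np.sort(lam.real), [1., 6.])

# Eigenvalue decomposition: A = V Λ V⁻¹
assert np.allclose(V @ np.diag(lam) @ np.linalg.inv(V), A)
```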


Try It Yourself!
Question: Perform eigenvalue decomposition on the following matrix.

𝐀 = [ 2 7 8
      5 7 1
      0 4 3 ]

This is for your own practice. Solution will not be provided for this problem.

Singular Value Decomposition (SVD)


Let A ∈ Cm×n. Then, a singular value decomposition (SVD) of A is given by
A = UΣVᴴ, where
● U ∈ Cm×m is unitary,
● Σ = diag(σ₁, σ₂, …, σₚ) ∈ Rm×n is diagonal with
σ₁ ≥ σ₂ ≥ … ≥ σₚ ≥ 0, where p = min(m, n), and
● V ∈ Cn×n is unitary.
The values σᵢ = σᵢ(A) are called the singular values of A and are uniquely determined
as the nonnegative square roots of the eigenvalues of AᴴA.
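A NumPy sketch of the definition (the 3×2 example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2., 0.], [0., 1.], [0., 0.]])   # example 3×2 matrix
U, s, Vh = np.linalg.svd(A, full_matrices=False)

assert np.allclose(U @ np.diag(s) @ Vh, A)     # A = U Σ Vᴴ
assert np.all(s[:-1] >= s[1:]) and np.all(s >= 0)  # σ₁ ≥ σ₂ ≥ … ≥ 0

# Singular values are the nonnegative square roots of the eigenvalues of AᴴA
ev = np.linalg.eigvalsh(A.conj().T @ A)
assert np.allclose(np.sort(s**2), np.sort(ev))
```

With `full_matrices=False`, NumPy returns the "economy" SVD, whose factors still reconstruct A exactly.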

Let us Try!
Question: Perform singular value decomposition on the following matrix.

𝐀 = [ 3 2 2
      2 3 −2 ]

You may pause the video and try.
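If you want to verify your answer afterwards: AAᵀ = [[17, 8], [8, 17]] has eigenvalues 25 and 9, so the singular values are 5 and 3. A NumPy check:

```python
import numpy as np

A = np.array([[3., 2., 2.], [2., 3., -2.]])
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# AAᵀ has eigenvalues 25 and 9, so σ₁ = 5 and σ₂ = 3
assert np.allclose(s, [5., 3.])
assert np.allclose(U @ np.diag(s) @ Vh, A)
```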



Try It Yourself!
Question: Perform singular value decomposition on the following matrix.

𝐀 = [ 1 −1 3
      3 1 1 ]

This is for your own practice. Solution will not be provided for this problem.

Taylor Series
● A mathematical tool to represent a function as an infinite sum of
terms, each computed from one of the function's derivatives evaluated
at a specific point
● Used to approximate the value of a function
● The general form of a Taylor series for a function 𝑓(𝑥) centered at a
point 𝑎 is:

𝑓(𝑥) = 𝑓(𝑎) + 𝑓′(𝑎)(𝑥 − 𝑎)/1! + 𝑓′′(𝑎)(𝑥 − 𝑎)²/2! + 𝑓′′′(𝑎)(𝑥 − 𝑎)³/3! + ⋯

[Image: Brook Taylor FRS (Aug. 18, 1685 – Dec. 29, 1731); source:
https://galileo-unbound.blog/2020/08/03/brook-taylors-infinite-series/]

where,
𝑓′(𝑎): first derivative of 𝑓(𝑥) evaluated at 𝑥 = 𝑎
𝑓′′(𝑎): second derivative of 𝑓(𝑥) evaluated at 𝑥 = 𝑎
… and so on
!: factorial
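A minimal sketch of using the series for approximation, taking 𝑓(𝑥) = eˣ centered at 𝑎 = 0 (so every derivative at 𝑎 equals 1, and each term reduces to 𝑥ᵏ/k!):

```python
import math

def exp_taylor(x, n_terms=10):
    """Truncated Taylor series of e^x about a = 0: sum of x^k / k!."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

# A handful of terms already approximates e = 2.71828... closely
approx = exp_taylor(1.0, 10)
assert abs(approx - math.e) < 1e-6
```

Truncating the infinite sum after a few terms is exactly how the series is used in practice to approximate function values near the center point.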

Summary
In this module, we studied

• Matrix properties
• System of linear equations
• Matrix Factorization
o Eigenvalues and eigenvectors
o Eigenvalue Decomposition (EVD)
o Singular Value Decomposition (SVD)
• Taylor Series
