Suggestion For ML Study


Amartansh Dubey, PhD student/Signal Processing/Data Science/Wireless comm

Updated September 10, 2019 · Upvoted by Peter Ferguson, MSci Theoretical Physics & Mathematics, Lancaster University (2018)

Here is a detailed 5-month plan to intuitively learn the mathematics required for
Machine Learning:

In this answer, I will give a 5-month plan covering all the important mathematical topics,
with suitable online learning resources arranged in chronological order. These resources
helped me to intuitively understand the mathematics behind not only ML algorithms but also
many other advanced engineering fields like statistical signal processing, computational
electrodynamics, etc. The mathematical papers in top ML conferences/journals might be
overwhelming for those who approach ML only from a programming point of view (i.e. just
using smart toolboxes, libraries and functions), and it can take weeks or months to
understand all the derivations and prerequisites of each paper. As is famously
said, “Mathematics is the language of the universe”, and hence it is the language of
physics. Since all engineering fields are directly or indirectly derived from physics,
math is the language of engineering too. If you aspire to be an ML developer or researcher,
there is no way to escape learning math. So here is the list of topics which are crucial for
learning ML:

1. Linear Algebra (optional advanced topics include: Multi-linear and Tensor
algebra)
2. Probability Theory (optional advanced topics: Measure Theory, Stochastic
Processes, Information Theory)
3. Multivariable Calculus (optional advanced topics: Stochastic Calculus,
Differential Equations)
4. Multivariate Statistics (optional advanced topics: Random Matrix Theory)
5. Convex Optimization (optional advanced topics: Stochastic and Non-Convex
Optimization)

The five main topics in the above list are essential to developing a mathematical
understanding of ML algorithms. In fact, these topics are sufficient to understand complex
topics in any field of engineering. The remaining topics, which are marked as “optional”, are
simply extensions and are required for doing advanced research.

Each of these topics can take months or years to master. But I will provide resources which
will sufficiently cover subtopics required for ML. Here is the 5-month plan which requires
roughly 9 to 11 hours each week.

First Month:

Start with Linear Algebra. The best available resources are the lectures and book by Prof.
Gilbert Strang. There are 35 lectures on MIT OCW. Any student in the senior years of a
UG program can finish one lecture in 1 hour with proper notes. If you can understand it
quickly, then watch the lectures at 1.5x or 2x speed. Understanding the steps involved in vector
and matrix operations is the easier part of this course. Visualizing these steps in 3D or 2D
space needs patience and intuition, and it is crucial for Data Science. Try to visualize how a
given transformation matrix affects a given vector and takes it to the newly transformed
space. The two most important topics to physically visualize are eigendecomposition and
change of basis.
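
As a small illustration of these two ideas (not part of the recommended lectures; the matrix A and the vector x are arbitrary choices made for this sketch), the NumPy snippet below expresses a vector in the eigenvector basis of a symmetric matrix. In that basis the transformation is just a scaling along each axis, and mapping back gives the same result as applying the matrix directly.

```python
import numpy as np

# A symmetric 2x2 transformation matrix (chosen only for illustration).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigendecomposition: w holds the eigenvalues (ascending order),
# and the columns of V are the corresponding orthonormal eigenvectors.
w, V = np.linalg.eigh(A)

# Change of basis: coordinates of x in the eigenvector basis.
# V is orthogonal, so its inverse is simply its transpose.
x = np.array([3.0, 1.0])
x_eig = V.T @ x

# In the eigenbasis, applying A is a per-axis scaling by the eigenvalues.
y_eig = w * x_eig

# Map the result back to the standard basis.
y = V @ y_eig

print("eigenvalues:       ", w)
print("A @ x directly:    ", A @ x)
print("via the eigenbasis:", y)
```

Plotting x, the eigenvectors and y on paper for a few different matrices is a good way to build the geometric intuition described above.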

The most beautiful visualization is provided by the YouTube channel “3Blue1Brown” under
the playlist “Essence of Linear Algebra”. It only has 15 lectures of 10–15 mins each. If you
have difficulty visualizing the topics taught by Prof. Strang, then first finish these 15 lectures.
It will hardly take 1 week. So in total: 1 week for “3Blue1Brown” and 3 weeks for the lectures
by Gilbert Strang. If some time is remaining, watch some random YouTube videos on
multilinear algebra and tensors. These are optional for now. Overall, understanding linear
algebra will help you learn the art of representing a vast amount of data in terms of vectors
and matrices, and extrapolate your mathematical understanding of physical 3D space to
N-dimensional feature space.

Second Month:

This can be the hardest month. It is for Probability and Measure Theory. Probability
theory is not a fixed set of rules like the ones we learn in deterministic maths (like algebra or
calculus). Real-world data and events are far more complex than the flipping of coins or rolling
of dice that we were taught in school. Probability is the science of uncertainty. Whether it is
the quantum laws of nature or patterns in human behaviour, everything has uncertainties, but
with enough observations/data these uncertainties can be modelled by deterministic
asymptotic laws. Hence, to make sense of real-world data, probability theory is a must.
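
One example of such an asymptotic law is the law of large numbers. The sketch below is only an illustration (the fair die, the seed and the sample size are assumptions made for this example): a single roll is unpredictable, but the running average of many rolls settles near the expected value of 3.5.

```python
import numpy as np

# Simulate rolls of a fair six-sided die (seed fixed for reproducibility).
rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=100_000)  # upper bound 7 is exclusive

# Running average after 1, 2, ..., n rolls.
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)

# The average fluctuates for small n and stabilizes as n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: running mean = {running_mean[n - 1]:.4f}")
```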

The best lecture series available on this topic is the course by Prof. Krishna
Jagannathan, accompanied by Grimmett’s book on probability and random processes. It
also covers topics from Measure Theory, which is a superset of Probability theory. This course
will test your abstract thinking abilities to their limit. If you are not comfortable with
elementary probability, then don’t watch this course. Instead, you can watch the lectures
by Prof. John Tsitsiklis. It is a more applied course and focuses less on abstract topics and
derivations. It starts with very basic topics like the axioms of probability, but soon dives
deeper into advanced topics like Markov chains, least squares, convergence of random
variables, etc. It has around 25 lectures, and I also recommend solving the problems given in
the course. So give one full month to this course.

My suggestion: If you are starting your grad studies or PhD studies, then finish the
course by Prof. Krishna Jagannathan. If you are in industry or in a taught master's
program, then choose the lectures by Prof. John Tsitsiklis. It's not feasible to do both
courses in one month.

Third Month:

The third month is for Multivariable Calculus. Don’t worry if your elementary calculus
concepts are rusty. Any course on multivariable calculus will cover the basics of calculus. It is
very important to note that if you just want to focus on the calculus commonly used in machine
learning, then you can study only differential calculus and leave integral calculus for later
stages. So in the third month, you only need to study differential calculus. In the first 15
days, watch 99 lectures by “3Blue1Brown”. Don’t worry, these lectures are on average 5–8
mins long, so it is easily doable. Leave the rest of the lectures in this course, as they are on
integral calculus. In the remaining 20 days, either finish the first 20 lectures of “Multivariable
Calculus by Prof. Denis Auroux” or watch the first 6 lectures of “Multivariable Calculus by
Adrian Banner”.
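
The central object of differential calculus in ML is the gradient, the vector of partial derivatives. As a minimal sketch (the function f and the helper names below are made up for this illustration and do not come from any of the courses), the snippet compares a hand-derived gradient with a numerical estimate from central differences. This is a handy sanity check whenever you derive gradients yourself.

```python
import numpy as np

def f(p):
    """Example function f(x, y) = x^2 + 3xy (chosen only for illustration)."""
    x, y = p
    return x**2 + 3.0 * x * y

def grad_f(p):
    """Gradient derived by hand: (df/dx, df/dy) = (2x + 3y, 3x)."""
    x, y = p
    return np.array([2.0 * x + 3.0 * y, 3.0 * x])

def numerical_grad(func, p, h=1e-6):
    """Central-difference estimate of the gradient of func at point p."""
    p = np.asarray(p, dtype=float)
    g = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = h  # perturb one coordinate at a time
        g[i] = (func(p + step) - func(p - step)) / (2.0 * h)
    return g

point = np.array([1.0, 2.0])
print("analytic gradient: ", grad_f(point))
print("numerical gradient:", numerical_grad(f, point))
```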

Break and Adjustments:

Let’s take a pause and relax. Congratulate yourself. You are already done with 80% of the basic
maths required for ML. By now, let us assume that you are 10–20 days behind schedule.
NO WORRIES! You can skip a few things without affecting the rest of your schedule or
your understanding of ML algorithms. You can skip the multivariable calculus lectures by Denis
Auroux and Adrian Banner. The videos by 3Blue1Brown are sufficient at this stage if you are too
far behind schedule. This will save you 15–20 days. You can also skip the last 7 lectures by
Prof. Tsitsiklis on probability. This will save an additional 7–8 days. But try to avoid this
adjustment if possible.

Fourth Month:

Now we are at Multivariate Statistics, which directly covers some famous ML algorithms.
This subject is derived from a combination of linear algebra, probability and statistics in
higher-dimensional feature space. This is a vast subject with no well-defined conventional
syllabus. There is no point in going too deep into this. Just finish one of these three tasks in
no more than 15 days and you are done with ML-specific multivariate statistics:

1. Chapters 4, 7 and 8 of the book by R. Johnson: Applied Multivariate
Statistical Analysis, or
2. Chapters 2, 3 and 11 of the book by T. W. Anderson: An Introduction to
Multivariate Statistical Analysis, or
3. Just randomly select some good videos and articles (see the number of
likes/subscribers/comments, etc.) from the internet on the following topics: Multivariate
Normal distribution, Principal Component Analysis (PCA), Singular Value
Decomposition (SVD), properties of the Covariance matrix and Gaussian Mixture
Models.

Again, complete only one of the above three mentioned resources as per your preference.
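
Several of the topics named in the list above fit together in a few lines of code. The following is a rough sketch on synthetic data (the mean, the covariance matrix, the seed and the sample size are all assumptions made for this example): it draws samples from a multivariate normal distribution, computes PCA from the eigendecomposition of the sample covariance matrix, and checks that the SVD of the centered data gives the same principal directions and variances.

```python
import numpy as np

# Synthetic 2D data from a multivariate normal (parameters are illustrative).
rng = np.random.default_rng(0)
mean = np.array([0.0, 0.0])
cov = np.array([[3.0, 1.5],
                [1.5, 1.0]])
X = rng.multivariate_normal(mean, cov, size=5_000)

# Center the data and form the sample covariance matrix.
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (Xc.shape[0] - 1)

# PCA route 1: eigendecomposition of the covariance matrix.
# eigh returns ascending eigenvalues, so reorder to descending variance.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# PCA route 2: SVD of the centered data matrix.
# Rows of Vt are the principal directions; s**2 / (n - 1) are the variances.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vars = s**2 / (Xc.shape[0] - 1)

# Dimensionality reduction: project onto the first principal component.
scores = Xc @ eigvecs[:, :1]

print("variances from eigh:", eigvals)
print("variances from SVD: ", svd_vars)
print("fraction of variance kept by 1 component:", eigvals[0] / eigvals.sum())
```

The principal directions from the two routes can differ by a sign, which is expected: an eigenvector is only defined up to scaling.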

Break and Adjustments: You still have 15 days left in the fourth month. Take a rest, enjoy,
watch movies, TV series or whatever refreshes you, and prepare yourself for the final lap.

Fifth Month:

This month is for the topic in which everything you have learnt until now will be used. Let’s
learn Convex Optimization. All the tools learnt previously will help you to build an initial
model with certain tunable parameters. The values of these parameters decide how well
your model represents the empirical data, and these values are estimated using tools
from optimization theory. Unfortunately, there are few well-organized courses or books
on this topic. Most of the resources are too complex from an engineering point of view.
Most graduate-level courses use the lectures and book by Prof. Stephen
Boyd (Stanford). No doubt his book is the most comprehensive resource for learning
applied convex optimization. But it might be hard, as it needs rigorous multivariable calculus,
which we have not covered in much depth in this plan.
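
To make the idea of estimating tunable parameters concrete, here is a minimal sketch (it is not taken from Boyd's book or either course; the synthetic data, the step size and the iteration count are assumptions chosen for this example). It fits a straight line by minimizing a least-squares cost, which is convex, using plain gradient descent, and compares the result with the closed-form least-squares solution.

```python
import numpy as np

# Synthetic data: y = 2x + 0.5 plus a little noise (values are illustrative).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
y = 2.0 * x + 0.5 + 0.1 * rng.standard_normal(n)

# Model: y ≈ A @ theta, where theta = (slope, intercept).
A = np.column_stack([x, np.ones(n)])

# Gradient descent on the convex cost J(theta) = mean((A @ theta - y)**2).
theta = np.zeros(2)
learning_rate = 0.1  # assumed step size, small enough for this problem
for _ in range(2000):
    grad = (2.0 / n) * A.T @ (A @ theta - y)  # gradient of J at theta
    theta -= learning_rate * grad

# Closed-form least-squares solution for comparison.
theta_exact, *_ = np.linalg.lstsq(A, y, rcond=None)

print("gradient descent:", theta)
print("closed form:     ", theta_exact)
```

Because the cost is convex, gradient descent with a suitable step size reaches the same global minimum as the closed-form solution. For non-convex costs no such guarantee holds.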

A simpler alternative to Boyd’s book and lectures is the course by Prof. Aditya K.
Jagannatham. This is a unique course in terms of its content and the number of real-world
examples from the latest technologies. The only drawback of this course is that some examples
are related to “Wireless Communication”, which only students from an electrical engineering
background will understand. So overall, there are two options for learning convex optimization
and both are heavy. If you want to spend less time on convex optimization and just learn the
practical concepts, then follow this strategy:

Watch lectures 1 to 33 and lectures 42 to 50 of Prof. Jagannatham’s course (the lectures
are short, around 20 mins each). Don’t worry if you cannot understand the examples
from Wireless Communication. After this, directly read chapter 6 of Boyd’s book and
practice the Matlab/Python codes given for this chapter on Prof. Boyd’s website. That’s it.

Congratulations! You have finished all the important topics required not only for understanding
machine learning but for any engineering mathematics.

Sixth Month and onwards:

Now you are ready. Just take any standard book or course on Machine Learning. Now you
will be able to get the mathematical essence behind any topic in ML. Most ML
enthusiasts might have already watched Prof. Andrew Ng’s course before going through all the
above math courses. Now watch his course again. You will feel a significant difference. You
can now understand the probabilistic properties of the data, manipulate conventional cost
functions as per the data to get better results, and even make your own customized
algorithms.

My suggestions for studying ML from the mathematical point of view:

1. Any book or video lecture by Prof. Trevor Hastie and Prof. Tibshirani from
Stanford. The highly recommended beginner-level book is “Introduction to
Statistical Learning” by Hastie and Tibshirani.
2. Pattern Recognition and Machine Learning, book by Christopher Bishop.
3. Introduction to Computational Thinking and Data Science.
4. Machine Learning by Prof. Tommi Jaakkola.

As I said earlier, mathematics is the language of scientists and engineers. It will give you the
superpower to randomly pick any topic in physics, computer science, engineering or finance
and understand it. You will be amazed to see how a wide variety of seemingly different
fields use the exact same mathematical methods to explain wildly different phenomena.
For example, the same method of eigendecomposition is used in wireless communication,
portfolio optimization in finance, stress/strain modelling in civil engineering, dimensionality
reduction in Data Science, vaccination design against genetic diseases, and many other
fields. I hope this answer will help you to get a strong grasp of engineering mathematics,
especially for understanding the wonderful algorithms in Data Science.
