Everything You Need To Know About Linear Regression - by Sushant Patrikar - Towards Data Science


Everything You Need To Know About Linear Regression

Sushant Patrikar · Published in Towards Data Science · 8 min read · Sep 10, 2019


Linear Regression is the first stepping stone in the field of Machine Learning. If you
are new to Machine Learning, or a math geek who wants to know all the math behind
Linear Regression, then you are at the same spot I was nine months ago. Here we will
look at the math of linear regression and understand the mechanism behind it.

https://towardsdatascience.com/everything-you-need-to-know-about-linear-regression-b791e8f4bd7a

Linear Regression (Source: https://datumguy.com/blog/blog/view/5ce138213122e?utm_content=buffer09809&utm_medium=social&utm_source=facebook.com&utm_campaign=buffer)

Introduction
Linear Regression. Breaking it down, we get two words: ‘Linear’ and ‘Regression’.
Mathematically, the word ‘Linear’ suggests something related to a straight line, while
the term ‘Regression’ means a technique for determining the statistical relationship
between two or more variables.

Putting it together, Linear Regression is all about finding the equation of a line that
best fits the given data, so that it can predict future values.

Hypothesis
Now, what’s this hypothesis? It’s nothing but the equation of the line we were talking
about. Let’s look at the equation below.

y = mx + c


Does this look familiar? It is the equation of a straight line. This is a hypothesis.
Let’s rewrite it in a slightly different way:

h(x) = Θ₀ + Θ₁x

We have just replaced y with h(x), and c, m with Θ₀ and Θ₁ respectively. h(x) will be our
predicted value. This is the most common way of writing a hypothesis in Machine
Learning.

Now, to understand this hypothesis, we will take the example of housing prices.
Suppose you collect the sizes of different houses in your locality and their respective
prices. The hypothesis can be represented as

price = Θ₀ (base price) + Θ₁ × size

Now all you have to do is find the appropriate base price Θ₀ and the value of Θ₁ based
on your dataset, so that you can predict the price of any house given its size.

To say it more technically, we have to tune the values of Θ₀ and Θ₁ so that our line
fits the dataset as well as possible. Now we need some metric to determine the ‘best’
line, and we have one. It’s called a cost function. Let’s look into it.
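The hypothesis above can be sketched in plain Python. The function and the numbers below are my own, for illustration only:

```python
def hypothesis(theta0, theta1, x):
    """Predicted value h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

# Hypothetical numbers: base price theta0 = 50 (say, in $1000s),
# theta1 = 0.1 per square foot; predict the price of a 2000 sq ft house.
price = hypothesis(50, 0.1, 2000)  # 50 + 0.1 * 2000 = 250
```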

Cost Function J(Θ)

The cost function of linear regression is

J(Θ₀, Θ₁) = (1/2m) Σᵢ₌₁ᵐ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)²

Substituting the hypothesis, we can rewrite it as

J(Θ₀, Θ₁) = (1/2m) Σᵢ₌₁ᵐ (Θ₀ + Θ₁x⁽ⁱ⁾ − y⁽ⁱ⁾)²

Here m is the total number of examples in your dataset. In our example, m is the total
number of houses in our dataset.

Now look at our cost function carefully: we need the predicted values, i.e. h(x), for
all m examples. Let’s look again at how our predicted values, the predicted prices,
are computed.

To calculate our cost function, what we need is h(x) for all m examples, i.e. m
predicted prices corresponding to m houses.

Now, to calculate h(x), we need the base price Θ₀ and the value of Θ₁. Note that these
are the values we will tune to find our best fit. We need something to start with, so we
will randomly initialize these two values.
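As a sketch in plain Python (function and variable names are mine, not from the article), the cost function reads:

```python
def cost(theta0, theta1, xs, ys):
    """J(theta) = (1 / 2m) * sum of squared prediction errors over m examples."""
    m = len(xs)
    squared_errors = [(theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)
```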

Explanation of Cost Function

If you look at the cost function carefully, you’ll find that what we are doing is just
averaging the squares of the distances between predicted values and actual values over
all m examples.


Look at the graph above; here m = 4. The points on the blue line are predicted values,
while the red points are actual values. The green lines show the distances between the
actual values and the predicted values.

So the cost for this line is just the mean of the squares of the lengths of the green
lines. We also divide by 2 to simplify some future calculations, as we will see.
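With some made-up numbers for four points (mine, not the values in the figure), the calculation looks like this:

```python
predicted = [2.0, 4.0, 6.0, 8.0]   # points on the line, h(x)
actual    = [3.0, 4.0, 5.0, 9.0]   # observed values (red points)
m = len(actual)

# Squared lengths of the "green lines" (prediction errors)
squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]

# Average the squares and divide by 2: sum = 3.0, so 3.0 / (2 * 4) = 0.375
cost_value = sum(squared_errors) / (2 * m)
```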

Linear Regression tries to minimize this cost by finding the proper values of Θ₀ and Θ₁.
How? By using Gradient Descent.

Gradient Descent
Gradient Descent is a very important algorithm when it comes to Machine Learning.
Right from Linear Regression to Neural Networks, it is used everywhere.

Θⱼ := Θⱼ − α · ∂J(Θ)/∂Θⱼ

This is how we update our weights. This update rule is executed in a loop, and it helps
us reach the minimum of the cost function. The α is a constant learning rate, which we
will talk about in a minute.

Understanding Gradient Descent
So basically we are updating our weight by subtracting from it the partial derivative
of our cost function w.r.t that weight.
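As a toy illustration (my example, not the article’s cost function), here is that update rule applied to the simple quadratic J(θ) = θ², whose derivative is 2θ:

```python
def gradient_descent_1d(theta, alpha=0.1, iters=100):
    """Minimize J(theta) = theta**2 by repeating theta := theta - alpha * dJ/dtheta."""
    for _ in range(iters):
        slope = 2 * theta              # derivative of theta**2 at the current point
        theta = theta - alpha * slope  # step downhill; steps shrink as the slope flattens
    return theta

theta_min = gradient_descent_1d(theta=5.0)  # converges toward 0, the minimum
```

Whether we start to the right (positive θ) or the left (negative θ) of the minimum, the subtraction moves θ toward 0, exactly as the parabola pictures below describe.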


But how does this take us to the minimum cost? Let’s visualize it. For easier
understanding, let’s assume that Θ₀ is 0 for now.

So the hypothesis becomes

h(x) = Θ₁x

And the cost function

J(Θ₁) = (1/2m) Σᵢ₌₁ᵐ (Θ₁x⁽ⁱ⁾ − y⁽ⁱ⁾)²

Now let’s see how the cost depends on the value of Θ₁. Since this is a quadratic
equation in Θ₁, the graph of Θ₁ vs J(Θ) will be a parabola, looking something like this,
with Θ₁ on the x-axis and J(Θ) on the y-axis.

Source: Machine Learning by Andrew Ng

Our goal is to reach the minimum of the cost function, which we reach when Θ₁ is equal
to Θₘᵢₙ.

To start, we will randomly initialize our Θ₁.


Source: Machine Learning by Andrew Ng

Suppose Θ₁ gets initialized as shown in the figure. The cost corresponding to the
current Θ₁ is marked by the blue dot on the graph.

Now, let’s update Θ₁ using gradient descent: we subtract the derivative of the cost
function w.r.t Θ₁, multiplied by the constant α.

Source: Machine Learning by Andrew Ng

The derivative of the cost function w.r.t Θ₁ gives the slope of the curve at that point,
which in this case is positive. So we are subtracting a positive quantity from the
current value of Θ₁. This will push Θ₁ to the left, slowly converging to the value of
Θₘᵢₙ where our cost function is minimum. Here comes the role of α, our learning rate:
it decides how far we descend in one iteration. Also, one point to note here is that as
we move toward the minimum, the slope of the curve becomes less steep, which means that
as we approach the minimum value, we take smaller and smaller steps.

Source: Machine Learning by Andrew Ng

Eventually, the slope becomes zero at the minimum of the curve, and then Θ₁ will no
longer be updated.

Think of it like this: suppose a man is at the top of a valley and wants to get to the
bottom. He goes down the slope, taking larger steps where the slope is steep and
smaller steps where it is gentle. He decides his next position based on his current
position, and stops when he reaches the bottom of the valley, which was his goal.

Similarly, suppose Θ₁ is initialized on the left side of the minimum value.

Source: Machine Learning by Andrew Ng

The slope at this point will be negative. In gradient descent we subtract the slope,
but here the slope is negative, and subtracting a negative quantity means adding a
positive one. So Θ₁ will keep increasing until it reaches the point where the cost is
minimum.


Gradient Descent (Source: https://saugatbhattarai.com.np/what-is-gradient-descent-in-machine-learning/)

The above figure is a good depiction of gradient descent. Note how the steps get
smaller and smaller as we approach the minimum.

Similarly, the value of Θ₀ will also be updated using gradient descent. I did not show
it, because we need to update the values of Θ₀ and Θ₁ simultaneously, which results in
a 3-dimensional graph (cost on one axis, Θ₀ on another, and Θ₁ on the third) that is
hard to visualize.
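One detail worth stressing here: ‘simultaneously’ means both new values are computed from the current Θ₀ and Θ₁ before either is overwritten. In Python, tuple assignment does this naturally (the gradient values below are placeholders I made up for illustration):

```python
theta0, theta1 = 1.0, 2.0
alpha = 0.1
grad0, grad1 = 0.5, -0.5   # placeholder gradient values, for illustration only

# Tuple assignment evaluates both right-hand sides before assigning,
# so theta1's update cannot accidentally see the already-updated theta0.
theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
```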
Derivative of Cost Function
We use the derivative of the cost function in gradient descent. Let’s look at what we
get after differentiating it. For Θ₀:

∂J/∂Θ₀ = (1/m) Σᵢ₌₁ᵐ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)

Note how the division by 2 in the cost function cancels the 2 that comes down from the
square, which is exactly why we put it there.


Similarly, for Θ₁:

∂J/∂Θ₁ = (1/m) Σᵢ₌₁ᵐ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾) · x⁽ⁱ⁾
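Putting the derivatives and the update rule together, here is a minimal batch-gradient-descent sketch on toy data of my own (the article’s actual code lives on the author’s GitHub):

```python
def fit_linear_regression(xs, ys, alpha=0.01, iters=5000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0  # starting values (the article randomly initializes)
    m = len(xs)
    for _ in range(iters):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                              # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m   # dJ/dtheta1
        # Simultaneous update of both parameters
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# Toy data lying exactly on y = 3 + 2x
t0, t1 = fit_linear_regression([0, 1, 2, 3, 4], [3, 5, 7, 9, 11])
```

With α = 0.01 and a few thousand iterations, Θ₀ and Θ₁ converge very close to 3 and 2, the true intercept and slope of the toy data.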

Linear Regression Visualization

In this visualization, you can see how the line fits itself to the dataset. Note that
initially the line moves very quickly, but as the cost decreases, it slows down.

The code for the above visualization is available on my GitHub.

Got Questions? Need Help? Contact me!

Email: sushantpatrikarml@gmail.com

Github: https://github.com/sushantPatrikar

LinkedIn: https://www.linkedin.com/in/sushant-patrikar/

Website: https://sushantpatrikar.github.io/
