Lecture 05
Spring 2024
Linear Regression
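Example: weight gain (dependent variable) vs. intake of food
[Figure: scatter plots showing a positive relationship and a negative relationship]
Regression line: the line that minimizes the difference between the estimated and actual values
Error: the difference between an actual value and the value estimated by the regression line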
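As a rough illustration of minimizing the difference between estimated and actual values, the sketch below scores two candidate lines by their sum of squared errors. It assumes the five (x, y) points and the fitted line ŷ = 2.2 + 0.6x from the worked example in these notes. The helper name `sum_squared_errors` and the second candidate line are hypothetical, chosen only for comparison.

```python
# Five (x, y) points from the worked example in these notes.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def sum_squared_errors(b0, b1):
    """Sum of squared differences between actual y and estimated b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Fitted line from the notes versus an arbitrary alternative (hypothetical).
sse_fitted = sum_squared_errors(2.2, 0.6)
sse_other = sum_squared_errors(1.0, 1.0)

print(f"SSE for y = 2.2 + 0.6x: {sse_fitted:.2f}")
print(f"SSE for y = 1.0 + 1.0x: {sse_other:.2f}")
```

The fitted line should give the smaller total error of the two.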
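The regression line has the same form as the straight line $y = mx + c$:
$\hat{y} = b_0 + b_1 x_1$
where $b_1$ is the slope of the line and $b_0$ is the intercept.
$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{6}{10} = 0.6$
The line passes through the point of means $(\bar{x}, \bar{y}) = (3, 4)$, so
$4 = b_0 + 0.6 \cdot 3$, which gives $b_0 = 2.2$.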
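x    y    x - x̄         y - ȳ         (x - x̄)^2    (x - x̄)(y - ȳ)
1    2    1 - 3 = -2    2 - 4 = -2    4            4
2    4    2 - 3 = -1    4 - 4 = 0     1            0
3    5    3 - 3 = 0     5 - 4 = 1     0            0
4    4    4 - 3 = 1     4 - 4 = 0     1            0
5    5    5 - 3 = 2     5 - 4 = 1     4            2
Means: x̄ = 3, ȳ = 4. Sums: Σ(x - x̄)^2 = 10, Σ(x - x̄)(y - ȳ) = 6.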
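The slope and intercept calculation can be sketched in plain Python as follows. This is a minimal sketch using the five data points from the table; the variable names (`s_xy`, `s_xx`) are illustrative and not part of the lecture.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)

# Numerator and denominator of the slope formula.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)

b1 = s_xy / s_xx            # slope
b0 = y_mean - b1 * x_mean   # intercept: the line passes through (x_mean, y_mean)

print(f"b1 = {b1:.1f}, b0 = {b0:.1f}")
```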
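Coefficient of determination ($R^2$):
$R^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} = \frac{3.6}{6} = 0.6$
where $\hat{y} = 2.2 + 0.6x$.
$R^2$ ranges from 0 to 1.
$R^2 = 0.6$ means the line explains 60% of the variation in $y$, which is a reasonably good fit.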
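x    y    x - x̄    y - ȳ    (y - ȳ)^2    ŷ      ŷ - ȳ    (ŷ - ȳ)^2    (x - x̄)^2    (x - x̄)(y - ȳ)
1    2    -2       -2       4            2.8    -1.2     1.44         4            4
2    4    -1       0        0            3.4    -0.6     0.36         1            0
3    5    0        1        1            4.0    0        0            0            0
4    4    1        0        0            4.6    0.6      0.36         1            0
5    5    2        1        1            5.2    1.2      1.44         4            2
Means: x̄ = 3, ȳ = 4. Sums: Σ(y - ȳ)^2 = 6, Σ(ŷ - ȳ)^2 = 3.6, Σ(x - x̄)^2 = 10, Σ(x - x̄)(y - ȳ) = 6.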
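The goodness-of-fit calculation in the table can be reproduced with a short script. This is a sketch that assumes the fitted line ŷ = 2.2 + 0.6x from these notes; the names `ss_regression` and `ss_total` are illustrative.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # fitted line from the notes

y_mean = sum(ys) / len(ys)
y_hat = [b0 + b1 * x for x in xs]

ss_regression = sum((yh - y_mean) ** 2 for yh in y_hat)  # explained variation
ss_total = sum((y - y_mean) ** 2 for y in ys)            # total variation

r_squared = ss_regression / ss_total
print(f"R^2 = {r_squared:.2f}")
```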
Anaconda
A free, open-source Python distribution for machine learning, with tools such as
Spyder
Jupyter Notebook, a platform for interactive computing
Google provides free online GPU access. A paid option is also available (faster).
More cores allow parallel computation.
Pandas application programming interface (API)
Numerical Python (NumPy)
Pandas is an open-source, high-performance Python library for data analysis.
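To show how these two libraries fit together, here is a small sketch that repeats the worked regression example with pandas and NumPy. It assumes both packages are installed (they ship with Anaconda). The column names `dx` and `dy` are hypothetical, and `np.polyfit` is used only as a cross-check, not as the method taught in the lecture.

```python
import numpy as np
import pandas as pd

# Data from the worked example, held in a pandas DataFrame.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2, 4, 5, 4, 5]})

# Deviations from the mean, computed column-wise.
df["dx"] = df["x"] - df["x"].mean()
df["dy"] = df["y"] - df["y"].mean()

# Slope and intercept from the column sums.
b1 = (df["dx"] * df["dy"]).sum() / (df["dx"] ** 2).sum()
b0 = df["y"].mean() - b1 * df["x"].mean()

# Cross-check with NumPy's least-squares polynomial fit of degree 1.
slope, intercept = np.polyfit(df["x"].to_numpy(), df["y"].to_numpy(), 1)

print(df)
print(f"pandas: b1 = {b1:.1f}, b0 = {b0:.1f}")
print(f"numpy:  b1 = {slope:.1f}, b0 = {intercept:.1f}")
```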
Variance:
How scattered the estimated values are.
A model with high variance pays too much attention to the training
data and does not generalize to new data.
This is called overfitting.
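A small sketch can make overfitting concrete. It compares the fitted line from the worked example with a degree-4 polynomial that passes through all five training points exactly. The polynomial is a hypothetical high-variance model introduced here only for illustration, and x = 6 is an arbitrary new input.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def line(x):
    """Fitted line from the worked example."""
    return 2.2 + 0.6 * x

def quartic(x):
    """Degree-4 polynomial through all five points (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def training_sse(model):
    """Sum of squared errors of a model on the training points."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

print(f"training SSE, line:    {training_sse(line):.2f}")
print(f"training SSE, quartic: {training_sse(quartic):.2f}")
print(f"prediction at x = 6, line:    {line(6):.1f}")
print(f"prediction at x = 6, quartic: {quartic(6):.1f}")
```

The polynomial has zero training error, yet its prediction just outside the training range (17 versus 5.8 for the line) departs sharply from the trend in the data, which is the failure to generalize described above.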
Regularization: calibrate the regression model to prevent underfitting or overfitting by minimizing an adjusted loss
function.
Ridge regularization
Uses the L2 norm
Loss = sum of the squared errors + λ × (sum of the squared coefficients)
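For a single feature, the ridge solution has a simple closed form, sketched below on the data from the worked example. The function name `ridge_fit`, the penalty values, and the choice to leave the intercept unpenalized are assumptions of this sketch, not details given in the lecture.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def ridge_fit(lam):
    """Minimize sum((y - b0 - b1*x)^2) + lam * b1^2 for one feature.

    The intercept b0 is left unpenalized (an assumption of this sketch).
    """
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b1 = s_xy / (s_xx + lam)   # the penalty enlarges the denominator, shrinking b1
    b0 = y_mean - b1 * x_mean
    return b0, b1

for lam in (0, 5, 20):
    b0, b1 = ridge_fit(lam)
    print(f"lambda = {lam:>2}: b0 = {b0:.2f}, b1 = {b1:.2f}")
```

With λ = 0 the result matches ordinary least squares; larger λ pulls the slope toward zero without ever making it exactly zero.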
Lasso regularization
Uses the L1 norm
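The effect of the L1 penalty can be sketched for the single-feature case, where minimizing the sum of squared errors plus λ × |b1| reduces to soft-thresholding. The function name `lasso_fit`, the penalty values, and the unpenalized intercept are assumptions of this sketch, which uses the data from the worked example.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def lasso_fit(lam):
    """Minimize sum((y - b0 - b1*x)^2) + lam * |b1| for one feature.

    The intercept b0 is left unpenalized (an assumption of this sketch).
    """
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    # Soft-thresholding: shrink |s_xy| by lam / 2 and clamp at zero.
    shrunk = max(abs(s_xy) - lam / 2, 0.0)
    sign = 1.0 if s_xy >= 0 else -1.0
    b1 = sign * shrunk / s_xx
    b0 = y_mean - b1 * x_mean
    return b0, b1

for lam in (0, 4, 12):
    b0, b1 = lasso_fit(lam)
    print(f"lambda = {lam:>2}: b0 = {b0:.2f}, b1 = {b1:.2f}")
```

Unlike the L2 penalty, a large enough L1 penalty (here λ = 12) sets the slope exactly to zero.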