
Lec-3: Linear Regression - 2

- Evaluation Metrics: R² Score
- Model Interpretability: Feature Importance
- Feature Scaling
- Optimization
- Scratch Implementation
Recap

↳ Linear regression goal: predict the relationship b/w the independent var(s) and the dependent var (e.g. features f1, ..., fd → price).
↳ Single variable → find the best fit line in 2D.
↳ Multiple features → find the best fit d+1 dimensional hyperplane.
↳ Train-test-split the data: fit on (X-train, Y-train), evaluate on (X-test, Y-test).
Linear Regression Intuition

Single variable, e.g. Price vs Age: linear regression tries to find the "best fit line" through the data.

(figure: Price vs Age scatter with a fitted line)

Geometric intuition: the points should be as close to the line as possible.
Single variable:    ŷ = w1 x + w0

Multiple features:  ŷ = w1 x1 + w2 x2 + ... + wd xd + w0

Goal: find the weights w1, w2, ..., wd and the bias w0 for the best fit d+1 dimensional hyperplane.

(Here the target y is continuous, e.g. any value in a price range, unlike classification, where y takes discrete labels.)
Residual / error = y - ŷ = Actual - Predicted

total error = Σᵢ (yᵢ - ŷᵢ) → very bad idea: positive and negative errors cancel each other out

Mean Absolute Error (MAE) = (1/N) Σᵢ |yᵢ - ŷᵢ|

Mean Squared Error (MSE) = (1/N) Σᵢ (yᵢ - ŷᵢ)²
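These metrics are easy to check in numpy (the arrays below are made-up illustration values, not from the lecture):

```python
import numpy as np

# made-up actual and predicted values for illustration
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 8.5])

residuals = y_true - y_pred      # actual - predicted

total_error = residuals.sum()    # bad idea: +ve and -ve errors cancel to 0 here
mae = np.abs(residuals).mean()   # Mean Absolute Error
mse = (residuals ** 2).mean()    # Mean Squared Error
```

For these values the raw residuals sum to exactly zero even though the fit is imperfect, which is precisely why the summed error is a bad metric.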
Compare y = |x| vs y = x²:

* |x| is not differentiable at x = 0 → difficult to implement gradient descent using MAE as the loss.
* x² is differentiable at all points.
* MSE punishes outliers more, since large errors get squared.

(figure: two candidate lines L1 and L2 through data containing an outlier; L2 fits the bulk of the points better, yet the MSE is lower for L1 because the squared loss lets the outlier dominate)
→ Loss function for linear regression: Mean Squared Error (MSE)

Goal: minimize the loss function, i.e. find the weights and bias for the hyperplane having the least MSE:

    W*, w0* = argmin over (W, w0) of (1/N) Σᵢ (yᵢ - ŷᵢ)²
R² score → evaluation metric

* MSE and MAE do not actually give a relative idea of how good our model is.

Naive model for price: the mean model, i.e. always predict the mean, ŷ = ȳ.
(figure: Price vs Age scatter with a horizontal line at the mean ȳ)

↳ Mean residual Σᵢ (yᵢ - ȳ) = 0, so square the residuals instead:

Total sum of squares:  SST = Σᵢ (yᵢ - ȳ)²
Sum of squared errors: SSE = Σᵢ (yᵢ - ŷᵢ)²

R-squared score = R² = 1 - SSE/SST = 1 - Σᵢ (yᵢ - ŷᵢ)² / Σᵢ (yᵢ - ȳ)²

* R² score for the best model:  SSE = 0   → R² = 1 - 0 = 1
↳ R² score for the naive model: SSE = SST → R² = 1 - 1 = 0
→ The higher the R² score, the better the performance.
* R² score of the worst model: SSE can grow without bound → R² → -∞

So the R² score lies b/w -∞ and 1; practically, a model worth keeping scores between 0 and 1 (R² < 0 means it is worse than the naive mean model).
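A quick sketch of computing SSE, SST and R² directly from the formulas above (made-up numbers):

```python
import numpy as np

# made-up targets and model predictions for illustration
y      = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([12.0, 18.0, 33.0, 41.0])

sse = np.sum((y - y_pred) ** 2)    # sum of squared errors
sst = np.sum((y - y.mean()) ** 2)  # total sum of squares (naive mean model)
r2  = 1 - sse / sst

# the naive model predicts the mean, so its SSE equals SST and R² = 0
r2_naive = 1 - sst / sst
```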
Model Interpretability (Feature Importance)

ŷ = w1 x1 + w2 x2 + ... + wd xd + w0

Q: If wk is equal to zero, what is the impact of that feature?
⇒ The feature is not important in generating the prediction.

Q: If w1 = 10 and w2 = 1, which feature is more important?
Sol: Change x1 by 1 and see the impact on the prediction:
→ ŷ will change by w1 × 1 = 10 units.
If we change x2 by 1, then ŷ will only change by 1 unit.
⇒ x1 is more important.

Q: If w4 = -1, how does a change in x4 affect ŷ?
→ A 1 unit increase in x4 will result in a 1 unit decrease in ŷ.

Q: If w1 = 10 and w2 = -20, which feature has more impact on the target?
⇒ x2 is more impactful.

⇒ The higher the magnitude of a weight, the higher the feature importance.
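The weight-impact reasoning above can be checked numerically (hypothetical weights w1 = 10, w2 = -20, w0 = 5):

```python
# hypothetical weights for a 2-feature linear model
w1, w2, w0 = 10.0, -20.0, 5.0

def predict(x1, x2):
    return w1 * x1 + w2 * x2 + w0

base = predict(1.0, 1.0)
# bump each feature by 1 unit and measure the change in the prediction
delta_x1 = predict(2.0, 1.0) - base   # equals w1 = +10
delta_x2 = predict(1.0, 2.0) - base   # equals w2 = -20

# |w2| > |w1|, so x2 has more impact on the target
more_impactful = "x2" if abs(w2) > abs(w1) else "x1"
```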
e.
NOTE:
→ Weights are able to tell feature importances only if the features are normalized.

Eg.  ŷ = w1(age) + w2(km-driven) + w0
     age ∈ 0-10,  km-driven ∈ 1,00,000 - 20,00,000

     ŷ = -10,000(age) - 0.5(km-driven) + w0

|w1| > |w2|, so does that mean age is more important?
→ No. This is due to the different scales of the two features, and not feature importance.
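One common way to put features on a comparable scale is standardization (zero mean, unit variance); a minimal sketch with made-up car data:

```python
import numpy as np

# made-up data: age in years (0-10), km driven (lakhs scale)
age = np.array([2.0, 5.0, 8.0, 10.0])
km  = np.array([100000.0, 600000.0, 1200000.0, 2000000.0])

X = np.column_stack([age, km])

# standardize each feature column: zero mean, unit variance
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

# after normalization both columns live on the same scale,
# so weights learned on X_norm become comparable as importances
```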
Gradient Descent Revision

① Randomly initialize x.
② Update the value of x:
       x_new = x_old - η · dL/dx
③ Keep on repeating step ② until you reach the minima, or until you complete a given no. of iterations.
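The three steps can be sketched on a toy loss L(x) = (x - 3)², whose minimum is at x = 3 (the loss, starting point and learning rate here are chosen just for illustration):

```python
# toy loss L(x) = (x - 3)^2, so the gradient is dL/dx = 2(x - 3)
def grad(x):
    return 2 * (x - 3)

eta = 0.1                     # learning rate
x = 10.0                      # step 1: (arbitrary) initialization
for _ in range(200):          # step 3: repeat for a given no. of iterations
    x = x - eta * grad(x)     # step 2: x_new = x_old - eta * dL/dx

# x has now converged very close to the minimum at 3
```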
Quiz: what does the gradient represent?
→ gradient: direction of steepest ascent
→ negative gradient: direction of steepest descent
Optimization

W*, w0* = argmin over (W, w0) of L,  where  Loss = L = (1/N) Σᵢ (yᵢ - ŷᵢ)²

Data with features f1, f2, ..., fd (xij = value of feature j for example i):

    ŷ1 = w1 x11 + w2 x12 + ... + wd x1d + w0
    ŷ2 = w1 x21 + w2 x22 + ... + wd x2d + w0
     ⋮
    ŷi = w1 xi1 + w2 xi2 + ... + wd xid + w0
Update rules (iteration t → t+1):

    w1^(t+1) = w1^t - η ∂L/∂w1
    w2^(t+1) = w2^t - η ∂L/∂w2
     ⋮
    wd^(t+1) = wd^t - η ∂L/∂wd
    w0^(t+1) = w0^t - η ∂L/∂w0
Computing ∂L/∂w1, with L = (1/N) [(y1 - ŷ1)² + (y2 - ŷ2)² + ... + (yN - ŷN)²]:

For one term:

    ∂/∂w1 (y1 - ŷ1)² = 2 (y1 - ŷ1) · ∂/∂w1 (y1 - w1 x11 - w2 x12 - ... - wd x1d - w0)
                     = 2 (y1 - ŷ1) · (-x11)

Summing over all N examples:

    ∂L/∂w1 = -(2/N) Σᵢ (yᵢ - ŷᵢ) xi1
     ⋮
    ∂L/∂wd = -(2/N) Σᵢ (yᵢ - ŷᵢ) xid
    ∂L/∂w0 = -(2/N) Σᵢ (yᵢ - ŷᵢ)
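A sanity check of the derived gradient formula against a finite-difference estimate, on random made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
X = rng.normal(size=(N, d))     # random feature matrix
y = rng.normal(size=N)          # random targets
W = rng.normal(size=d)
w0 = 0.5

def loss(W, w0):
    y_hat = X @ W + w0
    return np.mean((y - y_hat) ** 2)

# analytic gradient from the derivation: dL/dw1 = -(2/N) sum_i (yi - yhat_i) xi1
y_hat = X @ W + w0
grad_w1 = -(2 / N) * np.sum((y - y_hat) * X[:, 0])

# finite-difference estimate of the same partial derivative
eps = 1e-6
Wp = W.copy()
Wp[0] += eps
numeric = (loss(Wp, w0) - loss(W, w0)) / eps
```

The two values agree to several decimal places, which confirms the algebra above.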
Vectorized form: stack the data as a matrix X (N × d) and the residuals as a column vector (y - ŷ) (N × 1):

         f1   f2  ...  fd           y - ŷ
    X = [x11  x12 ... x1d]       [y1 - ŷ1]
        [x21  x22 ... x2d]       [y2 - ŷ2]
        [ ⋮              ]       [   ⋮   ]
        [xN1  xN2 ... xNd]       [yN - ŷN]

Then all the weight gradients at once:

    [∂L/∂w1, ∂L/∂w2, ..., ∂L/∂wd]ᵀ = -(2/N) Xᵀ (y - ŷ)
Final update rules:

    W^(t+1)  = W^t  + (2η/N) Xᵀ (y - ŷ)
    w0^(t+1) = w0^t + (2η/N) Σᵢ (yᵢ - ŷᵢ)

η = learning rate = step size
Scratch Implementation

Shapes: X is (n × d), W is (d × 1), so X·W is (n × 1) and

    ŷ = X·W + w0

In numpy, the prediction is y_hat = np.dot(X, W) + w0.
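Putting the update rules and the vectorized prediction together, a minimal scratch training loop (the synthetic data, learning rate and iteration count are hand-picked for this sketch):

```python
import numpy as np

rng = np.random.default_rng(42)

# synthetic data generated from y = 3*x1 - 2*x2 + 5 + small noise
n, d = 200, 2
X = rng.normal(size=(n, d))
y = X @ np.array([3.0, -2.0]) + 5.0 + 0.01 * rng.normal(size=n)

W = np.zeros(d)    # initialized weights
w0 = 0.0           # bias
eta = 0.1          # learning rate

for _ in range(500):
    y_hat = np.dot(X, W) + w0                  # prediction: X·W + w0
    resid = y - y_hat
    W  = W  + (2 * eta / n) * (X.T @ resid)    # W^(t+1)  = W^t  + (2η/n) Xᵀ(y - ŷ)
    w0 = w0 + (2 * eta / n) * resid.sum()      # w0^(t+1) = w0^t + (2η/n) Σ(yi - ŷi)

# evaluate with the R² score on the training data
sse = np.sum((y - (np.dot(X, W) + w0)) ** 2)
sst = np.sum((y - y.mean()) ** 2)
r2 = 1 - sse / sst
```

After training, W and w0 land close to the generating values (3, -2, 5) and the R² score is near 1.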
