
Lec-3: Linear Regression - 2

- Evaluation Metrics: R² Score
- Model Interpretability: Feature Importance
- Feature Scaling
- Optimization
- Scratch Implementation
Recap

↳ Linear regression goal: predict the relationship b/w the independent var(s) and the dependent var (e.g. features f1, ..., fd → price).
↳ Single variable → find the best fit line in 2D.
↳ Multiple features → find the best fit d+1 dimensional hyperplane.
↳ Train-test-split the data: fit on (X-train, Y-train), evaluate on (X-test, Y-test).
Linear Regression Intuition

Single variable, e.g. Price vs Age: linear regression tries to find the "best fit line" through the data.

(figure: Price vs Age scatter with a fitted line)

Geometric intuition: the points should be as close to the line as possible.
Single variable:    ŷ = w1 x + w0

Multiple features:  ŷ = w1 x1 + w2 x2 + ... + wd xd + w0

Goal: find the weights w1, w2, ..., wd and the bias w0 for the best fit d+1 dimensional hyperplane.

(Here the target y is continuous, e.g. any value in a price range, unlike classification, where y takes discrete labels.)
Residual / error = y - ŷ = Actual - Predicted

total error = Σᵢ (yᵢ - ŷᵢ) → very bad idea: positive and negative errors cancel each other out

Mean Absolute Error (MAE) = (1/N) Σᵢ |yᵢ - ŷᵢ|

Mean Squared Error (MSE) = (1/N) Σᵢ (yᵢ - ŷᵢ)²
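These metrics are easy to check in numpy (the arrays below are made-up illustration values, not from the lecture):

```python
import numpy as np

# made-up actual and predicted values for illustration
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 8.5])

residuals = y_true - y_pred      # actual - predicted

total_error = residuals.sum()    # bad idea: +ve and -ve errors cancel to 0 here
mae = np.abs(residuals).mean()   # Mean Absolute Error
mse = (residuals ** 2).mean()    # Mean Squared Error
```

For these values the raw residuals sum to exactly zero even though the fit is imperfect, which is precisely why the summed error is a bad metric.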
Compare y = |x| vs y = x²:

* |x| is not differentiable at x = 0 → difficult to implement gradient descent using MAE as the loss.
* x² is differentiable at all points.
* MSE punishes outliers more, since large errors get squared.

(figure: two candidate lines L1 and L2 through data containing an outlier; L2 fits the bulk of the points better, yet the MSE is lower for L1 because the squared loss lets the outlier dominate)
→ Loss function for linear regression: Mean Squared Error (MSE)

Goal: minimize the loss function, i.e. find the weights and bias for the hyperplane having the least MSE:

    W*, w0* = argmin over (W, w0) of (1/N) Σᵢ (yᵢ - ŷᵢ)²
R² score → evaluation metric

* MSE and MAE do not actually give a relative idea of how good our model is.

Naive model for price: the mean model, i.e. always predict the mean, ŷ = ȳ.
(figure: Price vs Age scatter with a horizontal line at the mean ȳ)

↳ Mean residual Σᵢ (yᵢ - ȳ) = 0, so square the residuals instead:

Total sum of squares:  SST = Σᵢ (yᵢ - ȳ)²
Sum of squared errors: SSE = Σᵢ (yᵢ - ŷᵢ)²

R-squared score = R² = 1 - SSE/SST = 1 - Σᵢ (yᵢ - ŷᵢ)² / Σᵢ (yᵢ - ȳ)²

* R² score for the best model:  SSE = 0   → R² = 1 - 0 = 1
↳ R² score for the naive model: SSE = SST → R² = 1 - 1 = 0
→ The higher the R² score, the better the performance.
* R² score of the worst model: SSE can grow without bound → R² → -∞

So the R² score lies b/w -∞ and 1; practically, a model worth keeping scores between 0 and 1 (R² < 0 means it is worse than the naive mean model).
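A quick sketch of computing SSE, SST and R² directly from the formulas above (made-up numbers):

```python
import numpy as np

# made-up targets and model predictions for illustration
y      = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([12.0, 18.0, 33.0, 41.0])

sse = np.sum((y - y_pred) ** 2)    # sum of squared errors
sst = np.sum((y - y.mean()) ** 2)  # total sum of squares (naive mean model)
r2  = 1 - sse / sst

# the naive model predicts the mean, so its SSE equals SST and R² = 0
r2_naive = 1 - sst / sst
```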
Model Interpretability (Feature Importance)

ŷ = w1 x1 + w2 x2 + ... + wd xd + w0

Q: If wk is equal to zero, what is the impact of that feature?
⇒ The feature is not important in generating the prediction.

Q: If w1 = 10 and w2 = 1, which feature is more important?
Sol: Change x1 by 1 and see the impact on the prediction:
→ ŷ will change by w1 × 1 = 10 units.
If we change x2 by 1, then ŷ will only change by 1 unit.
⇒ x1 is more important.

Q: If w4 = -1, how does a change in x4 affect ŷ?
→ A 1 unit increase in x4 will result in a 1 unit decrease in ŷ.

Q: If w1 = 10 and w2 = -20, which feature has more impact on the target?
⇒ x2 is more impactful.

⇒ The higher the magnitude of a weight, the higher the feature importance.
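The weight-impact reasoning above can be checked numerically (hypothetical weights w1 = 10, w2 = -20, w0 = 5):

```python
# hypothetical weights for a 2-feature linear model
w1, w2, w0 = 10.0, -20.0, 5.0

def predict(x1, x2):
    return w1 * x1 + w2 * x2 + w0

base = predict(1.0, 1.0)
# bump each feature by 1 unit and measure the change in the prediction
delta_x1 = predict(2.0, 1.0) - base   # equals w1 = +10
delta_x2 = predict(1.0, 2.0) - base   # equals w2 = -20

# |w2| > |w1|, so x2 has more impact on the target
more_impactful = "x2" if abs(w2) > abs(w1) else "x1"
```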
e.
NOTE:
→ Weights are able to tell feature importances only if the features are normalized.

Eg.  ŷ = w1(age) + w2(km-driven) + w0
     age ∈ 0-10,  km-driven ∈ 1,00,000 - 20,00,000

     ŷ = -10,000(age) - 0.5(km-driven) + w0

|w1| > |w2|, so does that mean age is more important?
→ No. This is due to the different scales of the two features, and not feature importance.
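One common way to put features on a comparable scale is standardization (zero mean, unit variance); a minimal sketch with made-up car data:

```python
import numpy as np

# made-up data: age in years (0-10), km driven (lakhs scale)
age = np.array([2.0, 5.0, 8.0, 10.0])
km  = np.array([100000.0, 600000.0, 1200000.0, 2000000.0])

X = np.column_stack([age, km])

# standardize each feature column: zero mean, unit variance
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

# after normalization both columns live on the same scale,
# so weights learned on X_norm become comparable as importances
```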
Gradient Descent Revision

① Randomly initialize x.
② Update the value of x:
       x_new = x_old - η · dL/dx
③ Keep on repeating step ② until you reach the minima, or until you complete a given no. of iterations.
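The three steps can be sketched on a toy loss L(x) = (x - 3)², whose minimum is at x = 3 (the loss, starting point and learning rate here are chosen just for illustration):

```python
# toy loss L(x) = (x - 3)^2, so the gradient is dL/dx = 2(x - 3)
def grad(x):
    return 2 * (x - 3)

eta = 0.1                     # learning rate
x = 10.0                      # step 1: (arbitrary) initialization
for _ in range(200):          # step 3: repeat for a given no. of iterations
    x = x - eta * grad(x)     # step 2: x_new = x_old - eta * dL/dx

# x has now converged very close to the minimum at 3
```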
Quiz: what does the gradient represent?
→ gradient: direction of steepest ascent
→ negative gradient: direction of steepest descent
Optimization

W*, w0* = argmin over (W, w0) of L,  where  Loss = L = (1/N) Σᵢ (yᵢ - ŷᵢ)²

Data with features f1, f2, ..., fd (xij = value of feature j for example i):

    ŷ1 = w1 x11 + w2 x12 + ... + wd x1d + w0
    ŷ2 = w1 x21 + w2 x22 + ... + wd x2d + w0
     ⋮
    ŷi = w1 xi1 + w2 xi2 + ... + wd xid + w0
Update rules (iteration t → t+1):

    w1^(t+1) = w1^t - η ∂L/∂w1
    w2^(t+1) = w2^t - η ∂L/∂w2
     ⋮
    wd^(t+1) = wd^t - η ∂L/∂wd
    w0^(t+1) = w0^t - η ∂L/∂w0
Computing ∂L/∂w1, with L = (1/N) [(y1 - ŷ1)² + (y2 - ŷ2)² + ... + (yN - ŷN)²]:

For one term:

    ∂/∂w1 (y1 - ŷ1)² = 2 (y1 - ŷ1) · ∂/∂w1 (y1 - w1 x11 - w2 x12 - ... - wd x1d - w0)
                     = 2 (y1 - ŷ1) · (-x11)

Summing over all N examples:

    ∂L/∂w1 = -(2/N) Σᵢ (yᵢ - ŷᵢ) xi1
     ⋮
    ∂L/∂wd = -(2/N) Σᵢ (yᵢ - ŷᵢ) xid
    ∂L/∂w0 = -(2/N) Σᵢ (yᵢ - ŷᵢ)
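A sanity check of the derived gradient formula against a finite-difference estimate, on random made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
X = rng.normal(size=(N, d))     # random feature matrix
y = rng.normal(size=N)          # random targets
W = rng.normal(size=d)
w0 = 0.5

def loss(W, w0):
    y_hat = X @ W + w0
    return np.mean((y - y_hat) ** 2)

# analytic gradient from the derivation: dL/dw1 = -(2/N) sum_i (yi - yhat_i) xi1
y_hat = X @ W + w0
grad_w1 = -(2 / N) * np.sum((y - y_hat) * X[:, 0])

# finite-difference estimate of the same partial derivative
eps = 1e-6
Wp = W.copy()
Wp[0] += eps
numeric = (loss(Wp, w0) - loss(W, w0)) / eps
```

The two values agree to several decimal places, which confirms the algebra above.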
Vectorized form: stack the data as a matrix X (N × d) and the residuals as a column vector (y - ŷ) (N × 1):

         f1   f2  ...  fd           y - ŷ
    X = [x11  x12 ... x1d]       [y1 - ŷ1]
        [x21  x22 ... x2d]       [y2 - ŷ2]
        [ ⋮              ]       [   ⋮   ]
        [xN1  xN2 ... xNd]       [yN - ŷN]

Then all the weight gradients at once:

    [∂L/∂w1, ∂L/∂w2, ..., ∂L/∂wd]ᵀ = -(2/N) Xᵀ (y - ŷ)
Final update rules:

    W^(t+1)  = W^t  + (2η/N) Xᵀ (y - ŷ)
    w0^(t+1) = w0^t + (2η/N) Σᵢ (yᵢ - ŷᵢ)

η = learning rate = step size
Scratch Implementation

Shapes: X is (n × d), W is (d × 1), so X·W is (n × 1) and

    ŷ = X·W + w0

In numpy, the prediction is y_hat = np.dot(X, W) + w0.
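Putting the update rules and the vectorized prediction together, a minimal scratch training loop (the synthetic data, learning rate and iteration count are hand-picked for this sketch):

```python
import numpy as np

rng = np.random.default_rng(42)

# synthetic data generated from y = 3*x1 - 2*x2 + 5 + small noise
n, d = 200, 2
X = rng.normal(size=(n, d))
y = X @ np.array([3.0, -2.0]) + 5.0 + 0.01 * rng.normal(size=n)

W = np.zeros(d)    # initialized weights
w0 = 0.0           # bias
eta = 0.1          # learning rate

for _ in range(500):
    y_hat = np.dot(X, W) + w0                  # prediction: X·W + w0
    resid = y - y_hat
    W  = W  + (2 * eta / n) * (X.T @ resid)    # W^(t+1)  = W^t  + (2η/n) Xᵀ(y - ŷ)
    w0 = w0 + (2 * eta / n) * resid.sum()      # w0^(t+1) = w0^t + (2η/n) Σ(yi - ŷi)

# evaluate with the R² score on the training data
sse = np.sum((y - (np.dot(X, W) + w0)) ** 2)
sst = np.sum((y - y.mean()) ** 2)
r2 = 1 - sse / sst
```

After training, W and w0 land close to the generating values (3, -2, 5) and the R² score is near 1.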
