Quiz 1: September 3rd: EE 615: Pattern Recognition & Machine Learning Fall 2016
Question 1:
Part (a): Explain what you understand by the multicollinearity effect.
Solution: Let us consider the data matrix

X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1d} \\ x_{21} & x_{22} & \cdots & x_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Nd} \end{pmatrix}
\qquad \text{and} \qquad
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}

Multicollinearity effect: the columns of X are not all linearly independent, i.e. some column can be written as a linear combination of the others.
Part (b): When do we say that the given data has the multicollinearity effect? Explain your answer using the example of a linear regression problem.
Solution: Now consider the example of a linear regression problem:

L(W) = \|XW - Y\|_2^2

To get the optimal solution W^*, we set \nabla_W L(W) = 0, which gives

W^* = (X^T X)^{-1} X^T Y

We say that the data has the multicollinearity effect if the matrix X^T X is rank deficient. This follows from the fact that X does not have all of its columns linearly independent, so X is rank deficient. The implication can be proved as follows: since X is rank deficient, its null space is nontrivial, which means Xv = 0 for some nonzero v. Then X^T X v = 0 for the same v. Hence X^T X also has a nontrivial null space, and (X^T X)^{-1} does not exist.
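This argument is easy to check numerically. The sketch below uses an illustrative data matrix (not from the quiz) whose third column is the sum of the first two, so its columns are linearly dependent and X^T X comes out singular:

```python
import numpy as np

# Illustrative example (not from the quiz): build a data matrix whose third
# column equals the sum of the first two, so the columns are linearly dependent.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 2))
X = np.hstack([A, (A[:, 0] + A[:, 1]).reshape(-1, 1)])

XtX = X.T @ X
print(np.linalg.matrix_rank(X))    # 2, not 3: X is rank deficient
print(np.linalg.matrix_rank(XtX))  # 2: X^T X is singular, so (X^T X)^{-1} fails

# A nonzero null vector v with Xv = 0, hence X^T X v = 0 as well.
v = np.array([1.0, 1.0, -1.0])
print(np.allclose(X @ v, 0))       # True
print(np.allclose(XtX @ v, 0))     # True
```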
Question 2: For least squares regression, show that the following properties hold true:
Part (a): The sum of residual errors is zero.
Solution: For least squares regression,

L_D(W) = \sum_{i=1}^{n} (x_i^T W - y_i)^2
Writing the bias term W_0 explicitly,

L_D(W) = \sum_{i=1}^{n} \Big( W_0 + \sum_{j=1}^{d} x_{ij} W_j - y_i \Big)^2

Setting the partial derivative with respect to W_0 to zero gives

\sum_{i=1}^{n} (x_i^T W - y_i) = 0

Therefore, with the predicted value \hat{y}_i = x_i^T W,

\sum_{i=1}^{n} (\hat{y}_i - y_i) = 0

and the residual is e_i = (y_i - \hat{y}_i), so

\sum_{i=1}^{n} e_i = 0
It also follows that the mean of the predicted values equals the mean of the observed values: since \sum_{i=1}^{n} (\hat{y}_i - y_i) = 0,

\bar{\hat{y}} = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y}

The corresponding variances are

\mathrm{Var}(y) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad \mathrm{Var}(\hat{y}) = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 \qquad (1)
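These properties can be checked numerically. The sketch below (with hypothetical random data) fits least squares with an explicit intercept column and verifies that the residuals sum to zero and that the two means agree:

```python
import numpy as np

# Hypothetical random data; the first column of X is all ones (the bias W0).
rng = np.random.default_rng(1)
n, d = 50, 3
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, d))])
y = rng.normal(size=n)

# Least-squares fit and predictions.
W, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ W
residuals = y - y_hat

print(np.isclose(residuals.sum(), 0.0))    # True: sum of residuals is zero
print(np.isclose(y_hat.mean(), y.mean()))  # True: mean(y_hat) equals mean(y)
```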
Question 3 Solution:
x is drawn randomly from the space X. A fair coin is then tossed:
If heads: y = w^T x
If tails: y = v^T x
Now, since h_Bayes is the hypothesis which minimizes the generalization error,

h_{Bayes} = \arg\min_h \int_x \int_y l(h, (x, y)) \, p(y \mid x) \, p(x) \, dy \, dx

Here p(x) is a uniform probability distribution, and after drawing x we obtain y by tossing a fair coin. Hence p(y \mid x) is a Bernoulli distribution with

p(y = w^T x \mid x) = \frac{1}{2} \qquad \text{and} \qquad p(y = v^T x \mid x) = \frac{1}{2}
Substituting into the expression for h_Bayes above,

h_{Bayes} = \arg\min_h \left[ \frac{1}{2} \int_x l(h, (x, w^T x)) \, p(x) \, dx + \frac{1}{2} \int_x l(h, (x, v^T x)) \, p(x) \, dx \right]

h_{Bayes} = \arg\min_h \int_x \frac{1}{2} \left[ l(h, (x, w^T x)) + l(h, (x, v^T x)) \right] p(x) \, dx

Since p(x) \ge 0, it suffices to minimize the integrand pointwise: for each fixed x we choose the value h(x) that minimizes the bracketed term.
For the squared loss l(h, (x, y)) = (h(x) - y)^2, let h(x) = t and set

\frac{d}{dt} \, \frac{1}{2} \left[ (t - w^T x)^2 + (t - v^T x)^2 \right] = 0

(t - w^T x) + (t - v^T x) = 0
2t - w^T x - v^T x = 0

t = \frac{1}{2} (w^T x + v^T x)

t = E(y \mid x)

h_{Bayes}(x) = E(y \mid x) = \frac{1}{2} (w^T x + v^T x)
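A quick numerical sanity check (with hypothetical w, v, and x) that t = (w^T x + v^T x)/2 minimizes the pointwise squared-loss objective:

```python
import numpy as np

# Hypothetical w, v, x; a = w^T x and b = v^T x are the two possible labels.
rng = np.random.default_rng(2)
w, v, x = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)
a, b = w @ x, v @ x

# Pointwise objective f(t) = (1/2)[(t - a)^2 + (t - b)^2] and claimed minimizer.
f = lambda t: 0.5 * ((t - a) ** 2 + (t - b) ** 2)
t_star = 0.5 * (a + b)

# No point on a wide grid around [a, b] does better than t_star.
grid = np.linspace(min(a, b) - 2, max(a, b) + 2, 1001)
print(f(t_star) <= f(grid).min())  # True: t_star attains the minimum
```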
Part (b): l(h, (x, y)) = |h(x) - y|
h_{Bayes} = \arg\min_h \int_x \int_y |h(x) - y| \, p(y \mid x) \, p(x) \, dy \, dx

Let h(x) = t. Then

h_{Bayes} = \arg\min_h \int_x \frac{1}{2} \left( |t - w^T x| + |t - v^T x| \right) p(x) \, dx

Any value t \in S is a global minimizer of the pointwise cost above, where

S = [\min(w^T x, v^T x), \max(w^T x, v^T x)]
Hence,

h_{Bayes}(x) \in [\min(w^T x, v^T x), \max(w^T x, v^T x)]
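As a sketch with hypothetical values a = w^T x and b = v^T x for one fixed x, the pointwise cost (1/2)(|t - a| + |t - b|) is flat on [min(a, b), max(a, b)], so every point of that interval is a minimizer:

```python
import numpy as np

# Hypothetical labels a = w^T x, b = v^T x for one fixed x.
a, b = 1.5, -0.7
g = lambda t: 0.5 * (np.abs(t - a) + np.abs(t - b))

# Inside the interval the cost is constant, equal to |a - b| / 2.
inside = np.linspace(min(a, b), max(a, b), 101)
print(np.allclose(g(inside), abs(a - b) / 2))  # True

# Any point outside the interval costs strictly more.
print(g(max(a, b) + 1.0) > abs(a - b) / 2)     # True
```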