02 02 Regression
Input: vectors $\mathbf{x}_1, \cdots, \mathbf{x}_n \in \mathbb{R}^d$ and labels $y_1, \cdots, y_n \in \mathbb{R}$.
Output: a function $f: \mathbb{R}^d \mapsto \mathbb{R}$ such that $f(\mathbf{x}_i) \approx y_i$.

Least squares regression:
Output: a vector $\mathbf{w} \in \mathbb{R}^d$ and a scalar $b \in \mathbb{R}$ such that $\mathbf{x}_i^T \mathbf{w} + b \approx y_i$.
1. Add one dimension to each $\mathbf{x}_i \in \mathbb{R}^d$: $\bar{\mathbf{x}}_i = [\mathbf{x}_i; 1] \in \mathbb{R}^{d+1}$.
2. Solve the least squares regression $\min_{\mathbf{w} \in \mathbb{R}^{d+1}} \|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2$, where the $i$-th row of $\mathbf{X} \in \mathbb{R}^{n \times (d+1)}$ is $\bar{\mathbf{x}}_i^T$ (the closed-form solution is worked out below).
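Setting the gradient of the objective to zero gives the closed-form solution; this is the "analytical solution" computed in the notebook cells later in the deck, which use a pseudo-inverse in case $\mathbf{X}^T \mathbf{X}$ is singular:

$$\nabla_{\mathbf{w}} \|\mathbf{X}\mathbf{w} - \mathbf{y}\|_2^2 = 2\,\mathbf{X}^T (\mathbf{X}\mathbf{w} - \mathbf{y}) = \mathbf{0} \;\Longrightarrow\; \mathbf{X}^T \mathbf{X}\,\mathbf{w} = \mathbf{X}^T \mathbf{y} \;\Longrightarrow\; \mathbf{w} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}.$$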
Conjugate Gradient
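Conjugate gradient can solve the normal equations $\mathbf{X}^T\mathbf{X}\,\mathbf{w} = \mathbf{X}^T\mathbf{y}$ iteratively, without forming a matrix inverse. Below is a minimal numpy sketch; the arrays xbar_train and y_train are assumed to come from the notebook demo later in the deck.

import numpy

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    # solve A x = b for a symmetric positive-definite matrix A
    x = numpy.zeros_like(b)
    r = b - numpy.dot(A, x)      # residual
    p = r.copy()                 # search direction
    rs_old = numpy.dot(r, r)
    for _ in range(max_iter):
        Ap = numpy.dot(A, p)
        alpha = rs_old / numpy.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = numpy.dot(r, r)
        if numpy.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# normal equations for least squares: (X^T X) w = X^T y
# A = numpy.dot(xbar_train.T, xbar_train)
# b = numpy.dot(xbar_train.T, y_train)
# w = conjugate_gradient(A, b)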
Polynomial regression:
1. Define a feature map $\boldsymbol{\phi}(x) = [1, x, x^2, x^3, \cdots, x^p]$.
2. For $j = 1$ to $n$, do the mapping $x_j \mapsto \boldsymbol{\phi}(x_j)$.
   • Let $\boldsymbol{\Phi} = [\boldsymbol{\phi}(x_1); \cdots; \boldsymbol{\phi}(x_n)]^T \in \mathbb{R}^{n \times (p+1)}$.
3. Solve the least squares regression $\min_{\mathbf{w} \in \mathbb{R}^{p+1}} \|\boldsymbol{\Phi}\mathbf{w} - \mathbf{y}\|_2^2$ (a sketch of this solve follows the feature example below).
Polynomial features: for $\mathbf{x} = [x_1, x_2]$ and degree 3,
$\boldsymbol{\phi}(\mathbf{x}) = [1, x_1, x_2, x_1^2, x_2^2, x_1 x_2, x_1^3, x_2^3, x_1 x_2^2, x_1^2 x_2]$.

In [2]:
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=3)
Phi = poly.fit_transform(X)
print('Phi = ')
print(Phi)
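Step 3 of polynomial regression then solves least squares on the mapped features. A minimal sketch, assuming the Phi computed above and a hypothetical label vector y of matching length (PolynomialFeatures already includes the constant column, so no extra bias term is needed):

import numpy

# least squares on the polynomial features: min_w || Phi w - y ||^2
w, residuals, rank, sv = numpy.linalg.lstsq(Phi, y, rcond=None)
y_hat = numpy.dot(Phi, w)               # fitted values
mse = numpy.mean((y_hat - y) ** 2)      # training mean squared error
print('Train MSE: ' + str(mse))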
In [1]:
print('shape of x_train: ' + str(x_train.shape))
print('shape of y_train: ' + str(y_train.shape))
print('shape of x_test: ' + str(x_test.shape))
print('shape of y_test: ' + str(y_test.shape))

Using TensorFlow backend.
shape of x_train: (404, 13)
shape of y_train: (404,)
shape of x_test: (102, 13)
shape of y_test: (102,)

In [2]:
import numpy
n, d = x_train.shape
xbar_train = numpy.concatenate((x_train, numpy.ones((n, 1))), axis=1)
print('shape of x_train: ' + str(x_train.shape))
print('shape of xbar_train: ' + str(xbar_train.shape))

shape of x_train: (404, 13)
shape of xbar_train: (404, 14)

In [3]:
# the analytical solution
xx = numpy.dot(xbar_train.T, xbar_train)
xx_inv = numpy.linalg.pinv(xx)
xy = numpy.dot(xbar_train.T, y_train)
w = numpy.dot(xx_inv, xy)

In [4]:
# mean squared error (training)
y_lsr = numpy.dot(xbar_train, w)
diff = y_lsr - y_train
mse = numpy.mean(diff * diff)
print('Train MSE: ' + str(mse))

Train MSE: 22.00480083834814

In [5]:
print(y_train[0:10])

Training, Test, and Overfitting
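The cells above report only the training MSE. A minimal sketch of the corresponding test error, reusing the fitted w and the x_test, y_test arrays whose shapes are printed in the In [1] cell:

# append the constant feature to the test inputs, mirroring the training setup
m = x_test.shape[0]
xbar_test = numpy.concatenate((x_test, numpy.ones((m, 1))), axis=1)

# mean squared error (test)
y_pred = numpy.dot(xbar_test, w)
test_mse = numpy.mean((y_pred - y_test) * (y_pred - y_test))
print('Test MSE: ' + str(test_mse))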
Underfitting and Overfitting

[Figure: three fits evaluated at a query point $\mathbf{x}'$: a linear model (BAD, underfits), a degree-4 polynomial (GOOD), and a degree-15 polynomial (BAD, overfits).]
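A small synthetic experiment reproduces the qualitative picture: fit degree-1, degree-4, and degree-15 polynomials to noisy samples of a smooth target and compare training and test MSE. All data and the target function below are made up for illustration.

import numpy
from sklearn.preprocessing import PolynomialFeatures

rng = numpy.random.RandomState(0)
x_tr = numpy.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)   # 20 noisy training points
x_te = numpy.linspace(0, 1, 200).reshape(-1, 1)           # dense test grid
target = lambda x: numpy.cos(1.5 * numpy.pi * x)
y_tr = target(x_tr).ravel() + rng.normal(0, 0.1, x_tr.shape[0])
y_te = target(x_te).ravel()

for degree in [1, 4, 15]:
    poly = PolynomialFeatures(degree=degree)
    Phi_tr = poly.fit_transform(x_tr)
    Phi_te = poly.transform(x_te)
    w, _, _, _ = numpy.linalg.lstsq(Phi_tr, y_tr, rcond=None)
    mse_tr = numpy.mean((numpy.dot(Phi_tr, w) - y_tr) ** 2)
    mse_te = numpy.mean((numpy.dot(Phi_te, w) - y_te) ** 2)
    print('degree %2d: train MSE %.4f, test MSE %.4f' % (degree, mse_tr, mse_te))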
[Figure: the features and labels are split into $n$ training samples and $m$ test samples; a second scheme additionally holds out $n_{\mathrm{val}}$ validation samples from the training data.]
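A minimal sketch of such a split, assuming a full data matrix features with a matching label vector labels (both names hypothetical), where n, n_val, and m are the training, validation, and test counts:

import numpy

def train_val_test_split(features, labels, n, n_val, seed=0):
    # randomly assign rows to the training, validation, and test sets
    rng = numpy.random.RandomState(seed)
    idx = rng.permutation(features.shape[0])
    train_idx = idx[:n]
    val_idx = idx[n:n + n_val]
    test_idx = idx[n + n_val:]
    return ((features[train_idx], labels[train_idx]),
            (features[val_idx], labels[val_idx]),
            (features[test_idx], labels[test_idx]))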
$k$-Fold Cross-Validation
1. Propose a grid of hyper-parameters.
   • E.g., $p \in \{1, 2, 3, 4, 5, 6\}$.
2. Randomly partition the training samples into $k$ parts.
   • $k - 1$ parts for training.
   • One part for test.
3. Compute the average of the test errors over the $k$ repeats.
   • The average is called the validation error.
4. Choose the hyper-parameter $p$ that leads to the smallest validation error (see the sketch below).
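A minimal sketch of the procedure using scikit-learn's KFold together with the polynomial feature map from earlier; x_train and y_train are assumed to be the training arrays from the notebook demo above, and the degree grid follows step 1.

import numpy
from sklearn.model_selection import KFold
from sklearn.preprocessing import PolynomialFeatures

k = 5
degrees = [1, 2, 3, 4, 5, 6]
kf = KFold(n_splits=k, shuffle=True, random_state=0)

val_errors = {}
for p in degrees:
    errors = []
    for fit_idx, val_idx in kf.split(x_train):
        # map the features, fit on k-1 parts, evaluate on the held-out part
        poly = PolynomialFeatures(degree=p)
        Phi_fit = poly.fit_transform(x_train[fit_idx])
        Phi_val = poly.transform(x_train[val_idx])
        w, _, _, _ = numpy.linalg.lstsq(Phi_fit, y_train[fit_idx], rcond=None)
        pred = numpy.dot(Phi_val, w)
        errors.append(numpy.mean((pred - y_train[val_idx]) ** 2))
    val_errors[p] = numpy.mean(errors)
    print('p = ' + str(p) + ': validation error ' + str(val_errors[p]))

best_p = min(val_errors, key=val_errors.get)
print('chosen hyper-parameter p = ' + str(best_p))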
Validation errors on the degree grid:

p = 1: 23.19
p = 2: 21.00
p = 3: 18.54
p = 4: 24.36
p = 5: 27.96
p = 6: 33.10

$p = 3$ gives the smallest validation error, so it is the chosen hyper-parameter.
[Figure: a model maps inputs to predicted labels $\hat{\mathbf{y}}$.]
Submission

[Figure: two submission scenarios, each showing Score = 0.9527 with part of the evaluation kept secret.]

Answer: the reported score can be abused for hyper-parameter tuning (cheating).
Summary