DL Lec03 Part01
Outline
• Formalization of statistical learning of classifiers
• Ways to train linear classifiers
Linear regression
Perceptron training algorithm
Logistic regression
Support vector machines
• Gradient descent and stochastic gradient descent
Formalization
• Let’s focus on statistical learning of a parametric model in a
supervised scenario
Formalization
• Given: training data {(x_i, y_i) : 1 ≤ i ≤ n}, drawn i.i.d. from a distribution
• Find: predictor y = f(x) ∈ H, where H is the hypothesis class
• Goal: make good predictions on test data drawn i.i.d. from the same distribution (this is the connection between training and test data)
Source: Y. Liang
Formalization
• Given: training data {(x_i, y_i) : 1 ≤ i ≤ n}, i.i.d. from a distribution
• Find: predictor f ∈ H
• s.t. the expected loss is small:
  L(f) = E_{x,y}[ ℓ(f, x, y) ]
• Example losses: squared loss (regression); 0-1, logistic, and hinge losses (classification)
• The distribution is unknown, so in practice we minimize the empirical loss on the training data:
  L̂(f) = (1/n) Σ_{i=1}^{n} ℓ(f, x_i, y_i)
Source: Y. Liang
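As a concrete illustration (not from the slides), a minimal NumPy sketch of the empirical loss L̂(f) = (1/n) Σ ℓ(f, x_i, y_i), here instantiated with the squared loss ℓ(f, x, y) = (f(x) − y)²; the toy data and the fixed predictor f(x) = 2x are hypothetical:

```python
import numpy as np

def empirical_loss(f, X, y, loss):
    """Average the per-example loss of predictor f over the training set."""
    return np.mean([loss(f(x_i), y_i) for x_i, y_i in zip(X, y)])

squared_loss = lambda pred, target: (pred - target) ** 2

# Toy data and a fixed linear predictor f(x) = 2x (hypothetical example)
X = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 7.0])
f = lambda x: 2.0 * x

print(empirical_loss(f, X, y, squared_loss))  # mean of (0, 0, 1) -> 0.333...
```

Any other loss function with the same (prediction, target) signature can be swapped in without changing the averaging code.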
Supervised learning in a nutshell
1. Collect training data and labels
2. Specify model: select hypothesis class and loss function
3. Train model: find the function in the hypothesis class that
minimizes the empirical loss on the training data
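The three steps above can be sketched end-to-end. The concrete choices below (linear predictors as the hypothesis class, squared loss, plain gradient descent on noise-free toy data) are illustrative assumptions, not the only possibilities:

```python
import numpy as np

# 1. Collect training data and labels (toy data generated by y = 3x + 1)
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 4.0, 7.0, 10.0])

# 2. Specify model: hypothesis class = linear predictors f(x) = w.x + b,
#    loss function = squared loss
Xb = np.hstack([X, np.ones((len(X), 1))])   # absorb the bias b into the weights
w = np.zeros(Xb.shape[1])

# 3. Train model: minimize the empirical squared loss by gradient descent
lr = 0.05
for _ in range(2000):
    grad = 2 * Xb.T @ (Xb @ w - y) / len(y)  # gradient of mean squared loss
    w -= lr * grad

print(w)  # approaches [3, 1], the slope and intercept that generated the data
```

Step 3 previews the gradient descent material listed in the outline; on this exactly-linear data the empirical loss can be driven to zero.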
Training linear classifiers
• Given: i.i.d. training data {(x_i, y_i) : 1 ≤ i ≤ n}, with labels y_i ∈ {−1, +1}
• Hypothesis class: linear classifiers, f(x) = sgn(w·x + b)
Source: Y. Liang
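As a preview of the perceptron training algorithm listed in the outline, a minimal sketch for this hypothesis class; the toy data set and the epoch cap are illustrative assumptions:

```python
import numpy as np

def perceptron_train(X, y, epochs=100):
    """Perceptron updates: on each mistake, w += y_i * x_i (bias absorbed into w)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append constant feature for bias b
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(Xb, y):
            if y_i * (w @ x_i) <= 0:           # misclassified (or on the boundary)
                w += y_i * x_i
                mistakes += 1
        if mistakes == 0:                      # a full clean pass: data separated
            break
    return w

# Linearly separable toy data with labels in {-1, +1}
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -1.0], [-2.0, 1.0]])
y = np.array([1, 1, -1, -1])
w = perceptron_train(X, y)
preds = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
print(preds)  # matches y on this separable set
```

On linearly separable data the perceptron is guaranteed to stop making mistakes after a finite number of updates, which is why the clean-pass check is a valid stopping criterion.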