Chapter 5: Mind Map: Mathematical Functions


Linear

Classes are distinct and separable

Iris example from an earlier chapter


Better fit = more attributes
Patterns that do not generalize: over-fitting
Mathematical Functions
Over-fitting and its Avoidance
Adding more xi's makes the model more complex
Allows for flexibility when searching the data
wi = a learned parameter
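The linear-function nodes above can be sketched in code. This is only an illustration: the weights below are made-up values, not parameters learned from data.

```python
# Sketch of the linear function the map refers to:
# f(x) = w0 + w1*x1 + w2*x2 + ...  where each wi is a learned parameter.
# Adding more xi's (attributes) adds more wi's, making the model more complex.

def linear_score(x, w):
    """Weighted sum of attributes; w[0] is the intercept term."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

w = [0.5, 1.2, -0.7]          # intercept plus two weights (illustrative, not learned)
x = [2.0, 1.0]                # one example with two attributes
score = linear_score(x, w)    # 0.5 + 1.2*2.0 - 0.7*1.0 = 2.2
label = "positive" if score > 0 else "negative"
```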

Measure accuracy on training and test set


If not pure: estimate based on average

Sweet spot: where it starts to over-fit
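The sweet-spot idea can be illustrated with made-up numbers: training accuracy keeps climbing as complexity grows, while holdout accuracy peaks and then falls where over-fitting sets in.

```python
# Illustrative (made-up) accuracies at increasing model complexity.
complexity = [1, 2, 3, 4, 5]
train_acc  = [0.70, 0.80, 0.88, 0.94, 0.99]  # keeps rising with flexibility
test_acc   = [0.68, 0.76, 0.82, 0.79, 0.71]  # peaks, then over-fitting sets in

# Sweet spot = the complexity where holdout accuracy is highest,
# just before the model starts to over-fit.
best = complexity[test_acc.index(max(test_acc))]
```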

Generalization
Sectioning to get "pure" data
Does not fit other data: over-fit
Over-fitting in Tree Induction
For previously unseen data

memorizes training data and doesn't generalize


Number of nodes = complexity of the tree
Simplest approach = table model (memorizes the training data)
Growing trees until the leaves are pure: how to over-fit
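A minimal sketch of the table model mentioned above, on toy, made-up data: it simply memorizes every training example, so training accuracy is perfect while performance on previously unseen data collapses to guessing.

```python
# Toy "table model": memorize every training example.
# Illustrative data (made up): numeric attribute -> class label.
train = {1: "yes", 2: "no", 3: "yes", 4: "no"}

def table_model(x, default="yes"):
    # Perfect recall on anything it has seen; blind default guess otherwise.
    return train.get(x, default)

# Accuracy on the training set is 100% -- the model just looks answers up.
train_acc = sum(table_model(x) == y for x, y in train.items()) / len(train)

# On previously unseen data it falls back to the default guess, so
# holdout (test) accuracy exposes the lack of generalization.
test = {5: "no", 6: "yes", 7: "no"}
test_acc = sum(table_model(x) == y for x, y in test.items()) / len(test)
# train_acc is 1.0 while test_acc is only 1/3 on this toy data
```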
If fails: more realistic models will fail too

All data-mining models can do this, and do


Recognize over-fitting and manage complexity in a principled way

Based on how complex you allow the model to be


Tendency to tailor models to the training data
Overfitting
At the expense of Generalization
Shows accuracy as a function of model complexity
Fitting Graph

Comparing predicted values with hidden true values
Increases when you allow more flexibility
Generalization Performance
Why is it bad?
Estimated performance
Estimates performance on all data, not just the training sample
Must mistrust accuracy measured on the training set
Cross-validation:
More sophisticated
Churn data-set example
Model will pick up harmful correlations that do not generalize
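The cross-validation node can be sketched as a fold-splitting helper (a hypothetical kfold function, stdlib only): each example serves as test data exactly once and as training data k-1 times.

```python
# Minimal sketch of k-fold cross-validation index splitting.

def kfold(n, k):
    """Yield (train_indices, test_indices) pairs for k folds over n items."""
    indices = list(range(n))
    fold_size = n // k
    for i in range(k):
        start = i * fold_size
        stop = (i + 1) * fold_size if i < k - 1 else n  # last fold takes the remainder
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test

folds = list(kfold(10, 5))
# 5 folds; every index appears in exactly one test split
all_test = [i for _, test in folds for i in test]
```

In practice each fold's model is trained on the train indices and scored on the test indices, and the k scores are averaged for the generalization estimate.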

all models are susceptible to over-fitting effects

Tree induction
Stop growing the tree
Avoidance
Grow the tree until it is too large, then prune it back

Estimate the generalizing performance of each model

Find the right balance


Equations
Parameter optimization
