BookSlides 1 Machine Learning For Predictive Data Analytics
Predictive Data Analytics What is ML? How Does ML Work? Bias Underfitting/Overfitting Lifecycle Road Ahead Summary
Example Applications:
Price Prediction
Fraud Detection
Dosage Prediction
Risk Assessment
Propensity modelling
Diagnosis
Document Classification
...
Ill-posed problems often arise when one tries to infer general laws
from few data:
the hypothesis space is too large
there are not enough data
In general, ML problems are ill-posed because
the solution may be too complex
it may not be unique
it may change radically when one sample is left out
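The non-uniqueness point can be illustrated with a minimal sketch. The two data points and the two candidate models below are invented for illustration and do not come from the slides.

```python
# Hypothetical toy data: two observations of (x, y).
data = [(0.0, 0.0), (1.0, 1.0)]

# Two candidate models drawn from a large hypothesis space (all polynomials).
def linear(x):
    return x

def quadratic(x):
    return x ** 2

candidates = {"linear": linear, "quadratic": quadratic}

# Both models reproduce every training point exactly...
fits = {
    name: all(model(x) == y for x, y in data)
    for name, model in candidates.items()
}

# ...yet they disagree on an unseen input, so the data alone cannot pick one.
predictions_at_2 = {name: model(2.0) for name, model in candidates.items()}

print(fits)              # {'linear': True, 'quadratic': True}
print(predictions_at_2)  # {'linear': 2.0, 'quadratic': 4.0}
```

With only two observations, both hypotheses are equally well supported, which is why some additional assumption is needed to choose between them.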
Table: A full set of potential prediction models before any training data
becomes available.
BBY  ALC  ORG  GRP  M1      M2      M3      M4      M5      ...  M6561
no   no   no   ?    couple  couple  single  couple  couple       couple
no   no   yes  ?    single  couple  single  couple  couple       single
no   yes  no   ?    family  family  single  single  single       family
no   yes  yes  ?    single  single  single  single  single       couple
yes  no   no   ?    couple  couple  family  family  family       family
yes  no   yes  ?    couple  family  family  family  family       couple
yes  yes  no   ?    single  family  family  family  family       single
yes  yes  yes  ?    single  single  family  family  couple       family
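The size of this model space can be checked with a short sketch, assuming (as the table suggests) three binary descriptive features and three target levels. The variable names are illustrative.

```python
from itertools import product

# Three binary descriptive features and three levels of the target (GRP).
feature_values = ["no", "yes"]
target_levels = ["single", "couple", "family"]

# Every possible combination of the three features: 2^3 = 8 rows.
rows = list(product(feature_values, repeat=3))

# A model assigns one target level to each row, so there are 3^8 models.
num_models = len(target_levels) ** len(rows)

print(len(rows), num_models)  # prints: 8 6561
```

The count of 6561 matches the index of the last model column in the table.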
Table: A sample of the models that are consistent with the training
data
BBY  ALC  ORG  GRP     M1      M2      M3      M4      M5      ...  M6561
no   no   no   couple  couple  couple  single  couple  couple       couple
no   no   yes  couple  single  couple  single  couple  couple       single
no   yes  no   ?       family  family  single  single  single       family
no   yes  yes  single  single  single  single  single  single       couple
yes  no   no   ?       couple  couple  family  family  family       family
yes  no   yes  family  couple  family  family  family  family       couple
yes  yes  no   family  single  family  family  family  family       single
yes  yes  yes  ?       single  single  family  family  couple       family
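As a rough sketch, the filtering step can be written as a brute-force enumeration. The training instances are read off the GRP column of the table; the tuple order of the three feature values and all variable names are assumptions made for illustration.

```python
from itertools import product

levels = ["single", "couple", "family"]
rows = list(product(["no", "yes"], repeat=3))  # one tuple per feature combination

# Training instances: the rows whose GRP value is known in the table.
training = {
    ("no", "no", "no"): "couple",
    ("no", "no", "yes"): "couple",
    ("no", "yes", "yes"): "single",
    ("yes", "no", "yes"): "family",
    ("yes", "yes", "no"): "family",
}

# Enumerate every model as a mapping from row to predicted level.
all_models = [dict(zip(rows, preds)) for preds in product(levels, repeat=len(rows))]

# Keep only the models that agree with every training instance.
consistent = [
    model for model in all_models
    if all(model[row] == label for row, label in training.items())
]

print(len(all_models), len(consistent))  # prints: 6561 27
```

Three rows have no training label, so 3^3 = 27 models remain consistent. The training data alone cannot choose among them, which is where inductive bias comes in.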
Inductive Bias
In machine learning, the term inductive bias refers to the set
of (explicit or implicit) assumptions that a learning
algorithm makes in order to perform induction, that is, to
generalize from a finite set of observations (training data) to a
general model of the domain.
Every machine learning algorithm with any ability to
generalize beyond the training data it sees has some
type of inductive bias: the assumptions the model makes
in order to learn the target function and to generalize
beyond the training data.
For example, a linear regression model assumes that
the output (dependent variable) is linearly related to the
independent variables (linear in the weights). This assumption
is an inductive bias of the model.
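A minimal sketch of this bias, using NumPy's `polyfit` on invented data: a straight-line model cannot represent a quadratic relationship, while adding an x^2 basis function (still linear in the weights) can. The data and variable names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical data generated from a quadratic relationship, y = x^2.
x = np.linspace(-3.0, 3.0, 13)
y = x ** 2

# Model restricted to straight lines: y is approximated by w1 * x + w0.
w_line = np.polyfit(x, y, deg=1)
mse_line = float(np.mean((np.polyval(w_line, x) - y) ** 2))

# Still linear in the weights, but with an extra basis function x^2.
w_quad = np.polyfit(x, y, deg=2)
mse_quad = float(np.mean((np.polyval(w_quad, x) - y) ** 2))

print(mse_line, mse_quad)
```

The straight-line fit leaves a large error however the weights are chosen, because the assumed form of the model excludes the true relationship.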
Restriction Bias
Restriction bias is the representational power of an
algorithm, that is, the set of hypotheses the algorithm will
consider. In other words, restriction bias tells us what
our model is able to represent.
Preference Bias
Preference bias is simply which representation(s) a
supervised learning algorithm prefers. For example, a
decision tree algorithm might prefer shorter, less complex
trees. In other words, it is our algorithm's belief about
what makes a good hypothesis.
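The two kinds of bias can be sketched together on a toy problem. Here the restriction bias is a hypothetical limit to polynomials of degree 0 to 3, and the preference bias is a hypothetical rule that picks the lowest degree fitting the training data within a tolerance. Neither choice comes from the slides.

```python
import numpy as np

# Hypothetical training data lying exactly on a straight line, y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Restriction bias: only polynomials of degree 0 to 3 are considered.
allowed_degrees = [0, 1, 2, 3]

def training_mse(degree):
    """Mean squared training error of the least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, deg=degree)
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Preference bias: among hypotheses that fit the data (within a tolerance),
# prefer the simplest one, here the lowest degree.
tolerance = 1e-9
chosen_degree = next(d for d in allowed_degrees if training_mse(d) <= tolerance)

print(chosen_degree)  # prints: 1
```

Degrees 1, 2 and 3 all fit these points, so the restriction alone does not decide; the preference for simpler hypotheses selects degree 1.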
No free lunch!
What can happen if we choose the wrong inductive bias:
1. underfitting
2. overfitting
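A small sketch of both failure modes, using polynomial degree as a stand-in for the strength of the inductive bias. The age and income values are synthetic and invented for illustration; they are not the data plotted in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, hypothetical data: income rises then falls with age, plus noise.
age = np.linspace(20.0, 80.0, 12)
t = (age - 50.0) / 30.0  # rescale age to [-1, 1] for numerically stable fits
income = 60000.0 - 30000.0 * t ** 2 + rng.normal(0.0, 3000.0, size=t.size)

def training_mse(degree):
    """Mean squared training error of the least-squares polynomial fit."""
    coeffs = np.polyfit(t, income, deg=degree)
    return float(np.mean((np.polyval(coeffs, t) - income) ** 2))

# Degree 1 is too simple for a curved trend (risk of underfitting);
# degree 8 has enough freedom to chase the noise (risk of overfitting).
errors = {degree: training_mse(degree) for degree in (1, 2, 8)}

for degree, mse in errors.items():
    print(degree, round(mse))
```

Training error can only fall as the degree rises, so a low training error on its own does not indicate a good model. Detecting overfitting requires evaluating on data that was not used for fitting.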
[Figure: scatter plots of Income (y-axis, 20000 to 80000) against Age (x-axis, 0 to 100), shown first as single plots and then as a four-panel version.]
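[Diagram: the six CRISP-DM phases (Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment) arranged around the data.]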
Figure: A diagram of the CRISP-DM process which shows the six key
phases and indicates the important relationships between them. This
figure is based on Figure 2 of [?].
Summary