Ai&ml Unit 4
Concept Learning and The General-to-Specific Ordering: Introduction, A Concept Learning Task, Concept Learning as Search, FIND-S: Finding a Maximally Specific Hypothesis, Version Spaces and the Candidate Elimination Algorithm, Remarks on Version Spaces and Candidate-Elimination, Inductive Bias.

Machine Learning is the study of computer algorithms that improve automatically through experience.
Tom Mitchell
Traditional Programming
Data + Program → Computer → Output

Machine Learning
Data + Output → Computer → Program

Working Applications of ML
• Classification of mortgages
• Predicting portfolio performance
• Electrical power control
• Chemical process control
• Character recognition
• Face recognition
• DNA classification
• Credit card fraud detection
• Cancer cell detection
WEBSITES
http://videolectures.net/mlas06_mitchell_itm/

Sample Applications
• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging
• [Your favorite area]

Applying AI, we wanted to build better and more intelligent machines.
Machine learning draws on results from many fields, including statistics, artificial intelligence, philosophy, information theory, biology, cognitive science, computational complexity, and control theory.
[Figure: machine learning at the center, surrounded by AI, philosophy, statistics, information theory, control theory, and biology]

WELL-POSED LEARNING PROBLEMS
Computer learning is broadly defined as follows.
Definition: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
2. A handwriting recognition learning problem:
Task T: recognizing and classifying handwritten words within images
Performance measure P: percent of words correctly classified
Training experience E: a database of handwritten words with given classifications

The goal is to define precisely a class of problems that encompasses interesting forms of learning, to explore algorithms that solve such problems, and to understand the fundamental structure of learning problems and processes.
DESIGNING A LEARNING SYSTEM
1. Choosing the Training Experience
2. Choosing the Target Function
3. Choosing a Representation for the Target Function

Let us consider designing a program to learn to play checkers, with the goal of entering it in the world checkers tournament.

1. Choosing the Training Experience
The first design choice we face is to choose the type of training experience from which our system will learn.
Let us call this target function V and again use the notation $V : B \to \mathbb{R}$ to denote that V maps any legal board state from the set B to some real value (we use $\mathbb{R}$ to denote the set of real numbers).
If the system can successfully learn such a target function V, then it can easily use it to select the best move from any current board position.
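The move-selection idea can be sketched in a few lines: score each board reachable in one legal move with V and take the highest-scoring one. This is only an illustration. The function name `best_successor`, the string stand-ins for boards, and the numeric values are all assumptions made up for the example, not part of the original design.

```python
from typing import Callable, Iterable, TypeVar

Board = TypeVar("Board")


def best_successor(successors: Iterable[Board], V: Callable[[Board], float]) -> Board:
    """Return the successor board with the highest value under V."""
    return max(successors, key=V)


# Toy stand-ins (hypothetical): three successor boards and invented scores.
toy_values = {"b1": 12.0, "b2": -3.5, "b3": 40.0}

chosen = best_successor(toy_values.keys(), lambda b: toy_values[b])
print(chosen)  # b3
```

In a real checkers program the successors would come from a legal-move generator, which is outside the scope of this sketch.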
[Figure: state space search from board b to a successor board b1 after one black move; V(b) = ?, V(b1) = ?; black wins: V(b) = 100]
x1: the number of black pieces on the board
x2: the number of red pieces on the board
For instance, the following training example describes a board state b in which black has won the game (note x2 = 0 indicates that red has no remaining pieces) and for which the target function value $V_{train}(b)$ is therefore +100.

Task T: playing checkers
Performance measure P: percent of games won in the world tournament
Training experience E: games played against itself
Target function: $V : \text{Board} \to \mathbb{R}$
Target function representation
The first three items above correspond to the specification of the learning task, whereas the final two items constitute design choices for the implementation of the learning program.

Below we describe a procedure that first derives such training examples from the indirect training experience available to the learner, then adjusts the weights wi to best fit these training examples.
a) Estimating Training Values
$w_i \leftarrow w_i + \eta \, \left( V_{train}(b) - \hat{V}(b) \right) x_i$
where $\eta$ is a small constant (e.g., 0.1)
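A minimal sketch of one application of this weight update follows. It assumes a linear evaluation function with a bias weight w0 paired with a constant feature x0 = 1. The helper names `v_hat` and `lms_update`, the two-feature board description, and the numbers are illustrative assumptions rather than the original program.

```python
from typing import List, Sequence


def v_hat(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear estimate: w0 + w1*x1 + ... + wn*xn (w0 is an assumed bias term)."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights: Sequence[float], features: Sequence[float],
               v_train: float, eta: float = 0.1) -> List[float]:
    """Apply w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i to every weight."""
    error = v_train - v_hat(weights, features)
    xs = [1.0] + list(features)  # assumed x0 = 1 so the bias follows the same rule
    return [w + eta * error * x for w, x in zip(weights, xs)]


# Hypothetical board: x1 = 3 black pieces, x2 = 0 red pieces, black has won.
weights = [0.0, 0.0, 0.0]
features = [3.0, 0.0]
new_weights = lms_update(weights, features, v_train=100.0, eta=0.01)
print(new_weights, v_hat(new_weights, features))
```

Each update moves the estimate toward the training value, and weights whose feature is zero (here x2) are left unchanged.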
Final Design
1. The Performance Module: Takes as input a new board and outputs a trace of the game it played against itself.
2. The Critic: Takes as input the trace of a game and outputs a set of training examples of the target function.
[Figure: Final Design, showing the Experiment Generator, a new problem (initial game board), and the hypothesis V̂]
… data and any prior knowledge held by the learner.

[Figure: Complete Design
 Determine Representation of Learned Function: Polynomial, Artificial neural network, Linear function of six features, …
 Determine Learning Algorithm: Gradient descent, Linear programming, …]
Issues in Machine Learning
• What algorithms can approximate functions well, and when?
• How much training data is sufficient?
• When and how can prior knowledge held by the learner guide the process of generalizing from examples?
• What is the best strategy for choosing a useful next training experience, and how does the choice of this strategy alter the complexity of the learning problem?
• What is the best way to reduce the learning task to one or more function approximation problems? Put another way, what specific functions should the system attempt to learn? Can this process itself be automated?
• How can the learner automatically alter its representation to improve its ability to represent and learn the target function?

Concept Learning and The General-to-Specific Ordering
A Concept Learning Task
Target Concept
Example: “Days on which my friend Aldo enjoys his favorite water sport”
(or find “Days on which the beach will be crowded”)
Task
Learn to predict the value of EnjoySport/Crowded for an arbitrary day

Training Examples for the Target Concept
Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes
TABLE 2.1 Positive and negative training examples for the target concept EnjoySport.

6 attributes (nominal-valued (symbolic) attributes):
Sky (Sunny, Rainy, Cloudy), AirTemp (Warm, Cold), Humidity (Normal, High),
Wind (Strong, Weak), Water (Warm, Cool), Forecast (Same, Change)

For each attribute, the hypothesis will either
• indicate by a "?" that any value is acceptable for this attribute,
• specify a single required value (e.g., Warm) for the attribute, or
• indicate by a "Ø" that no value is acceptable.

The hypothesis that Aldo enjoys his favourite sport only on cold days with high humidity is represented by the expression
<?, Cold, High, ?, ?, ?>

The most general hypothesis: every day is a positive example of this concept
<?, ?, ?, ?, ?, ?>

The most specific possible hypothesis: no day is a positive example of this concept
<Ø, Ø, Ø, Ø, Ø, Ø>
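The hypothesis notation can be made concrete with a small matching function. This is a sketch under the assumption that hypotheses and days are stored as 6-tuples of strings in the attribute order of Table 2.1. The names `matches`, `ANY`, and `NONE` are chosen for the example.

```python
ANY = "?"    # any value is acceptable for this attribute
NONE = "Ø"   # no value is acceptable for this attribute


def matches(h, x):
    """True if hypothesis h classifies instance x as positive.

    A "Ø" constraint never equals an attribute value, so any hypothesis
    containing it rejects every instance.
    """
    return all(hc == ANY or hc == xc for hc, xc in zip(h, x))


# Two days taken from Table 2.1.
x1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
x3 = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")

cold_and_humid = (ANY, "Cold", "High", ANY, ANY, ANY)
most_general = (ANY,) * 6
most_specific = (NONE,) * 6

print(matches(cold_and_humid, x3))  # True
print(matches(cold_and_humid, x1))  # False
print(matches(most_general, x1))    # True
print(matches(most_specific, x1))   # False
```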
1. Notation
Instances for which c(x) = 1 are called positive examples, or members of the target concept.

2. The Inductive Learning Hypothesis
FIGURE 2.1 Instances, hypotheses, and the more-general-than relation.

Find-S: Finding a maximally specific hypothesis
Find-S is guaranteed to output the most specific hypothesis within H that is consistent with the positive training examples.

[Figure: Find-S trace over instances X and hypotheses H, ordered from specific to general, with instances x1+, x2+, x4+ and hypotheses h2,3 and h4]
h0 = <Ø, Ø, Ø, Ø, Ø, Ø>
x1 = <Sunny, Warm, Normal, Strong, Warm, Same>, +
h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
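The trace above can be continued through all four EnjoySport examples with a short sketch of Find-S. It assumes the conjunctive tuple representation with "?" and "Ø" used in these slides. The function name `find_s` and the `(instance, label)` pair format are assumptions made for the illustration.

```python
ANY, NONE = "?", "Ø"

# The four training examples from Table 2.1 as (instance, is_positive) pairs.
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def find_s(examples, n_attrs):
    """Start from the most specific hypothesis and generalize it just
    enough to cover each positive example; negatives are ignored."""
    h = [NONE] * n_attrs
    for x, is_positive in examples:
        if not is_positive:
            continue
        for i, value in enumerate(x):
            if h[i] == NONE:      # first positive example: copy its value
                h[i] = value
            elif h[i] != value:   # conflicting values: relax to "?"
                h[i] = ANY
    return tuple(h)


h_final = find_s(EXAMPLES, 6)
print(h_final)  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Because negative examples are skipped, the sketch never checks whether the result is consistent with them, which is one reason Find-S alone gives no information about the other consistent hypotheses.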
This subset of all hypotheses is called the version space with respect to the hypothesis space H and the training examples D.
Problems
• The hypothesis space must be finite
• Enumerating all the hypotheses is rather inefficient
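Both problems are easy to see in a brute-force sketch that enumerates every conjunctive hypothesis for EnjoySport and keeps only those consistent with the training examples (Mitchell's textbook calls this approach List-Then-Eliminate). The attribute domains and examples come from Table 2.1, while the function names are assumptions made for the illustration.

```python
from itertools import product

DOMAINS = [
    ("Sunny", "Rainy", "Cloudy"), ("Warm", "Cold"), ("Normal", "High"),
    ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change"),
]
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def matches(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(hc == "?" or hc == xc for hc, xc in zip(h, x))


def consistent(h, examples):
    """h is consistent if it agrees with the label of every example."""
    return all(matches(h, x) == label for x, label in examples)


def list_then_eliminate(domains, examples):
    """Enumerate every hypothesis, then drop the inconsistent ones."""
    all_empty = ("Ø",) * len(domains)  # stands for every hypothesis containing Ø
    candidates = [all_empty] + list(product(*[vals + ("?",) for vals in domains]))
    return candidates, [h for h in candidates if consistent(h, examples)]


candidates, version_space = list_then_eliminate(DOMAINS, EXAMPLES)
print(len(candidates), len(version_space))  # 973 6
```

Even this tiny problem requires testing 973 candidates to keep 6, and the candidate count grows multiplicatively with every added attribute.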
A More Compact Representation for Version Spaces
This version space, containing all 6 hypotheses, can be compactly represented by its most specific (S) and most general (G) sets.
How to generate all h in VS, given G and S?

A simple way to understand the Candidate-Elimination algorithm
x1 = <Sunny, Warm, Normal, Strong, Warm, Same>, +
x2 = <Sunny, Warm, High, Strong, Warm, Same>, +
x3 = <Rainy, Cold, High, Strong, Warm, Change>, -
x4 = <Sunny, Warm, High, Strong, Cool, Change>, +
<Sunny, ?, ?, Strong, ?, ?> <Sunny, Warm, ?, ?, ?, ?> <?, Warm, ?, Strong, ?, ?>
G4: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}
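One way to answer the question of generating every h from S and G is sketched below: generalize each member of S attribute by attribute, and keep the results that some member of G is at least as general as. The S boundary used here, <Sunny, Warm, ?, Strong, ?, ?>, is an assumption: it is the maximally specific hypothesis for these four examples but is not listed in the text above. The function names are illustrative as well.

```python
from itertools import product


def more_general_or_equal(h1, h2):
    """True if h1 covers every instance that h2 covers (neither contains Ø)."""
    return all(a == "?" or a == b for a, b in zip(h1, h2))


def hypotheses_between(S, G):
    """All hypotheses h with s <= h <= g for some s in S and g in G."""
    found = set()
    for s in S:
        # Each attribute either keeps its value from s or is relaxed to "?",
        # so every generated h is at least as general as s.
        options = [(value, "?") for value in s]
        for h in product(*options):
            if any(more_general_or_equal(g, h) for g in G):
                found.add(h)
    return found


S4 = {("Sunny", "Warm", "?", "Strong", "?", "?")}   # assumed S boundary
G4 = {("Sunny", "?", "?", "?", "?", "?"), ("?", "Warm", "?", "?", "?", "?")}

vs = hypotheses_between(S4, G4)
for h in sorted(vs):
    print(h)
print(len(vs))  # 6
```

The six results are the S member, the two G members, and the three intermediate hypotheses listed above.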
Candidate Elimination Algorithm
The CANDIDATE-ELIMINATION algorithm computes the version space containing all hypotheses from H that are consistent with an observed sequence of training examples.
- Initialize G to the set of maximally general hypotheses in H
- Initialize S to the set of maximally specific hypotheses in H
G0 ← {<?, ?, ?, ?, ?, ?>}
S0 ← {<Ø, Ø, Ø, Ø, Ø, Ø>}

Candidate-Elimination Learning Algorithm:
G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H
For each training example d, do
  If d is positive
  • Remove from G any hypothesis inconsistent with d
  • For each hypothesis s in S that is inconsistent with d
    • Remove s from S
    • Add to S all minimal generalizations h of s such that
      1. h is consistent with d, and
      2. some member of G is more general than h
    • Remove from S any hypothesis that is more general than another hypothesis in S
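The steps above can be sketched for the conjunctive EnjoySport representation as follows. The positive-example branch follows the listing above. The negative-example branch is not shown in that listing, so the code fills it in with the standard symmetric rule (specialize G, prune S) as an assumption. All function names are illustrative.

```python
ANY, NONE = "?", "Ø"


def matches(h, x):
    return all(a == ANY or a == b for a, b in zip(h, x))


def more_general_or_equal(h1, h2):
    # A Ø in h2 covers nothing, so any constraint in h1 is at least as general.
    return all(a == ANY or a == b or b == NONE for a, b in zip(h1, h2))


def min_generalization(s, x):
    """Smallest generalization of s that covers x (unique for conjunctions)."""
    return tuple(xv if sv == NONE else (sv if sv == xv else ANY)
                 for sv, xv in zip(s, x))


def min_specializations(g, x, domains):
    """Replace one '?' in g by a value that differs from x, excluding x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, gv in enumerate(g) if gv == ANY
            for v in domains[i] if v != x[i]]


def candidate_elimination(examples, domains):
    S, G = {(NONE,) * len(domains)}, {(ANY,) * len(domains)}
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}
            new_S = set()
            for s in S:
                h = s if matches(s, x) else min_generalization(s, x)
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.add(h)
            # Drop members that are more general than another member of S.
            S = {s for s in new_S
                 if not any(s != t and more_general_or_equal(s, t) for t in new_S)}
        else:  # assumed symmetric rule, not part of the listing above
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                candidates = [g] if not matches(g, x) else min_specializations(g, x, domains)
                new_G.update(h for h in candidates
                             if any(more_general_or_equal(h, s) for s in S))
            # Drop members that are less general than another member of G.
            G = {g for g in new_G
                 if not any(g != t and more_general_or_equal(t, g) for t in new_G)}
    return S, G


DOMAINS = [("Sunny", "Rainy", "Cloudy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change")]
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

S, G = candidate_elimination(EXAMPLES, DOMAINS)
print("S:", S)
print("G:", G)
```

On the four EnjoySport examples, this sketch ends with S containing <Sunny, Warm, ?, Strong, ?, ?> and G containing <Sunny, ?, ?, ?, ?, ?> and <?, Warm, ?, ?, ?, ?>.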
1) Will the Candidate-Elimination Algorithm Converge to the Correct Hypothesis?
The version space learned by the Candidate-Elimination algorithm will converge toward the hypothesis that correctly describes the target concept, provided (1) there are no errors in the training examples, and (2) there is some hypothesis in H that correctly describes the target concept.

We use the term query to refer to such instances constructed by the learner, which are then classified by an external oracle (e.g., nature or a teacher).
S: {(x1 ∨ x2 ∨ x3)}
The G boundary will consist of the hypothesis that rules out only the observed negative examples:
G: {¬(x4 ∨ x5)}
[Figure: inductive system]
… three learning algorithms, which are listed from weakest to strongest bias.