
Chapter 1

Introduction to Machine
Learning
What is Learning?
• Herbert Simon: “Learning is any process by which a
system improves performance from experience.”
• “Learning denotes changes in a system that ... enable
a system to do the same task … more efficiently the
next time.” (Herbert Simon)
• What is the task?
– Classification
– Categorization/clustering
– Problem solving / planning / control
– Prediction
– others
What is Machine Learning?
• “ML is changes in a system that enable it to do the same
task, or tasks drawn from the same population, more
efficiently and more effectively the next time.” (Simon
1983)
• A branch of artificial intelligence, concerned with the
design and development of algorithms that allow
computers to evolve behaviors based on empirical data.

There are two ways that a system can improve:


1. By acquiring new knowledge
– acquiring new facts
– acquiring new skills
2. By adapting its behavior
– solving problems more accurately
– solving problems more efficiently
What is Machine Learning?
• The complexity in traditional computer
programming is in the code (programs that people
write).
• In machine learning, algorithms (programs) are in
principle simple and the complexity (structure)
is in the data.
• Is there a way that we can automatically learn that
structure? That is what is at the heart of machine
learning.
That is, machine learning is about the construction
and study of systems that can learn from data. This is
very different from traditional computer programming.
Traditional Programming

Data + Program → Computer → Output

Machine Learning

Data + Output → Computer → Program
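The contrast between the two diagrams can be sketched in a few lines of Python. In the traditional case a person writes the rule; in the learning case the rule (here, the coefficients of a line) is estimated from (input, output) examples. The temperature-conversion task and the data below are illustrative choices, not from the slides.

```python
# Traditional programming: the human writes the rule.
def fahrenheit_traditional(celsius):
    return celsius * 9 / 5 + 32  # rule coded by hand

# Machine learning: the rule is fit from (input, output) examples.
def fit_linear(examples):
    """Least-squares fit of output = a * input + b from data."""
    n = len(examples)
    sx = sum(x for x, _ in examples)
    sy = sum(y for _, y in examples)
    sxx = sum(x * x for x, _ in examples)
    sxy = sum(x * y for x, y in examples)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda x: a * x + b

# The "program" is now produced from Data + Output.
data = [(0, 32.0), (100, 212.0), (37, 98.6)]
learned = fit_linear(data)
```

Both functions end up computing the same conversion, but in the second case the complexity lives in the data, not the code.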
Machine Learning is…
Machine learning is about predicting the future based on
the past.
-- Hal Daume III

past: Training Data → model/predictor
future: Testing Data → model/predictor
Why Machine Learning?
• No human experts
– industrial/manufacturing control
– mass spectrometer analysis, drug design, astronomic
discovery
• Black-box human expertise
– face/handwriting/speech recognition
– driving a car, flying a plane
• Rapidly changing phenomena
– credit scoring, financial modeling
– diagnosis, fraud detection
• Need for customization/personalization
– personalized news reader
– movie/book recommendation
Examples of Machine Learning Problems
• Pattern Recognition
– Facial identities or facial expressions
– Handwritten or spoken words (e.g., Siri)
– Medical images
– Sensor Data/IoT
• Optimization
– Many parameters have “hidden” relationships that can be the basis of
optimization
• Pattern Generation
– Generating images or motion sequences
• Anomaly Detection
– Unusual patterns in the telemetry from physical and/or virtual plants (e.g.,
data centers)
– Unusual sequences of credit card transactions
– Unusual patterns of sensor data from a nuclear power plant
• or unusual sound in your car engine or …

• Prediction
– Future stock prices or currency exchange rates
Defining the Learning Task
Improve on task, T, with respect to
performance metric, P, based on experience, E.
T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Categorize email messages as spam or legitimate.


P: Percentage of email messages correctly classified.
E: Database of emails, some with human-given labels

T: Driving on four-lane highways using vision sensors


P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while
observing a human driver.

T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself
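The metric P in the examples above is usually just a score computed over labeled examples. A minimal sketch of "percentage correctly classified" (function name and data are illustrative):

```python
def percent_correct(predictions, labels):
    """P: percentage of examples the learner classified correctly."""
    assert len(predictions) == len(labels)
    hits = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * hits / len(labels)

# E.g. a spam filter's guesses scored against human-given labels (E):
preds  = ["spam", "ham", "spam", "ham"]
labels = ["spam", "ham", "ham",  "ham"]
score = percent_correct(preds, labels)
```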
Designing a Learning System
• Choose the training experience
• Choose exactly what is to be learned, i.e. the
target function.
• Choose how to represent the target function.
• Choose a learning algorithm to infer the target
function from the experience.

Environment/Experience → Learner → Knowledge → Performance Element
Designing a Learning System
• Training is the process of making the system able to learn.
– Training set and testing set come from the same distribution
– Need to make some assumptions or bias
• Performance - There are several factors affecting the
performance:
– Types of training provided
– The form and extent of any initial background knowledge
– The type of feedback provided
– The learning algorithms used
• Algorithms - The success of machine learning system also
depends on the algorithms.
– The algorithms control the search to find and build the knowledge
structures.
– The learning algorithms should extract useful information from training
examples
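The assumption that the training set and testing set come from the same distribution is usually arranged by randomly splitting one dataset. A minimal sketch (the 25% test fraction and the fixed seed are arbitrary choices for illustration):

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle once, then carve off a held-out test set.

    Both halves come from the same underlying distribution
    because they are drawn from the same shuffled dataset.
    """
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

data = list(range(100))
train, test = train_test_split(data)
```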
Types of Learning
• Supervised (inductive) learning
– Training data includes desired outputs
– Learn patterns from (labeled) data.
• Unsupervised learning
– Training data does not include desired outputs
– Learn patterns from (unlabeled) data.
• Semi-supervised learning
– Training data includes a few desired outputs
• Reinforcement learning
– Rewards from sequence of actions
Supervised Learning
• A training set of examples with the correct responses (targets) is provided
• Based on this training set, the algorithm generalises to respond correctly to all
possible inputs.
• This is also called learning from exemplars.
• Supervised learning is the machine learning task of learning a function that
maps an input to an output based on example input-output pairs.
• In supervised learning, each example in the training set is a pair consisting of an
input object (typically a vector) and an output value.
• A supervised learning algorithm analyzes the training data and produces a
function, which can be used for mapping new examples.
• In the optimal case, the function will correctly determine the class labels for
unseen instances.
• Both classification and regression problems are supervised learning problems.
• A wide range of supervised learning algorithms are available, each with its
strengths and weaknesses.
– There is no single learning algorithm that works best on all supervised learning
problems.
Supervised Learning
• “Supervised learning” is so called because the process of an algorithm
learning from the training dataset can be thought of as a teacher supervising
the learning process.
• We know the correct answers (that is, the correct outputs); the algorithm
iteratively makes predictions on the training data and is corrected by the
teacher.
• Learning stops when the algorithm achieves an acceptable level of
performance.
• Example Consider the following data regarding patients entering a clinic. The
data consists of the gender and age of the patients and each patient is labeled
as “healthy” or “sick”.

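Since the slide's table is not reproduced here, the records below are invented for illustration. A minimal supervised sketch: a 1-nearest-neighbour rule that labels a new patient like the most similar training patient (the distance function is a crude assumption):

```python
# (gender, age) -> label; illustrative records, not from the slide.
training = [
    (("M", 48), "sick"),
    (("M", 67), "sick"),
    (("F", 53), "healthy"),
    (("M", 49), "sick"),
    (("F", 32), "healthy"),
    (("M", 34), "healthy"),
    (("F", 21), "healthy"),
]

def distance(a, b):
    """Crude mixed-type distance: age gap plus a penalty for gender mismatch."""
    return abs(a[1] - b[1]) + (0 if a[0] == b[0] else 10)

def predict(patient):
    """1-nearest-neighbour: copy the label of the closest training example."""
    return min(training, key=lambda ex: distance(ex[0], patient))[1]
```

Because every training pair carries a human-given label, the algorithm can generalise from exemplars to unseen patients.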
Unsupervised Learning
• Correct responses are not provided, but instead the algorithm tries to identify
similarities between the inputs so that inputs that have something in common are
categorised together.
• The statistical approach to unsupervised learning is known as density estimation.
• Unsupervised learning is a type of machine learning algorithm used to draw inferences
from datasets consisting of input data without labeled responses.
• In unsupervised learning algorithms, a classification or categorization is not included in
the observations.
• There are no output values and so there is no estimation of functions.
• Since the examples given to the learner are unlabeled, the accuracy of the structure that
is output by the algorithm cannot be evaluated.
• The most common unsupervised learning method is cluster analysis, which is used for
exploratory data analysis to find hidden patterns or grouping in data.
• Example - Consider the following data regarding patients entering a clinic. The data
consists of the gender and age of the patients.

Based on this data, can we infer anything regarding the
patients entering the clinic?
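With no labels, the algorithm can still group similar patients. A minimal 1-D k-means sketch over invented ages (k = 2 and the initialization scheme are arbitrary choices):

```python
def kmeans_1d(values, k=2, iters=20):
    """Plain k-means on scalars: assign to nearest centre, recompute centres."""
    centres = sorted(values)[:: max(1, len(values) // k)][:k]  # crude init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)

# Illustrative patient ages; the two age groups emerge without any labels.
ages = [21, 23, 25, 27, 61, 64, 68, 70]
centres = kmeans_1d(ages)
```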
Reinforcement Learning
• This is somewhere between supervised and unsupervised learning.
• The algorithm gets told when the answer is wrong, but does not
get told how to correct it.
• It has to explore and try out different possibilities until it works
out how to get the answer right.
• Reinforcement learning is the problem of getting an agent to act in
the world so as to maximize its rewards.
• A learner (the program) is not told what actions to take as in most
forms of machine learning, but instead must discover which
actions yield the most reward by trying them.
• Consider teaching a dog a new trick: we cannot tell it what to do,
but we can reward/punish it if it does the right/wrong thing.
– It has to find out what it did that made it get the reward/punishment
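That trial-and-error loop can be sketched as a simple epsilon-greedy bandit (the action rewards below are made up): the agent is never told which action is correct, only how much reward each attempt earned.

```python
import random

def run_bandit(rewards, episodes=2000, epsilon=0.1, seed=0):
    """Learn action values purely from sampled rewards (no correct answers)."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)   # estimated value of each action
    counts = [0] * len(rewards)
    for _ in range(episodes):
        if rng.random() < epsilon:              # explore: try something random
            a = rng.randrange(len(rewards))
        else:                                   # exploit: best action so far
            a = max(range(len(rewards)), key=lambda i: values[i])
        r = rewards[a] + rng.gauss(0, 0.1)      # noisy reward from the world
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]  # running-average update
    return values

# Three possible "tricks"; only the agent's own rewards reveal which is best.
estimates = run_bandit([0.2, 1.0, 0.5])
```

Like the dog, the agent discovers the rewarded action only by trying all of them and remembering what paid off.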
Supervised Learning techniques
• Linear classifier (numerical functions)
• Parametric (Probabilistic functions)
– Naïve Bayes, Gaussian discriminant analysis (GDA), Hidden
Markov models (HMM), Probabilistic graphical models
• Non-parametric (Instance-based functions)
– K-nearest neighbors, Kernel regression, Kernel density
estimation, Local regression
• Non-metric (Symbolic functions)
– Classification and regression tree (CART), decision tree
• Aggregation
– Bagging (bootstrap + aggregation), Adaboost, Random forest
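As one concrete entry from this list, a depth-1 decision tree (a "stump") scans every threshold on one feature and keeps the most accurate split. A sketch with made-up data:

```python
def best_stump(xs, ys):
    """Try every threshold on a 1-D feature; keep the most accurate split."""
    best = (0.0, None, 0, 1)  # (accuracy, threshold, left_label, right_label)
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            preds = [left if x < t else right for x in xs]
            acc = sum(p == y for p, y in zip(preds, ys)) / len(ys)
            if acc > best[0]:
                best = (acc, t, left, right)
    return best

# Illustrative feature (hours of exercise per week) and binary labels.
hours  = [0, 1, 2, 5, 6, 8, 9]
labels = [0, 0, 0, 1, 1, 1, 1]
acc, threshold, left, right = best_stump(hours, labels)
```

Full decision trees (CART, ID3) simply apply this threshold search recursively to each side of the split.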
Unsupervised Learning techniques
• Clustering
– K-means clustering
– Spectral clustering
• Density Estimation
– Gaussian mixture model (GMM)
– Graphical models
• Dimensionality reduction
– Principal component analysis (PCA)
– Factor analysis

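As a concrete dimensionality-reduction example, PCA's first component for 2-D data has a closed form: the direction of maximum variance follows from the covariance matrix. A sketch on synthetic points (the data is made up):

```python
import math

def principal_axis(points):
    """Direction of maximum variance for 2-D points (PCA's first component)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form orientation of the top eigenvector of a 2x2 covariance.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    return math.cos(theta), math.sin(theta)   # unit vector

# Points stretched along the line y = x: PCA should recover that direction.
pts = [(i + 0.1 * ((-1) ** i), i) for i in range(10)]
ux, uy = principal_axis(pts)
```

Projecting each point onto this axis reduces the data from two dimensions to one while keeping most of the variance.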
More Recent Classification

Deep Learning
• The data available these days has become so voluminous that
the conventional techniques developed so far often fail to
analyze it and deliver useful predictions.
• Deep learning addresses this by loosely simulating the human
brain with Artificial Neural Networks (ANNs) running on
digital computers.
• The machine now learns largely on its own, using the high
computing power and huge memory resources that are
available today.
• Deep learning has solved many of the previously
unsolvable problems.
• Combining deep networks with reward-driven training (incentives
given to the network as rewards) leads to Deep Reinforcement
Learning.
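The ANN idea can be sketched at its smallest scale: one artificial "neuron" computes a weighted sum of its inputs and pushes it through a nonlinearity. The weights below are hand-picked (to mimic logical OR) rather than learned, purely for illustration:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum, then sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))

# Hand-set weights that make the neuron behave like logical OR.
def or_gate(a, b):
    return neuron([a, b], weights=[10.0, 10.0], bias=-5.0) > 0.5
```

Deep networks stack many layers of such neurons and learn the weights from data instead of setting them by hand.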
Learning Models
• Machine learning is concerned with using the right
features to build the right models that achieve the
right tasks.
• For a given problem, the collection of all possible
outcomes represents the sample space or instance
space.
• The basic idea of learning models divides into three
categories, according to how they treat the instance space:
– Using a logical expression (logical models)
– Using the geometry of the instance space (geometric
models)
– Using probability to classify the instance space (probabilistic
models)
• A further cross-cutting distinction is between grouping
and grading models.
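A toy contrast of the three families on the same made-up 2-D instance space, where the intended concept is "x + y > 1" (all three classifiers below are hand-written illustrations, not learned):

```python
import math

# Logical model: an if-then rule built from feature tests.
def logical(x, y):
    return 1 if (x > 0.5 and y > 0.5) else 0

# Geometric model: a linear boundary through the instance space.
def geometric(x, y):
    return 1 if x + y - 1.0 > 0 else 0

# Probabilistic model: estimate P(class = 1 | x, y) with a logistic function.
def probabilistic(x, y):
    p = 1 / (1 + math.exp(-(4 * (x + y - 1.0))))
    return 1 if p > 0.5 else 0
```

Note how the logical rule carves axis-aligned regions, so it disagrees with the geometric and probabilistic models on points like (0.9, 0.2) that lie above the diagonal but fail one feature test.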
History of Machine Learning
• 1950s
– Samuel’s checker player
– Selfridge’s Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of Perceptron
• 1970s:
– Symbolic concept induction
– Winston’s arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan’s ID3
– Michalski’s AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM
History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism, backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
History of Machine Learning (cont.)
• 2000s
– Support vector machines
– Kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications
• Compilers
• Debugging
• Graphics
• Security (intrusion, virus, and worm detection)
– Email management
– Personalized assistants that learn
– Learning in robotics and vision
Disciplines relevant to ML
• Artificial intelligence
• Bayesian methods
• Control theory
• Information theory
• Computational complexity theory
• Philosophy
• Psychology and neurobiology
• Statistics
• Many practical problems in engineering and
business
Related Fields
Machine learning draws on and overlaps with many fields:
data mining, statistics, control theory, decision theory,
information theory, cognitive science, databases,
evolutionary models, psychological models, and neuroscience.

Machine learning is primarily concerned with the accuracy
and effectiveness of the computer system.
Issues in Machine Learning
• What algorithms can approximate functions well
and when
– How does the number of training examples influence
accuracy
• Problem representation / feature extraction
• Intention/independent learning
• Integrating learning with systems
• What are the theoretical limits of learnability
• Transfer learning
• Continuous learning
Possible areas to Cover in ML
• Supervised learning
– Decision tree induction
– Rule induction
– Instance-based learning
– Bayesian learning
– Neural networks
– Support vector machines
– Model ensembles
– Learning theory
• Unsupervised learning
– Clustering
– Dimensionality reduction
• Reinforcement learning
