
Machine Learning

(BITS F464)
Dr.N.L.Bhanu Murthy
BITS Pilani
Hyderabad Campus
What is Learning?

“Gain knowledge or understanding of or skill in by study, instruction or experience” - Webster

BITS Pilani, Hyderabad Campus


What is Learning?

“Learning is any process by which a system improves performance from experience.” - Herbert Simon

Herbert Simon (1916 - 2001)
Researcher in: Artificial Intelligence, Cognitive psychology, Computer science, Economics, Political science
Professor at: Carnegie Mellon University; University of California, Berkeley; Illinois Institute of Technology

Awards:
Turing Award, 1975
Nobel Prize in Economics, 1978
National Medal of Science, 1986
von Neumann Theory Prize, 1988

What is Machine Learning?

Machine Learning is the study of algorithms that
improve their performance P
at some task T
with experience E
- Tom Mitchell (1990)

Well-defined learning task: <P,T,E>

Example - Machine Learning
Handwriting Recognition
T: Recognizing handwritten words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words
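
As a rough illustration of a performance measure P such as "percentage of words correctly classified", the sketch below scores a list of predictions against human labels. The function name `percent_correct` and the sample words are made up for illustration and do not come from the slides.

```python
def percent_correct(predicted, actual):
    """Performance measure P: percentage of items classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labelled example")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical labels, invented for illustration only.
actual = ["cat", "dog", "sun", "tree"]
predicted = ["cat", "dog", "son", "tree"]  # one word is misread
score = percent_correct(predicted, actual)
print(score)
```

A learner improves "with experience E" if this score goes up as it sees more labelled examples.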

Example - Machine Learning
T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while observing a human driver.

Example - Machine Learning

Learning to drive an autonomous vehicle (Pomerleau, 1989).

Examples of Successful Applications of Machine Learning

• Learning to recognize spoken words (Lee, 1989; Waibel, 1989).
• Learning to classify new astronomical structures (Fayyad et al., 1995).
• Learning to play world-class backgammon (Tesauro 1992, 1995).
• Learning to categorize email messages as spam or legitimate.

Garry Kasparov on loss to Deep Blue

• Human uses 1% calculation, 99% understanding
– based on patterns, drawing information from experience
• Machine is the opposite: 99% calculation, 1% understanding
– though this understanding is growing

Machine Learning: Magic?

No, more like gardening

• Seeds = Algorithms
• Nutrients = Data
• Gardener = You
• Plants = Programs

Machine Learning in Computer Science

[Diagram: Machine Learning at the centre, linked to the following areas]
• Speech/Audio Processing
• Robotics
• Planning
• Natural Language Processing
• Vision/Image Processing
• Biomedical/Chemical Informatics
• Human Computer Interaction
• Financial Modeling
• Analytics

They said it!!
• “A breakthrough in machine learning would be worth ten Microsofts” - Bill Gates, Chairman, Microsoft
• “Machine learning is the hot new thing” - John Hennessy, President, Stanford
• “Web rankings today are mostly a matter of machine learning” - Prabhakar Raghavan, Dir. Research, Yahoo
• “Machine learning is going to result in a real revolution” - Greg Papadopoulos, CTO, Sun
• “Machine learning is today’s discontinuity” - Jerry Yang, CEO, Yahoo
• “Machine learning is the next Internet” - Tony Tether, Director, DARPA
Future prospects..

History of Technology

12 IT skills that employers can't say no to

1) Machine learning
2) Mobilizing applications
3) Wireless networking
4) Human-computer interface
5) Project management
6) General networking skills
7) Network convergence technicians
8) Open-source programming
9) Business intelligence systems
10) Embedded security
11) Digital home technology integration
12) .NET, C#, C++, Java -- with an edge

History of Machine Learning

• 1950s
– Samuel’s checkers player
– Selfridge’s Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of Perceptron
• 1970s:
– Symbolic concept induction
– Winston’s arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan’s ID3
– Michalski’s AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM

History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism, backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
History of Machine Learning (cont.)
• 2000s
– Support vector machines
– Kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications
• Compilers
• Debugging
• Graphics
• Security (intrusion, virus, and worm detection)
– Email management
– Personalized assistants that learn
– Learning in robotics and vision

Why does Machine Learning need math?
• Calculus
– We need to identify the maximum likelihood, or minimum risk: optimization
– Integration allows the marginalization of continuous probability density functions
• Linear Algebra
– Many features lead to high dimensional spaces
– Vectors and matrices allow us to compactly describe and manipulate high dimensional feature spaces
• Vector Calculus
– All of the optimization needs to be performed in high dimensional spaces
– Optimization of multiple variables simultaneously: Gradient Descent
– Want to take a marginal over high dimensional distributions like Gaussians
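
The "optimization of multiple variables simultaneously" point can be sketched as plain gradient descent. This is a minimal illustration only: the objective f(x, y) = (x - 3)^2 + 2(y + 1)^2, the learning rate and the number of steps are all assumptions chosen for the example, not values from the slides.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, updating all coordinates at once."""
    point = list(start)
    for _ in range(steps):
        g = grad(point)
        point = [p - learning_rate * gi for p, gi in zip(point, g)]
    return point


# Hypothetical objective: f(x, y) = (x - 3)^2 + 2 * (y + 1)^2, minimised at (3, -1).
def grad_f(point):
    x, y = point
    return [2.0 * (x - 3.0), 4.0 * (y + 1.0)]


minimum = gradient_descent(grad_f, start=[0.0, 0.0])
print(minimum)
```

Each step moves every coordinate at the same time, which is why the gradient (a vector of partial derivatives) is the natural tool in high dimensional spaces.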
Teaching & Evaluation (BITS C464 – L P U – 3 0 3)

Evaluation Components & Criteria


Component      Weightage (out of 200)  Duration    Date              Mode
Mid Test       60                      90 minutes  As per Timetable  Closed Book
Assignments    50                      -           -                 Open Book
Comprehensive  90                      3 hours     As per Timetable  Closed Book

Make-up Policy: Make-up for tests will be granted only with prior permission and on justifiable grounds.

Course Notices: All notices pertaining to this course will be displayed on the LTC
Notice Board as well as the CS & IS Notice Board.

Chamber Consultation: Friday 1600 Hrs – 1700 Hrs


Text Book

T1. Christopher Bishop: Pattern Recognition and Machine Learning, Springer International Edition.

Reference Book

R1. Tom M. Mitchell: Machine Learning, The McGraw-Hill Companies, Inc.

Machine Learning - Examples

Employability Prediction

Features / Attributes / Predictors:
• CGPA
• Communication Skills
• Aptitude
• Programming Skills

S.No.  CGPA  Communication Skills  Aptitude   Programming Skills  Job Offered?
1      9.1   Average               Good       Excellent           Yes
2      8.4   Good                  Good       Good                Yes
3      8.3   Poor                  Average    Average             No
4      7.1   Average               Good       Average             No
5      8.2   Good                  Excellent  Excellent           No

Machine Learning - Examples

Predicting price of a used car

Features / Attributes / Predictors:
• Brand
• Year (Mfg)
• Engine Capacity
• Mileage
• Distance travelled
• Cab?

S.No.  Brand          Year (Mfg)  Engine Capacity  Mileage  Distance travelled  Cab?  Price (in Rs.)
1      Honda City ZX  2008        1100             10.5     45000               N     3,50,000
2
3
4

Machine Learning - Examples
Market Segmentation Study

Features / Attributes / Predictors:
• Family income
• # of visits in a month
• Average money spent in a month
• Zip code

Customers for a retailer may fall into:
• two groups, say big spenders and low spenders
• three groups, say big spenders, medium spenders and low spenders
• four groups, ….

S.No.  Zip Code  Family Income  # of visits in a month  Average Money Spent in a month
1      500078    11,50,000      4                       8,000

Supervised Learning
Feature tuple: (CGPA, Communication Skills, Aptitude, Programming Skills)
Response / Target: Job Offered
Supervised Learning: Fit a model that relates the response to the feature tuples, with the aim of accurately predicting the response for future observations or better understanding the relationship between the response and the features.

S.No.  CGPA  Communication Skills  Aptitude   Programming Skills  Job Offered?
1      9.1   Average               Good       Excellent           Yes
2      8.4   Good                  Good       Good                Yes
3      8.3   Poor                  Average    Average             No
4      7.1   Average               Good       Average             No
5      8.2   Good                  Excellent  Excellent           No
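
One very simple way to relate the response to the feature tuples is a 1-nearest-neighbour rule, sketched below on the five rows of the table. This is only an illustration, not the method the course prescribes: the ordinal encoding of the ratings (Poor = 0 up to Excellent = 3), the unscaled Euclidean distance and the query student are all assumptions.

```python
import math

# Assumed ordinal encoding of the qualitative ratings (not given on the slide).
RATING = {"Poor": 0, "Average": 1, "Good": 2, "Excellent": 3}

# Training data copied from the table above:
# (CGPA, Communication Skills, Aptitude, Programming Skills) -> Job Offered?
TRAINING = [
    ((9.1, "Average", "Good", "Excellent"), "Yes"),
    ((8.4, "Good", "Good", "Good"), "Yes"),
    ((8.3, "Poor", "Average", "Average"), "No"),
    ((7.1, "Average", "Good", "Average"), "No"),
    ((8.2, "Good", "Excellent", "Excellent"), "No"),
]


def encode(features):
    """Turn a feature tuple into a numeric vector."""
    cgpa, communication, aptitude, programming = features
    return (cgpa, RATING[communication], RATING[aptitude], RATING[programming])


def predict(features):
    """Return the response of the closest training tuple (1-nearest neighbour)."""
    query = encode(features)
    nearest = min(TRAINING, key=lambda row: math.dist(query, encode(row[0])))
    return nearest[1]


# Hypothetical future observation, invented for illustration.
prediction = predict((8.5, "Good", "Good", "Good"))
print(prediction)
```

The query is closest to row 2 of the table, so the sketch predicts the same response as that row.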

Unsupervised Learning
Feature tuple: (Zip Code, Family Income, # of visits in a month, Average Money spent in a month)
Response / Target: None
Unsupervised Learning: To discover groups of similar examples within the data set

S.No.  Zip Code  Family Income  # of visits in a month  Average Money Spent in a month
1      500078    11,50,000      4                       8,000
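
Discovering groups without a target can be sketched with k-means (Lloyd's algorithm), one common clustering method; the slides do not say which algorithm is intended. Only the first customer below (4 visits, Rs. 8,000) comes from the table. The other customers, the choice of two features and the starting centroids are hypothetical. Because the features are left unscaled, the spend column dominates the distance.

```python
import math


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# (# of visits in a month, average money spent in a month).
# Only the first row is from the slide; the rest are invented.
customers = [(4, 8000), (5, 9000), (4, 8500), (1, 1500), (2, 2000), (1, 1000)]
labels, centres = k_means(customers, centroids=[customers[0], customers[3]])
print(labels)
```

With two starting centroids the sketch separates the customers into a "big spenders" group and a "low spenders" group; passing three or four centroids would give the finer segmentations mentioned earlier.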

Supervised Learning

Employability Prediction
Features:
• CGPA
• Communication Skills
• Aptitude
• Programming Skills
Response / Target:
• Job Offered?

Predicting price of a used car
Features:
• Brand
• Year (Mfg)
• Engine Capacity
• Mileage
• Distance travelled
• Cab?
Response / Target:
• Price (in Rs.)

Classification & Regression
Classification problems are supervised learning problems where the target/response variable takes only discrete (finite/countable) values.
Example: Employability prediction

Regression problems are supervised learning problems where the target/response is a continuous variable (equivalently, it can take any real value).
Example: Predicting price of a used car
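
To make the regression case concrete, here is a minimal sketch of a least-squares straight-line fit with one feature. The data are hypothetical (distance travelled in thousands of km against price in lakh Rs.) and were chosen to lie exactly on a line; they are not taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical used-car data, invented for illustration.
distance = [10, 20, 30, 40]    # thousands of km
price = [5.0, 4.5, 4.0, 3.5]   # lakh Rs.
slope, intercept = fit_line(distance, price)

# The response is continuous: any real-valued price can be predicted.
predicted_price = slope * 50 + intercept
print(predicted_price)
```

A classifier, by contrast, would return one of a fixed set of labels such as Yes/No rather than a real number.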

Classification & Regression – Examples
Classification
• Predicting whether a patient has a particular disease or not
• Handwritten digit recognition
• Email spam detection

Regression
• Predicting house/property price
• Predicting stock market price
• Predicting sales of a product

Supervised Learning

Decision Tree Learning

• Target Concept
“Days on which my friend, yar, enjoys his favorite water sport”
(you may find it more intuitive to think of the concept “Days on which the beach will be crowded”)
• Task
Learn to predict the value of EnjoySport/Crowded for an arbitrary day
• Training Examples for the Target Concept

Example  Sky    Air Temp  Humidity  Wind    Water  Forecast  Enjoy Sport
0        Sunny  Warm      Normal    Strong  Warm   Same      Yes
1        Sunny  Warm      High      Strong  Warm   Same      Yes
2        Rainy  Cold      High      Strong  Warm   Change    No
3        Sunny  Warm      High      Strong  Cool   Change    Yes
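
Decision tree learners choose which attribute to test by scoring each one with a measure such as entropy or the Gini index. The sketch below computes both on the four training examples above, using the standard textbook definitions. The function names are illustrative, and only the Sky and Wind columns are included to keep the example short.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


# Sky, Wind and EnjoySport columns of the four training examples.
examples = [
    {"Sky": "Sunny", "Wind": "Strong", "EnjoySport": "Yes"},
    {"Sky": "Sunny", "Wind": "Strong", "EnjoySport": "Yes"},
    {"Sky": "Rainy", "Wind": "Strong", "EnjoySport": "No"},
    {"Sky": "Sunny", "Wind": "Strong", "EnjoySport": "Yes"},
]
targets = [row["EnjoySport"] for row in examples]
gain_sky = information_gain(examples, "Sky", "EnjoySport")
gain_wind = information_gain(examples, "Wind", "EnjoySport")
print(entropy(targets), gini(targets), gain_sky, gain_wind)
```

On these four examples Sky separates the classes perfectly, so its gain equals the full entropy of the target, while Wind takes a single value and gains nothing.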

Supervised Learning
Decision Tree Learning
• Hypothesis space search
• Occam’s razor
• Overfitting
• Measure for selecting attributes: Entropy, Gini Index etc.
• Issues in DT learning

Example tree:
Outlook
  Sunny: Humidity
    High: No
    Normal: Yes
  Overcast: Yes
  Rain: Wind
    Strong: No
    Weak: Yes
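
A learned tree is used by walking from the root to a leaf. The sketch below encodes the tree shown above as nested dictionaries. It assumes the flattened diagram reads as Sunny -> Humidity (High: No, Normal: Yes), Overcast -> Yes, Rain -> Wind (Strong: No, Weak: Yes); the representation and the `classify` helper are illustrative choices.

```python
# An inner node maps an attribute name to its branches; a leaf is a class label.
TREE = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}


def classify(tree, example):
    """Walk from the root to a leaf, following the example's attribute values."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[example[attribute]]
    return tree


# Hypothetical day, invented for illustration.
day = {"Outlook": "Sunny", "Humidity": "High", "Wind": "Weak"}
print(classify(TREE, day))
```

Only the attributes on the path are inspected: for a sunny day the Wind value is never consulted.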
Generative and Discriminative Models: An analogy
Task: determine which language a piece of speech is in.

Generative approach: learn each language, then determine which language the speech belongs to.

Discriminative approach: determine the linguistic differences between the languages without learning any language, a much easier task!

Taxonomy of ML Models

Supervised Learning

Fisher’s linear discriminant

Supervised Learning

Perceptron Algorithm
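
The slide names the algorithm without spelling it out, so the following is only a sketch of the classic perceptron update rule (add label * x to the weights whenever a sample is misclassified). It assumes labels of +1 and -1; the learning rate, epoch limit and toy data (a logical AND, which is linearly separable) are invented for illustration.

```python
def train_perceptron(samples, epochs=100, learning_rate=1.0):
    """Classic perceptron rule: update the weights only on misclassified samples."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, label in samples:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if label * activation <= 0:  # wrong side of (or on) the boundary
                weights = [w + learning_rate * label * xi for w, xi in zip(weights, x)]
                bias += learning_rate * label
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the learned boundary."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Hypothetical, linearly separable data: logical AND with labels +1 / -1.
samples = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights, bias = train_perceptron(samples)
print([predict(weights, bias, x) for x, _ in samples])
```

On linearly separable data the loop stops once a full pass makes no mistakes. On data that cannot be separated by a line it simply runs until the epoch limit.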

Supervised Learning
Support Vector Machine (SVM), V. Vapnik
[Figure, built up over five slides: points denoted +1 and -1 plotted on axes x1 and x2; the last slide marks the margin, a “safe zone”.]
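
The margin, or "safe zone", can be made concrete by measuring how far the closest point lies from a candidate separating line. The sketch below compares two hand-picked lines on hypothetical +1/-1 points; it does not train an SVM, it only evaluates the quantity that an SVM maximises.

```python
import math


def geometric_margin(weights, bias, samples):
    """Smallest signed distance from any sample to the line w.x + b = 0."""
    norm = math.hypot(*weights)
    return min(
        label * (sum(w * xi for w, xi in zip(weights, x)) + bias) / norm
        for x, label in samples
    )


# Hypothetical points in the (x1, x2) plane with labels +1 / -1.
samples = [((3, 3), 1), ((4, 3), 1), ((1, 1), -1), ((1, 2), -1)]

# Two candidate separators, both chosen by hand for illustration.
margin_a = geometric_margin((1.0, 1.0), -4.5, samples)  # line x1 + x2 = 4.5
margin_b = geometric_margin((1.0, 0.0), -2.0, samples)  # line x1 = 2
print(margin_a, margin_b)
```

Both lines separate the two classes, but the first leaves a wider safe zone, so a maximum-margin method would prefer it over the second.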


Supervised Learning
Artificial Neural Networks (ANN)

• Networks of processing units (neurons) with connections (synapses) between them
• Large number of neurons: about 10^11 in the human brain
• Large connectivity: about 10^4 connections per neuron
• Parallel processing
• Distributed computation/memory
• Robust to noise, failures

Unsupervised Learning

• Clustering Algorithms

Thank You!!
