
Machine Learning Theory

CSE 250C
Introductory Lecture
General Info
•  Instructor: Raef Bassily (rbassily@ucsd.edu)
–  Office Hours: Thu 5-6 PM (4111 Atkinson Hall)
•  TA: Shuang Song (shs037@ucsd.edu)
–  Office Hours: Tue 10-11 AM (CSE Basement: B260A)
•  Website:
http://cseweb.ucsd.edu/classes/sp16/cse250C-a/
Also available at
http://rbassily.eng.ucsd.edu/home/teaching/cse-250c
What is Machine Learning?
•  The automated process of “making sense” out of data:
–  A tool to extract information from data and use it.
•  ML has invaded our daily lives:
–  Search engines, recommendation systems,
–  Email spam detection, fraud detection in credit cards,
–  Personal assistants in smartphones, face detection in digital cameras,
–  Navigation, military applications, medicine, bioinformatics, astronomy, …
•  How is ML different from traditional programming?
–  Endowing programs with the ability to “learn” and adapt to data on their own.
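The contrast can be sketched with a toy example (the rule and data here are made up for illustration): a traditional program hard-codes its rule, while a learning program derives an equivalent rule from labeled examples.

```python
def traditional_is_even(n):
    # Traditional programming: the programmer writes the rule by hand.
    return n % 2 == 0

def learn_is_even(examples):
    # "Learning": infer from labeled examples which residues mod 2
    # carry the label True, and return the induced rule.
    true_residues = {n % 2 for n, label in examples if label}
    return lambda n: (n % 2) in true_residues

# The learned rule is induced entirely from the (made-up) labeled data.
learned_is_even = learn_is_even([(2, True), (3, False), (4, True), (7, False)])
```

Both rules behave identically on new inputs, but only the second would adapt if the labels in the data changed.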
Topics to be covered
•  Part 1: Fundamentals
"  Preliminaries: Tools from Probability
"  PAC Learning
"  Occam’s Razor
"  Learnability via Uniform Convergence
"  The VC-Dimension
•  Goal: to answer fundamental questions of learning:
–  What is learning? How can a machine learn?
–  How do we quantify the amount of data required to learn a certain concept?
–  How can we evaluate the success of the learning process?
–  Is learning always possible?
Topics to be covered
•  Part 2: Key Algorithmic Techniques
"  Boosting: Weak vs. Strong Learnability
"  Convex Learning Problems
"  Regularization and Stability
"  Stochastic Gradient Descent Algorithm
"  One of the following topics:
"  Support Vector Machines (SVMs)
"  Introduction to Online Learning
•  Goal: to present algorithmic techniques widely used in
practice.
Useful Readings
•  M. J. Kearns, U. V. Vazirani, An Introduction to Computational Learning Theory.
•  S. Shalev-Shwartz, S. Ben-David, Understanding Machine Learning: From Theory to Algorithms.
•  Other specific readings may also be suggested in class.
•  There is no textbook for this class.
What is Learning?
•  The process of transforming experience into expertise or knowledge.
Training Data (Experience) → [Learning Algorithm] → Learned Concept or Rule (Expertise)
•  Example: Spam Detection
–  Input: a set of emails, each labeled Spam or Not Spam.
–  Output: a prediction rule to classify emails.
•  How do we evaluate a learning algorithm?
–  Test the output (e.g., the prediction rule) on new, unseen data (called test data).
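The spam-detection workflow can be sketched end to end with made-up emails and a deliberately naive learned rule (flag as spam any email containing a word seen only in spam during training):

```python
def learn_rule(train):
    # Collect the words that appear exclusively in spam training emails.
    spam_words, ham_words = set(), set()
    for text, label in train:
        (spam_words if label == "spam" else ham_words).update(text.lower().split())
    return spam_words - ham_words

def predict(rule, text):
    # Classify an email as spam if it contains any learned spam-only word.
    return "spam" if set(text.lower().split()) & rule else "not spam"

train = [
    ("win a free prize now", "spam"),
    ("claim your free reward", "spam"),
    ("meeting notes attached", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
]
test = [
    ("free prize inside", "spam"),
    ("project meeting tomorrow", "not spam"),
]

rule = learn_rule(train)
# Evaluate the learned rule on the unseen test emails.
test_accuracy = sum(predict(rule, t) == y for t, y in test) / len(test)
```

The point is the split, not the rule: the rule is learned from the training set only, and its quality is judged on the held-out test set.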
Fundamental Questions

Training Data (Experience) → [Learning Algorithm] → Learned Concept or Rule (Expertise)

•  What assumptions do we need for learning to be possible?
–  Training and test data are “similar” in some sense.
–  Some restriction is placed on the class of possible concepts (e.g., prediction rules) to be learned.
Example: Binary Classifiers
•  Should we consider linear classifiers?
•  Should we consider 2nd-degree polynomial classifiers?
•  Should we consider higher-degree polynomial classifiers, or other, more complex functions?
•  Given a fixed model (e.g., a fixed type of prediction rule), how can the machine output the “right” prediction rule?
•  How many data samples are needed to ensure that the output prediction rule generalizes well to unseen data?
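As a preview of one classical answer developed later in the course: for a finite hypothesis class H in the realizable PAC setting, with accuracy parameter ε and confidence parameter δ, a standard bound states that

```latex
% Sample complexity of a finite hypothesis class H (realizable PAC setting):
% with probability at least 1 - \delta, any hypothesis consistent with
% m labeled samples has error at most \epsilon, provided
m \;\ge\; \frac{1}{\epsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right)
```

Note the logarithmic dependence on |H|: richer hypothesis classes demand more data, which is one way of making the next two bullets precise.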
"  The ability to generalize is the essence of any learning system.
"  Simpler prediction rules are easier to generalize.
Administrative Information
Prerequisites
•  Decent knowledge of probability and multivariate calculus.
•  Need to be comfortable working with mathematical abstractions and proofs.
•  Previous exposure to machine learning is useful, but not strictly required.
Assessment
•  Homeworks (including one mini-project): 45%
•  Midterm (in class): 20%
•  Final (take-home): 35%
•  Bonus for top answers on Piazza: up to 5%
Exams
•  Midterm: in class on Mon, May 9.
•  Final: will be posted on the class webpage on the weekend after the last day of class.
–  The deadline for submitting the answers will be decided later (tentatively, due in ~2-3 days).
–  The answers should be returned to me or the TA by the specified due date/time.
Homeworks
•  Homeworks must be returned in class, before the lecture starts, on the specified due date.
–  If you arrive late, please wait until the end of the lecture.
•  No late homeworks will be accepted except in the case of emergencies:
–  Even then, except for documented medical emergencies, 1/3 of the homework grade will be deducted for every day of late submission.
•  The last homework will include a mini-project.
Collaboration Policy
•  Each student may choose one collaborator to work with on the homework.
–  Choosing a collaborator is optional.
•  Each homework group must email me their names by April 13.
–  If you choose to work alone, you still need to email me to confirm your choice.
•  If you are looking for a collaborator, please post on the course group on Piazza.
•  No collaboration is allowed on the final (or the midterm)!
Collaboration Policy
•  Each homework must include a brief account of each collaborator’s contribution.
•  You must not look for homework solutions on the internet.
•  If you receive help from anyone other than me or the TA, or happen to see a solution somewhere, please acknowledge the source.
Grading Policy
•  Solutions to most problems involve proofs. Grading will be based on both correctness and clarity.
•  Be concise. Excessively long proofs are probably incorrect.
•  Show your reasoning in a clear and precise way:
–  A clearly written partial solution earns more partial credit than an attempted full solution with many holes in the argument.
Class Participation
•  Please sign up today for the class forum on Piazza:
http://piazza.com/ucsd/spring2016/cse250c
•  Please engage with your classmates in discussing the course material on the forum!
•  A 5% bonus goes to students with the best answers to the posted questions.
Other Related Courses
•  CSE 250A: Principles of AI: Probabilistic Reasoning and Learning
–  Instructor: Lawrence Saul
•  CSE 291-D: Latent Variable Models
–  Instructor: James Foulds
Calibration Quiz!
