
CSCI417

Machine Intelligence
Dr. Ghada Khoriba
Dr. Ensaf Hussien
Dr. Nermin Negied
Dr. Wael Gomaa

Spring 2024

Agenda
Course Objectives

Course Map & References

Grading Distribution

Introduction to machine learning


Where are we? - What is machine learning (ML)? -
ML Applications - ML Life Cycle - Types of ML
Where to get project ideas
How to read/write a paper: basic idea
Course Objectives

• To understand the fundamental concepts of machine learning.


• To learn how to implement machine learning algorithms using PyTorch and
Scikit-Learn.
• To gain hands-on experience with building machine learning models on real-
world datasets.
• To understand the trade-offs and limitations of different machine learning
techniques.
• To be able to critically evaluate machine learning models and choose the best
solution for a given problem.

Tentative Course Topics
1. Machine Learning Basics
2. Classifying with k-Nearest Neighbors
3. Splitting datasets one feature at a time: decision trees
4. Classifying with probability theory: naïve Bayes
5. Logistic regression
6. Support vector machines
7. Model Evaluation and Improvement: Cross-validation, Grid Search, Evaluation Metrics, and Scoring
8. Ensemble learning and improving classification with the AdaBoost meta-algorithm
9. Introduction to Neural Networks - Building NN for classification (binary/multiclass)
10. Convolutional Neural Network (CNN)
11. Pretrained models (VGG, AlexNet, ...)
12. Machine learning pipeline and use cases

References

For readers / For math lovers / For coders

Grading Criteria
Grading:
• Course Work:
  o Programming Assignments (2 checkpoints): 15% (DataCamp group / Machine learning track)
  o Lecture Quizzes (2 times): 10%
  o Lab Practical Quizzes (2 times): 10%
  o Project:
    – Paper Report and UGRF: 5%
    – Project Implementation and Discussion: 20%
• Midterm: 15%
• Final Exam: 25%

Introduction to Machine Learning

What is artificial intelligence?


The set of all tasks in which a computer can make decisions
What is machine learning?
Machine learning is similar to artificial intelligence, and often, their definitions are
confused. Machine learning (ML) is a part of artificial intelligence, and we define it as
follows:

The set of all tasks in which a computer can make decisions based on data

How do humans make decisions? We make decisions in the following two ways:
• By using logic and reasoning
• By using our experience

Ref: Grokking Machine Learning, Luis G. Serrano


Machine Learning
• ML systems learn how to combine input to produce useful predictions on never-before-seen data.
MODEL
• A set of rules that represent our data and
can be used to make predictions

ALGORITHM
A procedure, or a set of steps, used to
solve a problem or perform a
computation. Usually, the goal of an
algorithm is to build a model.
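
The distinction between ALGORITHM and MODEL can be sketched in a few lines of Python. This is only an illustration: the data (hours studied, passed or not) and the names learn_threshold and make_model are invented here and do not come from the course material. The algorithm searches for a threshold; the model is the resulting rule.

```python
# Hypothetical toy data: (hours studied, passed the exam?) pairs, invented for illustration.
data = [(1, False), (2, False), (3, False), (4, True), (5, True), (6, True)]


def learn_threshold(examples):
    """ALGORITHM: try each observed value as a threshold, keep the one with fewest mistakes."""
    best_threshold, best_errors = None, len(examples) + 1
    for candidate, _ in examples:
        # Count examples where the rule "hours >= candidate" disagrees with the label.
        errors = sum((hours >= candidate) != passed for hours, passed in examples)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


def make_model(threshold):
    """MODEL: a rule that represents the data and maps a new input to a prediction."""
    return lambda hours: hours >= threshold


threshold = learn_threshold(data)   # the algorithm builds the model
model = make_model(threshold)
print(threshold, model(2.5), model(7))
```

The same model could be produced by a different algorithm (for example, a smarter search), which is why the two terms are kept apart.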

Ref: https://miro.medium.com/
ML Terminologies

ML optimizes predictive performance, while statistics emphasizes interpretability and parsimony/simplicity.

ML / Statistics Jargon | Definition
Label / Target / Output Variable / “y” | The results to predict
Feature / Input Variable / “x” | Input data to help make predictions
Feature Engineering / Transformation | Reshaping raw input data to give more insights
Dimensionality / [1st d, 2nd d, …, nth d] | Number of features
Model Weights / Parameters | A set of numbers embedded in a model to make predictions
Model Training | Applying optimization techniques to find the “best” set of model weights

Parsimony: a simpler model with fewer parameters is favored over more complex models with more parameters, provided the models fit the data similarly well.
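
To tie the terms above together, here is a minimal sketch with one feature, one label, two model weights, and a training loop. The numbers are invented, and gradient descent on mean squared error is only one possible optimization technique, assumed here for illustration.

```python
# Hypothetical data: one feature "x" and one label "y" per example (values invented).
features = [0.5, 1.0, 1.5, 2.0]   # input variable; dimensionality = 1
labels = [1.5, 2.5, 3.5, 4.5]     # target to predict (here y = 2x + 0.5 exactly)


def train(xs, ys, learning_rate=0.1, steps=5000):
    """Model training: gradient descent on mean squared error to find the weights."""
    w, b = 0.0, 0.0               # model weights / parameters
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(features, labels)
prediction = w * 3.0 + b          # prediction for a never-before-seen input x = 3.0
print(round(w, 3), round(b, 3), round(prediction, 3))
```

With only two parameters this model is also a small example of parsimony: it fits the data without any extra weights.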
ML Applications

Problem type | Description | Example
Ranking | Helping users find the most relevant items | Ranking algorithm within Amazon Search
Recommendation | Giving users the items they may be most interested in | Recommendations across the website (e.g. Amazon’s Choice)
Classification | Figuring out what category an item belongs to | Product classification for our catalog (High-Low Dress, Straight Dress, Striped Skirt, Graphic Shirt)
Regression | Predicting a numerical value of an item | Predicting sales for specific ASINs (Seasonality, Out of stock, Promotions)
Clustering | Putting similar items together | Close-matching for near-duplicates
Anomaly Detection | Finding uncommon items | Fruit freshness (Good, Damage, Serious Damage, Decay)

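
As a small illustration of one of these problem types, anomaly detection ("finding uncommon items") can be sketched with a z-score rule. This is an assumed, very simple technique, not the method behind any of the examples above; the data and the name find_anomalies are invented.

```python
import statistics


def find_anomalies(values, z_threshold=2.0):
    """Flag values that lie more than z_threshold standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)   # population standard deviation
    if std == 0:
        return []                     # all values identical: nothing is uncommon
    return [v for v in values if abs(v - mean) / std > z_threshold]


# Hypothetical daily order counts with one unusual day.
daily_orders = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = find_anomalies(daily_orders)
print(anomalies)
```

A single extreme value inflates the mean and the standard deviation, so this rule is only a starting point for small, clean datasets.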
We can describe problems and their solutions using six characteristics

Characterize the problem
1. Problem class: What is the nature of the training data and what kinds of queries will be made at testing time?
2. Assumptions: What do we know about the source of the data or the form of the solution?
3. Evaluation criteria: What is the goal of the prediction or estimation system? How will the answers to individual queries be evaluated? How will the overall performance of the system be measured?

Characterize the solution
4. Model type: Will an intermediate model be made? What aspects of the data will be modeled? How will the model be used to make predictions?
5. Model class: What particular parametric class of models will be used? What criterion will we use to pick a particular model from the model class?
6. Algorithm: What computational process will be used to fit the model to the data and/or to make predictions?

Supervised, Unsupervised, and Reinforcement learning

Ref: Grokking Machine Learning, Luis G. Serrano


Supervised Learning
The branch of machine learning that works with labeled data

Ref: Grokking Machine Learning, Luis G. Serrano


Supervised Learning

Supervised learning can be subdivided into classification and regression based on the quantity we are trying
to predict.
• If our output 𝑦 is a discrete quantity (e.g. 𝐾 distinct classes) we have a classification problem.
• On the other hand, if our output y is a continuous quantity (e.g. a real number such as stock price) we have
a regression problem.

Thus, the nature of the problem changes based on the quantity y we are trying to predict. We want to get as
close as possible to the ground truth value of y.
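
The split described above can be sketched with a k-nearest-neighbors predictor: the same neighbors give a classification when the targets are discrete (majority vote) and a regression when they are continuous (average). The house sizes, price bands, and prices below are invented, and k-NN is used only as a convenient illustration.

```python
from collections import Counter


def k_nearest(train, x, k):
    """Return the targets of the k training points whose feature is closest to x."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - x))
    return [y for _, y in ranked[:k]]


def predict_class(train, x, k=3):
    """Classification: the output is the most common discrete label among the neighbors."""
    return Counter(k_nearest(train, x, k)).most_common(1)[0][0]


def predict_value(train, x, k=3):
    """Regression: the output is the mean of the neighbors' continuous targets."""
    neighbours = k_nearest(train, x, k)
    return sum(neighbours) / len(neighbours)


# Hypothetical data: house size in square meters mapped to a price band or a price.
sizes_to_band = [(50, "low"), (60, "low"), (70, "low"),
                 (120, "high"), (130, "high"), (140, "high")]
sizes_to_price = [(50, 100.0), (60, 120.0), (70, 140.0),
                  (120, 240.0), (130, 260.0), (140, 280.0)]

band = predict_class(sizes_to_band, 125)     # discrete y: classification
price = predict_value(sizes_to_price, 125)   # continuous y: regression
print(band, price)
```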

Ref: Machine Learning Algorithms in Depth, Vadim Smolyakov


Classification

• Training data $D$ is in the form of a set of pairs

$$D = \{(x^{(1)}, y^{(1)}), \ldots, (x^{(n)}, y^{(n)})\}$$

where $x^{(i)}$ represents an object to be classified, most typically a $d$-dimensional vector of real and/or discrete values, and $y^{(i)}$ is an element of a discrete set of values.
The $y$ values are sometimes called target values.
A classification problem is binary or two-class if $y^{(i)}$ is drawn from a set of two possible values; otherwise, it is called multi-class.
The goal in a classification problem: given a new input value $x^{(n+1)}$, predict the value of $y^{(n+1)}$.
Classification problems are a kind of supervised learning, because the desired output (or class) $y^{(i)}$ is specified for each of the training examples $x^{(i)}$.

Regression is like classification, except that $y^{(i)}$ is a continuous quantity (e.g. a real number) rather than an element of a discrete set.
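
The setup above can be made concrete with a small sketch: a training set D of (x, y) pairs, a check for binary versus multi-class, and a prediction for a new input. The data values are invented, and the nearest-centroid rule is an assumed stand-in for whichever classifier is actually used.

```python
import math

# Training data D = {(x^(1), y^(1)), ..., (x^(n), y^(n))}; values invented for illustration.
D = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.5), "B"), ((7.0, 6.0), "B"), ((6.5, 7.0), "B"),
]


def problem_kind(dataset):
    """Binary if y is drawn from two values, otherwise multi-class (assumes at least two classes)."""
    classes = {y for _, y in dataset}
    return "binary" if len(classes) == 2 else "multi-class"


def fit_centroids(dataset):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in dataset:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(s / counts[y] for s in acc) for y, acc in sums.items()}


def predict(centroids, x_new):
    """Predict y^(n+1) for a new input x^(n+1): the class with the closest centroid."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x_new))


centroids = fit_centroids(D)
y_new = predict(centroids, (6.2, 6.1))
print(problem_kind(D), y_new)
```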

Ref: Grokking Machine Learning, Luis G. Serrano
Unsupervised learning
The branch of machine learning that works with unlabeled data
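
As a hedged illustration of learning from unlabeled data, the sketch below groups one-dimensional values into two clusters with a basic k-means loop. The data, the function name two_means, and the choice of k = 2 are assumptions made for this example.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two clusters with a basic k-means loop (k = 2)."""
    centers = [min(values), max(values)]   # simple deterministic initialisation
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the nearest current center; no labels are used.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical item prices with two obvious groups.
prices = [9, 10, 11, 48, 50, 52]
centers, clusters = two_means(prices)
print(centers, clusters)
```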

Ref: Grokking Machine Learning, Luis G. Serrano


Reinforcement learning

Reinforcement learning algorithms interact with an environment, so there is a feedback loop between the learning system and its experiences.
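
The feedback loop can be sketched with a tiny agent that chooses among three actions, receives a reward from the environment, and updates its estimates. Everything here is assumed for illustration: the reward table, the epsilon-greedy strategy, and the names environment and run_agent.

```python
import random

# Hypothetical environment: each action always yields a fixed reward.
REWARDS = {0: 0.2, 1: 1.0, 2: 0.5}


def environment(action):
    """Return the reward the environment gives for an action."""
    return REWARDS[action]


def run_agent(steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best estimate, sometimes explore."""
    rng = random.Random(seed)
    estimates = {a: 0.0 for a in REWARDS}   # the agent's current value estimates
    counts = {a: 0 for a in REWARDS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(REWARDS))          # explore
        else:
            action = max(estimates, key=estimates.get)  # exploit
        reward = environment(action)                    # feedback from the environment
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_agent()
best_action = max(estimates, key=estimates.get)
print(best_action, counts)
```

Unlike supervised learning, no labeled answers are given; the agent only sees the rewards produced by its own actions.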

Ref: Grokking Machine Learning, Luis G. Serrano


Regression or Classification?

• Given data about the size of houses on the real estate market, predict their price. (R: regression)
• Given a picture of a person, predict his or her age on the basis of the picture. (R: regression)
• Given a picture of a person, predict whether he or she is of high school, college, or graduate age. (C: classification)
• A bank has to decide whether or not to give someone a loan on the basis of their credit history. (C: classification)

How to find a Project Idea

Machine Learning Project

Read a Paper

Why read research papers?

• To gain a better grasp and understanding of the field
• To be able to contribute novel ideas to the field
• To develop confidence in the field
• Papers are the most condensed and authentic source of the latest knowledge in the field

Literature survey of a domain

The basic steps to perform a literature survey in a field are the following:
1. Assemble a collection of resources in the form of research papers, Medium articles, blog posts, videos, GitHub repositories, etc.
2. Conduct a deep dive to separate the relevant material from the irrelevant.
3. Take structured notes summarizing each paper’s key discoveries, findings, and techniques.

https://saiamrit.github.io/technical-blog
Organization of a Paper
The majority of papers follow, more or less, the same convention of organization:
1. Title: Hopefully catchy! Includes additional info about the authors and their institutions.
2. Abstract: High-level summary of the entire work of the paper.
3. Introduction: Background info on the field and related research leading up to this paper.
4. Related work: Describes the existing literature on the particular domain.
5. Methods: Highly detailed section on the study that was conducted, how it was set up, any instruments used, and finally, the process and workflow.
6. Results: The authors present the data that was created or collected; it should read as an unbiased account of what occurred.
7. Discussion: Here the authors interpret the results and convince the readers of their findings and hypothesis.
8. References: Any other work that was cited in the body of the text will show up here.
9. Appendix: More figures, additional treatments of related math, or extra items of interest can find their way into an appendix.

https://saiamrit.github.io/technical-blog
Read a Paper
Please check Andrew’s lecture, https://youtu.be/733m6qBH-jI?si=2t4MXkQL3S3u-5w9

Reading a paper sequentially, one section after another, is not a good option.

Three-pass approach to reading a research paper

1. First Pass: Read the title, abstract, and subsection titles, and glance at the figures and figure captions.
• You should be able to answer the five C’s (Category, Context, Correctness, Contribution, Clarity).
2. Second Pass: Read the introduction, the conclusion, and the remaining figures, and skim the rest of the sections (ignoring details such as mathematical derivations and proofs).
3. Third Pass: Read the entire paper with the intention of reimplementing it.

https://saiamrit.github.io/technical-blog
How to write?

https://www.turing.com/kb/how-to-write-research-paper-in-machine-learning-area
