Course Overview

CSCI417
Machine Intelligence
Dr. Ghada Khoriba
Dr. Ensaf Hussien
Dr. Nermin Negied
Dr. Wael Gomaa
Spring 2024
Agenda
Course Objectives
Course Map & References
Grading Distribution
Introduction to machine learning

Where are we? - What is machine learning (ML)? - ML applications - ML life cycle - Types of ML
Where to get project ideas
How to read and write a paper: the basic idea
Course Objectives
• To understand the fundamental concepts of machine learning.

• To learn how to implement machine learning algorithms using PyTorch and
Scikit-Learn.
• To gain hands-on experience with building machine learning models on real-
world datasets.
• To understand the trade-offs and limitations of different machine learning
techniques.
• To be able to critically evaluate machine learning models and choose the best
solution for a given problem.
Tentative Course Topics
1. Machine Learning Basics
2. Classifying with k-Nearest Neighbors
3. Splitting datasets one feature at a time: decision trees
4. Classifying with probability theory: naïve Bayes
5. Logistic regression
6. Support vector machines
7. Model Evaluation and Improvement: Cross-validation, Grid Search, Evaluation Metrics, and Scoring
8. Ensemble learning and improving classification with the AdaBoost meta-algorithm
9. Introduction to Neural Networks: building a neural network for classification (binary/multiclass)
10. Convolutional Neural Networks (CNNs)
11. Pretrained models (VGG, AlexNet, ...)
12. Machine learning pipeline and use cases
References
For readers For Math lovers

For Coders
Grading Criteria
• Course work:
  o Programming assignments (2 checkpoints): 15% (DataCamp group / Machine Learning track)
  o Lecture quizzes (2): 10%
  o Lab practical quizzes (2): 10%
  o Project:
    - Paper report and UGRF: 5%
    - Project implementation and discussion: 20%
• Midterm: 15%
• Final exam: 25%
Introduction to Machine Learning
What is artificial intelligence?
The set of all tasks in which a computer can make decisions.
What is machine learning?
Machine learning (ML) is closely related to artificial intelligence, and their definitions are often confused. ML is a part of artificial intelligence, and we define it as follows:
The set of all tasks in which a computer can make decisions based on data.
How do humans make decisions? We make decisions in the following two ways:
• By using logic and reasoning
• By using our experience
For a computer, data plays the role of experience.
Ref: Grokking Machine Learning, Luis G. Serrano

Machine Learning
ML systems learn how to combine input to produce useful predictions on never-before-seen data.
MODEL
A set of rules that represent our data and can be used to make predictions.
ALGORITHM
A procedure, or a set of steps, used to solve a problem or perform a computation. Usually, the goal of an algorithm is to build a model.
Ref: https://miro.medium.com/
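To make the model/algorithm distinction concrete, here is a minimal sketch in plain Python. It assumes a one-feature linear model fitted by ordinary least squares; the names `fit_line` and `predict` and the toy data are illustrative assumptions, not part of the slides. `fit_line` plays the role of the algorithm, and the `(slope, intercept)` pair it returns is the model.

```python
def fit_line(xs, ys):
    """ALGORITHM: ordinary least squares for one feature. Returns the model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """MODEL: a rule (here a straight line) applied to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Toy training data (made up): the points lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)            # run the algorithm to build the model
prediction = predict(model, 10.0)   # use the model on never-before-seen data
print(model, prediction)
```

In a library such as Scikit-Learn the same split usually appears as a fitting step followed by a prediction step on a fitted object.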
ML Terminologies
ML optimizes predictive performance, while statistics emphasizes interpretability and parsimony (simplicity).
ML / Statistics jargon | Definition
Label / Target / Output variable / "y" | The results to predict
Feature / Input variable / "x" | Input data to help make predictions
Feature engineering / Transformation | Reshaping raw input data to give more insights
Dimensionality / [1st d, 2nd d, …, nth d] | Number of features
Model weights / Parameters | A set of numbers embedded in a model to make predictions
Model training | Applying optimization techniques to find the "best" set of model weights
Parsimony: a simpler model with fewer parameters is favored over more complex models with more parameters, provided the models fit the data similarly well.
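The terms above can be tied together in one small sketch. It is written in plain Python under stated assumptions: the house-size data, the helper `engineer_features`, the learning rate, and the choice of gradient descent as the optimization technique are all illustrative and not taken from the slides.

```python
# Raw inputs (features "x") and the results to predict (labels "y").
# Made-up numbers: house size in square metres -> price.
raw_sizes = [50.0, 80.0, 120.0]
labels = [150.0, 240.0, 360.0]


def engineer_features(size):
    """Feature engineering: turn one raw input into a feature vector."""
    scaled = size / 100.0
    return [scaled, scaled ** 2]  # rescaled size and its square


features = [engineer_features(s) for s in raw_sizes]
dimensionality = len(features[0])  # number of features per example

# Model weights / parameters: one number per feature, plus a bias term.
weights = [0.0] * dimensionality
bias = 0.0


def predict(x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mean_squared_error():
    return sum((predict(x) - y) ** 2 for x, y in zip(features, labels)) / len(labels)


# Model training: gradient descent searches for the "best" weights.
error_before = mean_squared_error()
learning_rate = 0.1
for _ in range(2000):
    grad_w = [0.0] * dimensionality
    grad_b = 0.0
    for x, y in zip(features, labels):
        residual = predict(x) - y
        for j in range(dimensionality):
            grad_w[j] += 2 * residual * x[j] / len(labels)
        grad_b += 2 * residual / len(labels)
    weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    bias -= learning_rate * grad_b
error_after = mean_squared_error()

print(dimensionality, error_before, error_after)
```

Here "best" means lowest mean squared error on the training data, which is only one possible criterion.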
ML Applications
Problem type | Description | Example
Ranking | Helping users find the most relevant items | Ranking algorithm within Amazon Search
Recommendation | Giving users the items they may be most interested in | Recommendations across the website (Amazon's Choice)
Classification | Figuring out what category an item belongs to | Product classification for our catalog (High-Low Dress, Straight Dress, Striped Skirt, Graphic Shirt)
Regression | Predicting a numerical value of an item | Predicting sales for specific ASINs (Seasonality | Out of stock | Promotions)
Clustering | Putting similar items together | Close-matching for near-duplicates
Anomaly Detection | Finding uncommon items | Fruit freshness (Good, Damage, Serious Damage, Decay)
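As one concrete instance of "finding uncommon items", the sketch below flags values that lie far from the mean. It assumes a simple z-score rule with a threshold of 2.0; the function name `find_anomalies`, the threshold, and the sales figures are illustrative assumptions, and real anomaly detection systems typically use more robust methods.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


daily_sales = [10, 12, 11, 9, 10, 11, 10, 50]  # made-up numbers
anomalies = find_anomalies(daily_sales)
print(anomalies)
```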
We can describe problems and their solutions using six characteristics.
Characterize the problem
1. Problem class: What is the nature of the training data, and what kinds of queries will be made at testing time?
2. Assumptions: What do we know about the source of the data or the form of the solution?
3. Evaluation criteria: What is the goal of the prediction or estimation system? How will the answers to individual queries be evaluated? How will the overall performance of the system be measured?
Characterize the solution
4. Model type: Will an intermediate model be made? What aspects of the data will be modeled? How will the model be used to make predictions?
5. Model class: What particular parametric class of models will be used? What criterion will we use to pick a particular model from the model class?
6. Algorithm: What computational process will be used to fit the model to the data and/or to make predictions?
Supervised, Unsupervised, and Reinforcement learning

Supervised Learning
The branch of machine learning that works with labeled data.
Ref: Grokking Machine Learning, Luis G. Serrano
Supervised learning can be subdivided into classification and regression based on the quantity we are trying
to predict.
• If our output 𝑦 is a discrete quantity (e.g. 𝐾 distinct classes) we have a classification problem.
• On the other hand, if our output y is a continuous quantity (e.g. a real number such as stock price) we have
a regression problem.
Thus, the nature of the problem changes based on the quantity y we are trying to predict. We want to get as
close as possible to the ground truth value of y.
Ref: Machine Learning Algorithms in Depth, Vadim Smolyakov
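The sketch below illustrates how the type of y changes the problem while the inputs stay the same. It assumes a k-nearest-neighbors approach on one-dimensional inputs; the function names and the house data are made up for illustration. The only difference between the two predictors is how the neighbors' outputs are combined: a majority vote for a discrete y, an average for a continuous y.

```python
from collections import Counter


def nearest_neighbors(train, x, k):
    """Return the k training pairs (x_i, y_i) closest to x (1-D inputs)."""
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]


def knn_classify(train, x, k=3):
    """Classification: y is discrete, so take a majority vote."""
    votes = Counter(y for _, y in nearest_neighbors(train, x, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, x, k=3):
    """Regression: y is continuous, so average the neighbors' values."""
    neighbors = nearest_neighbors(train, x, k)
    return sum(y for _, y in neighbors) / len(neighbors)


# Made-up data: house size (m^2) -> discrete category / continuous price.
size_to_category = [(40, "small"), (55, "small"), (60, "small"),
                    (110, "large"), (120, "large"), (150, "large")]
size_to_price = [(40, 100.0), (55, 130.0), (60, 140.0),
                 (110, 260.0), (120, 280.0), (150, 350.0)]

category = knn_classify(size_to_category, 115)
price = knn_regress(size_to_price, 115)
print(category, price)
```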

Classification
• Training data $D$ is in the form of a set of pairs
$D = \{(x^{(1)}, y^{(1)}), \ldots, (x^{(n)}, y^{(n)})\}$
where $x^{(i)}$ represents an object to be classified, most typically a $d$-dimensional vector of real and/or discrete values, and $y^{(i)}$ is an element of a discrete set of values.
The $y^{(i)}$ values are sometimes called target values.
A classification problem is binary or two-class if $y^{(i)}$ is drawn from a set of two possible values; otherwise, it is called multi-class.
The goal in a classification problem: given a new input value $x^{(n+1)}$, predict the value of $y^{(n+1)}$.
Classification problems are a kind of supervised learning, because the desired output (or class) $y^{(i)}$ is specified for each of the training examples $x^{(i)}$.
Regression is like classification, except that the target values $y^{(i)}$ are continuous (real-valued) rather than drawn from a discrete set.
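The notation above maps directly onto a small data structure. The sketch below stores D as a list of (x, y) pairs, checks whether the problem is binary or multi-class, and predicts the label of a new input. The predictor is a nearest-centroid rule, chosen here only because it is short; it is an assumption of this sketch, not a method named in the slides, and the data values are made up.

```python
def problem_kind(dataset):
    """Binary if the labels take exactly two values, otherwise multi-class."""
    labels = {y for _, y in dataset}
    return "binary" if len(labels) == 2 else "multi-class"


def centroids(dataset):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for x, y in dataset:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(dataset, x_new):
    """Predict y^(n+1) as the class whose centroid is closest to x^(n+1)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    cents = centroids(dataset)
    return min(cents, key=lambda y: sq_dist(cents[y], x_new))


# D = {(x^(1), y^(1)), ..., (x^(n), y^(n))} with d = 2 (made-up values).
D = [([1.0, 1.0], "A"), ([1.5, 2.0], "A"),
     ([5.0, 5.0], "B"), ([6.0, 5.5], "B")]

kind = problem_kind(D)
y_new = predict(D, [5.5, 4.5])
print(kind, y_new)
```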
Unsupervised learning
The branch of machine learning that works with unlabeled data.
Ref: Grokking Machine Learning, Luis G. Serrano
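A minimal sketch of learning from unlabeled data, assuming k-means clustering (Lloyd's algorithm) on one-dimensional points. The function name `k_means_1d`, the naive initialization from the first k points, and the data are illustrative assumptions. Note that the input contains no y values at all; the algorithm only groups similar inputs.

```python
def k_means_1d(points, k, iterations=10):
    """Group unlabeled 1-D points into k clusters (Lloyd's algorithm)."""
    centers = list(points[:k])  # naive initialization: the first k points
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        assignment = [min(range(k), key=lambda c: abs(p - centers[c]))
                      for p in points]
        # Update step: move each center to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return centers, assignment


# Unlabeled data: only inputs, no "y" (made-up values).
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers, assignment = k_means_1d(points, k=2)
print(centers, assignment)
```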

Reinforcement learning
Reinforcement learning algorithms interact with an environment, so there is a feedback loop between the learning system and its experiences.
Ref: Grokking Machine Learning, Luis G. Serrano
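The feedback loop can be sketched with a deliberately simplified setting: a two-action environment with fixed rewards and an agent that keeps a running average reward per action. The classes `TwoArmedBandit` and `GreedyAgent` and the reward values are hypothetical. Real reinforcement learning problems usually also involve states, stochastic rewards, and continued exploration, all of which are left out here.

```python
class TwoArmedBandit:
    """A toy environment: each action returns a fixed, made-up reward."""
    rewards = {"left": 0.2, "right": 1.0}

    def step(self, action):
        return self.rewards[action]


class GreedyAgent:
    """Keeps a running average reward per action and picks the best one."""

    def __init__(self, actions):
        self.values = {a: 0.0 for a in actions}
        self.counts = {a: 0 for a in actions}

    def act(self):
        # Try every action once, then exploit the best estimate.
        untried = [a for a, n in self.counts.items() if n == 0]
        if untried:
            return untried[0]
        return max(self.values, key=self.values.get)

    def learn(self, action, reward):
        self.counts[action] += 1
        # Incremental mean: new = old + (reward - old) / n
        self.values[action] += (reward - self.values[action]) / self.counts[action]


env = TwoArmedBandit()
agent = GreedyAgent(["left", "right"])
history = []
for _ in range(10):
    action = agent.act()          # the agent acts on the environment
    reward = env.step(action)     # the environment responds with a reward
    agent.learn(action, reward)   # the experience feeds back into the agent
    history.append(action)

print(history, agent.values)
```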

Regression or Classification?
• Given data about the size of houses on the real estate market, predict their price. (Regression)
• Given a picture of a person, predict their age. (Regression)
• Given a picture of a person, predict whether they are of high school, college, or graduate age. (Classification)
• A bank has to decide whether or not to give someone a loan on the basis of their credit history. (Classification)
How to find a Project Idea
Machine Learning Project
Read a Paper
Why read research papers?
• To gain a better grasp and understanding of the field
• To be able to contribute novel ideas to the field
• To develop confidence in the field
• They are the most condensed and authentic source of the latest knowledge in the field
Literature survey of a domain
The basic steps to perform a literature survey in a field are the following:
1. Assemble a collection of resources: research papers, Medium articles, blog posts, videos, GitHub repositories, etc.
2. Conduct a deep dive to separate the relevant material from the irrelevant.
3. Take structured notes summarizing each paper's key discoveries, findings, and techniques.
https://saiamrit.github.io/technical-blog
Organization of a Paper
The majority of papers follow, more or less, the same organization:
1. Title: Hopefully catchy! Accompanied by information about the authors and their institutions.
2. Abstract: A high-level summary of the entire paper.
3. Introduction: Background on the field and the related research leading up to this paper.
4. Related work: A description of the existing literature in the particular domain.
5. Methods: A highly detailed section on the study that was conducted: how it was set up, any instruments used, and the process and workflow.
6. Results: The authors present the data that was created or collected; it should read as an unbiased account of what occurred.
7. Discussion: The authors interpret the results and convince readers of their findings and hypotheses.
8. References: Any other work cited in the body of the text is listed here.
9. Appendix: More figures, additional treatment of related math, or extra items of interest.
Read a Paper
Please check Andrew's lecture: https://youtu.be/733m6qBH-jI?si=2t4MXkQL3S3u-5w9
Reading a paper sequentially, one section after another, is not a good option.
Three-pass approach to reading a research paper
1. First pass: Read the title, abstract, and subsection titles, and glance at the figures and figure captions.
• You should be able to answer the five C's (Category, Context, Correctness, Contribution, Clarity).
2. Second pass: Read the introduction, the conclusion, and the remaining figures, and skim the rest of the sections (ignoring details such as mathematical derivations and proofs).
3. Third pass: Read the entire paper with the intention of reimplementing it.
How to write?
https://www.turing.com/kb/how-to-write-research-paper-in-machine-learning-area