
Chapter 1: Introduction to Machine Learning
Machine Learning Team
UP GL-BD
What is Machine Learning (ML) ?
Fundamentals:

[Figure: Artificial intelligence]
Fundamentals:
Learning is useful when:

• Human expertise is lacking.
• A large amount of data must be processed.
• Humans can’t explain their expertise.
• The solution changes over time, sometimes unexpectedly.
• The solution must adapt to its user (biometrics, e-mail filtering).

Why now ?

⁻ Flood of available data, especially with the advent of the Internet.
⁻ Increasing computational power.
⁻ Growing progress in available algorithms and theories developed by
researchers.
⁻ Increasing support from industry …
The concept of Learning in a ML System:

Learning = improving with experience at some task:

- Improve over task ‘T’
- With respect to performance measure ‘P’
- Based on experience ‘E’
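
The T / P / E definition above can be sketched with a toy learner. Everything in this example is an illustrative assumption (the hidden rule `x >= 62`, the example numbers, and the midpoint strategy are not from the slides); it only shows the performance measure P improving as the experience E grows.

```python
# Task T: decide whether a number is "large" (hypothetical hidden rule: x >= 62).
# Experience E: the labelled examples (x, label) seen so far.
# Performance P: accuracy on a fixed set of test numbers.

def true_label(x):
    # Hidden rule; the learner only ever sees labelled examples of it.
    return x >= 62

def learn_threshold(examples):
    """Place the threshold midway between the largest negative
    and the smallest positive example seen so far."""
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_points):
    correct = sum((x >= threshold) == true_label(x) for x in test_points)
    return correct / len(test_points)

experience = [(10, False), (90, True), (40, False),
              (80, True), (60, False), (64, True)]
test_points = range(100)

scores = []
for n in (2, 4, 6):
    threshold = learn_threshold(experience[:n])
    scores.append(accuracy(threshold, test_points))
    print(f"E = {n} examples -> threshold {threshold}, P = {scores[-1]:.2f}")
```

With more labelled examples the learned threshold moves closer to the hidden rule, so accuracy on the test numbers rises.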
What is Machine Learning?

• Machine learning is programming computers to optimize a
performance criterion using example data or past experience.
• Role of statistics: inference from a sample.
• Role of computer science: efficient algorithms to
– solve the optimization problem
– represent and evaluate the model for inference
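
As a minimal sketch of "optimizing a performance criterion using example data", the code below fits a one-parameter model `y = w * x` by gradient descent on the mean squared error. The data, the learning rate, and the number of steps are assumptions chosen for illustration, not values from the slides.

```python
# Example data (assumed): outputs generated by y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def mean_squared_error(w):
    """Performance criterion: average squared gap between w*x and y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # initial guess for the model parameter
learning_rate = 0.01  # step size (assumed)
for step in range(200):
    w -= learning_rate * gradient(w)  # move against the gradient

print(f"learned w = {w:.4f}, error = {mean_squared_error(w):.2e}")
```

The statistics side is the inference of `w` from a sample; the computer science side is the algorithm that solves the optimization problem efficiently.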

Machine Learning

• Develop systems that can automatically discover new knowledge
from large databases.
• Mimic humans and take over certain monotonous tasks.
• Develop systems that are difficult or expensive to construct
manually.
ML Vs. Traditional Programming
Machine Learning Applications
Some more examples of tasks that are best solved by using a
learning algorithm:
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates
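
To make the "unusual credit card transactions" example concrete, here is a small sketch that flags amounts lying far from the median. The scoring rule (distance from the median divided by the median absolute deviation), the cutoff of 5, and the amounts are all assumptions for illustration; the slides do not prescribe a method.

```python
from statistics import median

def flag_anomalies(amounts, cutoff=5.0):
    """Return the amounts whose distance from the median is more than
    `cutoff` times the median absolute deviation (MAD)."""
    centre = median(amounts)
    mad = median([abs(a - centre) for a in amounts])
    if mad == 0:
        # No spread to scale by: treat anything off the median as unusual.
        return [a for a in amounts if a != centre]
    return [a for a in amounts if abs(a - centre) / mad > cutoff]

# Hypothetical transaction amounts; one is far larger than the rest.
transactions = [20, 25, 22, 30, 27, 24, 950]
unusual = flag_anomalies(transactions)
print(unusual)
```

A real system would learn what "normal" looks like per customer from historical data rather than from a single short list.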
Categories of Learning
Types of Learning

• Supervised (inductive) Learning:
training data + desired outputs (labels)
• Unsupervised Learning:
training data (without desired outputs)
• Semi-supervised Learning:
training data + a few desired outputs
• Reinforcement Learning:
rewards from a sequence of actions
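
The difference between the first two types can be sketched on the same data: the supervised learner receives labels, the unsupervised one does not. The nearest-centroid classifier, the 1-D two-means grouping, and the height values are assumptions chosen for illustration; semi-supervised and reinforcement learning are not covered by this sketch.

```python
def fit_centroids(points, labels):
    """Supervised: average the points of each labelled class."""
    groups = {}
    for x, label in zip(points, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def two_means(points, iterations=10):
    """Unsupervised: split unlabelled points into two groups
    (1-D k-means with k = 2). Assumes at least two distinct values."""
    low, high = min(points), max(points)
    for _ in range(iterations):
        group_low = [x for x in points if abs(x - low) <= abs(x - high)]
        group_high = [x for x in points if abs(x - low) > abs(x - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high

heights = [150, 152, 155, 180, 183, 185]
labels = ["short", "short", "short", "tall", "tall", "tall"]

# Supervised: training data + desired outputs (labels).
centroids = fit_centroids(heights, labels)
print(predict(centroids, 158), predict(centroids, 178))

# Unsupervised: the same training data without desired outputs.
print(two_means(heights))
```

The supervised model can name the class of a new point; the unsupervised one only discovers that the data falls into two groups.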
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Framing a Learning Problem:
CRISP-DM Methodology (CRoss-Industry Standard Process for Data Mining)
Data Science is an approach

• Data science is a method for gleaning insights from
structured and unstructured data using approaches ranging
from statistical analysis to machine learning.
• CRISP-DM: a framework for recording experience and an aid to
project planning and management.
• A very efficient method for data science projects.
• CRISP-DM is an agile and iterative method. Each iteration
brings additional business knowledge, which makes it easier
to tackle the next iteration.
CRISP-DM

• A six-phase life-cycle process model.


CRISP-DM: Phases

• Business Understanding
Understanding project objectives and requirements; defining the data mining problem
• Data Understanding
Initial data collection and familiarization; identifying data quality problems
• Data Preparation
Selecting tables, records and attributes; transforming and cleaning the data
• Modeling
Selecting and applying modeling techniques; calibrating parameters
• Evaluation
Evaluating whether business objectives are achieved and issues addressed
• Deployment
Deploying the resulting model; implementing a repeatable data mining process
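
The Modeling and Evaluation phases above can be sketched as "calibrate a parameter on training data, then check the result on data held back for evaluation". The threshold model, the candidate values, and the hours-studied / passed records are hypothetical and serve only as an illustration.

```python
def accuracy(threshold, records):
    """Share of records where the rule `hours >= threshold` matches the label."""
    hits = sum((hours >= threshold) == passed for hours, passed in records)
    return hits / len(records)

# Hypothetical records: (hours studied, passed the exam).
train = [(1, False), (2, False), (4, False), (5, True), (7, True), (8, True)]
test = [(3, False), (6, True), (9, True), (0, False)]

# Modeling: calibrate the single parameter on the training data.
candidates = range(0, 10)
best_threshold = max(candidates, key=lambda t: accuracy(t, train))

# Evaluation: measure the calibrated model on records it has not seen.
test_accuracy = accuracy(best_threshold, test)
print(f"threshold = {best_threshold}, test accuracy = {test_accuracy:.2f}")
```

Keeping the evaluation records separate from the calibration records gives a more honest picture of whether the model meets the objective.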
IBM Master Plan
IBM Master Plan Phases
1. [Business Understanding]: determines which data will be used to answer the
core question. Two things must be set: the goal and the objectives.

2. [Analytic Approach]: helps narrow down the algorithm(s) that will be used during
modeling (predictive model / descriptive model).

3. [Data Requirements]: identifies the necessary data content, formats and sources for
initial data collection.

4. [Data Understanding]: represents the collected data according to the problem we want
to solve.

5. [Data Preparation]: covers many operations, such as addressing missing or invalid values
and removing duplicates. This step generally takes almost 90% of the overall project time.

6. [Modeling]: generates the model based on the analytic approach that was chosen.

7. [Evaluation]: checks whether the generated model answers the initial request.
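
The Data Preparation operations named in the list above (removing duplicates, addressing missing or invalid values) can be sketched as follows. The record layout, the rule that a negative age is invalid, and filling gaps with the mean are assumptions made for this example only.

```python
def is_valid(value):
    """Assumed rule: a value is usable if it is present and non-negative."""
    return value is not None and value >= 0

def prepare(records, field="age"):
    # 1. Remove exact duplicates, keeping the first occurrence.
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)

    # 2. Replace missing or invalid values with the mean of the valid ones.
    #    Assumes at least one record has a valid value for `field`.
    valid = [r[field] for r in unique if is_valid(r[field])]
    fill = sum(valid) / len(valid)
    return [
        {**r, field: r[field] if is_valid(r[field]) else fill}
        for r in unique
    ]

# Hypothetical raw data with one duplicate, one missing and one invalid age.
raw = [
    {"name": "Ana", "age": 30},
    {"name": "Ben", "age": None},
    {"name": "Ana", "age": 30},
    {"name": "Cy", "age": -5},
    {"name": "Dee", "age": 40},
]
cleaned = prepare(raw)
print(cleaned)
```

Mean imputation is only one option; dropping the affected records or using a domain-specific default are equally plausible choices depending on the project.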
Data Science Tools

1. Data exploration with JupyterLab
2. Development with VS Code
3. Deep learning with PyTorch
4. Dashboarding with Voilà
Bibliography
• Mastering Machine Learning with Python, Samynathan

• Machine Learning in Action, Peter Harrington, Manning Publications
Co., Greenwich, CT, USA, ©2012, ISBN: 1617290181 97816172901

• Data Science and its Relationship to Big Data and Data-Driven Decision
Making, Foster Provost and Tom Fawcett, Volume 1, Issue 1, March 2013

• CRISP-DM: Towards a Standard Process Model for Data Mining, Rüdiger
Wirth and Jochen Hipp, Proceedings of the 4th international conference
on the …, 2000

• https://www.infoworld.com/article/2674025/ibm-s-master-plan-for-data.html

• Team Data Science Process Documentation 2020
