
L01
Introduction to Machine Learning (ML)
19/01/2022
Definitions and core concepts

Eng. Marco Zappatore, Ph.D.
University of Salento

# Table of contents
› Foreword
› Introduction
› ML core elements
› Artificial Neural Networks and Deep Learning
› Data Preparation & Feature Engineering
› Essential Statistics and Data Visualizations
› Dataset examples

# Foreword
› This course will address the usage of Python scripts for Machine Learning (ML).
› Before introducing core Python programming concepts as well as specific packages
dedicated to ML (and even before presenting the development environments needed to
implement, execute, and validate Python scripts) it is worth providing some core
concepts and definitions about ML.
› This lesson provides some introductory explanations on ML algorithms and processes.
› The topics addressed here will be dealt with in much more detail in other courses for Ph.D. students at your University/Department. Moreover, many of you most probably already have a certain degree of expertise in ML.
› However, if you are not going to attend any of the other courses dedicated to ML, or if you do not have any prior knowledge of ML, you can refer to this lesson as the common knowledge ground about ML, to be referenced during the remaining part of this course.

Introduction
First definitions

# A lot of (apparent) confusion…


› There are several definitions and even more buzzwords
› For non-professionals, Artificial Intelligence (AI), Machine Learning (ML)
and Deep Learning (DL) are all the same
› However, they are not synonyms

› ML is a subfield of AI → there are several non-learning applications in AI
› DL is a subfield of ML → there are several non-deep applications in ML
› Moreover, many other definitions and concepts are available, too: Data Mining, Pattern Recognition, Statistical Learning, Computational Learning, Computational Statistics, etc.

# First of all, what is learning?


› (for humans) It is a process by which we acquire new (or modify existing)
knowledge, skills, behaviors or preferences, thanks to specific underlying memory
mechanisms (e.g., habituation, associative learning, observational learning, etc.)
› According to Poggio & Shelton (AI Magazine, 1999), “The problem of learning is
arguably at the very core of the problem of intelligence, both biological and
artificial.”
› When machines are involved, we enter the field of Artificial Intelligence (AI), which refers to the creation of intelligent and/or adaptive systems.
› When intelligent systems are equipped with a learning component, they can modify their decisional mechanisms in order to improve their performance.
› This situation is commonly defined as Machine Learning (ML).

# What is Machine Learning (ML)?


› One of the most widely-accepted definitions was proposed by Tom Mitchell in 1997:
› "A computer program is said to learn from experience E with respect to some class of (learning) tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E"
› The experience E usually comes in the form of data.
› The learning process usually happens in terms of rewards or penalties.
› The classes of (learning) tasks T are manifold: supervised/unsupervised learning, etc.
› There exists a relationship between data and performance, and not only between data and learning task(s).
› Therefore, it can be said that ML is a computer algorithm that automatically learns patterns and models from (large quantities of) data and then uses the learnt model for predicting on new data.

# What is Machine Learning (ML)?


Source: A. Farbin (2016)

# Why is ML useful?
› ML is useful as a system construction method, because:
– Some tasks can be properly defined only by example
– Certain features of the working environment are not known at the design time
– Sometimes it is more useful to extract the solution from data rather than trying
to write down explicitly all the computational steps required to reach the solution
– Applications that can adapt to a changing environment reduce the need for
constant redesign

# When is ML useful?

› To address real-world problems that:


– Lack a consolidated theoretical/knowledge background
– Cannot be solved (efficiently or at all) by using available mathematical models
– Are affected by noisy data
– Are supplied with an excessive amount of data / information
› …provided that:
– An adequate amount of data well representing the problem is available
– It is possible / allowed to have a certain degree of tolerance in the precision/accuracy of the results

# Where is ML useful?

› In situations that require:


– Predicting behaviours / events
– Estimating values / quantities
– Personalising services
– Recognising significant data from noisy/complex datasets
– Analysing large amounts of data / information
› More in general, ML is needed to tackle problems that are difficult to manage with traditional programming techniques (e.g., spam detection, behaviour estimation, facial recognition, real-time translation, etc.)

# AI vs ML vs DL

› Artificial Intelligence (AI): any technique able to mimic human behaviour (e.g.,
symbolic systems, knowledge bases, etc.)
› Machine Learning (ML): any AI technique able to learn from data (e.g., logistic
regression, decision trees, clustering, etc.)
› Representation Learning (RL): any ML technique able to learn representations of data suited to specific/generic tasks (e.g., shallow auto-encoders)
› Artificial Neural Network (ANN): any biologically-inspired RL technique
› Deep Learning (DL): any multi-layered ANN (e.g., MLPs, DNNs, CNNs, RNNs, GANs, VAEs, etc.)

# AI vs ML vs DL
Source: A. Farbin (2016)
# Short history of ML
(2020)
# ML application potential
(2017)
# ML application potential
(Source: McKinsey & Company, 2019)

ML core elements
List of aspects that must be considered
when designing an ML application/solution
# Key elements for ML

› Data (observations or sample data)


› Learning Task (supervised/unsupervised/etc.)
› Learning “stuff”
– Computational model (how knowledge is represented)
› Decision Trees
› Artificial Neural Networks
› Bayesian Models
– Learning algorithm (how knowledge is adapted to data)
› Backpropagation
› Expectation-maximization

› Performance measurement (how learning quality and performance are quantified)
1

# Data and Data Quality


9

› As can be easily guessed even at this initial stage, no matter how sophisticated your ML system is, it can only be as good as the data it is fed with.
› Similarly, if you input poor data to your ML system (garbage-in) you can achieve
poor output only (garbage-out).
› Learning quality increases with dataset size and data quality.
› A suitable learning quality is achieved only if the available data adequately cover the process that you want to model.
› However, many more challenges exist: noisy data, missing data, unbalanced data,
etc.

# Preprocessing
› Preliminary activity of data preparation and filtering to ensure a pre-defined/minimal


degree of data quality
– Error correction
– Missing data
– Noise reduction
› Finding data representation maximizing the performance of the learning model
– Scaling and normalization
– Feature selection and extraction


› ML models themselves can be used to preprocess the data

› The role of data preparation and data quality is pivotal in ML, as it will be
clarified in the following slides.
# Learning tasks
› Supervised Learning: predicting one or more dependent variables; based on labelled data;
like classification and regression. It begins with an established set of data and a certain
understanding of how data are classified. It is widely used in scenarios that require actions such as classification, approximation, control, modelling/identification, signal processing, optimization, etc.
– Semi-Supervised Learning: not all the available data are labelled.
– Active Learning: the ML algorithm has to ask for (usually costly) labels with a limited
budget.
› Unsupervised Learning: looking for structure in data (without labels); like clustering or
pattern mining. It is best suited when the problem requires a massive amount of data that
are not known a priori (unlabelled). It is typically used for clustering, vector quantization,
feature extraction, signal coding, data analysis, etc.

# Learning tasks
› Adversarial Learning: the environment tries to deceive the learner; it can be both
supervised and unsupervised (e.g., spam filters, malware detectors).
– Generative-Adversarial Learning: the fooling environment is replaced by another ML
algorithm.
› Reinforcement Learning: only based on feedback to the algorithm's actions in a dynamic
environment. It is a behavioural learning model. The algorithm receives feedback from the
data analysis so that the user is guided to the best outcome. It differs from supervised
learning as the system is trained by trial and error instead of a sample dataset only.
› Deep Learning (DL): it exploits multi-layered ANNs so that the system can learn from sample data in an iterative way. It is useful when patterns must be learnt from unstructured data.
# Supervised Learning
› Learns a function h that maps inputs to desired outputs

(Figure: classification assigns each input to a discrete class, so y is discrete; regression outputs a continuous vector, so y is (conceptually) continuous. Source: A. Asperti, UniBO, 2019)

› Needs supervised information associating the input xi to the desired target yi (which can be an integer in {1, … , C} for classification, or a real value for regression)
› Training set is in the form of D = { (x1,y1), … , (xN,yN) }
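› A minimal sketch of this setting (not part of the original slides; the dataset and model choices, Iris and logistic regression via scikit-learn, are illustrative assumptions):

```python
# Minimal supervised-learning sketch: learn h that maps inputs x to labels y from labelled pairs.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)            # D = {(x1, y1), ..., (xN, yN)}
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=200)       # hypothesis h with parameters W
clf.fit(X_train, y_train)                    # learning phase: estimate W from D_train
print("accuracy on unseen data:", clf.score(X_test, y_test))  # predictive phase on D_test
```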
# Unsupervised Learning
› Learns a natural / suitable grouping of the input data. It is used for clustering (see the sketch below), finding a compressed representation of the available data, and estimating data density
› Only input pattern xi is provided (no desired output)


› Training set is in the form of D = {x1, … , xN}

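› A hedged clustering sketch (illustrative only; synthetic data and k-means are arbitrary choices):

```python
# Minimal unsupervised-learning sketch: group unlabelled inputs D = {x1, ..., xN}.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),    # two synthetic "blobs" (no labels provided)
               rng.normal(5, 1, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])                       # cluster assignments discovered from the data alone
```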
# Reinforcement Learning
› Learns how to choose the best action based on rewards or penalties received from the interacting environment. It is used for planning or behaviour learning.

› An input pattern xi is provided, which describes an observation coming from the


environment, along with a reward ri ∈ { –1, +1} returned as a response to the
predicted action yi
› Training set is in the form of D = { (x1,y1,r1), … , (xN,yN,rN) }

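› A full RL setup is beyond this lesson; the toy epsilon-greedy bandit below (an illustrative sketch, not from the slides) only shows the essential loop: act, receive a reward ri ∈ {–1, +1}, update the behaviour.

```python
# Toy reinforcement-learning sketch (epsilon-greedy bandit): actions are chosen from rewards only.
import numpy as np

rng = np.random.default_rng(0)
true_reward_prob = [0.2, 0.8]                 # hidden environment: action 1 is better
q = np.zeros(2)                               # estimated value of each action
counts = np.zeros(2)

for step in range(1000):
    a = rng.integers(2) if rng.random() < 0.1 else int(np.argmax(q))  # explore vs exploit
    r = 1.0 if rng.random() < true_reward_prob[a] else -1.0           # reward r in {-1, +1}
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]            # incremental update of the value estimate

print("estimated action values:", q)          # the better action ends up with the higher estimate
```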
# Taxonomy of learning tasks
(Source: E. Ricci, FBK)
# Taxonomy of learning tasks (enlarged)

# Computational Models (some examples)


(Source: A. Asperti, UniBO, 2019)
› Computational models will be discussed in the following slides.

# Performance Measures for testing & validation


› Performance measurements strictly depend on the task (e.g., classification performance metrics must differ from reinforcement learning performance metrics, as the two task types are different).
› Very often, performance measurements depend on the specific context, too (e.g.,
false positives assume different importance depending on the scenario: a false
positive in a repeatable test is less severe than a false positive in a one-shot test).
› Performance measurements may also depend on ethical, moral, and legal considerations (e.g., a human behaviour classifier may achieve higher accuracy if additional data such as race, religion, and gender are considered, but this would render the classifier ethically unacceptable).

# Sequence of key stages for ML
› Acquired knowledge is stored into model parameters W = {w1, … , wP}


› Two operational modes are available:
– Learning phase (training and fitting)
› Building the model from known data (if available)
› Estimating model parameters from the training dataset Dtrain
– Predictive phase (test or validation)
› Running the model with new, previously set-aside, samples Dtest
› Feed new data x ∈ Dtest as input to predict an output out(x)


› A Loss Function L(D,W) is used to estimate the quality of the learned model parameters W against a dataset D.
# ML typical workflow (basic!)
(Source: ProMech, 2018)
# ML typical workflow (basic, but applied)
(Source: ProMech, 2018)
# ML typical workflow (slightly more detailed)
(Source: Medium.com, 2019)
# Training
› It is an iterative process.
› Determines new values for model parameters W’ based on training data Dtrain
› Evaluates the newly obtained model based on the loss L(Deval, W’),
where Deval is either the training set Dtrain or an external validation set Dvalid
› If L(Deval, W’) is sufficiently small, the training phase stops; otherwise, it keeps iterating the steps described above (see the sketch below).
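› A hedged, framework-agnostic sketch of that loop (illustrative only: plain gradient descent on a quadratic loss over synthetic data, with an arbitrary learning rate and tolerance):

```python
# Generic iterative training loop: update W until the loss on the evaluation data is small enough.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=100)   # synthetic training data D_train

W = np.zeros(3)                                # model parameters
lr, tol = 0.1, 1e-3
for epoch in range(1000):
    pred = X @ W
    loss = np.mean((y - pred) ** 2)            # L(D_train, W)
    if loss < tol:                             # stop when the loss is sufficiently small
        break
    W -= lr * (-2 / len(y)) * X.T @ (y - pred) # gradient step towards better parameters

print("epochs:", epoch, "loss:", round(loss, 5), "W:", np.round(W, 2))
```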
# Examples of loss functions

› For classification tasks:

accuracy = (no. of correctly predicted samples) / (total no. of samples)

› For regression tasks:

RSS = Σ (yi − out(xi))², with i = 1, … , N
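› A hedged NumPy sketch of the two measures above (all values are made-up examples):

```python
# Compute accuracy (classification) and the residual sum of squares RSS (regression) with NumPy.
import numpy as np

# Classification: accuracy = correctly predicted samples / total samples
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])
print("accuracy:", np.mean(y_true == y_pred))     # 0.8

# Regression: RSS = sum over i of (yi - out(xi))^2
y = np.array([2.0, 3.5, 5.0])
out_x = np.array([2.1, 3.0, 5.2])
print("RSS:", round(float(np.sum((y - out_x) ** 2)), 3))   # 0.3
```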
# Testing & validation
› The core question for ML is, therefore: how well does a hypothesis perform?
› Measuring this aspect on training data only is not indicative
› Therefore, an external data subset not used for training is needed
› This subset, usually called validation subset, provides a reasonable estimation of
the ML system’s performances on new data. This operation is called validation or
test.
› This approach requires coping with two fundamental issues:


1. Model selection and model training stages must be separated from the model
testing stage (generalisation assessment)
2. More advanced statistical methods can be used to assess model performance if only small datasets are available (e.g., bootstrapping, cross-validation, etc.), as in the sketch below
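› A hedged scikit-learn sketch of hold-out validation and k-fold cross-validation (dataset and classifier are illustrative choices):

```python
# Hold-out validation vs 5-fold cross-validation with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 1) Keep model training separate from model testing (generalisation assessment)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
print("hold-out accuracy:", model.score(X_test, y_test))

# 2) With small datasets, k-fold cross-validation gives a more robust performance estimate
scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=5)
print("5-fold accuracies:", scores.round(2), "mean:", scores.mean().round(2))
```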
# Note: ML vs Optimization
› Even if ML problems have an optimization purpose, they differ from typical


optimization problems since the ML solution is not provided in an analytical form
(i.e., typically no closed form solutions are available)
› The optimization purpose is reached in ML applications via an iterative approach
› By iterating the algorithm, the result can be progressively approximated
› Therefore, ML applications are forms of progressive learning applied to a fitness function (i.e., a loss function) that depends on the results of past observations
Artificial Neural Networks and Deep Learning
An introduction
# Deep Learning (DL) – Definitions
› What is DL? DL refers to a specific branch of ML based on the usage of Artificial


Neural Networks (ANNs) made up of several layers of neuronal nodes that process
inputs in order to produce expected outputs.
› Why is DL so important? DL is more efficient than traditional ML because the neuronal layers compute relevant features automatically, as the human brain normally does.
› What is DL (more formally)? A family of parametric models which learn non-linear hierarchical representations.
# Artificial Neural Networks (ANNs) – A short history
(Source: VUNO, 2016)
# Note: KB vs ML vs DL
› Knowledge-Based System (KBS): exploits logical rules defined by domain experts who explain how to solve a problem

› Traditional Machine Learning (ML): exploits machine-based progressive learning


algorithms applied to relevant features identified by a domain expert amongst all
the available data
› Deep Learning (DL): the same as traditional ML, but without the domain expert
# Real Neurons (a reminder…)
› A brain neuron behaves like an organic switch:


– Synapses within dendrites send signals to it
– When the received signal exceeds a specific threshold, the neuron is activated and emits a signal along its axon, which activates nearby neurons
# Artificial Neurons
› An artificial neuron is an abstraction of a real neuron


– It computes a weighted sum of its inputs and then the result is passed through a non-linear activation function (see the sketch below)
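› A minimal NumPy sketch of a single artificial neuron (weights, bias, and inputs are made-up values; sigmoid is just one possible activation):

```python
# One artificial neuron: weighted sum of the inputs followed by a non-linear activation.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])        # inputs
w = np.array([0.8, 0.1, -0.4])        # synaptic weights
b = 0.2                               # bias (shifts the activation threshold)

z = np.dot(w, x) + b                  # weighted sum
y = sigmoid(z)                        # non-linear activation
print("neuron output:", round(y, 3))
```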
# The simplest ANN: Rosenblatt's Perceptron
(Source: E. Ricci, FBK)

# Non-Linear Activation Functions


› The activation function must be selected depending on the specific task (e.g., ReLU and Identity: regression; Sigmoid: multiple classification, etc.)
› Activation functions are homogeneous within the same layer (i.e., all the nodes within the
same layer are activated by the same function)
# Layered architecture for ANNs
› A typical ANN is organized into multiple layers


› Each layer represents an abstraction hierarchy

› Each neuron within a layer must be activated according to a (possibly non-linear) activation function
# Some reference layered architectures
› MULTILAYER PERCEPTRON (MLP): the simplest architecture, made up of several layers, ideal for regression and general-purpose classification

› CONVOLUTIONAL NEURAL NETWORK (CNN): specifically designed for image


processing and classification
› RECURRENT NEURAL NETWORK (RNN): particularly suitable to sequential data


(e.g., text processing/translation, time-series analyses, etc.)

# Example #1 – Dense MLP
› Each neuron has multiple inputs and produces a single output that is passed as input to the
neurons of the following layer

› If each neuron of the following layer is reached by that input, the ANN is said to be dense
› If more than one hidden layer is adopted, the ANN is deep, otherwise it is shallow
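› A hedged sketch of such a dense MLP in Keras (assumes TensorFlow/Keras is installed; input size, layer sizes, and activations are arbitrary examples):

```python
# Small dense (fully-connected) MLP: every output of a layer reaches every neuron of the next one.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(4,)),                # 4 input features
    layers.Dense(16, activation="relu"),    # hidden layer 1 (dense connections)
    layers.Dense(16, activation="relu"),    # hidden layer 2 -> more than one hidden layer = "deep"
    layers.Dense(3, activation="softmax"),  # output layer for a 3-class problem
])
model.summary()
```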
# MLP architecture (details)
(Source: E. Ricci, FBK)
# MLP training by backpropagation (pseudo-code)
(Source: E. Ricci, FBK)
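› The pseudo-code itself is shown as a figure in the original slides; as a hedged substitute, here is a compact NumPy sketch of backpropagation for a one-hidden-layer MLP on a toy regression problem (all sizes, the target function, and the learning rate are illustrative):

```python
# Minimal backpropagation sketch: 1 hidden layer with sigmoid units, linear output, MSE loss.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))
y = X[:, :1] ** 2 + X[:, 1:]                        # toy non-linear target, shape (200, 1)

W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)    # input -> hidden
W2, b2 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)    # hidden -> output
lr = 0.1

for epoch in range(2000):
    # forward pass
    h = 1 / (1 + np.exp(-(X @ W1 + b1)))            # hidden activations (sigmoid)
    out = h @ W2 + b2                               # network output
    err = out - y                                   # gradient of 0.5*MSE w.r.t. the output

    # backward pass (chain rule, layer by layer)
    dW2 = h.T @ err / len(X)
    db2 = err.mean(axis=0)
    dh = err @ W2.T * h * (1 - h)                   # error back-propagated to the hidden layer
    dW1 = X.T @ dh / len(X)
    db1 = dh.mean(axis=0)

    # parameter update (gradient descent)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

h = 1 / (1 + np.exp(-(X @ W1 + b1)))
print("final MSE:", round(float(np.mean((h @ W2 + b2 - y) ** 2)), 4))
```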
# Example #2 – CNN
(Source: A. Asperti, UniBO, 2019)
# ANNs – Real-Life example
(Source: A. Brunello, UniUD, 2015)
# ANNs – Typical parameters & Hyper-parameters
Parameters
› Node weights
› Number of inputs
› Number of outputs

Hyper-parameters
› Number of hidden layers
› Number of nodes per hidden layer
› Inter-layer connection type


› Activation function type (per layer)
› Loss function (per layer)
› Learning rate
› Batch size

› Optimization method
› Number of epochs
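› A hedged sketch of where these hyper-parameters typically appear in a Keras workflow (assumes TensorFlow/Keras; all values and the random data are arbitrary examples):

```python
# Mapping the hyper-parameters listed above onto Keras arguments.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(500, 10)
y = np.random.randint(0, 2, 500)

model = keras.Sequential([                                 # number of hidden layers / nodes per layer
    keras.Input(shape=(10,)),
    layers.Dense(32, activation="relu"),                   # activation function type (per layer)
    layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),   # optimization method + learning rate
    loss="binary_crossentropy",                            # loss function
    metrics=["accuracy"],
)
model.fit(X, y, batch_size=32, epochs=5, verbose=0)        # batch size + number of epochs
```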
Data Preparation & Feature Engineering
Two core requirements in any ML/DL project
# Data Preparation in ML projects
Source: Developers.google.com (2020)
# Data Preparation in ML projects
(Figure indicating that data preparation takes roughly 80% of the overall effort. Source: Developers.google.com, 2020)
# Data Preparation in ML projects
Source: Developers.google.com (2020)
# Data Quality considerations
› Reliability: the degree to which you can trust your data (the more reliable training data are,
the more reliable the trained ML model is). Reliability is affected by:
– label errors (i.e., wrongly labelled data points),
– noisy features (e.g., fluctuating measurements),
– unfiltered data
– omitted values (e.g., the data operator forgot to fill in a data field)
– duplicate values
› Feature representation: it is the mapping of data to useful features. It may require:


– Data modelling
– Data normalization
– Outlier handling

# Data Types (from a statistics perspective)

› Categorical
– Nominal: discrete data, no quantitative value, no order
– Ordinal: discrete data, no quantitative value, ordered

› Numerical
– Discrete: discrete data, quantitative value, can be counted, cannot be measured
– Continuous: continuous data, quantitative value, cannot be counted, can be measured
– Interval: discrete data, quantitative value, ordered, no real base value, each unit has the same difference
– Ratio: discrete data, quantitative value, ordered, real base value, each unit has the same difference

Source: N. Donges (2018)
# Data Evaluation
› A dataset can show different behaviors:


– Excessive availability of data
– Shortage of data
– Skewed proportions of data classes
– Heterogeneous data types (numerical parameters plus string parameters plus Boolean parameters plus images, etc.)
› Therefore, data normalization is needed

# Data Transformation Techniques
Numerical parameters can be transformed in two ways (normalization, bucketing), while categorical ones can be encoded:

› Normalization: numerical parameters are converted into the same scale to improve performance and
training stability of the model. Normalization is needed when you have:
– excessively different values within the same feature (this may cause problems for the gradient update of the ML model)
– different ranges on different features (this may affect the model convergence)

› Bucketing: numerical (continuous) parameters are grouped into discrete bins/buckets

› Encoding: categorical (discrete) parameters are encoded as numerical (continuous) ones


Source: Developers.google.com (2020)
# Data Transformation Techniques for numerical parameters
Source: Developers.google.com (2020)
# Data Transformation Techniques: feature clipping
Source: Developers.google.com (2020)
# Data Transformation Techniques: log scaling
Source: Developers.google.com (2020)
# Data Transformation Techniques: Z-score
Source: Developers.google.com (2020)
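› A hedged NumPy sketch of the numerical transformations discussed in the last few slides (min-max normalization, clipping, log scaling, Z-score); the sample values and the clipping threshold are arbitrary examples:

```python
# Four common numerical transformations applied to the same toy feature vector.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 10.0, 500.0])

minmax = (x - x.min()) / (x.max() - x.min())        # normalization: scale values into [0, 1]
clipped = np.clip(x, a_min=None, a_max=100.0)       # feature clipping: cap extreme values at 100
log_scaled = np.log1p(x)                            # log scaling: compress a long-tailed range
z_score = (x - x.mean()) / x.std()                  # Z-score: zero mean, unit standard deviation

print(minmax.round(3), clipped, log_scaled.round(2), z_score.round(2), sep="\n")
```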
# Data Transformation Techniques: Bucketing (fixed spacing)
Source: Developers.google.com (2020)
# Data Transformation Techniques: Bucketing (quantile spacing)
Source: Developers.google.com (2020)
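› A hedged pandas sketch of both bucketing strategies (the age values, bin edges, and labels are arbitrary examples):

```python
# Bucketing a continuous feature: fixed-width bins (cut) vs quantile bins (qcut).
import pandas as pd

ages = pd.Series([18, 22, 25, 31, 40, 47, 58, 63, 71, 85])

fixed = pd.cut(ages, bins=[0, 30, 60, 100], labels=["young", "adult", "senior"])  # fixed spacing
quantile = pd.qcut(ages, q=4, labels=["Q1", "Q2", "Q3", "Q4"])                    # quantile spacing

print(pd.DataFrame({"age": ages, "fixed_bucket": fixed, "quantile_bucket": quantile}))
```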
# Data Transformation Techniques: Encoding (one-hot)
NOTE: from 1 feature with N values we obtain N features
Source: Developers.google.com (2020)
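› A hedged pandas sketch of one-hot encoding (the categorical column is an arbitrary example):

```python
# One-hot encoding: 1 categorical feature with N values becomes N binary features.
import pandas as pd

df = pd.DataFrame({"colour": ["red", "green", "blue", "green"]})
one_hot = pd.get_dummies(df, columns=["colour"])    # -> colour_blue, colour_green, colour_red
print(one_hot)
```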
Essential Statistics and Data Visualizations
When preparing your ML/DL project, statistical and visual data analysis is pivotal
# Suitable statistics
› Descriptive: identify patterns amongst data without allowing hypotheses to be made (e.g., mean, median, deviation, etc.)

› Inferential: allow making hypotheses on a sample of the entire population


# Typical descriptive statistics
› Count
› Mean
› Standard deviation
› Min
› Max
› 25% or bottom quartile
› 50% or second quartile


› 75% or top quartile

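› A hedged pandas sketch (the two columns are made-up sample values): all of the statistics listed above are returned by a single call.

```python
# Descriptive statistics for every numerical column in one call.
import pandas as pd

df = pd.DataFrame({"sepal_length": [5.1, 4.9, 6.3, 5.8, 7.1],
                   "sepal_width":  [3.5, 3.0, 3.3, 2.7, 3.0]})
print(df.describe())   # count, mean, std, min, 25%, 50%, 75%, max
```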
# Essential chart types: Box-and-Whisker plot

# Box-and-Whisker plot [example]


Source: Matplotlib documentation (2020)
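› A hedged Matplotlib sketch of a box-and-whisker plot on synthetic data (distributions and labels are arbitrary examples):

```python
# Box-and-whisker plot of three synthetic samples with increasing spread.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = [rng.normal(0, std, 100) for std in (1, 2, 3)]

plt.boxplot(data)
plt.xticks([1, 2, 3], ["std=1", "std=2", "std=3"])
plt.ylabel("value")
plt.show()
```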
# Essential chart types: Scatter Plot

# Scatter plot [example]


Source: Seaborn documentation (2020)
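› A hedged Seaborn sketch using its built-in "iris" example dataset (downloading the dataset requires an internet connection the first time):

```python
# Scatter plot of two numerical features, coloured by class.
import seaborn as sns
import matplotlib.pyplot as plt

iris = sns.load_dataset("iris")
sns.scatterplot(data=iris, x="sepal_length", y="petal_length", hue="species")
plt.show()
```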
# Essential chart types: Correlation Matrix
› It is a matrix having all the parameters to be tested placed both on rows and
columns

› It is widely used to assess visually the existence of linear correlations amongst the
considered parameters
› Each cell is colored depending on whether a positive or negative correlation exists

› No-correlation conditions can be depicted without any color or with specific colors
# Correlation Matrix [example]
Source: Data to Fish (2020)
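› A hedged sketch: compute a correlation matrix with pandas and display it as a coloured heatmap with Seaborn (the "iris" example dataset is an arbitrary choice):

```python
# Pairwise linear (Pearson) correlations displayed as a heatmap.
import seaborn as sns
import matplotlib.pyplot as plt

iris = sns.load_dataset("iris")
corr = iris.drop(columns="species").corr()   # correlation matrix of the numerical columns

sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
plt.show()
```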

Dataset Examples
Three widely-known, entry-level training & validation datasets

# MNIST – Handwritten digit database


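› A hedged sketch of loading MNIST through Keras (assumes TensorFlow is installed; the first call downloads the dataset):

```python
# Load the MNIST handwritten-digit dataset as NumPy arrays.
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)   # (60000, 28, 28) grey-scale digit images, labels 0-9
print(x_test.shape, y_test.shape)     # (10000, 28, 28)
```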
# IRIS Dataset
# PIMA Indians Diabetes dataset
(Figure: dataset and data model)
# Boston house prices dataset
(Figure: dataset and data model)

End of lesson.
