AI Lec-03
3. Feature Engineering
Thien Huynh-The
HCM City Univ. Technology and Education
Jan, 2023
Machine Learning
AI ML DL Data Science
• A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Classification
• Classification predictive modeling is the task of approximating a mapping function (f) from input variables (x) to discrete output variables (y).
• The output variables are often called labels or categories. The mapping function predicts the class or category for a given observation.
• For example, an email of text can be classified as belonging to one of two classes: “spam” and “not spam”.
Regression
• Regression predictive modeling is the task of approximating a mapping function (f) from input variables (x) to a continuous output variable (y).
• A continuous output variable is a real value, such as an integer or floating-point value. These are often quantities, such as amounts and sizes.
• For example, a house may be predicted to sell for a specific dollar value, perhaps in the range 100,000 to 200,000.
Classification
• A classification problem requires that examples be classified into one of two or more classes.
• A classification can have real-valued or discrete input variables.
• A problem with two classes is often called a two-class or binary classification problem.
• A problem with more than two classes is often called a multi-class classification problem.
• A problem where an example is assigned multiple classes is called a multi-label classification problem.
Regression
• A regression problem requires the prediction of a quantity.
• A regression can have real-valued or discrete input variables.
• A problem with multiple input variables is often called a multivariate regression problem.
• A regression problem where input variables are ordered by time is called a time series forecasting problem.
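To make the contrast concrete, here is a minimal scikit-learn sketch (library, dataset generators, and model choices are illustrative assumptions, not part of the lecture): a classifier predicts discrete labels, a regressor predicts continuous values.

```python
# Sketch: classification predicts discrete labels, regression predicts real values.
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: toy data with discrete labels y in {0, 1}
Xc, yc = make_classification(n_samples=100, n_features=4, random_state=0)
clf = LogisticRegression().fit(Xc, yc)
print(clf.predict(Xc[:3]))   # discrete class labels

# Regression: toy data with a continuous target y
Xr, yr = make_regression(n_samples=100, n_features=4, random_state=0)
reg = LinearRegression().fit(Xr, yr)
print(reg.predict(Xr[:3]))   # real-valued predictions
```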
• Normally, when working with ML algorithms, the whole dataset is divided into a training set and a test set.
• The training set is used to find the parameters of the ML model.
• The test set is used to evaluate the performance of the ML model.
• Note: performance evaluation can be applied to both sets.
• A good model should, first of all, work well on the training set.
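A minimal sketch of this train/test workflow, assuming scikit-learn is available (the iris dataset, 30% split, and k-NN classifier are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# Hold out 30% of the data as the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fit on the training set only; evaluate on both sets
model = KNeighborsClassifier().fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy:", model.score(X_test, y_test))
```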
• Reading task:
• Distinguish between online learning/training and offline learning/training
• What is generalization in ML? (ref: https://deepai.space/what-is-generalization-in-machine-learning/)
• Unsupervised learning models are used for three main tasks: clustering, association
and dimensionality reduction:
• Clustering is a data mining technique for grouping unlabeled data based on their similarities or
differences. This technique is helpful for market segmentation, image compression, etc.
• Association is another type of unsupervised learning method that uses different rules to find
relationships between variables in a given dataset.
• Dimensionality reduction is a learning technique used when the number of features (or
dimensions) in a given dataset is too high. It reduces the number of data inputs to a
manageable size while also preserving the data integrity.
• Read more at
https://www.ibm.com/cloud/blog/supervised-vs-unsupervised-learning
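As an illustration of the clustering task described above, here is a hedged sketch using k-means from scikit-learn; the two synthetic blobs of unlabeled points are an assumption for demonstration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated blobs of unlabeled 2-D points
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(5.0, 0.5, (50, 2))])

# Group the points by similarity, without any labels
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_[:5], km.labels_[-5:])   # cluster assignments
```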
• Dive into reinforcement learning at
https://developer.ibm.com/learningpaths/get-started-automated-ai-for-decision-making-api/what-is-automated-ai-for-decision-making/
• Feature engineering is usually used when the original raw data is too heterogeneous to be used directly for ML modeling. In this case, we should transform the raw data into the desired form.
• Feature extraction is the method for creating a new and smaller set of features
that capture most of the useful information of raw data
• Some of the popular types of raw data from which features (new feature creation)
can be extracted
• Text / Images / Geospatial data / Date and time / Web data / Sensory data
• Feature extraction is a technique used to reduce a large input data set into
relevant features. This is done with dimensionality reduction to transform large
input data into smaller, meaningful groups for processing.
Benefits
• Feature extraction can prove helpful when training a machine learning model.
• It leads to:
• A boost in training speed
• An improvement in model accuracy
• A reduction in risk of overfitting
• A rise in model explainability
• Better data visualization
• In the figure:
• It is too hard to discriminate the signals recorded in the two conditions, i.e., healthy operation vs. faulty operation.
• After computing condition indicators from the raw data, the signals become more discriminative.
• Frequency-domain features
• Power bandwidth
• Mean frequency
• Peak values
• Peak frequencies
• Harmonics
• ...
• Time-frequency domain features
• Spectral entropy
• Spectral kurtosis
• ...
• Time-domain features
• Mean
• Standard deviation
• Variance
• Skewness
• Kurtosis
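These time-domain features can be computed in a few lines with NumPy and SciPy; the synthetic Gaussian signal is an assumption for illustration (note that scipy.stats.kurtosis returns excess kurtosis by default, which is 0 for a normal distribution):

```python
import numpy as np
from scipy.stats import kurtosis, skew

def time_domain_features(x):
    """Common statistical time-domain features of a 1-D signal."""
    return {
        "mean": np.mean(x),
        "std": np.std(x),
        "variance": np.var(x),
        "skewness": skew(x),
        "kurtosis": kurtosis(x),   # excess kurtosis (0 for Gaussian data)
    }

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)   # synthetic signal
feats = time_domain_features(x)
print({k: round(float(v), 2) for k, v in feats.items()})
```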
• Depending on the data type, such as time series or high-dimensional data, vector or matrix, discrete or continuous, we should select an appropriate set of features to maximize the discrimination of input samples.
• Besides, there are some advanced feature extraction techniques:
• Explicit Semantic Analysis (ESA).
• Non-Negative Matrix Factorization (NMF).
• Singular Value Decomposition (SVD)
Ref: https://www.geeksforgeeks.org/singular-value-decomposition-svd/
• and Principal Component Analysis (PCA).
Ref: https://viblo.asia/p/gioi-thieu-principal-component-analysis-07LKXpq2KV4
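A minimal PCA sketch with scikit-learn (the random data matrix is an assumption for illustration); under the hood, scikit-learn computes PCA via an SVD of the centered data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # 200 samples, 10 features

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)          # project onto top-2 principal components
print(X_reduced.shape)                    # dimensionality reduced from 10 to 2
print(pca.explained_variance_ratio_)      # variance captured by each component
```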
• Standardization
• Standardize features by removing the mean and scaling to unit variance.
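A sketch of standardization with scikit-learn's StandardScaler (the toy matrix is an assumption): each column ends up with zero mean and unit variance.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

scaler = StandardScaler()
Xs = scaler.fit_transform(X)   # remove the mean, scale to unit variance
print(Xs.mean(axis=0))         # per-feature mean is now ~0
print(Xs.std(axis=0))          # per-feature standard deviation is now 1
```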
In-class assignment
• Discover evaluation protocols
• Learn the performance metrics in ML
• Describe the following feature selection methods
• Filter method
• Wrapper method
• Embedded method
• Python
• Classification using different classification algorithms with raw data
• Classification using different classification algorithms with feature extraction
• Select the best classification algorithm and feature descriptor to obtain the highest accuracy
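As a hedged starting point for the Python tasks above, here is a sketch comparing classifiers on raw data vs. transformed features; PCA stands in for the feature-extraction step, and the digits dataset, classifier choices, and component count are all assumptions for illustration:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("kNN", KNeighborsClassifier()), ("SVM", SVC())]:
    # Accuracy on raw pixel data
    acc_raw = clf.fit(X_tr, y_tr).score(X_te, y_te)
    # Accuracy after reducing to 20 PCA components (fit on training data only)
    pca = PCA(n_components=20).fit(X_tr)
    acc_pca = clf.fit(pca.transform(X_tr), y_tr).score(pca.transform(X_te), y_te)
    print(f"{name}: raw={acc_raw:.3f}, pca={acc_pca:.3f}")
```

Note that the transform is fit on the training set only and then applied to the test set, so no test information leaks into the features.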