
Feature Engineering

(In Machine Learning)


What is a feature?

An attribute (coordinate) of an observation (point) that is important from a learning or prediction point of view.

Not all attributes are features.

Examples of Features …
• An attribute (a column in a table)
• A line in an image
• A phrase
• A word count

What is Feature Engineering?

A set of steps taken to present the original and/or transformed data to a machine learning algorithm, such that important structures inherent in the data are exposed for the purpose of model creation.

Feature Engineering: when required, and not …
• Feature engineering is required when …
– Limited data is available
• “Curse of dimensionality” if too many features are considered in model building
• Over-fitting when there are many features but little data
– Limited computation power

• Feature engineering may not be required when …
– Copious data is available (e.g. images, server logs)
– Computation power is not an issue (e.g. cloud computing)
– Most important: availability of universal function approximators
• Artificial Neural Networks, Deep Learning Networks

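The over-fitting point above can be demonstrated numerically. This is an illustrative sketch (not from the slides): with more features than observations, ordinary least squares fits even pure noise perfectly on the training data.

```python
import numpy as np

# Illustrative sketch: with fewer observations than features, ordinary
# least squares can fit the training data perfectly even when the
# features are pure noise -- a symptom of over-fitting.
rng = np.random.default_rng(0)
n_samples, n_features = 10, 20          # fewer observations than features
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)          # target unrelated to X

# Minimum-norm least-squares solution
w, *_ = np.linalg.lstsq(X, y, rcond=None)
train_error = np.max(np.abs(X @ w - y))
print(train_error)                      # essentially zero: a perfect "fit" to noise
```

Such a model has learned nothing that generalises, which is why feature engineering matters when data is limited.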
The Feature Engineering Process
Execution of the following steps:

1. Create / identify a set of relevant features
2. Fit a model and run validation tests
3. Re-design or re-select features based on the validation results
4. Repeat step 2

Repeat the process until ‘satisfactory’ results are obtained or there is no further improvement.

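The steps above can be sketched as a small loop; the synthetic data, the train/validation split, and the candidate feature sets below are purely illustrative assumptions:

```python
import numpy as np

# A minimal sketch of the iterative process: fit on a training split,
# validate, then re-select features and repeat (all names illustrative).
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=n)  # only cols 0 and 2 matter

X_train, X_val = X[:150], X[150:]
y_train, y_val = y[:150], y[150:]

def validation_mse(cols):
    """Step 2: fit least squares on the chosen columns, score on validation."""
    w, *_ = np.linalg.lstsq(X_train[:, cols], y_train, rcond=None)
    resid = X_val[:, cols] @ w - y_val
    return float(np.mean(resid ** 2))

# Steps 1, 3, 4: try successive candidate feature sets, keep the best
candidates = [[0], [0, 1], [0, 2], [0, 1, 2, 3, 4]]
scores = {tuple(c): validation_mse(c) for c in candidates}
best = min(scores, key=scores.get)
print(best, scores[best])
```

In practice the re-design in step 3 is guided by domain knowledge and error analysis, not a fixed candidate list.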
Components of Feature Engineering
• Feature Extraction
• Feature Creation
• Feature Selection
• Dimensionality Reduction
– PCA, SVD

Feature Extraction
• Goal
– To increase the level of abstraction
– To reduce the total data sent into learning algorithms
• Example
– Edge detection in images
– Curvature detection in 3D models
– Number of concavities / convexities in 3D models
– Identifying regions with the same “colours”
• Satellite imaging
• Temperature based tool condition monitoring
• MRI / X-Ray processing
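The edge-detection example above can be sketched with a small Sobel-style filter; the kernel and the synthetic image are illustrative assumptions, not from the slides:

```python
import numpy as np

# Illustrative sketch: a tiny Sobel-style horizontal-edge detector, the
# classic feature-extraction step for images mentioned above.
def edge_strength(img):
    """Convolve with a vertical-gradient kernel (valid region only)."""
    kernel = np.array([[-1, -2, -1],
                       [ 0,  0,  0],
                       [ 1,  2,  1]], dtype=float)   # Sobel Gy kernel
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(kernel * img[i:i + 3, j:j + 3])
    return np.abs(out)

# A synthetic image: dark top half, bright bottom half -> one horizontal edge
img = np.zeros((8, 8))
img[4:, :] = 1.0
edges = edge_strength(img)
print(edges)   # large responses only along the boundary rows
```

The filter responses (rather than the raw pixels) are then passed to the learning algorithm, raising the level of abstraction and shrinking the data volume.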
Feature Creation
• Goal: To create a set of attributes
– based on domain knowledge or pre-processing /
visualization
– that are known to better describe the structure of the data
to be processed
• Example:
– In linear regression, addition of new terms like ‘log’, ‘tanh’, ‘exp’, ‘sin’, square, cube, x1 * x2 (feature combinations), etc.
– One-hot encoding: creation of dummy variables
– Discretizing continuous attributes
– Combining multiple attributes into one feature
– Addition of new terms resulting from ‘feature extraction’
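The linear-regression example above can be sketched in a few lines; the variable names and data are illustrative:

```python
import numpy as np

# Minimal sketch of feature creation: augment raw attributes x1, x2 with
# log, square, and interaction terms, as listed above.
rng = np.random.default_rng(2)
x1 = rng.uniform(1.0, 5.0, size=100)    # kept positive so log is defined
x2 = rng.uniform(1.0, 5.0, size=100)

# Original attributes plus engineered features, one column each
X = np.column_stack([
    x1,                # raw attribute
    x2,                # raw attribute
    np.log(x1),        # 'log' term
    x1 ** 2,           # square term
    x1 * x2,           # feature combination (interaction)
])
print(X.shape)         # two raw attributes become five features
```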
One-hot encoding
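A minimal sketch of one-hot encoding in plain NumPy (the ‘colours’ column is an invented example): each category value becomes its own 0/1 dummy variable.

```python
import numpy as np

# One-hot encoding: one 0/1 dummy column per distinct category
colours = np.array(["red", "green", "blue", "green", "red"])
categories = np.unique(colours)                   # sorted distinct values
one_hot = (colours[:, None] == categories[None, :]).astype(int)
print(categories)
print(one_hot)   # exactly one 1 per row
```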
Feature Selection
• Goal: To reduce the total number of ‘features’ sent
into the machine learning algorithm
– To reduce model complexity and model computation time
• Methods
– Forward selection:
• Start with minimal set and gradually add features
– Backward selection:
• Start with a maximal set and gradually reduce features
– Filter methods
• Based on statistics such as the Pearson correlation coefficient
– Embedded methods
• LASSO (Least Absolute Shrinkage and Selection Operator: L1
penalty), Ridge (L2), ElasticNet (L1+L2) Regression

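Greedy forward selection can be sketched as follows; the synthetic data, split, and stopping rule are illustrative assumptions, not from the slides:

```python
import numpy as np

# Forward selection sketch: start from an empty set and repeatedly add
# the feature that most lowers validation MSE; stop when nothing helps.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))
y = 2.0 * X[:, 1] + 1.5 * X[:, 4] + 0.1 * rng.normal(size=300)  # cols 1, 4 matter

X_tr, X_va, y_tr, y_va = X[:200], X[200:], y[:200], y[200:]

def mse(cols):
    """Least-squares fit on the chosen columns, scored on validation."""
    w, *_ = np.linalg.lstsq(X_tr[:, cols], y_tr, rcond=None)
    return float(np.mean((X_va[:, cols] @ w - y_va) ** 2))

selected, remaining = [], list(range(6))
best_score = float("inf")
while remaining:
    cand = min(remaining, key=lambda j: mse(selected + [j]))
    score = mse(selected + [cand])
    if score >= best_score:        # no candidate improves validation MSE
        break
    selected.append(cand)
    remaining.remove(cand)
    best_score = score
print(selected)
```

Backward selection runs the same loop in reverse, starting from all six columns and dropping the least useful feature at each step.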
Dimensionality Reduction
• Goal:
– To reduce the number of features by identifying informative feature combinations
• Example
– Principal Component Analysis

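Principal Component Analysis can be sketched via the singular value decomposition; the synthetic 3-D data below is an illustrative assumption, constructed to lie close to a 2-D plane:

```python
import numpy as np

# PCA sketch via the SVD: project centred data onto the top-k right
# singular vectors (the principal components).
rng = np.random.default_rng(4)
latent = rng.normal(size=(200, 2))                     # true 2-D structure
mixing = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, -0.3]])
X = latent @ mixing.T + 0.01 * rng.normal(size=(200, 3))  # 3-D observations

Xc = X - X.mean(axis=0)                    # centre each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S ** 2 / np.sum(S ** 2)        # variance ratio per component
X_reduced = Xc @ Vt[:2].T                  # keep the first two components
print(explained)
print(X_reduced.shape)
```

The two derived columns are linear combinations of all three original features, so almost no information is lost while a whole dimension is removed.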