Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 2

What is Feature Engineering?

Feature engineering is the process of transforming raw data into features that are suitable for machine learning
models. In other words, it is the process of selecting, extracting, and transforming the most relevant features from the
available data to build more accurate and efficient machine learning models.
The success of machine learning models heavily depends on the quality of the features used to train them. Feature
engineering involves a set of techniques that enable us to create new features by combining or transforming the existing
ones. These techniques help to highlight the most important patterns and relationships in the data, which in turn helps the
machine learning model to learn from the data more effectively.
MODEL EVALUATION:
Model evaluation is the process of using different evaluation metrics to understand a machine learning
model's performance, as well as its strengths and weaknesses. Model evaluation is important to assess
the efficacy of a model during initial research phases, and it also plays a role in model monitoring.

+Feature Engineering in ML Lifecycle

Feature engineering in ML lifecycle diagram

Some common types of feature engineering include:

 Scaling and normalization means adjusting the range and center of data to ease learning and improve the interpretation of
the results.
 Filling missing values implies filling in null values based on expert knowledge, heuristics, or by some machine
learning techniques. Real-world datasets can be missing values due to the difficulty of collecting complete datasets and
because of errors in the data collection process.
 Feature selection means removing features because they are unimportant, redundant, or outright counterproductive to
learning. Sometimes you simply have too many features and need fewer.
 Feature coding involves choosing a set of symbolic values to represent different categories. Concepts can be captured with
a single column that comprises multiple values, or they can be captured with multiple columns, each of which represents a D

You might also like