
OMBAIML 301: Basics of Artificial Intelligence & Machine Learning

Unit 6:
Data Quality & Transformation

By: Asst. Prof. Toshi Dave


Introduction

• Duplicate data, incomplete data, inconsistent data, wrong data, incorrectly specified data, improperly organized data, and inadequate data security are a few examples of poor data quality.

• The six basic factors that make up data quality are:
  • Accuracy
  • Completeness
  • Consistency
  • Validity
  • Uniqueness
  • Timeliness
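
Two of these factors, completeness and uniqueness, can be checked with simple counts. The sketch below is only an illustration: the function names `completeness` and `duplicate_count`, the use of `None` to mark a missing value, and the sample records are all assumptions, not part of the course material.

```python
def completeness(records, fields):
    """Share of cells that are filled (not None) across the given fields."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0  # nothing to check, so nothing is missing
    filled = sum(1 for r in records for f in fields if r.get(f) is not None)
    return filled / total


def duplicate_count(records, fields):
    """Number of records that repeat an earlier record on the given fields."""
    seen = set()
    duplicates = 0
    for r in records:
        key = tuple(r.get(f) for f in fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


# Hypothetical sample data: one missing city and one repeated row.
records = [
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": None},
    {"id": 1, "city": "Pune"},
]
fields = ["id", "city"]

print(completeness(records, fields))     # 5 of 6 cells are filled
print(duplicate_count(records, fields))  # 1
```

The other factors (accuracy, consistency, validity, timeliness) usually need domain rules or a reference source, so they are not sketched here.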

Data Imputation

• Impute manually with the mean value
• Impute with the median value using the Hmisc library (R)
• Impute with a specific constant value
• Impute the entire dataset
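
The imputation options listed above can be sketched in a few lines. The slides refer to R's Hmisc library; the version below is a plain-Python stand-in using only the standard library, not the Hmisc API. The names `impute` and `impute_dataset`, the `strategy` parameter, and the use of `None` for missing values are assumptions made for illustration.

```python
from statistics import mean, median


def impute(values, strategy="mean", constant=0.0):
    """Return a copy of values with None replaced by a fill value."""
    observed = [v for v in values if v is not None]
    if strategy == "constant":
        fill = constant
    elif not observed:
        raise ValueError("no observed values to compute a fill value from")
    elif strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


def impute_dataset(columns, strategy="mean", constant=0.0):
    """Apply the same strategy to every column of a dict of lists."""
    return {name: impute(col, strategy, constant) for name, col in columns.items()}


ages = [20.0, None, 30.0, 100.0]  # hypothetical column with one gap
print(impute(ages, "mean"))             # [20.0, 50.0, 30.0, 100.0]
print(impute(ages, "median"))           # [20.0, 30.0, 30.0, 100.0]
print(impute(ages, "constant", -1.0))   # [20.0, -1.0, 30.0, 100.0]
```

With the outlier 100 in the column, the mean fill (50) is pulled upward while the median fill (30) is not, which is one common reason to prefer the median for skewed data.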

Data Transformation

• Data transformation is the process of converting raw data into a format that makes it easier to conduct data mining and recover strategic information.

• Data can be transformed using three techniques:
  • Min-max scaling
  • Log transformation
  • Z-score standardization
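
The three techniques can be written out as short functions. This is a minimal sketch using the usual textbook definitions: min-max maps values to [0, 1] via (x - min) / (max - min), and z-score computes (x - mean) / standard deviation. Two choices here are assumptions, since the slides do not specify them: the log transformation uses log(1 + x) so that zero is allowed, and the z-score uses the population standard deviation.

```python
import math
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Apply log(1 + x); assumes every value is non-negative."""
    return [math.log1p(v) for v in values]


def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


print(min_max([10, 20, 30]))              # [0.0, 0.5, 1.0]
print(log_transform([0, 9, 99]))          # compresses large values
print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))  # mean 5, standard deviation 2
```

Min-max keeps the shape of the distribution but is sensitive to outliers; the log transformation reduces right skew; the z-score puts columns with different units on a comparable scale.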

Binning

• Binning is a data pre-processing method that divides a set of numerical values into bins.
• Binning can occasionally improve the forecasting model's precision.
• By dividing a set of numerical values into fewer bins, data binning gives you a better understanding of the distribution of the data.
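
One simple way to bin is to split the range of the data into equal-width intervals. The slides do not say which binning scheme is meant, so equal-width binning is an assumption here (equal-frequency binning is another common choice), and the function name `equal_width_bins` and the sample ages are hypothetical.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Return a bin index in 0..n_bins-1 for each value, using equal-width bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]  # all values identical: one bin
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [5, 15, 25, 35, 45, 55]    # hypothetical sample
bins = equal_width_bins(ages, 3)
print(bins)                       # [0, 0, 1, 1, 2, 2]
print(Counter(bins))              # how many values fall in each bin
```

Counting values per bin, as in the last line, is what gives the quick view of the distribution that the slide describes.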

THANK YOU
