12 Ai Data Story 3
1. Problem statement
2. Hypothesis generation
3. Reading and understanding
4. Exploratory Data Analysis (EDA)
Univariate Analysis
Bivariate Analysis
5. Feature Engineering
6. Modelling and Evaluation
STEPS OF DATA MODELLING TO GENERATE DATA STORIES
1. Problem statement
Three basic questions:
1. What is the problem?
2. Why does the problem need to be solved?
3. How can we solve the problem?
Answering these questions helps us understand the problem and validate the data collected.
2. Hypothesis generation
• After the problem statement, list the factors on which the target variable depends.
• This can be done using our own experience and by brainstorming with key stakeholders.
EXPLORATORY DATA ANALYSIS (EDA)
Univariate Analysis
Visualizes features one at a time.
Used to see how each feature is distributed.
Histograms and box plots are used for continuous features.
Count plots are used for categorical features.
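The numbers behind these three plots can be sketched with the Python standard library alone. This is a minimal illustration, not a prescribed method: the `ages` and `cities` values are invented sample data, and the helper names `histogram_counts` and `five_number_summary` are hypothetical. A plotting library would normally draw the charts from the same quantities.

```python
from collections import Counter
import statistics

# Hypothetical sample data, invented for illustration only.
ages = [22, 25, 25, 31, 38, 41, 47, 52, 58, 64]               # continuous feature
cities = ["Pune", "Delhi", "Pune", "Mumbai", "Delhi", "Pune"]  # categorical feature


def histogram_counts(values, bin_width, start):
    """Count how many values fall in each bin [lo, lo + bin_width)."""
    counts = Counter()
    for v in values:
        lo = start + ((v - start) // bin_width) * bin_width
        counts[lo] += 1
    return dict(sorted(counts.items()))


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot displays."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


age_hist = histogram_counts(ages, bin_width=10, start=20)  # histogram bars
age_box = five_number_summary(ages)                        # box plot values
city_counts = Counter(cities)                              # count plot bars

print(age_hist)
print(age_box)
print(city_counts.most_common())
```

The histogram and the five-number summary describe the continuous feature; the category counts describe the categorical one.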
EXPLORATORY DATA ANALYSIS (EDA)
Bivariate Analysis
Examines the trend of each feature against the target variable.
Also applied to pairs of features to reveal trends and patterns between them.
For continuous-continuous features – scatter plots, line plots, etc.
For continuous-categorical features – bar plots, box plots, etc.
For categorical-categorical features – stacked bar plots, cross-tables.
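Each of the three pairings has a simple numeric counterpart to its plot. The sketch below uses only the Python standard library. It is an illustration under assumptions: the `rows` records and their column names are invented, and Pearson correlation is used here as one common summary of what a scatter plot shows, not as the method this deck prescribes.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical records, invented for illustration only.
rows = [
    {"income": 30, "loan": 100, "education": "Graduate", "approved": "Y"},
    {"income": 45, "loan": 130, "education": "Graduate", "approved": "Y"},
    {"income": 25, "loan": 90, "education": "Not Graduate", "approved": "N"},
    {"income": 60, "loan": 180, "education": "Graduate", "approved": "Y"},
    {"income": 20, "loan": 80, "education": "Not Graduate", "approved": "Y"},
    {"income": 35, "loan": 120, "education": "Not Graduate", "approved": "N"},
]


def pearson(xs, ys):
    """Continuous-continuous: linear correlation, the trend a scatter plot shows."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def group_means(records, category, value):
    """Continuous-categorical: mean per category, the heights of a bar plot."""
    groups = defaultdict(list)
    for r in records:
        groups[r[category]].append(r[value])
    return {k: sum(v) / len(v) for k, v in groups.items()}


def cross_table(records, a, b):
    """Categorical-categorical: counts per (a, b) pair, as in a cross-table."""
    return Counter((r[a], r[b]) for r in records)


r_income_loan = pearson([r["income"] for r in rows], [r["loan"] for r in rows])
income_by_education = group_means(rows, "education", "income")
education_vs_approved = cross_table(rows, "education", "approved")

print(round(r_income_loan, 3))
print(income_by_education)
print(dict(education_vs_approved))
```

Here `approved` plays the role of the target variable, so the cross-table shows how approval counts differ by education level.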
FEATURE ENGINEERING
• Consists of two activities: feature generation and encoding.
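The two activities can be sketched as follows, using only the Python standard library. The column names, the derived `total_income` feature, and the choice of one-hot encoding are assumptions made for illustration; the slides do not say which generated features or which encoding scheme to use.

```python
# Hypothetical records, invented for illustration only.
rows = [
    {"applicant_income": 30, "coapplicant_income": 10, "area": "Urban"},
    {"applicant_income": 45, "coapplicant_income": 0, "area": "Rural"},
    {"applicant_income": 25, "coapplicant_income": 15, "area": "Urban"},
]


def add_total_income(row):
    """Feature generation: derive a new feature from existing ones."""
    new_row = dict(row)
    new_row["total_income"] = row["applicant_income"] + row["coapplicant_income"]
    return new_row


def one_hot(records, column):
    """Encoding: replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new_row)
    return encoded


generated = [add_total_income(r) for r in rows]
features = one_hot(generated, "area")

for f in features:
    print(f)
```

Generation runs before encoding here only for readability; the two steps are independent in this sketch.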