
1. Write the Problem Statement and Goal

2. Import All Necessary Libraries and Read the Dataset

3. Get Basic Info on Both Datasets Using info() or describe()

4. Write Dataset Dimensions and Identify Dependent Variable
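
Steps 2 to 4 can be sketched as follows. This is a minimal example, not a fixed recipe: the CSV text, the column names, and the idea that the dependent variable is the one column missing from the test set are all assumptions made for illustration. With real data, the `io.StringIO` objects would be replaced by file paths.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for real training and test files.
TRAIN_CSV = """age,fare,sex,survived
22,7.25,male,0
38,71.28,female,1
26,7.92,female,1
35,53.10,female,1
35,8.05,male,0
"""
TEST_CSV = """age,fare,sex
34,7.83,male
47,7.00,female
"""

train = pd.read_csv(io.StringIO(TRAIN_CSV))
test = pd.read_csv(io.StringIO(TEST_CSV))

# Basic info: dtypes and non-null counts, then summary statistics.
train.info()
print(train.describe())

# Dataset dimensions as (rows, columns).
print(train.shape, test.shape)

# Assumption: the dependent variable is the column present in the
# training set but absent from the test set.
target_candidates = [c for c in train.columns if c not in test.columns]
print(target_candidates)
```

Note that `describe()` summarizes only numeric columns by default; passing `include="all"` also covers categorical ones.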

5. Explain All the Features of the Datasets


6. Make Some Initial Hypotheses Based on Known Facts

8. Do Exploratory Data Analysis on the Dataset


● Draw a “Countplot” for each feature (for classification)
● Correlate each feature with the dependent variable
● Also use a “Boxplot” of each feature against the dependent
variable to extend the analysis and find outliers
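
The numbers behind these plots can be computed directly with pandas, which is one way to check what a chart is showing. The sketch below uses a made-up data frame with an assumed dependent variable named `survived`. The 1.5 × IQR rule is the usual boxplot convention for flagging outliers, used here as an assumption about what "find outliers" means.

```python
import pandas as pd

# Hypothetical data; "survived" is the assumed dependent variable.
df = pd.DataFrame({
    "sex": ["male", "female", "female", "female", "male", "male", "female", "male"],
    "age": [22, 38, 26, 35, 35, 54, 27, 2],
    "fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86, 11.13, 512.33],
    "survived": [0, 1, 1, 1, 0, 0, 1, 0],
})

# The counts a countplot would draw: one bar per category, split by class.
counts = pd.crosstab(df["sex"], df["survived"])
print(counts)

# Correlation of each numeric feature with the dependent variable.
numeric = df[["age", "fare", "survived"]]
corr_with_target = numeric.corr()["survived"].drop("survived")
print(corr_with_target)

# The points a boxplot would mark as outliers: values more than
# 1.5 * IQR below the first quartile or above the third quartile.
q1, q3 = df["fare"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["fare"] < q1 - 1.5 * iqr) | (df["fare"] > q3 + 1.5 * iqr)]
print(outliers)
```

The plots themselves could be drawn with seaborn, for example `sns.countplot(data=df, x="sex", hue="survived")` and `sns.boxplot(data=df, x="survived", y="fare")`.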

9. Do Feature Engineering
● Use grouping and categorization methods to group and
categorize the data
● Create bins and group them under some category
● Plot the new features against the dependent variable again to
look for new insights
● Identify the type of skewness in each feature, apply an appropriate
method to reduce it, and fill in missing values
● Use proper transformation techniques to engineer the features
● Also, convert categorical data to numerical data
● Make sure the training set and test set end up with the same
features after these steps
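
One possible way to carry out the steps above is sketched below. The column names, bin edges, and the choice of median imputation and a `log1p` transform are illustrative assumptions; the right methods depend on the actual data. The helper `engineer` is hypothetical, not part of any library.

```python
import numpy as np
import pandas as pd

# Hypothetical train and test frames; column names are illustrative only.
train = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0, 4.0, 61.0],
    "fare": [7.25, 71.28, 7.92, 53.10, 8.05, 512.33],
    "embarked": ["S", "C", "S", "Q", "S", "C"],
})
test = pd.DataFrame({
    "age": [30.0, np.nan],
    "fare": [7.83, 26.0],
    "embarked": ["S", "C"],  # "Q" never appears in the test set
})


def engineer(df, age_median):
    """Hypothetical helper applying the same steps to any frame."""
    out = df.copy()
    # Fill missing values with a statistic learned from the training set.
    out["age"] = out["age"].fillna(age_median)
    # Bin a continuous feature into labelled groups.
    out["age_group"] = pd.cut(
        out["age"],
        bins=[0, 12, 18, 60, 120],
        labels=["child", "teen", "adult", "senior"],
    )
    # log1p compresses the long right tail of a positively skewed feature.
    out["fare_log"] = np.log1p(out["fare"])
    # Convert categorical columns to numeric indicator columns.
    return pd.get_dummies(
        out.drop(columns=["fare"]), columns=["embarked", "age_group"]
    )


# A positive value indicates right (positive) skew.
print(train["fare"].skew())

age_median = train["age"].median()
X_train = engineer(train, age_median)
X_test = engineer(test, age_median)

# Give the test set exactly the training columns; categories that never
# occur in the test set become all-zero columns.
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)
print(X_train.columns.tolist())
```

Computing the median on the training set and reusing it for the test set keeps the two sets consistent without letting test data influence the preprocessing.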

10. Modelling
● Import the necessary classifier classes
● Split the dataset into Features and Dependent Variable
● Apply Feature Scaling
● Use Different Models with Hyperparameter Tuning
● Choose the best model
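
The modelling steps above might look like the following with scikit-learn. The synthetic data, the two candidate classifiers, and the small parameter grids are assumptions chosen to keep the example short; any classifier and grid could be substituted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the engineered dataset:
# X holds the features, y the dependent variable.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models with illustrative hyperparameter grids.
candidates = {
    "logreg": (
        LogisticRegression(max_iter=1000),
        {"clf__C": [0.1, 1.0, 10.0]},
    ),
    "forest": (
        RandomForestClassifier(random_state=0),
        {"clf__n_estimators": [50, 100], "clf__max_depth": [3, None]},
    ),
}

results = {}
for name, (clf, grid) in candidates.items():
    # Scaling lives inside the pipeline, so it is refit on each
    # cross-validation training fold and never sees the held-out fold.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", clf)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = search

# Choose the model with the best mean cross-validated accuracy.
best_name = max(results, key=lambda n: results[n].best_score_)
best = results[best_name]
print(best_name, best.best_params_, best.best_score_)
print("held-out accuracy:", best.score(X_val, y_val))
```

Tree-based models such as the random forest do not need feature scaling; it is kept in both pipelines here only so that the candidates are built the same way.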

11. Present the Results of the Analysis
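
For a classification task, one common way to present results is a confusion matrix plus per-class precision and recall. The sketch below assumes a binary problem and uses synthetic data and a logistic regression purely as placeholders for the chosen model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic placeholder data and model.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall, and F1 score for each class.
print(classification_report(y_test, y_pred))

# Accuracy equals the diagonal of the confusion matrix over its total.
acc = accuracy_score(y_test, y_pred)
print("accuracy:", acc)
```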
