DS2 Report

The dataset contains 10,886 rows and 12 columns of both continuous and categorical data. The variables are not normally distributed, and the most frequent values occur in the season, holiday, working, and weather columns. The target variable "count" is balanced. Missing values were filled by mean imputation after the data types were changed, and outliers were detected using the IQR method. Categorical data was one-hot encoded, and continuous features were normalized using min-max scaling to prepare the data for modeling.

Uploaded by

i221435

DATASET 2

1. What is the size of the dataset, and what types of variables are included?

The dataset has 10,886 rows and 12 columns. It includes both continuous and categorical
data.

2. What are the distributions of the variables, and are they normally distributed?

No, the variables are not normally distributed.
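
The report does not say how normality was checked. One simple check, shown here only as a sketch, is to compute the skewness of a numeric column: a value near 0 suggests a symmetric distribution, while a large positive or negative value suggests a skewed, non-normal one. The function name `skewness` and the sample values are assumptions made for illustration, not taken from the dataset.

```python
import statistics

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return 0.0  # a constant column has no skew
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)

# Made-up columns: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 10, 50]

print(skewness(symmetric))
print(skewness(right_skewed))
```

Skewness alone does not prove or disprove normality; a histogram or a formal test would normally be used alongside it.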

3. What are the most frequent values or categories in the dataset, and how do they relate
to the target variable?

The columns season, holiday, working, and weather contain the most frequent values.

4. What are the important variables that influence the target variable?

5. Are there any correlations or patterns between the independent variables?

Yes. The variables “registered” and “casual” are highly correlated with each other, with a
correlation above 0.5.
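
A correlation such as the one reported above could be computed with the Pearson coefficient. The sketch below uses only the standard library; the helper name `pearson` and the sample values are assumptions for illustration and are not rows from the actual dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up values standing in for the "casual" and "registered" columns.
casual = [3, 8, 5, 12, 20, 15]
registered = [13, 32, 27, 60, 75, 50]

r = pearson(casual, registered)
print(round(r, 3), r > 0.5)
```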

6. Is the dataset balanced, or is there an imbalance in the target variable distribution?

The dataset is balanced with respect to the target variable “count” for this test set.

7. Are there any missing values, and if so, what is the best way to impute them?

Yes, there are missing values in this dataset. I filled them with the column mean, but before
that I changed the data type.
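
The two steps described above (changing the data type, then filling with the mean) can be sketched as follows. The report does not say which columns or types were involved, so this example assumes a column read in as strings with blank or missing cells; the helper names `to_float` and `impute_mean` are hypothetical.

```python
import statistics

def to_float(value):
    """Convert a raw cell to float; return None when it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def impute_mean(raw_column):
    """Cast every cell to float, then replace missing cells with the mean."""
    numeric = [to_float(v) for v in raw_column]
    observed = [v for v in numeric if v is not None]
    mean = statistics.fmean(observed)  # mean of the observed values only
    return [mean if v is None else v for v in numeric]

# Made-up column: two cells are missing ("" and None).
raw = ["9.84", "", "13.12", None, "10.04"]
filled = impute_mean(raw)
print(filled)
```

Mean imputation keeps the column mean unchanged but shrinks its variance, which is worth keeping in mind when many values are missing.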

8. Are there any outliers, and how should they be treated?

Yes, there are outliers in this dataset, which I detected using the IQR method.
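
As an illustration of the IQR method, the sketch below flags values that fall below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR. The multiplier 1.5 is the usual convention and is an assumption here, since the report does not state which threshold was used; the function name `iqr_outliers` and the sample values are also made up.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Made-up rental counts with one extreme value.
counts = [16, 40, 32, 13, 1, 36, 28, 45, 977]
print(iqr_outliers(counts))
```

How the flagged values are then treated (removed, capped, or kept) is a separate decision that the report does not describe.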

9. What is the appropriate method for feature scaling or normalization?

I used the min-max method to normalize the data.
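
Min-max scaling maps each value x to (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. A minimal sketch, with the function name `min_max_scale` and the sample values chosen only for illustration:

```python
def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up feature values.
scaled = min_max_scale([10, 20, 30])
print(scaled)
```

Because the minimum and maximum define the scale, a single extreme outlier can compress all other values into a narrow band, so it is usually sensible to deal with outliers first.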

10. What is the best way to handle categorical variables in the model?

The categorical data was converted using one-hot encoding.
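
One-hot encoding replaces a categorical column with one 0/1 indicator column per category. The sketch below assumes a column of integer season codes; the function name `one_hot` and the sample values are hypothetical.

```python
def one_hot(values):
    """Return the sorted category list and one indicator row per value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Made-up season codes.
season = [1, 2, 4, 2, 3]
categories, rows = one_hot(season)

print(categories)
for row in rows:
    print(row)
```

Each row contains exactly one 1, in the position of that row's category.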
