DS2 Report
1. What is the size of the dataset, and what types of variables are included?
The dataset has 10886 rows and 12 columns. It includes both continuous and categorical
data.
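A minimal sketch of how the size and variable types could be checked, using only the Python standard library. The sample rows, the helper `variable_kind`, and the `max_levels` threshold are hypothetical and only illustrate the idea; the report does not say how the types were determined.

```python
# Hypothetical sample rows; the real dataset has 10886 rows and 12 columns.
rows = [
    {"season": 1, "holiday": 0, "temp": 9.84, "count": 16},
    {"season": 1, "holiday": 0, "temp": 9.02, "count": 40},
    {"season": 2, "holiday": 0, "temp": 14.76, "count": 32},
    {"season": 2, "holiday": 1, "temp": 13.12, "count": 13},
    {"season": 3, "holiday": 0, "temp": 22.14, "count": 94},
    {"season": 4, "holiday": 0, "temp": 18.86, "count": 106},
]

# Size of the dataset as (rows, columns).
shape = (len(rows), len(rows[0]))


def variable_kind(rows, column, max_levels=4):
    """Label a column by its number of distinct values.

    Assumption: a column with at most `max_levels` distinct values is
    treated as categorical, anything else as continuous.
    """
    levels = {row[column] for row in rows}
    return "categorical" if len(levels) <= max_levels else "continuous"


kinds = {column: variable_kind(rows, column) for column in rows[0]}
print(shape, kinds)
```

The distinct-value threshold is a rough heuristic; on the full dataset it would need to be chosen by inspecting each column.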
2. What are the distributions of the variables, and are they normally distributed?
3. What are the most frequent values or categories in the dataset, and how do they relate
to the target variable?
The season, holiday, working and weather columns contain the most frequent values.
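One way to look at the most frequent categories and their relation to the target is sketched below. The sample rows and the helpers `most_frequent` and `mean_target_by` are hypothetical; only the column names come from the report.

```python
from collections import Counter

# Hypothetical sample rows; "count" is the target variable.
rows = [
    {"season": 1, "weather": 1, "count": 16},
    {"season": 1, "weather": 1, "count": 40},
    {"season": 2, "weather": 2, "count": 32},
    {"season": 1, "weather": 1, "count": 13},
]


def most_frequent(rows, column):
    """Return (value, frequency) for the most common value in a column."""
    return Counter(row[column] for row in rows).most_common(1)[0]


def mean_target_by(rows, column, target="count"):
    """Average target value for each category of a column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[target])
    return {value: sum(vals) / len(vals) for value, vals in groups.items()}


top_season = most_frequent(rows, "season")
season_means = mean_target_by(rows, "season")
print(top_season, season_means)
```

Comparing the per-category means of the target gives a first impression of how each frequent category relates to "count".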
4. What are the important variables that influence the target variable?
Yes, “registered” and “casual” are highly correlated with each other, with a correlation
coefficient greater than 0.5.
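The correlation check can be sketched as follows, assuming the Pearson coefficient was used (the report does not name the measure). The sample values and the helper `pearson` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for the two columns named in the report.
casual = [3, 8, 5, 12, 20]
registered = [13, 32, 27, 40, 61]

r = pearson(casual, registered)
# Flag the pair when the absolute correlation exceeds the 0.5 threshold.
highly_correlated = abs(r) > 0.5
print(round(r, 3), highly_correlated)
```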
With respect to the target variable “count”, the dataset is balanced for this test set.
7. Are there any missing values, and if so, what is the best way to impute them?
Yes, there are missing values in this dataset. I filled them with the mean, but before that I
changed the data type.
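The two steps described above (type conversion, then mean imputation) can be sketched like this. It assumes the column was stored as text with blanks for missing entries; the report does not say which columns or types were involved, so `raw_humidity` and `to_float` are hypothetical.

```python
# Hypothetical column read as text: numbers stored as strings, blanks for missing.
raw_humidity = ["81", "80", "", "75", "", "64"]


def to_float(value):
    """Convert a raw string to float; blank strings become None (missing)."""
    return float(value) if value.strip() else None


# Step 1: change the data type (string -> float, or None when missing).
typed = [to_float(v) for v in raw_humidity]

# Step 2: compute the mean over the observed values only.
observed = [v for v in typed if v is not None]
mean_value = sum(observed) / len(observed)

# Step 3: fill each missing entry with that mean.
imputed = [mean_value if v is None else v for v in typed]
print(mean_value, imputed)
```

Converting the type first matters because a mean cannot be computed on text values.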
10. What is the best way to handle categorical variables in the model?