DS1 Report

The dataset contains 5,000 rows and 27 columns of continuous and categorical data. The variables are not normally distributed. Delay, housing, race, sex are the most frequent categorical values related to the diagnosis target variable. There are correlations between variables like focus, concentration, and unusual thoughts. The dataset is balanced for the diagnosis target variable. Missing values were imputed using mean and frequent values, and outliers were identified using IQR. Categorical variables were handled using one-hot encoding and normalization was done using min-max scaling.

Uploaded by

i221435

DATASET 1

1. What is the size of the dataset, and what types of variables are included?

The dataset has 5,000 rows and 27 columns. It includes both continuous and
categorical data.

2. What are the distributions of the variables, and are they normally distributed?

No, the variables are not normally distributed. I converted the categorical data into discrete
data and then checked both the discrete and the continuous data against a normal distribution.
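
The report does not say how normality was checked. One simple possibility, shown here only as a sketch, is to compute skewness and excess kurtosis for each numeric column; both are close to zero for normally distributed data. The function name and the sample values are hypothetical, and a constant column (zero variance) would need separate handling.

```python
import statistics


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) based on population moments."""
    n = len(values)
    mean = statistics.fmean(values)
    # Second, third and fourth central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical column values, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 1, 2, 2, 3, 10]

skew_symmetric, _ = skewness_and_excess_kurtosis(symmetric)
skew_right, _ = skewness_and_excess_kurtosis(right_skewed)
```

A skewness far from zero, as for the second list, is evidence against normality. A formal test such as Shapiro-Wilk would be a stronger check.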

3. What are the most frequent values or categories in the dataset, and how do they relate
to the target variable?

Delay, housing, race, and sex are the columns containing the most frequent values. They are related
to the target variable, diagnosis, as all of them are categorical.

4. What are the important variables that influence the target variable?

5. Are there any correlations or patterns between the independent variables?

Yes, “Focus” and “Concentration” and “unusual thought and apathy” are
highly correlated with each other.
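
The report does not state which correlation measure was used. As an illustration only, the sketch below computes the Pearson correlation coefficient between two columns. The column names echo the report, but the values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical scores, not taken from the dataset.
focus = [1, 2, 3, 4, 5]
concentration = [2, 4, 5, 4, 5]

r = pearson(focus, concentration)
```

Values of r near +1 or -1 indicate a strong linear relationship. The threshold for "highly correlated" is a judgment call and is not given in the report.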

6. Is the dataset balanced, or is there an imbalance in the target variable distribution?

The dataset is balanced with respect to the target variable “Diagnosis” for this test set.
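
A balance check like this usually comes down to counting the class labels. The sketch below assumes the target is a list of labels. The helper names and the sample labels are hypothetical.

```python
from collections import Counter


def class_proportions(labels):
    """Share of each class among all labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def imbalance_ratio(labels):
    """Largest class count divided by smallest; 1.0 means perfectly balanced."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical diagnosis labels, not taken from the dataset.
diagnosis = [0, 1, 0, 1, 1, 0, 0, 1]

proportions = class_proportions(diagnosis)
ratio = imbalance_ratio(diagnosis)
```

A ratio close to 1 supports calling the target balanced. A much larger ratio would suggest resampling or class weights.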

7. Are there any missing values, and if so, what is the best way to impute them?

Yes, there are missing values in this dataset. I imputed them with the mean for continuous
columns and the most frequent value for categorical columns.
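
The two imputation strategies named above can be sketched as follows, assuming missing entries are represented as None. This is not the report's actual code. The "age" column and all values are hypothetical.

```python
import statistics


def impute_mean(values):
    """Replace None with the mean of the observed numeric values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


def impute_most_frequent(values):
    """Replace None with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    fill = statistics.mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical columns, not taken from the dataset.
age = [20.0, None, 30.0, 40.0]
housing = ["stable", None, "stable", "unstable"]

age_filled = impute_mean(age)
housing_filled = impute_most_frequent(housing)
```

Mean imputation is sensitive to outliers, so the median is a common alternative for skewed columns.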

8. Are there any outliers, and how should they be treated?

Yes, there are outliers in this dataset, which I identified using the IQR method.
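
The IQR rule flags any value below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR, where IQR = Q3 - Q1. The sketch below assumes the conventional multiplier of 1.5, which the report does not state, and uses made-up values. Quartile conventions differ between libraries, so the exact fences may vary slightly.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Lower and upper fences for the IQR outlier rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Values lying outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if v < low or v > high]


# Hypothetical numeric column, not taken from the dataset.
scores = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]

bounds = iqr_bounds(scores)
outliers = find_outliers(scores)
```

The report answers how outliers were found but not how they were treated. Common options are removing them, capping them at the fences, or leaving them in place when they are genuine observations.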

9. What is the appropriate method for feature scaling or normalization?

I used the min-max method to normalize the data.
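
Min-max scaling maps each value x to (x - min) / (max - min), so the column ends up in the range [0, 1]. A minimal sketch follows. Mapping a constant column to zeros is an assumption made here to avoid division by zero.

```python
def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Constant column: the range is zero, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical numeric column, not taken from the dataset.
scaled = min_max_scale([10, 20, 30])
```

Because the minimum and maximum define the scale, a single extreme outlier compresses all other values. This is one reason to handle outliers before scaling.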

10. What is the best way to handle categorical variables in the model?

The categorical data was converted using one-hot encoding.
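
One-hot encoding turns a categorical column into one 0/1 indicator column per category. The sketch below shows the idea on a hypothetical "sex" column with made-up values. It is not the report's implementation.

```python
def one_hot(values):
    """Return (sorted category names, one indicator row per input value)."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical categorical column, not taken from the dataset.
sex = ["F", "M", "F"]

categories, encoded = one_hot(sex)
```

Each row contains exactly one 1. For columns with many categories this can widen the table considerably, so the encoded column count is worth checking.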
