Data Mining Project Using Naive Bayes

The document is a student assignment submission for a Data Warehousing and Data Mining course. It discusses using the Naive Bayes classification algorithm to predict whether patients will have a stroke based on attributes like gender, age, and smoking status from a dataset containing over 5,000 records obtained from Kaggle. The student performs preprocessing like splitting the data into training and test sets, trains a Naive Bayes model on the training set which achieves an accuracy of 90.7697%, and tests it on the held-out test set for an accuracy of 92.4658%.

Uploaded by: Mr SHINIGAMI

DWDM Midterm Assignment

Course: Data Warehousing & Data Mining

Submitted by:
ID:
Section:
To
Course Teacher:

Date of Submission:

American International University-Bangladesh (AIUB)

I selected Naive Bayes classification for the following reasons:
1. The algorithm is fast and can easily predict the class of a test
dataset.
2. It handles multi-class prediction problems well.
3. When the assumption of feature independence holds, a Naive Bayes
classifier can perform better than other models while needing less training
data.
4. It performs especially well with categorical input variables, compared
to numerical ones.
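
To make the points above concrete, here is a minimal sketch of a Naive Bayes classifier for categorical attributes, written in plain Python. It is not the WEKA implementation used in this assignment. The function names, the add-one (Laplace) smoothing, and the six toy records are all assumptions made up for illustration and are not taken from the Kaggle dataset.

```python
from collections import Counter, defaultdict
import math

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, class_label)][value] -> count
    value_counts = defaultdict(Counter)
    # distinct values seen per attribute, used for add-one smoothing
    distinct = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            distinct[i].add(value)
    return class_counts, value_counts, distinct

def predict(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, distinct = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        # log prior of the class
        score = math.log(n / total)
        # independence assumption: add one log likelihood per attribute
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            score += math.log((count + 1) / (n + len(distinct[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy records (gender, smoking status) with a stroke label.
rows = [("Male", "smokes"), ("Female", "never"), ("Male", "never"),
        ("Female", "smokes"), ("Male", "smokes"), ("Female", "never")]
labels = ["yes", "no", "no", "yes", "yes", "no"]

model = train_naive_bayes(rows, labels)
prediction = predict(model, ("Female", "smokes"))
```

Training only needs one pass over the data to collect counts, which is why the algorithm is fast, and the same code works unchanged for more than two classes.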

I selected this dataset from the Kaggle website. It is used to predict
whether a patient is likely to have a stroke based on input parameters
such as gender, age, and smoking status. Each row provides the relevant
information about one patient.
The dataset has 9 attributes:
• ID
• Gender
• Age
• Ever Married
• Work Type
• Residence
• Avg Glucose Level
• Smoking Status
• Stroke
Stroke is the class attribute.

Total instances: 5110

Processes in WEKA:
Original Dataset:
Training Dataset:

The following data was selected using WEKA:

60% of the instances were taken in this step.
Data of the training set:

Using Naive Bayes, the accuracy on this training set is 90.7697%.

Test Dataset:
The remaining 40% of the data, which was not used in the training
set, is shown here:
Using Naive Bayes, the accuracy on this test set is 92.4658%.
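
The 60/40 percentage split and the accuracy measure can be sketched as follows. This is only an illustration in plain Python: the helper names and the seed are hypothetical, and WEKA's own procedure for choosing which instances go into each part may differ from the simple shuffle assumed here.

```python
import random

def percentage_split(records, train_percent, seed=1):
    """Shuffle a copy of the records and cut it into training and test parts."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    # integer arithmetic avoids floating-point rounding in the cut point
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]

def accuracy_percent(actual, predicted):
    """Percentage of instances whose predicted class matches the actual class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)

# Stand-in for the 5110 instances: only row indices, no real patient data.
instances = list(range(5110))
train_set, test_set = percentage_split(instances, 60)
train_size, test_size = len(train_set), len(test_set)
```

With 5110 instances, a 60% split gives 3066 training instances and 2044 test instances.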
Cross Validation:
Using Naive Bayes, the cross-validation accuracy is 91.7808%.
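
As a rough sketch of how cross-validation partitions the data, the snippet below generates the folds in plain Python. The number of folds is not stated in this report, so 10 is an assumption. The function name is hypothetical, and the sketch uses simple contiguous folds without the shuffling or stratification that a tool such as WEKA may apply.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # spread any remainder over the first n % k folds
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# Assumed setting: 10 folds over the 5110 instances.
folds = list(k_fold_indices(5110, 10))
```

Each instance is used for testing exactly once and for training in the other folds. Under the 10-fold assumption, every fold tests on 511 instances and trains on the remaining 4599.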
Predicted Data:
Here are the predictions for our data:
