UCI Dataset

Uploaded by

diljots78

0% found this document useful (0 votes)

4 views1 page

UCI dataset

Original Title

UCI dataset

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

UCI dataset

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

4 views1 page

UCI Dataset

Uploaded by

diljots78

UCI dataset

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 1

Search inside document

4.

DESIGN OF THE PROJECT:

Step 1: Understanding our dataset:

We will be using a dataset from the UCI Machine Learning repository which has a very good
collection of datasets for experimental research purposes.

The columns in the data set are currently not named and as you can see, there are 2 columns.
The first column takes two values, 'ham' which signifies that the message is not spam, and 'spam'
which signifies that the message is spam.
The second column is the text content of the SMS message that is being classified.

Step 2: Data Preprocessing:

Now that we have a basic understanding of what our dataset looks like, lets convert our labels to binary
variables, 0 to represent 'ham'(i.e. not spam) and 1 to represent 'spam' for ease of computation. You
might be wondering why do we need to do this step? The answer to this lies in how scikit-learn handles
inputs. Scikit-learn only deals with numerical values and hence if we were to leave our label values as
strings, scikit-learn would do the conversion internally (more specifically, the string labels will be cast
to unknown float values).
Our model would still be able to make predictions if we left our labels as strings, but we could have
issues later when calculating performance metrics, for example when calculating our precision and
recall scores. Hence, to avoid unexpected 'gotchas' later, it is good practice to have our categorical
values be fed into our model as integers

Grokking Simplicity: Taming complex software with functional thinking
From Everand
Grokking Simplicity: Taming complex software with functional thinking
Eric Normand
Rating: 3 out of 5 stars
3/5 (2)
Data Preprocessing in Machine Learning
Document27 pages
Data Preprocessing in Machine Learning
Naashit Hashmi
No ratings yet
E-Mail Spam Detection Using Machine Lear PDF
Document7 pages
E-Mail Spam Detection Using Machine Lear PDF
Trending this year
No ratings yet
E-Mail Spam Detection Using Machine Learning and Deep Learning
Document7 pages
E-Mail Spam Detection Using Machine Learning and Deep Learning
IJRASETPublications
No ratings yet
Implementing Artificial Neural Network in Python From Scratch
Document16 pages
Implementing Artificial Neural Network in Python From Scratch
sidrakhalid1357
No ratings yet
Exp 6
Document9 pages
Exp 6
hemanthsairavipati
No ratings yet
Ie ML Project (Getting Started)
Document3 pages
Ie ML Project (Getting Started)
nicool
No ratings yet
What Is One Hot Encoding and How To Do It: Michael Delsole Apr 24, 2018 5 Min Read
Document6 pages
What Is One Hot Encoding and How To Do It: Michael Delsole Apr 24, 2018 5 Min Read
Vikash Rryder
No ratings yet
Unit 1 Review
Document5 pages
Unit 1 Review
Aalaa Mady
No ratings yet
ML (Prac1)
Document12 pages
ML (Prac1)
dk9859164
No ratings yet
Fake Phase3
Document14 pages
Fake Phase3
Imran S
No ratings yet
Banknote Authentication Analysis Using Python K-Means Clustering
Document3 pages
Banknote Authentication Analysis Using Python K-Means Clustering
International Journal of Innovative Science and Research Technology
No ratings yet
(Download PDF) Machine Learning With Python Cookbook 2Nd Edition Chris Albon Ebook Online Full Chapter
Document53 pages
(Download PDF) Machine Learning With Python Cookbook 2Nd Edition Chris Albon Ebook Online Full Chapter
wltokahao
100% (3)
Analysis of Spam Email Filtering Through Naive Bayes Algorithm Across Different Datasets
Document4 pages
Analysis of Spam Email Filtering Through Naive Bayes Algorithm Across Different Datasets
International Journal of Innovative Science and Research Technology
No ratings yet
Detecting Spam in Emails. Applying NLP and Deep Learning For Spam - by Ramya Vidiyala - Towards Data Science
Document23 pages
Detecting Spam in Emails. Applying NLP and Deep Learning For Spam - by Ramya Vidiyala - Towards Data Science
Dương Vũ Minh
No ratings yet
Neural Networks
Document27 pages
Neural Networks
Dania Dallah
No ratings yet
Lab 12 16082022 125032pm 30082022 102624pm
Document9 pages
Lab 12 16082022 125032pm 30082022 102624pm
Soomal Fatima
No ratings yet
Expt 9
Document4 pages
Expt 9
Riya Bisht
No ratings yet
DL Unit - 5
Document14 pages
DL Unit - 5
Kalpana M
No ratings yet
Natural Language Processing
Document38 pages
Natural Language Processing
btms
No ratings yet
Loading & Working With Datasets: Student Name Student Roll # Program Section
Document8 pages
Loading & Working With Datasets: Student Name Student Roll # Program Section
every one
No ratings yet
2 Computer Programming Module 10
Document14 pages
2 Computer Programming Module 10
joel lacay
No ratings yet
Predicting Credit Card Approvals
Document14 pages
Predicting Credit Card Approvals
as
100% (1)
Week 02 Ch2.1 Introduction To Neural Networks
Document44 pages
Week 02 Ch2.1 Introduction To Neural Networks
gunduz.gizem
No ratings yet
2 Machine Learning
Document21 pages
2 Machine Learning
anna.na16567
No ratings yet
Datapreprocessing
Document8 pages
Datapreprocessing
NAVYA Tadisetty
No ratings yet
Pricing Via Processing or Combatting Junk Mail
Document12 pages
Pricing Via Processing or Combatting Junk Mail
hygreeva
No ratings yet
III-II CSM (Ar 20) DL - Units - 1 & 2 - Question Answers As On 4-3-23
Document56 pages
III-II CSM (Ar 20) DL - Units - 1 & 2 - Question Answers As On 4-3-23
Sravan Jana
No ratings yet
Handwritten Digit Recognition Using ML&DL
Document3 pages
Handwritten Digit Recognition Using ML&DL
UTSAV BHARDWAJ
No ratings yet
Research Paper On Md5 Algorithm
Document5 pages
Research Paper On Md5 Algorithm
h07d16wn
100% (1)
CSL0777 L09
Document29 pages
CSL0777 L09
Konkobo Ulrich Arthur
No ratings yet
Machine Learning - Project Group 3
Document17 pages
Machine Learning - Project Group 3
Tangirala Ashwini
No ratings yet
W5-Module-Handling Primitive Data Types
Document12 pages
W5-Module-Handling Primitive Data Types
Jitlee Papa
No ratings yet
III-II CSM (Ar 20) DL 5 Units Question Answers
Document108 pages
III-II CSM (Ar 20) DL 5 Units Question Answers
Sravan Jana
No ratings yet
Project Report: Optical Character Recognition Using Artificial Neural Network
Document9 pages
Project Report: Optical Character Recognition Using Artificial Neural Network
Richard James
No ratings yet
Ebook Machine Learning With Python Cookbook 2Nd Edition First Early Release Kyle Gallatin Online PDF All Chapter
Document69 pages
Ebook Machine Learning With Python Cookbook 2Nd Edition First Early Release Kyle Gallatin Online PDF All Chapter
deborah.young458
100% (8)
Week 2 - PART II: Computer Script For Analysis of Cognitive Experiment On Intelligence
Document7 pages
Week 2 - PART II: Computer Script For Analysis of Cognitive Experiment On Intelligence
Andrei Firte
No ratings yet
A Hands-On Introduction To Insecure Deserialization
Document18 pages
A Hands-On Introduction To Insecure Deserialization
asdff
No ratings yet
Machine Learning With ML - Net and C# - VB - Net - CodeProject
Document17 pages
Machine Learning With ML - Net and C# - VB - Net - CodeProject
Gabriel Gomes
No ratings yet
Lesson 1 - History, Definitions and Basic Concepts
Document6 pages
Lesson 1 - History, Definitions and Basic Concepts
Sameera Khatoon
No ratings yet
Network Security SUMS
Document9 pages
Network Security SUMS
Kaushal Prajapati
0% (1)
Deep Learning Assignment
Document8 pages
Deep Learning Assignment
Diriba Sabo
No ratings yet
CS DBMS 3
Document5 pages
CS DBMS 3
Prabhat
No ratings yet
Topic:-Optical Character Recognition Using Neural Networks Abstract
Document24 pages
Topic:-Optical Character Recognition Using Neural Networks Abstract
Praneeth Reddy Nimma
No ratings yet
Data Science Tips and Tricks: 1. Running Jupyter Notebooks On A Shared Linux/Unix Machine
Document6 pages
Data Science Tips and Tricks: 1. Running Jupyter Notebooks On A Shared Linux/Unix Machine
eight8888
No ratings yet
Dive Into Python
Document4 pages
Dive Into Python
Susan F
No ratings yet
Unit 1
Document43 pages
Unit 1
amrutamhetre9
No ratings yet
Miniproject 1: Machine Learning 101: Preamble
Document5 pages
Miniproject 1: Machine Learning 101: Preamble
SuryaKumar Devarajan
No ratings yet
ANN Final Exam
Document13 pages
ANN Final Exam
basit
100% (1)
Mining and Visualising Real-World Data: About This Module
Document16 pages
Mining and Visualising Real-World Data: About This Module
Alexandra Veres
100% (1)
07 The Basics of Ai & Machine Learning
Document57 pages
07 The Basics of Ai & Machine Learning
Patrick Hugo
No ratings yet
Lab 1. Implementation of HMAC Algorithm: Learning Outcomes
Document5 pages
Lab 1. Implementation of HMAC Algorithm: Learning Outcomes
deva sena
No ratings yet
Project Name Spam Email Detection 1
Document7 pages
Project Name Spam Email Detection 1
ayeshanaseem9999
No ratings yet
OOP Design #1 - Encapsulation
Document8 pages
OOP Design #1 - Encapsulation
Abdul Gafur
No ratings yet
Ios Chapter 1 Basic
Document27 pages
Ios Chapter 1 Basic
Seethalakshmi Sethurajan
No ratings yet
Lab2 Journal 04102022 113850am
Document4 pages
Lab2 Journal 04102022 113850am
niamh aly
No ratings yet
Pyhton Tutorial 3
Document62 pages
Pyhton Tutorial 3
saranya sasikumar
No ratings yet
MLchallenge2022 Block4
Document9 pages
MLchallenge2022 Block4
fede
No ratings yet
K Means
Document10 pages
K Means
S B
No ratings yet
Harvard CS197 Lecture 5 Notes
Document14 pages
Harvard CS197 Lecture 5 Notes
Valeria Rocha
No ratings yet