Text Classification

This document outlines steps for using ensemble learning techniques like bagging, boosting, and random forests to classify emails as spam, fraud or normal. It involves preprocessing text data, converting text to vectors, training machine learning models, checking predictions, and calculating accuracy scores and confusion matrices to evaluate model performance on the email classification task.

Uploaded by

Mohammad Haris

Text classification

Activity:
This activity is to assess your understanding of the text classifier. It involves going through
spam email examples to understand the text classifier. You will be using different ensemble
learning techniques to test the performance of the classifier.

Task 1: Spam email example

Activity for Ensemble Learning

Question: Is the given email Fraud, Spam, or Normal?

What is the Label/Class column?

Which type of classification is it?

Step 1: Import basic libraries

Step 2: Read file

Step 3: Filter out unnecessary columns

Step 4: Find target class

Step 5: Find the frequency of target class
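
Steps 1 to 5 can be sketched as follows. This is a minimal sketch, not the activity's own code: the data file and its column names are not given, so the in-memory CSV and the column names id, text, and label are hypothetical stand-ins.

```python
import io

import pandas as pd

# Hypothetical stand-in for the activity's data file; the real file name and
# column names are not given in the activity.
raw_csv = io.StringIO(
    "id,text,label\n"
    "1,Win a free prize now,Spam\n"
    "2,Meeting moved to 3pm,Normal\n"
    "3,Send your bank details to claim funds,Fraud\n"
    "4,Lunch tomorrow?,Normal\n"
)

# Step 2: read the file (pass a real path instead of raw_csv).
df = pd.read_csv(raw_csv)

# Step 3: keep only the columns the classifier needs.
df = df[["text", "label"]]

# Step 4: list the distinct values of the target class.
classes = sorted(df["label"].unique())

# Step 5: count how often each class occurs.
class_counts = df["label"].value_counts()

print(classes)
print(class_counts)
```

A strongly uneven class frequency in Step 5 is worth noting, because accuracy alone can then be misleading.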

Step 6: Import libraries for text processing

Step 7: Define functions to process text

Step 8: Call functions for text processing
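
Steps 6 to 8 might look like the sketch below. The activity does not name its text-processing library or functions, so this version uses only the standard library; the function names and the tiny stop-word list are illustrative assumptions.

```python
import re

# A tiny illustrative stop-word list (an assumption); a real run would
# likely use a fuller list from a text-processing library.
STOP_WORDS = {"a", "an", "the", "to", "is", "you", "your", "of", "and", "in"}


def clean_text(text: str) -> str:
    """Lowercase the text and replace every run of non-letters with one space."""
    return re.sub(r"[^a-z]+", " ", text.lower()).strip()


def remove_stop_words(text: str) -> str:
    """Drop the words listed in STOP_WORDS."""
    return " ".join(word for word in text.split() if word not in STOP_WORDS)


def preprocess(text: str) -> str:
    """Step 8: apply the cleaning functions in order."""
    return remove_stop_words(clean_text(text))


print(preprocess("You WON a FREE prize!!! Claim it in 24 hours."))
# prints: won free prize claim it hours
```

With a pandas DataFrame, the same function can be applied to a whole text column through the column's apply() method.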

Step 9: Convert text to vector
The sample code for this step converts text to vectors using TfidfVectorizer
with unigrams. In addition, try the possible combinations of CountVectorizer,
unigrams, and bigrams.
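
A minimal sketch of these combinations with scikit-learn is shown here. The three example documents are made up for illustration. The ngram_range parameter selects unigrams only with (1, 1) or unigrams plus bigrams with (1, 2); (2, 2) would give bigrams only.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Made-up documents, standing in for the preprocessed email text.
docs = ["free prize now", "claim free prize", "meeting at noon"]

# Each vectorizer with unigrams only, and with unigrams plus bigrams.
vectorizers = {
    "tfidf_unigram": TfidfVectorizer(ngram_range=(1, 1)),
    "tfidf_uni_bigram": TfidfVectorizer(ngram_range=(1, 2)),
    "count_unigram": CountVectorizer(ngram_range=(1, 1)),
    "count_uni_bigram": CountVectorizer(ngram_range=(1, 2)),
}

shapes = {}
for name, vectorizer in vectorizers.items():
    # fit_transform returns a sparse matrix of shape (documents, features).
    matrix = vectorizer.fit_transform(docs)
    shapes[name] = matrix.shape
    print(name, matrix.shape)
```

Adding bigrams enlarges the vocabulary (here from 7 to 12 features), which can help capture phrases but also makes the matrix sparser.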

Step 10: Convert target values to numbers
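
One common way to do this is scikit-learn's LabelEncoder; the activity does not say which tool it uses, so treat this as an assumption. The label list is a made-up example.

```python
from sklearn.preprocessing import LabelEncoder

# Made-up target labels for four emails.
labels = ["Spam", "Normal", "Fraud", "Normal"]

encoder = LabelEncoder()
y = encoder.fit_transform(labels)  # classes are numbered in sorted order

print(encoder.classes_.tolist())  # ['Fraud', 'Normal', 'Spam']
print(y.tolist())                 # [2, 1, 0, 1]
```

Keep the fitted encoder object: it is needed later to map numbers back to text.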

Step 11: Develop machine learning models using one or more of the Ensemble Learning techniques: Bagging, Boosting, and Random Forest. Use the following resources:
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html

https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html

https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
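
A minimal sketch of training the three ensemble classifiers from the links above. The toy emails, the hyperparameter values, and the use of AdaBoost as the boosting method are assumptions for illustration; the models are fitted on the whole toy set only to show the API, not to measure performance.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder

# Made-up toy emails; a real run would use the preprocessed data set.
texts = [
    "win free prize now", "free offer click now",
    "cheap pills free offer", "win cash prize click",
    "send bank details urgent", "verify bank account urgent",
    "transfer funds bank details", "urgent account verify password",
    "meeting moved to monday", "lunch meeting tomorrow",
    "project report attached", "see report before meeting",
]
labels = ["Spam"] * 4 + ["Fraud"] * 4 + ["Normal"] * 4

X = TfidfVectorizer().fit_transform(texts)
y = LabelEncoder().fit_transform(labels)

# Hyperparameter values are illustrative, not tuned.
models = {
    "bagging": BaggingClassifier(n_estimators=10, random_state=0),
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    model.fit(X, y)
    print(name, "trained on", X.shape[0], "emails")
```

BaggingClassifier and AdaBoostClassifier both use decision trees as their default base estimator, so the three models differ mainly in how the trees are built and combined.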

Step 12: Check predictions

Step 13: Check accuracy scores
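
Steps 12 and 13 can be sketched as below, assuming a held-out test split and a Random Forest model; the toy emails and the 25% split size are illustrative choices, not values from the activity.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Made-up toy emails; a real run would use the preprocessed data set.
texts = [
    "win free prize now", "free offer click now",
    "cheap pills free offer", "win cash prize click",
    "send bank details urgent", "verify bank account urgent",
    "transfer funds bank details", "urgent account verify password",
    "meeting moved to monday", "lunch meeting tomorrow",
    "project report attached", "see report before meeting",
]
labels = ["Spam"] * 4 + ["Fraud"] * 4 + ["Normal"] * 4

# Hold out a stratified test set so every class appears in it.
train_text, test_text, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit the vectorizer on training text only, then reuse it on the test text.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_text)
X_test = vectorizer.transform(test_text)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Step 12: compare each prediction with the actual label.
y_pred = model.predict(X_test)
for text, actual, predicted in zip(test_text, y_test, y_pred):
    print(f"{text!r}: actual={actual}, predicted={predicted}")

# Step 13: accuracy is the fraction of test emails classified correctly.
accuracy = accuracy_score(y_test, y_pred)
print("accuracy:", accuracy)
```

On a data set this small the accuracy value is not meaningful; the sketch only shows where predict() and accuracy_score() fit in the workflow.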

Step 14: Display confusion matrix

Explain how to read the confusion matrix.
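
A small worked example may help with reading the matrix. The label vectors here are made up. In scikit-learn's confusion_matrix, rows are actual classes and columns are predicted classes, so the diagonal holds the correct predictions.

```python
from sklearn.metrics import confusion_matrix

# Made-up encoded labels for eight emails (0, 1, 2 are the three classes).
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[1 1 0]
#  [0 2 1]
#  [1 0 2]]

# Entry cm[i, j] counts emails of actual class i predicted as class j.
# Row 0 reads: one class-0 email predicted correctly, one mistaken for class 1.
correct = cm.trace()  # sum of the diagonal
print(correct, "of", cm.sum(), "predictions are correct")
```

Dividing the diagonal sum by the total (5 of 8 here) gives the same value as the accuracy score.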

Step 15: Convert encoded labels back to text

Note: Step 10 converts target labels to numbers. The numbers can be converted back to
text using the inverse_transform() function.
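
A minimal sketch, assuming the labels were encoded with scikit-learn's LabelEncoder; the encoded predictions are made up.

```python
from sklearn.preprocessing import LabelEncoder

# Fit the encoder on the three class names, as in the encoding step.
encoder = LabelEncoder().fit(["Fraud", "Normal", "Spam"])

# Made-up numeric predictions from a model.
encoded_predictions = [2, 0, 1, 1]

# inverse_transform maps each number back to its original text label.
decoded = encoder.inverse_transform(encoded_predictions)
print(decoded.tolist())  # ['Spam', 'Fraud', 'Normal', 'Normal']
```

The encoder's classes_ attribute lists the text labels in the order of their numbers, which is useful for labelling plots and tables.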

Step 16: Using the information in Step 15, display the confusion matrix with text labels on the X and Y axes.
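
One way to do this, sketched under the assumption that a labelled table is an acceptable display, is to wrap the matrix in a pandas DataFrame whose row and column names come from the encoder. The label vectors are made up.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder().fit(["Fraud", "Normal", "Spam"])
class_names = encoder.classes_.tolist()  # text labels in encoded order

# Made-up encoded labels for eight emails.
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]

cm = confusion_matrix(y_true, y_pred, labels=list(range(len(class_names))))

# Rows are actual classes, columns are predicted classes.
cm_df = pd.DataFrame(
    cm,
    index=pd.Index(class_names, name="Actual"),
    columns=pd.Index(class_names, name="Predicted"),
)
print(cm_df)
```

For a plotted version, scikit-learn's ConfusionMatrixDisplay accepts the same matrix together with display_labels=class_names; it requires matplotlib.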
