
Here are the steps I followed to get the result:

1. Data Loading: I loaded two CSV files (train and test) containing the text data and a fake
flag using the pd.read_csv() function from the pandas library.
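The loading step can be sketched as follows. The actual CSV paths and the column names ("text", "fake") are assumptions, since the write-up does not give them; a small in-memory CSV stands in here so the sketch is self-contained.

```python
import io
import pandas as pd

# In the real code this would be e.g. pd.read_csv("train.csv") and
# pd.read_csv("test.csv"); the column names below are assumed.
sample_csv = io.StringIO("text,fake\nBreaking news today,0\nYou won a prize,1\n")
train_df = pd.read_csv(sample_csv)
print(train_df.shape)  # (2, 2): two rows, columns "text" and "fake"
```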

2. Data Preprocessing: I converted all text to lowercase, then split the train data into
train and validation sets using the train_test_split() function from the
sklearn.model_selection module. I tokenized the text data in X_train using the
word_tokenize() function from the nltk.tokenize module and removed stop words from
the tokenized text using a lambda function.

3. Word Embeddings:

With Word2Vec: a Word2Vec model is initialized and trained using the Word2Vec class from
the gensim.models module. I trained it on the sentences list, which contains the tokenized
sentences from the train data.

With TF-IDF: a TF-IDF vectorizer is initialized and fitted using the TfidfVectorizer class from
the sklearn.feature_extraction.text module. I transformed the text data in X_train into
TF-IDF vectors using the fitted vectorizer.
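A minimal sketch of the TF-IDF step, with a toy corpus standing in for X_train:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for X_train.
corpus = ["fake news spreads fast",
          "real journalism takes time",
          "fake stories get clicks"]

vectorizer = TfidfVectorizer()
X_tfidf = vectorizer.fit_transform(corpus)  # sparse matrix: docs x vocabulary
```

Each row of the resulting sparse matrix is one document; each column is a vocabulary term weighted by term frequency times inverse document frequency.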

With FastText: a FastText model is initialized and trained using the FastText class from
the gensim.models module. I trained it on the sentences list, which contains the tokenized
sentences from the train data.

4. Model Training: a Logistic Regression model is initialized and trained using the
LogisticRegression class from the sklearn.linear_model module. I used the word embeddings
obtained from each method (Word2Vec, TF-IDF, FastText) as features for training the
Logistic Regression model.
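A sketch of the training step using TF-IDF features on toy data. For the Word2Vec and FastText variants, a common choice (an assumption about this setup, since the write-up does not say) is to average each document's word vectors into one fixed-length feature vector before fitting the classifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy texts and labels; 1 = fake, 0 = real is an assumed encoding.
texts = ["free money now", "quarterly earnings report",
         "you won a prize", "city council meeting notes"]
labels = [1, 0, 1, 0]

X = TfidfVectorizer().fit_transform(texts)  # TF-IDF features shown here
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)
```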

5. Model Evaluation: I used the trained Logistic Regression model to predict the labels for
the validation set using the predict() method. The accuracy score is calculated using the
accuracy_score() function from the sklearn.metrics module to evaluate the performance
of the model.
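The evaluation step can be sketched end to end on toy data (texts, labels, and split parameters are all stand-ins for the real setup):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy data; the real split comes from the preprocessing step.
texts = ["free money now", "quarterly earnings report",
         "you won a prize", "city council meeting notes",
         "click here to claim cash", "weather forecast for monday",
         "miracle cure doctors hate", "school board budget update"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

X_train, X_val, y_train, y_val = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels)

vec = TfidfVectorizer()
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(X_train), y_train)

y_pred = clf.predict(vec.transform(X_val))          # labels for the validation set
accuracy = accuracy_score(y_val, y_pred)            # fraction predicted correctly
```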
The resulting validation accuracy for each representation was:

Word2Vec: 0.6766
TF-IDF:   0.6797
FastText: 0.6782

So TF-IDF was the best way to represent the text for this task, although all three methods performed very similarly, within about 0.003 of each other.
