
Assignment No. 1
By

Name
Roll no.

Final report submitted

To

Prof. Dr.

MASTER OF PHILOSOPHY
IN
COMPUTER SCIENCE

DEPARTMENT OF COMPUTER SCIENCE


GOVERNMENT COLLEGE UNIVERSITY FAISALABAD

Jun 2023
Task 1: Explore and download a freely available dataset.
Dataset Name: Gender Classification Dataset
Size: 5000 observations with 8 attributes (19 KB)

Type: CSV file, gender_classification_v7.csv (text and numeric attributes)

Source: Kaggle
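
The size stated above can be checked once the file is downloaded. The following is a minimal sketch, not part of the original assignment: the helper name summarize_csv and the sample column names are hypothetical, and the file is assumed to be a comma-separated file with a single header row.

```python
import csv
import io


def summarize_csv(handle):
    """Return (row_count, column_names) for a CSV stream with a header row."""
    reader = csv.reader(handle)
    header = next(reader)  # first line is assumed to hold the attribute names
    row_count = sum(1 for row in reader if row)  # ignore completely blank lines
    return row_count, header


# Tiny in-memory stand-in for the real file; these column names are
# hypothetical and are NOT taken from the Kaggle dataset.
sample = io.StringIO(
    "feature_a,feature_b,gender\n"
    "1,11.8,Male\n"
    "0,14.0,Female\n"
)
rows, columns = summarize_csv(sample)
print(rows, len(columns))
```

For the downloaded file, the same helper could be called as summarize_csv(open("gender_classification_v7.csv", newline="")) and the result compared with the 5000 observations and 8 attributes reported under Size.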

Task 2: Select research papers regarding the dataset from the last 5 years (2019-2023).
| Sr. No. | References | Techniques/methods | Results | Limitations/Future work |
|---|---|---|---|---|
| 1. | (Vikas, Reddy, C, & Shanmugasundaram, 2022) | Data pre-processing: feature engineering, data visualisation, and ML classifiers. Training machine learning models using algorithms (Logistic Regression, Random Forest, Gaussian NB, Multinomial NB, XGBoost) | LogReg: 71.377; XGBoost: 66.599; Multinomial NB: 68.570; Gaussian NB: 52.302; RF: 68.303 | Future work for this study can be done by including additional attributes from the dataset, such as number of retweets, language-based mining and many others, which would enhance the accuracy and provide better results. |
| 2. | (Vashisth & Meehan, 2020) | Natural Language Processing (NLP). Compares multiple techniques such as Bag of Words (TF-IDF), Word Embedding (W2Vec, GloVe) and traditional machine learning techniques (Logistic Regression, Support Vector Machine and Naïve Bayes) | Logistic Regression (LR): 54.43; Support Vector Machine (SVM): 52.67; Random Forest: 48.46; XGBoost: 52.39 | |
| 3. | | | | |
| 4. | | | | |
| 5. | | | | |
| 6. | | | | |
Task 3: Final selection of dataset and research papers
Problem Statement:

Research Questions:

Objectives:

References
Vashisth, P., & Meehan, K. (2020). Gender Classification using Twitter Text Data. IEEE Xplore.

Vikas, K., Reddy, A. V., C, S. K., & Shanmugasundaram, H. (2022, July 04). User Gender Classification Based on Twitter Profile Using Machine Learning. IEEE Xplore.
