Untitled PDF

IDENTIFICATION OF FAKE USER AND SPAMMER
DETECTION ON SOCIAL MEDIA SITES
A PROJECT REPORT
Submitted by
S.AJITH (511916104003)
D.BOOPALAN (511916104008)
E.DHIWAGAR (511916104011)
R.MEGANATHAN (511916104023)
in partial fulfillment for the award of the degree

of
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
PRIYADARSHINI ENGINEERING COLLEGE, VANIYAMBADI-635 751
ANNA UNIVERSITY: CHENNAI 600 025

APRIL 2020
i
ANNA UNIVERSITY: CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “IDENTIFICATION OF FAKE USER AND

SPAMMER DETECTION ON SOCIAL MEDIA SITES” is the bonafide work
of “S.AJITH, D.BOOPALAN, E.DHIWAGAR, R.MEGANATHAN” who
carried out the project work under my supervision.
SIGNATURE SIGNATURE
Prof.A.S.KUMARESAN, M.E., [Ph.D] DR.S.VIJAYARANGAM, M.E., [Ph.D]
HEAD OF THE DEPARTMENT, SUPERVISOR,
Associate Professor, Associate Professor,
Department of CSE, Department of CSE,
Priyadarshini Engineering College, Priyadarshini Engineering College,
Vaniyambadi-635 751, Vaniyambadi-635 751,
Tirupattur District. Tirupattur District.
ii
CERTIFICATE OF EVALUATION
COLLEGE NAME : PRIYADARSHINI ENGINEERING COLLEGE,

VANIYAMBADI-635751
BRANCH & SEMESTER : COMPUTER SCIENCE AND ENGINEERING
& 8th SEMESTER
SL. NAME OF THE REGISTER TITLE OF THE NAME OF THE
NO STUDENT NUMBER PROJECT SUPERVISOR
01 S.AJITH 511916104003
IDENTIFICATION DR.S.VIJAYARANGAM,
02 D.BOOPALAN 511916164008 OF FAKE USER M.E., [Ph.D]
AND SPAMMER Associate Professor,
03 E.DHIWAGAR 511916104011 DETECTION ON Department of CSE.
SOCIAL MEDIA
04 R.MEGANATHAN 511916104023 SITES
The report of the project work submitted by the above students in partial
fulfillment for the award of Bachelor of Engineering Degree in Computer Science
and Engineering of Anna University were evaluated and confirmed to the report of
the project work done by the above students and then evaluated.
Submitted for the university project viva-voce examination held on
…………………. at Priyadarshini Engineering College, Vaniyambadi-635751.
INTERNAL EXAMINER EXTERNAL EXAMINER
iii
ACKNOWLEDGEMENT
We thank god almighty for giving us enough strength and tremendous

opportunity through PRIYADARSHINI ENGINEERING COLLEGE to fulfill
our desire to have one more step in the educational ladder.
We give special thanks to our Principal Dr.K.VIJAYARAJ, M.E., [Ph.D]
for his extremely valuable guidance and support, also providing all facilities to
complete our project.
Our indebted to Head of the Department Prof.A.S.KUMERASAN, M.E.,
[Ph.D] without his help and constant guidance we would not have achieved this
laurel. He is giving continuous support to our project, for his patience, motivation,
enthusiasm and immense Knowledge. His guidance helped us in all the time of my
project and writing of this report.
We place on our record & indebtedness to our respected guide
DR.S.VIJAYARANGAM, M.E., [Ph.D] for immense care, motivation, and
encouragement throughout our project work.
Our Special thanks to DR.S.VIJAYARANGAM, M.E., [Ph.D] Project
Coordinator for her extremely valuable Guidance and support throughout our project
work. We forever grateful to my friends and our department faculty who proposed
so many valuable hints and contributed to the success our project work.
We would like to express our sincere thanks to Computer Science and
Engineering Department of Priyadarshini Engineering College for providing us
environment and facility to do our project.
S.AJITH
D.BOOPALAN
E.DHIWAGAR
R.MEGANATHAN
iv
ABSTRACT
Public networking sites connect millions of users around the world. The users'
communications with these social sites, such as Twitter and Facebook have a
wonderful collision repercussions for existence. The outstanding social networking
sites have turned into a target platform for the spammers to disperse a huge amount
of deleterious information. Twitter, for example, has become one of the most
profligately used platforms of all times and therefore allows an difficult to deal with
amount of spam. Fake users send undesired tweets to users to promote services or
websites that not only affect legitimate users but also interrupt resource spending.
Recently, the detection of spammers and identification of fake users on Twitter has
become a common area of research in up to date online social Networks (OSNs).
Moreover a taxonomy of the Twitter spam detection approaches is presented that
classifies the techniques based on their ability to detect:
i. Spammer Detection based on comment post,
ii. Fake users identification based on URL,
iii. Fake user identification based on user profile, and
iv. View Fake User identification result in bar chat.
Then on hand techniques are also compare based on a variety of features, such
as user skin tone, content features, graph features, structure features, and time
features. We are hopeful that the presented study will be a useful resource for
researchers to find the thing to see of recent developments in Twitter spam detection
on a single platform.
INDEX TERMS:
Classification, fake user detection, online social network, spammer’s
identification.
v
CHAPTER TABLE OF CONTENTS PAGE NO.
ABSTRACT v
LIST OF TABLES ix
LIST OF FIGURES x
LIST OF ABBREVATIONS xii
1 INTRODUCTION 1
1.1 Project description 2
1.2 System analysis 3
1.2.1 Literature survey 3
1.2.2 Existing system 6
1.2.3 Proposed system 7
1.3 System requirement 8
1.3.1 Software requirement 8
1.3.2 Hardware requirement 9
vi
2 DESIGN AND IMPLEMENTATION 10
2.1 System study 10
2.1.1 Feasibility study 10
2.2 Architecture diagram 12
2.2.1 Description 13
2.3 System design 14
2.3.1 Data flow diagram of users and admin 14
2.3.2 UML diagrams 15
2.4 Module split up 22
2.4.1 Spammer Detection based on comment post 22
2.4.2 Fake user identification based on URL 26
2.4.3 Fake user identification based on user profile 27
2.4.4 View Fake User identification result in bar chat 28
2.5 Coding 29
2.6 Interface design 39
2.6.1 Home page 39
vii
2.6.2 User login page 40
2.6.3 Server login page 41
2.6.4 User registration page 42
2.6.5 User page 43
2.6.6 Server page 44
2.6.7 Posting page 45
2.6.8 User profile page 46
2.7 Table design 47
2.8 System testing 52
2.8.1 Unit testing 52
2.8.2 Integration testing 52
2.8.3 User acceptance testing 54
2.8.4 Output testing 54
2.8.5 Validation Checking 54
3 CONCLUSION 57
3.1 Future enhancement 57
4 REFERENCES 58
viii
LIST OF TABLES
TABLE No. TABLE NAME PAGE No.
1 Registration data table 47
2 Server data table 48
3 Spammer filter data table 49
4 Friend request data table 50
5 Fake user data table 51
ix
LIST OF FIGURES
FIGURE No. NAME OF THE FIGURE PAGE NO.
1 Architecture diagram 12
2 DFD of User and Admin 14
3 Use case diagram 16
4 Sequence diagram 17
5 FCD for user 18
6 FCD for admin 19
7 Class diagram 21
8 Filter database 22
9 Add filter database 23
10 Negative spam detection 23
11 Sexual spam detection 24
12 Offensive spam detection 24
13 Hateful spam detection 25
14 Vulgar spam detection 25
15 Tweet URL database 26
x
16 Fake user database 27
17 Fake user result in bar chat 28
18 Home page 39
19 User login page 40
20 Server login page 41
21 User registration page 42
22 User page 43
23 Server page 44
24 Posting page 45
25 User profile page 46
xi
S.NO. LIST OF ABBREVATIONS
1 DFD Flow chart diagram
2 FCD Data Flow Diagram
3 CDIO Conceive, Design, Implement, Operate
xii
CHAPTER 1
INTRODUCTION
Millions of users are engaged with social networking sites around the world. Social
sites like twitter, Facebook have a large impact on rare unwanted consequences
caused in our regular life in user’s interactions. In order to disperse a large amount
of inappropriate and harmful data protruding social networking sites are made as a
target platform for the spammers. Twitter is main example that has become one of
the important platforms for unreasonable amount of spam in all the tomes for fake
users to tweet and promote websites or services that crates a major effect for
legitimate users and also it disturbs resource consumption. By resulting the opening
for unusual and harmful information there is an increase of fake identities that
expands invalid data. Research on current online social networks (OSN) is quit
natural for identifying of spammers and also detection of fake users on twitter. This
paper is a review paper that tells about detecting spammer techniques on twitter.
Depending on the ability detection taxonomy of twitter spam identification methods

are classified and presented as:
i. Spammer Detection based on comment post,
ii. Fake users identification based on URL,
iii. Fake user identification based on user profile, and
iv. View Fake User identification result in bar chat.
The present methods are similar which are built on user, content, graph, structure
and time features. The present study is very beneficial resource study for the
researchers for developing the recent features in twitter spam identification in one
single platform.
1
1.1 PROJECT DESCRIPTION
The project is an implementation of analysis method utilized on behalf of

distinguishing spammers on Twitter. We additionally exhibited taxonomy of Twitter
spam identification method are considered as false contented recognition, URL built
spam identification, spam location in inclining points, and phony client recognition
strategies. We likewise analyzed the introduced strategies dependent on a few
features, for example, client features, content features, chart features, structure
features, and time features.
Besides, the procedures were likewise looked at regarding their predefined
objectives and datasets utilized. It is foreseen that the introduced audit will assist
scientists with finding the data on best in class Twitter spam discovery procedures
in a united structure. Notwithstanding the improvement of proficient and viable
methodologies for the spam discovery and phony client distinguishing proof on
Twitter, there are as yet certain open zones that need extensive consideration by the
analysts.
The problems are quickly featured as: False news recognizable proof via web-
based networking media systems is an issue that should be investigated in view of
the genuine consequences of that news at specific just as aggregate level. Another
related subject that merits exploring is the distinguishing proof of talk sources via
web-based networking media. Albeit a couple of concentrates dependent on factual
strategies have just been led to recognize the wellsprings of bits of gossip,
progressively modern methodologies, e.g., informal organization based
methodologies are applicable in view of demonstrated accuracy.
2
1.2 SYSTEM ANALYSIS
1.2.1 LITERATURE SURVEY

Literature Survey is the most important step in software development process.
Before developing the tool it is necessary to determine the time factor, economy and
company strength. Once these things are satisfied, ten next steps are to determine
which operating system and the language can be used for developing the tool. Once
the programmers start building the tool the programmers need lot of external
support. This support can be obtained from senior programmers, from the book or
from the websites. Before building the system the above consideration are taken into
the account for developing the proposed system.
1 Title
Improving spam detection in online social networks
Authors
A.Gupta and R.Kaushal,
Description
In this paper, an algorithm, combining three different learning algorithms
(namely Naive Bayes, Clustering and Decision trees) was implemented. This
integrated algorithm categorizes an account as Spammer/Non-Spammer with an
overall accuracy of 87.9%. Finally, this algorithm was compared with all the three
learning algorithm taken alone. It was observed that the combined approach could
give best results in terms of overall accuracy and in detection of non-spammers.
Though, Decision Trees alone perform better in detection of spammers but it is poor
in detecting non-spam accounts, thus it can’t be used solely.
3
2 Title
Twitter fake account detection
Authors
B. Erçahin, Ö. Akta³, D. Kilinç, and C. Akyol
Description
In this paper a new classification algorithm was proposed to improve detecting
fake accounts on social networks, where the SVM trained model decision values
were used to train a NN model, and SVM testing decision values were used to test
the NN model. To reach our goal we used ”MIB” baseline dataset from and run it
into pre-processing phase where four feature reduction techniques were used to
reduce the feature vector.
4
3 Title
Detecting spammers on Twitter
Authors
F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida
Description
This paper is the first to examine Twitter memes and spam based on network
and temporal properties. Twitter shares some properties with earlier communication
mediums, but introduces new structural properties. We find that spam accounts are
not significantly newer, they exhibit slightly higher retweet and reply properties,
they have a higher number of followers and friends, and indegree and outdegree may
be associated with legitimacy.
5
1.2.2 EXISTING SYSTEM
Existing concept:-
Existing concept deals with providing filtering the spammer database no specified
category. The method supports the accessible and obtainable information in the
tweet object to recognize spam tweets and the tweets that are handled previously
related to the same topic.
Profile filtering, where each of which uses two-hop sub networks that are centered
at each other. Assemble techniques and cascading filtering are also proposed for
combining the properties of both trade significance profile and social status. To
check whether a user is fake or not, a two-hop social network for each user is focused
to gather social information from social networks.
Disadvantages
1. No URL based Fake User Identification.
2. No specified category for spammer detection.
3. There is no filtering system based on re-tweeted post.
4. There is no filtering system based on a preprocessing schedule and on Naïve
Bayes algorithm to discard the tweets containing inaccurate information,.
5. Less security due No URL Based Spam Detection.
6
1.2.3 PROPOSED SYSTEM
Proposed concept:-
In the proposed system, the system elaborates a classification of spammer detection

techniques.
1. The first category (Spammer Detection based on comment post) using relational
database to filtering the unwanted comment.
2. In the second category (Fake users identification based on URL), identified re-
tweeted URL through machine learning algorithms.
3. The third category (Fake user identification based on user profile) is based on
detecting fake users through the existing account or similar account.
4. The last category (View Fake User identification result in bar chat) is based on
view fake user percentage range using the bar chart.
Advantages:
1. The URL based Fake User Identification can be classified.
2. There are different specified category for spammer detection database.
3. There are filtering system based on re-tweeted post.
4. There are bar chart for view the percentage range of fake users.
5. The fake content propagation was identified through the metrics that include:
Social reputation, global engagement, topic engagement, likeability and
credibility.
7
1.3 SYSTEM REQUIREMENT
1.3.1 Software Requirement

The software requirement is the specification of the system. It should include both
the definition and specification of the requirements. It is a set of what the system
should do rather than how it should do it. The software requirements provide basis
for creating the software requirements specification. It is useful in estimating cost,
planning team activities, performing tasks and tracking team`s progress throughout
the development activity.
1. Operating system - Windows XP

2. Coding Language - Java/J2EE(JSP , Servlet)
3. Front End - J2E
4. Back End - MySQL
8
1.3.2 Hardware Requirement
The hardware requirement may serve as the basis for a contract for the
implementation of the system and should therefore be a complete and consistent
specification of the whole system. They are used by software engineers as the
starting point for the system design. It shows what the system does and not how it
should be implemented.
1. Processor - Pentium–IV
2. RAM - 4 GB(min)
3. Hard Disk - 20 GB
4. Keyboard - Standard Windows Keyboard
5. Mouse - Two or Three Button Mouse
6. Monitor - SVGA
9
CHAPTER 2
DESIGN AND IMPLEMENTATION
2.1 SYSTEM STUDY

2.1.1 FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase and business proposal
is put forth with a very general plan for the project and some cost estimates. During
system analysis the feasibility study of the proposed system is to be carried out.
This is to ensure that the proposed system is not a burden to the company. For
feasibility analysis, some understanding of the major requirements for the system is
essential.
Three key considerations involved in the feasibility analysis are
1. Economical feasibility
2. Technical feasibility
3. Social feasibility
1. Economical feasibility
This study is carried out to check the economic impact that the system will have
on the organization. The amount of fund that the company can pour into the research
and development of the system is limited.
The expenditures must be justified. Thus the developed system as well within
the budget and this was achieved because most of the technologies used are freely
available. Only the customized products had to be purchased.
10
2. Technical feasibility
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not have a high demand on
the available technical resources. This will lead to high demands on the available
technical resources. This will lead to high demands being placed on the client. The
developed system must have a modest requirement, as only minimal or null changes
are required for implementing this system.
3. Social feasibility
The aspect of study is to check the level of acceptance of the system by the user.
This includes the process of training the user to use the system efficiently. The user
must not feel threatened by the system, instead must accept it as a necessity.
The level of acceptance by the users solely depends on the methods that are
employed to educate the user about the system and to make him familiar with it. His
level of confidence must be raised so that he is also able to make some constructive
criticism, which is welcomed, as he is the final user of the system.
11
2.2 ARCHITECTURE DIAGRAM
Tweet Server
View and Authorize Users
,Add and View Spam Filters
,View All User Posted Tweets
,View All User Tweets Based
On URLs
,View Friend Request and
Response
,View All Tweets with Re-
Tweets
Process all ,View All Tweets , Re-Tweets
user queries and Comments
,View All Spammers Detection
,View All Fake User
Store and retrievals Identification
Web , View Fake User Identification
Database Results
Registering
the User , View Fake Tweet
Identification Result
My Profile
, Search Friends
, Create Tweets
, View My Friends
, View Friend Requests Remote User
, Search Tweets and Comment
, View My Tweets and Comments
, View Friend's Retweets and Give
Comments
Figure 1: Architecture diagram
12
2.2.1 DESCRIPTION
Admin
In this module, the Admin has to login by using valid user name and password. After
login successful he can do some operations such as View and Authorize Users, Add
and View Spam Filters, View All User Posted Tweets, View All User Tweets Based
On URLs, View Friend Request and Response, View All Tweets with Re-Tweets,
View All Tweets, Re-Tweets and Comments, View All Spammers Detection, View
All Fake User Identification, View Fake User Identification Results, View Fake
Tweet Identification Results.
User
In this module, there are n numbers of users are present. User should register before
doing some operations. After registration successful he has to wait for admin to
authorize him and after admin authorized him. He can login by using authorized user
name and password. Login successful he will do some operations like My Profile,
Search Friends ,Create Tweets, View My Friends, View Friend Requests, Search
Tweets and Comment ,View My Tweets and Comments, View Friend's Retweets
and Give Comments.
13
2.3 SYSTEM DESIGN
The DFD is also called as bubble chart. It is a simple graphical formalism that
can be used to represent a system in terms of input data to the system, various
processing carried out on these data and the output data is generated by the system.
2.3.1 DATA FLOW DIAGRAM OF USERS AND ADMIN
Tweet Server Login Web Server View all

users and
authorize
Response
Request
Request
Register
and Search
Login friends, Create
Tweets
My Profile, Search
View and Authorize Users,
Friends, Create
Add and View Spam Filters Response Tweets, View My
,View All User Posted
Friends, View
Tweets, View All User
Friend Requests,
Tweets Based On URLs,
Search Tweets and
View Friend Request and
Comment, View
Response, View All Tweets
My Tweets and
with Re-Tweets, View All
End User Comments
Tweets , Re-Tweets and
Comments.
Figure 2: DFD of User and Admin
14
2.3.2 UML DIAGRAMS
2.3.2.1 USE CASE DIAGRAM

A use case is a set of scenarios that describing an interaction between a user
and a system. A use case diagram displays the relationship among the actors and use
cases. The two main component of a use case diagram are use cases and actors.
15
USE CASE:
Register and Login
List Users and authorize
Search friend & req
List and View all Friends

Req and Res
Create Tweet
User
View all your tweets

and comments
View All Spammers Detection,

View Fake User Identification
Results
User
,View
View all otherFake Tweet
friends’ tweets
Identification Results
and make your comment
View All Fake User Identification &

View all your friends and their
profile details
Figure 3: Use case diagram
16
2.3.2.2 SEQUENCE DIAGRAM
Sequence diagrams demonstrate the behavior of the objects in a use case by

describing the objects and the messages they pass. The diagrams are read left to right
and descending. The example below shows an object of class
1. Start the behavior by sending a message to an object of class.
2. Messages pass between the different objects until the object of class 1
receives the final message.
Sequence diagrams, collaboration diagrams, or both diagrams can be used
to demonstrate the interaction of objects in a use case.
Tweet Server Server User
List All Users and authorize Register and Login, View User
Profile
List and View all Friends Req and Res Search friends, req and response
View all tweet comments / Re tweet Create Tweet

details
comments
View All Spammers Detection View all your tweets and comments
View all other friends’ tweets and

View All Fake User Identification make your comment

Results, View Fake Tweet View all your friends and their
Identification Results profile details
Figure 4: Sequence diagram
17
2.3.2.3 FLOW CHART DIAGRAM
Flow Chart Diagram displays a special state diagram where most of the states are
action states and most of the transitions are triggered by completion of the actions in
the source states. This diagram focuses on flows driven by internal processing.
FLOW CHART: USER
Start
User Register
Yes No
Login
View users Profile Username &

Password Wrong
Search friends, req / res

friends
Create Tweets
Logout
View all your tweets and comments,

View all your friends and their profile
details
View all other friends’ tweets

and make your comment
Figure 5: FCD for user
18
FLOW CHART: ADMIN
Start
Admin
Yes No
Login
View All Users Profile Username &

Password Wrong
List all Friends Req and Res and List Tweets
View all friend request and response

and View Tweets
Logout
View all tweet comments /

re tweet details
View All Spammers Detection &

View All Fake User Identification

Results, View Fake Tweet
Identification Results
Figure 6: FCD for admin
19
2.3.2.4 CLASS DIAGRAM
Class Diagram models class structure and contents using design elements such as
classes, packages and objects. It also displays relationships such as containment,
inheritance, associations and others. Class diagrams are used in nearly all Object
Oriented software designs. Use them to describe the classes of the system and their
relationships to each other.
The class diagram is the main building block of object oriented modeling. It is used
both for general conceptual modeling of the systematic of the application, and for
detailed modeling translating the models into programming code. Class diagrams
can also be used for modeling. The classes in a class diagram represent both the main
objects, interactions in the application and the classes to be programmed.
In the diagram, classes are represented with boxes which contain three parts
• The upper part holds the name of the class
• The middle part contains the attributes of the class
• The bottom part gives the methods or operations the class can take or
undertake
In the design of a system, a number of classes are identified and grouped together in
a class diagram which helps to determine the static relations between those objects.
With detailed modeling, the classes of the conceptual design are often split into a
number of subclasses.
20
Admin Web Server
View and Authorize Users, Add and

View Spam Filters ,View All User
Posted Tweets, View All User Tweets
Based On URLs, View Friend Request
and Response, View All Tweets with
Re-Tweets, View All Tweets , Re-
Methods
Tweets and Comments, View All
Spammers Detection, View All Fake
User Identification, View Fake User
Identification Results, View Fake
Tweet Identification Results
Members
User & Author Login Tweet Name, Tweet Description,
Tweet Uses, Select Tweet Image,
Tweet Owner, Tweet Date and Time
Methods Login, Register, Reset

Remote User
Members User Name, Password
Methods Register, Reset, Search
Name, Password, DOB,

End User Gender, Address, City,
Members Country, Email, Mobile,
My Profile, Search Friends, Create History, Post
Tweets, View My Friends, View
Friend Requests, Search Tweets
and Comment, View My Tweets Methods
and Comments, View Friend's
Retweets and Give Comments
Tweet Name, Tweet Description,

Tweet Uses, Select Tweet Image, Members
Tweet Owner, Tweet Date and
Time
Figure 7: Class diagram
21
2.4 MODULE SPLIT UP
The project consists of four modules. They are,
 Spammer Detection based on comment post,

 Fake users identification based on URL,
 Fake user identification based on user profile, and
 View Fake User identification result in bar chat.
2.4.1 Spammer Detection based on comment post
This is the first module our project for spammer detection on comment post.
While user comment on any another user post, this module scan the content of
comment then match with spammer filter database.
Figure 8: Filter database
22
Here, we can add data with different category.
Figure 9: Add filter database
After scanning the filter database, if the data is matched then it list into spammer
details with different category.
Figure 10: Negative spam detection
23
Figure 11: Sexual spam detection
Figure 12: Offensive spam detection
24
Figure 13: Hateful spam detection
Figure 14: Vulgar spam detection
25
2.4.2 Fake user identification based on URL
This module is identify based on retweeted user. The user post a tweet the
module scanning the previous tweet, if the tweet is similar the URL list asa retweeted
URLs list.
Figure 15: Tweet URL database
26
2.4.3 Fake user identification based on user profile
In this module the fake user identification is based on user profile details. The
user try to create a existing account or similar account then the user considered has
a fake user. Then block the user and wanted the user that you are try to create a fake
user.
Figure 16: Fake user database
27
2.4.4 View Fake User identification result in bar chat
This module used to view the fake user identification in bar chat with the
highest range value. It is used to view a fake user with the range of the percentage.
Figure 16: Fake user result in bar chat
28
2.5 CODING
User login:-
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>User Login Page..</title>
<meta name="keywords" content="Holiday, free CSS template, clean, neat, aqua,
white" />
<meta name="description" content="Holiday is a clean and neat free CSS template
using aqua and white colors." />
<link href="templatemo_style.css" rel="stylesheet" type="text/css" />
<script language="javascript" type="text/javascript">
function clearText(field)
{
if (field.defaultValue == field.value) field.value = '';
else if (field.value == '') field.value = field.defaultValue;
}
</script>
<style type="text/css">

</style>
</head>
<body>
<div id="templatemo_top_wrapper">
<div id="templatemo_top">
<div id="templatemo_header">
<div>
<table width="965" border="0" cellspacing="2" cellpadding="2"><tr>
<td width="957"><p><span class="style4">Identification of Fake User and
Spammer Detection on Social Media Sites</span>
</p>
<p align="center"> <span class="style4"></span>
</p>
</td>
</tr>
</table>
</div>
</div> 
<div id="templatemo_middle">
<div id="templatemo_menu"><ul>
<li><a href="index.html">Home</a></li>
<li><a href="UserLogin.jsp" class="current">User</a>
</li>
30
<li><a href="ServerLogin.jsp">Tweet Server</a>
</li>
</ul>
<div id="search_box">
<form action="#" method="post">
<input type="text" value="Enter keyword here..." name="q" size="10"
id="searchfield" title="searchfield" onfocus="clearText(this)"
onblur="clearText(this)" />
<input type="submit" name="Search" value="" id="searchbutton" title="Search"/>
</form>
</div>
<div class="cleaner"></div>
</div> 
<div id="mid_content">
<h2>Twitter - Online Social Network </h2>
<p>Twitter is a popular online social network service for sharing short messages
(tweets) among friends.</p></div>
</div> 
</div> 
</div> 
<div id="templatemo_main">
<div class="col_w600 float_l">
<div class="content_box">
<h2><span class="style18">Welcome To User Login</span>..</h2>
<p>
<img src="images/WrongLogin.jpg" width="308" height="128" />
</p>
31
<div class="image_wrapper image_fl">
<img src="images/templatemo_image_01.jpg" alt="Image" />
</div>
<form id="form1" name="form1" method="post"
action="UserAuthentication.jsp">
<table width="403" border="0" cellspacing="2" cellpadding="2"><tr>
<td width="149" height="62" align="center" bgcolor="#FFFF00"><div
align="center" class="style19">
<div align="left">Name (required)</div>
</div>
</td>
<td width="240"><input id="name" name="userid" class="text" /></td>
</tr>
<tr>
<td height="46" align="center" bgcolor="#FFFF00">
<div align="center" class="style19">
<div align="left">Password (required)</div>
</div>
</td>
<td><input type="password" id="pass" name="pass" class="text" /></td>
</tr>
<tr>
<td> </td>
<td> </td>
</tr>
<tr>
<td> </td>
32
<td><span class="style19">
<input name="imageField" type="submit" class="LOGIN" id="imageField"
value="Login" />
New User?</span><a href="UserRegister.jsp" class="style20"> Register </a>
</td>
</tr>
<tr>
<td height="26"> </td>
<td> </td></tr>
</table>
<p align="right">
<a href="index.html" class="style21"><strong>Back</strong></a></p>
</form></div>
<div class="cleaner"></div></div>
<div class="col_w300 float_r">
<h2>Sidebar Menu </h2>
<p><a href="UserLogin.jsp"><strong>Home</strong></a></p>
<p><a href="index.html"><strong>Index Page</strong></a></p>
<p class="news_box"> </p></div>
</div> 
<div id="templatemo_footer_wrapper_01">
<div id="templatemo_footer_wrapper_02"></div>

</div> 
</body>
</html>
33
Server login:-
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Tweet Server Login Page..</title>
<meta name="keywords" content="Holiday, free CSS template, clean, neat, aqua,
white" />
<meta name="description" content="Holiday is a clean and neat free CSS template
using aqua and white colors." />
<link href="templatemo_style.css" rel="stylesheet" type="text/css" />
<script language="javascript" type="text/javascript">
function clearText(field)
{
if (field.defaultValue == field.value) field.value = '';
else if (field.value == '') field.value = field.defaultValue;
}
</script>
<style type="text/css">

</style>
</head>
<body>
<div id="templatemo_top_wrapper">
<div id="templatemo_top">
<div id="templatemo_header">
<div>
<table width="965" border="0" cellspacing="2" cellpadding="2">
<tr>
<td width="957"><p><span class="style4">Identification of Fake User and
Spammer Detection on Social Media Sites</span></p>
<p align="center"> <span class="style4"></span></p></td>
</tr></table>
</div>
</div> 
<div id="templatemo_middle">
<div id="templatemo_menu">
<ul>
<li>
<a href="index.html">Home</a>
</li>
35
<li><a href="UserLogin.jsp">User</a></li>
<li><a href="ServerLogin.jsp" class="current">Tweet Server</a></li>
</ul>
<div id="search_box">
<form action="#" method="post">
<input type="text" value="Enter keyword here..." name="q" size="10"
id="searchfield" title="searchfield" onfocus="clearText(this)"
onblur="clearText(this)" />
<input type="submit" name="Search" value="" id="searchbutton" title="Search"/>
</form>
</div>
</div> 
<div id="mid_content">
<h2>Twitter - Online Social Network </h2>
<p>Twitter is a popular online social network service for sharing short messages
(tweets) among friends.</p>
</div>
</div> 
</div> 
</div> 
<div id="templatemo_main">
<div class="col_w600 float_l">
<div class="content_box">
<h2><span class="style25">Welcome to Tweet Server Login Page..</span></h2>
<p><img src="images/WrongLogin.jpg" width="308" height="128" /></p>
<div class="image_wrapper image_fl">
36
<img src="images/templatemo_image_01.jpg" alt="Image" />
</div>
<form id="form1" name="form1" method="post"
action="ServerAuthentication.jsp">
<table width="423" border="0" cellspacing="2" cellpadding="2">
<tr>
<td width="197" height="46" align="center" bgcolor="#FFFF00"><span
class="style19">
<label for="name">Server Name (required)</label>
</span>
</td>
<td width="212"><input id="name" name="userid" class="text" /></td>
</tr>
<tr>
<td height="72" align="center" bgcolor="#FFFF00"><span
class="style19">Password (required)</span></td>
<td>
<input type="password" id="pass" name="pass" class="text" />
</td>
</tr>
<tr>
<td> </td>
<td>
<input name="imageField" type="submit" class="LOGIN" id="imageField"
value="Login" />
</td>
</tr>
37
</table>
<p align="right"><a href="index.html" class="style20">Back</a>
</p>
</form>
</div>
</div>
<div class="col_w300 float_r">
<h2>Sidebar Menu </h2>
<p>
<a href="ServerLogin.jsp"><strong>Home</strong></a>
</p>
<p><a href="index.html"><strong>Index Page</strong>
</a>
</p>
<p class="news_box"> </p>
</div>
<div class="cleaner">
</div>
</div> 
</div>

</div> 
</body>
</html>
38
2.6 INTERFACE DESIGN
2.6.1 Home Page
Figure 18: Home page
39
2.6.2 User login page
Figure 19: User login page
40
2.6.3 Server login page
Figure 20: Server login page
41
2.6.4 User registration page
Figure 21: User registration page
42
2.6.5 User page
Figure 22: User page
43
2.6.6 Server page
Figure 23: Server page
44
2.6.7 Posting page
Figure 24: Posting page
45
2.6.8 User profile page
Figure 25: User profile page
46
2.7 Table Design
Registration data table
Table 1: Registration data table
47
Server data table
Table 2: Server data table
48
Spammer filter data table
Table 3: Spammer filter data table
49
Friend request data table
Table 4: Friend request data table
50
Fake user data table
Table 5: Fake user data table
51
2.8 SYSTEM TESTING
The following are the Testing Methodologies:

1. Unit Testing.
2. Integration Testing.
3. User Acceptance Testing.
4. Output Testing.
5. Validation Testing.
2.8.1 Unit Testing

Unit testing focuses verification effort on the smallest unit of Software design
that is the module. Unit testing exercises specific paths in a module’s control
structure to ensure complete coverage and maximum error detection.
This test focuses on each module individually, ensuring that it functions
properly as a unit. Hence, the naming is Unit Testing.
During this testing, each module is tested individually and the module
interfaces are verified for the consistency with design specification. All important
processing path are tested for the expected results. All error handling paths are also
tested.
2.8.2 Integration Testing

Integration testing addresses the issues associated with the dual problems of
verification and program construction. After the software has been integrated a set
of high order tests are conducted.
The main objective in this testing process is to take unit tested modules and
builds a program structure that has been dictated by design.
52
The following are the types of Integration Testing:
1. Top Down Integration
This method is an incremental approach to the construction of program
structure. Modules are integrated by moving downward through the control
hierarchy, beginning with the main program module. The module subordinates to
the main program module are incorporated into the structure in either a depth first
or breadth first manner.
In this method, the software is tested from main module and individual stubs
are replaced when the test proceeds downwards.
2. Bottom-up Integration
This method begins the construction and testing with the modules at the lowest
level in the program structure.
Since the modules are integrated from the bottom up, processing required for
modules subordinate to a given level is always available and the need for stubs is
eliminated.
The bottom up integration strategy may be implemented with the following steps:
 The low-level modules are combined into clusters into clusters that perform a
specific Software sub-function.
 A driver (i.e.) the control program for testing is written to coordinate test case
input and output.
 The cluster is tested.
 Drivers are removed and clusters are combined moving upward in the
program structure
The bottom up approaches tests each module individually and then each module is
module is integrated with a main module and tested for functionality.
53
2.8.3 User Acceptance Testing
User Acceptance of a system is the key factor for the success of any system.
The system under consideration is tested for user acceptance by constantly keeping
in touch with the prospective system users at the time of developing and making
changes wherever required.
The system developed provides a friendly user interface that can easily be
understood even by a person who is new to the system.
2.8.4 Output Testing

After performing the validation testing, the next step is output testing of the
proposed system, since no system could be useful if it does not produce the required
output in the specified format.
Asking the users about the format required by them tests the outputs generated
or displayed by the system under consideration.
Hence the output format is considered in 2 ways – one is on screen and another
in printed format.
2.8.5 Validation Checking

Validation checks are performed on the following fields.
Text Field:
The text field can contain only the number of characters lesser than or equal
to its size.
The text fields are alphanumeric in some tables and alphabetic in other tables.
Incorrect entry always flashes and error message.
54
Numeric Field:
The numeric field can contain only numbers from 0 to 9. The individual
modules are checked for accuracy and what it has to perform.
Each module is subjected to test run along with sample data. The
individually tested modules are integrated into a single system. Testing involves
executing the real data information is used in the program the existence of any
program defect is inferred from the output. A successful test is one that gives out the
defects for the inappropriate data and produces and output revealing the errors in the
system.
Preparation of Test Data

Taking various kinds of test data does the above testing. Preparation of test
data plays a vital role in the system testing. After preparing the test data the system
under study is tested using that test data.
While testing the system by using test data errors are again uncovered and
corrected by using above testing steps and corrections are also noted for future use.
Using Live Test Data:

Live test data are those that are actually extracted from organization files.
After a system is partially constructed, programmers or analysts often ask users to
key in a set of data from their normal activities.
Then, the systems person uses this data as a way to partially test the system.
In other instances, programmers or analysts extract a set of live data from the files
and have the entered themselves. It is difficult to obtain live data in sufficient
amounts to conduct extensive testing.
55
And, although it is realistic data that will show how the system will perform
for the typical processing requirement, assuming that the live data entered are in fact
typical, such data generally will not test all combinations or formats that can enter
the system.
This bias toward typical values then does not provide a true systems test and
in fact ignores the cases most likely to cause system failure.
Using Artificial Test Data:

Artificial test data are created solely for test purposes, since they can be
generated to test all combinations of formats and values. In other words, the artificial
data, which can quickly be prepared by a data generating utility program in the
information systems department, make possible the testing of all login and control
paths through the program.
The most effective test programs use artificial test data generated by persons
other than those who wrote the programs. Often, an independent team of testers
formulates a testing plan, using the systems specifications.
The package “Virtual Private Network” has satisfied all the requirements
specified as per software requirement specification and was accepted.
56
CHAPTER 3
CONCLUSION
In this paper, we performed a review of techniques used for detecting spammers on

Twitter. In addition, we also presented a taxonomy of Twitter spam detection
approaches and categorized them as Spammer Detection based on comment post,
Fake users identification based on URL, Fake user identification based on tweet post,
and View Fake User identification result in bar chat. We also compared the presented
techniques based on several features, such as user features, content features, graph
features, structure features, and time features. Moreover, the techniques were also
compared in terms of their specified goals and datasets used. It is anticipated that the
presented review will help researchers find the information on state-of-the-art
Twitter spam detection techniques in a consolidated form.
Despite the development of efficient and effective approaches for the spam
detection and fake user identification on Twitter, there are still certain open areas
that require considerable attention by the researchers. Another associated topic that
is worth investigating is the identification of rumor sources on social media.
Although a few studies based on statistical methods have already been conducted to
detect the sources of rumors, more sophisticated approaches, e.g., social network
based approaches, can be applied because of their proven effectiveness.
3.1 FUTURE ENHANCEMENT
Furthermore, we can deploy the Project Management System with the plagiarism
checker, video conference and chat bots.
57
REFERENCES
1. B. Erçahin, Ö. Akta³, D. Kilinç, and C. Akyol, ``Twitter fake account

detection,'' in Proc. Int. Conf. Comput. Sci. Eng. (UBMK), Oct. 2017, pp.
388392.
2. F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida, ``Detecting
spammers on Twitter,'' in Proc. Collaboration, Electron. Messaging, Anti-
Abuse Spam Conf. (CEAS), vol. 6, Jul. 2010, p. 12.
3. S. Gharge, and M. Chavan, `Àn integrated approach for malicious tweets
detection using NLP,'' in Proc. Int. Conf. Inventive Commun. Comput.
Technol. (ICICCT), Mar. 2017, pp. 435438.
4. T. Wu, S. Wen, Y. Xiang, and W. Zhou, ``Twitter spam detection: Survey of
new approaches and comparative study,'' Comput. Secur., vol. 76, pp.
265284, Jul. 2018.
5. S. J. Soman, `À survey on behaviors exhibited by spammers in popular
social media networks,'' in Proc. Int. Conf. Circuit, Power Comput. Tech-
nol. (ICCPCT), Mar. 2016, pp. 16.
6. A. Gupta, H. Lamba, and P. Kumaraguru, ``1.00 per RT #BostonMarathon
# prayforboston: Analyzing fake content on Twitter,'' in Proc. eCrime
Researchers Summit (eCRS), 2013, pp. 112.
7. F. Concone, A. De Paola, G. Lo Re, and M. Morana, ``Twitter analysis for
real-time malware discovery,'' in Proc. AEIT Int. Annu. Conf., Sep. 2017,
pp. 16.
58
8. [8] N. Eshraqi, M. Jalali, and M. H. Moattar, ``Detecting spam tweets in
Twitter using a data stream clustering algorithm,'' in Proc. Int. Congr.
Technol., Commun. Knowl. (ICTCK), Nov. 2015, pp. 347351.
9. C. Chen, Y. Wang, J. Zhang, Y. Xiang, W. Zhou, and G. Min, ``Statistical
features-based real-time detection of drifted Twitter spam,'' IEEE Trans. Inf.
Forensics Security, vol. 12, no. 4, pp. 914925, Apr. 2017.
10. C. Buntain and J. Golbeck,”Automatically identifying fake news in popular
Twitter threads,'' in Proc. IEEE Int. Conf. Smart Cloud (SmartCloud), Nov.
2017, pp. 208215.
11. C. Chen, J. Zhang, Y. Xie, Y. Xiang,W. Zhou, M. M. Hassan, A. AlElaiwi,
and M. Alrubaian, `À performance evaluation of machine learning-based
streaming spam tweets detection,'' IEEE Trans. Comput. Social Syst., vol. 2,
no. 3, pp. 6576, Sep. 2015.
12. G. Stafford and L. L. Yu, `Àn evaluation of the effect of spam on Twitter
trending topics,'' in Proc. Int. Conf. Social Comput., Sep. 2013,pp. 373378.
13. M. Mateen, M. A. Iqbal, M. Aleem, and M. A. Islam, `À hybrid approach
for spam detection for Twitter,'' in Proc. 14th Int. Bhurban Conf. Appl. Sci.
Technol. (IBCAST), Jan. 2017, pp. 466471.
14. A. Gupta and R. Kaushal, `Ìmproving spam detection in online social
networks,'' in Proc. Int. Conf. Cogn. Comput. Inf. Process. (CCIP), Mar.
2015, pp. 16.
15. F. Fathaliani and M. Bouguessa, `À model-based approach for identifying
spammers in social networks,'' in Proc. IEEE Int. Conf. Data Sci. Adv. Anal.
(DSAA), Oct. 2015, pp. 19.
59

Untitled PDF

Uploaded by

Copyright:

Available Formats

You might also like

Untitled PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Untitled PDF

Uploaded by

Copyright:

Available Formats

IDENTIFICATION OF FAKE USER AND SPAMMER

DETECTION ON SOCIAL MEDIA SITES

in partial fulfillment for the award of the degree

PRIYADARSHINI ENGINEERING COLLEGE, VANIYAMBADI-635 751

ANNA UNIVERSITY: CHENNAI 600 025

Certified that this project report “IDENTIFICATION OF FAKE USER AND

COLLEGE NAME : PRIYADARSHINI ENGINEERING COLLEGE,

INTERNAL EXAMINER EXTERNAL EXAMINER

We thank god almighty for giving us enough strength and tremendous

LIST OF ABBREVATIONS xii

1.1 Project description 2

1.2 System analysis 3

1.2.1 Literature survey 3

1.2.2 Existing system 6

1.2.3 Proposed system 7

1.3 System requirement 8

1.3.1 Software requirement 8

1.3.2 Hardware requirement 9

2.1 System study 10

2.1.1 Feasibility study 10

2.2 Architecture diagram 12

2.3 System design 14

2.3.1 Data flow diagram of users and admin 14

2.3.2 UML diagrams 15

2.4 Module split up 22

2.4.1 Spammer Detection based on comment post 22

2.4.2 Fake user identification based on URL 26

2.4.3 Fake user identification based on user profile 27

2.4.4 View Fake User identification result in bar chat 28

2.6 Interface design 39

2.6.1 Home page 39

2.6.3 Server login page 41

2.6.4 User registration page 42

2.6.5 User page 43

2.6.6 Server page 44

2.6.7 Posting page 45

2.6.8 User profile page 46

2.7 Table design 47

2.8 System testing 52

2.8.1 Unit testing 52

2.8.2 Integration testing 52

2.8.3 User acceptance testing 54

2.8.4 Output testing 54

2.8.5 Validation Checking 54

3.1 Future enhancement 57

TABLE No. TABLE NAME PAGE No.

1 Registration data table 47

2 Server data table 48

3 Spammer filter data table 49

4 Friend request data table 50

5 Fake user data table 51

FIGURE No. NAME OF THE FIGURE PAGE NO.

2 DFD of User and Admin 14

3 Use case diagram 16

5 FCD for user 18

6 FCD for admin 19

9 Add filter database 23

10 Negative spam detection 23