Fake Profiling


Project Name : Study on Fake Profiling Academic Year : 2019-2020

Subject Name: Emerging Trends Semester : Sixth

[ college logo ]

A STUDY ON

Fake Profiling

MICRO PROJECT REPORT


Submitted in March 2020 by the group of 3 students

Sr. No   Full name of Student   Roll No (Sem-VI)   Enrollment No   Seat No (Sem-VI)
1
2
3

Under the Guidance of


[ your guide name ]

in
Three Years Diploma Program in Engineering & Technology of Maharashtra
State Board of Technical Education, Mumbai (Autonomous)
ISO 9001:2008 (ISO/IEC-27001:2013)
at

[ your college name ]

MAHARASHTRA STATE BOARD OF TECHNICAL
EDUCATION, MUMBAI

Certificate
This is to certify that Mr. /Mrs.

Roll No: of Sixth Semester of Diploma

in Engineering & Technology at [ your college name ] , has completed the Micro

Project satisfactorily in Subject ETI in the academic year 2019-2020 as per the

MSBTE prescribed curriculum of I Scheme.

Place: Pune Enrollment No:

Date: / / 2020 Exam Seat No:

Project Guide Head of the Department Principal

Head of
Institute

INDEX

Sr. No.   Title


Abstract
1. Introduction
2. Literature Survey
3. System Design
4. Data Flow Diagram
5. Use Case Diagram
6. Testing and Evaluation
7. Conclusions
8. References

Abstract

Social networks such as Facebook, Twitter and Google+ have attracted millions of users in
recent years. One of the most widely used social networks, Facebook, held its initial public
offering (IPO) in May 2012, which was among the biggest in Internet technology. For-profit and
nonprofit organizations primarily use such platforms for target-oriented advertising and
large-scale marketing campaigns. Social networks have attracted worldwide attention because of
their potential to address millions of users and possible future customers. This potential is
often misused by malicious users who extract sensitive private information from unaware users.
One of the most common ways of performing a large-scale data harvesting attack is the use of
fake profiles, where malicious users present themselves in profiles impersonating fictitious or
real persons. The main goal of this research is to evaluate the implications of fake user
profiles on Facebook. To do so, we established a comprehensive data harvesting attack, the
social engineering experiment, and analyzed the interactions between fake profiles and regular
users to eventually undermine the Facebook business model. Furthermore, privacy considerations
are analyzed using focus groups. As a result of our work, we provided a set of countermeasures
to increase the awareness of users.

Introduction

In recent years, online social networks such as Facebook, Twitter and Google+ have become a
global mass phenomenon and one of the fastest emerging e-services according to Gross and
Acquisti (2005) and Boyd and Ellison (2007). A study published by Facebook (2012) indicates
that there were about 901 million monthly active users on the platform at the end of March
2012, which makes Facebook one of the largest online social networks. Not only common users but
also celebrities, politicians and other people of public interest use social media to spread
content to others. Furthermore, companies and organizations consider social media sites the
medium of choice for large-scale marketing and target-oriented advertising campaigns.
The sustainability of the business model relies on several different factors and is usually not
publicly disclosed. Nonetheless, we assume that two major aspects are significant for Facebook.
First and foremost, Facebook relies on people using their real-life identity and therefore
discourages the use of pseudonyms. Verified accounts allow (prominent) users to verify their
identity and to continue using pseudonyms, e.g., stage names such as 'Gaga'. This is considered
to be a security mechanism against fake accounts (TechCrunch 2012); moreover, users are asked
to identify friends who do not use their real names.

Motivation of the Project

One of the problems in data classification is the unbalanced distribution of data, in which
some classes contain more items than others. This problem arises most often in two-class
applications, where one class has more items than the other. The resampling approach changes
the distribution of the training sample sets by processing the data. There are several
approaches to improving classification performance by balancing the datasets [1]. Resampling
may balance the class distribution either by removing samples of the majority class
(undersampling) or by increasing the samples of the minority class (oversampling). Another
approach is the Synthetic Minority Oversampling Technique (SMOTE), which creates artificial
minority-class data based on the similarity of characteristics between minority-class items.
In the proposed model, SMOTE is used because the model relies on the similarity features of the
nodes and because removing information is undesirable. Because all oversampling approaches
replicate minority-class samples from the main data, they may increase noise and processing
time, and may result in overfitting and reduced efficiency.

Chawla proposed the SMOTE algorithm. This algorithm randomly creates items of a minority class
according to a certain rule and combines these new samples with the original dataset to produce
a new training set. In this way it can be used to produce new minority-class items. Within a
minority class, different samples play different roles in the oversampling process: samples on
the margin of the class play a larger role than those at its center. Samples taken from the
margin of a minority class may improve the recognition decision and the classification rate for
minority-class prototypes.
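
The interpolation idea behind SMOTE can be illustrated with a short Python sketch. This is a simplified illustration and not the implementation used in this project: the function name smote, the parameters n_new and k, and the toy data points are all assumptions made for the example.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class (itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # random position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples with two numeric features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
balanced_minority = minority + new_points  # original plus synthetic samples
```

Because every synthetic point is interpolated between two existing minority samples, no majority-class information is removed, which matches the reason given above for preferring SMOTE over undersampling.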

Literature Survey

The literature survey was carried out to explore previous research in the following relevant
areas:

• Web Mining
• Web Content Mining
• Extractive Summarization
• Query Based Summarization
• Sentence Scoring Approaches
• Preprocessing for Query Based Summarization
• Web page Filtering Approaches
• Web page Segmentation
• Applications of Summarization
• Applications of Query Based Summarization

2.1 WEB MINING

The World Wide Web has accumulated heterogeneous information content to cater to the
information needs of various user communities, and has now become the single largest collection
of data in the information era. This enormous corpus, which is growing beyond everyone's
expectations, opens up scope for more research work. Data mining techniques can be applied to
this huge collection of diversified content to uncover hidden patterns, trends and useful
knowledge that can be used to provide value-added services to web users. A few of the
value-added services resulting from the knowledge gained are search result ranking, targeted
marketing, improved customer relationship and service, and detection of fraudulent users or
transactions. Capturing and discovering knowledge from web data resources has therefore become
very important for the web mining research community.

Soumen Chakrabarti et al. (1999) described a new hypertext resource discovery system known as
a focused crawler, which analyzed its crawl boundary to find the most relevant links and to
avoid irrelevant regions of the web. This approach led to a significant reduction in hardware
and network resource requirements. Mei Kobayashi and Koichi Takeda (2000) studied the growth
and development of the Internet and the technologies useful for information retrieval on the
web, and also discussed the development of various techniques targeted at resolving problems
such as slow retrieval speed.

System Design
Characteristics of the proposed system
The fake profiling system created for identifying criminals has the following features:
• Compared with the present system, the proposed system is less time-consuming and more
efficient.
• Analysis is very easy in the proposed system because it is automated.
• Results are precise and accurate and are declared in a very short span of time, because
calculations and evaluations are done by the simulator itself.
• The proposed system is very secure: there is no chance of the question paper leaking, as it
depends on the administrator only.
• The logs of appeared candidates and their marks are stored and can be backed up for future
use.

Admin Table:

S.No.   Field name   Data Type   Description
1.      User name    Text        Stores the user name for checking the correct username
2.      Password     Text        Stores the password corresponding to the username
3.      User Type    Text        User type: Administrator or User

Fake profiling Table:

S.No.   Field name   Data Type   Description
1.      Id           Number      Unique key for every Teacher
2.      Name         Text        Name of Teacher

Criminal fake id Table:

S.No.   Field name   Data Type   Description
1.      Name         Text        Name of id
2.      Status       Number      Total number of ids held by a particular person
3.      Phone no     Number      Phone number
4.      Subject      Text        Id is maintained
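
As an illustration, the three tables above could be created as follows using Python's built-in sqlite3 module. This is a sketch under stated assumptions: the report does not name a database engine, so SQLite, the snake_case table and column names, the key choices and the sample row are all assumptions made for the example.

```python
import sqlite3

# In-memory database, used here for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admin (
    user_name TEXT PRIMARY KEY,
    password  TEXT NOT NULL,  -- a real system would store a password hash
    user_type TEXT NOT NULL CHECK (user_type IN ('Administrator', 'User'))
);
CREATE TABLE fake_profiling (
    id   INTEGER PRIMARY KEY,  -- unique key
    name TEXT NOT NULL
);
CREATE TABLE criminal_fake_id (
    name     TEXT NOT NULL,
    status   INTEGER,  -- total number of ids held by a particular person
    phone_no INTEGER,  -- typed as Number in the table above
    subject  TEXT
);
""")

# Hypothetical sample row, followed by a lookup of the user type.
conn.execute(
    "INSERT INTO admin VALUES (?, ?, ?)",
    ("admin", "secret", "Administrator"),
)
row = conn.execute(
    "SELECT user_type FROM admin WHERE user_name = ?", ("admin",)
).fetchone()
```

The CHECK constraint mirrors the description of the User Type field, which allows only Administrator or User.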

Data Flow diagram

Fig. Data Flow Diagram of fake profiling

Use case Diagram

Fig. Use case diagram for Online Examination Portal

Testing and Evaluation


This section deals with the functional evaluation of the application. The application was
tested by installing it on a BLU LIFE PURE mobile phone running Android version 4.2.1. The
application supports all Android versions up to 4.4. To install the application, the "install
from other locations" setting under the developer options of the phone should be enabled. The
later sections of the chapter deal with various test cases.

Conclusion

In this paper, a new classification algorithm was proposed to improve the detection of fake
accounts on social networks: the decision values of the trained SVM model were used to train a
NN model, and the SVM testing decision values were used to test the NN model.
To reach our goal, we used the "MIB" baseline dataset from [26] and ran it through a
pre-processing phase in which four feature reduction techniques were used to reduce the feature
vector.
The correlation feature set records a remarkable accuracy among the feature selection technique
sets, because the correlation technique not only selects the best features but also removes the
redundancy.
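
The two-stage idea described above (SVM decision values used as the input of a NN) can be sketched in plain Python. This is a toy illustration under strong assumptions and not the implementation evaluated in this report: the linear SVM is fitted by simple sub-gradient descent, a single logistic neuron stands in for the NN, and the data are synthetic two-dimensional points rather than the "MIB" dataset. All function names and parameter values are hypothetical.

```python
import math
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM fitted by sub-gradient descent on the hinge loss (labels +1/-1)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * (dot(w, xi) + b) < 1   # margin check before the update
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if violated:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def sigmoid(z):
    # numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_neuron(d_values, targets, epochs=200, lr=0.1):
    """One logistic neuron trained on SVM decision values (targets are 1/0)."""
    a, c = 0.0, 0.0
    for _ in range(epochs):
        for d, t in zip(d_values, targets):
            err = sigmoid(a * d + c) - t
            a -= lr * err * d
            c -= lr * err
    return a, c


def make_blobs(rng, n):
    """Toy data: label +1 ('fake') around (2, 2), label -1 ('real') around (-2, -2)."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append((rng.gauss(2 * label, 0.5), rng.gauss(2 * label, 0.5)))
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_blobs(rng, 60)
X_test, y_test = make_blobs(rng, 30)

# Stage 1: fit the SVM and compute its decision values on both sets.
w, b = train_linear_svm(X_train, y_train)
d_train = [dot(w, x) + b for x in X_train]
d_test = [dot(w, x) + b for x in X_test]

# Stage 2: train the neuron on the training decision values, test it on the test ones.
a, c = train_neuron(d_train, [1 if t == 1 else 0 for t in y_train])
predictions = [1 if sigmoid(a * d + c) >= 0.5 else -1 for d in d_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
```

The point of the sketch is the data flow: the second model never sees the raw features, only the one-dimensional decision value produced by the first model.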
As mentioned above, the feature subsets with the highest accuracy were highlighted, as follows:
the best pattern for Spearman's rank-order correlation was (1000001000110110), the best pattern
for multiple linear regression was (0110110111001111), and the best pattern for Wrapper-SVM was
(110111111011111). The NN accuracy results are illustrated in Figure 7.
As shown in Figure 7, the SVM classifier has the highest accuracy when using the Wrapper-SVM
feature set and the lowest accuracy with the Yang et al. feature set. The accuracy results for
the NN classifier were lower than their counterparts using the SVM classifier, with the highest
accuracy, 0.888, from the regression feature set and the lowest accuracy from the PCA feature
set.
By comparing the accuracy results of all three classification algorithms, it became clear that
the SVM-NN classification algorithm has the highest classification accuracy on all the feature
subsets compared with the other two classifiers, as in Figure 8, with a highest accuracy of
0.983.
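
A feature pattern such as the ones quoted above can be read as a bit mask over the feature vector. The sketch below assumes that each bit, read left to right, marks whether the feature at that position is kept; the report does not state this mapping, and the feature names f1 to f16 are hypothetical placeholders.

```python
def apply_pattern(features, pattern):
    """Keep features[i] wherever pattern[i] is '1'."""
    if len(features) != len(pattern):
        raise ValueError("pattern length must match the number of features")
    return [f for f, bit in zip(features, pattern) if bit == "1"]


# Hypothetical names for a 16-element feature vector.
feature_names = [f"f{i}" for i in range(1, 17)]

# Best pattern reported for Spearman's rank-order correlation.
spearman_pattern = "1000001000110110"
selected = apply_pattern(feature_names, spearman_pattern)
```

Under this reading, the Spearman pattern keeps six of the sixteen features.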

References

Books:

• Bhagyashri Kaiche, Samiksha Kalan, Sneha More, Lekha Shelukar (KBT College of Engg, Nashik,
India), Online Descriptive Examination and Assessment System, International Journal of
Emerging Technology and Advanced Engineering, www.ijetae.com (ISSN 2250-2459, ISO 9001:2008
Certified Journal), Volume 4, Issue 3, March 2014, p. 660.

• Z. M. Yuan, L. Zhang, G. H. Zhan, A novel web-based online examination system for computer
science education, In Proceedings of the 33rd Annual Frontiers in Education, 2013, S3F7-10.

• B. Persis Urbana Ivy, A. Shalini, A. Yamuna, Web Based Online Secured Exam, International
Journal of Engineering Research and Applications (IJERA), ISSN 2248-9622, www.ijera.com,
Vol. 2, Issue 1, Jan-Feb 2012, pp. 943-944.

• L. Zhang, et al., Development of Standard Examination System of Special Course for Remote
Education, Journal of Donghua University (English Edition), 2013, Vol. 19, No. 1, 99-102.

Websites :

• www.tutorialspoint.com

• www.w3schools.com

• www.geeks4geeks.com
