Professional Documents
Culture Documents
Spam Profile Detection On Instagram Using Machine Learning Algorithms On WEKA and RapidMiner
Spam Profile Detection On Instagram Using Machine Learning Algorithms On WEKA and RapidMiner
Spam Profile Detection On Instagram Using Machine Learning Algorithms On WEKA and RapidMiner
ISSN No:-2456-2165
Abstract:- With every passing second social network sharing app is ranked 6th with over 1 billion active monthly
community is growing rapidly, because of that, attackers accounts. Linked Inn is the most famous social media
have shown keen interest in these kinds of platforms and professional network with over 300 million active monthly
want to distribute mischievous contents on these users. Key elements that is the base for the components being
platforms. With the focus on introducing new set of shared on these social media are its users (Ahmed & Abulaish,
characteristics and features for counteractive measures, a 2012).
great deal of studies has researched the possibility of
lessening the malicious activities on social media networks. Online social platforms have emerged as easily
This research was to highlight features for identifying accessible, cheap, and effective social media, that facilitates
spammers on Instagram and additional features were worldwide users for information sharing and communication.
presented to improve the performance of different Although the basic purpose of social networking sites is online
machine learning algorithms. Performance of different communication and interaction, but the pattern of usage and
machine learning algorithms namely, Multilayer specific goals are different on different services. In the recent
Perceptron, Random Forest, K-Nearest Neighbor and years, social media platforms like Facebook, Instagram and
Support Vector Machine were evaluated on machine Twitter have become worldwide sensation and one of the
learning tools named, RapidMiner and WEKA. The result quickest emerging e-services, as stated by (Ellison, 2008).
from this research tells us that Random Forest
outperformed all other selected machine learning Users are usually recognized by a profile at these social
algorithms on both selected machine learning tools. media sites. It generally comprises of name and picture,
Overall, Random Forest provided best results on birthday, possibly an address and other personal information.
RapidMiner. These results are useful for the researchers However, these social media sites do not strictly identify that
who are keen to build machine learning models to find out the one creating the profile and using it, is actually the same
the spamming activities on social network communities. person as stated in the profile. Someone else is using
somebody else’s identity, if that is not the case, this is known
Keywords:- Malicious Activities, Spammers, Machine as false identity. One can easily create profile with fabricated
Learning Algorithms, Multilayer Perceptron, Random Forest, names and other information that is not associated with any
K-Nearest Neighbor, Support Vector Machine, Rapidminer, person living in any part of the world. In these kinds of cases
WEKA, Social Network Communities. the identity is known as faked identity. Pictures in these kind
of profile can still be of real persons that can be easily taken
I. INTRODUCTION from the internet (Romanov et al., 2017).
Online platform which is used by people to build social 5% to 6% of the registered accounts on the Facebook are
relationships or social networks with other people is known as fake accounts, as stated by Facebook. It is clearly stated in the
social networking service (or also known as social media, or Terms and Conditions of Facebook that the users must not
social media networking or SNS). These social networks or provide false information and their information must be kept
social relationships with other people are based on similar up to date. False and wrong information puts in danger the
activities, real life connections or backgrounds, personal supportability of Facebook’s business model. This clearly
interests, career interests. These social networks are spread shows that for Facebook’s business model it is important that
across numerous computer networks. These social networks user data should be correct and accurate (Krombholz et al.,
are basically computer networks that links knowledge, 2012).
organizations and most importantly people. Social life has
been drastically changed by social networks in the recent past There are number of mischievous activities executed by
and web is change into social web where users and their the spam profiles, which consists of phishing, following great
networks are the points of online development, commerce, number of users with little followers, overflowing the social
growth, and data sharing. Facebook is currently at the top of media platforms with fake profiles, random link connection,
the chain with being the first social media platform that has 1 spreading malware and endanger existing valid accounts etc.
billion registered accounts and over 2 billion active monthly In spite of the benefits got from the social relationships, user’s
users. To spread contents, not only ordinary people but also profile has turned out to be one of the focused resource by the
the politician’s, public figures, people of interest and spammer’s, who among user’s, influence the trust relationship
celebrities use social media platforms. Instagram, the photo to acquire more unfortunate victims (Hanif et al., 2018).
(Washha et al., 2016) utilized Random Forest algorithm (Sohrabi & Karimi, 2018) stated that as the trend of social
to detect the spammers. They proposed new time-based media is increasing day by day, the systems have become a
features and advanced the design of some existing features that highlighted instrument for spammers by spreading spam.
M2 = IF(AND(Num Name=1,Verified=0),”1”,”0”)