CBDPPT
1. Abstract
2. Introduction
3. Objective
4. Literature Survey
5. Existing System
6. Proposed System
7. Methodology
8. Requirement Specification
9. Conclusion
10. Reference
ABSTRACT
Social media is a platform where many young people are bullied. As social
networking sites grow, cyber bullying is increasing day by day. By identifying word
similarities in the tweets made by bullies, machine learning can be used to develop a
model that automatically detects bullying on social media. Many social media bullying
detection techniques have already been implemented, but most of them are text based
only. The objective of our project is to show an implementation of NLP and CNN that
detects bullying tweets, posts, etc. A machine learning model is proposed to detect and
prevent bullying on Twitter. Two components are used: NLP (Natural Language
Processing) for analysing the complete sentence in the comments, and CNN
(Convolutional Neural Network) for image identification. Both NLP and CNN were able
to detect true positives with higher accuracy. In addition, the Twitter API is used to
fetch tweets, which are passed to the model to detect whether they are bullying
or not.
INTRODUCTION
With the advancement in technology, the internet has largely been a safe and secure sphere of communication,
though the arena of social media has been prone to cybercrimes. Cyber bullying is characterized as the use of
online communication to bully an individual, typically by sending messages of an intimidating or threatening nature.
Around 87 percent of today's youth have witnessed some form of cyber bullying. Cyber bullying
can take different forms, such as sexual harassment, hostile environment, revenge, and retaliation. Since the
offender is hidden from the victim, the problem becomes complex. With the spread of online life
and internet access, cyber bullying has also increased, and it is difficult to detect. Thus, it is necessary
to detect cyber bullying in order to protect adolescents. The focus is on identifying textual cyber bullying.
Automatic surveillance of cyber bullying has gained considerable interest in the field of computer science.
In this research, information in the form of text is used to improve existing cyber
bullying detection performance.
A Convolutional Neural Network (CNN), popularly known as a ConvNet, is a specific type of artificial
neural network that uses perceptrons, a machine learning unit, to analyze data. CNNs are applied to
image processing, natural language processing, and other cognitive tasks.
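To make the idea of a convolutional layer concrete, the following is a minimal sketch in plain Python of the sliding-window operation that a CNN applies to its input, followed by a ReLU activation. The function names `conv1d_valid` and `relu`, the example signal, and the kernel are illustrative assumptions and are not taken from the project code.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` and return the dot product at each position.

    This is the 'valid' cross-correlation used by CNN layers (no padding, stride 1).
    """
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Rectified linear activation: negative responses are clipped to zero."""
    return [max(0.0, v) for v in values]


# Hypothetical input: a 1-D feature sequence and a kernel that responds to rising edges.
signal = [0.0, 0.0, 1.0, 1.0, 0.0]
kernel = [-1.0, 1.0]
feature_map = relu(conv1d_valid(signal, kernel))
```

In a real CNN the kernel values are learned during training rather than chosen by hand, and the same operation is applied in two dimensions for images.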
OBJECTIVE
• In the proposed system, tweets are classified as threat or non-threat tweets. This is
done using a set of keywords belonging to different categories. For each category, the
probability is computed, then the contingency is computed, and finally sorting is
performed in order to classify the tweet as belonging to a particular cyber threat category.
• The framework aids the E-crime department in identifying suspicious words in cyber messages and
tracing the suspected culprits. Existing instant messengers and social networking sites
lack the ability to capture significant suspicious patterns of threat activity from dynamic
messages and to find relationships among people, places, and things during online chat, even as
criminals have adapted to these platforms.
• We will apply different algorithms for identifying and preventing bullying text or posts in comments.
Our model will then help protect people from attacks by social media bullies.
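The keyword-based classification described in the first objective can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the keyword lists in `CATEGORY_KEYWORDS` are hypothetical, the per-category "probability" is approximated as the fraction of tweet tokens that match the category's keywords, and the contingency step is omitted because the slides do not define it.

```python
import re

# Hypothetical keyword lists; a real system would use a curated lexicon.
CATEGORY_KEYWORDS = {
    "threat": {"kill", "hurt", "destroy"},
    "insult": {"stupid", "idiot", "loser"},
}


def score_categories(tweet, category_keywords=CATEGORY_KEYWORDS):
    """Return (category, score) pairs sorted by descending score.

    The score is the fraction of the tweet's tokens found in the category's keyword set.
    """
    tokens = re.findall(r"[a-z']+", tweet.lower())
    if not tokens:
        return [(name, 0.0) for name in sorted(category_keywords)]
    scores = {
        name: sum(token in words for token in tokens) / len(tokens)
        for name, words in category_keywords.items()
    }
    # Sort by score (highest first), breaking ties alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def classify(tweet, threshold=0.0):
    """Label the tweet with its top category, or 'not a threat' if no keyword matches."""
    top_category, top_score = score_categories(tweet)[0]
    return top_category if top_score > threshold else "not a threat"
```

For example, `classify("You are a stupid loser")` returns `"insult"`, while a tweet containing none of the keywords is labelled `"not a threat"`.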
PROBLEM STATEMENT
Cyberbullying has become a pervasive and harmful issue in today's digital age, causing emotional
distress and psychological harm to individuals, especially among adolescents and young adults. To
address this problem, there is a critical need for an automated system that can effectively detect and
mitigate instances of cyberbullying on social media platforms, chat applications, and other online
communication channels.
The goal of this project is to develop a robust and accurate Cyber Bullying Detection system using deep
learning techniques, specifically Long Short-Term Memory (LSTM) and Convolutional Neural Networks
(CNN). This system should be capable of analyzing text and multimedia content (such as images and
videos) to identify and classify instances of cyberbullying, hate speech, or offensive content in real-
time.
KEY FEATURES
1. Data Collection: Gather a diverse and comprehensive dataset containing text, images, and videos from various online
sources, including social media platforms, forums, and messaging apps. This dataset should include labeled examples of
cyberbullying and non-cyberbullying content.
2. Data Preprocessing: Clean, preprocess, and annotate the dataset to ensure consistency and prepare it for training the
deep learning models. This includes text tokenization, image resizing, and video frame extraction.
3. Model Architecture: Design and implement a hybrid deep learning model that combines LSTM and CNN layers. The
LSTM layers will process textual data, while the CNN layers will handle image and video content. Fine-tune the
architecture to optimize performance.
4. Feature Extraction: Extract relevant features from text, images, and videos, which capture the linguistic, visual, and
contextual cues associated with cyberbullying.
5. Training and Validation: Train the model using the preprocessed dataset and implement cross-validation techniques to
ensure robustness and minimize overfitting.
6. Real-time Detection: Develop an interface or API that allows users to input text, images, or video content for real-time
cyberbullying detection. The system should provide immediate feedback on the likelihood of cyberbullying and the
severity of the content.
7. Model Evaluation: Evaluate the performance of the LSTM-CNN model using appropriate metrics such as precision,
recall, F1-score, and accuracy. Compare the results with existing cyberbullying detection methods.
8. User Alerts and Reporting: Implement a mechanism to notify users or moderators when cyberbullying content is
detected. Provide reporting and logging capabilities for further analysis and action.
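The cross-validation mentioned under Training and Validation can be illustrated with a small sketch that produces the index splits for k folds. The helper name `k_fold_indices` is an assumption made for this example, and the sketch does not shuffle the data, which a real pipeline would normally do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds.

    Every sample appears in exactly one validation fold; when n_samples is not
    divisible by k, the first folds are one sample larger.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


# Hypothetical usage: 5 samples split into 2 folds.
folds = list(k_fold_indices(5, 2))
```

The model would be trained once per fold on the training indices and scored on the validation indices, and the scores averaged.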
CHALLENGES
LITERATURE SURVEY
Evaluation of Machine Learning Algorithms for Anomaly Detection
Authors - Nebrase Elmrabit, Feixiang Zhou, Fengyin Li, Huiyu Zhou
Published in - 2018
In this paper, the authors comprehensively evaluated the performance of twelve ML
algorithms for the detection of anomalous behaviours that may be indicative of cyber attacks. To
recommend the best-fit algorithms, three datasets (UNSW-NB15, CICIDS-2017, and ICS
cyberattack) were applied to the selected methods. However, deep learning classification requires a
very large amount of data to train the models, which is not available in the current studies, and Naive
Bayes classification has the lowest performance in terms of accuracy, precision, recall, and AUC.
LITERATURE SURVEY
The Role of Artificial Intelligence and Cyber Security for Social Media
Published in - 2020
This paper discusses the benefits of social media and the application of machine learning
techniques to social media. For example, machine learning techniques are being used to detect the
sentiment of users, to provide information on the spread of deadly diseases, and to prevent
child trafficking. It also discusses the use of machine learning for detecting fake news and malicious
software. Next, the paper discusses security and privacy issues for social media systems, including
access control models and privacy-aware social media systems. Finally, the paper discusses the
integration of AI and cyber security for social media systems, such as adversarial machine learning and
the inference and privacy problems.
LITERATURE SURVEY
A Framework to Predict Social Crime through Twitter Tweets By Using Machine Learning
Authors - Zaheer Abbass, Zain Ali, Mubashir Ali, Bilal Akbar, Ahsan Saleem
Published in - 2020
The aim of this research study is to predict social media crimes using Twitter data. The authors use
three ML classifiers with a bag-of-words model. The study reports better results than the existing state
of the art. The proposed model currently works offline; in future work it can be extended to real-time
Twitter data streaming to predict further crimes. More crime classes can be added to make the system
more efficient and robust. The approach does not cover image detection.
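As a rough illustration of the bag-of-words model mentioned in this study, the sketch below turns each text into a vector of token counts over a shared vocabulary. The tokenizer and the function names are assumptions for this example and are not taken from the cited paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({token for doc in documents for token in tokenize(doc)})
    return {token: index for index, token in enumerate(vocab)}


def bag_of_words(text, vocabulary):
    """Return the count of each vocabulary token in `text`; unknown tokens are ignored."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector


# Hypothetical two-document corpus.
corpus = ["you are mean", "you are kind kind"]
vocabulary = build_vocabulary(corpus)
vectors = [bag_of_words(doc, vocabulary) for doc in corpus]
```

The resulting count vectors can be fed to any standard classifier; word order is discarded, which is the main limitation of this representation.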
LITERATURE SURVEY
Detecting A Twitter Cyber bullying Using Machine Learning
Authors - Rahul Ramesh Dalvi, Sudhanshu Baliram Chavan, Aparna Halbe
Published in - 2020
An approach is proposed for detecting and preventing Twitter cyber bullying using
supervised binary classification machine learning algorithms. The model is evaluated with both
Support Vector Machine (SVM) and Naive Bayes classifiers, and a TF-IDF vectorizer is used for
feature extraction. The results show that the accuracy of detecting cyber bullying content is
higher for SVM than for Naive Bayes. However, this technique still does not
identify bullying text accurately enough.
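The TF-IDF weighting used for feature extraction in this approach can be sketched as follows. This is the textbook formulation, tf(t, d) * log(N / df(t)), written in plain Python for illustration; library vectorizers typically add smoothing and normalization, so their numbers will differ. The function name `tf_idf` and whitespace tokenization are assumptions for this example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {token: weight} dict per document using tf * log(N / df).

    tf is the token's relative frequency in the document, N is the number of
    documents, and df is the number of documents that contain the token.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    document_frequency = Counter(
        token for tokens in tokenized for token in set(tokens)
    )
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            token: (count / len(tokens)) * math.log(n_docs / document_frequency[token])
            for token, count in counts.items()
        })
    return weights


# Hypothetical two-tweet corpus.
weights = tf_idf(["you are mean", "you are kind"])
```

Tokens that appear in every document, such as "you" here, receive a weight of zero, while tokens specific to one document receive a positive weight.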
EXISTING SYSTEM
There are different ways to track cyber crimes, but most papers work on
two-category classification, i.e. whether the action is a crime or not a crime,
and do not address the type of crime. There is also work that classifies mails
and social media data as spam or not spam. However, classification within the
spam category is not available. Hence, an approach is needed that can classify
the data into various categories. Once spam or any other data is classified in
this way, greater accuracy is obtained and the necessary action can be taken for
each category of users.
- The existing system used SVM and Naive Bayes, but the SVM algorithm is not
well suited to large data sets. SVM does not perform well when the data
set is noisy, i.e. when target classes overlap. In cases where the
number of features for each data point exceeds the number of training
samples, the SVM will underperform.
PROPOSED SYSTEM
In this project, a solution is proposed to detect Twitter cyberbullying. The main difference
from previous research is that we not only developed a machine learning model to detect
cyberbullying content but also applied it to real-time tweets from particular locations
using the Twitter API.
In data pre-processing, it is important to ensure that our dataset is good enough for
analysis. This is where data cleaning becomes vital. Data cleaning deals with detecting
and correcting data records, ensuring that the data is complete and accurate and that
irrelevant components of the data are deleted or modified as needed.
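A data cleaning step for tweets might look like the sketch below. The specific rules (lowercasing, removing URLs, mentions, and punctuation, and dropping empty or duplicate records) are assumptions chosen for illustration; the slides do not list the exact rules used in the project.

```python
import re


def clean_tweet(text):
    """Normalize a single tweet to lowercase words separated by single spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"@\w+", " ", text)           # drop user mentions
    text = text.replace("#", " ")               # keep hashtag words, drop the symbol
    text = re.sub(r"[^a-z0-9\s']", " ", text)   # drop remaining punctuation and symbols
    return " ".join(text.split())               # collapse whitespace


def clean_dataset(tweets):
    """Clean every tweet, then drop empty records and duplicates (order preserved)."""
    cleaned = (clean_tweet(tweet) for tweet in tweets)
    return list(dict.fromkeys(tweet for tweet in cleaned if tweet))
```

For example, `clean_tweet("@user You are SO dumb!!! http://t.co/xyz #loser")` returns `"you are so dumb loser"`.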
The feature extraction step concerns which features we select from the set of possible
features that the dataset could have. We had to make an informed decision about the
type of features to select for our machine learning model.
In the train-test split, we split the dataset into training and testing sets for creating
the model and for prediction. We then apply the algorithm to create the model for
sentiment classification.
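The train-test split can be sketched in plain Python as below. The 80/20 ratio and the fixed seed are assumptions for this example; the slides do not state the split that was used.

```python
import random


def train_test_split(samples, labels, test_fraction=0.2, seed=42):
    """Shuffle the data reproducibly and split it into training and testing parts.

    Returns (x_train, x_test, y_train, y_test) with samples and labels kept aligned.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [samples[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test
```

The model is fitted only on the training part, and the testing part is held back so that the reported accuracy reflects unseen tweets.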
PROPOSED SYSTEM
The proposed system uses an NLP technique and the CNN algorithm. A CNN depends
little on pre-processing, which reduces the human effort needed to develop
its functionality. It is easy to understand and fast to implement. Among
algorithms that make predictions on images, it is regarded as one of the most accurate.
[Figure: flowchart with the steps Input (text/image), training CNN algo, Classification
model, CNN algo applied, Classification result, Checks accuracy]
DATA COLLECTION : Data collection is the process of gathering and
measuring information on targeted variables in an established system.
PRE-PROCESSING : The preliminary processing of data. Data preprocessing
includes cleaning, instance selection, normalization, one-hot encoding,
transformation, feature extraction and selection, etc. The product of data
preprocessing is the final training set.
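Two of the preprocessing operations named above, one-hot encoding and normalization, can be sketched as follows. The function names and the choice of min-max scaling are assumptions for illustration only.

```python
def one_hot(labels):
    """Encode categorical labels as 0/1 vectors, one position per distinct class.

    Returns (classes, vectors), where classes gives the meaning of each position.
    """
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    vectors = [
        [1 if index[label] == i else 0 for i in range(len(classes))]
        for label in labels
    ]
    return classes, vectors


def min_max_normalize(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `one_hot(["bullying", "non-bullying", "bullying"])` returns the classes `["bullying", "non-bullying"]` and the vectors `[[1, 0], [0, 1], [1, 0]]`.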
FEATURE EXTRACTION : Feature Extraction aims to reduce the number of
features in a dataset by creating new features from the existing ones (and then
discarding the original features). This new reduced set of features should then be
able to summarize most of the information contained in the original set of features.
MODEL BUILDING : A machine learning model is built by learning and
generalizing from training data, then applying that acquired knowledge to new
data it has never seen before to make predictions and fulfill its purpose.
CLASSIFICATION OF BULLYING/NON-BULLYING
COMPUTE ACCURACY / PRECISION : Precision is a metric that quantifies the proportion
of positive predictions that are correct. Precision therefore measures the accuracy for
the minority class.
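Accuracy and precision, together with recall and F1-score, can be computed from the predicted and true labels as in the sketch below. It assumes binary labels with 1 meaning bullying; the function name `classification_metrics` is an assumption for this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = bullying, 0 = non-bullying.
metrics = classification_metrics([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
```

Because bullying tweets are usually the minority class, precision and recall are more informative than accuracy alone.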
ALGORITHM
Software Requirements:
• Python, PyCharm, Anaconda.
• Windows
Hardware Requirements:
• Processor – Intel core i7
• Memory – 2GB RAM
• 0.5TB Hard Disk Drive
• Mouse, Keyboard, Display device
Programming:
• The project will be implemented in Python
CONCLUSION
Internet crimes have become very dangerous because victims are continuously
hunted, and there is little possibility of escape. Cyber bullying is one of the most critical
internet crimes, and research has demonstrated its critical impact on the victims.
The system uses an accurate CNN implementation using Keras and helps in
achieving precise results. This can help users by preventing them from becoming
victims of the harsh consequences of cyber bullying.
Hence, compared to the existing model, our technique is expected to identify
cyber bullying more accurately.
REFERENCE
• Elaheh Raisi, Bert Huang, "Cyber bullying Identification Using Participant-Vocabulary
Consistency", Virginia Tech, Blacksburg, VA, 2016
• Nebrase Elmrabit, Feixiang Zhou, Fengyin Li, Huiyu Zhou, "Evaluation of Machine Learning
Algorithms for Anomaly Detection", 2018
• Bhavani Thuraisingham, "The Role of Artificial Intelligence and Cyber Security for Social
Media", Computer Science Dept., The University of Texas at Dallas, Richardson, USA,
bxt043000@utdallas.edu, 2020
• Zaheer Abbass, Zain Ali, Mubashir Ali, Bilal Akbar, Ahsan Saleem, "A Framework to Predict
Social Crime through Twitter Tweets By Using Machine Learning", Department of Computer Science,
University of Lahore, Gujrat Campus, Pakistan, 2020
• Rahul Ramesh Dalvi, Sudhanshu Baliram Chavan, Aparna Halbe, "Detecting A Twitter Cyber
bullying Using Machine Learning", Department of Information Technology, Sardar Patel Institute of
Technology, Mumbai, India, 2020
THANK YOU