CSE Dept. PPT 176 173

Project
Automated Violence Detection in Video Streams
160120733173 – Sushanth Reddy
160120733176 – Uday Kumar Reddy
Dr. T Sridevi
16/10/2023 1
Abstract
A variety of methods have been tried to curb violent activities, including the
installation of surveillance systems. It would be of great significance if surveillance
systems could automatically detect violent activities and raise a warning or alert
signal. The whole system can be implemented as a sequence of procedures.
First, the system has to identify the presence of human beings in a video frame.
Then, the frames that are predicted to contain violent activity have to be
extracted; irrelevant frames are dropped at this stage. Finally, the trained
model detects violent behavior, and these frames are saved separately as images.
These images are enhanced to detect the faces of the people involved in the activity, if
possible. The enhanced images, along with other necessary details such as time and
location, are sent as an alert to the concerned authority.
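
The sequence of procedures above can be sketched as a small pipeline. This is a minimal illustration, not the project's implementation: the `Alert` record, the `run_pipeline` function, and the three callables (`has_person`, `is_violent`, `enhance`) are hypothetical names standing in for the person detector, the trained violence model, and the image enhancement step.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Iterable, List

Frame = Any  # placeholder type; a real system would use an image array


@dataclass
class Alert:
    image: Frame          # enhanced frame to attach to the alert
    timestamp: datetime   # time at which the frame was flagged
    location: str         # label of the camera location


def run_pipeline(
    frames: Iterable[Frame],
    has_person: Callable[[Frame], bool],
    is_violent: Callable[[Frame], bool],
    enhance: Callable[[Frame], Frame],
    location: str,
) -> List[Alert]:
    """Filter frames in the order described in the abstract."""
    alerts: List[Alert] = []
    for frame in frames:
        # Step 1: skip frames with no human beings in them.
        if not has_person(frame):
            continue
        # Step 2: drop frames the model does not flag as violent.
        if not is_violent(frame):
            continue
        # Step 3: enhance the flagged frame and record time and location.
        alerts.append(Alert(enhance(frame), datetime.now(), location))
    return alerts
```

Passing the detectors in as callables keeps the control flow testable with simple stand-ins before any trained model is available.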
Introduction
- Violence in public spaces, private premises, and online platforms poses significant
threats to individuals and communities.
- Timely intervention is essential to prevent harm and ensure the safety of people
and property.
- Traditional violence detection methods lack speed and accuracy.
- Real-time deep learning approaches train deep neural networks on diverse datasets
covering both violent and non-violent behaviors.
- These models offer adaptability and continuous learning, improving their recognition
of various forms of violence.

Introduction
- These networks can then autonomously analyze live video streams or images,
rapidly identifying potential instances of violence with high precision.
- The ability to perform such tasks in real time makes this technology invaluable
for a wide range of applications, including but not limited to security systems,
law enforcement, social media content moderation, and public safety.
- Detecting violent activity is not a simple task: it faces the problems of anomaly
detection in general, along with the difficulty of processing these videos.

Literature Survey
1. Violence Detection in Real Life Videos using Deep Learning
• Year of Publication: 2023
• Observations: Presents a violence detection method that uses a network similar to
U-Net with a MobileNetv2 encoder to extract spatial features, followed by an LSTM
block for temporal feature extraction and binary classification.
• Accuracy: 94%
• Drawbacks: The fusion of U-Net and ResNet can result in a more complex and
computationally intensive model, which may require significant computational
resources and longer training times.

2. Real-Time Violence Detection Using CNN-LSTM
• Year of Publication: 2022
• Observations: Uses a CNN + LSTM approach. Different CNN models are also tried and
tested to find the one that gives the highest accuracy, and information is extracted
from the audio of the video to draw inferences from it.
• Accuracy: 90%
• Drawbacks: While the combination of CNN, LSTM, and ResNet may seem advantageous,
in some cases such complex architectures may not significantly improve performance
compared to simpler models. This over-engineering could lead to unnecessary
complexity.

Literature Survey
3. An Effective Approach for Violence Detection using Deep Learning and Natural
Language Processing
• Year of Publication: 2023
• Observations: Proposes strategies that combine deep learning and Natural Language
Processing (NLP) to simultaneously detect anomalous objects and scenarios in videos,
using TensorFlow, and aggressive, offensive, and hate speech in the audio channel of
surveillance cameras.
• Accuracy: 84%
• Drawbacks: Accuracy could be improved either by enlarging the dataset or by merging
the three categories into one (physical abnormal activity), in which case the accuracy
would increase drastically. The system could achieve much better accuracy with a
larger dataset for both object detection and text classification.

4. Violence Detection using Deep Learning Techniques
• Year of Publication: 2022
• Observations: Discusses the violence detection problem and explores LSTM- and
BiLSTM-based solutions to it. In addition, the model includes an attention layer and
uses a new dataset collected from surveillance cameras and ordinary recorded videos
available on YouTube and Facebook.
• Accuracy: 95%
• Drawbacks: Requires continued evaluation on more standard datasets, including the
identification of violent activities that involve weapons, which are hard to detect.

Literature Survey
5. Real-time Violence Detection using Deep Learning Techniques
• Year of Publication: 2022
• Observations: The system is premised on a hybrid approach that employs different
algorithms to assess each distinct aspect of the problem in a viable and effective
manner. It relies on YOLO for real-time object detection and Long Short-Term Memory
(LSTM) for the classification module.
• Accuracy: 91%
• Drawbacks: High computing power requirement. The DeepSORT algorithm, along with the
LSTM architecture used in the system, requires significant computing power, which may
limit its deployment on low-end devices or in resource-constrained environments.

6. Violence Detection and Recognition from Diverse Video Sources
• Year of Publication: 2022
• Observations: The detection model relies on 3D Convolutional Neural Networks. The
classification model uses the pre-trained Inception-v3 model for feature extraction,
followed by Gated Recurrent Units (GRUs) for temporal processing.
• Accuracy: 92%
• Drawbacks: Performance variation. The experimental results show some variation in
performance, particularly when the UCF-Crime model is used as a pre-trained model for
other datasets. This may indicate a slight bias of the UCF-Crime model towards its own
categories, resulting in lower performance when it is used as a pre-trained model.

Literature Survey
7. Violence Detection Based on Multisource Deep CNN with Handcraft Features
• Year of Publication: 2022
• Observations: The features relate to the representation of the image, the appearance
of the image, and motion speed. They are fed as input to a CNN, which transforms them
into spatial, temporal, and feature streams. A network is trained on the spatial
stream to recognize contextual patterns in each frame of the video.
• Accuracy: 90%
• Drawbacks: The integration of deep CNNs and handcrafted features can result in a
complex model, which may be challenging to design, train, and optimize. This
complexity can increase the risk of overfitting.

8. Human Violence Detection Using LHOGF Algorithm and Deep Learning Model
• Year of Publication: 2022
• Observations: The project applies deep learning models to human violence estimation.
The deep learning model is trained on Local Histogram of Oriented Gradient features.
Deep learning learns multiple layers of representation that correspond to multiple
levels of concepts; the higher the level, the more abstract the ideas that are learned.
• Accuracy: 93%
• Drawbacks: LHOGF may not capture the subtle and complex features of violent actions
as effectively as deep learning methods, particularly in cases where actions involve
intricate details or nuanced motion patterns.

Existing Systems
1. VGG (Visual Geometry Group): VGGNet is known for its simplicity
and effectiveness. It has various versions (e.g., VGG16, VGG19) and
can be used as a feature extractor in real-time violence detection systems.
2. ResNet (Residual Networks): ResNet is designed to address the
vanishing gradient problem in deep networks. Its skip connections make
it well-suited for training very deep networks, and it has been used in
action recognition for real-time violence detection tasks.
3. Inception (GoogLeNet): The Inception architecture, particularly
InceptionV3 and Inception ResNet, offers a good trade-off between
accuracy and computational efficiency. It can be used as a backbone
network for feature extraction.
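
The skip connection mentioned for ResNet can be written compactly. As a sketch in standard notation (the symbols $x$, $y$, $F$, and $W_i$ are introduced here and do not appear in the slides), a residual block adds its input $x$ to the output of its stacked layers $F$ with weights $W_i$:

```latex
y = F(x, \{W_i\}) + x,
\qquad
\frac{\partial y}{\partial x} = \frac{\partial F}{\partial x} + I
```

The identity term $I$ gives the gradient a direct path through the block, which is why very deep stacks of such blocks remain trainable.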

Gaps and Challenges
1. Ambiguity in Context: Determining the context in which an action
occurs is crucial. Some actions that may appear violent could be part
of legitimate activities or self-defense. Contextual understanding is
often a gap in automated systems.
2. Data Imbalance: Violence is a relatively rare event compared to
non-violent actions, leading to class imbalance issues in training
datasets. Models can be biased towards the majority class and may
not perform well on the minority class.
3. Generalization Across Cultures: Cultural norms and behaviors can
vary significantly, making it challenging to develop universal
violence detection models that work effectively in all regions and
contexts.
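
One common mitigation for the data imbalance described in item 2 is to weight the loss for each class inversely to its frequency. The sketch below is an illustration rather than part of the proposed system; the function name `inverse_frequency_weights` and the label strings are hypothetical. It uses the usual "balanced" heuristic, weight = n_samples / (n_classes * class_count).

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def inverse_frequency_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Return one weight per class, larger for rarer classes."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Balanced heuristic: n_samples / (n_classes * count of that class).
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Example: 90 non-violent clips and 10 violent clips.
labels = ["non_violent"] * 90 + ["violent"] * 10
weights = inverse_frequency_weights(labels)
```

With this split, the violent class receives a weight of 5.0 and the non-violent class about 0.56, so errors on the rare class count roughly nine times as much during training.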
Proposed System
• The proposed method is a deep learning-based automatic detection
approach that uses a Convolutional Neural Network (CNN) to detect violence
in a video. The disadvantage of using a CNN alone, however, is that it
requires a lot of computation time and is less accurate.
• Hence, a pre-trained model, MobileNetv2, is used; it provides higher
accuracy and acts as a starting point for building the entire model.
An alert message is sent to the concerned authorities using the Telegram
application.
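
The Telegram alert step could look roughly like the sketch below, which builds a request for the Telegram Bot API `sendMessage` method using only the standard library. It assumes a bot token and chat ID obtained separately; the function names `build_alert_request` and `send_alert` are hypothetical, and only a text message is shown (attaching the saved frame would need the `sendPhoto` method with a multipart upload, not covered here).

```python
import json
import urllib.parse
import urllib.request


def build_alert_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    # Form-encoded body with the two required sendMessage fields.
    body = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def send_alert(token: str, chat_id: str, text: str, timeout: float = 10.0) -> dict:
    """Send the alert and return the decoded JSON reply from Telegram."""
    request = build_alert_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending lets the message format be checked without network access or a real token.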

Proposed System
Framework of our proposed system

Advantages
• Training on Limited Data: MobileNetv2 can often achieve reasonable
performance even with limited training data, which can be beneficial in
scenarios where large annotated datasets are scarce.
• Efficiency: MobileNetv2 is specifically designed for resource-constrained
devices, making it highly efficient in terms of both computational and
memory requirements. It can run smoothly on mobile and edge devices,
which is essential for real-time surveillance and monitoring applications.
• Transfer Learning: MobileNetv2 models can be fine-tuned or transferred
to adapt to specific surveillance environments, which is useful for
customizing the model's performance.
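
The efficiency point can be made concrete by counting weights. MobileNetv2 builds on depthwise separable convolutions, which replace one standard convolution with a per-channel (depthwise) filter followed by a 1x1 (pointwise) convolution. The sketch below is a simplified illustration: it ignores biases and the expansion layers of the actual MobileNetv2 blocks, and the function names and layer sizes are chosen only for the example.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (no bias)."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Example layer: 3x3 kernel, 32 input channels, 64 output channels.
standard = standard_conv_params(3, 32, 64)          # 18432 weights
separable = depthwise_separable_params(3, 32, 64)   # 2336 weights
```

For this example layer the separable form needs roughly one eighth of the weights, which is the kind of saving that makes deployment on mobile and edge devices practical.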

Conclusion
• In conclusion, the implementation of real-time violence detection
systems holds great promise in enhancing public safety and security
by swiftly identifying and responding to potential threats. Continued
research and development in this field are essential to improve the
accuracy and effectiveness of such systems, ultimately contributing to
safer communities and environments.

References
[1] B. Jain, A. Paul and P. Supraja, "Violence Detection in Real Life Videos
using Deep Learning," 2023 Third International Conference on Advances in
Electrical, Computing, Communication and Sustainable Technologies
(ICAECT).
[2] M. Patel, "Real-Time Violence Detection Using CNN-LSTM," 2021.
[3] V. Kumari, K. Memon, B. Aslam and B. S. Chowdhry, "An Effective
Approach for Violence Detection using Deep Learning and Natural Language
Processing," 2023 7th International Multi-Topic ICT Conference (IMTIC).
[4] H. Gupta and S. T. Ali, "Violence Detection using Deep Learning
Techniques," 2022 International Conference on Emerging Techniques in
Computational Intelligence (ICETCI).
[5] e. Fatima Kiani and T. Kayani, "Real-time Violence Detection using
Deep Learning Techniques," 2022 3rd International Conference on
Innovations in Computer Science & Software Engineering (ICONICS).
[6] M. Gadelkarim, M. Khodier and W. Gomaa, "Violence Detection and
Recognition from Diverse Video Sources," 2022 International Joint
Conference on Neural Networks (IJCNN).
[7] N. Appavu and N. C, "Violence Detection Based on Multisource Deep
CNN with Handcraft Features," 2023 IEEE International Conference on
Advanced Systems and Emergent Technologies (IC_ASET).
[8] A. Chauhan and R. Gupta, "Human Violence Detection Using LHOGF
Algorithm and Deep Learning Model," 2022 4th International Conference
on Advances in Computing, Communication Control and Networking
(ICAC3N).

Thank you
