
© 2024 JETIR May 2024, Volume 11, Issue 5 www.jetir.org (ISSN-2349-5162)

IMAGE FORGERY DETECTION USING CNN

V. Prudhvi, Professor, Department of Software Engineering
B. Akshaya, B. Tech Student, Software Engineering, Siddhartha Institute of Technology and Sciences
K. Manaswini, B. Tech Student, Software Engineering, Siddhartha Institute of Technology and Sciences
Ch. Chandana, B. Tech Student, Software Engineering, Siddhartha Institute of Technology and Sciences
P. Manikanta, B. Tech Student, Software Engineering, Siddhartha Institute of Technology and Sciences
K. Swetha, B. Tech Student, Software Engineering, Siddhartha Institute of Technology and Sciences

Abstract: With the increasing use of digital images in various applications, the problem of image forgery has become more prevalent
than ever. In this paper, we propose a novel image forgery detection system based on Convolutional Neural Networks (CNNs) that
can detect various types of image manipulations, including copy-move, splicing, and retouching. Our proposed system integrates
Error Level Analysis (ELA) with deep learning techniques to provide a more accurate and reliable solution to the problem of image
forgery detection. We evaluated the proposed system on a dataset of real-world images and achieved a high detection accuracy of
93%. Our system outperformed existing methods for image forgery detection and demonstrated its potential for various applications,
including forensics, security, and digital image analysis. Overall, the proposed CNN-based image forgery detection system offers a
robust and effective solution to the growing problem of image manipulation and forgery in today's visual media landscape.

Keywords: Image Forgery, Error Level Analysis, Convolutional Neural Networks, Deep Learning, Forgery Detection Of Image

1. INTRODUCTION
Image forgery is the process of manipulating a digital image to hide valuable or essential content or to force the viewer to believe
an idea. It has been defined as the process of manipulating an original digital image to either conceal its original identity or create
an entirely different image than what was originally intended by the user of the digital platform. Forged images can cause
disappointment and emotional distress and affect public sentiment and behavior. Images can transmit much more information than
text. People tend to believe what they can see, and this affects their judgment, which leads to a series of unwanted responses. Because
fabrications have become widespread, the urgency to detect forgeries has significantly increased. The copy-move approach is one of
the most widely used forgery techniques. It copies a part of the image and pastes it onto another part of the image. The technique
itself is not harmful, but it can lead to critical situations if someone uses it with malicious intent. Convolutional Neural Networks
(CNNs) have emerged as powerful tools in various image processing tasks, owing to their ability to automatically learn hierarchical
features from raw pixel data. Leveraging the deep learning capabilities of CNNs, researchers have achieved significant advancements
in the field of image forgery detection. By training CNN models on large datasets of authentic and forged images, these models can
learn to discern subtle inconsistencies or artifacts introduced during image manipulation.

1.1 MOTIVATION
In an era where digital imagery permeates nearly every facet of our lives, ensuring the integrity and authenticity of visual content
has become an increasingly daunting challenge. Image forgeries not only undermine the credibility of information but also have
far-reaching consequences in fields such as journalism, law enforcement, and digital forensics. Traditional methods of forgery
detection, often reliant on handcrafted features and heuristic algorithms, are struggling to keep pace with the ever-evolving
techniques employed by forgers. As such, there is a pressing need for advanced and adaptive solutions that can effectively detect
and mitigate the proliferation of manipulated imagery. Convolutional Neural Networks (CNNs) have emerged as a beacon of
hope in this landscape of digital deception. Through this project, we aim to contribute to the ongoing efforts to safeguard the
integrity of visual information in the digital age. By exploring the potential of CNN architectures tailored for forgery detection
and leveraging large-scale datasets, we aspire to empower forensic analysts, journalists, and law enforcement agencies with
reliable tools for preserving the authenticity and trustworthiness of digital imagery.

JETIR2405A48 Journal of Emerging Technologies and Innovative Research (JETIR) www.jetir.org k385

1.2 PROBLEM STATEMENT


The proliferation of digital imagery in various domains has brought forth a pressing challenge: the detection of image forgeries.
Image forgery encompasses a wide range of manipulations, including but not limited to, copy-move, splicing, and retouching,
aimed at deceiving viewers or altering the truth portrayed by an image. These forgeries not only erode the credibility of visual
information but also have serious implications in fields such as journalism, law enforcement, and digital forensics. The problem
entails developing a robust Convolutional Neural Network (CNN) model capable of accurately detecting various types of image
forgeries, such as copy-move, splicing, and retouching. This involves addressing challenges such as the need for large-scale
labeled datasets, designing efficient architectures, and ensuring robustness to diverse forgery techniques and image variations.
The goal is to provide a reliable automated solution to safeguard the integrity and authenticity of digital imagery in all domains.

1.3 SCOPE
The scope of image forgery detection using Convolutional Neural Networks (CNNs) encompasses a wide array of applications
and challenges within the domain of digital forensics and image analysis. Firstly, the scope involves the detection of various
types of image manipulations, including but not limited to copy-move, splicing, and retouching, across different domains such
as journalism, social media, and legal evidence. CNNs offer a promising approach to address these challenges by automatically
learning discriminative features from raw pixel data, enabling the detection of subtle inconsistencies and artifacts introduced
during manipulation. Secondly, the scope extends to the development of CNN architectures tailored specifically for forgery
detection, which strike a balance between detection accuracy, computational efficiency, and scalability. These architectures may
include variations such as Siamese networks for pairwise image comparison, multi-scale feature extraction for detecting forgery
at different resolutions, and attention mechanisms for focusing on relevant regions of interest. Thirdly, the scope encompasses
the exploration of advanced training techniques and augmentation strategies to enhance the robustness and generalization
capabilities of CNN models across diverse forgery techniques and image variations. Techniques such as transfer learning, data
augmentation, and adversarial training may be employed to mitigate overfitting and improve model performance on unseen data.

1.4 OBJECTIVES
The primary objective of this project is to develop a state-of-the-art Convolutional Neural Network (CNN) model for accurate
and robust detection of image forgeries. This involves several key sub-objectives. Firstly, we aim to curate and preprocess large-
scale datasets of authentic and forged images, encompassing a wide range of forgery types and variations, to facilitate the training
and evaluation of our CNN model. Secondly, we strive to design and implement novel CNN architectures tailored specifically
for the task of forgery detection, balancing model complexity with computational efficiency to ensure practical deployment in
real-world scenarios. Thirdly, we seek to train and fine-tune our CNN models using advanced optimization techniques and
augmentation strategies, leveraging the power of deep learning to learn discriminative features for detecting subtle
inconsistencies and artifacts introduced during image manipulation.

2. LITERATURE REVIEW
This section surveys the prior work on image forgery detection and describes the existing systems that inspired this project and
shaped its problem statement.

2.1 LITERATURE SURVEY


1) Syed Sadaf Ali et al.: They proposed image forgery detection using recompressed images. The techniques used are
adapted to the individual needs, interests, and preferences of the user or society. Image compression involves reducing
the pixels, size, or colour components of an image in order to reduce the file size, and this is used for forgery detection.

2) J. Malathi et al.: Image Forgery Detection Using Support Vector Machine, developed by J. Malathi et al. SVM is a
supervised classification algorithm that differentiates between two categories by drawing a separating boundary
between them. However, the authors of this paper note that the technique has some drawbacks.

3) F. Marra et al.: Full-Image Full-Resolution End-to-End-Trainable CNN Framework for Image Forgery Detection,
carried out by F. Marra et al. It proposes a framework for detecting image forgery using CNNs. The framework includes
a feature extraction module and a classification module, both using CNNs.

4) S.B.G.T. Babu et al.: Statistical Features based Optimized Technique for Copy Move Forgery Detection, carried out by
S.B.G.T. Babu et al. It suggests a novel method for identifying copy-move forgeries in digital photos.

5) M.H. Alkawaz et al.: Digital Image Forgery Detection based on the Expectation Maximization Algorithm, carried out by
M.H. Alkawaz et al. It proposes a new approach for detecting digital image forgeries using an expectation-maximization
algorithm.


2.2 EXISTING SYSTEM

Existing systems for image forgery detection, apart from Convolutional Neural Networks (CNNs), encompass a variety of techniques
and methodologies tailored to detect different types of image manipulations with high accuracy and reliability. These systems often
employ traditional machine learning algorithms, such as Support Vector Machines (SVM), Random Forests, and Decision Trees,
along with handcrafted features and heuristics, to identify inconsistencies and artifacts indicative of image forgery. Feature-based
methods, such as Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF), extract distinctive keypoints and
descriptors from images, enabling the detection of forged regions through keypoint matching and clustering. Statistical analysis
techniques, including Noise Level Estimation (NLE) and Moment Invariants, exploit statistical properties and mathematical
characteristics of images to detect anomalies introduced during manipulation. Furthermore, model-based approaches, such as Error
Level Analysis (ELA) and Principal Component Analysis (PCA), analyze discrepancies in compression artifacts and principal
components to identify tampered regions in images. These existing systems often rely on handcrafted features and predefined
thresholds, which may limit their robustness and scalability in handling diverse forgery techniques and variations. Moreover, these
systems require extensive parameter tuning and domain expertise, making them less adaptable to evolving forgery methods and
scenarios. Despite these limitations, existing systems for image forgery detection other than CNNs have demonstrated effectiveness
in specific contexts and applications, particularly in scenarios where computational resources are limited or labeled data is scarce.
However, ongoing research and innovation are necessary to overcome the inherent challenges and limitations of traditional
approaches and to develop more robust and scalable solutions capable of addressing the complexities of modern image forgery
techniques.

2.2.1 Drawbacks:
• Reduced accuracy
• Updates and maintenance
• Complexity
• False positives/negatives

2.3 PROPOSED SYSTEM

CNN-based systems for image forgery detection, such as the one proposed here, represent a significant advancement in
the field, leveraging the power of deep learning to achieve robust and accurate detection of various types of image manipulations.
These systems typically consist of several key components, including data preprocessing, model architecture design, training, and
evaluation. Data preprocessing involves curating large-scale datasets containing authentic and forged images, encompassing diverse
forgery techniques and variations, to serve as training and evaluation data for CNN models. Model architecture design plays a crucial
role in the effectiveness of forgery detection, with researchers proposing innovative CNN architectures tailored specifically for this
task. These architectures often incorporate hierarchical layers, multi-scale feature extraction, and attention mechanisms to capture
subtle inconsistencies and artifacts indicative of image manipulation. Training CNN models involves optimizing model parameters
using advanced optimization techniques and augmentation strategies to enhance robustness and generalization capabilities. Additionally,
existing systems rigorously evaluate the performance of CNN models using standard metrics such as accuracy, precision, recall, and
F1-score, as well as conducting extensive experimentation to assess their performance under different scenarios and conditions.
Benchmark datasets, such as those hosted on Kaggle, have been instrumental in facilitating comparative evaluations and benchmarking of
different CNN architectures and techniques. Moreover, existing systems explore practical applications and real-world deployments
of CNN-based forgery detection systems, highlighting their potential to assist forensic analysts, journalists, and law enforcement
agencies in preserving the integrity and authenticity of digital imagery. Ongoing research efforts are necessary to further advance
the state of the art in CNN-based forgery detection.
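
The evaluation metrics named above (accuracy, precision, recall, and F1-score) can be sketched as follows. This is a minimal illustration, not the evaluation code used in this work; the function name classification_metrics and the convention that label 1 means "forged" are assumptions made for the example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption for this sketch: `positive` (1) marks a forged image.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Reporting precision and recall alongside accuracy matters when authentic and forged images are not equally represented, since accuracy alone can hide a high false-positive or false-negative rate.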

2.3.1 Advantages:
• High accuracy
• Real-time detection

3. REQUIREMENT ANALYSIS AND PLANNING


Requirements analysis encompasses those tasks that go into determining the needs or conditions to meet for a new or altered
product or project, taking account of the possibly conflicting requirements of the various stakeholders, analyzing, documenting,
validating and managing software or system requirements. Project planning relates to the use of schedules such as
Gantt charts to plan and subsequently report progress within the project environment. Initially, the project scope is defined and the
appropriate methods for completing the project are determined. It is a process of collecting and interpreting facts, identifying the
problems, and decomposition of a system into its components. System analysis is conducted for the purpose of studying a system
or its parts in order to identify its objectives. It is a problem solving technique that improves the system and ensures that all the
components of the system work efficiently to accomplish their purpose. Analysis specifies what the system should do. Requirement
Analysis will cover the topics like the Functional, Non-Functional and the specific requirements of the project and touching all the
software and the hardware requirements as well.


3.1 FUNCTIONAL REQUIREMENTS


Functional requirements are a set of specifications that define what a software system or product should do, its features, functions,
and capabilities. These requirements outline the intended behaviour of the system or product and describe how it should interact
with users and other systems.
Functional requirements are the following:
• The system should be able to receive and store datasets with relevant features.
• The system should be able to train various deep learning algorithms on the preprocessed images.
• The system should be able to select the best-performing model based on the evaluation results.

3.2 NON-FUNCTIONAL REQUIREMENTS


Non-functional requirements are a set of specifications that define how a software system or product should behave, perform, or
operate. Unlike functional requirements, nonfunctional requirements do not describe the specific functions or features of the
system, but rather its qualities and characteristics.
Non-Functional requirements are the following:
• The system should be fast and accurate in its predictions.
• The system should be able to handle large amounts of data.
• The system should be secure to protect user data and ensure user privacy.
• The system should be easy to use and have a user-friendly interface for both technical and non-technical users.
• The system should be accessible on multiple platforms and devices.
• The system should be maintained and supported to keep up to date with changes in machine learning and deep learning algorithms.

3.3 SOFTWARE REQUIREMENTS

• Operating system : Windows
• Programming language : Python 3.10
• Frontend : HTML, CSS, JavaScript
• Web framework : Flask

3.4 HARDWARE REQUIREMENTS

• Hard disk : 512 GB
• RAM : 8 GB
• System : i3 processor

4. ARCHITECTURE

Designing a robust system for image forgery detection using Convolutional Neural Networks (CNNs) entails meticulous planning
and consideration of various components and methodologies to ensure effectiveness and efficiency. The system design encompasses
several key stages, beginning with data preprocessing, where extensive datasets containing authentic and manipulated images are
curated and prepared for training and evaluation.
Subsequently, the focus shifts towards the design of CNN architectures tailored specifically for forgery detection. This includes the
selection of appropriate network architectures, layer configurations, and optimization algorithms to maximize detection accuracy
while minimizing computational complexity. Hierarchical networks, attention mechanisms, and multi-scale feature extraction
techniques are often integrated into the design to capture subtle inconsistencies and artifacts indicative of image manipulation.
The proposed system architecture for image forgery detection consists of several steps, starting with dataset preparation. The open
image dataset's annotations are converted into a format accessible by the model during the training process. The testing process
involves converting the image into an ELA image, calculating the signal-to-noise ratio, denoising the image, and converting
it to a black-and-white format.
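
The later steps of the testing process described above (ratio calculation, denoising, and black-and-white conversion) might look roughly like the sketch below. The paper does not state which ratio estimate, denoising filter, or threshold is used, so the mean-over-standard-deviation estimate, the median filter, the threshold of 128, and the function names estimate_snr and denoise_and_binarize are all assumptions chosen for illustration.

```python
import numpy as np
from PIL import Image, ImageFilter


def estimate_snr(gray):
    """Crude signal-to-noise estimate: mean intensity over standard deviation.

    Assumption: the paper does not define its ratio; this is one common proxy.
    """
    pixels = np.asarray(gray, dtype=np.float64)
    std = pixels.std()
    return float("inf") if std == 0 else float(pixels.mean() / std)


def denoise_and_binarize(image, threshold=128, filter_size=3):
    """Return (black-and-white image, SNR estimate) for a PIL image."""
    gray = image.convert("L")                      # single-channel grayscale
    snr = estimate_snr(gray)
    # Median filtering is an assumed choice of denoiser for this sketch.
    denoised = gray.filter(ImageFilter.MedianFilter(size=filter_size))
    # Map every pixel to pure black (0) or pure white (255).
    binary = denoised.point(lambda v: 255 if v >= threshold else 0)
    return binary, snr
```

In the pipeline described here, the input to denoise_and_binarize would be the ELA image rather than the raw photograph.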


Figure-4 : Architecture diagram

5. MODULES

• Image Dataset : The Kaggle dataset allows our system to detect forgery with more accurate results. Using the
Kaggle dataset, the system automatically predicts which image is authentic and which is forged. The system accepts images
as input, and each image must be supplied in a supported format to be processed.

• Importing the dependencies : Importing dependencies for image forgery detection using Convolutional Neural Networks
(CNNs) involves including the necessary libraries and modules in the project environment to facilitate data manipulation, model
construction, training, and evaluation.

• Data collection : The data was collected from Kaggle, a widely used source of datasets for learning purposes. The
Kaggle data consists of two datasets, one for training and one for testing.
The training dataset is further divided into two parts, in a ratio such as 80:20 or 70:30. The larger part is used to train
the model and the smaller part is used to test it, from which the accuracy of the developed model is calculated.
In this work, the training portion is 80% of the data and the test portion is 20%.
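
The 80:20 split described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the function name split_dataset, the fixed random seed, and the use of file names or indices as items are not taken from the paper.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=42):
    """Shuffle items and split them into training and test portions."""
    shuffled = list(items)
    # A fixed seed (an assumption here) makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Shuffling before cutting avoids a split in which all forged images, or all images from one source, land on the same side.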

• Data preprocessing : This function preprocesses individual images before feeding them into the CNN model. Preprocessing
steps may include resizing images to a uniform size, normalizing pixel values, and applying data augmentation techniques to
increase the diversity of the training dataset.
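
A minimal sketch of these preprocessing steps is given below, assuming Pillow and NumPy are available. The target size of 128 x 128, the [0, 1] scaling, and the horizontal flip as the augmentation are illustrative assumptions; the paper does not specify its exact settings.

```python
import numpy as np
from PIL import Image


def preprocess_image(image, size=(128, 128)):
    """Resize a PIL image to a uniform size and scale pixels to [0, 1]."""
    resized = image.convert("RGB").resize(size)
    return np.asarray(resized, dtype=np.float32) / 255.0


def augment_with_flip(array):
    """Return a horizontally flipped copy of a (height, width, 3) array."""
    return array[:, ::-1, :].copy()
```

A usage example: preprocess_image(Image.open(path)) yields an array of shape (128, 128, 3) that can be stacked with others into a training batch, where path is a placeholder for an image file.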


• Error level analysis : Error Level Analysis (ELA) is an image analysis technique used to detect inconsistencies introduced
during digital image manipulation. The algorithm works by examining the error levels present in an image, which are the
differences in compression quality that occur when an image is saved and resaved. While ELA can highlight suspicious areas
in an image, it cannot definitively identify the type or extent of manipulation. Therefore, ELA is often used in conjunction with
other forensic techniques for a more comprehensive analysis of image authenticity. Overall, Error Level Analysis provides a
useful tool for detecting potential image manipulations by analyzing compression inconsistencies. However, it's important to
interpret its results cautiously and in conjunction with other forensic methods for accurate assessment.
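
One common way to compute an ELA image is to resave the input as a JPEG at a known quality and take the pixel-wise difference from the original. The sketch below follows that approach using Pillow. The resave quality of 90, the brightness rescaling, and the function name error_level_analysis are assumptions for illustration and are not taken from the paper.

```python
import io

from PIL import Image, ImageChops, ImageEnhance


def error_level_analysis(image, quality=90):
    """Return an ELA image: the amplified difference after a JPEG resave."""
    original = image.convert("RGB")

    # Resave in memory at a known JPEG quality (assumed value: 90).
    buffer = io.BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer).convert("RGB")

    # Regions that recompress differently show larger differences.
    diff = ImageChops.difference(original, resaved)

    # getextrema() gives one (min, max) pair per channel; rescale so the
    # largest difference maps to full brightness.
    max_diff = max(channel_max for _, channel_max in diff.getextrema())
    scale = 255.0 / max_diff if max_diff else 1.0
    return ImageEnhance.Brightness(diff).enhance(scale)
```

The output has the same size as the input, so it can be passed through the same resizing and normalization as any other image before being fed to the CNN.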

• CNN Model : A Convolutional Neural Network (CNN) is a type of deep learning algorithm that is particularly well-suited for
image recognition and processing tasks. It is made up of multiple layers, including convolutional layers, pooling layers, and
fully connected layers. The architecture of CNNs is inspired by visual processing in the human brain, and they are
well-suited for capturing hierarchical patterns and spatial dependencies within images. CNNs are
becoming a widely used tool for identifying fake images. They can be taught to
identify various categories and extract features from photos, using several layers of
connected neurons that work together to extract features from the input image through convolution
operations. CNNs are useful for image forensics because of their ability to identify minute artifacts that might be invisible to
the unaided eye. For instance, there might be minute differences in the texture or pixel values of an image that serve as indicators
of manipulation, such as when a fragment is copied and pasted from one image to another.
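
To make the three layer types concrete, the sketch below implements a single-channel forward pass (convolution, ReLU, max pooling, then one fully connected output) in plain NumPy. It is a teaching sketch under stated assumptions, not the model used in this work: a practical system would use a deep learning framework with many filters and layers, and all function names and shapes here are hypothetical.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Keep positive activations and zero out the rest."""
    return np.maximum(x, 0.0)


def max_pool(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h = x.shape[0] // size * size
    w = x.shape[1] // size * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


def forward(image, kernel, weights, bias):
    """Convolution -> ReLU -> pooling -> fully connected -> sigmoid score."""
    features = max_pool(relu(conv2d(image, kernel)))
    logit = features.ravel() @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))
```

In this sketch the returned value lies between 0 and 1 and could be read as the probability that the input is forged, once the kernel and weights have been learned from data.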

• Training Model : This function trains the CNN model using the training dataset. It involves feeding batches of preprocessed
images into the model, adjusting its parameters using an optimization algorithm, and iterating through multiple epochs until
convergence.
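
The batch-and-epoch structure of training can be sketched as below. To keep the example short and runnable without a deep learning framework, a logistic-regression classifier stands in for the CNN; only the loop structure (shuffle, iterate over mini-batches, update parameters, repeat for several epochs) is meant to carry over. The function name train and all hyperparameter values are assumptions for illustration.

```python
import numpy as np


def train(features, labels, epochs=50, batch_size=16, learning_rate=0.1, seed=0):
    """Mini-batch gradient descent on a logistic-regression stand-in model."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = features.shape
    weights = np.zeros(n_features)
    bias = 0.0
    losses = []

    for _ in range(epochs):
        order = rng.permutation(n_samples)      # reshuffle every epoch
        epoch_loss = 0.0
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            x, y = features[idx], labels[idx]

            # Forward pass: predicted probability of the positive class.
            probs = 1.0 / (1.0 + np.exp(-(x @ weights + bias)))
            # Binary cross-entropy, summed over the batch.
            epoch_loss += -np.sum(y * np.log(probs + 1e-12)
                                  + (1 - y) * np.log(1 - probs + 1e-12))

            # Gradient step on the batch-averaged loss.
            grad = probs - y
            weights -= learning_rate * (x.T @ grad) / len(idx)
            bias -= learning_rate * grad.mean()
        losses.append(epoch_loss / n_samples)

    return weights, bias, losses
```

Tracking the per-epoch loss, as the returned list does, gives a simple way to judge convergence: training can stop once the loss no longer decreases meaningfully between epochs.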

6. DATA FLOW DIAGRAM

1. The data flow diagram (DFD) is also called a bubble chart. It is a simple graphical formalism that can be used to represent a
system in terms of the input data to the system, the processing carried out on this data, and the output data generated by the system.
2. The DFD is one of the most important modeling tools. It is used to model the system components. These
components are the system process, the data used by the process, an external entity that interacts with the system and the
information flows in the system.
3. The DFD shows how the information moves through the system and how it is modified by a series of transformations. It is a
graphical technique that depicts information flow and the transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction. A DFD may be
partitioned into levels that represent increasing information flow and functional detail.


Figure - 6 : Data flow diagram

7. RESULTS

Figure-7(a): Authentic Image Output


Figure-7(b): Forged Image Output

CONCLUSION

In conclusion, the project represents a significant step forward in the field of digital image forensics, showcasing the potential of
CNN-based approaches in combating the growing threat of image manipulation and forgery. The developed forgery detection system
holds immense promise for real-world applications, offering stakeholders a powerful tool to safeguard the integrity and authenticity
of digital imagery in domains such as law enforcement, journalism, healthcare, and e-commerce. Moving forward, continued
research and development efforts will be necessary to further refine and optimize the system, addressing challenges such as
scalability, interpretability, and robustness to adversarial attacks. By fostering collaboration between academia, industry, and
government agencies, we can collectively advance the frontier of digital image forensics and uphold the integrity of visual
information in the digital age.

Image forgery involves distorting images, sometimes images of people, for malicious reasons. Typically, a genuine image that
was displayed on a public website or a digital communication platform is edited into an entirely different image. The new
image is often offensive in nature or intended to spread negative publicity.

The ELA algorithm shows whether an image is manipulated when the input image's quality is close to the quality used in the
algorithm. If there is a large difference between the quality of the image and the quality used by the algorithm, the result is
unreliable. Furthermore, the algorithm does not show the exact area of manipulation.

A pre-trained model is a model that has already been trained on a task using a large dataset such as ImageNet. It is a model that
has been trained to solve issues that might be similar to the problem at hand. A pre-trained model is preferred in most cases to
training a model from scratch. Reusing a pre-trained model for a new task is referred to as transfer learning.


