Survey Paper
Keywords: convolutional neural networks, deepfake

1. INTRODUCTION

In an era dominated by artificial intelligence and machine learning, the fabrication of data has become all too common. This distortion of reality poses a significant challenge as fake images and videos proliferate across the internet. Deepfakes, powered by advanced technology, are becoming increasingly indistinguishable from genuine content, making them difficult to detect. While deepfake technology may offer certain benefits, it also causes significant harm.

In a time marked by rampant misinformation, the danger of accepting deepfake content at face value cannot be overstated. These manipulated videos and images can be used for various malicious purposes, including defamation of celebrities, political manipulation, personal attacks, intimidation, propaganda, piracy, and other nefarious activities.

One of the primary targets for deepfake manipulation is the human face. Many algorithms and techniques have been developed to detect facial manipulation, which

• Development of a model trained on the processed dataset, which classifies images as real or fake, followed by post-processing for video analysis.

2. DEEPFAKE CREATION

The landscape of deepfake creation is vast, with an array of tools available online catering to users of varying technical proficiency. From novices to experts, anyone can now generate deepfakes with little effort, regardless of technical knowledge. These deepfake videos vary widely in quality, ranging from basic, easily identifiable fabrications to highly sophisticated manipulations that defy detection even by astute observers.

At the core of deepfake creation lie artificial intelligence and deep learning methods. One commonly employed technique uses an autoencoder, a type of neural network that is typically built from convolutional layers when applied to images. The encoder compresses an input image into a lower-dimensional representation, and the decoder then reconstructs the image from that representation. Notably, the autoencoder is a self-supervised algorithm: the input itself serves as the target during training. An improvement on this method is the Generative Adversarial Network (GAN), an unsupervised deep learning algorithm that further improves the quality of the deepfakes created. A GAN comprises two neural networks: the generator and the discriminator.

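The encode-then-reconstruct idea behind the autoencoder described in Section 2 can be illustrated with a toy sketch. This is not the architecture used by any deepfake tool: it is a purely linear autoencoder written with the Python standard library, and the function names (`encode`, `decode`, `train_autoencoder`, `reconstruction_mse`), the learning rate, and the four-value "images" are all assumptions chosen for illustration. The point it shows is that the training target is the input itself, so no labels are required.

```python
import random


def encode(x, enc):
    """Compress input x into a shorter code vector (dimension reduction)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in enc]


def decode(code, dec):
    """Reconstruct an approximation of the input from the code."""
    return [sum(w * c for w, c in zip(row, code)) for row in dec]


def reconstruction_mse(data, enc, dec):
    """Mean squared error between each input and its reconstruction."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, enc), dec)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x)) / len(x)
    return total / len(data)


def train_autoencoder(data, code_dim=1, lr=0.05, epochs=500, seed=0):
    """Fit a linear encoder/decoder pair by stochastic gradient descent."""
    rng = random.Random(seed)
    n = len(data[0])
    # Encoder weights: code_dim x n. Decoder weights: n x code_dim.
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(code_dim)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(code_dim)] for _ in range(n)]
    for _ in range(epochs):
        for x in data:
            code = encode(x, enc)
            x_hat = decode(code, dec)
            # Self-supervision: the target is the input x itself.
            err = [xh - xi for xh, xi in zip(x_hat, x)]
            # Gradient of 0.5 * ||x_hat - x||^2 with respect to the code,
            # computed before the decoder weights are updated.
            grad_code = [
                sum(err[i] * dec[i][k] for i in range(n)) for k in range(code_dim)
            ]
            for i in range(n):
                for k in range(code_dim):
                    dec[i][k] -= lr * err[i] * code[k]
            for k in range(code_dim):
                for j in range(n):
                    enc[k][j] -= lr * grad_code[k] * x[j]
    return enc, dec


if __name__ == "__main__":
    # Four-value "images" that all lie along one direction, so a
    # one-number code is enough to reconstruct them.
    data = [[t * 0.5] * 4 for t in (-1.0, -0.5, 0.5, 1.0)]
    enc, dec = train_autoencoder(data, code_dim=1)
    print("reconstruction MSE:", reconstruction_mse(data, enc, dec))
```

A practical image autoencoder would replace the linear maps with convolutional layers and nonlinearities, but the training signal, reconstruction error against the input, has the same form.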
analysis target various parameters such as facial warping
artifacts, blinking rates, and head movements. In 2018,
"MesoNet" was introduced, utilizing the Inception model
to detect faults at a mesoscopic level. Convolutional Neural
Networks (CNNs) have demonstrated exceptional feature
extraction capabilities,
4.2 Proposed Method

This paper introduces a method for training a classifier using video frames as input. The frames undergo face extraction and alignment before being fed into the classifier for training.

Prior to training the model, the dataset undergoes preprocessing, which includes face alignment and extraction. The proposed model focuses on detecting faults introduced during deepfake creation around the outline of the face. Thus, face extraction is employed to isolate the area requiring processing, while face alignment accommodates variations in head positions within the deepfake video.

The proposed classifier is based on a fine-tuned convolutional model trained on the preprocessed dataset. It utilizes a VGG-16 model as its base, supplemented with batch normalization, dropout, and a custom two-node dense layer. The final dense layer comprises two nodes representing the two classes (real and fake). Batch normalization normalizes and rescales the activations of the previous layer, while dropout reduces overfitting by randomly deactivating nodes at each training step. This introduces randomness into the training process, enhancing model robustness.

During testing, any given video can be processed frame by frame, and the predictions for each frame are used to derive the final classification. Working at the frame level enhances accuracy by incorporating many versions of similar inputs into the dataset fed to the neural network. Moreover, image transformation operations such as zooming, flipping, and slight rotation enrich the dataset, since a frame's output class remains the same after these transformations.

Integrating the feature extraction and pre-processing model with models designed to detect temporal features, such as Recurrent Neural Networks (RNNs), can further improve accuracy.

5. RESULT ANALYSIS

The model described in the paper achieves an accuracy of approximately 70%. Figure 3 plots the categorical cross-entropy loss alongside the accuracy for the model trained over 20 epochs. As depicted, with each epoch, the loss diminishes while accuracy ascends.
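The frame-by-frame testing procedure of Section 4.2 can be sketched as follows. The paper does not state how the per-frame predictions are combined into a video-level decision, so averaging the two-node outputs and picking the larger mean is an assumption made here for illustration, as are the function name `classify_video` and the `(p_real, p_fake)` ordering of each frame's output.

```python
def classify_video(frame_outputs):
    """Combine per-frame (p_real, p_fake) scores into one video label.

    Assumption: the video label is the class with the higher mean score
    across all frames. Other rules, such as a majority vote, are possible.
    """
    if not frame_outputs:
        raise ValueError("at least one frame prediction is required")
    n = len(frame_outputs)
    mean_real = sum(p_real for p_real, _ in frame_outputs) / n
    mean_fake = sum(p_fake for _, p_fake in frame_outputs) / n
    label = "fake" if mean_fake > mean_real else "real"
    return label, (mean_real, mean_fake)


if __name__ == "__main__":
    # Hypothetical outputs of the two-node classifier for three frames.
    frames = [(0.1, 0.9), (0.2, 0.8), (0.6, 0.4)]
    label, means = classify_video(frames)
    print(label, means)
```

Averaging keeps a single misclassified frame from flipping the video-level result, which is one reason a frame-level classifier can remain usable on video even when individual frame predictions are noisy.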