Prob. Framework-CMFD


Multimedia Tools and Applications

https://doi.org/10.1007/s11042-019-7713-2

A probabilistic framework for copy-move forgery detection based on Markov Random Field

Behnaz Elhaminia1 · Ahad Harati1 · Amirhossein Taherinia1

Received: 13 July 2018 / Revised: 15 April 2019 / Accepted: 29 April 2019

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
Copy-move forgery is one of the most common kinds of image tampering, in which some part of an image is copied, possibly with minor modifications, and pasted to another area of the same image. With the growing use of images in today's life, image authenticity has become a vital issue, and consequently many image forgery detection techniques have been presented. In this paper, for the first time, we propose to treat copy-move forgery detection as a labeling problem in a Markov Random Field. To strike a proper balance between precision and speed, an over-segmentation is performed as a preprocessing step to obtain superpixels, which are then regarded as nodes of the Markov network. A careful selection of unary and binary potentials lets the maximum a posteriori labeling be a precise map of the forged regions. Qualitative and quantitative comparison with state-of-the-art methods on public benchmarks demonstrates that the proposed method can improve precision while keeping the processing demands low.

Keywords Copy-move forgery detection · Image forensics · Markov random field · Binary labeling

1 Introduction

Nowadays, with the continuous growth of technology, it is quite easy to manipulate digital media in general; digital images in particular are effortlessly tampered with using advanced image manipulation software. On the other hand, the important role of digital images in trading, news and politics, social networks and relations has imposed a challenge

 Ahad Harati
a.harati@um.ac.ir

Behnaz Elhaminia
behnazelhaminia@stu.um.ac.ir

Amirhossein Taherinia
taherinia@um.ac.ir

1 Department of Computer Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran

for researchers to develop automatic algorithms that discriminate between an original image and its doctored version. This process is known as forgery detection. In its general form, forgery refers to any malicious manipulation or tampering that leaves no obvious clues or traces, always with the aim of distorting some information in the input image. There are many different kinds of image forgery, which, based on the process involved, are categorized into three major groups: image retouching, image splicing and copy-move forgery. Retouching is a process used in photography with the aim of enhancing quality. This type of forgery is considered the least harmful, as it does not change the image significantly and is used only to enhance or reduce some features of the image. The second type is image splicing, where two or more images are combined to make a fake image. This type of forgery can be easily detected by searching for abnormalities, such as inconsistencies in the color or noise components. The third one is copy-move forgery, where some part of the image is copied and then pasted to a desired location in the same image. A copy-move attack is shown in Fig. 1. Typically, copy-move doctoring is performed to hide some meaningful objects of the image or to increase the number of objects. To this end, and to make the forgery look more natural, the pasted part is usually subjected to post-processing operations such as blurring, scaling and rotation. These post-processing operations make detection difficult.
In recent years, many Copy-Move Forgery Detection (CMFD) methods have been proposed, which are basically divided into two groups: block-based and keypoint-based methods. However, many methods from both groups share similar steps such as feature extraction, matching and filtering; see Fig. 2. In block-based techniques the features represent blocks of the image and every pair of distinct blocks is compared, while in keypoint-based methods some selected pixels of the image are described and evaluated to find the forgery. In our method, we present a new framework which is based on Markov Random Fields.
Markov networks, as a special case of Probabilistic Graphical Models, are widely used in many different computer vision applications, since on the one hand they provide appropriate probabilistic representations for image-related problems, and on the other hand they offer a rich set of energy functions which, when properly minimized, can accurately determine the solution to the initial problem. In this paper, for the first time, we propose a new framework for detecting copy-move forgery which is based on inference within Markov Random Fields (MRF). A major contribution of our work is hence formulating CMFD as maximum a posteriori inference in an MRF. There are three prominent features which encourage using MRFs and probabilistic inference for CMFD. First, they provide a well-formulated, flexible and principled method to combine prior knowledge and data likelihood in an adapted graph structure which models contextual constraints among variables. Second, the factorization of the joint probability into local potential terms leads to tractable inference which can effectively collect partial cues into solid conclusions about the forged regions. Finally, MRFs have the potential advantages of parameter learning and adaptation using tagged databases. Considering these advantages, we propose MRFs as a suitable framework for CMFD and design proper unary and binary energy functionals which not only identify the forged images but also highlight the forged regions.

Fig. 1 Example of copy-move forgery. The left image is the original and the right one is forged [32]
To obtain a proper balance between precision and speed, an over-segmentation is performed using the Simple Linear Iterative Clustering (SLIC) method [1] as a preprocessing step to obtain superpixels, which are then regarded as nodes of the Markov network. Touching superpixels are considered adjacent in the Markov graph to encourage label consistency. Unary and binary potentials are computed based on a rich set of features which consider not only the appearance similarity of the superpixels but also the consistency of their geometric transformations. These energy terms let the inferred maximum a posteriori labeling be a precise map of the forged regions.
The rest of this paper is organized as follows. In the next section we review the existing methods for detecting copy-move forgeries. Section 3 presents the proposed forgery detection framework. Experimental results are discussed in Section 4, and the last section contains conclusions and future work.

2 Related works

In order to address the CMFD problem, various methods have been developed, which can be classified into two categories: block-based methods and keypoint-based ones. These two categories have fundamentally similar steps, as both perform feature extraction followed by some sort of similarity evaluation based on predetermined thresholds, which may be iteratively adapted. Finally, spurious matches are usually removed in a post-processing stage; see Fig. 2. The main difference is that block-based approaches process regions of the input image to find candidate matches, while keypoint-based methods find correspondences between points.
In block-based methods, the input image is first divided into overlapping blocks, which are then compared to each other based on their extracted features. Finally, two blocks that are similar under a predefined similarity measure are marked as forged parts. While all block-based methods share a similar framework, they typically differ in the feature extraction step. A wide variety of feature descriptors are used for copy-move forgery detection, where the preference is for invariance to the aforementioned post-processing operations such as rotation and blurring.
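As a minimal illustration of the block-based pipeline described above, the following Python sketch slides a window over a grayscale image, uses the raw pixel values of each block as its feature vector, and reports pairs of distant blocks whose features are nearly identical. The function name `block_matches`, its parameters `block`, `max_dist` and `min_offset`, and the use of raw pixels as features are assumptions made for illustration only; the methods surveyed here rely on more robust descriptors.

```python
import numpy as np

def block_matches(image, block=4, max_dist=1e-6, min_offset=4):
    """Return pairs of top-left corners of overlapping blocks with near-identical content."""
    height, width = image.shape
    positions, feats = [], []
    for y in range(height - block + 1):
        for x in range(width - block + 1):
            positions.append((y, x))
            # Assumption: raw pixels stand in for a real block descriptor.
            feats.append(image[y:y + block, x:x + block].ravel())
    feats = np.asarray(feats, dtype=float)
    pairs = []
    for i in range(len(feats) - 1):
        dists = np.linalg.norm(feats[i + 1:] - feats[i], axis=1)
        for j in np.nonzero(dists <= max_dist)[0] + i + 1:
            (y1, x1), (y2, x2) = positions[i], positions[int(j)]
            # Ignore blocks that are spatially too close to each other.
            if max(abs(y1 - y2), abs(x1 - x2)) >= min_offset:
                pairs.append((positions[i], positions[int(j)]))
    return pairs

# Toy example: copy a 4x4 patch from the top-left corner to position (6, 6).
rng = np.random.default_rng(0)
toy_image = rng.random((12, 12))
toy_image[6:10, 6:10] = toy_image[0:4, 0:4]
toy_pairs = block_matches(toy_image)
```

The exhaustive pairwise comparison is quadratic in the number of blocks, which reflects the computational cost commonly attributed to block-based methods.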
In [7], the authors proposed a feature based on the Fourier-Mellin Transform, which is invariant to small changes in rotation and scaling but fails under significant changes. Another rotation-invariant feature used in CMFD is LBP (Local Binary Patterns), proposed by Li et al. [19]. The authors of [15] use the Discrete Cosine Transform (DCT) to describe each block. Although the results show good performance in finding simple forgeries, the method does not cope well with post-processing operations such as scaling, rotation and blurring. A similar method [5] uses the Discrete Wavelet Transform (DWT) and kernel Principal Component Analysis (kPCA) as robust features. The results show acceptable robustness against some post-processing, but the problem of scaling and rotation is still not solved. In [14], wavelet features are also used to find the tampered parts of the image. Recently, Lee et al. [17] used the Histogram of Oriented Gradients (HOG) to represent each block. In this method, the RGB image is first transformed into gray scale. Then, after dividing the gray image into overlapping blocks, HOG is applied to each block to obtain the feature vectors. Twelve bins are used for the local histogram, each covering 15 degrees. As a result, identical blocks have the same HOG, and searching for the most similar vectors reveals the forgery. According to the results, this method can detect forgeries with rotation up to 10 degrees and works well under brightness changes. In more recent research [12], the authors represent a block by Singular Value Decomposition (SVD) and the Stationary Wavelet Transform (SWT), which is shift invariant. The results show good resistance to blurring attacks. Also, in [2] blocks are represented by DCT coefficients, and in a different approach [11] the fractional quaternion cosine transform is used for features.

Fig. 2 Common framework of the copy-move forgery detection methods
The major drawbacks of block-based methods are low robustness against rotation and scaling attacks and high computational complexity. Keypoint-based methods try to avoid these problems by focusing on points of interest in the input image. These techniques select important points from all over the image and create descriptors that capture significant features of each point's vicinity. The obtained keypoint descriptors are finally used for matching similar regions of the input image, which after post-processing deliver the desired result.
In recent years, numerous keypoint extraction methods have been presented, with different advantages and disadvantages. Early research commenced with the Scale Invariant Feature Transform (SIFT) [22]. As SIFT selects keypoints from different scales of the image, describes them based on their neighborhood, and also assigns an orientation to each of them, one can say SIFT keypoints are partly robust against rotation and scaling changes. Many methods have therefore been built on the SIFT descriptor. In [24], the authors used SIFT to detect and describe keypoints. Amerini et al. [3], after extracting SIFT keypoints, clustered the keypoints, so that the similarity search is carried out by searching for similar clusters. The results show better localization of the forged region. In a similar method [25], SIFT is used to extract and describe the keypoints of the image; after the keypoints are extracted and matched, a geometric transformation between the matched regions is estimated. Besides SIFT, many other feature detectors are used in this area. In more recent research [35], the authors proposed a modified SIFT-based detector specialized for the CMFD scenario. In [21], keypoint extraction is done after lowering the contrast of the image and resizing it. This process helps to obtain more robust keypoints even in smooth areas. Also in [36], a new interest point detector combined with SIFT is proposed. In this method, after feature extraction and keypoint matching, an iterative filtering step based on image segmentation is executed.
SURF, or Speeded-Up Robust Features, presented in [6], is another feature detector which finds and describes keypoints in a way similar to SIFT but at a higher speed. In [34], SURF is used to detect forged parts. To reduce false positives, another method based on Zernike moments [27] estimates an affine transformation, and the phase of the Zernike moments helps to improve the estimation and eliminate false detections. The affine estimation is performed by RANSAC. RANSAC, or random sample consensus, is a method which estimates the parameters of a model from a set of observed data that contains outliers, and it can be used to detect a transformation between matched points. In recent years, the majority of the proposed methods have used RANSAC or similar techniques to filter false positives [30]. The authors of [4] proposed to match triangles of keypoints. In this method, after interest points are extracted and modeled as a set of triangles, forgery detection is done by searching for similar triangles based on their shapes, contents and features. In [38], the authors proposed an iterative keypoint-based method whose keypoints depend on the texture and uniqueness of the image. In fact, more unique and smooth regions of the image contain more keypoints. Furthermore, in order to reduce false positives, they proposed a method similar to RANSAC but faster. In a recent work [37], a different feature extraction method is proposed, in which oriented Features from Accelerated Segment Test (FAST) and rotated Binary Robust Independent Elementary Features (BRIEF) are used for feature extraction.
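To make the role of RANSAC in filtering false matches more concrete, the following Python sketch estimates a 2D affine transformation from matched point pairs that contain outliers. It is a simplified illustration and not the implementation used by any of the cited methods; the function names `fit_affine` and `ransac_affine`, the tolerance `tol`, the iteration count and the toy data are all assumptions.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine map so that dst ~ src @ A.T + t; returns (A, t)."""
    design = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params[:2].T, params[2]

def ransac_affine(src, dst, n_iter=200, tol=1.0, seed=0):
    """Return (A, t, inlier_mask) for the affine model with the most inliers."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        # Three point pairs are the minimal sample that fixes an affine map.
        sample = rng.choice(len(src), size=3, replace=False)
        A, t = fit_affine(src[sample], dst[sample])
        inliers = np.linalg.norm(src @ A.T + t - dst, axis=1) < tol
        if inliers.sum() > best.sum():
            best = inliers
    A, t = fit_affine(src[best], dst[best])  # refit on all inliers
    return A, t, best

# Toy data: 15 matches follow a known rotation + scaling, 5 are corrupted.
rng = np.random.default_rng(1)
src_pts = rng.uniform(0, 100, size=(20, 2))
angle = np.deg2rad(10)
true_A = 1.2 * np.array([[np.cos(angle), -np.sin(angle)],
                         [np.sin(angle), np.cos(angle)]])
true_t = np.array([30.0, -5.0])
dst_pts = src_pts @ true_A.T + true_t
dst_pts[:5] += rng.uniform(50, 100, size=(5, 2))  # outliers (false matches)
est_A, est_t, inlier_mask = ransac_affine(src_pts, dst_pts)
```

Matches flagged as outliers by the inlier mask would be discarded as false positives, while the inliers support a consistent copy-move transformation.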
Although keypoint-based methods are more efficient than block-based techniques, the problem of robustness is still not completely solved. Extracting proper keypoints and robust features is vital for a robust method. On the other hand, keypoint-based methods rely on more complicated algorithms, in contrast to the simple and easy-to-understand block-based methods. In order to solve these problems, new methods have been proposed which take advantage of both keypoint-based and block-based methods. In [9], the author uses segmentation and compares each pair of patches through their keypoints. In another technique [18], to improve affine estimation, a segmentation algorithm is applied after extracting SURF keypoints, so that RANSAC finds the transformation between the points of two segments. Li et al. [23] used Simple Linear Iterative Clustering (SLIC) to over-segment the image, and used the SIFT descriptor and KNN (K-nearest neighbors) to find the forged segments. In another segmentation-based method [28], the authors follow the same approach as [23] and find the forgery by comparing Adaptive Non-Maximal Suppression (ANMS) features combined with DAISY descriptors [31]. Bi et al. [10] used superpixels, which are meaningful non-overlapping regions, and the final decision is based on the similarity between these superpixels. Although segmentation-based methods are simple and effective, some problems remain. In these methods, forgery detection is done by merely comparing two different patches, segments or superpixels, and no other information, such as neighborhood information or the probability of a superpixel being manipulated, is considered. Therefore the error rate in finding the solution can be high.
Our approach in this paper differs from traditional segmentation-based forgery detection methods. First, it uses an over-segmentation into superpixels to construct a Markov network of reasonable size, rather than a proper segmentation to directly choose forged regions. Second, the designed singleton (unary) and doubleton (binary) energy terms help to find the optimal joint labeling of the superpixels into forged and intact regions. In fact, each region is not judged in isolation, as is the case in existing segmentation-based methods; instead, MAP inference estimates the best-suited label for each superpixel while considering the state of all other regions. In the following, we describe our method in detail.

3 Proposed method
We propose a novel method for detecting copy-move forgery based on Markov Random Fields. In this method, the image is first divided into several superpixels, and features are then extracted from each superpixel individually. A clustering algorithm then groups the superpixels based on their features, putting similar superpixels in the same cluster. After that, a graph is constructed whose nodes are the superpixels. Two neighboring superpixels in the image share an edge in this graph (see Fig. 4). The problem of finding the forged parts of the image thus turns into maximum a posteriori (MAP) inference on the constructed graph. To perform MAP inference, an energy function defined by two terms, a unary term and a binary term, is minimized. The unary potential term investigates an affine transformation between similar parts, and the binary term considers label continuity based on the information of co-cluster superpixels. To minimize the energy function, we use the ICM (Iterated Conditional Modes [1]) algorithm, which is simple and efficient. A flowchart of the proposed method is illustrated in Fig. 3. Next, we describe each step in a separate subsection.

Fig. 3 The proposed framework of copy-move forgery detection

3.1 Dividing image into superpixels

The first step of the proposed method is image segmentation, where the image is divided into several superpixels. To do so, a segmentation algorithm called SLIC (Simple Linear Iterative Clustering), presented in [1], is used. We use the SLIC implementation in the VLFeat library [33] (version 0.9.20), which is one of the most effective implementations available.

Fig. 4 A graphical illustration of the MRF model for CMFD. Pink nodes in the graph represent superpixels and grey ones show feature nodes
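Once a superpixel label map is available, the edges of the Markov network follow from which superpixels touch each other. The Python sketch below builds this adjacency from an integer label map, independently of the library that produced it. The function name `superpixel_adjacency`, the use of 4-connectivity to decide when two superpixels touch, and the toy label map are assumptions for illustration.

```python
import numpy as np

def superpixel_adjacency(labels):
    """Return the set of undirected edges (p, q), p < q, between touching superpixels."""
    labels = np.asarray(labels)
    edges = set()
    # Compare each pixel with its right neighbor, then with its lower neighbor.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        differs = a != b
        for p, q in zip(a[differs].tolist(), b[differs].tolist()):
            edges.add((min(p, q), max(p, q)))
    return edges

# Toy 4x4 label map with three superpixels (0, 1, 2).
toy_labels = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 1, 1],
    [2, 2, 2, 2],
])
toy_edges = superpixel_adjacency(toy_labels)
```

Because superpixels have irregular shapes, the number of neighbors differs from node to node, so the resulting graph is not a regular grid.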

3.2 Feature extraction and clustering

In our proposed method, we use two different kinds of features for two different goals. First, we extract PCT (Polar Cosine Transform) features from all pixels of the image. PCT is one of the most frequently used features due to its high discriminative power. The PCT features of an image g, of order n with repetition l, are calculated as follows [20]:

$$f = \{\, |M_{n,l}| \;:\; n + l \le 3,\ 0 \le n, l < 3 \,\} \tag{1}$$

in which:

$$M_{n,l} = \int_0^{2\pi}\!\int_0^1 \left[H_{n,l}(r,\theta)\right]^{*} g(r,\theta)\, r\, dr\, d\theta, \tag{2}$$

$$H_{n,l}(r,\theta) = \Omega_n \cos\!\left(\pi n r^2\right) e^{j l \theta}, \qquad \Omega_n = \begin{cases} \dfrac{1}{\pi}, & n = 0 \\[4pt] \dfrac{2}{\pi}, & n \neq 0 \end{cases} \tag{3}$$
where g(r, θ) denotes the image in polar representation. To describe a keypoint with PCT, a B × B block around the keypoint is transformed into a feature vector via (1). To describe a superpixel, the average of the PCT features of all pixels in the superpixel is calculated, and the resulting feature vector represents that superpixel. As the results show, two similar superpixels have very close feature vectors. We use these features for clustering superpixels. By using k-means for clustering, two similar superpixels will fall in a common cluster. The goal of clustering is twofold. First, the search for similar superpixels is performed within each cluster: after clustering, the copied part and the original one are in a common cluster, so the search space is limited to individual clusters. Second, we use this clustering in the binary term of the energy function, which is discussed later in Section 3.4.
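The PCT definition in (1) to (3) can be approximated numerically on a square block by mapping the pixel grid onto the unit disk and replacing the integral with a sum. The Python sketch below is a rough discretization under several assumptions: the function name `pct_features`, the mapping of pixel centers to the unit disk, the use of coefficient magnitudes as features, and the Cartesian area element standing in for r dr dθ. It is meant to show the structure of the computation, not to reproduce the implementation used in the experiments.

```python
import numpy as np

def pct_features(block, max_order=3):
    """Approximate |M_{n,l}| for n + l <= max_order and 0 <= n, l < max_order."""
    block = np.asarray(block, dtype=float)
    size = block.shape[0]
    # Map pixel centers of the square block to [-1, 1] x [-1, 1].
    coords = 2.0 * (np.arange(size) + 0.5) / size - 1.0
    x, y = np.meshgrid(coords, coords)
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = r <= 1.0                 # only pixels inside the unit disk contribute
    area = (2.0 / size) ** 2          # dx * dy, the Cartesian form of r dr dtheta
    feats = []
    for n in range(max_order):
        for l in range(max_order):
            if n + l > max_order:
                continue
            omega = 1.0 / np.pi if n == 0 else 2.0 / np.pi
            kernel = omega * np.cos(np.pi * n * r**2) * np.exp(1j * l * theta)
            moment = np.sum(np.conj(kernel[inside]) * block[inside]) * area
            feats.append(abs(moment))
    return np.array(feats)

# Toy check: magnitudes are unchanged when the block is rotated by 90 degrees.
toy_block = np.random.default_rng(0).random((16, 16))
toy_feats = pct_features(toy_block)
toy_feats_rotated = pct_features(np.rot90(toy_block))
```

With the constraints in (1), eight coefficients are kept per block. Taking magnitudes removes the phase factor that a rotation introduces, which is why the rotated block yields the same feature vector in this sketch.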
The other feature we use is SURF. To determine whether two superpixels are similar, we extract SURF keypoints and compare them. The motivation is that SURF keypoints are robust to rotation, scaling and affine transformation, which makes them well suited for copy-move forgery detection. In summary, we extract two kinds of features:
1- PCT features, for limiting the search area and for use in the binary energy function.
2- SURF features, for matching and detecting similar parts of the image in the unary energy function.
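The superpixel description and clustering step can be sketched as follows: per-pixel feature vectors are averaged inside each superpixel, and the resulting vectors are grouped with k-means. The code is a minimal NumPy illustration; the function names, the plain Lloyd iteration with random initialization, and the toy features (which stand in for real PCT vectors) are assumptions.

```python
import numpy as np

def mean_features_per_superpixel(pixel_features, labels, n_superpixels):
    """Average per-pixel feature vectors of shape (H*W, D) within each superpixel."""
    flat = np.asarray(labels).ravel()
    sums = np.zeros((n_superpixels, pixel_features.shape[1]))
    np.add.at(sums, flat, pixel_features)
    counts = np.bincount(flat, minlength=n_superpixels)[:, None]
    return sums / np.maximum(counts, 1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd iterations; returns the cluster index of every point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        new_centers = np.array([
            points[assign == c].mean(axis=0) if np.any(assign == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign

# Toy example: superpixels 0 and 2 look alike, as do superpixels 1 and 3.
toy_sp_labels = np.array([[0, 0, 1, 1],
                          [2, 2, 3, 3]])
toy_base = np.array([[0.0, 0.0], [10.0, 10.0], [0.2, 0.1], [9.8, 10.1]])
toy_pixel_features = toy_base[toy_sp_labels.ravel()]  # hypothetical per-pixel features
sp_features = mean_features_per_superpixel(toy_pixel_features, toy_sp_labels, 4)
sp_clusters = kmeans(sp_features, k=2)
```

Only superpixels that share a cluster index would then be compared through their SURF keypoints, which is how clustering restricts the search space.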

3.3 Keypoint matching

Keypoints of two superpixels which are in the same cluster are compared with each other. For keypoint matching, we proceed according to [1]: a descriptor D1 is matched to a descriptor D2 only if the Euclidean distance d(D1, D2) multiplied by a predefined threshold ths is not greater than the distance of D1 to all other descriptors. This Euclidean distance is stored as the match score between the two keypoints. The advantage of this way of matching is that similarity is defined relative to the other descriptors. That means two descriptors may be considered matched in one iteration but different in another iteration, when more descriptors have been added. Besides, it prevents false matches when all descriptors are very close to each other, for example in uniform regions of the image.
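The matching rule above can be written compactly as a ratio test between the best and the second-best candidate. The following Python sketch implements that rule with plain Euclidean distances; the function name `ratio_match`, the threshold value 1.5 and the toy descriptors are assumptions chosen for illustration.

```python
import numpy as np

def ratio_match(desc_a, desc_b, ths=1.5):
    """Return (i, j, score) triples where desc_a[i] matches desc_b[j]."""
    matches = []
    for i, descriptor in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - descriptor, axis=1)
        if len(dists) < 2:
            continue  # the rule needs at least one competing descriptor
        order = np.argsort(dists)
        best, second = order[0], order[1]
        # Accept only if the best distance, scaled by ths, still beats all others.
        if dists[best] * ths <= dists[second]:
            matches.append((i, int(best), float(dists[best])))
    return matches

# Toy descriptors: the first query has a clear match, the second is ambiguous.
toy_a = np.array([[0.0, 0.0], [5.0, 5.0]])
toy_b = np.array([[0.1, 0.0], [4.0, 4.0], [4.2, 4.2]])
toy_matches = ratio_match(toy_a, toy_b)
```

The second query descriptor is rejected because its two nearest candidates are almost equally close, which is the situation the rule is designed to filter out.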
To detect similar superpixels, we carry out matching in two steps: first in the initialization of the energy function, and then in its unary term. For initialization, any two superpixels which have at least one matched keypoint are considered similar superpixels. Similarity matching for the unary term is discussed in the next section.

3.4 Problem formulation with Markov random fields

In recent years, Markov Random Field models have been widely used in different problems of computer vision and image processing such as segmentation, image restoration, noise reduction and depth reconstruction. The success of MRF modelling can be attributed to the fact that it gives rise to good, flexible, stochastic image models [26]. Image segmentation is one of the most important tasks in computer vision, where pixels are grouped into different regions based on their features. The goal is to assign a label to each part of the image; in a simple problem, foreground and background can be these labels. The segmentation problem has been solved well with MRF models in recent years. To solve the CMFD problem, we treat it like an image segmentation problem where each part of the image has a label, i.e., forgery or safe. The formulation of CMFD with a Markov random field, based on image segmentation, is as follows.
Assume that the image consists of a finite set of superpixels S = {S1, S2, ..., SN}. The label of each superpixel, ωs, is defined as a discrete random variable which takes a value in {0, 1} (0 denotes a safe region and 1 denotes forgery). The set of labels is thus defined as ω = {ωs | s ∈ S}. Furthermore, if the observed features of each superpixel are defined as another random variable, F = {fs | s ∈ S}, the overall CMFD problem is to find an optimal labeling ω̂ that maximizes P(ω | F), that is, the maximum a posteriori (MAP) estimate ω̂ = argmax_{ω ∈ Ω} P(ω | F), where Ω is the set of all possible labelings.
Figure 4 shows the MRF model for the CMFD problem. Note that in a usual segmentation problem the pixels of the image are the nodes of the graph, so the MRF model is a regular network in which each pixel has exactly four neighbors; in our proposed method, however, each superpixel may have a different number of neighbors, and the network is not regular.
According to the Hammersley-Clifford theorem [13], the joint probability distribution P(ω) follows a Gibbs distribution, which can be written as:

$$P(\omega) = \frac{1}{Z} \exp\bigl(-U(\omega)\bigr) = \frac{1}{Z} \exp\Bigl(-\sum_{c \in C} V_c(\omega_c)\Bigr) \tag{4}$$

where U(ω) is called the energy function and Vc is the clique potential of clique c ∈ C having the label ωc. C is the set of all cliques and Z is a normalizing constant known as the partition function, defined as follows [16]:

$$Z = \sum_{\omega} \exp\bigl(-U(\omega)\bigr) \tag{5}$$

Considering (4), the maximization of P(ω | F) is equivalent to the minimization of the energy function. Our proposed energy function for the CMFD problem has the following form:

$$U(\omega) = \alpha \sum_{s \in S} G(\omega_s) + \beta \sum_{(p,q) \in \xi} \psi(\omega_p, \omega_q), \tag{6}$$

where α and β are regularization parameters and ξ denotes the set of edges. The first term, G(ωs), is called the unary potential function, also known as the singleton term; it is defined over each node individually and directly affects the modeling of the labels. The second term, the binary term ψ(ωp, ωq), also known as the doubleton term, considers the relationship between the labels of neighboring nodes (superpixels). We discuss each term in detail in the following.
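The link between the Gibbs distribution in (4) and the energy in (6) can be checked by brute force on a tiny graph: enumerating all labelings shows that the labeling with minimum energy is also the one with maximum probability. In the Python sketch below, the unary costs are hypothetical numbers, the pairwise term is a plain label-disagreement penalty used as a placeholder for ψ, and the names `total_energy` and `disagreement` are assumptions.

```python
import itertools
import numpy as np

def disagreement(label_p, label_q):
    """Placeholder pairwise term: 1 when the two labels differ, else 0."""
    return float(label_p != label_q)

def total_energy(labels, unary, edges, alpha=1.0, beta=1.0):
    """U(w) = alpha * sum_s G(w_s) + beta * sum_{(p,q) in edges} psi(w_p, w_q)."""
    unary_sum = sum(unary[s][labels[s]] for s in range(len(labels)))
    pair_sum = sum(disagreement(labels[p], labels[q]) for p, q in edges)
    return alpha * unary_sum + beta * pair_sum

# Hypothetical unary costs per node: (cost of label 0 = safe, cost of label 1 = forged).
toy_unary = [(0.2, 1.5), (0.9, 0.2), (0.8, 0.4)]
toy_chain = [(0, 1), (1, 2)]  # three nodes connected in a chain

labelings = list(itertools.product((0, 1), repeat=3))
energies = np.array([total_energy(w, toy_unary, toy_chain) for w in labelings])
gibbs = np.exp(-energies) / np.exp(-energies).sum()  # Eq. (4), with Z from Eq. (5)
map_labeling = labelings[int(np.argmin(energies))]
```

Exhaustive enumeration grows as 2^N with the number of superpixels N, which is why an iterative minimizer is needed for real images.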

A. Unary Potential Function: To set a proper label for each node s, this part of the energy function investigates two factors: feature matching and transformation matching.
1. Feature matching: this part of the unary function considers the similarity of superpixels based on their features. More similar features mean a higher probability of being forged.
2. Transformation matching: this part states that a reasonably well-defined transformation should exist between two similar superpixels.
In summary, two parts of an image can be detected as forged parts when they have similar features and there is a good transformation between them. Accordingly, we define G(ωs) as below:

$$G(\omega_s) = \omega_s \bigl( k \cdot \mathrm{mean}(Rn\{s\}) + (1 - k) \cdot \mathrm{sum}(Mch\{s\}) \bigr) + w_s \cdot C. \tag{7}$$

where C is a constant for safe nodes and k is a controller parameter (discussed in the following). Rn{s} and Mch{s} are:

$$Mch\{s\} = \mathrm{Match}\bigl(CC_s, \mathrm{Matched}(CC_s)\bigr) \tag{8}$$

$$Rn\{s\} = \mathrm{Transformation}\bigl(CC_s, \mathrm{Matched}(CC_s)\bigr) \tag{9}$$

where CCs denotes the connected component, in terms of labels, that contains superpixel s (CCs is the sub-graph of the label graph containing node s together with the maximal set of connected nodes that have the same label as node s). Match(A, B) is a function that gives a vector of scores for the similarity of points in A and B. Similarly, Transformation(A, B) gives a vector of scores for the estimated transformation.
In order to estimate the transformation for one superpixel, the whole connected component containing the superpixel is considered. Some methods estimate the transformation by simply considering the neighboring segments [18, 38], but in our proposed method we look for an estimate over all superpixels in the connected component, which yields more keypoints and a more precise estimate.
To estimate an affine transformation, all keypoints of CCs and their matched points are considered. This helps to estimate a more accurate transformation and to avoid false positives (because a superpixel cannot be forged alone and should have some forged neighbors). Another advantage concerns scaling, where one superpixel may turn into two or more superpixels. In fact, in our method, we decide whether keypoints are forged by exploring their neighbors and not by comparing just two superpixels.
To understand the role of these terms, consider three different cases that can occur for one superpixel with the forgery label, in comparison with the others:
1. There are no matched points between it and other superpixels. It should therefore be considered a safe node. The energy must rise, so it rises by the constant C.
2. There are some matched points and an affine transformation between them. They can thus be considered a forged part. In this case, k is on (k = 1) and the average of the match scores is taken as the energy. A lower match score (a smaller distance between feature vectors, i.e. more similar superpixels) results in a higher probability of being forged.
3. There are some matched points, but no affine transformation. The energy must therefore be less than in case 1 and more than in case 2. In other words, the superpixel may or may not be forged. In this case, the sum of the match scores is calculated.
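The three cases above can be read as a small decision rule for the energy of a superpixel that carries the forgery label. The Python sketch below encodes that reading only; it does not cover the safe label, and the function name `forged_label_energy`, the convention that a non-empty list of inlier scores means an affine transformation was found, and all numeric values are assumptions.

```python
import numpy as np

def forged_label_energy(match_scores, inlier_scores, safe_constant):
    """Energy of giving one superpixel the forgery label, following the three cases."""
    if len(match_scores) == 0:
        return float(safe_constant)            # case 1: no matched points
    if len(inlier_scores) > 0:
        return float(np.mean(inlier_scores))   # case 2: k = 1, mean of inlier scores
    return float(np.sum(match_scores))         # case 3: matches without a transformation

C_SAFE = 1.0  # hypothetical value of the constant C
case1 = forged_label_energy([], [], C_SAFE)
case2 = forged_label_energy([0.10, 0.20, 0.15, 0.40], [0.10, 0.20, 0.15], C_SAFE)
case3 = forged_label_energy([0.30, 0.40], [], C_SAFE)
```

For these toy numbers the energies are ordered as case 2 < case 3 < case 1, matching the ordering stated in the text; the sketch itself does not enforce that ordering for arbitrary scores.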

Formally, Rn{s} is a vector consisting of the match scores of all keypoints in superpixel s that RANSAC detects as inliers of the affine transformation. Mch{s} is a vector consisting of the match scores of all keypoints of superpixel s.

B. Binary Potential Function: This part of the function checks the continuity of the labels. Two neighboring superpixels should have similar labels; therefore, as in the segmentation problem, this term should penalize the energy when two neighboring nodes do not have the same label. However, there are some differences in CMFD. In CMFD, two neighbors can legitimately have different labels, one forgery and one safe: most superpixels are safe (only some parts of the image are manipulated), so a safe superpixel can be the neighbor of a forged one. If, however, the two superpixels are parts of one object, they should have the same label. Therefore, to calculate the edge energy (binary energy), it should be determined whether they can be expected to have the same label. To do that, we use the clustering mentioned in Section 3.2: the term penalizes the energy when two neighboring superpixels have different labels and are in the same cluster. The binary term is defined as below:

$$\psi(\omega_p, \omega_q) = C_{p,q} \bigl( 1 - \sigma(\omega_p = \omega_q) \bigr) \tag{10}$$

where σ is the indicator function, whose value is 1 if its argument is true and 0 otherwise. C is a pre-computed matrix whose element Cp,q is 1 if superpixel p and superpixel q are in the same cluster, and 0 otherwise. There are four possible cases for the two nodes of an edge. Table 1 shows the cases and the corresponding penalties.
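The binary term in (10) and its four cases can be verified with a few lines of Python. The sketch below builds the co-cluster matrix from cluster indices and evaluates the penalty for each combination of cluster and label agreement; the function names `co_cluster_matrix` and `binary_potential` and the toy cluster assignment are assumptions.

```python
import numpy as np

def co_cluster_matrix(cluster_ids):
    """C[p, q] = 1 if superpixels p and q share a cluster, else 0."""
    ids = np.asarray(cluster_ids)
    return (ids[:, None] == ids[None, :]).astype(int)

def binary_potential(label_p, label_q, c_pq):
    """psi(w_p, w_q) = C_pq * (1 - [w_p == w_q])."""
    return int(c_pq) * (1 - int(label_p == label_q))

# Toy setup: superpixels 0 and 1 share a cluster, superpixel 2 is in another one.
toy_C = co_cluster_matrix([0, 0, 1])
penalties = {
    ("same cluster", "same label"): binary_potential(1, 1, toy_C[0, 1]),
    ("same cluster", "different label"): binary_potential(1, 0, toy_C[0, 1]),
    ("different cluster", "same label"): binary_potential(1, 1, toy_C[0, 2]),
    ("different cluster", "different label"): binary_potential(1, 0, toy_C[0, 2]),
}
```

Only the combination of a shared cluster and different labels is penalized, so a forged superpixel next to a safe one from another cluster costs nothing.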
Defining the pairwise (binary) potential function as in (10) guarantees that if one part of the image is wrongly detected as forged by the unary potential function, the labeling is penalized because its co-cluster neighbors do not have the same label, so that during the inference process it will be labeled as safe. Conversely, sometimes no good keypoints can be detected in a superpixel (for example, in a smooth texture), so the unary potential labels it as safe, but the binary potential corrects the result. In summary, the unary potential is responsible for detecting manipulated parts of the image and the binary potential avoids producing wrong answers.

C. Energy Minimization: ICM, or Iterated Conditional Modes [8], proposed for MRF energy minimization, is a deterministic algorithm which performs Maximum A Posteriori (MAP) inference over a Markov Random Field. This is done by iteratively maximizing the probability of single variables. In each iteration, the label of the current node is set to the conditional mode, which minimizes the total energy. After a number of iterations, the algorithm converges to a local maximum of the joint posterior probability. ICM proceeds by first obtaining an initialization for the variables and then iterating over the nodes, recalculating for each node the value that minimizes the energy conditioned on the current labels of its neighboring nodes. In this work, we initialize the binary labels (i.e. forged vs. safe) by thresholding the similarity obtained among corresponding superpixels (explained in Section 3.3). In fact, in the initialization stage, a crude and approximate map of the manipulated regions of the input image is given to the algorithm. In repeated iterations of ICM, this initial crude map is refined based on the labeling of the neighboring superpixels. Convergence is reached when the labels remain unchanged in two successive iterations or when the decrease in the energy falls below a given threshold. A sketch of ICM is given in Algorithm 1.

Table 1 The penalizing of energy for the different situations of two superpixels

Cluster / Label    Similar    Different
Similar            no         yes
Different          no         no
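The ICM procedure described above can be illustrated with the following Python sketch. It makes one simplifying assumption that differs from the proposed method: the unary costs are given as a fixed table `unary[s, label]`, whereas in the method the unary term depends on the connected components of the current labeling and would have to be recomputed as labels change. The function name `icm`, its parameters and the toy graph are likewise assumptions.

```python
import numpy as np

def icm(init_labels, unary, edges, same_cluster, alpha=1.0, beta=1.0, max_iter=20):
    """Greedy node-by-node label updates until no label changes."""
    labels = np.array(init_labels, dtype=int)
    neighbors = {s: [] for s in range(len(labels))}
    for p, q in edges:
        neighbors[p].append(q)
        neighbors[q].append(p)
    for _ in range(max_iter):
        changed = False
        for s in range(len(labels)):
            costs = []
            for candidate in (0, 1):
                # Pairwise penalty of Eq. (10) against the current neighbor labels.
                pair = sum(same_cluster[s, t] * (candidate != labels[t])
                           for t in neighbors[s])
                costs.append(alpha * unary[s, candidate] + beta * pair)
            best = int(np.argmin(costs))
            if best != labels[s]:
                labels[s] = best
                changed = True
        if not changed:
            break  # labels unchanged over a full sweep: converged
    return labels

# Toy chain of four co-clustered superpixels; node 2 starts as safe (0).
toy_unary = np.array([[1.0, 0.1], [1.0, 0.1], [0.4, 0.6], [1.0, 0.1]])
toy_edges = [(0, 1), (1, 2), (2, 3)]
refined = icm([1, 1, 0, 1], toy_unary, toy_edges, np.ones((4, 4)))
unrefined = icm([1, 1, 0, 1], toy_unary, toy_edges, np.zeros((4, 4)))
```

In the toy run, node 2 slightly prefers the safe label on its own, but its co-cluster neighbors pull it to the forgery label. When no superpixels share a cluster, the pairwise term vanishes and the initial labeling is kept.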

4 Experimental result

In this section, we provide the experimental results and implementation details. First, the data sets used for the experiments are described; then the evaluation criteria and implementation details are explained. Finally, the results are presented.

4.1 Data sets

In order to evaluate the performance of the proposed method, we employed two data
sets which vary in factors such as image size, image texture, and post-processing
attacks. Amerini et al. [3] prepared the image forgery data sets MICC-F220 and MICC-
F600. These two data sets consist of forged and safe images of nature, humans and
outdoor scenes, with both smooth and rough textures and different kinds of attacks.
Information about the two data sets is provided in Table 2, and details about the attacks
are shown in Table 3.

Table 2 The details of the data sets

Data set     Details

MICC-F220    Total number of 220 images. Consists of 110 forged images and 110 safe ones.
             Scaling, rotation and joint scaling and rotation are the attacks. The size of
             the images varies but the average size is 800×500.
MICC-F600    Total number of 760 images. Consists of 160 forged images, 160 ground truth
             images and 440 safe ones. Rotation with θ = 30° and multiple forgeries (more
             than one forgery in the image) are the attacks (80 rotation attacks and 80
             multiple attacks). The size of the images varies but the average size is
             3000×2300.

Table 3 The details of the different attacks of the MICC-F220 data set. Sx and Sy are the
scaling factors along the x-axis and y-axis

Attack    θ (°)    Sy     Sx
f         0        1.2    1.2
g         0        1.3    1.3
h         0        1.2    1.4
I         10       1.2    1.2
J         20       1.2    1.4

MICC-F220 has a total of 220 images, consisting of 110 forged images and 110 safe
ones. Scaling, rotation and joint scaling and rotation are the attacks. The size of the images
varies but the average size is 800×500.

4.2 Evaluation metrics

To evaluate the efficiency of CMFD methods, two kinds of evaluation are applied: image level
and pixel level. An image-level assessment evaluates the overall response of the method
to the image; in other words, how many forged images are correctly detected as forged
and how many safe ones are wrongly detected as forged. There are two image-level
measurements, TPR (True Positive Rate) and FPR (False Positive Rate), defined as:
\[ \mathrm{TPR} = \frac{|\{\text{images detected as forged being forged}\}|}{|\{\text{forged images}\}|} \tag{11} \]

\[ \mathrm{FPR} = \frac{|\{\text{images detected as forged being original}\}|}{|\{\text{original images}\}|} \tag{12} \]
Although a higher TPR and a lower FPR indicate good performance for the evaluated
method, there are some problems with this evaluation. For instance, a method which
detects all input images as forged has a high TPR although it is not accurate. Likewise, a
method can correctly flag a manipulated image without correctly locating where the forgery
happened, and its TPR is still high. Pixel-level evaluation computes the number of pixels which
are correctly detected. Keypoint-based methods find the forged region by detecting keypoints;
therefore, pixel-level evaluation alone is not adequate either. Consequently, to evaluate a CMFD
method more precisely, one should use a combination of the two. In this paper, we use TPR
and FPR together with a measure used in [16], defined as below:
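\[ \mathrm{FP} = \frac{|\text{matches in } B|}{|B|} \tag{13} \]

\[ \mathrm{FN} = \frac{\left|\text{missed matches in } \bigcup_i R_i\right|}{\left|\bigcup_i R_i\right|} \tag{14} \]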
where R_i is the i-th duplicated region and B is the whole unchanged background. Low
values of FN and FP indicate higher localization accuracy.
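The four measures in (11) to (14) can be computed as in the following Python sketch. It is an illustration under stated assumptions, not the evaluation code used in the paper: the function names are hypothetical, image-level decisions are assumed to be boolean flags, and "matches" are assumed to be given as a binary detection mask, with `region_mask` marking the union of the duplicated regions R_i and its complement playing the role of the background B.

```python
import numpy as np

def image_level_rates(predicted_forged, is_forged):
    """TPR and FPR over a set of images, following Eqs. (11) and (12)."""
    predicted = np.asarray(predicted_forged, dtype=bool)
    truth = np.asarray(is_forged, dtype=bool)
    tpr = (predicted & truth).sum() / truth.sum()
    fpr = (predicted & ~truth).sum() / (~truth).sum()
    return tpr, fpr

def pixel_level_errors(detected_mask, region_mask):
    """FP and FN for one image, following Eqs. (13) and (14)."""
    detected = np.asarray(detected_mask, dtype=bool)
    regions = np.asarray(region_mask, dtype=bool)   # union of the R_i
    background = ~regions                           # B
    fp = (detected & background).sum() / background.sum()
    fn = (~detected & regions).sum() / regions.sum()
    return fp, fn
```

A detector that flags every image as forged gets TPR = 1 but also FPR = 1 from `image_level_rates`, which is the failure case the text describes; the pixel-level pair exposes poor localization that the image-level pair cannot see.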

4.3 Set up

The performance of the proposed method is highly dependent on its parameters. α and β
used in (6) are regularization parameters for the unary and binary potential functions,
respectively. As mentioned earlier, ICM is an iterative optimization method whose answer gets
closer to the optimum after each iteration. As a result, in the first iterations α should
be greater than β: in these iterations many regions are detected as forged, so the decision
of the binary term about neighbors is not yet reliable. As the iterations continue, the unary
decision gets more precise and β should be increased. In 80 percent of
cases, the algorithm converges in fewer than 30 iterations. In our implementation β starts from
0 and increases by 0.05 at each iteration until it reaches 0.5.

Table 4 Set-up values for an image of size M×N

Parameter    Value
α            1
β            0.05
B            12+(0.01 MN)
γ            0.2
ths          1.7
c            0.05
S            round(0.1 ∗ sqrt(M ∗ N))
The block size for computing the PCT feature, defined earlier as B, follows [38] and is
computed from the size of the image (M and N denote the length and width of the image).
Segmentation by the SLIC algorithm needs two parameters. The region size, denoted by
S, is the starting size of the superpixels, and the regularizer γ trades off appearance for
spatial regularity when clustering (a larger value results in more spatial regularization).
S is also set following [38]. The other parameters have been obtained by experiments.
Table 4 lists the parameters used in the implementation.
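The two size-dependent or iteration-dependent settings above can be written out as a short Python sketch. The function names are hypothetical, and counting iterations from 0 is an assumption; the increments (0.05 per iteration, capped at 0.5) and the region-size formula are taken from the text and Table 4.

```python
import math

def beta_schedule(iteration, step=0.05, beta_max=0.5):
    """Weight of the binary potential at a given ICM iteration (0-based, assumed)."""
    return min(iteration * step, beta_max)

def slic_region_size(M, N):
    """Starting superpixel size S = round(0.1 * sqrt(M * N)) from Table 4."""
    return round(0.1 * math.sqrt(M * N))
```

With this schedule the binary term is switched off in the very first sweep and reaches its full weight after ten sweeps, so early decisions are driven by the unary term alone, as the paragraph above argues they should be.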

4.4 Experiment results on MICC-F220 and MICC-F600

In this section, we compare the proposed method with three recent methods,
Zandi [38], Silva [29] and Bhanu [9], on the MICC-F220 and MICC-F600 data sets. Zandi
[38] is an iterative method which uses its own special keypoints. In this method, keypoints
are denser in the smooth regions of the image than in other regions, so it can detect and
localize forgeries in smooth regions. The Silva [29] method investigates the image at different
resolutions. This method also works with keypoints. As our method works with segments
of the image and not keypoints, we also report the comparison with a segmentation-based
method, Bhanu [9]. This method first divides the image into segments and then, in the
matching stage, after comparing the features of each patch, performs a transformation
estimation using RANSAC to find the final matched segments. Table 5 presents the comparison
of FPR and TPR.
Table 5 The comparison of overall performance on MICC-F220

Method        TPR (%)    FPR (%)
Proposed      62         40
Zandi [38]    53         64
Silva [29]    64         57
Bhanu [9]     38         66

FPR counts the safe images which are wrongly detected as forged. In terms of FPR,
our proposed method has the fewest wrong detections. In general, it can be claimed
that the proposed method makes fewer errors than the others. The main reason is
that the decision about each part of an image is repeatedly revised during
the energy minimization; in addition, the binary potential function helps to make a
better decision for each part of the image. In terms of TPR, our proposed method performs
better than Silva and Bhanu, and it is close to Zandi.

Fig. 5 The total average of FN and FP for rotation and scaling/rotation attacks on MICC-F220
Figure 5 compares the FP and FN of the proposed method with those of the three other
methods. FP measures the errors in regions outside the forged regions and FN measures the
missed forged regions; lower FP and FN therefore mean more precise localization. As
illustrated in Fig. 5, the proposed method finds the forgeries at a lower error rate. Although
Zandi performs better in FN, its FP is mostly greater than ours, which shows that
it wrongly detects more parts of the background. Furthermore, all other methods suffer from
low performance under scaling and joint scaling-rotation. As demonstrated, our proposed
method detects forgeries more precisely in the case of joint attacks (I and J) and has a lower
FP.

Table 6 The comparison of overall performance on MICC-F600

Method        TPR (%)    FPR (%)
Proposed      84.37      11.81
Zandi [38]    82.12      16.33
Silva [29]    75.00      13.63
Bhanu [9]     53.33      49.63

Table 7 The comparison of FP on MICC-F600; r-s is the joint rotation-scale attack

Method        Rotation    r-s      Multiple
Proposed      8.11        9.60     8.63
Zandi [38]    18.35       10.15    11.48
Silva [29]    27.90       23.87    18.80
Bhanu [9]     32.51       41.80    36.19

Table 8 The comparison of FN on MICC-F600; r-s is the joint rotation-scale attack

Method        Rotation    r-s      Multiple
Proposed      10.82       37.60    8.63
Zandi [38]    9.13        79.15    10.13
Silva [29]    23.71       39.50    13.80
Bhanu [9]     35.10       51.18    49.06

Fig. 6 The results of the proposed method on simple copy-move forgery. Left to right: original image, CMF
image, proposed method, Bhanu method and ground truth

Table 9 Average execution time per image in seconds

Average size    Proposed    Zandi [38]    Silva [29]    Bhanu [9]
800×500         11.86       9.48          13.04         12.19
3000×2300       103.96      191.16        208.22        199.02

Table 6 shows the overall performance of the four methods on MICC-F600. In this data
set, there are 160 forged images, which are divided into three types of attack: rotation
(only with an angle of 30 degrees), scaling-rotation (only with an angle of 30 degrees and scaling
factor 1200) and multiple forgeries. The average image size for this data set is 3000×2300,
which makes it possible to assess all methods on large images. The bold numbers in the tables
show the best performance among all methods.
As can be seen in Table 6, our proposed method has the highest TPR and the lowest
FPR. In comparison with Bhanu [9], which is also a segmentation-based method, our
proposed method performs far better on both criteria. Tables 7 and 8 report
the average FP and FN error rates for all methods. The proposed method has the lowest FP
for every attack type, which indicates that it finds the forgeries with higher precision.
Figure 6 shows some of the experimental results on the MICC-F220 data set. The third
column shows the results of the proposed method and the fourth shows Bhanu's method,
which is also a segmentation-based method.

4.5 Execution time comparison

To make the comparison comprehensive, we also report execution times for MICC-F220 and
MICC-F600. In MICC-F600, the image size varies from 800×533 to 3872×2592
and, running on the same machine, all methods became dramatically slower. The comparison of
average run times is given in Table 9. The results show that, as the resolution increases,
the proposed method slows down less than the others because it works with superpixels
and inference.
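The slowdown claim can be checked directly from Table 9 by dividing each method's average time on the large images by its time on the small ones. The Python sketch below does this arithmetic; the dictionary layout and the function name are illustrative, and the ratio is only a rough indicator because the two data sets differ in more than resolution.

```python
# Average seconds per image from Table 9: (800x500, 3000x2300).
TABLE_9 = {
    "Proposed": (11.86, 103.96),
    "Zandi [38]": (9.48, 191.16),
    "Silva [29]": (13.04, 208.22),
    "Bhanu [9]": (12.19, 199.02),
}

def slowdown_factors(times):
    """Ratio of large-image time to small-image time for each method."""
    return {method: large / small for method, (small, large) in times.items()}

factors = slowdown_factors(TABLE_9)
for method, factor in sorted(factors.items(), key=lambda item: item[1]):
    print(f"{method}: {factor:.1f}x slower on the larger images")
```

The proposed method is slightly slower than Zandi on the small images but has the smallest ratio of the four, which matches the statement that it slows down less as resolution grows.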

5 Conclusion and future works

In this research, we presented a novel approach to the copy-move forgery detection
problem which is based on a Markov Random Field. We solve CMFD as a
maximum a posteriori labeling problem in a Markov network of superpixels obtained from
the input image. This formulation helps to gain a more global insight and improves the
precision. The energy minimization is tackled using ICM, which is an iterative method
guaranteed to converge. In each iteration, the information about the forged regions
becomes more complete and the answer gets closer to the desired solution. The reported results
suggest that the proposed method can detect forged regions efficiently and with acceptable
accuracy.
Our main goal in this study was to encourage a probabilistic framework for CMFD. The
introduced method is a proof of concept and may be improved in many aspects, including
using more complex inference methods or graph structures, learning the parameters of the energy
terms from data, or adding a post-processing step which compensates for deficiencies of the
initial over-segmentation obtained from SLIC.

References

1. Achanta R, Shaji A, Smith K, Lucchi A, Fua P (2010) SLIC Superpixels EPFL Technical Report 149300
2. Alkawaz MH, Sulong G, Saba T, Rehman A (2018) Detection of copy-move image forgery based on
discrete cosine transform. Neural Comput Applic 30(1):183–192
3. Amerini I, Ballan L, Caldelli R, Del Bimbo A, Serra G (2011) A SIFT-based forensic method for copy-
move attack detection and transformation recovery. IEEE Trans Inf Forensics Secur 6(3):1099–1110
4. Ardizzone E, Bruno A, Mazzola G (2015) Copy-move forgery detection by matching triangles of
keypoints. IEEE Trans Inf Forensics Secur 10(10):2084–2094
5. Bashar M, Knoda N, Ohnishi K, Mori (2010) Exploring duplicated regions in natural images, IEEE
Transactions on Image Processing in press
6. Bay H, Ess A, Tuytelaars T, Van Gool L (2008) Speeded-up robust features (SURF). Comput Vis Image
Understand 110(3):346–359
7. Bayram S, Taha Sencar H, Memon N (2009) In IEEE ICASSP, Washington, DC, USA
8. Besag J (1986) On the statistical analysis of dirty pictures. J R Statist Soc 48(3):259–302
9. BhavyaBhanu MP, ArunKumar MN (2017) Copy-move forgery detection using segmentation. In: 11th
International Conference on Intelligent Systems and Control (ISCO), 5–6 Jan. 2017, Coimbatore,
pp 228–224
10. Bi X, Pun C-M, Yuan X-C (2018) Multi-scale feature extraction and adaptive matching for copy-move
forgery detection. Multimed Tools Appl 77:363–385
11. Chen B, Yu M, Su Q, Li L (2018) Fractional quaternion cosine transform and its application in color
image copy-move forgery detection. Multimedia Tools and Applications
12. Dixit R, Naskar R, Mishra S (2017) Blur-invariant copy-move forgery detection technique with improved
detection accuracy utilising SWT-SVD. IET Image Process 11(5):301–309
13. Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of
images. IEEE Trans Pattern Anal Mach Intell 6:721–741
14. Haghighi B, Taherinia AH, Harati A (2018) TRLH: Fragile and blind dual watermarking for image
tamper detection and self-recovery based on lifting wavelet transform and halftoning technique. J Vis
Commun Image Represent 50:49–64
15. Huang Y, Lu W, Sun W, Long D (2011) Improved DCT-based detection of copy-move forgery in images.
Forensic Sci Int 206:178–184
16. Koller D, Friedman N (2009) Probabilistic graphical models, Massachusetts Institute of Technology
17. Lee J, Chang CH, Chen W (2015) Detection of copy–move image forgery using histogram of Information
Sciences
18. Li J, Li X, Yang B, Sun X (2015) Segmentation-based image copy-move forgery detection scheme. IEEE
Trans Inf Forensics Secur 10(3):507-518
19. Li L, Li S, Zhu H, Chu S-C, Roddick JF, Pan J-S (2013) An efficient scheme for detecting copy-move
forged images by local binary patterns. J Inf Hiding Multimedia Signal Process 4(1):46–56
20. Li Y (2013) Image copy-move forgery detection based on polar cosine transform and approximate
nearest neighbor searching. Forensic Sci Int 224:59–67
21. Li Y, Zhou J (2018) Fast and effective image copy-move forgery detection via hierarchical feature point
matching. IEEE Trans Inf Forensics Secur
22. Lowe D (2004) Distinctive image features. International Journal of Computer Vision
23. Minakshi K (2003) Digital image processing. In: Satellite remote sensing and GIS applications in
agricultural meteorology, world meteorological organization publishing, pp 81–102
24. Pan X, Lyu S (2010) Region duplication detection using image feature matching. IEEE Trans Inf
Forensics Secur 5(4):857–867
25. Park C, Choeh JY (2017) Fast and robust copy-move forgery detection based on scale-space
representation. Multimed Tools Appl 77(13):16795–16811
26. Rangarajan A, Chellappa R (1995) Markov random fields in image processing, Department of Computer
Science, Yale University
27. Ryu S-J, Kirchner M, Lee M-J, Lee H-K (2013) Rotation invariant localization of duplicated image
regions based on Zernike moments. IEEE Trans Inf Forensics Secur 8(8):1355–1370
28. Sekhar R, Shaji R (2016) A study on segmentation-based copy-move forgery detection using DAISY
descriptor. In: Proceedings of the international conference on soft computing systems, pp 223–233
29. Silva E, Carvalho T, Ferreira A, Rocha A (2015) Going deeper into copy-move forgery detection:
Exploring image. J Vis Commun Image R 29(C):16–32
30. Soni B, Das PK, Thounaojam DM (2019) Geometric transformation invariant block based copy-move
forgery detection using fast and efficient hybrid local features. Journal of Information Security and
Applications

31. Swaminathan A, Wu M, Liu KR (2008) Digital image forensics via intrinsic fingerprints. IEEE Trans
Inf Forensics Secur 3:101–117
32. Tralic D, Zupancic I, Grgic S, Grgic M (2013) CoMoFoD - new database for copy-move forgery
detection. In: ELMAR, 55th international symposium, pp 49–54
33. Vedaldi A, Fulkerson B (2008) VLFeat: An open and portable library of computer vision algorithms.
[Online]. Available: http://www.vlfeat.org
34. Satheesh S, Thomas A, Devasia A (2017) Image forensic, copy-move forgery, SURF. Int J Eng Trends
Technol (IJETT) V45(6):285–287
35. Yang B, Sun X, Guo H, Xia ZH, Chen X (2018) A copy-move forgery detection method based on
CMFD-SIFT. Multimed Tools Appl 77(1):837–855
36. Yang F, Li J, Lu W, Weng J (2017) Copy-move forgery detection based on hybrid features. Eng
Appl Artif Intel 59:73–83
37. Yeap YY, Sheikh UU, Rahman A (2018) Image forensic for digital image copy move forgery detection.
In: Signal processing and its applications (CSPA), Malaysia, Malaysia
38. Zandi M, Aznave A, Talebpour A (2016) Iterative copy-move forgery detection based on a new interest
point detector. IEEE Trans Inf Forensics Secur 11(11):2499–2512

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.

Behnaz Elhaminia received her BSc. and MSc. degrees in computer engineering and AI & Robotics from
Ferdowsi University of Mashhad (FUM), Mashhad, Iran in 2014 and 2018 respectively. She is currently a
researcher in MVLab in FUM, Mashhad, Iran. Her research interests include Machine Learning, Computer
Vision and Probabilistic Graphical Models.

Ahad Harati received his BSc. and MSc. degrees in computer engineering and AI & Robotics from Amirkabir
University of Technology and Tehran University, Tehran, Iran in 2000 and 2003 respectively. He was awarded
with Ph.D. in Manufacturing Systems & Robotics from Swiss Federal Institute of Technology (ETHZ),
Zurich, Switzerland, in 2008. He is currently an Assistant Professor in Computer Engineering department,
Ferdowsi University of Mashhad. His main area of research is Robot Perception, Probabilistic Models and
Machine Learning.

Amirhossein Taherinia received the B.S. degree in computer engineering from Ferdowsi University of
Mashhad, Mashhad, Iran, in 2004, and the M.Sc. and Ph.D. degrees in computer engineering in 2006 and
2012 respectively, from Sharif University of Technology, Tehran, Iran. Since 2013 he has been an assistant
professor at the department of computer engineering, Ferdowsi University of Mashhad (FUM). His research
interests include multimedia security, data hiding and multimedia signal processing.
