A Novel Steganalysis Algorithm of Phase Coding in Audio Signal
Abstract
Audio steganalysis has attracted increasing attention
recently, and phase steganalysis is one of its most
challenging research fields. In this paper, a novel
algorithm to detect phase coding steganography in
audio signals is proposed. It is based on analysis of the
phase discontinuities and can be described as follows.
First, it takes the FFT of a selected segment of the
audio, unwraps the phase of each sample, and
extracts the phase difference between neighboring
samples. Second, to monitor the change in phase
difference, it calculates five statistical features of
the phase difference for steganalysis. Third, an SVM
classifier is used for classification. A total of 800
audio files of various types are used for training and
testing in our experiments. With various embedding
parameters for the training and testing audio, the
proposed algorithm achieves good classification,
with a correct detection rate of up to 95%.
1. Introduction
Recently, digital watermarking and data hiding
have become a vibrant research area. Various kinds of
multimedia files can be downloaded freely from the
Internet, and terrorists may see this as an
opportunity to communicate secretly with each other.
Thus, various steganalysis methods have emerged as
means to deter covert communication by terrorists.
Steganalysis is the technique of deciding whether a
medium carries hidden messages and, if possible,
determining what the hidden messages are.
In addition to preventing secret communication among
terrorists, steganalysis serves as a way to judge the
security performance of steganography techniques.
Audio steganography is a useful means for
transmitting covert battlefield information via an
innocuous cover audio signal. Phase coding is a coding
scheme that introduces the least perceptible noise to the
host. The offset of the phase of a sound is irrelevant to
$$\phi_0(k) = \begin{cases} \phi_{-1}(k), & \text{if } \delta(m) = -1 \\ \phi_{+1}(k), & \text{if } \delta(m) = +1 \end{cases} \tag{2}$$
$$\phi_i(k) = \phi_0(k) + \sum_{j=1}^{i} \Delta\phi_j(k) \tag{3}$$
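The accumulation in Eq. (3) can be illustrated with a short sketch: the phase differences between adjacent segments are kept, the phase of the first segment is replaced, and the later phases are rebuilt by summation. The function name `rebuild_phases` and the toy values are hypothetical and are not taken from the paper.

```python
import numpy as np

def rebuild_phases(phases, new_phi0):
    """Rebuild a (segments, bins) phase matrix from a new first-segment phase.

    The differences between adjacent segments are preserved, so only the
    first row is replaced and the rest follow by cumulative summation.
    """
    phases = np.asarray(phases, dtype=np.float64)
    deltas = np.diff(phases, axis=0)                      # delta_phi_j(k), j >= 1
    rebuilt = np.empty_like(phases)
    rebuilt[0] = new_phi0                                 # substituted phi_0(k)
    rebuilt[1:] = new_phi0 + np.cumsum(deltas, axis=0)    # phi_0(k) + sum of deltas
    return rebuilt

# Toy phase matrix: 3 segments, 2 frequency bins (made-up values).
phases = np.array([[0.1, 0.2], [0.4, 0.1], [0.9, -0.3]])
new_phi0 = np.array([np.pi / 2, -np.pi / 2])
rebuilt = rebuild_phases(phases, new_phi0)
print(rebuilt)
```

Because only the first row changes, the relative phase between consecutive segments is the same before and after the substitution.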
$$f(\omega) = \tan^{-1}\left(\frac{\operatorname{Im}(S(\omega))}{\operatorname{Re}(S(\omega))}\right), \quad \text{where } 0 < f(\omega) < 2\pi \tag{4}$$
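As a small illustration of Eq. (4), the sketch below computes the phase of one complex spectral value and folds it into the range from 0 to 2π. It assumes that the quadrant-aware (two-argument) arctangent is intended, since a plain arctangent of Im/Re cannot distinguish opposite quadrants. The function name `phase_0_2pi` is hypothetical.

```python
import math

def phase_0_2pi(s: complex) -> float:
    """Phase of a complex value, folded from (-pi, pi] into [0, 2*pi)."""
    angle = math.atan2(s.imag, s.real)   # quadrant-aware arctangent of Im/Re
    return angle % (2.0 * math.pi)       # negative angles wrap to the upper half

print(phase_0_2pi(1j))    # a quarter turn
print(phase_0_2pi(-1j))   # three quarters of a turn
```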
4. Experimental results
The proposed phase steganalysis technique is
implemented and tested on a set of 800 16-bit WAV files
(44.1 kHz, 20 s). The audio files include music
(piano, symphony, violin, and rock), songs,
speech (male and female), natural noise, etc. In phase
coding, there are five embedding parameters: the
embedded message, block length N, subblock length
n, phase modifier, and frequency slots per bit.
In the phase coding algorithm, we must be concerned with the
phase dispersion caused by a break in the relationship of
the phases between the frequency components.
Minimizing phase dispersion constrains the data rate of
phase coding. One cause of phase dispersion is the
substitution of the phase $\phi_0(k)$ with the binary code. The
magnitude of the phase modifier needs to be close to
the original value in order to minimize dispersion.
The difference between the phase modifier states
should be maximized in order to minimize the
susceptibility of the encoding to noise. In our modified
phase representation, a 0-bit is $-\pi/2$ and a 1-bit
is $+\pi/2$.
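To see why a large separation between the two phase modifier states helps, the sketch below maps bits to two phase states separated by π and decodes a noisy phase by its sign. The assignment of the 0-bit to −π/2 and the 1-bit to +π/2, the noise model, and the function names are assumptions for illustration only.

```python
import math
import random

# Assumed mapping: two states separated by pi, the largest possible gap.
PHASE_FOR_BIT = {0: -math.pi / 2, 1: math.pi / 2}

def encode(bits):
    """Map each bit to its phase modifier state."""
    return [PHASE_FOR_BIT[b] for b in bits]

def decode(phases):
    """Pick the nearest state, which reduces to a sign decision."""
    return [1 if p > 0 else 0 for p in phases]

rng = random.Random(0)
bits = [1, 0, 0, 1, 1, 0]
# Phase noise bounded by 1.0 rad, which is below the pi/2 decision margin.
noisy = [p + rng.uniform(-1.0, 1.0) for p in encode(bits)]
decoded = decode(noisy)
print(decoded == bits)
```

Any phase error smaller than π/2 leaves the sign unchanged, so the bits are recovered; states placed closer together would tolerate less noise.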
Another source of distortion is the rate of change of
the phase modifier. With an N-point DFT, we can
theoretically use up to N frequency slots of the phase
matrix for coding. However, because of the noise in the
decoded phase in a typical sound waveform, it is
almost impossible to code one bit per frequency slot.
Moreover, modifying the phase of every
frequency component causes severe phase
dispersion. By changing the phase more slowly and
transitioning between phase changes, the audible
distortion is greatly reduced. Here we set the interval of
phase modification to 16 in each subblock.
In addition, to simplify the calculation, we analyze
only the segment of 1024 samples that has the most power
in each audio file. In our experiment, we use 200
clean audio files and their stego versions as input to train
the SVM, and test on another 600 clean audio files and their
stego versions. The block length N and subblock length n are
N=512, n=128; N=512, n=256; N=1024, n=128;
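The segment selection and the phase-difference features can be sketched as follows. This is a minimal sketch under stated assumptions: candidate segments are taken as non-overlapping blocks, and the five statistics (mean, variance, skewness, kurtosis, maximum absolute value) are placeholders, because the actual feature set is defined in Sec. 3.2 and is not listed here. The function names and the toy signal are hypothetical.

```python
import numpy as np

def max_power_segment(x, length=1024):
    """Return the non-overlapping block of `length` samples with the largest energy.

    Assumption: blocks do not overlap; the text does not say how candidates are chosen.
    """
    x = np.asarray(x, dtype=np.float64)      # avoids int16 overflow when squaring
    n_blocks = len(x) // length
    blocks = x[: n_blocks * length].reshape(n_blocks, length)
    energy = np.sum(blocks ** 2, axis=1)
    return blocks[np.argmax(energy)]

def phase_difference_features(segment):
    """Five summary statistics of the unwrapped phase difference (assumed feature set)."""
    phase = np.unwrap(np.angle(np.fft.fft(segment)))   # FFT phase, unwrapped
    diff = np.diff(phase)                              # difference between neighbors
    centered = diff - diff.mean()
    std = diff.std()
    skewness = np.mean(centered ** 3) / std ** 3
    kurtosis = np.mean(centered ** 4) / std ** 4
    return np.array([diff.mean(), diff.var(), skewness, kurtosis, np.max(np.abs(diff))])

# Toy signal: weak noise with one loud 1024-sample block (a stand-in for real audio).
rng = np.random.default_rng(0)
signal = 0.1 * rng.standard_normal(4096)
signal[2048:3072] *= 10.0
segment = max_power_segment(signal)
features = phase_difference_features(segment)
print(features.shape)  # (5,)
```

Each audio file then contributes one five-element feature vector to the classifier.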
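
The train and test audios use the five statistical features
derived from each plot, as described in Sec. 3.2. The SVM
classifier solves

$$\min_{w,b,\xi} \; \frac{1}{2} w^T w + C \sum_{k=1}^{m} \xi_k \tag{5}$$

subject to $y_k \left( w^T \phi(x_k) + b \right) \ge 1 - \xi_k$, $\xi_k \ge 0$, $k = 1, \ldots, m$,

where training data are mapped to a higher
dimensional space by the function $\phi$, and $C$ is a
penalty parameter on the training error. For any test
instance $x$, the decision function (predictor) is

$$f(x) = \operatorname{sgn}\left( w^T \phi(x) + b \right).$$

The kernel function is defined as

$$K(x_i, x_j) \equiv \phi(x_i)^T \phi(x_j), \tag{6}$$

and the kernel used is $K(x_i, x_j) = \exp\left( -\gamma \| x_i - x_j \|^2 \right)$, $\gamma > 0$.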
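The exponential kernel and a kernel-based decision rule can be sketched as follows. The decision function here uses the standard dual form $f(x) = \operatorname{sgn}\left(\sum_k \alpha_k y_k K(x_k, x) + b\right)$, which is not written out in the text. The support vectors, coefficients, bias, and kernel width are made-up toy values, not trained results, and the function names are hypothetical.

```python
import math

def rbf_kernel(xi, xj, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)

def decide(x, support_vectors, labels, alphas, b, gamma):
    """Dual-form SVM decision: sign of the kernel-weighted sum plus bias.

    A score of exactly zero is assigned to the +1 class by convention here.
    """
    score = sum(a * y * rbf_kernel(sv, x, gamma)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if score + b >= 0 else -1

# Toy model with one support vector per class (illustrative values only).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
b, gamma = 0.0, 0.5

near_negative = decide((0.1, -0.1), support_vectors, labels, alphas, b, gamma)
near_positive = decide((1.9, 2.1), support_vectors, labels, alphas, b, gamma)
print(near_negative, near_positive)
```

In practice the coefficients and bias come from training, for example with LIBSVM, on the feature vectors of the clean and stego audio.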
5. Conclusion
Phase coding is one of the most effective coding
methods in terms of the signal-to-perceived-noise ratio.
In this paper, we present a novel method to detect
messages hidden by typical phase coding in audio
signals. We use statistical analysis of the phase difference
to monitor the phase discontinuities and use an SVM
classifier to capture the faint changes of phase caused
by embedding. Experiments are conducted on a set of
various types of audio, and the correct classification
rate reaches 95%.
To monitor the statistical changes caused by
other phase coding algorithms, future work may focus
on finding more effective features in the audio signal.
The choice of an appropriate classifier also needs further study.
References
[1] W. Bender, D. Gruhl, N. Morimoto, et al., Techniques for
data hiding, IBM Systems Journal, 1996, vol. 35, no. 3&4, pp. 313-336.
[8] C.C. Chang, C.J. Lin, "LIBSVM", library for
support vector machines,
http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2007.