Secure Client-Side ST-DM Watermark Embedding


CLIENT SIDE EMBEDDING FOR ST-DM WATERMARKS

A. Piva, T. Bianchi, A. De Rosa

University of Florence, Department of Electronics and Telecommunications, Via S. Marta 3, 50139 Firenze, Italy
ABSTRACT

Client side watermark embedding schemes have been proposed as a possible solution for copyright protection in large scale content distribution environments. In this framework, we propose a lookup-table based secure embedding system, designed for the Spread Transform Dither Modulation (ST-DM) watermarking algorithm, that outperforms Spread Spectrum based systems.

Index Terms: content distribution, secure watermark embedding, multiple decryption, ST-DM informed embedding, fingerprinting

1. INTRODUCTION

New distribution channels such as digital music downloads and video-on-demand services present new challenges to the design of content protection measures for preventing or deterring copyright violations by malevolent users. One of the adopted solutions is digital watermarking technology [1]. In the distribution models we are analyzing, this technique is used to embed into each copy of the content a unique code identifying a particular user or a specific device that receives it. When unauthorized published content is found, the detection of the hidden watermark makes it possible to trace the user who has redistributed the content.

Current content distribution systems are based on a client-server architecture, where the watermark embedding is usually carried out by a trusted server before releasing the content to the final user. However, in large scale distribution systems the server may become overloaded, since the computational burden grows linearly with the number of users. In addition, since the distribution of individually watermarked copies requires point-to-point connections, bandwidth requirements can become prohibitive. A possible solution consists in using client-side watermark embedding: the server can send the same copy of the content to all the interested users through broadcasting, and each client is in charge of embedding a watermark identifying the received copy.
Since the client is untrusted, the users should not have access to the original content or to the secret information required to embed the watermark. In secure watermark embedding schemes, the server transmits the same encrypted version of the original content to all the clients, but a client-specific decryption key makes it possible to decrypt the content and, at the same time, to implicitly embed a watermark, so as to obtain a uniquely watermarked version of the content.

In the literature, several approaches for secure embedding have been proposed. Here we are interested in techniques based on encryption systems that allow the use of multiple decryption keys, which decrypt the same ciphertext to slightly different plaintexts. The difference between the original and the decrypted content thus represents the hidden watermark. The first scheme following this approach was proposed by Anderson et al. [2], who designed a special stream cipher, called Chameleon, which decrypts the ciphertext in slightly different ways. During encryption, a sequence of indices is used to select four entries from a look-up table (LUT) for each plaintext element. These entries are XORed with the plaintext to form the ciphertext. The decryption is identical to encryption except for the use of a decryption LUT, which superimposes some errors onto the content, thus leaving a unique watermark. Generalizations of Chameleon, suitable for embedding spread spectrum watermarks, have also been proposed [3, 4].

The analysis of the related literature reveals that several solutions exist for secure watermark embedding; however, up to now they have been applied only to Spread-Spectrum (SS) watermarking algorithms. In this paper, we propose a LUT based secure embedding system designed for the Spread Transform Dither Modulation (ST-DM) algorithm [5], which belongs to the class of data hiding schemes known as informed embedding algorithms or host-interference rejecting methods.

The work described in this paper has been partially supported by the European Commission through the IST Programme under Contract no 034238 - SPEED and by the Italian Research Project (PRIN 2007): "Privacy aware processing of encrypted signals for treating sensitive information". The information in this document reflects only the authors' views, is provided as is and no guarantee or warranty is given that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability.

2. BACKGROUND

The LUT based secure embedding proposed in [4] works as follows. The distribution server generates a long-term master encryption look-up table $E$ of size $T$, whose entries $E[t]$ are independently and randomly generated according to a Gaussian distribution. The LUT $E$ will be used to encrypt the content to be distributed to the $K_U$ clients.
Next, for the k-th client, the server generates a personalized watermark LUT $W_k$ according to a desired probability distribution, and builds a personalized decryption LUT $D_k$ by combining componentwise the master encryption LUT $E$ and the watermark LUT $W_k$:

$$D_k[t] = -E[t] + W_k[t], \qquad t = 0, 1, \ldots, T-1. \qquad (1)$$
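As a rough illustration of Eq. (1), and of the additive encryption and joint decryption that this section goes on to describe, the following minimal sketch builds the three LUTs and checks that decryption with $D_k$ leaves only the sum of the selected $W_k$ entries on top of the host. It is not the implementation of [4]: the function names, the toy sizes and the use of Python's `random.Random` as a stand-in for a keyed pseudo-random generator are all assumptions, and the Gaussian parameters are treated as standard deviations.

```python
import random

def make_luts(T, sigma_e, sigma_w, rng):
    """Build E, W_k and D_k = -E + W_k as in Eq. (1)."""
    E = [rng.gauss(0.0, sigma_e) for _ in range(T)]
    W_k = [rng.gauss(0.0, sigma_w) for _ in range(T)]
    D_k = [-e + w for e, w in zip(E, W_k)]
    return E, W_k, D_k

def lut_indices(session_key, M, R, T):
    """Indices t_ij in [0, T-1], reproducible from the session key.
    random.Random is only a stand-in for a cryptographic keyed generator."""
    prng = random.Random(session_key)
    return [[prng.randrange(T) for _ in range(R)] for _ in range(M)]

def encrypt(x, E, t):
    """Server side: add R entries of E to every feature."""
    return [x_i + sum(E[t_ij] for t_ij in t_i) for x_i, t_i in zip(x, t)]

def decrypt(c, D_k, t):
    """Client side: add R entries of D_k to every encrypted feature."""
    return [c_i + sum(D_k[t_ij] for t_ij in t_i) for c_i, t_i in zip(c, t)]

# Toy run with hypothetical sizes.
rng = random.Random(1)
T, R, M = 256, 4, 8
E, W_k, D_k = make_luts(T, sigma_e=100.0, sigma_w=0.01, rng=rng)
x = [rng.uniform(-50.0, 50.0) for _ in range(M)]
t = lut_indices(session_key=42, M=M, R=R, T=T)

c = encrypt(x, E, t)
y_k = decrypt(c, D_k, t)

# The watermark actually embedded: w_{k,i} = sum_j W_k[t_ij].
w_k = [sum(W_k[t_ij] for t_ij in t_i) for t_i in t]
residual = max(abs((y_i - x_i) - w_i) for y_i, x_i, w_i in zip(y_k, x, w_k))
print("max |(y_k - x) - w_k| =", residual)
```

The client never sees $E$ or $W_k$ separately in this sketch, only their combination $D_k$, which is the property the scheme relies on.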

978-1-4244-5654-3/09/$26.00 ©2009 IEEE, ICIP 2009

The personalized LUTs are then transmitted once to each client over a secure channel. A content, represented as a vector $x$ of size $M$, is encrypted by adding to each element $R$ entries of the LUT $E$, pseudo-randomly selected according to a session key $sk$. The encrypted content $c$ is sent to all the authorized clients along with the session key $sk$. The k-th client can decrypt $c$ by using his/her personalized decryption LUT $D_k$, with the final effect that a spread-spectrum watermark sequence is embedded into the decrypted content $y_k$ through an additive rule. In detail, driven by the session key $sk$, a set of $M \times R$ values $t_{ij}$ in the range $[0, T-1]$ is generated, where $0 \le i \le M-1$, $0 \le j \le R-1$. Each feature $x_i$ is encrypted by adding $R$ entries of $E$, obtaining the encrypted feature $c_i$:
$$c_i = x_i + \sum_{j=0}^{R-1} E[t_{ij}]. \qquad (2)$$

Joint decryption and watermarking is accomplished by reconstructing with $sk$ the same sequence of indices $t_{ij}$ and by adding $R$ entries of $D_k$ to each encrypted feature $c_i$:

$$y_{k,i} = c_i + \sum_{j=0}^{R-1} D_k[t_{ij}] = x_i + \sum_{j=0}^{R-1} W_k[t_{ij}] = x_i + w_{k,i}. \qquad (3)$$

The result of this operation is the sequence of watermarked content features $y_k$ identifying the k-th user.

2.1. Spread Transform Dither Modulation

The ST-DM algorithm belongs to the wider class of Quantization Index Modulation (QIM) watermarking [5]. According to the QIM approach, watermark embedding is achieved through the quantization of the host feature vector $x$ on the basis of a set of predefined quantizers, where the particular quantizer depends on the to-be-hidden message. In the case of ST-DM, the correlation between the host feature vector $x$ and a reference spreading signal $s$ is computed as $r^x = \langle x, s \rangle = \sum_{i=1}^{M} x_i s_i$; this correlation is then quantized by applying to it either a quantizer $Q_0$ or a quantizer $Q_1$, depending on the to-be-hidden bit, obtaining the quantized correlation $r^w$. The watermarked features are then:

$$y = x + (r^w - r^x)\, s. \qquad (4)$$

To recover the embedded bit, a minimum distance decoder is applied to the correlation $r'$ of the watermarked and possibly attacked features $y'$ with the vector $s$ [6]. The ST-DM approach can be extended in such a way that the host features are projected not only along one direction, but onto a vector subspace, which introduces an additional degree of freedom in the design of the scheme.

3. ST-DM SECURE EMBEDDING

Starting from an original vector composed of $M$ features, an $M \times M_d$ projection matrix $S = (s_1, s_2, \ldots, s_{M_d})$ whose columns are orthogonal is generated. The host features are projected according to $S$, which, differently from the traditional ST-DM, needs to be known to the clients; in order to add a level of secrecy, only $L$ out of $M_d$ projections will be quantized to embed the watermark, where the $L$ directions are kept secret from the clients. Let us indicate by $A$ the indexes corresponding to the $L$ directions where the watermark will be introduced. To represent that only $L$ out of $M_d$ projections are quantized, we will resort to an $M \times L$ matrix $S_A$ denoting a partition of $S$ obtained by picking the columns whose indexes are in $A$. To embed the watermark, we use not just two quantizers, but a set of $L$ dithered quantizers, each shifted by a factor $\delta_j$ with respect to a reference quantizer $Q_0(\cdot)$ having a fixed step size $\Delta$, so that, for $j \in A$,

$$Q_j(x) = Q_0(x) + \delta_j. \qquad (5)$$

Each dithered quantizer is used to quantize one of the $L$ randomly chosen projections, so the marked components are $(r_j^w - r_j^x) = [Q_0(r_j^x) - r_j^x] + \delta_j$, where $j \in A$. The vector of watermarked features is then given by:

$$y = x + \sum_{j \in A} (r_j^w - r_j^x)\, s_j = x + \sum_{j \in A} [Q_0(r_j^x) - r_j^x]\, s_j + \sum_{j \in A} \delta_j\, s_j. \qquad (6)$$

In a forensic application, we can think that each k-th user can be identified by employing a different set of dithered quantizers, characterized by a dithering vector $\delta_k = \{\delta_{k,j}\}_{j \in A}$. According to this approach, referring to equation (6), in $y_k$ it is possible to distinguish between a term present in all the watermarked copies of the content, the summation $\sum_{j \in A} [Q_0(r_j^x) - r_j^x]\, s_j$, and a term identifying the single k-th user, i.e. the summation $\sum_{j \in A} \delta_{k,j}\, s_j$. The detector will thus try to identify a dishonest client by looking at this uniquely distinguishing component.

3.1. ST-DM client side embedding

Let us now describe how we implement a ST-DM based secure client side embedding. A distribution server, like in [4], generates an encryption look-up table $E$, whose entries are i.i.d. random variables following a Gaussian distribution $\mathcal{N}(0, \sigma_E)$; moreover, for each client, a personalized watermark LUT $W_k$ is generated, according to $\mathcal{N}(0, \sigma_W)$, and a decryption LUT $D_k$ is computed by combining componentwise $E$ and $W_k$. In addition, let us suppose that the projection matrix $S$ has been generated. The personalized LUTs and the matrix $S$ are then transmitted once to each client over a secure channel.

The server encrypts a content $x$ of size $M$ by adding to it some entries of $E$; however, differently from Eq. (2), here $R$ entries are added along each of the $M_d$ orthogonal directions $s_j$. In addition, in $L$ randomly chosen directions the common terms $[Q_0(r_j^x) - r_j^x]$ present in the embedding rule are introduced, so that at the server side the host features will be modified as in the following:

$$c_i = x_i + \sum_{j=1}^{M_d} \sum_{h=0}^{R-1} E[t_{jh}]\, s_{ji} + \sum_{j \in A} [Q_0(r_j^x) - r_j^x]\, s_{ji}. \qquad (7)$$
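Before turning to decryption, the plain (unencrypted) embedding rule of Eqs. (5) and (6), whose common quantization terms reappear in Eq. (7), can be sketched as follows. This is only an illustrative sketch under stated assumptions: the directions are taken to be orthonormal (scaled columns of a Sylvester-Hadamard matrix, a hypothetical choice), directions are indexed from 0, and the values of `step`, `A` and `delta` are made up for the example.

```python
import math

def hadamard_directions(M, n_dirs):
    """n_dirs orthonormal directions: scaled columns of a Sylvester-Hadamard
    matrix of order M (M must be a power of 2). Hypothetical choice of S."""
    H = [[1.0]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return [[H[i][j] / math.sqrt(M) for i in range(M)] for j in range(n_dirs)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def q0(r, step):
    """Reference uniform quantizer Q_0 with step size `step`."""
    return step * math.floor(r / step + 0.5)

def stdm_embed(x, S, A, delta, step):
    """Eq. (6): y = x + sum_{j in A} [Q_0(r_j) - r_j + delta_j] s_j."""
    y = list(x)
    for j in A:
        r_j = dot(x, S[j])                       # projection of the host on s_j
        shift = q0(r_j, step) - r_j + delta[j]   # dithered quantization, Eq. (5)
        y = [y_i + shift * s_ji for y_i, s_ji in zip(y, S[j])]
    return y

M, Md, step = 8, 4, 2.0
S = hadamard_directions(M, Md)      # S[j] plays the role of s_j
A = [1, 3]                          # secret subset of directions, L = 2
delta = {1: 0.3, 3: -0.5}           # dither shifts, |delta_j| < step / 2
x = [3.1, -1.4, 0.7, 5.2, -2.6, 4.4, 0.9, -3.3]

y = stdm_embed(x, S, A, delta, step)
proj_x = [dot(x, s_j) for s_j in S]
proj_y = [dot(y, s_j) for s_j in S]
for j in range(Md):
    print(j, round(proj_x[j], 4), round(proj_y[j], 4))
```

With unit-norm directions, the projections of `y` along the directions in `A` land on the shifted lattice $Q_0(r_j^x) + \delta_j$, while the other projections are left untouched.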

Decryption and watermark embedding are driven by the session key $sk$, needed to reconstruct the sequence of indices $t_{jh}$ and to add $M_d \times R$ entries of the decryption LUT $D_k$ to each encrypted feature $c_i$: $y_{k,i} = c_i + \sum_{j=1}^{M_d} \sum_{h=0}^{R-1} D_k[t_{jh}]\, s_{ji}$, so that:

$$y_{k,i} = x_i + \sum_{j \in A} [Q_0(r_j^x) - r_j^x]\, s_{ji} + \sum_{j=1}^{M_d} \sum_{h=0}^{R-1} W_k[t_{jh}]\, s_{ji}. \qquad (8)$$

If we assume it is possible to set $\sum_{h=0}^{R-1} W_k[t_{jh}] = \delta_{k,j}$, the result of this operation is the sequence of watermarked features $y_k$, using the dithered quantizers shifted by the set $\delta_k = \{\delta_{k,j}\}_{j \in A}$ identifying the k-th user. The joint decryption and watermarking process then becomes:

$$y_{k,i} = x_i + \sum_{j \in A} [Q_0(r_j^x) - r_j^x + \delta_{k,j}]\, s_{ji} + \sum_{j \notin A} \delta_{k,j}\, s_{ji}. \qquad (9)$$

The final effect of the joint decryption and watermarking is that an ST-DM watermark has been embedded in $L$ directions, while in the remaining $(M_d - L)$ directions a spread-spectrum-like noise has been added. These noise terms cannot be avoided, since the client is not allowed to know which $L$ out of the $M_d$ directions, indicated by the set $A$, carry the watermark.
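The chain of Eqs. (7) to (9) can be checked numerically with a small sketch. It is a toy model, not the authors' code: the orthonormal Hadamard directions, the 0-based indexing, the sizes, the seeds and the use of `random.Random` in place of a keyed generator are assumptions made for illustration, and $\sigma_E$, $\sigma_W$ are treated as standard deviations.

```python
import math
import random

def hadamard_directions(M, n_dirs):
    """n_dirs orthonormal directions from a Sylvester-Hadamard matrix (M a power of 2)."""
    H = [[1.0]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return [[H[i][j] / math.sqrt(M) for i in range(M)] for j in range(n_dirs)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def q0(r, step):
    """Reference uniform quantizer Q_0."""
    return step * math.floor(r / step + 0.5)

def add_along(v, coeff, s_j):
    return [v_i + coeff * s_ji for v_i, s_ji in zip(v, s_j)]

def server_encrypt(x, S, A, E, t, step):
    """Eq. (7): R entries of E along every direction, plus the common
    quantization terms along the secret directions in A."""
    c = list(x)
    for j, s_j in enumerate(S):
        coeff = sum(E[t_jh] for t_jh in t[j])
        if j in A:
            r_j = dot(x, s_j)
            coeff += q0(r_j, step) - r_j
        c = add_along(c, coeff, s_j)
    return c

def client_decrypt(c, S, D_k, t):
    """Client step leading to Eq. (8): R entries of D_k along every direction."""
    y = list(c)
    for j, s_j in enumerate(S):
        y = add_along(y, sum(D_k[t_jh] for t_jh in t[j]), s_j)
    return y

rng = random.Random(7)
M, Md, R, T, step = 8, 4, 4, 64, 2.0
S = hadamard_directions(M, Md)
A = {0, 2}                                    # known to the server only
E = [rng.gauss(0.0, 100.0) for _ in range(T)]
W_k = [rng.gauss(0.0, 0.01) for _ in range(T)]
D_k = [-e + w for e, w in zip(E, W_k)]        # Eq. (1)
key_rng = random.Random(2009)                 # stand-in for the session key sk
t = [[key_rng.randrange(T) for _ in range(R)] for _ in range(Md)]
x = [rng.uniform(-20.0, 20.0) for _ in range(M)]

c = server_encrypt(x, S, A, E, t, step)
y_k = client_decrypt(c, S, D_k, t)

# Shifts delta_{k,j} = sum_h W_k[t_jh], then the projections predicted by Eq. (9).
delta_k = [sum(W_k[t_jh] for t_jh in t[j]) for j in range(Md)]
proj_x = [dot(x, s_j) for s_j in S]
proj_y = [dot(y_k, s_j) for s_j in S]
expected = [(q0(proj_x[j], step) if j in A else proj_x[j]) + delta_k[j]
            for j in range(Md)]
print(max(abs(p - e) for p, e in zip(proj_y, expected)))
```

The client code never refers to `A`: the quantization is hidden inside the ciphertext, and the client only adds LUT entries along all $M_d$ directions, which is why the extra noise in the $(M_d - L)$ unquantized directions appears.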


4. DETECTION

It is assumed that the input to the detector is a vector of possibly altered watermarked features, denoted as $y'$. Such a vector is projected onto the $L$ directions carrying the watermark, yielding a vector of $L$ watermarked projections:

$$\rho = S_A^T\, y'. \qquad (10)$$
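The detection chain of this section (projection as in Eq. (10), quantization errors, one correlation statistic per client, then a maximum and a threshold test) can be sketched as below. The sketch rests on several assumptions that are not taken from the paper: the sign convention of the quantization error, the normalized-correlation form of the statistic (a stand-in for the paper's own definition), the threshold value and the toy dither vectors.

```python
import math

def project(S_A, y):
    """Eq. (10): rho = S_A^T y, with S_A given as a list of direction vectors."""
    return [sum(s_i * y_i for s_i, y_i in zip(s, y)) for s in S_A]

def quantization_errors(rho, step):
    """Signed offset of each projection from the nearest reconstruction
    point of the reference quantizer (sign convention assumed)."""
    return [r - step * math.floor(r / step + 0.5) for r in rho]

def statistic(delta_k, e):
    """Assumed statistic: correlation of e with delta_k, normalized by ||delta_k||."""
    norm = math.sqrt(sum(d * d for d in delta_k))
    return sum(d * v for d, v in zip(delta_k, e)) / norm

def detect(rho, deltas, step, threshold):
    """Index of the accused client, or None when no watermark is declared."""
    e = quantization_errors(rho, step)
    scores = [statistic(d, e) for d in deltas]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None

# Toy setting: M = 4 features, L = 2 orthonormal secret directions.
S_A = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
step, threshold = 2.0, 0.3
deltas = [[0.4, -0.3], [-0.4, 0.2], [0.1, 0.45]]   # one dither vector per client

# Copy of client 0: projections sit on the lattice shifted by deltas[0].
target = [4.0 + deltas[0][0], -2.0 + deltas[0][1]]
y_marked = [target[0] * a + target[1] * b for a, b in zip(*S_A)]
# Unmarked content whose projections sit exactly on the reference lattice.
y_clean = [4.0 * a - 2.0 * b for a, b in zip(*S_A)]

accused = detect(project(S_A, y_marked), deltas, step, threshold)
nobody = detect(project(S_A, y_clean), deltas, step, threshold)
print(accused, nobody)
```

In a real system the threshold would be derived from the target false alarm probability rather than fixed by hand as it is here.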

Since the embedding rule makes it difficult to define a likelihood ratio, the proposed detector relies on a suboptimal approach based on a correlation statistic, followed by a maximum detector [7]. Namely, the detector computes a vector of quantization errors $e$ from the projections $\rho$ and their quantized values $Q_\Delta(\rho)$, and the detector statistic for the k-th client, $T_k(\rho)$, is defined from the correlation $\delta_k^T e$ and the norm $\|\delta_k\|$:

$$T_k(\rho) = \delta_k^T e \;\ldots\; \|\delta_k\| \;\ldots\; /2. \qquad (11), (12)$$

The decision is made according to the following test:

$$D(\rho) = \begin{cases} \arg\max_{k=1,\ldots,K_U} T_k(\rho) & \text{if } \max_{k=1,\ldots,K_U} T_k(\rho) \ge \tau \\ \emptyset & \text{if } \max_{k=1,\ldots,K_U} T_k(\rho) < \tau. \end{cases} \qquad (13)$$

The output of the test is either the index $k$ of the guilty client or the special symbol $k = \emptyset$, meaning that no watermark has been found in the examined content. The threshold $\tau$ has to be set so as to minimize the probability of detection errors. To do so, we formulate the problem as a binary hypothesis test where the hypotheses are: $H_0$, the content is not watermarked; $H_{\bar{k}}$, the content contains the watermark of the $\bar{k}$-th client. The detector makes an error every time it accuses a client and no watermark was present (false alarm), or it fails in detecting the watermark of the $\bar{k}$-th client because it decides that no watermark is present (missed detection), or it wrongly accuses an innocent client (wrong accusation). The performance of the detector is then measured by the probability of false alarm $P_f$, the probability of missed detection $P_m$, and the probability of wrong accusation $P_w$:

$$P_f = \Pr\{D(\rho) \neq \emptyset;\ H_0\} \qquad (14)$$

$$P_m = \Pr\{D(\rho) = \emptyset;\ H_{\bar{k}}\} \qquad (15)$$

$$P_w = \Pr\{D(\rho) = k,\ k \neq \bar{k};\ H_{\bar{k}}\}. \qquad (16)$$

Hence, the probability of correct detection should be expressed as $P_d = 1 - P_m - P_w$. The above error probabilities will depend on the threshold $\tau$ and on several other parameters of the system, and will allow us to measure the performance of the proposed system.

5. PERFORMANCE EVALUATION

To assess the performance of the proposed system, a practical implementation of our scheme has been developed and compared with the implementation of the previous LUT-based secure Spread Spectrum watermark embedding presented in [4]. The system embeds a watermark into a gray level image by modifying $m$ out of 64 block DCT coefficients. In particular, for each $8 \times 8$ block, the DCT coefficients are reordered in the zig-zag scan, and the ones from the second to the $(m+1)$-th are selected. Since the host features have variable size, whereas the spreading vector has a fixed size (i.e. $M$), we divide the vector of available host features into chunks of length $M$: each chunk is composed of the DCT coefficients belonging to $l$ blocks, so that the length of each chunk is $M = l \times m$. If the number of $8 \times 8$ blocks inside an image is $N_B$, the number of available chunks $N_C$ is given by $N_C = m N_B / M$. For each chunk the same projection matrix is adopted, but different sets of dithering values $\delta_j$ are considered. According to this, the k-th client is identified by a vector obtained as the concatenation of $N_C$ vectors $\delta_k^i$ of length $L$ (one for each chunk); that is, in detection the vector $\delta_k = (\delta_k^1, \ldots, \delta_k^{N_C})$, having size $L \times N_C = L\, m N_B / M$, will be used.

The performance analysis is carried out by defining the operating conditions in terms of Document to Watermark Ratio (DWR). The DWR expresses the ratio between the power of the host features and that of the watermark:

$$\mathrm{DWR} = 20 \log_{10} \frac{\sigma_x}{\sigma_{WAT}}, \qquad (17)$$

where $\sigma_{WAT}$ is the standard deviation of the watermark components. While $\sigma_x$ can be estimated on the original host features, $\sigma_{WAT}$ can be computed as follows:

$$\sigma_{WAT}^2 = \sigma_\delta^2\, \frac{M_D}{M} \frac{m}{64} + \frac{\Delta^2}{12}\, \frac{L}{M} \frac{m}{64}, \qquad (18)$$

where $\frac{M_D}{M} \frac{m}{64}$ and $\frac{L}{M} \frac{m}{64}$ represent the percentages of DCT coefficients suffering the quantization error introduced by the embedding process, due to the shift addition process (involving all the $M_D$ directions) and to the quantization process (involving only $L$ directions), respectively.

In order to force a given DWR value for a specific watermarked image, we introduce a parameter $\alpha$ controlling the watermark strength and we use it as a factor multiplying the watermark LUT; specifically, we will consider $\alpha E[t_{jh}]$ and $\alpha D_k[t_{jh}]$ instead of $E[t_{jh}]$ and $D_k[t_{jh}]$ in equations (7) and (8), and consequently the watermark LUT will also be multiplied by $\alpha$. Since this parameter is required in decryption, the server will need to send the adopted value to all the clients.

The relationship between $\alpha$ and the DWR can be computed by taking into account that, to ensure that the shift value $\delta$ remains inside the interval $[-\Delta/2, +\Delta/2]$, and given that $\delta$ follows a Gaussian distribution, the standard deviation of the shifts $\sigma_\delta$ has to be chosen in such a way that $\sigma_\delta = \frac{\Delta/2}{4}$, i.e. the interval bounds lie four standard deviations away from zero. Furthermore, since a shift value is obtained by the addition of $R$ entries of the LUT $W_k$, we have that $\sigma_\delta^2 = \alpha^2 R\, \sigma_W^2$. By considering that $\Delta^2 = 64\, \sigma_\delta^2$, equation (18) can thus be rewritten as:

$$\sigma_{WAT}^2 = \alpha^2 R\, \sigma_W^2\, \frac{m}{64 M} \left( M_D + \frac{16 L}{3} \right). \qquad (19)$$
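The arithmetic behind Eqs. (17) to (19) can be checked with a few lines. The sketch below uses the fixed parameter values quoted in this section and a $512 \times 512$ image; the host standard deviation `sigma_x` is a hypothetical value chosen only for illustration, since in the paper it is estimated from each image. Solving Eq. (19) for $\alpha$ at a target DWR is done here directly from the definitions.

```python
import math

# Fixed parameters quoted in this section of the paper.
M, L, m, M_D, R = 32, 4, 4, 32, 4
sigma_W = 0.01
dwr_db = 36.0

# Hypothetical host statistic (image dependent in practice).
sigma_x = 30.0
N_B = (512 // 8) ** 2          # number of 8x8 blocks in a 512x512 image

l = M // m                     # blocks per chunk, from M = l * m
N_C = m * N_B // M             # number of chunks, N_C = m * N_B / M

# Linear DWR as a power ratio: sigma_x^2 / sigma_WAT^2.
dwr_lin = 10 ** (dwr_db / 10)

# Eq. (19) solved for alpha at the target DWR.
geom = m * (M_D + 16 * L / 3) / (64 * M)
alpha = math.sqrt(sigma_x ** 2 / (R * sigma_W ** 2 * geom * dwr_lin))

# Derived quantities: sigma_delta^2 = alpha^2 R sigma_W^2, Delta = 8 sigma_delta.
sigma_delta = alpha * math.sqrt(R) * sigma_W
step = 8 * sigma_delta

# Watermark power from Eq. (18) and from Eq. (19); the two should agree.
var_wat_18 = (sigma_delta ** 2 * (M_D / M) * (m / 64)
              + step ** 2 / 12 * (L / M) * (m / 64))
var_wat_19 = alpha ** 2 * R * sigma_W ** 2 * geom

# Eq. (17) evaluated on the result, as a consistency check.
dwr_check = 20 * math.log10(sigma_x / math.sqrt(var_wat_18))
print(l, N_C, alpha, step, dwr_check)
```

With these values each chunk spans $l = 8$ blocks and a $512 \times 512$ image provides $N_C = 512$ chunks, so each client is identified by $L \times N_C = 2048$ dither values.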

Finally, we obtain the watermark strength $\alpha$ as a function of the linear DWR value ($\mathrm{DWR}_l$) as:

$$\alpha = \sqrt{\frac{64\, M\, \sigma_x^2}{R\, \sigma_W^2\, m\, [M_D + 16L/3]\, \mathrm{DWR}_l}} \qquad (20)$$

and therefore, by imposing a given watermark distortion (i.e. a given DWR), a proper value for $\alpha$ is obtained. In order to compare the performance of the proposed ST-DM client-side watermarking system with that of the SS version, we implemented the two systems with the following values for the system parameters: $M = 32$, $L = 4$, $m = 4$, $M_D = 32$, $R = 4$, $\sigma_E = 100$, $\sigma_W = 0.01$, $T = 2^{16}$, $\mathrm{DWR} = 36$ dB. The parameters $\alpha$, $\sigma_\delta$ and $\Delta$ have then been derived from the fixed ones, while $\sigma_x$ and $N_B$ are

estimated from the image under test ($512 \times 512$ 8-bit grey level images were considered). The $P_f$ has been set to $10^{-3}$. Here, the two systems are evaluated in the presence of additive white Gaussian noise (AWGN) and in the presence of JPEG compression. In the first case, we computed the missed detection probability ($P_m$) with respect to the Watermark to Noise Ratio (WNR), which expresses the ratio between the power of the watermark and that of the noise: $\mathrm{WNR} = 20 \log_{10} (\sigma_{WAT} / \sigma_n)$, where $\sigma_n$ is the standard deviation of the considered AWGN. The robustness of SS and ST-DM client-side systems to the AWGN attack is represented in Fig. 1: the results, concerning two standard images, are in agreement with the usual behavior of the corresponding non-client-side systems [8], that is, ST-DM shows a better performance for higher WNR values. A similar behavior can be observed in the case of the second attack, i.e. JPEG compression, as shown in Fig. 2. We can conclude that in both cases ST-DM shows a vanishing probability of missed detection at high WNR/JPEG quality and performs better than SS when the degradation of the watermarked content is kept within an acceptable range.

[Figure 1: $P_m$ (logarithmic scale) versus WNR; curves: SS lena, SS mandrill, STDM lena, STDM mandrill.]

Fig. 1. Comparison between SS and ST-DM client-side embedding: missed detection probability ($P_m$) for different values of WNR.

[Figure 2: $P_m$ (logarithmic scale) versus JPEG quality; curves: SS lena, SS mandrill, STDM lena, STDM mandrill.]

Fig. 2. Comparison between SS and ST-DM client-side embedding: missed detection probability ($P_m$) for different JPEG qualities.

6. CONCLUSIONS

In this paper we propose a new scheme following the client side watermark embedding approach for data copyright protection in a large scale content distribution environment. In particular, starting from the idea of the LUT based secure embedding, we modify such a scheme in order to design it specifically for the Spread Transform Dither Modulation (ST-DM), which belongs to the informed watermark embedding algorithms. This modification is not straightforward; however, the experimental results confirm that the superiority of ST-DM over SS watermarking exhibited in the classical embedding approach is maintained also in the client-side embedding one. We are currently working on the theoretical analysis of the detector performance and we are also studying the performance of the system under the average collusion attack.

7. REFERENCES

[1] M. Barni and F. Bartolini, Watermarking Systems Engineering: Enabling Digital Assets Security and Other Applications, Marcel Dekker, 2004.

[2] R. J. Anderson and C. Manifavas, "Chameleon - a new kind of stream cipher," in Proceedings of the 4th International Workshop on Fast Software Encryption, FSE'97, London, UK, 1997, pp. 107-113, Springer-Verlag.

[3] A. Adelsbach, U. Huber, and A.-R. Sadeghi, "Fingercasting - joint fingerprinting and decryption of broadcast messages," in 11th Australasian Conference on Information Security and Privacy, 2006, vol. 4058 of Lecture Notes in Computer Science, pp. 136-147, Springer.

[4] M. Celik, A. Lemma, S. Katzenbeisser, and M. van der Veen, "Look-up table based secure client-side embedding for spread-spectrum watermarks," IEEE Transactions on Information Forensics and Security, vol. 3, no. 3, pp. 475-487, 2008.

[5] B. Chen and G. Wornell, "Quantization index modulation: a class of provably good methods for digital watermarking and information embedding," IEEE Trans. on Information Theory, vol. 47, no. 4, pp. 1423-1443, May 2001.

[6] L. Pérez-Freire, P. Comesaña-Alfaro, and F. Pérez-González, "Detection in quantization-based watermarking: performance and security issues," in Security, Steganography, and Watermarking of Multimedia Contents VII, Proc. SPIE Vol. 5681, P. W. Wong and E. J. Delp, Eds., San Jose, CA, USA, January 2005, pp. 721-733, SPIE.

[7] Z. Jane Wang, Min Wu, Hong Vicky Zhao, Wade Trappe, and K. J. Ray Liu, "Anti-collusion forensics of multimedia fingerprinting using orthogonal modulation," IEEE Trans. on Image Processing, vol. 14, no. 6, pp. 804-821, June 2005.

[8] M. Barni, F. Bartolini, and A. De Rosa, "On the performance of multiplicative spread spectrum watermarking," in Proc. IEEE Work. on Multimedia Signal Processing, MMSP'02, St. Thomas, Virgin Islands, USA, December 2002.
