
MASTER'S THESIS

WATERMARK SYNCHRONIZATION IN CAMERA PHONES AND SCANNING DEVICES

Pramila A. (2007) Watermark synchronization in camera phones and scanning
devices. Department of Electrical and Information Engineering, University of Oulu,
Oulu, Finland. Master's thesis, 83 p.

ABSTRACT

The development of the Internet and of numerous hardware and software applications
has created a need for copy and copyright protection of content. One possible way to
fill this need is to use watermarking. The idea of digital image watermarking is to
hide information in the image so that the human eye cannot detect the changes
made to the image, but a computer can read the hidden information.
This work focuses on the use of watermarking in value-adding services instead
of copy and copyright protection applications. Value-adding services offer extra
services to the user. The user therefore does not want to remove the embedded
information but prefers to read it and gain access to the hidden data. In this work, two
watermarking methods for digital images were developed and analysed. The first
method is robust against a print-scan attack, where the image is printed and then
scanned before extraction of the watermark. The second method is robust against a
print-cam attack, where the image is first printed and then captured with a camera
phone.
Both methods apply multiple watermarking, in which several watermarks are
embedded in the image. Here, two to three watermarks are embedded in the image,
one of which is a so-called message watermark that contains a service message, such
as a link to a website or extra information about the image. The other watermarks are
embedded in order to correct the geometrical distortions inflicted on the image by
the scanning phase.
The results obtained were promising, and the success ratios were high for both
methods. Both methods were tested against various unintentional attacks, and it was
concluded that both are robust against geometrical distortions and JPEG
compression.

Key words: multiple watermarking, value-adding services, digital image
watermarking, watermarking methods
Pramila A. (2007) Vesileiman synkronointi kamerapuhelimissa ja skannereissa.
Oulun yliopisto, sähkö- ja tietotekniikan osasto. Diplomityö, 83 s.

TIIVISTELMÄ

Internetin kehittyminen ja lukuisat laitteisto- ja ohjelmistosovellukset ovat luoneet
tarpeen sisällön kopio- ja tekijänoikeuksien suojaamiselle. Yksi mahdollinen ratkaisu
ongelmaan on vesileimauksen käyttäminen. Digitaalisen kuvan vesileimauksen idea
on piilottaa informaatiota kuvaan siten, että ihmissilmä ei pysty erottamaan
muutoksia, joita kuvaan on tehty, mutta tietokone pystyy lukemaan piilotetut tiedot.
Tässä työssä keskitytään kopio- ja tekijänoikeuksien suojaussovellusten sijaan
vesileimauksen käyttämiseen lisäarvopalveluissa. Lisäarvopalvelut tarjoavat
käyttäjälle ylimääräisiä palveluita, ja siten käyttäjä ei halua poistaa upotettua
informaatiota vaan lukea sen ja päästä käsiksi piilotettuun aineistoon. Työssä
kehitetään ja analysoidaan kaksi vesileimausmenetelmää digitaalisille kuville.
Ensimmäinen vesileimausmenetelmä kestää kuvan tulostus- ja lukuhyökkäyksen,
jossa kuva ensin tulostetaan ja sitten luetaan skannerilla ennen vesileiman
irrottamista. Toinen menetelmä kestää tulostus- ja kuvanottohyökkäyksen, jossa kuva
ensin tulostetaan ja tulostetusta kuvasta otetaan kuva kamerapuhelimella.
Molemmissa menetelmissä käytetään monivesileimausmenetelmiä, joissa kuvaan
upotetaan useita vesileimoja. Tässä tapauksessa kuviin upotetaan kahdesta kolmeen
vesileimaa, joista yksi on niin sanottu viestivesileima, joka sisältää palveluviestin
kuten linkin nettisivulle tai ylimääräistä tietoa kuvasta. Muut vesileimat upotetaan,
jotta kuvaan lukuvaiheessa tulleet geometriset virheet saadaan korjattua.
Saadut tulokset olivat lupaavia ja onnistumissuhde suuri molemmissa
menetelmissä. Molempia menetelmiä testattiin useilla tahattomilla hyökkäyksillä, ja
lopuksi voitiin päätellä, että molemmat menetelmät kestävät geometrisiä
hyökkäyksiä sekä JPEG-pakkauksen.

Avainsanat: monivesileimaus, lisäarvopalvelut, digitaalisen kuvan vesileimaus,
vesileimausmenetelmät
TABLE OF CONTENTS

ABSTRACT
TIIVISTELM
TABLE OF CONTENTS
FOREWORD
LIST OF ABBREVIATIONS AND SYMBOLS
1. INTRODUCTION
   1.1. History of watermarking
   1.2. Applications and scenarios
   1.3. Research problem
2. TRADEOFFS IN WATERMARKING
   2.1. Performance considerations
   2.2. Imperceptibility
        2.2.1 Visible and invisible watermarks
        2.2.2 JND
   2.3. Robustness
        2.3.1 Robust and fragile watermarks
        2.3.2 Printed images
        2.3.3 Print-scan
        2.3.4 Mobile phone with a camera
        2.3.5 Geometrical attacks
        2.3.6 JPEG transform
3. WATERMARKING METHODS
   3.1. Generic watermarking scheme
   3.2. Domains
        3.2.1 Spatial
        3.2.2 Fourier domain based methods
        3.2.3 Wavelet domain based methods
        3.2.4 Methods in other domains
   3.3. Multiple watermarking
4. COMMERCIAL DEVELOPMENT
   4.1. Digimarc
   4.2. CyberSquash
   4.3. Bar codes
        4.3.1 Short history
        4.3.2 Operation
   4.4. Mobot
   4.5. Sanyo
5. PRINT-SCAN RESILIENT WATERMARKING
   5.1. Frequency domain template
        5.1.1 Embedding
        5.1.2 Extracting
   5.2. Spatial domain template
        5.2.1 Embedding
        5.2.2 Extracting
   5.3. Wavelet domain multibit message
        5.3.1 Embedding
        5.3.2 Extracting
   5.4. Experiments and results
   5.5. Discussion
6. PRINT-CAM RESILIENT WATERMARKING
   6.1. Frame detection method
        6.1.1 Embedding
        6.1.2 Extracting
   6.2. Experiments and results
   6.3. Discussion
7. DISCUSSION
8. CONCLUSION
9. REFERENCES
10. APPENDICES

FOREWORD

This work was done in the MediaTeam Oulu research group at the Information
Processing Laboratory, University of Oulu, Finland. The work was part of the Zirion
project, with a focus on the use of watermarking in value-adding services and on
reading the watermarks with camera phones.
I would like to express my sincere thanks to all the people who have helped me
with this work. Especially, I would like to thank Professor Tapio Seppänen for all the
great advice and instructions during the work and M.Sc. Anja Keskinarkaus for all
the ideas and help. A huge thank-you also goes to my friends and fellow students,
with whom I was able to share my thoughts and fears about the work. I would also
like to thank my parents for always being there for me and my brother Jani for
cheering up my days.

Oulu 8.2.2007

Anu Pramila

LIST OF ABBREVIATIONS AND SYMBOLS


AC Alternating Current
BCH Bose-Chaudhuri-Hocquenghem. Error correction code
BER Bit Error Ratio
CCITT International Telegraph and Telephone Consultative Committee
CMOS Complementary Metal Oxide Semiconductor
COP Centre of Projection
DA/AD Digital-to-Analog / Analog-to-Digital conversion
DC Direct Current
DCT Discrete Cosine Transform
DFT Discrete Fourier Transform
DRM Digital Rights Management
DS-SS Direct Sequence Spread Spectrum, a modulation method
DWT Discrete Wavelet Transform
EAN.UCC EAN International and the Uniform Code Council
GF Galois Field
GS Guided Scrambling
HP Hewlett-Packard
HSI Hue, Saturation, Intensity. A colour space.
HVS Human Visual System
IEC International Electrotechnical Commission
ISO International Organization for Standardization
ITU International Telecommunication Union
JND Just Noticeable Difference
JPEG Joint Photographic Experts Group
MSE Mean Squared Error
NTT Nippon Telegraph and Telephone Corporation
PC Personal Computer
PSNR Peak Signal to Noise Ratio
PSPNR Peak Signal to Perceptible Noise Ratio
UPC Universal Product Code
URL Uniform Resource Locator
1. INTRODUCTION

The history of data hiding is long, and the range of applications is wide. This chapter
contains a brief glance at the history of data hiding and watermarking and at the
applications invented. Then the research problem is presented and some scenarios
are viewed.


1.1. History of watermarking

Entering the digital era has brought along many problems to be solved and
questions to be asked. Distributing pieces of content, such as music, text, images or
video, is now easier than ever before. With a few clicks, pieces of content are
transferred from one user to another, but the content providers are troubled: with
those few clicks, a great deal of data and content that should not move also moves,
and money that should end up in the till fails to do so. A while ago,
watermarking was proposed as a way to restrict content from moving too freely.
The idea was that by hiding information in the piece of content, piracy and illegal
copying could be stopped, or at least the people behind it could be caught. It has
been said that one picture is worth a thousand words, but few people know that a
picture may truly contain a thousand words, hidden beneath the surface.
Since ancient times, the idea of data hiding has been used in several different
applications, some of which are still in use. Hiding information from enemies has
always been vital in wars, and finding the enemy's secret messages has been even more
important for changing the course of a battle. Little did the warriors of the
Roman Empire know that their efforts in hiding and uncovering messages would lead
to the development of early watermarking technologies.
The first real paper watermarks were introduced in Italy in the late 1200s,
though papermaking itself had been invented a thousand years earlier in
China. By the eighteenth century, paper watermarks were widely used in
trademarks and to record manufacturing dates. At that time, watermarks also began
to be used on money and other documents for copy protection. [1]
The first digital watermarking proposals appeared somewhere around the late
1970s and the early 1980s, but the term digital watermarking itself was not
introduced until the late 1980s [1]. Since then, hundreds of different digital
watermarking methods have been proposed for hiding information in digital content
such as audio, images and video.
Not long ago, digital watermarking was seen as a solution to many copy and
copyright protection problems. There are, however, many critics of digital
watermarking, who argue that watermarks decrease the quality of the content, are
easily destroyed, and bring more harm to ordinary consumers than is acceptable.
They forget, however, that copyright protection is but one of many kinds of
watermarking applications, and even the copyright protection algorithms are still
under development.


1.2. Applications and scenarios

Most watermarking technologies through the ages have been developed to
protect content from illegal and unauthorized use, which is still the most
important application area, but watermarking is capable of much more. Some service
providers have realized this, and a new generation of watermarking technologies is
rising.
In addition to copy and copyright protection, watermarking can be used in
many ways: it can be employed in fraud detection to tell whether the content has been
changed. It can also be used for embedding metadata in the content, to tell, for
example, who made the image or who composed the song, or it can be employed
for commercial purposes in value-adding services, where the embedded information
is beneficial to the user. This thesis is about the last-mentioned type of application
and aims to answer the question of what kind of watermarking is required in a mobile
phone environment to produce value-adding services.
The MediaTeam Oulu research group has launched a project called Zirion, where the
embedded information is so-called caption data, which offers the user more information
about the host media content. The information can be a link to further information, a
link to another service, confidential textual data, another media type such as an
audio file, or even a functional command. The embedded information is therefore
beneficial to the user, and the main presumption is that the user does not want to
destroy the embedded information; intentional attacks are not expected.
Although intentional attacks may not appear, some unintentional attacks must be
considered. For example, while reading the watermark with a camera phone, digital-
to-analog-to-digital (DA/AD) conversion, JPEG compression and some geometrical
distortions are almost bound to happen. DA/AD conversion and JPEG compression
add noise to the signal, and geometrical attacks change the positions of the image
pixels, making detection of the watermark difficult. These kinds of attacks define the
requirements for the robustness of the watermarking method used.
A possible scenario is shown in Figure 1. In the scenario, a user flips through a
catalogue, and when she sees an interesting piece of merchandise, she orders it by
taking a picture of it with her camera phone. The program in the phone processes the
image and extracts the watermark, which contains the information about the product.
Soon, the user receives a parcel by post with the merchandise in it.
Another scenario could be linked with periodicals. A watermark can be embedded
in an image of a famous rock band in a magazine. The watermark may contain a link
to a website for ordering tickets to a concert or a ring tone from the album. Future
scenarios are limited only by imagination.


1.3. Research problem

In print-cam robust watermarking, the watermark should be readable with a
camera phone or a digital camera by taking a picture of the watermarked image and
then processing it. The research focuses on finding a method that can
recover from geometrical distortions, such as rotation, scaling and translation, in an
environment with a high level of distortion.
Figure 1. An example of an application for reading watermarks with a
camera phone.

The research done in the field of value-adding watermarks is limited, and only a few
papers have been published on reading watermarks with camera phones. The aim of
this work is to develop new methods for value-adding watermarking.
For simplicity, a watermarking algorithm was first created in the print-scan
environment, where most of the attacks are similar to those of the mobile phone
environment. The biggest difference between the print-scan process and taking a
picture with a mobile phone camera is the fact that the print-scan process is a
two-dimensional problem, whereas taking a picture with a camera phone is clearly
a three-dimensional one.
The research work was done in a PC environment with a Matlab program.
Testing of the watermarking algorithm was done with a camera phone by sending
the picture of a watermarked image to a PC and processing it there. Some restrictions
had to be made, and consequently, the watermarking methods were not
tested with different lighting conditions, paper qualities or image resolutions. Also,
the area around the image was left blank to simplify the extraction process.
In this work, the term print-cam process is used for a process where the
watermarked image is printed on paper and then read with a camera phone by
taking a picture of the watermarked image. In some previous papers, the term
print-capture process has been used, but that term is somewhat misleading: capturing
is not specific to cameras but may mean many kinds of image reading, such as filming.
The contents of this thesis are as follows: Chapter 2 explains some properties of
watermarks and describes some of the attacks that could be encountered in this kind
of application. Chapter 3 gives an introduction to different watermarking
technologies in the form of a literature survey. Chapter 4 extends the information in
Chapter 3 and gives some examples of commercial applications that are already in
use. Chapter 5 proposes a method for value-adding watermarks that is robust against
the print-scan process, and Chapter 6 proposes another method that is robust against
the print-cam process.
2. TRADEOFFS IN WATERMARKING

All digital watermarks have three properties: imperceptibility, that is, how visible the
watermark is; robustness, which means how well the watermark resists attacks; and
capacity, which tells how much information can be embedded in the image with the
watermarking technology in question. These properties are, however, conflicting, and
no watermark can score highly on all of them at the same time. Here I shall explain the
tradeoffs involved and each property in more detail.


2.1. Performance considerations

The idea of watermarking is to embed information in the host data by applying minor
changes in such a manner that the human eye cannot perceive the embedded data. The
watermark information can be extracted afterwards from the host data by detecting
these modifications. There is a wide range of possible modifications that can be
made to the host data, and the host data can also be transformed to
another domain, such as the Fourier, wavelet, DCT (Discrete Cosine Transform)
or even some fractal domain, where the properties of the transform
domain can be taken advantage of. [2] Transform domains will be discussed in more
detail in Chapter 3.
Each watermark should have three commonly acknowledged properties:
imperceptibility, capacity and robustness. All of these are important requirements for
a good watermark, but no watermark can have high levels of them all at the same
time, and therefore some trade-off happens between the three. The trade-off is
illustrated in Figure 2, where the properties are arranged in a triangle. The yellow
point shows the chosen trade-off for an application where imperceptibility is not
highly important, but where capacity and robustness are highly valued.

Figure 2. Triangle of tradeoffs when choosing watermark properties.

A watermark is described as robust if it cannot be removed from the signal in which it
is embedded without destroying the signal [3]. Watermarks are usually divided into
fragile and robust watermarks, and the fragile watermarks are meant to break
when the content is tampered with. In value-adding watermarking, the watermarks are
generally considered robust, but only against unintentional attacks such as
geometrical distortions, lossy compression and AD/DA conversions.
Imperceptibility describes how visible the watermark is. Usually imperceptibility
is evaluated with respect to the HVS (Human Visual System), but special Just
Noticeable Difference (JND) methods can also be applied [3]. This means that instead
of embedding the watermark in the image with a constant level of intensity, the
strength at which the watermark is embedded is chosen for each pixel separately.
The third required property of a watermark is capacity. Capacity is the amount of
data that can be included in the image using the specified watermarking method.
The capacity requirements vary between watermarking applications, and in
value-adding watermarking a small capacity can be sufficient. Of the three
properties, imperceptibility and robustness are explained in more detail in the
following sections.


2.2. Imperceptibility

Seeing the watermark versus only knowing it is there defines visible and invisible
watermarking, respectively. The two ways of using watermarking are explained here,
and the JND method later.


2.2.1 Visible and invisible watermarks

Watermarks are usually divided into visible and invisible watermarks. The idea
of a visible watermark in digital image processing is the same as in physical paper
watermarking. A visible watermark is usually a transparent logo or the name of the
copyright owner, and it is placed on top of the image. It is a simple way to
show who owns the image, but it can be easily removed by cropping it off or by
using some image processing tool. Therefore, it should be placed somewhere in the
image where it is difficult to remove without destroying the quality of the image.
Invisible watermarks cannot be perceived by the human eye when comparing the
original and watermarked images, but a computer with a suitable
program can read them. This is possible by exploiting the properties of the HVS and
embedding the watermark weakly enough to remain unseen but strongly enough to
be robust against certain attacks.


2.2.2 JND

To be able to embed messages as robustly as possible while keeping the watermark
invisible, the properties of the HVS must be studied. The most common way to
embed watermarks is to first choose some scaling factor for the strength of the
watermark and use this scaling factor throughout the entire image. This is a simple
way, but not a very effective one: a human does not see all colours and
intensities similarly, and therefore embedding a watermark with only one coefficient
does not always work as expected.
Three properties of the HVS are usually presented: frequency sensitivity,
luminance sensitivity and contrast masking. Frequency sensitivity, here spatial
frequency sensitivity, means that high frequencies, that is, fine picture details, are
less visible. Luminance sensitivity means that if the background luminance is high,
an increase in luminance is not perceived. Contrast masking means that an image detail
may be difficult to detect in the presence of another detail. [4]
These properties of the HVS make it possible to use watermarking technologies and
embed the information robustly in images. In particular, the watermark can be
embedded more strongly in image areas with texture and high variation in
luminance values, while plain image areas are left unwatermarked. Methods
where the embedding strength is selected for each pixel individually, so that
the pixel value changes only by a certain amount without the change becoming
perceptible, are called JND methods.
One JND model for images with eight-bit intensity levels has been proposed by Chou and
Li [4]. The scaling factor for each pixel is derived from the following equations:

$$JND_{fb}(x,y) = \max\big\{ f_1\big(bg(x,y),\, mg(x,y)\big),\; f_2\big(bg(x,y)\big) \big\} \quad (1)$$

$$f_1\big(bg(x,y),\, mg(x,y)\big) = mg(x,y)\,\alpha\big(bg(x,y)\big) + \beta\big(bg(x,y)\big) \quad (2)$$

$$f_2\big(bg(x,y)\big) = \begin{cases} T_0\left(1 - \left(bg(x,y)/127\right)^{1/2}\right) + 3, & bg(x,y) \le 127 \\ \gamma\,\big(bg(x,y) - 127\big) + 3, & bg(x,y) > 127 \end{cases} \quad (3)$$

$$\alpha\big(bg(x,y)\big) = 0.0001\,bg(x,y) + 0.115 \quad (4)$$

$$\beta\big(bg(x,y)\big) = \lambda - 0.01\,bg(x,y), \quad 0 \le x < H,\; 0 \le y < W, \quad (5)$$

where H and W are the height and width of the image, respectively, bg(x, y) is the
average background luminance and mg(x, y) is the maximum weighted average of
luminance differences around the pixel at (x, y):

$$mg(x,y) = \max_{k=1,2,3,4}\big\{\, |grad_k(x,y)| \,\big\} \quad (6)$$

$$grad_k(x,y) = \frac{1}{16}\sum_{i=1}^{5}\sum_{j=1}^{5} p(x-3+i,\, y-3+j)\, G_k(i,j), \quad 0 \le x < H,\; 0 \le y < W \quad (7)$$

$$bg(x,y) = \frac{1}{32}\sum_{i=1}^{5}\sum_{j=1}^{5} p(x-3+i,\, y-3+j)\, B(i,j) \quad (8)$$

$$G_1 = \begin{bmatrix} 0&0&0&0&0\\ 1&3&8&3&1\\ 0&0&0&0&0\\ -1&-3&-8&-3&-1\\ 0&0&0&0&0 \end{bmatrix} \qquad G_2 = \begin{bmatrix} 0&0&1&0&0\\ 0&8&3&0&0\\ 1&3&0&-3&-1\\ 0&0&-3&-8&0\\ 0&0&-1&0&0 \end{bmatrix}$$

$$G_3 = \begin{bmatrix} 0&0&1&0&0\\ 0&0&3&8&0\\ -1&-3&0&3&1\\ 0&-8&-3&0&0\\ 0&0&-1&0&0 \end{bmatrix} \qquad G_4 = \begin{bmatrix} 0&1&0&-1&0\\ 0&3&0&-3&0\\ 0&8&0&-8&0\\ 0&3&0&-3&0\\ 0&1&0&-1&0 \end{bmatrix}$$

$$B = \begin{bmatrix} 1&1&1&1&1\\ 1&2&2&2&1\\ 1&2&0&2&1\\ 1&2&2&2&1\\ 1&1&1&1&1 \end{bmatrix} \quad (9)$$
The function $f_1$ models the spatial masking effect, which means that values
near edges in an image can be changed much more than values in areas of nearly
constant intensity. The function $f_2$ defines the visibility threshold due to background
luminance, and the values $T_0$, $\gamma$ and $\lambda$ are chosen to be 17, 3/128 and 1/2,
respectively. [4]
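As a concrete illustration, Eqs. (1)-(9) can be evaluated for a whole image at once. The sketch below uses NumPy rather than the Matlab environment of the thesis work; the function name jnd_profile and the vectorised layout are my own choices, while the constants and the operators G1-G4 and B follow the Chou-Li model:

```python
import numpy as np

# Gradient operators G1..G4 and low-pass mask B of the Chou-Li model (Eq. 9).
G = np.array([
    [[0, 0, 0, 0, 0], [1, 3, 8, 3, 1], [0, 0, 0, 0, 0],
     [-1, -3, -8, -3, -1], [0, 0, 0, 0, 0]],
    [[0, 0, 1, 0, 0], [0, 8, 3, 0, 0], [1, 3, 0, -3, -1],
     [0, 0, -3, -8, 0], [0, 0, -1, 0, 0]],
    [[0, 0, 1, 0, 0], [0, 0, 3, 8, 0], [-1, -3, 0, 3, 1],
     [0, -8, -3, 0, 0], [0, 0, -1, 0, 0]],
    [[0, 1, 0, -1, 0], [0, 3, 0, -3, 0], [0, 8, 0, -8, 0],
     [0, 3, 0, -3, 0], [0, 1, 0, -1, 0]],
], dtype=float)
B = np.array([[1, 1, 1, 1, 1],
              [1, 2, 2, 2, 1],
              [1, 2, 0, 2, 1],
              [1, 2, 2, 2, 1],
              [1, 1, 1, 1, 1]], dtype=float)

T0, GAMMA, LAMBDA = 17.0, 3.0 / 128.0, 0.5  # model constants from [4]

def jnd_profile(p):
    """Per-pixel JND profile for an 8-bit grayscale image p of shape (H, W)."""
    p = p.astype(float)
    pad = np.pad(p, 2, mode="edge")  # 5x5 neighbourhood for every pixel
    win = np.lib.stride_tricks.sliding_window_view(pad, (5, 5))
    bg = np.tensordot(win, B, axes=([2, 3], [0, 1])) / 32.0           # Eq. (8)
    grads = [np.abs(np.tensordot(win, Gk, axes=([2, 3], [0, 1]))) / 16.0
             for Gk in G]                                             # Eq. (7)
    mg = np.max(grads, axis=0)                                        # Eq. (6)
    alpha = 0.0001 * bg + 0.115                                       # Eq. (4)
    beta = LAMBDA - 0.01 * bg                                         # Eq. (5)
    f1 = mg * alpha + beta                                            # Eq. (2)
    f2 = np.where(bg <= 127.0,
                  T0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0,
                  GAMMA * (bg - 127.0) + 3.0)                         # Eq. (3)
    return np.maximum(f1, f2)                                         # Eq. (1)
```

For a constant mid-grey image the gradient terms vanish, and the profile reduces to the background-luminance threshold of Eq. (3) alone.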
To test how well the JND calculations work, Chou and Li developed a so-called
PSPNR (Peak Signal to Perceptible Noise Ratio) value. Often, the quality of an image
is measured with the PSNR (Peak Signal to Noise Ratio) value, calculated as

$$\mathrm{PSNR} = 20\log_{10}\frac{255}{\sqrt{MSE}}, \quad (10)$$

where MSE is the mean squared error. Unfortunately, the PSNR cannot accurately
describe the perceptual quality of the image, and therefore other measures are
needed [4]. The PSPNR value measures the perceptible distortion energy, and it is
defined as

$$\mathrm{PSPNR} = 20\log_{10}\frac{255}{\sqrt{E\left\{\big[\big(|p(x,y)-\hat{p}(x,y)| - JND_{fb}(x,y)\big)\,\delta(x,y)\big]^2\right\}}}, \quad (11)$$

$$\delta(x,y) = \begin{cases} 1, & |p(x,y)-\hat{p}(x,y)| > JND_{fb}(x,y) \\ 0, & |p(x,y)-\hat{p}(x,y)| \le JND_{fb}(x,y) \end{cases}, \quad 0 \le x < H,\; 0 \le y < W,$$

where $\hat{p}(x,y)$ denotes the reconstructed pixel at (x, y) and $JND_{fb}(x,y)$ the
original JND profile. [4]
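Both quality measures are short functions in practice. The sketch below is an illustrative NumPy reading of Eqs. (10) and (11), not code from the thesis; in pspnr, only the part of the error that exceeds the JND profile contributes to the noise energy:

```python
import numpy as np

def psnr(p, p_hat):
    """Eq. (10): PSNR in dB for 8-bit images."""
    mse = np.mean((p.astype(float) - p_hat.astype(float)) ** 2)
    return 20.0 * np.log10(255.0 / np.sqrt(mse))

def pspnr(p, p_hat, jnd):
    """Eq. (11): PSNR computed only over perceptible error, i.e. the
    error that exceeds the per-pixel JND threshold."""
    err = np.abs(p.astype(float) - p_hat.astype(float))
    delta = (err > jnd).astype(float)       # indicator from Eq. (11)
    energy = np.mean(((err - jnd) * delta) ** 2)
    if energy == 0.0:
        return float("inf")                 # no perceptible distortion at all
    return 20.0 * np.log10(255.0 / np.sqrt(energy))
```

A distortion that stays everywhere below the JND profile thus yields an infinite PSPNR even though its PSNR is finite, which is exactly the behaviour the measure was designed for.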


2.3. Robustness

Watermark that will not endure even the most common data processing such as
scaling is useless. Even the fragile watermarks that are meant to get broken will not
ideally break by accident. This chapter presents some of the attacks expected,
including JPEG compression and geometrical attacks. The print-scan process and
picture taking with a mobile phone are viewed separately.


2.3.1 Robust and fragile watermarks

Most watermarks are designed to resist many kinds of attacks meant
to destroy the embedded information, but some watermarks are made to break.
These watermarks are called fragile watermarks, and their purpose is to detect
tampering of the image. If the image is purposefully changed, the watermark is
destroyed.
Robust watermarks are required to survive almost all attacks, but
designing such watermarks is very difficult. It is practically impossible to design a
watermarking system that would resist all kinds of attacks. With a careful analysis of
system requirements, however, it is possible to design a watermarking system that is
robust against the most probable attacks in the required environment. One way to deal
with different kinds of attacks is to use multiple watermarking; that is, a few
watermarks are embedded in the image, each of which is designed to recover from
a different kind of attack. Multiple watermarking is explained in more detail in
Section 3.3.


2.3.2 Printed images

Most of the time when talking about the print-scan process, only the scanning
process is considered and the printing process is neglected. However, the printing
process also inflicts attacks on the watermark. It is generally
acknowledged that printing quality varies between printers. Perry et al.
[5] experimented with different printers and concluded that the end products
vary across manufacturers and even between identical models from the same
manufacturer.
Paper quality obviously affects the quality of the printed image, and Perry et al.
also remind that ink density has an effect on the result [5]. These results show that
the printing process should not be neglected when reading watermarks from printed
images but carefully considered.


2.3.3 Print-scan

The print-scan process has many similarities with photographing, and print-scan
robustness has therefore been treated as a prerequisite when defining a watermarking
system for camera phones. In the process, a watermarked image is first printed and
then scanned, and as a consequence, the watermark should be robust against various
kinds of attacks. Figure 3 shows the user interface of the Epson GT-15000 scanner,
where the user defines the scanning area with a dashed-line quadrilateral. A large
portion of the background is cropped along with the image, and the watermark is no
longer in the centre of the scanned image. The watermark should endure geometrical
transformations, such as rotation, scaling and translation, but it should also be
readable after the DA/AD transform and noise addition, and it should not be broken
by slight cropping of the edges. Some research has been done recently on the
properties of the print-scan process, but the problem is complex, because the
distortions during the print-scan process are printer/scanner-dependent and
time-variant even for the same printer/scanner [6, 7]. While trying to find properties
that are invariant to the print-scan process, Solanki et al. [7] studied the print-scan
properties of the discrete Fourier transform (DFT) magnitudes and concluded that:
1. The low and mid frequency coefficients are preserved much better than the
high frequency ones.
2. In the low and mid frequency bands, the coefficients with low magnitudes
see a much higher noise than their neighbours with high magnitudes.
3. Coefficients with higher magnitudes see a gain of roughly unity.
4. Slight modifications to the selected high magnitude low frequency
coefficients do not cause significant perceptual distortion to the image.

Figure 3. The user interface of the Epson GT-15000 scanner and a scanning area
selected by a user.

These properties were further studied by He and Sun [6], who introduced three more
properties:
5. Most textures are preserved, or, most relationships between DFT
coefficients are preserved even though individual DFT magnitudes may vary.
6. The dynamic range of intensity values is reduced, that is, the original range
of 0-255 becomes roughly 70-250 after the print-scan process.
7. The distribution of pixel values after the print-scan process looks roughly
like a spindle, as in Figure 4.



Figure 4. The intensity distribution of the print-scan process. The x-axis represents
the original intensity while the y-axis represents the print-scanned intensity.

These results give some guidelines for designing a print-scan robust
watermarking system.
2.3.4 Mobile phone with a camera

While the print-scan process is clearly a two dimensional problem, taking a picture
with a camera phone, that is, the print-cam process, is a three dimensional one. All
attacks that occur in the print-scan process will also occur in the print-cam process.
This is not by any means the end of the story, but photographing with a camera
phone introduces an abundance of attacks to watermarking systems.
Some of the attacks explained here are due to the mobile phone camera properties
and some are interlinked with the camera lens. The camera phone itself presents
some technological constraints that need to be considered. One of the biggest
problems is the low processing power of the camera phones, which sets new
requirements for the watermarking system. The application must be lightweight and
its memory consumption must not exceed certain limits if the watermark processing
is done in the phone. The watermarking system should also be robust against JPEG-
compression because in most of the camera phones the captured image is
automatically compressed before saving. The JPEG-compression is explained in
more detail in section 2.3.6.
The cameras in mobile phones are not of high quality, and, although their quality
approaches that of digital cameras, they are still far behind. At present, the best
cameras in mobile phones have a resolution of two megapixels or more, but such
camera phones are still rare. It must be remembered that the number of megapixels
is not the whole truth; the quality of the optics also has a huge impact on the quality
of a photographed image.
Even high quality optics will not entirely save the image from the pincushion and
barrel distortions shown in Figure 5. In barrel distortion, lines that are straight in the
real world bow outward in the image, whereas in pincushion distortion straight lines
bow inward. In both distortions, the amount of distortion is larger close to the image
edges. Fortunately, this type of distortion can be corrected easily, because in every
camera the properties of the lens stay the same, and therefore the parameters that
define the amount of distortion can be determined beforehand for each camera lens.


Figure 5. Chess patterned reference image a) original image b) barrel
distorted image c) pincushion distorted image.

When correcting barrel distortions, all that needs to be known are the properties of
the camera lens. These properties can be found out by taking one or more pictures
of a reference image and analysing the pictures. A reference image can be, for
example, a chessboard image, where black and white squares alternate as in Figure 5.
The properties of the lens stay the same in every picture taken, and once the
properties have been found, the barrel distortion can always be inverted.
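As an illustration of the inversion idea (this is a generic sketch, not the calibration toolbox used in this work), a simple polynomial radial model r_d = r_u(1 + k1·r_u² + k2·r_u⁴) can be inverted numerically once the coefficients have been calibrated; the coefficient values and image centre below are hypothetical:

```python
import numpy as np

def undistort_points(pts, k1, k2, center, iters=10):
    """Invert the radial model r_d = r_u * (1 + k1*r_u^2 + k2*r_u^4)
    by fixed-point iteration: map distorted coordinates back to
    undistorted ones. k1 < 0 models barrel, k1 > 0 pincushion."""
    d = np.asarray(pts, dtype=float) - center   # centre on the optical axis
    u = d.copy()                                # initial guess: no distortion
    for _ in range(iters):
        r2 = np.sum(u ** 2, axis=1, keepdims=True)
        u = d / (1.0 + k1 * r2 + k2 * r2 ** 2)  # refine the undistorted estimate
    return u + center

# A point on the optical axis is unchanged; off-axis points of a
# barrel-distorted image are pushed back outward.
centre = np.array([320.0, 240.0])
pts = np.array([[320.0, 240.0], [100.0, 80.0]])
corrected = undistort_points(pts, k1=-2e-7, k2=0.0, center=centre)
```

Because the distortion is small, the fixed-point iteration converges in a few steps; real calibration software fits k1 and k2 (and further parameters) from pictures of a reference pattern.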
In this work, separate software is used for correcting barrel distortions. The Camera
Calibration Toolbox for Generic Lenses is a Matlab toolbox made by Kannala, freely
available on the Internet [8]. The toolbox is based on the generic camera model and
enables correction of barrel distortions as well as several other corrections
[9]. The calibration of the camera was done by using a calibration cube shown in
Figure 6.



Figure 6. Reference cube for the Calibration Toolbox for Generic Lenses.

Other kinds of distortions that should be corrected are the effects of the three
dimensional world, that is, perspective distortions. It is practically impossible to set
the camera so that it is entirely perpendicular to the image, and therefore the picture
taken will be slanted.


2.3.5 Geometrical attacks

Robustness against geometrical attacks is necessary when designing a print-scan or
print-cam robust watermarking system. In this research, mostly rotation, translation
and scaling are studied, but barrel distortion and perspective transformations are
also paid attention to.
Figure 7 shows an example photograph of a watermarked image taken with a Nokia
N90 camera phone. As can be seen, there is visible barrel distortion in the image and
the perspective has also changed somewhat: the right side of the image is slightly
narrower than the left side. These distortions make reading the watermark difficult,
and without a proper watermarking technology all the information embedded in the
image could be lost.
The previously proposed methods for reading the watermark from distorted images
can be divided roughly into two main categories. The first is to find out the
geometrical transformations that the image has gone through and then apply an
inverse transform [10]. The other is to embed the watermark in a transformation
invariant domain, such as the Fourier-Mellin domain [11].
While taking a picture with a camera, it is practically impossible to keep the
camera perfectly straight and perpendicular to the object as shown in Figure 8.
Therefore some perspective transformations will happen. A perspective transform
is the result of projecting a three dimensional scene onto the two dimensional image
plane. Usually a perspective transformation is well approximated by an affine
transformation and is equivalent to the composed effect of translation, rotation,
scaling and shear. Here, homogeneous coordinates are utilized to define the
transformation matrices, because all affine transformations can be represented as
matrix multiplications in homogeneous coordinates [12]. Homogeneous coordinates
are explained in more detail in Appendix 1.

Figure 7. An image taken with a Nokia N90 camera phone where some distortions
have occurred.

Figure 8. When the camera is not perpendicular to the object, perspective
transforms will happen.
Translation is defined as an operation that displaces image points by a fixed
distance in a given direction. It is possible to describe the translation of point P to
point P' by specifying a displacement vector d:

P' = P + d,    (12)

for all points P on the object. The homogeneous coordinate forms of these points and
the vector are

P = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \quad
P' = \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}, \quad
d = \begin{bmatrix} d_x \\ d_y \\ 0 \end{bmatrix},    (13)

from where we can see that

x' = x + d_x,
y' = y + d_y.    (14)

This result can be represented as a matrix multiplication [12]:

P' = TP,    (15)

where

T = \begin{bmatrix} 1 & 0 & d_x \\ 0 & 1 & d_y \\ 0 & 0 & 1 \end{bmatrix}.    (16)

For scaling where the fixed point, that is, the point that is unchanged by the
transformation, is at the origin, the two equations corresponding to (12) are

x' = s_x x,
y' = s_y y,    (17)

where s_x and s_y are the scaling coefficients for the x and y dimensions,
respectively [12]. These equations can be combined as

P' = SP,    (18)

where the transformation matrix is

S = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}.    (19)
The third basic transformation matrix, rotation, can be derived similarly. The fixed
point is again set at the origin and the equations for rotation are

x' = x \cos\theta - y \sin\theta,
y' = x \sin\theta + y \cos\theta,    (20)

where \theta is the rotation angle counter-clockwise about the origin [12]. The
matrix form is

P' = RP,    (21)

where

R = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}.    (22)

The motivation for using transformation matrices to represent transformations is
that transformations can be combined and inverted. This can be done by using matrix
multiplication [12]. For example, if T is a translation matrix and R a rotation matrix,
we get

b = Ca = RTa,    (23)

where a is some vector, C is the new combined transformation matrix of R and T,
and b is the resulting translated and rotated vector. The order of the transformation
matrices is important, because RT is not the same as TR. Here RT means that the
vector a has first been translated to some location and then rotated around the origin.
If the equation had been TR, the vector a would have been rotated first and the
rotated vector would then have been translated.
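The homogeneous matrices of equations (16) and (22) and their composition in (23) can be sketched in NumPy; the example below illustrates the non-commutativity of RT and TR for an arbitrary point:

```python
import numpy as np

def translation(dx, dy):
    # Homogeneous translation matrix, as in equation (16)
    return np.array([[1.0, 0.0, dx],
                     [0.0, 1.0, dy],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    # Homogeneous rotation matrix about the origin, as in equation (22)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

a = np.array([1.0, 0.0, 1.0])            # the point (1, 0) in homogeneous form
T, R = translation(2.0, 0.0), rotation(np.pi / 2)

b1 = R @ T @ a                           # translate first, then rotate
b2 = T @ R @ a                           # rotate first, then translate
# b1 ends up at (0, 3) while b2 ends up at (2, 1): RT differs from TR.
```

Inverting a combined transform is just a matrix inverse, which is what a watermark reader uses to undo an estimated distortion.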


2.3.6 JPEG transform

In many kinds of image processing applications, compression algorithms play an
important role. Reducing the file size of an image is necessary for storage and
transmission. Especially in mobile phone environment, where the space is scarce,
heavy compression is used. One of the most well known and widely used algorithms
is JPEG (Joint Photographic Experts Group), which is usually defined as a lossy
compression algorithm. This means that some of the information the image contains
is lost during compression. The losses are usually not noticeable to the human eye
but affect the watermark extraction quality.
Most of the time, when JPEG is talked about, the image compression standard is
meant. Actually, the abbreviation JPEG refers to a joint ISO/CCITT (International
Organization for Standardization / International Telegraph and Telephone
Consultative Committee) committee group that has published several standards,
including ISO/IEC IS 10918-1 | ITU-T Recommendation T.81, which is often
referred to as JPEG. The standard was approved in 1994 by ISO and CCITT, which
is now called ITU-T (International Telecommunication Union, Telecommunication
Standardization Sector). [13]
JPEG is designed to be an efficient coding scheme for continuous tone (multilevel)
still images and it was intended to become the first international digital compression
standard for still images. It has four encoding modes under which various coding
algorithms are defined:

1. Sequential encoding
2. Progressive encoding
3. Lossless encoding
4. Hierarchical encoding

The implementations are not required to cover all of these, but the baseline system is
based on sequential coding. [14, 15]
The coding algorithms are mainly based on the two dimensional DCT (discrete
cosine transform), except for the lossless encoding scheme, which employs
predictive processes. In lossless encoding, predictive coding and entropy coding are
used. The resulting compression ratio is only about 2:1, but because no information
is lost, the decoded image is an exact replica of the original, unlike in the DCT
coding schemes, where some information is always lost in quantization. [14]
In the DCT based encoding process, samples of an image are grouped into 8x8
blocks, each of which is transformed with DCT into a set of 64 coefficients. Each of
the coefficients is then quantized by a different uniform quantizer, where the
quantization step-sizes are based on a visibility threshold of 64-element quantization
matrices. The standard does not specify default values for quantization tables but lets
the applications specify values for their particular task. [14, 15]
After quantization, entropy coding is applied. The DC (Direct Current) coefficient is
differentially encoded by using the previous quantized DC coefficient to predict the
current one. The 63 AC (Alternating Current) coefficients are transformed into a one
dimensional sequence with the zigzag scan shown in Figure 9. The one dimensional
sequence is then entropy coded by using either Huffman or arithmetic coding. For
the baseline system, only Huffman coding is used. [15]
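The zigzag reordering can be sketched as below; the `zigzag` helper is an illustration written for this text, and the block values are simply row-major indices rather than real quantized DCT coefficients:

```python
import numpy as np

def zigzag(block):
    """Reorder an 8x8 coefficient block into the 1-D zigzag sequence
    used by JPEG: walk the anti-diagonals, alternating direction."""
    h, w = block.shape
    out = []
    for s in range(h + w - 1):                 # s = i + j indexes an anti-diagonal
        idx = [(i, s - i) for i in range(max(0, s - w + 1), min(s, h - 1) + 1)]
        if s % 2 == 0:                         # even diagonals run bottom-left to top-right
            idx = idx[::-1]
        out.extend(block[i, j] for i, j in idx)
    return np.array(out)

block = np.arange(64).reshape(8, 8)            # row-major indices 0..63
seq = zigzag(block)
# The scan visits (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ...
```

Ordering the coefficients this way groups the low-frequency values at the front of the sequence, which makes the trailing run of (mostly zero) high-frequency coefficients easy to entropy code.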
The JPEG transform is at the moment the most commonly used image compression
standard. However, the situation will change in the near future as the new JPEG2000
(ISO 15444) standard gains ground. The JPEG2000 standard uses wavelet transforms
instead of the DCT, and it is claimed to be able to compress images up to 200 times
with no appreciable degradation in quality. [16]

Figure 9. A zigzag scan of quantized DCT coefficients.
3. WATERMARKING METHODS

Methods to embed watermarks are not limited to one domain; the watermark can
be embedded in almost any transformation domain available. There can even be
multiple watermarks embedded in the image: some in the same domain and some in
different domains. Explaining all the different watermarking methods proposed is
practically impossible, and therefore only some of the most important methods
concerning our application are explained. The first section explains the basic
watermarking scheme, the second section is a brief literature survey of previously
proposed methods in different domains, and the third section is about multiple
watermarking.


3.1. Generic watermarking scheme

A watermark can be embedded in an image in many ways. Some researchers exploit
the properties of transform domains, others create transform domains of their own
with the properties they need. Here I shall present some methods used in the
different domains, focusing mostly on blind, print-scan attack resilient watermarking
methods. In blind watermarking methods, the original image is not needed in the
extraction process, whereas in non-blind ones the original image is required.
Before embedding the watermark, the pixels of an image are usually divided into
luminance and chrominance components. It is possible to embed the watermark in
the colour information, but the most common way is to use the luminance
information. [2]
The watermark itself is usually a pseudorandom noise signal consisting of the
integers {-1, 0, 1}, and the amplitude of the signal is low compared to the image
amplitude to keep the watermark from being visible. The only constraints are that
the watermark signal should not correlate with the image content and that the energy
in the pseudorandom signal should be uniformly distributed. The most
straightforward way to embed a watermark is thus to add the pseudorandom signal,
with a suitable gain factor, to the luminance values of the pixels of an image. [2]
The basic watermark embedding process is illustrated in Figure 10, where the
watermarked image I_W(x, y) is obtained by adding the pseudorandom sequence
W(x, y) to the original image I(x, y). The corresponding formula is

I_W(x, y) = I(x, y) + k W(x, y),    (24)

where the pseudorandom sequence W(x, y) is multiplied by a small gain factor k. [2]
The previously embedded watermark can be detected by calculating the
cross-correlation

R_{I,W}(i, j) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} I_W(m, n) \, W^*(m + i, n + j),    (25)

between the possibly watermarked image I_W(x, y) and the complex conjugate of the
pseudorandom sequence W(x, y). If the result of the correlation exceeds some
predefined threshold, the watermark is detected.

Figure 10. Generic watermark embedding procedure.
The detector can make two kinds of errors. It may detect a watermark even when
there is none, an error known as a false positive, or it may fail to detect a watermark
that is present, an error called a false negative. Generally, false positives are
considered worse than false negatives, because if an existing watermark is not found,
the image can be checked again and again, whereas a false positive cannot be
corrected: the watermark is assumed to be detected even though it is not. [2]
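A minimal NumPy sketch of this embed-and-correlate scheme follows; the random image stands in for real luminance data, and the gain value and detection threshold are hypothetical choices for the illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=42)            # the seed plays the role of a key

image = rng.uniform(0, 255, size=(128, 128))    # stand-in for a real luminance image
W = rng.choice([-1.0, 1.0], size=image.shape)   # pseudorandom watermark signal
k = 5.0                                         # hypothetical small gain factor

marked = image + k * W                          # additive embedding, as in (24)

def detect(img, wm):
    # Zero-shift correlation between the mean-removed image and the watermark;
    # it is roughly k when the watermark is present and near zero otherwise.
    return float(np.mean((img - img.mean()) * wm))

present = detect(marked, W)                     # clearly above a threshold such as k/2
absent = detect(image, W)                       # close to zero
```

Because the watermark is uncorrelated with the image content, the image terms average out and only the k·W·W term survives, which is exactly the separation the threshold test relies on.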
With the aforementioned method, only one bit can be embedded. To increase the
payload, the image can, for example, be divided into several blocks or sub-images
with one bit of an information string embedded in each of these sub-images, as did
Smith and Comiskey [17]. Figure 11 [2] illustrates a similar method.


3.2. Domains

This section focuses on some of the most common watermarking domains. First,
some methods embedding the watermark in the spatial domain are dealt with; then,
methods working in the Fourier domain and the wavelet domain are discussed in a
similar way. The last section covers methods working in other domains.




Figure 11. Embedding watermark in blocks.


3.2.1 Spatial

Nowadays, a robust watermark is required to survive many kinds of attacks, of
which geometrical attacks are considered the most difficult to recover from.
Kostopoulos et al. [18] tried to solve this problem by embedding multiple
cross-shaped patterns in the image. Their method seemed to improve the robustness
of a watermarked image against small amounts of rotation, translation and scaling,
but it was very vulnerable to noise and more sophisticated attacks.
Methods that rely on a synchronization template, like the method by Kostopoulos et
al., are not generally held in high regard, for it is easy for an attacker to remove the
template, after which the watermark cannot be read. Template embedding methods
are, nevertheless, a very robust way to recover from geometric distortions, and when
designing value-adding services only unintentional attacks must be considered;
consequently, template removal attacks are not expected.
A large number of template embedding methods have been proposed and studied
for their great ability to recover from geometrical distortions. Kutter [19] proposed a
method for recovering from general geometric transformations. The idea of the
method was multiple embedding of the same watermark at shifted locations in the
image. The method can be seen as a form of spread spectrum watermarking, except
that an extra step was used for predicting the embedded watermark and thus
increasing the performance of the detector. The watermarks could then be predicted
and correlated to determine the geometric transformation.
Deguillaume et al. [20] also proposed a method based on repetition. In the method,
a periodic pattern was embedded in an image in order to get a high number of peaks
after autocorrelation of the magnitude spectrum of the Fourier transform. After
autocorrelation, a Hough or Radon transform could be applied to determine the
regular grid shaped template. With the orientation of the grid, it was possible to
determine the parameters of the general affine transform applied to the image.


3.2.2 Fourier domain based methods

Fourier transform is one of the most famous transforms used in signal processing. It
was named after Joseph Fourier, a French mathematician and physicist, who lived
during Napoleon's time and was the first to suggest that any function of a variable
can be expanded in a series of sines of multiples of the variable. This was not true,
however, but the suggestion that it might be true, even partially, was a breakthrough
[21]. The two dimensional discrete Fourier transform (2D-DFT) of f(i, k) is defined
as

F(m, n) = \sum_{i=0}^{N-1} \sum_{k=0}^{M-1} f(i, k) \, e^{-j 2\pi (mi/N + nk/M)},    (26)

where f(i, k) is an N-by-M array and j^2 = -1. The result F(m, n) is a complex
signal with real and imaginary parts, from which the magnitude and phase of the
Fourier transform can be determined. The magnitude and phase of a Fourier
transform are described respectively as

|F(m, n)| = \sqrt{F_{Re}^2(m, n) + F_{Im}^2(m, n)},    (27)

\varphi(m, n) = \tan^{-1} \left( \frac{F_{Im}(m, n)}{F_{Re}(m, n)} \right),    (28)

where F_{Re} is the real part of the transform and F_{Im} is the imaginary part. [22]
The inverse transform is defined as

f(i, k) = \frac{1}{NM} \sum_{m=0}^{N-1} \sum_{n=0}^{M-1} F(m, n) \, e^{j 2\pi (mi/N + nk/M)}.    (29)

From these equations it can be seen that a rotation in the spatial domain corresponds
to a rotation in the frequency domain, that is,

f(x \cos\theta + y \sin\theta, \; -x \sin\theta + y \cos\theta) \leftrightarrow F(u \cos\theta + v \sin\theta, \; -u \sin\theta + v \cos\theta),    (30)

and scaling in the spatial domain corresponds to scaling in the frequency domain,
that is,

f(ax, by) \leftrightarrow \frac{1}{|ab|} F\!\left( \frac{u}{a}, \frac{v}{b} \right).    (31)

Because of the properties of the Fourier domain presented above, the Fourier
transform is a powerful tool in watermarking. One of the most difficult and crucial
phases in detecting a watermark is the ability of the watermarking system to find the
watermark in the distorted image. When the Fourier domain is used, the problem
decreases significantly, since the magnitudes of the Fourier domain are invariant to
translations in the spatial domain, but not to rotation and scaling [22]. A translation
in the spatial domain is a phase shift in the frequency domain.
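This invariance is easy to verify numerically; the sketch below uses a cyclic shift, for which the property holds exactly (a real translation with cropping only obeys it approximately):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
img = rng.uniform(size=(32, 32))

shifted = np.roll(img, shift=(5, 9), axis=(0, 1))   # cyclic translation of the image

mag = np.abs(np.fft.fft2(img))
mag_shifted = np.abs(np.fft.fft2(shifted))

# The magnitude spectra are identical; only the phase encodes the shift.
magnitudes_match = bool(np.allclose(mag, mag_shifted))
phases_match = bool(np.allclose(np.angle(np.fft.fft2(img)),
                                np.angle(np.fft.fft2(shifted))))
```

This is exactly why templates and messages embedded in Fourier magnitudes survive translation: the watermark reader never needs to know where the image was shifted to.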
Some research has been done on the watermark templates in the frequency domain.
Pereira and Pun [10] embedded a template in the middle frequencies of the Fourier
domain magnitudes. The template they used did not contain any information but was
merely used for detecting the transformations the image had gone through. The
template consisted of approximately 14 points embedded in the magnitudes of
Fourier domain. The points were embedded uniformly along two lines at different
angles and by finding these lines from transformed and watermarked image, the
amount of rotation and scaling could be determined. The actual message was
embedded by using spread-spectrum methods into the Fourier domain between two
radii occupying a mid-frequency range. Their method has been criticised because the
template is easy to remove and thus the actual watermark is also lost. The method is
nevertheless quite robust against rotation, scaling and noise. [10]
Another similar method was proposed by Lee and Kim [23]. They embedded a
pseudorandom sequence into the middle frequencies of the input image as in Figure
12 and used cross-correlation at different radii to find the sequence, as illustrated in


Figure 12. Composing the circular template.
Figure 13. Since the sequence was pseudorandom, they could derive the amount of
rotation by finding the position of the cross-correlation peak. The drawback of this
method was that the rotation angle could only be calculated at a precision of 1
degree and the amount of translation could not be found. On the other hand, the
method is fairly fast and relatively simple to use.



Figure 13. a) The spectrum of the watermarked image. b) Searching
of the template.

The magnitudes of the Fourier domain are generally used for their invariance to
translation in the spatial domain. In some papers, the idea of invariance in the
Fourier magnitude domain has been developed further, and domains that are
invariant to translation, rotation and scaling have been researched. Ó Ruanaidh and
Pun used Fourier-Mellin transform based invariants for watermarking [11]. There is
one drawback in this method, however: it works only against rotation, scaling and
translation distortions and not against aspect ratio changes or shear, for example.
For more information about watermarking in the spatial and frequency domains, see
the paper by Hartung and Kutter [24].


3.2.3 Wavelet domain based methods

From the Fourier transform, we know that most signals can be expressed as a
series of sines and cosines. The Fourier transform is an efficient way to analyse a
signal, but even if we knew all the frequencies in a signal, we would not know when
those frequencies are present. A solution is to divide the signal into small segments
and analyse them separately. After that, we have some knowledge of when and
where the frequencies appeared, but in dividing the signal we come up against
Heisenberg's uncertainty principle, which states that it is not possible to determine
the exact frequency and the exact time of occurrence of that frequency in a signal
simultaneously.
The problem seems to be unsolvable, but the wavelet transform offers a possible
solution. The wavelet transform employs a fully scalable modulated window, which
is shifted along the signal and the spectrum is calculated for every position. The
process is then repeated multiple times with a slightly different length of the window.
The final result is a collection of time-frequency representations with different
resolutions of the signal, the so-called multiresolution analysis. [25]
The discrete wavelet transform of f(m) is usually defined as

W(s, \tau) = \sum_{m=0}^{N-1} f(m) \, \psi_{s,\tau}^*(m),    (32)

where * denotes complex conjugation. The formula describes how the function f(m)
is decomposed into a set of basis functions \psi_{s,\tau}(x), called the wavelets. A
set of wavelet basis functions, \{\psi_{s,\tau}(x)\}, can be generated by translating
and scaling the basis wavelet \psi(x) as

\psi_{s,\tau}(x) = \frac{1}{\sqrt{s}} \, \psi\!\left( \frac{x - \tau}{s} \right),    (33)

where s is a scale factor, \tau is the translation factor and the single basic wavelet
\psi(x) is the so-called mother wavelet. [25]
One of the first wavelet transforms, and probably the most applied one, is the Haar
transform, which was invented before the term "wavelet" was coined. It is the
simplest possible wavelet, and its wavelet function is of the form [25]

\psi(x) = \begin{cases} 1, & 0 \le x < 0.5 \\ -1, & 0.5 \le x < 1 \\ 0, & \text{otherwise.} \end{cases}    (34)

The wavelet domain is neither translation nor rotation invariant, but it is often used
because of the many advantages it has compared to other domains. In the Fourier
domain, the transform uses sinusoidal waves as basis functions and is thus only
localized in frequency. By contrast, wavelets are waves of limited duration and are
therefore localized in both time and frequency. This space-frequency representation
is good at localizing image features, such as edges and textured areas, which might
be neglected when working in the Fourier domain. [22, 26]
Another main advantage is the wavelet domain's superior HVS (human visual
system) modelling capabilities compared with other domains. A reason for this is the
similarity of the wavelet transform to the multiple channel models of the HVS. The
frequency decomposition of the wavelet transform resembles the signal processing
of the HVS: both divide the image into frequency channels that respond to an
explicit spatial location, a limited band of frequencies and a limited range of
orientations. [26]
The wavelet transform of an image is also usually fast to calculate. This is due to its
low linear complexity O(n), compared for example with the DCT (Discrete Cosine
Transform) applied over an entire image, which has a complexity of O(n log n).
Transmitting a transformed image is also fast, due to the multi-resolution
representation of the image, because hierarchical processing can be done in a
straightforward way. [26]
Normally, wavelet watermarking methods are categorized by the wavelet
coefficients in which the watermark is embedded, especially between the
approximation coefficients, which contain the low-frequency information, and the
other coefficients, that is, the detail sub-bands that represent the high-frequency
information in the horizontal, vertical and diagonal orientations. These detail
sub-bands are shown in Figure 14. [26]



Figure 14. a) Wavelet coefficients calculated with Haar function in Matlab. b)
Structure of the wavelet coefficients in the image a).
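A single level of this decomposition can be sketched with the Haar wavelet as below; an averaging normalization is used instead of the orthonormal 1/√2 scaling, and the sub-band labels follow the naming used above, so this is illustrative rather than a reference implementation:

```python
import numpy as np

def haar2d(img):
    """One level of a 2-D Haar decomposition: returns the approximation
    (LL) sub-band and the vertical (LH), horizontal (HL) and diagonal
    (HH) detail sub-bands. Height and width must be even."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0     # low-pass along rows
    d = (img[0::2, :] - img[1::2, :]) / 2.0     # high-pass along rows
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0        # approximation
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0        # vertical detail
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0        # horizontal detail
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0        # diagonal detail
    return LL, LH, HL, HH

img = np.full((8, 8), 10.0)                     # a constant test image
LL, LH, HL, HH = haar2d(img)
# For a constant image all detail sub-bands are zero and LL keeps the value.
```

Applying the same step recursively to LL produces the multiresolution pyramid shown in Figure 14, and a watermark can then be added to whichever sub-band the method targets.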

Barni et al. [27] embedded a binary pseudorandom sequence in the DWT (Discrete
Wavelet Transform) coefficients of the three largest detail sub-bands of the image,
that is, the vertical detail (LH1), horizontal detail (HL1) and diagonal detail (HH1),
using visual masking so that the watermark could be embedded with maximum
energy. The watermark was detected by correlating the marked wavelet transform
with the watermarking sequence. The detection results obtained were very good
when dealing with image cropping, because the watermark energy could be kept as
high as possible owing to the similarity of the DWT decomposition to models of the
HVS, and therefore even small portions of the image were sufficient to correctly
guess the embedded code.
Watermarking in wavelet domain resembles watermarking in spatial domain.
Many of the basic techniques and methods used in the spatial domain can also be
employed in the wavelet domain. Another aspect to be considered when designing a
watermarking system in wavelet domain is the upcoming JPEG2000 standard, which
works in wavelet domain. That is, however, only one of the reasons why the wavelet
domain appears to be so attractive right now.


3.2.4 Methods in other domains

Spatial, Fourier and wavelet transforms are not the only transformation domains that
have been used in the field of digital watermarking. Many other domains have been
researched and their qualities investigated and exploited. Most of the other domains,
however, are variations or extensions of well-known Fourier or wavelet domains.
The Hadamard transform is an example of a generalized class of Fourier transforms.
The difference between the two transforms is that the basis functions of the
Hadamard transform are variations of a square wave rather than sinusoids. The
Hadamard transform has only 1 and -1 as elements in its kernel matrix, and this
simplicity gives it a significant advantage in processing time over some other
transforms.
Quite a few studies have been published concerning the Hadamard transform in
watermarking, and one of them is a method proposed by Gilani and Skodras [28]. In
their technique, the watermark is embedded in the perceptually most significant
spectral components of an image. The image is first Haar wavelet transformed, and
then the lowest frequency band is Hadamard transformed. The result is then zigzag
scanned and the watermark is embedded in those coefficients. The extraction method
is fairly similar, but the zigzag scanned coefficients are cross-correlated with the
watermark generated by a secret key.
Some research has also been done in the Gabor [29] and Fresnel [30] transform
domains, but the research in these domains has not evolved a great deal. The
methods developed are not robust enough against print-scan attacks or geometrical
distortions. Another promising domain is the fractional Fourier domain, which is a
generalization of the classical Fourier transform [31]. Not much watermarking
research has been done in this domain, because the idea of it is similar to that of the
wavelet domain, and so it remains to be seen whether it is a suitable domain for
watermarking.
A lot of Discrete Cosine Transform (DCT) based watermarking algorithms have
also been proposed. Some of them use block based algorithms, where the image is
divided into blocks and the watermark is embedded in those blocks. These methods,
however, are not generally robust against geometric transformations and are not
examined here.
There exist various transform domains, of which only some have been studied for
watermarking. All of them have good or even superior qualities, but also some side
effects. Therefore, if a transform domain is to be used, it must be selected carefully,
and the properties of the environment where the watermark is used must be kept in
mind.


3.3. Multiple watermarking

Multiple watermarking means, in short, embedding more than one watermark in the
image. Lähetkangas [32] studied the problem of multiple watermarking in her
Master of Science thesis. She studied cases where there were multiple watermarks
and multiple users who wanted to embed information in digital images. To analyse
different multiple watermarking techniques, she developed a new classification
system for the multiple watermarking methods. A previous classification system was
developed by Sheppard et al. [33], who divided the multiple watermarking methods
into three classes: re-watermarking, segmented watermarking and composite
watermarking.
In re-watermarking, the multiple watermarks are embedded by adding them one by
one on top of each other. This method is fast and simple, but it can also be used as an
attack in some circumstances: the last embedded watermark can destroy the
previously embedded ones, so the watermarking methods must be chosen with care.
Another drawback of this method is that every embedded watermark decreases the
quality of the watermarked image, and consequently the PSNR value also drops. [33]
Another way to embed multiple watermarks in an image is to divide the image into
segments and embed each watermark in its own segment. This is called segmented
watermarking, and it does not degrade the image more than embedding a single
watermark. It has its limits, however: as the number of segments rises, their size
decreases, and embedding a watermark in a small segment becomes harder. [33]
A third way to use multiple watermarking is to use composite watermarking by
building a composite watermark from a collection of watermarks. The watermarks
can be, for example, pseudorandom sequences that are combined and then embedded
in the image as usual. The composite watermark will be separable if the different
watermarks are orthogonal or, as in the case of pseudorandom sequences,
uncorrelated. [33]
Lähetkangas motivated her work with value-adding watermarks in addition to
DRM (Digital Rights Management) problems and discussed the multiple watermark
hiding methods from various points of view. She, too, divides the multiple
watermarking methods into three classes, but this time the classes are the basic
algorithm, methods to divide the watermarking space, and multiple watermark
hiding techniques. [32]
The basic algorithm is a watermarking algorithm that is applied to hide one
watermark once. The basic algorithms used in multiple watermarking form the basis
and set the limits for performance. It is possible to take advantage of the properties
of multiple basic algorithms in multiple watermarking. [32]
Instead of classifying re-watermarking and segmented watermarking separately
[33], Lähetkangas [32] combines them under methods to divide the watermarking
space. She states that these methods define whether the multiple watermarks are
embedded in the content on top of each other or in parallel with each other.
The third class of the Lähetkangas classification system is the multiple
watermark hiding techniques. They define the order in which the watermarks are
embedded and who embeds them. In some applications there might be several users
who want to embed watermarks to prove ownership and usage rights. For example,
the creator of the image may want to embed creator information in the image,
whereas the distributor of the content may wish to embed copyright information.
Some information might be protected, and therefore not everyone can be allowed to
access the embedded information as is. [32]
Most commonly, multiple watermarking algorithms are applied to enhance the
robustness of a watermark, but the development of the digital world has brought new
application areas. In the digital world, the media should be playable on different
platforms and devices and with different programs. The multiple watermarking
techniques can be applied to help the adaptation of the content to the various
environments by embedding watermarks in the content, each of which can contain
information about the functionality of the content, settings required and programs
needed [32].
4. COMMERCIAL DEVELOPMENT

Although digital watermarking has been around for only a short while, some
commercial applications have been initiated, Digimarc being probably the most
famous one. This chapter introduces some of the commercial applications in
value-adding watermarking.


4.1. Digimarc

Digimarc Corporation is based in Beaverton, Oregon, but it has international offices
also in London and Mexico. Digimarc is a developer of digital watermarking
solutions and it offers security and brand protection solutions to global corporations
and government entities. Although Digimarc has focused mainly on digital rights
management issues, it has launched a different kind of initiative based on
watermarking to enhance mobile computing and commerce. [34, 35]
The goal of Digimarc's initiative is to provide a service for camera phone users
to navigate from printed materials to a URL for a website with one click. That is, the
printed material contains a watermark that can be read with a camera phone. The
phone then recognizes the image and sends it to Digimarc's registry to determine
what to do with it: whether to direct the user to some website or to an e-commerce
application. The registry contains information about the user, for example how he
wants to pay for his purchases. To be able to use the service, the user must have
downloaded Digimarc's client to his phone. [35]
Digimarc has acknowledged the problem that, unlike a barcode, an invisible
watermark is not apparent to the naked eye. This raises the question of how to let
consumers know about the watermark in the materials. Digimarc's solution to
this is, at least initially, to partner with an e-commerce or catalogue company, since
the users of a catalogue are assumed to be comfortable using a device to select
catalogue items. [35]
Digimarc's initiative has roused interest at least in Japan, where MediaGrid
has licensed Digimarc's technology. On July 25, 2006, Digimarc announced the launch
of a digital watermarking pilot in Japan in the Amusement Café "Maid in Japan" café.
The idea of the pilot is to offer customers with a camera phone the possibility to
interact with digitally watermarked print materials. The materials may contain links
to online content such as a theme-oriented city guide or a mobile phone wallpaper
featuring favourite characters. The pilot was rolled out by MediaGrid and Success
Corporation, a leading developer of games and video games in Japan. [34]


4.2. CyberSquash

CyberSquash is developed by NTT (Nippon Telegraph and Telephone Corporation)
Cyber Solutions Laboratories. It is defined as an Internet access platform that makes
use of watermarking technologies. In this system, a watermark indicating a URL
for a desired homepage is embedded in a printed image, which can be read with a
web camera or a mobile phone with an i-appli digital camera. The image is then
processed and the user is directed to the specified homepage. [36]
There are two types of CyberSquash software used for reading the
watermarks: an ActiveX version and an i-appli version. The ActiveX version is
developed to read watermarks with a web camera on a PC, and the i-appli version to
read watermarks with a mobile phone equipped with a digital camera. The i-appli
version works only on NTT DoCoMo's i-mode mobile phones and is written in the
Java programming language. [36]
In CyberSquash the watermark is embedded in the image in four phases: first,
error correction coding is applied, and the resulting code is modulated using Direct
Sequence Spread Spectrum (DS-SS) modulation. The modulated code is then
permuted with a pseudorandom sequence to reduce the imbalance of robustness among
bits. In the third step, the modulated and interleaved code is embedded in the image
by applying two-dimensional pattern modulation in small blocks, as illustrated in
Figure 15, where the patterns are two two-dimensional sine curves with 90°
rotational symmetry. The actual embedding is done by multiplying the watermark
pattern by an embedding strength factor and superposing it on the original image.
Adaptive pattern superposition can also be employed to improve the balance between
image quality and the robustness of the watermark. [37]
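The first two phases can be sketched roughly in code. This is only an illustration of DS-SS spreading and pseudorandom interleaving in general; the chip length, seeds and function names below are assumptions of this sketch, not CyberSquash's actual parameters.

```python
import numpy as np

def dsss_modulate(bits, chip_len, seed=0):
    """Spread each message bit with a pseudorandom +/-1 chip sequence
    (illustrative DS-SS modulation)."""
    rng = np.random.default_rng(seed)
    chips = rng.choice([-1, 1], size=chip_len)
    symbols = np.where(np.asarray(bits) == 1, 1, -1)
    # Each bit is multiplied by the whole chip sequence.
    return (symbols[:, None] * chips[None, :]).ravel(), chips

def interleave(signal, seed=1):
    """Permute the modulated code pseudorandomly to even out the
    robustness differences among bits."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(signal.size)
    return signal[perm], perm

def deinterleave(signal, perm):
    """Undo the pseudorandom permutation at the detector."""
    out = np.empty_like(signal)
    out[perm] = signal
    return out

def dsss_demodulate(signal, chips, n_bits):
    """Despread: correlate each chip-length segment with the chip sequence
    and take the sign of the correlation."""
    seg = signal.reshape(n_bits, chips.size)
    corr = seg @ chips
    return (corr > 0).astype(int)
```

Despreading by correlation is what gives DS-SS its robustness: additive noise spread over a whole chip segment largely cancels in the correlation sum.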



Figure 15. 2D Pattern modulation in CyberSquash application.

The method presented above is not robust against geometric distortions, and
therefore the authors placed a frame around the watermarked image to recover
synchronization. The frame also works as an indicator showing that a watermark
has been embedded. After the image has been read with a camera, the frame is
recognized and the four corner points are located. From these locations, the
parameters of the affine transform and the scale can be determined. The parameters
obtained are approximations rather than exact values, and thus the corrected image
may contain small geometric distortions. The embedded watermark is designed to
be robust against such small distortions. [37]
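Mapping four detected corner points back to an axis-aligned rectangle is a standard homography (projective transform) estimation problem. The source does not specify CyberSquash's exact estimation procedure, so the generic direct linear solution below is merely an illustration.

```python
import numpy as np

def perspective_from_corners(src, dst):
    """Solve the 3x3 homography mapping the four detected frame corners
    (src) to the corners of an axis-aligned rectangle (dst).
    Generic sketch; not CyberSquash's published algorithm."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per point, with h33 fixed to 1.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    return np.append(h, 1.0).reshape(3, 3)
```

Once the matrix is known, each pixel of the captured image can be warped to the corrected grid; because the corner locations are only approximate, the result still carries the small residual distortions mentioned above.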
In watermark detection, the scaled and geometrically corrected image is
filtered with a pre-processor to increase the robustness of the watermark. After
filtering, the image is divided into small individual blocks, and the energy of the
frequencies corresponding to the two sine curves is calculated for each block. By
calculating the difference between the two energy levels, the sign of the embedded
sequence is obtained. When the embedded sequence has been determined, it is
descrambled with the pseudorandom permutation and demodulated. The last step is
to put the sequence through an error correction process. [37]
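The per-block decision can be illustrated by correlating a block with two candidate sine patterns and comparing the resulting energies. The frequencies f0 and f1 below are arbitrary stand-ins for the two CyberSquash patterns, whose exact form is not given in the source.

```python
import numpy as np

def detect_block_bit(block, f0, f1):
    """Decide the embedded sign in one square block by comparing the
    correlation energy against two 2-D sine patterns (illustrative)."""
    n = block.shape[0]
    y, x = np.mgrid[0:n, 0:n]
    p0 = np.sin(2 * np.pi * f0 * x / n) * np.sin(2 * np.pi * f0 * y / n)
    p1 = np.sin(2 * np.pi * f1 * x / n) * np.sin(2 * np.pi * f1 * y / n)
    # Sines at distinct integer frequencies are orthogonal over the block,
    # so only the embedded pattern yields large energy.
    e0 = np.abs(np.sum(block * p0))
    e1 = np.abs(np.sum(block * p1))
    return 1 if e0 > e1 else 0
```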
The CyberSquash trial was initiated in 2003 and was planned to run for six
months. After the trial, however, the CyberSquash application disappeared from
the news headlines, and it seems that its development has stopped.


4.3. Bar codes

Bar codes are not really watermarks, but their application areas are so
similar to those of watermarking that it is necessary to introduce them. The first
section contains a short history of bar coding and describes some of the
applications where bar codes are used. The second section describes how bar codes work.


4.3.1 Short history

Bar codes are usually thought of as a rival of watermarking in the field of value-adding
services, because they can be used in similar applications. There are, however, many
applications where either one is clearly more suitable. For example, in catalogues
where space is scarce, watermarking is clearly a better solution than bar codes. On
the other hand, in advertising where the user needs to be informed separately about
the extra data included in the image, a bar code might be more suitable than
watermarking.
Bar codes were invented in 1949 when a young graduate student, Joseph
Woodland, idly drew some dots and dashes in the sand. He was trying to figure out
how to read product information automatically, and he knew that Morse code
was the key to solving the problem. While lying on the beach he finally understood
how it should be done, and so the idea of bar codes was born. [38]
Joseph Woodland and his partner, Bernard Silver, received a patent on bar codes in
1952 (US Patent 2,612,994), but it was not a rapid commercial success. Although the
idea was ready for the commercial world, the technologies needed in bar
code scanners were expensive or yet to be invented. It took fifteen years before the first
commercial use of bar codes, and it was in the mid-seventies when bar codes finally
came into the stores. This was enabled by the invention and development of lasers
and integrated circuits, which became affordable in the 1960s and made bar code
scanners simple and profitable. [39]
One of the first standards created was the UPC (Universal Product Code), now
officially EAN.UCC-12 (International-Uniform Code Council), which is still in
use in the USA and Canada. In the early 1970s, the US grocery industry was trying to
find a way to reduce costs. They reasoned that automating the grocery checkout
process could do this, and after two years' effort they announced the UPC and the UPC
bar code symbol on April 1, 1973. The first item bought using this system was a
package of Wrigley's gum sold in Marsh's Supermarket in Troy, Ohio on June 26,
1974. [38]
Nowadays, bar codes are used in multiple ways, and programs that read bar codes
have even been published for camera phones. Critics who favour watermarking over
bar codes claim that bar codes are ugly and require extra space. Right now, however,
bar codes can contain more information than watermarks, and they are more robust
in mobile applications. Technology has surely advanced since the days when the
first pack of gum was sold with a bar code.

4.3.2 Operation

Bar codes are often described as a machine-readable representation of information
printed on some surface. The traditional bar code consists of bars and spaces of
alternating diffuse reflectivity, usually black and white parallel stripes, as illustrated
in Figure 16. The bar codes in the figure were generated with a bar code online demo
[40]. The information in a bar code is encoded in the bars and spaces along one
dimension, the horizontal, and therefore the vertical height of the bar code has no
specific meaning; it only makes reading the bar code easier. [41]



Figure 16. Examples of UPC-A bar codes.

There are two main ways to encode information in bar codes. The first is to
divide the piece of code into 1s and 0s and then paint the 1s black as bars and the 0s
as spaces, as in Figure 16, which shows examples of UPC-A codes. The second way
is to use width coding, that is, to assign each bit to a bar or space and make that
element wide if the bit is 1 and narrow if the bit is 0. For example, the bar code
standard Code 39 is a width code. [41]
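The width-coding principle can be sketched in a few lines. This is a simplified illustration only, not the real Code 39 symbology, which additionally uses fixed nine-element character patterns, start/stop characters and inter-character gaps.

```python
def encode_width(bits, narrow=1, wide=3):
    """Width-encode a bit string: elements alternate bar (printed as '1')
    and space (printed as '0'); each element is wide for bit 1 and narrow
    for bit 0. Simplified sketch of the width-coding idea."""
    out = []
    for i, bit in enumerate(bits):
        width = wide if bit == '1' else narrow
        colour = '1' if i % 2 == 0 else '0'   # bar, space, bar, ...
        out.append(colour * width)
    return ''.join(out)
```

For example, the bit string '101' becomes a wide bar, a narrow space and a wide bar.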
The encoded information can be read by various technologies. The most common
ones are cameras and lasers and although the technologies are different, the idea is
the same: when the scanner reads a bar code, it detects only reflections of light. The
black stripes will not reflect any light whereas the white stripes reflect most of the
light back. [41]
As technologies evolved, it was realized that the traditional one-dimensional
bar codes were not good enough. Data Matrix is a two-dimensional bar code standard
consisting of black and white dots, as in Figure 17. The Data Matrix code includes
four basic elements: two solid-line locators, two synchronization lines, data area and
a quiet zone. The data area contains, obviously, the encoded binary data, whereas the
quiet zone is the empty narrow area around the data matrix. The two solid-line
locators are solid perpendicular lines that indicate the data area boundaries and the
orientation of the data matrix. The two synchronization lines opposite the solid-line
locators indicate the sample modules. [42]



Figure 17. Data Matrix bar code encoding MediaTeam.

Bar codes have been used in practically every imaginable application. They have
been embedded in groceries, aeroplanes, cars, images and even in fashion and
tattoos. They have been used for monitoring movement and merchandise and for
tracking objects. Bar codes have been around for a long time, but their story is far from over.


4.4. Mobot

Mobot does not use watermarking technologies for offering value-adding services,
but it is worth mentioning for the wide publicity it has received in the USA. Mobot does
not require any kind of barcode, logo or special symbol, nor does it need any
client software in the mobile device [35]. Instead of watermarks, Mobot's solution is
based on image recovery, pattern recognition and image matching capabilities. This
enables Mobot to support all camera phones on the market regardless of camera
accuracy. [43]
The user only needs to snap a photo of an interesting ad with his/her camera
phone and send it to the Mobot server, which then analyses the image and in turn sends the
user whatever the advertiser wants him/her to receive. The data the user receives could
be, for example, a coupon, a giveaway or additional information about the product, but
for consumers to actually receive the giveaways and offers from the advertisers, they
must first register with the company. All this is already in use in Jane, a magazine
for young women, which has launched the promotion "Jane talks back". [35]


4.5. Sanyo

A Japanese electronics company, Sanyo, has also done some research on
watermarking. Takeuchi et al. from Sanyo Electric Co., Ltd. proposed a method for
reading watermarks from printed images with a camera phone. In their method, the
actual watermark is embedded with guided scrambling (GS) techniques. Unlike many
other watermarking methods tested with cameras, the method by Takeuchi et
al. also compensates for radial distortion. The coefficients of the correction model
are calculated beforehand using a chessboard calibration pattern. The paper assumed
that the coefficients would not change between phones of the same model, and a
database of the coefficients was created that could be queried by the product name
of the camera phone and the focal length during photo acquisition.
The perspective distortion was compensated for by calculating the four corners of the image.
Unfortunately, the more detailed publications on this method are written in Japanese,
and the information is thus not available to an international audience. [44]
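Radial distortion is commonly corrected with a polynomial model of the form r_u = r_d(1 + k1·r_d² + k2·r_d⁴). Whether Takeuchi et al. use exactly this model cannot be verified from the English-language source, so the sketch below is generic; k1 and k2 play the role of the per-phone-model coefficients stored in the database.

```python
import numpy as np

def undistort_points(pts, k1, k2, centre):
    """Correct radial distortion of (x, y) points with the common
    polynomial model r_u = r_d * (1 + k1*r_d^2 + k2*r_d^4).
    Generic sketch; coefficients would come from calibration."""
    p = np.asarray(pts, float) - centre       # coordinates relative to centre
    r2 = np.sum(p * p, axis=1, keepdims=True) # squared distorted radius
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return p * factor + centre
```

With k1 > 0 the correction pushes points outwards, undoing the barrel distortion typical of small phone lenses.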
5. PRINT-SCAN RESILIENT WATERMARKING

Printing and scanning of an image produces a set of distortions to the watermarked
image, as explained in section 2.3.3. In this chapter, a watermarking method is
proposed which is resilient to print-scan attacks. The block diagram of the
watermarking system is shown in Figure 18. The proposed method consists of three
separate watermarks. The first watermark is embedded in the frequency
domain to recover the image from rotation and scaling attacks. The second
watermark is embedded in the spatial domain to recover from a translation attack, and
the third watermark is the multibit message, which is embedded in the wavelet domain.
The last two sections of this chapter discuss the experiments done and the results
achieved, respectively, to validate the method.
Every watermark embedded can be considered an attack against the formerly
embedded watermarks. The order in which the watermarks are embedded is therefore
chosen carefully, although other orderings would be possible. Here, the multibit
message is the most fragile of the three watermarks and is therefore embedded last.


Figure 18. Block diagram of the proposed print-scan robust method.

5.1. Frequency domain template

The Fourier domain has an advantage over other domains in watermarking that
may also be its drawback: invariance to translations. This property is used here
for determining the amount of rotation and scale, because it is much easier to find the
watermark in the Fourier transform domain when translation need not be worried
about. The translation invariance, however, forces us to find a different method for
determining the amount of translation, which is introduced in section 5.2. The next
section explains the embedding process and the one after it the extraction process.


5.1.1 Embedding

The template watermark is embedded in the magnitudes of the Fourier domain. The
first step before embedding is to transform the luminance values of the image to the
Fourier domain, which results in two Fourier images, the real and imaginary parts.
In the Fourier domain representation, the low frequencies are located in the corners of
the transformed image. Before processing the image, the low frequencies are moved to
the centre, and then the magnitudes of the transform are calculated.
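These steps amount to a 2-D Fourier transform followed by a quadrant shift; a minimal numpy sketch (the function name is mine):

```python
import numpy as np

def fourier_magnitudes(luminance):
    """Transform luminance values to the Fourier domain and centre the
    low frequencies before taking magnitudes."""
    F = np.fft.fft2(luminance)   # complex result: real and imaginary parts
    F = np.fft.fftshift(F)       # move low frequencies from corners to centre
    return np.abs(F)
```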


Template

After the magnitudes of the Fourier transform have been determined, the template is
embedded. To recover from rotation and scaling distortions, a template is embedded
in the magnitudes of the Fourier transform of the image in a somewhat similar
manner to the method by Lee and Kim [23], where a pseudorandom template
sequence of length 180 bits is embedded in the middle frequencies of the magnitudes
of the Fourier transform. Here, the template, a pseudorandom sequence of 1s and 0s,
is embedded in the middle frequencies of the image in the form of a sparse circle.
This process is illustrated in Figure 19, where the points are exaggerated to make
them visible to the eye in printed material.



Figure 19. Embedding a pseudorandom sequence in the Fourier domain of an image.

The template is symmetrical about its origin because the magnitude component
of the Fourier transform is symmetric about the origin. The points on the circle are
added to the Fourier domain at angles π/20 apart from each other. The value of π/20
was chosen for convenience, but it could be different. The values of the
pseudorandom sequence that differ from 0 form peaks in the Fourier domain when
embedded. Therefore, not all the points of the pseudorandom sequence are visible;
the 0s appear as gaps in the circle. The strength at which the values are embedded
varies with the local mean and standard deviation, because the embedding strength
should clearly be larger close to the low frequencies, where, in general, the highest
values of the Fourier transform lie.
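A simplified sketch of the circle embedding is given below. The window size and the mean-plus-k-standard-deviations strength rule are illustrative assumptions; the text only states that the strength varies with the local mean and standard deviation.

```python
import numpy as np

def embed_circle(mag, sequence, radius, strength=3.0, window=5):
    """Embed a 0/1 pseudorandom sequence as a sparse circle of peaks in
    centred Fourier magnitudes. Strength rule is an illustrative guess."""
    out = mag.copy()
    cy, cx = out.shape[0] // 2, out.shape[1] // 2
    angles = np.arange(len(sequence)) * np.pi / 20   # points pi/20 apart
    half = window // 2
    for bit, a in zip(sequence, angles):
        if bit == 0:
            continue                                  # 0s appear as gaps
        for sign in (1, -1):                          # symmetric about origin
            y = cy + int(round(sign * radius * np.sin(a)))
            x = cx + int(round(sign * radius * np.cos(a)))
            local = out[y - half:y + half + 1, x - half:x + half + 1]
            # Peak height follows local statistics of the magnitudes.
            out[y, x] = local.mean() + strength * local.std()
    return out
```

After modifying the magnitudes, the watermarked image is obtained by recombining them with the original phase and inverse transforming.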
The decision to embed the values in the middle frequencies is a compromise. The
low frequencies of a Fourier transform contain most of the energy of an image.
Therefore, any changes made to the low frequencies are highly visible in the image,
especially because the watermark signal would have to be embedded very strongly so
that the energy of the image would not overwhelm it. On the other hand, the high
frequencies are very vulnerable to various kinds of attacks, for example JPEG
compression.
The result of the watermark embedding process can be seen in Figure 20, where a
small magnified piece of the image is shown in the upper left corner of each image.
The magnified portion shows the effect of watermark embedding more clearly than
the image itself. Some of the variation in the quality of the image is flattened during
the printing process, and thus it is possible to embed the watermark more strongly
than would be acceptable when distributing the image in digital form.



Figure 20. a) Original image b) original image after embedding the watermark.

When the watermark peaks are embedded in the magnitudes of the Fourier
transform, the watermark spreads over the entire image. This makes the watermark
robust against slight cropping. Cropping, on the other hand, introduces noise into
the Fourier domain, but if the template is embedded robustly enough, it will survive it.


5.1.2 Extracting

Figure 21 shows the image after the print-scan process, which has rotated and scaled
the image. To find the embedded template in the scanned image, the image is first
padded with zeros to its original geometry, a square. If the image were not padded with
zeros beforehand, the template circle would be stretched into an oval and the extraction
process would be more difficult.
The extraction of the template from the Fourier transform domain is mainly about
locating the peaks. From here on we can think of the image itself as noise and the
watermark as the information to be preserved. To find the hidden information, we
must first filter out the noise, that is, the image.



Figure 21. The watermarked image after print and scan process in which
some distortions have occurred.


Wiener filtering

The first thing to do after calculating the magnitudes of the Fourier transform is to
use Wiener filtering. The Wiener filtering removes some of the noise and helps in
finding the peaks. To find the peaks, the Fourier transform of the image is Wiener
filtered and the filtered transform image is subtracted from the distorted transform
image. The Wiener filter minimizes the mean square error e² between an estimate f̂
and the original image f:

e² = E{(f − f̂)²}.    (35)

The Wiener filter is usually defined in the frequency domain with the formula

G(u,v) = H*(u,v) / ( |H(u,v)|² + Pn(u,v)/Pf(u,v) ),    (36)

where Pf(u,v) and Pn(u,v) are the power spectra of the original image and the noise,
respectively. In the formula, Pn(u,v)/Pf(u,v) can be replaced with a constant, which
can be approximated roughly beforehand. [22]
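A local adaptive Wiener filter of this kind, together with the subtraction step described above, can be sketched as follows. The window size and the constant noise estimate are the kind of roughly approximated parameters the text mentions; the concrete values here are illustrative.

```python
import numpy as np

def local_wiener(img, win=3, noise_var=None):
    """Adaptive Wiener denoising: estimate local mean and variance in a
    win x win window and shrink each pixel towards the local mean where
    the local variance is close to the noise level."""
    pad = win // 2
    p = np.pad(np.asarray(img, float), pad, mode='reflect')
    means = np.zeros(img.shape)
    sqs = np.zeros(img.shape)
    for dy in range(win):                        # sliding-window sums
        for dx in range(win):
            patch = p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
            means += patch
            sqs += patch * patch
    n = win * win
    means /= n
    var = np.maximum(sqs / n - means ** 2, 0.0)
    if noise_var is None:
        noise_var = var.mean()                   # rough constant noise estimate
    gain = np.maximum(var - noise_var, 0.0) / np.maximum(var, 1e-12)
    return means + gain * (img - means)

def peak_residual(img, win=3, noise_var=None):
    """Subtract the Wiener-filtered image from the distorted one, so that
    sharp template peaks stand out in the residual."""
    return img - local_wiener(img, win, noise_var)
```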


Finding peaks by cross-correlation

The Wiener filtering helps in finding the peaks of the template in the noisy
environment, so that the template can be found using cross-correlation. To further
reduce the noise, the Wiener-filtered image of the Fourier transform magnitude
domain is thresholded before applying cross-correlation. The thresholding is applied
so that a point is set to one if the local mean around the point exceeds a certain
predefined limit.
There are two things that we know about the template: the pseudorandom
sequence and the fact that the template is shaped like a circle around the origin. What
we want to know are the value of the radius and the angle of rotation. The search for
the rotation and scale factors proceeds in two phases: in the first stage, a rough
estimate of the rotation angle and scale factor is determined, and in the second, finer
results are achieved. Since the Fourier transform magnitudes are invariant to shifts
in the spatial domain, it is enough to search for the template circle around the origin.
To find the circle, every possible radius must be searched. This could easily lead to
an exhaustive search, but because the image is in digitized form, it is enough to
search first only integer-valued radii and find the exact value later in the second stage.
It is not necessary to examine all the radii, because at the low frequencies the
noise from the image is overwhelming. When calculating the cross-correlation of a
pseudorandom sequence and a highly noisy signal, the result may show a high
correlation between the two signals even if there is none. Therefore some of the low
frequency radii can be discarded, and the search area resembles an annulus between
two predefined frequencies f1 and f2, as in the paper by Pereira and Pun [10].
The detection of the template circle is performed as follows: first a radius is
selected and a one-dimensional sequence corresponding to that radius in the Fourier
transform is extracted as in Figure 13 [23]. The sequence is cross-correlated with the
pseudorandom sequence by using a cross-covariance function, which is related to
cross-correlation. The cross-covariance can be defined as the cross-correlation of
mean-removed sequences

c_xy(m) = Σ_{n=0}^{N−m−1} [ x(n+m) − (1/N) Σ_{i=0}^{N−1} x_i ] [ y(n) − (1/N) Σ_{i=0}^{N−1} y_i ]*,   m ≥ 0,
c_xy(m) = c_yx*(−m),   m < 0,    (37)

where x is a sequence of the image at some radius with length N and y is the
pseudorandom sequence interpolated to the length N. The maximum of the resulting
cross-covariance is saved to a vector. After the integer radii between the frequencies
f1 and f2 have been examined, the maximum is selected from the vector containing
the maxima of the cross-covariances, which is shown in Figure 22.
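The first-stage radius search can be sketched as below. A circular FFT-based cross-correlation of mean-removed sequences stands in for equation (37), and the sampling density, template layout (here spread over the full circle for simplicity) and search bounds are simplifying assumptions of this sketch.

```python
import numpy as np

def xcov_peak(x, y):
    """Maximum and location of the circular cross-correlation of two
    mean-removed sequences (a simplified stand-in for equation (37))."""
    x = x - x.mean()
    y = y - y.mean()
    c = np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(y))).real
    return c.max(), int(np.argmax(c))

def ring_sequence(mag, radius, n_samples=360):
    """Sample the centred magnitude image along a circle of given radius."""
    cy, cx = mag.shape[0] // 2, mag.shape[1] // 2
    a = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)
    ys = np.clip(np.round(cy + radius * np.sin(a)).astype(int), 0, mag.shape[0] - 1)
    xs = np.clip(np.round(cx + radius * np.cos(a)).astype(int), 0, mag.shape[1] - 1)
    return mag[ys, xs]

def find_radius(mag, template, r1, r2):
    """Search integer radii in [r1, r2] and return the radius whose ring
    correlates best with the (interpolated) template sequence."""
    best_r, best_v = r1, -np.inf
    for r in range(r1, r2 + 1):
        seq = ring_sequence(mag, r)
        t = np.interp(np.linspace(0, 1, seq.size, endpoint=False),
                      np.linspace(0, 1, len(template), endpoint=False),
                      np.asarray(template, float))
        v, _ = xcov_peak(seq, t)
        if v > best_v:
            best_v, best_r = v, r
    return best_r
```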
When a rough estimate of the radius of the template circle has been found, the
locations of the template peaks are extracted. The peaks are found by examining a
band ±2 pixels wide around the previously found radius. Every point in this band is
examined by calculating a local mean around the point and deciding whether the
point is a high enough peak or not. A point is selected as a peak if its value is 3 times
larger than the local mean and if it is a maximum in that area.
The difficulty of finding the peaks is obvious when looking at the magnitudes of
the Fourier transform, as in Figure 23. The points are sharp and clear in the original
image but stretched and spread in the distorted image. The low frequencies are
strongly visible in the distorted image because some of the white scanned
background of the image is still included in the calculations, whereas the original
image contains only the image and no scanned background.

Figure 22. The vector containing maximums of the cross-correlations.



Figure 23. Magnitudes of the Fourier transform of a) distorted image b) original
image.


Determining rotation and scaling parameters

After the peaks are found, they are transformed into polar coordinates and divided
into π/20 segments according to their angle. Extra points can be discarded inside
these segments, because we know that the points should be at an angle of about π/20
from each other. The resulting piece of signal is then cross-correlated with the
embedded pseudorandom signal, and the maximum of the cross-correlation signal
shows the amount of rotation as a multiple of π/20. For example, if the
cross-correlation peak is at a, the amount of rotation is roughly a·π/20.
After the rotation has been found as a multiple of π/20, a more accurate value can
be determined. The original angles of the embedded pseudorandom sequence are
subtracted from the angles of the peaks, and the value for the rotation is obtained by
taking the median of the resulting values.
The scale factor is calculated by taking a trimmed mean of the radii of the peaks
and dividing the value by the original radius of the embedded pseudorandom
sequence. When the scaling and rotation parameters have been found, the distortions
can be inverted with the matrix operations explained in section 2.3.5. The image after
correction of rotation and scaling is shown in Figure 24.



Figure 24. The image after correction of rotation angle and scaling.

The algorithm for the extraction method is as follows:

1. Pad the image with zeros to its original geometry
2. Calculate the Fourier domain magnitudes
3. Apply Wiener filtering to remove noise
4. Find out a rough estimate of the rotation angle and scale factor
4.1. Apply thresholding
4.2. Search through integer radii between frequencies f1 and f2 with cross-
correlation between a radius and the pseudorandom sequence used in
embedding
4.3. Select the radius with the maximum cross-correlation value as the radius of
the circle
4.4. Calculate cross-correlation between the sequence at selected radius and
pseudorandom sequence
4.5. Select the location of the cross-correlation peak as a rough estimate of the
rotation angle
5. Refine the estimate
5.1. Calculate the exact locations of the template peaks with the rough estimates
5.2. Take a median from the angles of the peaks to get the angle of rotation
5.3. Take a trimmed mean from the radii of the peaks to get the amount of scale
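The refinement in step 5 (median of angle differences, trimmed mean of radius ratios) can be sketched as follows; the peak matching itself is assumed done, and the trim fraction is an illustrative choice.

```python
import numpy as np

def estimate_rotation_scale(peaks, template_angles, template_radius, trim=0.1):
    """Refine rotation and scale from matched template peaks.
    `peaks` are (angle, radius) pairs matched to the embedded positions;
    `template_angles` and `template_radius` describe the original circle."""
    peaks = np.asarray(peaks, float)
    d_angle = peaks[:, 0] - np.asarray(template_angles, float)
    rotation = np.median(d_angle)                       # robust to outliers
    ratios = np.sort(peaks[:, 1] / template_radius)
    k = int(trim * ratios.size)                         # trim both tails
    scale = ratios[k:ratios.size - k].mean() if ratios.size > 2 * k else ratios.mean()
    return rotation, scale
```

The median and the trimmed mean both tolerate a few falsely detected peaks, which is why they are preferred here over a plain average.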
5.2. Spatial domain template

Translation, that is, how far the watermark has shifted from its original place,
describes the location of the watermark in the image. Locating the watermark is not
as easy a task as it may appear at first sight. The watermark has probably been
rotated and scaled, and those transforms must be inverted before the watermark can
be located accurately. Also, there is no point in rummaging through the whole image
while searching for the starting point of the watermark; we should be able to restrict
the search somehow. Here, a separate watermark has been embedded in the spatial
domain to serve as a template. Locating the watermark is now faster than with an
exhaustive search, but the template has its own impact on the imperceptibility of the
watermark load.


5.2.1 Embedding

The watermarking system should be robust against a translation attack, but the
Fourier transform magnitudes are invariant to translations. Therefore, another
watermark is needed to recover the image from a translation attack.

Template

The template watermark for recovering from a translation attack is embedded in the
spatial domain, and its shape is shown in Figure 25. The template consists of
two similar parts, one for the horizontal translations and the other for the vertical
translations. A template part, either horizontal or vertical, is built from a small
pseudorandom sequence of length 127. The sequence is an m-sequence, and its length
is carefully chosen to be robust enough: a longer sequence would be sensitive to
small rotations but, on the other hand, a shorter sequence would be difficult to
find with cross-correlation.
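The thesis does not spell out the feedback polynomial of the 127-bit m-sequence, so as an illustrative sketch only, a maximal-length sequence of length 2^7 - 1 = 127 can be generated with a linear feedback shift register. The primitive polynomial x^7 + x + 1 below is an assumption chosen for the example, not necessarily the one used in this work:

```python
def msequence(degree=7):
    """One period (2**degree - 1 bits) of an m-sequence generated by the
    recurrence a[n+7] = a[n+1] XOR a[n], i.e. the primitive polynomial
    x^7 + x + 1 over GF(2)."""
    reg = [1] * degree            # any non-zero initial state
    seq = []
    for _ in range(2 ** degree - 1):
        seq.append(reg[0])
        reg = reg[1:] + [reg[1] ^ reg[0]]
    return seq

m = msequence()                   # 127 bits; an m-sequence is balanced: 64 ones, 63 zeros
```

For correlation-based detection the bits are typically mapped to +1 and -1 before embedding.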



Figure 25. The template embedded in the image in order to recover the
message watermark from a translation attack.
The m-sequence is repeated across part of the image, separately horizontally and
vertically. The horizontal pattern is formed as follows: The first line is embedded in a
suitable row so that the final pattern is similar to that in Figure 25. The second line is
an exact copy of the first line, but the third and fourth lines are skipped. This is done
because the pattern should not be visible to the human eye in the final image; if all
the lines were used, the periodical pattern of the template would show. The fifth line
is otherwise similar to the first line, but the m-sequence is shifted to the right by 2
pixels, so the pattern appears oblique. The vertical pattern is similar to the
horizontal case, but the lines are columns instead of rows.
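The pattern construction just described can be sketched as follows; the additive embedding and the strength parameter are assumptions for illustration, as the thesis does not specify how the template is mixed into the pixel values:

```python
import numpy as np

def add_horizontal_template(img, mseq, first_row, num_groups, strength=2):
    """Additively embed the horizontal template pattern: rows come in
    groups of four, the first two rows of each group carry the tiled
    m-sequence and the next two are skipped; each new group shifts the
    sequence 2 pixels to the right, producing the oblique pattern."""
    img = img.astype(np.float64).copy()
    h, w = img.shape
    tiled = np.tile(mseq, w // len(mseq) + 1)[:w]      # repeat across the row
    for g in range(num_groups):
        line = np.roll(tiled, 2 * g) * strength        # shift 2 px per group
        r = first_row + 4 * g
        img[r] += line
        img[r + 1] += line                             # exact copy on the next row
    return img
```

The vertical template part would be built in the same way on the transposed image.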


5.2.2 Extracting

Cross-correlation is used here to extract the watermark from the spatial domain, but
the problem is accuracy.

Quarter-pixel interpolation

Trying to find the template pattern from an image will not always end as expected,
because of all the geometrical distortions discussed in section 2.3.3. The main idea
here is to calculate the cross-correlation between the embedded m-sequence and every
other row or column, shift each result by a suitable amount and add all the results
together. The reason for this is that the averaging of the cross-correlation results
diminishes noise.
From the two resulting sequences, one for the rows and one for the columns, it is
possible to see how much the image has been translated in each direction. The
problem with this is the fact that the amount of translation can be determined only
with an accuracy of 1 pixel, but in the real world the image may be shifted by a
fraction of a pixel. Such a sub-pixel error may very well destroy the reading of a
watermark. Therefore interpolation methods are applied for achieving better precision.
Here quarter-pixel interpolation is achieved by performing half-pixel interpolation
twice, using bilinear interpolation to determine the values at the midpoints
between the pixels. Bilinear interpolation of a point is calculated by taking the
closest 2x2 neighbourhood of known pixel values surrounding the unknown pixel
value. The unknown pixel value is then calculated as a weighted average of the
four surrounding pixel values, as illustrated in Figure 26. If all the distances from
the known pixel locations to the unknown pixel are equal, the interpolated value is
simply the sum of the known pixel values divided by four.
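The weighted average can be written out directly; in this sketch dx and dy are the fractional offsets of the unknown point from the (x1, y1) corner of the 2x2 neighbourhood:

```python
def bilinear(f11, f21, f12, f22, dx, dy):
    """Bilinear interpolation at fractional offsets (dx, dy) in [0, 1]
    inside a 2x2 neighbourhood of known values:
        f11 = f(x1, y1), f21 = f(x2, y1), f12 = f(x1, y2), f22 = f(x2, y2).
    """
    return (f11 * (1 - dx) * (1 - dy) + f21 * dx * (1 - dy)
            + f12 * (1 - dx) * dy + f22 * dx * dy)

# At the exact midpoint all four weights equal 1/4, so the result is the mean:
mid = bilinear(10, 20, 30, 40, 0.5, 0.5)   # (10 + 20 + 30 + 40) / 4 = 25.0
```

Applying this at the midpoints between pixels gives the half-pixel grid, and repeating it on that grid gives the quarter-pixel grid.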



Figure 26. Bilinear interpolation of an unknown pixel value.

Determining translation parameters

The determination of the translation parameters requires some processing power. The
rotation and scale corrected image is interpolated to four times its size by
determining the three values at equal distances between known pixels. Also, the
embedded m-sequence is interpolated to four times its size for cross-correlation.
The extraction of the watermark is done in two phases, separately for the horizontal
and vertical translations. Unfortunately, finding the translation parameters is not
straightforward; the determined values must be suitably combined before the
values for the horizontal and vertical translations are found.
The cross-correlation with the embedded m-sequence is calculated for every other
row, and the results of the cross-correlations are shifted by two and accumulated into
a vector, so that the possible cross-correlation peak is strengthened. The amount of
translation can be calculated from the location of the peak, because we know the size
of the original image and the place where the translation template should be.
Figure 27 shows a filtered plot of the resulting cross-correlation sequence.



Figure 27. Cross-correlation image of the horizontal template used for
determining the amount of the translation in the image.

Figure 28 shows an example of the template extraction process, where x1,
x2, y1 and y2 are unknowns. The original size of the image was 512x512, and the
template watermark was embedded between lines 64 and 192, as explained in the
preceding sections. After the print-scan process, the image is distorted and there are
borders around the image, resulting from incorrect cropping while scanning the
image.
The image size is somewhat larger than the original image size because of the
background cropped along with the image. In Figure 28 the rotation and scale have
been corrected from the distorted image, and it can be assumed that the distorted
image now contains the watermarked image in its original size; we just do not know
its exact location.
To be able to determine the location of the image, the unknowns in Figure 28
should be solved. The image is first interpolated to achieve an accuracy of 0.25
pixels. After interpolation, the amount of translation can be determined by
calculating a cross-correlation between the embedded and interpolated m-sequence
and every other line. There is no need to calculate cross-correlation with every line


Figure 28. The translation template after spatial shift.

because, in the interpolated image, the template is identical in eight successive lines;
in the original image, the template is identical in two successive lines. The cross-
correlations are added together by shifting each line by one more than the previous
line and then summing it to the previous results. The shifting is done so that the
possible peaks that indicate the location of a template line in the cross-correlation
sequences are added together and thus strengthened. The same process is repeated
for the vertical dimension, resulting in two cross-correlation sequences and two
peaks.
The locations of the peaks will not reveal the translation parameters right away
because of all the shifts and summations. To find the translation parameters, we must
remove the effect of the image from the locations of the cross-correlation peaks. In
this example, it means that we take the locations of the peaks in the two cross-
correlation sequences and subtract the known location of the translation template in
the original image. That is, we add up 192, the location of the template in the
original image; 127, the length of the template; and 32, the number of shifts before
the first template line (because the cross-correlations are calculated with only
every other line). All these values are multiplied by four, because the image has
been interpolated to four times its size, and then subtracted from the locations of the
two peaks. This way, we find two values that include the translations.
The two values are not enough to find four unknowns, however. Therefore we
calculate two more values from the other ends of the cross-correlation sequences;
that is, we take the lengths of the sequences and subtract the locations of the peaks
from them. These new values can then be processed as above, and two more values
are obtained.
The four unknowns can be solved from the following equations:

    val1 = x1 + (1/2)y1
    val2 = x2 + (1/2)y2
    val3 = (1/2)x1 + y1                                                (38)
    val4 = (1/2)x2 + y2

where val1-val4 are the values extracted above and x1, x2, y1 and y2 are the
unknown translation parameters. The final image after translation correction is
shown in Figure 29. The image can now be extracted from the background and the
actual value-adding watermark can be read.
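Once val1-val4 have been measured, the translation parameters follow from a 4x4 linear system. A minimal numpy sketch, assuming a symmetric reading of (38) with half-weight cross terms (the exact coefficients should be checked against the derivation above; the measured values here are invented for illustration):

```python
import numpy as np

# Coefficient matrix of the assumed reading of (38); unknowns ordered
# as (x1, x2, y1, y2).
A = np.array([[1.0, 0.0, 0.5, 0.0],   # val1 = x1 + y1/2
              [0.0, 1.0, 0.0, 0.5],   # val2 = x2 + y2/2
              [0.5, 0.0, 1.0, 0.0],   # val3 = x1/2 + y1
              [0.0, 0.5, 0.0, 1.0]])  # val4 = x2/2 + y2

vals = np.array([20.0, 14.0, 25.0, 13.0])   # example measured peak offsets
x1, x2, y1, y2 = np.linalg.solve(A, vals)
```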



Figure 29. Watermarked and print-scanned image after correction of translation,
rotation and scaling.

A pseudo code representation of the extraction method is as follows:

1. Apply quarter-pixel interpolation to the image
2. Process horizontal part of the template
2.1. Calculate cross-correlation with every other row of the image
2.2. Shift every cross-correlation result by one more than the result of the
previous row so that the peaks are in the same line
2.3. Add all the results together
2.4. Find the maximum peak and calculate the distance from both ends of the
sequence
2.5. Remove the location of the template in the original image
3. Process vertical part of the template
3.1. Same as above but columns as rows
4. Solve the amount of translation from the received results
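The horizontal part of this procedure (steps 2.1-2.3) can be sketched as follows; details such as the shift direction and the correlation mode are assumptions made for the illustration:

```python
import numpy as np

def shift_and_sum_correlation(rows, mseq):
    """Cross-correlate every other row with the m-sequence, shift each
    result by one more sample than the previous one, and sum, so that
    the peaks marking the template rows line up and reinforce."""
    acc = None
    for k, row in enumerate(rows[::2]):          # every other row (steps 2.1-2.2)
        cc = np.correlate(row, mseq, mode='full')
        cc = np.roll(cc, -k)                     # one extra shift per processed row
        acc = cc if acc is None else acc + cc    # step 2.3: accumulate
    return acc
```

With rows that each contain the m-sequence shifted by one more pixel than the previous processed row, the individual peaks align at a single index whose position encodes the translation.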


5.3. Wavelet domain multibit message

The wavelet domain is very sensitive to small geometrical distortions and all the
geometrical distortions must be removed before the message watermark can be read
from the wavelet domain. That is the reason why the template watermarks are used to
remove the effects of rotation, scale and translation from the image. This section
explains how the message watermark is embedded with a spread spectrum technique
and extracted with a method based on a thresholded correlation receiver.


5.3.1 Embedding

Error-correction coding

Before embedding the message watermark, the message was protected with error-
correction coding, and (15, 7) BCH (Bose-Chaudhuri-Hocquenghem) codes were
chosen for their widespread use and simplicity. BCH codes are multilevel, cyclic,
variable-length codes applied to correct multiple random error patterns, and BCH
(15, 7) in particular is able to correct two errors. The BCH codes are based on the idea
of adding parity bits to the code word to check if any changes have occurred. [45]
When calculating BCH codes, Galois fields are applied. Galois fields are also
called finite fields because they contain only a finite number of elements. For
example, a Galois field GF(q) is a field with q elements, where q is a finite number.
Every Galois field has at least one primitive element a such that every field element
except zero can be expressed as a power of a. [45]
The BCH design rule requires that there are twice as many powers of a as the error
correction capacity t. If q = 2^m, where m is any integer >= 3, the elements of the
field can be represented by polynomials whose coefficients are elements of the field
GF(2), that is, 0 and 1. The block length of such a code is n = 2^m - 1 and the error
correction capacity of the code is t < (2^m - 1)/2. [45]
The final code word consists of two parts: the message part and the remainder for
checking the message. The remainder part, that is, the part that contains the parity
bits, is calculated with a generator polynomial. Here, for BCH (15, 7), the generator
polynomial is x^8 + x^7 + x^6 + x^4 + 1, which has been calculated with Matlab.
The length of the code word is then 15 and the length of the message is 7. The code
word can be generated by multiplying the corresponding polynomial of the message
word with the generator polynomial.
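The multiplication by the generator polynomial is a carry-less (GF(2)) product, which can be sketched with bit-packed polynomials; the example message below is arbitrary:

```python
def gf2_mul(a, b):
    """Carry-less multiplication of two bit-packed GF(2) polynomials:
    shift-and-XOR instead of shift-and-add."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

G = 0b111010001                # generator polynomial x^8 + x^7 + x^6 + x^4 + 1
msg = 0b1011001                # an arbitrary 7-bit message polynomial
codeword = gf2_mul(msg, G)     # a 15-bit BCH(15, 7) codeword
assert codeword.bit_length() <= 15
```

Every codeword generated this way is by construction divisible by G, which is what the decoder's parity check exploits.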


Spread Spectrum Technique

The technique used here is similar to some extent to that by Keskinarkaus et al. in
[46]. The message bits that are protected with the BCH code are embedded in the
image in the wavelet domain. The image is decomposed to the first level sub-bands
using Haar wavelets, and the watermark is embedded in the detail coefficients while
the approximation coefficients are left unmodified. In [46] the watermark was
embedded in the approximation coefficients to gain better robustness, but here the
properties of the detail coefficients are employed, because they offer better
imperceptibility. Especially the horizontal detail coefficients are used.
As in [46] the watermark is embedded with

    Y**_{l,f}(n) = Y*_{l,f}(n) + α·m(k),  message bit = 1
    Y**_{l,f}(n) = Y*_{l,f}(n) - α·m(k),  message bit = 0 ,            (39)

where Y* is an image which has already been watermarked with the templates in the
Fourier and spatial domains. Y*_{l,f}(n) is the sub-band of Y* at the l-th resolution
level and f-th frequency orientation. Y**_{l,f}(n) is the new watermarked sub-band,
where ** means that multiple watermarking has been applied. α is a scaling
coefficient that controls the embedding strength, and m(k) is the m-sequence, the
length of which controls the chip rate for spreading.
After the message has been embedded, the inverse wavelet transform is applied to
the image. The amount of distortion and noise introduced by the multiple
watermarking is evaluated visually and with PSNR values. The results of the
evaluation are presented in the upcoming sections.
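The rule (39) can be sketched on a flattened vector of detail coefficients as follows; the segment layout and the strength alpha are illustrative assumptions:

```python
import numpy as np

def embed_bits(coeffs, bits, mseq, alpha=1.0):
    """Spread-spectrum embedding in a vector of detail coefficients:
    each message bit modulates one m-sequence-sized block, added with
    strength alpha (bit 1 -> +alpha*m, bit 0 -> -alpha*m), cf. (39)."""
    out = coeffs.astype(np.float64).copy()
    L = len(mseq)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        out[i * L:(i + 1) * L] += sign * alpha * mseq
    return out
```

The chip rate of the spreading is set by the length of mseq: a longer sequence spreads each bit over more coefficients.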


5.3.2 Extracting

All the geometrical distortions must be corrected before the watermark can be read
from the wavelet domain. After correcting the distortions, the wavelet transform is
applied to the watermarked image, and the detail coefficients are divided into small
segments of the same size as the m-sequence used for embedding the message. The
message watermark is extracted by calculating a mean removed cross-correlation
between each coefficient segment and the m-sequence. The result of the correlation
is analysed, and the message bit is chosen to be 1 if the correlation value is above a
certain threshold value and 0 otherwise.
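A minimal sketch of this thresholded correlation receiver; the segmenting scheme and the threshold value are illustrative assumptions:

```python
import numpy as np

def extract_bits(coeffs, mseq, nbits, threshold=0.0):
    """Thresholded correlation receiver: split the coefficient vector into
    m-sequence-sized segments, compute the mean-removed correlation of
    each segment with the m-sequence, and decide 1 if it exceeds the
    threshold, 0 otherwise."""
    L = len(mseq)
    bits = []
    for i in range(nbits):
        seg = coeffs[i * L:(i + 1) * L]
        corr = np.dot(seg - seg.mean(), mseq - mseq.mean())
        bits.append(1 if corr > threshold else 0)
    return bits
```

With the additive embedding of (39), a segment carrying +α·m correlates positively with m and a segment carrying -α·m correlates negatively, so the sign of the correlation recovers the bit.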


5.4. Experiments and results

The image used in the experiments was the famous Lena image of size 512x512
pixels. The message was embedded in the image with the spread spectrum technique,
and before embedding, the message was error coded with the (15, 7) BCH code,
which has an error correction capability of 2 bits. After error-correction coding, the
length of the message was 135 bits.
After embedding the message and the template watermarks, the image was JPEG
compressed with different compression ratios. The compression ratios examined here
were 100, 80, 60 and 40. The images used in the testing are included at the end of
this work as Appendix 3. It was noticed that different printers give different printing
qualities, and thus two printers were used for the experiments. Most of the work was
done with a Hewlett-Packard ColorLaserJet 5500 DTN printer, but one image was
printed with a Hewlett-Packard ColorLaserJet 4500 DN printer. The result of the latter
printer was significantly darker than the result with the 5500 DTN printer, as can be
seen from Figure 30. All the images were of physical size 10.3 cm x 10.3 cm.
The scanner utilized was an Epson GT-15000, and every image was
scanned 50 times at 300 dpi and then 50 times at 150 dpi and saved in
uncompressed TIFF format. The image was rotated randomly between separate
scans, and the rotation angle varied between -45 and 45 degrees. Also, the
scanned image area was varied, that is, how much white was left around the
scanned image.


a) b)

Figure 30. Lena image printed with different printers and then scanned. a) printed
with HP LaserJet 5500 DTN printer, b) printed with HP LaserJet 4500 DN printer.

The quality of the image was tested with PSNR and PSPNR values after
embedding the watermark and compressing the image with JPEG. The results of the
PSNR and PSPNR calculations are collected in Table 1. From the large values in the
table it is possible to see that the embedding method works well and the quality of
the images stays good through the embedding process. It must be noted, however,
that the printing process somewhat flattens the pixel values and the watermark
becomes even more difficult to perceive. This works for the imperceptibility of the
watermark but against its robustness.

Table 1. PSNR and PSPNR after compression

JPEG Compression Ratio PSNR PSPNR
100 39.5 57.6
80 37.3 49.9
60 36.5 47.7
40 35.9 46.0

The images were scanned with two different scanning resolutions, but before
extracting the watermarks, the images scanned with a resolution of
300 dpi were scaled to 25%. This was done to reduce the computational complexity
and processing times. The results were gathered into two tables, Table 2 and Table 3.
The results of the reliability calculations of the extraction process are shown in
Table 2. The table shows the calculated success ratios, that is, the percentage of
times the message was extracted correctly. On the rows, there are different
compression ratio values for the images before printing. The columns of the table
show the results with the two different scanning resolutions.
Table 3 contains the average BER (Bit Error Ratio) values for the different images.
Table 3 is organized in the same way as the previous table: on the rows, there are
different compression ratio values for the images before printing, and the columns
show the results for the two different scanning resolutions. The BER was
calculated from the received message before error correction by comparing the
received bits to the embedded bits. Thus the BER value here represents the quality of
the channel, that is, how many bit errors occur between printing and scanning. The
values in brackets indicate the average BER when the extraction process was not a
success.

Table 2. Success ratio with different compression ratios and scanning settings

JPEG Compression Ratio    300dpi    150dpi    averaged
(uncompressed) 86.0% 90.0% 88.0%
100 94.0% 94.0% 94.0%
80 92.0% 92.0% 92.0%
60 78.0% 82.4% 80.2%
40 62.0% 58.0% 60.0%

100 (4500 DN) 14.0% 21.6% 17.8%


Table 3. Average BER with different compression ratios and scanning settings

JPEG Compression Ratio    300dpi    150dpi    averaged
(uncompressed) 6.5% (33.5%) 4.5% (32.7%) 5.5%
100 4.6% (50.2%) 3.6% (30.2%) 4.1%
80 3.5% (20.8%) 4.0% (31.3%) 3.8%
60 9.1% (33.2%) 6.8% (27.9%) 8.0%
40 14.9% (29.5%) 11.4% (23.0%) 13.2%

100 (4500 DN) 19.6% (21.6%) 21.5 % (22.9%) 20.6%


5.5. Discussion

After combining the results in the two tables it is clear that the method is fairly
robust against rotation, scale and translation attacks. The method is also robust
against some JPEG compression, but more work should be done to improve the
reliability of the method.
From the results in Table 2 it is possible to see that the success ratio decreases
when the quality of the JPEG compression decreases. This was expected, but it was
surprising how large the impact of selecting the printer actually is. By comparing the
first and last lines of Table 2, the importance of selecting the printer becomes
visible. The first line shows the results when a JPEG compressed image with a
compression ratio of 100 is printed with the HP LaserJet 5500 DTN, whereas the last
line shows the results for a similar image printed with the HP LaserJet 4500 DN.
The same scanner was used for both images all the time. The results show a
remarkable difference in the watermark reading reliability, and therefore it can be
noted that the printer quality, and not solely the scanner quality, should also be
considered when designing a watermarking system.
In this method the value-adding watermark was embedded in the detail coefficients
of the wavelet transform. However, the degradation caused by JPEG compression
hits the details of an image most severely, and so the high frequency wavelet
detail coefficients may not work very well. Instead of embedding the watermark in
the detail coefficients, it would be interesting to study whether embedding the
watermark in the approximation coefficients would increase the reliability.
In the method of Keskinarkaus et al. [46], the watermark was embedded in the
approximation coefficients with a method fairly similar to the one used here. In the
method by Keskinarkaus et al., the robustness was only tested with rotations between
-15 and 15 degrees, whereas in the previously described method the rotation angle
was varied between -45 and 45 degrees and the method was found robust against the
rotations. The method of Keskinarkaus et al. was, however, more robust against
JPEG compression, as even success ratios of 100% were reported with a compression
ratio of 80%.
The results in Table 3 show that JPEG compression strongly affects the BER
when the compression ratio is 60% or smaller. With compression ratios equal to or
greater than 60%, the amount of bit errors is still manageable; if the compression
ratio is smaller than 60%, extracting the watermark and correcting the bit errors
becomes difficult.
When comparing the method with previous methods, it can be seen that the method
works very well. For example, the method works better than the block-based method
by He and Sun [6]. In their experiments, they obtained BER values of 15%, while in
the previously described method the BER values were under 5% most of the time.
Not until the images were compressed beforehand with JPEG compression ratios
under 60 did the BER values get worse. One reason for the large differences between
the methods is the fact that the capacity is significantly higher in the method of He
and Sun, which weakens the robustness.
The exact BER values are included as graphs at the end of this work in Appendix 2.
From the graphs it is possible to see that in most of the cases the message is not
totally lost but buried under noise. These kinds of messages could be saved with
stronger error correction coding. On the other hand, sometimes the BER is close
to 0.5 and then the message cannot be read anymore. In this kind of situation the
correction of the geometrical distortions has probably failed.
It was noted that the bit error rate was acceptable when the JPEG compression ratio
was above 60%, and so quite a lot of improvement could be achieved by using better
error correcting codes. Unfortunately, the capacity of the method is not high, and
therefore the error correction codes should be chosen carefully in future work so
that the information rate is optimal for the task.



6. PRINT-CAM RESILIENT WATERMARKING

Print-scan robustness is a good requirement to begin with in watermarking systems,
but the number of applications is limited. The print-cam process would have a great
deal more applications, because many people carry a camera phone in their
pockets, but the attacks are more severe. The print-scan process introduces many of
the attacks that are present in the print-cam process, but in a simplified form. For
example, in the print-scan process the image may be translated in the horizontal and
vertical directions in the scanned image, whereas in the print-cam process also the
distance between the image and the camera varies. This three-dimensionality of the
problem makes the extraction of the watermark more difficult than in the print-scan
process, and different synchronization methods are required.
Here a frame is added around the image and a method for finding the corner points
of the frame is proposed. With the corner points, the affine transformation
parameters are determined to approximate and invert perspective transformations.
The block diagram of the proposed system is shown in Figure 31.
The method for embedding and extracting the multibit message is the same as in
section 5.3. Unfortunately, the multibit message watermark is very sensitive against
even small distortions and although the rotation, scale and translation are inverted
with the affine transform the inversion process is not accurate enough and thus a
more specific method is needed to find the exact amount of translation. Here the
same method for determining translation is used as in section 5.2.
Before extracting any of the watermarks, the barrel distortions are inverted with
the Camera Calibration Toolbox [8]. Not all pictures taken with a camera contain
lens distortions, but in some cameras the distortions are so severe that their
correction cannot be neglected. Lens distortions such as barrel distortions occur due
to the lens properties, and therefore the parameters for the correction transform need
to be calculated only once. The parameters can be determined beforehand with a
reference image, as explained in section 2.3.4.



Figure 31. Block diagram of the proposed print-cam robust method.


6.1. Frame detection method

Not much research has been done in the field of reading watermarks with a camera
phone, but this is no wonder, for camera phones have been around only since the
year 2000, when the Sharp Corporation announced the first camera phone ever. Only
during the last few years have the camera resolutions grown high enough for
watermark detection, and the first commercial applications have been invented, as
explained in Chapter 4.
The problem of reading watermarks with camera phones is somewhat similar to
that of the print-scan process, but the biggest difference is the extra dimension, the
effects of which need to be considered. While in the two-dimensional problem we
examined a planar surface from the level of the surface, in the three-dimensional
problem we examine the surface from somewhere above. In the simplest case of the
three-dimensional problem, the optical axis is perpendicular to the plane, and the
resulting picture can be analysed in the same way as in the two-dimensional case. If
the plane is tilted relative to the optical axis, reading the watermark gets more
difficult, because the relative distances between the points on the surface plane and
the camera have changed.
As there is no way to know the amount of tilting, the only acknowledged
solution is to use an affine transformation as an approximation to invert the
effects of the perspective distortion. The method used here is a modified version of
that by Katayama et al. [47], where a frame was added around the image and the
corner points were calculated to determine the affine transform parameters.


6.1.1 Embedding

The frame embedded here is identical to that by Katayama et al. [47]. A frame is
added along the outside of the image, as in Figure 32. To separate the frame from the
image, the frame is added at a small distance from the border of the image. The
distance is related to the width of the frame so that it is possible later to determine the
exact location of the image. The colour of the frame was chosen to be blue, but it
could be any other colour with an intensity level different enough from the
background. The frame width and the width of the gap between the image and the
frame were chosen to be 5 pixels.
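The frame geometry can be sketched on a single-channel image as follows; the thesis uses a blue frame on a colour image, so the intensity value 0 below merely stands in for the frame colour:

```python
import numpy as np

def add_frame(img, frame_w=5, gap=5, value=0.0):
    """Surround the image with a frame of width frame_w, separated from
    the image border by a gap of the given width (both 5 pixels here,
    matching the choice in the text)."""
    h, w = img.shape
    out = np.full((h + 2 * (gap + frame_w), w + 2 * (gap + frame_w)), 255.0)
    out[:frame_w, :] = value                      # top frame band
    out[-frame_w:, :] = value                     # bottom frame band
    out[:, :frame_w] = value                      # left frame band
    out[:, -frame_w:] = value                     # right frame band
    out[frame_w + gap:frame_w + gap + h,
        frame_w + gap:frame_w + gap + w] = img    # original image in the centre
    return out
```

Because the gap width equals the frame width, locating the frame later also fixes the exact location of the image inside it.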



Figure 32. Framed image.

6.1.2 Extracting

In the method by Katayama et al. [47], the extraction of the frame was performed
with frame detection filters and thresholds. A point was judged to be part of the
frame if the result of the frame detection filter at that point was bigger than a
predefined threshold value. The correct threshold value varies from image to image,
and even over the same image with lighting; therefore thresholding was not used
here, but a different kind of method was developed.
The beginning of this method is similar to that of Katayama et al. [47]. As in their
method, the picture taken is divided with a crosswise line into upper and lower
sections. It can be assumed that the watermarked image lies somewhere around the
centre of the captured image, and so the crosswise line is assumed to cross the left
and right sides of the frame. The frame sides can thus be found by searching along
that line. At this point of the calculations, we do not know the scale of the picture,
and so we do not know the width of the frame. However, it can be estimated by
differentiating the pixel intensity in the crosswise direction and calculating the width
from the positions of the maximum and minimum values. The process of frame
detection is illustrated in Figure 33.



Figure 33. The frame is found by searching along a crosswise
line and advancing up and down the found side of the frame.

When the width of the frame is known, the information can be used in the frame
detection filter. The frame detection filter matrix is of size 3xn, where n is two
pixels more than the width of the frame. Along both sides of the matrix there are
values of -2, and the middle of the matrix is filled with values of n-2. For example,
for a frame width of 5 pixels the frame detection filter is of the form

    F = [ -2  5  5  5  5  5  -2
          -2  5  5  5  5  5  -2
          -2  5  5  5  5  5  -2 ] .                                    (40)

For each point to be examined, a convolution value is calculated with

    FrameValue = sum_{i=1..3} sum_{j=1..n} I_ij * F_ij ,               (41)

where I is a small part of the image centred on the point to be examined.
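The filter of (40) and the response of (41) can be sketched as:

```python
import numpy as np

def frame_filter(width):
    """3 x (width + 2) frame detection filter of (40): -2 in the side
    columns, width (= n - 2) in the middle columns."""
    n = width + 2
    F = np.full((3, n), float(width))
    F[:, 0] = F[:, -1] = -2.0
    return F

def frame_value(I, F):
    """Response of (41): elementwise product of the filter with an
    equally sized image patch I, summed over all elements."""
    return float(np.sum(I * F))
```

The response is largest when the middle columns of the patch lie on the frame and the side columns lie on the background, which is what makes the maximum-response choice in the tracing step work without a threshold.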
The algorithm begins from the midpoint of the left edge of the captured image.
From there the calculations advance to the right searching for the left side of the
frame. After the location of the left side of the frame has been found it can be traced
up- and downwards to find the side of the frame, as shown with red coloured lines in
Figure 33. The side can be slanted and therefore one pixel to the left and right from
the current side position should also be examined instead of examining only the pixel
directly above or below the current position.
Unlike in the method by Katayama et al. [47], where the corners and sides of the
frame were determined with a threshold value, a different approach that does not
require thresholding has been chosen here. The pixels are examined with the frame
detection filter described earlier. From the three pixel values examined on every row,
the one with the maximum filter value is chosen to be part of the frame. At some
point the calculations go past the point where the side of the frame ends, but the
calculations are continued nevertheless. The values after that are not correct, but the
length of the incorrect segment is assumed to be small compared with the length of
the frame side. Therefore we can take all the points of the frame just calculated and
approximate them with a straight line. The same procedure is repeated for all the
sides of the quadrilateral frame. After the straight lines are approximated, the corners
of the quadrilateral can be approximated from the intersections of the lines.
The approximations of the corners from the intersections of the straight lines are
not entirely correct, and therefore the points are inspected further: a small area
around each intersection is selected, and the exact location of the corner point is
determined by correlating a small corner image with the area around the assumed
corner. By using correlation we can determine the exact location of the crossing, and
the corners of the quadrilateral frame are thus found, as shown in Figure 34.



Figure 34. The found corners of the frame.
The correction of the perspective distortion is done with the following equations [47]:

    x = (a1*x0 + b1*y0 + c1) / (a0*x0 + b0*y0 + 1)
    y = (a2*x0 + b2*y0 + c2) / (a0*x0 + b0*y0 + 1) ,                   (42)

    (x, y): camera picture position
    (x0, y0): original picture position.

The algorithm for the extraction process is as follows:

1. Find the width of the frame
1.1. Divide the image with crosswise line to upper and lower sections
1.2. Differentiate the pixel intensities in the crosswise direction
1.3. Select the width from the positions of the maximum and minimum values
2. Determine the frame detection filter
3. Locate the corner points with frame detection filter
3.1. Start from the middle of the left side of the image
3.2. Find all the sides of the frame
3.2.1. Advance to the right until a side of the frame has been found
3.2.2. Trace the frame up and downwards and examine also the points one
pixel to the left and right of the current side position
3.2.3. Select maximum of the three points to be part of the frame
3.2.4. Rotate the image to find the other sides and go back to 3.2.1 until all the
sides of the frame have been found
3.3. Approximate the points with straight lines
3.4. Calculate the intersections of the lines
4. Refine corner locations with correlation
5. Correct perspective distortions
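The eight parameters of (42) can be solved from the four corner correspondences by linearizing each equation and stacking an 8x8 linear system; the sketch below assumes the corners are given in matching order, and the point values in the test are invented for illustration:

```python
import numpy as np

def solve_projective(src, dst):
    """Solve (a1, b1, c1, a2, b2, c2, a0, b0) of the mapping (42) from four
    point correspondences src -> dst, by rewriting
    x*(a0*x0 + b0*y0 + 1) = a1*x0 + b1*y0 + c1 (and likewise for y)
    as linear equations in the eight unknowns."""
    A, rhs = [], []
    for (x0, y0), (x, y) in zip(src, dst):
        A.append([x0, y0, 1, 0, 0, 0, -x * x0, -x * y0]); rhs.append(x)
        A.append([0, 0, 0, x0, y0, 1, -y * x0, -y * y0]); rhs.append(y)
    return np.linalg.solve(np.array(A, float), np.array(rhs, float))

def apply_projective(p, x0, y0):
    """Map an original-picture position to the camera picture via (42)."""
    a1, b1, c1, a2, b2, c2, a0, b0 = p
    d = a0 * x0 + b0 * y0 + 1
    return (a1 * x0 + b1 * y0 + c1) / d, (a2 * x0 + b2 * y0 + c2) / d
```

Inverting this mapping over the image grid (sampling the camera picture at the transformed positions) then yields the perspective-corrected image.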


6.2. Experiments and results

The experiments were done with the 512x512 Lena image and a (15,7) BCH coded
message of length 135 bits, as in the print-scan method described earlier. The
message watermark was embedded in the image as in Section 5.3 and, because the
frame cannot correct a translation attack accurately enough, a template watermark was
embedded in the spatial domain as in Section 5.2. The frame was attached around the
image to recover the image from geometrical and perspective transforms.
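For orientation, the (15,7) BCH code carries 7 information bits per 15-bit codeword and corrects up to 2 bit errors per codeword, so the 135 coded bits decompose as follows (plain arithmetic, not a codec implementation):

```python
# Dimensions of the (15,7) BCH coded message used in the experiments.
n, k, t = 15, 7, 2               # codeword length, info bits, correctable errors
coded_bits = 135                 # length of the embedded message watermark
codewords = coded_bits // n      # 9 codewords
info_bits = codewords * k        # 63 actual message bits
max_correctable = codewords * t  # up to 18 bit errors, if spread evenly
```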
The research was done with four images: one uncompressed image and three JPEG
compressed images, which are shown in Appendix 5. The compression ratios examined
were 100, 80 and 60, and the images were printed with a Hewlett-Packard Color LaserJet
5500 DTN printer. Before printing the images out, PSNR and PSPNR values were
calculated for each of the images. The values are gathered in Table 4, and from
them it can be seen that the quality of the images remained high even after the
embedding process.

Table 4. PSNR and PSPNR after the embedding process

JPEG Compression Ratio   PSNR (dB)   PSPNR (dB)
(uncompressed)           39.2        59.5
100                      39.0        58.5
80                       37.1        48.6
60                       36.6        47.1
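PSNR here is the standard peak signal-to-noise ratio for 8-bit images; a minimal reference computation is shown below (PSPNR additionally requires a perceptual masking model, e.g. a just-noticeable-distortion profile as in [4], and is omitted):

```python
import numpy as np

def psnr(original, watermarked):
    """Peak signal-to-noise ratio in dB between two 8-bit images."""
    mse = np.mean((original.astype(float) - watermarked.astype(float)) ** 2)
    if mse == 0:
        return float('inf')   # identical images
    return 10 * np.log10(255.0 ** 2 / mse)
```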

Every image was photographed 100 times at resolution 800x600 and 100 times
at resolution 1600x1200. Before photographing, the image was pinned
against a wall to make it straight, but no special arrangements were made to prevent
the camera from moving: the pictures were taken as perpendicularly as possible to
the image on the wall, but freehand.
The camera phone used was a Nokia N90 with a 2-megapixel CMOS (Complementary
Metal Oxide Semiconductor) camera with a focal length of 5.5 mm. This
information is useful for determining the camera parameters when correcting the
barrel distortions with the Camera Calibration Toolbox. The available image
resolutions in the camera were 640x480, 800x600 and 1600x1200, but it was found
that the lowest resolution level is too low for watermark extraction.
The resulting success ratios of the method are displayed in Table 5. The rows show
the images with different compression ratios, whereas the columns show the results
with different resolution settings of the camera. In the experiments, the images taken
at resolution 1600x1200 were scaled to 25% prior to estimating the parameters. This
was done to reduce the computational complexity and thus the processing time and
memory consumption.

Table 5. Average success ratio with different compression ratios and capturing
settings

JPEG Compression Ratio   resolution 800x600   resolution 1600x1200   averaged
(uncompressed)           75.0%                96.0%                  85.5%
100                      90.0%                90.0%                  90.0%
80                       82.0%                92.0%                  87.0%
60                       31.0%                69.0%                  50.5%

Table 6 shows the average BER values of the experiments. The arrangement of the
rows is similar to that in the previous table: the rows show BER values for images
with various compression ratios whereas the columns show results for different
resolution values. The BER values for the table were calculated before error
correction because calculations were done to examine how many errors would be
expected in the process, not how well the error correction coding performs. The
value in parentheses indicates the average BER when the extraction process was not
successful.




Table 6. Average BER with different compression ratios and capturing settings

JPEG Compression Ratio   resolution 800x600   resolution 1600x1200   averaged
(uncompressed)           5.6% (11.3%)         3.1% (20.0%)           4.4%
100                      4.6% (15.8%)         3.1% (9.3%)            3.9%
80                       4.8% (8.9%)          3.6% (5.5%)            4.2%
60                       10.7% (11.8%)        6.5% (11.9%)           8.6%
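The per-image BER is simply the fraction of differing bits between the embedded and the extracted message, computed before error correction:

```python
import numpy as np

def bit_error_rate(sent_bits, received_bits):
    """Fraction of differing bits between two equal-length bit sequences;
    computed here before error correction, as in Table 6."""
    sent = np.asarray(sent_bits)
    received = np.asarray(received_bits)
    return float(np.mean(sent != received))
```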


6.3. Discussion

The results in Table 5 show that the method is very promising and works well in the
test case. Unfortunately, the method is fairly sensitive to distortions, and some
restrictions were necessary to make it work: for example, the picture was
taken as perpendicularly as possible to the watermarked image. This is due to the
wavelet domain multibit message watermark, which requires nearly perfect
correction of geometrical distortions.
The multibit watermark is especially fragile under tilting of the optical axis. If the
optical axis is tilted, some parts of the image appear to be closer than the others. In the
camera image, the parts that are close are represented with high resolution, but the parts
that are further away are represented with lower resolution. In some cases the
resolution may be too low, the correction algorithm cannot correct the distortions
accurately enough, and the message will not be extracted correctly. The multibit message
watermark in the wavelet domain requires high resolution and will thus be destroyed
if the tilting of the optical axis is too great.
The choice of resolution at which the picture is taken is important. The success
ratios were significantly better when the resolution level of 1600x1200 was used.
From the results it can be deduced that a 2-megapixel camera seems to be enough
for reading a watermark correctly from a printed image, but a higher resolution would
obviously be better. This is not a problem, as the cameras in mobile phones evolve
rapidly, and even while this work is being documented, phones with better
camera capabilities are being released.
Even with this camera and its properties, the BER values of the method are
acceptable. Looking at the values in Table 6, it can be seen that the method
would work better with stronger error correction coding. The values in parentheses
are the BER when the method was not successful, that is, when the error correction
failed. These values are clearly below 0.5, the limit around which error
correction becomes impossible with any coding technique.
Some more specific information about the calculated BER values is included in
Appendix 4. The figures show that the BER values are usually at the same low level,
but now and then there are peaks at the 0.5 level. This indicates that the extraction of
the watermark has failed completely; in these cases it can be assumed that the
synchronization process has failed.
In addition to the tilting of the optical axis, one possible reason why the message is
too erroneous to be read correctly is the compression of the image beforehand.
Table 5 shows that the method is robust against slight JPEG compression but
deteriorates rapidly as the compression increases. This was expected, but even
though JPEG compression worsens the results, the success ratios are still above
90% at a compression ratio of 80 when the higher resolution is used.
It must be noted, however, that the compression ratios reported here tell only the
amount of compression applied to the image before printing. More compression will
occur when the image is taken with the camera phone which compresses the image
before saving it to the memory. From this point of view, the watermarking method is
even more robust against JPEG compression.
Comparing the method developed here with other similar methods is difficult,
because only a few watermarking methods have been proposed for camera phones.
The method by Nakamura et al. [37] also used a frame around the watermarked
image to correct perspective distortions, but the way they handled the results was
different. They reported success ratios as high as 100% when the picture was
taken straight above the watermarked image, but this value cannot be compared
directly with the method proposed here, because they used a much stronger error
correction coding and the camera phones used were different. Also, the capacity of
their method was lower and a smaller message was embedded in the image: only 16
message bits against the 63 bits embedded here. This, too, enhanced the robustness
in their favour, and it is claimed that the method proposed here would compete well
with theirs under similar settings.
The experiments here were done in a noise-free environment and many distortions
were neglected. For example, the impact of lighting was not considered, and the light
around the image was stable throughout the experiments. In the future it is important to
do research with different and variable lighting conditions, as reflections of
light from the image will affect the extraction of the watermark.
Another difficulty to be examined is distortions around the image. In real
life, an image is rarely placed alone on a page; often it is surrounded by other
images and text. The method must be improved to handle these kinds of
distortions. Nevertheless, the method is promising, and a great deal of information
was gained in the research for future use.
7. DISCUSSION

No one knows what the future brings, but we can always make good guesses. Even
now, more and more content moves wirelessly between portable devices:
cell phones, laptops, PDAs and so on. A growing number of people have cell phones
in their pockets accompanied by mp3 players and digital cameras, but even the
limits between devices are diminishing. A cell phone may now contain a
media player and a video camera and still be available to consumers at a reasonable
price.
As the properties of devices blend together into one device, so do different media
formats. With watermarking, music files can be included in image files, and links to
websites can be embedded in both of them. In this work, two watermarking methods
were proposed for value-adding watermarking. The first method was robust against
print-scan attacks, which was considered a prerequisite for the second, print-cam
robust method.
The print-scan process works in a two-dimensional world in which the user has to
own a scanner to be able to read the watermark. From the user's point of view it
would be easier if the watermark could be read by taking a picture of the
watermarked image with a camera phone, so that the watermark could be read
anytime and anywhere.
A motivation for this work was the lack of publications discussing the reading of
watermarks with digital cameras or camera phones. Only a few papers were found,
and almost all of them were developed for commercial purposes. This indicates that
there is a demand for print-cam robust watermarking systems.
The methods presented here were fairly similar in spite of the different
environments they were required to work in. In the print-scan robust method, the
focus was on inverting certain geometrical distortions, that is, rotation, scale and
translation. In addition to these, the print-cam robust method focused on correcting
the effects of perspective distortions. In both of the methods, multiple watermarking
was employed, where the multibit watermark was embedded in the wavelet
domain and one or two template watermarks were embedded to recover from
geometrical distortions. The parameters for inverting translation were determined
with the same watermark in both methods, but the rotation and scale were calculated
with different kinds of watermarks: with a template in the Fourier domain in the print-
scan robust method and with a visible frame in the print-cam method. Both methods
seemed to work very well, and, with a stronger error correction coding, the results
would have been even better.
When storing media for future use, compression algorithms have a huge role to
play. Right now the most popular image compression format is JPEG, and every
watermarking system should be robust against it. Here it has been shown that both
of the methods are robust against JPEG compression down to a compression ratio
of 60. Lower compression ratios are rarely used, because below that the
compression starts to visibly decrease the quality of the image.
In future work, the reliability of the methods will be improved and new
synchronization methods will be developed. The focus will be shifted to
print-cam robust methods, and print-scan robustness will be only the first step
towards print-cam robust systems. The next generation of camera phones will be
released soon and the resolutions of their cameras will increase. Soon the quality of
the cameras in camera phones will reach that of present digital cameras, and thus,
in the near future, the research will be done with digital cameras instead of existing
camera phones.


8. CONCLUSION

The aim of this work was to find a method for reading a watermark from a printed
image with a camera phone. As a prerequisite for the problem, a print-scan robust
watermarking method was developed and examined. Based on the results achieved
with the print-scan robust watermarking method, a print-cam robust method was
proposed.
In both of the methods, multiple watermarking was applied successfully
and the results obtained were promising. In the print-scan robust method, three
watermarks were embedded in the image: two template watermarks were embedded
in order to recover the watermark from rotation, scale and translation attacks, and the
third watermark was the multibit watermark containing the actual
message. One of the template watermarks was a pseudorandom sequence embedded
in the magnitudes of the Fourier domain in the form of a circle around the centre of the
magnitude coefficients. With this watermark, the rotation angle and the scaling
that occurred in the scanning process could be inverted. Since the
magnitudes of the Fourier domain are invariant to translation, a second template
watermark was required and was embedded in the spatial domain. Finally, the multibit
message watermark was embedded in the wavelet domain.
In the print-cam robust method, two invisible watermarks were embedded in the
image and a visible frame was added around it. The visible frame was
necessary so that the perspective distortions could be approximated and inverted with
a perspective transformation. The two invisible watermarks were the same as in
the print-scan robust method: one was a template watermark, embedded in order to
recover the watermark from a translation attack, whereas the other was the multibit
message watermark.
The methods were tested by capturing a watermarked image multiple times with a
scanner and a camera phone. Success ratios and BER values were calculated for
both methods at various resolution levels. In the print-scan robust method, the
results at different resolution levels did not differ significantly, but in the
print-cam robust method the difference was obvious: the higher resolution level
gave better results regardless of the compression level of the image used.
The methods were also tested by compressing the test image beforehand with
different JPEG compression ratios. The results were as expected: the success ratio
decreased and the BER increased as the compression ratio decreased. However,
the results were acceptable until the compression ratio went below 60, and thus it
can be concluded that both of the methods are robust against JPEG compression.
The success ratios of the methods do not reach 100%, but with better error
correction coding that value could be approached. Future work includes
improving the print-cam method and moving to digital cameras instead of
camera phones. This is due to the fact that the quality of the cameras in cell phones
keeps improving and will soon reach that of modern digital cameras.
9. REFERENCES

[1] Cox, I.J., Miller M.L. & Bloom J.A. (2002) Digital watermarking. Morgan
Kaufman publishers, Academic Press, USA, 542 p. ISBN 1-55860-714-5

[2] Hanjalic, A., Langelaar, G.C., van Roosmalen, P.M.B., Biemond, J. &
Langendijk, R.L. (2000) Image and Video Databases: Restoration, Watermarking
and Retrieval. Elsevier Science B.V., Amsterdam, Netherlands, 445 p. ISBN 0-444-
50502-4

[3] Mäkelä, K. (2000) Digital Watermarking and Steganography. Diploma Thesis.
University of Oulu, Department of Electrical Engineering, Oulu, Finland.

[4] Chou, C-H & Li, Y-C. (1995) A Perceptually Tuned Subband Image Coder
Based on the Measure of Just-Noticeable-Distortion Profile. In: IEEE Transactions
on circuits and systems for video technology. Dec 1995, Vol. 5, Issue 6, pp. 467 -
476.

[5] Perry, B., MacIntosh, B. & Cushman, D. (2002) Digimarc MediaBridge: The
birth of a consumer product, from concept to commercial application. In:
Proceedings of SPIE Security and Watermarking of Multimedia Contents IV, Jan 21-
24, San Jose, California, USA, Vol. 4675, pp. 118-123.

[6] He, D. & Sun, Q. (2005) A Practical Print-scan Resilient Watermarking Scheme.
In: IEEE International Conference on Image Processing (ICIP), Sept. 11-14, 2005,
Vol. 1, pp. I - 257-60.

[7] Solanki, K., Madhow, U., Manjunath, B.S. & Chandrasekaran, S. (2004)
Estimating and Undoing Rotation for Print-scan Resilient Data Hiding. In: IEEE
International Conference on Image Processing (ICIP), Oct. 24-26, Vol. 1, pp. 39-42.

[8] Camera Calibration Toolbox for Generic Lenses. (read 27.8.2006) URL:
http://www.ee.oulu.fi/mvg/mvg.php?page=calibration. Matlab version 6.5 or later
with the Image Processing Toolbox and Optimization Toolbox is required.

[9] Kannala, J. & Brandt, S. S. (2006) A Generic Camera Model and Calibration
Method for Conventional, Wide-Angle, and Fish-Eye Lenses. In: IEEE Transactions
on pattern analysis and machine intelligence. Aug 2006, Vol. 28, No. 8, pp. 1335-
1340.

[10] Pereira, S. & Pun, T. (June 2000) Robust Template Matching for Affine
Resistant Image Watermarks. In: IEEE Transactions in Image processing, June 2000,
Vol. 9, Issue 6, pp. 1123-1129.

[11] Ó Ruanaidh, J.J.K. & Pun, T. (1997) Rotation, scale and translation invariant
digital image watermarking. In: IEEE Proceedings of International Conference on
Image Processing, Oct. 26-29, 1997, Santa Barbara, California, USA, Vol. 1, pp.
536-539.

[12] Angel, E. (2006) Interactive Computer Graphics: A Top-Down Approach with
OpenGL, 4th edition. Addison Wesley, Pearson Education Inc., USA. 784 p. ISBN
0-321-32137-5

[13] JPEG Home Page (read 27.6.2006) URL: http://www.jpeg.org/jpeg/index.html

[14] Rao, K.R. & Hwang, J.J. (1996) Techniques and Standards for Image, Video
and Audio Coding. Prentice Hall PTR, New Jersey, USA. 563 p.

[15] ISO/IEC JTC1 10918-1 | ITU-T Recommendation T.81. (1992) Information
technology- Digital Compression and Coding of Continuous-tone Still Images:
Requirements and Guidelines. Terminal Equipment and Protocols for Telematic
Services. CCITT.

[16] Johnson, R. C. (read 27.6.2006) JPEG2000 wavelet compression spec approved
URL: http://www.eetimes.com/story/OEG19991228S0028 EETimes, Dec. 28, 1999

[17] Smith, J.R. & Comiskey, B.O. (1996) Modulation and Information Hiding in
Images. In: Proceedings of the First Information Hiding Workshop, May 30 - June 1,
Isaac Newton Institute, University of Cambridge, UK, Vol. 1174, pp. 207-226.

[18] Kostopoulos, V., Skodras, A.N., & Christodoulakis, D. (2000) Digital Image
Watermarking: On the Enhancement of Detector Capabilities. In: Proceedings of Fifth
International Conference on Mathematics in Signal Processing, Warwick, Dec. 18-20,
2000.

[19] Kutter, M. (1998) Watermarking resisting to translation, rotation and scaling. In:
Proceedings of SPIE Multimedia Systems and Applications, Boston, MA 1998, Vol.
3528, pp. 423-431.

[20] Deguillaume, F., Voloshynovskiy, S. & Pun, T. (2002) Method for the
Estimation and Recovering from General Affine Transforms. In: Proceedings of
SPIE, Electronic Imaging 2002, Security and Watermarking of Multimedia Contents
IV, Vol. 4675. pp. 313-322.

[21] Joseph Fourier (read 2.8.2006) Wikipedia, the free encyclopedia.
URL:http://en.wikipedia.org/wiki/Joseph_Fourier

[22] Castleman, K.R. (1996) Digital Image Processing. Prentice-Hall, New Jersey,
USA, 1996. 667 p. ISBN 0-13-211467-4

[23] Lee, J.-S. & Kim, W.-Y. (2004) A Robust Image Watermarking Scheme to
Geometrical Attacks for Embedment of Multibit Information. In: Proceedings of
Advances in Multimedia Information Processing - PCM 2004: 5th Pacific Rim
Conference on Multimedia, Nov 30 - Dec 3, Tokyo, Japan, Part 3, pp. 348-355.

[24] Hartung, F. & Kutter, M. (1999) Multimedia Watermarking Techniques. In:
Proceedings of the IEEE, Jul. 1999, Vol. 87, pp. 1079-1107.

[25] Digitaalinen kuvankäsittely (Digital image processing) (2005) Lecture notes
based on the book: Gonzalez, R.C., Woods, R.E.: Digital Image Processing, Prentice
Hall, 2002. 793 p. ISBN: 0-201-18075-8

[26] Meerwald, P., Uhl, A. (2001) A survey of wavelet-domain watermarking
algorithms. In: Proceedings of SPIE, Electronic Imaging, Security and Watermarking
of Multimedia Contents, Jan 20-26, San Jose, California, USA, Vol. 4314, pp. 505-
516.

[27] Barni, M., Bartolini, F., Capellini, V., Lippi, A. & Piva, A. (1999) A DWT-
based technique for spatio-frequency masking of digital signatures. In: Proceedings
of the SPIE/IS&T International Conference on Security and Watermarking of
Multimedia Contents, Jan 25-25, San Jose, California, USA, Vol. 3657, pp. 31-39.

[28] Gilani, S.A.M. & Skodras, A.N. (2001) Watermarking By Multiresolution
Hadamard Transform. In: Proceedings of the European Conference on Electronic
Imaging & Visual Arts (EVA2001), Mar 26-30, Florence, Italy, pp. 73-77.

[29] Fotopoulos, V., Krommydas, S. & Skodras, A.N. (2001) Gabor Transform
Domain Watermarking. In: Proceedings of International Conference on Image
Processing, Oct. 7-10, Vol. 2, pp. 510-513.

[30] Kang, S. & Aoki, Y. (1999) Image Data Embedding System for Watermarking
Using Fresnel Transform. In: IEEE International Conference on Multimedia
Computing and Systems, Jun. 7-11, Vol. 1, pp. 885-889.

[31] Bailey, D.H. & Swarztrauber, P.N. (1991) The Fractional Fourier Transform and
Applications. In: SIAM Review, vol. 33, Issue 3, pp. 389-404.

[32] Lähetkangas, E. (2005) Digitaalisen kuvan vesileimaus (Watermarking of a
digital image). Diploma Thesis, University of Oulu, Department of Electrical and
Information Engineering, Finland.

[33] N. P. Sheppard, R. Safavi-Naini & P. Ogunbona. (2001) On Multiple
Watermarking, In:Workshop on Security and Multimedia at ACM Multimedia 2001,
Ottawa, Canada, pp. 3-6.

[34] Digimarc (read 4.10.2006) Digimarc Mobile E-Commerce Pilot Debuts at
Popular Tokyo-based Maid in Japan Café. Press Release on 25.7.2006. URL:
http://www.digimarc.com/media/release.asp?newsID=478

[35] Marek S. (read 9.5.2006) Camera Phones Click With Marketers.
URL:http://www.wirelessweek.com/article/CA457747.html?text=mobot&stt=001
Wireless Week, Oct. 1, 2004

[36] NTT Corporation (read 6.6.2006) NTT Develops "CyberSquash" Internet
Access Platform using Electronic Watermarks. URL:
http://www.ntt.co.jp/news/news03e/0307/030707.html News release on 7.7.2003

[37] Nakamura, T., Katayama, A., Yamamuro, M. & Sonehara, N. (2004) Fast
Watermark Detection Scheme for Camera-equipped Cellular Phone. In: ACM
International Conference Proceeding Series; Volume 83. Proceedings of the 3rd
international conference on Mobile and ubiquitous multimedia. College Park,
Maryland. URL: http://portal.acm.org/citation.cfm?id=1052395

[38] Brown, Stephen A. (read 8.6.2006) A History of the Bar Code. URL:
http://eh.net/encyclopedia/article/brown.bar_code EH.net Encyclopedia.

[39] Seideman, T. The history of barcodes. (read 8.6.2006) URL:
http://www.basics.ie/History.htm. Article appeared in American Heritage of
Invention and Technology, Forbes Publication

[40] TEC-IT, Team for Engineering and Consulting in Information Technologies,
Bar Code Online Demo, TBarCode OCX. (Used 19.7.2006)
http://www.tec-it.com/asp/main/startfr.asp?mainmenu=Software&sbmenu=
Online&redirect=demo/playground.asp&LN=1

[41] Pavlidis, T. Swartz, J. & Wang, Y.P. (1990) Fundamentals of bar code
information theory. In: IEEE Computer Society, Apr., Vol. 23, Issue 4, pp. 74-86

[42] The 2D Data Matrix barcode (2005), In: Computing & Control Engineering
Journal, Dec. 2005-Jan. 2006, Vol. 16, Issue 6, pp. 39.

[43] Reiter (read 20.7.2006) Reiter's Camera Phone Report. URL:
http://www.wirelessmoment.com/barcodes_and_scanning_camera_phones/index.html

[44] Takeuchi, S., Kunisa, A., Tsujita, K. & Inoue, Y. (2005) Geometric Distortion
Compensation of Printed Images Containing Imperceptible Watermarks. In:
International Conference on Consumer Electronics, 2005. ICCE. 2005 Digest of
Technical Papers, 8-12 Jan. 2005, pp. 411-412.

[45] Peterson, W.W. & Weldon, E.J. (1972) Error-Correcting Codes, 2nd edition.
Colonial Press, Inc., USA. 560 p. ISBN 0-262-16-039-0

[46] Keskinarkaus, A., Pramila, A., Seppänen, T. & Sauvola, J. (2006) A Wavelet
Domain Print-scan and JPEG Resilient Data Hiding Method. In: Proceedings of the
5th International Workshop on Digital Watermarking, Lecture Notes in Computer
Science, Nov. 8-11, Jeju Island, Korea, Vol. 4283, pp. 82-95.

[47] Katayama, A., Nakamura, T., Yamamuro, M. & Sonehara, N. (2004) New High-
speed Frame Detection Method: Side Trace Algorithm (STA) for i-appli on Cellular
Phones to Detect Watermarks. In: Proceedings of the 3rd international conference on
Mobile and ubiquitous multimedia, ACM International Conference Proceeding
Series, Oct 27-29, 2004. Vol. 83, pp.109-116




10. APPENDIXES

Appendix 1 Homogeneous coordinates
Appendix 2 BER figures for the print-scan method
Appendix 3 Images used in the print-scan process for testing
Appendix 4 BER figures for the print-cam method
Appendix 5 Images used in the print-cam process for testing


Appendix 1 Homogeneous coordinates 72
HOMOGENEOUS COORDINATES [12]

Homogeneous coordinates are generally used in the field of computer graphics to
simplify transformations and projections. In affine geometry they have a special role
because every affine transform can be represented as a matrix multiplication. In the
homogeneous coordinate system the transformation of an n-dimensional vector is
done in (n+1)-dimensional space.
Let us consider a point P located in three-dimensional space, defined by a point P0
and the basis vectors v1, v2 and v3. Usually the point P located at (x, y, z) is
represented with a column matrix

    p = [x  y  z]^T ,                                        (1)

where x, y and z are the components of the basis vectors at this point, so that

    P = P0 + x v1 + y v2 + z v3 .                            (2)


However, this representation is not very good, because it can be confused with the
representation of a vector

    W = x v1 + y v2 + z v3 ,                                 (3)

which does not have any starting point and can be placed anywhere in the space.
Homogeneous coordinates offer a solution to this problem by introducing an extra
dimension. The point P can then be written uniquely as

    p = [x  y  z  1]^T ,                                     (4)

because from equation (2)

    P = [x  y  z  1] [v1  v2  v3  P0]^T .                    (5)

Similarly, the vector W can be written in column matrix representation as

    w = [x  y  z  0]^T ,                                     (6)

and

    W = x v1 + y v2 + z v3 = [x  y  z  0] [v1  v2  v3  P0]^T .   (7)

We can see that the representations of a point and a vector are now different and
cannot be confused anymore. By the same derivation, an arbitrary transformation
matrix

        | a  b  c |
    s = | d  e  f | ,                                        (8)
        | g  h  i |

can be represented in homogeneous coordinates as

        | a  b  c  0 |
    s = | d  e  f  0 | .                                     (9)
        | g  h  i  0 |
        | 0  0  0  1 |

Although the transformations are now done in four-dimensional space to solve a
three-dimensional case, less arithmetic work is required when homogeneous
coordinates are used. The uniform representation of affine transformations makes
composing successive transformations far easier and faster than in three
dimensions. In addition, modern computers are able to use parallelism to speed up
homogeneous coordinate operations.
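As an aside (not part of the thesis), composing successive transforms in homogeneous coordinates reduces to a single matrix product in practice; the NumPy sketch below, with illustrative function names, demonstrates the point for the 4x4 form of equation (9) plus a translation:

```python
import numpy as np

def to_homogeneous_point(x, y, z):
    """Column-matrix form of equation (4), as a flat array."""
    return np.array([x, y, z, 1.0])

def embed_3x3(s):
    """Embed a 3x3 transform s in the 4x4 homogeneous form of eq. (9)."""
    m = np.eye(4)
    m[:3, :3] = s
    return m

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    m = np.eye(4)
    m[:3, 3] = [tx, ty, tz]
    return m

# Scale by 2, then translate by (1, 2, 3): one matrix product composes both.
M = translation(1, 2, 3) @ embed_3x3(2 * np.eye(3))
p = M @ to_homogeneous_point(1, 1, 1)   # (1,1,1) -> (2,2,2) -> (3,4,5)
```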


Appendix 2 BER figures for print-scan method 74

The BER figures presented here are extracted from the tests of the print-scan method
proposed earlier. In every figure, the y-axis shows the BER value and the x-axis gives
the number of the image examined. CR means the JPEG compression ratio and the
last number in the figure caption tells the resolution used in scanning.



Figure A.2.1 BER (CR=uncomp, 300dpi). Figure A.2.2 BER (CR=uncomp, 150dpi).


Figure A.2.3 BER (CR=100, 300dpi). Figure A.2.4 BER (CR=100, 150dpi).


Figure A.2.5 BER (CR=80, 300dpi). Figure A.2.6 BER (CR=80, 150dpi).


Figure A.2.7 BER (CR=60, 300dpi). Figure A.2.8 BER (CR=60, 150dpi).


Figure A.2.9 BER (CR=40, 300dpi). Figure A.2.10 BER (CR=40, 150dpi).


Figure A.2.11 BER (CR=100, 300dpi) Figure A.2.12 BER (CR=100, 150dpi)
(HP LaserJet 4500 DN). (HP LaserJet 4500 DN).





Appendix 3 Images used in print-scan process for testing 76


Figure A.3.1 Original image before watermarking.




Figure A.3.2 Image after watermark embedding and JPEG
compression with compression ratio of 100.




Figure A.3.3 Image after watermark embedding and JPEG
compression with compression ratio of 80.



Figure A.3.4 Image after watermark embedding and JPEG
compression with compression ratio of 60.




Figure A.3.5 Image after watermark embedding and JPEG
compression with compression ratio of 40.


Appendix 4 BER in print-cam method 79
The BER figures presented here are extracted from the tests of the print-cam method
proposed earlier. In every figure, the y-axis shows the BER value and the x-axis gives
the number of the image examined. CR means the JPEG compression ratio and the
last number in the figure caption tells the resolution used when the picture was taken.



Figure A.4.1 BER (uncomp., 800x600). Figure A.4.2 BER (uncomp., 1600x1200).




Figure A.4.3 BER (CR=100, 800x600). Figure A.4.4 BER (CR=100, 1600x1200).




Figure A.4.5 BER (CR=80, 800x600). Figure A.4.6 BER (CR=80, 1600x1200).





Figure A.4.7 BER (CR=60, 800x600). Figure A.4.8 BER (CR=60, 1600x1200).






Appendix 5 Images used in print-cam process for testing 81


Figure A.5.1 Original image before watermarking.



Figure A.5.2 Image after watermark embedding before JPEG compression.




Figure A.5.3 Image after watermark embedding (JPEG CR = 100).



Figure A.5.4 Image after watermark embedding (JPEG CR = 80).




Figure A.5.5 Image after watermark embedding (JPEG CR = 60).



